All in One CH 1 Data Series
Python Pandas is a Python package providing fast, flexible and expressive data structures designed for data manipulation. It was developed by Wes McKinney in 2008 and is used for data analysis in Python. Data analysis requires lots of processing, such as restructuring, cleaning and merging. Different tools are available for this, such as NumPy, SciPy, Cython and Pandas. Using Pandas, we can accomplish five typical steps in the processing and analysis of data: load, prepare, manipulate, model and analyse.

Jupyter Notebook offers a good and effective environment for using Pandas to do data exploration and modeling. In this chapter, we will use Jupyter Notebook for programming ... of graphs formed.

CHAPTER CHECKLIST
Features of Python Pandas
Data Structure
Series Data Structure
DataFrame Data Structure
Transferring Data between CSV Files and DataFrames
Allinone | INFORMATICS PRACTICES Class 12th
(iv) Grouping With the help of this feature of Pandas, you can split data into categories of your choice, according to the criteria you set. The GroupBy function splits the data, applies a function and then combines the results.
(v) Merging and joining of datasets While analysing data, we constantly need to merge and join multiple datasets to create a final dataset that can be analysed properly. Pandas can merge various datasets efficiently, so that we don't face problems while analysing the data.
(vi) Optimised performance Pandas has well-optimised performance, which makes it fast and suitable for data analysis. The critical code of Pandas is written in C or Cython, which makes it responsive and fast.

To Work with Pandas
To work with Python Pandas, you need to import the pandas library. With the following command, either at the shell prompt or in a script file, you can import pandas:
    import pandas as pd

DATA STRUCTURE
It is a specialised format for organising, processing, retrieving and storing data. Any data structure is designed to arrange data to suit a specific purpose so that it can be accessed and worked with in appropriate ways.
Pandas provides two data structures for processing data as
(i) Series It is one dimensional ...

SERIES DATA STRUCTURE
Series is a type of list in Pandas which can take integer values, string values, double values and more. The row labels of a Series are called the index.
A list, tuple or dictionary can be easily converted into a Series by using the Series() method.
Series has the following parameters
    data     list, tuple, dictionary or scalar value
    index    Its values should be unique. It defaults to np.arange(n) when we do not pass any index.
    dtype    data type of the series
    copy     copying the data

Creating a Series
In Pandas, a Series can be created in two ways as
1. Creating an Empty Series
We can create an empty series object, i.e. one having no values, using the Series() method.
Syntax
    Series_Object = pandas.Series()
It always has a default data type, i.e. float64 (in pandas 2.0 and later the default for an empty Series is object).
    import pandas as pd
    s = pd.Series()
    print(s)
Output
    Series([], dtype: float64)

2. Creating a Series Using Inputs
A Pandas series can be created in different ways, like from lists, dictionary, scalar value, ndarray etc.
(i) Create a Series from Array In order to create a series ...
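The different kinds of input listed above can be sketched as follows. This is only an illustrative example: the variable names and sample values are assumptions, not taken from the chapter.

```python
import numpy as np
import pandas as pd

# From a list: the default integer index 0..n-1 is used
from_list = pd.Series([10, 20, 30])

# From a tuple, with explicit row labels passed through index
from_tuple = pd.Series((1.5, 2.5), index=['a', 'b'])

# From a scalar: the value is repeated once for every index label
from_scalar = pd.Series(7, index=['x', 'y', 'z'])

# From a NumPy ndarray, forcing the data type through dtype
from_array = pd.Series(np.array([1, 2, 3]), dtype='float64')

print(from_list)
print(from_tuple)
print(from_scalar)
print(from_array)
```

When the data is a scalar, an index has to be supplied so that Pandas knows how many elements to create.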
(ii) Create a Series from Dictionary When we create a series from a dictionary and the dictionary object is passed as input but no index is specified, then the dictionary keys are taken in sorted order to construct the index (recent versions of Pandas keep the keys in insertion order instead).
e.g.
    import pandas as pd
    data = {'One': 1569, 'Two': 1564, 'Three': 7896, 'Four': 7596}
    a = pd.Series(data)
    print(a)

The data passed to Series() may also contain missing values (np.nan):
    import pandas as pd
    import numpy as np
    data = ['Hello', 'How', np.nan, 'You']
    a = pd.Series(data)
    print(a)
Output
    0    Hello
    1      How
    2      NaN
    3      You
    dtype: object
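As a small sketch of how the index parameter interacts with a dictionary, the example below reuses the dictionary from the text. The index labels chosen for b, including the missing key 'Five', are assumptions made for illustration.

```python
import pandas as pd

data = {'One': 1569, 'Two': 1564, 'Three': 7896, 'Four': 7596}

# Without an index, the dictionary keys become the row labels
a = pd.Series(data)

# With an index, values are looked up by key in the given order;
# 'Five' is not a key of the dictionary, so its value becomes NaN
b = pd.Series(data, index=['Two', 'One', 'Five'])

print(a)
print(b)
```

Because NaN is a floating-point value, the data type of b becomes float64 even though the dictionary holds integers.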
    info2 = pd.Series(data=[12, 56, 36, 45], index=['a', 'b', 'c', 'd'])
    print(info1.dtype)
    print(info1.dtype.itemsize)   # Series.itemsize was removed in pandas 1.0
    print(info2.dtype)
    print(info2.dtype.itemsize)
Output
    float64
    8
    int64
    8

(iii) Retrieving Shape
The shape of any series can be retrieved using the shape attribute. It gives the number of elements, including missing or empty values (NaN).
e.g.
    import pandas as pd
    x = pd.Series(data=[10, 20, 30])
    y = pd.Series(data=[4.9, 8.2, 5.6, 2.9], index=['a', 'b', 'c', 'd'])
    print(x.shape)
    print(y.shape)
Output
    (3,)
    (4,)

(iv) Retrieving Dimension, Size and Number of Bytes
To retrieve the dimension use the ndim attribute, to retrieve the size use the size attribute and to retrieve the number of bytes use the nbytes attribute.
e.g.
    import pandas as pd
    x = pd.Series(data=[10, 20, 30])
    y = pd.Series(data=[4.9, 8.2, 5.6, 2.9], index=['a', 'b', 'c', 'd'])
    print(x.ndim, y.ndim)
    print(x.size, y.size)
    print(x.nbytes, y.nbytes)
Output
    1 1
    3 4
    24 32

(v) Checking Emptiness and Presence of NaN
The empty attribute is used to check emptiness and the hasnans attribute is used to check whether the series object contains NaN values or not.
e.g.
    import pandas as pd
    import numpy as np
    x = pd.Series(data=[1, 2, np.nan])
    y = pd.Series(data=[4.9, 8.2, 5.6, 2.9], index=['a', 'b', 'c', 'd'])
    z = pd.Series()
    print(x.empty, y.empty, z.empty)
    print(x.hasnans, y.hasnans, z.hasnans)
Output
    False False True
    True False False

Accessing Elements from Series (Indexing and Slicing)
An index number (an integer) is used to access an element of a Series.
To access an individual element, use the following syntax
    Series_Object[index_number]
To access multiple elements from a Series, use the slice operation. This operation is performed using the colon (:).
Different forms of the slice operation are
    [:Index]                   To print elements from the beginning up to a range
    [:-Index]                  To print elements from the end
    [Index:]                   To print elements from a specific index till the end
    [Start index:End index]    To print elements within a range
    [:]                        To print the whole series
    [::-1]                     To print the whole series in reverse order
e.g.
    import pandas as pd
    import numpy as np
    data = np.array(['P', 'R', 'O', 'G', 'R', 'A', 'M', 1, 2, 3])
    a = pd.Series(data)
    print('a[0] :\n', a[0])
    print('a[:3] :\n', a[:3])
    print('a[:-3] :\n', a[:-3])
    print('a[3:] :\n', a[3:])
    print('a[3:7] :\n', a[3:7])
    print('a[:] :\n', a[:])
    print('a[::-1] :\n', a[::-1])
Output
    a[0] :
    P
    a[:3] :
    0    P
    1    R
    2    O
    dtype: object
    a[:-3] :
    0    P
    1    R
    2    O
    3    G
    4    R
    5    A
    6    M
    dtype: object
    a[3:] :
    3    G
    4    R
    5    A
    6    M
    7    1
    8    2
    9    3
    dtype: object
    a[3:7] :
    3    G
    4    R
    5    A
    6    M
    dtype: object
    a[:] :
    0    P
    1    R
    2    O
    3    G
    4    R
    5    A
    6    M
    7    1
    8    2
    9    3
    dtype: object
    a[::-1] :
    9    3
    8    2
    7    1
    6    M
    5    A
    4    R
    3    G
    2    O
    1    R
    0    P
    dtype: object

Operations on Series Object
We can perform various operations on a series object as follows
(i) Modify the Series Element
The data value of a series object can be modified using item assignment.
Syntax
    Series_Object[index] = new_data_value
To modify the data values falling in a mentioned slice, use the following syntax
    Series_Object[start : stop] = new_data_value
e.g.
    import pandas as pd
    import numpy as np
    data = np.array([120, 150, 200, 175])
    a = pd.Series(data)
    print(a)
    a[2] = 500
    ...
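The sketch below pulls together accessing and modifying elements on a Series with a labelled index. The sample values and variable names are assumptions made for illustration. It uses the iloc (position-based) and loc (label-based) accessors of Pandas, which make it explicit whether a number means a position or a label.

```python
import pandas as pd

s = pd.Series([4.9, 8.2, 5.6, 2.9], index=['a', 'b', 'c', 'd'])

# Reading: position-based (iloc) versus label-based (loc)
first = s.iloc[0]              # element at position 0
by_label = s.loc['c']          # element with label 'c'
pos_slice = s.iloc[1:3]        # positions 1 and 2; the stop position is excluded
label_slice = s.loc['b':'c']   # labels 'b' to 'c'; the stop label is included

# Writing: item assignment and slice assignment
s.loc['a'] = 10.0              # modify a single element
s.iloc[2:4] = 0.0              # modify every element in the slice

print(s)
```

Note the difference between the two slices: a position slice excludes its stop value, while a label slice includes it.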
Syntax
    Series.head(n)
The tail() function is used to return the last n rows of a Series. The default value of n is 5.
Syntax
    Series.tail(n)

    p = pd.Series(data=[8, 6, 7, 6, 10], index=['a', 'b', 'c', 'd', 'e'])
    print("Series p")
    print(p)
Output
    Series p
    a     8
    b     6
    c     7
    d     6
    e    10
    dtype: int64

In the above examples, series objects x and z have matching indexes, i.e. 0, 1, 2, ... and so on, and series objects y and p have matching indexes, i.e. a, b, c, d, e.
So, objects x and z successfully carry out arithmetic operations on corresponding elements. In the same way, objects y and p carry out arithmetic operations on corresponding elements. Where an index exists in only one of the two objects, the result is NaN.

    print(x+z)
Output
    0     2.2
    1     8.3
    2     7.5
    3    12.1
    4     9.3
    5     NaN
    6     NaN
    dtype: float64

    print(y+p)
Output
    a    12.9
    b    14.2
    c    12.6
    d     8.9
    e    18.0
    dtype: float64

    print(x-z)
Output
    0    0.2
    1    4.3
    2    1.5
    3    4.1
    4   -0.7
    5    NaN
    6    NaN
    dtype: float64

    print(y-p)
Output
    a   -3.1
    b    2.2
    c   -1.4
    d   -3.1
    e   -2.0
    dtype: float64

    print(x*z)
Output
    0     1.2
    1    12.6
    2    13.5
    3    32.4
    4    21.5
    5     NaN
    6     NaN
    dtype: float64

    print(y*p)
Output
    a    39.2
    b    49.2
    c    39.2
    d    17.4
    e    80.0
    dtype: float64

    print(x/z)
Output
    0    1.200
    1    3.150
    2    1.500
    3    2.025
    4    0.860
    5      NaN
    6      NaN
    dtype: float64

    print(y/p)
Output
    a    0.612500
    b    1.366667
    c    0.800000
    d    0.483333
    e    0.800000
    dtype: float64
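The NaN entries in the outputs above come from index alignment: an operation is carried out only where both objects have the same index label. A minimal sketch, with sample values that are assumptions and not the chapter's x and z, shows this together with the add() method and its fill_value parameter, which substitutes a value for a missing element instead of producing NaN.

```python
import pandas as pd

x = pd.Series([1.0, 2.0, 3.0])            # index 0, 1, 2
z = pd.Series([10.0, 20.0, 30.0, 40.0])   # index 0, 1, 2, 3

# Label 3 exists only in z, so the plain sum is NaN there
plain = x + z

# add() with fill_value=0 treats the missing element of x as 0
filled = x.add(z, fill_value=0)

print(plain)
print(filled)
```

Similar methods exist for the other operators, for example sub(), mul() and div().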
Data Handling Using Pandas-1
(iv) Vector Operations on Series Object
Vector operations are performed individually on each element of the series object.
e.g.
    import pandas as pd
    y = pd.Series(data=[4, 8, 5, 2], index=['a', 'b', 'c', 'd'])
    print(y)
    print("Add some value to element")
    print(y+3)
    print("Multiply some value to element")
    print(y*3)
    print("Use cubic power to element")
    print(y**3)
    print("Using relational operator on element")
    print(y>5)
Output
    a    4
    b    8
    c    5
    d    2
    dtype: int64
    Add some value to element
    a     7
    b    11
    c     8
    d     5
    dtype: int64
    Multiply some value to element
    a    12
    b    24
    c    15
    d     6
    dtype: int64
    Use cubic power to element
    a     64
    b    512
    c    125
    d      8
    dtype: int64
    Using relational operator on element
    a    False
    b     True
    c    False
    d    False
    dtype: bool

Sorting Series Values
We can sort the data of series objects using values and indexes. Sorting can be done either in ascending order or descending order.

Sort the Series Values using Values
To sort the series values using values, use the sort_values() function. This function sorts the values in ascending order.
Syntax
    Series_Object.sort_values()
If you want to sort the values in descending order, pass the argument ascending = False.
Syntax
    Series_Object.sort_values(ascending = False)
e.g.
    import pandas as pd
    x = pd.Series(data=[125, 360, 480, 560, 850], index=['a', 'b', 'c', 'd', 'e'])
    print(x)
    print("Ascending Order")
    print(x.sort_values())
    print("Descending Order")
    print(x.sort_values(ascending=False))
Output
    a    125
    b    360
    c    480
    d    560
    e    850
    dtype: int64
    Ascending Order
    a    125
    b    360
    c    480
    d    560
    e    850
    dtype: int64
    Descending Order
    e    850
    d    560
    c    480
    b    360
    a    125
    dtype: int64
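The text above mentions sorting by indexes as well as by values. A short sketch of both follows, using the sort_index() method of Pandas for the index case. The unsorted sample data is an assumption, chosen so that the effect of each call is visible.

```python
import pandas as pd

x = pd.Series([480, 125, 850, 360], index=['c', 'a', 'e', 'b'])

by_value = x.sort_values()                        # ascending by value
by_value_desc = x.sort_values(ascending=False)    # descending by value
by_index = x.sort_index()                         # ascending by row label
by_index_desc = x.sort_index(ascending=False)     # descending by row label

print(by_value)
print(by_index)
```

Both functions return a new sorted Series and leave the original object x unchanged.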