NumPy Arrays and Pandas Series Object
SAMPLE CODES
Pandas (or Python Pandas) is Python’s library for data analysis. Its name is derived from “panel data”,
a term for multidimensional, structured data sets. The main author of Pandas is Wes McKinney.
Data Analysis: The process of evaluating large data sets using analytical and statistical tools in order to
discover useful information and conclusions that support business decision-making.
NumPy Arrays: NumPy (Numerical Python or Numeric Python) is an open-source Python module that offers
functions and routines for fast mathematical computation on arrays and matrices. An array is a named
group of homogeneous elements (all of the same type).
NumPy arrays come in two forms:
➢ 1-D arrays, known as vectors, which have a single row or column only.
➢ Multidimensional arrays, known as matrices in the 2-D case, which have multiple rows and columns.
Output:-
A 1-D array is
[1 2 3]
A 2-D array
[[1 2 3]
[4 5 6]
[7 8 9]]
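The code that printed these two arrays is not shown. A minimal sketch that would print the same values, assuming the arrays were built from Python lists with np.array (the names vec and mat are illustrative):

```python
import numpy as np

# Assumed reconstruction: build both arrays from plain Python lists.
vec = np.array([1, 2, 3])          # 1-D array (vector)
mat = np.array([[1, 2, 3],
                [4, 5, 6],
                [7, 8, 9]])        # 2-D array (matrix)

print("A 1-D array is")
print(vec)
print("A 2-D array")
print(mat)

# ndim gives the number of dimensions, shape the size along each one.
print(vec.ndim, vec.shape)   # 1 (3,)
print(mat.ndim, mat.shape)   # 2 (3, 3)
```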
8.
import numpy as np
import pandas as pd
a=np.arange(1,11,2)
print("An array=",a)
s=pd.Series(a)
print(s)
Output:-
An array= [1 3 5 7 9]
0 1
1 3
2 5
3 7
4 9
dtype: int32
12.
rollno=[1,2,3]
name=['manish','harish','anurag']
Output:-
1 manish
2 harish
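The statement that combines the two lists into a Series is not shown. A minimal sketch, assuming rollno supplies the index labels and name supplies the values, which matches the output fragment (the variable name students is hypothetical):

```python
import pandas as pd

rollno = [1, 2, 3]
name = ['manish', 'harish', 'anurag']

# Assumed completion: roll numbers become the index, names become the data.
students = pd.Series(data=name, index=rollno)
print(students)

# .loc looks an element up by its index label, here roll number 2.
print(students.loc[2])   # harish
```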
13.
b=pd.Series(data=[31,28,31,30],index=['Jan','Feb','Mar','Apr'])
print(b)
c=pd.Series(data=[31,28,31,30],index=['Jan','Feb','Mar','Apr'],dtype=np.float64)
print(c)
Output:-
Jan 31
Feb 28
Mar 31
Apr 30
dtype: int64
Jan 31.0
Feb 28.0
Mar 31.0
Apr 30.0
dtype: float64
14.
a=np.arange(1,5)
print(a)
b=pd.Series(index=a, data=a*2)
print(b)
c=pd.Series(index=a, data=a**2)
print(c)
Output:-
[1 2 3 4]
1 2
2 4
3 6
4 8
dtype: int32
1 1
2 4
3 9
4 16
dtype: int32
15.
a=[1,2,3,4]
b=pd.Series(data=a*2) #a is a list, so a*2 repeats the list
print(b)
Output:-
0 1
1 2
2 3
3 4
4 1
5 2
6 3
7 4
dtype: int64
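Samples 14 and 15 differ because the * operator means different things for NumPy arrays and Python lists. A short sketch of the contrast (the names values, repeated and doubled are illustrative):

```python
import numpy as np
import pandas as pd

values = [1, 2, 3, 4]

# list * 2 repeats the list, so the Series has eight elements.
repeated = pd.Series(data=values * 2)

# array * 2 multiplies every element, so the Series has four elements.
doubled = pd.Series(data=np.array(values) * 2)

print(repeated.tolist())   # [1, 2, 3, 4, 1, 2, 3, 4]
print(doubled.tolist())    # [2, 4, 6, 8]
```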
17. Output:-
True
18. Output:-
Jan 31
Feb 28
Mar 31
Apr 30
dtype: int64
31
0 1
1 2
2 3
3 4
4 5
5 6
dtype: int64
4
1 a
2 e
3 i
2 o
4 u
dtype: object
2 e
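The code for this sample is not shown. The output is consistent with looking elements up by index label, including a label that occurs twice. A minimal sketch under that assumption (the names days and vowels are hypothetical):

```python
import pandas as pd

days = pd.Series(data=[31, 28, 31, 30], index=['Jan', 'Feb', 'Mar', 'Apr'])
print(days['Jan'])        # a unique label returns a single value: 31

# Index labels need not be unique; label 2 appears twice here.
vowels = pd.Series(data=['a', 'e', 'i', 'o', 'u'], index=[1, 2, 3, 2, 4])
matches = vowels.loc[2]   # a repeated label returns a Series with every match
print(matches)
```

With a repeated label the result is itself a Series, so code that expects a single value should check for uniqueness first, for example with vowels.index.is_unique.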
19. Output:-
1 10
2 11
3 12
4 13
5 14
dtype: int64
2 11
3 12
4 13
5 14
dtype: int64
2 11
3 12
dtype: int64
1 10
3 12
5 14
dtype: int64
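The code for this sample is not shown. The three partial results are consistent with positional slices of the first Series. A minimal sketch, assuming those slices and using .iloc to make the positional meaning explicit:

```python
import pandas as pd

obj1 = pd.Series(index=[1, 2, 3, 4, 5], data=[10, 11, 12, 13, 14])
print(obj1)

# .iloc slices by position (0-based, end excluded), whatever the labels are.
from_second = obj1.iloc[1:]     # labels 2 to 5
middle = obj1.iloc[1:3]         # labels 2 and 3
every_other = obj1.iloc[::2]    # labels 1, 3 and 5

print(from_second)
print(middle)
print(every_other)
```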
20.
obj1=pd.Series(index=[1,2,3,4,5], data=[10,11,12,13,14])
print(obj1)
obj1[2]=5 #using index
print(obj1)
obj1[1:3]=99 #using slicing
print(obj1)
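Output:
1 10
2 11
3 12
4 13
5 14
dtype: int64
1 10
2 5
3 12
4 13
5 14
dtype: int64
1 10
2 99
3 99
4 13
5 14
dtype: int64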
21.
obj1=pd.Series(index=[1,2,3,4,5,6,7,8,9,10], data=[10,11,12,13,14,15,16,17,18,19])
print(obj1)
print(obj1.head()) #returns the first five rows by default
print(obj1.tail()) #returns the last five rows by default
print(obj1.head(4)) #returns the first four rows
print(obj1.tail(3)) #returns the last three rows
Output:
1 10
2 11
3 12
4 13
5 14
6 15
7 16
8 17
9 18
10 19
dtype: int64
1 10
2 11
3 12
4 13
5 14
dtype: int64
6 15
7 16
8 17
9 18
10 19
dtype: int64
1 10
2 11
3 12
4 13
dtype: int64
8 17
9 18
10 19
dtype: int64
22. Output:
a 1
b 2
c 3
d 4
dtype: int64
a 3
b 4
c 5
d 6
dtype: int64
a 2
b 4
c 6
d 8
dtype: int64
a 1
b 4
c 9
d 16
dtype: int64
a False
b False
c False
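The code for this sample is not shown. The output is consistent with applying arithmetic and comparison operators to a whole Series at once. A minimal sketch under that assumption (the threshold 5 in the comparison is hypothetical, chosen only because it gives False for every element):

```python
import pandas as pd

obj = pd.Series(data=[1, 2, 3, 4], index=['a', 'b', 'c', 'd'])

# Operators apply element by element and keep the index labels.
plus_two = obj + 2     # 3, 4, 5, 6
times_two = obj * 2    # 2, 4, 6, 8
squared = obj ** 2     # 1, 4, 9, 16
above = obj > 5        # hypothetical threshold: False for every element

print(plus_two)
print(times_two)
print(squared)
print(above)
```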
23. Output:
a 6
b 8
c 10
d 12
dtype: int64
1 11
2 14
3 17
4 20
dtype: int64
dtype: int64
a NaN
b NaN
c NaN
d NaN
1 NaN
2 NaN
3 NaN
4 NaN
dtype: float64
1 3.0
2 7.0
3 11.0
4 15.0
5 NaN
6 NaN
7 NaN
8 NaN
dtype: float64
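The code for this sample is not shown. The NaN rows are what pandas produces when two Series are added and a label exists in only one of them. A minimal sketch of that index alignment, using illustrative values rather than the original ones:

```python
import pandas as pd

left = pd.Series(data=[1, 2, 3, 4], index=['a', 'b', 'c', 'd'])
right = pd.Series(data=[5, 6, 7, 8], index=['a', 'b', 'c', 'd'])
shifted = pd.Series(data=[10, 20, 30, 40], index=['c', 'd', 'e', 'f'])

# Same labels on both sides: values are added label by label.
same = left + right        # 6, 8, 10, 12

# Only 'c' and 'd' overlap: every other label becomes NaN,
# and the result is float64 because NaN is a float.
partial = left + shifted

# add() with fill_value treats a label missing on one side as 0.
filled = left.add(shifted, fill_value=0)

print(same)
print(partial)
print(filled)
```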
24. Output:
1 6
2 7
3 8
4 9
5 10
dtype: int64
1 False
2 False
3 False
4 True
5 True
dtype: bool
4 9
5 10
dtype: int64
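The code for this sample is not shown. The output is consistent with building a boolean mask and using it to filter the Series. A minimal sketch, assuming the condition was "greater than 8":

```python
import pandas as pd

obj = pd.Series(index=[1, 2, 3, 4, 5], data=[6, 7, 8, 9, 10])
print(obj)

# Assumed condition: the comparison returns a boolean Series (the mask).
mask = obj > 8
print(mask)

# Indexing with the mask keeps only the rows where it is True.
selected = obj[mask]
print(selected)
```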
25.
#Reindexing Series object
obj1=pd.Series(index=[1,2,3,4],data=['a','b','c','d'])
print(obj1)
obj2=obj1.reindex([1,3,2,4])
print(obj2)
Output:
1 a
2 b
3 c
4 d
dtype: object
1 a
3 c
2 b
4 d
dtype: object
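reindex only reorders the rows in this sample because every requested label already exists. A short sketch of what happens when a label is new (label 5 and the placeholder '?' are illustrative choices):

```python
import pandas as pd

obj1 = pd.Series(index=[1, 2, 3, 4], data=['a', 'b', 'c', 'd'])

# Label 5 is not in obj1, so its value is filled with NaN.
with_gap = obj1.reindex([1, 3, 5])
print(with_gap)

# fill_value supplies a value for new labels instead of NaN.
filled = obj1.reindex([1, 3, 5], fill_value='?')
print(filled)
```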
26. Output:
1 a
2 b
3 c
4 d
dtype: object
1 a
3 c
4 d
dtype: object
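The code for this sample is not shown. The second Series lacks label 2, which is consistent with removing an element by its label. A minimal sketch, assuming Series.drop was used:

```python
import pandas as pd

obj1 = pd.Series(index=[1, 2, 3, 4], data=['a', 'b', 'c', 'd'])
print(obj1)

# drop() returns a new Series without the given label;
# obj1 itself is left unchanged.
obj2 = obj1.drop(2)
print(obj2)
```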
Output:
{1: 'a', 2: 'b', 3: 'c', 4: 'd'}
[1 2 3 4]
[1. 2. 3. 4.]
Output:
['J' 'a' 'i' 'p' 'u' 'r' 'I' 's' 'C' 'a' 'p' 'i' 't' 'a' 'l' 'O' 'f' 'R' 'a'
Output:
[1 2 3 4]
[1.2 2.5 3.1]
[1 3 5]
Output:
['J' 'a' 'i' 'p' 'u']
Output:
[0 1 2 3 4 5 6 7 8 9]
[[0 1 2 3 4]
[5 6 7 8 9]]
[[0 1]
[2 3]
[4 5]
[6 7]
[8 9]]
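The code for this sample is not shown. The output is consistent with reshaping a ten-element array into 2x5 and 5x2 layouts. A minimal sketch under that assumption:

```python
import numpy as np

a = np.arange(10)         # 1-D array holding 0 to 9
b = a.reshape(2, 5)       # 2 rows, 5 columns
c = a.reshape(5, 2)       # 5 rows, 2 columns

# -1 asks NumPy to work out that dimension from the array size.
d = a.reshape(5, -1)      # same shape as c

print(a)
print(b)
print(c)
print(d.shape)            # (5, 2)
```

The total number of elements must stay the same, so a.reshape(3, 4) would raise a ValueError here.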
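32.
#slices in 2d array
import numpy as np
import pandas as pd
a=np.array([[1,2,3,4,5],[2,5,6,1,3],[6,7,8,9,1],[9,7,5,2,4]])
print(a)
slc1=a[0:3,0:4] #rows 0 to 2, columns 0 to 3
print(slc1)
slc2=a[:3,3:] #rows 0 to 2, columns 3 onwards
print(slc2)
slc3=a[1::2,:3] #every second row starting at row 1, columns 0 to 2
print(slc3)
Output:-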
[[1 2 3 4 5]
[2 5 6 1 3]
[6 7 8 9 1]
[9 7 5 2 4]]
[[1 2 3 4]
[2 5 6 1]
[6 7 8 9]]
[[4 5]
[1 3]
[9 1]]
[[2 5 6]
[9 7 5]]
[[ 1 2 3 7 8 9]
[ 4 5 6 10 11 12]]
[[ 1 2 3]
[ 4 5 6]
[ 7 8 9]
[10 11 12]]
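The code for this sample is not shown. The two results are consistent with joining two 2x3 arrays side by side and then one above the other. A minimal sketch, assuming np.hstack and np.vstack were used:

```python
import numpy as np

a = np.array([[1, 2, 3], [4, 5, 6]])
b = np.array([[7, 8, 9], [10, 11, 12]])

h = np.hstack((a, b))   # side by side: shape (2, 6)
v = np.vstack((a, b))   # one above the other: shape (4, 3)

print(h)
print(v)
```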
Output:-
[[1 2 3]
[4 5 6]
[7 8 9]]
[[10 11 12]
[13 14 15]]
Array after concatenation
[[ 1 2 3]
[ 4 5 6]
[ 7 8 9]
[10 11 12]
[13 14 15]]
Here a is a 2x2 array and b is a 2x3 array. Both have two rows, so they can be joined side by side
with axis=1. They cannot be stacked one above the other with axis=0, because their column counts differ
(2 and 3). Transposing b gives d, a 3x2 array whose column count matches that of a, so a and d can be
concatenated with axis=0.
import numpy as np
import pandas as pd
a=np.array([[1,2],[3,4]])
b=np.array([[3,4,5],[6,7,8]])
c=np.concatenate((a,b),axis=1) #axis=1 joins side by side; a and b both have 2 rows
print(c)
d=b.T #transpose: b is 2x3, so d is 3x2
print(d)
e=np.concatenate((a,d),axis=0) #axis=0 stacks vertically; a and d both have 2 columns
print(e)
Output:-
[[1 2 3 4 5]
[3 4 6 7 8]]
[[3 6]
[4 7]
[5 8]]
[[1 2]
[3 4]
[3 6]
[4 7]
[5 8]]
Output:-
[ 0 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23]
[[ 0 1 2 3 4 5]
[ 6 7 8 9 10 11]
[12 13 14 15 16 17]
[18 19 20 21 22 23]]
[array([[ 0, 1, 2],
[ 6, 7, 8],
[12, 13, 14],
[18, 19, 20]]), array([[ 3, 4, 5],
[ 9, 10, 11],
[15, 16, 17],
[21, 22, 23]])]
[array([[ 0, 1],
[ 6, 7],
[12, 13],
[18, 19]]), array([[ 2, 3],
[ 8, 9],
[14, 15],
[20, 21]]), array([[ 4, 5],
[10, 11],
[16, 17],
[22, 23]])]
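The code for this sample is not shown. The output is consistent with reshaping 24 values into a 4x6 array and splitting it column-wise into two and then three equal parts. A minimal sketch, assuming np.hsplit was used:

```python
import numpy as np

flat = np.arange(24)
a = flat.reshape(4, 6)

# hsplit cuts along the columns and returns a list of arrays.
halves = np.hsplit(a, 2)    # two 4x3 arrays
thirds = np.hsplit(a, 3)    # three 4x2 arrays

print(flat)
print(a)
print(halves)
print(thirds)
```

The number of parts must divide the number of columns exactly, so np.hsplit(a, 4) would raise a ValueError for this 6-column array.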
Output:-
[[0 1 2 3 4]
[5 6 7 8 9]]
[[ True False False False False]
[ True False False False False]]
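The code for this sample is not shown. True appears only at the positions of 0 and 5, which is consistent with testing each element for divisibility by 5. A minimal sketch under that assumed condition:

```python
import numpy as np

a = np.arange(10).reshape(2, 5)
print(a)

# Assumed condition: the comparison is applied to every element
# and returns a boolean array of the same shape.
mask = a % 5 == 0
print(mask)

# Indexing with the mask returns the matching elements as a 1-D array.
print(a[mask])    # [0 5]
```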
Output:-
[[0 1 2 3 4]
[5 6 7 8 9]]
[[0.3 1.3 2.3 3.3 4.3]
[5.3 6.3 7.3 8.3 9.3]]
[[ 0.3 2.3 4.3 6.3 8.3]
[10.3 12.3 14.3 16.3 18.3]]
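The code for this sample is not shown. The second array equals the first plus 0.3, and the third equals twice the first plus 0.3. A minimal sketch using those assumed expressions (the original may have combined the arrays differently):

```python
import numpy as np

a = np.arange(10).reshape(2, 5)

# A scalar is applied to every element; the result becomes a float array.
shifted = a + 0.3
scaled = 2 * a + 0.3    # assumed expression that reproduces the third array

print(a)
print(shifted)
print(scaled)
```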