Numpy - KickStart - Jupyter Notebook

10/8/21, 12:19 AM Numpy_KickStart - Jupyter Notebook
Why Numpy?
NumPy is used to perform numerical operations such as array addition and multiplication, creating dummy values, etc.
In [1]:
import numpy as np
In [2]:
arr_1 =np.array([1,2,3,4,5])
arr_1
Out[2]:
array([1, 2, 3, 4, 5])
In [4]:
type(arr_1)
Out[4]:
numpy.ndarray
In [5]:
list_1 = [1,2,3,4,5]
list_1
Out[5]:
[1, 2, 3, 4, 5]
In [6]:
type(list_1)
Out[6]:
list
List Vs Numpy
1. A list does not support element-wise operations directly; an array does.
2. An array is a homogeneous datatype (all elements share one dtype); a list is heterogeneous (it can mix types).
3. A list can be converted into an array and vice versa, but the list has no ndim or shape attributes, so the dimension information lives only in its nesting.
Create 1D array
localhost:8888/notebooks/Data science/Numpy_KickStart.ipynb 1/14

In [10]:
arr_1d = np.array([1,2,3,4,5])
print(arr_1d)
print('No of dimensions: ',arr_1d.ndim) #Attribute
print('No of elements : ',arr_1d.size) #Attribute
print('Index of max element : ',arr_1d.argmax()) #Returns the index of the max value, not the value itself
[1 2 3 4 5]
No of dimensions: 1
No of elements : 5
Index of max element : 4
Differences
In [13]:
list_1.append([6,7])
In [14]:
list_1
Out[14]:
[1, 2, 3, 4, 5, [6, 7]]
In [ ]:
arr_1d #An ndarray has no append method and a fixed size; adding values means building a new array
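To add values to an array, one option is np.append. The sketch below (variable names are illustrative, not from the notebook) shows that it returns a new array and leaves the original untouched.

```python
import numpy as np

arr = np.array([1, 2, 3, 4, 5])

# np.append does not modify `arr`; it builds and returns a new array.
extended = np.append(arr, [6, 7])

print(arr.size)   # 5
print(extended)   # [1 2 3 4 5 6 7]
```

Unlike list.append, the values 6 and 7 are added as separate elements, not as one nested item.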
In [16]:
arr_1d + 3
Out[16]:
array([4, 5, 6, 7, 8])
In [17]:
list_1 + 3 #Not possible to go for element-wise operation in list
---------------------------------------------------------------------------
TypeError Traceback (most recent call last)
<ipython-input-17-6657cea64c08> in <module>
----> 1 list_1 + 3
TypeError: can only concatenate list (not "int") to list
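For comparison, a minimal sketch of the same "add 3 to every element" task done both ways. The name `values` is a placeholder for illustration.

```python
import numpy as np

values = [1, 2, 3, 4, 5]  # hypothetical sample data

# A plain list needs an explicit loop (here, a list comprehension).
list_plus_3 = [v + 3 for v in values]

# An ndarray applies the operation to every element at once.
array_plus_3 = np.array(values) + 3

print(list_plus_3)    # [4, 5, 6, 7, 8]
print(array_plus_3)   # [4 5 6 7 8]
```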

In [18]:
list_2 = [1,3.5,'Vennela']
list_2
Out[18]:
[1, 3.5, 'Vennela']
In [21]:
arr_2 = np.array([1,2,3])
print(arr_2)
print(arr_2.dtype)
[1 2 3]
int32
In [22]:
arr_2 = np.array([1,2.4,3])
print(arr_2)
print(arr_2.dtype)
[1. 2.4 3. ]
float64
In [23]:
arr_2 = np.array([1,2.4,'3'])
print(arr_2)
print(arr_2.dtype)
['1' '2.4' '3']
<U32
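The three cells above show NumPy picking one common dtype for mixed input. A small sketch, assuming the string values are all numeric text, of checking the chosen dtype and converting back to numbers with astype:

```python
import numpy as np

# Mixed int, float and str input is stored as unicode strings.
mixed = np.array([1, 2.4, '3'])
print(mixed.dtype.kind)   # U

# Numeric text can be converted back; non-numeric text would raise ValueError.
as_float = mixed.astype(float)
print(as_float.dtype)     # float64
```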
Create a 2d array
In [28]:
arr_2d = np.array([[1,2,3],[4,5,6]])
print(arr_2d)
print('No of dimensions : ',arr_2d.ndim)
[[1 2 3]
[4 5 6]]
No of dimensions : 2
Create 3d array

In [33]:
arr_3d = np.array([[[1,2,3],[4,5,6],[7,8,9]]])
print(arr_3d)
print('No of dimensions : ',arr_3d.ndim)
print('Type of elements : ',arr_3d.dtype)
[[[1 2 3]
[4 5 6]
[7 8 9]]]
No of dimensions : 3
Type of elements : int32
Convert int to float
In [35]:
arr_3d_converted = arr_3d.astype(dtype = 'float')

print(arr_3d_converted)
print('No of dimensions : ',arr_3d_converted.ndim)
print('Type of elements : ',arr_3d_converted.dtype)
[[[1. 2. 3.]
[4. 5. 6.]
[7. 8. 9.]]]
No of dimensions : 3
Type of elements : float64
List to Array Conversion
In [40]:
list_3 = [[1,2,3,4],[3,7,8,9]]
print(list_3)
print(type(list_3))
#Conversion
list_to_array = np.array(list_3)
print(list_to_array)
print(type(list_to_array))
print(list_to_array.ndim)
[[1, 2, 3, 4], [3, 7, 8, 9]]
<class 'list'>
[[1 2 3 4]
[3 7 8 9]]
<class 'numpy.ndarray'>
2
Array to List Conversion

In [45]:
arr_4 = np.array([[1,2,3],[4,5,6]])
print(arr_4)
print(type(arr_4))
print('No of dimensions : ',arr_4.ndim)
#Conversion
arr_to_list = arr_4.tolist()
arr_to_list
[[1 2 3]
[4 5 6]]
<class 'numpy.ndarray'>
No of dimensions : 2
Out[45]:
[[1, 2, 3], [4, 5, 6]]
In [48]:
import pandas as pd
pd.read_csv('dummy_data.csv')
Out[48]:
       Name   Age  Salary
0       Ram  30.0   80000
1    Vinoth  32.0  120000
2  Ishwarya   NaN   70000
3    Shadab  27.0   60000
Create NaN with NumPy
In [52]:
arr_5 = np.array([[1.,2,3],[4,5,6]])
arr_5
Out[52]:
array([[1., 2., 3.],
[4., 5., 6.]])
In [53]:
arr_5[0][0] = np.nan

In [54]:
arr_5
Out[54]:
array([[nan, 2., 3.],
[ 4., 5., 6.]])
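Once an array contains NaN, ordinary statistics return NaN as well. A minimal sketch of detecting and skipping missing values, using an array equal to arr_5 above (the variable names here are illustrative):

```python
import numpy as np

arr = np.array([[np.nan, 2., 3.], [4., 5., 6.]])

# Boolean mask that is True wherever a value is missing.
mask = np.isnan(arr)
print(mask.sum())        # 1

# NaN propagates through ordinary reductions ...
print(arr.mean())        # nan

# ... while the nan-aware variants ignore it.
print(np.nanmean(arr))   # 4.0
```

NaN is a float value, which is why arr_5 was created with 1. instead of 1: an integer array cannot store it.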
Statistical Operations
In [55]:
arr_6 = np.array([1,2,3,4,5,6,7,8,9,10])
print(arr_6)
[ 1 2 3 4 5 6 7 8 9 10]
In [56]:
arr_6.sum()
Out[56]:
55
In [57]:
arr_6.prod()
Out[57]:
3628800
In [58]:
arr_6.mean()
Out[58]:
5.5
In [59]:
arr_6.std() #Standard deviation: how far the datapoints typically deviate from the mean (population form, ddof=0)
Out[59]:
2.8722813232690143
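The value above can be reproduced by hand. The sketch below computes the population standard deviation step by step and, for contrast, the sample standard deviation via ddof=1 (the name `data` is illustrative):

```python
import numpy as np

data = np.arange(1, 11)   # same values as arr_6

# Population standard deviation: root of the mean squared deviation.
manual_std = np.sqrt(np.mean((data - data.mean()) ** 2))

# Sample standard deviation divides by n - 1 instead of n.
sample_std = data.std(ddof=1)

print(np.isclose(manual_std, data.std()))   # True
print(sample_std > manual_std)              # True
```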
In [60]:
arr_6.argmax()
Out[60]:
9
Reshaping
In [64]:
arr_7 = np.array([[1,2,3,4,5],[2,3,4,4,6]])
print(arr_7)
print('Dimension: ',arr_7.ndim)
print('Shape : ',arr_7.shape)
[[1 2 3 4 5]
[2 3 4 4 6]]
Dimension: 2
Shape : (2, 5)
In [68]:
arr_7_reshape =arr_7.reshape((5,2))
print(arr_7_reshape)
print('Dimension: ',arr_7_reshape.ndim)
print('Shape : ',arr_7_reshape.shape)
[[1 2]
[3 4]
[5 2]
[3 4]
[4 6]]
Dimension: 2
Shape : (5, 2)
Reshape to 1 dimension
In [74]:
arr_7 = arr_7.reshape(1,10) #Still 2-D: one row, ten columns
print(arr_7)
print('Dimension: ',arr_7.ndim)
print('Shape : ',arr_7.shape)
[[1 2 3 4 5 2 3 4 4 6]]
Dimension: 2
Shape : (1, 10)
In [75]:
arr_7 = arr_7.flatten() #Returns a 1-D copy
print(arr_7)
print('Dimension: ',arr_7.ndim)
print('Shape : ',arr_7.shape)
[1 2 3 4 5 2 3 4 4 6]
Dimension: 1
Shape : (10,)
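flatten() always copies the data, while reshape(-1) lets NumPy infer the length and, for a contiguous array like this one, returns a view. A small sketch of the difference (names are illustrative):

```python
import numpy as np

grid = np.array([[1, 2, 3, 4, 5], [2, 3, 4, 4, 6]])

flat_copy = grid.flatten()     # independent copy
flat_view = grid.reshape(-1)   # -1 means "infer this length"; a view here

flat_copy[0] = 99   # does not touch grid
flat_view[1] = 77   # writes through to grid

print(grid[0, 0])        # 1
print(grid[0, 1])        # 77
print(flat_view.shape)   # (10,)
```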
Sequencing, Repetition and Random numbers

In [82]:
np.arange(1,21,dtype='int')
Out[82]:
array([ 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17,
18, 19, 20])
In [83]:
np.arange(1,21,dtype='float')
Out[83]:
array([ 1., 2., 3., 4., 5., 6., 7., 8., 9., 10., 11., 12., 13.,
14., 15., 16., 17., 18., 19., 20.])
In [84]:
np.linspace(start = 1,stop = 50,num=20) #Return evenly spaced numbers over a specified interval
Out[84]:
array([ 1. , 3.57894737, 6.15789474, 8.73684211, 11.31578947,
13.89473684, 16.47368421, 19.05263158, 21.63157895, 24.21052632,
26.78947368, 29.36842105, 31.94736842, 34.52631579, 37.10526316,
39.68421053, 42.26315789, 44.84210526, 47.42105263, 50. ])
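The spacing in the output above is (stop - start) / (num - 1), because both endpoints are included. A minimal check using the retstep option of np.linspace:

```python
import numpy as np

start, stop, num = 1, 50, 20

# retstep=True also returns the spacing between consecutive points.
points, step = np.linspace(start, stop, num, retstep=True)

expected_step = (stop - start) / (num - 1)   # 49 / 19
print(np.isclose(step, expected_step))       # True
print(len(points))                           # 20
```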
In [86]:
np.ones((3,5),dtype = 'int')
Out[86]:
array([[1, 1, 1, 1, 1],
[1, 1, 1, 1, 1],
[1, 1, 1, 1, 1]])
In [87]:
np.ones((3,5),dtype = 'float')
Out[87]:
array([[1., 1., 1., 1., 1.],
[1., 1., 1., 1., 1.],
[1., 1., 1., 1., 1.]])
In [90]:
np.zeros((5,3),dtype = 'int')
Out[90]:
array([[0, 0, 0],
[0, 0, 0],
[0, 0, 0],
[0, 0, 0],
[0, 0, 0]])

In [91]:
arr_3d
Out[91]:
array([[[1, 2, 3],
[4, 5, 6],
[7, 8, 9]]])
In [93]:
arr_3d.repeat(repeats = 10, axis=0)
Out[93]:
array([[[1, 2, 3],
[4, 5, 6],
[7, 8, 9]],
[[1, 2, 3],
[4, 5, 6],
[7, 8, 9]],
[[1, 2, 3],
[4, 5, 6],
[7, 8, 9]],
[[1, 2, 3],
[4, 5, 6],
[7, 8, 9]],
[[1, 2, 3],
[4, 5, 6],
[7, 8, 9]],
[[1, 2, 3],
[4, 5, 6],
[7, 8, 9]],
[[1, 2, 3],
[4, 5, 6],
[7, 8, 9]],
[[1, 2, 3],
[4, 5, 6],
[7, 8, 9]],
[[1, 2, 3],
[4, 5, 6],
[7, 8, 9]],
[[1, 2, 3],
[4, 5, 6],
[7, 8, 9]]])
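repeat() duplicates each element (or each slice along an axis), which differs from tiling the whole array. A short sketch on a 1-D array, where the difference is easier to see (`row` is an illustrative name):

```python
import numpy as np

row = np.array([1, 2, 3])

# repeat: every element is duplicated in place.
repeated = np.repeat(row, 2)
print(repeated)   # [1 1 2 2 3 3]

# tile: the whole array is laid out again end to end.
tiled = np.tile(row, 2)
print(tiled)      # [1 2 3 1 2 3]
```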

In [97]:
#To generate random numbers

random_numbers = np.random.rand(10,10) # Uniform samples in [0, 1)
print(random_numbers)
[[0.7522378 0.59840277 0.98032876 0.8909582 0.53373616 0.88835145
0.24153822 0.27212979 0.1555258 0.66672105]
[0.03359001 0.98019392 0.75856936 0.55605301 0.03174803 0.84657788
0.71622908 0.91338059 0.37556562 0.63562302]
[0.10884397 0.44691789 0.42777777 0.81528865 0.49332769 0.47537318
0.16523043 0.38317708 0.89125815 0.12553124]
[0.75474447 0.78261561 0.64383317 0.508903 0.86589117 0.87565516
0.63702356 0.86827629 0.31093215 0.92112643]
[0.28122148 0.11475459 0.2543637 0.67415472 0.40711809 0.07182503
0.10851266 0.95715354 0.47222885 0.08351885]
[0.92799134 0.14695707 0.14208547 0.71562343 0.55254851 0.27853705
0.54003526 0.91133382 0.36815828 0.85215297]
[0.93988949 0.341174 0.01166787 0.61474266 0.39748557 0.10211612
0.82334904 0.40665148 0.28809701 0.24895734]
[0.01654924 0.34930347 0.66160658 0.63317519 0.75035205 0.32912402
0.2542498 0.70585709 0.44998947 0.34655589]
[0.30147321 0.73018294 0.84467288 0.51520822 0.54461626 0.86300238
0.13285876 0.24993216 0.38268974 0.75638246]
[0.87338358 0.4557205 0.79204827 0.46789719 0.29564859 0.1751014
0.70685805 0.74206353 0.06701718 0.82941239]]

In [115]:
import matplotlib.pyplot as plt #Also imported in In [113]; repeated so this cell runs on its own
plt.hist(random_numbers)
Out[115]:
(array([[2., 1., 2., 0., 0., 0., 0., 2., 1., 2.],
[0., 2., 0., 2., 2., 0., 1., 2., 0., 1.],
[1., 1., 1., 0., 1., 0., 2., 1., 2., 1.],
[0., 0., 0., 0., 1., 3., 3., 1., 1., 1.],
[1., 0., 1., 1., 2., 3., 0., 1., 1., 0.],
[2., 1., 1., 1., 1., 0., 0., 0., 3., 1.],
[1., 2., 2., 0., 0., 1., 1., 2., 1., 0.],
[0., 0., 2., 1., 1., 0., 0., 2., 1., 3.],
[1., 1., 1., 4., 2., 0., 0., 0., 0., 1.],
[1., 1., 1., 1., 0., 0., 2., 1., 2., 1.]]),
array([0.01166787, 0.10853396, 0.20540005, 0.30226614, 0.39913223,
0.49599832, 0.59286441, 0.68973049, 0.78659658, 0.88346267,
0.98032876]),
<a list of 10 BarContainer objects>)
In [105]:
a = np.random.randint(low = 10, high=100, size=(10,5), dtype=int)

a
Out[105]:
array([[32, 76, 55, 98, 65],
[79, 73, 15, 21, 61],
[51, 97, 49, 97, 25],
[23, 83, 89, 35, 53],
[76, 85, 51, 88, 63],
[58, 41, 21, 13, 59],
[25, 31, 19, 35, 84],
[82, 89, 24, 15, 60],
[92, 45, 83, 59, 20],
[27, 27, 21, 90, 44]])

In [106]:
ages = [12,35,67,89,55,78,55,76,89,100]
ages
Out[106]:
[12, 35, 67, 89, 55, 78, 55, 76, 89, 100]
In [109]:
np.random.choice(a = ages,size=3)
Out[109]:
array([55, 76, 35])
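The random cells above give different results on every run. A hedged sketch of making the draw reproducible with a seeded Generator; the seed value 42 and the use of replace=False are choices made for this example, not something the notebook does:

```python
import numpy as np

ages = [12, 35, 67, 89, 55, 78, 55, 76, 89, 100]

# A seeded Generator produces the same sequence on every run.
rng = np.random.default_rng(seed=42)

# replace=False picks three different positions from the list.
picked = rng.choice(ages, size=3, replace=False)
print(picked)
```

Because 55 and 89 each appear twice in ages, equal values can still be drawn even with replace=False.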
In [112]:
norm_distribution_random_numbers = np.random.randn(10,10) #Return samples from the "standard normal" distribution

norm_distribution_random_numbers
Out[112]:
array([[-0.14998811, -0.82616376, 1.23162413, 1.50599222, 0.60775798,
0.82031135, 0.16314201, -0.27971942, -0.31255425, -1.41858954],
[-0.61427653, 0.53437206, 0.94536002, -0.34814053, 0.92670669,
-1.20521558, -0.84808193, 0.79223646, 2.45851022, 1.82426662],
[ 0.5661328 , -1.15224168, -0.84290388, -0.16048055, 0.61652193,
1.1043627 , -0.88178525, -1.05846469, -0.45731413, 0.20114114],
[-1.94570301, 0.49578246, -1.03705626, 0.35186015, 1.41369587,
0.85136387, 0.3640365 , 0.51675965, 0.72282229, 1.9518135 ],
[ 0.06346569, 1.12512869, 0.22062349, -1.11470712, -1.10188094,
1.86510639, 0.66377541, 1.01920725, -0.64348622, 1.09742148],
[ 0.18142665, -0.01470521, -0.33146925, 1.71768155, -1.15201099,
0.9561929 , -0.65252581, -2.86042729, -1.58878786, -0.82784187],
[-0.49608431, -1.41429201, 0.24803719, -0.07503125, 0.60806954,
-1.15681029, 0.20593078, 2.04048407, -0.38445193, -0.4233213 ],
[ 0.67697507, 0.55686488, -0.78769268, 1.23991432, 0.97276586,
0.83458431, 0.83824446, -0.38067527, -0.76783127, -0.34740588],
[-2.34481876, 0.63495551, 1.23336232, -0.81977836, -0.75801358,
-0.79793036, 1.00629451, 0.19687593, -0.48135247, -1.03210872],
[ 0.4654954 , 0.36415721, 1.51658244, 0.12111279, 0.29816271,
-0.15215329, -0.19026641, -0.47142746, 0.76256217, -0.0871029 ]])
In [113]:
import matplotlib.pyplot as plt

In [114]:
plt.hist(norm_distribution_random_numbers)
Out[114]:
(array([[1., 1., 0., 0., 2., 3., 3., 0., 0., 0.],
[0., 0., 1., 2., 0., 1., 5., 1., 0., 0.],
[0., 0., 0., 3., 1., 2., 0., 3., 1., 0.],
[0., 0., 0., 2., 1., 3., 1., 1., 2., 0.],
[0., 0., 0., 3., 0., 1., 3., 2., 1., 0.],
[0., 0., 0., 3., 0., 1., 3., 2., 1., 0.],
[0., 0., 0., 2., 1., 3., 3., 1., 0., 0.],
[1., 0., 0., 1., 3., 1., 2., 1., 0., 1.],
[0., 0., 1., 1., 5., 0., 2., 0., 0., 1.],
[0., 0., 1., 2., 2., 2., 0., 1., 1., 1.]]),
array([-2.86042729, -2.32853354, -1.79663979, -1.26474604, -0.73285229,
-0.20095854, 0.33093521, 0.86282896, 1.39472271, 1.92661647,
2.45851022]),
<a list of 10 BarContainer objects>)
OBSERVATION
NORMAL DISTRIBUTION:
It follows the empirical rule:
About 68% of the datapoints fall between -1SD and +1SD.
About 95% of the datapoints fall between -2SD and +2SD.
About 99.7% of the datapoints fall between -3SD and +3SD.
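The empirical rule can be checked numerically. This is a rough sketch: the sample size and seed are arbitrary choices, and the measured fractions will only approximate the theoretical percentages.

```python
import numpy as np

rng = np.random.default_rng(seed=0)
samples = rng.standard_normal(200_000)   # mean 0, SD 1

# Fraction of samples within k standard deviations of the mean.
within = {k: float(np.mean(np.abs(samples) <= k)) for k in (1, 2, 3)}

for k, fraction in within.items():
    print(f"within {k} SD: {fraction:.3f}")
```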

Explore where function
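As a starting point, a minimal sketch of the two common ways to call np.where, reusing the ages values from above. The threshold 60 and the labels are made up for illustration.

```python
import numpy as np

ages = np.array([12, 35, 67, 89, 55, 78, 55, 76, 89, 100])

# One argument: indices where the condition is True.
older_idx = np.where(ages > 60)[0]
print(older_idx)   # [2 3 5 7 8 9]

# Three arguments: choose element-wise between two values.
labels = np.where(ages > 60, 'senior', 'other')
print(labels[:3])  # ['other' 'other' 'senior']
```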
