Pandas - Jupyter Notebook
Pandas is a library built on NumPy specifically for data analysis. You will use pandas heavily
for data manipulation, visualization, building machine learning models, etc.
Pandas has two main data structures:
• Series
• DataFrames
The default way to store data is in DataFrames, so manipulating DataFrames quickly is probably the most important skill for data analysis.
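As a minimal sketch of how the two structures relate (the variable names and the three sample values here are illustrative, not taken from the notebook's own cells), a Series can be thought of as a single labelled column, and selecting one column of a DataFrame gives back a Series:

```python
import pandas as pd

# A Series is a one-dimensional labelled array: values plus an index.
symbols = pd.Series(["US", "AU", "IND"], name="symbol")

print(symbols.index.tolist())   # default integer labels
print(symbols.tolist())         # the underlying values

# A DataFrame holds one or more columns that share the same index.
frame = pd.DataFrame({"symbol": symbols})

# Selecting a single column returns a Series again.
column = frame["symbol"]
print(type(column).__name__)
```

The default index is simply 0, 1, 2, ... unless one is supplied explicitly.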
In [3]:
import pandas as pd
The DataFrame
The DataFrame is the most widely used data structure in data analysis. It is a table with rows and columns, where each row has an index label and each column has a meaningful name.
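The row labels and column names described above can be inspected directly. The following is a small sketch using a made-up table (the names `table`, `name`, `score` and the labels `r1` to `r3` are hypothetical and only serve the illustration):

```python
import pandas as pd

# Hypothetical toy table with explicit row labels.
table = pd.DataFrame(
    {"name": ["a", "b", "c"], "score": [10, 20, 30]},
    index=["r1", "r2", "r3"],
)

print(table.index.tolist())     # row labels
print(table.columns.tolist())   # column names
print(table.shape)              # (number of rows, number of columns)

# A single cell is addressed by row label and column name.
value = table.loc["r2", "score"]
print(value)
```

When no `index` argument is given, pandas falls back to integer labels starting at 0, which is what the examples in this notebook use.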
EXAMPLE - 1
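In [8]:
dic_world = {
    'country': ['United States', 'Australia', 'India', 'Russia', 'Morrocco'],
    'symbol': ['US', 'AU', 'IND', 'RUS', 'MOR'],
}
In [9]:
print(dic_world)
{'country': ['United States', 'Australia', 'India', 'Russia', 'Morrocco'], 'symbol': ['US', 'AU', 'IND', 'RUS', 'MOR']}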
In [10]:
dic_world["country"]
Out[10]:
['United States', 'Australia', 'India', 'Russia', 'Morrocco']
In [11]:
dic_world["symbol"]
Out[11]:
['US', 'AU', 'IND', 'RUS', 'MOR']
In [12]:
data = pd.DataFrame(dic_world)
In [13]:
print(type(data))
<class 'pandas.core.frame.DataFrame'>
In [14]:
print(data)
         country symbol
0  United States     US
1      Australia     AU
2          India    IND
3         Russia    RUS
4       Morrocco    MOR
In [15]:
print(data["country"])
0    United States
1        Australia
2            India
3           Russia
4         Morrocco
Name: country, dtype: object
In [16]:
print(data["symbol"])
0     US
1     AU
2    IND
3    RUS
4    MOR
Name: symbol, dtype: object
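The cells above select one column at a time with single brackets, which returns a Series. As a short sketch built from the `dic_world` values printed earlier (the names `one_column`, `sub_frame` and `both` are introduced here for illustration), passing a list of column names instead returns a DataFrame:

```python
import pandas as pd

dic_world = {
    "country": ["United States", "Australia", "India", "Russia", "Morrocco"],
    "symbol": ["US", "AU", "IND", "RUS", "MOR"],
}
data = pd.DataFrame(dic_world)

# Single brackets with one name: the result is a Series.
one_column = data["country"]

# Double brackets (a list of names): the result is a DataFrame.
sub_frame = data[["country"]]

# A list also lets you pick several columns and choose their order.
both = data[["symbol", "country"]]

print(type(one_column).__name__)
print(type(sub_frame).__name__)
print(both.columns.tolist())
```

The distinction matters later on, because some operations behave differently on a Series than on a one-column DataFrame.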
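EXAMPLE - 2
In [18]:
cars_dict = {
    'cars_per_cap': [809, 731, 588, 18, 200, 70, 45],
    'country': ['United states', 'Australia', 'Japan', 'India', 'Russia', 'Morroco', 'Egypt'],
    'drives_right': [False, True, True, True, False, False, False],
}
In [20]:
print(cars_dict)
{'cars_per_cap': [809, 731, 588, 18, 200, 70, 45], 'country': ['United states', 'Australia', 'Japan', 'India', 'Russia', 'Morroco', 'Egypt'], 'drives_right': [False, True, True, True, False, False, False]}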
In [21]:
print(cars_dict['cars_per_cap'])
[809, 731, 588, 18, 200, 70, 45]
In [22]:
cars = pd.DataFrame(cars_dict)
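Once the dictionary has been converted, it can help to check what pandas inferred for each column. The sketch below rebuilds `cars` from the values printed above and inspects its shape and column types; the helper variables `is_int` and `is_bool` are introduced here only for the illustration:

```python
import pandas as pd

cars_dict = {
    "cars_per_cap": [809, 731, 588, 18, 200, 70, 45],
    "country": ["United states", "Australia", "Japan", "India",
                "Russia", "Morroco", "Egypt"],
    "drives_right": [False, True, True, True, False, False, False],
}
cars = pd.DataFrame(cars_dict)

print(cars.shape)    # (rows, columns)
print(cars.dtypes)   # one inferred type per column

# Type checks that do not depend on the exact dtype name.
is_int = pd.api.types.is_integer_dtype(cars["cars_per_cap"])
is_bool = pd.api.types.is_bool_dtype(cars["drives_right"])
print(is_int, is_bool)
```

Each dictionary key becomes a column name, and each list becomes that column's values, so all lists need to have the same length.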
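AGGREGATION FUNCTIONS
In [24]:
cars
Out[24]:
   cars_per_cap        country  drives_right
0           809  United states         False
1           731      Australia          True
2           588          Japan          True
3            18          India          True
4           200         Russia         False
5            70        Morroco         False
6            45          Egypt         False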
In [25]:
cars.cars_per_cap
Out[25]:
0    809
1    731
2    588
3     18
4    200
5     70
6     45
Name: cars_per_cap, dtype: int64
In [26]:
print(cars.cars_per_cap.max())
809
In [27]:
print(cars.cars_per_cap.min())
18
In [28]:
print(cars.cars_per_cap.mean())
351.57142857142856
In [29]:
print(cars.cars_per_cap.std())
345.59555222005633
In [30]:
print(cars.cars_per_cap.count())
7
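The five calls above can also be collected in one step. The following sketch uses `Series.agg` with a list of function names, and additionally shows that `std()` computes the sample standard deviation by default (`ddof=1`), which is why it is larger than the population value obtained with `ddof=0`. The variable names are introduced here for illustration:

```python
import pandas as pd

cars_per_cap = pd.Series([809, 731, 588, 18, 200, 70, 45], name="cars_per_cap")

# Apply several aggregation functions at once; the result is a Series
# indexed by the function names.
summary = cars_per_cap.agg(["max", "min", "mean", "std", "count"])
print(summary)

# std() divides by (n - 1) by default; ddof=0 divides by n instead.
sample_std = cars_per_cap.std()
population_std = cars_per_cap.std(ddof=0)
print(sample_std, population_std)
```

`describe()` is a related shortcut that reports a similar set of statistics plus quartiles.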
In [39]:
In [41]:
Out[41]:
In [42]:
df.Age.max()
Out[42]:
30.0
In [43]:
df.Age.min()
Out[43]:
25.0
In [44]:
df.Age.mean()
Out[44]:
27.25
In [45]:
df.Age.std()
Out[45]:
2.217355782608345
In [46]:
df.Age.count()
Out[46]:
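The definition of `df` is not shown in this notebook, so the following is only a sketch with a hypothetical `df_demo` and made-up ages. It illustrates one behaviour worth knowing when aggregating a column such as `Age`: missing values are skipped by default, and `count()` reports only the non-missing entries, which can differ from the number of rows.

```python
import numpy as np
import pandas as pd

# Hypothetical data: five rows, one of them with a missing age.
df_demo = pd.DataFrame({"Age": [25, 26, np.nan, 28, 30]})

n_rows = len(df_demo)               # counts every row
n_valid = df_demo["Age"].count()    # counts non-missing values only
n_missing = df_demo["Age"].isna().sum()

# mean(), min(), max() and std() ignore NaN unless skipna=False is passed.
mean_age = df_demo["Age"].mean()

print(df_demo["Age"].dtype)         # a missing value forces a float column
print(n_rows, n_valid, n_missing)
print(mean_age)
```

A float result such as `30.0` for an age column is consistent with this situation, although it may equally mean the ages were stored as floats to begin with.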