XII Record (DataFrame & CSV)
STD – XII
PANDAS PROGRAMS
EX:6
AIM :
To create a dataframe for examination results and perform operations on its rows and
columns.
LIBRARY IMPORTED:
● pandas
METHODS:
● DataFrame()
● print()
● drop()
● rename()
ATTRIBUTES :
● loc
CODING :
import pandas as pd
exam =[{'Name': 'Raj', 'Score':12.5}, {'Name':'Tina', 'Score':20}]
df = pd.DataFrame(exam, index=['st1','st2'])
print("Adding a row st3")
df.loc['st3']=['John',15]
print(df)
print("\n Adding a Gender column")
df['Gender']=['M','F','M']
print(df)
print("\n Removing the row with the label st2")
df=df.drop('st2',axis=0)
print(df)
print("\n Removing the Gender column")
df=df.drop('Gender',axis=1)
print(df)
print("\n Rename the row label")
df=df.rename({'st1': 1,'st3':2}, axis=0)
print(df)
print("\n Rename the column label")
df=df.rename({'Score': 'Mark'}, axis=1)
print(df)
OUTPUT
Adding a row st3
Name Score
st1 Raj 12.5
st2 Tina 20.0
st3 John 15.0
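The row and column operations in this exercise can also be written with the `index=` and `columns=` keywords instead of `axis=0` and `axis=1`. The block below is a minimal sketch of that alternative spelling, not part of the original record. The variable names `trimmed` and `renamed` are chosen here for illustration.

```python
import pandas as pd

exam = [{'Name': 'Raj', 'Score': 12.5}, {'Name': 'Tina', 'Score': 20}]
df = pd.DataFrame(exam, index=['st1', 'st2'])
df.loc['st3'] = ['John', 15]      # add a row by label
df['Gender'] = ['M', 'F', 'M']    # add a column

# drop() returns a new DataFrame; df itself is left unchanged
trimmed = df.drop(index='st2', columns='Gender')

# rename() accepts separate mappings for row labels and column labels
renamed = trimmed.rename(index={'st1': 1, 'st3': 2}, columns={'Score': 'Mark'})
print(renamed)
```

Because `drop()` and `rename()` return new objects, the original program reassigns the result to `df` after every step. The sketch keeps each intermediate result under its own name instead.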
EX:7
AIM :
To create a pandas program to access DataFrame elements using indexing and slicing.
LIBRARY IMPORTED:
● pandas
METHODS:
● DataFrame()
● print()
ATTRIBUTES:
● loc
CODING :
import pandas as pd
d = [{'Name': 'Sachin', 'Age' :32, 'Gender': 'M'},
{'Name' : 'Vinitha', 'Age' :35, 'Gender': 'F'},
{'Name' : 'Rajesh', 'Age' :40, 'Gender': 'M'},
{'Name' : 'Sharma', 'Age' :28, 'Gender': 'M'},
{'Name' : 'Rosy', 'Age' : 42, 'Gender':'F'}]
df = pd.DataFrame(d,index=['emp1','emp2','emp3','emp4','emp5'])
print("All values of emp4 and emp5")
print(df.loc[['emp4', 'emp5']])
print("\nAll rows of age column")
print(df.loc[:,'Age'])
print("\n Boolean result for the age above 30")
print(df['Age']>30)
print("\n Name and age from emp1 to emp3")
print(df.loc['emp1': 'emp3', 'Name':'Age'])
print("\n Age and Gender from emp2 to emp4")
print(df.loc['emp2': 'emp4',['Age','Gender']])
OUTPUT
All values of emp4 and emp5
Name Age Gender
emp4 Sharma 28 M
emp5 Rosy 42 F
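The Boolean result printed by `df['Age']>30` can itself be used as a row selector. The sketch below assumes the same employee data and shows this, along with the difference between label slicing with `loc` (end label included) and position slicing with `iloc` (end position excluded). The names `mask`, `older`, `by_label` and `by_position` are illustrative only.

```python
import pandas as pd

d = [{'Name': 'Sachin', 'Age': 32, 'Gender': 'M'},
     {'Name': 'Vinitha', 'Age': 35, 'Gender': 'F'},
     {'Name': 'Rajesh', 'Age': 40, 'Gender': 'M'},
     {'Name': 'Sharma', 'Age': 28, 'Gender': 'M'},
     {'Name': 'Rosy', 'Age': 42, 'Gender': 'F'}]
df = pd.DataFrame(d, index=['emp1', 'emp2', 'emp3', 'emp4', 'emp5'])

# A Boolean Series keeps only the rows where the value is True
mask = df['Age'] > 30
older = df.loc[mask, ['Name', 'Age']]
print(older)

# loc slices by label and includes 'emp3'; iloc slices by position and stops before 3
by_label = df.loc['emp1':'emp3']
by_position = df.iloc[0:3]
print(by_label.equals(by_position))
```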
EX:8
AIM :
To create a pandas program to access the attributes and methods of the dataframe.
LIBRARY IMPORTED:
● pandas
● numpy
METHODS:
● DataFrame()
● arange()
● print()
● head()
● tail()
ATTRIBUTES :
● index
● columns
● dtypes
● shape
● size
● ndim
● empty
CODING :
import pandas as pd
import numpy as np
d = {'Name':pd.Series(['Earphone', 'Headphone', 'Speaker']),
'Rate': pd.Series([750, 1200,2700]),
'Model':pd.Series(['Boat','Mi','JBL'])}
df = pd.DataFrame(d, index=np.arange(0,3))
print("Index of the dataframe")
print(df.index)
print("\nColumns of the dataframe")
print(df.columns)
print("\nDatatypes of each column")
print(df.dtypes)
print("\nShape of the dataframe")
print(df.shape)
print("\nSize of the dataframe")
print(df.size)
print("\nDimensions of the dataframe")
print(df.ndim)
print("\nChecking whether the dataframe is empty")
print(df.empty)
print("\nFirst 2 rows")
print(df.head(2))
print("\nLast 2 rows")
print(df.tail(2))
OUTPUT:
Index of the dataframe
Int64Index([0, 1, 2], dtype='int64')
First 2 rows
Name Rate Model
0 Earphone 750 Boat
1 Headphone 1200 Mi
Last 2 rows
Name Rate Model
1 Headphone 1200 Mi
2 Speaker 2700 JBL
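The attributes listed in this exercise are related to one another: `size` is the product of the two numbers in `shape`, and `ndim` is 2 for any DataFrame. The sketch below checks this on the same product data, built from plain lists for brevity. Note that the exact text printed for `df.index` depends on the pandas version installed, so it may differ from the output recorded above.

```python
import pandas as pd

df = pd.DataFrame({'Name': ['Earphone', 'Headphone', 'Speaker'],
                   'Rate': [750, 1200, 2700],
                   'Model': ['Boat', 'Mi', 'JBL']})

rows, cols = df.shape            # shape is a (rows, columns) tuple
print(rows, cols)
print(df.size == rows * cols)    # size counts every cell in the table
print(df.ndim)                   # a DataFrame always has 2 dimensions
print(pd.DataFrame().empty)      # a DataFrame with no data is empty

# head(n) and tail(n) return the first and last n rows
first_two = df.head(2)
last_two = df.tail(2)
```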
EX:9
AIM :
To filter out rows, drop duplicate rows and NaN values in the given dataframe.
LIBRARY IMPORTED:
● pandas
● numpy
METHODS:
● DataFrame()
● print()
● duplicated()
● drop_duplicates()
● dropna()
CODING :
import pandas as pd
import numpy as np
df=pd.DataFrame({'Name':pd.Series([np.nan,'Hina','Hina','John']),
'Degree':pd.Series(['Masters','Graduate','Graduate','Masters']),
'Age':pd.Series([27,23,23,np.nan])})
print("Original DataFrame:")
print(df)
print("\nChecking for duplicated rows:")
print(df.duplicated())
print("\nRemoving duplicate rows:")
print(df.drop_duplicates())
print("\nRemoving NaN values:")
print(df.dropna())
OUTPUT:
Original DataFrame:
Name Degree Age
0 NaN Masters 27.0
1 Hina Graduate 23.0
2 Hina Graduate 23.0
3 John Masters NaN
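To see which rows each method in this exercise keeps, the sketch below repeats the three calls on the same data and stores the results. It also shows `dropna(subset=...)`, which is not used in the original program and is included here only as an optional variation that checks for NaN in selected columns.

```python
import numpy as np
import pandas as pd

df = pd.DataFrame({'Name': [np.nan, 'Hina', 'Hina', 'John'],
                   'Degree': ['Masters', 'Graduate', 'Graduate', 'Masters'],
                   'Age': [27, 23, 23, np.nan]})

dup_mask = df.duplicated()       # True for a row that repeats an earlier row
deduped = df.drop_duplicates()   # keeps the first copy of each repeated row
complete = df.dropna()           # drops every row that has any NaN

# Optional variation: only drop rows whose Name is missing
name_known = df.dropna(subset=['Name'])

print(dup_mask.tolist())
print(list(deduped.index), list(complete.index), list(name_known.index))
```

Row 2 is the only duplicate, and rows 0 and 3 are the only rows with a missing value, so `dropna()` leaves just the two Hina rows.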
EX:10
AIM :
To create a sales dataframe in which each row contains the Region, Item and Sale_amt.
Group the rows by region and print the total sale per region.
LIBRARY IMPORTED:
● pandas
METHODS:
● DataFrame()
● print()
● groupby()
● sum()
CODING :
import pandas as pd
df=pd.DataFrame({'Region':pd.Series(['East', 'East','Central', 'Central','West', 'West']),
'Item':pd.Series(['Computer','Printer','Television','Home Theater','Cell Phone','Video Game']),
'Sale_amt':pd.Series([58500,35800,43128,22500,67300,18900])})
print("Original Dataframe:")
print(df)
print("\nGroup by region and sum of sale amount")
# Select Sale_amt before summing so that the text column Item is not summed as well
g1 = df.groupby('Region')['Sale_amt'].sum()
print(g1)
OUTPUT:
Original Dataframe:
Region Item Sale_amt
0 East Computer 58500
1 East Printer 35800
2 Central Television 43128
3 Central Home Theater 22500
4 West Cell Phone 67300
5 West Video Game 18900
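The grouping step can be checked by hand: each region has two rows, so each total is the sum of two amounts. The sketch below assumes the same sales data and sums only the `Sale_amt` column. The name `totals` is illustrative.

```python
import pandas as pd

df = pd.DataFrame({
    'Region': ['East', 'East', 'Central', 'Central', 'West', 'West'],
    'Item': ['Computer', 'Printer', 'Television', 'Home Theater',
             'Cell Phone', 'Video Game'],
    'Sale_amt': [58500, 35800, 43128, 22500, 67300, 18900]})

# Group the rows by Region, then add up Sale_amt within each group
totals = df.groupby('Region')['Sale_amt'].sum()
print(totals)

# The result is a Series indexed by region, sorted by region name
print(totals['East'])   # 58500 + 35800
```

Selecting the numeric column before calling `sum()` keeps the text column `Item` out of the calculation.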
EX:11
AIM :
To import and export data between a CSV file and a DataFrame.
1. Create a CSV file named 'student' and add the values shown in the output below.
LIBRARY IMPORTED:
● pandas
METHODS:
● DataFrame()
● print()
● read_csv()
● to_csv()
CODING :
import pandas as pd
data = pd.read_csv (r'E:\student.csv')
print('Importing all the values from csv')
df = pd.DataFrame(data)
print(df)
print("\n Total mark for all the students")
df['Total']=df['Mark1']+df['Mark2']
print(df)
print("\nExporting Name and Total to another csv")
df1=pd.DataFrame(df, columns=['Name','Total'])
print(df1)
df1.to_csv(r'E:\result.csv')
print("\nData Exported")
OUTPUT :
Importing all the values from csv
Rollno Name Mark1 Mark2
0 11 Tina 78 87
1 12 Abay 88 98
2 13 Raj 67 85
3 14 Rosy 76 82
4 15 Moni 96 94
Data Exported
result.csv
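The program above depends on files on drive E:. The same import and export steps can be tried without any file by reading from and writing to an in-memory text buffer. This is a minimal sketch under that assumption: `io.StringIO` stands in for `student.csv` and `result.csv`, the sample rows are copied from the output above, and `index=False` is an optional choice made here to leave the row numbers out of the exported text.

```python
import io
import pandas as pd

# Stand-in for the contents of student.csv
csv_text = """Rollno,Name,Mark1,Mark2
11,Tina,78,87
12,Abay,88,98
13,Raj,67,85
14,Rosy,76,82
15,Moni,96,94
"""

df = pd.read_csv(io.StringIO(csv_text))    # read_csv already returns a DataFrame
df['Total'] = df['Mark1'] + df['Mark2']    # column-wise addition

# Stand-in for result.csv: write only Name and Total, without the row index
buffer = io.StringIO()
df[['Name', 'Total']].to_csv(buffer, index=False)
exported = buffer.getvalue()
print(exported)
```

Without `index=False`, `to_csv()` also writes the row labels 0 to 4 as an extra first column, which is what the original program does.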