
RECORD PROGRAMS (2022-23)

STD – XII
PANDAS PROGRAMS
EX:6
AIM :
To create a DataFrame of examination results and add, remove and rename its rows and
columns.

LIBRARY IMPORTED:
● pandas

METHODS:
● DataFrame()
● print()
● drop()
● rename()

ATTRIBUTES :

● loc

CODING :
import pandas as pd
exam =[{'Name': 'Raj', 'Score':12.5}, {'Name':'Tina', 'Score':20}]
df = pd.DataFrame(exam, index=['st1','st2'])
print("Adding a row st3")
df.loc['st3']=['John',15]
print(df)
print("\n Adding a Gender column")
df['Gender']=['M','F','M']
print(df)
print("\n Removing the row with the label st2")
df=df.drop('st2',axis=0)
print(df)
print("\n Removing the Gender column")
df=df.drop('Gender',axis=1)
print(df)
print("\n Rename the row label")
df=df.rename({'st1': 1,'st3':2}, axis=0)
print(df)
print("\n Rename the column label")
df=df.rename({'Score': 'Mark'}, axis=1)
print(df)
OUTPUT
Adding a row st3
     Name  Score
st1   Raj   12.5
st2  Tina   20.0
st3  John   15.0

Adding a Gender column
     Name  Score Gender
st1   Raj   12.5      M
st2  Tina   20.0      F
st3  John   15.0      M

Removing the row with the label st2
     Name  Score Gender
st1   Raj   12.5      M
st3  John   15.0      M

Removing the Gender column
     Name  Score
st1   Raj   12.5
st3  John   15.0

Rename the row label
   Name  Score
1   Raj   12.5
2  John   15.0

Rename the column label
   Name  Mark
1   Raj  12.5
2  John  15.0
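
The program above reassigns df after every drop() and rename() call. A minimal sketch of why, assuming default arguments (no inplace=True): both methods return a new DataFrame and leave the original untouched. The variable names without_st2 and renamed are illustrative and not part of the record program.

```python
import pandas as pd

exam = [{'Name': 'Raj', 'Score': 12.5}, {'Name': 'Tina', 'Score': 20}]
df = pd.DataFrame(exam, index=['st1', 'st2'])

# drop() returns a new DataFrame; df keeps both rows until it is reassigned
without_st2 = df.drop(index='st2')

# rename() accepts index= and columns= keywords as an alternative to axis=
renamed = df.rename(index={'st1': 1}, columns={'Score': 'Mark'})

print(list(df.index))            # original row labels are still present
print(list(without_st2.index))   # only the remaining row label
print(list(renamed.columns))     # column labels after renaming
```

Using the index= and columns= keywords avoids having to remember which axis number refers to rows and which to columns.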

EX:7
AIM :
To create a pandas program to access DataFrame elements using indexing and slicing.

LIBRARY IMPORTED:
● pandas
METHODS:
● DataFrame()
● print()

ATTRIBUTES:
● loc

CODING :
import pandas as pd
d = [{'Name': 'Sachin', 'Age' :32, 'Gender': 'M'},
{'Name' : 'Vinitha', 'Age' :35, 'Gender': 'F'},
{'Name' : 'Rajesh', 'Age' :40, 'Gender': 'M'},
{'Name' : 'Sharma', 'Age' :28, 'Gender': 'M'},
{'Name' : 'Rosy', 'Age' : 42, 'Gender':'F'}]
df = pd.DataFrame(d,index=['emp1','emp2','emp3','emp4','emp5'])
print("All values of emp4 and emp5")
print(df.loc[['emp4', 'emp5']])
print("\nAll rows of age column")
print(df.loc[:,'Age'])
print("\n Boolean result for the age above 30")
print(df['Age']>30)
print("\n Name and age from emp1 to emp3")
print(df.loc['emp1': 'emp3', 'Name':'Age'])
print("\n Age and Gender from emp2 to emp4")
print(df.loc['emp2': 'emp4',['Age','Gender']])

OUTPUT
All values of emp4 and emp5
        Name  Age Gender
emp4  Sharma   28      M
emp5    Rosy   42      F

All rows of age column
emp1    32
emp2    35
emp3    40
emp4    28
emp5    42
Name: Age, dtype: int64

Boolean result for the age above 30
emp1     True
emp2     True
emp3     True
emp4    False
emp5     True
Name: Age, dtype: bool

Name and age from emp1 to emp3
         Name  Age
emp1   Sachin   32
emp2  Vinitha   35
emp3   Rajesh   40

Age and Gender from emp2 to emp4
      Age Gender
emp2   35      F
emp3   40      M
emp4   28      M
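
The Boolean result printed above can also be passed back to loc to select the matching rows. The following is a small sketch of that idea, together with a comparison of label slicing (loc, end label included) and position slicing (iloc, end position excluded). The names mask, older, by_label and by_pos are illustrative only.

```python
import pandas as pd

d = [{'Name': 'Sachin', 'Age': 32, 'Gender': 'M'},
     {'Name': 'Vinitha', 'Age': 35, 'Gender': 'F'},
     {'Name': 'Rajesh', 'Age': 40, 'Gender': 'M'},
     {'Name': 'Sharma', 'Age': 28, 'Gender': 'M'},
     {'Name': 'Rosy', 'Age': 42, 'Gender': 'F'}]
df = pd.DataFrame(d, index=['emp1', 'emp2', 'emp3', 'emp4', 'emp5'])

# A Boolean Series used as a row selector keeps only the True rows
mask = df['Age'] > 30
older = df.loc[mask, ['Name', 'Age']]
print(older)

# loc slices by label and includes 'emp3'; iloc slices by position and excludes 3
by_label = df.loc['emp1':'emp3']
by_pos = df.iloc[0:3]
print(by_label.equals(by_pos))
```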

EX:8
AIM :
To create a pandas program to access the attributes and methods of the dataframe.

LIBRARY IMPORTED:
● pandas
● numpy

METHODS:
● DataFrame()
● arange()
● print()
● head()
● tail()
ATTRIBUTES :

● index
● columns
● dtypes
● shape
● size
● ndim
● empty

CODING :
import pandas as pd
import numpy as np
d = {'Name':pd.Series(['Earphone', 'Headphone', 'Speaker']),
'Rate': pd.Series([750, 1200,2700]),
'Model':pd.Series(['Boat','Mi','JBL'])}
df = pd.DataFrame(d, index=np.arange(0,3))
print("Index of the dataframe")
print(df.index)
print("\nColumns of the dataframe")
print(df.columns)
print("\nDatatypes of each column")
print(df.dtypes)
print("\nShape of the dataframe")
print(df.shape)
print("\nSize of the dataframe")
print(df.size)
print("\nDimensions of the dataframe")
print(df.ndim)
print("\nChecking whether the dataframe is empty")
print(df.empty)
print("\nFirst 2 rows")
print(df.head(2))
print("\nLast 2 rows")
print(df.tail(2))

OUTPUT:
Index of the dataframe
Int64Index([0, 1, 2], dtype='int64')

Columns of the dataframe
Index(['Name', 'Rate', 'Model'], dtype='object')

Datatypes of each column
Name     object
Rate      int64
Model    object
dtype: object

Shape of the dataframe
(3, 3)

Size of the dataframe
9

Dimensions of the dataframe
2

Checking whether the dataframe is empty
False

First 2 rows
        Name  Rate Model
0   Earphone   750  Boat
1  Headphone  1200    Mi

Last 2 rows
        Name  Rate Model
1  Headphone  1200    Mi
2    Speaker  2700   JBL
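
The attributes printed above are related to one another. As a brief sketch (the DataFrame is rebuilt from plain lists here for brevity, and empty_df is an illustrative name), size equals rows multiplied by columns, ndim equals the length of shape, and empty is True only when the DataFrame has no elements:

```python
import pandas as pd

df = pd.DataFrame({'Name': ['Earphone', 'Headphone', 'Speaker'],
                   'Rate': [750, 1200, 2700],
                   'Model': ['Boat', 'Mi', 'JBL']})

# shape is a (rows, columns) tuple
rows, cols = df.shape
print(rows * cols == df.size)      # size counts every cell
print(len(df.shape) == df.ndim)    # a DataFrame always has 2 dimensions

# A DataFrame created without data has shape (0, 0) and is reported as empty
empty_df = pd.DataFrame()
print(empty_df.empty, empty_df.shape)
```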

EX:9
AIM :
To filter out rows, drop duplicate rows and NaN values in the given dataframe.

LIBRARY IMPORTED:
● pandas
● numpy

METHODS:
● DataFrame()
● print()
● duplicated()
● drop_duplicates()
● dropna()
CODING :
import pandas as pd
import numpy as np
df=pd.DataFrame({'Name':pd.Series([np.nan,'Hina','Hina','John']),
'Degree':pd.Series(['Masters','Graduate','Graduate','Masters']),
'Age':pd.Series([27,23,23,np.nan])})
print("Original DataFrame:")
print(df)
print("\nChecking for duplicated rows:")
print(df.duplicated())
print("\nRemoving duplicate rows:")
print(df.drop_duplicates())
print("\nRemoving NaN values:")
print(df.dropna())

OUTPUT:
Original DataFrame:
   Name    Degree   Age
0   NaN   Masters  27.0
1  Hina  Graduate  23.0
2  Hina  Graduate  23.0
3  John   Masters   NaN

Checking for duplicated rows:
0    False
1    False
2     True
3    False
dtype: bool

Removing duplicate rows:
   Name    Degree   Age
0   NaN   Masters  27.0
1  Hina  Graduate  23.0
3  John   Masters   NaN

Removing NaN values:
   Name    Degree   Age
1  Hina  Graduate  23.0
2  Hina  Graduate  23.0
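
The program above uses the default behaviour of each method: only the second copy of a duplicate is flagged, and any row containing a NaN is dropped. As a hedged sketch of how the optional keep and subset parameters change this (the variable names are illustrative, and the DataFrame is rebuilt from plain lists):

```python
import numpy as np
import pandas as pd

df = pd.DataFrame({'Name': [np.nan, 'Hina', 'Hina', 'John'],
                   'Degree': ['Masters', 'Graduate', 'Graduate', 'Masters'],
                   'Age': [27, 23, 23, np.nan]})

# keep=False flags every copy of a duplicated row, not just the later ones
all_dupes = df.duplicated(keep=False)
print(all_dupes)

# keep='last' retains the last copy instead of the first
keep_last = df.drop_duplicates(keep='last')
print(keep_last)

# subset= limits the NaN check to the listed columns
named_only = df.dropna(subset=['Name'])
print(named_only)
```

With subset=['Name'], the row for John is kept even though its Age is NaN, because only the Name column is checked.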

EX:10
AIM :
To create a dataframe sales where each row contains the region, item and sale_amt.
Group the rows by region and print the total sale per region.

LIBRARY IMPORTED:
● pandas

METHODS:
● DataFrame()
● print()
● groupby()
● sum()
CODING :
import pandas as pd
df=pd.DataFrame({'Region':pd.Series(['East','East','Central','Central','West','West']),
'Item':pd.Series(['Computer','Printer','Television','Home Theater',
'Cell Phone','Video Game']),
'Sale_amt':pd.Series([58500,35800,43128,22500,67300,18900])})
print("Original Dataframe:")
print(df)
print("\nGroup by region and sum of sale amount")
# Select Sale_amt before summing so that the text column Item is not aggregated
g1 = df.groupby('Region')[['Sale_amt']].sum()
print(g1)

OUTPUT:
Original Dataframe:
    Region          Item  Sale_amt
0     East      Computer     58500
1     East       Printer     35800
2  Central    Television     43128
3  Central  Home Theater     22500
4     West    Cell Phone     67300
5     West    Video Game     18900

Group by region and sum of sale amount
         Sale_amt
Region
Central     65628
East        94300
West        86200
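
Grouping is not limited to a single total. The sketch below, which is an illustration rather than part of the record program, selects the Sale_amt column from the grouped data and uses agg() to compute several statistics per region at once. The names totals and summary are illustrative.

```python
import pandas as pd

df = pd.DataFrame({
    'Region': ['East', 'East', 'Central', 'Central', 'West', 'West'],
    'Item': ['Computer', 'Printer', 'Television', 'Home Theater',
             'Cell Phone', 'Video Game'],
    'Sale_amt': [58500, 35800, 43128, 22500, 67300, 18900]})

# Selecting one column from the groups gives a Series indexed by Region
totals = df.groupby('Region')['Sale_amt'].sum()
print(totals)

# agg() with a list of function names returns one column per statistic
summary = df.groupby('Region')['Sale_amt'].agg(['sum', 'max', 'count'])
print(summary)
```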

EX:11
AIM :
To import and export data between CSV and DataFrame
1. Create a CSV file named ‘student.csv’ and add the values given below

Rollno  Name  Mark1  Mark2
11      Tina  78     87
12      Abay  88     98
13      Raj   67     85
14      Rosy  76     82
15      Moni  96     94

2. Import all the values into a DataFrame
3. Find the total for each student
4. Export only Name and Total to ‘result.csv’

LIBRARY IMPORTED:
● pandas

METHODS:
● DataFrame()
● print()
● read_csv()
● to_csv()

CODING :
import pandas as pd
data = pd.read_csv (r'E:\student.csv')
print('Importing all the values from csv')
df = pd.DataFrame(data)
print(df)
print("\n Total mark for all the students")
df['Total']=df['Mark1']+df['Mark2']
print(df)
print("\nExporting Name and Total to another csv")
df1=pd.DataFrame(df, columns=['Name','Total'])
print(df1)
df1.to_csv(r'E:\result.csv', index=False)
print("\nData Exported")

OUTPUT :
Importing all the values from csv
   Rollno  Name  Mark1  Mark2
0      11  Tina     78     87
1      12  Abay     88     98
2      13   Raj     67     85
3      14  Rosy     76     82
4      15  Moni     96     94

Total mark for all the students
   Rollno  Name  Mark1  Mark2  Total
0      11  Tina     78     87    165
1      12  Abay     88     98    186
2      13   Raj     67     85    152
3      14  Rosy     76     82    158
4      15  Moni     96     94    190

Exporting Name and Total to another csv
   Name  Total
0  Tina    165
1  Abay    186
2   Raj    152
3  Rosy    158
4  Moni    190

Data Exported
result.csv
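
The same import and export steps can be tried without a file on the E: drive. The sketch below is an illustration under that assumption: it uses io.StringIO as an in-memory stand-in for both student.csv and result.csv, and the names csv_text, buffer and result_text are hypothetical. Passing index=False to to_csv() leaves out the row numbers so that only Name and Total are written.

```python
import io
import pandas as pd

# In-memory text standing in for the contents of student.csv
csv_text = ("Rollno,Name,Mark1,Mark2\n"
            "11,Tina,78,87\n"
            "12,Abay,88,98\n"
            "13,Raj,67,85\n"
            "14,Rosy,76,82\n"
            "15,Moni,96,94\n")

# read_csv() already returns a DataFrame
df = pd.read_csv(io.StringIO(csv_text))
df['Total'] = df['Mark1'] + df['Mark2']

# Write only Name and Total, without the index column
buffer = io.StringIO()
df[['Name', 'Total']].to_csv(buffer, index=False)
result_text = buffer.getvalue()
print(result_text)
```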
