I.P Practical

INFORMATICS PRACTICES

PRACTICAL FILE

SECTION(2021-22)

BY-TEJASHVI CHOUDHARY

CLASS-12-E

ROLL NO. 22
Data Handling Using Pandas-1
Series
Create an empty series-

 import pandas as pd
 x=pd.Series(dtype='float64')
 print(x)
Output-
Series([], dtype: float64)
Create a series by taking index as a,b,c,d-

 import pandas as pd
 x=pd.Series([100,200,300,400],index=['a','b','c','d'])
 print(x)
Output-
a 100
b 200
c 300
d 400
dtype: int64

To create a series with range() and for loop-

 import pandas as pd
 s=pd.Series(range(1,17,2),index=[x for x in 'abcdefgh'])
 print(s)
Output-
a 1
b 3
c 5
d 7
e 9
f 11
g 13
h 15
dtype: int64

Create a series with missing values-

 import pandas as pd
 import numpy as np
 sobj=pd.Series([7.5,5.4,np.nan,-34.5])
 print(sobj)
Output-
0 7.5
1 5.4
2 NaN
3 -34.5
dtype: float64
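
The missing value above can also be detected and handled. The following is a minimal sketch, assuming the same series; the variable names (missing_mask, missing_count, filled, dropped) are only illustrative and the fill value 0 is an arbitrary choice.

```python
import numpy as np
import pandas as pd

sobj = pd.Series([7.5, 5.4, np.nan, -34.5])

# Boolean mask that is True wherever a value is missing
missing_mask = sobj.isna()
missing_count = int(missing_mask.sum())

# Replace missing values with 0 (an assumed fill value for illustration)
filled = sobj.fillna(0)

# Or drop the rows that hold missing values
dropped = sobj.dropna()

print(missing_count)
print(filled)
print(dropped)
```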

To perform indexing, slicing and accessing data from a series-

 import pandas as pd
 s=pd.Series([1,2,3,4,5],index=['a','b','c','d','e'])
 print(s.iloc[0])
 print(s[:3])
 print(s[-3:])
 print(s[:])
 print(s[1:4])
 print(s[::-1])
 print(s[0::2])
Output-
1
a 1
b 2
c 3
dtype: int64
c 3
d 4
e 5
dtype: int64
a 1
b 2
c 3
d 4
e 5
dtype: int64
b 2
c 3
d 4
dtype: int64
e 5
d 4
c 3
b 2
a 1
dtype: int64
a 1
c 3
e 5
dtype: int64

Create a series and use the loc and iloc indexers-

 import pandas as pd
 s=pd.Series([1,2,3,4,5],index=['a','b','c','d','e'])
 print(s.iloc[1:4])
 print(s.loc['b':'e'])
Output-
b 2
c 3
d 4
dtype: int64
b 2
c 3
d 4
e 5
dtype: int64
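
The two outputs above differ because iloc slices by position and excludes the stop position, while loc slices by label and includes the stop label. A small sketch of that difference, assuming the same series (the names by_position and by_label are illustrative):

```python
import pandas as pd

s = pd.Series([1, 2, 3, 4, 5], index=['a', 'b', 'c', 'd', 'e'])

# iloc: positions 1, 2 and 3 (position 4 is excluded)
by_position = s.iloc[1:4]

# loc: labels 'b' through 'd' (the stop label 'd' is included)
by_label = s.loc['b':'d']

# Both selections hold the same rows b, c, d
print(by_position.equals(by_label))
```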
Create a series using a dictionary-

 import pandas as pd
 series =pd.Series({'Jan':31,'Feb':28,'Mar':31,'Apr':30})
 print(series)
Output-
Jan 31
Feb 28
Mar 31
Apr 30
dtype: int64

Create a series using a dictionary and also name it-

 import pandas as pd
 series=pd.Series({'Jan':31,'Feb':28,'Mar':31,'Apr':30})
 series.name='Days'
 series.index.name='month'
 print(series)
Output-
month
Jan 31
Feb 28
Mar 31
Apr 30
Name: Days, dtype: int64
To create a series using a mathematical expression-

 import pandas as pd
 import numpy as np
 s1=np.arange(10,15)
 print(s1)
 sobj=pd.Series(index=s1,data=s1*4)
 print(sobj)
Output-
[10 11 12 13 14]
10 40
11 44
12 48
13 52
14 56
dtype: int64

To illustrate the working of the head() and tail() functions in a series-

 import pandas as pd
 series1=pd.Series([10,20,30,40,50],index=['a','b','c','d','e'])
 print(series1)
 print(series1.head())
 print(series1.head(2))
 print(series1.tail(2))
 print(series1.tail())
 print(series1.head(-2))
 print(series1.tail(-2))
Output-
a 10
b 20
c 30
d 40
e 50
dtype: int64
a 10
b 20
c 30
d 40
e 50
dtype: int64
a 10
b 20
dtype: int64
d 40
e 50
dtype: int64
a 10
b 20
c 30
d 40
e 50
dtype: int64
a 10
b 20
c 30
dtype: int64
c 30
d 40
e 50
dtype: int64
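
The negative arguments in the last two calls can be read as "all except": head(-2) leaves out the last 2 rows and tail(-2) leaves out the first 2 rows. A short sketch comparing them with equivalent iloc slices, assuming the same series:

```python
import pandas as pd

series1 = pd.Series([10, 20, 30, 40, 50], index=['a', 'b', 'c', 'd', 'e'])

# head(-2): every row except the last two
all_but_last_two = series1.head(-2)

# tail(-2): every row except the first two
all_but_first_two = series1.tail(-2)

# The same selections written as positional slices
print(all_but_last_two.equals(series1.iloc[:-2]))
print(all_but_first_two.equals(series1.iloc[2:]))
```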
To illustrate the working of mathematical operations on series-

 import pandas as pd
 s1=pd.Series([10,20,30,40,50])
 s2=pd.Series([1,2,3,4,5])
 print(s1+s2)
 print(s1-s2)
 print(s1*s2)
 print(s1/s2)
Output-
0 11
1 22
2 33
3 44
4 55
dtype: int64
0 9
1 18
2 27
3 36
4 45
dtype: int64
0 10
1 40
2 90
3 160
4 250
dtype: int64
0 10.0
1 10.0
2 10.0
3 10.0
4 10.0
dtype: float64
DataFrame
To sort the data of student dataframe on marks-

 import pandas as pd
 student_marks=pd.Series({'Vijaya':80,'Rahul':92,'Meghna':67,
'Radhika':95,'Shaurya':97})
 student_age=pd.Series({'Vijaya':32,'Rahul':28,'Meghna':30,
'Radhika':25,'Shaurya':20})
 student_df=pd.DataFrame({'Marks':student_marks,'Age':
student_age})
 print(student_df)
 print(student_df.sort_values(by=['Marks']))
 print(student_df.sort_values(by=['Marks'],ascending=False))
Output-
Marks Age

Vijaya 80 32
Rahul 92 28
Meghna 67 30
Radhika 95 25
Shaurya 97 20
Marks Age
Meghna 67 30
Vijaya 80 32
Rahul 92 28
Radhika 95 25
Shaurya 97 20
Marks Age
Shaurya 97 20
Radhika 95 25
Rahul 92 28
Vijaya 80 32
Meghna 67 30

Create a dataframe and use transpose function-

 import pandas as pd
 marks_dict={'2018':[85.4,88.2,80.3,79.0],'2019':
[77.9,80.5,78.6,76.2],'2020':[86.5,90.0,77.5,80.5]}
 df=pd.DataFrame(marks_dict,index=['Accountancy','IP','Eco','English'])
 print(df)
 print('\n')
 df1=df.T
 print('After Transpose:')
 print(df1)
Output-
2018 2019 2020
Accountancy 85.4 77.9 86.5
IP 88.2 80.5 90.0
Eco 80.3 78.6 77.5
English 79.0 76.2 80.5

After Transpose:
Accountancy IP Eco English
2018 85.4 88.2 80.3 79.0
2019 77.9 80.5 78.6 76.2
2020 86.5 90.0 77.5 80.5

To perform binary operations on two dataframes-

 import pandas as pd
 d1={'unit test-1':[22,44,62,63,55],'unit test-2':[47,52,36,85,25]}
 d2={'unit test-1':[41,60,71,77,54],'unit test-2':[51,11,56,96,75]}
 x=pd.DataFrame(d1)
 y=pd.DataFrame(d2)
 print(x)
 print(y)
 print('addition')
 print(x.add(y))
 print('subtraction')
 print(x.sub(y))
 print('multiplication')
 print(x.mul(y))
 print('division')
 print(x.div(y))
 print(x.radd(y))
 print(x.rsub(y))
Output-
unit test-1 unit test-2
0 22 47
1 44 52
2 62 36
3 63 85
4 55 25
unit test-1 unit test-2
0 41 51
1 60 11
2 71 56
3 77 96
4 54 75
addition
unit test-1 unit test-2
0 63 98
1 104 63
2 133 92
3 140 181
4 109 100
subtraction
unit test-1 unit test-2
0 -19 -4
1 -16 41
2 -9 -20
3 -14 -11
4 1 -50
multiplication
unit test-1 unit test-2
0 902 2397
1 2640 572
2 4402 2016
3 4851 8160
4 2970 1875
division
unit test-1 unit test-2
0 0.536585 0.921569
1 0.733333 4.727273
2 0.873239 0.642857
3 0.818182 0.885417
4 1.018519 0.333333
unit test-1 unit test-2
0 63 98
1 104 63
2 133 92
3 140 181
4 109 100
unit test-1 unit test-2
0 19 4
1 16 -41
2 9 20
3 14 11
4 -1 50
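
The last two tables come from the reverse operations: x.radd(y) computes y+x and x.rsub(y) computes y-x. A minimal sketch that checks this on a shortened copy of the data (only the first two rows are used here to keep it small):

```python
import pandas as pd

x = pd.DataFrame({'unit test-1': [22, 44], 'unit test-2': [47, 52]})
y = pd.DataFrame({'unit test-1': [41, 60], 'unit test-2': [51, 11]})

# x.rsub(y) is y - x, so it matches y.sub(x)
reverse_sub = x.rsub(y)
print(reverse_sub.equals(y.sub(x)))

# x.radd(y) is y + x; addition gives the same values either way
reverse_add = x.radd(y)
print(reverse_add.equals(x.add(y)))
```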

To perform iterations over the rows of a dataframe (iterrows)-

 import pandas as pd
 total_sales={2015:{'Qtrl1':34500,'Qtrl2':45000,'Qtrl3':50000,'Qtrl4':39000},
    2016:{'Qtrl1':44500,'Qtrl2':65000,'Qtrl3':70000,'Qtrl4':49000},
    2017:{'Qtrl1':44500,'Qtrl2':65000,'Qtrl3':70000,'Qtrl4':49000}}
 df=pd.DataFrame(total_sales)
 print(df)
 for (row,rowSeries) in df.iterrows():
     print('Rowindx:',row)
     print('containing:')
     print(rowSeries)
Output-
2015 2016 2017
Qtrl1 34500 44500 44500
Qtrl2 45000 65000 65000
Qtrl3 50000 70000 70000
Qtrl4 39000 49000 49000
Rowindx: Qtrl1
containing:
2015 34500
2016 44500
2017 44500
Name: Qtrl1, dtype: int64
Rowindx: Qtrl2
containing:
2015 45000
2016 65000
2017 65000
Name: Qtrl2, dtype: int64
Rowindx: Qtrl3
containing:
2015 50000
2016 70000
2017 70000
Name: Qtrl3, dtype: int64
Rowindx: Qtrl4
containing:
2015 39000
2016 49000
2017 49000
Name: Qtrl4, dtype: int64

To perform iterations over the columns of a dataframe (items, formerly iteritems)-

 import pandas as pd
 total_sales={2015:{'Qtrl1':34500,'Qtrl2':45000,'Qtrl3':50000,'Qtrl4':39000},
    2016:{'Qtrl1':44500,'Qtrl2':65000,'Qtrl3':70000,'Qtrl4':49000},
    2017:{'Qtrl1':44500,'Qtrl2':65000,'Qtrl3':70000,'Qtrl4':49000}}
 df=pd.DataFrame(total_sales)
 print(df)
 # items() replaces iteritems(), which was removed in pandas 2.0
 for (col,colseries) in df.items():
     print('Column Index:',col)
     print('Containing:')
     print(colseries)
Output-
2015 2016 2017
Qtrl1 34500 44500 44500
Qtrl2 45000 65000 65000
Qtrl3 50000 70000 70000
Qtrl4 39000 49000 49000
Column Index: 2015
Containing:
Qtrl1 34500
Qtrl2 45000
Qtrl3 50000
Qtrl4 39000
Name: 2015, dtype: int64
Column Index: 2016
Containing:
Qtrl1 44500
Qtrl2 65000
Qtrl3 70000
Qtrl4 49000
Name: 2016, dtype: int64
Column Index: 2017
Containing:
Qtrl1 44500
Qtrl2 65000
Qtrl3 70000
Qtrl4 49000
Name: 2017, dtype: int64

Program to iterate over a dataframe containing names and marks, then calculate grade as per marks-

 import pandas as pd
 import numpy as np
 names=pd.Series(['Sanjeev','Rajeev','Sanjay','Abhay'])
 marks=pd.Series([76,86,55,54])
 stud={'Name':names,'Marks':marks}
 df=pd.DataFrame(stud,columns=['Name','Marks'])
 df['Grade']=np.nan
 print('Initial Values in DataFrame')
 print(df)
 # items() replaces iteritems(), which was removed in pandas 2.0
 for (col,colSeries) in df.items():
     length=len(colSeries)
     if col=='Marks':
         istMrks=[]
         for row in range(length):
             mrks=colSeries[row]
             if mrks>=90:
                 istMrks.append('A+')
             elif mrks>=70:
                 istMrks.append('A')
             elif mrks>=60:
                 istMrks.append('B')
             elif mrks>=50:
                 istMrks.append('C')
             elif mrks>=40:
                 istMrks.append('D')
             else:
                 istMrks.append('F')
         df['Grade']=istMrks
 print('\n\nDataFrame after calculation of Grade')
 print(df)
Output-
Initial Values in DataFrame
Name Marks Grade
0 Sanjeev 76 NaN
1 Rajeev 86 NaN
2 Sanjay 55 NaN
3 Abhay 54 NaN

DataFrame after calculation of Grade


Name Marks Grade
0 Sanjeev 76 A
1 Rajeev 86 A
2 Sanjay 55 C
3 Abhay 54 C
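
The same grades can also be produced without an explicit loop over columns. The following is a sketch of one possible alternative, not the method used above: a helper function (the name grade_for is hypothetical) is applied to the Marks column with Series.apply(), using the same grade boundaries.

```python
import pandas as pd

def grade_for(marks):
    """Return the grade for one marks value, using the boundaries above."""
    if marks >= 90:
        return 'A+'
    elif marks >= 70:
        return 'A'
    elif marks >= 60:
        return 'B'
    elif marks >= 50:
        return 'C'
    elif marks >= 40:
        return 'D'
    return 'F'

df = pd.DataFrame({'Name': ['Sanjeev', 'Rajeev', 'Sanjay', 'Abhay'],
                   'Marks': [76, 86, 55, 54]})

# apply() calls grade_for once for every value in the Marks column
df['Grade'] = df['Marks'].apply(grade_for)
print(df)
```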

Program to concat two dataframes-

 import pandas as pd
 d1={'roll_no':[10,11,12,13,14,15],
 'name':['Ankit','Pihu','Rinku','Yash','Vijay','Nikhil']}
 df1=pd.DataFrame(d1,columns=['roll_no','name'])
 print(df1)
 d2={'roll_no':[1,2,3,4,5,6],
 'name':['Renu','Jatin','Deep','Guddu','Chhaya','Sahil']}
 df2=pd.DataFrame(d2,columns=['roll_no','name'])
 print(df2)
 print(pd.concat([df1,df2],axis=0))
Output-
roll_no name
0 10 Ankit
1 11 Pihu
2 12 Rinku
3 13 Yash
4 14 Vijay
5 15 Nikhil
roll_no name
0 1 Renu
1 2 Jatin
2 3 Deep
3 4 Guddu
4 5 Chhaya
5 6 Sahil

roll_no name
0 10 Ankit
1 11 Pihu
2 12 Rinku
3 13 Yash
4 14 Vijay
5 15 Nikhil
0 1 Renu
1 2 Jatin
2 3 Deep
3 4 Guddu
4 5 Chhaya
5 6 Sahil
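
In the combined output above the row labels 0 to 5 appear twice, because concat keeps each dataframe's own index. A small sketch, assuming shortened copies of the two dataframes, of how ignore_index=True gives a fresh continuous index instead:

```python
import pandas as pd

df1 = pd.DataFrame({'roll_no': [10, 11], 'name': ['Ankit', 'Pihu']})
df2 = pd.DataFrame({'roll_no': [1, 2], 'name': ['Renu', 'Jatin']})

# Default behaviour: the original row labels are repeated
kept_index = pd.concat([df1, df2], axis=0)

# ignore_index=True renumbers the rows from 0 onwards
new_index = pd.concat([df1, df2], axis=0, ignore_index=True)

print(kept_index.index.tolist())
print(new_index.index.tolist())
```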

Program to concat two dataframes along columns-

 import pandas as pd
 d1={'roll_no':[10,11,12,13,14,15],
 'name':['Ankit','Pihu','Rinku','Yash','Vijay','Nikhil']}
 df1=pd.DataFrame(d1,columns=['roll_no','name'])
 print(df1)
 d2={'roll_no':[1,2,3,4,5,6],
 'name':['Renu','Jatin','Deep','Guddu','Chhaya','Sahil']}
 df2=pd.DataFrame(d2,columns=['roll_no','name'])
 print(df2)
 print(pd.concat([df1,df2],axis=1))
Output-
roll_no name
0 10 Ankit
1 11 Pihu
2 12 Rinku
3 13 Yash
4 14 Vijay
5 15 Nikhil
roll_no name
0 1 Renu
1 2 Jatin
2 3 Deep
3 4 Guddu
4 5 Chhaya
5 6 Sahil

roll_no name roll_no name
0 10 Ankit 1 Renu
1 11 Pihu 2 Jatin
2 12 Rinku 3 Deep
3 13 Yash 4 Guddu
4 14 Vijay 5 Chhaya
5 15 Nikhil 6 Sahil
CSV FILE
To read a csv file-

 import pandas as pd
 df=pd.read_csv("E:\\Data\\Employee.csv")
 print(df)
Output-
Empid Name Age City Salary
0 100 Ritesh 25 Mumbai 15000.0
1 101 Aakash 26 Goa 16000.0
2 102 Mahima 27 Hyderabad 20000.0
3 103 Lakshay 23 Delhi 18000.0
4 104 Manu 25 Mumbai 25000.0
5 105 Nidhi 26 Delhi NaN
6 106 Geetu 30 Bengaluru 28000.0

To display the shape of a csv file-

 import pandas as pd
 df=pd.read_csv("E:\\Data\\Employee.csv")
 print(df.shape)
Output-
(7, 5)
To display name, age and salary from Employee.csv-

 import pandas as pd
 df=pd.read_csv("E:\\Data\\Employee.csv",usecols=['Name','Age','Salary'])
 print(df)
Output-
Name Age Salary
0 Ritesh 25 15000.0
1 Aakash 26 16000.0
2 Mahima 27 20000.0
3 Lakshay 23 18000.0
4 Manu 25 25000.0
5 Nidhi 26 NaN
6 Geetu 30 28000.0

To display only 5 records-

 import pandas as pd
 df=pd.read_csv("E:\\Data\\Employee.csv",nrows=5)
 print(df)
Output-
Empid Name Age City Salary
0 100 Ritesh 25 Mumbai 15000.0
1 101 Aakash 26 Goa 16000.0
2 102 Mahima 27 Hyderabad 20000.0
3 103 Lakshay 23 Delhi 18000.0
4 104 Manu 25 Mumbai 25000.0

To display records without header-

 import pandas as pd
 df=pd.read_csv("E:\\Data\\Employee.csv",header=None)
 print(df)
Output-
0 1 2 3 4
0 Empid Name Age City Salary
1 100 Ritesh 25 Mumbai 15000.0
2 101 Aakash 26 Goa 16000.0
3 102 Mahima 27 Hyderabad 20000.0
4 103 Lakshay 23 Delhi 18000.0
5 104 Manu 25 Mumbai 25000.0
6 105 Nidhi 26 Delhi NaN
7 106 Geetu 30 Bengaluru 28000.0

To display records using the first column (Empid) as the index-

 import pandas as pd
 df=pd.read_csv("E:\\Data\\Employee.csv",index_col=0)
 print(df)
Output-

Empid Name Age City Salary
100 Ritesh 25 Mumbai 15000.0
101 Aakash 26 Goa 16000.0
102 Mahima 27 Hyderabad 20000.0
103 Lakshay 23 Delhi 18000.0
104 Manu 25 Mumbai 25000.0
105 Nidhi 26 Delhi NaN
106 Geetu 30 Bengaluru 28000.0

To modify the employee name from Lakshay to Harsh-

 import pandas as pd
 df=pd.read_csv("E:\\Data\\Employee.csv")
 print("DataFrame Contents Before Updation")
 print(df)
 print()
 df.loc[3,'Name']="Harsh"
 df.to_csv("E:\\Data\\Employee.csv",index=False)
 print("DataFrame Contents After Updation")
 print(df)
Output-
DataFrame Contents Before Updation
Empid Name Age City Salary
0 100 Ritesh 25 Mumbai 15000.0
1 101 Aakash 26 Goa 16000.0
2 102 Mahima 27 Hyderabad 20000.0
3 103 Lakshay 23 Delhi 18000.0
4 104 Manu 25 Mumbai 25000.0
5 105 Nidhi 26 Delhi NaN
6 106 Geetu 30 Bengaluru 28000.0

DataFrame Contents After Updation
Empid Name Age City Salary
0 100 Ritesh 25 Mumbai 15000.0
1 101 Aakash 26 Goa 16000.0
2 102 Mahima 27 Hyderabad 20000.0
3 103 Harsh 23 Delhi 18000.0
4 104 Manu 25 Mumbai 25000.0
5 105 Nidhi 26 Delhi NaN
6 106 Geetu 30 Bengaluru 28000.0

To display the employee file with the header row skipped (the first data row becomes the column names)-

 import pandas as pd
 df=pd.read_csv("E:\\Data\\Employee.csv",skiprows=1)
 print(df)
Output-
100 Ritesh 25 Mumbai 15000.0
0 101 Aakash 26 Goa 16000.0
1 102 Mahima 27 Hyderabad 20000.0
2 103 Harsh 23 Delhi 18000.0
3 104 Manu 25 Mumbai 25000.0
4 105 Nidhi 26 Delhi NaN
5 106 Geetu 30 Bengaluru 28000.0
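
With skiprows=1 alone, the first data row is taken as the header, as the output above shows. If the aim is to give the columns new names, the names parameter of read_csv can be combined with skiprows=1. A minimal sketch, assuming a small in-memory copy of the file (csv_text) and hypothetical new names ID, EmpName and EmpAge:

```python
import io
import pandas as pd

# Assumed sample data standing in for Employee.csv
csv_text = "Empid,Name,Age\n100,Ritesh,25\n101,Aakash,26\n"

# skiprows=1 drops the old header line; names supplies the new headings
df = pd.read_csv(io.StringIO(csv_text), skiprows=1,
                 names=['ID', 'EmpName', 'EmpAge'])
print(df)
```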

To create a new csv file by copying the contents of Employee.csv-

 import pandas as pd
 df=pd.read_csv("E:\\Data\\Employee.csv")
 df.to_csv("E:\\Data\\Empnew.csv")
 print(df)
Output-

To create a student csv file from a dataframe-

 import pandas as pd
 Student={'RollNo':[1,2,3,4,5,6],
 'StudName':['Teena','Rinku','Payal','Akshay','Gravit','Yogesh'],
 'Marks':[90,78,88,89,77,97],
 'Class':['11A','11B','11C','11A','11D','11E']}
 df=pd.DataFrame(Student,columns=['RollNo','StudName','Marks','Class'])
 df.to_csv("E:\\Data\\Student.csv")
Output-
To read the Student.csv file-

 import pandas as pd
 df=pd.read_csv("E:\\Data\\Student.csv")
 print(df)
Output-
Unnamed: 0 RollNo StudName Marks Class
0 0 1 Teena 90 11A
1 1 2 Rinku 78 11B
2 2 3 Payal 88 11C
3 3 4 Akshay 89 11A
4 4 5 Gravit 77 11D
5 5 6 Yogesh 97 11E

To remove the Unnamed column from the above program-

 import pandas as pd
 df=pd.read_csv("E:\\Data\\Student.csv",index_col=0)
 print(df)
Output-
RollNo StudName Marks Class
0 1 Teena 90 11A
1 2 Rinku 78 11B
2 3 Payal 88 11C
3 4 Akshay 89 11A
4 5 Gravit 77 11D
5 6 Yogesh 97 11E
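
The Unnamed column appears because to_csv() writes the row index as an extra first column. Another way to avoid it is to leave the index out when writing. A sketch of that approach, using an in-memory buffer (io.StringIO) in place of the E:\Data path so that it runs anywhere; the shortened data is for illustration only:

```python
import io
import pandas as pd

df = pd.DataFrame({'RollNo': [1, 2, 3],
                   'StudName': ['Teena', 'Rinku', 'Payal']})

# index=False keeps the row index out of the csv text
buffer = io.StringIO()
df.to_csv(buffer, index=False)

# Read the text back; no Unnamed column is created
buffer.seek(0)
restored = pd.read_csv(buffer)
print(restored.columns.tolist())
```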
Copying selected fields into a new file-

 import pandas as pd
 df=pd.read_csv("E:\\Data\\Employee.csv")
 df.to_csv("E:\\Data\\Emp.csv",columns=['Empid','Name'])
Output-
Data Visualization
Using Matplotlib
Line Chart
Create a line chart-

 import matplotlib.pyplot as plt


 y=[5,6,7]
 x=[10,20,30]
 plt.plot(x,y)
 plt.show()
Output-

Create a line chart using different attributes -

 import matplotlib.pyplot as plt


 x=[2,4,6,8,10]
 y=[1,2,3,4,5]
 plt.plot(x,y,color='Blue',linewidth=10,linestyle='dashed',marker
='*',markeredgecolor='red')
 plt.xlabel('Numbers')
 plt.ylabel('Percentage')
 plt.show()

Output-

Program to plot a sine wave using line chart-

 import matplotlib.pyplot as plt


 import numpy as np
 x=np.arange(-2,1,0.1)
 y=np.sin(x)
 plt.plot(x,y)
 plt.show()

Output-

Program to plot the algebraic expression 1-0.5*a^2 using line chart-

 import matplotlib.pyplot as plt


 import numpy as np
 a=np.arange(-2,1,0.01)
 y=1-0.5*a**2
 plt.plot(a,y,color='green')
 plt.title('Example')
 plt.xlabel('Input')
 plt.ylabel('Output')
 plt.show()
Output-

To plot multiple lines with different colours-

 import matplotlib.pyplot as plt
 import numpy as np
 y=np.arange(1,3)
 plt.plot(y+1,linestyle='dashed',color='green')
 plt.plot(y+2,linestyle='dotted',color='blue')
 plt.plot(y+3,color='red')
 plt.show()

Output-
Bar Chart
To plot the elements of two lists using a bar chart-

 import matplotlib.pyplot as plt


 x=[2,4,6,8,10]
 y=[6,7,8,2,4]
 x2=[1,3,5,7,9]
 y2=[7,8,2,4,2]
 plt.bar(x,y,label='Bars1')
 plt.bar(x2,y2,label='Bars2')
 plt.xlabel('x')
 plt.ylabel('y')
 plt.title('Bar Graph \n with multiline title')
 plt.legend()
 plt.show()
Output-

To plot a bar chart for a student strength analysis over 3 consecutive years, comparing and visualizing them by converting the data to a pandas dataframe and plotting multiple bars-

 import pandas as pd
 import matplotlib.pyplot as plt
 plotdata=pd.DataFrame({'2019':[10,20,15,30],'2020':
[16,25,22,30],'2021':[19,22,29,32]},index=['A','B','C','D'])
 plotdata.plot(kind='bar')
 plt.title('Students strength analysis')
 plt.xlabel('sections')
 plt.ylabel('Strength')
 plt.show()
Output-
To plot multiple stacked bar charts for student strength analysis for 3
consecutive years-

 import pandas as pd
 import matplotlib.pyplot as plt
 plotdata=pd.DataFrame({'2019':[10,20,15,30],'2020':
[16,25,22,30],'2021':[19,22,29,32]},index=['A','B','C','D'])
 plotdata.plot(kind='bar',stacked=True)
 plt.xlabel('sections')
 plt.ylabel('Strength')
 plt.show()
Output-
Histogram
To display a histogram with well-defined edges-

 import matplotlib.pyplot as plt


 import numpy as np
 y=np.random.randn(1000)
 plt.hist(y,30,edgecolor='red')
 plt.show()
Output-
Changing the look of histogram -

 import matplotlib.pyplot as plt


 import numpy as np
 data_student=[1,11,21,31,41,51]
 plt.hist(data_student,bins=[0,10,20,30,40,50,60],weights=[10,1,
0,33,6,8],edgecolor='red',facecolor='green')
 plt.xlabel('Values')
 plt.ylabel('Frequency')
 plt.title('Histogram for student data')
 plt.savefig('student.png')
 plt.show()
Output-
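
In the program above, each value in data_student falls into exactly one bin, so the weights decide the height of each bar. A small sketch that computes the same bar heights with numpy.histogram, without drawing the chart (the names bin_edges and heights are illustrative):

```python
import numpy as np

data_student = [1, 11, 21, 31, 41, 51]
bin_edges = [0, 10, 20, 30, 40, 50, 60]
weights = [10, 1, 0, 33, 6, 8]

# Each value adds its weight to the bin it falls in
heights, edges = np.histogram(data_student, bins=bin_edges, weights=weights)

# One height per bin: 6 bins from 7 edges
print(heights)
print(len(heights), len(edges))
```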
Thank You
