Lab2.2 Kritika
AIM: Study different basic functions of the Pandas library
In [1]:
import pandas as pd
In [2]:
branches = [
"Artificial Intelligence and Machine Learning (AI/ML)",
"Data Science and Analytics",
"Computer Systems Engineering",
"Software Engineering",
"Cybersecurity",
"Human-Computer Interaction (HCI)"
]
branches_series = pd.Series(branches)
branches_series
1. Create a DataFrame using the following data items and complete the following tasks. Name of the DataFrame = Yourname1
v. Store the statistical description in a new DataFrame for further use.
xiii. Display the Regd No and DBE Mark of those students whose Result = Pass.
xiv. Display the DBE Mark, DAA Mark, and Regd No of those students whose Name = Rohan and Result = Pass.
In [3]:
import pandas as pd
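The cell that builds Yourname1 is not shown in this listing. A minimal sketch of how such a DataFrame could be built from a dictionary of columns is given here. The column names follow the head() output shown in this lab, but every value is a placeholder rather than the lab's actual data, and the names `records` and `students` are hypothetical.

```python
import pandas as pd

# Placeholder records: illustrative values only, NOT the lab's real data.
records = {
    "Regd No": ["S001", "S002", "S003"],
    "Name": ["A", "B", "C"],
    "DBE Mark": [70, 35, 55],
    "DAA Mark": [65, 30, 60],
    "Result": ["Pass", "Fail", "Pass"],
    "Grade": ["B", "F", "C"],
}

# Each dictionary key becomes a column; a default 0..n-1 index is assigned.
students = pd.DataFrame(records)
print(students.head())
```

Building from a dictionary keeps each feature's values together, which makes it easy to check that all columns have the same length.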
In [4]:
Yourname1.head()
Out[4]: Regd No Name DBE Mark DAA Mark Result Grade
In [5]:
Yourname1.tail()
In [6]:
Yourname1.describe()
In [7]:
Yourname1.describe().transpose()
count mean std min 25% 50% 75% max
DBE Mark 7.0 58.000000 23.720596 25.0 39.5 67.0 73.0 89.0
DAA Mark 7.0 58.428571 23.585710 23.0 44.0 65.0 71.5 90.0
In [8]:
statistical_description = Yourname1.describe()
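Because describe() returns an ordinary DataFrame, the stored description can be queried like any other. The sketch below shows one possible "further use"; the frame `marks` and its values are placeholders, not the lab's data.

```python
import pandas as pd

# Placeholder marks, used only to illustrate the access pattern.
marks = pd.DataFrame({"DBE Mark": [40, 60, 80], "DAA Mark": [30, 50, 70]})

# describe() returns a DataFrame indexed by statistic name (count, mean, ...).
summary = marks.describe()

# Look up a single statistic for a single column.
mean_dbe = summary.loc["mean", "DBE Mark"]

# Combine rows of the summary, e.g. the range (max - min) of every column.
spread = summary.loc["max"] - summary.loc["min"]

print(mean_dbe)
print(spread)
```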
In [9]:
sel_col = ["Regd No", "Result"]
Yourname1[sel_col].head(3)
Regd No Result
0 S001 Pass
1 S002 Fail
2 S003 Pass
In [10]:
Yourname1["Name"].tail()
Out[10]: 2 Seema
3 Puja
4 Priya
5 Rohan
6 Guduli
Name: Name, dtype: object
In [11]:
Yourname1 = Yourname1.drop(columns=["Grade"])
Yourname1
In [13]:
print("Number of rows", len(Yourname1))
Number of rows 7
In [14]:
print("Number of columns", len(Yourname1.columns))
Number of columns 5
In [15]:
print("Dimension of the dataframe:", Yourname1.shape)
In [16]:
print("Feature names:", Yourname1.columns.tolist())
Feature names: ['Regd No', 'Name', 'DBE Mark', 'DAA Mark', 'Result']
In [18]:
passing_students = Yourname1[Yourname1["Result"] == "Pass"]
passing_students[["Regd No", "DBE Mark"]]
Regd No DBE Mark
0 S001 68
2 S003 45
4 S004 25
5 S005 67
In [21]:
selected_students = Yourname1[(Yourname1["Name"] == "Rohan") & (Yourname1["Result"] == "Pass")]
print(selected_students[["DBE Mark", "DAA Mark", "Regd No"]])
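Task xiv combines two conditions with `&`. The same selection can be sketched with a named boolean mask and `.loc`, which picks the rows and the columns in one step. The frame below uses placeholder values and the hypothetical names `students`, `mask`, and `selected`.

```python
import pandas as pd

# Placeholder data, for illustration only.
students = pd.DataFrame({
    "Regd No": ["S001", "S002", "S003"],
    "Name": ["Rohan", "Rohan", "Seema"],
    "DBE Mark": [70, 30, 55],
    "DAA Mark": [65, 35, 60],
    "Result": ["Pass", "Fail", "Pass"],
})

# Each comparison needs its own parentheses because & binds tighter than ==.
mask = (students["Name"] == "Rohan") & (students["Result"] == "Pass")

# .loc takes the row mask first and the list of columns second.
selected = students.loc[mask, ["DBE Mark", "DAA Mark", "Regd No"]]
print(selected)
```

Naming the mask makes the condition reusable and keeps the selection line short.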
ii. Check the data types, index range, memory usage, and number of columns and rows.
iii. Check the result distribution, i.e. count the number of students who passed and failed.
iv. Find the students who scored 80 or more in Math.
x. Check more precisely whether any null values are present in each feature.
In [22]:
import pandas as pd
import numpy as np
In [24]:
yourname2=pd.read_csv('C:/Users/kriti/Downloads/student_result.csv')
yourname2
Out[24]: math bangla english result
0 70 80 90 1
1 30 40 50 0
2 50 20 35 0
3 80 33 33 1
4 33 35 36 1
5 32 80 35 0
6 40 50 21 0
7 33 35 35 1
8 60 23 10 0
9 33 34 35 1
10 50 40 40 1
11 35 40 30 0
12 0 0 0 0
13 10 10 10 0
14 33 33 33 1
In [25]:
print("Data Types:")
print(yourname2.dtypes)
print("\nIndex Range:")
print(yourname2.index)
print("\nMemory Usage:")
print(yourname2.memory_usage())
print("\nNumber of Columns and Rows:")
print(yourname2.shape)
Data Types:
math int64
bangla int64
english int64
result int64
dtype: object
Index Range:
RangeIndex(start=0, stop=15, step=1)
Memory Usage:
Index 128
math 120
bangla 120
english 120
result 120
dtype: int64
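The row and column counts and the total memory footprint can also be captured as plain numbers rather than printed. This is a small sketch on placeholder data; `df`, `n_rows`, `n_cols`, and `total_bytes` are hypothetical names.

```python
import pandas as pd

# Placeholder data, for illustration only.
df = pd.DataFrame({"math": [70, 30, 50], "result": [1, 0, 0]})

# shape is a (rows, columns) tuple, so it can be unpacked directly.
n_rows, n_cols = df.shape

# memory_usage() returns bytes per column (plus the index);
# deep=True also measures the contents of object/string columns.
total_bytes = int(df.memory_usage(deep=True).sum())

print(n_rows, n_cols, total_bytes)
```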
In [27]:
# Use descriptive names; "filter" would shadow the Python built-in.
failed = yourname2[yourname2["result"] == 0]
print("Fail Students : ", failed["result"].count())
passed = yourname2[yourname2["result"] == 1]
print("Pass Students : ", passed["result"].count())
Fail Students : 8
Pass Students : 7
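The pass/fail distribution can also be obtained in a single call with `value_counts()`. The sketch below assumes, as in this lab's data, that 1 means pass and 0 means fail; the frame `results` and its values are placeholders.

```python
import pandas as pd

# Placeholder data; 1 = pass and 0 = fail is assumed, as in the lab's CSV.
results = pd.DataFrame({"result": [1, 0, 0, 1, 0]})

# value_counts() counts how often each distinct value occurs.
counts = results["result"].value_counts()

# Optional: replace the 0/1 codes with readable labels.
labelled = counts.rename(index={0: "Fail", 1: "Pass"})
print(labelled)
```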
In [28]:
high_math = yourname2[yourname2["math"] >= 80]
high_math
math bangla english result
3 80 33 33 1
In [29]:
failed_students = yourname2[yourname2["result"] == 0]
failed_students
math bangla english result
1 30 40 50 0
2 50 20 35 0
5 32 80 35 0
6 40 50 21 0
8 60 23 10 0
11 35 40 30 0
12 0 0 0 0
13 10 10 10 0
In [30]:
df_corr = yourname2.corr()
df_corr
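`corr()` works here because every column is numeric. Once a text column such as Name is present, recent pandas versions require restricting the computation to numeric columns. A minimal sketch on placeholder data, assuming pandas 1.5 or later for the `numeric_only` parameter:

```python
import pandas as pd

# Placeholder data with one non-numeric column, for illustration only.
df = pd.DataFrame({
    "math": [10, 20, 30],
    "english": [1, 2, 3],
    "Name": ["a", "b", "c"],
})

# numeric_only=True skips the text column instead of raising an error.
corr = df.corr(numeric_only=True)
print(corr)
```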
In [32]:
import seaborn as sns
import matplotlib.pyplot as plt
sns.heatmap(df_corr)
plt.title("Graphical Correlation between Attributes")
plt.show()
math bangla english result Name
0 70 80 90 1 Rohan
1 30 40 50 0 Rahul
2 50 20 35 0 Seema
3 80 33 33 1 Puja
4 33 35 36 1 Priya
5 32 80 35 0 Rohan
6 40 50 21 0 Guduli
7 33 35 35 1 NaN
8 60 23 10 0 NaN
9 33 34 35 1 NaN
10 50 40 40 1 NaN
11 35 40 30 0 NaN
12 0 0 0 0 NaN
13 10 10 10 0 NaN
14 33 33 33 1 NaN
In [36]:
print("Checks for any null value in DataFrame : ",yourname2.isnull())
Checks for any null value in DataFrame : math bangla english result Name
0 False False False False False
1 False False False False False
2 False False False False False
3 False False False False False
4 False False False False False
5 False False False False False
6 False False False False False
7 False False False False True
8 False False False False True
9 False False False False True
10 False False False False True
11 False False False False True
12 False False False False True
13 False False False False True
14 False False False False True
In [37]:
print("Checks for any null value in DataFrame (precisely) : ",yourname2.isnull()
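For a per-feature null check as asked in task x, two common reductions of `isnull()` are `any()` (is there at least one null in the column?) and `sum()` (how many nulls are in the column?). This is a sketch on placeholder data, not necessarily the call used in the cell above.

```python
import numpy as np
import pandas as pd

# Placeholder data with missing names, for illustration only.
df = pd.DataFrame({"math": [70, 30, 50], "Name": ["Rohan", np.nan, np.nan]})

# One boolean per column: True if the column contains any null.
has_null = df.isnull().any()

# One integer per column: the number of nulls in that column.
null_count = df.isnull().sum()

print(has_null)
print(null_count)
```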
In [38]:
yourname2 = yourname2.fillna("Anonymous")
yourname2
math bangla english result Name
0 70 80 90 1 Rohan
1 30 40 50 0 Rahul
2 50 20 35 0 Seema
3 80 33 33 1 Puja
4 33 35 36 1 Priya
5 32 80 35 0 Rohan
6 40 50 21 0 Guduli
7 33 35 35 1 Anonymous
8 60 23 10 0 Anonymous
9 33 34 35 1 Anonymous
10 50 40 40 1 Anonymous
11 35 40 30 0 Anonymous
12 0 0 0 0 Anonymous
13 10 10 10 0 Anonymous
14 33 33 33 1 Anonymous
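Calling `fillna("Anonymous")` on the whole DataFrame works here because only Name has missing values; if a numeric column also had gaps, it would receive the string too. One way to restrict the fill to a single column is to pass a dictionary, sketched below on placeholder data.

```python
import numpy as np
import pandas as pd

# Placeholder data with a gap in both a numeric and a text column.
df = pd.DataFrame({"math": [70.0, np.nan], "Name": ["Rohan", np.nan]})

# A dict maps column name -> fill value; columns not listed are left as-is.
filled = df.fillna({"Name": "Anonymous"})
print(filled)
```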
In [39]:
yourname2 = yourname2.drop(index = yourname2.index[:3])
yourname2
math bangla english result Name
3 80 33 33 1 Puja
4 33 35 36 1 Priya
5 32 80 35 0 Rohan
6 40 50 21 0 Guduli
7 33 35 35 1 Anonymous
8 60 23 10 0 Anonymous
9 33 34 35 1 Anonymous
10 50 40 40 1 Anonymous
11 35 40 30 0 Anonymous
12 0 0 0 0 Anonymous
13 10 10 10 0 Anonymous
14 33 33 33 1 Anonymous
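After the first three rows are dropped, the index starts at 3, as the output above shows. If a fresh 0-based index is wanted, `reset_index(drop=True)` can be applied afterwards. A sketch on placeholder data, with the hypothetical names `trimmed`, `renumbered`, and `sliced`:

```python
import pandas as pd

# Placeholder data, for illustration only.
df = pd.DataFrame({"math": [70, 30, 50, 80, 33]})

# Drop the first three rows by their index labels; labels 3 and 4 remain.
trimmed = df.drop(index=df.index[:3])

# Renumber from 0; drop=True discards the old labels instead of keeping them.
renumbered = trimmed.reset_index(drop=True)

# Positional slicing is an equivalent way to skip the first three rows.
sliced = df.iloc[3:].reset_index(drop=True)

print(renumbered)
```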
Name-Kritika Das
Roll no-CSE21068
Regd no-2101020068