AI Lab 04 Lab Tasks
A- Outcomes:
After completing this lab session, students will be able to:
a. Understand the Series data structure
b. Understand the DataFrame
c. Apply indexing, selection & filtering in Pandas
d. Apply functions, sorting & ranking in Pandas
Lab Session -04
B- Lab Tasks:
1- Consider the following sample data for a DataFrame:
ID    Name     Math Score   English Score   Science Score
101   Ali      85           92              78
102   Fatima   92           88              90
103   Hassan   78           80              75
104   Aisha    95           90              92
105   Ahmed    88           85              85
106   Hira     90           95              88
107   Saad     79           82              80
108   Zara     87           91              89
109   Bilal    93           89              94
110   Sana     84           93              87
Create a Pandas DataFrame using the given sample data and display the
DataFrame.
Set the ‘ID’ column as the index of the DataFrame.
Access and display the details of the student with ID 105.
Display the Math scores of students with IDs 102, 104 and 108.
Calculate and display the average scores for each subject.
Display students who scored above the average Math score.
Sort the DataFrame based on the ‘English Score’ column in descending order.
Add a new column ‘Total Score’ representing the sum of Math, English, and
Science scores.
Display the DataFrame after adding the ‘Total Score’ column.
Write/copy your code here:
Code:
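import pandas as pd

# Sample data (the 'ID' values are taken from the table above)
data = {
    'ID': [101, 102, 103, 104, 105, 106, 107, 108, 109, 110],
    'Name': ['Ali', 'Fatima', 'Hassan', 'Aisha', 'Ahmed', 'Hira', 'Saad', 'Zara', 'Bilal', 'Sana'],
    'Math Score': [85, 92, 78, 95, 88, 90, 79, 87, 93, 84],
    'English Score': [92, 88, 80, 90, 85, 95, 82, 91, 89, 93],
    'Science Score': [78, 90, 75, 92, 85, 88, 80, 89, 94, 87]
}
# Create DataFrame
df = pd.DataFrame(data)
# Set the 'ID' column as the index so rows can be selected by student ID
df = df.set_index('ID')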
# Display math scores of students with IDs 102, 104, and 108
print("\nMaaz Bin Fazal_39_Display math scores of students with IDs 102, 104, and 108")
print(df.loc[[102, 104, 108], 'Math Score'])
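The task of displaying the details of the student with ID 105 uses the same label-based selection. A minimal sketch, assuming 'ID' has been set as the index; the variable name student_105 is only illustrative:

```python
import pandas as pd

data = {
    'ID': [101, 102, 103, 104, 105, 106, 107, 108, 109, 110],
    'Name': ['Ali', 'Fatima', 'Hassan', 'Aisha', 'Ahmed', 'Hira', 'Saad', 'Zara', 'Bilal', 'Sana'],
    'Math Score': [85, 92, 78, 95, 88, 90, 79, 87, 93, 84],
    'English Score': [92, 88, 80, 90, 85, 95, 82, 91, 89, 93],
    'Science Score': [78, 90, 75, 92, 85, 88, 80, 89, 94, 87],
}
df = pd.DataFrame(data).set_index('ID')

# A single label passed to .loc returns that row as a Series
student_105 = df.loc[105]
print(student_105)

# A list of labels returns a DataFrame, even when the list has one entry
print(df.loc[[105]])
```

Passing a scalar label gives a Series whose index is the column names, while passing a list keeps the tabular layout, which is often easier to read in printed output.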
# Add a new column 'Total Score' representing the sum of Math, English, and Science scores
print("\nMaaz Bin Fazal_39_Add a new column 'Total Score' representing the sum of Math, English, and Science scores")
df['Total Score'] = df['Math Score'] + df['English Score'] + df['Science Score']
# Display the DataFrame after adding the 'Total Score' column
print(df)
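The remaining steps of Task 1 (subject averages, students above the Math average, and sorting by English score) could be sketched as follows. This is one possible approach, not the original submission; the names score_cols, averages, above_avg and sorted_df are illustrative:

```python
import pandas as pd

data = {
    'ID': [101, 102, 103, 104, 105, 106, 107, 108, 109, 110],
    'Name': ['Ali', 'Fatima', 'Hassan', 'Aisha', 'Ahmed', 'Hira', 'Saad', 'Zara', 'Bilal', 'Sana'],
    'Math Score': [85, 92, 78, 95, 88, 90, 79, 87, 93, 84],
    'English Score': [92, 88, 80, 90, 85, 95, 82, 91, 89, 93],
    'Science Score': [78, 90, 75, 92, 85, 88, 80, 89, 94, 87],
}
df = pd.DataFrame(data).set_index('ID')

score_cols = ['Math Score', 'English Score', 'Science Score']

# Column-wise mean of the three score columns (returns a Series)
averages = df[score_cols].mean()
print("Average scores:\n", averages)

# Boolean mask keeps only rows whose Math score exceeds the Math average
above_avg = df[df['Math Score'] > averages['Math Score']]
print("\nStudents above the average Math score:\n", above_avg)

# sort_values returns a new DataFrame; the original order of df is unchanged
sorted_df = df.sort_values(by='English Score', ascending=False)
print("\nSorted by English Score (descending):\n", sorted_df)
```

Note that sort_values does not modify df unless the result is assigned back, so the sorted view here is kept in a separate variable.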
2- Consider the following extended sample data for a DataFrame, representing sales
data for a retail store:
Create a Pandas DataFrame using the given sample data and display the
DataFrame.
Set the ‘Product ID’ column as the index of the DataFrame.
Create a column “Total Revenue”. Calculate and display the total revenue for
each product (Unit Price * Quantity Sold after applying discount %).
Identify and Display the top-selling product based on total revenue.
Sort the DataFrame based on the ‘Customer Rating’ column in descending
order.
Display the products with a customer rating above 4.2.
Calculate and display the correlation matrix for the numerical columns (Unit
Price, Quantity Sold, Discount %, Customer Rating).
import pandas as pd

# Maaz_39_Sample data
data = {
    'Product ID': ['01', '02', '03', '04', '05', '06', '07', '08', '09', '10'],
    'Product Name': ['Laptop', 'Smartphone', 'Television', 'Refrigerator', 'Washing Machine',
                     'Air Conditioner', 'Microwave oven', 'Blender', 'Vacuum Cleaner', 'Coffee Maker'],
    'Unit Price': [1200, 800, 1500, 1000, 1200, 1800, 400, 50, 200, 100],
    'Quantity Sold': [50, 80, 30, 45, 35, 20, 60, 150, 40, 120],
    'Discount %': [5, 10, 8, 6, 7, 12, 4, 2, 5, 3],
    'Customer Rating': [4.2, 4.5, 4.1, 4.3, 4.4, 4.0, 4.6, 4.8, 4.2, 4.7]
}
# Maaz_39_Create DataFrame
df = pd.DataFrame(data)
# Display DataFrame
print("Maaz_39_DataFrame:\n", df)
# Maaz_39_Calculate and display the correlation matrix for the numerical columns
correlation_matrix = df[['Unit Price', 'Quantity Sold', 'Discount %', 'Customer Rating']].corr()
print("\nMaaz_39_Correlation Matrix:\n", correlation_matrix)
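The intermediate steps of Task 2 (total revenue, top-selling product, sorting and filtering by rating) might look like the sketch below. It assumes that "after applying discount %" means multiplying by (1 - Discount % / 100); that interpretation, and the names top_id, by_rating and high_rated, are assumptions rather than part of the original code:

```python
import pandas as pd

data = {
    'Product ID': ['01', '02', '03', '04', '05', '06', '07', '08', '09', '10'],
    'Product Name': ['Laptop', 'Smartphone', 'Television', 'Refrigerator', 'Washing Machine',
                     'Air Conditioner', 'Microwave oven', 'Blender', 'Vacuum Cleaner', 'Coffee Maker'],
    'Unit Price': [1200, 800, 1500, 1000, 1200, 1800, 400, 50, 200, 100],
    'Quantity Sold': [50, 80, 30, 45, 35, 20, 60, 150, 40, 120],
    'Discount %': [5, 10, 8, 6, 7, 12, 4, 2, 5, 3],
    'Customer Rating': [4.2, 4.5, 4.1, 4.3, 4.4, 4.0, 4.6, 4.8, 4.2, 4.7],
}
df = pd.DataFrame(data).set_index('Product ID')

# Assumed formula: revenue = price * quantity * (1 - discount fraction)
df['Total Revenue'] = df['Unit Price'] * df['Quantity Sold'] * (1 - df['Discount %'] / 100)
print(df[['Product Name', 'Total Revenue']])

# idxmax returns the index label (Product ID) of the largest revenue
top_id = df['Total Revenue'].idxmax()
print("\nTop-selling product:\n", df.loc[top_id])

# Sort by rating, highest first
by_rating = df.sort_values(by='Customer Rating', ascending=False)
print("\nSorted by Customer Rating:\n", by_rating)

# Strict comparison: products rated exactly 4.2 are excluded
high_rated = df[df['Customer Rating'] > 4.2]
print("\nProducts rated above 4.2:\n", high_rated)
```

Because the Product ID values are strings such as '01', they must be quoted when used with .loc, for example df.loc['02'].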
Output:
4- After dropping the specified columns, the DataFrame should now contain the
following columns:
Destination Region
Destination Country
Year
No. of Emigrants
Perform the following tasks:
Display the first few rows of the updated DataFrame.
Display the last few rows of the updated DataFrame.
Print the concise summary of the updated DataFrame.
Calculate and display the mean, median, and standard deviation of the
‘No. of Emigrants’ column.
Calculate and display the total number of rows in the DataFrame.
Calculate and display the total number of cells (entries) in the DataFrame.
Identify and display the unique values in the ‘Destination Region’ column.
Display the count of each unique value in the ‘Destination Region’ column.
Identify and display the data types of each column in the DataFrame.
Inspect and display the descriptive statistics of the ‘Year’ column.
Write/copy your code here:
import pandas as pd
# Maaz_39
# Maaz_39_Calculate and display the mean, median, and standard deviation of the 'No. of Emigrants' column
print("\nMean:", df['No. of Emigrants'].mean())
print("Median:", df['No. of Emigrants'].median())
print("Standard deviation:", df['No. of Emigrants'].std())
# Calculate and display the total number of cells (entries) in the DataFrame
print("Total number of cells:", df.size)
# Identify and display the unique values in the ‘Destination Region’ column
print("\nUnique values in the ‘Destination Region’ column:")
print(df['Destination Region'].unique())
# Display the count of each unique value in the ‘Destination Region’ column
print("\nCount of each unique value in the ‘Destination Region’ column:")
print(df['Destination Region'].value_counts())
# Identify and display the data types of each column in the DataFrame
print("\nData types of each column:")
print(df.dtypes)
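The other Task 4 steps (first and last rows, concise summary, row count, and descriptive statistics of 'Year') can be sketched as below. The lab's dataset is not reproduced in this document, so the sketch builds a tiny stand-in DataFrame with the four expected column names; every value in it is a placeholder, not real emigration data:

```python
import pandas as pd

# Placeholder rows only: these values are invented for illustration
df = pd.DataFrame({
    'Destination Region': ['Region A', 'Region A', 'Region B', 'Region B'],
    'Destination Country': ['Country W', 'Country X', 'Country Y', 'Country Z'],
    'Year': [2019, 2020, 2019, 2020],
    'No. of Emigrants': [100, 200, 300, 400],
})

# First and last rows (head/tail default to 5 rows)
print(df.head())
print(df.tail())

# Concise summary: column names, non-null counts, dtypes (prints directly)
df.info()

# Number of rows; df.shape[0] gives the same value
num_rows = len(df)
print("Total number of rows:", num_rows)

# Descriptive statistics of a single column (count, mean, std, min, quartiles, max)
year_stats = df['Year'].describe()
print("\nDescriptive statistics of 'Year':\n", year_stats)
```

With the real dataset loaded into df, the same calls apply unchanged, and df.size equals the number of rows multiplied by the number of columns.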
Output: