
DataFrame Worksheet

S.No. Questions Marks


SECTION A

1 Which of the following is not a valid way to create a Pandas DataFrame? 1


a) From a dictionary of Series b) From a list of dictionaries
c) From JSON files d) From Excel file
2 In Pandas, which function is used to display the first few rows of a DataFrame? 1
a) show() b) display() c) head() d) first()
3 How can you add a new column named 'Age' to an existing DataFrame df? 1
a) df.append_column('Age', [25, 30, 28]) b) df['Age'] = [25, 30, 28]
c) df.addColumn('Age', [25, 30, 28]) d) df.new_column('Age', [25, 30, 28])
4 What is Boolean Indexing used for in Pandas? 1
a) Selecting rows based on conditions b) Indexing columns based on labels
c) Sorting the DataFrame d) Renaming column headers
5 Which method would you use to select rows in a DataFrame df where column 'age' is greater than 25 1
and 'salary' is less than 50000?
A) df.loc[(df['age'] > 25) & (df['salary'] < 50000)]
B) df.iloc[(df['age'] > 25) & (df['salary'] < 50000)]
C) df[(df['age'] > 25) and (df['salary'] < 50000)]
D) df.query('age > 25 and salary < 50000')
6 How can you delete a column named 'Salary' from a Pandas DataFrame df? 1
a) df.drop('Salary', axis=1) b) df.delete_column('Salary')
c) del df['Salary'] d) df.remove_column('Salary')
7 What is the output of the following program? 1
import pandas as pd
df = pd.DataFrame(index=[0, 1, 2, 3, 4, 5], columns=['one', 'two'])
print(df['one'].sum())
8 How do you set the x-axis label to 'Time' and y-axis label to 'Value' in Matplotlib? 1
A) plt.xlabel('Time') and plt.ylabel('Value')
B) plt.set_xlabel('Time') and plt.set_ylabel('Value')
C) plt.axis_labels(x='Time', y='Value')
D) plt.set_labels(x='Time', y='Value')
9 Which of the following is a correct way to delete the first row of a DataFrame df? 1
A) df.drop(0) B) df.drop([0])
C) df.drop(df.index[0]) D) df.drop(rows=0)
10 Which of the following is a correct way to delete the first row of a DataFrame df? 1
A) df.drop(0) B) df.drop([0])
C) df.drop(df.index[0]) D) df.drop(rows=0)
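
The operations Section A asks about can be tried out on a small frame. The following is only a sketch: the data, the 'bonus' column, and the variable names are invented for illustration and are not part of the worksheet.

```python
import pandas as pd

# Hypothetical sample data, invented only for illustration.
df = pd.DataFrame({
    "age": [22, 30, 41],
    "salary": [30000, 45000, 80000],
})

# head(n) returns the first n rows (5 when n is omitted).
first_two = df.head(2)

# Element-wise AND uses `&` with each condition in parentheses.
# Python's `and` raises ValueError when applied to whole Series.
mask = (df["age"] > 25) & (df["salary"] < 50000)
by_loc = df.loc[mask]

# query() takes the same condition as a string expression.
by_query = df.query("age > 25 and salary < 50000")

# Assigning a list whose length matches the row count adds a column.
df["bonus"] = [100, 200, 300]

# drop() returns a new DataFrame; df itself still has the column.
trimmed = df.drop("bonus", axis=1)

# Dropping by index label removes a row; df.index[0] is the first label.
without_first_row = df.drop(df.index[0])
```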
SECTION B
11 Explain how to create a DataFrame from a list of dictionaries and display its first three rows. 2
12 Describe how to perform boolean indexing to filter rows where the column 'score' is greater than 50 2
and less than 80 in a DataFrame df.
13 Write a Python program to find the maximum value over the index axis in a DataFrame. 2
14 i) How many rows will the resultant DataFrame have? 1+1
import pandas as pd
df1 = pd.DataFrame({'key': ['a', 'b', 'c', 'd'], 'value': [1, 2, 3, 4]})
df2 = pd.DataFrame({'key': ['a', 'b', 'e', 'b'], 'value': [5, 6, 7, 8]})
df3 = df1.merge(df2, on='key', how='outer')
1. 5
2. 4
3. 2
4. 6

ii) Write a Python program to join two DataFrames.
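
One possible way to approach the Section B tasks is sketched below. The records, keys, and variable names are hypothetical and were chosen only to keep the example short; other approaches are equally valid.

```python
import pandas as pd

# Hypothetical records, invented only for illustration.
records = [
    {"name": "P", "score": 45},
    {"name": "Q", "score": 62},
    {"name": "R", "score": 79},
    {"name": "S", "score": 91},
]

# Each dictionary becomes one row; the keys become column names.
df = pd.DataFrame(records)
first_three = df.head(3)

# Boolean indexing with two conditions combined by `&`.
in_range = df[(df["score"] > 50) & (df["score"] < 80)]

# axis=0 reduces down the index, giving one maximum per column.
col_max = df.max(axis=0, numeric_only=True)

# Joining two frames on a shared key column.
left = pd.DataFrame({"key": ["x", "y", "z"], "left_val": [1, 2, 3]})
right = pd.DataFrame({"key": ["y", "z", "w"], "right_val": [20, 30, 40]})

# An outer merge keeps keys from both sides; unmatched cells become NaN.
joined = left.merge(right, on="key", how="outer")
```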


SECTION C
15 Explain the process of importing a CSV file into a DataFrame, performing the following operations: 3
• Rename the columns
• Add a new column
• Export the modified DataFrame back to a CSV file
16 Given a DataFrame df with columns 'A', 'B', and 'C', write a Python script to perform the following: 3
• Select rows where column 'A' is greater than 50
• Delete column 'C'
• Iterate through the DataFrame and print each row
17 What will be the output of the following Python code? 2+1
i) import pandas as pd
d = {'Student': ['Ali', 'Ali', 'Tom', 'Tom'], 'House': ['Red', 'Red', 'Blue', 'Blue'], 'Points': [50, 70, 60, 80]}
df=pd.DataFrame(d)
df1=df.pivot_table(index='Student',columns='House',values='Points',aggfunc='sum')
print(df1)

ii) Consider a DataFrame 'df' created using the dictionary given below and answer the question
that follows:
exam_data = {'name': ['Anastasia', 'Dima', 'Katherine', 'James', 'Emily', 'Michael', 'Matthew',
'Lara', 'Kevin', 'Jonas'],
'score': [12.5, 9, 16.5, np.nan, 9, 20, 14.5, np.nan, 8, 19],
'attempts': [1, 3, 2, 3, 2, 3, 1, 1, 2, 1],
'qualify': ['yes', 'no', 'yes', 'no', 'no', 'yes', 'yes', 'no', 'no', 'yes']}

Write the command to remove the rows having NaN values.
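
The CSV round trip and row operations from Section C could look roughly like the sketch below. It assumes an in-memory string in place of a real file (read_csv and to_csv also accept file paths); the CSV contents and the new column names are hypothetical.

```python
import io
import pandas as pd

# Hypothetical CSV text standing in for a file on disk.
csv_text = "A,B,C\n10,1.5,x\n60,,y\n75,3.0,z\n"
df = pd.read_csv(io.StringIO(csv_text))

# Rename a column and add a new one derived from an existing column.
df = df.rename(columns={"B": "B_value"})
df["A_doubled"] = df["A"] * 2

# Select rows where 'A' exceeds 50, then delete column 'C'.
selected = df[df["A"] > 50].drop(columns="C")

# iterrows() yields (index label, row as Series) pairs.
for idx, row in selected.iterrows():
    print(idx, row["A"], row["B_value"])

# dropna() removes every row that contains at least one NaN.
cleaned = df.dropna()

# Export without the index column; a file path works here as well.
out = io.StringIO()
cleaned.to_csv(out, index=False)
exported = out.getvalue()
```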


SECTION D
18 Given a DataFrame df with columns 'Name', 'Age', and 'Score', write a Python script to: 4
• Filter rows where 'Age' is greater than 20 and 'Score' is less than 80.
• Sort the resulting DataFrame by 'Score' in descending order.
• Reset the index of the sorted DataFrame.
• Print the final DataFrame.

19 Emp_ID  Name    Dept     Salary  Status      4
   100     Kabir   IT       34000   Regular
   110     Rishav  Finance  28500   Regular
   120     Seema   IT       13500   Contract
   130     David   IT       41000   Regular
   140     Ruchi   HRD      17000   Contract

i) Consider the above DataFrame as df. Write Python code to calculate the average salary of the
Regular employees and the Contract employees separately.
ii) Write Python code to update the Salary of all Contract employees to Rs 19000.
iii) Write Python code to display the 4th record.
iv) Write Python code to display the maximum salary of all employees in the 'IT' department.
20 Write a Python script to perform the following tasks: 5
• Create a DataFrame from a dictionary of lists with the following data:
data = {
'Student': ['Alice', 'Bob', 'Charlie', 'David', 'Eva'],
'Math': [85, 92, 78, 90, 95],
'Science': [88, 79, 84, 91, 89],
'English': [92, 85, 88, 86, 94]
}
• Add a new column 'Total' which is the sum of 'Math', 'Science', and 'English' for each student.
• Rename the columns 'Math' to 'Mathematics', 'Science' to 'Physics', and 'English'
to 'Literature'.
• Filter out students who scored below 90 in 'Total'.
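
As a rough sketch of the Section D techniques, the block below builds the employee table from question 19 and applies grouping, conditional update, positional access, and a filter-sort-reset chain. The small 'scores' frame and all variable names are hypothetical; this is one possible approach, not a model answer.

```python
import pandas as pd

# Employee table as given in question 19.
df = pd.DataFrame({
    "Emp_ID": [100, 110, 120, 130, 140],
    "Name": ["Kabir", "Rishav", "Seema", "David", "Ruchi"],
    "Dept": ["IT", "Finance", "IT", "IT", "HRD"],
    "Salary": [34000, 28500, 13500, 41000, 17000],
    "Status": ["Regular", "Regular", "Contract", "Regular", "Contract"],
})

# Average salary per Status group (computed before any update).
avg_by_status = df.groupby("Status")["Salary"].mean()

# loc with a boolean mask and a column label updates matching cells in place.
df.loc[df["Status"] == "Contract", "Salary"] = 19000

# iloc is position-based, so position 3 is the 4th record.
fourth = df.iloc[3]

# Maximum salary within one department.
it_max = df.loc[df["Dept"] == "IT", "Salary"].max()

# Hypothetical frame for the filter, sort, reset pattern of question 18.
scores = pd.DataFrame({
    "Name": ["P", "Q", "R", "S"],
    "Age": [19, 22, 25, 21],
    "Score": [70, 85, 60, 75],
})
final = (
    scores[(scores["Age"] > 20) & (scores["Score"] < 80)]
    .sort_values("Score", ascending=False)
    .reset_index(drop=True)  # drop=True discards the old index labels
)
print(final)
```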
