python_ques

1.What is Pandas and why is it used in Python?
Pandas is a powerful open-source data manipulation and analysis library for Python.
It provides easy-to-use data structures and functions to work with structured data,
such as tabular data, time series, and more. Pandas is widely used in data analysis
and manipulation tasks due to its flexibility, efficiency, and rich functionality.
2.How do you install Pandas in Python?
You can install Pandas using pip, the Python package manager. Run the following
command in your terminal or command prompt:
pip install pandas
3.Explain the primary data structures in Pandas.
The primary data structures in Pandas are:

Series: One-dimensional labeled array capable of holding any data type.
DataFrame: Two-dimensional labeled data structure with columns of potentially
different types. It is similar to a spreadsheet or SQL table.
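A minimal sketch of both structures, using made-up names and values for illustration:

```python
import pandas as pd

# A Series: one-dimensional, every value has a label (its index).
ages = pd.Series([28, 35, 42], index=['John', 'Anna', 'Peter'], name='Age')

# A DataFrame: two-dimensional, each column can have its own data type.
people = pd.DataFrame({'Name': ['John', 'Anna', 'Peter'],
                       'Age': [28, 35, 42]})

print(ages['Anna'])          # label-based lookup on a Series
print(people.dtypes)         # one dtype per column
print(type(people['Age']))   # a single column is itself a Series
```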
4.How do you create a Pandas DataFrame from a Python dictionary?
You can create a DataFrame from a dictionary using the pd.DataFrame() constructor.
Each key-value pair in the dictionary corresponds to a column in the DataFrame.
python
import pandas as pd
data = {'Name': ['John', 'Anna', 'Peter', 'Linda'],
        'Age': [28, 35, 42, 32]}
df = pd.DataFrame(data)
5.What is the purpose of the head() and tail() functions in Pandas?
The head() function returns the first n rows of a DataFrame, while the tail()
function returns the last n rows. They are useful for quickly inspecting the
beginning or end of a large DataFrame.
python
print(df.head())   # Returns the first 5 rows
print(df.tail(3))  # Returns the last 3 rows
6.Differentiate between a DataFrame and a Series in Pandas.
A DataFrame is a two-dimensional labeled data structure with columns of potentially
different data types, similar to a table in a relational database or a spreadsheet.
A Series, on the other hand, is a one-dimensional labeled array capable of holding
any data type, similar to a single column in a DataFrame.
7.How do you check for missing values in a DataFrame?
You can use the isnull() function to check for missing values in a DataFrame. It
returns a DataFrame of the same shape as the input with True where NaN values are
present and False otherwise.
python
missing_values = df.isnull()
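The boolean result is usually summarized rather than read cell by cell. A small sketch, assuming a hypothetical frame df_missing with one gap in each column:

```python
import numpy as np
import pandas as pd

# Hypothetical data with one missing value per column.
df_missing = pd.DataFrame({'Name': ['John', 'Anna', None],
                           'Age': [28, np.nan, 42]})

mask = df_missing.isnull()                        # same shape, True where a value is missing
missing_per_column = mask.sum()                   # each True counts as 1
rows_with_missing = df_missing[mask.any(axis=1)]  # rows with at least one gap

print(missing_per_column)
print(rows_with_missing)
```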
8.Explain the purpose of the shape attribute in Pandas.
The shape attribute of a DataFrame returns a tuple representing the dimensions of
the DataFrame. It indicates the number of rows and columns in the DataFrame.
print(df.shape) # Output: (4, 2) - 4 rows, 2 columns
9.How can you rename columns in a Pandas DataFrame?
You can rename columns in a DataFrame using the rename() function. Specify the
current column names as keys and the new names as values in a dictionary.
df.rename(columns={'old_name': 'new_name'}, inplace=True)
10.What is the role of the dtype parameter in Pandas?
The dtype parameter in Pandas specifies the data type of the elements in a
DataFrame or Series. Passing it to a constructor forces a type such as int, float,
object, or datetime64[ns] instead of letting Pandas infer one. In pd.DataFrame()
only a single dtype is allowed and it is applied to every column, so it only suits
data that can all be cast to that type. Setting types explicitly helps ensure data
integrity and can reduce memory usage.
ages = pd.DataFrame({'Age': [28, 35, 42, 32]}, dtype='float64')
11.Explain the difference between loc and iloc in Pandas.
loc is used for label-based indexing, meaning you can specify row and column labels
to select data. iloc is used for integer-based indexing, meaning you can specify
integer indices to select data.
# Using loc
df.loc[2, 'column_name']
# Using iloc
df.iloc[2, 0]
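The difference is easiest to see when the index labels are not 0, 1, 2. A sketch with a hypothetical frame scores whose labels are 10, 20, and 30:

```python
import pandas as pd

# Index labels deliberately differ from positions.
scores = pd.DataFrame({'score': [90, 75, 60]}, index=[10, 20, 30])

by_label = scores.loc[20, 'score']   # row whose label is 20
by_position = scores.iloc[2, 0]      # third row, first column

label_slice = scores.loc[10:20]      # label slices include both endpoints
position_slice = scores.iloc[0:1]    # position slices exclude the stop value

print(by_label, by_position)
print(len(label_slice), len(position_slice))
```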
12.How do you select specific columns from a DataFrame?
You can select specific columns from a DataFrame by passing a list of column names
to the indexing operator [] or by using the loc or iloc accessor methods.
# Using indexing operator
selected_columns = df[['column1', 'column2']]
# Using loc
selected_columns = df.loc[:, ['column1', 'column2']]
# Using iloc
selected_columns = df.iloc[:, [0, 1]]
13.What is boolean indexing, and how is it used in Pandas?
Boolean indexing is a technique used to filter rows in a DataFrame based on a
specified condition. It involves creating a boolean mask (a Series of True and
False values) that indicates which rows satisfy the condition.
# Boolean indexing example
filtered_df = df[df['column'] > 50]
14.How do you drop columns and rows from a DataFrame in Pandas?
You can drop columns and rows from a DataFrame using the drop() function. Pass the
labels through the columns or index keyword, or pass them as the first argument
together with the axis parameter (axis=1 for columns, axis=0 for rows).
# Drop column
df.drop(columns=['column_name'], inplace=True)
# Drop row
df.drop(index=0, inplace=True)
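Without inplace=True, drop() returns a new DataFrame and leaves the original untouched. A short sketch on a hypothetical frame df_demo:

```python
import pandas as pd

df_demo = pd.DataFrame({'a': [1, 2, 3], 'b': [4, 5, 6], 'c': [7, 8, 9]})

without_b = df_demo.drop(columns=['b'])       # equivalent to drop('b', axis=1)
without_first_row = df_demo.drop(index=[0])   # equivalent to drop(0, axis=0)

print(without_b.columns.tolist())
print(without_first_row.index.tolist())
print(df_demo.shape)   # the original still has all rows and columns
```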
15.Explain the purpose of the isin() function in Pandas.
The isin() function is used to filter rows in a DataFrame based on whether the
values in a column are present in a specified list or array. It returns a boolean
mask indicating which rows match the specified condition.
# Example of isin() function
filtered_df = df[df['column'].isin(['value1', 'value2'])]
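A runnable variant, assuming a hypothetical orders table. Negating the mask with ~ gives the "not in the list" rows:

```python
import pandas as pd

orders = pd.DataFrame({'city': ['Paris', 'Rome', 'Oslo', 'Rome'],
                       'amount': [10, 20, 30, 40]})

mask = orders['city'].isin(['Rome', 'Oslo'])   # boolean Series, one entry per row
in_list = orders[mask]
not_in_list = orders[~mask]                    # ~ inverts the mask

print(in_list)
print(not_in_list)
```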
16.How can you set a specific column as the index in a DataFrame?
You can set a specific column as the index in a DataFrame using the set_index()
function. Specify the column name to be used as the index.
df.set_index('column_name', inplace=True)
17.What is the purpose of the at and iat accessors in Pandas?
The at and iat accessors are used for fast scalar value access in a DataFrame. They
provide optimized methods for accessing a single value based on label (at) or
integer position (iat).
# Using at accessor
value = df.at[row_label, column_label]
# Using iat accessor
value = df.iat[row_position, column_position]
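The placeholders above can be filled in as follows. This is a sketch on a hypothetical frame indexed by name; at and iat can also be used to assign a single value:

```python
import pandas as pd

df_demo = pd.DataFrame({'Age': [28, 35]}, index=['John', 'Anna'])

value_by_label = df_demo.at['Anna', 'Age']   # row label, column label
value_by_position = df_demo.iat[0, 0]        # row position, column position

df_demo.at['John', 'Age'] = 29               # single-cell assignment by label

print(value_by_label, value_by_position)
print(df_demo)
```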
18.How do you reset the index of a DataFrame?
You can reset the index of a DataFrame using the reset_index() function. By
default, it creates a new DataFrame with the old index as a column and a new
sequential index. Use the drop parameter to avoid adding the old index as a column.
df.reset_index(inplace=True, drop=True)
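Resetting is most often needed after filtering, which leaves gaps in the index. A sketch comparing the default behaviour with drop=True, using a hypothetical frame:

```python
import pandas as pd

df_demo = pd.DataFrame({'name': ['a', 'b', 'c'], 'value': [1, 2, 3]})
kept = df_demo[df_demo['value'] > 1]    # surviving rows keep labels 1 and 2

with_old_index = kept.reset_index()     # old labels stored in a column named 'index'
clean = kept.reset_index(drop=True)     # old labels discarded

print(with_old_index)
print(clean)
```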
19.Explain the role of the isin() function in Pandas.
The isin() function in Pandas is used to filter rows based on whether the values in
a column are present in a specified list or array. It returns a boolean mask
indicating which rows match the specified condition.
# Example of isin() function
filtered_df = df[df['column'].isin(['value1', 'value2'])]
20.How can you filter rows based on multiple conditions in Pandas?
You can filter rows based on multiple conditions using boolean indexing with
logical operators (& for AND, | for OR, ~ for NOT). Enclose each condition within
parentheses.
# Example of filtering based on multiple conditions
filtered_df = df[(df['column1'] > 50) & (df['column2'] == 'value')]
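All three operators in one sketch, on a hypothetical frame. The parentheses matter because & and | bind more tightly than comparisons such as > and ==:

```python
import pandas as pd

df_demo = pd.DataFrame({'score': [40, 60, 80, 90],
                        'group': ['a', 'b', 'a', 'b']})

high_and_a = df_demo[(df_demo['score'] > 50) & (df_demo['group'] == 'a')]  # AND
low_or_b = df_demo[(df_demo['score'] < 50) | (df_demo['group'] == 'b')]    # OR
not_a = df_demo[~(df_demo['group'] == 'a')]                                # NOT

print(high_and_a)
print(low_or_b)
print(not_a)
```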
21.How do you handle missing values in a DataFrame?
Missing values in a DataFrame can be handled using methods like fillna(), dropna(),
or interpolate(). fillna() is used to fill missing values with a specified value,
dropna() is used to remove rows or columns with missing values, and interpolate()
is used to fill missing values by interpolation.
# Example of handling missing values
df.fillna(0, inplace=True) # Fill missing values with 0
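The three methods side by side, as a sketch on a small hypothetical Series with two gaps:

```python
import numpy as np
import pandas as pd

s = pd.Series([1.0, np.nan, 3.0, np.nan, 5.0])

filled = s.fillna(0)             # gaps replaced by a constant
dropped = s.dropna()             # entries with gaps removed
interpolated = s.interpolate()   # linear interpolation by default

print(filled.tolist())
print(dropped.tolist())
print(interpolated.tolist())
```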
22.Explain the purpose of the drop_duplicates() function in Pandas.
The drop_duplicates() function is used to remove duplicate rows from a DataFrame.
By default, it considers all columns, but you can specify subset columns to
identify duplicates.
df.drop_duplicates(inplace=True)
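A sketch of the subset and keep parameters on hypothetical data. The frame and its values are made up for illustration:

```python
import pandas as pd

df_demo = pd.DataFrame({'name': ['John', 'John', 'Anna', 'John'],
                        'city': ['Rome', 'Rome', 'Oslo', 'Paris']})

all_columns = df_demo.drop_duplicates()                    # only fully identical rows go
by_name_first = df_demo.drop_duplicates(subset=['name'])   # first row per name is kept
by_name_last = df_demo.drop_duplicates(subset=['name'], keep='last')

print(all_columns)
print(by_name_first)
print(by_name_last)
```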
23.What is the purpose of the apply() function in Pandas?
The apply() function in Pandas is used to apply a function along an axis of a
DataFrame or Series. It can be used to perform custom operations on data, such as
transformations, aggregations, or element-wise calculations.
# Example of apply() function
df['new_column'] = df['existing_column'].apply(custom_function)
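A self-contained sketch of the three common uses: element-wise on a Series, row-wise with axis=1, and column-wise with the default axis=0. The function double and the column names are hypothetical:

```python
import pandas as pd

df_demo = pd.DataFrame({'width': [1, 2, 3], 'height': [10, 20, 30]})

def double(x):
    """Hypothetical custom function applied to each element."""
    return x * 2

# Element-wise on a Series.
df_demo['double_width'] = df_demo['width'].apply(double)

# Row-wise on a DataFrame: each call receives one row as a Series.
df_demo['area'] = df_demo.apply(lambda row: row['width'] * row['height'], axis=1)

# Column-wise (axis=0, the default): each call receives one column.
column_range = df_demo[['width', 'height']].apply(lambda col: col.max() - col.min())

print(df_demo)
print(column_range)
```

For simple arithmetic like the area column, plain vectorized expressions such as df_demo['width'] * df_demo['height'] give the same result and are usually faster.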
24.How do you convert data types in a Pandas DataFrame?
You can convert data types in a Pandas DataFrame using the astype() function or
specific conversion functions like to_numeric(), to_datetime(), or to_timedelta().
# Example of converting data types
df['column'] = df['column'].astype('int')
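astype() raises an error when a value cannot be converted, whereas to_numeric() can turn such values into NaN. A sketch with hypothetical string columns:

```python
import pandas as pd

raw = pd.DataFrame({'qty': ['1', '2', 'x'],
                    'when': ['2024-01-01', '2024-02-15', '2024-03-31']})

# errors='coerce' replaces unparseable entries ('x') with NaN instead of raising.
raw['qty_num'] = pd.to_numeric(raw['qty'], errors='coerce')

# Strings to datetime64 values.
raw['when'] = pd.to_datetime(raw['when'])

# astype() works when every value is convertible.
ints = pd.Series(['1', '2', '3']).astype('int64')

print(raw.dtypes)
print(ints.dtype)
```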
25.Explain the purpose of the groupby() function in Pandas.
The groupby() function in Pandas is used to split a DataFrame into groups based on
some criteria, such as unique values in one or more columns. It is typically
followed by an aggregation function to perform calculations within each group.
# Example of groupby() function
grouped_df = df.groupby('column').sum()
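A sketch of split-then-aggregate on a hypothetical sales table, first with a single aggregation and then with several named ones via agg():

```python
import pandas as pd

sales = pd.DataFrame({'region': ['north', 'south', 'north', 'south'],
                      'units': [10, 20, 30, 40]})

# One aggregation on one column: the result is a Series indexed by region.
totals = sales.groupby('region')['units'].sum()

# Several named aggregations at once: the result is a DataFrame.
summary = sales.groupby('region').agg(total=('units', 'sum'),
                                      average=('units', 'mean'))

print(totals)
print(summary)
```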
26.How do you pivot a DataFrame in Pandas?
You can pivot a DataFrame using the pivot() function, which reshapes the data by
rearranging the rows and columns. It requires specifying columns to use as the new
index, columns, and values.
# Example of pivot() function
pivoted_df = df.pivot(index='index_column', columns='column_to_pivot',
values='value_c
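A complete, runnable sketch of the same idea. The column names date, city, and temp are hypothetical and not taken from the example above:

```python
import pandas as pd

# Long format: one row per (date, city) observation.
long_df = pd.DataFrame({'date': ['d1', 'd1', 'd2', 'd2'],
                        'city': ['Oslo', 'Rome', 'Oslo', 'Rome'],
                        'temp': [1, 15, 3, 17]})

# Wide format: one row per date, one column per city.
wide_df = long_df.pivot(index='date', columns='city', values='temp')

print(wide_df)
```

pivot() raises an error if an index/columns pair occurs more than once; pivot_table() handles that case by aggregating the duplicates.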
27.What is the merge() function, and how is it used in Pandas?
The merge() function in Pandas is used to combine two DataFrames based on one or
more common columns or on their indexes. It performs database-style joins, such as
inner, outer, left, and right joins.
# Example of merge() function
merged_df = pd.merge(df1, df2, on='common_column', how='inner')
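How the join type changes the result, sketched with two hypothetical frames that share only some keys:

```python
import pandas as pd

left = pd.DataFrame({'key': [1, 2, 3], 'name': ['a', 'b', 'c']})
right = pd.DataFrame({'key': [2, 3, 4], 'score': [20, 30, 40]})

inner = pd.merge(left, right, on='key', how='inner')     # keys present in both
outer = pd.merge(left, right, on='key', how='outer')     # keys present in either
left_join = pd.merge(left, right, on='key', how='left')  # every key from left

print(inner)
print(outer)
print(left_join)   # score is NaN where right has no matching key
```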
28.How do you handle outliers in a DataFrame?
Outliers in a DataFrame can be handled by filtering out or transforming extreme
values using techniques like winsorization, truncation, or imputation.
Additionally, you can use robust statistical measures or outlier detection
algorithms to identify and manage outliers.
# Example of handling outliers with winsorization
from scipy.stats import mstats
winsorized_values = mstats.winsorize(df['column'], limits=[0.05, 0.05])
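A Pandas-only alternative, shown as a sketch: the common 1.5 x IQR rule, which is an assumption here and not something the answer above prescribes. Values outside the fences are either filtered out or clipped to the fence:

```python
import pandas as pd

values = pd.Series([10, 12, 11, 13, 12, 100])

# Fences based on the interquartile range (1.5 x IQR rule of thumb).
q1, q3 = values.quantile(0.25), values.quantile(0.75)
iqr = q3 - q1
lower, upper = q1 - 1.5 * iqr, q3 + 1.5 * iqr

is_outlier = (values < lower) | (values > upper)
filtered = values[~is_outlier]                     # drop the outliers
clipped = values.clip(lower=lower, upper=upper)    # or cap them at the fences

print(filtered.tolist())
print(clipped.tolist())
```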
29.Explain the purpose of the map() function in Pandas.
The map() function in Pandas is used to apply a mapping or transformation to each
element of a Series. It accepts a dictionary, function, or Series as an argument to
perform the mapping.
# Example of map() function with a dictionary
df['column'] = df['column'].map({'value1': 'new_value1', 'value2': 'new_value2'})
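One detail worth knowing: with a dictionary, values that have no matching key become NaN. A sketch with a hypothetical Series of sizes:

```python
import pandas as pd

sizes = pd.Series(['S', 'M', 'L', 'XL'])

# Dictionary mapping: 'XL' has no key, so it maps to NaN.
by_dict = sizes.map({'S': 'small', 'M': 'medium', 'L': 'large'})

# Function mapping: the function is called on every element.
by_func = sizes.map(str.lower)

print(by_dict.tolist())
print(by_func.tolist())
```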
30.How do you perform one-hot encoding in Pandas?
One-hot encoding is performed using the get_dummies() function in Pandas. It
converts categorical variables into dummy/indicator variables, where each category
is represented as a binary feature.
# Example of one-hot encoding
encoded_df = pd.get_dummies(df, columns=['categorical_column'])
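What the result looks like, as a sketch on a hypothetical frame with one categorical column. Depending on the Pandas version, the indicator columns hold booleans or 0/1 integers, so the example casts them to int before printing:

```python
import pandas as pd

df_demo = pd.DataFrame({'color': ['red', 'blue', 'red'], 'price': [1, 2, 3]})

# 'color' is replaced by one indicator column per category, prefixed with 'color_'.
encoded = pd.get_dummies(df_demo, columns=['color'])

print(encoded.columns.tolist())
print(encoded['color_red'].astype(int).tolist())
```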
