EDA Presentation
DATASET
Name – SAMYAK KHANDERAO
Roll No – A-46
PRN – 22610045
INTRO TO THE DATASET
This dataset contains the ratings and reviews of more than 1,000 Amazon
products, as listed on the official Amazon website.
• Histogram:
Displays the distribution of a single
variable by dividing its range into
intervals (bins) and plotting the
frequency or count of observations
within each bin.
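The binning step described above can be sketched in plain Python. This is only an illustration under stated assumptions: the helper name histogram_counts and the sample ratings are hypothetical and do not come from the dataset, and equal-width bins are one common choice, not necessarily the one used for the slide's plot.

```python
def histogram_counts(values, num_bins):
    """Split [min, max] into equal-width bins and count the values in each bin."""
    lo, hi = min(values), max(values)
    if hi == lo:
        # All values are identical: a single bin holds everything.
        return [lo, hi], [len(values)]
    width = (hi - lo) / num_bins
    counts = [0] * num_bins
    for v in values:
        # The maximum value is placed in the last bin instead of opening a new one.
        idx = min(int((v - lo) / width), num_bins - 1)
        counts[idx] += 1
    edges = [lo + i * width for i in range(num_bins + 1)]
    return edges, counts


# Hypothetical sample ratings, used only to show the mechanics.
sample_ratings = [3.0, 3.5, 4.0, 4.0, 4.5, 4.5, 4.5, 5.0]
edges, counts = histogram_counts(sample_ratings, num_bins=4)

# Print a small text histogram: one row per bin, one '#' per observation.
for left, right, count in zip(edges, edges[1:], counts):
    print(f"{left:.1f} to {right:.1f}: {'#' * count}")
```

Plotting libraries perform the same counting internally and then draw one bar per bin, with the bar height equal to the count.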
Visualize a scatter plot of your dataset with the maximum
number of parameters
This plot uses only the first 100 rows of
the data
sns.scatterplot(data=df.head(100),
                x='actual_price',
                y='discounted_price',
                hue='rating', size=5)
Perform Bivariate Graphical EDA on the given dataset.
• Group the DataFrame by category and
calculate the total number of users for
each category
• category_user_counts = df.groupby('category')['number_of_users'].sum().reset_index()
• sns.barplot(data=category_user_counts, x='category', y='number_of_users')
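To make the grouping step concrete, here is a minimal sketch on a toy DataFrame. The rows are hypothetical and invented for illustration. Only the column names category and number_of_users are taken from the slide. The plotting call is left out so that the aggregated table itself can be inspected.

```python
import pandas as pd

# Hypothetical toy rows; the real dataset's values are not shown in the slides.
df = pd.DataFrame({
    "category": ["Electronics", "Home", "Electronics", "Home", "Toys"],
    "number_of_users": [120, 40, 80, 10, 5],
})

# Sum the users within each category, then turn the group labels
# back into an ordinary column so the result can be passed to a plot.
category_user_counts = (
    df.groupby("category")["number_of_users"]
    .sum()
    .reset_index()
)

print(category_user_counts)
```

The result has one row per category, which is the shape a bar plot with x='category' and y='number_of_users' expects.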
Scatterplot (bivariate)
This scatterplot shows how the
actual price of the product
varies with the discount
percentage