EDA QUIZ Solution

In [1]: import numpy as np
import pandas as pd
import seaborn as sns
import matplotlib.pyplot as plt

In [2]: df = pd.read_csv('Penguins Analysis.csv')

In [3]: df.head()

Out[3]:
   species     island  culmen length  culmen depth  flipper length  body mass     sex
0   Adelie  Torgersen           39.1          18.7             181       3750    MALE
1   Adelie  Torgersen           39.5          17.4             186       3800  FEMALE
2   Adelie  Torgersen           40.3          18.0             195       3250  FEMALE
3   Adelie  Torgersen           36.7          19.3             193       3450  FEMALE
4   Adelie  Torgersen           39.3          20.6             190       3650    MALE

In [4]: df.info()

<class 'pandas.core.frame.DataFrame'>
RangeIndex: 333 entries, 0 to 332
Data columns (total 7 columns):
# Column Non-Null Count Dtype
--- ------ -------------- -----
0 species 333 non-null object
1 island 333 non-null object
2 culmen length 333 non-null float64
3 culmen depth 333 non-null float64
4 flipper length 333 non-null int64
5 body mass 333 non-null int64
6 sex 333 non-null object
dtypes: float64(2), int64(2), object(3)
memory usage: 18.3+ KB

The dataset consists of 7 columns.


species: penguin species (Chinstrap, Adélie, or Gentoo)
island: island name (Dream, Torgersen, or Biscoe) in the Palmer Archipelago (Antarctica)
culmen length: culmen length (mm)
culmen depth: culmen depth (mm)
flipper length: flipper length (mm)
body mass: body mass (g)
sex: penguin sex

1. Among the three species, which has the highest count?


In [5]: df['species'].value_counts()[0:1]

Out[5]: Adelie    146
Name: species, dtype: int64

The Adelie species has the most instances in the dataset (146).
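The top category can also be read off by label rather than by slicing. The following is a minimal sketch, assuming a `species` column as above; the `toy` frame and its values are made up for illustration and stand in for the real CSV.

```python
import pandas as pd

# Hypothetical toy frame standing in for 'Penguins Analysis.csv' (values are made up)
toy = pd.DataFrame(
    {"species": ["Adelie", "Adelie", "Gentoo", "Chinstrap", "Adelie", "Gentoo"]}
)

counts = toy["species"].value_counts()  # counts per species, largest first
top_species = counts.idxmax()           # label of the most frequent species
top_count = int(counts.max())           # how many rows carry that label

print(top_species, top_count)
```

Applied to the real `df`, `idxmax()` returns only the label, which is handy when the answer feeds into later code instead of being read from the output.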


In [ ]:

2. Which island has the highest number of Penguins?


In [6]: df['island'].value_counts()[0:1]

Out[6]: Biscoe    163
Name: island, dtype: int64

Biscoe island has the highest number of penguins (163 of 333).
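To see each island's share rather than its raw count, `value_counts` accepts `normalize=True`. This is a small sketch on a made-up `toy` frame (hypothetical values, not the quiz data). On the real dataset, 163 of 333 rows works out to roughly 49%.

```python
import pandas as pd

# Hypothetical toy frame (values are made up for illustration)
toy = pd.DataFrame({"island": ["Biscoe", "Biscoe", "Dream", "Torgersen"]})

# normalize=True returns fractions of the total instead of raw counts
shares = toy["island"].value_counts(normalize=True)

print(shares)
```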


In [ ]:

3. Which species of male Penguin has the highest average body mass?
In [8]: plt.figure(figsize=(15, 6))
sns.barplot(data=df, x='body mass', y='species', hue='sex')
plt.show()

Answer: Gentoo
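`sns.barplot` draws the mean of each group by default, so the answer can be cross-checked numerically with a `groupby`. This is a sketch under stated assumptions: the column names match the notebook, but the `toy` values are invented for illustration and are not the quiz data.

```python
import pandas as pd

# Hypothetical toy frame (values are made up; column names follow the notebook)
toy = pd.DataFrame({
    "species": ["Adelie", "Adelie", "Gentoo", "Gentoo", "Chinstrap", "Chinstrap"],
    "sex": ["MALE", "FEMALE", "MALE", "FEMALE", "MALE", "FEMALE"],
    "body mass": [3750, 3800, 5500, 4700, 3900, 3500],
})

# Keep only the males, then average body mass within each species
male_mean = (
    toy[toy["sex"] == "MALE"]
    .groupby("species")["body mass"]
    .mean()
    .sort_values(ascending=False)
)
heaviest = male_mean.idxmax()  # species with the largest male average

print(male_mean)
print(heaviest)
```

Running the same chain on the real `df` gives the exact means behind the bars, which is easier to compare than bar lengths.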

4.
In [9]: plt.figure(figsize=(15, 6))
sns.barplot(data=df, x='culmen depth', y='species', hue='sex')
plt.show()

In [ ]:

5. Which species has the highest flipper length among female penguins?


In [10]: plt.figure(figsize=(15, 6))
sns.barplot(data=df, x='flipper length', y='species', hue='sex')
plt.show()
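The per-group means that the bar chart displays can also be laid out as a species-by-sex table. The sketch below uses `pivot_table` on a made-up `toy` frame; its values are hypothetical, so its result says nothing about the real dataset.

```python
import pandas as pd

# Hypothetical toy frame (values are made up; column names follow the notebook)
toy = pd.DataFrame({
    "species": ["Adelie", "Adelie", "Gentoo", "Gentoo", "Chinstrap", "Chinstrap"],
    "sex": ["MALE", "FEMALE", "MALE", "FEMALE", "MALE", "FEMALE"],
    "flipper length": [190, 186, 221, 212, 200, 192],
})

# Rows are species, columns are sexes, cells hold the mean flipper length
table = toy.pivot_table(
    index="species", columns="sex", values="flipper length", aggfunc="mean"
)
female_top = table["FEMALE"].idxmax()  # species with the longest female average

print(table)
print(female_top)
```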

In [ ]:

In [ ]:

6. Which variables have a strong correlation?


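In [21]: plt.figure(figsize=(18,8))
# numeric_only=True skips the text columns (species, island, sex);
# pandas 2.0+ raises an error on them otherwise
sns.heatmap(df.corr(numeric_only=True), annot=True)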

Out[21]: <AxesSubplot:>

Flipper length and body mass are strongly correlated, with a correlation coefficient of 0.87.
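Instead of scanning the heatmap by eye, the strongest pair can be pulled out of the correlation matrix programmatically. This is a minimal sketch that uses only the five rows shown by `df.head()` above, so its result is not representative of the full 333-row dataset; the variable names are illustrative.

```python
import numpy as np
import pandas as pd

# The five rows printed by df.head() (numeric columns only)
toy = pd.DataFrame({
    "culmen length": [39.1, 39.5, 40.3, 36.7, 39.3],
    "culmen depth": [18.7, 17.4, 18.0, 19.3, 20.6],
    "flipper length": [181, 186, 195, 193, 190],
    "body mass": [3750, 3800, 3250, 3450, 3650],
})

corr = toy.corr()  # Pearson correlation between every pair of columns

# Keep only the cells above the diagonal so each pair appears once
# and the trivial self-correlations (1.0) are excluded
mask = np.triu(np.ones(corr.shape, dtype=bool), k=1)
pairs = corr.where(mask).stack().dropna()

# Pair with the largest absolute correlation, returned as (column, column)
strongest = pairs.abs().idxmax()

print(pairs)
print(strongest)
```

On the full dataset the same steps would start from `df.corr(numeric_only=True)` in place of `toy.corr()`.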
This file is meant for personal use by sobashivaprakash@gmail.com only.
Sharing or publishing the contents in part or full is liable for legal action.