Outliers

You might also like

Download as pdf or txt
Download as pdf or txt
You are on page 1of 7

HANDLING OUTLIERS

IN PYTHON

Bhavishya Pandit
Bhavishya Pandit

1 Z SCORE
Z Score tells us how any specific data is different from the
mean value. High Z-score accounts for larger value and vice-
versa.

Here the values with z-


score greater than 3 or
smaller than -3 are
outliers
INTERQUARTILE RANGE
(IQR) 2
Let's assume you have 5 friends and
you ask each one of them, how
much pocket money they get each
month and then arrange the figures
in ascending order. Now looking at
the middle values would give you a
rough idea about how much money
they get. If the middle value is high
then more friends get more pocket
money or otherwise.

Bhavishya Pandit
ISOLATION
3 FOREST
Isolation Forest uses decision trees that separate
anomalies or outliers by asking simple questions
like attributes of data. It is a pretty efficient
algorithm with linear time complexity. IsolationForest algorithm is an
inbuilt feature of sci-kit-learn
library

Bhavishya Pandit
WINSORIZING
4
In Winsorizing, instead of removing the outliers
completely, we reduce their impact by replacing
extreme values with values closer to the center of
the distribution while also preserving the data
size.
limit[0] => percentage of data to be winsorized from the lower side.
limit[1] => percentage of data to be winsorized from the upper side.

Bhavishya Pandit
VISUALIZATION
5
We can also predict and handle outliers by merely
visualizing the data in the form of plots like histograms,
scatter-plots, etc.

Outliers can be detected by normalizing the data values


(e.g. bell curve)

Bhavishya Pandit
FOLLOW FOR MORE
AI/ML CONTENT

You might also like