
NAME: KARAN RAJENDRA MANE

PRN: 2001202058

SUBJECT: DATA MINING AND WAREHOUSING

COURSE: BSc. Data Science

COURSE CODE: DS305

TOPIC: OUTLIER DETECTION

PRACTICAL - 1
AIM/OBJECTIVE: To detect and remove outliers using data mining techniques

APPARATUS/TOOLS/EQUIPMENT USED: RStudio, lecture notes, R beginners’ book, Google, dataset

CONCEPT/THEORY OF EXPERIMENT:

1). Outlier detection using graphical methods: histogram, box plot, density plot

2). Outlier detection using statistical methods: summary statistics report, quartiles, interquartile range, normalisation techniques (z-score), Rosner test
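Of the methods listed above, the z-score rule is not exercised by any of the tasks below, so a minimal sketch is given here. The values are made up for illustration and are not taken from the bodyfat2 dataset; the cut-off of 3 standard deviations is a common rule of thumb and an assumption, not something the practical specifies.

```r
# Illustrative values only (NOT from bodyfat2): 20 typical readings and one extreme.
x <- c(rep(c(68, 69, 70, 71, 72), times = 4), 120)

# Standardise: distance from the mean in units of standard deviation.
z <- (x - mean(x)) / sd(x)

# Assumed cut-off: |z| > 3 is a common convention, not a fixed rule.
threshold <- 3
outlier_idx <- which(abs(z) > threshold)

print(outlier_idx)      # index positions flagged as outliers
print(x[outlier_idx])   # the flagged values
```

Because the mean and standard deviation are themselves pulled by extreme values, the z-score rule can miss outliers in small samples; the quartile-based and Rosner methods used later are less affected by this.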

PROCEDURE:

1). Import given dataset in R-studio

2). Find the summary statistics report

3). Check whether outliers are present in the given dataset using graphical methods

4). Check whether outliers are present in the given dataset using statistical data mining techniques

5). Treat the outliers properly

Observations/Calculations/Result:

1) Import bodyfat2 dataset in R:
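The import step might look like the sketch below. The file name, its CSV format, and the two toy rows are assumptions made so the example runs on its own; in practice `csv_path` would point at the actual bodyfat2 file supplied for the practical.

```r
# Hypothetical stand-in file so the sketch is self-contained.
# Replace csv_path with the real location of the bodyfat2 file.
csv_path <- file.path(tempdir(), "bodyfat2.csv")
writeLines(
  c("BODYFAT,HEIGHT,WEIGHT",   # toy header, assumed column names
    "12.3,67.75,154.25",       # toy rows for illustration only
    "6.1,72.25,173.25"),
  csv_path
)

# Read the file into a data frame and inspect its structure.
bodyfat2 <- read.csv(csv_path, header = TRUE)
str(bodyfat2)
dim(bodyfat2)
```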

2) Find the summary of bodyfat2. Using the summary statistics, state which of the variables have outliers.
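A minimal sketch of reading a summary report for signs of outliers follows. The data frame is a toy stand-in with assumed values, not bodyfat2. A minimum or maximum far from the quartiles, or a mean that sits well away from the median, suggests that a variable may contain outliers.

```r
# Toy data frame for illustration only; these are NOT values from bodyfat2.
toy <- data.frame(
  HEIGHT = c(67, 68, 69, 70, 71, 72, 73, 29.5),   # one suspiciously low value
  WEIGHT = c(150, 155, 160, 165, 170, 175, 180, 185)
)

# Min, quartiles, median, mean, and max for every column.
s <- summary(toy)
print(s)

# Mean minus median per column: a large gap hints at skew or extreme values.
gap <- sapply(toy, function(v) mean(v) - median(v))
print(gap)
```

This is only a screening step; the graphical and statistical checks in the following tasks confirm which observations are actually outliers.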

3) Draw box plots of the variables BODYFAT and HEIGHT. Find the outliers using the box plots, detect the index positions of the outliers, and remove these outliers.
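One way to carry out this task is sketched below on a toy data frame (assumed values, not bodyfat2). `boxplot()` draws the plot, `boxplot.stats()` returns the points that fall beyond the whiskers, and `which()` recovers their row positions.

```r
# Toy data for illustration only; NOT values from bodyfat2.
toy <- data.frame(
  BODYFAT = c(12, 15, 18, 20, 22, 19, 17, 21, 16, 47),
  HEIGHT  = c(70, 69, 71, 29.5, 68, 70, 71, 69, 70, 72)
)

# Draw the box plots; points beyond the whiskers are shown individually.
boxplot(toy$BODYFAT, main = "BODYFAT")
boxplot(toy$HEIGHT, main = "HEIGHT")

# Values that the box plot rule (1.5 x hinge spread) treats as outliers.
out_bodyfat <- boxplot.stats(toy$BODYFAT)$out
out_height  <- boxplot.stats(toy$HEIGHT)$out

# Index positions of rows containing an outlier in either variable.
idx <- which(toy$BODYFAT %in% out_bodyfat | toy$HEIGHT %in% out_height)
print(idx)

# Remove those rows; guard against the case where nothing was flagged,
# because toy[-integer(0), ] would drop every row.
cleaned <- if (length(idx) > 0) toy[-idx, ] else toy
dim(cleaned)
```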

4) Install 'EnvStats' package.

install.packages("EnvStats")

5) Using the Rosner test, detect the outliers in the variable 'WEIGHT'. Store the outliers in a new variable. Check the dimensions of the updated dataset.
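A hedged sketch of this task using `rosnerTest()` from the EnvStats package follows. The WEIGHT values are a toy stand-in, not bodyfat2, and `k = 3` (the maximum number of suspected outliers to test) is an assumed choice; in practice `k` is picked after looking at a box plot. The code is guarded so that it is skipped when EnvStats is not installed.

```r
# Toy data for illustration only; NOT values from bodyfat2.
toy <- data.frame(
  WEIGHT = c(rep(seq(150, 190, by = 5), times = 3), 170, 262, 363)
)

outlier_rows    <- integer(0)
weight_outliers <- numeric(0)
cleaned         <- toy

if (requireNamespace("EnvStats", quietly = TRUE)) {
  # k is the assumed upper bound on the number of outliers to test for.
  res <- EnvStats::rosnerTest(toy$WEIGHT, k = 3)

  # all.stats has one row per candidate: its value, its row number
  # (Obs.Num), and a logical Outlier flag.
  stats <- res$all.stats
  print(stats)

  # Store the flagged observations in new variables.
  outlier_rows    <- stats$Obs.Num[stats$Outlier]
  weight_outliers <- toy$WEIGHT[outlier_rows]

  # Drop the flagged rows (only if any were found) and check the dimensions.
  if (length(outlier_rows) > 0) {
    cleaned <- toy[-outlier_rows, , drop = FALSE]
  }
}

dim(cleaned)
```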

6) Using the interquartile range (IQR) technique, detect the outliers in the variable 'CHEST' and remove these outliers from the dataset.
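The IQR technique can be sketched as follows, again on toy values rather than bodyfat2. An observation is treated as an outlier when it lies more than 1.5 times the IQR below the first quartile or above the third quartile; the multiplier 1.5 is the usual convention and is assumed here.

```r
# Toy data for illustration only; NOT values from bodyfat2.
toy <- data.frame(CHEST = c(92, 95, 98, 100, 101, 103, 105, 108, 110, 136))

# Quartiles and interquartile range (R's default quantile type 7).
q1  <- quantile(toy$CHEST, 0.25, names = FALSE)
q3  <- quantile(toy$CHEST, 0.75, names = FALSE)
iqr <- IQR(toy$CHEST)

# Logical mask: TRUE for rows inside [Q1 - 1.5*IQR, Q3 + 1.5*IQR].
keep <- toy$CHEST >= q1 - 1.5 * iqr & toy$CHEST <= q3 + 1.5 * iqr

# Detected outliers and the dataset with them removed.
chest_outliers <- toy$CHEST[!keep]
cleaned <- toy[keep, , drop = FALSE]

print(chest_outliers)
dim(cleaned)
```

Filtering with a logical mask avoids the pitfall of negative indexing with an empty index vector when no outliers are found.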

7) Calculate the upper fence and lower fence of the variable 'HIP'. Using these fences, detect the outliers in the variable 'HIP'.
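The fences can be computed with a small helper, sketched below. The function name `fences()` is hypothetical (it is not part of base R or of the practical), and the HIP values are toy data, not bodyfat2. The lower fence is Q1 - 1.5 * IQR and the upper fence is Q3 + 1.5 * IQR.

```r
# Hypothetical helper: returns the lower and upper fences of a numeric vector.
fences <- function(x, k = 1.5) {
  q <- quantile(x, c(0.25, 0.75), na.rm = TRUE, names = FALSE)
  spread <- q[2] - q[1]                      # interquartile range
  c(lower = q[1] - k * spread, upper = q[2] + k * spread)
}

# Toy data for illustration only; NOT values from bodyfat2.
hip <- c(85, 90, 94, 96, 98, 99, 100, 102, 105, 108, 147)

f <- fences(hip)
print(f)

# Observations outside the fences, by index position and by value.
hip_idx <- which(hip < f["lower"] | hip > f["upper"])
hip_outliers <- hip[hip_idx]
print(hip_idx)
print(hip_outliers)
```

Note that `boxplot()` builds its whiskers from hinges rather than from `quantile()`, so for some sample sizes the fences computed here can differ slightly from those implied by a box plot.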
