P01 - Data Mining - 2001202058
PRN: 2001202058
PRACTICAL - 1
AIM/OBJECTIVE: To detect and remove outliers using data mining techniques
CONCEPT/THEORY OF EXPERIMENT:
1). Outlier detection using graphical methods: histogram, box plot, density plot
2). Outlier detection using statistical methods: summary statistics report, quartiles,
interquartile range, normalisation techniques (z-score), Rosner test
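The z-score rule listed above can be sketched in R as follows. The vector x holds made-up values (not taken from the practical's dataset), and the cutoff of 3 standard deviations is a common convention assumed here, not something the practical specifies. The histogram and density objects are computed without drawing them.

```r
# Made-up illustrative values; the last one is deliberately extreme
x <- c(rep(c(10, 11, 12, 13), times = 5), 40)

# z-score: distance from the mean in units of standard deviation
z <- (x - mean(x)) / sd(x)

# Assumed cutoff: |z| > 3 flags an observation as a potential outlier
z_outliers <- which(abs(z) > 3)
print(x[z_outliers])

# Graphical counterparts, computed without drawing
h <- hist(x, plot = FALSE)   # plot(h) draws the histogram
d <- density(x)              # plot(d) draws the density curve
```

In small samples the largest possible z-score is bounded by the sample size, so a fixed cutoff of 3 may never trigger when only a handful of observations are available.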
PROCEDURE:
3). Check whether outliers are present in the given dataset using graphical methods
4). Check whether outliers are present in the given dataset using statistical data mining
Observations/Calculations/Result:
2) Find the summary of bodyfat2. Using the summary statistics, state which of the
variables have outliers.
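A minimal sketch of this step is given below. Because the bodyfat2 data is not reproduced here, demo is a hypothetical stand-in with made-up values and only two columns. The helper table gap_in_iqrs is also an illustration: it measures how far each extreme lies beyond its nearest quartile, in units of the interquartile range.

```r
# Hypothetical stand-in for bodyfat2 (made-up values)
demo <- data.frame(
  BODYFAT = c(12, 6, 25, 10, 60, 21, 19, 12, 4, 15),
  HEIGHT  = c(68, 72, 66, 72, 71, 75, 70, 73, 74, 30)
)

# Min, quartiles, median, mean and max for every column
print(summary(demo))

# How many IQRs does each extreme lie beyond its nearest quartile?
gap_in_iqrs <- sapply(demo, function(v) {
  q <- quantile(v, c(0.25, 0.75), names = FALSE)
  c(below_q1 = (q[1] - min(v)) / IQR(v),
    above_q3 = (max(v) - q[2]) / IQR(v))
})
print(round(gap_in_iqrs, 2))
```

A value above 1.5 in this table means the minimum or maximum of that variable lies beyond the usual 1.5 x IQR fence, which is a hint that the variable has outliers.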
3) Draw box plots of the variables BODYFAT and HEIGHT. Find the outliers using the box plots
and detect the index positions of the outliers. Remove these outliers.
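One way this step might look in R is sketched below. The data frame demo is a made-up stand-in for bodyfat2. The sketch uses boxplot.stats(), which returns the same outliers that boxplot() marks as points, so the index lookup works without opening a graphics device. Calling boxplot(demo$BODYFAT) and boxplot(demo$HEIGHT) draws the figures.

```r
# Hypothetical stand-in for bodyfat2 (made-up values)
demo <- data.frame(
  BODYFAT = c(12, 6, 25, 10, 60, 21, 19, 12, 4, 15),
  HEIGHT  = c(68, 72, 66, 72, 71, 75, 70, 73, 74, 30)
)

# Values beyond the whiskers (1.5 times the hinge spread by default)
out_bodyfat <- boxplot.stats(demo$BODYFAT)$out
out_height  <- boxplot.stats(demo$HEIGHT)$out

# Index positions of those values in the data frame
idx_bodyfat <- which(demo$BODYFAT %in% out_bodyfat)
idx_height  <- which(demo$HEIGHT %in% out_height)

# Remove every row flagged in either variable; a logical mask stays
# correct even when no row is flagged
drop_rows <- union(idx_bodyfat, idx_height)
cleaned   <- demo[!(seq_len(nrow(demo)) %in% drop_rows), , drop = FALSE]
print(dim(cleaned))
```

The logical mask is used instead of demo[-drop_rows, ] because negative indexing with an empty index vector would return zero rows rather than the full data frame.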
4) Install 'EnvStats' package.
install.packages("EnvStats")
5) Using the Rosner test, detect the outliers in the variable 'WEIGHT'. Store the
outliers in a new variable. Check the dimensions of the updated dataset.
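The following is a sketch of how this step could be done with rosnerTest() from EnvStats, assuming the package is installed. The WEIGHT values are made up for illustration, and k = 3 (the maximum number of outliers to test for) and alpha = 0.05 are assumed settings, not values given in the practical.

```r
# Made-up weights: 30 evenly spaced values plus two deliberately extreme ones
demo <- data.frame(WEIGHT = c(seq(150, 208, by = 2), 330, 360))

if (requireNamespace("EnvStats", quietly = TRUE)) {
  # k is the maximum number of suspected outliers (assumed value)
  res   <- EnvStats::rosnerTest(demo$WEIGHT, k = 3, alpha = 0.05)
  stats <- res$all.stats

  # Rows flagged as outliers: Obs.Num is the index, Value the observation
  outlier_idx     <- stats$Obs.Num[stats$Outlier]
  weight_outliers <- stats$Value[stats$Outlier]

  # Drop the flagged rows with a logical mask (safe when nothing is flagged)
  keep    <- !(seq_len(nrow(demo)) %in% outlier_idx)
  updated <- demo[keep, , drop = FALSE]

  print(weight_outliers)
  print(dim(updated))
} else {
  message("EnvStats is not installed; run install.packages(\"EnvStats\") first")
}
```

The Rosner test assumes that the data without the outliers is approximately normal, and k should be chosen at least as large as the number of suspected outliers.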
6) Using the interquartile range technique, detect the outliers in the variable 'CHEST'.
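The interquartile range rule can be sketched as below, with a made-up chest vector standing in for the CHEST column. The multiplier 1.5 is the conventional choice and is assumed here. quantile() uses its default type 7 definition, so the quartiles may differ slightly from the hinges a box plot uses.

```r
# Made-up chest measurements; the last value is deliberately extreme
chest <- c(seq(90, 108, by = 2), 140)

q1  <- quantile(chest, 0.25, names = FALSE)  # first quartile
q3  <- quantile(chest, 0.75, names = FALSE)  # third quartile
iqr <- q3 - q1                               # same value as IQR(chest)

# Flag anything more than 1.5 * IQR outside the quartiles
is_outlier     <- chest < q1 - 1.5 * iqr | chest > q3 + 1.5 * iqr
chest_outliers <- chest[is_outlier]
print(which(is_outlier))
print(chest_outliers)
```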
7) Calculate the upper fence and lower fence of the variable 'HIP'. Using these upper
and lower fences, detect the outliers in the variable 'HIP'.
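The fences are commonly defined as lower = Q1 - 1.5 x IQR and upper = Q3 + 1.5 x IQR. The sketch below wraps that definition in a small helper. The function name iqr_fences and the hip values are hypothetical and introduced only for illustration.

```r
# Hypothetical helper: returns the lower and upper fences of a numeric vector
iqr_fences <- function(v, k = 1.5) {
  q <- quantile(v, c(0.25, 0.75), names = FALSE)
  spread <- q[2] - q[1]
  c(lower = q[1] - k * spread, upper = q[2] + k * spread)
}

# Made-up hip measurements with one low and one high extreme value
hip <- c(seq(94, 106, by = 1), 70, 130)

f <- iqr_fences(hip)
print(f)

# Observations outside the fences, with their index positions
hip_idx      <- which(hip < f["lower"] | hip > f["upper"])
hip_outliers <- hip[hip_idx]
print(hip_idx)
print(hip_outliers)
```

Keeping the multiplier k as an argument makes it straightforward to switch to a stricter rule such as 3 x IQR for extreme outliers.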