Welcome to Scribd!

Answer:: Be The Same/ Different? Why?

Uploaded by

0% found this document useful (0 votes)

10 views1 page

When data is not normally distributed, transformations can be applied to make the data more symmetrical before identifying outliers. One approach is to use a square root transformation that normalizes the data between 0 and 1. This transformation works well for moderately skewed data like response times but may not work for extremely skewed data like salaries, since outliers on the high end would be difficult to detect.

Original Description:

marketing analytics

Original Title

Answer 3

Copyright

Available Formats

DOCX, PDF, TXT or read online from Scribd

Share this document

Share or Embed Document

Sharing Options

Did you find this document useful?

Is this content inappropriate?

Report this Document

Copyright:

Available Formats

Download as DOCX, PDF, TXT or read online from Scribd

Flag for inappropriate content

Download as docx, pdf, or txt

0% found this document useful (0 votes)

10 views1 page

Answer:: Be The Same/ Different? Why?

Uploaded by

Monika Sakkarwal

Copyright:

Available Formats

Download as DOCX, PDF, TXT or read online from Scribd

Flag for inappropriate content

Download as docx, pdf, or txt

Jump to Page

You are on page 1of 1

Search inside document

Challenge exercise 1: How would you identify outliers if the data is not normal?

Would the method

be the same/ different? Why?
Answer:
When data are highly skewed or in other respects depart from a normal distribution,
transformations to normal data is a common step in order to identify outliers using a method which
is quite effective in a normal distribution.
A transformation approach
One approach is to first make the data symmetrical by the use of a non-linear transformation. On
the transformed data, the outliers could be located using a technique taken from the normal
distribution data.
With such an approach, outliers would also be more symmetrically away from the central tendency,
providing an equal chance of locating low- and high outliers.
Out of the three commonly used transformations (Log-transform, square-root transform, arcsin
transform), the following modification of the square root transformation is considered apt to locate
outliers at either side of the distribution for response time data:

Here,
X(1) denotes the smallest item of the sample X, and
X(n) denotes the largest.
Dividing by the range (the largest observations minus the smallest) normalizes the data so that they
are located between 0 and 1, with the smaller at exactly 0 and the largest at exactly 1. This step
bounds the data into the range [0..1].
However,
Though the square-root transform works well for response times and for other measures whose
population distribution is moderately and positively skewed, it will not work in cases of extreme
asymmetry (e.g. salary). For such population, it is merely impossible to detect high outliers since any
extremely large measure is always possible in such populations.
Reference: Outliers detection and treatment: a review. Denis Cousineau, Universit de Montral,
Sylvain Chartier, University of Ottawa

Measures of Skewness and Kurtosis
Document29 pages
Measures of Skewness and Kurtosis
obielicious
No ratings yet
Kurtosis
Document4 pages
Kurtosis
Carl Samson
No ratings yet
Unit 4 Descriptive Statistics
Document8 pages
Unit 4 Descriptive Statistics
HafizAhmad
No ratings yet
Outlier
Document9 pages
Outlier
keisha555
No ratings yet
1outlier - Wikipedia
Document47 pages
1outlier - Wikipedia
jlesalvador
No ratings yet
Statistics For Datacience
Document7 pages
Statistics For Datacience
NIKHILESH M NAIK 1827521
100% (1)
X2gtefc4o2 Coy2 (RZ
Document14 pages
X2gtefc4o2 Coy2 (RZ
Alexa Lee
No ratings yet
Top 50 Interview Questions & Answers: Statistics For Data Science
Document21 pages
Top 50 Interview Questions & Answers: Statistics For Data Science
mythrim
No ratings yet
Mathematical Exploration Statistics
Document9 pages
Mathematical Exploration Statistics
anhhuyalex
No ratings yet
Data Analysis RM
Document21 pages
Data Analysis RM
ranisweta6744
No ratings yet
ZGE1104 Chapter 4
Document7 pages
ZGE1104 Chapter 4
Mae Rocelle
No ratings yet
Skewness Kurtosis and Histogram
Document4 pages
Skewness Kurtosis and Histogram
Adamu Madi
No ratings yet
Measure of Central Tendency Dispersion A
Document8 pages
Measure of Central Tendency Dispersion A
رؤوف الجبيري
No ratings yet
Marketing Research
Document16 pages
Marketing Research
iherbfirstorder
No ratings yet
Wa Nko Nalipay PR
Document12 pages
Wa Nko Nalipay PR
gbs040479
No ratings yet
BAA Class Notes
Document16 pages
BAA Class Notes
prarabdhasharma98
No ratings yet
Mayhs
Document4 pages
Mayhs
Norbert Lim
No ratings yet
Process Data Analysis
Document24 pages
Process Data Analysis
Ridwan Mahfuz
No ratings yet
Is Important Because:: The Normal Distribution
Document19 pages
Is Important Because:: The Normal Distribution
RohanPuthalath
No ratings yet
Standard Deviation-From Wikipedia
Document11 pages
Standard Deviation-From Wikipedia
ahmad
No ratings yet
What Is Mode
Document4 pages
What Is Mode
api-150547803
No ratings yet
Statistical and Probability Tools For Cost Engineering
Document16 pages
Statistical and Probability Tools For Cost Engineering
Ahmed Abdulshafi
No ratings yet
INF30036 Lecture5
Document33 pages
INF30036 Lecture5
Yehan Abayasinghe
No ratings yet
Outlier: Occurrence and Causes
Document6 pages
Outlier: Occurrence and Causes
Tangguh Wicaksono
No ratings yet
Standard Deviation and Variance
Document18 pages
Standard Deviation and Variance
Izi
100% (3)
5.1skewness and Kurtosis
Document5 pages
5.1skewness and Kurtosis
Israr Ullah
No ratings yet
Basic Statistics For Data Science
Document45 pages
Basic Statistics For Data Science
Balasaheb Chavan
No ratings yet
Normal Distributions
Document11 pages
Normal Distributions
Waseem Javaid Soomro
No ratings yet
Statistics For Data Science: What Is Normal Distribution?
Document13 pages
Statistics For Data Science: What Is Normal Distribution?
Ramesh Mudhiraj
No ratings yet
EJ1165803
Document15 pages
EJ1165803
Adhelia Josi Wulan Sari
No ratings yet
BBM 502 Research Methodology: Prem Pyari 117523
Document7 pages
BBM 502 Research Methodology: Prem Pyari 117523
geetukumari
No ratings yet
Data Science Interview Preparation (30 Days of Interview Preparation)
Document27 pages
Data Science Interview Preparation (30 Days of Interview Preparation)
Satyavaraprasad Balla
No ratings yet
Module3-Part2 (1) (Autosaved)
Document35 pages
Module3-Part2 (1) (Autosaved)
Sheeba S
No ratings yet
Mids Unit
Document13 pages
Mids Unit
2344Atharva Patil
No ratings yet
Hypothesis Testing 2
Document7 pages
Hypothesis Testing 2
kangethemoses371
No ratings yet
Inferential Report 1
Document7 pages
Inferential Report 1
paavanmalla72
No ratings yet
2robust Statistics - Wikipedia
Document69 pages
2robust Statistics - Wikipedia
jlesalvador
No ratings yet
BUSD2027 QualityMgmt Module2
Document168 pages
BUSD2027 QualityMgmt Module2
Nyko Martin Martin
No ratings yet
Sampling and Statistical Inference: Eg: What Is The Average Income of All Stern Students?
Document11 pages
Sampling and Statistical Inference: Eg: What Is The Average Income of All Stern Students?
Ahmad Mostafa
No ratings yet
Normality
Document20 pages
Normality
shan
No ratings yet
Data Sci Linkedin
Document3 pages
Data Sci Linkedin
angel
No ratings yet
Confidence Intervals PDF
Document5 pages
Confidence Intervals PDF
wolfretonmaths
No ratings yet
When Do You Use The T Distribution
Document4 pages
When Do You Use The T Distribution
Derek Dam
No ratings yet
Shape of Data Skewness and Kurtosis
Document6 pages
Shape of Data Skewness and Kurtosis
Govind Trivedi
No ratings yet
Interpret All Statistics and Graphs For
Document14 pages
Interpret All Statistics and Graphs For
MarkJasonPerez
No ratings yet
Normal Distribution
Document9 pages
Normal Distribution
Deeksha Manoj
No ratings yet
Statistical Inference Under Inequality Constraints For Non Standard Models
Document20 pages
Statistical Inference Under Inequality Constraints For Non Standard Models
Suman Rakshit
No ratings yet
Variability
Document8 pages
Variability
Jelica Vasquez
No ratings yet
Review Ka Na
Document2 pages
Review Ka Na
jhnrytagara
No ratings yet
Statistical Treatment of Data, Data Interpretation, and Reliability
Document6 pages
Statistical Treatment of Data, Data Interpretation, and Reliability
draindrop8606
No ratings yet
Dispersion: Handout #7
Document6 pages
Dispersion: Handout #7
Bonaventure Nzeyimana
No ratings yet
Cross-Sectional Dependence
Document5 pages
Cross-Sectional Dependence
Irfan Ullah
No ratings yet
LESSON 7: Non-Parametric Statistics: Tests of Association & Test of Homogeneity
Document21 pages
LESSON 7: Non-Parametric Statistics: Tests of Association & Test of Homogeneity
JiyahnBay
No ratings yet
Continuous Probability Distributions
Document8 pages
Continuous Probability Distributions
Lupita Jiménez
No ratings yet
SPC 0002
Document14 pages
SPC 0002
Deepa Dhilip
No ratings yet
Inferential Statistics For Data Science
Document10 pages
Inferential Statistics For Data Science
rsaranms
100% (1)
The Assumption(s) of Normality
Document6 pages
The Assumption(s) of Normality
Mark Kevin Aguilar
No ratings yet
Chapter Exercises:: Chapter 5:utilization of Assessment Data
Document6 pages
Chapter Exercises:: Chapter 5:utilization of Assessment Data
Jessa Mae Cantillo
No ratings yet
Sampling in Statistics
From Everand
Sampling in Statistics
Stephanie Glen
No ratings yet
Statistical Foundations for Psychology
From Everand
Statistical Foundations for Psychology
James C. Ware
No ratings yet