
Normality test

Yi-Lung Chen, PhD


Department of Healthcare Administration, Asia University
Department of Psychology, Asia University

Normal distribution
• Normal distribution
– In probability theory, a normal distribution is a type of
continuous probability distribution for a real-valued random
variable
• A basic assumption for parametric tests (e.g., t-test,
ANOVA, regression)
Methods of testing normality
• Methods of testing normality
– Graphic methods
• Frequency distribution
• Q-Q plot
– Statistical tests
• Shapiro–Wilk test
• Kolmogorov–Smirnov test
– Skewness and kurtosis

Graphic methods

Frequency distribution
(histogram)

A bell-shaped histogram is the visual criterion for judging
whether the distribution is normal
Histogram
• Histogram
– is an approximate representation of the distribution of data
• X axis
– Value of variable
• Y axis
– Frequency (count)

Data for histogram
Raw data X: 1, 2, 2, 3, 3, 3, 4, 4, 5

Unique value   Frequency
1              1
2              2
3              3
4              2
5              1
Figure of histogram

Unique value   Frequency
1              1
2              2
3              3
4              2
5              1

[Figure: histogram of the data, Value (1–5) on the x axis, Frequency (0–4) on the y axis]
A practice

Raw data X: 0, 0, 1, 1, 2, 3, 3, 4, 4

Please calculate the summary data and draw a histogram
Answer for the practice

Unique value   Frequency
0              2
1              2
2              1
3              2
4              2

[Figure: histogram of the data, Value (0–4) on the x axis, Frequency (0–3) on the y axis]
Demo of histogram in the SPSS

1. Put variable in “variable(s)”
2. Click “Charts”
3. Check “Histogram”
4. Check “Show normal curve” to compare the data with the normal curve
Results

Summary data Histogram

The histogram is close to the normal curve
Q-Q plot
• Q–Q (quantile-quantile) plot
– is a probability plot, which is a graphical method for
comparing two probability distributions by plotting their
quantiles against each other.

Quantile
• Quantile
– quantiles are cut points dividing the range of a probability
distribution into continuous intervals with equal
probabilities, or dividing the observations in a sample in
the same way
• Two expressions
– q-quantiles
» q is an integer
• Range from 2 to ∞
– cumulative distribution-quantiles
» cumulative distribution is probability
• Range from 0 to 1

q-quantiles
• q-quantiles
– q is an integer
• It divides a finite set of values into q subsets of (nearly)
equal size
– If q = 2
» We split the data into 2 equal subsets
– If q = 4
» We split the data into 4 equal subsets
The number of cut points for quantiles

• The number of cut points for quantiles
– q minus 1
• Where q is the number of parts we want for the data
– If we want to split our data into 2 parts, we need 1 cut
point
» For example, for the data (1, 2, 3), we cut at 2

– If we want to split our data into 3 parts, we need 2 cut
points
» For example, for the data (1, 2, 3), we cut between 1 and 2
and between 2 and 3
Specialized quantiles
• Specialized quantiles
– We usually use a probability with a quantile
• 2-quantile
– It is exactly the median
• 4-quantiles
– quartiles
• 100-quantiles
– percentiles
Another expression
• Expression of quantile based on the cumulative distribution
– Cumulative Distribution Function (c.d.f.)
• The cumulative distribution function (CDF) at x gives
the probability that the random variable is less than or
equal to x: F_X(x) = P(X ≤ x), calculated as the running
sum of the probabilities
– Cumulative Distribution Function
» Ranges from 0 to 1
Calculation of C. D. F.
X Probability Cumulative distribution
1 0.1 (1/10) 0.1 (1/10)
2 0.1 (1/10) 0.2 (2/10)
3 0.1 (1/10) 0.3 (3/10)
4 0.1 (1/10) 0.4 (4/10)
5 0.1 (1/10) 0.5 (5/10)
6 0.1 (1/10) 0.6 (6/10)
7 0.1 (1/10) 0.7 (7/10)
8 0.1 (1/10) 0.8 (8/10)
9 0.1 (1/10) 0.9 (9/10)
10 0.1 (1/10) 1.0 (10/10)

A dataset of 10 equally likely observations
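The running-sum construction in the table above can be sketched in Python (a minimal sketch; the data are the ten equally likely observations from the table):

```python
from itertools import accumulate

# Ten equally likely observations, each with probability 1/10
probs = [0.1] * 10

# The CDF at each value is the running sum of the probabilities
cdf = list(accumulate(probs))

for x, (p, c) in enumerate(zip(probs, cdf), start=1):
    print(f"X={x:2d}  P={p:.1f}  CDF={c:.1f}")
```

The last cumulative value is 1.0, matching the final row of the table.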
Normal distribution and its CDF

[Figure: left panel, the normal distribution (probability density); right panel, its C.D.F. (cumulative probability)]
Expression of quantile based on
cumulative distribution
• We use the cumulative distribution to express quantiles
– For a 2-quantile
• It splits the data into 2 parts
– 50% and 50%
» 0.5 of the cumulative distribution
• So, we can say a 0.5 quantile or 50% quantile
– For 4-quantiles
• They split the data into 4 parts
– 25%, 50%, 75%
» So, we can say they are the 0.25, 0.5, and 0.75 quantiles
A cut point to split data

Data: 1, 2, 3, 4, 5, 6, 7, 8, 9, 10

When we have 10 observations and split the data into 2 equal-sized
groups (5 in group 1 and 5 in group 2), this quantile is the 0.5 or
50% quantile, because it splits the data into two halves of equal size.
We cut between 5 and 6, so 5.5 is a good value.
A practice

Data: 1, 2, 3

If we want a 0.5 quantile, what is it?
Answer

Data: 1, 2, 3

It is 2, because it cuts the data so that 50% (the value 1) lies below
and 50% (the value 3) lies above.
Many ways to calculate quantiles

• There are 9 methods to calculate quantiles
– They differ because of
• Unbiased estimation
• Different derivations
• Linear interpolation
– So it is common to see small differences when we report
quantiles and quantile-related methods

– When the data are large, all methods give very close results
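The differences between calculation methods can be seen with NumPy's `np.quantile`, whose `method` parameter (NumPy ≥ 1.22) selects among these rules — a small sketch, not SPSS's exact algorithm:

```python
import numpy as np

data = np.arange(1, 11)  # 1, 2, ..., 10

# The same quantile under a few of NumPy's interpolation methods
linear   = np.quantile(data, 0.5, method="linear")
lower    = np.quantile(data, 0.5, method="lower")
higher   = np.quantile(data, 0.5, method="higher")
midpoint = np.quantile(data, 0.5, method="midpoint")

print(linear, lower, higher, midpoint)
```

For the median of 1–10, "linear" and "midpoint" give 5.5, while "lower" and "higher" give 5 and 6: the small differences the slide mentions.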
Q-Q plots compare the quantiles of
your data with the quantiles of a
normal distribution
Steps for Q-Q plot
• Steps for Q-Q plot
– 1. Sort the data
– 2. Calculate the Z score of the original data
– 3. Calculate the theoretical normal cumulative distribution of the
sorted data from its rank
– 4. Calculate the quantile (Z score) of the normal distribution for
each cumulative probability
– 5. Plot the two Z scores against each other

The theoretical cumulative distribution (cd_i) for a normal distribution is

cd_i = (r_i − 0.5) / n

Where r_i is the i-th observation’s rank
n is the total sample size
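The five steps can be sketched in Python with NumPy and SciPy (a minimal sketch using the nine-point example data worked through below; variable names are my own):

```python
import numpy as np
from scipy.stats import norm

x = np.array([1, 2, 2, 3, 3, 3, 4, 4, 5])

# Step 1: sort the data
x_sorted = np.sort(x)

# Step 2: Z score of the observed data (sample SD, ddof=1)
z_obs = (x_sorted - x.mean()) / x.std(ddof=1)

# Step 3: theoretical cumulative distribution from the ranks: (r - 0.5) / n
n = len(x)
ranks = np.arange(1, n + 1)
cd = (ranks - 0.5) / n

# Step 4: expected normal quantiles (Z scores) for those probabilities
z_exp = norm.ppf(cd)

# Step 5: plot z_obs against z_exp (e.g., with matplotlib) and compare
# with the 45-degree reference line y = x
for o, e in zip(z_obs, z_exp):
    print(f"observed {o:5.2f}   expected {e:5.2f}")
```
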
An example to calculate Q-Q plot

X: 1, 2, 2, 3, 3, 3, 4, 4, 5
Calculation of Z score of X

X   Z score of observed X
1   -1.63
2   -0.82
2   -0.82
3   0.00
3   0.00
3   0.00
4   0.82
4   0.82
5   1.63

Mean = 3, SD = 1.22
Ranking X and its theoretical
normal cumulative distribution

X   Z score of observed X   Rank
1   -1.63                   1
2   -0.82                   2
2   -0.82                   3
3   0.00                    4
3   0.00                    5
3   0.00                    6
4   0.82                    7
4   0.82                    8
5   1.63                    9

cd_i = (r_i − 0.5) / n
Ranking X and its theoretical
cumulative distribution

X   Z score of observed X   Rank   Cumulative distribution
1   -1.63                   1      (1-0.5)/9
2   -0.82                   2      (2-0.5)/9
2   -0.82                   3      (3-0.5)/9
3   0.00                    4      (4-0.5)/9
3   0.00                    5      (5-0.5)/9
3   0.00                    6      (6-0.5)/9
4   0.82                    7      (7-0.5)/9
4   0.82                    8      (8-0.5)/9
5   1.63                    9      (9-0.5)/9

cd_i = (r_i − 0.5) / n
Z score of expected normal quantile

X   Z score of observed X   Rank   Cumulative distribution   Z score of expected X
1   -1.63                   1      0.06                      -1.59
2   -0.82                   2      0.17                      -0.97
2   -0.82                   3      0.28                      -0.59
3   0.00                    4      0.39                      -0.28
3   0.00                    5      0.50                      0.00
3   0.00                    6      0.61                      0.28
4   0.82                    7      0.72                      0.59
4   0.82                    8      0.83                      0.97
5   1.63                    9      0.94                      1.59
Plot two Z scores

Z score of observed X   Z score of expected X
-1.63                   -1.59
-0.82                   -0.97
-0.82                   -0.59
0.00                    -0.28
0.00                    0.00
0.00                    0.28
0.82                    0.59
0.82                    0.97
1.63                    1.59
Plot two Z scores

[Figure: scatterplot of the observed Z scores against the expected Z scores]

If the observed scores are close to the expected values, we say the data
have normality.

So, we usually add the expected values as a reference line,
which follows the 45° line y = x.
A little difference between
statistical software

[Figure: Q-Q plots of the same data in SPSS, R (with the car package), and SAS]

Q-Q plots also differ a little across statistical software,
based on the different methods used to calculate quantiles.
We used the same method as R and SAS.
Demo of Q-Q plot in the SPSS

1. “Test distribution is normal” is the default setting;
you can change it if you assume another distribution.
2. Check “Standardized values” (Z scores);
it then reports Z scores; if unchecked, it reports the original scale of the variable.
Results

The observed values are close to the expected normal values (the 45° line)
The main disadvantage of
graphic methods
• The main disadvantage of graphic methods
– It is subjective: different researchers may disagree about
what counts as a deviation from a normal distribution

Is this a normal distribution?
Test of normality

Test of normality
• Test of normality
– A test of normality gives us a p-value to determine
whether the data are normal

• Methods of testing normality
– Shapiro–Wilk test
– Kolmogorov–Smirnov test
Kolmogorov–Smirnov vs. Shapiro–Wilk test

• Kolmogorov–Smirnov test
– It has been reported that the K–S test has low power and it
should not be seriously considered for testing normality.

• Shapiro–Wilk test
– It is preferable that normality be assessed both visually
and through normality tests, of which the Shapiro–Wilk test,
provided by the SPSS software, is highly recommended.

Ghasemi, A., & Zahediasl, S. (2012). Normality tests for statistical analysis: a guide for non-statisticians.
International Journal of Endocrinology and Metabolism, 10(2), 486-489. doi:10.5812/ijem.3505
Shapiro–Wilk test
• Hypotheses
– Null hypothesis: normality
– Alternative hypothesis: non-normality

• Shapiro–Wilk formula

W = (Σ a_i × (x_(n+1-i) − x_(i)))² / Σ (x_i − x̄)²

Where x_(i) is the i-th value of the data ordered from smallest,
a_i is the coefficient of the order statistics of a sample of size n from
a normal distribution,
n is the sample size
Shapiro–Wilk tables

[Tables: coefficients a_i and p-values of W, looked up by sample size n]
An example
Raw data: 55, 35, 45, 70, 58, 61, 63, 65, 68, 86, 72, 74
An example
Raw data   Sorted data   Order
55         35            1
35         45            2
45         55            3
70         58            4
58         61            5
61         63            6
63         65            7
65         68            8
68         70            9
86         72            10
72         74            11
74         86            12
An example
Sorted data   (x_i − x̄)²
35            (35−62.7)²
45            (45−62.7)²
55            (55−62.7)²
58            (58−62.7)²
61            (61−62.7)²
63            (63−62.7)²
65            (65−62.7)²
68            (68−62.7)²
70            (70−62.7)²
72            (72−62.7)²
74            (74−62.7)²
86            (86−62.7)²
Mean = 62.7   Sum = 2008.7

W = (Σ a_i × (x_(n+1-i) − x_(i)))² / Σ (x_i − x̄)²
  = (Σ a_i × (x_(n+1-i) − x_(i)))² / 2008.7
An example
Order   a_i      n+1−i        x_(n+1-i)   x_(i)   x_(n+1-i) − x_(i)   a_i × (x_(n+1-i) − x_(i))
1       0.5475   12+1−1=12    86          35      51                  27.9
2       0.3325   12+1−2=11    74          45      29                  9.6
3       0.2347   12+1−3=10    72          55      17                  4.0
4       0.1586   12+1−4=9     70          58      12                  1.9
5       0.0922   12+1−5=8     68          61      7                   0.6
6       0.0303   12+1−6=7     65          63      2                   0.1
                                                  Sum = 44.2
n (sample size) = 12                              44.2² = 1953.6

W = 1953.6 / 2008.7 = 0.97
Critical value of W

A smaller W value indicates a tendency toward non-normality.

The critical value of W is 0.859 when n = 12.

Because our result is 0.97, which is larger than 0.859,
the data are consistent with normality.
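As a check, SciPy's `scipy.stats.shapiro` implements this test; its coefficients come from an approximation, so the W it reports may differ slightly from the hand calculation above:

```python
from scipy.stats import shapiro

# The twelve observations from the worked example
data = [55, 35, 45, 70, 58, 61, 63, 65, 68, 86, 72, 74]

w, p = shapiro(data)
print(f"W = {w:.3f}, p = {p:.3f}")
```

A W near the hand-computed 0.97 with a large p-value again suggests normality for this sample.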
A practice
Raw data: 1, 3, 4, 2
a_1 = 0.6872, a_2 = 0.1677

Please calculate the Shapiro–Wilk W for this data
Answer
Sorted data   (x_i − x̄)²   a_i      x_(n+1-i) − x_(i)   a_i × (x_(n+1-i) − x_(i))
1             (1−2.5)²     0.6872   4−1=3               2.06
2             (2−2.5)²     0.1677   3−2=1               0.17
3             (3−2.5)²                                  Sum = 2.23
4             (4−2.5)²                                  2.23² = 4.97
Mean = 2.5    Sum = 5

n (sample size) = 4

W = 4.97 / 5 = 0.99

(Note: W can never exceed 1; rounding the sum too early gives an
impossible value.)
A deep look at the formula

Shapiro–Wilk formula

W = (Σ a_i × (x_(n+1-i) − x_(i)))² / Σ (x_i − x̄)²

The numerator measures the symmetry of the distribution (skewness);
the denominator is the sum of squares (variance).
Sensitive to symmetry of the
distribution (skewness)

[Figure: two histograms, a symmetric one (values 1–5) and an asymmetric one (values 1–8)]

W = (Σ a_i × (x_(n+1-i) − x_(i)))² / Σ (x_i − x̄)²; because a_i is smaller
than 1, when there is asymmetry, the increase in the denominator is
stronger than in the numerator.
Demo of Shapiro-Wilk test in the
SPSS

1. Put variable in dependent list
2. Go to “Plots” and check “normality plots with tests”

Results

Because the p-value of the Shapiro–Wilk test is 0.922,
we retain the null hypothesis that our data have normality.
When we want to conduct a t-test
with 2 groups, should the normality
of the outcome variable be tested
in the whole sample or in the
two groups separately?
Normality between groups

Two sample t-test assumes normality. Therefore, it can be


used when the normality is satisfied through the normality
test. In this case, the normality test should be performed for
each group, and it can be said that the normality is satisfied
when the normality is satisfied in both groups.

Kwak, S. G., & Park, S.-H. (2019). Normality Test in Clinical Research. Journal of
Rheumatic Diseases, 26(1), 5. doi:10.4078/jrd.2019.26.1.5
Shortcomings of these tests
• Shortcomings of these tests
– They are suitable for moderate sample sizes because of
sample-size bias
• When the sample size is small, they are conservative (tend to
indicate normality)
– But we are usually not sure about normality with a small
sample
• When the sample size is large, they are too sensitive (tend to
indicate non-normality)
Skewness and kurtosis are alternative
method to test normality

Skewness
• Skewness
– is a measure of the “asymmetry” of the probability
distribution of a real-valued random variable about its
“mean”

[Figure: negatively and positively skewed distributions with the mean marked]
Symmetry

[Figure: examples of symmetry, from Wikipedia]
Negative or positive skewness

[Figure: negatively and positively skewed distributions]
Sample skewness
• Sample skewness
– adjusted Fisher–Pearson standardized moment coefficient of
skewness (there are several methods to calculate sample
skewness)

G1 = [n / ((n − 1) × (n − 2))] × Σ (x_i − x̄)³ / s³

Where s is the sample standard deviation
An example of calculation of
sample skewness
x_i   Mean   SD     x_i − mean   Cubed
1     2.25   1.25   -1.25        -1.95
2                   -0.25        -0.02
2                   -0.25        -0.02
4                   1.75         5.36
                    Sum          3.38
SD is the sample standard deviation
An example of calculation of
sample skewness

G1 = (4 × 3.38) / (1.25³ × 3 × 2)

   = 1.13
SE of skewness
• SE of skewness

SE = √[ 6n × (n − 1) / ((n − 2) × (n + 1) × (n + 3)) ]

n = sample size
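The sample skewness and its standard error can be computed together (a minimal sketch; `scipy.stats.skew` with `bias=False` gives the adjusted Fisher–Pearson coefficient used in the example above):

```python
import math
from scipy.stats import skew

data = [1, 2, 2, 4]  # the four-point example above
n = len(data)

# Adjusted Fisher-Pearson sample skewness
g1 = skew(data, bias=False)

# Standard error of skewness
se = math.sqrt(6 * n * (n - 1) / ((n - 2) * (n + 1) * (n + 3)))

# Z value, as used later for the normality cut-offs
z = g1 / se
print(f"skewness = {g1:.2f}, SE = {se:.2f}, Z = {z:.2f}")
```
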
Meaning of quadratic and
cubic terms

• If we have a center 0 and three points (−1, 1, 2)
– Quadratic term (non-negative feature)
• (−1 − 0)² = 1
• (1 − 0)² = 1
• (2 − 0)² = 4
– Distance of all points from 0
Meaning of quadratic and
cubic terms (II)

• If we have a center 0 and three points (−1, 1, 2)
– Cubic term (sign-preserving feature)
• (−1 − 0)³ = −1
• (1 − 0)³ = 1
• (2 − 0)³ = 8
– We can detect differences between the two sides
Impact of parameters
• Sample skewness

– If Σ (x_i − x̄)³ = 0 (symmetric), the sample skewness = 0

– If the sample standard deviation is large, the sample skewness
is small
– If the sample size is large, the sample skewness is small
Kurtosis
• Kurtosis
– a measure of the "tailedness" of the probability distribution
of a real-valued random variable
• Fat tailed, heavy tailed

[Figure: a fat-tailed and a thin-tailed distribution]
Unbiased estimate of sample
kurtosis
• Unbiased estimate of sample kurtosis

G2 = [n(n + 1) / ((n − 1)(n − 2)(n − 3))] × Σ ((x_i − x̄)/s)⁴ − 3(n − 1)² / ((n − 2)(n − 3))

k2 is the unbiased estimate of the second cumulant
(identical to the unbiased estimate of the sample variance);
this is the adjusted Fisher–Pearson standardized moment coefficient of kurtosis

Joanes, Derrick N.; Gill, Christine A. (1998), "Comparing measures of sample skewness and
kurtosis", Journal of the Royal Statistical Society, Series D, 47 (1): 183–189
Impact of parameters
• Unbiased estimate of sample kurtosis

– If there are many outliers (x_i), the kurtosis will increase

– Because we use the quartic term, the direction of outliers
(greater or smaller than the mean) does not matter
• For example, (−5)⁴ = (5)⁴
Impact of parameters
• Unbiased estimate of sample kurtosis

– When the sample size increases, the estimate usually
decreases
• For example, keeping the middle term Σ ((x_i − x̄)/s)⁴ = 1, we
compare n = 10 and n = 20

n = 10: (11 × 10)/(9 × 8 × 7) × 1 − 3 × 9²/(8 × 7) = 110/504 − 243/56 = 0.22 − 4.34 = −4.12
n = 20: (21 × 20)/(19 × 18 × 17) × 1 − 3 × 19²/(18 × 17) = 420/5814 − 1083/306 = 0.07 − 3.54 = −3.47
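The two terms of the comparison can be checked with a small helper (a sketch; `middle` stands for the Σ ((x − x̄)/s)⁴ term held at 1, and the function name is my own):

```python
def excess_kurtosis_terms(n, middle=1.0):
    """Unbiased excess kurtosis with the middle term held fixed."""
    first = n * (n + 1) / ((n - 1) * (n - 2) * (n - 3)) * middle
    second = 3 * (n - 1) ** 2 / ((n - 2) * (n - 3))
    return first - second

print(round(excess_kurtosis_terms(10), 2))  # -4.12
print(round(excess_kurtosis_terms(20), 2))  # -3.47
```

The estimate moves toward 0 − 3 as n grows, matching the slide's point that larger samples shrink the estimate.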
Impact of outliers
• Outlier
– Both the skewness test and the kurtosis test are very
sensitive outlier detectors
• One outlier will make the distribution appear skewed

• Two symmetric outliers will make the tails appear heavy
Plot

[Figure: the same distribution without outliers and with two outliers]
Suggestions of non-normality of
kurtosis and skewness
• Non-normality based on different sample sizes using common
statistical software (e.g., SPSS or SAS)
– n < 50
• Skewness and kurtosis
– |Z| > 1.96
– 50 ≤ n < 300
• Skewness and kurtosis
– |Z| > 3.29
– n > 300
• Skewness
– absolute skewness value > 2
• Kurtosis
– absolute kurtosis value > 4

Kim, H-Y. (2013). Statistical notes for clinical researchers: Assessing normal distribution (2)
using skewness and kurtosis. Restorative Dentistry and Endodontics 38, 52–54
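These cut-offs can be written as a small decision helper (a hedged sketch of the rules as summarized on this slide; the function name and the n = 30 in the usage example are my own assumptions, not from the source):

```python
def suggests_normality(n, skew, se_skew, kurt, se_kurt):
    """Sample-size-dependent skewness/kurtosis cut-offs (as per Kim, 2013)."""
    if n < 50:
        return abs(skew / se_skew) <= 1.96 and abs(kurt / se_kurt) <= 1.96
    elif n < 300:
        return abs(skew / se_skew) <= 3.29 and abs(kurt / se_kurt) <= 3.29
    else:
        # For large samples, use the raw statistics rather than Z values
        return abs(skew) <= 2 and abs(kurt) <= 4

# Using the SPSS output shown later in these slides:
# skewness -0.518 (SE 0.637), kurtosis -0.747 (SE 1.232)
print(suggests_normality(30, -0.518, 0.637, -0.747, 1.232))  # True
```
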
Demo of skewness and kurtosis in
the SPSS

1. Go to “Statistics”
2. Check “skewness” and “kurtosis”

Results

Z value of skewness = −0.518 / 0.637 = −0.81

Z value of kurtosis = −0.747 / 1.232 = −0.61

Both |Z| values are below 1.96, suggesting normality.
Data transformation

Data transformation
• Data transformation
– It is usually applied so that the data appear to more closely
meet the assumptions of a statistical inference procedure
that is to be applied, or to improve the interpretability or
appearance of graphs
• Non-normality to normality

Common transformation methods

• Common transformation methods
– Square-root transformation
• X′ = √X
– Cube-root transformation
• X′ = ∛X
– Logarithmic transformation
• X′ = log(X)
– The base can be e (Euler's number) or 10
» These methods can be used to address right skew
Effect of transformation

Original   sqrt       cube root   log(e)
1          1          1           0
2          1.414214   1.259921    0.693147
4          2          1.587401    1.386294
5          2.236068   1.709976    1.609438
10         3.162278   2.154435    2.302585
50         7.071068   3.684031    3.912023
80         8.944272   4.308869    4.382027
100        10         4.641589    4.60517
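The table can be reproduced with Python's math module (a quick check of the three transformations, using the natural log):

```python
import math

values = [1, 2, 4, 5, 10, 50, 80, 100]

for x in values:
    sqrt_x = math.sqrt(x)     # square root
    cbrt_x = x ** (1 / 3)     # cube root
    log_x = math.log(x)       # natural log (base e)
    print(f"{x:5d}  {sqrt_x:9.6f}  {cbrt_x:9.6f}  {log_x:9.6f}")
```
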
Illustration

[Figure: distributions before and after each transformation]

Square-root, cube-root, and log transformations can be used to address
right-skewed data, but the latter two have a stronger effect on the distribution shape.
https://fahimahmad.netlify.app/posts/methods-for-transforming-data-to-normal-distribution-in-r/
Data transformation in the SPSS

Data transformation in the SPSS

1. Name the new variable
2. Enter the transformation formula
3. Click OK
Results

One cause of non-normality
is the “outlier”
To address outliers
• Problems caused by outliers
– non-normality
– unequal variances

• Methods to address outliers
– Trimming or truncation
– Winsorizing or winsorization
Trimming or truncation
• Trimming or truncation
– Removing outliers from your data
• Common definitions of outlier
– 99.7%
» Z score of ±3
– 98.8%
» Z score of ±2.5
– 95%
» Z score of ±1.96
• The definition sometimes depends on your sample size:
when you have a small sample, you usually do not want
to delete too many observations, so you may set a
stricter definition of outlier, for example, ±2.5.
Winsorizing or winsorization
• Winsorizing or winsorization
– Named after Charles P. Winsor: we first define the
cut-off value for outliers, then we replace each outlier
with the cut-off value
An example

Original data (Z scores): −3, −2.5, −2, −1, 0, 1, 2, 2.5, 3

If we define an outlier as a Z score beyond ±2.5, what are the
results of trimming and winsorizing?
The results of
trimming and winsorizing
Original      Trimming   Winsorizing
-3            -          -2.5
-2.5          -          -2.5
-2            -2         -2
-1            -1         -1
0             0          0
1             1          1
2             2          2
2.5           -          2.5
3             -          2.5
Sample size   9   5   9
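Both procedures can be sketched with NumPy (the data and ±2.5 cut-off follow the example; note the slide trims values at the cut-off, while winsorizing keeps them there):

```python
import numpy as np

z = np.array([-3, -2.5, -2, -1, 0, 1, 2, 2.5, 3])
cut = 2.5

# Trimming: drop every observation at or beyond the cut-off
trimmed = z[np.abs(z) < cut]

# Winsorizing: replace extreme observations with the cut-off value
winsorized = np.clip(z, -cut, cut)

print(trimmed)      # five values: -2, -1, 0, 1, 2
print(winsorized)   # nine values, with +/-3 replaced by +/-2.5
```

Trimming shrinks the sample to 5, while winsorizing keeps all 9 observations, matching the table.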
Calculating Z scores
Results

You get a new standardized variable
named “Z + original variable name”.

For example, if your original variable is
x, then you get a variable named “Zx”.
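Outside SPSS, the same standardization can be sketched in NumPy (a minimal sketch; like SPSS, it uses the sample standard deviation, and the data reuse the Shapiro–Wilk example above):

```python
import numpy as np

x = np.array([55, 35, 45, 70, 58, 61, 63, 65, 68, 86, 72, 74])

# Standardize: subtract the mean, divide by the sample SD (ddof=1)
zx = (x - x.mean()) / x.std(ddof=1)

print(zx.round(2))
```

The standardized variable has mean 0 and sample standard deviation 1.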
Thanks for listening
Q&A
