

Kolmogorov-Smirnov Test (K-S Test)

─ Nonparametric Test
─ This test is distribution free, meaning you don’t have to know the underlying population
distribution before running the test.
─ K-S Test can be used to test whether a sample comes from a specific distribution.
─ This test compares a known expected frequency distribution (e.g. the normal distribution) to the
observed frequency distribution generated by your data (Goodness of Fit).
─ There are no restrictions on sample size; small samples are acceptable.

Hypotheses:
Null Hypothesis (Ho):
o The data come from the specified distribution
o There is no significant difference between the distribution of the sample and the specified
distribution
o The observed frequency distribution is consistent with the expected frequency
distribution (Good fit)
o P=P0
Alternative Hypothesis (Ha):
o The data do not come from the specified distribution
o There is a significant difference between the distribution of the sample and the specified
distribution
o The observed frequency distribution is not consistent with the expected frequency
distribution (Bad fit)
o P≠P0

Formula:
Dn = Maximum│Fo(X) − Fe(X)│
Where:
Dn = K-S test statistic (the largest absolute difference between the observed and expected CDFs)
Fo(X) = Observed cumulative frequency distribution (CDF)
Fe(X) = Expected cumulative frequency distribution (CDF)
CDF = k/n = no. of cumulative observations up to and including the group ÷ total no. of observations
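
As a minimal sketch of the formula, assuming the data are already grouped into per-group frequency counts, Dn could be computed as shown here. The function name `ks_statistic_binned` and the sample counts are illustrative assumptions, not part of the handout.

```python
from itertools import accumulate


def ks_statistic_binned(observed, expected):
    """Return Dn = max |Fo(X) - Fe(X)| for per-group frequency counts."""
    if len(observed) != len(expected):
        raise ValueError("observed and expected need one count per group")
    n_obs = sum(observed)
    n_exp = sum(expected)
    # CDF = k / n: cumulative count up to each group divided by the total.
    fo = [k / n_obs for k in accumulate(observed)]
    fe = [k / n_exp for k in accumulate(expected)]
    return max(abs(o - e) for o, e in zip(fo, fe))


# Hypothetical counts for three groups (not taken from the handout).
print(ks_statistic_binned([2, 1, 1], [1, 2, 1]))  # 0.25
```

Here the observed CDF is 0.50, 0.75, 1.00 and the expected CDF is 0.25, 0.75, 1.00, so the largest gap is 0.25.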

Decision:
Reject Null Hypothesis (Ho) if: Dn > Dcritical value
Accept (fail to reject) Null Hypothesis (Ho) if: Dn ≤ Dcritical value
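
The decision rule can be written as a one-line comparison. This is only a sketch: the function name `ks_decision` is hypothetical, and the critical value is assumed to be looked up from a K-S table for the given n and level of significance.

```python
def ks_decision(d_n, d_critical):
    """Apply the rule: reject Ho only when Dn is strictly greater than Dcritical."""
    return "reject Ho" if d_n > d_critical else "accept Ho"


print(ks_decision(0.500, 0.457))  # reject Ho
print(ks_decision(0.457, 0.457))  # accept Ho (Dn equal to Dcritical is not a rejection)
```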

Example:
Suppose you have 8 movie ratings (n=8), where each rating is a number between 1.0 (lowest) and 5.0
(highest):
2.6 4.6 2.3 1.2 2.4 3.8 2.7 2.9

It looks like the ratings are low. Is there statistical evidence that the ratings are not evenly (uniformly)
distributed at the 5% level of significance?

Ho: The frequency distribution of the movie ratings is consistent with a uniform distribution (P=P0)
Ha: The frequency distribution of the movie ratings is not consistent with a uniform distribution (P≠P0)
Level of Significance: 0.05
Tail: 1-tailed
Test Statistic: K-S Test
Critical Value: 0.457
Area of Rejection and Acceptance: reject Ho if Dn > 0.457; accept Ho if Dn ≤ 0.457
Computations:

Ratings     Fo   Fe   Cumulative Fo   Cumulative Fe   Fo(X) (Fo CDF)   Fe(X) (Fe CDF)   │Fo(X) − Fe(X)│
1.0 – 1.5   1    1    1               1               1/8 = 0.125      1/8 = 0.125      0
1.6 – 2.0   0    1    1               2               1/8 = 0.125      2/8 = 0.250      0.125
2.1 – 2.5   2    1    3               3               3/8 = 0.375      3/8 = 0.375      0
2.6 – 3.0   3    1    6               4               6/8 = 0.750      4/8 = 0.500      0.250
3.1 – 3.5   0    1    6               5               6/8 = 0.750      5/8 = 0.625      0.125
3.6 – 4.0   1    1    7               6               7/8 = 0.875      6/8 = 0.750      0.125
4.1 – 4.5   0    1    7               7               7/8 = 0.875      7/8 = 0.875      0
4.6 – 5.0   1    1    8               8               8/8 = 1.000      8/8 = 1.000      0

Dn = 0.250
Dcrit = 0.457

Decision: Accept the null hypothesis (Ho)
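
The computations above can be reproduced with a short Python sketch. It assumes the same eight rating groups, one expected observation per group, and the critical value 0.457 taken from the example. The variable names are illustrative only.

```python
from bisect import bisect_left
from itertools import accumulate

ratings = [2.6, 4.6, 2.3, 1.2, 2.4, 3.8, 2.7, 2.9]
# Upper limit of each rating group in tenths (1.5, 2.0, ..., 5.0),
# kept as integers to avoid floating-point edge cases when grouping.
upper_limits = [15, 20, 25, 30, 35, 40, 45, 50]
d_critical = 0.457  # critical value used in the example

# Fo: count the ratings that fall in each group.
observed = [0] * len(upper_limits)
for rating in ratings:
    tenths = round(rating * 10)
    observed[bisect_left(upper_limits, tenths)] += 1

# Fe: evenly spread expected frequencies, 8 ratings / 8 groups = 1 per group.
n = len(ratings)
expected = [n / len(upper_limits)] * len(upper_limits)

# CDF = cumulative count / total, then the absolute differences.
fo_cdf = [k / n for k in accumulate(observed)]
fe_cdf = [k / n for k in accumulate(expected)]
differences = [abs(o - e) for o, e in zip(fo_cdf, fe_cdf)]

d_n = max(differences)
decision = "reject Ho" if d_n > d_critical else "accept Ho"

print(observed)  # [1, 0, 2, 3, 0, 1, 0, 1]
print(d_n)       # 0.25
print(decision)  # accept Ho
```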


Analysis/Interpretation:
Since the computed Dn value of 0.250 is less than the critical value of 0.457, Ho is accepted.
Therefore, at the 5% level of significance, the frequency distribution of the movie ratings is consistent
with the uniform distribution. The results imply that there is no statistical evidence that the ratings are
concentrated at the low end; they are spread fairly evenly across the rating scale.

K-S TEST CRITICAL VALUES
