
CIVL101

Lecture-31

Chi-Square Distribution
Unit 6: Applications of Probability and Statistics

Learning Outcomes:

To use the chi-square distribution for testing of hypotheses (non-parametric test)
Chi-Square Distribution

The chi-square distribution is denoted as the $\chi^2$-distribution.

We know that if a random variable $X \sim N(\mu, \sigma^2)$, then the standard normal variate is

$$Z = \frac{X - \mu}{\sigma} \sim N(0, 1).$$

The square of the standard normal variate $Z$,

$$Z^2 = \left(\frac{X - \mu}{\sigma}\right)^2,$$

is called a chi-square variate with 1 degree of freedom.


Generalizing, if $X_i \sim N(\mu_i, \sigma_i^2)$, $i = 1, 2, 3, \ldots, n$, are $n$ independent normal variables, then

$$\chi^2 = \sum_{i=1}^{n} \left(\frac{X_i - \mu_i}{\sigma_i}\right)^2$$

is a chi-square variate with $n$ degrees of freedom.

The distribution is called the chi-square distribution and is denoted by $\chi^2(n)$.
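The definition above can be checked numerically. The following is a minimal sketch, not part of the lecture: it draws many values of the sum of squared standardized normal variables and compares their average with $n$, using the standard fact that a $\chi^2(n)$ variate has mean $n$. The means, standard deviations, seed, and sample size are arbitrary illustrative choices.

```python
import random

def chi_square_variate(means, sds, rng):
    """Draw one value of sum(((X_i - mu_i) / sigma_i)^2) for independent normals."""
    total = 0.0
    for mu, sigma in zip(means, sds):
        x = rng.gauss(mu, sigma)      # X_i ~ N(mu_i, sigma_i^2)
        z = (x - mu) / sigma          # standardized value, Z_i ~ N(0, 1)
        total += z * z                # add Z_i^2
    return total

rng = random.Random(0)                # fixed seed (arbitrary) for repeatability
means = [1.0, -2.0, 5.0]              # hypothetical mu_i, so n = 3
sds = [0.5, 3.0, 1.5]                 # hypothetical sigma_i
samples = [chi_square_variate(means, sds, rng) for _ in range(20000)]
sample_mean = sum(samples) / len(samples)

print(round(sample_mean, 2))          # expected to be close to n = 3
```

Every simulated value is non-negative, since it is a sum of squares, and the average settles near the number of degrees of freedom regardless of the particular $\mu_i$ and $\sigma_i$ chosen.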


Applications of the $\chi^2$-Distribution

1. The $\chi^2$-distribution is used to test goodness of fit. Suppose we want to fit a Binomial or Poisson distribution to given data. We use the $\chi^2$-distribution to test whether this fit to the data is acceptable.

2. The $\chi^2$-distribution is used to test the independence or dependence (association) of the attributes or traits of random samples drawn from a population.
$\chi^2$-test for goodness of fit
Let a distribution be given.
Let $O_i$ = observed frequencies
Let $E_i$ = expected frequencies

with

$$\sum_{i=1}^{n} O_i = \sum_{i=1}^{n} E_i.$$

Then the chi-square statistic is:

$$\chi^2 = \sum_{i=1}^{n} \frac{(O_i - E_i)^2}{E_i}$$
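As a rough illustration of the formula, the sketch below computes the statistic from two lists of frequencies and checks that the totals agree. The function name `chi_square_statistic` and the three-class data are hypothetical and are not taken from the lecture.

```python
def chi_square_statistic(observed, expected):
    """Return sum((O_i - E_i)^2 / E_i) over all classes."""
    if len(observed) != len(expected):
        raise ValueError("observed and expected must have the same length")
    # The totals of observed and expected frequencies must agree.
    if abs(sum(observed) - sum(expected)) > 1e-9:
        raise ValueError("sum of observed must equal sum of expected")
    return sum((o - e) ** 2 / e for o, e in zip(observed, expected))

# Hypothetical data: 60 observations spread over 3 classes.
observed = [18, 22, 20]
expected = [20, 20, 20]
stat = chi_square_statistic(observed, expected)

print(round(stat, 2))  # (4 + 4 + 0) / 20 = 0.4
```

The statistic is zero only when every observed frequency equals its expected frequency, and it grows as the two sets of frequencies move apart.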
$\chi^2$-test as Non-parametric test

The $\chi^2$-statistic is defined as:

$$\chi^2 = \sum_{i=1}^{n} \frac{(O_i - E_i)^2}{E_i}$$

The $\chi^2$-statistic depends only on the observed and expected frequencies and the degrees of freedom.

It is independent of the population parameters.

Hence, the test is also known as a non-parametric test.


Polling Quiz
Which of the following is a necessary condition for applying the $\chi^2$-test?

(A) $\sum_{i=1}^{n} O_i > \sum_{i=1}^{n} E_i$

(B) $\sum_{i=1}^{n} O_i < \sum_{i=1}^{n} E_i$

(C) $\sum_{i=1}^{n} O_i = \sum_{i=1}^{n} E_i$


Problem 1. A shop owner claims that an equal number of customers come into his shop each weekday. To test this hypothesis, an independent researcher records the number of customers that come into the shop in a given week and finds the following:

Monday: 50 customers, Tuesday: 60 customers, Wednesday: 40 customers, Thursday: 47 customers, Friday: 53 customers.

Use the chi-square goodness of fit test to determine whether the data are consistent with the shop owner's claim.

Given: the tabulated value of $\chi^2$ is 4.89.


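Solution. The total is 50 + 60 + 40 + 47 + 53 = 250 customers over 5 days, so under the owner's claim each expected frequency is 250/5 = 50.

              Mon   Tue   Wed   Thu   Fri
O_i            50    60    40    47    53
E_i            50    50    50    50    50
O_i - E_i       0    10   -10    -3     3

Null Hypothesis $H_0$: An equal number of customers come into the shop each day.

Alternative Hypothesis $H_1$: The number of customers coming into the shop is not the same each day.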
The $\chi^2$-statistic is defined as:

$$\chi^2 = \sum_{i=1}^{n} \frac{(O_i - E_i)^2}{E_i} = \frac{0}{50} + \frac{100}{50} + \frac{100}{50} + \frac{9}{50} + \frac{9}{50} = 4.36$$

Since 4.36 < 4.89, we accept the null hypothesis.
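The arithmetic of Problem 1 can be reproduced with a short script. This is only a sketch of the calculation: the variable names are illustrative, and the comparison value 4.89 is taken from the problem statement as given.

```python
# Observed weekday counts from Problem 1.
observed = {"Monday": 50, "Tuesday": 60, "Wednesday": 40,
            "Thursday": 47, "Friday": 53}

total = sum(observed.values())            # 250 customers in the week
expected = total / len(observed)          # 50.0 per day under H0

# Chi-square statistic: sum of (O_i - E_i)^2 / E_i over the five days.
chi_sq = sum((o - expected) ** 2 / expected for o in observed.values())

tabulated = 4.89                          # value given in the problem statement
decision = "accept H0" if chi_sq < tabulated else "reject H0"

print(round(chi_sq, 2))                   # 4.36
print(decision)                           # accept H0
```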


Polling Quiz
The value of $\chi^2$ for the following data is:

E_i    36    74   120   121    76    38
O_i    35    80   120   114    72    44

(A) 1.077

(B) 2.077

(C) 3.077

(D) 0.077
Problem 2. A sample survey of 500 families with 4 children each was made regarding the number of boys and girls in the families. The following data were obtained.

No. of families:   35   100   200   125    40
No. of boys:        4     3     2     1     0
No. of girls:       0     1     2     3     4

Are the data consistent with the hypothesis that male and female births are equally possible? Test at the 5% level of significance. [At the 5% level, the tabulated value of $\chi^2$ for 4 degrees of freedom is 9.488.]
Solution. Null hypothesis $H_0$: Male and female births are equally possible.

$p = \frac{1}{2}$ (probability of a male birth)

$q = \frac{1}{2}$ (probability of a female birth)

Using the Binomial law:

Probability that a family of 4 children has $r$ male children $= {}^{4}C_r \left(\frac{1}{2}\right)^{4} = \frac{{}^{4}C_r}{16}$

Expected number of families having $r$ male children $= 500 \cdot \frac{{}^{4}C_r}{16} = (31.25)({}^{4}C_r)$
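The expected frequencies implied by the Binomial law can be tabulated directly. The sketch below uses Python's `math.comb` for ${}^{4}C_r$; the variable names are illustrative and not from the lecture.

```python
from math import comb

n_families = 500      # families surveyed
n_children = 4        # children per family
p = 0.5               # probability of a male birth under H0
q = 1 - p             # probability of a female birth under H0

# Expected number of families with r boys: 500 * C(4, r) * p^r * q^(4 - r).
expected = {
    r: n_families * comb(n_children, r) * p**r * q**(n_children - r)
    for r in range(n_children, -1, -1)    # r = 4, 3, 2, 1, 0
}

for r, e in expected.items():
    print(r, e)
# 4 31.25
# 3 125.0
# 2 187.5
# 1 125.0
# 0 31.25
```

The five expected frequencies add up to 500, matching the total of the observed frequencies.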
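The $\chi^2$-statistic is defined as:

$$\chi^2 = \sum_{i=1}^{n} \frac{(O_i - E_i)^2}{E_i}$$

Rounding the expected frequencies to 31, 125, 188, 125, 31 for $r$ = 4, 3, 2, 1, 0:

$$\chi^2 = \frac{16}{31} + \frac{625}{125} + \frac{144}{188} + \frac{0}{125} + \frac{81}{31} = 8.89$$

Since 8.89 < 9.488, we accept the null hypothesis.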
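The computation for Problem 2 can be checked as follows. This sketch evaluates the statistic twice: once with expected frequencies rounded to whole numbers (31, 125, 188, 125, 31), which reproduces 8.89, and once with the unrounded values $31.25 \cdot {}^{4}C_r$. The helper name `chi_square` is an illustrative choice.

```python
def chi_square(observed, expected):
    """Return sum((O_i - E_i)^2 / E_i) over all classes."""
    return sum((o - e) ** 2 / e for o, e in zip(observed, expected))

observed = [35, 100, 200, 125, 40]                    # families with 4, 3, 2, 1, 0 boys
expected_rounded = [31, 125, 188, 125, 31]            # whole-number expected frequencies
expected_exact = [31.25, 125.0, 187.5, 125.0, 31.25]  # 31.25 * C(4, r)

stat_rounded = chi_square(observed, expected_rounded)
stat_exact = chi_square(observed, expected_exact)
tabulated = 9.488                                     # 5% value from the problem statement

print(round(stat_rounded, 2))                         # 8.89
print(round(stat_exact, 2))                           # 8.73
print(stat_rounded < tabulated, stat_exact < tabulated)  # True True
```

Both versions fall below 9.488, so the rounding of the expected frequencies does not change the conclusion.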


Polling Quiz

Consider a standard package of milk chocolate M&Ms. There are six different colors: red, orange, yellow, green, blue and brown. Suppose that we have a simple random sample of 600 M&M candies with the following distribution: 212 are blue, 147 are orange, 103 are green, 50 are red, 46 are yellow, 42 are brown. The value of the $\chi^2$ statistic for testing the hypothesis that all six colors occur in equal proportions is:

(A) 235.42 (B) 325.42

(C) 532.42 (D) 523.42
