Unit 4 Chi Square Test WR


Department of Mathematics KGiSL Institute of Technology SFM / NOTES

Chi-square distribution

$\chi^2$ test for population variance:

Let $x_1, x_2, \ldots, x_n$ be a random sample from a normal population with variance $\sigma^2$. Set
the null hypothesis $H_0: \sigma^2 = \sigma_0^2$. Then the test statistic is

$$\chi^2 = \sum_{i=1}^{n} \left( \frac{x_i - \bar{x}}{\sigma_0} \right)^2 = \frac{n s^2}{\sigma_0^2},$$

where $s^2$ is the variance of the sample. The statistic $\chi^2 = \frac{n s^2}{\sigma_0^2}$ defined above follows a $\chi^2$-distribution with
n - 1 degrees of freedom.

$\chi^2$ test for goodness of fit:

The $\chi^2$ test statistic for goodness of fit is defined by

$$\chi^2 = \sum \frac{(O - E)^2}{E},$$

where O is the observed frequency and E is the expected frequency.
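
The goodness-of-fit statistic can be sketched in a few lines of Python. This is only an illustration: the function name `chi_square_statistic` and the sample frequencies are made up for the example and do not come from the notes.

```python
def chi_square_statistic(observed, expected):
    """Return the sum of (O - E)^2 / E over all classes."""
    if len(observed) != len(expected):
        raise ValueError("observed and expected must have the same length")
    return sum((o - e) ** 2 / e for o, e in zip(observed, expected))


# Hypothetical frequencies, chosen only to show the arithmetic.
observed = [18, 22, 20]
expected = [20, 20, 20]
print(chi_square_statistic(observed, expected))  # 0.4
```

Each class contributes its squared deviation divided by its expected frequency, so a class with a small expected frequency can dominate the total.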

Properties of chi square distribution:

1. The exact shape of the distribution depends upon the number of degrees of freedom n. In general,
when n is small the curve is skewed to the right, and as n gets larger the
distribution becomes more and more symmetrical.

2. The mean and variance of the $\chi^2$ distribution are n and 2n respectively.

3. As n tends to infinity, the $\chi^2$ distribution approaches a normal distribution.

4. The sum of independent $\chi^2$ variates is also a $\chi^2$ variate.

Uses of chi-square distribution:

1. To test if the hypothetical value of the population variance is (say) $\sigma^2 = \sigma_0^2$.

2. To test the "goodness of fit". It is used to determine whether an actual sample
distribution matches a known theoretical distribution.

3. To test the independence of attributes.

4. To test the homogeneity of independent estimates of the population correlation coefficient.
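
As an illustration of the first use, the variance test statistic $n s^2 / \sigma_0^2$ can be computed as below. This is a minimal sketch: the function name, the sample values and the hypothesised variance are all assumptions made for the example. Note that $s^2$ is taken with divisor n, so that $n s^2$ equals the sum of squared deviations from the sample mean.

```python
def variance_test_statistic(sample, sigma0_sq):
    """Return (chi-square statistic, degrees of freedom) for H0: sigma^2 = sigma0_sq."""
    n = len(sample)
    mean = sum(sample) / n
    # Sample variance with divisor n, so n * s_sq is the sum of squared deviations.
    s_sq = sum((x - mean) ** 2 for x in sample) / n
    return n * s_sq / sigma0_sq, n - 1


# Hypothetical sample and hypothesised variance, for illustration only.
sample = [2.0, 4.0, 4.0, 4.0, 5.0, 5.0, 7.0, 9.0]
chi2, dof = variance_test_statistic(sample, sigma0_sq=4.0)
print(chi2, dof)  # 8.0 7
```

The calculated value would then be compared with the tabulated $\chi^2$ value for n - 1 degrees of freedom.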

Conditions for the validity of chi -square test:

1. The experimental data (sample observation) must be independent of each other.

Dr. S. VIMAL KUMAR

2. The total frequency (the number of observations in the sample) must be reasonably
large, say 50.

3. No individual frequency should be less than 5. If any frequency is less than 5, it is
pooled (that is, one covering a fairly small area) with the preceding (or) succeeding frequency so that the pooled
frequency is more than 5. Finally, adjust for the d.f. lost in pooling.

4. The number of classes 'r' must be neither too small nor too large (i.e)
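
The pooling rule in condition 3 can be sketched as follows. This is one possible reading, not the only one: the function name `pool_small_classes` is hypothetical, the threshold of 5 is applied to the expected frequencies, a small class is merged with the succeeding class, and a small class left over at the end is merged with the preceding one. The frequencies are made up.

```python
def pool_small_classes(observed, expected, minimum=5):
    """Pool adjacent classes until every pooled expected frequency is >= minimum.

    Returns (pooled observed, pooled expected, number of classes lost).
    """
    pooled_o, pooled_e = [], []
    acc_o = acc_e = 0
    for o, e in zip(observed, expected):
        acc_o += o
        acc_e += e
        if acc_e >= minimum:
            pooled_o.append(acc_o)
            pooled_e.append(acc_e)
            acc_o = acc_e = 0
    # A small class left at the end is merged with the preceding pooled class.
    if acc_o or acc_e:
        if pooled_e:
            pooled_o[-1] += acc_o
            pooled_e[-1] += acc_e
        else:
            pooled_o.append(acc_o)
            pooled_e.append(acc_e)
    return pooled_o, pooled_e, len(expected) - len(pooled_e)


# Hypothetical frequencies: the first and last classes are too small.
obs, exp, lost = pool_small_classes([2, 6, 15, 20, 4, 3], [3, 7, 14, 18, 5.5, 2.5])
print(obs, exp, lost)  # [8, 15, 20, 7] [10, 14, 18, 8.0] 2
```

The third return value is the number of classes lost, which is the amount by which the degrees of freedom must be reduced.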

Independence of attributes:

1. When fitting a binomial distribution, degrees of freedom = n - 1

When fitting a Poisson distribution, degrees of freedom = n - 2

When fitting a normal distribution, degrees of freedom = n - 3

(Here n is the number of classes.)

2. If $\chi^2$ = 0, all observed and expected frequencies coincide.

3. For the $\chi^2$ distribution, mean = v and variance = 2v, where v is the number of degrees of freedom.

Problems

1. Five coins are tossed 256 times. The number of heads observed is given below. Examine
whether the coins are unbiased by employing the $\chi^2$ goodness-of-fit test.

No. of heads:   0    1    2    3    4    5

Frequency:      5   35   75   84   45   12

Sol:

Given n = 6 (the number of classes), N = total frequency = 256.

H0: Binomial is a good fit.

H1: Binomial is not a good fit.

At the 5% LOS, degrees of freedom = n - 1 = 6 - 1 = 5.

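$$\chi^2 = \sum \frac{(O - E)^2}{E}$$

On the assumption H0, the expected frequencies are given by the terms of (with 5 coins and p = q = 1/2)

$$N(q + p)^5 = 256\left(\tfrac{1}{2} + \tfrac{1}{2}\right)^5$$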


$$= \frac{256}{32}\left[{}^5C_0 + {}^5C_1 + {}^5C_2 + {}^5C_3 + {}^5C_4 + {}^5C_5\right]$$

$$= 8[1 + 5 + 10 + 10 + 5 + 1]$$

Therefore the expected frequencies are 8, 40, 80, 80, 40, 8.

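No. of heads     O     E    O-E   (O-E)^2   (O-E)^2/E
0                5     8    -3       9        1.1250
1               35    40    -5      25        0.6250
2               75    80    -5      25        0.3125
3               84    80     4      16        0.2000
4               45    40     5      25        0.6250
5               12     8     4      16        2.0000
Total          256   256                      4.8875

$$\chi^2 = \sum \frac{(O - E)^2}{E} = 4.8875$$

The tabulated value of $\chi^2$ for 5 degrees of freedom at the 5% LOS is 11.07. Since the calculated $\chi^2$ = 4.8875 < 11.07, we accept H0 at the 5% LOS.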

Binomial distribution is a good fit to the given data.
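
The arithmetic of Problem 1 can be checked with a short script. This is a sketch under the problem's own assumptions (5 unbiased coins, p = 1/2); the variable names are illustrative, and the critical value 11.07 is the tabulated $\chi^2$ value for 5 degrees of freedom at the 5% level.

```python
from math import comb

observed = [5, 35, 75, 84, 45, 12]   # frequencies for 0..5 heads
N = sum(observed)                    # total number of tosses, 256
coins = 5
p = 0.5                              # unbiased coin under H0

# Expected frequencies are the terms of N * (q + p)^5.
expected = [N * comb(coins, k) * p**k * (1 - p) ** (coins - k)
            for k in range(coins + 1)]

chi2 = sum((o - e) ** 2 / e for o, e in zip(observed, expected))
critical_value = 11.07               # chi-square table, 5 d.f., 5% LOS

print(expected)                      # [8.0, 40.0, 80.0, 80.0, 40.0, 8.0]
print(round(chi2, 4))                # 4.8875
print("accept H0" if chi2 < critical_value else "reject H0")
```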

2. The demand for a particular spare part in a factory was found to vary from day to
day. In a sample study the following information was obtained.

Days:             Monday   Tuesday   Wednesday   Thursday   Friday   Saturday

No. of demands:    1124     1125       1110        1120      1126      1115

Test the hypothesis that the number of parts demanded does not depend on the day of
the week.

Sol:

H0: The number of parts demanded does not depend on the day of the week.

H1: The number of parts demanded depends on the day of the week.

$$\text{Expected frequency} = \frac{1}{6}[1124 + 1125 + 1110 + 1120 + 1126 + 1115] = 1120$$

Days        Observed frequency   Expected frequency   (O-E)^2   (O-E)^2/E
Monday             1124                 1120             16       0.014
Tuesday            1125                 1120             25       0.022
Wednesday          1110                 1120            100       0.089
Thursday           1120                 1120              0       0.000
Friday             1126                 1120             36       0.032
Saturday           1115                 1120             25       0.022
Total              6720                 6720            202       0.180

$$\chi^2 = \frac{202}{1120} \approx 0.180$$

The tabulated value of $\chi^2$ for 6 - 1 = 5 degrees of freedom at the 5% LOS is 11.07. Since the calculated $\chi^2$ = 0.180 < 11.07, we accept H0 at the 5% LOS.

Hence the number of parts demanded does not depend on the day of the week.
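
Problem 2 can be verified in the same way. In this sketch the expected frequency under H0 is simply the mean daily demand; the dictionary layout and variable names are illustrative choices.

```python
observed = {
    "Monday": 1124, "Tuesday": 1125, "Wednesday": 1110,
    "Thursday": 1120, "Friday": 1126, "Saturday": 1115,
}

total = sum(observed.values())       # 6720
expected = total / len(observed)     # same expected demand for every day

# All classes share one expected frequency, so it can be factored out.
chi2 = sum((o - expected) ** 2 for o in observed.values()) / expected

critical_value = 11.07               # chi-square table, 5 d.f., 5% LOS
print(expected, round(chi2, 3))
print("accept H0" if chi2 < critical_value else "reject H0")
```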

3. Two sample polls of votes for two candidates A and B for a public office are taken, one
from among the residents of rural areas and one from among the residents of urban areas. The results are given in the adjoining table.
Examine whether the nature of the area is related to voting preference in this election.

Area      Votes for A   Votes for B   Total
Rural         620           380        1000
Urban         550           450        1000
Total        1170           830        2000
Sol:

Under the null hypothesis that the nature of the area is independent of the voting preference in
the election, we get the expected frequencies as follows.

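$$E(620) = \frac{1170 \times 1000}{2000} = 585, \qquad E(380) = \frac{830 \times 1000}{2000} = 415$$

$$E(550) = \frac{1170 \times 1000}{2000} = 585, \qquad E(450) = \frac{830 \times 1000}{2000} = 415$$

$$\chi^2 = \sum \frac{(O - E)^2}{E} = \frac{(620 - 585)^2}{585} + \frac{(380 - 415)^2}{415} + \frac{(550 - 585)^2}{585} + \frac{(450 - 415)^2}{415} \approx 10.092$$

$\chi^2_{0.05}$ for (2 - 1)(2 - 1) = 1 degree of freedom is 3.841.

Since the calculated $\chi^2$ = 10.092 > 3.841, we reject H0 at the 5% LOS.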

Thus we conclude that the nature of the area is related to voting preference in the election.
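
The expected frequencies and the statistic for Problem 3 can be reproduced with a small helper for contingency tables. This is a sketch: the function name `expected_frequencies` is hypothetical, and each expected cell is computed as (row total x column total) / grand total, as in the solution above.

```python
def expected_frequencies(table):
    """Expected cell frequencies under independence of rows and columns."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    grand_total = sum(row_totals)
    return [[r * c / grand_total for c in col_totals] for r in row_totals]


observed = [[620, 380],   # rural: votes for A, votes for B
            [550, 450]]   # urban: votes for A, votes for B
expected = expected_frequencies(observed)

chi2 = sum((o - e) ** 2 / e
           for o_row, e_row in zip(observed, expected)
           for o, e in zip(o_row, e_row))
dof = (len(observed) - 1) * (len(observed[0]) - 1)

print(expected)            # [[585.0, 415.0], [585.0, 415.0]]
print(round(chi2, 3), dof)
print("reject H0" if chi2 > 3.841 else "accept H0")
```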

4. Two researchers A and B adopted different techniques while rating the students' level.
Can you say that the techniques adopted by them differ significantly?


Researcher   Below average   Average   Above average   Genius   Total
A                 86            60           44           10      200
B                 40            33           25            2      100
Total            126            93           69           12      300

Sol :

H0: There is no significant difference between the sampling techniques used by the 2
researchers for collecting the required data.

H1: There is a significant difference between the sampling techniques used by the 2 researchers
for collecting the required data.

Here we have a 2 x 4 contingency table, and the degrees of freedom = (2 - 1)(4 - 1) = 3.

Under the null hypothesis of independence, we have

$$E(86) = \frac{126 \times 200}{300} = 84, \qquad E(60) = \frac{93 \times 200}{300} = 62, \qquad E(44) = \frac{69 \times 200}{300} = 46$$

The table of expected frequencies can be completed as shown below:

Researcher   No. of students in each level                  Total
             Below avg   Average   Above avg   Genius
A                84         62         46         8          200
B                42         31         23         4          100
Total           126         93         69        12          300

We cannot apply the $\chi^2$ test straight away here, as the last expected frequency (4) is less than 5. We
should use the technique of pooling in this case, as given below.

Researcher   Type of students              f_i   e_i   (f_i-e_i)^2   (f_i-e_i)^2/e_i
A            Below avg                      86    84        4             0.048
A            Avg                            60    62        4             0.065
A            Above avg                      44    46        4             0.087
A            Genius                         10     8        4             0.500
B            Below avg                      40    42        4             0.095
B            Avg                            33    31        4             0.129
B            Above avg + Genius (pooled)    27    27        0             0.000
Total                                      300   300                      0.923

For researcher B the last two classes are pooled: f = 25 + 2 = 27 and e = 23 + 4 = 27.

The degrees of freedom are (2 - 1)(4 - 1) - 1 = 3 - 1 = 2, since 1 d.f. is lost in the method of pooling.
The tabulated value of $\chi^2$ for 2 d.f. at the 5% LOS is 5.991.

Since the calculated $\chi^2$ = 0.923 < 5.991, we accept H0.
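
The pooled calculation of Problem 4 can be sketched as below. The helper `pool_last_if_small` is a hypothetical name, and it mirrors the solution above by merging only the last two classes of a row whose final expected frequency is below 5; other pooling schemes are possible.

```python
observed = {"A": [86, 60, 44, 10], "B": [40, 33, 25, 2]}

col_totals = [sum(col) for col in zip(*observed.values())]   # [126, 93, 69, 12]
grand_total = sum(col_totals)                                # 300
expected = {name: [sum(row) * c / grand_total for c in col_totals]
            for name, row in observed.items()}


def pool_last_if_small(obs_row, exp_row, minimum=5):
    """Merge the last two classes when the last expected frequency is below minimum."""
    if exp_row[-1] < minimum:
        return (obs_row[:-2] + [obs_row[-2] + obs_row[-1]],
                exp_row[:-2] + [exp_row[-2] + exp_row[-1]])
    return obs_row, exp_row


chi2 = 0.0
for name in observed:
    o_row, e_row = pool_last_if_small(observed[name], expected[name])
    chi2 += sum((o - e) ** 2 / e for o, e in zip(o_row, e_row))

print(expected["B"])      # [42.0, 31.0, 23.0, 4.0]
print(round(chi2, 3))     # 0.923
print("accept H0" if chi2 < 5.991 else "reject H0")
```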

5. Individuals solving the problem:


5.1 Simple Problems:
1. The following table gives the frequencies of occurrence of the digits 0, 1, 2, ..., 9 in the last
place in four-figure logarithms of numbers. Examine if there is any peculiarity.

Digits:      0    1    2    3    4    5    6    7    8    9   Total

Frequency:   6   16   15   10   12   12    3    2    9    5     90

2. The following figures show the distribution of digits in numbers chosen at random from a
directory.

Digits:         0      1     2     3      4     5      6     7     8     9   Total

Frequency:   1026   1107   997   966   1075   933   1107   972   964   853   10000
Test whether the digits may be taken to occur equally frequently in the directory.

3. A die was thrown 498 times. Let x denote the number appearing on its top face. The
observed frequency of x is given below.
x    1    2    3    4    5    6
f   69   78   85   82   86   98
What opinion would you form about the accuracy of the die?
4. The table below gives the number of aircraft accidents that occurred during various days of the
week. Test whether the accidents are uniformly distributed over the week.
Days               Mon   Tue   Wed   Thurs   Fri   Sat
No. of accidents    14    18    12     11     15    14
Difficult problems
1. A survey of 320 families with 5 children each revealed the following information.


No. of boys        5    4     3    2    1    0
No. of girls       0    1     2    3    4    5
No. of families   14   56   110   88   40   12
Is this result consistent with the hypothesis that male and female births are equally probable?
2. Four coins were tossed 160 times and the following results were obtained.
No. of heads:           0    1    2    3    4
Observed frequencies:  17   52   54   31    6
Under the assumption that the coins are balanced, find the expected frequencies of getting
0, 1, 2, 3, 4 heads and test the goodness of fit.

6. Evaluation Strategy:
1. Find if there is any association between extravagance in fathers and extravagance in sons
from the following data.
                   Extravagant father   Miserly father
Extravagant son           327                741
Miserly son               545                234
Determine the coefficient of association also.
2. On the basis of the information noted below, find out whether the new treatment is
comparatively superior to the conventional one.
               Favourable   Non-favourable   Total
Conventional       40             70           110
New                60             30            90
Total             100            100         N = 200
3. In an investigation into the health and nutrition of two groups of children of different social
status, the following results were obtained.

                Poor   Rich   Total
Below normal     130     20     150
Normal           102    108     210
Above normal      24     96     120
Total            256    224     480
Discuss the relation between health and social status.

7. Flowchart:
$\chi^2$-test

Goodness of fit: find the observed frequencies (O_i) and expected frequencies (E_i).

Independence of attributes: find the observed frequencies (O_i) and expected frequencies (E_i).

Frame the null hypothesis and alternative hypothesis.

Compare the calculated value with the table value:

- If the calculated value is less than (<) the table value, accept the null hypothesis.
- If the calculated value is greater than (>) the table value, reject the null hypothesis.
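
The final comparison step of the flowchart can be sketched as a small function. The name `chi_square_decision` is hypothetical, the table value must be looked up by the reader for the relevant degrees of freedom, and treating a calculated value exactly equal to the table value as a rejection is an assumption, since the notes do not cover that case.

```python
def chi_square_decision(observed, expected, table_value):
    """Return (calculated chi-square, decision) following the flowchart."""
    calculated = sum((o - e) ** 2 / e for o, e in zip(observed, expected))
    # Assumption: equality with the table value is treated as rejection.
    decision = "accept H0" if calculated < table_value else "reject H0"
    return calculated, decision


# Data from Problem 3 (1 d.f., table value 3.841 at 5% LOS).
value, decision = chi_square_decision([620, 380, 550, 450],
                                      [585, 415, 585, 415], 3.841)
print(round(value, 3), decision)
```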
