TEST -5 Material F-Test & Chi-square Test

F-Test:

Test for the equality of variances of two populations (or) F-test for equality of variances

Procedure:

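Let small samples of sizes $n_1$ and $n_2$ be drawn from two populations with variances $\sigma_1^2$ and $\sigma_2^2$ respectively. Let $s_1^2$, $s_2^2$ be the variances of the samples, so that $n_1 s_1^2 = \sum (x - \bar{x})^2$ and $n_2 s_2^2 = \sum (y - \bar{y})^2$. We want to test the equality of the variances of the two populations.

$H_0: \sigma_1^2 = \sigma_2^2$ [variances are equal]

$H_1: \sigma_1^2 \neq \sigma_2^2$ [variances are not equal]

Let
$$S_1^2 = \frac{n_1 s_1^2}{n_1 - 1}, \qquad S_2^2 = \frac{n_2 s_2^2}{n_2 - 1}$$

If $S_1^2 > S_2^2$, then the test statistic is
$$F = \frac{S_1^2}{S_2^2}$$

Degrees of freedom: $(v_1, v_2) = (n_1 - 1, n_2 - 1)$

If instead $S_2^2 > S_1^2$, the larger estimate still goes in the numerator: $F = S_2^2 / S_1^2$ with degrees of freedom $(n_2 - 1, n_1 - 1)$.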

Inference:

(i) If the calculated value of F < the tabulated value of F, the null hypothesis H0 is accepted.

(ii) If the calculated value of F > the tabulated value of F, the null hypothesis H0 is rejected.
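
The procedure above can be sketched in Python. This is only an illustration and not part of the original notes: the function names `f_statistic` and `decide` and the sample data are hypothetical, and the tabulated value still has to be read from an F table and passed in by hand.

```python
from statistics import variance


def f_statistic(sample_a, sample_b):
    """Return (F, (v1, v2)) with the larger variance estimate in the numerator."""
    # statistics.variance divides the sum of squared deviations by (n - 1),
    # which matches the estimate S^2 = n s^2 / (n - 1) used in the notes.
    var_a = variance(sample_a)
    var_b = variance(sample_b)
    if var_a >= var_b:
        return var_a / var_b, (len(sample_a) - 1, len(sample_b) - 1)
    return var_b / var_a, (len(sample_b) - 1, len(sample_a) - 1)


def decide(f_calculated, f_tabulated):
    """Apply the inference rule: accept H0 when F_cal < F_tab."""
    return "accept H0" if f_calculated < f_tabulated else "reject H0"


# Hypothetical illustration data (not taken from the notes).
f_value, dof = f_statistic([1, 2, 3, 4, 5], [2, 4, 6, 8, 10])
print(f_value, dof)
```

The second sample has the larger variance estimate here, so it supplies the numerator and the first entry of the degrees of freedom.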

1) A group of 10 rats fed on diet A and another group of 8 rats fed on diet B recorded the following increases in weight:

Diet A    5   6   8   1   12   4   3   9   6   10
Diet B    2   3   6   8   10   1   2   8   -   -

Show that the estimates of the population variance from the samples are not significantly different.
Solution:
Null hypothesis H0: There is no significant difference between the variances of the increase in weight due to diets A and B, i.e. $\sigma_1^2 = \sigma_2^2$
Alternative hypothesis H1: $\sigma_1^2 \neq \sigma_2^2$
Test statistic:
$$F = \frac{S_1^2}{S_2^2} \quad \text{or} \quad F = \frac{S_2^2}{S_1^2}$$
whichever has the larger estimate in the numerator.
To calculate the sample means and variances:

   x     x - x̄      (x - x̄)²       y     y - ȳ     (y - ȳ)²
       (x - 6.4)                         (y - 5)
   5     -1.4         1.96         2      -3          9
   6     -0.4         0.16         3      -2          4
   8      1.6         2.56         6       1          1
   1     -5.4        29.16         8       3          9
  12      5.6        31.36        10       5         25
   4     -2.4         5.76         1      -4         16
   3     -3.4        11.56         2      -3          9
   9      2.6         6.76         8       3          9
   6     -0.4         0.16         -       -          -
  10      3.6        12.96         -       -          -
 Σ = 64             102.40        40                 82

[Prepared by Dr. R. Manimaran, Assistant Professor, Department of Mathematics, SRMIST, Vadapalani campus, Chennai-26]

Mean of diet A: here $n_1 = 10$, so
$$\bar{x} = \frac{\sum x}{n_1} = \frac{64}{10} = 6.4$$
Mean of diet B: here $n_2 = 8$, so
$$\bar{y} = \frac{\sum y}{n_2} = \frac{40}{8} = 5$$

$$S_1^2 = \frac{\sum (x - \bar{x})^2}{n_1 - 1} = \frac{102.40}{10 - 1} = 11.3778$$
$$S_2^2 = \frac{\sum (y - \bar{y})^2}{n_2 - 1} = \frac{82}{8 - 1} = 11.7143$$

Since $S_2^2 > S_1^2$,
$$F = \frac{S_2^2}{S_1^2} = \frac{11.7143}{11.3778}$$
Fcal = 1.0296 with degrees of freedom υ = (n2 - 1, n1 - 1) = (7, 9)
Tabulated value of F for (7, 9) d.f. at the 5% level of significance is 3.29.
Since Fcal < Ftab, H0 is accepted,
i.e. there is no significant difference between the estimates of the population variance from the two samples.
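
As a check on the arithmetic in Example 1, the working table can be rebuilt in Python. This is a sketch added for illustration; the helper name `working_table` is hypothetical and does not come from the notes.

```python
from statistics import mean

diet_a = [5, 6, 8, 1, 12, 4, 3, 9, 6, 10]
diet_b = [2, 3, 6, 8, 10, 1, 2, 8]


def working_table(values):
    """Return the mean and the rows (x, x - mean, (x - mean)^2), rounded to 2 places."""
    m = mean(values)
    rows = [(v, round(v - m, 2), round((v - m) ** 2, 2)) for v in values]
    return m, rows


mean_a, rows_a = working_table(diet_a)
mean_b, rows_b = working_table(diet_b)

# S^2 = (sum of squared deviations) / (n - 1) for each sample.
s1_sq = sum(sq for _, _, sq in rows_a) / (len(diet_a) - 1)
s2_sq = sum(sq for _, _, sq in rows_b) / (len(diet_b) - 1)

# The larger estimate goes in the numerator.
f_cal = max(s1_sq, s2_sq) / min(s1_sq, s2_sq)

for row in rows_a:
    print(row)
print(round(s1_sq, 4), round(s2_sq, 4), round(f_cal, 4))
```

Printing the rows makes it easy to compare each deviation and square against the hand-computed table.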
2) Test whether the variances are significantly different for the following samples:

X1    24   27   26   21   25
X2    27   30   32   36   28   23
Solution:
To test whether the variances are significantly different, we use the F-test.
Given n1 = 5, n2 = 6.
Calculation of the means and variances of the samples:

   x     x - x̄       (x - x̄)²       y     y - ȳ        (y - ȳ)²
       (x - 24.6)                        (y - 29.33)
  24     -0.6          0.36        27     -2.33         5.4289
  27      2.4          5.76        30      0.67         0.4489
  26      1.4          1.96        32      2.67         7.1289
  21     -3.6         12.96        36      6.67        44.4889
  25      0.4          0.16        28     -1.33         1.7689
   -       -            -          23     -6.33        40.0689
 Σ = 123              21.20       176                  99.3334

$$\bar{x} = \frac{\sum x}{n_1} = \frac{123}{5} = 24.6, \qquad \bar{y} = \frac{\sum y}{n_2} = \frac{176}{6} = 29.33$$
$$\sum (x - \bar{x})^2 = 21.20, \qquad \sum (y - \bar{y})^2 = 99.3334$$
$$S_1^2 = \frac{\sum (x - \bar{x})^2}{n_1 - 1} = \frac{21.20}{5 - 1} = 5.3$$
$$S_2^2 = \frac{\sum (y - \bar{y})^2}{n_2 - 1} = \frac{99.3334}{6 - 1} = 19.8667$$

Null hypothesis H0: $\sigma_1^2 = \sigma_2^2$

Test statistic (since $S_2^2 > S_1^2$, the larger estimate goes in the numerator):
$$F = \frac{S_2^2}{S_1^2} = \frac{19.8667}{5.3}$$
Fcal = 3.7484 with degrees of freedom υ = (n2 - 1, n1 - 1) = (5, 4)
Tabulated value of F for (5, 4) d.f. at the 5% level of significance is 6.26.
Since Fcal < Ftab, we accept H0, i.e. the variances do not differ significantly.
3) Two samples of sizes 9 and 8 gave the sums of squares of deviations from their respective means equal to 160 and 91 respectively. Can they be regarded as drawn from the same normal population?

Solution:

Given $n_1 = 9$, $\sum (x_i - \bar{x})^2 = 160$, i.e. $n_1 s_1^2 = 160$

$n_2 = 8$, $\sum (y_i - \bar{y})^2 = 91$, i.e. $n_2 s_2^2 = 91$

$$S_1^2 = \frac{n_1 s_1^2}{n_1 - 1} = \frac{160}{9 - 1} = \frac{160}{8} = 20$$

$$S_2^2 = \frac{n_2 s_2^2}{n_2 - 1} = \frac{91}{8 - 1} = \frac{91}{7} = 13$$

Since $S_1^2 > S_2^2$,
$$F = \frac{S_1^2}{S_2^2} = \frac{20}{13} = 1.54$$

$H_0: \sigma_1^2 = \sigma_2^2$ [variances are equal]

$H_1: \sigma_1^2 \neq \sigma_2^2$ [variances are not equal]

Degrees of freedom: $(v_1, v_2) = (n_1 - 1, n_2 - 1) = (8, 7)$

Level of significance: 5%

Tabulated value: $F_{0.05}(8, 7) = 3.73$

Since the calculated value of F < the tabulated value of F, the null hypothesis H0 is accepted,

i.e. the two samples could have come from two normal populations with the same variance.
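
When only the sums of squared deviations are known, as in Example 3, the same statistic can be computed from those summaries directly. The sketch below is an illustration only; the function name `f_from_sums_of_squares` is hypothetical.

```python
def f_from_sums_of_squares(ss_1, n_1, ss_2, n_2):
    """Return (F, (v1, v2)) from sums of squared deviations and sample sizes."""
    est_1 = ss_1 / (n_1 - 1)  # S1^2 = n1 s1^2 / (n1 - 1)
    est_2 = ss_2 / (n_2 - 1)  # S2^2 = n2 s2^2 / (n2 - 1)
    if est_1 >= est_2:
        return est_1 / est_2, (n_1 - 1, n_2 - 1)
    return est_2 / est_1, (n_2 - 1, n_1 - 1)


# Example 3: sums of squares 160 and 91 for samples of sizes 9 and 8.
f_cal, dof = f_from_sums_of_squares(160, 9, 91, 8)
print(round(f_cal, 2), dof)
```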

4) The nicotine contents in two random samples of tobacco are given below:

Sample I 21 24 25 26 27
Sample II 22 27 28 30 31 36
Can you say that the two samples came from the same population?

Solution:

Calculation of the means and variances of the samples:

   x     x - x̄       (x - x̄)²       y     y - ȳ      (y - ȳ)²
       (x - 24.6)                        (y - 29)
  21     -3.6         12.96        22      -7          49
  24     -0.6          0.36        27      -2           4
  25      0.4          0.16        28      -1           1
  26      1.4          1.96        30       1           1
  27      2.4          5.76        31       2           4
   -       -            -          36       7          49
 Σ = 123   0          21.20       174       0         108

$$\bar{x} = \frac{\sum x}{n_1} = \frac{123}{5} = 24.6, \qquad \bar{y} = \frac{\sum y}{n_2} = \frac{174}{6} = 29$$
$$\sum (x - \bar{x})^2 = 21.2, \qquad \sum (y - \bar{y})^2 = 108$$
$$S_1^2 = \frac{\sum (x - \bar{x})^2}{n_1 - 1} = \frac{21.20}{5 - 1} = 5.3$$
$$S_2^2 = \frac{\sum (y - \bar{y})^2}{n_2 - 1} = \frac{108}{6 - 1} = 21.6$$

Null hypothesis H0: $\sigma_1^2 = \sigma_2^2$

Since $S_2^2 > S_1^2$, the test statistic is
$$F = \frac{S_2^2}{S_1^2} = \frac{21.6}{5.3} = 4.08$$
Fcal = 4.08 with degrees of freedom υ = (n2 - 1, n1 - 1) = (5, 4)
Tabulated value of F for (5, 4) d.f. at the 5% level of significance is 6.26.
Since Fcal < Ftab, we accept H0, i.e. the variances of the two populations can be regarded as equal.

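t-test

Equal variances alone do not show that the samples come from the same population; the means must also be compared.

Null hypothesis H0: $\mu_1 = \mu_2$

Alternative hypothesis H1: $\mu_1 \neq \mu_2$

Test statistic:
$$t = \frac{\bar{x}_1 - \bar{x}_2}{\sqrt{\left(\dfrac{n_1 s_1^2 + n_2 s_2^2}{n_1 + n_2 - 2}\right)\left(\dfrac{1}{n_1} + \dfrac{1}{n_2}\right)}}$$

Here $n_1 s_1^2 = \sum (x - \bar{x})^2 = 21.2$ and $n_2 s_2^2 = \sum (y - \bar{y})^2 = 108$, so
$$t = \frac{24.6 - 29}{\sqrt{\left(\dfrac{21.2 + 108}{9}\right)\left(\dfrac{1}{5} + \dfrac{1}{6}\right)}} = \frac{-4.4}{2.2943} = -1.92$$

Degrees of freedom: v = n1 + n2 - 2 = 9

Tabulated value: $t_{0.05}(9) = 2.26$

Since $|t_{cal}| < t_{tab}$, we accept H0.

That is, the means of the two samples do not differ significantly.

Therefore, the two samples could have been drawn from the same normal population.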
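
The t statistic for the nicotine data can be reproduced with a few lines of Python. This is a sketch for checking the arithmetic, not part of the original notes; the helper name `sum_sq_dev` is hypothetical.

```python
from math import sqrt
from statistics import mean

sample_1 = [21, 24, 25, 26, 27]
sample_2 = [22, 27, 28, 30, 31, 36]


def sum_sq_dev(values):
    """Sum of squared deviations from the mean, i.e. n * s^2 in the notes' notation."""
    m = mean(values)
    return sum((v - m) ** 2 for v in values)


n1, n2 = len(sample_1), len(sample_2)

# Pooled variance: (n1 s1^2 + n2 s2^2) / (n1 + n2 - 2)
pooled_var = (sum_sq_dev(sample_1) + sum_sq_dev(sample_2)) / (n1 + n2 - 2)

t_cal = (mean(sample_1) - mean(sample_2)) / sqrt(pooled_var * (1 / n1 + 1 / n2))
dof = n1 + n2 - 2

print(round(t_cal, 2), dof)
```

The sign of `t_cal` only reflects which sample mean is larger; the comparison with the tabulated value uses its absolute value.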

Chi-square test
A test of the significance of the discrepancy between experimental (observed) values and the theoretical values obtained under some theory or hypothesis is known as the $\chi^2$ test for goodness of fit.

$$\chi^2 = \sum \frac{(O_i - E_i)^2}{E_i}$$
where $O_i$ is the observed frequency and $E_i$ is the expected frequency. $\chi^2$ is used to test whether the differences between the observed and expected frequencies are significant.
Applications of the Chi-square test:
1. To test the goodness of fit

2. To test the independence of attributes

3. To test the homogeneity of independent estimates

Validity of Chi-square test:
1) The number of observations N in the sample must be reasonably large, say N ≥ 50.
2) Individual frequencies must not be too small, i.e. Oi ≥ 10.
3) The number of classes n must be neither too small nor too large, i.e. 4 ≤ n ≤ 16.
Chi-square test of independence of attributes:
An attribute is a characteristic or quality that may be present among the members of a population. In this case, the sample data may be presented in the form of a matrix containing m rows and n columns, and hence mn cells.

The test statistic is
$$\chi^2 = \sum_{i} \sum_{j} \frac{(O_{ij} - E_{ij})^2}{E_{ij}}$$
where the expected frequency of each cell is $E_{ij} = \dfrac{(\text{row total}) \times (\text{column total})}{N}$.

Degrees of freedom: (m - 1)(n - 1)

Let $\chi_c^2$ be the calculated value and $\chi_T^2$ the tabulated value.
(i) If $\chi_c^2 < \chi_T^2$, H0 is accepted, i.e. the attributes A and B are independent.
(ii) If $\chi_c^2 > \chi_T^2$, H0 is rejected, i.e. the attributes A and B are not independent.

1) The number of aircraft accidents that occurred on the various days of the week is given below. Test whether the accidents are uniformly distributed over the week.

Day                 Mon   Tues   Wed   Thurs   Fri   Sat
No. of accidents     15    19     13     12     16    15

Null hypothesis H0: Accidents occur uniformly over the week.

Total number of accidents = 90
Based on H0, the expected number of accidents on any day = 90/6 = 15

Observed         Expected         O - E     (O - E)²/E
frequency (O)    frequency (E)
    15               15              0         0
    19               15              4         1.07
    13               15             -2         0.27
    12               15             -3         0.6
    16               15              1         0.07
    15               15              0         0
 Σ = 90              90                        2.01

$$\chi^2 = \sum \frac{(O - E)^2}{E} = 2.01$$
Degrees of freedom v = 6 - 1 = 5
Tabulated value at the 5% level of significance: $\chi_T^2 = 11.07$
Since $\chi_c^2 = 2.01 < \chi_T^2 = 11.07$, H0 is accepted, i.e. the accidents may be regarded as occurring uniformly over the week.
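
The goodness-of-fit statistic for the accident data can be checked with a short Python sketch. The function name `chi_square` is hypothetical and the snippet is an illustration rather than part of the notes. Without intermediate rounding the statistic is exactly 2.00; the table's 2.01 comes from adding terms already rounded to two decimals.

```python
def chi_square(observed, expected):
    """Sum of (O - E)^2 / E over all classes."""
    return sum((o - e) ** 2 / e for o, e in zip(observed, expected))


accidents = [15, 19, 13, 12, 16, 15]  # Mon to Sat

# Under H0 every day has the same expected frequency: total / number of days.
expected = [sum(accidents) / len(accidents)] * len(accidents)

chi_sq = chi_square(accidents, expected)
dof = len(accidents) - 1

print(chi_sq, dof)
```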

2) The numbers of accidents per week in a certain locality were 12, 8, 20, 2, 14, 10, 15, 6, 9, 4. Are these frequencies in agreement with the belief that accident conditions were the same during this 10-week period?

Solution:
Expected frequency of accidents each week = 100/10 = 10
Null hypothesis H0: The accident conditions were the same during the 10-week period.

Observed         Expected         O - E     (O - E)²/E
frequency (O)    frequency (E)
    12               10              2         0.4
     8               10             -2         0.4
    20               10             10        10.0
     2               10             -8         6.4
    14               10              4         1.6
    10               10              0         0
    15               10              5         2.5
     6               10             -4         1.6
     9               10             -1         0.1
     4               10             -6         3.6
 Σ = 100            100                       26.6

$$\chi^2 = \sum \frac{(O - E)^2}{E} = 26.6$$
i.e. the calculated value of $\chi^2$ is 26.6.
Degrees of freedom υ = n - 1 = 10 - 1 = 9
The tabulated value of $\chi^2$ for 9 d.f. at the 5% level of significance is 16.92.
Since $\chi^2_{cal} > \chi^2_{tab}$, we reject the null hypothesis, i.e. the accident conditions were not the same during the 10-week period.
3) The theory predicts that the proportion of beans in four groups should be 9 : 3 : 3 : 1. In an examination of 1600 beans, the numbers in the four groups were 882, 313, 287 and 118. Do the experimental results support the theory?
Solution:
We want to test whether the numbers of beans in the four groups are in the ratio 9 : 3 : 3 : 1.
H0: The experiment supports the theory, i.e. the numbers of beans in the four groups are in the ratio 9 : 3 : 3 : 1.
Based on H0, the expected numbers of beans in the four groups are as follows:
$$E(1) = \frac{9}{16} \times 1600 = 900, \qquad E(2) = \frac{3}{16} \times 1600 = 300$$
$$E(3) = \frac{3}{16} \times 1600 = 300, \qquad E(4) = \frac{1}{16} \times 1600 = 100$$

Observed         Expected         O - E     (O - E)²/E
frequency (O)    frequency (E)
   882              900            -18         0.36
   313              300             13         0.56
   287              300            -13         0.56
   118              100             18         3.24
 Σ = 1600          1600                        4.72

Since $\sum E_i = \sum O_i$, degrees of freedom v = 4 - 1 = 3

Tabulated value at the 5% level of significance: $\chi_T^2 = 7.82$

Since $\chi_c^2 = 4.72 < \chi_T^2 = 7.82$, H0 is accepted,

i.e. the experimental data support the theory.
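
Expected frequencies from a theoretical ratio follow the same pattern whatever the ratio is, which the Python sketch below illustrates for the bean data. It is an added illustration; the variable names are arbitrary and not from the notes.

```python
ratio = [9, 3, 3, 1]
observed = [882, 313, 287, 118]

total = sum(observed)

# Each group's expected count is its share of the ratio times the total.
expected = [total * part / sum(ratio) for part in ratio]

chi_sq = sum((o - e) ** 2 / e for o, e in zip(observed, expected))
dof = len(observed) - 1

print(expected, round(chi_sq, 2), dof)
```

Computed without rounding the individual terms, the statistic is about 4.73, which agrees with the tabulated working up to rounding and leads to the same decision.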

4) Fit a binomial distribution to the following data and test the goodness of fit.

x     0    1    2    3    4    5    6    Total
f     5   18   28   12    7    6    4     80

Solution:
The probability law for the binomial distribution is
$$P(X = x) = {}^{n}C_x \, p^x q^{n-x}, \qquad x = 0, 1, 2, \ldots, n$$
The mean of a binomial distribution is np, from which we can find p. Here n = 6.

x:     0    1    2    3    4    5    6    Total
f:     5   18   28   12    7    6    4     80
fx:    0   18   56   36   28   30   24    192

$$\text{Mean } \bar{x} = \frac{\sum fx}{\sum f} = \frac{192}{80} = 2.4$$
i.e. np = 2.4 or 6p = 2.4, so p = 0.4 and hence q = 0.6.

The expected frequencies are given by
$$E(x) = N \, {}^{n}C_x \, p^x q^{n-x} = 80 \, {}^{6}C_x \, (0.4)^x (0.6)^{6-x}, \qquad x = 0, 1, 2, \ldots, 6$$

$E(0) = 80 \, {}^{6}C_0 \, (0.4)^0 (0.6)^6 = 3.73 \approx 4$

$E(1) = 80 \, {}^{6}C_1 \, (0.4)^1 (0.6)^5 = 14.93 \approx 15$

$E(2) = 80 \, {}^{6}C_2 \, (0.4)^2 (0.6)^4 = 24.88 \approx 25$

$E(3) = 80 \, {}^{6}C_3 \, (0.4)^3 (0.6)^3 = 22.12 \approx 22$

$E(4) = 80 \, {}^{6}C_4 \, (0.4)^4 (0.6)^2 = 11.06 \approx 11$

$E(5) = 80 \, {}^{6}C_5 \, (0.4)^5 (0.6)^1 = 2.95 \approx 3$

$E(6) = 80 \, {}^{6}C_6 \, (0.4)^6 = 0.33 \approx 0$

The expected frequencies are 4, 15, 25, 22, 11, 3, 0.

Classes with small frequencies are pooled with adjacent classes before the statistic is computed:

Observed            Expected             O - E     (O - E)²/E
frequency (O)       frequency (E)
5 + 18 = 23         4 + 15 = 19             4        0.842
28                  25                      3        0.36
12                  22                    -10        4.545
7 + 6 + 4 = 17      11 + 3 + 0 = 14         3        0.642
Σ = 80              80                               6.389

After pooling there are n = 4 classes. Two constraints are imposed (the totals agree, $\sum E_i = \sum O_i$, and p is estimated from the sample mean), so k = 2 and the degrees of freedom are v = n - k = 4 - 2 = 2.

Tabulated value at the 5% level of significance: $\chi_T^2 = 5.99$

Since $\chi_c^2 = 6.389 > \chi_T^2 = 5.99$, H0 is rejected,

i.e. the binomial fit for the given distribution is not satisfactory.
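
The binomial fit can be reproduced in Python as a check on the expected frequencies and the pooled statistic. This is a sketch under the same choices as the worked solution (n = 6, p estimated from the mean, expected frequencies rounded to whole numbers, classes pooled as in the table); the variable names and the `groups` list are illustrative, not from the notes.

```python
from math import comb

x_values = [0, 1, 2, 3, 4, 5, 6]
freq = [5, 18, 28, 12, 7, 6, 4]

total = sum(freq)                      # N = 80
n = max(x_values)                      # number of trials, 6
sample_mean = sum(x * f for x, f in zip(x_values, freq)) / total
p = sample_mean / n                    # from mean = n p
q = 1 - p

# E(x) = N * nCx * p^x * q^(n - x), rounded as in the worked solution.
expected = [total * comb(n, x) * p**x * q ** (n - x) for x in x_values]
rounded = [round(e) for e in expected]

# Pool the classes exactly as in the table: {0, 1}, {2}, {3}, {4, 5, 6}.
groups = [(0, 1), (2,), (3,), (4, 5, 6)]
pooled_obs = [sum(freq[i] for i in g) for g in groups]
pooled_exp = [sum(rounded[i] for i in g) for g in groups]

chi_sq = sum((o - e) ** 2 / e for o, e in zip(pooled_obs, pooled_exp))
dof = len(groups) - 2                  # two constraints: total and estimated p

print(rounded, pooled_obs, pooled_exp, round(chi_sq, 3), dof)
```

The pooling scheme is a choice; a different grouping of the small classes would change both the statistic and the degrees of freedom.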

5) Fit a Poisson distribution to the following data and test the goodness of fit.

x      0     1     2     3    4    5    Total
f    142   156    69    27    5    1     400

The probability law for the Poisson distribution is
$$P(X = x) = \frac{e^{-\lambda} \lambda^x}{x!}, \qquad x = 0, 1, 2, 3, \ldots$$

We require λ, which is the mean of the Poisson distribution.

x:      0     1     2     3    4    5    Total
f:    142   156    69    27    5    1     400
fx:     0   156   138    81   20    5     400

$$\text{Mean } \bar{x} = \frac{\sum fx}{\sum f} = \frac{400}{400} = 1 = \lambda$$

The expected frequencies are given by
$$E(x) = N \frac{e^{-\lambda} \lambda^x}{x!} = 400 \, \frac{e^{-1}}{x!}, \qquad x = 0, 1, 2, 3, \ldots$$

$E(0) = 400 \, \dfrac{e^{-1}}{0!} = 147.15 \approx 147$

$E(1) = 400 \, \dfrac{e^{-1}}{1!} = 147.15 \approx 147$

$E(2) = 400 \, \dfrac{e^{-1}}{2!} = 73.58 \approx 74$

$E(3) = 400 \, \dfrac{e^{-1}}{3!} = 24.53 \approx 25$

$E(4) = 400 \, \dfrac{e^{-1}}{4!} = 6.13 \approx 6$

$E(5) = 400 \, \dfrac{e^{-1}}{5!} = 1.23 \approx 1$

The expected frequencies are 147, 147, 74, 25, 6, 1.

The last three classes are pooled because their frequencies are small:

Observed            Expected             O - E     (O - E)²/E
frequency (O)       frequency (E)
142                 147                    -5        0.17
156                 147                     9        0.55
69                  74                     -5        0.34
27 + 5 + 1 = 33     25 + 6 + 1 = 32         1        0.03
ΣOi = 400           ΣEi = 400                        1.09

After pooling there are n = 4 classes. Since $\sum E_i = \sum O_i$ and λ is estimated from the sample mean, k = 2 and the degrees of freedom are v = n - k = 4 - 2 = 2.

Tabulated value at the 5% level of significance: $\chi_T^2 = 5.99$

Since $\chi_c^2 = 1.09 < \chi_T^2 = 5.99$, H0 is accepted,

i.e. the Poisson fit for the given distribution is satisfactory.
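
The Poisson fit follows the same steps and can be checked with the Python sketch below. It is an added illustration that mirrors the worked solution (λ taken as the sample mean, expected frequencies rounded, the last three classes pooled); the variable names are arbitrary.

```python
from math import exp, factorial

x_values = [0, 1, 2, 3, 4, 5]
freq = [142, 156, 69, 27, 5, 1]

total = sum(freq)                                            # N = 400
lam = sum(x * f for x, f in zip(x_values, freq)) / total     # sample mean

# E(x) = N * e^(-lambda) * lambda^x / x!, rounded as in the worked solution.
expected = [total * exp(-lam) * lam**x / factorial(x) for x in x_values]
rounded = [round(e) for e in expected]

# Pool the last three classes, as in the table.
pooled_obs = freq[:3] + [sum(freq[3:])]
pooled_exp = rounded[:3] + [sum(rounded[3:])]

chi_sq = sum((o - e) ** 2 / e for o, e in zip(pooled_obs, pooled_exp))
dof = len(pooled_obs) - 2    # two constraints: total and estimated lambda

print(rounded, pooled_obs, pooled_exp, round(chi_sq, 2), dof)
```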

6) The following data are collected on two characters.

               Smokers   Non-smokers
Literates         83          57
Illiterates       45          68

Based on this, can you say that there is no relation between smoking and literacy?

H0: Literacy and smoking habit are independent.

               Smokers   Non-smokers   Total
Literates         83          57        140
Illiterates       45          68        113
Total            128         125        253

Each expected frequency is (row total × column total) / grand total:

Oi      Ei                            Ei (rounded)    Oi - Ei    (O - E)²/E
83      128 × 140 / 253 = 70.83           71             12         2.03
57      125 × 140 / 253 = 69.17           69            -12         2.09
45      128 × 113 / 253 = 57.17           57            -12         2.53
68      125 × 113 / 253 = 55.83           56             12         2.57
ΣOi = 253                              ΣEi = 253                    9.22

$$\chi^2 = \sum \frac{(O - E)^2}{E} = 9.22$$
i.e. the calculated value of $\chi^2$ is 9.22.
Degrees of freedom υ = (m - 1)(n - 1) = (2 - 1)(2 - 1) = 1
From the table, the tabulated value at the 5% level of significance is $\chi_T^2 = 3.84$.

Since $\chi^2_{cal} > \chi^2_{tab}$, we reject the null hypothesis, i.e. there is some association between literacy and smoking.
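
The expected frequencies of a contingency table can be generated from the row and column totals, as the Python sketch below shows for the literacy data. It is an illustration only and the variable names are arbitrary. It computes the statistic twice: with expected frequencies rounded to whole numbers, as in the worked solution, and without rounding.

```python
table = [
    [83, 57],   # Literates:   smokers, non-smokers
    [45, 68],   # Illiterates: smokers, non-smokers
]

row_totals = [sum(row) for row in table]
col_totals = [sum(col) for col in zip(*table)]
grand_total = sum(row_totals)

# E_ij = (row total) * (column total) / grand total
expected = [[r * c / grand_total for c in col_totals] for r in row_totals]


def chi_square(observed, expected_table):
    """Sum of (O - E)^2 / E over every cell of the table."""
    return sum(
        (o - e) ** 2 / e
        for obs_row, exp_row in zip(observed, expected_table)
        for o, e in zip(obs_row, exp_row)
    )


rounded = [[round(e) for e in row] for row in expected]
chi_rounded = chi_square(table, rounded)
chi_exact = chi_square(table, expected)
dof = (len(table) - 1) * (len(table[0]) - 1)

print(rounded, round(chi_rounded, 2), round(chi_exact, 2), dof)
```

Rounding the expected frequencies lowers the statistic slightly (about 9.2 versus about 9.5 unrounded), but both values exceed 3.84, so the decision to reject H0 is unchanged.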

Objective questions

1. The degrees of freedom for the Chi-square test statistic when testing for independence in a contingency table with 4 rows and 4 columns would be

(a) 7 (b) 5 (c) 9 (d) 12    Ans: (c)

2. The chi-square test can be used:

(a) to test for a difference in two variances. (b) to make inferences about a population mean.
(c) to test for homogeneity of proportions. (d) for pairwise multiple comparisons of means.    Ans: (c)

3. To determine whether a set of observed frequencies differ from their corresponding expected frequencies, we could apply the

(a) t test for dependent samples. (b) chi-square test.
(c) t test for independent samples. (d) F test.    Ans: (b)

4. When using the chi-square test for differences in two proportions with a contingency table that has r rows and c columns, the degrees of freedom for the test statistic will be:

(a) (r - 1)(c - 1). (b) (r - 1) + (c - 1).
(c) n - 1. (d) n1 + n2 - 2.    Ans: (a)

5. On which of the following does the critical value for a chi-square statistic rely?

(a) The degrees of freedom (b) The sum of the frequencies
(c) The row totals (d) The number of variables    Ans: (a)
