The Chi-Square Test of Independence: Key Concepts

MODULE 34

THE CHI-SQUARE TEST OF INDEPENDENCE

KEY CONCEPTS:

34.1 The Chi-Square ($\chi^2$) Test of Independence is a test of hypothesis concerning the relationship or
association between categorical variables. It is a nonparametric test based primarily on counts
(observed frequencies), which are presented in a contingency table.

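34.2 In this test, the null hypothesis (H0), which states that the variables are independent, is tested
against the alternative hypothesis (Ha), which states that the variables are dependent. The
Chi-Square test statistic is given by:

$$\chi^2 = \sum_{i=1}^{r} \sum_{j=1}^{c} \frac{(O_{ij} - E_{ij})^2}{E_{ij}}$$

where $O_{ij}$ is the observed frequency and $E_{ij}$ is the expected frequency of the cell in row i and
column j of an r x c contingency table.

The decision rule is: reject H0 if the computed $\chi^2$ value is greater than the table (critical)
value of $\chi^2$ at level of significance α with (r - 1)(c - 1) degrees of freedom; otherwise, do not
reject H0.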

Example: In an experiment to study the dependence of TB on smoking habits, the following data
were taken on 180 individuals:

              Non-smokers   Moderate Smokers   Heavy Smokers
With TB            21              36                30
Without TB         48              26                19

Test the hypothesis that the presence or absence of TB is independent of smoking
habits. Use α = 0.05.

Solution:

Step 1: Set up the null hypothesis H0 and the alternative hypothesis Ha.

H0: The presence or absence of TB and smoking habits are independent.

Ha: The presence or absence of TB and smoking habits are not independent (dependent).

Step 2: The level of significance is α = 0.05.

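Step 3: Establish the test statistic and critical region.

The test statistic is $\chi^2$ with (r - 1)(c - 1) = (2 - 1)(3 - 1) = 2 degrees of freedom. At
α = 0.05, the table value is 5.991, so the critical region is $\chi^2$ > 5.991.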

Step 4: Computations

We cast the data into a contingency table with the row totals, column totals, and grand
total:

               Non-smokers   Moderate Smokers   Heavy Smokers   Row Total
With TB             21              36                30            87
Without TB          48              26                19            93
Column Total        69              62                49           180

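We must next compute the expected frequencies. The expected frequency of the cell in row i and
column j is

$$E_{ij} = \frac{(\text{row } i \text{ total})(\text{column } j \text{ total})}{\text{grand total}}$$

For instance, $E_{11}$ is the expected frequency of the first row and first column. Observe that the row
total of the first row is 87 and the column total of the first column is 69. The grand total
is 180, so

$$E_{11} = \frac{(87)(69)}{180} = 33.35$$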
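
The expected-frequency computation can be checked with a short Python sketch. The function name `expected_frequencies` is an illustrative assumption and not part of the module; it applies (row total x column total) / grand total to every cell of the TB table.

```python
def expected_frequencies(observed):
    """Return the expected counts under independence for a table of counts."""
    row_totals = [sum(row) for row in observed]
    col_totals = [sum(col) for col in zip(*observed)]
    grand_total = sum(row_totals)
    # Each expected count is (row total * column total) / grand total.
    return [[r * c / grand_total for c in col_totals] for r in row_totals]


observed = [
    [21, 36, 30],  # With TB
    [48, 26, 19],  # Without TB
]
expected = expected_frequencies(observed)
for row in expected:
    print([round(e, 2) for e in row])
```

Rounded to two decimal places, the first entry is the value of E11 worked out by hand from the totals 87, 69, and 180.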

Thus, we now cast the expected frequencies for each cell in the following contingency
table, where the values enclosed in parentheses are the expected frequencies.

               Non-smokers   Moderate Smokers   Heavy Smokers   Row Total
With TB             21              36                30            87
                  (33.35)         (29.97)           (23.68)
Without TB          48              26                19            93
                  (35.65)         (32.03)           (25.32)
Column Total        69              62                49           180

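Computing the statistic, we obtain:

$$\chi^2 = \sum_{i} \sum_{j} \frac{(O_{ij} - E_{ij})^2}{E_{ij}} = \frac{(21 - 33.35)^2}{33.35} + \frac{(36 - 29.97)^2}{29.97} + \frac{(30 - 23.68)^2}{23.68} + \frac{(48 - 35.65)^2}{35.65} + \frac{(26 - 32.03)^2}{32.03} + \frac{(19 - 25.32)^2}{25.32} = 14.46$$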


Step 5: Decision and Conclusion

Observe that the computed Chi-Square value (14.46) is greater than the table value
(5.991). Hence, we reject H0 and conclude that the presence or absence of TB
and smoking habits are dependent.
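
The computation and decision for this example can be reproduced with a minimal Python sketch. The function name `chi_square_statistic` is hypothetical. The critical value is obtained from a closed form that holds only for 2 degrees of freedom, which is an assumption specific to this 2 x 3 table; for other table sizes a Chi-Square table would still be needed.

```python
import math


def chi_square_statistic(observed):
    """Pearson chi-square statistic: sum of (O - E)^2 / E over all cells."""
    row_totals = [sum(row) for row in observed]
    col_totals = [sum(col) for col in zip(*observed)]
    grand_total = sum(row_totals)
    statistic = 0.0
    for i, row in enumerate(observed):
        for j, o in enumerate(row):
            e = row_totals[i] * col_totals[j] / grand_total
            statistic += (o - e) ** 2 / e
    return statistic


observed = [
    [21, 36, 30],  # With TB
    [48, 26, 19],  # Without TB
]
alpha = 0.05
df = (len(observed) - 1) * (len(observed[0]) - 1)  # (r - 1)(c - 1)

chi2 = chi_square_statistic(observed)

# Assumption: valid for df = 2 only, where the upper-tail probability
# of the chi-square distribution is exp(-x / 2).
assert df == 2
critical_value = -2 * math.log(alpha)

print(f"chi-square = {chi2:.2f}, critical value = {critical_value:.3f}")
print("Reject H0" if chi2 > critical_value else "Do not reject H0")
```

The sketch uses unrounded expected frequencies, so its result can differ from the hand computation in the later decimal places.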

34.3 For a 2 x 2 contingency table, we compute $\chi^2$ by applying Yates' correction for continuity, given
by the formula:

$$\chi^2 = \sum_{i=1}^{2} \sum_{j=1}^{2} \frac{(|O_{ij} - E_{ij}| - 0.5)^2}{E_{ij}}$$
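
A minimal Python sketch of Yates' correction is given below. The function name `yates_chi_square` is hypothetical, and the 2 x 2 table is an illustration only: it is the TB example with the moderate and heavy smoker columns merged, not a table that appears in this module. The sketch subtracts 0.5 from each absolute difference between observed and expected frequency before squaring.

```python
def yates_chi_square(table):
    """Chi-square with Yates' continuity correction for a 2 x 2 table."""
    if len(table) != 2 or any(len(row) != 2 for row in table):
        raise ValueError("Yates' correction applies to 2 x 2 tables only")
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    grand_total = sum(row_totals)
    statistic = 0.0
    for i in range(2):
        for j in range(2):
            expected = row_totals[i] * col_totals[j] / grand_total
            # Shrink each |O - E| by 0.5 before squaring.
            statistic += (abs(table[i][j] - expected) - 0.5) ** 2 / expected
    return statistic


# Illustration only: the TB example collapsed to 2 x 2 by merging the
# moderate and heavy smoker columns (36 + 30 = 66 and 26 + 19 = 45).
collapsed = [
    [21, 66],  # With TB: non-smokers, smokers
    [48, 45],  # Without TB: non-smokers, smokers
]
chi2_yates = yates_chi_square(collapsed)
print(f"Yates-corrected chi-square = {chi2_yates:.2f}")
```

A 2 x 2 table has (2 - 1)(2 - 1) = 1 degree of freedom, so the corrected value would be compared with the table value for 1 degree of freedom at the chosen α.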

Name: _________________________________________ Date: __________________ Score: ______

Instructor: __________________________________________ Schedule: ______________________


MODULE 34

EXERCISE SET
Solve the following problems:

1. In an experiment to study the dependence of religious affiliation on IQ, the following data were
taken.

                                     IQ
Religious Affiliation     High    Average    Low
Catholic                   40       26        17
Protestant                 47       59        25
Adventist                  83      104        79

Test the hypothesis that religious affiliation is independent of IQ. Use α = 0.05 level of significance.

2. Using the following table, test at α = 0.05 the hypothesis that gender and students' academic
performance are independent.

                    Academic Performance
Gender     Very Good    Good    Fair    Poor
Male           40        35      28      37
Female         45        30      33      32
