
Chi-Square Test
Introduction
The chi-square test is a non-parametric test.
It makes no assumption about the distribution
of the population.
It uses data measured on the nominal scale (counts per category).





Chi-Square Test
In the chi-square test we compare the observed frequency of some
observation with the expected frequency.
The comparison of observed and expected frequencies is used to
calculate the value of the chi-square statistic.
The symbol for chi-square is $\chi^2$, and the formula is as follows:

$$\chi^2 = \sum \frac{(O - E)^2}{E}$$

where
O is the observed frequency, and
E is the expected frequency.
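The formula can be sketched as a small Python function. This is only an illustration: the name `chi_square` and the counts in the demo are hypothetical and do not come from the slides.

```python
def chi_square(observed, expected):
    """Return the sum over all categories of (O - E)^2 / E."""
    if len(observed) != len(expected):
        raise ValueError("observed and expected must have the same length")
    return sum((o - e) ** 2 / e for o, e in zip(observed, expected))


# Hypothetical counts, chosen only to illustrate the arithmetic.
observed = [18, 22, 20]
expected = [20, 20, 20]
print(chi_square(observed, expected))
```

Each category contributes its own (O - E)²/E term, so a single category that is far from its expected count can dominate the total.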

How will you show whether absenteeism differs
significantly from day to day?
The HR Manager at Georgetown Paper Ltd. is
concerned about absenteeism among workers. He
decides to sample the company records to
determine whether absenteeism is distributed evenly
throughout the six-day work week.

Week Day   Number Absent
Mon        12
Tue         9
Wed        11
Thu        10
Fri         9
Sat         9
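One possible way to work the absenteeism question is sketched below. It assumes that "distributed evenly" means the same expected count for every day, and it assumes an alpha level of .05, which the slide does not state. The degrees of freedom are taken as the number of categories minus one, and 11.070 is the tabled critical value for 5 degrees of freedom at the .05 level.

```python
absent = {"Mon": 12, "Tue": 9, "Wed": 11, "Thu": 10, "Fri": 9, "Sat": 9}

total = sum(absent.values())        # total absences in the sample
expected = total / len(absent)      # same expected count for each day
chi_square = sum((o - expected) ** 2 / expected for o in absent.values())
df = len(absent) - 1                # six categories -> 5 degrees of freedom

# Assumption: alpha = .05; tabled critical value for df = 5.
CRITICAL_05_DF5 = 11.070
reject_h0 = chi_square >= CRITICAL_05_DF5

print(total, expected, round(chi_square, 3), df, reject_h0)
```

Under these assumptions the statistic is far below the critical value, so the sample gives no evidence that absenteeism is spread unevenly over the week.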

Chi-Square Test (goodness-of-fit test)


Suppose students are required to purchase a
computer for college use and may select
Apple computers, IBM
computers, or some other brand
of computer. We want to know if
there is a significant difference
among the frequencies with which
these three brands of computers
are selected, or if the students select
equally among the three brands.
The data for 100 students are
recorded in the table.

Computer   No. of people who preferred it
IBM        47
Apple      36
Other      17





Frequency Table

Frequency with which students select computer brand

Computer             Observed Frequency   Expected Frequency   (O-E)²/E
IBM                  47                   33.333               5.604
Apple                36                   33.333               0.213
Other                17                   33.333               8.003
Total (chi-square)                                             13.820

From the table we can see that:

$$\chi^2 = \sum \frac{(O - E)^2}{E} = 13.820$$
Equal expected frequencies
The table also gives the expected frequency for
each category. Since there are 100
observations and three categories (Apple,
IBM, and Other), the expected
frequency for each category is 100/3, or 33.333.
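The table's arithmetic can be reproduced with a short sketch. The variable names are illustrative. Because the code keeps the expected frequency as the exact value 100/3 rather than the rounded 33.333, a per-row value may differ from the table in the last printed digit, while the total still rounds to 13.820.

```python
observed = {"IBM": 47, "Apple": 36, "Other": 17}

n = sum(observed.values())       # 100 students
expected = n / len(observed)     # equal expected frequency: 100 / 3

# One (O - E)^2 / E term per brand, then their sum.
contributions = {
    brand: (o - expected) ** 2 / expected for brand, o in observed.items()
}
chi_square = sum(contributions.values())

for brand, value in contributions.items():
    print(f"{brand:<6} {value:.3f}")
print(f"Total  {chi_square:.3f}")
```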





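Critical Value for Chi-Square
The degrees of freedom
The degrees of freedom are the number of category frequencies that are free
to vary once the total is fixed; they are needed to look up the critical
value.
Degrees of freedom (df) = (C - 1), where C is the number of categories.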






Chi-Square Test (goodness-of-fit test)

Null hypothesis:
there are no differences between the observed and
the expected frequencies.
Alternate hypothesis:
there are significant differences between the
observed and expected frequencies.
Set the alpha level:
alpha level at .05

Find the critical value for the given
degrees of freedom and alpha level:
df = C - 1
= (3 - 1) = 2
For df = 2 and alpha = .05, the critical value is 5.991.
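The tabled value 5.991 can be checked without a table in this particular case. The sketch below relies on a property that holds only for 2 degrees of freedom: the upper-tail probability of the chi-square distribution is exp(-x/2), so the critical value is -2 ln(alpha). The function name is hypothetical, and other degrees of freedom need a table or a statistics library.

```python
import math


def critical_value_df2(alpha):
    """Critical value of chi-square with exactly 2 degrees of freedom.

    For df = 2 the upper-tail probability is exp(-x / 2), so solving
    exp(-x / 2) = alpha gives x = -2 * ln(alpha).
    """
    if not 0 < alpha < 1:
        raise ValueError("alpha must be strictly between 0 and 1")
    return -2.0 * math.log(alpha)


print(round(critical_value_df2(0.05), 3))
```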

Write the decision rule for rejecting the null
hypothesis.
Reject H0 if Chi-Square >= 5.991.


Write a summary statement based on the decision.
Since our calculated value (13.820) is greater than
5.991, we reject the null hypothesis in favor of the
alternative hypothesis: the three brands are not selected equally often.
Chi-Square (goodness-of-fit test) with equal
expected frequencies

Chi-Square Test (goodness-of-fit test)

In a national study, students required to buy computers
for college use bought IBM computers 50% of the time,
Apple computers 25% of the time, and other computers
25% of the time. A survey of 100 new students shows that
36 bought Apple computers, 47 bought IBM computers,
and 17 bought some other brand of computer. We want to
know if these frequencies of computer-buying behavior are
similar to or different from the national study data.






Null hypothesis:
there are no differences between the observed
and the expected frequencies.
Alternate hypothesis:
there are significant differences between the
observed and expected frequencies.
Set the alpha level:
alpha level is .05

Computer             Observed Frequency   Expected Frequency   (O-E)²/E
IBM                  47                   50                   0.18
Apple                36                   25                   4.84
Other                17                   25                   2.56
Total (chi-square)                                             7.58

From the table we can see that:

$$\chi^2 = \sum \frac{(O - E)^2}{E} = 7.58$$
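When the expected frequencies are unequal, they come from the reference proportions multiplied by the sample size. A minimal sketch using the national-study shares and the survey counts quoted above (variable names are illustrative; 5.991 is the tabled critical value for df = 2 at the .05 level):

```python
national_share = {"IBM": 0.50, "Apple": 0.25, "Other": 0.25}
observed = {"IBM": 47, "Apple": 36, "Other": 17}

n = sum(observed.values())  # 100 surveyed students

# Expected count = national proportion * sample size.
expected = {brand: share * n for brand, share in national_share.items()}

chi_square = sum(
    (observed[b] - expected[b]) ** 2 / expected[b] for b in observed
)

CRITICAL_05_DF2 = 5.991
reject_h0 = chi_square >= CRITICAL_05_DF2

print(expected, round(chi_square, 2), reject_h0)
```

The same observed counts give a smaller statistic here than with equal expected frequencies, because the IBM count is close to what the national shares predict.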

Chi-Square (goodness-of-fit test) with unequal
expected frequencies

Find the critical value for the given
degrees of freedom and alpha level:
df = (C - 1) = 2

For df = 2 and alpha = .05,
the critical value is 5.991.

Write the decision rule for rejecting the null
hypothesis.
Reject H0 if Chi-Square >= 5.991.






Write a summary statement based on the decision.
Since our calculated value (7.58) is greater
than 5.991, we reject the null hypothesis in favor of
the alternative hypothesis: the survey frequencies differ from the national study data.


Chi square test: Test of
Independence





Chi-Square: test of independence
We want to know if there is a significant
difference between the frequencies with which males
and females come from small, medium, or large cities.
The two variables we are
considering here are hometown size (small,
medium, or large) and gender (male or female).
Another way of putting our research question is:
Is gender independent of size of hometown?





The data for 30 females and 6 males are in the following table.

Frequency with which males and females
come from small, medium, and large cities

         Small   Medium   Large   Totals
Female   10      14       6       30
Male     4       1        1       6
Totals   14      15       7       36







The formula for chi-square:

$$\chi^2 = \sum \frac{(O - E)^2}{E}$$

where
O is the observed frequency, and
E is the expected frequency.
The degrees of freedom for the two-dimensional chi-square
statistic are:
df = (C - 1)(R - 1)
where C is the number of columns (levels of the first
variable) and R is the number of rows (levels of the
second variable).
In the table above we have the observed frequencies. Now
we must calculate the expected frequency for each of the
six cells. For the two-variable chi-square we find the expected
frequencies with the formula:
Expected Frequency for a Cell =
(Column Total x Row Total) / Grand Total
In the table above we can see that the column totals are
14 (small), 15 (medium), and 7 (large), while the row
totals are 30 (female) and 6 (male). The grand total is 36.
Using the formula we can thus find the expected
frequency for each cell.
The expected frequency for the small female cell is (14 x 30)/36 = 11.667
The expected frequency for the medium female cell is (15 x 30)/36 = 12.500
The expected frequency for the large female cell is (7 x 30)/36 = 5.833
The expected frequency for the small male cell is (14 x 6)/36 = 2.333
The expected frequency for the medium male cell is (15 x 6)/36 = 2.500
The expected frequency for the large male cell is (7 x 6)/36 = 1.167
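The six calculations above all apply the same rule, which can be sketched as one helper. The function name `expected_frequencies` is hypothetical; the table is entered as a list of rows, with columns in the order small, medium, large.

```python
observed = [
    [10, 14, 6],  # Female: small, medium, large
    [4, 1, 1],    # Male:   small, medium, large
]


def expected_frequencies(table):
    """Expected count per cell: (row total * column total) / grand total."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    grand_total = sum(row_totals)
    return [[r * c / grand_total for c in col_totals] for r in row_totals]


expected = expected_frequencies(observed)
for row in expected:
    print([round(e, 3) for e in row])
```

The expected counts in each row and column add up to the same totals as the observed counts, which is a quick way to check the arithmetic.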
We can put these expected frequencies in our table and also include the values
for (O - E)²/E. The sum of all these will of course be the value of chi-square.

Observed frequencies, expected frequencies, and (O - E)²/E for males and
females from small, medium, and large cities

                     Small     Medium    Large    Totals
Female  Observed     10        14        6        30
        Expected     11.667    12.500    5.833
        (O-E)²/E     0.238     0.180     0.005
Male    Observed     4         1         1        6
        Expected     2.333     2.500     1.167
        (O-E)²/E     1.191     0.900     0.024
Totals               14        15        7        36
From the table we can see that:

$$\chi^2 = 0.238 + 0.180 + 0.005 + 1.191 + 0.900 + 0.024 = 2.538$$

and df = (C - 1)(R - 1) = (3 - 1)(2 - 1) = (2)(1) = 2
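The whole calculation for the test of independence can be sketched in one function that returns both the statistic and the degrees of freedom. The function name is hypothetical. Because the code sums unrounded cell values, it gives about 2.537, while adding the rounded table entries gives 2.538; the decision is unaffected.

```python
def chi_square_independence(table):
    """Return (chi-square, df) for a two-way table given as a list of rows."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    grand_total = sum(row_totals)

    statistic = 0.0
    for i, row in enumerate(table):
        for j, o in enumerate(row):
            e = row_totals[i] * col_totals[j] / grand_total  # expected count
            statistic += (o - e) ** 2 / e

    df = (len(table) - 1) * (len(table[0]) - 1)  # (R - 1)(C - 1)
    return statistic, df


observed = [
    [10, 14, 6],  # Female: small, medium, large
    [4, 1, 1],    # Male:   small, medium, large
]
statistic, df = chi_square_independence(observed)
print(round(statistic, 3), df)
```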
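State the null hypothesis and the alternative hypothesis.
H0: gender is independent of size of hometown.
H1: gender is not independent of size of hometown.
Set the alpha level.
alpha level is .05
Calculate the value of the appropriate statistic. Also indicate
the degrees of freedom.
Chi-Square = 2.538
df = (C - 1)(R - 1) = (2)(1) = 2; at this df and the 5% level the critical value is 5.991.
Write the decision rule for rejecting the null hypothesis.
Reject H0 if Chi-Square >= 5.991.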



To write the decision rule we find the critical value
by looking at the chi-square table and noting the tabled value in the column
for the .05 level and the row for 2 df.
Write a summary statement based on the decision.
Fail to reject H0.
Note: Since our calculated value (2.538) is not greater than
5.991, we fail to reject the null hypothesis; the data give no
evidence that gender and size of hometown are related.
Example
Ms. Jan Kilpatrick is the marketing
manager for a manufacturer of sports
cards. She plans to begin selling a
series of cards with pictures and playing
statistics of former Major League
Baseball players. One of the problems is
the selection of the former players. At a
baseball card show at South Mall last
weekend, she set up a booth and
offered cards of the following six Hall of
Fame baseball players: Tom Seaver,
Nolan Ryan, Ty Cobb, George Brett,
Hank Aaron, and Johnny Bench. At the
end of the day she sold a total of 120
cards. The number of cards sold for
each old-time player is shown in the
table on the right. Can she conclude
the sales are not the same for each
player?
Step 1: State the null hypothesis and the alternate hypothesis.
H0: there is no difference between $f_o$ and $f_e$
H1: there is a difference between $f_o$ and $f_e$

Step 2: Select the level of significance.
$\alpha$ = 0.05 as stated in the problem

Step 3: Select the test statistic.
The test statistic follows the chi-square distribution,
designated as $\chi^2$
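Step 4: Formulate the decision rule.

$$
\begin{aligned}
\text{Reject } H_0 \text{ if} \quad \chi^2 &> \chi^2_{\alpha,\,k-1} \\
\sum \left[ \frac{(f_o - f_e)^2}{f_e} \right] &> \chi^2_{\alpha,\,k-1} \\
\sum \left[ \frac{(f_o - f_e)^2}{f_e} \right] &> \chi^2_{.05,\,6-1} \\
\sum \left[ \frac{(f_o - f_e)^2}{f_e} \right] &> \chi^2_{.05,\,5} \\
\sum \left[ \frac{(f_o - f_e)^2}{f_e} \right] &> 11.070
\end{aligned}
$$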
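Critical values such as 11.070 are normally read from a chi-square table. As a cross-check, the sketch below finds them numerically using only the Python standard library. It is an illustration under stated assumptions rather than a substitute for a table or a statistics package: the function names are hypothetical, the degrees of freedom must be a positive integer, and the upper-tail probability is built up from the known df = 1 and df = 2 cases with the incomplete-gamma recurrence.

```python
import math


def chi2_upper_tail(x, df):
    """P(chi-square with integer df >= 1 exceeds x), for x >= 0."""
    z = x / 2.0
    if df % 2 == 0:
        a, q = 1.0, math.exp(-z)             # df = 2 starting point
    else:
        a, q = 0.5, math.erfc(math.sqrt(z))  # df = 1 starting point
    # Each pass raises a by 1, which raises the degrees of freedom by 2.
    while a < df / 2.0:
        q += z ** a * math.exp(-z) / math.gamma(a + 1.0)
        a += 1.0
    return q


def critical_value(alpha, df, upper=1000.0):
    """Bisect for the x whose upper-tail probability equals alpha."""
    lo, hi = 0.0, upper
    for _ in range(200):
        mid = (lo + hi) / 2.0
        if chi2_upper_tail(mid, df) > alpha:
            lo = mid   # tail still too large: move right
        else:
            hi = mid   # tail too small: move left
    return (lo + hi) / 2.0


print(round(critical_value(0.05, 5), 3))
```

Calling `critical_value(0.05, 2)` or `critical_value(0.01, 3)` in the same way gives values that can be compared against the other tabled critical values used in these slides.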
Step 5: Compute the value of the Chi-square
statistic and make a decision

$$\chi^2 = \sum \left[ \frac{(f_o - f_e)^2}{f_e} \right] = 34.40$$
The computed $\chi^2$ of 34.40 is in the rejection region, beyond the critical value of 11.070. The
decision, therefore, is to reject H0 at the .05 level.
Conclusion: The difference between the observed and the expected frequencies is not due to
chance. Rather, the differences between $f_o$ and $f_e$ are large enough to be considered
significant. It is unlikely that card sales are the same among the six players.
Contingency Analysis - Example
Step 1: State the null hypothesis and the alternate hypothesis.
H0: There is no relationship between adjustment to civilian life
and where the individual lives after being released from prison.

H1: There is a relationship between adjustment to civilian life
and where the individual lives after being released from prison.

Step 2: Select the level of significance.
$\alpha$ = 0.01 as stated in the problem

Step 3: Select the test statistic.
The test statistic follows the chi-square distribution, designated as $\chi^2$
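Step 4: Formulate the decision rule.

$$
\begin{aligned}
\text{Reject } H_0 \text{ if} \quad \chi^2 &> \chi^2_{\alpha,\,(r-1)(c-1)} \\
\sum \left[ \frac{(f_o - f_e)^2}{f_e} \right] &> \chi^2_{\alpha,\,(2-1)(4-1)} \\
\sum \left[ \frac{(f_o - f_e)^2}{f_e} \right] &> \chi^2_{.01,\,(1)(3)} \\
\sum \left[ \frac{(f_o - f_e)^2}{f_e} \right] &> \chi^2_{.01,\,3} \\
\sum \left[ \frac{(f_o - f_e)^2}{f_e} \right] &> 11.345
\end{aligned}
$$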
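Computing Expected Frequencies ($f_e$)

$$f_e = \frac{(\text{row total})(\text{column total})}{\text{grand total}}, \qquad \text{for example} \quad \frac{(120)(50)}{200} = 30$$

Computing the Chi-square Statistic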
Conclusion
The computed $\chi^2$ of 5.729 is in the "do not reject H0" region. The null hypothesis is not rejected
at the .01 significance level.

We conclude there is no evidence of a relationship between adjustment to civilian life and where
the prisoner resides after being released from prison. For the Federal Correction Agency's
advisement program, adjustment to civilian life is not related to where the ex-prisoner lives.
