


Prepared by:
Prof. Dr Bahaman Abu Samah
Department of Professional Development and Continuing Education
Faculty of Educational Studies
Universiti Putra Malaysia
• The chi-square distribution has only one parameter –
degrees of freedom
• The shape of the distribution is skewed to the right
(positively skewed) for small df and becomes symmetric for
large df
• The chi-square statistic reflects the magnitude of the
discrepancies between observed and expected counts
• The entire chi-square distribution lies to the right of the y-axis
• The chi-square value can never be negative
[Figure: chi-square distribution curves for df = 2, df = 7, and df = 12; horizontal axis χ2, from 1 to 21]

• Goodness-of-fit
  Tests an assumption about the distribution of a categorical variable
• Test of Independence
  Tests the association between variables in a contingency table
• Test of Homogeneity
  Tests the difference in proportions between groups
1 χ2 Goodness-of-Fit Test

• Compares the observed frequency in each category of a nominal variable with the frequency expected under a model based on probability theory
• Developed by Karl Pearson in 1900
• Also known as the one-sample chi-square test
• Tests an assumption about the distribution of a categorical variable
• Example:
There is 3:7 male to female teachers on teaching

• One categorical variable (nominal/ordinal) with two or more categories
• No more than 20% of the categories should have an expected frequency smaller than 5
• If more than 20% of the expected frequencies are smaller than 5, it is advisable to collapse adjacent categories
• Calculation is based on:
O - Observed frequency
E - Expected frequency
E = np
where n – sample size
p – probability/proportion
What to Expect?
Test statistic: $\chi^2 = \sum \frac{(O - E)^2}{E}$
Flow: State HO and HA → Calculate χ2 value → Determine critical value → Decision → Conclusion

Criteria              Decision
χ2cal ≥ χ2critical    Reject HO
χ2cal < χ2critical    Fail to reject HO
Hypothesis Test
5-Steps Hypothesis
1 State HO and HA

2 Calculate χ2

3 Determine Critical Value

4 Decision

5 Conclusion
Step 1: State HO & HA
HO: Statement of assumption
HA: Statement opposite of the assumption
Step 2: Calculate Test Statistic
Calculation of χ2 is based on frequencies:
1. Observed (O)
2. Expected (E)
E = np, where
n = sample size
p = probability/proportion
$\chi^2 = \sum \frac{(O - E)^2}{E}$
Step 3: Determine Critical Value
Critical value is based on:
• Significance level (α)
• Degrees of freedom
df = k − 1
where k = no. of levels/groups
Critical value: $\chi^2_{\alpha,\,df}$
Step 4: Decision
– Only two (2) possible decisions: Reject or Fail to Reject HO

Using the critical value:
Criteria              Decision
χ2cal ≥ χ2critical    Reject HO
χ2cal < χ2critical    Fail to reject HO

Using sig-χ2:
Criteria       Decision
sig-χ2 ≤ α     Reject HO
sig-χ2 > α     Fail to reject HO
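
Steps 3 and 4 can be sketched in a few lines of Python. This is only an illustration, not part of the original slides: the names `CRITICAL_05`, `critical_value` and `decide` are hypothetical, and the lookup table is assumed to hold the standard α = .05 critical values for df = 1 to 5 only.

```python
# Upper-tail chi-square critical values at alpha = .05 for df = 1..5,
# taken from a standard chi-square table (assumed, rounded to 3 decimals).
CRITICAL_05 = {1: 3.841, 2: 5.991, 3: 7.815, 4: 9.488, 5: 11.070}


def critical_value(k):
    """Step 3: return (df, critical value) for k categories, with df = k - 1."""
    df = k - 1
    return df, CRITICAL_05[df]


def decide(chi2_cal, chi2_critical):
    """Step 4: reject HO when the calculated value reaches the critical value."""
    return "Reject HO" if chi2_cal >= chi2_critical else "Fail to reject HO"


df, chi2_critical = critical_value(5)
print(df, chi2_critical)             # df and critical value for five categories
print(decide(10.0, chi2_critical))   # 10.0 is a made-up calculated value
```

For other significance levels or larger df, the table would have to be extended from a published chi-square table.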

Step 5: Conclusion
Reject HO:
Significantly different from the assumption
Fail to reject HO:
Not significantly different from the assumption
Example 1
The following table displays the age distribution for a sample
summoned for traffic violations. Test the hypothesis that the
proportion of people summoned for traffic violations is equal for
all age groups at .05 level of significance.

Age group         < 20    20-29    30-39    40-49    ≥ 50
No. of summons    32      25       19       16       8

p1 = p2 = p3 = p4 = p5 = .20
Step 1: Hypotheses
HO: The proportion of people involved in traffic
violations is the same for all age groups
HA: The proportion of people involved in traffic
violations is different by age groups
Step 2: Test Statistics
$\chi^2 = \sum \frac{(O - E)^2}{E}$

Age      O      E     (O – E)   (O – E)2   (O – E)2/E
< 20     32     20    12        144        7.20
20-29    25     20    5         25         1.25
30-39    19     20    -1        1          .05
40-49    16     20    -4        16         .80
≥ 50     8      20    -12       144        7.20
Total    100                               χ2 = 16.50
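Step 3: Critical value
df = k − 1 = 5 − 1 = 4
α = .05
$\chi^2_{4,\,.05} = 9.49$
[Figure: chi-square distribution with the critical value 9.49 separating the Fail to reject HO region (left) from the Reject HO region (right)]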

Step 4: Decision
Criteria              Decision
χ2cal ≥ χ2critical    Reject HO
χ2cal < χ2critical    Fail to reject HO

Since χ2cal (16.50) is greater than χ2critical (9.49), reject HO
Step 5 : Conclusion
Conclude the proportion of people summoned for
traffic violations is significantly different by age
groups at .05 level of significance.
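
The arithmetic of Example 1 can be checked with a short Python sketch. It assumes nothing beyond the observed counts and the critical value quoted in the example; the variable names are chosen here for illustration only.

```python
# Observed summons per age group, copied from the Example 1 table.
observed = {"< 20": 32, "20-29": 25, "30-39": 19, "40-49": 16, ">= 50": 8}

n = sum(observed.values())   # sample size
p = 1 / len(observed)        # HO: equal proportions, p = .20 per group
expected = n * p             # E = np, the same for every group here

# Chi-square statistic: sum of (O - E)^2 / E over all categories.
chi2_cal = sum((o - expected) ** 2 / expected for o in observed.values())

df = len(observed) - 1       # df = k - 1
chi2_critical = 9.49         # table value for df = 4, alpha = .05 (from Step 3)

decision = "Reject HO" if chi2_cal >= chi2_critical else "Fail to reject HO"
print(f"chi2 = {chi2_cal:.2f}, df = {df}, decision: {decision}")
```

Run as written, this reproduces χ2 = 16.50 with df = 4 and leads to the same decision as the worked example.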
7 Basic Key
Test the assumption pertaining to the distribution
of a categorical variable

► Only one categorical variable (Nominal/Ordinal)
► Calculation is based on:
O - Observed frequency
E - Expected frequency
E = np
where n – sample size
p – probability/proportion
How to run t-test in

Results of Chi-Square

Reject HO: sig-χ2 ≤ α
Fail to reject HO: sig-χ2 > α

Reject HO
There is a significant difference from the assumption

Fail to Reject HO
There is no significant difference from the assumption

• The table below, gift_type, provides the observed frequencies
(Observed N) for each gift, as well as the expected frequencies
(Expected N), which are the frequencies expected if the null
hypothesis is true. The difference between the observed and
expected frequencies is provided in the Residual column.
• Gift type:
• 1 = Gift Certificate
• 2 = Cuddly Toy
• 3 = Cinema Tickets
• The table below, Test Statistics, provides the actual result of the chi-
square goodness-of-fit test.
Step 1: State HO & HA
HO: The proportion of respondents receiving each type of gift is the same for all types of gifts
HA: The proportion of respondents receiving each type of gift differs by type of gift
Step 2: Report test statistic and sig-χ2
Step 3: Determine Significance Level
Set at either .05 or .01

In this case, .05
Step 4: Decision
– Only two (2) possible decisions: Reject or Fail to Reject HO
   Reject HO: significantly different from the assumption
   Fail to reject HO: not significantly different from the assumption

Criteria       Decision
sig-χ2 ≤ α     Reject HO
sig-χ2 > α     Fail to reject HO

Since sig-χ2 (.000) < α (.05), reject HO
Step 5: Conclusion
Conclude that the proportion of people who received gifts differs significantly by type of gift at the .05 level of significance.
APA Reporting

A chi-square test of goodness-of-fit was performed to determine whether the three types of gifts were equally distributed. The three types of gifts were not equally distributed in the population, χ2 (2, N = 1000) = 49.4, p < .05.
2 χ2 Test of Independence
• Aims to find out whether two qualitative variables are independent of or related to each other, taking into account the proportion of responses found in each combination of categories of the two variables
• Also known as the two independent samples chi-square test
Purpose &
Test association between two categorical variables and
determine the strength of the association

DV − Nominal/Ordinal
IV − Nominal/Ordinal

Note: At least one of the variables is on a nominal scale

Calculation based on frequencies rather than numerical scores.
What to Expect?
Test statistic: $\chi^2 = \sum \frac{(O - E)^2}{E}$
Flow: State HO and HA → Calculate χ2 value → Determine critical value → Decision → Conclusion

Criteria              Decision
χ2cal ≥ χ2critical    Reject HO
χ2cal < χ2critical    Fail to reject HO

Hypothesis Test

5-Step Hypothesis
1 State HO and HA

2 Calculate Test Statistics

3 Determine Critical Value

4 Decision

5 Conclusion

Step 1: State HO & HA
HO: DV is independent of IV
HA: DV is dependent on IV
Please follow the above stated format; otherwise the
meaning is reversed
Example:
DV = Academic performance
IV = Student group
HO: Academic performance is independent of student group
HA: Academic performance is dependent on student group
Step 2: Calculate Test Statistics
1. Calculate Expected Count (E) for each of the cells
in the following contingency table

2x3 Contingency Table

2. Calculate the Chi-Square value, using the following formula:
$\chi^2 = \sum \frac{(O - E)^2}{E}$

Step 3: Determine Critical Value
Critical value is based on:
• Significance level (α)
• Degrees of freedom
df = (R − 1)(C − 1)

Critical value: $\chi^2_{\alpha,\,df}$

Step 4: Decision
– Only two (2) possible decisions: Reject or Fail to Reject HO

Manual:
Criteria              Decision
χ2cal ≥ χ2critical    Reject HO
χ2cal < χ2critical    Fail to reject HO

Using sig-χ2:
Criteria       Decision
sig-χ2 ≤ α     Reject HO
sig-χ2 > α     Fail to reject HO

Step 5: Conclusion
Reject HO:
DV is significantly dependent on IV
Fail to reject HO:
DV is not significantly dependent on IV

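Measures of Association
1. Phi coefficient
2. Contingency coefficient: $C = \sqrt{\frac{\chi^2}{\chi^2 + n}}$
3. Cramer's V coefficient: $V = \sqrt{\frac{\chi^2}{n(k - 1)}}$, where k is the smaller of the number of rows and columns

a. Phi coefficient is used only for 2x2 contingency tables
b. Use Guilford's rule of thumb to interpret the magnitude of association between the two variables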
Guilford's Rule of Thumb
rs          Strength of Relationship
< .2        Negligible relationship
.2 - .4     Low relationship
.4 - .7     Moderate relationship
.7 - .9     High relationship
> .9        Very high relationship
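
The rule of thumb above can be written as a small lookup function. This is a sketch, not part of the original material: the function name `guilford_strength` is hypothetical, and because the table does not say which band a boundary value such as .2 belongs to, the code assumes it goes to the higher band.

```python
def guilford_strength(coefficient):
    """Map a coefficient to the rule-of-thumb label from the table above.

    Assumption: a value exactly on a boundary (e.g. .2) is placed in the
    higher band; the table itself leaves this open.
    """
    value = abs(coefficient)
    if value < 0.2:
        return "Negligible relationship"
    if value < 0.4:
        return "Low relationship"
    if value < 0.7:
        return "Moderate relationship"
    if value < 0.9:
        return "High relationship"
    return "Very high relationship"


print(guilford_strength(0.164))
```

Values very close to a boundary deserve a cautious reading, since the label then depends on this boundary assumption.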
A study was conducted to test the relationship between
gender and academic performance. Data were collected from a
randomly selected sample.
1. Test the hypothesis on the relationship between the two
variables at .05 level of significance.
2. Calculate and describe an appropriate measure of
association between the two variables.
            Academic Performance
Gender      High    Moderate    Low
Male        93      70          12
Female      87      32          6
Q1. Hypothesis Test
a. Hypotheses
HO: Academic performance is independent of gender
HA: Academic performance is dependent on gender
b. Test statistic. Calculate expected value for each cell
                 Academic Performance (expected counts in parentheses)
Gender           High          Moderate      Low          Row Totals
Male             93 (105.0)    70 (59.5)     12 (10.5)    175
Female           87 (75.0)     32 (42.5)     6 (7.5)      125
Column Totals    180           102           18           300

$\chi^2 = \sum \frac{(O - E)^2}{E}$

Group    O      E        (O – E)   (O – E)2   (O – E)2/E
M-H      93     105.0    -12.0     144.00     1.371
M-M      70     59.5     10.5      110.25     1.853
M-L      12     10.5     1.5       2.25       .214
F-H      87     75.0     12.0      144.00     1.920
F-M      32     42.5     -10.5     110.25     2.594
F-L      6      7.5      -1.5      2.25       .300
Total    300                                  χ2 = 8.252

c. Critical value
df = (R − 1)(C − 1) = (2 − 1)(3 − 1) = 1 × 2 = 2
α = .05
$\chi^2_{2,\,.05} = 5.99$
[Figure: chi-square distribution with the critical value 5.99 separating the Fail to reject HO region (left) from the Reject HO region (right)]

d. Decision
Since χ2cal (8.252) > χ2critical (5.99), reject HO
e. Conclusion
Academic performance is significantly dependent on gender at the .05 level of significance

Q2. Measures of Association
For a 2 x 3 contingency table, both the contingency and Cramer's V coefficients are appropriate

$C = \sqrt{\frac{\chi^2}{\chi^2 + n}} = \sqrt{\frac{8.252}{8.252 + 300}} = .164$
$V = \sqrt{\frac{\chi^2}{n(k - 1)}} = \sqrt{\frac{8.252}{300(2 - 1)}} = .166$

Negligible association between gender and academic performance
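
The whole gender by academic performance example can be reproduced with a short Python sketch. It assumes the usual expected count (row total × column total / n) and takes k as the smaller of the number of rows and columns; the variable names are illustrative only.

```python
from math import sqrt

# Observed counts from the example: rows = gender, columns = High, Moderate, Low.
observed = [
    [93, 70, 12],  # Male
    [87, 32, 6],   # Female
]

row_totals = [sum(row) for row in observed]
col_totals = [sum(col) for col in zip(*observed)]
n = sum(row_totals)

# Expected count per cell: row total * column total / n.
expected = [[r * c / n for c in col_totals] for r in row_totals]

# Chi-square statistic summed over all cells.
chi2_cal = sum(
    (o - e) ** 2 / e
    for o_row, e_row in zip(observed, expected)
    for o, e in zip(o_row, e_row)
)

rows, cols = len(observed), len(observed[0])
df = (rows - 1) * (cols - 1)
k = min(rows, cols)

C = sqrt(chi2_cal / (chi2_cal + n))   # contingency coefficient
V = sqrt(chi2_cal / (n * (k - 1)))    # Cramer's V

print(f"chi2 = {chi2_cal:.3f}, df = {df}, C = {C:.3f}, V = {V:.3f}")
```

Because the code does not round each cell's contribution before summing, it gives χ2 ≈ 8.253 rather than the hand-calculated 8.252; C and V still round to .164 and .166.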
Invalid Chi Square
• Invalid if more than 1/5 (20%) of cells have an expected count less than 5
• Can be avoided by:
  • Collecting larger samples
  • Combining data for the smaller expected categories until their combined value is 5 or more
A study was conducted to test the relationship between
gender and group participation. Data collected from a
randomly selected sample follow
1. Test the hypothesis on the relationship between the two
variables at .05 level of significance.
2. Calculate and describe an appropriate measure of
association between the two variables.
            Group participation
Gender      A     B     C
Female      6     16    8
Male        5     13    2
Q1. Hypothesis Test
a. Hypotheses
HO: Group participation is independent of gender
HA: Group participation is dependent on gender
b. Test statistic. Calculate expected value for each cell
                 Group participation (expected counts in parentheses)
Gender           A            B            C           Row Totals
Female           6 (6.6)      16 (17.4)    8 (6.0)     30
Male             5 (4.4)      13 (11.6)    2 (4.0)     20
Column Totals    11           29           10          50
Chi-Square: $\chi^2 = \sum \frac{(O - E)^2}{E}$

Group    O     E       (O – E)   (O – E)2   (O – E)2/E
F-A      6     6.6     -0.6      0.36       0.055
F-B      16    17.4    -1.4      1.96       0.113
F-C      8     6.0     2.0       4.0        0.667
M-A      5     4.4     0.6       0.36       0.082
M-B      13    11.6    1.4       1.96       0.169
M-C      2     4.0     -2.0      4.0        1.000
Total    50                                 χ2 = 2.086

c. Critical value
df = (R − 1)(C − 1) = (2 − 1)(3 − 1) = 1 × 2 = 2
α = .05
$\chi^2_{2,\,.05} = 5.99$
[Figure: chi-square distribution with the critical value 5.99 separating the Fail to reject HO region (left) from the Reject HO region (right)]

d. Decision
Since χ2cal (2.086) < χ2critical (5.99), fail to reject HO
e. Conclusion
Group participation is independent of gender at the .05 level of significance

Q2. Measures of Association
For a 2 x 3 contingency table, both the contingency and Cramer's V coefficients are appropriate

$C = \sqrt{\frac{\chi^2}{\chi^2 + n}} = \sqrt{\frac{2.086}{2.086 + 50}} = .200$
$V = \sqrt{\frac{\chi^2}{n(k - 1)}} = \sqrt{\frac{2.086}{50(2 - 1)}} = .204$

Negligible association between gender and group participation
Is this chi-square test of independence valid? Look for expected values < 5.

Group    O     E       (O – E)   (O – E)2   (O – E)2/E
F-A      6     6.6     -0.6      0.36       0.055
F-B      16    17.4    -1.4      1.96       0.113
F-C      8     6.0     2.0       4.0        0.667
M-A      5     4.4     0.6       0.36       0.082
M-B      13    11.6    1.4       1.96       0.169
M-C      2     4.0     -2.0      4.0        1.000
Total    50                                 χ2 = 2.086

Cells with expected count less than 5: 2 of 6, i.e. $\frac{2}{6} \times 100 = 33.33\%$
Since 33.33% of cells have an expected count less than 5, the chi-square test is invalid.

Condition to be fulfilled:
Valid if less than 20% of cells have an expected count less than 5
Invalid if more than 20% of cells have an expected count less than 5
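
The validity condition can be checked mechanically. The sketch below recomputes the expected counts for the gender by group participation table and reports the share of cells below 5. Variable names are illustrative, and treating exactly 20% as still valid is an assumption, since the condition above only covers "less than" and "more than" 20%.

```python
# Observed counts: rows = gender, columns = groups A, B, C.
observed = [
    [6, 16, 8],   # Female
    [5, 13, 2],   # Male
]

row_totals = [sum(row) for row in observed]
col_totals = [sum(col) for col in zip(*observed)]
n = sum(row_totals)

# Expected count per cell: row total * column total / n.
expected = [r * c / n for r in row_totals for c in col_totals]

small_cells = [e for e in expected if e < 5]
share = len(small_cells) / len(expected) * 100   # percentage of cells with E < 5

# Assumption: exactly 20% is treated as valid.
valid = share <= 20

print(f"{len(small_cells)} of {len(expected)} cells ({share:.2f}%) have E < 5")
print("valid" if valid else "invalid")
```

Here two of the six cells (expected counts 4.4 and 4.0) fall below 5, which matches the 33.33% worked out above.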
Yates Correction
• When there is only 1 degree of freedom, the regular chi-square test should not be used
• Apply the Yates correction by subtracting 0.5 from the absolute value of each calculated O − E term, then continue as usual with the new corrected values
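
Written as a formula, the correction described above amounts to the following. The subscript "Yates" is a label added here for illustration; O and E are the observed and expected counts as before.

```latex
\chi^2_{\text{Yates}} = \sum \frac{\left( |O - E| - 0.5 \right)^2}{E}
```

The corrected value is then compared with the critical value for df = 1 in the usual way.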
Contingency table: Observed

Hypothesis Testing: Chi-Square value

Chi-Square Tests
                                Value     df    Asymp. Sig. (2-sided)
Pearson Chi-Square              8.253a    2     .016
Likelihood Ratio                8.370     2     .015
Linear-by-Linear Association    6.762     1     .009
N of Valid Cases                300
a. 0 cells (.0%) have expected count less than 5. The minimum expected count is 7.50.

Decision: sig-χ2 (.016) < .05, so reject HO
Conclusion: Performance is significantly dependent on student group at the .05 level of significance

Measures of Association between variables

Symmetric Measures
                                                Value    Approx. Sig.
Nominal by Nominal    Phi                       .166     .016
                      Cramer's V                .166     .016
                      Contingency Coefficient   .164     .016
N of Valid Cases                                300
a. Not assuming the null hypothesis.
b. Using the asymptotic standard error assuming the null hypothesis.

Which measure is most appropriate? For a 2x3 table, use the C or V coefficient.
• Suppose we want to test for an association between smoking
behavior (nonsmoker, current smoker, or past smoker) and gender
(male or female) using a Chi-Square Test of Independence (we'll
use α = 0.05).

a. Hypotheses

HO: Smoking behavior is independent of gender

HA: Smoking behavior is dependent on gender
b. Report test statistic
χ2 = 3.171, sig-χ2 = .205
c. Set Alpha (significance level) = .05
d. Decision
Since the p-value (.205) is greater than the significance level (α = .05), fail to reject the null hypothesis
e. Conclusion
Conclude that there is not enough evidence to suggest an association between gender and smoking, or
No association was found between gender and smoking behavior (χ2(2) = 3.171, p = .205).
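
The reported significance value can be checked by hand for this case. With df = 2 the chi-square distribution is an exponential distribution with mean 2, so the upper-tail probability is exp(−χ2 / 2). The sketch below relies on that special case only; the function name `chi2_sig_df2` is hypothetical, and other degrees of freedom would need a statistics package or a table.

```python
from math import exp


def chi2_sig_df2(chi2_value):
    """Upper-tail probability of a chi-square variable with df = 2."""
    return exp(-chi2_value / 2)


alpha = 0.05
sig = chi2_sig_df2(3.171)   # chi-square value reported for the smoking example

decision = "Reject HO" if sig <= alpha else "Fail to reject HO"
print(f"sig = {sig:.3f}, decision: {decision}")
```

The result rounds to .205, in line with the value reported in step b, and gives the same decision.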
How to run t-test in

Results of Chi-Square

Reject HO: sig-χ2 ≤ α
Fail to reject HO: sig-χ2 > α

Reject HO
There is a significant association between IV and DV

Fail to Reject HO
There is no significant association between IV and DV

APA Reporting
A chi-square test of independence was performed to examine the relation between gender and smoking behavior. The relation between these variables was not significant, χ2 (2, N = 193) = 3.171, p > .05. Smoking behavior is not related to gender.
