
When to Use a Chi-Square Test (With Examples)
BY ZACH BOBBITT POSTED ON FEBRUARY 11, 2021

In statistics, there are two different types of Chi-Square tests:

1. The Chi-Square Goodness of Fit Test – Used to determine whether
or not a categorical variable follows a hypothesized distribution.

2. The Chi-Square Test of Independence – Used to determine whether
or not there is a significant association between two categorical
variables.

Note that both of these tests are only appropriate to use when
you’re working with categorical variables. These are variables that
take on names or labels and can fit into categories. Examples
include:

• Eye color (e.g. “blue”, “green”, “brown”)
• Gender (e.g. “male”, “female”)
• Marital status (e.g. “married”, “single”, “divorced”)

This tutorial explains when to use each test along with several
examples of each.

The Chi-Square Goodness of Fit Test


You should use the Chi-Square Goodness of Fit
Test whenever you would like to know if some
categorical variable follows some hypothesized
distribution.

Here are some examples of when you might use this test:

Example 1: Counting Customers

A shop owner wants to know if an equal number of people come into a
shop each day of the week, so he counts the number of people who
come in each day during a random week.

He can use a Chi-Square Goodness of Fit Test to determine if the
distribution of customers follows the theoretical distribution that
an equal number of customers enters the shop each day of the week.

Example 2: Testing if a Die is Fair


Suppose a researcher would like to know if a die
is fair. She decides to roll it 50 times and record
the number of times it lands on each number.

She can use a Chi-Square Goodness of Fit Test to determine if the
distribution of values follows the theoretical distribution that
each value occurs the same number of times.

Example 3: Counting M&M’s

Suppose we want to know whether the colors of M&M’s in a bag are
distributed as follows: 20% yellow, 30% blue, 30% red, 20% other. To
test this, we open a random bag of M&M’s and count how many of each
color appear.

We can use a Chi-Square Goodness of Fit Test to determine if the
distribution of colors is equal to the distribution we specified.
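When the hypothesized distribution is not uniform, the expected count for each category is the hypothesized proportion multiplied by the sample size. The following Python sketch illustrates this for the M&M’s proportions above. The bag size of 50 and the helper name `expected_counts` are assumptions made for illustration only; they do not come from the example.

```python
# Hypothesized color proportions quoted in the M&M's example.
hypothesized = {"yellow": 0.20, "blue": 0.30, "red": 0.30, "other": 0.20}


def expected_counts(proportions, total):
    """Expected count per category = hypothesized proportion * sample size."""
    return {category: p * total for category, p in proportions.items()}


# ASSUMPTION: a bag containing 50 candies (illustrative, not from the text).
bag_size = 50
expected = expected_counts(hypothesized, bag_size)

for color, count in expected.items():
    print(f"{color}: {count:.1f} expected")
```

The observed counts from a real bag would then be compared against these expected counts.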

For a step-by-step example of a Chi-Square Goodness of Fit Test, see
the following example in Excel.

How to Perform a Chi-Square Goodness of Fit Test in Excel
BY ZACH BOBBITT POSTED ON APRIL 17, 2020

A Chi-Square Goodness of Fit Test is used to determine whether or
not a categorical variable follows a hypothesized distribution.

This tutorial explains how to perform a Chi-Square Goodness of Fit
Test in Excel.

Example: Chi-Square Goodness of Fit Test in Excel


A shop owner claims that an equal number of
customers come into his shop each weekday. To
test this hypothesis, an independent researcher
records the number of customers that come into
the shop on a given week and finds the following:

• Monday: 50 customers
• Tuesday: 60 customers
• Wednesday: 40 customers
• Thursday: 47 customers
• Friday: 53 customers

We will use the following steps to perform a Chi-Square Goodness of
Fit Test to determine if the data is consistent with the shop
owner’s claim.

Step 1: Input the data.

First, we will input the data values for the expected number of
customers each day in one column and the observed number of
customers each day in another column.

Note: There were 250 customers total. Thus, if the shop owner
expects an equal number to come into the shop each day then he
would expect 50 customers per day.
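As a cross-check outside Excel, the expected counts in the note can be reproduced with a few lines of Python. This is only a sketch of the arithmetic; the variable names are illustrative.

```python
# Observed customer counts from the example, Monday through Friday.
observed = {
    "Monday": 50,
    "Tuesday": 60,
    "Wednesday": 40,
    "Thursday": 47,
    "Friday": 53,
}

# Under the owner's claim, customers are spread equally across the days.
total = sum(observed.values())
expected_per_day = total / len(observed)
expected = {day: expected_per_day for day in observed}

print(f"total customers: {total}")
print(f"expected per day: {expected_per_day}")
```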

Step 2: Find the difference between the observed and expected values.

The Chi-Square test statistic for the Goodness of Fit test is
X² = Σ (O − E)² / E

where:

• Σ: is a fancy symbol that means “sum”
• O: observed value
• E: expected value

The following formula shows how to calculate (O − E)² / E for each
row:
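The same per-row quantity can be sketched in Python. This mirrors the spreadsheet column rather than reproducing the Excel formula itself, and the variable names are illustrative.

```python
days = ["Monday", "Tuesday", "Wednesday", "Thursday", "Friday"]
observed = [50, 60, 40, 47, 53]   # counts from the example
expected = [50, 50, 50, 50, 50]   # 250 customers / 5 days

# One (O - E)^2 / E term per day.
contributions = [(o - e) ** 2 / e for o, e in zip(observed, expected)]

for day, value in zip(days, contributions):
    print(f"{day}: {value:.2f}")
```

Days whose observed count is far from 50 (Tuesday and Wednesday) contribute the most.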

Step 3: Calculate the Chi-Square test statistic and the corresponding p-value.

Lastly, we will calculate the Chi-Square test statistic along with
the corresponding p-value using the following formulas.

Note: The Excel function CHISQ.DIST.RT(x, deg_freedom) returns the
right-tailed probability of the Chi-Square distribution associated
with a test statistic x and a given number of degrees of freedom.
The degrees of freedom is calculated as n − 1, where n is the number
of categories. In this case there are five days, so
deg_freedom = 5 − 1 = 4.
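For readers who want to check the numbers without Excel, here is a minimal Python sketch using only the standard library. It does not call CHISQ.DIST.RT. Instead it uses the closed-form right-tail probability that holds when the degrees of freedom are even: exp(−x/2) · Σ (x/2)^i / i! for i from 0 to df/2 − 1. The function name `chi2_right_tail_even_df` is hypothetical.

```python
import math


def chi2_right_tail_even_df(x, df):
    """P(X > x) for a Chi-Square variable with an EVEN number of degrees of freedom.

    Uses the closed form exp(-x/2) * sum_{i=0}^{df/2 - 1} (x/2)^i / i!,
    which is only valid when df is a positive even integer.
    """
    if df <= 0 or df % 2 != 0:
        raise ValueError("this sketch only handles positive even df")
    half = x / 2
    series = sum(half ** i / math.factorial(i) for i in range(df // 2))
    return math.exp(-half) * series


observed = [50, 60, 40, 47, 53]
expected = [50] * len(observed)

# Test statistic: sum of (O - E)^2 / E over all categories.
statistic = sum((o - e) ** 2 / e for o, e in zip(observed, expected))
df = len(observed) - 1  # 5 categories - 1 = 4
p_value = chi2_right_tail_even_df(statistic, df)

print(f"X^2 = {statistic:.2f}, df = {df}, p-value = {p_value:.4f}")
```

For odd degrees of freedom this shortcut does not apply, and a statistics library would be the more practical choice.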

Step 4: Interpret the results.

The X² test statistic for the test is 4.36 and the corresponding
p-value is 0.3595. Since this p-value is not less than 0.05, we fail
to reject the null hypothesis. This means we do not have sufficient
evidence to say that the true distribution of customers is different
from the distribution that the shop owner claimed.

The Chi-Square Test of Independence


You should use the Chi-Square Test of
Independence when you want to determine
whether or not there is a significant association
between two categorical variables.

Here are some examples of when you might use this test:

Example 1: Voting Preference & Gender

Researchers want to know if gender is associated with political
party preference in a certain town so they survey 500 voters and
record their gender and political party preference.

They can perform a Chi-Square Test of Independence to determine if
there is a statistically significant association between voting
preference and gender.

Example 2: Favorite Color & Favorite Sport

Researchers want to know if a person’s favorite color is associated
with their favorite sport so they survey 100 people and ask them
about their preferences for both.

They can perform a Chi-Square Test of Independence to determine if
there is a statistically significant association between favorite
color and favorite sport.

Example 3: Education Level & Marital Status

Researchers want to know if education level and marital status are
associated so they collect data about these two variables on a
simple random sample of 2,000 people.

They can perform a Chi-Square Test of Independence to determine if
there is a statistically significant association between education
level and marital status.

For a step-by-step example of a Chi-Square Test of Independence, see
the following example in Excel.

How to Perform a Chi-Square Test of Independence in Excel
BY ZACH BOBBITT POSTED ON APRIL 28, 2020

A Chi-Square Test of Independence is used to determine whether or
not there is a significant association between two categorical
variables.

This tutorial explains how to perform a Chi-Square Test of
Independence in Excel.

Example: Chi-Square Test of Independence in Excel


Suppose we want to know whether or not gender is associated with
political party preference. We take a simple random sample of 500
voters and survey them on their political party preference. The
following table shows the results of the survey:

Use the following steps to perform a Chi-Square Test of Independence
to determine if gender is associated with political party
preference.

Step 1: Define the hypotheses.

We will perform the Chi-Square Test of Independence using the
following hypotheses:

• H0: Gender and political party preference are independent.
• H1: Gender and political party preference are not independent.

Step 2: Calculate the expected values.

Next, we will calculate the expected values for each cell in the
contingency table using the following formula:

Expected value = (row sum * column sum) / table sum.

For example, the expected value for Male Republicans is:
(230*250) / 500 = 115.

We can repeat this formula to obtain the expected value for each
cell in the table:
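The expected-value formula can be written as a small Python function. The sketch below applies it only to the one cell whose margins are quoted in the text (row sum 230, column sum 250, table sum 500). The function name `expected_count` is hypothetical.

```python
def expected_count(row_sum, column_sum, table_sum):
    """Expected cell count under independence: (row sum * column sum) / table sum."""
    return row_sum * column_sum / table_sum


# Male Republicans: margins quoted in the text.
male_republican_expected = expected_count(row_sum=230, column_sum=250, table_sum=500)
print(male_republican_expected)  # 115.0
```

Applying the same function to every combination of row and column sums fills in the full table of expected values.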

Step 3: Calculate (O − E)² / E for each cell in the table.

Next we will calculate (O − E)² / E for each cell in the table
where:

• O: observed value
• E: expected value

For example, Male Republicans would have a value of:
(120 − 115)² / 115 = 0.2174.

We can repeat this formula for each cell in the table:
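As a quick check of the per-cell calculation, here is a Python sketch applied to the Male Republicans cell, the only cell whose observed and expected values both appear in the text. The function name `cell_contribution` is hypothetical.

```python
def cell_contribution(observed, expected):
    """Contribution of a single cell to the test statistic: (O - E)^2 / E."""
    return (observed - expected) ** 2 / expected


# Male Republicans: observed 120, expected 115.
value = cell_contribution(observed=120, expected=115)
print(round(value, 4))  # 0.2174
```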

Step 4: Calculate the test statistic X² and the corresponding p-value.

The test statistic X² is simply the sum of the values in the last
table.

The p-value that corresponds to the test statistic X² can be found
by using the formula:

=CHISQ.DIST.RT(x, deg_freedom)

where:

• x: test statistic X²
• deg_freedom: degrees of freedom, calculated as
  (#rows − 1) * (#columns − 1)

The test statistic X² turns out to be 0.8640 and the corresponding
p-value is 0.649198.
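The reported p-value can be checked approximately in Python. The sketch below assumes 2 degrees of freedom. That value is not stated explicitly in the text, but it is consistent with the reported figures and would correspond to a table with two categories for one variable and three for the other. With exactly 2 degrees of freedom the right-tail probability reduces to exp(−x/2). The function name is hypothetical.

```python
import math


def chi2_right_tail_df2(x):
    """P(X > x) for a Chi-Square variable with exactly 2 degrees of freedom."""
    return math.exp(-x / 2)


# ASSUMPTION: df = 2 (inferred from the reported p-value, not stated in the text).
statistic = 0.8640  # rounded test statistic quoted in the text
p_value = chi2_right_tail_df2(statistic)

print(f"p-value = {p_value:.4f}")
```

Because the statistic is entered here rounded to four decimals, the result may differ from 0.649198 in the fifth decimal place.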
Step 5: Draw a conclusion.

Since this p-value is not less than 0.05, we fail to reject the null
hypothesis. This means we do not have sufficient evidence to say
that there is an association between gender and political party
preference.
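The five steps can be strung together in one short Python sketch. Important caveat: the survey table is not reproduced in this text, so the counts below are a hypothetical table chosen to agree with the figures that are quoted (500 voters, a row sum of 230, a column sum of 250, 120 Male Republicans, and X² ≈ 0.8640). They should not be read as the original data. The function name `independence_statistic` is also an assumption.

```python
import math

# ASSUMPTION: hypothetical counts consistent with the quoted margins and statistic.
# Rows are parties and columns are genders, matching "row sum 230, column sum 250".
table = {
    "Republican":  {"Male": 120, "Female": 110},
    "Democrat":    {"Male": 90,  "Female": 95},
    "Independent": {"Male": 40,  "Female": 45},
}


def independence_statistic(table):
    """Return (X^2, degrees of freedom) for a table given as nested dicts."""
    rows = list(table)
    cols = list(next(iter(table.values())))
    row_sums = {r: sum(table[r].values()) for r in rows}
    col_sums = {c: sum(table[r][c] for r in rows) for c in cols}
    total = sum(row_sums.values())

    statistic = 0.0
    for r in rows:
        for c in cols:
            expected = row_sums[r] * col_sums[c] / total   # Step 2
            statistic += (table[r][c] - expected) ** 2 / expected  # Steps 3 and 4
    df = (len(rows) - 1) * (len(cols) - 1)
    return statistic, df


statistic, df = independence_statistic(table)

# The shortcut exp(-x/2) is only valid for df == 2, which holds for this table.
if df != 2:
    raise ValueError("closed-form p-value below assumes df == 2")
p_value = math.exp(-statistic / 2)

alpha = 0.05  # significance level used in the text
decision = "reject H0" if p_value < alpha else "fail to reject H0"
print(f"X^2 = {statistic:.4f}, df = {df}, p-value = {p_value:.6f}: {decision}")
```

For tables of other shapes, the degrees of freedom change and the exp(−x/2) shortcut no longer applies.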

Note: You can also perform this entire test by using the Chi-Square
Test of Independence Calculator.

Additional Resources
The following calculators allow you to perform
both types of Chi-Square tests for free online:
Chi-Square Goodness of Fit Test Calculator
Chi-Square Test of Independence Calculator
