The Chi Square Test and The Analysis of Count Data
Introduction
Example 1: Mode of Transportation
Mode      Observed   Expected
train     90         100
car       50         50
others    20         25
Total     500        500
The data in this table is count data – that is, the number of commuters in each mode
of transportation has been counted and tabulated. The hypothesis of interest is that
5% level of significance.
classes in which qualitative data are classified into more than two categories. Such
data are usually referred to as count data, or enumerative data. Many practical
binomial experiment.
The characteristics of the multinomial experiment are rarely satisfied exactly for
modes of transportation and the percentage of the commuters to each of the five
modes of transportation from the previous study, at 5% level of significance. That is,
we will test the null hypothesis that the observed distribution of preferred modes of
Mode      O     E     O − E   (O − E)²   (O − E)²/E
car       50    50    0       0          0
others    20    25    −5      25         1
Note that the farther the observed values are from their expected values, the larger
χ² will become. That is, large values of χ² are evidence against the null hypothesis. We
have to know the sampling distribution of χ² in order to decide whether the data indicate
a deviation from the previous study. If the null hypothesis is true, the distribution of
the test statistic in repeated sampling is approximately a χ² distribution. The
approximation for the sampling distribution of χ² is adequate as long as the expected
number of observations in each of the k categories is at least 5. The χ² distribution is
rejection region for the test will be located in the upper tail of the χ² distribution.
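The link between the observed and expected counts and the upper-tail rejection region can be sketched in Python. This is a minimal illustration under stated assumptions, not the handout's own procedure. The function names are hypothetical, and only the three rows that survive in the table (train, car, others) are used for the per-category terms. The closed-form tail probability is valid only for an even number of degrees of freedom. The value 9.488 is the tabulated upper 5% point for 4 degrees of freedom.

```python
import math

def chi_square_terms(observed, expected):
    """Per-category contributions (O - E)^2 / E to the chi-square statistic."""
    return [(o - e) ** 2 / e for o, e in zip(observed, expected)]

def upper_tail_even_df(x, df):
    """P(chi-square with df degrees of freedom > x); closed form for even df only."""
    if df <= 0 or df % 2 != 0:
        raise ValueError("this closed form requires a positive even df")
    half = x / 2.0
    return math.exp(-half) * sum(half ** j / math.factorial(j) for j in range(df // 2))

# Rows recoverable from the table: train, car, others (observed vs. expected).
terms = chi_square_terms([90, 50, 20], [100, 50, 25])
print(terms)  # [1.0, 0.0, 1.0]

# Five categories give k - 1 = 4 degrees of freedom.
tail_at_table_value = upper_tail_even_df(9.488, 4)
print(round(tail_at_table_value, 3))  # 0.05
```

A statistic smaller than 9.488 leaves more than 5% of the distribution in the upper tail, so it falls outside the rejection region.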
Since the computed χ² = 2.7 does not exceed the table value of 9.488 (4 degrees of
freedom), we fail to reject the null hypothesis. That is, at the 5% level of
significance, the data do not provide
previous study. The following are the assumptions that must be met for a
The 2023 SEA Games final medal counts for the top five nations are shown below. At
the 0.10 level of significance, can it be concluded that the type of medal won was
dependent upon the competing country?
O E O E O E
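The expected (E) counts in a test of independence come from the row and column totals, and the degrees of freedom from the table's shape. The following is a minimal sketch with hypothetical function names. The medal counts themselves are not reproduced here, so the small 2 x 3 table in the usage lines is made up purely for illustration and is not the SEA Games data.

```python
def expected_counts(table):
    """Expected count per cell: row total * column total / grand total."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    grand_total = sum(row_totals)
    return [[r * c / grand_total for c in col_totals] for r in row_totals]

def chi_square_independence(table):
    """Return (statistic, degrees of freedom) for an r x c table of counts."""
    expected = expected_counts(table)
    statistic = sum(
        (o - e) ** 2 / e
        for obs_row, exp_row in zip(table, expected)
        for o, e in zip(obs_row, exp_row)
    )
    df = (len(table) - 1) * (len(table[0]) - 1)
    return statistic, df

# Hypothetical counts for illustration only (NOT the SEA Games medal table).
toy = [[10, 20, 30], [20, 20, 20]]
stat, df = chi_square_independence(toy)
print(expected_counts(toy)[0])  # [15.0, 20.0, 25.0]
print(df)                       # 2
```

For five nations and three medal types, the same rule gives (5 − 1)(3 − 1) = 8 degrees of freedom.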
Since the computed χ² = 26.50 exceeds the table value 13.362, we reject the null
hypothesis. That is, at the 10% level of significance, the data provide sufficient
evidence to conclude that the type of medal won was dependent upon the
competing country. The following are the assumptions that must be met for a
1. The litter size of Bengal tigers is typically two or three cubs, but it can vary
between one and four. Based on long-term observations, the litter size of
Bengal tigers in the wild has the distribution given in the table provided. A
zoologist believes that Bengal tigers in captivity tend to have different
(possibly smaller) litter sizes from those in the wild. To verify this belief, the
zoologist searched all data sources and found 316 litter size records of Bengal
tigers in captivity. The results are given in the table provided. Test, at the 5%
level of significance, whether there is sufficient evidence in the data to
conclude that the distribution of litter sizes in captivity differs from that in the
wild. [https://saylordotorg.github.io/text_introductory-statistics/s15-02-chi-square-one-sample-goodness.html]
Litter size   Wild litter distribution   Number of litters in captivity
1             0.11                       41
2             0.69                       243
3             0.18                       27
4             0.02                       5
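One way to set up this exercise is to turn the wild proportions into expected counts for the 316 captive litters and then sum the (O − E)²/E terms. The sketch below uses only the numbers in the table; the variable names are hypothetical. The result still has to be compared with the table value for k − 1 = 3 degrees of freedom at the 5% level.

```python
# Wild litter-size distribution (sizes 1 to 4) and captive litter counts.
wild_proportions = [0.11, 0.69, 0.18, 0.02]
captive_counts = [41, 243, 27, 5]

n = sum(captive_counts)                       # total number of captive litters
expected = [n * p for p in wild_proportions]  # expected counts under the null hypothesis

# The chi-square approximation needs every expected count to be at least 5.
approximation_ok = min(expected) >= 5

statistic = sum((o - e) ** 2 / e for o, e in zip(captive_counts, expected))
df = len(captive_counts) - 1

print(n, df, approximation_ok)
print(round(statistic, 2))
```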
Decision
The test statistic (χ² = 41.0372) is greater than the
critical value (9.210). Moreover, the p-value is less than the
level of significance.
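The critical value 9.210 coincides with the upper 1% point of a χ² distribution with 2 degrees of freedom. The text here does not state the degrees of freedom or the significance level, so both are assumptions in the sketch below, and the function names are hypothetical. Under that assumption the upper-tail probability has the simple closed form exp(−x/2).

```python
import math

def upper_tail_df2(x):
    """P(chi-square with 2 degrees of freedom > x) = exp(-x / 2)."""
    return math.exp(-x / 2.0)

def critical_value_df2(alpha):
    """Upper-alpha point of chi-square with 2 degrees of freedom."""
    return -2.0 * math.log(alpha)

alpha = 0.01  # assumed level of significance (not stated in the text)
critical = critical_value_df2(alpha)
p_value = upper_tail_df2(41.0372)

print(round(critical, 3))  # 9.21
print(p_value < alpha)     # True
```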