Cap 15 Doane
15-1 Copyright ©2019 McGraw-Hill Education. All rights reserved. No reproduction or distribution without the prior written consent of McGraw-Hill Education.
Chapter 15
Chi-Square Tests
Chapter Contents
15.1 Chi-Square Test for Independence
15.2 Chi-Square Tests for Goodness-of-Fit
15.3 Uniform Goodness-of-Fit Test
15.4 Poisson Goodness-of-Fit Test
15.5 Normal Chi-Square Goodness-of-Fit Test
15.6 ECDF Tests (Optional)
15.1 Chi-Square Test for Independence
LO15-1: Recognize a contingency table and understand
how it is created.
Contingency Tables
• A contingency table is a cross-tabulation of n paired observations into
categories.
• Each cell shows the count of observations that fall into the
category defined by its row and column heading as shown in Table 15.2.
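The cross-tabulation described above can be sketched in a few lines of Python. This is only an illustration: the function name `cross_tabulate`, the category labels, and the eight paired observations are hypothetical and do not come from Table 15.2.

```python
from collections import Counter

# Hypothetical paired observations: (row category, column category).
pairs = [
    ("Under 30", "Cash"), ("Under 30", "Card"), ("Under 30", "Card"),
    ("30 and over", "Cash"), ("30 and over", "Cash"), ("30 and over", "Card"),
    ("Under 30", "Card"), ("30 and over", "Cash"),
]

def cross_tabulate(pairs):
    """Return (row_labels, col_labels, table), where table[j][k] is a cell count."""
    counts = Counter(pairs)                      # count each (row, column) pair
    row_labels = sorted({r for r, _ in pairs})   # distinct row categories
    col_labels = sorted({c for _, c in pairs})   # distinct column categories
    table = [[counts[(r, c)] for c in col_labels] for r in row_labels]
    return row_labels, col_labels, table

row_labels, col_labels, table = cross_tabulate(pairs)

# Print the table with its column headings.
print(" " * 12, *(f"{c:>6}" for c in col_labels))
for label, row in zip(row_labels, table):
    print(f"{label:<12}", *(f"{count:>6}" for count in row))
```

Each of the n pairs lands in exactly one cell, so the cell counts always sum to n.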
Chi-Square Test
• In a test of independence for an r × c contingency table, the
hypotheses are
H0: Variable A is independent of variable B
H1: Variable A is not independent of variable B
• Use the chi-square test for independence to test these hypotheses.
• This non-parametric test is based on frequencies.
• The n data pairs are classified into c columns and r rows, and then the
observed frequency f_jk is compared with the expected frequency e_jk
under the assumption of independence.
Chi-Square Distribution
• The critical value comes from the chi-square probability distribution
with (r – 1)(c – 1) degrees of freedom.
df = degrees of freedom = (r – 1)(c – 1)
LO15-2: Find degrees of freedom and use the chi-square
table of critical values.
• Consider the shape of the chi-square distribution. As the degrees
of freedom increase, the shape begins to resemble a normal,
bell-shaped curve.
• However, for any contingency table you are likely to encounter,
degrees of freedom will not be large enough to assume normality.
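One way to put numbers on both bullets is the skewness of the chi-square distribution, which equals sqrt(8/df). This is a standard result that the slide does not state. The sketch below is only illustrative, and the table sizes are hypothetical.

```python
import math

def chi_square_skewness(df):
    """Skewness of a chi-square distribution with df degrees of freedom."""
    return math.sqrt(8.0 / df)

# Degrees of freedom (r - 1)(c - 1) for a few hypothetical table sizes.
for r, c in [(2, 2), (3, 3), (4, 5), (11, 11)]:
    df = (r - 1) * (c - 1)
    print(f"{r} x {c} table: df = {df:3d}, skewness = {chi_square_skewness(df):.3f}")
```

A normal distribution has skewness 0, so the chi-square curve for a small table is still noticeably right-skewed.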
Expected Frequencies
Assuming that H0 is true, the expected frequency of row j and
column k is:
$e_{jk} = \dfrac{R_j C_k}{n}$
where
R_j = total for row j (j = 1, 2, …, r)
C_k = total for column k (k = 1, 2, …, c)
n = sample size
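As a minimal sketch, the expected frequency for each cell is its row total times its column total, divided by n. The function name `expected_frequencies` and the 2 × 2 counts are hypothetical.

```python
def expected_frequencies(observed):
    """Expected cell counts under independence: (row total * column total) / n."""
    row_totals = [sum(row) for row in observed]        # R_j
    col_totals = [sum(col) for col in zip(*observed)]  # C_k
    n = sum(row_totals)                                # sample size
    return [[rj * ck / n for ck in col_totals] for rj in row_totals]

# Hypothetical observed counts for a 2 x 2 table.
observed = [[10, 20],
            [30, 40]]
expected = expected_frequencies(observed)

for row in expected:
    print([round(e, 2) for e in row])
```

The expected frequencies have the same row and column totals as the observed table, which is a quick check on the arithmetic.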
LO15-3: Perform a chi-square test for independence
on a contingency table.
Steps in Testing the Hypotheses (continued)
Step 3: Calculate the Test Statistic
• The chi-square test statistic is
$\chi^2_{\text{calc}} = \sum_{j=1}^{r} \sum_{k=1}^{c} \dfrac{(f_{jk} - e_{jk})^2}{e_{jk}}$
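The calculation can be sketched as follows: for each cell, square the gap between the observed and expected frequency, divide by the expected frequency, and sum over all cells. The function name `chi_square_independence` and the counts are hypothetical. The critical value 3.841 is the standard table value for df = 1 at α = .05.

```python
def chi_square_independence(observed):
    """Return (chi-square statistic, degrees of freedom) for an r x c table."""
    row_totals = [sum(row) for row in observed]
    col_totals = [sum(col) for col in zip(*observed)]
    n = sum(row_totals)
    chi2 = 0.0
    for j, row in enumerate(observed):
        for k, f_jk in enumerate(row):
            e_jk = row_totals[j] * col_totals[k] / n   # expected under H0
            chi2 += (f_jk - e_jk) ** 2 / e_jk
    df = (len(row_totals) - 1) * (len(col_totals) - 1)
    return chi2, df

# Hypothetical observed counts for a 2 x 2 table.
observed = [[10, 20],
            [30, 40]]
chi2, df = chi_square_independence(observed)

critical_value = 3.841  # table value for df = 1, alpha = .05
reject_h0 = chi2 > critical_value
print(f"chi-square = {chi2:.4f}, df = {df}, reject H0: {reject_h0}")
```

For tables with other dimensions, the critical value must be looked up for the matching degrees of freedom.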
15.2 Chi-Square Tests for Goodness-of-Fit
LO15-4: Perform a goodness-of-fit (GOF) test for a
multinomial distribution.
Purpose of the Test
• The goodness-of-fit (GOF) test helps you decide whether your
sample resembles a particular kind of population.
• The chi-square test will be used because it is versatile and easy
to understand.
Test Statistic and Degrees of Freedom for GOF
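• Assuming n observations, the observations are grouped into c
classes and then the chi-square test statistic is found using:
$\chi^2_{\text{calc}} = \sum_{j=1}^{c} \dfrac{(f_j - e_j)^2}{e_j}$
where f_j is the observed frequency in class j and e_j is the expected
frequency in class j.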
• If the proposed distribution gives a good fit to the sample, the test
statistic will be near zero.
• The test statistic follows the chi-square distribution with
df = c – m – 1 degrees of freedom, where c is the number of classes
(bins) used in the test and m is the number of parameters estimated.
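A minimal sketch of the GOF statistic and its degrees of freedom is shown below. It assumes a multinomial null hypothesis with fully specified proportions, so m = 0. The function name `gof_chi_square`, the proportions, and the counts are hypothetical.

```python
def gof_chi_square(observed, expected, m=0):
    """Return (chi-square statistic, df) for c classes with m estimated parameters."""
    if len(observed) != len(expected):
        raise ValueError("observed and expected must have the same number of classes")
    chi2 = sum((f - e) ** 2 / e for f, e in zip(observed, expected))
    df = len(observed) - m - 1   # df = c - m - 1
    return chi2, df

# Hypothetical example: three classes with hypothesized proportions .2, .3, .5.
observed = [18, 22, 60]
proportions = [0.2, 0.3, 0.5]
n = sum(observed)
expected = [n * p for p in proportions]   # expected frequency in each class

chi2, df = gof_chi_square(observed, expected, m=0)
print(f"chi-square = {chi2:.4f}, df = {df}")
```

If parameters had been estimated from the sample, `m` would be set to the number estimated, which lowers the degrees of freedom.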
Data-Generating Situations
• Instead of “fishing” for a good-fitting model, visualize a priori the
characteristics of the underlying data-generating process.
• The most common GOF test is probably for the normal distribution,
simply because so many parametric tests assume normality, and that
assumption must be tested. Also, the normal distribution may be used
as a default benchmark for any mound-shaped data that have centrality
and tapering tails, as long as you have reason to believe that a
constant mean and variance would be reasonable.
• However, you would not consider a Poisson distribution for
continuous data or certain integer variables because a Poisson
model only applies to integer data on arrivals or rare, independent
events.
• We remind you of this because software makes it possible to fit
inappropriate distributions all too easily.
Mixtures: A Problem
• Your sample may not resemble any known distribution. One common
problem is mixtures.
• A mixture occurs when a sample is created by more than one
data-generating process, one superimposed on another.
• For example, adult heights of either sex would follow a normal distribution,
but a combined sample of both sexes may be bimodal, and its mean and
standard deviation may be unrepresentative of either sex.
• Obtaining a good fit is not sufficient justification for assuming a particular
model. Each probability distribution has its own logic about the nature of
the underlying process, so we also must examine the data-generating
situation and be convinced that the proposed model is both
logical and empirically apt.
Eyeball Tests
• A simple “eyeball” inspection of the histogram or dot plot may suffice
to rule out a hypothesized population.
• For example, if the sample is strongly bimodal or skewed, or if
outliers are present, we would anticipate a poor fit to a normal
distribution. The shape of the histogram can give you a rough idea
whether a normal distribution is a likely candidate for a good fit.
• You can be fairly sure that a formal test will agree with what your
common sense tells you, as long as the sample size is not too small.
• Yet a limitation of eyeball tests is that we may be unsure just how
much variation is expected for a given sample size. If anything, the
human eye is overly sensitive, causing us to commit α error
(rejecting a true null hypothesis) too often.
• People are sometimes unduly impressed by a small departure from
the hypothesized distribution, when actually it is within chance.
15.3 Uniform Goodness-of-Fit Test
LO15-5: Perform a goodness-of-fit (GOF) test for a
uniform distribution.
Uniform Distribution
• The uniform goodness-of-fit test is a special case of the multinomial
in which every value has the same chance of occurrence.
• The chi-square test for a uniform distribution compares all c groups
simultaneously.
• The hypotheses are:
H0: π_1 = π_2 = … = π_c = 1/c
H1: Not all π_j are equal
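Under H0 every class has the same expected frequency, n/c, so the uniform test can be sketched very compactly. The function name `uniform_gof` and the five counts are hypothetical. No parameters are estimated here, so df = c – 1.

```python
def uniform_gof(observed):
    """Chi-square GOF test of equal class probabilities; returns (chi2, df)."""
    c = len(observed)            # number of classes
    n = sum(observed)            # sample size
    e = n / c                    # same expected frequency in every class
    chi2 = sum((f - e) ** 2 / e for f in observed)
    df = c - 1                   # no parameters estimated (m = 0)
    return chi2, df

# Hypothetical counts in five classes.
observed = [22, 18, 25, 15, 20]
chi2, df = uniform_gof(observed)
print(f"chi-square = {chi2:.2f}, df = {df}")
```

All c classes enter one statistic, which is what it means to compare the groups simultaneously.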
• If the data are not skewed and the sample size is large (n > 30),
then the mean is approximately normally distributed.
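• So, test the hypothesized uniform mean with a z statistic. For a uniform
distribution on [a, b], the mean is (a + b)/2 and the variance is
(b – a)²/12, so
$z = \dfrac{\bar{x} - (a+b)/2}{\sqrt{\dfrac{(b-a)^2/12}{n}}}$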
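The test of the uniform mean can be sketched as a z statistic. The sketch assumes a continuous uniform distribution on [a, b], whose mean is (a + b)/2 and whose variance is (b – a)²/12. The function name `uniform_mean_z` and the sample values are hypothetical.

```python
import math

def uniform_mean_z(sample, a, b):
    """z statistic comparing the sample mean with the mean of a uniform on [a, b]."""
    n = len(sample)
    xbar = sum(sample) / n
    mu = (a + b) / 2                       # hypothesized uniform mean
    sigma = math.sqrt((b - a) ** 2 / 12)   # hypothesized uniform std. deviation
    return (xbar - mu) / (sigma / math.sqrt(n))

# Hypothetical sample of n = 36 values spread evenly over [0, 1].
sample = [(i + 0.5) / 36 for i in range(36)]
z = uniform_mean_z(sample, a=0.0, b=1.0)
print(f"z = {z:.4f}")
```

This checks only the mean. A sample can have the right mean and still be far from uniform, so the test complements the chi-square GOF test rather than replacing it.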
15.4 Poisson Goodness-of-Fit Test
LO15-6: Explain the GOF test for a Poisson distribution.
15.5 Normal Chi-Square Goodness-of-Fit Test
LO15-7: Explain the chi-square GOF test for normality.
• Advantage is a standardized scale.
• Disadvantage is that data are no longer in the original units.
• Step 3: Find the normal area within each bin assuming a normal
distribution.
• Step 4: Find expected frequencies e_j by multiplying each normal area by
the sample size n.
• Classes may need to be collapsed from the ends inward to enlarge
expected frequencies.
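Steps 3 and 4 and the collapsing of end classes can be sketched as below. The bin limits, the mean of 50, the standard deviation of 10, and n = 100 are hypothetical. The minimum expected frequency of 5 used for collapsing is a common rule of thumb assumed here, not a value given on the slide.

```python
import math

def normal_cdf(x, mu, sigma):
    """P(X <= x) for a normal distribution with mean mu and std. deviation sigma."""
    return 0.5 * (1.0 + math.erf((x - mu) / (sigma * math.sqrt(2.0))))

def normal_expected_frequencies(upper_limits, mu, sigma, n):
    """Expected counts per bin; first bin is open below, last bin is open above."""
    cdf = [0.0] + [normal_cdf(u, mu, sigma) for u in upper_limits] + [1.0]
    areas = [hi - lo for lo, hi in zip(cdf, cdf[1:])]   # Step 3: area in each bin
    return [n * area for area in areas]                 # Step 4: area times n

def collapse_ends(expected, minimum=5.0):
    """Merge end classes inward until both end classes reach the minimum."""
    e = list(expected)
    while len(e) > 2 and e[0] < minimum:
        e[1] += e[0]
        del e[0]
    while len(e) > 2 and e[-1] < minimum:
        e[-2] += e[-1]
        del e[-1]
    return e

# Hypothetical setup: bins end at 30, 40, 50, 60, 70; the last bin is open-ended.
expected = normal_expected_frequencies([30, 40, 50, 60, 70], mu=50.0, sigma=10.0, n=100)
collapsed = collapse_ends(expected)
print([round(e, 2) for e in expected])
print([round(e, 2) for e in collapsed])
```

The observed frequencies must be collapsed in the same way before the statistic is computed. If the mean and standard deviation are estimated from the sample, then m = 2 in df = c – m – 1, where c is the number of classes after collapsing.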
Histograms
• The fitted normal histogram gives visual clues as to the likely
outcome of the GOF test.
• Histograms reveal any outliers or other non-normality issues.
• Further tests are needed because a histogram's appearance varies with the
choice of bins.
15.6 ECDF Tests (Optional)
LO15-8: Interpret ECDF tests and know their advantages
compared to chi-square GOF tests.
• Another such test is the Kolmogorov-Smirnov (K-S) test, which uses the
largest absolute difference between the actual and expected cumulative
relative frequency of the n data values.
• The K-S test assumes that no parameters are estimated. If parameters are
estimated, use a Lilliefors test whose test statistic is the same but with a
different table of critical values. Both tests are done by computer.
• The K-S test can be illustrated in the same probability plot as the A-D test
as shown in Figure 15.15 (see the next slide).
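The K-S statistic can be sketched by comparing the empirical cumulative relative frequency with the hypothesized CDF just before and just at each sorted data value, and keeping the largest gap. The sketch assumes a fully specified normal distribution (mean 50, standard deviation 10), consistent with the requirement that no parameters are estimated. The function names and the data values are hypothetical.

```python
import math

def normal_cdf(x, mu, sigma):
    """P(X <= x) for a normal distribution with mean mu and std. deviation sigma."""
    return 0.5 * (1.0 + math.erf((x - mu) / (sigma * math.sqrt(2.0))))

def ks_statistic(data, cdf):
    """Largest absolute gap between the ECDF of data and the hypothesized cdf."""
    xs = sorted(data)
    n = len(xs)
    d = 0.0
    for i, x in enumerate(xs, start=1):
        f = cdf(x)
        # The ECDF jumps from (i - 1)/n to i/n at x, so check both sides of the jump.
        d = max(d, i / n - f, f - (i - 1) / n)
    return d

# Hypothetical sample tested against a fully specified N(50, 10).
data = [38.2, 41.5, 44.9, 47.3, 49.8, 51.0, 53.6, 56.1, 59.4, 63.7]
d = ks_statistic(data, lambda x: normal_cdf(x, 50.0, 10.0))
print(f"K-S statistic D = {d:.4f}")
```

The sketch returns only the statistic. Judging significance still requires the K-S table of critical values, or the Lilliefors table when parameters are estimated.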