Lecture 4.CMH Test
We are often interested only in investigating the relationship between two binary variables (e.g.,
a disease and an exposure); however, we have to control for confounders.
A confounding variable is a variable that may be associated with the disease, the exposure,
or both. For example, a case–control study was undertaken to investigate the relationship
between lung cancer and employment in shipyards during World War II among male residents of
coastal Georgia. In this case, smoking is a confounder; it has been found to be associated with
lung cancer and it may be associated with employment because construction workers are likely
to be smokers. Specifically, we want to know:
Among smokers, whether or not shipbuilding and lung cancer are related
Among nonsmokers, whether or not shipbuilding and lung cancer are related
The underlying question concerns conditional independence between lung
cancer and shipbuilding; however, we do not want to reach separate conclusions, one at each
level of smoking. Assuming that the confounder, smoking, is not an effect modifier (i.e.,
smoking does not alter the relationship between lung cancer and shipbuilding), we want to pool
data for a combined decision. When both the disease and the exposure are binary, a popular
method to achieve this task is the Mantel–Haenszel method. The process can be summarized as
follows:
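For each 2×2 table (one per level of the confounder), let a, b, c, and d denote the four cell
frequencies, r1 and r2 the row totals, c1 and c2 the column totals, and n the table total.
Under the null hypothesis and with fixed marginal totals, the cell (1,1) frequency a has
mean and variance

$$E(a) = \frac{r_1 c_1}{n} \quad \text{and} \quad \mathrm{Var}(a) = \frac{r_1 r_2 c_1 c_2}{n^2 (n-1)};$$

therefore, the Mantel–Haenszel test is based on the Z statistic

$$Z = \frac{\sum a - \sum \dfrac{r_1 c_1}{n}}{\sqrt{\sum \dfrac{r_1 r_2 c_1 c_2}{n^2 (n-1)}}},$$

where each sum is taken over the levels of the confounder.
When the test above is statistically significant, the association between the disease and the
exposure is taken to be real. Since we assume that the confounder is not an effect modifier, the
odds ratio is constant across its levels. The odds ratio at each level is estimated by ad/bc; the
Mantel–Haenszel procedure pools data across levels of the confounder to obtain a combined estimate:

$$OR_{MH} = \frac{\sum \dfrac{ad}{n}}{\sum \dfrac{bc}{n}}$$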
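As a rough illustration of the procedure, the following Python sketch accumulates the Mantel–Haenszel sums over a list of 2×2 tables and returns the Z statistic and the pooled odds ratio. It assumes each table is supplied as cell counts (a, b, c, d) with row totals r1 = a + b, r2 = c + d and column totals c1 = a + c, c2 = b + d. The function name `mantel_haenszel` and the two strata are hypothetical; they are not data from this lecture.

```python
from math import sqrt


def mantel_haenszel(tables):
    """Return (Z, pooled odds ratio) for an iterable of (a, b, c, d) tables."""
    sum_a = sum_expected = sum_var = sum_ad = sum_bc = 0.0
    for a, b, c, d in tables:
        n = a + b + c + d
        r1, r2 = a + b, c + d          # row totals
        c1, c2 = a + c, b + d          # column totals
        sum_a += a
        sum_expected += r1 * c1 / n                      # E(a) for this stratum
        sum_var += r1 * r2 * c1 * c2 / (n**2 * (n - 1))  # Var(a) for this stratum
        sum_ad += a * d / n
        sum_bc += b * c / n
    z = (sum_a - sum_expected) / sqrt(sum_var)
    or_mh = sum_ad / sum_bc
    return z, or_mh


# Hypothetical strata, for illustration only (not from the lecture's tables).
strata = [(10, 20, 15, 55), (30, 20, 25, 25)]
z, or_mh = mantel_haenszel(strata)
print(f"Z = {z:.3f}, pooled odds ratio = {or_mh:.3f}")
```

Accumulating the sums first and dividing once at the end follows the formulas directly: the strata are pooled before the test statistic is formed, rather than averaging per-stratum z scores.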
Example 1:
A case–control study was conducted to identify reasons for the exceptionally high rate of lung
cancer among male residents of coastal Georgia. The primary risk factor under investigation was
employment in shipyards during World War II, and data are tabulated separately in Table I for
three levels of smoking. There are three 2×2 tables, one for each level of smoking; in Example 2,
the last two tables were combined and presented together for simplicity.
For the first table, $a = 11$ and $\dfrac{r_1 c_1}{n} = \dfrac{46(61)}{299} = 9.38$.
The process is repeated for each of the other two smoking levels.
For the table with $a = 70$, $\dfrac{r_1 c_1}{n} = \dfrac{112(287)}{549} = 58.55$.
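The arithmetic for the expected cell (1,1) counts can be checked with a few lines of Python. The helper name `expected_count` is an illustrative choice; the row total, column total, and table total passed to it are the values quoted in the two calculations above.

```python
def expected_count(r1, c1, n):
    """Expected cell (1,1) frequency under the null hypothesis: r1 * c1 / n."""
    return r1 * c1 / n


# Totals quoted in the worked calculations above.
e_first = expected_count(46, 61, 299)     # table with a = 11
e_second = expected_count(112, 287, 549)  # table with a = 70

print(round(e_first, 2), round(e_second, 2))
# Observed minus expected, the per-table contribution to the numerator of Z.
print(round(11 - e_first, 2), round(70 - e_second, 2))
```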
These results are combined to obtain the z score:

$$Z = \frac{\sum a - \sum \dfrac{r_1 c_1}{n}}{\sqrt{\sum \dfrac{r_1 r_2 c_1 c_2}{n^2 (n-1)}}}$$
Exercise
A case–control study was conducted in Auckland, New Zealand, to investigate the effects of
alcohol consumption on both nonfatal myocardial infarction and coronary death in the 24 hours
after drinking, among regular drinkers. Data are tabulated separately for men and women in
the table below. For each group (men and women) and for each type of event (myocardial
infarction and coronary death), carry out a test comparing cases with controls. In each analysis,
state your null and alternative hypotheses and your choice of test size.
Totals
WOMEN NO 144 41 89 12
YES 122 19 76 4
Totals
Example 2
Since incidence rates of most cancers rise with age, age must always be considered a
confounder. Stratified data for an unmatched case–control study are shown in the Table below.
The disease was esophageal cancer among men, and the risk factor was alcohol consumption.
Use the Mantel–Haenszel procedure to compare the cases versus the controls. State your null
hypothesis and choice of test size.
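For Age 45-64 years, $a = 67$,

$$\frac{r_1 c_1}{n} = \frac{(122)(123)}{455} = 32.98, \qquad \frac{r_1 r_2 c_1 c_2}{n^2 (n-1)} = 17.65, \qquad \frac{ad}{n} = \frac{67(277)}{455} = 40.79,$$

and $bc/n$ is computed in the same way.
For Age 65+ years, $a = 24$,

$$\frac{r_1 c_1}{n} = \frac{68(42)}{215} = 13.28, \qquad \frac{r_1 r_2 c_1 c_2}{n^2 (n-1)} = 7.34.$$

These results are combined across the age groups to obtain the z score:

$$Z = \frac{\sum a - \sum \dfrac{r_1 c_1}{n}}{\sqrt{\sum \dfrac{r_1 r_2 c_1 c_2}{n^2 (n-1)}}}$$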
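The per-stratum quantities for the 45-64 age group can be reproduced from the totals quoted above. This is a sketch under one assumption: because the full table is not reproduced here, the off-diagonal cells b and c are inferred from the quoted margins (122 and 123), a = 67, and n = 455, taking 122 as the row total. The variable names are illustrative.

```python
# Quantities quoted in the worked example for ages 45-64.
a, n = 67, 455
r1, c1 = 122, 123  # assumption: first margin is the row total, second the column total

# Remaining cells and margins inferred from the quoted totals (not given in the text).
b = r1 - a
c = c1 - a
d = n - a - b - c
r2, c2 = n - r1, n - c1

expected = r1 * c1 / n                           # E(a) under the null hypothesis
variance = r1 * r2 * c1 * c2 / (n**2 * (n - 1))  # Var(a) with fixed margins
ad_over_n = a * d / n                            # numerator term of the pooled odds ratio
bc_over_n = b * c / n                            # denominator term of the pooled odds ratio

print(f"d = {d}")
print(f"E(a) = {expected:.2f}, Var(a) = {variance:.2f}")
print(f"ad/n = {ad_over_n:.2f}, bc/n = {bc_over_n:.2f}")
```

Swapping which margin is treated as the row total changes b and c individually but not their product, so none of the four quantities depends on that assumption. The pooled Z and odds ratio still require the same terms from every age group.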