Cohort Assignment
(100 Points)
Recommended Reading
Rothman, Greenland and Lash (Modern Epidemiology, 4th Edition page numbers in parentheses): 71-83,
298-300 (619-49)
Rothman, “Measuring Interaction” 198-211
Ward JB, Gartner DR, Keyes KM, Fliss MD, McClure ES, Robinson WR. How do we assess a racial
disparity in health? Distribution, interaction, and interpretation in epidemiological studies. Annals of
Epidemiology. 2019;29:1-7. doi:10.1016/j.annepidem.2018.09.007
Resources: Output covariance matrix and parameter estimates from PROC GENMOD (to estimate
ICR variance)
Introduction
For this assignment, you will conduct an analysis relevant to the second specific aim of our cohort
study, which is to determine whether the estimated association between early prenatal care and
preterm birth differs among major racial/ethnic populations that might be targeted for state funded
early prenatal care programs.
For the purposes of this assignment, race will be dichotomized as African American (AA, the index
category) or non-AA (the reference category, including all NC birth certificate race categories other
than AA combined) using a new variable, raceaa, created from race2:
Four risks (Rij) may be estimated for each combination of pnc5 and raceaa based on model 1:
pnc5  raceaa  Description             Rij  Estimated value  Interpretation
0     0       no early care, non-AA   R00  β0               doubly unexposed
1     0       had early care, non-AA  R10  β0 + β1          exposed to care only
0     1       no early care, AA       R01  β0 + β2          “exposed” to race only
1     1       had early care, AA      R11  β0 + β1 + β2     doubly exposed
Risk difference homogeneity = additive risks (when averaged across the population)
A linear risk model with two dichotomous covariates and no interaction terms forces the estimated risk
associated with exposure to both covariates to equal the sum of the risks associated with each
exposure alone (less the baseline risk), as shown for model 1 below (where pnc5 indicates early
prenatal care status and raceaa indicates race):
Risk(preterm) = β0 + β1(pnc5) + β2(raceaa) model 1
This model forces risk difference homogeneity, such that the estimated RD for early care vs. no early
care for non-AA births (RD10 vs. 00) is identical to RD for early care vs. no early care among African
Americans (RD11 vs. 01):
RD10 vs. 00 = (β0 + β1(1) + β2(0)) – (β0 + β1(0) + β2(0)) = β1
RD11 vs. 01 = (β0 + β1(1) + β2(1)) – (β0 + β1(0) + β2(1)) = β1
Similarly, the RD for AA vs. non-AA race is identical for births with and without early care:
RD01 vs. 00 = (β0 + β1(0) + β2(1)) – (β0 + β1(0) + β2(0)) = β2
RD11 vs. 10 = (β0 + β1(1) + β2(1)) – (β0 + β1(1) + β2(0)) = β2
In addition, the RD contrasting the 17-week risk of preterm birth for “jointly exposed” births (had
early care and AA) and “jointly unexposed” births (no early care, not AA) is equal to the sum of the
risk differences associated with each exposure alone:
RD11 vs. 00 = (β0 + β1(1) + β2(1)) – (β0 + β1(0) + β2(0)) = β1 + β2
R11 = β0 + β1 + β2
= (β0 + β1) + (β0 + β2) – β0
= R10 + R01 – R00
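The additivity that model 1 imposes can be checked numerically. The sketch below is illustrative only: it uses Python rather than SAS, and the coefficient values b0, b1 and b2 are hypothetical stand-ins for β0, β1 and β2, not results from the cohort data.

```python
# Hypothetical model 1 coefficients (illustrative only, not course results).
b0, b1, b2 = 0.12, -0.03, 0.05

def risk_model1(pnc5, raceaa):
    """Linear risk model with main effects only: b0 + b1*pnc5 + b2*raceaa."""
    return b0 + b1 * pnc5 + b2 * raceaa

R00, R10 = risk_model1(0, 0), risk_model1(1, 0)
R01, R11 = risk_model1(0, 1), risk_model1(1, 1)

rd_care_nonaa = R10 - R00        # RD10 vs. 00
rd_care_aa = R11 - R01           # RD11 vs. 01
expected_R11 = R10 + R01 - R00   # additive expectation for the jointly exposed

print(f"RD for early care among non-AA births: {rd_care_nonaa:.3f}")
print(f"RD for early care among AA births:     {rd_care_aa:.3f}")
print(f"R11 = {R11:.3f}; R10 + R01 - R00 = {expected_R11:.3f}")
```

Whatever values are chosen for the three coefficients, the two stratum-specific RDs agree and R11 matches the additive expectation, because the model has no term that could make them differ.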
This relation will always hold for model 1 because model 1 does not allow departure from additive
risks (i.e., it forces RD homogeneity). To assess whether or not the assumption of additive risks is
valid we must fit a second model that allows departures from additivity (i.e., RD heterogeneity), for
example:
Risk(preterm) = β0 + β1(pnc5) + β2(raceaa) + β3(pnc5*raceaa) model 2
In contrast with model 1, model 2 allows the estimated average risk in the “jointly exposed” (β0 + β1 +
β2 + β3) to be more or less than the sum of the average risks estimated in association with exposure to
each factor individually. In addition, it allows RD to depart from the homogeneity assumption:
RD10 vs. 00 = β1
RD11 vs. 01 = β1 + β3
Therefore, the coefficient for the model 2 product interaction term (β3) indicates the extent to which
observed average risks in the jointly exposed differ from the average risks expected assuming
additivity. In other words, it is an estimate of the extent to which the observed RD11 vs. 00 departs from
the RD11 vs. 00 expected under homogeneity.
Therefore, the expected value of R11 under the assumption of RD homogeneity can be written as
expected(R11) = R10 + R01 – R00
This relation also can be written in terms of RD with a common reference group (R00) and the
corresponding model 2 coefficients:
expected(R11 – R00) = (R10 – R00) + (R01 – R00) - (R00 – R00)
expected(RD11 vs. 00) = RD10 vs. 00 + RD01 vs. 00
= β1 + β2
Estimating RD modification using the Interaction Contrast (IC)
The interaction contrast (IC) is a measure of the difference between the observed R11 estimate and
the R11 expected under the assumption of additive risks:
IC = R11 - expected(R11)
= R11 – (R10 + R01 – R00)
= R11 + R00 - R10 - R01
where IC = 0 when the estimated risk associated with joint exposure is equal to the sum of the
estimated “independent” risks for each exposure when averaged across the population. The IC also
can be written in terms of RD:
IC = RD11 vs 00 - expected(RD11 vs 00)
= RD11 vs. 00 - (RD10 vs. 00 + RD01 vs. 00)
Rewriting the IC in terms of the coefficients from model 2 we can also see that:
IC = (β0 + β1 + β2 + β3) + (β0) - (β0 + β1) - (β0 + β2)
IC = β3
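As a quick numerical check of this identity, the sketch below builds the four risks from hypothetical model 2 coefficients and computes the IC directly from the risks. Python is used only for illustration, and the values of b0 through b3 are assumptions, not estimates from the assignment data.

```python
def interaction_contrast(R00, R10, R01, R11):
    """IC = R11 + R00 - R10 - R01 (observed minus additive-expected R11)."""
    return R11 + R00 - R10 - R01

# Hypothetical model 2 coefficients (illustrative only).
b0, b1, b2, b3 = 0.12, -0.03, 0.05, -0.02

R00 = b0
R10 = b0 + b1
R01 = b0 + b2
R11 = b0 + b1 + b2 + b3

ic = interaction_contrast(R00, R10, R01, R11)
rd_care_nonaa = R10 - R00   # equals b1
rd_care_aa = R11 - R01      # equals b1 + b3

print(f"IC = {ic:.3f} (product term coefficient = {b3:.3f})")
print(f"RD heterogeneity = {rd_care_aa - rd_care_nonaa:.3f}")
```

The IC and the difference between the two stratum-specific RDs are the same quantity, which is why the product term coefficient can be read as a measure of RD modification.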
Important point: The IC will be <0 or >0 when the average estimated risk associated with joint
exposure differs from the sum of the average risks for the “independent” exposures (i.e., the
independent risks are “non-additive”). IC = 0 when there is no RD modification, but IC = 0 does not
indicate the absence of causal interaction because we only observe “net” effects of synergism and
antagonism averaged across the population. Therefore, IC = 0 also can occur when the sum of
sub-additive risks among response types susceptible to antagonistic causal interactions is equivalent to
the sum of super-additive risks among response types that are susceptible to synergistic causal
interactions (i.e., the response types cancel each other out; see Rothman, Greenland & Lash for
additional information on this topic).
For a “main exposure” that is positively associated with the outcome (main effects RD >0):
IC < 0 indicates a net reduction in the positive association with joint exposure to the effect
modifier (antagonism).
IC > 0 indicates a net increase in the positive association with joint exposure to the effect
modifier (synergism).
For a “main exposure” that is inversely associated with the outcome (main effects RD <0):
IC < 0 indicates a net increase in the inverse association with joint exposure to the effect
modifier (synergism).
IC > 0 indicates a net reduction in the inverse association with joint exposure to the effect
modifier (antagonism).
For example, an estimated IC <0 for early prenatal care vs. no early care (an exposure that is
inversely associated with preterm birth) and AA vs. non-AA race/ethnicity (which is positively
associated with preterm birth) would suggest that the inverse (negative) association between early
prenatal care and preterm birth was stronger (i.e., more negative) among AA births than among non-AA
births. Conversely, an estimated IC <0 would also indicate that the positive association between AA
versus non-AA race/ethnicity and preterm birth was reduced (i.e., the RD was closer to 0 or possibly
less than zero) among births with early prenatal care versus no early care.
As noted previously:
expected(R11) = R10 + R01 – R00
The ICR is used to estimate the difference between the observed RR11 vs. 00 and the RR11 vs. 00 expected
under the assumption of RD homogeneity:
Estimated risks for each combination of pnc5 and raceaa
The expected R11 assuming additive “independent” risks
IC = R11 - expected(R11)
The interaction contrast (IC) based on estimated RD and interaction contrast ratio (ICR)
based on RR, and OR
B2. Use model 3 to estimate the following, and enter results where indicated:
RD and 95% CI for each combination of pnc5 and raceaa relative to the common referent
group of “jointly unexposed” births (no early care & non-AA) (table B1)
The interaction contrast (IC) and its 95% CI (table B1)
The expected RD11 vs. 00 assuming no RD modification (table B1, no CI)
RD and 95% CI for early care vs. no early care according to maternal race (table B2)
RD and 95% CI for AA vs. non-AA maternal race according to prenatal care (table B2)
The product interaction term coefficient, its 95% CI and its p-value (below table B2)
The LR test statistic and its p-value comparing model 3 to a reduced linear risk model with
pnc5 and raceaa only (i.e., model 1) (below table B2)
Take time to note the difference between the estimates you enter in Tables B1 and B2. In Table
B1, you have a common referent group: no early care/non-AA. For Table B2, you are estimating RDs
for preterm birth within strata of race and then within strata of care.
To perform a likelihood ratio test (LRT) by hand, subtract the log likelihood of the reduced model
(i.e., the model without pncXrace) from the log likelihood of the full model (i.e., the model with
pncXrace) and multiply the difference by two. The log likelihood appears on the first page of the
GENMOD output under “Criteria for Assessing Goodness of Fit”.
LRT = 2 * (LogLikelihoodFull – LogLikelihoodReduced)
The LRT statistic approximately follows a chi-square distribution with degrees of freedom equal to the
difference in the number of parameters estimated in the full versus the reduced model. Note: LRTs are
only valid for nested models (i.e., the “full” model includes every term in the “reduced” model).
For example, model 1 is nested in model 3.
You can calculate the P-value for the LRT in Excel or SAS.
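SAS: data lrt_log; *Note: the p-value is written to the SAS log and saved in dataset lrt_log, not the results window;
test_statistic = 3.84; degrees_freedom = 1; *hypothetical values: replace with your own LRT statistic and df;
p_log = 1 - probchi(test_statistic, degrees_freedom);
put p_log=; run;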
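For readers who want to check the arithmetic outside SAS or Excel, the hand calculation can be sketched in Python using only the standard library. The log likelihood values below are hypothetical, and chi2_sf is a helper written for this sketch (it relies on the closed-form series for integer degrees of freedom), not a library function.

```python
import math

def chi2_sf(x, df):
    """Upper-tail probability of a chi-square variable with integer df >= 1."""
    if x <= 0:
        return 1.0
    if df % 2 == 1:
        # Odd df: normal tail plus a finite series in sqrt(x).
        chi = math.sqrt(x)
        total, term = 0.0, chi
        for r in range(1, (df - 1) // 2 + 1):
            if r > 1:
                term *= x / (2 * r - 1)
            total += term
        normal_pdf = math.exp(-x / 2) / math.sqrt(2 * math.pi)
        return math.erfc(chi / math.sqrt(2)) + 2 * normal_pdf * total
    # Even df: exp(-x/2) times a finite series in x/2.
    total, term = 1.0, 1.0
    for r in range(1, (df - 2) // 2 + 1):
        term *= x / (2 * r)
        total += term
    return math.exp(-x / 2) * total

def likelihood_ratio_test(loglik_full, loglik_reduced, df):
    """Return (LRT statistic, p-value) for nested models."""
    stat = 2 * (loglik_full - loglik_reduced)
    return stat, chi2_sf(stat, df)

# Hypothetical log likelihoods (illustrative only); df = 1 for one added product term.
stat, p_value = likelihood_ratio_test(-1000.0, -1002.5, df=1)
print(f"LRT = {stat:.2f}, p = {p_value:.3g}")
```

The degrees of freedom equal the number of extra parameters in the full model, so comparing an interaction model for a four-category modifier with its main effects model would use df = 3 rather than 1.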
For model 3:
IC = (β0 + β1 + β2 + β3) + (β0) - (β0 + β1) - (β0 + β2)
= β3
B3. Use model 4 to estimate the following, and enter results where indicated:
RR for each combination of pnc5 and raceaa relative to the common referent group of “jointly
unexposed” births (no early care & non-AA) (table B1)
The interaction contrast ratio (ICR) (table B1)
The expected RR11 vs. 00 assuming no RD modification (table B1, no CI)
For model 4:
ICR = RR11 vs. 00 - expected(RR11 vs. 00)
= RR11 vs. 00 - (RR10 vs. 00 + RR01 vs. 00 – 1)
= RR11 vs. 00 - RR10 vs. 00 - RR01 vs. 00 + 1
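The ICR arithmetic can be sketched as a small function. This is an illustration in Python with hypothetical RR values; the function name icr_from_rr is introduced here and is not part of the assignment materials.

```python
def icr_from_rr(rr10, rr01, rr11):
    """Return (expected RR11 vs. 00 under additive risks, ICR).

    expected = RR10 vs. 00 + RR01 vs. 00 - 1
    ICR      = RR11 vs. 00 - expected
    """
    expected = rr10 + rr01 - 1
    return expected, rr11 - expected

# Hypothetical risk ratios relative to the jointly unexposed (illustrative only).
expected_rr11, icr = icr_from_rr(rr10=0.8, rr01=1.5, rr11=1.1)
print(f"expected RR11 vs. 00 = {expected_rr11:.3f}, ICR = {icr:.3f}")
```

With these made-up inputs the observed joint RR falls below the additive expectation, so the ICR is negative; an ICR of 0 would indicate no departure from additive risks.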
Here, you will use single referent models (sometimes referred to as “joint and separate effects”
models) to generate coefficients that directly estimate the “joint” effect measure (for the two
covariates in combination) and the “separate” effect measures (for each covariate alone) relative to a
common reference group (the “jointly unexposed”). In this section, pay attention to any similarities (or
differences) in estimates of interaction using the different approaches. Also, note that indicator term
models are generally used for the assessment of additive interaction using ICRs when risks cannot be
directly estimated (e.g., case-control studies).
To estimate corresponding lnRRs and lnORs use models 6 and 7, respectively (noting that the lnOdds
model is provided for completeness; you do not have to estimate the lnOdds model in this section):
C2. Use model 5 to estimate the following, and enter results where indicated:
RD and 95% CI for each combination of pnc5 and raceaa relative to the common referent
group of “jointly unexposed” births (no early care & non-AA) (table C1)
The interaction contrast (IC, no CI) (table C1)
The expected RD11 vs. 00 assuming no RD modification (table C1, no CI)
RD and 95% CI for early prenatal care vs. no early care according to maternal race (table C2)
RD and 95% CI for AA vs. non-AA maternal race according to prenatal care (table C2)
The LR test statistic and its p-value comparing model 5 to a reduced model with pnc5 and
raceaa only (i.e., model 1) (below table C2)
*For model 5:
IC = R11 – expected(R11)
= R11 – (R10 + R01 – R00)
= (β0 + β3) – ((β0 + β1) + (β0 + β2) - β0)
= β3 – (β1 + β2)
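The sketch below illustrates why the product term and indicator term parameterizations give the same IC. It assumes both models are saturated (no covariates other than pnc5 and raceaa), so each reproduces the same four risks; the risk values are hypothetical and Python is used only for illustration.

```python
# Hypothetical stratum-specific risks (illustrative only).
R00, R10, R01, R11 = 0.12, 0.09, 0.17, 0.12

# Product term parameterization (as in model 3): b3 is the product term coefficient.
product = {"b0": R00, "b1": R10 - R00, "b2": R01 - R00,
           "b3": R11 - R10 - R01 + R00}

# Indicator term parameterization (as in model 5): b3 is the coefficient
# for the jointly exposed indicator, i.e. RD11 vs. 00.
indicator = {"b0": R00, "b1": R10 - R00, "b2": R01 - R00,
             "b3": R11 - R00}

ic_product = product["b3"]
ic_indicator = indicator["b3"] - (indicator["b1"] + indicator["b2"])

print(f"IC from product term model:   {ic_product:.3f}")
print(f"IC from indicator term model: {ic_indicator:.3f}")
```

The two coefficients labelled b3 mean different things: in the product term model b3 is the IC itself, while in the indicator term model it is the joint RD from which the two separate RDs must be subtracted.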
C3. Use model 6 to estimate the following, and enter results where indicated:
RR for each combination of pnc5 and raceaa relative to the common referent group of “jointly
unexposed” births (no early care & non-AA) (table C1)
The interaction contrast ratio (ICR) and 95% CI (see instructions below for 95% CI) (table C1)
The expected RR11 vs. 00 assuming no RD modification (table C1, no CI)
1. Access the variance-covariance matrix after fitting model 6 to estimate the variance for the ICR
using the variance formula below (from Hosmer & Lemeshow).
Var(ICR) = (RR10 vs. 00)^2 * Var(β1) + (RR01 vs. 00)^2 * Var(β2) + (RR11 vs. 00)^2 * Var(β3)
+ 2 * RR10 vs. 00 * RR01 vs. 00 * Cov(β1, β2)
– 2 * RR10 vs. 00 * RR11 vs. 00 * Cov(β1, β3)
– 2 * RR01 vs. 00 * RR11 vs. 00 * Cov(β2, β3)
2. Estimate the 95% confidence limits for the ICR as shown below:
Lower 95% CI for the ICR = ICR – 1.96 * Var(ICR)^(1/2)
Upper 95% CI for the ICR = ICR + 1.96 * Var(ICR)^(1/2)
Note: The original Hosmer and Lemeshow paper shows how to set up a spreadsheet to estimate
the ICR variance based on this method. There’s also a SAS macro (see Lundberg 1996).
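As an alternative to a spreadsheet, the two steps above can be sketched in Python. The coefficients and covariance matrix below are hypothetical placeholders for the model 6 output, and icr_with_ci is a name introduced for this sketch; it assumes β1, β2 and β3 are the lnRR coefficients for the 10, 01 and 11 indicator terms, in that order.

```python
import math

def icr_with_ci(b1, b2, b3, cov, z=1.96):
    """Return (ICR, lower, upper) from indicator term log-risk coefficients.

    b1, b2, b3: lnRR for 10 vs. 00, 01 vs. 00 and 11 vs. 00.
    cov: 3x3 covariance matrix of (b1, b2, b3), in that order.
    """
    rr10, rr01, rr11 = math.exp(b1), math.exp(b2), math.exp(b3)
    icr = rr11 - rr10 - rr01 + 1
    var = (rr10 ** 2 * cov[0][0] + rr01 ** 2 * cov[1][1] + rr11 ** 2 * cov[2][2]
           + 2 * rr10 * rr01 * cov[0][1]
           - 2 * rr10 * rr11 * cov[0][2]
           - 2 * rr01 * rr11 * cov[1][2])
    se = math.sqrt(var)
    return icr, icr - z * se, icr + z * se

# Hypothetical coefficients and covariance matrix (illustrative only).
b1, b2, b3 = -0.25, 0.40, 0.10
cov = [[0.0040, 0.0020, 0.0020],
       [0.0020, 0.0050, 0.0020],
       [0.0020, 0.0020, 0.0045]]

icr_est, lower, upper = icr_with_ci(b1, b2, b3, cov)
print(f"ICR = {icr_est:.3f} (95% CI {lower:.3f}, {upper:.3f})")
```

The covariance matrix must be subset and ordered to match the three indicator coefficients; the intercept row and column from the GENMOD output are not used in this formula.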
References
1. Hosmer D, Lemeshow S. Confidence interval estimation of interaction. Epidemiology. 1992;3:452-456.
2. Lundberg M, Fredlund P, Hallqvist J, Diderichsen F. A SAS program calculating three measures of
interaction with confidence intervals. Epidemiology. 1996;7:655-656.
3. Assmann SF, Hosmer DW, Lemeshow S, Mundt KA. Confidence intervals for measures of interaction.
Epidemiology. 1996;7:286-290.
D. Assessing effect measure modification for modifiers with >2 categories
Methods and models to assess effect measure modification can be extended to accommodate
interactions between covariates with more than 2 categories. For example, effect measure
modification of the relation between early prenatal care (vs. no early care) and the 17 week risk of
preterm birth by race/ethnicity (raceth2) can be assessed using indicator term models with 7 indicator
terms for the 8 possible combinations of pnc5 (0, 1) and raceth2 (0, 1, 2, 3) as shown below (model
8):
Alternatively, you can evaluate RD modification using product term interaction models. To do this,
create 3 indicator terms for the 4 categories of raceth2 (racethwh, racethb, racetho, as for
assignment 2) and include each in a product interaction term with pnc5 (model 9)
As for the models used to assess effect measure modification between dichotomous covariates, the
indicator term and product term models shown above are equivalent with regard to their assumptions
and maximum likelihood values.
a. Use a product term or indicator term interaction model to estimate RD and 95% CI for each stratum
of prenatal care (pnc5) and race/ethnicity (raceth2) relative to White non-Hispanic births with no early
prenatal care. Report risks, RD and CI in table D1.
b. Use an LR test to compare the fit of the interaction model with a main effects model (model 10)
D2. Stratum specific RD and NNT for modifiers with >2 categories
a. Use a product term or indicator term interaction model to estimate stratum-specific RD and 95% CI
for early care vs. no early care according to race/ethnicity (raceth2). Report risks, RD and CI in table
D2.
b. Estimate the number of white non-Hispanic, white Hispanic, African American and other
race/ethnicity births that would need to receive early prenatal care in order for the number of preterm
births to decrease by one (i.e., the number needed to treat, or NNT) and report results in table D2.
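The NNT is commonly computed as the reciprocal of the absolute risk difference. A minimal sketch in Python follows, assuming that convention; the RD values are hypothetical and the function name is introduced here for illustration.

```python
def number_needed_to_treat(rd):
    """NNT = 1 / |RD| for early care vs. no early care.

    An RD of 0 has no finite NNT, so it is rejected explicitly.
    """
    if rd == 0:
        raise ValueError("RD of 0 implies no finite NNT")
    return 1 / abs(rd)

# Hypothetical stratum-specific RDs for early care vs. no early care (illustrative only).
hypothetical_rd = {"stratum A": -0.025, "stratum B": -0.010}
nnt = {name: number_needed_to_treat(rd) for name, rd in hypothetical_rd.items()}

for name, value in nnt.items():
    print(f"{name}: RD = {hypothetical_rd[name]:.3f}, NNT = {value:.0f}")
```

The NNT reading applies when the RD is negative (early care associated with lower risk); a positive RD would instead describe the number of births receiving early care per one additional preterm birth.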
D3. Assessing RD modification based on RR for modifiers with >2 categories
a. Use a single referent indicator term log-risk model to evaluate RD modification based on estimated
RR, and enter results in table D3. To do this you will need to estimate 3 expected RRs (assuming no
RD modification) and 3 ICRs.
ICR11 indicates whether the estimated joint effect measure for early care and white Hispanic
race/ethnicity (the observed RR11) is consistent with additive risks for early care (vs. no early
care) and white Hispanic race/ethnicity (vs. white non-Hispanic race/ethnicity)
ICR12 indicates whether the estimated joint effect measure for early care and AA race/ethnicity
(observed RR12) is consistent with additive risks for early care (vs. no early care) and AA (vs.
white non-Hispanic) race/ethnicity
ICR13 indicates whether the estimated joint effect measure for early care and other race/ethnicity
(observed RR13) is consistent with additive risks for early care (vs. no early care) and Other
race/ethnicity (vs. white non-Hispanic race/ethnicity)
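The three ICRs follow the same pattern as in the dichotomous case, applied once per non-referent race/ethnicity category: ICR1k = RR1k - RR10 - RR0k + 1. The sketch below illustrates this in Python with hypothetical RR values; the function and dictionary keys are names introduced here, not part of the assignment materials.

```python
def icr_by_category(rr_care_ref, rr_no_care, rr_care):
    """Expected joint RR and ICR for each non-referent category k.

    rr_care_ref: RR10 (early care, referent race/ethnicity) vs. 00.
    rr_no_care:  {k: RR0k} for no early care in category k vs. 00.
    rr_care:     {k: RR1k} for early care in category k vs. 00.
    """
    results = {}
    for k, rr0k in rr_no_care.items():
        expected = rr_care_ref + rr0k - 1
        results[k] = {"expected_rr": expected, "icr": rr_care[k] - expected}
    return results

# Hypothetical risk ratios relative to white non-Hispanic births
# with no early care (illustrative only).
rr10 = 0.80
rr_no_care = {"white Hispanic": 1.10, "AA": 1.60, "other": 1.20}
rr_care = {"white Hispanic": 0.95, "AA": 1.25, "other": 1.00}

icrs = icr_by_category(rr10, rr_no_care, rr_care)
for k, r in icrs.items():
    print(f"{k}: expected RR = {r['expected_rr']:.3f}, ICR = {r['icr']:.3f}")
```

Each ICR uses the same RR10 because all contrasts share one referent group (white non-Hispanic births with no early care).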
Written Assignment (100 points total; 95 points content, 5 points style)
1. Which model will produce perfectly homogeneous risk difference estimates for early care vs. no
early care across strata of race (AA or non-AA)? Why? (2 points)
2. Briefly explain the similarities or differences in RD measures for early care and AA race derived
from contingency tables (section A), model 3 and model 5. (2 points)
3. Would a Likelihood Ratio Test comparing model 6 to a log-risk model with pnc5 and raceaa
alone be relevant to an analysis of biologic interaction between early prenatal care and
race/ethnicity? Briefly justify your answer. (2 points)
4. List one advantage and one disadvantage of dichotomizing race as African American or non-
African American. How might conclusions about the presence or absence of health disparities be
affected by dichotomizing race into two groups? (4 points)
5. Describe the analyses of risk difference modification by race/ethnicity from part D as you would
for the results section of a publication. Address the following in your description (10 points):
Which group had the highest estimated 17-week risk of preterm? Which had the lowest
estimated risk?
Do estimated risks associated with early prenatal care differ according to race/ethnicity? If so,
how?
Which group or groups appeared to have the greatest reduction in preterm birth associated
with early prenatal care vs. no early care?
Are the joint effects of early prenatal care and race/ethnicity different from what you would
expect assuming additive risks?
Be brief and note estimates that support your answers. You may also refer to specific results
tables as appropriate.
6. Would the results of your analysis support targeting early prenatal care programs to specific
race/ethnicity groups in order to reduce the incidence of preterm birth in North Carolina? Why or
why not? Consider the frequency of exposure, incidence of outcome, and results from your
analysis of effect measure modification. Be brief, but justify your conclusions (10 points)
Table A1. Contingency Table Analyses (5 points)
Early prenatal care   Race     ij   Preterm (N)   Total (N)   Risk
No early care         Non-AA   00
Had early care        Non-AA   10
No early care         AA       01
Had early care        AA       11
*Round risk estimates to three significant digits. You do not need to show 95% CI.
Round all estimates to three significant digits. You do not need to show 95% CI for estimates in or
below table A2.
Round all estimates to three significant digits. You do not need to show 95% CI
Table B1. Product term interaction models: common referent analyses (6 points)
Contrast                                       ij         RD (linear risk)   95% CI   RR (log-risk)   95% CI
no early care/non-AA vs no early care/non-AA   00 vs 00
early care/non-AA vs no early care/non-AA      10 vs 00
no early care/AA vs no early care/non-AA       01 vs 00
early care/AA vs no early care/non-AA          11 vs 00
*Round all values in table B2 to three significant digits. Round p-values to 1 significant figure.
Table C1. Indicator term models: common referent analyses (6 points)
Contrast                                       ij         RD (linear risk)   95% CI   RR (log-risk)   95% CI
no early care/non-AA vs no early care/non-AA   00 vs 00
early care/non-AA vs no early care/non-AA      10 vs 00
no early care/AA vs no early care/non-AA       01 vs 00
early care/AA vs no early care/non-AA          11 vs 00
D1. Assessing RD modification based on RD for modifiers with >2 categories (14 points)
Stratum               Preterm births (N)   Total births (N)   Risk   RD   95% CI
White non-Hispanic
  no early care
  had early care
White Hispanic
  no early care
  had early care
African American
  no early care
  had early care
Other race
  no early care
  had early care
Round risks, RD and 95% CI to three significant digits.
D2. Stratum-specific RD and NNT for modifiers with >2 categories (4 points)
RD 95% CI NNT*
Early care vs. no early care by race/ethnicity
White non-Hispanic 10 vs 00
White Hispanic 11 vs 01
AA 12 vs 02
Other 13 vs 03
D3. Assessing RD modification based on RR for modifiers with >2 categories (13 points)
Stratum                              Preterm births (N)   Total births (N)   RR   95% CI   Expected joint RR*   ICR   95% CI
no early care, white non-Hispanic
had early care, white non-Hispanic
no early care, white Hispanic
had early care, white Hispanic
no early care, AA
had early care, AA
no early care, other
had early care, other
Round all values to three significant digits. *Expected joint RR assuming no RD modification