
Confounding

Effect Modification

Truong Phuoc Long, Ph.D.

12/12/2022 1
Content

Review of measures of association

Confounding

Effect modification/statistical interaction

Measures of association in statistics

• Risk difference (attributable risk)

• Relative risk (risk ratio)

• Odds ratio

• RD, RR, and OR measure the magnitude of the association between a risk factor and the risk of disease.

Risk difference (attributable risk)
• The risk difference (RD), or attributable risk, is the difference
between the risk of an outcome in the exposed group and the risk in the
unexposed group.

• It describes the actual difference in the observed risk of events
between the experimental and control interventions.

Example
• A randomized, double-blind, placebo-controlled trial of the efficacy and
safety of zidovudine (AZT) in reducing the risk of maternal-infant HIV
transmission. 363 HIV-infected pregnant women were randomized to AZT
or placebo.
• Results
- Of the 180 women randomized to the AZT group, 13 gave birth to children
who tested positive for HIV within 18 months of birth.
- Of the 183 women randomized to the placebo group, 40 gave birth to
children who tested positive for HIV within 18 months of birth.
Note: A double-blind study is one in which neither the participants nor the
experimenters know who is receiving a particular treatment.

Risk difference

                        Drug group
HIV transmission     AZT    Placebo    Total
Yes                   13         40       53
No                   167        143      310
Total                180        183      363

• The risk of HIV transmission (i.e., the proportion of infants who tested HIV positive):

• AZT: $\hat{p}_1 = 13/180 \approx 0.07 = 7\%$

• Placebo: $\hat{p}_2 = 40/183 \approx 0.22 = 22\%$

• Risk difference: $\hat{p}_1 - \hat{p}_2 \approx -0.15 = -15\%$

• Interpretation: If AZT were given to 1,000 HIV-infected pregnant women,
this would reduce the number of HIV-positive infants by about 150 (relative to the
number of HIV-positive infants born to 1,000 women not treated with AZT).
Relative risk (or Risk ratio)
• Relative risk (RR) is the ratio of the probability of an outcome in an
exposed group to the probability of the outcome in an unexposed group.
• Ex: The risk of HIV transmission with AZT relative to placebo:

$\widehat{RR} = \hat{p}_1 / \hat{p}_2 = 0.07 / 0.22 \approx 0.32$

The risk of HIV transmission with AZT is about 1/3 the risk of HIV
transmission with placebo.
Interpretation: An HIV-positive pregnant woman could reduce her
personal risk of giving birth to an HIV-positive child by nearly 70% if
she takes AZT during her pregnancy.

Relative risk (or Risk ratio)
• RR can also be computed in the other direction (placebo relative to AZT):

$\widehat{RR} = \hat{p}_2 / \hat{p}_1 = 0.22 / 0.07 \approx 3.1$

• Interpretation: An HIV-positive pregnant woman's personal risk of
giving birth to an HIV-positive child is slightly more than three times
as high if she does not take AZT during her pregnancy.

Risk difference vs. Relative risk

• Risk difference provides a measure of the public health impact of
an exposure (assuming causality).

• Relative risk provides a measure of the magnitude of the
disease-exposure association for an individual.

• Each provides a different piece of information about the “story”.

What Are Odds?
• The odds of an outcome are the ratio of the risk of having the outcome
to the risk of not having it.

• If p represents the risk of an outcome, then the odds are given by:

$\text{odds} = \dfrac{p}{1 - p}$

What Are Odds?
• The estimated risk of giving birth to an HIV-infected child among
mothers treated with AZT is $\hat{p}_1 = 13/180 \approx 0.07$.
• The corresponding odds estimate is

$\hat{p}_1 / (1 - \hat{p}_1) = 13/167 \approx 0.078$

• The estimated risk of giving birth to an HIV-infected child among
mothers not treated with AZT is $\hat{p}_2 = 40/183 \approx 0.22$.
• The corresponding odds estimate is

$\hat{p}_2 / (1 - \hat{p}_2) = 40/143 \approx 0.28$

Odds Ratio
• The estimated odds ratio of an HIV-positive birth with AZT relative to placebo:

$\widehat{OR} = \dfrac{13/167}{40/143} \approx 0.28$

• The odds of HIV transmission with AZT are 0.28 times (about 1/3) the odds
of transmission with placebo.
• Interpretation: AZT is associated with an estimated 72% (estimated
OR = 0.28) reduction in the odds of giving birth to an HIV-infected child
among HIV-infected pregnant women.

Summary
An example of a 2×2 cross table showing the formulas for the risk
difference, risk ratio, and odds ratio
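As a rough illustration of the three measures, the sketch below computes RD, RR, and OR from the four cells of a 2×2 table and applies them to the AZT trial counts. The function name `two_by_two_measures` and the cell labels a, b, c, d are assumptions made for this example, not notation from the slides. Because the risks are not rounded first, the last digit can differ slightly from hand calculations that start from 0.07 and 0.22.

```python
def two_by_two_measures(a, b, c, d):
    """Return (RD, RR, OR) for a 2x2 table.

    a = exposed with outcome,    b = unexposed with outcome
    c = exposed without outcome, d = unexposed without outcome
    (these cell labels are an assumption made for this sketch)
    """
    risk_exposed = a / (a + c)
    risk_unexposed = b / (b + d)
    rd = risk_exposed - risk_unexposed   # risk difference
    rr = risk_exposed / risk_unexposed   # relative risk
    odds_ratio = (a * d) / (b * c)       # cross-product odds ratio
    return rd, rr, odds_ratio


# AZT trial: exposed = AZT, unexposed = placebo, outcome = HIV transmission
rd, rr, odds_ratio = two_by_two_measures(a=13, b=40, c=167, d=143)
print(f"RD = {rd:.3f}, RR = {rr:.2f}, OR = {odds_ratio:.2f}")
```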

Confounding
Consider results from the following (fictitious) study:
• This study was done to investigate the association between
smoking and a certain disease in male and female adults.
• 210 smokers and 240 non-smokers were recruited for the study.

Source: John McGready. Statistical Reasoning II. Confounding and Effect Modification
Lecture. Johns Hopkins Bloomberg School of Public Health
Confounding
Smoke and Disease variables: 0 = No, 1 = Yes
• What is the regression equation?
• What is the probability of having the disease for
- a smoker?
- a non-smoker?
• Conclusion?

• Is smoking protective against disease?
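A minimal sketch of how such questions could be answered, assuming a logistic regression of Disease on Smoke. With a single 0/1 predictor, the maximum-likelihood fit reproduces the observed proportions, so the coefficients can be written down directly from the counts. The counts below are hypothetical (invented for illustration and chosen only to match the group sizes of 210 and 240); they are not the study's data, and the helper names are assumptions.

```python
import math


def logit(p):
    """Log odds of a probability."""
    return math.log(p / (1 - p))


def inv_logit(z):
    """Probability corresponding to a log odds value."""
    return 1 / (1 + math.exp(-z))


# HYPOTHETICAL counts, not taken from the study
diseased_smokers, n_smokers = 62, 210
diseased_nonsmokers, n_nonsmokers = 84, 240

p_smoker = diseased_smokers / n_smokers
p_nonsmoker = diseased_nonsmokers / n_nonsmokers

# logit(p) = b0 + b1 * smoke
b0 = logit(p_nonsmoker)       # log odds of disease for non-smokers
b1 = logit(p_smoker) - b0     # log odds ratio, smokers vs. non-smokers

print(f"logit(p) = {b0:.3f} + ({b1:.3f}) * smoke")
print(f"P(disease | smoker)     = {inv_logit(b0 + b1):.3f}")
print(f"P(disease | non-smoker) = {inv_logit(b0):.3f}")
```

In this made-up table the coefficient of smoke is negative, which is the kind of result that makes smoking look protective before sex is taken into account.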


Confounding
Is smoking protective against disease?
Look at the association between
- Sex and smoking
- Sex and disease

Confounding

Most of the smokers are male, and most of the non-smokers are female.

Most people with the disease are female.

How might these facts affect the previously found relationship between
smoking and disease?

Are smokers really less likely to have the disease?


What’s Going On?
• The original outcome of interest is DISEASE.

• The original exposure of interest is SMOKING.

• In this sample, SEX is related to both the outcome and the exposure.

- This may be affecting the overall relationship between
DISEASE and SMOKING.

• How can we look at the relationship between DISEASE and
SMOKING while removing any possible “interference” from SEX?

• One approach: look at the DISEASE and SMOKING relationship
separately for males and females.

Confounder
Let’s find out whether sex is related to both smoking and disease

Sex variable: 0=Female, 1=Male

Confounder
• Is smoking related to disease in males?

Confounder
• Is smoking related to disease in females?

Confounder
• A recap of the study (Smoking, Disease, and Sex)
- The overall (sometimes called crude or unadjusted) relationship
(RR) between smoking and disease was nearly one (risk
difference nearly 0).

- The sex-specific results showed similar positive associations
between smoking and disease.
+ Males:
+ Females:

Confounder
Sex and disease

Sex and smoking

Sex variable: 0=Female, 1=Male


Simpson's paradox
• Simpson's paradox is a phenomenon in which a trend appears in several
different groups of data but disappears or reverses when these groups
are combined.
• The nature of an association can change or disappear when data from
several groups are combined to form a single group.
• An association between an exposure X and a disease Y can be
confounded by another lurking (hidden) variable Z

Figure: Simpson's paradox for quantitative data. A positive trend appears
for each of two separate groups, whereas a negative trend (dashed line)
appears when the groups are combined.

Confounding (Lurking Variable)
• A confounder Z is a variable that distorts the true relationship between
X and Y.
This can happen if Z is related to both X and Y.

• Confounding occurs when a factor is associated
with both the exposure and the outcome but does
not lie on the causal pathway between them.
For example:
- If you look for an association between coffee and lung cancer, this association
may be distorted by smoking if smokers are unequally distributed between the
coffee drinkers and non-drinkers.
- It may appear that there is an association between coffee and lung cancer;
however, if you were to consider smokers and non-smokers separately,
the analysis would in fact show no association.
Confounder

In our example of disease, smoking, sex, which one is X/Y/Z?

How to Adjust for Confounding?
• Controlling for confounding by stratification
- Look at the tables separately for each stratum
- For example, by sex (males and females) or by clinic
- Take a weighted average of the stratum-specific estimates
• For example, in the disease/smoking situation
- To get a sex-adjusted relative risk for the smoking-disease relationship, we
could weight the sex-specific relative risks by the numbers of males and females.
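The stratification idea can be sketched as follows: compute the RR within each sex, average those RRs weighted by stratum size, and compare the result with the crude RR from the collapsed table. The counts are hypothetical (invented for illustration, not the study's data), and weighting by stratum size is only the simple scheme described above; other weighting schemes exist.

```python
def relative_risk(cases_exposed, n_exposed, cases_unexposed, n_unexposed):
    """Risk in the exposed group divided by risk in the unexposed group."""
    return (cases_exposed / n_exposed) / (cases_unexposed / n_unexposed)


# HYPOTHETICAL counts, invented for illustration (not the study's data):
# (diseased smokers, all smokers, diseased non-smokers, all non-smokers)
strata = {
    "male": (32, 160, 4, 40),
    "female": (30, 50, 80, 200),
}

# Stratum-specific RRs and their average weighted by stratum size.
stratum_rr = {sex: relative_risk(*counts) for sex, counts in strata.items()}
stratum_size = {sex: counts[1] + counts[3] for sex, counts in strata.items()}
total = sum(stratum_size.values())
adjusted_rr = sum(stratum_size[s] * stratum_rr[s] for s in strata) / total

# Crude RR: collapse the strata into a single table first.
pooled = [sum(column) for column in zip(*strata.values())]
crude_rr = relative_risk(*pooled)

print(stratum_rr)
print(f"crude RR = {crude_rr:.2f}, sex-adjusted RR = {adjusted_rr:.2f}")
```

In this made-up example the crude RR is below 1 while both sex-specific RRs and the adjusted RR are above 1, which is the pattern that signals confounding by sex.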

How to Adjust for Confounding?

• One way to assess whether a variable is a confounder:
compare the crude RR to the adjusted RR. If the two are meaningfully
“different”, then that variable is a confounder.

How to Adjust for Confounding?
• Regression method: add the confounder variable as an
independent variable.
• What is the model now?
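Assuming the logistic regression framework used in the rest of these slides, with p the probability of disease and the variable names smoke and sex taken from the coding above, the adjusted model would take the following general form. The coefficient symbols are generic; no fitted values are implied.

```latex
\operatorname{logit}(p) = \ln\frac{p}{1-p}
  = \beta_0 + \beta_1\,\text{smoke} + \beta_2\,\text{sex}
```

Here $\beta_1$ is the log odds ratio for smoking among people of the same sex, which is why it is called the sex-adjusted effect of smoking.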

How to Adjust for Confounding?
Compare the model with and without the confounder:

How to Adjust for Confounding?
When sex is included in the model, the effect of smoking is in
the opposite direction: smoking is a risk factor for the disease,
and the effect is statistically significant.

How to Adjust for Confounding?
Compare the nested models

• Perform a chi-square test with df = the difference in the number of predictors:

$\chi^2 = 513.435 - 490.798 = 22.637$
How to Adjust for Confounding?
* Compare the nested models
Initial model (Null model):

Extended model:

Chi-square test:

• The chi-square value is larger than the critical chi-square value, which
means that the p-value is smaller than the significance level (0.05).
• Reject H0 [H0: the coefficient(s) of the added variable(s) = 0, i.e., the added
variable(s) do not significantly improve the model].
• The extended model is significantly better than the null model.
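The comparison above can be checked numerically with a small sketch. It assumes df = 1 (one added predictor, sex) and uses the fact that a chi-square variable with 1 degree of freedom is a squared standard normal, so the p-value can be obtained from `math.erfc` in the standard library. The function name `lr_test_df1` is an assumption made for this example.

```python
import math


def lr_test_df1(neg2ll_reduced, neg2ll_extended):
    """Chi-square statistic and p-value for nested models differing by ONE parameter.

    Assumes df = 1; for other df a general chi-square survival function is needed.
    """
    chi2 = neg2ll_reduced - neg2ll_extended
    # For df = 1: P(chi-square > x) = P(|Z| > sqrt(x)) = erfc(sqrt(x / 2))
    p_value = math.erfc(math.sqrt(chi2 / 2))
    return chi2, p_value


chi2, p_value = lr_test_df1(513.435, 490.798)
print(f"chi-square = {chi2:.3f}, p-value = {p_value:.2e}")
print("reject H0 at the 0.05 level" if p_value < 0.05 else "do not reject H0")
```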
Effect modification/Statistical Interaction

• We have just identified that sex is a confounder of the
relationship between smoking and the disease.
• Do we really think the relationship between smoking and
disease is the same for both sexes?
- i.e., does a female smoker have the same potential for disease
as a male smoker?
• Effect modification occurs when the effect of an exposure
differs among subgroups.

Effect modification
• Effect modification (statistical interaction) occurs when the
relationship between an outcome and one predictor differs
depending on the level of a second predictor.

• Examples

- The relationship between smoking and disease is different for
males and females.

- The relationship between cholesterol level and red
meat consumption is different for smokers and non-smokers.

Effect modification
The effect of smoking on disease for males:

The effect of smoking on disease for females:

Are these coefficients different?


Test of Interaction (effect modification)

Add an interaction term x3 = x1*x2 to the model, which already
includes x1 and x2 as predictors.

Create the x3 variable: smoke_sex = smoke*sex
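As a small illustration of this step, the sketch below builds the product variable for one record of each smoke/sex combination. The records are hypothetical, and the dictionary layout is an assumption; in practice the same product would be computed as a new column in whatever statistics package is in use.

```python
# HYPOTHETICAL records, one for each smoke/sex combination
# (smoke: 0 = No, 1 = Yes; sex: 0 = Female, 1 = Male)
records = [
    {"smoke": 0, "sex": 0},
    {"smoke": 1, "sex": 0},
    {"smoke": 0, "sex": 1},
    {"smoke": 1, "sex": 1},
]

# The interaction term is the product of the two predictors.
for record in records:
    record["smoke_sex"] = record["smoke"] * record["sex"]

for record in records:
    print(record)
```

With 0/1 coding the product equals 1 only for male smokers, so its coefficient captures how the smoking effect for males differs from the smoking effect for females.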

Test of Interaction (effect modification)
• Let’s look at the result:

What is the regression equation?


What is the regression equation for male?
What is the regression equation for female?

Test of Interaction (effect modification)
• Regression equation

• Regression equation for male

• Regression equation for female
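In symbolic form, and assuming x1 = smoke, x2 = sex (0 = Female, 1 = Male), and x3 = smoke*sex as defined earlier, the three equations would look like the following. The coefficients are left as generic symbols because the fitted values are not reproduced here.

```latex
\begin{aligned}
\text{Overall:}\quad & \operatorname{logit}(p) = \beta_0 + \beta_1\,\text{smoke} + \beta_2\,\text{sex} + \beta_3\,(\text{smoke}\times\text{sex}) \\
\text{Male (sex}=1\text{):}\quad & \operatorname{logit}(p) = (\beta_0 + \beta_2) + (\beta_1 + \beta_3)\,\text{smoke} \\
\text{Female (sex}=0\text{):}\quad & \operatorname{logit}(p) = \beta_0 + \beta_1\,\text{smoke}
\end{aligned}
```

The male and female equations follow by substituting sex = 1 and sex = 0 into the overall equation.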

Interpret the coefficients

• ෡1: the difference in the logit of disease between smokers and


non-smokers of the same sex.
• ෡2: the difference in the logit of disease between male and
female for non-smokers and smokers.
• ෡3: the the difference in the relationship of smoking and
disease between male and female.
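These interpretations can be made concrete with a sketch. A model containing smoke, sex, and their product has one parameter per smoke/sex cell, so its maximum-likelihood fit reproduces the observed log odds in each cell and the coefficients can be read off as differences of log odds. The counts below are hypothetical (invented for illustration, not the study's data), and the variable names are assumptions.

```python
import math

# HYPOTHETICAL (diseased, total) counts keyed by (smoke, sex);
# smoke: 0 = No, 1 = Yes; sex: 0 = Female, 1 = Male
cells = {
    (0, 0): (80, 200),
    (1, 0): (30, 50),
    (0, 1): (4, 40),
    (1, 1): (32, 160),
}


def log_odds(diseased, total):
    """Observed log odds of disease in one cell."""
    return math.log(diseased / (total - diseased))


L = {key: log_odds(*counts) for key, counts in cells.items()}

b0 = L[(0, 0)]                                          # female non-smokers
b1 = L[(1, 0)] - L[(0, 0)]                              # smoking effect among females
b2 = L[(0, 1)] - L[(0, 0)]                              # male vs. female among non-smokers
b3 = (L[(1, 1)] - L[(0, 1)]) - (L[(1, 0)] - L[(0, 0)])  # male minus female smoking effect

print(f"b0 = {b0:.3f}, b1 = {b1:.3f}, b2 = {b2:.3f}, b3 = {b3:.3f}")
```

These made-up counts were chosen so that the smoking odds ratio is the same in both sexes, which is why the interaction coefficient comes out as essentially zero.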

Test of Interaction (effect modification)

• Do not reject H0 → the interaction term is not significant →
there is no evidence that the relationship between smoking and disease
differs between males and females

Test of Interaction (effect modification)

• Another possible test for the significance of the
interaction term?

Test of Interaction (effect modification)
Compare the nested models

Test of Interaction (effect modification)
• The -2LL statistic (often called the deviance) is an indicator of how much
unexplained information there is after the model has been fitted, with large
values of -2LL indicating poorly fitting models.

• The deviance is basically a measure of how much unexplained variation
there is in our logistic regression model: the higher the value, the less
accurate the model.

• $\chi^2 = [-2LL(\text{baseline})] - [-2LL(\text{new})]$ with degrees of freedom
$= k_{\text{new}} - k_{\text{baseline}}$, where k is the number of parameters in each model.

- If our new model explains the data better than the baseline model, there
should be a significant reduction in the deviance (-2LL) which can be
tested against the chi-square distribution to give a p value.

Test of Interaction (effect modification)

• Chi-square test:
$\chi^2 = 490.798 - 490.797 = 0.001 < \chi^2_{\alpha = 0.05,\, df = 1} = 3.84$

• H0: the coefficient(s) of the added variable(s) = 0, i.e., the added
variable(s) do not significantly improve the model.
• Do not reject the null hypothesis: the extended model is not
significantly better than the null model.
