Lecture 7 - Noncomparability I - Kahn-1

Lecture 7

Noncomparability I:
Random Error and Confounders

1
Learning Objectives
• Discuss how random error and the co-occurrence
of causes can lead to noncomparability between
exposed and unexposed groups
• Recognize the three criteria of a potential
confounder
• Learn ways to mitigate confounding in the
design and analysis phases
• Utilize stratified analysis to adjust for the unequal
distribution of alternate causes of the outcome
and provide a valid measure of association
• Assess the direction of bias in the effect estimate
due to a confounder
2
What we’ve learned thus far
• The only way we can know for certain that an
exposure causes an outcome is to contrast a fact
with its counterfactual. Sadly, we cannot do this.
• Instead, we design studies that approximate this
contrast using an unexposed group as a
substitute for the counterfactual.
• Differences between exposed and unexposed
groups are sources of noncomparability that
threaten the internal validity of our studies,
potentially biasing our measures of association.
3
What is comparability?
Exposed and unexposed groups are
“comparable” if they have equal
distributions of all causes of the outcome
other than the exposure of interest.

4
Exposed and unexposed groups are
“comparable”

EXPOSURE EXPOSURE
OF OF
INTEREST INTEREST

Exposed Unexposed

Equal distributions of all causes of the outcome


other than the exposure of interest
5
Questions to be addressed in the next
two lectures
• Is the association causal or are there
alternative explanations?
• How do non-causal associations arise?
• How can we mitigate non-causal
associations through design and analysis?

6
Ways noncomparability can arise
• Through sampling → random error
• In nature (“associated causes”) → confounders
• Through selection bias
• Through loss to follow-up
• Through misclassification

7
Random chance
• Variability occurs whenever we sample a source
population in order to conduct an epidemiologic
study.
• The sample we select may have a different
distribution of characteristics than the source
population.
• We put confidence intervals around the measure
of occurrence or association that results from our
study to reflect how far sampling variation may have
moved it from the “true” population measure.

8
Random chance

9
Random chance
Population:

Brown = 21 = 19%
Yellow = 12 = 11%
Orange = 23 = 21%
Green = 18 = 16%
Blue = 25 = 22%
Red = 12 = 11%

Total = 111

10
Random chance
          Population    Sample
Brown     21 (19%)      7 (26%)
Yellow    12 (11%)      4 (15%)
Orange    23 (21%)      7 (26%)
Green     18 (16%)      2 (7%)
Blue      25 (22%)      3 (11%)
Red       12 (11%)      4 (15%)

Total     111           27
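
The population-versus-sample comparison above can be reproduced with a short simulation. This is only a sketch: the population counts are copied from the slide, while the function names and the seeds are illustrative assumptions, so the samples it draws will not match the slide's sample of 27.

```python
import random
from collections import Counter

# Population counts copied from the slide (111 items in total).
POPULATION = {"Brown": 21, "Yellow": 12, "Orange": 23,
              "Green": 18, "Blue": 25, "Red": 12}


def draw_sample(counts, n, seed):
    """Draw n items without replacement and return the color counts."""
    items = [color for color, k in counts.items() for _ in range(k)]
    rng = random.Random(seed)  # the seed is arbitrary, only for repeatability
    return Counter(rng.sample(items, n))


def percentages(counts):
    """Convert counts to percentages of their own total."""
    total = sum(counts.values())
    return {color: 100 * k / total for color, k in counts.items()}


pop_pct = percentages(POPULATION)
for seed in (1, 2, 3):  # three hypothetical samples of 27
    sample_pct = percentages(draw_sample(POPULATION, 27, seed))
    print(f"sample drawn with seed {seed}:")
    for color in POPULATION:
        print(f"  {color:<7} population {pop_pct[color]:4.1f}%  "
              f"sample {sample_pct.get(color, 0.0):4.1f}%")
```

Each run with a different seed gives a different distribution of colors, which is the sampling variability the slide describes.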

11
Random chance
Population: Total = 111
Sample: Total = 27
Total = 4
14
Random chance
• By decreasing our sample size, we increase
the range of sampling variance and decrease
our precision, resulting in a wider confidence
interval.

Larger sample size → narrower confidence interval
Smaller sample size → wider confidence interval
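
The link between sample size and interval width can be checked numerically. The sketch below assumes a Wald (normal-approximation) 95% interval for a proportion, which the lecture does not specify. The count 7 of 27 is the Brown count from the sample slide, and 28 of 108 is a hypothetical sample four times as large with the same proportion.

```python
import math


def wald_ci(successes, n, z=1.96):
    """Wald (normal-approximation) interval for a proportion; an assumed method."""
    p = successes / n
    half_width = z * math.sqrt(p * (1 - p) / n)
    return max(0.0, p - half_width), min(1.0, p + half_width)


small = wald_ci(7, 27)     # Brown in the sample of 27
large = wald_ci(28, 108)   # hypothetical: same proportion, four times the n

small_width = small[1] - small[0]
large_width = large[1] - large[0]
print(f"n = 27:  {small[0]:.3f} to {small[1]:.3f} (width {small_width:.3f})")
print(f"n = 108: {large[0]:.3f} to {large[1]:.3f} (width {large_width:.3f})")
```

Under this approximation the width scales with 1/sqrt(n), so quadrupling the sample size halves the interval.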
15
Random chance
• When the distribution of alternative causes of the
outcome differs between exposed and unexposed
through sampling variation, our comparison
groups are not comparable and our resulting
measures of association may be biased.
• We put confidence intervals around our
measures of association to acknowledge the
range of potential bias caused by sampling
variation.

16
Random chance
Source population = 111

Exposed = 27 Unexposed = 27

17
Instead of this…
Equal distributions of alternative
causes of the outcome

Exposure Exposure
of Interest of Interest

Exposed Unexposed

18
We get this
Unequal distributions of alternative
causes of the outcome

Exposure Exposure
of Interest of Interest

Exposed Unexposed

19
Ways noncomparability can arise
• Through sampling → random error
• In nature (“associated causes”) → confounders
• Through loss to follow-up
• Through selection bias
• Through misclassification

20
Confounder
• A confounder is a factor that contributes to
noncomparability between the exposed and
unexposed groups.
• These factors are causes or “risk factors” of
the outcome that may travel with the
hypothesized exposure of interest.

21
Risk factors are like bananas…
People who exercise regularly may be more likely to…
• Eat a healthful diet
• Maintain optimal weight
• Not smoke
• Drink in moderation
• Access preventive care
• Wear helmets/seat belts
• Wear sunscreen

23
Risk factors are like bananas…
People who take illegal drugs may be more likely to…
• Gamble
• Smoke
• Drink to excess
• Have unprotected sex
• Exceed the speed limit
• Get into fights
• Engage in other illegal activities

24
Does exercising >= 3 times/week
improve immune function?
[Diagram: exercise >= 3x/wk and immune function, shown with BMI, flu shot, and fruit & veggie intake]

26
Does exercising >= 3 times/week
improve immune function?

[Diagram: exposed group (exercise >= 3 times/week) and unexposed group (exercise < 2 times/week), each shown with BMI, flu shot, and fruits & veggies]

27
Identifying a potential confounder
• Alternative cause of the outcome that is
unequally distributed between the exposed
and unexposed groups

30
Criteria to be a potential confounder
1) Associated with the exposure
2) Independent cause of the outcome
3) Not in the causal pathway between the
exposure and outcome (i.e., not a mediator)
[Triangle diagram: C (confounder), E (exposure), D (outcome)]
31
The problem with confounders
You want to measure this…

32
The problem with confounders
…but instead you’re measuring this

33
Mitigating non-causal associations
a.k.a. Controlling for confounders
• How do we isolate the causal effect we’re
interested in?
– Randomization (RCTs)
– Matching
– Restriction

34
Mitigating non-causal associations
a.k.a. Controlling for confounders
• How do we isolate the causal effect we’re
interested in?
Design phase:
– Randomization (RCTs)
– Matching
– Restriction
Analysis phase:
– Stratification
– Regression

36
Randomization

Exposed Unexposed
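
As a rough illustration of why randomization helps, the sketch below randomly splits a hypothetical cohort in which 30% of people carry some alternative cause of the outcome. The cohort, the field name `risk_factor`, and the function names are all invented for this example.

```python
import random


def randomize(participants, seed=0):
    """Shuffle participants and split them into two equal-sized groups."""
    rng = random.Random(seed)  # the seed is arbitrary, only for repeatability
    shuffled = participants[:]
    rng.shuffle(shuffled)
    half = len(shuffled) // 2
    return shuffled[:half], shuffled[half:]


def proportion(group, key):
    """Share of a group for which the given field is true."""
    return sum(1 for person in group if person[key]) / len(group)


# Hypothetical cohort: 3,000 of 10,000 people have the alternative cause.
cohort = [{"id": i, "risk_factor": i < 3000} for i in range(10000)]
treated, control = randomize(cohort)

print(f"treated: {proportion(treated, 'risk_factor'):.3f}")
print(f"control: {proportion(control, 'risk_factor'):.3f}")
```

With groups this large, the two proportions should land close to 0.30, and the same holds on average for causes that were never measured. Smaller groups can still differ by chance.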

40
Matching
Exposed Unexposed
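
One simple form of matching is exact one-to-one matching on a single characteristic. The sketch below is an assumption about how that could look, not the procedure shown on the slides. The people, the `age_group` field, and the function name are hypothetical.

```python
from collections import defaultdict


def match_pairs(exposed, unexposed, key):
    """Pair each exposed person with one unused unexposed person sharing `key`."""
    pool = defaultdict(list)
    for person in unexposed:
        pool[person[key]].append(person)
    pairs = []
    for person in exposed:
        candidates = pool[person[key]]
        if candidates:  # exposed people without a match are left out
            pairs.append((person, candidates.pop()))
    return pairs


# Hypothetical participants, invented for illustration.
exposed = [{"id": 1, "age_group": "20-29"}, {"id": 2, "age_group": "30-39"},
           {"id": 3, "age_group": "30-39"}, {"id": 4, "age_group": "40-49"}]
unexposed = [{"id": 5, "age_group": "20-29"}, {"id": 6, "age_group": "30-39"},
             {"id": 7, "age_group": "20-29"}, {"id": 8, "age_group": "50-59"}]

pairs = match_pairs(exposed, unexposed, "age_group")
for e, u in pairs:
    print(f"exposed {e['id']} matched to unexposed {u['id']} ({e['age_group']})")
```

After matching, the matched factor has the same distribution in both groups, so it cannot be a source of noncomparability among the matched pairs. Unmatched people are dropped, which costs sample size.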

42
Restriction
Among only

58
Reminder:
Criteria to be a potential confounder
1) Associated with the exposure
2) Independent cause of the outcome
3) Not in the causal pathway between the
exposure and outcome (i.e., not a mediator)
[Triangle diagram: C (confounder), E (exposure), D (outcome)]
61
Stratification
• If a variable meets the three criteria for a
potential confounder, we test to see if it is a
source of noncomparability that biases our
measure of association (i.e., a confounder in
our data) through stratification.
• We can only stratify by variables that might
contribute to noncomparability (potential
confounders) if we have collected data on
them.

62
How does stratification work?
• Stratification removes the effect of a potential
confounder on an exposure-outcome
relationship by holding that third variable
constant within each stratum, so it cannot
account for differences in the outcome between
exposed and unexposed.
• If the potential confounder is associated with
neither the exposure nor the outcome, then
stratifying on that variable will not change the
estimated measure of effect.

63
Does bicycle riding affect semen quality?

Exposure: bicycling > 5 hours/week
Outcome: poor semen quality

Crude OR = 1.5 (“confounded”)
Stratum-specific ORs = 2.0 and 2.0
Adjusted OR = 2.0 (“truth”)
71
Stratification: How to
• Use the 3 criteria to identify a potential
confounder.
• Calculate the crude (unadjusted, unstratified)
OR.
• Stratify by the potential confounder and
calculate the strata-specific ORs.
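
The crude and strata-specific calculations above can be sketched as follows. The 2x2 counts are hypothetical, invented so that the stratum ORs come out at 2.0 and the crude OR near 1.5. They are not the lecture's data, and the stratum names are assumptions.

```python
def odds_ratio(a, b, c, d):
    """OR for a 2x2 table.

    a, b = exposed cases, exposed non-cases
    c, d = unexposed cases, unexposed non-cases
    """
    return (a * d) / (b * c)


# Hypothetical counts per stratum of the potential confounder: (a, b, c, d).
strata = {
    "younger": (40, 180, 10, 90),
    "older": (10, 25, 20, 100),
}

# Crude table: add the strata together cell by cell, ignoring the confounder.
crude_table = tuple(sum(cells) for cells in zip(*strata.values()))
crude_or = odds_ratio(*crude_table)
stratum_ors = {name: odds_ratio(*cells) for name, cells in strata.items()}

print(f"crude OR = {crude_or:.2f}")
for name, value in stratum_ors.items():
    print(f"{name}: OR = {value:.2f}")
```

In this made-up example the two strata agree with each other (OR = 2.0) while the collapsed table gives a smaller OR, which is the pattern the next step asks you to look for.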

72
Stratification: How to
• If the strata-specific ORs are similar to each
other and appreciably different from the crude
OR (rule of thumb: > 10%), the variable is a
source of noncomparability in your data (a
confounder).
– Report the summary adjusted OR (weighted average
of the strata-specific ORs)
– Note: a handy way to calculate % difference is:
(crude OR – adjusted OR) / (crude OR)
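
A minimal sketch of the summary step, assuming the Mantel-Haenszel estimator as the weighted average (the slides do not name a specific method). The counts are hypothetical and the function names are illustrative. The percent difference uses the formula from the note above, multiplied by 100.

```python
def mantel_haenszel_or(tables):
    """Mantel-Haenszel summary OR over 2x2 tables given as (a, b, c, d)."""
    num = sum(a * d / (a + b + c + d) for a, b, c, d in tables)
    den = sum(b * c / (a + b + c + d) for a, b, c, d in tables)
    return num / den


def percent_difference(crude_or, adjusted_or):
    """(crude OR - adjusted OR) / crude OR, expressed as a percentage."""
    return 100 * (crude_or - adjusted_or) / crude_or


# Hypothetical strata, invented for illustration: (a, b, c, d) per stratum.
tables = [(40, 180, 10, 90), (10, 25, 20, 100)]
a, b, c, d = (sum(cells) for cells in zip(*tables))
crude_or = (a * d) / (b * c)
adjusted_or = mantel_haenszel_or(tables)
change = percent_difference(crude_or, adjusted_or)
is_confounder = abs(change) > 10  # the 10% rule of thumb

print(f"crude OR = {crude_or:.2f}, adjusted OR = {adjusted_or:.2f}")
print(f"difference = {change:.1f}% -> confounder in these data: {is_confounder}")
```

The sign of the result only records whether the adjusted OR is above or below the crude OR, so the rule of thumb is applied to the absolute value. With the lecture's example values (crude 1.5, adjusted 2.0) the same formula gives about -33%.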
73
Stratification: How to
• If the strata-specific ORs are similar to each
other but not appreciably different from the
crude OR, the variable is not a source of
noncomparability in your data.
– Report the crude OR

74
How to determine the direction of
confounding
[Number line marked 1 (null) and 2, with the crude and adjusted ORs plotted]

• The confounder (age) is pushing the OR in the negative direction.

78
Another way to think about it…
[Number line marked 1 (null) and 2, with the crude and adjusted ORs plotted]

• Upon stratification, the OR moves away from the null, in the
positive direction, indicating that the confounder biased your
results toward the null.
79
How to predict the direction of
confounding
[Triangle diagram, built up over several slides, with arrows labeled “Causal” (+) and “Protective” (-)]

Negative (-) x positive (+) = negative

Therefore, the confounder (age) will push the
“true” OR in the negative direction.

84
How to predict the direction of
confounding
Step 1
• Ask yourself: Is your hypothesized “true” (adjusted)
association causal (OR > 1) or protective (OR < 1)?
• Plot it on a number line, either to the right or to the left of
the null.
Step 2
• Use the triangle schematic with arrows labeled “+” or “-”
depending on whether you posit associations to be causal
or protective.
Step 3
• Determine the direction of confounding according to the
following criteria:

85
How to predict the direction of
confounding
• Positive (+) x positive (+) = in the positive
direction on the number line
• Negative (-) x negative (-) = in the positive
direction on the number line
• Positive (+) x negative (-) = in the negative
direction on the number line
• Negative (-) x positive (+) = in the negative
direction on the number line

86
How to predict the direction of
confounding
Step 4
• Plot the “confounded” (crude) association to the
right of (in the positive direction relative to) the
“true” (adjusted) association or to the left of (in
the negative direction relative to) the “true”
association, depending on the direction of
confounding you determined in Step 3.
Step 5
• Look at the number line and ask yourself: Do I
predict that the confounder will bias my results
toward or away from the null?
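
The sign rule and the steps above can be sketched in a few lines. The function names and the '+'/'-' encoding are illustrative assumptions. The sketch also assumes the bias does not carry the crude estimate across the null.

```python
def confounding_direction(confounder_exposure, confounder_outcome):
    """Multiply the signs ('+' or '-') on the two confounder arrows."""
    sign = {"+": 1, "-": -1}
    product = sign[confounder_exposure] * sign[confounder_outcome]
    return "positive" if product > 0 else "negative"


def bias_relative_to_null(true_association, direction):
    """Say whether the crude estimate sits toward or away from the null.

    true_association: 'causal' (OR > 1) or 'protective' (OR < 1)
    direction: 'positive' or 'negative', from confounding_direction()
    """
    pushed_right = direction == "positive"
    if true_association == "causal":
        # A true OR > 1 pushed further right moves away from the null.
        return "away from the null" if pushed_right else "toward the null"
    # A true OR < 1 pushed right moves toward the null.
    return "toward the null" if pushed_right else "away from the null"


# The lecture's example: negative (-) x positive (+), with a causal association.
direction = confounding_direction("-", "+")
print(direction)
print(bias_relative_to_null("causal", direction))
```

For the lecture's example this gives a negative direction and a bias toward the null, matching the crude OR sitting between the null and the adjusted OR.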

87
What have we learned?
• Noncomparability between exposed and
unexposed groups on the distributions of
alternative causes of the outcome can bias the
results of our studies.
• Two sources of noncomparability we
discussed today:
– Random sampling error (accounted for by
confidence intervals)
– The presence of confounders

88
What have we learned?
How to mitigate the effect of confounders:
Design phase:
– Randomization
– Matching
– Restriction
Analysis phase:
– Stratification
– Regression
3 criteria for a potential confounder:
1) Independent cause of the outcome
2) Associated with the exposure
3) Not in the causal pathway between them

89
What have we learned?
• If a potential confounder meets the three criteria,
stratify.
– If the strata-specific (adjusted) ORs are similar to each
other and appreciably different from the crude OR,
the variable is a confounder in your data and you
should report the summary adjusted OR.
– If the strata-specific (adjusted) ORs are similar to each
other and not appreciably different from the crude
OR, the variable is not a confounder in your data and
you should report the crude OR.
• Use the 5 steps to predict the direction of bias
due to the presence of a confounder.

90
Questions?

91
Thank you!

92
