
Lecture 6

Study Design II:


Cohort and Case-Control Studies

1
Learning Objectives
• Characterize a cohort study, including pros and
cons
• Distinguish among different types of cohort
studies: prospective, retrospective, and
ambidirectional
• Characterize a case-control study, including pros
and cons
• Discuss the relationship between cohort and
case-control studies
• Estimate the odds ratio and understand its
relationship to the risk ratio
2
Observational studies
• RCTs are rigorous but often infeasible for
practical and ethical reasons.
• Observational studies take advantage of the
fact that people encounter harmful or
beneficial exposures through personal habits,
occupation, place of residence, or other
circumstances.

3
Types of observational studies
• Two principal types:
• Cohort study: subjects are defined according
to their exposure levels and followed over
time to determine disease occurrence.
• Case-control study: subjects are defined as
cases and controls and exposure histories are
compared.

4
Cross-sectional study
• Before we go into the cohort study and case-
control study in more detail, a quick note on a
third type of observational study
• Cross-sectional study: examines the
relationship between exposure and disease
prevalence at a single point in time

5
Cohort study

6
Cohort: Definition
• “Cohort” comes from the Latin word cohors,
meaning a group of soldiers (ancient Rome).
• Today we use the word cohort to
– characterize “any designated group of people who
are followed over a period of time.”
– describe a group of individuals with a common
characteristic or experience.

7
Cohort: Types of populations studied
• Open/dynamic:
– Members come and go; losses may occur
• Fixed:
– Defined by an irrevocable (unchangeable) event
– Does not gain members; losses may occur
• Closed:
– Defined by irrevocable event
– Does not gain members; no losses occur

8
Cohort study: Design
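[Diagram: Defined population → sample → NON-RANDOMIZED exposure groups (E+ and E-) → each group followed for disease (D+ or D-)]
Note: unlike an RCT, exposure is not randomized.
Measures of effect: RR (risk ratio) and RD (risk difference).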
9
First identify…

            Total
Exposed     a+b
Unexposed   c+d

10
Then follow to measure disease…

            Diseased   Not diseased   Total
Exposed     a          b              a+b
Unexposed   c          d              c+d

11
Then calculate risk ratio
            Diseased   Not diseased   Total
Exposed     a          b              a+b
Unexposed   c          d              c+d

Risk ratio = (risk of disease in exposed) / (risk of disease in unexposed)
           = [a / (a+b)] / [c / (c+d)]

Numerator: risk of disease in the exposed
Denominator: risk of disease in the unexposed
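
As a minimal sketch of the calculation above, the risk ratio can be computed directly from the four cell counts. The function name and the example counts are hypothetical, chosen only to exercise the formula; they do not come from the lecture.

```python
def risk_ratio(a: int, b: int, c: int, d: int) -> float:
    """Risk ratio from a 2x2 cohort table.

    a = exposed, diseased      b = exposed, not diseased
    c = unexposed, diseased    d = unexposed, not diseased
    """
    risk_exposed = a / (a + b)      # risk of disease in the exposed
    risk_unexposed = c / (c + d)    # risk of disease in the unexposed
    return risk_exposed / risk_unexposed


# Hypothetical counts (not from the lecture): 30 of 100 exposed and
# 10 of 100 unexposed develop disease.
example_rr = risk_ratio(a=30, b=70, c=10, d=90)
print(f"Risk ratio: {example_rr:.2f}")
```

The function assumes a closed cohort with no loss to follow-up, so that every subject counted in a+b or c+d was observed for the whole risk period.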
12
Risk ratio interpretation
• Ratios > 1.0 indicate risk is higher among
exposed than unexposed
• Ratios = 1.0 indicate no association
• Ratios < 1.0 indicate risk is lower among
exposed than unexposed

13
Risk ratio vs. rate ratio
• Risk ratios are ideal when there is little or no loss
to follow-up.
• Many studies have substantial loss to follow-up.
• Rate ratios use person-time in the denominator, so they
more accurately represent the strength of the association
when loss to follow-up is an issue.

14
Rate ratio

15
Rate ratio

Rate ratio = (rate of disease in exposed) / (rate of disease in unexposed)
           = (a / person-years exposed) / (c / person-years unexposed)

Numerator: rate of disease in the exposed
Denominator: rate of disease in the unexposed
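
The same idea with person-time denominators can be sketched as follows. The function name and the counts are hypothetical and serve only to illustrate the formula.

```python
def rate_ratio(cases_exposed: int, py_exposed: float,
               cases_unexposed: int, py_unexposed: float) -> float:
    """Rate ratio: incidence rate in the exposed over the rate in the unexposed."""
    rate_exposed = cases_exposed / py_exposed        # cases per person-year
    rate_unexposed = cases_unexposed / py_unexposed  # cases per person-year
    return rate_exposed / rate_unexposed


# Hypothetical counts (not from the lecture): 20 cases over 10,000
# person-years in the exposed, 10 cases over 20,000 in the unexposed.
example_irr = rate_ratio(20, 10_000, 10, 20_000)
print(f"Rate ratio: {example_irr:.2f}")
```

Because each group contributes only the person-years it was actually observed, subjects lost to follow-up still count for the time they were under observation.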

16
Rate ratio interpretation
Similar to risk ratio
• Ratios > 1.0 indicate rate is higher among
exposed than unexposed
• Ratios = 1.0 indicate no association
• Ratios < 1.0 indicate rate is lower among
exposed than unexposed

17
Risk difference

Difference between two risks = risk in exposed – risk in unexposed

• Interpretation: Excess risk due to the exposure

• Example: If the risk of disease is 10 per
100,000 in the unexposed and 15 per 100,000
in the exposed, then 5 excess cases per 100,000
are associated with the exposure of interest.

18
Example: Nutrition and obesity
Research question: Are nutrition classes in middle
school associated with the development of obesity in
adolescence?
Sample:
• Middle school A, 400 students, receives nutrition
class (intervention)
• Middle school B, 300 students, in neighboring
district, does not receive nutrition class
Measures: Schools collect students’ height and weight
yearly for 5 years
19
Example: Risk difference

20
Example: Risk difference
• Incidence proportion (or risk) of obesity among those
who had nutrition class = 0.175 or 17.5%
• Incidence proportion (or risk) of obesity among those
who did not have nutrition class = 0.33 or 33%
• Risk difference
– Incidence proportion of exposed – incidence proportion of unexposed
– 0.175 – 0.33 = –0.155
• Interpretation: Nutrition class in middle school is
associated with approximately 15.5 fewer cases of obesity
during adolescence for every 100 adolescents.
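
The arithmetic in this example can be reproduced in a few lines. This is only a sketch: the two incidence proportions are taken from the slide, while the variable names are illustrative.

```python
# Incidence proportions from the slide.
risk_exposed = 0.175    # students who had nutrition class
risk_unexposed = 0.33   # students who did not have nutrition class

# Risk difference = risk in exposed - risk in unexposed.
risk_difference = risk_exposed - risk_unexposed

# Express the difference per 100 adolescents for interpretation.
cases_per_100 = risk_difference * 100

print(f"Risk difference: {risk_difference:.3f}")
print(f"Cases per 100 adolescents: {cases_per_100:.1f}")
```

A negative value means the risk is lower in the exposed group, which is why the interpretation speaks of fewer cases.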
21
Rate difference
• Difference between two rates: rate in exposed – rate in unexposed
• Interpretation: Similar to risk difference; excess rate
due to the exposure
• Example: If the rate of disease is 8 per 100,000
person-years in the exposed and 4 per 100,000
person-years in the unexposed, then 4 excess cases
per 100,000 person-years of exposure are associated
with the exposure of interest.

22
Types of cohort studies
• Prospective cohort study
– Concurrent cohort study or longitudinal study
• Retrospective cohort study
– Non-concurrent cohort or historical cohort
• Ambidirectional cohort study
– Incorporates aspects of both

Depends on where the investigator sits!

23
Prospective cohort study
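[Diagram: study conducted starting in 2015. Defined population identified in 2015 → NON-RANDOMIZED exposure groups (E+ and E-) → each group followed to 2025 for disease (D+ or D-)]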
24
Retrospective cohort study
[Diagram: study conducted in 2015. Defined population as of 2000 → NON-RANDOMIZED exposure groups (E+ and E-) → disease (D+ or D-) ascertained as of 2015]

25
Ambidirectional cohort study
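[Diagram: study conducted in 2015. Defined population → NON-RANDOMIZED exposure groups (E+ and E-) → disease (D+ or D-). The year 2015 is marked at the exposure-group level, with the defined population before it and the disease outcomes after it.]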
26
Keys to all cohort studies

• Comparison groups are defined as
exposed (E+) vs. unexposed (E-)
• Subjects must be disease-free (i.e., at risk
of the outcome) at the beginning of the
study

27
Example A:
Multivitamins and memory loss
• Sample of 80 men at age 65 with no evidence
of memory loss
• Measure daily multivitamin use
• Follow groups for ten years
• Count men who develop memory loss
• Assume no losses to follow-up

28
Example A:
Multivitamins and memory loss
               Memory loss   No loss   Total
Daily use      8             32        40
No daily use   16            24        40
Total          24            56        80

Risk ratio = (8/40) / (16/40) = 0.5

Interpretation: The risk of memory loss
among those who use a multivitamin is
50% less than the risk of memory loss
among those who do not use a
multivitamin over 10 years.
29
Cohort study: Advantages
Note: cohort studies are particularly suited to rare exposures.

• Temporality
• Can look at changes in exposure over time
• Can study multiple exposures
• Possible to estimate all measures of incidence
and effect
• Efficient for rare exposures

Note: see 2019 practice problems, question 17.

30
Cohort study: Disadvantages
• Usually requires large investments in
resources
• Requires large sample sizes
• Not easy to reproduce
• Not efficient for rare outcomes

31
Case-control study

32
Case-control study
• Case-control studies start with the
identification of persons with the disease and
a suitable control group of persons without
the disease.
• Controls may be selected in a variety of ways;
e.g., from the same underlying cohort as the
cases or from a similar population to that
which gave rise to the cases.
• Key: comparison groups are defined as
diseased (D+) vs. non-diseased (D-)
33
How do we ensure that cases and
controls come from the same source
population?

The “would criterion”:


• Would your cases have been selected to be
controls had they not gotten disease?
• Would your controls have been selected to be
cases had they gotten disease?

34
When to conduct a case-control study
• The disease is rare
• Exposure data are difficult or expensive to
obtain
• The disease has a long induction/latent period
• Little is known about the disease
• The underlying population is dynamic

35
Case-control study: Design
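[Diagram: from the source population, select cases (D+) and controls (D-), then classify each group by past exposure (E+ or E-). Measure of association: odds ratio.]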
36
First select…

            Cases            Controls
            (with disease)   (without disease)
Exposed
Unexposed
Total       a+c              b+d

37
Then measure past exposure…

            Cases            Controls
            (with disease)   (without disease)
Exposed     a                b
Unexposed   c                d
Total       a+c              b+d

38
Then calculate odds ratio
            Cases            Controls
            (with disease)   (without disease)
Exposed     a                b
Unexposed   c                d

Odds Ratio = ad / bc

39
Why do we calculate an odds ratio?
• We cannot estimate the risk of disease directly
when we sample people based on whether they
have the disease or not (case-control study).
• We can estimate the proportion exposed among
diseased and non-diseased.
– Estimate odds ratio for exposure
– Odds ratio for exposure = Odds ratio for disease
• If the disease is rare in the population, the odds
ratio approximates the risk ratio from a
prospective study.
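
The equivalence and the rare-disease approximation in the bullets above can be written out with the cell labels a, b, c, d used in this lecture. This is a sketch rather than a formal proof; the last line assumes the table describes the full source population, so that a and c are small relative to b and d when the disease is rare.

```latex
\begin{align*}
\text{Exposure OR} &= \frac{a/c}{b/d} = \frac{ad}{bc} \\
\text{Disease OR}  &= \frac{a/b}{c/d} = \frac{ad}{bc} \\
\text{RR} &= \frac{a/(a+b)}{c/(c+d)} \approx \frac{a/b}{c/d} = \frac{ad}{bc}
  \quad \text{when } a \ll b \text{ and } c \ll d
\end{align*}
```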
40
What is an odds?
Odds of exposure among cases (with disease), where a cases are exposed and c cases are unexposed:

Odds = [a / (a+c)] / [1 – a / (a+c)]
     = [a / (a+c)] / [c / (a+c)]
     = a / c

41
The exposure odds ratio
            Cases            Controls
            (with disease)   (without disease)
Exposed     a                b
Unexposed   c                d

Exposure odds ratio = {[a / (a+c)] / [1 – a / (a+c)]} / {[b / (b+d)] / [1 – b / (b+d)]}

Simplifies to: (a / c) / (b / d) = ad / bc
42
Let’s compare a cohort study
and a case-control study

43
Example B: Maternal smoking during
pregnancy and ADHD
Research question: Is smoking cigarettes during
pregnancy a potential cause of offspring attention-
deficit hyperactivity disorder (ADHD)?
Sample:
• Recruit 5,000 women during pregnancy who are
smokers and 5,000 women during pregnancy who are
not smokers
• Prospective cohort study
• Assume no loss to follow-up
Measures: Follow offspring at age 10 and determine
which children developed ADHD and which did not
44
Risk ratio
             ADHD   No ADHD   Total
Smoking      300    4700      5000
No Smoking   200    4800      5000
Total        500    9500      10,000

Risk Ratio = (300/5000) / (200/5000) = 1.5

45
Interpretation of risk ratio
• Offspring of women who smoked in pregnancy
have 1.5 times the risk of developing ADHD
over 10 years compared to offspring of
women who did not smoke in pregnancy.

46
Example C: Maternal smoking during
pregnancy and ADHD
Research question: Is smoking cigarettes during pregnancy
a potential cause of offspring attention-deficit hyperactivity
disorder (ADHD)?
Sample:
• 500 10-year-old children who are seeking care for ADHD
• For each child we find with ADHD, we select two children
of the same age from the same physician offices who
present for routine well visits (do not have ADHD)
• Case-control study
Measures: Mothers respond to questions, including
whether they smoked cigarettes while they were pregnant
47
Odds ratio
             ADHD   No ADHD   Total
Smoking      300    503       803
No Smoking   200    497       697
Total        500    1000      1500

Odds Ratio = (300*497) / (200*503) = 1.48

48
Interpretation of odds ratio
• The odds of exposure (mother smoking in
pregnancy) among those with ADHD are 1.48
times the odds of exposure among those
without ADHD over 10 years.

49
Let’s compare
• Risk ratio in cohort study = 1.5
• Odds ratio in case-control study = 1.48

• This odds ratio is approximately equal to
the risk ratio because the disease is rare
(overall risk = 500 out of 10,000, or 5%, in our
cohort study).
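
The comparison can be checked numerically from the two tables in Examples B and C. This is a minimal sketch with illustrative function names; the cohort odds ratio is not reported on the slides and is computed here only for comparison.

```python
def risk_ratio(a, b, c, d):
    """Risk ratio from a 2x2 table (rows: exposed/unexposed; columns: disease/no disease)."""
    return (a / (a + b)) / (c / (c + d))


def odds_ratio(a, b, c, d):
    """Odds ratio as the cross-product ad / bc."""
    return (a * d) / (b * c)


# Example B (cohort): smoking vs. no smoking, ADHD vs. no ADHD.
cohort = dict(a=300, b=4700, c=200, d=4800)
# Example C (case-control): 500 cases and 1,000 controls.
case_control = dict(a=300, b=503, c=200, d=497)

cohort_rr = risk_ratio(**cohort)
cohort_or = odds_ratio(**cohort)             # not on the slides; for comparison only
case_control_or = odds_ratio(**case_control)

print(f"Cohort risk ratio:       {cohort_rr:.2f}")
print(f"Cohort odds ratio:       {cohort_or:.2f}")
print(f"Case-control odds ratio: {case_control_or:.2f}")
```

Note that `risk_ratio` is applied only to the cohort table: in the case-control table the row totals depend on how many controls were sampled per case, so a/(a+b) is not a risk.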

50
When does the OR do a good job of
approximating the RR?

[Figure: OR and RR (y-axis) plotted against prevalence of disease (x-axis)]
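
The relationship in this figure can be sketched numerically. The code below assumes a fixed risk ratio of 1.5 (matching Example B) and uses the risk in the unexposed as a stand-in for how common the disease is; the list of baseline risks is hypothetical and chosen only to show the trend.

```python
def odds_ratio_from_risks(risk_exposed: float, risk_unexposed: float) -> float:
    """Disease odds ratio implied by two risks."""
    odds_exposed = risk_exposed / (1 - risk_exposed)
    odds_unexposed = risk_unexposed / (1 - risk_unexposed)
    return odds_exposed / odds_unexposed


true_rr = 1.5  # assumed fixed risk ratio
baseline_risks = [0.001, 0.01, 0.04, 0.10, 0.25, 0.50]  # hypothetical risks in the unexposed

results = []
for p0 in baseline_risks:
    p1 = true_rr * p0  # risk in the exposed implied by the fixed risk ratio
    results.append((p0, odds_ratio_from_risks(p1, p0)))

for p0, or_value in results:
    print(f"risk in unexposed = {p0:.3f}   RR = {true_rr:.2f}   OR = {or_value:.2f}")
```

Under these assumptions the odds ratio stays close to the risk ratio while the disease is rare and moves further from it as the disease becomes more common.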

51
Selection of cases
• Create a case definition
• Begin identifying and enrolling cases
– Hospital or clinic patient rosters
– Death certificates
– Special surveys (NHANES)
– Reporting systems (cancer registries)

52
Selection of controls
• Controls should be a sample of the source population,
that is, the population that gave rise to the cases.
• The purpose of the control group is to provide
information on the exposure distribution in
the source population.
• Controls should be sampled independently of
exposure status.
Note: see 2019 practice problems, question 20.

53
Case-control study: Advantages
• Relatively short duration and less expensive
• Efficient for rare diseases
• Require smaller sample sizes
• Can study multiple risk factors (exposures) for
one disease
• Easily reproduced in different populations by
different investigators

54
Case-control study: Disadvantages
• Temporality hard to establish
• Relies on recall
• Difficult to identify appropriate control group
• Does not provide estimate of risks so cannot
calculate risk ratios and risk differences
• Inefficient for rare exposures

55
Let’s look at two studies of
smoking and lung cancer

56
Study 1: Cohort study
• In October 1951, a questionnaire was sent to
59,600 doctors in the United Kingdom about
their smoking habits
• 40,564 valid responses
• Deaths ascertained with death certificates
• After 10 years of follow-up, 4,597 deaths, of
which 212 were due to lung cancer

57
British doctors

[Diagram: British doctors divided into smokers and non-smokers → follow-up → % died of lung cancer in each group]
58
Lung cancer death
• Between November 1, 1951, and October 31, 1961
• Risk period = 10 years

Doll R, Hill AB. (1964) Mortality in relation to smoking: Ten years’ observations of British doctors. BMJ 1:1399-1410.

59
Smoking and lung cancer in the
British Doctors’ Study

Smoking status               Deaths from LC   N        Person-years
Never smokers                3                5,439    42,800
Ever smokers*                143              27,769   149,000
Other (pipe, cigar, mixed)   66

* As of Nov 1, 1951, > 1 cig/day for a year

Doll & Hill, 1964

60
Smoking and lung cancer in the
British Doctors’ Study

                     Deaths from LC   Person-years   Death rate (per 1000 py)
Ever smokers (E+)    143              149,000        0.96
Never smokers (E-)   3                42,800         0.07

61
Rate ratio

Rate ratio (IRR)

IRR = IR(E+) / IR(E-) = 0.96 / 0.07 = 13.7
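
The death rates and the rate ratio can be recomputed from the deaths and person-years reported on the previous slides. This is a sketch with illustrative variable names; it also shows that using the rounded rates or the unrounded rates gives the same result to one decimal place.

```python
# Deaths from lung cancer and person-years, from Doll & Hill (1964) as shown on the slides.
deaths_smokers, py_smokers = 143, 149_000
deaths_never, py_never = 3, 42_800

# Death rates per 1,000 person-years.
rate_smokers = deaths_smokers / py_smokers * 1000
rate_never = deaths_never / py_never * 1000

# Rate ratio from the unrounded rates, and from rates rounded as on the slide.
irr = rate_smokers / rate_never
irr_from_rounded = round(rate_smokers, 2) / round(rate_never, 2)

print(f"Rate in ever smokers:  {rate_smokers:.2f} per 1000 py")
print(f"Rate in never smokers: {rate_never:.2f} per 1000 py")
print(f"Rate ratio (unrounded rates): {irr:.1f}")
print(f"Rate ratio (rounded rates):   {irr_from_rounded:.1f}")
```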

62
Interpretation of rate ratio
• Doctors who smoked had 13.7 times the rate
of lung cancer death compared with doctors
who did not smoke.

63
Study 2: Case-control study
• Cases: incident lung cancer in 20 hospitals of
London
• Controls: Age- and hospital-matched patients
admitted for diseases other than cancer
• Exposure: questionnaire on duration, dates of
starting and stopping of smoking, amount
smoked, type of tobacco

64
[Diagram: lung cancer cases and controls free of lung cancer → RECALL of past smoking → % smokers in each group]

65
Case-Control Design

Lung cancer cases (649): ever smokers (647), never smokers (2)
Controls free of lung cancer (649): ever smokers (622), never smokers (27)

66
Odds ratio of ever smoking

                Cases   Controls
Ever smokers    647     622
Never smokers   2       27

Odds Ratio = (647*27) / (2*622) = 14.0
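
As a quick check of this calculation, the odds ratio can be computed two ways: as the cross-product ad/bc, and as the odds of exposure among cases divided by the odds of exposure among controls. The variable names are illustrative; the counts are the ones on the slide.

```python
# Cell counts from the slide: rows are ever/never smokers, columns are cases/controls.
a, b = 647, 622   # ever smokers: cases, controls
c, d = 2, 27      # never smokers: cases, controls

# Cross-product form.
or_cross = (a * d) / (b * c)

# Ratio of exposure odds: odds of smoking among cases over odds among controls.
odds_exposure_cases = a / c
odds_exposure_controls = b / d
or_ratio_of_odds = odds_exposure_cases / odds_exposure_controls

print(f"Odds ratio (ad/bc):          {or_cross:.1f}")
print(f"Odds ratio (ratio of odds):  {or_ratio_of_odds:.1f}")
```

Both forms agree, which is the "simplifies to ad/bc" step from the earlier derivation applied to real counts.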

67
Interpretation of odds ratio
• Lung cancer cases had 14.0 times the odds of
smoking history compared with controls
without lung cancer.

68
In summary:
• The risk of disease given exposure cannot be estimated
directly in a case-control study, so the risk ratio and
risk difference from a case-control study are not meaningful.
• However, the exposure odds ratio (the odds of
exposure given disease) is very useful, for 2 reasons:
1. The exposure odds ratio is mathematically equivalent to
the disease odds ratio
2. If the disease is rare, the disease odds ratio approximates
the disease risk ratio.
• Therefore, we can get a pretty accurate estimate of
the disease risk ratio with the exposure odds ratio!
69
Comparing the cohort study
and case-control study
• Cohort study
– Doctors who smoked had 13.7 times the rate of
lung cancer death compared with doctors who did
not smoke.
• Case-control study
– Lung cancer cases had 14.0 times the odds of
smoking history compared with controls
without lung cancer.

70
Odds ratios as proxies for risk ratios
• When the disease is rare, the odds ratio will
approximate the risk ratio you would have
gotten had you done a cohort study.

71
What have we learned?
• Characterize a cohort study, including pros and
cons
• Distinguish among different types of cohort
studies: prospective, retrospective, and
ambidirectional
• Characterize a case-control study, including pros
and cons
• Discuss the relationship between cohort and
case-control studies
• Estimate the odds ratio and understand its
relationship to the risk ratio
72
Questions?

73
Thank you!

74
