3 - Case-Control-Study


Case-Control Study

1. Basic Concepts
A case-control study starts with the identification of
persons with the disease of interest (cases) and a
control group of persons without the disease (controls).
The association between the exposure and the disease is
measured by comparing how frequently the exposure is
present among the diseased and the non-diseased.
Exposure

A study subject's contact with, or possession of, a
characteristic or factor that is suspected to
influence the risk of developing a particular
disease.
Direction of investigation: from disease status (at the onset of the study) back to past exposure

Group       Exposed (+)   Unexposed (-)   Proportion exposed (compared)
Cases       a             c               a/(a+c)
Controls    b             d               b/(b+d)

Figure. Schematic diagram of the basic principle of a case-control study
2. Type of Design
• Unmatched
• Matched
  • frequency matching
  • individual matching
Unmatched

The number of controls should be equal to or greater
than the number of cases. No other restrictions are
placed on the selection of controls.
Matched
Matching is the process of selecting the
controls so that they are similar to the cases in
certain characteristics, such as age, sex, race,
occupation, etc. Matching may be of two types:
• frequency matching
• individual matching
• Frequency Matching (group matching)
Frequency matching consists of selecting the
controls in such a manner that the proportion of
controls with a certain characteristic is
identical to the proportion of cases with the
same characteristic.
For example, if 50% of the cases are female, the controls
are selected so that 50% of that group is also
female.
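The selection rule for frequency matching can be sketched in a few lines of Python. This is only an illustration: the function name `frequency_match`, the record layout and the toy data are assumptions, not part of the lecture.

```python
import random
from collections import Counter

def frequency_match(cases, pool, key, ratio=1, seed=0):
    """Draw controls so that each stratum of `key` holds ratio x the number of cases.

    Hypothetical helper: `cases` and `pool` are lists of dicts, `key` extracts
    the matching characteristic (for example sex) from one record.
    """
    rng = random.Random(seed)
    needed = Counter(key(c) for c in cases)          # cases per stratum
    controls = []
    for stratum, n_cases in needed.items():
        candidates = [p for p in pool if key(p) == stratum]
        k = n_cases * ratio
        if len(candidates) < k:
            raise ValueError(f"not enough candidate controls in stratum {stratum!r}")
        controls.extend(rng.sample(candidates, k))   # random draw within the stratum
    return controls

# Toy data: 10 cases (5 female, 5 male) and a pool of 60 potential controls.
cases = [{"id": i, "sex": "F" if i < 5 else "M"} for i in range(10)]
pool = [{"id": 100 + i, "sex": "F" if i % 3 == 0 else "M"} for i in range(60)]

controls = frequency_match(cases, pool, key=lambda r: r["sex"])
share_female = sum(c["sex"] == "F" for c in controls) / len(controls)
print(len(controls), share_female)
```

The pool is only one third female, but the drawn control group reproduces the 50% female share of the cases, which is the point of frequency matching.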
• Individual Matching
First identify a case. Then select
from the source population one or more
controls who have the same values that
the case has for each matching factor.
A binary variable such as sex can be
matched directly. To match on a continuous
variable such as age, it is typically necessary to
form categories, such as 5-year intervals.
If one control is matched to each case, the design is
called one-to-one matching, 1:1 matching
(pair matching).
If more than one control is matched to each case, the
design is called one-to-R matching, 1:R (1:M)
matching.
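As a rough sketch of individual matching, the code below matches on sex directly and on age through 5-year categories, and accepts any number of controls per case. The helper names (`age_band`, `individual_match`), the record fields and the data are hypothetical and only serve the illustration.

```python
def age_band(age, width=5):
    """Map an age to its category index, e.g. 40-44 -> 8 for 5-year bands."""
    return age // width

def individual_match(cases, pool, controls_per_case=1):
    """Give each case `controls_per_case` unused controls with the same
    sex and the same 5-year age band (a simple first-come sketch)."""
    available = list(pool)
    matched = {}
    for case in cases:
        chosen = []
        for person in available:
            same_sex = person["sex"] == case["sex"]
            same_band = age_band(person["age"]) == age_band(case["age"])
            if same_sex and same_band:
                chosen.append(person)
                if len(chosen) == controls_per_case:
                    break
        if len(chosen) < controls_per_case:
            raise ValueError(f"no suitable control left for case {case['id']}")
        for person in chosen:
            available.remove(person)     # a control is used only once
        matched[case["id"]] = chosen
    return matched

cases = [
    {"id": 1, "sex": "F", "age": 42},
    {"id": 2, "sex": "M", "age": 58},
]
pool = [
    {"id": 101, "sex": "F", "age": 44},
    {"id": 102, "sex": "M", "age": 56},
    {"id": 103, "sex": "F", "age": 40},
    {"id": 104, "sex": "M", "age": 61},
]

matched = individual_match(cases, pool, controls_per_case=1)
for case_id, ctrls in matched.items():
    print(case_id, [c["id"] for c in ctrls])
```

Setting `controls_per_case=1` gives 1:1 matching; a larger value gives 1:R matching, provided the pool holds enough eligible controls.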
The ratio of controls to cases rarely exceeds 4:1.
By the Pitman efficiency formula, additional controls beyond
this ratio add relatively little to the statistical power of
the study.
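The diminishing return can be made concrete with the commonly quoted Pitman efficiency expression 2R/(R+1), the efficiency of 1:R matching relative to 1:1 matching. That this is the exact formula the slide has in mind is an assumption; the function name is illustrative.

```python
def relative_efficiency(r):
    """Efficiency of 1:r matching relative to 1:1 matching, 2r / (r + 1).

    Assumed form of the Pitman efficiency formula; it approaches 2 as r grows.
    """
    return 2 * r / (r + 1)

efficiencies = {r: round(relative_efficiency(r), 3) for r in range(1, 7)}
for r, eff in efficiencies.items():
    print(f"1:{r} matching -> relative efficiency {eff}")
```

Going from 1 to 4 controls per case raises the relative efficiency from 1.0 to 1.6, while going from 4 to 6 adds only about 0.11, so the extra cost of recruiting more controls is rarely justified.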
Why match?
• To reduce likelihood of confounding
• Increase the study efficiency

Match what?
Possibly confounding factors
Match variable
Age, gender, etc.

Disadvantages
• Once you match on a characteristic, you
cannot study its effect.
• Over-matching.
3. Example

A case-control study of mothers' use of
diethylstilbestrol during gestation and
adenocarcinoma of the vagina in their daughters.
Background
Dr. Herbst was a gynecologist at the Vincent Memorial
Hospital in Boston, USA.
From 1966 to 1969 he found seven patients hospitalized for
treatment of adenocarcinoma of the vagina.
The patients were all young women aged 15 to 22 years. Vaginal
cancer usually accounts for about 2% of cancers of the female
reproductive system, and adenocarcinoma for only 5-10% of
vaginal cancers, so the disease is very rare.
In the past, patients were usually older than 25
years, but these 7 cases were 15 to 22 years old.
Steps
• Herbst explored the risk factors for
adenocarcinoma of the vagina.
• The seven cases plus one patient from another
hospital made up the case group.
• 4 controls were matched to each case, for a total of 32
controls.
• The investigators used a standard
questionnaire to interview the cases, the
controls and their mothers.
Result
Comparison of the main exposure factors of the mothers of cases and controls

No.      Age         Smoking     Bleeding    Abortion    Estrogen    Breastfed   X-ray
         case ctrl   case ctrl   case ctrl   case ctrl   case ctrl   case ctrl   case ctrl
1        25   32     yes  2/4    no   0/4    yes  1/4    yes  0/4    no   0/4    no   1/4
2        30   30     yes  3/4    no   0/4    yes  1/4    yes  0/4    no   1/4    no   0/4
3        22   31     yes  1/4    yes  0/4    no   1/4    yes  0/4    yes  0/4    no   0/4
4        33   30     yes  3/4    yes  0/4    yes  0/4    yes  0/4    yes  2/4    no   0/4
5        22   27     yes  3/4    no   1/4    no   1/4    no   0/4    no   0/4    no   0/4
6        21   29     yes  3/4    yes  0/4    yes  0/4    yes  0/4    no   0/4    no   1/4
7        30   27     no   3/4    no   0/4    yes  1/4    yes  0/4    yes  0/4    no   1/4
8        25   28     yes  3/4    no   0/4    yes  0/4    yes  0/4    no   0/4    yes  1/4
Total                7/8  21/32  3/8  1/32   6/8  5/32   7/8  0/32   3/8  3/32   1/8  4/32
Average  26.1 29.3
χ²                   0.53        4.52        7.16        23.22       2.35        0
P                    0.50        <0.05       <0.01       <0.00001    0.20
OR                   5.7         8.0         10.5        28.0        10.0        3.0

Age: mother's age. Smoking: whether the mother smoked. Bleeding: bleeding during this
pregnancy. Abortion: history of abortion. Estrogen: estrogen used during this pregnancy.
Breastfed: whether the mother breast-fed. X-ray: exposure to X-rays during pregnancy.
For each factor, "case" is the status of the case and "ctrl" is the number of that
case's 4 matched controls with the factor.
Conclusions:
• Among all the factors studied, only 3 were
statistically significant:
  The mother had used estrogen during pregnancy (P<0.00001).
  The mother had a history of abortion (P<0.01).
  The mother had vaginal bleeding during this pregnancy (P<0.05).

• A history of abortion and vaginal bleeding were the
reasons for the mothers' use of diethylstilbestrol.
• Conclusion: the mothers' use of diethylstilbestrol
increased their daughters' risk of adenocarcinoma of
the vagina.
4. Selection of study subjects

• Define the source population
• Select subjects based on disease status
• Ensure comparability of the case and control groups

Selection of Cases
• Definition of cases:
  criteria for case diagnosis, including inclusion and exclusion criteria.
• Types of cases:
  • Incident cases
  • Prevalent cases
  • Deceased cases
• Sources of cases:
  • Community
  • Hospital

Selection of Controls
• Definition of controls:
  people who have not developed the disease under study and
  who come from the same source population as the case group.
• Sources of controls:
  • Community
  • Hospital
  • Relatives, neighbors, colleagues and schoolmates
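5. Sample Size
• Parameters
  • The proportion of the control group exposed to the study factor (p0)
  • The expected odds ratio (OR) for the study factor
  • The significance level α
  • The power 1-β
• Methods
  • Table method
  • Formula method

Formula method (unmatched), where N is the number of subjects in each group and RR is approximated by the OR:

$$ N = \frac{2\,\bar{p}\,\bar{q}\,(Z_\alpha + Z_\beta)^2}{(p_1 - p_0)^2} $$

$$ p_1 = \frac{p_0 RR}{1 + p_0 (RR - 1)}, \qquad \bar{p} = \frac{p_0 + p_1}{2}, \qquad \bar{q} = 1 - \bar{p} $$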
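Example
A case-control study was conducted in Xi’an to examine
the association between smoking and lung cancer. The
smoking rate of the general population in Xi’an is 20% (p0 = 0.2).
OR = 2.0, α = 0.05 (two-sided), 1-β = 0.90.

$$ p_1 = \frac{p_0 RR}{1 + p_0 (RR - 1)} = \frac{0.2 \times 2}{1 + 0.2 \times (2 - 1)} = 0.333 $$

$$ \bar{p} = \frac{0.2 + 0.333}{2} = 0.267, \qquad \bar{q} = 1 - 0.267 = 0.733 $$

From the standard normal table: $Z_\alpha = 1.96$, $Z_\beta = 1.28$.

$$ N = \frac{2\,\bar{p}\,\bar{q}\,(Z_\alpha + Z_\beta)^2}{(p_1 - p_0)^2} = \frac{2 \times 0.267 \times 0.733 \times (1.96 + 1.28)^2}{(0.333 - 0.2)^2} = 232 $$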
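The unmatched sample-size arithmetic can be checked with a short Python sketch. The function names are illustrative, and the normal deviates 1.96 and 1.28 are taken from the worked calculation and passed in as defaults rather than computed.

```python
def p1_from_or(p0, odds_ratio):
    """Expected exposure proportion among cases, given p0 and the OR (used as RR)."""
    return p0 * odds_ratio / (1 + p0 * (odds_ratio - 1))

def unmatched_sample_size(p0, odds_ratio, z_alpha=1.96, z_beta=1.28):
    """Subjects needed per group: 2*p_bar*q_bar*(Za + Zb)^2 / (p1 - p0)^2."""
    p1 = p1_from_or(p0, odds_ratio)
    p_bar = (p0 + p1) / 2
    q_bar = 1 - p_bar
    return 2 * p_bar * q_bar * (z_alpha + z_beta) ** 2 / (p1 - p0) ** 2

p1 = p1_from_or(0.20, 2.0)
n = unmatched_sample_size(0.20, 2.0)
print(round(p1, 3), round(n, 1))
```

Without intermediate rounding the result is about 231 per group; the figure of 232 in the example comes from rounding p1 to 0.333 before the final step. Either way, roughly 230 cases and 230 controls are needed.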


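1:1 matching

Number of discordant pairs required (m):

$$ m = \frac{\left[ \dfrac{Z_\alpha}{2} + Z_\beta \sqrt{p(1-p)} \right]^2}{(p - 1/2)^2} $$

$$ p = \frac{OR}{1 + OR} \approx \frac{RR}{1 + RR} $$

Total number of pairs required (M):

$$ M \approx \frac{m}{p_0 q_1 + p_1 q_0} $$

$$ p_1 = \frac{p_0 RR}{1 + p_0 (RR - 1)}, \qquad q_1 = 1 - p_1, \qquad q_0 = 1 - p_0 $$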
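Example
A study examines the association between oral contraceptive use and
congenital heart disease. The exposure rate of the control
group is 30% (p0 = 0.3). RR = 2.0, p1 = 0.46, α = 0.05, β = 0.1.

$$ p = \frac{2}{1 + 2} = \frac{2}{3} $$

$$ m = \frac{\left[ 1.96/2 + 1.28 \sqrt{\tfrac{2}{3}\left(1 - \tfrac{2}{3}\right)} \right]^2}{(2/3 - 1/2)^2} \approx 90 $$

$$ M \approx \frac{90}{0.3 \times 0.54 + 0.46 \times 0.7} = 186 $$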
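A minimal Python check of the 1:1 matched calculation follows. The function names are hypothetical, and the deviates 1.96 and 1.28 are copied from the worked example.

```python
import math

def discordant_pairs_needed(odds_ratio, z_alpha=1.96, z_beta=1.28):
    """m = [Za/2 + Zb*sqrt(p(1-p))]^2 / (p - 1/2)^2 with p = OR / (1 + OR)."""
    p = odds_ratio / (1 + odds_ratio)
    return (z_alpha / 2 + z_beta * math.sqrt(p * (1 - p))) ** 2 / (p - 0.5) ** 2

def total_pairs_needed(p0, odds_ratio, z_alpha=1.96, z_beta=1.28):
    """M ~ m / (p0*q1 + p1*q0), the total number of case-control pairs."""
    m = discordant_pairs_needed(odds_ratio, z_alpha, z_beta)
    p1 = p0 * odds_ratio / (1 + p0 * (odds_ratio - 1))
    q0, q1 = 1 - p0, 1 - p1
    return m / (p0 * q1 + p1 * q0)

m = discordant_pairs_needed(2.0)
M = total_pairs_needed(0.30, 2.0)
print(round(m, 1), round(M, 1))
```

The denominator p0*q1 + p1*q0 is the expected proportion of discordant pairs, so dividing m by it converts the required number of discordant pairs into the total number of pairs to recruit, about 186 here.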


6. Collection of data on exposure and other factors

• Interviews
• Questionnaires
• Examination of records
• Clinical and laboratory examinations

On the data-collection strategy

• Quality control for exposure measures:
  standardized interview and questionnaire
• Blinding
• Same procedures for cases and controls
• Interviewers should be well trained.
7. Analysis of data
• Descriptive analysis
  Describe the general characteristics of the study
  subjects
  Balance test
  (to check whether the case group and the control group are
  comparable)

• Measures of association
  Chi-square
  Odds Ratio (OR)
  OR 95% Confidence Interval (OR 95% CI)
• Stratified analysis
What is the OR (odds ratio)?
Odds: the probability that a particular event will
occur divided by the probability that the event will
not occur.
Odds ratio: the odds of a particular exposure among
persons with a specific disease divided by the
corresponding odds of exposure among persons
without the disease of interest.
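Exposure    With the disease    Without the disease    Total
Yes         a                   b                      a+b=n1
No          c                   d                      c+d=n0
Total       a+c=m1              b+d=m0                 a+b+c+d=N

• Odds Ratio (OR): OR = ad / bc
• Interpretation: OR = 1, no association between exposure and disease;
  OR > 1, the exposure is associated with a higher risk of disease;
  OR < 1, the exposure is associated with a lower risk of disease.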
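Analysis of Unmatched Design

• Cross-table χ² test

$$ \chi^2 = \frac{(ad - bc)^2 N}{m_1 m_2 n_1 n_2} $$

or, with continuity correction,

$$ \chi^2 = \frac{\left( |ad - bc| - N/2 \right)^2 N}{m_1 m_2 n_1 n_2} $$

where m1 and m2 are the column totals (with and without the disease) and n1 and n2 are the row totals (exposed and unexposed).

• Compute OR

$$ OR = \frac{ad}{bc} $$

• OR 95% Confidence Interval (OR 95% CI)

  Woolf method:

$$ \ln OR\ 95\%\ CI = \ln OR \pm 1.96 \sqrt{Var(\ln OR)}, \qquad Var(\ln OR) = \frac{1}{a} + \frac{1}{b} + \frac{1}{c} + \frac{1}{d} $$

  Miettinen method:

$$ OR\ 95\%\ CI = OR^{\,(1 \pm 1.96/\sqrt{\chi^2})} $$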
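Example
A case-control study of oral contraceptive (OC) use and myocardial infarction (MI)

OC use    Case        Control     Total
Yes       39 (a)      24 (b)      63 (n1)
No        114 (c)     154 (d)     268 (n2)
Total     153 (m1)    178 (m2)    331 (N)

$$ \chi^2 = \frac{(ad - bc)^2 N}{m_1 m_2 n_1 n_2} = \frac{(39 \times 154 - 24 \times 114)^2 \times 331}{153 \times 178 \times 63 \times 268} = 7.70 $$

$\chi^2 = 7.70 > \chi^2_{0.01(1)} = 6.63$, so $P < 0.01$.

$$ OR = \frac{ad}{bc} = \frac{39 \times 154}{24 \times 114} = 2.20 $$

$$ \ln OR\ 95\%\ CI = \ln 2.20 \pm 1.96 \sqrt{\frac{1}{39} + \frac{1}{24} + \frac{1}{114} + \frac{1}{154}} = 0.2252 \sim 1.3517 $$

$$ OR\ 95\%\ CI = \left( e^{0.2252},\ e^{1.3517} \right) = (1.25,\ 3.86) $$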
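The oral contraceptive example can be recomputed with a small Python sketch using only the standard library. The function name `analyse_2x2` is illustrative. It uses the uncorrected χ², the Woolf interval with Var(ln OR) = 1/a + 1/b + 1/c + 1/d, and the Miettinen test-based interval.

```python
import math

def analyse_2x2(a, b, c, d, z=1.96):
    """Chi-square, OR and two 95% CIs for an unmatched 2x2 table.

    a, b = exposed cases, exposed controls; c, d = unexposed cases, unexposed controls.
    """
    n = a + b + c + d
    chi2 = (a * d - b * c) ** 2 * n / ((a + c) * (b + d) * (a + b) * (c + d))
    odds_ratio = (a * d) / (b * c)
    # Woolf: normal approximation on the log scale
    se_ln_or = math.sqrt(1 / a + 1 / b + 1 / c + 1 / d)
    woolf = (math.exp(math.log(odds_ratio) - z * se_ln_or),
             math.exp(math.log(odds_ratio) + z * se_ln_or))
    # Miettinen: test-based limits OR^(1 +/- z / sqrt(chi2))
    half = z / math.sqrt(chi2)
    miettinen = tuple(sorted((odds_ratio ** (1 - half), odds_ratio ** (1 + half))))
    return chi2, odds_ratio, woolf, miettinen

chi2, odds_ratio, woolf, miettinen = analyse_2x2(39, 24, 114, 154)
print(round(chi2, 2), round(odds_ratio, 2))
print([round(x, 2) for x in woolf], [round(x, 2) for x in miettinen])
```

Both intervals exclude 1, in line with the significant χ² test; the Miettinen limits are slightly narrower than the Woolf limits for these data.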
Association of OC and MI, stratified by age

                     Age < 40                               Age ≥ 40
           OC (+)      OC (-)      Total          OC (+)      OC (-)      Total
Case       21 (a1)     26 (b1)     47 (m11)       18 (a2)     88 (b2)     106 (m12)
Control    17 (c1)     59 (d1)     76 (m01)       7 (c2)      95 (d2)     102 (m02)
Total      38 (n11)    85 (n01)    123 (t1)       25 (n12)    183 (n02)   208 (t2)
OR         2.80                                   2.78

Association of age and MI: OR = 0.48, χ² = 7.27
Association of age and OC: OR = 3.91, χ² = 8.89
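Compute OR and χ² by the Mantel-Haenszel formula

$$ OR_{MH} = \frac{\sum_i (a_i d_i / t_i)}{\sum_i (b_i c_i / t_i)} = 2.79 $$

$$ \chi^2_{MH} = \frac{\left[ \sum_i a_i - \sum_i E(a_i) \right]^2}{\sum_i V(a_i)} = 11.79 $$

$$ \sum_i E(a_i) = \sum_i \frac{m_{1i}\, n_{1i}}{t_i} $$

$$ \sum_i V(a_i) = \sum_{i=1}^{I} \frac{m_{1i}\, m_{0i}\, n_{1i}\, n_{0i}}{t_i^2\,(t_i - 1)} $$

$$ OR\ 95\%\ CI = OR^{\,(1 \pm 1.96/\sqrt{\chi^2})} $$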
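The Mantel-Haenszel summary for the two age strata can be sketched as below. The function name and the tuple layout (a, b, c, d) = (exposed cases, unexposed cases, exposed controls, unexposed controls) are assumptions chosen to follow the stratified table; the variance uses the t²(t − 1) denominator.

```python
def mantel_haenszel(strata):
    """Return (OR_MH, chi2_MH) for an iterable of (a, b, c, d) strata.

    a = exposed cases, b = unexposed cases,
    c = exposed controls, d = unexposed controls.
    """
    num = den = sum_a = sum_e = sum_v = 0.0
    for a, b, c, d in strata:
        t = a + b + c + d
        m1, m0 = a + b, c + d            # cases, controls in the stratum
        n1, n0 = a + c, b + d            # exposed, unexposed in the stratum
        num += a * d / t
        den += b * c / t
        sum_a += a
        sum_e += m1 * n1 / t                                 # E(a_i)
        sum_v += m1 * m0 * n1 * n0 / (t ** 2 * (t - 1))      # V(a_i)
    return num / den, (sum_a - sum_e) ** 2 / sum_v

strata = [
    (21, 26, 17, 59),   # age < 40
    (18, 88, 7, 95),    # age >= 40
]
or_mh, chi2_mh = mantel_haenszel(strata)
print(round(or_mh, 2), round(chi2_mh, 2))
```

The pooled OR reproduces 2.79. With these inputs the χ² comes out at about 11.7, slightly below the 11.79 quoted on the slide; the small gap may come from rounding or from a slightly different variance term, which the source does not specify.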
1:1 matched data analysis
Summary data format for a matched case-control
study with one control per case

                      Case
Control       Exposed    Unexposed    Total
Exposed       a          b            a+b
Unexposed     c          d            c+d
Total         a+c        b+d          a+b+c+d

McNemar χ² test: $\chi^2 = (b - c)^2 / (b + c)$
Compute OR: $OR = c / b$
Miettinen method to compute the OR 95% CI:

$$ OR\ 95\%\ CI = OR^{\,(1 \pm 1.96/\sqrt{\chi^2})} $$
Example
Association between infectious mononucleosis (IM) and lymphocytic leukaemia

                      Case
Control       IM (+)     IM (-)       Total
IM (+)        a          b            a+b
IM (-)        c          d            c+d
Total         a+c        b+d          a+b+c+d

With b = 35 and c = 60:

$$ \chi^2 = \frac{(35 - 60)^2}{35 + 60} = 6.58, \qquad P < 0.05 $$

$$ OR = 60/35 = 1.71 $$

$$ OR\ 95\%\ CI = 1.71^{\,(1 \pm 1.96/\sqrt{6.58})} $$

$$ OR_L = 1.14, \qquad OR_U = 2.57 $$
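The matched-pair calculation depends only on the two discordant cells, which a short Python sketch makes explicit. The function name `mcnemar` is illustrative; b is the number of pairs in which only the control is exposed and c the number in which only the case is exposed, following the table layout above.

```python
import math

def mcnemar(b, c, z=1.96):
    """McNemar chi-square, matched OR (c / b) and Miettinen test-based 95% CI."""
    chi2 = (b - c) ** 2 / (b + c)
    odds_ratio = c / b
    half = z / math.sqrt(chi2)
    ci = tuple(sorted((odds_ratio ** (1 - half), odds_ratio ** (1 + half))))
    return chi2, odds_ratio, ci

chi2, odds_ratio, ci = mcnemar(b=35, c=60)
print(round(chi2, 2), round(odds_ratio, 2), [round(x, 2) for x in ci])
```

Carrying the unrounded OR (60/35) through the exponent gives an upper limit of about 2.59 rather than 2.57; the difference is only rounding of the OR to 1.71 before exponentiation.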


Attributable fraction, AF (etiologic fraction, EF)

$$ AF_e = \frac{I_e - I_u}{I_e} = \frac{OR - 1}{OR} $$

$$ AF_p = \frac{I_p - I_u}{I_p} = \frac{P_e (OR - 1)}{1 + P_e (OR - 1)} $$
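As a numerical illustration of the two attributable fractions, the sketch below reuses OR = 2.20 from the oral contraceptive example and takes the exposure proportion among controls (24/178) as a stand-in for the population exposure proportion Pe. That substitution is an assumption made for the illustration, not something stated in the lecture.

```python
def af_exposed(odds_ratio):
    """Attributable fraction among the exposed: (OR - 1) / OR."""
    return (odds_ratio - 1) / odds_ratio

def af_population(p_exposed, odds_ratio):
    """Population attributable fraction: Pe(OR - 1) / [1 + Pe(OR - 1)]."""
    excess = p_exposed * (odds_ratio - 1)
    return excess / (1 + excess)

odds_ratio = 2.20
p_exposed = 24 / 178          # assumed: exposure proportion among controls

afe = af_exposed(odds_ratio)
afp = af_population(p_exposed, odds_ratio)
print(round(afe, 3), round(afp, 3))
```

Under these assumptions about 55% of the disease among the exposed, and about 14% of the disease in the whole population, would be attributed to the exposure.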


8. Bias
What does bias mean?

Any systematic error in the design,
conduct or analysis of a study that results
in a mistaken estimate of an exposure's
effect on the risk of disease.
Types of bias:
• Selection bias
• Information bias (observation bias)
• Confounding

Selection bias
Selection bias is a systematic error in a study
that arises from the manner in which the subjects
are sampled.

• Prevalence-incidence bias
• Admission rate (Berkson’s) bias
• Detection bias
• Time effect bias
Control of selection bias

• Cases should be limited to incident cases, and
should be chosen as random samples of all cases.
• Definitions, ascertainments and exclusions must
always be made explicit.
• At least two control groups should be chosen:
  a. a hospital-based control group
  b. a community-based control group
Information bias (observation bias)
Information bias is a systematic error in a study that
arises from the manner in which data are collected
from participants.
• Recall bias
• Investigation (interviewer) bias

Control:
• Select objective indicators
• Train the investigators
• Keep the examination conditions consistent for cases and controls
Confounding

Confounding is a systematic error in a study that
arises from mixing of the effect of the exposure of
interest with other associated correlates of the
disease outcome.

Control of confounding:
• Restriction by study design
• Matching
• Stratification in the analysis
• Mathematical modelling in the analysis,
such as multiple linear regression, logistic
regression, Cox model
9. Advantages of case-control study

• Useful for studying rare diseases and
diseases of long latency
• Cost effective, quick and easy to complete
• Can be used to generate hypotheses in a
defined population
• Can provide a rapid evaluation of a new disease

10. Disadvantages of case-control study

• Not useful for rare exposures
• Uncertainty of the exposure-disease
relationship
• Inability to provide a direct estimate of risk
• Potential biases, such as selection bias and
recall bias
11. Other Types of Case Control Study

• Nested case-control study
• Case-cohort study

Figure. An example of a nested case-control study (RA denotes rheumatoid arthritis)

Blood drawn on 10 000 individuals, followed for 10 years:
  RA = 200       serologic test on all 200 cases:                 80 (+), 120 (-)
  No RA = 9800   serologic test on a sample of 400 with no RA:    40 (+), 360 (-)

Figure. An example of a case-cohort study (RA denotes rheumatoid arthritis, SLE denotes
systemic lupus erythematosus, AS denotes ankylosing spondylitis)

Blood drawn on 10 000 individuals, followed for 10 years:
  RA = 200       serologic test:    80 (+), 120 (-)
  SLE = 100      serologic test:    20 (+), 80 (-)
  AS = 150       serologic test:    15 (+), 135 (-)
  Sample of 400 from the original 10 000, serologic test:    40 (+), 360 (-)
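Using the counts shown in the two diagrams, the exposure odds ratio for a positive serologic test can be computed against the sampled comparison group. The sketch below is illustrative only: the function name is hypothetical, and treating the single 400-person sample as the common reference for all three diseases is how a case-cohort design is usually analysed, not something the slide works out.

```python
def exposure_odds_ratio(case_pos, case_neg, ref_pos, ref_neg):
    """Odds of a positive test among cases divided by the odds in the reference sample."""
    return (case_pos * ref_neg) / (case_neg * ref_pos)

# Reference sample of 400 drawn from the cohort: 40 test-positive, 360 test-negative.
reference = (40, 360)

# (test-positive, test-negative) counts for each case series in the diagram.
case_series = {
    "RA": (80, 120),
    "SLE": (20, 80),
    "AS": (15, 135),
}

ratios = {
    disease: exposure_odds_ratio(pos, neg, *reference)
    for disease, (pos, neg) in case_series.items()
}
for disease, value in ratios.items():
    print(disease, round(value, 2))
```

The same 400 stored samples serve as the comparison group for every disease, which is the practical advantage of the case-cohort design: only the case series needs new laboratory work for each additional outcome.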
