Week 2: Counting Data
Recall: Cohort vs. Case-Control
            Disease
Exposure    +      -
+           n11    n12
-           n21    n22
Comparing Two Proportions
Example: ABO Hemolytic
Example
                ABO Hemolytic Disease
                Yes     No      Total
Black infant    43      3541    3584
White infant    17      3814    3831
Four Methods:
• Small sample: Fisher’s exact test
• Large sample: Three tests
I. Fisher’s exact test
Disease + Disease -
Fisher’s exact test: Example
                ABO Hemolytic Disease
                Yes     No      Total
Black infant    43      3541    3584
White infant    17      3814    3831
Reject H0 at 5% level
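The Fisher's exact test on the table above can be sketched in a few lines. This is a minimal illustration, assuming the usual two-sided definition (sum the hypergeometric probabilities of all tables that are no more likely than the observed one, with margins fixed). The function name `fisher_exact_two_sided` is made up for this example and is not from the slides.

```python
from math import comb

def fisher_exact_two_sided(a, b, c, d):
    """Two-sided Fisher exact p-value for the 2x2 table [[a, b], [c, d]]."""
    row1, row2, col1 = a + b, c + d, a + c
    n = row1 + row2
    denom = comb(n, col1)

    def prob(k):
        # Hypergeometric probability that the top-left cell equals k
        # when all row and column totals are held fixed.
        return comb(row1, k) * comb(row2, col1 - k) / denom

    k_min = max(0, col1 - row2)
    k_max = min(row1, col1)
    p_obs = prob(a)
    # Add up every table that is no more likely than the observed one
    # (small relative tolerance guards against floating-point ties).
    return sum(p for p in map(prob, range(k_min, k_max + 1))
               if p <= p_obs * (1 + 1e-7))

# ABO hemolytic disease: rows = Black / White infants, columns = Yes / No
p_value = fisher_exact_two_sided(43, 3541, 17, 3814)
print(f"two-sided p-value = {p_value:.6f}")
print("Reject H0 at 5% level" if p_value < 0.05 else "Do not reject H0")
```

With a p-value below 0.05 the sketch reaches the same decision as the slide.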
II. Large Sample Test A
Disease + Disease -
Disease + Disease -
Under H0
Large Sample Test B: Example
IV: Chi-Square Test of independence
Chi-Square Test of independence: Example
data: data1
X-squared = 12.2615, df = 1, p-value = 0.0004624
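The statistic in this output can be reproduced by hand. The sketch below assumes `data1` holds the ABO hemolytic disease table and that the reported value includes the Yates continuity correction, which is the default for 2 x 2 tables in R's `chisq.test`. The function name is illustrative only.

```python
from math import erfc, sqrt

def chisq_2x2_yates(a, b, c, d):
    """Pearson chi-square for a 2x2 table with Yates continuity correction."""
    n = a + b + c + d
    # Shortcut formula: n * (|ad - bc| - n/2)^2 / (product of the four margins)
    numerator = n * max(0.0, abs(a * d - b * c) - n / 2) ** 2
    denominator = (a + b) * (c + d) * (a + c) * (b + d)
    x2 = numerator / denominator
    # Upper-tail probability of a chi-square with 1 df: P(Z^2 > x2)
    p = erfc(sqrt(x2 / 2))
    return x2, p

x2, p = chisq_2x2_yates(43, 3541, 17, 3814)
print(f"X-squared = {x2:.4f}, df = 1, p-value = {p:.7f}")
```

Without the `- n / 2` term the statistic would be the uncorrected Pearson chi-square, which is somewhat larger for these data.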
Summary: Tests of Two ind. Bin RV
Measures of Effects for Bin RV
1. Risk Difference
• Let
p1 = probability of developing disease
for exposed individuals;
p2 = probability of developing disease
for unexposed individuals
• Risk Difference = p1 – p2
2. Relative Risk
3. Odds Ratio
and is estimated by
Example (Continued)
                ABO Hemolytic Disease
                Yes     No      Total
Black infant    43      3541    3584
White infant    17      3814    3831
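The three measures of effect can be computed directly from this table. The sketch below treats Black infants as the "exposed" row and White infants as the "unexposed" row, which is an assumption about how the example is set up. The function name is illustrative.

```python
def effect_measures(a, b, c, d):
    """Risk difference, relative risk and odds ratio for the table
    [[a, b], [c, d]] with rows = exposed / unexposed, columns = disease yes / no."""
    p1 = a / (a + b)   # estimated risk among the exposed
    p2 = c / (c + d)   # estimated risk among the unexposed
    return {
        "risk_difference": p1 - p2,
        "relative_risk": p1 / p2,
        "odds_ratio": (a * d) / (b * c),
    }

measures = effect_measures(43, 3541, 17, 3814)
for name, value in measures.items():
    print(f"{name}: {value:.4f}")
```

Because the disease is rare in both groups, the odds ratio and the relative risk come out close to each other.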
Hypothetical Case-Control Study
Sample          Disease +    Disease -
Exposed +       a            b
Exposed -       c            d

Population      Disease +    Disease -
Exposed +       A            B
Exposed -       C            D
Sample RR and Population RR
Sample          Disease +    Disease -
Exposed +       a            b
Exposed -       c            d

Population      Disease +    Disease -
Exposed +       A            B
Exposed -       C            D

a = f1 A, c = f1 C, b = f2 B, d = f2 D
Hypothetical Case-Control Study
Sample OR and Population OR
Sample          Disease +    Disease -
Exposed +       a            b
Exposed -       c            d

Population      Disease +    Disease -
Exposed +       A            B
Exposed -       C            D

a = f1 A, c = f1 C, b = f2 B, d = f2 D
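A small numerical check may help show why the relation a = f1 A, c = f1 C, b = f2 B, d = f2 D matters. In the sketch below the population counts and the sampling fractions f1 (for cases) and f2 (for controls) are hypothetical values chosen only for illustration. They are not taken from the slides.

```python
def relative_risk(a, b, c, d):
    return (a / (a + b)) / (c / (c + d))

def odds_ratio(a, b, c, d):
    return (a * d) / (b * c)

# Hypothetical population counts (illustrative only)
A, B, C, D = 200, 9800, 100, 19900
# Hypothetical sampling fractions: cases are sampled far more heavily than controls
f1, f2 = 0.9, 0.01

# Case-control sample implied by the sampling fractions
a, c = f1 * A, f1 * C   # diseased column
b, d = f2 * B, f2 * D   # disease-free column

pop_rr, sample_rr = relative_risk(A, B, C, D), relative_risk(a, b, c, d)
pop_or, sample_or = odds_ratio(A, B, C, D), odds_ratio(a, b, c, d)

print(f"RR: population {pop_rr:.3f}, sample {sample_rr:.3f}")
print(f"OR: population {pop_or:.3f}, sample {sample_or:.3f}")
```

In the sample OR the factors f1 and f2 cancel, (f1 A)(f2 D) / ((f2 B)(f1 C)) = AD / (BC), so the sample OR matches the population OR. The sample RR does not have this property.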
Hypothetical Case-Control Study: OR
Estimation of OR
Sample          Disease +    Disease -
Exposed +       a            b
Exposed -       c            d
Summary: RR and OR
Disease + Disease - Total
Example: Smoking-Perinatal Mortality
Smoking-Perinatal Mortality: OR
Smoking-Perinatal Mortality: Tests
Smoking-Perinatal Mortality: CI
Summary
New Topic: Mantel-Haenszel Method
Combination of 2 x 2 Tables
• The association is studied in separate
subgroups of the data, where the subgroups
are defined by a third variable that is
associated with both disease and exposure.
• How to combine the information across tables
to make a single, unifying statement?
• Need to “adjust” for the effect of
“confounding” variables
• Answer: Mantel-Haenszel Method
Example 1
Smoking and Aortic Stenosis
Males                          Females
Aortic      Smoker             Aortic      Smoker
Stenosis    Yes    No          Stenosis    Yes    No
Yes         37     25          Yes         14     29
No          24     20          No          19     47
Sum of the Two Tables
Males                          Females
Aortic      Smoker             Aortic      Smoker
Stenosis    Yes    No          Stenosis    Yes    No
Yes         37     25          Yes         14     29
No          24     20          No          19     47

Males and females combined
Aortic      Smoker
Stenosis    Yes    No
Yes         51     54
No          43     67
Smoking and Aortic Stenosis
(regardless of gender)
• If the effect of gender is ignored, the strength
of the association between smoking and aortic
stenosis appears greater than it is for either
males or females alone.
• This is an example of Simpson’s paradox,
which occurs when a confounder is present.
Note: here Simpson’s paradox shows up as an odds ratio that increases when the two
subgroups, which are separated by a confounding variable, are pooled.
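The claim can be checked numerically from the smoking and aortic stenosis counts. This is a minimal sketch with illustrative variable names. Each tuple lists the cells as (stenosis and smoker, stenosis and nonsmoker, no stenosis and smoker, no stenosis and nonsmoker).

```python
def odds_ratio(a, b, c, d):
    return (a * d) / (b * c)

males = (37, 25, 24, 20)
females = (14, 29, 19, 47)
# Pool the strata by adding the tables cell by cell
pooled = tuple(m + f for m, f in zip(males, females))

or_males = odds_ratio(*males)
or_females = odds_ratio(*females)
or_pooled = odds_ratio(*pooled)

print(f"OR males   = {or_males:.2f}")
print(f"OR females = {or_females:.2f}")
print(f"OR pooled  = {or_pooled:.2f}")
```

The pooled odds ratio is larger than either stratum-specific odds ratio, which is the pattern the bullets describe.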
Example 2: lung cancer & Drinking
                   Lung Cancer
Drinking Status    Yes    No      Total
Heavy drinker      33     1667    1700
Nondrinker         27     2273    2300
Total              60     3940    4000

Smokers                            Nonsmokers
Drinking      Lung Cancer          Drinking      Lung Cancer
Status        Yes    No            Status        Yes    No
Heavy         24     776           Heavy         9      891
Nondrinker    6      194           Nondrinker    21     2079
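A short calculation on these counts may make the role of smoking concrete. The sketch below compares the crude odds ratio with the stratum-specific ones and then checks how smoking relates to drinking and to lung cancer. Variable names are illustrative.

```python
def odds_ratio(a, b, c, d):
    return (a * d) / (b * c)

# Cells: (heavy & cancer, heavy & no cancer, nondrinker & cancer, nondrinker & no cancer)
smokers = (24, 776, 6, 194)
nonsmokers = (9, 891, 21, 2079)
crude = tuple(s + n for s, n in zip(smokers, nonsmokers))

or_crude = odds_ratio(*crude)
or_smokers = odds_ratio(*smokers)
or_nonsmokers = odds_ratio(*nonsmokers)

# How smoking relates to the exposure (drinking) and to the disease (lung cancer)
heavy_among_smokers = (24 + 776) / sum(smokers)
heavy_among_nonsmokers = (9 + 891) / sum(nonsmokers)
cancer_among_smokers = (24 + 6) / sum(smokers)
cancer_among_nonsmokers = (9 + 21) / sum(nonsmokers)

print(f"crude OR = {or_crude:.2f}")
print(f"OR within smokers = {or_smokers:.2f}, within nonsmokers = {or_nonsmokers:.2f}")
print(f"heavy drinkers: {heavy_among_smokers:.0%} of smokers, "
      f"{heavy_among_nonsmokers:.0%} of nonsmokers")
print(f"lung cancer: {cancer_among_smokers:.0%} of smokers, "
      f"{cancer_among_nonsmokers:.0%} of nonsmokers")
```

Within each smoking stratum the odds ratio is 1, so the crude association between drinking and lung cancer is produced entirely by smoking, which is related to both.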
Lung cancer & drinking after smoking
Example 3: A Confounder Can “Hide” an Association
Subgroup 1                     Subgroup 2
            Disease                        Disease
Exposure    +      -           Exposure    +      -
+           60     100         +           50     10
-           10     50          -           100    60

Pooled
            Disease
Exposure    +      -
+           110    110
-           110    110
Control Confounder?
Confounder
• Positive Confounder is a confounder that either
1. is positively related to both exposure and disease, or
2. is negatively related to both exposure and disease
• Negative Confounder is a confounder that either
1. is positively related to disease and negatively related
to exposure, or
2. is negatively related to disease and positively related
to exposure
• If a positive (negative) confounder exists, the
“individual” ORs are lower (greater) than the
“pooled” OR.
Note: if the confounder is related to exposure and disease in the same direction (both
positive or both negative), it is a positive confounder, and the pooled odds ratio is
higher than the individual ones. The reverse holds for a negative confounder.
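The last bullet suggests a simple empirical check: compare the pooled OR with the stratum-specific ORs. The sketch below applies that rule to two sets of counts from the examples in these slides. The helper `classify_confounder` and its "unclear" fallback are assumptions made for illustration, not part of the lecture.

```python
def odds_ratio(a, b, c, d):
    return (a * d) / (b * c)

def classify_confounder(strata):
    """Label the stratifying variable by comparing pooled and stratum ORs."""
    stratum_ors = [odds_ratio(*cells) for cells in strata]
    pooled_cells = [sum(cells) for cells in zip(*strata)]
    pooled_or = odds_ratio(*pooled_cells)
    if pooled_or > max(stratum_ors):
        return "positive"   # individual ORs are lower than the pooled OR
    if pooled_or < min(stratum_ors):
        return "negative"   # individual ORs are greater than the pooled OR
    return "unclear"        # pooled OR falls between the stratum ORs

# Smoking and aortic stenosis, stratified by gender
aortic = [(37, 25, 24, 20), (14, 29, 19, 47)]
# Example 3, where pooling hides the association
hidden = [(60, 100, 10, 50), (50, 10, 100, 60)]

print("gender in the aortic stenosis example:", classify_confounder(aortic))
print("subgroup variable in Example 3:", classify_confounder(hidden))
```

In Example 3 both subgroups have an OR of 3 while the pooled table has an OR of 1, so the rule labels the subgroup variable a negative confounder.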
What type of Confounder is Gender?
Males                          Females
Aortic      Smoker             Aortic      Smoker
Stenosis    Yes    No          Stenosis    Yes    No
Yes         37     25          Yes         14     29
No          24     20          No          19     47
Positive Confounder!
Males and females combined
Aortic      Smoker
Stenosis    Yes    No
Yes         51     54
No          43     67
Positive Confounder since “pooled” OR is greater!
Note: if the odds ratios differ across groups, the groups should not be combined;
they do not share a common relationship, so treat the two groups separately.
Mantel-Haenszel Method
Coffee and Myocardial Infarction
Smokers                          Nonsmokers
Myocardial    Coffee             Myocardial    Coffee
Infarction    Yes     No         Infarction    Yes    No
Yes           1011    81         Yes           383    66
No            390     77         No            365    123
Smokers and nonsmokers combined
Myocardial    Coffee
Infarction    Yes     No
Yes           1394    147
No            755     200
• Logarithm of estimated OR is
• Weighted average is
• Test statistic:
Note: if the value of the test statistic is small, we do not reject the null
hypothesis that all strata have the same OR.
Test of Homogeneity
• Test statistic:
• Why weight?
Smokers                          Nonsmokers
Myocardial    Coffee             Myocardial    Coffee
Infarction    Yes     No         Infarction    Yes    No
Yes           1011    81         Yes           383    66
No            390     77         No            365    123
Note: log means the natural logarithm (base e) on the midterm.
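The formulas for the test of homogeneity did not survive in these notes, so the sketch below uses a common textbook form as an assumption: y_i = ln(OR_i), weights w_i = 1 / (1/a_i + 1/b_i + 1/c_i + 1/d_i), weighted average Y = sum(w_i y_i) / sum(w_i), and X2 = sum(w_i (y_i - Y)^2) with g - 1 degrees of freedom for g strata. The weights it produces agree with the values 34.62 and 34.93 quoted later for the confidence interval.

```python
from math import erfc, log, sqrt

# Coffee and myocardial infarction: (MI & coffee, MI & no coffee,
#                                    no MI & coffee, no MI & no coffee)
strata = {"smokers": (1011, 81, 390, 77), "nonsmokers": (383, 66, 365, 123)}

def log_or_and_weight(a, b, c, d):
    y = log((a * d) / (b * c))               # log of the estimated OR
    w = 1.0 / (1 / a + 1 / b + 1 / c + 1 / d)  # inverse of its estimated variance
    return y, w

ys, ws = zip(*(log_or_and_weight(*cells) for cells in strata.values()))

# Weighted average of the stratum log ORs
Y = sum(w * y for w, y in zip(ws, ys)) / sum(ws)

# Homogeneity statistic; with two strata it has 1 degree of freedom
x2_homogeneity = sum(w * (y - Y) ** 2 for w, y in zip(ws, ys))
p_value = erfc(sqrt(x2_homogeneity / 2))     # chi-square(1) upper tail

print("weights:", [round(w, 2) for w in ws])
print(f"X2 = {x2_homogeneity:.3f}, p-value = {p_value:.3f}")
```

The weights answer the "Why weight?" bullet in practice: a stratum whose log OR is estimated more precisely (smaller variance) gets a larger weight in Y.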
Coffee and Myocardial Infarction
ni = ai+bi+ci+di
Summary Odds Ratio: Example
Smokers                          Nonsmokers
Myocardial    Coffee             Myocardial    Coffee
Infarction    Yes     No         Infarction    Yes    No
Yes           1011    81         Yes           383    66
No            390     77         No            365    123
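The summary odds ratio for these two tables can be sketched as follows. The formula used here is the standard Mantel-Haenszel estimator, OR_MH = sum(a_i d_i / n_i) / sum(b_i c_i / n_i) with n_i = a_i + b_i + c_i + d_i. It is an assumption that this is the form shown on the slide, since the slide's own formula is not preserved in these notes.

```python
# Coffee and myocardial infarction: (MI & coffee, MI & no coffee,
#                                    no MI & coffee, no MI & no coffee)
strata = [(1011, 81, 390, 77), (383, 66, 365, 123)]

def mantel_haenszel_or(tables):
    """Mantel-Haenszel summary odds ratio across a list of 2x2 tables."""
    numerator = 0.0
    denominator = 0.0
    for a, b, c, d in tables:
        n = a + b + c + d
        numerator += a * d / n
        denominator += b * c / n
    return numerator / denominator

or_mh = mantel_haenszel_or(strata)
print(f"summary OR = {or_mh:.2f}")
```

The summary value lies between the two stratum-specific odds ratios, as expected for a weighted combination.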
Confidence Interval: 1
• Estimated summary OR is
where
Confidence Interval: 2
CI of Summary Odds Ratio
• In the example,
\widehat{se}(Y) = \frac{1}{\sqrt{w_1 + w_2}} = \frac{1}{\sqrt{34.62 + 34.93}} = 0.120
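The standard error above leads to a confidence interval once Y is known. The sketch below assumes that Y is the weighted average of the stratum log odds ratios, with weights w_i = 1 / (1/a_i + 1/b_i + 1/c_i + 1/d_i), and that a 95% interval is built as Y ± 1.96 se(Y) on the log scale and then exponentiated. These are assumptions, since the slide formulas are not preserved here.

```python
from math import exp, log, sqrt

# Coffee and myocardial infarction strata (smokers, nonsmokers)
strata = [(1011, 81, 390, 77), (383, 66, 365, 123)]

ys = [log((a * d) / (b * c)) for a, b, c, d in strata]
ws = [1.0 / (1 / a + 1 / b + 1 / c + 1 / d) for a, b, c, d in strata]

Y = sum(w * y for w, y in zip(ws, ys)) / sum(ws)   # weighted mean log OR
se_Y = 1.0 / sqrt(sum(ws))                         # 1 / sqrt(w1 + w2)

z = 1.96                                           # 95% normal quantile
log_lower, log_upper = Y - z * se_Y, Y + z * se_Y
ci_lower, ci_upper = exp(log_lower), exp(log_upper)

print(f"se(Y) = {se_Y:.3f}")
print(f"95% CI for the summary OR: ({ci_lower:.2f}, {ci_upper:.2f})")
```

An interval that excludes 1 points to an association between coffee drinking and myocardial infarction after adjusting for smoking.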
Mantel-Haenszel Method
Step 3: Test of Association
• To test whether the summary odds ratio is
equal to 1
Note: an OR of 1 (0 on the log scale) means there is no association.
Test of Association
             Exposure +    Exposure -    Total
Disease +    ai            bi            ai + bi
Disease -    ci            di            ci + di
Total        ai + ci       bi + di       ni
(under H0)
• Test statistic:
Coffee and Myocardial Infarction
• The test statistic: X2 = 43.68
pvalue = 1 - pchisq(43.68, 1)  # in R
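The test of association can be sketched from the stratum tables. The version below assumes the usual Mantel-Haenszel form without continuity correction: X2 = (sum(a_i) - sum(m_i))^2 / sum(v_i), where m_i = (a_i + b_i)(a_i + c_i) / n_i is the expected count under H0 and v_i = (a_i + b_i)(c_i + d_i)(a_i + c_i)(b_i + d_i) / (n_i^2 (n_i - 1)) is its variance. The result may differ from the quoted 43.68 in the first or second decimal place, for example because of rounded intermediate values or a slightly different variance convention.

```python
from math import erfc, sqrt

# Coffee and myocardial infarction: (MI & coffee, MI & no coffee,
#                                    no MI & coffee, no MI & no coffee)
strata = [(1011, 81, 390, 77), (383, 66, 365, 123)]

def expected_and_variance(a, b, c, d):
    """Mean and variance of cell a under H0, with all margins fixed."""
    n = a + b + c + d
    mean = (a + b) * (a + c) / n
    variance = (a + b) * (c + d) * (a + c) * (b + d) / (n ** 2 * (n - 1))
    return mean, variance

observed = sum(cells[0] for cells in strata)
moments = [expected_and_variance(*cells) for cells in strata]
expected = sum(m for m, _ in moments)
variance = sum(v for _, v in moments)

x2_association = (observed - expected) ** 2 / variance
p_value = erfc(sqrt(x2_association / 2))   # chi-square(1) upper tail

print(f"X2 = {x2_association:.2f}, p-value = {p_value:.2e}")
```

The `erfc` expression plays the same role as `1 - pchisq(x, 1)` in the R line above.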
Summary
• Confounder:
• Positive Confounder
• Negative Confounder
• Mantel-Haenszel Method
1. Test of Homogeneity
2. Summary Odds Ratio
3. Test of Association