Hypothesis Testing and Sample Size Calculation: Po Chyou, Ph. D. Director, BBC

Hypothesis Testing on
• Population mean(s)
• Population median(s)
• Population proportion(s)
• Population variance(s)
• Population correlation(s)
• Association based on contingency table(s)
• Goodness of fit
• Coefficients based on regression model
• Odds ratio
• Relative risk
• Trend analysis
• Survival distribution(s) / curve(s)
Hypothesis Testing
1. Definition of a Hypothesis
An assumption made for the sake of argument
2. Establishing Hypothesis
Null hypothesis - H0
Alternative hypothesis - Ha
3. Testing Hypotheses
Is H0 true or not?
Hypothesis Testing
4. Type I and Type II Errors
Type I error: we reject H0 but H0 is true
α = Pr(reject H0 | H0 is true) = Pr(Type I error)
= Level of significance in hypothesis testing
Type II error: we do not reject H0 but H0 is false
β = Pr(do not reject H0 | H0 is false) = Pr(Type II error)
Hypothesis Testing
5. Steps of Hypothesis Testing
- Step 1 Formulate the null hypothesis H0 in statistical terms
- Step 2 Formulate the alternative hypothesis Ha in statistical terms
- Step 3 Set the level of significance α and the sample size n
- Step 4 Select the appropriate statistic and the rejection region R
- Step 5 Collect the data and calculate the statistic
- Step 6 If the calculated statistic falls in the rejection region R, reject H0 in favor of Ha; if the calculated statistic falls outside R, do not reject H0
[Figure: standard normal density of Z with a two-tailed rejection region. Reject H0 if Z < -Z(α/2) or Z > Z(α/2), with area α/2 in each tail; do not reject H0 otherwise.]
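The two-tailed decision rule in Step 6 can be sketched in a few lines of Python. This is only an illustration, assuming the test statistic is approximately standard normal under H0; the function names `two_tailed_critical_value` and `reject_h0` are made up for this sketch and do not come from the slides.

```python
from statistics import NormalDist


def two_tailed_critical_value(alpha: float) -> float:
    """Return Z(alpha/2), the cutoff that leaves alpha/2 in each tail."""
    return NormalDist().inv_cdf(1 - alpha / 2)


def reject_h0(z: float, alpha: float = 0.05) -> bool:
    """Step 6: reject H0 when the statistic falls in the rejection region R."""
    return abs(z) > two_tailed_critical_value(alpha)


if __name__ == "__main__":
    # Critical value for alpha = .05, rounded to two decimals
    print(round(two_tailed_critical_value(0.05), 2))
    # A statistic inside and one outside the "do not reject" region
    print(reject_h0(1.2), reject_h0(2.5))
```

For a one-tailed test the cutoff would be computed from 1 - α instead of 1 - α/2, and only one tail would be checked.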
Hypothesis Testing
6. An Example
A random sample of 400 persons included 240 smokers and 160 non-smokers. Of the smokers, 192 had CHD, while only 32 non-smokers had CHD.
Could a health insurance company claim the proportion of smokers having CHD differs from the proportion of non-smokers having CHD?

              CHD    No CHD     Total
Smokers       x1     n1 - x1    n1
Non-Smokers   x2     n2 - x2    n2
Total                           n = n1 + n2

              CHD    No CHD     Total
Smokers       192    48         240
Non-Smokers   32     128        160
Total                           400
Hypothesis Testing
Example (continued)
Let P1 = the true proportion of smokers having CHD
    P2 = the true proportion of non-smokers having CHD
- Step 1 H0 : P1 = P2
- Step 2 Ha : P1 ≠ P2
- Step 3 α = .05, n = 400

Hypothesis Testing
Example (continued)
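- Step 4 statistic:
$$ Z = \frac{\hat{P}_1 - \hat{P}_2}{\sqrt{\hat{P}(1-\hat{P})\left(\frac{1}{n_1} + \frac{1}{n_2}\right)}} $$
where the sample proportions are
$$ \hat{P}_1 = \frac{x_1}{n_1}, \qquad \hat{P}_2 = \frac{x_2}{n_2}, \qquad \hat{P} = \frac{x_1 + x_2}{n_1 + n_2} $$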
[Figure: standard normal density of Z with α/2 = .025 in each tail. Reject H0 if Z < -Z(.025) = -1.96 or Z > Z(.025) = 1.96; do not reject H0 otherwise.]

Hypothesis Testing
Example (continued)
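- Step 5

              CHD    No CHD     Total
Smokers       192    48         240
Non-Smokers   32     128        160
Total                           400

$$ \hat{P}_1 = \frac{x_1}{n_1} = \frac{192}{240} = .80, \qquad \hat{P}_2 = \frac{x_2}{n_2} = \frac{32}{160} = .20 $$
$$ \hat{P} = \frac{x_1 + x_2}{n_1 + n_2} = \frac{192 + 32}{240 + 160} = \frac{224}{400} = .56 $$
$$ Z = \frac{\hat{P}_1 - \hat{P}_2}{\sqrt{\hat{P}(1-\hat{P})\left(\frac{1}{n_1} + \frac{1}{n_2}\right)}} = \frac{.80 - .20}{\sqrt{(.56)(1-.56)\left(\frac{1}{240} + \frac{1}{160}\right)}} = \frac{.60}{.05066} = 11.84 > 1.96 $$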
Hypothesis Testing
Example (continued)
- Step 6
Reject H0 and conclude that smokers had a significantly higher proportion of CHD than non-smokers. [P-value < .0000001]
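The arithmetic of this example can be checked with a short Python sketch. It uses the pooled two-proportion Z statistic from Step 4 and a normal approximation for the two-sided P-value; the function name `two_proportion_z` is an illustrative choice, not something taken from the slides.

```python
from math import erfc, sqrt


def two_proportion_z(x1: int, n1: int, x2: int, n2: int) -> float:
    """Pooled two-sample Z statistic for H0: P1 = P2."""
    p1 = x1 / n1                      # sample proportion, group 1
    p2 = x2 / n2                      # sample proportion, group 2
    p = (x1 + x2) / (n1 + n2)         # pooled proportion under H0
    se = sqrt(p * (1 - p) * (1 / n1 + 1 / n2))
    return (p1 - p2) / se


# Smokers: 192 of 240 with CHD; non-smokers: 32 of 160 with CHD
z = two_proportion_z(192, 240, 32, 160)

# Two-sided P-value, Pr(|Z| > |z|), written with erfc so that very
# small tail probabilities are not rounded to zero
p_value = erfc(abs(z) / sqrt(2))

print(round(z, 2), p_value < 1e-7)
```

Using `erfc` rather than `1 - cdf` is a numerical convenience only: for a statistic this far in the tail, the subtraction would lose all precision.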
Hypothesis Testing
7. Contingency Table Analysis
The chi-square distribution (χ²)
[Figure: chi-square density with 1 degree of freedom and α = .05 in the upper tail. Reject H0 if χ² > χ²(.05, 1) = 3.841; do not reject H0 otherwise.]
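The critical value 3.841 can be recovered from the standard normal distribution, because a chi-square variable with 1 degree of freedom is the square of a standard normal variable. The sketch below relies on that fact; it applies only to 1 degree of freedom, and the function name `chi2_critical_df1` is hypothetical.

```python
from statistics import NormalDist


def chi2_critical_df1(alpha: float) -> float:
    """Upper-tail critical value of chi-square with 1 degree of freedom.

    Pr(chi-square > c) = alpha is the same event as Pr(|Z| > sqrt(c)) = alpha,
    so c is the square of the two-tailed normal cutoff Z(alpha/2).
    """
    z = NormalDist().inv_cdf(1 - alpha / 2)
    return z * z


print(round(chi2_critical_df1(0.05), 3))
```

For tables larger than 2 x 2 the degrees of freedom exceed 1 and this shortcut no longer holds; a chi-square table or a statistics library would be needed instead.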

Hypothesis Testing
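Equation for chi-square for a contingency table
$$ \chi^2 = \sum_{i,j} \frac{(O_{ij} - E_{ij})^2}{E_{ij}} $$
For i = 1, 2 and j = 1, 2:
$$ \chi^2 = \frac{(O_{11} - E_{11})^2}{E_{11}} + \frac{(O_{12} - E_{12})^2}{E_{12}} + \frac{(O_{21} - E_{21})^2}{E_{21}} + \frac{(O_{22} - E_{22})^2}{E_{22}} $$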
Hypothesis Testing
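Equation for chi-square for a contingency table (cont.)
$$ E_{11} = \frac{n_1 m_1}{n}, \qquad E_{12} = n_1 - \frac{n_1 m_1}{n} = \frac{n_1 m_2}{n} $$
$$ E_{21} = \frac{n_2 m_1}{n}, \qquad E_{22} = n_2 - \frac{n_2 m_1}{n} = \frac{n_2 m_2}{n} $$

Expected counts:
E11    E12    n1
E21    E22    n2
m1     m2

Observed counts:
O11    O12    n1
O21    O22    n2
m1     m2     n = n1 + n2 = m1 + m2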
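As a minimal sketch of the expected-count formulas, each expected count is the row total times the column total divided by n. The function name `expected_counts` is an illustrative choice; the input used here is the smoker/CHD table from the earlier example.

```python
def expected_counts(observed):
    """Expected counts E_ij = (row total i) * (column total j) / n."""
    row_totals = [sum(row) for row in observed]           # n1, n2
    col_totals = [sum(col) for col in zip(*observed)]     # m1, m2
    n = sum(row_totals)
    return [[r * c / n for c in col_totals] for r in row_totals]


# Rows: smokers, non-smokers; columns: CHD, no CHD
observed = [[192, 48], [32, 128]]
for row in expected_counts(observed):
    print([round(e, 1) for e in row])
```

The same function works for tables with more than two rows or columns, since it only uses the row and column totals.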
Hypothesis Testing
Example : Same as before
- Step 1 H0 : there is no association between smoker status and CHD
- Step 2 Ha : there is an association between smoker status and CHD
- Step 3 α = .05, n = 400
- Step 4 statistic:
$$ \chi^2 = \frac{(O_{11} - E_{11})^2}{E_{11}} + \frac{(O_{12} - E_{12})^2}{E_{12}} + \frac{(O_{21} - E_{21})^2}{E_{21}} + \frac{(O_{22} - E_{22})^2}{E_{22}} $$

Hypothesis Testing
Example (continued) : Same as before
[Figure: chi-square density with 1 degree of freedom and α = .05 in the upper tail. Reject H0 if χ² > χ²(.05, 1) = 3.841; do not reject H0 otherwise.]

Hypothesis Testing
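- Step 5

Observed counts (notation):
              CHD    No CHD    Total
Smokers       O11    O12       n1
Non-Smokers   O21    O22       n2
Total         m1     m2        n

Observed counts (data):
              CHD    No CHD    Total
Smokers       192    48        240
Non-Smokers   32     128       160
Total         224    176       400

Expected counts (notation):
              CHD    No CHD
Smokers       E11    E12
Non-Smokers   E21    E22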
Hypothesis Testing
- Step 5 (continued)
$$ E_{11} = \frac{n_1 m_1}{n} = \frac{240 \times 224}{400} = 134.4 $$
$$ E_{12} = n_1 - \frac{n_1 m_1}{n} = 240 - 134.4 = 105.6 $$
$$ E_{21} = \frac{n_2 m_1}{n} = \frac{160 \times 224}{400} = 89.6 $$
$$ E_{22} = n_2 - \frac{n_2 m_1}{n} = 160 - 89.6 = 70.4 $$

Expected counts:
              CHD      No CHD
Smokers       134.4    105.6
Non-Smokers   89.6     70.4
Hypothesis Testing
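- Step 5 (continued)
$$ \chi^2 = \frac{(O_{11} - E_{11})^2}{E_{11}} + \frac{(O_{12} - E_{12})^2}{E_{12}} + \frac{(O_{21} - E_{21})^2}{E_{21}} + \frac{(O_{22} - E_{22})^2}{E_{22}} $$
$$ = \frac{(192 - 134.4)^2}{134.4} + \frac{(48 - 105.6)^2}{105.6} + \frac{(32 - 89.6)^2}{89.6} + \frac{(128 - 70.4)^2}{70.4} $$
$$ = 24.686 + 31.418 + 37.029 + 47.127 $$
$$ = 140.26 > 3.841 $$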

Hypothesis Testing
- Step 6
Reject H0 and conclude that there is an association between smoker status and CHD. [P-value < .0000001]
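The chi-square calculation for this example can be reproduced with the following Python sketch. It applies the contingency-table formula without a continuity correction, as in the slides. The P-value line assumes 1 degree of freedom, where the upper-tail probability can be written with `erfc`. The function names are illustrative only.

```python
from math import erfc, sqrt


def chi_square_statistic(observed):
    """Sum of (O - E)^2 / E over all cells, with E = row total * column total / n."""
    row_totals = [sum(row) for row in observed]
    col_totals = [sum(col) for col in zip(*observed)]
    n = sum(row_totals)
    chi2 = 0.0
    for i, row in enumerate(observed):
        for j, o in enumerate(row):
            e = row_totals[i] * col_totals[j] / n
            chi2 += (o - e) ** 2 / e
    return chi2


def p_value_df1(chi2: float) -> float:
    """Upper-tail probability for 1 degree of freedom: erfc(sqrt(chi2 / 2))."""
    return erfc(sqrt(chi2 / 2))


# Rows: smokers, non-smokers; columns: CHD, no CHD
chi2 = chi_square_statistic([[192, 48], [32, 128]])
print(round(chi2, 2), chi2 > 3.841, p_value_df1(chi2) < 1e-7)
```

For a 2 x 2 table this statistic equals the square of the pooled two-proportion Z statistic, which is why 140.26 agrees with 11.84 squared up to rounding and both tests lead to the same conclusion.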
Sample Size Estimation and Statistical Power Calculation
Definition of Power
Recall:
β = Pr(do not reject H0 | H0 is false) = Pr(Type II error)
Power = 1 - β = Pr(reject H0 | H0 is false)
Sample Size Estimation for Intervention on Tick Bites Among Campers
Assumptions
1. Given that the proportion (PCON) of tick bites among campers in the control group is constant.
2. Given that the proportion (PINT) of tick bites among campers in the intervention group is reduced by 50% compared to that of the control group after intervention has been implemented.
3. Given that a one- or two-tailed test is of interest with 80% power and a type-I error of 5%.
Sample Size Estimation for Intervention on Tick Bites Among Campers
Summary Table 1: Required N for each group

PCON    PINT                     Two-tailed    One-tailed
.01     .005 (50% reduction)     4500          3600
.05     .025 (50% reduction)     1170          922
.10     .050 (50% reduction)     475           374
.15     .075 (50% reduction)     305           240

Intervention on Obesity of Women in MESA
Assumptions
1. Given that the proportion (PCON) of women who are obese at baseline (i.e., the control group) is constant. There are a total of 840 women in the control group. Based on our preliminary data analysis results, approximately 50% of these 840 women at baseline are obese (BMI ≥ 27.3).
2. Given that the proportion (PINT) of women who are obese in the intervention group is reduced by 5% or more compared to that of the control group after intervention has been implemented. There are a total of 680 women who had been newly recruited. Based on our preliminary data analysis results, 50% of these 680 newly recruited women are obese. Assuming that 60% of these women will agree to participate, we will have 200 women to be targeted for intervention.
Intervention on Obesity of Women in MESA (continued)
Assumptions
3. Given that a one-tailed test is of interest with a type-I error of 5%, then the estimated statistical powers are shown in Table 1 for detecting a difference of 5% or more in the proportion of obesity between the control group and the intervention group.

Table 1
PCON (n=840)    PINT (n=680)    Difference    Power
.50             .45             .05           61%
.50             .44             .06           75%
.50             .43             .07           85%
.50             .42             .08           92%
.50             .41             .09           96%
.50             .40             .10           98%
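The slides do not state which method produced Table 1. As a rough cross-check only, the sketch below uses a textbook normal approximation for a one-tailed comparison of two independent proportions with an unpooled standard error, with the group sizes shown in the table header (840 and 680). Both the formula choice and the function name `power_two_proportions` are assumptions of this sketch, not the author's method. The resulting powers land within about one percentage point of the table.

```python
from math import sqrt
from statistics import NormalDist


def power_two_proportions(p_con, n_con, p_int, n_int, alpha=0.05):
    """Approximate one-tailed power to detect p_con > p_int.

    Assumption: normal approximation with an unpooled standard error;
    the slides may have used a different method (e.g. published tables).
    """
    norm = NormalDist()
    se = sqrt(p_con * (1 - p_con) / n_con + p_int * (1 - p_int) / n_int)
    z_alpha = norm.inv_cdf(1 - alpha)            # one-tailed cutoff
    return norm.cdf((p_con - p_int) / se - z_alpha)


for p_int in (0.45, 0.44, 0.43, 0.42, 0.41, 0.40):
    power = power_two_proportions(0.50, 840, p_int, 680)
    print(f"PCON=.50  PINT={p_int:.2f}  power={power:.3f}")
```

Power rises with the size of the difference to be detected, which is the pattern Table 1 shows.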
Reference
Cohen, J. (1977). Statistical Power Analysis for the Behavioral Sciences. Academic Press.

Take Home Message:
• You’ve got questions: Data? Statistics? ...
• Contact Biostatistics and consult with an experienced biostatistician
  – Po Chyou, Director, Senior Biostatistician (ext. 9-4776)
  – Dixie Schroeder, Secretary (ext. 1-7266)
OR
• Do it at your own risk

Free Handout
