Is Bigger Better?: An Introduction To Sample Size Calculations

Presented by:
Dr Adrian Esterman

Flinders Centre for Epidemiology &


All studies

Scenario 1: Precision          Scenario 2: Power
Descriptive                    Hypothesis testing
• Sample surveys               • Simple - 2 groups
• Quality control              • Complex studies



Scenario 1
Suppose we want to estimate the proportion
of people in our target population with a
given characteristic:
• The proportion with depression
• The proportion with an artificial leg
• The proportion receiving incorrect medication



Scenario 1
Example

• My target population is all South Australians
aged 17 and over
• I want to find out what proportion have
an undergraduate degree
• Please raise your hand if you have an
undergraduate degree



Scenario 1

[Diagram: a random sample is drawn from the target population; the
characteristic is measured in the sample and inferred for the target
population]



Scenario 1
True proportion in target population = P
Estimated proportion from sample = p

How likely is it that p is exactly equal to P?



Scenario 1

We would like P to fall in this range 95 times out of 100.

[Figure: number line from 0 to 1 with a range marked around the sample
proportion p]



Scenario 1
The range of plausible values of our sample
proportion p in which the true population
proportion P is likely to fall 95 times out of
100 is called the 95% Confidence Interval
for P



Scenario 1
[Figure: number line from 0 to 1 with the 95% CI for P marked around the
sample proportion p]



Scenario 1
The 95% CI for P is a measure of how
precise your sample estimate is of the true
population proportion

[Figure: 95% confidence interval plotted against sample size]



Scenario 1
Example
We want to estimate the proportion of the
South Australian population with COPD.
We think it will be about 12%.

We would like a 95% CI of p ± 2%.
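
As a rough check on this example, the calculation can be sketched by hand. The slides rely on Statcalc and do not state a formula, so the sketch assumes the usual normal-approximation formula n = z² p (1 − p) / d², where d is the half-width of the interval. The function name is illustrative only.

```python
import math
from statistics import NormalDist


def sample_size_for_proportion(p, margin, confidence=0.95):
    """Sample size so that the CI for a proportion is roughly p +/- margin.

    Assumes the normal approximation and a very large target population.
    """
    # z is about 1.96 for a 95% confidence level
    z = NormalDist().inv_cdf(1 - (1 - confidence) / 2)
    return math.ceil(z ** 2 * p * (1 - p) / margin ** 2)


# COPD example: expected proportion 12%, desired 95% CI of p +/- 2%
n_copd = sample_size_for_proportion(p=0.12, margin=0.02)
print(n_copd)
```

Under these assumptions the COPD example needs roughly a thousand people. A program such as Statcalc may report a slightly different figure depending on the options chosen.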

[Figure: required sample size (0 to 400) plotted against size of target
population, for p = 50% with 95% CI 50% ± 5%]
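
The way the required sample size depends on the size of the target population can be sketched as follows. This assumes the standard finite population correction applied to the normal-approximation sample size; the slides do not say which formula the plotted curve uses, and the function name is illustrative.

```python
import math
from statistics import NormalDist


def sample_size_finite(p, margin, population, confidence=0.95):
    """Sample size for estimating a proportion in a finite population."""
    z = NormalDist().inv_cdf(1 - (1 - confidence) / 2)
    # Sample size for an effectively infinite population
    n0 = z ** 2 * p * (1 - p) / margin ** 2
    # Finite population correction (assumed, not taken from the slides)
    return math.ceil(n0 / (1 + (n0 - 1) / population))


# p = 50% with a 95% CI of 50% +/- 5%, for several population sizes
sizes = {N: sample_size_finite(0.5, 0.05, N) for N in (100, 1_000, 10_000, 1_000_000)}
for N, n in sizes.items():
    print(N, n)
```

With these inputs the required sample size rises quickly for small populations and then levels off below 400, however large the target population becomes.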



Statcalc
Statcalc is included as part of the Epi Info
suite of programs, which is available free of
charge from:

http://www.cdc.gov/epiinfo/



Scenario 2

We wish to formally test the difference
between two means or two proportions



Scenario 2
Three bits of information required to determine
the sample size:

• Type I & II errors
• Clinical effect
• Variation



Type I & II errors
Process of hypothesis testing
1. State a Null hypothesis (H0)
2. State an Alternative hypothesis (HA)
3. Decide on a suitable statistical test based on
the Null hypothesis
4. Calculate the test statistic
5. Check the associated probability (p-value)
6. If p < 0.05, reject the Null hypothesis
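
The steps above can be sketched for a comparison of two proportions. This is only an illustration: it assumes a pooled two-proportion z-test, and the counts are hypothetical data invented for the example rather than results from any study.

```python
from math import sqrt
from statistics import NormalDist


def two_proportion_z_test(x1, n1, x2, n2):
    """Return (z, two-sided p-value) for H0: the two proportions are equal."""
    p1, p2 = x1 / n1, x2 / n2
    pooled = (x1 + x2) / (n1 + n2)                 # common proportion under H0
    se = sqrt(pooled * (1 - pooled) * (1 / n1 + 1 / n2))
    z = (p1 - p2) / se                             # step 4: test statistic
    p_value = 2 * (1 - NormalDist().cdf(abs(z)))   # step 5: p-value
    return z, p_value


# Hypothetical data: 30 of 200 with the outcome in one group, 15 of 200 in the other
z, p_value = two_proportion_z_test(30, 200, 15, 200)
reject_null = p_value < 0.05                       # step 6: decision
print(round(z, 2), round(p_value, 3), reject_null)
```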



Type I & II errors
Process of hypothesis testing
Note
If the Alternative hypothesis is:
parameter 1 ≠ parameter 2
we calculate the p-value for a two-sided test

If the Alternative hypothesis is:
parameter 1 > parameter 2
we calculate the p-value for a one-sided test
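
To illustrate the difference between the two kinds of test, the sketch below computes both p-values from the same test statistic. It assumes the statistic follows a standard normal distribution under the Null hypothesis; the value of z and the function name are hypothetical.

```python
from statistics import NormalDist


def p_value_from_z(z, alternative="two-sided"):
    """p-value for a z statistic with a standard normal null distribution."""
    std_normal = NormalDist()
    if alternative == "two-sided":   # HA: parameter 1 differs from parameter 2
        return 2 * (1 - std_normal.cdf(abs(z)))
    if alternative == "greater":     # HA: parameter 1 > parameter 2
        return 1 - std_normal.cdf(z)
    raise ValueError("alternative must be 'two-sided' or 'greater'")


z = 1.8  # hypothetical test statistic
two_sided = p_value_from_z(z)
one_sided = p_value_from_z(z, "greater")
print(round(two_sided, 3), round(one_sided, 3))
```

For a positive statistic the one-sided p-value is half the two-sided one, so the same data can fall on different sides of the 0.05 threshold depending on how the Alternative hypothesis was stated.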



Type I & II errors
What is a p-value?

1. It is a probability, and hence lies between 0 and 1.
2. It is a measure of surprise: how surprised we would
be to get a test statistic that large if the Null
hypothesis were true.



Type I and II errors

Statistical decision      True state of null hypothesis
                          Hypothesis true     Hypothesis false
Reject Null hypothesis    Type I error        Correct (Power)
Accept Null hypothesis    Correct             Type II error



Type I & II errors
What causes a Type I error?

• Bias
• Confounding
• Effect modification
• Misclassification



Type I & II errors
What causes a Type II error?

• Sample size too small
• Confounding
• Effect modification
• Misclassification



Type I & II errors
Example of setting error levels
New drug for lowering cholesterol
• Slightly better efficacy than existing drugs
• Much more expensive than existing drugs

What are the consequences of making a Type I error?

What are the consequences of making a Type II error?



Type I & II errors
Example 1
New drug for lowering cholesterol
• Slightly better efficacy than existing drugs
• Much more expensive than existing drugs

Conclusion
• Requires a stringent Type I error (say 0.01)
• Can manage with a relaxed Type II error (say 0.20)



Type I & II errors
Example 2
Trial of new brochure to help people quit smoking
• Successful in 20% of smokers
• Negligible cost

What are the consequences of making a Type I error?

What are the consequences of making a Type II error?



Type I & II errors
Example 2
Trial of new brochure to help people quit smoking
• Successful in 20% of smokers
• Negligible cost

Conclusion
• Can relax Type I error (say 0.10)
• Requires stringent Type II error (say 0.05)
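
To see how the choice of error levels feeds into sample size, the sketch below compares the two examples. It assumes the common normal-approximation result that, for a two-sided comparison of two means, the required sample size is proportional to (z for 1 − α/2 plus z for 1 − β) squared. That formula is not given in the slides, and the function name is illustrative.

```python
from statistics import NormalDist


def error_multiplier(alpha, beta):
    """(z_{1-alpha/2} + z_{1-beta})^2 for a two-sided test."""
    inv = NormalDist().inv_cdf
    return (inv(1 - alpha / 2) + inv(1 - beta)) ** 2


drug = error_multiplier(alpha=0.01, beta=0.20)          # Example 1
brochure = error_multiplier(alpha=0.10, beta=0.05)      # Example 2
conventional = error_multiplier(alpha=0.05, beta=0.20)  # common default, for comparison
print(round(drug, 2), round(brochure, 2), round(conventional, 2))
```

Under this assumption, both examples need a larger sample than the conventional choice of 0.05 and 0.20, because each tightens one of the two error levels.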



Clinical effect
Your Alternative hypothesis states
that you expect one group to have a
different mean or proportion to the
other group, but by how much?

• From the literature
• From a pilot study
• Clinical judgement

•  15% change
• Change of  1 SD
• Interim analysis



Variation

Is there a difference between the two means?

[Figure: two distributions of systolic blood pressure, with Mean 1 and
Mean 2 marked]



Variation

It depends upon the range of the distributions

[Figure: distributions of systolic blood pressure]



Variation

To judge whether the difference between
two means is large or small, we compare it
with some measure of the variability of the
distributions



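Variation

Variability

All statistical tests are based on the following ratio:

$$\text{Test statistic} = \frac{\text{Difference between parameters}}{v / \sqrt{n}}$$

where v is a measure of variability (such as the standard deviation)
and n is the sample size. As n increases, v/√n decreases, so the
test statistic increases.

Variation

Rearranging for n:

$$n = \left( \frac{v \times \text{Test statistic}}{\text{Difference}} \right)^2$$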



Variation

The test statistic is usually:

• Chi-squared for comparing two proportions
• Student’s t for comparing two means
• F-statistic for comparing two variances
• Z-statistic for comparing two correlation coefficients

but may be more complicated



Scenario 2
Example for two means
We wish to undertake an RCT of an intervention to
improve quality of life. At the end of the study, the
mean PCS of the SF-36 for the control group is
expected to be 35. We expect that in the
intervention group, the mean PCS will be 45. The
standard deviation of the PCS is 10.
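
A hand calculation for this example can be sketched as below. The slides do not state the error levels or the formula used by the sample size program, so the sketch assumes a two-sided Type I error of 0.05, 80% power, and the normal-approximation formula n per group = 2 (z for 1 − α/2 plus z for power)² SD² / difference². The function name is illustrative.

```python
import math
from statistics import NormalDist


def n_per_group_two_means(mean1, mean2, sd, alpha=0.05, power=0.80):
    """Approximate sample size per group for comparing two means."""
    inv = NormalDist().inv_cdf
    z_alpha = inv(1 - alpha / 2)   # two-sided Type I error
    z_beta = inv(power)            # power = 1 - Type II error
    n = 2 * (z_alpha + z_beta) ** 2 * sd ** 2 / (mean1 - mean2) ** 2
    return math.ceil(n)


# PCS example: control mean 35, intervention mean 45, SD 10
n_pcs = n_per_group_two_means(35, 45, 10)
n_pcs_90 = n_per_group_two_means(35, 45, 10, power=0.90)
print(n_pcs, n_pcs_90)
```

Because the expected difference equals one standard deviation, the numbers are small. Programs that use the t distribution rather than the normal approximation will typically ask for one or two more subjects per group.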

[Screenshot annotations: 1 – Type I error; 1 – Type II error]



Scenario 2
Example for two proportions
In a prospective study of hip protectors, we expect
that in the untreated group 10% of elderly people
will suffer a hip fracture. In the treated group we
expect this to reduce to 5%.
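
The hip protector example can be sketched in the same way. The error levels are again assumptions (two-sided Type I error of 0.05 and 80% power), and the formula is the standard normal-approximation one for two independent proportions without a continuity correction. The program used in the presentation may apply a different method, and the function name is illustrative.

```python
import math
from statistics import NormalDist


def n_per_group_two_proportions(p1, p2, alpha=0.05, power=0.80):
    """Approximate sample size per group for comparing two proportions."""
    inv = NormalDist().inv_cdf
    z_alpha = inv(1 - alpha / 2)   # two-sided Type I error
    z_beta = inv(power)            # power = 1 - Type II error
    p_bar = (p1 + p2) / 2          # average proportion under the Null hypothesis
    numerator = (
        z_alpha * math.sqrt(2 * p_bar * (1 - p_bar))
        + z_beta * math.sqrt(p1 * (1 - p1) + p2 * (1 - p2))
    ) ** 2
    return math.ceil(numerator / (p1 - p2) ** 2)


# Hip fracture example: 10% in the untreated group, 5% in the treated group
n_hip = n_per_group_two_proportions(0.10, 0.05)
print(n_hip)
```

Detecting a drop from 10% to 5% needs several hundred people per group under these assumptions, far more than the two-means example, because the difference is small relative to the variability of a proportion.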

Winepiscope
Winepiscope is available free of charge
from:

http://www.clive.ed.ac.uk/winepiscope/



Allowing for dropouts
Nearly all studies have at least some subjects who
withdraw, are lost to follow-up, or die.

If n is the sample size computed by the program,
and we expect to lose d% of subjects, then the
required sample size N is given by:

N = (100 x n) / (100 – d)



Allowing for dropouts
Example
The sample size program tells us that we need 120
in each group, and we are expecting a 15%
dropout.

N = (100 x 120) / (100 – 15)
  ≈ 141
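
The dropout adjustment is simple enough to sketch directly from the formula N = (100 x n) / (100 – d). The function name is illustrative.

```python
import math


def inflate_for_dropout(n, dropout_percent):
    """Sample size to recruit so that about n remain after dropout."""
    return 100 * n / (100 - dropout_percent)


# 120 needed in each group, 15% dropout expected
raw = inflate_for_dropout(120, 15)
print(round(raw, 2), round(raw), math.ceil(raw))
```

The unrounded value is a little over 141. Rounding to the nearest whole number gives the 141 quoted in the example, while rounding up, which guarantees at least 120 remain if exactly 15% drop out, gives 142.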



Is bigger better?

For both descriptive and hypothesis testing
situations, the answer is yes. However:

1. Increasing the sample size will have no effect
on Type I errors, which are largely due to bias
and/or confounding.
2. There is no point in having a larger sample size
than that required for precision or power.



For copies of this presentation

Please email Kylie Thomas at:

kylie.thomas@flinders.edu.au
