
Power and Sample Size Calculation
Presentation # 3
Objectives

To know and understand the types of sample size calculations
To understand the meaning of the power of the test
To know and understand the meaning of effect size
To know and understand the factors that affect the power
To see applications of sample size calculation
Types of Sample Size Calculations

• 1- Sample size to achieve a certain power for the test of significance (Test of Significance)

• 2- Sample size to achieve a certain accuracy for the estimation (Confidence Interval)
Sample size to achieve a certain power
The Two Types of Errors Possible in Making Decisions about Statistical Hypotheses

               STATE OF NATURE
DECISION       H0 is true      H0 is false
Accept H0      Satisfactory    Type II error
Reject H0      Type I error    Satisfactory


15/04/2020 4

• Type I error occurs when the researcher rejects a null hypothesis when it is true.

• The probability of committing a Type I error is called the significance level.

• This probability is also called alpha, and is often denoted by α.

• Pr[Type I error] = Pr(Rejecting H0 when it is true) = α

• 1 - α is called the Confidence Level (CL)


• Type II error occurs when the researcher accepts a null hypothesis that is false.

• The probability of committing a Type II error is called beta, and is often denoted by β.

• Pr[Type II error] = Pr(Accepting H0 when it is false) = β

• Another way of putting it is that the Type I error amounts to "Disbelieving the Truth"; the Type II error, to "Believing an Untruth."

• Power of the Test

The probability of not committing a Type II error is called the Power of the test.

Power = Pr[Not making a Type II error]
      = 1 - Pr[Making a Type II error] = 1 – β
      = Pr[Recognizing that the null hypothesis is false when it is false]
For example

Suppose the null hypothesis states that a population mean is equal to 100.

A researcher might ask: What is the probability of rejecting the null hypothesis if the true population mean is equal to 90?

The effect size would be 90 – 100 = -10.
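
The question in this example can be answered numerically once a standard deviation and a sample size are fixed. The sketch below is only an illustration: the values σ = 20 and n = 25 are hypothetical (the slides do not give them), the function name is made up for this example, and it uses a two-sided z-test with σ treated as known rather than a t-test.

```python
from statistics import NormalDist


def z_test_power(mu0, mu_true, sigma, n, alpha=0.05):
    """Approximate power of a two-sided one-sample z-test (sigma assumed known)."""
    std_normal = NormalDist()
    z_crit = std_normal.inv_cdf(1 - alpha / 2)
    # Effect size expressed in standard-error units.
    shift = (mu_true - mu0) / (sigma / n ** 0.5)
    # Under the alternative the test statistic is N(shift, 1);
    # power is the probability that it falls in either rejection tail.
    lower_tail = std_normal.cdf(-z_crit - shift)
    upper_tail = 1 - std_normal.cdf(z_crit - shift)
    return lower_tail + upper_tail


# Hypothetical inputs: sigma = 20 and n = 25 are assumptions, not from the slides.
power = z_test_power(mu0=100, mu_true=90, sigma=20, n=25)
print(f"Power to detect a true mean of 90: {power:.3f}")
```

With these assumed inputs the power comes out at roughly 70%; a larger n or a smaller σ would raise it.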
1) Sample size to achieve a certain power for the test of significance: applications

Example 1: Correlation

H0: ρ = 0 (There is no correlation between BMI and total cholesterol)
H1: ρ ≠ 0 (There is a correlation between BMI and total cholesterol)

We will calculate the sample size needed to achieve 80% power to test
H0: No correlation (ρ = 0) against
Ha: There is a correlation, and the true correlation is ρ = 0.2 (usually taken from values in the literature)

In this case the effect size is (true – null) = 0.2 – 0 = 0.2

A power of 80% means that, when the true correlation is 0.2, the probability of correctly rejecting the null and concluding that there is a correlation between BMI and total cholesterol is 80%.
Pearson's Correlation Tests

Numeric Results when H1: ρ0 ≠ ρ1


────────────────────────────────────────────────
Power N Alpha Beta ρ0 ρ1
0.80008 193 0.05000 0.19992 0.00000 0.20000
Report Definitions
Power is the probability of rejecting a false null hypothesis. It should be close to one.
N is the size of the sample drawn from the population. To conserve resources, it should be small.
Alpha is the probability of rejecting a true null hypothesis. It should be small.
Beta is the probability of accepting a false null hypothesis. It should be small.
ρ0 is the value of the population correlation under the null hypothesis.
ρ1 is the value of the population correlation under the alternative hypothesis.
Summary Statements
─────────────────────────────────────────────────────────
A sample size of 193 achieves 80% power to detect a difference of -0.20000 between the null
hypothesis correlation of 0.00000 and the alternative hypothesis correlation of 0.20000 using a
two-sided hypothesis test with a significance level of 0.05000.
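
As a rough cross-check of the PASS result, the required N can be approximated with the Fisher z-transformation, n ≈ ((z_{1-α/2} + z_{power}) / atanh(ρ1))² + 3. This is a sketch under that normal approximation, not the method PASS itself uses, and the function name is hypothetical.

```python
import math
from statistics import NormalDist


def n_for_correlation_test(rho1, power=0.80, alpha=0.05):
    """Approximate N to test H0: rho = 0 against a true correlation rho1.

    Uses the Fisher z approximation, so it can differ slightly
    from exact software output.
    """
    std_normal = NormalDist()
    z_alpha = std_normal.inv_cdf(1 - alpha / 2)  # two-sided critical value
    z_beta = std_normal.inv_cdf(power)           # quantile for the target power
    effect = math.atanh(rho1)                    # Fisher z of the true correlation
    return math.ceil(((z_alpha + z_beta) / effect) ** 2 + 3)


print(n_for_correlation_test(0.2))              # 80% power, alpha = 0.05
print(n_for_correlation_test(0.2, power=0.90))  # higher power needs a larger N
```

For ρ1 = 0.2 the approximation should land within one or two of the 193 reported above; small differences are expected because the two calculations are not identical.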
Pearson's Correlation Tests

Here I selected different values for the power (70%, 80% and 90%) and different values for the true correlation ρ1 (0.1, 0.2, 0.3, 0.4, 0.5).

Numeric Results when H1: ρ0 ≠ ρ1
────────────────────────────────────────
Power    N     Alpha    Beta     ρ0       ρ1
0.70063  616   0.05000  0.29937  0.00000  0.10000
0.80018  782   0.05000  0.19982  0.00000  0.10000
0.90007  1046  0.05000  0.09993  0.00000  0.10000
0.70232  153   0.05000  0.29768  0.00000  0.20000
0.80008  193   0.05000  0.19992  0.00000  0.20000
0.90038  258   0.05000  0.09962  0.00000  0.20000
0.70360  67    0.05000  0.29640  0.00000  0.30000
0.80034  84    0.05000  0.19966  0.00000  0.30000
0.90081  112   0.05000  0.09919  0.00000  0.30000
0.70695  37    0.05000  0.29305  0.00000  0.40000
0.80225  46    0.05000  0.19775  0.00000  0.40000
0.90209  61    0.05000  0.09791  0.00000  0.40000
0.70976  23    0.05000  0.29024  0.00000  0.50000
0.81394  29    0.05000  0.18606  0.00000  0.50000
0.90114  37    0.05000  0.09886  0.00000  0.50000
The farther the true correlation is from zero (the bigger the effect size), the easier it is to reject the null and conclude there is a correlation, and the smaller the sample size needed.

For a bigger power, we always need a larger sample size.
Example 2: Two-sample t-test

H0: No difference in the mean recovery time between treatment A and treatment B
(the difference in days is zero; difference = 0)
Ha: There is a difference in the mean recovery time between treatment A and treatment B

We will calculate the sample size needed to achieve 80% power to test
H0: difference = 0 against Ha: the true difference is 2 days

The effect size = true – null = 2 – 0 = 2

A power of 80% means that, when the true difference is 2 days, the probability of correctly rejecting the null and concluding that there is a difference in the recovery time between treatment A and treatment B is 80%.
Two-Sample T-Tests Assuming Equal Variance

Numeric Results for Two-Sample T-Test Assuming Equal Variance ───────────────────────────


Alternative Hypothesis: H1: δ = μ1 - μ2 ≠ 0

Target Actual
Power Power N1 N2 N δ σ Alpha
0.80 0.80146 64 64 128 2.0 4.0 0.050

Report Definitions
Target Power is the desired power value (or values) entered in the procedure. Power is the probability of
rejecting a false null hypothesis.
Actual Power is the power obtained in this scenario. Because N1 and N2 are discrete, this value is often
(slightly) larger than the target power.
N1 and N2 are the number of items sampled from each population.
N is the total sample size, N1 + N2.
μ1 and μ2 are the assumed population means.
δ = μ1 - μ2 is the difference between population means at which power and sample size calculations are made.
σ is the assumed population standard deviation for each of the two groups.
Alpha is the probability of rejecting a true null hypothesis.

Summary Statements ─────────────────────────────────────────────────────────


Group sample sizes of 64 and 64 achieve 80.146% power to reject the null hypothesis of equal
means when the population mean difference is 2.0 with a standard deviation for both groups of
4.0 and with a significance level (alpha) of 0.050 using a two-sided two-sample equal-variance
t-test.
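
The per-group sample size can be approximated by hand with the normal-approximation formula n ≈ 2((z_{1-α/2} + z_{power})σ/δ)², plus a commonly used small-sample correction of z²_{1-α/2}/4 to account for using a t-test. The sketch below assumes equal group sizes and a common σ; it is an approximation, not the exact noncentral-t calculation that power software typically performs, and the function name is hypothetical.

```python
import math
from statistics import NormalDist


def n_per_group_two_sample(delta, sigma, power=0.80, alpha=0.05):
    """Approximate per-group N for a two-sided, equal-variance two-sample t-test."""
    std_normal = NormalDist()
    z_alpha = std_normal.inv_cdf(1 - alpha / 2)
    z_beta = std_normal.inv_cdf(power)
    # Normal-approximation sample size per group.
    n_normal = 2 * ((z_alpha + z_beta) * sigma / delta) ** 2
    # Small-sample correction for estimating sigma (t-test instead of z-test).
    return math.ceil(n_normal + z_alpha ** 2 / 4)


n1 = n_per_group_two_sample(delta=2.0, sigma=4.0)
print(n1, "per group,", 2 * n1, "in total")
```

With δ = 2 and σ = 4 this gives 64 per group, in line with the PASS output above.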
Two-Sample T-Tests Assuming Equal Variance

Here I selected different values for the power (70%, 80% and 90%) and different values for the true difference (1 day, 2 days, 3 days).

Numeric Results for Two-Sample T-Test Assuming Equal Variance
───────────────────────────
Alternative Hypothesis: H1: δ = μ1 - μ2 ≠ 0

Target  Actual
Power   Power    N1   N2   N    δ    σ    Alpha
0.700   0.70116  199  199  398  1.0  4.0  0.050
0.800   0.80136  253  253  506  1.0  4.0  0.050
0.900   0.90045  337  337  674  1.0  4.0  0.050
0.700   0.70561  51   51   102  2.0  4.0  0.050
0.800   0.80146  64   64   128  2.0  4.0  0.050
0.900   0.90323  86   86   172  2.0  4.0  0.050
0.700   0.70111  23   23   46   3.0  4.0  0.050
0.800   0.80141  29   29   58   3.0  4.0  0.050
0.900   0.90487  39   39   78   3.0  4.0  0.050
The farther the true difference is from zero (the bigger the effect size), the easier it is to reject the null and conclude there is a difference, and the smaller the sample size needed.

For a bigger power, we always need a larger sample size.
Example 3: Two proportions (Z-test or chi-squared test)

H0: No difference in the recovery rate between treatment A and treatment B
(the difference is zero; difference = 0)
Ha: There is a difference in the recovery rate between treatment A and treatment B

We will calculate the sample size needed to achieve 80% power to test
H0: difference = 0 against
Ha: the true difference in recovery rate is 0.1 (i.e. 10%),
assuming that the recovery rate for the control group is 0.8 (80%)

The effect size = true – null = 0.1 – 0 = 0.1

A power of 80% means that, when the true rate difference is 0.1, the probability of correctly rejecting the null and concluding that there is a difference in the recovery rate between treatment A and treatment B is 80%.
Tests for Two Proportions
Numeric Results for Testing Two Proportions using the Z-Test with Pooled Variance ───────────────
H0: P1 - P2 = 0. H1: P1 - P2 = D1 ≠ 0.
Target Actual Diff
Power Power* N1 N2 N P1 P2 D1 Alpha
0.80 0.80007 199 199 398 0.9000 0.8000 0.1000 0.0500
* Power was computed using the normal approximation method.
Report Definitions
Target Power is the desired power value (or values) entered in the procedure. Power is the probability of
rejecting a false null hypothesis.
Actual Power is the power obtained in this scenario. Because N1 and N2 are discrete, this value is often
(slightly) larger than the target power.
N1 and N2 are the number of items sampled from each population.
N is the total sample size, N1 + N2.
P1 is the proportion for Group 1 at which power and sample size calculations are made. This is the treatment
or experimental group.
P2 is the proportion for Group 2. This is the standard, reference, or control group.
D1 is the difference P1 - P2 assumed for power and sample size calculations.
Alpha is the probability of rejecting a true null hypothesis.
Summary Statements ─────────────────────────────────────────────────────────
Group sample sizes of 199 in group 1 and 199 in group 2 achieve 80.007% power to detect a
difference between the group proportions of 0.1000. The proportion in group 1 (the treatment
group) is assumed to be 0.8000 under the null hypothesis and 0.9000 under the alternative
hypothesis. The proportion in group 2 (the control group) is 0.8000. The test statistic used is
the two-sided Z-Test with pooled variance. The significance level of the test is 0.0500
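
The same figure can be approximated with the standard normal-approximation formula for a pooled-variance two-proportion Z-test with equal group sizes: n = [z_{1-α/2}·√(2·p̄·(1 - p̄)) + z_{power}·√(p1(1 - p1) + p2(1 - p2))]² / (p1 - p2)², where p̄ is the average of p1 and p2. The sketch below is illustrative only, and the function name is hypothetical.

```python
import math
from statistics import NormalDist


def n_per_group_two_proportions(p1, p2, power=0.80, alpha=0.05):
    """Approximate per-group N for a two-sided pooled-variance two-proportion Z-test."""
    std_normal = NormalDist()
    z_alpha = std_normal.inv_cdf(1 - alpha / 2)
    z_beta = std_normal.inv_cdf(power)
    p_bar = (p1 + p2) / 2  # pooled proportion under H0 (equal group sizes)
    sd_null = math.sqrt(2 * p_bar * (1 - p_bar))
    sd_alt = math.sqrt(p1 * (1 - p1) + p2 * (1 - p2))
    n = ((z_alpha * sd_null + z_beta * sd_alt) / (p1 - p2)) ** 2
    return math.ceil(n)


n1 = n_per_group_two_proportions(p1=0.9, p2=0.8)
print(n1, "per group,", 2 * n1, "in total")
```

For P1 = 0.9 and P2 = 0.8 this gives 199 per group, matching the PASS table above.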
2- Sample size to achieve a certain accuracy for the estimation (Confidence Interval)

Example 1: Estimating one population mean (t-test)

What is the sample size needed to estimate the average onset age for diabetes in Dubai, such that we are 95% confident that we will not be off by more than 2 years, assuming that the standard deviation of the onset age is 6 years?

Things we need:
1) Confidence level (usually 95%)
2) The accuracy or margin of error, which is about two standard errors (2 years in this example)
3) The standard deviation of the individual observations of the diabetes onset age (6 years in this example); usually we get this from the literature or from a pilot sample
Confidence Intervals for One Mean

Numeric Results for Two-Sided Confidence Intervals with Unknown Standard Deviation
─────────────
Target Actual
Sample Distance Distance Standard
Confidence Size from Mean from Mean Deviation
Level (N) to Limits to Limits (S)
0.950 38 2.000 1.972 6.000

Report Definitions
Confidence level is the proportion of confidence intervals (constructed with this same confidence level,
sample size, etc.) that would contain the population mean.
N is the size of the sample drawn from the population.
Distance from Mean to Limit is the distance from the confidence limit(s) to the mean. For two-sided intervals,
it is also known as the precision, half-width, or margin of error.
Target Distance from Mean to Limit is the value of the distance that is entered into the procedure.
Actual Distance from Mean to Limit is the value of the distance that is obtained from the procedure.
The standard deviation of the population measures the variability in the population.
Summary Statements
A sample size of 38 produces a two-sided 95% confidence interval with a distance from the mean
to the limits that is equal to 1.972 when the estimated standard deviation is 6.000.
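
A quick hand calculation uses the normal approximation n ≈ (z_{1-α/2}·σ/d)², where d is the margin of error. The sketch below makes that simplification (σ treated as known); because PASS uses the t distribution for an unknown standard deviation, the figure it reports (38) is somewhat larger than what this z-based formula returns. The function name is hypothetical.

```python
import math
from statistics import NormalDist


def n_for_mean_ci(sigma, margin, conf=0.95):
    """Approximate N so that a two-sided CI for a mean has half-width <= margin.

    z-based approximation; a t-based calculation gives a slightly larger N.
    """
    z = NormalDist().inv_cdf(1 - (1 - conf) / 2)
    return math.ceil((z * sigma / margin) ** 2)


print(n_for_mean_ci(sigma=6.0, margin=2.0))  # z-based lower bound on N
print(n_for_mean_ci(sigma=6.0, margin=1.0))  # halving the margin roughly quadruples N
```

Treat the z-based value as a starting point; the t-based N is the one to use when σ is estimated from the sample.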
Confidence Intervals for One Mean

Numeric Results for Two-Sided Confidence Intervals with Unknown Standard Deviation
Target Actual
Sample Distance Distance Standard
Confidence Size from Mean from Mean Deviation
Level (N) to Limits to Limits (S)
0.950 64 1.000 0.999 4.000
0.950 99 1.000 0.997 5.000
0.950 141 1.000 0.999 6.000
0.950 191 1.000 0.999 7.000
0.950 18 2.000 1.989 4.000
0.950 27 2.000 1.978 5.000
0.950 38 2.000 1.972 6.000
0.950 50 2.000 1.989 7.000
0.950 10 3.000 2.861 4.000
0.950 14 3.000 2.887 5.000
0.950 18 3.000 2.984 6.000
0.950 24 3.000 2.956 7.000
The higher the accuracy (the smaller the margin of error), the larger the sample size

The larger the SD, the larger the sample size
Example 2: Estimating a population proportion (z-test)

What is the sample size needed to estimate the proportion (or percent) of obesity in Dubai, such that we are 95% confident that we will not be off by more than 0.05 (or 5%), assuming that the obesity proportion is 0.3 (or 30%) (usually from the literature)?

Things we need:
1) Confidence level (usually 95%)
2) The accuracy or margin of error, which is about two standard errors (0.05 in this example)
3) The proportion of obesity (0.3 in this example); usually we get this from the literature or from a pilot sample
Confidence Intervals for One Proportion

Numeric Results for Two-Sided Confidence Intervals for One Proportion ───────────────────────
Confidence Interval Formula: Simple Asymptotic

Sample
Confidence Size Target Actual Proportion Lower Upper Width if
Level(N) Width Width (P) Limit Limit P = 0.5
0.950 323 0.100 0.100 0.300 0.250 0.350 0.109
Report Definitions
Confidence level is the proportion of confidence intervals (constructed with this same confidence level,
sample size, etc.) that would contain the population proportion.
N is the size of the sample drawn from the population.
Width is the distance from the lower limit to the upper limit.
Target Width is the value of the width that is entered into the procedure.
Actual Width is the value of the width that is obtained from the procedure.
Proportion (P) is the assumed sample proportion.
Lower Limit is the lower limit of the confidence interval.
Upper Limit is the upper limit of the confidence interval.
Width if P = 0.5 is the maximum width for a confidence interval with sample size N.
Summary Statements ─────────────────────────────────────────────────────────
A sample size of 323 produces a two-sided 95% confidence interval with a width equal to 0.100
when the sample proportion is 0.300.
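
For the simple asymptotic (Wald) interval, the sample size follows directly from n = z²_{1-α/2}·p(1 - p)/d², where d is the margin of error (half the interval width). The sketch below applies that formula; the function name is hypothetical.

```python
import math
from statistics import NormalDist


def n_for_proportion_ci(p, margin, conf=0.95):
    """N so that a two-sided simple asymptotic CI for a proportion has half-width <= margin."""
    z = NormalDist().inv_cdf(1 - (1 - conf) / 2)
    return math.ceil(z ** 2 * p * (1 - p) / margin ** 2)


print(n_for_proportion_ci(p=0.3, margin=0.05))  # width 0.10 at P = 0.3
print(n_for_proportion_ci(p=0.5, margin=0.05))  # worst case, P = 0.5
```

A margin of 0.05 corresponds to the total width of 0.100 in the PASS table, and for P = 0.3 the formula gives 323, as reported above.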
Confidence Intervals for One Proportion

Numeric Results for Two-Sided Confidence Intervals for One Proportion
───────────────────────
Confidence Interval Formula: Simple Asymptotic

Confidence  Sample    Target  Actual  Proportion  Lower  Upper  Width if
Level       Size (N)  Width   Width   (P)         Limit  Limit  P = 0.5
0.950       984       0.050   0.050   0.200       0.175  0.225  0.062
0.950       1291      0.050   0.050   0.300       0.275  0.325  0.055
0.950       1476      0.050   0.050   0.400       0.375  0.425  0.051
0.950       1537      0.050   0.050   0.500       0.475  0.525  0.050
0.950       246       0.100   0.100   0.200       0.150  0.250  0.125
0.950       323       0.100   0.100   0.300       0.250  0.350  0.109
0.950       369       0.100   0.100   0.400       0.350  0.450  0.102
0.950       385       0.100   0.100   0.500       0.450  0.550  0.100
0.950       110       0.150   0.150   0.200       0.125  0.275  0.187
0.950       144       0.150   0.150   0.300       0.225  0.375  0.163
0.950       164       0.150   0.150   0.400       0.325  0.475  0.153
0.950       171       0.150   0.150   0.500       0.425  0.575  0.150

The closer the proportion to 0.5, the larger the sample size

The smaller the width, the larger the sample size
Example 3: Estimating the correlation coefficient

What is the sample size needed to estimate the correlation coefficient between BMI and total cholesterol, such that we are 95% confident that we will not be off by more than 0.1, assuming the true correlation coefficient is 0.3?

Things we need:
1) Confidence level (usually 95%)
2) The accuracy or margin of error, which is about two standard errors (0.1)
3) The correlation coefficient (0.3 in this example); usually we get this from the literature or from a pilot sample
Confidence Intervals for Pearson's Correlation

Numeric Results for Two-Sided Confidence Intervals for Pearson Correlation ────────────────────

Sample Sample C.I. C.I.


Confidence Size Target Actual Correlation Lower Upper Width if
Level N Width Width r Limit Limit r = 0.0
0.950 320 0.200 0.200 0.300 0.197 0.397 0.219

Report Definitions
Confidence level is the proportion of confidence intervals (constructed with this same confidence level,
sample size, etc.) that would contain the true correlation.
Sample Size N is the size of the sample drawn from the population.
Width is the distance from the lower limit to the upper limit.
Target Width is the value of the width that is entered into the procedure.
Actual Width is the value of the width that is obtained from the procedure.
r is the estimate of Pearson's product moment correlation coefficient.
Lower and Upper Limit are the lower and upper limits of the confidence interval.
Width if r = 0.0 is the maximum width for a confidence interval with sample size N.

Summary Statements
─────────────────────────────────────────────────────────
A sample size of 320 produces a two-sided 95% confidence interval with a width equal to 0.200
when the estimate of Pearson's product-moment correlation is 0.300.
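
One way to approximate this result is to build the interval on the Fisher z scale, atanh(r) ± z_{1-α/2}/√(n - 3), transform the limits back with tanh, and search for the smallest n whose interval width does not exceed the target. The sketch below assumes that Fisher-z interval; whether it is exactly the formula PASS applies is an assumption, and the function name is hypothetical.

```python
import math
from statistics import NormalDist


def n_for_correlation_ci(r, width, conf=0.95, n_max=1_000_000):
    """Smallest N whose Fisher-z confidence interval for r is no wider than `width`."""
    z = NormalDist().inv_cdf(1 - (1 - conf) / 2)
    z_r = math.atanh(r)  # Fisher z-transform of the assumed correlation
    for n in range(4, n_max + 1):
        half = z / math.sqrt(n - 3)  # half-width on the Fisher z scale
        lower = math.tanh(z_r - half)
        upper = math.tanh(z_r + half)
        if upper - lower <= width:
            return n
    raise ValueError("no sample size up to n_max achieves the requested width")


print(n_for_correlation_ci(r=0.3, width=0.2))
print(n_for_correlation_ci(r=0.3, width=0.1))  # narrower interval, larger N
```

For r = 0.3 and a total width of 0.2 the search should land at or very near the 320 reported above. Note that the back-transformed interval is not symmetric around r, which is why the PASS limits are 0.197 and 0.397 rather than 0.2 and 0.4.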
The bigger the correlation, the smaller the sample size

The smaller the width (more accurate), the larger the sample size
References:
1) PASS 16 Power Analysis and Sample Size Software (2018). NCSS, LLC. Kaysville, Utah, USA, ncss.com/software/pass.

2) https://emj.bmj.com/content/20/5/453
