Power and Sample Size Calculation
Presentation # 3
Objectives
• 1- Sample size to achieve a certain power for a test of significance
• 2- Sample size to achieve a certain accuracy for an estimate (confidence interval)
Sample size to achieve a certain power
The Two Types of Errors Possible in Making Decisions about Statistical Hypotheses
                     STATE OF NATURE
DECISION             H0 is true          H0 is false
Reject H0            Type I error        Correct decision
Do not reject H0     Correct decision    Type II error
15/04/2020 4
•Type I error occurs when the researcher rejects a null
hypothesis when it is true.
•Type II error occurs when the researcher fails to reject a null
hypothesis that is false.
•Power of the Test
For example
1) Sample size to achieve a certain power for the test of significance: applications
Example 1: Correlation
We will calculate the sample size needed to achieve 80% power to test
H0: No correlation (ρ = 0) against
Ha: There is a correlation, and the true correlation is ρ = 0.2 (a value usually taken from the literature)
Power of 80% means that the probability of correctly rejecting the null hypothesis and concluding that there is a
correlation between BMI and total cholesterol is 80% when the true correlation is 0.2.
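The calculation described above can be approximated by hand. The following is a minimal sketch, assuming the Fisher z-transformation approximation for a two-sided test at alpha = 0.05. The function name `n_for_correlation_power` is hypothetical, and software that uses an exact method may report a slightly different N.

```python
import math
from statistics import NormalDist


def n_for_correlation_power(rho, power=0.80, alpha=0.05):
    """Approximate N needed to detect a true correlation rho.

    Assumption: Fisher z approximation, where atanh(r) is treated as
    normal with mean atanh(rho) and variance 1 / (N - 3).
    """
    z = NormalDist()
    z_alpha = z.inv_cdf(1 - alpha / 2)  # two-sided critical value
    z_beta = z.inv_cdf(power)           # quantile matching the target power
    effect = math.atanh(rho)            # true correlation on the Fisher z scale
    return math.ceil(((z_alpha + z_beta) / effect) ** 2 + 3)


n_corr = n_for_correlation_power(0.2)
print(n_corr)
```

Under these assumptions the approximation asks for about 194 subjects to detect a true correlation of 0.2 with 80% power.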
Pearson's Correlation Tests
H0: There is no difference in the mean recovery time between treatment A and treatment B
(the difference is zero days; difference = 0)
Ha: There is a difference in the mean recovery time between treatment A and treatment B
We will calculate the sample size needed to achieve 80% power to test
H0: difference = 0 against Ha: the true difference is 2 days
Power of 80% means that the probability of correctly rejecting the null hypothesis and concluding that there is a
difference in the recovery time between treatment A and treatment B is 80% when the true
difference is 2 days.
Two-Sample T-Tests Assuming Equal Variance
Target   Actual
Power    Power     N1   N2   N     δ     σ     Alpha
0.80     0.80146   64   64   128   2.0   4.0   0.050
Report Definitions
Target Power is the desired power value (or values) entered in the procedure. Power is the probability of
rejecting a false null hypothesis.
Actual Power is the power obtained in this scenario. Because N1 and N2 are discrete, this value is often
(slightly) larger than the target power.
N1 and N2 are the number of items sampled from each population.
N is the total sample size, N1 + N2.
μ1 and μ2 are the assumed population means.
δ = μ1 - μ2 is the difference between population means at which power and sample size calculations are made.
σ is the assumed population standard deviation for each of the two groups.
Alpha is the probability of rejecting a true null hypothesis.
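As a rough cross-check of the table above, the per-group sample size can be sketched with the normal approximation. This is an assumption-laden shortcut, not the t-based procedure that produced the report, and the function name `n_per_group_two_means` is hypothetical.

```python
import math
from statistics import NormalDist


def n_per_group_two_means(delta, sigma, power=0.80, alpha=0.05):
    """Approximate per-group N for a two-sided, two-sample comparison of means.

    Assumption: normal (z) approximation with a common, known sigma.
    """
    z = NormalDist()
    z_alpha = z.inv_cdf(1 - alpha / 2)
    z_beta = z.inv_cdf(power)
    return math.ceil(2 * (sigma * (z_alpha + z_beta) / delta) ** 2)


n_group = n_per_group_two_means(delta=2.0, sigma=4.0)
print(n_group, 2 * n_group)
```

With δ = 2 and σ = 4 the approximation gives 63 per group, one fewer than the 64 in the table. A t-based calculation allows for σ being estimated from the data, so it is expected to come out slightly larger.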
We will calculate the sample size needed to achieve 80% power to test
H0: difference = 0 against
Ha: the true difference in recovery rate is 0.1 (i.e., 10%),
assuming that the recovery rate for the control group is 0.8 (80%)
Power of 80% means that the probability of correctly rejecting the null hypothesis and concluding that there is a
difference in the recovery rate between treatment A and treatment B is 80% when the true
rate difference is 0.1.
Tests for Two Proportions
Numeric Results for Testing Two Proportions using the Z-Test with Pooled Variance ───────────────
H0: P1 - P2 = 0. H1: P1 - P2 = D1 ≠ 0.
Target   Actual                                        Diff
Power    Power*    N1    N2    N     P1       P2       D1       Alpha
0.80     0.80007   199   199   398   0.9000   0.8000   0.1000   0.0500
* Power was computed using the normal approximation method.
Report Definitions
Target Power is the desired power value (or values) entered in the procedure. Power is the probability of
rejecting a false null hypothesis.
Actual Power is the power obtained in this scenario. Because N1 and N2 are discrete, this value is often
(slightly) larger than the target power.
N1 and N2 are the number of items sampled from each population.
N is the total sample size, N1 + N2.
P1 is the proportion for Group 1 at which power and sample size calculations are made. This is the treatment
or experimental group.
P2 is the proportion for Group 2. This is the standard, reference, or control group.
D1 is the difference P1 - P2 assumed for power and sample size calculations.
Alpha is the probability of rejecting a true null hypothesis.
Summary Statements ─────────────────────────────────────────────────────────
Group sample sizes of 199 in group 1 and 199 in group 2 achieve 80.007% power to detect a
difference between the group proportions of 0.1000. The proportion in group 1 (the treatment
group) is assumed to be 0.8000 under the null hypothesis and 0.9000 under the alternative
hypothesis. The proportion in group 2 (the control group) is 0.8000. The test statistic used is
the two-sided Z-Test with pooled variance. The significance level of the test is 0.0500.
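The group size in the summary statement can be reproduced with the usual normal-approximation formula for a pooled two-proportion Z-test. The sketch below assumes equal group sizes and a two-sided test; the function name `n_per_group_two_proportions` is hypothetical.

```python
import math
from statistics import NormalDist


def n_per_group_two_proportions(p1, p2, power=0.80, alpha=0.05):
    """Approximate per-group N for a two-sided pooled Z-test of two proportions."""
    z = NormalDist()
    z_alpha = z.inv_cdf(1 - alpha / 2)
    z_beta = z.inv_cdf(power)
    p_bar = (p1 + p2) / 2                                 # pooled proportion under H0
    null_sd = math.sqrt(2 * p_bar * (1 - p_bar))          # SD term under H0
    alt_sd = math.sqrt(p1 * (1 - p1) + p2 * (1 - p2))     # SD term under Ha
    return math.ceil(((z_alpha * null_sd + z_beta * alt_sd) / (p1 - p2)) ** 2)


n_prop = n_per_group_two_proportions(p1=0.9, p2=0.8)
print(n_prop, 2 * n_prop)
```

For P1 = 0.9 and P2 = 0.8 this gives 199 per group (398 in total), in line with the report.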
2- Sample size to achieve a certain accuracy for the estimation (Confidence Interval)
What is the sample size needed to estimate the average onset age for diabetes
in Dubai, such that we are 95% confident that we will not be off by more than
2 years, assuming that the standard deviation of the onset age is 6 years?
Things we need:
1) Confidence level (usually 95%)
2) The accuracy, or margin of error, which is about two standard errors (2 years in this
example)
3) The standard deviation of the individual observations of the diabetes onset age (6 years in
this example); we usually get this from the literature or from a pilot sample
Confidence Intervals for One Mean
Report Definitions
Confidence level is the proportion of confidence intervals (constructed with this same confidence level,
sample size, etc.) that would contain the population mean.
N is the size of the sample drawn from the population.
Distance from Mean to Limit is the distance from the confidence limit(s) to the mean. For two-sided intervals,
it is also known as the precision, half-width, or margin of error.
Target Distance from Mean to Limit is the value of the distance that is entered into the procedure.
Actual Distance from Mean to Limit is the value of the distance that is obtained from the procedure.
The standard deviation of the population measures the variability in the population.
Summary Statements
A sample size of 38 produces a two-sided 95% confidence interval with a distance from the mean
to the limits that is equal to 1.972 when the estimated standard deviation is 6.000.
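For a quick hand check, the margin-of-error requirement can be turned into a sample size with the normal approximation. This is a minimal sketch that assumes the standard deviation is known; the function name `n_for_mean_margin` is hypothetical.

```python
import math
from statistics import NormalDist


def n_for_mean_margin(sigma, margin, conf=0.95):
    """Approximate N so that the CI half-width for a mean is at most `margin`.

    Assumption: sigma is treated as known, so the z quantile is used
    instead of the t quantile.
    """
    z = NormalDist().inv_cdf(1 - (1 - conf) / 2)
    return math.ceil((z * sigma / margin) ** 2)


n_mean = n_for_mean_margin(sigma=6.0, margin=2.0)
print(n_mean)
```

The approximation gives 35, a little below the 38 in the summary statement. The difference is consistent with the report using the t distribution, which widens the interval when the standard deviation is estimated from the sample.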
What is the sample size needed to estimate the proportion (or percentage) of obesity
in Dubai, such that we are 95% confident that we will not be off by more than
0.05 (or 5%), assuming that the obesity proportion is 0.3 (or 30%) (a value usually taken from the
literature)?
Things we need:
1) Confidence level (usually 95%)
2) The accuracy, or margin of error, which is about two standard errors (0.05 in this
example)
3) The proportion of obesity (0.3 in this example); we usually get this from the literature
or from a pilot sample
Confidence Intervals for One Proportion
Numeric Results for Two-Sided Confidence Intervals for One Proportion ───────────────────────
Confidence Interval Formula: Simple Asymptotic
            Sample
Confidence  Size    Target  Actual  Proportion  Lower  Upper  Width if
Level       (N)     Width   Width   (P)         Limit  Limit  P = 0.5
0.950       323     0.100   0.100   0.300       0.250  0.350  0.109
Report Definitions
Confidence level is the proportion of confidence intervals (constructed with this same confidence level,
sample size, etc.) that would contain the population proportion.
N is the size of the sample drawn from the population.
Width is the distance from the lower limit to the upper limit.
Target Width is the value of the width that is entered into the procedure.
Actual Width is the value of the width that is obtained from the procedure.
Proportion (P) is the assumed sample proportion.
Lower Limit is the lower limit of the confidence interval.
Upper Limit is the upper limit of the confidence interval.
Width if P = 0.5 is the maximum width for a confidence interval with sample size N.
Summary Statements ─────────────────────────────────────────────────────────
A sample size of 323 produces a two-sided 95% confidence interval with a width equal to 0.100
when the sample proportion is 0.300.
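The sample size in the summary statement follows from the simple asymptotic (Wald) interval. A minimal sketch, with the hypothetical function name `n_for_proportion_margin`, where the margin is half of the interval width:

```python
import math
from statistics import NormalDist


def n_for_proportion_margin(p, margin, conf=0.95):
    """N so that the simple asymptotic CI half-width is at most `margin`."""
    z = NormalDist().inv_cdf(1 - (1 - conf) / 2)
    return math.ceil(z ** 2 * p * (1 - p) / margin ** 2)


n_obesity = n_for_proportion_margin(p=0.3, margin=0.05)
print(n_obesity)
```

With P = 0.3 and a margin of 0.05 (width 0.100) this gives 323, the value reported above.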
Confidence Intervals for One Proportion
            Sample
Confidence  Size    Target  Actual  Proportion  Lower  Upper  Width if
Level       (N)     Width   Width   (P)         Limit  Limit  P = 0.5
0.950       984     0.050   0.050   0.200       0.175  0.225  0.062
0.950       1291    0.050   0.050   0.300       0.275  0.325  0.055
0.950       1476    0.050   0.050   0.400       0.375  0.425  0.051
0.950       1537    0.050   0.050   0.500       0.475  0.525  0.050
0.950       246     0.100   0.100   0.200       0.150  0.250  0.125
0.950       323     0.100   0.100   0.300       0.250  0.350  0.109
0.950       369     0.100   0.100   0.400       0.350  0.450  0.102
0.950       385     0.100   0.100   0.500       0.450  0.550  0.100
0.950       110     0.150   0.150   0.200       0.125  0.275  0.187
0.950       144     0.150   0.150   0.300       0.225  0.375  0.163
0.950       164     0.150   0.150   0.400       0.325  0.475  0.153
0.950       171     0.150   0.150   0.500       0.425  0.575  0.150
The closer the proportion to 0.5, the larger the sample size.
The smaller the width, the larger the sample size.
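Both trends in the table can be checked by looping over the same widths and proportions. The sketch below assumes the simple asymptotic formula named in the report; the helper names `n_for_width` and `width_at` are hypothetical.

```python
import math
from statistics import NormalDist

Z95 = NormalDist().inv_cdf(0.975)  # two-sided 95% critical value


def n_for_width(p, width):
    """N so that the simple asymptotic 95% CI is no wider than `width`."""
    half = width / 2
    return math.ceil(Z95 ** 2 * p * (1 - p) / half ** 2)


def width_at(p, n):
    """Full width of the simple asymptotic 95% CI at proportion p and size n."""
    return 2 * Z95 * math.sqrt(p * (1 - p) / n)


rows = []
for width in (0.05, 0.10, 0.15):
    for p in (0.2, 0.3, 0.4, 0.5):
        n = n_for_width(p, width)
        # Last column: the widest possible interval for this N (at P = 0.5)
        rows.append((width, p, n, round(width_at(0.5, n), 3)))

for row in rows:
    print(row)
```

Within each width, N grows as P moves toward 0.5, and halving the width roughly quadruples N because N scales with 1 / width².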
Example 3: Estimating the correlation coefficient
What is the sample size needed to estimate the correlation coefficient between BMI and
total cholesterol, such that we are 95% confident that we will not be off by more than 0.1,
assuming the true correlation coefficient is 0.3?
Things we need:
1) Confidence level (usually 95%)
2) The accuracy, or margin of error, which is about two standard errors (0.1 in this example)
3) The correlation coefficient (0.3 in this example); we usually get this from the literature or from
a pilot sample
Confidence Intervals for Pearson's Correlation
Numeric Results for Two-Sided Confidence Intervals for Pearson Correlation ────────────────────
Report Definitions
Confidence level is the proportion of confidence intervals (constructed with this same confidence level,
sample size, etc.) that would contain the true correlation.
Sample Size N is the size of the sample drawn from the population.
Width is the distance from the lower limit to the upper limit.
Target Width is the value of the width that is entered into the procedure.
Actual Width is the value of the width that is obtained from the procedure.
r is the estimate of Pearson's product moment correlation coefficient.
Lower and Upper Limit are the lower and upper limits of the confidence interval.
Width if r = 0.0 is the maximum width for a confidence interval with sample size N.
Summary Statements
─────────────────────────────────────────────────────────
A sample size of 320 produces a two-sided 95% confidence interval with a width equal to 0.200
when the estimate of Pearson's product-moment correlation is 0.300.
The bigger the correlation, the smaller the sample size.
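The sample size for the correlation interval can be sketched by searching for the smallest N whose interval is narrow enough. This assumes the interval is built with the Fisher z-transformation, which may not be exactly what the reporting software does; the function names are hypothetical.

```python
import math
from statistics import NormalDist


def fisher_ci_width(r, n, conf=0.95):
    """Width of the Fisher z-transformation CI for a correlation r with sample size n."""
    z = NormalDist().inv_cdf(1 - (1 - conf) / 2)
    half = z / math.sqrt(n - 3)   # half-width on the Fisher z scale
    center = math.atanh(r)
    return math.tanh(center + half) - math.tanh(center - half)


def n_for_correlation_width(r, width, conf=0.95):
    """Smallest N whose Fisher-z CI is no wider than `width`."""
    n = 4  # the approximation needs N > 3
    while fisher_ci_width(r, n, conf) > width:
        n += 1
    return n


n_r = n_for_correlation_width(r=0.3, width=0.2)
print(n_r, n_for_correlation_width(r=0.5, width=0.2))
```

For r = 0.3 and a width of 0.200 the search stops at 320, the value in the summary statement, and a larger assumed correlation such as 0.5 needs fewer subjects.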
2) https://emj.bmj.com/content/20/5/453