power

Chapter 7 Power & Sample Size Calculation
Yibi Huang
Chapter 7 Power & Sample Size Calculation for CRDs

Section 10.3 Power & Sample Size for Factorial Designs
Chapter 7 - 1
Power & Sample Size Calculation
When proposing an experiment (applying for funding etc),

nowadays one needs to show that the proposed sample size (i.e.
the number of experiment units) is
I neither so small that scientifically interesting effects will be
swamped by random variation (i.e., has enough power to
reject a false H0 )
I nor larger than necessary, wasting resources (time & money)
Chapter 7 - 2
Errors and Power in Hypothesis Testing
I A Type I error occurs when H0 is true but is rejected.
I A Type II error occurs when H0 is false but is accepted.
I The (significance) level of a test is the chance of making a
Type I error, i.e., the chance to reject a H0 when it is true.
I The power of the test is the the chance of rejecting H0 when
Ha is true:
power = 1 − P(making type II error|H0 is false) = 1 − β
= P(correctly reject H0 |H0 is false)
I A good test should have small test-level and large power.
Ha is rejected H0 is rejected
(H0 is accepted) (Ha is accepted)
√
H0 is true α = P(Type I error)
√
H0 is false β = P(Type II error)
Chapter 7 - 3
Power Calculation for ANOVA
For a CRD design with g treatments, recall
the means model: yij = µi + εij
and the effects model: yij = µ + αi + εij ,
for i = 1, . . . , g , and j = 1, . . . , ni .
The two models are related via the relationship
µi = µ + α i .
For CRDs, we use the means model more often. However, for
power and sample size calculation, we need to use the effects
model and µ must be defined as
g ni g
1 XX 1 X
µ= µi = ni µ i .
N N
i=1 j=1 i=1
Pg Pg
which implies that i=1 ni αi = 0. Here N = i=1 ni is the total
number of experimental units.
Chapter 7 - 4
Recall the H0 and Ha for the ANOVA F test are
H 0 : µ1 = · · · = µg v.s. Ha : µi ’s not all equal,
in terms of the means model, or equivalently in terms of the effects

model, they are
H0 : α 1 = · · · = α g = 0 v.s. Ha : Not all αi ’s are 0.

MSTrt
Recall we reject H0 at level α if the test statistic F = MSE
exceeds the critical value Fα,g −1,N−g . So
Power = P(Reject H0 |Ha is true) = P(F > Fα, g −1, N−g ).
To find P(F > Fα, g −1, N−g ), we must know the distribution of F .
I What is the distribution of F under H0 ? Fg −1,N−g .
I And under Ha ?
Chapter 7 - 5
MSTrt SSTrt /(g − 1)
For F = = , recall in Ch3 we’ve shown that
MSE SSE/(N − g )
I no matter under H0 or Ha , it is always true that
SSE
∼ χ2N−g , and E(MSE) = σ 2
σ2
I For the numerator, SSTrt /σ 2 has a χ2g −1 distribution under
H0 . But under Ha , it has a non-central Chi-square
distribution χ2g −1,δ with non-central parameter
Pg 2
i=1 ni αi
δ= .
σ2
Moreover, Xg
E(SSTrt ) = (g − 1)σ 2 + ni αi2
i=1
| {z }
δσ 2
(
(g − 1)σ 2 under H0
=
(g − 1 + δ)σ 2 under Ha
Chapter 7 - 6
Non-Central F -Distribution
Under Ha , it can be shown that
MSTrt
F =
MSE
has a non-central F -distribution on degrees of freedom g − 1
and N − g , with non-centrality parameter δ, denoted as
Pg
ni α2
F ∼ Fg −1,N−g ,δ where δ = i=12 i .
σ
Recall that
g ni g
1 XX 1 X
αi = µi − µ, and µ = µi = ni µi .
N N
i=1 j=1 i=1
Non-central F -distribution is also right skewed, and the greater the

non-centrality parameter δ, the further away the peak of the
distribution from 0.
Chapter 7 - 7
Example 1: Power Calculation (1)
I g = 5 treatment groups
group sizes: n1 = n2 = n3 = 5, n4 = 6, n5 = 4
I assume σ = 0.8
I desired significance level α = 0.05
I find the power of the test when Ha is true with
µ1 = 1.6, µ2 = 0.6, µ3 = 2, µ4 = 0, µ5 = 1.
Solution.
The grand mean µ is
Pg
ni µi 5 × 1.6 + 5 × 0.6 + 5 × 2 + 6 × 0 + 4 × 1
µ = i=1 =
N 5+5+5+6+4
25
= =1
25
The αi ’s are thus αi = µi − µ = µi − 1:
α1 = 0.6, α2 = −0.4, α3 = 1, α4 = −1, α5 = 0.
Chapter 7 - 8
The non-centrality parameter is
Pg
i=1 ni αi2 5 × 0.62 + 5 × (−0.4)2 + 5 × 12 + 6 × (−1)2 + 4 × 02
δ= =
σ2 0.82
13.6
= = 21.25
0.64
So
(
MSTrt Fg −1, N−g = F4, 25−5 under H0
F = ∼
MSE Fg −1, N−g ,δ = F4, 25−5, 21.25 under Ha
df(x0, df1 = 4, df2 = 20)
0.8
H0: F ∼ F4,20
0.4
Ha: F ∼ F4,20,21.25
0.0
0 5 10 15
Chapter 7 - 9
The critical value to reject H0 keeping the significance level at
α = 0.05 is F0.05,4,20 ≈ 2.866.
> qf(0.05,4,20, lower.tail=F)
[1] 2.866081
When H0 is true, the chance that H0 is rejected is only 0.05 (the

red shaded area.)
df(x0, df1 = 4, df2 = 20)
0.8
Accept H0 Reject H0
0.4
Area = 0.05
0.0
0 F0.05,4,20 5 10 15
Reminder: Don’t confuse the following two:

I in most STAT books, Fα,df 1,df 2 means the area of the upper
tail is α.
I in R, qf(alpha, df1, df2) means the the area of the lower
Chapter 7 - 10
If Ha is true and
µ1 = 1.6, µ2 = 0.6, µ3 = 2, µ4 = 0, µ5 = 1,
we know then F ∼ F4,20,δ=21.25 . The power to reject H0 is the area
under the density of F4,20,δ=21.25 beyond the critical value (the
blue shaded area,) which is 0.9249.
> pf(qf(.95,4,20),4,20, ncp=21.25, lower.tail=F)
[1] 0.9249342
In the R codes, ncp means the “non-centrality parameter.”

df(x0, df1 = 4, df2 = 20)
0.8
Accept H0 Reject H0
0.4
Power
0.0
0 F0.05,4,20 5 10 15
Chapter 7 - 11
Remark
The power of a test is not a single value, but a function of the
parameters in Ha . If parameter µi ’s change their values, the power
of the test also changes. It makes no sense to talk about the power
of a test without specifying the parameters in Ha
Chapter 7 - 12
Power Of A Test Is Affected By ...
The larger the non-centrality parameter

Pg
ni α2
δ = i=12 i ,
σ
the greater the power.
The power of a test will increase if
I the number of replicate ni per treatment increases
I the magnitude
P of treatment effects αi ’s increase (under the
constraint i ni αi = 0)
I the size of noise σ 2 decreases.
Chapter 7 - 13
Example 2: Sample Size Calculation (1)
I g = 5 treatment groups of equal sample size ni = n for all i
I assume σ = 0.8
I desired significance level α = 0.05
I Assuming the design is balanced that all groups have identical
sample size n, what is the minimal sample size n per
treatment to have power 0.95 when
α1 = 0.5, α2 = −0.5, α3 = 1, α4 = −1, α5 = 0?
Solution. The non-centrality parameter is
Pg
ni α2 n
δ = i=12 i = [0.52 + (−0.5)2 + 12 + (−1)2 + 02 ]
σ 0.82
n
= × 2.5 = 3.9n
0.82
So
(
MSTrt Fg −1, N−g = F4, 5n−5 under H0
F = ∼
MSE Fg −1, N−g ,δ = F4, 5n−5, 3.9n under Ha
Chapter 7 - 14
Recall the critical value F ∗ for rejecting H0 at level α = 0.05 is
F ∗ = Fα, g −1, N−g = F0.05, 4, 5n−5
which can be find in R via the command
> F.crit = qf(alpha, g-1, N-g, lower.tail=F) # syntax
> F.crit = qf(0.05, 4, 5*n-5, lower.tail=F) # sub-in the values
By definition,
Power = P(reject H0 |Ha is true)
= P(the non central F statistic ≥ F ∗ )
= P(F (g − 1, N − g , δ) ≥ F ∗ )
= P(F (4, 5n − 5, 3.9n) ≥ F ∗ )
which can be found in R via the command
> pf(F.crit, g-1, N-g, ncp=delta, lower.tail=F) # syntax
> pf(F.crit, 4, 5*n-5, ncp=3.9*n, lower.tail=F) # sub-in values
In the R codes, ncp means the “non-centrality parameter.”

Chapter 7 - 15
Now we find the R code to find the power of the ANOVA F -test
when n is known. Let’s plug in different values of n and see what
is the smallest n to make power ≥ 0.95.
> F.crit = qf(alpha, g-1, N-g, lower.tail=F)
> pf(F.crit, g-1, N-g, ncp=delta, lower.tail=F) # syntax
> n = 5
> F.crit = qf(0.05, 4, 5*n-5, lower.tail=F)
> pf(F.crit, 4, 5*n-5, ncp=3.9*n, lower.tail=F)
[1] 0.8994675 # less than 0.95, not high enough
> n = 6
> F.crit = qf(0.05, 4, 5*n-5, lower.tail=F)
> pf(F.crit, 4, 5*n-5, ncp=3.9*n, lower.tail=F)
[1] 0.9578791 # greater than 0.95, bingo!
So we need 6 replicates in each of the 5 treatment groups to

ensure a power of 0.95 when
α1 = 0.5, α2 = −0.5, α3 = 1, α4 = −1, α5 = 0.
Remark: Again, we have to specify the Ha fully to to find the
appropriate sample size.
Chapter 7 - 16
But σ 2 is Unknown . . .
As σ 2 is usually unknown, here are a few ways to make a guess.

I Make a small-sample pilot study to get an estimate of σ.
I Based on prior studies or knowledge about the experimental

units, can you think of a range of plausible values for σ?
If so, choose the biggest one.
I You could repeat the sample size calculations for various levels
of σ to see how it affects the needed sample size.
Chapter 7 - 17
How to Specify the Ha ?
As the power of a test depends on the alternative hypothesis Ha ,
that is, the αi ’s, one might has to try several sets of αi ’s to find
the appropriate sample size. But how many Ha ’s we have to try?
Here is a useful trick.

1. Suppose we would be interested if any two means differed by
D or more.
2. The smallest value of δ in this case is when two means differ
by exactly, D, and the other g − 2 means are halfway between.
3. So try α1 = D/2, α2 = −D/2, and αi = 0 for all other τ .
Assuming equal sample sizes, the non-centrality parameter is
X nα2 n(D 2 /4 + D 2 /4) nD 2
i
δ= = = .
σ2 σ2 2σ 2
i
Chapter 7 - 18
Power and Sample Size Calculation for Complete Block Designs
For complete block designs
yij = µ + αi + βj + εij RCBD

yijk = µ + αi + βj + γk + εijk Latin Square
yijk` = µ + αi + βj + γk + δ` + εijk` Graeco-Latin Square
where αi ’s are the treatment effect, and βj , γk , δ` are effects of

blocking factors, when Ha is true, treatments have different effects,
the ANOVA F -statistic also has a non-central F -distribution
(
MStrt Fg −1, df of MSE under H0
F = ∼
MSE Fg −1, df of MSE,δ under Ha
where the non-centrality parameter δ is

r i αi2
P
δ= , where r = # of replicates per treatment
σ2
Note that the df changes from N − g to the df of MSE.
Chapter 7 - 19
Power and Sample Size Calculation for Balanced Factorial Designs
I Section 10.3 in Oehlert’s book

I We may ignore factorial structure and compute power and
sample size for the overall null hypothesis of no model effects.
I We will address methods for balanced data only because most
factorial experiments are designed to be balanced, and simple
formulae Power for balanced data for noncentrality parameters
exist only for balanced data.
I For factorial data, we usually test H0 about main effects or
interactions in addition to the overall H0 of no model effects.
Chapter 7 - 20
Non-Centrality Parameter for Balanced Factorial Designs
Ex, consider a 3-way factorial design
yijk` = µ + αi + βj + γk + αβij + βγjk + αγij + αβγijk + εijk`
to calculate the power of the F -test of a main effect or an

interaction effect under Ha , the non-centrality parameter can be
calculated as follows:
I Square the factorial effects (using the zero-sum constraint)
and sum them,
I Multiply this sum by the total number of data in the design
divided by the number of levels in the effect, and
I Divide that product by the error variance.
Chapter 7 - 21
E.g., consider a a × b × c factorial design with r replicates per
treatment
To test about the B main effects
H0 : β 1 = · · · = β b = 0 v.s. Ha : not all βj are 0
the ANOVA F -statistic for B main-effect is

(
MSB Fb−1, df of MSE under H0
F = ∼
MSE Fb−1, df of MSE,δ under Ha

X
N 2 2
X
2
δ= β σ = rac β σ2.
b j j j j
Here N = abcr is the total number of observations.

Chapter 7 - 22
E.g., consider a a × b × c factorial design with r replicates per
treatment
To test about the AB interactions
H0 : all αβij = 0 v.s. Ha : not all αβij = 0
the ANOVA F -statistic for AB interactions

(
MSAB F(a−1)(b−1), df of MSE under H0
F = ∼
MSE F(a−1)(b−1), df of MSE,δ under Ha

X
N 2
X
δ= (αβij ) σ 2 = rc (αβij )2 σ 2 .
ab ij ij
Here N = abcr is the total number of observations.

Chapter 7 - 23
Sample Size Should Be The Last Thing To Consider
In practice, sample size calculation is the last thing to do in

designing an experiment. Consider the followings before calculating
the required sample size.
I asking the right question,
I choosing an appropriate response,
I thinking about what factors need to be blocked on,
I choosing a right design,
I setting up the right procedure to recruit subjects, and so on,
All the above are more important than doing the right sample size
calculation!
Sometimes, choosing a better design can substantially reduces the
required sample size.
Chapter 7 - 24

power

Uploaded by

Copyright:

Available Formats

You might also like

power

Uploaded by

Document Information

Copyright

Available Formats

Share this document

Share or Embed Document

Sharing Options

Did you find this document useful?

Is this content inappropriate?

Copyright:

Available Formats

power

Uploaded by

Copyright:

Available Formats

Chapter 7 Power & Sample Size Calculation

Chapter 7 Power & Sample Size Calculation for CRDs

When proposing an experiment (applying for funding etc),

H 0 : µ1 = · · · = µg v.s. Ha : µi ’s not all equal,

in terms of the means model, or equivalently in terms of the effects

H0 : α 1 = · · · = α g = 0 v.s. Ha : Not all αi ’s are 0.

Power = P(Reject H0 |Ha is true) = P(F > Fα, g −1, N−g ).

Non-central F -distribution is also right skewed, and the greater the

When H0 is true, the chance that H0 is rejected is only 0.05 (the

Reminder: Don’t confuse the following two:

In the R codes, ncp means the “non-centrality parameter.”

The larger the non-centrality parameter

In the R codes, ncp means the “non-centrality parameter.”

So we need 6 replicates in each of the 5 treatment groups to

As σ 2 is usually unknown, here are a few ways to make a guess.

I Based on prior studies or knowledge about the experimental

Here is a useful trick.

yij = µ + αi + βj + εij RCBD

where αi ’s are the treatment effect, and βj , γk , δ` are effects of

where the non-centrality parameter δ is

I Section 10.3 in Oehlert’s book

Ex, consider a 3-way factorial design

yijk` = µ + αi + βj + γk + αβij + βγjk + αγij + αβγijk + εijk`

to calculate the power of the F -test of a main effect or an

yijk` = µ + αi + βj + γk + αβij + βγjk + αγij + αβγijk + εijk`

To test about the B main effects

H0 : β 1 = · · · = β b = 0 v.s. Ha : not all βj are 0

the ANOVA F -statistic for B main-effect is

where the non-centrality parameter δ is

Here N = abcr is the total number of observations.

yijk` = µ + αi + βj + γk + αβij + βγjk + αγij + αβγijk + εijk`

To test about the AB interactions

H0 : all αβij = 0 v.s. Ha : not all αβij = 0

the ANOVA F -statistic for AB interactions

where the non-centrality parameter δ is

Here N = abcr is the total number of observations.

In practice, sample size calculation is the last thing to do in

You might also like