
Introduction to Sample Size Determination and Power Analysis for Clinical Trials

John M. Lachin
From the Biostatistics Center, George Washington University, Bethesda, Maryland

ABSTRACT: The importance of sample size evaluation in clinical trials is reviewed and a
general method is presented from which specific equations are derived for sample
size determination or the analysis of power for a wide variety of statistical
procedures. The method is discussed and illustrated in relation to the t test, tests for
proportions, tests of survival time, and tests for correlations as they commonly occur
in clinical trials. Most of the specific equations reduce to a simple general form for
which tables are presented.

KEY WORDS: sample size determination, statistical power, survival analysis, tests for correlations,
tests for proportions, t tests

INTRODUCTION
It is widely recognized among statisticians that the evaluation of sample
size and power is a crucial element in the planning of any research venture.
Often it becomes necessary for the statistician to introduce these basic
concepts to collaborators who may be aware of the problem but who do not
understand the basic statistical logic. In this paper a simple expression is
presented that can be used for sample size evaluation for a wide variety of
statistical procedures and that has often been employed in collaboration
with medical researchers in the conduct of clinical trials [1]. The method
presented is quite general and, it is hoped, may be applied by clinician and
statistician alike in a variety of research settings.
When conducting a statistical test, two types of error must be considered:
Type I (false positive) and Type II (false negative), with probabilities $\alpha$ and
$\beta$, respectively. In the following we will consider the general family of
statistics, say $X$, that are normally distributed under a null hypothesis ($H_0$)
as $N(\mu_0, \Sigma_0^2)$ and under an alternative hypothesis ($H_1$) as $N(\mu_1, \Sigma_1^2)$, where
$\mu_1 > \mu_0$ or $\mu_1 < \mu_0$ and where $\Sigma_0^2$ and $\Sigma_1^2$ are some function of the variance $\sigma^2$

Address requests for reprints to Dr. John M. Lachin, the Biostatistics Center, Department of
Statistics, George Washington University, 7979 Old Georgetown Road, Bethesda, MD 20014.
Received October 1, 1979; revised and accepted September 2, 1980.

Controlled Clinical Trials 2, 93-113 (1981) 93


© 1981 Elsevier North Holland, Inc., 52 Vanderbilt Avenue, New York, NY 10017 0197-2456/81/020093-21$02.50

Figure 1. The distribution of a statistic $X$ with variance $\Sigma^2$ under the null
hypothesis $H_0$: $\mu = \mu_0$, i.e., the curve $P(X|H_0)$, and that under the alternative
hypothesis $H_1$: $\mu = \mu_1$, i.e., the curve $P(X|H_1)$, and the probabilities of Type
I error ($\alpha$) and Type II error ($\beta$), where $X_\alpha = \mu_0 + Z_\alpha \Sigma$. The horizontal
axis marks $\mu_0$, $X_\alpha$, and $\mu_1$; the curves are drawn for $\alpha = 0.05$ and $\beta = 0.25$.

of the individual observations and the sample size $N$.¹ Given these
distributions one can then determine $\alpha$ and $\beta$ as shown in Figure 1.
In a clinical trial the parameter $\mu$ is the treatment-control difference in
the outcome measure of interest, e.g., the mean difference on some
measurable pharmacologic effect such as serum cholesterol, or the difference in
the proportion displaying an event such as healing. In such cases $\mu_0$ is
usually zero and $\mu_1$ is specified as the minimal clinically relevant therapeutic
difference. (For a more basic introduction to these concepts, see [2-4].)
When the statistical test is conducted, the probability of Type I error, $\alpha$,
is specified by the investigator. However, the probability that a significant
result will be obtained if a real difference ($\mu_1$) exists (i.e., the power of the
test, $1 - \beta$) depends largely on the total sample size $N$. As one increases $N$
the spread of the distributions in Figure 1 decreases, i.e., the curves tighten;
thus $\beta$ decreases (power increases). Thus if the statistical test fails to reach
significance, the power of the test becomes a critical factor in reaching an
inference. It is not widely appreciated that the failure to achieve statistical
significance may often be related more to the low power of the trial than to
an actual lack of difference between the competing therapies. Clinical trials
with inadequate sample size are thus doomed to failure before they begin
and serve only to confuse the issue of determining the most effective therapy
for a given condition. Thus one should take steps to ensure that the power
of the clinical trial is sufficient to justify the effort involved.
Conversely, if the power of the trial in detecting a specified clinically
relevant difference ($\mu_1$) is sufficiently high, say 0.95, failure to achieve
significance may properly be interpreted as probably indicating negligible
relevant difference between the competing therapies. Thus the proper
interpretation of a "negative" result is based largely upon a consideration
of the power of the experiment.

¹The notation $N(\mu, \Sigma^2)$ denotes the normal distribution with mean $\mu$ and variance $\Sigma^2$. If
$X \sim N(\mu, \Sigma^2)$ then $Z = (X - \mu)\Sigma^{-1}$ is distributed as $N(0, 1)$, the latter being widely
tabulated as the standard normal distribution.

These points have been illustrated by Freiman et al. [5] who showed that
of 71 recent clinical trials that reached a negative result, 67 had power less
than 0.90 in detecting a moderate (25%) therapeutic improvement. Their
conclusion is that many of the therapies studied were not given a fair test
simply due to inadequate sample sizes and, thus, inadequate power.

SAMPLE SIZE AND POWER


The problem in planning a clinical trial is to determine the sample size $N$
required such that in testing $H_0$ with stated probability of Type I error $\alpha$,
the probability of Type II error is a desired small level $\beta$. The parameters of
the problem are $\alpha$, $\beta$, $\mu_0$, $\mu_1$, $\Sigma_0^2$, and $\Sigma_1^2$.
Since the variances $\Sigma^2$ are functions of $N$, the sample size required is that
which simultaneously satisfies the equalities $\Pr(Z > Z_\alpha) = \alpha$ if $H_0$ is true
and $\Pr(Z > Z_\alpha) = 1 - \beta$ if $H_1$ is true, where $Z_\alpha$ is the standard normal
deviate at the $\alpha$ significance level (e.g., $Z_\alpha = 1.645$ for $\alpha = 0.05$, one-sided)
and where $Z = (X - \mu_0)\Sigma_0^{-1}$ is the simple statistic one would use in testing
$H_0$, with $Z \sim N(0, 1)$ if $H_0$ is true (see footnote 1).
It can easily be shown, however, that the sample size that satisfies these
equalities also satisfies the equality
$$|\mu_1 - \mu_0| = Z_\alpha \Sigma_0 + Z_\beta \Sigma_1 \tag{1}$$
The term $Z_{\alpha/2}$ is employed for a two-tailed test. Derivations for particular
cases are given in Snedecor and Cochran [6, p. 111] and Fleiss [7, p. 29],
among others.
This basic relationship can readily be grasped from Figure 1. To relate
equation (1) to Figure 1, note that the critical value of $X$ at the $\alpha$ level of
significance is $X_\alpha = \mu_0 + Z_\alpha \Sigma_0$. One can thus readily derive equation (1)
from Figure 1 where the "distance" $|\mu_1 - \mu_0|$ is the sum of two parts,
$|X_\alpha - \mu_0| = Z_\alpha \Sigma_0$ and $|\mu_1 - X_\alpha| = Z_\beta \Sigma_1$, where $Z_\beta$ results from the
specification of $X_\alpha$, $\mu_1$, and $\Sigma_1$.
This equation can then be used to evaluate sample size or power for the
most commonly used statistical tests once $\Sigma_0$, $\Sigma_1$, and $\alpha$ have been specified.
The three basic questions one can ask are
1. What sample size is required to ensure power $1 - \beta$ of detecting a relevant
difference $\mu_1$?
2. What is the power ($Z_\beta$) of the experiment in detecting a relevant difference
$\mu_1$ when a specific sample size $N$ is employed?
3. What difference $\mu_1$ can be detected with power $1 - \beta$ if the experiment is
conducted with a specified sample size $N$?
Usually, question 1 is employed in planning an experiment and question 2
is employed in evaluating the results of an experiment. Question 3 can be
employed in either case.
For the determination of sample size (question 1) one simply solves
equation (1) for $N$ once the expressions for the variances $\Sigma^2$ have been
obtained. In many cases, $\Sigma^2$ will be a function of the form $\Sigma^2 = \sigma^2/N$, where
$\sigma^2$ is the variance of the individual measurements and $N$ is the total sample
size. In this case
$$|\mu_1 - \mu_0| = (Z_\alpha \sigma_0/\sqrt{N}) + (Z_\beta \sigma_1/\sqrt{N}) \tag{2}$$
Solving for total sample size $N$ one obtains simply
$$N = \left[\frac{Z_\alpha \sigma_0 + Z_\beta \sigma_1}{\mu_1 - \mu_0}\right]^2 \tag{3}$$
Likewise, to determine power (question 2) one solves for $Z_\beta$ to obtain from
(2)
$$Z_\beta = \frac{\sqrt{N}\,|\mu_1 - \mu_0| - Z_\alpha \sigma_0}{\sigma_1} \tag{4}$$

Power ($1 - \beta$) can then be determined from the value $Z_\beta$ by referring $Z_\beta$ to
tables of the normal distribution where values $Z_\beta < 0.0$ indicate power <
0.50.
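Equations (3) and (4) can be sketched in a few lines of code. The following is a minimal illustration, not part of the original method description: the function names are invented here, Python's standard `statistics.NormalDist` stands in for the normal tables, and the numerical inputs are hypothetical.

```python
from math import sqrt
from statistics import NormalDist

STD_NORMAL = NormalDist()  # standard normal distribution N(0, 1)


def z_value(tail_prob):
    """Standard normal deviate with upper-tail area tail_prob (Z_alpha or Z_beta)."""
    return STD_NORMAL.inv_cdf(1.0 - tail_prob)


def sample_size(mu0, mu1, sigma0, sigma1, alpha=0.05, beta=0.10):
    """Total N from equation (3) for a one-sided test."""
    z_alpha, z_beta = z_value(alpha), z_value(beta)
    return ((z_alpha * sigma0 + z_beta * sigma1) / (mu1 - mu0)) ** 2


def power(n, mu0, mu1, sigma0, sigma1, alpha=0.05):
    """Power 1 - beta: Z_beta from equation (4), referred to the normal distribution."""
    z_beta = (sqrt(n) * abs(mu1 - mu0) - z_value(alpha) * sigma0) / sigma1
    return STD_NORMAL.cdf(z_beta)


# Hypothetical inputs (not from the text): mu0 = 0, mu1 = 0.5, sigma0 = sigma1 = 1.
n_required = sample_size(0.0, 0.5, 1.0, 1.0)
achieved = power(n_required, 0.0, 0.5, 1.0, 1.0)
print(round(n_required, 1), round(achieved, 3))
```

For a two-tailed test one would pass alpha/2 in place of alpha, in keeping with the use of $Z_{\alpha/2}$ noted under equation (1).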
Similarly, the minimal detectable difference with power $1 - \beta$ given a
sample size $N$ (question 3) is obtained by solving equation (2) for $\mu_1$. In
some cases an explicit solution is obtained, whereas in others, e.g., for
proportions, an iterative procedure is required since both $\mu_1$ and $\sigma$ will be
functions of the same parameters. Thus we shall only consider questions 1
and 2 in the following.
Since equations (3) and (4) follow obviously from equation (2), it is often
convenient to express the basic relationship in the form
$$\sqrt{N}\,|\mu_1 - \mu_0| = Z_\alpha \sigma_0 + Z_\beta \sigma_1 \tag{5}$$

which can then be solved for N or Za. This form will be employed in cases
where the basic equations equivalent to equations (3) and (4) become
cumbersome.
The following sections demonstrate how these simple relationships can
be employed with Student's t tests, chi-square tests for proportions, analyses
of survival time, and tests for correlations. In each case, the explicit
equations for sample size and power computation are presented with
examples.
Most procedures allow unequal group sizes reflected in the sample
fractions $Q_e$ and $Q_c$ where $n_e = Q_e N$, $n_c = Q_c N$ and $Q_e + Q_c = 1$. In the
following, the subscripts e and c are used to denote the experimental and
control groups where the total sample size then is $N = n_e + n_c$. Obviously,
$Q_e = Q_c = 0.5$ for equal-sized groups. For these procedures it is well known
that power is maximized and total sample size minimized for equal-sized
groups, but due to ethical considerations, unequal-sized groups are at times
desirable.
Virtually all these methods can be used with a simple calculator. To use
these methods one must first specify the parameters of the problem. In
addition to $Z_\alpha$ and $Z_\beta$, and the sample fractions $Q_e$ and $Q_c$, the specific
parameters of that test must be specified. For example, for the t test of two
independent groups, the group means $\nu_e$ and $\nu_c$ are required as well as the
standard deviation of the measures, $\sigma$. For some other statistical procedures
a separate standard deviation is not required since the variance will be a
function of the expectation and/or the sample size alone.

ADDITIONAL CONSIDERATIONS
Sample size evaluation for a clinical trial is almost always a matter of
compromise between the available resources and the various objectives,
such as safety with small effects desirable and efficacy with large effects
desirable [1, 2]. This leads to a recursive process whereby one cycles
through various specifications of the desired detectable effects and considers
the resulting sample size in relation to the objectives and the resources
available. Eventually one reaches a sample size and statement of objectives
that are consistent with each other and the available resources.
In this process, however, attention should also be given to the operational
aspects of the trial. Foremost among these are the factors related to the
administration of the program of therapy and the evaluation of outcome. As
a simple example, consider a clinical trial of an ulcer healing agent with
healing assessed endoscopically after 4 weeks. Noncompliance, dropouts,
and lack of control of other factors such as diet, drinking, and smoking may
all combine to reduce the observed healing rate and thus reduce the
statistical power of the trial. Likewise, failure to set uniform standards for
endoscopic examination and criteria for healing will increase the variability
of the outcome measurement, again leading to reduced power.
Among these, a major consideration is the rate of dropouts, patients who
terminate therapy for reasons related neither to the disease under treatment
nor the therapy. If a dropout rate $R$ is expected, a simple but adequate
adjustment is provided by $N_d = N/(1 - R)^2$ where $N$ is the sample size
calculated assuming no dropouts and $N_d$ that required with dropouts [1].
Likewise, to evaluate power one would use equations and tables with
$N = N_d(1 - R)^2$ where $N_d$ is the observed or expected sample size. Additional
procedures are described in [8] and [9].
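The dropout adjustment is simple enough to show directly. The sketch below uses invented function names and a hypothetical example (N = 400 with an expected dropout rate of 10%); it only restates the two formulas given above.

```python
from math import ceil


def inflate_for_dropouts(n, dropout_rate):
    """N_d = N / (1 - R)^2: sample size to recruit when a fraction R drops out."""
    if not 0.0 <= dropout_rate < 1.0:
        raise ValueError("dropout_rate must be in [0, 1)")
    return n / (1.0 - dropout_rate) ** 2


def effective_sample_size(n_d, dropout_rate):
    """N = N_d (1 - R)^2: the sample size to use in the power equations."""
    if not 0.0 <= dropout_rate < 1.0:
        raise ValueError("dropout_rate must be in [0, 1)")
    return n_d * (1.0 - dropout_rate) ** 2


# Hypothetical example: N = 400 computed with no dropouts, R = 0.10 expected.
n_to_recruit = ceil(inflate_for_dropouts(400, 0.10))
print(n_to_recruit)
```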

STUDENT'S t TEST
In its most general form, Student's t is used to test the hypothesis that the
mean of a normal variable, $\nu$, equals some specified value, $H_0$: $\mu_0 = \nu_0$,
against some alternative $H_1$: $\mu_1 = \nu_1$, $\nu_1 \neq \nu_0$, when the variance is unknown.
The test statistic is of the form $t = \sqrt{N}(\bar{x} - \mu_0)/S$ where $\bar{x}$ is the sample
mean with squared standard error $S^2/N$, $S^2$ being the unbiased sample estimate of
the variance $\sigma^2$ on $N - 1$ degrees of freedom (df). The distribution of t
becomes increasingly close to that of a standard normal variable as df
increases, at least 30 df being required for the approximation to be adequate
[10]. Thus equations (1) through (4) can be employed to yield an approximate
evaluation of sample size and power.
This approach, however, will tend to overestimate power for given $S^2$
and $N$, and thus it will tend to underestimate the required sample size,
although this effect is increasingly negligible for increasing df. An adequate
adjustment is obtained by the correction factor $f = (df + 3)/(df + 1)$, where
$fN$ patients are actually employed after $N$ is obtained from equation (3), or,
alternately, by $N/f$ used in equation (4) when solving for power [6, p. 114].
For those who desire an exact solution, an iterative procedure is required
and is described, with brief tables, in Cochran and Cox [11, p. 19]. Under
this procedure, one obtains a trial value for $N$ that is then adjusted in light
of the resulting degrees of freedom.
In sample size or power evaluation for the t test a critical feature is the
specification of $\sigma^2$. For the other tests considered later (proportions, etc.) the
variances are not specified separately. Usually a value for $\sigma^2$ can be specified
based on prior experiments using the same measurements; in these cases it
is best to use the largest value of $\sigma^2$ expected. Often a pilot study is helpful to
provide an estimate of $\sigma^2$ under the conditions to be used in the experiment
to be conducted. Of course, if power is to be evaluated after an experiment
was conducted with a given $N$, then the obtained estimate $S^2$ of the variance
should be employed.

A Single Mean
The most basic form of the t test is the test of $H_0$: $\mu_0 = \nu_0$ for some a priori
specified mean value $\nu_0$ with variance $\sigma_0^2$, against an alternative $H_1$: $\mu_1 = \nu_1
\neq \nu_0$ with variance $\sigma_1^2$. The test statistic is as presented where $\bar{x}$ is the mean
of a single sample of observations with sample variance $S^2$. Given specified
$\alpha$, $\beta$, $\mu_0$, $\mu_1$, $\sigma_0$, and $\sigma_1$, the equations for sample size $N$ or power ($Z_\beta$) are
exactly as presented in equations (3) and (4), respectively.

Two Independent Groups


The t test is most widely used to test the null hypothesis that the means of
two independent groups are equal, $H_0$: $\mu_0 = (\nu_e - \nu_c) = 0$, based on two
separate samples of sizes $n_e = Q_e N$ and $n_c = Q_c N$, $Q_e + Q_c = 1$. The
fractions $Q_e$ and $Q_c$ are the sample fractions and refer simply to the
proportion of patients in each group, $N$ being the total sample size. The test
statistic employs the pooled estimate $S^2$ of the common variance $\sigma_e^2 = \sigma_c^2 =
\sigma^2$ with $N - 2$ df. It is well known that power is maximized for $Q_e = Q_c =
0.5$.
Under $H_1$, $\mu_1$ is specified as the minimal relevant difference to be
detected, $\mu_1 = |\nu_e - \nu_c| \neq 0$, and it follows that $\Sigma_0^2 = \Sigma_1^2 = \sigma^2(Q_e^{-1} + Q_c^{-1})/N$.
Using equation (1) the equations for total $N$ and $Z_\beta$ are
$$N = \frac{\sigma^2 (Q_e^{-1} + Q_c^{-1})(Z_\alpha + Z_\beta)^2}{\mu_1^2} \tag{6}$$
$$Z_\beta = \frac{\sqrt{N}\,\mu_1}{\sigma\sqrt{Q_e^{-1} + Q_c^{-1}}} - Z_\alpha \tag{7}$$
where for equal sample sizes $(Q_e^{-1} + Q_c^{-1}) = 4.0$.
For example, consider an experiment where $\sigma$ is known to be, or known not
to exceed, 1.0 and it is desired to detect a difference $\mu_1 = (\nu_e - \nu_c)
= 0.20$. From equation (6), to ensure a 90% chance of detecting this
difference ($Z_\beta = 1.282$) with $\alpha = 0.05$ (one-sided, $Z_\alpha = 1.645$), $N = 858$ is
required. This would yield a t test on 856 ($N - 2$) df, and thus with the
correction factor $f = 1.002$, the final sample size desired is $fN = 860$.
Suppose, however, that the experiment was actually conducted with only
102 patients. The correction factor is 1.02 and equation (7) is employed with
$N = (102/f) = 100$, to yield $Z_\beta = -0.645$, thus indicating that for $N = 100$ the
experiment had about 26% power. If the experiment produced a negative
result, however, equation (7) could also be used to show that with $N = 100$
there was almost 100% power in detecting a difference $\mu_1 = 1.0$ ($Z_\beta = 3.355$).
Thus one could safely rule out a difference on the order of $\mu_1 = 1.0$.
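The arithmetic of this example can be retraced with a short script. This is a sketch only: the function names are invented for illustration, the normal deviates are the rounded table values quoted in the text, and `statistics.NormalDist` replaces the normal tables when converting $Z_\beta$ to power.

```python
from math import sqrt
from statistics import NormalDist

Z_ALPHA = 1.645  # alpha = 0.05, one-sided (value quoted in the text)
Z_BETA = 1.282   # beta = 0.10 (value quoted in the text)


def t_test_total_n(mu1, sigma, z_alpha, z_beta, q_e=0.5):
    """Equation (6): total N for two independent groups."""
    q_c = 1.0 - q_e
    return sigma ** 2 * (1 / q_e + 1 / q_c) * (z_alpha + z_beta) ** 2 / mu1 ** 2


def t_test_z_beta(n, mu1, sigma, z_alpha, q_e=0.5):
    """Equation (7): Z_beta for a given total N."""
    q_c = 1.0 - q_e
    return sqrt(n) * mu1 / (sigma * sqrt(1 / q_e + 1 / q_c)) - z_alpha


def correction_factor(df):
    """f = (df + 3)/(df + 1), the adjustment for using t rather than the normal."""
    return (df + 3) / (df + 1)


# Planning: sigma = 1.0, mu1 = 0.20; the result is then rounded up for use.
n_normal = t_test_total_n(0.20, 1.0, Z_ALPHA, Z_BETA)

# Evaluation: 102 patients give 100 df, so N/f (about 100) is used for power.
n_effective = 102 / correction_factor(100)
z_small = t_test_z_beta(100, 0.20, 1.0, Z_ALPHA)
power_small = NormalDist().cdf(z_small)
z_large = t_test_z_beta(100, 1.0, 1.0, Z_ALPHA)

print(round(n_normal, 1), round(z_small, 3), round(power_small, 2), round(z_large, 3))
```

Unequal allocation can be explored by changing `q_e`; any value other than 0.5 increases the total N required by equation (6).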

Paired Observations
In the event that the observations in the two groups are linked together by
pairing or repeated measures at times a and b on the same patient, the t test
is conducted using the mean difference $\bar{d} = \bar{X}_b - \bar{X}_a$ with variance
$\Sigma^2 = \sigma_d^2/N$ where $\sigma_d^2 = 2\sigma^2(1 - \rho)$ if $\sigma_a^2 = \sigma_b^2 = \sigma^2$, $\rho$ being the pre-post
correlation. From equation (1) the equations for $N$ and $Z_\beta$ for detecting a
true difference $\mu_1 = \nu_b - \nu_a$ are:
$$N = \frac{(Z_\alpha + Z_\beta)^2 \sigma_d^2}{\mu_1^2} \tag{8}$$
$$Z_\beta = \frac{\sqrt{N}\,\mu_1}{\sigma_d} - Z_\alpha \tag{9}$$
which are equivalent to using equations (3) and (4) with $\sigma_d$ in place of $\sigma_0$
and $\sigma_1$.
In this instance, often an estimate of $\sigma_d^2$ is available from prior experience.
If not, an estimate of $\sigma^2$ can be used with an estimate of the correlation $\rho$.
Note that pairing is only efficient if $\rho > 0$, i.e., there is positive correlation
between the a and b measurements. If no estimate of $\rho$ is available, it is
safest to assume $\rho = 0$ or, nominally, $\rho = 0.10$.
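The effect of the pre-post correlation on the required number of pairs can be seen in a brief sketch. The function names and the inputs ($\sigma = 1.0$, $\mu_1 = 0.20$, $\rho = 0$ versus $\rho = 0.5$) are hypothetical and chosen only for illustration; the formulas are those for $\sigma_d^2$ and equations (8) and (9).

```python
from math import sqrt

Z_ALPHA = 1.645  # alpha = 0.05, one-sided
Z_BETA = 1.282   # beta = 0.10


def sigma_d_squared(sigma, rho):
    """Variance of a paired difference, 2*sigma^2*(1 - rho), for equal variances."""
    return 2.0 * sigma ** 2 * (1.0 - rho)


def paired_n(mu1, sigma_d_sq, z_alpha, z_beta):
    """Equation (8): number of pairs N."""
    return (z_alpha + z_beta) ** 2 * sigma_d_sq / mu1 ** 2


def paired_z_beta(n, mu1, sigma_d_sq, z_alpha):
    """Equation (9): Z_beta for N pairs."""
    return sqrt(n) * mu1 / sqrt(sigma_d_sq) - z_alpha


# Hypothetical inputs: sigma = 1.0, mu1 = 0.20.
n_rho_zero = paired_n(0.20, sigma_d_squared(1.0, 0.0), Z_ALPHA, Z_BETA)
n_rho_half = paired_n(0.20, sigma_d_squared(1.0, 0.5), Z_ALPHA, Z_BETA)
print(round(n_rho_zero, 1), round(n_rho_half, 1))
```

Under these assumed inputs, a correlation of 0.5 halves the number of pairs relative to $\rho = 0$, which is why an overly optimistic value of $\rho$ risks an underpowered trial.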

Two Independent Groups with Paired Observations


A common related design is to employ two treatments in samples of sizes
$n_e$ and $n_c$ where each patient also serves as his own control with measures
at times a and b such as before and after treatment. In this case the test
statistic is the same as for two independent groups where the pre-post
differences for each patient are used as the individual observations; i.e.,
$\bar{X}_e = \bar{d}_e$ and $\bar{X}_c = \bar{d}_c$; and where the pooled estimate of the variance of the
differences ($S_d^2$) is employed. The problem is formulated as $\mu_1 = |\delta_e -
\delta_c|$, $\delta_e = (\nu_{eb} - \nu_{ea})$, $\delta_c = (\nu_{cb} - \nu_{ca})$, with $\Sigma^2 = \sigma_d^2(Q_e^{-1} + Q_c^{-1})/N$. This yields
equations equivalent to equations (6) and (7) with $\sigma_d$ substituted for $\sigma$.

PROPORTIONS
In experiments where the basic outcome is a qualitative variable, such as
success versus failure, the data are usually expressed as a proportion, e.g.,
the proportion of successes, or simply $p$. The exact probability distribution
of such a proportion is the binomial distribution that has parameters $N$
(sample size) and $\pi$ (the true population proportion). For large $N$ (i.e.,
asymptotically) the binomial distribution may be approximated by a normal
distribution with mean $\mu = \pi$ and variance $\Sigma^2 = \pi(1 - \pi)/N$. Thus, in
experiments involving tests for proportions, the basic equations may be
used for the determination of sample size and power.

A Single Proportion
In one-sample problems that yield a single proportion, the hypothesis $H_0$: $\mu_0
= \pi_0$ is tested where one wishes to detect a clinically relevant alternative
$H_1$: $\mu_1 = \pi_1$ where $\pi_1 > \pi_0$ or $\pi_1 < \pi_0$. Given a proportion $p$ from a sample
of size $N$, the test statistic employed is $Z = (p - \pi_0)/\Sigma_0$ where $\Sigma_0^2 = \pi_0(1 -
\pi_0)/N$ and where $Z \sim N(0, 1)$ if $H_0$ is true. As an example, in a cohort
follow-up study, one might test that the k-year mortality equals that obtained in a
previous (and comparable) cohort, where $\pi_0$ is that observed in this latter
cohort.
For the determination of sample size or power one substitutes $\sigma_0^2 = \pi_0(1
- \pi_0)$ and $\sigma_1^2 = \pi_1(1 - \pi_1)$ into equation (3) or (4); the equations for sample
size $N$ and power $Z_\beta$ being
$$N = \left[\frac{Z_\alpha\sqrt{\pi_0(1 - \pi_0)} + Z_\beta\sqrt{\pi_1(1 - \pi_1)}}{\pi_1 - \pi_0}\right]^2 \tag{10}$$
$$Z_\beta = \frac{\sqrt{N}\,|\pi_1 - \pi_0| - Z_\alpha\sqrt{\pi_0(1 - \pi_0)}}{\sqrt{\pi_1(1 - \pi_1)}} \tag{11}$$
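As a hedged illustration of equations (10) and (11), consider a hypothetical cohort study (the values are not from the text) in which a historical k-year mortality of $\pi_0 = 0.30$ is to be compared with a clinically relevant alternative of $\pi_1 = 0.20$. The function names below are invented for this sketch.

```python
from math import sqrt

Z_ALPHA = 1.645  # alpha = 0.05, one-sided
Z_BETA = 1.282   # beta = 0.10


def single_proportion_n(pi0, pi1, z_alpha, z_beta):
    """Equation (10): sample size for testing H0: pi = pi0 against pi = pi1."""
    numerator = z_alpha * sqrt(pi0 * (1 - pi0)) + z_beta * sqrt(pi1 * (1 - pi1))
    return (numerator / (pi1 - pi0)) ** 2


def single_proportion_z_beta(n, pi0, pi1, z_alpha):
    """Equation (11): Z_beta for a given sample size N."""
    numerator = sqrt(n) * abs(pi1 - pi0) - z_alpha * sqrt(pi0 * (1 - pi0))
    return numerator / sqrt(pi1 * (1 - pi1))


# Hypothetical inputs: pi0 = 0.30 (historical cohort), pi1 = 0.20 (alternative).
n_cohort = single_proportion_n(0.30, 0.20, Z_ALPHA, Z_BETA)
z_check = single_proportion_z_beta(n_cohort, 0.30, 0.20, Z_ALPHA)
print(round(n_cohort, 1), round(z_check, 3))
```

Feeding the N from equation (10) back into equation (11) returns the $Z_\beta$ that was specified, which is a convenient check on any hand calculation.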

Two Independent Proportions


In the case of two independent samples of sizes $n_e$ and $n_c$, the null hypothesis
$H_0$: $\mu_0 = (\pi_e - \pi_c) = 0$ is tested with the statistic $Z = (p_e - p_c)/S$ where $p_e$
and $p_c$ are the proportions of events in the two samples, $\Sigma_0^2 = \sigma_0^2/N$ is estimated as
$S^2 = (n_e^{-1} + n_c^{-1})\bar{p}(1 - \bar{p})$, $\bar{p} = Q_e p_e + Q_c p_c$, and where under $H_0$, $Z \sim N(0,
1)$.
For the determination of sample size and power, the minimal relevant
difference $\mu_1 = |\pi_e - \pi_c|$ is then specified. Since the variance will depend
on the values specified and not on the absolute difference, both $\pi_e$ and $\pi_c$
must be specified. This yields $\sigma_1^2 = [\pi_e(1 - \pi_e)Q_e^{-1} + \pi_c(1 - \pi_c)Q_c^{-1}]$.
Under the null hypothesis $H_0$: $\pi_e = \pi_c = \bar{\pi}$, $\mu_0 = 0$ and $\bar{\pi}$ is specified as $\bar{\pi}
= Q_e\pi_e + Q_c\pi_c$. This then yields $\sigma_0^2 = (Q_e^{-1} + Q_c^{-1})\bar{\pi}(1 - \bar{\pi})$. Substituting
into equation (5) yields the well-known formula
$$\sqrt{N}\,|\pi_e - \pi_c| = Z_\alpha\sqrt{\bar{\pi}(1 - \bar{\pi})(Q_e^{-1} + Q_c^{-1})} + Z_\beta\sqrt{\pi_e(1 - \pi_e)Q_e^{-1} + \pi_c(1 - \pi_c)Q_c^{-1}} \tag{12}$$
which can then be solved for $N$ or $Z_\beta$.


This expression can be simplified, however, by noting that for equal
sample sizes $\sigma_0^2 = 4\bar{\pi}(1 - \bar{\pi})$ is always greater than or equal to $\sigma_1^2 = 2\pi_e(1
- \pi_e) + 2\pi_c(1 - \pi_c)$. This then allows use of the simpler equations
$$N = \frac{(Z_\alpha + Z_\beta)^2\, 4\bar{\pi}(1 - \bar{\pi})}{(\pi_e - \pi_c)^2} \tag{13}$$
$$Z_\beta = \frac{\sqrt{N}\,|\pi_e - \pi_c|}{2\sqrt{\bar{\pi}(1 - \bar{\pi})}} - Z_\alpha \tag{14}$$
Halperin (personal communication to Paul Canner) has shown that the
approximation (13) will yield total sample sizes no greater than $Z_\beta^2 + 2Z_\alpha Z_\beta$
above that obtained from equation (12); i.e., to within 5.86 units for $\alpha =
0.05$ (one-sided), $\beta = 0.10$.
In using these formulas, note that $\bar{\pi}$ depends on the actual values $\pi_e$ and
$\pi_c$ specified under $H_1$ and not just on the relevant difference $\mu_1 = |\pi_e - \pi_c|$.
Also, since $\bar{\pi}(1 - \bar{\pi})$ is at a maximum for $\bar{\pi} = 0.50$, it then follows that for
fixed positive $\mu_1$, as $\pi_c$ gets smaller, the required sample size also gets
smaller and power increases (assuming $\pi_c < \pi_e$). In such problems, therefore,
it is safest to specify the largest realistic value for $\pi_c$ (where $\pi_e > \pi_c$ and $\pi_c
< 0.50$) so as not to underestimate sample size or overestimate power.
For example, suppose we wished to conduct a controlled clinical trial of
a new therapy and the rate of successes in the control group is not expected
to be greater than $\pi_c = 0.05$. Further, we would consider the new therapy
to be superior (cost, risks, and other factors considered) if $\pi_e = 0.15$, thus
yielding $\mu_1 = 0.10$, $\bar{\pi} = 0.10$, and $4\bar{\pi}(1 - \bar{\pi}) = 0.36$. Using equation (13)
with $\alpha = 0.05$ (one-sided) and $\beta = 0.10$ yields $N = 310$ (rounded up from
308.4); the more precise formula (12) yields $N = 306$ (rounded from 304.6).
Suppose, however, that the experiment was conducted with only $N =
100$. Using equation (14) indicates that the power of the experiment in
detecting $\mu_1 = 0.10$ with $\pi_c = 0.05$ is only 51% ($Z_\beta = 0.022$). If a negative
result was obtained, however, one might wish to determine the power of
having detected larger differences, say $\mu_1 = 0.40$ for $\pi_c = 0.05$. This yields
$\pi_e = 0.45$, $\bar{\pi} = 0.25$, and $2\sqrt{\bar{\pi}(1 - \bar{\pi})} = 0.866$. From equation (14) we find
that $N = 100$ yields 99.9% power ($Z_\beta = 2.97$). Thus a true difference of this
magnitude could confidently be ruled out.
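The example can be retraced numerically. The following sketch uses invented function names and the rounded deviates $Z_\alpha = 1.645$ and $Z_\beta = 1.282$; it solves equation (12) for N, evaluates the simpler equations (13) and (14), and checks the Halperin bound on the difference between the two sample sizes.

```python
from math import sqrt

Z_ALPHA = 1.645  # alpha = 0.05, one-sided
Z_BETA = 1.282   # beta = 0.10


def n_two_proportions(pi_e, pi_c, z_alpha, z_beta, q_e=0.5):
    """Equation (12) solved for total N."""
    q_c = 1.0 - q_e
    pi_bar = q_e * pi_e + q_c * pi_c
    null_term = z_alpha * sqrt(pi_bar * (1 - pi_bar) * (1 / q_e + 1 / q_c))
    alt_term = z_beta * sqrt(pi_e * (1 - pi_e) / q_e + pi_c * (1 - pi_c) / q_c)
    return ((null_term + alt_term) / abs(pi_e - pi_c)) ** 2


def n_two_proportions_simple(pi_e, pi_c, z_alpha, z_beta):
    """Equation (13): simpler form, equal group sizes only."""
    pi_bar = 0.5 * (pi_e + pi_c)
    return (z_alpha + z_beta) ** 2 * 4 * pi_bar * (1 - pi_bar) / (pi_e - pi_c) ** 2


def z_beta_simple(n, pi_e, pi_c, z_alpha):
    """Equation (14): Z_beta for total N, equal group sizes only."""
    pi_bar = 0.5 * (pi_e + pi_c)
    return sqrt(n) * abs(pi_e - pi_c) / (2 * sqrt(pi_bar * (1 - pi_bar))) - z_alpha


n_precise = n_two_proportions(0.15, 0.05, Z_ALPHA, Z_BETA)
n_simple = n_two_proportions_simple(0.15, 0.05, Z_ALPHA, Z_BETA)
halperin_bound = Z_BETA ** 2 + 2 * Z_ALPHA * Z_BETA
z_small_diff = z_beta_simple(100, 0.15, 0.05, Z_ALPHA)
z_large_diff = z_beta_simple(100, 0.45, 0.05, Z_ALPHA)
print(round(n_precise, 1), round(n_simple, 1), round(halperin_bound, 2))
```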
For further illustration, Lachin [2] used these procedures to discuss
sample size considerations for FDA Phase II and III clinical trials of new
drugs, and these methods have been used in a variety of clinical trials.
Additional references include [1, 5, 7, and 8].

The Angular Transformation


The procedures just described are usually preferred since the tests for
proportions using the normal approximation to the binomial are equivalent
to the usual $\chi^2$ tests (see under Discussion following). Others [12], however,
have employed the angular transformation $A(p) = 2 \arcsin\sqrt{p}$, where $A(p)$
is expressed in radians, not degrees.² Given a proportion $p$ with binomial
expectation $\pi$, then $A(p)$ is approximately normally distributed as $N[A(\pi),
N^{-1}]$. Since the variance ($\Sigma^2 = 1/N$) is now independent of the expectation,
the resulting sample size and power equations are further simplified. This
approach, however, is not as accurate as that described herein.

²For those whose calculators provide the sin function in degrees, the conversion factor is
arcsin (radians) = (0.017453) arcsin (degrees).
As an illustration, again consider the example presented earlier under
Two Independent Proportions with $\pi_c = 0.05$ and $\pi_e = 0.15$. The equation
for total $N$ based on the arcsin transformation with equal sample sizes is
$$N = \frac{4(Z_\alpha + Z_\beta)^2}{[A(\pi_e) - A(\pi_c)]^2} \tag{15}$$
and for $\alpha = 0.05$ (one-sided) and $\beta = 0.10$ we find $N = 290$. This is
somewhat less than the $N = 310$ estimated from the approximate equation
(13) and the more precise formula (12), which yields $N = 306$. In general the
angular transformation procedure yields $N$ about 3-5% less than that from
equation (12), with $\alpha = 0.05$ (one-sided), $\beta = 0.10$.
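A minimal sketch of the angular-transformation calculation follows. It assumes equal groups of $N/2$ patients each, so that the difference $A(p_e) - A(p_c)$ has variance $2/N + 2/N = 4/N$; the factor of 4 applied to $(Z_\alpha + Z_\beta)^2$ for total N comes from that assumption. The function names are invented for illustration, and Python's `math.asin` returns radians, as the transformation requires.

```python
from math import asin, sqrt

Z_ALPHA = 1.645  # alpha = 0.05, one-sided
Z_BETA = 1.282   # beta = 0.10


def angular(p):
    """Angular transformation A(p) = 2 arcsin(sqrt(p)), in radians."""
    return 2.0 * asin(sqrt(p))


def n_arcsin_total(pi_e, pi_c, z_alpha, z_beta):
    """Total N for two equal groups; each A(p) has variance 1/(N/2) = 2/N."""
    return 4.0 * (z_alpha + z_beta) ** 2 / (angular(pi_e) - angular(pi_c)) ** 2


n_total = n_arcsin_total(0.15, 0.05, Z_ALPHA, Z_BETA)
print(round(n_total, 1))
```

Rounding the result up to the next even number gives equal-sized groups, consistent with the $N = 290$ reported in the example.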

Paired Observations
Now consider the problem where two groups of observations are linked
together in some way such as through matching or repeated measures on
the same individuals at times a and b. This is exactly analogous to the
problem of the t test for paired observations except that the outcome is now
qualitative rather than quantitative. In this case, the basic data are expressed
as

                            Time b
                       (+)          (-)
    Time a    (+)    $m_{++}$     $m_{+-}$     $m_a$
              (-)    $m_{-+}$     $m_{--}$
                     $m_b$

where $m_{+-}$, for example, is the number of pairs with (+) for observation a
(time a or the a pair member) and (-) for observation b. For the a
observations $m_a$ is the total number (+) and likewise $m_b$ for the b
observations. The frequencies (m's) are then converted to proportions (p's) by
dividing by the total number of pairs, $N$.
In such problems, one wishes to test the null hypothesis $H_0$: $\mu_0 = (\pi_b -
\pi_a) = 0$. Note, however, that $\pi_b - \pi_a = \pi_{-+} - \pi_{+-}$; thus the problem can
then be expressed solely in terms of the discordant proportions $\pi_{-+}$ and $\pi_{+-}$
where $H_0$ implies that $\pi_{-+} = \pi_{+-} = \bar{\pi}$. The test statistic employed is $Z =
(p_{-+} - p_{+-})/S$ where $S^2 = 2\bar{p}/N$, $\bar{p} = \frac{1}{2}(p_{-+} + p_{+-})$ is the sample estimate of
$\bar{\pi}$ and where under $H_0$, $Z \sim N(0, 1)$. Note that $Z^2$ is equivalent to the
McNemar $\chi^2$ statistic usually employed (see under Discussion following).
For sample size or power determination the clinically relevant difference
$\mu_1 = |\pi_{-+} - \pi_{+-}|$ is specified. The corresponding variance has
been shown by Miettinen [13] to be $\sigma_1^2 = 2\pi_{-+}\pi_{+-}/\bar{\pi}$
where $\bar{\pi} = \tfrac{1}{2}(\pi_{-+} + \pi_{+-})$. Under
$H_0$: $\mu_0 = (\pi_{-+} - \pi_{+-}) = 0$, which implies
$\sigma_0^2 = 2\bar{\pi}$. Substituting into equation (5) yields the basic
relationship

$$\sqrt{N}\,|\pi_{-+} - \pi_{+-}| = Z_\alpha\sqrt{2\bar{\pi}} + Z_\beta\sqrt{2\pi_{-+}\pi_{+-}/\bar{\pi}} \qquad (16)$$

which can be solved for the total number of paired observations (N) or
power (from $Z_\beta$).
For example, consider that we wish to detect a difference $\mu_1 = 0.15$ where
$\pi_{+-} = 0.05$ (implying $\pi_{-+} = 0.20$). Using equation (16) with
$\alpha = 0.05$ (one-sided) and $\beta = 0.10$ yields N = 80.
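
The arithmetic behind this example can be sketched in a few lines of Python. This is only an illustrative check, not part of the original paper: the function name `paired_pairs_needed` is hypothetical, and the normal deviates 1.645 and 1.282 are assumed three-decimal table values for $\alpha = 0.05$ (one-sided) and $\beta = 0.10$.

```python
import math

def paired_pairs_needed(pi_mp, pi_pm, z_alpha, z_beta):
    """Unrounded number of pairs N from equation (16).

    pi_mp is the discordant proportion pi_{-+}, pi_pm is pi_{+-}.
    """
    pi_bar = 0.5 * (pi_mp + pi_pm)                    # mean discordant proportion
    sigma0 = math.sqrt(2.0 * pi_bar)                  # SD under H0
    sigma1 = math.sqrt(2.0 * pi_mp * pi_pm / pi_bar)  # SD under H1 (Miettinen form)
    root_n = (z_alpha * sigma0 + z_beta * sigma1) / abs(pi_mp - pi_pm)
    return root_n ** 2

# Assumed table deviates: alpha = 0.05 one-sided, beta = 0.10.
n_pairs = math.ceil(paired_pairs_needed(0.20, 0.05, 1.645, 1.282))
print(n_pairs)
```

Rounding the unrounded value up to the next whole pair gives the N = 80 quoted in the text.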

Two Independent Groups with Paired Observations


As with the t test, this can be expanded to the problem of two independent
groups of patients with paired observations on each patient.³ Under this
design, repeated observations (+ or -) are obtained at times a and b on two
independent groups of sizes $n_e = Q_e N$ and $n_c = Q_c N$. The null hypothesis
of no treatment by time interaction $H_0$: $\mu_0 = \delta_e - \delta_c = 0$ is
to be tested, where $\delta_e = \pi_{eb} - \pi_{ea}$ is the change over time in
the treated group and $\delta_c = \pi_{cb} - \pi_{ca}$ is that for controls.
As shown under Paired Observations, the problem can be expressed solely in
terms of the discordant observations within the two-way table for each group,
denoted as $\pi_{e+-}$, $\pi_{e-+}$, $\pi_{c+-}$, and $\pi_{c-+}$, which in turn
define the degree of interaction $\mu_1 = |\delta_e - \delta_c|$ to be detected.
Under $H_1$ the sample statistic
$D = d_e - d_c = p_{e-+} - p_{e+-} - p_{c-+} + p_{c+-}$ is normally distributed
with $\mu_1 = |\Delta|$ where

$$\Delta = \pi_{e-+} - \pi_{e+-} - \pi_{c-+} + \pi_{c+-} \qquad (17)$$

and

$$\sigma_1^2 = \frac{4\pi_{e-+}\pi_{e+-}}{Q_e(\pi_{e-+} + \pi_{e+-})} + \frac{4\pi_{c-+}\pi_{c+-}}{Q_c(\pi_{c-+} + \pi_{c+-})} \qquad (18)$$

A sufficient condition for $H_0$ to be true is the assumption of homogeneity,
wherein the treated and control groups are assumed to be drawn from the same
population with common parameters $\pi_{+-} = Q_e\pi_{e+-} + Q_c\pi_{c+-}$ and
$\pi_{-+} = Q_e\pi_{e-+} + Q_c\pi_{c-+}$, yielding $\mu_0 = 0$. Alternatively,
such a severe assumption may not be required and one might fit a no-interaction
model to the interaction parameters to obtain the set of no-interaction
parameters, $\pi'$, as $\pi'_{e+-} = \pi_{e+-} + T$, $\pi'_{e-+} = \pi_{e-+} - T$,
$\pi'_{c+-} = \pi_{c+-} - T$, and $\pi'_{c-+} = \pi_{c-+} + T$, where
$T = \Delta/4$ and $\Delta$ is defined as in equation (17). Alternatively, one
might employ the same parameters $\pi_{c+-}$, $\pi_{e+-}$ under $H_0$ and $H_1$
and then complete the no-interaction model with parameters
$\pi'_{c+-} = \pi_{c+-}$, $\pi'_{c-+} = \pi_{c-+} + \tfrac{1}{2}\Delta$,
$\pi'_{e+-} = \pi_{e+-}$, and $\pi'_{e-+} = \pi_{e-+} - \tfrac{1}{2}\Delta$. In
each case D is normally distributed with $\mu_0 = 0$ and variance $\sigma_0^2$
of the same form as equation (18) but with $\pi'$ substituted for the $\pi$.

³Lachin et al. [1] also present extensions to analyses across independent subgroups within
two independent primary groups.

The null hypothesis $H_0$: $\Delta = 0$ is then tested using
$Z = (d_e - d_c)/\hat{\sigma}_0$, usually with $\hat{\sigma}_0$ defined from the
sample p's under the assumption of homogeneity. In the latter case
$\pi'_{e+-} = \pi'_{c+-} = \pi_{+-}$ and $\pi'_{e-+} = \pi'_{c-+} = \pi_{-+}$.
Substituting into equation (5) yields the equation

$$
\begin{aligned}
\sqrt{N}\,|\pi_{e-+} - \pi_{e+-} - \pi_{c-+} + \pi_{c+-}|
 ={}& Z_\alpha\sqrt{\frac{4\pi'_{e-+}\pi'_{e+-}}{Q_e(\pi'_{e-+} + \pi'_{e+-})} + \frac{4\pi'_{c-+}\pi'_{c+-}}{Q_c(\pi'_{c-+} + \pi'_{c+-})}} \\
 &+ Z_\beta\sqrt{\frac{4\pi_{e-+}\pi_{e+-}}{Q_e(\pi_{e-+} + \pi_{e+-})} + \frac{4\pi_{c-+}\pi_{c+-}}{Q_c(\pi_{c-+} + \pi_{c+-})}}
\end{aligned}
\qquad (19)
$$

with the $\pi'$ defined under the assumption of homogeneity or after fitting
one of the no-interaction models. This can then be solved for total sample
size N or power.
For example, consider a clinical trial in which 100 patients, 50 in each
group, are to undergo evaluation before and after treatment and we desire
the power of the study to detect group differences. The parameters of the
problem may be specified as $\pi_{c+-}$, $\delta_c$ (which yields $\pi_{c-+}$),
$\pi_{e+-}$, and $\Delta$ (which then yields $\pi_{e-+}$). Assume we are
interested in detecting moderate differences such as $\pi_{c+-} = 0.03$,
$\delta_c = 0.05$, $\pi_{e+-} = 0.03$, and $\mu_1 = \Delta = 0.15$.
Using $\pi'_{c+-} = \pi_{c+-}$ and $\pi'_{e+-} = \pi_{e+-}$, fitting a
no-interaction model, and then using equation (19) with $\alpha = 0.05$
(one-sided), we find that $Z_\beta = 0.734$ and power = 77% ($\beta = 0.23$).
Solving for sample size in equation (19) with $\beta = 0.10$ indicates that
N = 151 yields 90% power of detecting these same effects.
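
As a hedged illustration, the example can be traced numerically as follows. The sketch assumes equal groups ($Q_e = Q_c = 1/2$), the second no-interaction model described above (the $\pi_{+-}$ parameters held fixed and $\pm\tfrac{1}{2}\Delta$ applied to the $\pi_{-+}$ parameters), and three-decimal normal deviates; the helper name `interaction_variance` is hypothetical.

```python
import math
from statistics import NormalDist

def interaction_variance(e_mp, e_pm, c_mp, c_pm, q_e=0.5, q_c=0.5):
    """Variance of the form of equation (18); *_mp is pi_{-+}, *_pm is pi_{+-}."""
    return (4 * e_mp * e_pm / (q_e * (e_mp + e_pm))
            + 4 * c_mp * c_pm / (q_c * (c_mp + c_pm)))

# Parameters specified in the example.
pi_c_pm, delta_c = 0.03, 0.05
pi_e_pm, big_delta = 0.03, 0.15
pi_c_mp = pi_c_pm + delta_c               # 0.08
pi_e_mp = pi_e_pm + delta_c + big_delta   # 0.23, since delta_e = delta_c + Delta

# No-interaction parameters (assumed: second model in the text).
pi_c_mp0 = pi_c_mp + big_delta / 2
pi_e_mp0 = pi_e_mp - big_delta / 2

sigma0 = math.sqrt(interaction_variance(pi_e_mp0, pi_e_pm, pi_c_mp0, pi_c_pm))
sigma1 = math.sqrt(interaction_variance(pi_e_mp, pi_e_pm, pi_c_mp, pi_c_pm))

z_alpha, z_beta_90 = 1.645, 1.282         # assumed table deviates
z_beta = (math.sqrt(100) * big_delta - z_alpha * sigma0) / sigma1
power = NormalDist().cdf(z_beta)
n_for_90 = math.ceil(((z_alpha * sigma0 + z_beta_90 * sigma1) / big_delta) ** 2)
print(round(z_beta, 3), round(power, 2), n_for_90)
```

Under these assumptions the sketch gives $Z_\beta$ of about 0.734, power of about 77%, and N = 151, in agreement with the figures in the text.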

Discussion
Although the problems just given are presented in terms of the normal
approximation to the binomial, a two-tailed test using each of the statistics,
Z, presented under A Single Proportion, Two Independent Proportions, and
Paired Observations yields the same p value as the one df chi-square test
usually employed in the same situation. For each of these Z and chi-square
($\chi^2$) tests, it is easily shown that $\chi^2 = Z^2$ and thus that the p
values for the two tests are the same. For example, the 1 df chi-square
critical value at the 0.05 level is $\chi^2_{0.05} = 3.841$, which equals
$(1.96)^2$, where $Z_{0.025} = 1.96$ (the two-tailed critical value at the 0.05
level). Thus, if one intends to use the inherently two-tailed chi-square test,
two-tailed sample size or power determination should be employed (i.e., using
$Z_{\alpha/2}$ rather than $Z_\alpha$). Otherwise, sample size may be severely
underestimated.
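
A small numerical sketch may make the point concrete. It checks that the squared two-tailed deviate reproduces the chi-square critical value, and, under the simplifying assumption $\sigma_0 = \sigma_1$ (so that N is proportional to $(Z_\alpha + Z_\beta)^2$), shows how much a one-tailed calculation would understate N at $\beta = 0.10$. The variable names are illustrative only.

```python
from statistics import NormalDist

std_normal = NormalDist()
z_one_tailed = std_normal.inv_cdf(1 - 0.05)      # Z_alpha, about 1.645
z_two_tailed = std_normal.inv_cdf(1 - 0.05 / 2)  # Z_{alpha/2}, about 1.96
z_beta = std_normal.inv_cdf(1 - 0.10)            # about 1.282

# Squared two-tailed deviate equals the 1 df chi-square critical value.
chi_sq_critical = z_two_tailed ** 2

# Assumed equal variances under H0 and H1: N scales with (Z_alpha + Z_beta)^2.
inflation = ((z_two_tailed + z_beta) / (z_one_tailed + z_beta)) ** 2
print(round(chi_sq_critical, 3), round(inflation, 2))
```

Under that equal-variance assumption, the two-tailed requirement is roughly 23% larger than the one-tailed one at these error levels.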
When a two-tailed test is to be conducted, however, one must carefully
consider each of the two possible alternatives. For example, in tests of a
single proportion, $H_0$: $\pi = \pi_0$ is tested against an alternative, which
for a two-sided test is specified as $H_1$: $\mu_1 = \delta = |\pi_1 - \pi_0| \neq 0$.
The two-sided test thus implies two alternative values for $\pi_1$:
$\pi_{1u} = \pi_0 + \delta$ and $\pi_{1\ell} = \pi_0 - \delta$.
Obviously, since the variances depend on $\pi_1$, the estimated sample size will
be greater, and power smaller, for the alternative ($\pi_{1u}$ or $\pi_{1\ell}$)
closest to 0.50. [Note that $\pi(1 - \pi)$ is maximized at $\pi = 0.50$.] In
fact, the larger of the two resulting sample size estimates may be as much as
4.64 times the smaller estimate. Thus, if a two-tailed analysis is to be
conducted, one should consider the two implied alternatives (e.g., $\pi_{1u}$
and $\pi_{1\ell}$) and use whichever is closest to 0.50.
An alternative would be to employ sample size procedures using the
power function of the chi-square test itself, which is inherently two-tailed.
Lachin [14] discusses this procedure for the general r x c contingency table
and shows that the use of the limiting chi-square power approach and the
two-tailed asymptotic normal equation (11) are in close agreement for the
2 x 2 contingency table.

SURVIVAL ANALYSIS
In many clinical trials, simple proportions as described in the last section
will be used to evaluate the outcome, such as the healing or improvement
rate in an acute condition with a short-term therapy. In many other cases,
however, the important feature is not only the outcome event, such as death,
but the time to the terminal event, the survival time. In these trials, the
data are analyzed using life-table methods that consider the time to the
terminal event for each patient and that provide a more powerful estimate of
the, say, T-year survival than is obtained from the crude proportion of
survivors after T years. Some basic references on this procedure are [15-17].
The basic life-table method of analysis is distribution free in that no
underlying assumptions about the distribution of time to event need be
specified. For sample size evaluation, however, some such assumption must
be made. The most common assumption is that survival time is exponentially
distributed with hazard rate $\lambda$, where at any time t the proportion of
survivors, $P_s(t)$, is given as $P_s(t) = e^{-\lambda t}$. Under this model
$\log[P_s(t)]$ is linearly decreasing in time with slope $-\lambda$. In a cohort
of N patients, all followed to the terminal event with mean survival time M,
the hazard rate is estimated as $L = M^{-1}$ and asymptotically
$L \sim N(\lambda, \lambda^2/N)$ [18].

Two Independent Groups


Consider that there are two independent groups of sizes $n_e$ and $n_c$, all
followed to the terminal event, where time t is measured from the time of
entry into the study. The null hypothesis of equality of survival is equivalent
under exponential survival to $H_0$: $(\lambda_e - \lambda_c) = 0$, which can be
tested using the statistic $Z = (L_e - L_c)/S$ where $L_e$ and $L_c$ are the
estimated hazard rates, $L_e = M_e^{-1}$, $L_c = M_c^{-1}$,
$S^2 = (n_e^{-1} + n_c^{-1})L^2$, $L = (Q_e L_e + Q_c L_c)$, and where under
$H_0$, $Z \sim N(0, 1)$.
For the determination of sample size and power one specifies the minimal
relevant difference $\mu_1 = |\lambda_e - \lambda_c|$, which yields
$\sigma_1^2 = (\lambda_e^2 Q_e^{-1} + \lambda_c^2 Q_c^{-1})$. Under the null
hypothesis $\mu_0 = 0$ and $\sigma_0^2 = \bar{\lambda}^2(Q_e^{-1} + Q_c^{-1})$
where $\bar{\lambda} = Q_e\lambda_e + Q_c\lambda_c$. Substituting into
equation (5) yields

$$\sqrt{N}\,|\lambda_e - \lambda_c| = Z_\alpha\sqrt{\bar{\lambda}^2(Q_e^{-1} + Q_c^{-1})} + Z_\beta\sqrt{\lambda_e^2 Q_e^{-1} + \lambda_c^2 Q_c^{-1}} \qquad (20)$$

which can then be solved for N or $Z_\beta$.
This equation was also presented by Pasternack and Gilbert [19] and was
shown by George and Desu [20] to be slightly conservative in comparison
to the exact distribution of the ratio $L_e/L_c$, which has an F distribution.
George and Desu also present the following approximation

$$\sqrt{N}\,\left[\tfrac{1}{2}\ln(\lambda_c/\lambda_e)\right] = Z_\alpha + Z_\beta \qquad (21)$$

which they show to be accurate to within two sample units of the exact
solution.
Another approximation can be obtained directly from equation (20) by
noting that for equal sample sizes $4\bar{\lambda}^2$ is always less than or
equal to $2(\lambda_e^2 + \lambda_c^2)$. This then yields

$$\sqrt{N}\,|\lambda_e - \lambda_c|/(\lambda_e + \lambda_c) = Z_\alpha + Z_\beta \qquad (22)$$

when using equal sample sizes. This approximation will yield values
between those from equation (20) and the approximation (21) of George and
Desu and can be shown to be within $Z_\beta$ less than that obtained from
equation (20).

Two Independent Groups with Censoring


The formulation just presented will rarely be applicable because it assumes
that all N patients will be followed to the terminal event no matter how
much time is required for the last patient to reach that event. This is rarely
practicable. A more realistic approach is to allow for the trial to be terminated
at time T. Assume that the patients enter the trial at a uniform rate over the
interval 0 to T and that exponential survival applies, as earlier. If we denote

$$\phi(\lambda) = \lambda^3 T/(\lambda T - 1 + e^{-\lambda T}) \qquad (23)$$

then it can be shown that $\sigma_0^2 = \phi(\bar{\lambda})(Q_e^{-1} + Q_c^{-1})$
and $\sigma_1^2 = \phi(\lambda_e)Q_e^{-1} + \phi(\lambda_c)Q_c^{-1}$ where
$\bar{\lambda} = Q_e\lambda_e + Q_c\lambda_c$ [18]. Substituting
$\mu_1 = |\lambda_e - \lambda_c|$, $\mu_0 = 0$, $\sigma_0^2$, and $\sigma_1^2$
into equation (5) yields

$$\sqrt{N}\,|\lambda_e - \lambda_c| = Z_\alpha\sqrt{\phi(\bar{\lambda})(Q_e^{-1} + Q_c^{-1})} + Z_\beta\sqrt{\phi(\lambda_e)Q_e^{-1} + \phi(\lambda_c)Q_c^{-1}} \qquad (24)$$

with the $\phi(\lambda)$ as defined in equation (23). This can then be solved
for sample size N or power (from $Z_\beta$).
These expressions can be simplified, however, since empirically for
$Q_e = Q_c$, $\sigma_1^2 > \sigma_0^2$, and as employed by Gross and Clark
[18, p. 264] we can use the simple equation

$$\sqrt{N}\,|\lambda_e - \lambda_c| = (Z_\alpha + Z_\beta)\sqrt{\phi(\lambda_e)Q_e^{-1} + \phi(\lambda_c)Q_c^{-1}} \qquad (25)$$

Again this approximation is highly accurate.
In the event that all patients enter the trial at the same point, or if each
patient enters the trial at random but each is only followed up to T years
after entry, the resulting equations are identical except that $\phi(\lambda)$
in equation (23) becomes simply $\lambda^2/(1 - e^{-\lambda T})$.
At this point it should be noted that the sample size obtained with a
study of T years duration is that required to yield the same number of
deaths (events) as obtained from equation (21), allowing for the fact that not
all patients will have died when the study is terminated. This is also true of
the following procedure.

Two Independent Groups with Limited Recruitment and Censoring


In the formulation just presented, note that patients are eligible to enter the
trial up to the trial end date, time T. Usually, however, it will be desired to
recruit patients for study over an interval 0 to $T_0$ and then to follow all
recruited patients to the time of the terminal event, or to time T where
$T > T_0$. Based on the developments in [18, pp. 66-67], it can readily be shown
that the variances $\sigma_0^2$ and $\sigma_1^2$ are as in the previous section
but with $\phi(\lambda)$ now defined as

$$\phi^*(\lambda) = \lambda^2\left[1 - \frac{e^{-\lambda(T - T_0)} - e^{-\lambda T}}{\lambda T_0}\right]^{-1} \qquad (26)$$

The desired sample size or power is obtained on substituting $\phi^*(\lambda)$
for $\phi(\lambda)$ in equation (24), or in equation (25) to yield an accurate
approximation, and solving for N or $Z_\beta$.
For example, consider that a clinical trial is to be conducted for a disease
with moderate levels of mortality with hazard rate $\lambda = 0.30$, yielding 50%
survivors after 2.3 years. Suppose that with treatment we are interested in
a reduction in hazard to $\lambda = 0.2$, i.e., an increase in survival to 64%
at 2.3 years. With equal-sized groups, $\alpha = 0.05$ (one-sided) and
$\beta = 0.10$, equation (20) indicates that N = 218 deaths are required, i.e.,
218 patients all followed to time of death. The approximation (22) yields
N = 216 and the equation (21) of George and Desu yields N = 210. If the study
were to be terminated after 5 years, then using equation (24) with equation (23)
yields N = 504 patients; the approximation (25) yields N = 508. Finally, assume
that recruitment were to be terminated after the first 3 years of a 5-year
study; then using equation (24) with equation (26) yields N = 378.
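
The six sample sizes in this example can be retraced with the following sketch. Several choices here are assumptions rather than statements from the paper: the normal deviates are taken as the three-decimal values 1.645 and 1.282, results are rounded up to the next even number (as Table 1 does for equal groups), and all function names are hypothetical.

```python
import math

# Assumption: three-decimal deviates for alpha = 0.05 (one-sided), beta = 0.10.
Z_ALPHA, Z_BETA = 1.645, 1.282

def next_even(x):
    """Round up to the next even integer (two equal-sized groups)."""
    return 2 * math.ceil(x / 2)

def phi_none(lam):
    """No censoring: variance factor lambda^2, as in equation (20)."""
    return lam ** 2

def phi_uniform(lam, T):
    """Equation (23): uniform entry over 0..T, study ends at T."""
    return lam ** 3 * T / (lam * T - 1 + math.exp(-lam * T))

def phi_limited(lam, T, T0):
    """Equation (26): uniform entry over 0..T0, study ends at T > T0."""
    return lam ** 2 / (1 - (math.exp(-lam * (T - T0)) - math.exp(-lam * T)) / (lam * T0))

def n_full(lam_e, lam_c, phi):
    """Equations (20) and (24) with Qe = Qc = 1/2."""
    lam_bar = 0.5 * (lam_e + lam_c)
    sigma0 = math.sqrt(4 * phi(lam_bar))
    sigma1 = math.sqrt(2 * phi(lam_e) + 2 * phi(lam_c))
    return ((Z_ALPHA * sigma0 + Z_BETA * sigma1) / abs(lam_e - lam_c)) ** 2

def n_simple(lam_e, lam_c, phi):
    """Equation (25) with Qe = Qc = 1/2."""
    sigma1_sq = 2 * phi(lam_e) + 2 * phi(lam_c)
    return (Z_ALPHA + Z_BETA) ** 2 * sigma1_sq / (lam_e - lam_c) ** 2

lam_e, lam_c = 0.2, 0.3
z_sum = Z_ALPHA + Z_BETA
results = {
    "eq20": next_even(n_full(lam_e, lam_c, phi_none)),
    "eq21": next_even((z_sum / (0.5 * math.log(lam_c / lam_e))) ** 2),
    "eq22": next_even((z_sum * (lam_e + lam_c) / abs(lam_e - lam_c)) ** 2),
    "eq24_T5": next_even(n_full(lam_e, lam_c, lambda l: phi_uniform(l, 5))),
    "eq25_T5": next_even(n_simple(lam_e, lam_c, lambda l: phi_uniform(l, 5))),
    "eq24_T5_T0_3": next_even(n_full(lam_e, lam_c, lambda l: phi_limited(l, 5, 3))),
}
print(results)
```

With these assumptions the sketch returns 218, 210, 216, 504, 508, and 378, matching the values in the text. Using unrounded deviates instead shifts the two 5-year results down by one even step, which suggests the rounding convention matters at this precision.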
Note that under all these plans the sample size requirements are based
on the need to accrue approximately 210 deaths during the study. Also note
that if a fixed number of patients is to be studied, it is better for those
patients to be recruited quickly and followed for a longer period of time
than to extend the period of study and reduce the rate at which the patients
enter the study. This example shows that 504 patients would be needed for
a 5-year study where the patients can enter the study evenly during the full
5-year period, whereas 378 patients would be needed if recruitment were
compressed into a 3-year period with total study duration again 5 years.
The reason for this quite simply is related to the total patient months of
experience of the cohort, i.e., the elapsed time from the time of
randomization to time T summed over all patients. For a 5-year study with
recruitment compressed into the initial 3 years, the average exposure per
patient would be 3.5 years, whereas for a 5-year study with recruitment
spanning the total 5 years, the average exposure to treatment would be 2.5
years.

CORRELATIONS
In observational studies that involve correlations as the principal form of
analysis, two types of hypotheses are usually tested: (1) whether a true
correlation actually exists using $H_0$: $\rho = 0$ versus
$H_1$: $\rho = \rho_1 \neq 0$; and (2) whether two correlations are
significantly different using $H_0$: $(\rho_e - \rho_c) = 0$ versus
$H_1$: $\mu_1 = (\rho_e - \rho_c) \neq 0$. The simplest approach to such problems
is to employ Fisher's arctanh transformation [5]:

$$C(r) = \tfrac{1}{2}\log_e\left[\frac{1 + r}{1 - r}\right]$$

Given a sample correlation r based on N observations that is distributed
about an actual correlation value (parameter) $\rho$, then C(r) is normally
distributed with mean $C(\rho)$ and variance $\sigma^2 = 1/(N - 3)$. The
transformation of r to C (and vice versa) is widely tabulated. (Note that this
is usually termed Fisher's Z transformation, but we here use C to avoid conflict
in notation.)

A Single Correlation
In detecting a relevant simple correlation of degree $H_1$: $\mu_1 = \rho_1$,
one tests the null hypothesis $H_0$: $\rho = 0$ using the test statistic
$Z = C(r)\sqrt{N - 3}$ where $Z \sim N(0, 1)$. Substituting into equation (5)
yields

$$\sqrt{N - 3}\;C(\rho_1) = Z_\alpha + Z_\beta \qquad (27)$$

from which the required sample size or power may be obtained. Obviously,
to detect a true correlation $\rho_1$ greater than 0.50 [$C(\rho_1) = 0.549$], a
small N would suffice. Note that $H_0$: $\rho = 0$ is equivalent to a null
hypothesis that the regression coefficient is also zero.
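
A brief sketch of equation (27) in code may help. The function names `fisher_c` and `n_single_correlation` are hypothetical, and the error levels in the usage line ($\alpha = 0.05$ one-sided, $\beta = 0.10$) are chosen only for illustration; the paper does not give a worked value for this case.

```python
import math
from statistics import NormalDist

def fisher_c(r):
    """Fisher's arctanh transformation C(r) = 0.5 * ln[(1 + r) / (1 - r)]."""
    return 0.5 * math.log((1 + r) / (1 - r))

def n_single_correlation(rho1, alpha=0.05, beta=0.10):
    """Solve equation (27) for N: N = [(Z_alpha + Z_beta) / C(rho1)]^2 + 3."""
    std_normal = NormalDist()
    z_sum = std_normal.inv_cdf(1 - alpha) + std_normal.inv_cdf(1 - beta)
    return math.ceil((z_sum / fisher_c(rho1)) ** 2 + 3)

print(round(fisher_c(0.5), 3))      # transformed value of rho = 0.50
print(n_single_correlation(0.5))    # illustrative error levels only
```

The first line reproduces the $C(\rho_1) = 0.549$ quoted above, and the resulting N is small, consistent with the remark in the text.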

Two Independent Correlations


In detecting a relevant difference in correlations
$H_1$: $\mu_1 = |C(\rho_e) - C(\rho_c)| \neq 0$ obtained from two independent
samples, the null hypothesis $H_0$: $\mu_0 = 0$ is tested using the statistic
$Z = [C(r_e) - C(r_c)]/\sigma_0$ where $\sigma_0^2 = N^{-1}(Q_e^{-1} + Q_c^{-1})$,
$n_e - 3 = Q_e N$, $n_c - 3 = Q_c N$, and where under $H_0$, $Z \sim N(0, 1)$.
The correlations $r_e$ and $r_c$ are obtained from two samples of sizes $n_e$
and $n_c$, such as $r_e = r_{e(uv)}$ and $r_c = r_{c(uv)}$ for variables u and v
in groups e and c. Substituting $\mu_0 = 0$, $\mu_1 = |C(\rho_e) - C(\rho_c)|$,
and $\sigma_1^2 = \sigma_0^2$ into equation (5) yields

$$\frac{\sqrt{N}\,|C(\rho_e) - C(\rho_c)|}{\sqrt{Q_e^{-1} + Q_c^{-1}}} = Z_\alpha + Z_\beta \qquad (28)$$

which can then be solved for total sample size (N) or power ($Z_\beta$). Note
that N from equation (28) will be six units less than that actually needed
since $n_e + n_c - 6 = N$.

Table 1 Total Sample Size (N) from Equation (29)ᵃ as a Function of K
(Where $\mu_1 = K\sigma$) for Various α (One-sided) and βᵇ

          α = 0.05   α = 0.05   α = 0.025   α = 0.025   α = 0.01
K         β = 0.20   β = 0.10   β = 0.10    β = 0.05    β = 0.05

0.01      61,852     85,674     105,106     129,962     157,690
0.025      9,898     13,708      16,818      20,794      25,232
0.05       2,476      3,428       4,206       5,200       6,308
0.075      1,100      1,524       1,870       2,312       2,804
0.10         620        858       1,052       1,300       1,578
0.125        396        550         674         832       1,010
0.15         276        382         468         578         702
0.175        202        280         344         426         516
0.20         156        216         264         326         396
0.25         100        138         170         208         254
0.3           70         96         118         146         176
0.4           40         54          66          82         100
0.5           26         36          44          52          64
0.6           18         24          30          38          44
0.7           14         18          22          28          34
0.8           10         14          18          22          26
1.0            8         10          12          14          16

ᵃRounded to the next highest even number.
ᵇFor a two-sided determination at level α, the table should be used with the value α/2.

Table 2 Power (1 - β) from Equation (30) as a Function of K and Total
Sample Size N (where $\mu_1 = K\sigma$) with α = 0.05 (one-sided)

Column headings are values of K; row labels are values of N.
N 0.05 0.10 0.15 0.20 0.25 0.30 0.40 0.50 0.75 1.00
10 0.0685 0.0920 0.1209 0.1556 0.1964 0.2431 0.3519 0.4745 0.7663 0.9354
20 0.0776 0.1155 0.1650 0.2265 0.2991 0.3808 0.5572 0.7228 0.9563 0.9977
30 0.0852 0.1363 0.2051 0.2913 0.3914 0.4993 0.7074 0.8629 0.9931 0.9999
40 0.0920 0.1556 0.2431 0.3519 0.4745 0.5996 0.8119 0.9354 0.9990 0.9999
50 0.0983 0.1741 0.2795 0.4087 0.5489 0.6831 0.8817 0.9707 0.9999 0.9999
60 0.1042 0.1920 0.3145 0.4618 0.6147 0.7514 0.9269 0.9871 0.9999 0.9999
70 0.1100 0.2094 0.3483 0.5113 0.6724 0.8065 0.9556 0.9944 0.9999 0.9999
80 0.1155 0.2265 0.3808 0.5572 0.7228 0.8504 0.9734 0.9977 0.9999 0.9999
90 0.1209 0.2431 0.4122 0.5996 0.7663 0.8851 0.9842 0.9990 0.9999 0.9999
100 0.1261 0.2595 0.4424 0.6387 0.8037 0.9123 0.9907 0.9996 0.9999 0.9999
200 0.1741 0.4087 0.6831 0.8817 0.9707 0.9953 0.9999 0.9999 0.9999 0.9999
300 0.2180 0.5347 0.8297 0.9656 0.9964 0.9998 0.9999 0.9999 0.9999 0.9999
400 0.2595 0.6387 0.9123 0.9907 0.9996 0.9999 0.9999 0.9999 0.9999 0.9999
500 0.2991 0.7228 0.9563 0.9977 0.9999 0.9999 0.9999 0.9999 0.9999 0.9999
750 0.3914 0.8629 0.9931 0.9999 0.9999 0.9999 0.9999 0.9999 0.9999 0.9999
1000 0.4745 0.9354 0.9990 0.9999 0.9999 0.9999 0.9999 0.9999 0.9999 0.9999
2500 0.8037 0.9996 0.9999 0.9999 0.9999 0.9999 0.9999 0.9999 0.9999 0.9999
5000 0.9707 0.9999 0.9999 0.9999 0.9999 0.9999 0.9999 0.9999 0.9999 0.9999
Table 3 K Values for Use with Tables 1 and 2 or Equations (29) and (30)ᵃ

K is given as a function of $\theta = (\mu_1 - \mu_0)$. For unequal samples with sample fractions $Q_e$ and $Q_c$, $Q^* = (Q_e^{-1} + Q_c^{-1})$.

| Statistical test | $\mu_0$ under the null hypothesis ($H_0$) | $\mu_1$ under $H_1$ | Equation in text | K for equal sample sizes | K for unequal samples |
|---|---|---|---|---|---|
| **t Tests for Means** (see Student's t test for correction factor) | | | | | |
| 1. A single sample mean | $\nu_0$ | $\nu_1$ | 3, 4 | $\lvert\nu_1 - \nu_0\rvert/\sigma$ | N.A. |
| 2. Two independent groups e and c | $(\nu_e - \nu_c) = 0$ | $\nu_e - \nu_c$ | 6, 7 | $\lvert\nu_e - \nu_c\rvert/2\sigma$ | $\lvert\nu_e - \nu_c\rvert/\sigma\sqrt{Q^*}$ |
| 3. Correlated observations at times a and b | $(\nu_b - \nu_a) = 0$ | $\nu_b - \nu_a$ | 8, 9 | $\lvert\nu_b - \nu_a\rvert/\sigma_d$ | N.A. |
| 4. Two independent groups with paired observations | $(\delta_e - \delta_c) = 0$ | $(\delta_e - \delta_c)$ | 6, 7 | $\lvert\delta_e - \delta_c\rvert/2\sigma_d$ | $\lvert\delta_e - \delta_c\rvert/\sigma_d\sqrt{Q^*}$ |
| **Tests for Proportions** | | | | | |
| 1. A single proportion: normal approximation | $\pi_0$ | $\pi_1 \neq \pi_0$ | 10, 11 | See Proportions, A Single Proportion | |
| 1. A single proportion: angular transformation | $A(\pi_0)$ | $A(\pi_1)$ | | $\lvert A(\pi_1) - A(\pi_0)\rvert$ | N.A. |
| 2. Two independent groups e and c: normal approximation | $(\pi_e - \pi_c) = 0$ | $\pi_e - \pi_c$ | 13, 14 | $\tfrac{1}{2}\lvert\pi_e - \pi_c\rvert[\bar{\pi}(1 - \bar{\pi})]^{-1/2}$ | See Proportions, Two Independent Proportions |
| 2. Two independent groups e and c: angular transformation | $[A(\pi_e) - A(\pi_c)] = 0$ | $A(\pi_e) - A(\pi_c)$ | 15 | $\tfrac{1}{2}\lvert A(\pi_e) - A(\pi_c)\rvert$ | $\lvert A(\pi_e) - A(\pi_c)\rvert/\sqrt{Q^*}$ |
| 3. Correlated observations at times a and b | $(\pi_{-+} - \pi_{+-}) = 0$ | $\pi_{-+} - \pi_{+-}$ | 16 | See Proportions, Paired Observations | N.A. |
| 4. Two independent groups with paired observations | $\delta_e - \delta_c = 0$ | $\delta_e - \delta_c$ | 19 | See Proportions, Two Independent Groups with Paired Observations | |
| **Survival Analysis: Exponential Model** | | | | | |
| 1. Two independent groups without censoring: normal approximation | $(\lambda_e - \lambda_c) = 0$ | $\lambda_e - \lambda_c$ | 22 | $\lvert\lambda_e - \lambda_c\rvert/(\lambda_e + \lambda_c)$ | See Survival Analysis, Two Independent Groups |
| or | $(\lambda_c/\lambda_e) = 1$ | $\lambda_c/\lambda_e$ | 21 | $[\ln(\lambda_c/\lambda_e)]/2$ | |
| 2. Two independent groups with censoring at time T | $(\lambda_e - \lambda_c) = 0$ | $\lambda_e - \lambda_c$ | 25 (with 23) | $\lvert\lambda_e - \lambda_c\rvert[2\phi(\lambda_e) + 2\phi(\lambda_c)]^{-1/2}$ | See Survival Analysis, Two Independent Groups with Censoring |
| 3. Two independent groups with entry to $T_0$ and censoring at time $T > T_0$ | $(\lambda_e - \lambda_c) = 0$ | $\lambda_e - \lambda_c$ | 25 | $\lvert\lambda_e - \lambda_c\rvert[2\phi^*(\lambda_e) + 2\phi^*(\lambda_c)]^{-1/2}$ | See Survival Analysis, Two Independent Groups with Limited Recruitment and Censoring |
| **Tests for Correlations** | | | | | |
| 1. A single correlation | $C(\rho_0)$ | $C(\rho_1)$ | 27 | $\lvert C(\rho_1) - C(\rho_0)\rvert$ | N.A. |
| 2. Two independent groups e and c | $[C(\rho_e) - C(\rho_c)] = 0$ | $C(\rho_e) - C(\rho_c)$ | 28 | $\tfrac{1}{2}\lvert C(\rho_e) - C(\rho_c)\rvert$ | $\lvert C(\rho_e) - C(\rho_c)\rvert/\sqrt{Q^*}$ |

Definitions used in the table:
t tests: $\sigma^2$ = variance of the observations; $\sigma_d^2$ = variance of the differences $x_b - x_a$; $\delta_e = \nu_{eb} - \nu_{ea}$; $\delta_c = \nu_{cb} - \nu_{ca}$.
Proportions: $A(\pi) = 2\arcsin\sqrt{\pi}$ in radians; $\bar{\pi} = (\pi_e + \pi_c)/2$; $\delta_e = \pi_{e-+} - \pi_{e+-}$; $\delta_c = \pi_{c-+} - \pi_{c+-}$.
Survival: $\phi(\lambda) = \lambda^3 T/(\lambda T - 1 + e^{-\lambda T})$; $\phi^*(\lambda) = \lambda^2\left[1 - \dfrac{\exp[-\lambda(T - T_0)] - \exp(-\lambda T)}{\lambda T_0}\right]^{-1}$.
Correlations: $C(\rho)$ = Fisher's transformation; for a single correlation, in Tables 1 or 2 or in equations (29) and (30) one uses N - 3. For two independent groups, $N = n_e + n_c - 6$ and $Q_e N = n_e - 3$.

ᵃA value for $\mu_1$ is first defined in terms of the individual elements, e.g., $\mu_1 = (0.5 - 0.2) = 0.3$ for the t test. Given the specification for the
other elements of K (e.g., $\sigma = 0.2$; $\theta = (\mu_1 - \mu_0) = 0.3$) one solves for K, e.g., $K = \theta/2\sigma = 0.75$. Then proceed to Table 1 or 2 or equations
(29, 30) for K = 0.75.

Two Related Correlations
In detecting a relevant difference between two correlations $r_a$ and $r_b$
obtained from a single sample of size N, the covariance Cov[$C(r_a)$, $C(r_b)$]
must be considered. This obviously applies when the two correlations involve a
common variable, e.g., $r_a = r_{uv}$ and $r_b = r_{uw}$ for variables u, v, and
w. It also applies when the two correlations do not have a variable in common,
e.g., $r_a = r_{uv}$ and $r_b = r_{wx}$ for variables u, v, w, and x, due to the
other intercorrelations $\rho_{uw}$, $\rho_{ux}$, $\rho_{vw}$, and $\rho_{vx}$.
Due to the complexity of the covariance expressions as given in [21], the test
statistics and the solutions for sample size and power will not be presented,
although the latter are also obtained directly from the basic equation (1).

FURTHER SIMPLIFICATION AND TABLES


In many of the situations just described, the equations for N and $Z_\beta$
resulting from equations (3) and (4) can be simplified if the difference
$\theta = |\mu_1 - \mu_0|$ is presented as a function of the standard deviation
of the basic observations. If $\sigma_0 = \sigma_1 = \sigma$, and $\theta$ is
specified as $\theta = K\sigma$, then the equations for sample size and power
simply reduce to

$$N = [(Z_\alpha + Z_\beta)/K]^2 \qquad (29)$$

$$Z_\beta = K\sqrt{N} - Z_\alpha \qquad (30)$$

where $K = \theta/\sigma$. Table 1 presents total N from equation (29) as a
function of K for various α and β levels. Table 2 presents power obtained from
$Z_\beta$ using equation (30) as a function of K and total N for α = 0.05
(one-sided). If $\mu_0 = 0$ then $\theta = |\mu_1|$ and equations (29) and (30)
simply give the sample size (or power) where the minimal relevant difference is
expressed as a fraction (K) of the standard deviation of the observations.
These simplified equations are applicable to most of the procedures
presented in this paper. Table 3 presents the expressions for K required for
these various statistical tests. This table can be used with Tables 1 and 2 or
with equations (29) and (30) directly. In each case, the corresponding explicit
equation in the preceding text is cited.
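
For readers who prefer to compute rather than look up values, equations (29) and (30) can be sketched as below. The function names are hypothetical, and the sketch uses unrounded normal deviates from Python's standard library, so entries for very small K may differ slightly from the printed tables, which appear to rest on three-decimal deviates. Rounding to the next even number follows the footnote to Table 1.

```python
import math
from statistics import NormalDist

STD_NORMAL = NormalDist()

def total_n(k, alpha, beta):
    """Equation (29): N = [(Z_alpha + Z_beta) / K]^2, rounded up to an even number."""
    z_sum = STD_NORMAL.inv_cdf(1 - alpha) + STD_NORMAL.inv_cdf(1 - beta)
    return 2 * math.ceil((z_sum / k) ** 2 / 2)

def power(k, n, alpha=0.05):
    """Equation (30): Z_beta = K * sqrt(N) - Z_alpha, converted to power."""
    z_beta = k * math.sqrt(n) - STD_NORMAL.inv_cdf(1 - alpha)
    return STD_NORMAL.cdf(z_beta)

print(total_n(0.10, 0.05, 0.20))    # compare with Table 1, K = 0.10
print(total_n(0.50, 0.05, 0.10))    # compare with Table 1, K = 0.5
print(round(power(0.25, 100), 4))   # compare with Table 2, N = 100, K = 0.25
```

These calls correspond to the Table 1 entries 620 and 36 and the Table 2 entry 0.8037.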

ACKNOWLEDGMENT
The author wishes to thank Lawrence W. Shaw, James Schlesselman, and Paul Canner for
their comments and discussions on many aspects reviewed in this paper.

REFERENCES
1. Lachin J, Marks J, Schoenfield L, et al: Design and Methodological Considerations
in the National Cooperative Gallstone Study: A Multi-center Clinical Trial.
Controlled Clinical Trials, in press.
2. Lachin J: Sample size considerations for clinical trials of potentially hepatotoxic
drugs. In Davidson, CS, Levy, CM, and Chamberlayne, EC, eds: Guidelines for
Detection of Hepatotoxicity Due to Drugs and Chemicals. Washington, DC: U.S.
Department of H.E.W., National Institutes of Health, NIH Publication No. 79-13,
pp. 119-130, 1979.
3. Lachin JM: Statistical inference in clinical trials. In Tygstrup N, Lachin J, Juhl E,
eds: The Randomized Clinical Trial and Therapeutic Decisions. New York: Dekker,
1981 (in press).
4. Shaw LW, Cornfield J, Cole SM: Statistical problems in the design of clinical
trials and interpretation of results. In Deutsch, E, ed: Thrombosis: Pathogenesis
and Clinical Trials. New York: Schattauer Verlag, pp. 191-202, 1973.
5. Freiman JA, Chalmers TC, Smith H, Kuebler R: The importance of Beta, the
Type II error and sample size in the design and interpretation of the randomized
controlled trial. N Engl J Med 299:690-694, 1978.
6. Snedecor GW, Cochran WG: Statistical Methods, 6th ed. Ames: Iowa State
University Press, 1967.
7. Fleiss J: Statistical Methods for Rates and Proportions. New York: Wiley, 1973.
8. Halperin M, Rogot E, Gurian J, Ederer F: Sample sizes for medical trials with
special reference to long term therapy. J Chron Dis 21:13-23, 1968.
9. Schork MA, Remington RD: The determination of sample size in treatment
control comparisons for chronic disease studies in which drop-out or non-
adherence is a problem. J Chronic Dis 20:223-239, 1967.
10. Johnson NL, Kotz S: Distributions in Statistics: Continuous Univariate Distributions
2. New York: Wiley, 1970.
11. Cochran WG, Cox GM: Experimental Designs. New York: Wiley, 1964.
12. Sokal RD, Rohlf FJ: Biometry: The Principles and Practice of Statistics in Biometric
Research. San Francisco: Freeman, 1969.
13. Miettinen OS: The matched pairs design in the case of all-or-none responses.
Biometrics 24:339-352, 1968.
14. Lachin J: Sample size determinations for r x c comparative trials. Biometrics
33:315-324, 1977.
15. Cutler SJ, Ederer F: Maximum utilization of the life table method in analyzing
survival. J Chronic Dis 8:699-712, 1958.
16. Breslow NE: Analysis of survival data under the proportional hazards model. Int
Stat Rev 43:45-58, 1975.
17. Peto R, Pike MC, Armitage P, Breslow NE, Cox DR, Howard SV, Mantel N,
McPherson K, Peto J, Smith PG: Design and analysis of randomized clinical trials
requiring prolonged observation of each patient: II. Analysis and examples. Br
J Cancer 35:1-39, 1977.
18. Gross AJ, Clark VA: Survival Distributions: Reliability Applications in the Biomedical
Sciences. New York: Wiley, 1975.
19. Pasternack BS, Gilbert HS: Planning the duration of long-term survival time
studies designed for accrual by cohorts. J Chronic Dis 24:681-700, 1971.
20. George SL, Desu MM: Planning the size and duration of a clinical trial studying
the time to some critical event. J Chronic Dis 27:15-24, 1974.
21. Dunn OJ, Clark V: Correlation coefficients measured on the same individuals. J
Am Stat Assoc 64:366-377, 1969.
