A Multi-Treatment Two Stage Adaptive Allocation For Survival Outcomes
To cite this article: Rahul Bhattacharya & Madhumita Shome (2018): A multi-treatment two stage
adaptive allocation for survival outcomes, Communications in Statistics - Theory and Methods, DOI:
10.1080/03610926.2018.1440599
1. Introduction
Clinical trials are now regarded as a necessity for the scientific advancement of medicine. Most clinical trials recruit human beings as subjects, and hence it becomes necessary to ensure the best possible care for the study subjects. Multiple treatments are, in general, involved in clinical trials, and assigning all subjects to the “best” treatment is the most appropriate option for the clinician to maintain the ethical imperative. But the “best” treatment cannot be identified prior to the actual experiment, and hence the above strategy lacks feasibility. Therefore, a dynamic experimental design, where the assignment of subjects to different treatment arms is decided afresh after each response or group of responses is observed, seems sensible. Such designs are data driven and are often termed adaptive. For a detailed exposition of adaptive allocation designs, we refer interested readers to the book-length treatments by Atkinson and Biswas (2014), Baldi Antognini and Giovagnoli (2015) and Rosenberger and Lachin (2015).
However, clinical trials considering time-to-event outcomes (e.g. death, progression free survival, etc.) often produce delayed responses, which complicates modification of the allocation design after each response. We, therefore, suggest adopting a two stage design, where the allocation strategy is updated after a group of responses is observed in the first stage. In brief, a two stage design is a hybrid of equal allocation and data dependent allocation: equal allocation is used in the first stage and, based on the first stage data, the modified allocation of the second stage is set, keeping in mind the ethical imperative. The idea of stage-wise allocation in the context of clinical trials has its roots in the works of Colton (1963) and Coad (1992, 1994), among others. But the second stage allocation for these designs was completely determined by the outcome of the first stage and hence introduced selection bias. In further works, Bandyopadhyay, Biswas, and Bhattacharya (2009, 2010) and Bhattacharya and Shome (2015) incorporated data driven randomization for the second stage patients and explored different aspects of the resulting allocation design.
Clinical trials pertaining to censored survival times mostly assume noninformative censoring, that is, the censoring and event occurrence times are independent, and hence a censored observation provides only the information that the event of interest does not occur before the time of censoring. But censoring can also occur if patients withdraw from the ongoing study, either because their health suffers from side effects or because they feel well enough. Such withdrawals indicate possible dependence between the event and censoring times and make the censoring mechanism potentially informative. Castelli, Saint-Pierre, and Daures (2006) observed such a phenomenon in a real trial with asthma patients, where the patients who dropped out were those in good health, who did not feel the necessity to consult the physician. Another instance of informative censoring, followed by a Bayesian analysis, can be found in Kaciroti et al. (2012) in the context of a clinical trial for the prevention of hypertension. However, without additional information or assumptions, the type of association or dependence in informative censoring is not correctly identifiable (Huang and Zhang 2008).
In the current work, we consider a clinical trial with multiple treatments and a time-to-event response, and develop a two stage allocation design with the aim of favouring the better performing treatments of the first stage in the second stage allocation. For the development, we use noninformative censoring throughout, but with the additional assumption that the hazards experienced by the withdrawn (i.e. censored) subjects and those under study (i.e. uncensored) are proportional. Specifically, we use the Koziol-Green model of independent censoring (Koziol and Green 1976), where the survival function of the censoring variable is some power of the survival function of the lifetime variable, ensuring proportional hazards. As there is no well accepted procedure for setting the second stage allocation, we develop a sensible procedure within the framework of hypothesis testing. In particular, based on the first stage data we consider p values for testing superiority in pairs and use them to derive the second stage allocation probability. However, exact p values are difficult to obtain and, as an alternative, we consider the score test based on the first stage data for a directional alternative and determine the associated asymptotic (large sample) p value (Silvapulle and Sen 2005). Naturally, a smaller p value (exact or asymptotic) indicates stronger evidence in favour of superiority and hence, using all such pairwise p values, we develop the allocation probability of the second stage. The allocation design together with its relevant properties is discussed in Section 2. Small sample properties of the proposed procedure are explored in Section 3. Performance of the proposed allocation is evaluated within the framework of a real clinical trial in Section 4. Finally, Section 5 concludes with a discussion of some relevant and upcoming issues.
$(\rho_{1m}, \rho_{2m}, \ldots, \rho_{tm})$ satisfying $\sum_{k=1}^{t} \rho_{km} = 1$ are derived for the allocation of second stage patients.
Since no standard method for the determination of the allocation probabilities $\rho_{km}$, $k = 1, 2, \ldots, t$, is available, we consider the framework of multiple comparisons with the best (Hsu 1996). In particular, we assume that treatment 1 is the new experimental treatment, that the other treatments are existing standards, and that a higher response indicates a favourable situation. Then the natural objective is to know whether the experimental treatment performs better than every existing treatment, which can be judged by testing $H_0: \mu_1 \le \max_{k \ne 1} \mu_k$ against $H_1: \mu_1 > \max_{k \ne 1} \mu_k$, where $\mu_k$ is the treatment effect measure for treatment $k$. Now, it is easy to observe that $H_0$ is equivalent to the union of the sub hypotheses $H_{0k}: \mu_1 \le \mu_k$ and, similarly, $H_1$ can be expressed as the intersection of the sub hypotheses $H_{1k}: \mu_1 > \mu_k$, $k = 2, \ldots, t$. Naturally, we reject the global null hypothesis $H_0$ if every $H_{0k}$ is rejected in favour of $H_{1k}$, $k = 2, \ldots, t$. Although such tests are often performed using the corresponding p values, the response model for survival outcomes incorporating censoring involves a number of nuisance parameters, which makes the determination of the exact p value quite complicated. We, therefore, use the notion of the asymptotic p value (Silvapulle and Sen 2005) of a relevant score test utilizing the first stage data and derive a measure of evidence favouring treatment 1 following a pairwise analysis.
For a comprehensive study, we denote the survival (censoring) time of the $i$th patient when given treatment $k$ by $X_{ki}$ ($C_{ki}$) and define the censoring indicator $I_{ki} = I(X_{ki} < C_{ki})$, $k = 1, 2, \ldots, t$, $i = 1, 2, \ldots, N$. Then, for each subject, we observe $(Y_{ki}, I_{ki})$, where $Y_{ki} = \min(X_{ki}, C_{ki})$. For the subsequent development, we assume that for each $k = 1, 2, \ldots, t$,
$$\bar F_k(t) = \exp\left(-\frac{t}{\mu_k}\right), \qquad \bar G_k(t) = \{\bar F_k(t)\}^{\nu_k},$$
where $f_k$ ($g_k$) is the density function of the survival (censoring) distribution corresponding to treatment $k$, $\bar F_k$ ($\bar G_k$) denotes the corresponding survival function, and $X_{ki}$ is independent of $C_{ki}$ for each $i$ and $k$. Naturally, for the assumed distributions and corresponding to treatment $k$, the hazard functions for the event and censoring times differ by a multiplier $\nu_k$.
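As an illustration of the assumed response model, the following Python sketch draws censored observations $(Y_{ki}, I_{ki})$ for a single arm. The function name `simulate_arm`, the seed and the parameter values are illustrative assumptions and are not taken from the trial; the sketch also assumes $\nu_k > 0$.

```python
import random

def simulate_arm(mu, nu, size, rng):
    """Draw (Y, I) pairs for one arm: exponential survival with mean mu and
    Koziol-Green censoring, i.e. censoring hazard equal to nu times 1/mu."""
    data = []
    for _ in range(size):
        x = rng.expovariate(1.0 / mu)   # survival time X, hazard 1/mu
        c = rng.expovariate(nu / mu)    # censoring time C, hazard nu/mu
        data.append((min(x, c), 1 if x < c else 0))
    return data

# Illustrative values only (not from the paper).
rng = random.Random(2018)
arm = simulate_arm(mu=4.0, nu=0.5, size=20000, rng=rng)
uncensored_share = sum(i for _, i in arm) / len(arm)
mean_observed = sum(y for y, _ in arm) / len(arm)
```

Under this model the probability of an uncensored response is $1/(1+\nu_k)$ and the mean observed time is $\mu_k/(1+\nu_k)$, so the two summaries above should be close to 2/3 and 8/3 respectively.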
Now consider comparing treatments 1 and $r$ ($\ne 1$) based on $2m$ hypothetical assignments, where each treatment (i.e. treatments 1 and $r$) is assigned with equal probability (i.e. $\tfrac{1}{2}$). Then the likelihood (Kleinbaum and Klein 2012) of the first stage data can be expressed as
$$L(\xi) = \prod_{k=1,r} \left[ \prod_{i \in U} \{f_k(y_{ki})\}^{\delta_{ki}} \prod_{i \in C} \{\bar F_k(y_{ki})\}^{\delta_{ki}} \prod_{i \in U} \{\bar G_k(y_{ki})\}^{\delta_{ki}} \prod_{i \in C} \{g_k(y_{ki})\}^{\delta_{ki}} \right],$$
where $\xi = \xi_{1r} = (\mu_1, \mu_r, \nu_1, \nu_r)^T$, $\prod_{i \in U}$ ($\prod_{i \in C}$) denotes the product over the uncensored (censored) observations, and $\delta_{ki} = 1$ or 0 according as the $i$th patient is assigned treatment $k$ or not. Then, defining $T_{jk}$ and $n_{jk}$ as the sum of the observed survival times and the number of uncensored responses from the subjects assigned treatment $k$ at the $j$th stage, $j = 1, 2$, we have
$$T_{1k} = \sum_{i=1}^{2m} \delta_{ki} Y_{ki}, \qquad n_{1k} = \sum_{i=1}^{2m} \delta_{ki} I_{ki},$$
$$T_{2k} = \sum_{j=2m+1}^{N} \delta_{kj} Y_{kj}, \qquad n_{2k} = \sum_{j=2m+1}^{N} \delta_{kj} I_{kj}, \qquad k = 1, r.$$
If further $m_k = \sum_{i=1}^{2m} \delta_{ki}$, $k = 1, r$, then for the assumed response model, the above likelihood reduces to
$$L(\xi) = \prod_{k=1,r} \frac{\nu_k^{\,m_k - n_{1k}}}{\mu_k^{\,m_k}} \exp\left\{ -\frac{T_{1k}}{\mu_k} (1 + \nu_k) \right\},$$
and hence the Fisher information matrix based on a single observation is $J(\xi) = J_m(\xi)/(2m)$.
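A routine maximization of (1.1) gives the maximum likelihood estimates under $H_{0r} \cup H_{1r}$ as
$$\hat\mu_1 = \frac{T_{11}}{n_{11}}, \qquad \hat\mu_r = \frac{T_{1r}}{n_{1r}}, \qquad \hat\nu_1 = \frac{m_1 - n_{11}}{n_{11}}, \qquad \hat\nu_r = \frac{m_r - n_{1r}}{n_{1r}}.$$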
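The closed-form estimates can be checked numerically. The sketch below evaluates, for one arm, the logarithm of the corresponding factor of the reduced likelihood, $(m_k - n_{1k})\log\nu_k - m_k\log\mu_k - (1+\nu_k)T_{1k}/\mu_k$, and compares its value at the estimates with nearby points. The function names and the summary values $T = 120$, $n = 30$, $m = 50$ are hypothetical and chosen only for illustration.

```python
import math

def log_likelihood(mu, nu, total_time, n_events, n_assigned):
    """Log of one arm's factor in the reduced likelihood."""
    return ((n_assigned - n_events) * math.log(nu)
            - n_assigned * math.log(mu)
            - (1.0 + nu) * total_time / mu)

def unrestricted_mle(total_time, n_events, n_assigned):
    """Closed-form estimates: mu_hat = T / n and nu_hat = (m - n) / n."""
    if n_events == 0:
        raise ValueError("at least one uncensored response is needed")
    return total_time / n_events, (n_assigned - n_events) / n_events

# Hypothetical first stage summary for one arm: T = 120, n = 30, m = 50.
T, n, m = 120.0, 30, 50
mu_hat, nu_hat = unrestricted_mle(T, n, m)
peak = log_likelihood(mu_hat, nu_hat, T, n, m)
```

Here `mu_hat` is 4.0 and `nu_hat` is 2/3; moving either parameter away from these values lowers the log-likelihood.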
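However, with $\tilde\mu_{1r} = \dfrac{T_{11} + T_{1r}}{n_{11} + n_{1r}}$ and $\tilde\nu_k = \tilde\mu_{1r} \left( \dfrac{m_k - n_{1k}}{T_{1k}} \right)$, $k = 1, r$, the maximum likelihood estimate $\tilde\xi$ under $H_{0r}$ comes from Result A.1 of the Appendix as
$$\tilde\xi = \begin{cases} (\tilde\mu_{1r}, \tilde\mu_{1r}, \tilde\nu_1, \tilde\nu_r)^T & \text{if } \dfrac{T_{11}}{n_{11}} > \dfrac{T_{1r}}{n_{1r}}, \\[2ex] (\hat\mu_1, \hat\mu_r, \hat\nu_1, \hat\nu_r)^T & \text{if } \dfrac{T_{11}}{n_{11}} \le \dfrac{T_{1r}}{n_{1r}}. \end{cases}$$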
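A minimal sketch of the restricted estimate is given below. The function name `restricted_mle` and the numerical summaries are hypothetical; the code assumes at least one uncensored response in each arm.

```python
def restricted_mle(T1, n1, m1, Tr, nr, mr):
    """Estimate of (mu_1, mu_r, nu_1, nu_r) under mu_1 <= mu_r.

    T*, n*, m* are the total observed time, the number of uncensored
    responses and the number of assigned subjects for each arm."""
    mu1_hat, mur_hat = T1 / n1, Tr / nr
    if mu1_hat <= mur_hat:
        # Constraint inactive: the unrestricted estimates already satisfy it.
        return (mu1_hat, mur_hat, (m1 - n1) / n1, (mr - nr) / nr)
    # Constraint active: pool the two arms to estimate the common mean.
    mu_pooled = (T1 + Tr) / (n1 + nr)
    return (mu_pooled, mu_pooled,
            mu_pooled * (m1 - n1) / T1,
            mu_pooled * (mr - nr) / Tr)

# Hypothetical summaries: arm 1 looks better (150/30 = 5 > 100/25 = 4).
xi_active = restricted_mle(150.0, 30, 50, 100.0, 25, 50)
# Swapping the arms makes the constraint inactive.
xi_inactive = restricted_mle(100.0, 25, 50, 150.0, 30, 50)
```

In the first call the two mean estimates are both equal to the pooled value 250/55; in the second the unrestricted estimates are returned unchanged.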
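Now it follows from Result A.2 of the Appendix that, under the null hypothesis, $\tilde T_{1r}$ is asymptotically equivalent to $X_{1r}^2 I(X_{1r} > 0)$ for a standard normal random variable $X_{1r}$. Since a large value of $\tilde T_{1r}$ indicates rejection of the null hypothesis, the asymptotic p value (Silvapulle and Sen 2005) is simply
$$p_{1r} = \lim_{m \to \infty} P_{H_0}(\tilde T_{1r} \ge x)$$
with $x$ replaced by $\tilde T_{1r}^{obs}$, the observed value of $\tilde T_{1r}$. A simple application of distribution theory gives
$$\lim_{m \to \infty} P_{H_0}(\tilde T_{1r} \ge x) = P\left( X_{1r}^2 I(X_{1r} > 0) \ge x \right) = I(x \le 0) + \Phi(-\sqrt{x})\, I(x > 0),$$
where $\Phi(\cdot)$ is the distribution function of a standard normal random variable. Hence, we get the asymptotic p value as
$$p_{1r} = I\left( \frac{T_{11}}{n_{11}} \le \frac{T_{1r}}{n_{1r}} \right) + \Phi\left( -\sqrt{\tilde T_{1r}} \right) I\left( \frac{T_{11}}{n_{11}} > \frac{T_{1r}}{n_{1r}} \right)$$
with
$$\tilde T_{1r} = \frac{1}{m \tilde\mu_{1r}^2} \left( \frac{n_{11} n_{1r}}{T_{11} + T_{1r}} \right)^2 \left( \frac{T_{11}}{n_{11}} - \frac{T_{1r}}{n_{1r}} \right)^2 \left( \frac{\hat\mu_1^2}{\hat p_1} + \frac{\hat\mu_r^2}{\hat p_r} \right).$$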
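The p value can be computed directly from the first stage summaries, as in the sketch below. Two points are assumptions not confirmed by this excerpt: $\hat p_k$ is taken to be $n_{1k}/m_k$ (the plug-in estimate of $1/(1+\nu_k)$), and $m$ is taken as half the number of subjects in the two arms being compared. The function names and numerical summaries are hypothetical.

```python
import math

def std_normal_cdf(z):
    """Distribution function of a standard normal variable."""
    return 0.5 * math.erfc(-z / math.sqrt(2.0))

def asymptotic_p_value(T1, n1, m1, Tr, nr, mr):
    """Asymptotic p value for H0: mu_1 <= mu_r against H1: mu_1 > mu_r."""
    mu1_hat, mur_hat = T1 / n1, Tr / nr
    if mu1_hat <= mur_hat:
        return 1.0                       # no evidence against H0
    m = (m1 + mr) / 2.0                  # assumption: 2m subjects in the pair
    p1_hat, pr_hat = n1 / m1, nr / mr    # assumption: p_hat = n / m
    mu_pooled = (T1 + Tr) / (n1 + nr)
    score = ((n1 * nr / (T1 + Tr)) ** 2
             * (mu1_hat - mur_hat) ** 2
             * (mu1_hat ** 2 / p1_hat + mur_hat ** 2 / pr_hat)
             / (m * mu_pooled ** 2))
    return std_normal_cdf(-math.sqrt(score))

# Hypothetical summaries: (T, n, m) = (150, 30, 50) versus (100, 25, 50).
p_1r = asymptotic_p_value(150.0, 30, 50, 100.0, 25, 50)
p_r1 = asymptotic_p_value(100.0, 25, 50, 150.0, 30, 50)
```

With these values `p_1r` is roughly 0.21, while `p_r1` equals 1 because the estimated mean of the first argument does not exceed that of the second.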
Naturally, the higher the value of $p_{1r}$, the weaker the evidence of superiority of treatment 1 over treatment $r$ based on the first stage data. Thus $q_{1r} = 1 - p_{1r}$ measures the evidence of superiority of treatment 1 over treatment $r$. Following the notion of intersection-union tests of hypotheses (Berger and Hsu 1996), an evidence measure of the superiority of treatment 1 over all the other treatments is simply $\pi_{1m} = \min_{r \ne 1} q_{1r}$, where the higher the value of $\pi_{1m}$, the stronger the evidence of superiority of treatment 1 over the others. In a similar fashion, considering the hypothesis $H_0: \mu_k \le \max_{r \ne k} \mu_r$ against $H_1: \mu_k > \max_{r \ne k} \mu_r$, a measure of superiority of any treatment $k$ can be derived as $\pi_{km} = \min_{r \ne k} q_{kr}$. Combining all these, we suggest assigning any incoming subject of the second stage to treatment $k$ with probability
$$\rho_{km} = \frac{\pi_{km}}{\pi_{1m} + \pi_{2m} + \cdots + \pi_{tm}}.$$
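The mapping from pairwise evidence to allocation probabilities can be sketched as follows. The function name, the matrix layout and the numerical values are hypothetical. The fallback to equal allocation when every $\pi_{km}$ is zero (for example, when all estimated means are tied) is an assumption of this sketch and is not stated in the text.

```python
def allocation_probabilities(q):
    """Map pairwise evidence q[k][r] (treatment k over r) to rho_k.

    pi_k is the smallest evidence of k against any other treatment,
    and rho_k is pi_k normalised to sum to one. Diagonal entries are ignored."""
    t = len(q)
    pi = [min(q[k][r] for r in range(t) if r != k) for k in range(t)]
    total = sum(pi)
    if total == 0:
        # Assumption: fall back to equal allocation when no treatment leads.
        return [1.0 / t] * t
    return [value / total for value in pi]

# Hypothetical evidence for t = 3. An entry is 0 when the p value is 1,
# i.e. when the estimated mean of k does not exceed that of r.
q = [[0.0, 0.79, 0.93],
     [0.0, 0.0, 0.68],
     [0.0, 0.0, 0.0]]
rho = allocation_probabilities(q)
tied = allocation_probabilities([[0.0, 0.0], [0.0, 0.0]])
```

In this example only the first treatment has nonzero evidence against every competitor, so `rho` is `[1.0, 0.0, 0.0]`.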
$\hat\pi_{kN} \to 1 - (t - 1)\theta$ or $\theta$ in probability according as $\mu_k > \max_{r \ne k} \mu_r$ or $\mu_k < \max_{r \ne k} \mu_r$. Hence, the observed allocation proportion to treatment $k$ behaves ethically in the limit whenever $\mu_k > \max_{r \ne k} \mu_r$, that is, when treatment $k$ is the most effective.
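As a small worked example of these limits, the sketch below lists the limiting allocation proportions for given $t$ and $\theta$. It assumes that $\theta$ is the fraction of the trial assigned to each arm in the first stage, so that $0 < \theta \le 1/t$; this reading is inferred from the limits above and is not stated explicitly in this excerpt. The function name is hypothetical.

```python
def limiting_proportions(t, theta, best):
    """Limiting allocation proportions when treatment `best` (0-based index)
    is the most effective: 1 - (t - 1) * theta for it, theta for the others."""
    if not 0 < theta <= 1.0 / t:
        raise ValueError("theta must lie in (0, 1/t]")
    return [1 - (t - 1) * theta if k == best else theta for k in range(t)]

# Three treatments, 20% of the trial per arm in the first stage.
limits = limiting_proportions(t=3, theta=0.2, best=1)
```

Here the best treatment receives about 0.6 of the subjects in the limit and each of the others 0.2, and the proportions sum to one.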
3. Performance evaluation
Boldface figures within parentheses in the Power column give the power of equal allocation. EAP figures for equal allocation
are always . with standard deviation around ..
superiority of treatment 1, which indicates the ability of the proposed procedure to capture any departure from equality of treatment effects. In addition, a close investigation of power for varying $(\nu_1, \nu_2, \nu_3)$ reveals that power is highest for $\nu_1 = \nu_2 = \nu_3$, that is, when the censoring percentages across the treatments are equal, and that a loss of power is incurred otherwise. A further investigation reveals that the proposed procedure maintains a level of power broadly similar to that of equal allocation for the chosen combinations of $(m, N)$. Therefore, using an interim analysis after the first stage, the proposed two stage adaptive procedure favours the better treatment for further assignment while retaining an ability to detect a departure from equality of treatment effects comparable to that of the traditional equal allocation.
not show significant prolongation in progression free survival (PFS) for any treatment. The relevant figures of the trial are summarised in Table 3.
We assume that the PFS for these treatments is exponentially distributed and that the censoring is independent, satisfying the Koziol-Green model of random censorship. Now, using the figures of Table 3, we obtain the relevant estimates $\hat\mu_A = 4.423$, $\hat\mu_B = 6.011$, $\hat\mu_C = 3.943$ (in months) and $\hat\nu_A = 0.456$, $\hat\nu_B = 0.554$, $\hat\nu_C = 0.757$. Treating these estimates as the true values, we redesign the trial in two stages with trial size $N = 325$ for different choices of the first stage sampling fraction $\theta$. Specifically, we conduct a simulation study for a number of choices of $\theta$ and obtain the EAP values for the different treatments. However, to enable a better representation, we provide only a plot of EAP in Figure 1.
Figure 1. EAP to different treatments (indicated by A, B and C) for the redesigned Glioblastoma trial. [Plot not reproduced; the vertical axis shows EAP, with tick marks from 0.2 to 0.6.]
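A redesign of this kind can be sketched as a small Monte Carlo study. The code below is not the authors' simulation: it reads EAP as the expected allocation proportion, assumes that each arm receives $m = \theta N$ subjects in the first stage, that all first stage responses are available at the interim analysis, and that $\hat p_k = n_{1k}/m$. The seed, the number of replications and all function names are hypothetical.

```python
import math
import random

MU = {"A": 4.423, "B": 6.011, "C": 3.943}   # estimates quoted in the text
NU = {"A": 0.456, "B": 0.554, "C": 0.757}

def first_stage_summary(mu, nu, m, rng):
    """Return (T, n): total observed time and number of uncensored responses."""
    total, events = 0.0, 0
    for _ in range(m):
        x = rng.expovariate(1.0 / mu)        # event time
        c = rng.expovariate(nu / mu)         # Koziol-Green censoring time
        total += min(x, c)
        events += x < c
    return total, events

def evidence(Tk, nk, Tr, nr, m):
    """q_kr = 1 - p_kr using the asymptotic p value with p_hat = n / m."""
    if nk == 0 or nr == 0 or Tk / nk <= Tr / nr:
        return 0.0                           # convention of this sketch if n = 0
    mu_k, mu_r = Tk / nk, Tr / nr
    pooled = (Tk + Tr) / (nk + nr)
    score = ((nk * nr / (Tk + Tr)) ** 2 * (mu_k - mu_r) ** 2
             * (mu_k ** 2 / (nk / m) + mu_r ** 2 / (nr / m))
             / (m * pooled ** 2))
    return 1.0 - 0.5 * math.erfc(math.sqrt(score / 2.0))

def simulate_eap(N, theta, reps, seed):
    """Average allocation proportion per arm over `reps` simulated trials."""
    rng = random.Random(seed)
    arms = list(MU)
    m = round(theta * N)                     # assumed first stage size per arm
    second = N - len(arms) * m               # subjects left for stage two
    totals = {k: 0.0 for k in arms}
    for _ in range(reps):
        stats = {k: first_stage_summary(MU[k], NU[k], m, rng) for k in arms}
        pi = {k: min(evidence(*stats[k], *stats[r], m)
                     for r in arms if r != k) for k in arms}
        s = sum(pi.values())
        rho = [pi[k] / s if s > 0 else 1.0 / len(arms) for k in arms]
        picks = rng.choices(arms, weights=rho, k=second)
        for k in arms:
            totals[k] += (m + picks.count(k)) / N
    return {k: totals[k] / reps for k in arms}

eap = simulate_eap(N=325, theta=0.2, reps=200, seed=1)
```

Under these assumptions the largest value in `eap` should belong to treatment B, in line with the description of Figure 1; the exact figures depend on the seed and on the assumptions listed above.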
Variation in PFS across treatments is revealed in Figure 1. For example, treatment B has the highest PFS and hence the EAP to treatment B is always the highest in Figure 1. On the other hand, the PFS values for treatments A and C are more or less the same, and hence the corresponding EAP curves coincide in Figure 1. The proposed allocation is thus sensitive to variation in treatment effectiveness even though equal allocation is followed in the first stage. Based on an interim analysis after the first stage, the proposed design assigns a higher fraction of subjects to the “winner” (i.e. the better performing treatment) of the first stage and hence keeps the ethical imperative of doing the best for every individual in the trial.
5. Concluding remarks
The current work develops a multi-treatment two stage allocation design for survival trials under the assumption of proportional hazards within the framework of independent censoring. Although independent censoring is assumed for the development, violations are not rare (e.g. competing risks or patient drop-out). But identifying and modeling the violation appropriately requires further data based sensitivity analysis. Apart from conventional methods, copula based techniques are becoming popular for modeling possible dependence between the event and censoring random variables. The development of two stage allocation designs that account for dependence in censoring for a general class of response distributions is, therefore, left for future work.
Acknowledgments
The authors would like to thank the anonymous reviewer, the editor and the associate editor for their
insightful comments which led to an improvement over the earlier version of the work.
References
Atkinson, A. C., and A. Biswas. 2014. Randomised Response-Adaptive Designs in Clinical Trials. Boca
Raton: CRC Press.
Baldi Antognini, A., and A. Giovagnoli. 2015. Adaptive Designs for Sequential Treatment Allocation.
Boca Raton: CRC Press.
Bandyopadhyay, U., and R. Bhattacharya. 2007. On an ethical cum optimal adaptive allocation design.
Statistics 41:471–83.
Bandyopadhyay, U., A. Biswas, and R. Bhattacharya. 2009. A Bayesian adaptive design for two-stage
clinical trials with survival data. Lifetime Data Analysis 15:468–92.
Bandyopadhyay, U., A. Biswas, and R. Bhattacharya. 2010. A covariate-adjusted adaptive design for
two-stage clinical trials with survival data. Statistica Neerlandica 64:202–26.
Batchelor, T. T., P. Mulholland, B. Neyns, L. B. Nabors, M. Campone, A. Wick, W. Mason, T. Mikkelsen,
S. Phuphanich, L. S. Ashby, J. DeGroot, R. Gattamaneni, L. Cher, M. Rosenthal, F. Payer, J. M.
Jürgensmeier, R. K. Jain, A. G. Sorensen, J. Xu, Q. Liu, and M. van den Bent. 2013. Phase III Randomized
Trial Comparing the Efficacy of Cediranib As Monotherapy, and in Combination With Lomustine,
Versus Lomustine Alone in Patients With Recurrent Glioblastoma. Journal of Clinical Oncology
31:3212–18.
Bazaraa, M. S., H. D. Sherali, and C. M. Shetty. 2003. Nonlinear Programming: Theory and Algorithms,
Second Edition. New York: Wiley.
Berger, R. L., and J. C. Hsu. 1996. Bioequivalence Trials, Intersection-Union Tests and Equivalence
Confidence Sets. Statistical Science 11:283–302.
Bhattacharya, R., and M. Shome. 2015. A randomized two stage allocation for continuous response
clinical trials. Statistical Methods & Applications 24:373–86.
Billingsley, P. 1995. Probability and Measure, Third Edition. New York: John Wiley & Sons, Inc.
Castelli, C., P. Saint-Pierre, and J.-P. Daures. 2006. Informative censoring in survival analysis and appli-
cation to asthma. Far East Journal of Theoretical Statistics 19:203–17.
Coad, D. S. 1992. Some results on estimation for two stage clinical trials. Sequential Analysis 11:299–311.
Coad, D. S. 1994. Sequential estimation for two-stage and three-stage clinical trials. Journal of Statistical
Planning and Inference 43:343–51.
Colton, T. 1963. A model for selecting one of two medical treatments. Journal of American Statistical
Association 58:388–400.
Hsu, J. C. 1996. Multiple Comparisons Theory And Methods. London, UK: Chapman & Hall.
Huang, X., and N. Zhang. 2008. Regression Survival Analysis with an Assumed Copula for Dependent
Censoring: A Sensitivity Analysis Approach. Biometrics 64:1090–99.
Kaciroti, N. A., T. E. Raghunathan, and J. M. G. Taylor. 2012. A Bayesian model for time-to-event data
with informative censoring. Biostatistics 13:341–54.
Kleinbaum, D. G., and M. Klein. 2012. Survival Analysis-A Self Learning Text, Third Edition. Springer.
Koziol, J. A., and S. B. Green. 1976. A Cramer-von Mises statistic for randomly censored data.
Biometrika 63:465–74.
Rosenberger, W. F., and J. L. Lachin. 2015. Randomization in Clinical Trials: Theory and Practice, Second
Edition. New York: Wiley.
Silvapulle, M. J., and P. K. Sen. 2005. Constrained Statistical Inference: Inequality, Order, and Shape
Restrictions. Hoboken, New Jersey: John Wiley & Sons, Inc.
Appendix
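Result A.1. With $\ell(\xi)$ as mentioned in (1.1), the solution of the constrained optimization problem
$$\text{Minimize } -\ell(\xi) \quad \text{subject to } \mu_1 - \mu_r \le 0$$
is
$$\tilde\xi = \begin{cases} (\tilde\mu_{1r}, \tilde\mu_{1r}, \tilde\nu_1, \tilde\nu_r)^T & \text{if } \dfrac{T_{11}}{n_{11}} > \dfrac{T_{1r}}{n_{1r}}, \\[2ex] (\hat\mu_1, \hat\mu_r, \hat\nu_1, \hat\nu_r)^T & \text{if } \dfrac{T_{11}}{n_{11}} \le \dfrac{T_{1r}}{n_{1r}}. \end{cases}$$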
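Proof. Defining the Lagrangian $F(\xi, \lambda) = -\ell(\xi) + \lambda(\mu_1 - \mu_r)$, we get the KKT conditions (Bazaraa, Sherali, and Shetty 2003):
$$\frac{\partial F(\xi, \lambda)}{\partial \xi} = 0, \qquad \lambda(\mu_1 - \mu_r) = 0 \quad \text{with } \lambda \ge 0.$$
Using the explicit expression of $-\ell(\xi)$, the above equations can be written in the expanded form as:
$$\frac{m_1}{\mu_1} - \frac{T_{11}(1 + \nu_1)}{\mu_1^2} + \lambda = 0 \tag{A.1}$$
$$\frac{m_r}{\mu_r} - \frac{T_{1r}(1 + \nu_r)}{\mu_r^2} - \lambda = 0 \tag{A.2}$$
$$\frac{T_{11}}{\mu_1} - \frac{m_1 - n_{11}}{\nu_1} = 0 \tag{A.3}$$
$$\frac{T_{1r}}{\mu_r} - \frac{m_r - n_{1r}}{\nu_r} = 0 \tag{A.4}$$
$$\lambda(\mu_1 - \mu_r) = 0 \quad \text{with } \lambda \ge 0. \tag{A.5}$$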
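First of all, assume that $\lambda = 0$ and solve the equations (A.1)–(A.4), to get
$$\mu_1 = \hat\mu_1 = \frac{T_{11}}{n_{11}}, \quad \mu_r = \hat\mu_r = \frac{T_{1r}}{n_{1r}}, \quad \nu_1 = \hat\nu_1 = \frac{m_1 - n_{11}}{n_{11}}, \quad \nu_r = \hat\nu_r = \frac{m_r - n_{1r}}{n_{1r}}$$
provided (A.5) is satisfied or equivalently if $\dfrac{T_{11}}{n_{11}} \le \dfrac{T_{1r}}{n_{1r}}$ holds.
For $\lambda > 0$, it follows from (A.5) that $\mu_1 = \mu_r$. Denoting the common unknown value by $\mu$ and solving equations (A.1)–(A.4), we get
$$\mu = \tilde\mu_{1r} = \frac{T_{11} + T_{1r}}{n_{11} + n_{1r}}, \quad \nu_1 = \tilde\nu_1 = \tilde\mu_{1r} \frac{m_1 - n_{11}}{T_{11}}, \quad \nu_r = \tilde\nu_r = \tilde\mu_{1r} \frac{m_r - n_{1r}}{T_{1r}}$$
provided $\lambda > 0$ is satisfied or equivalently if $\dfrac{T_{1r}}{n_{1r}} < \dfrac{T_{11}}{n_{11}}$ is satisfied. Combining all these, we get the desired solution $\tilde\xi$.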
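The case $\lambda > 0$ can be verified numerically. The sketch below plugs the pooled solution into (A.1)–(A.4), recovers $\lambda$ from (A.2), and checks that (A.1) is then satisfied. The function name and the summary values are hypothetical.

```python
def kkt_check(T1, n1, m1, Tr, nr, mr):
    """Return (lam, residuals) for the pooled solution with mu_1 = mu_r.

    lam is obtained from (A.2); residuals holds the left hand sides of
    (A.1), (A.3) and (A.4), which should all vanish."""
    mu = (T1 + Tr) / (n1 + nr)           # common mean, mu_tilde
    nu1 = mu * (m1 - n1) / T1            # nu_tilde_1
    nur = mu * (mr - nr) / Tr            # nu_tilde_r
    lam = mr / mu - Tr * (1 + nur) / mu ** 2                # from (A.2)
    res_a1 = m1 / mu - T1 * (1 + nu1) / mu ** 2 + lam       # (A.1)
    res_a3 = T1 / mu - (m1 - n1) / nu1                      # (A.3)
    res_a4 = Tr / mu - (mr - nr) / nur                      # (A.4)
    return lam, (res_a1, res_a3, res_a4)

# Hypothetical summaries with T11/n11 = 5 > T1r/n1r = 4, so lambda > 0.
lam, residuals = kkt_check(150.0, 30, 50, 100.0, 25, 50)
```

For these values the multiplier works out to 0.66, which is positive as required, and the three residuals are zero up to rounding error.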
Result A.2. For any $r \ne 1$, under $\mu_1 = \mu_r$, as $m \to \infty$,
$$\tilde T_{1r} \to X_{1r}^2 I(X_{1r} > 0)$$
in distribution, where $X_{1r}$ has a standard normal distribution.
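Proof. Defining
$$X_{1rm} = \frac{\sqrt{2m} \left( \dfrac{T_{11}}{n_{11}} - \dfrac{T_{1r}}{n_{1r}} \right)}{\tilde\mu_{1r} \sqrt{2 \left( \dfrac{1}{\hat p_1} + \dfrac{1}{\hat p_r} \right)}},$$
we can express the score test statistic as
$$\tilde T_{1r} = 4 \left( \frac{\dfrac{n_{11}}{2m} \dfrac{n_{1r}}{2m}}{\dfrac{T_{11}}{2m} + \dfrac{T_{1r}}{2m}} \right)^2 \left( \frac{1}{\hat p_1} + \frac{1}{\hat p_r} \right) \left( \frac{\hat\mu_1^2}{\hat p_1} + \frac{\hat\mu_r^2}{\hat p_r} \right) X_{1rm}^2 I(X_{1rm} > 0). \tag{A.6}$$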
Then the required result follows from the facts that, as $m \to \infty$,
(i) $\sqrt{2m} \left( \dfrac{T_{11}}{n_{11}} - \dfrac{T_{1r}}{n_{1r}} \right) \to N\left( 0, \dfrac{2\mu^2}{p_1} + \dfrac{2\mu^2}{p_r} \right)$ in distribution,
(ii) $\dfrac{T_{1k}}{2m} \to \dfrac{1}{2} \mu p_k$ almost surely for any $k = 1, r$,
(iii) $\dfrac{n_{1k}}{2m} \to \dfrac{1}{2} p_k$ almost surely for any $k = 1, r$, and
(iv) $\tilde\mu_{1r} = \dfrac{T_{11} + T_{1r}}{n_{11} + n_{1r}} \to \mu$ in probability,
where $\mu$ is the common unspecified value under $\mu_1 = \mu_r$. Now it is easy to observe that, as $m \to \infty$, $X_{1rm} \to X_{1r}$ in distribution, where $X_{1r} \sim N(0, 1)$. Then the observations noted in (i)-(iv) above, coupled with Slutsky's theorem (Billingsley 1995), show that the right hand side of (A.6) is asymptotically equivalent to $X_{1r}^2 I(X_{1r} > 0)$. Hence the result follows.
Note: Similar reasoning establishes that $q_{kr} \to 1$ or 0 almost surely according as $\mu_k > \mu_r$ or $\mu_k < \mu_r$, for any $k \ne r$.
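The tail probability of the limiting variable in Result A.2, $P(X_{1r}^2 I(X_{1r} > 0) \ge x) = I(x \le 0) + \Phi(-\sqrt{x}) I(x > 0)$, can be checked by simulation. The sketch below compares the closed form with a Monte Carlo estimate at $x = 1$; the function names, the seed and the number of draws are hypothetical choices.

```python
import math
import random

def limit_tail(x):
    """P(X^2 I(X > 0) >= x) for a standard normal X."""
    if x <= 0:
        return 1.0                       # the variable is never negative
    return 0.5 * math.erfc(math.sqrt(x / 2.0))   # equals Phi(-sqrt(x))

def empirical_tail(x, draws, seed):
    """Monte Carlo estimate of the same probability."""
    rng = random.Random(seed)
    hits = 0
    for _ in range(draws):
        z = rng.gauss(0.0, 1.0)
        value = z * z if z > 0 else 0.0
        hits += value >= x
    return hits / draws

exact = limit_tail(1.0)
approx = empirical_tail(1.0, draws=100000, seed=7)
```

At $x = 1$ the closed form equals $\Phi(-1)$, about 0.1587, and the simulated share should agree with it up to Monte Carlo error.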