To cite this article: Govind S. Mudholkar, Deo Kumar Srivastava & Georgia D. Kollia (1996): A Generalization of the Weibull
Distribution with Application to the Analysis of Survival Data, Journal of the American Statistical Association, 91:436, 1575-1583
A Generalization of the Weibull Distribution With
Application to the Analysis of Survival Data
Govind S. MUDHOLKAR, Deo Kumar SRIVASTAVA, and Georgia D. KOLLIA
The Weibull distribution, which is frequently used for modeling survival data, is embedded in a larger family obtained by introducing
an additional shape parameter. This generalized family not only contains distributions with unimodal and bathtub hazard shapes,
but also allows for a broader class of monotone hazard rates. Furthermore, the distributions in this family are analytically tractable
and computationally manageable. The modeling and analysis of survival data using this family is discussed and illustrated in terms
of a lifetime dataset and the results of a two-arm clinical trial.
KEY WORDS: Bathtub and unimodal hazard rates; Proportional hazards; Total time on test transform.
1576 Journal of the American Statistical Association, December 1996
Downloaded by [Purdue University] at 21:56 23 March 2013
general discussion and some properties of the family see Mudholkar and Kollia 1994.)
The quantile function (1), as in the Box-Cox (Box and Cox 1964) transform, gives the quantile function of the well-known Weibull family in the limit as $\lambda \to 0$. For $\lambda > 0$, the quantile function is that of a power of beta variable. For $\lambda < 0$ and $\alpha = 1$, the family (2) includes some of Prentice's generalized F distributions. It also overlaps with some transforms of the HP family of distributions presented by Harrington and Fleming (1982). Most important, for $\lambda \le 0$ and $\alpha > 0$, the family coincides with that of Burr type XII distributions (Burr 1942), which is included in the scope of Interactive Screening of Models (ISMOD), an all-subsets regression program for generalized linear models of Lawless and Singhal (1987).
2.2 The Hazard Function
It is easy to verify that the hazard function of the generalized Weibull family is
$$h(y) = \frac{(y/\sigma)^{(1/\alpha)-1}}{\alpha\sigma\,\bigl(1 - \lambda (y/\sigma)^{1/\alpha}\bigr)}. \qquad (3)$$
The interesting feature of the hazard function (3) is that it assumes four different shapes in the four regions of the plane divided by the lines $\alpha = 1$ and $\lambda = 0$. The nature of hazard functions in the generalized Weibull family is summarized in the following theorem, the proof of which appears in Appendix A. A selection of the typical hazard shapes available in the family is presented in Figure 1.
Theorem 1. For the generalized Weibull family (2), the hazard function $h(y)$ is (a) bathtub shaped for $\alpha > 1$ and $\lambda > 0$, (b) monotone decreasing for $\alpha \ge 1$ and $\lambda \le 0$, (c) unimodal for $\alpha < 1$ and $\lambda < 0$, (d) monotone increasing for $\alpha \le 1$ and $\lambda \ge 0$, and (e) constant for $\alpha = 1$ and $\lambda = 0$.
2.3 Scaled TTT Transforms
The sign of $\lambda$, in conjunction with $\alpha$, determines the hazard shape of a member of the generalized Weibull family. For $\lambda \le 0$, the distribution is regular; for $\lambda > 0$, it is nonregular. Therefore, an approximate determination of the sign of $\lambda$ is useful for analysis of the data. A device called the scaled TTT transform and its empirical version are relevant in this context.
For a family with the survival function $S(y) = 1 - F(y)$, the scaled TTT transform, with $H_F^{-1}(u) = \int_0^{F^{-1}(u)} S(y)\,dy$ defined for $0 < u < 1$, is $\phi_F(u) = H_F^{-1}(u)/H_F^{-1}(1)$. The empirical version of the scaled TTT transform is given by $\phi_n(r/n) = H_n^{-1}(r/n)/H_n^{-1}(1) = [(\sum_{i=1}^{r} Y_{i:n}) + (n-r)Y_{r:n}]/(\sum_{i=1}^{n} Y_i)$, where $r = 1, \ldots, n$ and $Y_{i:n}$, $i = 1, \ldots, n$, represent the order statistics of the sample. Aarset (1987) showed that the scaled TTT transform is convex (concave) if the hazard rate is decreasing (increasing), and for bathtub (unimodal) hazard rates, the scaled TTT transform is first convex (concave) and then concave (convex). Hence, from Theorem 1 and the foregoing discussion, the sign of $\lambda$ is negative (positive) if the scaled TTT transform is convex (concave), or first concave (convex) and then convex (concave). In the presence of light censoring, one can ignore the censoring in determining the sign of the initial guess $\lambda_0$.
The use of this device to determine the hazard shapes is illustrated in Figure 3, presented in Section 5. This figure gives the empirical scaled TTT transforms, after ignoring the censoring, of the datasets considered by Efron (1988) appearing in Section 4, and by Aarset (1987) considered in Section 5. The empirical scaled TTT transforms in Figure 3 indicate a unimodal hazard shape for Efron's dataset and a bathtub hazard shape for Aarset's dataset.
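The hazard function (3), the shape regions of Theorem 1, and the empirical scaled TTT transform of Section 2.3 can be explored numerically. The following is a minimal Python sketch, not the authors' code: the function names, the grid size, the tolerance, and the truncation point `cutoff` for unbounded supports are all illustrative assumptions.

```python
import math

def gw_hazard(y, alpha, lam, sigma):
    """Hazard function (3) of the generalized Weibull family."""
    if y <= 0:
        raise ValueError("y must be positive")
    t = (y / sigma) ** (1.0 / alpha)
    g = 1.0 - lam * t
    if g <= 0:
        raise ValueError("y lies outside the support")
    return (y / sigma) ** (1.0 / alpha - 1.0) / (alpha * sigma * g)

def hazard_shape(alpha, lam, sigma=1.0, grid=400, cutoff=50.0):
    """Classify the hazard shape from the sign pattern of its increments on a grid."""
    # The support is (0, sigma / lam**alpha) when lam > 0; otherwise it is
    # unbounded and cutoff * sigma is an arbitrary truncation (an assumption).
    upper = sigma / lam ** alpha if lam > 0 else cutoff * sigma
    ys = [upper * (i + 0.5) / (grid + 1) for i in range(grid)]
    hs = [gw_hazard(y, alpha, lam, sigma) for y in ys]
    signs = []
    for a, b in zip(hs, hs[1:]):
        if abs(b - a) <= 1e-12 * max(abs(a), abs(b)):
            continue  # treat numerically equal neighbours as flat
        s = 1 if b > a else -1
        if not signs or signs[-1] != s:
            signs.append(s)
    patterns = {(): "constant", (1,): "increasing", (-1,): "decreasing",
                (-1, 1): "bathtub", (1, -1): "unimodal"}
    return patterns.get(tuple(signs), "other")

def scaled_ttt(sample):
    """Empirical scaled TTT transform: list of (r/n, phi_n(r/n)) for r = 1..n."""
    ys = sorted(sample)
    n = len(ys)
    total = sum(ys)
    points = []
    running = 0.0
    for r, y in enumerate(ys, start=1):
        running += y  # sum of the r smallest order statistics
        points.append((r / n, (running + (n - r) * y) / total))
    return points

# One parameter pair from each region of Theorem 1
for alpha, lam in [(2.0, 0.5), (2.0, -1.0), (0.5, -1.0), (0.5, 0.5), (1.0, 0.0)]:
    print(alpha, lam, hazard_shape(alpha, lam))
```

Plotting the points returned by `scaled_ttt` against the diagonal gives the kind of display described for Figure 3; censoring is ignored here, as the text suggests for lightly censored data.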
[Figure panel title: Bathtub Failure Rate]
$$l(\theta) = -d\log\alpha - d\log\sigma + \left(\frac{1}{\alpha} - 1\right)\sum_{u}\log(x_i/\sigma) - \sum_{u}\log g(x_i) + \frac{1}{\lambda}\sum_{i=1}^{n}\log g(x_i), \qquad (4)$$
where $\theta = (\alpha, \lambda, \sigma)'$, $\sum_u$ denotes the summation over the uncensored observations, $d$ is the total number of uncensored observations, and $g(x_i) = g(x_i; \alpha, \lambda, \sigma) = (1 - \lambda(x_i/\sigma)^{1/\alpha})$. The parameters $\alpha$, $\lambda$, and $\sigma$ can be estimated by maximizing $l(\theta)$; that is, solving the likelihood equations obtained by differentiating $l(\theta)$ with respect to $\alpha$, $\lambda$, and $\sigma$.
Let $\hat\theta = (\hat\alpha, \hat\lambda, \hat\sigma)'$ denote the maximum likelihood estimates of $\theta = (\alpha, \lambda, \sigma)'$; then asymptotically, as $n \to \infty$,
(5)
where $N_3$ denotes the trivariate normal distribution and $i(\theta)$ is the Fisher information matrix. The asymptotic result remains valid if $i(\theta)$ is replaced by a consistent estimate $i(\hat\theta)$
equations needed for parametric estimation and testing of the hypotheses, including that for comparison of two arms, $H_0: \theta_1 = \theta_2$, can be easily obtained from the likelihood function given in (4).
4. A UNIMODAL HAZARD EXAMPLE
4.1 Modeling Head and Neck Cancer Data
We now illustrate some of the ideas and methodology of parametric inference for the generalized Weibull model for the regular case as outlined in the previous section, using the data from a two-arm clinical trial considered earlier by Efron (1988). Efron observed that the empirical hazard functions for both samples start near 0, suggesting an initial high-risk period in the beginning, a decline for a while, and then stabilization after about 1 year. He developed and illustrated a methodology for analyzing the data using a combination of techniques of quantal response analysis and the spline regression methods.
Specifically, Efron's data from a head and neck cancer clinical trial consist of survival times of 51 patients in arm A who were given radiation therapy and 45 patients in arm B
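The log-likelihood (4) of the preceding section can be evaluated directly. The following is a minimal Python sketch under stated assumptions: the function name `gw_loglik`, the 0/1 coding of the censoring indicator, and the choice to return minus infinity outside the parameter space are illustrative and not taken from the paper. The case $\lambda = 0$ (the Weibull limit) is excluded rather than handled by a limit.

```python
import math

def gw_loglik(alpha, lam, sigma, times, events):
    """Log-likelihood (4) for right-censored data.

    times  : all n observed times x_i (positive)
    events : 1 if x_i is an uncensored failure time, 0 if censored
    Returns -inf if a parameter is inadmissible or some x_i is outside the support.
    """
    if alpha <= 0 or sigma <= 0 or lam == 0:
        return -math.inf
    d = sum(events)  # number of uncensored observations
    total = -d * math.log(alpha) - d * math.log(sigma)
    for x, e in zip(times, events):
        t = (x / sigma) ** (1.0 / alpha)
        if lam * t >= 1.0:
            return -math.inf  # g(x_i) <= 0, so x_i is beyond the support
        log_g = math.log1p(-lam * t)  # log g(x_i), accurate for small lam
        total += log_g / lam  # (1/lam) * log g, summed over all observations
        if e:
            # terms summed over the uncensored observations only
            total += (1.0 / alpha - 1.0) * math.log(x / sigma) - log_g
    return total

# Sanity check: alpha = 1 and lam near 0 approach the exponential model,
# whose log-likelihood is -d*log(sigma) - sum(x_i)/sigma.
times, events = [1.0, 2.0, 3.0], [1, 0, 1]
print(gw_loglik(1.0, 1e-9, 2.0, times, events))
print(-2 * math.log(2.0) - sum(times) / 2.0)
```

A general-purpose optimizer can be applied to the negative of this function; as the appendix notes, such routines are sensitive to starting values.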
Table 1. Survival Times in Days, From a Two-Arm Clinical Trial Considered by Efron (1988)
Arm A: 7, 34, 42, 63, 64, 74+, 83, 84, 91, 108, 112, 129, 133, 133, 139, 140, 140, 146, 149, 154, 157, 160, 160, 165, 173, 176, 185+, 218, 225, 241, 248, 273, 277, 279+, 297, 319+, 405, 417, 420, 440, 523, 523+, 583, 594, 1101, 1116+, 1146, 1226+, 1349+, 1412+, 1417.
Arm B: 37, 84, 92, 94, 110, 112, 119, 127, 130, 133, 140, 146, 155, 159, 169+, 173, 179, 194, 195, 209, 249, 281, 319, 339, 432, 469, 519, 528+, 547+, 613+, 633, 725, 759+, 817, 1092+, 1245+, 1331+, 1557, 1642+, 1771+, 1776, 1897+, 2023+, 2146+, 2297+.
NOTE: + indicates censoring.
4.2 Model Adequacy
The assessment of model adequacy requires the expected number of deaths for each of the class intervals presented in Table 2. These are obtained by considering intervals of length 1 month, calculating the expected number of deaths in the $i$th month interval as $n_i h_i$, where $n_i$ is the number at risk at the beginning and $h_i$ is the hazard at the midpoint of the $i$th month interval, and summing them over all the month intervals included in the interval. Efron (1988) combined the month intervals so as to have roughly 50 person-month observations in each class interval. We also group the intervals in a similar fashion, as in Table 2, for the results to be comparable. For example, the fifth interval of 4-6 months in Table 2 corresponding to arm A indicates 72 patients at risk, which is the sum of 40 and 32 patients at risk at the beginning of intervals 4-5 and 5-6.
The goodness of fit of the models corresponding to the estimates given in (7) and (8) for arms A and B can then be ascertained by means of signed deviance residuals, discussed by McCullagh and Nelder (1983) and used by Efron (1988). The symbols $N_j$, $S_j$, and $e_j = \sum n_i h_i$ (sum is over the $j$th time period) represent the number of patients at risk at the beginning of the $j$th period, the number of observed deaths in the $j$th period, and the number of expected deaths in the $j$th period. A summary of the results of the goodness of fit of the models appears in Table 2.
The quality of the fit of the models for the two arms can be evaluated using the signed deviance residuals chi-squared p values. Thus for the arm A, using the same discretization as Efron, we observe a $\chi^2$ of 13.5137 with 10 df and significance probability of 19.65%. This may be considered more satisfactory in comparison with the p values of 1.4% and 3.2% for the simpler linear and cubic fits, and essentially equivalent to the p value of 20% for the less convenient cubic-linear spline fit of Efron (1988). For arm B, we discretized the data as given in Table 2; the calculations yielded a chi-squared value of 14.8750, which for 12 df gives a significance probability of 24.8%. This again demonstrates that a three-parameter generalized Weibull model with parameter estimates given in (8) provides a good fit for the arm B data.
4.3 Weibull Goodness of Fit
As suggested in the previous section, we can test the adequacy of the Weibull models for the arm A and arm B data
NOTE: The chi-squared value for testing the overall goodness of fit for the proportional hazards model is 26.82, which for 22 df gives a p value of 21.8%.
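The significance probabilities quoted for the goodness-of-fit statistics can be reproduced without statistical tables. The following is a minimal Python sketch, assuming only the standard closed form of the chi-squared upper tail for an even number of degrees of freedom, $P(\chi^2_{2k} > x) = e^{-x/2}\sum_{j=0}^{k-1}(x/2)^j/j!$; the function name is illustrative.

```python
import math

def chi2_sf_even_df(x, df):
    """Upper-tail probability of a chi-squared variable with an even df."""
    if df <= 0 or df % 2 != 0:
        raise ValueError("this closed form requires a positive even df")
    half = x / 2.0
    term = 1.0   # (x/2)^j / j! for j = 0
    total = 1.0
    for j in range(1, df // 2):
        term *= half / j
        total += term
    return math.exp(-half) * total

# Statistics and degrees of freedom quoted in the text
# (reported there as 19.65%, 24.8%, and 21.8%, respectively)
for stat, df in [(13.5137, 10), (14.8750, 12), (26.82, 22)]:
    print(stat, df, round(100 * chi2_sf_even_df(stat, df), 2))
```

All three tests quoted above happen to have even degrees of freedom, which is why this elementary form suffices; an odd df would need the incomplete gamma function instead.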
Mudholkar, Srivastava, and Kollia: The Weibull Distribution and Survival Data 1579
[Figure: horizontal axis "Months", 0 to 40]
$S(\alpha, \lambda, \sigma) = 1 - F(\alpha, \lambda, \sigma)$, where $F$ is given by (2), involving parameter $\theta = (\alpha, \lambda, \sigma)$, and modeling arm B by the survival function $[S(\alpha, \lambda, \sigma)]^{\eta}$. The four parameters $\alpha$, $\lambda$, $\sigma$, and $\eta$ can also be estimated by the method of maximum likelihood. Because the data corresponding to the two arms are independent, the joint log-likelihood under proportional hazards assumption, ($l_{pr.hz}$), is just the sum of the individual log-likelihoods for the two arms separately, as follows:
$$l_{pr.hz} = \sum_{U_A}\log h(x_i; \alpha, \lambda, \sigma) + \sum_{i=1}^{n_1}\log[1 - F(x_i; \alpha, \lambda, \sigma)] + d_2\log\eta + \sum_{U_B}\log h(x_i; \alpha, \lambda, \sigma) + \eta\sum_{i=1}^{n_2}\log[1 - F(x_i; \alpha, \lambda, \sigma)], \qquad (10)$$
where $n_1$, $U_A$ and $n_2$, $U_B$ denote the number of observations and sum over uncensored observations in arms A and B and $d_2$ represents the number of uncensored observations in arm B. The likelihood equations for estimating the parameters can be very easily obtained from (10). With an initial guess as $\alpha_0 = .4$, $\lambda_0 = -5$, $\sigma_0 = 5$, and $\eta_0 = .5$, we obtained the maximum likelihood estimates as
$$\hat\alpha = .375675, \quad \hat\lambda = -3.014134, \quad \hat\sigma = 5.198953, \quad \text{and} \quad \hat\eta = .579333. \qquad (11)$$
4.5 Goodness of Fit and Equivalence of the Two Arms
From the estimates obtained in (11), we can test the goodness of fit for the two arms separately. For arm A using the estimates of $\alpha$, $\lambda$, and $\sigma$ as obtained in (11), we obtain chi-squared value of 12.9546, which for 10 df gives a p value of 22.59%. A similar exercise for the arm B data with the estimates given in (11) yields a chi-squared value of 9.21337,
separate fits given by (7) and (8). The observed value of the chi-square associated with Wald-type statistic is 11.6746, which gives a p value of .0086. The equivalence of the two arms in the proportional hazards model (i.e., the null hypothesis $H_0: \eta = 1$) may be tested using, for example, the likelihood ratio statistic. The associated value of the chi-square in this case is 5.3301 with a p value of .021. Hence we can conclude that the hypothesis of equality of two arms can be rejected.
5. BATHTUB AND INCREASING HAZARD RATES
5.1 The Nonregular Case
The hazard rates of the generalized Weibull distributions are increasing when $\lambda > 0$, $\alpha < 1$, and are bathtub shaped when $\lambda > 0$, $\alpha > 1$. Because bathtub hazard rate data are often encountered in practice, a variety of distributions for modeling such data and methods for statistical analysis of the models have appeared in the literature. Lawless (1982)
and Rajarshi and Rajarshi (1988) have discussed the issues involving bathtub hazard shape and alternatives such as weighted least squares and method of moments to maximum likelihood estimation.
However, as noted earlier, the generalized Weibull distributions are nonregular in the sense that their support $(0, \sigma/\lambda^{\alpha})$ involves the parameters. Hence they share all the intricacies of the theory of likelihood-based inference for the nonregular families. In some cases the likelihood may become unbounded, and consequently maximum likelihood estimators may not exist, whereas in other cases the maximum likelihood estimators may exist but their asymptotic distributions may or may not follow the classical asymp-
Table 3. Survival Times for the 50 Devices Put on Life Test at Time Zero
0.1, 0.2, 1.0, 1.0, 1.0, 1.0, 1.0, 2.0, 3.0, 6.0, 7.0, 11.0, 12.0, 18.0, 18.0, 18.0, 18.0, 18.0, 21.0, 32.0, 36.0, 40.0, 45.0, 46.0, 47.0, 50.0, 55.0, 60.0, 63.0, 63.0, 67.0, 67.0, 67.0, 67.0, 72.0, 75.0, 79.0, 82.0, 82.0, 83.0, 84.0, 84.0, 84.0, 85.0, 85.0, 85.0, 85.0, 85.0, 86.0, 86.0
5.3 Sampling Distributions and Inference
For the purpose of statistical inference, an understanding of the joint distributions of $\hat\phi = Y_{n:n}$ and the modified maximum likelihood estimates $\hat\alpha$ and $\hat\lambda$ obtained from (14) is necessary. Note that the joint distribution of $\hat\theta = (\hat\phi, \hat\alpha, \hat\lambda)'$ is not trivariate normal and for $\lambda \ne 1/2$, the rate of convergence is not $\sqrt{n}$. It is convenient to describe the joint distribution in terms of the asymptotic marginal distribution of $\hat\phi$ and the asymptotic conditional distribution of $(\hat\alpha, \hat\lambda)'$ given $\hat\phi$. These are given in the following theorems; the proofs appear in Appendix A.
Theorem 2.
a. The marginal distribution of $\hat\phi = Y_{n:n}$ is given by
$$P(\hat\phi \le t) = \bigl[1 - (1 - (t/\phi)^{1/\alpha})^{1/\lambda}\bigr]^{n}.$$
b. Asymptotically, as $n \to \infty$,
$$n^{\lambda}\left(\frac{\hat\phi}{\phi} - 1\right) = -\alpha Z^{\lambda} + O_p(n^{-\lambda}) \to -\alpha Z^{\lambda} \ \text{in law}, \qquad (15)$$
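Theorem 2 can be illustrated by simulation. The following is a minimal Python sketch, assuming the quantile function $Q(u) = \phi[1 - (1 - u)^{\lambda}]^{\alpha}$ with $\phi = \sigma/\lambda^{\alpha}$ and $\lambda > 0$, which is the form used in the proof of Theorem 2 in Appendix A. The function names, the seed, and the parameter values $\alpha = 2$, $\lambda = 1$, $\phi = 1$ are illustrative choices, not values from the paper.

```python
import random

def gw_quantile(u, alpha, lam, phi):
    """Quantile function in the nonregular case (lam > 0), phi = sigma / lam**alpha."""
    return phi * (1.0 - (1.0 - u) ** lam) ** alpha

def gw_cdf(t, alpha, lam, phi):
    """Distribution function on the support (0, phi)."""
    return 1.0 - (1.0 - (t / phi) ** (1.0 / alpha)) ** (1.0 / lam)

def max_cdf(t, n, alpha, lam, phi):
    """Theorem 2(a): distribution function of the sample maximum Y_{n:n}."""
    return gw_cdf(t, alpha, lam, phi) ** n

def scaled_max_draw(n, alpha, lam, phi, rng):
    """One draw of n**lam * (Y_{n:n}/phi - 1), the left side of (15)."""
    u_max = rng.random() ** (1.0 / n)  # largest of n independent uniforms
    y_max = gw_quantile(u_max, alpha, lam, phi)
    return n ** lam * (y_max / phi - 1.0)

rng = random.Random(0)  # fixed seed, chosen arbitrarily
draws = [scaled_max_draw(1000, 2.0, 1.0, 1.0, rng) for _ in range(20000)]
mean_draw = sum(draws) / len(draws)
# With alpha = 2 and lam = 1 the limit in (15) is -2Z, Z standard exponential,
# so the sample mean should be close to -2.
print(mean_draw)
```

The scaled maximum is never positive, which reflects that $Y_{n:n}$ always lies below the endpoint $\phi$ of the support.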
Table 4. Model Adequacy for the Fit (18) to the Aarset Data in Table 3
Months
fit is also reflected in the empirical and fitted scaled TTT transforms given in Figure 3.
6. CONCLUSIONS
The generalized Weibull family as presented in (2) can be used effectively in the analysis of survival data. The family is versatile, accommodating monotone, unimodal, and bathtub-shaped hazard functions. The empirical scaled TTT transform can be used to identify the shape of the hazard function. The family has closed-form expressions for the distribution functions and the hazard functions, and is closed under proportional hazards modeling. Because of its analytic tractability, the likelihood-based inference in the regular case and an alternative method based on a "modified likelihood" can be easily implemented.
APPENDIX A: PROOFS
Proof of Theorem 1
The theorem can be established either by analyzing the hazard quantile function $h(Q(u)) = f(Q(u))/(1 - u)$, or by directly examining the hazard function. From Equation (3) we have
$$D\log h(y) = \left(\frac{1}{\alpha} - 1\right)\frac{1}{y} + \frac{\lambda (y/\sigma)^{(1/\alpha)-1}}{\alpha\sigma\,\bigl(1 - \lambda (y/\sigma)^{1/\alpha}\bigr)}, \qquad (A1)$$
where $D\log h(y) = (d/dy)\log h(y)$ is the derivative of $\log h(y)$. Part (e) is trivial, and parts (b) and (d) follow immediately from the signs of the derivative in (A1). However, parts (a) and (c) are less obvious.
First, consider part (a), the case $\alpha > 1$ and $\lambda > 0$. Here the derivative (A1) is $-\infty$ at $y = 0$ and $+\infty$ at $y = \sigma/\lambda^{\alpha}$. Moreover, $D\log h(y)$ vanishes only once, because the equation $(1 - \alpha) + \alpha\lambda(y/\sigma)^{1/\alpha} = 0$ has a unique solution. Hence in this case the hazard function $h(y)$ is convex; that is, bathtub shaped.
Now to prove (c), assume that $0 < \alpha < 1$ and $\lambda < 0$. Once again, we note that $D\log h(y)$ has only one zero; furthermore, it is easy to see that $D\log h(0) = \infty$ and $D\log h(\infty) = 0$, implying that the function is increasing at zero and decreasing for large $y$. It follows that the hazard function is unimodal.
Proof of Theorem 2
a. In view of the distribution function of $y$, as given in (12), the proof is obvious.
b. Let $U_{n:n}$ denote the largest order statistic of a sample of size $n$ from uniform (0, 1) distribution. Then from the quantile function (13), it is clear that the maximum $Y_{n:n} = \hat\phi$ can be expressed as
$$Y_{n:n} \stackrel{L}{=} Q(U_{n:n}) = \phi\,[1 - (1 - U_{n:n})^{\lambda}]^{\alpha}$$
and
$$n^{\lambda}\left(\frac{Y_{n:n}}{\phi} - 1\right) \stackrel{L}{=} -\alpha\,[n(1 - U_{n:n})]^{\lambda} + O_p(n^{-\lambda}).$$
But $n(1 - U_{n:n})$ converges, as $n \to \infty$, in law to the standard exponential random variable, and the second term on the right side, $O_p(n^{-\lambda})$, converges to zero. Hence we get (15).
Proof of Theorem 3
Conditional on $\hat\phi$, the modified likelihood (14) satisfies all of the assumptions required for the asymptotic normality of the maximum likelihood estimates $(\hat\alpha, \hat\lambda)'$, and hence the result. (For an analogous development, see Smith 1985.)
Proof of Theorem 4
Let $\theta_0 = (\alpha_0, \lambda_0)'$, $\hat\theta = (\hat\alpha, \hat\lambda)'$, and $\theta = (\alpha, \lambda)'$. Also, let $L = L(\alpha, \lambda, \hat\phi)$ be the log-likelihood function in (14) centered at $\hat\phi$, let $G(\theta, \hat\phi) = (\partial/\partial\theta)L$, and let $H(\theta, \hat\phi) = (\partial^2/\partial\theta^2)L$ be the $2 \times 2$ Hessian matrix. Then by the definition of $\hat\theta$ we have
$$G(\hat\theta, \hat\phi) = 0. \qquad (A2)$$
Now expanding (A2) about $\theta_0$ we get
$$0 = G(\theta_0, \hat\phi) + H(\theta_0, \hat\phi)(\hat\theta - \theta_0) + o_p(1/\sqrt{n}). \qquad (A3)$$
Assuming that $n$ is large enough so that $\hat\phi$ can be replaced by $\phi_0$ in $H(\theta_0, \hat\phi)$, we get
$$0 = G(\theta_0, \hat\phi) + H(\theta_0, \phi_0)(\hat\theta - \theta_0) + o_p(1/\sqrt{n}). \qquad (A4)$$
Then, rearranging the terms and multiplying by $\sqrt{n}$, it follows that
$$\sqrt{n}(\hat\theta - \theta_0) = -\sqrt{n}\,H^{-1}(\theta_0, \phi_0)\,G(\theta_0, \hat\phi) + o_p(1). \qquad (A5)$$
Now expanding $G(\theta_0, \hat\phi)$ around $\phi_0$, we have
$$\sqrt{n}(\hat\theta - \theta_0) = -\sqrt{n}\,H^{-1}(\theta_0, \phi_0)\,G(\theta_0, \phi_0) - \sqrt{n}(\hat\phi - \phi_0)\,H^{-1}(\theta_0, \phi_0)\,\frac{\partial}{\partial\phi}G(\theta_0, \phi_0) + o_p(1). \qquad (A6)$$
If $\lambda > 1/2$, then, in view of Theorem 2, the second term in (A6) is negligible and the asymptotic distribution of $\hat\theta$ is bivariate normal. However, if $\lambda < 1/2$, then the second term in (A6) dominates the first term, and by Theorem 2 and asymptotically, $\hat\theta$ has a bivariate exponential distribution. For $\lambda = 1/2$, the asymptotic distribution is a mixture of normal and exponential.
APPENDIX B: INITIAL GUESS FOR ITERATIVE METHODS
The solution of the nonlinear equations can be obtained using one of the many iterative routines; for example, those given in IMSL or NAG. It may be noted, however, that these routines are generally sensitive to the starting values. If censoring is light, then for the starting values of $\alpha$ and $\sigma$ we propose using some simple shape and scale estimates obtained assuming the Weibull model and ignoring the censored part of the data. In the illustrative example of Section 3, we use the simple closed-form estimates, $\tilde\alpha$ and $\tilde\sigma$, of Kollia and Mudholkar (1990). These estimates are
(B.1)
and
(B.2)
where $Y_{1:n} \le Y_{2:n} \le \cdots \le Y_{n:n}$ denote the order statistics of a complete sample from a Weibull distribution with parameters $\alpha$ and $\sigma$. If the censoring is heavy, then instead of $\tilde\alpha$ and $\tilde\sigma$, one may use the Weibull maximum likelihood estimates obtained iteratively. For the starting value $\lambda_0$, we can take a small number (e.g., $\pm .01$) with the sign as suggested by the shape of the empirical scaled TTT transform after ignoring the censoring.
Alternatively, one can obtain the initial values using linear regression. For the model (1), we have
$$\log Y_{i:n} \approx \log Q(i/n) = \log\sigma - \alpha\log\lambda + \alpha\log[1 - (1 - (i/n))^{\lambda}]. \qquad (B.3)$$
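For a fixed value of $\lambda$, relation (B.3) is a straight line in a known regressor, so a grid of $\lambda$ values with one least squares fit each yields starting values. The following is a minimal Python sketch under two labeled assumptions that depart from (B.3) as printed: the plotting position $i/(n+1)$ replaces $i/n$ so that the largest observation does not hit the endpoint, and the regressor is written as $\log[(1 - (1 - p_i)^{\lambda})/\lambda]$, which equals the right side of (B.3) with intercept $\log\sigma$ for $\lambda > 0$ and stays defined for $\lambda < 0$. Function names and the candidate grid are illustrative.

```python
import math

def _ols(xs, ys):
    """Simple linear regression; returns (intercept, slope, residual sum of squares)."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    sxx = sum((x - mx) ** 2 for x in xs)
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = my - slope * mx
    rss = sum((y - intercept - slope * x) ** 2 for x, y in zip(xs, ys))
    return intercept, slope, rss

def nested_ls_start(sample, lam_grid):
    """Return (alpha0, lam0, sigma0, rss) minimizing the RSS over nonzero lam_grid."""
    ys = sorted(sample)
    n = len(ys)
    log_y = [math.log(y) for y in ys]  # requires positive observations
    best = None
    for lam in lam_grid:
        # Assumed plotting positions p_i = i/(n+1) instead of i/n
        xs = [math.log((1.0 - (1.0 - i / (n + 1)) ** lam) / lam)
              for i in range(1, n + 1)]
        intercept, slope, rss = _ols(xs, log_y)
        if best is None or rss < best[3]:
            best = (slope, lam, math.exp(intercept), rss)
    return best

# Usage on exact quantiles from alpha = 2, lam = 0.5, sigma = 3
n = 30
sample = [3.0 * ((1.0 - (1.0 - i / (n + 1)) ** 0.5) / 0.5) ** 2.0
          for i in range(1, n + 1)]
print(nested_ls_start(sample, [-0.5, 0.25, 0.5, 1.0]))
```

With real data the residual sum of squares will not vanish, and the selected triple serves only as a starting point for the iterative likelihood routines.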
So for any fixed $\lambda$, estimates of $\alpha$ and $\sigma$ can be obtained using linear regression. Thus by "manual updating" (i.e., "nested least squares"), one can obtain a reasonable starting point $(\alpha_0, \lambda_0, \sigma_0)$ corresponding to the smallest residual sum of squares when the regression method is repeated for a selection of $\lambda$ values.
[Received December 1993. Revised March 1996.]
REFERENCES
Aarset, M. V. (1987), "How to Identify a Bathtub Hazard Rate," IEEE Transactions on Reliability, R-36, 106-108.
Box, G. E. P., and Cox, D. R. (1964), "An Analysis of Transformations," Journal of the Royal Statistical Society, Ser. B, 26, 211-252.
Burr, I. W. (1942), "Cumulative Frequency Functions," Annals of Mathematical Statistics, 13, 215-232.
Cox, D. R., and Oakes, D. (1984), Analysis of Survival Data, London: Chapman and Hall.
Efron, B. (1988), "Logistic Regression, Survival Analysis, and the Kaplan-Meier Curve," Journal of the American Statistical Association, 83, 414-425.
Farewell, V. T., and Prentice, R. L. (1977), "A Study of Distributional Shape in Life Testing," Technometrics, 19, 69-75.
Gore, A. P., Paranjape, S., Rajarshi, M. B., and Gadgil, M. (1986), "Some Methods for Summarizing Survivorship Data in Nonstandard Situa-
Kalbfleisch, J. D., and Prentice, R. L. (1980), The Statistical Analysis of Failure Data, New York: Wiley.
Kollia, G., and Mudholkar, G. S. (1990), "An Approach to Estimation in Quantile Function Families With Application to Weibull Distribution," technical report, University of Rochester, Dept. of Statistics.
Lawless, J. F. (1982), Statistical Models and Methods for Lifetime Data, New York: Wiley.
Lawless, J. F., and Singhal, K. (1987), "ISMOD: An All-Subsets Regression Program for Generalized Linear Models I. Statistical and Computational Background," Computer Methods and Programs in Biomedicine, 24, 117-124.
McCullagh, P., and Nelder, J. (1984), Generalized Linear Models, London: Chapman and Hall.
Miller, R. G., Jr. (1983), "What Price Kaplan-Meier?" Biometrics, 39, 1077-1081.
Mudholkar, G. S., and George, E. O. (1979), "The Logit Statistic for Combining Probabilities-An Overview," in Optimizing Methods in Statistics, ed. J. S. Rustagi, New York: Academic Press, pp. 345-365.
Mudholkar, G. S., and Kollia, G. D. (1994), "Generalized Weibull Family: A Structural Analysis," Communications in Statistics, Part A-Theory and Methods, 23, 1149-1171.
Rajarshi, S., and Rajarshi, M. B. (1988), "Bathtub Distributions: A Review," Communications in Statistics, Part A-Theory and Methods, 17, 2597-2621.
Rao, C. R. (1973), Linear Statistical Inference and Its Applications, New