
This article was downloaded by: [Purdue University]

On: 23 March 2013, At: 21:56


Publisher: Taylor & Francis
Informa Ltd Registered in England and Wales Registered Number: 1072954 Registered office: Mortimer House, 37-41
Mortimer Street, London W1T 3JH, UK

Journal of the American Statistical Association


Publication details, including instructions for authors and subscription information:
http://www.tandfonline.com/loi/uasa20

A Generalization of the Weibull Distribution with Application to the Analysis of Survival Data
Govind S. Mudholkar (a), Deo Kumar Srivastava (b) & Georgia D. Kollia (c)
(a) University of Rochester, NY, 14627
(b) Department of Biostatistics, St. Jude Children's Research Hospital, Memphis, TN, 38105
(c) Bristol-Myers Squibb, Princeton, NJ, 08540

To cite this article: Govind S. Mudholkar , Deo Kumar Srivastava & Georgia D. Kollia (1996): A Generalization of the Weibull
Distribution with Application to the Analysis of Survival Data, Journal of the American Statistical Association, 91:436, 1575-1583

To link to this article: http://dx.doi.org/10.1080/01621459.1996.10476725


A Generalization of the Weibull Distribution With
Application to the Analysis of Survival Data
Govind S. MUDHOLKAR, Deo Kumar SRIVASTAVA, and Georgia D. KOLLIA

The Weibull distribution, which is frequently used for modeling survival data, is embedded in a larger family obtained by introducing
an additional shape parameter. This generalized family not only contains distributions with unimodal and bathtub hazard shapes,
but also allows for a broader class of monotone hazard rates. Furthermore, the distributions in this family are analytically tractable
and computationally manageable. The modeling and analysis of survival data using this family is discussed and illustrated in terms
of a lifetime dataset and the results of a two-arm clinical trial.
KEY WORDS: Bathtub and unimodal hazard rates; Proportional hazards; Total time on test transform.

Govind S. Mudholkar is Professor of Statistics and Biostatistics, University of Rochester, NY 14627. Deo Kumar Srivastava is Assistant Member, Department of Biostatistics, St. Jude Children's Research Hospital, Memphis, TN 38105. Georgia D. Kollia is Biostatistician, Bristol-Myers Squibb, Princeton, NJ 08540. The research work of Deo Kumar Srivastava was in part supported by National Cancer Institute grant CA 21765 and by the American Lebanese Syrian Associated Charities. The work of Georgia D. Kollia was partly supported by National Cancer Institute grant CA 09168 while she was at the University of Washington in Seattle. The authors acknowledge with thanks the numerous and valuable comments and suggestions of the two referees and the associate editor. The authors also thank M. Sibuya for some helpful discussions.
This work is dedicated to W. Allen Wallis, Professor Emeritus of Statistics and Economics, University of Rochester, for his services to the discipline of statistics and the cause of scholarship. His numerous contributions, especially the leadership in organizing various departments of statistics, including those at the University of Rochester and the University of Chicago, are thankfully acknowledged.

© 1996 American Statistical Association
Journal of the American Statistical Association
December 1996, Vol. 91, No. 436, Theory and Methods

1. INTRODUCTION

The Weibull family is commonly used in statistical analysis of lifetime or response time data from reliability experiments and survival studies. It is generally adequate for modeling monotone hazard rates, and large samples are needed to discriminate it from other monotone hazard rate models such as gamma. However, the Weibull family is inappropriate when the failure rate is indicated to be unimodal or bathtub shaped. To avoid the model validity issues, the nonparametric approach, supported by the well-developed Kaplan-Meier product limit estimator and related techniques, is often regarded as the preferred course. However, this alternative is often inefficient, as noted by Miller (1983) (see also Efron 1988). The pros and cons of different parametric and nonparametric models and methodology for statistical inference can be found in the work of Cox and Oakes (1984), Kalbfleisch and Prentice (1980), and Lawless (1982).

An approach to the construction of flexible parametric models is to embed appropriate competing models into a larger model by adding a shape parameter (see Kalbfleisch and Prentice 1980). This embedding approach not only provides a broader range of hazard shapes, but also allows, as Farewell and Prentice (1977, p. 69) observed, "the methods of ordinary parametric inference to be used for discrimination and leads to an assessment of each competing model relative to a more comprehensive one." The aforementioned generalized models are theoretically convenient for inferences such as goodness of fit but less so for modeling and fitting, especially in the presence of censoring, which may involve an evaluation of an incomplete gamma integral or a beta ratio (see Kalbfleisch and Prentice 1980, pp. 26 and 63).

In this article we present an extension of the Weibull family that contains distributions with unimodal and bathtub hazard rates and also yields a broader class of monotone failure rates. In Section 2 we present the generalized Weibull family and discuss various types of hazard functions it contains. We also propose and illustrate the use of empirical total-time-on-test (TTT) transform as a device for identifying the shape of the hazard function. In Section 3 we discuss estimation and testing of the model parameters using likelihood methods. We present an application of the model to the two-arm clinical trial, considered by Efron (1988), in Section 4. In Section 5 we consider the case of a bathtub-shaped hazard rate in which the generalized Weibull model is nonregular. For this, we suggest a reparameterization of the model followed by an alternative method of analysis and illustrate this using data considered by Aarset (1987). In Section 6 we end with some conclusions.

2. THE FAMILY AND ITS HAZARD FUNCTION

2.1 The Generalized Weibull Family

The generalized Weibull family of distributions, in terms of its quantile function, is given by

$$Q(u) = \begin{cases} \sigma\left[\dfrac{1-(1-u)^{\lambda}}{\lambda}\right]^{\alpha}, & \lambda \neq 0, \\ \sigma\left[-\log(1-u)\right]^{\alpha}, & \lambda = 0, \end{cases} \tag{1}$$

where $\alpha, \sigma > 0$ and $\lambda$ is real.

Inverting the quantile function (1), the distribution function of the generalized Weibull distribution is seen to be

$$F(y) = 1 - \left(1 - \lambda\,(y/\sigma)^{1/\alpha}\right)^{1/\lambda}. \tag{2}$$

The range of the generalized Weibull random variable $y$ in (2) is $(0, \infty)$ for $\lambda \le 0$ and $(0, \sigma/\lambda^{\alpha})$ for $\lambda > 0$. (For a
general discussion and some properties of the family see Mudholkar and Kollia 1994.)

The quantile function (1), as in the Box-Cox (Box and Cox 1964) transform, gives the quantile function of the well-known Weibull family in the limit as $\lambda \to 0$. For $\lambda > 0$, the quantile function is that of a power of beta variable. For $\lambda < 0$ and $\alpha = 1$, the family (2) includes some of Prentice's generalized F distributions. It also overlaps with some transforms of the HP family of distributions presented by Harrington and Fleming (1982). Most important, for $\lambda \le 0$ and $\alpha > 0$, the family coincides with that of Burr type XII distributions (Burr 1942), which is included in the scope of Interactive Screening of Models (ISMOD), an all-subsets regression program for generalized linear models of Lawless and Singhal (1987).

2.2 The Hazard Function

It is easy to verify that the hazard function of the generalized Weibull family is

$$h(y) = \frac{(y/\sigma)^{(1/\alpha)-1}}{\alpha\sigma\left(1 - \lambda\,(y/\sigma)^{1/\alpha}\right)}. \tag{3}$$

The interesting feature of the hazard function (3) is that it assumes four different shapes in the four regions of the plane divided by the lines $\alpha = 1$ and $\lambda = 0$. The nature of hazard functions in the generalized Weibull family is summarized in the following theorem, the proof of which appears in Appendix A. A selection of the typical hazard shapes available in the family is presented in Figure 1.

Theorem 1. For the generalized Weibull family (2), the hazard function $h(y)$ is (a) bathtub shaped for $\alpha > 1$ and $\lambda > 0$, (b) monotone decreasing for $\alpha \ge 1$ and $\lambda \le 0$, (c) unimodal for $\alpha < 1$ and $\lambda < 0$, (d) monotone increasing for $\alpha \le 1$ and $\lambda \ge 0$, and (e) constant for $\alpha = 1$ and $\lambda = 0$.

2.3 Scaled TTT Transforms

The sign of $\lambda$, in conjunction with $\alpha$, determines the hazard shape of a member of the generalized Weibull family. For $\lambda \le 0$, the distribution is regular; for $\lambda > 0$, it is nonregular. Therefore, an approximate determination of the sign of $\lambda$ is useful for analysis of the data. A device called the scaled TTT transform and its empirical version are relevant in this context.

For a family with the survival function $S(y) = 1 - F(y)$, the scaled TTT transform, with $H_F^{-1}(u) = \int_0^{F^{-1}(u)} S(y)\,dy$ defined for $0 < u < 1$, is $\phi_F(u) = H_F^{-1}(u)/H_F^{-1}(1)$. The empirical version of the scaled TTT transform is given by

$$\phi_n(r/n) = \frac{H_n^{-1}(r/n)}{H_n^{-1}(1)} = \frac{\left(\sum_{i=1}^{r} y_{i:n}\right) + (n-r)\,y_{r:n}}{\sum_{i=1}^{n} y_i},$$

where $r = 1, \ldots, n$ and $y_{i:n}$, $i = 1, \ldots, n$ represent the order statistics of the sample. Aarset (1987) showed that the scaled TTT transform is convex (concave) if the hazard rate is decreasing (increasing), and for bathtub (unimodal) hazard rates, the scaled TTT transform is first convex (concave) and then concave (convex). Hence, from Theorem 1 and the foregoing discussion, the sign of $\lambda$ is negative (positive) if the scaled TTT transform is convex (concave), or first concave (convex) and then convex (concave). In the presence of light censoring, one can ignore the censoring in determining the sign of the initial guess $\lambda_0$.

The use of this device to determine the hazard shapes is illustrated in Figure 3, presented in Section 5. This figure gives the empirical scaled TTT transforms, after ignoring the censoring, of the datasets considered by Efron (1988) appearing in Section 4, and by Aarset (1987) considered in Section 5. The empirical scaled TTT transforms in Figure 3 indicate a unimodal hazard shape for Efron's dataset and a bathtub hazard shape for Aarset's dataset.
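As an illustration of Theorem 1 and of the empirical scaled TTT transform, the following Python sketch may be helpful. It is not part of the original analysis: the function names `gw_hazard`, `hazard_shape`, and `scaled_ttt` are hypothetical, the three-point sample is made up, and the shape rule simply transcribes the regions stated in Theorem 1. Censoring is ignored, as the text suggests for light censoring.

```python
import math

def gw_hazard(y, alpha, lam, sigma):
    """Hazard (3); requires 1 - lam * (y / sigma) ** (1 / alpha) > 0."""
    g = 1.0 - lam * (y / sigma) ** (1.0 / alpha)
    if g <= 0.0:
        raise ValueError("y lies outside the support")
    return (y / sigma) ** (1.0 / alpha - 1.0) / (alpha * sigma * g)

def hazard_shape(alpha, lam):
    """Hazard shape according to the regions of Theorem 1."""
    if alpha == 1 and lam == 0:
        return "constant"
    if alpha > 1 and lam > 0:
        return "bathtub"
    if alpha >= 1 and lam <= 0:
        return "decreasing"
    if alpha < 1 and lam < 0:
        return "unimodal"
    return "increasing"  # remaining region: alpha <= 1 and lam >= 0

def scaled_ttt(sample):
    """Empirical scaled TTT transform phi_n(r/n) for r = 1, ..., n."""
    y = sorted(sample)
    n = len(y)
    total = sum(y)
    values = []
    partial = 0.0
    for r, y_r in enumerate(y, start=1):
        partial += y_r  # sum of the r smallest observations
        values.append((partial + (n - r) * y_r) / total)
    return values

# Hypothetical usage: a negative lambda with alpha < 1 falls in region (c).
print(hazard_shape(0.47, -2.15))     # unimodal
print(scaled_ttt([3.0, 1.0, 2.0]))   # 0.5, 5/6, 1
```

In the toy sample the transform lies above the diagonal $r/n$ at every point, which is the concave pattern that Aarset (1987) associates with an increasing hazard rate.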
Figure 1. Typical Hazard Shapes of the Generalized Weibull Distributions for $(\alpha, \lambda)$ Over the Four Quadrants. (Panel labels: Bathtub Failure Rate; Unimodal Failure Rate; IFR.)

3. PARAMETRIC INFERENCE: REGULAR CASE

3.1 Maximum Likelihood Estimation

The analysis of survival data modeled by the generalized Weibull family (2) for the regular case (i.e., $\alpha > 0$ and $\lambda < 0$) can be performed using parametric methods that are now well established and reasonably well understood. We discuss the nonregular case in Section 5. For the regular case, the methods for fitting models by estimation, testing parametric hypotheses and comparing models corresponding to different populations are generally based on the appropriate likelihoods and the associated large-sample theory.

Let us assume availability of one or two sets of right-censored lifetime data without covariates. A typical set of this kind consists of observations $X_i = \min(T_i, C_i)$, $i = 1, 2, \ldots, n$, on $n$ units in a survival trial. Here $T_i$ and $C_i$ represent the failure and the censoring times of the $i$th unit. The log-likelihood function for these data, in the framework of the generalized Weibull family, can be written as
$$l(\theta) = -d\log\alpha - d\log\sigma + \left(\frac{1}{\alpha} - 1\right)\sum_{u}\log(x_i/\sigma) - \sum_{u}\log g(x_i) + \frac{1}{\lambda}\sum_{i=1}^{n}\log g(x_i), \tag{4}$$

where $\theta = (\alpha, \lambda, \sigma)'$, $\sum_u$ denotes the summation over the uncensored observations, $d$ is the total number of uncensored observations, and $g(x_i) = g(x_i; \alpha, \lambda, \sigma) = (1 - \lambda(x_i/\sigma)^{1/\alpha})$. The parameters $\alpha$, $\lambda$, and $\sigma$ can be estimated by maximizing $l(\theta)$; that is, solving the likelihood equations obtained by differentiating $l(\theta)$ with respect to $\alpha$, $\lambda$, and $\sigma$.

Let $\hat\theta = (\hat\alpha, \hat\lambda, \hat\sigma)'$ denote the maximum likelihood estimates of $\theta = (\alpha, \lambda, \sigma)'$; then asymptotically, as $n \to \infty$,

(5)

where $N_3$ denotes the trivariate normal distribution and $i(\theta)$ is the Fisher information matrix. The asymptotic result remains valid if $i(\theta)$ is replaced by a consistent estimate $i(\hat\theta)$ or, more simply, by the $(3 \times 3)$ sample information matrix given by

(6)

The asymptotic distribution of the maximum likelihood estimates of $\theta$ given by (5) may be used to construct approximate confidence intervals and confidence regions for the individual parameters, for the hazard functions, or for survival functions. To test the variety of parametric hypotheses encountered in data analysis, we may use any of the three well-known (Kalbfleisch and Prentice 1980; Rao 1973) asymptotically equivalent test statistics: the likelihood ratio, Wald statistic, and score statistic. These are also useful for testing goodness-of-fit hypotheses such as $H_0$: $\lambda = 0$ and $H_0$: $\lambda = 0$, $\alpha = 1$, which postulate the Weibull and exponential models.

3.2 Proportional Hazards Model

It is easy to verify that the family (2) is closed under proportional hazards relationship. That is, for any $\eta > 0$ and $S(\theta) = 1 - F(\theta)$ in the family (2), $[S(\theta)]^{\eta}$ is also a member of (2). Hence modeling a two-arm clinical trial using the survival function $S(\theta)$ from family (2) for one arm and $[S(\theta)]^{\eta}$ for the other arm is parsimonious as it involves only four parameters instead of the six needed if the two arms were modeled separately. The appropriateness of the proportional hazards assumption can be checked by testing the hypothesis $H_0$: $\eta = 1$, as illustrated in Section 4.4. Also, the test for $H_0$: $\lambda = 0$ provides a goodness-of-fit test for the Weibull submodels for the two datasets together.

If the proportional hazards model is inappropriate for the data, then the two arms can be modeled separately by generalized Weibull survival functions $S(\theta_1)$ and $S(\theta_2)$ involving parameters $\theta_1 = (\alpha_1, \lambda_1, \sigma_1)'$ and $\theta_2 = (\alpha_2, \lambda_2, \sigma_2)'$. Because the two samples are independent, the likelihood of the six parameters is the obvious product. The likelihood equations needed for parametric estimation and testing of the hypotheses, including that for comparison of two arms, $H_0$: $\theta_1 = \theta_2$, can be easily obtained from the likelihood function given in (4).

4. A UNIMODAL HAZARD EXAMPLE

4.1 Modeling Head and Neck Cancer Data

We now illustrate some of the ideas and methodology of parametric inference for the generalized Weibull model for the regular case as outlined in the previous section, using the data from a two-arm clinical trial considered earlier by Efron (1988). Efron observed that the empirical hazard functions for both samples start near 0, suggesting an initial high-risk period in the beginning, a decline for a while, and then stabilization after about 1 year. He developed and illustrated a methodology for analyzing the data using a combination of techniques of quantal response analysis and the spline regression methods.

Specifically, Efron's data from a head and neck cancer clinical trial consist of survival times of 51 patients in arm A who were given radiation therapy and 45 patients in arm B who were given radiation plus chemotherapy. Nine patients in arm A and 14 patients in arm B were lost to follow-up and were regarded as censored. The data from Efron (1988) are reproduced in Table 1. Efron discretized the data into a number of intervals, as shown in Table 2, and estimated the hazard function for each discretized interval using logistic regression.

Using an iterative algorithm and the likelihood method discussed in Section 3, we fit the generalized Weibull models with parameters $\theta_1 = (\alpha_1, \lambda_1, \sigma_1)'$ and $\theta_2 = (\alpha_2, \lambda_2, \sigma_2)'$ to arms A and B. The initial values required for the iterations are taken as $\theta_{10} = (\alpha_{10} = .82, \lambda_{10} = -.01, \sigma_{10} = 9.62)$ for arm A and $\theta_{20} = (\alpha_{20} = .80, \lambda_{20} = -.01, \sigma_{20} = 11.77)$ for arm B, where the initial values $\alpha_{10}$, $\alpha_{20}$, $\sigma_{10}$, and $\sigma_{20}$ are obtained using Equations (A.7) and (A.8) in Appendix B. The initial guesses for $\lambda$ for both the arms are taken to be a small negative value, because the empirical scaled TTT transform plots, given in Figure 3, suggested unimodal hazard rates. With number of iterations arbitrarily set at 100, the routines converged to the following values of the parameters:

$$\hat\alpha_1 = .469826, \quad \hat\lambda_1 = -2.152236, \quad \text{and} \quad \hat\sigma_1 = 5.988068 \quad \text{for arm A} \tag{7}$$

and

$$\hat\alpha_2 = .222806, \quad \hat\lambda_2 = -9.387811, \quad \text{and} \quad \hat\sigma_2 = 4.867750 \quad \text{for arm B.} \tag{8}$$

The plots of the hazard functions $h(y)$, given by (3), corresponding to these models are presented in Figure 2.
Table 1. Survival Times in Days, From a Two-Arm Clinical Trial Considered by Efron (1988)

Arm A: 7, 34, 42, 63, 64, 74+, 83, 84, 91, 108, 112, 129, 133, 133, 139, 140, 140, 146, 149, 154, 157, 160, 160, 165, 173, 176, 185+, 218, 225, 241, 248, 273, 277, 279+, 297, 319+, 405, 417, 420, 440, 523, 523+, 583, 594, 1101, 1116+, 1146, 1226+, 1349+, 1412+, 1417.

Arm B: 37, 84, 92, 94, 110, 112, 119, 127, 130, 133, 140, 146, 155, 159, 169+, 173, 179, 194, 195, 209, 249, 281, 319, 339, 432, 469, 519, 528+, 547+, 613+, 633, 725, 759+, 817, 1092+, 1245+, 1331+, 1557, 1642+, 1771+, 1776, 1897+, 2023+, 2146+, 2297+.

NOTE: + indicates censoring.

Table 2. NOTE: The chi-squared value for testing the overall goodness of fit for the proportional hazards model is 26.82, which for 22 df gives a p value of 21.8%.

4.2 Model Adequacy

The assessment of model adequacy requires the expected number of deaths for each of the class intervals presented in Table 2. These are obtained by considering intervals of length 1 month, calculating the expected number of deaths in the $i$th month interval as $n_i h_i$, where $n_i$ is the number at risk at the beginning and $h_i$ is the hazard at the midpoint of the $i$th month interval, and summing them over all the month intervals included in the interval. Efron (1988) combined the month intervals so as to have roughly 50 person-month observations in each class interval. We also group the intervals in a similar fashion, as in Table 2, for the results to be comparable. For example, the fifth interval of 4-6 months in Table 2 corresponding to arm A indicates 72 patients at risk, which is the sum of 40 and 32 patients at risk at the beginning of intervals 4-5 and 5-6.

The goodness of fit of the models corresponding to the estimates given in (7) and (8) for arms A and B can then be ascertained by means of signed deviance residuals, discussed by McCullagh and Nelder (1983) and used by Efron (1988). The symbols $N_j$, $S_j$ and $e_j = \sum n_i h_i$ (sum is over the $j$th time period) represent the number of patients at risk at the beginning of the $j$th period, the number of observed deaths in the $j$th period, and the number of expected deaths in the $j$th period. A summary of the results of the goodness of fit of the models appears in Table 2.

The quality of the fit of the models for the two arms can be evaluated using the signed deviance residuals chi-squared p values. Thus for the arm A, using the same discretization as Efron, we observe a $\chi^2$ of 13.5137 with 10 df and significance probability of 19.65%. This may be considered more satisfactory in comparison with the p values of 1.4% and 3.2% for the simpler linear and cubic fits, and essentially equivalent to the p value of 20% for the less convenient cubic-linear spline fit of Efron (1988). For arm B, we discretized the data as given in Table 2; the calculations yielded a chi-squared value of 14.8750, which for 12 df gives a significance probability of 24.8%. This again demonstrates that a three-parameter generalized Weibull model with parameter estimates given in (8) provides a good fit for the arm B data.

4.3 Weibull Goodness of Fit

As suggested in the previous section, we can test the adequacy of the Weibull models for the arm A and arm B data
given in Table 1 by testing the composite hypothesis $H_0$: $\lambda = 0$.

For arm A, the value of the likelihood ratio statistic, $2[l(\hat\theta) - l(\hat\theta_0)]$, is 66.6134. The asymptotic distribution of the statistic is chi-squared with 1 df, which gives a p value $\approx 0$. A similar calculation for arm B yields the value of likelihood ratio statistic to be 128.123; once again using a chi-squared distribution with 1 df gives a p value $\approx 0$. On the basis of the foregoing results, the Weibull model can be decisively rejected for both arms.

Figure 2. Fitted Hazard Functions (the Generalized Weibull Fits as Obtained in Sec. 4) for the Two Arms Separately and Assuming Proportional Hazards. I, arm A separately; II, arm B separately; III, arm A proportional hazard; IV, arm B proportional hazard. (Horizontal axis: Months.)

4.4 Proportional Hazards

Let us now consider modeling arm A using the generalized Weibull distribution with the survival function $S(\alpha, \lambda, \sigma) = 1 - F(\alpha, \lambda, \sigma)$, where $F$ is given by (2), involving parameter $\theta = (\alpha, \lambda, \sigma)$, and modeling arm B by the survival function $[S(\alpha, \lambda, \sigma)]^{\eta}$. The four parameters $\alpha$, $\lambda$, $\sigma$, and $\eta$ can also be estimated by the method of maximum likelihood. Because the data corresponding to the two arms are independent, the joint log-likelihood under proportional hazards assumption, $(l_{pr.hz})$, is just the sum of the individual log-likelihoods for the two arms separately, as follows:

$$l_{pr.hz} = \sum_{u_A}\log h(x_i; \alpha, \lambda, \sigma) + \sum_{i=1}^{n_1}\log[1 - F(x_i; \alpha, \lambda, \sigma)] + d_2\log\eta + \sum_{u_B}\log h(x_i; \alpha, \lambda, \sigma) + \eta\sum_{i=1}^{n_2}\log[1 - F(x_i; \alpha, \lambda, \sigma)], \tag{10}$$

where $n_1$, $u_A$ and $n_2$, $u_B$ denote the number of observations and sum over uncensored observations in arms A and B and $d_2$ represents the number of uncensored observations in arm B. The likelihood equations for estimating the parameters can be very easily obtained from (10). With an initial guess as $\alpha_0 = .4$, $\lambda_0 = -5$, $\sigma_0 = 5$, and $\eta_0 = .5$, we obtained the maximum likelihood estimates as

$$\hat\alpha = .375675, \quad \hat\lambda = -3.014134, \quad \hat\sigma = 5.198953, \quad \text{and} \quad \hat\eta = .579333. \tag{11}$$

4.5 Goodness of Fit and Equivalence of the Two Arms

From the estimates obtained in (11), we can test the goodness of fit for the two arms separately. For arm A using the estimates of $\alpha$, $\lambda$, and $\sigma$ as obtained in (11), we obtain chi-squared value of 12.9546, which for 10 df gives a p value of 22.59%. A similar exercise for the arm B data with the estimates given in (11) yields a chi-squared value of 9.21337, which for 11 df gives a p value of 60.25%. A goodness-of-fit test for proportional hazards can be performed by stacking the datasets for the two arms. Using (10) and following through the calculations, we get an observed chi-squared value equal to 26.8164 for 28 classes with 4 parameters to be estimated, which gives a p value of 31.31%. This provides strong evidence for concluding that the proportional hazards model is appropriate for modeling the data in arms A and B.

It is interesting to note that a considerable improvement in the fit of arm B is achieved by using the proportional hazards model, and some improvement is noticed in the arm A fit as well. This may be due to the fact that using the proportional hazards model for arm A provides more information about arm B and vice-versa.

The two arms of the study can be compared by testing the composite hypothesis $H_0$: $\theta_1 = \theta_2$ (i.e., $H_0$: $\alpha_1 = \alpha_2$, $\lambda_1 = \lambda_2$, $\sigma_1 = \sigma_2$) by using, for example, Wald-type statistic $(\hat\theta_1 - \hat\theta_2)'[I_1^{-1}(\hat\theta_1) + I_2^{-1}(\hat\theta_2)]^{-1}(\hat\theta_1 - \hat\theta_2)$, for the separate fits given by (7) and (8). The observed value of the chi-square associated with Wald-type statistic is 11.6746, which gives a p value of .0086. The equivalence of the two arms in the proportional hazards model (i.e., the null hypothesis $H_0$: $\eta = 1$) may be tested using, for example, the likelihood ratio statistic. The associated value of the chi-square in this case is 5.3301 with a p value of .021. Hence we can conclude that the hypothesis of equality of two arms can be rejected.

5. BATHTUB AND INCREASING HAZARD RATES

5.1 The Nonregular Case

The hazard rates of the generalized Weibull distributions are increasing when $\lambda > 0$, $\alpha < 1$, and are bathtub shaped when $\lambda > 0$, $\alpha > 1$. Because bathtub hazard rate data are often encountered in practice, a variety of distributions for modeling such data and methods for statistical analysis of the models have appeared in the literature. Lawless (1982)
and Rajarshi and Rajarshi (1988) have discussed the issues involving bathtub hazard shape and alternatives such as weighted least squares and method of moments to maximum likelihood estimation.

However, as noted earlier, the generalized Weibull distributions are nonregular in the sense that their support $(0, \sigma/\lambda^{\alpha})$ involves the parameters. Hence they share all the intricacies of the theory of likelihood-based inference for the nonregular families. In some cases the likelihood may become unbounded, and consequently maximum likelihood estimators may not exist, whereas in other cases the maximum likelihood estimators may exist but their asymptotic distributions may or may not follow the classical asymptotic theory. An excellent account of the asymptotic theory of maximum likelihood estimators for nonregular families has been provided by Smith (1985).

Table 3. Survival Times for the 50 Devices Put on Life Test at Time Zero

0.1, 0.2, 1.0, 1.0, 1.0, 1.0, 1.0, 2.0, 3.0, 6.0, 7.0, 11.0, 12.0, 18.0, 18.0, 18.0, 18.0, 18.0, 21.0, 32.0, 36.0, 40.0, 45.0, 46.0, 47.0, 50.0, 55.0, 60.0, 63.0, 63.0, 67.0, 67.0, 67.0, 67.0, 72.0, 75.0, 79.0, 82.0, 82.0, 83.0, 84.0, 84.0, 84.0, 85.0, 85.0, 85.0, 85.0, 85.0, 86.0, 86.0

5.2 Reparameterization and Estimation

Suppose that data consisting of $n$ lifetimes $Y_1, Y_2, \ldots, Y_n$ suggest a bathtub hazard rate, and that we wish to fit a generalized Weibull model. For the purpose of statistical inference, we propose reparameterizing the distribution function $F(y)$ in (2) as

$$F(y) = \left[1 - \left(1 - (y/\phi)^{1/\alpha}\right)^{1/\lambda}\right], \quad y \le \phi, \tag{12}$$

where $\alpha$ and $\lambda$ are nonnegative and $\phi = \sigma/\lambda^{\alpha}$. Note that $\phi$ in (12) is both the scale and threshold parameter and $\alpha$ and $\lambda$ are the shape parameters. The quantile function of (12) is given by

$$Q(u) = \phi\left[1 - (1-u)^{\lambda}\right]^{\alpha}. \tag{13}$$

To estimate the parameters, one can proceed by first estimating the threshold parameter $\phi$ by its natural consistent estimator $\hat\phi = Y_{n:n}$, where $Y_{i:n}$, $i = 1, \ldots, n$ denote the order statistics of a sample of size $n$ from (12). The "modified log-likelihood" based on $(n-1)$ observations, after ignoring the largest observation and replacing $\phi$ by its estimator $\hat\phi = Y_{n:n}$, is given by

$$L = L(\alpha, \lambda, \hat\phi) = -(n-1)\log\alpha - (n-1)\log\lambda - (n-1)\log(Y_{n:n}) + \left(\frac{1}{\alpha} - 1\right)\sum_{i}\log(Y_i/Y_{n:n}) + \left(\frac{1}{\lambda} - 1\right)\sum_{i}\log\left(1 - (Y_i/Y_{n:n})^{1/\alpha}\right). \tag{14}$$

The estimates of $\alpha$ and $\lambda$, denoted by $\hat\alpha$ and $\hat\lambda$, can be obtained by solving the maximum likelihood equations, $\partial L/\partial\alpha = \partial L/\partial\lambda = 0$, obtained from (14).

5.3 Sampling Distributions and Inference

For the purpose of statistical inference, an understanding of the joint distributions of $\hat\phi = Y_{n:n}$ and the modified maximum likelihood estimates $\hat\alpha$ and $\hat\lambda$ obtained from (14) is necessary. Note that the joint distribution of $\hat\theta = (\hat\phi, \hat\alpha, \hat\lambda)'$ is not trivariate normal and for $\lambda \le \tfrac{1}{2}$, the rate of convergence is not $\sqrt{n}$. It is convenient to describe the joint distribution in terms of the asymptotic marginal distribution of $\hat\phi$ and the asymptotic conditional distribution of $(\hat\alpha, \hat\lambda)'$ given $\hat\phi$. These are given in the following theorems; the proofs appear in Appendix A.

Theorem 2.
a. The marginal distribution of $\hat\phi = Y_{n:n}$ is given by

$$P(\hat\phi \le t) = \left[1 - \left(1 - (t/\phi)^{1/\alpha}\right)^{1/\lambda}\right]^{n}.$$

b. Asymptotically, as $n \to \infty$,

$$n^{\lambda}\left(\frac{\hat\phi}{\phi} - 1\right) = -\alpha Z^{\lambda} + O_p(n^{-\lambda}) \to -\alpha Z^{\lambda} \ \text{in law}, \tag{15}$$

where $Z$ denotes the standard exponential random variable.

Theorem 3. The conditional asymptotic distribution of $(\hat\alpha, \hat\lambda)'$ given $Y_{n:n}$, as $n \to \infty$, of the "modified likelihood" (14), is bivariate normal with mean $(\alpha, \lambda)'$ and the inverse, $i_{\alpha\lambda}^{-1}$, of the Fisher information matrix as the covariance matrix. The asymptotic result is valid even if $i_{\alpha\lambda}$ is replaced by the $(2 \times 2)$ sample information matrix $I_{\alpha\lambda}$ evaluated at $\hat\alpha$ and $\hat\lambda$.

A confidence interval or test of hypothesis regarding the parameter $\phi$, when shape parameters $\alpha$ and $\lambda$ are known, can be based on the marginal distribution of $\hat\phi$ given by Theorem 2. Thus for large samples, a $(1 - \beta)$ confidence interval for $\phi$ can be obtained from

$$P[C_1 \le \hat\phi/\phi \le C_2] = (1 - \beta), \tag{16}$$

where $C_1 = 1 - [(\alpha D_1^{\lambda})/n^{\lambda}]$ and $C_2 = 1 - [(\alpha D_2^{\lambda})/n^{\lambda}]$ and $D_1 = -\log(\beta/2)$ and $D_2 = -\log(1 - \beta/2)$ are the $1 - \beta/2$ and $\beta/2$ percentiles of a standard exponential random variable. With sufficiently large samples, the estimates $\hat\alpha$ and $\hat\lambda$ may be used in place of $\alpha$ and $\lambda$, if the latter are unknown. Obviously, (16) can also be used for testing $H_0$: $\phi = \phi_0$ at level $\beta$.

The large-sample distribution of $(\hat\alpha, \hat\lambda)'$ needed for the inference regarding the shape parameters $\alpha$ and $\lambda$ may, in principle, be obtained by integrating out $\hat\phi$ from the joint distribution of $(\hat\phi, \hat\alpha, \hat\lambda)$. However, a preferable route suggested by a referee leads to the following theorem, the proof of which appears in Appendix A.

Theorem 4. Asymptotically as $n \to \infty$ the distribution of $(\hat\alpha, \hat\lambda)'$ is (a) bivariate normal if $\lambda > 1/2$, (b) bivariate exponential if $\lambda < 1/2$, and (c) a mixture of normal and exponential if $\lambda = 1/2$.

If it is known a priori that $\lambda > 1/2$, then Theorem 4 provides a very convenient bivariate normal distribution for large-sample inference about $\alpha$ and $\lambda$. Otherwise, the situation is complicated. A large-sample joint confidence region
for $(\alpha, \lambda, \phi)$ based on a test of the simple hypothesis $H_0\colon (\alpha, \lambda, \phi) = (\alpha_0, \lambda_0, \phi_0)$ may then be an appropriate practical alternative. A test for $H_0$ can be constructed using the well-known classical union-intersection principle, which we briefly describe here.

To test $H_0\colon (\phi, \alpha, \lambda) = (\phi_0, \alpha_0, \lambda_0)$ involving all three parameters, at level $\beta$, decompose $H_0$ as $H_{01} \cap H_{02}$, where $H_{01}\colon \phi = \phi_0$ and $H_{02}\colon (\alpha, \lambda) = (\alpha_0, \lambda_0)$, and choose $\beta_1$ and $\beta_2$ such that $(1 - \beta) = (1 - \beta_1)(1 - \beta_2)$. Let $A_1$ be the acceptance region of a level $\beta_1$ test for $H_{01}$ based on the marginal distribution given in Theorem 2. Also, let $A_2$ be the acceptance region of a level $\beta_2$ test for $H_{02}$ based on the conditional distribution given in Theorem 3. The two tests are independent because the two null distributions are independent. Hence we have

$$P(A_1 \cap A_2 \mid H_0) = P(A_1 \mid H_{01})\,P(A_2 \mid H_{02}) = (1 - \beta_1)(1 - \beta_2) = 1 - \beta. \tag{17}$$

Thus $A_1 \cap A_2$ can be used as an acceptance region of size $\beta$ for testing the hypothesis $H_0$. Alternatively, the test for $H_0$ may also be conducted by combining the independent p values of the tests of $H_{01}$ and $H_{02}$ (see Mudholkar and George 1979). Clearly, the set of all $(\alpha_0, \lambda_0, \phi_0)$ for which $H_0$ is accepted for the observed $(\hat\alpha, \hat\lambda, \hat\phi)$ is a $(1 - \beta)100\%$ confidence region for $(\alpha, \lambda, \phi)$. Some of the foregoing ideas are illustrated in the following example.

5.4 An Illustrative Example

Aarset (1987) used data on lifetimes of 50 devices (given in Table 3) to illustrate the use of the scaled total-time-on-test (TTT) transform for identifying the nature of the failure rate.

It is clear from Figure 3 that a bathtub hazard rate model is appropriate for the data in Table 3. With the estimation method discussed earlier, we estimate the threshold parameter $\phi$ by $Y_{n:n} = 86.0$, the largest observation. It is important to note that the largest observation 86.0 occurs with multiplicity 2, and hence the "modified likelihood" consists of only 48 terms. The estimates of the parameters $\alpha$ and $\lambda$ are obtained using the likelihood methods. Thus, in terms of the original parameterization, the estimates are

$$\hat\alpha = 2.2992, \quad \hat\lambda = 1.9727, \quad \text{and} \quad \hat\sigma = 410.0768. \tag{18}$$

Figure 3. Scaled TTT Transforms. Aarset data: I, empirical; II, fit with $\alpha = 2.2992$, $\lambda = 1.9727$, and $\sigma = 410.0768$. Efron data: III, empirical arm A; IV, empirical arm B.

Large-sample confidence intervals (19) for $\phi$ require constants $C_1$ and $C_2$, which can be obtained from $D_1$ and $D_2$, the $\beta/2$ and $1 - \beta/2$ percentiles of the standard exponential distribution. Thus we get $86.0001 \le \phi \le 87.1714$ and $86.003 \le \phi \le 86.7734$ as approximate 95% and 90% confidence intervals. However, it seems more reasonable to construct one-sided confidence intervals, because the lower bound can be taken as $\hat\phi$. Hence we have $\phi \le 86.7734$ and $\phi \le 86.4585$ as approximate 95% and 90% one-sided confidence intervals for $\phi$. The corresponding 95% and 90% confidence intervals for $\sigma$ are $\sigma \le 413.8005$ and $\sigma \le 412.2988$.

Remark 1. Note that the foregoing confidence intervals of $\phi$ are of the form $(\hat\phi + a, \hat\phi + b)$. They do not include the point estimate $\hat\phi$.

To test the goodness of fit of the model, we group the data as in Table 4, and obtain $O_i$ and $E_i$, $i = 1, \ldots, 7$, the observed number of failures and their expectations under the fitted model, by using

$$E_i = n\,[\hat S(t_{i-1}) - \hat S(t_i)] \quad \text{for } i = 1, \ldots, 7, \tag{19}$$

where $n = 50$ represents the number of observations, $\hat S(t) = S(t; \hat\alpha, \hat\lambda, \hat\sigma) = 1 - F(t; \hat\alpha, \hat\lambda, \hat\sigma)$, $\hat S(t_0) = \hat S(0) = 1$, and $t_i$ denotes the upper endpoint of the $i$th interval.

Table 4 shows that the expected frequencies $E_2$ and $E_3$ are smaller than 5. Hence we pool the two intervals and get Pearson's goodness-of-fit chi-squared value with 2 df, $X^2 = 3.2701$, with a p value of 19.5%. (See Gore et al. 1986 and Lawless 1982 for similar data and details of the problems encountered in their analysis.) This adequacy of

Table 4. Model Adequacy for the Fit (18) to the Aarset Data in Table 3

Months   0-5      5-10     10-20    20-35    35-55    55-75    75-86
O_i      9        2        7        2        7        9        14
E_i      7.9739   3.1813   4.7552   5.8680   7.4568   8.9837   11.7811
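The entries of Table 4 and the pooled Pearson statistic can be checked numerically. The following Python sketch is not part of the original analysis. It assumes that the fitted survival function in the original parameterization is S(t) = [1 - λ(t/σ)^(1/α)]^(1/λ), that each expected count is n times the drop in S across an interval, and that the 2-df chi-squared tail probability is exp(-x/2). All function and variable names are illustrative.

```python
import math

# Fitted values (18) and Table 4 as printed; everything else is illustrative.
ALPHA, LAM, SIGMA = 2.2992, 1.9727, 410.0768
N = 50
ENDPOINTS = [0, 5, 10, 20, 35, 55, 75, 86]          # interval endpoints in months
OBSERVED = [9, 2, 7, 2, 7, 9, 14]
EXPECTED_TABLE = [7.9739, 3.1813, 4.7552, 5.8680, 7.4568, 8.9837, 11.7811]


def survival(t):
    """Assumed fitted survival function S(t) = [1 - lam*(t/sigma)^(1/alpha)]^(1/lam).

    The base is clamped at 0 because the rounded estimates place the upper
    end of the support marginally below the largest observation, 86.0.
    """
    base = 1.0 - LAM * (t / SIGMA) ** (1.0 / ALPHA)
    return max(base, 0.0) ** (1.0 / LAM)


def expected_counts():
    """Expected failures per interval: n * [S(t_{i-1}) - S(t_i)]."""
    s = [survival(t) for t in ENDPOINTS]
    return [N * (s[i] - s[i + 1]) for i in range(len(s) - 1)]


def pool_cells(values, first, last):
    """Merge the adjacent cells first..last (0-based, inclusive) into one cell."""
    return values[:first] + [sum(values[first:last + 1])] + values[last + 1:]


def pearson_chi_square(observed, expected):
    """Pearson statistic: sum of (O - E)^2 / E over the cells."""
    return sum((o - e) ** 2 / e for o, e in zip(observed, expected))


# Pool the second and third intervals, whose expected counts are below 5.
obs_pooled = pool_cells(OBSERVED, 1, 2)
exp_pooled = pool_cells(EXPECTED_TABLE, 1, 2)
chi2 = pearson_chi_square(obs_pooled, exp_pooled)

# Upper tail of a chi-squared variable with 2 df (the df stated in the text).
p_value = math.exp(-chi2 / 2.0)

print("model-based E_i:", [round(e, 4) for e in expected_counts()])
print("chi-squared:", round(chi2, 4), "p value:", round(p_value, 4))
```

Under these assumptions the model-based expected counts agree with the printed row of Table 4 to within rounding of the parameter estimates, and the pooled statistic and p value agree with the values reported in the text.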

fit is also reflected in the empirical and fitted scaled TTT transforms given in Figure 3.

6. CONCLUSIONS

The generalized Weibull family as presented in (2) can be used effectively in the analysis of survival data. The family is versatile, accommodating monotone, unimodal, and bathtub-shaped hazard functions. The empirical scaled TTT transform can be used to identify the shape of the hazard function. The family has closed-form expressions for the distribution functions and the hazard functions, and is closed under proportional hazards modeling. Because of its analytic tractability, the likelihood-based inference in the regular case and an alternative method based on a "modified likelihood" can be easily implemented.

APPENDIX A: PROOFS

Proof of Theorem 1

The theorem can be established either by analyzing the hazard quantile function $h(Q(u)) = f(Q(u))/(1 - u)$, or by directly examining the hazard function. From Equation (3) we have

$$D \log h(y) = \left(\frac{1}{\alpha} - 1\right)\frac{1}{y} + \frac{1}{\alpha\sigma}\cdot\frac{\lambda (y/\sigma)^{1/\alpha - 1}}{1 - \lambda (y/\sigma)^{1/\alpha}}, \tag{A1}$$

where $D \log h(y) = (d/dy) \log h(y)$ is the derivative of $\log h(y)$. Part (e) is trivial, and parts (b) and (d) follow immediately from the signs of the derivative in (A1). However, parts (a) and (c) are less obvious.

First, consider part (a), the case $\alpha > 1$ and $\lambda > 0$. Here the derivative (A1) is $-\infty$ at $y = 0$ and $+\infty$ at $y = \sigma/\lambda^{\alpha}$. Moreover, $D \log h(y)$ vanishes only once, because the equation $(1 - \alpha) + \alpha\lambda (y/\sigma)^{1/\alpha} = 0$ has a unique solution. Hence in this case the hazard function $h(y)$ is convex; that is, bathtub shaped.

Now to prove (c), assume that $0 < \alpha < 1$ and $\lambda < 0$. Once again, we note that $D \log h(y)$ has only one zero; furthermore, it is easy to see that $D \log h(0) = \infty$ and $D \log h(\infty) = 0$, implying that the function is increasing at zero and decreasing for large $y$. It follows that the hazard function is unimodal.

Proof of Theorem 2

a. In view of the distribution function of $y$, as given in (12), the proof is obvious.

b. Let $U_{n:n}$ denote the largest-order statistic of a sample of size $n$ from the uniform (0, 1) distribution. Then from the quantile function (13), it is clear that the maximum $Y_{n:n} = \hat\phi$ can be expressed as

$$Y_{n:n} \stackrel{L}{=} Q(U_{n:n}) = \phi\,[1 - (1 - U_{n:n})^{\lambda}]^{\alpha}.$$

So by expanding $(1 - y)^{\alpha}$ in a Taylor series about $y = 0$ and substituting $y = (1 - U_{n:n})^{\lambda}$, we get

$$n^{\lambda}\left(\frac{Y_{n:n}}{\phi} - 1\right) \stackrel{L}{=} -\alpha\,[n(1 - U_{n:n})]^{\lambda} + O_p(n^{-\lambda}).$$

But $n(1 - U_{n:n})$ converges, as $n \to \infty$, in law to the standard exponential random variable, and the second term on the right side, $O_p(n^{-\lambda})$, converges to zero. Hence we get (15).

Proof of Theorem 3

Conditional on $\hat\phi$, the modified likelihood (14) satisfies all of the assumptions required for the asymptotic normality of the maximum likelihood estimates $(\hat\alpha, \hat\lambda)'$, and hence the result. (For an analogous development, see Smith 1985.)

Proof of Theorem 4

Let $\theta_0 = (\alpha_0, \lambda_0)'$, $\hat\theta = (\hat\alpha, \hat\lambda)'$, and $\theta = (\alpha, \lambda)'$. Also, let $L = L(\alpha, \lambda, \hat\phi)$ be the log-likelihood function in (14) centered at $\hat\phi$, let $G(\theta, \phi) = (\partial/\partial\theta)L$, and let $H(\theta, \phi) = (\partial^2/\partial\theta^2)L$ be the $2 \times 2$ Hessian matrix. Then by the definition of $\hat\theta$ we have

$$G(\hat\theta, \hat\phi) = 0. \tag{A2}$$

Now expanding (A2) about $\theta_0$ we get

$$0 = G(\theta_0, \hat\phi) + H(\theta_0, \hat\phi)(\hat\theta - \theta_0) + o_p(1/\sqrt{n}). \tag{A3}$$

Assuming that $n$ is large enough so that $\hat\phi$ can be replaced by $\phi_0$ in $H(\theta_0, \hat\phi)$, we get

$$0 = G(\theta_0, \hat\phi) + H(\theta_0, \phi_0)(\hat\theta - \theta_0) + o_p(1/\sqrt{n}). \tag{A4}$$

Then, rearranging the terms and multiplying by $\sqrt{n}$, it follows that

$$\sqrt{n}(\hat\theta - \theta_0) = -\sqrt{n}\,H^{-1}(\theta_0, \phi_0)\,G(\theta_0, \hat\phi) + o_p(1). \tag{A5}$$

Now expanding $G(\theta_0, \hat\phi)$ around $\phi_0$, we have

$$\sqrt{n}(\hat\theta - \theta_0) = -\sqrt{n}\,H^{-1}(\theta_0, \phi_0)\,G(\theta_0, \phi_0) - \sqrt{n}(\hat\phi - \phi_0)\,H^{-1}(\theta_0, \phi_0)\,\frac{\partial}{\partial\phi}G(\theta_0, \phi_0) + o_p(1). \tag{A6}$$

If $\lambda > 1/2$, then, in view of Theorem 2, the second term in (A6) is negligible and the asymptotic distribution of $\hat\theta$ is bivariate normal. However, if $\lambda < 1/2$, then the second term in (A6) dominates the first term, and by Theorem 2, asymptotically $\hat\theta$ has a bivariate exponential distribution. For $\lambda = 1/2$, the asymptotic distribution is a mixture of normal and exponential.

APPENDIX B: INITIAL GUESS FOR ITERATIVE METHODS

The solution of the nonlinear equations can be obtained using one of the many iterative routines; for example, those given in IMSL or NAG. It may be noted, however, that these routines are generally sensitive to the starting values. If censoring is light, then for the starting values of $\alpha$ and $\sigma$ we propose using some simple shape and scale estimates obtained assuming the Weibull model and ignoring the censored part of the data. In the illustrative example of Section 3, we use the simple closed-form estimates, $\tilde\alpha$ and $\tilde\sigma$, of Kollia and Mudholkar (1990). These estimates are

a = .69;13n t ( ~~-=- .~~ - )tOg
2 1 Y;:n (B.1)

and

= exp [~ ~~-=-.~~ + .16724) log Y;:n] , (B.2)
if (1.6655

where $Y_{1:n} \le Y_{2:n} \le \cdots \le Y_{n:n}$ denote the order statistics of a complete sample from a Weibull distribution with parameters $\alpha$ and $\sigma$. If the censoring is heavy, then instead of $\tilde\alpha$ and $\tilde\sigma$, one may use the Weibull maximum likelihood estimates obtained iteratively. For the starting value $\lambda_0$, we can take a small number (e.g., $\pm .01$) with the sign as suggested by the shape of the empirical scaled TTT transform after ignoring the censoring.

Alternatively, one can obtain the initial values using linear regression. For the model (1), we have

$$\log Y_{i:n} \approx \log Q(i/n) = \log\sigma - \alpha\log\lambda + \alpha\log[1 - (1 - (i/n))^{\lambda}]. \tag{B.3}$$
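Once $\lambda$ is held fixed, (B.3) is linear in the intercept $\log\sigma - \alpha\log\lambda$ and the slope $\alpha$. One way to turn this into starting values is therefore a grid search over $\lambda$ with an ordinary least-squares fit at each grid point. The following Python sketch illustrates the idea under stated assumptions: it is restricted to $\lambda > 0$ so that $\log\lambda$ is defined, it uses the plotting positions $i/n$ of (B.3), and the function names and the grid are hypothetical choices rather than the authors' procedure.

```python
import math


def quantile(u, alpha, lam, sigma):
    """Quantile function behind (B.3): Q(u) = sigma * [(1 - (1-u)^lam) / lam]^alpha."""
    return sigma * ((1.0 - (1.0 - u) ** lam) / lam) ** alpha


def fit_fixed_lambda(y_sorted, lam):
    """Least-squares fit of (B.3) for one fixed lam > 0.

    Regresses log y_(i) on x_i = log[1 - (1 - i/n)^lam] and returns
    (alpha, sigma, rss), where the slope estimates alpha and the
    intercept estimates log(sigma) - alpha*log(lam).
    """
    n = len(y_sorted)
    x = [math.log(1.0 - (1.0 - i / n) ** lam) for i in range(1, n + 1)]
    z = [math.log(v) for v in y_sorted]
    x_bar = sum(x) / n
    z_bar = sum(z) / n
    sxx = sum((a - x_bar) ** 2 for a in x)
    sxz = sum((a - x_bar) * (b - z_bar) for a, b in zip(x, z))
    alpha = sxz / sxx
    intercept = z_bar - alpha * x_bar
    sigma = math.exp(intercept + alpha * math.log(lam))
    rss = sum((b - intercept - alpha * a) ** 2 for a, b in zip(x, z))
    return alpha, sigma, rss


def nested_least_squares(y, lam_grid):
    """Return (alpha0, lam0, sigma0, rss) for the grid value with the smallest RSS."""
    y_sorted = sorted(y)
    best = None
    for lam in lam_grid:
        alpha, sigma, rss = fit_fixed_lambda(y_sorted, lam)
        if best is None or rss < best[3]:
            best = (alpha, lam, sigma, rss)
    return best


# Usage on noise-free synthetic quantiles (illustrative values, not real data).
n = 50
y = [quantile(i / n, 2.0, 0.5, 100.0) for i in range(1, n + 1)]
grid = [k / 10 for k in range(1, 21)]        # lam = 0.1, 0.2, ..., 2.0
alpha0, lam0, sigma0, rss0 = nested_least_squares(y, grid)
print(alpha0, lam0, sigma0, rss0)
```

With real data the residual sum of squares will not vanish, and the selected triple is meant only as a starting point for the iterative likelihood routine. A negative $\lambda$ would require a different regressor and different plotting positions, because $(1 - i/n)^{\lambda}$ is undefined at $i = n$ when $\lambda < 0$.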

So for any fixed $\lambda$, estimates of $\alpha$ and $\sigma$ can be obtained using linear regression. Thus by "manual updating" (i.e., "nested least squares"), one can obtain a reasonable starting point $(\alpha_0, \lambda_0, \sigma_0)$ corresponding to the smallest residual sum of squares when the regression method is repeated for a selection of $\lambda$ values.

[Received December 1993. Revised March 1996.]

REFERENCES

Aarset, M. V. (1987), "How to Identify a Bathtub Hazard Rate," IEEE Transactions on Reliability, R-36, 106-108.
Box, G. E. P., and Cox, D. R. (1964), "An Analysis of Transformations," Journal of the Royal Statistical Society, Ser. B, 26, 211-252.
Burr, I. W. (1942), "Cumulative Frequency Functions," Annals of Mathematical Statistics, 13, 215-232.
Cox, D. R., and Oakes, D. (1984), Analysis of Survival Data, London: Chapman and Hall.
Efron, B. (1988), "Logistic Regression, Survival Analysis, and the Kaplan-Meier Curve," Journal of the American Statistical Association, 83, 414-425.
Farewell, V. T., and Prentice, R. L. (1977), "A Study of Distributional Shape in Life Testing," Technometrics, 19, 69-75.
Gore, A. P., Paranjape, S., Rajarshi, M. B., and Gadgil, M. (1986), "Some Methods for Summarizing Survivorship Data in Nonstandard Situations," Biometrical Journal, 28, 577-586.
Harrington, D. P., and Fleming, T. R. (1982), "A Class of Rank Test Procedures for Censored Survival Data," Biometrika, 69, 553-566.
Kalbfleisch, J. D., and Prentice, R. L. (1980), The Statistical Analysis of Failure Time Data, New York: Wiley.
Kollia, G., and Mudholkar, G. S. (1990), "An Approach to Estimation in Quantile Function Families With Application to Weibull Distribution," technical report, University of Rochester, Dept. of Statistics.
Lawless, J. F. (1982), Statistical Models and Methods for Lifetime Data, New York: Wiley.
Lawless, J. F., and Singhal, K. (1987), "ISMOD: An All-Subsets Regression Program for Generalized Linear Models I. Statistical and Computational Background," Computer Methods and Programs in Biomedicine, 24, 117-124.
McCullagh, P., and Nelder, J. (1984), Generalized Linear Models, London: Chapman and Hall.
Miller, R. G., Jr. (1983), "What Price Kaplan-Meier?" Biometrics, 39, 1077-1081.
Mudholkar, G. S., and George, E. O. (1979), "The Logit Statistic for Combining Probabilities-An Overview," in Optimizing Methods in Statistics, ed. J. S. Rustagi, New York: Academic Press, pp. 345-365.
Mudholkar, G. S., and Kollia, G. D. (1994), "Generalized Weibull Family: A Structural Analysis," Communications in Statistics, Part A-Theory and Methods, 23, 1149-1171.
Rajarshi, S., and Rajarshi, M. B. (1988), "Bathtub Distributions: A Review," Communications in Statistics, Part A-Theory and Methods, 17, 2597-2621.
Rao, C. R. (1973), Linear Statistical Inference and Its Applications, New York: Wiley.
Smith, R. L. (1985), "Maximum Likelihood Estimation in a Class of Non-Regular Cases," Biometrika, 72, 67-90.
