405 (4,5) PDF

Lifetime Data Analysis
Inference Procedures for Parametric Models
Mahbub Latif
Institute of Statistical Research and Training (ISRT)
University Dhaka, Dhaka 1000, Bangladesh
(mlatif@isrt.ac.bd)
October 2023
Mahbub Latif Lifetime Data Analysis October 2023 1 / 142

Plan
Inference procedures for parametric models

Introduction
Exponential distribution
Type II censoring plan
Comparison of distributions
Gamma distribution

Introduction

Introduction
• Likelihood methods for lifetime data were introduced in Chapter 2, which

includes derivation of likelihood function for different types of censored data
• Maximum likelihood estimator
• Inference about parameters (hypothesis testing and confidence intervals)

Likelihood function (Complete data)
• Let 𝑡1 , … , 𝑡𝑛 be a random sample from a distribution 𝑓(𝑡; 𝜽), where 𝜽 is a
𝑝-dimensional vector of parameters
• Likelihood and log-likelihood function
𝑛
𝐿(𝜽) = ∏ 𝑓(𝑡𝑖 ; 𝜽) (1)
𝑖=1
𝑛
ℓ(𝜽) = log 𝐿(𝜽) = ∑ log 𝑓(𝑡𝑖 ; 𝜽) (2)
𝑖=1
• Maximum likelihood estimator (MLE)
𝜽 ̂ = arg max ℓ(𝜽) (3)

𝜽∈Θ

Statistical inference (Complete data)
• Large sample property of MLE
𝜽 ̂ ∼ 𝒩(𝜽, [ℐ(𝜽)]−1 ) (4)
• Fisher expected information matrix
𝜕 2 ℓ(𝜽)
ℐ(𝜽) = 𝐸[ − ] (5)
𝜕𝜽 𝜕𝜽′
• Observed information matrix
𝜕 ℓ(𝜽) 2
𝐼(𝜽)̂ = − ∣ (6)
𝜕𝜽 𝜕𝜽′ 𝜽=𝜽̂

Likelihood function (Type I or random censoring)
• Data: {(𝑡𝑖 , 𝛿𝑖 ), 𝑖 = 1, … , 𝑛}
• 𝑡𝑖 is a sample realization of 𝑇𝑖̃ = min(𝑇𝑖 , 𝐶𝑖 )
• lifetime 𝑇𝑖 follows a distribution with pdf 𝑓(𝑡𝑖 ; 𝜽) and the corresponding survivor
function 𝑆(𝑡𝑖 ; 𝜽)
• censoring time 𝐶𝑖 could be random or fixed depending on the censoring
mechanism
• 𝛿𝑖 = 𝐼(𝑇𝑖 ≤ 𝐶𝑖 ), censoring indicator

Likelihood function (Type I or random censoring)
• Data for type I or random censoring
{(𝑡𝑖 , 𝛿𝑖 ), 𝑖 = 1, … , 𝑛}
• Likelihood function
𝑛
𝛿 1−𝛿𝑖
𝐿(𝜽) = ∏ [𝑓(𝑡𝑖 ; 𝜽] 𝑖 [𝑆(𝑡𝑖 ; 𝜽)]
𝑖=1
𝑛
𝛿 1−𝛿𝑖
= ∏ [𝑆(𝑡𝑖 ; 𝜽) ℎ(𝑡𝑖 ; 𝜽)] 𝑖 [𝑆(𝑡𝑖 ; 𝜽)]
𝑖=1
𝑛
𝛿𝑖
= ∏ [𝑆(𝑡𝑖 ; 𝜽] [ℎ(𝑡𝑖 ; 𝜽)] (7)
𝑖=1

Likelihood function (Type II censoring)
• Let lifetime 𝑇𝑖 (𝑖 = 1, … , 𝑛) follows a distribution with pdf 𝑓(𝑡𝑖 ; 𝜽) and the

corresponding survivor function 𝑆(𝑡𝑖 ; 𝜽)
• The experiment was terminated after observing 𝑟 smallest lifetimes
𝑡(1) ⩽ ⋯ ⩽ 𝑡(𝑟)
• The remaining (𝑛 − 𝑟) observations are considered as censored at 𝑡(𝑟)
𝑟 𝑛−𝑟
𝐿(𝜽) = [ ∏ 𝑓(𝑡(𝑖) ; 𝜽)] [𝑆(𝑡(𝑟) ; 𝜽)] (8)
𝑖=1

Statistical inference (censored sample)
• For censored samples, the result “asymptotic distribution of MLE is normal” is

still valid
• The expression of the Fisher expected information matrix ℐ(𝜽) is complex for
censored data, observed information matrix 𝐼(𝜃)̂ is used in making inference
with censored sample


• The exponential distribution is the first lifetime model for which statistical
methodology were extensively developed
• Exact tests can be developed for exponential distribution for certain type of
censoring mechanism
• Exponential distribution assume constant hazard and its use is limited for
analyzing real life problems

• Probability density function
𝑓(𝑡; 𝜃) = (1/𝜃) exp(−𝑡/𝜃) 𝑡 ⩾ 0, 𝜃 > 0 (9)
• Hazard function
ℎ(𝑡; 𝜃) = (1/𝜃) (10)
• Survivor function
𝑆(𝑡; 𝜃) = exp(−𝑡/𝜃) (11)
• 𝑝𝑡ℎ quantile
𝑆(𝑡𝑝 ; 𝜃) = 1 − 𝑝 ⇒ exp(−𝑡𝑝 ; 𝜃) = 1 − 𝑝 ⇒ 𝑡𝑝 = −𝜃 log(1 − 𝑝)

Homework
• Estimation and related inference of exponential distribution when the sample

has no censored observations

Type I or random censoring
• Lifetime 𝑇 ∼ Exp(𝜃), 𝜃 > 0

• Data: {(𝑡𝑖 , 𝛿𝑖 ), 𝑖 = 1, … , 𝑛}
𝑛
𝛿 1−𝛿𝑖
𝐿(𝜃) = ∏ [𝑓(𝑡𝑖 ; 𝜃)] 𝑖 [𝑆(𝑡𝑖 ; 𝜃)]
𝑖=1
𝑛
𝛿𝑖
= ∏ 𝑆(𝑡𝑖 ; 𝜃) [ℎ(𝑡𝑖 ; 𝜃)]
𝑖=1
𝑛
= ∏ exp(−𝑡𝑖 /𝜃) (1/𝜃)𝛿𝑖 (12)
𝑖=1

𝑛
𝐿(𝜃) = ∏ exp(−𝑡𝑖 /𝜃) (1/𝜃)𝛿𝑖
𝑖=1
• Log-likelihood function
𝑛
ℓ(𝜃) = ∑ [ − (𝑡𝑖 /𝜃) − 𝛿𝑖 log(𝜃)]
𝑖=1
1
= − ∑ 𝑡𝑖 − 𝑟 log(𝜃) (13)
𝜃 𝑖
• 𝑟 = ∑𝑖 𝛿𝑖 → the number of failures observed in the sample

1
ℓ(𝜃) = − ∑ 𝑡𝑖 − 𝑟 log(𝜃)
𝜃 𝑖
• MLE
𝜕ℓ(𝜃) 1 𝑟
∣ =0 ⇒ ∑ 𝑡𝑖 − = 0
𝜕𝜃 𝜃2̂ 𝑖 𝜃̂
𝜃=𝜃 ̂
𝑛
𝜃 ̂ = ∑ 𝑡𝑖 /𝑟 (14)
𝑖=1
• Assuming 𝑟 > 0 and no finite MLE exist for 𝑟 = 0

• Information matrix
𝜕 2 ℓ(𝜃)
𝐼(𝜃) = −
𝜕𝜃2
𝜕 1 𝑟
= − [ 2 ∑ 𝑡𝑖 − ]
𝜕𝜃 𝜃 𝑖 𝜃
2 𝑟
= 3
∑ 𝑡𝑖 − 2 (15)
𝜃 𝑖 𝜃
• Replacing the parameter 𝜃 by its MLE 𝜃 ̂ = ∑𝑖 𝑡𝑖 /𝑟 in (15), the observed

information matrix becomes
𝑟
𝐼(𝜃)̂ =
𝜃2̂
Confidence interval (Method I)
̂ asymptotic distribution
• Using 𝜃’s
𝜃 − 𝜃̂ 𝜃 − 𝜃̂
𝑍1 = = √ ∼ 𝒩(0, 1)
̂ −1/2
[𝐼(𝜃)] 𝜃/̂ 𝑟
• For a small sample, 𝑍1 does not approximate the standard normal distribution
very accurately
• For a sample with a small number of uncensored observations, ℓ(𝜃) tends to be
asymmeteric

Confidence interval (Method II)
• Sprott (1980) showed that ℓ1 (𝜙) = ℓ(𝜙−3 ) is more closer to symmetric
compared to ℓ(𝜃), where 𝜙 = 𝜃−1/3 ⇒ 𝜙3 = 1/𝜃
ℓ(𝜃) = −(1/𝜃) ∑ 𝑡𝑖 − 𝑟 log 𝜃 (16)

𝑖
ℓ1 (𝜙)= −𝜙3 ∑ 𝑡𝑖 + 3𝑟 log(𝜙) (17)

𝑖
−1/3
• MLE ̂
𝜙 ̂ = 𝜃−1/3 = [∑𝑖 𝑡𝑖 /𝑟]
3𝑟 9𝑟
𝐼1 (𝜙) = 6𝜙 ∑ 𝑡𝑖 + 2
⇒ 𝐼1 (𝜙)̂ =
𝑖
𝜙 𝜙2̂

Confidence interval (Method II)
• Pivotal quantity
𝜙 − 𝜙̂ 𝜙 − 𝜙̂
𝑍2 = = √ ∼ 𝒩(0, 1)
̂
[𝐼1 (𝜙)]
−1/2
𝜙/̂ 9𝑟
• Approximation of 𝑍2 is quite accurate compared to that of 𝑍1

log theta r part e
Confidence interval (Method III) kono change korbo
na
• Likelihood ratio statistic Λ(𝜃) = 2ℓ(𝜃)̂ − 2ℓ(𝜃) can also be used to obtain
confidence interval for 𝜃, where
ℓ(𝜃) = −(1/𝜃) ∑ 𝑡𝑖 − 𝑟 log(𝜃)

𝑖
̂
= −𝑟(𝜃/𝜃) − 𝑟 log(𝜃) (18)
ℓ(𝜃)̂ = −(1/𝜃)̂ ∑ 𝑡𝑖 − 𝑟 log(𝜃)̂
𝑖
= −𝑟 − 𝑟 log(𝜃)̂ (19)

Confidence interval (Method III)
• LRT statistic
Λ(𝜃) = 2ℓ(𝜃)̂ − 2ℓ(𝜃)
̂
= 2𝑟[(𝜃/𝜃) ̂
− 1 + log(𝜃/𝜃)]
• Two-sided (1 − 𝛼)100% confidence intervals are obtained as the set of 𝜃 values
that satisfy
Λ(𝜃) ≤ 𝜒2(1),1−𝛼

Confidence interval of survivor function
• Confidence interval for the parameter 𝜃 can also be used to obtain confidence
intervals for a monotone function of 𝜃, such as
𝑆(𝑡; 𝜃) = exp(−𝑡/𝜃) or ℎ(𝑡; 𝜃) = 1/𝜃
• Let 100(1 − 𝛼)% confidence interval for 𝜃
𝐿(Data) ⩽ 𝜃 ⩽ 𝑈 (Data),
where Data = {(𝑡𝑖 , 𝛿𝑖 ), 𝑖 = 1, … , 𝑛}

Confidence interval of survivor function
• 100(1 − 𝛼)% confidence interval for 𝑆(𝑡0 ; 𝜃) = exp(−𝑡0 /𝜃)
𝐿(Data) ⩽ 𝜃 ⩽ 𝑈 (Data)
𝑡0 /𝑈 (Data) ⩽ (𝑡0 /𝜃) ⩽ 𝑡0 /𝐿(Data)
−𝑡0 /𝐿(Data) ⩽ (−𝑡0 /𝜃) ⩽ −𝑡0 /𝑈 (Data)
exp ( − 𝑡0 /𝐿(Data)) ⩽ 𝑆(𝑡0 ; 𝜃) ⩽ exp ( − 𝑡0 /𝑈 (Data))

Confidence interval of hazard function
• 100(1 − 𝛼)% confidence interval for ℎ(𝑡0 ; 𝜃) = 1/𝜃
(1/𝑈 (Data)) ⩽ ℎ(𝑡0 ; 𝜃) ⩽ (1/𝐿(Data))

Example 4.1.1
• Lifetimes (in days) of 10 pieces of equipment

time status
2 1
72 0
51 1
60 0
33 1
27 1
14 1
24 1
4 1
21 0
• Assume lifetimes follow exponential distribution with parameter, i.e. 𝑇𝑖 ∼ Exp(𝜃)

Example 4.1.1
• MLE
∑𝑖 𝑡𝑖 308
𝜃̂ = = = 44.0
𝑟 7
• 95% confidence interval of 𝜃 (Method I)
√
̂ −1/2 ⇒ 𝜃 ̂ ± 𝑧 (𝜃/̂ 𝑟)
𝜃 ̂ ± 𝑧.975 [𝐼(𝜃)] .975
√
⇒ 44.0 ± (1.96)(44.0/ 7)
⇒ 11.4 ⩽ 𝜃 ⩽ 76.6

Example 4.1.1
• 95% confidence interval of 𝜃 (Method II)
• ̂
𝜙 ̂ = 𝜃−1/3 = (44.0)−1/3 = 0.283
• Confidence interval for 𝜙

√
̂ −1/2 ⇒ 𝜙 ̂ ± 𝑧.975 (𝜙/̂ 9𝑟)
𝜙 ̂ ± 𝑧.975 [𝐼1 (𝜙)]
√
⇒ 0.28 ± (1.96)(0.28/ 63)
⇒ 0.21 ⩽ 𝜙 ⩽ 0.35
• Confidence interval for 𝜃
0.21 ⩽ 𝜙 ⩽ 0.35 ⇒ 0.21 ⩽ 𝜃−1/3 ⩽ 0.35

⇒ (0.35)−3 ⩽ 𝜃 ⩽ (0.21)−3
⇒ 22.69 ⩽ 𝜃 ⩽ 103.03
Example 4.1.1
• Likelihood ratio statistic

̂
Λ(𝜃) = 2𝑟[(𝜃/𝜃) ̂
− 1 − log(𝜃/𝜃)]
• 95% CI for 𝜃 can be obtained from the set of 𝜃’s such that
Λ(𝜃) ≤ 𝜒2(1),.95 = 3.84
• 95% CI for 𝜃
22.8 to 102.4

5
Λ 3
0
20 40 60 80 100
θ

Table 1: A comparison of 95% CI for 𝜃 with the data on lifetimes of 10 piees of
equipment (Example 4.1.1)
95% CI
method lower upper
Normal approx. 11.40 76.60
Sprott 22.69 103.03
LRT 22.80 102.40

Methods for CI
• Normal approximation
2
(𝜃 ̂ − 𝜃)
𝑊1 (𝜃) = [ ] = (𝜃 ̂ − 𝜃)2 𝐼(𝜃)̂
̂ −1/2
[𝐼(𝜃)]
• Sprott’s method
2
(𝜙 ̂ − 𝜙) ̂ ̂
𝑊2 (𝜃) = [ ] = (𝜃−1/3 − 𝜃−1/3 )2 𝐼1 (𝜃−1/3 )
̂ −1/2
[𝐼1 (𝜙)]
• LRT
𝑊3 (𝜃) = 2ℓ(𝜃)̂ − 2ℓ(𝜃) = 2𝑟[(𝜃/𝜃)
̂ ̂
− 1 − log (𝜃/𝜃)]
• 𝑊𝑗 (𝜃) ∼ 𝜒2(1) 𝑗 = 1, 2, 3 under 𝐻0

LRT Normal Sprott
10.0
Wj(θ) 7.5
5.0
2.5
0.0
0 25 50 75 100 125
θ
Figure 1: Comparisons between normal approximations and LRT

Simulation study
• Performance of three methods of obtaining confidence interval for 𝜃 is compared

using a simulation study
• Lifetime 𝑇 ∼ 𝐸𝑥𝑝(𝜃)
• Censoring time 𝐶 is fixed for all individuals
𝑄 = exp(−𝐶/𝜃), 0 < 𝑄 < 1
where 𝑄 is the effective censoring fraction as each lifetime has a probability 𝑄 of

being censored
• 𝑡 is a sample realization of 𝑇 ̃ = min{𝑇 , 𝐶} and 𝛿 = 𝐼(𝑇 ⩽ 𝐶)

Simulation study
• Simulate a sample {(𝑡𝑖 , 𝛿𝑖 ), 𝑖 = 1, … , 𝑛} for 𝐵 (⩾ 2000) times

• Obtain (1 − 𝛼)100% confidence interval for 𝜃 from each of the simulated
samples using the three methods described earlier
• Proportion of times the confidence intervals contain the true 𝜃 is known as
observed coverage proportion
• It is expected that observed coverage proportion would be closer to the nominal
coverage (1 − 𝛼) if the method is appropriate

Table 2: Comparison of observed coverage proportions obtained from normal
approximations and LRT based methods for 100(1 − 𝛼)% one-sidedd confidence
intervals for 𝜃 using a simulation study
𝑄 = .10 𝑄 = .25 𝑄 = .50

𝑛 method .95 .90 .95 .90 .95 .90
10 norm 0.878 0.827 0.865 0.827 0.885 0.825

Sprott 0.955 0.906 0.954 0.901 0.950 0.898
LRT
20 norm 0.891 0.841 0.899 0.855 0.897 0.848

Sprott 0.955 0.904 0.946 0.899 0.945 0.898
LRT


Type II censoring plan approximate
• Assume lifetime 𝑇 ∼ Exp(𝜃)

• Let 𝑡(1) < ⋯ < 𝑡(𝑟) be the 𝑟 smallest lifetimes observed from an experiment
with 𝑛 (≥ 𝑟) subjects
• The joint distribution of 𝑡(1) , … , 𝑡(𝑟)
𝑟 𝑛−𝑟
𝑛!
𝑓𝑇 (𝑡(1) , … , 𝑡(𝑟) ) = [ ∏(1/𝜃)𝑒−𝑡(𝑖) /𝜃 ][𝑒−𝑡(𝑟) /𝜃) ]
(𝑛 − 𝑟)! 𝑖=1
𝑛! 𝑟
(1/𝜃) 𝑒− 𝜃 [∑𝑖=1 𝑡(𝑖) +(𝑛−𝑟)𝑡(𝑟) ]
𝑟 1
=
(𝑛 − 𝑟)!
𝑛!
= (1/𝜃)𝑟 𝑒−𝑇0 /𝜃 (20)
(𝑛 − 𝑟)!

• The log-likelihood function

1
ℓ(𝜃) = − ∑ 𝑡(𝑖) − 𝑟 log(𝜃) − (1/𝜃)(𝑛 − 𝑟)𝑡(𝑟) + Const.
𝜃 𝑖
1 𝑟
= − [ ∑ 𝑡(𝑖) + (𝑛 − 𝑟)𝑡(𝑟) ] − 𝑟 log(𝜃) + Const.
𝜃 𝑖=1
𝑇0
=− − 𝑟 log(𝜃) + Const. (21)
𝜃

• Maximum likelihood estimator
𝑟
𝜕ℓ(𝜃) ̂
∣ = 0 ⇒ 𝜃 = (1/𝑟)[ ∑ 𝑡(𝑖) + (𝑛 − 𝑟)𝑡(𝑟) ]
𝜕𝜃 𝜃=𝜃 ̂ 𝑖=1

−𝜕 2 ℓ(𝜃) 𝑟
𝐼(𝜃)̂ = 2
∣ =
𝜕𝜃 𝜃=𝜃 ̂ 𝜃2̂
• (1 − 𝛼)100% CI for 𝜃
−1/2
𝜃 ̂ ± 𝑧1−𝛼/2 [𝐼(𝜃)]
̂ (22)

Exact confidence interval proof porbo na
• Exact confidence interval for 𝜃 can be derived for uncensored and Type II
censored samples, define
𝑊1 = 𝑛𝑡(1)
𝑊2 = (𝑛 − 1)(𝑡(2) − 𝑡(1) )
⋮
𝑊𝑟 = (𝑛 − 𝑟 + 1)(𝑡(𝑟) − 𝑡(𝑟−1) )
• In general
𝑊𝑖 = (𝑛 − 𝑖 + 1)(𝑡(𝑖) − 𝑡(𝑖−1) ) 𝑖 = 1, … , 𝑟

• It can be shown that

𝑟 𝑟
∑ 𝑊𝑖 = ∑ 𝑡(𝑖) + (𝑛 − 𝑟)𝑡(𝑟) = 𝑇0 (23)
𝑖=1 𝑖=1

• The joint distribution of 𝑊1 = 𝑔1 (𝑡(1) , … , 𝑡(𝑟) ), … , 𝑊𝑟 = 𝑔𝑟 (𝑡(1) , … , 𝑡(𝑟) )
𝑓𝑊 (𝑤1 , … , 𝑤𝑟 ) = 𝑓𝑇 (𝑔1−1 (𝑤1 , … , 𝑤𝑟 ), … , 𝑔𝑟−1 (𝑤1 , … , 𝑤𝑟 ))∣𝐽 ∣ (24)
where
𝜕(𝑔1−1 (𝑤1 , … , 𝑤𝑟 ), … , 𝑔𝑟−1 (𝑤1 , … , 𝑤𝑟 ))
𝐽=
𝜕(𝑤1 , … , 𝑤𝑟 )
𝜕𝑔1−1 (𝑤1 ,…,𝑤𝑟 ) 𝜕𝑔2−1 (𝑤1 ,…,𝑤𝑟 ) 𝜕𝑔𝑟−1 (𝑤1 ,…,𝑤𝑟 )
…
⎡ −1 𝜕𝑤1 𝜕𝑤1 𝜕𝑤1 ⎤
⎢ 𝜕𝑔1 (𝑤1 ,…,𝑤𝑟 ) 𝜕𝑔2−1 (𝑤1 ,…,𝑤𝑟 ) −1
𝜕𝑔𝑟 (𝑤1 ,…,𝑤𝑟 ) ⎥
=⎢ 𝜕𝑤2 𝜕𝑤2 … 𝜕𝑤2
⎥
⎢ −1 ⋮ ⋮ ⋱ ⋮ ⎥
𝜕𝑔1 (𝑤1 ,…,𝑤𝑟 ) 𝜕𝑔2−1 (𝑤1 ,…,𝑤𝑟 ) 𝜕𝑔𝑟−1 (𝑤1 ,…,𝑤𝑟 )
⎣ 𝜕𝑤𝑟 𝜕𝑤𝑟 … 𝜕𝑤𝑟 ⎦

𝑤1
𝑤1 = 𝑔1 (𝑡(1) , … , 𝑡(𝑟) ) = 𝑛𝑡(1) ⇒ 𝑡(1) = 𝑔1−1 (𝑤1 , … , 𝑤𝑟 ) =
𝑛
𝑤2 = 𝑔2 (𝑡(1) , … , 𝑡(𝑟) ) = (𝑛 − 1)(𝑡(2) − 𝑡(1) )
𝑤2 𝑤
⇒ 𝑡(2) = 𝑔2−1 (𝑤1 , … , 𝑤𝑟 ) = + 1
𝑛−1 𝑛
⋮
𝑤𝑟 = 𝑔𝑟 (𝑡(1) , … , 𝑡(𝑟) ) = (𝑛 − 𝑟 + 1)(𝑡(𝑟) − 𝑡(𝑟−1) )
𝑤𝑟 𝑤2 𝑤
⇒ 𝑡(𝑟) = 𝑔𝑟−1 (𝑤1 , … , 𝑤𝑟 ) = +⋯+ + 1
𝑛−𝑟+1 𝑛−1 𝑛

𝜕𝑔1−1 (𝑤1 ,…,𝑤𝑟 ) 𝜕𝑔2−1 (𝑤1 ,…,𝑤𝑟 ) 𝜕𝑔𝑟−1 (𝑤1 ,…,𝑤𝑟 )
…
⎡ −1 𝜕𝑤1 𝜕𝑤1 𝜕𝑤1 ⎤
⎢ 𝜕𝑔1 (𝑤1 ,…,𝑤𝑟 ) 𝜕𝑔2−1 (𝑤1 ,…,𝑤𝑟 ) 𝜕𝑔𝑟−1 (𝑤! ,…,𝑤𝑟 ) ⎥
𝐽 =⎢ 𝜕𝑤2 𝜕𝑤2 … 𝜕𝑤2
⎥
⎢ −1 ⋮ ⋮ ⋱ ⋮ ⎥
𝜕𝑔1 (𝑤1 ,…,𝑤𝑟 ) 𝜕𝑔2−1 (𝑤1 ,…,𝑤𝑟 ) 𝜕𝑔𝑟−1 (𝑤1 ,…,𝑤𝑟 )
⎣ 𝜕𝑤𝑟 𝜕𝑤𝑟 … 𝜕𝑤𝑟 ⎦
1 1 1
𝑛 𝑛 … 𝑛
⎡ 1 1 ⎤
= ⎢0 𝑛−1 … 𝑛−1 ⎥
⎢⋮ ⋮ ⋱ ⋮ ⎥
1
⎣0 0 … 𝑛−𝑟+1 ⎦
• We can write
1 (𝑛 − 𝑟)!
∣𝐽 ∣ = = (25)
𝑛 ⋅ (𝑛 − 1) ⋯ (𝑛 − 𝑟 + 1) 𝑛!
• The joint distribution of 𝑊1 = 𝑔1 (𝑡(1) , … , 𝑡(𝑟) ), … , 𝑊𝑟 = 𝑔𝑟 (𝑡(1) , … , 𝑡(𝑟) )
𝑓𝑊 (𝑤1 , … , 𝑤𝑟 ) = 𝑓𝑇 (𝑔1−1 (𝑤1 , … , 𝑤𝑟 ), … , 𝑔𝑟−1 (𝑤1 , … , 𝑤𝑟 ))∣𝐽 ∣

𝑛! 1 − 1𝜃 { ∑ 𝑔𝑖−1 (𝑤𝑖 )+(𝑛−𝑟)𝑔𝑟−1 (𝑤𝑟 )} (𝑛 − 𝑟)!
= 𝑒 𝑖
(𝑛 − 𝑟)! 𝜃𝑟 𝑛!
1 − ∑ 𝑤𝑖 /𝜃
= 𝑒 𝑖 (26)
𝜃𝑟
• Using (23)
• Equation (26) shows that

• 𝑊1 , … , 𝑊𝑟 are independent and 𝑊𝑖 ∼ Exp(𝜃)

• Since 𝑊𝑖 ’ are independent and 𝑊𝑖 ∼ Exp(𝜃), so
𝑇0 = ∑ 𝑊𝑖 = ∑ 𝑡(𝑖) + (𝑛 − 𝑟)𝑡(𝑟)
𝑖 𝑖
follows a gamma distribution with scale parameter 𝜃 and shape parameter 𝑟

• Then using the relationship between gamma and chi-square distribution
2𝑇0
∼ 𝜒2(2𝑟)
𝜃

Review
• Moment generating functions for different distributions
𝑀𝑋 (𝑡) = 𝐸[𝑒𝑡𝑋 ]
1
⎧ 1−𝜃𝑡 for 𝑋 ∼ Exp(𝜃)
{
{
{ 1 𝑟
= ⎨( 1−𝜃𝑡 ) for 𝑋 ∼ Gamma(𝑟, 𝜃)
{
{ 𝑟/2
{( 1 ) for 𝑋 ∼ 𝜒2(𝑟)
⎩ 1−2𝑡

2𝑇0
• Since 𝜃 ∼ 𝜒2(2𝑟) , we can write
2𝑇0
𝑃 (𝜒2(2𝑟), 𝛼/2 ≤ ≤ 𝜒2(2𝑟), 1−𝛼/2 ) = 1 − 𝛼
𝜃
• 𝜒2(2𝑟), 𝑝 → 𝑝𝑡ℎ quantile of 𝜒2(2𝑟) distribution

• (1 − 𝛼)100% exact confidence interval for 𝜃
2𝑇0 2𝑇0
2
≤𝜃≤ 2
(27)
𝜒(2𝑟), 1−𝛼/2 𝜒(2𝑟), 𝛼/2

Example 4.1.3
• The first 8 observations in a random sample of 12 lifetimes (in hours) from an

assumed exponential distribution are
31, 58, 157, 185, 300, 470, 497, 673
• Homework
• Obtain 95% exact and approximate confidence intervals for 𝜃


• Comparison of two or more lifetime distributions is often of interest in practice

• For comparing two or more independent exponential distributions, different
methods of hypothesis tests and confidence intervals are available

Likelihood ratio tests
• Let 𝑇𝑖𝑗 be the lifetime corresponding to the 𝑗𝑡ℎ subject of the 𝑖𝑡ℎ group
(𝑖 = 1, … , 𝑚; 𝑗 = 1, … , 𝑛𝑖 )
• Assume 𝑇𝑖𝑗 ∼ Exp(𝜃𝑖 ) and the null hypothesis of interest
𝐻0 ∶ 𝜃 1 = ⋯ = 𝜃 𝑚
• Data:
{(𝑡𝑖𝑗 , 𝛿𝑖𝑗 ), 𝑖 = 1, … , 𝑚; 𝑗 = 1, … , 𝑛𝑖 }

• The likelihood function for 𝜽 = (𝜃1 , … , 𝜃𝑚 )′
𝑚 𝑛𝑖 𝛿𝑖𝑗 1−𝛿𝑖𝑗
𝐿(𝜽) = ∏ ∏ [𝑓(𝑡𝑖𝑗 ; 𝜃𝑖 )] [𝑆(𝑡𝑖𝑗 ; 𝜃𝑖 ]
𝑖=1 𝑗=1
𝑚 𝑛𝑖
1 𝛿𝑖𝑗
= ∏ ∏ [ ] 𝑒−𝑡𝑖𝑗 /𝜃𝑖
𝑖=1 𝑗=1
𝜃𝑖
• The corresponding log-likelihood function
ℓ(𝜽) = − ∑ ∑ [(𝑡𝑖𝑗 /𝜃𝑖 ) + 𝛿𝑖𝑗 log(𝜃𝑖 )] (28)

𝑖 𝑗

• The log-likelihood function
ℓ(𝜽) = − ∑ ∑ [(𝑡𝑖𝑗 /𝜃𝑖 ) + 𝛿𝑖𝑗 log(𝜃𝑖 )]

𝑖 𝑗
= − ∑ [(𝑇𝑖 /𝜃𝑖 ) + 𝑟𝑖 log(𝜃𝑖 )]

𝑖
• 𝑇𝑖 = ∑𝑗 𝑡𝑖𝑗 and 𝑟𝑖 = ∑𝑗 𝛿𝑖𝑗
• Maximum likelihood estimator of 𝜃𝑖 (𝑖 = 1, … , 𝑚)
𝜕ℓ(𝜽) 𝑇𝑖 𝑟𝑖
∣ =0 ⇒ − = 0 ⇒ 𝜃𝑖̂ = (𝑇𝑖 /𝑟𝑖 )
𝜕𝜃𝑖 ̂
2
𝜃𝑖 𝜃𝑖̂
𝜃𝑖 =𝜃𝑖̂

• Under
𝐻0 ∶ 𝜃1 = ⋯ = 𝜃𝑚 = 𝜃 (say)
log-likelihood function
ℓ(𝜃) = − ∑ [(𝑇𝑖 /𝜃) + 𝑟𝑖 log(𝜃)]

𝑖
• Under 𝐻0 , the MLE of 𝜃

∑𝑖 𝑇𝑖
𝜃̃ =
∑𝑖 𝑟𝑖

• Likelihood ratio test statistic
Λ = 2ℓ(𝜽)̂ − 2ℓ(𝜃)̃
= 2 ∑ [ − (𝑇𝑖 /𝜃𝑖̂ ) − 𝑟𝑖 log(𝜃𝑖̂ ) + (𝑇𝑖 /𝜃)̃ + 𝑟𝑖 log(𝜃)]
̃
𝑖
= 2 ∑ [ − 𝑟𝑖 log(𝜃𝑖̂ ) + 𝑟𝑖 log(𝜃)]
̃
𝑖
• Under 𝐻0 , Λ ∼ 𝜒2(𝑚−1) for a large sample size

Example 4.1.4
• Four independent samples of size 10 each had 7 failures

• MLE under exponential model
𝜃1̂ = 106, 𝜃2̂ = 80, 𝜃3̂ = 140, 𝜃4̂ = 158
• Homework
• Test 𝐻0 ∶ 𝜃1 = ⋯ = 𝜃4

Confidence intervals for 𝜃1/𝜃2
• Comparison between two exponential distributions
𝑇𝑖 ∼ Exp(𝜃𝑖 ), 𝑖 = 1, 2
• It can be shown that mle 𝜃𝑖̂ approximately follows normal distribution
𝜃𝑖̂ ∼ 𝒩(𝜃𝑖 , 𝜃𝑖2 /𝑟𝑖 ) (29)

log 𝜃𝑖̂ ∼ 𝒩( log 𝜃𝑖 , 1/𝑟𝑖 ) (30)
• Distribution of log(𝜃1̂ /𝜃2̂ )
log 𝜃1̂ − log 𝜃2̂ = log(𝜃1̂ /𝜃2̂ ) ∼ 𝒩( log(𝜃1 /𝜃2 ), (𝑟1−1 + 𝑟2−1 )) (31)

• Null hypothesis
𝐻0 ∶ 𝜃1 = 𝜃2 ⇒ 𝐻0 ∶ 𝜃1 /𝜃2 = 1 ⇒ 𝐻0 ∶ log 𝜃1 = log 𝜃2
• Test statistic
log(𝜃1̂ /𝜃2̂ ) − log(𝜃1 /𝜃2 )
𝑍= 1/2
∼ 𝒩(0, 1)
(𝑟1−1 + 𝑟2−1 )

• 95% CI for log(𝜃1 /𝜃2 )

1/2
log(𝜃1̂ /𝜃2̂ ) ± 𝑧1−𝛼/2 (𝑟1−1 + 𝑟2−1 ) (32)
• 95% CI for (𝜃1 /𝜃2 )

1/2
(𝜃1̂ /𝜃2 ) exp ( ± 𝑧1−𝛼/2 (𝑟1−1 + 𝑟2−1 ) ) (33)

• LRT statistics can also be used to obtain confidence interval of (𝜃1 /𝜃2 ),
consider the null hypothesis for 𝑎 > 0
𝐻0 ∶ 𝜃1 = 𝑎𝜃2
• For type I sample {(𝑡𝑖𝑗 , 𝛿𝑖𝑗 ), 𝑖 = 1, 2; 𝑗 = 1, … , 𝑛𝑖 }, the log-likelihood function
ℓ(𝜃1 , 𝜃2 ) = − ∑ [(𝑇𝑖 /𝜃𝑖 ) + 𝑟𝑖 log(𝜃𝑖 )]

𝑖
• MLE 𝜃𝑖̂ = (𝑇𝑖 /𝑟𝑖 ) 𝑖 = 1, 2

• Under 𝐻0 ∶ 𝜃1 = 𝑎𝜃2 , the log-likelihood function
ℓ(𝑎𝜃2 , 𝜃2 ) = −𝑟1 log(𝑎𝜃2 ) − (𝑇1 /(𝑎𝜃2 )) − 𝑟2 log(𝜃2 ) − (𝑇2 /𝜃2 )

= −(𝑟1 + 𝑟2 ) log(𝜃2 ) − (𝑇1 /(𝑎𝜃2 )) − (𝑇2 /𝜃2 ) − 𝑟1 log(𝑎)
• MLE under 𝐻0
𝑇1 + 𝑎𝑇2 𝑇 + 𝑎𝑇2
𝜃2̃ = ⇒ 𝜃1̃ = 𝑎𝜃2̃ = 1
𝑎(𝑟1 + 𝑟2 ) (𝑟1 + 𝑟2 )

• The likelihood ratio test statistic
Λ(𝑎) = 2ℓ(𝜃1̂ , 𝜃2̂ ) − 2ℓ(𝜃1̃ , 𝜃2̃ ),
where
ℓ(𝜃1̂ , 𝜃2̂ ) = − ∑ [(𝑇𝑖 /𝜃𝑖̂ ) + 𝑟𝑖 log(𝜃𝑖̂ )]
𝑖
= −(𝑟1 + 𝑟2 ) − 𝑟1 log(𝜃1̂ ) − 𝑟2 log(𝜃2̂ )

Λ(𝑎) = 2ℓ(𝜃1̂ , 𝜃2̂ ) − 2ℓ(𝜃1̃ , 𝜃2̃ ),
where
ℓ(𝜃1̂ , 𝜃2̂ ) = −(𝑟1 + 𝑟2 ) − 𝑟1 log(𝜃1̂ ) − 𝑟2 log(𝜃2̂ )
𝑇 𝑇
ℓ(𝜃1̃ , 𝜃2̃ ) = − 1 − 2 − 𝑟1 log(𝜃1̃ ) − 𝑟2 log(𝜃2̃ )
𝜃̃
1 𝜃̃ 2
= −(𝑟1 + 𝑟2 ) − 𝑟1 log(𝜃1̃ ) − 𝑟2 log(𝜃2̃ )

ℓ(𝜃1̂ , 𝜃2̂ ) = −(𝑟1 + 𝑟2 ) − 𝑟1 log(𝜃1̂ ) − 𝑟2 log(𝜃2̂ )

ℓ(𝜃 ̃ , 𝜃 ̃ ) = −(𝑟 + 𝑟 ) − 𝑟 log(𝜃 ̃ ) − 𝑟 log(𝜃 ̃ )
1 2 1 2 1 1 2 2
Λ(𝑎) = 2ℓ(𝜃1̂ , 𝜃2̂ ) − 2ℓ(𝜃1̃ , 𝜃2̃ )

= 2𝑟1 log(𝜃1̃ /𝜃1̂ ) + 2𝑟2 log(𝜃2̃ /𝜃2̂ )
= 2𝑟 log(𝑎𝜃 ̃ /𝜃 ̂ ) + 2𝑟 log(𝜃 ̃ /𝜃 ̂ )
1 2 1 2 2 2

• (1 − 𝛼)100% confidence interval for (𝜃1 /𝜃2 ) can be construed from the values
𝑎 that satisfy
Λ(𝑎) ≤ 𝜒2(1),1−𝛼 (34)
• For type I censored sample, LRT statistics based confidence interval for (𝜃1 /𝜃2 )
(34) is more accurate than that of normal approximation (32) for small samples

Example 4.1.5
• A small clinical trial was conducted to compare the duration of remission

achieved by two drugs used in the treatment of leukemia.
• Duration of remission is assumed to follow an exponential distribution and two
groups of 20 patients produced the followings under a Type I censoring
mechanism
𝑟1 = 10, 𝑇1 = 700, 𝑟2 = 10, 𝑇2 = 540
• Obtain 95% approximate and exact CI for (𝜃1 /𝜃2 )

Example 4.1.5
• Unrestricted MLEs
𝑇1 𝑇
𝜃1̂ = = 70 and 𝜃2̂ = 2 = 54
𝑟1 𝑟2
• 95% approximate CI for log(𝜃1 /𝜃2 )

1/2
log(𝜃1̂ /𝜃2̂ ) ± 𝑧.975 (𝑟1−1 + 𝑟2−1 )
log(70/54) ± (1.96)(10−1 + 10−1 )1/2
0.26 ± 0.877

Example 4.1.5
• 95% approximate CI (normal distribution based) for (𝜃1 /𝜃2 )
−0.617 < log(𝜃1 /𝜃2 ) < 1.136 ⇒ 0.54 < (𝜃1 /𝜃2 ) < 3.114
Is there any significant difference between 𝜃1 and 𝜃2 ?

Example 4.1.5
• Likelihood ratio statistic based confidence interval for (𝜃1 /𝜃2 ) requires estimate
of the parameters under the null hypothesis
𝐻0 ∶ 𝜃1 = 𝑎𝜃2
• Estimates under 𝐻0
𝑇1 + 𝑎𝑇2 𝑇 + 𝑎𝑇2
𝜃2̃ = ⇒ 𝜃1̃ = 𝑎𝜃2̃ = 1
𝑎(𝑟1 + 𝑟2 ) (𝑟1 + 𝑟2 )
Λ(𝑎) = 2𝑟1 log(𝑎𝜃2̃ /𝜃1̂ ) + 2𝑟2 log(𝜃2̃ /𝜃2̂ )

small 𝑎’s large 𝑎’s
𝑎 Λ(𝑎) 𝑎 Λ(𝑎)
0.525 3.953 3.185 3.911
0.560 3.424 3.150 3.819
0.595 2.958 3.115 3.726
0.630 2.549 3.080 3.633
0.665 2.187 3.045 3.541
0.700 1.869 3.010 3.448
0.735 1.589 2.975 3.356
0.770 1.341 2.940 3.263

5
3
Λ(a)
0
0 1 2 3
a

Example 4.1.5
• 95% of (𝜃1 /𝜃2 ) is the range of values of 𝑎 = (𝜃1 /𝜃2 ), such that
Λ(𝑎) ≤ 𝜒2(1),.95 = 3.84
which is
0.56 < 𝑎 < 3.15 ⇒ 0.56 < (𝜃1 /𝜃2 ) < 3.15

Example 4.1.5
Table 3: 95% confidence interval of (𝜃1 /𝜃2 ) for Type I censored sample
95% CI
method lower upper
Normal approximation 0.54 3.114
LRT 0.56 3.150

Type II censored sample (CI for 𝜃1/𝜃2)
• Lifetime distributions 𝑇𝑖 ∼ Exp(𝜃𝑖 ) 𝑖 = 1, 2

• Type II censored sample for the group 𝑖, which has 𝑟𝑖 number of failures and
(𝑛𝑖 − 𝑟𝑖 ) subjects are censored at 𝑡(𝑖𝑟𝑖 )
𝑡(𝑖1) < ⋯ < 𝑡(𝑖𝑟𝑖 )

2
𝐿(𝜃1 , 𝜃2 ) = ∏(1/𝜃𝑖 )𝑟𝑖 𝑒− ∑𝑗 𝑡(𝑖𝑗) /𝜃𝑖 𝑒−(𝑛𝑖 −𝑟𝑖 )𝑡(𝑖𝑟𝑖 ) /𝜃𝑖
𝑖=1
2
= ∏(1/𝜃𝑖 )𝑟𝑖 𝑒−(1/𝜃𝑖 )[ ∑𝑗 𝑡(𝑖𝑗) −(𝑛𝑖 −𝑟𝑖 )𝑡(𝑖𝑟𝑖 ) ]
𝑖=1
2
= ∏(1/𝜃𝑖 )𝑟𝑖 𝑒−𝑇𝑖 /𝜃𝑖
𝑖=1

• MLE
𝑇𝑖
𝜃𝑖̂ =
𝑟𝑖
• We have already shown that
2𝑇𝑖 2𝑟 𝜃 ̂
= 𝑖 𝑖 ∼ 𝜒2(2𝑟𝑖 )
𝜃𝑖 𝜃𝑖
• Then we can show
(2𝑟1 𝜃1̂ /𝜃1 )/(2𝑟1 ) 𝜃̂ 𝜃

= 1 2 ∼ 𝐹(2𝑟1 ,2𝑟2 )
(2𝑟2 𝜃2̂ /𝜃2 )/(2𝑟2 ) 𝜃2̂ 𝜃1

• We can write
𝜃1̂ 𝜃2
𝑃 (𝐹(2𝑟1 ,2𝑟2 ),𝛼/2 ≤ ≤ 𝐹(2𝑟1 ,2𝑟2 ),1−𝛼/2 ) = 1 − 𝛼
𝜃̂ 𝜃
2 1
• (1 − 𝛼)100% confidence interval for (𝜃1 /𝜃2 )
(𝜃1̂ /𝜃2̂ ) (𝜃1̂ /𝜃2̂ )

≤ (𝜃1 /𝜃2 ) ≤ (35)
𝐹(2𝑟1 ,2𝑟2 ),1−𝛼/2 𝐹(2𝑟1 ,2𝑟2 ),𝛼/2

Gamma distribution

Gamma distribution
• The pdf of two-parameter gamma distribution

𝑘−1
1 𝑡
𝑓(𝑡; 𝛼, 𝑘) = ( ) exp ( − 𝑡/𝛼), 𝑡>0 (36)
𝛼Γ𝑘 𝛼
• 𝛼 > 0 and 𝑘 > 0

Gamma distribution
• Survivor function
∞
𝑆(𝑡; 𝛼, 𝑘) = ∫ 𝑓(𝑢; 𝛼, 𝑘) 𝑑𝑢
𝑡
∞ 𝑘−1
1 𝑢
=∫ ( ) exp ( − 𝑢/𝛼) 𝑑𝑢
𝑡 𝛼Γ𝑘 𝛼
= 1 − 𝐼(𝑘, 𝑡/𝛼) (37)
• Incomplete gamma function

𝑥
1
𝐼(𝑘, 𝑥) = ∫ 𝑢𝑘−1 𝑒−𝑢 𝑑𝑢
Γ(𝑘) 0

Uncensored data
• Let 𝑡1 , … , 𝑡𝑛 be a random sample from Gamma(𝛼, 𝑘), the log-likelihood

function
𝑛
ℓ(𝛼, 𝑘) = ∑ log 𝑓(𝑡𝑖 ; 𝛼, 𝑘)
𝑖=1
𝑛 𝑘−1
1 𝑡
= ∑ log [ ( 𝑖) exp ( − 𝑡𝑖 /𝛼)]
𝑖=1
𝛼Γ𝑘 𝛼
= −𝑛 log Γ𝑘 − 𝑛𝑘 log 𝛼 + 𝑛(𝑘 − 1) log 𝑡 ̃ − 𝑛𝑡/𝛼

̄ (38)
1/𝑛
𝑛
• 𝑡 ̃ = ( ∏𝑖=1 𝑡𝑖 ) and 𝑡 ̄ = (1/𝑛) ∑𝑖 𝑡𝑖

Uncensored data
ℓ(𝛼, 𝑘) = −𝑛 log Γ𝑘 − 𝑛𝑘 log 𝛼 + 𝑛(𝑘 − 1) log 𝑡 ̃ − 𝑛𝑡/𝛼

̄
• Score functions
𝜕ℓ(𝛼, 𝑘) −𝑛𝑘 𝑛𝑡 ̄
𝑈1 (𝛼, 𝑘) = = + 2
𝜕𝛼 𝛼 𝛼
𝜕ℓ(𝛼, 𝑘)
𝑈2 (𝛼, 𝑘) = = −𝑛𝜓(𝑘) − 𝑛 log 𝛼 + 𝑛 log 𝑡 ̃
𝜕𝑘

Uncensored data
• MLE (𝛼,̂ 𝑘)̂ is a solution of the system of linear equations
−𝑛𝑘̂ 𝑛𝑡 ̄
+ 2 =0 (39)
𝛼̂ 𝛼̂
−𝑛𝜓(𝑘)̂ − 𝑛 log 𝛼̂ + 𝑛 log 𝑡 ̃ = 0 (40)
• Two equations and two unknowns

• Equations are non-linear in terms of the variables
• No closed form solutions

Substitution method
−𝑛𝑘̂ 𝑛𝑡 ̄ 𝑡̄
+ 2 = 0 ⇒ 𝛼̂ = (41)
𝛼̂ 𝛼̂ 𝑘̂
− 𝑛𝜓(𝑘)̂ − 𝑛 log 𝛼̂ + 𝑛 log 𝑡 ̃ = 0 ⇒ 𝜓(𝑘)̂ − log 𝑘̂ = log(𝑡/̃ 𝑡)̄ (42)
• Solve equation (42) using a suitable optimization technique (e.g. graphical

method, Newton-Raphson method, etc.) to obtain 𝑘̂
• Then 𝛼̂ can be obtained from (41)

Newton-Raphson method
• Score functions
𝜕ℓ(𝛼, 𝑘) −𝑛𝑘 𝑛𝑡 ̄
𝑈1 (𝛼, 𝑘) = = + 2
𝜕𝛼 𝛼 𝛼
𝜕ℓ(𝛼, 𝑘)
𝑈2 (𝛼, 𝑘) = = −𝑛𝜓(𝑘) − 𝑛 log 𝛼 + 𝑛 log 𝑡 ̃
𝜕𝑘

• Elements of Hessian matrix

𝜕 2 ℓ(𝛼, 𝑘) 𝑛𝑘 2𝑛𝑡 ̄
𝐻11 (𝛼, 𝑘) = = − 3
𝜕𝛼2 𝛼2 𝛼
2
𝜕 ℓ(𝛼, 𝑘) −𝑛
𝐻12 (𝛼, 𝑘) = = = 𝐼21 (𝛼, 𝑘)
𝜕𝛼 𝜕𝑘 𝛼
𝜕 2 ℓ(𝛼, 𝑘)
𝐻22 (𝛼, 𝑘) = = −𝑛𝜓′ (𝑘) (43)
𝜕𝑘2

• Score vector
𝑈1 (𝛼, 𝑘)
𝑈 (𝛼, 𝑘) = [ ]
𝑈2 (𝛼, 𝑘)
• Hessian matrix
𝐻11 (𝛼, 𝑘) 𝐻12 (𝛼, 𝑘)
𝐻(𝛼, 𝑘) = [ ]
𝐻21 (𝛼, 𝑘) 𝐻22 (𝛼, 𝑘)
• Score vector and information matrix are function of parameters and data

• Initial values 𝜽(0) = (𝛼(0) , 𝑘(0) )′ are chosen so that elements of score vector and
hessian matrix are finite
• Updated estimate 𝜽(1) = (𝛼(1) , 𝑘(1) )′ is obtained as
−1
𝜽(1) = 𝜽(0) − [𝐻(𝜽(0) )] 𝑈 (𝜽(0) ) (44)
• The estimate 𝜽(2) can be obtained by using 𝜽(1) as input in equation (44)
• Repeating the procedure of evaluating the equation (44) using the current
estimate, the following sequence of estimates can be obtained
{𝜽(𝑗) , 𝑗 = 3, 4, 5, … }

• A convergence criterion needs to be defined to obtain the MLE from the

sequence of estimates
{𝜽(𝑗) , 𝑗 = 1, 2, 3, … },

• Convergence criteria are defined on the basis of two successive values of the
parameters estimates
′
• 𝜽(𝑗) = (𝛼,̂ 𝑘)̂ is considered as MLE if one of the following criteria is satisfied
• ∣𝜽(𝑗) − 𝜽(𝑗−1) ∣ is very small
• ∣ℓ(𝜽(𝑗) ) − ℓ(𝜽(𝑗−1) )∣ is very small
• 𝑈 (𝜽𝑗 ) ≈ 𝟎

′
• Estimated variance-variance matrix of the MLE (𝛼,̂ 𝑘)̂ can be obtained from
the inverse of the negative of the hessian matrix evaluated at the MLE
−1
̂ (𝛼)
Var
̂ ̂
̂𝑘 = −[𝐻(𝛼,̂ 𝑘)]

Newton-Raphson method: Pseudo code
1: theta0 <- `initial value of the parameter`

2: eps <- 1
3: while (eps > 1e-5) {
4: u0 <- U(theta0)
5: h0 <- H(theta0)
6: theat1 <- theta0 - inv(h0) * u0
7: eps <- max(abs(theta1 - theta0))
8: if (eps < 1e-5) break
9: else theta0 <- theta1
10: }
11: return (list(theta0, h0))

• Statistical software have routines (such as optim() of R) that can optimize

likelihood function to obtain MLE
• Such routines require providing a “function” of likelihood function as an
argument
• Different optimization algorithms, such as Newton-Rapson, Mead-Nelder,
simulated annealing, etc. are implemented in optimization routines
• Users may provide the expressions of score and information matrix as arguments
of the routine
• If expressions of information matrix and score functions are not provided, the
routines calculate those numerically

Statistical inference
• Asymptotic distributions of MLEs, i.e.
̂ and 𝑘̂ ∼ 𝒩(𝑘, var(𝑘))

𝛼̂ ∼ 𝒩(𝛼, var(𝛼)) ̂
• Since 𝛼 > 0 and 𝑘 > 0, confidence intervals based on the sampling distribution
of log 𝛼̂ and log 𝑘̂ ensure non-negative lower limit of the confidence interval
̂ and log 𝑘̂ ∼ 𝒩( log 𝑘, var(log 𝑘))

log 𝛼̂ ∼ 𝒩( log 𝛼, var(log 𝛼)) ̂
• var(log 𝛼)̂ = (1/𝛼)̂ 2 var(𝛼)̂

• (1 − 𝛼)100% confidence interval for log 𝛼
log 𝛼̂ ± 𝑧1−𝛼/2 SE(log 𝛼)̂

• (1 − 𝛼)100% confidence interval for 𝛼
𝛼̂ exp ( ± 𝑧1−𝛼/2 SE(log 𝛼))

̂

Example 4.2.1
Following are survival time of 20 male rates that were exposed to a high level of
radiation
> rtime
## 152 152 115 109 137 88 94 77 160 165
## 125 40 128 123 136 101 62 153 83 69
• Assume the lifetimes follow a gamma distribution with parameters 𝛼 and 𝑘

Example 4.2.1
• From the data: 𝑡 ̄ = 113.45 and 𝑡 ̃ = 107.07

• Expressions of the score function
𝛼̂ = (𝑡/̄ 𝑘)̂ = 113.45/𝑘̂ (45)

𝜓(𝑘)̂ − log 𝑘̂ − log(𝑡/̃ 𝑡)̄ = 0
𝜓(𝑘)̂ − log 𝑘̂ + 0.058 = 0 (46)
• R function uniroot() can be used to obtain the value of 𝑘̂ by solving the

equation (46)
𝑘̂ = 8.799 ⇒ 𝛼̂ = 12.893

Log-likelihood function of gamma distribution 102-108
• R function to calculate log-likelihood function of a gamma distribution for a

given sample
gamma_loglk <- function(par, time) {
sum(
dgamma(time, scale = par[1], shape = par[2], log = T)
)
}
• par is a vector with the parameter scale as the first element and shape as the
second element
• time is the observed failure times
• gamma_loglk function can be evaluated for any given valid values of par and
time

Log-likelihood function of gamma distribution
• For given values of parameters, say (𝛼0 = 1, 𝑘0 = 2.), the value of

log-likelihood function can be obtained for the rat data
> gamma_loglk(par = c(1, 2), time = rtime)
## [1] -2175.531
• For another set of values (𝛼0 = 100, 𝑘0 = 80), the corresponding value of
> gamma_loglk(par = c(100, 80), time = rtime)
## [1] -5392.711

scale 11 12 13 14 15
−100.25
−100.50
−100.75
logL
−101.00
−101.25
−101.50
7 8 9 10 11
k
Figure 2: Plot of log-likelihood function against shape parameter 𝑘 for different values of scale
parameter 𝛼

shape 7 8 9 10 11
−100.5
−101.0
logL
−101.5
−102.0
10.0 12.5 15.0
α
Figure 3: Plot of log-likelihood function against scale parameter 𝛼 for different values of shape
parameter 𝑘

11
10
9
k
7
11 12 13 14 15
α
Figure 4: Contour plot of log-likelihood function of gamma distribution for rat survival data

9.0
8.9
8.8
k
8.7
8.6
12.7 12.8 12.9 13.0
α
Figure 5: Contour plot of log-likelihood function of gamma distribution for rat survival data

optim() function
• R function optim() is a general purpose optimization routine
optim(par, fn, gr = NULL, method = "Nelder-Mead",
lower = -Inf, upper = Inf,
control = list(), hessian = FALSE, ...
)
• par → initial values for the parameters to be optimized
• fn → a function to be minimized
• gr → a function to return the gradient (score function)
• method → a method to be used, e.g. “Nelder-Mead”, “BFGS”, “CG”,
“L-BFGS-B”, etc.
• lower and upper → lower and upper limit of the parameters to be optimized
• control → list of arguments (e.g. fnscale, etc.) for controlling the iterations
Initial values
• Initial values of the parameters can be selected from the exploratory plots of
• The log-likelihood function must provide finite values with the initial values of
the parameters

Initial values 110-116
• As an example, assume the initial values 𝛼0 = 12.9 and 𝑘0 = 8.8, and the
corresponding value of log-likelihood function
> gamma_loglk(par = c(12.9, 8.8), time = rtime)
## [1] -100.48
• For another set of initial values, 𝛼0 = 1.0 and 𝑘0 = 1.0

> gamma_loglk(par = c(1.0, 1.0), time = rtime)
## [1] -2269

Initial values
• Since both sets of initial values van provide finite values of log-likelihood
function, we can use either of these two as an initial value for optimizing the
log-likelihood function using the R function optim()

Example 4.2.1
• Fit of the gamma distribution to the rat data

> gamma_fit <- optim(
par = c(12.9, 8.8), fn = gamma_loglk,
control = list(fnscale = -1), hessian = T,
time = rtime
)

Example 4.2.1
• List of objects in gamma_fit

> names(gamma_fit)
## [1] "par" "value" "counts" "convergence"
## [5] "message" "hessian"
• Converged?
> gamma_fit$convergence
## [1] 0

Example 4.2.1
• Estimate of scale and shape parameters

> gamma_fit$par
## [1] 12.893951 8.799089
• 𝛼̂ = 12.894 and 𝑘̂ = 8.799

Example 4.2.1
• Estimated variance-variance matrix

> cvar <- solve(-gamma_fit$hessian)
> cvar
## [,1] [,2]
## [1,] 16.99179 -10.949815
## [2,] -10.94982 7.471711
√
• SE(𝛼)̂ = 16.992 = 4.122
√
• SE(𝑘)̂ = 7.472 = 2.733
‘

95% CI for 𝛼 and 𝑘
• Using the sampling distribution of 𝛼̂ and 𝑘̂
𝛼̂ ± 𝑧1−𝛼/2 SE(𝛼)̂ ⇒ 4.84 ≤ 𝛼 ≤ 20.95

𝑘̂ ± 𝑧1−𝛼/2 SE(𝑘)̂ ⇒ 3.46 ≤ 𝛼 ≤ 14.13
• Using the sampling distribution of log 𝛼̂ and log 𝑘̂
𝛼̂ exp ( ± 𝑧1−𝛼/2 SE(log 𝛼))

̂ ⇒ 6.91 ≤ 𝛼 ≤ 24.08
𝑘̂ exp ( ± 𝑧1−𝛼/2 SE(𝑘))
̂ ⇒ 4.79 ≤ 𝑘 ≤ 16.14

Quantiles
• 𝑝𝑡ℎ quantile
∞
1
𝑆(𝑡𝑝 ; 𝛼, 𝑘) = (1 − 𝑝) ⇒ ∫ 𝑢𝑘−1 𝑒−𝑢 𝑑𝑢 = (1 − 𝑝)
Γ(𝑘) 𝑡𝑝 /𝛼
⇒ (𝑡𝑝 /𝛼) = 𝑄(𝑝, 𝑘)

⇒ 𝑡𝑝 = 𝛼𝑄(𝑝, 𝑘) (47)
• 𝑄(𝑝, 𝑘) is the 𝑝𝑡ℎ quantile function of one-parameter gamma distribution [R

function qgamma(p, scale=1, shape)]
• Estimate of the median for the rat data
̂ = 𝛼̂ 𝑄(.5, 𝑘)̂ = (12.9)(8.46) = 109.16

𝑡.5

Likelihood ratio statistic
• Joint relative log-likelihood function
𝑟(𝛼, 𝑘) = ℓ(𝛼, 𝑘) − ℓ(𝛼,̂ 𝑘)̂ (48)
• Contour plot of 𝑟(𝛼, 𝑘) provides an informative picture of the information about

the parameters
• Approximately ellipsoidal contours indicates
• sampling distribution of 𝛼̂ and 𝑘̂ is approximately normal
• large sample based confidence intervals will be in agreement with results based
on likelihood ratio procedures

10
9
k
11 12 13 14 15
α
Figure 6: Contour plot of 𝑟(𝛼, 𝑘) for rat survival data

Likelihood ratio statistic
Λ(𝛼, 𝑘) = 2ℓ(𝛼,̂ 𝑘)̂ − 2ℓ(𝛼, 𝑘) = −2𝑟(𝛼, 𝑘) (49)
• Approximate (1 − 𝑝)100% confidence region for (𝛼, 𝑘)′ can be obtained from
the set of points (𝛼, 𝑘) satisfying
Λ(𝛼, 𝑘) ≤ 𝜒2(2),1−𝑝

12
k
8
10 15 20
α
Figure 7: 95% confidence region of (𝛼, 𝑘) for rat survival data

LRT statistic based CI for 𝑘
• Consider the following null hypothesis
𝐻0 ∶ 𝑘 = 𝑘 0
• MLEs (unrestricted and under restriction)
(𝛼,̂ 𝑘)̂ = arg max ℓ(𝛼, 𝑘) under 𝐻1 (50)

(𝛼,𝑘)∈Θ
𝛼(𝑘
̃ 0 ) = arg max ℓ(𝛼, 𝑘0 ) under 𝐻0 (51)
𝛼∈Θ𝛼

LRT statistic based CI for 𝑘
Λ𝑘 (𝑘0 ) = 2ℓ(𝛼,̂ 𝑘)̂ − 2ℓ(𝛼(𝑘

̃ 0 ), 𝑘0 ) (52)
• Under 𝐻0 ∶ 𝑘 = 𝑘0 , Λ𝑘 (𝑘0 ) approximately follows a 𝜒2(1) distribution
• An approximate two-sided (1 − 𝑝)100% confidence interval for 𝑘 can be

obtained from a set of values of 𝑘0 satisfying
Λ𝑘 (𝑘0 ) ≤ 𝜒2(1),1−𝑝 (53)

6
Λk
0
0 5 10 15 20
k0
95% CI ∶ 4.54 ≤ 𝑘 ≤ 15.28

LRT statistic based CI for 𝛼
𝐻0 ∶ 𝛼 = 𝛼 0
• MLEs (unrestricted and under restriction)
(𝛼,̂ 𝑘)̂ = arg max ℓ(𝛼, 𝑘) under 𝐻1 (54)

(𝛼,𝑘)∈Θ
̃ 0 ) = arg max ℓ(𝛼0 , 𝑘)

𝑘(𝛼 under 𝐻0 (55)
𝑘∈Θ𝑘

LRT statistic based CI for 𝛼
Λ𝛼 (𝛼0 ) = 2ℓ(𝛼,̂ 𝑘)̂ − 2ℓ(𝛼0 , 𝑘(𝛼

̃ 0 )) (56)
• Under 𝐻0 ∶ 𝛼 = 𝛼0 , Λ𝛼 (𝛼0 ) approximately follows a 𝜒2(1) distribution
• An approximate two-sided (1 − 𝑝)100% confidence interval for 𝛼 can be

obtained from a set of values of 𝛼0 satisfying
Λ𝛼 (𝛼0 ) ≤ 𝜒2(1),1−𝑝 (57)

6
Λα
0
0 10 20 30
α0
95% CI ∶ 7.37 ≤ 𝛼 ≤ 25.9

Summary of 95% CI
• For 𝛼
no-transformation ∶ 4.84 ≤ 𝛼 ≤ 20.95
log-transformation ∶ 6.91 ≤ 𝛼 ≤ 24.08
LRT ∶ 7.37 ≤ 𝛼 ≤ 25.9
• For 𝑘
no-transformation ∶ 3.46 ≤ 𝑘 ≤ 14.13
log-transformation ∶ 4.79 ≤ 𝑘 ≤ 16.14
LRT ∶ 4.54 ≤ 𝑘 ≤ 15.28

Confidence intervals of survivor function
𝐻0 ∶ 𝑆(𝑡0 ; 𝛼, 𝑘) = 1 − 𝐼(𝑘, 𝑡0 /𝛼) = 𝑠0

⇒ 𝐻0 ∶ 𝐼(𝑘, 𝑡0 /𝛼) = 1 − 𝑠0
̂ where 𝛼̂ and 𝑘̂ are MLEs

• For a given value of 𝑡0 , 𝑠0 = 𝑆(𝑡0 ; 𝛼,̂ 𝑘),
• LRT statistic
Λ(𝑠0 ) = 2ℓ(𝛼,̂ 𝑘)̂ − 2ℓ(𝛼,̃ 𝑘)̃
• 𝛼̂ and 𝑘̂ are unrestricted MLEs
• 𝛼̃ and 𝑘̃ are MLEs under 𝐻0 ∶ 𝐼(𝑘, 𝑡0 /𝛼) = 1 − 𝑠0

Confidence intervals of survivor function
• To obtain 𝛼̃ and 𝑘,̃ consider the function
𝑀 (𝑘) = ℓ(𝑘, 𝛼(𝑘))
• For any value of 𝑘, the value of 𝛼 can be obtained as
𝐼(𝑘, 𝑡0 /𝛼) = (1 − 𝑠0 ) ⇒ 𝛼 = 𝑡0 /𝑄(1 − 𝑠0 , 𝑘)
where 𝑄(⋅, 𝑘) is quantile function of a gamma distribution with scale as 1 and

shape as 𝑘
• MLE under 𝐻0
𝑘̃ = arg max 𝑀 (𝑘) ⇒ 𝛼̃ = 𝑡0 /𝑄(1 − 𝑠0 , 𝑘)̃

𝑘

LRT statistic based CI for 𝑡.5

Censored sample
Data {(𝑡𝑖 , 𝛿𝑖 ), 𝑖 = 1, … , 𝑛} and the corresponding log-likelihood function

𝑛 𝛿 1−𝛿𝑖
1 𝑡 𝑘−1 𝑖
ℓ(𝛼, 𝑘) = log ∏ [ ( 𝑖 ) 𝑒−𝑡𝑖 /𝛼 ] [1 − 𝐼(𝑘, 𝑡𝑖 /𝛼)]

𝑖=1
𝛼Γ(𝑘) 𝛼
= −𝑘 log 𝛼 − 𝑟 log Γ(𝑘) + (𝑘 − 1) ∑ 𝛿𝑖 log 𝑡𝑖

𝑖
− ∑ 𝛿𝑖 𝑡𝑖 /𝛼 + ∑(1 − 𝛿𝑖 ) log [1 − 𝐼(𝑘, 𝑡𝑖 /𝛼)] (58)

𝑖 𝑖

Censored sample
• No closed form solutions are available for MLEs 𝛼̂ and 𝑘̂

• Algebraic expressions of score function and information matrix for the
log-likelihood function (58) are very complicated
• Score functions and information matrix can be evaluated numerically
• Optimization routines available in statistical software can be used to obtain
MLEs 𝛼̂ and 𝑘,̂ and the corresponding SEs
• Different optimization algorithms such as Newton-Raphson, Nelder-Mead, etc.
are available in such routines

Censored sample
• As an example, we are going to use the same rat data that we have used for the
analysis of complete data
• For the rat data, assume a time ≥ 150 weeks is considered as censored, there
are 5 censored observations
• An R object rtimec is created, which has two columns time and status
time status time status
152 0 125 1
152 0 40 1
115 1 128 1
109 1 123 1
137 1 136 1
88 1 101 1
94 1 62 1
77 1 153 0
160 0 83 1
165 0 69 1

Updated gamma_loglk function
• R function for evaluating gamma log-likelihood function for censored sample

gamma_loglk <- function(par, time, status = NULL) {
#
if (is.null(status)) status <- rep(1, length(time))
#
llk_f <- sum(status * dgamma(time, scale = par[1],
shape = par[2], log = T))
#
llk_c <- sum((1 - status) * pgamma(time, scale = par[1],
shape = par[2], lower.tail = F, log.p = T))
return(llk_f + llk_c)
}

Censored sample
• R codes to obtain MLE of parameters of a gamma distribution from a censored

sample
gamma_out_c <- optim(
par = c(1, 1), fn = gamma_loglk,
control = list(fnscale = -1), hessian = T,
time = rtimec$time, status = rtimec$status
)

Censored sample
• Maximum likelihood estimators [gamma_out_c$par]
𝛼̂ = 21.3 and 𝑘̂ = 5.79
• Standard errors [solve(-gamma_out_c$hessian)]
𝑠𝑒(𝛼)̂ = 8.61 and 𝑠𝑒(𝑘)̂ = 2.14

6
Λα
0
0 20 40 60
α0
95% CI ∶ 10.7 ≤ 𝛼 ≤ 52.48

6
Λk
0
0 5 10 15
k0
95% CI ∶ 2.78 ≤ 𝑘 ≤ 10.9

Censored sample
• 95% CI for 𝛼
no-transformation ∶ 4.44 ≤ 𝛼 ≤ 38.17
log-transformation ∶ 9.65 ≤ 𝛼 ≤ 47.02
LRT ∶ 10.7 ≤ 𝛼 ≤ 52.48
• 95% CI for 𝑘
no-transformation ∶ 3.46 ≤ 𝑘 ≤ 14.13
log-transformation ∶ 4.79 ≤ 𝑘 ≤ 16.14
LRT ∶ 2.78 ≤ 𝑘 ≤ 10.9

Censored sample
• Estimate of median
̂ = 𝛼̂ 𝑄(.5, 𝑘)̂ = 116.4 weeks

𝑡.5

References I
Sprott, D. (1980). Maximum likelihood in small samples: Estimation in the

presence of nuisance parameters. Biometrika, 67(3):515–523.

Lifetime Data Analysis
Inference Procedures for Log-Location-Scale Distributions
Mahbub Latif
Institute of Statistical Research and Training (ISRT)
University Dhaka, Dhaka 1000, Bangladesh
(mlatif@isrt.ac.bd)
November 2023
Mahbub Latif Lifetime Data Analysis November 2023 1 / 185

Plan
Inference procedures for log-location-scale distributions

Inference for location-scale distributions
Weibull and extreme-value distributions
Log-normal and normal distributions
Log-logistic and logistic distributions


• Location-scale distributions have survivor function of the form

𝑦−𝑢
𝑆(𝑦; 𝑢, 𝑏) = 𝑆0 ( ) −∞<𝑦 <∞ (1)
𝑏
−∞ < 𝑢 < ∞ 𝑎𝑛𝑑 𝑏 > 0

• Log-lifetime 𝑌 = log 𝑇 has a location-scale distribution with survivor function
of the form (1)

• Lifetime variable 𝑇 has a log-location-scale distribution with the survivor

function
log 𝑡 − 𝑢
𝑆𝑇 (𝑡; 𝛼, 𝛽) = 𝑆0 ( )
𝑏
= 𝑆0⋆ [(𝑡/𝛼)𝛽 ] (2)
• 𝑆0⋆ (𝑤) = 𝑆0 (log 𝑤) for 𝑤 > 0

• 𝑢 = log 𝛼
• 𝑏 = (1/𝛽)

• Lifetime and log-lifetime distributions

• exponential, Weibull, log-logistic, log-normal, etc.
• extreme-value, logistic, normal, etc.

Likelihood based methods
• Goal is to estimate the parameters (𝑢, 𝑏) or (𝛼, 𝛽)

• Some advantages of estimating (𝑢, 𝑏)
• Log-likelihood function for (𝑢, 𝑏) is more closer to quadratic than that for (𝛼, 𝛽)
• Large sample normal approximations for (𝑢,̂ 𝑏)̂ tend to be more accurate that
those for (𝛼,̂ 𝛽)̂
• A better choice of parameters for obtaining MLEs and implementing normal
approximations is (𝑢, log 𝑏), which is used by most statistical software

Likelihood function
• For a censored sample

{(𝑡𝑖 , 𝛿𝑖 ), 𝑖 = 1, … , 𝑛},
the likelihood function
𝛿𝑖 1−𝛿𝑖
𝑛
1 𝑦 −𝑢 𝑦 −𝑢
𝐿(𝑢, 𝑏) = ∏ [ 𝑓0 ( 𝑖 )] [𝑆0 ( 𝑖 )] (3)
𝑖=1
𝑏 𝑏 𝑏
• 𝑦𝑖 = log 𝑡𝑖
• 𝑓0 (𝑧) = −𝑑𝑆0 (𝑧)/𝑑𝑧 → pdf

Likelihood function
• The standardized variable

𝑦𝑖 − 𝑢
𝑧𝑖 =
𝑏
• The likelihood function
𝛿𝑖
𝑛 1−𝛿𝑖
1
𝐿(𝑢, 𝑏) = ∏ [ 𝑓0 (𝑧𝑖 )] [𝑆0 (𝑧𝑖 )] (4)
𝑖=1
𝑏

Log-likelihood function
• The corresponding log-likelihood function

𝑛
ℓ(𝑢, 𝑏) = −𝑟 log 𝑏 + ∑ [𝛿𝑖 log 𝑓0 (𝑧𝑖 ) + (1 − 𝛿𝑖 ) log 𝑆0 (𝑧𝑖 )]
𝑖=1
𝑛
= −𝑟 log 𝑏 + ∑ ℓ𝑖 (𝑧𝑖 , 𝛿𝑖 ) (5)
𝑖=1
𝑛
• 𝑟 = ∑𝑖=1 𝛿𝑖

Score functions
𝑛
ℓ(𝑢, 𝑏) = −𝑟 log 𝑏 + ∑ ℓ𝑖 (𝑧𝑖 , 𝛿𝑖 )
𝑖=1
𝑛
= −𝑟 log 𝑏 + ∑ [𝛿𝑖 log 𝑓0 (𝑧𝑖 ) + (1 − 𝛿𝑖 ) log 𝑆0 (𝑧𝑖 )]
𝑖=1
𝑛
𝜕ℓ(𝑢, 𝑏) 𝜕ℓ (𝑧 , 𝛿 ) 𝜕𝑧
=∑ 𝑖 𝑖 𝑖 × 𝑖
𝜕𝑢 𝑖=1
𝜕𝑧𝑖 𝜕𝑢
𝑛
𝜕ℓ𝑖 (𝑧𝑖 , 𝛿𝑖 ) −1
=∑ ×( )
𝑖=1
𝜕𝑧𝑖 𝑏
1 𝑛 𝜕 log 𝑓0 (𝑧𝑖 ) 𝜕 log 𝑆0 (𝑧𝑖 )

= − ∑ [𝛿𝑖 + (1 − 𝛿𝑖 ) ]
𝑏 𝑖=1 𝜕𝑧𝑖 𝜕𝑧𝑖
Score functions
𝑛
𝑖=1
𝑛
𝜕ℓ(𝑢, 𝑏) 𝑟 𝜕ℓ (𝑧 , 𝛿 ) 𝜕𝑧
=− +∑ 𝑖 𝑖 𝑖 × 𝑖
𝜕𝑏 𝑏 𝑖=1 𝜕𝑧𝑖 𝜕𝑏
𝑛
𝑟 𝜕ℓ (𝑧 , 𝛿 ) −𝑧
= − + ∑ 𝑖 𝑖 𝑖 × ( 𝑖)
𝑏 𝑖=1 𝜕𝑧𝑖 𝑏
𝑟 1 𝑛 𝜕 log 𝑓0 (𝑧𝑖 ) 𝜕 log 𝑆0 (𝑧𝑖 )

= − − ∑ 𝑧𝑖 [𝛿𝑖 + (1 − 𝛿𝑖 ) ]
𝑏 𝑏 𝑖=1 𝜕𝑧𝑖 𝜕𝑧𝑖

Hessian matrix
𝜕ℓ(𝑢, 𝑏) 1 𝑛 𝜕 log 𝑓0 (𝑧𝑖 ) 𝜕 log 𝑆0 (𝑧𝑖 )

= − ∑ [𝛿𝑖 + (1 − 𝛿𝑖 ) ]
𝜕𝑢 𝑏 𝑖=1 𝜕𝑧𝑖 𝜕𝑧𝑖
𝜕 2 ℓ(𝑢, 𝑏) 𝜕 𝜕ℓ(𝑢, 𝑏)
= [ ]
𝜕𝑢2 𝜕𝑢 𝜕𝑢
1 𝑛 𝜕 2 log 𝑓0 (𝑧𝑖 ) 𝜕 2 log 𝑆0 (𝑧𝑖 )
= ∑ [𝛿 + (1 − 𝛿 ) ]
𝑏2 𝑖=1 𝑖 𝜕𝑧𝑖2 𝑖
𝜕𝑧𝑖2

Hessian matrix
𝜕ℓ(𝑢, 𝑏) 𝑟 1 𝑛 𝜕 log 𝑓0 (𝑧𝑖 ) 𝜕 log 𝑆0 (𝑧𝑖 )

= − − ∑ 𝑧𝑖 [𝛿𝑖 + (1 − 𝛿𝑖 ) ]
𝜕𝑏 𝑏 𝑏 𝑖=1 𝜕𝑧𝑖 𝜕𝑧𝑖
𝜕 2 ℓ(𝑢, 𝑏) 𝑟 2 𝑛 𝜕 log 𝑓0 (𝑧𝑖 ) 𝜕 log 𝑆0 (𝑧𝑖 )

2
= 2
+ 2
∑ 𝑧𝑖 [𝛿𝑖 + (1 − 𝛿𝑖 ) ]
1 𝑛 2 𝜕 2 log 𝑓0 (𝑧𝑖 ) 𝜕 2 log 𝑆0 (𝑧𝑖 )

+ ∑ 𝑧 [𝛿 + (1 − 𝛿 ) ]
𝑏2 𝑖=1 𝑖 𝑖 𝜕𝑧𝑖2 𝑖
𝜕𝑧𝑖2

Hessian matrix

= − ∑ [𝛿𝑖 + (1 − 𝛿𝑖 ) ]
𝜕 2 ℓ(𝑢, 𝑏) 1 𝑛 𝜕 log 𝑓0 (𝑧𝑖 ) 𝜕 log 𝑆0 (𝑧𝑖 )

= 2 ∑ [𝛿𝑖 + (1 − 𝛿𝑖 ) ]
𝜕𝑢 𝜕𝑏 𝑏 𝑖=1 𝜕𝑧𝑖 𝜕𝑧𝑖
1 𝑛 𝜕 2 log 𝑓0 (𝑧𝑖 ) 𝜕 2 log 𝑆0 (𝑧𝑖 )

+ 2 ∑ 𝑧𝑖 [𝛿𝑖 + (1 − 𝛿 𝑖 ) ]
𝑏 𝑖=1 𝜕𝑧𝑖2 𝜕𝑧𝑖2

Score function and information matrix
𝜕ℓ(𝑢,𝑏)
𝜕𝑢
𝑈 (𝑢, 𝑏) = [ ]
𝜕ℓ(𝑢,𝑏)
𝜕𝑏
𝜕 2 ℓ(𝑢,𝑏) 𝜕 2 ℓ(𝑢,𝑏)
2
𝐼(𝑢, 𝑏) = −𝐻(𝑢, 𝑏) = − [ 𝜕 2𝜕𝑢
ℓ(𝑢,𝑏)
𝜕𝑢 𝜕𝑏
𝜕 2 ℓ(𝑢,𝑏)
]
𝜕𝑏 𝜕𝑢 𝜕𝑏2

• MLE
(𝑢,̂ 𝑏)̂ ′ = arg max ℓ(𝑢, 𝑏)

(𝑢,𝑏)′ ∈Θ
• Variance-covariance matrix
−1
var(𝑢,̂ 𝑏)̂ = [𝐼(𝑢,̂ 𝑏)]
̂ = 𝑉̂
• Sampling distribution
𝑢̂ −1
𝑢 ̂
( ̂ ) ∼ 𝒩2 ([ ] , [𝐼(𝑢,̂ 𝑏)] )
𝑏 𝑏

Wald type confidence intervals
• For a large 𝑛, (𝑢,̂ 𝑏)̂ ′ follows a bivariate normal distribution with mean (𝑢, 𝑏)′
and variance matrix 𝑉 ̂
• Standard error of 𝑢̂ and 𝑏̂ can be obtained from the diagonal elements of 𝑉 ̂
̂ 1/2 and 𝑠𝑒(𝑏)̂ = 𝑉 ̂ 1/2

𝑠𝑒(𝑢)̂ = 𝑉11 22

• Following pivotal quantities follow standard normal distributions
𝑢̂ − 𝑢 𝑏̂ − 𝑏 log 𝑏̂ − log 𝑏
𝑍1 = , 𝑍2 = , 𝑍2′ =
𝑠𝑒(𝑢)̂ 𝑠𝑒(𝑏)̂ 𝑠𝑒(log 𝑏)̂
• 𝑠𝑒(log 𝑏)̂ = 𝑠𝑒(𝑏)/

̂ 𝑏̂
• (1 − 𝑝)100% confidence intervals
𝑢̂ ± 𝑧1−𝑝/2 𝑠𝑒(𝑢)̂
𝑏̂ ± 𝑧1−𝑝/2 𝑠𝑒(𝑏)̂
𝑏̂ exp{±𝑧1−𝑝/2 𝑠𝑒(log 𝑏)}
̂

Quantiles
• 𝑝𝑡ℎ quantile of log lifetime 𝑌

𝑦𝑝 − 𝑢
𝑃 (𝑌 ≤ 𝑦𝑝 ) = 𝑝 ⇒ 𝑆0 ( )=1−𝑝
𝑏
𝑦𝑝 − 𝑢
= 𝑆0−1 (1 − 𝑝)
𝑏
𝑦𝑝 = 𝑢 + 𝑏𝑤𝑝
• 𝑤𝑝 = 𝑆0−1 (1 − 𝑝) = 𝐹0−1 (𝑝) → 𝑝𝑡ℎ quantile of 𝑆0 (𝑧), the standardize

distribution

Quantiles
• Estimate of 𝑝𝑡ℎ quantile and the corresponding standard error
̂ 𝑝
𝑦𝑝̂ = 𝑢̂ + 𝑏𝑤
̂ + 𝑤𝑝2 𝑉22
𝑠𝑒(𝑦𝑝̂ ) = √𝑉11 ̂ + 2𝑤𝑝 𝑉12
̂
• Pivotal quantity
𝑦𝑝̂ − 𝑦𝑝
𝑍𝑝 = ∼ 𝒩(0, 1)
𝑠𝑒(𝑦𝑝̂ )
• (1 − 𝑞)100% confidence interval for 𝑦𝑝
𝑦𝑝̂ ± 𝑧1−𝑞/2 𝑠𝑒(𝑦𝑝̂ )

Delta method
Theorem
Let 𝑇1𝑛 , … , 𝑇𝑘𝑛 be statistics such that as 𝑛 → ∞
√ 𝒟
𝑛(𝑇1𝑛 − 𝜃1 , … , 𝑇𝑘𝑛 − 𝜃𝑘 ) −
→ 𝒩𝑘 (𝟎, Σ) (6)
where Σ = (𝜎𝑖𝑗 )𝑘×𝑘 .

Let 𝑔(𝑥1 , … , 𝑥𝑘 ) be a function whose all first derivatives exist then as 𝑛 → ∞
√ 𝒟
→ 𝒩(0, 𝐚′ Σ𝐚)
𝑛[𝑔(𝑇1𝑛 , … , 𝑇𝑘𝑛 ) − 𝑔(𝜃1 , … , 𝜃𝑘 )] − (7)
where
𝜕𝑔 𝜕𝑔 𝜕𝑔 ′
𝐚= =( ,…, )
𝜕𝜽 𝜕𝜃1 𝜕𝜃𝑘

Delta method
Theorem
Let 𝑇1𝑛 , … , 𝑇𝑘𝑛 be statistics and 𝑔𝑖 (𝑥1 , … , 𝑥𝑘 ), 𝑖 = 1, … , 𝑝, be functions, all of whose
first derivatives exist.
The joint distribution of
𝑝
√
{ 𝑛[𝑔𝑖 (𝑇1𝑛 , … , 𝑇𝑘𝑛 ) − 𝑔𝑖 (𝜃1 , … , 𝜃𝑘 )]}
𝑖=1
asymptotically follows a 𝑝-variate normal distribution with mean 𝟎 and covariance

matrix 𝐆Σ𝐆′ , where 𝐺𝑖𝑗 = 𝜕𝑔𝑖 /𝜕𝜃𝑗 is the (𝑖, 𝑗) entry of 𝐆

Confidence intervals for survival probabilities
• Want to obtain confidence interval for 𝑆(𝑦0 ) for a pre-specified log-lifetime 𝑦0 ,

where
𝑦0 − 𝑢
𝑆(𝑦0 ) = 𝑃 (𝑌 > 𝑦0 ) = 𝑆0 ( ) = 𝑆0 (𝜓)
𝑏
𝑦0 − 𝑢
⇒ 𝜓 = 𝑆0−1 (𝑆(𝑦0 )) =
𝑏
• The MLE 𝜓 ̂ depends on 𝑢̂ and 𝑏̂

• For a large 𝑛, the MLE 𝜓 ̂ follows a normal distribution with mean 𝜓 and
variance 𝑣𝑎𝑟(𝜓),̂ and the corresponding pivotal quantity
𝜓̂ − 𝜓
𝑍0 = ∼ 𝒩(0, 1)
𝑠𝑒(𝜓)̂
• Since 𝜓 ̂ is a function of 𝑢̂ and 𝑏,̂ using the statement of Theorem of the slide
[23], we can write
1/2
𝑠𝑒(𝜓)̂ = [𝐚′ 𝑉 ̂ 𝐚] ,
where 𝑉 ̂ is the covariance matrix of (𝑢,̂ 𝑏)̂ ′ and
𝜕𝜓 𝜕𝜓 ′
𝐚=( , ) = ( − 1/𝑏,̂ −𝜓/̂ 𝑏)
̂
𝜕𝑢 𝜕𝑏 𝑢=𝑢,𝑏=
̂ 𝑏̂

• (1 − 𝑝)100% confidence interval for 𝑆(𝑦0 )
𝜓 ̂ − 𝑧1−𝑝/2 𝑠𝑒(𝜓)̂ ≤ 𝜓 ≤ 𝜓 ̂ + 𝑧1−𝑝/2 𝑠𝑒(𝜓)̂
𝜓 ̂ − 𝑧1−𝑝/2 𝑠𝑒(𝜓)̂ ≤ 𝑆0−1 (𝑆(𝑦0 )) ≤ 𝜓 ̂ + 𝑧1−𝑝/2 𝑠𝑒(𝜓)̂
𝑆0 (𝜓 ̂ − 𝑧1−𝑝/2 𝑠𝑒(𝜓))
̂ ≤ 𝑆(𝑦0 ) ≤ 𝑆0 (𝜓 ̂ + 𝑧1−𝑝/2 𝑠𝑒(𝜓))
̂

cons
• Normal approximation based confidence intervals could be inaccurate for small
samples
• An alternative to normal approximation, bootstrap simulations can be used to
estimate the distributions of pivots
• All these methods can perform poorly in small samples with heavy censoring
• Implementation of likelihood ratio based confidence intervals is relatively
complicated, but LRT based CI often performs better in small and medium-size
samples

Likelihood ratio procedures lrt based ci for parameters
• To test the hypothesis 𝐻0 ∶ 𝑢 = 𝑢0 , the following likelihood ratio test statistic

can be used
Λ1 (𝑢0 ) = 2ℓ(𝑢,̂ 𝑏)̂ − 2ℓ(𝑢0 , 𝑏(𝑢
̃ 0 ))
• MLEs
(𝑢,̂ 𝑏)̂ ′ = arg max ℓ(𝑢, 𝑏) unrestricted
(𝑢,𝑏)′ ∈Θ
̃ 0 ) = arg max ℓ(𝑢0 , 𝑏) under 𝐻0

𝑏(𝑢
𝑏∈Θ1

Likelihood ratio procedures
• Under 𝐻0 ∶ 𝑢 = 𝑢0 , asymptotically Λ1 (𝑢0 ) ∼ 𝜒2(1)
• Approximate two-sided (1 − 𝑝)100% confidence interval for 𝑢 can be obtained as
the set of values of 𝑢0 for which
Λ1 (𝑢0 ) ≤ 𝜒2(1),1−𝑝
• Approximate lower (1 − 𝑝/2)100% CI for 𝑢 consisting of values
̂ 1 (𝑢0 ) ≤ 𝜒2(1),1−𝑝 }
{𝑢0 ∶ 𝐼(𝑢0 ≤ 𝑢)Λ
• Approximate upper (1 − 𝑝/2)100% CI for 𝑢 consisting of values
̂ 1 (𝑢0 ) ≤ 𝜒2(1),1−𝑝 }
{𝑢0 ∶ 𝐼(𝑢0 ≥ 𝑢)Λ

Homework - 1
• Obtain the expression of likelihood ratio test statistic based confidence interval
for the scale parameter 𝑏

Likelihood ratio procedures for quantiles
• The 𝑝𝑡ℎ quantile of location-scale distribution can be expressed as
𝑦𝑝 = 𝑢 + 𝑤𝑝 𝑏, 𝑤ℎ𝑒𝑟𝑒 𝑤𝑝 = 𝑆0−1 (1 − 𝑝)
• To obtain confidence intervals for a quantile, consider the null hypothesis
𝐻0 ∶ 𝑦𝑝 = 𝑦𝑝0
• The corresponding likelihood ratio test statistic
Λ(𝑦𝑝0 ) = 2ℓ(𝑢,̂ 𝑏)̂ − 2ℓ(𝑢,̃ 𝑏)̃ (8)

• The estimates 𝑢̂ and 𝑏̂ are MLEs under 𝐻1
(𝑢,̂ 𝑏)̂ ′ = arg max ℓ(𝑢, 𝑏)

(𝑢,𝑏)′ ∈Θ
• Steps to obtain MLEs 𝑢̃ and 𝑏,̃ under 𝐻0 ∶ 𝑦𝑝 = 𝑦𝑝0

1. Under 𝐻0 , 𝑦𝑝0 = 𝑢 + 𝑤𝑝 𝑏 ⇒ 𝑢 = 𝑦𝑝0 − 𝑏𝑤𝑝
2. 𝑏̃ = arg max𝑏∈Θ ℓ(𝑦𝑝0 − 𝑤𝑝 𝑏, 𝑏)
3. 𝑢̃ = 𝑦𝑝0 − 𝑤𝑝 𝑏̃
• (1 − 𝑞)100% Confidence interval for 𝑦𝑝 can be obtained from the set of 𝑦𝑝0
values such that
Λ(𝑦𝑝0 ) ≤ 𝜒2(1),1−𝑞
Likelihood ratio procedures for survival quantity
• To obtain confidence interval for 𝑆(𝑦0 ), consider the null hypothesis
𝐻0 ∶ 𝑆(𝑦0 ) = 𝑠0
• The same likelihood ratio statistic (8) can be used to test the hypothesis
Λ(𝑠0 ) = 2ℓ(𝑢,̂ 𝑏)̂ − 2ℓ(𝑢,̃ 𝑏)̃ (9)

• Steps for obtaining MLEs 𝑢̃ and 𝑏̃ under 𝐻0
1. Under 𝐻0 ∶ 𝑆(𝑦0 ) = 𝑠0
𝑦0 − 𝑢
𝑆(𝑦0 ) = 𝑆0 ( ) = 𝑠0 ⇒ 𝑢 = 𝑦0 − 𝑆0−1 (𝑠0 )𝑏
𝑏
2. 𝑏̃ = arg max𝑏∈Θ ℓ(𝑦0 − 𝑆0−1 (𝑠0 )𝑏, 𝑏)
3. 𝑢̃ = 𝑦0 − 𝑆0−1 (𝑠0 )𝑏̃
• The (1 − 𝑝)100% confidence interval for 𝑆(𝑦0 ) can be defined as the set of 𝑠0
values such that
Λ(𝑠0 ) ≤ 𝜒2(1),1−𝑝 (10)

Likelihood ratio procedures pros & cons
• The likelihood ratio procedure can provide quite accurate confidence intervals
when the number of failures is about 20 or more
• Two-sided intervals perform better than one-sided intervals as the former giving
more closer to nominal coverage than the other


• The pdf of Weibull distribution

𝛽−1
𝛽 𝑡 𝛽
𝑓(𝑡; 𝛼, 𝛽) = ( ) exp [ − (𝑡/𝛼) ] (11)
𝛼 𝛼
• 𝛼 > 0 and 𝛽 > 0 are scale and shape parameters, respectively

• The pdf of extreme-value distribution

1
𝑓(𝑦; 𝑢, 𝑏) = exp [(𝑦 − 𝑢)/𝑏] exp [ − 𝑒(𝑦−𝑢)/𝑏 ] (12)
𝑏
1 𝑦−𝑢
= 𝑓0 ( ) (13)
𝑏 𝑏
• 𝑢 = log 𝛼
• 𝑏 = (1/𝛽)
• Extreme-value distribution is used to make inferences about Weibull distribution

Likelihood based inference procedures
• Censored sample
{(𝑡𝑖 , 𝛿𝑖 ), 𝑖 = 1, … , 𝑛}
• Define
𝑦𝑖 = log 𝑡𝑖 and 𝑧𝑖 = (𝑦𝑖 − 𝑢)/𝑏

• General expression of likelihood function

𝑛
𝑖=1
• For extreme-value distribution
𝑆0 (𝑧) = exp(−𝑒𝑧 )
𝑑
𝑓0 (𝑧) = − 𝑆 (𝑧) = exp(𝑧 − 𝑒𝑧 )
𝑑𝑧 0

𝑛
𝑖=1
• Log-likelihood function for EV distribution

𝑛
ℓ(𝑢, 𝑏) = −𝑟 log 𝑏 + ∑ (𝛿𝑖 𝑧𝑖 − 𝑒𝑧𝑖 ) (14)
𝑖=1
• 𝑟 = ∑ 𝑖 𝛿𝑖

Score functions
• General expressions

= − ∑ [𝛿𝑖 + (1 − 𝛿𝑖 ) ]

= − − ∑ 𝑧𝑖 [𝛿𝑖 + (1 − 𝛿𝑖 ) ]

Score functions

𝜕 log 𝑓0 (𝑧) 𝜕
= log { exp(𝑧 − 𝑒𝑧 )} = 1 − 𝑒𝑧
𝜕𝑧 𝜕𝑧
𝜕 log 𝑆0 (𝑧) 𝜕
= log { exp(−𝑒𝑧 )} = −𝑒𝑧
𝜕𝑧 𝜕𝑧

Score functions
𝜕 log 𝑓0 (𝑧) 𝜕 log 𝑆0 (𝑧)

= (1 − 𝑒𝑧 ) and = −𝑒𝑧
𝜕𝑧 𝜕𝑧

= − ∑ [𝛿𝑖 + (1 − 𝛿𝑖 ) ]
1 𝑛
= − ∑ [𝛿𝑖 (1 − 𝑒𝑧𝑖 ) − (1 − 𝛿𝑖 )𝑒𝑧𝑖 ]
𝑏 𝑖=1
𝑛
1
= − [𝑟 − ∑ 𝑒𝑧𝑖 ] (15)
𝑏 𝑖=1

Score functions
• MLEs 𝑢̂ and 𝑏̂
𝜕ℓ(𝑢, 𝑏)
∣ =0
𝜕𝑢 𝑢=𝑢,̂ 𝑏=𝑏̂
𝑛 𝑛
1
− [𝑟 − ∑ 𝑒 ] = 0 ⇒ 𝑟 = ∑ 𝑒𝑧𝑖̂
𝑧𝑖̂
(16)
𝑏̂ 𝑖=1 𝑖=1
• ̂ 𝑏̂
𝑧𝑖̂ = (𝑦𝑖 − 𝑢)/

Score functions
𝜕 log 𝑓0 (𝑧) 𝜕 log 𝑆0 (𝑧)

= (1 − 𝑒𝑧 ) and = −𝑒𝑧
𝜕𝑧 𝜕𝑧
= − − ∑ 𝑧𝑖 [𝛿𝑖 + (1 − 𝛿𝑖 ) ]
𝑟 1 𝑛
= − − ∑ 𝑧𝑖 [𝛿𝑖 (1 − 𝑒𝑧𝑖 ) − (1 − 𝛿𝑖 )𝑒𝑧𝑖 ]
𝑏 𝑏 𝑖=1
𝑟 1 𝑛
= − − ∑ 𝑧𝑖 [𝛿𝑖 − 𝑒𝑧𝑖 ] (17)
𝑏 𝑏 𝑖=1

Score functions
𝜕ℓ(𝑢, 𝑏)
∣ =0
𝜕𝑏 𝑢=𝑢,̂ 𝑏=𝑏̂
𝑟 1 𝑛 𝑛
− − ∑ 𝑧𝑖̂ (𝛿𝑖 − 𝑒𝑧𝑖̂ ) = 0 ⇒ 𝑟 = − ∑ 𝑧𝑖̂ (𝛿𝑖 − 𝑒𝑧𝑖̂ ) (18)
𝑏̂ 𝑏̂ 𝑖=1 𝑖=1

Hessian matrix
𝜕 2 ℓ(𝑢, 𝑏) 1 𝑛 𝜕 2 log 𝑓0 (𝑧𝑖 ) 𝜕 2 log 𝑆0 (𝑧𝑖 ) 1 𝑛 𝑧𝑖

= ∑ [𝛿 + (1 − 𝛿 ) ] = − ∑𝑒
𝜕𝑢2 𝑏2 𝑖=1 𝑖 𝜕𝑧𝑖2 𝑖
𝜕𝑧𝑖2 𝑏2 𝑖=1
where
𝜕 2 log 𝑓0 (𝑧) 𝜕 𝑧 𝑧 𝜕 2 log 𝑆0 (𝑧)
= {1 − 𝑒 } = −𝑒 =
𝜕𝑧 2 𝜕𝑧 𝜕𝑧2
𝜕 2 ℓ(𝑢, 𝑏) 1 𝑛 𝑧𝑖̂ 𝑟
∣ = − ∑𝑒 = − (19)
𝜕𝑏 2 ̂ 2
𝑏 𝑖=1 𝑏̂ 2
𝑢=𝑢,̂ 𝑏=𝑏̂

𝜕 2 ℓ(𝑢, 𝑏) 𝑟 2 𝑛 𝜕 log 𝑓0 (𝑧𝑖 ) 𝜕 log 𝑆0 (𝑧𝑖 )
= + ∑ 𝑧 𝑖 [𝛿𝑖 + (1 − 𝛿 𝑖 ) ]
𝜕𝑏2 𝑏2 𝑏2 𝑖=1 𝜕𝑧𝑖 𝜕𝑧𝑖
1 𝑛 2 𝜕 2 log 𝑓0 (𝑧𝑖 ) 𝜕 2 log 𝑆0 (𝑧𝑖 )

+ ∑ 𝑧 [𝛿 + (1 − 𝛿 ) ]
𝜕𝑧𝑖2
𝑟 2 𝑛 𝑧𝑖 𝑧𝑖 1 𝑛 2 𝑧𝑖
= 2 + 2 ∑ 𝑧𝑖 [𝛿𝑖 (1 − 𝑒 ) − (1 − 𝛿𝑖 )𝑒 ] − 2 ∑ 𝑧𝑖 𝑒
𝑏 𝑏 𝑖=1 𝑏 𝑖=1
𝑟 2 𝑛 𝑧𝑖 1 𝑛 2 𝑧𝑖
= 2 + 2 ∑ 𝑧𝑖 [𝛿𝑖 − 𝑒 ] − 2 ∑ 𝑧𝑖 𝑒
𝑏 𝑏 𝑖=1 𝑏 𝑖=1
𝜕 2 ℓ(𝑢, 𝑏) 𝑟 1 𝑛 2 𝑧𝑖̂
∣ = − − ∑ 𝑧𝑖̂ 𝑒 (20)
𝜕𝑏2 𝑢=𝑢,̂ 𝑏=𝑏̂ 𝑏̂2 𝑏̂2 𝑖=1
𝜕 2 ℓ(𝑢, 𝑏) 1 𝑛 𝜕 log 𝑓0 (𝑧𝑖 ) 𝜕 log 𝑆0 (𝑧𝑖 )
= 2 ∑ [𝛿𝑖 + (1 − 𝛿𝑖 ) ]
𝜕𝑢 𝜕𝑏 𝑏 𝑖=1 𝜕𝑧𝑖 𝜕𝑧𝑖
1 𝑛 𝜕 2 log 𝑓0 (𝑧𝑖 ) 𝜕 2 log 𝑆0 (𝑧𝑖 )

+ ∑ 𝑧 [𝛿 + (1 − 𝛿 ) ]
𝜕𝑧𝑖2
1 𝑛
= 2 ∑ [𝛿𝑖 (1 − 𝑒𝑧𝑖 ) − (1 − 𝛿𝑖 )𝑒𝑧𝑖 − 𝑧𝑖 𝑒𝑧𝑖 ]
𝑏 𝑖=1
1 𝑛 1 𝑛 𝑛
= 2 ∑ [𝛿𝑖 − 𝑒 − 𝑧𝑖 𝑒 ] = 2 [𝑟 − ∑ 𝑒 − ∑ 𝑧𝑖 𝑒𝑧𝑖 ]
𝑧𝑖 𝑧𝑖 𝑧𝑖
𝑏 𝑖=1 𝑏 𝑖=1 𝑖=1
𝜕 2 ℓ(𝑢, 𝑏) 1 𝑛
∣ = − ∑ 𝑧𝑖̂ 𝑒𝑧𝑖̂ (21)
𝜕𝑢 𝜕𝑏 𝑢=𝑢,̂ 𝑏=𝑏̂ 𝑏̂2 𝑖=1
Covariance matrix
• Hessian matrix at MLEs 𝑢̂ and 𝑏̂
𝑛
1 𝑟 ∑𝑖=1 𝑧𝑖̂ 𝑒𝑧𝑖̂
̂
𝐻(𝑢,̂ 𝑏) = − [ 𝑛 ]
𝑛
𝑏̂2 ∑𝑖=1 𝑧𝑖̂ 𝑒𝑧𝑖̂ 𝑟 + ∑𝑖=1 𝑧𝑖2̂ 𝑒𝑧𝑖̂
• Observed information matrix at MLEs 𝑢̂ and 𝑏̂

𝐼(𝑢,̂ 𝑏)̂ = −𝐻(𝑢,̂ 𝑏)̂
𝑛
1 𝑟 ∑𝑖=1 𝑧𝑖̂ 𝑒𝑧𝑖̂
= [ 𝑛 𝑛 ]
𝑏̂2 ∑𝑖=1 𝑧𝑖̂ 𝑒𝑧𝑖̂ 𝑟 + ∑𝑖=1 𝑧𝑖2̂ 𝑒𝑧𝑖̂
• Covariance matrix of (𝑢,̂ 𝑏)̂ ′

−1
̂
𝑉 ̂ = [𝐼(𝑢,̂ 𝑏)] (22)

Covariance matrix
• MLEs of 𝛼 and 𝛽 (Weibull model parameters)
𝛼̂ = 𝑒𝑢̂ and 𝛽 ̂ = 1/𝑏̂
• Covariance matrix of (𝛼,̂ 𝛽)̂ ′ (using delta method defined in the slide [24])
𝑣𝑎𝑟(𝛼,̂ 𝛽)̂ = 𝐆 𝑉 ̂ 𝐆′ (23)
where
𝜕𝑔1 (𝑢,𝑏) 𝜕𝑔1 (𝑢,𝑏)
𝑒𝑢̂ 0
𝐺= [ 𝜕𝑔 𝑑𝑢 𝜕𝑏
] =[ ]
2 (𝑢,𝑏) 𝜕𝑔2 (𝑢,𝑏) 0 − 𝑏1̂2
𝜕𝑢 𝜕𝑏
• 𝛼 = 𝑔1 (𝑢, 𝑏) = 𝑒𝑢
• 𝛽 = 𝑔2 (𝑢, 𝑏) = (1/𝑏)
Confidence intervals
• Wald-type statistics based 100(1 − 𝑝)% CI for 𝑢 and 𝑏
𝑢̂ ± 𝑧1−𝑝/2 𝑠𝑒(𝑢)̂
𝑏̂ ± 𝑧1−𝑝/2 𝑠𝑒(𝑏)̂
𝑏̂ exp [ ± 𝑧1−𝑝/2 𝑠𝑒(log 𝑏)]

̂

Confidence intervals (LRT statistics based)
• Log-likelihood function corresponding to 𝐻0 ∶ 𝑏 = 𝑏0
𝑛
𝑦𝑖 − 𝑢
ℓ(𝑢, 𝑏0 ) = −𝑟 log 𝑏0 + ∑ [𝛿𝑖 ( ) − 𝑒(𝑦𝑖 −𝑢)/𝑏0 ]
𝑖=1
𝑏0
• MLE of 𝑢 under 𝐻0 ∶ 𝑏 = 𝑏0
𝑛
𝜕ℓ(𝑢, 𝑏0 ) 1
∣ = 0 ⇒ − [𝑟 − ∑ 𝑒(𝑦𝑖 −𝑢)/𝑏
̃ 0
]=0
𝜕𝑢 𝑏0 𝑖=1
𝑢=𝑢̃
1 𝑛 𝑦𝑖 /𝑏0
⇒ 𝑢(𝑏
̃ 0 ) = 𝑏0 log [ ∑ 𝑒 ]
𝑟 𝑖=1

Confidence intervals (LRT statistics based)
• LRT statistics
Λ(𝑏0 ) = 2ℓ(𝑢,̂ 𝑏)̂ − 2ℓ(𝑢(𝑏
̃ 0 ), 𝑏0 )
• 100(1 − 𝑝)% CI for 𝑏 is defined by the set of 𝑏0 values such that
Λ1 (𝑏0 ) ≤ 𝜒2(1),1−𝑝
• Similarly, confidence interval for 𝑢 can be obtained using the corresponding LRT
statistics (Homework)

Confidence interval for quantiles
• The 𝑝𝑡ℎ quantile of 𝑌 ∼ 𝐸𝑉 (𝑢, 𝑏)

𝑦0 − 𝑢
𝑆(𝑦𝑝 ) = 𝑆0 ( ) = (1 − 𝑝) (24)
𝑏
𝑦𝑝 − 𝑢
exp [ − exp ( )] = (1 − 𝑝) (25)
𝑏
𝑦𝑝 − 𝑢
= log [ − log(1 − 𝑝)] = 𝑆0−1 (1 − 𝑝) = 𝑤𝑝 (26)
𝑏
𝑦𝑝 = 𝑢 + 𝑤𝑝 𝑏 (27)

Wald-type confidence interval for quantiles
• The estimate of 𝑝𝑡ℎ quantile
𝑦𝑝̂ = 𝑢̂ + 𝑤𝑝 𝑏̂
• Standard error of 𝑦𝑝̂ (using the results of slide [23])
1
𝑣𝑎𝑟(𝑦𝑝̂ ) = [1 𝑤𝑝 ] 𝑉 ̂ [ ̂ + 𝑉22
] = 𝑉11 ̂ 𝑤𝑝2 + 2𝑉12
̂ 𝑤𝑝
𝑤𝑝
• Large sample based 100(1 − 𝑞)% confidence interval for 𝑦𝑝
𝑦𝑝̂ ± 𝑧1−𝑞/2 𝑠𝑒(𝑦𝑝̂ )
• Find the 100(1 − 𝑞)% confidence interval for 𝑡𝑝

LRT-based confidence interval for quantiles
• To obtain LRT statistic based confidence interval for the quantile 𝑦𝑝 , consider
the following null hypothesis
𝐻0 ∶ 𝑦𝑝 = 𝑦𝑝0
• The corresponding LRT statistic
Λ(𝑦𝑝0 ) = 2ℓ(𝑢,̂ 𝑏)̂ − 2ℓ(𝑢,̃ 𝑏)̃ (8)
• The procedure of obtaining parameter estimates 𝑢̃ and 𝑏̃ (under 𝐻0 ) is

explained in the slide [33]
• LRT statistic based (1 − 𝑞)100% confidence interval for 𝑦𝑝 can be obtained
from the set of 𝑦𝑝0 values such that
Λ(𝑦𝑝0 ) ≤ 𝜒2(1),1−𝑞

Wald-type Confidence interval for survival probability
• To obtain confidence interval for survival probability
𝑦0 − 𝑢 𝑦 −𝑢
𝑆(𝑦0 ) = 𝑆0 ( ) = exp [ − exp ( 0 )]
𝑏 𝑏
• We can defined
𝑦0 − 𝑢
𝜓 = 𝑆0−1 (𝑆(𝑦0 )) = log [ − log (𝑆(𝑦0 ))] =
𝑏
• MLE and SE
𝑦0 − 𝑢̂
𝜓̂ =
𝑏̂
̂ ̂ ̂
̂ ] [𝑉11 𝑉12 ] [ −1/𝑏 ]
𝑣𝑎𝑟(𝜓)̂ = 𝐚′ 𝑉 ̂ 𝐚 = [−1/𝑏̂ −𝜓/𝑏
𝑉21̂ 𝑉22̂ ̂
−𝜓/𝑏

Wald-type Confidence interval for survival probability
• (1 − 𝑝)100% CI for 𝜓
𝜓 ̂ − 𝑠𝑒(𝜓)̂ 𝑧1−𝑝2 < 𝜓 ≤ 𝜓 ̂ + 𝑠𝑒(𝜓)̂ 𝑧1−𝑝/2

𝐿<𝜓≤𝑈
• Confidence interval for 𝑆(𝑦0 )
𝐿 < log [ − log (𝑆(𝑦0 ))] ≤ 𝑈

exp [ − exp(𝑈 )] < 𝑆(𝑦0 ) ≤ exp [ − exp(𝐿)]

LRT-based confidence interval for survival probability
• Consider the null hypothesis 𝐻0 ∶ 𝑆(𝑦0 ) = 𝑠0 , where

𝑦0 − 𝑢
𝑆(𝑦0 ) = exp [ − exp ( )]
𝑏
• The (1 − 𝑝)100% confidence interval for 𝑆(𝑦0 ) can be defined as the set of 𝑠0
values such that Λ(𝑠0 ) ≤ 𝜒2(1),1−𝑝 , where
Λ(𝑠0 ) = 2ℓ(𝑢,̂ 𝑏)̂ − 2ℓ(𝑢,̃ 𝑏)̃ (9)
• The procedure of obtaining parameter estimates 𝑢̃ and 𝑏̃ (under 𝐻0 ) is

explained in the slide [35]

Example 5.2.1
• Leukemia remission time data were given in Example 1.1.7 and used as an
example for the non-parametric methods (e.g. Kaplan-Meier method) described
in Chapter 3
• Two groups of patients (6MP and placebo), each group has 21 patients, were
followed up to observed either remission or censoring times (in weeks)

Remission time data
> gehan65
## # A tibble: 42 x 3
## time status drug
## <dbl> <dbl> <chr>
## 1 6 1 6-MP
## 2 6 1 6-MP
## 3 6 1 6-MP
## 4 6 0 6-MP
## 5 7 1 6-MP
## 6 9 0 6-MP
## 7 10 1 6-MP
## 8 10 0 6-MP
## 9 11 0 6-MP
## 10 13 1 6-MP
## # i 32 more rows
Remission time data
drug status n
6-MP 0 12
6-MP 1 9
placebo 1 21
• Each group has 21 subjects, and all subjects of the placebo group were failed
(end of remission) and drug group (6MP) has 12 censored times

Remission time data
• Two separate Weibull distributions are assumed for the failure times of two
treatment groups, e.g.
• 6MP group:
𝑇 ∼ Weibull(𝛼1 , 𝛽1 ), 𝑌 = log 𝑇 ∼ 𝐸𝑉 (𝑢1 , 𝑏1 )
• Placebo group:
𝑇 ∼ Weibull(𝛼2 , 𝛽2 ), 𝑌 = log 𝑇 ∼ 𝐸𝑉 (𝑢2 , 𝑏2 )
• Objectives: Drawing inference about the parameters

Remission time data
• Observed data
{(𝑡𝑖 , 𝛿𝑖 ), 𝑖 = 1, … , 𝑛}
𝑛
ℓ(𝛼, 𝛽) = ∑ [𝛿𝑖 log 𝑓(𝑡𝑖 ; 𝛼, 𝛽) + (1 − 𝛿𝑖 ) log 𝑆(𝑡𝑖 ; 𝛼, 𝛽)]
𝑖=1
• MLEs
(𝛼,̂ 𝛽)̂ ′ = arg max ℓ(𝛼, 𝛽)
(𝛼,𝛽)′ ∈Θ

Analysis of remission time data (Extreme-value distribution)
• Define 𝑦 = log 𝑡 and corresponding probability density and survivor function
1
𝑓(𝑦; 𝑢, 𝑏) = exp [(𝑦 − 𝑢)/𝑏 − 𝑒(𝑦−𝑢)/𝑏 ] (28)
𝑏
𝑆(𝑦; 𝑢, 𝑏) = exp [ − 𝑒(𝑦−𝑢)/𝑏 ] (29)
𝑛 𝛿𝑖 1−𝛿𝑖
ℓ𝑒𝑣 (𝑢, 𝑏) = log ∏ [𝑓(𝑦𝑖 ; 𝑢, 𝑏)] [𝑆(𝑦𝑖 ; 𝑢, 𝑏)] (30)
𝑖=1
• MLEs
(𝑢,̂ 𝑏)̂ ′ = arg max ℓ𝑒𝑣 (𝑢, 𝑏)
(𝑢,𝑏)′ ∈Θ

survreg function
• R function survreg() can also be used to fit distributions of log-location-scale

family, its syntax is similar to the syntax of survfit()
> survreg(formula, data, dist)
• In formula, response is a Surv object, e.g. to model the variables time and
status
formula = Surv(time, status) ∼ 1
• Lifetime or log-lifetime distributions can be passed to survreg by the argument
dist

survreg function
• Available lifetime or log-lifetime distributions include “weibull”, “exponential”,

“gaussian”, “logistic”, “lognormal”, “loglogistic”, “extreme”
• The time argument of Surv function is either a lifetime or a log-lifetime
depending on whether the mentioned dist is a lifetime (e.g. “weibull”) or a
log-lifetime (e.g. “extreme”)
weibull → formula = Surv(time, status) ∼ 1

extreme → formula = Surv(log(time), status) ∼ 1

Data for the treatment (6MP) group
> d6mp
## # A tibble: 21 x 3
## time status drug
## <dbl> <dbl> <chr>
## 1 6 1 6-MP
## 2 6 1 6-MP
## 3 6 1 6-MP
## 4 6 0 6-MP
## 5 7 1 6-MP
## 6 9 0 6-MP
## 7 10 1 6-MP
## 8 10 0 6-MP
## 9 11 0 6-MP
## 10 13 1 6-MP
## # i 11 more rows
Analysis of remission time data using survreg function
> w_sreg_6mp <- survreg(Surv(time, status) ~ 1,

data = d6mp, dist = "weibull")
> ev_sreg_6mp <- survreg(Surv(log(time), status) ~ 1,

data = d6mp, dist = "extreme")

Extreme-value distribution
• Estimates of model parameters 𝑢 and log 𝑏

> tidy(ev_sreg_6mp)
## # A tibble: 2 x 5
## term estimate std.error statistic p.value
## <chr> <dbl> <dbl> <dbl> <dbl>
## 1 (Intercept) 3.52 0.273 12.9 6.28e-38
## 2 Log(scale) -0.303 0.278 -1.09 2.77e- 1

Extreme-value distribution
• Variance-covariance matrix of (𝑢,̂ log 𝑏)̂

> vcov(ev_sreg_6mp)
## (Intercept) Log(scale)
## (Intercept) 0.07473057 0.03305811
## Log(scale) 0.03305811 0.07750538

• The survreg() function returns estimates of (𝑢, log 𝑏)′ and corresponding
variance matrix
• For making inference about Weibull distribution, followings are required
1. estimate of (𝑢, 𝑏)′ and the corresponding variance matrix
2. estimate of (𝛼, 𝛽)′ and the corresponding variance matrix
• It is important to understand the methods to obtain estimates and the
corresponding variance of (𝑢, 𝑏)′ and (𝛼, 𝛽)′ from the estimates and the
corresponding variance of (𝑢, log 𝑏)′

Homework
• Obtain the variance-covarince matrix of (𝛼,̂ 𝛽)̂ ′ and (𝑢,̂ 𝑏)̂ ′

Wald-type confidence intervals (6MP group)
• 95% CI using the sampling distribution of (𝛼,̂ 𝛽)̂
𝛼̂ ± 𝑧.975 𝑠𝑒(𝛼)̂ = 33.765 ± (1.96)(9.23)

= 15.674 to 51.856
𝛽 ̂ ± 𝑧.975 𝑠𝑒(𝛽)̂ = 1.354 ± (1.96)(0.377)
= 0.615 to 2.092

• 95% CI using the sampling distribution of (𝑢,̂ log 𝑏)̂
𝑢̂ ± 𝑧.975 𝑠𝑒(𝑢)̂ = 2.984 to 4.055

𝛼̂ ± 𝑧.975 𝑠𝑒(𝛼)̂ = exp(2.984) to exp(4.055)
= 19.76 to 57.698
• Similarly
log 𝑏̂ ± 𝑧.975 𝑠𝑒(log 𝑏)̂ = −0.849 to 0.243

𝛽̂± 𝑧
.975 𝑠𝑒(𝛽)̂ = 1/ exp(0.243) to 1/ exp(−0.849)
= 0.784 to 2.336

• Obtain the variance matrix of (𝑢,̂ 𝑏)̂ using the sampling distribution of (𝑢,̂ log 𝑏)̂ ′
• 95% CI using the sampling distribution of (𝑢,̂ 𝑏)̂
𝑢̂ ± 𝑧.975 𝑠𝑒(𝑢)̂ = 2.984 to 4.055

𝛼̂ ± 𝑧.975 𝑠𝑒(𝛼)̂ = exp(2.984) to exp(4.055)
= 19.76 to 57.698
• Similarly
𝑏̂ ± 𝑧.975 𝑠𝑒(𝑏)̂ = 0.336 to 1.142

𝛽̂± 𝑧 .975𝑠𝑒(𝛽)̂ = 1/1.142 to 1/0.336
= 0.876 to 2.979

Table 1: 95% confidence intervals for 𝛼 and 𝛽 by different methods
parameter method 6-MP

𝛼 Wald (𝛼)̂ (15.674, 51.856)
Wald (𝑢)̂ (19.76, 57.698)
𝛽 Wald (𝛽)̂ (0.615, 2.092)
Wald (log 𝑏)̂ (0.784, 2.336)
Wald (𝑏)̂ (0.876, 2.979)

8
6 3.088 < u < 4.34
21.933 < α < 76.708

Λ1 4
0
3.0 3.5 4.0 4.5 5.0
u0
Figure 1: Plot of LRT statistic against different null values 𝑢0 and 95% confidence interval for
𝑢 and 𝛼

7.5
− 0.79 < log(b) < 0.32

5.0 0.726 < β < 2.203
Λ2
2.5
0.0
−1.0 −0.5 0.0 0.5
log(b0)
Figure 2: Plot of LRT statistic against different null values 𝑏0 and 95% confidence interval for
log 𝑏 and 𝛽

parameter method 6-MP

𝛼 Wald (𝛼)̂ (15.674, 51.856)
Wald (𝑢)̂ (19.76, 57.698)
LRT (21.933, 76.708)
𝛽 Wald (𝛽)̂ (0.615, 2.092)
Wald (log 𝑏)̂ (0.784, 2.336)
Wald (𝑏)̂ (0.876, 2.979)
LRT (0.726, 2.203)

Placebo group data
dplacebo <- gehan65 %>%

filter(drug == "placebo")
Model fit with the data of placebo group

> w_sreg_p <- survreg(Surv(time, status) ~ 1,
data = dplacebo, dist = "weibull")

Estimates of model parameters
> tidy(w_sreg_p)
## 1 (Intercept) 2.25 0.168 13.4 5.72e-41
## 2 Log(scale) -0.315 0.174 -1.82 6.94e- 2

parameter method 6-MP Placebo
𝛼 Wald (𝛼)̂ (15.674, 51.856) (6.363, 12.601)

Wald (𝑢)̂ (19.76, 57.698) (6.824, 13.175)
LRT (21.933, 76.708) (6.659, 13.25)
𝛽 Wald (𝛽)̂ (0.615, 2.092) (0.904, 1.837)

Wald (log 𝑏)̂ (0.784, 2.336) (0.975, 1.926)
Wald (𝑏)̂ (0.876, 2.979) (1.023, 2.077)
LRT (0.726, 2.203) (0.951, 1.868)

Quantiles
• Estimate of 𝑝𝑡ℎ quantile

̂ 𝑝
𝑦𝑝̂ = 𝑢̂ + 𝑏𝑤
• 𝑤𝑝 = log[− log(1 − 𝑝)]
• 𝑢̂ = 3.519 and 𝑏̂ = 0.739 (for treatment group)
• Wald-type CI (see the slide [58] for detail)
𝑦𝑝̂ ± 𝑠𝑒(𝑦𝑝̂ )𝑧1−𝑞/2
• Note the estimate of 𝑦𝑝̂ depends on the estimate of 𝑢̂ and 𝑏,̂ and the
corresponding variance matrix
• survreg() returns estimate and variance matrix for 𝑢̂ and log 𝑏̂

Quantiles
Table 4: 95% confidence intervals for different quantiles of treatment group (6-MP)
CI for 𝑡𝑝
𝑝 𝑤𝑝 𝑦𝑝̂ 𝑠𝑒(𝑦𝑝̂ ) lower upper
0.25 -1.246 2.599 0.655 3.726 48.559
0.50 -0.367 3.249 0.264 15.357 43.225
0.75 0.327 3.761 0.395 19.822 93.241

6 1.885 < y0.25 < 3.138
6.586 < t0.25 < 23.058
4
Λ1
0
1.5 2.0 2.5 3.0
yp
Figure 3: Plot of LRT statistic against different null values 𝑦𝑝0 and 95% confidence interval for
𝑦.25 and 𝑡.25 (6-MP group)

7.5
2.7905 < y0.5 < 3.9385

5.0 16.289 < t0.5 < 51.342
Λ1
2.5
0.0
2.5 3.0 3.5 4.0
yp
Figure 4: Plot of LRT statistic against different null values 𝑦𝑝0 and 95% confidence interval for
𝑦.5 and 𝑡.5 (6-MP group)

Quantiles
Table 5: 95% confidence intervals of different quantiles using Wald and LRT method
(6-MP group)
Wald LRT
𝑝 lower upper lower upper
0.25 3.726 48.559 6.586 23.058
0.50 15.357 43.225 16.289 51.342
0.75 19.822 93.241 27.522 112.730

Quantiles
Table 6: 95% confidence intervals for different quantiles using Wald and LRT method
(placebo group)
Wald LRT
𝑝 lower upper lower upper
0.25 1.362 10.707 2.031 5.927
0.50 4.499 11.708 5.755 9.488
0.75 8.592 16.863 8.873 16.996

Survivor function
• For Weibull distribution, the expression of survivor function
𝑆(𝑡; 𝛼, 𝛽) = exp ( − (𝑡/𝛼)𝛽 )
• Estimated survivor function

̂
𝑆(𝑡; 𝛼,̂ 𝛽)̂ = exp ( − (𝑡/𝛼)̂ 𝛽 )
par 6-MP placebo

𝛼 33.765 9.482
𝛽 1.354 1.370

1.00
0.75
survival prob 0.50

6−MP
Placebo
0.25
0.00
0 10 20 30
remission time (in weeks)
Figure 5: Comparison survival probabilities of between two treatment groups

1.00
0.75
survival prob
6−MP
0.50
Placebo
0.25
0.00
0 10 20 30
remission time (in weeks)
Figure 6: Comparison of parametric (Weibull) and non-parametric (step-function) estimates of
survivor function using remission time data

Homework
• Obtain Wald and LRT statistics based confidence interval for the survival
probability 𝑆(10)

Log-normal and normal distributions

Log-normal distribution
• 𝑇 follows a log-normal distribution with location parameter 𝜇 and scale

parameter 𝜎 if 𝑌 = log 𝑇 ∼ 𝒩(𝜇, 𝜎2 )
• The pdf and survivor function of log-normal distribution
2
1 1 log 𝑡 − 𝜇
𝑓(𝑡; 𝜇, 𝜎) = √ exp [ − ( ) ] (31)
𝜎𝑡 2𝜋 2 𝜎
log 𝑡 − 𝜇
𝑆(𝑡; 𝜇, 𝜎) = 1 − Φ( ) (32)
𝜎
• 𝜇 and 𝜎 are the parameters of both normal and log-normal distributions

• Φ(⋅) → cumulative distribution funciton of standard normal distribution

• Log-normal distribution is a member of the log-location-scale family of

distributions and the corresponding location-scale distribution is normal with
𝑆0 (𝑧) = 1 − Φ(𝑧) (33)

1 2
𝑓0 (𝑧) = √ 𝑒−𝑧 /2 = 𝜙(𝑧) (34)
2𝜋
• 𝜙(⋅) → pdf of standard normal distribution
• 𝑧 = (𝑦 − 𝜇)/𝜎

• Density function of log-lifetime

1 𝑦−𝜇
𝑓(𝑦; 𝜇, 𝜎) = 𝑓0 ( ) (35)
𝜎 𝜎
1 1 𝑦−𝜇 2
= √ exp [ − ( ) ] (36)
𝜎 2𝜋 2 𝜎
• Survivor function of log-lifetime
𝑦−𝜇 𝑦−𝜇
𝑆(𝑦; 𝜇, 𝜎) = 𝑆0 ( ) = 1 − Φ( ) (37)
𝜎 𝜎

Likelihood function normal distribution
• Data
{(𝑡𝑖 , 𝛿𝑖 ), 𝑖 = 1, … , 𝑛}
ℓ(𝜇, 𝜎) = log ∏ [(1/𝜎) 𝑓0 (𝑧𝑖 )] [𝑆0 (𝑧𝑖 )] (38)
𝑖=1
𝑛 𝑛
= −𝑟 log 𝜎 + ∑ 𝛿𝑖 log 𝑓0 (𝑧𝑖 ) + ∑(1 − 𝛿𝑖 ) log 𝑆0 (𝑧𝑖 ) (39)
𝑖=1 𝑖=1
1 𝑛 𝑛
= −𝑟 log 𝜎 − ∑ 𝛿𝑖 𝑧𝑖2 + ∑(1 − 𝛿𝑖 ) log 𝑆0 (𝑧𝑖 ) (40)
2 𝑖=1 𝑖=1
• 𝑧𝑖 = (𝑦𝑖 − 𝜇)/𝜎 and 𝑦𝑖 = log 𝑡𝑖

𝑛
• 𝑟 = ∑𝑖=1 𝛿𝑖
• Elements of hessian matrix and score function depend on the followings
𝜕 log 𝑓0 (𝑧)
= −𝑧 (41)
𝜕𝑧
𝜕 2 log 𝑓0 (𝑧)
= −1 (42)
𝜕𝑧 2
𝜕 log 𝑆0 (𝑧) 𝑓 (𝑧)
=− 0 (43)
𝜕𝑧 𝑆0 (𝑧)
2
𝜕 2 log 𝑆0 (𝑧) 𝑧𝑓 (𝑧) 𝑓 (𝑧)
2
= 0 −[ 0 ] (44)
𝜕𝑧 𝑆0 (𝑧) 𝑆0 (𝑧)

• MLEs
(𝜇,̂ 𝜎)̂ ′ = arg max ℓ(𝜇, 𝜎) (45)

Θ
(𝜇,̂ 𝜎)̂ ′ ∼ 𝒩((𝜇, 𝜎)′ , 𝑉 ) (46)
where
−1
𝑉 ̂ = [ − 𝐻(𝜇,̂ 𝜎)]
̂
• Confidence intervals of parameters, quantiles, and survival probabilities can be

obtained using the methods described for Weibull models

• Estimate of survivor function (Log-linear distribution) wald statistics based
log 𝑡 − 𝜇̂
𝑆(𝑡; 𝜇,̂ 𝜎)̂ = 1 − Φ( )
𝜎̂
= 1 − Φ(𝜓)̂ (47)
where
log 𝑡 − 𝜇̂
𝜓 ̂ = Φ−1 (1 − 𝑆(𝑡; 𝜇,̂ 𝜎))
̂ =
𝜎̂
• Standard error of 𝜓 ̂
𝑠𝑒(𝜓)̂ = √𝐚′ 𝑉 ̂ 𝐚
where
𝐚 = (−1/𝜎,̂ −𝜓/̂ 𝜎)̂ ′

Estimate of survivor function
• (1 − 𝛼)100% CI of 𝑆(𝑡)
𝐿<𝜓<𝑈
𝐿 < Φ−1 (1 − 𝑆(𝑡; 𝜇, 𝜎)) < 𝑈
Φ(𝐿) < 1 − 𝑆(𝑡; 𝜇, 𝜎) < Φ(𝑈 )
1 − Φ(𝑈 ) < 𝑆(𝑡; 𝜇, 𝜎) < 1 − Φ(𝐿)
where
𝐿 = 𝜓 ̂ − 𝑧1−𝛼/2 𝑠𝑒(𝜓)̂
𝑈 = 𝜓 ̂ + 𝑧1−𝛼/2 𝑠𝑒(𝜓)̂

Estimate of survivor function Lrt based
• LRT statistics based method of obtaining CI for survivor function is described

with
𝐻0 ∶ 𝑆(𝑦0 ) = 𝑆(log 𝑡0 ) = 𝑠0
• The 100(1 − 𝛼)% CI for 𝑆(𝑡) can be obtained from the values of 𝑠0 that satisfy
Λ(𝑠0 ) = 2ℓ(𝜇,̂ 𝜎)̂ − 2ℓ(𝜇,̃ 𝜎)̃ ≤ 𝜒2(1),1−𝛼

• Unrestricted and unrestricted MLEs are obtained as
unrestricted (𝜇,̂ 𝜎)̂ ′ = arg max ℓ(𝜇, 𝜎)

Θ
restricted (𝜇,̃ 𝜎)̃ = arg max ℓ(𝑦0 − 𝜎Φ−1 (1 − 𝑠0 ), 𝜎)
′
Θ
where under 𝐻0 , we can show

𝑦0 − 𝜇
𝑆(𝑦0 ) = 1 − Φ( ) = 𝑠0 ⇒ 𝜇 = 𝑦0 − 𝜎Φ−1 (1 − 𝑠0 )
𝜎

Quantiles
• The expression of estimate of 𝑦𝑝
𝑦𝑝̂ = 𝜇̂ + 𝜎𝑤
̂ 𝑝
where for normal distribution
𝑤𝑝 = 𝑆0−1 (1 − 𝑝) = Φ−1 (𝑝)
• Standard error of 𝑦𝑝̂
𝑠𝑒(𝑦𝑝̂ ) = √𝐚′ 𝑉 ̂ 𝐚
where
𝐚 = (1, 𝑤𝑝 )′

Homework
• Obtain the expressions of Wald-type and LRT based 100(1 − 𝛼)% confidence
intervals of 𝑦𝑝

Example 5.3.1
• Data are available on lifetimes (in thousand miles) of 96 locomotive controls, of

which were failed.
• The test was terminated after 135𝐾 miles, so 59 lifetimes were censored at
135𝐾.

Example 5.3.1
> dat_ex531
## # A tibble: 96 x 2
## time status
## <dbl> <int>
## 1 22.5 1
## 2 37.5 1
## 3 46 1
## 4 48.5 1
## 5 51.5 1
## 6 53 1
## 7 54.5 1
## 8 57.5 1
## 9 66.5 1
## 10 68 1
## # i 86 more rows
Example 5.3.1
> dat_ex531 %>%

count(status)
## status n
## <int> <int>
## 1 0 59
## 2 1 37

Example 5.3.1
Log-normal and normal model fit

mod_LN <- survreg(Surv(time, status) ~ 1,
dist = "lognormal",
data = dat_ex531)
mod_N <- survreg(Surv(log(time), status) ~ 1,

dist = "gaussian",
data = dat_ex531)

MLEs (𝜇,̂ log 𝜎)̂ and corresponding standard errors
> tidy(mod_LN)
## 1 (Intercept) 5.19 0.129 40.3 0
## 2 Log(scale) -0.136 0.131 -1.04 0.297

Estimated variance of (𝜇,̂ log 𝜎)̂
> mod_LN$var
## (Intercept) 0.01657557 0.00983969
## Log(scale) 0.00983969 0.01703353

Estimated variance 𝑉 (𝜇,̂ 𝜎)̂ from 𝑉 (𝜇,̂ log 𝜎)̂
• 𝐺 𝑉 (𝜇,̂ log 𝜎)̂ 𝐺′

## [,1] [,2]
## [1,] 0.01657557 0.00858735
## [2,] 0.00858735 0.01297359
1 0
𝐺=[ ]
0 exp(𝜎)

Table 7: 95% Confidence intervals for 𝜇 and 𝜎
Wald LRT
par lower upper lower upper
𝜇 4.942 5.447 5.000 5.400
𝜎 0.676 1.127 0.709 1.109

• Estimate of 𝑆(80)
log 80 − 𝜇̂
𝑆(80; 𝜇,̂ 𝜎)̂ = 1 − Φ( )
𝜎̂
= 0.824
• 𝜇̂ = 5.195 and 𝜎̂ = 0.873

1.00
0.75
estimate 0.50
log−normal
0.25
PL
0.00
0 50 100 150
time
Figure 7: Comparison of the estimates of survivor function
‘
Table 8: Estimate and confidence interval of 𝑆(80)
95% Wald-type CI
parameter est lower upper
𝑆(80) 0.824 0.667 0.924
• Obtain LRT based 95% CI for 𝑆(80)

Quantiles
• General expression of 𝑝𝑡ℎ quantile of log-lifetime (𝜇̂ = 5.195 and 𝜎̂ = 0.873)
𝑦𝑝̂ = 𝜇̂ + 𝜎𝑤
̂ 𝑝
• 𝑤𝑝 = Φ−1 (𝑝)

Table 9: Estimate and confidence intervals of different quantiles of locomotive controls
lifetime (normal distribution)
95% Wald-type CI
𝑝 𝑤𝑝 𝑦𝑝̂ se lower upper
0.25 -0.674 4.606 0.105
0.50 0.000 5.195 0.129
0.75 0.674 5.783 0.194

Log-logistic and logistic distributions

Log-logistic distribution
• 𝑇 follows a log-logistic distribution with parameters 𝛼 (scale) and 𝛽 (shape) if

𝑌 = log 𝑇 follows a logistic distribution with parameters 𝑢 (location) and 𝑏
(scale)

Log-logistic distribution
• The pdf, survivor, and hazard function of log-logistic distribution
(𝛽/𝛼)(𝑡/𝛼)𝛽−1
𝑓(𝑡; 𝛼, 𝛽) = 2
(48)
[1 + (𝑡/𝛼)𝛽 ]
−1
𝑆(𝑡; 𝛼, 𝛽) = [1 + (𝑡/𝛼)𝛽 ] (49)
(𝛽/𝛼)(𝑡/𝛼)𝛽−1
ℎ(𝑡; 𝛼, 𝛽) = (50)
[1 + (𝑡/𝛼)𝛽 ]

Logistic distribution
• Log-logistic distribution is a member of the log-location-scale family of

distributions and the corresponding location-scale distribution is logistic with
1
𝑆0 (𝑧) = (51)
1 + 𝑒𝑧
𝑒𝑧
𝑓0 (𝑧) = (52)
(1 + 𝑒𝑧 )2
• 𝑧 = (𝑦 − 𝑢)/𝑏

Logistic distribution
• Density function of log-lifetime
1 𝑦−𝑢
𝑓(𝑦; 𝑢, 𝑏) = 𝑓0 ( ) (53)
𝑏 𝑏
(1/𝑏) exp [(𝑦 − 𝑢)/𝑏]
= 2
(54)
{1 + exp [(𝑦 − 𝑢)/𝑏]}
• Survivor function of log-lifetime
𝑦−𝑢
𝑆(𝑦; 𝑢, 𝑏) = 𝑆0 ( ) (55)
𝑏
1
= (56)
1 + exp [(𝑦 − 𝑢)/𝑏]

• Data: {(𝑡𝑖 , 𝛿𝑖 ), 𝑖 = 1, … , 𝑛}
ℓ(𝜇, 𝜎) = log ∏ [(1/𝑏) 𝑓0 (𝑧𝑖 )] [𝑆0 (𝑧𝑖 )]
𝑖=1
𝑛 𝑛
= −𝑟 log 𝑏 + ∑ 𝛿𝑖 log 𝑓0 (𝑧𝑖 ) + ∑(1 − 𝛿𝑖 ) log 𝑆0 (𝑧𝑖 )
𝑖=1 𝑖=1
𝑛
= −𝑟 log 𝑏 + ∑ [𝛿𝑖 {𝑧𝑖 − log(1 + 𝑒𝑧𝑖 )} − log(1 + 𝑒𝑧𝑖 )]
𝑖=1
• 𝑧𝑖 = (𝑦𝑖 − 𝑢)/𝑏 and 𝑦𝑖 = log 𝑡𝑖

𝑛
• 𝑟 = ∑𝑖=1 𝛿𝑖

• Elements of hessian matrix and score function depend on the followings
𝜕 log 𝑓0 (𝑧) 2𝑒𝑧

=1− (57)
𝜕𝑧 1 + 𝑒𝑧
2
𝜕 log 𝑓0 (𝑧)
= −2𝑓0 (𝑧) (58)
𝜕𝑧 2
𝜕 log 𝑆0 (𝑧) −𝑒𝑧
= (59)
𝜕𝑧 1 + 𝑒𝑧
𝜕 2 log 𝑆0 (𝑧) −𝑒𝑧
= (60)
𝜕𝑧2 (1 + 𝑒𝑧 )2

• MLEs
(𝑢,̂ 𝑏)̂ ′ = arg max ℓ(𝑢, 𝑏) (61)

Θ
(𝑢,̂ 𝑏)̂ ′ ∼ 𝒩((𝑢, 𝑏)′ , 𝑉 ) (62)
where
−1
̂
𝑉 ̂ = [ − 𝐻(𝑢,̂ 𝑏)]
• Confidence intervals of parameters, quantiles, and survival probabilities can be

obtained using the methods described for Weibull models

• Estimate of survivor function (logistic distribution)
𝑦 − 𝑢̂ 1
𝑆0 ( ) = 𝑆(𝑦; 𝑢,̂ 𝑏)̂ = (63)
𝑏̂ ̂ 𝑏]̂
1 + exp [(𝑦 − 𝑢)/
1 − 𝑆(𝑦) 𝑦 − 𝑢̂
log [ ]= = 𝜓 ̂ = 𝑆0−1 (𝑆(𝑦)) (64)
𝑆(𝑦) 𝑏 ̂
• Standard error of 𝜓 ̂
𝑠𝑒(𝜓)̂ = √𝐚′ 𝑉 ̂ 𝐚, where 𝐚 = (−1/𝑏,̂ −𝜓/̂ 𝑏)̂ ′

(1 − 𝛼)100% CI of 𝑆(𝑡)
𝐿<𝜓<𝑈
1 − 𝑆(𝑦)
𝐿 < log <𝑈
𝑆(𝑦)
1 − 𝑆(𝑦)
exp(𝐿) < < exp(𝑈 )
𝑆(𝑦)
1 − 𝑆(𝑦)
1 + exp(𝐿) < 1 + < 1 + exp(𝐿)
𝑆(𝑦)
1 1
< 𝑆(𝑦) <
1 + exp(𝑈 ) 1 + exp(𝐿)
where
𝐿 = 𝜓 ̂ − 𝑧1−𝛼/2 𝑠𝑒(𝜓)̂
𝑈 = 𝜓 ̂ + 𝑧1−𝛼/2 𝑠𝑒(𝜓)̂
Estimate of survivor function Lrt based
• LRT statistics based method of obtaining CI for survivor function is described

with
𝐻0 ∶ 𝑆(𝑦0 ) = 𝑆(log 𝑡0 ) = 𝑠0
• The 100(1 − 𝛼)% CI for 𝑆(𝑡) can be obtained from the values of 𝑠0 that satisfy
Λ(𝑠0 ) ≤ 𝜒2(1),1−𝛼
where
Λ(𝑠0 ) = 2ℓ(𝑢,̂ 𝑏)̂ − 2ℓ(𝑢,̃ 𝑏)̃ (65)

• Unrestricted and unrestricted MLEs are obtained as
unrestricted (𝑢,̂ 𝑏)̂ ′ = arg max ℓ(𝑢, 𝑏)

Θ
restricted (𝑢,̃ 𝑏)̃ = arg max ℓ(𝑦0 − 𝑏 log {(1 − 𝑠0 )/𝑠0 }, 𝑏)

′
Θ
where under 𝐻0 , we can show
1 − 𝑠0
𝑆(𝑦0 ) = 𝑠0 ⇒ 𝑢 = 𝑦0 − 𝑏 log
𝑠0

Quantiles
• The expression of estimate of 𝑦𝑝
̂ 𝑝
𝑦𝑝̂ = 𝑢̂ + 𝑏𝑤
where for normal distribution

𝑝
𝑤𝑝 = 𝑆0−1 (1 − 𝑝) = log
1−𝑝
• Standard error of 𝑦𝑝̂
𝑠𝑒(𝑦𝑝̂ ) = √𝐚′ 𝑉 ̂ 𝐚
where
𝐚 = (1, 𝑤𝑝 )′
Homework
• Obtain the expressions of Wald-type and LRT based 100(1 − 𝛼)% confidence
intervals of 𝑦𝑝

Example 5.3.1
• Data are available on lifetimes (in thousand miles) of 96 locomotive controls, of

which were failed.
• The test was terminated after 135𝐾 miles, so 59 lifetimes were censored at
135𝐾.

Example 5.3.1
> dat_ex531
## # A tibble: 96 x 2
## time status
## <dbl> <int>
## 1 22.5 1
## 2 37.5 1
## 3 46 1
## 4 48.5 1
## 5 51.5 1
## 6 53 1
## 7 54.5 1
## 8 57.5 1
## 9 66.5 1
## 10 68 1
## # i 86 more rows
Example 5.3.1
> dat_ex531 %>%

count(status)
## status n
## <int> <int>
## 1 0 59
## 2 1 37

Example 5.3.1
Log-logistic and logistic model fit

> mod_LL <- survreg(Surv(time, status) ~ 1,
> dist = "loglogistic",
> data = dat_ex531)
> mod_L <- survreg(Surv(log(time), status) ~ 1,

> dist = "logistic",
> data = dat_ex531)

MLEs (𝑢,̂ log 𝑏)̂
## [1] 5.1206418 -0.8266704
Estimated variance of (𝑢,̂ log 𝑏)̂
## (Intercept) 0.010490062 0.007837215
## Log(scale) 0.007837215 0.022515937

MLEs of (𝑢,̂ 𝑏)̂
## [1] 5.1206418 0.4375036
Estimated variance of (𝑢,̂ 𝑏)̂
## [,1] [,2]
## [1,] 0.010490062 0.003428809
## [2,] 0.003428809 0.004309761

Table 10: 95% Confidence intervals for location and scale parameters
Wald LRT
dist par est lower upper lower upper
Logistic 𝑢 5.121 4.920 5.321 5.000 5.300
𝑏 0.438 0.326 0.587 0.360 0.559
Gaussian 𝜇 5.195 4.942 5.447 5.000 5.400
𝜎 0.873 0.676 1.127 0.709 1.109

• Estimate of 𝑆(80) (log-logistic distribution)
1
𝑆(80; 𝑢,̂ 𝑏)̂ =
̂ 𝑏]̂
1 + exp [(log 80 − 𝑢)/
= 0.844
• 𝑢̂ = 5.121 and 𝑏̂ = 0.438

1.00
0.75
estimate 0.50
log−logistic
0.25 log−normal
PL
0.00
0 50 100 150 200 250
time
Figure 8: Comparison of the estimates of survivor function
‘
Table 11: Estimate and corresponding Wald-type confidence interval of the survival
probability 𝑆(80)
95% CI
dist est lower upper
Log-logistic 0.844 0.566 0.957
Log-normal 0.824 0.667 0.924
• Obtain LRT based 95% CI for 𝑆(80)

Quantiles
• General expression of 𝑝𝑡ℎ quantile of log-lifetime (𝑢̂ = 5.121 and 𝑏̂ = 0.438)
̂ 𝑝
𝑦𝑝̂ = 𝑢̂ + 𝑏𝑤
𝑝
• 𝑤𝑝 = log 1−𝑝

Table 12: Estimate and confidence intervals of different quantiles
95% Wald-type CI
dist 𝑝 𝑤𝑝 𝑦𝑝̂ se lower upper
Logistic 0.25 -1.099 4.640 0.143
0.50 0.000 5.121 0.102
0.75 1.099 5.601 0.234
Gaussian 0.25 -0.674 4.826 0.101
0.50 0.000 5.121 0.102
0.75 0.674 5.416 0.177

Homework
• Analyze the locomotive control lifetimes using Weibull model and compare the
results


• Let 𝑇𝑗𝑖 be the lifetime of 𝑖𝑡ℎ subject of the 𝑗𝑡ℎ group (𝑖 = 1, … , 𝑛𝑗 ,

𝑗 = 1, … , 𝑚)
• Assume 𝑇𝑗𝑖 follows a distribution of log-location-scale family with parameters
𝛼𝑗 (scale) and 𝛽𝑗 (shape)
• The corresponding distribution of log-lifetime 𝑌𝑗𝑖 = log 𝑇𝑗𝑖 is of a location-scale
family distribution with parameters 𝑢𝑗 (location) and 𝑏𝑗 (scale)
𝑢𝑗 = log 𝛼𝑗 𝑎𝑛𝑑 𝑏𝑗 = (1/𝛽𝑗 )

Survivor functions
• The survivor function of 𝑌𝑗𝑖 = log 𝑇𝑗𝑖

𝑦 − 𝑢𝑗
𝑆𝑗 (𝑦) = 𝑆0 ( ) (66)
𝑏𝑗
• The survivor function of 𝑇𝑗𝑖
𝑆𝑗 (𝑡) = 𝑆0⋆ [(𝑡/𝛼𝑗 )𝛽𝑗 ] (67)
• 𝑆0⋆ (𝑥) = 𝑆0 (log 𝑥)

• 𝑢𝑗 = log 𝛼𝑗
• 𝑏𝑗 = (1/𝛽𝑗 )

• Comparison of several normal populations is a well-known problem in statistics,
where equal population variances are assumed, and the comparisons are
performed on the basis of equality of population means

Quantile
• General expression of the 𝑝𝑡ℎ quantile of the 𝑗𝑡ℎ population takes the form
𝑦𝑗𝑝 = 𝑢𝑗 + 𝑏𝑗 𝑤𝑝 , 𝑗 = 1, … , 𝑚 (68)
• 𝑤𝑝 = 𝑆0−1 (1 − 𝑝)

Equality of two populations
• When the scales are not equal (i.e. 𝑏1 ≠ 𝑏2 ), the difference between the 𝑝𝑡ℎ
quantiles does depend on the probability 𝑝 wp=p er function
𝑦1𝑝 − 𝑦2𝑝 = 𝑢1 − 𝑢2 + 𝑤𝑝 (𝑏1 − 𝑏2 ) (69)
• Under the assumption of equality of the scales (i.e. 𝑏1 = 𝑏2 ), difference between

𝑝𝑡ℎ (log-lifetime) quantile of a pair of populations (say 1 and 2) is constant,
i.e. it does not depend on the probability 𝑝 ∈ (0, 1)
𝑦1𝑝 − 𝑦2𝑝 = 𝑢1 − 𝑢2 =const (70)

when scale parameters are equal (b1=b2)
• The difference between two log-lifetime quantiles can be expressed in terms of
the ratio of lifetime quantiles
𝑦1𝑝 − 𝑦2𝑝 = 𝑢1 − 𝑢2 (70)

log 𝑡1𝑝 − log 𝑡2𝑝 = log 𝛼1 − log 𝛼2
𝑡1𝑝 /𝑡2𝑝 = 𝛼1 /𝛼2 (71)
• The ratio of the 𝑝𝑡ℎ quantiles of two lifetime distributions does not depend on
the probability 𝑝 when the corresponding shape parameters are equal (𝛽1 = 𝛽2 )

• Equality of all quantiles of two distributions, i.e.
𝑦1𝑝 = 𝑦2𝑝 ∀ 𝑝 ∈ (0, 1),
corresponds to equality of two distributions, i.e.
𝑆1 (𝑦) = 𝑆2 (𝑦)
• Under the assumption of common scale (shape for lifetime) parameter, the null
hypothesis of equality of two distributions can be expressed as
𝐻0 ∶ 𝑢1 − 𝑢2 = 0 or 𝐻0 ∶ (𝛼1 /𝛼2 ) = 1 (72)

• Equality of two populations with survivor functions (say 𝑆1 and 𝑆2 ) can be
expressed in terms of survivor functions
• Since
𝑦1𝑝 = 𝑦2𝑝 + 𝑢1 − 𝑢2 or 𝑡1𝑝 = 𝑡2𝑝 (𝛼1 /𝛼2 ),
the corresponding survivor functions can be expressed as
𝑆1 (𝑦 + 𝑢1 − 𝑢2 ) = 𝑆2 (𝑦) (73)
𝑆1 (𝑡(𝛼1 /𝛼2 )) = 𝑆2 (𝑡) (74)
• That is, the survivor functions for 𝑌 are translations of one another by an
amount (𝑢1 − 𝑢2 ) along the 𝑦-axis

Wald-type statistic
• Data {(𝑡𝑗𝑖 , 𝛿𝑗𝑖 ), 𝑖 = 1, 2} and 𝑦𝑗𝑖 = log 𝑡𝑗𝑖
• Two populations can be compared in terms of 𝑝𝑡ℎ quantile
𝐻0 ∶ 𝑦1𝑝 = 𝑦2𝑝
• Corresponding pivotal quantity
(𝑦1𝑝
̂ − 𝑦2𝑝
̂ ) − (𝑦1𝑝 − 𝑦2𝑝 )
𝑍𝑝 = 1/2
∼ 𝒩(0, 1) under 𝐻0
[𝑣𝑎𝑟(𝑦1𝑝
̂ ) + 𝑣𝑎𝑟(𝑦2𝑝 )]
• The statistic 𝑍𝑝 can be used to obtain confidence interval for (𝑦1𝑝 − 𝑦2𝑝 )

• To test 𝐻0 ∶ 𝑏1 = 𝑏2 , the following pivotal quantity can be considered
(log 𝑏̂1 − log 𝑏̂2 ) − (log 𝑏1 − log 𝑏2 )

𝑍𝑏 = ∼ 𝒩(0, 1) under 𝐻0
[𝑣𝑎𝑟(log 𝑏̂1 ) + 𝑣𝑎𝑟(log 𝑏̂2 )]1/2
• The statistic 𝑍𝑏 can be used to obtain confidence interval for (𝑏1 /𝑏2 )

• When scales are equal, two populations can be compared with respect their
location parameter 𝐻0 ∶ 𝑢1 = 𝑢2
• The corresponding pivotal quantity
(𝑢̂1 − 𝑢̂2 ) − (𝑢1 − 𝑢2 )

𝑍𝑢 = ∼ 𝒩(0, 1) under 𝐻0
[𝑣𝑎𝑟(𝑢̂1 ) + 𝑣𝑎𝑟(𝑢̂2 )]1/2
• The statistic 𝑍𝑢 can be used to obtain confidence interval for (𝑢1 − 𝑢2 )

• Wald statistic cannot be used to test
𝐻0 ∶ 𝑢1 = 𝑢2 , 𝑏1 = 𝑏2

LRT based inference
• Data {(𝑡𝑗𝑖 , 𝛿𝑗𝑖 ), 𝑗 = 1, … , 𝑚, 𝑖 = 1, … , 𝑛𝑗 } and 𝑦𝑗𝑖 = log 𝑡𝑗𝑖
• Different tests and confidence intervals of interest

1. 𝐻0 ∶ 𝑏1 = ⋯ = 𝑏𝑚
2. Confidence interval for (𝑏1 /𝑏2 )
3. Equality of several location parameters when scale parameters are equal
𝐻0 ∶ 𝑢1 = ⋯ = 𝑢𝑚 , 𝑏1 = ⋯ = 𝑏𝑚
𝐻1 ∶ all 𝑢𝑗 ’s are not equal, 𝑏1 = ⋯ = 𝑏𝑚
4. Confident interval for (𝑢1 − 𝑢2 ) when 𝑏1 = 𝑏2

5. Confidence interval for (𝑦1𝑝 − 𝑦2𝑝 ) when 𝑏1 ≠ 𝑏2

Case 1
• Hypothesis of interest
𝐻0 ∶ 𝑏1 = ⋯ = 𝑏𝑚 = 𝑏 (say) (75)
𝑚
ℓ(𝑢1 , … , 𝑢𝑚 , 𝑏1 , … , 𝑏𝑚 ) = ∑ ℓ𝑗 (𝑢𝑗 , 𝑏𝑗 ) (76)
𝑗=1
• Contribution to log-likelihood function for the 𝑗𝑡ℎ population

𝑛𝑗
ℓ𝑗 (𝑢𝑗 , 𝑏𝑗 ) = −𝑟𝑗 log 𝑏𝑗 + ∑ [𝛿𝑖 log 𝑓0 (𝑧𝑗𝑖 ) + (1 − 𝛿𝑗𝑖 ) log 𝑆0 (𝑧𝑗𝑖 )]

𝑖=1
• 𝑟𝑗 = ∑𝑖 𝛿𝑗𝑖
Case 1
• LRT statistic
Λ = 2ℓ(𝑢̂1 , … , 𝑢̂𝑚 , 𝑏̂1 , … , 𝑏̂𝑚 ) − 2ℓ(𝑢̃1 , … , 𝑢̃𝑚 , 𝑏,̃ … , 𝑏)̃ (77)
• Λ ∼ 𝜒2(𝑚−1) under the null hypothesis defined in (75)
• MLEs
• (𝑢̂𝑗 , 𝑏̂𝑗 )′ = arg maxΘ ℓ𝑗 (𝑢𝑗 , 𝑏𝑗 ), 𝑗 = 1, … , 𝑚
• (𝑢̃1 , … , 𝑢̃𝑚 , 𝑏,̃ … , 𝑏)̃ ′ = arg maxΘ ℓ(𝑢1 , … , 𝑢𝑚 , 𝑏, … , 𝑏)

Case 2
• To obtain confidence interval of (𝑏1 /𝑏2 ), consider
𝐻0 ∶ (𝑏1 /𝑏2 ) = 𝑎 ⇒ 𝐻0 ∶ 𝑏1 = 𝑎𝑏2
• 100(1 − 𝛼)% confidence interval of (𝑏1 /𝑏2 ) can be obtained from the range of
𝑎 values that satisfy
Λ(𝑎) ≤ 𝜒2(1),1−𝛼 ,
where the LRT statistic
Λ(𝑎) = 2ℓ(𝑢̂1 , 𝑢̂2 , 𝑏̂1 , 𝑏̂2 ) − 2ℓ(𝑢̃1 , 𝑢̃2 , 𝑎𝑏̃2 , 𝑏̃2 )
• (𝑢̂𝑗 , 𝑏̂𝑗 )′ = arg maxΘ ℓ𝑗 (𝑢𝑗 , 𝑏𝑗 ), 𝑗 = 1, 2
• (𝑢̃1 , 𝑢̃2 , 𝑏̃2 )′ = arg maxΘ ℓ(𝑢1 , 𝑢2 , 𝑎𝑏2 , 𝑏2 )

Case 3
• Test equality of several location parameters when scale parameters are equal
𝐻0 ∶ 𝑢 1 = ⋯ = 𝑢 𝑚 , 𝑏 1 = ⋯ = 𝑏 𝑚 (78)
𝐻1 ∶ all 𝑢𝑗 ’s are not equal, 𝑏1 = ⋯ = 𝑏𝑚 (79)

Case 3
• MLEs
• under 𝐻0 , (𝑢⋆ , 𝑏⋆ ) = arg maxΘ ℓ(𝑢, … , 𝑢, 𝑏, … , 𝑏)
• under 𝐻1 , (𝑢̃1 , … , 𝑢̃𝑚 , 𝑏)̃ = arg maxΘ ℓ(𝑢1 , … , 𝑢𝑚 , 𝑏, … , 𝑏)
• LRT statistic
Λ = 2ℓ(𝑢̃1 , … , 𝑢̃𝑚 , 𝑏,̃ … , 𝑏)̃ − 2ℓ(𝑢⋆ , … , 𝑢⋆ , 𝑏⋆ , … , 𝑏⋆ )
• Under the null hypothesis (78), Λ follows 𝜒2(𝑚−1) distribution

Case 4
• To obtain a confidence interval of (𝑢1 − 𝑢2 ) when 𝑏1 = 𝑏2 , consider the null
and alternative hypothesis
𝐻0 ∶ 𝑢1 − 𝑢2 = 𝛿, 𝑏1 = 𝑏2 vs 𝐻1 ∶ 𝑢1 − 𝑢2 ≠ 𝛿, 𝑏1 = 𝑏2
• LRT statistic
Λ(𝛿) = 2ℓ(𝑢̃1 , 𝑢̃2 , 𝑏,̃ 𝑏)̃ − 2ℓ(𝑢⋆2 + 𝛿, 𝑢⋆2 , 𝑏⋆ , 𝑏⋆ )
• under 𝐻0 , (𝑢⋆ , 𝑏⋆ ) = arg maxΘ ℓ(𝑢, 𝑢, 𝑏, 𝑏)
• under 𝐻1 , (𝑢̃1 , 𝑢̃2 , 𝑏)̃ = arg maxΘ ℓ(𝑢1 , 𝑢2 , 𝑏, 𝑏)
• 100(1 − 𝛼) confidence interval for (𝑢1 − 𝑢2 ) can be obtained from the set of 𝛿
values that satisfy Λ(𝛿) ≤ 𝜒2(1),1−𝛼
Case 5
• When 𝑏1 ≠ 𝑏2 , to obtain confidence interval for (𝑦1𝑝 − 𝑦2𝑝 ) consider the

following hypothesis
𝐻0 ∶ 𝑦1𝑝 − 𝑦2𝑝 = Δ ⇒ 𝐻0 ∶ 𝑢1 − 𝑢2 = Δ + (𝑏2 − 𝑏1 )𝑤𝑝
• 𝑤𝑝 = 𝑆0−1 (1 − 𝑝)
• LRT statistic
Λ(Δ) = 2ℓ(𝑢̂1 , 𝑢̂2 , 𝑏̂1 , 𝑏̂2 ) − 2ℓ(𝑢̃1 , 𝑢̃2 , 𝑏̃1 , 𝑏̃2 )

Case 5
• under 𝐻0
(𝑢̃1 , 𝑢̃2 , 𝑏̃1 , 𝑏̃2 ) = arg max ℓ(𝑢2 + Δ + (𝑏2 − 𝑏1 )𝑤𝑝 , 𝑢2 , 𝑏1 , 𝑏2 )

Θ
• under 𝐻1
(𝑢̂1 , 𝑢̂2 , 𝑏̂2 , 𝑏̂1 ) = arg max ℓ(𝑢1 , 𝑢2 , 𝑏1 , 𝑏2 )
Θ
• 100(1 − 𝛼) confidence interval for (𝑦1𝑝 − 𝑦2𝑝 ) can be obtained from the set of
Δ values that satisfy Λ(Δ) ≤ 𝜒2(1),1−𝛼

Comparison of Weibull or extreme value distributions
• Assume 𝑇𝑗𝑖 ∼ Weibull(𝛼𝑗 , 𝛽𝑗 ) (𝑗 = 1, … , 𝑚, 𝑖 = 1, … , 𝑛𝑗 )

• Data {(𝑡𝑗𝑖 , 𝛿𝑗𝑖 ), 𝑗 = 1, … , 𝑚, 𝑖 = 1, … , 𝑛𝑗 }
• Survivor function of Weibull distribution
𝑆𝑗 (𝑡) = exp [ − (𝑡/𝛼𝑗 )𝛽𝑗 ]
• Survivor function of extreme value distribution
𝑆𝑗 (𝑦) = exp [ − 𝑒(𝑦−𝑢𝑗 )/𝑏𝑗 ]
• 𝑢𝑗 = log 𝛼𝑗
• 𝑏𝑗 = 1/𝛽𝑗

Example 5.4.1
• Data of the following table are on the time to breakdown of electrical insulating
fluid subject to a constant voltage stress in a lifetest experiment

Table 13: Estimate of voltage-specific extreme value models
voltage 𝑢̂𝑗 ± 𝑠𝑒(𝑢̂𝑗 ) 𝑏̂𝑗 ± 𝑠𝑒(𝑏̂𝑗 )

26 6.862 ± 1.104 1.834 ± 0.885
28 5.865 ± 0.486 1.022 ± 0.474
30 4.351 ± 0.302 0.944 ± 0.303
32 3.256 ± 0.486 1.781 ± 0.254
34 2.503 ± 0.315 1.297 ± 0.211
36 1.457 ± 0.309 1.125 ± 0.221
38 0.001 ± 0.273 0.734 ± 0.367

1.00
Voltage
0.75 28
30
0.50 32
y
34
36
0.25
38
0.00
0 50 100 150
x
Figure 9: Comparison of estimated survivor function

LRT (Case 1)
• Null hypothesis
𝐻0 ∶ 𝑏1 = ⋯ = 𝑏7
• LRT statistic
Λ = 2ℓ(𝑢̂1 , … , 𝑢̂7 , 𝑏̂1 , … , 𝑏̂7 ) − 2ℓ(𝑢̃1 , … , 𝑢̃7 , 𝑏,̃ … , 𝑏)̃

= 2(−132.181) − 2(−136.578)
= 8.794
• p-value
𝑃 𝑟(𝜒2(6) ≥ Λ) = 0.185
It does not provide enough evidence to reject the null hypothesis of equality of
the scale parameters.

Confidence interval of (𝑏1/𝑏2) (Case 2)
• Wald-type
(log 𝑏̂1 − log 𝑏̂2 ) ± 𝑧1−𝛼/2 𝑠𝑒(log 𝑏̂1 − log 𝑏̂2 )

̂ ̂
(𝑏̂1 /𝑏̂2 ) 𝑒±𝑧1−𝛼/2 𝑠𝑒(log 𝑏1 −log 𝑏2 )
(1.834/1.022) 𝑒± (1.96)(0.624)
0.529 < (𝑏1 /𝑏2 ) < 6.095
• Similarly confidence intervals for (𝑏𝑗 /𝑏𝑗′ ) 𝑗 > 𝑗′ can be obtained

6:7
5:7
5:6
4:7
4:6
4:5
3:7
3:6
comparison
3:5
3:4
2:7
2:6
2:5
2:4
2:3
1:7
1:6
1:5
1:4
1:3
1:2
0
8
^ ^
bj bj′
Figure 10: Estimate and 95% confidence interval of pair-wise comparisons of scale parameters
(𝑏𝑗 /𝑏𝑗′ )

Case 3
• Equality of all location parameters when scales are equal
𝐻0 ∶ 𝑢 1 = ⋯ = 𝑢 𝑚 , 𝑏 1 = ⋯ = 𝑏 𝑚
• LRT statistic
Λ = 2ℓ(𝑢̃1 , … , 𝑢̃𝑚 , 𝑏,̃ … , 𝑏)̃ − 2ℓ(𝑢⋆ , … , 𝑢⋆ , 𝑏⋆ , … , 𝑏⋆ )

= 2(−136.578) − 2(−176.584)
= 80.013
• p-value 𝑃 (𝜒2(1) > 80.013) < .001 → There is a strong evidence against the
assumption of equality of 𝑚 location parameters

Case 4
• Wald-type confidence interval of (𝑢1 − 𝑢2 )
(𝑢̂1 − 𝑢̂2 ) ± 𝑧1−𝛼/2 𝑠𝑒(𝑢̂1 − 𝑢̂2 )

(6.862 − 4.351) ± (1.96)(1.206)
−1.367 < (𝑢1 − 𝑢2 ) < 3.361
• There is no significant difference between 𝑢1 and 𝑢2

6:7
5:7
5:6
4:7
4:6
4:5
3:7
3:6
comparison
3:5
3:4
2:7
2:6
2:5
2:4
2:3
1:7
1:6
1:5
1:4
1:3
1:2
0.0
2.5
5.0
7.5
^ −u
u ^
j j′
Figure 11: Estimate and 95% confidence interval of pair-wise comparisons of location
parameters (𝑢𝑗 − 𝑢𝑗′ )

Case 5
• General expression of 𝑝𝑡ℎ quantile of the group 𝑗
𝑦𝑗𝑝 = 𝑢𝑗 + 𝑏𝑗 𝑤𝑝 , 𝑗 = 1, … , 𝑚
• Difference of 𝑝𝑡ℎ quantile between groups 1 and 2
𝑦1𝑝 − 𝑦2𝑝 = 𝑢1 − 𝑢2 + (𝑏1 − 𝑏2 )𝑤𝑝
• 95% confidence interval for the difference of median between groups 1 and 2
𝑦1𝑚
̂ − 𝑦2𝑚
̂ ± 𝑧1−𝛼/2 𝑠𝑒(𝑦1𝑚
̂ − 𝑦2𝑚
̂ )
(6.19 − 5.491) ± (1.96)(1.291)
−1.831 < (𝑦1𝑚 − 𝑦2𝑚 ) < 3.231

6:7
5:7
5:6
4:7
4:6
4:5
3:7
3:6
comparison
3:5
3:4
2:7
2:6
2:5
2:4
2:3
1:7
1:6
1:5
1:4
1:3
1:2 0.0
2.5
5.0
7.5
y^pj − y^pj′
Figure 12: Estimate and 95% confidence interval of pair-wise comparisons of medians
(𝑦𝑗,.5 − 𝑦𝑗′ ,.5 )

Homework
• Analyse the breakdown time data using log-logistic and log-normal distributions
and compare the results with that of Weibull distribution

References I

405 (4,5) PDF

Uploaded by

Document Information

Original Title

Copyright

Available Formats

Share this document

Share or Embed Document

Sharing Options

Did you find this document useful?

Is this content inappropriate?

Copyright:

Available Formats

405 (4,5) PDF

Uploaded by

Copyright:

Available Formats

Lifetime Data Analysis

Inference Procedures for Parametric Models

Mahbub Latif Lifetime Data Analysis October 2023 1 / 142

Inference procedures for parametric models

Mahbub Latif Lifetime Data Analysis October 2023 3 / 142

Mahbub Latif Lifetime Data Analysis October 2023 4 / 142

• Likelihood methods for lifetime data were introduced in Chapter 2, which

Mahbub Latif Lifetime Data Analysis October 2023 5 / 142

• Maximum likelihood estimator (MLE)

𝜽 ̂ = arg max ℓ(𝜽) (3)

Mahbub Latif Lifetime Data Analysis October 2023 6 / 142

• Large sample property of MLE

𝜽 ̂ ∼ 𝒩(𝜽, [ℐ(𝜽)]−1 ) (4)

• Fisher expected information matrix

Mahbub Latif Lifetime Data Analysis October 2023 7 / 142

Mahbub Latif Lifetime Data Analysis October 2023 8 / 142

Mahbub Latif Lifetime Data Analysis October 2023 9 / 142

• Let lifetime 𝑇𝑖 (𝑖 = 1, … , 𝑛) follows a distribution with pdf 𝑓(𝑡𝑖 ; 𝜽) and the

Mahbub Latif Lifetime Data Analysis October 2023 10 / 142

• For censored samples, the result “asymptotic distribution of MLE is normal” is

Mahbub Latif Lifetime Data Analysis October 2023 11 / 142

Mahbub Latif Lifetime Data Analysis October 2023 12 / 142

Mahbub Latif Lifetime Data Analysis October 2023 13 / 142

𝑓(𝑡; 𝜃) = (1/𝜃) exp(−𝑡/𝜃) 𝑡 ⩾ 0, 𝜃 > 0 (9)

ℎ(𝑡; 𝜃) = (1/𝜃) (10)

𝑆(𝑡; 𝜃) = exp(−𝑡/𝜃) (11)

𝑆(𝑡𝑝 ; 𝜃) = 1 − 𝑝 ⇒ exp(−𝑡𝑝 ; 𝜃) = 1 − 𝑝 ⇒ 𝑡𝑝 = −𝜃 log(1 − 𝑝)

Mahbub Latif Lifetime Data Analysis October 2023 14 / 142

• Estimation and related inference of exponential distribution when the sample

Mahbub Latif Lifetime Data Analysis October 2023 15 / 142

• Lifetime 𝑇 ∼ Exp(𝜃), 𝜃 > 0

Mahbub Latif Lifetime Data Analysis October 2023 16 / 142

• 𝑟 = ∑𝑖 𝛿𝑖 → the number of failures observed in the sample

Mahbub Latif Lifetime Data Analysis October 2023 17 / 142

• Assuming 𝑟 > 0 and no finite MLE exist for 𝑟 = 0

• Replacing the parameter 𝜃 by its MLE 𝜃 ̂ = ∑𝑖 𝑡𝑖 /𝑟 in (15), the observed

Mahbub Latif Lifetime Data Analysis October 2023 20 / 142

ℓ(𝜃) = −(1/𝜃) ∑ 𝑡𝑖 − 𝑟 log 𝜃 (16)

ℓ1 (𝜙)= −𝜙3 ∑ 𝑡𝑖 + 3𝑟 log(𝜙) (17)

Mahbub Latif Lifetime Data Analysis October 2023 21 / 142

• Approximation of 𝑍2 is quite accurate compared to that of 𝑍1

Mahbub Latif Lifetime Data Analysis October 2023 22 / 142

ℓ(𝜃) = −(1/𝜃) ∑ 𝑡𝑖 − 𝑟 log(𝜃)

Mahbub Latif Lifetime Data Analysis October 2023 23 / 142

Mahbub Latif Lifetime Data Analysis October 2023 24 / 142

𝑆(𝑡; 𝜃) = exp(−𝑡/𝜃) or ℎ(𝑡; 𝜃) = 1/𝜃

• Let 100(1 − 𝛼)% confidence interval for 𝜃

where Data = {(𝑡𝑖 , 𝛿𝑖 ), 𝑖 = 1, … , 𝑛}

Mahbub Latif Lifetime Data Analysis October 2023 25 / 142

• 100(1 − 𝛼)% confidence interval for 𝑆(𝑡0 ; 𝜃) = exp(−𝑡0 /𝜃)

Mahbub Latif Lifetime Data Analysis October 2023 26 / 142

• 100(1 − 𝛼)% confidence interval for ℎ(𝑡0 ; 𝜃) = 1/𝜃

(1/𝑈 (Data)) ⩽ ℎ(𝑡0 ; 𝜃) ⩽ (1/𝐿(Data))

Mahbub Latif Lifetime Data Analysis October 2023 27 / 142

• Lifetimes (in days) of 10 pieces of equipment

• Assume lifetimes follow exponential distribution with parameter, i.e. 𝑇𝑖 ∼ Exp(𝜃)

Mahbub Latif Lifetime Data Analysis October 2023 28 / 142

Mahbub Latif Lifetime Data Analysis October 2023 29 / 142

• Confidence interval for 𝜙

0.21 ⩽ 𝜙 ⩽ 0.35 ⇒ 0.21 ⩽ 𝜃−1/3 ⩽ 0.35

• Likelihood ratio statistic

Λ(𝜃) ≤ 𝜒2(1),.95 = 3.84

Mahbub Latif Lifetime Data Analysis October 2023 31 / 142