Slide Week 1
Survival Analysis
Lu Tian and Richard Olshen
Stanford University
Survival time T
3. Cumulative hazard function: $H(t) = \int_0^t h(u)\,du$
• Hazard function: $h(t) = f(t)/S(t)$.
Hazard functions
• The hazard function h(t) is NOT the probability that the event
(such as death) occurs at time t or before time t
• h(t)dt is approximately the conditional probability that the
event occurs within the interval [t, t + dt] given that the event
has not occurred before time t.
• If the hazard function h(t) increases by x% on [0, τ], the
probability of failure before τ does not, in general, increase by
x%.
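The last point can be checked numerically. The following is a minimal sketch in R that assumes a constant hazard h on [0, τ] (an illustrative choice, not from the slides), for which pr(T ≤ τ) = 1 − exp(−hτ). The helper `fail_prob` and the values of `h` and `tau` are hypothetical.

```r
# Constant hazard h on [0, tau]: pr(T <= tau) = 1 - exp(-h * tau).
# fail_prob() and the values of h and tau are illustrative assumptions.
fail_prob <- function(h, tau) 1 - exp(-h * tau)

h   <- 0.5
tau <- 2
p_base    <- fail_prob(h, tau)       # failure probability under hazard h
p_doubled <- fail_prob(2 * h, tau)   # hazard increased by 100%

# Relative increase in the failure probability
rel_increase <- (p_doubled - p_base) / p_base
print(c(p_base = p_base, p_doubled = p_doubled, rel_increase = rel_increase))
```

In this setting the hazard doubles while the failure probability grows by well under 100%, because the probability is bounded by 1 and depends on the hazard through the exponential of the cumulative hazard.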
Why hazard
• Interpretability.
• Analytic simplification.
• Modeling sensibility.
• In general, it could be fairly straightforward to understand how
the hazard changes with time, e.g., think about the hazard (of
death) for a person since his/her birth.
Exponential distribution
• $E(T) = 1/\lambda$.
The higher the hazard, the smaller the expected survival time.
• $\mathrm{Var}(T) = \lambda^{-2}$.
• Memoryless property: $\mathrm{pr}(T > t) = \mathrm{pr}(T > t + s \mid T > s)$.
• $c_0 \times EXP(\lambda) \sim EXP(\lambda/c_0)$, for $c_0 > 0$.
• The log-transformed exponential distribution is the so-called
extreme value distribution.
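The memoryless and scaling properties above can be verified numerically. This is a small sketch using base R's `pexp()`, which is parameterised by the rate λ. The numerical values and the helper `surv` are illustrative assumptions.

```r
# Numerical check of two exponential properties with base R's pexp().
# All numerical values below are illustrative assumptions.
lambda <- 0.7
t0 <- 1.3
s0 <- 2.1
c0 <- 3

# Survival function pr(T > x) of EXP(rate)
surv <- function(x, rate) pexp(x, rate = rate, lower.tail = FALSE)

# Memoryless property: pr(T > t) = pr(T > t + s | T > s)
lhs <- surv(t0, lambda)
rhs <- surv(t0 + s0, lambda) / surv(s0, lambda)

# Scaling: pr(c0 * T > x) equals the survival function of EXP(lambda / c0)
x <- 2
scaled   <- surv(x / c0, lambda)
rescaled <- surv(x, lambda / c0)

print(c(lhs = lhs, rhs = rhs, scaled = scaled, rescaled = rescaled))
```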
Gamma distribution
• $E(T) = \alpha/\lambda$.
• $\mathrm{Var}(T) = \alpha/\lambda^2$.
• If $T_i \sim G(\alpha_i, \lambda)$, $i = 1, \ldots, K$, and the $T_i$ are
independent, then
$$\sum_{i=1}^K T_i \sim G\left(\sum_{i=1}^K \alpha_i, \lambda\right).$$
[Figure: hazard function h(t) of the Gamma distribution plotted against t]
Weibull distribution
[Figure: Weibull hazard h(t) plotted against time for p = 0.5, 1, 1.5 and 2.5]
Log-normal distribution
[Figure: log-normal hazard h(t) plotted against time for (mu, sigma) = (0, 1), (1, 1) and (−1, 1)]
• For $k = 1, 2, \ldots$
$$E(T^k) = \frac{\Gamma((\alpha + k)/p)}{\lambda^k \, \Gamma(\alpha/p)}$$
• If $p = 1$, $GG(\alpha, \lambda, 1) \sim G(\alpha, \lambda)$
• If $\alpha = p$, $GG(p, \lambda, p) \sim W(\lambda, p)$
• If $\alpha = p = 1$, $GG(1, \lambda, 1) \sim EXP(\lambda)$
• The generalized gamma distribution can be used to test the
adequacy of commonly used Gamma, Weibull and Exponential
distributions, since they are all nested within the generalized
gamma distribution family.
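The nesting can be illustrated numerically. The density is not shown in this part of the slides, so the sketch below assumes the form f(t) = p λ^α t^(α−1) exp{−(λt)^p} / Γ(α/p), which reproduces the moment formula above. `dgengamma` is a hypothetical helper written for this illustration, not a function from any package, and all parameter values are arbitrary.

```r
# Assumed generalized gamma density GG(alpha, lambda, p):
#   f(t) = p * lambda^alpha * t^(alpha - 1) * exp(-(lambda * t)^p) / gamma(alpha / p)
# dgengamma() is a hypothetical helper, not taken from any package.
dgengamma <- function(t, alpha, lambda, p) {
  p * lambda^alpha * t^(alpha - 1) * exp(-(lambda * t)^p) / gamma(alpha / p)
}

tt <- c(0.2, 0.5, 1, 2, 4)   # evaluation points (illustrative)
lambda <- 0.8

# p = 1 reduces to the gamma distribution G(alpha, lambda)
gap_gamma <- max(abs(dgengamma(tt, 2.5, lambda, 1) -
                     dgamma(tt, shape = 2.5, rate = lambda)))
# alpha = p reduces to the Weibull distribution (R's scale is 1 / lambda)
gap_weib <- max(abs(dgengamma(tt, 1.7, lambda, 1.7) -
                    dweibull(tt, shape = 1.7, scale = 1 / lambda)))
# alpha = p = 1 reduces to the exponential distribution EXP(lambda)
gap_exp <- max(abs(dgengamma(tt, 1, lambda, 1) - dexp(tt, rate = lambda)))

# First moment: numerical integral versus the closed form with k = 1
m1_num   <- integrate(function(t) t * dgengamma(t, 2.5, lambda, 1.7), 0, Inf)$value
m1_exact <- gamma((2.5 + 1) / 1.7) / (lambda * gamma(2.5 / 1.7))

print(c(gap_gamma = gap_gamma, gap_weib = gap_weib, gap_exp = gap_exp,
        m1_num = m1_num, m1_exact = m1_exact))
```

Because the three special cases are obtained by fixing parameters of the larger family, a likelihood ratio test of the restricted fit against the generalized gamma fit is one way to carry out the adequacy check described above.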
• $N(t + dt) - N(t) \sim \mathrm{Poisson}\left(\int_t^{t+dt} \lambda(s)\,ds\right)$ for a rate function
$\lambda(s)$, $s > 0$.
• The interpretation of the intensity function:
$$\lim_{dt \downarrow 0} \frac{\mathrm{pr}(N(t + dt) - N(t) > 0)}{dt} = \lambda(t)$$
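The limit can be seen numerically. In the sketch below the rate function `rate` is an arbitrary illustrative assumption. For a Poisson count with mean μ, the probability of at least one event is 1 − exp(−μ).

```r
# Illustrative rate function (an assumption, not from the slides)
rate <- function(s) 0.5 + 0.2 * s

t0  <- 2
dts <- c(1, 0.1, 0.01, 0.001)

# pr(N(t0 + dt) - N(t0) > 0) / dt for shrinking dt
ratio <- sapply(dts, function(dt) {
  mu <- integrate(rate, t0, t0 + dt)$value   # Poisson mean on [t0, t0 + dt]
  (1 - exp(-mu)) / dt                        # pr(at least one event) / dt
})

print(cbind(dt = dts, ratio = ratio, limit = rate(t0)))
```

As dt shrinks, the ratio approaches `rate(t0)`, the intensity at t0.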
Censoring
Non-informative Censoring
(Tsiatis, 1975)
• Examples of informative and non-informative censoring: not
verifiable empirically.
• Consequence of informative censoring: non-identifiability
Likelihood Construction
$$\Rightarrow L(F) = \prod_{i=1}^n L_i(F) = \prod_{i=1}^n \left\{ f(u_i)^{\delta_i} S(u_i)^{1-\delta_i} \right\}.$$
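The likelihood above translates directly into code. The following is a minimal sketch for a generic parametric model, where `censored_loglik` is a hypothetical helper and the toy data are made up purely for illustration.

```r
# Log-likelihood  sum_i { delta_i * log f(u_i) + (1 - delta_i) * log S(u_i) }
# for right-censored data. censored_loglik() is a hypothetical helper;
# logdens and logsurv return log f and log S for the chosen model.
censored_loglik <- function(u, delta, logdens, logsurv) {
  sum(delta * logdens(u) + (1 - delta) * logsurv(u))
}

# Toy data (illustrative only): observed times and event indicators
u     <- c(2.0, 3.5, 1.2, 4.0, 0.7)
delta <- c(1,   0,   1,   0,   1)

# Exponential model with rate lambda (value chosen arbitrarily)
lambda <- 0.4
ll <- censored_loglik(
  u, delta,
  logdens = function(x) dexp(x, rate = lambda, log = TRUE),
  logsurv = function(x) pexp(x, rate = lambda, lower.tail = FALSE, log.p = TRUE)
)

# Closed form for the exponential model: r * log(lambda) - lambda * W
ll_closed <- sum(delta) * log(lambda) - lambda * sum(u)
print(c(ll = ll, ll_closed = ll_closed))
```

Passing the log-density and log-survival functions as arguments keeps the helper independent of the chosen distribution, so the same function can be reused for Weibull or gamma models.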
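$$\Rightarrow L(F, G) = \prod_{i=1}^n L_i(F, G) = \prod_{i=1}^n \left[ \{f(u_i)(1 - G(u_i))\}^{\delta_i} \{S(u_i) g(u_i)\}^{1-\delta_i} \right]$$
$$= \left\{ \prod_{i=1}^n f(u_i)^{\delta_i} S(u_i)^{1-\delta_i} \right\} \left\{ \prod_{i=1}^n g(u_i)^{1-\delta_i} (1 - G(u_i))^{\delta_i} \right\}.$$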
Inference about F can therefore be based on
$$L(F) = \prod_{i=1}^n \left\{ f(u_i)^{\delta_i} S(u_i)^{1-\delta_i} \right\} = \prod_{i=1}^n h(u_i)^{\delta_i} S(u_i)$$
only.
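$$L(\lambda) = \prod_{i=1}^n \lambda^{\delta_i} e^{-\lambda u_i} = \lambda^r e^{-\lambda W},$$
where
1. $r = \sum_{i=1}^n \delta_i$ = #failures
2. $W = \sum_{i=1}^n u_i$ = total followup time
• The score function: $\partial l(\lambda)/\partial\lambda = r/\lambda - W$.
• The observed information: $-\partial^2 l(\lambda)/\partial\lambda^2 = r/\lambda^2$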
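The step from the score and the observed information to the estimator and its approximate variance can be written out as follows. This is a sketch of the standard maximum likelihood argument, with $l(\lambda) = \log L(\lambda)$.

```latex
\begin{gather*}
l(\lambda) = r\log\lambda - \lambda W, \\
\frac{\partial l(\lambda)}{\partial\lambda} = \frac{r}{\lambda} - W = 0
  \quad\Longrightarrow\quad \hat\lambda = \frac{r}{W}, \\
\left. -\frac{\partial^2 l(\lambda)}{\partial\lambda^2} \right|_{\lambda=\hat\lambda}
  = \frac{r}{\hat\lambda^2} = \frac{W^2}{r}
  \quad\Longrightarrow\quad
  \widehat{\mathrm{Var}}(\hat\lambda) \approx \frac{\hat\lambda^2}{r} = \frac{r}{W^2}.
\end{gather*}
```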
$$\frac{\hat\lambda - \lambda}{\sqrt{\lambda^2/(np)}} \to N(0, 1)$$
in distribution as $n \to \infty$.
• $\hat\lambda$ approximately follows $N(\lambda, r/W^2)$ for large n.
• With the δ-method,
$$\log(\hat\lambda) \sim N(\log(\lambda), r^{-1}),$$
i.e., the variance $r^{-1}$ is free of the unknown parameter λ.
• Hypothesis testing for $H_0: \lambda = \lambda_0$:
$$Z = \sqrt{r}\,\{\log(\hat\lambda) - \log(\lambda_0)\} \sim N(0, 1)$$
under $H_0$. We will reject the null hypothesis if |Z| is too big.
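• Confidence interval:
$$\mathrm{pr}\left[ \log(\hat\lambda) - 1.96\sqrt{1/r} < \log(\lambda) < \log(\hat\lambda) + 1.96\sqrt{1/r} \right] \approx 0.95,$$
so an approximate 95% confidence interval for λ is $[\hat\lambda e^{-1.96/\sqrt{r}}, \hat\lambda e^{1.96/\sqrt{r}}]$.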
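The variance $r^{-1}$ used in the δ-method step above comes from a first-order expansion of $g(\lambda) = \log(\lambda)$, combined with the large-sample variance $\lambda^2/r$ implied by the observed information $r/\lambda^2$. A short sketch of the calculation:

```latex
\[
\mathrm{Var}\{\log(\hat\lambda)\}
  \approx \left( \frac{d \log\lambda}{d\lambda} \right)^{2} \mathrm{Var}(\hat\lambda)
  \approx \frac{1}{\lambda^{2}} \cdot \frac{\lambda^{2}}{r}
  = \frac{1}{r}.
\]
```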
PBC Example
This data is from the Mayo Clinic trial in primary biliary cirrhosis
(PBC) of the liver conducted between 1974 and 1984. A total of
424 PBC patients, referred to Mayo Clinic during that ten-year
interval, met eligibility criteria for the randomized placebo
controlled trial of the drug D-penicillamine. The first 312 cases in
the data set participated in the randomized trial and contain
largely complete data. The additional 106 cases did not participate
in the clinical trial, but consented to have basic measurements
recorded and to be followed for survival.
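library(survival)
head(pbc)
# Follow-up time (days) and event indicator; any status > 0 is counted as an event
u <- pbc$time
delta <- 1 * (pbc$status > 0)
r <- sum(delta)        # number of events
W <- sum(u)            # total follow-up time
lambdahat <- r / W     # MLE of the exponential hazard
# Point estimate and 95% confidence limits, back-transformed from the log scale
c(lambdahat, lambdahat * exp(-1.96 / sqrt(r)), lambdahat * exp(1.96 / sqrt(r)))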
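library(survival)
head(pbc)
# Restrict to the first 312 cases, i.e., the randomized patients
u <- pbc$time[1:312]
delta <- 1 * (pbc$status > 0)[1:312]
trt <- pbc$trt[1:312]
# Events, total follow-up time and hazard estimate in arm 1
r1 <- sum(delta[trt == 1])
W1 <- sum(u[trt == 1])
lambda1 <- r1 / W1
# Same quantities in arm 2
r2 <- sum(delta[trt == 2])
W2 <- sum(u[trt == 2])
lambda2 <- r2 / W2
# Wald statistic on the log-hazard scale and its two-sided p-value
z <- (log(lambda1) - log(lambda2)) / sqrt(1 / r1 + 1 / r2)
c(z, 1 - pchisq(z^2, 1))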
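As a cross-check, the same comparison can be sketched with an exponential regression fit from the survival package. This is not the slides' method, only an alternative route that should give the same Wald statistic up to numerical tolerance. It assumes, as in the code above, that any status > 0 counts as an event. The names `d`, `fit`, `z_fit` and `z_manual` are hypothetical.

```r
library(survival)

# Randomized patients only; any status > 0 is treated as an event (assumption
# carried over from the code above).
d <- pbc[1:312, ]
d$delta <- 1 * (d$status > 0)

# Exponential regression on treatment arm. survreg() models log(T), so the
# treatment coefficient is log(lambda_1) - log(lambda_2).
fit <- survreg(Surv(time, delta) ~ factor(trt), data = d, dist = "exponential")
est <- unname(coef(fit)[2])
se  <- sqrt(vcov(fit)[2, 2])
z_fit <- est / se

# Manual Wald statistic from events r_j and follow-up time W_j per arm
r <- tapply(d$delta, d$trt, sum)
W <- tapply(d$time, d$trt, sum)
z_manual <- unname((log(r[1] / W[1]) - log(r[2] / W[2])) / sqrt(sum(1 / r)))

print(c(z_fit = z_fit, z_manual = z_manual, p = 2 * pnorm(-abs(z_fit))))
```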
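$T_i = \beta' z_i + \epsilon_i$ or $\log(T_i) = \beta' z_i + \epsilon_i$
• Weibull regression: $T_i \sim WB(\lambda_i, p)$, $\lambda_i = \lambda_0 e^{\beta' z_i}$
• Under Weibull regression, the hazard function of T for given $z_i$ is
$$h(t \mid z_i) = p \lambda_0^p t^{p-1} e^{p\beta' z_i}$$
$$\Rightarrow \frac{h(t \mid z_2)}{h(t \mid z_1)} = e^{p\beta'(z_2 - z_1)} \Rightarrow \text{proportional hazards}$$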
• Equivalently
Parametric Inference
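[Figure: piecewise-constant hazard h(t) plotted against t, taking the values $\lambda_1, \lambda_2, \ldots, \lambda_{k+1}$ on the intervals defined by the cut points $v_1, v_2, \ldots, v_k$]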
• $L(\lambda_1, \cdots, \lambda_{k+1}) = \cdots$
• Let $r_j$ and $W_j$ be the total number of events and follow-up
time within the interval $[v_j, v_{j+1})$, respectively. Then $\hat\lambda_j = r_j/W_j$.
• $\log(\hat\lambda_1), \cdots, \log(\hat\lambda_{k+1})$ are approximately independently
distributed with $\log(\hat\lambda_j) \sim N(\log(\lambda_j), r_j^{-1})$ for large n.
• The statistical inference for the hazard function follows.
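The interval-wise estimates can be computed with a few lines of base R. In the sketch below, `piecewise_exp` is a hypothetical helper, the toy data are made up for illustration, and the indexing convention (intervals [0, v_1), [v_1, v_2), ..., [v_k, ∞)) is an assumption about how the cut points are used.

```r
# Piecewise-constant hazard estimates lambda_j = r_j / W_j.
# piecewise_exp() is a hypothetical helper; cuts = c(v_1, ..., v_k) defines
# the intervals [0, v_1), [v_1, v_2), ..., [v_k, Inf) (assumed convention).
piecewise_exp <- function(u, delta, cuts) {
  lower <- c(0, cuts)
  upper <- c(cuts, Inf)
  # Number of events falling in each interval
  r <- sapply(seq_along(lower), function(j) {
    sum(delta[u >= lower[j] & u < upper[j]])
  })
  # Follow-up time contributed to each interval by all subjects
  W <- sapply(seq_along(lower), function(j) {
    sum(pmax(0, pmin(u, upper[j]) - lower[j]))
  })
  lambda <- r / W
  # 95% limits from log(lambda_j) ~ N(log(lambda_j), 1 / r_j)
  data.frame(lower, upper, r, W, lambda,
             ci_low  = lambda * exp(-1.96 / sqrt(r)),
             ci_high = lambda * exp(1.96 / sqrt(r)))
}

# Toy data (illustrative only): observed times and event indicators
u     <- c(0.5, 1.2, 2.0, 2.5, 3.1, 4.0)
delta <- c(1,   0,   1,   1,   0,   1)
est <- piecewise_exp(u, delta, cuts = c(1, 3))
print(est)
```

A subject followed past an interval contributes the full interval length to W_j, while a subject whose follow-up ends inside the interval contributes only the time spent in it, so the W_j sum to the total follow-up time.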