Download as pdf or txt
Download as pdf or txt
You are on page 1of 51

UNIVERSITY OF COPENHAGEN DEPARTMENT OF BIOSTATISTICS

Faculty of Health Sciences

Cox regression

Torben Martinussen
Department of Biostatistics
University of Copenhagen

20. september 2012


Slide 1/51
UNIVERSITY OF COPENHAGEN DEPARTMENT OF BIOSTATISTICS

Survival analysis

Standard setup for right-censored survival data. IID copies of (T , D) where

T = T∗ ∧ C D = I(T ∗ ≤ C)

ubiquitous with T ∗ being the true survival time and C the (potential) censoring
time and possibly covariates Xi (t).
Hazard-function: α(t) = limh↓0 h1 P(t ≤ T ∗ < t + h | T ∗ ≥ t, Ft− ).

Counting process: Ni (t) = I(Ti ≤ t, Di = 1)

Martingale: Mi (t) = Ni (t) − Λi (t)

where R
t
Λi (t) = 0 Yi (s)α(s) ds (compensator), Yi (t) = I(t ≤ Ti ) (at risk process).
Intensity: Y (t)α(t).

Torben Martinussen — Cox regression — 20. september 2012


Slide 2/51
UNIVERSITY OF COPENHAGEN DEPARTMENT OF BIOSTATISTICS

Modelling Survival Data

• To model and analyze survival data and deal with censorings in a


convenient way one often specifies the model through the hazard rate:

α(t) = risk of dying among people at risk at time t

• The hazard function is closely related to the survival probability


Z t
P(T > t) = S(t) = exp(− α(s)ds)
0

• in other words Z t
−logS(t) = A(t) = α(s)ds
0

• So
d
logS(t) α(t) = −
dt
• Notation, may use both α(t) and λ(t) for the hazard function. Forgive me.
• But keep in mind that intensity function is Y (t)α(t).

Torben Martinussen — Cox regression — 20. september 2012


Slide 3/51
UNIVERSITY OF COPENHAGEN DEPARTMENT OF BIOSTATISTICS

Parametric Models

• To give precise and clear answers. Kaplan-Meier curves and log-rank


tests are not always sufficient.
• By making simplifying reasonable assumptions something more can be
said through parametric models.
• The simplest possible survival model simply claims that the intensity is
constant over time
λ(t) = λ, Λ(t) = λt
When the rate is constant the survival time is exponentially distributed.
• One may validate the model by considering the non-parametric estimator
of the cumulative intensity (Nelson-Aalen)

Λ̂(t)

• Should be approximately linear if the model is correct

Torben Martinussen — Cox regression — 20. september 2012


Slide 4/51
UNIVERSITY OF COPENHAGEN DEPARTMENT OF BIOSTATISTICS

Malignant melanoma

In the period 1962-77 205 patients had their tumour removed and were
followed until 1977.
At the end of 1977:
• 57 died of mgl. mel.
• 14 died of non-related mgl. mel.
• 134 were still alive.
Purpose: Study effect on survival of sex, age, thickness of tumour, ulceration, ..

Torben Martinussen — Cox regression — 20. september 2012


Slide 5/51
UNIVERSITY OF COPENHAGEN DEPARTMENT OF BIOSTATISTICS

Malignant melanoma

N time status sex age year thickness ulcer

1 10 3 1 76 1972 6.76 1
2 30 3 1 56 1968 0.65 0
3 35 2 1 41 1977 1.34 0
4 99 3 0 71 1968 2.90 0
5 185 1 1 52 1965 12.08 1
6 204 1 1 28 1971 4.84 1
7 210 1 1 77 1972 5.16 1
8 232 3 0 60 1974 3.22 1
9 232 1 1 49 1968 12.88 1
. . . . . . . .
. . . . . . . .
. . . . . . . .
203 4688 2 0 42 1965 0.48 0
204 4926 2 0 50 1964 2.26 0
205 5565 2 0 41 1962 2.90 0

Torben Martinussen — Cox regression — 20. september 2012


Slide 6/51
UNIVERSITY OF COPENHAGEN DEPARTMENT OF BIOSTATISTICS

Parametric models - the Weibull model


• Let X , a (p-dimensional) covariate. The Weibull regression model has
hazard function
αθ (t) = λγ(λt)γ−1 exp(X T β), (1)
where θ = (λ, γ, β); β denotes the regression parameters.
• The hazard (1) may be written as a so-called proportional hazards model

λ0 (t) exp(X T β),

• Likelihood function based on Ti = Ti∗ ∧ Ci Di = I(Ti∗ ≤ Ci ), Xi (ignoring


censoring part) is :
R Ti
αθ (t) dt
Y
L(θ) = αθ (Ti )Di e− 0

• Standard MLE applies,



Σ̂ = I −1 (θ̂)
D
n(θ̂ − θ)→N(0, Σ),

where I(θ) is minus the derivative of U(θ).


• How would you compute θ̂ if you were to implement it in R?
Torben Martinussen — Cox regression — 20. september 2012
Slide 7/51
UNIVERSITY OF COPENHAGEN DEPARTMENT OF BIOSTATISTICS

Parametric models - the Weibull model

• Do exercise 3.7 (a) in MS, p. 76.


• Read in the melanoma-data, and fit the Weibull-model with only sex in the
model, see p. 69 in MS
• Any suggestion on how to check the validity of the Weibull-model in this
case?

Torben Martinussen — Cox regression — 20. september 2012


Slide 8/51
UNIVERSITY OF COPENHAGEN DEPARTMENT OF BIOSTATISTICS

Cox-Regression

• Cox-regression is the regression technique for survival analysis, Cox


(1972).
• Per 2005. The two most cited statistical papers:
I (1) With 25,869 citations (currently cited 1,984 times per year): Kaplan, E. L.

and Meier, P. (1958) Nonparametric estimation from incomplete observations,


Journal of the American Statistical Association, 53, pp. 457-481.
I (2) With 18,193 citations (1,342 per year), Cox, D. R. (1972) Regression

models and life tables, Journal of the Royal Statistical Society, Series B, 34,
pp. 187-220.
• Regression techniques are very useful for dealing with many covariates
• Can be used to learn about treatment effect while correcting for other
covariates.
• Cox model:
λi (t) = λ0 (t) exp(β1 Xi1 + ... + βp Xip ),
where λ0 (t) is baseline hazard for a subject with covariates 0.
• Note: λ0 (t) is not further specified!

Torben Martinussen — Cox regression — 20. september 2012


Slide 9/51
UNIVERSITY OF COPENHAGEN DEPARTMENT OF BIOSTATISTICS

The Cox model

• The regression coefficients β1 , ..., βp represent the effects of the


covariates.
• β1 is the effect of Xi1 when we have corrected for the other covariates.
• β1 may be interpreted in terms of the relative risk when the covariate Xi1 is
increased 1:
λ0 (t) exp(β1 (Xi1 + 1) + ... + βp Xip )
= exp (β1 )
λ0 (t) exp(β1 Xi1 + ... + βp Xip )
• If β1 > 0 the risk of dying increases as Xi1 increases, and if β1 < 0 the risk
of dying decreases as Xi1 increases.
• The quantity
β̂1 Xi1 + ... + β̂p Xip
is called the prognostic index for the ith subject.

Torben Martinussen — Cox regression — 20. september 2012


Slide 10/51
UNIVERSITY OF COPENHAGEN DEPARTMENT OF BIOSTATISTICS

Two-Sample Cox Model

• Consider a simple 2-sample situation where we wish to study effect of sex.


• Defining a covariate for the ith patient

1 Male
Xi =
0 Female

• The hazard can be written as

λi (t) = λ0 (t) exp(βXi )

giving the contributions λ0 (t) exp(β) when Xi = 1 and λ0 (t) when Xi = 0.


• Note again that λ0 (t) is unspecified meaning that which distribution is
unspecified?
• In a moment we derive estimators for the Cox model, but let’s already use
them. Can be calculated in R (or SAS, Stata, SPSS).
• http://nyheder.ku.dk/alle_nyheder/2012/2012.9/
forskere-paaviser-klar-sammenhaeng-mellem-antistof-og-leddegigt/
• BMJ-paper

Torben Martinussen — Cox regression — 20. september 2012


Slide 11/51
UNIVERSITY OF COPENHAGEN DEPARTMENT OF BIOSTATISTICS

Example: Melanoma data

Consider the Cox-model for the melanoma with the explanatory variables sex
and log(thickness) (lthick):

λi (t) = λ0 (t) exp (β1 · sexi + β2 · lti )

We get:

> melanoma$lthick=log(melanoma$thick)
> fit=coxph(Surv(days,status==1)~ factor(sex)+lthick,data=melanoma)
> fit
Call:
coxph(formula = Surv(days, status == 1) ~ factor(sex) + lthick,
data = melanoma)

coef exp(coef) se(coef) z p


factor(sex)1 0.458 1.58 0.269 1.70 8.8e-02
lthick 0.781 2.18 0.157 4.96 6.9e-07

Likelihood ratio test=33.5 on 2 df, p=5.45e-08 n= 205

What does the relative risks of 1.58 and 2.18 mean?

Torben Martinussen — Cox regression — 20. september 2012


Slide 12/51
UNIVERSITY OF COPENHAGEN DEPARTMENT OF BIOSTATISTICS

PBC-data

PBC data (primary biliary cirrhosis): 418 patients are followed until death or
censoring.
PBC is a fatal chronic liver disease.
Important explanatory variables:
• Age: in years
• Albumin: serum albumin (mg/dl)
• Bilirubin: serum bilirunbin (mg/dl)
• Edema: 0 no edema, 0.5 untreated or successfully treated 1 edema
despite diuretic therapy
• Prothrombin time: standardised blood clotting time

Torben Martinussen — Cox regression — 20. september 2012


Slide 13/51
UNIVERSITY OF COPENHAGEN DEPARTMENT OF BIOSTATISTICS

Fitting Cox’s model in R

> library(survival)
> data(pbc)
> with(pbc,cbind(time,status,age,edema,bili,protime,albumin)[1:5,])
time status age edema bili protime albumin
[1,] 400 2 58.76523 1.0 14.5 12.2 2.60
[2,] 4500 0 56.44627 0.0 1.1 10.6 4.14
[3,] 1012 2 70.07255 0.5 1.4 12.0 3.48
[4,] 1925 2 54.74059 0.5 1.8 10.3 2.54
[5,] 1504 1 38.10541 0.0 3.4 10.9 3.53
>
> fit.pbc<-coxph(Surv(time, status==2) ~ age+factor(edema)+log(bili)+log(protime)+
log(albumin),data=pbc)
> fit.pbc
Call:
coxph(formula = Surv(time, status == 2) ~ age + factor(edema) +
log(bili) + log(protime) + log(albumin), data = pbc)

coef exp(coef) se(coef) z p


age 0.0405 1.0413 0.00771 5.25 1.6e-07
factor(edema)0.5 0.2819 1.3256 0.22523 1.25 2.1e-01
factor(edema)1 1.0125 2.7524 0.28975 3.49 4.8e-04
log(bili) 0.8590 2.3609 0.08324 10.32 0.0e+00
log(protime) 2.3599 10.5902 0.77314 3.05 2.3e-03
log(albumin) -2.5158 0.0808 0.65262 -3.85 1.2e-04

Likelihood ratio test=232 on 6 df, p=0 n=416 (2 observations deleted due to missingness)

Torben Martinussen — Cox regression — 20. september 2012


Slide 14/51
UNIVERSITY OF COPENHAGEN DEPARTMENT OF BIOSTATISTICS

Likelihood construction

• Likelihood function based on Ti = Ti∗ ∧ Ci Di = I(Ti∗ ≤ Ci ), Xi is:


Y R Ti YY Rτ
α(Ti )Di e− 0
α(t) dt
= α(t)∆Ni (t) e− 0 Yi (t)α(t) dt
i i t

I τ is end of observation period

I Yi (t) = I(t ≤ Ti ) is the at risk indicator.


• Can also write it as
YY Rτ
dA(t)∆Ni (t) e− 0 Yi (t)dA(t)
i t
Rt
where A(t) = 0
α(s) ds.
T Rt
• Cox-model: A(t) = A0 (t)eX β
with A0 (t) = 0
α0 (s) ds.

Torben Martinussen — Cox regression — 20. september 2012


Slide 15/51
UNIVERSITY OF COPENHAGEN DEPARTMENT OF BIOSTATISTICS

Likelihood construction

Pn
• Let S0 (t, β) = i=1 Yi (t) exp(XiT β).
• Based on martingale equation can you then suggest an estimator of A0 (t)
assuming β is known?

Torben Martinussen — Cox regression — 20. september 2012


Slide 16/51
UNIVERSITY OF COPENHAGEN DEPARTMENT OF BIOSTATISTICS

Likelihood construction

• The "estimator" Z t
1
Â0 (t, β) = dN· (s)
0 S0 (t, β)
is called the Breslow-estimator.
• Now plug this estimator into the likelihood function, and arrive at a
function only depending on β!
• This function, L(β) is called Cox’s partial likelihood function.

Torben Martinussen — Cox regression — 20. september 2012


Slide 17/51
UNIVERSITY OF COPENHAGEN DEPARTMENT OF BIOSTATISTICS

Cox’s proportional hazards model


• The Cox model is
λi (t) = Yi (t)λ0 (t) exp(XiT β), (2)
where X = (X1 , ..., Xp ) is a p-dimensional locally bounded predictable
covariate and Yi (t) is an at risk indicator.
• The regression parameter β is estimated as the maximizer to Cox’s partial
likelihood function
Y Y  exp (X T β) ∆Ni (t)
i
L(β) = , (3)
t
S0 (t, β)
i

where
n
X
S0 (t, β) = Yi (t) exp(XiT β).
i=1

• Λ0 (t) is estimated by the Breslow estimator


Z t
1
Λ̂0 (t) = Λ̂0 (t, β̂) = dN· (t). (4)
0 S0 (t, β̂)

Torben Martinussen — Cox regression — 20. september 2012


Slide 18/51
UNIVERSITY OF COPENHAGEN DEPARTMENT OF BIOSTATISTICS

Cox’s proportional hazards model

• We have presented theory with time-const. X ’s, but can easily handle
time-var. covariates.
• Now let’s look into the asymptotical properties of Cox’s estimators.
• The first partial derivative of S0 (t, β) with respect to β is denoted :
n
X
S1 (t, β) = Yi (t) exp(XiT β)Xi
i=1

• Show that β̂ is found as the solution to the score equation U(β̂) = 0,


where
n Z τ
X S1 (t, β)
U(β) = (Xi (t) − E(t, β))dNi (t) E(t, β) = . (5)
0 S0 (t, β)
i=1

• Show also that U(β0 ) can be written as a martingale evaluated at τ . Here,


β0 denotes the true β.

Torben Martinussen — Cox regression — 20. september 2012


Slide 19/51
UNIVERSITY OF COPENHAGEN DEPARTMENT OF BIOSTATISTICS

Asymptotic properties

Theorem
• Under the standard conditions, then, as n → ∞,

n−1/2 U(β0 )→N(0, Σ),


D

D
n1/2 (β̂ − β0 ) → N(0, Σ−1 ),

and Σ is estimated consistently by n−1 I(β̂), with I minus the derivative of


U.

• Under the standard conditions, then, as n → ∞,


D
n1/2 (Λ̂0 (t, β̂) − Λ0 (t)) → U(t)

where U(t) is a zero-mean Gaussian process with covariance function


Φ(t). Not a martingale though, why?

Get back to how Φ(t) can be estimated consistently

Torben Martinussen — Cox regression — 20. september 2012


Slide 20/51
UNIVERSITY OF COPENHAGEN DEPARTMENT OF BIOSTATISTICS

Asymptotic properties
• Key to the proof is that the score evaluated in the true point β0 is a
martingale (evaluated at τ ):
n Z
X τ
U(β0 ) = (Xi − E(t, β0 ))dMi (t) (6)
i=1 0

• The predictable variation process of n−1/2 U(β0 ) is


n Z
X τ
hn−1/2 U(β0 )i = n−1 (Xi (t) − E(t, β0 ))⊗2 Yi (t) exp(XiT (t)β0 )dΛ0 (t)
i=1 0
Z τ
−1 P
=n V (t, β0 )S0 (t, β0 )dΛ0 (t)→Σ.
0

where
S2 (t, β)
V (t, β) = − E(t, β)⊗2 . (7)
S0 (t, β)
• Lindeberg condition of martingale CLT is also fulfilled

Torben Martinussen — Cox regression — 20. september 2012


Slide 21/51
UNIVERSITY OF COPENHAGEN DEPARTMENT OF BIOSTATISTICS

Asymptotic properties

• It follows that U(β0 ) converges in distribution to a normal variate with


zero-mean and variance Σ.
• A Taylor series expansion of the score gives

n1/2 (β̂ − β0 ) = (n−1 I(β ∗ ))−1 n−1/2 U(β0 ),

where β ∗ is on the line segment between β0 and β̂.


• Consistency of β̂ and the results above give that n1/2 (β̂ − β0 ) converges
to the postulated normal distribution.

Torben Martinussen — Cox regression — 20. september 2012


Slide 22/51
UNIVERSITY OF COPENHAGEN DEPARTMENT OF BIOSTATISTICS

IID decomposition
• The score process evaluated at β0
n
X
n−1/2 U(β0 , t) = n−1/2 1i (t) + op (1)
i=1

where
t
Z 
s1 (β0 , t)
1i (t) = Xi − dMi (s)
0 s0 (β0 , t)

with sj (β0 , t) the limit in prob. of n−1 Sj (β0 , t).


• That is a sum of zero-mean iid terms!
• The variance of n1/2 (β̂ − β0 ) may be estimated consistently by
( n
)
−1
X
Σ̃ = nI (β̂, τ ) ˆ⊗2
1i (τ ) I −1 (β̂, τ ).
i=1

where ˆ1i (τ ) is given by 1i (τ ) replacing unknowns with their empirical


counterparts.
Torben Martinussen — Cox regression — 20. september 2012
Slide 23/51
UNIVERSITY OF COPENHAGEN DEPARTMENT OF BIOSTATISTICS

IID decomposition

• More important! Try the same for the Breslow-estimator.

n1/2 (Λ̂0 (t, β̂) − Λ0 (t)) ≈ n1/2 (Λ̂0 (t, β0 ) − Λ0 (t)) + Dβ Λ̂0 (t, β̂)n1/2 (β̂ − β0 )

• Dβ Λ̂0 (t, β̂) conv. in probability so last term is essentially a sum of


zero-mean iid’s.
• But also
Z t
1
n1/2 (Λ̂0 (t, β0 ) − Λ0 (t)) = dM· (s)
0 S 0 (s, β)
Z t
1
≈ n−1/2 dMi (s),
0 s0 (s, β)

P
since n−1 S0 (s, β)→s0 (s, β).

Torben Martinussen — Cox regression — 20. september 2012


Slide 24/51
UNIVERSITY OF COPENHAGEN DEPARTMENT OF BIOSTATISTICS

IID decomposition
• It thus follows that
n1/2 (Λ̂0 (t, β̂) − Λ0 (t))
is asymptotically equivalent to
n
X
n−1/2 2i (t), (8)
i=1

where

2i (t) =3i (t) + H T (β0 , t)I(β0 , τ )−1 1i (τ ),


Z t
3i (t) = S0−1 (β0 , s)dMi (s),
0
Rt
and where H(β, t) is the derivative of 0
S0−1 (β, s)dN.(s) wrt β.
• The variance of n1/2 (Λ̂0 (t, β̂) − Λ0 (t)) thus may be estimated by
n
X
n−1 ⊗2
2i (t)
i=1

Torben Martinussen — Cox regression — 20. september 2012


Slide 25/51
UNIVERSITY OF COPENHAGEN DEPARTMENT OF BIOSTATISTICS

Resampling

• Now, n1/2 (Λ̂0 (t, β̂) − Λ0 (t)) is asymptotically equivalent to


n
X
n−1/2 ˆ2i (t)Gi (9)
i=1

where G1 , ..., Gn are iid N(0, 1).


• This is very useful for constructing confidence bands for the survival
predictions and to make tests about the baseline and survival function
predictions.
• Taylor expansion:

n1/2 (S0 (Λ̂0 , β̂, t) − S0 (Λ0 , β, t)) = −S0 (Λ0 , β, t)


n o
n1/2 exp(X0T β)(Λ̂0 (t) − Λ0 (t)) + Λ0 (t) exp(X0T β)X0T (β̂ − β) . (10)

Torben Martinussen — Cox regression — 20. september 2012


Slide 26/51
UNIVERSITY OF COPENHAGEN DEPARTMENT OF BIOSTATISTICS

Resampling
• Just saw that : n1/2 (β̂ − β, Λ̂0 − Λ0 ) was asymptotically equivalent to the
processes
n
X
∆1 = n−1/2 ˆ1i Gi ,
i=1
n
X
∆2 (t) = n−1/2 ˆ2i (t)Gi ,
i=1

where G1 , ..., Gn are independent standard normals.


• It follows that n1/2 (Ŝ0 − S0 ) has the same asymptotic distribution as
n
X
∆S (t) = S0 (t)n−1/2 ˆ4i (t)Gi , (11)
i=1

where

4i (t) = exp(Z0T β)2i (t) + A(t) exp(X0T β)X0T 1i .

Torben Martinussen — Cox regression — 20. september 2012


Slide 27/51
UNIVERSITY OF COPENHAGEN DEPARTMENT OF BIOSTATISTICS

Cox-Regression: checking assumptions

Wish to study a treatment effect while correcting for other variables. May use
Cox-model:
λi (t) = λ0 (t) exp(β1 Xi1 + ... + βp Xip )
λ0 (t) is the baseline hazard for a subject with covariates 0.
The Cox models ability to deal with many covariates comes from the
regression structure. Some assumptions have been made
• The effects of covariates are additive and linear on the log risk scale.
• If covariates interact with each other the regression model should include
interaction terms.
• The relative risk between the hazard rate for two subjects is constant over
time
λ0i (t)
c(β1 , ..., βp ) =
λi (t)
We will try to check all these assumptions with the latter being the key
assumption.

Torben Martinussen — Cox regression — 20. september 2012


Slide 28/51
UNIVERSITY OF COPENHAGEN DEPARTMENT OF BIOSTATISTICS

Interaction between categorical and cont. variables

Melanoma data:
Consider the Cox-model for melanoma data with explanatory variables sex and
log(thickness) (lt):

λi (t) = λ0 (t) exp (β1 · sexi + β2 · lti )

How do we check for interaction in this situation?


• Group the continuous variable (log(thickness)) and proceed as before.
• Allow for two different β2 ’s, one for each value of sex.
• Fit this latter model in R, and interpret output.

Torben Martinussen — Cox regression — 20. september 2012


Slide 29/51
UNIVERSITY OF COPENHAGEN DEPARTMENT OF BIOSTATISTICS

Interaction between categorical and cont. variables

> fit=coxph(Surv(days,status==1)~ factor(sex)+factor(sex)*lthick,data=melanoma)


> fit
Call:
coxph(formula = Surv(days, status == 1) ~ factor(sex) + factor(sex) *
lthick, data = melanoma)

coef exp(coef) se(coef) z p


factor(sex)1 1.111 3.037 1.817 0.612 5.4e-01
lthick 0.834 2.302 0.214 3.895 9.8e-05
factor(sex)1:lthick -0.113 0.893 0.311 -0.363 7.2e-01

Cox-model: λ0 (t) exp (β1 · sex + β2 · lt + β3 · sexlt)


with

β1 + (β2 + β3 ) · lt, sex = 1
β1 · sex + β2 · lt + β3 · sexlt =
β2 · lt, sex = 0

• The baseline λ0 (t) is the hazard for?


• The effect of lt is estimated to ? for male and to ? for female
• Is there an interaction between sex and lt?

Torben Martinussen — Cox regression — 20. september 2012


Slide 30/51
UNIVERSITY OF COPENHAGEN DEPARTMENT OF BIOSTATISTICS

Checking proportionality
• Constant relative risk between the hazard rates for two subjects

λ0i (t)
c(β1 , ..., βp ) =
λi (t)
• In the 2-sample case the assumption is equivalent to proportional
intensities for the two groups, i.e.,

exp(β1 )λ1 (t) = λ2 (t).

• Implies that the cumulative intensities are proportional


Z t
Λ2 (t) = λ2 (s)ds = exp(β1 )Λ1 (t)
0

• A plot of Λ̂2 (t) versus Λ̂1 (t) should give a straight line through (0,0) with
slope exp(β1 ).
• Similarly
log(Λ2 (t)) − log(Λ1 (t)) = β1
so plotting log(Λ̂k (t)), k = 1, 2, versus t should give parallel curves.
Torben Martinussen — Cox regression — 20. september 2012
Slide 31/51
UNIVERSITY OF COPENHAGEN DEPARTMENT OF BIOSTATISTICS

The Stratified Cox model

The stratified Cox model contains different baselines for different strata :

λik (t) = λk (t) exp(β1 Xi1 + ... + βp Xip ) k = 1, ..., K

λk (t) is the baseline hazard for a subject in strata k .


• The regression coefficients β1 , ..., βp represent the effects of the
covariates as in the simple Cox regression model.
• Melanoma-data: we might stratify according to ulceration.
• The baselines give the mortality of the two defined by ulceration (yes/no)
when all other covariates are 0.

Torben Martinussen — Cox regression — 20. september 2012


Slide 32/51
UNIVERSITY OF COPENHAGEN DEPARTMENT OF BIOSTATISTICS

Log-cumulative hazard plot


> fit=coxph(Surv(days,status==1)~ factor(sex)+lthick+strata(ulc),data=melanoma)
> fit.detail=coxph.detail(fit)
> logcum1=log(cumsum(fit.detail$hazard[c(17:57)]))
> logcum0=log(cumsum(fit.detail$hazard[1:16]))
> plot(c(fit.detail$time[17:57],3700),c(logcum1,logcum1[41]),type=’s’,xlab="Time (in days)",ylab="Logcums")
> lines(c(fit.detail$time[1:16],3700),c(logcum0,logcum0[16]),type=’s’,lty=2)

-1
-2
Logcums

-3
-4
-5

500 1000 1500 2000 2500 3000 3500

Time (in days)

Parallel?
Torben Martinussen — Cox regression — 20. september 2012
Slide 33/51
UNIVERSITY OF COPENHAGEN DEPARTMENT OF BIOSTATISTICS

Checking Proportionality: tests

• Graphical test may reveal serious violations, but they may be difficult to
use in general.
• If effect is expected to wear off with time or to increase with time, one may
try to describe this effect and then fit a Cox model with a time-varying
explanatory variable f (t)

λ1 (t)
exp(β + θ · f (t)) =
λ2 (t)
• For a given choice of f (t) we can then test if θ is 0, i.e, if the time-varying
effect can be left out of the model. A common choice of f (t) is ln(t). More
specialized methods and programs are available.

Torben Martinussen — Cox regression — 20. september 2012


Slide 34/51
UNIVERSITY OF COPENHAGEN DEPARTMENT OF BIOSTATISTICS

Checking Proportionality: tests

> fit=coxph(Surv(days/365,status==1)~ factor(sex)+ulc+lthick,data=melanoma)


> time.test=cox.zph(fit,transform="log")
> time.test
rho chisq p
factor(sex)1 -0.0627 0.230 0.6312
ulc -0.1335 0.956 0.3283
lthick -0.2942 4.022 0.0449
GLOBAL NA 8.773 0.0325

Based on this, the Cox model is rejected.

Torben Martinussen — Cox regression — 20. september 2012


Slide 35/51
UNIVERSITY OF COPENHAGEN DEPARTMENT OF BIOSTATISTICS

Checking Proportionality: tests

• Another option is to have a time-changing effect of ulc.


• Imagine for instance that effect of ulc is most important in an initial phase.
We try the first 4 years, approx 1400 days.
• We hence replace βulc · ulc by βulc (t) · ulc with

βulc (t) = βulc,1 + βulc,2 I(t > 1400)

• Note that

βulc (t) · ulc = (βulc,1 + βulc,2 I(t > 1400)) · ulc


= βulc,1 · ulc + βulc,2 · X (t)

with
X (t) = I(t > 1400)) · ulc
being a time-varying covariate.
• So this is just a Cox-model again, but now with a time-varying covariate.

Torben Martinussen — Cox regression — 20. september 2012


Slide 36/51
UNIVERSITY OF COPENHAGEN DEPARTMENT OF BIOSTATISTICS

Time-changing effect of covariates

• Such a model can easily be fitted in R using the survSplit-function.


> melanoma1=survSplit(melanoma,cut=c(1400),end="days",start="start",
event="status")
> melanoma1$ulcnew=melanoma1$ulc*as.numeric(melanoma1$days>1400)
>
> fit1=coxph(Surv(start,days,status==1)~ factor(sex)+lthick+
actor(ulc)+ factor(ulcnew), data=melanoma1)
> summary(fit1)
Call:
coxph(formula = Surv(start, days, status == 1) ~ factor(sex) +
lthick + factor(ulc) + factor(ulcnew), data = melanoma1)

n= 367

coef exp(coef) se(coef) z Pr(>|z|)


factor(sex)1 0.3744 1.4541 0.2701 1.386 0.165759
lthick 0.5741 1.7756 0.1801 3.187 0.001436 **
factor(ulc)1 1.6967 5.4561 0.5024 3.377 0.000732 ***
factor(ulcnew)1 -1.5515 0.2119 0.6451 -2.405 0.016170 *
---
• Relative risk Ulc/no Ulc in the first 4 years, and after the first 4 years?
Conclusion?

Torben Martinussen — Cox regression — 20. september 2012


Slide 37/51
UNIVERSITY OF COPENHAGEN DEPARTMENT OF BIOSTATISTICS

Cox’s proportional hazards model

• The procedures described above are the traditional goodness-of-fit


tools.
• Make tests against specific deviations: Replace X1 with (X1 , X1 (log (t))),
say (β1 → β1 + βp+1 · log (t)). Test the null βp+1 = 0.
• Graphical method
• Time-changing effect.

These methods are quite useful but also have some limitations:
• Graphical method:
I Not parallel. What is acceptable?
I What if a given covariate is continuous?
• Test: Ad hoc method. Which transformation to use?
• Time-changing effect. Also ad-hoc in that the time where effects changes
are rarely known in practice.
• All methods: They assume that model is ok for all the other covariates.

Torben Martinussen — Cox regression — 20. september 2012


Slide 38/51
UNIVERSITY OF COPENHAGEN DEPARTMENT OF BIOSTATISTICS

Cumulative martingale residuals

Alternative: Cumulative martingale residuals, Lin, Wein and Ying (1993).


The martingales under the Cox regression model can be written as
Z t
Mi (t) = Ni (t) − Yi (s) exp(X T β)dΛ0 (s)
0
Z t
M̂i (t) = Ni (t) − Yi (s) exp(X T β̂)d Λ̂0 (s)
0
Z t
1
= Ni (t) − Yi (s) exp(XiT (s)β̂) dN· (s).
0 S0 (s, β̂)
One idea is now to look at different groupings of the these residuals and see if
they behave as they should under the model.

Torben Martinussen — Cox regression — 20. september 2012


Slide 39/51
UNIVERSITY OF COPENHAGEN DEPARTMENT OF BIOSTATISTICS

Cumulative martingale residuals

The score function, evaluated in the estimate β̂, and seen as a function of time,
can for example be written as
n Z t n Z t
X S1 (s, β̂) X
U(β̂, t) = (Xi − )dNi (s) = Xi d M̂i (s)
i=1 0 S0 (s, β̂) i=1 0

so it is equivalent to cumulating the martingale residuals (in certain way). Also,


by Taylor series expansion,

n−1/2 U(β̂, t) ≈ n−1/2 U(β0 , t) − (n−1 I(β̂, t))n1/2 (β̂ − β)


n1/2 (β̂ − β) ≈ (n−1 I(β̂, τ ))−1 n−1/2 U(β0 , τ )

Hence
n o
n−1/2 U(β̂, t) ≈ n−1/2 U(β0 , t) − I(β̂, t)(I(β̂, τ ))−1 U(β0 , τ )

Torben Martinussen — Cox regression — 20. september 2012


Slide 40/51
UNIVERSITY OF COPENHAGEN DEPARTMENT OF BIOSTATISTICS

Cumulative martingale residuals

n−1/2 U(β̂, t) is thus asymptotically equivalent to the process


 
n−1/2 M1 (t) − I(t, β̂)I −1 (τ, β̂)M1 (τ ) , (12)

where
n
X n Z
X t
M1 (t) = M1i (t) = (Xi (s) − e(s, β0 ))dMi (s)
i=1 i=1 0

with e(t, β0 ) = limp E(t, β0 ).

Note: n−1/2 U(β̂, τ ) = 0

Torben Martinussen — Cox regression — 20. september 2012


Slide 41/51
UNIVERSITY OF COPENHAGEN DEPARTMENT OF BIOSTATISTICS

Cumulative martingale residuals


The distribution of the process n−1/2 M1 (t) (t ∈ [0, τ ]) is asymptotically
equivalent to
n Z
X t
n−1/2 (Xi (s) − E(s, β̂))dNi (s)Gi
i=1 0

where G1 , ..., Gn are independent standard normals.


Since Mi ’s are i.i.d. with variance E(Ni ). Alternatively to this resampling
approach one may also ([?]), show
n
X Z t
n−1/2 M̂1i (t)Gi , with M̂1i (t) = (Xi (s) − E(s, β̂))d M̂i (s).
i=1 0

Therefore it can be shown that n1/2 U(β̂, t) is asymptotically equivalent to


n 
X 
n−1/2 M̂1i (t) − I(t, β̂)I −1 (τ, β̂)M̂1i (τ ) Gi ,
i=1

for almost any sequence of the counting processes that determines M̂ and I.
Torben Martinussen — Cox regression — 20. september 2012
Slide 42/51
UNIVERSITY OF COPENHAGEN DEPARTMENT OF BIOSTATISTICS

Cumulative martingale residuals

To summarize the cumulative score process plots one may look at test
statistics like

sup |Uj (β̂, t)| |, j = 1 . . . , p,


t∈[0,τ ]

Torben Martinussen — Cox regression — 20. september 2012


Slide 43/51
UNIVERSITY OF COPENHAGEN DEPARTMENT OF BIOSTATISTICS

PBC-data

> library(timereg)
> fit=cox.aalen(Surv(days,status==1)~ prop(factor(sex))+prop(lthick)+
prop(ulc), data=melanoma)
> summary(fit)

Proportional Cox terms :


Coef. SE Robust SE D2log(L)^-1 z P-val
prop(factor(sex))1 0.381 0.274 0.281 0.271 1.36 0.174000
prop(lthick) 0.576 0.162 0.172 0.179 3.34 0.000847
prop(ulc) 0.939 0.315 0.307 0.324 3.06 0.002230
Test for Proportionality
sup| hat U(t) | p-value H_0
prop(factor(sex))1 3.27 0.282
prop(lthick) 8.17 0.012
prop(ulc) 4.31 0.054
> plot(fit,score=T)

Torben Martinussen — Cox regression — 20. september 2012


Slide 44/51
UNIVERSITY OF COPENHAGEN DEPARTMENT OF BIOSTATISTICS

PBC-data

prop(factor(sex))1 prop(lthick)

10
6
Cumulative coefficients

Cumulative coefficients
4

5
2
0

0
-2

-5
-4

-10
-6

0 500 1500 2500 0 500 1500 2500

Time Time

prop(ulc)
6
Cumulative coefficients

4
2
0
-2
-4
-6

0 500 1500 2500

Time

Torben Martinussen — Cox regression — 20. september 2012


Slide 45/51
UNIVERSITY OF COPENHAGEN DEPARTMENT OF BIOSTATISTICS

Cumulative Residuals

• LWY also suggested :


Z t
Mc (t, z) = KzT (s)d M̂(s)
0

where Kz (t) is an n × 1 matrix with elements I(Xi1 (t) ≤ z) for i = 1, . . . , n


focusing here on the first continuous covariate X1 , say.
• Thus cumulating residuals versus both time and the covariate values.
• A 1-dimensional process :
Z τ
Mc (z) = KzT (t)d M̂(t), (13)
0

which can be plotted against z.



• If X1 (t) = X1 then 0 KzT (t)d M̂(t) =
P
i M̂i (τ )I(Xi1 ≤ z) which is a
process in z.
• This will reveal if the functional form is appropriate. These processes can
also be resampled.

Torben Martinussen — Cox regression — 20. september 2012


Slide 46/51
UNIVERSITY OF COPENHAGEN DEPARTMENT OF BIOSTATISTICS

Cumulative residuals

> fit.gof=cox.aalen(Surv(days,status==1)~ prop(lthick),melanoma,


residuals=1)
>
> resids=cum.residuals(fit.gof,melanoma,cum.resid=1)
> summary(resids)
Test for cumulative MG-residuals

Grouped cumulative residuals not computed, you must provide


modelmatrix to get these (see help)

Residual versus covariates consistent with model

sup| hat B(t) | p-value H_0: B(t)=0


5.772 0.512

Call:
cum.residuals(fit.gof, melanoma, cum.resid = 1)

Torben Martinussen — Cox regression — 20. september 2012


Slide 47/51
UNIVERSITY OF COPENHAGEN DEPARTMENT OF BIOSTATISTICS

Cumulative martingale residuals

10
5
Cumulative residuals

0
-5
-10

3 4 5 6 7

Torben Martinussen — Cox regression — 20. september 2012


Slide 48/51
UNIVERSITY OF COPENHAGEN DEPARTMENT OF BIOSTATISTICS

Cumulative residuals

> fit.gof=cox.aalen(Surv(days,status==1)~ prop(thick),melanoma,


residuals=1)
> resids=cum.residuals(fit.gof,melanoma,cum.resid=1)
> summary(resids)
Test for cumulative MG-residuals

Grouped cumulative residuals not computed, you must provide


modelmatrix to get these (see help)

Residual versus covariates consistent with model

sup| hat B(t) | p-value H_0: B(t)=0


10.885 0.022

Call:
cum.residuals(fit.gof, melanoma, cum.resid = 1)

Torben Martinussen — Cox regression — 20. september 2012


Slide 49/51
UNIVERSITY OF COPENHAGEN DEPARTMENT OF BIOSTATISTICS

Cumulative martingale residuals

10
5
Cumulative residuals

0
-5
-10

0 500 1000 1500

Torben Martinussen — Cox regression — 20. september 2012


Slide 50/51
UNIVERSITY OF COPENHAGEN DEPARTMENT OF BIOSTATISTICS

Summary

• Cox’s proportional hazards model.


• Nice model with easily interpretable parameters - relative risks!
• Is is used heavily in Biostatistcs.
• Are the relative risks really not depending on time? Check model carefully.
• Resampling techniques useful for evaluating asymptotics distributions.
• Useful goodness-of-fit based on cumulative residuals with p-values.

Torben Martinussen — Cox regression — 20. september 2012


Slide 51/51

You might also like