Survival Analysis-Debby Raden

Survival Analysis
time to event analysis
Debby Syahru Romadlon & Raden Ahmad Dedy M

Outline
• Introduction
• Survival Function and Hazard Function
• Methods in Survival Analysis
• Non-Parametric Procedures
• Life-Table Analysis
• Kaplan-Meier Estimate
• Proportional Hazard Model (PHM)
• Check for Proportional Hazard
• Two Sample Testing
Objectives
• Introduce the concepts,

• Analytical methods,
• Applications in survival analysis
Introduction
Survival analysis is the analysis of data that correspond to the time from a well-defined time origin until the occurrence of some particular event or end-point.
• A failure time (survival time, lifetime), T, is a non-negative random variable.
• For most applications, the value of T is the time from a certain event to a failure event.
Things that need to be precisely defined
1. “TIME” (with origin)

Time since recruitment into the study
Time since randomization (in a clinical trial)
Time since employment
Time since diagnosis (prognosis studies)
Time since infection (e.g. HIV)
Time since menarche
Calendar time
Age
2. “EVENT” (with date or precise time in the appropriate scale)
Death
Disease (diagnosis, start of symptoms, relapse)
Remission of diarrhea
Quit smoking
Menopause
Example of “time until any well-defined event”:
a. Time from start of treatment to a failure time

b. Time from onset of infection to onset of disease
c. Time until death
d. Time until relapse or progression of cancer
e. Time until engraftment after bone marrow transplant
f. Time until relapse after quitting smoking
g. Time until the commission of a crime after a criminal is released from jail
h. Time from birth to death = age at death
i. Time from birth to onset of a disease – onset age
Special features of survival data
(1) Survival data are generally not symmetrically distributed (they tend to be positively skewed).
(2) Survival times are frequently censored (i.e., the end-point of interest has not been observed for that individual).
Important assumption for censored data: the actual survival time of an individual, t, is independent of any mechanism that causes that individual's survival time to be censored at time c, where c < t (i.e., uninformative censoring).
Key Assumption: Uninformative Censoring
Censored observations have, on average, the same experience after being censored as those remaining under observation ("uninformative censoring": censoring is not related to the risk of the event).
[Figure: estimated survival curve versus the true S(t) if losses "die" immediately after being lost (e.g., because the more severe cases go to another place for treatment)]
Key Assumption: Uninformative Censoring (cont'd)
There is no perfect way to assess the validity of this assumption. Good alternatives are:
• Use common sense and judgement
• Examine baseline characteristics of losses and retained observations
[Figure: estimated survival curve versus the true S(t) if losses have a better prognosis (e.g., they feel so good that they do not care about being in the study anymore)]
Censoring: Random Censoring
This type of censoring is the main censoring mechanism that we deal with. It occurs when the censoring time varies from individual to individual and is unknown in advance.
For example:
• In a follow-up study, censoring occurs due to the end of the study, loss to follow-up, or early withdrawal.
• Reasons for censoring:
  o Patients decide to move to another hospital
  o Patients quit treatment because of side effects of a drug
  o Failures occur after the end of the study, etc.
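Under random censoring, what data are actually observed?
• Ideally, we would like to observe the "complete data" $t_1, t_2, t_3, \ldots, t_n$.
• Due to censoring, we only observe "right-censored data":
$$y_i = \min(t_i, c_i) = \begin{cases} t_i & \text{if } t_i \le c_i \\ c_i & \text{if } t_i > c_i \end{cases}$$
• The censoring indicators:
$$\delta_i = \begin{cases} 1 & \text{if the observation is uncensored, } t_i \le c_i \\ 0 & \text{if the observation is censored, } t_i > c_i \end{cases}$$
• Data: $(y_1, \delta_1), (y_2, \delta_2), \ldots, (y_n, \delta_n)$, and possibly some covariates.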
Example: A set of observed survival data is
$y_i$        25  18  17  22  27
$\delta_i$   1   0   1   0   1
The data can also be presented as
25  18+  17  22+  27
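The construction of the observed pairs can be sketched in a few lines of Python. This is only an illustration: the true failure times `t` and censoring times `c` below are hypothetical values chosen so that the result matches the example data, and the helper names `observe` and `plus_notation` are not from the slides.

```python
def observe(true_times, censor_times):
    """Right censoring: y = min(t, c), delta = 1 if t <= c, else 0."""
    observed = []
    for t, c in zip(true_times, censor_times):
        y = min(t, c)
        delta = 1 if t <= c else 0
        observed.append((y, delta))
    return observed


def plus_notation(observed):
    """Write censored observations with a trailing '+'."""
    return " ".join(str(y) if d == 1 else f"{y}+" for y, d in observed)


# Hypothetical complete data (never fully observed in practice).
t = [25, 30, 17, 40, 27]   # true failure times
c = [28, 18, 20, 22, 35]   # censoring times

data = observe(t, c)
print(data)
print(plus_notation(data))
```

In real studies only `y` and `delta` are available; `t` is unknown whenever `delta` is 0.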
Study Time and Patient Time
Study time: the calendar time period in which an individual is in the study.
Patient time: the period of time that a patient spends in the study, measured from that patient's time origin.
Survival Function and Hazard Function
Cumulative Distribution Function
Definition: Cumulative distribution function F(t)
$$F(t) = \Pr(T \le t)$$
Survival Function S(t)
Definition: Survival function S(t)
$$S(t) = \Pr(T > t) = 1 - \Pr(T \le t) = 1 - F(t)$$
Characteristics of S(t):
• (a) $S(t) = 1$ if $t < 0$
• (b) $S(\infty) = \lim_{t \to \infty} S(t) = 0$
• (c) $S(t)$ is non-increasing in $t$
• In general, the survival function S(t) provides useful summary statistics, such as the median survival time, the t-year survival rate, etc.
• Although the "outcome" is the proportion surviving up to a given time, the survival curve also allows estimating survival percentiles, e.g., the median survival time.
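Density Function f(t)
Definition: Density function f(t)
(a) If T is a discrete random variable:
$$f(t) = \Pr(T = t)$$
(b) If T is (absolutely) continuous:
$$f(t) = \frac{d}{dt}F(t) = -\frac{d}{dt}S(t)$$
Note that:
$$f(t) \ge 0, \qquad F(t) = \int_0^t f(u)\,du, \qquad S(t) = \int_t^\infty f(u)\,du$$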
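Hazard Function λ(t)
Definition: Hazard function λ(t)
(a) If T is discrete:
$$\lambda(t) = \Pr(T = t \mid T \ge t) = \frac{f(t)}{\Pr(T \ge t)}$$
(b) If T is (absolutely) continuous:
$$\lambda(t) = \lim_{\Delta t \to 0^+} \frac{\Pr(t \le T < t + \Delta t \mid T \ge t)}{\Delta t} = \frac{f(t)}{S(t)}$$
Here:
λ(t)Δt ≈ the proportion of individuals experiencing failure in (t, t+Δt) among those surviving up to t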
Hazard Function λ(t)
Example:
(a) Constant hazard: λ(t) = λ0
(b) Increasing hazard: λ(t2) ≥ λ(t1) if t2 ≥ t1
(c) Decreasing hazard: λ(t2) ≤ λ(t1) if t2 ≥ t1
(d) U-shaped hazard (human mortality for age at death)
Remark: Modeling the hazard function is one approach to parametric modeling.
Cumulative Hazard Function (chf) Λ(t)
Definition: Cumulative hazard function (chf) Λ(t).
(a) If T is discrete, let the $x_i$'s be the mass points:
$$\Lambda(t) = \sum_{x_i \le t} \lambda(x_i)$$
(b) If T is (absolutely) continuous:
$$\Lambda(t) = \int_0^t \lambda(u)\,du$$
Relationship among Functions
(a) If T is discrete,
$$S(t) = \prod_{x_i \le t} \left[1 - \lambda(x_i)\right]$$
(b) If T is (absolutely) continuous,
$$S(t) = \Pr(T > t) = \Pr(T \ge t), \qquad \lambda(t) = \frac{f(t)}{S(t)}$$
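Relationship among Functions
A well-known relationship among the density, hazard, and survival functions is:
$$\lambda(t) = \frac{f(t)}{S(t)} = -\frac{d}{dt}\ln S(t)$$
Thus
$$S(t) = e^{-\Lambda(t)} = \exp\left(-\int_0^t \lambda(u)\,du\right)$$
or
$$\Lambda(t) = -\ln S(t)$$
When T is a continuous variable, we also have
$$\int_0^\infty \lambda(u)\,du = \infty$$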
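The relationship S(t) = exp(−Λ(t)) can be checked numerically. The sketch below assumes a Weibull hazard with hypothetical shape and scale values (not taken from the slides), integrates it with the trapezoid rule, and compares the result with the known closed-form Weibull survival function.

```python
import math

SHAPE, SCALE = 2.0, 10.0  # hypothetical Weibull parameters


def hazard(t):
    """Weibull hazard: (k/s) * (t/s)^(k-1). Increasing when k > 1."""
    return (SHAPE / SCALE) * (t / SCALE) ** (SHAPE - 1)


def cumulative_hazard(t, steps=10_000):
    """Trapezoid-rule approximation of the integral of the hazard on [0, t]."""
    h = t / steps
    total = 0.5 * (hazard(0.0) + hazard(t))
    for i in range(1, steps):
        total += hazard(i * h)
    return total * h


def survival_from_hazard(t):
    """S(t) = exp(-Lambda(t))."""
    return math.exp(-cumulative_hazard(t))


def weibull_survival(t):
    """Closed form: S(t) = exp(-(t/s)^k)."""
    return math.exp(-((t / SCALE) ** SHAPE))


for t in (1.0, 5.0, 10.0, 20.0):
    print(t, survival_from_hazard(t), weibull_survival(t))
```

The two columns agree to numerical precision, and the cumulative hazard keeps growing with t, consistent with the integral of λ diverging.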
METHODS IN SURVIVAL ANALYSIS
Methods in Survival Analysis
Variable of interest: TIME to occurrence of an EVENT
Usual primary objective(s):
(1) To estimate the SURVIVAL FUNCTION (cumulative survival)
    Methods: LIFE TABLE, KAPLAN-MEIER
(2) To compare survival curves in different groups
    Methods: LOGRANK test; SEMI-PARAMETRIC MODEL: PHM (COX)
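Non-Parametric Procedures
Estimating the Survival Function (for no censored observations)
(1) Suppose we have a single sample of n survival times, where none of the observations are censored.
(2) The survival function S(t) is defined as the probability that an individual survives for a time greater than t.
(3) It is estimated by the empirical survival function:
$$\hat S(t) = \frac{\text{number of individuals with survival times} > t}{n}$$
Equivalently,
$$\hat S(t) = 1 - \hat F(t),$$
where the empirical distribution function $\hat F(t)$ is
$$\hat F(t) = \frac{\text{number of individuals with survival times} \le t}{n}$$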
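A minimal sketch of the empirical survival function for uncensored data follows. The sample of survival times is hypothetical and the function name `empirical_survival` is illustrative only.

```python
from fractions import Fraction


def empirical_survival(times, t):
    """Proportion of the sample with survival time strictly greater than t."""
    return Fraction(sum(1 for x in times if x > t), len(times))


# Hypothetical uncensored survival times (e.g., in months).
sample = [3, 5, 5, 8, 12]

for t in (0, 4, 5, 12):
    print(t, empirical_survival(sample, t))
```

The estimate is a step function that drops by (number of failures at t)/n at each observed failure time and reaches 0 at the largest one.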
Life-Table Analysis
Life-table estimates of the survivor function
The life-table estimate of the survivor function is obtained by first dividing the period of observation into a series of time intervals. These intervals need not be of equal length, although they usually are.
Example: Life-table estimates of the Survivor Function
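The slides do not spell out the life-table formula, so the sketch below assumes the common actuarial convention: individuals censored within an interval are counted as at risk for half of it, giving an effective number at risk n' = n − c/2. The interval counts are hypothetical and are not the example from the slides.

```python
from fractions import Fraction


def life_table(intervals):
    """intervals: list of (label, n_entering, deaths, censored).

    Assumes the actuarial adjustment n' = n - c/2 (an assumption, see lead-in).
    Returns rows of (label, effective_n, conditional_survival, cumulative_survival).
    """
    surv = Fraction(1)
    rows = []
    for label, n, d, c in intervals:
        n_eff = Fraction(n) - Fraction(c, 2)   # effective number at risk
        p = (n_eff - d) / n_eff                # conditional survival in interval
        surv *= p                              # survival to the end of the interval
        rows.append((label, n_eff, p, surv))
    return rows


# Hypothetical counts: 100 enter, 10 die and 20 are censored in the first year.
hypothetical = [("0-12 months", 100, 10, 20), ("12-24 months", 70, 7, 0)]
table = life_table(hypothetical)
for row in table:
    print(row)
```

With these counts the cumulative survival is 8/9 after the first interval and 4/5 after the second.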
Kaplan-Meier Estimate
The Kaplan-Meier Estimate
• The Kaplan-Meier estimator (1958, JASA) is a nonparametric estimator for the survival function S.
• Consider now either random censoring or type-I censoring.
Objective: Estimate the survival function
Cumulative survival at time t, S(t), is the complement of the cumulative incidence:
S(t) = 1 − [Cumulative incidence]

• Assume uninformative censoring. That is, assume that $T_i$ is independent of $C_i$ for each i. The data are $(y_1, \delta_1), (y_2, \delta_2), \ldots, (y_n, \delta_n)$.
• Let $y_{(1)} < y_{(2)} < \cdots < y_{(k)}$, $k \le n$, be the distinct, uncensored, ordered failure times.
• In the presence of censoring, S(t) is estimated as a product of conditional probabilities: the Kaplan-Meier estimate (Kaplan & Meier, 1958).
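• For example:
  • Data: 3, 2+, 0, 1, 5+, 3, 5
  • $(y_{(1)}, y_{(2)}, y_{(3)}, y_{(4)}) = (0, 1, 3, 5)$
• Suppose $y_{(i-1)} < t < y_{(i)}$. A principle of nonparametric estimation of S is to assign positive probability to, and only to, uncensored failure times. Therefore, we try to estimate the conditional probabilities
$$\Pr(T > y_{(i)} \mid T \ge y_{(i)}), \qquad i = 1, \ldots, k$$
• How to estimate S(t)? Define
$$N_{(i)} = \#\{j : y_j \ge y_{(i)}\} \ \text{(number at risk at } y_{(i)}\text{)}, \qquad d_{(i)} = \#\{j : y_j = y_{(i)},\ \delta_j = 1\} \ \text{(number of failures at } y_{(i)}\text{)}$$
For the example data:
N(1)=7, N(2)=6, N(3)=4, N(4)=2
d(1)=1, d(2)=1, d(3)=2, d(4)=1
Now estimate
$$\widehat{\Pr}(T > y_{(i)} \mid T \ge y_{(i)}) = 1 - \frac{d_{(i)}}{N_{(i)}}$$
The Kaplan-Meier estimate is thus
$$\hat S(t) = \prod_{i:\, y_{(i)} \le t} \left(1 - \frac{d_{(i)}}{N_{(i)}}\right)$$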
Example: 3, 2+, 0, 1, 5+, 3, 5
Uncensored times   0   1   3   5
d_i                1   1   2   1
N_i                7   6   4   2
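As a worked illustration (not part of the original slides), multiplying the conditional survival probabilities $1 - d_i/N_i$ from the table above gives the Kaplan-Meier estimate at each uncensored time:

$$\hat S(0) = 1 - \tfrac{1}{7} = \tfrac{6}{7} \approx 0.857$$
$$\hat S(1) = \tfrac{6}{7} \times \left(1 - \tfrac{1}{6}\right) = \tfrac{5}{7} \approx 0.714$$
$$\hat S(3) = \tfrac{5}{7} \times \left(1 - \tfrac{2}{4}\right) = \tfrac{5}{14} \approx 0.357$$
$$\hat S(5) = \tfrac{5}{14} \times \left(1 - \tfrac{1}{2}\right) = \tfrac{5}{28} \approx 0.179$$

Because the largest observation (5+) is censored, the estimate does not reach 0.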
Steps in Kaplan-Meier Estimation
1. Sort the survival times from smallest to greatest.
2. For each time of occurrence of an event, compute the conditional survival (1 − "hazard").
3. For each time of occurrence of an event, calculate the survival function (multiplying conditional probabilities of survival).
Example:
10 individuals followed up to 24 months
6 died, 4 censored (3 before and 1 at the end of follow-up)
Follow-up times:
17 4 8+ 20 24+ 13 16+ 2 9 10+
Ranking: 2 4 8+ 9 10+ 13 16+ 17 20 24+

t_i   Conditional survival at t_i   Survival function S(t)
2     9/10 = 0.900                  0.900
4     8/9  = 0.889                  0.900 x 0.889 = 0.800
9     6/7  = 0.857                  0.800 x 0.857 = 0.686
13    4/5  = 0.800                  0.686 x 0.800 = 0.549
17    2/3  = 0.667                  0.549 x 0.667 = 0.366
20    1/2  = 0.500                  0.366 x 0.500 = 0.183
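The three steps can be sketched in plain Python and checked against the 10-individual example. This is a minimal illustration rather than a reference implementation; the function name `kaplan_meier` is made up here, and exact fractions are used to avoid rounding drift.

```python
from fractions import Fraction


def kaplan_meier(times, events):
    """Return rows (t, at_risk, deaths, S(t)) at each distinct event time.

    times  : observed follow-up times y_i
    events : 1 if the event was observed, 0 if censored
    """
    event_times = sorted({t for t, e in zip(times, events) if e == 1})
    surv = Fraction(1)
    rows = []
    for t in event_times:
        # Censored subjects count as at risk up to (and including) their time.
        at_risk = sum(1 for y in times if y >= t)
        deaths = sum(1 for y, e in zip(times, events) if y == t and e == 1)
        surv *= Fraction(at_risk - deaths, at_risk)  # multiply conditional survival
        rows.append((t, at_risk, deaths, surv))
    return rows


# Follow-up times from the example: 17 4 8+ 20 24+ 13 16+ 2 9 10+
times = [17, 4, 8, 20, 24, 13, 16, 2, 9, 10]
events = [1, 1, 0, 1, 0, 1, 0, 1, 1, 0]

km_table = kaplan_meier(times, events)
for t, n, d, s in km_table:
    print(f"t={t:>2}  at risk={n:>2}  deaths={d}  S(t)={float(s):.3f}")
```

The printed survival values reproduce the last column of the table above (0.900, 0.800, 0.686, 0.549, 0.366, 0.183).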
If the largest observed time is uncensored, the Kaplan-Meier estimate reaches the value 0 for t ≥ the largest observed time.
If the largest observed time is censored, the Kaplan-Meier estimate does not go down to 0 and is unreliable for t > the largest y_i. In this case, we say that S(t) is undetermined for t > the largest observed time.
Notes
• The calculations are made at the times when events occur. Censoring times are skipped. Censored observations only contribute information up to the time when they are withdrawn.
• Kaplan & Meier call this method the "product-limit estimate", for it is the limit of the life table with the shortest possible intervals (as short as needed to include only one event).
• The method is theoretically designed for exact event times. In case of approximate (rounded) times (e.g., years), ties can occur:
  • More than one event at one given time: no problem.
  • Event(s) and censored observations at the same time. Convention: place the censored observations after the events at each failure time with ties (suggested by Kaplan & Meier, 1958).
• The K-M estimate is a nonparametric method which can be applied to either discrete or continuous data.
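Greenwood's Formula
• For estimating the variance of the Kaplan-Meier estimate:
$$\widehat{\mathrm{Var}}[\hat S(t)] = \hat S(t)^2 \sum_{i:\, y_{(i)} \le t} \frac{d_{(i)}}{N_{(i)}\,(N_{(i)} - d_{(i)})}$$
• Confidence interval, Greenwood method (1926):
$$\hat S(t) \pm z_{1-\alpha/2} \sqrt{\widehat{\mathrm{Var}}[\hat S(t)]}$$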
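A minimal sketch of Greenwood's variance and the resulting 95% confidence limits is given below, applied to the numbers at risk and events from the 10-individual Kaplan-Meier example. The function name `greenwood_ci` and the truncation of the limits to [0, 1] are illustrative choices; with only six events the intervals are very wide.

```python
import math


def greenwood_ci(risk_table, z=1.96):
    """risk_table: list of (t, at_risk, deaths) at each event time, in order.

    Returns rows (t, S, se, lower, upper) with limits truncated to [0, 1].
    Assumes deaths < at_risk at every event time (variance undefined otherwise).
    """
    surv, gw_sum, rows = 1.0, 0.0, []
    for t, n, d in risk_table:
        surv *= (n - d) / n
        gw_sum += d / (n * (n - d))        # Greenwood's cumulative sum
        se = surv * math.sqrt(gw_sum)      # sqrt of S(t)^2 * sum
        lower = max(0.0, surv - z * se)
        upper = min(1.0, surv + z * se)
        rows.append((t, surv, se, lower, upper))
    return rows


# (time, number at risk, deaths) from the 10-individual example.
risk_table = [(2, 10, 1), (4, 9, 1), (9, 7, 1), (13, 5, 1), (17, 3, 1), (20, 2, 1)]
ci_rows = greenwood_ci(risk_table)
for t, s, se, lo, hi in ci_rows:
    print(f"t={t:>2}  S={s:.3f}  SE={se:.3f}  95% CI=({lo:.3f}, {hi:.3f})")
```

At t = 2 the untruncated upper limit would exceed 1, so it is reported as 1.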
Remark
Remark 1
• Property: when n is large, $\hat S(t)$ is approximately normally distributed,
$$\hat S(t) \approx N\left(S(t), \sigma(t)^2\right),$$
where $\sigma(t)^2$ can be estimated by Greenwood's formula.
Remark 2
The accuracy of the K-M estimate and Greenwood's formula relies on a large sample of uncensored data. Make sure that you have at least, say, 20 or 30 uncensored failure times in your data set before using these methods.
Remark 3
Greenwood's formula is more appropriate when 0 << S(t) << 1. Using Greenwood's formula, the confidence interval limits could be above 1 or below 0. In this case, we usually replace these limit points by 1 or 0.
For example, if a 95% confidence interval is (0.845, 1.130), we use (0.845, 1) instead.
Proportional Hazard Model (PHM)
Proportional Hazard Model (Cox, 1972)
• Assume that at any given time t, the hazard in those exposed to a certain risk factor, $h_1(t)$, is a multiple of some underlying hazard, $h_0(t)$.
• Let us call that multiplying factor $e^{\beta}$ (so that it always has a positive value, regardless of β's value), so that
$$h_1(t) = h_0(t)\, e^{\beta}$$
In general, $h(t) = h_0(t)\, e^{\beta X}$:
When X=1: $h(t) = h_0(t)\, e^{\beta}$
When X=0: $h(t) = h_0(t)$
β = Cox regression coefficient, determined by partial likelihood estimation
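Proportional Hazard Model (Cox, 1972)
• According to this model, β is the logarithm of the relative hazard (RH):
$$\beta = \log\frac{h_1(t)}{h_0(t)} = \log(RH)$$
• Thus
$$RH = e^{\beta}$$
Proportionality assumption
• Assumes that changes in the levels of the independent variables produce proportionate changes in the hazard function, independent of time:
$$h_1(t) = e^{\beta}\, h_0(t) \quad \text{or} \quad \frac{h_1(t)}{h_0(t)} = e^{\beta} \quad \text{for all } t$$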
Notes
• The baseline hazard can be constant (as in the exponential model) or changing (as in other parametric models).
• In fact, the baseline hazard could have any shape. It is not necessary to figure out the shape of the baseline hazard, as long as the hazard in the exposed is always a multiple ($e^{\beta}$) of that of the unexposed (proportional).
• For example: [Figure: hazards in the exposed and the unexposed]
Notes
• Cox's brilliant idea was to find a procedure to estimate β in the presence of censored data without needing to specify or estimate $h_0(t)$: partial (or "conditional") likelihood.
• The fact that there is no need to estimate $h_0(t)$ in order to estimate β is why this is considered a semiparametric model.
• The assumption of "proportionality" is implicit in the fact that there is only one RH (one β) for the whole follow-up.
Notes
• The assumption of proportionality is analogous to the assumption of lack of multiplicative interaction (uniform RH across multiple strata; strata of time in this case) needed to calculate Mantel-Haenszel or logrank tests.
• If the hazards are not proportional, particularly when the curves cross ("qualitative interaction"), the model, the β, and the RH are irrelevant. Danger of using this model as a black box!
Extend to the multivariate situation
• As with any regression method, the Cox model can be extended to the multivariate situation:
$$h(t) = h_0(t) \exp(\beta_1 x_1 + \beta_2 x_2 + \cdots + \beta_k x_k)$$
• Problem: compare the hazard of two groups that are identical with respect to all characteristics except that $x_1 = 1$ in one group (exposed) and $x_1 = 0$ in the other (unexposed).
Relative hazard:
$$RH = \frac{h_0(t) \exp(\beta_1 \cdot 1 + \beta_2 x_2 + \cdots + \beta_k x_k)}{h_0(t) \exp(\beta_1 \cdot 0 + \beta_2 x_2 + \cdots + \beta_k x_k)} = e^{\beta_1}$$
• The Cox model can be expressed as a function of survival in exposed and unexposed:
$$S_1(t) = [S_0(t)]^{\exp(\beta x)}$$
• Although usually this is not the primary objective, $S_0(t)$ (and $h_0(t)$) can be estimated once the β's have been obtained (Kalbfleisch & Prentice, 1980).
• Thus, adjusted survival curves can be obtained from the Cox model. Other non-parametric alternatives to obtain adjusted survival estimates have been described (Nieto & Coresh, 1996).
Interpretation of the regression coefficients
• For a binary variable ($x_1 \in \{1, 0\}$):
$\beta_1$ = log(RH) corresponding to exposed ($x_1 = 1$) compared to unexposed ($x_1 = 0$), adjusted for the other x's.
• For a continuous variable:
$\beta_1$ = adjusted log(RH) corresponding to an increment of one unit of $x_1$.
• To calculate the RH corresponding to an increment of 10 units of $x_1$, for example:
$$RH = e^{10 \times \beta_1}$$
Interpretation of the regression coefficients
• For a categorical ordinal variable:
  • The same as for a continuous variable
  • Where only one term $x_1$ is included, only one estimate of the RH is obtained
  • Assumption: LINEARITY. The RH comparing $x_1 = 5$ with $x_1 = 4$ is identical to the RH comparing $x_1 = 2$ with $x_1 = 1$.
  • If this assumption is not true, 2 possible strategies:
    (1) Use dummy variables
    (2) Use quadratic terms ($x_1^2$), polynomial regression, etc.
• For a non-ordinal categorical variable:
  • "Factor" the variable, i.e., define indicator (dummy) variables.
  • Example: "Center", in a study with participants from Jackson, Forsyth Co., Minneapolis, and Washington Co.
    (1) Select one of the categories as the reference: e.g., Forsyth Co.
    (2) Create one indicator variable for each of the remaining categories:

Center           x_J   x_M   x_W
Forsyth Co.      0     0     0
Jackson          1     0     0
Minneapolis      0     1     0
Washington Co.   0     0     1

That is: x_J = 1 for Jackson, 0 otherwise; x_M = 1 for Minneapolis, 0 otherwise; x_W = 1 for Washington Co., 0 otherwise.
In the model:
$\beta_J$ = adjusted log RH comparing participants from Jackson with participants from Forsyth
$\beta_M$ = adjusted log RH comparing participants from Minneapolis with participants from Forsyth
$\beta_W$ = adjusted log RH comparing participants from Washington Co. with participants from Forsyth
Estimation of variance and SE
• In addition to the regression coefficient (β), it is possible to estimate its variance and standard error [SE(β)].
• HYPOTHESIS TEST (H0: β = 0)
Wald test:
$$z = \frac{\hat\beta}{SE(\hat\beta)}$$
(or its square, with a $\chi^2$ distribution (1 d.f.))
• CONFIDENCE LIMITS (95%):
$$\hat\beta \pm 1.96 \times SE(\hat\beta)$$
and the corresponding CL for the relative hazard:
$$\exp\left[\hat\beta \pm 1.96 \times SE(\hat\beta)\right]$$
• Note: For a RH for an increment of more than one unit, it is necessary to multiply the SE as well to obtain the CL. For example, for an increment of 10 units:
$$\exp\left[10 \times \hat\beta \pm 1.96 \times 10 \times SE(\hat\beta)\right]$$
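The Wald test and the confidence limits can be sketched as follows. The coefficient and standard error are hypothetical values (not from any fitted model in the slides), and `wald_summary` is an illustrative helper that uses the normal approximation for the two-sided p-value.

```python
import math


def wald_summary(beta, se, units=1.0, z_crit=1.96):
    """Wald z, two-sided p-value, RH and 95% CL for an increment of `units`."""
    z = beta / se
    p_value = math.erfc(abs(z) / math.sqrt(2))   # two-sided normal p-value
    log_rh = units * beta
    log_se = units * se                           # the SE is scaled by the same factor
    return {
        "z": z,
        "p_value": p_value,
        "rh": math.exp(log_rh),
        "ci": (math.exp(log_rh - z_crit * log_se), math.exp(log_rh + z_crit * log_se)),
    }


# Hypothetical estimate for a continuous covariate x1.
beta_hat, se_hat = 0.05, 0.02

per_unit = wald_summary(beta_hat, se_hat)
per_ten = wald_summary(beta_hat, se_hat, units=10)
print(per_unit)
print(per_ten)
```

The z statistic and p-value do not depend on the chosen increment; only the RH and its limits are rescaled.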
How to deal with interaction
• Note that the coefficient for $x_1$ is calculated assuming that $x_2$, for example, is the same in the numerator and the denominator, regardless of its value.
• Assumption: there is no interaction between $x_1$ and $x_2$. To assess whether this assumption is true and to handle possible interaction, 2 possible strategies:
• (1) Stratified analysis:
E.g.: calculate $\beta_1$ in those with $x_2 = 1$ ($\beta_{1, x_2=1}$) and in those with $x_2 = 0$ ($\beta_{1, x_2=0}$):
Check if $\beta_{1, x_2=1} \approx \beta_{1, x_2=0}$
(Advantage: simple, easy to explain)
• (2) Add an interaction term to the regression:
$$h_1(t) = h_0(t) \times \exp\{\beta_1 x_1 + \beta_2 x_2 + \beta_3 x_3 + \cdots + \beta_k x_k + \beta_{12}(x_1 x_2)\}$$
Checking Proportionality
• Add an interaction term exposure × time: create a time-dependent interaction term, redefined at each event time:
$$h(t) = h_0(t) \exp\{\beta_1 x + \beta_2 (x \times t)\}$$
• Time can be used as a continuous variable, or categorized in the relevant intervals.
• The Wald statistic of this term assesses whether there is a "statistically significant departure from proportionality".
• Graphical analysis:
  • Survival curves: not very sensitive, except to assess total lack of proportionality (cross-over).
  • Log of the cumulative hazard function [log(−log(S))]: if the hazards are proportional, the curves have to be parallel:
$$S_1(t) = [S_0(t)]^{\exp(\beta x)}$$
  • Taking logarithms and multiplying by −1 on both sides:
$$-\log S_1(t) = \exp(\beta x) \times [-\log S_0(t)]$$
  • Taking logarithms again:
$$\log[-\log S_1(t)] = \beta x + \log[-\log S_0(t)]$$
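The parallel-curves argument can be verified numerically. The sketch below assumes a hypothetical baseline survival function and a hypothetical β (neither comes from the slides) and checks that the vertical distance between the two log(−log S) curves equals βx at every time point.

```python
import math

BETA = 0.7   # hypothetical log relative hazard
X = 1        # exposed group


def baseline_survival(t):
    """Hypothetical baseline S0(t); any valid survival function would do."""
    return math.exp(-((t / 10.0) ** 1.5))


def exposed_survival(t):
    """Under proportional hazards: S1(t) = S0(t) ** exp(beta * x)."""
    return baseline_survival(t) ** math.exp(BETA * X)


def log_minus_log(s):
    return math.log(-math.log(s))


gaps = []
for t in (1.0, 5.0, 10.0, 20.0):
    gap = log_minus_log(exposed_survival(t)) - log_minus_log(baseline_survival(t))
    gaps.append(gap)
    print(f"t={t:>4}  gap={gap:.6f}")
```

Every gap equals β (0.7 here), which is why parallel log-minus-log curves support the proportionality assumption, while converging or crossing curves argue against it.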
Check for Proportional Hazard
Figure: Baseline hazard functions for operation status data
Figure: Log minus log plot for two levels of operation
Methods for the examination of assumptions of the Cox Model

Assumption                                   How to assess
Assumptions shared by all survival analysis:
- Non-informative censoring                  Knowledge about subject/study populations
- No secular trends                          Comparing baseline characteristics
Assumptions more unique to the Cox model:
- Proportionality                            Interaction x × time; graphical analysis; stratification
- No interaction                             Stratification; add interaction terms
Assumptions shared by other multiple regression methods:
- Linearity                                  Categorizations of continuous variables; quadratic terms; residuals
- Extreme values                             Measures of influence; residuals/outliers
- Adequate variables selection               Knowledge about subject; stepwise methods; best-model; residuals
Two Sample Testing
Examples from the Precursors Study
• Survival according to smoking: [Figure: survival curves according to smoking]
Log-rank Test for Right Censored Data
• Ideas:
  • Create a 2x2 table at each uncensored failure time.
  • The construction of each 2x2 table is based on the corresponding risk set.
  • Combine the information from the tables.
• The null hypothesis is
$$H_0: \lambda_A(t) = \lambda_B(t) \quad (\text{or } S_A(t) = S_B(t)) \quad \text{for all } t$$
Note: "for all t" might be replaced by "for observed t".

When do we reject H0?
The null hypothesis is $H_0: \lambda_A(t) = \lambda_B(t)$ for all t.
Consider three different kinds of alternatives:
(A1) $H_1: \lambda_A \ne \lambda_B$ (no prior knowledge)
(A2) $H_1: \lambda_A < \lambda_B$ (treatment A is better)
(A3) $H_1: \lambda_A > \lambda_B$ (treatment B is better)
Usually the significance level of a test is set to 0.05.
Example from the statistical output
• Objective: Compare the survival function of two (or more) groups
  • Exposed/Unexposed (2 samples) in this case
  • Use the log-rank test
• Example
  • Exposed: 2 4 8+ 9 10+ 13 16+ 17 20 24+
  • Unexposed: 3+ 5 7+ 11+ 14 18 19+ 24+ 24+ 24+

Test          Chi-Square   DF   Pr > Chi-Square
Log-Rank      2.0640       1    0.1508
Wilcoxon      1.7468       1    0.1863
-2 Log(LR)    1.6773       1    0.1953
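The log-rank row of this output can be reproduced by hand with the 2x2-table construction described earlier. The sketch below is a minimal pure-Python version (the function name `logrank_test` is illustrative); it sums observed minus expected events in the exposed group over all event times and uses the hypergeometric variance of each table.

```python
import math


def logrank_test(times_a, events_a, times_b, events_b):
    """Two-sample log-rank test; returns (chi-square with 1 d.f., p-value)."""
    event_times = sorted(
        {t for t, e in zip(times_a + times_b, events_a + events_b) if e == 1}
    )
    observed_a = expected_a = variance = 0.0
    for t in event_times:
        n_a = sum(1 for y in times_a if y >= t)          # risk set, group A
        n_b = sum(1 for y in times_b if y >= t)          # risk set, group B
        d_a = sum(1 for y, e in zip(times_a, events_a) if y == t and e == 1)
        d_b = sum(1 for y, e in zip(times_b, events_b) if y == t and e == 1)
        n, d = n_a + n_b, d_a + d_b
        observed_a += d_a
        expected_a += d * n_a / n
        if n > 1:
            variance += d * (n_a / n) * (n_b / n) * (n - d) / (n - 1)
    chi2 = (observed_a - expected_a) ** 2 / variance
    p_value = math.erfc(math.sqrt(chi2 / 2))             # chi-square(1) tail
    return chi2, p_value


# Exposed: 2 4 8+ 9 10+ 13 16+ 17 20 24+
exposed_t = [2, 4, 8, 9, 10, 13, 16, 17, 20, 24]
exposed_e = [1, 1, 0, 1, 0, 1, 0, 1, 1, 0]
# Unexposed: 3+ 5 7+ 11+ 14 18 19+ 24+ 24+ 24+
unexposed_t = [3, 5, 7, 11, 14, 18, 19, 24, 24, 24]
unexposed_e = [0, 1, 0, 0, 1, 1, 0, 0, 0, 0]

chi2, p_value = logrank_test(exposed_t, exposed_e, unexposed_t, unexposed_e)
print(f"Log-rank chi-square = {chi2:.4f}, p = {p_value:.4f}")
```

The result agrees with the Log-Rank row of the output above (chi-square about 2.064, p about 0.151), so the difference between the two groups is not significant at the 0.05 level.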
Compensating Bias
• KEY ASSUMPTION when comparing two curves:
  • "Bias" due to censoring (if present) is the same across study groups (compensating bias).
• Generalizations of the log-rank test for K groups are available.
Summary
• After this course, students should understand:
  • The features and assumptions of survival data
  • The definition of the survival function and the hazard function
  • The use and interpretation of analytic methods for survival data, including
    • Life table analysis
    • Kaplan-Meier analysis
    • Log-rank test
    • Cox proportional hazard model
References
• Collett, D. (2013). Modelling Survival Data in Medical Research. Second Edition. Chapman & Hall.
• Kleinbaum, D. G., & Klein, M. (2012). Survival Analysis: A Self-Learning Text. Third Edition. Springer-Verlag New York.
• Hosmer, D. W., Lemeshow, S., & May, S. (2008). Applied Survival Analysis: Regression Modeling of Time to Event Data (Wiley Series in Probability and Statistics). John Wiley & Sons, Inc.
• Selvin, S. (2008). Survival Analysis for Epidemiologic and Medical Research (Practical Guides to Biostatistics and Epidemiology). Cambridge University Press.
