Survival Analysis
i. Right Censored: Right censoring arises in many problems. It
happens when we do not know what happened to a subject after a
certain point in time, so the true event time t is greater than
the censoring time c (c < t). This happens when a subject cannot
be followed for the entire study period, for example because
they died from an unrelated cause, were lost to follow-up, or
withdrew from the study.
ii. Left Censored: Left censoring is the opposite case. We do not
know what happened to a subject before some point in time, so
the true event time t is less than the censoring time c (t < c).
iii. Interval Censored: Interval censoring is when we know that
the event happened within an interval (not before the starting
time and not after the ending time of the study) but we do not
know exactly when in the interval it happened. Interval censoring
combines left and right censoring: the event time is known only
to lie between two time points.
Survival Function S(t): This is a probability function of the
study time t. It gives the probability that a subject survives
beyond time t, that is, the probability that the random variable T
(the time to event) exceeds the specified time t: S(t) = P(T > t).
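The right-censoring case described above is usually recorded as a pair per subject: the observed time and a flag saying whether the event was actually seen. The following is a minimal sketch of that encoding; the function name `observe` and the sample values are hypothetical and chosen only for illustration.

```python
def observe(event_time, censor_time):
    """Return (observed_time, event_observed) for one subject.

    Under right censoring we see whichever comes first: the event
    or the end of follow-up for that subject.
    """
    if event_time <= censor_time:
        return event_time, True    # the event itself was observed
    return censor_time, False      # we only know that T > censor_time


# Hypothetical subjects: (true event time t, censoring time c)
subjects = [(5.0, 10.0), (12.0, 10.0), (7.0, 3.0)]
observed = [observe(t, c) for t, c in subjects]
print(observed)
```

Only the first subject contributes an exact event time; the other two contribute the information that they survived at least until their censoring time.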
The Kaplan-Meier estimator of the survival function is

$$\hat{S}(t) = \prod_{t_i \le t} \frac{n_i - d_i}{n_i}$$
where $n_i$ is the number of subjects at risk just prior to time $t_i$ and $d_i$
is the number of subjects who fail at $t_i$. If no censoring occurs, $n_i$ is
simply the number of survivors just prior to $t_i$. When there is censoring,
$n_i$ equals the number of survivors less the number of censored cases (i.e.,
subjects who were no longer observed at that time).
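The product formula can be sketched in a few lines of plain Python. This is an illustrative implementation under stated assumptions, not a reference one: the function name `kaplan_meier` and the sample data are hypothetical, and subjects censored at exactly an event time are assumed to still be at risk at that time.

```python
from collections import Counter


def kaplan_meier(times, events):
    """Return [(t_i, S(t_i))] at each distinct event time.

    times  -- observed time per subject (event or censoring time)
    events -- True if the event was observed, False if censored
    """
    deaths = Counter(t for t, e in zip(times, events) if e)
    leaving = Counter(times)       # everyone who leaves the risk set at t
    n_at_risk = len(times)
    survival = 1.0
    curve = []
    for t in sorted(leaving):
        d = deaths.get(t, 0)
        if d > 0:
            # One factor (n_i - d_i) / n_i per event time
            survival *= (n_at_risk - d) / n_at_risk
            curve.append((t, survival))
        # Failed and censored subjects both leave the risk set
        n_at_risk -= leaving[t]
    return curve


# Hypothetical data: eight subjects, three of them censored
times = [2, 3, 3, 5, 8, 8, 9, 12]
events = [True, True, False, True, True, False, True, False]
curve = kaplan_meier(times, events)
for t, s in curve:
    print(t, round(s, 3))
```

The curve only changes at event times; censored subjects do not cause a drop, but they reduce the number at risk for later factors.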
The Kaplan-Meier method produces both a tabular estimate and a graphical
stairstep curve for use in analysis. It is simple to use when the time of
interest falls within the curve, before the last censored time. To use the
Kaplan-Meier method, certain assumptions must hold: censoring is unrelated to
either survival or failure, the survival probabilities are the same for all
subjects regardless of when the observation period began, and event times are
accurately recorded. Because these assumptions are minimal, the Kaplan-Meier
method can be applied to a wide range of time-to-event data.
Assumptions of Kaplan Meier Survival
In real-life cases we do not know the true survival function, so the Kaplan
Meier estimator approximates it from the study data. The Kaplan Meier method
rests on 3 assumptions:
i. Survival probabilities are the same for subjects who joined the study late
and those who joined early. Factors that can affect survival are assumed
not to change over the recruitment period.
ii. Events occur at the recorded times.
iii. Censoring does not depend on the outcome. Subjects who are censored are
assumed to have the same survival prospects as those who remain under
observation.
Interpretation of the survival curve: the Y-axis shows the probability that a
subject has not yet experienced the event of interest. The X-axis shows time,
so each point on the curve is the estimated probability of surviving beyond
that time. Each drop in the survival function (approximated by the
Kaplan-Meier estimator) is caused by the event of interest happening for at
least one observation.
The plot is often accompanied by confidence intervals that describe the
uncertainty about the point estimates. Wider confidence intervals indicate
higher uncertainty, which arises when few participants remain at risk. The
number at risk shrinks both when observations die and when they are censored.
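The text does not say how the confidence intervals are computed. One common choice, assumed here and not taken from the source, is Greenwood's variance formula combined with a normal approximation. The sketch below uses the hypothetical name `km_with_greenwood` and made-up data; the bounds are simply clipped to [0, 1].

```python
import math


def km_with_greenwood(times, events, z=1.96):
    """Return [(t_i, S, se, lower, upper)] at each event time.

    Assumption: Greenwood's formula
        Var[S(t)] ~ S(t)^2 * sum_i d_i / (n_i * (n_i - d_i))
    with a plain normal interval S +/- z * se.
    """
    n_at_risk = len(times)
    survival, var_sum = 1.0, 0.0
    rows = []
    for t in sorted(set(times)):
        d = sum(1 for ti, e in zip(times, events) if ti == t and e)
        leaving = sum(1 for ti in times if ti == t)
        if d > 0:
            survival *= (n_at_risk - d) / n_at_risk
            if n_at_risk > d:  # skip the term when everyone at risk fails
                var_sum += d / (n_at_risk * (n_at_risk - d))
            se = survival * math.sqrt(var_sum)
            lower = max(0.0, survival - z * se)
            upper = min(1.0, survival + z * se)
            rows.append((t, survival, se, lower, upper))
        n_at_risk -= leaving
    return rows


# Hypothetical data: eight subjects, three of them censored
times = [2, 3, 3, 5, 8, 8, 9, 12]
events = [True, True, False, True, True, False, True, False]
rows = km_with_greenwood(times, events)
for t, s, se, lo, hi in rows:
    print(t, round(s, 3), round(lo, 3), round(hi, 3))
```

In this sketch the standard error relative to the estimate grows at every step, which matches the point made above: as subjects die or are censored, fewer remain at risk and the estimate becomes less certain.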
Points to note when using the Kaplan Meier method:
I. The Kaplan Meier curve alone is descriptive. We need to perform the Log
Rank Test to make inferences about differences between groups.
II. Kaplan Meier results can easily be biased. The method is a univariate
approach, so it cannot adjust for other variables that influence survival.
III. Removing censored data changes the shape of the curve and biases the
fitted estimate.
IV. Statistical tests and conclusions can be misleading if a continuous
variable is dichotomized.
V. Dichotomizing means using a statistical measure such as the median to
split subjects into groups, which may lead to problems in the data set.
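Point III above can be illustrated with a small experiment: estimate survival once with the censored subjects kept in the risk set and once with them deleted. This is a sketch on hypothetical data; `km_survival` is an illustrative helper, not a library function.

```python
def km_survival(times, events, t):
    """Kaplan-Meier estimate of S(t), multiplying over event times t_i <= t."""
    s = 1.0
    event_times = sorted({x for x, e in zip(times, events) if e and x <= t})
    for ti in event_times:
        n = sum(1 for x in times if x >= ti)  # at risk just before ti
        d = sum(1 for x, e in zip(times, events) if x == ti and e)
        s *= (n - d) / n
    return s


# Hypothetical data: eight subjects, three of them censored
times = [2, 3, 3, 5, 8, 8, 9, 12]
events = [True, True, False, True, True, False, True, False]

# Same data with the censored subjects simply deleted
kept_times = [x for x, e in zip(times, events) if e]
kept_events = [True] * len(kept_times)

full = km_survival(times, events, 5)
dropped = km_survival(kept_times, kept_events, 5)
print(round(full, 3), round(dropped, 3))
```

On this data set, deleting the censored subjects lowers the estimated survival at time 5, because the subjects who were known to be alive for part of the study no longer count in the risk set.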
Log-Rank Test
The log-rank test is a nonparametric hypothesis test for comparing the
survival distributions of two or more strata. It compares the hazard function
estimates of the groups at each observed event time. In other words, this test
allows for comparisons of differences in survival times for an event among
different groups of observations (e.g., different brands of batteries,
differences among batteries from multiple vendors, different battery charging
management techniques).
Under the null hypothesis, the test statistic has approximately a chi-squared
distribution with degrees of freedom equal to one less than the number of
groups being compared. The null hypothesis for the log-rank test is that all
groups have an equal hazard rate/survival distribution. The alternative
hypothesis is that at least one of the survival groups has a different hazard
rate/survival distribution. Rejecting the null hypothesis therefore leads to
the conclusion that at least one group's survival distribution differs from
the others.
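For the two-group case, the test described above can be sketched with the standard library alone. This is an illustrative implementation under stated assumptions: the name `logrank_two_groups` and the battery lifetimes are hypothetical, the variance uses the usual hypergeometric form, and the p-value uses the fact that a chi-squared variable with one degree of freedom is the square of a standard normal.

```python
import math


def logrank_two_groups(times_a, events_a, times_b, events_b):
    """Return (chi2, p_value) for a two-group log-rank test (1 df)."""
    pooled = [(t, e, 0) for t, e in zip(times_a, events_a)]
    pooled += [(t, e, 1) for t, e in zip(times_b, events_b)]
    event_times = sorted({t for t, e, _ in pooled if e})

    obs_minus_exp = 0.0  # sum over event times of (observed - expected) in A
    variance = 0.0
    for tj in event_times:
        n = sum(1 for t, _, _ in pooled if t >= tj)              # at risk, all
        n_a = sum(1 for t, _, g in pooled if t >= tj and g == 0)  # at risk, A
        d = sum(1 for t, e, _ in pooled if t == tj and e)        # events, all
        d_a = sum(1 for t, e, g in pooled if t == tj and e and g == 0)
        obs_minus_exp += d_a - d * n_a / n
        if n > 1:
            variance += d * (n_a / n) * (1 - n_a / n) * (n - d) / (n - 1)

    if variance == 0:
        return 0.0, 1.0  # no information to separate the groups
    chi2 = obs_minus_exp ** 2 / variance
    p_value = math.erfc(math.sqrt(chi2 / 2))  # upper tail, chi-squared 1 df
    return chi2, p_value


# Hypothetical lifetimes for two battery brands, all failures observed
brand_a = [10, 12, 14, 16, 18, 20]
ev_a = [True] * 6
brand_b = [1, 2, 3, 4, 5, 6]
ev_b = [True] * 6

chi2, p_value = logrank_two_groups(brand_a, ev_a, brand_b, ev_b)
print(chi2, p_value)
```

A small p-value here would lead to rejecting the null hypothesis that both brands share the same survival distribution. Comparing more than two groups needs the multi-group form of the statistic, which this sketch does not cover.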
Advantages & Disadvantages of Kaplan Meier Estimator
• Advantages
i. Does not require many features: only the time to the event, together
with whether it was observed or censored, is required.
ii. Provides an average overview related to the event.
• Disadvantages
i. Multiple variables cannot be accounted for and examined
simultaneously.
ii. If censored data are removed, the estimate becomes biased at the time
of fitting.
iii. The magnitude of a variable's effect on the event cannot be
estimated.
Conclusion
The Kaplan-Meier statistical method is very useful in the field of
epidemiology, especially in the analysis of time-to-event data. The method is
used in survival analysis to analyze both the patients who reached a certain
event and those who were censored during a given period of time. It is also
well suited to comparing groups of participants, such as a control group and
a treatment group. Statistical software such as SPSS, Stata, SAS and R
packages can be used to generate the survival table and the Kaplan-Meier
estimate curve, as well as other relevant tables such as the overall
comparisons table. The KM estimate is also applied in other disciplines such
as engineering, economics and physics.