
School of Mathematics and Statistics

Te Kura Mātai Tatauranga

STAT 431 Assignment 2 Due: Monday 8 April, 12PM

1. Let $(T_i, C_i)$ be pairs of mutually independent lifetimes $T_i$ and censoring times $C_i$
for $n$ individuals $i = 1, \dots, n$. Assume that the $T_i$ are identically distributed as $T$,
with common survival function $S(t)$ and type-1 exit intensity $\lambda(t)$, and that the $C_i$
are identically distributed as $C$. These individuals are observed at times
$0 = t_0 < t_1 < \dots < t_k < t_{k+1} < \dots < t_m = t$. Next, define

$$\tau = \min\{T, C\}, \quad \delta = \mathbf{1}_{\{T < C\}}, \quad \text{and} \quad \lambda_k = P\{t_k < \tau \le t_{k+1},\ \delta = 1 \mid \tau > t_k\}, \tag{1}$$

with $\tau_i$ and $\delta_i$ defined analogously for individual $i$. Let $d_k$ be the number of
type-1 exits within $[t_k, t_{k+1})$, i.e., $d_k = \#\{i : \tau_i \in [t_k, t_{k+1}),\ \delta_i = 1\}$,
and let $y_k$ be the number of individuals alive (at risk) at time $t_k$, i.e.,
$y_k = \#\{i : \tau_i \ge t_k\}$. Denote by $c_k$ the number of individuals who exit by type zero
in $[t_k, t_{k+1})$.

(a) Find the relationship between $y_{k+1}$, $y_k$, $c_k$ and $d_k$.

(b) Suppose $d_k \mid y_k \sim \mathrm{Bin}(y_k, \lambda_k)$. Write the likelihood function for given data $(d_k, y_k)$.

(c) Show that the maximum likelihood estimate $\hat{\lambda}_k = d_k / y_k$ is an unbiased estimator of $\lambda_k$.

(d) Based on the large-sample properties of maximum likelihood estimation, show that

$$\widehat{\mathrm{Var}}(\hat{\lambda}_k) = \frac{d_k (y_k - d_k)}{y_k^3}. \tag{2}$$

[Hint: Use the observed Fisher information at the estimate $\hat{\lambda}_k$.]
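
The hint in part (d) can be sanity-checked numerically. The sketch below assumes the binomial model of part (b), evaluates the negative second derivative of the log-likelihood at $\hat{\lambda}_k = d_k / y_k$, and compares the inverse of that observed information with the right-hand side of (2). The function names and the counts d = 3, y = 10 are hypothetical and chosen only for illustration; this is a numerical check, not a proof.

```python
import math


def observed_information(d, y, lam):
    """Negative second derivative of the binomial log-likelihood
    l(lam) = d*log(lam) + (y - d)*log(1 - lam) + const, evaluated at lam."""
    return d / lam**2 + (y - d) / (1.0 - lam) ** 2


def variance_formula(d, y):
    """Right-hand side of equation (2): d*(y - d) / y**3."""
    return d * (y - d) / y**3


# Hypothetical counts, for illustration only (the check needs 0 < d < y).
d, y = 3, 10
lam_hat = d / y  # maximum likelihood estimate from part (c)
inverse_info = 1.0 / observed_information(d, y, lam_hat)

# The two printed numbers should agree up to floating-point error.
print(inverse_info, variance_formula(d, y))
```

Trying a few other pairs with 0 < d < y is a quick way to see that the agreement does not depend on the particular counts.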

(e) A group of 30 patients is being observed for certain diseases. Their entry and exit
ages, as well as gender, are reported in Table 1. Use this data to calculate

$$d_k, \quad y_k, \quad \hat{\lambda}_k, \quad \hat{\Lambda}(t_k), \quad \text{and} \quad \hat{S}(t_k),$$

where $\hat{\Lambda}(t)$ and $\hat{S}(t)$ are, respectively, the Nelson-Aalen estimate of the
integrated intensity $\Lambda(t)$ and the Kaplan-Meier estimate of the survival function $S(t)$.
[Hint: Recall that lifetime refers to observational duration.]

ID Gender Entry Age Exit Age Censor
1 M 72 76 1
2 F 76 78 1
3 M 71 74 1
4 F 69 74 1
5 M 69 71 0
6 M 67 72 0
7 F 66 67 0
8 M 65 70 0
9 F 65 70 1
10 F 73 78 1
11 F 65 68 1
12 F 70 71 0
13 M 68 73 0
14 M 70 73 1
15 F 71 75 1
16 M 66 70 1
17 F 69 71 0
18 M 73 76 1
19 F 68 73 1
20 F 70 74 1
21 M 66 70 1
22 M 67 68 0
23 M 66 68 0
24 F 68 72 1
25 M 69 73 1
26 M 78 81 1
27 M 66 70 1
28 F 89 92 1
29 F 69 74 1
30 M 66 68 0
Table 1: Data for Question 1(e)
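
One possible way to organise the calculation in 1(e) is sketched below. It rests on several assumptions that the sheet does not state: the Censor column is read as $\delta_i$ (1 = type-1 exit, 0 = type-zero exit), lifetimes are taken as exit age minus entry age following the hint, and the observation grid is yearly, $t_k = 0, 1, \dots, 6$. The name `life_table` and the row layout are hypothetical. The columns `Lambda` and `S` accumulate through interval $[t_k, t_{k+1})$; a different indexing convention would shift them by one row.

```python
# Table 1 rows as (entry_age, exit_age, censor_flag), in ID order 1..30.
TABLE_1 = [
    (72, 76, 1), (76, 78, 1), (71, 74, 1), (69, 74, 1), (69, 71, 0),
    (67, 72, 0), (66, 67, 0), (65, 70, 0), (65, 70, 1), (73, 78, 1),
    (65, 68, 1), (70, 71, 0), (68, 73, 0), (70, 73, 1), (71, 75, 1),
    (66, 70, 1), (69, 71, 0), (73, 76, 1), (68, 73, 1), (70, 74, 1),
    (66, 70, 1), (67, 68, 0), (66, 68, 0), (68, 72, 1), (69, 73, 1),
    (78, 81, 1), (66, 70, 1), (89, 92, 1), (69, 74, 1), (66, 68, 0),
]


def life_table(records, grid):
    """Build one row per interval [t_k, t_{k+1}) of the grid.

    Assumption: censor_flag == 1 marks a type-1 exit (delta = 1).
    """
    taus = [exit_age - entry_age for entry_age, exit_age, _ in records]
    deltas = [flag for _, _, flag in records]
    rows = []
    cum_hazard, survival = 0.0, 1.0
    for lo, hi in zip(grid[:-1], grid[1:]):
        y = sum(1 for tau in taus if tau >= lo)  # at risk at t_k
        d = sum(1 for tau, dl in zip(taus, deltas) if lo <= tau < hi and dl == 1)
        c = sum(1 for tau, dl in zip(taus, deltas) if lo <= tau < hi and dl == 0)
        lam = d / y if y > 0 else 0.0  # lambda-hat_k = d_k / y_k
        cum_hazard += lam              # Nelson-Aalen running sum
        survival *= 1.0 - lam          # Kaplan-Meier running product
        rows.append({"t": lo, "y": y, "d": d, "c": c,
                     "lam": lam, "Lambda": cum_hazard, "S": survival})
    return rows


rows = life_table(TABLE_1, grid=[0, 1, 2, 3, 4, 5, 6])
for row in rows:
    print(row)
```

If the Censor column is meant the other way round (1 = censored), swapping the two `dl ==` conditions is the only change needed.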

2. Let $t_1 < t_2 < \cdots < t_n$, with $n \ge 1$, be the ordered times at which events are observed.
Denote by $d_k$ and $y_k$, respectively, the number of events and the number at risk at time $t_k$,
$k = 1, \dots, n$, with $y_0$ being the number of individuals at the initial time $t_0 = 0$. The
Kaplan-Meier estimator $\hat{S}(t)$ of the survival function $S(t)$ is given for any $t \ge 0$ by

$$\hat{S}(t) = \prod_{k : t_k \le t} \left(1 - \frac{d_k}{y_k}\right), \tag{1}$$

whilst its variance is given by Greenwood's formula:

$$\widehat{\mathrm{Var}}\{\hat{S}(t)\} = \hat{S}^2(t) \sum_{k : t_k \le t} \frac{d_k}{y_k (y_k - d_k)}. \tag{2}$$
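
The two formulas above translate directly into a short routine. The following is a minimal sketch, assuming the data have already been summarised as rows $(t_k, d_k, y_k)$ sorted by time; the function name `kaplan_meier` and the three-row table are hypothetical and serve only to show the call.

```python
import math


def kaplan_meier(event_table, t):
    """Return (S_hat(t), Greenwood variance) from rows (t_k, d_k, y_k).

    Assumes event_table is sorted by t_k in increasing order.
    """
    survival = 1.0
    greenwood_sum = 0.0
    for t_k, d_k, y_k in event_table:
        if t_k > t:
            break  # only event times with t_k <= t contribute
        survival *= 1.0 - d_k / y_k
        if y_k > d_k:
            greenwood_sum += d_k / (y_k * (y_k - d_k))
        # If d_k == y_k the estimate is 0 and the Greenwood term is
        # undefined; this sketch skips the term, giving variance 0.
    return survival, survival**2 * greenwood_sum


# Hypothetical table: 4 individuals, one event at each of t = 1, 2, 3.
toy_table = [(1, 1, 4), (2, 1, 3), (3, 1, 2)]
s_hat, var_hat = kaplan_meier(toy_table, t=2)
print(s_hat, var_hat, math.sqrt(var_hat))
```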

In the absence of censoring:

(a) Derive a simplified formula for the Kaplan-Meier estimator $\hat{S}(t)$.

(b) Derive a simplified formula for the variance $\widehat{\mathrm{Var}}\{\hat{S}(t)\}$.

(c) The following data record the length of time for which patients diagnosed with cancer
were observed. There is no censoring for these observations.

ID time (t) ID time (t)


1 1 12 8
2 1 13 8
3 2 14 11
4 2 15 11
5 3 16 12
6 4 17 12
7 4 18 15
8 5 19 17
9 5 20 22
10 8 21 23
11 8
Data for Question 2.

Based on this data, find the estimate $\hat{S}(t)$ for $t = 4$ and its standard deviation.

(d) Give a 95% confidence interval for $S(t)$, for $t = 4$.
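
For parts (c) and (d), the raw times first have to be summarised as event times with counts $d_k$ and risk sets $y_k$, and an interval has to be formed from an estimate and its standard deviation. The sketch below shows both steps under two assumptions: there is no censoring, so only events reduce the risk set, and the interval is the plain normal approximation (estimate plus or minus 1.96 standard deviations, clipped to $[0, 1]$). The sheet does not say which interval construction is intended, and the names `event_table` and `normal_interval` are hypothetical.

```python
from collections import Counter


def event_table(times):
    """Aggregate uncensored observation times into (t_k, d_k, y_k) rows."""
    counts = Counter(times)
    at_risk = len(times)
    rows = []
    for t_k in sorted(counts):
        d_k = counts[t_k]
        rows.append((t_k, d_k, at_risk))
        at_risk -= d_k  # no censoring: only events leave the risk set
    return rows


def normal_interval(estimate, std_dev, z=1.96):
    """Plain normal-approximation interval, clipped to [0, 1]."""
    lower = max(0.0, estimate - z * std_dev)
    upper = min(1.0, estimate + z * std_dev)
    return lower, upper


# Observation times from the Question 2 data, in ID order 1..21.
TIMES = [1, 1, 2, 2, 3, 4, 4, 5, 5, 8, 8, 8, 8,
         11, 11, 12, 12, 15, 17, 22, 23]
table = event_table(TIMES)
print(table)

# Hypothetical estimate and standard deviation, only to show the call.
print(normal_interval(0.5, 0.1))
```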
