
DFCI Biostat, Nov 12, 1999

Gap Time Distributions

Interested in the distribution of $T_2 - T_1$, where

$T_1$ = time from baseline to event 1
$T_2$ = time from baseline to event 2

Examples:

Duration of response. $T_1$ = time to start of response, $T_2$ = time to relapse/progression.

Survival beyond progression. $T_1$ = time to progression, $T_2$ = time to death.

Recurrent events (time between the $j$th and $(j+1)$st). $T_1$ = time to $j$th event, $T_2$ = time to $(j+1)$st event.


Comparisons of treatment groups are primarily descriptive: eg consider duration of response

  only a subset of cases respond
  responders from different treatment groups may differ in ways other than treatment received
  thus cannot use the baseline randomization to infer a causal effect of treatment

Terminology and examples used in the rest of the talk will be for survival beyond progression, but the same issues arise for other gap time distribution inference problems.


E9486 Multiple Myeloma

Groups defined by markers measured at time of progression. Is survival beyond progression different among groups?

454 cases with disease progression, 413 dead, 41 still alive

Following slides: Plots of the raw data, and KM estimates of survival beyond progression for groups formed by time of progression, show strong evidence for dependence.


[Figure: E9486. survtime-progtime on the vertical axis against progtime (years) on the horizontal axis; points marked Dead or Alive.]


Survival Beyond Progression by Year of Progression

[Figure: survival curves, Probability against Years (0 to 8), one curve per year-of-progression group. P = 0.0000000000039]

Year of progression   Events / cases
0-1                    98 /  98
1-2                    88 /  91
2-3                   129 / 131
>3                     98 / 134


Another approach: fit a proportional hazards model with response = survival beyond progression, covariate = time to progression. Use penalized partial likelihood to give a flexible estimate of the effect.

z <- cox.spline(a,usurv-uprogtime,ustat,uprogtime)
cox.spline.plot(z,1, font=5, lwd=2,xlab='Time of Progression',
    main='Survival Beyond Progression Hazard Ratio')


[Figure: Survival Beyond Progression Hazard Ratio. Log Hazard Ratio plotted against Time of Progression.]


Identifiability Issues

Let

$T_1$ = time from baseline to progression
$T_2$ = time from baseline to death

Let $F_1(x) = P(T_1 \le x)$, $S(y|x) = P(T_2 - T_1 > y \mid T_1 = x)$, and $S(y) = P(T_2 - T_1 > y)$. Then

$$S(y) = \int_0^\infty S(y|x)\, dF_1(x).$$

That is, the marginal distribution $S(y)$ is the weighted average of the conditional distributions $S(y|x)$. Weight given to $S(y|x)$ is the probability (density) that $T_1 = x$.
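The weighted-average identity can be checked numerically. The sketch below is only an illustration: it borrows the model used in the simulation later in the talk ($T_1$ exponential with mean 1, gap time Weibull with shape 2 and scale $1 + 0.5\,T_1$), and the function names `cond_surv` and `marginal_surv` are made up for this example. The integral is approximated with a simple midpoint rule.

```python
import math

def cond_surv(y, x):
    """S(y|x) for a Weibull gap time with shape 2 and scale 1 + 0.5*x (assumed model)."""
    return math.exp(-((y / (1.0 + 0.5 * x)) ** 2))

def marginal_surv(y, upper=40.0, steps=40000):
    """Approximate S(y) = integral of S(y|x) dF1(x) with T1 ~ Exp(1), midpoint rule."""
    h = upper / steps
    total = 0.0
    for k in range(steps):
        x = (k + 0.5) * h
        # dF1(x) = exp(-x) dx is the weight given to S(y|x)
        total += cond_surv(y, x) * math.exp(-x) * h
    return total

if __name__ == "__main__":
    for y in (0.5, 1.0):
        print(y, round(marginal_surv(y), 4))
```

Under this assumed model the two values should land close to the "True Unc" column of the simulation results (about 0.87 and 0.59).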


With limited follow-up, $S(y|x)$ is not identifiable or estimable for all $(x, y)$. Let $c$ = the maximum follow-up. Then the data only contain information on $S(y|x)$ for $(x, y)$ combinations with $x + y \le c$. Eg, for E9486, $c = 11.5$ years.

In the plot of the data, $x$ is on the horizontal axis and $y$ on the vertical. Only have information on $S(y|x)$ for $(x, y)$ points below the line $y = 11.5 - x$.


[Figure: E9486. $y$ = survtime-progtime plotted against $x$ = progtime (years); points marked Dead or Alive, with the line $y = 11.5 - x$ drawn and $x = 4.5$ marked.]


Since do not have info on $S(y|x)$ for all $(x, y)$, cannot estimate the marginal distribution $S(y)$.

Exception: If $T_1 \le b$ always, and $b \le c$, then $S(y)$ is estimable for $y \le c - b$.

Eg, for duration of response, if response always occurs within 6 months of entry when it occurs at all, and $c = 5$ years, then $S(y)$ is estimable for $y \le 4.5$.
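The arithmetic behind the identifiable region and the exception is small enough to write down directly. This is a minimal sketch; `is_identifiable` and `max_estimable_gap` are hypothetical helper names, not functions from the local S library.

```python
def is_identifiable(x, y, c):
    """True when the data carry information on S(y|x), i.e. x + y <= c."""
    return x + y <= c

def max_estimable_gap(c, b):
    """Largest y for which S(y) is estimable when T1 <= b always and follow-up is c."""
    if b > c:
        raise ValueError("need b <= c")
    return c - b

if __name__ == "__main__":
    # Duration-of-response example: response within 6 months, 5 years of follow-up.
    print(max_estimable_gap(5.0, 0.5))
    # E9486: a point on the boundary line y = 11.5 - x.
    print(is_identifiable(4.5, 7.0, 11.5))
```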


Dependent Censoring

Let $C$ be the potential censoring time measured from baseline. Suppose $C$ is independent of $(T_1, T_2)$.

Actually observe $\min\{T_1, C\}$, $I(T_1 \le C)$, $\min\{T_2, C\}$, $I(T_2 \le C)$.

For the gap time distribution, the failure time is $T_2 - T_1$ and the censoring time is $C - T_1$. If $T_2 - T_1$ is correlated with $T_1$, then $C - T_1$ will be correlated with $T_2 - T_1$, so the censoring and failure times will not be independent when measured from $T_1$.

In E9486, censoring from the accrual and follow-up periods should be roughly uniform over $(7, 11.5)$, which is the region between the two diagonal lines, below. (Of course, our follow-up is not quite that consistent or reliable.)
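The induced dependence is easy to see in a small simulation. The sketch below assumes the model from the simulation later in the talk (not the E9486 data): $T_1$ exponential with mean 1, gap time Weibull with shape 2 and scale $1 + 0.5\,T_1$, and $C$ uniform on $(0, 2.5)$ drawn independently of both. The helper names are hypothetical.

```python
import random

def pearson(a, b):
    """Sample Pearson correlation of two equal-length lists."""
    n = len(a)
    ma, mb = sum(a) / n, sum(b) / n
    sab = sum((u - ma) * (v - mb) for u, v in zip(a, b))
    saa = sum((u - ma) ** 2 for u in a)
    sbb = sum((v - mb) ** 2 for v in b)
    return sab / (saa * sbb) ** 0.5

def gap_scale_correlations(n=20000, seed=1, mc=2.5):
    """Return (cor(T1, T2-T1), cor(C-T1, T2-T1)) under the assumed model."""
    rng = random.Random(seed)
    t1s, gaps, cens_gaps = [], [], []
    for _ in range(n):
        t1 = rng.expovariate(1.0)
        gap = rng.weibullvariate(1.0 + 0.5 * t1, 2.0)  # (scale, shape)
        c = rng.uniform(0.0, mc)                       # independent of (T1, T2)
        t1s.append(t1)
        gaps.append(gap)
        cens_gaps.append(c - t1)  # potential censoring time on the gap scale
    return pearson(t1s, gaps), pearson(cens_gaps, gaps)

if __name__ == "__main__":
    print(gap_scale_correlations())
```

Even though $C$ is generated independently of $(T_1, T_2)$, the second correlation comes out clearly negative: a long time to progression means both a long gap time and little remaining follow-up.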


[Figure: E9486. $y$ = survtime-progtime plotted against $x$ = progtime (years); points marked Dead or Alive, with the diagonal lines $y = 7 - x$ and $y = 11.5 - x$.]


On the gap-time scale, consider the subjects at risk at $y = 1$ years.

Subjects censored at this time will all have long times to progression.
Since they have long times to progression, they will have longer survival beyond progression, because of the correlation between the two quantities.
Hence the censored subjects are not a random subset of the risk set at any given time (dependent censoring).
Thus standard methods applied to the marginal gap time data (eg Kaplan-Meier) will be biased, even when $S(y)$ is identifiable. (Will be unbiased for $y$ up to the difference between the minimum of the support of the censoring dist. and the maximum of the support of the progression dist., since no gap times are censored before then.)


What to Estimate / How to Estimate it?

1. Focus on the conditional distribution $S(y|x)$.

Identifiable for $x + y \le c$.
$T_2 - T_1$ and $C - T_1$ are conditionally independent given $T_1$, so generally dependent censoring will not be a problem.
Can model the dependence of the distribution of $T_2 - T_1$ on $T_1$ (eg Cox model).
Inferences on other factors from tests and Cox models stratified on TTP groups are approximately valid.
Can give nonparametric kernel-type estimators (eg Dabrowska, SJS, 1987). See the function surv.smooth() in the local S library.


Is the marginal distribution really of interest with dependence?

2. Focus on the conditional distribution

$$H(y|x) = P(T_2 - T_1 > y \mid T_1 \le x).$$

Identifiable for $x + y \le c$.

Lin, Sun, Ying (Bka, 1999) give a consistent estimator (below), and in unpublished work have developed a generalization of the logrank test.

Lin, Sun, Ying: Let $H(x, y) = P(T_1 \le x,\ T_2 - T_1 > y)$ and $G(t) = P(C > t)$. Index subjects with the subscript $i$, and let $\tilde T_{ji} = \min\{T_{ji}, C_i\}$, $\delta_{ji} = I(T_{ji} \le C_i)$, $j = 1, 2$, $i = 1, \ldots, n$. Note that $H(y|x) = H(x, y)/H(x, 0)$.


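With no censoring, the EDF

$$\frac{1}{n} \sum_i I(T_{1i} \le x,\ T_{2i} - T_{1i} > y)$$

is unbiased for $H(x, y)$.

With censoring, note that $G(t) > 0$ for $t < c$. Also, $I(\tilde T_{2i} - \tilde T_{1i} > y) = 0$ when $\delta_{1i} = 0$. If $x + y \le c$, then

$$E\{ I(\tilde T_{1i} \le x,\ \tilde T_{2i} - \tilde T_{1i} > y) / G(y + \tilde T_{1i}) \mid T_{1i}, T_{2i} \}$$
$$= E\{ I(T_{1i} \le x,\ T_{2i} - T_{1i} > y,\ C_i - T_{1i} > y) / G(y + T_{1i}) \mid T_{1i}, T_{2i} \}$$
$$= I(T_{1i} \le x,\ T_{2i} - T_{1i} > y),$$

where the last step uses the independence of $C_i$ from $(T_{1i}, T_{2i})$, which gives $P(C_i > y + T_{1i} \mid T_{1i}, T_{2i}) = G(y + T_{1i})$. So

$$\frac{1}{n} \sum_i I(\tilde T_{1i} \le x,\ \tilde T_{2i} - \tilde T_{1i} > y) / G(y + \tilde T_{1i})$$

is unbiased for $H(x, y)$. Substituting a consistent estimator for $G(t)$ then gives a consistent estimator for $H(x, y)$.

Can use the Kaplan-Meier estimator $\hat G(t)$, computed from the data $(\tilde T_{2i}, 1 - \delta_{2i})$.

Note that the full data set measured from baseline is used to estimate $G$. Asymptotic variance is not trivial to calculate.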
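The inverse-weighting construction above can be sketched in a few lines of Python. This is an illustrative reimplementation under stated assumptions, not the authors' code: the names `km_censoring`, `g_hat`, `lsy` and `simulate` are hypothetical, ties are handled naively, and the test data come from the simulation model used later in the talk.

```python
import bisect
import random

def km_censoring(t2_obs, d2):
    """Kaplan-Meier estimate of G(t) = P(C > t) from (T2~, 1 - delta2).

    Returns (jump_times, values); G_hat(t) is the value at the last jump <= t.
    """
    data = sorted(zip(t2_obs, d2))
    at_risk, g = len(data), 1.0
    jump_times, values = [], []
    i = 0
    while i < len(data):
        t = data[i][0]
        leaving = censored = 0
        while i < len(data) and data[i][0] == t:
            censored += 1 - data[i][1]  # a censoring is the "event" here
            leaving += 1
            i += 1
        if censored:
            g *= 1.0 - censored / at_risk
            jump_times.append(t)
            values.append(g)
        at_risk -= leaving
    return jump_times, values

def g_hat(t, jump_times, values):
    k = bisect.bisect_right(jump_times, t)
    return 1.0 if k == 0 else values[k - 1]

def lsy(y, x, t1_obs, d1, t2_obs, d2):
    """Estimate H(y|x) = H(x, y) / H(x, 0) with weights 1 / G_hat(y + T1~)."""
    jt, gv = km_censoring(t2_obs, d2)

    def h_joint(yy):
        total = 0.0
        for a, da, b in zip(t1_obs, d1, t2_obs):
            if da == 1 and a <= x and b - a > yy:
                total += 1.0 / g_hat(yy + a, jt, gv)
        return total / len(t1_obs)

    denom = h_joint(0.0)
    return float("nan") if denom == 0 else h_joint(y) / denom

def simulate(n, seed, mc=2.5):
    """Assumed model: T1 ~ Exp(1), gap ~ Weibull(scale 1 + .5*T1, shape 2), C ~ U(0, mc)."""
    rng = random.Random(seed)
    t1_obs, d1, t2_obs, d2 = [], [], [], []
    for _ in range(n):
        t1 = rng.expovariate(1.0)
        t2 = t1 + rng.weibullvariate(1.0 + 0.5 * t1, 2.0)
        c = rng.uniform(0.0, mc)
        t1_obs.append(min(t1, c)); d1.append(int(t1 <= c))
        t2_obs.append(min(t2, c)); d2.append(int(t2 <= c))
    return t1_obs, d1, t2_obs, d2

if __name__ == "__main__":
    print(lsy(1.0, 1.0, *simulate(20000, 1)))
```

With no censoring every weight is 1 and the ratio reduces to the empirical fraction. A subject contributing to the sum is still at risk at time $y + \tilde T_{1i}$, so the estimated weight it uses is never zero.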


S Function survbrec(): survival beyond recurrence

survbrec <- function(tp,rtime,rstat,stime,sstat,
                     tp2=maxstime-max(tp),maxstime=max(stime)) {
### Estimates conditional gap time distribution
### H(tp[j]|tp2)=P(S-R>tp[j]|R<=tp2) (tp2 is a scalar)
### rtime, stime are potentially censored observations on
###   R, S, measured from baseline
### rstat, sstat are 1 for events, 0 for censored observations
### Generally H is not identifiable if tp+tp2>max(stime) but
###   can override the observed max(stime) with maxstime arg.
  out <- tp
  subr <- stime-rtime > 0 & rstat == 1 & rtime <= tp2
  if (length(subr[subr])==0) return(rep(NA,length(tp)))
  if (all(sstat == 1)) {# no censoring, dist identifiable
    h0 <- length(stime[subr])
    for (j in 1:length(tp)) {
      h1 <- length(stime[stime-rtime > tp[j] & subr])
      out[j] <- h1/h0
    }
  } else {
    # Kaplan-Meier estimate of the censoring distribution G
    cd <- survfit(Surv(stime,1-sstat)~1)
    h0 <- sum(1/(summary(cd,times=sort((rtime)[subr]))$surv))
    for (j in 1:length(tp)) {
      if (tp[j]>maxstime-tp2) out[j] <- NA
      else {
        i <- stime-rtime > tp[j] & subr
        out[j] <- if (length(i[i])==0) 0 else {sum(1/(
          summary(cd,times=sort((rtime+tp[j])[i]))$surv))/h0
        }
      }
    }
  }
  out
}


> # 9486; H(i|2)
> survbrec(1:10,d$progtime,d$progstat,d$survtime,d$survstat,2)
 [1] 0.35475379 0.19061398 0.12707598 0.07412766 0.04264862
 [6] 0.02346133 0.01434252 0.01304146 0.02603643         NA
> ## NOTE: not monotone
> # 9486; H(i|5)
> survbrec(1:7,d$progtime,d$progstat,d$survtime,d$survstat,5)
[1] 0.51997911 0.30615689 0.20832986 0.12561816 0.06387121
[6] 0.03879853         NA

Simulation: $T_1 \sim \mathrm{Exp}(1)$, $T_2 - T_1 \mid T_1 \sim \mathrm{Weibull}(1 + .5\,T_1,\ 2)$ (scale, shape), $\mathrm{cor}(T_1, T_2 - T_1) = .53$, $C \sim U(0, 2.5)$, $n = 200$.

On average expect 73 cases without progression, 70 progressed but alive, and 57 progressed and died.
f1 <- function(tpp,cutoff,n=200,mc=2.5){
  u1 <- rexp(n) #progtime
  u2 <- u1+rweibull(n,shape=2,scale=1+.5*u1) #survtime
  truc <- truu <- tpp
  sub <- u1<=cutoff; nc <- length(sub[sub])
  for (j in 1:length(tpp)) {
    xj <- u2-u1>tpp[j]
    truu[j] <- sum(xj)/n         # uncensored sample fraction, unconditional
    truc[j] <- sum(xj & sub)/nc  # uncensored sample fraction given u1<=cutoff
  }
  ct <- mc*runif(n) #censoring
  i1 <- ifelse(ct<u1,0,1)
  u1 <- pmin(u1,ct)
  i2 <- ifelse(ct<u2,0,1)
  u2 <- pmin(u2,ct)
  d2 <- survbrec(tpp,u1,i1,u2,i2,cutoff,maxstime=mc)
  sub <- u2-u1>0 & i1 == 1
  k1 <- summary(survfit(Surv(u2-u1,i2)~1,subset=sub),
                times=tpp)$surv
  cbind(truu,truc,d2,k1)
}
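The expected counts quoted for this design (73, 70 and 57 out of 200) can be sanity-checked by Monte Carlo. The sketch below is a hypothetical Python port of only the data-generating step of the simulation, with an assumed function name `outcome_fractions`; it is not part of the original S code. The first fraction also has a closed form, $P(C < T_1) = (1 - e^{-2.5})/2.5$.

```python
import math
import random

def outcome_fractions(n=100000, seed=2, mc=2.5):
    """Fractions of (no progression, progressed but alive, progressed and died)."""
    rng = random.Random(seed)
    no_prog = prog_alive = prog_dead = 0
    for _ in range(n):
        t1 = rng.expovariate(1.0)                          # progtime
        t2 = t1 + rng.weibullvariate(1.0 + 0.5 * t1, 2.0)  # survtime
        c = rng.uniform(0.0, mc)                           # censoring
        if c < t1:
            no_prog += 1
        elif c < t2:
            prog_alive += 1
        else:
            prog_dead += 1
    return no_prog / n, prog_alive / n, prog_dead / n

if __name__ == "__main__":
    print([round(200 * p, 1) for p in outcome_fractions()])
    print(round(200 * (1 - math.exp(-2.5)) / 2.5, 1))  # closed form for no progression
```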


> ntri <- 500
> tpp <- c(.5,1)
> out <- array(NA,c(length(tpp),4,ntri))
> for (i in 1:ntri) out[,,i] <- f1(tpp,1)
> dimnames(out) <- list(format(tpp),c('True Unc','True Cond',
+     'LSY','KM'),NULL)
> # Estimates of means
> apply(out,c(1,2),mean)
    True Unc True Cond       LSY        KM
0.5  0.87127 0.8360188 0.8393883 0.8450474
1.0  0.59050 0.4958633 0.4990447 0.5027919
> # Standard errors of means
> sqrt(apply(out,c(1,2),var)/ntri)
       True Unc   True Cond         LSY          KM
0.5 0.001071808 0.001477317 0.002585383 0.001610749
1.0 0.001596997 0.002019595 0.003563854 0.002712360


> # Estimates of variances
> apply(out,c(1,2),var)
        True Unc   True Cond         LSY          KM
0.5 0.0005743859 0.001091232 0.003342102 0.001297256
1.0 0.0012752004 0.002039382 0.006350528 0.003678449

True Unc = True unconditional probability $S(y)$
True Cond = True conditional probability $H(y|x)$
LSY = Lin Sun Ying estimator of $H(y|1)$
KM = Kaplan-Meier applied to the gap time data

Conditional and marginal distributions are different.
LSY is essentially unbiased for the true conditional distribution.


KM is biased as an estimator of the true unconditional distribution.
It is coincidence that KM is also nearly unbiased for the conditional distribution. The KM would be the same for any value of cutoff above, while $H(y|x)$ would vary with $x$ = cutoff.
Variance of LSY estimator is substantially larger than KM. How efficient is the LSY procedure?
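The bias of Kaplan-Meier on the gap-time scale does not go away with sample size, which a larger run of the same design makes visible. The sketch below is a hypothetical Python illustration (function names `km_surv` and `gap_km` are assumptions), using a much larger $n$ than the 200 in the simulation so that sampling noise is small relative to the bias.

```python
import random

def km_surv(times, events, y):
    """Kaplan-Meier estimate of P(T > y) from right-censored (time, event) data."""
    data = sorted(zip(times, events))
    at_risk, s = len(data), 1.0
    i = 0
    while i < len(data) and data[i][0] <= y:
        t = data[i][0]
        deaths = leaving = 0
        while i < len(data) and data[i][0] == t:
            deaths += data[i][1]
            leaving += 1
            i += 1
        s *= 1.0 - deaths / at_risk
        at_risk -= leaving
    return s

def gap_km(y, n=50000, seed=3, mc=2.5):
    """Return (uncensored fraction with gap > y, KM estimate from censored gap data)."""
    rng = random.Random(seed)
    gaps, stat, n_long = [], [], 0
    for _ in range(n):
        t1 = rng.expovariate(1.0)
        gap = rng.weibullvariate(1.0 + 0.5 * t1, 2.0)
        c = rng.uniform(0.0, mc)
        n_long += gap > y
        if t1 <= c:  # gap time data exist only when progression is observed
            gaps.append(min(gap, c - t1))
            stat.append(int(gap <= c - t1))
    return n_long / n, km_surv(gaps, stat, y)

if __name__ == "__main__":
    for y in (0.5, 1.0):
        print(y, gap_km(y))
```

In this design the Kaplan-Meier value sits below the unconditional probability at both time points, in line with the means reported in the table: subjects with short times to progression, and hence shorter gap times, are over-represented in the risk set.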


Summary

If the time to the initiating event is correlated with the gap time to the terminating event, then

  In general the marginal gap time distribution is not identifiable.
  When it is identifiable, standard methods for inference on the marginal distribution may be invalid due to dependent censoring.

Various conditional distributions can be estimated, and inference should focus on these.

LSY estimator is not monotone, and its efficiency properties are not clear.
