[R] Problem plotting curve on survival curve

Mon Mar 3 15:53:14 CET 2008

Calum had a long question about drawing survival curves after fitting a Weibull 
model, using pweibull, which I have not reproduced.

It is easier to get survival curves using the predict function.  Here is a
simple example:

> library(survival)
> tfit <- survreg(Surv(time, status) ~ factor(ph.ecog), data=lung)
> table(lung$ph.ecog)
   0    1    2    3 <NA> 
  63  113   50    1    1 

> tdata <- data.frame(ph.ecog=factor(0:3))
> qpred <- predict(tfit, newdata= tdata, type='quantile', p=1:99/100)
> matplot(t(qpred), 99:1/100, type='l')

  The result of predict is a matrix with one row per group and one column per 
quantile.  The final plot uses "99:1" so as to show 1-F(t) = S(t) rather than F.
Don't ask for the 1.0 quantile BTW -- it is infinity and I doubt you want the 
plot to stretch out that far.  The 0.0 quantile can also have issues due to the 
implicit log transform used in many distributions.  
   If I had not used the newdata argument, we would get 227 rows in the result, 
one for each subject.  That is, 63 copies of the ph.ecog==0 curve, 113 of the 
ph.ecog==1 curve, ...  The above fit assumed a common shape for the 4 groups, 
you can add a "+ strata(ph.ecog)" term to have a separate scale for each group; 
this would give the same curves as 4 separate fits to the subgroups.

  There are several advantages to using the predict function.  The first is that 
the code does not need to change if you decide to use a different distribution.  
The second is that you can add the "se.fit=T" argument to get confidence bounds 
for the curves.  (A couple more lines for your matplot call of course).

	Terry Therneau
	Mayo Clinic