[R] Problem plotting curve on survival curve
Terry Therneau
therneau at mayo.edu
Mon Mar 3 15:53:14 CET 2008
Calum had a long question about drawing survival curves after fitting a Weibull
model, using pweibull, which I have not reproduced.
It is easier to get survival curves using the predict function. Here is a
simple example:
> library(survival)
> tfit <- survreg(Surv(time, status) ~ factor(ph.ecog), data=lung)
> table(lung$ph.ecog)
0 1 2 3 <NA>
63 113 50 1 1
> tdata <- data.frame(ph.ecog=factor(0:3))
> qpred <- predict(tfit, newdata= tdata, type='quantile', p=1:99/100)
> matplot(t(qpred), 99:1/100, type='l')
The result of predict is a matrix with one row per group and one column per
quantile. The final plot uses "99:1" so as to show 1-F(t) = S(t) rather than F.
Don't ask for the 1.0 quantile BTW -- it is infinity and I doubt you want the
plot to stretch out that far. The 0.0 quantile can also have issues due to the
implicit log transform used in many distributions.
If I had not used the newdata argument, we would get 227 rows in the result,
one for each subject. That is, 63 copies of the ph.ecog==0 curve, 113 of the
ph.ecog==1 curve, ... The above fit assumed a common shape for the 4 groups,
you can add a "+ strata(ph.ecog)" term to have a separate scale for each group;
this would give the same curves as 4 separate fits to the subgroups.
There are several advantages to using the predict function. The first is that
the code does not need to change if you decide to use a different distribution.
The second is that you can add the "se.fit=T" argument to get confidence bounds
for the curves. (A couple more lines for your matplot call of course).
Terry Therneau
Mayo Clinic
More information about the R-help
mailing list