[R] Predict in glmnet for Cox family

Therneau, Terry M., Ph.D. therneau at mayo.edu
Tue Apr 21 14:32:52 CEST 2015

On 04/21/2015 05:00 AM, r-help-request at r-project.org wrote:
> Dear All,
> I am having difficulty predicting the 'expected time of survival' for each
> observation from a glmnet Cox-family model with LASSO.
> I have two datasets, 50000 * 450 (obs * var) and 8000 * 450 (obs * var); I
> use the first one as the training set and the second one as the test set.
> I got the predict output and I am a bit lost here:
> pre <- predict(fit,type="response", newx =selectedVar[1:20,])
>           s0
> 1  0.9454985
> 2  0.6684135
> 3  0.5941740
> 4  0.5241938
> 5  0.5376783
> This is the output I am getting. I understand that type "response" gives
> the fitted relative risk for the "cox" family.
> I would like to know how I can convert the fitted relative risk
> to an 'expected time of survival'.
> Any help would be great; thanks for all your time and effort.
> Sincerely,

The answer is that, in general, you cannot predict survival time.  The reason is that most
studies do not follow the subjects for long enough.  For instance, say that the data set
comes from a study that enrolled subjects and then followed them for up to 5 years, by
which time 35% had died (by the usual Kaplan-Meier estimate).  Fit a model to the data and
ask "what is the predicted survival time for a low-risk subject?"  The answer will at best
be "greater than 5 years".  The program cannot say whether it is 6 or 10 or even 1000.  A
bigger data set does not help, because extra subjects add no information about what
happens after follow-up ends.
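
The point can be illustrated with a small simulation.  What follows is only a sketch
under stated assumptions: the data are simulated, the variable names (x, true_time, km)
and the hazard rates are made up for illustration, and it uses a plain Kaplan-Meier fit
from the survival package rather than glmnet.  Follow-up is cut off at 5 years, as in the
example above.

```r
library(survival)

set.seed(1)
n <- 2000
x <- rbinom(n, 1, 0.5)                            # hypothetical risk group: 1 = high risk
true_time <- rexp(n, rate = 0.05 * exp(1.2 * x))  # simulated true survival times, in years

# The study stops at 5 years, so longer survival times are never observed.
time   <- pmin(true_time, 5)
status <- as.integer(true_time <= 5)              # 1 = death observed, 0 = censored

km <- survfit(Surv(time, status) ~ x)

# Median survival per group; NA when the curve never drops to 0.5 during follow-up.
med <- summary(km)$table[, "median"]
print(med)

# Survival probability at 5 years, which the data do support.
surv5 <- summary(km, times = 5)$surv
print(surv5)
```

In this simulation the median for the low-risk group (x = 0) comes back as NA, since
fewer than half of those subjects die within 5 years, while the 5-year survival
probability is estimable for both groups.  Raising n tightens the estimate at 5 years but
does not turn the NA into a number.  A relative risk such as the one returned by
predict(..., type="response") ranks subjects against each other; on its own it does not
supply the missing part of the curve.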

Terry Therneau
