[R] isssues with predict.coxph, offset, type = "expected", and newdata?
David James
daj025 at gmail.com
Fri Sep 30 16:26:27 CEST 2016
Hi,
It seems there might be two issues with predict.coxph(), and I'd
appreciate confirmation before submitting a bug report to the package
author.
(1) predict() seems to produce incorrect predictions when using type
= "expected" from a Cox model with an offset and specifying a new data
for prediction (see Example 1 below).
(2) predict() produces an error for a call with a null model and a
newdata= argument (see Example 2 below).
Please note that I didn't check these cases with additional
complications, like strata(), cluster(), etc.
SessionInfo() at the bottom of this email (notice I'm using survival
2.39-5, the current version in CRAN).
Thanks,
David
--------------------------------------------------------------------------------------------------------------
Example 1
> library("survival")
> data(lung)
>
> # simulate a three-fold genetic effect (worsening)
> set.seed(123)
> lung$gene.eff <- sample(log(c(1, 3)), size=nrow(lung), replace=TRUE, prob=c(0.5, 0.5))
>
> # fix the gene effect, do not estimate it
> m1 <- coxph(Surv(time, status) ~ age + sex + offset(gene.eff), data = lung)
>
> # two hypothetical individuals differing only on gene.eff
> nd <- expand.grid(time=180, status=1, age=65, sex=1, gene.eff=log(c(1,3)))
>
> p1.lp <- predict(m1, newdata = nd, type = "lp")
> p1.rsk <- predict(m1, newdata = nd, type = "risk")
> p1.exp <- predict(m1, newdata = nd, type = "expected")
>
> # output from type "lp" (linear predictor) and "risk" are ok,
> # but not from type "expected"
> all.equal(3, exp(p1.lp[2]-p1.lp[1]), check.names = FALSE)
[1] TRUE
> all.equal(3, p1.rsk[2]/p1.rsk[1], check.names = FALSE)
[1] TRUE
> all.equal(3, p1.exp[2]/p1.exp[1], check.names = FALSE)
[1] "Mean relative difference: 0.6666667"
>
> # notice that predict() produces the same expected number of
> # events for both hypothetical individuals
> print(p1.exp)
[1] 0.1882663 0.1882663
Example 2
> # null model
> m0 <- coxph(Surv(time, status) ~ 1, data = lung)
>
> # time points at which we want predictions
> nd <- data.frame(time=seq(from=0, to=360, by=30), status=rep(1, 13))
>
> # predictions are produced for types "lp" and "risk", but not "expected"
> p0.lp <- predict(m0, type="lp", newdata = nd)
> p0.rsk <- predict(m0, type="risk", newdata = nd)
> p0.exp <- predict(m0, type="expected", newdata = nd)
Error in newx %*% object$coef :
requires numeric/complex matrix/vector arguments
>
Session Info:
> sessionInfo()
R version 3.3.1 (2016-06-21)
Platform: i386-w64-mingw32/i386 (32-bit)
Running under: Windows 7 (build 7601) Service Pack 1
locale:
[1] LC_COLLATE=English_United States.1252 LC_CTYPE=English_United
States.1252 LC_MONETARY=English_United States.1252
[4] LC_NUMERIC=C LC_TIME=English_United
States.1252
attached base packages:
[1] stats graphics grDevices utils datasets methods base
other attached packages:
[1] survival_2.39-5
loaded via a namespace (and not attached):
[1] Matrix_1.2-6 tools_3.3.1 splines_3.3.1 grid_3.3.1
lattice_0.20-33
More information about the R-help
mailing list