[R] how to check linearity in Cox regression
Steven McKinney
smckinney at bccrc.ca
Sat May 10 03:33:21 CEST 2008
> -----Original Message-----
> From: r-help-bounces at r-project.org on behalf of array chip
> Sent: Fri 5/9/2008 2:54 PM
> To: r-help at stat.math.ethz.ch
> Subject: [R] how to check linearity in Cox regression
> Hi, I am just wondering if there is a test available for testing if a linear fit of an independent variable in a Cox regression is enough? Thanks for any suggestions.
> John Zhang
> require("survival")
> require("splines")
>
> #
> # You could fit a model with the covariate of interest,
> # then an additional fit with other functional forms.
> # I'll use the stanford2 data and age as the covariate.
> # The first fit will have age as a linear term.
> # The second fit will have a quadratic term as well.
> #
>
> stanford2df <- stanford2
> stanford2df$agectr <- stanford2$age - mean(stanford2$age)
> stanford2df$age2 <- stanford2df$agectr^2
> fit1 <- coxph(Surv(time, status) ~ agectr, data = stanford2df)
> fit2 <- coxph(Surv(time, status) ~ agectr + age2, data = stanford2df)
> summary(fit1)
Call:
coxph(formula = Surv(time, status) ~ agectr, data = stanford2df)
n= 184
coef exp(coef) se(coef) z p
agectr 0.0292 1.03 0.0106 2.74 0.0061
exp(coef) exp(-coef) lower .95 upper .95
agectr 1.03 0.971 1.01 1.05
Rsquare= 0.044 (max possible= 0.996 )
Likelihood ratio test= 8.27 on 1 df, p=0.00403
Wald test = 7.51 on 1 df, p=0.00613
Score (logrank) test = 7.6 on 1 df, p=0.00583
> summary(fit2)
Call:
coxph(formula = Surv(time, status) ~ agectr + age2, data = stanford2df)
n= 184
coef exp(coef) se(coef) z p
agectr 0.04100 1.04 0.010237 4.01 6.2e-05
age2 0.00197 1.00 0.000689 2.86 4.2e-03
exp(coef) exp(-coef) lower .95 upper .95
agectr 1.04 0.960 1.02 1.06
age2 1.00 0.998 1.00 1.00
Rsquare= 0.08 (max possible= 0.996 )
Likelihood ratio test= 15.4 on 2 df, p=0.00046
Wald test = 17.4 on 2 df, p=0.00017
Score (logrank) test = 17.3 on 2 df, p=0.000175
> anova(fit1, fit2, test = "Chisq")
Analysis of Deviance Table
Model 1: Surv(time, status) ~ agectr
Model 2: Surv(time, status) ~ agectr + age2
Resid. Df Resid. Dev Df Deviance P(>|Chi|)
1 183 1017.94
2 182 1010.84 1 7.10 0.01
> # the likelihood ratio test suggests that a linear term for
> # age alone may not be adequate. The quadratic form yields
> # a better fit.
>
> # You can also examine functional forms using spline fits.
> fit0 <- coxph(Surv(time, status) ~ age, data = stanford2df)
> fit3 <- coxph(Surv(time, status) ~ ns(age, df = 4), data = stanford2df)
> pred3 <- predict(fit3, type="terms", se=TRUE)
>
> hfit <- pred3$fit[,1]
> hse <- pred3$se[,1]
> hmat=cbind(hfit, hfit+2*hse,hfit-2*hse)
> o <- order(stanford2$age)
> matplot(stanford2$age[o], hmat[o, ], pch="*",col=c(2,4,4), xlab = "age",
+ ylab="Log Relative Risk",main="Cox Model: Survival",type="l")
> # The plot of the spline fit for age shows a non-linear form
>
> anova(fit0, fit3, test = "Chisq")
Analysis of Deviance Table
Model 1: Surv(time, status) ~ age
Model 2: Surv(time, status) ~ ns(age, df = 4)
Resid. Df Resid. Dev Df Deviance P(>|Chi|)
1 183 1017.94
2 180 1009.45 3 8.49 0.04
> # The likelihood ratio test suggests the linear model
> # can be improved upon.
Hope this helps
Steve McKinney
____________________________________________________________________________________
[[elided Yahoo spam]]
______________________________________________
R-help at r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.
More information about the R-help
mailing list