[R] survival::clogit, how to extract residuals for GOF assessment
Joe Ceradini
joeceradini at gmail.com
Tue Apr 26 20:04:26 CEST 2016
Hi Folks,
Hopefully this question has enough R and not too much stats to be
appropriate for this list. Based on,* Hosmer et al. 2013. Logistic
regression for matched case-control studies. Applied Logistic
Regression *(eqtn.
7.8)*, *I am assessing GOF of conditional (or matched) logistic regression
models with the *standardized Pearson residuals*. The authors define
“large” as delta chi-squared values > 4.
For a 1:1 study, I can fit a conditional logistic model via an
unconditional routine by making the response variable all 1’s, taking the
difference of the covariate values for each pair, and removing the
intercept. I can then extract the standardized residuals from the model
(see code at bottom for example). However, if I want to fit a 1:many model,
I need to use survival::clogit, which is where my question comes in:
How can I apply this same "test" to a clogit model? Which residuals should
I be extracting? Or, is this not an option for a clogit model?
## The default residuals of coxph in R are the martingale residuals.
## resid(fit1,type=c("martingale", "deviance", "score", "schoenfeld",
## "dfbeta", "dfbetas", "scaledsch","partial"))
R code below shows equivalence between clogit and binomial GLM fit on the
differences (note: these would not be equivalent if used a "cluster"
argument in clogit), and GOF "test" for binomial GLM fit on the
differences. I would like to assess GOF of the fit.clogit model but do not
know how.
require(survival)
require(plyr)
# Dataframe
dat.full <- structure(list(resp = c(1L, 0L, 0L, 1L, 0L, 1L, 1L, 0L, 1L, 0L,
0L, 1L, 0L, 1L, 0L, 1L, 0L, 1L, 0L, 1L, 0L, 1L, 0L, 1L, 0L, 1L,
0L, 1L, 0L, 1L, 0L, 1L, 0L, 1L, 0L, 1L, 0L, 1L, 1L, 0L, 1L, 0L,
0L, 1L, 1L, 0L, 1L, 0L, 1L, 0L, 0L, 1L, 0L, 1L, 0L, 1L, 1L, 0L,
1L, 0L), x1 = c(3.92, 2.04, 2.27, 9.25, 6.13, 10.44, 5.09, 1.27,
5.9, 2.88, 3.79, 1.46, 3.35, 3.82, 4.28, 6.52, 3.54, 3.46, 0.46,
1.68, 3.22, 1.64, -0.12, 2.78, 2.58, 2, 2.83, 3.58, 1.45, 1.64,
2.89, 3.12, 5.6, 8.29, 3.42, 4.8, 3.04, 4.33, 5.31, 1.78, 8.18,
4.56, 4.85, 7.99, 7.52, 6.85, 7.64, 3.33, 5.17, 4.62, 1.24, 2.54,
3.08, 8.2, 1.81, 2.78, 2.16, 2.76, 3.45, 3.43), ID = c(10L, 10L,
11L, 11L, 13L, 13L, 17L, 17L, 18L, 18L, 23L, 23L, 25L, 25L, 29L,
29L, 31L, 31L, 33L, 33L, 35L, 35L, 38L, 38L, 39L, 39L, 4L, 4L,
41L, 41L, 43L, 43L, 45L, 45L, 46L, 46L, 49L, 49L, 50L, 50L, 54L,
54L, 55L, 55L, 56L, 56L, 57L, 57L, 59L, 59L, 6L, 6L, 60L, 60L,
7L, 7L, 8L, 8L, 9L, 9L)), .Names = c("resp", "x1", "ID"), row.names = c(1L,
2L, 3L, 4L, 7L, 8L, 11L, 12L, 13L, 14L, 15L, 16L, 17L, 18L, 21L,
22L, 23L, 24L, 25L, 26L, 29L, 30L, 31L, 32L, 33L, 34L, 35L, 36L,
37L, 38L, 39L, 40L, 41L, 42L, 43L, 44L, 45L, 46L, 49L, 50L, 55L,
56L, 57L, 58L, 59L, 60L, 61L, 62L, 63L, 64L, 65L, 66L, 67L, 68L,
71L, 72L, 73L, 74L, 75L, 76L), class = "data.frame")
# fit clogit
fit.clogit <- clogit(resp ~ x1 + strata(ID), method = "efron", data =
dat.full)
summary(fit.clogit)
# Make dataframe so can fit on differences with unconditional routine.
# x1 where resp = 1 minus x1 where resp = 0, grouped by ID
dat.diff <- ddply(dat.full, .(ID), summarise,
x1.diff = x1[resp == 1] - x1[resp == 0])
dat.diff$resp <- 1 # response variable is all 1's
# Fit on differences and 1's, remove intercept
fit.diff <- glm(resp ~ -1 + x1.diff, family = binomial, data = dat.diff)
summary(fit.diff)
summary(fit.clogit)$coefficients
# GOF: delta chi-squared
plot(fit.diff$fitted.values, rstandard(fit.diff)^2)
round(sort(rstandard(fit.diff)^2), 4)
Thanks!
Joe
[[alternative HTML version deleted]]
More information about the R-help
mailing list