[R] about the summary(cph.object)
David Winsemius
dwinsemius at comcast.net
Sat Aug 1 14:34:19 CEST 2009
On Jul 31, 2009, at 11:24 PM, zhu yao wrote:
> Could someone explain the summary(cph.object)?
>
> The example is in the help file of cph.
>
> n <- 1000
> set.seed(731)
> age <- 50 + 12*rnorm(n)
> label(age) <- "Age"
> sex <- factor(sample(c('Male','Female'), n,
> rep=TRUE, prob=c(.6, .4)))
> cens <- 15*runif(n)
> h <- .02*exp(.04*(age-50)+.8*(sex=='Female'))
> dt <- -log(runif(n))/h
> label(dt) <- 'Follow-up Time'
> e <- ifelse(dt <= cens,1,0)
> dt <- pmin(dt, cens)
> units(dt) <- "Year"
> dd <- datadist(age, sex)
> options(datadist='dd')
This is the process for setting the range used to display effects in
Design regression objects. See:
?datadist
"q.effect
set of two quantiles for computing the range of continuous variables
to use in estimating regression effects. Defaults are c(.25,.75),
which yields inter-quartile-range odds ratios, etc."
?summary.Design
#---
" By default, inter-quartile range effects (odds ratios, hazards
ratios, etc.) are printed for continuous factors, ... "
#---
"Value
For summary.Design, a matrix of class summary.Design with rows
corresponding to factors in the model and columns containing the low
and high values for the effects, the range for the effects, the effect
point estimates (difference in predicted values for high and low
factor values), the standard error of this effect estimate, and the
lower and upper confidence limits."
#---
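As a rough illustration of where the Low and High values come from, the sketch below recomputes the quartiles of age in base R. It assumes that datadist, with the default q.effect of c(.25,.75), is in effect storing these sample quartiles; age_range is just a name chosen here, and the Design package is not needed to run it.

```r
# Simulate age exactly as in the quoted example
n <- 1000
set.seed(731)
age <- 50 + 12 * rnorm(n)

# Quartiles of age: the assumed source of the Low and High columns
age_range <- quantile(age, probs = c(0.25, 0.75))
print(age_range)

# Width of that interval, corresponding to the Diff. column
print(unname(diff(age_range)))
```

If you want the effect over a different interval, my reading of ?summary.Design is that you can supply the two values yourself, e.g. summary(f, age=c(50,60)).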
> Srv <- Surv(dt,e)
>
> f <- cph(Srv ~ rcs(age,4) + sex, x=TRUE, y=TRUE)
> summary(f)
>
> Effects
> Response : Srv
>
> Factor Low High Diff. Effect S.E. Lower 0.95 Upper 0.95
> age 40.872 57.385 16.513 1.21 0.21 0.80 1.62
> Hazard Ratio 40.872 57.385 16.513 3.35 NA 2.22 5.06
In this case, with age modeled as a 4-knot restricted cubic spline (3
df), a single coefficient does not describe the age effect; you need to
look at the "effect" across the range of the variable. You ought to plot
the age effect and examine anova(f). Without a transformation, the plot
for cph is on the log hazard scale. So the Effect for age in this case
should be the difference in log hazard between ages 57.385 and 40.872
(the High and Low values). S.E. is the standard error of that estimate,
and the Lower and Upper numbers are the 0.95 confidence bounds on the
effect estimate. The Hazard Ratio row gives you the exponentiated
results, so a difference in log hazards becomes a hazard ratio:
exp(1.21) = 3.35.
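The arithmetic behind the two age rows can be checked by hand in base R. This is only a sketch: it takes the rounded Effect and S.E. from the printed output and assumes the limits are ordinary Wald limits (estimate plus or minus a normal quantile times the standard error), which is my assumption about how the table is built rather than something the output states.

```r
# Values copied from the age row of the summary(f) output
effect <- 1.21   # difference in log hazard, age 57.385 vs 40.872
se     <- 0.21   # standard error of that difference

# Hazard ratio: exponentiate the log-hazard difference
hr <- exp(effect)

# Assumed Wald-type 0.95 limits on the log hazard scale
z      <- qnorm(0.975)
log_ci <- effect + c(-1, 1) * z * se

# Limits for the hazard ratio: exponentiate the log-scale limits
hr_ci <- exp(log_ci)

print(round(hr, 2))      # compare with the Effect in the Hazard Ratio row
print(round(log_ci, 2))  # compare with Lower 0.95 and Upper 0.95 for age
print(round(hr_ci, 2))   # compare with the limits in the Hazard Ratio row
```

The S.E. for the Hazard Ratio row is NA because the standard error applies on the log hazard scale; the limits are simply exponentiated. The same steps apply to the sex row, where small differences from the printed values are to be expected because the inputs here are already rounded.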
> sex - Female:Male 2.000 1.000 NA 0.64 0.15 0.35 0.94
> Hazard Ratio 2.000 1.000 NA 1.91 NA 1.42 2.55
>
>
> What's the meaning of Effect, S.E., Lower, Upper?
You probably ought to read a bit more basic material. If you are
asking this question, Harrell's "Regression Modeling Strategies" might
be over your head, but it would probably be a good investment anyway.
Venables and Ripley's "Modern Applied Statistics with S" has a chapter
on survival analysis. Also consider Kalbfleisch and Prentice, "The
Statistical Analysis of Failure Time Data". I'm sure there are others;
those are the ones I have on my shelf.
David Winsemius, MD
Heritage Laboratories
West Hartford, CT