[R] about the summary(cph.object)

Sat Aug 1 14:34:19 CEST 2009

On Jul 31, 2009, at 11:24 PM, zhu yao wrote:

> Could someone explain the summary(cph.object)?
>
> The example is in the help file of cph.
>
> n <- 1000
> set.seed(731)
> age <- 50 + 12*rnorm(n)
> label(age) <- "Age"
> sex <- factor(sample(c('Male','Female'), n,
>              rep=TRUE, prob=c(.6, .4)))
> cens <- 15*runif(n)
> h <- .02*exp(.04*(age-50)+.8*(sex=='Female'))
> dt <- -log(runif(n))/h
> label(dt) <- 'Follow-up Time'
> e <- ifelse(dt <= cens,1,0)
> dt <- pmin(dt, cens)
> units(dt) <- "Year"
> dd <- datadist(age, sex)
> options(datadist='dd')

This is the process for setting the range for the display of effects in
Design regression objects. See:

?datadist

"q.effect
set of two quantiles for computing the range of continuous variables  
to use in estimating regression effects. Defaults are c(.25,.75),  
which yields inter-quartile-range odds ratios, etc."
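
As a rough illustration of what those defaults mean: the Low and High
values that summary() reports for a continuous variable are its 25th and
75th percentiles. A minimal base-R sketch, assuming the age variable
simulated above (datadist may use a different quantile rule than the
default of quantile(), so the last digits need not agree with the printed
table):

```r
# Recreate the simulated age variable from the example above
n <- 1000
set.seed(731)
age <- 50 + 12 * rnorm(n)

# Default q.effect = c(.25, .75): the inter-quartile range of age
q <- quantile(age, probs = c(0.25, 0.75))
low  <- unname(q[1])    # reported as "Low"
high <- unname(q[2])    # reported as "High"
diff_age <- high - low  # reported as "Diff."

print(c(Low = low, High = high, Diff = diff_age))
```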

?summary.Design
#---
" By default, inter-quartile range effects (odds ratios, hazards  
ratios, etc.) are printed for continuous factors, ... "
#---
"Value
For summary.Design, a matrix of class summary.Design with rows  
corresponding to factors in the model and columns containing the low  
and high values for the effects, the range for the effects, the effect  
point estimates (difference in predicted values for high and low  
factor values), the standard error of this effect estimate, and the  
lower and upper confidence limits."

#---

> Srv <- Surv(dt,e)
>
> f <- cph(Srv ~ rcs(age,4) + sex, x=TRUE, y=TRUE)
> summary(f)
>
>                                         Effects               
> Response : Srv
>
> Factor            Low    High   Diff.  Effect S.E. Lower 0.95 Upper 0.95
> age               40.872 57.385 16.513 1.21   0.21 0.80       1.62
>  Hazard Ratio     40.872 57.385 16.513 3.35     NA 2.22       5.06

In this case, with a 4-knot restricted cubic spline for age, you need to
look at the "effect" across the range of the variable. You ought to plot
the age effect and examine anova(f). Without a transformation, the plot
is on the log hazard scale for cph. So the Effect for age in this case
should be the difference in log hazard between ages 57.385 and 40.872
(High versus Low). S.E. is the standard error of that estimate, and the
Lower and Upper numbers are the 0.95 confidence limits on the effect
estimate. The Hazard Ratio row gives you exponentiated results, so a
difference in log hazards becomes a hazard ratio. {exp(1.21) = 3.35}
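
To make the arithmetic concrete, here is a small base-R sketch that starts
from the rounded Effect and S.E. printed for age and reproduces the other
numbers in the two age rows. It assumes a normal-approximation (Wald)
interval, which is my reading of how these limits are formed; because the
inputs are already rounded, agreement is only to about two decimals.

```r
effect <- 1.21   # difference in log hazard, age 57.385 vs 40.872
se     <- 0.21   # standard error of that difference

z <- qnorm(0.975)                      # about 1.96 for 0.95 limits
ci_log <- effect + c(-1, 1) * z * se   # limits on the log hazard scale

hr    <- exp(effect)   # hazard ratio
ci_hr <- exp(ci_log)   # limits on the hazard ratio scale

print(round(c(Effect = effect, Lower = ci_log[1], Upper = ci_log[2]), 2))
print(round(c(HR = hr, Lower = ci_hr[1], Upper = ci_hr[2]), 2))
```

The Hazard Ratio row shows NA for S.E. because its limits come from
exponentiating the log-scale limits, not from a standard error on the
ratio scale.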

> sex - Female:Male  2.000  1.000     NA 0.64   0.15 0.35       0.94
>  Hazard Ratio      2.000  1.000     NA 1.91     NA 1.42       2.55
>
>
> What's the meaning of Effect, S.E., Lower, Upper?

You probably ought to read a bit more basic material. If you are
asking this question, Harrell's "Regression Modeling Strategies" might
be over your head, but it would probably be a good investment anyway.
Venables and Ripley's "Modern Applied Statistics with S" has a chapter
on survival analysis. Also consider Kalbfleisch and Prentice, "The
Statistical Analysis of Failure Time Data". I'm sure there are others;
those are the ones I have on my shelf.

David Winsemius, MD
Heritage Laboratories
West Hartford, CT