[R] Appropriateness of survdiff {survival} for non-censored data

Terry Therneau therneau at mayo.edu
Thu Jul 8 15:55:10 CEST 2010


The query:

"Thus I question the appropriateness of using survdiff in my analysis; I
have exact data yet I would be testing on the Kaplan-Meier estimate of
these data in survdiff.  Thanks for any help."

My thoughts:
  There are two aspects of survival analysis you need to think about.  The
first, as you've noted, is the nuisance of censored data and the fact
that it forces the use of different software.  All of that software works
fine with uncensored data; the Kaplan-Meier, for instance, simply reduces
to (one minus) the empirical cdf.
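
A minimal sketch of that reduction, assuming the survival package is
installed.  The lifetimes y are simulated purely for illustration and
contain no ties; Surv(y) with no status argument treats every
observation as an event.

```r
library(survival)

set.seed(1)
y <- rexp(20)                  # made-up uncensored lifetimes, no ties

fit <- survfit(Surv(y) ~ 1)    # Kaplan-Meier with every time an event
km  <- fit$surv                # estimated S(t) at the sorted event times
emp <- 1 - ecdf(y)(fit$time)   # one minus the empirical cdf at those times

all.equal(km, emp)             # the two curves should agree
```

With no censoring each event removes exactly one subject from the risk
set, so the product-limit estimate collapses to the fraction of
lifetimes exceeding t.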
  The second is that the models commonly used are ones that have been
found to work well for this kind of data.  It is easy to do a censored
data t-test, for instance [survreg(Surv(y) ~ x, dist='gaussian')], but it
is almost never done.  The reason is that the effect of covariates on
survival times is not well described by a location shift, e.g.,
"everyone gets 3 more weeks".  The log-rank test is most powerful for a
proportional shift in the hazard rate, which is how a lot of covariates
seem to work for this kind of data.  BTW, in uncensored data the log-rank
test is equivalent to the Savage exponential scores test, which comes
from the nonparametrics literature but is rarely used there: most of
that literature deals with problems where the effect of 'x' is not a
shift in hazard.
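
Both remarks can be checked numerically.  The sketch below uses
simulated data; the names g, y, z and x are hypothetical, and the data
are assumed to be uncensored with no ties.  It compares the expected
event count from survdiff with the sum of Savage-type scores (the
cumulative hazard evaluated at each rank), and the Gaussian survreg
coefficients with ordinary least squares.

```r
library(survival)

set.seed(2)
n <- 30
g <- rep(c("a", "b"), each = n / 2)           # hypothetical two-group label
y <- rexp(n, rate = ifelse(g == "a", 1, 2))   # uncensored lifetimes, no ties

# Log-rank test; Surv(y) treats every time as an observed event
lr <- survdiff(Surv(y) ~ g)

# Savage-type score for rank r: sum over j <= r of 1 / (n - j + 1)
H <- cumsum(1 / (n:1))[rank(y)]

# Expected events in group "a" should equal that group's summed scores
exp_a   <- as.numeric(lr$exp[1])
score_a <- sum(H[g == "a"])
all.equal(exp_a, score_a)

# Gaussian survreg on uncensored data: a location-shift model whose
# coefficients should match least squares up to convergence tolerance
x <- as.numeric(g == "b")
z <- 10 + 3 * x + rnorm(n)
b_surv <- unname(coef(survreg(Surv(z) ~ x, dist = "gaussian")))
b_lm   <- unname(coef(lm(z ~ x)))
all.equal(b_surv, b_lm, tolerance = 1e-4)
```

Note that survreg estimates the scale by maximum likelihood, so its
standard errors and p-value will generally differ slightly from those
of the classical t-test even though the coefficients agree.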

If the way in which covariates affect insect lifetimes is similar to how
they work in human biology or industrial reliability, then survival
methods would be a good choice.  The answer to this is biological, not
statistical.

Terry Therneau


