[R] model selection, stepAIC(), and coxph() (fwd)

Thomas Lumley tlumley at u.washington.edu
Mon May 8 17:11:35 CEST 2006

On Sat, 6 May 2006, Chad Reyhan Bhatti wrote:

> Hello,
> My question concerns model selection, stepAIC(), add1(), and coxph().
> In Venables and Ripley (3rd Ed) pp389-390 there is an example of using
> stepAIC() for the automated selection of a coxph model for VA lung cancer
> data.
> A statistics question:  Can partial likelihoods be interpreted in the same
> manner as likelihoods with respect to information-based criteria and
> likelihood ratio tests?  It seems that they should be treated as
> quasilikelihoods, which would make stepAIC() invalid and would require the
> use of add1() with an F-test for the reduction in deviance.

Since this is a question about the MASS book you would be better off 
contacting the authors.

They do (as usual) know what they are doing.  The Cox model is an 
unusually (perhaps uniquely) well-behaved semiparametric model, and the 
partial likelihood really does behave this way.

- For data without ties in the survival time the partial likelihood is 
(proportional to) the marginal likelihood of the ranks, so it is a 
perfectly good parametric likelihood. (Kalbfleisch & Prentice, 
Biometrika, 1973)

- The chi^2 distribution (rather than F distribution) for the likelihood 
ratio test is justified by the marginal likelihood, or by martingale 
arguments (e.g. the book by Fleming and Harrington), or in more modern times 
by empirical process arguments or as a semiparametric profile likelihood. 
However, the only technically hard part is showing weak convergence -- the 
original paper by Cox showed that the variance of the partial score equals 
minus the expected Hessian of the log partial likelihood, which is the key 
fact for the chi^2 rather than F test to be valid (if one of them is).

- The same arguments suggest AIC will be appropriate for comparing 
different subsets of variables in the same way that it is for generalized 
linear models. I don't have a reference here.
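
As a rough illustration of the points above (a sketch, not the code from
the MASS book): the example below assumes the `veteran` data frame shipped
with the survival package as a stand-in for the VA lung cancer data, and
the choice of the smaller model is arbitrary. It computes the likelihood
ratio test against a chi^2 reference from the maximized log partial
likelihoods, checks that AIC is just -2 log partial likelihood + 2p, and
then lets stepAIC() search backwards from the full model.

```r
library(survival)  # coxph(), Surv(), and the `veteran` data (assumed stand-in)
library(MASS)      # stepAIC()

# Two nested Cox models; the smaller one is an arbitrary illustrative choice.
full  <- coxph(Surv(time, status) ~ trt + celltype + karno + diagtime +
                 age + prior, data = veteran)
small <- coxph(Surv(time, status) ~ celltype + karno, data = veteran)

# Likelihood ratio test by hand: loglik[2] is the maximized log partial
# likelihood, and the reference distribution is chi^2 (not F).
lr      <- 2 * (full$loglik[2] - small$loglik[2])
df      <- length(coef(full)) - length(coef(small))
p_chisq <- pchisq(lr, df = df, lower.tail = FALSE)

# anova() on nested coxph fits reports the same likelihood ratio test.
print(anova(small, full))

# AIC computed from the partial likelihood, exactly as for a glm.
aic_by_hand <- -2 * full$loglik[2] + 2 * length(coef(full))
print(c(by_hand = aic_by_hand, AIC = AIC(full)))

# Backward selection by AIC, starting from the full model.
selected <- stepAIC(full, trace = FALSE)
print(formula(selected))
```

The hand-computed statistic and the anova() table should agree, and the
model returned by stepAIC() should have an AIC no larger than that of the
full model it started from.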


Thomas Lumley			Assoc. Professor, Biostatistics
tlumley at u.washington.edu	University of Washington, Seattle
