[R] logistic regression model + Cross-Validation

Frank E Harrell Jr f.harrell at vanderbilt.edu
Sun Jan 21 15:54:00 CET 2007


nitin jindal wrote:
> Hi,
> 
> I am trying to cross-validate a logistic regression model.
> I am using logistic regression model (lrm) of package Design.
> 
> f <- lrm( cy ~ x1 + x2, x=TRUE, y=TRUE)
> val <- validate.lrm(f, method="cross", B=5)

val <- validate(f, ...)    # .lrm not needed; validate() dispatches on the class of f

> 
> My class cy has values 0 and 1.
> 
> "val" variable will give me indicators like slope and AUC. But, I also need
> the vector of predicted values of class variable "cy" for each record while
> cross-validation, so that I can manually look at the results. So, is there
> any way to get those probabilities assigned to each class.
> 
> regards,
> Nitin

No, validate.lrm does not have that option.  Manually looking at the 
results will not be easy when you do enough cross-validations.  A single 
5-fold cross-validation does not provide accurate estimates.  Either use 
the bootstrap or repeat k-fold cross-validation between 20 and 50 times.
k is often 10, but the optimum value may not be 10.  Code for averaging
repeated cross-validations is in 
http://biostat.mc.vanderbilt.edu/twiki/pub/Main/RmS/logistic.val.pdf
along with simulations of bootstrap vs. a few cross-validation methods 
for binary logistic models.
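
If you still want the per-record predicted probabilities, they have to be
collected by hand.  The following is only a rough sketch of repeated k-fold
cross-validation, not what validate does internally and not the code in the
PDF above.  It assumes base R glm() in place of lrm(), uses simulated data
with the variable names from the question, and cv_probs() is a made-up
helper name for illustration.

```r
# Sketch: repeated k-fold cross-validation that keeps the out-of-fold
# predicted probability for every record.
# Assumptions: base R glm() stands in for lrm(); the data are simulated.
set.seed(1)
n  <- 200
x1 <- rnorm(n)
x2 <- rnorm(n)
cy <- rbinom(n, 1, plogis(0.5 * x1 - 0.8 * x2))
d  <- data.frame(cy, x1, x2)

# cv_probs() is a hypothetical helper, not part of Design.
# It returns an n x reps matrix: column r holds, for each record, the
# probability predicted by the model that did not see that record in
# repetition r.
cv_probs <- function(data, k = 10, reps = 20) {
  n   <- nrow(data)
  out <- matrix(NA_real_, nrow = n, ncol = reps)
  for (r in seq_len(reps)) {
    fold <- sample(rep(seq_len(k), length.out = n))  # random fold labels
    for (j in seq_len(k)) {
      test <- fold == j
      fit  <- glm(cy ~ x1 + x2, family = binomial, data = data[!test, ])
      out[test, r] <- predict(fit, newdata = data[test, ], type = "response")
    }
  }
  out
}

p     <- cv_probs(d, k = 10, reps = 20)
phat  <- rowMeans(p)                    # per-record average over repetitions
brier <- mean(colMeans((p - d$cy)^2))   # Brier score averaged over repetitions
```

Each column of p comes from a different random split, so averaging an index
over the columns (the Brier score here) is what steadies the estimate; a
single column is one k-fold cross-validation and will be noisy.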

Frank
-- 
Frank E Harrell Jr   Professor and Chair           School of Medicine
                      Department of Biostatistics   Vanderbilt University


