[R] glmnet with binary logistic regression
fongchun
fongchunchan at gmail.com
Sat Jul 23 01:51:32 CEST 2011
Hi all,
I am using the glmnet R package to run LASSO with binary logistic
regression. I have over 290 samples with outcome data (0 for alive, 1 for
dead) and over 230 predictor variables. I am currently using LASSO to reduce
the number of predictor variables.
I am using the cv.glmnet function to do 10-fold cross validation on a
sequence of lambda values that I let glmnet determine. I then take the
optimal lambda value (lambda.1se) and use it to predict on an
independent cohort.
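
For concreteness, this is roughly what I am running (a minimal sketch;
x, y, and x_new are hypothetical stand-ins for my training predictors,
outcomes, and validation-cohort predictors):

library(glmnet)

## x     - matrix of ~230 predictors for the ~290 training samples
## y     - outcome vector (0 = alive, 1 = dead)
## x_new - predictor matrix for the independent validation cohort

## 10-fold CV over a lambda sequence chosen by glmnet itself
cvfit <- cv.glmnet(x, y, family = "binomial", nfolds = 10)

## lambda within one standard error of the minimum CV error
best_lambda <- cvfit$lambda.1se

## predicted probabilities of death for the validation cohort
pred <- predict(cvfit, newx = x_new, s = "lambda.1se",
                type = "response")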
What I am finding is that this optimal lambda value fluctuates every time I
run glmnet with LASSO. It varies enough that the ROC curves I generate for
my validation cohort give noticeably different AUC values from run to run.
Does anyone know why there is such a fluctuation in the optimal lambda? I am
thinking it might be because the 10-fold cross-validation step splits the
training set at random, so some folds may not have enough alive and dead
cases. Thoughts?
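
In case it helps to show what I mean, here is a sketch (using the same
hypothetical x and y as above) of how I could pin down the fold
assignment, either by fixing the seed or by building outcome-stratified
folds and passing them through cv.glmnet's foldid argument:

## Option 1: fix the seed so the random fold assignment is reproducible
set.seed(123)
cvfit <- cv.glmnet(x, y, family = "binomial", nfolds = 10)

## Option 2: build folds stratified by outcome so every fold gets a
## similar mix of alive (0) and dead (1) cases, then pass them in
foldid <- integer(length(y))
for (cl in unique(y)) {
  idx <- which(y == cl)
  foldid[idx] <- sample(rep(1:10, length.out = length(idx)))
}
cvfit <- cv.glmnet(x, y, family = "binomial", foldid = foldid)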