[R] glmnet with binary logistic regression
fongchun
fongchunchan at gmail.com
Sat Jul 23 01:51:32 CEST 2011
Hi all,
I am using the glmnet R package to run LASSO with binary logistic
regression. I have over 290 samples with outcome data (0 for alive, 1 for
dead) and over 230 predictor variables. I am currently using LASSO to reduce
the number of predictor variables.
I am using the cv.glmnet function to do 10-fold cross validation on a
sequence of lambda values that I let glmnet determine. I then take the
optimal lambda value (lambda.1se) and use it to predict on an
independent cohort.
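
For concreteness, this is roughly what I am running (a minimal sketch;
x, y, and x_new are hypothetical stand-ins for my training predictors,
outcomes, and validation-cohort predictors):

library(glmnet)

## x     - matrix of ~230 predictors for the ~290 training samples
## y     - outcome vector (0 = alive, 1 = dead)
## x_new - predictor matrix for the independent validation cohort

## 10-fold CV over a lambda sequence chosen by glmnet itself
cvfit <- cv.glmnet(x, y, family = "binomial", nfolds = 10)

## lambda within one standard error of the minimum CV error
best_lambda <- cvfit$lambda.1se

## predicted probabilities of death for the validation cohort
pred <- predict(cvfit, newx = x_new, s = "lambda.1se",
                type = "response")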
What I am finding is that this optimal lambda value fluctuates every time I
run glmnet with LASSO. It varies enough that the ROC curves I generate for
my validation cohort give noticeably different AUC values from run to run.
Does anyone know why there is such a fluctuation in the optimal lambda? I am
thinking it might be because the 10-fold cross-validation step splits the
training set at random, so some folds may not have enough alive and dead
cases. Thoughts?
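
In case it helps to show what I mean, here is a sketch (using the same
hypothetical x and y as above) of how I could pin down the fold
assignment, either by fixing the seed or by building outcome-stratified
folds and passing them through cv.glmnet's foldid argument:

## Option 1: fix the seed so the random fold assignment is reproducible
set.seed(123)
cvfit <- cv.glmnet(x, y, family = "binomial", nfolds = 10)

## Option 2: build folds stratified by outcome so every fold gets a
## similar mix of alive (0) and dead (1) cases, then pass them in
foldid <- integer(length(y))
for (cl in unique(y)) {
  idx <- which(y == cl)
  foldid[idx] <- sample(rep(1:10, length.out = length(idx)))
}
cvfit <- cv.glmnet(x, y, family = "binomial", foldid = foldid)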