[R] Ridge Regression variable selection
Ben Bolker
bbolker at gmail.com
Thu Dec 27 16:14:39 CET 2012
Frank Harrell <f.harrell <at> vanderbilt.edu> writes:
>
> Unlike L1 (lasso) regression or elastic net (mixture of L1 and L2), L2 norm
> regression (ridge regression) does not select variables. Selection of
> variables would not work properly, and it's unclear why you would want to
> omit "apparently" weak variables anyway.
> Frank
>
... and this was cross-posted from StackOverflow, where I said more
or less the same thing about ridge regression (I didn't get into the
"don't do variable selection" issue yet; I was waiting ...)
http://stackoverflow.com/questions/14046569/ridge-regression-in-r
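
As a rough illustration of the point that ridge regression shrinks
coefficients without dropping any of them, here is a minimal sketch.
It assumes the built-in longley data set as a stand-in (the poster's
diabetes10.txt is not available here), and the lambda grid is an
arbitrary choice for the example.

```r
library(MASS)  # provides lm.ridge()

# longley ships with base R; it only stands in for the poster's data.
lambdas <- c(0, 0.01, 0.1, 1, 10)
fit <- lm.ridge(GNP.deflator ~ ., data = longley, lambda = lambdas)

# coef() gives one row per lambda (intercept in the first column),
# back-transformed to the original scale of the predictors.
b <- coef(fit)
print(round(b, 4))

# Count slope coefficients that are exactly zero at each lambda.
n_zero <- rowSums(b[, -1] == 0)
print(n_zero)
```

The coefficients change as lambda grows, but the count of exact zeros
stays at zero for every lambda, so there is nothing in the fit that
marks a variable as "selected" or "dropped".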
For the other questions (what are the lambda values? What does
the output mean?) I would suggest getting a copy of _Modern
Applied Statistics with S_ [the book that the package, MASS, was
written to accompany] and reading the relevant chapter.
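
In the meantime, a small sketch of where the three lines printed by
select() come from may help. It again assumes the built-in longley
data rather than the poster's file, and the grid of lambda values is
only an example. The component names used below (kHKB, kLW, GCV,
lambda) are those stored on the object returned by lm.ridge().

```r
library(MASS)

# lambda is the ridge penalty; lm.ridge() fits one model per value.
lambdas <- seq(0, 0.1, by = 0.001)
fit <- lm.ridge(GNP.deflator ~ ., data = longley, lambda = lambdas)

# select() prints the HKB and L-W estimates of the ridge constant and
# the grid value of lambda with the smallest GCV score.
MASS::select(fit)

# The same quantities are stored on the fitted object.
print(fit$kHKB)   # Hoerl-Kennard-Baldwin estimate of lambda
print(fit$kLW)    # Lawless-Wang estimate of lambda
best_lambda <- fit$lambda[which.min(fit$GCV)]
print(best_lambda)
```

All three numbers are suggestions for the amount of shrinkage, not
statements about individual variables. If the GCV minimum lands on
the first or last value of the grid, that may be a sign the grid
should be widened.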
> maths123 wrote
> > I have a .txt file containing a dataset with 500 samples. There are 10
> > variables.
> >
> > I am trying to perform variable selection using the ridge regression
> > method but I am very confused.
> >
> > I have input the following:
> > diabetes10<-read.table("diabetes10.txt", header=TRUE)
> > diabetes10
> > library(MASS)
> > select(lm.ridge(y=diabetes10 ~ age+sex+bmi+map+tc , diabetes10,
> > lambda = seq(0,0.1,0.0001)))
> >
> > First of all, I am confused about the lambda values.
> > Second of all, my output is:
> >
> > modified HKB estimator is -1.334073e-29
> > modified L-W estimator is -5.610557e-28
> > smallest value of GCV at 1e-04
> >
> >
> > I have no idea what that is telling me and where I am supposed to work out
> > which variables have been selected.
> >