[R] Questions regarding the lasso and glmnet
    Patrick Breheny 
    patrick.breheny at uky.edu
       
    Sun May 29 12:41:36 CEST 2011
    
    
  
On 05/28/2011 12:54 PM, Ben Haller wrote:
> 1. Is my choice of glmnet() ok?  On what basis should I choose
> glmnet() vs. lars()?
lars() fits linear regression models only; your outcome is binary, so
glmnet() with family = "binomial" is the appropriate choice.
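
As a rough sketch of what a lasso fit to a binary outcome looks like in
glmnet (the data here are simulated and the names x, y, fit, and cvfit
are made up for illustration; they are not from the original analysis):

```r
library(glmnet)

set.seed(1)
n <- 200
# Five simulated predictors; only the first two affect the outcome
x <- matrix(rnorm(n * 5), n, 5, dimnames = list(NULL, paste0("x", 1:5)))
y <- rbinom(n, 1, plogis(x[, 1] - 0.5 * x[, 2]))

# Lasso-penalized logistic regression (alpha = 1 is the default)
fit <- glmnet(x, y, family = "binomial")

# Cross-validation to choose the penalty, then inspect the coefficients
cvfit <- cv.glmnet(x, y, family = "binomial")
coef(cvfit, s = "lambda.1se")
```

The s = "lambda.1se" choice selects the sparser of the two penalties
that cv.glmnet reports; s = "lambda.min" is the other common option.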
> 2. Is the way I'm scaling the variables before calling glmnet()
> correct?  Or should the squares themselves be centered and scaled?
> 3. Is my model matrix correct, or do I have a problem with the scale
> of the interaction variables?
glmnet centers and scales the variables itself by default
(standardize = TRUE) and reports the coefficients on the original
scale, so you do not need to do so yourself.
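
One way to convince yourself of this is to fit the same model to raw
and to pre-scaled predictors and compare the predictions.  The sketch
below uses simulated data with deliberately mismatched scales; the
names x_raw, fit_raw, fit_scaled, and max_diff, and the penalty values
in s, are arbitrary choices for illustration:

```r
library(glmnet)

set.seed(2)
n <- 200
# Two predictors on very different scales
x_raw <- cbind(a = rnorm(n, 10, 5), b = rnorm(n, 0, 0.1))
y <- rbinom(n, 1, plogis(0.2 * (x_raw[, "a"] - 10) + 5 * x_raw[, "b"]))

# standardize = TRUE is the default, so both fits are penalized identically
fit_raw    <- glmnet(x_raw, y, family = "binomial")
fit_scaled <- glmnet(scale(x_raw), y, family = "binomial")

# Compare linear predictors at a few fixed penalty values
s <- c(0.1, 0.05, 0.01)
p_raw    <- predict(fit_raw, newx = x_raw, s = s, type = "link")
p_scaled <- predict(fit_scaled, newx = scale(x_raw), s = s, type = "link")
max_diff <- max(abs(p_raw - p_scaled))
max_diff
```

The two sets of predictions should agree up to numerical tolerance;
only the reported coefficients differ, because each fit reports them on
the scale of the matrix it was given.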
> 4. Is it a problem that the lasso fit gives non-zero coefficients for
> interactions whose underlying terms have zero coefficients?
This is going to occur with any automated model selection procedure 
unless specifically disallowed.
> 5. Is there any way to choose a simple explanatory model, smaller
> than the best predictive model supported by the data, that is less
> arbitrary / subjective?
You have 5 variables.  Variable selection is not your goal.  What you 
are trying to do is fit a curve (as opposed to a line) through your 
data, possibly along with interactions.  I would suggest looking into 
splines, as provided, for example, by the mgcv package.
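
A minimal sketch of that approach, assuming a binary outcome and two
continuous predictors (the data frame d, its columns x1, x2, and y, and
the simulated effects are all hypothetical):

```r
library(mgcv)

set.seed(3)
n <- 500
d <- data.frame(x1 = runif(n), x2 = runif(n))
# Simulated nonlinear effects on the logit scale
eta <- sin(2 * pi * d$x1) + 2 * (d$x2 - 0.5)^2
d$y <- rbinom(n, 1, plogis(eta))

# Smooth main effects plus a tensor-product interaction term;
# the amount of smoothing is estimated from the data by REML
fit <- gam(y ~ s(x1) + s(x2) + ti(x1, x2),
           family = binomial, data = d, method = "REML")
summary(fit)
```

The ti() term carries only the interaction, with the main effects
excluded, so summary(fit) indicates whether the interaction
contributes anything beyond s(x1) and s(x2).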
-- 
Patrick Breheny
Assistant Professor
Department of Biostatistics
Department of Statistics
University of Kentucky
    
    