[R] model selection with step function

P Ehlers ehlers at math.ucalgary.ca
Fri Nov 25 15:43:47 CET 2005


Antonio,

Antonio Olinto wrote:
> Dear Ehlers, thanks for your message.
> 
> Following the example on stepAIC and Venables & Ripley’s book, it seems that
> update rearranges the terms. I didn’t understand how to indicate the formula in
> the function.
> 
> I have the initial model U ~ var1+var2+var3+var4 (family Gaussian). I want first
> to select the terms, putting main effects first. If I just write step(model) it
> will take out non-significant variables (let’s say var2) and will give U ~
> var1+var3+var4. Supposing that var4 has the “strongest” effect upon U, followed
> by var1 and var3, I would like to have the output U ~ var4+var1+var3.
> 
> Is it possible to do so?

Isn't that what stepAIC gives (with trace=TRUE) for its last model?
It orders predictors in terms of their effect on the AIC.
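
A minimal sketch of this, assuming forward selection from the null model
and simulated data (the data frame d, its coefficients and the seed are
made up for illustration; they are not Antonio's data):

```r
library(MASS)  # provides stepAIC()

set.seed(1)
n <- 200
# Hypothetical data: var4 has the strongest effect, then var1, then var3;
# var2 has no effect on U at all.
d <- data.frame(var1 = rnorm(n), var2 = rnorm(n),
                var3 = rnorm(n), var4 = rnorm(n))
d$U <- 3 * d$var4 + 1.5 * d$var1 + 0.7 * d$var3 + rnorm(n)

# Start from the intercept-only model and let stepAIC add terms one at a time.
null_fit <- lm(U ~ 1, data = d)
fwd <- stepAIC(null_fit,
               scope = list(lower = ~ 1,
                            upper = ~ var1 + var2 + var3 + var4),
               direction = "forward", trace = TRUE)

# Each accepted term is appended to the formula, so the final model
# lists the terms in the order in which they entered.
formula(fwd)
attr(terms(fwd), "term.labels")
anova(fwd)  # sequential table, in the same term order
```

With trace = TRUE each step prints the candidate terms sorted by AIC, so
the order of entry can be read off directly. Running step() or stepAIC()
backward from the full model only drops terms and leaves the survivors in
their original order.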

But why are you using glm() with the Gaussian family? See the comment
in MASS 4, page 190.
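
Whatever the exact wording of that comment, one fact is easy to check: a
Gaussian GLM with the default identity link gives the same fit as lm().
A small sketch with made-up data (d and its coefficients are assumptions
for illustration only):

```r
set.seed(2)
n <- 100
# Hypothetical data with two predictors.
d <- data.frame(var1 = rnorm(n), var4 = rnorm(n))
d$U <- 2 * d$var4 + d$var1 + rnorm(n)

fit_lm  <- lm(U ~ var4 + var1, data = d)
fit_glm <- glm(U ~ var4 + var1, family = gaussian, data = d)

# Same coefficients, and the GLM deviance is the residual sum of squares.
all.equal(coef(fit_lm), coef(fit_glm))
all.equal(deviance(fit_lm), deviance(fit_glm))
```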

> 
> Thanks again.
> 
> Antonio Olinto
> Biologist
> Sao Paulo Fisheries Institute
> 
> PS. Unfortunately I don’t have Regression Modeling Strategies around here

That's a shame. Here's a quote from Frank's book:

"Stepwise variable selection has been a very popular
technique for many years, but if this procedure had just
been proposed as a statistical method, it would most likely
be rejected because it violates every principle of
statistical estimation and hypothesis testing."

I would use the technique only in an exploratory setting, i.e. one that
might help me to refine further experimentation.

Peter

> 
> 
> Quoting P Ehlers <ehlers at math.ucalgary.ca>:
> 
> 
>>Antonio Olinto wrote:
>>
>>
>>>Hello,
>>>
>>>I have a question about using the step function (stepwise selection) to
>>>select glm models.
>>>
>>>Usually I apply the gamma distribution to analyze fishery data. To select the
>>>terms I use a routine where I first compare single-term models to the null
>>>model (e.g. U~1 vs. U~depth; U~1 vs. U~latitude; etc., where U = abundance)
>>>and, by means of the result given by a likelihood function applied to each
>>>comparison, I select the “strongest” effect, let’s say depth. Then I run a
>>>new step comparing U~depth vs. U~depth+latitude; U~depth vs. U~depth+... etc.
>>>In this way I put the terms in order of “magnitude”.
>>>
>>>I tried to fit a Gaussian model using the step(glm.model) function to select
>>>the terms, but I saw that in the output table given by anova(glm.model) the
>>>selected terms kept their original order.
>>>
>>>Is it possible to have the terms in the model rearranged, as in my example?
>>>
>>>Thanks for any help. I read Chambers and Hastie’s “Statistical Models in S”,
>>>Venables and Ripley’s “Modern Applied Statistics” and, of course, R help, but
>>>I couldn’t get the trick.
>>>
>>
>>I don't know if this will get you there, but
>>
>>1. I would use stepAIC in package MASS;
>>2. set argument trace = TRUE;
>>3. think very hard about the interpretation of the model;
>>4. read also Frank Harrell's "Regression Modeling Strategies".
>>
>>Peter
>>
>>
>>
>>>Antonio
>>>
>>>--
>>>Biologist
>>>Sao Paulo Fisheries Institute
>>>
>>>
>>>
>>>-------------------------------------------------
>>>WebMail Bignet - O seu provedor do litoral
>>>www.bignet.com.br
>>>
>>>______________________________________________
>>>R-help at stat.math.ethz.ch mailing list
>>>https://stat.ethz.ch/mailman/listinfo/r-help
>>>PLEASE do read the posting guide! http://www.R-project.org/posting-guide.html
>>
>>-- 
>>Peter Ehlers
>>Department of Mathematics and Statistics
>>University of Calgary, 2500 University Dr. NW       ph: 403-220-3936
>>Calgary, Alberta  T2N 1N4, CANADA                  fax: 403-282-5150
>>
>>
>>