[R] simplifying a GLM-removing categorical variables
Ben Bolker
bolker at ufl.edu
Tue Mar 4 15:21:41 CET 2008
mariannej <marianne.james <at> abdn.ac.uk> writes:
> I have created a GLM (using the quasipoisson family) and am now trying to
> simplify it. One of my explanatory variables is categorical (vegetation
> type, with 6 different levels). In the model, 5 of the 6 levels are
> significant and one is not.
>
> How should I simplify my model? Do I need to take out the whole category
> (i.e. all of vegetation type), or just the level that is not significant
> (but how would I explain this biologically?)
>
> Please spell out any answers simply, I am new to R,
>
This is really a statistical rather than an R question,
but the short answer is: you probably shouldn't try to
remove the "non-significant" level. The details depend on
your model, but with R's default (treatment) contrasts the
"significance" of the parameters, which I assume you're
gleaning from summary(), refers to the difference of each
level from the baseline (first) level. If most of the other
levels are significantly different from the baseline, then
the factor as a whole belongs in the model.
(You could _conceivably_ try to lump the "non-significant"
level together with the baseline level, but this really
goes in the direction of data-dredging.)
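
As a rough illustration of the point about baselines, here is a
minimal sketch on simulated data. The data frame and every name in
it (dat, veg, count, fit_full, fit_null) are made up for the example
and are not from the original question. It assumes a single factor
with 6 levels and the default treatment contrasts. The idea is to
test the factor as a whole rather than level by level; an F test is
the usual choice for quasi- families.

```r
# Simulated, hypothetical data: 6 vegetation types, overdispersed counts
set.seed(1)
veg <- factor(rep(paste0("type", 1:6), each = 20))
mu <- c(5, 5.5, 9, 12, 15, 20)[as.integer(veg)]  # type2 is close to baseline
count <- rnbinom(length(veg), mu = mu, size = 3)
dat <- data.frame(veg, count)

fit_full <- glm(count ~ veg, family = quasipoisson, data = dat)

# Each "veg..." row compares that level with the baseline (type1),
# so a large p-value only says "not clearly different from type1"
coef(summary(fit_full))

# Test the factor as a whole by comparing against the model without it
fit_null <- update(fit_full, . ~ . - veg)
anova(fit_null, fit_full, test = "F")

# The same comparison as a single-term deletion
drop1(fit_full, test = "F")
```

If the whole-factor test supports keeping veg, all of its levels stay
in the model, including the one that does not differ from the baseline.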
I would strongly recommend that you consult a good
general text on generalized linear models for strategies
of model simplification and interpretation -- to repeat,
this is really a statistical question and not an
R-specific one ...
good luck,
Ben Bolker