[R] Logistic regression model selection with overdispersed/autocorrelated data
Jesse.Whittington@pc.gc.ca
Jesse.Whittington at pc.gc.ca
Wed Feb 8 16:49:05 CET 2006
Thanks for pointing out the aod package and the beta-binomial logistic
models, Renaud.
While I see how betabinom could be applied to some of our other analyses,
I don't see how it can be used in our habitat selection analysis where
individual locations are coded as 0 or 1 rather than proportions. GEE
models (geeglm from geepack) could be used for our analyses. Even though
these models are fit using maximum likelihood estimation, they do not solve
our model selection problem.
Beta-coefficients from gee, glm, glmm's, and lrm are nearly identical. The
only thing that varies is the variance-covariance matrix and the resulting
standard errors. Consequently, the deviances should be similar because
predicted values (p) are calculated from the beta-coefficients. For an
individual data point, the loglikelihood = y * log(p) + (1 - y) * log(1-p)
and the deviance = -2 * sum(loglikelihoods). Consequently, the difference
in deviance between two models is amplified by autocorrelated data and
causes models to be overparameterized when using AIC or likelihood ratio
tests.
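
To make the arithmetic above concrete, here is a minimal sketch in base R. The data are simulated and every name (x, y, bern_deviance, k) is made up for illustration. Repeating each observation k times is only a crude stand-in for autocorrelation, not a model of real GPS data, but it shows the deviance difference growing while the coefficients stay the same.

```r
set.seed(1)
n <- 200
x <- rnorm(n)
y <- rbinom(n, 1, plogis(-0.5 + 0.8 * x))

fit0 <- glm(y ~ 1, family = binomial)
fit1 <- glm(y ~ x, family = binomial)

# Deviance built from the per-observation log-likelihoods
bern_deviance <- function(y, p) -2 * sum(y * log(p) + (1 - y) * log(1 - p))

dev1 <- bern_deviance(y, fitted(fit1))
all.equal(dev1, deviance(fit1))  # matches glm's deviance for 0/1 data

# Crude stand-in for autocorrelation: repeat every observation k times
k <- 4
d_rep <- data.frame(y = rep(y, k), x = rep(x, k))
fit0r <- glm(y ~ 1, family = binomial, data = d_rep)
fit1r <- glm(y ~ x, family = binomial, data = d_rep)

delta     <- deviance(fit0)  - deviance(fit1)
delta_rep <- deviance(fit0r) - deviance(fit1r)
delta_rep / delta  # approximately k, although no new information was added
```

The AIC penalty of 2 per parameter does not change, so the same covariate looks k times more convincing in the repeated data.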
I am curious how others select models with autocorrelated data.
Thanks for your help,
Jesse
Renaud Lancelot <renaud.lancelot@gmail.com>
To: "Jesse.Whittington at pc.gc.ca" <Jesse.Whittington at pc.gc.ca>
cc: r-help at stat.math.ethz.ch
Subject: Re: [R] Logistic regression model selection with overdispersed/autocorrelated data
31/01/2006 01:02
If you're not interested in fitting caribou-specific responses, you
can use beta-binomial logistic models. There are several packages
available for this purpose on CRAN, among which aod. Because these
models are fitted using maximum-likelihood methods, you can use AIC
(or other information criteria) to compare different models.
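
As a rough illustration of the idea, the sketch below fits a beta-binomial logistic model by maximum likelihood in base R and compares two models by AIC. It does not use aod (so it has no package dependencies) and is not that package's implementation. The data are simulated, and all names (trials, bb_negloglik, fit_bb, the mean/rho parameterization) are assumptions made for this example.

```r
set.seed(42)
n_groups <- 30                      # e.g. one row per animal
trials   <- rep(20, n_groups)       # locations per animal
x        <- rnorm(n_groups)
mu_true  <- plogis(-0.3 + 0.7 * x)
rho_true <- 0.2                     # intra-group correlation
shape1   <- mu_true * (1 - rho_true) / rho_true
shape2   <- (1 - mu_true) * (1 - rho_true) / rho_true
y <- rbinom(n_groups, trials, rbeta(n_groups, shape1, shape2))

# Negative log-likelihood: logit link for the mean, logit scale for rho
bb_negloglik <- function(par, X, y, trials) {
  k   <- ncol(X)
  mu  <- plogis(drop(X %*% par[seq_len(k)]))
  rho <- plogis(par[k + 1])
  a   <- mu * (1 - rho) / rho
  b   <- (1 - mu) * (1 - rho) / rho
  -sum(lchoose(trials, y) + lbeta(y + a, trials - y + b) - lbeta(a, b))
}

fit_bb <- function(X, y, trials) {
  start <- rep(0, ncol(X) + 1)
  opt <- optim(start, bb_negloglik, X = X, y = y, trials = trials,
               method = "BFGS")
  list(par = opt$par, logLik = -opt$value,
       AIC = 2 * opt$value + 2 * length(start))
}

bb0 <- fit_bb(matrix(1, n_groups, 1), y, trials)  # intercept only
bb1 <- fit_bb(cbind(1, x), y, trials)             # intercept + x
c(null = bb0$AIC, with_x = bb1$AIC)               # lower AIC is preferred
```

Note that the response here is a count of successes out of a number of trials per group, which is what the beta-binomial needs.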
Best,
Renaud
2006/1/30, Jesse.Whittington at pc.gc.ca <Jesse.Whittington at pc.gc.ca>:
>
>
> I am creating habitat selection models for caribou and other species with
> data collected from GPS collars. In my current situation the radio-collars
> recorded the locations of 30 caribou every 6 hours. I am then comparing
> resources used at caribou locations to random locations using logistic
> regression (standard habitat analysis).
>
> The data are therefore highly autocorrelated and this causes Type I error
> two ways: small standard errors around beta-coefficients and
> over-parameterization during model selection. Robust standard errors are
> easily calculated by block-bootstrapping the data using "animal" as a
> cluster with the Design library, however I haven't found a satisfactory
> solution for model selection.
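
The cluster (block) bootstrap mentioned above can be sketched in base R as follows. This is not the Design library's routine. It is a hand-rolled version on simulated data, and the names (animal, n_loc, boot_coef) and the choice of 200 resamples are assumptions for illustration only.

```r
set.seed(7)
n_animals <- 30
n_loc     <- 40                               # locations per animal
animal <- rep(seq_len(n_animals), each = n_loc)
u <- rnorm(n_animals)                         # animal-level effect, induces correlation
x <- rnorm(n_animals * n_loc)
y <- rbinom(length(x), 1, plogis(-0.5 + 0.6 * x + u[animal]))
d <- data.frame(animal, x, y)

fit <- glm(y ~ x, family = binomial, data = d)

# Resample whole animals with replacement and refit each time
boot_coef <- replicate(200, {
  ids <- sample(unique(d$animal), replace = TRUE)
  d_b <- do.call(rbind, lapply(ids, function(i) d[d$animal == i, ]))
  coef(glm(y ~ x, family = binomial, data = d_b))
})

se_naive <- sqrt(diag(vcov(fit)))             # assumes independent locations
se_boot  <- apply(boot_coef, 1, sd)           # respects the animal clusters
rbind(naive = se_naive, cluster_boot = se_boot)
```

Resampling animals rather than individual locations keeps each animal's correlated locations together, which is the point of using "animal" as the cluster.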
>
> A couple options are:
> 1. Using QAIC where the deviance is divided by a variance inflation factor
> (Burnham & Anderson). However, this VIF can vary greatly depending on the
> data set and the set of covariates used in the global model.
> 2. Manual forward stepwise regression using both changes in deviance and
> robust p-values for the beta-coefficients.
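
Option 1 can be sketched as below, taking QAIC = -2 logLik / c-hat + 2K with c-hat estimated as the Pearson chi-square of the global model divided by its residual degrees of freedom. The data are simulated and grouped (successes out of trials per animal), because the Pearson statistic is not a useful overdispersion estimate for ungrouped 0/1 responses. The names (used, unused, qaic, c_hat) and the floor of 1 on c-hat are assumptions of this example.

```r
set.seed(3)
n_groups <- 30
trials   <- rep(20, n_groups)
x1 <- rnorm(n_groups)
x2 <- rnorm(n_groups)
# Extra group-level noise makes the counts overdispersed relative to binomial
p    <- plogis(-0.3 + 0.7 * x1 + rnorm(n_groups, sd = 0.8))
used <- rbinom(n_groups, trials, p)
d <- data.frame(used, unused = trials - used, x1, x2)

global  <- glm(cbind(used, unused) ~ x1 + x2, family = binomial, data = d)
reduced <- glm(cbind(used, unused) ~ x1,      family = binomial, data = d)

# Variance inflation factor from the global model, not allowed below 1
c_hat <- max(1, sum(residuals(global, type = "pearson")^2) / df.residual(global))

qaic <- function(fit, c_hat) {
  k <- length(coef(fit)) + 1          # +1 for estimating c_hat
  -2 * as.numeric(logLik(fit)) / c_hat + 2 * k
}

c(global = qaic(global, c_hat), reduced = qaic(reduced, c_hat))
```

Because c_hat comes from the global model, changing the covariates in that model changes every QAIC value, which is the sensitivity described in option 1.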
>
> I have been looking for a solution to this problem for a couple years and
> would appreciate any advice.
>
> Jesse
>
--
Renaud LANCELOT
Département Elevage et Médecine Vétérinaire (EMVT) du CIRAD
Directeur adjoint chargé des affaires scientifiques
CIRAD, Animal Production and Veterinary Medicine Department
Deputy director for scientific affairs
Campus international de Baillarguet
TA 30 / B (Bât. B, Bur. 214)
34398 Montpellier Cedex 5 - France
Tél +33 (0)4 67 59 37 17
Secr. +33 (0)4 67 59 39 04
Fax +33 (0)4 67 59 37 95