[R] tweedie and lmer

Ben Bolker bolker at ufl.edu
Thu Aug 27 15:07:38 CEST 2009

kbs wrote:
> This is the link that gave me the indication:
> https://stat.ethz.ch/pipermail/r-help/2007-March/127261.html
> Are there alternative ways to deal with a high count of zeros for  
> count data with lmer?

Fair enough.  I think the problem is that lme4 has changed quite
a lot in two years -- the hard-coding I refer to may not have been
true two years ago.

However, in looking at your question more carefully, I don't think
you need Tweedie distributions anyway.  Tweedie distributions are
most useful for *continuous* data with a positive mass at zero, 
not for zero-inflated count data.  For zero-inflated count data, I
would try the following:

(1)  Try fitting a Poisson GLMM and see whether low means and random effects
together account for the zeros you see (Warton 2005).

(2) Use negative binomial or zero-inflated distributions.  This is not
possible with glmer, but you can try glmmADMB or MCMCglmm instead.  Or
use WinBUGS, but then you'll have to code your own model in the WinBUGS
language.
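
To illustrate the point in (1), here is a minimal simulation sketch, not a
fitted model.  It assumes a mean of 1 and gamma-distributed group-level
multipliers with shape 0.5 (both values, and all function names, are made up
for illustration).  It compares the observed fraction of zeros with what a
plain Poisson with the same mean would predict, and with the zero probability
of the corresponding negative binomial (a Poisson with gamma-distributed
means is marginally negative binomial).

```python
import math
import random


def poisson_zero_prob(mu):
    """P(Y = 0) for a Poisson distribution with mean mu."""
    return math.exp(-mu)


def negbin_zero_prob(mu, k):
    """P(Y = 0) for a negative binomial with mean mu and size parameter k."""
    return (k / (k + mu)) ** k


def rpois(mu, rng):
    """Draw one Poisson variate (Knuth's multiplication method; fine for small mu)."""
    limit = math.exp(-mu)
    count, prod = 0, rng.random()
    while prod > limit:
        count += 1
        prod *= rng.random()
    return count


def simulate_counts(n_groups, n_per_group, mu, k, rng):
    """Poisson counts whose group means vary around mu (gamma multiplier, shape k)."""
    counts = []
    for _ in range(n_groups):
        # The multiplier has mean 1, so the overall mean stays at mu.
        group_mean = mu * rng.gammavariate(k, 1.0 / k)
        counts.extend(rpois(group_mean, rng) for _ in range(n_per_group))
    return counts


rng = random.Random(1)  # fixed seed so the run is repeatable
mu, k = 1.0, 0.5        # illustrative values, not taken from any real data
counts = simulate_counts(n_groups=2000, n_per_group=5, mu=mu, k=k, rng=rng)
observed_zero = sum(c == 0 for c in counts) / len(counts)

print(f"observed zero fraction:       {observed_zero:.3f}")
print(f"plain Poisson would predict:  {poisson_zero_prob(mu):.3f}")
print(f"gamma-mixed Poisson predicts: {negbin_zero_prob(mu, k):.3f}")
```

With these values the plain Poisson zero probability is exp(-1), about 0.37,
while the mixed model gives (1/3)^0.5, about 0.58.  So a large surplus of
zeros relative to a simple Poisson can come from variation in the means
alone, with no zero-inflation component.  With real data the analogous check
would compare the observed zero fraction with the one implied by the fitted
GLMM.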

	title = {Many zeros does not mean zero inflation: comparing the
goodness-of-fit of parametric models to multivariate abundance data},
	volume = {16},
	shorttitle = {Many zeros does not mean zero inflation},
	url = {http://dx.doi.org/10.1002/env.702},
	doi = {10.1002/env.702},
	abstract = {An important step in studying the ecology of a species is
choosing a statistical model of abundance; however, there has been little
general consideration of which statistical model to use. In particular,
abundance data have many zeros (often 50-80 per cent of all values), and
zero-inflated count distributions are often used to specifically model the
high frequency of zeros in abundance data. However, in such cases it is
often taken for granted that a zero-inflated model is required, and the
goodness-of-fit to count distributions with and without zero inflation is
not often compared for abundance data. In this article, the goodness-of-fit
was compared for several marginal models of abundance in 20 multivariate
datasets (a total of 1672 variables across all datasets) from different
sources. Multivariate abundance data are quite commonly collected in applied
ecology, and the properties of these data may differ from abundances
collected in autecological studies. Goodness-of-fit was assessed using {AIC}
values, graphs of observed vs expected proportion of zeros in a dataset, and
graphs of the sample mean-variance relationship. The negative binomial
model was the best fitting of the count distributions, without
zero-inflation. The high frequency of zeros was well described by the
systematic component of the model (i.e. at some places predicted abundance
was high, while at others it was zero) and so it was rarely necessary to
modify the random component of the model (i.e. fitting a zero-inflated
distribution). A Gaussian model based on transformed abundances fitted data
surprisingly well, and rescaled per cent cover was usually poorly fitted by
a count distribution. In conclusion, results suggest that the high frequency
of zeros commonly seen in multivariate abundance data is best considered to
come from distributions where mean abundance is often very low (hence there
are many zeros), as opposed to claiming that there are an unusually high
number of zeros compared to common parametric distributions. Copyright ©
2005 John Wiley \& Sons, Ltd.},
	number = {3},
	journal = {Environmetrics},
	author = {David I. Warton},
	year = {2005},
	pages = {275--289}