[R] glm and percentage data with many zero values
Tony Plate
tplate at acm.org
Tue Mar 8 23:18:16 CET 2005
A very quick and easy thing to do with count data is to add 1 (or 0.5) to
all your counts (I'm sure you can work backwards from abundance data to
counts and then forward again). This gets rid of zero problems. In some
cases this approximates a Bayesian approach with a low-information prior
(though I'm not at all sure whether this is the case with a glm with
Poisson errors).
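
A rough sketch of what I mean, in R. The total count per sample is not given in the post, so `total` below is a placeholder I made up, and the names `counts`, `adj_counts` and `logit_adj` are mine too. The intercept-only quasi-Poisson fit is there only to show that the adjusted (non-integer) counts can be passed to glm(); it is not a recommendation of a final model.

```r
# Percentages as given in the original post.
Bacteria <- c(2.23, 0, 0.03, 0.71, 2.34, 0, 0.2, 0.2, 0.02, 2.07, 0.85,
              0.12, 0, 0.59, 0.02, 2.3, 0.29, 0.39, 1.32, 0.07, 0.52, 1.2,
              0, 0.85, 1.09, 0, 0.5, 1.4, 0.08, 0.11, 0.05, 0.17, 0.31, 0,
              0.12, 0, 0.99, 1.11, 1.78, 0, 0, 0, 2.33, 0.07, 0.66, 1.03,
              0.15, 0.15, 0.59, 0, 0.03, 0.16, 2.86, 0.2, 1.66, 0.12, 0.09,
              0.01, 0, 0.82, 0.31, 0.2, 0.48, 0.15)

# ASSUMPTION: every sample is a count out of `total` cells. The real totals
# are not given in the post; 10000 is only a placeholder.
total  <- 10000
counts <- round(Bacteria / 100 * total)

# Add 0.5 to every count so that no value is exactly zero.
adj_counts <- counts + 0.5

# Forward again: adjusted proportions and their logits (all finite now).
# Dividing by total + 1 is one common choice that keeps proportions below 1.
adj_prop  <- adj_counts / (total + 1)
logit_adj <- qlogis(adj_prop)

# Intercept-only quasi-Poisson fit; replace `~ 1` with the real predictors.
# quasipoisson accepts the non-integer adjusted counts.
fit <- glm(adj_counts ~ 1, family = quasipoisson)
summary(fit)$dispersion  # values far above 1 point to overdispersion
```

With the real totals in place of `total`, the same steps apply per sample; `total` can then be a vector of the same length as `Bacteria`.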
-- Tony Plate
At Wednesday 08:02 AM 4/20/2005, Christian Kamenik wrote:
>Dear all,
>
>I am interested in correctly testing effects of continuous environmental
>variables and ordered factors on bacterial abundance. Bacterial abundance
>is derived from counts and expressed as percentage. My problem is that the
>abundance data contain many zero values:
>Bacteria <-
>c(2.23,0,0.03,0.71,2.34,0,0.2,0.2,0.02,2.07,0.85,0.12,0,0.59,0.02,2.3,0.29,0.39,1.32,0.07,0.52,1.2,0,0.85,1.09,0,0.5,1.4,0.08,0.11,0.05,0.17,0.31,0,0.12,0,0.99,1.11,1.78,0,0,0,2.33,0.07,0.66,1.03,0.15,0.15,0.59,0,0.03,0.16,2.86,0.2,1.66,0.12,0.09,0.01,0,0.82,0.31,0.2,0.48,0.15)
>
>First I tried transforming the data (e.g., logit) but because of the zeros
>I was not satisfied. Next I converted the percentages into integer values
>by round(Bacteria*10) or ceiling(Bacteria*10) and calculated a glm with a
>Poisson error structure; however, I am not very happy with this approach
>because it changes the original percentage data substantially (e.g., 0.03
>becomes either 0 or 1). The same is true for converting the percentages
>into factors and calculating a multinomial or proportional-odds model
>(anyway, I do not know if this would be a meaningful approach).
>I was searching the web and the best answer I could get was
>http://www.biostat.wustl.edu/archives/html/s-news/1998-12/msg00010.html in
>which several persons suggested quasi-likelihood. Would it be reasonable
>to use a glm with quasipoisson? If yes, how can I find the appropriate
>variance function? Any other suggestions?
>
>Many thanks in advance, Christian
>
>
>================================
>
>
>Christian Kamenik
>Institute of Plant Sciences
>University of Bern
>Altenbergrain 21
>3013 Bern
>Switzerland
>