[R] glm and percentage data with many zero values

Tony Plate tplate at acm.org
Tue Mar 8 23:18:16 CET 2005


A very quick and easy thing to do with count data is to add 1 (or 0.5) to 
all your counts (I'm sure you can work backwards from abundance data to 
counts and then forward again).  This gets rid of zero problems.  In some 
cases this approximates a Bayesian approach with a low-information prior 
(though I'm not at all sure whether this is the case with a glm with 
Poisson errors).
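
As a rough sketch of that round trip: the counts and the per-sample total 
below are invented for illustration, since the original post gives only 
percentages. For a binomial proportion, (y + 0.5)/(n + 1) is the posterior 
mean under a Beta(0.5, 0.5) prior, which is one sense in which the 
adjustment resembles a low-information Bayesian estimate.

```r
# Hypothetical data: 'counts' and 'total' are made up for illustration;
# they are not taken from the original post.
counts <- c(0, 3, 0, 12, 1, 0, 7, 25, 2, 0)
total  <- rep(1000, length(counts))   # assumed number of cells counted per sample

# Add a constant to every count (0.5 here; 1 is the other common choice)
adj <- counts + 0.5

# Work forward again to percentage abundance; no value is exactly zero now
pct_adj <- 100 * adj / total

# Both transforms are finite for every sample after the adjustment
log_pct   <- log(pct_adj)
# Empirical logit: log((y + 0.5) / (n - y + 0.5))
logit_pct <- qlogis(adj / (total + 1))
```

The transformed values could then go into lm() or a similar model. The 
choice of constant is arbitrary, so it is worth checking that conclusions 
do not change much between 0.5 and 1.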

-- Tony Plate

At Wednesday 08:02 AM 4/20/2005, Christian Kamenik wrote:
>Dear all,
>
>I am interested in correctly testing effects of continuous environmental 
>variables and ordered factors on bacterial abundance. Bacterial abundance 
>is derived from counts and expressed as percentage. My problem is that the 
>abundance data contain many zero values:
>Bacteria <- 
>c(2.23,0,0.03,0.71,2.34,0,0.2,0.2,0.02,2.07,0.85,0.12,0,0.59,0.02,2.3,0.29,0.39,1.32,0.07,0.52,1.2,0,0.85,1.09,0,0.5,1.4,0.08,0.11,0.05,0.17,0.31,0,0.12,0,0.99,1.11,1.78,0,0,0,2.33,0.07,0.66,1.03,0.15,0.15,0.59,0,0.03,0.16,2.86,0.2,1.66,0.12,0.09,0.01,0,0.82,0.31,0.2,0.48,0.15)
>
>First I tried transforming the data (e.g., logit) but because of the zeros 
>I was not satisfied. Next I converted the percentages into integer values 
>by round(Bacteria*10) or ceiling(Bacteria*10) and calculated a glm with a 
>Poisson error structure; however, I am not very happy with this approach 
>because it changes the original percentage data substantially (e.g., 0.03 
>becomes either 0 or 1). The same is true for converting the percentages 
>into factors and calculating a multinomial or proportional-odds model 
>(anyway, I do not know if this would be a meaningful approach).
>I was searching the web and the best answer I could get was 
>http://www.biostat.wustl.edu/archives/html/s-news/1998-12/msg00010.html in 
>which several persons suggested quasi-likelihood. Would it be reasonable 
>to use a glm with quasipoisson? If yes, how can I find the appropriate 
>variance function? Any other suggestions?
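
A minimal sketch of what the quasi-likelihood fits might look like in R. 
The covariate 'env' is hypothetical (the post does not include the 
environmental variables), so the fitted coefficients mean nothing here. 
The point is only that quasipoisson and quasibinomial accept non-integer 
responses containing zeros, and that each family implies a different 
variance function.

```r
# Percentages from the post
Bacteria <- c(2.23,0,0.03,0.71,2.34,0,0.2,0.2,0.02,2.07,0.85,0.12,0,0.59,
              0.02,2.3,0.29,0.39,1.32,0.07,0.52,1.2,0,0.85,1.09,0,0.5,1.4,
              0.08,0.11,0.05,0.17,0.31,0,0.12,0,0.99,1.11,1.78,0,0,0,2.33,
              0.07,0.66,1.03,0.15,0.15,0.59,0,0.03,0.16,2.86,0.2,1.66,0.12,
              0.09,0.01,0,0.82,0.31,0.2,0.48,0.15)

# Hypothetical environmental gradient, invented for illustration only
env <- seq(0, 1, length.out = length(Bacteria))

# Variance proportional to the mean: Var(y) = phi * mu
fit_qp <- glm(Bacteria ~ env, family = quasipoisson(link = "log"))

# Variance proportional to mu * (1 - mu), on the proportion scale
fit_qb <- glm(I(Bacteria / 100) ~ env, family = quasibinomial(link = "logit"))

# Estimated dispersion parameters (phi) for each fit
phi_qp <- summary(fit_qp)$dispersion
phi_qb <- summary(fit_qb)$dispersion
```

One informal way to compare candidate variance functions is to look at 
Pearson residuals against fitted values, e.g. residuals(fit_qp, type = 
"pearson") versus fitted(fit_qp), and prefer the family for which the 
spread looks roughly constant.
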
>
>Many thanks in advance, Christian
>
>
>================================
>
>
>Christian Kamenik
>Institute of Plant Sciences
>University of Bern
>Altenbergrain 21
>3013 Bern
>Switzerland
>
