[R] Binomial glms with very small numbers

Patrick Connolly p.connolly at hortresearch.co.nz
Thu Jan 15 02:54:01 CET 2004

On Wed, 14-Jan-2004 at 05:15PM -0800, Spencer Graves wrote:

|>       The advisability of using "glm" with mortality depends not on
|> the size of sample groups but on the assumption of independence:
|> Whether you have 3 individuals per group or 30 or 1, is it

I think we can assume independence.  What concerned me more was that
there will be rather a lot of observed proportions of 0 and 1, which
correspond to -Inf and Inf on the transformed (link) scale.  Only half
the possible counts (namely, 1 and 2) would then be usable in the
fitting.  On second thoughts, since the response can be given as
binary, perhaps I was unnecessarily concerned.
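
A minimal sketch of why the 0s and 1s need not be a problem, assuming
made-up data: the dose covariate, the counts, and all object names
below are hypothetical and not taken from the real experiment.  glm()
applies the link to the fitted means rather than to the observed
proportions, so groups with 0 or 3 deaths out of 3 still contribute to
the likelihood.  The same model can be fitted from grouped counts or
from one 0/1 row per individual.

```r
# Hypothetical data: 8 groups of 3 individuals; 'dose' is an assumed covariate.
dose  <- 1:8
dead  <- c(0, 0, 1, 1, 2, 2, 3, 3)   # includes groups with 0/3 and 3/3 deaths
total <- rep(3, 8)

# Grouped form: two-column response of (deaths, survivors).
fit_grouped <- glm(cbind(dead, total - dead) ~ dose, family = binomial)

# Binary form: expand to one row per individual, response coded 0/1.
binary <- data.frame(
  dose = rep(dose, times = total),
  y    = unlist(lapply(dead, function(d) c(rep(1, d), rep(0, 3 - d))))
)
fit_binary <- glm(y ~ dose, family = binomial, data = binary)

# Both fits maximise the same likelihood, so the estimates should agree
# up to numerical tolerance (the reported deviances will differ).
print(coef(fit_grouped))
print(coef(fit_binary))
```

The thing to watch for is not the 0s and 1s themselves but complete
separation, where a covariate splits deaths from survivors perfectly;
in that case the estimates diverge and glm() warns about fitted
probabilities of 0 or 1.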

|> plausible to assume that all individuals represented in your
|> data.frame have independent chances of survival given the
|> potentially explanatory variables?  If the answer is "yes", then
|> "glm" is appropriate.  If the answer is "no", then some other tool
|> may be preferable.  However, "glm" is quick and easy in R, and I
|> might start with that, even if I felt the assumption of
|> independence was violated.  If I found nothing there, I would not
|> likely find anything with techniques that handled more
|> appropriately the violations of independence.

Thanks for that suggestion.

|>       Similarly, I can't see how it would matter whether potentially 
|> explanatory variables were continuous or categorical, as long as a 
|> categorical variable were appropriately coded as a factor (or 
|> "character", which is then treated as a factor) if it has more than 2 
|> levels. 

I didn't think it would make a difference but I included it in case
someone more knowledgeable had reasons why it did.
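
For what it's worth, the point about coding can be checked with a
small sketch.  The data frame and the treatment labels below are
invented for illustration.  A character column used in a model formula
is converted to a factor when the model frame is built, so it should
give the same fit as an explicit factor.

```r
# Hypothetical data: 3 treatments, 4 groups each, 3 individuals per group.
d <- data.frame(
  treatment = rep(c("control", "low", "high"), each = 4),
  dead      = c(0, 1, 0, 1,  1, 1, 2, 1,  2, 3, 2, 3),
  total     = 3,
  stringsAsFactors = FALSE          # keep 'treatment' as character
)

# Character predictor: converted to a factor internally by the formula machinery.
fit_chr <- glm(cbind(dead, total - dead) ~ treatment,
               family = binomial, data = d)

# Explicit factor with the default (alphabetical) level order.
d$treatment_f <- factor(d$treatment)
fit_fac <- glm(cbind(dead, total - dead) ~ treatment_f,
               family = binomial, data = d)

# One intercept plus one coefficient per non-reference level.
print(coef(fit_chr))
print(coef(fit_fac))
```

If a particular reference level is wanted (say, "control"), an
explicit factor() or relevel() call makes that choice visible rather
than leaving it to alphabetical order.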


Patrick Connolly
Mt Albert
New Zealand 
Ph: +64-9 815 4200 x 7188
I have the world's largest collection of seashells. I keep it on all
the beaches of the world ... Perhaps you've seen it.  ---Steven Wright 
