[R] package for repeated measures ANOVA with various link functions REDUX

Wed Mar 5 17:48:05 CET 2008

On Tue, Mar 4, 2008 at 9:48 PM, John Sorkin <jsorkin at grecc.umaryland.edu> wrote:
> Prof. Bates was correct to point out the lack of specifics in my original posting. I am looking for a package that will allow me to choose among link functions and account for repeated measures in a repeated measures ANOVA.
>
>  My question is what package should I use to facilitate estimating rates of illegal drug use at three centers, and the effect two interventions have on usage. At each center, data describing the rate of drug use were obtained once a month. For the first six months, there was no intervention at any of the three centers. For months seven through 13 intervention one was applied at each of the three centers. For months 14 through 24 intervention two was applied at each center. The question I am trying to answer is whether intervention one or two changed drug usage at any of the three centers. I am treating center as the subject of the repeated measures, i.e. the rate of drug use at center one at month one will be correlated with the rate of drug use at center one at months two, three, etc.

>  I have accounted for repeated measures several ways in the past.

>  (1) I have used SAS proc MIXED with a REPEATED statement. The REPEATED statement allows for the specification of the within-subject correlation of repeated measures by specifying the structure of the within-subject variance-covariance matrix of the repeated measures. The matrix is block diagonal with one block for each subject.

But does such a structure extend to models with binary or count
responses?  You have mentioned that you want to use an arbitrary link
function such as quasibinomial.  What I understand the effect of the
REPEATED statement to be is to specify a parameterized form of the
marginal variance-covariance matrix of the responses.  If the response
variable has a multivariate normal distribution it is possible to
independently specify the mean (determined by the fixed-effects
parameters) and the marginal variance-covariance.
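
For the Gaussian case, the block-diagonal structure that the REPEATED statement describes can be sketched as follows. This is only an illustration in plain Python, not anything SAS or R does internally; the function names and the values of `n_subjects`, `n_times`, `sigma2`, and `rho` are assumptions made up for the example.

```python
def compound_symmetry_block(n_times, sigma2, rho):
    """n_times x n_times block: sigma2 on the diagonal, rho * sigma2 elsewhere."""
    return [[sigma2 if i == j else rho * sigma2 for j in range(n_times)]
            for i in range(n_times)]

def block_diagonal(blocks):
    """Place square blocks along the diagonal; entries outside the blocks are zero."""
    size = sum(len(b) for b in blocks)
    out = [[0.0] * size for _ in range(size)]
    offset = 0
    for b in blocks:
        for i, row in enumerate(b):
            for j, value in enumerate(row):
                out[offset + i][offset + j] = value
        offset += len(b)
    return out

# Hypothetical values: 3 subjects (centers), 4 repeated measures each
n_subjects, n_times, sigma2, rho = 3, 4, 2.0, 0.5
V = block_diagonal([compound_symmetry_block(n_times, sigma2, rho)
                    for _ in range(n_subjects)])
for row in V:
    print(" ".join(f"{v:4.1f}" for v in row))
```

Nothing in this construction refers to the mean, which is why it can be specified separately from the fixed effects when the response is multivariate normal.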

However, in the case of generalized linear models the mean response is
determined by a linear predictor and a link function while the
variance-covariance of the response is determined by prior weights and
a variance function.  The same is true for generalized linear mixed
models except that this description applies to the conditional
distribution of the response given the random effects.  The link and
the variance functions must agree so, for example, using a logit or
probit link which restricts the value of mu to the interval [0,1]
would imply a variance function (up to prior weights) of mu(1-mu).  At
least I think so - others may feel that it is possible to specify an
arbitrary variance function but I don't see how that can make sense.
To me the whole point of generalized linear models is to transform the
linear predictor to the desired range for the mean and to take into
account what this implies about the variance.
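
A minimal sketch of that coupling, assuming a logit link and the binomial variance function mu(1-mu) with unit prior weights. The function names and the linear predictor values are hypothetical and chosen only for illustration.

```python
import math

def inv_logit(eta):
    """Inverse logit link: maps a linear predictor eta to a mean mu in (0, 1)."""
    return 1.0 / (1.0 + math.exp(-eta))

def binomial_variance(mu):
    """Binomial variance function (up to prior weights): V(mu) = mu * (1 - mu)."""
    return mu * (1.0 - mu)

# Hypothetical linear predictor values spanning both tails
etas = [-6.0, -2.0, 0.0, 2.0, 6.0]
table = [(eta, inv_logit(eta), binomial_variance(inv_logit(eta))) for eta in etas]
for eta, mu, v in table:
    print(f"eta={eta:+.1f}  mu={mu:.4f}  V(mu)={v:.4f}")
```

The variance is largest at mu = 0.5 and shrinks as the linear predictor pushes the mean toward either end of the range.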

Even if you feel that it is possible to relax the ties between the
link function and the variance function I don't see how it would be
possible to specify an arbitrary structure for the marginal
variance-covariance of the response.  If you say that the marginal
variance-covariance must have a block-wise compound symmetry structure
but you are going to restrict the mean to the range [0,1] I think you
have painted yourself into a corner.  I don't think it is possible to
specify a mean on a restricted range and separately specify an
arbitrary variance-covariance structure.  In particular, when the mean
is on the range [0,1] you had better have the variance going to zero
as the mean goes to 0 or to 1.  You can't arbitrarily say that the
variance within a block must be constant, regardless of the values of
the means in those blocks.
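
One way to make this concrete: any response confined to [0,1] with mean mu has variance at most mu(1-mu), because Y^2 <= Y on that range. The sketch below checks whether a single constant variance is even attainable for a block of means. The function names and the numbers are hypothetical, picked only to show the two cases.

```python
def max_variance_on_unit_interval(mu):
    """Largest variance possible for a response confined to [0, 1] with mean mu."""
    return mu * (1.0 - mu)

def constant_variance_is_feasible(means, sigma2):
    """True only if sigma2 does not exceed the bound at every mean in the block."""
    return all(sigma2 <= max_variance_on_unit_interval(mu) for mu in means)

# Hypothetical block means and a hypothetical constant within-block variance
sigma2 = 0.05
central_means = [0.4, 0.5, 0.6]
boundary_means = [0.5, 0.2, 0.02]

print(constant_variance_is_feasible(central_means, sigma2))   # True
print(constant_variance_is_feasible(boundary_means, sigma2))  # False
```

With a mean of 0.02 the variance cannot exceed 0.0196, so a constant within-block variance of 0.05 is ruled out as soon as one mean in the block sits near the boundary.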

>  (2) I have used SAS proc GENMOD, which uses GEE to adjust the parameter estimates and their standard errors for the fact that repeated measurements of a parameter are obtained from a given subject.
>
>  Is there any package in R that will allow me to perform a repeated measures ANOVA with a selection of link functions that will allow me to account for repeated measures by either specifying the correlation structure of the repeated measures from a subject a la SAS proc mixed or by adjusting the parameter estimates using GEE a la proc GENMOD? Perhaps R has a package that accounts for repeated measures in some other manner.
>
>  Thank you,
>  John Sorkin
>
>
>
>  John Sorkin M.D., Ph.D.
>  Chief, Biostatistics and Informatics
>  University of Maryland School of Medicine Division of Gerontology
>  Baltimore VA Medical Center
>  10 North Greene Street
>  GRECC (BT/18/GR)
>  Baltimore, MD 21201-1524
>  (Phone) 410-605-7119
>  (Fax) 410-605-7913 (Please call phone number above prior to faxing)
>
>  >>> "Douglas Bates" <bates at stat.wisc.edu> 3/4/2008 5:13 PM >>>
>  On Tue, Mar 4, 2008 at 10:52 AM, John Sorkin
>  <jsorkin at grecc.umaryland.edu> wrote:
>  > R 2.6.0
>  >  Windows XP
>
>  >  At the risk of raising the ire of the R gods . . .
>  >  I am looking for a package that will allow me to perform a Poisson, quasipoisson, or negative binomial regression with adjustment for repeated measures. I have looked at glm; it does not appear to allow repeated measures. Although I can't get any help for lme or lme4, I remember that those packages perform repeated measures using random effects, not repeated measures ANOVA, which is what I am looking for. (By the way, how can I get help for lme4? I have tried ?lme4, help.search("lme4") etc. to no avail.)
>  >  A suggestion for a package that will allow for repeated measures ANOVA in the context of various link functions would be appreciated.
>
>  I think you would need to be more specific about the model than just
>  saying "repeated measures ANOVA".  To me, "repeated measures"
>  describes a structure in the data.  There are many ways that one could
>  model the effects of the repeated measures; some might make sense in
>  the context of your data and some might not.  Without further details
>  about how you want to model the effect of the repeated measurements it
>  would be difficult to say if you could use the lmer function in the
>  lme4 package to do so.
>
>  The purpose of the S language and the R implementation of that
>  language is to facilitate exploration of data, including the fitting
>  of models that may be appropriate - always keeping in mind George
>  Box's famous statement that, "All models are wrong, but some models
>  are useful".  The "one size fits all" approach to data analysis - also
>  known as "give me a quart and a half of statistics and just make sure
>  that there is a p-value less than 5% somewhere in there" - doesn't fit
>  well into the R system.