[R] how to write crossed and nested random effects in a model

Wed Mar 14 00:47:54 CET 2012

Niroshan <wnnperer <at> ucalgary.ca> writes:

> I have a question based on my research. I am analyzing a reader-based
> diagnostic data set.  My study involves diabetic patients who were evaluated
> for treatable diabetic retinopathy based on the presence or absence of two
> pathologies in their eyes.  Pathologies were identified using the clinical
> examination (gold standard method). In addition, they can be identified by
> taking digital images of patients’ eyes, and this method is cost effective.
> Finally, two readers go over the images independently and patients are
> diagnosed as either positive or negative for the pathologies.
> My objective is to estimate the sensitivity and specificity of the
> reader-based diagnostic method.
> 
> I am going to fit a multivariate probit model. But the problem has a complex
> correlation structure. We have three different correlations: readers’ results
> are correlated, patients’ left and right eyes are correlated, and pathologies
> are correlated since all are based on the retina in the eye.
> 
> Could anyone help me work out how to address these correlations in a model
> using random effects?
> 
> Also I think patients and readers are crossed with each other since each
> reader goes over each patient’s images. And [snip] eyes are nested within
> patients and pathologies are nested within the eye.  Is this crossed and
> nested interpretation correct?  If so, how can I include these effects as
> random terms in the model?
> 
> My response is the readers’ diagnosed values. Per patient I have 8 values (2
> pathologies, left and right eye, and 2 readers).
> Explanatory variables are the actual disease status of each pathology for the
> left and right eyes.
> 

   I think that *in principle* (if you are using lme4, which is
probably the most convenient option for dealing with crossed REs) you
probably want

 ~ pathology + (pathology|reader)+(pathology|patient/eye)

  The fixed effect term says that pathologies may vary in their
overall frequency.  The first RE term says that different readers can
vary in a pathology-specific way (if they just differed overall in
their sensitivity you would want (1|reader) instead); the second
expands to (pathology|patient) + (pathology|patient:eye), so it says
that there is variance among patients, and among eyes within patients,
in all pathologies (and that the pathology effects may be correlated).
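
  As a rough sketch of how that formula might be used, assuming a
long-format data frame with one row per patient x eye x pathology x
reader (8 rows per patient, as described in the question).  The data
frame `d`, the response name `diagnosis`, the factor levels, and the
simulated 0/1 values are all made up for illustration; only the
right-hand side of the formula comes from the text above.  The probit
link is used because the question mentions a probit model.

```r
## Hypothetical layout: one row per patient x eye x pathology x reader.
set.seed(1)
d <- expand.grid(
  reader    = factor(c("r1", "r2")),
  pathology = factor(c("A", "B")),
  eye       = factor(c("left", "right")),
  patient   = factor(seq_len(40))
)

## Simulated 0/1 diagnoses with a patient-level effect (illustration only,
## not real data).
u_patient   <- rnorm(nlevels(d$patient))
d$diagnosis <- rbinom(nrow(d), size = 1,
                      prob = pnorm(-0.5 + u_patient[as.integer(d$patient)]))

## Crossed: every reader appears with every patient (4 rows per pair).
crossed <- all(table(d$patient, d$reader) == 4)

## Nested: "left"/"right" only identify an eye together with the patient,
## so the number of distinct eyes is the number of patient:eye combinations.
n_eyes <- nlevels(interaction(d$patient, d$eye, drop = TRUE))

## The formula from the post, with the assumed response on the left.
f <- diagnosis ~ pathology + (pathology | reader) + (pathology | patient/eye)

## Fit only if lme4 is installed; toy data like this may well produce
## singular-fit or convergence warnings.
if (requireNamespace("lme4", quietly = TRUE)) {
  fit <- lme4::glmer(f, data = d, family = binomial(link = "probit"))
  print(summary(fit))
}
```

  With only two simulated readers the reader variance terms are not
expected to be estimated well here, which is the point of one of the
cautions in this message.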

  A few cautions about this:

* I'm not sure I got it right

* You might want to forward this (along with my answer, so we're not
starting from scratch) to r-sig-mixed-models at r-project.org, where
there is more expertise in mixed models.

* if you have the _same_ two readers for all of your patients (as
opposed to two different readers chosen at random out of a large,
possibly overlapping pool), then it isn't practical to treat them
as a random effect, no matter how much sense it makes philosophically
(two levels are far too few to estimate a between-reader variance)
-- use pathology*reader as a fixed effect instead.

* You may need a moderately large amount of data to fit this model ...
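
  For the case where the same two readers score every patient, a
sketch of the fixed-effect version might look like the following.  As
before, the data frame `d`, the response name `diagnosis`, the factor
levels, and the simulated values are assumptions for illustration
only; the substitution of pathology*reader for the reader RE term is
the one suggested in the cautions above.

```r
## Hypothetical layout: one row per patient x eye x pathology x reader.
set.seed(2)
d <- expand.grid(
  reader    = factor(c("r1", "r2")),
  pathology = factor(c("A", "B")),
  eye       = factor(c("left", "right")),
  patient   = factor(seq_len(40))
)

## Simulated 0/1 diagnoses (illustration only, not real data).
u_patient   <- rnorm(nlevels(d$patient))
d$diagnosis <- rbinom(nrow(d), size = 1,
                      prob = pnorm(-0.5 + u_patient[as.integer(d$patient)]))

## Readers enter as fixed effects, interacting with pathology;
## patients and eyes within patients stay as random effects.
f_fixed <- diagnosis ~ pathology * reader + (pathology | patient/eye)

## The fixed-effect part costs four coefficients: intercept, pathology,
## reader, and their interaction.
X <- model.matrix(~ pathology * reader, data = d)
print(colnames(X))

## Fit only if lme4 is installed; warnings are possible on toy data.
if (requireNamespace("lme4", quietly = TRUE)) {
  fit_fixed <- lme4::glmer(f_fixed, data = d,
                           family = binomial(link = "probit"))
  print(lme4::fixef(fit_fixed))
}
```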