[R] Decomposing tests of interaction terms in mixed-effects models
Peter Dalgaard
p.dalgaard at biostat.ku.dk
Mon Aug 4 14:51:48 CEST 2008
Andrew Robinson wrote:
> On Mon, Aug 04, 2008 at 10:17:38AM +0200, Peter Dalgaard wrote:
>
>> Andrew Robinson wrote:
>>
>>> Dear R colleagues,
>>>
>>> a friend and I are trying to develop a modest workflow for the problem
>>> of decomposing tests of higher-order terms into interpretable sets of
>>> tests of lower order terms with conditioning.
>>>
>>> For example, if the interaction between A (3 levels) and C (2 levels)
>>> is significant, it may be of interest to ask whether or not A is
>>> significant at level 1 of C and level 2 of C.
>>>
>>> The textbook response seems to be to subset the data and perform the
>>> tests on the two subsets, but using the RSS and DF from the original
>>> fit.
>>>
>>> We're trying to answer the question using new explanatory variables.
>>> This approach (seems to) work just fine in aov, but not lme.
>>>
>>> For example,
>>>
>>> ##########################################################################
>>>
>>> # Build the example dataset
>>>
>>> set.seed(101)
>>>
>>> Block <- gl(6, 6, 36, lab = paste("B", 1:6, sep = ""))
>>> A <- gl(3, 2, 36, lab = paste("A", 1:3, sep = ""))
>>> C <- gl(2, 1, 36, lab = paste("C", 1:2, sep = ""))
>>> example <- data.frame(Block, A, C)
>>> example$Y <- rnorm(36)*2 + as.numeric(example$A)*as.numeric(example$C)^2 +
>>> 3 * rep(rnorm(6), each=6)
>>>
>>> with(example, interaction.plot(A, C, Y))
>>>
>>> # lme
>>>
>>> require(nlme)
>>> anova(lme(Y~A*C, random = ~1|Block, data = example))
>>>
>>> summary(aov(Y ~ Error(Block) + A*C, data = example))
>>>
>>> # Significant 2-way interaction. Now we would like to test the effect
>>> # of A at C1 and the effect of A at C2. Construct two new variables
>>> # that decompose the interaction, so an LRT is possible.
>>>
>>> example$AC <- example$AC1 <- example$AC2 <- interaction(example$A,
>>> example$C)
>>> example$AC1[example$C == "C1"] <- "A1.C1" # A is constant at C1
>>> example$AC2[example$C == "C2"] <- "A1.C2" # A is constant at C2
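>>>
>>> # Quick illustrative check of the construction: AC1 is constant within
>>> # C1, so the distinctions that AC adds on top of AC1 are exactly the A
>>> # contrasts within C1 (and likewise AC on top of AC2 gives A within C2).
>>>
>>> with(example, table(AC1, C))
>>> with(example, table(AC2, C))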
>>>
>>> # lme fails (as does lmer)
>>>
>>> lme(Y ~ AC1 + AC, random = ~ 1 | Block, data = example)
>>>
>>> # but aov seems just fine.
>>>
>>> summary(aov(Y ~ Error(Block) + AC1 + AC, data = example))
>>>
>>> ## AC was not significant, so A is not significant at C1
>>>
>>> summary(aov(Y ~ Error(Block) + AC2 + AC, data = example))
>>>
>>> ## AC was significant, so A is significant at C2
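>>>
>>> # For comparison, the "textbook" approach mentioned at the top -- subset
>>> # the data but reuse the residual DF and mean square from the full fit --
>>> # might look roughly like this (a sketch only; Block is treated as fixed
>>> # here so that the full-model residual is the within-Block one):
>>>
>>> full <- aov(Y ~ Block + A*C, data = example)
>>> rss <- deviance(full)       # residual SS of the full fit
>>> rdf <- df.residual(full)    # residual DF of the full fit
>>> tabC2 <- anova(aov(Y ~ Block + A, data = subset(example, C == "C2")))
>>> Fval <- (tabC2["A", "Sum Sq"] / tabC2["A", "Df"]) / (rss / rdf)
>>> pf(Fval, tabC2["A", "Df"], rdf, lower.tail = FALSE)  # test of A at C2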
>>>
>>> ## Some experimentation suggests that what lme objects to is the
>>> ## 'partial confounding' approach itself, rather than the particular
>>> ## variables that we have constructed.
>>>
>>> lme(Y ~ AC, random = ~ 1 | Block, data = example)
>>> lme(Y ~ A + AC, random = ~ 1 | Block, data = example)
>>>
>>> ##########################################################################
>>>
>>> Are we doing anything obviously wrong? Is there another approach to
>>> the goal that we are trying to achieve?
>>>
>>>
>> This looks like a generic issue with lme/lmer, in that they are not
>> happy with rank deficiencies in the design matrix.
>>
>> Here's a "dirty" trick which you are not really supposed to know about
>> because it is hidden inside the "stats" namespace:
>>
>> M <- model.matrix(~AC1+AC, data=example)
>> M2 <- stats:::Thin.col(M)
>> summary(lme(Y ~ M2 - 1, random = ~ 1 | Block, data = example))
>>
>> (Thin.col(), Thin.row(), and Rank() are support functions for
>> anova.mlm() et al., but maybe we should document them and put them out
>> in the open.)
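>>
>> Just to see the rank deficiency that lme objects to, and what Thin.col()
>> removes (a check, not needed for the fit):
>>
>> qr(M)$rank            # true rank of the design matrix
>> c(ncol(M), ncol(M2))  # number of columns before and after thinning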
>>
>
> That is a neat idea, thanks, Peter, but it doesn't quite fit the bill.
> The summary provides t-tests but I am hoping to find F-tests,
> otherwise I'm not sure how to efficiently test A (3 levels) at the two
> levels of C.
>
> The anova.lme function doesn't help, sadly:
>
>
>> anova(lme(Y ~ M2 - 1, random = ~ 1 | Block, data = example))
>>
> numDF denDF F-value p-value
> M2 6 25 23.0198 <.0001
>
> so I'm still flummoxed!
>
> Andrew
>
You do have to peek into M2 to see that the test of interest is whether
the last two coefficients are zero, but how about
> M3 <- M2[,2:4]
> M4 <- M2[,5:6]
> anova(lme(Y ~ M3+M4, random = ~ 1 | Block, data = example))
numDF denDF F-value p-value
(Intercept) 1 25 10.66186 0.0032
M3 3 25 55.31464 <.0001
M4 2 25 1.27591 0.2967
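
To see which columns of M2 ended up in M3 and M4, peeking at the names
should do (assuming Thin.col() keeps the dimnames):

> colnames(M2)
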
Also, check out estimable() in the gmodels package.
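
For instance, something along these lines might give the two joint tests
directly from the interaction fit (an untested sketch: the contrast rows
assume the default treatment coding of Y ~ A*C, i.e. coefficients
(Intercept), AA2, AA3, CC2, AA2:CC2, AA3:CC2, and joint.test = TRUE gives
a Wald chi-square test rather than the F tests from anova.lme):

library(gmodels)
fit <- lme(Y ~ A*C, random = ~ 1 | Block, data = example)
cmC1 <- rbind("A2-A1 at C1" = c(0, 1, 0, 0, 0, 0),
              "A3-A1 at C1" = c(0, 0, 1, 0, 0, 0))
cmC2 <- rbind("A2-A1 at C2" = c(0, 1, 0, 0, 1, 0),
              "A3-A1 at C2" = c(0, 0, 1, 0, 0, 1))
estimable(fit, cmC1, joint.test = TRUE)  # joint Wald test of A within C1
estimable(fit, cmC2, joint.test = TRUE)  # joint Wald test of A within C2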
--
O__ ---- Peter Dalgaard Øster Farimagsgade 5, Entr.B
c/ /'_ --- Dept. of Biostatistics PO Box 2099, 1014 Cph. K
(*) \(*) -- University of Copenhagen Denmark Ph: (+45) 35327918
~~~~~~~~~~ - (p.dalgaard at biostat.ku.dk) FAX: (+45) 35327907