[R] Decomposing tests of interaction terms in mixed-effects models

Mon Aug 4 10:17:38 CEST 2008

Andrew Robinson wrote:
> Dear R colleagues,
>
> a friend and I are trying to develop a modest workflow for the problem
> of decomposing tests of higher-order terms into interpretable sets of
> tests of lower order terms with conditioning.
>
> For example, if the interaction between A (3 levels) and C (2 levels)
> is significant, it may be of interest to ask whether or not A is
> significant at level 1 of C and level 2 of C.
>
> The textbook response seems to be to subset the data and perform the
> tests on the two subsets, but using the RSS and DF from the original
> fit.  
>
> We're trying to answer the question using new explanatory variables.
> This approach (seems to) work just fine in aov, but not lme.  
>
> For example,
>
> ##########################################################################
>
> # Build the example dataset
>
> set.seed(101)
>
> Block <- gl(6, 6, 36, lab = paste("B", 1:6, sep = ""))
> A <- gl(3, 2, 36, lab = paste("A", 1:3, sep = ""))
> C <- gl(2, 1, 36, lab = paste("C", 1:2, sep = ""))
> example <- data.frame(Block, A, C) 
> example$Y <- rnorm(36)*2 + as.numeric(example$A)*as.numeric(example$C)^2 + 
>         3 * rep(rnorm(6), each=6)
>
> with(example, interaction.plot(A, C, Y)) 
>
> # lme 
>
> require(nlme) 
> anova(lme(Y~A*C, random = ~1|Block, data = example)) 
>
> summary(aov(Y ~ Error(Block) + A*C, data = example))
>
> # Significant 2-way interaction.  Now we would like to test the effect
> # of A at C1 and the effect of A at C2.  Construct two new variables
> # that decompose the interaction, so an LRT is possible.
>
> example$AC <- example$AC1 <- example$AC2 <- interaction(example$A, example$C) 
>
> example$AC1[example$C == "C1"] <- "A1.C1"  # A is constant at C1
> example$AC2[example$C == "C2"] <- "A1.C2"  # A is constant at C2
>
> # lme fails (as does lmer)
>
> lme(Y ~ AC1 + AC, random = ~ 1 | Block, data = example)
>
> # but aov seems just fine.
>
> summary(aov(Y ~ Error(Block) + AC1 + AC, data = example)) 
>
> ## AC was not significant, so A is not significant at C1
>
> summary(aov(Y ~ Error(Block) + AC2 + AC, data = example)) 
>
> ## AC was significant, so A is significant at C2
>
> ## Some experimentation suggests that lme doesn't like the 'partial
> ## confounding' approach that we are using, rather than the variables
> ## that we have constructed.
>
> lme(Y ~ AC, random = ~ 1 | Block, data = example)
> lme(Y ~ A + AC, random = ~ 1 | Block, data = example)
>
> ##########################################################################
>
> Are we doing anything obviously wrong?   Is there another approach to
> the goal that we are trying to achieve?
>   
This looks like a generic issue with lme/lmer, in that they are not 
happy with rank deficiencies in the design matrix.

Here's a "dirty" trick which you are not really supposed to know about 
because it is hidden inside the "stats" namespace:

M <- model.matrix(~AC1+AC, data=example)
M2 <- stats:::Thin.col(M)
summary(lme(Y ~ M2 - 1, random = ~ 1 | Block, data = example)

(Thin.col(), Thin.row(), and Rank() are support functions for 
anova.mlm() et al., but maybe we should document them and put them out 
in the open.)

-- 
   O__  ---- Peter Dalgaard             Øster Farimagsgade 5, Entr.B
  c/ /'_ --- Dept. of Biostatistics     PO Box 2099, 1014 Cph. K
 (*) \(*) -- University of Copenhagen   Denmark      Ph:  (+45) 35327918
~~~~~~~~~~ - (p.dalgaard at biostat.ku.dk)              FAX: (+45) 35327907