[R] Decomposing tests of interaction terms in mixed-effects models
Peter Dalgaard
p.dalgaard at biostat.ku.dk
Mon Aug 4 14:51:48 CEST 2008
Andrew Robinson wrote:
> On Mon, Aug 04, 2008 at 10:17:38AM +0200, Peter Dalgaard wrote:
>
>> Andrew Robinson wrote:
>>
>>> Dear R colleagues,
>>>
>>> a friend and I are trying to develop a modest workflow for the problem
>>> of decomposing tests of higher-order terms into interpretable sets of
>>> tests of lower order terms with conditioning.
>>>
>>> For example, if the interaction between A (3 levels) and C (2 levels)
>>> is significant, it may be of interest to ask whether or not A is
>>> significant at level 1 of C and level 2 of C.
>>>
>>> The textbook response seems to be to subset the data and perform the
>>> tests on the two subsets, but using the RSS and DF from the original
>>> fit.
>>>
>>> We're trying to answer the question using new explanatory variables.
>>> This approach (seems to) work just fine in aov, but not lme.
>>>
>>> For example,
>>>
>>> ##########################################################################
>>>
>>> # Build the example dataset
>>>
>>> set.seed(101)
>>>
>>> Block <- gl(6, 6, 36, lab = paste("B", 1:6, sep = ""))
>>> A <- gl(3, 2, 36, lab = paste("A", 1:3, sep = ""))
>>> C <- gl(2, 1, 36, lab = paste("C", 1:2, sep = ""))
>>> example <- data.frame(Block, A, C)
>>> example$Y <- rnorm(36)*2 + as.numeric(example$A)*as.numeric(example$C)^2 +
>>> 3 * rep(rnorm(6), each=6)
>>>
>>> with(example, interaction.plot(A, C, Y))
>>>
>>> # lme
>>>
>>> require(nlme)
>>> anova(lme(Y~A*C, random = ~1|Block, data = example))
>>>
>>> summary(aov(Y ~ Error(Block) + A*C, data = example))
>>>
>>> # Significant 2-way interaction. Now we would like to test the effect
>>> # of A at C1 and the effect of A at C2. Construct two new variables
>>> # that decompose the interaction, so an LRT is possible.
>>>
>>> example$AC <- example$AC1 <- example$AC2 <- interaction(example$A,
>>> example$C)
>>> example$AC1[example$C == "C1"] <- "A1.C1" # A is constant at C1
>>> example$AC2[example$C == "C2"] <- "A1.C2" # A is constant at C2
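>>>
>>> # Quick illustrative check of the construction: AC1 is constant within
>>> # C1, so the distinctions that AC adds on top of AC1 are exactly the A
>>> # contrasts within C1 (and likewise AC on top of AC2 gives A within C2).
>>>
>>> with(example, table(AC1, C))
>>> with(example, table(AC2, C))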
>>>
>>> # lme fails (as does lmer)
>>>
>>> lme(Y ~ AC1 + AC, random = ~ 1 | Block, data = example)
>>>
>>> # but aov seems just fine.
>>>
>>> summary(aov(Y ~ Error(Block) + AC1 + AC, data = example))
>>>
>>> ## AC was not significant, so A is not significant at C1
>>>
>>> summary(aov(Y ~ Error(Block) + AC2 + AC, data = example))
>>>
>>> ## AC was significant, so A is significant at C2
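>>>
>>> # For comparison, the "textbook" approach mentioned at the top -- subset
>>> # the data but reuse the residual DF and mean square from the full fit --
>>> # might look roughly like this (a sketch only; Block is treated as fixed
>>> # here so that the full-model residual is the within-Block one):
>>>
>>> full <- aov(Y ~ Block + A*C, data = example)
>>> rss <- deviance(full)       # residual SS of the full fit
>>> rdf <- df.residual(full)    # residual DF of the full fit
>>> tabC2 <- anova(aov(Y ~ Block + A, data = subset(example, C == "C2")))
>>> Fval <- (tabC2["A", "Sum Sq"] / tabC2["A", "Df"]) / (rss / rdf)
>>> pf(Fval, tabC2["A", "Df"], rdf, lower.tail = FALSE)  # test of A at C2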
>>>
>>> ## Some experimentation suggests that what lme objects to is the
>>> ## 'partial confounding' approach itself, rather than the particular
>>> ## variables that we have constructed.
>>>
>>> lme(Y ~ AC, random = ~ 1 | Block, data = example)
>>> lme(Y ~ A + AC, random = ~ 1 | Block, data = example)
>>>
>>> ##########################################################################
>>>
>>> Are we doing anything obviously wrong? Is there another approach to
>>> the goal that we are trying to achieve?
>>>
>>>
>> This looks like a generic issue with lme/lmer, in that they are not
>> happy with rank deficiencies in the design matrix.
>>
>> Here's a "dirty" trick which you are not really supposed to know about
>> because it is hidden inside the "stats" namespace:
>>
>> M <- model.matrix(~AC1+AC, data=example)
>> M2 <- stats:::Thin.col(M)
>> summary(lme(Y ~ M2 - 1, random = ~ 1 | Block, data = example))
>>
>> (Thin.col(), Thin.row(), and Rank() are support functions for
>> anova.mlm() et al., but maybe we should document them and put them out
>> in the open.)
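>>
>> Just to see the rank deficiency that lme objects to, and what Thin.col()
>> removes (a check, not needed for the fit):
>>
>> qr(M)$rank            # true rank of the design matrix
>> c(ncol(M), ncol(M2))  # number of columns before and after thinning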
>>
>
> That is a neat idea, thanks, Peter, but it doesn't quite fit the bill.
> The summary provides t-tests but I am hoping to find F-tests,
> otherwise I'm not sure how to efficiently test A (3 levels) at the two
> levels of C.
>
> The anova.lme function doesn't help, sadly:
>
>
>> anova(lme(Y ~ M2 - 1, random = ~ 1 | Block, data = example))
>>
> numDF denDF F-value p-value
> M2 6 25 23.0198 <.0001
>
> so I'm still flummoxed!
>
> Andrew
>
You do have to peek into M2 to see that the test of interest is whether
the last two coefficients are zero, but how about
> M3 <- M2[,2:4]
> M4 <- M2[,5:6]
> anova(lme(Y ~ M3+M4, random = ~ 1 | Block, data = example))
numDF denDF F-value p-value
(Intercept) 1 25 10.66186 0.0032
M3 3 25 55.31464 <.0001
M4 2 25 1.27591 0.2967
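
To see which columns of M2 ended up in M3 and M4, peeking at the names
should do (assuming Thin.col() keeps the dimnames):

> colnames(M2)
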
Also, check out estimable() in the gmodels package.
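
For instance, something along these lines might give the two joint tests
directly from the interaction fit (an untested sketch: the contrast rows
assume the default treatment coding of Y ~ A*C, i.e. coefficients
(Intercept), AA2, AA3, CC2, AA2:CC2, AA3:CC2, and joint.test = TRUE gives
a Wald chi-square test rather than the F tests from anova.lme):

library(gmodels)
fit <- lme(Y ~ A*C, random = ~ 1 | Block, data = example)
cmC1 <- rbind("A2-A1 at C1" = c(0, 1, 0, 0, 0, 0),
              "A3-A1 at C1" = c(0, 0, 1, 0, 0, 0))
cmC2 <- rbind("A2-A1 at C2" = c(0, 1, 0, 0, 1, 0),
              "A3-A1 at C2" = c(0, 0, 1, 0, 0, 1))
estimable(fit, cmC1, joint.test = TRUE)  # joint Wald test of A within C1
estimable(fit, cmC2, joint.test = TRUE)  # joint Wald test of A within C2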
--
O__ ---- Peter Dalgaard Øster Farimagsgade 5, Entr.B
c/ /'_ --- Dept. of Biostatistics PO Box 2099, 1014 Cph. K
(*) \(*) -- University of Copenhagen Denmark Ph: (+45) 35327918
~~~~~~~~~~ - (p.dalgaard at biostat.ku.dk) FAX: (+45) 35327907