[R] lmer, p-values and all that
Nutter, Benjamin
NutterB at ccf.org
Wed Apr 3 15:56:50 CEST 2013
My apologies for coming late into this conversation, but I'm curious about something in your response
You use the following code to peform a likelihood ratio test between an lm object and a mer object
fm0 <- lm(distance~age,data=Orthodont)
fm2 <- lmer(distance~age+(1|Subject),data=Orthodont,REML=FALSE)
ddiff <- -2*c(logLik(fm0)-logLik(fm2))
pchisq(ddiff,1,lower.tail=FALSE)
It seems like this would be simple to roll up into a function, such as
lrtestGeneric <- function(fit1, fit2){
chisq <- -2 * c(logLik(fit1) - logLik(fit2))
df <- abs(attributes(logLik(fit1))$df - attributes(logLik(fit2))$df)
p <- pchisq(chisq, df, lower.tail=FALSE)
lrtest <- data.frame("L.R.Chisq"=chisq, "d.f."=df, "P"=p)
return(lrtest)
}
lrtestGeneric(fm0, fm2)
My question is about whether it is appropriate to use the degrees of freedom returned by logLik or if I should just use 1 degree of freedom when comparing a model without the random effect to one with the random effect. For instance, logLik returns a difference of 3 between degrees of freedom in the models. Should I be using the 3 degrees of freedom in the likelihood ratio test, or is it better to go with 1?
fit0 <- lm(Reaction ~ Days, sleepstudy)
fit1 <- lmer(Reaction ~ Days + (Days|Subject), sleepstudy)
lrtestGeneric(fit0, fit2)
Any education you can provide would be great appreciated.
Thanks
Benjamin Nutter | Biostatistician | Quantitative Health Sciences
Cleveland Clinic | 9500 Euclid Ave. | Cleveland, OH 44195 | (216) 445-1365
-----Original Message-----
From: r-help-bounces at r-project.org [mailto:r-help-bounces at r-project.org] On Behalf Of Ben Bolker
Sent: Wednesday, March 27, 2013 10:34 PM
To: David Winsemius
Cc: r-help at stat.math.ethz.ch
Subject: Re: [R] lmer, p-values and all that
On 13-03-27 10:10 PM, David Winsemius wrote:
>
> On Mar 27, 2013, at 7:00 PM, Ben Bolker wrote:
>
>> Michael Grant <michael.grant <at> colorado.edu> writes:
>>> Dear Help:
>>
>>> I am trying to follow Professor Bates' recommendation, quoted by
>>> Professor Crawley in The R Book, p629, to determine whether I should
>>> model data using the 'plain old' lm function or the mixed model
>>> function lmer by using the syntax anova(lmModel,lmerModel).
>>> Apparently I've not understood the recommendation or the proper
>>> likelihood ratio test in question (or both) for I get this error
>>> message: Error: $ operator not defined for this S4 class.
>>
>> I don't have the R Book handy (some more context would be extremely
>> useful! I would think it would count as "fair use" to quote the
>> passage you're referring to ...)
>
> This is the quoted Rhelp entry:
>
> http://tolstoy.newcastle.edu.au/R/help/05/01/10006.html
>
> (I'm unable to determine whether it applies to the question at hand.)
OK, I misspoke -- sorry. I think the lmer()/lme() likelihoods are actually comparable; it's GLMMs (glmer(), with no analogue in lme()-land except for MASS::glmmPQL, which doesn't give you log-likelihoods at all) where the problem arises.
You can (1) use lme(), (2) look at http://glmm.wikidot.com/faq for suggestions about testing random-effects terms (including "perhaps don't do it at all"), or (3) construct the likelihood ratio test yourself as follows:
library("nlme")
data(Orthodont)
fm1 <- lme(distance~age,random=~1|Subject,data=Orthodont)
fm0 <- lm(distance~age,data=Orthodont)
anova(fm1,fm0)[["p-value"]]
detach("package:nlme",unload=TRUE)
library("lme4.0") ## stable version of lme4
fm2 <- lmer(distance~age+(1|Subject),data=Orthodont,REML=FALSE)
anova(fm2,fm0) ## fails
ddiff <- -2*c(logLik(fm0)-logLik(fm2))
pchisq(ddiff,1,lower.tail=FALSE)
## not identical to above but close enough
>
>>
>>> Would someone be kind enough to point out my blunder?
>>
>> You should probably repost this to the
>> r-sig-mixed-models at r-project.org mailing list.
>>
>> My short answer would be: (1) I don't think you can actually use
>> anova() to compare likelihoods between lm() and lme()/lmer() fits in
>> the way that you want: *maybe* for lme() [don't recall], but almost
>> certainly not for lmer(). See http://glmm.wikidot.com/faq for
>> methods for testing significance/inclusion of random factors (short
>> answer: you should *generally* try to make the decision whether to
>> include random factors or not on _a priori_ grounds, not on the basis
>> of statistical tests ...)
>>
>> Ben Bolker
>>
>
______________________________________________
R-help at r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.
===================================
Please consider the environment before printing this e-mail
Cleveland Clinic is ranked one of the top hospitals in America by U.S.News & World Report (2012).
Visit us online at http://www.clevelandclinic.org for a complete listing of our services, staff and locations.
Confidentiality Note: This message is intended for use ...{{dropped:18}}
More information about the R-help
mailing list