[R] what does it mean when my main effect 'disappears' when using lme4?

David Winsemius dwinsemius at comcast.net
Thu Aug 19 01:52:48 CEST 2010


On Aug 18, 2010, at 6:45 PM, Peter Ehlers wrote:

> On 2010-08-18 11:49, Johan Jackson wrote:
>> No, apologies (good catch David!), I merely copied the script  
>> incorrectly.
>> It was
>>
>> lmer(Y~X + (1|labs),data=DATA)
>>
>> in my original script. So my question still stands: is it expected
>> behavior for lmer to access the object 'labs' rather than the object
>> 'DATA$labs' when using the data= argument?
>>
>> JJ
>>
>
> I don't think that's expected behaviour, nor do I think that it  
> occurs.
> There must be something else going on. Can you produce this with a
> small reproducible example?

This makes me wonder whether there could be a Wiki page, to which
questioners could be referred, that illustrates the quick and easy
construction of examples for testing such theories. I would imagine
that (in this instance) the page would start with the data.frame from
the help page for lmer() (for example), then put into the workspace a
mangled copy of a vector that might exhibit the pathological structure
that might exist in the OP's version of "labs", and then run lmer() to
see whether such "unexpected behavior" is exhibited.
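
Something along these lines, perhaps. This is only a sketch, and it
rests on assumptions: that a current lme4 is installed, that its
sleepstudy data stands in for DATA, and that 'Subject' plays the role
of the OP's 'labs'. The names fit_data, fit_ws, uses_df_copy and
uses_ws_copy are invented for illustration. It checks which copy of the
grouping variable ends up in the model frame, with and without data=.

```r
# Sketch only: assumes the lme4 package is installed.
library(lme4)
data(sleepstudy, package = "lme4")

# A mangled copy of the grouping variable in the workspace: same values,
# different order, as if the data frame had been reordered afterwards.
set.seed(1)
Subject <- sample(sleepstudy$Subject)

# With data= supplied, variables named in the formula are looked up in
# the data frame first, so the workspace copy should be ignored.
fit_data <- lmer(Reaction ~ Days + (1 | Subject), data = sleepstudy)
uses_df_copy <- identical(as.character(model.frame(fit_data)$Subject),
                          as.character(sleepstudy$Subject))

# Without data=, every variable is taken from the calling environment,
# including the mangled 'Subject'.
Reaction <- sleepstudy$Reaction
Days <- sleepstudy$Days
fit_ws <- lmer(Reaction ~ Days + (1 | Subject))
uses_ws_copy <- identical(as.character(model.frame(fit_ws)$Subject),
                          as.character(Subject))

print(c(uses_df_copy = uses_df_copy, uses_ws_copy = uses_ws_copy))
```

If both flags come back TRUE, then data= is doing its job, and the
workspace copy only gets used when the data frame is not actually
reaching lmer().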

Just an idea. (I've never managed to get any R-Wiki contributions  
accepted through the gauntlet that it puts up.)

-- 
David.
>
>  -Peter Ehlers
>
>>
>>
>>
>> On Wed, Aug 18, 2010 at 11:29 AM, David Winsemius <dwinsemius at comcast.net> wrote:
>>
>>>
>>> On Aug 18, 2010, at 1:19 PM, Johan Jackson wrote:
>>>
>>>> Hi all,
>>>>
>>>> Thanks for the replies (including off list). I have since resolved the
>>>> discrepant results. I believe it has to do with R's scoping rules - I had
>>>> an object called 'labs' and a variable in the dataset (DATA) called 'labs',
>>>> and apparently (to my surprise), when I called this:
>>>>
>>>> lmer(Y~X + (1|labs),dataset=DATA)
>>>>
>>>> lmer was using the object 'labs' rather than the object 'DATA$labs'. Is
>>>> this expected behavior??
>>>>
>>>
>>> help(lmer, package=lme4)
>>>
>>> It would be if you use the wrong data argument for lmer(). I doubt  
>>> that the
>>> argument "dataset" would result in lmer processing "DATA".  My  
>>> guess is that
>>> the function also accessed objects "Y" and "X" from the calling  
>>> environment
>>> rather than from within "DATA".
>>>
>>>
>>>
>>>
>>>> This would have been fine, except I had reordered DATA in the  
>>>> meantime!
>>>>
>>>> Best,
>>>>
>>>> JJ
>>>>
>>>> On Tue, Aug 17, 2010 at 7:17 PM, Mitchell Maltenfort <mmalten at gmail.com> wrote:
>>>>
>>>>> One difference is that the random effect in lmer is assumed --
>>>>> implicitly constrained, as I understand it -- to be a bell curve.
>>>>> The fixed effect model does not have that constraint.
>>>>>
>>>>> How are the values of "labs" effects distributed in your lm model?
>>>>>
>>>>> On Tue, Aug 17, 2010 at 8:50 PM, Johan Jackson
>>>>> <johan.h.jackson at gmail.com>  wrote:
>>>>>
>>>>>> Hello,
>>>>>>
>>>>>> Setup: I have data with ~10K observations. Observations come from 16
>>>>>> different laboratories (labs). I am interested in how a continuous
>>>>>> factor, X, affects my dependent variable, Y, but there are big
>>>>>> differences in the variance and mean across labs.
>>>>>>
>>>>>> I run this model, which controls for mean but not variance
>>>>>> differences between the labs:
>>>>>> lm(Y ~ X + as.factor(labs)).
>>>>>> The effect of X is highly significant (p<  .00001)
>>>>>>
>>>>>> I then run this model using lme4:
>>>>>> lmer(Y~ X + (1|labs)) #controls for mean diffs bw labs
>>>>>> lmer(Y~X + (X|labs)) #and possible slope heterogeneity bw labs.
>>>>>>
>>>>>> For both of these latter models, the effect of X is non-significant
>>>>>> (|t| < 1.5).
>>>>>>
>>>>>> What might this be telling me about my data? I guess the second
>>>>>> (X|labs) may tell me that there are big differences in the slope
>>>>>> across labs, and that the slope isn't significant against the
>>>>>> backdrop of 16 slopes that differ quite a bit between each other. Is
>>>>>> that right? (Still, the enormous drop in p-value is surprising!). I'm
>>>>>> not clear on why the first (1|labs), however, is so discrepant from
>>>>>> just controlling for the mean effects of labs.
>>>>>>
>>>>>> Any help in interpreting these data would be appreciated. When I
>>>>>> first saw the data, I jumped for joy, but now I'm muddled and
>>>>>> uncertain if I'm overlooking something. Is there still room for
>>>>>> optimism (with respect to X affecting Y)?
>>>>>>
>>>>>> JJ
>>>>>>
>>>>>> ______________________________________________
>>>>>> R-help at r-project.org mailing list
>>>>>> https://stat.ethz.ch/mailman/listinfo/r-help
>>>>>> PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
>>>>>> and provide commented, minimal, self-contained, reproducible code.
>>>>>>
>>>>>>
>>>>>
>>>
>>> David Winsemius, MD
>>> West Hartford, CT
>>>
>>>


