[R] lm function in R
Gabor Grothendieck
ggrothendieck at gmail.com
Sun Feb 14 00:55:54 CET 2010
You can find out the model matrix like this where we use the builtin
data frame CO2 to illustrate:
fo <- uptake ~ Treatment * Type
mod <- lm(fo, CO2); mod
mm <- model.matrix(fo, CO2)
View(mm)
On Sat, Feb 13, 2010 at 5:03 PM, Something Something
<mailinglists19 at gmail.com> wrote:
> I tried..
>
> mod = lm(Y ~ X1*X2*X3, na.action = na.exclude)
> formula(mod)
>
> This produced....
> Y ~ X1 * X2 * X3
>
>
> When I typed just mod I got:
>
> Call:
> lm(formula = Y ~ X1 * X2 * X3, na.action = na.exclude)
>
> Coefficients:
> (Intercept) X11 X21 X31 X11:X21 X11:X31
> X21:X31 X11:X21:X31
> 177.9245 0.2005 2.4482 3.1216 0.8127 -26.6166
> -3.0398 29.6049
>
>
> I am trying to figure out how R computed all these coefficients.
>
>
>
>
>
> On Sat, Feb 13, 2010 at 1:30 PM, Bert Gunter <gunter.berton at gene.com> wrote:
>
>> ?formula
>>
>>
>> Bert Gunter
>> Genentech Nonclinical Statistics
>>
>> -----Original Message-----
>> From: r-help-bounces at r-project.org [mailto:r-help-bounces at r-project.org]
>> On
>> Behalf Of Something Something
>> Sent: Saturday, February 13, 2010 1:24 PM
>> To: Daniel Nordlund
>> Cc: r-help at r-project.org
>> Subject: Re: [R] lm function in R
>>
>> Thanks Dan. Yes that was very helpful. I didn't see the change from '*'
>> to
>> '+'.
>>
>> Seems like when I put * it means - interaction & when I put + it's not an
>> interaction.
>>
>> Is it correct to assume then that...
>>
>> When I put + R evaluates the following equation:
>> Y-Hat = b0 + b1X1 + b2X2 + . . . bkXk + 7 7 7 + bkXk
>>
>>
>> But when I put * R evaluates the following equation;
>> Y-Hat = b0 + b1X1 + b2x2 + ... + bkXk + b12 X12+ b13 X13 +........ + c
>>
>> Is this correct? If it is then can someone point me to any sources that
>> will explain how the coefficients (such as b0... bk, b12.. , b123..) are
>> calculated. I guess, one source is the R source code :) but is there any
>> other documentation anywhere?
>>
>> Please let me know. Thanks.
>>
>>
>>
>> On Fri, Feb 12, 2010 at 5:54 PM, Daniel Nordlund
>> <djnordlund at verizon.net>wrote:
>>
>> > > -----Original Message-----
>> > > From: r-help-bounces at r-project.org [mailto:
>> r-help-bounces at r-project.org]
>> > > On Behalf Of Something Something
>> > > Sent: Friday, February 12, 2010 5:28 PM
>> > > To: Phil Spector; r-help at r-project.org
>> > > Subject: Re: [R] lm function in R
>> > >
>> > > Thanks for the replies everyone. Greatly appreciate it. Some
>> progress,
>> > > but
>> > > now I am getting the following values when I don't use "as.factor"
>> > >
>> > > 13.14167 25.11667 28.34167 49.14167 40.39167 66.86667
>> > >
>> > > Is that what you guys get?
>> >
>> >
>> > If you look at Phil's response below, no, that is not what he got. The
>> > difference is that you are specifying an interaction, whereas Phil did
>> not
>> > (because the equation you initially specified did not include an
>> > interaction. Use Y ~ X1 + X2 instead of Y ~ X1*X2 for your formula.
>> >
>> > >
>> > >
>> > > On Fri, Feb 12, 2010 at 5:00 PM, Phil Spector
>> > > <spector at stat.berkeley.edu>wrote:
>> > >
>> > > > By converting the two variables to factors, you are fitting
>> > > > an entirely different model. Leave out the as.factor stuff
>> > > > and it will work exactly as you want it to.
>> > > >
>> > > > dat
>> > > >>
>> > > > V1 V2 V3 V4
>> > > > 1 s1 14 4 1
>> > > > 2 s2 23 4 2
>> > > > 3 s3 30 7 2
>> > > > 4 s4 50 7 4
>> > > > 5 s5 39 10 3
>> > > > 6 s6 67 10 6
>> > > >
>> > > >> names(dat) = c('id','y','x1','x2')
>> > > >> z = lm(y~x1+x2,dat)
>> > > >> predict(z)
>> > > >>
>> > > > 1 2 3 4 5 6 15.16667
>> 24.66667
>> > > > 27.66667 46.66667 40.16667 68.66667
>> > > >
>> > > >
>> > > > - Phil Spector
>> > > > Statistical Computing
>> Facility
>> > > > Department of Statistics
>> > > > UC Berkeley
>> > > > spector at stat.berkeley.edu
>> > > >
>> > > >
>> > > >
>> > > > On Fri, 12 Feb 2010, Something Something wrote:
>> > > >
>> > > > Hello,
>> > > >>
>> > > >> I am trying to learn how to perform Multiple Regression Analysis in
>> R.
>> > > I
>> > > >> decided to take a simple example given in this PDF:
>> > > >> http://www.utdallas.edu/~herve/abdi-prc-pretty.pdf
>> > > >>
>> > > >> I created a small CSV called, students.csv that contains the
>> following
>> > > >> data:
>> > > >>
>> > > >> s1 14 4 1
>> > > >> s2 23 4 2
>> > > >> s3 30 7 2
>> > > >> s4 50 7 4
>> > > >> s5 39 10 3
>> > > >> s6 67 10 6
>> > > >>
>> > > >> Col headers: Student id, Memory span(Y), age(X1), speech rate(X2)
>> > > >>
>> > > >> Now the expected results are:
>> > > >>
>> > > >> yHat[0]:15.166666666666668
>> > > >> yHat[1]:24.666666666666668
>> > > >> yHat[2]:27.666666666666664
>> > > >> yHat[3]:46.666666666666664
>> > > >> yHat[4]:40.166666666666664
>> > > >> yHat[5]:68.66666666666667
>> > > >>
>> > > >> This is based on the following equation (given in the PDF): Y =
>> 1.67
>> > +
>> > > X1
>> > > >> +
>> > > >> 9.50 X2
>> > > >>
>> > > >> I ran the following commands in R:
>> > > >>
>> > > >> data = read.table("students.csv", head=F, as.is=T, na.string=".",
>> > > >> row.nam=NULL)
>> > > >> X1 = as.factor(data[[3]])
>> > > >> X2 = as.factor(data[[4]])
>> > > >> Y = data[[2]]
>> > > >> mod = lm(Y ~ X1*X2, na.action = na.exclude)
>> > > >> Y.hat = fitted(mod)
>> > > >> Y.hat
>> > > >>
>> > > >> This gives me the following output:
>> > > >>
>> > > >> Y.hat
>> > > >>>
>> > > >> 1 2 3 4 5 6
>> > > >> 14 23 30 50 39 67
>> > > >>
>> > > >> Obviously I am doing something wrong. Please help. Thanks.
>> > > >>
>> >
>> > Hope this is helpful,
>> >
>> > Dan
>> >
>> > Daniel Nordlund
>> > Bothell, WA USA
>> >
>> >
>> > ______________________________________________
>> > R-help at r-project.org mailing list
>> > https://stat.ethz.ch/mailman/listinfo/r-help
>> > PLEASE do read the posting guide
>> > http://www.R-project.org/posting-guide.html
>> > and provide commented, minimal, self-contained, reproducible code.
>> >
>>
>> [[alternative HTML version deleted]]
>>
>>
>>
>
> [[alternative HTML version deleted]]
>
> ______________________________________________
> R-help at r-project.org mailing list
> https://stat.ethz.ch/mailman/listinfo/r-help
> PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
> and provide commented, minimal, self-contained, reproducible code.
>
More information about the R-help
mailing list