[R] Is this a valid syntax for lm()
Rui Barradas
ru|pb@rr@d@@ @end|ng |rom @@po@pt
Wed Nov 12 18:32:07 CET 2025
At 17:12 on 12/11/2025, Rui Barradas wrote:
> At 16:30 on 12/11/2025, Brian Smith wrote:
>> Hi,
>>
>> I have below code
>>
>> ctl <- c(4.17,5.58,5.18,6.11,4.50,4.61,5.17,4.53,5.33,5.14)
>> trt <- c(4.81,4.17,4.41,3.59,5.87,3.83,6.03,4.89,4.32,4.69)
>> group <- gl(2, 10, 20, labels = c("Ctl","Trt"))
>> group1 <- head(gl(2, 10, 22, labels = c("Ctl1","Trt1")), 20)
>> weight <- c(ctl, trt)
>> dat = as.data.frame(cbind(weight, group, group1))
>> lm.D9 <- lm(weight ~ group * group1 - 1 - group1, dat)
>>
>> I want to incorporate the interaction between the 2 variables group and
>> group1, but I do not want to incorporate level-0 for group1 nor the
>> intercept.
>>
>> Therefore I used (-1 - group1) in the formula.
>>
>> I would like to know if the above is valid syntax for the stated model.
>>
>> Thanks and regards,
>>
>> ______________________________________________
>> R-help using r-project.org mailing list -- To UNSUBSCRIBE and more, see
>> https://stat.ethz.ch/mailman/listinfo/r-help
>> PLEASE do read the posting guide https://www.R-project.org/posting-
>> guide.html
>> and provide commented, minimal, self-contained, reproducible code.
> Hello,
>
> Yes, that syntax is valid. But isn't
>
> lm.D9b <- lm(weight ~ 0 + group + group:group1, dat)
>
>
> more readable?
>
> You can check that the two models are the same with
>
>
> summary(lm.D9)
> summary(lm.D9b)
>
>
> The following will show where the objects returned by those two calls to
> lm() differ, which gives further reasons to prefer model lm.D9b.
>
>
> all.equal(lm.D9, lm.D9b, check.attributes = FALSE)
>
>
> Hope this helps,
>
> Rui Barradas
>
Hello,

Sorry for my hasty post; there is another problem with your code.
The code that creates dat is wrong:

dat = as.data.frame(cbind(weight, group, group1))

first creates a matrix with cbind(), then coerces the matrix to a data.frame.
The error is in creating the matrix. A matrix can hold only one data type,
so all variables become numeric and the factors group and group1 are no
longer factors.
This error affects everything that follows.

The correct way is to use data.frame(weight, group, group1). See the
code below. The models' coefficients, standard errors, etc. are different,
and so are the predictions from the models.

ctl <- c(4.17,5.58,5.18,6.11,4.50,4.61,5.17,4.53,5.33,5.14)
trt <- c(4.81,4.17,4.41,3.59,5.87,3.83,6.03,4.89,4.32,4.69)
group <- gl(2, 10, 20, labels = c("Ctl","Trt"))
group1 <- head(gl(2, 10, 22, labels = c("Ctl1","Trt1")), 20)
weight <- c(ctl, trt)
wrong_dat <- as.data.frame(cbind(weight, group, group1))
right_dat <- data.frame(weight, group, group1)
str(wrong_dat)
#> 'data.frame': 20 obs. of 3 variables:
#> $ weight: num 4.17 5.58 5.18 6.11 4.5 4.61 5.17 4.53 5.33 5.14 ...
#> $ group : num 1 1 1 1 1 1 1 1 1 1 ...
#> $ group1: num 1 1 1 1 1 1 1 1 1 1 ...
str(right_dat)
#> 'data.frame': 20 obs. of 3 variables:
#> $ weight: num 4.17 5.58 5.18 6.11 4.5 4.61 5.17 4.53 5.33 5.14 ...
#> $ group : Factor w/ 2 levels "Ctl","Trt": 1 1 1 1 1 1 1 1 1 1 ...
#> $ group1: Factor w/ 2 levels "Ctl1","Trt1": 1 1 1 1 1 1 1 1 1 1 ...
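
The same coercion can be seen in isolation on a smaller example. This is only a sketch: the names f, y, m, bad and good and the numbers in y are made up for illustration and are not part of the original data. It shows that cbind() stores the factor as its integer codes, and that the design matrix then has a single numeric column instead of one column per level.

```r
# Hypothetical stand-ins: f plays the role of `group`, y the role of `weight`
f <- gl(2, 3, labels = c("Ctl", "Trt"))
y <- c(4.2, 5.6, 5.2, 4.8, 4.2, 4.4)

m <- cbind(y, f)           # numeric matrix; the factor is reduced to its codes
is.numeric(m)

bad  <- as.data.frame(m)   # both columns numeric, the level labels are lost
good <- data.frame(y, f)   # f is kept as a factor

sapply(bad, class)
sapply(good, class)

# Consequence for the model: one slope column versus one column per level
colnames(model.matrix(~ 0 + f, bad))
colnames(model.matrix(~ 0 + f, good))
```

This matches the summaries further down, where the model fitted to wrong_dat has a single coefficient named group while the model fitted to right_dat has groupCtl and groupTrt.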
wrong_lm.D9 <- lm(weight ~ group * group1 - 1 - group1, wrong_dat)
right_lm.D9 <- lm(weight ~ group * group1 - 1 - group1, right_dat)
summary(wrong_lm.D9)
#>
#> Call:
#> lm(formula = weight ~ group * group1 - 1 - group1, data = wrong_dat)
#>
#> Residuals:
#> Min 1Q Median 3Q Max
#> -1.0710 -0.4938 0.0685 0.2462 1.3690
#>
#> Coefficients:
#> Estimate Std. Error t value Pr(>|t|)
#> group 7.7335 0.4540 17.04 1.51e-12 ***
#> group:group1 -2.7015 0.2462 -10.97 2.10e-09 ***
#> ---
#> Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
#>
#> Residual standard error: 0.6964 on 18 degrees of freedom
#> Multiple R-squared: 0.9818, Adjusted R-squared: 0.9798
#> F-statistic: 485.1 on 2 and 18 DF, p-value: < 2.2e-16
summary(right_lm.D9)
#>
#> Call:
#> lm(formula = weight ~ group * group1 - 1 - group1, data = right_dat)
#>
#> Residuals:
#> Min 1Q Median 3Q Max
#> -1.0710 -0.4938 0.0685 0.2462 1.3690
#>
#> Coefficients: (2 not defined because of singularities)
#> Estimate Std. Error t value Pr(>|t|)
#> groupCtl 5.0320 0.2202 22.85 9.55e-15 ***
#> groupTrt 4.6610 0.2202 21.16 3.62e-14 ***
#> groupCtl:group1Trt1 NA NA NA NA
#> groupTrt:group1Trt1 NA NA NA NA
#> ---
#> Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
#>
#> Residual standard error: 0.6964 on 18 degrees of freedom
#> Multiple R-squared: 0.9818, Adjusted R-squared: 0.9798
#> F-statistic: 485.1 on 2 and 18 DF, p-value: < 2.2e-16
# generate data for predict()
g <- gl(2, 1, labels = c("Ctl","Trt"))
g1 <- gl(2, 1, labels = c("Ctl1","Trt1"))
# wrong_new must be coerced to numeric
wrong_new <- expand.grid(group = g, group1 = g1)
wrong_new[] <- lapply(wrong_new, as.numeric)
# keep right_new as factors
right_new <- expand.grid(group = g, group1 = g1)
predict(wrong_lm.D9, newdata = wrong_new)
#> 1 2 3 4
#> 5.0320 10.0640 2.3305 4.6610
predict(right_lm.D9, newdata = right_new)
#> Warning in predict.lm(right_lm.D9, newdata = right_new): prediction from
#> rank-deficient fit; attr(*, "non-estim") has doubtful cases
#> 1 2 3 4
#> 5.032 4.661 5.032 4.661
#> attr(,"non-estim")
#> 2 3
#> 2 3
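
The NA coefficients and the rank-deficiency warning above seem to come from the way group and group1 are built in this example: every Ctl observation is also Ctl1 and every Trt observation is also Trt1, so the interaction columns carry no extra information. A short check, offered as a sketch (X and tab are names introduced here, not from the original post):

```r
group  <- gl(2, 10, 20, labels = c("Ctl", "Trt"))
group1 <- head(gl(2, 10, 22, labels = c("Ctl1", "Trt1")), 20)

# Cross-tabulation: only the Ctl/Ctl1 and Trt/Trt1 cells are populated
tab <- table(group, group1)
tab

# Design matrix for the formula used in the post (no response needed)
X <- model.matrix(~ group * group1 - 1 - group1)
colnames(X)
ncol(X)      # number of coefficients the formula asks for
qr(X)$rank   # number of coefficients that can actually be estimated
```

With data in which the two factors are not completely confounded, the rank would equal the number of columns and the interaction terms could be estimated.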
Hope this helps,
Rui Barradas