[R] Passing formula as parameter to `lm` within `sapply` causes error [BUG?]
Duncan Murdoch
murdoch@dunc@n @end|ng |rom gm@||@com
Wed May 1 00:32:17 CEST 2019
On 30/04/2019 11:24 a.m., Jens Heumann wrote:
> Hi,
>
> `lm` won't take formula as a parameter when it is within a `sapply`; see
> example below. Please, could anyone either point me to a syntax error or
> confirm that this might be a bug?
>
I haven't looked carefully at your example. From a quick glance,
however, I suspect the issue is with the formula. Formulas have
environments attached to them, and variables that aren't found in the
data argument to lm() are looked up there. In your code it's not
obvious to me which environment would be attached, but I suspect it's
the caller of sapply, not the environment that sapply creates for a
particular value of its argument. I think this because of a rule that
is supposed to be followed in R:

   Formulas get the environment where they were created attached to
   them.

For FO, that would be your global environment, not the function in
which `s` is defined.

R is flexible, so functions don't have to follow this rule, but it
causes lots of confusion when they don't.

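
A minimal sketch of that lookup rule, in case it helps. It does not use the data from the post: the data frame `d`, the helper `fit_one`, and the result names are invented for illustration. It assumes, as lm's documentation describes, that `subset` and `weights` are evaluated first in `data` and then in the formula's environment.

```r
# Illustrative data only (NOT the data from the original post).
set.seed(1)
d <- data.frame(x = rnorm(9), y = rnorm(9),
                z = rep(1:3, each = 3),
                w = runif(9, 0.5, 1.5))

# The formula is created here, outside any function, so the environment
# attached to it is this calling environment, not a function frame.
FO <- y ~ x

# Failure case: `s` exists only in the anonymous function's frame, which
# is not reachable from environment(FO), so evaluating `subset` fails.
res_fail <- tryCatch(
  sapply(unique(d$z), function(s) coef(lm(FO, d, d[["z"]] == s, d[["w"]]))),
  error = function(e) conditionMessage(e)
)

# Workaround 1: give a copy of the formula the function's own environment,
# so that `s` is found when `subset` is evaluated.
fit_one <- function(s) {
  f <- FO
  environment(f) <- environment()
  coef(lm(f, d, d[["z"]] == s, d[["w"]]))
}
res_env <- sapply(unique(d$z), fit_one)

# Workaround 2: subset the data first, so that nothing outside `data`
# has to be looked up at all (`w` is a column of the subset).
res_sub <- sapply(unique(d$z), function(s) {
  di <- d[d$z == s, ]
  coef(lm(FO, di, weights = w))
})
```

Here `res_fail` holds the error message rather than a matrix, while `res_env` and `res_sub` should agree. The same rule would also explain why the `for` loop in the post works: at top level the loop variable `s` is created in the global environment, which is exactly where `FO` looks.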
Duncan Murdoch
> Best,
> Jens
>
> [Disclaimer: This is my first post here, following advice of how to
> proceed with possible bugs from here: https://www.r-project.org/bugs.html]
>
>
> SUMMARY
>
> While `lm` alone accepts formula parameter `FO` well, the same within a
> `sapply` causes an error. When putting everything as parameter but
> formula `FO`, it's still working, though. All parameters work fine
> within a similar `for` loop.
>
>
> MCVE (see data / R-version at bottom)
>
> > summary(lm(y ~ x, df1, df1[["z"]] == 1, df1[["w"]]))$coef[1, ]
>   Estimate Std. Error    t value   Pr(>|t|)
>  1.6269038  0.9042738  1.7991275  0.3229600
> > summary(lm(FO, data, data[[st]] == st1, data[[ws]]))$coef[1, ]
>   Estimate Std. Error    t value   Pr(>|t|)
>  1.6269038  0.9042738  1.7991275  0.3229600
> > sapply(unique(df1$z), function(s)
> +   summary(lm(y ~ x, df1, df1[["z"]] == s, df1[[ws]]))$coef[1, ])
>                 [,1]       [,2]         [,3]
> Estimate   1.6269038 -0.1404174 -0.010338774
> Std. Error 0.9042738  0.4577001  1.858138516
> t value    1.7991275 -0.3067890 -0.005564049
> Pr(>|t|)   0.3229600  0.8104951  0.996457853
> > sapply(unique(data[[st]]), function(s)
> +   summary(lm(FO, data, data[[st]] == s, data[[ws]]))$coef[1, ])  # !!!
> Error in eval(substitute(subset), data, env) : object 's' not found
> > sapply(unique(data[[st]]), function(s)
> +   summary(lm(y ~ x, data, data[[st]] == s, data[[ws]]))$coef[1, ])
>                 [,1]       [,2]         [,3]
> Estimate   1.6269038 -0.1404174 -0.010338774
> Std. Error 0.9042738  0.4577001  1.858138516
> t value    1.7991275 -0.3067890 -0.005564049
> Pr(>|t|)   0.3229600  0.8104951  0.996457853
> > m <- matrix(NA, 4, length(unique(data[[st]])))
> > for (s in unique(data[[st]])) {
> +   m[, s] <- summary(lm(FO, data, data[[st]] == s, data[[ws]]))$coef[1, ]
> + }
> > m
>           [,1]       [,2]         [,3]
> [1,] 1.6269038 -0.1404174 -0.010338774
> [2,] 0.9042738  0.4577001  1.858138516
> [3,] 1.7991275 -0.3067890 -0.005564049
> [4,] 0.3229600  0.8104951  0.996457853
>
> # DATA #################################################################
>
> df1 <- structure(list(x = c(1.37095844714667, -0.564698171396089,
> 0.363128411337339,
> 0.63286260496104, 0.404268323140999, -0.106124516091484, 1.51152199743894,
> -0.0946590384130976, 2.01842371387704), y = c(1.30824434809425,
> 0.740171482827397, 2.64977380403845, -0.755998096151299, 0.125479556323628,
> -0.239445852485142, 2.14747239550901, -0.37891195982917, -0.638031707027734
> ), z = c(1L, 1L, 1L, 2L, 2L, 2L, 3L, 3L, 3L), w = c(0.7, 0.8,
> 1.2, 0.9, 1.3, 1.2, 0.8, 1, 1)), class = "data.frame", row.names = c(NA,
> -9L))
>
> FO <- y ~ x; data <- df1; st <- "z"; ws <- "w"; st1 <- 1
>
> ########################################################################
>
> > R.version
>                _
> platform       x86_64-w64-mingw32
> arch           x86_64
> os             mingw32
> system         x86_64, mingw32
> status
> major          3
> minor          6.0
> year           2019
> month          04
> day            26
> svn rev        76424
> language       R
> version.string R version 3.6.0 (2019-04-26)
> nickname       Planting of a Tree
>
> #########################################################################
>
> NOTE: Question on SO two days ago
> (https://stackoverflow.com/questions/55893189/passing-formula-as-parameter-to-lm-within-sapply-causes-error-bug-confirmation)
> brought many views but neither answer nor bug confirmation.
>