[R] weighted regression inside FOREACH loop
William Dunlap
wdunlap at tibco.com
Fri Oct 7 17:18:25 CEST 2016
A more general way is to change the environment of your formula to
a child of its original environment and add variables like 'weights' or
'subset' to the child environment. lm() evaluates those arguments first
in 'data' and then in the environment of the formula, so it will find
them there. Since you change the environment inside a function call, it
won't affect the formula outside of the function call.
E.g.
fmla <- as.formula("y ~ .")
models <- foreach(d=1:10, .combine=rbind, .errorhandling='remove') %dopar% {
    datdf <- data.frame(y = 1:100+2*rnorm(100), x = 1:100+rnorm(100))
    # child of the formula's environment; it will hold the weights
    localEnvir <- new.env(parent=environment(fmla))
    environment(fmla) <- localEnvir  # changes only the local copy of fmla
    localEnvir$weights <- rep(c(1,2), 50)
    mod <- lm(fmla, data=datdf, weights=weights)
    return(mod$coef)
}
models
#         (Intercept)         x
#result.1 -0.16910860 1.0022022
#result.2  0.03326814 0.9968325
#result.3 -0.08177174 1.0022907
#...
environment(fmla)
#<environment: R_GlobalEnv>
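
The same idea can be tried without a cluster. What follows is a minimal
sketch, not part of the code above: fit_naive(), fit_weighted() and
caseWeights are hypothetical names chosen for illustration, and set.seed()
is there only for reproducibility. It assumes no object called caseWeights
exists in the environment where fmla is created. The first function keeps
the weights in a local variable, which lm() cannot reach through the
formula's environment; the second puts them in a child of that environment.

```r
fmla <- as.formula("y ~ .")
origEnv <- environment(fmla)   # remember where the formula was created

set.seed(1)                    # assumption: fixed seed, for reproducibility only
datdf <- data.frame(y = 1:100 + 2 * rnorm(100), x = 1:100 + rnorm(100))

# Hypothetical helper: the weights live only in this function's frame,
# which is neither 'data' nor the formula's environment.
fit_naive <- function(fmla, data) {
    caseWeights <- rep(c(1, 2), 50)
    lm(fmla, data = data, weights = caseWeights)
}
naive <- tryCatch(fit_naive(fmla, datdf), error = function(e) e)
inherits(naive, "error")       # the lookup of caseWeights fails

# Hypothetical helper: put the weights in a child of the formula's
# environment, so lm() finds them after looking in 'data'.
fit_weighted <- function(fmla, data, w) {
    localEnvir <- new.env(parent = environment(fmla))
    localEnvir$caseWeights <- w
    environment(fmla) <- localEnvir   # affects only this function's copy
    lm(fmla, data = data, weights = caseWeights)
}
fit <- fit_weighted(fmla, datdf, rep(c(1, 2), 50))
coef(fit)

identical(environment(fmla), origEnv)  # the caller's formula is unchanged
```

Using a name other than 'weights' for the variable makes a failed lookup
easier to spot, because a missing 'weights' falls through to the
stats::weights function and produces a less obvious error.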
Bill Dunlap
TIBCO Software
wdunlap tibco.com
On Fri, Oct 7, 2016 at 7:44 AM, Bos, Roger <roger.bos at rothschild.com> wrote:
> All,
>
> I figured out how to get it to work, so I am posting the solution in case
> anyone is interested. I had to use attr to set the weights as an attribute
> of the data object for the linear model. Seems convoluted, but anytime I
> tried to pass a named vector as the weights the foreach loop could not find
> the variable, even if I tried exporting it. If anybody knows of a better
> way please let me know as this does not seem ideal to me, but it works.
>
> library(doParallel)
> cl <- makeCluster(4)
> registerDoParallel(cl)
> fmla <- as.formula("y ~ .")
> models <- foreach(d=1:10, .combine=rbind, .errorhandling='pass') %dopar% {
>     datdf <- data.frame(y = 1:100+2*rnorm(100), x = 1:100+rnorm(100))
>     attr(datdf, "weights") <- rep(c(1,2), 50)
>     mod <- lm(fmla, data=datdf, weights=attr(data, "weights"))
>     return(mod$coef)
> }
> models
>
> -----Original Message-----
> From: R-help [mailto:r-help-bounces at r-project.org] On Behalf Of Bos, Roger
> Sent: Friday, October 07, 2016 9:25 AM
> To: R-help
> Subject: [R] weighted regression inside FOREACH loop
>
> I have a foreach loop that runs regressions in parallel and works fine,
> but when I try to add the weights parameter to the regression the
> coefficients don’t get stored in the “models” variable like they are
> supposed to. Below is my reproducible example:
>
> library(doParallel)
> cl <- makeCluster(4)
> registerDoParallel(cl)
> fmla <- as.formula("y ~ .")
> models <- foreach(d=1:10, .combine=rbind, .errorhandling='remove') %dopar%
> {
>     datdf <- data.frame(y = 1:100+2*rnorm(100), x = 1:100+rnorm(100))
>     weights <- rep(c(1,2), 50)
>     mod <- lm(fmla, data=datdf, weights=weights)
>     #mod <- lm(fmla, data=datdf)
>     return(mod$coef)
> }
> models
>
> You can change the commenting on the two “mod <-” lines to see that the
> non-weighted one works and the weighted regression doesn’t work. I tried
> using .export="weights" in the foreach line, but R says that weights is
> already being exported.
>
> Thanks in advance for any suggestions.
>
> ______________________________________________
> R-help at r-project.org mailing list -- To UNSUBSCRIBE and more, see
> https://stat.ethz.ch/mailman/listinfo/r-help
> PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
> and provide commented, minimal, self-contained, reproducible code.