[R] bug in Windows implementation of nlme::groupedData
John Fox
j|ox @end|ng |rom mcm@@ter@c@
Fri Jan 7 22:34:52 CET 2022
Dear Melissa,
It seems strange to me that your code would work on any platform (it
doesn't on my Mac) because the data frame you create shouldn't contain a
matrix named "X" but rather columns including those originating from X.
To illustrate:
> X <- matrix(1:12, 4, 3)
> colnames(X) <- c("a", "b", "c")
> X
a b c
[1,] 1 5 9
[2,] 2 6 10
[3,] 3 7 11
[4,] 4 8 12
> y <- 1:4
> (D <- data.frame(y, X))
y a b c
1 1 1 5 9
2 2 2 6 10
3 3 3 7 11
4 4 4 8 12
> str(D)
'data.frame': 4 obs. of 4 variables:
$ y: int 1 2 3 4
$ a: int 1 2 3 4
$ b: int 5 6 7 8
$ c: int 9 10 11 12
My session info:
> sessionInfo()
R version 4.1.2 (2021-11-01)
Platform: aarch64-apple-darwin20 (64-bit)
Running under: macOS Monterey 12.1
Matrix products: default
LAPACK:
/Library/Frameworks/R.framework/Versions/4.1-arm64/Resources/lib/libRlapack.dylib
locale:
[1] en_US.UTF-8/en_US.UTF-8/en_US.UTF-8/C/en_US.UTF-8/en_US.UTF-8
attached base packages:
[1] stats graphics grDevices utils datasets methods base
other attached packages:
[1] nlme_3.1-153 HRW_1.0-5
loaded via a namespace (and not attached):
[1] compiler_4.1.2 tools_4.1.2 KernSmooth_2.23-20
splines_4.1.2
[5] grid_4.1.2 lattice_0.20-45
I hope this helps,
John
--
John Fox, Professor Emeritus
McMaster University
Hamilton, Ontario, Canada
web: https://socialsciences.mcmaster.ca/jfox/
On 2022-01-07 11:23 a.m., Key, Melissa wrote:
> I am trying to replicate a semi-parametric analysis described in Harezlak, Jaroslaw, David Ruppert, and Matt P. Wand. Semiparametric regression with R. New York, NY: Springer, 2018. (https://link.springer.com/book/10.1007%2F978-1-4939-8853-2).
>
> I can successfully run the analysis, but now I'm trying to move it into my workflow, which requires that the analysis be conducted within a function (using targets package), and the `groupedData` function now fails with an error that it cannot find the `X` matrix (see reprex below). I've tried the reprex on both my personal Mac (where it works??) and on windows machines (where it does not) - so the problem is likely specific to Windows computers (yes, this seems weird to me too).
> All packages have been updated, and I'm running the latest version of R on all machines.
>
> Reprex:
>
> library(HRW) # contains example data and ZOSull function
> library(nlme)
>
> data(growthIndiana)
>
>
> analyze_this <- function(df) {
>
> mean.x <- mean(df$age)
> mean.y <- mean(df$height)
> sd.x <- sd(df$age)
> sd.y <- sd(df$height)
>
> df$x <- (df$age - mean.x) / sd.x
> df$y <- (df$height - mean.y) / sd.y
>
> X <- model.matrix(~ x * male * black, data = df)
> dummyID <- rep(1, length(nrow(X)))
>
> grouped_data <- groupedData(y ~ X[,-1]|rep(1, length = nrow(X)), data = data.frame(y = df$y, X, dummyID))
>
> }
>
>
> # doesn't work on Windows machine, does work on the Mac
> analyze_this(growthIndiana)
> #> Error in eval(aux[[2]], object): object 'X' not found
>
> # does work
>
> df <- growthIndiana
>
> mean.x <- mean(df$age)
> mean.y <- mean(df$height)
> sd.x <- sd(df$age)
> sd.y <- sd(df$height)
>
> df$x <- (df$age - mean.x) / sd.x
> df$y <- (df$height - mean.y) / sd.y
>
> X <- model.matrix(~ x * male * black, data = df)
> dummyID <- rep(1, length(nrow(X)))
>
> grouped_data <- groupedData(y ~ X[,-1]|rep(1, length = nrow(X)), data = data.frame(y = df$y, X, dummyID))
>
>
> # attempted work-around.
>
> analyze_this2 <- function(df) {
> num.global.knots = 20
> num.subject.knots = 10
>
> mean.x <- mean(df$age)
> mean.y <- mean(df$height)
> sd.x <- sd(df$age)
> sd.y <- sd(df$height)
>
> df$x <- (df$age - mean.x) / sd.x
> df$y <- (df$height - mean.y) / sd.y
>
> X <- model.matrix(~ x * male * black, data = df)
> dummyID <- rep(1, length(nrow(X)))
>
> # grouped_data <- groupedData(y ~ X[,-1]|rep(1, length = nrow(X)), data = data.frame(y = df$y, X, dummyID))
>
> global.knots = quantile(unique(df$x), seq(0, 1, length = num.global.knots + 2)[-c(1, num.global.knots + 2)])
> subject.knots = quantile(unique(df$x), seq(0, 1, length = num.subject.knots + 2)[-c(1, num.subject.knots + 2)])
>
> Z.global <- ZOSull(df$x, range.x = range(df$x), global.knots)
> Z.group <- df$black * Z.global
> Z.subject <- ZOSull(df$x, range.x = range(df$x), subject.knots)
>
> Zblock <- list(
> dummyID = pdIdent(~ 0 + Z.global),
> dummyID = pdIdent(~ 0 + Z.group),
> idnum = pdSymm(~ x),
> idnum = pdIdent(~ 0 + Z.subject)
> )
>
> df$dummyID <- dummyID
> tmp_data <- c(
> df,
> X = list(X),
> Z.global = list(Z.global),
> Z.group = list(Z.global),
> Z.subject = list(Z.subject)
> )
>
> fit <- lme(y ~ 0 + X,
> data = tmp_data,
> random = Zblock
> )
>
> }
>
> # this works (warning - lme takes awhile to fit)
> analyze_this2(growthIndiana)
>
> sessionInfo()
> #> R version 4.1.2 (2021-11-01)
> #> Platform: x86_64-w64-mingw32/x64 (64-bit)
> #> Running under: Windows 10 x64 (build 22000)
> #>
> #> Matrix products: default
> #>
> #> locale:
> #> [1] LC_COLLATE=English_United States.1252
> #> [2] LC_CTYPE=English_United States.1252
> #> [3] LC_MONETARY=English_United States.1252
> #> [4] LC_NUMERIC=C
> #> [5] LC_TIME=English_United States.1252
> #>
> #> attached base packages:
> #> [1] stats graphics grDevices utils datasets methods base
> #>
> #> other attached packages:
> #> [1] nlme_3.1-153 HRW_1.0-5
> #>
> #> loaded via a namespace (and not attached):
> #> [1] lattice_0.20-45 digest_0.6.29 withr_2.4.3 grid_4.1.2
> #> [5] magrittr_2.0.1 reprex_2.0.1 evaluate_0.14 KernSmooth_2.23-20
> #> [9] highr_0.9 stringi_1.7.6 rlang_0.4.12 cli_3.1.0
> #> [13] rstudioapi_0.13 fs_1.5.2 rmarkdown_2.11 splines_4.1.2
> #> [17] tools_4.1.2 stringr_1.4.0 glue_1.6.0 xfun_0.29
> #> [21] yaml_2.2.1 fastmap_1.1.0 compiler_4.1.2 htmltools_0.5.2
> #> [25] knitr_1.37
>
> Created on 2022-01-07 by the [reprex package](https://reprex.tidyverse.org) (v2.0.1)
>
> Here's the session Info for the Mac:
>
> sessionInfo()
> R version 4.1.2 (2021-11-01)
> Platform: aarch64-apple-darwin20 (64-bit)
> Running under: macOS Monterey 12.1
>
> Matrix products: default
> LAPACK: /Library/Frameworks/R.framework/Versions/4.1-arm64/Resources/lib/libRlapack.dylib
>
> locale:[1] en_US.UTF-8/en_US.UTF-8/en_US.UTF-8/C/en_US.UTF-8/en_US.UTF-8
> attached base packages:
> [1] stats graphics grDevices utils datasets methods base
>
> other attached packages:
> [1] nlme_3.1-153 targets_0.8.1 HRW_1.0-5
>
> loaded via a namespace (and not attached):
> [1] compiler_4.1.2 pillar_1.6.4 tools_4.1.2 digest_0.6.28 lattice_0.20-45 jsonlite_1.7.2 evaluate_0.14
> [8] lifecycle_1.0.1 tibble_3.1.6 pkgconfig_2.0.3 rlang_0.4.12 igraph_1.2.9 cli_3.1.0 yaml_2.2.1
> [15] xfun_0.28 fastmap_1.1.0 withr_2.4.2 knitr_1.36 htmlwidgets_1.5.4 vctrs_0.3.8 grid_4.1.2
> [22] tidyselect_1.1.1 glue_1.5.0 data.table_1.14.2 R6_2.5.1 processx_3.5.2 fansi_0.5.0 rmarkdown_2.11
> [29] callr_3.7.0 purrr_0.3.4 magrittr_2.0.1 scales_1.1.1 codetools_0.2-18 ps_1.6.0 htmltools_0.5.2
> [36] ellipsis_0.3.2 splines_4.1.2 colorspace_2.0-2 KernSmooth_2.23-20 utf8_1.2.2 visNetwork_2.1.0 munsell_0.5.0
> [43] crayon_1.4.2
>
>
> Melissa Key, Ph.D.
> Statistician
> Non-Invasive Brain Stimulation Team
> Infoscitex
>
> Cell: 937-550-4981
> Email: mkey using infoscitex.com<mailto:mkey using infoscitex.com>
> Base email: melissa.key.ctr using us.af.mil<mailto:melissa.key.ctr using us.af.mil>
>
>
> [[alternative HTML version deleted]]
>
> ______________________________________________
> R-help using r-project.org mailing list -- To UNSUBSCRIBE and more, see
> https://stat.ethz.ch/mailman/listinfo/r-help
> PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
> and provide commented, minimal, self-contained, reproducible code.
More information about the R-help
mailing list