[Rd] Deep Replicable Bug With AMD Threadripper MultiCore
ivo welch
|vo@we|ch @end|ng |rom @nder@on@uc|@@edu
Fri Apr 5 02:28:54 CEST 2019
The following program is whittled down from a much larger program that
always works on Intel, and always works on AMD's Threadripper with
lapply but not with mclapply. With mclapply on AMD, all processes go into
"suspend" mode and the program then hangs. The bug is replicable on an
AMD Ryzen Threadripper 2950X 16-Core Processor (128GB RAM) running the
latest Ubuntu 18.04, with R version 3.5.3 (2019-03-11) -- "Great Truth",
invoked with --vanilla. I hope this helps; it took quite a while to get
it to this stage. I sure hope that I am not reporting an old bug.
options("mc.cores"=4)
library(data.table)
library(parallel)
if (!file.exists("bugsample.csv")) {
    NR <- 64833330
    notused <- data.frame(v1=1:NR, v2=1:NR, v3=1:NR, x1=log(1:NR),
                          x2=log(1:NR))
    fwrite(notused, file="bugsample.csv")
    stop("you can quit now and restart the program")
}
## needed! Inf cannot be replaced by actual NR
if (!exists("notused")) notused <- fread("bugsample.csv", nrows= Inf)
sample <- data.frame( groupidentifier=c( rep(11111,2000), rep(22222, 4500 ) ) )
sample$yvar <- sin(1:nrow(sample))
sample$xvar <- 1:nrow(sample)
testfun <- function(dl) {
    with(dl, message("Working: ", first(groupidentifier), " with ",
                     nrow(dl)))
    lapply( 1:nrow(dl), FUN=function(onedayindex) {
        if ((onedayindex %% 500) != 0) return(NULL)
        with(dl[1:onedayindex,],
             c( tryCatch( coef(lm( yvar ~ xvar, data=dl[1:onedayindex,] ))[2],
                          error = function(e) NA ) ) )
    })
}
message("starting --- replicable hang with mclapply, but not lapply")
o <- mclapply(split( 1:nrow(sample), sample$groupidentifier ),
              FUN=function(.index) testfun( sample[.index, , drop=FALSE] ))
message("never gets here with mclapply")
print( do.call("c", o[[1]]) )
print( do.call("c", o[[2]]) )
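
For anyone who wants to compare the lapply and mclapply paths side by side, here is a minimal sketch of how the two could be switched with a flag and checked against each other. It is not part of the failing program: the names use_parallel, d, slope, and groups are made up for illustration, the data are a much smaller stand-in, and it assumes a Unix-like system where mclapply can fork. It does not load the large CSV, so I would not expect it to trigger the hang by itself.

```r
library(parallel)

# Hypothetical switch: set to FALSE to run the identical workload serially.
use_parallel <- TRUE

# Small stand-in data with two groups, so that this finishes quickly.
d <- data.frame(g = rep(c(11111, 22222), times = c(200, 450)))
d$y <- sin(seq_len(nrow(d)))
d$x <- seq_len(nrow(d))

# Slope of y ~ x for the rows of one group.
slope <- function(idx) {
    unname(coef(lm(y ~ x, data = d[idx, , drop = FALSE]))[2])
}

# Row indices per group, as in the split() call above.
groups <- split(seq_len(nrow(d)), d$g)

serial <- lapply(groups, slope)
forked <- if (use_parallel) mclapply(groups, slope, mc.cores = 2L) else serial

# When nothing hangs, both paths should give the same answer.
stopifnot(isTRUE(all.equal(serial, forked)))
```

If the forked path ever fails to return while the serial path does, that narrows the problem to the forking step rather than to the regression code.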
--
Ivo Welch (ivo.welch using ucla.edu)