[R-pkg-devel] Installation took CPU time XXX times elapsed time
Ivan Krylov
|kry|ov @end|ng |rom d|@root@org
Fri Oct 3 16:34:46 CEST 2025
On Wed, 1 Oct 2025 16:58:16 +0000
<fabian.bernhard using unibe.ch> wrote:
> The cookbook
> (https://contributor.r-project.org/cran-cookbook/code_issues.html)
> suggests use of the j-flag for compilation (which I am unsure how to
> implement).
Thank you for consulting the cookbook! If all you have is Makevars,
there should be no need for setting the -j flag anywhere; that advice
is there for the CMake+Ninja and Rust build systems, which default to
using all available cores.
> Whereas LLMs suggest to ensure all dependencies are stated in
> ‘src/Makevars’ (which I tried to ensure). However still no success
> with the current version of the code.
This is good advice (you need the dependencies in case someone does run
MAKEFLAGS=-j$(nproc) R CMD INSTALL rsofun_*.tar.gz), but it won't help
you with unintended installation-time parallelism. Your current
Makevars, as noted by Dirk, has mistakes in the stated dependencies:
>> all: $(SHLIB) clean
This is wrong because both are prerequisites of 'all', so a parallel
make is free to start 'clean' while $(SHLIB) is still being built,
removing the object files that $(SHLIB) actually depends upon.
Removing the 'all: clean' dependency (but keeping 'all: $(SHLIB)' as the
first target), removing the additional dependencies of $(SHLIB), and
using the output of the following Perl program for all dependency
information helps your package survive installation with MAKEFLAGS=-j2:
perl -lne '
  # object file corresponding to the source file being read
  my $o = $ARGV =~ s/[.]f90$/.o/r;
  # which object file provides each module
  $modules{$1} = $o
    if /^\s*module\s+(\w+)/;
  # which modules each object file needs
  push @{$uses{$o}}, $1
    if /^\s*use\s+(\w+)/;
}{
  while (my ($k, $v) = each %modules) {
    print "$k.mod: $v";
  }
  while (my ($k, $v) = each %uses) {
    print "$k: ", join " ", map "$_.mod", @$v;
  }
' *.f90
While the build system trying to be too helpful is a common case of
this NOTE (in which case the solution is easy), it is not the only
possible explanation. We've also seen GCC internally spawn too many
child processes for link-time optimisation (system setup problem, not a
package problem, since fixed) and a package accidentally spawning too
many OpenMP threads in .onLoad for a short while, affecting some of its
reverse dependencies (also fixed).
Let's try to see which parts of R CMD INSTALL create child threads.
It's not the only way extra CPU time could be spent (and not the only
way threads could be created), but you have to start somewhere.
wget https://cran.r-project.org/incoming/archive/rsofun_5.1.0.tar.gz
ltrace -w 10 -f -e pthread_create -- \
  sh -c 'R CMD INSTALL -l rsofun.Rcheck rsofun_5.1.0.tar.gz'
Does the package or its dependencies do something strange while
preparing the namespace? No, 'cli' only starts a single thread and
that's it. Does the compiler start extra threads? Why yes, it does:
flang-new-19 -fpic -g -c main_pmodel.mod.f90 -o main_pmodel.mod.o
[pid 13542] --- Called exec() ---
[pid 13542] --- Called exec() ---
[pid 13543] --- Called exec() ---
[pid 13543] libLLVM.so.19.1->pthread_create(140029587084992,
0x7ffe99915e20, 0x7f5b30e65ef0, 0x55ef27e69b60) = 0
/usr/lib/x86_64-linux-gnu/libLLVM.so.19.1(_ZN4llvm27llvm_execute_on_thread_implEPFPvS0_ES0_St8optionalIjE+0x60)
[0x7f5b30ec6be0]
/usr/lib/x86_64-linux-gnu/libLLVM.so.19.1(_ZN4llvm13StdThreadPool4growEi+0x17b)
[0x7f5b30e6477b]
<...>
/usr/bin/sh(_ZN7Fortran8frontend13CodeGenAction21beginSourceFileActionEv+0x102b)
[0x55eeebefb0bb]
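A trace like this gets long quickly, so it may help to tally the
pthread_create calls per process. The following is only a sketch: the
function name count_threads is made up, and it assumes that the trace
was saved to a file and that its lines carry the "[pid N]" prefix seen
in the excerpt above.

```shell
# count_threads FILE
# Hypothetical helper: prints "PID COUNT" for every process that called
# pthread_create, assuming lines of the form "[pid N] ...pthread_create(".
count_threads() {
    awk '/pthread_create\(/ && match($0, /^\[pid [0-9]+\]/) {
             # "[pid " is 5 characters long and "]" is 1 more
             pid = substr($0, 6, RLENGTH - 6)
             n[pid]++
         }
         END { for (p in n) print p, n[p] }' "$1" | sort -n
}

# Sample input copied from the trace above (wrapped line re-joined).
log=$(mktemp)
printf '%s\n' \
    '[pid 13542] --- Called exec() ---' \
    '[pid 13543] --- Called exec() ---' \
    '[pid 13543] libLLVM.so.19.1->pthread_create(140029587084992, 0x7ffe99915e20, 0x7f5b30e65ef0, 0x55ef27e69b60) = 0' \
    > "$log"

threads=$(count_threads "$log")
echo "$threads"    # prints: 13543 1

rm -f "$log"
```
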
On my 16-thread machine this varies per file (up to 16 threads in the
pool for the largest file) and results in an average load of 18s user /
15s real = 1.2. It's not implausible that the older 2x Xeon E5-2690
running the incoming pretests at CRAN [1] (with 32 threads in total)
end up at 280% average CPU load due to flang-new-19's multi-threading.
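The ratio above is plain division of user CPU time by elapsed time. As
a small sketch (the function name cpu_ratio is made up, and it follows
the user / real arithmetic used in this message rather than whatever
exact formula R CMD check applies):

```shell
# cpu_ratio USER_SECONDS ELAPSED_SECONDS
# Hypothetical helper: prints user CPU time divided by elapsed time.
cpu_ratio() {
    awk -v user="$1" -v real="$2" 'BEGIN { printf "%.2f\n", user / real }'
}

cpu_ratio 18 15    # prints 1.20
```

On a system with GNU time, the two inputs can probably be obtained with
something like /usr/bin/time -f '%U %e' in front of the R CMD INSTALL
command; a 280% load corresponds to a ratio of 2.8.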
Unfortunately, I couldn't find any way to disable LLVM multi-threading
at runtime, short of recompiling LLVM.
--
Best regards,
Ivan
[1]
https://cran.r-project.org/web/checks/check_flavors.html#r-devel-linux-x86_64-debian-clang