[R] bottlenecks in R script
Joe Calderon
calderon.joe at gmail.com
Tue Mar 16 17:51:21 CET 2010
Hello *, I'm running into two major bottlenecks in an R script.
1. Going through a 40 MB file and reading it in via readLines() one line at
a time is almost an order of magnitude slower than the equivalent in
Python. I'm wondering if there are alternatives to readLines(); reading
more lines at a time helps a bit.
2. Generating date sequences takes a long time. I'm basically calling
something like seq.Date(Sys.Date(), length.out = 300, by = 'day') a lot.
While digging into it, I strace'd the running process, and it seems the
bulk of the system-call time is spent checking /etc/localtime:
stat("/etc/localtime", {st_mode=S_IFREG|0644, st_size=2819, ...}) = 0
strace -cp 2964
Process 2964 attached - interrupt to quit
^CProcess 2964 detached
% time     seconds  usecs/call     calls    errors syscall
------ ----------- ----------- --------- --------- ----------------
 94.61    0.006387           0     55872           stat
  2.58    0.000174           0       568           read
  1.42    0.000096           0       285           write
  1.39    0.000094           1       137           brk
------ ----------- ----------- --------- --------- ----------------
100.00    0.006751                 56862           total
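For point 2, here is a rough sketch of one workaround worth trying: look up the current date once and build the sequence by adding integer day offsets, rather than calling Sys.Date() and seq.Date() every time. The helper name day_seq is made up for illustration. I am assuming (not having verified it) that the repeated stat() calls come from time zone lookups triggered on each call, so fewer calls should mean fewer stats.

```r
# Hypothetical helper: n consecutive days starting at `from`,
# built with plain Date + integer arithmetic.
day_seq <- function(from, n) {
  from + (seq_len(n) - 1L)
}

# Look up "today" once and reuse it, instead of calling Sys.Date() repeatedly.
today <- Sys.Date()
days <- day_seq(today, 300L)

# Should hold the same dates as seq.Date(today, length.out = 300, by = 'day').
head(days, 3)
```

If the sequences are always anchored at today, computing `days` once and subsetting it may avoid the repeated work entirely.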
Has anybody run into similar problems?
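For point 1, this is roughly what I mean by doing more lines at a time: open the file as a connection once and pull a block of lines per readLines() call. It is only a sketch; the function name process_in_chunks, the chunk size of 10000, and the handle callback are placeholders I picked for illustration, not anything from the actual script.

```r
# Hypothetical helper: read `path` in blocks of `chunk_size` lines,
# pass each block to `handle`, and return the number of lines seen.
process_in_chunks <- function(path, chunk_size = 10000L,
                              handle = function(lines) NULL) {
  con <- file(path, open = "r")   # open once; readLines() then resumes where it left off
  on.exit(close(con))
  total <- 0L
  repeat {
    lines <- readLines(con, n = chunk_size, warn = FALSE)
    if (length(lines) == 0L) break   # end of file
    handle(lines)
    total <- total + length(lines)
  }
  total
}

# Small demo on a temporary file (stand-in for the real 40 MB input).
tmp <- tempfile()
writeLines(as.character(1:25), tmp)
n_read <- process_in_chunks(tmp, chunk_size = 10L)
```

Keeping the connection open matters here: calling readLines() on a file name rather than an open connection starts again from the top of the file each time.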