[R] performance of do.call("rbind")
Hervé Pagès
hpages at fredhutch.org
Mon Jun 27 21:58:42 CEST 2016
Hi,
Note that if your list of 200k data frames is the result of splitting
a big data frame, then rbind-ing the pieces back together is
equivalent to reordering the original big data frame. More precisely,
do.call(rbind, unname(split(df, f)))
is equivalent to
df[order(f), , drop=FALSE]
(except for the rownames), but the latter is *much* faster!
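
Here is a small sketch to check the equivalence on toy data. The data
frame, its column names (id, x, g) and the sizes are made up for
illustration; they are not from the original question. Row names are
reset on both results before comparing, since that is the one place
where the two are allowed to differ.

```r
# Toy data frame; 'g' plays the role of the splitting factor 'f'.
# All names and sizes here are assumptions for illustration only.
df <- data.frame(
  id = 1:12,
  x  = seq_len(12) * 0.5,
  g  = rep(c("b", "a", "c"), times = 4),
  stringsAsFactors = FALSE
)
f <- df$g

# Slow route: split into pieces, then bind the pieces back together.
via_rbind <- do.call(rbind, unname(split(df, f)))

# Fast route: reorder the rows of the original data frame directly.
via_order <- df[order(f), , drop = FALSE]

# Drop the row names before comparing the contents.
rownames(via_rbind) <- NULL
rownames(via_order) <- NULL
same <- isTRUE(all.equal(via_rbind, via_order))
print(same)

# Rough timing on a somewhat larger example (500 groups of 10 rows).
# Actual timings depend on the machine and the R version.
big  <- data.frame(id = 1:5000, grp = rep(seq_len(500), each = 10))
bigf <- big$grp
print(system.time(do.call(rbind, unname(split(big, bigf)))))
print(system.time(big[order(bigf), , drop = FALSE]))
```

This works because split() puts the groups in sorted level order and
keeps the original row order within each group, which is the same row
order that order(f) produces when it breaks ties by position.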
Cheers,
H.
On 06/27/2016 08:51 AM, Witold E Wolski wrote:
> I have a list (variable name data.list) with approx 200k data.frames
> with dim(data.frame) approx 100x3.
>
> a call
>
> data <-do.call("rbind", data.list)
>
> does not complete - run time is prohibitive (I killed the rsession
> after 5 minutes).
>
> I would think that merging data.frame's is a common operation. Is
> there a better function (more performant) that I could use?
>
> Thank you.
> Witold
>
--
Hervé Pagès
Program in Computational Biology
Division of Public Health Sciences
Fred Hutchinson Cancer Research Center
1100 Fairview Ave. N, M1-B514
P.O. Box 19024
Seattle, WA 98109-1024
E-mail: hpages at fredhutch.org
Phone: (206) 667-5791
Fax: (206) 667-1319