[R] shrink a dataframe for plotting

Alex Brown alex at transitive.com
Wed Nov 21 12:14:21 CET 2007


For me the largest challenge with such data sets is the extra time
that it takes to develop the appropriate graph, given the time it
takes to plot each prototype. Once I have got the graph's scale,
decorations, etc. correct, the time for the final plot is almost
irrelevant.

For this reason I often take a random subset of the data rows using
sample() and use this reduced set to develop the graph before switching
to the full dataset.
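
As a rough sketch of that workflow (the data frame 'big' and its columns
'x' and 'y' are made-up stand-ins, and 1000 rows is just an example size):

```r
# Hypothetical stand-in for a table with about a million rows.
set.seed(1)                      # only so the example is repeatable
n   <- 1e6
big <- data.frame(x = seq_len(n), y = cumsum(rnorm(n)))

# Draw 1000 distinct row indices at random and keep only those rows.
idx   <- sample(nrow(big), 1000)
small <- big[idx, ]

# Prototype the plot on the subset; once it looks right,
# swap 'small' for 'big' to produce the final plot.
plot(small$x, small$y, pch = ".")
```

sample() draws without replacement by default, so no row appears twice
in the subset.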

Since your data are monotonically decreasing, however, I suggest that
you take every 100th row instead; this should produce a graph
indistinguishable from the original at that resolution.
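
Taking every 100th row might look something like this (again, 'big' is
a made-up example, here a smoothly decreasing curve; adjust the step to
suit your plot width):

```r
# Hypothetical monotonically decreasing data with a million rows.
n   <- 1e6
big <- data.frame(x = seq_len(n), y = exp(-seq_len(n) / 2e5))

# Row indices 1, 101, 201, ...: every 100th row, starting from the first.
keep    <- seq(1, nrow(big), by = 100)
thinned <- big[keep, ]

# The thinned data still trace the same curve at screen resolution.
plot(thinned$x, thinned$y, type = "l")
```

Unlike a random sample, this keeps the rows in their original order and
evenly spaced, which suits data that change smoothly from row to row.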

-Alex Brown

On 21 Nov 2007, at 10:24, Thibaut Jombart <jombart at biomserv.univ-lyon1.fr> wrote:

> Alexy Khrabrov wrote:
>
>> I get tables with millions of rows.  For plotting to a screen-size
>> jpg, obviously just about 1000 points are enough.  Instead of feeding
>> plot() the original millions of rows, I'd rather shrink the original
>> dataframe, using some kind of the following interpolation:
>>
>> -- split dataframe into chunks of N rows each, e.g. 1000 rows each
>> -- compute average for each column
>> -- issue one new row of those averages into the shrunk result
>>
>> Is there any existing package to do that in R?  Otherwise, which R
>> idioms are most effective to achieve that?
>>
>> Cheers,
>> Alexy
>>
>>
>>
>>
> Hi,
>
> if you want to extract relevant information from such a table,
> splitting rows into arbitrary chunks may not solve your problem.
> Ordinations in reduced space are designed for that kind of task, but
> hierarchical clustering may also help. See Legendre & Legendre (1998,
> Numerical Ecology, Elsevier) for examples of such methods in ecology,
> and the R packages ade4 and vegan, as well as the function hclust.
>
> Regards,
>
> Thibaut.
>
> -- 
> ######################################
> Thibaut JOMBART
> CNRS UMR 5558 - Laboratoire de Biométrie et Biologie Evolutive
> Universite Lyon 1
> 43 bd du 11 novembre 1918
> 69622 Villeurbanne Cedex
> Tél. : 04.72.43.29.35
> Fax : 04.72.43.13.88
> jombart at biomserv.univ-lyon1.fr
> http://lbbe.univ-lyon1.fr/-Jombart-Thibaut-.html?lang=en
> http://pbil.univ-lyon1.fr/software/adegenet/
>
