[R] Hierarchical Cluster Analysis with large dataset

Sarah Goslee sarah.goslee at gmail.com
Sun Nov 3 23:01:30 CET 2013


I think your dataset is too large to be interpretable, but in general
you should check out the cluster package, specifically clara(), which
is intended for use with large data.
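
As a rough sketch of what that could look like: clara() is a partitioning
(k-medoids) method, not a hierarchical one, and it works on subsamples so it
never builds the full distance matrix. The simulated matrix, k = 2, and the
samples/sampsize settings below are illustrative assumptions, not values
taken from the original question.

```r
# The 'cluster' package is a recommended package that ships with R.
library(cluster)

# Make the simulated data and clara()'s subsampling reproducible.
set.seed(42)

# Hypothetical stand-in data: 5000 rows, 20 numeric columns,
# drawn from two groups whose means differ by 3 in every column.
n <- 5000
p <- 20
x <- rbind(
  matrix(rnorm((n / 2) * p, mean = 0), ncol = p),
  matrix(rnorm((n / 2) * p, mean = 3), ncol = p)
)

# k = 2 is an assumption that suits this toy data only.
# samples and sampsize control how many subsamples are drawn and how large
# each one is; rngR = TRUE makes clara() use R's random number generator,
# so that set.seed() above applies to the subsampling as well.
fit <- clara(x, k = 2, metric = "euclidean",
             samples = 20, sampsize = 200, rngR = TRUE)

# Cluster sizes and the row indices of the chosen medoids.
print(table(fit$clustering))
print(fit$i.med)
```

Larger samples and sampsize values generally give more stable medoids at the
cost of run time, so it may be worth trying a few settings and comparing the
resulting cluster sizes.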


On Sun, Nov 3, 2013 at 4:42 AM, Petar Milin
<petar.milin at uni-tuebingen.de> wrote:
> Hello!
> Can anyone give me advice on running Hierarchical Cluster Analysis on large
> datasets? For example, 80000x10000. Calculating distances on such a
> data frame seems impossible even on a very powerful computer.
> Also, any other advice that would lead to a reduction of dimensionality,
> i.e., clustering/grouping variables, would be more than welcome.
> Many thanks,
> PM
Sarah Goslee
