[R] [dist] how to analyse a large matrix?
Charles C. Berry
cberry at tajo.ucsd.edu
Fri Aug 22 02:31:20 CEST 2008
On Thu, 21 Aug 2008, mcnda839 at mncn.csic.es wrote:
> Hi all,
>
> I have a matrix of about 100,000 x 4 that I need to classify using the
> Euclidean metric. For that I am using the dist or daisy functions, but I
> am afraid that the message "Error in vector("double", length) : vector
> size specified is too large" means there are too many rows.
>
Yes. The distance matrix for 100,000 rows holds about 5e9 pairwise distances; at 8 bytes each, that is roughly 40 gigabytes to store.
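As a rough check of that size estimate, here is a small sketch. It assumes dist() keeps only the lower triangle and stores each distance as an 8-byte double; the variable names are just for illustration.

```r
# Number of rows in the matrix described in the question
n <- 1e5

# dist() keeps one value per pair of rows: n * (n - 1) / 2 distances
n_pairs <- n * (n - 1) / 2

# Assumption: each distance is stored as a double (8 bytes)
bytes <- n_pairs * 8
gb <- bytes / 1e9

print(n_pairs)  # about 5e9 pairwise distances
print(gb)       # storage in (decimal) gigabytes
```

The number of distances grows with the square of the number of rows, so halving the data only cuts the storage to about a quarter.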
> Can anyone suggest how I should analyse this matrix?
Try something other than 'hierarchical clustering'.
See
http://cran.r-project.org/web/views/Cluster.html
for some suggestions.
kmeans(), perhaps?
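A minimal sketch of what that could look like, on simulated data standing in for the real matrix. The three simulated groups, k = 3, and the nstart/samples values are arbitrary assumptions for illustration, not recommendations. clara() from the 'cluster' package (the package that also provides daisy()) is shown as one other option that works on subsamples instead of the full distance matrix.

```r
library(cluster)  # provides clara()

# Simulated stand-in for the real data: 100,000 rows, 4 numeric columns,
# drawn from three well-separated groups (purely illustrative)
set.seed(1)
n <- 1e5
grp <- sample(1:3, n, replace = TRUE)
x <- matrix(rnorm(n * 4), ncol = 4) + 4 * grp  # shifts every column of row i by 4 * grp[i]

# k-means works on the rows directly and never forms the n x n distances.
# centers = 3 is an assumption; nstart repeats the fit from several random starts.
fit_km <- kmeans(x, centers = 3, nstart = 5, iter.max = 50)
print(table(fit_km$cluster))

# clara() finds medoids on subsamples, then assigns every row to its
# nearest medoid, so memory use stays small
fit_cl <- clara(x, k = 3, metric = "euclidean", samples = 20)
print(table(fit_cl$clustering))
```

On real data the number of clusters is not known in advance, so one would normally try several values of k and compare the results.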
HTH,
Chuck
>
> Thanks in advance,
>
> Diogo André Alagador
> MNCN,CSIC, Madrid, Spain
> ISA, Lisbon, Portugal
Charles C. Berry                            (858) 534-2098
                                            Dept of Family/Preventive Medicine
E mailto:cberry at tajo.ucsd.edu            UC San Diego
http://famprevmed.ucsd.edu/faculty/cberry/  La Jolla, San Diego 92093-0901