[R] Memory exhausted with "dist" in "mva" library
Martin Maechler
maechler at stat.math.ethz.ch
Thu May 23 14:49:44 CEST 2002
>>>>> "Kenneth" == Kenneth Cabrera <krcabrer at epm.net.co> writes:
Kenneth> I have a database with 25000 rows and 30 columns
Kenneth> and I want to make cluster analysis to cluster the
Kenneth> 25000 records,
Kenneth> but the memory exhausted using the "dist" function
Kenneth> in "mva" library. I use the "--max-mem-size" up to
Kenneth> 1780Mb (If I use more the R returns me a error message)
Kenneth> What can I do?
Avoid distance (dissimilarity) based clustering methods if
possible: not computing all pairwise distances saves a lot of memory.
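To see why dist() runs out of memory here, a rough back-of-the-envelope sketch follows. It assumes each distance is stored as an 8-byte double and ignores any temporary copies R may make, so the real requirement is probably higher.

```r
# Rough memory estimate for the result of dist() on n rows.
# Assumption: one 8-byte double per pairwise distance, no overhead counted.
n       <- 25000
n_pairs <- n * (n - 1) / 2     # number of pairwise distances
bytes   <- n_pairs * 8         # storage for the distances alone
gib     <- bytes / 1024^3      # same amount expressed in GiB
cat(n_pairs, "distances, about", round(gib, 2), "GiB\n")
```

The result alone is already larger than the 1780Mb limit mentioned in the question.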
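library(cluster) {and other non-base CRAN packages} is
recommended for more flexibility.
In particular, I'd strongly recommend using daisy() instead of dist()
(though daisy() also computes all pairwise dissimilarities, so by
itself it does not solve the memory problem).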
clara() {in the cluster package} was written for
"Clustering LARge Applications" (hence the name CLARA),
but there are many more cluster methodologies that work with the
Euclidean (or Manhattan) metric directly instead of first
computing the n(n-1)/2 distances.
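As a minimal sketch of what this could look like, the following uses simulated data as a stand-in for the real 25000 x 30 table. The choice of k = 2, the simulated groups, and the use of kmeans() as one example of a method that works on the data matrix directly are all assumptions for illustration, not part of the original question. (In current R versions kmeans() lives in the stats package.)

```r
library(cluster)   # provides clara()

set.seed(1)
# Hypothetical stand-in data: 25000 rows, 30 numeric columns,
# made of two groups with shifted means.
n <- 25000
p <- 30
x <- rbind(matrix(rnorm(n / 2 * p),           ncol = p),
           matrix(rnorm(n / 2 * p, mean = 3), ncol = p))

# clara() works on subsamples of the rows, so the full set of
# n(n-1)/2 dissimilarities is never stored.
fit <- clara(x, k = 2, metric = "euclidean", samples = 5)
table(fit$clustering)

# kmeans() also operates on the data matrix itself,
# without a precomputed distance object.
km <- kmeans(x, centers = 2, nstart = 3)
table(km$cluster)
```

How many clusters to ask for and whether to standardize the columns first depend on the data, so treat the arguments above only as a starting point.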
I hope this gets you started.
Martin Maechler <maechler at stat.math.ethz.ch> http://stat.ethz.ch/~maechler/
Seminar fuer Statistik, ETH-Zentrum LEO C16 Leonhardstr. 27
ETH (Federal Inst. Technology) 8092 Zurich SWITZERLAND
phone: x-41-1-632-3408 fax: ...-1228 <><