[R] Gap statistic

Nestor Fernandez nestor.fernandez at ufz.de
Thu Mar 10 14:00:44 CET 2005


Dear All,

I need to calculate the optimal number of clusters for a classification based on a large number of observations (tens of thousands).
Tibshirani et al. proposed the gap statistic for this purpose. I tried the R code developed by R. Jörnsten, but R hangs with this amount of data.
Is any other (optimised) code available?
Any help would be appreciated, including suggestions for alternative ways of selecting an optimal number of clusters from large datasets.

Thanks,


Néstor Fernández, PhD.

Department of Ecological Modelling
UFZ - Centre for Environmental Research
PF 500136, DE-04301, Leipzig, Germany.
Tel: +49 341-2352034
E-mail: nestor.fernandez at ufz.de



