[R] kmeans: how to retrieve clusters
Peter Langfelder
peter.langfelder at gmail.com
Tue Feb 28 07:46:34 CET 2012
On Mon, Feb 27, 2012 at 3:18 PM, ikuzar <razuki at hotmail.fr> wrote:
> Hello,
>
> I'd like to classify data with kmeans algorithm. In my case, I should get 2
> clusters in output. Here is my data
>
>   colCandInd colCandMed
> 1         82     2950.5
> 2         83     1831.5
> 3       1192     2899.0
> 4       1193     2103.5
>
> The first cluster is the two first lines
> the 2nd cluster is the two last lines
>
> Here is the code:
> x = colCandList$colCandInd
> y = colCandList$colCandMed
> m = matrix(c(x, y), nrow = length(colCandList$colCandInd), ncol=2)
> kres = kmeans(m, 2)
>
> Is there a way to retrieve both clusters from the output of the algorithm in
> order to process each cluster? (I am looking for something like
> kres$clustList ... where I can process each cluster)
>
> kres$cluster did not yield what I expected ...
I'm not sure what you mean by "process each cluster" or why kres$cluster
is not what you expected. kres$cluster tells you which cluster each
point (row of your matrix) belongs to. The result depends on how kmeans
is initialized (the starting centers are picked at random from the rows),
and here the inter-point distances are quite similar to one another, so
different starts can converge to different groupings. For example, I get
> set.seed(10)
> kres = kmeans(m, 2)
> kres$cluster
[1] 2 2 1 1
> set.seed(1)
> kres = kmeans(m, 2)
> kres$cluster
[1] 1 1 2 2
> set.seed(200)
> kres = kmeans(m, 2)
> kres$cluster
[1] 2 2 1 1
> kres = kmeans(m, 2)
> kres$cluster
[1] 1 2 1 2
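So 3 times out of 4 I get the grouping you expect (the labels 1 and 2 are
arbitrary, so 2 2 1 1 and 1 1 2 2 describe the same partition), and once
a different one, with rows 1 and 3 together and rows 2 and 4 together.
If you need the result in a different format, that should be no problem.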
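
For instance, here is a rough sketch of one way to get a list with one
element per cluster. It is only an illustration: colCandList is rebuilt
by hand from the data in the post, the names clustList, clusterMeans and
rowsByCluster are made up for this example, and the seed and nstart = 25
are arbitrary choices. Using nstart > 1 makes kmeans keep the best of
several random starts, which should make the odd grouping much less likely.

```r
# Data copied from the original post; colCandList is rebuilt here as a data frame.
colCandList <- data.frame(
  colCandInd = c(82, 83, 1192, 1193),
  colCandMed = c(2950.5, 1831.5, 2899.0, 2103.5)
)
m <- as.matrix(colCandList)

set.seed(1)  # arbitrary seed, only so the run is reproducible
# nstart = 25 (arbitrary): run 25 random initializations and keep the best one
kres <- kmeans(m, centers = 2, nstart = 25)

# One data frame per cluster; the list names are the cluster labels "1" and "2".
clustList <- split(colCandList, kres$cluster)

# "Process each cluster": here, simply compute the column means of each one.
clusterMeans <- lapply(clustList, colMeans)

# Row indices per cluster, in case only the positions are needed.
rowsByCluster <- split(seq_len(nrow(m)), kres$cluster)
```

clustList[["1"]] and clustList[["2"]] then hold the rows of each cluster,
and any function can be applied to them with lapply() or sapply().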