[R] Is it possible to obtain an agglomeration schedule with R cluster analysis
Bob Green
bgreen at dyson.brisnet.org.au
Sun Feb 24 00:33:27 CET 2013
William,
Many thanks. I'll check this against my data tomorrow when I'm back
at work. This looks like just what I wanted.
Regards
Bob
At 09:27 AM 24/02/2013, William Dunlap wrote:
>You didn't show what the tabular summary should look like.
>However, look at the height and merge components of
>an hclust object:
>
> > hc3 <- hclust(dist(USArrests[1:8, c(1,2,4)]))
> > data.frame(hc3[2:1])
>      height merge.1 merge.2
>1   9.297849      -1      -8
>2  13.609188      -2      -5
>3  23.779193      -4      -6
>4  33.865321      -3       2
>5  48.229659       1       3
>6 104.636227       4       5
>7 185.135221      -7       6
>The two merge.* columns identify which groups merged at
>the corresponding height value. A negative value, -i, refers to the
>i'th leaf (the i'th entry of the 'labels' component), and a positive
>value, i, refers to the cluster created in the i'th row of the
>data.frame. The following function transforms those references into names:
>
>f <- function(hc){
>    data.frame(row.names=paste0("Cluster",seq_along(hc$height)),
>        height=hc$height,
>        components=ifelse(hc$merge<0,
>            hc$labels[abs(hc$merge)], paste0("Cluster",hc$merge)),
>        stringsAsFactors=FALSE)
>}
>
>as in
> > f(hc3)
>             height components.1 components.2
>Cluster1   9.297849      Alabama     Delaware
>Cluster2  13.609188       Alaska   California
>Cluster3  23.779193     Arkansas     Colorado
>Cluster4  33.865321      Arizona     Cluster2
>Cluster5  48.229659     Cluster1     Cluster3
>Cluster6 104.636227     Cluster4     Cluster5
>Cluster7 185.135221  Connecticut     Cluster6
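
A possible extension of f(), closer to an SPSS-style agglomeration schedule: besides the two merged items and the height, each stage can also report how many cases the new cluster contains. This is only a sketch. cluster_sizes() is a hypothetical helper name (not a function from stats), and it assumes the merge matrix follows the convention described above (negative entry = single case, positive entry = cluster formed at an earlier stage).

```r
# Hypothetical helper: number of cases in the cluster formed at each stage.
cluster_sizes <- function(hc) {
  n_merges <- nrow(hc$merge)
  size <- integer(n_merges)
  for (i in seq_len(n_merges)) {
    m <- hc$merge[i, ]
    # A negative entry is a single case (size 1); a positive entry j is the
    # cluster formed at stage j, whose size has already been computed.
    size[i] <- sum(ifelse(m < 0, 1L, size[abs(m)]))
  }
  size
}

hc3 <- hclust(dist(USArrests[1:8, c(1, 2, 4)]))
schedule3 <- data.frame(stage    = seq_along(hc3$height),
                        cluster1 = hc3$merge[, 1],
                        cluster2 = hc3$merge[, 2],
                        height   = hc3$height,
                        size     = cluster_sizes(hc3))
schedule3
```

The size reported for the last stage should equal the number of cases, which gives a quick sanity check on the result.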
>
>Compare that to the output of str(as.dendrogram(hc3)):
>
> > str(as.dendrogram(hc3))
>--[dendrogram w/ 2 branches and 8 members at h = 185]
>  |--leaf "Connecticut"
>  `--[dendrogram w/ 2 branches and 7 members at h = 105]
>     |--[dendrogram w/ 2 branches and 3 members at h = 33.9]
>     |  |--leaf "Arizona"
>     |  `--[dendrogram w/ 2 branches and 2 members at h = 13.6]
>     |     |--leaf "Alaska"
>     |     `--leaf "California"
>     `--[dendrogram w/ 2 branches and 4 members at h = 48.2]
>        |--[dendrogram w/ 2 branches and 2 members at h = 9.3]
>        |  |--leaf "Alabama"
>        |  `--leaf "Delaware"
>        `--[dendrogram w/ 2 branches and 2 members at h = 23.8]
>           |--leaf "Arkansas"
>           `--leaf "Colorado"
>
>Does f() produce the information you need for your display?
>
>Bill Dunlap
>Spotfire, TIBCO Software
>wdunlap tibco.com
>
>
> > -----Original Message-----
> > From: r-help-bounces at r-project.org [mailto:r-help-bounces at r-project.org] On Behalf Of Bob Green
> > Sent: Saturday, February 23, 2013 12:49 PM
> > To: Uwe Ligges
> > Cc: r-help at r-project.org
> > Subject: Re: [R] Is it possible to obtain an agglomeration schedule with R cluster analysis
> >
> > Hello Uwe,
> >
> > Thanks. Re-reading the hclust pages, I found that with the
> > 'USArrests' data from the hclust examples, the command plot(hc1)
> > shows the order in which cases joined. However, I still can't see
> > how to obtain the height at which each case joined a cluster, or
> > the height at which clusters merge.
> >
> >
> > The dendrogram {stats} page provides the following code, which
> > produces the information that I require. However, what I would like
> > to obtain is a table of the heights at which each cluster formed.
> >
> > > hc <- hclust(dist(USArrests), "ave")
> > > (dend1 <- as.dendrogram(hc)) # "print()" method
> > > str(dend1) # "str()" method
> >
> > I also found as.hclust which plots what I want, but I still can't
> > find a way to produce the actual height values which are being
> > plotted, for example as a tabular summary.
> >
> > plot(hc) ; mtext("hclust", side=1)
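
One way to get the plotted heights as a table, sketched for the same hc object used above: the heights drawn by plot(hc) are stored in hc$height, one per merge, with the merged items in the matching row of hc$merge. The data frame name `schedule` and its column names are illustrative choices, not anything defined by hclust. cophenetic() from stats additionally gives, for every pair of cases, the height at which the two first end up in the same cluster.

```r
hc <- hclust(dist(USArrests), "ave")

# One row per merge: the two items joined and the height of the join.
# Negative numbers are single cases (index into hc$labels); positive
# numbers refer to the cluster formed at that earlier stage.
schedule <- data.frame(stage    = seq_along(hc$height),
                       cluster1 = hc$merge[, 1],
                       cluster2 = hc$merge[, 2],
                       height   = hc$height)
head(schedule)

# Pairwise view: height at which two given cases first share a cluster.
coph <- as.matrix(cophenetic(hc))
coph["Alabama", "Alaska"]
```

With n cases there are n - 1 merges, so `schedule` has 49 rows for the 50 states.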
> >
> > Any assistance is appreciated,
> >
> > Bob
> >
> >
> >
> > At 04:01 AM 24/02/2013, Uwe Ligges wrote:
> >
> >
> > >On 22.02.2013 11:41, Bob Green wrote:
> > >>Hello,
> > >>
> > >>In SPSS, the cluster analysis output includes an agglomeration schedule,
> > >>which details the stages at which cases are joined.
> > >>
> > >>Is it possible to obtain such output when performing cluster analysis in
> > >>R? If so, I'd appreciate advice regarding how to obtain this information.
> > >
> > >
> > >If you are talking about hierarchical clustering via hclust(), see ?hclust
> > >It tells you that the relevant information is available inside the
> > >object and you can even see it via the plot method.
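
To make the pointer to ?hclust concrete, here is a short sketch of looking inside the fitted object. It uses the USArrests example with average linkage as an assumption about the data; the component names (merge, height, order, labels) are the ones documented in ?hclust.

```r
hc <- hclust(dist(USArrests), "ave")

names(hc)            # components stored in the hclust object
head(hc$merge)       # which items were joined at each step
head(hc$height)      # height of each join, in the same row order as merge
hc$labels[hc$order]  # case names in the left-to-right order used by plot(hc)
```

Row i of hc$merge and element i of hc$height describe the same step, so the two can be read side by side.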
> > >
> > >Uwe Ligges
> > >
> > >
> > >
> > >>
> > >>Any assistance is appreciated,
> > >>
> > >>Regards
> > >>
> > >>Bob
> > >>
> > >>______________________________________________
> > >>R-help at r-project.org mailing list
> > >>https://stat.ethz.ch/mailman/listinfo/r-help
> > >>PLEASE do read the posting guide
> > >>http://www.R-project.org/posting-guide.html
> > >>and provide commented, minimal, self-contained, reproducible code.