[BioC] Extracting dendrogram information from Heatmaps

Thomas Girke thomas.girke at ucr.edu
Thu Dec 13 18:00:01 CET 2007


Alison,

In addition to James' suggestions, you may want to get familiar with how to access the
different data components of the resulting hclust object (e.g. labels, order) and
with the cutree() function. If you can't read the labels in the plots, you can
always extract them as plain text in the corresponding tree order from the hclust
object (see below: hr$labels[hr$order]).

Here is a short example to illustrate a possible hclust-heatmap/heatmap.2
routine:

# Generate a sample matrix
y <- matrix(rnorm(50), 10, 5, dimnames=list(paste("g", 1:10, sep=""), paste("t", 1:5, sep=""))) 

# Cluster rows and columns by correlation distance
hr <- hclust(as.dist(1-cor(t(y), method="pearson"))) 
hc <- hclust(as.dist(1-cor(y, method="spearman"))) 

# Obtain discrete clusters with cutree
mycl <- cutree(hr, h=max(hr$height)/1.5)

# Print the row labels in the order they appear in the tree
hr$labels[hr$order]
# Prints the row labels and cluster assignments
sort(mycl) 
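
To list the clusters and save them to a file, as asked in the original question, something along these lines should work. This is only a sketch: the file name clusters.txt, the temporary directory and the column names are placeholders of mine, and the sample data are regenerated so that the snippet runs on its own.

```r
# Self-contained sketch: regenerate the sample data and clustering
set.seed(1)  # arbitrary seed, only to make the example reproducible
y <- matrix(rnorm(50), 10, 5, dimnames=list(paste("g", 1:10, sep=""), paste("t", 1:5, sep="")))
hr <- hclust(as.dist(1-cor(t(y), method="pearson")))
mycl <- cutree(hr, h=max(hr$height)/1.5)

# One row per gene, listed in tree order, with its cluster number
ordered_labels <- hr$labels[hr$order]
clusters <- data.frame(gene=ordered_labels, cluster=unname(mycl[ordered_labels]),
                       stringsAsFactors=FALSE)

# Write a tab-delimited file; the path is a placeholder, adjust as needed
outfile <- file.path(tempdir(), "clusters.txt")
write.table(clusters, file=outfile, sep="\t", quote=FALSE, row.names=FALSE)

# List the members of each cluster
split(clusters$gene, clusters$cluster)
```

The resulting file can be opened in a spreadsheet program or read back with read.delim().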

# Some color selection steps
mycolhc <- sample(rainbow(256))
mycolhc <- mycolhc[as.vector(mycl)]

# Plot the data matrix as heatmap and the cluster results as dendrograms with heatmap or heatmap.2
# and show the cutree() results in color bar.
heatmap(y, Rowv=as.dendrogram(hr), Colv=as.dendrogram(hc), scale="row", RowSideColors=mycolhc) 
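
If heatmap() is left to do the clustering itself, the row and column order can still be recovered from its return value, which according to ?heatmap is an invisible list with the components rowInd and colInd. A minimal sketch with regenerated sample data; pdf(NULL) is only my way of suppressing the plot here and can be dropped:

```r
# Self-contained sketch: sample matrix as in the example above
set.seed(1)  # arbitrary seed, only to make the example reproducible
y <- matrix(rnorm(50), 10, 5, dimnames=list(paste("g", 1:10, sep=""), paste("t", 1:5, sep="")))

pdf(NULL)                      # discard the plot; drop this line to see the heatmap
hm <- heatmap(y, scale="row")  # default clustering, return value captured in hm
invisible(dev.off())

# Row and column labels in the order used by the heatmap
row_labels <- rownames(y)[hm$rowInd]
col_labels <- colnames(y)[hm$colInd]
row_labels
col_labels
```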

library("gplots") 
heatmap.2(y, Rowv=as.dendrogram(hr), Colv=as.dendrogram(hc), col=redgreen(75), scale="row",
          ColSideColors=heat.colors(length(hc$labels)), RowSideColors=mycolhc, trace="none",
          key=TRUE, cellnote=round(t(scale(t(y))),1))
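
The rect.hclust() route James mentions could look roughly like this. rect.hclust() draws boxes on an existing plot of the hclust object and invisibly returns a list with the members of each box, so no mouse work is needed. The choice of k=3 is arbitrary and only for illustration, and pdf(NULL) is just there to suppress the plot.

```r
# Self-contained sketch: sample data and row clustering as in the example above
set.seed(1)  # arbitrary seed, only to make the example reproducible
y <- matrix(rnorm(50), 10, 5, dimnames=list(paste("g", 1:10, sep=""), paste("t", 1:5, sep="")))
hr <- hclust(as.dist(1-cor(t(y), method="pearson")))

pdf(NULL)                      # discard the plot; drop this line to see the boxes
plot(hr)
boxes <- rect.hclust(hr, k=3)  # one list element per box, holding row indices
invisible(dev.off())

# Row labels contained in each box
box_labels <- lapply(boxes, function(b) hr$labels[b])
box_labels

# The same partition as a vector of cluster numbers
mycl3 <- cutree(hr, k=3)
```

Using h= instead of k= in both calls cuts the tree at a given height rather than into a fixed number of clusters.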


Best, 
Thomas

On Thu 12/13/07 09:58, James W. MacDonald wrote:
> Hi Alison,
> 
> alison waller wrote:
> > Hello Everyone,
> > 
> > I've been using heatmap and heatmap.2 to draw heatmaps for my experiments.
> > 
> > I have a heatmap of the M values of 6 arrays for the spots whose p-values were
> > <0.005 (from eBayes).
> > 
> > However, I would like to see which spots it has grouped together in the row
> > dendrogram.  Is there a way I can extract the information about the spots
> > that are clustered together?  I cannot read the row names, and even if I
> > could, I was hoping there would be some way to list the clusters and save them
> > to a file.
> 
> There are two ways to do this that I know of. And either can be a pain, 
> depending on how big the dendrogram is.
> 
> Both methods require you to construct your dendrogram first. You can 
> then choose the clusters with the mouse. This might be more difficult if 
> you have some gigantic dendrogram and have ingested too much coffee ;-D.
> 
> Normally, one would simply do
> 
> heatmap(mymatrix, otherargs)
> 
> and accept the default clustering method. However, you can always 
> pre-construct the dendrograms and then feed those to heatmap().
> 
> Rowv <- as.dendrogram(hclust(dist(mymatrix)))
> Colv <- as.dendrogram(hclust(dist(t(mymatrix))))
> 
> heatmap(mymatrix, Rowv=Rowv, Colv=Colv, otherargs)
> 
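> Now if you do something like that, then you can try
> 
> # identify() has a method for hclust objects, so plot and query the hclust result
> hr <- hclust(dist(mymatrix))
> plot(hr)
> a.cluster <- identify(hr)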
> 
> and then use your mouse to choose the upper left corner of a rectangle 
> that encompasses the cluster you are interested in. Here is where the 
> size of the dendrogram and the amount of coffee come in. If the 
> dendrogram is really large then identify() may not be able to figure out 
> what you are trying to select, or may decide you are choosing the upper 
> right corner.
> 
> You can choose as many clusters as you want, and they will be in the 
> list a.cluster, in the order you selected.
> 
> A more programmatic method is to use rect.hclust() and either choose the 
> height at which to make the cuts, or the number of clusters, etc. Again, 
> depending on the size of your dendrogram, this may work well or it may 
> be painful.
> 
> Best,
> 
> Jim
> 
> > 
> > Thanks,
> > 
> > Alison
> > 
> > ******************************************
> > Alison S. Waller  M.A.Sc.
> > Doctoral Candidate
> > awaller at chem-eng.utoronto.ca
> > 416-978-4222 (lab)
> > Department of Chemical Engineering
> > Wallberg Building
> > 200 College st.
> > Toronto, ON
> > M5S 3E5
> > 
> > _______________________________________________
> > Bioconductor mailing list
> > Bioconductor at stat.math.ethz.ch
> > https://stat.ethz.ch/mailman/listinfo/bioconductor
> > Search the archives: http://news.gmane.org/gmane.science.biology.informatics.conductor
> 
> -- 
> James W. MacDonald, M.S.
> Biostatistician
> Affymetrix and cDNA Microarray Core
> University of Michigan Cancer Center
> 1500 E. Medical Center Drive
> 7410 CCGC
> Ann Arbor MI 48109
> 734-647-5623
> 

-- 
Thomas Girke
Assistant Professor of Bioinformatics
Director, IIGB Bioinformatic Facility
Center for Plant Cell Biology (CEPCEB)
Institute for Integrative Genome Biology (IIGB)
Department of Botany and Plant Sciences
1008 Noel T. Keen Hall
University of California
Riverside, CA 92521

E-mail: thomas.girke at ucr.edu
Website: http://faculty.ucr.edu/~tgirke
Ph: 951-827-2469
Fax: 951-827-4437
