[R] Extracting a Sub-Tree from a Dendrogram (based on some criteria)

Leo Mada |eo@m@d@ @end|ng |rom @yon|c@eu
Tue Aug 19 23:07:11 CEST 2025


Dear R-Users,

I would like to extract a branch (sub-tree) from an existing tree (dendrogram) which fulfils the following conditions:
- it includes a specified leaf;
- it has the smallest possible number of leaves, but at least a specified number n.

In other words, I want to extract roughly the n leaves most similar to a given leaf.

Does anyone know of a package that has this functionality?

I have some working code, but it is a quick hack and not very robust. Before investing more time in it, I would like to check whether such functionality already exists.

I looked through the Cluster Task View and also explored the dendextend and ape packages (and a few more), but I did not spot such functionality.
https://cran.r-project.org/web/views/Cluster.html

My current code is on GitHub (see link below).

An example would look like this:

data(iris)

irisClust = iris[,-5]
d = dist(irisClust, method = "euclidean")
x = hclust(d, method="ward.D")
x$labels = paste0("L", 1:nrow(irisClust))

# 1  = Must contain leaf 1;
# 20 = Must cover at least 20 leaves;
tmp = subtree.nc(1, 20, x);
plot(tmp)

The function subtree.nc and its dependencies (count.nodes, subtree.nn and order.tree) are in the file on GitHub linked below; the code is a bit too long for this post. All functions in that file are self-contained and do not depend on other files or modules.

# GitHub:
https://github.com/discoleo/PeptideClassifier/blob/main/R/Helper.Tree.R
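
To make the intended behaviour concrete, here is a minimal sketch in base R. It is not the code from Helper.Tree.R: the function name smallest.branch and its arguments are hypothetical, and it assumes the tree is first converted with as.dendrogram and that the leaf is identified by its label. It descends from the root and stops as soon as the child holding the leaf would have fewer than n leaves.

```r
# Hypothetical helper (not the subtree.nc from GitHub): return the smallest
# branch of a dendrogram that contains the leaf `label` and has >= n leaves.
smallest.branch = function(dend, label, n) {
  stopifnot(label %in% labels(dend), attr(dend, "members") >= n)
  repeat {
    if (is.leaf(dend)) return(dend)
    # Locate the child branch that holds the requested leaf.
    hit = 0L
    for (i in seq_along(dend)) {
      if (label %in% labels(dend[[i]])) { hit = i; break }
    }
    child = dend[[hit]]
    # Going deeper would drop below n leaves: the current node is the answer.
    if (attr(child, "members") < n) return(dend)
    dend = child
  }
}

# Same setup as in the example above.
d = dist(iris[, -5], method = "euclidean")
x = hclust(d, method = "ward.D")
x$labels = paste0("L", 1:nrow(iris))

branch = smallest.branch(as.dendrogram(x), "L1", 20)
print(attr(branch, "members"))  # number of leaves in the extracted branch
```

The result is an ordinary dendrogram, so plot(branch) should work on it. Each step calls labels() on a child, which is simple but may be slow on large trees.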

There are a few pre-computed moderate-size trees also on GitHub (for more realistic exploration):
https://github.com/discoleo/PeptideClassifier/tree/main/inst/examples

Many thanks in advance for any useful pointers.

Sincerely,

Leonard
