[R] recursive function - finding connections

Peter Langfelder peter.langfelder at gmail.com
Fri Jul 15 01:58:38 CEST 2011


Hi Paul,

I assume you are using the argument cutoff to specify the p-value
below which nodes are considered connected and above which they are
not connected.

I would use single linkage hierarchical clustering. Code connected
pairs of nodes as dissimilarity 0 (adjacency 1) and unconnected pairs
as dissimilarity 1. Under single linkage, two groups of nodes then have
dissimilarity 0 if at least one node in one group is connected to a
node in the other, and dissimilarity 1 if no such pair exists. Thus you
can use any tree cut height strictly between 0 and 1 to get the
clusters that correspond to the connected components. For large data
you will need a large computer to hold your distance matrix, but you
must have observed that already.
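
To put a rough number on the memory remark: a dist object stores the
n*(n-1)/2 lower-triangle entries, and each is 8 bytes once held as a
double. The helper below is only a back-of-the-envelope sketch; the
name distBytes is made up for illustration, and hclust needs some
working memory on top of this.

```r
# Hypothetical helper (not part of any package): approximate bytes needed
# to hold the lower triangle of an n x n dissimilarity matrix as doubles.
distBytes = function(n)
{
  n * (n - 1) / 2 * 8
}

# Example: approximate size in GiB for 50000 nodes.
distBytes(50000) / 2^30
```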

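subgraphs = function(mat, cut)
{
  # TRUE (dissimilarity 1) where the p-value is above the cutoff, i.e.
  # the two nodes are not connected. Change the inequality if necessary.
  disconnected = mat > cut
  tree = hclust(as.dist(disconnected), method = "single")
  # Any height strictly between 0 and 1 separates the connected groups.
  clusters = cutree(tree, h = 0.5)
  # clusters is already the answer, but you want it in a different
  # format, so we reformat it into a list of node indices.
  nClusters = max(clusters)
  connectedList = list()
  for (cl in seq_len(nClusters))
    connectedList[[cl]] = which(clusters == cl)
  connectedList
}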

Try it and see if this does what you want.
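
A small check you could run first: the sketch below uses a made-up
5 x 5 matrix of p-values (pmat is hypothetical, not your data) in which
nodes 1-2 and 2-3 are connected, nodes 4-5 are connected, and nothing
links the two groups. The function is repeated in compact form, using
split() instead of the loop, so that the block runs on its own.

```r
# Compact restatement of subgraphs() so this block is self-contained.
subgraphs = function(mat, cut)
{
  disconnected = (mat > cut) + 0   # 1 = not connected, 0 = connected
  tree = hclust(as.dist(disconnected), method = "single")
  clusters = cutree(tree, h = 0.5)
  # One list element per cluster, holding the node indices in it.
  unname(split(seq_along(clusters), clusters))
}

# Made-up symmetric p-value matrix: small p-value = connected.
pmat = matrix(0.9, 5, 5)
diag(pmat) = 0
pmat[1, 2] = pmat[2, 1] = 0.01
pmat[2, 3] = pmat[3, 2] = 0.02
pmat[4, 5] = pmat[5, 4] = 0.03

res = subgraphs(pmat, cut = 0.05)
print(res)
```

Nodes 1 and 3 are not directly connected here (p = 0.9), yet they should
land in the same group because both are connected to node 2. That is
the behaviour single linkage is being used for.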

HTH

Peter

On Thu, Jul 14, 2011 at 4:12 PM, Benton, Paul
<hpaul.benton08 at imperial.ac.uk> wrote:
> Sorry bad example. My data is undirected. It's a correlation matrix so probably better to look at something like:
>
> foomat<-cor(matrix(rnorm(100), ncol=10))
> foomat
>
> mine are pvalues from the correlation but same idea.


