[R] recursive function - finding connections

Peter Langfelder peter.langfelder at gmail.com
Fri Jul 15 02:05:07 CEST 2011


One more thing - for large data sets, the packages flashClust and
fastcluster provide much faster hierarchical clustering that (at least
for flashClust, which I maintain) gives exactly the same results as
the standard hclust. Simply insert a

library(flashClust)

before you call the function and your code will run much faster.
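
As a rough illustration of the approach, here is a minimal sketch on a
toy p-value matrix. The matrix values, the 0.05 cutoff and the names p,
hc and components are made up for this example and are not from Paul's
data. The sketch uses flashClust only if it happens to be installed and
falls back to the standard hclust otherwise.

```r
# Toy symmetric p-value matrix (made-up values, not from the thread):
# nodes 1-2-3 form a chain, nodes 4-5 are linked, node 6 is isolated.
p = matrix(1, 6, 6)
diag(p) = 0
p[1, 2] = p[2, 1] = 0.01
p[2, 3] = p[3, 2] = 0.02
p[4, 5] = p[5, 4] = 0.03

# Use flashClust if it is installed; otherwise fall back to stats::hclust.
hc = if (requireNamespace("flashClust", quietly = TRUE)) {
  flashClust::flashClust
} else {
  stats::hclust
}

# 1 = not connected, 0 = connected (0.05 is an assumed cutoff).
# Single linkage merges two groups at height 0 as soon as any pair of
# nodes between them is connected.
disconnected = (p > 0.05) * 1
tree = hc(as.dist(disconnected), method = "single")

# Cutting anywhere strictly between 0 and 1 yields the connected groups.
clusters = cutree(tree, h = 0.5)
components = unname(split(seq_along(clusters), clusters))
print(components)
```

Nodes 1 and 3 are not directly connected here, but they should end up
in the same group because both are connected to node 2.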

Peter

On Thu, Jul 14, 2011 at 4:58 PM, Peter Langfelder
<peter.langfelder at gmail.com> wrote:
> Hi Paul,
>
> I assume you are using the argument cutoff to specify the p-value
> below which nodes are considered connected and above which they are
> not connected.
>
> I would use single linkage hierarchical clustering. If you have two
> groups of nodes and any two nodes between the groups are connected
> (i.e. have adjacency =1 or dissimilarity 0), then the groups have
> dissimilarity 0. If no two nodes between the two groups are connected,
> you will get dissimilarity 1. Thus you can use any tree cut height
> strictly between 0 and 1 to get the clusters that correspond to
> connected subgraphs. For large data you will need a large computer to
> hold your distance matrix, but you must have observed that already.
>
> subgraphs = function(mat, cut)
> {
>  # 1 = not connected, 0 = connected. Change the inequality if necessary.
>  disconnected = (mat > cut) * 1
>  tree = hclust(as.dist(disconnected), method = "single")
>  # Any cut height strictly between 0 and 1 separates the connected groups.
>  clusters = cutree(tree, h = 0.5)
>  # clusters is already the answer, but you want it in a different
>  # format, so we reformat it.
>  nClusters = max(clusters)
>  connectedList = vector("list", nClusters)
>  for (cl in seq_len(nClusters))
>    connectedList[[cl]] = which(clusters == cl)
>  connectedList
> }
>
> Try it and see if this does what you want.
>
> HTH
>
> Peter
>
> On Thu, Jul 14, 2011 at 4:12 PM, Benton, Paul
> <hpaul.benton08 at imperial.ac.uk> wrote:
>> Sorry, bad example. My data is undirected. It's a correlation matrix, so it is probably better to look at something like:
>>
>> foomat<-cor(matrix(rnorm(100), ncol=10))
>> foomat
>>
>> Mine are p-values from the correlations, but it's the same idea.
>



-- 
Sent from my Linux computer. Way better than iPad :)


