[R] Sampling the Distance Matrix

David L Carlson dcarlson at tamu.edu
Fri Sep 25 15:54:54 CEST 2015


You defined x and y in your original email as:

> x<-rnorm(20)
> y<-rnorm(20)
>
> mm<-as.matrix(cbind(x,y))
>
> dst<-(dist(mm))
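
Putting that setup together with the three lines David Winsemius posted gives a script that runs on its own. This is only a minimal sketch: the set.seed() call is my addition so the run is repeatable, and the number of qualifying combinations will change with the random draw.

```r
set.seed(42)   # assumption: added only for repeatability, any seed will do
x <- rnorm(20)
y <- rnorm(20)

# Each column of mtxcomb is one 5-point combination of the indices 1..20
mtxcomb <- combn(1:20, 5)

# TRUE for every column whose 5 points are all pairwise more than 0.9 apart
goodcls <- apply(mtxcomb, 2, function(idx) all(dist(cbind(x[idx], y[idx])) > 0.9))

# drop = FALSE keeps a matrix even if only one (or no) column qualifies
good <- mtxcomb[, goodcls, drop = FALSE]
dim(mtxcomb)   # 5 rows, choose(20, 5) = 15504 columns
ncol(good)     # number of qualifying combinations
```

The "object 'x' not found" error simply means that x and y had not been created in the session where the three lines were run.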

-------------------------------------
David L Carlson
Department of Anthropology
Texas A&M University
College Station, TX 77840-4352


-----Original Message-----
From: David Winsemius [mailto:dwinsemius at comcast.net] 
Sent: Thursday, September 24, 2015 6:30 PM
To: Lorenzo Isella
Cc: David L Carlson; r-help at r-project.org
Subject: Re: [R] Sampling the Distance Matrix


On Sep 24, 2015, at 1:54 PM, Lorenzo Isella wrote:

> On Thu, Sep 24, 2015 at 01:30:02PM -0700, David Winsemius wrote:
>> 
>> On Sep 24, 2015, at 12:36 PM, Lorenzo Isella wrote:
>> 
>>> Hi,
>>> And thanks for your reply.
>>> Essentially, your script gets the job done.
>>> For instance, if I run
>>> 
>>> mm <- cbind(5/(1:5), -2*sqrt(1:5))
>>> dst <- dist(mm)
>>> dst2 <- as.matrix(dst)
>>> diag(dst2) <- NA
>>> idx <- which(apply(dst2, 1, function(x) all(na.omit(x)>.9)))
>>> 
>>> then it correctly detects the first two rows, where all the values are
>>> larger than 0.9.
>>> In other words, it detects the points that are at least 0.9 units away
>>> from *all* the other points.
>>> My other question (I did not realize this until I got your answer) is
>>> the following: I have the distance matrix of a set of N points.
>>> You gave me an algorithm to find all the points that are at least 0.9
>>> units away from any other points.
>>> However, in some cases, even a weaker condition is OK for me: find
>>> a subset of k points (with k tunable) whose distance *from each other*
>>> is greater than 0.9 units (even if their distance from some other
>>> points may be smaller than 0.9).
>> 
>> If I understand ..... Make a matrix of unique combinations (one per column), then apply over the columns to find the ones that satisfy the distance criterion:
>> 
>> mtxcomb <- combn(1:20, 5)
>> goodcls <- apply(mtxcomb , 2, function(idx) all( dist( cbind( x[idx], y[idx]) ) > 0.9))
>> mtxcomb [ , goodcls]
>> 
>> In my sample, around 9% of all 5-item combinations qualified.
>> 
>> snipped a lot of output:
>> .....
>>   [,1440] [,1441]
>> [1,]      12      13
>> [2,]      13      16
>> [3,]      16      17
>> [4,]      19      19
>> [5,]      20      20
>>> dim( mtxcomb)
>> [1]     5 15504
>> 
> 
> Hi,
> Thanks for your reply.
> I think I am getting there, but when I run your commands, I get this
> error message
> 
> Error in cbind(x[idx], y[idx]) : object 'x' not found
> 
> Any idea why? Should I combine those 3 lines with something else?

No idea. I was running the setup that you asked for in your original message, which you have now omitted from the mail chain.
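
For the tunable-k version of the question, the same idea can be wrapped in a function that computes the distance matrix once and then indexes into it, instead of calling dist() for every combination. What follows is a sketch under stated assumptions, not a tested solution for large N: far_subsets and its arguments are hypothetical names, and the number of candidates is choose(N, k), which grows very quickly.

```r
# Hypothetical helper: returns every k-subset of the rows of `pts`
# (one subset per column) whose pairwise distances all exceed `thresh`.
far_subsets <- function(pts, k, thresh = 0.9) {
  d <- as.matrix(dist(pts))        # full distance matrix, computed once
  combos <- combn(nrow(pts), k)    # one candidate subset per column
  ok <- apply(combos, 2, function(idx) {
    sub <- d[idx, idx]
    all(sub[upper.tri(sub)] > thresh)   # upper triangle skips the zero diagonal
  })
  combos[, ok, drop = FALSE]
}

# The 5-point example from earlier in the thread
mm <- cbind(5 / (1:5), -2 * sqrt(1:5))
far_subsets(mm, k = 3)
```

Each column of the result holds the row indices of one qualifying subset; an empty matrix means no subset of that size meets the threshold.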



> Cheers
> 
> Lorenzo

David Winsemius
Alameda, CA, USA


