[R] Sampling the Distance Matrix
David L Carlson
dcarlson at tamu.edu
Fri Sep 25 15:54:54 CEST 2015
You defined x and y in your original email as:
> x<-rnorm(20)
> y<-rnorm(20)
>
> mm<-as.matrix(cbind(x,y))
>
> dst<-(dist(mm))
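
The "object 'x' not found" error goes away once x and y exist in the workspace before the combn() lines are run. A minimal sketch of the full sequence, assuming a fresh R session; the set.seed() call and the name `good` are assumptions added for illustration and are not from the thread:

```r
# Setup from the original message; the seed is an assumption added
# only so that the run is reproducible.
set.seed(1)
x <- rnorm(20)
y <- rnorm(20)

# All 5-point combinations of the 20 indices, one combination per column.
mtxcomb <- combn(1:20, 5)

# For each column, TRUE if every pairwise distance among its points exceeds 0.9.
goodcls <- apply(mtxcomb, 2, function(idx) all(dist(cbind(x[idx], y[idx])) > 0.9))

# Keep the qualifying combinations; drop = FALSE keeps a matrix even if
# zero or one column qualifies.
good <- mtxcomb[, goodcls, drop = FALSE]
dim(mtxcomb)
ncol(good)
```

The number of qualifying columns depends on the random draw, so it will differ from run to run unless a seed is set.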
-------------------------------------
David L Carlson
Department of Anthropology
Texas A&M University
College Station, TX 77840-4352
-----Original Message-----
From: David Winsemius [mailto:dwinsemius at comcast.net]
Sent: Thursday, September 24, 2015 6:30 PM
To: Lorenzo Isella
Cc: David L Carlson; r-help at r-project.org
Subject: Re: [R] Sampling the Distance Matrix
On Sep 24, 2015, at 1:54 PM, Lorenzo Isella wrote:
> On Thu, Sep 24, 2015 at 01:30:02PM -0700, David Winsemius wrote:
>>
>> On Sep 24, 2015, at 12:36 PM, Lorenzo Isella wrote:
>>
>>> Hi,
>>> And thanks for your reply.
>>> Essentially, your script gets the job done.
>>> For instance, if I run
>>>
>>> mm <- cbind(5/(1:5), -2*sqrt(1:5))
>>> dst <- dist(mm)
>>> dst2 <- as.matrix(dst)
>>> diag(dst2) <- NA
>>> idx <- which(apply(dst2, 1, function(x) all(na.omit(x)>.9)))
>>>
>>> then it correctly detects the first two rows, where all the values are
>>> larger than 0.9.
>>> In other words, it detects the points that are at least 0.9 units away
>>> from *all* the other points.
>>> My other question (I did not realize this until I got your answer) is
>>> the following: I have the distance matrix of a set of N points.
>>> You gave me an algorithm to find all the points that are at least 0.9
>>> units away from any other points.
>>> However, in some cases an even weaker condition is OK for me: find
>>> a subset of k points (with k tunable) whose distance *from each other*
>>> is greater than 0.9 units (even if their distance from some other
>>> points may be smaller than 0.9).
>>
>> If I understand ..... Make a matrix of unique combinations (one per column), then apply over the columns to flag the combinations that satisfy the distance criterion:
>>
>> mtxcomb <- combn(1:20, 5)
>> goodcls <- apply(mtxcomb , 2, function(idx) all( dist( cbind( x[idx], y[idx]) ) > 0.9))
>> mtxcomb [ , goodcls]
>>
>> In my sample, about 9% of all 5-item combinations qualified.
>>
>> snipped a lot of output:
>> .....
>>      [,1440] [,1441]
>> [1,]      12      13
>> [2,]      13      16
>> [3,]      16      17
>> [4,]      19      19
>> [5,]      20      20
>> > dim(mtxcomb)
>> [1]     5 15504
>>
>
> Hi,
> Thanks for your reply.
> I think I am getting there, but when I run your commands, I get this
> error message
>
> Error in cbind(x[idx], y[idx]) : object 'x' not found
>
> Any idea why? Should I combine those 3 lines with something else?
No idea. I was running the setup that you asked for in your original message which you have now omitted from the mail chain.
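
When only the distance matrix is at hand, the same check can be done by indexing into it, with no need for x and y. The following is a sketch under that assumption; the helper far_apart() and the names k and subsets are hypothetical, and the random points merely stand in for real data:

```r
# Stand-in data (assumption): in practice dst2 would be the existing
# distance matrix, already converted with as.matrix().
set.seed(1)
mm <- cbind(x = rnorm(20), y = rnorm(20))
dst2 <- as.matrix(dist(mm))

k <- 5                               # tunable subset size
mtxcomb <- combn(nrow(dst2), k)      # one candidate subset per column

# Hypothetical helper: TRUE if all pairwise distances within idx exceed 0.9.
far_apart <- function(idx) {
  sub <- dst2[idx, idx]
  all(sub[upper.tri(sub)] > 0.9)
}

goodcls <- apply(mtxcomb, 2, far_apart)
subsets <- mtxcomb[, goodcls, drop = FALSE]
ncol(subsets)
```

Because choose(N, k) grows quickly, this brute-force enumeration is only practical for small N and k.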
> Cheers
>
> Lorenzo
David Winsemius
Alameda, CA, USA