[R] locating element in distance matrix
David L Carlson
dcarlson at tamu.edu
Fri Jan 11 23:41:13 CET 2013
If you really have a matrix to begin with, yes. But if you generated it from
dist() or its relations, you would have to convert it to a matrix (roughly
doubling the memory needed). The various hierarchical cluster functions
usually want a dist object.
> dm <- dist(x, diag=TRUE, upper=TRUE)
> str(dm)
Class 'dist' atomic [1:45] 3.84 4.09 3.64 4.94 4.33 ...
..- attr(*, "Size")= int 10
..- attr(*, "Diag")= logi TRUE
..- attr(*, "Upper")= logi TRUE
..- attr(*, "method")= chr "euclidean"
..- attr(*, "call")= language dist(x = x, diag = TRUE, upper = TRUE)
In dist(), diag=TRUE and upper=TRUE affect only how the object is
printed. It is still stored as a single vector holding the lower triangle:
> round(dm, 3)
       1     2     3     4     5     6     7     8     9    10
1  0.000 3.843 4.094 3.643 4.935 4.328 4.288 6.205 6.197 2.181
2  3.843 0.000 5.085 5.171 5.067 3.788 4.384 5.770 7.113 2.830
3  4.094 5.085 0.000 3.571 4.548 4.103 3.532 3.917 6.470 3.734
4  3.643 5.171 3.571 0.000 3.821 3.843 3.667 5.513 5.176 3.294
5  4.935 5.067 4.548 3.821 0.000 4.815 3.465 5.918 6.138 4.764
6  4.328 3.788 4.103 3.843 4.815 0.000 2.794 3.937 5.475 3.023
7  4.288 4.384 3.532 3.667 3.465 2.794 0.000 4.075 5.251 4.010
8  6.205 5.770 3.917 5.513 5.918 3.937 4.075 0.000 5.511 5.152
9  6.197 7.113 6.470 5.176 6.138 5.475 5.251 5.511 0.000 6.168
10 2.181 2.830 3.734 3.294 4.764 3.023 4.010 5.152 6.168 0.000
> dm[1]
[1] 3.843183
> dm[2, 1]
Error in dm[2, 1] : incorrect number of dimensions
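
If all you need is a single element, one way around the error above is to
compute its position in the underlying vector instead of converting to a
matrix. The sketch below assumes the column-wise lower-triangle layout
described in ?dist. dist_index() is a hypothetical helper name, not part of
base R and not something posted earlier in this thread.

```r
# Hypothetical helper (an assumption, not from the thread): position of
# element (i, j), i != j, in the vector underlying a "dist" object of size n.
# Assumes the lower triangle is stored column by column, as ?dist documents.
dist_index <- function(i, j, n) {
  stopifnot(all(i != j), all(i >= 1), all(j >= 1), all(i <= n), all(j <= n))
  lo <- pmin(i, j)   # column within the lower triangle
  hi <- pmax(i, j)   # row within the lower triangle
  n * (lo - 1) - lo * (lo - 1) / 2 + hi - lo
}

set.seed(42)
x  <- matrix(rnorm(100), 10, 10)
dm <- dist(x)
n  <- attr(dm, "Size")

# Look up element (2, 1) without building the full matrix
dm[dist_index(2, 1, n)]

# Cross-check against the full-matrix version
isTRUE(all.equal(dm[dist_index(2, 1, n)], as.matrix(dm)[2, 1]))
```

Because pmin() and pmax() are vectorized, the same call should work on
vectors of row and column numbers. The diagonal is not stored, so i == j is
rejected rather than returning 0.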
----------------------------------------------
David L Carlson
Associate Professor of Anthropology
Texas A&M University
College Station, TX 77843-4352
> -----Original Message-----
> From: David Winsemius [mailto:dwinsemius at comcast.net]
> Sent: Friday, January 11, 2013 4:21 PM
> To: dcarlson at tamu.edu
> Cc: 'eliza botto'; r-help at r-project.org
> Subject: Re: [R] locating element in distance matrix
>
>
> On Jan 11, 2013, at 1:51 PM, David L Carlson wrote:
>
> > If you have a dist object (created by dist()) or if you used lower.tri(x) to
> > extract the lower triangle of the matrix, which() will not work since the
> > matrix is now stored as a numeric vector with n(n-1)/2 elements where n is
> > the number of rows/columns. In that case you must compute the original
> > row/column values from the position along the vector:
> >
> >> dwhich <- function(d, indx) {
> > + i <- round((1+sqrt(1+8*length(d)))/2, 0)
> > + rowd <- unlist(sapply(2:i, function(x) x:i))
> > + cold <- rep(1:(i-1), (i-1):1)
> > + return(data.frame(indx=indx, row=rowd[indx], col=cold[indx]))
> > + }
>
> Wouldn't it be easier to leave the distance matrix structure intact and
> just make the diagonal and upper.tri positions Inf?
>
> > dwhich <- function(d) {
> + d[row(d) <= col(d)] <- Inf
> + which(d == min(d,na.rm=FALSE), arr.ind=TRUE)
> + }
> > dwhich(dm)
> row col
> 10 10 1
>
> --
>
> >> set.seed(42)
> >> x <- matrix(rnorm(100), 10, 10)
> >> d <- dist(x)
> >> dm <- as.matrix(dist(x, diag=TRUE, upper=TRUE))
> >> dm <- dm[lower.tri(dm)]
> >> dwhich(d, which(d==min(d)))
> > indx row col
> > 1 9 10 1
> >> dwhich(dm, which(dm==min(dm)))
> > indx row col
> > 1 9 10 1
> >
> >
> >> -----Original Message-----
> >> From: r-help-bounces at r-project.org [mailto:r-help-bounces at r-project.org] On Behalf Of David Winsemius
> >> Sent: Friday, January 11, 2013 12:37 PM
> >> To: eliza botto
> >> Cc: r-help at r-project.org
> >> Subject: Re: [R] locating element in distance matrix
> >>
> >>
> >> On Jan 11, 2013, at 9:55 AM, eliza botto wrote:
> >>
> >>>
> >>> Dear useRs,
> >>> I have a very basic question. I have a distance matrix and I skipped
> >>> the upper part of it deliberately.
> >>
> >> I have no idea what that means. Code is always helpful in resolving
> >> ambiguities.
> >>
> >>> The distance matrix is 1000*1000. Then I used the "min" command to
> >>> extract the lowest value from that matrix. Now I want to know the
> >>> location of that lowest element. More precisely, the row and
> >>> column number of that lowest element.
> >>> Thanks in advance
> >>
> >> ?which
> >> which( distmat == min(distmat), arr.ind=TRUE)
> >>
> >> (It's possible to have more than one match and it would be up to you
> >> to decide how to break ties.)
> >>
> >> --
> >>
> >> David Winsemius, MD
> >> Alameda, CA, USA
> >>
> >> ______________________________________________
> >> R-help at r-project.org mailing list
> >> https://stat.ethz.ch/mailman/listinfo/r-help
> >> PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
> >> and provide commented, minimal, self-contained, reproducible code.
> >
>
> David Winsemius
> Alameda, CA, USA