[R] A combinatorial optimization problem: finding the best permutation of a complex vector
Ravi Varadhan
rvaradhan at jhmi.edu
Sun Nov 15 05:47:07 CET 2009
Hi,
It was pointed out to me by Hans Borchers that the timing that I reported (0.3 sec) in the previous email for solving the LSAP problem, for N=1000, was too optimistic, because "X" and "Y" were equivalent up to a permutation.
In order to test this out, I ran a few more experiments with different random variate distributions for X and Y. In all the experiments, I took N = 500.
The execution times were faster when X and Y had the same or similar distributions. This is generally around 8 - 9 seconds. The more different the distributions are, the greater the time. For example, when I took the real and imaginary parts of X to be from a t-distribution with 3 degrees of freedom, and Y to be from uniform distribution in (0, 1), the execution times were around 80-90 seconds.
n <- 500
x <- rt(n, df=3) + 1i * rt(n, df=3)
y <- runif(n) + 1i * runif(n)
Cmat <- outer(x, y, FUN=function(x,y) Mod(x - y))
system.time(ans <- solve_LSAP(Cmat, maximum=FALSE))
When I increased N = 1000, the time was about 1400 seconds!
Ravi.
____________________________________________________________________
Ravi Varadhan, Ph.D.
Assistant Professor,
Division of Geriatric Medicine and Gerontology
School of Medicine
Johns Hopkins University
Ph. (410) 502-2619
email: rvaradhan at jhmi.edu
----- Original Message -----
From: Ravi Varadhan <rvaradhan at jhmi.edu>
Date: Saturday, November 14, 2009 10:53 am
Subject: Re: [R] A combinatorial optimization problem: finding the best permutation of a complex vector
To: "Charles C. Berry" <cberry at tajo.ucsd.edu>
Cc: r-help at r-project.org
> Hi,
>
> I have solved the problem that I had posed before. Here is a
> statement of the problem:
>
> "I have a complex-valued vector X in C^n. Given another
> complex-valued vector Y in C^n, I want to find a permutation of Y,
> say, Y*, that minimizes ||X - Y*||, the distance between X and Y*. "
>
> I was talking to Professor Moody T. Chu, who is a well-known numerical
> analyst from NC State Univ, and he pointed out that this problem is an
> instance of the classical "linear sum assignment problem (LSAP)" in
> discrete mathematics. Once this was revealed to me, it didn't take me
> long to find out the existence of various algorithms (e.g. Hungarian
> algorithm) and codes (C, Fortran, Matlab) for solving this problem. I
> also looked in the CRAN task view on optimization and found that the
> LSAP solver is present in the "clue" package. Thanks to Kurt Hornik
> for this package.
>
> So, here is an illustration of the "amazing" power of mathematics:
>
> n <- 1000
>
> x <- rt(n, df=3) + 1i * rt(n, df=3) # this is the target vector to be
> matched
>
> y <- x[sample(n)] # this is the vector to be permuted
>
> # Note: I have chosen a random permutation of the target so that I
> know the answer is "x" itself
> # and the minimum distance is zero
>
> Cmat <- outer(x, y, FUN=function(x,y) Mod(x - y))
>
> require(clue)
>
> ans <- solve_LSAP(Cmat, maximum=FALSE) # We are minimizing the linear
> sum
>
> dist <- function(x, y) sqrt(sum(Mod(x - y)^2))
>
> dist(x, y[c(ans)])
>
>
> This is remarkable. It takes only about 0.3 seconds to solve this
> difficult combinatorial problem!
>
>
> Best,
> Ravi.
>
> ____________________________________________________________________
>
> Ravi Varadhan, Ph.D.
> Assistant Professor,
> Division of Geriatric Medicine and Gerontology
> School of Medicine
> Johns Hopkins University
>
> Ph. (410) 502-2619
> email: rvaradhan at jhmi.edu
>
>
> ----- Original Message -----
> From: "Charles C. Berry" <cberry at tajo.ucsd.edu>
> Date: Thursday, November 12, 2009 2:20 pm
> Subject: Re: [R] A combinatorial optimization problem: finding the
> best permutation of a complex vector
> To: Ravi Varadhan <rvaradhan at jhmi.edu>
> Cc: r-help at r-project.org
>
>
> > On Thu, 12 Nov 2009, Ravi Varadhan wrote:
> >
> > >
> > > Hi,
> > >
> > > I have a complex-valued vector X in C^n. Given another
> > complex-valued
> > > vector Y in C^n, I want to find a permutation of Y, say, Y*, that
>
> > > minimizes ||X - Y*||, the distance between X and Y*.
> > >
> > > Note that this problem can be trivially solved for "Real" vectors,
>
> > since
> > > real numbers possess the ordering property. Complex numbers,
> > however, do
> > > not possess this property. Hence the challenge.
> > >
> > > The obvious solution is to enumerate all the permutations of Y and
>
> > pick
> > > out the one that yields the smallest distance. This, however, is
>
> > only
> > > feasible for small n. I would like to be able to solve this for n
>
> > as
> > > large as 100 - 1000, in which cases the permutation approach is
> > > infeasible.
> > >
> > > I am looking for algorithms, possibly iterative, that can provide
> a
> >
> > > "good" approximate solutions that are not necessarily optimal for
>
> > > high-dimensional vectors. I can do random sampling, but this can
> be
> > very
> > > inefficient in high-dimensional problems. I am looking for
> > efficient
> > > algorithms because this step has to be performed in each iteration
>
> > of an
> > > "outer" algorithm.
> > >
> > > Are there any clever adaptive algorithms out there?
> > >
> >
> > I do not know.
> >
> > But would you settle for a not-so-clever adaptive heuristic?
> >
> > If so, see below.
> >
> >
> >
> > > Here is an example illustrating the problem:
> > >
> > > require(e1071)
> > >
> > > n <- 8
> > > x <- runif(n) + 1i * runif(n)
> > > y <- runif(n) + 1i * runif(n)
> > >
> > > dist <- function(x, y) sqrt(sum(Mod(x - y)^2))
> > >
> > > perms <- permutations(n)
> > > dim(perms) # [1] 40320 8
> > > tmp <- apply(perms, 1, function(ord) dist(x, y[ord]))
> > > z <- y[perms[which.min(tmp), ]] # exact solution
> > > dist(x, z)
> > >
> > > # an aproximate random-sampling approach
> > > nsamp <- 10000
> > > perms <- t(replicate(nsamp, sample(1:n, size=n, replace=FALSE)))
> > > tmp <- apply(perms, 1, function(ord) dist(x, y[ord]))
> > > z.app <- y[perms[which.min(tmp), ]] # approximate solution
> > > dist(x, z.app)
> > >
> >
> > The heuristic is to use a stochastic greedy updates. Here is a very
>
> > simple
> > one:
> >
> > swap.samp <-
> > function(index) {
> > sub.ind <- sample(seq(along=index),2)
> > index[sub.ind]<- rev(sub.ind)
> > index
> > }
> >
> >
> > z.app <- y
> > z.cand <- y
> >
> > for (i in 1:100) z.cand <-
> > if( dist(x,z.app) < dist(x,z.cand) ) {
> >
> > z.app[swap.samp(1:8)]
> >
> > } else {
> > z.app <- z.cand
> > z.cand[swap.samp(1:8)]
> > }
> >
> > On your toy example, this usually finds the min(dist(x,z.app)) in <
>
> > 100
> > trials.
> >
> > Note that when
> >
> > z.diff <- z.app != z.cand
> >
> > dist(x[ z.diff ],z.app[ z.diff ])^2 - dist(x[ z.diff ],z.cand[
> > z.diff ])^2
> >
> > equals
> >
> > dist(x,z.app)^2 - dist(x,z.cand)^2
> >
> > so you could vectorize the above to randomly pair up all the points
>
> > (for n
> > even) then update the n%/%2 pairs all at once.
> >
> >
> > For large problems, you might try to preferentially sample pairs of
>
> > points
> > with similar values of Mod ( x - z.app ) to begin with. The
> heuristic
> >
> > being that in two pairs of points that are both distant, swapping
> them
> >
> > might realize a bigger benefit.
> >
> >
> > --
> >
> > Another version is to sample k points, randomly permute, then do a
> > greedy
> > update:
> >
> > sub.samp <-
> > function(index,n) {
> > sub.ind <- sample(seq(along=index),n)
> > index[sub.ind]<- sample(sub.ind)
> > index
> > }
> >
> >
> > # k is 4 here
> >
> > for (i in 1:100) z.cand <-
> > if( dist(x,z.app) < dist(x,z.cand) ){
> > z.app[sub.samp(1:8,4)]
> > } else {
> > z.app <- z.cand
> > z.cand[sub.samp(1:8,4)]
> > }
> >
> >
> > On your toy problem, this takes in the low 100's to find the min.
> >
> > I think you can do the similar vectorization trick here.
> > --
> >
> > I suppose another variant would repeatedly sample k points, then
> > enumerate
> > distances for all permutations of them, then choose the best one to
>
> > update z.app.
> >
> > Again, in larger problems, one might weight those k points towards
> > higher
> > values of Mod( x - z.app ) at least to begin with.
> >
> > HTH,
> >
> > Chuck
> >
> >
> > > Thanks for any help.
> > >
> > > Best,
> > > Ravi.
> > >
> > >
> > > ____________________________________________________________________
> > >
> > > Ravi Varadhan, Ph.D.
> > > Assistant Professor,
> > > Division of Geriatric Medicine and Gerontology
> > > School of Medicine
> > > Johns Hopkins University
> > >
> > > Ph. (410) 502-2619
> > > email: rvaradhan at jhmi.edu
> > >
> > > ______________________________________________
> > > R-help at r-project.org mailing list
> > >
> > > PLEASE do read the posting guide
> > > and provide commented, minimal, self-contained, reproducible code.
> > >
> >
> > Charles C. Berry (858) 534-2098
> > Dept of
> Family/Preventive
> > Medicine
> > E UC San Diego
> > La Jolla, San Diego 92093-0901
> >
> >
>
> ______________________________________________
> R-help at r-project.org mailing list
>
> PLEASE do read the posting guide
> and provide commented, minimal, self-contained, reproducible code.
More information about the R-help
mailing list