[R] Help on comparing two matrices
Daniel Nordlund
djnordlund at verizon.net
Tue Aug 25 07:01:33 CEST 2009
> -----Original Message-----
> From: r-help-bounces at r-project.org
> [mailto:r-help-bounces at r-project.org] On Behalf Of Michael Kogan
> Sent: Saturday, August 22, 2009 11:45 AM
> To: r-help at r-project.org
> Subject: [R] Help on comparing two matrices
>
> Hi,
>
> I need to compare two matrices with each other. If you can get one of
> them out of the other one by resorting the rows and/or the
> columns, then
> both of them are equal, otherwise they're not. A matrix could
> look like
> this:
>
> [,1] [,2] [,3] [,4] [,5] [,6] [,7] [,8]
> [1,] 0 1 1 1 0 1 1 0
> [2,] 1 1 0 0 0 1 0 1
> [3,] 1 0 1 0 0 0 1 1
> [4,] 1 1 0 0 1 0 0 0
> [5,] 1 0 1 1 1 0 0 0
> [6,] 0 1 0 1 1 0 0 0
> [7,] 0 0 0 0 0 1 1 1
>
> Note that each matrix consists of ones and zeros, in each row and in
> each column there are at least three ones and one zero and
> each pair of
> rows/columns may have at most two positions where both are
> ones (e.g.
> for the 1. and 2. rows those positions are 2 and 6).
>
> I was advised to sort both matrices in the same way and then
> to compare
> them element by element. But I don't manage to get them sorted... My
> approach is as following:
>
> 1. Sort the rows after the row sums (greater sums first).
> 2. Sort the columns after the first column (columns with ones in the
> first row go left, columns with zeros go right).
> 3. Save the left part (all columns with ones in the first
> row) and the
> right part in separate matrices.
> 4. Repeat steps 2 and 3 with both of the created matrices (now taking
> the second row for sorting), repeat until all fragments consist of a
> single column.
> 5. Compose the columns to a sorted matrix.
>
> This algorithm has several problems:
>
> 1. How to make a loop that is branching out in two subloops on each
> iteration?
> 2. How to organize the intermediate results and compose them without
> losing the order? Maybe save them in lists and sublists?
> 3. A fundamental problem: If there are rows with equal sums
> the result
> may depend on which of them is sorted after first. Maybe this
> algorithm
> won't work at all because of this problem?
>
> Thanks in advance for your input,
> Michael
>
Michael,
I see that you have received a number of responses to your request, and you
may have already solved your problem. It seems to me that the responses so
far don't guarantee finding a match if both row and column exchanges are
necessary. I put together some code as an R learning task for me. It is
only partially automated. Someone with more R programming skills than me
could probably totally automate this process. I think your initial plan of
sorting by row total was moving in the right direction. I took your
original matrix, m, and created a matrix, x, as a random ordering of rows
and columns of m. Then sorted m and x by both row and column totals.
Next, all possible orderings of the ordered x matrix, x.o, were constructed
by permuting rows (and columns) with the same row ( or column) totals and
were compared to the ordered m matrix, m.o . I used the permutations
function from the gtools package, so you would need to install and load that
package. If a match was found, TRUE was printed out. A better R programmer
than me could probably finish automating the entire process, wrapping it all
up in a nice function, but this at least gives you a framework for a
solution.
require(gtools)
m <- matrix(c(
0, 1, 1, 1, 0, 1, 1, 0,
1, 1, 0, 0, 0, 1, 0, 1,
1, 0, 1, 0, 0, 0, 1, 1,
1, 1, 0, 0, 1, 0, 0, 0,
1, 0, 1, 1, 1, 0, 0, 0,
0, 1, 0, 1, 1, 0, 0, 0,
0, 0, 0, 0, 0, 1, 1, 1), 7, 8, byrow = TRUE)
##----create comparison matrix by randomly ordering rows and columns (should
compare TRUE)----##
i <- sample(1:nrow(m))
j <- sample(1:ncol(m))
x <- m[i,j]
##----if row and column totals aren't the same, then matrices can't be
"equal"----##
# no need to proceed if false
identical(sort(apply(m,1,sum)),sort(apply(x,1,sum))) &
identical(sort(apply(m,2,sum)),sort(apply(x,2,sum)))
##----Order both arrays by row and column totals----##
m.o <- m[order(apply(m,1,sum)),order(apply(m,2,sum))]
x.o <- x[order(apply(x,1,sum)),order(apply(x,2,sum))]
##----build row permutation list----##
# these are groups of rows that have same sum
rowtot <- rle(apply(x.o,1,sum))$lengths
rb <- list()
begin <- 1
for(i in 1:length(rowtot)){
rb[[i]] <- permutations(rowtot[i],rowtot[i],v=begin:cumsum(rowtot)[i])
begin <- begin + rowtot[i]
}
# construction of rperm could (should) be automated
rperm <- cbind(rb[[1]][rep(1:nrow(rb[[1]]), each = nrow(rb[[2]])), ],
rb[[2]][rep(1:nrow(rb[[2]]), nrow(rb[[1]])), ] )
rperm <- cbind(rperm[rep(1:nrow(rperm), each = nrow(rb[[3]])), ],
rb[[3]][rep(1:nrow(rb[[3]]), nrow(rperm)), ] )
##----build column permutation list----##
# these are groups of columns that have same sum
coltot <- rle(apply(x.o,2,sum))$lengths
cb <- list()
begin <- 1
for(i in 1:length(coltot)){
cb[[i]] <- permutations(coltot[i],coltot[i],v=begin:cumsum(coltot)[i])
begin <- begin + coltot[i]
}
# construction of cperm could (should) be automated
cperm <- cbind(cb[[1]][rep(1:nrow(cb[[1]]), each = nrow(cb[[2]])), ],
cb[[2]][rep(1:nrow(cb[[2]]), nrow(cb[[1]])), ] )
##----compare m.o with all x.o where rows and columns of x.o with tied
totals are permuted----##
for(i in 1:nrow(rperm)){
for(j in 1:nrow(cperm)){
if(identical(m.o,x.o[rperm[i,],cperm[j,]])) {
cat('TRUE','\n')
break
}
}
}
Hope this is helpful,
Dan
Daniel Nordlund
Bothell, WA USA
More information about the R-help
mailing list