[R] Finding (swapped) repetitions of numbers pairs across two columns
arun
smartpink111 at yahoo.com
Fri Dec 28 03:49:09 CET 2012
Hi,
You could also use:
unique(t(apply(cbind(v1,v2),1,function(x) x[order(x)])))
#or
unique(t(apply(cbind(v1,v2),1,sort.int,method="quick")))
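As a quick check, here is a minimal sketch of the same idea applied to the small vectors from the original question. It uses plain sort() rather than the variants above, and the names sorted_pairs and res are only for illustration:

```r
# Vectors from the original question
v1 <- c(0, 1, 2, 3, 4)
v2 <- c(5, 3, 2, 1, 0)

# Sort within each row so that (3,1) and (1,3) both become (1,3).
# apply() returns the sorted pairs as columns, hence the t().
sorted_pairs <- t(apply(cbind(v1, v2), 1, sort))

# Drop duplicated rows; the second (1,3) pair is removed,
# leaving four rows: (0,5), (1,3), (2,2), (0,4)
res <- unique(sorted_pairs)
print(res)
```

Note that the pairs come back in sorted order, so a row that was originally (3,1) is reported as (1,3).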
Comparing the different methods:
set.seed(51)
v1<-sample(0:9,1e5,replace=TRUE)
set.seed(49)
v2<-sample(0:9,1e5,replace=TRUE)
system.time(res1<-unique(t(apply(cbind(v1, v2), 1, sort))))
# user system elapsed
# 11.373 0.188 11.918
system.time(res2<-unique(t(apply(cbind(v1,v2),1,sort.int,method="quick"))))
# user system elapsed
# 7.088 0.120 7.446
identical(res1,res2)
#[1] TRUE
system.time(res3 <- unique(t(apply(cbind(v1,v2),1,function(x) x[order(x)])))) # fastest of the three
# user system elapsed
# 2.693 0.072 2.857
identical(res1,res3)
#[1] TRUE
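Since each row holds only two numbers, the row-wise apply() can probably be avoided altogether. The following is a sketch of a vectorized alternative using pmin() and pmax(); it was not part of the timings above, and res4 is just an illustrative name. I would expect it to be considerably faster than the apply() versions, but that is an assumption you should check with system.time() on your own data:

```r
set.seed(51)
v1 <- sample(0:9, 1e5, replace = TRUE)
set.seed(49)
v2 <- sample(0:9, 1e5, replace = TRUE)

# Element-wise minimum and maximum put every pair in sorted order
# without calling a function once per row
res4 <- unique(cbind(pmin(v1, v2), pmax(v1, v2)))

# Reference result from the row-wise sort used above
res1 <- unique(t(apply(cbind(v1, v2), 1, sort)))

# Compare values only: res1 carries column names, res4 does not
all.equal(unname(res1), res4)
```

This only works because there are exactly two columns; for wider rows a row-wise sort is still needed.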
A.K.
----- Original Message -----
From: Emmanuel Levy <emmanuel.levy at gmail.com>
To: R-help Mailing List <r-help at r-project.org>
Cc:
Sent: Thursday, December 27, 2012 3:30 PM
Subject: [R] Finding (swapped) repetitions of numbers pairs across two columns
Hi,
I've had this problem for a while and tackled it in a quite dirty way,
so I'm wondering if a better solution exists:
If we have two vectors:
v1 = c(0,1,2,3,4)
v2 = c(5,3,2,1,0)
How can I remove one instance of the "3,1" / "1,3" duplicate?
At the moment I'm using the following solution, which is quite horrible:
v1 = c(0,1,2,3,4)
v2 = c(5,3,2,1,0)
ft <- cbind(v1, v2)
direction = apply( ft, 1, function(x) return(x[1]>x[2]))
ft.tmp = ft
ft[which(direction),1] = ft.tmp[which(direction),2]
ft[which(direction),2] = ft.tmp[which(direction),1]
uniques = apply( ft, 1, function(x) paste(x, collapse="%") )
uniques = unique(uniques)
ft.unique = matrix(unlist(strsplit(uniques,"%")), ncol=2, byrow=TRUE)
Any better solution would be very welcome!
All the best,
Emmanuel
______________________________________________
R-help at r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.