[R] Removing rows that are duplicates but column values are in reversed order
arun
smartpink111 at yahoo.com
Fri Apr 12 22:06:59 CEST 2013
Hi,
From your example data,
dat1<- read.table(text="
id1 id2 value
a b 10
c d 11
b a 10
c e 12
",sep="",header=TRUE,stringsAsFactors=FALSE)
# here every duplicated row repeats a value, so dropping repeated values is enough to get the output you wanted
dat1[!duplicated(dat1$value),]
# id1 id2 value
#1 a b 10
#2 c d 11
#4 c e 12
But if you have cases like the one below (assuming that all rows whose ids appear in reversed order have the same value):
dat2<- read.table(text="
id1 id2 value
a b 10
c d 11
b a 10
e c 12
c e 12
",sep="",header=TRUE,stringsAsFactors=FALSE)
# keep only the rows where id1 sorts before id2
dat2[apply(dat2[,-3],1,function(x) {x1<- order(x); x1[1]<x1[2]}),]
# id1 id2 value
#1 a b 10
#2 c d 11
#5 c e 12
#or you have cases like these:
dat3<- read.table(text="
id1 id2 value
a b 10
c d 11
b a 10
a b 10
e c 12
c e 12
c d 11
",sep="",header=TRUE,stringsAsFactors=FALSE)
# same filter as above, then drop the rows that still repeat a value
dat3New<-dat3[apply(dat3[,-3],1,function(x) {x1<- order(x); x1[1]<x1[2]}),]
dat3New[!duplicated(dat3New$value),]
# id1 id2 value
#1 a b 10
#2 c d 11
#6 c e 12
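Note that the order() filter above keeps only rows where id1 sorts before id2, so a reversed row with no forward counterpart would be dropped entirely. If that can happen in your data, one possible alternative (a sketch only; dat4, lo and hi are names made up for this illustration) is to build an order-independent key with pmin()/pmax() and let duplicated() keep the first occurrence of each pair:

```r
# Illustrative data (not from the original post): row 6 is a reversed
# pair that has no forward counterpart.
dat4 <- read.table(text="
id1 id2 value
a b 10
c d 11
b a 10
e c 12
c e 12
z y 13
",sep="",header=TRUE,stringsAsFactors=FALSE)

# Order-independent key: the smaller id goes in lo, the larger in hi
lo <- pmin(dat4$id1, dat4$id2)
hi <- pmax(dat4$id1, dat4$id2)

# Keep the first row for each (lo, hi, value) combination
res <- dat4[!duplicated(data.frame(lo, hi, value=dat4$value)),]
res
# id1 id2 value
#1 a b 10
#2 c d 11
#4 e c 12
#6 z y 13
```

This keeps whichever orientation appears first (here "e c" rather than "c e"), and it also compares the value, so two rows with swapped ids but different values are both kept.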
A.K.
>Hi everybody,
>
>I was hoping that someone could help me with this problem. I
>have a table with 3 columns. Some rows contain duplicates where the
>identifiers in columns 1 and 2 are in reverse order, but the value
>associated with the row is the same.
>
>For example:
>
>id1 id2 value
>a b 10
>c d 11
>b a 10
>c e 12
>
>Rows 1 and 3 are duplicates (have the same value). I would like
>to retain only row 1 and delete row 3. Final table should look like
>this:
>
>id1 id2 value
>a b 10
>c d 11
>c e 12
>
>Thanks in advance for any help provided.
>
>Vince