[R] extract rows in dataframe with duplicated column values
Tiago R Magalhaes
tiago17 at socrates.Berkeley.EDU
Fri Mar 18 19:21:44 CET 2005
Thank you very much to Andy Liaw, Rob J Goedman and Marc Schwartz for
taking the time to solve my problem. I have learned on many other
occasions from useful tips from all three of them, and it has just
happened once again. You've got to love this mailing list...
subset(x, a %in% a[duplicated(a)])
works in all cases and is the simplest, but as always, all the
solutions helped me understand the R concepts and functions a little
better.
I would suggest including this in the help pages for duplicated.
Also useful might be:
subset(x, !a %in% a[duplicated(a)])
which gives all rows whose value of a is not duplicated.
Again, thanks for all the help on this mailing list.
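
For anyone finding this in the archives, here is a minimal sketch of both
selections on the example data from my original question. The names
dup_vals, dups, uniq, both and same are only illustrative. The last part
assumes an R version whose duplicated() accepts a fromLast argument,
which nobody in this thread mentioned; it is meant as a shorter form of
my reversed-vector attempt, not as a replacement for Marc's solution.

```r
# Example data from the original question
x <- data.frame(a = c(1, 2, 2, 3, 3, 3), b = 10)

# Values of a that occur more than once
dup_vals <- unique(x$a[duplicated(x$a)])

# Rows whose value of a is repeated somewhere in the column
dups <- subset(x, a %in% dup_vals)

# Rows whose value of a occurs exactly once
uniq <- subset(x, !(a %in% dup_vals))

# Assumption: duplicated() supports fromLast. Scanning from both ends
# flags every member of a duplicated group, including the first one.
both <- duplicated(x$a) | duplicated(x$a, fromLast = TRUE)
same <- x[both, ]
```

The explicit parentheses in !(a %in% dup_vals) are optional, since
%in% binds more tightly than !, but they make the intent easier to read.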
>Here's one more possibility:
>
> > subset(x, a %in% a[duplicated(a)])
> a b
>2 2 10
>3 2 10
>4 3 10
>5 3 10
>6 3 10
>
>HTH,
>
>Marc Schwartz
>
>
>On Thu, 2005-03-17 at 22:25 -0500, Liaw, Andy wrote:
>> OK, strike one...
>>
>> Here's my second try:
>>
>> > cnt <- table(x[,1])
>> > v <- as.numeric(names(cnt[cnt > 1]))
>> > v
>> [1] 2 3
>> > x[x[,1] %in% v, ]
>> a b
>> 2 2 10
>> 3 2 10
>> 4 3 10
>> 5 3 10
>> 6 3 10
>>
>> Andy
>>
>> > From: Liaw, Andy
>> >
>> > Does this work for you?
>> >
>> > > x[table(x[,1]) > 1,]
>> > a b
>> > 2 2 10
>> > 3 2 10
>> > 5 3 10
>> > 6 3 10
>> >
>> > Andy
>> >
>> > > From: Tiago R Magalhaes
>> > >
>> > > Hi
>> > >
>> > > I want to extract all the rows in a data frame that have duplicates
>> > > for a given column.
>> > > I would expect this question to come up pretty often but I have
>> > > researched the archives and surprisingly couldn't find anything.
>> > > The best I can come up with is:
>> > >
>> > > x <- data.frame(a=c(1,2,2,3,3,3), b=10)
>> > > xdup1 <- duplicated(x[,1])
>> > > xdup2 <- duplicated(x[,1][nrow(x):1])[nrow(x):1]
>> > > xAllDups <- x[(xdup1+xdup2)!=0,]
>> > >
>> > > This seems to work, but it's so convoluted that I'm sure there's a
>> > > better method.
>> > > Thanks for any help and enlightenment