[R] Dealing with Duplicates - How to count instances?

Johannes Graumann johannes_graumann at web.de
Fri Feb 2 21:37:42 CET 2007


jim holtman wrote:

> table(data[column])
> 
> will give you the number of items in each subgroup; that would be the
> count you are after.

Thanks for your help! That rocks! I can do

copynum <- table(data_6plus[["Accession.number"]])
# look up each row's count by name, not by position
data_6plus$Repeats <- as.vector(
   copynum[as.character(data_6plus[["Accession.number"]])])

now!
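
To make the counting step concrete, here is a minimal sketch on made-up data. The toy data frame, its values and the Peptide column are assumptions for illustration only, not the real data_6plus. It shows the table() lookup next to an ave() variant that should give the same per-row counts:

```r
# Toy data: column name mirrors the post, values are invented
toy <- data.frame(
  Accession.number = c("A1", "B2", "A1", "C3", "A1", "B2"),
  Peptide          = c("p1", "p2", "p3", "p4", "p5", "p6"),  # hypothetical field
  stringsAsFactors = FALSE
)

# Number of occurrences of each accession number
copynum <- table(toy[["Accession.number"]])

# Per-row count, looked up by name (as.character guards against
# positional indexing if the column is numeric or a factor)
toy$Repeats <- as.vector(copynum[as.character(toy[["Accession.number"]])])

# Alternative: ave() applies length() within each group and
# returns the result aligned with the original rows
toy$Repeats2 <- ave(seq_len(nrow(toy)), toy[["Accession.number"]], FUN = length)

print(toy)
```

The ave() form avoids the intermediate table, which may be convenient when the count is all that is needed.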

But how about this:
- do something along the lines of

duplicity <- duplicated(data_6plus[["Accession.number"]])
data_6plus_unique <- subset(data_6plus, !duplicity)

- BUT: from each deleted row, retain one field, collect those values in a
vector, and store that vector in a new field of the remaining row that
shares the same data_6plus["Accession.number"]?

How would you do something like that?
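
To illustrate what is being asked, here is one possible sketch on made-up data, not a definitive answer. The toy data frame, the Peptide column and the Other.peptides column are hypothetical names chosen for the example. It uses split() to gather the field per accession number and stores the values from the dropped rows as a semicolon-separated string on the row that is kept:

```r
# Toy data: column name mirrors the post, values are invented
toy <- data.frame(
  Accession.number = c("A1", "B2", "A1", "C3", "A1", "B2"),
  Peptide          = c("p1", "p2", "p3", "p4", "p5", "p6"),  # hypothetical field
  stringsAsFactors = FALSE
)

# Values of the field to retain, grouped by accession number
# (split() keeps the original row order within each group)
by_acc <- split(toy$Peptide, toy[["Accession.number"]])

# Keep only the first row for each accession number
toy_unique <- toy[!duplicated(toy[["Accession.number"]]), ]

# For each remaining row, collect the values from the dropped rows
# (everything but the first element of the group) into one string
toy_unique$Other.peptides <- vapply(
  by_acc[as.character(toy_unique[["Accession.number"]])],
  function(v) paste(v[-1], collapse = ";"),
  character(1),
  USE.NAMES = FALSE
)

print(toy_unique)
```

A collapsed string is used here only because it prints and exports cleanly; keeping the values as real vectors would need a list column instead.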

Joh


