[R] randomly select duplicated entries
jim holtman
jholtman at gmail.com
Wed Jul 9 22:42:42 CEST 2008
How about this:
> dat <- read.table(textConnection("Id myvar
+ 12 1
+ 12 2
+ 12 6
+ 34 9
+ 34 4
+ 34 8
+ 65 15
+ 65 23"), header = TRUE)
> closeAllConnections()
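> # split by Id, then pick one row at random from each group
> x <- lapply(split(dat, dat$Id), function(.grp){
+     # use nrow(), not length(): length() of a data frame counts its columns,
+     # so the third row of a three-row group could never be chosen
+     .grp[sample(seq_len(nrow(.grp)), 1), ]
+ })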
> do.call(rbind, x)
   Id myvar
12 12     1
34 34     9
65 65    15
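
The same result can probably also be reached without split(), by shuffling the rows and keeping the first row seen for each Id. What follows is only a sketch under a few assumptions: the names `shuffled` and `picked` and the call to set.seed() are illustrative and not part of the original question, and the data frame is rebuilt with data.frame() so the snippet runs on its own. Because the selection is random, the rows you get will differ from run to run unless a seed is fixed.

```r
# Assumption: a fixed seed, used here only to make the sketch reproducible.
set.seed(42)

# Same example data as in the question, built directly as a data frame.
dat <- data.frame(
  Id    = c(12, 12, 12, 34, 34, 34, 65, 65),
  myvar = c(1, 2, 6, 9, 4, 8, 15, 23)
)

# Put the rows in a random order; sample(n) returns a permutation of 1..n.
shuffled <- dat[sample(nrow(dat)), ]

# Keep only the first row encountered for each Id. After a uniform shuffle,
# every row within an Id is equally likely to be that first row.
picked <- shuffled[!duplicated(shuffled$Id), ]

# Restore ordering by Id for readability.
picked <- picked[order(picked$Id), ]
print(picked)
```

This avoids building a list of per-group data frames, which may matter when there are many distinct Ids, at the cost of permuting the whole data set once.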
On Wed, Jul 9, 2008 at 3:17 PM, Juliet Hannah <juliet.hannah at gmail.com> wrote:
> Using this data as an example
>
> dat <- read.table(textConnection("Id myvar
> 12 1
> 12 2
> 12 6
> 34 9
> 34 4
> 34 8
> 65 15
> 65 23"), header = TRUE)
> closeAllConnections()
>
> how can I create another data set that does not have duplicate entries
> for 'Id', but the included values
> are randomly selected from the available ones.
>
> Thanks!
>
> Juliet
>
> ______________________________________________
> R-help at r-project.org mailing list
> https://stat.ethz.ch/mailman/listinfo/r-help
> PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
> and provide commented, minimal, self-contained, reproducible code.
>
--
Jim Holtman
Cincinnati, OH
+1 513 646 9390
What is the problem you are trying to solve?