[R] Generating unordered, with replacement, samples

Giovanni Petris gpetris at uark.edu
Wed Sep 17 21:46:32 CEST 2014

Hi Duncan,

You are right. The idea of the derivation is to 'throw' k placeholders ("*" in the example below) into the list of the individuals of the population, with the restriction that the first position is always an individual. For example, if the population is letters[1:6] and the sample size is 4, the following code generates a 'sample' uniformly at random.

> n <- 6; k <- 4
> set.seed(2)
> xxx <- rep("*", n + k)
> ind <- sort(sample(2 : (n+k), k))
> xxx[setdiff(1 : (n+k), ind)] <- letters[seq.int(n)]
> noquote(xxx)
 [1] a b * c d * * e f *

Each "*" counts as one draw of the nearest letter to its left, so this represents the sample (b, d, d, f). I am still missing the "all" I need to do that you mention, that is, how to transform the vector xxx into something more readily usable, like c("b", "d", "d", "f"), or even a summary of counts. I guess I am looking for a bit of R trickery here...
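
One way the conversion might go, as a minimal sketch rather than a definitive answer: since every "*" belongs to the nearest letter on its left, a running count of the letters seen so far gives the index of the owner of each "*". The names is_star, owner, pop, smpl and counts are only illustrative, and xxx is typed in by hand here so the result does not depend on the random number generator.

```r
# The vector from the example above, entered by hand.
xxx <- c("a", "b", "*", "c", "d", "*", "*", "e", "f", "*")

is_star <- xxx == "*"
pop <- xxx[!is_star]                 # the population, in its original order

# cumsum(!is_star) counts the letters seen up to each position, so at a "*"
# it is the index (within pop) of the nearest letter to the left.
owner <- cumsum(!is_star)[is_star]

smpl <- pop[owner]                   # "b" "d" "d" "f"
counts <- table(factor(smpl, levels = pop))   # counts per individual, zeros kept
```

If only the counts are needed, tabulate(owner, nbins = length(pop)) should give the same numbers without building the character vector.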

Thank you,

From: Duncan Murdoch [murdoch.duncan at gmail.com]
Sent: Wednesday, September 17, 2014 14:07
To: Giovanni Petris; r-help at R-project.org
Subject: Re: [R] Generating unordered, with replacement, samples

On 17/09/2014 2:25 PM, Giovanni Petris wrote:
> Hello,
> I am trying to connect some elementary probability with Monte Carlo ideas in my teaching. In sampling from a finite population, the number of distinct samples of size 'k' from a population of size 'n', when individuals are selected with replacement and the selection order does not matter, is choose(n + k - 1, k). Does anyone have a suggestion about how to simulate (uniformly!) one of these possible samples? In a Monte Carlo framework I would like to do it repeatedly, so efficiency is of some relevance.
> Thank you in advance!

I forget the details of the derivation of that count, but the number
suggests it is found by selecting k things without replacement from
n+k-1.  The sample() function in R can easily give you a sample of k
integers from 1:(n+k-1); "all" you need to do is map those numbers into
your original sample of k from n.  For that you need to remember the
derivation of that formula!
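
For what it is worth, the mapping can probably be sketched as follows, assuming the usual "stars and bars" argument behind the count: sort the k integers drawn from 1:(n+k-1) and subtract 0, 1, ..., k-1 from them. The strictly increasing positions then become non-decreasing indices in 1:n, and each unordered sample corresponds to exactly one set of positions, so a uniform draw of positions gives a uniform draw of samples. The function name sample_multiset is hypothetical, not something from base R.

```r
# Hypothetical helper (not part of base R): draw one unordered sample of
# size k, with replacement, from the vector pop.
sample_multiset <- function(pop, k) {
  n <- length(pop)
  # k distinct positions out of n + k - 1, in increasing order
  pos <- sort(sample.int(n + k - 1L, k))
  # shift the i-th position down by i - 1: indices in 1..n, repeats allowed
  idx <- pos - (seq_len(k) - 1L)
  pop[idx]
}

smpl <- sample_multiset(letters[1:6], 4)
counts <- table(factor(smpl, levels = letters[1:6]))
```

The work per draw is one call to sample.int() and one sort of k numbers, so repeating it many times with replicate() should be cheap.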

Duncan Murdoch
