[R] Drawing a sample based on certain condition
Duncan Murdoch
murdoch@dunc@n @end|ng |rom gm@||@com
Mon Apr 14 17:36:54 CEST 2025
On 2025-04-14 7:26 a.m., Brian Smith wrote:
> Hi,
>
> For my analytical work, I need to draw a sample of certain sample size
> from a defined population, where population members are marked by
> non-negative integers, such that the sum of sample members is fixed. For
> example,
>
> Population = 0:100
> Sample_size = 10
> Sample_Sum = 20
>
> Under this setup if my sample members are X1, X2, ..., X10 then I
> should have X1+X2+...+X10 = 20
>
> Sample drawing scheme may be with/without replacement
>
> Is there any R function to achieve this? One possibility is to employ a
> naive trial-and-error approach, but this doesn't seem to be practical as it
> would take a long time to get the final sample with desired properties.
>
> Any pointer would be greatly appreciated.

One general way to think of this problem is that you are defining a
distribution on the space of all possible samples of size 10, such that
the probability of a sample is some constant X if its sum is 20, and zero
otherwise, and you want to sample from this distribution.
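
To make that target distribution concrete, here is a minimal sketch on a much smaller case where every sample can be listed. The values (population 0:5, size 3, target sum 6) and the variable names are assumptions chosen for illustration, and samples are treated as ordered and drawn with replacement. Full enumeration like this is not feasible for the 0:100, size 10 example, which has 101^10 ordered samples.

```r
# Tiny illustrative case (assumed values): ordered samples of size 3 from 0:5
# that must sum to 6.
population <- 0:5
size <- 3
target <- 6

# Enumerate every ordered sample drawn with replacement.
all_samples <- as.matrix(expand.grid(rep(list(population), size)))

# Samples with the target sum share the same probability X; all others get zero.
valid <- all_samples[rowSums(all_samples) == target, , drop = FALSE]
X <- 1 / nrow(valid)

# One draw from the target distribution.
set.seed(1)
draw <- valid[sample.int(nrow(valid), 1), ]
```
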
There's probably a slick method to do that for your example, but if
you've got a general population instead of that special one, I doubt it.
What I would do is the following:
Define another distribution on samples that has probabilities that
depend on the sum of the sample, with the highest probabilities attached
to ones with the correct sum, and probabilities for other sums declining
with distance from the target sum. For example, maybe

    P(sum) = Y/(1 + abs(sum - 20))

for some constant Y.
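
As a small sketch, the example function can be written in R without the constant, since only ratios of these values matter in a Metropolis-type sampler. The function name `unnorm_weight` and its arguments are hypothetical.

```r
# Unnormalized version of P(sum) = Y/(1 + abs(sum - 20)); Y is omitted
# because it cancels whenever two weights are compared as a ratio.
unnorm_weight <- function(x, target = 20) {
  1 / (1 + abs(sum(x) - target))
}

on_target  <- unnorm_weight(rep(2, 10))  # sum is exactly 20
off_target <- unnorm_weight(rep(3, 10))  # sum is 30, so the weight is smaller
```
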
You can use MCMC to sample from that distribution and then only keep the
samples where the sum is exactly equal to the target sum. If you do
that, you don't need to care about the value of Y, but you do need to
think about how proposed moves are made, and you probably need to use a
different function than the example above for acceptable efficiency.
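
A rough sketch of how this could look in R follows. It is one possible reading of the idea, not a tested recommendation. Everything named here is an assumption: the function `draw_fixed_sum`, the proposal (move one sample member to a neighbouring population value), and the weight exp(-lambda * abs(sum - target)), which replaces the example function because it falls off faster. Samples are ordered and drawn with replacement; for the 0:100 example, 10 distinct values sum to at least 0 + 1 + ... + 9 = 45, so a sum of 20 is only possible with replacement.

```r
# Hypothetical Metropolis sampler over ordered samples drawn with replacement.
draw_fixed_sum <- function(population = 0:100, size = 10, target = 20,
                           n_iter = 20000, lambda = 1) {
  # Assumed log-weight; any normalizing constant would cancel in the ratio.
  log_weight <- function(s) -lambda * abs(s - target)

  idx <- rep(1L, size)               # current state, as indices into population
  cur_sum <- sum(population[idx])
  kept <- vector("list", n_iter)
  n_kept <- 0L

  for (iter in seq_len(n_iter)) {
    i <- sample.int(size, 1)                   # which sample member to change
    new_i <- idx[i] + sample(c(-1L, 1L), 1)    # symmetric step to a neighbour
    # Proposals that leave the population are rejected (the chain stays put).
    if (new_i >= 1L && new_i <= length(population)) {
      new_sum <- cur_sum - population[idx[i]] + population[new_i]
      # Metropolis acceptance step for a symmetric proposal.
      if (log(runif(1)) < log_weight(new_sum) - log_weight(cur_sum)) {
        idx[i] <- new_i
        cur_sum <- new_sum
      }
    }
    # Keep only states whose sum equals the target exactly.
    if (cur_sum == target) {
      n_kept <- n_kept + 1L
      kept[[n_kept]] <- population[idx]
    }
  }
  do.call(rbind, kept[seq_len(n_kept)])  # one kept sample per row
}

set.seed(1)
draws <- draw_fixed_sum()
```

Successive rows of `draws` are strongly correlated because each step changes at most one member, so in practice one would probably thin the output or run several chains. A without-replacement version would also need the proposal to reject any move that duplicates a value already in the sample.
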
Duncan Murdoch