[R] uniform integer RNG 0 to t inclusive

Prof Brian Ripley ripley at stats.ox.ac.uk
Tue Sep 19 10:41:22 CEST 2006


On Tue, 19 Sep 2006, Sean O'Riordain wrote:

> Hi Duncan,
>
> Thanks for that.  In the light of what you've suggested, I'm now using
> the following:
>
>  # generate a random integer from 0 to t (inclusive)
>  if (t < 10000000) { # to avoid memory problems...
>    M <- sample(t, 1)
>  } else {
>    while (M > t) {
>      M <- as.integer(urand(1,min=0, max=t+1-.Machine$double.eps))
>    }
>  }

sample(t, 1) is a sample from 1:t, not 0:t.

You need

sample(t+1, 1, replace=TRUE) - 1

which works in all cases up to INT_MAX-1, and beyond that you need to 
worry about the resolution of the RNG (and to use floor not as.integer).

There is no such thing as urand in base R ....

>
> cheers and Thanks,
> Sean
>
> On 18/09/06, Duncan Murdoch <murdoch at stats.uwo.ca> wrote:
>> On 9/18/2006 3:37 AM, Sean O'Riordain wrote:
>>> Good morning,
>>>
>>> I'm trying to concisely generate a single integer from 0 to n
>>> inclusive, where n might be of the order of hundreds of millions.
>>> This will however be used many times during the general procedure, so
>>> it must be "reasonably efficient" in both memory and time... (at some
>>> later stage in the development I hope to go vectorized)
>>>
>>> The examples I've found through searching RSiteSearch() relating to
>>> generating random integers say to use : sample(0:n, 1)
>>> However, when n is "large" this first generates a large sequence 0:n
>>> before taking a sample of one... this computer doesn't have the memory
>>> for that!
>>
>> You don't need to give the whole vector:  just give n, and you'll get
>> draws from 1:n.  The man page is clear on this.
>>
>> So what you want is sample(n+1, 1) - 1.  (Use "replace=TRUE" if you want
>> a sample bigger than 1, or you'll get sampling without replacement.)
>>>
>>> When I look at the documentation for runif(n, min, max) it states that
>>> the generated numbers will be min <= x <= max.  Note the "<= max"...
>>
>> Actually it says that's the range for the uniform density.  It's silent
>> on the range of the output.  But it's good defensive programming to
>> assume that it's possible to get the endpoints.
>>
>>>
>>> How do I generate an x such that the probability of being (the
>>> integer) max is the same as any other integer from min (an integer) to
>>> max-1 (an integer) inclusive... My attempt is:
>>>
>>> urand.int <- function(n,t) {
>>>   as.integer(runif(n,min=0, max=t+1-.Machine$double.eps))
>>> }
>>> # where I've included the parameter n to help testing...
>>
>> Because of rounding error, t+1-.Machine$double.eps might be exactly
>> equal to t+1.  I'd suggest using a rejection method if you need to use
>> this approach:  but sample() is better in the cases where as.integer()
>> will work.
>>
>> Duncan Murdoch
>>>
>>> is floor() "better" than as.integer?
>>>
>>> Is this correct?  Is the probability of the integer t the same as the
>>> integer 1 or 0 etc... I have done some rudimentary testing and this
>>> appears to work, but power being what it is, I can't see how to
>>> realistically test this hypothesis.
>>>
>>> Or is there a a better way of doing this?
>>>
>>> I'm trying to implement an algorithm which samples into an array,
>>> hence the need for an integer - and yes I know about sample() thanks!
>>> :-)
>>>
>>> { incidentally, I was surprised to note that the maximum value
>>> returned by summary(integer_vector) is "pretty" and appears to be
>>> rounded up to a "nice round number", and is not necessarily the same
>>> as max(integer_vector) where the value is large, i.e. of the order of
>>> say 50 million }
>>>
>>> Is version etc relevant? (I'll want to be portable)
>>>> version               _
>>> platform       i386-pc-mingw32
>>> arch           i386
>>> os             mingw32
>>> system         i386, mingw32
>>> status
>>> major          2
>>> minor          3.1
>>> year           2006
>>> month          06
>>> day            01
>>> svn rev        38247
>>> language       R
>>> version.string Version 2.3.1 (2006-06-01)
>>>
>>> Many thanks in advance for your help.
>>> Sean O'Riordain
>>> affiliation <- NULL
>>>
>>> ______________________________________________
>>> R-help at stat.math.ethz.ch mailing list
>>> https://stat.ethz.ch/mailman/listinfo/r-help
>>> PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
>>> and provide commented, minimal, self-contained, reproducible code.
>>
>>
>
> ______________________________________________
> R-help at stat.math.ethz.ch mailing list
> https://stat.ethz.ch/mailman/listinfo/r-help
> PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
> and provide commented, minimal, self-contained, reproducible code.
>

-- 
Brian D. Ripley,                  ripley at stats.ox.ac.uk
Professor of Applied Statistics,  http://www.stats.ox.ac.uk/~ripley/
University of Oxford,             Tel:  +44 1865 272861 (self)
1 South Parks Road,                     +44 1865 272866 (PA)
Oxford OX1 3TG, UK                Fax:  +44 1865 272595



More information about the R-help mailing list