[R] Bug : Autocorrelation in sample drawn from stats::rnorm (hmh)

Fri Oct 5 09:58:15 CEST 2018

On 05/10/2018, 09:45, "R-help on behalf of hmh" <r-help-bounces using r-project.org on behalf of hugomh using gmx.fr> wrote:

    Hi,

    Thanks William for this fast answer, and sorry for sending the 1st mail 
    to r-help instead of r-devel.

    I noticed that bug while I was simulating many small random walks using 
    c(0,cumsum(rnorm(10))). The negative auto-correlation made the random 
    walks visit a much smaller space than would be expected if there were 
    no auto-correlation in the samples.

    The code I provided and you optimized was only meant to illustrate 
    and investigate that bug.

    It is really worrying that most of the R distributions are affected by 
    this bug !!!!

    What I did should have been one of the first checks done for _*each*_ 
    distribution by the developers of these functions !

    And if, as you suggested, this is a "tolerated" _error_ of the algorithm, 
    I do think this is a bad choice, but anyway, this should have been 
    mentioned in the documentation of the functions !!

    cheers,

    hugo

This is not a bug. You have simply rediscovered the finite-sample bias in the sample autocorrelation coefficient, known at least since:
Kendall, M. G. (1954). Note on bias in the estimation of autocorrelation. Biometrika, 41(3-4), 403-404.

The bias is approximately -1/T, where T is the sample size. For T = 10 that is about -0.1, while for T = 1000 it is only about -0.001, which explains why it seems to disappear in the larger sample sizes you consider.
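
A minimal simulation sketch of this effect, in case it helps: it averages the lag-1 sample autocorrelation of independent rnorm() draws over many replications and prints it next to -1/T. The helper names (lag1_acf, mean_lag1), the seed, the sample sizes and the number of replications are my own illustrative choices, not taken from the code discussed earlier in this thread.

```r
# Monte Carlo check of the roughly -1/T bias in the lag-1 sample
# autocorrelation of independent normal draws.
set.seed(1)  # arbitrary seed, used only to make the run reproducible

# Lag-1 sample autocorrelation: deviations from the *sample* mean,
# lagged cross-product divided by the sum of squares
# (the same ratio that stats::acf reports at lag 1).
lag1_acf <- function(x) {
  d <- x - mean(x)
  sum(d[-1] * d[-length(d)]) / sum(d^2)
}

# Average the estimator over n_rep independent samples of size n.
mean_lag1 <- function(n, n_rep = 20000) {
  mean(replicate(n_rep, lag1_acf(rnorm(n))))
}

sizes <- c(10, 50, 200)  # illustrative sample sizes (T in the text)
est <- sapply(sizes, mean_lag1)

# Simulated mean of the estimator next to the -1/T approximation.
print(data.frame(T = sizes, simulated = est, approx = -1 / sizes))
```

The draws themselves are independent; the negative average comes from subtracting the sample mean, which is estimated from the same T observations.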

Jan