[R] Rép : ANOVA and permutation tests : beware of traps

Wed Sep 24 14:24:37 CEST 2014

On 24/09/2014 8:00 AM, Stéphane Adamowicz wrote:
> Many thanks to J. Wiley and B. Ripley for their quick answers.
> I suppose that many end users are aware of problems in calculation accuracy with computers. However, I would like to comment that it was not that obvious for me that the data order matters. First, I do not find any clear mention of this particular problem in the == help page, but perhaps I am experiencing difficulties with the English. Second, I do not encounter this problem neither with the piece of code I proposed to replace the dubious one, or in the following experiments :

I suspect that there would be situations where your code failed as 
well.  Now that I know about this potential error, I would not trust 
code that doesn't defend against it, even if I could not find an example 
where it failed:  it depends on tests of exact equality in floating 
point values, and that is known to be a source of errors in numerical 
methods.
>
> > # experiment 1 : comparing total variances
> > var(dat1$Y) == var(dat2$Y)
> [1] TRUE
> > # experiment 2 : comparing bilateral T tests
> > abs(t.test(Y~F, dat1)$statistic) == abs(t.test(Y~F, dat2)$statistic)
>     t
> TRUE
> > # experiment 3 : applying permutations to T tests
> > nperm <- 10000
> > T <- abs(t.test(Y~F, dat1)$statistic)
> > Tperm <- replicate(n=nperm, abs(t.test(sample(Y)~F, dat1)$statistic))
> [1] 0.1018898	# that's nice !
>
> Thus, why a naive end user as I am should expect such pitfalls with F values given by the lm function ? Furthermore, codes similar to the one I criticized can be found in the teaching documents of various Universities and thus are spreading out. I would not be surprised that some scientific papers already rely on it ...
>
> In fact, even in R web pages, under Books ("This page gives a partially annotated list of books that are related to S or R and may be useful to the R user community"), I found only one book clearly devoted to randomization methods : "[32] Laura Chihara and Tim Hesterberg. Mathematical Statistics with Resampling and R. Wiley, 1st edition, 2011. ISBN 978-1-1180-2985-5". Looking at the author's profiles, I would say that "Beware of the trap of listening to people with no knowledge of basic numerical methods!" does not apply to them. Here is their recommended R code for one-way anova (chapter 12, but adapted to my example data):

You should tell the authors that their code doesn't work.  Many books 
contain errors, but they only get corrected when authors are informed 
about them.
>
> > observed <- anova(lm(Y ~ F, dat1))$F[1]
> > n <- length(dat1$Y)
> > B <- 10^5 - 1
> > results <- numeric(B)
> > for (i in 1:B)
> + {
> + 	index <- sample(n)
> + 	Y.perm <- dat1$Y[index]
> + 	results[i] <- anova(lm(Y.perm ~ dat1$F))$F[1]
> + }
> > (sum(results> observed) + 1) / (B + 1)	# P value
> [1] 0.03969
>
> Well, it seems that I am not the only guy who does not find the trap obvious ...
>
> Thus, my final question remains : How can we evaluate the reliability of CRAN packages that propose randomization (or bootstrap) methods ?
The same way as you evaluate the reliability of any software:  you 
examine the code for common errors, you apply tests to problems that you 
know to be difficult, etc.  At least all CRAN packages provide you with 
source code.

Duncan Murdoch