[R] Surprising behaviour of fisher.test

(Ted Harding) ted.harding at wlandres.net
Mon Aug 29 16:47:15 CEST 2011


On 29-Aug-11 11:44:28, Öhagen Patrik wrote:
> I did a simple little simulation of a binary variable in a two armed
> trial. I was quite surprised by the number of p-values delivered by the
> fisher.test function which was >1(!). Of course, under the null
> hypothesis you expect a fair number of outcomes with the same number of
> event in both arms but still?
> 
> Is there some silly error in my crude code?
> 
>#########################################################################
>################
> 
> niter<-50000
> ra<-rbinom(niter,100,.05)
> 
> rb<-rbinom(niter,100,.05)
> 
> pval<-rep(NA,niter)
> 
> 
> for (i in 1:niter){
> apa<-matrix(c(100-ra[i],ra[i],100-rb[i],rb[i]),byrow=T,ncol=2)
> pval[i]<-fisher.test(apa)$p.value
> 
> }
> 
> 
> cbind(ra,rb,pval)[pval < 0.06 & pval > 0.04,]
> hist(pval,probability=T)
> summary(pval)
> 
> table(pval< 0.05)/niter
> 
> sum(pval>1)/niter
> 
>########################################################
> Patrik Öhagen

After reading your posting, and being puzzled by "I was
quite surprised by the number of p-values delivered by the
fisher.test function which was >1(!).", I ran your code.
In each of three runs I got sum(pval>1) = 0.

It would indeed be surprising to get any pval > 1, since
the Fisher P-value is the sum of a subset of the probabilities
possible for the table. This subset may sometimes be all of
them, but even so (unless there was an unusual rounding error)
pone should not see pval > 1.

There are certainly many P-values equal to 1. On my third sun
(of 50000) I get

  sum(pval==1)
  # [1] 11520

  sum(pval==1)/niter
   [1 ] 0.2304


The probability of pval=1 in an interation is *at least* the
probability (in your setup of the tables) that ra[i] = rb[i],
which would be the probability that two independent binomial
samples of size 100 with p=0.05  should give the same result,
which is

sum((dbinom((0:100),100,0.05))^2)   which = 0.1307316

*"at least"* because, depending on the configuration of the
random 2x2 table, there are other possibilities for the Fisher
P-value to equal 1.

So the large number of P-values equal to 1 (as can be clearly
seen from the histogram) is not a surprise.

I am, therefore wondering if you really observed any pval > 1?
Did you confuse "pvale == 1" with "pval > 1" in your posting?
If you really did get any "pval > 1", how many did you get?

Ted.

--------------------------------------------------------------------
E-Mail: (Ted Harding) <ted.harding at wlandres.net>
Fax-to-email: +44 (0)870 094 0861
Date: 29-Aug-11                                       Time: 15:47:12
------------------------------ XFMail ------------------------------



More information about the R-help mailing list