[R] discrepancy in fisher exact test between R and wiki formula

(Ted Harding) Ted.Harding at wlandres.net
Mon Dec 3 23:24:04 CET 2012

On 03-Dec-2012 21:22:28 JiangMei wrote:
> Hi All. Sorry to bother you. I have a question about fisher exact test.
> I counted the presence of gene mutation in two groups of samples.
> My data is as follows
>          Presence   Absence
> GroupA   4          6
> GroupB   5          11
> When using the formula of fisher exact test provided by wiki
> (http://en.wikipedia.org/wiki/Fisher%27s_exact_test), the p-value is 0.29.
> But when calculated by R, the p-value is 0.69. My code is shown below
> counts<-c(4,5,6,11)
> data<-matrix(counts,nrow=2)
> fisher.test(data)
> Why did I get two different numbers? Is there anything wrong with my R codes?
> Wish your help! Thanks very much! I really appreciate it.

The reason is that the formula given in Wikipedia is for one particular
set of values (a,b,c,d). In your case, a=4, b=6, c=5, d=11 and the
Wikipedia formula for p gives the probability of (a,b,c,d) = (4,6,5,11).
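
As a quick check, that single-table probability can be reproduced in R,
either directly from the binomial coefficients or with dhyper(). This is
only a sketch; the variable names (tab, row_tot, col_tot, p_formula,
p_dhyper) are my own and not part of your code.

```r
# Observed table, filled column-wise as in the original code:
# rows = GroupA, GroupB; columns = Presence, Absence
tab <- matrix(c(4, 5, 6, 11), nrow = 2)
row_tot <- rowSums(tab)   # 10, 16
col_tot <- colSums(tab)   # 9, 17
n <- sum(tab)             # 26

# Wikipedia formula: probability of this one table, margins held fixed
p_formula <- choose(row_tot[1], tab[1, 1]) *
             choose(row_tot[2], tab[2, 1]) / choose(n, col_tot[1])

# The same quantity as a hypergeometric probability:
# dhyper(x, m, n, k) with x = a, m = GroupA total, n = GroupB total,
# k = number of samples with the mutation present
p_dhyper <- dhyper(tab[1, 1], row_tot[1], row_tot[2], col_tot[1])

round(c(p_formula, p_dhyper), 4)
```

Both values come out at about 0.29, the number you obtained from the
Wikipedia formula.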

However, this is not the P-value for the test. For a two-sided
alternative (see ?fisher.test) the P-value is the sum of all such
probabilities over tables (a,b,c,d) with the same margins as yours,
a+b = 10, c+d = 16, a+c = 9, b+d = 17, AND whose probability is less
than or equal to the probability of (4,6,5,11). The sum therefore
includes the observed table and (in general) others, so it will be
greater (0.69) than the value (0.29) given by the formula.
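
The summation can be sketched by hand in R as follows. Again the
variable names are my own, and the small relative tolerance in the
comparison is an assumption on my part to guard against floating-point
ties; it is not something your data require.

```r
tab <- matrix(c(4, 5, 6, 11), nrow = 2)
row_tot <- rowSums(tab)
col_tot <- colSums(tab)

# All values of the top-left cell a that are compatible with the margins
a_vals <- max(0, col_tot[1] - row_tot[2]):min(row_tot[1], col_tot[1])

# Probability of each such table, and of the observed one (a = 4)
probs <- dhyper(a_vals, row_tot[1], row_tot[2], col_tot[1])
p_obs <- dhyper(tab[1, 1], row_tot[1], row_tot[2], col_tot[1])

# Two-sided P-value: sum over tables no more probable than the observed
# one (tolerance 1e-7 is an assumed safeguard against rounding)
p_two_sided <- sum(probs[probs <= p_obs * (1 + 1e-7)])

round(p_two_sided, 4)
round(fisher.test(tab)$p.value, 4)   # should agree with the line above
```

With these margins a runs from 0 to 9, and the only table more probable
than the observed one is a = 3, so every other table contributes to the
sum.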

The default alternative for R's fisher.test() is "two.sided".
If you look at ?fisher.test you will see:

  Two-sided tests are based on the probabilities of the tables,
  and take as 'more extreme' all tables with probabilities less
  than or equal to that of the observed table, the p-value being
  the sum of such probabilities.

I hope this helps.

E-Mail: (Ted Harding) <Ted.Harding at wlandres.net>
Date: 03-Dec-2012  Time: 22:24:00
This message was sent by XFMail
