[R] Product of 1 - probabilities

Thu May 21 16:59:51 CEST 2009

On 21-May-09 14:15:08, Mark Bilton wrote:
> I am having a slight problem with probabilities.
> 
> To calculate the final probability of an event p(F), we can take the
> product of the chance that each independent event that makes p(F) will
> NOT occur.
> So...
> p(F) = 1- ( (1-p(A)) * (1-p(B)) * (1-p(C))...(1-p(x)) )
> 
> If the chance of an event within the product occurring remains the
> same, we can therefore raise this probability to a power of the number
> of times that event occurs.
> e.g. rolling a dice p(A) = 1/6 of getting a 1...
> p(F) = 1 - (1- (1/6))^z 
> p(F) = 1 - (1-p(A))^z tells us the probabiltity of rolling a 1 'at
> least once' in z number of rolls.
> 
> So then to R...
> 
> if p(A) = 0.01; z = 4; p(F) = 0.039
> 
> obviously p(F) > p(A)
> 
> however the problem arises when we use very small numbers e.g. p(B) = 1
> * 10^-30
> R understands this value
> However when you have 1-p(B) you get something very close to 1 as you
> expect...but R seems to think it is 1.
> So when I calculate p(F) = 1 - (1-p(B))^z = 1 to the power anything
> equals 1 so p(F) = 0 and not just close to zero BUT zero.
> It doesn't matter therefore if z = 1*10^1000, the answer is still zero
> !!
> 
> Obviously therefore now p(F) < p(B)
> 
> Is there any solution to my problem, e.g.
> - is it a problem with the sum (-) ? ie could I change the number of
> bits the number understands (however it seems strange that it can hold
> it as a value close to 0 but not close to 1)
> -or should I maybe use a function to calculate the exact answer ?
> -or something else
> 
> Any help greatly appreciated
> Mark

The problem is not with sum(), but with the fact that R adopts the
exact value 1 for (1-x) if (on my machine) x < 10^(-16). This is to
do with the number of binary digits used to store a number. Your
10^(-30) is much smaller than that!

  x <- 10^(-30)
  x
# [1] 1e-30
  y <- (1-x) ; 1-y
# [1] 0
  x <- 10^(-15)
  y <- (1-x) ; 1-y
# [1] 9.992007e-16
  x <- 10^(-16)
  y <- (1-x) ; 1-y
# [1] 1.110223e-16
  x <- 10^(-17)
  y <- (1-x) ; 1-y
# [1] 0

Read ?.Machine and, in particular, about:
  double.eps: the smallest positive floating-point number 'x'
    such that '1 + x != 1'.  It equals 'base^ulp.digits' if
    either 'base' is 2 or 'rounding' is 0;  otherwise, it is
    '(base^ulp.digits) / 2'.  Normally '2.220446e-16'.

There is no cure for this in R, but you could try to work round it
(depending on the details of your problem) by using an approximation
to the value of the expression.

For instance, if x is very small (say 10^(-20)), and n is not very
large, then to first order

  (1 - x)^n = 1 - n*x

or, if X is a vector of very small numbers,

  prod((1 - X)) = 1 - sum(X)

Either of these may still result in 1 if what is being subtracted
is less than .Machine$double.eps.

However, in the particular type of application you describe,

  1 - (1 - x)^n = n*x to first order
  1 - prod((1 - X)) = sum(X) to first order

You will just have to work out what will give you a sensible answer.
Basically, you are trying to operate beyond the limits of precision
of R, and so will need to re-cast your question in alternative terms
which will yield an adequately precise result.

Hoping this helps,
Ted.

--------------------------------------------------------------------
E-Mail: (Ted Harding) <Ted.Harding at manchester.ac.uk>
Fax-to-email: +44 (0)870 094 0861
Date: 21-May-09                                       Time: 15:59:47
------------------------------ XFMail ------------------------------