[R] sum over extremely small numbers

Thu Aug 16 22:09:23 CEST 2012

On Thu, Aug 16, 2012 at 01:41:17PM -0400, Jie wrote:
> Dear All,
> 
> I am evaluating the value of loglikelihood and it ends up with the sum of
> tiny numbers.
> Below is an example: suppose I would like to calculate sum_i (log (sum_j x
> [i, j] )), the index of log (x) is in the range, say (-2000, 0). I am aware
> that exp(-744.5) will be expressed as 0 in 32 bit R and exp
> Is there a way to improve the result?
> 
> R example:
> powd <- sample(-2000:0, 100, replace=T)  # the power of x [i, j]
> x <- matrix(exp(powd),10)                            # the value of x
> sum(log(rowSums(x)))                                 # sum

Hi.

The numbers in a row of the matrix "x" are not only very small, but
also very different from each other. In this situation, their maximum
may be a good approximation of their sum. For example

  2^-1000 + 2^-500 + 2^-100 \approx 2^-100

since

  (2^-1000 + 2^-500 + 2^-100)/2^-100

is very close to 1.

If the numbers are not so different, then the following formula
for a vector "x", which may be a single row of a matrix, may be
helpful

  log(sum(exp(x))) = max(x) + log(sum(exp(x - max(x))))

Note that exp() is not applied to the original numbers, but to their
differences and the main term is exp(0) due to the maximum element
of x. So, at least the main term is computed without underflow.

The accuracy may be further improved by eliminating exp(0) from the
sum in the right hand side and using function log1p(x).

  set.seed(1234567)
  x <-  - runif(10)
  A <- log(sum(exp(x)))
  y <- x[-which.max(x)] 
  B <- max(x) + log1p(sum(exp(y - max(x))))
  A

  [1] 1.771461

  abs(A - B)

  [1] 2.220446e-16

See also

  http://rwiki.sciviews.org/doku.php?id=misc:r_accuracy:computing_using_logarithms

Hope this helps.

Petr Savicky.