[R] Behaviour of integrate (was 'Poisson-lognormal probability calcul ations')

k.jewell at campden.co.uk k.jewell at campden.co.uk
Fri Feb 15 17:47:42 CET 2008


Hi again,

Adding further information to my own query, this function gets to the core
of the problem, which I think lies in the behaviour of 'integrate'.
-------------------------------------
function (x, meanlog = 0, sdlog = 1, ...) {
    require(stats)
    integrand <- function(t, x, meanlog, sdlog) dpois(x,t)*dlnorm(t,
meanlog, sdlog)
    mapply(function(x, meanlog, sdlog, ...)
#                (1/gamma(x+1))*
         integrate(function(t, x, meanlog, sdlog)
#                   gamma(x+1)*
              integrand(t, x, meanlog, sdlog),
              lower = 0, upper = Inf, x = x, meanlog = meanlog, sdlog =
sdlog, ...)$value,
           x, meanlog, sdlog, ...
           )
                                           }
----------------------------------------
Mathematically, the presence or not of the two commented lines should make
no difference; they multiply the integrand by a constant (with respect to
the integration), then divide the result by the same constant. In practice
they make a big difference! I guess they're altering the behaviour of the
'integrate'.

I'd have thought the presence of the lines would worsen the behaviour.
Without the lines the integrand is reasonably small, the integral is < 1.
With the lines the limit on the integral is x!,  leading to "non-finite
function values" for x much > 170, even if we use logs to get around the
limit on gamma(x).

In fact with the lines the plot of function(x) v. x looks reasonable (but I
don't know if the values are correct!!), but without the lines it looks
silly, I just don't believe it!

I thought the problem might relate to the note in ?integrate "If the
function is approximately constant (in particular, zero) over nearly all its
range it is possible that the result and error estimate may be seriously
wrong.". I wouldn't really expect multiplication by a large constant to fix
such errors (??), but the lognormal distribution is skew so it might be
considered "approximately constant ... over nearly all its range". Even 
though ...
> integrate(dlnorm, 0, Inf)
1 with absolute error < 2.5e-07
... suggests that this is not the source of the problem,  I tried changing
variables to integrate over a normally distributed variable:
----------------
function (x, meanlog = 0, sdlog = 1, ...) {
    require(stats)
    integrand <- function(t, x, meanlog, sdlog) dpois(x,exp(t))*dnorm(t,
meanlog, sdlog)
    mapply(function(x, meanlog, sdlog, ...)
 #              (1/gamma(x+1))*
         integrate(function(t, x, meanlog, sdlog)
 #                  gamma(x+1)*
              integrand(t, x, meanlog, sdlog),
              lower = -Inf, upper = Inf, x = x, meanlog = meanlog, sdlog =
sdlog, ...)$value,
           x, meanlog, sdlog, ...
           )
                                           }
-----------------------------
Still no better; with the constants the values look reasonably smooth (but
are they correct??), without the constants the values are silly.

I've tried reducing rel.tol and increasing subdivisions, they change the
behaviour a little but don't "fix" it, and I still get the marked difference
between the presence and absence of those lines (and I'm increasingly unsure
whether either answer is correct!).

I'm still trying. but I really think I'm going nowhere. Has anyone any
ideas?

Thanks in advance,

Keith Jewell
mailto:k.jewell at campden.co.uk telephone (direct) +44 (0)1386 842055
> -----Original Message-----
> From: Jewell, Keith
> Sent: 15 February 2008 11:16
> To: 'r-help at r-project.org'
> Subject: Poisson-lognormal probability calculations
> 
> Hi,
> 
> just for the record, although I don't think it's relevant (!)
> -------------------------------------
> > sessionInfo()
> R version 2.6.0 (2007-10-03)
> i386-pc-mingw32
> 
> locale:
> LC_COLLATE=English_United Kingdom.1252;LC_CTYPE=English_United
> Kingdom.1252;LC_MONETARY=English_United
> Kingdom.1252;LC_NUMERIC=C;LC_TIME=English_United Kingdom.1252
> 
> attached base packages:
> [1] stats4    splines   stats     graphics  grDevices utils
> datasets
> methods   base
> 
> other attached packages:
> [1] VGAM_0.7-5         xlsReadWrite_1.3.2
> --------------------------------
> 
> I'm having some problems with a Poisson-lognormal density (mass?)
> function.
> 
> VGAM has the dpolono function, but that doesn't work for x-values over
> 170, and I need to go to *much* bigger numbers. It fails first because
> of gamma overflow, then because of non-finite integrand.
> ---------------------
> > VGAM::dpolono(170)
> [1] 4.808781e-09
> > VGAM::dpolono(171)
> [1] 0
> Warning message:
> In VGAM::dpolono(171) : value out of range in 'gammafn'
> > VGAM::dpolono(172)
> [1] 0
> Warning message:
> In VGAM::dpolono(172) : value out of range in 'gammafn'
> > VGAM::dpolono(173)
> [1] 0
> Warning message:
> In VGAM::dpolono(173) : value out of range in 'gammafn'
> > VGAM::dpolono(174)
> Error in integrate(f = integrand, lower = -Inf, upper = Inf, x = x[i],
> :
>   non-finite function value
> -------------
> 
> I tidied up a little (to my eyes only, no offence intended to the
> estimable authors of VGAM) and avoided the gamma overflow by using logs
> - OK so far, it agrees almost perfectly with VGAM::dpolono for x up to
> 170, and now extends the range up to 173 (wow!!).
> -------------
> dApolono <-
> function (x, meanlog = 0, sdlog = 1, ...) {
>     require(stats)
>     integrand <- function(t, x, meanlog, sdlog) exp(-t+x*log(t)-
> log(sdlog*t*sqrt(2*pi))-0.5*((log(t)-meanlog)/sdlog)^2)
>     mapply(function(x, meanlog, sdlog, ...){
>        temp <- try(
>          integrate(f = integrand, lower = 0, upper = Inf, x = x,
> meanlog = meanlog, sdlog = sdlog, ...)
>                   )
>          ifelse(inherits(temp, "try-error"), NA, exp(log(temp$value)-
> lgamma(x+1)))
>                                       }
>               , x, meanlog, sdlog, ...
>            )
>                                            }
> plot(log(dApolono(0:173)))
> dApolono(174)
> -----------------------------
> 
> Addressing the non-finite integrand, I noticed that the gamma(x+1)
> divisor was outside the integrand so that as x gets bigger the
> integrand gets bigger, only to be finally divided by an increasing
> gamma(x+1). I reasoned that putting the divisor inside the integral
> might incur a substantial performance hit, but would keep the integrand
> at reasonable values.
> 
> --------------------------------------
> dApolono <-
> function (x, meanlog = 0, sdlog = 1, ...) {
>     require(stats)
>     integrand <- function(t, x, meanlog, sdlog) exp(-t+x*log(t)-
> log(sdlog*t*sqrt(2*pi))-0.5*((log(t)-meanlog)/sdlog)^2-lgamma(x+1))
>     mapply(function(x, meanlog, sdlog, ...){
>        temp <- try(
>          integrate(f = integrand, lower = 0, upper = Inf, x = x,
> meanlog = meanlog, sdlog = sdlog, ...)
>                   )
>          ifelse(inherits(temp, "try-error"), NA, temp$value)
>                                       }
>               , x, meanlog, sdlog, ...
>            )
>                                            }
> plot(log(dApolono(0:173)))
> dApolono(174)
> dApolono(1E3)
> -------------------------
> 
> This avoids the non-finite integrand and gives me answers at much
> higher x, but now at least some of the answers are wrong (not to say
> silly), even in the range where the other versions worked. I've tried
> other variations including changing the variable of integration to
> log(t) and integrating dpois()*dlnorm(), but I can't fix it; if the
> factorial (=gamma) is inside the integrand I get silly answers, if it's
> outside I get non-finite integrand.
> 
> I'm tearing my hair. Can anyone suggest where I may be going wrong? Any
> suggestions at all will be appreciated.
> 
> Thanks in advance,
> 
> Keith Jewell
> mailto:k.jewell at campden.co.uk telephone (direct) +44 (0)1386 842055
> 



_____________________________________________________________________
The information in this document and attachments is given after the exercise of all reasonable care and skill in its compilation, preparation and issue, but is provided without liability in its application or use.  It may contain privileged information that is exempt from disclosure by law and may be confidential.  If you are not the intended recipient you must not copy, distribute or take any action in reliance on it.  If you have received this document in error please notify us and delete this message from your system immediately.

Campden & Chorleywood Food Research Association Group;
Campden & Chorleywood Food Research Association (company limited by guarantee),registered number 510618);
CCFRA Group Services Ltd. (registered number 3841905); and
CCFRA Technology Ltd  (registered number 3836922),
all registered in England and Wales with the registered office at Station Road, Chipping Campden, Gloucestershire, GL55 6LD.

The Group may monitor e-mail traffic data and also the content of e-mail for the purposes of security and staff training.

Any advice given is subject to our normal terms and conditions of trading, a copy of which is available on request.

________________________________________________________________________
This e-mail has been scanned for all viruses by MessageL...{{dropped:2}}



More information about the R-help mailing list