[R] another optimization question

Prof Brian D Ripley ripley at stats.ox.ac.uk
Sun Nov 25 09:26:23 CET 2001

On Sat, 24 Nov 2001, John Fox wrote:

> Dear R list members,
> Since today seems to be the day for optimization questions, I have one that
> has been puzzling me:
> I've been doing some work on sem, my structural-equation modelling package.
> The models that the sem function in this package fits are essentially
> parametrizations of the multinormal distribution.  The function uses optim
> and nlm sequentially to maximize a multinormal likelihood. One of the
> changes I've introduced is to use an analytic gradient rather than rely on
> numerical derivatives. (If I can figure it out, I'd like to use an analytic
> Hessian as well.)
> I could provide additional details, but the question that I have is
> straightforward. I expected that using an analytic gradient would make the
> program faster and more stable. It *is* substantially faster, by up to an
> order of magnitude on the problems that I've tried. In one case, however, a
> model that converged (to the published solution) with numerical derivatives
> failed to converge with analytic derivatives. I can program around the
> problem, by having the program fall back to numerical derivatives when
> convergence fails, but I was surprised by this result, and I'm concerned
> that it reflects a programming problem or an error in my math. I suspect
> that if I had made such an error, however, the other examples I tried would
> not have worked so well.
> So, my question is, is it possible in principle for an optimization to fail
> using a correct analytic gradient but to converge with a numerical
> gradient? If this is possible, is it a common occurrence?

It's possible but rare.  You don't have a `correct analytic gradient', but
a numerical computation of it.  Inaccurately computed gradients are a
common cause of convergence problems. You may need to adjust the
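
One way to check whether a coded gradient is accurate is to compare it with a finite-difference approximation at a few points. The sketch below is only an illustration: it uses the Rosenbrock function and gradient from the ?optim examples as a stand-in for the sem likelihood, and num_grad is a hypothetical helper, not part of R or of sem.

```r
# Stand-in objective and analytic gradient (Rosenbrock, as in ?optim);
# the real sem likelihood would take their place.
fr <- function(x) 100 * (x[2] - x[1]^2)^2 + (1 - x[1])^2
grr <- function(x) c(-400 * x[1] * (x[2] - x[1]^2) - 2 * (1 - x[1]),
                     200 * (x[2] - x[1]^2))

# Hypothetical helper: central-difference approximation to the gradient.
num_grad <- function(f, x, h = 1e-6) {
  vapply(seq_along(x), function(i) {
    e <- replace(numeric(length(x)), i, h)  # step of size h in coordinate i
    (f(x + e) - f(x - e)) / (2 * h)
  }, numeric(1))
}

x0 <- c(-1.2, 1)
analytic <- grr(x0)
numeric_approx <- num_grad(fr, x0)

# Scaled discrepancy; a large value suggests a coding or algebra slip.
max_rel_err <- max(abs(analytic - numeric_approx) / (1 + abs(analytic)))
print(max_rel_err)
```

Repeating the comparison at the point where the analytic run stalls, and not only at the starting values, may show whether the gradient loses accuracy in that region.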

It's also possible in principle that the optimizer takes a completely
different path from the starting point due to small differences in
calculated derivatives.  It's worth trying starting near the expected
answer to rule this out.
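
A rough sketch of that experiment, again assuming the Rosenbrock function from ?optim as a stand-in for the actual likelihood, and with starting values chosen arbitrarily for illustration:

```r
# Stand-in objective and analytic gradient (Rosenbrock, as in ?optim).
fr <- function(x) 100 * (x[2] - x[1]^2)^2 + (1 - x[1])^2
grr <- function(x) c(-400 * x[1] * (x[2] - x[1]^2) - 2 * (1 - x[1]),
                     200 * (x[2] - x[1]^2))

far_start  <- c(-1.2, 1)   # the usual starting values
near_start <- c(0.9, 0.9)  # close to the known solution c(1, 1)

# Analytic gradient from both starts; numerical gradient (gr omitted)
# from the far start for comparison.
fit_far  <- optim(far_start,  fr, grr, method = "BFGS")
fit_near <- optim(near_start, fr, grr, method = "BFGS")
fit_num  <- optim(far_start,  fr,      method = "BFGS")

# convergence == 0 means optim reported success.
print(rbind(far  = c(fit_far$par,  conv = fit_far$convergence),
            near = c(fit_near$par, conv = fit_near$convergence),
            num  = c(fit_num$par,  conv = fit_num$convergence)))
```

If the analytic-gradient fit converges from the nearby start but not from the original one, the failure is more likely a matter of the path taken than of an error in the derivatives.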

Brian D. Ripley,                  ripley at stats.ox.ac.uk
Professor of Applied Statistics,  http://www.stats.ox.ac.uk/~ripley/
University of Oxford,             Tel:  +44 1865 272861 (self)
1 South Parks Road,                     +44 1865 272860 (secr)
Oxford OX1 3TG, UK                Fax:  +44 1865 272595
