[R] strange `nls' behaviour

Mon Nov 12 16:18:26 CET 2007

On Mon, Nov 12, 2007 at 03:25:38PM +0100, Martin Maechler wrote:
> >>>>> "DM" == Duncan Murdoch <murdoch at stats.uwo.ca>
> >>>>>     on Mon, 12 Nov 2007 07:36:34 -0500 writes:
> 
>     DM> On 11/12/2007 6:51 AM, Joerg van den Hoff wrote:
>     >> I initially thought, this should better be posted to r-devel
>     >> but alas! no response. 
> 
>     DM> I think the reason there was no response is that your example is too 
>     DM> complicated.  You're doing a lot of strange things (fitfunc as a result 
>     DM> of deriv, using as.name, as.call, as.formula, etc.)  You should simplify 
>     DM> it down to isolate the bug.  Thats a lot of work, but you're the one in 
>     DM> the best position to do it.  I'd say there's at least an even chance 
>     DM> that the bug is in your code rather than in nls().
> 
> yes.. and.. no : 
> - His code is quite peculiar, but I think only slightly too complicated

thank's  for  this  response. I just tried to give an easier
example in response to duncan's mail.

> 
> - one could argue that the bug is in Joerg's thinking that
>   something like
>   	    nls(y ~ eval(fitfunc), ....)
> 
>   should be working at all.
>   But then he had found by experiment that it (accidentally I   d'say)
>   does work in many cases.
> 
>     DM> And 2.5.0 *is* ancient; please confirm the bug exists in R-patched if it 
>     DM> turns out to be an R bug.
> 
> You are right, but indeed (as has Kate just said) it *does*
> exist in current R versions.
> 
> I agree that the behavior of nls() here is sub-optimal.
> It *should* be consistent, i.e. work the same for n=4,5,6,..
> 
> I had spent about an hour after Joerg's R-devel posting,
> and found to be too busy with more urgent matters --
> unfortunately forgetting to give *some* feedback about my findings.
> 
> It may well be that we find that nls() should give an
> (intelligible) error message in such eval() cases - rather than
> only in one case...

well,  if at all possible, it would _really_ be nice to have
the opportunity  to  pass  `eval'  constructs  throught  the
`model'  argument  to `nls'. I'm far from understanding what
actually is going on in `model.frame', but if it parses  the
thing  correctly  ("accidentally"  or  not)  except  in  one
special case, is'nt  there  a  clean  way  to  "repair"  the
special  case?  in my application I found the ability to use
`eval' constructs here really helpful. at least  I  did  not
find another way to be able to use `deriv' information _and_
allow the user to use arbitrary symbols  in  specifying  the
model  formula. at least I'd prefer the present situation of
"works nearly always" to a consistent "works never".

joerg

> 
> Martin Maechler
> 
> 
>     DM> Duncan Murdoch
> 
> 
> 
> 
>     DM> so  I  try  it  here.  sory  for  the
>     >> lengthy explanation but it seems unavoidable. to quickly see
>     >> the problem simply copy the litte example below and execute
>     >> 
>     >> f(n=5)
>     >> 
>     >> which  crashes. called with n !=  5 (and of course n>3 since
>     >> there are 3 parameters in the model...) everything is as  it
>     >> should be.
>     >> 
>     >> in detail:
>     >> I  stumbled over the follwing _very_ strange behaviour/error
>     >> when using `nls' which  I'm  tempted  (despite  the  implied
>     >> "dangers") to call a bug:
>     >> 
>     >> I've  written a driver for `nls' which allows specifying the
>     >> model and the data vectors using arbitrary  symbols.   these
>     >> are  internally  mapped  to  consistent names, which poses a
>     >> slight complication when using `deriv' to  provide  analytic
>     >> derivatives. the following fragment gives the idea:
>     >> 
>     >> #-----------------------------------------
>     >> f <- function(n = 4) {
>     >> 
>     >> x <- seq(0, 5, length  = n)
>     >> 
>     >> y <- 2 * exp(-1*x) + 2; 
>     >> y <- rnorm(y,y, 0.01*y)
>     >> 
>     >> model <- y ~ a * exp (-b*x) + c
>     >> 
>     >> fitfunc <- deriv(model[[3]], c("a", "b", "c"), c("a", "b", "c", "x"))
>     >> 
>     >> #"standard" call of nls:
>     >> res1 <- nls(y ~ fitfunc(a, b, c, x), start = c(a=1, b=1, c=1))
>     >> 
>     >> call.fitfunc <- 
>     >> c(list(fitfunc), as.name("a"), as.name("b"), as.name("c"), as.name("x"))
>     >> call.fitfunc <- as.call(call.fitfunc)
>     >> frml <- as.formula("y ~ eval(call.fitfunc)")
>     >> 
>     >> #"computed" call of nls:
>     >> res2 <- nls(frml, start = c(a=1, b=1, c=1))
>     >> 
>     >> list(res1 = res1, res2 = res2)
>     >> }
>     >> #-----------------------------------------
>     >> 
>     >> the  argument  `n'   defines  the number of (simulated) data
>     >> points x/y which are going to be fitted by some model ( here
>     >> y ~ a*exp(-b*x)+c )
>     >> 
>     >> the first call to `nls' is the standard way of calling `nls'
>     >> when knowing all the variable and parameter names.
>     >> 
>     >> the second call (yielding `res2') uses a constructed formula
>     >> in `frml' (which in this example is of course not necessary,
>     >> but  in  the general case 'a,b,c,x,y' are not a priori known
>     >> names).
>     >> 
>     >> now, here is the problem: the call
>     >> 
>     >> f(4)
>     >> 
>     >> runs fine/consistently, as does every call with n > 5.
>     >> 
>     >> BUT: for n = 5 (i.e. issuing f(5))
>     >> 
>     >> the second fit leads to the error message:
>     >> 
>     >> "Error in model.frame(formula, rownames, variables, varnames, extras, extranames,  : 
>     >> invalid type (language) for variable 'call.fitfunc'"
>     >> 
>     >> I  cornered  this  to a spot in `nls' where a model frame is
>     >> constructed in variable `mf'.  the parsing/constructing here
>     >> seems  simply  to  be messed up for n = 5: `call.fitfunc' is
>     >> interpreted as variable.
>     >> 
>     >> I,  moreover, empirically noted that the problem occurs when
>     >> the total number of  parameters  plus  dependent/independent
>     >> variables  equals  the number of data points (in the present
>     >> example a,b,c,x,y).
>     >> 
>     >> so it is not the 'magic' number of 5 but rather the identity
>     >> of data vector length and number of parameters+variables  in
>     >> the model which leads to the problem.
>     >> 
>     >> this  is  with  2.5.0  (which  hopefully  is  not considered
>     >> ancient) and MacOSX 10.4.10.
>     >> 
>     >> any ideas?
>     >> 
>     >> thanks
>     >> 
>     >> joerg
>     >> 
>     >> ______________________________________________
>     >> R-help at r-project.org mailing list
>     >> https://stat.ethz.ch/mailman/listinfo/r-help
>     >> PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
>     >> and provide commented, minimal, self-contained, reproducible code.
> 
>     DM> ______________________________________________
>     DM> R-help at r-project.org mailing list
>     DM> https://stat.ethz.ch/mailman/listinfo/r-help
>     DM> PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
>     DM> and provide commented, minimal, self-contained, reproducible code.