[R] strange `nls' behaviour
Duncan Murdoch
murdoch at stats.uwo.ca
Mon Nov 12 17:09:21 CET 2007
On 11/12/2007 9:14 AM, Joerg van den Hoff wrote:
> On Mon, Nov 12, 2007 at 07:36:34AM -0500, Duncan Murdoch wrote:
>> On 11/12/2007 6:51 AM, Joerg van den Hoff wrote:
>> >I initially thought, this should better be posted to r-devel
>> >but alas! no response.
>>
>> I think the reason there was no response is that your example is too
>> complicated. You're doing a lot of strange things (fitfunc as a result
>> of deriv, using as.name, as.call, as.formula, etc.) You should simplify
>
> thanks for the feedback.
>
> concerning "lot of strange things": OK. I thought the
> context might be important ("why, for heaven's sake do you
> want to do this!?"), but, then, maybe not. so the easiest
> way to trigger a similar (not the identical) behaviour is
> something like
>
> f <- function (n) {
> #---------------------------------------------------------
> #define n data points for a (hardcoded) model:
> #-----------
> x <- seq(0, 5, length = n)
> y <- 2 * exp(-1*x) + 2;
> y <- rnorm(y,y, 0.01*y)
>
> #the model (same as the above hardcoded one):
> model <- y ~ a * exp (-b*x) + c
>
> #call nls as usual:
> res1 <- try(nls(model, start = c(a=2, b=1, c=2)))
>
> #call it a bit differently:
> res2 <- nls(y ~ eval(model[[3]]), start = c(a=2, b=1, c=2))
>
> list(res1 = res1, res2 = res2)
> #---------------------------------------------------------
> }
I'd say the problem is relying on the default for the envir parameter to
eval. It's reasonable to expect nls to set things up so that terms in
the model formula are taken from the right place, but when your model
formula is playing with evaluation, you should be very careful to make
sure the evaluation takes place in the right context.
The default for envir is "parent.frame()", and that can't be right: that
will see local variables in whatever function called it, so if one of
them has a variable called "model", you'll subscript it.
If I were trying to do what you're doing, I would construct the formula
before the call to nls, and pass that. I.e. something like the
following silly code:
model2 <- model
model2[[3]] <- model[[3]] # The eval() is implicit
res2 <- nls(model2, start = c(a=2, b=1, c=2))
If you really want to put eval() in a formula, then I can't see how you
could avoid an explicit specification of the envir parameter. So I'd
call this a bug in your code.
Duncan Murdoch
> this is without all the overhead of my first example and now
> (since not quite the same) the problem arises at n == 3
> where the fit can't really procede (there are also 3
> parameters -- the first example was more relevant in this
> respect) but anyway the second nls-call does not procede
> beyond the parsing phase of `model.frame'.
>
> so, I can't see room for a real bug in my code, but very
> well a chance that I misuse `nls' (i.e. not understanding
> what really is tolerable for the `model' argument of `nls').
> but if the latter is not the case, I would think there is a
> bug in `nls' (or, actually, rather in `model.frame' or
> whatever) when parsing the nls call.
>
>
>> it down to isolate the bug. Thats a lot of work, but you're the one in
>> the best position to do it. I'd say there's at least an even chance
>> that the bug is in your code rather than in nls().
>>
>> And 2.5.0 *is* ancient; please confirm the bug exists in R-patched if it
>> turns out to be an R bug.
>
> if need be, I'll do that (if I get it compiled under
> macosX). but assuming that you have R-patched installed
> anyway, I would appreciate if you would copy the new example
> above or the old one below to your R- prompt and see,
> whether it crashes with the same error message if called
> with the special number of data points (3 for the new, 5 for
> the old example)? and if so, maybe bring this to the
> attention of d. bates?
>
>
> j. van den hoff
>>
>> Duncan Murdoch
>>
>
>>
>>
>>
>> so I try it here. sory for the
>> >lengthy explanation but it seems unavoidable. to quickly see
>> >the problem simply copy the litte example below and execute
>> >
>> >f(n=5)
>> >
>> >which crashes. called with n != 5 (and of course n>3 since
>> >there are 3 parameters in the model...) everything is as it
>> >should be.
>> >
>> >in detail:
>> >I stumbled over the follwing _very_ strange behaviour/error
>> >when using `nls' which I'm tempted (despite the implied
>> >"dangers") to call a bug:
>> >
>> >I've written a driver for `nls' which allows specifying the
>> >model and the data vectors using arbitrary symbols. these
>> >are internally mapped to consistent names, which poses a
>> >slight complication when using `deriv' to provide analytic
>> >derivatives. the following fragment gives the idea:
>> >
>> >#-----------------------------------------
>> >f <- function(n = 4) {
>> >
>> > x <- seq(0, 5, length = n)
>> >
>> > y <- 2 * exp(-1*x) + 2;
>> > y <- rnorm(y,y, 0.01*y)
>> >
>> > model <- y ~ a * exp (-b*x) + c
>> >
>> > fitfunc <- deriv(model[[3]], c("a", "b", "c"), c("a", "b", "c", "x"))
>> >
>> > #"standard" call of nls:
>> > res1 <- nls(y ~ fitfunc(a, b, c, x), start = c(a=1, b=1, c=1))
>> >
>> > call.fitfunc <-
>> > c(list(fitfunc), as.name("a"), as.name("b"), as.name("c"), as.name("x"))
>> > call.fitfunc <- as.call(call.fitfunc)
>> > frml <- as.formula("y ~ eval(call.fitfunc)")
>> >
>> > #"computed" call of nls:
>> > res2 <- nls(frml, start = c(a=1, b=1, c=1))
>> >
>> > list(res1 = res1, res2 = res2)
>> >}
>> >#-----------------------------------------
>> >
>> >the argument `n' defines the number of (simulated) data
>> >points x/y which are going to be fitted by some model ( here
>> >y ~ a*exp(-b*x)+c )
>> >
>> >the first call to `nls' is the standard way of calling `nls'
>> >when knowing all the variable and parameter names.
>> >
>> >the second call (yielding `res2') uses a constructed formula
>> >in `frml' (which in this example is of course not necessary,
>> >but in the general case 'a,b,c,x,y' are not a priori known
>> >names).
>> >
>> >now, here is the problem: the call
>> >
>> >f(4)
>> >
>> >runs fine/consistently, as does every call with n > 5.
>> >
>> >BUT: for n = 5 (i.e. issuing f(5))
>> >
>> >the second fit leads to the error message:
>> >
>> >"Error in model.frame(formula, rownames, variables, varnames, extras,
>> >extranames, : invalid type (language) for variable 'call.fitfunc'"
>> >
>> >I cornered this to a spot in `nls' where a model frame is
>> >constructed in variable `mf'. the parsing/constructing here
>> >seems simply to be messed up for n = 5: `call.fitfunc' is
>> >interpreted as variable.
>> >
>> >I, moreover, empirically noted that the problem occurs when
>> >the total number of parameters plus dependent/independent
>> >variables equals the number of data points (in the present
>> >example a,b,c,x,y).
>> >
>> >so it is not the 'magic' number of 5 but rather the identity
>> >of data vector length and number of parameters+variables in
>> >the model which leads to the problem.
>> >
>> >this is with 2.5.0 (which hopefully is not considered
>> >ancient) and MacOSX 10.4.10.
>> >
>> >any ideas?
>> >
>> >thanks
>> >
>> >joerg
>> >
>> >______________________________________________
>> >R-help at r-project.org mailing list
>> >https://stat.ethz.ch/mailman/listinfo/r-help
>> >PLEASE do read the posting guide
>> >http://www.R-project.org/posting-guide.html
>> >and provide commented, minimal, self-contained, reproducible code.
>>
More information about the R-help
mailing list