[R] [Rd] Formulas in gam function of mgcv package

Wed Aug 26 12:13:30 CEST 2009

Dear Simon,

Thanks for your answer.

I am running the model with both s and te smoothing, to compare.

A few questions on your email:

1) Isotropic smoothness: my variables are centred and scaled, so I assumed
an isotropic smoother (that is, a smoother that treats all the variables in
the same way) would be appropriate. What do you think? Is my understanding
of isotropic smoothing wrong?

2) s(x1,...,xn): it does not contain (1), but I thought it still improves
on (1) by being free to include some interactions, albeit not explicitly.
Is my interpretation wrong?

3) te: I am confused! What does it mean that the function space for (4) is
built up from the function spaces used in (3)? Does it mean that
te(x1,...,xn) is an expansion on the te(xi), including all the product terms
te(x1)*te(x2)*...*te(xj)*...*te(xn) of the different orders?

Example: in the case of 4 variables, would it include te(x1)*te(x2),
te(x2)*te(x3), ..., te(x1)*te(x2)*te(x3), ..., up to
te(x1)*te(x2)*te(x3)*te(x4)?

Sorry for being particularly daft ....

Regards

On Wednesday 26 August 2009 09:56:13 you wrote:
> > > I am trying to understand the relationships between:
> > >
> > > y~s(x1)+s(x2)+s(x3)+s(x4)
> > >
> > > and
> > >
> > > y~s(x1,x2,x3,x4)
> > >
> > > Does the latter contain the former? what about the smoothers of all
> > > interaction terms?
>
> The first says that you want a model
> E(y) = f_1(x_1) + f_2(x_2) + f_3(x_3) + f_4(x_4) (1)
> where the f_j are smooth functions. The additive decomposition is quite a
> strong assumption, since it assumes that the effect of x_j is not dependent
> on x_k unless j=k. The second model is just
> E(y) = f(x_1,x_2,x_3,x_4)                                         (2)
> where f is a smooth function. This looks very general, but actually `s'
> terms assume isotropic smoothness, which is also quite a strong assumption.
>
> Now if I simply state that f and the f_j are `smooth functions', and leave
> it at that, then (2) would of course contain (1), but to actually estimate
> the models I need to state, mathematically, what I mean by `smooth'. Once
> I've done that I've pretty much determined the function spaces in which f
> and the f_j will lie, and in general (2) will no longer strictly contain
> (1). mgcv's `s' terms use a thin plate spline measure of smoothness for
> multivariate smooths, and this means that (1) will not be strictly nested
> within (2), since e.g. a 4D thin plate spline can not generally represent
> exactly what the sum of 4 1D splines can represent.
>
> If you want to achieve exact nesting then using tensor product smooths with
> something like
>
> y~te(x1)+te(x2)+te(x3)+te(x4)   (3)
>
> y~te(x1,x2,x3,x4)                         (4)
>
> will do the trick (because the function space for (4) is built up from the
> function spaces used in (3)).
>
> As to where all the 2 and 3 way interactions have gone in (4)... it's just
> like ANOVA - if you put in a 4 way interaction then the lower order
> interactions are not identifiable, unless you choose to add constraints to
> make them so. `mgcv' will allow you to add main effects and interactions, and
> will handle the constraints automatically, but if this sort of functional
> ANOVA is a major component of what you want to do, then it is probably
> worth checking out the gss package and Chong Gu's book on smoothing spline
> ANOVA.
>
> best,
> Simon
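
To make the nesting statement about (3) and (4) concrete, here is a minimal
sketch for two variables in base R. It is only an illustration under stated
assumptions: it does not call mgcv's te() at all, splines::bs() stands in for
the marginal bases, and row_tensor() is a hypothetical helper written for
this example. The idea is that a tensor product basis consists of all
products of one marginal basis function in x1 with one in x2, and that the
additive basis is contained in its span whenever each marginal basis can
represent a constant.

```r
library(splines)  # ships with base R; bs() builds B-spline bases

set.seed(1)
n  <- 200
x1 <- runif(n)
x2 <- runif(n)

# Marginal bases (assumption: cubic B-splines, 5 functions per margin;
# intercept = TRUE so that each basis can represent a constant).
# matrix() strips the "bs" class and leaves a plain numeric matrix.
A <- matrix(bs(x1, df = 5, intercept = TRUE), nrow = n)
B <- matrix(bs(x2, df = 5, intercept = TRUE), nrow = n)

# Hypothetical helper: row-wise tensor product of two basis matrices.
# The result has one column A[, i] * B[, j] for every pair (i, j).
row_tensor <- function(A, B) {
  do.call(cbind, lapply(seq_len(ncol(A)), function(i) A[, i] * B))
}

X_add <- cbind(A, B)       # basis for f_1(x1) + f_2(x2), as in (3)
X_te  <- row_tensor(A, B)  # basis for f(x1, x2), as in (4); 25 columns

# Largest residual left when projecting one basis onto the span of the other.
nested_gap  <- max(abs(qr.resid(qr(X_te), X_add)))  # additive inside tensor?
reverse_gap <- max(abs(qr.resid(qr(X_add), X_te)))  # tensor inside additive?

cat("additive -> tensor residual:", nested_gap, "\n")
cat("tensor -> additive residual:", reverse_gap, "\n")
```

The first residual should be at rounding-error level, because every additive
basis function is an exact linear combination of the product columns. The
second should be clearly nonzero, because the product columns carry the
interaction that the additive model cannot represent. This says nothing about
the penalties or identifiability constraints that mgcv applies on top of the
basis.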

-- 
Corrado Topi

Global Climate Change & Biodiversity Indicators
Area 18,Department of Biology
University of York, York, YO10 5YW, UK
Phone: + 44 (0) 1904 328645, E-mail: ct529 at york.ac.uk