[R] gam()
John Fox
jfox at mcmaster.ca
Thu Jun 5 17:12:17 CEST 2003
Dear Henric,
At 05:01 PM 6/4/2003 +0200, Henric Nilsson wrote:
>I've now spent a couple of days trying to learn R and, in particular, the
>gam() function, and I now have a few questions and reflections regarding
>the latter. Maybe these things are implemented in some way that I'm not
>yet aware of or have perhaps been decided by the R community to not be
>what's wanted. Of course, my lack of complete theoretical understanding of
>what mgcv really does may also show...
>
>1. When fitting models where a factor interacts with a smooth term, say
>y~a+s(x,by=a.1)+s(x,by=a.2), I noticed that the rug in the plot of each of
>the smooth terms is identical. I expected the rug in the plot of e.g.
>s(x,by=a.1) to only include those x for which a.1=1 to be able to judge if
>observations of x where a.1=1 are sparse in any region. Also, it would be
>really nice if the "by=..." was included in the output of the plot.gam()
>and the "Approximate significance of smooth terms:" part of the summary.gam().
>
>2. John Fox has modified anova.glm() into anova.gam()
>(http://www.socsci.mcmaster.ca/jfox/Books/Companion/nonparametric-regression.txt)
>for comparison of two or more fitted models based on the difference
>between residual deviances. Indiscriminate use of such a procedure
>shouldn't perhaps be encouraged, but I think that many users expect it to
>be part of the mgcv package since this model selection idea is covered in
>several texts and also implemented in S-plus (and may be OK for truly
>nested models). And even if it's been decided that this functionality is
>not wanted in mgcv, perhaps another function comparing several models by
>the GCV/UBRE score and other useful statistics can be implemented?
The problem with comparing two gams in R fit with mgcv is that, by default,
the degree of smoothing for terms is selected independently for each model.
Simon Wood previously posted a message to the R-help list discussing this
issue and making some suggestions. The issue doesn't arise in the same way
with models fit by the gam function in S-PLUS because the degree of
smoothing there is instead selected by the user. I should update my
appendix on nonparametric regression to discuss this question -- the
current presentation isn't really adequate.
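
A minimal sketch of the point above (an illustration only, not code taken from the appendix or from an earlier list posting): if the degree of smoothing is fixed by the user, as in S-PLUS, two nested models can be compared through the difference in residual deviance. The simulated data, the names x, z and y, and the choice k = 5 are assumptions made for the example; fx = TRUE asks mgcv for unpenalized regression splines with fixed degrees of freedom.

```r
library(mgcv)

# Simulated data (hypothetical, for illustration only)
set.seed(1)
n <- 200
x <- runif(n)
z <- runif(n)
y <- sin(2 * pi * x) + 0.5 * z + rnorm(n, sd = 0.3)

# fx = TRUE gives each term a fixed k - 1 = 4 degrees of freedom, so the
# amount of smoothing is NOT re-selected separately for each model.
m0 <- gam(y ~ s(x, k = 5, fx = TRUE))
m1 <- gam(y ~ s(x, k = 5, fx = TRUE) + s(z, k = 5, fx = TRUE))

# F test on the difference in residual deviance (Gaussian errors assumed)
dev0 <- deviance(m0);    dev1 <- deviance(m1)
df0  <- df.residual(m0); df1  <- df.residual(m1)
Fstat <- ((dev0 - dev1) / (df0 - df1)) / (dev1 / df1)
pval  <- pf(Fstat, df0 - df1, df1, lower.tail = FALSE)

cat("F =", Fstat, "on", df0 - df1, "and", df1, "df; p =", pval, "\n")
```

With the default penalized terms, where mgcv chooses the smoothing parameters itself, the two fits are in general no longer nested in this simple sense, so the same calculation would be at best approximate.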
>3. Some authors [1, 2] suggest pointwise estimation of odds ratios and
>corresponding confidence intervals based on the smooth terms in a GAM.
>Maybe something for mgcv?
>[1] Figueiras, A. & Cadarso-Suárez, C. (2001) "Application of Nonparametric
>Models for calculating Odds Ratios and Their Confidence Intervals for
>Continuous Exposures", American Journal of Epidemiology, 154(3), 264-275.
>[2] Saez, M., Cadarso-Suárez, C. & Figueiras, A. (2003) "np.OR: an S-Plus
>function for pointwise nonparametric estimation of odds-ratios of
>continuous predictors", Computer Methods and Programs in Biomedicine, 71,
>175-179.
>
>4. For each purely parametric covariate a t-test is produced; I'd like to
>have something like S-plus' anova.gam() to get an overall test. (Perhaps
>with the addition of a choice between Type I and Type III tests, but I
>guess that may be controversial). Is it possible?
John
-----------------------------------------------------
John Fox
Department of Sociology
McMaster University
Hamilton, Ontario, Canada L8S 4M4
email: jfox at mcmaster.ca
phone: 905-525-9140x23604
web: www.socsci.mcmaster.ca/jfox