[R] unexpected GAM result - at least for me!

Duncan Murdoch murdoch at stats.uwo.ca
Mon Mar 31 15:30:01 CEST 2008


On 3/31/2008 9:01 AM, Monica Pisica wrote:
>   Thanks Duncan.
>  
> Yes, I do have variation in the lidar metrics (be, ch, crr, and home), 
> although I have quite a high correlation between ch and home. But even 
> if I eliminate one metric (either ch or home), I end up with a deviance 
> explained of 99.99%. The species has values of 0 and 1, since I am trying 
> to predict presence / absence.
>  
> Do you think it is still a valid result?

I repeat: look at the data. Compare the observed and predicted values. 
That's the only way to know whether this is reasonable or not.
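
A minimal sketch of such a comparison, assuming the mgcv package (the one 
described in Wood's book) and simulated stand-in data, since the poster's 
data are not shown. The data frame `dat`, the simulated values, and the 0.5 
cutoff are illustrative assumptions, not part of the original analysis:

```r
library(mgcv)  # assumed source of gam(); ships with standard R installs

set.seed(1)
n <- 148
# Simulated stand-ins for the lidar metrics (NOT the poster's data)
dat <- data.frame(be = runif(n), crr = runif(n), ch = runif(n))
dat$home <- dat$ch + rnorm(n, sd = 0.1)          # deliberately correlated with ch
dat$can  <- rbinom(n, 1, plogis(-1 + 3 * dat$ch)) # 0/1 presence indicator

fit <- gam(can ~ s(be) + s(crr) + s(ch) + s(home),
           family = binomial, data = dat)

p_hat <- fitted(fit)  # fitted probabilities of presence

# Cross-tabulate observed 0/1 against predictions thresholded at 0.5
tab <- table(observed = dat$can, predicted = as.integer(p_hat > 0.5))
print(tab)

# Fitted probabilities piled up at exactly 0 and 1 would point to separation
print(tapply(p_hat, dat$can, range))
```

With the real model one would presumably use `fitted(can3.gam)` and the 
observed `can > 0` in place of the simulated objects.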

If you're getting reasonable predictions, then it's a valid fit.  (The 
tests and approximations used in the reported p-values may not be at all 
valid.  I don't know what the requirements are for those in a GAM, but 
if you're getting a perfect fit, then they probably aren't being met.)
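
The point about p-values can be illustrated with a deliberately separated 
toy data set. This sketch uses base R's glm(), a plain logistic regression 
swapped in for the GAM to keep it small; the variables `x` and `y` are made 
up for the illustration:

```r
set.seed(2)
x <- sort(runif(50))
y <- as.integer(x > 0.5)   # response perfectly separated by x

# glm() warns about fitted probabilities of 0 or 1; suppressed here on purpose
fit <- suppressWarnings(glm(y ~ x, family = binomial))

coefs <- summary(fit)$coefficients
print(coefs)               # large estimates with even larger standard errors

# Largest gap between a fitted probability and the observed 0/1 value
max_gap <- max(abs(fitted(fit) - y))
print(max_gap)
```

The fit reproduces the data almost exactly, yet the Wald z values are tiny 
because the standard errors grow along with the estimates. The summary 
quoted below shows the same pattern (intercept 85.39 with standard error 
162.88).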

Duncan Murdoch


>  
> Thanks again,
>  
> Monica
> 
>  > Date: Mon, 31 Mar 2008 08:47:48 -0400
>  > From: murdoch at stats.uwo.ca
>  > To: pisicandru at hotmail.com
>  > CC: r-help at r-project.org
>  > Subject: Re: [R] unexpected GAM result - at least for me!
>  >
>  > On 3/31/2008 8:34 AM, Monica Pisica wrote:
>  > >
>  > > Hi
>  > >
>  > >
>  > > I am afraid I am not understanding something very fundamental, and
>  > > no matter how much I look into the book "Generalized Additive
>  > > Models" by S. Wood, I still don't understand my result.
>  > >
>  > > I am trying to model presence / absence (presence = 1, absence = 0)
>  > > of a species using some lidar metrics (I have 4 of these). I am
>  > > trying different models, and when I used gam I got this very weird
>  > > (for me) result, which I thought was not possible, or which I have
>  > > no idea how to interpret.
>  > >
>  > >> can3.gam <- gam(can>0~s(be)+s(crr)+s(ch)+s(home), family = 'binomial')
>  > >> summary(can3.gam)
>  > > Family: binomial
>  > > Link function: logit
>  > >
>  > > Formula:
>  > > can > 0 ~ s(be) + s(crr) + s(ch) + s(home)
>  > >
>  > > Parametric coefficients:
>  > >             Estimate Std. Error z value Pr(>|z|)
>  > > (Intercept)    85.39     162.88   0.524      0.6
>  > >
>  > > Approximate significance of smooth terms:
>  > >           edf Est.rank Chi.sq p-value
>  > > s(be)   1.000        1  0.100   0.751
>  > > s(crr)  3.929        8  0.380   1.000
>  > > s(ch)   6.820        9  0.396   1.000
>  > > s(home) 1.000        1  0.314   0.575
>  > >
>  > > R-sq.(adj) = 1   Deviance explained = 100%
>  > > UBRE score = -0.81413   Scale est. = 1   n = 148
>  > >
>  > > Is this a perfect fit with no statistical significance, overfitting,
>  > > or what? It seems that the significance of the smooth terms is
>  > > "null". Of course, with such a model I perfectly predict presence /
>  > > absence of the species.
>  > >
>  > > Again, I hope you don't mind my asking you this. Any explanation
>  > > will be very much appreciated.
>  >
>  > Look at the data. You can get a perfect fit to a logistic regression
>  > model fairly easily, and it looks as though you've got one. (In fact,
>  > the huge intercept suggests that all predictions will be 1. Do you
>  > actually have any variation in the data?)
>  >
>  > Duncan Murdoch
> 
> 


