[R] gam(mgcv) predicting across random effect and understanding narrow CI

Sat Jan 25 16:29:54 CET 2014

Hi everyone,

I am new to additive modelling and am surprised by the results of a model
I'm working on.  I wanted to check with more experienced users to make sure
I'm not misunderstanding something basic.

*Data:* I have 10 replicated runs from an evolutionary simulation model,
measuring the evolved change in dispersal after an event (climate change)
across a spatial environmental gradient.  I want to find a fitted line
showing mean change in dispersal across the gradient/model landscape with
95% confidence intervals, as a way of assessing in which parts of the
gradient dispersal changed significantly (ie if 95% CI don't overlap 0).  My
data is:

response = change in dispersal ('Ddif')
predictor = position along spatial gradient 401 columns long ('col')
random effect = replicate ('rep')
The data are here: https://www.dropbox.com/sh/3xfjusgkit9bvt2/Ts2KjMOa7Y
(There are NAs at the first and last columns of each run as simulation
organisms died out at the extremes of the environmental gradient)

*Question 1:* Because adding the random effect is drastically reducing my
perceived sample size (from ~4000 data points to only 10 independent model
runs), I expected a large increase in the width of the CI.  However, there
is almost no difference in the line or CI produced by the models:

gam.nore <- gam(Ddif ~ s(col), data=trapzdc0, method="REML")
gam.re    <- gam(Ddif ~ s(col) + s(rep, bs="re"), data=trapzdc0,
method="REML") 

There is also very little change in the R^2 and no change in n=3590 reported
in the summary.  I tried with gamm as well but still no noticeable change (I
know that in this case I'm adding a random intercept instead of a random
smoother but figured I'd try everything):

gamm.re <- gamm(Ddif ~ s(col), random = list(rep =~ 1), data = trapzdc0)

Why do the CI change so little?
Is a random effect not the right way to account for non-independence of data
in GAMs?
Have I coded something incorrectly?

*Question 2:* When I ask for a predicted line across replicates (ie
summarizing the general patterns of the model), I get predicted lines that
are 3590 long (ie across all data points), rather than 401 (the length of
the predictor variable).  This also makes me think I somehow haven't
communicated the random effect properly to R.

pgam.re <- as.data.frame(predict(gam.re, type='link', level=0, se.fit =
TRUE))
dim(pgam.re)
#[1] 3590    2

I tried adding 'level=0', which asks for a global prediction across random
effects in lmer, but no change.
pgam.re <- as.data.frame(predict(gam.re, type='link', level=0, se.fit =
TRUE))

Even if I make a new data frame of the right length (401) across which the
model should predict and a dummy 'rep' variable I get a predictor 3590 long:
col <- c(seq(1,401,1))
rep <- c(rep(0,401))   
newdat <- as.data.frame(cbind(col,rep)); dim(newdat)

pgam.re2 <- as.data.frame(predict(gam.re, type='link', level=0, se.fit =
TRUE, data=newdat))
dim(pgam.re2)
[1] 3590    2

How can I get a prediction (for use in plotting, eg ggplot) across
replicates?

I've spent days trolling for answers in help sites to no avail - any insight
would be greatly appreciated!
Thanks!!

--
View this message in context: http://r.789695.n4.nabble.com/gam-mgcv-predicting-across-random-effect-and-understanding-narrow-CI-tp4684135.html
Sent from the R help mailing list archive at Nabble.com.