[R] Problem with binomial gam{mgcv}
Erin Conlisk
erin.conlisk at gmail.com
Sat Oct 10 02:00:55 CEST 2015
Hello,
I am having trouble testing for significance with a binomial model in
gam{mgcv}. Have I stumbled on a bug? I doubt I would be so lucky, so
could someone tell me what I am doing wrong?
Please see the following code:
________________________________
# PROBLEM USING cbind
library(mgcv)
x1 <- runif(500, 0, 100)         # 500 random values to use as the explanatory variable
y1 <- floor(runif(500, 0, 100))  # 500 random counts to serve as binomial "successes"
y2 <- 100 - y1                   # 500 binomial "failures", assuming 100 trials in total
Model <- gam(cbind(y1, y2) ~ 1 + s(x1), family=binomial)
summary(Model)
________________________________
The result is that my random variable, x1, is highly significant. This
can't be right...
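One check that may be relevant to the first result, offered as a sketch rather than a diagnosis: compare the spread of the simulated counts with the spread a binomial distribution with 100 trials would allow. The names n_trials, p_hat, binom_var, obs_var and dispersion_ratio are illustrative and not part of the original code.

```r
set.seed(1)                          # illustrative seed, not in the original post
n_trials <- 100
y1 <- floor(runif(500, 0, 100))      # same construction as in the post

p_hat     <- mean(y1) / n_trials                # pooled success proportion
binom_var <- n_trials * p_hat * (1 - p_hat)     # variance if y1 were Binomial(100, p_hat)
obs_var   <- var(y1)                            # variance actually observed
dispersion_ratio <- obs_var / binom_var         # values well above 1 suggest overdispersion
print(dispersion_ratio)
```

If the ratio is far above 1, the counts vary much more than a plain binomial model assumes, and its p-values may be too optimistic. A family that estimates the dispersion, such as quasibinomial, might then be worth trying.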
So what happens when I change the observations from a "batch" of 100 trials
to individual successes and failures?
________________________________
# NOW MAKE ALL THESE DATA 0 and 1
r01<-rep(0,500)
data01<-cbind(x1, y1, y2, r01)
rownames(data01)<-seq(1,500, 1)
colnames(data01)<-c('x1', 'y1', 'y2', 'r01')
data01<-data.frame(data01)
# Replicate each row's explanatory variables once per individual "success"
expanded0 <- data01[rep(row.names(data01), data01$y1), 1:4]
r01<-rep(1,500)
data01<-cbind(x1, y1, y2, r01)
rownames(data01)<-seq(1,500, 1)
colnames(data01)<-c('x1', 'y1', 'y2', 'r01')
data01<-data.frame(data01)
# Replicate each row's explanatory variables once per individual "failure"
expanded1 <- data01[rep(row.names(data01), data01$y2), 1:4]
data01<-rbind(expanded0,expanded1)
Model2 <- gam(r01 ~ 1 + s(x1), family=binomial)
summary(Model2)
___________________________________
The result is what I expect. Now my random variable, x1, is NOT
significant.
What is going on here?
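One thing that may be worth checking is which rows the second fit actually uses. The call that creates Model2 has no data argument, so the formula would normally be resolved against the vectors x1 and r01 in the workspace rather than the columns of data01. The sketch below uses base R's model.frame on a small, made-up example to show the difference; the sizes and the names mf_global and mf_data are illustrative, and the assumption is that gam builds its model frame in the same way.

```r
set.seed(1)                              # illustrative seed
x1  <- runif(5, 0, 100)                  # workspace vector, length 5
r01 <- rep(1, 5)                         # workspace vector, all ones

# Expanded data frame with both outcomes, twice as long as the workspace vectors
data01 <- data.frame(x1 = rep(x1, 2), r01 = rep(c(0, 1), each = 5))

mf_global <- model.frame(r01 ~ x1)                 # no data: uses the workspace vectors
mf_data   <- model.frame(r01 ~ x1, data = data01)  # uses the columns of data01

print(nrow(mf_global))        # number of rows taken from the workspace
print(nrow(mf_data))          # number of rows taken from data01
print(unique(mf_global$r01))  # response values seen without data
```

If the same thing happens in the second fit, passing data=data01 to gam would make it use the expanded rows, and the comparison with the cbind version would then be like for like.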
I should say that I didn't just make this up. My question arose while working
with my real data, where I was getting high significance but a very low
proportion of deviance explained.
My apologies if this is explained elsewhere, but I have spent hours
searching for an answer online.
Thank you kindly,
Erin Conlisk
--
Postdoctoral Researcher
UC Berkeley
Energy and Resources Group
310 Barrows Hall
Berkeley, CA 94720
cell: 858-776-2939