[R-sig-ME] zero-inflated models in MCMCglmm
Gustaf Granath
gustaf.granath at gmail.com
Thu Dec 1 23:36:22 CET 2016
Jarrod,
Thanks for explaining this. I now start to understand how these models
are set up in MCMCglmm. The weird contrasts seem impossible to get rid
of though, and Im still struggling a bit to work out how to combine the
effects.
Great idea to use the simulation function. I used it for predictive
model checking of zeros, and I do have one question. Does simulate()
include (marginalise) the residual error (units)? I get quite different
results if I use a "by hand" simulation function (from course notes) and
simulate(), for a standard Poisson model.
Cheers
Gustaf
####### R code
# get test data
require(MCMCglmm)
require(RCurl)
zero.dat
<-read.csv(text=getURL("https://raw.githubusercontent.com/ggranath/zero_inf_models/master/test_data.csv"),header=T)
#plots nested in site (X3 treatments at plot-level)
zero.dat$nested_plot <- factor(with(zero.dat, paste(site, plot, sep=
"_")) )
#zip model
nitt = 10000 #low for testing
thin = 10 #low for testing
burnin = 1000 #low for testing
prior = list( R = list(V = diag(2), nu = 0.002, fix = 2),
G=list(G1=list(V=diag(2)*c(1,0.001), nu=0.002, fix=2)))
zip <- MCMCglmm(y ~ trait -1 + at.level(trait, 1):(X1*X2*X3_nest),
random = ~ idh(trait):nested_plot, rcov =
~idh(trait):units,
data=zero.dat, family = "zipoisson", nitt = nitt,
burnin = burnin, thin=thin, prior=prior, pr = TRUE, pl
= TRUE)
summary(zip)
# Gives a warning!! And contrasts are not straight forward to interpret.
#zap model
prior = list( R = list(V = diag(2), nu = 0.002, fix = 2),
G=list(G1=list(V=1, nu=0.002,alpha.mu=0, alpha.V=625^2)))
zap <- MCMCglmm(y ~ trait*(X1*X2*X3_nest),
random = ~ nested_plot, rcov = ~idh(trait):units,
data=zero.dat, family = "zapoisson", nitt = nitt,
burnin = burnin, thin=thin, prior=prior, pr =
TRUE, pl = TRUE)
summary(zap)
# No warning and everything looks good
# Poisson model
prior = list( R = list(V = diag(1), nu = 0.002),
G=list(G1=list(V=1, nu=0.002,alpha.mu=0, alpha.V=625^2)))
pois <- MCMCglmm(y ~ X1*X2*X3_nest,
random = ~ nested_plot,
data=zero.dat, family = "poisson", nitt = nitt,
burnin = burnin, thin=thin, prior=prior, pr = TRUE, pl
= TRUE)
summary(pois)
# No warning
# Some predictions
# by default, all random factors are marginalised
p.zip <- predict(zip, type="response", posterior = "mean")
p.zap <- predict(zap, type="response", posterior = "mean")
p.pois <- predict(pois, type="response", posterior = "mean")
cbind(aggregate(y ~ X1*X2*X3_nest, zero.dat, mean),
zip = aggregate(p.zip ~ X1*X2*X3_nest, zero.dat, mean)$V1,
zap = aggregate(p.zap ~ X1*X2*X3_nest, zero.dat, mean)$V1,
pois = aggregate(p.pois ~ X1*X2*X3_nest, zero.dat, mean)$V1)
# poisson model give extreme (unrealistic) values. zap best I think.
# Predictive testing: zeros
# zip model
oz <- sum(zero.dat == 0)
sim.zi <- simulate(zip, type="response", posterior = "all", nsim=1000)
dist.zeros.zi <- apply(sim.zi, 2, function (x) sum(x==0))
hist(dist.zeros.zi)
abline(v = oz, col = "red")
# zap model
oz <- sum(zero.dat == 0)
sim.za <- simulate(zap, type="response", posterior = "mean", nsim=1000)
dist.zeros.za <- apply(sim.za, 2, function (x) sum(x==0))
hist(dist.zeros.za)
abline(v = oz, col = "red")
# very good prediction of zeros!
# Poisson model
oz <- sum(zero.dat == 0)
sim.pois <- simulate(pois, type="response", posterior = "mean", nsim=1000)
dist.zeros.pois <- apply(sim.pois, 2, function (x) sum(x==0))
hist(dist.zeros.pois, xlim=c(900,1250))
abline(v = oz, col = "red")
# Zero-inflated!
# Simulation from pois model, by hand.
mc.samp <- nrow(pois$VCV)
nz <- 1:mc.samp
oz <- sum(zero.dat == 0)
for (i in 1:mc.samp) {
pred.l <- rnorm(nrow(zero.dat), (cbind(pois$X,pois$Z) %*%
pois$Sol[1,])@x, sqrt(pois$VCV[i,]))
nz[i] <- sum(rpois(nrow(zero.dat), exp(pred.l)) == 0)
}
hist(nz, xlim=c(900,1250))
abline(v = oz, col = "red")
# plot zero distributions
zeroDens <- data.frame(y = c(dist.zeros.zi, dist.zeros.za,
dist.zeros.pois, sample(nz,1000)),
model = rep(c("zi", "za", "pois",
"pois.byHand"), each=1000))
library(ggplot2)
ggplot(zeroDens,aes(x=y, fill=model)) + geom_density(alpha=0.25) +
geom_vline(xintercept=oz)
# only ZAP that captures all the zeros. Zip not that bad though.
On 2016-12-01 07:48, Jarrod Hadfield wrote:
> Hi Gustaf,
>
> I don't have dat, so I can't run the final bit of code.
>
> However, these are my thoughts.
>
> 1/ In the zap model you are allowing X1 to X3 to effect the level of
> zero alteration, whereas in the zip model you are just fitting a
> constant level of zero inflation across X1 to X3. In this sense the
> zap model will provide a superior fit because it has more parameters.
> The warning message about dropping terms is fine, although the default
> contrast for the zip model is a bit annoying: I would have preferred
> the main effect for X1 to be yes rather than no, but I guess its no
> big deal.
>
> 2/ You have fitted a single nested_plot effect in the random effects.
> This is a little odd, except in the case where the data are not
> zero-inflated. In this case having a single nested_plot term in the
> zap model is equivalent to fitting a nested_plot term in the standard
> Poisson. If the data are zero-inflated, and/or the model is not a zap
> model, then the case for having a single term is not well justified. I
> would use us or idh and in the latter case consider fixing the second
> variance (associated with zero-inflation/alteration) close to zero.
>
> 3/ The marginal predictions do not take into account the covariances
> between traits. This is generally OK, but its not ideal when the
> traits refer to the multiple parameters of a single distribution as
> with za/zi/hu models. I should put a warning in. You can also use the
> simulate function on the model and then average over the draws to get
> the predictions, and this will take into account any covariances. Note
> that if you use idh(trait):nested_plot there are no covariances so
> predict should be fine.
>
> Cheers,
>
> Jarrod
>
>
>
>
>
>
> On 30/11/2016 23:30, Gustaf Granath wrote:
>> cbind(aggregate(y ~ X1*X2*X3_nest, zero.dat, mean),
>> zip = aggregate(p.zip ~ fire*retention*micro_hab.two, dat,
>> mean)$V1,
>> zap = aggregate(p.zap ~ fire*retention*micro_hab.two, dat,
>> mean)$V1,
>> pois = aggregate(p.pois ~ fire*retention*micro_hab.two, dat,
>> mean)$V1)
>
>
More information about the R-sig-mixed-models
mailing list