[R] glmmADMB: Generalized Linear Mixed Models using AD Model Builder

Mon Dec 19 00:16:13 CET 2005

Dear professor Bates,

thank you for your reaction. To make sure that no errors occur in the 
data generation process I used the elegant function you so neatly 
provided to generate a couple of datasets under the model specification 
specified earlier. Running lmer with a Laplace approximation to the 
high-dimensional integral in the likelihood gives me a warning an then 
this show-stopper:

Warning: IRLS iterations for PQL did not converge
Error in objective(.par, ...) : Unable to invert singular factor of 
downdated X'X

Fitting the dataset with glmmADMB gives no apparent problems and 
reasonable estimates. I attached the particular dataset to the email.

The difference in computation time can be attributed to the fact that 
glmmadmb uses a generic technique called automatic differentiation with 
the Laplace approximation. The same technique can be employed to fit 
much more complex nonlinear models, but I'm sure Hans & Dave can tell 
more about it.

Best regards,
	Roel de Jong

Douglas Bates wrote:
> On 12/15/05, Roel de Jong <dejongroel at gmail.com> wrote:
> 
>>Dear R-users,
>>
>>because lme(r) & glmmpql, which are based on Penalized Quasi Likelihood,
>>are not very robust with Bernoulli responses,
> 
> 
> The current version of lmer takes method =  "PQL" (the default) or
> "Laplace" or "AGQ" although AGQ is not available for vector-valued
> random effects in that version so one must be content with "PQL" or
> "Laplace"
> 
> 
>>I wanted to test glmmADMB.  I run the following simulation study:
> 
> 
>>500 samples are drawn with the model specification:
>>y = (intercept*f1+pred2*f2+pred3*f3)+(intercept*ri+pred2*rs)
>>     where pred2 and pred3 are predictors distributed N(0,1)
>>     f1..f3 are fixed effects, f1=-1, f2=1.5, f3=0.5
>>     ri is random intercept with associated variance var_ri=0.2
>>     rs is random slope with associated variance var_rs=0.4
>>     the covariance between ri and rs "covr"=0.1
>>
>>1500 units/dataset, class size=30
> 
> 
> Could you make the datasets, or the code that generates them,
> available?  My code for such a simulation would be
> 
> genGLMM <- function(nobs, gsiz, fxd, Sigma, linkinv = binomial()$linkinv)
> {
>     ngrp <- nobs/gsiz
>     ranef <- matrix(rnorm(ngrp * ncol(Sigma)), nr = ngrp) %*% chol(Sigma)
>     pred2 <- rnorm(nobs)
>     pred3 <- rnorm(nobs)
>     mm <- model.matrix(~pred2 + pred3)
>     rmm <- model.matrix(~pred2)
>     grp <- gl(n = 1500/30, k = 30, len = 1500)
>                                         # linear predictor
>     lp <- as.vector(mm %*% fxd + rowSums(rmm * ranef[grp,]))
>     resp <- as.integer(runif(nobs) < linkinv(lp))
>     data.frame(resp = resp, pred2 = pred2, pred3 = pred3, grp = grp)
> }
> 
> Running this function gives
> 
>>nobs <- 1500
>>gsiz <- 30
>>fxd <- c(-1, 1.5, 0.5)
>>Sigma <- matrix(c(0.2, 0.1, 0.1, 0.4), nc = 2)
>>set.seed(123454321)
>>sim1 <- genGLMM(nobs, gsiz, fxd, Sigma)
>>(fm1 <- lmer(resp ~ pred2 + pred3 + (pred2|grp), sim1, binomial))
> 
> Generalized linear mixed model fit using PQL
> Formula: resp ~ pred2 + pred3 + (pred2 | grp)
>    Data: sim1
>  Family: binomial(logit link)
>       AIC      BIC    logLik deviance
>  1403.522 1440.714 -694.7609 1389.522
> Random effects:
>  Groups Name        Variance Std.Dev. Corr
>  grp    (Intercept) 0.44672  0.66837
>         pred2       0.55629  0.74585  0.070
> # of obs: 1500, groups: grp, 50
> 
> Estimated scale (compare to 1)  0.9032712
> 
> Fixed effects:
>              Estimate Std. Error z value  Pr(>|z|)
> (Intercept) -1.081710   0.121640 -8.8927 < 2.2e-16
> pred2        1.607273   0.141697 11.3430 < 2.2e-16
> pred3        0.531071   0.072643  7.3107 2.657e-13
> 
>>system.time(fm1 <- lmer(resp ~ pred2 + pred3 + (pred2|grp), sim1, binomial))
> 
> [1] 0.33 0.00 0.33 0.00 0.00
> 
>>(fm2 <- lmer(resp ~ pred2 + pred3 + (pred2|grp), sim1, binomial, method = "Laplace"))
> 
> Generalized linear mixed model fit using Laplace
> Formula: resp ~ pred2 + pred3 + (pred2 | grp)
>    Data: sim1
>  Family: binomial(logit link)
>       AIC      BIC    logLik deviance
>  1401.396 1438.588 -693.6979 1387.396
> Random effects:
>  Groups Name        Variance Std.Dev. Corr
>  grp    (Intercept) 0.35248  0.59370
>         pred2       0.46641  0.68294  0.077
> # of obs: 1500, groups: grp, 50
> 
> Estimated scale (compare to 1)  0.9854841
> 
> Fixed effects:
>              Estimate Std. Error z value  Pr(>|z|)
> (Intercept) -1.119008   0.121640 -9.1993 < 2.2e-16
> pred2        1.680916   0.141697 11.8627 < 2.2e-16
> pred3        0.543548   0.072643  7.4825 7.293e-14
> 
>>system.time(fm2 <- lmer(resp ~ pred2 + pred3 + (pred2|grp), sim1, binomial, method = "Laplace"))
> 
> [1] 4.62 0.01 4.65 0.00 0.00
> 
> Fitting that model using glmmADMB gives
> 
>>(fm3 <- glmm.admb(resp ~ pred2 + pred3, ~ pred2, "grp", sim1, "binomial", "logit", "full"))
> 
> ...
> iteration output omitted
> ...
> 
> GLMM's in R powered by AD Model Builder:
> 
>   Family: binomial
> 
> Fixed effects:
>   Log-likelihood: -602.035
>   Formula: resp ~ pred2 + pred3
> (Intercept)       pred2       pred3
>    -1.11990     1.69030     0.54619
> 
> Random effects:
>   Grouping factor: grp
>   Formula: ~pred2
> Structure: General positive-definite
>                StdDev      Corr
> (Intercept) 0.5890755
> pred2       0.6712377 0.1023698
> 
> Number of Observations: 1500
> Number of Groups: 50
> 
> The "Laplace" method in lmer and the default method in glmm.admb,
> which according to the documentation is the Laplace approximation,
> produce essentially the same model fit.  One difference is the
> reported value of the log-likelihood, which we should cross-check, and
> another difference is in the execution time
> 
> 
>>system.time(fm3 <- glmm.admb(resp ~ pred2 + pred3, ~ pred2, "grp", sim1, "binomial", "logit", "full"))
> 
> ...
> Iteration output omitted
> ...
> [1]  0.23  0.02 21.44 19.45  0.24
> 
> Fitting this model takes about 4.7 seconds with the Laplace
> approximation in lmer (and only 0.33 seconds for PQL, which is not
> that far off) and about 20 seconds in glmm.admb
> 
> 
> 
> 
>>convergence:
>>~~~~~~~~~~~~
>>No crashes.
>>5/500 Datasets had on exit a gradient of the log-likelihood > 0.001
>>though. Removing the datasets with questionable convergence doesn't seem
>>to effect the simulation analysis.
>>
>>bias:
>>~~~~~~
>>f1=-1.00531376
>>f2= 1.49891060
>>f3= 0.50211520
>>ri= 0.20075947
>>covr=0.09886267
>>rs= 0.38948382
>>
>>Only the random slope "rs" is somewhat low, but i don't think it is of
>>significance
>>
>>coverage alpha=.95: (using asymmetric confidence intervals)
>>~~~~~~~~~~~~~~~~~~~~~~~~
>>f1=0.950
>>f2=0.950
>>f3=0.966
>>ri=0.974
>>covr=0.970
>>rs=0.970
>>
>>While some coverages are somewhat high, confidence intervals based on
>>asymptotic theory will not have exactly the nominal coverage level, but
>>with simulations (parametric bootstrap) that can be corrected for.
>>
>>I can highly recommend this excellent package to anyone fitting these
>>kinds of models, and want to thank Hans Skaug & Dave Fournier for their
>>hard work!
> 
> 
> I agree.  I am particularly pleased that Otter Research allows access
> to a Linux executable of their code (although I would, naturally,
> prefer the code to be Open Source).
> 
> 
>>Roel de Jong.
>>
>>
>>Hans Julius Skaug wrote:
>>
>>>Dear R-users,
>>>
>>>Half a year ago we put out the R package "glmmADMB" for fitting
>>>overdispersed count data.
>>>
>>>http://otter-rsch.com/admbre/examples/glmmadmb/glmmADMB.html
>>>
>>>Several people who used this package have requested
>>>additional features. We now have a new version ready.
>>>The major new feature is that glmmADMB allows Bernoulli responses
>>>with logistic and probit links. In addition there is
>>>a "ranef.glmm.admb()" function for getting the random effects.
>>>
>>>The download site is still:
>>>
>>>http://otter-rsch.com/admbre/examples/glmmadmb/glmmADMB.html
>>>
>>>The package is based on the software ADMB-RE, but the full
>>>unrestricted R-package is made freely available by Otter Research Ltd
>>>and does not require ADMB-RE to run. Versions for Linux and Windows
>>>are available.
>>>
>>>We are still happy to get feedback for users, and to get suggestions
>>>for improvement.
>>>
>>>We have set up a forum at http://www.otter-rsch.ca/phpbb/ for discussions
>>>about the software.
>>>
>>>Regards,
>>>
>>>Hans
>>>
>>>_____________________________
>>>Hans Julius Skaug
>>>
>>>Department of Mathematics
>>>University of Bergen
>>>Johannes Brunsgate 12
>>>5008 Bergen
>>>Norway
>>>ph. (+47) 55 58 48 61
>>>
>>>______________________________________________
>>>R-help at stat.math.ethz.ch mailing list
>>>https://stat.ethz.ch/mailman/listinfo/r-help
>>>PLEASE do read the posting guide! http://www.R-project.org/posting-guide.html
>>>
>>
>>______________________________________________
>>R-help at stat.math.ethz.ch mailing list
>>https://stat.ethz.ch/mailman/listinfo/r-help
>>PLEASE do read the posting guide! http://www.R-project.org/posting-guide.html
>>
> 
> 
-------------- next part --------------
An embedded and charset-unspecified text was scrubbed...
Name: cd.14
Url: https://stat.ethz.ch/pipermail/r-help/attachments/20051219/60e5fc4f/cd.pl