[R-sig-ME] Cross-validated likelihood, cont.

Wed Aug 14 20:04:14 CEST 2019

Dear Rolf,

The optimization behind mixed_model() is a hybrid that starts with EM and then switches to quasi-Newton. When you set the control arguments "iter_EM = 0" and "iter_qN_outer = 0" specify that neither EM nor quasi-Newton iterations are used. Hence, you only calculate the log-likelihood value (and the rest of the components) only for the initial values.

The reason why you received the error message is because in the list you provide to the initial_values argument of mixed_model() should be a named list. The name for the fixed effects coefficients is "betas" not "beta" - see the online help file for more info. I had also made this mistake in my untested code you quoted below. If you make this change, it should work (at least it does in my machine).

Best,
Dimitris

-----Original Message-----
From: Rolf Turner <r.turner using auckland.ac.nz> 
Sent: Wednesday, August 14, 2019 1:07 PM
To: D. Rizopoulos <d.rizopoulos using erasmusmc.nl>
Cc: Ben Bolker <bbolker using gmail.com>; r-sig-mixed-models using r-project.org
Subject: Re: [R-sig-ME] Cross-validated likelihood, cont.

Dear Prof. Rizopoulos,

I hope that this gets through to you on your travels and that I am not disrupting the even tenor of your ways *too* badly. :-)

I am *finally* getting back to my efforts to calculate the log likelihood of a new ("validation") set of data on the basis of a model fitted to a different ("training") set of data.

You suggested a procedure for doing this using the mixed_model() function from the GLMMadaptive package.  After a bit of running-in time, getting to know the mixed_model() function, I (finally!) set about trying to follow your advice.

On 29/04/19 4:25 PM, D. Rizopoulos wrote:

> If you want you could give a try to the GLMMadaptive package that 
> implements the adaptive Gaussian quadrature for a vector of random 
> effects (e.g., intercepts and slopes as in your case), and from which 
> you get the log-likelihood in two steps, e.g.,
> 
> library(GLMMadaptive)
> # the log-likelihood at the initial values fm <- 
> mixed_model(cbind(Dead, Alive) ~ (Trt + 0) / Dose, random = ~ Dose
> | Rep,
>    data = Ts, family = binomial(link = 'cloglog'), iter_EM = 0, 
> iter_qN_outer = 0)
> logLik(fm)

I am a bit mystified by the "iter_EM = 0" and "iter_qN_outer = 0" in the foregoing.  I presume that they really shouldn't be there, but belong only in the next step.  But I could be completely out to lunch here, since I don't know what I'm doing at all!

I presume that "Ts" should be the "training set" of data.

> # the log-likelihood at user-specified values gm <-  
> mixed_model(cbind(Dead, Alive) ~ (Trt + 0) / Dose, random = ~ Dose | 
> Rep,
>    data = Ts, family = binomial(link = 'cloglog'), iter_EM = 0, 
> iter_qN_outer = 0,
>    initial_values = list(beta = <put_your_fixed_effects_here>, D =
> <put_the_RE_cov_matrix_here>))
> logLik(gm)

I presume (yet again!) the "Ts" in the foregoing should be "VS", the new "validation" set of data.  If not, then I'm *really* not understanding anything that is going on here.

What I am conjecturing is that the foregoing code "fits" a model to the "validation" data without actually doing any fitting.  I.e. it just "stays at the starting values" (since it does no iterations) thereby getting the same model as that encompassed by "fm".  If this is not what the code is doing, can you please explain what it *is* doing?

For "beta" in the initial_values list I used fixef(fm) and for "D" I used fm$D.  If this isn't right, then I need more instruction.

Anyhow:  I put all this together in a script ("demo.txt") which is attached, along with a demonstration data set X.txt.

If these file are save into your working directory you should be able to
source("demo.txt") and see what I saw:

> Error in if (use.names && nt[i] == nc[i]) dQuote(nt[i]) else i : 
>   missing value where TRUE/FALSE needed In addition: Warning messages:
> 1: glm.fit: fitted probabilities numerically 0 or 1 occurred
> 2: glm.fit: fitted probabilities numerically 0 or 1 occurred

I presume (am I presumptuous or what?) that the warnings are not too much to worry about, but the error flummoxes me.  Clearly I've got something wrong, but I have insufficient insight to see what it is.

Can you (or possibly someone else) enlighten me?

Thanks.

cheers,

Rolf

--
Honorary Research Fellow
Department of Statistics
University of Auckland
Phone: +64-9-373-7599 ext. 88276