[R] Missing data augmentation

Mon Jun 16 14:07:57 CEST 2003

Dear Jonck,

I was hoping that someone with more experience with mice and norm would 
pick up this question, but perhaps the following will help:

Without seeing your data, it's hard to determine the source of the problem; 
of course, I wouldn't necessarily be able to do that even with the data.

At 08:25 PM 6/14/2003 +0200, Jonck van der Kogel wrote:
>Hi all,
>A short while ago I asked a question about multiple imputation and I got 
>several helpful replies, thanks! Until now I have tried to use the 
>packages mice and norm, but both give me errors.
>
>mice does not even run to start with and gives me the following error 
>right away:
>iter imp variable
>   1   1  Liquidity.ratioError in chol((v + t(v))/2) : the leading minor 
> of order 1 is not positive definite
>
>To be honest I have no idea whatsoever what that error message means, so 
>my experiments with mice were short-lived :-)

If I remember correctly, the leading minors of a matrix are the 
determinants of its upper-left square submatrices (rows and columns 1 
through k); the leading minor of order 1 is therefore just the entry in the 
first row, first column. For it to be "not positive definite" suggests that 
this entry is 0 or negative. What exactly v is I can't say, but calling 
traceback() right after the error might help you locate the problem more 
specifically. Addressing questions to the authors of mice might also help.
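
To make the idea concrete, here is a minimal sketch in base R. The two 
matrices and the helper leading_minors() are made up for illustration; they 
are not what mice constructs internally as v.

```r
# Leading minor of order k: determinant of the top-left k x k block.
leading_minors <- function(m) {
  sapply(seq_len(nrow(m)), function(k) det(m[1:k, 1:k, drop = FALSE]))
}

good <- matrix(c(2, 1,
                 1, 2), nrow = 2, byrow = TRUE)
bad  <- matrix(c(0, 1,
                 1, 2), nrow = 2, byrow = TRUE)  # first diagonal entry is 0

leading_minors(good)   # all positive, so chol(good) succeeds
leading_minors(bad)    # the first one is not positive

# chol() stops with an error on 'bad'; try() lets the script continue.
res <- try(chol(bad), silent = TRUE)
inherits(res, "try-error")
```

In your case, a zero or negative variance for the first variable in whatever 
v holds would be enough to trigger the same kind of failure.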

>I then tried the package "norm". I got some ways with the experiment, 
>following the help file:
>s <- prelim.norm(as.matrix(myDataSet))
>thetahat <- em.norm(s)
>rngseed(1234567)
>theta <- da.norm(s, thetahat, steps=20, showits=TRUE)
>
>At this stage however I get the following error:
>Steps of Data Augmentation:
>1...2...Error: NA/NaN/Inf in foreign function call (arg 2)
>
>This seems strange to me, since the whole purpose of this routine is to 
>work with NA values. So why is it complaining about NA values?

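Actually, the error message is less specific than that: it says only that 
an NA, NaN, or Inf was passed to compiled code, which suggests a numerical 
problem arising in the data augmentation step rather than a complaint about 
the missing values in your data. Since both programs are producing 
numerical errors, I'd suspect some problem, such as ill-conditioning, in 
the data.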
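
A rough way to check for this, sketched in base R on simulated data (the 
matrix dat and its columns are hypothetical stand-ins for your data set): 
look at the column standard deviations and at the condition number of the 
correlation matrix of the complete cases.

```r
# Hypothetical data: x3 is almost an exact linear combination of x1 and x2,
# and the columns are on very different scales.
set.seed(1)
n  <- 100
x1 <- rnorm(n, mean = 1e6, sd = 1e5)
x2 <- rnorm(n, mean = 0.5, sd = 0.01)
x3 <- x1 / 1e6 + x2 + rnorm(n, sd = 1e-6)
dat <- cbind(x1, x2, x3)
dat[sample(length(dat), 20)] <- NA   # sprinkle in some missing values

# Per-column standard deviations, ignoring NAs: wildly different scales,
# or a value of 0 (a constant column), are warning signs.
sds <- apply(dat, 2, sd, na.rm = TRUE)

# Condition number of the complete-case correlation matrix:
# a very large value indicates near-collinearity.
r    <- cor(dat, use = "complete.obs")
cond <- kappa(r, exact = TRUE)

sds
cond
```

If cond comes out many orders of magnitude above 1, dropping or combining 
the nearly redundant variables would be the first thing I'd try.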

>After this I got it to work in an unlikely fashion: I first standardized 
>my dataset using scale(). After that I was able to run the
>"theta <- da.norm(s, thetahat, steps=20, showits=TRUE)" line successfully. 
>Which seems strange to me, since s still creates NA values, so why is it 
>not complaining about them this time? I have repeated the process several 
>times, with subsets of my original dataset and the same problems arise 
>each time.

It's odd that scaling the data helps, since I believe that norm does this 
itself.

>Standardizing, calculating the missing values, imputing them and then 
>standardizing again does not seem the correct way to go to me however. In 
>my opinion the correct way of doing things would be to impute the missing 
>values and then standardize the dataset. In other words, the way that 
>seems correct to me is not working.

I'm not sure that I follow that. You can always undo the standardization at 
the end, but perhaps I'm missing something.
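
As a sketch of what I mean by undoing the standardization, in base R: 
scale() stores the column means and standard deviations it used as 
attributes, so the transformation can be reversed afterwards. The matrix raw 
below is hypothetical, complete data standing in for your imputed values.

```r
# Hypothetical complete data standing in for an imputed data matrix.
set.seed(2)
raw <- cbind(a = rnorm(10, mean = 50,  sd = 10),
             b = rnorm(10, mean = 0.2, sd = 0.05))

z   <- scale(raw)                  # center and scale each column
ctr <- attr(z, "scaled:center")    # column means used by scale()
scl <- attr(z, "scaled:scale")     # column standard deviations used

# Undo: multiply each column by its sd, then add back its mean.
back <- sweep(sweep(z, 2, scl, "*"), 2, ctr, "+")

all.equal(back, raw, check.attributes = FALSE)
```

When the data contain NAs, scale() computes the means and standard 
deviations from the observed values only, so the same attributes can be 
used to put the imputed values back on the original scale.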

I hope that these remarks are of some use,
  John

-----------------------------------------------------
John Fox
Department of Sociology
McMaster University
Hamilton, Ontario, Canada L8S 4M4
email: jfox at mcmaster.ca
phone: 905-525-9140x23604
web: www.socsci.mcmaster.ca/jfox