[R] Missing data augmentation
John Fox
jfox at mcmaster.ca
Mon Jun 16 14:07:57 CEST 2003
Dear Jonck,
I was hoping that someone with more experience with mice and norm would
pick up this question, but perhaps the following will help:
Without seeing your data, it's hard to determine the source of the problem;
of course, I wouldn't necessarily be able to do that even with the data.
At 08:25 PM 6/14/2003 +0200, Jonck van der Kogel wrote:
>Hi all,
>A short while ago I asked a question about multiple imputation and I got
>several helpful replies, thanks! Until now I have tried to use the
>packages mice and norm, but both give me errors.
>
>mice does not even run to start with and gives me the following error
>right away:
>iter imp variable
> 1 1 Liquidity.ratioError in chol((v + t(v))/2) : the leading minor
> of order 1 is not positive definite
>
>To be honest I have no idea whatsoever what that error message means, so
>my experiments with mice were shortlived :-)
If I remember correctly, leading minors are determinants of the square
submatrices in the upper-left corner of a matrix (those starting at row and
column 1); the leading minor of order 1 is therefore just the entry in the
first row, first column; for it to be "not positive definite" suggests that
it is 0 or negative. What exactly v is I
can't say, but using traceback() might help you locate the problem more
specifically. Addressing questions to the authors of mice might also help.
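As a rough illustration of the point about leading minors, here is a minimal sketch with made-up data (the matrix x and its column names are hypothetical and have nothing to do with Jonck's dataset). It assumes, purely for illustration, that v is something like a covariance matrix; a constant first column then puts a 0 in the first row and column, and chol() stops with an error about the leading minor of order 1.

```r
# Hypothetical data: the first column is constant, so its variance is 0.
x <- cbind(a = rep(5, 10),
           b = c(1, 3, 2, 5, 4, 7, 6, 9, 8, 10))

# Assumption: v plays the role of a covariance-like matrix.
v <- cov(x)
print(v[1, 1])  # the leading minor of order 1 is this single entry

# Same symmetrised call as in the error message; capture the error
# instead of stopping so that it can be inspected.
res <- tryCatch(chol((v + t(v)) / 2), error = function(e) e)
print(inherits(res, "error"))
print(conditionMessage(res))
```

In an interactive session, calling traceback() immediately after the original error lists the chain of function calls that led to chol(), which may show where v is computed.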
>I then tried the package "norm". I got some ways with the experiment,
>following the help file:
>s <- prelim.norm(as.matrix(myDataSet))
>thetahat <- em.norm(s)
>rngseed(1234567)
>theta <- da.norm(s, thetahat, steps=20, showits=TRUE)
>
>At this stage however I get the following error:
>Steps of Data Augmentation:
>1...2...Error: NA/NaN/Inf in foreign function call (arg 2)
>
>This seems strange to me, since the whole purpose of this routine is to
>work with NA values. So why is it complaining about NA values?
Actually, the error message is less specific than that and suggests a
numerical problem in the data augmentation step. Since both programs are
producing numerical errors, I'd suspect some problem, such as
ill-conditioning, in the data.
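One way to look for ill-conditioning is to compare the condition number of the covariance matrix before and after standardizing. The sketch below uses simulated data (x1, x2 and the 1e6 scale factor are invented for illustration) in which the only problem is a large difference in scale between two variables; it does not claim to reproduce whatever is wrong in the actual dataset.

```r
set.seed(1)
n  <- 50
x1 <- rnorm(n)         # variable on a unit scale
x2 <- rnorm(n) * 1e6   # variable on a much larger scale (hypothetical)
d  <- cbind(x1, x2)

# Condition number (2-norm) of the covariance matrix; very large values
# indicate ill-conditioning.
k_raw <- kappa(cov(d), exact = TRUE)

# After scale() the covariance matrix is the correlation matrix, which
# is far better conditioned in this example.
k_scaled <- kappa(cov(scale(d)), exact = TRUE)

print(c(raw = k_raw, scaled = k_scaled))
```

With missing values present, cov(d, use = "pairwise.complete.obs") could be substituted, although a matrix built that way is not guaranteed to be positive definite.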
>After this I got it to work in an unlikely fashion: I first standardized
>my dataset using scale(). After that I was able to run the
>"theta <- da.norm(s, thetahat, steps=20, showits=TRUE)" line successfully.
>Which seems strange to me, since s still contains NA values, so why is it
>not complaining about them this time? I have repeated the process several
>times, with subsets of my original dataset and the same problems arise
>each time.
It's odd that scaling the data helps since I believe that norm does this
itself.
>Standardizing, calculating the missing values, imputing them and then
>standardizing again does not seem the correct way to go to me, however. In
>my opinion the correct way of doing things would be to impute the missing
>values and then standardize the dataset. In other words, the way that
>seems correct to me is not working.
I'm not sure that I follow that. You can always undo the standardization at
the end, but perhaps I'm missing something.
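Undoing the standardization could look roughly like the sketch below. It relies on the "scaled:center" and "scaled:scale" attributes that scale() attaches to its result. The data are made up, and the fill-in step is only a placeholder (setting missing standardized values to 0, i.e. the column mean); it stands in for a real imputation and is not what da.norm does.

```r
# Hypothetical data with missing values.
x <- cbind(a = c(1, 2, NA, 4, 5),
           b = c(10, NA, 30, 40, 50))

z   <- scale(x)                   # NAs are ignored when centering/scaling
ctr <- attr(z, "scaled:center")   # column means used by scale()
scl <- attr(z, "scaled:scale")    # column standard deviations used by scale()

# Placeholder for the real imputation step (assumption for illustration).
z_imp <- z
z_imp[is.na(z_imp)] <- 0

# Back-transform to the original units: multiply by the scale, add the center.
x_back <- sweep(sweep(z_imp, 2, scl, "*"), 2, ctr, "+")

obs <- !is.na(x)
print(all.equal(x_back[obs], x[obs]))  # observed values are recovered
print(x_back)
```

The imputed cells end up on the original scale as well, so no second round of standardizing is needed unless the later analysis calls for it.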
I hope that these remarks are of some use,
John
-----------------------------------------------------
John Fox
Department of Sociology
McMaster University
Hamilton, Ontario, Canada L8S 4M4
email: jfox at mcmaster.ca
phone: 905-525-9140x23604
web: www.socsci.mcmaster.ca/jfox