[R] SEM validation: Cross-Validation vs. Bootstrapping

Paul Miller pjmiller_57 at yahoo.com
Fri Nov 2 17:57:25 CET 2012


Hi Joshua,

Thanks for your very helpful reply. I took a look at your mediation tutorial, MplusAutomation, and semutils. I may not need any of this for the current project, but it was very interesting and I could imagine using it in the future.

I also looked at the Hastie, Tibshirani, & Friedman book. I get the sense that it's going to be a lot more straightforward to stick with some form of cross-validation, probably something along the lines of what you've suggested.

Our sample contains 1138 cancer patients. The Mplus output reports 67 "Free Parameters," so by the usual rule of thumb of about 10 cases per free parameter, we would want roughly 670 cases to estimate the model. Under the proposed 2:1 split, the training sample (n = 759) would be large enough, but the testing sample (n = 379) would not, so I was wondering what, if anything, to do about that. One idea would be a multigroup comparison in which corresponding parameters across groups are initially constrained to be equal, and a constraint is released only if a chi-square difference test suggests it is necessary. I'm not sure whether that's a good idea, though. Another possibility might be further model simplification.
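For what it's worth, the 2:1 split itself is easy to set up in R. A minimal sketch (`dat` here is a hypothetical data frame holding the 1138 patients, not your actual data):

```r
set.seed(2012)                      # make the random split reproducible

n <- 1138                           # total sample size
train_idx <- sample(n, 759)         # draw 759 distinct row indices for training

in_train <- seq_len(n) %in% train_idx
sum(in_train)                       # 759 training cases
sum(!in_train)                      # 379 testing cases

# With a real data frame `dat`, the two halves would then be:
# train <- dat[in_train, ]
# test  <- dat[!in_train, ]
```

Since `sample()` draws without replacement, the split sizes are exact, not approximate.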

You specifically mention the idea of testing cross-group constraints with a chi-square difference test. I had thought this would be standard practice, but some people seem to feel that a sort of "configural" invariance is fine: the model would be regarded as having cross-validated if the factor loadings and structural paths merely look similar/consistent across groups, in the absence of any tests of statistical significance. I was wondering what you or anyone else out there thinks about that.
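To make the constrained-vs-free comparison concrete, here is a minimal sketch using lavaan rather than Mplus (an assumption on my part, just to illustrate the logic). It uses lavaan's built-in HolzingerSwineford1939 data, with `school` standing in for the train/test grouping variable, and a toy model with one structural path:

```r
library(lavaan)

# Toy model: two factors plus one structural path
model <- ' visual  =~ x1 + x2 + x3
           textual =~ x4 + x5 + x6
           textual ~ visual '

# Configural model: same structure in both groups, all parameters free
fit.config <- sem(model, data = HolzingerSwineford1939, group = "school")

# Constrained model: loadings and regressions held equal across groups
fit.equal <- sem(model, data = HolzingerSwineford1939, group = "school",
                 group.equal = c("loadings", "regressions"))

# Chi-square difference test of the cross-group equality constraints
anova(fit.config, fit.equal)
```

A nonsignificant difference would support retaining the equality constraints, which is a stronger claim than the "eyeball" version of configural invariance described above.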

Thanks,

Paul




More information about the R-help mailing list