I'm doing something that might be a bit dicey, but I don't have any better ideas and I'm hoping for feedback.
I've done a latent class analysis in Mplus. When I added predictors, some of which had significant (~20%) missingness, I went to MI. I'm interested in the effects of one of the predictors on class membership.
There is missingness in the indicators, so naturally class definitions and sizes vary across imputations. This introduces a great deal of (what seems to me to be spurious) between-imputation variance that is swamping any effects. The fractions of missing information are running around 99%, and df are only fractionally above the number of imputations, so I need to do a large number of imputations to get any stability.
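For readers following the numbers above, here is a minimal sketch of the standard Rubin (1987) combining rules. It shows why the df collapse toward the number of imputations minus one when between-imputation variance dominates. The function name `pool_rubin` and the example values are hypothetical, not taken from this analysis, and Mplus may differ in implementation details.

```python
from statistics import mean, variance


def pool_rubin(estimates, variances):
    """Pool one parameter across m imputations using Rubin's rules.

    estimates: point estimates, one per imputed data set
    variances: squared standard errors, one per imputed data set
    """
    m = len(estimates)
    if m < 2:
        raise ValueError("need at least two imputations")
    q_bar = mean(estimates)      # pooled point estimate
    w = mean(variances)          # within-imputation variance
    b = variance(estimates)      # between-imputation variance (m - 1 denominator)
    if b == 0:
        raise ValueError("no between-imputation variance; df is infinite")
    t = w + (1 + 1 / m) * b      # total variance
    r = (1 + 1 / m) * b / w      # relative increase in variance due to missingness
    df = (m - 1) * (1 + 1 / r) ** 2
    fmi = (r + 2 / (df + 3)) / (r + 1)   # fraction of missing information
    return {"estimate": q_bar, "total_var": t, "df": df, "fmi": fmi}


# Hypothetical values: the estimates swing widely across imputations while
# each within-imputation variance is small, so df ends up close to m - 1.
example = pool_rubin([0.9, -0.4, 1.6, 0.1, -1.1], [0.01, 0.01, 0.01, 0.01, 0.01])
print(example)
```

As r grows large, (1 + 1/r)^2 approaches 1, so df approaches m - 1 and the fraction of missing information approaches 1, which matches the pattern described above.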
I tried a hybrid approach, where I fixed the latent class definition parameters across imputations at the values from the original LCA (with no predictors and thus no need for MI). I'm still getting extremely high missing information and low degrees of freedom. With a large number of imputations (50), I am finding effects, but with df of only 51. I assume the problem is that the class sizes still vary across imputations.
I'm concerned that the large missing information is a problem. Any other suggestions on how to combine MI and LCA? Or otherwise tackle this situation?
bmuthen posted on Thursday, September 02, 2004 - 8:21 am
That's a difficult situation; I'm not sure how to improve the MI approach. An alternative that could be considered now given version 3 is to do ML analysis with the missingness on the x's taken into account by usual MAR and normality assumptions. By this I mean that V3 can predict class membership from y's, not only x's, so if you "turn the x's into y's" they can have missing data and will be handled using the usual normality assumptions (an assumption that imputers tell me isn't that harmful even with binary x's). Turning the x's into y's is done by mentioning their variances in the model. Note, however, that with missing data on the x's, this leads to high-dimensional numerical integration as soon as you have more than a couple of x's and is therefore slow (although the new version 3.11 - available today - can better handle numerical integration with as few as 5 integration points).
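To illustrate why the integration gets slow so quickly: assuming a full product grid with the same number of integration points in every dimension, the number of grid points is the points per dimension raised to the number of dimensions. The sketch below only does that arithmetic. The function name `grid_points` is hypothetical, the value 15 is an arbitrary comparison point, and how Mplus actually builds its grid is not confirmed here.

```python
def grid_points(points_per_dim, n_dims):
    """Number of points in a full product grid (an assumed grid layout)."""
    return points_per_dim ** n_dims


# Compare 5 points per dimension with an arbitrary larger choice of 15.
for n_dims in range(1, 7):
    print(n_dims, grid_points(5, n_dims), grid_points(15, n_dims))
```

Under this assumption, 5 points in each of 6 dimensions already means 15625 grid points, so fewer points per dimension make a large difference once several x's have missing data.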
Thanks. I ended up trying a Monte Carlo integration with 1000 points, but after it ran for 11.5 hours, it crashed with "DETERMINANT OF A MATRIX IS TOO LARGE." I know I've asked Linda about that error message before, but a computer crash has wiped out my e-mail archives and I don't remember her reply. I can probably drop one of the variables; I'll try 5^6 on my fastest machine and see what happens.