

Greetings. I'm doing something that might be a bit dicey, but I don't have any better ideas and I'm hoping for feedback. I've done a latent class analysis in Mplus. When I added predictors, some of which had substantial (~20%) missingness, I switched to multiple imputation (MI). I'm interested in the effects of one of the predictors on class membership. There is missingness in the indicators, so naturally class definitions and sizes vary across imputations. This introduces a great deal of (what seems to me to be spurious) between-imputation variance that is swamping any effects. The fractions of missing information are running around 99%, and the df are only fractionally above the number of imputations, so I need to do a large number of imputations to get any stability. I tried a hybrid approach, where I fixed the latent class definition parameters across imputations at the values from the original LCA (which had no predictors and thus no need for MI). I'm still getting extremely high missing information and low degrees of freedom. With a large number of imputations (50), I am finding effects, but they're with df of 51. I'm assuming the problem is that the sizes of the classes vary. I'm concerned that the large missing information is a problem. Any other suggestions on how to combine MI and LCA? Or otherwise tackle this situation? Thanks, Pat
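For readers following along, here is a minimal sketch of the standard Rubin's-rules pooling that sits behind the symptoms described in the post above. It is not Mplus's own code; the function name `pool_rubin` and the toy inputs are hypothetical, and the missing-information measure shown is the simple large-sample ratio, which some software adjusts slightly. It illustrates why the degrees of freedom collapse toward the number of imputations minus one when between-imputation variance dominates.

```python
import math


def pool_rubin(estimates, variances):
    """Pool one parameter across m imputed-data analyses (Rubin's rules).

    estimates: per-imputation point estimates
    variances: per-imputation squared standard errors
    """
    m = len(estimates)
    qbar = sum(estimates) / m                               # pooled estimate
    ubar = sum(variances) / m                               # within-imputation variance
    b = sum((q - qbar) ** 2 for q in estimates) / (m - 1)   # between-imputation variance
    t = ubar + (1 + 1 / m) * b                              # total variance
    lam = (1 + 1 / m) * b / t                               # share of variance due to missingness
    # Rubin's (1987) df, written as (m - 1) / lam^2; infinite if there is
    # no between-imputation variance at all.
    df = (m - 1) / lam ** 2 if lam > 0 else math.inf
    return {"estimate": qbar, "se": math.sqrt(t), "lambda": lam, "df": df}


# Hypothetical toy data: 50 imputations whose estimates swing widely
# (as when class definitions shift between imputations) while each
# within-imputation variance is small.
estimates = [1.0 if i % 2 == 0 else -1.0 for i in range(50)]
variances = [0.01] * 50
print(pool_rubin(estimates, variances))
```

In this toy case the between-imputation share is close to 1, so the df stays close to m - 1 no matter how precise each single analysis is; adding imputations raises the df only roughly one-for-one.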

bmuthen posted on Thursday, September 02, 2004  8:21 am



That's a difficult situation; I'm not sure how to improve the MI approach. An alternative that could be considered now, given version 3, is to do ML analysis with the missingness on the x's taken into account by the usual MAR and normality assumptions. By this I mean that V3 can predict class membership from y's, not only x's, so if you "turn the x's into y's" they can have missing data and will be handled using the usual normality assumptions (an assumption that imputers tell me isn't that harmful even with binary x's). Turning the x's into y's is done by mentioning their variances in the model. Note, however, that with missing data on the x's, this leads to high-dimensional numerical integration as soon as you have more than a couple of x's and is therefore slow (although the new version 3.11, available today, can better handle numerical integration with as few as 5 integration points).


Thanks, it's churning away. I think I tried this before, but it had too many integration dimensions (7); I'm trying it with 3.11 now.

bmuthen posted on Thursday, September 02, 2004  5:11 pm



I guess 5 to the power of 7 is also a high number, but anything less than 5 integration points per dimension gives only a rough ML approximation. I guess you could try 3 for a very rough picture. 
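The arithmetic behind this trade-off can be checked quickly. The sketch below assumes a full product grid (the same number of points in every dimension), which is what "5 to the power of 7" implies; the helper name `grid_points` is hypothetical.

```python
def grid_points(points_per_dim, dims):
    """Total evaluation points in a full product grid for numerical integration."""
    return points_per_dim ** dims


# Configurations discussed in this thread: (points per dimension, dimensions).
for points, dims in [(5, 7), (3, 7), (5, 6)]:
    print(f"{points} points x {dims} dimensions -> {grid_points(points, dims)} grid points")
# 5^7 = 78125, 3^7 = 2187, 5^6 = 15625
```

Dropping from 5 to 3 points per dimension cuts the grid by a factor of about 36, and removing one of the seven dimensions at 5 points cuts it by a factor of 5.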


Thanks. I ended up trying a Monte Carlo integration with 1000 points, but after it ran for 11.5 hours, it crashed with "DETERMINANT OF A MATRIX IS TOO LARGE." I know I've asked Linda about that error message before, but a computer crash has wiped out my email archives and I don't remember her reply. I can probably drop one of the variables; I'll try 5^6 on my fastest machine and see what happens.


Nope, didn't like that. "This model can be done only with MonteCarlo integration." I'll pursue the original approach with imputation people; please let me know if you think of something else. 

Andy Ross posted on Friday, June 02, 2006  4:37 am



Dear Prof Muthen, is there a way to save the conditional probabilities using the SAVEDATA option when modelling with multiply imputed datasets? If so, what is the command code for this? Many thanks, Andy


The conditional probabilities are not saved with TYPE=IMPUTATION.
