LCA and multiple imputation
Mplus Discussion > Latent Variable Mixture Modeling
 Patrick Malone posted on Thursday, September 02, 2004 - 6:52 am

I'm doing something that might be a bit dicey, but I don't have any better ideas, so I'm hoping for suggestions.

I've done a latent class analysis in Mplus. When I added predictors, some of which had substantial (~20%) missingness, I turned to multiple imputation (MI). I'm interested in the effects of one of the predictors on class membership.

There is missingness in the indicators, so naturally class definitions and sizes vary across
imputations. This introduces a great deal of (what seems to me to be spurious) between-imputation variance that is swamping any effects. The fractions of missing information are running around 99%, and df are only fractionally above the number of imputations, so I need to do a large number of imputations to get any stability.
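Patrick's symptoms (fractions of missing information near 1, df barely above the number of imputations) follow directly from Rubin's pooling rules: when the between-imputation variance B dwarfs the within-imputation variance, the total variance is dominated by B, the FMI approaches 1, and df collapses toward m − 1. A minimal Python sketch with synthetic numbers (not from this analysis) shows the effect:

```python
import statistics

def pool(estimates, ses):
    """Pool a scalar estimate across m imputations via Rubin's rules."""
    m = len(estimates)
    qbar = statistics.fmean(estimates)             # pooled point estimate
    ubar = statistics.fmean(s**2 for s in ses)     # within-imputation variance
    b = statistics.variance(estimates)             # between-imputation variance
    t = ubar + (1 + 1/m) * b                       # total variance
    r = (1 + 1/m) * b / ubar                       # relative increase in variance
    df = (m - 1) * (1 + 1/r)**2                    # Rubin (1987) degrees of freedom
    fmi = (r + 2/(df + 3)) / (r + 1)               # fraction of missing information
    return qbar, t, df, fmi

# Stable case: estimates barely vary across imputations.
_, _, df_ok, fmi_ok = pool([0.50, 0.52, 0.48], [0.1, 0.1, 0.1])

# Unstable case: class definitions shift, so estimates swing wildly.
_, _, df_bad, fmi_bad = pool([0.5, 2.0, -1.0], [0.1, 0.1, 0.1])
print(f"stable:   df={df_ok:.0f}  fmi={fmi_ok:.3f}")
print(f"unstable: df={df_bad:.2f}  fmi={fmi_bad:.3f}")
```

In the unstable case the between-imputation variance swamps the within-imputation variance, so df falls to just above m − 1 and the FMI approaches 1, regardless of how precise each per-imputation estimate is.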

I tried a hybrid approach in which I fixed the latent class definition parameters at the values from the original LCA (with no predictors and thus no need for MI) across imputations. I'm still getting extremely high missing information and low degrees of freedom. With a large number of imputations (50), I am finding effects, but with df of only 51. I'm assuming the problem is that the sizes of the classes vary.

I'm concerned that the large missing information is a problem. Any other suggestions on how to combine MI and LCA? Or otherwise tackle this situation?

 bmuthen posted on Thursday, September 02, 2004 - 8:21 am
That's a difficult situation; I'm not sure how to improve the MI approach. An alternative that could be considered now given version 3 is to do ML analysis with the missingness on the x's taken into account by usual MAR and normality assumptions. By this I mean that V3 can predict class membership from y's, not only x's, so if you "turn the x's into y's" they can have missing data and will be handled using the usual normality assumptions (an assumption that imputers tell me isn't that harmful even with binary x's). Turning the x's into y's is done by mentioning their variances in the model. Note, however, that with missing data on the x's, this leads to high-dimensional numerical integration as soon as you have more than a couple of x's and is therefore slow (although the new version 3.11 - available today - can better handle numerical integration with as few as 5 integration points).
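A minimal sketch of the kind of input Bengt describes (variable names and class count are hypothetical; this is illustrative, not Patrick's actual model):

```
TITLE:    LCA with a covariate "turned into a y" so its
          missingness is handled under MAR/normality.
VARIABLE: NAMES = u1-u5 x;
          CATEGORICAL = u1-u5;
          CLASSES = c(2);
          MISSING = ALL(-99);
ANALYSIS: TYPE = MIXTURE;
          ALGORITHM = INTEGRATION;
MODEL:    %OVERALL%
          c ON x;   ! class membership regressed on x
          x;        ! mentioning the variance of x makes it a y,
                    ! bringing its missingness into the likelihood
```

The key line is the bare `x;` in the %OVERALL% model: mentioning the variance brings x into the likelihood, so cases with missing x contribute under the MAR/normality assumptions rather than being listwise deleted.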
 Patrick Malone posted on Thursday, September 02, 2004 - 12:53 pm
Thanks, it's churning away -- I think I tried this before, but it had too many integration dimensions (7); I'm trying it with 3.11 now.
 bmuthen posted on Thursday, September 02, 2004 - 5:11 pm
I guess 5 to the power of 7 is also a high number, but anything less than 5 integration points per dimension gives only a rough ML approximation. I guess you could try 3 for a very rough picture.
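The grid sizes at issue are purely arithmetic: with rectangular quadrature, the points multiply across dimensions, so 7 dimensions blow up quickly even with few points per dimension.

```python
# Total quadrature points for 7 dimensions of integration,
# with k points per dimension.
for k in (3, 5, 10):
    print(f"{k}^7 = {k**7:,} points per likelihood evaluation")
```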
 Patrick Malone posted on Friday, September 03, 2004 - 5:48 am
Thanks. I ended up trying Monte Carlo integration with 1000 points, but after it ran for 11.5 hours, it crashed with "DETERMINANT OF A MATRIX IS TOO LARGE." I know I've asked Linda about that error message before, but a computer crash wiped out my e-mail archives and I don't remember her reply. I can probably drop one of the variables; I'll try 5^6 on my fastest machine and see what happens.
 Patrick Malone posted on Friday, September 03, 2004 - 9:49 am
Nope, didn't like that. "This model can be done only with MonteCarlo integration." I'll pursue the original approach with imputation people; please let me know if you think of something else.
 Andy Ross posted on Friday, June 02, 2006 - 4:37 am
Dear Prof Muthen

Is there a way to save the conditional probabilities using the save data option when modelling with multiple imputed datasets?

If so, what is the command syntax for this?

Many thanks

 Linda K. Muthen posted on Friday, June 02, 2006 - 8:44 am
The conditional probabilities are not saved with TYPE=IMPUTATION.
 CB posted on Thursday, April 09, 2015 - 12:50 pm

I am running LCA with two parameterizations of the same variables as well as running these models with and without multiple imputation. In the first parameterization, I'm using both unordered and ordered categorical variables. When I run the models with and without multiple imputation, I get the same results. In the second parameterization, I'm using all binary variables. However, when I run the models with and without multiple imputation, I get different results.

Do you have any thoughts as to why I'm getting different results when I code the variables as binary? Is it a possible issue of identifiability? Or have any resources that I could look at to help me figure this out?

Thanks so much!
 Bengt O. Muthen posted on Thursday, April 09, 2015 - 1:01 pm
Please send to support the outputs for the 2 binary runs, plus the imputation run for it.
 Shannon Healy  posted on Thursday, March 02, 2017 - 1:30 pm

I have a probably overly simplistic question about imputation in Mplus, but here goes. I'm doing an LCA on complex survey data, and because I assume the missingness in the indicators is not missing at random, I did a multiple imputation. My question is about the type of output provided for an LCA using multiply imputed data, since it does not have the same kinds of results as a non-imputed dataset, namely item probabilities by class. Is it possible to request these outputs for imputed data as well, or are there limitations on the types of output imputed LCAs can provide?
