Latent class analysis with multiple i...
Mplus Discussion > Latent Variable Mixture Modeling >
 Jessica M Hill posted on Tuesday, April 14, 2015 - 6:31 am
I am carrying out latent class analysis using 5 multiply imputed datasets.
I have managed to carry out the analysis using

I have then run the analysis with all the parameters fixed to the average estimates from the multiple imputation analysis. This gives me the latent class probabilities.
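
For readers following along, the averaging step described above can be sketched in a few lines of Python. This is only an illustration, not Mplus output: the function name `pool_estimates` and the numbers are made up, and the total-variance line follows Rubin's rules, which goes a little beyond the plain averaging mentioned in the post. It also assumes the classes come out in the same order in every imputed dataset.

```python
from statistics import mean, variance

def pool_estimates(estimates, variances):
    """Pool one parameter across m imputed datasets (Rubin's rules).

    estimates: the parameter estimate from each imputed dataset
    variances: the squared standard error from each imputed dataset
    Assumes the classes are labelled consistently across datasets.
    """
    m = len(estimates)
    pooled = mean(estimates)        # the averaged estimate used in the fixed-value run
    within = mean(variances)        # average sampling variance
    between = variance(estimates)   # sample variance of the estimates across imputations
    total = within + (1 + 1 / m) * between
    return pooled, total

# Hypothetical estimates of one threshold from 5 imputed datasets
est = [0.50, 0.60, 0.55, 0.45, 0.65]
var = [0.04, 0.05, 0.04, 0.05, 0.04]
pooled, total_var = pool_estimates(est, var)
print(round(pooled, 4), round(total_var, 4))
```

The pooled value is what would be plugged in as the fixed parameter; the total variance is shown only to make clear that the between-imputation spread is not lost by averaging.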

My question is how should I assess the model fit. Because the results are the average across the 5 datasets, the usual criteria I would use are not available.

Thank you

 Tihomir Asparouhov posted on Tuesday, April 14, 2015 - 9:02 am
You can use "Model Test" for specific hypotheses. You can also use tech10 in the fixed values run.
 Jessica M Hill posted on Tuesday, April 14, 2015 - 11:32 pm
Thank you for your reply. Could you maybe give me a little more detail about what exactly I should be looking for in the tech10 output?
Many thanks for your help
 Jessica M Hill posted on Wednesday, April 15, 2015 - 7:31 am
Sorry, I have another question. How can I find out the results of the bootstrap likelihood ratio test for the different models I run? When I run the analysis with the fixed values on any one of the imputed datasets, the tech14 output differs depending on the dataset I use. Does this matter? The significance level of the models seems to remain the same across the different datasets.
 Jon Heron posted on Wednesday, April 15, 2015 - 11:47 pm
Hi Jessica,

there are many unanswered questions when it comes to MI in mixture modelling.

It sounds like you are attempting to determine the number of classes across your imputed datasets. My suggestion would be that you make this decision first, prior to analysing the imputed data - perhaps relying on likelihood-based methods to deal with class-indicator missingness at that point, alternatively using some theory.

By the way, your imputation model should contain information regarding the classes - so this is a little chicken and egg. The first mention I have seen on this is

Colder CR, Mehta P, Balanda K, et al. Identifying trajectories of adolescent smoking: an application of latent growth mixture modeling. Health Psychol 2001/3; 20(2): 127-35

You'll find that some people - including me - have neglected to do this, and assumed a single population for the purpose of imputation. This is likely to attenuate any differences between the classes.
 Jessica M Hill posted on Thursday, April 16, 2015 - 12:06 am
Thanks for your reply. If I understand correctly you suggest running the LCA analysis on the data with missing values and then using multiple imputation to impute class membership?
 Jon Heron posted on Thursday, April 16, 2015 - 12:59 am
Hi Jessica

no, I didn't mean that. I'll try again now I am caffeinated.

You don't need imputation to deal with indicators of class membership so presumably you are using imputation because you have missing covariate information.

My suggestion is that you determine the number of classes before considering imputation and then use imputation to deal with covariate missingness. After doing imputation you will need to carry out another LCA, but at that point you won't need to worry about BLRT/BIC etc.

You'll see a number of ad-hoc alternatives in the literature. For instance, if your entropy is excellent (>0.9?) then exporting your LCA results, assigning to modal class and then doing imputation *might* be adequate. Hard to say whether any bias from doing that would be smaller/greater than imputing whilst ignoring the underlying class structure.
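
To make the entropy check and the modal-class step concrete, here is a rough Python sketch. It assumes you have exported each person's posterior class probabilities; the function names and the three example rows are hypothetical. The entropy is the usual relative entropy, 1 minus the summed -p*ln(p) over people and classes divided by n*ln(K).

```python
import math

def relative_entropy(posteriors):
    """Relative entropy (0 to 1) from a list of posterior probability rows."""
    n = len(posteriors)
    k = len(posteriors[0])
    total = 0.0
    for row in posteriors:
        for p in row:
            if p > 0:               # treat 0 * ln(0) as 0
                total -= p * math.log(p)
    return 1 - total / (n * math.log(k))

def modal_class(row):
    """Return the 1-based class with the highest posterior probability."""
    return max(range(len(row)), key=lambda c: row[c]) + 1

# Hypothetical posterior probabilities for three people, two classes
post = [[0.98, 0.02], [0.05, 0.95], [0.90, 0.10]]
entropy = relative_entropy(post)
classes = [modal_class(r) for r in post]
print(round(entropy, 3), classes)
```

Values near 1 mean people are assigned to classes with little ambiguity, which is the situation where treating modal class as known before imputation is least risky.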
 Jessica M Hill posted on Thursday, April 16, 2015 - 1:10 am
Thanks again Jon. I'm probably being really dumb, but why don't I need imputation to deal with indicators of class membership? If some of the variables I am using to determine my classes have missing values, should I not deal with these before using them to determine my classes?
 Jon Heron posted on Thursday, April 16, 2015 - 2:37 am
Hi Jessica

missing data in class indicators is typically handled using likelihood methods (FIML). Check out some of Craig Enders' work including Enders and Bandalos. Of course there are assumptions to make, but there always are with missing data.

Enders and Bandalos (2001):-
 Jessica M Hill posted on Thursday, April 16, 2015 - 2:42 am
Thanks very much for all your help. I have more reading to do I think!
 Jon Heron posted on Thursday, April 16, 2015 - 3:13 am
No probs
 Tihomir Asparouhov posted on Thursday, April 16, 2015 - 9:31 am
Tech10 has univariate and bivariate fit information which you can use to assess model fit.

For determining the number of classes, my preferred method is to run tech14 on each imputed data set (with all parameters free). Once you determine the number of classes for each imputed data set - use the # of classes that comes up the most across the imputed data sets.
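
The "comes up the most" rule can be sketched as a simple tally in Python. The list of chosen class counts is hypothetical; in practice it would be filled in by hand from the tech14 results of each imputed data set. Ties are returned rather than broken, since the thread does not say how to resolve them.

```python
from collections import Counter

def most_frequent_class_counts(chosen):
    """Return the class count(s) selected most often across imputed data sets.

    chosen: one selected number of classes per imputed data set.
    A list is returned so that ties stay visible to the analyst.
    """
    counts = Counter(chosen)
    top = max(counts.values())
    return sorted(k for k, v in counts.items() if v == top)

# Hypothetical selections from 5 imputed data sets
chosen = [3, 3, 4, 3, 4]
winners = most_frequent_class_counts(chosen)
print(winners)
```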

Just to confirm Jon Heron's reply ... if you have the raw data with the missing values you can run LCA with the ML estimator directly without doing imputations and avoid the above complications.
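
As a rough illustration of why imputation is not needed for the class indicators, the sketch below computes one person's likelihood contribution for an LCA with binary indicators, simply skipping any indicator that is missing. This is a simplified Python picture of the likelihood-based approach under the usual missing-at-random assumption, not what Mplus does internally; all names and numbers are hypothetical.

```python
import math

def observed_loglik(response, class_probs, item_probs):
    """Log-likelihood of one response pattern; None marks a missing indicator.

    class_probs[c]   : probability of belonging to class c
    item_probs[c][j] : probability of endorsing item j in class c
    """
    lik = 0.0
    for c, pi in enumerate(class_probs):
        term = pi
        for j, y in enumerate(response):
            if y is None:
                continue            # a missing item contributes no factor
            p = item_probs[c][j]
            term *= p if y == 1 else 1 - p
        lik += term
    return math.log(lik)

# Hypothetical two-class model with three binary indicators
class_probs = [0.6, 0.4]
item_probs = [[0.9, 0.8, 0.7],
              [0.2, 0.3, 0.1]]
ll = observed_loglik([1, None, 0], class_probs, item_probs)
print(round(math.exp(ll), 3))
```

Skipping the missing item gives the same value as summing the complete-data likelihood over both possible values of that item, which is why every person with at least one observed indicator still contributes to the estimates.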
 Jessica M Hill posted on Thursday, April 16, 2015 - 11:07 pm
Ok. That's great. Thank you for your help.