I am using Mplus 4 to run a latent class analysis on 9 ordinal variables (each with responses 1-3). I estimated models with 1 through 9 classes. The 6-class model has the smallest BIC, but substantively a 4-class model makes more sense. How should we choose the best model: should the choice be purely data driven, or should we take the substantive meaning of the latent classes into consideration? Do you have any advice? Is there a reference that I can cite? Thank you for your help!
I am also posting my syntax (for the 5-class model) here. Please let me know if you see any problems. I really appreciate it.
VARIABLE:
  NAMES ARE sampleid v1-v9;
  USEV ARE v1-v9;
  Missing are all (-9999);
  CLASSES = c(5);
  CATEGORICAL = v1-v9;
ANALYSIS:
  TYPE = MIXTURE;
OUTPUT:
  TECH1 TECH8;
SAVEDATA:
  file is mplus05.txt;
  save is cprob;
  format is free;
The classes should definitely make theoretical sense, and that should be taken into account when determining the number of classes. The following paper may discuss this. We definitely discuss it in the Topics 5 and 6 course videos.
Nylund, K.L., Asparouhov, T., & Muthén, B. (2007). Deciding on the number of classes in latent class analysis and growth mixture modeling: A Monte Carlo simulation study. Structural Equation Modeling, 14, 535-569.
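For readers weighing the 4-class and 6-class solutions, it may help to see how BIC trades fit against model size. The sketch below is not Mplus output. It assumes an unrestricted latent class model in which every item has its own thresholds in every class, and it uses the standard definition BIC = -2 logL + p ln(n). The function names are hypothetical, and the log-likelihood and sample size have to come from your own output.

```python
import math


def lca_free_parameters(n_classes, n_items=9, n_categories=3):
    """Free parameters in an unrestricted LCA (assumed parameterization).

    Each item has (n_categories - 1) thresholds per class, and there are
    (n_classes - 1) free class proportions.
    """
    thresholds = n_classes * n_items * (n_categories - 1)
    class_proportions = n_classes - 1
    return thresholds + class_proportions


def bic(log_likelihood, n_parameters, n_observations):
    """Standard BIC: -2 * logL + p * ln(n). Smaller is better."""
    return -2.0 * log_likelihood + n_parameters * math.log(n_observations)


# Parameter counts for the two candidate models in this thread.
p4 = lca_free_parameters(4)  # 9 items x 2 thresholds x 4 classes + 3
p6 = lca_free_parameters(6)  # 9 items x 2 thresholds x 6 classes + 5
```

Under these assumptions the 6-class model carries 38 more parameters than the 4-class model, so it only wins on BIC if its log-likelihood gain outweighs 38 ln(n). How large the BIC difference is, and whether the extra classes are interpretable, are both worth looking at alongside the raw ranking.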
fgong posted on Wednesday, April 06, 2011 - 3:45 pm