
Sanjoy posted on Thursday, November 24, 2005  5:34 am



Prof. Muthen, this is the situation: I am doing an LCA with covariates, with 6 ordinal indicator variables (3 categories each) and 10 covariates.

                       2 class     3 class     4 class     5 class     6 class
AIC                    2479.396    2450.399    2437.525    2432.189    2389.83
Sample-Adjusted BIC    2490.276    2468.119    2462.084    2463.588    2427.76
Entropy                0.862       0.898       0.823       0.849       0.981
Pearson P-value        0.0294      0.0312      0.99        0.99        0.99
  (df)                 (df=701)    (df=687)    (df=676)    (df=663)    (df=0.0865)

Q1. Can you please help me choose the number of latent classes? Following AIC and sample-adjusted BIC, 4 classes is better than 2 and 3. 5 classes is better following AIC but not sample-adjusted BIC. 6 classes is better still; in fact I have checked the "Nagin table" (diagonal elements are between .95 and .99). However, in the regression results with 6 classes some of the estimates are astoundingly large, which looks slightly weird.

Q2. Can you please tell us whether this is OK or not? (There is no warning in the output apart from the +/-15 issues for some of the cells.)

Categorical Latent Variables

                Estimate      S.E.    Est./S.E.
C#1 ON
  F1               0.907     0.522      1.739
  HHSIZE           1.108     0.420      2.637
  F3INC            1.364     0.854      1.598
  F4GMHEAR         0.685     0.413      1.660
  F5EMPLOY        34.076    15.033      2.267
  F6AEDU          16.388     6.456      2.539
  F6BSCI          10.737     4.648      2.310
  F7AGE            2.728     1.156      2.360
  F10GMREA        16.461     6.709      2.454
  F11FSHOP        22.471     9.730      2.309
C#2 ON
  F1               5.969     2.599      2.297
  HHSIZE           0.497     0.397      1.251
  F3INC            5.860     2.523      2.322
  F4GMHEAR        14.672     5.987      2.451
  F5EMPLOY         9.946     5.794      1.717
  F6AEDU          21.408     8.463      2.530
  F6BSCI          71.584    32.350      2.213
  F7AGE            6.231     2.615      2.382
  F10GMREA        39.392    17.437      2.259
  F11FSHOP        25.250    11.214      2.252
C#3 ON
  F1              20.990    10.401      2.018
  HHSIZE           6.373     2.561      2.489
  F3INC            4.458     2.059      2.165
  F5EMPLOY         4.925     4.366      1.128
  F6AEDU          18.131     8.259      2.195
  F6BSCI          43.399    19.744      2.198
  F7AGE           11.876     5.159      2.302
  F10GMREA        22.331     9.846      2.268
  F11FSHOP        32.049    14.271      2.246
C#4 ON
  F1               0.000     0.000      0.000
  HHSIZE           2.756     1.080      2.552
  F3INC            5.470     2.349      2.328
  F5EMPLOY         1.834     3.593      0.511
  F6AEDU          24.308    10.995      2.211
  F6BSCI          59.123    28.020      2.110
  F7AGE            7.606     3.424      2.221
  F10GMREA        22.938    10.632      2.158
  F11FSHOP        29.166    12.868      2.267
C#5 ON
  F1               0.000     0.000      0.000
  HHSIZE           0.886     0.428      2.069
  F3INC            1.456     0.866      1.682
  F5EMPLOY        33.593    14.879      2.258
  F6AEDU          16.087     6.478      2.483
  F6BSCI           8.538     4.791      1.782
  F7AGE            2.558     1.168      2.189
  F10GMREA        15.794     6.795      2.324
  F11FSHOP        22.203     9.729      2.282
Intercepts
  C#1             89.500    39.580      2.261
  C#2            137.887    60.984      2.261
  C#3            124.500    51.936      2.397
  C#4            129.445    54.820      2.361
  C#5             89.779    39.518      2.272

Q3. I ran the Lo-Mendell-Rubin test, and it appears that 5 classes should be rejected, is that not so? However, the mean and standard deviation values look pretty huge. Why is this?

TECHNICAL 11 OUTPUT

VUONG-LO-MENDELL-RUBIN LIKELIHOOD RATIO TEST FOR 5 (H0) VERSUS 6 CLASSES
  H0 Loglikelihood Value                         1178.578
  2 Times the Loglikelihood Difference            211.324
  Difference in the Number of Parameters               45
  Mean                                         294207.135
  Standard Deviation                           413877.053
  P-Value                                          1.0000

LO-MENDELL-RUBIN ADJUSTED LRT TEST
  Value                                           210.471
  P-Value                                          1.0000

Thanks and regards

bmuthen posted on Thursday, November 24, 2005  2:26 pm



Q1. You have some answers in the Muthen (2004) chapter in the Kaplan handbook on our web site. I would also look at the log likelihood and see how it improves as you add classes; you stop adding classes when the improvement starts leveling off. I would place less emphasis on Lo-Mendell-Rubin.

Q2. Large slopes of c on x are not a problem, but they are likely to happen with more classes because you are more likely to find people who have little variation on x in certain classes (so with x=0/1, you may have almost all x=1 in one class), and you may also have a class probability of almost 1 for certain x values.

Q3. It looks like Tech11 runs into a problem here; it is hard to say why.

Sanjoy posted on Saturday, November 26, 2005  1:17 am



Thank you Professor, let me go through the paper you have suggested. regards 

anonymous posted on Tuesday, December 20, 2005  12:59 pm



Hello, how do I correct for design effects when estimating LC regression models when the number of between units is small (12, in my case)? Would it be appropriate to use the complex procedure?

bmuthen posted on Tuesday, December 20, 2005  1:10 pm



Simulations indicate that you might need at least 20 clusters for the SEs of the complex procedure to work well. 
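
For readers unfamiliar with it, here is a minimal sketch of what "the complex procedure" looks like in an input for a latent class regression. The file name and the variable names (u1-u6, x1, clus) are hypothetical and not taken from the question, and the caveat about the number of clusters still applies.

```mplus
TITLE:    LC regression with clustered data (sketch)
DATA:     FILE = 'mydata.dat';       ! hypothetical file name
VARIABLE: NAMES = u1-u6 x1 clus;     ! hypothetical variable names
          USEVARIABLES = u1-u6 x1;
          CATEGORICAL = u1-u6;
          CLASSES = c(3);
          CLUSTER = clus;            ! identifier of the between unit (cluster)
ANALYSIS: TYPE = COMPLEX MIXTURE;    ! standard errors take the clustering into account
MODEL:    %OVERALL%
          c ON x1;                   ! regression of class membership on the covariate
```

The model itself is the same as in an ordinary TYPE = MIXTURE run; only the CLUSTER option and the COMPLEX keyword are added.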


Dr. Muthen, I am working on a mixture model and attempting to identify trajectories/classes related to a specific behavior. Is there a command that provides information about which individual cases are placed in each trajectory/class? Thank you. 


See the CPROBABILITIES option of the SAVEDATA command. 
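
A minimal sketch of how this might look for a growth mixture model; the file names, variable names (id, y1-y4), and the number of classes are hypothetical.

```mplus
TITLE:    GMM saving class assignments (sketch)
DATA:     FILE = 'mydata.dat';       ! hypothetical file name
VARIABLE: NAMES = id y1-y4;          ! hypothetical variable names
          USEVARIABLES = y1-y4;
          IDVARIABLE = id;           ! carries the case identifier into the saved file
          CLASSES = c(3);
ANALYSIS: TYPE = MIXTURE;
MODEL:    %OVERALL%
          i s | y1@0 y2@1 y3@2 y4@3; ! linear growth model
SAVEDATA: FILE = cprobs.dat;         ! hypothetical output file name
          SAVE = CPROBABILITIES;
```

The saved file contains the analysis variables followed by each case's posterior probability for every class and the most likely class, so individual cases can be matched to trajectories through the ID variable.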

Jon Elhai posted on Thursday, July 17, 2008  12:14 am



Drs. Muthen, In running an LCA (ordinal outcome variables and MLR estimation)... I'm wondering how to interpret a p-value of 1.000 for the VUONG-LO-MENDELL-RUBIN LR tests that I have obtained for each of my class solutions that I have run thus far (classes of 1 through 5). Is it typical for those first few class solutions to all have p-values of 1.000 for those tests?


A p-value of 1 in the 2-class run would suggest that a single-class model is sufficient.
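
As a reminder of where this test comes from, here is a minimal sketch of an input that requests it. The file and variable names are hypothetical. TECH11 in a k-class run compares k-1 classes (H0) against k classes, so the 2-class run is the one that tests a single class against two.

```mplus
TITLE:    2-class LCA requesting likelihood ratio tests (sketch)
DATA:     FILE = 'mydata.dat';       ! hypothetical file name
VARIABLE: NAMES = u1-u6;             ! hypothetical variable names
          USEVARIABLES = u1-u6;
          CATEGORICAL = u1-u6;
          CLASSES = c(2);            ! TECH11 then tests 1 (H0) versus 2 classes
ANALYSIS: TYPE = MIXTURE;
          ESTIMATOR = MLR;
OUTPUT:   TECH11                     ! Vuong-Lo-Mendell-Rubin and adjusted LMR tests
          TECH14;                    ! bootstrapped likelihood ratio test
```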

Wen posted on Thursday, October 15, 2009  2:41 pm



Dear Drs. Muthen, I'm working on growth mixture modeling with a continuous outcome. The 2-class model has BIC=2180.8 and a significant VUONG-LO-MENDELL-RUBIN LIKELIHOOD RATIO TEST (p=0.041), whereas the 3-class model has BIC=2180.6 and p-value=0.264 and is more interpretable. Which one should I choose? Do I have to calculate the chi-square value? If yes, can I get it when programming the GMM? Thank you.


The VUONG-LO-MENDELL-RUBIN LIKELIHOOD RATIO TEST points to two classes. Substantive interpretability and predictive validity can also be considered in determining the number of classes. See the following papers, which are on the website:

Nylund, K.L., Asparouhov, T., & Muthen, B. (2007). Deciding on the number of classes in latent class analysis and growth mixture modeling: A Monte Carlo simulation study. Structural Equation Modeling, 14, 535-569.

Muthén, B. (2004). Latent variable analysis: Growth mixture modeling and related techniques for longitudinal data. In D. Kaplan (ed.), Handbook of quantitative methodology for the social sciences (pp. 345-368). Newbury Park, CA: Sage Publications.

M.O. posted on Tuesday, May 12, 2015  8:37 am



Dear Drs. Muthen, I am comparing a factor mixture model (FMM) and a local independence model (LCA). The LCA worked well, with the 3-class solution showing the smallest BIC (5225). The FMM did not work with the 2-class solution: although it showed a smaller BIC (5160) and the bootstrap LR difference test was significant (p=0.01), all the class members ended up in one class (i.e., number of class members for class 1=888 and class 2=0). The 3-class FMM worked fine, with good interpretability and a BIC (5164) smaller than that of the 3-class LCA. However, the LR difference test was not significant (p=1). Just for clarification, here are the input files for the LCA and FMM. Considering the results, is it reasonable to choose the 3-class FMM?

LCA

TITLE:    LCA
DATA:     FILE = 'N888.dat';
VARIABLE: NAMES = u1-u3;
          USEVARIABLES = u1-u3;
          CATEGORICAL = u1-u3;
          CLASSES = C(3);
ANALYSIS: TYPE IS MIXTURE;
          STARTS = 1000 100;

FMM

TITLE:    fmm
DATA:     FILE = 'N888.dat';
VARIABLE: NAMES = u1-u3;
          USEVARIABLES = u1-u3;
          CATEGORICAL = u1-u3;
          CLASSES = C(3);
ANALYSIS: TYPE IS MIXTURE;
          STARTS = 10000 1000;
          ALGORITHM = INTEGRATION;
MODEL:    %OVERALL%
          f1 BY u1-u3;


I'd go by BIC. But I would also consider a model that is LCA, but adds a few WITH statements. See our new article:

Asparouhov, T. & Muthen, B. (2015). Residual associations in latent class and latent transition analysis. Structural Equation Modeling: A Multidisciplinary Journal, 22(2), 169-177. DOI: 10.1080/10705511.2014.935844
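
A rough sketch of what an LCA with an added WITH statement might look like. The file name, the variable names (u1-u6), and the choice of u1 WITH u2 as the residual association are all hypothetical, and the use of PARAMETERIZATION = RESCOV is an assumption about how such residual associations between categorical indicators are requested in recent Mplus versions; check the user's guide for your version before relying on it.

```mplus
TITLE:    LCA with one residual association (sketch)
DATA:     FILE = 'mydata.dat';         ! hypothetical file name
VARIABLE: NAMES = u1-u6;               ! hypothetical variable names
          USEVARIABLES = u1-u6;
          CATEGORICAL = u1-u6;
          CLASSES = c(3);
ANALYSIS: TYPE = MIXTURE;
          PARAMETERIZATION = RESCOV;   ! assumption: needed for WITH among categorical items
          STARTS = 1000 100;
MODEL:    %OVERALL%
          u1 WITH u2;                  ! hypothetical pair with a within-class association
```

Which pairs to free would have to come from the data, and the BIC of such a model can then be compared with the BICs of the plain LCA and the FMM.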
