

When performing an exploratory latent class analysis, how do I decide on the number of classes? Should I start with a low number of classes and keep increasing until I get a good fit? Also, how do I come up with initial conditional probabilities? I have no prior beliefs about the data. Thank you. 

bmuthen posted on Tuesday, January 09, 2001  2:13 pm



Start with 2 classes and keep increasing to see if the set of available model indices points to a certain number of classes. You can use BIC and you can check the classification quality, and in Mplus version 2, to be released in February, you also have an entropy measure and chi-square tests of fit. For a discussion, see Muthen & Muthen (2000) as listed in the Reference section on this web site. Starting values can be low probabilities for all items of one class and high probabilities for the other class.

Anonymous posted on Tuesday, October 30, 2001  2:34 pm



I started out with a 2-class LC model and then increased to a 3-class model. When I do this, (a) I get a message saying that a "saddle point" has been reached and that I should try new starting values or a new model. I have tried various starting values but keep getting this message, which suggests that changing the starting values did not change anything. Is there any other way of going about this? (b) I get a message saying that one of the logit thresholds has approached an extreme value. Can this be overcome by setting the starting values at the extreme threshold value shown in the output? (c) I get a message saying that the chi-square test cannot be computed because the frequency table is too large. Is there a way to overcome this problem? Thanks in advance. Best, Krishna

bmuthen posted on Wednesday, October 31, 2001  10:11 am



A first approach to try is to fix the thresholds labeled as extreme (e.g. at +10) and reestimate the model; this may solve the problem. Extreme thresholds are useful in the sense that they make the class interpretation easier (perfect association between the class and item). Regarding frequency tables that are too large: when the table is very large, the chi-square is probably not of interest anyway because many cells have very small expected frequencies, resulting in a very poor chi-square approximation.


Howdy. I'm still working on the LCA problem y'all have been helping me with on different threads. Now I'm running into problems with starting values and different solutions. The original problem was an 8-class model with 21 dichotomous indicators (N=445). No constraints on the model. I've run it with three different sets of starting values and got three different solutions. The loglikelihoods are within a few points of each other, but the solutions can be substantially different. I'm not getting any kind of warnings (except to ignore the standard errors where the logits are at the extremes). I thought the model complexity might be contributing to the instability, so I went to a simpler, 4-class model and got a stable solution that holds across several sets of starting values. However, when going to a second sample, I'm getting the same variance in outcomes even with the simple model. Any insights or suggestions? Thanks, Pat Malone

bmuthen posted on Thursday, November 01, 2001  3:11 pm



LCA is an exploratory model with relatively little structure imposed and it is known to be able to produce different solutions. Your example seems like a more unusual version, however, where solutions with very different parameter estimates are obtained with rather little loglikelihood difference. This indicates that these particular data provide little information on this particular model. Perhaps this is also seen in the classification table reflecting uncertainty? The different solutions may all have proper maxima so that the information matrix is not singular. The only remedy here is to do what you did: use several sets of starting values. Or, formulate more restrictive models.


Thanks. In the absence of strong theory guiding the starting values, I was thinking of trying an automated process where I generate random starting values for some reasonably large number (maybe 50) of runs. I could then look for modal solutions or optimal solutions (by loglikelihood). Do you think there's merit to such an approach? Very computationally intensive, I know. Thanks 

bmuthen posted on Friday, November 02, 2001  7:51 am



Yes, that is a reasonable approach for this situation. 
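
As an illustration of the random-starts idea discussed above, here is a minimal sketch in Python rather than Mplus. It assumes binary items, an unrestricted latent class model fitted by a plain EM algorithm, and simulated data; the function names (`pattern_joint`, `log_likelihood`, `fit_lca`) and all settings (20 starts, 200 iterations, 5 items) are hypothetical choices for the example, not anything Mplus does internally.

```python
import math
import random
from collections import Counter

def pattern_joint(pattern, weights, probs):
    """P(pattern, class k) for each class k, assuming local independence."""
    joint = []
    for w, item_probs in zip(weights, probs):
        p = w
        for u, pj in zip(pattern, item_probs):
            p *= pj if u == 1 else 1.0 - pj
        joint.append(p)
    return joint

def log_likelihood(counts, weights, probs):
    """Loglikelihood of the observed response-pattern frequencies."""
    return sum(n * math.log(sum(pattern_joint(pat, weights, probs)))
               for pat, n in counts.items())

def fit_lca(counts, n_classes, rng, n_iter=200):
    """EM for a binary-item latent class model from one random start."""
    n_items = len(next(iter(counts)))
    n_total = sum(counts.values())
    weights = [1.0 / n_classes] * n_classes
    # Random starting values for the conditional item probabilities
    probs = [[rng.uniform(0.1, 0.9) for _ in range(n_items)]
             for _ in range(n_classes)]
    for _ in range(n_iter):
        # E-step: expected class sizes and endorsement counts per pattern
        class_n = [0.0] * n_classes
        endorse = [[0.0] * n_items for _ in range(n_classes)]
        for pat, n in counts.items():
            joint = pattern_joint(pat, weights, probs)
            total = sum(joint)
            for k in range(n_classes):
                r = n * joint[k] / total
                class_n[k] += r
                for j, u in enumerate(pat):
                    endorse[k][j] += r * u
        # M-step: update class weights and item probabilities
        for k in range(n_classes):
            nk = max(class_n[k], 1e-12)
            weights[k] = class_n[k] / n_total
            # Keep probabilities slightly off the 0/1 boundary
            probs[k] = [min(max(e / nk, 1e-6), 1 - 1e-6) for e in endorse[k]]
    return log_likelihood(counts, weights, probs), weights, probs

rng = random.Random(2001)  # fixed seed so the sketch is reproducible
rows = []
for _ in range(445):       # simulated two-class data, not the poster's data
    p = 0.85 if rng.random() < 0.4 else 0.15
    rows.append(tuple(1 if rng.random() < p else 0 for _ in range(5)))
counts = Counter(rows)

# Fit a 3-class model from 20 random starts and compare loglikelihoods
runs = [fit_lca(counts, 3, rng) for _ in range(20)]
best = max(ll for ll, _, _ in runs)
replicated = sum(1 for ll, _, _ in runs if best - ll < 0.01)
print(f"best loglikelihood {best:.3f}, reached by {replicated} of {len(runs)} starts")
```

If the best loglikelihood is reached by only one or two of the starts, that is a hint that more starts may be needed before any solution is interpreted.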


I have two general questions about interpreting the output for an LCA model. First, do you have any suggested guidelines for interpreting entropy? Second, under what circumstances should one use the sample-size adjusted BIC as opposed to the BIC? I notice in the Mplus 2.0 manual that a study is cited on p. 372 (Yang, 1998) that found superior performance for the sample-size adjusted BIC for LCA models. Does this argue for the sample-size adjusted BIC anytime one is doing an LCA? Thanks.

bmuthen posted on Thursday, November 15, 2001  10:19 am



Entropy is described in the User's Guide. I have not found specific guidelines; you may be able to find guidelines in the marketing literature (see ref. in User's Guide). Entropy does not seem to be correlated with goodness of fit of the model (much like R-square is not correlated with good model fit in cov structure models). The classification table gives further information. Yang suggested using the sample-size adjusted BIC for LCA. But this is just one study and for certain LCA models. I would continue to use BIC as well until more studies accumulate. This is not to say that I fully trust BIC; this measure is currently being studied by several research groups. You can also use the chi-square test against an unrestricted multinomial distribution. Interpretability and usefulness remain key considerations.
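
For readers who want to see what an entropy-type classification measure looks like numerically, the sketch below computes a relative entropy from posterior class probabilities, scaled so that 1 means perfectly sharp classification and 0 means the posteriors carry no information. This is the form commonly used for mixture models; that it matches the User's Guide definition exactly is an assumption, and the function name `relative_entropy` is hypothetical.

```python
import math

def relative_entropy(posteriors):
    """1 - sum_i sum_k(-p_ik * ln p_ik) / (n * ln K).

    posteriors: one row per respondent, one posterior probability per class.
    """
    n = len(posteriors)
    k = len(posteriors[0])
    total = 0.0
    for row in posteriors:
        for p in row:
            if p > 0.0:  # treat 0 * ln(0) as 0
                total -= p * math.log(p)
    return 1.0 - total / (n * math.log(k))

sharp = [[1.0, 0.0], [0.0, 1.0], [1.0, 0.0]]   # everyone classified with certainty
fuzzy = [[0.5, 0.5], [0.5, 0.5]]               # posteriors carry no information
mixed = [[0.9, 0.1], [0.2, 0.8], [0.6, 0.4]]   # somewhere in between

for name, post in (("sharp", sharp), ("fuzzy", fuzzy), ("mixed", mixed)):
    print(name, round(relative_entropy(post), 3))
```

A high value says the classes are well separated for these respondents; as noted above, it says nothing about whether the model fits.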

Anonymous posted on Wednesday, July 24, 2002  7:00 pm



Hi, I am attempting to use mixture modeling to explore possible latent classes in a univariate distribution. In comparing the models in which I've specified a different number of classes, I'm not sure what criteria to use to determine which model best describes the data. Specifically, should I select the model that yields the lowest (unadjusted for sample size) BIC, the lowest AIC, or use a chi-square test to compare the models, or use some other criteria? Thank you very much for your help.


How to decide on the number of classes is a topic that continues to be studied. These issues are discussed in the following references: McLachlan, G.J. & Peel, D. (2000). Finite Mixture Models. New York: Wiley & Sons. Muthén, B. & Muthén, L. (2000). Integrating person-centered and variable-centered analyses: Growth mixture modeling with latent trajectory classes. Alcoholism: Clinical and Experimental Research, 24, 882-891. (#85)


Version 2.12 update now installed. I'll go and read the Biometrika article, but the Lo-Mendell-Rubin test looks like it might help make dimensionality decisions. How much does this depend on your advice to put classes in ascending order of magnitude? If the first class is the smallest would this make less difference than if the first class was the largest?

bmuthen posted on Tuesday, August 20, 2002  12:50 pm



Which class to put first could affect the starting values, which may have influence on the LMR LRT if there are local optima, otherwise not (you can check the k-1 solution's log likelihood value printed in TECH11). So, as a matter of habit, in the k-class run I would have the smallest class first since the program would then use the starting values from the remaining k-1 classes, which are probably better starting values for a k-1 class analysis. In some models, the parameterization of the first class achieves the identification of the model, in which case this class would have to be moved to a higher class location.


I am doing a latent variable mixture model with a dependent variable and a time-varying covariate of that variable. I was excited to see the Lo-Mendell-Rubin LRT test to help decide on the number of classes. However, I am finding that the statistic varies widely between different models I have tested with the same number of classes. For instance, a 3-class model with no variance in growth parameters gives an LRT of 61 with a p value of 0.22. When I allow residual variances that are different between classes, the model yields an LRT of 867, with a p value for the LRT test of .0001. I then allow regressions to be different between classes. Now, the model fits better according to the chi-square difference test, but the LRT is 343 with a p value of 0.59. I'm not sure what to make of this. I also have a question about regression coefficients. In some models I allowed different regression coefficients for different classes. In some of these cases, I have gotten solutions that have "y ON x" values greater than 1.0. I believed these were standardized regression coefficients, so that would not be an admissible solution. Is this correct? And do you have any ideas for how to get around it? Thanks, Jennie

bmuthen posted on Thursday, September 12, 2002  3:45 pm



First, make sure that in your k-class run, the log likelihood value that is printed under TECH11 for your H0 model (the k-1 model, i.e. the model with one less class) agrees with the log likelihood value that you got in the regular output for your run with k-1 classes. If they disagree, you are not testing against the right k-1 model and will have to modify the starting values of your k-class run. Note that Mplus drops class 1 when doing the k-1 class H0 run. Second, the LMR LRT p values can vary if you allow more or less flexibility in your mixture model; this is as it should be. Regarding your standardized solutions, are these cases where you have negative residual variances for y? If you like, send support@statmodel.com an output excerpt that shows this.


I'm still trying to do the right thing in making # of class decisions. My reading of the Lo-Mendell-Rubin article suggests that it's really only appropriate for LCA of continuous variables. As I'm doing LCA of the patterns in 7 dichotomous survey responses I'm concluding it's not quite right. So I've read through all I can find and have taken chapter 1 from Marcoulides & Schumacker to imply that it's best to find a minimum BIC for deciding how many classes. This has come down to 4. However I guess I should now be optimising among the possible starting configurations. My dimension reduction used the 4 most popular response patterns so I'm now sampling 4 from 66. I've got the time to construct a new input file down to 5 minutes and have managed 10 random starts. One of these is slightly lower than the original on BIC. It's got fewer thresholds set (-3/+3 gone to -15/+15) during optimisation but has deleted one of the 66 cells. I'm hesitant about choosing an option which doesn't represent all 66 cells, but should I go strictly with BIC? And am I right not to worry too much about a few +/-15s? And how many random starts should I do? Should I choose some to represent profiles that don't exist? I could also agree with Jennie Jester that the LMR LRT is all over the place with the differing starting values (from 920.905 to 842.586!) but I'm not confident in using it. Obviously this could be a starting value problem. Finally, if I could persuade my fellow worker to tell me a theoretically grounded set of starting positions, can I fix some in the starting position by using @ rather than *? And if I get a threshold should I run again with those values fixed to -15/+15 or another value?

bmuthen posted on Tuesday, October 22, 2002  3:04 pm



The Lo-Mendell-Rubin article refers to the original Vuong article which we feel also covers the categorical outcomes case. We have limited simulation studies indicating that it works well here too. But I agree that multiple solutions can make it awkward to work with in practice. You should only consider the TECH11 test results for the solution with the highest log likelihood value. The deletion of cells is only done in the chi-square computations and not in the model estimation. You don't have to worry about +/-15 at all; in fact, they make the interpretation clearer. A +/-15 threshold in one solution (with one set of starting values) may not be at this extreme in another solution, so don't fix them at these values when changing starting values. Theoretically grounded values using @ instead of * are valuable just like in CFA.
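
To see why thresholds at +/-15 make interpretation clearer, it can help to convert a few logit thresholds to conditional item probabilities. The sketch below assumes the usual logistic threshold parameterization for a binary item, in which P(u = 1 | class) = 1 / (1 + exp(threshold)), so a large positive threshold means the item is almost never endorsed in that class. The sign convention is stated as an assumption, and the function name `item_probability` is hypothetical.

```python
import math

def item_probability(threshold):
    """P(u = 1 | class) for a binary item under a logistic threshold model."""
    return 1.0 / (1.0 + math.exp(threshold))

# Thresholds of the size discussed above, plus a neutral one for comparison
for tau in (-15.0, -3.0, 0.0, 3.0, 15.0):
    print(f"threshold {tau:+5.1f} -> P(u=1 | class) = {item_probability(tau):.7f}")
```

At +/-15 the probability is within one in a million of 0 or 1, which is the "perfect association between the class and item" mentioned earlier in the thread.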


I'm finalising a Winter AMA special session presentation on choosing # classes using Adjusted LMR in latent class analyses. I've frequently got the situation where I've an adjusted LMR p>.05 suggesting that the #classes-1 that has LMR p<.001 would be the right decision. However #classes+1 also has LMR p<.001, and so does #classes+2. I've checked and reordered my starting values to get the outcome class sizes ascending; this makes little or no difference. Any suggestions as to how to resolve this? Should I take 1) the lowest no. of classes with a significant LMR, 2) the highest sig LMR dimension with a NS LMR above, or 3) the highest sig LMR dimension with two NS LMR for dimensions above? I certainly wouldn't recommend anyone to decide the #classes is right if the LMR p<.001 on the dimensionality they've chosen.

bmuthen posted on Monday, January 13, 2003  10:07 am



The LMR test gives a p value for the (k-1)-class versus the k-class model when running the k-class model. So to get support for k classes (or more) you want a low p value for the k-1 model in the k-class run and a high p value for the k model in the k+1 run. You should keep adding classes until this happens. Note that the LMR log likelihood for the k-1 class model in the k-class run should be the same as the log likelihood for the k-1 run. Hope this answers your question and if not let us know.


Yes I'm almost there. So a rule of the type "take the number of classes (k) with a low p value that is followed by a k+1 test with a high p value" is what is suggested. This works fine with the following series:

# classes    LMR adjusted    prob
2            1351.173        0.0000
3             487.692        0.0000
4              56.022        0.0000
5             520.310        0.0000
6               2.205        0.6240
7              14.810        0.1775

Unfortunately I've got series of analyses with p's behaving like this:

# classes    LMR adjusted    prob
2            5779.2          low
3              67.53         low
4               0.83         high
5            1108.1          low
6               2.2          high

I'm taking this to suggest either 3 or 5! And if I'm looking for a low class number I'd take 3 and looking for a high # of classes I take the 5. And I've carefully checked the outcome class sizes are in ascending order. Of course minimum BIC gives something unequivocal!

bmuthen posted on Tuesday, January 14, 2003  9:39 am



Just to be clear, let me ask you about the first column that you label # classes. By this do you mean: (1) the number of classes in your input (what I call the "k-class run"), or (2) the number of classes that is used for the H0 in the TECH11 LMR test (which is "the (k-1)-class model")? Assuming it is (1), I would choose 5 classes in your first example and 3 classes in your second example. Assuming it is (2), I would choose 6 and 4, respectively. For the case when you get low, low, high, low, high, I would go with the first instance that it switches from low to high; the higher class numbers where such a switch occurs are then not of interest since we already couldn't reject that a lower number of classes fit.
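
The "first switch from low to high" rule can be written down as a small helper. The sketch below assumes interpretation (1), i.e. each p value is keyed by the number of classes in the k-class run, and uses a 0.05 cutoff as an assumed definition of "low". The function name `classes_by_lmr` is hypothetical, and the numeric values in the second series are stand-ins for the "low" and "high" labels in the post.

```python
def classes_by_lmr(p_values, alpha=0.05):
    """Return the number of classes supported by a sequence of LMR tests.

    p_values maps k (classes in the k-class run) to the p value of the
    (k-1)-class versus k-class test. The first non-significant test, found
    in the k-class run, points to k-1 classes. None means every test was
    significant, so more classes would need to be tried.
    """
    for k in sorted(p_values):
        if p_values[k] >= alpha:
            return k - 1
    return None

# First series from the post above
first = {2: 0.0000, 3: 0.0000, 4: 0.0000, 5: 0.0000, 6: 0.6240, 7: 0.1775}
# Second series: 0.001 and 0.5 are stand-ins for "low" and "high"
second = {2: 0.001, 3: 0.001, 4: 0.5, 5: 0.001, 6: 0.5}

print(classes_by_lmr(first))   # 5
print(classes_by_lmr(second))  # 3
```

Under interpretation (2) the same logic applies with every key shifted up by one, which gives 6 and 4.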


In addition to looking at various fit statistics, it is also important to look at classification quality and interpretability when deciding on the number of classes. Classification quality can be determined by looking at the entropy value and the classification table that is part of the Mplus output. Interpretability should be guided by whether the classes make sense theoretically, whether they have predictive validity, etc. This is discussed in the following paper: Muthén, B. & Muthén, L. (2000). Integrating person-centered and variable-centered analyses: Growth mixture modeling with latent trajectory classes. Alcoholism: Clinical and Experimental Research, 24, 882-891.

Anonymous posted on Tuesday, April 08, 2003  9:51 am



I am launching into a latent class analysis on some symptom data from the SCID. I first ran a PCA in SPSS, and found that three eigenvalues are >1. A colleague of mine said that the "rule of thumb" is that a class solution with one class greater than the number of eigenvalues >1 (in this case, it would be a 4-class solution) is probably the best. Is this documented anywhere? Thank you, Dawn

bmuthen posted on Tuesday, April 08, 2003  9:59 am



Yes, see references in section 2 of Muthen (2002), Statistical and Substantive Checking in Growth Mixture Modeling, which is available in pdf on the Mplus home page, www.statmodel.com.

Anonymous posted on Friday, December 12, 2003  6:00 am



I am trying to fit an LCA model to three variables with ordinal levels 0 1 2. I am having trouble getting a model to fit. The message I get looks something like this:

THE MODEL ESTIMATION DID NOT TERMINATE NORMALLY DUE TO AN ILL-CONDITIONED FISHER INFORMATION MATRIX. CHANGE YOUR MODEL AND/OR STARTING VALUES.

THE MODEL ESTIMATION DID NOT TERMINATE NORMALLY DUE TO A NON-POSITIVE DEFINITE FISHER INFORMATION MATRIX. THIS MAY BE DUE TO THE STARTING VALUES BUT MAY ALSO BE AN INDICATION OF MODEL NONIDENTIFICATION. THE CONDITION NUMBER IS 0.549D-12.

THE STANDARD ERRORS OF THE MODEL PARAMETER ESTIMATES COULD NOT BE COMPUTED. THIS IS OFTEN DUE TO THE STARTING VALUES BUT MAY ALSO BE AN INDICATION OF MODEL NONIDENTIFICATION. CHANGE YOUR MODEL AND/OR STARTING VALUES. PROBLEM INVOLVING PARAMETER 30.

I followed the advice I saw somewhere, which was to use ending values from one failed run for the next run. I did approximately 10 iterations of this, and while I keep getting the same error message, the class variable remains unchanged, that is, the subjects keep ending up in the same class structure. Do you have any advice on how I might proceed from here? By the way, I have tried 3-, 4-, 5-, and 6-class solutions with similar problems. Thank you very much in advance for your advice.


It sounds like an identification problem. Given that you have three variables with three categories, you would have 26 independent pieces of information. The message is complaining about parameter 30. If you send the output to support@statmodel.com, I can take a look at it. 
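
The count of 26 independent pieces of information can be checked against the number of parameters a model asks for. The sketch below assumes an unrestricted latent class model in which every class has its own set of thresholds for every item, plus the class probabilities; the function names are hypothetical. A model that needs more parameters than there are independent cells cannot be identified, although having fewer does not by itself guarantee identification.

```python
from math import prod

def independent_cells(categories):
    """Independent pieces of information in the full cross-classification."""
    return prod(categories) - 1

def free_parameters(categories, n_classes):
    """Thresholds per class (c - 1 per item) plus n_classes - 1 class probabilities."""
    return n_classes * sum(c - 1 for c in categories) + (n_classes - 1)

categories = [3, 3, 3]  # three ordinal items with levels 0, 1, 2
cells = independent_cells(categories)
for k in (2, 3, 4, 5, 6):
    params = free_parameters(categories, k)
    status = "exceeds" if params > cells else "within"
    print(f"{k} classes: {params} parameters, {status} the {cells} available")
```

Under these assumptions the 4-, 5-, and 6-class models all ask for more than 26 parameters.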

Anonymous posted on Monday, October 11, 2004  8:09 am



I am working on a latent class analysis and have a question about determining which solution has the best fit. When comparing a 4-class to a 3-class solution, my BIC increases slightly (6653.22 to 6689.09), but the entropy increases substantially (.61 to .84). The LMR is also nonsignificant. The interpretability of the 4-class solution makes sense substantively. By the way, the LMR is also nonsignificant between the 2-class and 3-class solutions. I then tried to see if adding covariates would give me better goodness of fit statistics. However, I ran into the same problem: improved entropy and interpretability of the 4-class compared to the 3-class solution, but with slight increases in the BIC and a nonsignificant LMR. Is it appropriate to use the 4-class solution given the nonsignificant LMR? Thanks for your help.


Hello, This is my first attempt at latent class analysis, and I'm having difficulty determining the best solution from the various fit statistics. Should "fit" be based solely on the lowest BIC, the lowest likelihood ratio chi-square statistic, and the extent to which the solution makes sense to the analyst? Also, what is the difference between the BIC and adjusted BIC, particularly towards assessing "fit"? Thanks, Jeffry


See the following paper for suggestions: Muthén, B. (2004). Latent variable analysis: Growth mixture modeling and related techniques for longitudinal data. In D. Kaplan (ed.), Handbook of quantitative methodology for the social sciences (pp. 345-368). Newbury Park, CA: Sage Publications. It can be downloaded from the Mplus homepage. Regarding the two BICs, a new paper by Nylund et al. indicates that the adjusted BIC may work better than the conventional BIC. The adjusted BIC uses a modified sample size. See Appendix 8 of the technical appendices which are available on the website.
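
The difference between the two BICs is easiest to see in a small calculation. The sketch below uses the modified sample size n* = (n + 2) / 24 that is printed in the Mplus output posted further down this thread, and checks the formulas against that post's 2-class output: 43 free parameters, class counts 2190 + 71 = 2261, and a loglikelihood of -42803.940 (the minus sign is implied by the reported AIC). The function name `information_criteria` is hypothetical.

```python
import math

def information_criteria(loglik, n_params, n):
    """AIC, BIC, and sample-size adjusted BIC from a model loglikelihood."""
    deviance = -2.0 * loglik
    n_star = (n + 2) / 24  # modified sample size used by the adjusted BIC
    return {
        "AIC": deviance + 2 * n_params,
        "BIC": deviance + n_params * math.log(n),
        "aBIC": deviance + n_params * math.log(n_star),
    }

# Values taken from the 2-class output posted later in the thread
ic = information_criteria(loglik=-42803.940, n_params=43, n=2261)
for name, value in ic.items():
    print(f"{name}: {value:.3f}")
# Reported in that output: AIC 85693.881, BIC 85939.994, adjusted BIC 85803.375
```

Because n* is much smaller than n, the adjusted BIC penalizes each extra parameter less, so it tends to favor models with more classes than the conventional BIC does.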

Anonymous posted on Friday, October 22, 2004  10:22 am



Hi, I am wondering if anyone can help me build a conditional statement into a latent class analysis I am doing. I am interested in assigning everyone who answered 'yes' to a certain question to a certain category. How would I write that into code? Thanks for the help.


You can do this using the KNOWNCLASS option of the VARIABLE command or by fixing the value of the threshold of the item as shown in Example 7.24. 

Anonymous posted on Wednesday, December 01, 2004  2:17 pm



Hi, I'm having difficulty interpreting the Lo-Mendell-Rubin likelihood ratio test (Tech 11), particularly your indication that "a low p-value indicates that the estimated model is preferable." Do you mean this in accordance with conventional practice (i.e., p = .05 or less)? My output is as follows:

K    LRT P-Value
1    NA
2    0.0033
3    0.0777
4    0.1681
5    0.3894

I suspect that the two-class model is preferred with respect to the LRT p-value. However, the BIC and entropy indices are better for the three-class model. Thanks for your help.


We always recommend looking at the solutions for more than one number of classes. For example, here you might look at the 2-, 3-, and 4-class solutions to see how they differ and whether they are substantively meaningful.

Anonymous posted on Friday, December 31, 2004  4:22 pm



Hello, I'm running an LCA in which my LRT test for H0 (2 latent classes) versus three latent classes yields the error message:

TECHNICAL 11 OUTPUT
THE LIKELIHOOD RATIO TEST COULD NOT BE COMPUTED. THE INFORMATION MATRIX OF THE H0 MODEL WITH ONE LESS CLASS IS SINGULAR. REORDER THE CLASSES. THE LOGLIKELIHOOD VALUE FOR THE ESTIMATED H0 MODEL IS -2341.782.

How do I specify reordered categories? Thanks.


You can reorder classes by using the ending values as starting values. So if you want Class 2 to be Class 1, use the ending values of Class 2 as starting values for Class 1. 


Dear Linda, on October 14 of 2004 you mentioned a paper by Nylund et al. concerning sample size adjusted BIC. Could you give me a hint where I can find this article? Thank you! 

bmuthen posted on Thursday, March 24, 2005  7:49 am



We decided to extend the investigation in this paper and it is therefore not finished at this point. Probably will be available this summer. 

Salma posted on Monday, January 16, 2006  3:36 am



Please advise on an example that I can refer to in using LCA with ordinal manifest variables and a categorical latent variable. Thanks.


See Examples 7.3, 7.4, and 7.5. 

Tonya Jones posted on Thursday, July 13, 2006  1:59 pm



Hi, I’m a research assistant using Mplus to conduct latent class analysis with continuous variables for the first time. I am trying to identify the number of classes present in a sample of 2,265 survey respondents. My 14 variables are 14 health attitudes (Likert scales, strongly disagree to strongly agree). Theoretically, the primary investigator and I believe that there are several classes within the sample (at least 4 or 6). However, the statistics best support the solution of 2 classes. For instance, the 2-class solution was significant, the 3-class solution was not significant, and the 4-, 5-, and 6-class solutions were significant. The 4- and 6-class solutions make sense theoretically, but the 2-class and 5-class solutions do not. Below is the output for the different solutions using 50,20 starting values. Also, warning messages appear in the 3+ output. I’ve used 20,10; 50,10; 50,20; and 100,50 starting values, and my class sizes hold after changing the starting values, but I still encounter the same warning messages (see below). Should I be concerned? I’ve read both papers discussing the issue of choosing the correct number of classes (Muthén, B. & Muthén, L. (2000). Integrating person-centered and variable-centered analyses: Growth mixture modeling with latent trajectory classes. AND Nylund, K.L., Asparouhov, T., & Muthen, B. (2006). Deciding on the number of classes in latent class analysis and growth mixture modeling. A Monte Carlo simulation study.), and I’ve learned a lot from them. I’m just interested in your thoughts regarding using the 4-class or 6-class results even though the 3-class result was not significant. Thanks in advance and sorry for the long message.

2 CLASS SOLUTION

THE MODEL ESTIMATION TERMINATED NORMALLY

TESTS OF MODEL FIT

Loglikelihood
  H0 Value                                  -42803.940
  H0 Scaling Correction Factor for MLR           1.125

Information Criteria
  Number of Free Parameters                         43
  Akaike (AIC)                               85693.881
  Bayesian (BIC)                             85939.994
  Sample-Size Adjusted BIC                   85803.375
    (n* = (n + 2) / 24)
  Entropy                                        0.999

Class Counts and Proportions
  Latent Classes
    1    2190    0.96860
    2      71    0.03140

VUONG-LO-MENDELL-RUBIN LIKELIHOOD RATIO TEST FOR 1 (H0) VERSUS 2 CLASSES
  H0 Loglikelihood Value                    -44231.624
  2 Times the Loglikelihood Difference        2855.368
  Difference in the Number of Parameters            15
  Mean                                          37.662
  Standard Deviation                           201.245
  P-Value                                       0.0000

LO-MENDELL-RUBIN ADJUSTED LRT TEST
  Value                                       2830.933
  P-Value                                       0.0000

3 CLASS SOLUTION

WARNING: WHEN ESTIMATING A MODEL WITH MORE THAN TWO CLASSES, IT MAY BE NECESSARY TO INCREASE THE NUMBER OF RANDOM STARTS USING THE STARTS OPTION TO AVOID LOCAL MAXIMA.

WARNING: THE BEST LOGLIKELIHOOD VALUE WAS NOT REPLICATED. THE SOLUTION MAY NOT BE TRUSTWORTHY DUE TO LOCAL MAXIMA. INCREASE THE NUMBER OF RANDOM STARTS.

THE STANDARD ERRORS OF THE MODEL PARAMETER ESTIMATES MAY NOT BE TRUSTWORTHY FOR SOME PARAMETERS DUE TO A NON-POSITIVE DEFINITE FIRST-ORDER DERIVATIVE PRODUCT MATRIX. THIS MAY BE DUE TO THE STARTING VALUES BUT MAY ALSO BE AN INDICATION OF MODEL NONIDENTIFICATION. THE CONDITION NUMBER IS 0.117D-17. PROBLEM INVOLVING PARAMETER 31.

THE MODEL ESTIMATION TERMINATED NORMALLY

TESTS OF MODEL FIT

Loglikelihood
  H0 Value                                  -40908.249
  H0 Scaling Correction Factor for MLR           1.760

Information Criteria
  Number of Free Parameters                         58
  Akaike (AIC)                               81932.498
  Bayesian (BIC)                             82264.464
  Sample-Size Adjusted BIC                   82080.188
    (n* = (n + 2) / 24)
  Entropy                                        0.999

Class Counts and Proportions
  Latent Classes
    1      71    0.03140
    2    2072    0.91641
    3     118    0.05219

VUONG-LO-MENDELL-RUBIN LIKELIHOOD RATIO TEST FOR 2 (H0) VERSUS 3 CLASSES
  H0 Loglikelihood Value                    -42803.940
  2 Times the Loglikelihood Difference        3791.383
  Difference in the Number of Parameters            15
  Mean                                        1725.840
  Standard Deviation                          2263.722
  P-Value                                       0.1304

LO-MENDELL-RUBIN ADJUSTED LRT TEST
  Value                                       3758.937
  P-Value                                       0.1321

4 CLASS SOLUTION

WARNING: WHEN ESTIMATING A MODEL WITH MORE THAN TWO CLASSES, IT MAY BE NECESSARY TO INCREASE THE NUMBER OF RANDOM STARTS USING THE STARTS OPTION TO AVOID LOCAL MAXIMA.

WARNING: THE BEST LOGLIKELIHOOD VALUE WAS NOT REPLICATED. THE SOLUTION MAY NOT BE TRUSTWORTHY DUE TO LOCAL MAXIMA. INCREASE THE NUMBER OF RANDOM STARTS.

THE STANDARD ERRORS OF THE MODEL PARAMETER ESTIMATES MAY NOT BE TRUSTWORTHY FOR SOME PARAMETERS DUE TO A NON-POSITIVE DEFINITE FIRST-ORDER DERIVATIVE PRODUCT MATRIX. THIS MAY BE DUE TO THE STARTING VALUES BUT MAY ALSO BE AN INDICATION OF MODEL NONIDENTIFICATION. THE CONDITION NUMBER IS 0.285D-15. PROBLEM INVOLVING PARAMETER 3.

THE MODEL ESTIMATION TERMINATED NORMALLY

TESTS OF MODEL FIT

Loglikelihood
  H0 Value                                  -40238.840
  H0 Scaling Correction Factor for MLR           1.686

Information Criteria
  Number of Free Parameters                         73
  Akaike (AIC)                               80623.680
  Bayesian (BIC)                             81041.500
  Sample-Size Adjusted BIC                   80809.566
    (n* = (n + 2) / 24)
  Entropy                                        0.959

Class Counts and Proportions
  Latent Classes
    1    1144    0.50597
    2      71    0.03140
    3     118    0.05219
    4     928    0.41044

VUONG-LO-MENDELL-RUBIN LIKELIHOOD RATIO TEST FOR 3 (H0) VERSUS 4 CLASSES
  H0 Loglikelihood Value                    -40908.249
  2 Times the Loglikelihood Difference        1338.818
  Difference in the Number of Parameters            17
  Mean                                          58.384
  Standard Deviation                            42.977
  P-Value                                       0.0000

LO-MENDELL-RUBIN ADJUSTED LRT TEST
  Value                                       1328.698
  P-Value                                       0.0000

5 CLASS SOLUTION

WARNING: WHEN ESTIMATING A MODEL WITH MORE THAN TWO CLASSES, IT MAY BE NECESSARY TO INCREASE THE NUMBER OF RANDOM STARTS USING THE STARTS OPTION TO AVOID LOCAL MAXIMA.

WARNING: THE BEST LOGLIKELIHOOD VALUE WAS NOT REPLICATED. THE SOLUTION MAY NOT BE TRUSTWORTHY DUE TO LOCAL MAXIMA. INCREASE THE NUMBER OF RANDOM STARTS.

THE STANDARD ERRORS OF THE MODEL PARAMETER ESTIMATES MAY NOT BE TRUSTWORTHY FOR SOME PARAMETERS DUE TO A NON-POSITIVE DEFINITE FIRST-ORDER DERIVATIVE PRODUCT MATRIX. THIS MAY BE DUE TO THE STARTING VALUES BUT MAY ALSO BE AN INDICATION OF MODEL NONIDENTIFICATION. THE CONDITION NUMBER IS 0.183D-19. PROBLEM INVOLVING PARAMETER 31.

THE MODEL ESTIMATION TERMINATED NORMALLY

TESTS OF MODEL FIT

Loglikelihood
  H0 Value                                  -39924.057
  H0 Scaling Correction Factor for MLR           1.675

Information Criteria
  Number of Free Parameters                         88
  Akaike (AIC)                               80024.113
  Bayesian (BIC)                             80527.787
  Sample-Size Adjusted BIC                   80248.196
    (n* = (n + 2) / 24)
  Entropy                                        0.957

Class Counts and Proportions
  Latent Classes
    1      71    0.03140
    2     189    0.08359
    3    1538    0.68023
    4     345    0.15259
    5     118    0.05219

VUONG-LO-MENDELL-RUBIN LIKELIHOOD RATIO TEST FOR 4 (H0) VERSUS 5 CLASSES
  H0 Loglikelihood Value                    -40302.825
  2 Times the Loglikelihood Difference         757.537
  Difference in the Number of Parameters            17
  Mean                                         111.882
  Standard Deviation                           137.085
  P-Value                                       0.0048

LO-MENDELL-RUBIN ADJUSTED LRT TEST
  Value                                        751.811
  P-Value                                       0.0050

6 CLASS SOLUTION

WARNING: WHEN ESTIMATING A MODEL WITH MORE THAN TWO CLASSES, IT MAY BE NECESSARY TO INCREASE THE NUMBER OF RANDOM STARTS USING THE STARTS OPTION TO AVOID LOCAL MAXIMA.

THE STANDARD ERRORS OF THE MODEL PARAMETER ESTIMATES MAY NOT BE TRUSTWORTHY FOR SOME PARAMETERS DUE TO A NON-POSITIVE DEFINITE FIRST-ORDER DERIVATIVE PRODUCT MATRIX. THIS MAY BE DUE TO THE STARTING VALUES BUT MAY ALSO BE AN INDICATION OF MODEL NONIDENTIFICATION. THE CONDITION NUMBER IS 0.120D-18. PROBLEM INVOLVING PARAMETER 3.

THE MODEL ESTIMATION TERMINATED NORMALLY

TESTS OF MODEL FIT

Loglikelihood
  H0 Value                                  -39672.371
  H0 Scaling Correction Factor for MLR           1.603

Information Criteria
  Number of Free Parameters                        103
  Akaike (AIC)                               79550.742
  Bayesian (BIC)                             80140.269
  Sample-Size Adjusted BIC                   79813.021
    (n* = (n + 2) / 24)
  Entropy                                        0.905

Class Counts and Proportions
  Latent Classes
    1     171    0.07563
    2     324    0.14330
    3      71    0.03140
    4     764    0.33790
    5     813    0.35958
    6     118    0.05219

VUONG-LO-MENDELL-RUBIN LIKELIHOOD RATIO TEST FOR 5 (H0) VERSUS 6 CLASSES
  H0 Loglikelihood Value                    -39918.607
  2 Times the Loglikelihood Difference         492.472
  Difference in the Number of Parameters            19
  Mean                                          79.957
  Standard Deviation                            92.654
  P-Value                                       0.0046

LO-MENDELL-RUBIN ADJUSTED LRT TEST
  Value                                        489.139
  P-Value                                       0.0048


It sounds like you go by the TECH11 Lo-Mendell-Rubin test to determine if "x classes is significant". That is not the only way to decide on the number of classes as the Nylund et al paper shows. See also Muthen (2004). Also, you should note the important warning: WARNING: THE BEST LOGLIKELIHOOD VALUE WAS NOT REPLICATED. THE SOLUTION MAY NOT BE TRUSTWORTHY DUE TO LOCAL MAXIMA. INCREASE THE NUMBER OF RANDOM STARTS. This means that your solution is not trustworthy; follow the suggestion.
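
The check described earlier in this thread, that the H0 loglikelihood printed under TECH11 in the k-class run should agree with the loglikelihood of the separate k-1 class run, can be applied to the output posted above. The sketch below is a hypothetical helper (`check_h0` is not an Mplus function); the loglikelihoods are copied from the posted output with their minus signs, which are implied by the reported AIC values.

```python
def check_h0(run_loglik, tech11_h0, tol=0.001):
    """Compare each k-class run's TECH11 H0 loglikelihood with the k-1 class run.

    Returns {k: True/False} for every k whose (k-1)-class run is available.
    """
    report = {}
    for k, h0 in sorted(tech11_h0.items()):
        if k - 1 in run_loglik:
            report[k] = abs(h0 - run_loglik[k - 1]) <= tol
    return report

# Loglikelihood of each k-class run (H0 Value under TESTS OF MODEL FIT)
run_loglik = {2: -42803.940, 3: -40908.249, 4: -40238.840,
              5: -39924.057, 6: -39672.371}
# H0 loglikelihood printed in the TECH11 section of each k-class run
tech11_h0 = {2: -44231.624, 3: -42803.940, 4: -40908.249,
             5: -40302.825, 6: -39918.607}

for k, ok in check_h0(run_loglik, tech11_h0).items():
    print(f"{k}-class run: H0 matches the {k - 1}-class run: {ok}")
```

For the 3- and 4-class runs the values agree. For the 5-class run the H0 value (-40302.825) is lower than the 4-class loglikelihood (-40238.840), and for the 6-class run the H0 value (-39918.607) is higher than the posted 5-class loglikelihood (-39924.057), so those two tests are not comparing against the posted solutions. That is consistent with the warning about local maxima.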


I am examining an LCA with categorical and continuous outcomes. The 4-class solution has a minimum BIC. Differences in LL values appear to stabilize at 4 classes. The LMR-LRT suggests retaining 5 classes, of which one class has a size of 3% of the sample. The BLRT suggests retaining up to 10 classes. I expect that consistency is important. Given these conflicting results, do any specific tests have greater weight? Thanks, Tom


In a situation like this, I would probably look at the 4, 5, and 6 class solutions and see what they look like from a substantive point of view. Are the classes really different or just variations on a theme? What does theory suggest? 


Hi all, I have read over this thread intensely and would like to ask a few questions about a latent class model I am running. In short I have 7 binary (0,1) indicators and am looking at a 2- or 3-group class solution. The BIC differences between the two are minimal. Prior theory and a previous EFA on the binary data suggest the possibility of a 3-group solution but I am a little confused by my output. No matter the number of random starts I choose, one group always ends up with 1 (in either solution). I am wondering what might be the reason for this result. Thanks, JD


It sounds like you are getting one empty class when you ask for three classes. If this is the case, this could point to a two-class solution.


Thanks. I noticed that when I tweaked my input instructions I got a seemingly 'expectable' three-group solution. The tweaks were non-technical: I did not specify the type of data or the number of observations. I also resaved my .dat file without variable names at the top and specified free format. Would this indeed make a difference?


It would be impossible to say without seeing exactly what you did. I would be concerned if I could not understand how the changes made a difference.


I think possibly by specifying the number of observations I inadvertently limited the observations used in the model run. I made the mistake of using this command and specified the number of cases as the number of observations. 


Hello, I'm running an LCA with 7 Likert-type (1-5) indicators (N=9.600) with different class solutions (2-4). Is it necessary to use user-specified starting values, and if so, when and how do I have to specify them? The User's Guide did not provide sufficient information for selecting them. Thanks, Mario


It is not necessary to provide starting values. If you don't have any idea of what they would be, it is best to use the default starting values. 


Hello Linda, Thank you! That's what I did, but I only got non-significant chi-square tests (p=1.000) for all class solutions. Does it mean that I only have one class? Furthermore, when I plotted the latent profiles they had almost the same shapes. Thanks, Mario


See the following paper, which is available on the website, and the Topic 5 course handout on the website to understand how to determine the number of classes: Nylund, K.L., Asparouhov, T., & Muthén, B. (2007). Deciding on the number of classes in latent class analysis and growth mixture modeling: A Monte Carlo simulation study. Structural Equation Modeling, 14, 535-569.


Hello, I am looking for resources on how to run an exploratory LCA in Mplus: evidence on how to do it and what number of classes to start with, but also the code to run the program in Mplus. I have three continuous variables that I would like to perform the exploratory LCA on. Do any of your handouts actually have the information on how to implement it in Mplus? Thanks very much for your help. ***Melissa


The Topic 5 course handout shows how to carry out an LCA. 


Dear Dr. Muthen, I am running LCA models based on four categorical variables with four levels each. When requesting TECH11, I wanted to use user-specified starting values from my output without TECH11 to order the latent classes in the way you recommended in the Mplus manual (the first class with the fewest participants and the last with the most). Unfortunately, some of my logit thresholds were set at extreme values in the optimization (-15.000 and 15.000). Therefore, my starting values based on the threshold estimates are not ordered increasingly and cause error messages. Is there another way to order the latent classes according to class counts when requesting TECH11? If not, can I still rely on the results of the likelihood ratio tests or should I use another approach? Any suggestions, recommendations, or references are appreciated! Thanks and kind regards, Claudia


If, for example, the second threshold is fixed at 15, try using 14 for the first and 16 for the third. 


Hi, I am having an issue choosing the number of classes to go with in a latent profile analysis. I am using 5 imputed datasets to run my LPA and am comparing the fit based on BIC and entropy, following the recommendations in Nylund et al. (2007). My issue is that when I run the analyses with one imputed dataset for the plot and to save the class probabilities, the class membership comes out very differently. Basically, I am running a four-class analysis. In the analysis with the five datasets there are two larger classes with 44% and 32% of the sample in them and then two smaller classes with 12% and 10% of the sample in them (sample size of 150). When I run the same analysis in a single dataset, the class makeup is 44% in two classes and 5.9% in two classes. I do not have the same issues when I run the analysis with three classes. The class structure and percentages are basically the same whether I run with imputed data or a single dataset. Given these issues, does it make sense to go with the three-class structure instead of the four-class one, even though the four-class structure has a better BIC and entropy when run with the five imputed datasets? Thanks, Laura


How did you impute the data? Did you use a mixture model? Why are you not using maximum likelihood instead of imputing? BIC and other fit statistics are given as averages in multiple imputation. They have not been developed for imputation. It is not clear if an average is meaningful.


Linda, I used Amelia in R to create five datasets. I do not quite understand what it would mean to use ML instead of imputing. Laura 


You don't want to impute by a different model than you analyze. I suspect Amelia does not have mixture modeling as an option. Using ML is using all available information which is the default in Mplus. Some refer to this as FIML. 


Linda, That's what I thought you meant, but I wasn't completely sure. Thank you for the explanation re: Amelia. In using ML estimation, would I have to have Mplus exclude missing data listwise, or do you recommend a way to impute missing data values? I understand that I should not use data imputed using a model other than mixture modeling, but is there a way to impute data in Mplus using mixture modeling? Thanks, Laura


Linda, Never mind, I spent some time with the Mplus manual and realized that if I turn off LISTWISE=ON then it will estimate the model with missing data theory. Thanks for all of your help! Laura

Leslie Roos posted on Wednesday, August 15, 2012  11:35 am



Hello! I am a graduate student working on an LCA using NESARC data to examine whether 10 childhood adversity variables (all binary categorical) form a latent class pattern that predicts a binary adult outcome. I am struggling with the MODEL specifications. My understanding was that with all categorical predictors, MODEL specification is not necessary, but since this involves complex sampling with covariates (which can be categorical or continuous), I think it possibly is? Additionally, we do not have any a priori hypotheses about latent variable interactions or random slopes, but I am not sure if these need to be specified. I've included my syntax-in-progress below. Any help in designing the model or suggestions would be hugely appreciated. Thank you!!

TITLE: LCA Childhood Adversity
DATA: FILE = .dat’
VARIABLE: NAMES = Outcome ace1 ace2 ace3 ace4 income w2_eth w2mstat w2sex w2_urb age;
USEVARIABLES =
CATERGORICAL = ace1 ace2 ace3 ace4
CLASSES = c (2)
WITHIN = ace1 ace2 ace3 ace4
BETWEEN = income w2_eth w2mstat w2_urb age;
STRATIFICATION w2strat;
CLUSTER = w2psu;
WEIGHT = w2weight;
SUBPOPULATION = w2sex = 1.
KNOWNCLASS ; (?)
DEFINE
ANALYSIS Type= COMPLEX MIXTURE;
STARTS = 100 10;
STITERATIONS = 20;
MODEL ??


You should look at the Chapter 7 examples. Example 7.12 is an LCA with a covariate. You would use TYPE=COMPLEX MIXTURE; and the complex survey data options shown above. You should not use BETWEEN and WITHIN. They are for TYPE=TWOLEVEL.

Study Hard posted on Tuesday, February 03, 2015  6:19 pm



Hello Dr. Muthen, I am running a latent class analysis following the 2012 article "Using Mplus TECH11 and TECH14 to test the number of latent classes" by Asparouhov and B. Muthen. The article compared 4- versus 5-class models and found that the p-values were small in both step 2 and step 3. So, the conclusion was to reject the 4-class model. But how would you interpret the situation where the step 2 test (the Vuong test) resulted in a large p-value whereas step 3 resulted in a p-value of 0.000? Should the 4-class model still be rejected?


You stop testing once you have a first large p-value.
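
One way to read this rule is as a simple sequential procedure: go through the steps in order and stop at the first p-value that is not small. The Python sketch below illustrates that reading only. The 0.05 cutoff, the function names, and the example p-values are assumptions made for illustration and do not come from the article.

```python
def first_large_p(p_values, alpha=0.05):
    """Return the index of the first p-value above alpha, or None if all are small.

    p_values are listed in the order the steps are carried out.
    alpha is an assumed cutoff for what counts as a large p-value.
    """
    for index, p in enumerate(p_values):
        if p > alpha:
            return index
    return None


def smaller_model_rejected(p_values, alpha=0.05):
    """The smaller model is rejected only if no step gives a large p-value."""
    return first_large_p(p_values, alpha) is None


if __name__ == "__main__":
    # Hypothetical p-values: a large one at the first listed step, 0.000 after it.
    steps = [0.30, 0.000]
    print(first_large_p(steps))           # index of the step where testing stops
    print(smaller_model_rejected(steps))  # whether the smaller model is rejected
```

Under this reading, a large p-value at step 2 ends the testing, so the later p-value of 0.000 is never consulted and the 4-class model is not rejected.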

CB posted on Tuesday, March 24, 2015  1:07 pm



Thanks for all of the helpful references on deciding the number of latent classes! What references are available that discuss variable selection for LCA, especially for categorical indicators? I'm trying to find some type of semi-systematic modeling strategy, but have had difficulty finding one. Also, do the p-values for the Z-test (estimate divided by its standard error) help in deciding which variable is a "good" discriminator for the latent classes? Thanks!


Take a look at our technical appendix for version 7: Variable-specific entropy contribution.
