Mplus Discussion >> LCA: Choosing Initial Conditional Probabilities & Number of Classes


Juned Siddique posted on Tuesday, January 09, 2001 - 10:58 am

When performing an exploratory latent class analysis, how do I decide on the number of classes? Should I start with a low number of classes and keep increasing until I get a good fit?

Also, how do I come up with initial conditional probabilities? I have no prior beliefs about the data. Thank you.

bmuthen posted on Tuesday, January 09, 2001 - 2:13 pm

Start with 2 classes and keep increasing to see if the set of available model indices points to a certain number of classes. You can use BIC and you can check the classification quality - and in Mplus version 2, to be released in February, you also have an entropy measure and chi-square tests of fit. For a discussion, see Muthen & Muthen (2000) as listed in the Reference section on this web site. Starting values can be low probabilities for all items of one class and high probabilities for the other class.
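Starting values for binary items are entered as logit thresholds rather than probabilities, so the "low probability / high probability" advice above has to be converted. A minimal Python sketch of that conversion, assuming the logit parameterization P(u = 1 | class) = 1 / (1 + exp(threshold)); the function names and the 0.2 / 0.8 probabilities are illustrative choices, not values from the post:

```python
import math

def prob_to_threshold(p):
    """Logit threshold for an item-endorsement probability p, 0 < p < 1.
    Assumes P(u = 1 | class) = 1 / (1 + exp(threshold))."""
    return math.log((1.0 - p) / p)

def threshold_to_prob(tau):
    """Inverse of prob_to_threshold."""
    return 1.0 / (1.0 + math.exp(tau))

# Hypothetical two-class start: low endorsement in one class, high in the other.
low_class = prob_to_threshold(0.2)    # positive threshold
high_class = prob_to_threshold(0.8)   # negative threshold of the same size
print(round(low_class, 3), round(high_class, 3))

# Thresholds far from zero correspond to probabilities very close to 0 or 1.
print(threshold_to_prob(10.0), threshold_to_prob(-10.0))
```

Under this assumption a positive threshold means a low endorsement probability, which is easy to get backwards when typing starting values by hand.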

Anonymous posted on Tuesday, October 30, 2001 - 2:34 pm

I started out with a 2 class LC model and then increased to a 3 class model. When I do this (a) I get a message saying that a "saddle point" has been reached and that I should try new starting values or a new model. I have tried various starting values but keep getting this message, which suggests that changing the starting values did not change anything. Is there any other way of going about this? (b) I get a message saying that one of the logit thresholds has approached an extreme value - can this be overcome by setting the starting values at the extreme threshold value shown in the output? (c) I get a message saying that the chi-sq test cannot be computed because the freq table is too large - is there a way to overcome this problem? Thanks in advance.

Best
Krishna

bmuthen posted on Wednesday, October 31, 2001 - 10:11 am

A first approach to try out is to fix the thresholds labeled as extreme (e.g. at +-10) and re-estimate the model - this may solve the problem. Extreme thresholds are useful in the sense that they make the class interpretation easier (perfect association between the class and item). Regarding freq tables that are too large - when the table is very large the chi-square is probably not of interest anyway due to many cells having very small expected frequencies, resulting in a very poor chi-square approximation.

Patrick Malone posted on Thursday, November 01, 2001 - 1:15 pm

Howdy. I'm still working on the LCA problem y'all have been helping me with on different threads. Now I'm running into problems with starting values and different solutions. The original problem was an 8-class model with 21 dichotomous indicators (N=445). No constraints on the model. I've run it with three different sets of starting values and got three different solutions. The log-likelihoods are within a few points of each other, but the solutions can be substantially different. I'm not getting any kind of warnings (except to ignore the standard errors where the logits are at the extremes).

I thought the model complexity might be contributing to the instability, so I went to a simpler, 4-class model and got a stable solution that holds across several sets of starting values. However, when going to a second sample, I'm getting the same variance in outcomes even with the simple model.

Any insights or suggestions?

Thanks,
Pat Malone

bmuthen posted on Thursday, November 01, 2001 - 3:11 pm

LCA is an exploratory model with relatively little structure imposed and it is known to be able to produce different solutions. Your example seems like a more unusual version, however, where solutions with very different parameter estimates are obtained with rather little loglikelihood difference. This indicates that these particular data provide little information on this particular model. Perhaps this is also seen in the classification table reflecting uncertainty? The different solutions may all have proper maxima so that the information matrix is not singular. The only remedy here is to do what you did, use several sets of starting values. Or, formulate more restrictive models.

Patrick Malone posted on Friday, November 02, 2001 - 5:44 am

Thanks. In the absence of strong theory guiding the starting values, I was thinking of trying an automated process where I generate random starting values for some reasonably large number (maybe 50) of runs. I could then look for modal solutions or optimal solutions (by loglikelihood). Do you think there's merit to such an approach? Very computationally intensive, I know.

Thanks

bmuthen posted on Friday, November 02, 2001 - 7:51 am

Yes, that is a reasonable approach for this situation.
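The random-starts idea discussed above can be sketched in a few lines of Python. This is a toy EM routine for binary items written for illustration only; it is not the Mplus estimator, and the simulated data, seed, function names, and the 0.01 replication tolerance are all assumptions:

```python
import math
import random

def pattern_likelihoods(row, weights, probs):
    """Joint probability of one response pattern with each class."""
    out = []
    for w, class_probs in zip(weights, probs):
        like = w
        for u, p in zip(row, class_probs):
            like *= p if u == 1 else 1.0 - p
        out.append(like)
    return out

def log_likelihood(data, weights, probs):
    return sum(math.log(sum(pattern_likelihoods(r, weights, probs))) for r in data)

def fit_lca(data, n_classes, rng, n_iter=100):
    """EM for a latent class model with binary items, from one random start."""
    n_items = len(data[0])
    weights = [1.0 / n_classes] * n_classes
    probs = [[rng.uniform(0.1, 0.9) for _ in range(n_items)]
             for _ in range(n_classes)]
    for _ in range(n_iter):
        # E-step: posterior class probabilities for every observation.
        post = []
        for row in data:
            joint = pattern_likelihoods(row, weights, probs)
            total = sum(joint)
            post.append([x / total for x in joint])
        # M-step: update class sizes and item probabilities.
        for c in range(n_classes):
            size = max(sum(r[c] for r in post), 1e-12)
            weights[c] = size / len(data)
            for j in range(n_items):
                endorsed = sum(r[c] for r, row in zip(post, data) if row[j] == 1)
                # Keep probabilities off the boundary so logs stay finite.
                probs[c][j] = min(max(endorsed / size, 1e-6), 1.0 - 1e-6)
    return log_likelihood(data, weights, probs), weights, probs

# Simulated two-class data (hypothetical), then a 3-class model from 10 starts.
rng = random.Random(2001)
true_probs = [[0.9, 0.8, 0.9, 0.8, 0.9], [0.1, 0.2, 0.1, 0.2, 0.1]]
data = []
for _ in range(200):
    cls = 0 if rng.random() < 0.4 else 1
    data.append([1 if rng.random() < p else 0 for p in true_probs[cls]])

runs = [fit_lca(data, 3, rng) for _ in range(10)]
logliks = sorted((ll for ll, _, _ in runs), reverse=True)
best = logliks[0]
replicated = sum(1 for ll in logliks if best - ll < 0.01)
print("best loglikelihood:", round(best, 3))
print("starts reaching it:", replicated, "of", len(logliks))
```

Counting how many starts reach the best loglikelihood is the useful part: a best value reached by only one start is a sign that more starts are needed before trusting the solution.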

Blair Beadnell posted on Wednesday, November 14, 2001 - 4:49 pm

I have two general questions about interpreting the output for an LCA model. First, do you have any suggested guidelines for interpreting entropy? Second, under what circumstances should one use the sample size adjusted BIC as opposed to the BIC? I notice that the Mplus 2.0 manual cites a study on p. 372 (Yang, 1998) that found superior performance for the sample size adjusted BIC for LCA models. Does this argue for the sample size adjusted BIC anytime one is doing an LCA? Thanks.

bmuthen posted on Thursday, November 15, 2001 - 10:19 am

Entropy is described in the User's Guide. I have not found specific guidelines - you may be able to find guidelines in the marketing literature (see ref. in User's Guide). Entropy does not seem to be correlated with goodness of fit of the model (much like R-square is not correlated with good model fit in cov structure models). The classification table gives further information.

Yang suggested using the sample-size adjusted BIC for LCA. But this is just one study and for certain LCA models. I would continue to use BIC as well until more studies accumulate. This is not to say that I fully trust BIC - this measure is currently being studied by several research groups. You can also use the chi-square test against an unrestricted multinomial distribution. Interpretability and usefulness remain key considerations.
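For readers who want to see what the entropy value summarizes, here is a minimal Python sketch. It assumes the normalized entropy commonly reported for mixture models, 1 - sum_i sum_k (-p_ik ln p_ik) / (n ln K), where p_ik is the posterior probability of person i belonging to class k; the function name and the two small posterior tables are made up for illustration:

```python
import math

def relative_entropy(posteriors):
    """Normalized entropy: 1 - sum_i sum_k (-p_ik ln p_ik) / (n ln K).
    Values near 1 mean sharp classification, values near 0 mean fuzzy."""
    n = len(posteriors)
    k = len(posteriors[0])
    total = 0.0
    for row in posteriors:
        for p in row:
            if p > 0.0:          # 0 * ln(0) is taken as 0
                total -= p * math.log(p)
    return 1.0 - total / (n * math.log(k))

# Hypothetical posterior class probabilities for three people, two classes.
sharp = [[0.99, 0.01], [0.02, 0.98], [0.97, 0.03]]
fuzzy = [[0.60, 0.40], [0.45, 0.55], [0.50, 0.50]]
print(round(relative_entropy(sharp), 3), round(relative_entropy(fuzzy), 3))
```

The measure depends only on how peaked the posterior probabilities are, which is consistent with the remark above that it need not track model fit.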

Anonymous posted on Wednesday, July 24, 2002 - 7:00 pm

Hi,

I am attempting to use mixture modeling to explore possible latent classes in a univariate distribution. In comparing the models in which I've specified a different number of classes, I'm not sure what criteria to use to determine which model best describes the data. Specifically, should I select the model that yields the lowest (unadjusted for sample size) BIC, the lowest AIC, or use a chi square test to compare the models - or use some other criteria?

Thank you very much for your help.

Linda K. Muthen posted on Thursday, July 25, 2002 - 9:56 am

How to decide on the number of classes is a topic that continues to be studied. These issues are discussed in the following references:

McLachlan, G.J. & Peel, D. (2000). Finite Mixture Models. New York: Wiley & Sons.

Muthén, B. & Muthén, L. (2000). Integrating person-centered and variable-centered analyses: Growth mixture modeling with latent trajectory classes. Alcoholism: Clinical and Experimental Research, 24, 882-891. (#85)

Dr Stephen K Tagg posted on Tuesday, August 20, 2002 - 7:51 am

Version 2.12 update now installed. I'll go and read the Biometrika article, but the Lo-Mendell-Rubin looks like it might help make dimensionality decisions. How much does this depend on your advice to put classes in ascending order of magnitude? If the first class is the smallest would this make less difference than if the first class was the largest?

bmuthen posted on Tuesday, August 20, 2002 - 12:50 pm

Which class to put first could affect the starting values, which may have influence on the LMR LRT if there are local optima, otherwise not (you can check the k-1 solution's log likelihood value printed in TECH11). So, as a matter of habit, in the k-class run I would have the smallest class first since the program would then use the starting values from the remaining k-1 classes, which are probably better starting values for a k-1 class analysis.

In some models, the parameterization of the first class achieves the identification of the model, in which case this class would have to be moved to a higher class location.

Jennie Jester posted on Thursday, September 12, 2002 - 12:17 pm

I am doing a latent variable mixture model with a dependent variable and a time-varying covariate of that variable. I was excited to see the Lo-Mendell-Rubin LRT test to help decide on the number of classes. However, I am finding that the statistic varies widely between different models I have tested with the same number of classes. For instance, a 3-class model with no variance in growth parameters gives a LRT of 61 with a p value of 0.22. When I allow residual variances that are different between classes, the model yields an LRT of 867, with a p value for the LRT test of .0001. I then allow regressions to be different between classes. Now, the model fits better according to the chi-square difference test, but the LRT is 343 with a p value of 0.59. I'm not sure what to make of this.

I also have a question about regression coefficients. In some models I allowed different regression coefficients for different classes. In some of these cases, I have gotten solutions that have "y ON x" values greater than 1.0. I believed these were standardized regression coefficients, so that would not be an admissible solution. Is this correct? And do you have any ideas for how to get around it?

Thanks,

Jennie

bmuthen posted on Thursday, September 12, 2002 - 3:45 pm

First, make sure that in your k-class run, the log likelihood value that is printed under TECH11 for your H0 model (the k-1 model, i.e. the model with one less class) agrees with the log likelihood value that you got in the regular output for your run with k-1 classes. If they disagree, you are not testing against the right k -1 model and will have to modify the starting values of your k-class run. Note that Mplus drops class 1 when doing the k-1 class H0 run.

Second, the LMR LRT p values can vary if you allow more or less flexibility in your mixture model; this is as it should be.

Regarding your standardized solutions, are these cases where you have negative residual variances for y? If you like, send support@statmodel.com an output excerpt that shows this.

Dr Stephen K Tagg posted on Tuesday, October 22, 2002 - 10:42 am

I'm still trying to do the right thing in making # of class decisions. My reading of the Lo-Mendell-Rubin article suggests that it's really only appropriate for LCA of continuous variables. As I'm doing LCA of the patterns in 7 dichotomous survey responses I'm concluding it's not quite right. So I've read through all I can find and have taken chapter 1 from Marcoulides & Schumacker to imply that it's best to find a minimum BIC for deciding how many classes. This has come down to 4. However I guess I should now be optimising among the possible starting configurations. My dimension reduction used the 4 most popular response patterns so I'm now sampling 4 from 66. I've got the time to construct a new input file down to 5 minutes and have managed 10 random starts. One of these is slightly lower than the original on BIC. It's got less thresholds set (-3/+3 gone to -15/+15) during optimisation but has deleted one of the 66 cells.
I'm hesitant choosing an option which doesn't represent all 66 cells - but should I go strictly with BIC? And am I right not to worry too much about a few +/- 15s? And how many random starts should I do? Should I choose some to represent profiles that don't exist?
I could also agree with Jennie Jester that the LMR LRT is all over the place with the differing starting values (from 920.905 to -842.586!) but I'm not confident in using it. Obviously this could be a starting value problem.
Finally if I could persuade my fellow worker to tell me a theoretically grounded set of starting positions can I fit some in the starting position by using @ rather than *? And if I get a threshold should I run again with those values fixed to -15/+15 or another value?

bmuthen posted on Tuesday, October 22, 2002 - 3:04 pm

The Lo-Mendell-Rubin article refers to the original Vuong article which we feel also covers the categorical outcomes case. We have limited simulation studies indicating that it works well here too. But I agree that multiple solutions can make it awkward to work with in practice. You should only consider the TECH11 test results for the solution with the highest log likelihood value.

The deletion of cells is only done in the chi-square computations and not in the model estimation.

You don't have to worry about +/-15 at all - in fact, they make the interpretation clearer. A +/- 15 threshold in one solution (with one set of starting values) may not be at this extreme in another solution, so don't fix them at these values when changing starting values.

Theoretically grounded values using @ instead of * are valuable just like in CFA.

Dr Stephen K Tagg posted on Friday, January 10, 2003 - 5:18 am

I'm finalising a Winter AMA special session presentation on choosing # classes using the adjusted LMR in latent class analyses. I frequently get the situation where an adjusted LMR p>.05 suggests that #classes-1, which has LMR p<.001, would be the right decision. However, #classes+1 also has LMR p<.001, as does #classes+2. I've checked and reordered my starting values to get the outcome class sizes ascending - it makes little or no difference. Any suggestions as to how to resolve this? Should I take 1) the lowest no. of classes with a significant LMR, 2) the highest significant LMR dimension with a NS LMR above, or 3) the highest significant LMR dimension with two NS LMRs for the dimensions above? I certainly wouldn't recommend anyone to decide the #classes is right if the LMR p<.001 on the dimensionality they've chosen.

bmuthen posted on Monday, January 13, 2003 - 10:07 am

The LMR test gives a p value for the (k-1)-class versus the k-class model when running the k-class model. So to get support for k classes (or more) you want a low p value for the k-1 model in the k-class run and a high p value for the k model in the k+1 run. You should keep adding classes until this happens. Note that the LMR log likelihood for the k-1 class model in the k-class run should be the same as the log likelihood for the k-1 run. Hope this answers your question and if not let us know.

Dr Stephen K Tagg posted on Tuesday, January 14, 2003 - 3:08 am

Yes, I'm almost there. So the suggested rule is: take the number of classes (k) whose test has a low p value and for which the k+1 test has a high p value. This works fine with the following series:
# classes    LMR adjusted    prob
2                1351.173    0.0000
3                 487.692    0.0000
4                  56.022    0.0000
5                 520.310    0.0000
6                   2.205    0.6240
7                  14.810    0.1775

Unfortunately I've got series of analyses with p's behaving like this
2                5779.2      low
3                  67.53     low
4                   0.83     high
5                1108.1      low
6                   2.2      high

I'm taking this to suggest either 3 or 5!
And if I'm looking for a low class number I'd take 3 and looking for a high # of classes I take the 5. And I've carefully checked the outcome class sizes are in ascending order. Of course minimum BIC gives something unequivocal!

bmuthen posted on Tuesday, January 14, 2003 - 9:39 am

Just to be clear, let me ask you about the first column that you label # classes. By this do you mean: (1) the number of classes in your input (what I call the "k-class run"), or (2) the number of classes that is used for the H0 in the TECH11 LMR test (which is "the (k-1)-class model")? Assuming it is (1), I would choose 5 classes in your first example and 3 classes in your second example. Assuming it is (2), I would choose 6 and 4, respectively. For the case when you get low, low, high, low, high, I would go with the first instance that it switches from low to high - the higher class numbers where such a switch occurs is then not of interest since we already couldn't reject that a lower number of classes fit.
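The "first switch from low to high" rule can be written down as a short Python sketch. It assumes interpretation (1), where the first column is the number of classes in the run, and a 0.05 cutoff for "low", which is a convention and not something stated in this thread. The function name is hypothetical, and the 0.001 and 0.5 entries in the second series are stand-ins for the "low" and "high" labels in the post:

```python
def classes_supported(p_by_run, alpha=0.05):
    """p_by_run maps k (classes in the run) to the LMR p value for k-1 vs k.
    Returns the class count at the first switch from low to high p values,
    or None if every run has a low p value (keep adding classes)."""
    for k in sorted(p_by_run):
        if p_by_run[k] > alpha:
            # k-1 classes could not be rejected in favour of k classes.
            return k - 1
    return None

series_1 = {2: 0.0000, 3: 0.0000, 4: 0.0000, 5: 0.0000, 6: 0.6240, 7: 0.1775}
series_2 = {2: 0.001, 3: 0.001, 4: 0.5, 5: 0.001, 6: 0.5}  # low, low, high, low, high

print(classes_supported(series_1))
print(classes_supported(series_2))
```

Later low p values are ignored on purpose: once a smaller model could not be rejected, switches at higher class numbers are no longer of interest.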

Linda K. Muthen posted on Tuesday, January 14, 2003 - 10:02 am

In addition to looking at various fit statistics, it is also important to look at classification quality and interpretability when deciding on the number of classes. Classification quality can be determined by looking at the entropy value and the classification table that is part of the Mplus output. Interpretability should be guided by whether the classes make sense theoretically, whether they have predictive validity, etc. This is discussed in the following paper:

Muthén, B. & Muthén, L. (2000). Integrating person-centered and variable-centered analyses: Growth mixture modeling with latent trajectory classes. Alcoholism: Clinical and Experimental Research, 24, 882-891.

Anonymous posted on Tuesday, April 08, 2003 - 9:51 am

I am launching into a latent class analysis on some symptom data from the SCID. I first ran a PCA in SPSS, and found that three eigenvalues are >1. A colleague of mine said that the "rule of thumb" is that a class solution with one class greater than number of eigenvalues >1 (in this case, it would be a 4-class solution) is probably the best. Is this documented anywhere?

Thank you,
Dawn

bmuthen posted on Tuesday, April 08, 2003 - 9:59 am

Yes, see references in section 2 of Muthen (2002), Statistical and Substantive Checking in Growth Mixture Modeling, which is available in pdf on the Mplus home page, www.statmodel.com.

Anonymous posted on Friday, December 12, 2003 - 6:00 am

I am trying to fit an LCA model to three variables with ordinal levels 0 1 2. I am having trouble getting a model to fit. The message I get looks something like this:

THE MODEL ESTIMATION DID NOT TERMINATE NORMALLY DUE TO AN ILL-CONDITIONED
FISHER INFORMATION MATRIX. CHANGE YOUR MODEL AND/OR STARTING VALUES.

THE MODEL ESTIMATION DID NOT TERMINATE NORMALLY DUE TO A NON-POSITIVE
DEFINITE FISHER INFORMATION MATRIX. THIS MAY BE DUE TO THE STARTING VALUES
BUT MAY ALSO BE AN INDICATION OF MODEL NONIDENTIFICATION. THE CONDITION
NUMBER IS -0.549D-12.

THE STANDARD ERRORS OF THE MODEL PARAMETER ESTIMATES COULD NOT BE
COMPUTED. THIS IS OFTEN DUE TO THE STARTING VALUES BUT MAY ALSO BE
AN INDICATION OF MODEL NONIDENTIFICATION. CHANGE YOUR MODEL AND/OR
STARTING VALUES. PROBLEM INVOLVING PARAMETER 30.

I followed the advice I saw somewhere, which was to use the ending values from one failed run as starting values for the next run. I did approximately 10 iterations of this, and while I keep getting the same error message, the class variable remains unchanged; that is, the subjects keep ending up in the same class structure. Do you have any advice on how I might proceed from here? By the way, I have tried 3, 4, 5, and 6 class solutions with similar problems. Thank you very much in advance for your advice.

Linda K. Muthen posted on Friday, December 12, 2003 - 9:12 am

It sounds like an identification problem. Given that you have three variables with three categories, you would have 26 independent pieces of information. The message is complaining about parameter 30. If you send the output to support@statmodel.com, I can take a look at it.
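The counting argument behind the "26 independent pieces of information" can be sketched in Python. The parameter count assumes an unrestricted LCA with class-specific thresholds for every item and no equality constraints, which may not match the poster's actual model; the function names are hypothetical:

```python
def independent_cells(categories_per_item):
    """Cells in the full cross-table minus one (proportions sum to 1)."""
    cells = 1
    for c in categories_per_item:
        cells *= c
    return cells - 1

def lca_free_parameters(n_classes, categories_per_item):
    """Free parameters in an unrestricted LCA: class-specific thresholds
    plus n_classes - 1 class-membership parameters."""
    thresholds = sum(c - 1 for c in categories_per_item)
    return n_classes * thresholds + (n_classes - 1)

items = [3, 3, 3]   # three ordinal items with categories 0, 1, 2
info = independent_cells(items)
print("independent pieces of information:", info)
for k in range(2, 7):
    n_par = lca_free_parameters(k, items)
    status = "exceeds" if n_par > info else "within"
    print(k, "classes:", n_par, "parameters,", status, "the available information")
```

Under these assumptions, models with 4 or more classes ask for more parameters than the table can supply. Staying within the count is only a necessary condition, so a model with fewer parameters can still fail to be identified.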

Anonymous posted on Monday, October 11, 2004 - 8:09 am

I am working on a latent class analysis and have a question about determining which solution has the best fit. When comparing a 4 class to a 3 class, my BIC increases slightly (6653.22 to 6689.09), but the entropy increases substantially (.61 to .84). The LMR is also non-significant. The interpretability of the 4 class solution makes sense substantively. By the way, the LMR is also non-significant between the 2 class and 3 class solutions.

I then tried to see if adding covariates would give me better goodness of fit statistics. However, I ran into the same problem - improved entropy and interpretability of the 4 class compared to the 3 class, but with slight increases in the BIC and a non-significant LMR.

Is it appropriate to use the 4 class solution given the non-significant LMR?

thanks for your help.

Jeffry Thigpen posted on Monday, October 11, 2004 - 10:04 am

Hello,

This is my first attempt at latent class analysis, and I'm having difficulty determining the best solution from the various fit statistics. Should "fit" be based solely on the lowest BIC, the lowest Likelihood ratio chi-square statistic, and the extent to which the solution makes sense to the analyst? Also, what is the difference between the BIC and adjusted BIC, particularly towards assessing "fit"?

Thanks, Jeffry

Linda K. Muthen posted on Thursday, October 14, 2004 - 10:37 am

See the following paper for suggestions:

Muthén, B. (2004). Latent variable analysis: Growth mixture modeling and related techniques for longitudinal data. In D. Kaplan (ed.), Handbook of quantitative methodology for the social sciences (pp. 345-368). Newbury Park, CA: Sage Publications.

It can be downloaded from the Mplus homepage.

Regarding the two BIC's, a new paper by Nylund et al. indicates that the adjusted BIC may work better than the conventional BIC. The adjusted BIC uses a modified sample size. See Appendix 8 of the technical appendices which are available on the website.
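The difference between the two BICs is only the sample size that enters the penalty term. A minimal Python sketch, using the adjustment n* = (n + 2) / 24 that Mplus prints next to the adjusted BIC; the function name is hypothetical, and the loglikelihood, parameter count, and class counts are taken from a 2-class output posted further down in this thread:

```python
import math

def information_criteria(loglik, n_params, n):
    """AIC, BIC and sample-size adjusted BIC (n* = (n + 2) / 24)."""
    deviance = -2.0 * loglik
    aic = deviance + 2.0 * n_params
    bic = deviance + n_params * math.log(n)
    abic = deviance + n_params * math.log((n + 2.0) / 24.0)
    return aic, bic, abic

# 2-class run: H0 loglikelihood -42803.940, 43 free parameters,
# n = 2190 + 71 = 2261 from the class counts in that output.
aic, bic, abic = information_criteria(-42803.940, 43, 2261)
print(round(aic, 2), round(bic, 2), round(abic, 2))
```

The three values agree with the printed AIC (85693.881), BIC (85939.994), and adjusted BIC (85803.375) to within rounding of the loglikelihood. Because n* is much smaller than n, the adjusted BIC penalizes extra parameters less and will tend to favor more classes than the conventional BIC.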

Anonymous posted on Friday, October 22, 2004 - 10:22 am

Hi,

I am wondering if anyone can help me build a conditional statement into a latent class analysis I am doing. I am interested in assigning everyone who answered 'yes' to a certain question to a certain class. How would I write that in the input?

Thanks for the help.

Linda K. Muthen posted on Friday, October 22, 2004 - 10:57 am

You can do this using the KNOWNCLASS option of the VARIABLE command or by fixing the value of the threshold of the item as shown in Example 7.24.

Anonymous posted on Wednesday, December 01, 2004 - 2:17 pm

Hi, I'm having difficulty interpreting the Lo-Mendell-Rubin likelihood ratio test (Tech 11), particularly your indication that "a low p-value indicates that the estimated model is preferable." Do you mean this in accordance with conventional practice (i.e., p= .05 or less)? My output is as follows:

K LRT P-Value
1 NA
2 0.0033
3 0.0777
4 0.1681
5 0.3894

I suspect that the two class model is preferred with respect to the LRT p-value. However, the BIC and Entropy indices are better for the three class model. Thanks for your help.

Linda K. Muthen posted on Wednesday, December 01, 2004 - 2:41 pm

We always recommend looking at the solutions for more than one number of classes. For example, here you might look at the 2, 3, and 4 class solutions to see how they differ and whether they are substantively meaningful.

Anonymous posted on Friday, December 31, 2004 - 4:22 pm

Hello,
I'm running an LCA in which my LRT test for H0 (2 latent classes) versus three latent classes yields the error message:

TECHNICAL 11 OUTPUT

THE LIKELIHOOD RATIO TEST COULD NOT BE COMPUTED. THE INFORMATION MATRIX
OF THE H0 MODEL WITH ONE LESS CLASS IS SINGULAR. REORDER THE CLASSES.
THE LOGLIKELIHOOD VALUE FOR THE ESTIMATED H0 MODEL IS -2341.782.

How do I specify reordered categories?

Thanks.

Linda K. Muthen posted on Friday, December 31, 2004 - 4:50 pm

You can reorder classes by using the ending values as starting values. So if you want Class 2 to be Class 1, use the ending values of Class 2 as starting values for Class 1.

Christian Geiser posted on Thursday, March 24, 2005 - 1:24 am

Dear Linda,

on October 14 of 2004 you mentioned a paper by Nylund et al. concerning sample size adjusted BIC. Could you give me a hint where I can find this article? Thank you!

bmuthen posted on Thursday, March 24, 2005 - 7:49 am

We decided to extend the investigation in this paper and it is therefore not finished at this point. Probably will be available this summer.

Salma posted on Monday, January 16, 2006 - 3:36 am

Please advise on an example that I can refer to in using LCA with ordinal manifest variables and a categorical latent variable. Thanks.

Linda K. Muthen posted on Monday, January 16, 2006 - 7:22 am

See Examples 7.3, 7.4, and 7.5.

Tonya Jones posted on Thursday, July 13, 2006 - 1:59 pm

Hi,

I'm a research assistant using Mplus to conduct latent class analysis with continuous variables for the first time. I am trying to identify the number of classes present in a sample of 2,265 survey respondents. My 14 variables are 14 health attitudes (Likert scales, strongly disagree to strongly agree). Theoretically, the primary investigator and I believe that there are several classes within the sample (at least 4 or 6). However, the statistics best support the solution of 2 classes. For instance, the 2 class solution was significant, the 3 class solution was not significant, and the 4, 5 and 6 class solutions were significant. The 4 and 6 class solutions make sense theoretically, but the 2 class and 5 class solutions do not. Below is the output for the different solutions using 50,20 starting values.

Also, warning messages appear in the 3+ output. I've used 20,10; 50,10; 50,20; and 100,50 starting values, and my class sizes hold after changing the starting values, but I still encounter the same warning messages (see below). Should I be concerned?

I've read both papers discussing the issue of choosing the correct number of classes
(Muthén, B. & Muthén, L. (2000). Integrating person-centered and variable-centered analyses: Growth mixture modeling with latent trajectory classes. AND
Nylund, K.L., Asparouhov, T., & Muthen, B. (2006). Deciding on the number of classes in latent class analysis and growth mixture modeling. A Monte Carlo simulation study.), and I've learned a lot from them. I'm just interested in your thoughts regarding using the 4 class or 6 class results even though the 3 class result was not significant.

Thanks in advance and sorry for the long message.

2 CLASS SOLUTION

THE MODEL ESTIMATION TERMINATED NORMALLY

TESTS OF MODEL FIT
Loglikelihood

H0 Value -42803.940
H0 Scaling Correction Factor 1.125
for MLR

Information Criteria

Number of Free Parameters 43
Akaike (AIC) 85693.881
Bayesian (BIC) 85939.994
Sample-Size Adjusted BIC 85803.375
(n* = (n + 2) / 24)
Entropy 0.999

Class Counts and Proportions

Latent
Classes

1 2190 0.96860
2 71 0.03140

VUONG-LO-MENDELL-RUBIN LIKELIHOOD RATIO TEST FOR 1 (H0) VERSUS 2 CLASSES

H0 Loglikelihood Value -44231.624
2 Times the Loglikelihood Difference 2855.368
Difference in the Number of Parameters 15
Mean -37.662
Standard Deviation 201.245
P-Value 0.0000

LO-MENDELL-RUBIN ADJUSTED LRT TEST

Value 2830.933
P-Value 0.0000

3 CLASS SOLUTION

WARNING: WHEN ESTIMATING A MODEL WITH MORE THAN TWO CLASSES, IT MAY BE
NECESSARY TO INCREASE THE NUMBER OF RANDOM STARTS USING THE STARTS OPTION
TO AVOID LOCAL MAXIMA.

WARNING: THE BEST LOGLIKELIHOOD VALUE WAS NOT REPLICATED. THE
SOLUTION MAY NOT BE TRUSTWORTHY DUE TO LOCAL MAXIMA. INCREASE THE
NUMBER OF RANDOM STARTS.

THE STANDARD ERRORS OF THE MODEL PARAMETER ESTIMATES MAY NOT BE
TRUSTWORTHY FOR SOME PARAMETERS DUE TO A NON-POSITIVE DEFINITE
FIRST-ORDER DERIVATIVE PRODUCT MATRIX. THIS MAY BE DUE TO THE STARTING
VALUES BUT MAY ALSO BE AN INDICATION OF MODEL NONIDENTIFICATION. THE
CONDITION NUMBER IS 0.117D-17. PROBLEM INVOLVING PARAMETER 31.

THE MODEL ESTIMATION TERMINATED NORMALLY

TESTS OF MODEL FIT

Loglikelihood

H0 Value -40908.249
H0 Scaling Correction Factor 1.760
for MLR

Information Criteria

Number of Free Parameters 58
Akaike (AIC) 81932.498
Bayesian (BIC) 82264.464
Sample-Size Adjusted BIC 82080.188
(n* = (n + 2) / 24)
Entropy 0.999

Class Counts and Proportions

Latent
Classes

1 71 0.03140
2 2072 0.91641
3 118 0.05219

VUONG-LO-MENDELL-RUBIN LIKELIHOOD RATIO TEST FOR 2 (H0) VERSUS 3 CLASSES

H0 Loglikelihood Value -42803.940
2 Times the Loglikelihood Difference 3791.383
Difference in the Number of Parameters 15
Mean 1725.840
Standard Deviation 2263.722
P-Value 0.1304

LO-MENDELL-RUBIN ADJUSTED LRT TEST

Value 3758.937
P-Value 0.1321

4 CLASS SOLUTION

WARNING: WHEN ESTIMATING A MODEL WITH MORE THAN TWO CLASSES, IT MAY BE
NECESSARY TO INCREASE THE NUMBER OF RANDOM STARTS USING THE STARTS OPTION
TO AVOID LOCAL MAXIMA.

WARNING: THE BEST LOGLIKELIHOOD VALUE WAS NOT REPLICATED. THE
SOLUTION MAY NOT BE TRUSTWORTHY DUE TO LOCAL MAXIMA. INCREASE THE
NUMBER OF RANDOM STARTS.

THE STANDARD ERRORS OF THE MODEL PARAMETER ESTIMATES MAY NOT BE
TRUSTWORTHY FOR SOME PARAMETERS DUE TO A NON-POSITIVE DEFINITE
FIRST-ORDER DERIVATIVE PRODUCT MATRIX. THIS MAY BE DUE TO THE STARTING
VALUES BUT MAY ALSO BE AN INDICATION OF MODEL NONIDENTIFICATION. THE
CONDITION NUMBER IS -0.285D-15. PROBLEM INVOLVING PARAMETER 3.

THE MODEL ESTIMATION TERMINATED NORMALLY

TESTS OF MODEL FIT

Loglikelihood

H0 Value -40238.840
H0 Scaling Correction Factor 1.686
for MLR

Information Criteria

Number of Free Parameters 73
Akaike (AIC) 80623.680
Bayesian (BIC) 81041.500
Sample-Size Adjusted BIC 80809.566
(n* = (n + 2) / 24)
Entropy 0.959

Class Counts and Proportions

Latent
Classes

1 1144 0.50597
2 71 0.03140
3 118 0.05219
4 928 0.41044

VUONG-LO-MENDELL-RUBIN LIKELIHOOD RATIO TEST FOR 3 (H0) VERSUS 4 CLASSES

H0 Loglikelihood Value -40908.249
2 Times the Loglikelihood Difference 1338.818
Difference in the Number of Parameters 17
Mean 58.384
Standard Deviation 42.977
P-Value 0.0000

LO-MENDELL-RUBIN ADJUSTED LRT TEST

Value 1328.698
P-Value 0.0000

5 CLASS SOLUTION

WARNING: WHEN ESTIMATING A MODEL WITH MORE THAN TWO CLASSES, IT MAY BE
NECESSARY TO INCREASE THE NUMBER OF RANDOM STARTS USING THE STARTS OPTION
TO AVOID LOCAL MAXIMA.

WARNING: THE BEST LOGLIKELIHOOD VALUE WAS NOT REPLICATED. THE
SOLUTION MAY NOT BE TRUSTWORTHY DUE TO LOCAL MAXIMA. INCREASE THE
NUMBER OF RANDOM STARTS.

THE STANDARD ERRORS OF THE MODEL PARAMETER ESTIMATES MAY NOT BE
TRUSTWORTHY FOR SOME PARAMETERS DUE TO A NON-POSITIVE DEFINITE
FIRST-ORDER DERIVATIVE PRODUCT MATRIX. THIS MAY BE DUE TO THE STARTING
VALUES BUT MAY ALSO BE AN INDICATION OF MODEL NONIDENTIFICATION. THE
CONDITION NUMBER IS 0.183D-19. PROBLEM INVOLVING PARAMETER 31.

THE MODEL ESTIMATION TERMINATED NORMALLY

TESTS OF MODEL FIT

Loglikelihood

H0 Value -39924.057
H0 Scaling Correction Factor 1.675
for MLR

Information Criteria

Number of Free Parameters 88
Akaike (AIC) 80024.113
Bayesian (BIC) 80527.787
Sample-Size Adjusted BIC 80248.196
(n* = (n + 2) / 24)
Entropy 0.957

Class Counts and Proportions

Latent
Classes

1 71 0.03140
2 189 0.08359
3 1538 0.68023
4 345 0.15259
5 118 0.05219

VUONG-LO-MENDELL-RUBIN LIKELIHOOD RATIO TEST FOR 4 (H0) VERSUS 5 CLASSES

H0 Loglikelihood Value -40302.825
2 Times the Loglikelihood Difference 757.537
Difference in the Number of Parameters 17
Mean 111.882
Standard Deviation 137.085
P-Value 0.0048

LO-MENDELL-RUBIN ADJUSTED LRT TEST

Value 751.811
P-Value 0.0050

6 CLASS SOLUTION

WARNING: WHEN ESTIMATING A MODEL WITH MORE THAN TWO CLASSES, IT MAY BE
NECESSARY TO INCREASE THE NUMBER OF RANDOM STARTS USING THE STARTS OPTION
TO AVOID LOCAL MAXIMA.

THE STANDARD ERRORS OF THE MODEL PARAMETER ESTIMATES MAY NOT BE
TRUSTWORTHY FOR SOME PARAMETERS DUE TO A NON-POSITIVE DEFINITE
FIRST-ORDER DERIVATIVE PRODUCT MATRIX. THIS MAY BE DUE TO THE STARTING
VALUES BUT MAY ALSO BE AN INDICATION OF MODEL NONIDENTIFICATION. THE
CONDITION NUMBER IS -0.120D-18. PROBLEM INVOLVING PARAMETER 3.

THE MODEL ESTIMATION TERMINATED NORMALLY

TESTS OF MODEL FIT

Loglikelihood

H0 Value -39672.371
H0 Scaling Correction Factor 1.603
for MLR

Information Criteria

Number of Free Parameters 103
Akaike (AIC) 79550.742
Bayesian (BIC) 80140.269
Sample-Size Adjusted BIC 79813.021
(n* = (n + 2) / 24)
Entropy 0.905

Class Counts and Proportions

Latent
Classes

1 171 0.07563
2 324 0.14330
3 71 0.03140
4 764 0.33790
5 813 0.35958
6 118 0.05219

VUONG-LO-MENDELL-RUBIN LIKELIHOOD RATIO TEST FOR 5 (H0) VERSUS 6 CLASSES

H0 Loglikelihood Value -39918.607
2 Times the Loglikelihood Difference 492.472
Difference in the Number of Parameters 19
Mean 79.957
Standard Deviation 92.654
P-Value 0.0046

LO-MENDELL-RUBIN ADJUSTED LRT TEST

Value 489.139
P-Value 0.0048

Bengt O. Muthen posted on Friday, July 14, 2006 - 4:49 pm

It sounds like you go by the TECH11 Lo-Mendell-Rubin test to determine if "x classes is significant". That is not the only way to decide on the number of classes as the Nylund et al paper shows. See also Muthen (2004).

Also, you should note the important warning:

WARNING: THE BEST LOGLIKELIHOOD VALUE WAS NOT REPLICATED. THE
SOLUTION MAY NOT BE TRUSTWORTHY DUE TO LOCAL MAXIMA. INCREASE THE
NUMBER OF RANDOM STARTS.

This means that your solution is not trustworthy - follow the suggestion.
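
The numbers in the pasted output can be checked by hand, which also makes the warning concrete. The sketch below recomputes the information criteria from the reported loglikelihoods and compares the 5-class loglikelihood from the separate 5-class run with the H0 value that TECH11 reports inside the 6-class run. The sample size is an assumption (the sum of the reported class counts), and the function name is illustrative only.

```python
import math

def information_criteria(loglik, n_params, n):
    """Return (AIC, BIC, sample-size adjusted BIC) for one fitted model."""
    deviance = -2.0 * loglik
    aic = deviance + 2 * n_params
    bic = deviance + n_params * math.log(n)
    # The adjusted BIC swaps n for n* = (n + 2) / 24, as printed in the output.
    abic = deviance + n_params * math.log((n + 2) / 24)
    return aic, bic, abic

# Assumption: n is the sum of the class counts reported in the output.
n = 171 + 324 + 71 + 764 + 813 + 118

# Loglikelihoods and parameter counts copied from the 5- and 6-class runs.
ll5, ll6 = -39924.057, -39672.371
ic5 = information_criteria(ll5, 88, n)
ic6 = information_criteria(ll6, 103, n)

# TECH11 in the 6-class run reports its own H0 (5-class) loglikelihood.
h0_5class = -39918.607
lrt_stat = 2 * (ll6 - h0_5class)

# If the 5-class run had reached the best maximum, this gap would be near zero.
gap = h0_5class - ll5

print("5-class AIC/BIC/aBIC:", [round(v, 3) for v in ic5])
print("6-class AIC/BIC/aBIC:", [round(v, 3) for v in ic6])
print("2 x LL difference:", round(lrt_stat, 3))
print("H0 value minus 5-class run value:", round(gap, 3))
```

The recomputed criteria agree with the printed ones to rounding. The H0 loglikelihood in the 6-class TECH11 output is higher than the value from the separate 5-class run, which suggests that the 5-class run stopped at a local maximum, in line with the warning.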

Thomas Olino posted on Wednesday, April 25, 2007 - 12:12 pm

I am examining an LCA with categorical and continuous outcomes. The 4-class solution has the minimum BIC. Differences in LL values appear to stabilize at 4 classes. The LMR-LRT suggests retaining 5 classes, one of which contains 3% of the sample. The BLRT suggests retaining up to 10 classes.

I expect that consistency is important. Given these conflicting results, do any specific tests have greater weight?

Thanks,
Tom

Linda K. Muthen posted on Thursday, April 26, 2007 - 8:09 am

In a situation like this, I would probably look at the 4, 5, and 6 class solutions and see what they look like from a substantive point of view. Are the classes really different or just variations on a theme? What does theory suggest?

J.D. Haltigan posted on Monday, May 03, 2010 - 11:42 pm

Hi all,
I have read over this thread intensely and would like to ask a few questions about a latent class model I am running.

In short, I have 7 binary (0,1) indicators and am looking at a 2- or 3-class solution. The BIC differences between the two are minimal. Prior theory and a previous EFA on the binary data suggest the possibility of a 3-group solution, but I am a little confused by my output. No matter the number of random starts I choose, one group always ends up with 1 (in either solution). I am wondering what might be the reason for this result.

Thanks,
JD

Linda K. Muthen posted on Tuesday, May 04, 2010 - 11:24 am

It sounds like you are getting one empty class when you ask for three classes. If this is the case, this could point to a two-class solution.

J.D. Haltigan posted on Tuesday, May 04, 2010 - 5:21 pm

Thanks.
I noticed that when I tweaked my input instructions I got a seemingly 'expectable' three-group solution. The tweaks were non-technical:

I did not specify type of data nor number of observations. Also resaved my .dat file without variable names at the top and specified free format.

Would this indeed make a difference?

Linda K. Muthen posted on Tuesday, May 04, 2010 - 6:22 pm

It would be impossible to say without seeing exactly what you did. I would be concerned if I could not understand how the changes made a difference.

J.D. Haltigan posted on Tuesday, May 04, 2010 - 6:50 pm

I think possibly by specifying the number of observations I inadvertently limited the observations used in the model run. I made the mistake of using this command and specified the number of cases as the number of observations.

Mario Mueller posted on Thursday, January 12, 2012 - 12:30 am

Hello,
I'm running an LCA with 7 Likert-type (1-5) indicators (N=9.600) with different class solutions (2-4).
Is it necessary to use user-specified starting values and, if so, when and how do I have to specify them? The User's Guide did not provide sufficient information for selecting them.

Thanks, Mario

Linda K. Muthen posted on Thursday, January 12, 2012 - 6:07 am

It is not necessary to provide starting values. If you don't have any idea of what they would be, it is best to use the default starting values.

Mario Mueller posted on Monday, January 16, 2012 - 5:12 am

Hello Linda,

Thank you! That's what I did but only got non-significant Chi-Square tests (p=1.000) for all class solutions. Does it mean that I only have one class?
Furthermore, when I plotted the latent profiles they had almost similar shapes.

Thanks, Mario

Linda K. Muthen posted on Monday, January 16, 2012 - 1:41 pm

See the following paper which is available on the website and the Topic 5 course handout on the website to understand how to determine the number of classes:

Nylund, K.L., Asparouhov, T., & Muthén, B. (2007). Deciding on the number of classes in latent class analysis and growth mixture modeling: A Monte Carlo simulation study. Structural Equation Modeling, 14, 535-569.

Melissa Kimber posted on Friday, February 10, 2012 - 10:03 am

Hello,
I am looking for resources on how to run an exploratory LCA in Mplus: evidence on how to do it, what number of classes to start with, and also the code to run the analysis in Mplus. I have three continuous variables that I would like to perform the exploratory LCA on.
Do any of your handouts actually have the information on how to implement it in Mplus?
Thanks very much for your help.
***Melissa

Linda K. Muthen posted on Friday, February 10, 2012 - 10:04 am

The Topic 5 course handout shows how to carry out an LCA.

Claudia Recksiedler posted on Tuesday, March 06, 2012 - 11:11 pm

Dear Dr. Muthen,

I am running LCA models based on four categorical variables with four levels each. When requesting TECH11, I wanted to use user-specified starting values from my output without TECH11 to order the latent classes in the way you recommended in the Mplus manual (the first class with the fewest participants and the last with the most). Unfortunately, some of my logit thresholds were set at extreme values in the optimization (-15.000 and 15.000). Therefore, my starting values based on the threshold estimates are not in increasing order and cause error messages. Is there another way to order the latent classes according to class counts when requesting TECH11? If not, can I still rely on the results of the likelihood ratio tests or should I use another approach? Any suggestions, recommendations, or references are appreciated!

Thanks and kind regards,
Claudia

Linda K. Muthen posted on Wednesday, March 07, 2012 - 8:53 am

If, for example, the second threshold is fixed at 15, try using 14 for the first and 16 for the third.

Laura Selkirk posted on Wednesday, March 28, 2012 - 9:57 am

Hi,

I am having an issue choosing the number of classes in a latent profile analysis. I am using 5 imputed datasets to run my LPA and am comparing fit based on BIC and entropy, following the recommendations in Nylund et al. (2007). My issue is that when I run the analysis with one imputed dataset, to get the plot and to save the class probabilities, the class membership comes out very differently.

Basically, I am running a four-class analysis. In the analysis with the five datasets there are two larger classes with 44% and 32% of the sample in them and then two smaller classes with 12% and 10% of the sample in them (sample size of 150). When I run the same analysis on a single dataset, the class makeup is 44% in each of two classes and 5.9% in each of the other two.

I do not have the same issues when I run the analysis with three classes. The class structure and % are basically the same whether I run with imputed data or a single dataset. Given these issues does it make sense to go with the three class structure instead of the four class even though the four class structure has a better BIC and Entropy when run with the five imputed datasets?

Thanks,
Laura

Linda K. Muthen posted on Wednesday, March 28, 2012 - 1:28 pm

How did you impute the data? Did you use a mixture model? Why are you not using maximum likelihood instead of imputing?

BIC and other fit statistics are given as averages in multiple imputation. They have not been developed for imputation. It is not clear if an average is meaningful.

Laura Selkirk posted on Wednesday, March 28, 2012 - 2:26 pm

Linda,

I used Amelia in R to create five datasets. I do not quite understand what it would mean to use ML instead of imputing.

Laura

Linda K. Muthen posted on Wednesday, March 28, 2012 - 6:42 pm

You don't want to impute by a different model than you analyze. I suspect Amelia does not have mixture modeling as an option.

Using ML is using all available information which is the default in Mplus. Some refer to this as FIML.

Laura Selkirk posted on Thursday, March 29, 2012 - 3:16 pm

Linda,

That's what I thought you meant but I wasn't completely sure. Thank you for the explanation re: Amelia. In using ML estimation, would I have to have Mplus exclude missing data listwise, or do you recommend a way to impute missing data values? I understand that I should not use data imputed using a model other than mixture modeling, but is there a way to impute data in Mplus using mixture modeling?

Thanks,
Laura

Laura Selkirk posted on Thursday, March 29, 2012 - 3:52 pm

Linda,

Never mind, I spent some time with the Mplus manual and realized that if I turn off Listwise=on then it would estimate the model with missing data theory. Thanks for all of your help!

Laura

Leslie Roos posted on Wednesday, August 15, 2012 - 11:35 am

Hello!

I am a graduate student working on an LCA using NESARC data to examine whether 10 childhood adversity variables (all binary categorical) form a latent class pattern that predicts a binary adult outcome. I am struggling with the MODEL specifications.
My understanding was that with all categorical predictors, MODEL specification is not necessary, but since this involves complex sampling with covariates (which can be categorical or continuous), I think it possibly is? Additionally, we do not have any a priori hypotheses about latent variable interactions or random slopes, but I am not sure if these need to be specified?

I've included my syntax-in-progress below, any help in designing the model or suggestions would be hugely appreciated -- thank you!!

TITLE: LCA Childhood Adversity
DATA: FILE = .dat
VARIABLE:
NAMES = Outcome ace1 ace2 ace3 ace4
income w2_eth w2mstat w2sex w2_urb
age;
USEVARIABLES =
CATEGORICAL = ace1 ace2 ace3 ace4
CLASSES = c (2)
WITHIN = ace1 ace2 ace3 ace4
BETWEEN = income w2_eth w2mstat
w2_urb age;
STRATIFICATION w2strat;
CLUSTER = w2psu;
WEIGHT = w2weight;
SUBPOPULATION = w2sex = 1.
KNOWNCLASS ; (?)
DEFINE
ANALYSIS Type= COMPLEX MIXTURE;
STARTS = 100 10;
STITERATIONS = 20;
MODEL
??

Linda K. Muthen posted on Wednesday, August 15, 2012 - 4:38 pm

You should look at Chapter 7 examples. Example 7.12 is a LCA with a covariate. You would use TYPE=COMPLEX MIXTURE; and the complex survey data options shown above. You should not use BETWEEN and WITHIN. They are for TYPE=TWOLEVEL.

Study Hard posted on Tuesday, February 03, 2015 - 6:19 pm

Hello Dr. Muthen

I am running a latent class analysis following the 2012 article "Using Mplus TECH11 and TECH14 to test the number of latent classes" by Asparouhov and B. Muthen.

The article compared the 4-class and 5-class models and found that the p-values were small in both step 2 and step 3. So, the conclusion was to reject the 4-class model.

But how would you interpret the situation where the step 2 test (the Vuong test) resulted in a large p-value whereas step 3 resulted in a p-value of 0.000?

Should the 4-class model still be rejected?

Bengt O. Muthen posted on Wednesday, February 04, 2015 - 3:23 pm

You stop testing once you have a first large p-value.
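
One reading of this rule, sketched below, is that the tests are taken in order and the first large p-value ends the sequence, so later tests are not consulted. The p-values, the 0.05 cutoff, and the function name are all hypothetical and chosen only to mirror the situation in the question.

```python
def first_large_p(tests, alpha=0.05):
    """Return the label of the first test whose p-value exceeds alpha, else None.

    `tests` is an ordered list of (label, p_value) pairs. The cutoff alpha
    is an assumption; the thread does not state one.
    """
    for label, p_value in tests:
        if p_value > alpha:
            return label
    return None

# Hypothetical values: step 2 (TECH11) is large, step 3 (TECH14) is 0.000.
steps = [("step 2: TECH11, 4 vs 5 classes", 0.40),
         ("step 3: TECH14, 4 vs 5 classes", 0.000)]

stopped_at = first_large_p(steps)
print("Testing stops at:", stopped_at)
```

Under this reading the sequence ends at step 2, so the 4-class model is not rejected and the step 3 p-value does not enter the decision.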

CB posted on Tuesday, March 24, 2015 - 1:07 pm

Thanks for all of the helpful references in deciding the number of latent classes!

What references are available that discuss variable selection for LCA, especially for categorical indicators? I'm trying to find some type of semi-systematic modeling strategy, but have had difficulty finding one. Also, do the p-values for the Z-test (estimate divided by its standard error) help in deciding which variable is a "good" discriminator for the latent classes?

Thanks!

Bengt O. Muthen posted on Tuesday, March 24, 2015 - 2:20 pm

Take a look at our technical appendix for version 7:

Variable-specific entropy contribution.

Jen Doty posted on Thursday, March 17, 2016 - 7:49 am

I typically work with Mplus when I do latent class analysis, but I am currently writing a paper where the group did latent class analysis using Stata. One of the authors, who has done LCAs in Latent Gold, wants to report classification error. Is that analogous to entropy? If not, what is the difference? Thank you for helping me navigate interdisciplinary differences in stats language!

Linda K. Muthen posted on Thursday, March 17, 2016 - 11:10 am

I don't know how these are related offhand. You may want to ask on a general discussion forum like SEMNET or ask the Latent Gold developers.

Jen Doty posted on Thursday, March 17, 2016 - 1:23 pm

Looking at it, I think it might be 1-entropy = classification error, but I'll do some more asking around. Thanks anyway!
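
The two quantities can be compared directly on a small example. The sketch below assumes the usual definitions: relative entropy as 1 - sum_i sum_k(-p_ik ln p_ik) / (n ln K), and classification error as the expected share of cases misclassified under modal assignment, mean_i(1 - max_k p_ik). Whether a given program reports exactly these quantities is an assumption to check against its documentation, and the posterior probabilities are made up for illustration.

```python
import math

def relative_entropy(post):
    """Relative entropy in [0, 1]; 1 means perfectly separated classes."""
    n, k = len(post), len(post[0])
    # Terms with p == 0 contribute nothing (the limit of p * ln p is 0).
    total = sum(-p * math.log(p) for row in post for p in row if p > 0)
    return 1.0 - total / (n * math.log(k))

def classification_error(post):
    """Expected share of cases misclassified under modal assignment."""
    return sum(1.0 - max(row) for row in post) / len(post)

# Hypothetical posterior class probabilities for four cases, two classes.
post = [[0.90, 0.10],
        [0.80, 0.20],
        [0.60, 0.40],
        [0.05, 0.95]]

entropy = relative_entropy(post)
error = classification_error(post)
print("relative entropy:", round(entropy, 3))
print("1 - entropy:", round(1 - entropy, 3))
print("classification error:", round(error, 4))
```

On these made-up probabilities, 1 - entropy and the classification error come out quite different, so under these definitions the two are related measures of class separation but not interchangeable.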

Brandy Callahan posted on Wednesday, April 20, 2016 - 1:27 pm

Hello,

I have read through the Web Notes (http://www.statmodel.com/examples/webnotes/webnote14.pdf) in trying to decide how many classes to retain. After requesting TECH11 and TECH14, the Lo-Mendell-Rubin adjusted LRT test p-value is non-significant (p=1.0000), suggesting no difference between k and k-1, but the BLRT is very significant (p=0.0000), suggesting that the k-1 model can be rejected. Your 2006 paper indicates that "BLRT outperforms both the NCS and LMR" (Nylund, Asparouhov & Muthen, 2006). Should I therefore consider only the BLRT in selecting my classes if the LMR and BLRT are divergent?

Thanks so much.

Linda K. Muthen posted on Wednesday, April 20, 2016 - 4:30 pm

I would look at BIC also and substantive theory.

John B. Nezlek posted on Monday, July 13, 2020 - 11:48 am

Hi Mplus,

I am trying to run a LCA using the following model:

classes = c(2);
ANALYSIS: TYPE = MIXTURE EFA 1 2;

The model will not converge (at all).

NO CONVERGENCE. PROBLEM OCCURRED IN EXPLORATORY FACTOR ANALYSIS WITH 1 FACTOR(S).

NO CONVERGENCE. PROBLEM OCCURRED IN EXPLORATORY FACTOR ANALYSIS WITH 2 FACTOR(S).

When I run a straight EFA (no mixture) I get two eigenvalues greater than 1.0,
6.368 and 1.979.

Any suggestions as to why the model will not converge in the mixture analysis?

Thanks,
John

Bengt O. Muthen posted on Monday, July 13, 2020 - 4:40 pm

Send the output with Tech8 turned on.