Message/Author 

Anonymous posted on Monday, June 23, 2003  1:53 pm



1) In the Mplus output for LCA, there are two types of class counts/proportions, based on posterior probabilities and on most likely class membership, respectively. What does it indicate when these two types of class counts are quite different? 2) Mplus provides significance test results for threshold estimates. What does it mean when the test is not statistically significant for a threshold estimate? Do we need to set it to 0? 

bmuthen posted on Monday, June 23, 2003  1:56 pm



1) This means that the classification is not very clear, i.e., the entropy is rather low. 2) This significance test can be ignored. Zero for a threshold typically implies a probability of 0.5, which is usually not a meaningful value to test against. Don't set the threshold to zero. 
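The two kinds of counts can be reproduced by hand from the posterior probability matrix: when classification is fuzzy, modal assignment gives each person's full weight to one class, while the posterior-based counts split it across classes. A minimal sketch with made-up posterior probabilities:

```python
# Two ways of tallying class counts, reproduced from a made-up
# posterior probability matrix (rows = people, columns = 2 classes).
posteriors = [
    [0.90, 0.10],  # clearly class 1
    [0.55, 0.45],  # fuzzy
    [0.60, 0.40],  # fuzzy
    [0.20, 0.80],  # clearly class 2
]

# Counts based on estimated posterior probabilities: column sums.
prob_counts = [sum(row[k] for row in posteriors) for k in range(2)]

# Counts based on most likely class membership: modal assignment.
modal_counts = [0, 0]
for row in posteriors:
    modal_counts[row.index(max(row))] += 1

print(prob_counts)   # about [2.25, 1.75]
print(modal_counts)  # [3, 1]: the fuzzy people moved wholesale into class 1
```

The larger the share of fuzzy cases, the more the two sets of counts diverge, which is why a big discrepancy goes hand in hand with low entropy.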

Anonymous posted on Tuesday, July 08, 2003  12:31 pm



1) Information criterion statistics (e.g., BIC, AIC) are usually used to compare LCA models with different numbers of classes because these models are not nested. Once a model with the smallest BIC is chosen, how do we know the model fits the data? What statistics provided in the Mplus output can be used to test goodness of fit of the LCA model? 2) In his paper (Statistical and substantive checking in GMM), Dr. Muthén mentioned that the Lo-Mendell-Rubin LR test is a new statistic that can be used for latent class enumeration of GMMs with non-normality of the outcome, and that the LMR LRT procedure was implemented in Mplus 2.12. I am wondering if the LMR LRT can be used for class enumeration in LCA; if so, how can we get it from Mplus 2.12? 3) LCA models with smaller BIC could also have smaller values of entropy, compared with other LCA models. Should the value of entropy also be used for model comparison? Is there any cutoff value for entropy that indicates an acceptable class membership classification? 4) When predictors of the latent class membership are included in the LCA, model results, including latent class probabilities and conditional probabilities, often change. Does this mean we should always make model selections and interpret model results based on the LCA model with predictors, and not report the model results without predictors? 

bmuthen posted on Tuesday, July 08, 2003  1:21 pm



1) This is a good question that has not received enough attention in mixture modeling. It is a difficult topic because it is not enough to check fit against first- and second-order moments, given that mixtures are not normal. Mplus offers two features. Tech7 gives a comparison between model-estimated means, variances, and covariances and sample counterparts where the raw data have been weighted by posterior probability estimates. This procedure was used in the Roeder, Lynch, Nagin article in JASA (see Mplus references). The new Tech13 checks agreement with multivariate skew and kurtosis. Mixture model fit is an area ripe for research. One new line of development is residual diagnostics. Both Tech13 and residual diagnostics are referred to in my paper that you mention under 2). 2) Yes. And you get it through Tech11, just as for continuous outcomes. 3) I don't think entropy should be of primary concern for deciding the number of classes unless deciding between two models that are equivalent in other regards. I am not aware of a cutoff for good entropy. Note also that the classification can be good for some classes and the entropy low because other classes are not well distinguishable. 4) Yes, I think the final model should include covariates. I have a draft paper that discusses this issue, which I will be happy to send to you later this summer (remind me). For example, if the true model has covariates x influencing a latent class variable c and also having a direct influence on the indicators u, then excluding x in the analysis will produce incorrect classification because of the direct influence of x on u. 

Anonymous posted on Tuesday, July 08, 2003  2:18 pm



Dear Dr. Muthén: Thank you so much for your quick reply! I tried both the TECH7 and TECH13 options as you recommended. They did not work for LCA. The Mplus output shows both options are only available for models with y-variables. I am wondering how the LR-based chi-square statistic was used for LCA model fit testing in some studies: 1) Flaherty, B. P. (2002). Assessing reliability of categorical substance use measures with latent class analysis. Drug and Alcohol Dependence, 68, S7-S20. (The computer software used for the study is LTA.) 2) Mitchell & Plunkett (2000). The latent structure of substance use among American Indian adolescents: An example using categorical variables. American Journal of Community Psychology, 28, 105-125. (The software used for the study is MLLSA.) 

bmuthen posted on Tuesday, July 08, 2003  3:29 pm



Yes, Tech7 and Tech13 are only for continuous outcomes ("y"s). Tech11 can be used for categorical outcomes ("u"s), but it isn't a test against data. For a test against data with a model for u's, you can use the Pearson and likelihood ratio chi-square tests of the model against the unrestricted alternative (multinomial frequency table). The problem is that when the number of u's is not small, the number of cells of the table gets large and the tests are not well approximated as chi-squares. But these are probably the tests that you see in those articles. 
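The two tests mentioned here have simple closed forms over the cells of the frequency table: Pearson X² = Σ (O − E)²/E and the likelihood ratio G² = 2 Σ O ln(O/E). A sketch with made-up cell counts:

```python
import math

def fit_chi2(observed, expected):
    # Pearson X^2 and likelihood-ratio G^2 of a model against the
    # unrestricted (multinomial) alternative; a zero observed cell
    # contributes 0 to G^2 by the usual 0 * log 0 = 0 convention.
    x2 = sum((o - e) ** 2 / e for o, e in zip(observed, expected))
    g2 = 2 * sum(o * math.log(o / e)
                 for o, e in zip(observed, expected) if o > 0)
    return x2, g2

# Made-up frequencies for a tiny 2x2 table, flattened to 4 cells:
obs = [30, 20, 25, 25]
exp = [27.5, 22.5, 27.5, 22.5]
x2, g2 = fit_chi2(obs, exp)
print(x2, g2)  # both near 1.0 for these numbers
```

With many u's the table has 2^p cells and most are empty, so both statistics drift away from their chi-square reference distribution; that is the sparseness problem described above.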

Anonymous posted on Thursday, July 10, 2003  1:15 pm



I have two LCA models:
3-class model: BIC = 3174.61, AIC = 3032.38
4-class model: BIC = 3216.24, AIC = 3013.05
One model has a smaller BIC, while the other has a smaller AIC. Which model should I choose? In this case, should the choice of model depend on which model is more interpretable? 

bmuthen posted on Thursday, July 10, 2003  1:19 pm



In a dissertation that a student of mine did, BIC performed better than AIC in an LCA Monte Carlo study. BIC seems to be more widely accepted with mixtures (latent class models). But in this type of case I would certainly rely heavily on interpretation. You can also check the Pearson and likelihood ratio chi-square tests against the unrestricted frequency table, as well as the new Tech11. 
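For reference, the two criteria differ only in the per-parameter penalty: AIC = −2 logL + 2k and BIC = −2 logL + k ln n, so BIC penalizes extra classes more whenever ln n > 2 (n > about 7). That is how the two can disagree, as in the post above. A sketch with invented numbers (the loglikelihoods and sample size are illustrative, not taken from these posts):

```python
import math

def aic(loglik, k):
    # Akaike information criterion: penalty of 2 per free parameter.
    return -2 * loglik + 2 * k

def bic(loglik, k, n):
    # Bayesian information criterion: penalty of ln(n) per free parameter.
    return -2 * loglik + k * math.log(n)

# Hypothetical fits: the 4-class model improves the loglikelihood a bit,
# at the cost of 15 extra parameters (all numbers are made up).
n = 500
ll3, k3 = -1520.0, 35
ll4, k4 = -1493.0, 50

print(aic(ll3, k3), aic(ll4, k4))        # AIC prefers 4 classes here
print(bic(ll3, k3, n), bic(ll4, k4, n))  # BIC prefers 3 classes here
```

Because ln(500) is about 6.2, each extra parameter costs roughly three times as much under BIC as under AIC, which is why BIC tends to pick fewer classes.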

Anonymous posted on Friday, July 11, 2003  11:02 am



Dear Dr. Muthén: My model results show that the Pearson and likelihood ratio chi-square statistics were significant for both models (indicating bad model fit?). However, the p-value was slightly larger for the 3-class model (BIC is also smaller for the 3-class model), and the 3-class model is more interpretable. So the 3-class model should be preferred. Right?

Tech11 output:
VUONG-LO-MENDELL-RUBIN LIKELIHOOD RATIO TEST FOR 3 (H0) VERSUS 4 CLASSES
H0 Loglikelihood Value: -1483.448
2 Times the Loglikelihood Difference: 53.845
Difference in the Number of Parameters: 19
Mean: 7.576
Standard Deviation: 129.300
P-Value: 0.3000
LO-MENDELL-RUBIN ADJUSTED LRT TEST
Value: 53.382
P-Value: 0.3022

I have some questions about the Tech11 output: 1) Can the Vuong-Lo-Mendell-Rubin and Lo-Mendell-Rubin adjusted LRT tests be used for comparing models with different numbers of latent classes, which are non-nested? 2) The p-value shows the chi-square is not statistically significant. Does this indicate no difference between the 3-class and 4-class models? 3) The number of parameters was 35 for the 3-class model and 50 for the 4-class model, according to TESTS OF MODEL FIT in the Mplus output. The difference in number of parameters is 15, which differs from the 19 in the Tech11 output. 4) The p-value for chi-square = 53.845 with d.f. = 19 calculated by the SAS Probchi function is 0.000035, which differs from the 0.3000 and 0.3022 in the Tech11 output. Could you please help me with these questions? Thank you very much! 

bmuthen posted on Friday, July 11, 2003  4:28 pm



I agree with your first paragraph. 1) No. 2) Yes. In other words, you can't reject that 3 classes is sufficient. 3) I am glad you are checking this. This could mean that the Tech11 test was not correctly applied here due to the model specification of the first class. Note that Tech11 gets the 3-class model by dropping class 1 in the 4-class run. Perhaps you have restrictions in the last 3 classes of the 4-class run that you don't have in your 3-class run? 4) That p-value is based on the chi-square distribution, but the idea behind Tech11 is that the likelihood difference is not chi-square in this case. The Tech11 reference distribution is different from chi-square, hence the difference in p-value. 
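Point 4 can be checked numerically: referring 53.845 with 19 degrees of freedom to an ordinary chi-square distribution reproduces the SAS value of about 0.000035, which is precisely why it disagrees with Tech11; the LMR statistic uses a different reference distribution. A stdlib-only sketch (the tail is integrated numerically; the upper bound and step count are arbitrary choices):

```python
import math

def chi2_sf(x, df, upper=400.0, steps=20000):
    # Naive chi-square tail probability P(X > x) by Simpson's rule
    # over the density from x to `upper` (quick check, not production
    # numerics; `steps` must be even).
    def pdf(t):
        logp = (df / 2 - 1) * math.log(t) - t / 2 \
               - (df / 2) * math.log(2) - math.lgamma(df / 2)
        return math.exp(logp)

    h = (upper - x) / steps
    total = pdf(x) + pdf(upper)
    for i in range(1, steps):
        total += pdf(x + i * h) * (4 if i % 2 else 2)
    return total * h / 3

# The "naive" p-value the poster computed with SAS's Probchi:
p_naive = chi2_sf(53.845, 19)
print(p_naive)  # roughly 3.5e-05, nowhere near Tech11's 0.30
```

The gap between 0.000035 and 0.30 is not an error in either calculation; it is the whole point of the Lo-Mendell-Rubin approach, which replaces the invalid chi-square reference with an approximation to the correct mixture-comparison distribution.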

Anonymous posted on Friday, July 11, 2003  5:00 pm



Thank you very much! 


Dear Dr. Muthén, In your posting (in this discussion) from 8 July 2003, you mention your own work-in-progress on the issue of covariates in mixture models (specifically, on why it is advisable to include covariates in a mixture model, and why failure to do so results in incorrect classification). Have you been able to publish this work yet? I'd be very grateful for a reference, or for hints towards related literature. 


Following is the reference: Muthén, B. (2004). Latent variable analysis: Growth mixture modeling and related techniques for longitudinal data. In D. Kaplan (Ed.), Handbook of quantitative methodology for the social sciences (pp. 345-368). Newbury Park, CA: Sage Publications. This paper can be downloaded from the website. 


Hi, Two questions about LCA: 1. Can I do LCA on nominal variables? 2. If yes, can I compute model fit statistics, and how, to know if my model is good? I know that I have to choose the lowest BIC to decide how many classes, but I want to be sure that my model is good even though the BIC is low!! Thank You, Annie 


1. Yes. 2. You would use the same strategy to decide on the number of classes as with any other latent class indicators. See the Muthen paper in the Kaplan handbook. 


Thank you. But to publish, we need model fit statistics like the Pearson or likelihood ratio chi-square statistics. Is it possible with this kind of test? I tried SAMPSTAT to compute model fit, but it doesn't work… Thanks again, Annie 


You will find the Pearson and LR chi-square tests for the categorical outcomes (also nominal) in the output. 


Sorry to be so demanding, but I don't see the word Pearson anywhere in the output. Do I have to specify something in the OUTPUT or in the ANALYSIS command? 


You must not have categorical (or nominal) outcomes. If you do, send your input, output, data, and license number to support@statmodel.com 


The thing is that I know how many classes are in my model. But I want to know if the model is good using some fit statistics (RMSEA, Pearson, chi-square, etc.). In the output I see AIC, BIC, and entropy. There's no Pearson or chi-square test. How can I find those fit statistics (TECH...)?

variable:
names are id a1-a10;
usevariables are a1-a10;
auxiliary = id;
classes = c(4);
nominal = a1-a10;
missing = .;
analysis:
type = mixture missing; 


Please send your input, data, output, and license number to support@statmodel.com. 


Hello, I am working on an LCA with one covariate. When I run the model without the covariate, the output includes conditional response probabilities ("RESULTS IN PROBABILITY SCALE"). However, when I include the covariate in the model, the output does not include this section. Is there a way in Version 4.1 to get this? Thank you. 


No, the probability conversion is given only for models without covariates. 
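The conversion can be done by hand from the estimated logit thresholds: for a binary indicator, the class-specific response probability is P(u = 1 | class) = 1 / (1 + exp(threshold)). A small sketch (the example thresholds are made up, not from any real output):

```python
import math

def item_probability(threshold):
    # Mplus reports a logit threshold for a binary latent class
    # indicator; the class-specific response probability is
    #   P(u = 1 | class) = 1 / (1 + exp(threshold)).
    return 1.0 / (1.0 + math.exp(threshold))

# Illustrative thresholds only:
for tau in (-2.0, 0.0, 2.0):
    print(tau, item_probability(tau))
```

Note that a threshold of 0 maps to a probability of 0.5, which is why, as noted earlier in this thread, testing a threshold against zero is rarely meaningful.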


Is there a reference someone can suggest for the benefit of BIC over AIC in comparing models? If you have one, please email me at tkashdan@gmu.edu.

Gratefully,
Todd B. Kashdan, Ph.D.
Assistant Professor
Department of Psychology
George Mason University
Mail Stop 3F5
Fairfax, VA 22030
Office number: 703-993-9486
Fax number: 703-993-1359
Website: http://mason.gmu.edu/~tkashdan 


There is literature comparing the two for mixture models; see, for instance, the Nylund et al. paper on the Mplus website. But I can't think of papers offhand in non-mixture cases. 


Todd, Here are two BIC references I just used for a recent manuscript using LCA: Li, W., & Nyholt, D. R. (2001). Marker selection by Akaike information criterion and Bayesian information criterion. Genetic Epidemiology, 21, S272-S277. Raftery, A. E. (1995/2004). Bayesian model selection in social research. In P. V. Marsden (Ed.), Sociological Methodology. Cambridge, MA: Blackwell. 

Anonymous posted on Thursday, November 29, 2007  5:17 am



Good morning. The following statement appears in the User's guide in relation to the use of the Tech11 option: "it is recommended when using starting values that they be chosen so that the last class is the largest class." Could you please provide an example of the syntax that would be used to specify that the last class is the largest class? Thank you. 


To do this, you use the ending values for the parameters in the class that you want to be last as starting values for the parameters in the last class in a subsequent analysis. There are several examples in Chapter 7 of how to assign starting values to the parameters in different classes. 

Anonymous posted on Wednesday, December 12, 2007  8:15 am



I'm sorry to be a bother. What command do I use to obtain info on the ending values? 


If by ending values you mean the parameter estimates, you will see those in the output. Perhaps you mean the estimated class probabilities, which are obtained using the CPROB option. 

Joan W. posted on Tuesday, September 21, 2010  11:45 am



According to the Mplus manual, entropy is calculated using the formula in Ramaswamy et al. (1993). Since this formula involves taking the logarithm of the posterior probability for each class, I was wondering how this is defined for a posterior probability of zero? Thanks. 


I think the log of 0 is negative infinity but this value is multiplied by zero so it falls out of the calculation. 
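The Ramaswamy et al. (1993) relative entropy reported by Mplus can be written as E = 1 − Σᵢ Σₖ (−pᵢₖ ln pᵢₖ) / (n ln K), and the zero-posterior case is handled exactly as described: the 0 · ln 0 term is taken to be zero. A small sketch with made-up posterior matrices:

```python
import math

def relative_entropy(posteriors):
    # Relative entropy of Ramaswamy et al. (1993):
    #   E = 1 - sum_i sum_k (-p_ik * ln p_ik) / (n * ln K).
    # A posterior of exactly 0 contributes nothing (0 * log 0 = 0).
    n = len(posteriors)
    K = len(posteriors[0])
    total = 0.0
    for row in posteriors:
        for p in row:
            if p > 0:
                total += -p * math.log(p)
    return 1.0 - total / (n * math.log(K))

perfect = [[1.0, 0.0], [0.0, 1.0]]  # fully separated classes
fuzzy = [[0.5, 0.5], [0.5, 0.5]]    # no separation at all
print(relative_entropy(perfect))  # 1.0
print(relative_entropy(fuzzy))    # 0.0
```

Values near 1 mean sharp classification; values near 0 mean the posteriors carry almost no information about class membership.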


Hi there, I have completed an LCA and the results make substantive sense; the entropy value is strong and correct classification is very high, so that's great. However, I have read that I should check LCA results under four conditions:
class-varying, unrestricted
class-invariant, unrestricted
class-varying, diagonal
class-invariant, diagonal
I wonder if I could have some guidance on which is the 'default' in Mplus (I have used very simple LCA language), and how I would specify and estimate the other three? Thanks so much, 


Are your outcomes categorical or continuous? Where have you read about these conditions? 


The indicator variables in my LCA are continuous (so I should have posted under LPA, very sorry). I read about these conditions in the work of Katherine Masyn from Harvard. Perhaps I should just be happy with my findings, as in subsequent mediation models where the profile variable is used as the mediator (three profiles), the models converge and the results confirm hypotheses. Can I also check about using the profile variable (with categories 1, 2, and 3) as a continuous mediator in a path model? I note that it cannot be designated as categorical (as it actually is, although highly ordinal, with higher values denoting more self-regulation problems), but I recall Linda suggesting in Hong Kong that if entropy and class classification were high, it is acceptable to use class in this way in path models? Would you suggest any examples or references to strengthen my case? Thanks so much for your time. 


The Mplus default for LPA with continuous outcomes is class-invariant, diagonal covariance matrices. Using a latent class variable as a mediator involves a more complex analysis. Treating it as an observed ordinal or nominal variable involves understanding the modeling in Muthén, B. (2011), Applications of causally defined direct and indirect effects in mediation analysis using SEM in Mplus, which is on our website under Papers, Mediational Modeling. Even if you have high entropy, use most likely class membership, and treat it as ordinal, you can use the ordinary product formula for indirect effects only if you view the mediator as the continuous latent response variable behind the ordinal observed mediator. This needs WLSMV or Bayes estimation; ML can't do that. Using ML, you can take the approach discussed in the paper above, where the observed variable is used as the mediator, treated either as ordinal or nominal. 


Thanks for the feedback, and the reminder of that paper. Thinking more carefully about my research question, I'd love to get your quick thoughts. I realise that my latent profile variable (ordinal 1, 2, 3) is actually a predictor, or analogous to a 3-level treatment group I guess (levels of self-regulation problems in early childhood). It is not a mediator. In my path models I have only observed variables, no latents; rather, I have assigned the coding for self-regulation problems based on an LPA conducted separately. I have three confounders: 2 are binary (maternal history of depression and child gender) and 1 is continuous (socioeconomic disadvantage). They confound the treatment group, the continuous outcome (middle childhood behaviour problems), and the continuous mediators (parenting). Given the assumptions set out in the paper, I could/should not be looking to claim causal mediation at all; or do you think I should give it a go using the model constraints as set out in the paper? What are your thoughts on a fairly basic path analysis under the above conditions using WLSMV? Would the model's direct and indirect effects be trustworthy? So far the models converge, fit the data, and interpretations of effect sizes (small) and directions make substantive sense. I just need to confirm for myself that I can defend the analysis methodology. Thanks so much for your time. 


If you see your latent profile variable as predictor you may want to do a multiplegroup analysis based on it. The confounders then have different effects in the different groups. For each group you can have a mediation model if that is relevant. I shouldn't go into suggesting analysis approaches without knowing much more and that is not the aim of Discussion. 
