Anonymous posted on Monday, June 23, 2003 - 1:53 pm
1) In the Mplus output for LCA, there are two types of class counts/proportions, based on posterior probabilities and on most likely class membership, respectively. What does it indicate when these two types of class counts are quite different?
2) Mplus provides significance test results for threshold estimates. What does it mean when the test is not statistically significant for a threshold estimate? Do we need to set it to 0?
1) This means that the classification is not very clear, i.e. the entropy is rather low.
2) This significance test can be ignored. Zero for a threshold typically implies probability 0.5, which is usually not a meaningful value to test against. Don't set the threshold to zero.
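Why a zero threshold corresponds to a probability of 0.5 can be seen from the logistic link. The following is a minimal sketch, assuming the usual logit parameterization for a binary latent class indicator, P(u = 1 | class) = 1 / (1 + exp(threshold)). The function name is illustrative and is not part of Mplus.

```python
import math


def threshold_to_probability(threshold: float) -> float:
    """Convert a logit threshold to P(u = 1 | class).

    Assumes a binary indicator with a logistic link, where a higher
    threshold means a lower probability of endorsing the item.
    """
    return 1.0 / (1.0 + math.exp(threshold))


# A threshold of 0 maps to 0.5; testing it against 0 only asks
# whether the item probability differs from one half.
for tau in (-2.0, 0.0, 2.0):
    print(f"threshold {tau:+.1f} -> P(u = 1) = {threshold_to_probability(tau):.3f}")
```

Under this parameterization a non-significant threshold only says that the item probability in that class is not distinguishable from 0.5, which is rarely a hypothesis of interest.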
Anonymous posted on Tuesday, July 08, 2003 - 12:31 pm
1) Information criterion statistics (e.g., BIC, AIC) are usually used to compare LCA models with different numbers of classes because these models are not nested. Once the model with the smallest BIC is chosen, how do we know that the model fits the data? What statistics provided in the Mplus output can be used to test the goodness of fit of the LCA model?
2) In his paper (Statistical and substantive checking in GMM), Dr. Muthén mentioned that the Lo-Mendell-Rubin LR test is a new statistic that can be used for latent class enumeration of GMMs with non-normality of the outcome, and the LMR LRT procedure was implemented in Mplus 2.12. I am wondering if LMR LR can be used for class enumeration in LCA; if so, how can we get it from Mplus 2.12?
3) LCA models with a smaller BIC can also have a smaller entropy value than other LCA models. Should entropy also be used for model comparison? Is there a cutoff value for entropy that indicates acceptable classification?
4) When predictors of latent class membership are included in the LCA, the model results, including latent class probabilities and conditional probabilities, often change. Does this mean we should always select the model and interpret the results based on the LCA model with predictors, and not report the results without predictors?
bmuthen posted on Tuesday, July 08, 2003 - 1:21 pm
1) This is a good question that has not received enough attention in mixture modeling. It is a difficult topic because it is not enough to check fit against first and second order moments because mixtures are not normal. Mplus offers two features. Tech7 gives a comparison between model-estimated means, variances, and covariances and sample counterparts where raw data has been weighted by posterior probability estimates. This procedure was used in the Roeder, Lynch, Nagin article in JASA (see Mplus references). The new Tech13 checks agreement with multivariate skew and kurtosis. Mixture model fit is an area ripe for research. One new line of development is residual diagnostics. Both Tech13 and residual diagnostics are referred to in my paper that you mention under 2).
2) Yes. And you get it through Tech13 just as for continuous outcomes.
3) I don't think entropy should be of primary concern for deciding number of classes unless deciding between two models that are equivalent in other regards. I am not aware of a cutting point for good entropy. Note also that the classification can be good for some classes and the entropy low because of other classes not being well distinguishable.
4) Yes, I think the final model should include covariates. I have a draft paper that discusses this issue, which I will be happy to send to you later this summer (remind me). For example, if the true model has covariates x influencing a latent class variable c and also having a direct influence on the indicators u, then excluding x from the analysis will produce incorrect classification because of the direct influence of x on u.
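The two kinds of class counts discussed earlier in the thread, and the remark in 3) that classification can be good for some classes but poor for others, can be illustrated with a small sketch. The posterior probabilities below are hypothetical and the function names are illustrative; none of this is Mplus output.

```python
# Hypothetical posterior class probabilities: one row per person,
# one column per latent class (each row sums to 1).
posteriors = [
    [0.95, 0.03, 0.02],
    [0.90, 0.05, 0.05],
    [0.05, 0.55, 0.40],
    [0.10, 0.50, 0.40],
    [0.05, 0.60, 0.35],
    [0.02, 0.38, 0.60],
]


def posterior_counts(post):
    """Class counts obtained by summing posterior probabilities per class."""
    return [sum(column) for column in zip(*post)]


def modal_classes(post):
    """Most likely class (0-based index) for each person."""
    return [max(range(len(row)), key=row.__getitem__) for row in post]


def modal_counts(post):
    """Class counts obtained by assigning each person to the most likely class."""
    counts = [0] * len(post[0])
    for cls in modal_classes(post):
        counts[cls] += 1
    return counts


def mean_posterior_by_modal_class(post):
    """Average posterior probability of the assigned class, per class."""
    sums = [0.0] * len(post[0])
    counts = modal_counts(post)
    for row, cls in zip(post, modal_classes(post)):
        sums[cls] += row[cls]
    return [s / c if c else float("nan") for s, c in zip(sums, counts)]


print("posterior-based counts:", [round(c, 2) for c in posterior_counts(posteriors)])
print("most-likely-class counts:", modal_counts(posteriors))
print("mean posterior by assigned class:",
      [round(m, 3) for m in mean_posterior_by_modal_class(posteriors)])
```

In this made-up example the first class is well separated, while the other two overlap, so the two sets of counts diverge mainly for those two classes.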
Anonymous posted on Tuesday, July 08, 2003 - 2:18 pm
Dear Dr. Muthén:
Thank you so much for your quick reply!
I tried both TECH7 and TECH13 options as you recommended. They did not work for LCA. Mplus output shows both options are only available for models with y-variables.
I am wondering how the LR-based chi-square statistic was used for LCA model fit testing in some studies:
1) Flaherty, B.P. 2002. Assessing reliability of categorical substance use measures with latent class analysis. Drug and Alcohol Dependence, 68: S7-S20. (The computer software used for the study is LTA).
2) Mitchell & Plunkett. 2000. The latent structure of substance use among American Indian adolescents: An example using categorical variables. American Journal of Community Psychology, 28: 105-125. (The software used for the study is MLLSA).
bmuthen posted on Tuesday, July 08, 2003 - 3:29 pm
Yes, Tech7 and Tech13 are only for continuous outcomes ("y"s). Tech11 can be used for categorical outcomes ("u"s), but it isn't a test against data. For a test against data with a model for u's you can use the Pearson and likelihood ratio chi-square tests of the model against the unrestricted alternative (multinomial frequency table). The problem is that when the number of u's is not small, the number of cells of the table gets large and the tests are not well approximated as chi-squares. But these are probably the tests that you see in those articles.
Anonymous posted on Thursday, July 10, 2003 - 1:15 pm
I have two LCA models:
Model with 3 classes: BIC=3174.61 AIC=3032.38
Model with 4 classes: BIC=3216.24 AIC=3013.05
One model has a smaller BIC, while the other has a smaller AIC. Which model should I choose? In this case, should the choice of model depend on which model is more interpretable?
bmuthen posted on Thursday, July 10, 2003 - 1:19 pm
In a dissertation that a student of mine did, BIC performed better in an LCA Monte Carlo study than AIC. BIC seems to be more widely accepted with mixtures (latent class models). But in this type of case I would certainly rely heavily on interpretation. You can also check the Pearson and likelihood ratio chi-square tests against the unrestricted frequency table as well as the new Tech11.
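The disagreement between the two criteria follows from their penalties. As a sketch using the standard definitions AIC = -2 logL + 2p and BIC = -2 logL + p ln(n), with the BIC and AIC values taken from the post above (the helper names are illustrative):

```python
import math


def aic(loglik: float, n_params: int) -> float:
    """Akaike information criterion: -2 logL + 2p."""
    return -2.0 * loglik + 2.0 * n_params


def bic(loglik: float, n_params: int, n_obs: int) -> float:
    """Bayesian information criterion: -2 logL + p ln(n)."""
    return -2.0 * loglik + n_params * math.log(n_obs)


# Values as reported by the poster for the two LCA models.
reported = {
    "3-class": {"BIC": 3174.61, "AIC": 3032.38},
    "4-class": {"BIC": 3216.24, "AIC": 3013.05},
}


def preferred(criterion: str) -> str:
    """Model with the smallest value of the given criterion."""
    return min(reported, key=lambda model: reported[model][criterion])


for criterion in ("BIC", "AIC"):
    print(f"{criterion} prefers the {preferred(criterion)} model")
```

Because ln(n) exceeds 2 once n is 8 or more, BIC charges more per extra parameter than AIC, so it tends to favor the model with fewer classes when the two criteria disagree.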
Anonymous posted on Friday, July 11, 2003 - 11:02 am
Dear Dr. Muthén:
My model results show that the Pearson and likelihood ratio chi-square statistics were significant for both models (indicating bad model fit?). However, the p-value was slightly larger for the 3-class model (its BIC is also smaller), and the 3-class model is more interpretable. So the 3-class model should be preferred. Right?
VUONG-LO-MENDELL-RUBIN LIKELIHOOD RATIO TEST FOR 3 (H0) VERSUS 4 CLASSES
    H0 Loglikelihood Value                      -1483.448
    2 Times the Loglikelihood Difference           53.845
    Difference in the Number of Parameters             19
    Mean                                            7.576
    Standard Deviation                            129.300
    P-Value                                        0.3000
LO-MENDELL-RUBIN ADJUSTED LRT TEST
    Value                                          53.382
    P-Value                                        0.3022
I have some questions about the Tech11 output:
1) Can the VUONG-LO-MENDELL-RUBIN and LO-MENDELL-RUBIN ADJUSTED LRT tests be used for comparing models with different numbers of latent classes, which are non-nested?
2) The p-value shows that the chi-square is not statistically significant. Does this indicate no difference between the 3-class and 4-class models?
3) The number of parameters was 35 for 3-class model and 50 for 4-class model, according to the TESTS OF MODEL FIT in Mplus output. The difference in number of parameters is 15, which is different from that (i.e., 19) in Tech11 output.
4) The p-value for the chi-square=53.845 with d.f.=19 calculated by SAS Probchi function is 0.000035, which is different from those (i.e., 0.3000 and 0.3022) in Tech11 output.
Could you please help me with these questions? Thank you very much!
2) Yes. In other words, you can't reject that 3 classes is sufficient.
3) I am glad you are checking this. This could mean that the Tech11 test was not correctly applied here due to the model specification of the first class. Note that Tech11 gets the 3-class model by dropping class 1 in the 4-class run. Perhaps you have restrictions in the last 3 classes of the 4-class run that you don't have in your 3-class run?
4) That p value is based on chi square, but the idea behind Tech11 is that the likelihood difference is not chi square in this case. The Tech11 distribution is different from chi square - hence the difference in p value.
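The poster's chi-square p-value in question 4) can be reproduced with the standard library alone. This sketch computes the upper tail of an ordinary chi-square distribution for integer degrees of freedom; it does not implement the Lo-Mendell-Rubin reference distribution used by Tech11, which is why the Tech11 p-values are not recovered.

```python
import math


def chi2_sf(x: float, df: int) -> float:
    """Upper-tail probability P(X > x) for a chi-square variable, integer df >= 1."""
    if x <= 0:
        return 1.0
    half = x / 2.0
    if df % 2 == 0:
        # Even df: Q(n, h) = exp(-h) * sum_{j < n} h^j / j!
        n = df // 2
        term, total = 1.0, 1.0
        for j in range(1, n):
            term *= half / j
            total += term
        return math.exp(-half) * total
    # Odd df: start from Q(1/2, h) = erfc(sqrt(h)) and apply the recurrence
    # Q(a + 1, h) = Q(a, h) + h^a * exp(-h) / Gamma(a + 1).
    n = (df - 1) // 2
    total = math.erfc(math.sqrt(half))
    for j in range(n):
        a = j + 0.5
        total += math.exp(a * math.log(half) - half - math.lgamma(a + 1.0))
    return total


# Likelihood difference and parameter difference taken from the Tech11 output above.
p_value = chi2_sf(53.845, 19)
print(f"chi-square p-value: {p_value:.6f}")
```

The result agrees with the value the poster obtained from SAS to the reported precision, which confirms that the gap to 0.3000 and 0.3022 comes from the choice of reference distribution and not from an arithmetic slip.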
Anonymous posted on Friday, July 11, 2003 - 5:00 pm
In your posting (in this discussion) from 8 July 2003, you mention your own work-in-progress on the issue of covariates in mixture models (specifically, on why it is advisable to include covariates in a mixture model, and why failure to do so results in incorrect classification). Have you been able to publish this work yet? I'd be very grateful for a reference, or for hints towards related literature.
Muthén, B. (2004). Latent variable analysis: Growth mixture modeling and related techniques for longitudinal data. In D. Kaplan (ed.), Handbook of quantitative methodology for the social sciences (pp. 345-368). Newbury Park, CA: Sage Publications.
2. If yes, how can I compute model fit statistics to find out whether my model is good? I know that I should choose the number of classes with the lowest BIC, but I want to be sure that my model is good even if the BIC is low.
Thank you. But to publish, we need model fit statistics such as the Pearson or likelihood ratio chi-square. Is that possible with this kind of test? I tried SAMPSTAT to compute model fit, but it doesn't work…
The thing is that I know how many classes are in my model. But I want to know whether the model is good according to some fit statistics (RMSEA, Pearson, chi-square, etc.). In the output, I see AIC, BIC and entropy. There is no Pearson or chi-square test. How can I find those fit statistics (tech...)?
variable:
    name are id a1-a10;
    usevariables are a1-a10;
    auxiliary = id;
    classes = c(4);
    nominal = a1-a10;
    missing = . ;
Hello, I am working on an LCA with one covariate. When I run the model without the covariate, the output includes conditional response probabilities ("RESULTS IN PROBABILITY SCALE"). However, when I include the covariate in the model, the output does not include this section. Is there a way to get this in Version 4.1?
Todd B. Kashdan, Ph.D.
Assistant Professor, Department of Psychology, George Mason University
Mail Stop 3F5, Fairfax, VA 22030
Office number: 703-993-9486
Fax number: 703-993-1359
Website: http://mason.gmu.edu/~tkashdan
Here are two BIC references I just used for a recent manuscript using LCA:
Li, W., & Nyholt, D. R. (2001). Marker selection by Akaike information criterion and Bayesian information criterion. Genetic Epidemiology, 21, S272-S277.
Raftery, A. E. (1995/2004). Bayesian model selection in social research. In P. V. Marsden (Ed.), Sociological Methodology. Cambridge, MA: Blackwell.
Anonymous posted on Thursday, November 29, 2007 - 5:17 am
Good morning. The following statement appears in the User's guide in relation to the use of the Tech11 option: "it is recommended when using starting values that they be chosen so that the last class is the largest class." Could you please provide an example of the syntax that would be used to specify that the last class is the largest class? Thank you.
To do this, you use the ending values for the parameters in the class that you want to be last as starting values for the parameters in the last class in a subsequent analysis. There are several examples in Chapter 7 of how to assign starting values to the parameters in different classes.
Anonymous posted on Wednesday, December 12, 2007 - 8:15 am
I'm sorry to be a bother. What command do I use to obtain info on the ending values?
If by ending values you mean the parameter estimates, you will see those in the output. Perhaps you mean the estimated class probabilities, which are obtained using the "CPROB" option.
Joan W. posted on Tuesday, September 21, 2010 - 11:45 am
According to the Mplus manual, entropy is calculated using the formula in Ramaswamy et al. (1993). Since this formula involves taking the logarithm of posterior probability for each class, I was wondering how this is defined for a posterior probability score of zero? Thanks.
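One way to see how zero posterior probabilities can be handled is the sketch below. It uses the relative entropy form usually attributed to Ramaswamy et al. (1993), E = 1 - [sum over persons i and classes k of -p_ik ln(p_ik)] / (n ln K), and the common convention that 0 * ln(0) is taken as 0, since p ln(p) tends to 0 as p tends to 0. That Mplus treats zeros in exactly this way is an assumption here, not something stated in the thread.

```python
import math


def relative_entropy(posteriors):
    """Relative entropy in [0, 1]; 1 means perfectly clear classification.

    posteriors: list of rows, one per person, each row holding the
    posterior probabilities for K >= 2 classes.
    """
    n = len(posteriors)
    k = len(posteriors[0])
    total = 0.0
    for row in posteriors:
        for p in row:
            if p > 0.0:  # convention: 0 * ln(0) contributes 0
                total += -p * math.log(p)
    return 1.0 - total / (n * math.log(k))


# Hypothetical posteriors: perfect assignment versus no information at all.
print(relative_entropy([[1.0, 0.0], [0.0, 1.0]]))  # all rows certain
print(relative_entropy([[0.5, 0.5], [0.5, 0.5]]))  # all rows uniform
```

With this convention a person who is assigned to a class with certainty contributes nothing to the entropy sum, which matches the interpretation of the statistic.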
I have completed an LCA and the results make substantive sense; the entropy value is strong and correct classification is very high, so that's great. However, I have read that I should check LCA results under four conditions: class-varying, unrestricted; class-invariant, unrestricted; class-varying, diagonal; and class-invariant, diagonal.
I wonder if I could have some guidance on which of these is the 'default' in Mplus (I have used very simple LCA language), and on how I would specify and estimate the other three? Thanks so much,
The indicator variables in my LCA are continuous (so I should have posted under LPA, very sorry). I read about these conditions in the work of Katherine Masyn from Harvard. Perhaps I should just be happy with my findings as in subsequent mediation models where the profile variable is used as the mediator (three profiles), models converge and results confirm hypotheses.
Can I also check in about using the profile variable (with categories 1, 2 and 3) as a continuous mediator in a path model. I note that it cannot be designated as categorical (as it actually is - although highly ordinal with higher values denoting more self-regulation problems), but I recall Linda suggesting in Hong Kong that if entropy and class classification were high it is acceptable to use class in this way in path models? Would you suggest any examples or references to strengthen my case? Thanks so much for your time
The Mplus default for LPA with continuous outcomes is class-invariant, diagonal covariance matrices.
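To make the four variance-covariance structures concrete, the sketch below counts free parameters for each one. It assumes class-specific means for every indicator and K - 1 free class probabilities; the function name and the example of 3 classes with 4 indicators are hypothetical.

```python
def lpa_free_parameters(n_classes: int, n_indicators: int,
                        class_varying: bool, diagonal: bool) -> int:
    """Free parameters in a latent profile model with class-specific means."""
    k, p = n_classes, n_indicators
    class_probs = k - 1                      # class proportions sum to 1
    means = k * p                            # one mean per indicator per class
    per_matrix = p if diagonal else p * (p + 1) // 2
    covariances = per_matrix * (k if class_varying else 1)
    return class_probs + means + covariances


structures = {
    "class-invariant, diagonal": (False, True),
    "class-varying, diagonal": (True, True),
    "class-invariant, unrestricted": (False, False),
    "class-varying, unrestricted": (True, False),
}

for label, (varying, diag) in structures.items():
    print(f"{label}: {lpa_free_parameters(3, 4, varying, diag)} parameters")
```

The class-invariant, diagonal structure is the most parsimonious of the four, and the class-varying, unrestricted structure the most heavily parameterized.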
Using a latent class variable as a mediator involves a more complex analysis. Treating it as an observed ordinal or nominal variable involves understanding the modeling in
Muthén, B. (2011). Applications of causally defined direct and indirect effects in mediation analysis using SEM in Mplus.
which is on our website under Papers, Mediational Modeling. Even if you have high entropy and use Most Likely Class membership plus treat it as ordinal, you can use the ordinary product formula for indirect effects only if you view the mediator as the continuous latent response variable behind the ordinal observed mediator. This needs WLSMV or Bayes estimation, while ML can't do that. Using ML, you can take the approach discussed in the paper above, where the observed variable is used as the mediator, treated either as ordinal or nominal.
Thanks for the feedback, and the reminder of that paper. Thinking more carefully about my research question, I'd love to get your quick thoughts. I realise that my latent profile variable (ordinal 1, 2, 3) is really a predictor, analogous to a 3-level treatment group I guess (levels of self-regulation problems in early childhood). It is not a mediator. In my path models I have only observed variables, no latents; I have assigned the coding for self-regulation problems based on an LPA conducted separately.
I have three confounders: 2 are binary (maternal history of depression and child gender) and 1 is continuous (socio-economic disadvantage). They confound the treatment group, the continuous outcome (middle childhood behaviour problems) and the continuous mediators (parenting).
Given the assumptions set out in the paper I could / should not be looking to claim causal mediation at all, or do you think I should give it a go using the model constraints as set out in the paper?
What are your thoughts on a fairly basic path analysis under the above conditions using WLSMV? Would the model's direct and indirect effects be trustworthy? So far the models converge and fit the data, and the interpretations of effect sizes (small) and directions make substantive sense. I just need to confirm for myself that I can defend the analysis methodology.
If you see your latent profile variable as predictor you may want to do a multiple-group analysis based on it. The confounders then have different effects in the different groups.
For each group you can have a mediation model if that is relevant.
I shouldn't go into suggesting analysis approaches without knowing much more and that is not the aim of Discussion.
Daniel Lee posted on Tuesday, March 17, 2015 - 1:55 pm
Hi Dr. Muthen,
I have conducted a GMM and identified 4 latent class trajectories in a sample of 1500 participants. I noticed that I have less than 30 participants in one class (all other classes have over 200 participants). I wanted to do more analyses with class membership and am not sure what to do with the tiny class! Should I use the 3 class model instead? Or, should I refrain from conducting analyses on members within the tiny class?
Does the small class have a substantive interpretation different from the other classes? Is it an important class?
Daniel Lee posted on Tuesday, March 24, 2015 - 10:42 am
Hi Dr. Muthen, it's distinct with respect to the intercept (its intercept falls between those of class 1 and class 3), but not the slope (all slopes are negative and of similar magnitude).
Would there be computational issues down the line if I run further analyses with a class that has a sample size less than 30, while all other classes have 500+? If so, do you know of any articles/books that I can refer to?