Questions about mixture model
Message/Author
 Anonymous posted on Thursday, April 21, 2005 - 4:29 am
I have estimated a mixture model with three latent variables and continuous and categorical indicators, with one, two, three, and more classes. I would like some clarification about the interpretation of the output file.

1) For instance, the output file of the model with three classes indicates “6 perturbed starting value run(s) did not converge”. What exactly does this mean? Does it have any consequence for the quality of the solution (local vs. global)?

2) TECH8 shows that the loglikelihood increases smoothly and reaches a stable maximum, and the absolute and relative change goes to zero. There are very small fluctuations of the counts in the first iterations, but they remain stable in the later iterations.

3) The maximum loglikelihood is -49025.6. This means that the maximum likelihood is very near 0. Is this realistic? What is the meaning of a likelihood in a neighborhood of zero?
 Linda K. Muthen posted on Thursday, April 21, 2005 - 9:18 am
1. I assume that 6 out of the default of 10 did not converge. The majority should converge and converge with the same loglikelihood. I would try STARTS = 50 5; and if you don't get good results, then try STSCALE = 1;

2. This is good.

3. The loglikelihood that you give is not zero as you show it. The absolute value of the loglikelihood has no real meaning.
 Anonymous posted on Sunday, April 24, 2005 - 2:18 am
Thank you very much Dr Muthén, I am trying to follow your recommendations.
 Anonymous posted on Thursday, September 15, 2005 - 1:20 pm
I have read the preceding messages, and I don't
1. understand where I find the output file referred to in question 1 of the message above (the output file of the model with three classes that indicates “6 perturbed starting value run(s) did not converge”).
2. I also want to ask how I can interpret the estimates and the tests for the thresholds in the output file. Can I use these results in the choice of the starting values? And if the model is not significant (chi-square), can I also use these estimates in order to obtain a good model?
3. I have estimated a mixture model with one and two classes, but I have missing values in my data, and in the output I have covariance coverage. How can I use this output? I have a warning: THE COVARIATE COVERAGE FALLS BELOW THE SPECIFIED LIMIT. What does that mean? I also have another message for the same model:
THE STANDARD ERRORS OF THE MODEL PARAMETER ESTIMATES MAY NOT BE TRUSTWORTHY FOR SOME PARAMETERS DUE TO A NON-POSITIVE DEFINITE
FIRST-ORDER DERIVATIVE PRODUCT MATRIX. THIS MAY BE DUE TO THE STARTING
VALUES BUT MAY ALSO BE AN INDICATION OF MODEL NONIDENTIFICATION. THE
CONDITION NUMBER IS 0.294D-17. PROBLEM INVOLVING PARAMETER 15.

What does that mean?

4. In my same model, I have 30 binary indicators, which makes the frequency table very large for calculating the chi-square. How can I resolve this problem:
THE CHI-SQUARE TEST CANNOT BE COMPUTED BECAUSE THE FREQUENCY TABLE FOR THE
LATENT CLASS INDICATOR MODEL PART IS TOO LARGE.

Thank you very much in advance.
 Linda K. Muthen posted on Thursday, September 15, 2005 - 2:49 pm
1. The output file referred to above is the regular output file generated when Mplus is run.
2. These are tests that the thresholds are significantly different from zero. These tests are typically not useful for determining how to change a model.
3. The covariance coverage message means that you have more than 90 percent missing on one variable. You can see what parameter 15 is by looking at TECH1. That parameter makes the model not identified. Are you using Version 3.13?
4. The chi-square cannot be computed in your case. Instead you can look at the bivariate test of fit in TECH10 in terms of standardized residuals.
 Anonymous posted on Friday, September 16, 2005 - 6:08 am
Thank you for your response.
3. I use Version 3.12.
4. I have added TECH10 to my program, but I don't get any additional output.
I don't know how to interpret the output file. For example, I find the estimates for the thresholds; how can I interpret these? Also, in the TECH1 output, what do lambda and alpha mean?
Can you tell me where I can find the interpretation of Mplus output? I have the User's Guide, but I don't find any example of the output.
Thank you for your help.
 Linda K. Muthen posted on Friday, September 16, 2005 - 7:57 am
It sounds like you need to send your input, data, output, and license number to support@statmodel.com.
 Anonymous posted on Friday, September 16, 2005 - 8:05 am
I have estimated a mixture model with one latent variable, 29 categorical indicators, and two classes. I would like some clarification.
1. TECH8 shows that the likelihood increases smoothly and the absolute and relative change goes to zero for the 10 sets, but the counts are not the same for all the sets. What does that mean?
2. If I can't have the chi-square test because I have so many categorical indicators, can I use TECH11 to choose the model with two classes versus one class? Is it sufficient to do this test?
3. For nested models, we use the test in TECH11 to choose the model, and for models that are not nested we use AIC or BIC. Is that correct?
4. When we say nested models, does that mean models with the same restrictions?

Thank you in advance for your response.
 Gilca Garcia de Oliveira posted on Wednesday, December 07, 2005 - 3:17 pm
what is the interpretation of the log likelihood in general and its negative sign in particular?
 bmuthen posted on Wednesday, December 07, 2005 - 6:31 pm
That's a big topic, but here is a short version. Very loosely speaking (and terribly nonstatistical), you can think of the likelihood as the probability of observing your sample assuming that your sample was drawn from the population your model specifies. The likelihood is a product of likelihood components for each observation (due to independent observations), and taking log turns this product into a sum. The sum is most often negative because the likelihood is most often less than 1 (as with probabilities). In this scale, -500 for example, is a lower (worse) likelihood than -100.
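(A loose numerical sketch of the point above, using a toy Bernoulli sample with made-up values: the likelihood is a product of factors each less than 1, so its log is a sum of negative terms.)

```python
import math

p = 0.7                       # hypothetical model-implied probability of a 1
observations = [1, 1, 0, 1, 1]  # invented sample of independent observations

# Likelihood: a product over observations, each factor < 1.
likelihood = 1.0
for y in observations:
    likelihood *= p if y == 1 else (1 - p)

# Taking the log turns the product into a sum; since each factor < 1,
# every term is negative, so the loglikelihood is negative.
loglik = sum(math.log(p if y == 1 else 1 - p) for y in observations)

print(likelihood)  # 0.7^4 * 0.3, a small positive number
print(loglik)      # a negative number; values closer to 0 mean a higher likelihood
```

On this scale, a loglikelihood of -500 corresponds to a far smaller likelihood than -100, matching the "lower (worse)" reading above.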
 Justin Jager posted on Tuesday, August 15, 2006 - 2:11 pm
I'm new to mixture modeling and I'm not sure what the absolute value of the categorical latent variable(s) actually substantively means. I know that the number of latent categorical variables is equal to g-1, where g is the number of unknown classes. But, what does the value of the categorical variable actually indicate? For example, in one two class solution the final proportions for the two latent classes were .222 and .778, and the value of C#1 was -1.252. At first I thought it was just an odds ratio indicating probability of class membership with the last class as the reference group, but this does not seem to be the case.
 Linda K. Muthen posted on Tuesday, August 15, 2006 - 4:00 pm
1/exp (-1.252) = .778
 Linda K. Muthen posted on Tuesday, August 15, 2006 - 4:12 pm
Correction:

1/(1 + exp (-1.252)) = .778
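(A quick check of the arithmetic: the formula above gives the proportion in the last (reference) class, and the complementary logistic expression gives the class-1 proportion.)

```python
import math

logit_c1 = -1.252  # the [C#1] estimate from the output above

p_class2 = 1 / (1 + math.exp(logit_c1))                # reference (last) class
p_class1 = math.exp(logit_c1) / (1 + math.exp(logit_c1))

print(round(p_class2, 3))  # 0.778
print(round(p_class1, 3))  # 0.222
```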
 Justin Jager posted on Sunday, August 27, 2006 - 12:38 pm
Currently, I'm using growth mixture modeling to identify latent classes of math achievement over a 5-year span. In addition, I want to see what characteristics of the child (gender, for example) predict latent class membership.

In the Mplus manual, Version 4.0, the GMM in Example 8.1 (pg. 175) seems to be the model I want to emulate. That is, it is a two-class solution with an exogenous variable x (which hypothetically could be gender - the predictor I want to include).

Looking at the model in example 8.1, the question I have is do I need to include the arrows from x to i and s if I am only interested in whether or not x predicts latent class membership and I am not interested in how x impacts the growth factors themselves?

I can provide more detail of my model if necessary, but I assume my question is just as clear, if not clearer, when just using the model in Figure 8.1.

Thanks,

Justin
 Bengt O. Muthen posted on Monday, August 28, 2006 - 8:57 am
It is perfectly fine to have x influence only c and not i, s. You can test if x also influences i, s significantly. Often this happens. For example, classes may be high and low. The high class may be more likely for high x values. But being unusually high in the high class may also be associated with a higher x value.
 Fernando Terrés de Ercilla posted on Wednesday, November 29, 2006 - 1:17 am
I've estimated an LPA with 3 count indicators and 2 classes, and several covariates. The model runs OK in Mplus 4.2, but in the plot information I can't see the estimated values or estimated probabilities. Is it my fault?
By the way, congratulations on the new version: I've experienced the better starting values and the speed advantages (up to 4 times quicker with a dual-core processor).
Fer.
 Linda K. Muthen posted on Wednesday, November 29, 2006 - 9:05 am
These plots are not available when there are covariates in the model.

I'm glad you like the quicker computations.
 Fernando Terrés de Ercilla posted on Wednesday, November 29, 2006 - 11:03 am
Sorry, more exactly: in a project for a transport firm, I want to analyse the relationship between workers' characteristics (sex, age, and seniority), kind of machine (among 9 possible vehicles), and the frequency of adjustments they make to their seats and driving wheels, self-reported by 3 counts (weekly adjustments of each element) and 2 ordinal items (Likert-7). This will be part of a bigger ergonomic model, but for the moment I would like to get the relationship between the class probabilities and the covariates.
Now, if in the model I use the 2 ordinal items and the 3 counts, saving the results and the class probabilities, requesting PLOT3 gives me the following plots: Histograms (sample values), Scatterplots (sample values), Sample means, Sample proportions, Estimated means, Sample and estimated means, Estimated probabilities, Observed individual values, and Estimated means and observed individual values.
But if I treat the 3 counts as continuous, then I get the previous plus the following: Histograms (sample values, estimated values), Scatterplots (sample values, estimated values), …, Mixture distributions, and Estimated probabilities for a categorical latent variable as a function of its covariates.
I tell the full history because I could consider other possible modelling options. Thanks in advance, Fer.
 Linda K. Muthen posted on Wednesday, November 29, 2006 - 12:23 pm
You most likely need numerical integration for the model and that precludes you getting those other plots.
 Amber Grundy posted on Sunday, February 04, 2007 - 8:46 am
I am estimating a mixture model with 2 classes of a continuous outcome. I have entered a time-invariant predictor of the intercept, slope, and also of class (using the code in Example 8.1). I just want to make sure that I am interpreting the output correctly. First, for the intercept & slope I get this:

I ON X 0.170 0.049 3.445
S ON X -0.029 0.017 -1.716
(this output is the same for both classes)

Does this mean that X has a significant positive effect on the intercept, regardless of group membership, and that it has a non-signif (but marginal) effect on the slope, again regardless of group membership?

Then, for class, I have

C#1 ON X 0.171 0.051 3.355
Intercepts C#1 -3.618 2.498 -1.448

For the first line, does that mean that if you are high on X, you are more likely to be in C1 than C2?

For the second line, I don't really know what it means. Can you help? Thanks!
 Linda K. Muthen posted on Monday, February 05, 2007 - 9:19 am
Yes to your first question. You can, however, relax the equality of the regressions of i and s on x across classes to see if the relationship is the same in both classes.

And yes to the second.

The intercepts of the categorical latent variable relate to the number of people in each class. This is not a parameter that needs interpretation.
 Amber Grundy posted on Monday, February 05, 2007 - 2:57 pm
Thank you so much for your help!
 Raji Srinivasan posted on Sunday, June 03, 2007 - 10:31 am
Hello,

I have estimated a latent class random effects structural equation model - and have 3 segments.

I am interested in performing Wald tests to compare the sizes of coefficients - is there a statement in Mplus that I can use or do I have to do these tests manually?

Best, Raji
 Linda K. Muthen posted on Monday, June 04, 2007 - 7:22 am
MODEL TEST can be used for Wald tests. See the Mplus User's Guide.
 Allison Holmes Tarkow posted on Monday, March 31, 2008 - 9:40 am
Hello- I've estimated a piecewise GMM with 3 classes for reading achievement (5 waves). I'm running into trouble trying to add a latent predictor of the trajectory class.

Is it possible to include latent factor predictors?

One error I tend to get is that the computer does not have enough memory to calculate the model. Could this actually be the case?

Thanks very much!
 Linda K. Muthen posted on Monday, March 31, 2008 - 10:14 am
Please send your input, data, output, and license number to support@statmodel.com.
 Jon Elhai posted on Thursday, August 28, 2008 - 2:18 pm
Dear Linda/Bengt,
I am running a growth mixture model with 3 latent classes and no covariates. Could you kindly tell me what the following portion of the Mplus output refers to? Does this refer to the difference in mean between the class's slope listed below and the C#3 slope?

Categorical Latent Variables

Means
C#1 -0.546 0.576 -0.948 0.343
C#2 -1.027 0.407 -2.525 0.012
 Linda K. Muthen posted on Friday, August 29, 2008 - 8:38 am
These are the logits for the proportions in each class.
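(Loosely, with three classes the last class's logit is fixed at zero and the class proportions are the softmax of the logits. A numerical check using the estimates above:)

```python
import math

logits = [-0.546, -1.027, 0.0]  # [C#1], [C#2], and the reference class fixed at 0
exps = [math.exp(l) for l in logits]
total = sum(exps)
proportions = [e / total for e in exps]

print([round(p, 3) for p in proportions])  # [0.299, 0.185, 0.516]
```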
 Jon Elhai posted on Friday, August 29, 2008 - 8:55 am
Linda,
With regard to my previous question... Is there a way to determine whether one class' slope is significantly higher/lower than another class' slope?
 Bengt O. Muthen posted on Friday, August 29, 2008 - 9:48 am
You can do that in 2 ways in Mplus.

1. Run the model with slopes held equal and then run with them different. Compute a chi2 test as 2 times the loglikelihood difference.

2. Use Model Test to do Wald testing of equality (see UG).
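(A hedged sketch of approach 1, with invented loglikelihood values. The degrees of freedom equal the number of equality constraints relaxed; note this applies to comparing parameter restrictions within a model with a fixed number of classes, not to comparing numbers of classes.)

```python
# Hypothetical loglikelihoods from the two runs described above.
ll_restricted = -49030.2    # H0: slopes held equal across classes
ll_unrestricted = -49025.6  # H1: slopes free
df = 1                      # one equality constraint relaxed

lr_stat = 2 * (ll_unrestricted - ll_restricted)
print(lr_stat)              # ~9.2
print(lr_stat > 3.841)      # compared with the chi-square critical value
                            # for df = 1 at the 5% level: reject equality
```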
 Sofia Diamantopoulou posted on Friday, November 21, 2008 - 6:58 am
Dear Dr Muthen,

I am estimating a dual trajectory model (3x3 model) and I have trouble reading the output. I wonder how I can get the following information:
1. Conditional probabilities of one latent class membership given the other latent class membership and vice versa.
2. Joint probabilities

Thank you.
 Bengt O. Muthen posted on Friday, November 21, 2008 - 7:55 am
You find the joint probabilities in the output under estimated class frequencies and probabilities estimated by the model. This shows 9 cells corresponding to the 3 x 3 table. From this you can compute the conditional probs by hand (add up the freqs for each row or column of the table and compute the proportions for that row or column).
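(A sketch of that hand computation: starting from the 3 x 3 table of model-estimated joint probabilities, the conditional probabilities are each row, or column, rescaled by its sum. The joint probabilities below are invented for illustration.)

```python
joint = [
    [0.20, 0.05, 0.05],  # rows: classes of the first latent class variable
    [0.10, 0.25, 0.05],  # columns: classes of the second
    [0.05, 0.05, 0.20],
]

def conditional_given_row(joint):
    """P(column class | row class): divide each row by its row sum."""
    return [[cell / sum(row) for cell in row] for row in joint]

cond = conditional_given_row(joint)
print([round(p, 3) for p in cond[0]])  # conditional probs for row class 1
```

Conditioning the other way is the same computation on the columns.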
 Sofia Diamantopoulou posted on Thursday, November 27, 2008 - 7:56 am
Thank you for your reply. However, I have another question. When I run the dual trajectory model the intercepts for some classes becomes negative. How am I to understand these intercepts?

PS. My data are censored (b).
 Bengt O. Muthen posted on Saturday, November 29, 2008 - 9:29 am
Note that intercepts are not means. Check Tech4 or the RESIDUAL option of the OUTPUT command to see what the means are. Also, with censored (b) a negative mean could imply that more than 50% are at the lowest censoring point of zero - that much censoring can only happen with a negative mean for the latent response variable y* that is censored. See literature on censored-normal response variables, such as Maddala's book.
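(A loose numerical sketch of that last point: for y* ~ N(mu, sigma^2) censored from below at zero, the share piled up at the censoring point is P(y* <= 0) = Phi(-mu/sigma), which exceeds 50% exactly when mu is negative. Values below are illustrative.)

```python
import math

def normal_cdf(x):
    """Standard normal CDF via the error function."""
    return 0.5 * (1 + math.erf(x / math.sqrt(2)))

def share_at_floor(mu, sigma):
    """P(y* <= 0) for y* ~ N(mu, sigma^2), censored from below at 0."""
    return normal_cdf(-mu / sigma)

print(share_at_floor(-0.5, 1.0))  # ~0.69: more than half at the floor
print(share_at_floor(0.5, 1.0))   # ~0.31: less than half
```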
 Sofia Diamantopoulou posted on Friday, December 05, 2008 - 5:40 am
How do I get the class membership of individuals in a dual trajectory model?
 Linda K. Muthen posted on Friday, December 05, 2008 - 6:23 am
You would use the CPROBABILITIES option of the SAVEDATA command.
 Aidan G. Wright posted on Wednesday, December 29, 2010 - 6:20 pm
Dear Drs. Muthen,

I also have received the following warning:
THE CHI-SQUARE TEST CANNOT BE COMPUTED BECAUSE THE FREQUENCY TABLE FOR THE
LATENT CLASS INDICATOR MODEL PART IS TOO LARGE.

I see above that your recommendation has been to look at the TECH 10 output. I was hoping you could provide some more guidance or reference to a paper/primer on how to use TECH10 more specifically in this instance.

Thank you in advance for your help.
 Linda K. Muthen posted on Thursday, December 30, 2010 - 5:50 am
TECH10 contains univariate, bivariate, and response pattern standardized residuals. Look for values over 1.96 to see where the model does not fit.
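(A minimal sketch of that screening rule: flag standardized residuals whose absolute value exceeds 1.96, the two-sided 5% z critical value. The residual values and variable names below are invented.)

```python
# Hypothetical TECH10-style bivariate standardized residuals.
residuals = {
    ("u1", "u2"): 0.42,
    ("u1", "u3"): -2.31,
    ("u2", "u3"): 1.10,
}

flagged = {pair: z for pair, z in residuals.items() if abs(z) > 1.96}
print(flagged)  # only the (u1, u3) pair signals bivariate misfit
```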
 J.D. Haltigan posted on Monday, January 13, 2014 - 9:55 am
Hi:

I have a situation in which I am loosely trying to explore the latent structure or class structure of a high-dimensional observational coding system (100+ indicators, frequency counts). I knew in advance that, given the indicator:cases ratio, I would likely have a nonidentified model, which is the case. However, the LCA mixture model does converge despite the relevant warnings of untrustworthy standard errors etc.

I have two questions:

How does a model converge when the number of parameters is greater than the sample size? In this case, is the output still loosely interpretable?

As analogous EFA models do NOT converge, I was wondering if there are any other approaches I might take, given the limitations of the indicator:sample size ratio, to explore the structure of the observational coding system.
 Bengt O. Muthen posted on Monday, January 13, 2014 - 4:14 pm
You can get a converged likelihood even for non-identified models; having more parameters than subjects is also not a problem. You should do a simulation study to see if you can recover the model that generated the data.

Using very parsimonious models is a way to get around a small sample size. E.g. CFA instead of EFA.
 J.D. Haltigan posted on Sunday, January 19, 2014 - 6:01 pm
As a follow-up to my last question re: LCA models converging despite the non-identification...

In my case past a 2-class model (i.e., 3-5) the models converge but the best log-likelihood is not replicated (no matter how much I increase the random starts). I assume the chief reason for the non-replication is that, given that the data is count and heavily skewed to zero, I am trying to extract more classes (past 2) than are represented in the data. All else being equal, would this be a reasonable assumption?
 Bengt O. Muthen posted on Monday, January 20, 2014 - 5:08 pm
I think so.
 J.D. Haltigan posted on Tuesday, January 21, 2014 - 12:40 pm
Thanks much. I want to make the case that I can explore the class structure of the 4-class model (it has the lowest BIC) of 1-5 model runs. That said, only the 2-class LL replicates. Given that the indicator:sample size ratio is not going to get any better, could I make a case that despite the failure to replicate the best LL, I am choosing to examine it (exploratory, all cautions noted) based strictly on the lowest BIC of 1-5 model runs?
 Bengt O. Muthen posted on Tuesday, January 21, 2014 - 5:31 pm
I think you want to replicate it to be able to trust it. Perhaps it might help replicating it by choosing a smaller STSCALE value (see UG), for instance going from 5 to 1.
 J.D. Haltigan posted on Sunday, January 26, 2014 - 10:53 am
Hello:

One of the tweaks I made was to specify inflation (i) for the count variable portion of the model. Previously I had simply specified the variables as count (with no inflation). Is this appropriate given that my indicators are heavily skewed towards zero counts?
 Bengt O. Muthen posted on Sunday, January 26, 2014 - 12:56 pm
I'd go by what BIC suggests. See the different count modeling alternatives in the count regression example on slides 39-43 of our Topic 5 handout on our website.
 J.D. Haltigan posted on Sunday, January 26, 2014 - 8:50 pm
Thanks! Much longer computational time I am finding out for the (nb) and (i) count models. Will either of these models impact the ability of the ll to replicate relative to the default Poisson count model?
 Bengt O. Muthen posted on Monday, January 27, 2014 - 9:34 am
Can't give a general answer to that.
 Maja Flaig posted on Saturday, February 07, 2015 - 5:43 am
Dear Drs. Muthen,

I'd like to ask your opinion on an analysis. The analysis concerns data on students' concepts of depression. I am wondering whether there are distinguishable concepts in my sample, so I thought of conducting mixture analyses in Mplus.

I used 42 items to measure participants' concepts of depression; now I am wondering whether these are too many variables to conduct an LCA. My sample size is 340. Also, the results would be quite difficult to interpret with that many variables. In this case, would you recommend CFA mixture models to facilitate interpretation of latent classes?

Thank you in advance for your help.
Best regards,

Maja
 Bengt O. Muthen posted on Saturday, February 07, 2015 - 2:52 pm
LCA doesn't so much tell about distinguishable concepts (having to do with variables) as distinguishable people. You may want to do EFA.
 Maja Flaig posted on Monday, February 09, 2015 - 6:15 am
Thanks for your quick response.

Well, I hypothesize that different people have different concepts, i.e. subjective theories of depression.

In this case, would you still recommend EFA?
 Linda K. Muthen posted on Monday, February 09, 2015 - 9:13 am
If you want to group people, use LCA. If you want to group variables, use EFA.
 ruthjlee posted on Thursday, May 14, 2015 - 10:40 pm
Hello,

A colleague has suggested that we perform an EFA within a CFA framework and use the factor scores in the LCA, in order to avoid interpreting a large number of variable loadings within classes. However, I noticed a relevant discussion on this forum (Sept 18 2012, 10.44am). The questioner was advised that if a factor is used as a variable in an LCA, '... the indicator may have a direct relationship to the categorical latent variable'.

- Does 'indicator' here refer to the manifest variables that were loaded onto the factor before the factor was used in the LCA?

- If so, is the issue here that the direct relationship of some of the manifest variables with the categorical latent variable might differ substantially from the relationship between the factor score and the categorical latent variable?

- Would this problem be mitigated if we used factors rather than factor scores?

Many thanks in advance!
 Bengt O. Muthen posted on Friday, May 15, 2015 - 5:52 pm
Q1. Yes.

Q2. Yes.

Q3. No.
 'Alim Beveridge posted on Thursday, May 18, 2017 - 11:47 pm
Dear Bengt and Linda,

I am conducting an LCA with 47 categorical items (binary and ordinal) but a fairly small sample size (N=158). Some variables are highly skewed (>90% 0 or 1); there is some missing data, and the BASIC analysis warns of zero cells and high sample correlations between variables (eg .986). BIC favors the 2-class solution.
The chi-square stat is not shown because the table is too large. Entropy and ALCP are high (> .96). But, there are many low univariate entropies and the classes are not interpretable with so many variables.

I get many warnings of this kind:
IN THE OPTIMIZATION, ONE OR MORE LOGIT THRESHOLDS ...

And also that some SEs may not be trustworthy due to a non-positive definite first-order derivative product matrix, and THE NUMBER OF PARAMETERS IS GREATER THAN THE SAMPLE SIZE.

1. Is there a recommended systematic procedure for dropping variables that are not contributing to the solution? For instance, dropping those with low univ. entropy one-by-one and rerunning the LCA each time?
2. Beside dropping variables, is there anything I should consider that might help improve the solution? For instance, could FMA be useful?
3. Should checking for local independence come before, during or after the process of dropping variables?

Thanks!
 Bengt O. Muthen posted on Friday, May 19, 2017 - 11:37 am
1. Perhaps the univariate entropies can be used step-wise as you suggest, but this is really a research question.

2. You already have too many parameters.

3. That's a research question.