Anonymous posted on Thursday, April 21, 2005 - 4:29 am
I have estimated a mixture model with three latent variables and continuous and categorical indicators, with one, two, three, and more classes. I would like some clarification about the interpretation of the output file.
1) For instance, the output file of the model with three classes indicates “6 perturbed starting value run(s) did not converge”. What does this mean exactly? Does it have any consequence for the quality of the solution (local vs. global maximum)?
2) TECH8 shows that the loglikelihood increases smoothly and reaches a stable maximum, and the absolute and relative change go to zero. There are very small fluctuations of the class counts in the first iterations, but they remain stable in the other iterations.
3) The maximum loglikelihood is -49025.6. This means that the maximum likelihood is very near 0. Is this realistic? What is the significance of a likelihood in a neighborhood of zero?
1. I assume that 6 out of the default 10 did not converge. The majority should converge, and converge to the same loglikelihood. I would try STARTS = 50 5; and if you don't get good results, then try STSCALE = 1;
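In Mplus input terms, these options go under the ANALYSIS command; a minimal sketch (everything other than the two options quoted above is standard mixture boilerplate):

```
ANALYSIS:
  TYPE = MIXTURE;
  STARTS = 50 5;   ! 50 random sets of starting values, 5 final-stage optimizations
  STSCALE = 1;     ! smaller perturbation of starting values, to try if results are still poor
```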
2. This is good.
3. The loglikelihood that you give is not near zero, as the value itself shows; it is the likelihood, exp(-49025.6), that is vanishingly small. The absolute value of the loglikelihood has no real meaning on its own.
Anonymous posted on Sunday, April 24, 2005 - 2:18 am
Thank you very much, Dr. Muthén. I am trying to follow your recommendations.
Anonymous posted on Thursday, September 15, 2005 - 1:20 pm
I have read the preceding messages, and I have some questions. 1. I don't understand where to find the output file (question 1 in the message above, where the output file of the three-class model indicates “6 perturbed starting value run(s) did not converge”). 2. I also want to ask how to interpret the estimates and tests for the thresholds in the output file. Can I use these results in the choice of starting values, and if the model is not significant (chi-square), can I still use these estimates in order to get a good model? 3. I have run a mixture model with one and two classes, but I have missing values in my data, and the output shows covariance coverage. How do I use this output? I also get a warning: THE COVARIATE COVERAGE FALLS BELOW THE SPECIFIED LIMIT. What does this mean? I also get another message for the same model: THE STANDARD ERRORS OF THE MODEL PARAMETER ESTIMATES MAY NOT BE TRUSTWORTHY FOR SOME PARAMETERS DUE TO A NON-POSITIVE DEFINITE FIRST-ORDER DERIVATIVE PRODUCT MATRIX. THIS MAY BE DUE TO THE STARTING VALUES BUT MAY ALSO BE AN INDICATION OF MODEL NONIDENTIFICATION. THE CONDITION NUMBER IS 0.294D-17. PROBLEM INVOLVING PARAMETER 15.
What does this mean?
4. In the same model, I have 30 binary indicators, which makes the frequency table very large for computing the chi-square. How can I resolve this problem? THE CHI-SQUARE TEST CANNOT BE COMPUTED BECAUSE THE FREQUENCY TABLE FOR THE LATENT CLASS INDICATOR MODEL PART IS TOO LARGE.
1. The output file referred to above is the regular output file generated when Mplus is run. 2. These are tests that the thresholds are significantly different from zero. These tests are typically not useful for determining how to change a model. 3. The covariance coverage message means that you have more than 90 percent missing on one variable. You can see what parameter 15 is by looking at TECH1; that parameter makes the model not identified. Are you using Version 3.13? 4. The chi-square cannot be computed in your case. Instead, you can look at the bivariate tests of fit in TECH10 in terms of standardized residuals.
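As a rough sketch of what the standardized residuals in point 4 look like, here is one common z-type residual for a single cell of a bivariate frequency table. The cell proportions and sample size below are hypothetical, and the exact formula Mplus uses is described with the TECH10 documentation:

```python
import math

def std_residual(obs_prop, est_prop, n):
    # z-type residual: the gap between observed and model-estimated
    # proportions, divided by the binomial standard error of the estimate
    se = math.sqrt(est_prop * (1.0 - est_prop) / n)
    return (obs_prop - est_prop) / se

# Hypothetical cell: 12% observed vs. 8% model-estimated, n = 500
z = std_residual(0.12, 0.08, 500)
print(round(z, 2))  # 3.3; |z| beyond about 1.96 flags poor fit for that cell
```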
Anonymous posted on Friday, September 16, 2005 - 6:08 am
Thank you for your response. 3. I use Version 3.12. 4. I have added TECH10 to my program, but I don't get any additional output. I don't know how to interpret the output file. For example, I find the estimates for the thresholds; how can I interpret these? Also, in the TECH1 output, what do lambda and alpha mean? Can you tell me where I can find guidance on interpreting Mplus output? I have the User's Guide, but I don't find any example of the output. Thank you for your help.
Anonymous posted on Friday, September 16, 2005 - 8:05 am
I have estimated a mixture model with one latent variable, 29 categorical indicators, and two classes. I would like some clarification. 1. TECH8 shows that the loglikelihood increases smoothly and the absolute and relative change go to zero for the 10 sets, but the counts are not the same for all the sets. What does this mean? 2. If I can't get the chi-square test because I have so many categorical indicators, can I use TECH11 to choose the two-class model versus the one-class model? Is that test sufficient? 3. For nested models, we use the test in TECH11 to choose the model, and for non-nested models we use AIC or BIC. Is that correct? 4. When we say nested models, does that mean models with the same restrictions?
What is the interpretation of the loglikelihood in general, and of its negative sign in particular?
bmuthen posted on Wednesday, December 07, 2005 - 6:31 pm
That's a big topic, but here is a short version. Very loosely speaking (and terribly nonstatistical), you can think of the likelihood as the probability of observing your sample assuming that your sample was drawn from the population your model specifies. The likelihood is a product of likelihood components for each observation (due to independent observations), and taking the log turns this product into a sum. The sum is most often negative because the likelihood components are most often less than 1 (as with probabilities). On this scale, -500, for example, is a lower (worse) loglikelihood than -100.
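To make this concrete, here is a toy numerical sketch; the sample values and the N(0,1) model are made up for illustration:

```python
import math

def normal_logpdf(x, mu=0.0, sigma=1.0):
    # Log density of a normal distribution N(mu, sigma^2)
    return -0.5 * math.log(2 * math.pi * sigma ** 2) - (x - mu) ** 2 / (2 * sigma ** 2)

sample = [-0.3, 0.1, 1.2, -0.8, 0.4]  # hypothetical observations

# Independent observations: the likelihood is a product of per-observation
# densities, so the loglikelihood is a sum of log densities.
loglik = sum(normal_logpdf(x) for x in sample)
print(loglik)  # negative, because each density here is below 1

# A badly misspecified mean gives a much lower (worse) loglikelihood
loglik_bad = sum(normal_logpdf(x, mu=5.0) for x in sample)
print(loglik_bad < loglik)  # True
```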
I'm new to mixture modeling and I'm not sure what the value reported for the categorical latent variable(s) actually means substantively. I know that the number of C# values is equal to g-1, where g is the number of unknown classes. But what does the value of the categorical variable actually indicate? For example, in one two-class solution the final proportions for the two latent classes were .222 and .778, and the value of C#1 was -1.252. At first I thought it was just an odds ratio indicating the probability of class membership with the last class as the reference group, but this does not seem to be the case.
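For what it's worth, the reported value does match a log-odds (logit) reading with the last class as the reference; a quick check with the proportions quoted above:

```python
import math

# Final class proportions reported in the two-class solution above
p1, p2 = 0.222, 0.778

# With the last class as the reference, the mean of C#1 is the log-odds
# of membership in class 1 versus class 2.
logit = math.log(p1 / p2)
print(round(logit, 3))  # -1.254, matching the reported -1.252 up to rounding

# Inverting: the softmax of (logit, 0) recovers the class proportions
denom = math.exp(logit) + 1.0
print(round(math.exp(logit) / denom, 3), round(1.0 / denom, 3))  # 0.222 0.778
```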
Currently, I'm using growth mixture modeling to identify latent classes of math achievement over a 5-year span. In addition, I want to see what characteristics of the child (gender, for example) predict latent class membership.
In the Mplus User's Guide Version 4.0, the GMM in Example 8.1 (p. 175) seems to be the model I want to emulate. That is, it is a two-class solution with an exogenous variable x (which hypothetically could be gender, the predictor I want to include).
Looking at the model in Example 8.1, my question is: do I need to include the arrows from x to i and s if I am only interested in whether x predicts latent class membership, and not in how x impacts the growth factors themselves?
I can provide more detail of my model if necessary, but I assume my question is just as clear, if not clearer, when just using the model in Figure 8.1.
It is perfectly fine to have x influence only c and not i and s. You can test whether x also influences i and s significantly. Often this happens. For example, the classes may be high and low. The high class may be more likely for high x values. But being unusually high within the high class may also be associated with a higher x value.
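A sketch of the corresponding MODEL lines, following Example 8.1; the outcome names y1-y4 and the time scores are hypothetical:

```
MODEL:
  %OVERALL%
  i s | y1@0 y2@1 y3@2 y4@3;   ! intercept and slope growth factors
  c ON x;                      ! x predicts latent class membership only
  ! add "i s ON x;" to also test whether x influences the growth factors
```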
I've estimated an LPA with 3 count indicators, 2 classes, and several covariates. The model runs fine in Mplus 4.2, but in the plot information I can't see the estimated values or estimated probabilities. Is it my fault? By the way, congratulations on the new version: I've experienced the better starting values and the speed advantages (up to 4 times quicker with a dual-core processor). Fer.
Sorry, let me be more precise: in a project for a transport firm, I want to analyze the relationship between workers' characteristics (sex, age, and seniority), kind of machine (among 9 possible vehicles), and the frequency of adjustments they make to their seats and steering wheels, self-reported via 3 counts (weekly adjustments of each element) and 2 ordinal items (Likert-7). This will be part of a bigger ergonomic model, but for the moment I would like to get the relationship between the class probabilities and the covariates. Now, if I use the 2 ordinal items and the 3 counts in the model, saving the results and the class probabilities, requesting PLOT3 gives me the following plots: Histograms (sample values), Scatterplots (sample values), Sample means, Sample proportions, Estimated means, Sample and estimated means, Estimated probabilities, Observed individual values, and Estimated means and observed individual values. But if I treat the 3 counts as continuous, then I get the previous plots plus the following: Histograms (sample values, estimated values), Scatterplots (sample values, estimated values), …, Mixture distributions, and Estimated probabilities for a categorical latent variable as a function of its covariates. I give the full story because I could consider other possible modeling options. Thanks in advance, Fer.
I am estimating a mixture model with 2 classes of a continuous outcome. I have entered a time-invariant predictor of the intercept, the slope, and also of class (using the code in Example 8.1). I just want to make sure that I am interpreting the output correctly. First, for the intercept and slope I get this:
I ON X          0.170      0.049      3.445
S ON X         -0.029      0.017     -1.716
(this output is the same for both classes)
Does this mean that X has a significant positive effect on the intercept, regardless of group membership, and that it has a non-signif (but marginal) effect on the slope, again regardless of group membership?
Then, for class, I have
C#1 ON X        0.171      0.051      3.355
Intercepts
C#1            -3.618      2.498     -1.448
For the first line, does that mean that if you are high on X, you are more likely to be in C1 than C2?
For the second line, I don't really know what it means. Can you help? Thanks!
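On the second line: the C#1 intercept is the log-odds of class 1 versus the reference class when x = 0. A hedged sketch of how the two estimates above combine into a class-1 probability, assuming the standard parameterization with the last class as reference (the x values plugged in are hypothetical):

```python
import math

# Estimates from the output above: C#1 ON X slope and C#1 intercept,
# with the last class (C2) as the reference class.
slope, intercept = 0.171, -3.618

def p_class1(x):
    # Two-class multinomial (i.e., binary) logistic regression:
    # log[P(C=1|x) / P(C=2|x)] = intercept + slope * x
    z = intercept + slope * x
    return math.exp(z) / (1.0 + math.exp(z))

# The positive slope means a higher x raises the probability of class 1
print(p_class1(0.0) < p_class1(20.0))  # True
```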
Jon Elhai posted on Thursday, August 28, 2008 - 2:18 pm
Dear Linda/Bengt, I am running a growth mixture model with 3 latent classes and no covariates. Could you kindly tell me what the following portion of the Mplus output refers to? Does it refer to the difference in means between each class's slope listed below and the C#3 slope?
I am estimating a dual trajectory model (3x3 model) and I have trouble reading the output. I wonder how I can get the following information: 1. Conditional probabilities of one latent class membership given the other latent class membership and vice versa. 2. Joint probabilities
You find the joint probabilities in the output under estimated class frequencies and probabilities estimated by the model. This shows 9 cells corresponding to the 3 x 3 table. From this you can compute the conditional probabilities by hand: add up the frequencies for each row or column of the table and compute the proportions within that row or column.
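The computation can be sketched as follows; the joint probability table here is hypothetical:

```python
# Hypothetical 3 x 3 joint probability table: rows index classes of the
# first latent class variable, columns classes of the second.
joint = [
    [0.10, 0.05, 0.05],
    [0.05, 0.30, 0.05],
    [0.05, 0.05, 0.30],
]

# P(column class | row class): divide each cell by its row total
for row in joint:
    total = sum(row)
    print([round(p / total, 3) for p in row])  # first row: [0.5, 0.25, 0.25]
```

Conditioning the other way (column class given row class, vice versa) works the same with column totals.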
Note that intercepts are not means. Check TECH4 or the RESIDUAL option of the OUTPUT command to see what the means are. Also, with censored (b), a negative mean could imply that more than 50% are at the lowest censoring point of zero; that much censoring can only happen with a negative mean for the latent response variable y* that is censored. See the literature on censored-normal response variables, such as Maddala's book.
I also have received the following warning: THE CHI-SQUARE TEST CANNOT BE COMPUTED BECAUSE THE FREQUENCY TABLE FOR THE LATENT CLASS INDICATOR MODEL PART IS TOO LARGE.
I see above that your recommendation has been to look at the TECH10 output. I was hoping you could provide some more guidance, or a reference to a paper or primer, on how to use TECH10 in this instance.
I have a situation in which I am loosely trying to explore the latent structure or class structure of a high-dimensional observational coding system (100+ indicators, frequency counts). I knew in advance that, given the indicator-to-case ratio, I would likely have a non-identified model, which is the case. However, the LCA mixture model does converge despite the relevant warnings of untrustworthy standard errors, etc.
I have two questions:
How does a model converge when the number of parameters is greater than the sample size? In this case, is the output still loosely interpretable?
As analogous EFA models do NOT converge, I was wondering whether there are any other approaches I might take, given the limitations of the indicator-to-sample-size ratio, to explore the structure of the observational coding system.
You can get a converged likelihood even for non-identified models; models with more parameters than subjects are also not a problem. You should do a simulation study to see if you can recover the model that generated the data.
Using very parsimonious models is a way to get around a small sample size. E.g. CFA instead of EFA.
As a follow-up to my last question re: LCA models converging despite the non-identification...
In my case, past a 2-class model (i.e., 3-5 classes) the models converge but the best loglikelihood is not replicated (no matter how much I increase the random starts). I assume the chief reason for the non-replication is that, given that the data are counts heavily skewed toward zero, I am trying to extract more classes (past 2) than are represented in the data. All else being equal, would this be a reasonable assumption?
Thanks much. I want to make the case that I can explore the class structure of the 4-class model, which has the lowest BIC of the 1- through 5-class runs. That said, only the 2-class LL replicates. Given that the indicator-to-sample-size ratio is not going to get any better, could I make a case that, despite the failure to replicate the best LL, I am choosing to examine it (exploratory, all cautions noted) based strictly on its having the lowest BIC?
One of the tweaks I made was to specify inflation (i) for the count-variable portion of the model. Previously I had simply specified the variables as count (with no inflation). Is this appropriate given that my indicators are heavily skewed toward zero counts?
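For intuition, here is a sketch of what the inflation part does; the mixing probability and Poisson rate below are made up:

```python
import math

def zip_pmf(y, pi_zero, lam):
    # Zero-inflated Poisson: with probability pi_zero the count is a
    # structural zero; otherwise it comes from Poisson(lam).
    pois = math.exp(-lam) * lam ** y / math.factorial(y)
    return pi_zero * (y == 0) + (1.0 - pi_zero) * pois

pi_zero, lam = 0.4, 2.0
# A plain Poisson(2) puts only about 13.5% of its mass at zero; the
# inflated model puts about 48%, matching indicators piled up at zero.
print(round(math.exp(-lam), 3), round(zip_pmf(0, pi_zero, lam), 3))  # 0.135 0.481
```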
Thanks! I am finding much longer computational times for the (nb) and (i) count models. Will either of these models affect the ability of the LL to replicate, relative to the default Poisson count model?
Maja Flaig posted on Saturday, February 07, 2015 - 5:43 am
Dear Drs. Muthen,
I'd like to ask your opinion on an analysis. The analysis concerns data on students' concepts of depression. I am wondering whether there are distinguishable concepts in my sample, so I thought of conducting mixture analyses in Mplus.
I used 42 items to measure participants' concepts of depression; now I am wondering whether these are too many variables for an LCA. My sample size is 340. Also, the results would be quite difficult to interpret with that many variables. In this case, would you recommend CFA mixture models to facilitate interpretation of the latent classes?
If you want to group people, use LCA. If you want to group variables, use EFA.
ruthjlee posted on Thursday, May 14, 2015 - 10:40 pm
A colleague has suggested that we perform an EFA within a CFA framework and use the factor scores in the LCA, in order to avoid interpreting a large number of variable loadings within classes. However, I noticed a relevant discussion on this forum (Sept 18 2012, 10.44am). The questioner was advised that if a factor is used as a variable in an LCA, '... the indicator may have a direct relationship to the categorical latent variable'.
- Does 'indicator' here refer to the manifest variables that loaded onto the factor before the factor was used in the LCA?
- If so, is the issue here that the direct relationship of some of the manifest variables with the categorical latent variable might differ substantially from the relationship between the factor score and the categorical latent variable?
- Would this problem be mitigated if we used factors rather than factor scores?
I am conducting an LCA with 47 categorical items (binary and ordinal) but a fairly small sample size (N=158). Some variables are highly skewed (>90% responding 0 or 1); there is some missing data, and the BASIC analysis warns of zero cells and high sample correlations between variables (e.g., .986). BIC favors the 2-class solution. The chi-square statistic is not shown because the table is too large. Entropy and ALCP are high (>.96). But there are many low univariate entropies, and the classes are not interpretable with so many variables.
I get many warnings of this kind: IN THE OPTIMIZATION, ONE OR MORE LOGIT THRESHOLDS ...
And also warnings that some SEs may not be trustworthy due to a non-positive definite first-order derivative product matrix, and that THE NUMBER OF PARAMETERS IS GREATER THAN THE SAMPLE SIZE.
1. Is there a recommended systematic procedure for dropping variables that are not contributing to the solution? For instance, dropping those with low univariate entropy one by one and rerunning the LCA each time? 2. Besides dropping variables, is there anything I should consider that might help improve the solution? For instance, could FMA be useful? 3. Should checking for local independence come before, during, or after the process of dropping variables?