
Anonymous posted on Friday, October 29, 1999  11:46 am



I notice that I get different solutions when I use different starting values. Why is this? 


If you are getting different solutions, then you are hitting local maxima. The best solution is the one with the highest loglikelihood. It is important, however, in mixture modeling to confirm that your solution is not just a local solution by checking that different starting values lead to the same solution. 

Anonymous posted on Friday, July 21, 2000  6:03 am



What is the best way to choose starting values? I first hypothesized starting values for a simple correlation model with differing means, cov's and variances, and when that failed, I did a cluster analysis which broke the data set into two virtually equal groups. I took the estimates from these two clusters for the means, cov's and variances as the starting values, and ran the model. Even though the two groups were equally balanced, in the mixture model, one ran to 0 in the third step and the model failed again. I have tested this model, and others on various data sets, and continue to have the same problem. Any help would be appreciated. 

bmuthen posted on Friday, July 21, 2000  7:57 am



Mixture modeling can be difficult when many or all parameters are allowed to vary across the latent classes. A zero class is often a result of too much model flexibility. A useful analysis strategy is to start with a more restrictive model, such as allowing only the means to vary across the latent classes, perhaps also letting the within-class covariance matrix be diagonal as in latent profile analysis. Once a solution has been found for this model, other parameters can be allowed to vary across the classes, one step at a time, checking how the likelihood improves. For example, after the model with free means is successful, covariances can be freed across classes and then variances, using starting values from previous runs. 


Based on two variables X1 and X2, a researcher hypothesized three clusters. I wanted to test the three-cluster assumption by estimating a mixture model with 2 observed variables, X1 and X2. I chose the starting values based on a scatterplot of the two variables. I have three questions. First, why is it taking about 10-13 hours to find a solution (N=5000+)? Second, the probabilities of being in certain classes appear to be disproportionate to the number of cases that, based on the scatterplot, should be associated with a certain class. Am I doing something incorrectly? Third, while reading the scatterplot seems to suggest five classes, I cannot get a model with 5 classes to converge. I have attached my input.

TITLE:
DATA: FILE IS "C:\WINDOWS\Desktop\MPLUS\DAT_Files\Mixture.dat";
VARIABLE: NAMES ARE X1 X2;
  MISSING ARE ALL (9999);
  CLASSES = c(5);
ANALYSIS: TYPE IS MIXTURE;
  ESTIMATOR IS MLR;
  ITERATIONS = 4000;
  CONVERGENCE = 0.0001;
  MITERATION = 4000;
  MCONVERGENCE = 0.0001;
  MCITERATION = 4000;
  MCCONVERGENCE = 0.0001;
  MUITERATION = 4000;
  MUCONVERGENCE = 0.0001;
  MIXU = CONVERGENCE;
MODEL:
  %OVERALL%
  %c#1%
  [X1*115]; [X2*2];
  %c#2%
  [X1*75]; [X2*10];
  %c#3%
  [X1*80]; [X2*20];
  %c#4%
  [X1*50]; [X2*0];
  %c#5%
  [X1*80]; [X2*10];
OUTPUT:
SAVEDATA: FILE IS c:\Windows\Desktop\Prob.txt;
  FORMAT IS FIXED;
  SAVE = CPROB;


One question is whether for each latent class you want (1) uncorrelated x variables or (2) correlated x variables. I think the way you are setting it up gives model type (1), which is the Latent Profile Analysis type of model. Alternatively, you may search for latent classes where the x's are correlated; the Mplus web site examples mix8-mix11 and the reference to the Everitt book are useful for such models. It is well known in the statistical literature on finite mixture modeling that model type (2) can give difficulties. Model type (1) is typically easier, at least if you have class-invariant variances as your input specifies. Convergence can be slowed down for several reasons. First, it is useful to put your variables on a similar scale with variances in the 1-10 range. Second, you have increased the default number of iterations to 4000 in four instances and I would suggest using the defaults except for MITERATIONS, which you need to increase to get convergence. Once you have a solution, you may want to relax the specification of equal variances across the latent classes; this can have a large effect on the classification of individuals. 
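As a rough illustration of the rescaling advice, here is a minimal sketch in Python (not Mplus). The data values, the helper name `rescale_to_target_variance`, and the target variance of 5 (an arbitrary point inside the 1-10 range) are all assumptions made up for this example.

```python
import statistics

def rescale_to_target_variance(values, target_variance=5.0):
    """Divide a variable by a constant so that its variance equals target_variance."""
    current = statistics.pvariance(values)
    divisor = (current / target_variance) ** 0.5
    return [v / divisor for v in values], divisor

# Hypothetical X1-like values on a large scale.
x1 = [115.0, 75.0, 80.0, 50.0, 80.0, 120.0, 60.0, 95.0]
x1_rescaled, divisor = rescale_to_target_variance(x1)

print("divisor:", round(divisor, 3))
print("variance after rescaling:", round(statistics.pvariance(x1_rescaled), 6))
```

Dividing a variable by a constant also divides its means by that constant, so any mean starting values would need the same division.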


I have one point of clarification on the answer above related to the time it took to converge. In mixture modeling, convergence is evaluated differently than in regular SEM modeling. In Mplus Version 1, the number of iterations for the mixture part of the model is not the maximum number but the actual number of iterations. So by choosing 4000 for MITERATIONS, MCITERATIONS, and MUITERATIONS instead of using the defaults, the time to converge is increased dramatically. As mentioned above, use the defaults unless a message is received to do otherwise. 

Anonymous posted on Friday, June 15, 2001  9:56 am



This question is regarding identification in a mixture model. It appears from the text of the manual that standard errors won't be computed if the model is not identified and an error message will appear. However, I have a mixture model that ran with 42 parameters (14 per class with 3 classes) and only 34 var/covar in the sample data. Is it possible to not be identified and get a solution (albeit inappropriate)? 

bmuthen posted on Friday, June 15, 2001  10:12 am



Identification in mixture models is not the same as for conventional covariance structure models. For the latter you consider the covariance matrix as the sufficient statistics because you are in a normal-theory framework. For the former, however, you have no sufficient statistics less than the raw data because you do not assume normality. You only assume conditional normality given covariates within each class, which can give rise to very non-normal outcomes. With 34 (36?) var-cov elements it sounds as if you have 8 outcomes, which often could support that many parameters, depending on the model. I would say that the invertibility of the information matrix is a rather trustworthy index of local model identifiability. 

Anonymous posted on Friday, June 15, 2001  11:51 am



The ultimate model I want to test is a five-class model with 7 measured outcomes (28 var/cov, not 34; oops) and 74 parameters [14 per class + 4 from ALPHA(C)], which seems to be unrealistic. It appears you are suggesting that if I get a solution (e.g., the matrix is invertible) then I can trust the results (this is a latent profile model with seven continuous measures). 


Fourteen parameters per class does seem to be an overly unrestricted model that may not be identified. If it does converge and the information matrix is invertible, then your model is most likely identified. It sounds like you have a latent profile model with 7 means and 7 variances varying across classes. Typically having variances varying across classes in LPA is difficult. You can start with a more restricted model where the variances are held equal across classes. Then take the solution as starting values for a model with class-varying variances. 
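The parameter counts in this exchange can be checked with a short sketch. It is written in Python for illustration only; the function names are made up, and it assumes a latent profile model with no within-class covariances, where each class has its own means and the class proportions take K-1 logits.

```python
def n_variances_and_covariances(n_indicators):
    """Distinct elements of a covariance matrix for n_indicators variables."""
    return n_indicators * (n_indicators + 1) // 2

def lpa_parameter_count(n_indicators, n_classes, class_varying_variances=True):
    """Free parameters in a latent profile model without within-class covariances."""
    means = n_indicators * n_classes
    variances = n_indicators * (n_classes if class_varying_variances else 1)
    class_logits = n_classes - 1
    return means + variances + class_logits

print(n_variances_and_covariances(7))                              # 28
print(lpa_parameter_count(7, 5))                                   # 74
print(lpa_parameter_count(7, 5, class_varying_variances=False))    # 46
```

Holding the variances equal across classes, as suggested above, would bring the five-class model from 74 down to 46 free parameters under these assumptions.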

Anonymous posted on Tuesday, June 26, 2001  8:36 am



What does it mean (as stated in the output) that I have reached a saddle point? 

bmuthen posted on Tuesday, June 26, 2001  9:28 am



A saddle point is not a true maximum of the likelihood. Although the first-order derivatives are all zero as they should be for a maximum, not all second-order derivatives are negative as they should be. Saddle points occur for some model-data combinations and reflect the fact that the likelihood is not easy to maximize. Faced with this outcome, new starting values should be given. 
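A toy illustration of this definition, sketched in Python: the function `toy_loglik` below is not a real mixture likelihood, just a made-up surface with a saddle at (0, 0), and the derivatives are approximated by finite differences. Because the toy function has no cross term, looking at the two pure second derivatives is enough here; in general one would examine the whole matrix of second derivatives.

```python
def toy_loglik(x, y):
    # Curves downward in x and upward in y, so (0, 0) is a saddle point.
    return -x ** 2 + y ** 2

def gradient(f, x, y, h=1e-5):
    """Central-difference approximation of the first derivatives."""
    dx = (f(x + h, y) - f(x - h, y)) / (2 * h)
    dy = (f(x, y + h) - f(x, y - h)) / (2 * h)
    return dx, dy

def second_derivatives(f, x, y, h=1e-4):
    """Central-difference approximation of the pure second derivatives."""
    dxx = (f(x + h, y) - 2 * f(x, y) + f(x - h, y)) / h ** 2
    dyy = (f(x, y + h) - 2 * f(x, y) + f(x, y - h)) / h ** 2
    return dxx, dyy

grad = gradient(toy_loglik, 0.0, 0.0)
curv = second_derivatives(toy_loglik, 0.0, 0.0)
is_saddle = all(abs(g) < 1e-8 for g in grad) and min(curv) < 0 < max(curv)
print(grad, curv, is_saddle)
```

The first derivatives vanish, but one second derivative is positive, so moving in that direction still increases the function: the point is not a maximum.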

Anonymous posted on Thursday, February 07, 2002  6:44 am



I can estimate a one-class and a three-class, but not a two-class model using the same indicators? Using the same indicator variables (but different starting values), I have been able to estimate a one-class model and a three-class model. The good news is that the three-class model is vastly better than the one-class on all possible indicators of “betterness”. The bad news is that try as I might, I cannot get a two-class model to converge. Following from your manual, it appears that Mplus is able to execute the first three steps of the EM algorithm, but gets stuck on inverting the Fisher information matrix to create standard errors. As you likely already know, the output I receive provides the estimates for the two-class model, but not standard errors or fit statistics. The estimates are very much in concurrence with my theoretical expectations, and follow logically as a middle point between the estimates from the one- and the three-class models. I still think my three-class model is the best, but without fit statistics for the two-class model it’s really an issue of asking my audience to “trust me”, not a comfortable argument to make. My questions are these: 1. What could be behind this situation where I can fit a one- and a three-class model, but not a two-class model? 2. Can you suggest any tricks to unstick a stuck Fisher information matrix (believe me, I have tried all sorts of starting values)? Many thanks. 


It would probably be best if you send your input and data to support@statmodel.com and I can take a look at it. 

Jason Hay posted on Wednesday, February 27, 2002  9:44 pm



I am looking at using this program for simulations in finite mixture models. At the moment I am testing the demo version. I am curious as to how you specify the program to analyse data that is a mixture of three gaussians. My data set is in two columns one categorical and the other a value. 

bmuthen posted on Thursday, February 28, 2002  7:16 am



The User's Guide gives several examples of how to do simulations with mixtures. It sounds like you want to work with a 3class model, and that you want to consider 2 variables. It sounds like you want one variable that is categorical and the other normal, but with one categorical variable I don't understand how the gaussian aspect comes in. Also, I don't understand what you mean by saying "my data set", since data are generated in the simulations. Perhaps you want to clarify. 

Anonymous posted on Wednesday, May 01, 2002  12:59 pm



I am fitting a LPA model with four continuous indicators, with both means and variances class specific. I get a reasonable solution, but the modification indices list class specific covariances with 999.000 as the modification index and 0.000 as the expected change indices. What does this indicate? 


The 999.000 indicates that the modification index could not be computed. 

Anonymous posted on Tuesday, May 07, 2002  12:33 pm



I would like to evaluate a mixture model solution using various starting values to determine if the solution is not just a local solution. What should I be looking for across solutions using different starting values: normal termination of model estimation, fit statistics, or more? 


You should be looking at TECH8 output. Things to check are the following:
1. The loglikelihood should increase smoothly and reach a stable maximum.
2. The absolute and relative change should go to zero; fluctuations may indicate multiple solutions.
3. Class counts should remain stable.
If convergence is obtained for more than one set of starting values, compare the loglikelihood values and select the solution with the largest loglikelihood. 
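The idea of comparing loglikelihoods across starting values can be sketched outside Mplus. The following Python example is only an illustration under stated assumptions: it simulates data from two normal classes, fits a 2-class mixture with a class-invariant variance using a hand-written EM loop (not the Mplus algorithm), and keeps the run with the largest loglikelihood. The function names, the simulated data, and the number of starts are all made up.

```python
import math
import random

def normal_pdf(x, mean, var):
    return math.exp(-(x - mean) ** 2 / (2.0 * var)) / math.sqrt(2.0 * math.pi * var)

def fit_two_class_mixture(data, start_means, n_iter=100):
    """EM for a 2-class normal mixture with one variance shared by both classes.
    Returns (loglikelihood, means, variance, class proportions)."""
    n = len(data)
    grand_mean = sum(data) / n
    var = sum((x - grand_mean) ** 2 for x in data) / n
    means = list(start_means)
    props = [0.5, 0.5]
    for _ in range(n_iter):
        # E step: posterior class probabilities for each observation.
        post = []
        for x in data:
            w = [props[k] * normal_pdf(x, means[k], var) for k in range(2)]
            total = sum(w)
            post.append([wk / total for wk in w])
        # M step: update proportions, means, and the shared variance.
        counts = [sum(p[k] for p in post) for k in range(2)]
        props = [c / n for c in counts]
        means = [sum(p[k] * x for p, x in zip(post, data)) / counts[k] for k in range(2)]
        var = sum(p[k] * (x - means[k]) ** 2
                  for p, x in zip(post, data) for k in range(2)) / n
    loglik = sum(math.log(sum(props[k] * normal_pdf(x, means[k], var) for k in range(2)))
                 for x in data)
    return loglik, means, var, props

rng = random.Random(1)
data = [rng.gauss(0.0, 1.0) for _ in range(100)] + [rng.gauss(5.0, 1.0) for _ in range(100)]

# Several sets of starting values: two randomly chosen observations as class means.
results = [fit_two_class_mixture(data, rng.sample(data, 2)) for _ in range(8)]
best = max(results, key=lambda r: r[0])
n_same = sum(1 for r in results if abs(r[0] - best[0]) < 1e-3)

print("best loglikelihood:", round(best[0], 3))
print("starts reaching it:", n_same, "of", len(results))
```

If only one or two starts reach the best loglikelihood, that is a hint that the surface has several local solutions and more starts are worth trying.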

David Rein posted on Monday, November 25, 2002  2:12 pm



Hi, this question regards a difference in my conclusions found when I use my modeling data, and the data I am holding out for testing. My modeling sample is much bigger than my testing sample (750,000 to 86,000), and the issue occurs with the smallest class. In my modeling sample, both my BIC and entropy statistics, and my interpretation of the groups, lead me to conclude that there is a fourth class of patients, but that it is very small (about 0.4% of the total sample), which still amounts to a large number of about 3000 people. In my holdout sample, I have set all my coefficients equal to those found in my modeling data. I see a very large drop in BIC when going from 1 class to 2, then a large drop in BIC going from two classes to three, and then a large INCREASE in BIC when adding a fourth class. Each model produces the same proportional results. Class four in the testing sample is also 0.4% of the sample, but this only leads to a class four N of around 300. So, what's the interpretation here? Am I unjustified in moving out beyond three classes? Is the small proportional size of class four responsible for this turn of events? What can I say about my results if I am unable to reproduce them in the holdout sample? Incidentally, although the holdout sample was randomly drawn, it differs from the modeling sample in some important ways (thanks SAS!). Can this be my culprit? 

bmuthen posted on Monday, November 25, 2002  3:21 pm



Is this LCA (categorical outcomes)? How many items? Also, I wonder if the 4th class has the same interpretation in your 2 analyses. Some quick thoughts that come to mind before having heard your responses: Significant differences between the 2 samples can certainly create the discrepancy in BIC. If the 4th class is connected with many parameters, it may be that 300 individuals do not provide sufficient stability. BIC is not necessarily always the best method to choose the number of classes. BIC works a bit differently in different sample sizes because the penalty for many parameters is bigger with bigger samples. This should, however, work in the direction of having BIC evidence for more classes in the smaller sample (if I am thinking correctly), so that cannot play in here. 
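To make the remark about the sample-size penalty concrete, here is a small Python sketch. It assumes the usual definition BIC = -2 logL + p ln(n); the function name `bic` and the illustrative loglikelihood are made up, while the two sample sizes are the ones quoted in the question.

```python
import math

def bic(loglik, n_params, n_obs):
    """Bayesian information criterion: -2 logL + p * ln(n)."""
    return -2.0 * loglik + n_params * math.log(n_obs)

# Penalty added per free parameter in each sample.
penalty_modeling = math.log(750_000)
penalty_holdout = math.log(86_000)
print(round(penalty_modeling, 2), round(penalty_holdout, 2))

# Same (hypothetical) loglikelihood and parameter count, different sample sizes.
print(round(bic(-1000.0, 10, 750_000), 1), round(bic(-1000.0, 10, 86_000), 1))
```

Each extra parameter costs roughly 13.5 BIC points in the larger sample and roughly 11.4 in the smaller one, which is why the penalty alone would favor more classes in the smaller sample, not fewer.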

David Rein posted on Tuesday, November 26, 2002  7:00 am



Thanks for your quick response. The model is essentially the same basic cluster analysis as the one used in the example by Everitt and Hand (1981), and yes, it is an LCA in that it does not allow the indicators to be correlated. The latent class variable is indicated by four continuous measures of the same concept. Because the coefficients in the holdout sample are restricted to equal the values found in the modeling sample, the interpretation is the same. In each case the fourth class is the most severe: it has the highest values on the four indicators. I'm going to take back my earlier statements about there being differences between the fit and the unfit sample, as on the four variables used for this model there is no difference. (There were some differences in other variables which led to my earlier comment.) I can cut and paste the by-class mean and variance values but there's not much there; they show classes in equal proportions in each group, with statistically identical mean values and variances between classes in the fitted and the unfitted sample. The difference between the two is that the modeling sample allows the classes to be estimated freely, and the testing sample is restricted to the estimates found before. I see two possible conflicting interpretations. 1. Freeing the estimation in the modeling sample finds a distinct and interpretable class. 2. Freeing the estimation in the modeling sample overparameterizes the estimation, finding a class that isn't really "there", which allows me to impose a ready-made interpretation on it. Is there any way to test which of these options is closer to the big T truth? 

bmuthen posted on Tuesday, November 26, 2002  9:06 am



Just to recap, your analysis sample had BIC values that pointed to 4 classes (I assume BIC was at a minimum at 4 classes), while in your holdout sample BIC pointed to 3 classes. So, your concern is about this discrepancy, right? Let me probe a bit further. When you did your holdout sample analysis, you say that you fixed the parameters at the values of the analysis sample. Do you mean all parameters, including the ones describing the class probabilities? If so, there are no free parameters left and the BIC penalty for parameters is zero, which means that here BIC is a function only of the log likelihood. That doesn't seem right because the log likelihood should improve when going from 3 to 4 classes, unless the 4-class solution is a local optimum. You might want to check that. Note also that Mplus 2.12 allows a new statistical test of k-1 versus k classes using the Lo, Mendell, Rubin LRT; see Tech11. This can give different results than BIC. 

David Rein posted on Tuesday, November 26, 2002  10:00 am



On point one: yes, the model sample had declining BIC scores, which pointed to four classes, and an improved entropy statistic for four classes compared to three. I should note though that the improvement in BIC was small, but significant using a Bayes Factor test. However, I then was unable to estimate a model with five classes. In contrast, the results in the holdout sample are nicer in a way, as the pattern lacks any ambiguity, clearly pointing to a three-class model. It is just problematic, as conceptually a four-class model makes a bit more sense than a three-class model (as the fourth class handles the outliers much better than pooling them with class three). On the second point: not quite. The proportions in the holdout sample were estimated freely. Only the means and the variances of the indicator variables were restricted to be equal to the values from the model sample. I will use the Lo, Mendell, Rubin LRT when I get a chance, but is it possible that the answer is just ambiguity? Given a large enough sample, an outlier distribution can be identified, but in most subsamples, this group is too small to be considered its own group? 

bmuthen posted on Tuesday, November 26, 2002  5:37 pm



It seems to me that a class consisting of 300 individuals would be well estimated since with 4 variables relatively few parameters specific to this class are involved. This assumes of course that this class is well-defined. You say that you have a latent class model for continuous outcomes (so an LPA). Are you letting only the means or also the variances vary across classes? Another question is how well one should expect this type of cross-validation to work if the model doesn't really fit the data well. The Mplus residual output with respect to means, variances, and covariances and the Tech12 output with respect to skewness and kurtosis might be useful to assess the model fit here. It is a little strange that the holdout sample would get a BIC that points to fewer classes (3) than the analysis sample (4) because the BIC penalty for many parameters is smaller in the holdout sample analysis (both the sample size and the number of free parameters are smaller). I wonder what the BIC picture would be if you had all parameters free in the holdout sample analysis (I know that this forfeits the purpose). Perhaps just a relatively small change in estimates would take place, but the log likelihood might improve a lot and perhaps BIC would then again point to 4 classes. 

Yifu Chen posted on Tuesday, March 18, 2003  6:33 am



Hi, I have a quick question. I have 7 binary variables and want to run a latent class model. When I did it in Mplus, the following message was shown:

IN THE OPTIMIZATION, ONE OR MORE LOGIT THRESHOLDS APPROACHED AND WERE SET AT THE EXTREME VALUES. EXTREME VALUES ARE -15.000 AND 15.000.
THE FOLLOWING THRESHOLDS WERE SET AT THESE VALUES:
* THRESHOLD 1 OF CLASS INDICATOR IVA4PC FOR CLASS C#4 AT ITERATION 17
* THRESHOLD 1 OF CLASS INDICATOR IVA6PC FOR CLASS C#4 AT ITERATION 49
* THRESHOLD 1 OF CLASS INDICATOR IVA2PC FOR CLASS C#4 AT ITERATION 67
* THRESHOLD 1 OF CLASS INDICATOR IVA1PC FOR CLASS C#2 AT ITERATION 106
THE MODEL ESTIMATION TERMINATED NORMALLY

I still can get all the results, but I wonder if these messages indicate any problem in my model. Can I trust the result? Thank you! 


No, these messages do not indicate any problem in your model. It can actually aid the interpretation of the classes when an item has probability zero or one in a class. 
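To see why a logit threshold at an extreme value corresponds to a probability of essentially zero or one, here is a short Python sketch. It assumes a binary item with no covariates, where the probability of the first category given class is 1 / (1 + exp(-threshold)); the function name is made up.

```python
import math

def prob_first_category(threshold):
    """P(item is in category 1 | class) for a binary item with a logit threshold."""
    return 1.0 / (1.0 + math.exp(-threshold))

for tau in (-15.0, 0.0, 15.0):
    print(tau, prob_first_category(tau))
```

A threshold of 15 gives a first-category probability within about 3 in 10 million of one, and a threshold of -15 gives one equally close to zero, so fixing the threshold there loses nothing of practical importance.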

Anonymous posted on Wednesday, April 09, 2003  10:47 am



Hello, I apologize for the elementary nature of this question, but I am working on an LCA, trying to model a 3- and 4-class solution. I apparently had appropriate starting values for the 2-class solution, but this is the message I get for the 3-class solution:

THE MODEL ESTIMATION DID NOT TERMINATE NORMALLY DUE TO A NONPOSITIVE DEFINITE FISHER INFORMATION MATRIX. CHANGE YOUR MODEL AND/OR STARTING VALUES.
THE MODEL ESTIMATION HAS REACHED A SADDLE POINT, NOT A PROPER SOLUTION. CHANGE YOUR MODEL AND/OR STARTING VALUES. THE CONDITION NUMBER IS 0.838D+00.
THE STANDARD ERRORS OF THE MODEL PARAMETER ESTIMATES COULD NOT BE COMPUTED. THIS IS OFTEN DUE TO THE STARTING VALUES BUT MAY ALSO BE AN INDICATION OF MODEL NONIDENTIFICATION. CHANGE YOUR MODEL AND/OR STARTING VALUES. PROBLEM INVOLVING PARAMETER 2.

What are some rules when assigning starting values to a 3- and 4-class model? Does the model have to be constrained in this case? Thank you very much, Dawn 


To answer this, I need to know how many latent class indicators you have and what starting values you used. Please send the 2- and 3-class outputs and data to support@statmodel.com. 

Anonymous posted on Saturday, July 12, 2003  4:21 pm



I'm in the process of developing a SEM which posits a latent categorical variable, LV, as a mediating variable between a set of covariates, X, and an outcome variable, Y. I proceeded by developing the LC measurement model first and find that I get very similar LogL and LR Chi-Square values for the ML and MLF estimators, but very different values for the MLR estimator. My questions are therefore: Should the variation in LogL for the ML, MLF, and MLR estimators be a matter of concern, and is there a reason (aside from convergence problems) that one would ever *not* want to use the MLR estimator? Shouldn't the MLR estimator be used for all ML modeling in Mplus? In general, do you have a set of a priori considerations you suggest in selecting ML estimators for Mplus? Are nested LR Chi-Square tests of fit / restrictions equally valid for all three estimators? Finally, when I include LV in the fuller SEM with X and Y, are my results likely to be less sensitive to the choice of estimator than they were when I was modeling LV alone (since more information is included in the full SEM versus the LCMM alone)? 

bmuthen posted on Saturday, July 12, 2003  4:51 pm



Since you have a categorical latent variable, you must be doing type=mixture analysis. ML, MLF, and MLR give the same log likelihood values and only affect SEs here. If MLR gives a different log likelihood, something is wrong in the input; please email input, output, and data to support@statmodel.com. I would recommend the default estimators in all cases since they draw on experiences we have had with simulations. Nested loglikelihood tests are the same for the different mixture estimators because the log likelihood is the same. If in your last question you are asking if including x makes results more sensitive to estimator choice in this mixture modeling example, my answer is I don't know. 

Anonymous posted on Friday, September 26, 2003  2:00 pm



I encountered a strange situation, where I got "0-value" estimates in my 3-class mixture model (see output below; all observed variables are dichotomized ones). The model converged at the 797th iteration. Any suggestions? Many thanks!

TESTS OF MODEL FIT

Loglikelihood
    H0 Value                        -2548.777

Information Criteria
    Number of Free Parameters              12
    Akaike (AIC)                     5121.553
    Bayesian (BIC)                   5187.100
    Sample-Size Adjusted BIC         5148.977
      (n* = (n + 2) / 24)
    Entropy                             0.776

Chi-Square Test of Model Fit for the Latent Class Indicator Model Part
    Pearson Chi-Square
      Value                            47.028
      Degrees of Freedom                   19
      P-Value                          0.0004
    Likelihood Ratio Chi-Square
      Value                            44.687
      Degrees of Freedom                   19
      P-Value                          0.0008

MODEL RESULTS

                   Estimates     S.E.   Est./S.E.
CLASS 1
CLASS 2
CLASS 3

LATENT CLASS INDICATOR MODEL PART

Class 1
 Thresholds
    DYSINTER$1         3.374    0.885       3.815
    DYSLUBR$1          2.705    0.164      16.463
    DYSPAIN$1          4.873    0.426      11.428
    DYSCLIM$1          3.783    0.553       6.847
    DYSPLEAS$1         3.751    0.351      10.687

Class 2

Mplus VERSION 2.02 PAGE 4
Northern European Women (3 Classes)

 Thresholds
    DYSINTER$1        -1.380    1.679      -0.822
    DYSLUBR$1          1.341    0.502       2.671
    DYSPAIN$1         12.057   15.134       0.797
    DYSCLIM$1          0.685    0.485       1.411
    DYSPLEAS$1         1.150    0.409       2.814

Class 3
 Thresholds
    DYSINTER$1         0.000    0.000       0.000
    DYSLUBR$1          0.000    0.000       0.000
    DYSPAIN$1          0.000    0.000       0.000
    DYSCLIM$1          0.000    0.000       0.000
    DYSPLEAS$1         0.000    0.000       0.000

LATENT CLASS REGRESSION MODEL PART

 Means
    C#1                2.163    0.140      15.450
    C#2                0.170    0.521       0.327

LATENT CLASS INDICATOR MODEL PART IN PROBABILITY SCALE

Class 1
 DYSINTER
    Category 1         0.967    0.028      34.140
    Category 2         0.033    0.028       1.169
 DYSLUBR
    Category 1         0.937    0.010      97.075
    Category 2         0.063    0.010       6.495
 DYSPAIN
    Category 1         0.992    0.003     308.855
    Category 2         0.008    0.003       2.363
 DYSCLIM
    Category 1         0.978    0.012      81.376
    Category 2         0.022    0.012       1.851
 DYSPLEAS
    Category 1         0.977    0.008     124.141
    Category 2         0.023    0.008       2.916

Class 2
 DYSINTER
    Category 1         0.201    0.270       0.745
    Category 2         0.799    0.270       2.963
 DYSLUBR
    Category 1         0.793    0.083       9.607
    Category 2         0.207    0.083       2.512
 DYSPAIN
    Category 1         1.000    0.000   11390.062
    Category 2         0.000    0.000       0.066
 DYSCLIM
    Category 1         0.665    0.108       6.149
    Category 2         0.335    0.108       3.101
 DYSPLEAS
    Category 1         0.760    0.075      10.175
    Category 2         0.240    0.075       3.221

Class 3
 DYSINTER
    Category 1         0.500    0.000       0.000
    Category 2         0.500    0.000       0.000
 DYSLUBR
    Category 1         0.500    0.000       0.000
    Category 2         0.500    0.000       0.000
 DYSPAIN
    Category 1         0.500    0.000       0.000
    Category 2         0.500    0.000       0.000
 DYSCLIM
    Category 1         0.500    0.000       0.000
    Category 2         0.500    0.000       0.000
 DYSPLEAS
    Category 1         0.500    0.000       0.000
    Category 2         0.500    0.000       0.000

FINAL CLASS COUNTS AND PROPORTIONS OF TOTAL SAMPLE SIZE

    Class 1     1391.33758    0.79916
    Class 2      189.69584    0.10896
    Class 3      159.96658    0.09188


If you send the full output to support@statmodel.com, I will take a look at it and answer your question. Please don't post it on the discussion board. 


The reason that you get zeroes in the third class is that you give no threshold starting values for the third class. Instead you give starting values for the second class twice. I think you need to change the second %c#2% to %c#3%. 

Anonymous posted on Friday, October 29, 2004  3:22 pm



I have a 4-class LCA model with 6 indicators. Three of these were originally measured as 4-point ordinal scales (strongly disagree to strongly agree). I've recoded these indicators to be dichotomous variables. My LCA results are different when I specify these dichotomous indicators to be nominal vs. ordinal. Is there a way to determine which is the correct specification (the model fit for each is about the same, but the class profiles are different)? Thank you. 

bmuthen posted on Saturday, October 30, 2004  10:10 am



Analyzing dichotomous items by latent class analysis using the nominal or the categorical option should give the same result. The differences you are seeing may be due to local optima; using a higher value for STARTS will give the same results. If not, please send your inputs, outputs, and data to support@statmodel.com. 

Anonymous posted on Thursday, November 18, 2004  1:07 pm



I have a generalized growth mixture model with two background variables and 4 time points. I ran the model with STARTS 500, 10. The two-class model BIC is 3301 and the three-class is 3304. The AIC for the two-class is 3244 and three-class is 3238; the sample-size adjusted BIC for the two-class is 3244 and three-class is 3238. I know that is not necessarily a big difference. I prefer the three-class model, but BIC suggests they are essentially the same or the two-class is slightly better, while AIC and sample-size adjusted BIC are slightly smaller for the three-class. Would you have any suggestions about which model is best? I like the three-class solution, but I'm not sure I can justify it for publication. What would you say as a reviewer? The LO-MENDELL-RUBIN ADJUSTED LRT TEST = 30 (p = .09). Thank you in advance. 

bmuthen posted on Thursday, November 18, 2004  1:53 pm



Since BIC etc. can't clearly distinguish between the models, I would go with the one that is easiest to interpret. Also, if you compare the two models you probably have some similarities so that the 3-class model is merely an elaboration of the 2-class model. Also, adding more covariates and perhaps also a distal outcome might help distinguish between the models. 

Anonymous posted on Tuesday, March 01, 2005  8:03 pm



Here's a foolish question, but I've tried without success to learn the answer so far: What's the interpretation of the categorical latent variable means part of the printout in a mixture model? E.g.:

Categorical Latent Variables
 Means
    C#1    1.474    0.136    10.863
    C#2    0.224    0.209     1.071
    C#3    0.475    0.161     2.942

Thanks for any information on it! 


The values are multinomial logits. When they are converted to probabilities, they are the probabilities of class membership. See "Calculating probabilities from logistic regression coefficients" in Chapter 13. This conversion would be done using Formula 2 without x's. The a's are the multinomial logits. 
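As a sketch of that conversion (in Python, for illustration only), the logits are exponentiated and normalized, with the last class as the reference class whose logit is fixed at zero. The function name is made up; the example values are the two logits from the 3-class output posted earlier in this thread (C#1 = 2.163, C#2 = 0.170), which can be compared with the class proportions reported there (0.79916, 0.10896, 0.09188).

```python
import math

def class_probabilities(logits):
    """Convert K-1 multinomial logits into K class probabilities.
    The last class is the reference class, with its logit fixed at 0."""
    exponentiated = [math.exp(a) for a in logits] + [1.0]
    total = sum(exponentiated)
    return [e / total for e in exponentiated]

probs = class_probabilities([2.163, 0.170])
print([round(p, 3) for p in probs])
```

The same function applies to any number of classes: three logits, as in the question above, give four probabilities.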


I was wondering if there were any "rules of thumb" for choosing the appropriate estimator for LCA models using binary data that have very high/low frequencies (e.g. 94%/2.5%)? I have the same question for EFA with binary data. Is there a good reference to learn more about the different estimators and when to use them with binary data? Thank you in advance. Tom 

bmuthen posted on Monday, March 21, 2005  5:54 pm



LCA is only available (at least in Mplus) with the ML estimator (in the ML, MLF, and MLR versions; I would use the default MLR). For EFA with binary data only the WLSMV estimator is available in Mplus. A reference discussing binary data FA with skewed items is Muthen (1989) in SM&R (see the Mplus web site). 

ADC posted on Tuesday, May 03, 2005  9:38 am



I am doing LCA with 8 indicator variables (categorical: 0, 1, 2) and 2 covariates. A 2class solution seems to best fit the data. I'd like to know which of my indicator variables were more influential in determining class membership. Ideally, I'd like to be able to create an algorithm (e.g., weights in a general linear model) that will allow me to assign other participants into these same two categories on the basis of the same indicator variables and covariates. I'm not sure what part of the output tells me these weights. Thanks for any help. Sorry if this is a simplistic question, but I'm a little overwhelmed by the information in the output. 


If I understand you correctly, you want to look at the thresholds of the latent class indicator variables. 

ADC posted on Tuesday, May 03, 2005  2:24 pm



Thank you for your reply. If I understand the thresholds correctly, they represent a value (though I'm not sure what the value is of. Of a latent trait?) that needs to be exceeded for the indicator variable to move to the next category. But both the thresholds and the "Results in Probability Scale" seem to relate to the likelihood of a score on an indicator, given membership in a latent class. I'd like to go the other way. Given scores on the 8 indicator variables, how would I predict a new individual's membership in a latent class? I've been looking at the section in Chapter 13 of the User Guide on Calculating Probabilities from Logistic Regression Coefficients. This seems close to what I want to do, but the examples only seem to focus on intercepts and covariates. Is there a way to plug the thresholds into a similar formula? 

bmuthen posted on Tuesday, May 03, 2005  2:34 pm



One approach is to first estimate the model parameters for a certain sample. The second step is then to do an analysis using the same model but holding all parameters fixed at the estimated values from step 1. In this second step your sample may consist of only 1 person (or a set of persons), and that person's data vector is of the same form as in step 1. This second step produces the "posterior" class probabilities for all classes for the individual (or individuals), which is what you need in order to classify the person into a class (based on the most likely class). The second step is not straightforward to calculate by hand, but is done quickly in Mplus. Hope that is close to what you had in mind. 
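For readers curious about what the second step computes, here is a minimal Python sketch of posterior class probabilities via Bayes' rule. It rests on several assumptions that go beyond the post: no covariates, ordered categorical indicators that are independent within class, and cumulative probabilities of the form P(u <= k | class) = 1 / (1 + exp(-threshold_k)). All names and parameter values are hypothetical; in practice the thresholds and class logits would come from the step 1 estimates.

```python
import math

def category_probabilities(thresholds):
    """Category probabilities for one ordered item in one class.
    thresholds: increasing logits, one fewer than the number of categories."""
    cumulative = [1.0 / (1.0 + math.exp(-t)) for t in thresholds] + [1.0]
    probs, previous = [], 0.0
    for c in cumulative:
        probs.append(c - previous)
        previous = c
    return probs

def posterior_class_probabilities(responses, class_logits, thresholds_by_class):
    """responses: observed category (0-based) for each item.
    class_logits: K-1 multinomial logits, last class as reference.
    thresholds_by_class[k][j]: thresholds of item j in class k."""
    priors = [math.exp(a) for a in class_logits] + [1.0]  # unnormalized class weights
    joint = []
    for k, prior in enumerate(priors):
        likelihood = 1.0
        for j, u in enumerate(responses):
            likelihood *= category_probabilities(thresholds_by_class[k][j])[u]
        joint.append(prior * likelihood)
    total = sum(joint)
    return [v / total for v in joint]

# Hypothetical 2-class model with three items scored 0, 1, 2.
class_logits = [0.4]
thresholds_by_class = [
    [[1.5, 3.0], [1.0, 2.5], [2.0, 3.5]],       # class 1: mostly low categories
    [[-2.0, 0.0], [-1.5, 0.5], [-2.5, -0.5]],   # class 2: mostly high categories
]
new_person = [2, 2, 1]
posterior = posterior_class_probabilities(new_person, class_logits, thresholds_by_class)
print([round(p, 4) for p in posterior])
```

The person would be assigned to the class with the largest posterior probability. With covariates in the model, the class logits would additionally depend on the person's covariate values, which this sketch leaves out.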

ADC posted on Wednesday, May 04, 2005  7:46 am



Yes, that's helpful, thanks. I was hoping for a simple formula that could be done by hand or reported in a manuscript, but it seems like that may not be possible. One question about this method. When I ran a subsample of the original sample, using the fixed parameters as you suggested, I was able to replicate the results (with small deviations) from the original run. However, I got the following errors:

*** ERROR in Model command
Unknown threshold value 2 for variable BORD7
*** ERROR in Model command
Unknown threshold value 2 for variable BORD8
*** ERROR in Model command
Unknown threshold value 2 for variable BORD7
*** ERROR in Model command
Unknown threshold value 2 for variable BORD8
*** ERROR
The following MODEL statements are ignored:
* Statements in Class 1:
[ BORD7$2 ]
[ BORD8$2 ]
* Statements in Class 2:
[ BORD7$2 ]
[ BORD8$2 ]

I gather that these errors are because I had fixed parameters for 3 levels of BORD7 and BORD8, but the data in my subsample had only 2 levels for these variables. I was able to work around it by including bogus data vectors. Is that the best solution? Thanks once again for your quick replies and your helpful suggestions. 


I would simply add the new observations to the original data set. Try that to make sure you get the same results as using bogus data vectors. You probably will. 

Anonymous posted on Monday, May 16, 2005  6:32 am



Hello! I am attempting a latent profile analysis and wonder if there is a simple way to transform the logits into a scale that reflects the original metric. Otherwise, do you have advice about easing interpretation so that I can determine what my clusters mean? Thanks in advance for your help. 


In latent profile analysis, the latent class indicators are usually continuous so logits would not be estimated. Means and variances would be estimated. What is the measurement scale of your latent class indicators? 

Anonymous posted on Monday, May 16, 2005  7:24 pm



Re: posting Monday, May 16th at 6:32 AM I suppose I am technically doing a mixture model, as one of my latent classes is composed of continuous indicators, while two are composed of categorical indicators. I am specifically wondering how to translate the logits of the categorical scales so that I can interpret my classes. Any information would be appreciated. Thanks! 


Toward the end of Chapter 13, there is a section that shows how to translate logits into probabilities. It uses intercepts. If you change the sign of your thresholds, they become intercepts. See the first example, where the covariate values are all zero. 
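As a rough illustration of the sign change described above, the following Python sketch converts a threshold for a binary indicator into a class-specific probability, assuming a logit link and no covariates. The function name and the threshold values are hypothetical and are not taken from any output in this thread.

```python
import math

def prob_from_threshold(threshold):
    """P(u = 1 | class) for a binary indicator, assuming a logit link."""
    intercept = -threshold  # changing the sign of a threshold gives the intercept
    return 1.0 / (1.0 + math.exp(-intercept))

for tau in (-2.0, 0.0, 2.0):  # hypothetical threshold estimates
    print(f"threshold {tau:+.1f} -> P(u=1) = {prob_from_threshold(tau):.3f}")
```

Under these assumptions a threshold of zero corresponds to a probability of 0.5, and larger thresholds correspond to smaller probabilities of the higher category.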

Anonymous posted on Thursday, May 19, 2005  10:21 am



Re: posting Monday, May 16th at 6:32 AM Hi, Again. Thanks for your most recent post. I have been advised to interpret the means provided for each class, not the thresholds. Further, thresholds are given for each manifest variable and I am seeking to interpret my latent variable means for each class. (Does that make any sense? Sorry, I am new to this.) I am attempting to interpret the means for each of my latent variables (1 continuous and 2 categorical) for each of my classes proposed. Is there a way to transform the means back into the original metric so that I can understand what my classes mean in terms of my latent variables? Otherwise, how does one interpret the means to understand what their respective class memberships mean? Thanks! 


Please send your output and license number to support@statmodel.com. Refer specifically to the values you are trying to interpret. 

Anonymous posted on Friday, May 20, 2005  2:57 pm



Thank you for your response. I will send my output and license number on Monday. Unfortunately, I did not check this message until I was home for the weekend. Thanks, again. I look forward to getting your response. 

Anonymous posted on Monday, May 23, 2005  7:49 am



Hi, again. I have sent my output to support@statmodel.com. I look forward to hearing your response. Thanks, again. 

Anonymous posted on Wednesday, July 27, 2005  8:47 am



I am testing a cross-sectional mixture model in which there are two DVs and 5 IVs. If I wish to control for an extraneous variable (e.g., size of the company or age of company) that is not of any theoretical interest to my study (simply a control variable), do I just add the variable to the regression in the overall model? Or should I also include these variables in the regression of class membership onto the covariates that are of theoretical interest to my study? 

bmuthen posted on Wednesday, July 27, 2005  6:37 pm



Both. 

Anonymous posted on Thursday, August 04, 2005  12:46 pm



Hi, I am doing a mixture model with 8 latent class indicators. This is the first time I have done this type of analysis. What technique should I use to choose the starting values? Also, if I want to bootstrap in my analysis, I must change MLR to ML. How does this affect my results? Thank you for your help. 

bmuthen posted on Monday, August 08, 2005  2:21 pm



No starting values are needed; see the version 3 User's Guide. MLR is used to get non-normality robust SEs. Bootstrapping might be useful for small samples. I am not sure which approach is best; perhaps a study is needed. 

Anonymous posted on Tuesday, August 09, 2005  5:46 am



Can you tell me exactly where in the version 3 User's Guide it says that no starting values are needed? I had understood that if I want a latent variable with 2 or more classes, I must give threshold starting values. My question is how to choose these starting values. 


See Example 7.3. 

Anonymous posted on Tuesday, August 09, 2005  8:20 am



Thank you for your answers. But if I want to specify the starting values instead of using random starting values, how can I do this? If I understand correctly, I can begin with random starting values and use the output to specify the starting values for the second run. If not, how can I do this? 


See Example 7.4. This has user-specified starting values. 

Anonymous posted on Tuesday, August 09, 2005  10:49 am



Excuse me, I did not phrase my question well. My question is how to choose the values to use as starting values, and what technique to use to be sure that the solution is not a local maximum. 


In Example 7.5, starting values are given and random starts are used. The starting values are the values following the asterisks (*). An asterisk (*) is used to specify a starting value. To avoid a local solution, use random starts as is done in this example. You can read about random starts by looking up the STARTS option in the Mplus User's Guide. 

Anonymous posted on Sunday, August 28, 2005  11:03 am



Hello. How are the standard errors of the parameter estimates obtained in a mixture model? I didn't specify bootstrap in the analysis command. Is it the default method in Mplus? Can I find a more detailed description of the bootstrap in the EM algorithm in any reference paper? 

bmuthen posted on Sunday, August 28, 2005  12:11 pm



The mixture model uses regular ML-based SEs, where the ML information matrix is estimated by one of three alternatives given in Technical Appendix 8 on the Mplus web site. 

ksullivan posted on Wednesday, February 22, 2006  10:14 am



Hello. I am trying to run a latent class analysis with 6 latent variable indicators. Each indicator has six levels (i.e., 1, 2, 3, 4, 5, 6). I realize that using this many indicators forces the model to estimate a lot of parameters. I can get the model to run with 2 classes but it falls apart with 3 classes. I get the errors where the model did not terminate normally due to an "ILL-CONDITIONED FISHER INFORMATION MATRIX" and "DUE TO A NON-POSITIVE DEFINITE FISHER INFORMATION MATRIX." I also had problems with reaching large thresholds. I tried increasing the STARTS but I still could not get the 3-class model to run. I think that I need to set starting values but I am unsure how to set starting values when my indicators have 6 levels. I have the probabilities based on an LCA run with 1 and 2 classes but I am not sure how this translates into setting values for a model with 3 or more classes. Thanks in advance for any help. 


Are you using Version 3.13? If not, I would download it and try the analysis again. Also, increase your starts for the two-class solution and be sure you are getting a good solution, that is, be sure you replicate the best loglikelihood value at least twice. Having thresholds fixed is not a problem. It helps define the classes. If all else fails, please send your input, data, output, and license number to support@statmodel.com. 
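The replication check mentioned above can be sketched in a few lines of Python. This is only an illustration of the idea: the helper name, the tolerance, and the loglikelihood values are all hypothetical, and in practice the values would be read from the final-stage loglikelihoods reported for the random starts.

```python
def best_loglik_replications(logliks, tol=1e-3):
    """Return the best loglikelihood and how many starts reached it (within tol)."""
    best = max(logliks)
    count = sum(1 for ll in logliks if abs(ll - best) <= tol)
    return best, count

# Hypothetical final-stage loglikelihoods from several random starts.
final_stage = [-4521.337, -4521.337, -4523.902, -4521.337, -4530.118]
best, count = best_loglik_replications(final_stage)
print(f"best loglikelihood {best:.3f} reached by {count} start(s)")
if count < 2:
    print("Best value not replicated; consider more random starts.")
```

If the best value is reached by only one start, the solution may be local, and increasing the number of random starts is the usual next step.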


In implementing the Lo-Mendell-Rubin likelihood ratio test, I have seen that you should have the classes ordered such that the largest class is the last class. Does it then follow that the first class should be the smallest and each subsequent class should be ordered in size? Thanks! 


No, that is not necessary. 


Sorry Linda, just requesting a clarification on the above point. I thought that the idea with implementation of the Lo-Mendell-Rubin test was to have the smallest class as the first class, with the order of the subsequent classes immaterial, seeing as the first class was being deleted for the comparison with k-1 classes... is this correct? Thanks in advance, Geoff 


I always thought you wanted the largest class last and the rest really didn't matter. 


Hello Dr. Muthén, the last two messages above make me doubtful: Is it necessary for conducting or interpreting the Lo-Mendell-Rubin likelihood ratio test to have the largest class as the last class? If so, how do I tell M+ that the last class should be the largest? Thank you in advance. Phil 


It is desirable to have the largest class last. You can do this by using parameter estimates from the largest class as starting values for the last class. 


Hello Dr. Muthén, I followed your suggestion and used the parameter estimates (variances) from the largest class as starting values for the last class: MODEL: %OVERALL% %c#1% y2 y3 y4 y5 y6; %c#2% y2 y3 y4 y5 y6; %c#3% y2@0.124 y3@0.113 y4@0.128 y5@0.103; Unfortunately, this changed the whole model; it is no longer comparable with the model without reordered classes. What did I do wrong? And what happens with the bootstrapped Lo-Mendell-Rubin likelihood ratio test when the last class is not the largest one? Thanks, Phil 


You need to send the output from your first analysis where you obtained the starting values and the input, data, output from the second analysis and your license number to support@statmodel.com. 

Ralf Wierich posted on Wednesday, December 06, 2006  6:22 am



Hi, I have two questions: 1. if I run my model with 3 classes, i get the error message: WARNING: THE LATENT VARIABLE COVARIANCE MATRIX (PSI) IN CLASS 2 IS NOT POSITIVE DEFINITE. THIS COULD INDICATE A NEGATIVE VARIANCE/RESIDUAL VARIANCE FOR A LATENT VARIABLE, A CORRELATION GREATER OR EQUAL TO ONE BETWEEN TWO LATENT VARIABLES, OR A LINEAR DEPENDENCY AMONG MORE THAN TWO LATENT VARIABLES. CHECK THE TECH4 OUTPUT FOR MORE INFORMATION. PROBLEM INVOLVING VARIABLE ANG. The latent var ANG in class 2 has a very low residual variance of .004. What can i do? Is this a problem of my model? The 2class solution ended properly. 2. How can I interpret the intercept terms of the models? Are there any papers I should look into? Thanks, Ralf 


Please send your input, data, output, and license number to support@statmodel.com. It is likely that you don't need three classes but I would need more information to say for sure. I can then answer your second question when I see what your model is. 


Hi Linda, does Mplus have an output command that gives an LCA model its classification error, like the one in LatentGold? Thanks, Silvia 


Mplus provides a classification table and an entropy value as the default. No special option is needed. 

Alex posted on Wednesday, June 13, 2007  2:00 pm



Hi, In the estimation of mixture models, MOC (from R) provides ICL-BIC (a BIC corrected for entropy) using the formula BIC + 2*entropy. I have two questions: (1) Do you plan on implementing it in Mplus? (2) How would you calculate it by hand from Mplus, given that the entropy is not computed in the same way (in MOC we seek to minimize entropy and in Mplus to maximize it, if I'm right)? Thank you very much 


1. Can you provide a reference for ICL-BIC? 2. I don't think you could calculate it by hand if entropy is not the same. 

Alex posted on Thursday, June 14, 2007  7:41 am



Thank you for this prompt answer. I'll look into that and send you (in the next couple of days) an email with a reference and with the MOC formula for entropy (it is easier to type a formula in a Word document than here). 


Thanks for sending the information. We have read about ICL-BIC in the McLachlan and Peel book on mixture modeling. It appears to have performed well in the limited simulations that were carried out. We will study it further and see how it performs and include it if it looks promising. One potential objection is that it combines the loglikelihood, which is a fit statistic, with entropy, which is a measure of classification quality and not a measure of fit. You can compute ICL-BIC by hand using information from Mplus. See formula 6.75 on page 217 of the McLachlan and Peel book. This formula has three terms. The sum of terms 1 and 3 is equal to the Mplus BIC shown in formula 6.50 on page 209. Term 2 is shown in formula 6.58 on page 213, EN (tau). That term is the numerator in formula 171 shown in Technical Appendix 8 on the Mplus website with the exception of a sign change. You can solve for that term using formula 171 and change the sign and then use it to compute ICL-BIC. 
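Because the sign of the entropy term is easy to get wrong, one way to check a hand calculation is to compute the term directly from the posterior class probabilities. The Python sketch below does that under stated assumptions: EN(tau) is taken to be minus the sum over persons and classes of p ln p, the relative entropy is taken to be 1 - EN(tau) / (n ln K), and ICL-BIC is taken to be BIC + 2 EN(tau). These definitions are my reading of the formulas cited above and should be verified against the book and Technical Appendix 8. The posterior probabilities and the BIC value are hypothetical.

```python
import math

def entropy_term(posteriors):
    """EN(tau) = -sum_i sum_k p_ik * ln(p_ik), treating 0 * ln(0) as 0."""
    return -sum(p * math.log(p) for row in posteriors for p in row if p > 0.0)

def relative_entropy(posteriors):
    """Entropy rescaled to the 0-1 range: 1 - EN(tau) / (n * ln K), for K >= 2."""
    n, k = len(posteriors), len(posteriors[0])
    return 1.0 - entropy_term(posteriors) / (n * math.log(k))

def icl_bic(bic, posteriors):
    """ICL-BIC taken as BIC + 2 * EN(tau) (assumed form, see lead-in)."""
    return bic + 2.0 * entropy_term(posteriors)

# Hypothetical posterior class probabilities for four people and two classes.
posteriors = [[0.95, 0.05], [0.10, 0.90], [0.50, 0.50], [1.00, 0.00]]
bic = 1234.5  # hypothetical BIC value
print(entropy_term(posteriors), relative_entropy(posteriors), icl_bic(bic, posteriors))
```

With perfectly separated classes the entropy term is zero, so ICL-BIC equals BIC; poorer classification adds a positive penalty under these definitions.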

Alex posted on Tuesday, June 19, 2007  1:10 pm



Thank you for the answer. In conclusion, to make sure I make no mistakes (being no statistician): ICL-BIC = BIC + 2 (1 - Ek) n ln K, where BIC is the BIC given by Mplus, Ek is the "entropy" given by Mplus, n is the number of subjects, and K is the number of classes. Am I right? Thanks again. 


I think you forgot to take the negative of the numerator term. I think it should be: ICL-BIC = BIC + 2 (Ek - 1) n ln K 


I am trying to run a FMM w/ 4 factors & 6 classes, N is 1388. Factors are measured by 5 or 6 binary variables. I'm using the cluster means from SAS Proc Fastclus as starting values. So far, I have not been able to get it to run using MLR or MLF, even when I reduce the classes to 2. It gives me the following: *** FATAL ERROR THERE IS NOT ENOUGH MEMORY SPACE TO RUN THE PROGRAM ON THE CURRENT INPUT FILE. THE ANALYSIS REQUIRES 4 DIMENSIONS OF INTEGRATION RESULTING IN A TOTAL OF 0.50625E+05 INTEGRATION POINTS. THIS MAY BE THE CAUSE OF THE MEMORY SHORTAGE. etc. When I cut the number of subjects down to 100, it does run but says the following: WARNING: THIS MODEL REQUIRES A LARGE AMOUNT OF MEMORY AND DISK SPACE. IT MAY NEED A SUBSTANTIAL AMOUNT OF TIME TO COMPLETE. REDUCING THE NUMBER OF INTEGRATION POINTS OR USING MONTECARLO INTEGRATION MAY RESOLVE THIS PROBLEM. (This is still running so I don't know what the output will look like). It runs when I use only 3 factors with MLF and Cholesky=off (but takes about 20 hours). Is it possible that a PC with 3 Gigs of RAM is not enough to run this model or is something else wrong? Thanks 


First, I would check that a regular (single-class) FA can be estimated. It may not be identified if you have too many free loadings; for example, an EFA with 4 factors and 6 variables is not identified. Second, 4 factors leads to 4 dimensions of integration and with the default number of integration points of 15, you then have 15 to the power of 4 integration points. This then gets multiplied by the number of subjects, leading to the large memory requirement. You can reduce the number of integration points by saying either integration=7 (for example), or integration=montecarlo. Third, when you allow for several latent classes, you may not need as many dimensions as you otherwise would. Finally, memory use on your computer can be expanded as mentioned under Systems Requirements on the web site. 
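The arithmetic behind the memory problem can be sketched in Python. The grid size is the number of points per dimension raised to the number of dimensions, which with 15 points and 4 dimensions gives the 0.50625E+05 reported in the error message. The function name is hypothetical, and the product with the sample size is only a rough indicator of workload, not an exact memory figure.

```python
def total_integration_points(points_per_dimension, dimensions):
    """Size of a full rectangular integration grid."""
    return points_per_dimension ** dimensions

n_subjects = 1388  # sample size mentioned in the question
for points in (15, 10, 7):
    total = total_integration_points(points, 4)
    print(f"{points} points per dimension, 4 dimensions: "
          f"{total} grid points, {total * n_subjects} point-by-subject evaluations")
```

Going from 15 to 7 points per dimension shrinks the grid from 50625 to 2401 points, roughly a 21-fold reduction.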

J.W. posted on Friday, September 21, 2007  1:24 pm



Deciding on number of latent classes in GMM: Sometimes, information criteria and the LMR LR test lead to contradictory results. For example, the BIC of the k-class model is smaller than the BIC of the (k-1)-class model, while the LMR LR test can't reject the (k-1)-class model. In this case, which model (the k-class model or the (k-1)-class model) is better? Any help will be appreciated! 


See the following paper for guidance on determining the number of classes: Nylund, K.L., Asparouhov, T., & Muthen, B. (2006). Deciding on the number of classes in latent class analysis and growth mixture modeling. A Monte Carlo simulation study. Accepted for publication in Structural Equation Modeling. 


About the largest class last for the interpretation of the Lo-Mendell-Rubin LRT (see posts around October 6 2006): I understand it is desirable to have the largest class last but I am not sure about the order for the other classes. I ran an LCA with 4 classes with the classes ordered (smallest to largest) and another LCA with only the last class fixed as the largest class (here, the second class turned out to be the smallest). As expected, the p values for the LMR LRT differ. Which value should I report? When running several models, shouldn't one be consistent throughout and specify the models so that the smallest class is always the first class and the largest class always the last? Thanks! 


The largest class last for the LRT tests is not for purposes of interpretation. It is because the first class is deleted when testing the k versus the k-1 models. The order of the other classes is not an issue. I could not say which value you should report without seeing the two outputs. It sounds like you fixed the parameters in the last class, which is not what is recommended. You can send the outputs and your license number to support@statmodel.com if you want further comments. 


Hi, Actually, I did not fix the parameters of the last class, I used the parameter estimates from the largest class as starting values for the last class. My mistake, I used the wrong terminology. I won't worry about the other classes then. Thanks! 


I extracted 4 groups in my growth mixture analysis. There is a group which could be clearly labeled as "increasers" looking at the plot of estimated means, but the means of the linear and quadratic slopes are not significant (due to high SEs); the intercept is significant. Could such a group still be labeled as increasers? Could the high SEs of the mean estimates be a sign of misfit? 


A high standard error indicates a lack of precision of the estimate. So the model is not well-defined. 


I have four continuous variables from laboratory measures of anxiety response. I plan to run a LPA on the variables to test a 1 vs. 2 group model of anxiety response. Are there any additional mixture models you would recommend for answering this question? Would a Mixture regression analysis, CFA mixture modeling, or a structural equation mixture modeling be appropriate? 


Without knowing your research context, I would say that in addition to LPA a factor mixture model might be explored. For an overview of mixture models, see Muthén, B. (2008). Latent variable hybrids: Overview of old and new models. In Hancock, G. R., & Samuelsen, K. M. (Eds.), Advances in latent variable mixture models, pp. 1-24. Charlotte, NC: Information Age Publishing, Inc., which is on our web site. 

Rui posted on Tuesday, July 22, 2008  12:22 pm



Hi, I have a question about the conditional independence assumption in LCA. When the latent class indicators are a combination of binary, censored, and count variables (data in ex7.11), how can I examine/account for the within-class correlations among the LC indicators? Thanks! 


See v5 UG ex7.16. 

linda beck posted on Wednesday, July 23, 2008  8:24 am



Referring to the Kaplan chapter it is important to include covariates in final mixture solutions. Regarding their effects on growth parameters, is it important to hold these effects on intercepts and slopes equal across classes or should one try to have unequal effects? In an unconditional model I've computed before, only equal growth parameter variances converged.... (I ask this question because in an article it was emphasized to hold effects of covariates on growth parameters equal across classes, unfortunately I can't find it...) 


I typically hold covariate effects equal across classes (it is a more stable model) unless the context says that one should expect differences, such as when looking for differential intervention effects. 

linda beck posted on Friday, July 25, 2008  6:41 am



That's also my experience, thank you! Unequal effects don't converge. A related question: should one prefer the covariate solution as the final model even though the BIC is slightly higher than in the unconditional model (30 BIC points)? I've found some significant direct effects on class membership, and not considering them would result in a misspecified model, in my opinion. Thank you so far! Greetings, linda beck 


I prefer the covariates included in the model when possible. The "analyzing-classifying-analyzing" approach using most likely class membership gives biased estimates and SEs. Note that BIC is not in the same metric with and without covariates, so not comparable. This is because the likelihood scale changes when adding covariates. 

linda beck posted on Monday, July 28, 2008  6:49 am



I had a deeper look at my model. Since I'm using two-part growth mixture modeling, it is very complex. I have an intervention which should also predict the slope of the u-part, together with other covariates. As you said, especially when you have an intervention one could postulate unequal effects on the slopes. But this model is very complex. a.) Is it possible to hold the effects of some covariates on the slopes equal and those of others (like the intervention) not, in order to reduce complexity? I guess not... b.) I'm quite sure that the intervention and the other covariates have an effect on the slope. Would it be better to postulate equal effects of the intervention on the slopes than to postulate no effects at all and thereby misspecify the model? Thank you for this great forum! 


I think it is reasonable to hold all covariate effects except the intervention equal across classes. With x1 being the intervention dummy covariate you would say: %overall% ..... su on x1 x2; %c#1% su on x1; %c#2% su on x1; 

linda beck posted on Monday, July 28, 2008  9:56 am



Thanks a lot for this advice. I will try this. It takes a long time to compute the iu and su variances in a two-part mixture model (2 dimensions of integration). I am thinking about reducing the number of integration points (15 is the default). Which value would be reliable enough, 10? What else can one do to increase the speed? 


The first thing to do is to add in the Analysis command: Process = 4(starts); if you have 4 processors. Say 2 if you have 2. This distributes the mixture starts over the processors and speed is much improved. Often it is sufficient to use integ = 7; 

linda beck posted on Tuesday, July 29, 2008  2:06 am



ok, thanks... unfortunately this distribution of mixture starts is not available for the second run of starting values (i think this is needed for LMR)... would be cool to implement in the next version of mplus! 

linda beck posted on Tuesday, July 29, 2008  3:41 am



Before I do the final step, adding covariates to the two-part mixture model, I have a last question (hopefully). In my unconditional model, only the variances of the intercepts and linear slopes are significant. In the conditional mixture models, should I also allow for effects on the quadratic slopes, even though their variance is not significant (or not estimable) in the unconditional mixture models? Would that be too complex? BTW: I get the impression that in mixture modeling, effects of covariates on growth parameters are easier to compute than variances in unconditional models. Do you know why? 


On your first post, I am not sure I know what you mean by the distribution of mixture starts not being available for "the second run of starting values". On your second post, yes you can regress a growth factor on covariates even if it did not have a significant variance in the unconditional run. Using the additional information from the covariate's relationships to the outcomes may uncover variation in the growth factor. It is also ok if the residual variance is zero. 

linda beck posted on Thursday, July 31, 2008  8:43 am



First post: I'm referring to the technical 8 output in the "windows system32" window. When I use LMRT there is a "first" run of starting values (distributed over processors) and then the final stage optimizations (also distributed). After this, Mplus should give me an output, but when using LMRT (tech11; maybe this is the cause, I don't really know) there is a "second" run of both procedures (starting values and optimization). This run is not distributed over processors. After that (a long time), Mplus gives me an output. 


When runs are not distributed across the processors, it may be that there is not enough RAM. We have seen this. 

linda beck posted on Friday, August 08, 2008  7:13 am



Sorry for bothering you again. As I'm writing up my findings on the two-part mixture model with covariates, I noticed that it is no longer possible to get the estimated percentages of belonging to one category in the binary part (I usually derived these in unconditional models via "univariate distribution fit of classes" in the "residuals" option). This might be due to predicting the binary growth parameters with my covariates (for the continuous part this is not a problem, I get the means for plots). The warning message is: "Residuals are not available for models with covariates..." Is there any other option to get these estimates? I find them very useful to plot. If not, I will take the estimated percentages from the unconditional model as an approximation. Regards, linda 


Growth plots are not available when numerical integration is involved. 


I'm interested in using ex 7.27 (FMA) with four continuous variables. What would the model command look like for continuous variables? 


- drop Categorical =
- drop Algorithm=Integration;
- and change the lines with [u1$1-u8$1]; to [u1-u4];


Hi there, I am new to GMM and would like to use it to model the interaction of a continuous covariate x treatment group on longitudinal alcohol outcomes following an intervention. In case the interaction is "disordinal," so to speak, I would like to find the point that in linear regression would be the "crossover interaction", that is, the approximate value of the covariate at which one intervention becomes more efficacious than the other in decreasing at-risk drinking. I would then like to test this "cutpoint" by classifying new cases according to it, running a new, additional GMM on the new cases and comparing their classifications. Is this remotely possible? I have done things like this before (e.g., with Fisher's coefficients from discriminant analysis), and I see you alluded to this kind of application in Muthen & Muthen (2004) on pg 359, but I wasn't sure exactly how to do this in a GMM context. Thanks so much for any tips you can give me! Susan 


Regarding your first paragraph, in a single-class setting, my thinking is that one would estimate the model and, based on the estimated coefficients, derive the value of the covariate (the cutpoint) that you are interested in. With GMM, you have class-specific intervention effects so this exercise needs to be done for each class. Regarding your second and third paragraphs, it sounds like you are referring to my 2004 Kaplan book chapter where I discuss early classification into a problematic class, that is, based on early outcomes. Perhaps you are thinking of using the covariate and treatment group information only (with the model estimates for its coefficients) to classify individuals, which is certainly possible and has been written about, but I don't see offhand how the cutpoint comes in here. 
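As a hedged illustration of deriving the cutpoint from estimated coefficients, suppose (this is an assumption, not something stated in the thread) that within a class the outcome or growth factor is regressed on the covariate x, a treatment dummy, and their product. The treatment effect at covariate value x is then b_treatment + b_interaction * x, which is zero at x = -b_treatment / b_interaction. The Python sketch below applies this per class; the coefficient values and names are hypothetical.

```python
def crossover_point(b_treatment, b_interaction):
    """Covariate value where the treatment effect b_treatment + b_interaction * x is zero."""
    if b_interaction == 0:
        raise ValueError("no interaction, so the treatment effect never changes sign")
    return -b_treatment / b_interaction

# Hypothetical class-specific estimates: (treatment main effect, interaction).
estimates = {"class 1": (-1.2, 0.4), "class 2": (0.5, -0.25)}
for label, (b_tx, b_int) in estimates.items():
    print(label, crossover_point(b_tx, b_int))
```

Whether a crossover value is meaningful also depends on whether it falls inside the observed range of the covariate, which this sketch does not check.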


Thanks so much! That's exactly the info i was hoping for. Do you have any examples you have done (and published) using the early classification into a problematic class? I work best from examples, and I was having trouble searching for that in the literature. Thanks again! 


We discuss this in our short course teaching under Topic 6 on slides 138-148; see http://www.statmodel.com/newhandouts.shtml I am also asking a colleague who has gone ahead with this idea in a paper. 


Thanks for the hint about the handouts. They were helpful. I would still be interested in any papers/authors you might be able to suggest if you get a chance. I did have another question: I would like to run a GMM with two dummy-coded known-class variables (I have three treatment groups)... I haven't seen any examples that include two dummy-coded categorical known-class variables and just wanted to check whether this is doable. Thanks again! Susan 


The one person I know who is working on this is not ready to share the paper. See the following papers on the website:

Boscardin, C., Muthén, B., Francis, D. & Baker, E. (2008). Early identification of reading difficulties using heterogeneous developmental trajectories. Journal of Educational Psychology, 100, 192-208.

Muthén, B., Khoo, S.T., Francis, D. & Kim Boscardin, C. (2003). Analysis of reading skills development from Kindergarten through first grade: An application of growth mixture modeling to sequential processes. Multilevel Modeling: Methodological Advances, Issues, and Applications (in press). S.R. Reise & N. Duan (Eds). Mahwah, NJ: Lawrence Erlbaum Associates, pp. 71-89.

With the KNOWNCLASS option, you can specify three groups. It is not necessary to create two dummy variables. 

linda beck posted on Monday, September 29, 2008  5:41 am



As posted some time ago above, I have a two-part mixture model with an intervention as a covariate in the model. I found two groups, and the intervention has some effects on the slopes in one group. All works fine, but the journal wants effect sizes. a. Is there a way to get an effect size for the effect of the treatment on the slope in the continuous and the binary part? b. I have an idea. I thought of using Cohen's d and getting the means of both trajectories (to get effect sizes for the continuous part), then comparing the means between the dummy-coded treatment groups following Cohen's approach. But this seems to be difficult, since my treatment is part of the whole two-part mixture model (it predicts group membership and slopes). Is there a way to get the means of both trajectories of the continuous part in two-part mixture modeling sorted by a dummy-coded group which is part of the model? Do you see any alternatives? For the binary part I have no idea how to get effect sizes at all... Thank you! linda beck 


a. Effect size has to do with a dependent variable's mean differences across groups divided by the standard deviation. One way is to consider the dependent variable to be the slope but that is not very down to earth. Instead, one would probably want to look at the outcome at some particularly important time point. This information is given by estimated means per class in the RESIDUAL or Tech7 output. b. You seem to say that your treatment dummy covariate influences not only the slope but also the "group membership". By group membership I wonder if you mean the latent class variable. If so, do you really think that treatment changes class membership? In our own work we have typically not taken that approach, but have focused on latent class as being pre-existing (before treatment started) with treatment effect being on the slope within latent class. For the binary part, I don't know that effect size makes sense. Perhaps one can translate the effect into an odds ratio for treatment vs. control related to the binary outcome 0/1. 
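A minimal Python sketch of the two quantities mentioned above, under simple assumptions: the standardized mean difference uses estimated treatment and control means at one chosen time point within a class and a single standard deviation supplied by the user, and the odds ratio is the exponentiated logit coefficient for the treatment dummy. The function names and all numbers are hypothetical; which standard deviation is appropriate is a substantive choice this sketch does not settle.

```python
import math

def standardized_mean_difference(mean_treatment, mean_control, sd):
    """Mean difference between groups divided by a standard deviation."""
    return (mean_treatment - mean_control) / sd

def odds_ratio(logit_coefficient):
    """Odds ratio implied by a logit-scale treatment coefficient."""
    return math.exp(logit_coefficient)

# Hypothetical class-specific estimated means at the time point of interest.
d = standardized_mean_difference(mean_treatment=10.0, mean_control=12.0, sd=4.0)
# Hypothetical logit coefficient for treatment in the binary part.
or_treatment = odds_ratio(-0.4)
print(d, or_treatment)
```

A negative difference and an odds ratio below 1 would both point toward lower outcomes under treatment in this hypothetical example.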

linda beck posted on Tuesday, September 30, 2008  8:49 am



thank you! a. o.k., these are the means of the classes in general. But to compute effect sizes I need the means of the classes sorted by treatment vs. control. How can I get these means in mixture modeling? Or did I misunderstand you!? sorry... b. I found a significant effect of treatment on group membership and I think this is very reasonable. Otherwise, it would be bad news for prevention science, since one goal of (primary) prevention programs is to deflect people from bad developmental pathways concerning substance use, aggression and so on. If those pathways were set in stone, one could only minimize the damage (only effects on the slope in bad trajectories). 


a. I think you get the treatment- and class-specific means using the plot part of Mplus via the "adjusted means" option. Otherwise, you can compute them from the parameter estimates. b. It is certainly a possible modeling approach, but you have to be careful that the class membership doesn't influence the parameters of outcomes before the intervention starts, because in that case it doesn't make sense that the intervention influences class membership (which in turn influences something before the intervention). For instance, if your growth model includes the pre-intervention time point and you center so that the intercept represents the systematic part of the outcome at that time point, the intercept growth factor mean should not be allowed to vary across the latent classes. Another approach is a latent transition model, where you have a latent class variable before the intervention and another one after. The one after can have as "indicators" the growth factors of a process that starts after the intervention. 
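As a rough illustration of "computing them from the parameter estimates", the sketch below works out model-implied means for a linear growth model in which a treatment dummy predicts only the slope. Everything here is an assumption made for illustration: the parameter names, the numbers, the linear time scores 0 to 3, and the restriction to the continuous part on its own (combining it with the binary part of a two-part model is a separate step not shown).

```python
def implied_mean(intercept_mean, slope_intercept, slope_on_tx, tx, time_score):
    """Model-implied mean of a linear growth outcome at one time score."""
    # Treatment shifts the slope mean; the intercept mean is left untouched.
    slope_mean = slope_intercept + slope_on_tx * tx
    return intercept_mean + slope_mean * time_score

# Hypothetical class-specific estimates (not taken from any output in this thread).
# The intercept mean is held equal across classes, as suggested in the reply above.
classes = {
    1: {"intercept_mean": 2.0, "slope_intercept": 0.50, "slope_on_tx": -0.20},
    2: {"intercept_mean": 2.0, "slope_intercept": 0.10, "slope_on_tx": 0.00},
}

for c, est in classes.items():
    for tx in (0, 1):
        means = [round(implied_mean(tx=tx, time_score=t, **est), 3) for t in range(4)]
        print(f"class {c}, tx={tx}: {means}")
```

Each printed row is one trajectory: a class crossed with control (tx=0) or treatment (tx=1).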

linda beck posted on Thursday, October 02, 2008  1:23 am



b. I modeled invariant intercept means between classes in my two-part mixture model. I set: [iu] (1); [iy] (2); in both class statements. But the model didn't converge due to a non-positive Fisher matrix and no computable standard errors. The means/intercepts of 'iy' were set equal in the output, but I think 'iu' was the problem; there were only asterisks. Do I have to add a command, maybe aiming at the thresholds or something like that? thanks, linda beck 

linda beck posted on Thursday, October 02, 2008  7:51 am



addition: I think the problem with equal 'iu' across both classes has something to do with the need of Mplus to constrain at least one 'iu' to zero. Am I right? I have no more ideas. 

linda beck posted on Thursday, October 02, 2008  8:14 am



sorry, I think I found the right way to test mean equality in both trajectories. set 'iu' to 0 in both classes, and 'iy' equal, right? :) sorry, once again 


For the growth model for the categorical outcome, the comparison is setting the mean of iu to zero at all timepoints versus zero at one timepoint and free at the other timepoints. For the growth model for the continuous outcome iy, the comparison is the means of iy free at all timepoints versus holding the means equal across time. 

linda beck posted on Thursday, October 02, 2008  10:15 am



my setting in order to do this is: %c#1% [iu@0]; [iy] (1); %c#2% [iu@0]; [iy] (1); against a model without that class specific statements. right!? many thanks for that great support! 


That looks correct. 

linda beck posted on Tuesday, October 21, 2008  7:49 am



back to bengt's answer from october 01, 2008, 08.19h. Unfortunately the 'adjusted means' option is not available when estimating the entire two-part model. How can I compute the treatment- and class-specific means from parameter estimates as you said!? 


Do you mean for the continuous part? 

linda beck posted on Wednesday, October 22, 2008  1:39 am



sorry for being imprecise, yes, for the continuous part. 


Please see the Olsen-Schafer JASA article referred to in the UG. If it looks too complex, a local statistician should be able to help with that. 

linda beck posted on Monday, January 12, 2009  9:03 am



I did not find anyone who could help me with computing class-specific estimated means of the two-part mixture separated by treatment (see last posts). Currently I'm thinking of simply splitting the sample by treatment/control and estimating the final "two class" two-part mixture for both these conditions, to get the class-specific estimated means for the continuous part separated by treatment condition. a.) Would that be sufficient to get an idea of the class-specific means (separated by treatment) of the original model, which utilized the entire sample? 

linda beck posted on Tuesday, January 13, 2009  8:26 am



sorry for bothering you again, but besides the problem above, there is only one problem left that reviewers wanted solved. I want to control the effects of treatment on slope for initial level by predicting the slope with the intercept (Muthen & Curran). I'm using two-part growth mixture (randomized prevention) and unfortunately there are some effects of the treatment on pre-intervention status (intercept). My problem: b. I don't want to use the original Muthen and Curran approach (I want to skip the "multiple group" or "known class" part). I only want to predict the slope with treatment within the entire sample, controlling for effects of the intercepts on the slopes. Is Muthen and Curran (predicting slope with intercept) the right approach for achieving that aim? c. I lose the covariance between the intercepts of the y and u parts when I use the intercepts as predictors of the slopes within both parts. That is the only covariance of growth factors between the y and u parts I have estimated in my two-part mixture model (because it was the only significant one). Is that a problem? I thought it is the heart of two-part growth curves to have at least one covariance between the growth factors of both parts... Thank you so much! linda 


b. Growth mixture modeling is a more general alternative to the Muthen-Curran modeling, so since you are doing mixture modeling you don't need to also think about Muthen-Curran matters. c. You should add that covariance back in the model by saying iu with iy; 


a. That would only be an approximation. You can see how different the estimates are from those of the joint analysis. 


P.S. on a. You first can do the analysis of both groups, then do the analysis of each group with all parameters fixed at the values of the first analysis to get the plots. 

linda beck posted on Wednesday, January 14, 2009  10:30 am



thank you for your patience... so I should use the same (co)variances, thresholds and regression coefficients (derived from the analysis of the entire sample) for the analysis of both control/treatment? What about the means? They surely should not be fixed in treatment and control analyses (because you said "all")!? 


Good point. I assume your treatment/control dummy influences a growth factor mean, in which case you have to use the mean for the group in question. 

linda beck posted on Thursday, January 15, 2009  8:52 am



I'm not sure if I fully understood. But what I've done so far (with plausible estimated means as outcome) is: I have estimated the model separately for the treatment and control conditions, fixing all parameters (covariances, thresholds, regression coefficients and growth factor means) at the values from the analysis utilizing the entire sample. With one exception: regarding the growth factor means (which are in fact intercepts, because I have a conditional model), I fixed only the intercepts of iu and iy at the values from the entire-sample analysis, because they were not influenced by treatment in the analysis using the entire sample (su and sy are influenced by treatment). In other words, only the slope means were allowed to be freely estimated in the separate analyses. I'm a bit confused about the thresholds: should I fix them at the values from the entire-sample analysis or not when doing the separate analyses for treatment/control? I hope that's the way to go, in principle... P.S.: Would be really cool to have the adjusted means (by covariates) option for two-part models in the future! :) 

linda beck posted on Friday, January 16, 2009  10:34 am



add., besides the thresholds (in question) should one also fix the intercept of c (c#1) when doing the separated analysis for Treatment and control? Sorry, I overlooked that [c#1] yesterday... When I fix the thresholds and [c#1] I have only two parameters left to estimate in both separated analyses (su and sy), which are influenced by t in the original model, utilizing the entire sample. Is that the correct model for what you had in mind? 


In order to guide you properly we need you to send the Mplus Version 5.2 output, input, data, and license number to support@statmodel.com. 

linda beck posted on Friday, January 16, 2009  11:04 am



add2, sorry, since c is influenced by treatment the intercept of c should vary across the analyses, i guess. only the thresholds remain in question. sorry for unnecessary posting.... 

HeeJin Jun posted on Monday, February 16, 2009  3:02 pm



Hi, I am doing mixture modeling (type= mixture complex). Tech 11 says that 4 class model was not better than 3 class model (p=0.1171)(tech 14 wasn't available due to type=complex). To make sure, I ran the analysis again with type=mixture only and the tech 14 says that 4 class model is better than 3 class model (p=.0000). What result should I choose to pick up the number of class? Thanks. 


Are you using a weight variable? 

HeeJin Jun posted on Tuesday, February 17, 2009  8:45 am



No. I used a cluster."CLUSTER = idm;" Thanks. HeeJin 


You need to look at the results from TECH11 while taking weighting and clustering into account. Running TECH11 or TECH14 without taking weighting and clustering into account does not correctly represent your data. A further consideration in choosing the number of classes is the substantive interpretation of the classes. 


Hello, I am running a 2-class LCA with 1 binary categorical indicator and 3 nominal indicators. I have two questions: 1. As outlined in the section on TECH14 in the User Guide, I first ran TECH14 with the starts option, then used the OPTSEED option with the seed of the stable solution, then ran with LRTSTARTS = 0 0 40 10. Here, I received a warning: WARNING: THE BEST LOGLIKELIHOOD VALUE WAS NOT REPLICATED IN 3 OUT OF 5 BOOTSTRAP DRAWS. THE P-VALUE MAY NOT BE TRUSTWORTHY DUE TO LOCAL MAXIMA. INCREASE THE NUMBER OF RANDOM STARTS USING THE LRTSTARTS OPTION. I increased the LRTSTARTS to 0 0 80 20 and the model ran without any problems/warnings. I then additionally specified LRTBOOTSTRAP = 100 (as suggested by McLachlan and Peel, 2000), and I again received the warning printed above. I subsequently increased the LRTSTARTS to as high as 0 0 150 30 and no longer receive the warning. Should I be concerned about having to increase the LRTSTARTS so high? 2. For the Lo-Mendell-Rubin LRT in TECH11, I understand that the last class should be the largest. In the User Guide, you specify that if you are using starting values, they be chosen so that the last class is the largest. If I am using the automatic starting values and I notice that the last class is not the largest, does this mean I have to specify my own starting values? Many thanks. 


1. As long as you reach a solution, you don't need to be concerned. It may be that the loglikelihood is bumpy and it is difficult to find the solution. 2. Yes. 


Hello, I am comparing 2-, 3-, 4-, and 5-class models using LCA. I have 3 nominal and one binary indicator, and I am running these LCA models at various ages (25, 30, 35 years etc.). I have a few questions from my output: 1. In a couple of instances, when I run TECH11 and TECH14, the H0 Loglikelihood from the k-1 class model is not the same as it was in the previous run with one less class. Why does this happen and how can I interpret it? 2. In a few instances, I observe the following two errors: a) ONE OR MORE PARAMETERS WERE FIXED TO AVOID SINGULARITY OF THE INFORMATION MATRIX. THE SINGULARITY IS MOST LIKELY DUE TO THE MODEL IS NOT IDENTIFIED, OR DUE TO A LARGE OR A SMALL PARAMETER ON THE LOGIT SCALE. THE FOLLOWING PARAMETERS WERE FIXED: 62 What does this mean for model interpretation? b) THE STANDARD ERRORS OF THE MODEL PARAMETER ESTIMATES MAY NOT BE TRUSTWORTHY FOR SOME PARAMETERS DUE TO A NON-POSITIVE DEFINITE FIRST-ORDER PRODUCT MATRIX. THIS MAY BE DUE TO THE STARTING VALUES BUT MAY ALSO BE AN INDICATION OF MODEL NON-IDENTIFICATION. THE CONDITION NUMBER IS 0.487D-16. PROBLEM INVOLVING PARAMETER 40. Can you provide any guidance on how I can address this issue? Many Thanks 


1. Use the STARTS and/or LRTSTARTS option to increase the random starts. 2a. If the thresholds are fixed, this is fine. If it is other parameters, send your output and license number to support@statmodel.com. 2b. Please send the output and your license number to support@statmodel.com. 


Hi, I have a question about random starts. I have requested TECH1 but I am not sure where I can find the starting values. For example, if I get the output as follows, what are the starting values? Are they 0 0 0 for the variables AC_FS NC_FS CC_FS? Or are they .305 .229 .225? Thanks!

           NU
              AC_FS         NC_FS         CC_FS
              ________      ________      ________
 1            0.000         0.000         0.000

           THETA
              AC_FS         NC_FS         CC_FS
              ________      ________      ________
 AC_FS        0.305
 NC_FS        0.000         0.229
 CC_FS        0.000         0.000         0.225


The starting values are under the heading of STARTING VALUES under Technical 1 in the output. It looks like what you have above are starting values. Nu and Theta are different matrices. Nu gives starting values for means. Theta gives starting values for variances and covariances. 


Thanks so much, Linda! 


Dear Drs Muthen, I have been trying to build a conditional model that fits my dataset well. My path diagram describes a relationship between weight at different ages and cancer development. I have two latent variables (slope and intercept) with arrows pointing towards the weight at different ages. I also have arrows from these latent variables towards cancer, as I am trying to investigate the effects of the weight changes on cancer. In this case, I have weight and cancer as outcome variables. Weight is my continuous variable, and cancer is my categorical variable (free (0), early onset (1), late onset (2)). My aim is to know the patterns of weight gain/loss by cancer group (0, 1 and 2). I have tried LGCM, specifying cancer as categorical. But I could not get the chi-square and model fit indices (e.g. CFI and RMSEA). When I run the model without specifying the categorical variable, I get the fit indices, but 999.000 in my S.E. and p-values persists. 1. Is LGCM ideal in my case? 2. What do I do if my slope and intercept are exogenous with two outcome variables? 3. How do I deal with the 999.000 values? Thanks for your help in advance. 


Given that the weight profiles do not follow a developmental trend but rather go up and down, I would not use a growth model. I would use the weight variables in a latent class analysis to find patterns of weight gain and loss and use the cancer variable as a distal outcome. 


Dear Dr. Linda, Many thanks for the clarification. Regards. 


Dear Dr. Linda, In relation to the above question: Where do I get reference material for " Latent class analysis with distal outcome". I am new to Mplus and such techniques! Regards. 


I would suggest viewing the Topic 5 course video and looking at the papers on the website Latent Class Analysis. A good book is: Hagenaars, J.A & McCutcheon, A. (2002). Applied latent class analysis. Cambridge: Cambridge University Press. Example 7.12 shows an LCA with a covariate. 

Simon Denny posted on Thursday, September 16, 2010  9:45 pm



Hello Linda and Bengt, I have fitted a multilevel mixture model with four classes with a categorical outcome. The between-level variables are dummy variables that were constructed for high, medium and low levels of various aspects of school environments. I am interested in estimating the percentages or rates of my outcome variable within each class at different levels of the school variables, i.e. instead of presenting the odds ratios, I want to present the percentages/rates within the four classes. This is for a lay report where odds ratios are not easily understood. The percentages don't necessarily need to take into account the individual-level covariates, but this would be nice if possible. Is there any way of converting the odds ratios back to a percent of my outcome variable? Or is there any way of assigning the students to their latent class so I can then look at different levels of the dummy variables and my outcome variable? Thanks very much for any help. 


This is possible. I would need to see your full output to understand your exact situation. Please send it and your license number to support@statmodel.com. 
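For readers with the same question, the basic arithmetic of turning an odds ratio back into a percentage can be sketched as below. This is a simplified illustration, not the procedure support would recommend for a specific multilevel model: it assumes a known outcome probability in the reference group (here the hypothetical `baseline_prob`) and ignores random effects and other covariates.

```python
def prob_from_odds_ratio(baseline_prob, odds_ratio):
    """Probability implied by applying an odds ratio to a baseline probability."""
    baseline_odds = baseline_prob / (1.0 - baseline_prob)
    new_odds = baseline_odds * odds_ratio
    return new_odds / (1.0 + new_odds)

# Hypothetical example: within one latent class, 20% of students in the
# reference school category show the outcome, and the odds ratio for the
# comparison school category is 1.8.
p = prob_from_odds_ratio(baseline_prob=0.20, odds_ratio=1.8)
print(f"implied percentage: {100 * p:.1f}%")
```

The same function applied class by class gives one percentage per class and school level, which is usually easier to present in a lay report than the odds ratios themselves.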

Poh Hiong posted on Sunday, February 06, 2011  8:52 am



Hi, I have a question regarding solution to my latent class analysis. I realize that there is a difference in the profile of my classes between the model without predictors and the model with the predictors. To elaborate, in my analysis without any predictors, it shows 3 classes. However, when I add in the predictors, it also shows 3 classes but they take a different profile. I am not sure if I can still interpret the effects of the predictors on class membership when there are changes in the profile. Any advice on this situation will be helpful. Thanks. 


This issue is discussed in the following paper which is available on the website: Muthén, B. (2004). Latent variable analysis: Growth mixture modeling and related techniques for longitudinal data. In D. Kaplan (ed.), Handbook of quantitative methodology for the social sciences (pp. 345-368). Newbury Park, CA: Sage Publications. 


I had a question about a mixture model. I got the following error after using RML with Monte Carlo, and my log likelihoods are over 2000, but it said the model terminated normally. Any thoughts? Thanks! THE STANDARD ERRORS OF THE MODEL PARAMETER ESTIMATES MAY NOT BE TRUSTWORTHY FOR SOME PARAMETERS DUE TO A NON-POSITIVE DEFINITE FIRST-ORDER DERIVATIVE PRODUCT MATRIX. THIS MAY BE DUE TO THE STARTING VALUES BUT MAY ALSO BE AN INDICATION OF MODEL NON-IDENTIFICATION. THE CONDITION NUMBER IS 0.210D-16. PROBLEM INVOLVING PARAMETER 25. 


Please send the output and your license number to support@statmodel.com. 


Hello, I would like to use a known class procedure to estimate a cross-gender latent profile of six observed variables, allowing the variances to vary across gender. Here are my questions: 1. As I understand from my research so far, there is no way to obtain a test of the relative fit of k and k-1 classes (an LRT) when using the known class procedure in mixtures in Mplus. Am I correct? 2. Beyond looking at the BIC, what are your recommendations for comparing class solutions when using the known class option? 3. I considered running latent profile models separately for males and females, but this doesn't give me the same solutions since mean profiles will be different across genders. Is running by-gender models the better choice given the limitations of estimating the best solution using the known class procedure? 4. Do you know of any published studies using the known class procedure in Mplus for a latent profile? Thanks in advance for your help, Michelle Little 


1. That's correct. It is possible in principle to do the BLRT, but it is only implemented for one latent class variable and exploratory mixture analysis (LCA). 2. With categorical items I would use bivariate tests in TECH10 and with continuous outcomes perhaps TECH11, TECH12, TECH13 (see UG). 3. I would recommend considering having gender as a covariate instead of KNOWNCLASS because it may make the solution search easier. Although it sounds like you have continuous indicators and therefore might consider class-varying variances, which the covariate approach can't handle, the covariate approach allows gender variation in class probabilities (c on x) and allows gender differences in some indicator means by direct effects of x on y (item bias). Without direct effects, the class profiles are the same for the genders. I would first analyze each gender separately to see if there is hope for getting the same classes. 4. Not offhand, but check under Papers on our web site. Does someone else know? 


Thanks for the above! I ran by-gender LPAs and found that a 4-class model was the best fit for both genders. The configuration of profiles is generally similar, with 2 notable mean differences on specific indicators within 2 classes. I ran a full model with a gender covariate effect on those indicators within those particular classes, and a gender direct effect on class membership. Here are my questions: 1. A best-fitting 4-class model resembles the by-gender profiles. To obtain class probabilities, should I leave in the direct effect of gender on C, although it is non-significant, since the gender covariate affects class membership? 2. For planned analyses, it would be helpful to use the class probabilities from the full model. Any advice on good criteria for using the full-sample vs by-gender class membership results, beyond significant chi-squares in crosstabs of class membership in full vs by-gender models? 


1. I think you are asking if you should leave "c on gender" in the model even if it is not significant. I would say yes; I am not in favor of "trimming" the model after analysis. 2. I would use the full sample analysis since you found that the two genders have most parameters in common. But report that the separate gender analyses agree with the full model. 


Thanks. 


I am running a latent profile analysis on 5 factors with each three indicators (3 different informants for each factor, but the same three across factors). The solution looks very much like what I would expect based on theory, but I get the negative psi matrix warning. It relates to the last factor, but if I run the analysis without this factor, I get the same warning but for the factor that is now the last. I do not see a negative (residual) variance or a correlation greater than 1 anywhere. To me, it seems unlikely that there is a linear dependency among more than two variables, but I would like to ask how I can check for this. thank you! 


The message could also be due to a group of high correlations. See TECH4. I would do the analysis without mixture and explore the factor structure further. I would ask for modification indices to see if there is a need for residual covariances across the five factors. 


thank you 

Yen posted on Wednesday, March 14, 2012  9:30 am



I am using imputed data (.implist) for factor mixture model. It took me more than 2 hours to run each analysis. I wonder if the following commands are appropriate for imputed data: algorithm=integration starts=1000 250 Second, is it possible to request plot with implist? Thank you. 


I would not recommend using imputed data with such a computationally demanding model. I would use MLR on the original data. 

Matt Luth posted on Friday, March 16, 2012  9:09 am



I am trying to run a model similar to example 7.27. However, instead of fixing the first factor loading to be one in both groups to set the scale, I would like to fix the variance in both groups to one. Would this be the correct syntax? MODEL: %OVERALL% F BY U1* U2-U8; [F@0]; F@1; %C#1% F BY U1* U2-U8; [U1$1-U8$1]; %C#2% F BY U1* U2-U8; [U1$1-U8$1]; The reason I ask is that these two models produce different results. I’m sure I’m missing something. Thanks 


It sounds like you are hitting a local solution. Try more random starts. Also, be sure to check that you have the same number of parameters in both models. Otherwise, send both outputs and your license number to support@statmodel.com. 

Stata posted on Tuesday, March 20, 2012  12:31 pm



I am running a factor mixture model with ordinal and binary variables. I got a bunch of error messages with FMM3. Is there a problem with my Mplus syntax? *** ERROR in MODEL command Ordered thresholds 2 and 3 for class indicator A4 are not increasing. Check your starting values. ANALYSIS: TYPE = Mixture; Algorithm = integration; Starts = 2000 500; Model: %overall% f by a1-a20 b1-b5 c1-c4 d1-d6 e1-e12; [f@0]; f; %c#1% [a1$3-a20$3]; [b1$1-b5$1]; [c1$3-c4$3]; [d1$2-d6$2]; [e1$3-e12$3]; %c#2% [a1$3-a20$3]; [b1$1-b5$1]; [c1$3-c4$3]; [d1$2-d6$2]; [e1$3-e12$3]; Any help is appreciated! 


You seem to want the thresholds to be class-varying, but the way you state things not all of them will be class-varying. As an example, for [a1$3-a20$3]; it seems like you should refer also to thresholds 1 and 2 to make them different across classes as well. 

Stata posted on Tuesday, March 20, 2012  9:26 pm



Dr. Muthen, I am sorry to bother you again. Does it mean that I have to specify each threshold? [a1$1-a20$1]; [a1$2-a20$2]; [a1$3-a20$3]; Thank you 


Yes. 
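Writing out every threshold by hand for many item blocks is error-prone, so a small helper can generate the statements. The sketch below is only a convenience for producing text to paste into an input file. The helper name is hypothetical, the item ranges are taken from the post above, and the hyphenated list form and the category counts are assumptions read off that post (the highest threshold is the number of categories minus one).

```python
def threshold_statements(first_item, last_item, n_categories):
    """One bracket statement per threshold for a block of items.

    An ordinal item with n_categories categories has n_categories - 1 thresholds.
    """
    return [
        f"[{first_item}${k}-{last_item}${k}];"
        for k in range(1, n_categories)
    ]

# Item blocks from the post; the category counts are inferred from the
# highest threshold each block refers to.
blocks = [("a1", "a20", 4), ("b1", "b5", 2), ("c1", "c4", 4),
          ("d1", "d6", 3), ("e1", "e12", 4)]

for class_label in ("%c#1%", "%c#2%"):
    print(class_label)
    for first, last, n_cat in blocks:
        for statement in threshold_statements(first, last, n_cat):
            print("  " + statement)
```

The output lists the same full set of thresholds under each class, which is what makes all of them class-varying.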

Yen posted on Saturday, March 24, 2012  11:06 am



Mplus 6.1 and 6.11 provide tech11 and tech14 results for a factor mixture model with type=imputation. Why does 6.12 not have that capability (see below)? In that case, should I trust the tech11 and tech14 results from Mplus 6.1 and 6.11? *** WARNING in OUTPUT command TECH11 option is not available with DATA IMPUTATION or TYPE=IMPUTATION in the DATA command. Request for TECH11 is ignored. *** WARNING in OUTPUT command TECH14 option is not available with DATA IMPUTATION or TYPE=IMPUTATION in the DATA command. Request for TECH14 is ignored. 2 WARNING(S) FOUND IN THE INPUT INSTRUCTIONS Thank you. 


We removed TECH11 and TECH14 from multiple imputation in Version 6.12 because these were not yet developed for multiple imputation. How this should be done is still a research area. 

Yen posted on Saturday, March 24, 2012  7:13 pm



Hi Linda, I am sorry to bother you again on this topic, but do the issues with tech11 and tech14 also apply to latent class analysis with type=imputation? Thanks. 


This applies to all analyses with TYPE=IMPUTATION. 

Bart Simms posted on Friday, March 30, 2012  10:32 pm



Hello, I am attempting to run an FMM5 with two classes and two correlated factors. There are 3 count indicators and four ordinal indicators. I also included STARTS = 120 30; STITERATIONS = 65; In the output there were the warnings THE MODEL ESTIMATION HAS REACHED A SADDLE POINT OR A POINT WHERE THE OBSERVED AND THE EXPECTED INFORMATION MATRICES DO NOT MATCH. THE CONDITION NUMBER IS 0.235D-04. THE PROBLEM MAY ALSO BE RESOLVED BY DECREASING THE VALUE OF THE MCONVERGENCE OR LOGCRITERION OPTIONS. THE STANDARD ERRORS OF THE MODEL PARAMETER ESTIMATES COULD NOT BE COMPUTED. THIS IS OFTEN DUE TO THE STARTING VALUES BUT MAY ALSO BE AN INDICATION OF MODEL NON-IDENTIFICATION. CHANGE YOUR MODEL AND/OR STARTING VALUES. PROBLEM INVOLVING PARAMETER 9. RESULTS ARE PRESENTED FOR THE MLF ESTIMATOR. Parameter 9 is a loading for a categorical variable in class 1. I'm just wondering what the best thing to do is at this point to get it to run. Even with the saddle point, this is better than my other models (including the 1-class version) in terms of information criteria, and it makes good sense substantively. So I'm really hoping to get it to work. I figured I would ask before I tried more starts, because that run took 45 hours. 


It looks like the estimation recovered and you obtained MLR standard errors. If you have standard errors in the results, you can ignore the message. You can also try to decrease the MCONVERGENCE and LOGCRITERION options. 

Bart Simms posted on Saturday, March 31, 2012  11:44 pm



Thank you. Yes, there are standard errors. So the MLF results are only the parameter estimates themselves, and not the standard errors? I also realized that I did something that surely didn't help the estimation in that the loading set to 1 for the second factor was a cross loading, and this was the same indicator set to 1 for the first factor. This resulted in quite a low factor variance and some huge loadings for other indicators. I guess I will rerun it after correcting this, and also try more starts. *But am I wasting time by setting STITERATIONS too high at 65? Perhaps the default is better? 


Just to clarify, the standard errors you get are MLF not ML or MLR. The parameter estimates are maximum likelihood as they are for ML, MLR, and MLF. I would not change the default unless I had a compelling reason. 


Hello! I am new to Mplus and to LCA with a combination of categorical and continuous variables (7 in total). I think 5 clusters seem to be the best solution, but I get the three warnings below in the process. I have two big questions: 1) Do I have to standardize all the continuous variables (they have different scales)? 2) Would you please let me know what I am supposed to do? The three warnings are: WARNING: THE BEST LOGLIKELIHOOD VALUE WAS NOT REPLICATED. THE SOLUTION MAY NOT BE TRUSTWORTHY DUE TO LOCAL MAXIMA. INCREASE THE NUMBER OF RANDOM STARTS. THE STANDARD ERRORS OF THE MODEL PARAMETER ESTIMATES MAY NOT BE TRUSTWORTHY FOR SOME PARAMETERS DUE TO A NON-POSITIVE DEFINITE FIRST-ORDER DERIVATIVE PRODUCT MATRIX. THIS MAY BE DUE TO THE STARTING VALUES BUT MAY ALSO BE AN INDICATION OF MODEL NON-IDENTIFICATION. THE CONDITION NUMBER IS 0.851D-17. PROBLEM INVOLVING PARAMETER 14. ONE OR MORE PARAMETERS WERE FIXED TO AVOID SINGULARITY OF THE INFORMATION MATRIX. THE SINGULARITY IS MOST LIKELY BECAUSE THE MODEL IS NOT IDENTIFIED, OR BECAUSE OF EMPTY CELLS IN THE JOINT DISTRIBUTION OF THE CATEGORICAL VARIABLES IN THE MODEL. THE FOLLOWING PARAMETERS WERE FIXED: 19 20 21 22 23 24 37 


1. No. 2. Please send your output and license number to support@statmodel.com. 


Hi, I want to do path analysis based on three cohorts. I want to find out whether it is justifiable to use the three cohorts all together, although the frequency distribution of some variables differs depending on the cohort. I ran the path model for each cohort separately and for all three cohorts together. I compared the results, and the general conclusion is that more parameters become significant when I use all three cohorts together. There are only a few minor differences in the results between the models of the three cohorts. I wonder if this conclusion is sufficient to justify using all three cohorts together. Or should I try a multigroup comparison? I looked at the user’s guide, but could not find an example in the chapter on mixture modeling that resembles what I think I need to do. A simplified version of my current path model is as follows: CATEGORICAL ARE y1 y; USEVARIABLES ARE x1 x2 x3 y1 y2 ; MISSING ARE ALL (999); ANALYSIS: estimator = ml; integration = montecarlo; MODEL: Y2 on x1 x2 x3 y1; y1 on x1 x2 x3; What does the syntax look like when I want to find out whether this model differs depending on the cohort being used? Or is there some other way than mixture modeling to find out that there is statistically no difference according to the cohort and that it is justifiable to use the path model with all three cohorts together? I appreciate your advice. 


You can use the multiple group multiple cohort approach shown in Example 6.18. This allows you to test differences in parameters across the cohorts. 

namer posted on Wednesday, June 26, 2013  3:26 am



Hello, when running a 3-class LCGA on 6 time points, I get the warning: WARNING: THE BEST LOGLIKELIHOOD VALUE WAS NOT REPLICATED. THE SOLUTION MAY NOT BE TRUSTWORTHY DUE TO LOCAL MAXIMA. INCREASE THE NUMBER OF RANDOM STARTS. I've tried increasing STARTS up to 5000 500 and it doesn't resolve the problem. However, if I subsequently set OPTSEED to the random seed value that resulted in the best loglikelihood value in the previous (non-replicated) analysis, this warning goes away. Can I now be sure that I have achieved a global maximum? Do you have any other advice on how I could address this warning? Thanks kindly! 


No, the fact that the OPTSEED run doesn't give the warning is no guarantee. For us to diagnose this you would have to send input, data, output, and license number to support@statmodel.com. 
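One way to see why a single-seed rerun cannot settle the question: replication is a property of the whole set of final-stage loglikelihoods from the multi-start run, not of any one seed. The sketch below checks a list of such values for a replicated best value. The function name, the tolerance, and the numbers are all hypothetical; this is an illustration of the idea, not of how Mplus itself decides when to print the warning.

```python
def best_ll_replicated(loglikelihoods, tol=1e-3):
    """Return (best value, number of starts reaching it, replicated flag).

    A start counts as reaching the best value if its loglikelihood is
    within `tol` of the maximum; `tol` is an arbitrary illustrative choice.
    """
    best = max(loglikelihoods)
    n_hits = sum(1 for ll in loglikelihoods if best - ll <= tol)
    return best, n_hits, n_hits >= 2

# Hypothetical final-stage loglikelihoods from two different runs.
replicated_run = [-1500.25, -1500.25, -1500.25, -1512.10]
unreplicated_run = [-1500.25, -1503.00, -1507.80, -1512.10]

print(best_ll_replicated(replicated_run))
print(best_ll_replicated(unreplicated_run))
```

In the second run the top value appears only once, so fixing the seed that produced it would reproduce the same solution without adding any evidence that it is the global maximum.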

Adam Myers posted on Monday, July 29, 2013  6:12 pm



Hi Bengt and Linda, I have a question about the different diagnostics for selecting an appropriate latent class solution. Specifically, when I run an LPA using Mplus, the LMR test indicates rejection of a three-class solution (which makes little sense for my data), but when I run a model with four through nine classes, the p-value for the LMR test is well below .05. All of the other indicators (AIC, BIC, BLRT) consistently suggest that each successive class solution is a better fit. I read in Nylund, Asparouhov, and Muthen (2007) that LMR tends to overestimate the number of classes rather than underestimate it, but it seems to be doing the exact opposite with my data. Thoughts? Should I ignore the LMR results for the 3-class solution? Thanks in advance! 


In a k-class analysis, LMR tests k-1 versus k classes. A p-value greater than .05 says you can't reject the (k-1)-class model. You should be looking for a p-value greater than .05. BIC is good for determining the number of classes. 
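A minimal sketch of how these tests are requested, assuming a 4-class run; the indicator names u1-u6 are hypothetical and the DATA command is omitted. TECH11 requests the LMR test and TECH14 the bootstrapped likelihood ratio test, both comparing the model with one class fewer against the model being estimated.

```mplus
VARIABLE:
  NAMES ARE u1-u6;       ! hypothetical indicator names
  CLASSES = c(4);        ! k = 4, so the tests compare 3 versus 4 classes
ANALYSIS:
  TYPE = MIXTURE;
OUTPUT:
  TECH11                 ! Lo-Mendell-Rubin test of k-1 versus k classes
  TECH14;                ! bootstrapped likelihood ratio test (BLRT)
```

Here a TECH11 p-value above .05 would mean the 3-class model cannot be rejected in favor of the 4-class model.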


Can Mplus deal with the 3PLM model? 


Not at this time. 

xiaoyu bi posted on Tuesday, January 07, 2014  10:17 am



Dear Dr. Muthen, For the growth mixture model, when I report the counts/proportions of individuals in each class, should I report them based on (a) estimated posterior probabilities, or (b) their most likely latent class membership? I noticed that the graphs generated by Mplus are based on the estimated posterior probabilities. I read a book that reports the numbers/proportions based on most likely latent class membership. But if I report the counts and proportions in the text based on most likely latent class membership, the numbers will not match those in the graph. What do most researchers do? Any suggestions? Thank you so much! 


You should report counts based on the estimated model (this section comes first in the output). 

Carey posted on Tuesday, May 13, 2014  2:23 pm



I keep on getting this error: "*** WARNING in MODEL command All variables are uncorrelated with all other variables within class. Check that this is what is intended." Is this problematic? 


No, this is as it should be. It is just a reminder. 

Marianne SB posted on Wednesday, September 24, 2014  4:39 am



I have a data set with 6 waves of data on depressive symptoms among about 1000 adolescents. Is it methodologically and statistically sound to: 1) Do a series of growth mixture models; a cubic 3-class solution provided the best fit (lowest BIC, high entropy, all proportions above 5%). (Growth mixture) 2) Then test whether effects of various covariates on level of depressive symptoms are different across classes. (Regression mixture) Do you know of any papers that have done something similar? 


1) Yes 2) Sounds like you want to have class-varying i s on x; You specify that within each class. So, it's possible, although not always easy to get stable solutions. For examples, see the papers on our website under Papers, Growth Mixture Modeling. 
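A sketch of what "specify that within each class" could look like for a cubic 3-class growth mixture model. The outcome names dep1-dep6, the covariate x, the equally spaced time scores, and the growth factor name cu are assumptions for illustration, not taken from the poster's input.

```mplus
VARIABLE:
  CLASSES = c(3);
ANALYSIS:
  TYPE = MIXTURE;
MODEL:
  %OVERALL%
  ! cubic growth model; dep1-dep6 and the time scores are hypothetical
  i s q cu | dep1@0 dep2@1 dep3@2 dep4@3 dep5@4 dep6@5;
  i s ON x;              ! held equal across classes unless mentioned below
  %c#1%
  i s ON x;              ! class-specific effects of x in class 1
  %c#2%
  i s ON x;
  %c#3%
  i s ON x;
```

Mentioning the regression in a class-specific part frees it from the default equality across classes, which is what makes the effects of x class-varying.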

Marianne SB posted on Wednesday, September 24, 2014  10:40 pm



Thanks for your reply! Actually, I want to examine effects of some covariates measured on some of the waves, so I cannot regress i s q c on them. Therefore, I am thinking about specifying dep16 on x y z (time-varying covariates). However, as you suggest, I struggle with unstable solutions and error messages. It might not be possible to examine both classes with different long-term development and different effects of covariates simultaneously with my dataset. I don't think any of the papers under Papers, Growth Mixture Modeling have done this either. 

Marianne SB posted on Wednesday, September 24, 2014  10:43 pm



Addition: My covariates are measured at some of the waves in the middle, not at baseline. 


Should be possible in principle, but may be hard in certain data. 

Anna Hawrot posted on Wednesday, November 12, 2014  6:02 am



Hello, Could you please explain what it means that an LCA model is not well-defined? I've come across this term several times, but haven't found any explanation. 


Can you give a context for this? 

Anna Hawrot posted on Thursday, November 13, 2014  5:19 am



I've seen this term in several articles. For instance, authors were extracting up to 6 classes, but the results of the 6-class model were not reported because the model was not well-defined for the data. Yesterday, after posting my message here, I found the information in UG7, p. 466 that models whose final-stage optimizations resulted in loglikelihood (LL) values very close to the best LL may be not well-defined for the data. In order to understand it better, I experimented with different LCA models and managed to get such models (the parameter values differed not dramatically, but substantially; the LL values were very close, e.g., 8126.353, 8126.691, 8126.729). However, I also got ones with almost identical parameter values. To sum up, my results were in line with the information in UG7. Inferring from my explorations, I would say that a model is "not well-defined" for the data when its parameter estimates are unstable, and thus not trustworthy. Am I right? 


I think you are on the right track. Look at our handout for Topic 5, slide 116, cases 3 and 4. The likelihood has 2 local maxima that are very similar in height. This means that using this model the data is not very informative about the value (estimate) of the parameter on the x axis. This is what we mean by the model not being well-defined. This implies that the ML method breaks down; it cannot clearly help us find a best parameter estimate. In case 3 the problem isn't that big since the parameter values are not that far apart for the 2 peaks, but in case 4 it is a serious problem. 

Anna Hawrot posted on Tuesday, November 18, 2014  9:08 am



Thank you! It's much clearer now! 

Ann Nguyen posted on Wednesday, January 21, 2015  12:23 pm



I ran a series of LCAs (from a 1 class solution to a 4 class solution) despite the LMR test indicating a preference for a 1 class solution. A 3 class solution showed the greatest reduction in AIC and BIC. Entropy was highest for the 3 class solution. Moreover, the 3 class solution was most easily interpretable and consistent with theory. Given all of these factors, is it fine to ignore the nonsignificant LMR test from the 3 class solution and select the 3 class solution as the final solution? Thank you. 


I don't think you need to be tied to what LMR says. A key indicator is where BIC is at its minimum, so if that was at 3 classes you are fine to go with that. BIC doesn't do well at small sample sizes, such as n<200. 

Ann Nguyen posted on Thursday, January 22, 2015  6:20 am



Thank you for your quick response, Dr. Muthen. 


Hi, I'm running an LCA with four ordinal variables. I estimated a 3-class solution, but the proportion of replications of my best LL solution was a bit low (34%) with STARTS = 100 50, and one perturbation failed to converge. I upped the random starts to 500 100 to see if I could get better replication. Again, replication of the best LL solution was 34/100. I then copied the Svalues and pasted them into the MODEL command to further help the model along. I got an error message stating: "One or more pairs of ordered thresholds are not increasing in Class 1. Check your starting values. Problem with the following pairs: ANTI$2 (5.175) and ANTI$3 (5.175) ANTI$4 (15.000) and ANTI$5 (15.000)" I see that a few of my thresholds were the same (i.e., 5.175), and of course the thresholds set to an extreme value (15) would be the same. How do I remedy this problem when pasting in Svalues? 


You can simply add a small value to the starting value for the higher threshold. But getting 34 replications of the best LL seems more than sufficient to me. 
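As an illustration of that fix, the pasted starting values for the two flagged pairs could be edited along these lines. The thresholds and values come from the error message; the size of the increment (0.001) is an arbitrary assumption, and any small positive amount that keeps the thresholds strictly increasing should serve.

```mplus
MODEL:
  %c#1%
  [ anti$2*5.175 ];
  [ anti$3*5.176 ];      ! raised slightly so that anti$3 > anti$2
  [ anti$4*15 ];
  [ anti$5*15.001 ];     ! raised slightly so that anti$5 > anti$4
```

These are only starting values, so the small shift should not affect the final estimates if the model converges to the same solution.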
