
Anonymous posted on Thursday, May 11, 2000  2:56 am



I would like to estimate a structural equation mixture model, where y is the dependent and x1-x4 are the independent latent variables. The path coefficients of the structural model should vary across latent classes. When I try to run the following specifications, class size is always half of the sample size and no model fit measures are reported. Can you help me with the model specifications?
    CLASSES = CLASS(2);
    USEVARIABLES ARE Y_1 Y_2
      X1_1 X1_2 X1_3 X1_4
      X2_1 X2_2 X2_3 X2_4 X2_5 X2_6
      X3_1 X3_2 X3_3 X3_4
      X4_1 X4_2 X4_3 X4_4 X4_5;
    MISSING ARE *;
    ANALYSIS:
      TYPE IS MIXTURE;
      ESTIMATOR IS MLR;
      ITERATIONS = 1000;
      CONVERGENCE = 0.000001;
      MITERATION = 100;
      MCONVERGENCE = 0.000001;
      MCITERATION = 2;
      MCCONVERGENCE = 0.000001;
      MUITERATION = 2;
      MUCONVERGENCE = 0.000001;
    MODEL:
      %OVERALL%
      Y BY Y_1 Y_2;
      X1 BY X1_1 X1_2 X1_3 X1_4;
      X2 BY X2_1 X2_2 X2_3 X2_4 X2_5 X2_6;
      X3 BY X3_1 X3_2 X3_3 X3_4;
      X4 BY X4_1 X4_2 X4_3 X4_4 X4_5;
      %CLASS#1%
      Y ON X1 X2 X3 X4;
      %CLASS#2%
      Y ON X1 X2 X3 X4;
    OUTPUT: TECH1 TECH2 TECH5 TECH7 TECH8;


In mixture modeling you want to give starting values for parameters that distinguish the different latent classes. This means that your class-specific statements %c#1% and %c#2% should give starting values reflecting your beliefs about how the two classes differ. I think your run gets stuck due to the non-differentiated starting values, simply dividing the sample in halves.
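
As a rough sketch of what differentiated starting values could look like for the model in the question, the class-specific Y ON statements below carry a different starting value (the number after the asterisk) in each class. The numeric values are purely illustrative assumptions, not estimates from any data; in practice they would come from substantive expectations or from a preliminary one-class run.

```mplus
MODEL:
  %OVERALL%
  Y BY Y_1 Y_2;
  X1 BY X1_1 X1_2 X1_3 X1_4;
  X2 BY X2_1 X2_2 X2_3 X2_4 X2_5 X2_6;
  X3 BY X3_1 X3_2 X3_3 X3_4;
  X4 BY X4_1 X4_2 X4_3 X4_4 X4_5;
  Y ON X1 X2 X3 X4;

  ! Hypothetical starting values: class 1 is assumed to be driven
  ! mainly by X1, class 2 mainly by X3 and X4.
  %CLASS#1%
  Y ON X1*0.6 X2*0.1 X3*0.1 X4*0.1;
  %CLASS#2%
  Y ON X1*0.1 X2*0.1 X3*0.6 X4*0.6;
```

The class label follows the CLASSES = CLASS(2) statement in the question, so the class-specific sections are written %CLASS#1% and %CLASS#2%.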

Anonymous posted on Wednesday, August 16, 2000  3:34 pm



I have your paper on Second-Generation Structural Equation Modeling (#82) and printouts of the corresponding Mplus examples (penn1-penn7), and I am now trying to use them as a model for my own data. How did you set the starting values used in these lines in penn1:
    math7-math9*7 math10*13;
    slope with intercpt*3.1;
    [intercpt*42.8 slope*.6];
and in penn2:
    [intercpt*62.8 slope*3.6];


Here are two strategies for obtaining starting values. How the ones in the Second Generation paper were obtained has been long forgotten.
Strategy 1
· Do a conventional one-class analysis
· Use the estimated growth factor means and standard deviations as growth factor mean starting values in a multi-class model: mean plus and minus .5 standard deviation
Strategy 2
· Estimate a multi-class model with the variances and covariances of the growth factors fixed to zero
· Use the estimated growth factor means as growth factor mean starting values for a model with growth factor variances and covariances free
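
To make Strategy 1 concrete, here is a minimal sketch for a two-class linear growth model (it presumes CLASSES = c(2) in the VARIABLE command). It assumes, purely for illustration, that a one-class run gave an intercept factor mean of 40 with a standard deviation of 6 and a slope factor mean of 2 with a standard deviation of 1. The variable names y1-y4, the time scores, and all numbers are hypothetical.

```mplus
MODEL:
  %OVERALL%
  i BY y1-y4@1;
  s BY y1@0 y2@1 y3@2 y4@3;
  [y1-y4@0];

  ! Class 1: one-class mean minus .5 SD (40 - 3 = 37, 2 - .5 = 1.5)
  %c#1%
  [i*37 s*1.5];

  ! Class 2: one-class mean plus .5 SD (40 + 3 = 43, 2 + .5 = 2.5)
  %c#2%
  [i*43 s*2.5];
```

For Strategy 2, the first run would add i@0; s@0; i WITH s@0; to the %OVERALL% part, and the class-specific means estimated there would replace the numbers after the asterisks in the second run.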

Peter Tice posted on Thursday, August 24, 2000  8:31 am



Starting values for multi-class models
Yes, I've been working on finding starting values for later identifying the best class model (based on BIC value). Thus far, I'm using the following strategy. I set the growth factor means at zero and for the individual class components provide an intercept value based on an estimated mean and +/- .5 standard deviation. In many ways I'm combining the two strategies listed in the message dated August 17th. I'd like to continue setting the growth factor means at zero, since my preference is to assume no within-class variation. With this strategy I'm comfortably able to fit multi-class models including up to 4 latent classes. However, when continuing further with models including 5 or more classes, I frequently receive the message that Mplus is unable to calculate standard errors (either a result of incorrect starting values or model identification). Below is an example of a program that successfully converges.
    title: mixture model example
    data: file is u:\example.dat ;
    variable:
      names are id t1age del1 del2 del3 del4 ;
      useobservations = t1age = 12 ;
      usevariables are del1 del2 del3 del4 ;
      classes = c(4) ;
      missing are . ;
    analysis:
      type = mixture ;
      miterations = 1000 ;
    model:
      %overall%
      i by del1-del4@1 ;
      s by del1@0 del2@3 del3@6 del4@13 ;
      q by del1@0 del2@9 del3@36 del4@169 ;
      [del1@0 del2@0 del3@0 del4@0] ;
      i@0 s@0 q@0 ;
      %c#1%
      [i*0.176 s*0.0 q*0.0] ;
      %c#2%
      [i*1.913 s*0.0 q*0.0] ;
      %c#3%
      [i*3.306 s*0.0 q*0.0] ;
      %c#4%
      [i*4.003 s*0.0 q*0.0] ;
    savedata:
      file is u:class ;
      save = cprobabilities ;

bmuthen posted on Thursday, August 24, 2000  9:40 am



A combination strategy for starting values seems reasonable. In your message, I think you mean to say that the growth factor variances, not their means, are fixed at zero; your input reflects this. Regarding problems with calculating standard errors, that often corresponds to a non-identified model (a problematic, non-positive definite Hessian matrix). It may well be that 5 classes is too much to identify from 4 time points of data. More research needs to go into mixture identification matters such as these. At the same time, with mixtures it is harder to judge if the Hessian is problematic due to non-identification or for other reasons, in which case the problem can be avoided by using better starting values.

bmuthen posted on Thursday, August 24, 2000  11:04 am



Adding to my previous message, it should not be implied that there must be more variables than classes in mixture modeling. What can be identified depends on the specific mixture model and the data. For example, there are many classic examples where many classes are found with only a single outcome variable. There is often, however, lack of empirical identification where there is not enough information in the data to support a certain number of classes. 


Hi. I am running several latent trajectory class models and have a general question. What syntax would I need in order to estimate a quadratic factor in one class but not the other class in a typical two-class model? I believe I heard/read that this is possible but want to make sure that I am specifying the model correctly.


When you want a quadratic growth factor in one class but not in the other, then fix the mean, variance, and covariances of the quadratic growth factor with all of the other growth factors to zero. For example, if class 2 has no quadratic growth factor, %c#2% q@0; [q @0]; q WITH i@0; q WITH s@0; where i is the intercept growth factor, s is the linear growth factor, and q is the quadratic growth factor. 

Taj Carson posted on Wednesday, January 23, 2002  7:28 am



I am working on an evaluation design that involves using a structural equation model with latent classes. However, I would like to measure change over time in the outcome measures, but will be using a cross-section of people measured at two different time points. Can I use Mplus given this data structure, and the fact that the data from time 1 is from different individuals than the data from time 2? Thank you, Taj Carson


Random effect growth models require that the same individuals be measured repeatedly. Your data do not meet this requirement. 

David Rein posted on Saturday, March 09, 2002  11:43 am



Hello, is there a way to set starting values for the population proportion values in a mixture model? For example, if I am trying to fit an outlier distribution with a two-class model, I might be interested in estimating one huge class (99.5% of the data) and one tiny class (0.5% of the data).


You can set the class probability parameters in the %OVERALL% part of the model using the [] statement. They will be in the logit scale. For example, MODEL: %OVERALL% [c#1*5]; where 5 would be a logit value close to a 99.5% class. You would do this for every class except the last, that is, for the number of classes minus 1.
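
The logit value follows from the multinomial logit parameterization of the class probabilities, in which the last class serves as the reference class with its logit fixed at zero. A short worked computation for the two-class outlier example is given here; the symbols $a_k$ (the logit for class $k$) and $K$ (the number of classes) are notation chosen for this illustration.

```latex
\[
P(c = k) = \frac{\exp(a_k)}{\sum_{j=1}^{K} \exp(a_j)}, \qquad a_K = 0
\quad\Longrightarrow\quad
a_k = \ln\frac{P(c = k)}{P(c = K)} .
\]
\[
a_1 = \ln\frac{0.995}{0.005} = \ln 199 \approx 5.29,
\qquad
a_1 = 5 \;\Rightarrow\; P(c = 1) = \frac{e^{5}}{1 + e^{5}} \approx 0.993 .
\]
```

So a starting value of about 5.3 would correspond to a 99.5% class exactly, while the value 5 in the reply corresponds to roughly 99.3%.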

Anonymous posted on Friday, June 21, 2002  3:49 pm



I'm trying to do a multigroup model in Mplus using TRAINING data to specify group (class) membership. I notice that when Mplus provides my parameter estimates, the group assignments are off, which strikes me as odd since group membership is very well defined. Why is this, and is there any way to correct it?

bmuthen posted on Saturday, June 22, 2002  10:28 am



Please send your input, output, and data to Mplus support, support@statmodel.com 

Anonymous posted on Tuesday, May 13, 2003  10:02 am



I have some questions regarding the modeling of latent class measurement models (LCMMs) in Mplus, in the case where the LCMM is posited as an intervening variable between a set of X variables and a series of distal outcomes (Y variables).
First, I’ve noticed that Mplus allows one to specify that the effects of X on Y are fixed or vary across latent classes. Is it the case that when the X effects vary by latent class they can be interpreted as interaction effects (X interacts with LCMM class membership)?
Second, related to the question above, if the distal outcome (Y variable) is continuous, Mplus allows the means and variance of Y to vary by latent class. The variation in means across latent classes is straightforward, but if the LCMM is being used as a traditional intervening variable, how is the variation in variances to be interpreted?
Third, if the LCMM (call it L) is concurrent with another intermediating outcome (call it H), both of which are allowed to have effects on a set of Y variables, is it possible to specify that the errors of L and H are correlated?
Finally, if the distal outcome variable (call it OCAT) is an ordered categorical variable with more than 2 categories, are the threshold terms OCAT$1, OCAT$2, OCAT$3, etc. interpreted as conditional probabilities of some sort, i.e., p(outcome variable level = i versus the baseline | class membership = t)?

bmuthen posted on Tuesday, May 13, 2003  10:12 pm



1) Yes.
2) Variation in variances is a function of x predicting class (group) membership, where in each group any parameter including y variances may be group-specific, so a more general form of mediation.
3) The Latent Class model does not have errors in the conventional sense, but residuals from H could be made to have direct influence on latent class indicators beyond the latent classes.
4) Ordered polytomous outcomes are modeled using the proportional-odds model (Agresti, 1990, pp. 322-324), so saying that there are parallel logit lines for probabilities of outcomes C, C or C-1, etc., where C is the highest category.

Anonymous posted on Monday, December 22, 2003  11:50 am



Could you please describe or provide me with a reference for the EMA algorithm that is now the default algorithm for mixture SEM? In my experience I find it to be faster and equally good as EM, but I would like to know how it is working. Thanks in advance. 

bmuthen posted on Monday, December 22, 2003  12:12 pm



No reference, but the algorithm simply switches away from EM when EM has been shown to give little change in the log likelihood for a couple of iterations, and then instead uses quasi-Newton or Fisher scoring optimization for a while.

Anonymous posted on Wednesday, April 28, 2004  2:07 pm



I'm interested in mixture factor analysis with binary or ordinal dvs. I find it difficult to conceptualize how one infers from binary or ordinal indicators the presence of a mixture distribution of continuous latent factors. It would be useful to see a paper that goes into some detail on this, if you would be able to provide a reference. 

bmuthen posted on Thursday, April 29, 2004  6:14 pm



This is at the research frontier and I am not aware of a paper that details this; our own writings and formulas behind the software algorithms are not yet ready for dissemination (but soon). You should conceptualize this analogously to how you conceptualize doing the analysis for continuous outcomes. A mixture of continuous latent factors gives rise to a non-normal latent variable distribution, and such a distribution can fit some data better than a normal distribution.

gdeitz posted on Saturday, July 17, 2004  7:33 pm



For my dissertation, I am interested in using a finite mixture SEM approach to devising an organizational taxonomy and then comparing the fit to that of measures of established conceptual typologies. The SEM model will involve 5 predictor (latent) variables and two (observed) dependent variables. I have a model paper, but I believed the authors used a competitor's software (although it is not so stated). Based on what I've seen of MPlus, I think it will do what I need it to do. However, in looking at the examples and training videos, I'm not entirely confident that I've seen anything that exactly addresses what I'd like to do. (Of course, being new to LCA, maybe I'm just not understanding what I'm reading? ;) ). Can you put my mind at ease before I place the order for a student license? Thanks. 


Look at the following papers, which can be downloaded from the Mplus website under Mplus papers. I think they are in the general area that you are interested in. Lubke, G. & Muthén, B. (2003). Performance of factor mixture models. Under review, Multivariate Behavioral Research. Lubke, G. & Muthén, B. (2003). Investigating population heterogeneity with factor mixture models. Under review, Psychological Methods.

Anonymous posted on Sunday, December 12, 2004  7:16 am



Hello, I am conducting a mixture model. I would like to know how weights are handled in mixture models and the effects of weights on the estimated class probabilities. I cannot find the details in the manual and in the technical appendix. Thanks in advance! 

bmuthen posted on Sunday, December 12, 2004  11:08 am



The handling of weights is described in Mplus Web Note #7, shown at http://www.statmodel.com/resrchpap.html. This paper also has a latent class analysis example.


Hi, I am estimating a latent class SEM with multiple classes and I would like to set up a class in which none of the IVs affect the DV, so that the regression equation for that class just has a constant on the right-hand side. (For the remaining classes I would like the constant and the effects as well.) How do I estimate the equation intercept? Regards


If I understand correctly, you would specify the regression equation in the %OVERALL% part of the model and then fix the regression coefficients to zero in the class for which you want only the mean of the dependent variable to be estimated. 
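
A minimal sketch of that specification for two classes, assuming a hypothetical observed dependent variable y, hypothetical predictors x1-x3, and a class variable named c:

```mplus
MODEL:
  %OVERALL%
  y ON x1 x2 x3;

  ! Class 1: no predictor effects, only the intercept (mean) of y
  %c#1%
  y ON x1@0 x2@0 x3@0;
  [y];

  ! Class 2: intercept and class-specific slopes
  %c#2%
  y ON x1 x2 x3;
  [y];
```

With all slopes fixed at zero in class 1, the intercept [y] in that class is simply the class-specific mean of y.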


Dr. Muthen, Thank you for the clarification. 

anonymous posted on Friday, February 17, 2006  4:11 am



Hi, I am intending to use SEMM, with a categorical latent class variable c as a predictor of a number of continuous latent variables f. My question relates to the measurement model part of this analysis. More specifically, how do I integrate the latent class variable into the measurement model? I understand that LCA is a measurement model in itself, so do I still need to include the categorical latent c in the measurement model (CFA) of the continuous latent variables, and if yes, how? Or am I getting this all wrong? Also, can you point me to any paper that has used SEMM?


The factor would be a distal outcome like in Example 8.6 which shows an observed variable as a distal outcome. You would just have a factor as a distal outcome and the variation of the factor means over classes are the parameters of interest. See the following paper which can be downloaded from our website: Lubke, G. & Muthén, B. (2003). Investigating population heterogeneity with factor mixture models. It has been published in Psych Methods with a 2005 date I believe. 

katharina posted on Thursday, March 02, 2006  1:55 am



When estimating a factor mixture model, am I correct in assuming that Mplus by default constrains the parameters required for strict factorial invariance to be equal across classes, while furthermore letting the factor means vary across classes (with the last class receiving a factor mean of zero) by default? 


Yes. 


I'm trying to fit a mixture SEM. Approximately how long does it take to get a solution? In a first step I allowed only one equation (sr) to be different, but I would like to let the whole structural model vary. I let the model run for more than 1 hour and didn't get a solution (on a Pentium at 2.8 GHz); the one-class model runs in about 5 seconds. Or is the model too complex to get a mixture solution? Are there any tricks to speed up the calculation?
    VARIABLE:
      USEVARIABLES ARE pt1 pt2 pt4-pt6 sr1 sr2 sr4 sr5 sr7 sr9 sr10
        joy1-joy7 ang1-ang6 kog1 kog3-kog6 att2 att4-att7
        int2 int4-int7 loy2-loy4 loy6 loy7 per rab mind;
      CATEGORICAL ARE per rab mind;
      CLASSES = c(2);
    ANALYSIS:
      TYPE = MIXTURE;
      ESTIMATOR IS MLR;
      ALGORITHM = INTEGRATION;
    MODEL:
      %OVERALL%
      pt BY pt1 pt2 pt4-pt6;
      sr BY sr1 sr2 sr4 sr5 sr7 sr9 sr10;
      joy BY joy1-joy7;
      ang BY ang1-ang6;
      kog BY kog1 kog3-kog6;
      att BY att2 att4-att7;
      loy BY loy2 loy3 loy4 loy6 loy7;
      int BY int2 int4-int7;
      pt ON per mind rab;
      joy ON sr pt;
      ang ON sr pt;
      kog ON sr pt per mind rab;
      att ON joy ang kog;
      int ON att;
      loy ON att int;
      %c#1%
      sr ON per*0.298 rab*0.01 mind*0.2;
      %c#2%
      sr ON per*0.2 rab*0.2 mind*0.1;


If you take the covariates off of the CATEGORICAL list, you won't need numerical integration and things should be much faster. If this does not solve your problem, please send input, data, output, and your license number to support@statmodel.com.

C. Sullivan posted on Tuesday, November 20, 2007  6:21 am



I am trying to run a structural equation model with (like ex. 7.19) a few (continuous) latent predictors and a fully endogenous latent class variable. I've been able to get the separate measurement models (2 CFAs and an LCA) to run with reasonable solutions but get a "fatal error...reciprocal interaction problem" message when I try to put the models together. Is there anything I can do to correct this problem? 


Please send your input, data, output, and license number to support@statmodel.com. 

Stephan posted on Thursday, November 29, 2007  7:41 pm



Latent Class/Latent Profile Analysis
Hello, in his paper #86 Prof. Muthén writes on p. 1 "(..)data consists of different groups(..)but group membership is not observed". I investigate the collaboration between universities and commercial blue-chip companies. Let’s say 5 exogenous LVs and 1 endogenous LV, all continuous. (…) F6 ON F1-F5; (…) However, besides various unobserved population variables, I assume that several unis rated their relationship to the same blue-chip company. Due to confidentiality the data set has no matching variable, but I believe that the sample is not independent. My questions:
(1) Is latent profile analysis a valuable tool to take these data set drawbacks into account?
(2) If yes, and the outcome is reasonable, can I say that there are invariances between groups but I can only assume what the causes are (same collaborator, country, size, age,…)?
(3) In his paper “Maryland keynote v21” Prof. Muthén refers on p. 4 to Lubke/Muthén (2005) and I was wondering if this paper is also available, but I could not find it. It’s not on the reference list.
Any suggestions are appreciated. Many thanks in advance. Stephan


I don't see how latent profile analysis would help in this situation. Following is the paper you are looking for: Lubke, G. & Muthén, B. (2007). Performance of factor mixture models as a function of model size, covariate effects, and class-specific parameters. Structural Equation Modeling, 14(1), 26–47.


From your experience: is there a reasonable sample size for growth mixture modeling? Do 250-300 cases seem to be OK?


This can be fine depending on how well separated the classes are. If they are not well separated it may not be enough. Basically it depends on the data. You can do a Monte Carlo simulation based on the specifics of your data to see.

Hao Duong posted on Tuesday, October 07, 2008  9:46 pm



Dr. Muthen, When I run the model with three classes, by default the covariances (WITH) between intercept and slopes (2 slopes) are similar across classes. However, I would like to examine them freely since I expect they may be different across classes. Is this option possible in Mplus? If yes, please explain. Thank you Hao 


The default is to hold them equal across classes. To relax the equality constraint, mention them in the class-specific parts of the MODEL command.
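
As a sketch of what mentioning them could look like for three classes, assuming the growth factors are named i, s1 and s2 and the class variable is named c (all hypothetical names, since the original input was not posted):

```mplus
MODEL:
  ! ... growth model specified under %OVERALL% as before ...

  ! Mentioning the covariances in each class frees them across classes
  %c#1%
  i WITH s1 s2;
  s1 WITH s2;
  %c#2%
  i WITH s1 s2;
  s1 WITH s2;
  %c#3%
  i WITH s1 s2;
  s1 WITH s2;
```

Only the covariances are freed this way; growth factor variances would need to be mentioned separately if they should also differ across classes.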

Vlad posted on Tuesday, February 02, 2010  4:44 am



Hello, I am following example 7.27, page 176, from the Mplus book with an additional structural equation, ans ON f, within each class (I have 2 classes in my model). f is a latent variable which varies between classes. It appears that in class 2 the coefficient (ans ON f) and the variance of f are not significant. Moreover, the variance of f is close to zero. Thus, I restricted the coefficient and the variance of f in this class to be equal to zero. However, once the model is estimated with the new restrictions, the classes switch places. In other words, class 2 in the new model (with restrictions) represents the part of the sample that was previously classified as class 1. As a result, the restrictions (coefficient and variance of f = 0) are applied to a class that is not of interest to me. I have also tried to give starting values for each class but it doesn't work; the classes are still switching. Do you have any suggestion for how I can test my restrictions for the particular class? Regards, V


Please send the full output and your license number to support@statmodel.com. Your question is not clear. 

Utkun Ozdil posted on Saturday, December 11, 2010  12:17 pm



Hi, I want to test a model that includes BOTH categorical and continuous LATENT variables. Should I use mixture modeling in Mplus? Thanks...


Yes. 

Utkun Ozdil posted on Monday, December 13, 2010  12:47 am



Dr Muthen, I have one more question about this model. It will also be a multilevel one. That is, I want to conduct multilevel mixture modeling with my data. So in that case should I use the Combination Add-On of Mplus? If I use only the Mixture Add-On, will that let me build a multilevel model too? Thanks...


A multilevel mixture model requires the Base Program and Combination Add-On. There is no multilevel modeling available in the Base Program and Mixture Add-On.


Dr. Muthen, I am following example 7.19. When u1-u8 are continuous variables, how should example 7.19 be modified for use with continuous variables? Thanks...


Remove the CATEGORICAL option and ALGORITHM=INTEGRATION, and refer to intercepts instead of thresholds: u1, not u1$1.


Dr. Muthen, in Mplus I use both latent profile analysis (c=4) and a structural equation mixture model (one continuous latent variable and one categorical latent variable, all observed variables are continuous, c=4). I want to know why the number of latent classes in LPA is not equal to the number of latent classes in SEMM. Thank you very much.


Please send the output(s) and your license number to support@statmodel.com so we can understand the question. 


Dr Muthen, my research investigates whether the relationship between 2 antecedents and 3 performance measures (outcomes) is influenced by the adoption of certain practices by a set of firms. I have built a measure of the adoption of these practices using 11 items. My scale measures how frequently firms implement those practices along a three-stage new product development process. The scale is meant to capture the adoption of those practices both in terms of intensity (i.e. frequency) and scope (how many phases). Since the literature indicates that both intensity and scope of adoption should have relevant consequences, I wish to retain both dimensions in my model. For this reason, I am thinking about a model that allows me to identify groups of firms that display similar adoption patterns, and to study the "antecedents-outcomes" relationship within each group. Put simply, I am thinking about a mixture model where a categorical latent variable moderates the "antecedents-outcomes" relationship (intercepts and slopes would be free to vary across classes). In this model:
- the categorical latent variable would be measured by the 11 adoption items
- the two antecedents would be measured by their respective scales
- the three outcomes would be measured using summated scales from EFA.
I would really appreciate your opinion on my modeling approach. Does it look OK? Thanks a lot!


It seems alright to me. As a first step, you may want to do LCA of the 11 items alone and see if the latent classes make substantive sense. When doing the full model, you would let the slopes of outcomes regressed on antecedents vary across the latent classes (as well as the default class variation in their means). 


Thank you Bengt. I followed your advice and ran an LCA of the 11 items. I also added a few covariates using the aux(e)/aux(r) commands as I was interested in investigating predictors of class membership. In the full model I let slopes and means vary across the classes as you suggested. Now I would like to test for differences in slopes and means. I am assuming that a chi-square model difference test between a constrained and an unconstrained model is the way to go. In this case, I will have to take into account (-2x) the difference in loglikelihood values (considering the scaling correction factor) and the difference in free parameters between the two models to obtain the chi-square statistic. Correct? Thanks!


For a mixture model, I would use MODEL TEST instead of difference testing. 
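
For reference, the loglikelihood difference test with scaling correction factors that the question describes is usually computed as follows under MLR. The symbols are notation chosen here: $L_0$, $c_0$ and $p_0$ are the loglikelihood, scaling correction factor and number of free parameters of the nested (constrained) model, and $L_1$, $c_1$ and $p_1$ are those of the comparison (unconstrained) model.

```latex
\[
c_d = \frac{p_0\, c_0 - p_1\, c_1}{p_0 - p_1},
\qquad
TR_d = \frac{-2\,(L_0 - L_1)}{c_d},
\qquad
df = p_1 - p_0 .
\]
```

The statistic $TR_d$ is then referred to a chi-square distribution with $df$ degrees of freedom. The reply above nonetheless recommends MODEL TEST for mixture models.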


Thank you Linda. I will use the MODEL TEST command as you suggested. I am wondering if there's any way in which I can instruct MPLUS to hold the slopes for one equation equal across all classes and test it using the MODEL TEST command. I have 3 equations per class, with three independent variables per equation, and a total of four classes. The independent variables are the same across the 4 classes. I would have used an input like this:
    s1_1=s2_1=s3_1=s4_1;
(sx_y is the label I used for slopes, where x indicates the class number and y refers to the independent variable for which I am testing equality of slopes.) However, MPLUS allows only one equality sign per line. So instead I have used this input:
    s1_1=s2_1;
    s1_1=s3_1;
    s1_1=s4_1;
    s2_1=s3_1;
    s2_1=s4_1;
    s3_1=s4_1;
Is this the right way to code it? I am interested in a joint test first, and then eventually in looking for pairwise differences (similar to an ANOVA test with post-hoc analyses). Thank you very much!


You should say: s2_1 = s1_1; s3_1 = s1_1; s4_1 = s1_1; 
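
Put together with the parameter labels, the joint test could be sketched as follows. The outcome y1, the predictors x1-x3 and the class variable c are hypothetical names; only the labels s1_1 to s4_1 come from the post above.

```mplus
MODEL:
  %OVERALL%
  y1 ON x1 x2 x3;

  ! Label the slope of y1 on x1 in each class
  %c#1%
  y1 ON x1 (s1_1);
  %c#2%
  y1 ON x1 (s2_1);
  %c#3%
  y1 ON x1 (s3_1);
  %c#4%
  y1 ON x1 (s4_1);

MODEL TEST:
  ! Three non-redundant restrictions: joint Wald test with 3 df
  s2_1 = s1_1;
  s3_1 = s1_1;
  s4_1 = s1_1;
```

Equality of four slopes implies only three independent restrictions, so the six pairwise statements in the question are redundant as a joint test. Each pairwise comparison can be examined in a separate run with a single MODEL TEST statement.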

xianhuazeng posted on Friday, February 24, 2012  6:33 am



I am following example 7.19. Here u1-u4 are continuous variables, c is a latent variable with 3 classes, and I want the continuous latent variable regressed on the categorical latent variable:
    TITLE: this is an example of
    DATA: FILE IS ex7.19.dat;
    VARIABLE:
      NAMES ARE u1-u8;
      CATEGORICAL = u5-u8;
      CLASSES = c (3);
    ANALYSIS:
      TYPE = MIXTURE;
      ALGORITHM = INTEGRATION;
    MODEL:
      %OVERALL%
      f BY u1-u4;
      f ON C;
      %c#1%
      [u5$1-u8$1];
      %c#2%
      [u5$1-u8$1];
      %c#3%
      [u5$1-u8$1];
    OUTPUT: TECH7 TECH8;
Is it OK?


C cannot appear on the right-hand side of ON. The effect you want is the varying of the mean of f across classes.
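
A hedged sketch of how the MODEL command from the question might look without f ON C, with the factor mean mentioned in the class-specific parts instead. It relies on the default noted earlier in this thread that factor means vary across classes with the last class fixed at zero; the starting values -1 and 1 are arbitrary illustrations.

```mplus
MODEL:
  %OVERALL%
  f BY u1-u4;

  ! The class differences in [f] play the role of "f regressed on c"
  %c#1%
  [u5$1-u8$1];
  [f*-1];
  %c#2%
  [u5$1-u8$1];
  [f*1];
  ! Class 3 is the reference class; its factor mean stays at zero
  %c#3%
  [u5$1-u8$1];
```

The estimated means of f in classes 1 and 2 are then interpreted as differences from class 3.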

xianhuazeng posted on Friday, February 24, 2012  10:09 pm



Thank you Linda. Here u1-u4 are continuous variables and c is a latent variable with 3 classes.
    DATA: FILE IS ex7.19.dat;
    VARIABLE:
      NAMES ARE u1-u8;
      CATEGORICAL = u5-u8;
      CLASSES = c (3);
    ANALYSIS:
      TYPE = MIXTURE;
      ALGORITHM = INTEGRATION;
    MODEL:
      %OVERALL%
      f BY u1-u4;
      c#1 ON f;
      c#2 ON f;
      %c#1%
      [u5$1-u8$1];
      %c#2%
      [u5$1-u8$1];
      %c#3%
      [u5$1-u8$1];
    OUTPUT: TECH7 TECH8;

xianhuazeng posted on Saturday, February 25, 2012  12:40 am



is the command OK? 


The best way to tell is to run it and see if you get what you want. 

Jan Stochl posted on Wednesday, April 03, 2013  3:52 am



Dear Linda, is Mplus capable of estimating a semiparametric factor analysis model that also accounts for complex sample designs? Something like type = mixture complex? Thanks a lot, Jan


Yes. See the ESTIMATOR option in the user's guide. There is a table that shows all combinations of analysis types and the estimators available for them. 
