Mplus Discussion > Latent Variable Mixture Modeling >
 Chuck Cleland posted on Thursday, January 27, 2000 - 9:07 am
I have a question about the item labeled "LATENT CLASS REGRESSION MODEL PART" in mixture model output. Suppose one has a latent profile analysis with four standardized continuous variables and the categorical latent variable has two levels. In this case, how does one interpret the LCRMP coefficient, and what does it mean if est/SE exceeds 2 in absolute value?
 Bengt O. Muthen posted on Thursday, January 27, 2000 - 10:11 am
The latent class regression model part refers to the regression of the latent class variable on covariates, that is intercepts and slopes. If there are no covariates, which seems to be the case in your application, the coefficients given under this heading are just the intercepts. In this context, intercepts are logit coefficients determining the probabilities of the classes. In this case, the est/se ratio is not of much interest since the ratio tests against a zero logit, which translates to a probability of 0.5. So, these ratios can be ignored when there are no covariates.
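
To make the logit interpretation above concrete, here is a minimal Python sketch (not Mplus syntax). It assumes the usual parameterization in which the last class is the reference class with its logit fixed at 0. The function name and the example values are hypothetical.

```python
import math


def class_probabilities(intercepts):
    """Convert latent class logit intercepts to class probabilities.

    `intercepts` holds the logits for classes 1..K-1. The last class is
    assumed to be the reference class, with its logit fixed at 0.
    """
    logits = list(intercepts) + [0.0]
    denom = sum(math.exp(a) for a in logits)
    return [math.exp(a) / denom for a in logits]


# Two classes, intercept of 0: both classes have probability 0.5, which is
# why a test of the intercept against zero is rarely of interest.
two_class = class_probabilities([0.0])

# Hypothetical intercept of 1.0: class 1 is more prevalent than class 2.
skewed = class_probabilities([1.0])
```

With no covariates, an est/SE ratio near zero only says that the class probability is close to 0.5, not that the class is unimportant.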
 Craig Gordon posted on Thursday, March 16, 2000 - 6:23 pm
I have 4 years of panel data and two variables of interest. The first variable, P, is an ordinal level variable with 3 categories. The second variable, A, is continuous. What I am trying to estimate is the change over the four years in the predicted probabilities for each category of P for individual i, given a change in A. Given that there are 9 possible paths to get from year 1 to year 4 (excluding intermediate points) for variable P, is there a way to estimate a path given a change in A?

In SAS, there is a procedure PROC TRAJ that handles this, and I was told that there may be a way to do it with MPLUS.

Thank you.
 posted on Tuesday, March 28, 2000 - 7:48 am
Sorry for the delay in answering this - I was out of town. You mention TRAJ which concerns latent classes of development, i.e. mixture modeling. The mixture modeling part of Mplus currently does not allow polytomous outcomes or trends in development over time in categorical outcomes. Your message, however, does not describe theories of latent classes of development. If instead the chief concern is describing longitudinal development of the polytomous outcome over time as a function of the continuous predictor variable, you can use the regular, non-mixture, part of Mplus for the analysis. You can either use models that are auto-regressive or growth models, the latter having growth factors, i.e. continuous latent variables, that influence the outcome. The continuous predictor variable can influence either the growth factors or the outcome directly. Hope this helps.
 Juned Siddique posted on Tuesday, January 23, 2001 - 1:44 pm
I have a model with three latent classes (C#1, C#2, C#3). Under the model statement, I included the command "C#1 on BLK" where BLK is a binary race variable. In the output under LATENT CLASS REGRESSION MODEL PART, there is the slope for C#1 ON BLK and then under intercepts there are two intercepts, C#1 and C#2. Why are there two intercepts? I believe I am modeling:
Pr(C#1)=B0 + B1*BLK
So there should only be one intercept. Thank you.
 Linda K. Muthen posted on Tuesday, January 23, 2001 - 2:18 pm
The categorical latent class variable has three categories (classes). If you want to specify the multinomial logistic regression of the categorical latent variable on BLK, you need two statements:

c#1 ON BLK;
c#2 ON BLK;

This is in line with regular multinomial logistic regression. See the Agresti reference on our website.
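
As an illustration of why three classes give two intercepts and two slopes, the following Python sketch (not Mplus syntax) computes class probabilities under a multinomial logistic model with the last class as reference. The intercept and slope values are made up for the example and do not come from any output in this thread.

```python
import math


def class_probs_given_x(intercepts, slopes, x):
    """Multinomial logistic class probabilities for covariate value x.

    Classes 1..K-1 each have their own intercept and slope. The last
    class is the reference class, with intercept and slope fixed at 0.
    """
    logits = [a + b * x for a, b in zip(intercepts, slopes)] + [0.0]
    denom = sum(math.exp(v) for v in logits)
    return [math.exp(v) / denom for v in logits]


# Hypothetical estimates for a three-class model:
# c#1 ON BLK and c#2 ON BLK, plus the intercepts for c#1 and c#2.
intercepts = [-0.5, 0.3]
slopes = [0.8, -0.4]

p_blk0 = class_probs_given_x(intercepts, slopes, 0)  # BLK = 0
p_blk1 = class_probs_given_x(intercepts, slopes, 1)  # BLK = 1
```

Each slope is the change in the log odds of that class versus the last class for a one-unit change in the covariate, so a model of the form Pr(C#1) = B0 + B1*BLK is not what is being estimated.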
 Gil posted on Tuesday, August 21, 2001 - 7:24 pm
I'd like to identify latent segments differing with respect to the price elasticity for many commodities. The model, say a mixture regression model, allows these price elasticities to differ across segments, and has two parts:
1. explaining the latent variable as a function of covariates
2. predicting a dependent variable as a function of predictors

The problem is that the dataset contains repeated observations (time-series) for each commodity (cross-sectional), and the model closely relates with these longitudinal data.

Suppose a regression model for estimating price elasticity, i.e., a double-log model of quantity on price. We estimate price elasticities for N commodities using T repeated observations for each, but within the framework of latent segments. This might be accomplished using the GROUPING command, which identifies the commodity. But GROUPING cannot be used with MIXTURE.

Could somebody please give me some suggestion how I would fit these models using Mplus?

Thanks in advance for explanations, comments and tips.
 Linda K. Muthen posted on Friday, August 24, 2001 - 9:13 am
It sounds like your observations are commodities. I'm not sure why you want to use the GROUPING option. If you want to identify latent segments of the commodities, TYPE=MIXTURE is appropriate. GROUPING is used when there are observed, not latent, groups or segments. A growth mixture model would probably be appropriate here. If you check the References and Examples posted here on the website, you may find something to help.
 Hana Kim posted on Sunday, November 11, 2001 - 2:56 pm
Hello! My research involves conjoint marketing studies in which respondents (cases) were repeatedly asked to state their personal purchase intention under several different scenarios. Please note that there are several records per respondent. My interest is to classify cases into several homogeneous groups and to develop regression models for each segment. Since it appears most likely that different demographic profiles cause these segments, I will then show how to classify each respondent into the most appropriate segment. (A very similar example of my study could be seen from

Does Mplus meet these requirements for cases involving repeated measurements?
 bmuthen posted on Tuesday, November 13, 2001 - 8:23 am
I don't know conjoint analysis, but it sounds like you can use the Mplus mixture analysis to let the repeated measures for respondents be the latent class indicators (u variables in Mplus language) that measure the latent class variable (c in Mplus), and regress c on covariates (x). Here, the latent classes of c give you the different segments, and segment probabilities are expressed as a function of the x's by multinomial logistic regression. All of this is carried out in a single analysis using maximum-likelihood estimation. This analysis relates to what is referred to as Latent Class Growth Analysis in paper number 86 on the Mplus web site - we'd be happy to send this to you.
 Anonymous posted on Wednesday, January 16, 2002 - 8:57 am
Hello, thank you for making this forum available.

I have fitted a three-class mixture, with class 1 being the least prevalent and class 3 the most.

One of the models I need to run is a logistic model of a binary outcome based on class.

The issue here is that the Probability of the outcome in class one and two is 1, or as near to certainty as you get. Therefore, there is no variance in the dependent variable for classes one and two.

However, a person in class three has only a hypothesized 60% chance of the event (as supported by empirical frequency results). What is interesting is to 1. Categorize subjects into class three, and 2. Calculate their probability of the event.

I think I can do this by first assigning kids to class, using one model then, second, running an additional logistic regression for the kids in class three in a second model (weighting by the probability of being in class three from the output data set), but this seems like an inelegant solution.

Is there a way in Mplus to code the regression model into the first mixture model, such that it runs only for class three?
 Anonymous posted on Wednesday, January 16, 2002 - 2:02 pm
Sorry for 2 questions in one day, and thank you for answering them.

Is there a way to tell Mplus a dependent variable of interest and have it output predicted values for it based on the structural model?

I'm pretty sure I can do this by hand, but obviously, it would be easier if Mplus were kind enough to do it for me.

many thanks.

 Bengt O. Muthen posted on Thursday, January 17, 2002 - 8:29 am
The way to handle the problem of zero variability of a dependent variable in two of the three classes is as follows, where u1-u5 are your binary latent class indicators and u6 is the binary outcome. Here u6 has probability 1 in the first two classes and its probability is estimated in the third class. Note that u6 is seen simply as yet another latent class indicator. As usual, the probability of u6 given class 3 is on a logit scale. This is, of course, a partial input.

VARIABLE: categorical = u1-u6;



 Bengt O. Muthen posted on Thursday, January 17, 2002 - 8:31 am
Regarding your second question regarding predicted values of an observed dependent variable, Mplus does not do this automatically.
 Anonymous posted on Thursday, June 27, 2002 - 8:29 am
I have 5 years of data on a continuous outcome. Using Mplus I was able to identify 4 trajectory classes with a quadratic model. A reviewer suggested that I re-examine these findings using PROC TRAJ in SAS. I don't know what assumptions Mplus makes compared with PROC TRAJ, or whether the classes could differ. Has anybody compared both approaches? Are they similar? Is there any literature available? Thanks
 bmuthen posted on Thursday, June 27, 2002 - 12:28 pm
There are articles related to this topic authored by me and listed under References, Growth Mixture Modeling on this web site, e.g. papers 82, 85 and 86. PROC TRAJ assumes no within-class variability of the trajectories, which is a special case of Mplus, restricting the growth factor covariance matrix to zero (i fixed at 0, s fixed at 0, i with s fixed at zero). My experience with real-data analyses is that this specification often does not fit the data well.
 Anonymous posted on Thursday, June 27, 2002 - 2:01 pm
It appears that one can only include a single latent variable in an Mplus MIXTURE model. Is this due to methodological restrictions? Are you planning on expanding on this capability in later versions ? Thank you.
 bmuthen posted on Thursday, June 27, 2002 - 3:25 pm
You can have as many continuous latent variables as you want in mixture modeling. As for categorical latent variables, the program is intended for a single variable, but can also be used with several variables. The multiple latent categorical variable approach is described briefly on page 11 of paper #86. New Mplus developments are in progress for more efficient handling of multiple latent categorical variables.
 Anonymous posted on Friday, August 23, 2002 - 12:45 pm
Can mixture modeling in MPlus analyze a set of regressions, or simply one regression at a time? In other words, if I were interested in a set of regressions such as the following:

d= a + b + c + error
f= d + e + error
h= f + g + error

would I need to analyze each regression separately, or could I have the procedure analyze the set of regressions concurrently?
 bmuthen posted on Friday, August 23, 2002 - 1:57 pm
The set of regressions can be analyzed in a single analysis.
 Bonnie posted on Monday, April 11, 2005 - 3:35 pm
To Whom It May Concern:

I have a question about SEM with a categorical latent variable.

Outcome Y and mediators M1 and M2 are all continuous variables. But U1 and U2 are both binary variables, so the latent variable C is also categorical. X1-X4 are covariates, not shown in the graph.

The following is my code. My question is how to write the MODEL section: should I say "Y on C M1-M2 X1-X4" or
"Y on C#1 M1-M2 X1-X4"? The former produces an error. Also, how should these statements be ordered? What is shown below does not actually work; I include it only to provide some information. I would greatly appreciate your help!

NAMES ARE X1-X4 M1 M2 Y u1 u2 ;
CLASSES = c(4);

TYPE IS Mixture;

Y on u1 u2 M1-M2 X1-X4;
c#1 on u1 u2;
c#2 on u1 u2;
c#3 on u1 u2;

Best Regards,
 Linda K. Muthen posted on Wednesday, April 13, 2005 - 11:37 pm
The latent variable c for u1 and u2 does not have to be categorical because u1 and u2 are categorical. The factors in SEM are continuous not categorical. They can, however, have indicators that are continuous, categorical, or other scales. Do you want a traditional SEM model where the factors have categorical indicators or are you interested in a mixture model where the factors are categorical?
 Huabin  posted on Monday, May 09, 2005 - 12:26 pm
I am trying to use Mplus for mixture modeling. I am confused by the CLASS statement on p. 109 of the User's Guide. Looking at the data file for Ex. 7.1, I noticed that the class of every observation has been specified (1 or 2), so it is not "latent". But the User's Guide on p. 109 says, " ... there is one categorical latent variable c that has two latent classes." I do not understand this. Sorry for raising such a basic question.

Huabin Luo
 Linda K. Muthen posted on Monday, May 09, 2005 - 1:01 pm
The 2 is the number of latent classes. Perhaps I don't understand your question. It is not necessary to specify latent.
 Lilian posted on Sunday, December 04, 2005 - 7:08 pm
I was wondering whether we can change the reference category in Mplus when running a latent class regression. I am running a 6-class model and regressing the latent outcome on a few covariates, and I would like the reference category to be the class with the lowest symptom probability. Is that possible? Thanks!
 Linda K. Muthen posted on Monday, December 05, 2005 - 7:16 am
You can use the ending values of the class you want to be last as starting values for the last class in a subsequent run and that class will be last.
 pete posted on Thursday, February 09, 2006 - 11:53 am
I am trying to fit a mixed logistic regression model with covariates in both the regression part and the latent class part, at the individual level.

The model runs and no error messages appear. Since such models tend to be unidentifiable, does the lack of error messages indicate that the model is identified, or is there no guarantee of identification in Mplus?
 bmuthen posted on Thursday, February 09, 2006 - 12:09 pm
That is a notoriously difficult model and I would be a bit suspicious. Look at your condition number - if it is close to 10^-10 I would be wary. You can also try a high starts = value to investigate the trustworthiness of the solution. You can also do an Mplus Monte Carlo study using your parameter estimate values to see if the model can be recovered. - If the model still holds up, I'd like to use it as an example...
 Shane Allua posted on Saturday, February 03, 2007 - 5:25 am
I see that odds ratios for the regression of the latent class variable on covariates are new in V4.2. What syntax is required to get this information, and can the ORs be output to a dataset?

 Linda K. Muthen posted on Saturday, February 03, 2007 - 6:46 am
It is done automatically.
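
For readers who want to check or reproduce such odds ratios outside Mplus, a minimal Python sketch follows. It uses the standard relation that an odds ratio is the exponentiated logit slope. The Wald-type interval shown is a common convention and is an assumption here, not a statement of how Mplus computes its output. The slope and standard error are hypothetical.

```python
import math


def odds_ratio(slope, se, z=1.96):
    """Return (odds ratio, lower bound, upper bound) for a logit slope.

    The bounds exponentiate slope -/+ z * se, a Wald-type interval on
    the log-odds scale (assumed here for illustration).
    """
    return (
        math.exp(slope),
        math.exp(slope - z * se),
        math.exp(slope + z * se),
    )


# Hypothetical slope for c#1 ON x with its standard error.
or_c1, lower_c1, upper_c1 = odds_ratio(0.8, 0.25)
```

The odds ratio compares the odds of membership in the given class versus the reference (last) class for a one-unit increase in the covariate.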
 xi li posted on Tuesday, August 11, 2009 - 9:56 pm

I just ran a simple mixture model, but kept getting the error below.

Could you tell me what went wrong?

I am using mplus 4.2.


Names are a1 a2 a3;
Categorical are a1 a2
Classes = c(4);
a3 ON a1 a2;
c ON a1 a2;
tech1 tech8;

*** ERROR in Model command
Unknown variable(s) in an ON statement: C
 Amir Sariaslan posted on Tuesday, August 11, 2009 - 10:26 pm
Xi Li,

Add a ; to the end of:
Categorical are a1 a2

 Meghan Slining posted on Saturday, February 27, 2010 - 9:11 am
I have fit a series of LCGA models on 13 infant growth measures. I have decided on the number of classes that I favor and would now like to examine associations between class membership and a series of distal (continuous) outcomes. I have tried a number of variations in code and received error messages. My only success was with the 2-class model (which is not my favored model), where I added the following line at the end of my model statement:

bmia ON C;

1. What code is needed for a regression with a 4 class model?

2. If I have a series of outcomes I am interested in, can I put them all in one model statement?

Thank you,
 Linda K. Muthen posted on Saturday, February 27, 2010 - 3:22 pm
The effect of a distal outcome is seen in the means or thresholds varying across classes. The ON option is not used.

If you have more than one distal outcome, the assumption of conditional independence is imposed. If this is not what you want, you may want to do one at a time.
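
The point that class-varying means carry the same information as a regression on class can be sketched in Python (not Mplus syntax). Given class-specific means of a distal outcome, the contrasts against a reference class are what dummy-coded slopes would be in a regression of the outcome on class. The means below are invented for illustration and are not estimates from this thread.

```python
def contrasts_against_reference(class_means, reference=-1):
    """Return each class mean minus the reference class mean.

    By default the last class is the reference. The reference mean plays
    the role of an intercept, and the contrasts play the role of slopes
    on class dummy variables.
    """
    ref = class_means[reference]
    return [m - ref for m in class_means]


# Hypothetical class-specific means of a distal outcome in a 4-class model.
bmia_means = [16.2, 17.0, 18.4, 17.5]
bmia_contrasts = contrasts_against_reference(bmia_means)
```

Whether a given contrast is statistically significant still has to be tested within the mixture model itself; this sketch only shows how to read the class-specific means.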
 Olga Maslovskaya posted on Tuesday, March 02, 2010 - 7:04 am

I would like to fit a LCA model with categorical covariates. I am not sure how to specify that covariates are categorical. If I specify them as categorical, this is the error message I am getting:

The following MODEL statements are ignored:
* Statements in the OVERALL class:
C#1 ON X5
One or more MODEL statements were ignored. These statements may be
incorrect or are only supported by ALGORITHM=INTEGRATION.

Please find my code with just one categorical covariate below:

variable: names are X1-X6 u1-u7;
usevariables are u1-u7 x5;
categorical are u1-u7 x5;
classes = c (2);
analysis: type=mixture;
model: %overall%
c on x5;

Could you please help me to sort out this problem?

Many thanks!

 Linda K. Muthen posted on Tuesday, March 02, 2010 - 7:29 am
The CATEGORICAL list is for dependent variables. In regression, covariates must be binary or continuous. In both cases, they are treated as continuous.
 Olga Maslovskaya posted on Tuesday, March 02, 2010 - 8:01 am
Many thanks, Linda!

Is there a way to include a categorical covariate with more than 2 categories into regression without creating dummy variables?

Thank you very much,

 Linda K. Muthen posted on Tuesday, March 02, 2010 - 12:08 pm
Not if the categories are unordered.
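
Dummy variables for an unordered covariate are usually created in the data file before the analysis. A minimal Python sketch of that recoding follows, assuming a covariate with integer category codes. The function name, the example codes, and the choice of reference category are all hypothetical.

```python
def dummy_code(values, reference):
    """Recode an unordered categorical variable into 0/1 dummy variables.

    Returns the list of non-reference levels (one dummy per level) and,
    for each observation, a row of 0/1 indicators in that order. The
    reference level is coded as all zeros.
    """
    levels = sorted(set(values))
    if reference not in levels:
        raise ValueError("reference category not found in the data")
    kept = [lv for lv in levels if lv != reference]
    rows = [[1 if v == lv else 0 for lv in kept] for v in values]
    return kept, rows


# Hypothetical three-category covariate x5 with category 1 as reference.
x5 = [1, 2, 3, 2, 1, 3]
x5_levels, x5_dummies = dummy_code(x5, reference=1)
```

The resulting K-1 binary columns can then be written to the data file and used as covariates in place of the original variable.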
 Andre Plamondon posted on Tuesday, May 04, 2010 - 4:24 pm
I am using 2 covariates in my LPA. These covariates are correlated because of shared method variance (same rater). I was wondering if the logistic regression is a standard one (instead of backward/stepwise) and if I can assume that the shared variance is not attributed to either of those two variables? I have one variable at an earlier time-point, but this doesn't lead to the same results.

Basically, can I use those two variables (they do not predict the same class membership)?
 Andre Plamondon posted on Tuesday, May 04, 2010 - 5:15 pm
I forgot to say that these 2 variables are measuring two members of a dyad (mother-child). Maybe I should use the earlier measure of the mother because I would be partialling out variance attributable to the child as well as the observer if I take a measure that was taken during the same interaction.
 Linda K. Muthen posted on Wednesday, May 05, 2010 - 8:43 am
Mplus does standard logistic regression. I am not sure how you should approach the situation with your covariates.
 Andre Plamondon posted on Wednesday, July 14, 2010 - 12:47 pm
How would one do a liability threshold model with the latent class variables as (latent) dichotomous dependent variables for analyses with twin data? It was mentioned in the Twin Research and Human Genetics 2006. I have found how to use it with "normal" variables but wonder how to do it with classes.
 Andre Plamondon posted on Wednesday, July 14, 2010 - 12:49 pm
I chose to post it here since I want to use regressions in a multilevel model to get heritability estimates. This strategy was shown by McArdle and Prescott in the same issue of Twin Research and Human Genetics. Sorry if it didn't seem logical to post here at first sight.
 Shaunna Clark posted on Thursday, July 15, 2010 - 7:27 am

The third chapter in my dissertation addresses how to use latent classes in a liability threshold model (Clark, S.L. (2010). Mixture modeling with behavioral data. Doctoral dissertation, University of California, Los Angeles.). The appendix for that chapter includes example Mplus code for this model. In order to do this model in Mplus you will need to be using version 6.

A copy of my dissertation can be found on the Mplus website under the factor mixture modeling tab of the papers section.

If you have any questions feel free to email me at
 mpduser1 posted on Monday, October 04, 2010 - 11:45 am
Is it possible to use MODEL PRIORS in Mplus 6.0 to specify weakly informative priors to aid in the identification of a latent class regression analysis when one of the latent classes is small (see, for example, the procedure mentioned by Collins & Lanza, 2010)?
 Tihomir Asparouhov posted on Monday, October 04, 2010 - 1:50 pm
You can provide informative priors for every model parameter in Mplus and yes it should be possible to use informative priors to help identify small classes.
 Sarah Ryan posted on Tuesday, March 29, 2011 - 5:06 pm
I'm trying to figure out how best to go about analyzing a mediation model which can be described:
1) Secondary data set (N= appx. 9,000), using three waves of data
2) Several background covariates
3) 5 exogenous measures- 4 latent factors and 1 manifest (continuous) indicator (a student colleague has suggested using bifactor analysis to treat these as one "general factor" as each of the 5 measures could be considered submeasures, suggests this may simplify interpretation of any mediation effect)
4) Latent mediator (Arrived at through latent class analysis)
5) One manifest DV (6 ordinal levels, treated as continuous)

The more I read on these discussion boards, the less convinced I am that using a latent class variable as the mediator actually is doable, or that it is a match theoretically (aside from the fact that this may be an unnecessarily complicated model). It almost seems like what I'd end up with is more a moderation analysis (DV would actually be DV means as a function of class membership). Am I right in this thinking?
 Bengt O. Muthen posted on Tuesday, March 29, 2011 - 5:53 pm
A latent class mediator makes for a more complex model, just like an observed nominal mediator would. What should mediation mean in this case? Perhaps the following. The latent class membership can be influenced by exogenous variables, including factors, and latent class membership can influence DVs (by changing their means if cont's as you said). One can also add the restriction of having no direct effects from IVs to DVs. That formulation seems reasonable and the modeling can be done in Mplus because you can have latent variables influencing class membership. But how an indirect effect should be quantified is not clear - it is not just a product of two slopes as with a cont's mediator.

Re 5), perhaps you are thinking about a second-order factor model where only that general factor is an IV. With the bifactor model the general and the specific factors all can be IVs.
 Sarah Ryan posted on Wednesday, March 30, 2011 - 2:50 pm
Thanks for this response- it is very helpful. I also just read your 2009 paper with Clark, "Relating LCA Results..." and this gives me more food for thought.

It is theoretically conceivable, with the indicators I'm using and the construct I'm testing, that the latent CLASS mediator could function as latent FACTOR, making it continuous and reducing a bit of the complexity. My committee suggested considering the latent class mediator, but I'm not sure they realized that they were sending me off into relatively uncharted waters (as far as I can tell, though perhaps I'm wrong). I need to go back to my model and the literature to think more about which (factor or class) I believe is more likely.

I'll keep plugging away here, and VERY much appreciate this board and your advice.
 Igor Himelfarb posted on Sunday, April 03, 2011 - 5:41 pm
I am trying to regress a continuous variable on a categorical latent variable (c = 3) and on a continuous latent variable. Here is my code:

DATA: FILE IS wpa4.dat;
CLASSES = C (3);
F BY U7-U13;
U1 ON F;
U1 ON C F;

The statement " U1 ON C F;" gives me the following error message:
The following MODEL statements are ignored:
* Statements in the OVERALL class:
U1 ON C#1
U1 ON C#2
One or more MODEL statements were ignored. These statements may be incorrect.

Please help!
 Linda K. Muthen posted on Monday, April 04, 2011 - 8:56 am
You cannot regress an observed variable on a categorical latent variable. The results you are interested in are the means or thresholds of the observed variable varying across classes.
 Suresh Ramanathan posted on Wednesday, November 16, 2011 - 6:42 am

I have a single binary covariate predicting different trajectories for emotions over time (10 measures), which in turn are expected to predict differences in consumption. Bengt was kind enough to direct me to examples of mixture modeling with distal outcomes, and I have experimented with many variations, including keeping factor variances and residual variances as class-invariant. My question now is simple - how can I estimate the indirect effect of the covariate on the distal outcome? Unlike normal mediation, there is no a*b effect to be estimated. How can I assert that the effect of the covariate on the outcome is mediated by the differences in trajectories?


 Bengt O. Muthen posted on Wednesday, November 16, 2011 - 3:24 pm
Are you saying that your binary covariate influences the latent class membership, where latent class membership gives class specific means for the outcome?
 Suresh Ramanathan posted on Wednesday, November 16, 2011 - 4:28 pm
Right, that is what I am saying. So, the binary covariate influences the intercept, linear slope and quadratic trend, and these in turn are predicted to lead to differences in the distal outcome. I included the statement c#1 on x to estimate the effect of the covariate on latent class membership, and I know that the beta for the effect of c on the distal outcome is given by the difference in the class means. What I now need to know is whether there is an indirect effect of the covariate on the distal outcome.
 Bengt O. Muthen posted on Wednesday, November 16, 2011 - 6:07 pm
Please give your MODEL command statements so I am sure of your model.
 Suresh Ramanathan posted on Sunday, November 20, 2011 - 10:07 am

This is what I have. I hope I am doing this right. My x variable is litdar (0-1 variable). My distal outcome is consum (range 0-50,treated as continuous).
What I find is the following:
a) 2 class model with class-varying psis and thetas fits the data well, better than a growth model. The two classes are high versus low guilt.
b) The effect of the covariate on class membership is not significant.
c) However, within the high guilt class, the covariate predicts differences in trajectories. One group (x = 0) has a rising guilt while the other (x=1) has a reducing guilt, and the linear and quadratic growth factors are significantly different for the two groups. Further, the distal outcome is significantly different for these two groups within the high guilt condition.

d) There are no differences in trajectories for the low guilt class, and the distal outcome also does not differ for these two groups.
 Suresh Ramanathan posted on Sunday, November 20, 2011 - 10:07 am
This is the set of MODEL commands I used.



ac bc qc |ag1@0 ag2@1 ag3@2 ag4@3 ag5@4 ag6@5 ag7@6 ag8@7 ag9@8 ag10@9;
ac*5048.14; bc*607.81; qc*5.90;
ac WITH bc*-678.04;
ac WITH qc*58.72;
bc WITH qc*-57.52;
ag1*779.60 ag2*240.52 ag3*428.66 ag4*363.95 ag5*369.10 ag6*511.33 ag7*645.82 ag8*682.81 ag9*265.19 ag10*582.09;
ac ON litdar;
bc ON litdar;
qc ON litdar;
c#1 ON litdar;
consum ON litdar;

[ac*51.48 bc*5.39 qc*.71];
ac*5048.14; bc*607.81; qc*5.90;
ac WITH bc*-678.04;
ac WITH qc*58.72;
bc WITH qc*-57.52;
ag1*779.60 ag2*240.52 ag3*428.66 ag4*363.95 ag5*369.10 ag6*511.33 ag7*645.82 ag8*682.81 ag9*265.19 ag10*582.09;
ac ON litdar;
bc ON litdar;
qc ON litdar;
consum ON litdar;

[ac*120.51 bc*24.93 qc*-2.1];
 Bengt O. Muthen posted on Sunday, November 20, 2011 - 8:11 pm
So you have a direct effect of litdar (your binary X) on consum (your Y). But you don't have any indirect effect, because litdar is not significantly influencing the class membership and, although litdar influences the growth factors within class, your model doesn't say that the growth factors influence consum.
 Jamie Vaske posted on Tuesday, January 17, 2012 - 9:32 am
A colleague and I were recently looking over the Jung & Wickrama (2008) article on Latent Class Growth Analysis and Growth Mixture Modeling with MPLUS. In their article, they have a LCGA and they directly regress the slope factor on a covariate. Here is their syntax:

Model: %OVERALL%
i s|t1@0 t2@1 t3@2;
i s ON x;
c#1 ON x;
c#2 ON x;

Our question pertains to how to interpret the effect of the covariate on the growth factors. The variation in the growth factors is set to 0, so the covariate is not explaining variation in the growth factors within a class. What does the effect of X on the growth factors represent when the variation in growth factors is constrained to zero?

 Linda K. Muthen posted on Wednesday, January 18, 2012 - 9:53 am
With a conditional model, the residual variances are fixed at zero.

When i and s are regressed on x, it is a shift in means for each gender if x is, for example, gender.
 Regan posted on Friday, February 03, 2012 - 11:32 pm
"This brings up two issues which may not always be well understood in mixture modeling. First, modeling the influence of a latent class variable c on a distal outcome y is not done by saying y ON c, but what is done gives information equivalent to having used ON..."

Dr. B. Muthen, the above was a response you gave to someone some years back. I am new to LCA and want to clarify some things with regard to this comment:

1) Do I interpret your comment correctly if I say that when adding a distal outcome y to see the effect of class membership on y, that instead of using a y on c command, that we should just add the outcome variable to the 'usevariable' statement--therefore it is technically a covariate, but interpreted as an outcome?

2) Similar, but regarding dependent variables: Is there a substantive difference in whether we add the dependent variable to the 'usevariable' statement vs. using the 'knownclass' statement and adding a regression statement of c on x? If there is a significant association between x and c (for instance if x is gender) should we move to a multiple group analysis?
 Linda K. Muthen posted on Saturday, February 04, 2012 - 7:06 am
1. It is still an outcome. In reality it is another latent class indicator.

2. Please send outputs that illustrate what you are asking to make it clear. Also send your license number to
 Regan posted on Monday, February 06, 2012 - 9:46 am
Thank you for the clarification. I am just starting out with the analysis and trying to understand what I have learned from attending your sessions and put it to practical use. Therefore I have not gotten any output yet, but wanted to understand more about the different command statements in order to obtain the correct output.
 Linda K. Muthen posted on Monday, February 06, 2012 - 1:49 pm
A good way to answer your question and get experience is to run the analysis various ways and compare the results.
 Melissa Kimber posted on Saturday, March 10, 2012 - 1:09 pm
My LVSEM model involves two steps: first, completing a latent profile analysis to derive a latent class variable of community adversity; and then using that latent class variable as a 'predictor' in an LVSEM model.

Is there an example of how to do this somewhere?

The results of my LPA confirmed a 3-class solution.
So, I thought that in order to use my new latent class variable in my model, all I would have to do is include this in the VARIABLE command:

CLASS = C(3);

and then define my latent class variable in the %overall% model command with its continuous indicators, while using TYPE=MIXTURE in the ANALYSIS command.

Then I could regress my outcome of interest on my latent class variable C. However, an error message is saying that my latent class variable cannot also be defined as a continuous latent variable.

Any help that you can offer would be appreciated.
 Linda K. Muthen posted on Saturday, March 10, 2012 - 1:56 pm
Please send the output and your license number to
 Melissa Kimber posted on Saturday, March 10, 2012 - 3:07 pm
Hi Dr. Muthen,
I cannot, it is government protected data where each output has to be vetted through security and you are only allowed to vet twice. So, I need to save it for when my model works. Any ideas would be helpful.
 Linda K. Muthen posted on Sunday, March 11, 2012 - 11:35 am
I'm guessing you have y ON c. This is not the correct specification. Remove that. What you want to look at is the varying of the means of y across classes.
 Gail Smith posted on Friday, March 30, 2012 - 7:22 am
I am doing a LCA with 3 classes and want to change my reference class from class 3 to class 2. In earlier posts, you have suggested to use the ending values for the parameters in the class that you want to be last as starting values for the parameters in the last class in a subsequent analysis. My question is: where do I find these ending values?

Thank you for your help.
 Linda K. Muthen posted on Friday, March 30, 2012 - 9:38 am
They are your results in the analysis where class 3 is the reference class. You can use the SVALUES option of the OUTPUT command to generate input with starting values and then change the class labels. You also need to change the means of the categorical latent variables.
 Anthony Rosellini posted on Saturday, April 21, 2012 - 7:26 am
It is clear that one cannot treat a latent class/profile as an independent variable by regressing X on C; instead you recommend including the X outcomes in the analysis to see how they may vary across the classes.

My model is a 6-profile solution (LPA of 7 continuous indicators), however I am not interested in how the 6 profiles are differentially related to a dependent variable. Rather, I want to compare the predictive abilities of certain profiles to other independent variables (e.g., controlling for a closely diagnoses, do profiles 1 and 2 predict impairment).

e.g., X on C#1 C#2 DX1 DX2;

Is there a way to run such an analysis in a single step in Mplus? I understand that it is not recommended to export the posterior probabilities and run the analysis in a second step.

Could you also explain why it is not possible to regress X on C in Mplus? Do you expect this to be possible in future versions?
 Linda K. Muthen posted on Sunday, April 22, 2012 - 2:21 pm
We don't specify x ON c but the intercept difference of x across classes is this regression. You can use MODEL TEST to test the intercept differences.
 Bengt O. Muthen posted on Sunday, April 22, 2012 - 4:45 pm
So you should say

x on dx1 dx2;

and then the x intercept differences across the latent classes are equivalent to the slopes for "x on c".
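A small numerical sketch may help show why the intercept differences play the role of the slopes for "x on c". It is written in Python rather than Mplus, uses made-up data, treats class membership as observed rather than latent, and leaves out the dx1 and dx2 covariates, so it only illustrates the algebra and is not the Mplus estimation itself. The function name `ols_slope_intercept` is a hypothetical helper.

```python
def ols_slope_intercept(d, x):
    """Ordinary least squares of x on a single predictor d."""
    n = len(x)
    mean_d = sum(d) / n
    mean_x = sum(x) / n
    sxy = sum((di - mean_d) * (xi - mean_x) for di, xi in zip(d, x))
    sdd = sum((di - mean_d) ** 2 for di in d)
    slope = sxy / sdd
    intercept = mean_x - slope * mean_d
    return slope, intercept

# Made-up outcome values for two classes (illustration only).
x_class1 = [4.0, 5.0, 6.0]
x_class2 = [1.0, 2.0, 3.0, 2.0]

# Dummy variable: 1 for class 1, 0 for class 2 (the reference class).
dummy = [1] * len(x_class1) + [0] * len(x_class2)
x = x_class1 + x_class2

slope, intercept = ols_slope_intercept(dummy, x)
mean1 = sum(x_class1) / len(x_class1)
mean2 = sum(x_class2) / len(x_class2)

# The slope on the class dummy equals the difference in class-specific means,
# and the intercept equals the mean in the reference class.
print(slope, mean1 - mean2)
print(intercept, mean2)
```

With covariates in the model, the same logic applies to the class-specific intercepts of x rather than to the raw means.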
 Regan posted on Tuesday, June 12, 2012 - 11:53 am
Drs. Muthen
I have one question:

In the sample code below, drug use is used as a predictor of class membership:

c#1 on drug;

Is the following the correct code if I want to use drug use as a distal outcome -- testing if class membership predicts drug use?

drug on c#1;

Thank you!
 Linda K. Muthen posted on Tuesday, June 12, 2012 - 1:54 pm
For a distal outcome, simply include drug on the USEVARIABLES list. The varying of the means of drug across classes captures drug ON c#1.
 Regan posted on Friday, June 15, 2012 - 10:42 am
Thank you!
 Regan posted on Thursday, June 21, 2012 - 3:13 pm
Hello Professors

My first question:

In your handout on LCA, slide 126 shows that the predictor variable "black" is not significant in the regression equation for class 1, but it is significant for classes 2 and 3. Is this interpreted as "being black is a significant predictor of class membership only for classes 2 and 3, but not for class 1"? Also, would this imply that a multiple group model should be run for black and non-black respondents?

My second question:

In using a distal outcome, I know I need to compare the means across groups and use the 'model test' command. However, if I have 3 groups, do I need to run the model three times to obtain 3 different Wald tests (p1=p2; p2=p3; p1=p3)?

Thank you so much.
 Linda K. Muthen posted on Friday, June 22, 2012 - 9:25 am
1. No. For the interpretation, see page 445 of the Mplus User's Guide.

2. To test the three separately, you need to run MODEL TEST three times.
 Regan posted on Friday, June 22, 2012 - 11:22 am
Thank you again
 Regan posted on Friday, July 27, 2012 - 2:08 pm

When conducting the test of mean differences on a distal outcome, I am using the MODEL TEST command. I am running this several times because I have four groups. I am wondering if a post-hoc Bonferroni correction needs to be applied in this context, and if so, how it is conducted in Mplus. Thank you.
 Linda K. Muthen posted on Friday, July 27, 2012 - 2:10 pm
Whenever you do several tests you should consider some type of correction. I would suggest being conservative about the p-values. There is no formal approach taken.
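One conservative option of the kind mentioned above is a Bonferroni adjustment applied by hand to the p-values from the separate MODEL TEST runs. The sketch below is in Python, not Mplus; the p-values are invented for illustration, and `bonferroni` is a hypothetical helper, not an Mplus feature.

```python
from itertools import combinations

def bonferroni(p_values, alpha=0.05):
    """Return (adjusted p-value, reject?) for each raw p-value."""
    m = len(p_values)
    results = []
    for p in p_values:
        p_adj = min(1.0, p * m)  # multiply by the number of tests, cap at 1
        results.append((p_adj, p_adj < alpha))
    return results

# Four classes give six pairwise comparisons.
classes = [1, 2, 3, 4]
pairs = list(combinations(classes, 2))

# Hypothetical Wald-test p-values, one per pair (illustration only).
raw_p = [0.004, 0.030, 0.200, 0.001, 0.012, 0.600]

for pair, (p_adj, reject) in zip(pairs, bonferroni(raw_p)):
    print(pair, round(p_adj, 3), reject)
```

Equivalently, each raw p-value can be compared against alpha divided by the number of tests (here 0.05 / 6).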
 tomas dvorak posted on Friday, February 01, 2013 - 9:49 am
I ran a regression of the latent class variable on covariates. In the Model Results part of the output, for some covariates the S.E. is 0 (and the p-value is 999). What is the problem and how can it be avoided?
Many thanks for your help.
 Bengt O. Muthen posted on Saturday, February 02, 2013 - 8:59 am
That means that the slope cannot be determined. This happens when a class has zero variance for a covariate - everyone in that class has the same covariate value. It is the same issue as in ordinary logistic regression. It is not really a problem in that it is useful to know that people in that class are homogeneous with respect to that covariate.
 Lauren Christine Taylor posted on Tuesday, March 12, 2013 - 11:33 am

I have encountered the following error while running an LCA model without covariates on a dataset that contains both continuous and dichotomous outcomes (class indicators) with no missing values. I'm not sure what to do with this message since I don't have any covariates.


This parameter refers too:



 Linda K. Muthen posted on Tuesday, March 12, 2013 - 11:45 am
Please send the output and your license number to
  Chris Kenaszchuk posted on Wednesday, February 04, 2015 - 10:19 am
Thanks for your past responses. In the following input I can regress an observed variable, YT2, on a categorical latent variable, CW. Model estimation terminates normally, although I do receive a message about a non-positive definite matrix.

Several users report receiving Mplus output errors when they try this regression. The advice in response is that an observed variable can not be regressed on a latent variable. Instead, include the observed variable on the USEVARIABLES statement and then examine class-specific means.

What is the reason why I am able to run the regression? Is it because the observed variable YT2 is a dependent variable in the model?

CLASSES = CB(2) CW(2);

BETWEEN = CB W1 W2; !W1 and W2 are between-level covariates.
WITHIN = YT1 U1-U7; !YT1 is a covariate.


!YT2 is on both levels. Omit from BETWEEN= and WITHIN= above.

CB ON W1 W2;
YT2 ON CW; !The regression in question.


[U1 - U7];
[U1 - U7];
 Bengt O. Muthen posted on Wednesday, February 04, 2015 - 11:13 am
Please send your output where you see this YT2 ON CW regression to
  Chris Kenaszchuk posted on Monday, March 02, 2015 - 9:54 am
For the input below:
(1) YT2 ON CW is modeled on the between level. Does this mean that between-level clusters influence the association between within-level latent class membership (CW) and YT2?

(2) Is the first regression of YT2 on YT1 (under MODEL: %WITHIN%) expected to produce an intercept for YT2? There were slope estimates in the Within Level results but I did not see intercept estimates. YT2 intercept estimates are in the Between Level output, and I assume they are for the regression on CW.

(3) If I un-commented the regression lines under MODEL CW, would I be allowing the regression of YT2 ON YT1 to be different for each class? Thank you.

CLASSES = CB(2) CW(2);

STARTS = 100 20;


CB ON W1 W2;

[U1 - U7];
!YT2 ON YT1;

[U1 - U7];
!YT2 ON YT1;
 Bengt O. Muthen posted on Monday, March 02, 2015 - 4:45 pm
(1) You have


which should not be used because it regresses a continuous between-level random effect on a categorical latent class variable, which is not the Mplus design. The CW means can vary across CB classes without saying this.

(2) An observed variable has only one intercept and this is by default printed on Between.

(3) Yes, but you can't have it in the Overall part as well because that would lead to non-identification.
 S Elaine posted on Wednesday, March 18, 2015 - 12:42 pm
Van Horn et al. (2009) stated: "In general, we believe that regression mixture models are best viewed as a large-sample technique, though further methodological research is needed before sample size guidelines are provided."
Are there any recent guidelines about sample size for Latent Class Regression? I am unable to locate recent articles addressing this issue. I ask, because in exploring this technique our research team found three distinct groups based on differential effects of 3 risk factors on 3 mental health outcomes; however, we only have 291 children in our sample. I am wondering if it is reasonable to proceed with examining two predictors of group differences. Thank you for your help.
 Bengt O. Muthen posted on Wednesday, March 18, 2015 - 4:39 pm
I am not aware of such guidelines. You can easily do your own Monte Carlo study in Mplus to find out.
 James Swartz posted on Monday, October 19, 2015 - 8:50 am
I am running a multi-group latent class model with medical conditions defining latent class (c) and HIV serostatus as the knownclass. I have a set of covariates on which I want to regress both the latent and known classes. I do not want to treat them as distal variables but do want to allow them to influence class membership and within class conditional probabilities.

The categorical covariates are fine. The issue is with the covariate age in years, measured on an interval level. To get the model to run and converge, I have to use these statements in the model statement:

c with age;
c on newrace2;
c on orient;
c on k6cat;
c on alcuse2;
c on mjuse2;
c on methuse2;
c on eduse2;
c on popuse2;

This produces means for age and ORs for the categorical predictors. If I flip the statement "c with age" to be "c on age" to get ORs for each increased year of age as I might in an ordinary LR, the model does not converge and/or runs forever. I suspect the distribution of age, which has few cases at the upper end, might contribute to this.

Is there a better way to get ORs with CIs for the age variable? Also, and if not, is there a way to get significance tests to compare mean ages across the different latent classes or do I just have to use the CIs for age in the printout to figure that out myself?

Thanks for any help.
 Jon Heron posted on Monday, October 19, 2015 - 9:17 am
I guess in a more general setting you could derive the directional association from the ratio of the covariance and the variance of the independent variable, however I am struggling to envisage this when C is latent nominal - what does a covariance even represent in this situation?

Also, when you say "tests to compare mean ages across the different latent classes" it sounds like you are now thinking of age as being dependent.

There are discussions in the technical appendices regarding continuous dependent variables causing problems when their distribution is non-normal; however, I'm not aware of this problem when the variable is a predictor (indeed, its use as a predictor was used as one solution to this (LTB)).
 Bengt O. Muthen posted on Monday, October 19, 2015 - 2:07 pm
Just to add to Jon's answer, perhaps you want to scale down the age variable, e.g. centering it and/or dividing it by 10.
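As a rough illustration of this rescaling (in Python, outside Mplus, with made-up ages and a made-up slope), centering age and dividing by 10 changes only the unit of the logit slope: the odds ratio for the rescaled variable is the per-year odds ratio raised to the 10th power. The names `rescale_age` and `slope_per_year` are assumptions for this sketch.

```python
import math

def rescale_age(ages, divisor=10.0):
    """Center ages at their mean and express them in units of `divisor` years."""
    mean_age = sum(ages) / len(ages)
    return [(a - mean_age) / divisor for a in ages]

# Made-up ages (illustration only).
ages = [20.0, 30.0, 40.0]
age10 = rescale_age(ages)
print(age10)  # centered, in decades

# Hypothetical logit slope per one year of age.
slope_per_year = 0.03
slope_per_decade = slope_per_year * 10.0  # slope for the rescaled variable

# The odds ratio per decade equals the per-year odds ratio to the 10th power.
or_per_year = math.exp(slope_per_year)
or_per_decade = math.exp(slope_per_decade)
print(or_per_decade, or_per_year ** 10)
```

The rescaled variable would then be the one used in the "c on age" statement; the fit is the same model, only on a numerically friendlier scale.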
 Chris Giebe posted on Tuesday, November 15, 2016 - 2:33 am
Hello, I'm trying to include a covariate into my two-level model, with class 1 as a reference class.
I've been using example 10.6 of the user's guide and your ASB example of topic 5 part 3 video as a reference, to create the following model:


f BY c#2-c#4;
f ON w;

but am getting this error:

*** ERROR in MODEL command
Unknown variable(s) in a BY statement: C#2-C#4

What am I doing wrong?
Thanks in advance
 Bengt O. Muthen posted on Tuesday, November 15, 2016 - 5:24 pm
I don't know how you make class 1 your reference class. Mplus reacts against the mentioning of c#4 in the BY statement. Instead say

f BY c#1-c#3;

If this doesn't help, send output to Support along with license number.
 Chris Giebe posted on Thursday, March 30, 2017 - 6:29 am

Using manual BCH, I have successfully run an LCR with a single covariate and a single continuous distal outcome, comparing class means using the Wald test.

Is it possible to run several distal outcomes that are all part of a questionnaire?

In my dataset, I have several items of a health questionnaire that I would like to combine into an "overall health score" that is specific to each latent class and compare classes on that score.

Is that possible?
Comparing the different classes on an overall health score is just so much more meaningful to what I'm trying to do than comparing every single item in the questionnaire.

 Bengt O. Muthen posted on Thursday, March 30, 2017 - 9:20 am
Why not use the total score as the observed distal outcome.

Running several outcomes gives the same result as running one at a time.
 Chris Giebe posted on Saturday, April 01, 2017 - 7:06 am
Thanks for the quick response. I guess that makes sense to create a summary score beforehand, and then include that in the model.

I do have a follow-up question:

In the output, under MODEL RESULTS I am seeing the class specific Estimates, S.E., Est./S.E., and p-values columns.
Am I correct in understanding that the intercepts are the class means of my outcome variable?
I'm noticing under RESIDUAL OUTPUT (I'm assuming this is Tech4?)
ESTIMATED MODEL AND RESIDUALS (......) that there are also Model Estimated Means for my covariate and outcome.
These are vastly different than the intercepts under MODEL RESULTS.
Which ones do I report? The Model Estimated Means under RESIDUAL OUTPUT or the intercepts under MODEL RESULTS?

 Bengt O. Muthen posted on Saturday, April 01, 2017 - 4:32 pm
The intercepts for the outcomes are not the means for the outcomes - just like in regular regression.
 Jenny Chang posted on Monday, November 13, 2017 - 4:32 am
I am trying to use the 3-step approach to compare a distal outcome, PND, across classes. My preliminary result from auto BCH is consistent with that from the traditional 3-step approach. I then further control for the effect of a covariate, AG, on PND, which is not assumed to vary across classes.
1. The result of manual BCH shows that the Classification Probabilities matrix has negative values and also values above 1. The result is very different from that of auto BCH. Does it fail in this case? Web note 21 mentions that equal variance of the distal outcome may solve this problem, but I did not find the sample code.
2. I followed the syntax in Appendix E of Asparouhov & Muthén (2014) to use the manual 3-step approach (Vermunt, 2010), and compared the intercepts of PND by Wald test (syntax as follows) in order to compare the difference of their means. Does this make sense?
3. If my step 2 makes sense, I found the result has fewer significant pairwise comparisons than those from auto BCH and the traditional 3-step approach. Is the result from this manual 3-step approach robust, given that the association between C and the distal outcome is even lower than with the traditional approach?
4. Based on the current result, which approach is recommended?
PND on AG;
PND (a1);
PND (a2);
0 = a1 - a2;
 Bengt O. Muthen posted on Monday, November 13, 2017 - 3:38 pm
Send the 2 outputs (manual 3 step and auto BCH) to Support along with your license number and these questions.
 QianLi Xue posted on Thursday, January 11, 2018 - 7:58 am
When fitting a latent class regression,

I got the following:


Categorical Latent Variables

R1LOWACT 0.318
R1SLEEP 1.472
R1FINFIT 0.873
R1COMMENG1 1.010
R1MEDS 6.985
R1TECH 1.958

R1LOWACT 0.543
R1SLEEP 1.263
R1FINFIT 0.843
R1COMMENG1 1.001
R1MEDS 1.983
R1TECH 1.558

The question is why the z scores and confidence intervals are shown. They were shown in the Alternative parameterization table. Do we expect to get p-values from there?
 Bengt O. Muthen posted on Thursday, January 11, 2018 - 4:30 pm
The p-values are not given for these odds ratio results. You can compute confidence intervals for them using the approach shown in our FAQ on our website:

Odds ratio confidence interval from logOR estimate and SE
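The title of that FAQ suggests the usual calculation: build the interval on the log odds ratio scale from the estimate and its standard error, then exponentiate the limits. The Python sketch below shows that standard calculation under that assumption; it does not reproduce the FAQ itself, and the estimate and SE are invented numbers.

```python
import math

def odds_ratio_ci(log_or, se, z=1.96):
    """Odds ratio and approximate 95% CI from a logit slope and its SE."""
    lower = math.exp(log_or - z * se)
    upper = math.exp(log_or + z * se)
    return math.exp(log_or), lower, upper

# Hypothetical logit coefficient and standard error (illustration only).
estimate, std_err = 0.672, 0.210
or_hat, ci_low, ci_high = odds_ratio_ci(estimate, std_err)
print(round(or_hat, 3), round(ci_low, 3), round(ci_high, 3))
```

Because the interval is symmetric on the log scale, it is asymmetric around the odds ratio itself, and it excludes 1 exactly when the interval for the logit coefficient excludes 0.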
 samah Zakaria Ahmed posted on Monday, January 22, 2018 - 4:49 pm
I want to ask about MODEL RESULTS in the case of a 2-class latent variable and binary observed variables.
What do the Thresholds in each class refer to?
Do they refer directly to the coefficients of the logit function (alpha and beta)?
logit(prob(y=1|z)) = alpha + beta(z)
 Bengt O. Muthen posted on Monday, January 22, 2018 - 5:04 pm
The threshold parameter for a binary variable is the same as the logit intercept with a sign change.

The comparison category is 0 for the binary variable.
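Since the threshold is the logit intercept with the sign reversed, a class-specific threshold can be turned into the probability of category 1 with the inverse logit. The Python sketch below assumes a logit link and no covariates on the indicator; the threshold values are made up, and `prob_from_threshold` is a hypothetical helper.

```python
import math

def prob_from_threshold(threshold):
    """P(u = 1 | class) for a binary indicator, given its logit threshold."""
    intercept = -threshold  # threshold = -(logit intercept)
    return 1.0 / (1.0 + math.exp(-intercept))

# Hypothetical thresholds for one binary item in two classes.
for label, tau in [("class 1", -1.5), ("class 2", 2.0)]:
    print(label, round(prob_from_threshold(tau), 3))
```

A threshold of 0 corresponds to a probability of 0.5, and a large positive threshold corresponds to a low probability of the item being endorsed in that class.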
 Thomas Scotto posted on Friday, March 23, 2018 - 10:58 am
Part 1: Input:

Hello, I’m trying to run a multi-group LCA analysis—my groups are designated as known classes, g, and I want to estimate five classes. I want to regress each of the classes onto a simple dichotomous education covariate and analyse group specific results—I think this is correct?

classes= g(2) c(5);
knownclass= g (group=0 group=1);
Type = mixture ;
Starts= 25 25;
c on eduts;
c on g;
Model g:
c on eduts;
 Bengt O. Muthen posted on Friday, March 23, 2018 - 4:20 pm
Looks good.
 owis eilayyan posted on Wednesday, July 04, 2018 - 3:57 pm
I am trying to assess the influence of a latent class variable c on distal outcomes “AGG_PHYS” and “AGG_MENT”. I used the following syntax:
DATA: FILE IS C:\Users\Owis Eilayyan\Desktop\PhD\Scoring\Dec2017\LCA\DATA\LBP_Datav6A.dat;
VARIABLE: NAMES ARE ID red age gender marital children educ empl social Ethnicity hand AGG_PHYS AGG_MENT PainS PainInt ODI HADS_D HADS_A PHQ PF RP BP GH VT RE SF MH Effic FABQph FABQw KeelT KeelS;
usevariables are AGG_PHYS AGG_MENT PainS PainInt ODI HADS_D HADS_A Effic FABQph FABQw;
missing = .;
CLASSES = c (3);

However, I got this error message: “Unknown variable(s) in an ON statement: AGG_PHYS”. How can I run a regression analysis with distal outcomes using DU3STEP command?

Thank you,
 Bengt O. Muthen posted on Wednesday, July 04, 2018 - 4:40 pm
You don't say "...ON C" in Mplus, just like you don't regress anything on a nominal variable. For correct use of 3-step with a distal outcome, see the 2 papers on our website:

Asparouhov, T. & Muthén, B. (2014). Auxiliary variables in mixture modeling: Three-step approaches using Mplus. Structural Equation Modeling: A Multidisciplinary Journal, 21:3, 329-341. The posted version corrects several typos in the published version. An earlier version of this paper was posted as web note 15. (Download appendices with Mplus scripts).

Asparouhov, T. & Muthén, B. (2014). Auxiliary variables in mixture modeling: Using the BCH method in Mplus to estimate a distal outcome model and an arbitrary second model. Web note 21.
 owis eilayyan posted on Thursday, July 05, 2018 - 6:01 am
Thank you Dr. Muthen,
 Lan Luo posted on Thursday, August 23, 2018 - 7:07 pm

I also encountered the same problem. Can I get some advice?

I want to evaluate the relationship between a binary dependent variable and 5 latent classes + some other independent variables.

I have read the paper:
Asparouhov, T. & Muthén, B. (2014). Auxiliary variables in mixture modeling: Using the BCH method in Mplus to estimate a distal outcome model and an arbitrary second model. Web note 21.

However, this paper mentions that BCH is executed when the dependent is a continuous variable.

I also have read the paper:
Asparouhov, T. & Muthén, B. (2014) Auxiliary variables in mixture modeling: Three-step approaches using Mplus. Structural Equation Modeling: A Multidisciplinary Journal, 21:3, 329-341.

However, this paper only provides an example for the linear regression.

Is there any methods to suit my case?

In my case,
the dependent variable is binary (death or not),
5 latent classes,
and 5 independent variables (agegrp, gender, insurance, seifa, single).

Your help will be greatly appreciated!

 Bengt O. Muthen posted on Friday, August 24, 2018 - 5:56 pm
See Table 6 of Web Note 21 which recommends the DCAT auxiliary setting for a categorical distal outcome.
 Lan Luo posted on Sunday, August 26, 2018 - 4:57 pm
Hi Bengt,

Thank you for your advice!

Yes, the DCAT auxiliary setting is recommended for the categorical distal outcome.

I tried the following syntax:

names = death single female agegrp
finyear icu priv char seifa rural
categorical = dis1-dis11;
usevariables = dis1-dis11;
classes = c(5);
Auxiliary = death(DCAT);
Type = mixture;
starts = 200 50;

However, I found it can only evaluate the relationship between the outcome(death) and the latent class variable.

Can you advise me how to bring other covariates(single female agegrp finyear icu priv char seifa rural) into the regression as independent variables?

In my case, latent variable and other covariates(single female agegrp finyear icu priv char seifa rural) are expected as independent variables.

Thanks a lot!

 Bengt O. Muthen posted on Monday, August 27, 2018 - 2:12 pm
See the manual approach described in Appendix E from

as referred to in the paper on our website:

Asparouhov, T. & Muthén, B. (2014). Auxiliary variables in mixture modeling: Three-step approaches using Mplus. Structural Equation Modeling: A Multidisciplinary Journal, 21:3, 329-341. The posted version corrects several typos in the published version. An earlier version of this paper was posted as web note 15. (Download appendices with Mplus scripts).
 Lan Luo posted on Monday, August 27, 2018 - 4:30 pm
Thanks, Bengt O!

The manual approach in Appendix E is an example for a linear regression.

According to this paper, can I add a setting "categorical = death" in step 3 to define my dependent variable to be binary? So I can get the result for the logistic regression?
 Lan Luo posted on Monday, August 27, 2018 - 4:32 pm
I tried the following syntax in step 3:

names = dis1-dis11 death single female agegrp finyear C1-C5 n;
usevariables = death single female
agegrp finyear n;
classes = c(5);
categorical = death;
nominal = n;
Type = mixture;
Starts = 200 120;
death on single female
agegrp finyear;
death on single female
agegrp finyear;
death on single female
agegrp finyear;
death on single female
agegrp finyear;
death on single female
agegrp finyear;
 Lan Luo posted on Monday, August 27, 2018 - 4:33 pm
Is such a syntax feasible for my case?

Your help will be greatly appreciated!

Thanks again!

 Bengt O. Muthen posted on Monday, August 27, 2018 - 5:22 pm
Send your outputs from the first and last step to Support along with your license number.
 Daniel Lee posted on Thursday, June 06, 2019 - 9:55 am
Hi, I am trying to conduct a mixture model for a very simple regression with a categorical dependent variable (i.e., identifying subgroups for Y on X).

My code is not working and I was wondering if you could help me revise it. Thank you so much, as always:

[NMENT4$2*1] (1);
[NMENT4$2*-1] (1);
 Bengt O. Muthen posted on Thursday, June 06, 2019 - 5:17 pm
We need to see your full output to be able to say - send to Support along with your license number.