Multilevel latent class analysis
 Sven De Maeyer posted on Monday, March 29, 2004 - 9:01 am

I would like to ask whether it is possible to include a latent class variable in a multilevel model. For instance, in school effectiveness research, some characteristics could distinguish 5 classes of schools. Would it be possible to use this latent class variable to predict student outcomes at the lowest level, or is this impossible? If it is possible, are there references with more details about this option?

thank you for the information
 Linda K. Muthen posted on Monday, March 29, 2004 - 9:24 am
Version 3 of Mplus can estimate multilevel mixture models. The Version 3 User's Guide will describe how to set these models up.
 shige posted on Monday, March 29, 2004 - 6:32 pm
Hi Linda,

In order to estimate a multilevel mixture model (that is, a multilevel model with a finite mixture random component), I need the base package, the multilevel add-on, and the mixture add-on. Am I correct about this? Thanks!

 Linda K. Muthen posted on Tuesday, March 30, 2004 - 7:19 am
You would need the combination add-on to estimate a multilevel mixture model.
 Joyce T. posted on Monday, April 04, 2005 - 6:36 am
I am running a multilevel model (using ML) which contains 20 dependent variables, 12 independent variables and 3 continuous latent variables.
I would like to know how Mplus computes the degrees of freedom for both the chi-square test of model fit and the chi-square test of model fit for the baseline model.
 BMuthen posted on Wednesday, April 06, 2005 - 3:03 am
The degrees of freedom is the number of parameters in the H1 model minus the number of parameters in the H0 model. The chi-square test of model fit for ML uses as H1 a model with free means, and free variances and covariances for both within and between. The baseline model is a model of free means and variances for between and within.
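The counting rule in this reply can be illustrated with a small sketch. It assumes the simplest case, in which all p observed variables are modeled on both levels and no covariates are conditioned on, so the counts will differ for a model with independent variables such as the one asked about. The function names are hypothetical and nothing here is taken from Mplus output.

```python
def h1_parameter_count(p: int) -> int:
    """Unrestricted two-level H1 model (assumed setup): p means plus a full
    covariance matrix, p*(p+1)/2 parameters, on each of the two levels."""
    cov_params = p * (p + 1) // 2
    return p + 2 * cov_params


def baseline_parameter_count(p: int) -> int:
    """Baseline model: p means plus p variances on within and on between."""
    return p + 2 * p


def degrees_of_freedom(h1_params: int, h0_params: int) -> int:
    """df = number of H1 parameters minus number of H0 parameters."""
    return h1_params - h0_params


p = 4  # hypothetical number of observed variables
h1 = h1_parameter_count(p)
print("H1 parameters:", h1)
print("Baseline df:", degrees_of_freedom(h1, baseline_parameter_count(p)))
```

For the fitted model itself, the same subtraction applies with the H0 count read from the "Number of Free Parameters" line of the output.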
 Ann-marie Faria posted on Friday, September 25, 2009 - 7:42 am
I am working on a multi-level LCA in a school-based data set. The variables of interest are 1) parent involvement (a child level variable) and 2) classroom quality (a classroom level variable). The ultimate goal is to use the latent classes as independent variables to predict child outcomes such as school readiness.

1) One question that has come up in our discussions is how to deal with important covariates (such as maternal education, child age, child sex, child ethnicity). I was wondering if you could speak to the differences between a) including the covariates in the model that estimates the latent class memberships vs. b) running post-hoc ANOVAS to examine the distribution of the profiles on these important covariates.

2) Also I am interested in how using the Latent Classes as independent variables to predict other outcomes would shift class memberships. Does this often happen when adding a predictive step into LCA/LPA models?

3) Finally, we are using a national database with sampling weights. Will weighting the data influence the LCA/LPA outcomes?

Thank you so much for your time
 Linda K. Muthen posted on Friday, September 25, 2009 - 11:04 am
1. Generally it is best to estimate the full model simultaneously. See the following paper which is available on the website for further information:

Relating latent class analysis results to variables not included in the analysis. Submitted for publication.

2. If you add outcomes other than the original set, this will most likely change class membership and perhaps it should. See the following related paper which is available on the website:

Muthén, B. (2004). Latent variable analysis: Growth mixture modeling and related techniques for longitudinal data. In D. Kaplan (ed.), Handbook of quantitative methodology for the social sciences (pp. 345-368). Newbury Park, CA: Sage Publications.

3. You can and should include complex survey data features in the analysis. See the user's guide under complex survey data to see the options available in Mplus.
 Ann-marie Faria posted on Friday, September 25, 2009 - 11:18 am
Thank you for your quick reply and for the references.

Best, Ann-Marie
 Ann-marie Faria posted on Monday, May 10, 2010 - 12:03 pm
Hello Linda-

I have two follow up questions for the multilevel LCA we are working on.

1) Standardized Scores: Do you suggest running the models with standardized continuous indicators? Or is it acceptable to keep the indicators of profiles in their original metric (even if variances are different among indicators?)

2) Predicting Outcomes from Profile Membership: Also, for estimating the relationship between different profiles and an outcome (say literacy achievement) we have been including the outcome of interest as an indicator of profile membership. We then ran a Wald statistic to examine profile differences on the mean estimates of the outcome. Is this how you would suggest estimating profile differences on an outcome?

Thank you-
 Linda K. Muthen posted on Tuesday, May 11, 2010 - 9:48 am
1. I would not standardize.
2. Yes.
 Andre Plamondon posted on Saturday, June 19, 2010 - 12:40 pm
I have a question regarding a multi-level LCA too.

I am using a sample of twins in which I would like to identify the genetic-environmental etiology of class membership. I also have predictors at the within and the between level.

Since regressing on the measured environmental variable doesn't seem to modify the ACE results (as seen in Turkheimer et al., 2005), I would like to use another strategy, developed by Rasbash, O'Connor and Jenkins, in which the genetic resemblance is a fixed effect.

Although I think this strategy is the best, I'm not sure how to translate the equation into Mplus input.
The equation is:
y(ij) = Beta(0) + u(j) + e(ij) + g(ij)

Basically, the only thing that changes from a regular multilevel model is g(ij), the genetic effect for child i in the j'th family, which varies across individuals according to behavior genetic assumptions (it can be used with complex family pedigrees). I am wondering what to write in the input. I think they use a single group for the analyses, which departs from the multiple-group analyses I am used to with twin samples.

Maybe the paper would make it clearer :

The equation is on page 8.
 Bengt O. Muthen posted on Sunday, June 20, 2010 - 11:31 am
Page 8, bottom, suggests that a covariance for the g(ij) term is a function of known constants. This reminds me of "QTL" modeling which is shown in the UG ex 5.23. This UG example shows how to use the Constraint= approach to moderate a covariance using read-in values. Perhaps that is a path towards doing what you want.
 Ann-marie Faria posted on Friday, June 25, 2010 - 10:48 am
In running the multi-level LPA I mentioned earlier, we see from the output that the means of the indicators in the level 2 profiles are constrained to be equal across profiles. It appears that this is the default in Mplus.

Is it possible to simultaneously estimate a latent profile at level 1 (child-level variables) and a latent profile at level 2 (classroom-level variables) without the level 2 means being fixed across level 2 profiles?

I attempted to override this default with starting values, but am getting repeated error messages:

The following MODEL statements are ignored:
* Statements in Class %CB#1% of MODEL CB on the BETWEEN level:
* Statements in Class %CB#2% of MODEL CB on the BETWEEN level:
One or more MODEL statements were ignored. These statements may be incorrect.

Do you have any suggestions?

Thank you again,
 Linda K. Muthen posted on Friday, June 25, 2010 - 10:56 am
Please send the full output and your license number to
 Robert Urban posted on Friday, December 17, 2010 - 10:12 am
Dear Dr. Muthén,
I want to perform LCA on a complex dataset (teachers were rating students), and want to control for clustering effect. However I cannot define Type=complex, since this is already done with Type=mixture.
How can I use the clustering or stratification options in this type of analysis?
Thanks a lot.
 Linda K. Muthen posted on Friday, December 17, 2010 - 10:43 am
 Michelle Maier posted on Friday, March 11, 2011 - 8:37 am
I am working on a two-level LCA where the variables of interest include both child-level (observed child interactions) and class-level variables (classroom quality). I have specified two classes at each level. I would like to see if the resulting four profiles differ on child-level school readiness outcomes. Is there a way to get an outcome mean for each profile? I am only able to get two means (one for each of the within classes) rather than four means (one for each of the four profiles).
Thank you!
 Bengt O. Muthen posted on Friday, March 11, 2011 - 9:49 am
Take a look at the 2010 Henry-Muthen article in the SEM journal. For the models of figures 1-4 the between classes only make the within classes more or less likely but don't change the profiles of the observed items. In contrast, for figure 5 and on there are item-specific differences across the between-level classes so that would give the profile differences you expect.
 K Frampton posted on Wednesday, March 16, 2011 - 11:23 am

I am running a multilevel LPA with continuous parenting indicators, with children nested within families. My goal is to identify parenting profiles, observe how they differ across various factors (mainly SES), and then use profile membership to predict a distal outcome (children's prosocial skills) in interaction with SES.

I first fit the model as a single-level model, and then as a multilevel model. 4 classes were identified. Entropy is .86.

I then added covariates of interest (e.g., age of child, SES variables) to identify what distinguishes these groups. When I do this, the structure of the classes changes significantly. I know this is because of measurement variance issues. When I regress the parenting indicators on the covariate(s) in the single-level model, it improves the fit of the model. However, when I do the same thing in the multilevel model, computation time exceeded 2 hours and the model did not converge.

Any suggestions on how to get around this measurement variance issue in a multilevel? Because entropy is high, is it feasible in a multilevel framework to save the classes identified and then work with them as an observed variable, as you might do in a single level?

Also, with all this in mind - how would you suggest answering my final question - how SES X profile predicts a distal outcome?

I am grateful for any recommendations. Thanks.

 Bengt O. Muthen posted on Wednesday, March 16, 2011 - 4:30 pm
It sounds like you have measurement non-invariance and that you add direct effects from covariates to indicators to take this into account. Note that you cannot identify a model with all direct effects.

To see your multilevel problem, you would have to send your input, output, data, and license number to

SES X profile influencing a distal outcome can be handled by regressing the distal outcome on SES with different slopes in the different profiles.
 IYH Boon posted on Monday, June 27, 2011 - 12:58 pm
Are there any examples/code snippets available for situations like the one K Frampton describes, above, where the goal is to (1) identify latent profiles at level two and (2) relate these profiles to a distal outcome observed at level one?

I'm working on a similar problem and am unsure about how to specify the model statement.

Thanks in advance for the help,

 Bengt O. Muthen posted on Monday, June 27, 2011 - 5:32 pm
I don't think we have that in script or paper form, but you would work along the lines of the below. This creates a between-level (say school) latent class variable cb from the between-level z indicators and cb influences the means of the random intercept for the distal outcome d (which is say a student variable varying on both within and between), which is how the cb influence carries over to the student's distal outcome.



Between = z1-z10 cb;
classes = cb(2);




d on x;




I don't think you have to say more in MODEL because the z means vary across the cb classes as the default, and so does the d mean, where on between d is the random intercept in the regression of d on x.

Hope this start helps.
 Junqing Liu posted on Thursday, August 25, 2011 - 12:50 pm
I am new to Mplus and LPA. I am working on a two-level LPA in a workforce data set. The variables of interest are 1) organization culture (a level 2 latent variable based on five level 2 continuous indicators) and 2) worker demography and practices (level 1 observed variables). The goal is to use the latent classes as independent variables to predict workers' practice, such as using a type of therapy.
1) One question is whether I need to run the LPA first to get the classes (say there will be 2 or 3 categories) before including the latent class variable in the final model to predict the worker outcome.

2) In the final model using the org. culture class membership to predict the worker outcome, do I need to include the observed organization id as a predictor to declare that this is a two-level model?

3) What is the output of the final model? Is it separate regression models for each category of org. culture? Or is it one regression?

4) Is there any empirical research reference on cross-sectional multilevel LPA analysis that I can read?

Many thanks!
 Bengt O. Muthen posted on Thursday, August 25, 2011 - 5:29 pm
1) Although not necessary, it's a good idea.

2) Organization would be your Cluster= variable - see UG.

3)-4) You should read

Henry, K. & Muthén, B. (2010). Multilevel latent class analysis: An application of adolescent smoking typologies with individual and contextual predictors. Structural Equation Modeling, 17, 193-215.

which you can find on our web site.
 Junqing Liu posted on Friday, August 26, 2011 - 1:48 pm
Hi Bengt,

This is extremely helpful!

I can compare different LPA models and pick one that fits the best as the final latent profile model to do further analysis. I have some follow-up questions about the further analysis.

1.How common is it to use the latent profile variable as a predictor along with other covariates, rather than as a dependent variable?

2. If it is common, should a level 1 latent profile variable be included as a regular categorical covariate (along with other level 1 and level 2 predictors) to predict a level 1 outcome? Or does the way to include it depend on how the latent profile variable is modeled, for example as a two-level latent profile model with a level 2 factor on random latent class intercepts and a level 2 factor on random latent class indicators?

3. Is there any empirical research reference on using multilevel latent profile variable as predictor that I can read?

Thank you for your patience with my long-windedness.
 Bengt O. Muthen posted on Saturday, August 27, 2011 - 12:01 pm
1. It is becoming more widely used now that software is available for easy use. There are papers on our web site showing this. But I would not say that it is common yet.

2. A latent class variable should be included as a predictor if substantive theory warrants that. Note, however, that you don't say "y ON c" (for a distal outcome y), but Mplus lets the y means change over the latent classes.

3. I have not seen multilevel latent profile used as a predictor yet in the literature, but there is nothing precluding it. The approach used in Henry & Muthen can easily be expanded to that using Mplus.
 Junqing Liu posted on Friday, September 09, 2011 - 6:56 am
Thanks, Bengt. This is very helpful.

I tried the following three-class two-level LPA of org. culture. However, the output does not include results on latent classes. All the results are about correlations and covariances. The five latent class indicators are scale mean scores, with values ranging from 1 to 5.

1) How should I change the following syntax to get output on latent classes?

2) Is it OK to use scale means as latent class indicators? Or is it better to use the items within the scales as indicators?

Thank you very much.


Usevariables = l_coh_ag L_aut M_Coll M_FocOut M_RoCn ORGID;
Classes = c(3);
Within=l_coh_ag L_aut M_Coll M_FocOut M_RoCn;

type = mixture twolevel;
starts=20 10;

C#1; C#2; C#1 WITH C#2;


 Bengt O. Muthen posted on Friday, September 09, 2011 - 7:11 am
Please send your output to support.
 Junqing Liu posted on Tuesday, September 13, 2011 - 12:49 pm
Thanks, Bengt and Linda. The technical problem is solved.

I have a couple of questions related to the results of the two-level model i mentioned earlier.

1) Are tech 11 and tech 14 applicable to a two-level LPA model? If so, when the LMR p value of tech 11 is not significant but the p value of tech 14 is, should I pick the k-class model as opposed to the k-1 class model?

2) The BIC and AIC of the two-level model are smaller than those of the single-level model, but not by much (e.g. the adjusted BIC is 2542.96 for the two-level model and 2559.12 for the single-level model). In this case, should I still choose the two-level model?

3) What does the following specification mean, especially C#1 WITH C#2?

C#1; C#2; C#1 WITH C#2;

Thank you very much!
 Tihomir Asparouhov posted on Tuesday, September 13, 2011 - 5:39 pm
1) In principle tech14 is more reliable; however, you might also want to look at BIC and AIC for the two models.

2) You should also consider the size of the variance on the between level and see if it is significant. Note also that the performance of BIC would depend on the number of clusters (two-level units) which drives the asymptotics. Ultimately a simulation study would show if AIC and BIC are useful in this context. This is not well studied.

3) These are the interaction parameters in a log-linear model for contingency tables. For example in

these are the lambda_IJ parameters.
 Tihomir Asparouhov posted on Tuesday, September 13, 2011 - 7:32 pm
Actually the answer to #3 above is not correct at all. Here is the correct answer.

C#1 and C#2 are alpha1j and alpha2j in formula (4) in

These are normally distributed random effects that vary over clusters and allow the class proportions to change over clusters. The covariance term C#1 WITH C#2; is just the covariance between the two random effects and C#1; C#2; are the two variances.
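To make the role of these random effects concrete, here is a minimal sketch of how cluster-specific class proportions follow from the logits. It assumes the usual multinomial logit with the last class as reference (logit fixed at zero); the logit means and random-effect values are made-up numbers, and the function name is hypothetical.

```python
import math


def class_probabilities(logit_means, random_effects):
    """Class probabilities for one cluster.

    logit_means: logit means for classes 1..K-1, relative to class K.
    random_effects: this cluster's values of C#1..C#(K-1).
    The reference class K has logit 0 and no random effect.
    """
    logits = [m + a for m, a in zip(logit_means, random_effects)] + [0.0]
    denom = sum(math.exp(v) for v in logits)
    return [math.exp(v) / denom for v in logits]


logit_means = [0.5, -0.2]  # hypothetical values
average_cluster = class_probabilities(logit_means, [0.0, 0.0])
shifted_cluster = class_probabilities(logit_means, [1.0, -0.5])
print([round(p, 3) for p in average_cluster])
print([round(p, 3) for p in shifted_cluster])
```

A cluster with a positive C#1 value has a larger share of class 1 than the average cluster; the variances and the covariance describe how much such shifts vary, and co-vary, across clusters.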
 Mary Campa posted on Thursday, September 15, 2011 - 4:40 pm
Hello Dr. Muthen,

I am reading your paper with Dr. Henry (2010, SEM 17, 193-215) and trying to replicate the analysis titled: Three classes at level 1, Two classes at level 2 random effects model: nonparametric approach (Model 4a from Table 1).

I am using 4 classes at level one but otherwise the model statements are the same. However, I continue to get this error message:

** ERROR in MODEL command
Unknown class model name CW specified in C-specific MODEL command.

I am not sure what I am missing?

Thank you for your assistance.
 Bengt O. Muthen posted on Thursday, September 15, 2011 - 6:00 pm
This can only be answered if you send the output to support.
 Junqing Liu posted on Friday, September 16, 2011 - 9:03 am
Thank you very much, Tihomir.

I have a couple follow up questions regarding the interpretation of c#1 and C#2.

1) Does the within-level mean of c#1 represent the mean of the random intercept of c#1 as compared to c#3? What does a significant p value for the within-level mean of c#1 indicate?

2) What does a non-significant p value of the between-level variance of c#1 mean?

3) Regarding your earlier response on examining the significance of the between-level variance, should I do a log likelihood ratio test of the one-level and two-level models, or should I use the p values mentioned above, that is, the p values of the between-level variances of c#1 and c#2?

Thanks again!
 Bengt O. Muthen posted on Friday, September 16, 2011 - 9:41 am
1) Yes. That p-value is typically not of interest because it tests that the logit is zero which with 3 classes is not a meaningful point.

2) That indicates that the between-level variance for class 1 is not significantly different from zero.

3) Neither test is optimal because of testing variances at zero, the border of the admissible parameter space. I would just leave them as is and report them.
 Friedrich Platz posted on Thursday, January 05, 2012 - 6:26 am
Happy New Year!
I hope you had great holidays! I'm a beginner in the field of multilevel analysis with a lot of questions. First, my aim is to reveal a typology of musicians who were rated by audience members.
I have read your very interesting and inspiring paper (Henry & Muthen, 2010). First of all, I have some questions about your paper that could help me clear up some misunderstandings:
1) To which item or construct does your cluster variable, named LEAID, refer?
2) It would be much easier to follow the steps reported in your article if I could run your code on the dataset. I wrote to Mrs. Henry, but she advised me to simulate such a dataset with a Monte Carlo technique. Unfortunately, I am not able to do this. How does it work?
3) I would like to use a model for my own dataset that is the same as the one presented in Figure 7 (multilevel latent class model – non parametric approach with level 2 factor on random latent class indicators). Would you give me a hint on how to write this in Mplus code?
4) The final model also seems attractive to me, but it is hard to understand your reported code without any comments on it, especially the lower part (model constraint). I would be happy if you could give us some comments on the code.

I would be very happy, if you could help me answering these questions.

Thank you!
 Linda K. Muthen posted on Thursday, January 05, 2012 - 12:28 pm
You can look at Examples 10.6 and 10.7 in the user's guide. These are similar to what is done in the Henry and Muthen paper and there are data available with each example. A note of caution: if you are a beginner in multilevel analysis, starting with a multilevel mixture model may not be a good idea. Studying both multilevel and mixture modeling as a first step is a good idea.
 Friedrich Platz posted on Thursday, January 05, 2012 - 1:00 pm
Thanks a lot. Which literature would you recommend for a good introduction to these topics?
All the best!
 Linda K. Muthen posted on Thursday, January 05, 2012 - 1:18 pm
You can see our Topics 5, 6, 7, and 8 course handouts which cover these topics and contain references.
 Friedrich Platz posted on Thursday, January 05, 2012 - 1:26 pm
Thank you!
 Angela Urick posted on Thursday, January 19, 2012 - 12:46 pm
I’m working on a two-level LCA (types of teachers in schools with types of principals) with a cw and cb that have different indicators (I have labeled them uw’s and ub’s below). These two sets of indicators have dichotomous and continuous measures. Finally, I want to regress cw on cb with a random intercept (similar to ex. 10.7). Here is my basic code:

CATEGORICAL = uw1 - uw15 ub1 - ub13;
CLASSES= cb(3) cw(4);
WITHIN= uw1 - uw30 x;
BETWEEN= cb ub1 - ub30 w;

cw on x;
cb on w;
cw#1 - cw#3 on cb;

[uw1$1 - uw15$1]
[uw16 - uw30]
[uw1$1 - uw15$1]
[uw16 - uw30]
[uw1$1 - uw15$1]
[uw16 - uw30]
[uw1$1 - uw15$1]
[uw16 - uw30]

Here are my questions:
1. Is an LRT (TECH 11, 14) possible for this model with different indicators for cw and cb? If not, how would you suggest that I assess class fit?
2. Theoretically, there is a two-way relationship between cw and cb. How would you suggest that this be modeled?
 Bengt O. Muthen posted on Thursday, January 19, 2012 - 8:57 pm
1. No. It is an open research question on how to determine number of classes in a multilevel setting such as this. The ordinary BIC, for example, may not be the best approach. Interpretability can be helpful.

2. The relationship between cw and cb is captured by the between-level statement

cw#1 - cw#3 on cb;

where the cw random intercepts are influenced by cb.
 Angela Urick posted on Saturday, January 21, 2012 - 10:33 am
Thank you, Dr. Muthen.
 Angela Urick posted on Monday, January 30, 2012 - 1:05 pm
Good afternoon, Dr. Muthen,
I have another question in reference to the model mentioned above. In the results, the means of the level 1 indicators (uw/u) vary across classes as expected. However, the means of the level 2 indicators (ub/z) are the same across all classes. I ran cb as a single level LCA—there should be three different classes. Why would these means remain the same across the between level classes? Do I need to free the between level indicators or make other edits to the code?
Thanks again, Angela
 Bengt O. Muthen posted on Monday, January 30, 2012 - 8:26 pm

Model cb:



and then mention the intercepts of the ub indicators in each of the cb classes.
 Angela Urick posted on Tuesday, January 31, 2012 - 10:19 am
Thanks, it worked.
 Mary Campa posted on Saturday, May 26, 2012 - 12:13 pm
I am building a model similar to the Henry & Muthen, 2010 model 2, a two-level random effects LCA. I have selected a four-class model as best fitting. My question is about the between-level variances produced by this code:

C#1; C#2; C#3; C#1 WITH C#2 C#3;
C#2 WITH C#3;

I used starting values to switch the ordering of the classes and the estimates and standard errors of the between-level variances (C#1 -C#3) changed. For example, on the initial run, the variance in class 2 was significant but on the next run the variance for the same class (although now a new number) was not. This happened for multiple classes, where the parameter estimates changed based on what class I selected as the reference.

My understanding from the Henry & Muthen paper is that these parameters represent the between-level variance in class membership. The classes (proportions, probabilities of indicators) remain the same regardless of which class is the reference, so I am not clear why these parameters are changing.

Does this suggest there is something wrong with my model or am I wrong in the interpretation?

Thank you for your help.
 Bengt O. Muthen posted on Saturday, May 26, 2012 - 12:42 pm
The multinomial regression has the coefficients at zero for the last class, the reference class. The between-level variance components add to the coefficients for all but the last class. It makes sense that the size and SEs of these variance components change when you change the order of the classes because it is all relative to the reference class.
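The dependence on the reference class can be worked through with a little algebra. If the random effects are defined relative to the last class, then taking class r as the reference instead turns each random effect a_k into a_k - a_r (with a = 0 for the old reference class), so each new variance is Var(a_k) + Var(a_r) - 2 Cov(a_k, a_r). The sketch below applies that transformation to a made-up covariance matrix; the function name and numbers are hypothetical, and estimates from an actual rerun may differ if a different solution is reached.

```python
def rereference_covariance(sigma, new_ref):
    """Covariance of random effects after switching the reference class.

    sigma: (K-1) x (K-1) covariance matrix with the last class (index K-1)
           as reference. new_ref: zero-based index of the new reference class.
    Returns (kept_classes, new_sigma), where kept_classes lists the class
    indices that now carry a random effect.
    """
    k_minus_1 = len(sigma)
    n_classes = k_minus_1 + 1
    # Pad with zeros: the old reference class has no random effect.
    full = [[sigma[i][j] if i < k_minus_1 and j < k_minus_1 else 0.0
             for j in range(n_classes)] for i in range(n_classes)]
    keep = [c for c in range(n_classes) if c != new_ref]
    r = new_ref
    new_sigma = [[full[a][b] - full[a][r] - full[r][b] + full[r][r]
                  for b in keep] for a in keep]
    return keep, new_sigma


sigma = [[1.0, 0.3],
         [0.3, 0.5]]  # hypothetical values, 3 classes, class index 2 as reference
classes, new_sigma = rereference_covariance(sigma, new_ref=0)
print(classes)
print(new_sigma)
```

In this made-up case the second class has variance 0.5 relative to the last class but about 0.9 relative to the first, even though the underlying classes are unchanged.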
 sunY posted on Friday, June 08, 2012 - 3:44 pm
Hi Dr. Muthen,
Thank you so much for your help in advance.
I'm running a two-level LCA in which the latent class variable and the independent variable (treat) are at the school level, and the dependent variable is at the student level.
Since the number of clusters is 32, isn't the sum of the latent class counts supposed to be 32 as well? But the results show class counts at the student level, not the school level, as shown below:

Class Counts and Proportions

Latent class   Count   Proportion
1              1437    0.53720
2              1238    0.46280

CLUSTER = school;
BETWEEN = treat c;
CLASSES = c(2);

STARTS = 150 25;

dv on treat;
dv on treat;
 Bengt O. Muthen posted on Friday, June 08, 2012 - 8:36 pm
Although the latent class variable is a between-level variable (varying across schools only), the class counts and proportions printed say how many students are in each class. But there are only 32 schools and the latent classes refer to them.
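The distinction between school-level classes and student-level counts can be shown with a toy calculation. The sketch assumes each school is assigned to a single class and uses invented school sizes; Mplus may base its printed counts on estimated class probabilities rather than hard assignments, so this only illustrates why the counts sum to the number of students.

```python
from collections import Counter


def student_counts_by_class(school_class, school_size):
    """Sum the number of students over the schools assigned to each class."""
    counts = Counter()
    for school, latent_class in school_class.items():
        counts[latent_class] += school_size[school]
    return dict(counts)


# Hypothetical data: four schools, two between-level classes.
school_class = {"A": 1, "B": 1, "C": 2, "D": 2}
school_size = {"A": 90, "B": 110, "C": 70, "D": 80}

counts = student_counts_by_class(school_class, school_size)
total = sum(counts.values())
print(counts)
print({c: round(n / total, 5) for c, n in counts.items()})
```

Here two schools fall in each class, yet the counts are 200 and 150 students, which is the pattern seen in the output above.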
 sunY posted on Friday, June 08, 2012 - 11:29 pm
Thank you so much for the prompt response.
 pandhusujarwo posted on Thursday, July 05, 2012 - 6:09 am
Dear Prof Muthen,

I am fitting a multilevel finite mixture model for count data. I want to get the variances for each class at level 1 and level 2. I ran the model with this code:

y on x1;
y on x1;

y on w;
y on w;

With this model, Mplus gives me only one variance across classes and levels. How should I interpret this variance? Can Mplus 6.1 report variances for each class and level?

Thanks very much for your help
 Linda K. Muthen posted on Thursday, July 05, 2012 - 11:07 am
Please send your output and license number to
 xiaoshu zhu posted on Wednesday, July 25, 2012 - 10:56 am

I have a question regarding the class membership at the group level.
I followed the code for the non-parametric MLCA in Henry and Muthen (2010) and specified a model with two student LCs and three group LCs. The output showed that some groups were assigned to two group LCs simultaneously.

How can we deal with this problem? Should we decide the group class membership based on the class with the largest proportion of students within the group?

Thanks in advance!
 Linda K. Muthen posted on Thursday, July 26, 2012 - 11:59 am
Please send the output and saved data along with your license number to support. Point specifically to what you see as the problem.
 Mike Todd posted on Sunday, January 20, 2013 - 5:14 pm
Hello there:

We have a multilevel dataset (individuals nested with census tracts) that we would like to use in a multilevel latent profile analysis.

The census tracts (Level 2 units) and, in turn, the individuals (Level 1 units) can be grouped into two categories. We are wondering how, or whether, measurement invariance across the two categories can be tested in a multilevel LPA/LCA. Would we test this in the same manner as for a standard "single-level" LPA? If not, can you point us to a relevant approach that could be applied to results generated by Mplus?

Thanks so much!
 Bengt O. Muthen posted on Sunday, January 20, 2013 - 5:35 pm
It sounds like you have an observed grouping variable on level 2. This can be handled by defining a between-level latent class variable that is exactly the same as the grouping variable. See UG ex 7.24 for how this is done in the single-level case. The between-level latent class variable has to be declared on the Between = list as in UG chapter 10. Then you can specify and test various degrees of measurement invariance across these between-level classes.
 Mike Todd posted on Monday, January 21, 2013 - 9:11 am
Great! Thanks, Bengt!
 Karoline Bading posted on Saturday, March 15, 2014 - 9:26 am
first part:

hello everybody :-) !

i am running a latent class analysis with complex survey data on 9 items aimed at political alienation and willingness to participate in the democratic process.

in my early steps i ran the analysis without type = complex mixture (only type = mixture), and the 3- and 4-class solutions seemed the most reasonable (log-likelihood based fit indices were all pretty sobering, but the interpretation was consistent with substantive theory of the construct).

later i realised i should be using type = complex mixture, since the data are clustered with n(cluster) = 27 and differing cluster sizes. hence, i re-ran the analysis for 3, 4 and 5 latent classes.

what struck me was that the estimates did not change at all for the 3 and 5 class solutions, but changed considerably for the 4 class solution, which was the best model interpretation-wise when using only type = mixture and now unfortunately makes far less sense.
 Karoline Bading posted on Saturday, March 15, 2014 - 9:27 am
second part: (sorry for the long message!)

how is it possible that the 3 and 5 class solutions did not change, but the 4 class solution did?
the estimator is mlr, which i assume somehow weights by cluster size?
could that be "playing against" my beloved 4 class solution, because one of the classes is quite small in comparison to the other 3? maybe if people in this class come from cluster units with a small weight (due to small cluster unit size), this class cannot be detected well enough?

as you can tell, i only have a very vague understanding of how the estimators work. i apologize for the painful stupidity of my thoughts expressed above.

do you have any recommendations for choosing an estimator when using type = complex mixture? i've heard there are different options: mlr, uls ...
are there any papers where i could look things up?

thanks so much for help!

 Linda K. Muthen posted on Sunday, March 16, 2014 - 11:54 am
Unless you have weights, your classes should not change when adding COMPLEX. Perhaps you are not replicating the best loglikelihood in all analyses. Or perhaps the order of the classes changed. If you can't see the problem, send the relevant outputs and your license number to

The only estimator available for TYPE=COMPLEX is MLR.

Please limit future posts to one window. If they are longer than that, they are not appropriate for Mplus Discussion.
 Karoline Bading posted on Sunday, March 16, 2014 - 12:33 pm
thanks for the answer.
i want to run the blrt but i am using the type = complex mixture option. if i compare the 5 class solution using type = complex mixture to the 5 class solution using only type = mixture, estimates don't change and p-values do, but only slightly.
would it be acceptable to run the blrt using type = mixture even though the data is in fact clustered?

 Linda K. Muthen posted on Monday, March 17, 2014 - 2:52 pm
I would not use BLRT using TYPE=MIXTURE if you have clustered data. I would use BIC.
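
A minimal sketch of comparing class solutions by BIC, assuming the usual definition BIC = -2 logL + p ln(n). The log-likelihoods, parameter counts and sample size below are made-up illustration values, not results from this thread, and the helper name `bic` is hypothetical.

```python
import math

def bic(loglik, n_params, n_obs):
    """Bayesian information criterion: -2*logL + p*ln(n). Smaller is better."""
    return -2.0 * loglik + n_params * math.log(n_obs)

# Hypothetical fit results for 3-, 4- and 5-class runs (illustration only).
candidates = {
    3: {"loglik": -5210.4, "n_params": 14},
    4: {"loglik": -5189.7, "n_params": 19},
    5: {"loglik": -5181.2, "n_params": 24},
}
n_obs = 600  # assumed sample size

bic_by_classes = {
    k: bic(fit["loglik"], fit["n_params"], n_obs) for k, fit in candidates.items()
}
# The solution with the smallest BIC is the one this criterion prefers.
best_k = min(bic_by_classes, key=bic_by_classes.get)
```

In practice the log-likelihood and number of free parameters would be read off each run's output, after checking that the best log-likelihood was replicated.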
 giulia peruzzi posted on Tuesday, April 15, 2014 - 9:11 am
hi, i'm running this multilevel latent class model:
Names are
HISEI SP_SCOL sp_sc_d cod_scu;
Missing are all (-9999) ;
auxiliary are id;
usevariables are GENERE REGOLARE CITT2;
categorical are GENERE REGOLARE CITT2;
Classes = CB(2) CW(3);
between = CB;
cluster = cod_scu;
Type= Mixture Twolevel;
CW on CB;

And i have two questions:
1) i get different threshold estimates for the same within-level class, i.e.
the threshold estimates of latent class pattern 1 1 are different from those of latent class pattern 2 1.
Is that correct?

2) How can i calculate thresholds' estimates in probability scale?
 giulia peruzzi posted on Tuesday, April 15, 2014 - 10:08 am
I'm sorry: i forgot to say "Thanks"
 Bengt O. Muthen posted on Wednesday, April 16, 2014 - 5:57 pm
1) The thresholds vary across the between-level CB classes as the default. You should think of thresholds as a between-level quantity in line with having means appear on between in regular multilevel modeling.

2) This is tricky because the probabilities involve the random effects and therefore require numerical integration.
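
For orientation only, here is a sketch of the simple case that does not involve random effects: with a logit link and a fixed threshold, the probability of the higher category of a binary item is 1 / (1 + exp(threshold)). It does not perform the numerical integration mentioned above, so it should be read as a conditional or approximate conversion. The function name and the example thresholds are hypothetical.

```python
import math

def prob_category_one(threshold):
    """P(u = 1) for a binary item under a logit link with a fixed threshold.

    Assumes no random effect enters the item; with random effects the
    marginal probability would require integrating over their distribution.
    """
    return 1.0 / (1.0 + math.exp(threshold))

# Illustrative thresholds only (not taken from any output in this thread).
example_thresholds = {"item_a": -0.5, "item_b": 0.8}
example_probs = {
    name: prob_category_one(tau) for name, tau in example_thresholds.items()
}
```

A negative threshold corresponds to a probability above 0.5 for the higher category, and a positive threshold to a probability below 0.5.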
 giulia peruzzi posted on Tuesday, May 06, 2014 - 8:10 am
Dear Bengt, thank you for your kind reply. I have two follow-up remarks, just to be sure I have properly understood your comments.

1) the following is part of my output. According to your examples, the first coefficients in each group of thresholds (e.g. the pairs of GENERE$1) should be equal.
As you can see, mine are not. Is there a mistake, or is there a reason underlying these results that I cannot see?

Latent Class Pattern 1 1

GENERE$1 -0.461 0.087 -5.272 0.000
REGOLARE$1 -0.877 0.079 -11.060 0.000

Latent Class Pattern 2 1

GENERE$1 0.763 0.095 8.036 0.000
REGOLARE$1 0.200 0.103 1.934 0.053

2) ok, i understand, but it's very difficult to interpret the characteristics of the classes by looking at threshold estimates. could it be a good idea to save the class probabilities and then analyze the classes with descriptive statistics?
 Bengt O. Muthen posted on Tuesday, May 06, 2014 - 6:46 pm
Which of my examples are you referring to?
 giulia peruzzi posted on Wednesday, May 07, 2014 - 12:00 am
I'm sorry: not exactly "your" example, actually... i'm referring to the User's Guide example 10.7.
Thanks a lot
 Bengt O. Muthen posted on Wednesday, May 07, 2014 - 11:46 am
Yes, ex 10.7 has the thresholds equal across the cb classes. So you are saying you don't get that - then I would have to see your full output; please send it to Support along with your license number.
 B posted on Saturday, July 19, 2014 - 10:24 am

I don't think I'm finding a model similar to what I want to explore in Mplus inside the user's guide. I have 2 specific questions.

Here goes:

-Question 1: I'm creating a multilevel latent profile model for students in classrooms. I want to see if the random effects (intercepts) for the profiles' latent means, constituted by a battery of student indicators, are predicted by a level 2 latent factor for classroom environment.

Would I specify that latent factor for classroom environment in this part of the model statement?
schoolfactor by indicators
c#1 on schoolfactor
c#2 on schoolfactor

- Question 2: Finally, how would I specify a multilevel mixture model where the profile is constituted by child (level1) and classroom (level2) indicators? The profile might have measures of cognitive and social outcomes for children as well as measures of classroom environment.
 Bengt O. Muthen posted on Saturday, July 19, 2014 - 4:10 pm
Q1. You would follow the ideas in the Henry-Muthen multilevel LCA paper on our website. Declare a between-level latent class variable. On between you simply say

schoolfac BY ....;

which will give you the schoolfactor means in the different between classes.

Q2. Just declare the classroom environment variables as Between variables and include them in the BY statement.
 CMP posted on Thursday, February 26, 2015 - 3:03 am
I am running a multilevel mixture analysis with random effects. My variables are y x1 x2 x3 x4 x5.
On level 1:
Y on x1 x2 x3 x4
On level 2, I would like to identify latent classes (cb) using only the random slopes from level 1 (s1 s2 s3 s4). I do not want to include random intercept (y) as it does not make substantive sense, in my case. Following your example on the User’s Guide (example 10.2), I notice the random intercept is included as an indicator of cb. How can I specify that only the random slopes be used?

Thank you in advance for your response.
 Bengt O. Muthen posted on Thursday, February 26, 2015 - 7:27 am
What you want implies that the mean of the random intercept y would not vary across the cb classes. To specify that you would need to hold those means equal across the cb classes:

[y] (1);
[y] (1);
 CMP posted on Thursday, March 05, 2015 - 2:31 am
I posted earlier on about the multilevel analyses I am running. My further questions are:
1) Repeated measures nested in individuals is by other multilevel standards a 2-level model but is considered a 1-level model by Mplus, as I understand from the UG. In my input I specified the model as TYPE = TWOLEVEL MIXTURE RANDOM; is this correct?
2) The ICs (BIC, aBIC) and entropy favour a 3-class solution but the LRT does not. So I am trying to use TECH14 to test the different class solutions by bootstrap but the model keeps running and never stops (2 days!). What could the problem be?
lrtstarts= 50 20 50 20;
3) I had done the multilevel analyses without modelling latent classes in HLM. When I run this same model in Mplus, some random slope variances which were significant in HLM become non-significant in Mplus. Why is this so?
4) Could this be due to the multivariate non-normality of the data? I wanted to change the estimator to MLM but got an error message. Is this impossible to do?
Your help with these questions will be much appreciated.
 Bengt O. Muthen posted on Thursday, March 05, 2015 - 9:42 am
1) In Mplus you can do growth as 2-level in long format or as 1-level in wide format. We recommend the latter whenever possible. So if you take the latter approach, Type=Twolevel would refer to some other clustering like students in classrooms.

2) I would simply go by BIC.

3) Perhaps you used MLR in Mplus and ML in HLM.

4) Use MLR.
 CMP posted on Thursday, March 05, 2015 - 12:33 pm
Thank you very much for your response. When I used ML as the estimator in Mplus my results were similar to those obtained with HLM using ML. However, there were still slight differences in the p values.

Could you point me to any paper that I can reference in which the information provided by ICs, LRT and entropy were contradictory? Thank you once again for your help.
 Bengt O. Muthen posted on Thursday, March 05, 2015 - 1:18 pm
Perhaps this paper is useful:

Morgan, G. B. (2014). Mixed mode latent class analysis: An examination of fit index performance for classification. Structural Equation Modeling: A Multidisciplinary Journal, DOI: 10.1080/10705511.2014.935751
 E. Cohen posted on Monday, August 24, 2015 - 6:57 am
Dear Drs. Muthén, I have the following multilevel LCA problem: I want to identify subtypes of related cases in a multilevel sampling context (three levels), using a set of seven categorical variables measured on levels 1 and 2 (4 L1 variables, 3 L2 variables).

Two questions:
1. Is it possible (yet) to estimate a MLCA model with latent classes based on indicators/variables measured on different levels? (Henry & Muthén (2010) as well as other articles on MLCA look at models comprising L1 latent classes only.)

2. Is it possible (yet) in Mplus to estimate a three-level LCA (with covariates on two levels predicting class membership)? If not, is there a general estimation/data preparation strategy you would suggest (e.g., ignoring – while acknowledging – L3 clustering, or aggregating scores on L1 variable and treat them as L2 variables, then omitting L1 and shifting the level of analysis upwards)?

Thanks in advance for your help!
 Bengt O. Muthen posted on Tuesday, August 25, 2015 - 8:29 am
1. Mplus handles latent class variables on both level 1 and level 2. See for example,
the paper on our website:

Muthén, B. & Asparouhov, T. (2009). Growth mixture modeling: Analysis with non-Gaussian random effects. In Fitzmaurice, G., Davidian, M., Verbeke, G. & Molenberghs, G. (eds.), Longitudinal Data Analysis, pp. 143-165. Boca Raton: Chapman & Hall/CRC Press.

and also

Muthén, B. & Asparouhov, T. (2009). Multilevel regression mixture analysis. Journal of the Royal Statistical Society, Series A, 172, 639-657.

2. I think level 1 and level 2 latent class variables can have their random intercepts be predicted by level 2 and level 3 covariates.
 Bengt O. Muthen posted on Tuesday, August 25, 2015 - 8:56 am
See also UG ex 10.5 and 10.8 where you see "cb". cb can have observed level 2 indicators as well.
 Vanessa Castro posted on Friday, October 23, 2015 - 8:33 am
I ran the following LPA
Type = Mixture Complex ;
algorithm = integration;
integration = montecarlo;
Starts = 100 10;
stiterations = 10;
k-1starts = 100 10;
processors =8(starts);
c on age_oa age_ma; age_oa age_ma;

Why in the output, is the covariate (age group dummy coded for middle and older adults) listed within each factor?

Latent Class 1
SSAVOID -0.771 0.035 -22.122 0.000
SSLEAVE -0.303 0.064 -4.733 0.000
SMMOD 0.509 0.051 9.917 0.000
ADNEG -0.041 0.060 -0.678 0.498
ADPOS -0.545 0.050 -10.940 0.000
REDET -0.019 0.051 -0.376 0.707
REDIS -0.130 0.048 -2.720 0.007
REPOS -0.688 0.034 -20.051 0.000
RERUM 0.222 0.100 2.211 0.027
REACC 0.349 0.048 7.196 0.000
RMSUP 0.027 0.059 0.464 0.642
RMPOS -0.250 0.059 -4.205 0.000
RMPHYS 0.106 0.068 1.559 0.119
AGE_OA 0.223 0.024 9.272 0.000
AGE_MA 0.303 0.027 11.410 0.000
 Bengt O. Muthen posted on Friday, October 23, 2015 - 3:10 pm
Because you change the status of those age dummies from covariates to variables that have parameters in the model by your statement:

age_oa age_ma;

So they are just like any other "Y" variables.
 Vanessa Castro posted on Friday, October 23, 2015 - 3:22 pm
Thank you Dr. Muthen for your response.

I had to include that statement for the model to run, otherwise I received the missing on X error statement.

Given this, is this output and conclusion then fine for interpreting these latent classes? Thank you so much for your help!
 Bengt O. Muthen posted on Saturday, October 24, 2015 - 8:13 am
I have to see the full output to say - send to Support along with your license number.
 davide morselli posted on Monday, February 08, 2016 - 6:34 am
I'm trying like mad to change the reference category of the between-level latent class in a non-parametric two-level LCA as described in Henry & Muthén's paper.

my syntax of the VARIABLE and MODEL sections is:
usevariables = TOLER PROT ;
cluster = cntry;
within = TOLER PROT ;
classes = cb (5) cw(3);
between = cb;
CW on CB ;

model cw:
[ toler@-0.820 ];
[ prot@-0.411 ];
[ toler@0.527 ];
[ prot@-0.411 ];
[ toler@-0.683];
[ prot@0.589 ];

I can change the level-1 class order using the model CW part, but I can't find where and how to specify it for CB.

thank you!
 Koksal Banoglu posted on Monday, February 08, 2016 - 11:08 am
Dear Dr. Muthen,
I am running a multilevel LPA model, akin to the model developed by Henry & Muthen's (2010). With your kind permission, I want to pose three questions about interpretation of my output with references to the main article.
1- The probability estimates of level-2 predictors over the level-1 latent class solutions (i.e. Table 3, Henry & Muthen, 2010) refer to "C#2 on [Level-2 predictor name]" statements in the output, right?
2- The effects of level-2 predictors over the level-1 latent class indicator intercepts (i.e. Table 4, Henry & Muthen, 2010) refer to estimates of "model constraints" such as "POP30, GRW30"? Then what is the function of the total effect variables (i.e. C2POPLOG, C2TOBGRW and C2POVLEV) in the output, given that their significance and effect sizes are not reported in the article? Or should we first check the significance of the total effect (e.g. C2POP*LOG) and direct effect estimates (e.g. C2_POP) on the latent class factors, and then report the effect size of individual cross-level variables (e.g. POP30)?
3- I use LPA with continuous indicators thus I interpret the linear regression outcomes. Should I pay an extra attention to any estimate while interpreting the output?

Thank you in advance for your patience in reading my message. Now I cross my fingers for your answer.

Best regards.
 Bengt O. Muthen posted on Monday, February 08, 2016 - 6:28 pm
Please contact the first author about these questions. She's at Colorado State Univ.
 davide morselli posted on Tuesday, February 09, 2016 - 1:35 pm
does anyone have any suggestions for the problem I posted above?
 Bengt O. Muthen posted on Tuesday, February 09, 2016 - 6:25 pm
Which Figure and Appendix model are you trying to use?
 davide morselli posted on Wednesday, February 10, 2016 - 12:16 am
my model is:

usevariables = TOLER PROT ;
cluster = cntry;
within = TOLER PROT ;
classes = cb (5) cw(3);
between = cb;
CW on CB ;

model cw:
[ toler@-0.820 ];
[ prot@-0.411 ];
[ toler@0.527 ];
[ prot@-0.411 ];
[ toler@-0.683];
[ prot@0.589 ];

and I would need to change the reference class from CB#5 to CB#1. I know how to do it with the within-level classes (by fixing the means as in the example above) but I can't find what I should do for the between one
 Bengt O. Muthen posted on Wednesday, February 10, 2016 - 1:18 pm
It helps me if you connect your model to models numbered in Henry-Muthen.
 davide morselli posted on Thursday, February 11, 2016 - 5:30 am
that would be Model 4a at p.214 of the SEM article

Thank you in advance

 Bengt O. Muthen posted on Friday, February 12, 2016 - 2:57 pm
Use SVALUES in your original run to save the estimates. Then use them in a Starts=0 run where you switch the regression coefficients for CW on CB so that CB#1 values are given for the last CB class. You may also have to re-calculate the logit for CB called [cb#1] etc. You do that by first computing all the CB class probabilities from the logits as described in Chapter 14 and then switch the probabilities and then compute the new logits.
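
The logit arithmetic in the last step can be sketched as follows, assuming the usual multinomial parameterization in which the last class is the reference with logit 0. The logit values are made up for illustration and the helper names are hypothetical; the sketch covers only the class-probability logits, not the switching of the CW on CB coefficients.

```python
import math

def logits_to_probs(logits):
    """Class probabilities from K-1 logits; the last class has logit 0."""
    full = list(logits) + [0.0]
    shift = max(full)  # subtract the max for numerical stability
    exps = [math.exp(v - shift) for v in full]
    total = sum(exps)
    return [e / total for e in exps]

def probs_to_logits(probs):
    """K-1 logits relative to the last class in the given order."""
    ref = probs[-1]
    return [math.log(p / ref) for p in probs[:-1]]

def swap_first_and_last(values):
    """Return a copy in which the first and last entries change places."""
    swapped = list(values)
    swapped[0], swapped[-1] = swapped[-1], swapped[0]
    return swapped

# Hypothetical logits for a 5-class CB variable (illustration only).
old_logits = [0.5, -0.2, 0.1, 0.3]
old_probs = logits_to_probs(old_logits)
new_probs = swap_first_and_last(old_probs)  # old class 1 becomes the last class
new_logits = probs_to_logits(new_probs)     # starting values for the re-ordered run
```

Because only the reference changes, each new logit equals the old logit minus the old logit of the class that moved to the last position, which gives a quick hand check on the result.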
 Corey Savage posted on Sunday, February 14, 2016 - 9:25 am
With model 2 in table 1 in Henry and Muthen (2010), the parametric approach to a multilevel 3 class model, do I need the following in the syntax for a 2 class model? Would I just limit that to only "C#1"?

C#1; C#2; C#1 WITH C#2;
 Bengt O. Muthen posted on Sunday, February 14, 2016 - 5:17 pm
Yes, with 3 classes you have 2 random intercepts: c#1 and c#2. And you want them to be able to correlate.
 davide morselli posted on Monday, February 22, 2016 - 8:02 am
Hi Bengt,
sorry for bothering again, but it's still not clear how I can switch the regression coefficients for CW on CB so that CB#1 values are given for CB#5 (the last class) as I cannot mention the last class directly.
 Bengt O. Muthen posted on Monday, February 22, 2016 - 5:51 pm
Send your run with the original solution and your best attempt at switching the classes to Support along with your license number. Briefly describe where you get stuck.
 Brian Knop posted on Monday, March 07, 2016 - 2:26 pm
I have looked through the user guide and Henry and Muthen (2010), but can't figure out how to get odds ratios for level 2 variable effects on latent classes (such as effect of tobacco growing state on smoking type). I can get coefficients (for level 2 variables), but not odds ratios.
Thank you.
 Linda K. Muthen posted on Monday, March 07, 2016 - 2:39 pm
The categorical variables are random intercepts on level 2. So they are continuous.
 Brian Knop posted on Tuesday, March 08, 2016 - 8:23 am
Thank you for your quick response. That explains why the output gives odds ratios for within-level effects, but not between-level effects on the latent classes. But Henry and Muthen (2010) present between-level odds ratios, so it is possible, yes?
 Linda K. Muthen posted on Tuesday, March 08, 2016 - 9:30 am
The variable would need to be on the BETWEEN list and the CATEGORICAL list to get an odds ratio on between.
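
Separately from what the output prints, a coefficient on the logit scale can be exponentiated by hand to give an odds ratio. The sketch below shows that arithmetic with a Wald-type interval; whether this interpretation is appropriate for a between-level random intercept is an assumption to check against the model, and the coefficient and standard error are made-up values.

```python
import math

def odds_ratio(coef, se=None, z=1.96):
    """Exponentiate a logit-scale coefficient.

    Returns (odds_ratio, ci), where ci is a (lower, upper) tuple built from
    coef +/- z*se on the logit scale, or None when no standard error is given.
    """
    point = math.exp(coef)
    if se is None:
        return point, None
    return point, (math.exp(coef - z * se), math.exp(coef + z * se))

# Hypothetical coefficient and standard error (illustration only).
or_estimate, or_ci = odds_ratio(0.7, se=0.2)
```

The interval is computed on the logit scale and then exponentiated, so it is not symmetric around the odds ratio.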
 Brian Knop posted on Wednesday, March 09, 2016 - 9:17 am
Ok, for some reason I still can't get odds ratios when I do that. If I were hypothetically looking at school-level effects on latent classes of test scores and all variables used are categorical, my code looks something like this:

usevariables= schooltype mathscore engscore histscore;

categorical are schooltype mathscore engscore histscore;

between= cb schooltype;


classes=cb (3);

TYPE= mixture twolevel;

cb#1 cb#2 ON schooltype;

Would that produce odds ratios in the output?
 Linda K. Muthen posted on Wednesday, March 09, 2016 - 10:53 am
Please send the output and your license number to
 Miatta Echetebu posted on Wednesday, June 15, 2016 - 3:52 pm
We are trying to run a two-level LCA. The syntax we are using is

NAMES ARE ID agency directservices ServYouth TimeWork gender age RaceBWO education fieldstudy sector RacePriv InstDisc BlatantRace ;
MISSING are all (999);
USEVARIABLES = RacePriv InstDisc BlatantRace ;
STARTS = 60 30;
tech11 tech14 ;

Our output AIC and BIC are nearly identical to the AIC and BIC we received when not running it with the second level. Additionally, the output gives us within-level variances but not between-level variances. Is this syntax correct for running a multilevel LCA? Is it actually running it with two levels or just reproducing the single level output?
Thank you very much for your help!
 Bengt O. Muthen posted on Wednesday, June 15, 2016 - 5:39 pm
You have to mention the random intercepts for the latent class variables on Between. For an example, see UG ex 10.6.
 Allison posted on Monday, August 15, 2016 - 9:48 am
Hello Bengt and Linda:

I have a multilevel LPA, with events (Level-1) nested within people (Level-2). Currently, I have a model with 8 within-person profiles, and nothing being specified at the between-person level of analysis.

In traditional multilevel research with a similar structure (events nested within people), it is common to group-mean center the L1 constructs prior to analyses. Would you recommend group-mean centering the L1 variables being specified as profile indicators? Would other centering decisions, such as grand-mean centering or even standardizing the profile indicators within-person, be appropriate?

Any guidance is greatly appreciated!
 Bengt O. Muthen posted on Monday, August 15, 2016 - 1:14 pm
I would not do any centering or standardization for this model.