CFA Mixture + Covariates
 Sung Kim posted on Tuesday, October 02, 2007 - 1:41 am
My CFA mixture model without covariates favored four classes. However, after the factors and the latent class variable were regressed on three covariates, the fit indices indicated that a 2-class model is preferred. Is this because "factor mixture models are estimated conditional on covariates (Lubke & Muthen, 2005, p.31)"? If so, do I need to fit an unconditional model without covariates first and then a conditional model with covariates? Or should I include the covariates from the beginning and fit only a conditional model?

One more question: when calculating factor mean difference across classes, which one should be used: unstandardized or standardized values?

Thank you so much in advance.
 Bengt O. Muthen posted on Wednesday, October 03, 2007 - 11:07 am
I would first settle on the number of classes without using covariates. Then I would add covariates but not be surprised if the needed number of classes changes. There shouldn't be a change unless there are direct effects from covariates to the outcomes. If you need such direct effects, the number of classes should be determined when such direct effects are included.

Unstandardized values should be used.
 Sung Kim posted on Thursday, October 04, 2007 - 2:29 pm
If there are direct effects (Path 5 in the article) like that, it means that "the measurement model ... is not the same across classes" (Lubke & Muthen, 2005, p. 29). Is that right?

May I ask how to detect direct effects from covariates to the outcome variables? For example, I have three covariates (gender, race, and clinical status) and 45 items measuring either one or three factors related to psychological well-being. What I'm interested in is how to find out which items should be regressed on which covariates. Could you let me know?
 Linda K. Muthen posted on Thursday, October 04, 2007 - 3:24 pm
A significant direct effect from gender to a math item would indicate that the math item is not invariant, that is, it behaves differently for males and females for a given factor value.

To detect direct effects, you can add to the MODEL command the regressions of all factor indicators on all covariates, fixed at zero, for example,

y1-y10 ON x1-x5@0;

You can then ask for modification indices and see where you may have measurement non-invariance.
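A minimal sketch of how this screening step might sit in a full mixture input. The file name, the variable names y1-y10 and x1-x5, and the choice of two classes are placeholders for illustration, not taken from the poster's analysis:

```
TITLE:    Screening for direct effects (sketch; names are placeholders)
DATA:     FILE = mydata.dat;       ! hypothetical file name
VARIABLE: NAMES = y1-y10 x1-x5;
          CLASSES = c(2);          ! number of classes is an assumption
ANALYSIS: TYPE = MIXTURE;
MODEL:
  %OVERALL%
  f BY y1-y10;
  f ON x1-x5;            ! covariates predict the factor
  c ON x1-x5;            ! covariates predict class membership
  y1-y10 ON x1-x5@0;     ! all direct effects in the model, fixed at zero
OUTPUT:   MODINDICES;    ! request modification indices
```

Large modification indices for the fixed ON parameters then point to items that may not be measurement invariant with respect to a covariate.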
 Sung Kim posted on Tuesday, October 09, 2007 - 8:22 am
Since the modification indices suggested direct effects from the three covariates to one outcome variable (i.e., y42), I modeled them in the next step. However, although the likelihood value decreased (and so did AIC, BIC, etc.), the regression coefficients linking the covariates to the outcome variable were much smaller than the standardized E.P.C.s estimated for those direct effects in the previous step, and they were not significant.

Is this expected, or did I do something wrong?

Also, which standardized E.P.C. should be used: Std or StdYX?
 Linda K. Muthen posted on Tuesday, October 09, 2007 - 8:36 am
The modification indices reflect what would happen if one parameter is freed, not three. I would look at the change in chi-square rather than the E.P.C. values.
 Sung Kim posted on Tuesday, October 09, 2007 - 9:18 am
I did look at the MIs, whose values for the three direct effects were greater than 100. Some of the MIs were over 300. In addition, the standardized (StdYX) EPCs related to them were over .50, and one EPC was even close to 1. That's why I included them in the model in the next step.

I expected the regression coefficients in the model that included the direct effects would be close to the EPCs estimated in the previous step, but the coefficients were much smaller. Why is that?
 Linda K. Muthen posted on Tuesday, October 09, 2007 - 9:45 am
What you expect will occur only if you free one parameter. After one parameter is freed, the modification indices and expected parameter change values will change for the other parameters.
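One way to act on this is to free a single direct effect per run and re-inspect the modification indices for the rest. The sketch below assumes covariate names gender, race, and clinical (hypothetical names for the three covariates described earlier) and picks gender only for illustration, since the thread does not say which effect had the largest MI:

```
MODEL:
  %OVERALL%
  oq BY y1-y45;
  oq ON gender race clinical;
  c ON gender race clinical;
  y42 ON gender;               ! free one direct effect only
  y42 ON race@0 clinical@0;    ! keep the others fixed at zero
OUTPUT: MODINDICES;            ! MIs and E.P.C.s for race and clinical are recomputed
```

After this run, the MIs and E.P.C.s for the still-fixed effects can be compared with their earlier values before deciding whether another parameter needs to be freed.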
 Sung Kim posted on Wednesday, October 17, 2007 - 1:36 pm
Is a single-class, or baseline CFA mixture model equivalent to a simple CFA model when there's no covariate involved?

I fitted the two models as follows, but their loglikelihood and fit index values were different.

I fitted:

Single-class CFA Mixture
Variable: ...
classes = c(1);
Analysis: TYPE = Mixture Missing;
Model: %Overall%
oq BY y1-y45;
Output: ...

Simple CFA
Analysis: TYPE = Missing H1;
Model: oq BY y1-y45;
Output: ...

Is my coding for the single-class CFA mixture model correct?

 Linda K. Muthen posted on Wednesday, October 17, 2007 - 2:11 pm
A model with one class is the same as a regular model. If you don't get the same results, something must be different. It may be that you don't have means in the regular model but you do in the mixture model.
 Sung Kim posted on Friday, October 19, 2007 - 1:19 pm
I included "TYPE = MEANSTRUCTURE" in the regular model as you suggested, but I got the same loglikelihood value, which is still different from the single class mixture model. I'd like to know why there is a difference.

However, I see your point that means are included in a mixture model but not in a regular CFA model.

 Linda K. Muthen posted on Friday, October 19, 2007 - 3:06 pm
The only way I can answer this is to see the two outputs and your license number at
 Sung Kim posted on Thursday, November 08, 2007 - 2:40 pm
I fixed at 0 the factor means of the reference class in CFA mixture models, but they are not 0 in the result. What happened?

When the models were estimated without covariates in the model, the factor means were estimated as 0s. However, after the covariates were included, the factor means are no longer 0s. Why is that?
 Linda K. Muthen posted on Friday, November 09, 2007 - 7:59 am
In a conditional model, you are fixing the intercept to zero.
 Sung Kim posted on Sunday, November 11, 2007 - 2:20 pm
In a conditional model, comparing factor means across classes is not free from error, namely the residual factor scores. Therefore, if I want an error-free comparison of the classes, I need to compare both the intercepts and the regression weights of the covariates. Is that right?

Thank you so much!
 Linda K. Muthen posted on Monday, November 12, 2007 - 8:48 am
It is the factor means that are compared across groups, not the means of the factor scores.

If you want to compare the means in a conditional model, you would need to use MODEL CONSTRAINT to create the means and MODEL TEST to test if they are different.
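A hedged sketch of what this could look like for two classes and two covariates. The labels (a1, b11, ...), the variable names, and the covariate values 0.50 and 0.30 at which the means are evaluated are all assumptions made for illustration; the analyst has to choose the covariate values, and within-class covariate means can differ when c is regressed on the covariates:

```
MODEL:
  %OVERALL%
  f BY y1-y5;
  f ON x1 x2;
  c ON x1 x2;
  %c#1%
  [f] (a1);          ! factor intercept in class 1
  f ON x1 (b11);
  f ON x2 (b12);
  %c#2%
  f ON x1 (b21);     ! intercept is fixed at zero in the last class by default
  f ON x2 (b22);
MODEL CONSTRAINT:
  NEW(m1 m2);
  m1 = a1 + b11*0.50 + b12*0.30;   ! class 1 factor mean at the chosen covariate values
  m2 = b21*0.50 + b22*0.30;        ! class 2 factor mean (intercept is zero)
MODEL TEST:
  0 = m1 - m2;                     ! Wald test of equal factor means
```

The new parameters m1 and m2 are reported with standard errors, and MODEL TEST gives a Wald chi-square test of their difference.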
 Sung Kim posted on Friday, June 06, 2008 - 1:09 pm
I need to present the factor means and their standard errors for each class from a conditional FMM. However, I couldn't find their standard errors. How can I find them?

Also, once covariates are included in the factor mixture model, comparing factor means across classes becomes arduous, I guess.
 Bengt O. Muthen posted on Friday, June 06, 2008 - 5:08 pm
With covariates, Mplus reports intercepts, not means. But you can use grand-mean centering (subtracting the sample mean from each covariate) so that the intercepts become the means.
 Sung Kim posted on Friday, June 06, 2008 - 8:50 pm
Thank you so much! One more favor: could you let me know how to code grand-mean centering in Mplus? My code is below.


f1 BY y1* y2-y5;
f2 BY y6* y7-y10;

f1 ON gender ethnic;
f2 ON gender ethnic;

c#1 ON gender ethnic;

[f1@0 f2@0];

f1 ON gender ethnic;
f2 ON gender ethnic;

[f1*-2 f2*-2];

f1 ON gender ethnic;
f2 ON gender ethnic;

 Linda K. Muthen posted on Friday, June 06, 2008 - 8:59 pm
See the CENTERING option in the user's guide.
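For readers looking for the syntax, a brief sketch, assuming the covariate names gender and ethnic from the post above and a placeholder NAMES list. In Mplus versions of that period, centering was requested in the VARIABLE command:

```
VARIABLE:
  NAMES = y1-y10 gender ethnic;            ! placeholder variable list
  CLASSES = c(2);
  CENTERING = GRANDMEAN (gender ethnic);   ! subtract the sample mean from each covariate
```

In later versions the same request is, to my knowledge, written in the DEFINE command instead, so the user's guide for the installed version should be checked:

```
DEFINE:
  CENTER gender ethnic (GRANDMEAN);
```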
 Sung Kim posted on Friday, June 06, 2008 - 9:52 pm
Thank you so much!

I added the CENTERING option:

However, this time I got a larger loglikelihood value, different class proportions, etc. It looks like adding the centering option affects the whole fitting process. Is this expected?
 Linda K. Muthen posted on Saturday, June 07, 2008 - 6:10 am
Please send your input, data, output, and license number to
 Jon Elhai posted on Friday, November 07, 2008 - 9:38 am
I'm having trouble specifying a CFA mixture model, using 4 factors for 17 observed continuous items, and testing 2 classes. The syntax from the UG's example 7.17 shows this for a one-class, one factor model:
f BY y1-y5;

But if I add more factors and a second class, I'm wondering what additional syntax to write for the c#1 and c#2 text - the above text fixes the single factor's mean to 1. Do you have any examples?
 Jon Elhai posted on Friday, November 07, 2008 - 9:50 am
To follow up on my previous message: I just realized that Example 7.17 does test two classes but gives a class-specific statement for the factor mean in class 1. So if I intend my model to be the same across classes, do I just specify my factor model under %OVERALL%, with no class-specific statements below it? When I do this, I receive some error messages.
 Bengt O. Muthen posted on Friday, November 07, 2008 - 5:45 pm
A class-invariant factor mixture model would be

f by y1-y5;

This implies class-invariant loadings, intercepts, residual variances, and factor variances. The factor means change over classes with one class having the mean fixed at zero as the default for identification purposes.

If that doesn't help and the error message for your problematic run doesn't help, please send your license number and the input, data, and output for the problematic run that you mention.
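Extended to the 4-factor, 17-item, 2-class case asked about above, a class-invariant specification might look like the following sketch. The item names y1-y17 and the assignment of items to factors are placeholders, since the thread does not give them:

```
VARIABLE: NAMES = y1-y17;
          CLASSES = c(2);
ANALYSIS: TYPE = MIXTURE;
MODEL:
  %OVERALL%
  f1 BY y1-y5;       ! item-to-factor assignment is a placeholder
  f2 BY y6-y9;
  f3 BY y10-y13;
  f4 BY y14-y17;
```

With no class-specific sections, the defaults described above apply: measurement parameters and factor variances are held equal across classes, and only the factor means differ, with one class fixed at zero.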
 Scott R. Colwell posted on Thursday, July 22, 2010 - 7:36 pm
I am running a CFA mixture model with 3 factors with 2 items on the first factor, 3 on the second and 3 on the third. The factors are non-invariant based on a binary variable (0 and 1). I decided to first determine the number of classes without the binary covariate. I have allowed factor loadings, factor variances, residual variances to vary across classes as I have no reason to believe they are invariant across classes. I have only set the number of classes to 2.

I continue to get the message to increase the number of miterations, but I am now at 50,000 miterations and still getting the message. Variances of the indicators are less than 2 so they aren't way out.

How many miterations should I realistically have for this?
 Linda K. Muthen posted on Friday, July 23, 2010 - 11:33 am
Please send the full output and your license number to

See the following article which is available on the website:

Clark, S.L., Muthén, B., Kaprio, J., D’Onofrio, B.M., Viken, R., Rose, R.J., Smalley, S. L. (2009). Models and strategies for factor mixture analysis: Two examples concerning the structure underlying psychological disorders.
 Fiona Shand posted on Wednesday, August 11, 2010 - 9:24 pm
I have identified a 2 class, 1 factor mixture model as the best fit to my data. I have then run the model again with covariates. A reviewer has commented that he/she is skeptical of such post hoc comparisons when not weighted by group membership probabilities. I had thought that FMM with covariates incorporated posterior class membership probabilities when calculating the regression model. Can you please tell me if this is so? Thanks in advance.
 Linda K. Muthen posted on Thursday, August 12, 2010 - 9:29 am
It sounds like the reviewer's comment relates to doing the analysis in two steps, that is, saving most likely class membership and regressing it on a set of covariates. It sounds like you are doing the analysis in one step in which case I don't understand the critique.
 Adam Meade posted on Monday, September 19, 2011 - 12:47 pm
I have a factor mixture model with 5 continuous indicators represented by a single factor. I have two latent classes. I need to compare 9 covariates to see which best differentiate those two latent classes. Some of my covariates are highly correlated. Should I compare these covariates simultaneously in a single model, or in different models given the collinearity? Also, by what criterion is it best to evaluate the covariates in terms of differentiating classes? Thank you!
 Bengt O. Muthen posted on Tuesday, September 20, 2011 - 8:42 am
I don't know that there is any best strategy for this. You could first see which single covariate is most strongly related to the latent classes. Then work with pairs of covariates, etc. "Most strongly related" needs to be defined, which is easy with 2 classes (which one changes the probability the most when, say, changing the covariate one SD), but with more than 2 classes the covariate can be good for discriminating between two of the classes, but not between two other ones.
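For the 2-class case, the change in class probability for a one-SD change in a covariate can be computed from the logit coefficients. The sketch below does this inside Mplus with MODEL CONSTRAINT; the labels, the single covariate x, and the SD value 1.2 are assumptions for illustration, and x is taken to be centered so that x = 0 is its mean:

```
MODEL:
  %OVERALL%
  f BY y1-y5;
  c#1 ON x (b);      ! logit slope, class 1 vs. class 2
  [c#1] (a);         ! logit intercept for class 1
MODEL CONSTRAINT:
  NEW(plo phi dp);
  plo = 1/(1 + EXP(-a));              ! P(c = 1) at x = 0
  phi = 1/(1 + EXP(-(a + b*1.2)));    ! P(c = 1) at x = one SD above; 1.2 is a made-up SD
  dp  = phi - plo;                    ! change in probability per SD of x
```

Repeating this for each covariate (or pair of covariates) gives a common scale, dp, on which "most strongly related" can be compared.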
 Tay posted on Thursday, October 09, 2014 - 2:35 am
Hi Drs. Muthen, I'm running a 4-class, 2-factor model, which is the end point of a series of factor mixture model variations ranging from least to most restrictive, based on Clark et al.'s recommendations. With the PPP and DIC indices introduced in V7.3 for Bayesian mixtures, I was wondering whether these are good substitutes for BIC. This option is appealing given the amount of computational time needed with the default estimator to test a large number of FMM variations. The other issue relates to label switching and whether this is of particular concern with Bayesian FMM. Many thanks
 Bengt O. Muthen posted on Thursday, October 09, 2014 - 2:54 pm
Bayes mixture PPP was in version 7.2 already. DIC is not available yet for mixtures.

I don't think we know yet how using PPP to settle the number of classes compares to BIC. But I don't see how it would be faster.

Label switching is an issue with Bayes mixtures and perhaps more so with the flexibility of FMMs. We briefly mention label switching in our Topic 9 Bayes teachings.
 Marco Pannacci posted on Tuesday, February 14, 2017 - 9:04 am
Dear MPLUS support,
with reference to the paper "Investigating Population Heterogeneity With Factor Mixture Models",
how can I specify in MPLUS the relation between the covariate X and the class variable C?
Is it done by default in MPLUS when I run the following code (where "s_0202" is the categorical dummy covariate and the other variables are continuous)?

Many thanks



mld_r lung_r vi_856 s_0202 ;


SAD by
mld_r* (1)
lung_r (2)
vi_856 (3) ;

SAD on s_0202 ;
 Marco Pannacci posted on Tuesday, February 14, 2017 - 9:09 am
And question 2: if, looking at the MODINDICES, I find a significant direct effect between the covariate and an indicator, for example

lung_r on s_0202 ;

is it mandatory to also specify the relation between the latent variable and the same covariate? Like:

SAD on s_0202 ;

Many thanks
 Bengt O. Muthen posted on Wednesday, February 15, 2017 - 11:12 am
See our UG examples with "c ON x".

Modindices can be unreliable for mixtures.

It is ok to specify a direct relationship between a covariate and a latent class indicator if it is significant.

Note that we ask that postings be limited to one window. Longer questions involving input/output should be sent to Support.
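A minimal sketch of the "c ON x" pattern using the variable names from the question. The two-class setting is an assumption, and the direct effect on lung_r is shown only as the optional addition discussed above:

```
VARIABLE: USEVARIABLES = mld_r lung_r vi_856 s_0202;
          CLASSES = c(2);            ! number of classes is an assumption
ANALYSIS: TYPE = MIXTURE;
MODEL:
  %OVERALL%
  SAD BY mld_r lung_r vi_856;
  SAD ON s_0202;       ! covariate -> factor
  c ON s_0202;         ! covariate -> latent class variable, stated explicitly
  lung_r ON s_0202;    ! optional direct effect on one indicator
```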