Performance of factor mixture models
 Daniel posted on Monday, August 25, 2003 - 9:16 am
I just read the paper by Gitta Lubke and Bengt about monte carlo studies of factor mixture models. I have two questions. First, how does one estimate Mahalanobis distance? Second, am I to understand that there isn't much difference in model performance (in terms of parameter coverage and average posterior probability) between a model with and without class-specific variance? Is it better to allow for class-specific variance in, for example, a growth mixture model, than to assume no variation, or does it not make much difference?
 BMuthen posted on Tuesday, August 26, 2003 - 10:00 am
I think the formula for the Mahalanobis distance is in the paper. Otherwise, see a multivariate statistics textbook. It can be computed using, for example, SAS proc matrix.
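
For readers without SAS, the textbook definition d = sqrt((m1 - m2)' S^-1 (m1 - m2)) can be sketched in Python with NumPy. This is only an illustration: the function name, the example means, and the use of a single common (for example, pooled within-class) covariance matrix S are assumptions made here, and the exact formula used in the paper should be checked against the paper itself.

```python
import numpy as np


def mahalanobis_distance(mean_a, mean_b, common_cov):
    """Mahalanobis distance between two mean vectors.

    common_cov is assumed to be one covariance matrix shared by both
    classes (e.g., a pooled within-class covariance matrix).
    """
    diff = np.asarray(mean_a, dtype=float) - np.asarray(mean_b, dtype=float)
    cov = np.asarray(common_cov, dtype=float)
    # Solve cov @ x = diff instead of forming the inverse explicitly.
    return float(np.sqrt(diff @ np.linalg.solve(cov, diff)))


# Hypothetical class means; with an identity covariance matrix the
# Mahalanobis distance reduces to the ordinary Euclidean distance.
distance = mahalanobis_distance([0.0, 0.0], [3.0, 4.0], np.eye(2))
```

With correlated or unequally scaled indicators, the covariance matrix rescales the mean difference, so the result will generally differ from the Euclidean distance.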

Regarding your second question, as far as the data for this paper is concerned, you are correct. It may not be correct for all data.

Regarding class-specific variances, they can make a big difference if they are needed to fit the data. For example, studying problematic behavior development over time, one often sees a normal class which shows a low mean and very little variation over time whereas other classes may have a lot of variation.
 Daniel posted on Tuesday, August 26, 2003 - 10:12 am
Thank you
 Levent Dumenci posted on Monday, December 13, 2004 - 11:58 am
I have a question on the following portion of the output file:

CLASSES = c(3);

F by x1-x10;
FILE IS outdat.dat;
save = fscores;

*** ERROR in Analysis command
FSCORES are not available when there are no latent variables.

F is a latent variable. What I really want to do is to estimate different factor structures across classes and save the estimated fscores plus the cprob. By the way, I encountered no problem in saving the cprob.

Thank you
 Linda K. Muthen posted on Tuesday, December 14, 2004 - 9:08 am
The default with mixture and categorical is to fix the factor variances to zero. Therefore no factor scores etc. are available. I think you need to add ALGORITHM = INTEGRATION to your ANALYSIS command.
 Daniel Bontempo posted on Wednesday, January 05, 2005 - 1:58 pm
Linda -

Is there information on how the factor scores are computed and what are the properties of the factor scores (e.g., validity, preserved factor correlations)? Does it matter if the indicators were continuous or categorical?

 Linda K. Muthen posted on Wednesday, January 05, 2005 - 3:44 pm
See Technical Appendix 11 on the website for a description of factor scores for continuous and categorical outcomes.
 Andrew Baillie posted on Thursday, August 03, 2006 - 6:11 pm
Linda or Bengt,

Could I check how a "zero class" as you used in

Bengt Muthen and Tihomir Asparouhov, Item response mixture modeling: Application to tobacco dependence criteria, Addictive Behaviors, Volume 31, Issue 6, June 2006, Pages 1050-1066.

is tested?

I guess that it's done by testing for two classes and setting the thresholds for each item to be -15 for one class? Is that right and is there anything else to be done to test a zero class?



 Bengt O. Muthen posted on Thursday, August 03, 2006 - 7:23 pm
You fix the thresholds at +15 for the zero class and you also fix factor means and variances at zero for this zero class.
 Andrew Baillie posted on Tuesday, August 08, 2006 - 7:32 pm

thanks for your quick response (and for the excellent support!)

I'm wondering about setting the factor means for a zero class to zero. I was expecting that factor scores would be standardised so that zero would be the grand average? That way the zero class "could" have a mean that was greater than the other classes? (This sounds v. unlikely with thresholds at +15)

thanks again


 Bengt O. Muthen posted on Wednesday, August 09, 2006 - 7:57 am
For the zero class you should think of the factor as not existing - when you fix both the mean and the variance of the factor to zero it no longer influences anything in the model.

For the remaining classes, the average of the factor means across classes is not standardized to zero. Instead, a reference class with zero factor mean is used.
 Andrew Baillie posted on Sunday, October 29, 2006 - 2:28 am
Bengt & Linda,

I'm wondering about the magnitude of the differences between the models in, say, your paper with Tihomir Asparouhov. Given that some of the analysis compares -2 log-likelihoods between nested models, what do you think about using the omega^2 effect size as an index of the magnitude of the differences between models? Are there any other alternatives?



 Bengt O. Muthen posted on Sunday, October 29, 2006 - 3:26 pm
For nested models I like to use LR chi-square based on 2*LL diffs - when that is correct (no parameters on the border). I don't know about omega. We are also contemplating generalizing the bootstrapped LR test to models that differ not only in number of classes but also in terms of number of random effects.
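
The likelihood-ratio chi-square mentioned above can be sketched as follows. This is a minimal illustration, not Mplus output: the log-likelihoods and parameter counts are made-up values, the function names are hypothetical, and the plain chi-square reference distribution is assumed to apply (that is, no parameters on the border, as noted above). Models estimated with a robust estimator such as MLR would additionally need a scaling correction that this sketch does not include.

```python
import math


def chi2_sf(x, df):
    """Upper-tail probability of a chi-square variable with integer df >= 1.

    Uses the recurrence sf(df + 2) = sf(df) + (x/2)^(df/2) e^(-x/2) / Gamma(df/2 + 1),
    starting from the closed forms for df = 1 and df = 2.
    """
    if x <= 0:
        return 1.0
    half = x / 2.0
    if df % 2 == 0:
        k, p = 2, math.exp(-half)
    else:
        k, p = 1, math.erfc(math.sqrt(half))
    while k < df:
        p += half ** (k / 2.0) * math.exp(-half) / math.gamma(k / 2.0 + 1.0)
        k += 2
    return p


def lr_test(ll_restricted, ll_full, n_params_restricted, n_params_full):
    """Return (chi-square statistic, df, p-value) for two nested models."""
    stat = 2.0 * (ll_full - ll_restricted)
    df = n_params_full - n_params_restricted
    return stat, df, chi2_sf(stat, df)


# Hypothetical values: log-likelihoods and numbers of free parameters
# read off two nested model runs.
stat, df, p_value = lr_test(-5230.4, -5221.9, 31, 35)
```

Comparisons that differ in the number of classes fall outside this sketch, because the restricted model then places parameters on the border of the admissible space.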
 anonymous posted on Monday, April 02, 2007 - 3:15 pm

I just read the forthcoming paper (Muthén, 2006) on Latent variable hybrids: Overview of old and new models.

I was wondering if input examples for the four cross-sectional models shown graphically in Figure 1 were available (mixture factor analysis, non-parametric FA, factor mixture analysis, non-parametric FMA).

Thank you very much
 Linda K. Muthen posted on Monday, April 02, 2007 - 6:25 pm
Send your email address to
 Alex posted on Friday, April 27, 2007 - 6:53 am
I am working on a latent profile analysis in which I use factor mixture models to test for conditional dependence. Following your previous suggestions, I use factor mixture models with class-specific factor variance. They generally work well.
However, I am also trying to fit models with class-specific factor loadings (i.e., example 7.27). You warned me that such models could be hard to fit (and they are). For example, among the numerous warnings I obtain, I often receive a message regarding a negative (and significant) residual factor variance within one of the classes.
My question is whether factor mixture models with within-class factor loadings could be easier to fit by fixing the overall factor variance to 1 instead of fixing the loading of the first indicator to 1, while withdrawing (or not?) the class-specific estimates of factor variance. Would it still be logical to compare the fit of factor mixture models with class-specific factor variance (unstandardized) to standardized models with class-specific loadings?

Thank you very much in advance.
 Bengt O. Muthen posted on Friday, April 27, 2007 - 7:32 am
Yes, when the factor loading matrix is made class-specific, it is my experience that it can help to set the metric in the factor variance (@1) for each class instead of the first loading. This way, the search for the best solution is less dependent on the quality of that one item.
 Alex posted on Friday, April 27, 2007 - 8:46 am
Thank you very much for your answer.
Just to make sure I understand correctly: to "set the metric in the factor variance for each class", I only have to set the variance (f@1) in the %OVERALL% section of the model and not in each class section?
 Bengt O. Muthen posted on Friday, April 27, 2007 - 8:54 am
That's right. Saying that in the overall makes it hold for each class (see Tech1).
 Levent Dumenci posted on Thursday, September 20, 2007 - 12:29 pm
* 10 binary observed variables of u1-u10,
* factor mixture model with 2 classes,
* one factor for each class,
* classes differ in factor means only.
what should be the statements for item thresholds and factor means under %overall%, %c#1%, and %c#2%?
 Linda K. Muthen posted on Thursday, September 20, 2007 - 12:52 pm
Factor means and thresholds are free across classes as the default so I don't think you need to say anything about them.
 Anthony Ahmed posted on Sunday, May 11, 2008 - 1:23 pm
I am currently working on a dissertation project comparing the performance of Meehl's taxometric methods to latent variable methods. I have recently purchased the Mplus program to run mixture modeling methods. Every data condition I have simulated uses four continuous variables. I plan to run Latent Profile Analysis, Latent Class Factor Analysis and Factor Mixture Analysis. Is it possible to run Latent Class Factor Analysis with continuous indicator variables and is there an example in the user's guide?

My Monte Carlo datasets are tab delimited .dat files that have variable names on the first line. I got an error message when I tried to run latent profile analysis "ERROR
Invalid symbol in data file:
"V1" at record #: 1, field #: 1." The analysis runs when I delete the variable names on the datafile. Is there a way to work around this in the format statement? I know it is possible to skip columns with fixed data formats but can I skip the first row with a free data format?
Finally, I'd like the methods to determine the number of classes that best fit the data. Would I have to analyze each dataset twice, starting with c = 1 and then c = 2 or is it possible to ask the program to do both in one analysis?
 Linda K. Muthen posted on Monday, May 12, 2008 - 10:20 am
See the following paper which is available on the website for an example of Latent Class Factor Analysis:

Muthén, B. (2006). Should substance use disorders be considered as categorical or dimensional? Addiction, 101 (Suppl. 1), 6-16.

You should just delete the line with the variable names.

You need to run the analysis separately for two, three, etc. classes.
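
When many Monte Carlo datasets carry a header line, deleting it by hand gets tedious. A small preprocessing script can do it instead. The sketch below is one possible approach, with hypothetical file names and a hypothetical helper; it drops the first line only when its first field is not a number, so it would need adjusting if a non-numeric missing-value symbol could appear in the first field of real data.

```python
import os
import tempfile


def _is_number(token):
    """Return True if token parses as a floating-point number."""
    try:
        float(token)
        return True
    except ValueError:
        return False


def strip_header(src_path, dst_path):
    """Copy src_path to dst_path, dropping a leading line of variable names.

    Returns True if a header line was dropped, False otherwise.
    """
    with open(src_path) as src:
        lines = src.readlines()
    dropped = False
    if lines:
        fields = lines[0].split()
        if fields and not _is_number(fields[0]):
            lines = lines[1:]
            dropped = True
    with open(dst_path, "w") as dst:
        dst.writelines(lines)
    return dropped


# Demonstration on a small made-up tab-delimited file.
tmp_dir = tempfile.mkdtemp()
raw_file = os.path.join(tmp_dir, "rep1_raw.dat")
clean_file = os.path.join(tmp_dir, "rep1.dat")
with open(raw_file, "w") as handle:
    handle.write("V1\tV2\tV3\tV4\n0.12\t-1.30\t0.88\t2.01\n1.45\t0.07\t-0.52\t0.33\n")
header_dropped = strip_header(raw_file, clean_file)
```

The cleaned file keeps the original delimiters and row order, so the variable order given to the analysis program is unchanged.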
 Orla McBride posted on Monday, September 21, 2009 - 5:33 pm
I have conducted CFA and LCA on a set of binary indicators. The results suggest evidence for a one-factor model or a three-factor solution with parallel profiles. I want to follow up this analysis by estimating a hybrid model. I have read the papers by Muthen (2006) and Muthen & Asparouhov (2006) but I am a bit unclear about the differences between LCFA and IRT mixture modeling. I’d be grateful if you could answer the following questions:

(1) Should LCFA be estimated in favor of IRT mixture modeling when the factor is considered to have a non-normal distribution?
(2) In LCFA, is it correct that the latent classes share the same dimension and therefore the factor loadings (but not thresholds) should be equal across classes?
(3) Does it make conceptual sense to estimate both types of models on the same set of indicators? If so, in what circumstances?
 Shaunna Clark posted on Tuesday, September 22, 2009 - 11:11 am
One paper that may answer many of your questions is the forthcoming Clark and Muthen article entitled "Models and strategies for factor mixture analysis: Two examples concerning the structure underlying psychological disorders."

But since it is currently not available, here are some responses to your questions:

First, IRT mixture models and LCFA are not separate models; LCFA is a special case in which the factor loadings and item thresholds are invariant across classes and the factor variance/covariance is zero. The only difference between the classes is their location on the factor, as indicated by the factor mean being different in each class. So, to answer question 2, the classes do share the same dimension, and both the factor loadings and the item thresholds are equal across classes.
 Shaunna Clark posted on Tuesday, September 22, 2009 - 11:12 am
I would argue that both the LCFA and more flexible models which relax the equality of the item thresholds and factor loadings should be applied to data, but that it should be kept in mind what each of these models implies about the underlying structure of the data. LCFA and an alternative which allows the factor variance to be estimated (this variance can be restricted to be equal across classes or non-invariant) both have the same factor running through all classes, and the difference between classes arises from class-varying factor means and potentially factor variances. Other models which relax the equality of factor loadings and item thresholds may still have the same factor in both classes, depending on the difference in the estimated item thresholds and factor loadings when they are allowed to vary across classes.

Also, both the LCFA and other models which relax the equality of item thresholds and factor loadings across classes allow for a non-normal factor.
 Matt Thullen posted on Tuesday, November 10, 2009 - 10:14 am

In LCFA, how is the zero factor score interpreted? I'm thinking of how to represent the factor scores within and between each class for a model with 3 in a graph or plot.

Also with LCFA, I based my syntax on the examples in Clark & Muthen (recently posted) and I get warnings about having more equality labels than parameters. I have something resembling this for each of my classes: [u1-u12] (1-12). The model seems to run fine but I'm not sure if or what I should do to address those warnings.

thank you
 Linda K. Muthen posted on Tuesday, November 10, 2009 - 1:09 pm
The zero factor score is a reference point. It is not identified as a free parameter.

I would need to see your full output and license number to understand why you get an error for the syntax you show.
 Mike Stoolmiller posted on Thursday, June 02, 2011 - 7:42 am
I'm fitting factor mixture models and I have a model with 4 latent classes and a single latent factor in each class. When I request class probabilities and factor scores, I get two different factor scores, one which is labeled the same way I labeled my latent factor and one with C_ as a prefix. I searched the Mplus version 6 manual but can't find any information on this second factor score. Can you direct me to some relevant documentation?
 Linda K. Muthen posted on Thursday, June 02, 2011 - 9:51 am
I don't know of any documentation. One is mixed over classes and the other is for the most likely class.
 Mike Stoolmiller posted on Thursday, June 02, 2011 - 11:49 am
Which is which?
 Linda K. Muthen posted on Thursday, June 02, 2011 - 3:01 pm
c_ is most likely class membership.
 kelly kenzik posted on Thursday, January 31, 2013 - 6:41 am
I have a factor mixture model with two factors and 4 classes. I am allowing the means, thresholds, and factor loadings to vary across classes. My factor loadings, however, have only 0.00 for the SE and 999 for the p-value. Is this because I did not make any specifications for the factor variance and it is being held at 0?
Can I still compare these factor loadings across classes?
My code looks like this:
%OVERALL%
y1 by u1-u21;
y2 by u22-u37;
%c#1%
y1 by u1-u21;
y2 by u22-u37;
[u1$1-u21$1] ;
[u1$2-u21$2] ;
[u22$2-u37$2] ;
[u22$3-u37$3] ;
...and so on for the other classes.

Thanks so much!!
 Linda K. Muthen posted on Thursday, January 31, 2013 - 7:35 am
Please send the output and your license number to so I can see what the problem is.
 Artemis posted on Thursday, November 20, 2014 - 3:20 am
Dear Profs Muthen,

I have read with great interest your paper entitled 'Item response mixture modelling: Application to tobacco dependence criteria' and I would like to generate Table 6 from your paper for my problem (i.e., response pattern classification by latent classes using factor mixture analysis). I guess factor scores and cprob will have to be saved, and that they depend on which model one fits, i.e., which constraints one imposes for measurement invariance, right?
In addition, I have in my problem a rather big sample i.e. ~ 7000 which makes computation time quite slow i.e. each model takes 20-30 hours to run...would it be acceptable for all the different models I want to fit to select randomly say perhaps 1000 and do all comparisons and then fit the final model to the big sample? Something like a kind of cross validation if I may name it like this?
Many thanks for all your help and time to this.
 Bengt O. Muthen posted on Thursday, November 20, 2014 - 3:28 pm
Q1. See the RESPONSE option on page 754 of the V7 UG.

Q2. Seems reasonable. Or, get a computer with at least 8 processors.
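
The random subsample proposed in the question could be drawn reproducibly before any models are fitted. The sketch below assumes roughly 7000 cases indexed from 0 and a subsample of 1000, as in the question; the function name and the seed are arbitrary choices made here for illustration.

```python
import random


def draw_subsample(n_total, n_sub, seed):
    """Return a sorted list of n_sub distinct row indices out of n_total."""
    rng = random.Random(seed)  # fixed seed so the same rows can be redrawn later
    return sorted(rng.sample(range(n_total), n_sub))


# Row indices of the cases to keep for the model-comparison stage.
subsample_rows = draw_subsample(7000, 1000, seed=20141120)
```

Keeping the seed on record means every candidate model is compared on exactly the same cases, and the final model can then be refitted to the full sample.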
 Artemis posted on Friday, November 21, 2014 - 9:02 am
Thanks a lot for the very helpful responses, unfortunately my PC has 8 processors so I guess I will try the cross-validation approach-thanks again for everything.
 Raghav Ramachandran posted on Monday, February 08, 2016 - 1:52 pm
Drs. Muthen,

I am running factor mixture models (FMM) at two different time points with continuous indicators (20 indicators, 5 covariates, 3 factors, and 4 classes). I'm able to get the FMMs to run successfully at each time point but when I try to combine them into a longitudinal model, the model estimation does not terminate normally. I would also like to include a distal outcome in the model.

I've tried constraining factor loadings to be equal across time and fixing factor correlations across time to be zero, but that does not help with the model estimation. I haven't been able to find any literature on longitudinal factor mixture models; I've only found cross-sectional FMMs. I was wondering if you would be able to point me towards a few useful articles or if you had any suggestions on useful constraints/parameter restrictions for longitudinal FMMs with distal outcomes. Thank you for your help.

 Bengt O. Muthen posted on Monday, February 08, 2016 - 6:22 pm
Don't fix factor correlations across time to zero. You can hold factor loadings equal across classes.

If this doesn't work, send output to Support along with your license number.
 'Alim Beveridge posted on Monday, April 04, 2016 - 1:49 am
Dear Drs. Muthen,

I am trying to conduct latent class factor analysis (LCFA) using Mplus 7.4. I am following the instructions and examples provided in Clark et al (2013). According to Clark et al, only factor means should vary across classes in LCFA (they call it FMM-1). The loadings should be invariant, and the factor covariance matrix should be 0. The results fit these specifications. However, I notice that the residual variances are also invariant across classes. Should this be the case?

I am also wondering about model identification requirements. If I want all loadings to be freely estimated, I know that I must fix the mean of one class to 0 (automatically done for the last class) if there are 2 classes. Must I fix the mean of another class (e.g. to 1) if I have 3 classes? In general, must I fix the mean of k-1 classes if there are a total of k classes, or is it enough to fix 1?

Finally, a very basic question: the 5 observed variables in my model are all percentages, and only 2 are normally distributed. Must percentages be declared as some data type other than continuous?


Clark, S. L., Muthén, B., Kaprio, J., D’Onofrio, B. M., Viken, R., et al. 2013. Models and strategies for factor mixture analysis: an example concerning the structure underlying psychological disorders. Structural Equation Modeling, 20(4): 681–703.
 Bengt O. Muthen posted on Monday, April 04, 2016 - 6:55 pm
Q1: Residual variance invariance is applied for parsimony's sake.

Q2: Try fixing the mean in only one class and see if the model is identified. If not send to Support along with your license number.

Q3: Not unless they are close to 0 or 1.
 'Alim Beveridge posted on Monday, April 04, 2016 - 8:42 pm
Thanks Dr. Muthen,

I am also trying Exploratory factor mixture analysis (example 4.4.). I have a few questions:

1. In the results, loadings and residual variances vary across classes. Does that mean that all parameters are non-invariant across classes?

2. If I specify that I want more than 2 factors (eg EFA 1 4), Mplus says: "Too many factors were requested for EFA. The maximum number of factors is set to 2."
Is 2 the max number of factors allowed for Exploratory FMM, or does it depend on something like number of observed variables in the model?

3. I often get the error "NO CONVERGENCE. PROBLEM OCCURRED IN EXPLORATORY FACTOR ANALYSIS WITH 2 FACTOR(S)." What is the best way to handle this? Should I try increasing one or several of the following: initial stage random starts, final stage optimizations, or initial stage iterations (STARTS and/or STITERATIONS)?

 Bengt O. Muthen posted on Tuesday, April 05, 2016 - 6:45 am
1. Yes

2. It depends on how many variables you have. See our Topic 1 handout for regular EFA on our website.

3. Typically this is because of negative residual variances and there is not an easy fix except to have fewer factors.
 Todd Jensen posted on Tuesday, June 07, 2016 - 2:15 pm
Are there any references or guides that expand on the Mplus syntax given in the Clark et al. (2013) piece on FMM, such that instructions on correctly specifying FMM-2, FMM-3, and FMM-4 models with three or more latent classes are offered?

Model identification seems to be a bit tricky when moving into the FMM-2, FMM-3, and FMM-4 model specifications with three or more latent classes (and with four latent factors).

Any guidance would be very appreciated.
 Bengt O. Muthen posted on Tuesday, June 07, 2016 - 2:59 pm
The UG has FMM models - check there first.
 Todd Jensen posted on Wednesday, June 08, 2016 - 5:10 am
It appears that all the FMM examples in the UG only specify a two-class solution. There do not appear to be any examples with three or more classes specified for FMM.
 Bengt O. Muthen posted on Wednesday, June 08, 2016 - 11:00 am
UG ex 7.27 can be directly generalized to more than 2 classes.
 Maike Trautner posted on Thursday, July 28, 2016 - 9:46 am

I have conducted a factor mixture model with three latent factors and two classes. Correlations between factors, as well as item residual variances are class specific. All other parameters are held equal.

By default, factor means are set to zero in class two, but are estimated in class one. I am now unsure of how to interpret the absolute levels of the factors? Is there a way to obtain absolute levels of the factors to interpret them as an average score of people within a class?

Thanks for your help!
 Bengt O. Muthen posted on Thursday, July 28, 2016 - 12:11 pm
No, because they refer to latent variables, factor means are always relative - one class compared to the other. Same for multiple groups or multiple time points. You don't need more information than that.
 WEN Congcong posted on Wednesday, March 08, 2017 - 7:10 pm
Professor Muthen,

Hello! I have read your paper "Models and strategies for factor mixture analysis" and have some questions. Thank you for shedding light on them.

1. You mentioned measurement invariance as factor loading and intercept invariance and proposed FMM1 and 2. We may naturally come up with FMM4 and 5. But why an FMM3? Why not add factor variance/covariance invariance in FMM2?

2. None of the 5 FMMs allowed cross loadings, which gave more parsimonious models. Is it worthwhile to allow cross loadings to get better model fit, given that the model fit of the 2f2c model was worse than that of the wrong 1f2c model in the second example?

3. I think the analysis of data with FMM is more exploratory, and simultaneously confirmatory. You investigated the data with different FMMs and wanted to get the model with the best model fit. But the model with the best fit was not correct. The alternative 2f2c model also needed adjustment due to partial intercept invariance. This process reminded me of the multiple group EFA asymptotical restriction and model adjustment process. What about your opinion?

4. Is skew-t applicable in the framework of FMM, and with Mplus program now?
 J.D. Haltigan posted on Sunday, April 15, 2018 - 5:30 am
A follow-up to the reference to the FMM-1 in the Clark et al (2013) piece. In that piece, the FMM-1 is noted as having no within-class variability on the factor. As such, using a 2-class model, the exemplar syntax sets the variance to 0 in the overall part of the model command while using the marker ID approach to set the metric as the default (also last class mean to zero). If I wanted to estimate all factor loadings and set the variance to 1 to identify the model (instead of 0), would this technically move out of the realm of FMM-1 (i.e., no within-class variability)? Since all classes have factor variance at 1, I don't think it would, but I am wondering about the consequences of doing this.
 Bengt O. Muthen posted on Sunday, April 15, 2018 - 4:55 pm
Yes, as soon as the variance is > 0 , this would move it out of that realm.
 Angela Nickerson posted on Friday, April 27, 2018 - 11:28 pm
Dear Drs Muthen,
I am running a factor mixture model (CFA up to 2 factors and LPA up to 5 classes) using survey weights. I have been following the steps in the Clark, Muthen et al paper from 2013. I am able to successfully run models at level FMM-1 (i.e., class-invariant item means and factor loadings, factor covariances fixed to 0, but allowing factor means to vary). However, I have had difficulty running the models from FMM-2 (i.e., class-varying factor covariances) onwards. In the case of the FMM-2 model for the 2-class, 2-factor model (see syntax below), I was able to get convergence by fixing a small non-significant negative residual factor variance to 0, and a non-significant factor covariance to 0, but then the model results were not meaningful in that one of the two latent classes had a membership of <5% when it would be expected to be much larger. When I move to FMM-3 and 4, I consistently get error messages regarding either non-convergence or the singularity of the data, which seem to be related to the factor means. I am wondering if this would suggest that models beyond the most restrictive are not appropriate for my data?
Thank you very much.


%OVERALL%
F1 by x1-x6;
F2 by x7-x14;
[x1-x14] (1-14);

%c#1%
F1 with F2;

%c#2%
F1 with F2;
 J.D. Haltigan posted on Saturday, April 28, 2018 - 1:02 am
In response to your above post, Angela, I might mention I have had similar issues progressing from FMM-1 through FMM-4 from the Clark et al. (2013) piece. Although in my case difficulties seemed related to the factor variances, the common thread seemed to be that the more you try to parse a modest-sized sample (you don't mention your N) into K + 1 classes, the harder FMM-2 thru 4 are to fit as they permit too much flexibility. Hunt & Jorgensen (2011) explicitly note that "permitting too much flexibility in component models subverts the whole idea of a mixture of definite components and risks a failure of identifiability in the model and numerical problems with the fitting."
 Angela Nickerson posted on Saturday, April 28, 2018 - 5:07 am
Thank you for this comment - it is interesting to hear that you also had difficulties with this progression. My sample is 1,800+ participants. I also have read the comment by Hunt & Jorgensen, but had thought that FMM-2 was still a fairly restrictive model with relatively low flexibility, and was hoping it might be possible to test it in my data.
 Bengt O. Muthen posted on Sunday, April 29, 2018 - 4:37 pm
Yes, some data don't contain enough information to support the more flexible models. But if you like, you can send your FMM-3 run to Support along with your license number and maybe we can see something.
 Angela Nickerson posted on Sunday, April 29, 2018 - 4:55 pm
Thank you very much Dr Muthen. I will send it through.