Latent Profile Analysis
 Levent Dumenci posted on Friday, May 09, 2003 - 5:20 am
Here is my syntax:

type is mixture;

True or False: Observed covariance matrix is class specific.
 bmuthen posted on Friday, May 09, 2003 - 6:44 am
This model implies a class-invariant covariance matrix for the x's. The covariances are fixed to zero as the default.
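For readers following along, a minimal sketch of an LPA input that relies on these defaults might look like the following. The data file name and variable names are hypothetical, since the original poster's full syntax is not shown in the thread.

```
TITLE:    LPA sketch (hypothetical data file and variable names)
DATA:     FILE = lpa.dat;
VARIABLE: NAMES = x1-x4;
          CLASSES = c(2);
ANALYSIS: TYPE = MIXTURE;
! No MODEL command is needed for the default LPA:
!   means of x1-x4 differ across classes,
!   variances are held equal across classes,
!   covariances among x1-x4 are fixed at zero.
```

To relax the equality of variances, one would mention the variances in the class-specific parts of the MODEL command (for example, x1-x4; under %c#1% and again under %c#2%).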
 Levent Dumenci posted on Monday, May 12, 2003 - 6:20 am
That's correct. I estimated a class-invariant (diagonal) covariance matrix, as I intended. However, the residual covariance matrices (observed - estimated) vary across classes, which led me to think that the observed covariance matrices differ across classes. Is that right?
 bmuthen posted on Monday, May 12, 2003 - 7:28 am
Here "Observed" is a class-specific covariance matrix computed by weighting the raw data with the individuals' posterior probabilities as estimated from the model. This means that the "observed" covariance matrix does change over the classes. This is the closest to observed that one can get with unknown mixture classes.
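One common way to write such a posterior-probability-weighted covariance matrix is sketched below. The notation is an assumption made for illustration: y_i is the observed vector for individual i, and \hat{p}_{ik} is the estimated posterior probability that individual i belongs to class k. The thread does not state the exact formula Mplus uses.

```latex
\hat{\mu}_k = \frac{\sum_{i=1}^{n} \hat{p}_{ik}\, y_i}{\sum_{i=1}^{n} \hat{p}_{ik}},
\qquad
S_k = \frac{\sum_{i=1}^{n} \hat{p}_{ik}\,(y_i - \hat{\mu}_k)(y_i - \hat{\mu}_k)^{\top}}{\sum_{i=1}^{n} \hat{p}_{ik}}
```

Because the weights differ by class, S_k differs by class even when the model-implied covariance matrix is class-invariant, so the residuals (S_k minus the estimated matrix) differ by class as well.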
 Anonymous posted on Friday, May 14, 2004 - 11:28 am
I am new to Mplus and I need help on my problem below.
I have a structural model that has two exogenous latent variables affecting one endogenous latent variable. One of the exogenous latent variables has four categorical manifest variables and the other exogenous latent variable has three manifest variables. The endogenous latent variable is continuous and has 4 continuous manifest variables.
I am considering using latent profile analysis and the Mplus program. Am I right, and how do I go about it?
Thanks so much for your assistance.
 Anonymous posted on Friday, May 14, 2004 - 12:35 pm
Sorry I did not preview my first message. Below is the corrected version.
I am new to latent variable modeling (not SEM) and Mplus. I need help to solve a modeling problem.
I have a model that has two exogenous latent variables affecting one endogenous latent variable. The latter has four continuous manifest variables. One of the exogenous latent variables has four categorical manifest variables and the other has three continuous manifest variables. I want to use latent profile analysis and Mplus. Am I right in using LPA? Can Mplus perform the analysis?
 Linda K. Muthen posted on Friday, May 14, 2004 - 3:44 pm
So you want to look for unobserved heterogeneity in your data in the form of latent classes using an SEM model?
 Anonymous posted on Tuesday, May 18, 2004 - 7:19 am
 Linda K. Muthen posted on Tuesday, May 18, 2004 - 7:44 am
There are two papers on the homepage at that describe such models using Mplus. The first author is Lubke.
 Scott Weaver posted on Wednesday, August 02, 2006 - 11:30 pm
I am conducting a latent profile analysis. I have specified tech11 to get the LMR test, but am having difficulty figuring out how to specify the start values such that the first class is the smallest class. I tried specifying the start value for the latent class mean (C#1*-2 in a 2 class model) but that does not seem to work to make class 1 be the smallest class.

Your help is very much appreciated!
 Linda K. Muthen posted on Thursday, August 03, 2006 - 6:57 am
You should be specifying starting values so that the largest class is last. You do this by using the parameter values of the means of the latent class indicators, not the means of the categorical latent variable.
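As a rough illustration of this advice, the fragment below puts the indicator means estimated for the largest class under the last class label. All variable names and numeric values are hypothetical placeholders, and the DATA command is omitted; this is a sketch, not the poster's actual input.

```
VARIABLE: NAMES = y1-y3;
          CLASSES = c(2);
ANALYSIS: TYPE = MIXTURE;
          STARTS = 0;              ! no random starts, so the values below are used
MODEL:
  %OVERALL%
  %c#1%
  [y1*1.0 y2*0.5 y3*1.2];          ! hypothetical estimates from the smaller class
  %c#2%
  [y1*3.0 y2*2.5 y3*3.1];          ! hypothetical estimates from the largest class
OUTPUT:   TECH1 TECH11;
```

TECH1 can be used to check that the specified starting values were picked up.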
 Scott Weaver posted on Thursday, August 03, 2006 - 12:05 pm
I tried what you suggested, but it does not seem to be working either. In my initial run of a 2 class model, the largest class is first. So I used the estimated means from the largest class as start values for the last class (%c#2%) and ran the model with starts = 0 0. I verified that my specified start values were used with tech1. However, the results are still such that the largest class is first. Any suggestions?
 Linda K. Muthen posted on Thursday, August 03, 2006 - 12:43 pm
You would need to send your input, data, output, and license number to so I can see where you are going wrong.
 Zhongmiao Wang posted on Thursday, November 02, 2006 - 9:32 am
Any opinions about the difference between Latent Profile Analysis and Latent Class Analysis? From Lubke and Muthen's paper about factor mixture models, I think that for LPA the latent class indicators are continuous variables, but for LCA the latent class indicators are categorical or ordinal variables. Am I right?
 Linda K. Muthen posted on Thursday, November 02, 2006 - 1:13 pm
That sounds correct.
 Raji Srinivasan posted on Saturday, February 03, 2007 - 7:46 pm
This is a question regarding the output file from a mixture model.

I am saving the output from a mixture model with the cprobs for segment membership, but I would also like to get the IDs of the observations in order to do some additional post hoc analysis using other variables not used in the Mplus models.
Is there a way that I can do this?
In other words, can I get Mplus to carry forward an ID variable from the input data file into the output data file?

Thanks in advance!
 Linda K. Muthen posted on Monday, February 05, 2007 - 6:45 am
If you include the IDVARIABLE option in the VARIABLE command, the id variable will be saved.
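A minimal sketch of how this might be set up is shown below. The variable names and the output file name are hypothetical, and the DATA command is omitted.

```
VARIABLE: NAMES = id y1-y4;
          USEVARIABLES = y1-y4;
          IDVARIABLE = id;           ! carried along into the saved file
          CLASSES = c(3);
ANALYSIS: TYPE = MIXTURE;
SAVEDATA: FILE = cprobs.dat;         ! hypothetical output file name
          SAVE = CPROBABILITIES;     ! posterior probabilities and most likely class
```

The saved file can then be merged with the original data on the id variable for post hoc analyses.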
 Michelle Finney posted on Friday, June 06, 2008 - 12:12 pm
I have pre and post measures on five cognitive domains for a large sample of healthy older adults with a family history of Alzheimer's disease. The five pre and post measures were adjusted for age, gender, and IQ using data from a control sample.
We are interested in identifying 3 possible groups in terms of cognitive performance at both time points: improver, stable, or decliner.
I have thought about 3 possible approaches to analyze the data in Mplus:
1. Include all 10 measures (5 pre and 5 post) as indicators of class membership. Do latent profile analysis.
2. Compute five change scores: post - pre, in which case a negative score would indicate decline, and then use both the five change scores and also the Time 1 scores as indicators of class membership. (Including performance at time 1 and also change would capture those with low initial performance and also a decline.)
3. Assign a score to each individual on each of the five cognitive dimensions according to the following scheme:
Assign 2 to those performing 1SD or more above Controls
Assign 1 to those performing within +/-1SD relative to controls
Assign 0 to those performing 1SD or more below Controls.
Use these 10 categories in a latent Class Analysis.

What would be the best approach to identify decliners, stables, and "improvers"?

Thank you very much in advance for your help!

 Bengt O. Muthen posted on Friday, June 06, 2008 - 12:26 pm
How about Latent Transition Analysis, where you have a latent class model for the 5 outcomes at each of the two time points? With say 2 classes at each time point you would have a chance to get the decliners, stables, and improvers.
 Michelle Finney posted on Friday, June 06, 2008 - 1:24 pm
Dr. Muthen,

Thanks for your response.

Could you point me to an example of the commands set up using MPLUS?

Thank you!
 Linda K. Muthen posted on Friday, June 06, 2008 - 1:44 pm
See Examples 8.13 and 8.14 in the Mplus User's Guide. See also the Nylund dissertation that is on the website.
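For orientation only, a two-time-point LTA with continuous indicators might be sketched roughly as follows. This is not a copy of the User's Guide examples; the variable names (pre1-pre5, post1-post5) are hypothetical, the DATA command is omitted, and details such as measurement invariance constraints are left out.

```
VARIABLE: NAMES = pre1-pre5 post1-post5;
          CLASSES = c1(2) c2(2);     ! one latent class variable per time point
ANALYSIS: TYPE = MIXTURE;
MODEL:
  %OVERALL%
  c2 ON c1;                          ! transition from time 1 class to time 2 class
MODEL c1:
  %c1#1%
  [pre1-pre5];
  %c1#2%
  [pre1-pre5];
MODEL c2:
  %c2#1%
  [post1-post5];
  %c2#2%
  [post1-post5];
```

With two classes at each time point, the cross-classification of c1 and c2 gives the patterns that could be read as stable, declining, or improving.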
 Thomas Olino posted on Friday, August 07, 2009 - 5:25 pm
Are there any guidelines for the minimum number of continuous indicators for latent profile analysis? Alternatively asked, is there a way to calculate df for latent profile analysis?

 Linda K. Muthen posted on Friday, August 07, 2009 - 6:09 pm
Degrees of freedom are not relevant for LPA because there is no unrestricted set of sample statistics to test against. I know of no guidelines for the minimum number of continuous indicators for latent profile analysis.
 Tomoko Udo Schaller posted on Wednesday, September 02, 2009 - 9:49 am
Dr. Muthen,

I'm using Latent Profile Analysis on data collected from three different sample sources. I could try to use sample source as a covariate, but am wondering if there are any other ways to take the different sample sources into account in the analysis.

Thank you.
 Bengt O. Muthen posted on Wednesday, September 02, 2009 - 11:30 am
You can treat sample source as "KNOWNCLASS" which makes it possible to test equality across samples of any of the parameters in your model.
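A sketch of what this could look like is given below, assuming a hypothetical variable named source coded 1, 2, 3 for the three samples. Variable names are placeholders and the DATA command is omitted.

```
VARIABLE: NAMES = source y1-y4;
          USEVARIABLES = y1-y4;
          CLASSES = cg(3) c(2);
          KNOWNCLASS = cg (source = 1 source = 2 source = 3);
ANALYSIS: TYPE = MIXTURE;
MODEL:
  %OVERALL%
  c ON cg;                           ! class proportions may differ by sample
```

Parameters for a particular combination of sample and latent class can then be addressed in class-specific sections such as %cg#1.c#1%, which is where equality across samples would be imposed or relaxed.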
 Melissa Kimber posted on Wednesday, February 22, 2012 - 8:22 am
Hello Dr.'s Muthen,
I have run an LPA on three continuous indicators and have found a good 4 class model. I would now like to use that latent class variable as a 'predictor,' if you will, in another LV model that has a dependent variable defined by 3 binary indicators and other observed covariates.
Is there any syntax or examples about how to do this?
Thank you,
 Bengt O. Muthen posted on Wednesday, February 22, 2012 - 10:17 am
Look at how UG ex8.6 handles the influence of c on u. You don't say u ON c, but the u thresholds/means vary over the c classes by default.
 Erika Wolf posted on Wednesday, April 11, 2012 - 1:34 pm
How do I request output with the observed-expected residual covariance matrix for a latent profile analysis when type = complex mixture?
 Linda K. Muthen posted on Wednesday, April 11, 2012 - 3:49 pm
Use the RESIDUAL option of the OUTPUT command.
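In context, the request might look like the fragment below. The cluster variable and the other variable names are hypothetical, and the DATA command is omitted.

```
VARIABLE: NAMES = clus y1-y4;
          USEVARIABLES = y1-y4;
          CLUSTER = clus;            ! required for TYPE = COMPLEX
          CLASSES = c(3);
ANALYSIS: TYPE = COMPLEX MIXTURE;
OUTPUT:   RESIDUAL;
```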
 Kathryn Modecki posted on Saturday, August 17, 2013 - 3:07 am
Dear Drs. Muthen, I am running an LPA with decision-making variables. However, theoretically I should include indicator variables that "overlap": for instance, benefits of taking a risk, costs of taking a risk, "depth of processing" (the sum of all benefits and costs), and benefit-to-reward ratio (benefits/costs). This would be an issue in regression, I believe, according to Cohen, Cohen, West, & Aiken. From your view, is there a similar issue in LPA? The solution converges in Mplus, but my concern is that variables with "overlap" may be less likely to "drive" the differences across profiles. Thank you very much for your time.
 Bengt O. Muthen posted on Sunday, August 18, 2013 - 1:50 pm
In regression you worry about collinearity (too high correlation) among the predictors (covariates; x's; IVs), but in LPA your variables are outcomes (y's; DVs) so that issue isn't involved. Still, overlapping indicators may create residual covariances in LPA which may cause BIC to point to too many latent classes. If this is a concern, you can also do mixture modeling with all covariances in the model in line with UG ex 7.22.
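As a rough sketch of the "all covariances" specification, the MODEL command below adds every pairwise WITH statement for four hypothetical indicators y1-y4. It is written in the spirit of UG ex 7.22 rather than copied from it, and the other commands are omitted.

```
MODEL:
  %OVERALL%
  y1 WITH y2-y4;
  y2 WITH y3-y4;
  y3 WITH y4;
```

Specified this way in the overall part, the covariances are estimated but held equal across classes.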
 Steven L Lancaster posted on Thursday, January 08, 2015 - 7:28 am
Hello, I am hoping for a bit of clarification/guidance in terms of picking an analytic strategy. I am using data that involve ratings of emotions that are, obviously, correlated. Thus, I initially looked at using the syntax noted above (UG ex 7.22). However, in other places, I saw a suggestion that when variables may correlate within class, factor mixture modeling (similar to ex 7.17) is the way to go. Is there any guidance you can provide as to how to decide which strategy is best?
 Sakhavat Mammadov posted on Thursday, January 08, 2015 - 3:55 pm
Hello Dr. Muthen, In my study, I used LPA to investigate high-ability students' personality profiles. I submitted the paper to a journal in our field (gifted education). They asked for a major revision, and the reviewers asked some important questions. One of the comments was about how I conducted the analysis. The instrument that I used is the Five-Factor Model (FFM) personality inventory, which has five factors (Extraversion, Openness, etc.). I submitted the five subtest scores from the FFM for the sample of 410 students. The analysis yielded three profiles. The reviewer indicated that using data created with factor analysis to then do an LPA to group people together is problematic. The problem is that the variables used to create the classes have been designed to be distinct and difficult to group together. The reviewer indicated that this somewhat lowers entropy values and makes it difficult to have distinct classes.
I am wondering if I can use LPA with those factor scores (latent variables). If I have to use only observable variables, wouldn't it be difficult to interpret the profiles? I do not know what the results will look like if I include all 45 items from the dataset in the LPA. What is the correct way of using LPA in this study? Is the reviewer right in his/her concern? I appreciate your time and explanation about this issue.

 Bengt O. Muthen posted on Thursday, January 08, 2015 - 5:11 pm
Note that you don't have to have correlations among indicators within class for the indicators to correlate - they correlate because they are all influenced by the latent class variable. For an application, you may want to take a look at the papers on our website:

Muthén, B. (2006). Should substance use disorders be considered as categorical or dimensional? Addiction, 101 (Suppl. 1), 6-16.

Muthén, B. & Asparouhov, T. (2006). Item response mixture modeling: Application to tobacco dependence criteria. Addictive Behaviors, 31, 1050-1066.
 Steven L Lancaster posted on Friday, January 09, 2015 - 8:52 am
Thank you for the quick response. Given that, when is it necessary/beneficial to specify covariances (ex 7.22) versus not? In reviewing papers that have been published in my area using LPA, this is not very common, but it is not clear how this decision is to be made.

I have a follow-up question as well. If I want to include a covariate (gender), it is my understanding that the model presented in Figure 3b in that 2006 paper is the best way to go due to the model parsimony it provides over traditional LPA. Is that correct?
 Bengt O. Muthen posted on Friday, January 09, 2015 - 4:48 pm
It depends on substantive theory to a large extent. LPA (no correlations within class) is a simple, parsimonious model. You can add more classes so that the zero within-class correlation specification is more realistic. But sometimes you get a better BIC by adding some WITH statements rather than adding classes. UG ex 7.22 is an extreme case of that, where you add all possible WITH statements. Such a model is more often used in the classification literature, where there is little or no theory of a set of items designed to measure a latent class variable (items designed that way are more likely to have zero within-class correlations). Instead of adding all WITH statements, the factor in a factor mixture model captures within-class correlations more parsimoniously (fewer parameters). Such a factor would be relevant if substantive theory suggests a "severity dimension" within class.
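To make the parsimony comparison concrete, the parameter counts can be worked out under some simplifying assumptions that are not stated in the thread: p continuous indicators, K classes, class-specific means, class-invariant variances and covariances, and (for the factor model) one factor with class-invariant loadings, the first loading fixed at 1, and the factor mean fixed at zero.

```latex
q_{\mathrm{LPA}}  = Kp + p + (K - 1)
\\
q_{\mathrm{WITH}} = q_{\mathrm{LPA}} + \frac{p(p-1)}{2}
\\
q_{\mathrm{FMM}}  = q_{\mathrm{LPA}} + (p - 1) + 1 = q_{\mathrm{LPA}} + p
```

For example, with p = 5 and K = 3 this gives 22 parameters for LPA, 32 with all WITH statements, and 27 with one factor. Under these assumptions the factor specification uses fewer parameters than all WITH statements whenever p > 3.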

Gender can predict the latent class variable or also a severity factor within class.
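In Mplus terms, the two roles for gender might be sketched as below. The variable names and number of classes are hypothetical, the DATA command is omitted, and this is an illustration rather than the model from the 2006 paper.

```
VARIABLE: NAMES = y1-y5 gender;
          CLASSES = c(3);
ANALYSIS: TYPE = MIXTURE;
MODEL:
  %OVERALL%
  f BY y1-y5;        ! within-class "severity" factor
  c ON gender;       ! gender predicts latent class membership
  f ON gender;       ! gender also predicts the factor within class
```

Dropping the f BY and f ON lines leaves a plain LPA with gender predicting class membership only.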

For a further modeling overview, see also the paper on our website under General Mixture Modeling:

Muthén, B. (2008). Latent variable hybrids: Overview of old and new models. In Hancock, G. R., & Samuelsen, K. M. (Eds.), Advances in latent variable mixture models, pp. 1-24. Charlotte, NC: Information Age Publishing, Inc.
 Bengt O. Muthen posted on Saturday, January 10, 2015 - 9:27 am
Answer to Mammadov post:

I don't see a problem with doing LPA on factor scores if correlation among the factors is substantively motivated (a latent class variable influencing the factors makes the factors correlate). It would be different if the factors were obtained with orthogonal rotation using, say, Varimax.

You can also do LPA based on the 45 items but that might well lead to a different picture.