Multinomial logistic regression
 Bonnie J. Taylor posted on Tuesday, December 07, 1999 - 1:00 pm
For the latent class model with covariates, can contrasts other than each class compared to the reference group be made?
 Bengt O. Muthen posted on Wednesday, December 08, 1999 - 9:28 am
No, I don't think so. The two tools we have available for classes are that we can make the last class (the reference class) into the one we want by choice of starting values, and that we can place equality constraints on the class probability parameters.
 Bonnie J. Taylor posted on Friday, December 17, 1999 - 8:34 am
In a multinomial logistic regression with a covariate and a latent categorical variable having more than two classes, the individuals do not actually have a 1 or 0 signifying class membership; instead they have a probability of membership for each class. Does the estimation of logit coefficients begin by actually assigning each individual to one specific class and then somehow iteratively account for the fact that the probabilities are not exactly 0 and 1?
 Bengt O. Muthen posted on Friday, December 17, 1999 - 7:01 pm
The Mplus multinomial regression with a latent class variable as the dependent variable assigns each individual fractionally to all classes using the "posterior probabilities" and does not force a 0/1 classification. This is done throughout the EM iterations. The first set of fractional assignments are based on the starting values, and they are then iteratively improved on until convergence.
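The fractional assignment described above can be illustrated with a minimal sketch of one E-step for a latent class model with binary indicators. This is not the Mplus implementation; the function name `posterior_probs` and all parameter values are hypothetical, and covariates are left out (with covariates, the prior class probabilities would depend on x).

```python
def posterior_probs(responses, class_probs, item_probs):
    """Return P(class k | responses) for one individual.

    responses   : list of 0/1 item responses
    class_probs : prior class probabilities P(c = k) (current estimates)
    item_probs  : item_probs[k][j] = P(u_j = 1 | c = k) (current estimates)
    """
    joint = []
    for prior, probs in zip(class_probs, item_probs):
        like = prior
        for u, p in zip(responses, probs):
            # Conditional independence of the indicators within a class
            like *= p if u == 1 else (1.0 - p)
        joint.append(like)
    total = sum(joint)
    # Bayes' rule: normalize the joint probabilities over classes
    return [j / total for j in joint]


# Hypothetical starting values: 2 classes, 3 binary indicators
class_probs = [0.5, 0.5]
item_probs = [[0.8, 0.7, 0.9], [0.2, 0.3, 0.1]]

# One individual's fractional class membership under these values
post = posterior_probs([1, 1, 0], class_probs, item_probs)
```

In an EM run, these fractional memberships would be recomputed from the updated parameter estimates at every iteration until convergence; no individual is ever forced into a single class.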
 Bonnie J. Taylor posted on Friday, January 26, 2001 - 6:33 am
How are nominal variables used as predictors of the latent categorical variable? (i.e. How do they need to be coded?) For example, I am using age, gender, and race as predictors. I understand the numerical and the binary, but I am not clear on the nominal.
 Linda K. Muthen posted on Friday, January 26, 2001 - 11:34 am
The nominal variables should be turned into a set of dummy variables as in regular regression. So if you have three categories in the nominal variable, you will have two dummy variables.
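As a small illustration of the dummy coding described above, the following sketch turns a three-category nominal variable into two 0/1 columns. The helper `dummy_code`, the category labels, and the choice of reference category are all hypothetical; in practice the dummies would be created in the data set before it is read by Mplus.

```python
def dummy_code(values, reference):
    """Map a nominal variable to 0/1 dummy columns, omitting the reference category."""
    levels = sorted(set(values) - {reference})
    rows = [[1 if v == lev else 0 for lev in levels] for v in values]
    return levels, rows


# Hypothetical data: race with three categories, "white" as the reference
race = ["white", "black", "other", "black"]
levels, dummies = dummy_code(race, reference="white")
# levels names the two dummy columns; each row of dummies is one individual
```

Individuals in the reference category have zeros on every dummy, so each dummy coefficient is a contrast against that category.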
 Dirk Temme posted on Tuesday, January 15, 2002 - 5:29 am
As described in Example 25.4, by modifying the mixture approach appropriately it is possible to estimate a multinomial logistic regression model where the dependent variable (represented by the latent categorical variable c) is unordered categorical (such a model can be used, for example, to explain the choice of a specific product from a set of alternative products). As a result of the required "trick" we only get single estimates for the model parameters instead of class-specific estimates. At first sight it seems natural to use the observed latent class indicators u (e.g., measuring the observed choice of a specific product by individual i) as the dependent variable if one is interested in a "true" combination of the multinomial logistic regression model and the finite mixture approach. Unfortunately, it is assumed that the class indicators u are binary or ordered categorical. Therefore it seems that only the choice between two alternatives (e.g., product A and product B) can be explained in a true mixture application. Do you know of a different way to estimate a mixture multinomial logistic regression with multiple unordered categories?
 bmuthen posted on Thursday, January 17, 2002 - 9:17 am
Although we have not tried this, it seems that it is possible to do what you want using an approach with several latent class variables. One latent class variable corresponds to the unordered outcome - using the training data construction that is shown in Example 25.4. Say that this has 3 categories (call the variable u). The other latent class variable is the true latent class variable. Say that this has 2 categories (call the variable c). You then create a 6-class setup:

c=1: 1 2 3
c=2: 4 5 6

where columns correspond to the 3 categories of u. Here, class 3 should have its slope on x fixed at zero, while leaving the intercept free.
 Huang posted on Tuesday, February 18, 2003 - 12:18 pm
Dear Dr. Muthen,

I have one question on the interpretation of the output, and two questions on the model fitting.

First, in LCA/LCR, the output of the measurement part for one of the binary indicators (manifest variable) is described as follows:

FUTUR (w/i class 1)   value   s.e.    val/s.e.
Category 1            0.808   0.019   42.852
Category 2            0.192   0.019   10.168

How should I interpret this item response probability? What is the code for "category 1"? Is P(futur=1|class=1) equal to 0.808 or 0.192?

Second, when I fit the regression part of LCR, if some of the covariates are categorical, do I need to center them also? For example, gender and race are both binary covariates; which format is more appropriate in Mplus?

names are popul futur alc money sex race;
usev are popul futur alc money centsex centrace;
categorical=popul futur alc money;
c#1 c#2 on centsex centrace;

names are popul futur alc money sex race;
usev are popul futur alc money sex race;
c#1 c#2 on sex race;

Third, how do I make the last class (the reference class) into the one I want?

MODEL: (original model, which takes class-3 as the baseline in polytomous regression)
c#1 c#2 on centgrad centrace;
[popul$1*2 futur$1*2 alc$1*-1 money$1*2];
[popul$1*-1 futur$1*-1 alc$1*2 money$1*-1];
[popul$1*-3 futur$1*-.5 alc$1*-6 money$1*-.5];

If I want to use class-1 as the reference in the regression piece, is the following going to work? (I just switch the starting values of class 1 and class 3.)

c#1 c#2 on centgrad centrace;
[popul$1*-3 futur$1*-.5 alc$1*-6 money$1*-.5];
[popul$1*-1 futur$1*-1 alc$1*2 money$1*-1];
[popul$1*2 futur$1*2 alc$1*-1 money$1*2];

Thank you very much for your help and your time. Your information and suggestion will be appreciated. Have a great day!

 Linda K. Muthen posted on Tuesday, February 18, 2003 - 12:57 pm
Category 1 is 0. Category 2 is 1.

It is not necessary to center covariates. You might want to center the continuous ones but not the binary.

Use starting values to make the last class the one you want. What you suggest should work.
 Anonymous posted on Friday, July 25, 2003 - 1:16 pm
I'm curious about the formula used to compute the estimated class proportions when there are class predictors in the model. I tried looking this up in my categorical data analysis books, but they all gave only formulas for the predicted probability P(C|x), not the estimated unconditional probability P(C). I suppose this is because in most multinomial models, c is an observed variable, rather than latent.

If you can point me to the formula, I would be grateful. Thank you for your attention.
 bmuthen posted on Friday, July 25, 2003 - 6:59 pm
Without covariates, the probabilities of c have their own parameters, designated as the [c#] intercepts in Mplus. These probabilities are the same as the values obtained from the estimated posterior probabilities from the last stage of the ML iterations, computed for each class and each individual. Each class' probability is obtained by summing over the individuals' posterior probability values for that class and dividing by the sample size. You are right that with covariates, there is no designated parameter for the c probabilities. The estimated probabilities are, however, obtained in the same way via posterior probabilities as for the case without covariates.
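A minimal sketch of that computation, assuming the posterior probabilities from the final iteration are available as one row per individual. The function name and the numbers are hypothetical; the sum over individuals is divided by the sample size so that the result is a proportion.

```python
def class_proportions(posteriors):
    """Average each class's posterior probability over all individuals."""
    n = len(posteriors)
    n_classes = len(posteriors[0])
    return [sum(row[c] for row in posteriors) / n for c in range(n_classes)]


# Hypothetical posterior probabilities for 4 individuals and 2 classes
posteriors = [[0.9, 0.1], [0.2, 0.8], [0.6, 0.4], [0.3, 0.7]]
props = class_proportions(posteriors)
```

The same averaging applies with or without covariates; only the way each individual's posterior probabilities are obtained differs.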
 Anonymous posted on Sunday, October 19, 2003 - 4:28 pm
I am using mixture modeling to perform a multinomial logistic regression with an unordered polytomous observed dependent variable. I am especially interested in testing whether the estimates for two predictors in the same model are the same. I can easily run the model with and without the relevant constraints. P. 371 of the manual indicates that the log likelihood ratio for a given model can be used to compute the likelihood ratio chi-square for nested models (which, I believe I have). Is there a way to get this likelihood ratio chi-square printed in the output?
 Linda K. Muthen posted on Sunday, October 19, 2003 - 5:18 pm
You have to do two runs and then do a difference test using the two loglikelihood values.
 Patrick Malone posted on Monday, October 20, 2003 - 4:55 am
Hi, Linda.

Is there any chance in future versions of getting the capacity to test constraints in a single run? That would be a great convenience.
 Linda K. Muthen posted on Monday, October 20, 2003 - 6:10 am
I will certainly add it to my list.
 Anonymous posted on Monday, October 20, 2003 - 9:39 am
Regarding the constraint tests for the multinomial regression: I understand that I need to do the difference test. However, it's not clear what values need to be contrasted. For BOTH models (estimates freely estimated and estimates constrained to be equal), the same H0 loglikelihood value is given so I didn't understand your reply which indicated I should do a difference test of the likelihood values from the same model. (No other loglikelihood value appears, e.g., one for H1). I had thought there would be a way to obtain the likelihood ratio chi-squares for the two models for each model so that I can subtract one from the other. The Information Criteria (number of free parameters, AIC, BIC, and sample-corrected BIC) do differ for the 2 models. Although the models are nested, can I subtract the relevant Information Criteria values for a valid test of the equality constraint?
 Linda K. Muthen posted on Monday, October 20, 2003 - 9:48 am
2 times the loglikelihood difference gives you the chi-square difference. You should get a different H0 loglikelihood for each model. If you don't, please send your two outputs so I can see what the problem is. I don't know that the difference between information criteria values for the two models is a valid test of the equality constraint.
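The difference test can be sketched as follows. The loglikelihood values are hypothetical, the degrees of freedom equal the difference in the number of free parameters, and only df = 1 and df = 2 are handled so that the p-value can be computed with the standard library alone. The sketch assumes plain ML loglikelihoods; with a robust estimator a scaling correction would be needed, which is not shown here.

```python
import math


def lr_difference_test(loglik_restricted, loglik_free, df):
    """Chi-square difference test from the H0 loglikelihoods of two nested runs."""
    chi_sq = 2.0 * (loglik_free - loglik_restricted)
    if chi_sq < 0:
        raise ValueError("the less restricted model should have the larger loglikelihood")
    if df == 1:
        # Upper tail of a chi-square with 1 df, via the normal distribution
        p_value = math.erfc(math.sqrt(chi_sq / 2.0))
    elif df == 2:
        # Upper tail of a chi-square with 2 df has a closed form
        p_value = math.exp(-chi_sq / 2.0)
    else:
        raise ValueError("this sketch only covers df = 1 or df = 2")
    return chi_sq, p_value


# Hypothetical H0 loglikelihoods: constrained run vs. unconstrained run,
# with one equality constraint (df = 1)
chi_sq, p_value = lr_difference_test(-1234.5, -1231.0, df=1)
```

A small p-value would indicate that the equality constraint worsens fit significantly, i.e. that the two estimates differ.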
 Ken Wahl posted on Saturday, March 27, 2004 - 11:11 am
With a nominal variable, should it make any difference which value is used as reference? It seems that dummy parameters can drop in/out of significance when different reference values are used -- but maybe I'm not interpreting the results correctly. I'd hate to presume non-significance simply because I chose the wrong reference value.
 bmuthen posted on Saturday, March 27, 2004 - 11:37 am
The choice of reference category makes a difference - and it should because you are discussing different effects with different reference categories. You can change the reference category by changing your starting values.
 Carlos posted on Thursday, April 22, 2004 - 8:07 am
Hi Linda, Bengt,
I would like to use the results from a latent class model conducted in one sample, to obtain class probabilities (and eventually classify) people in a different sample. Do you have any utilities for doing this?

Otherwise, do you have any examples I could use, using Excel or other similar tools? Sometimes we use a discriminant function to do this, but we would like to use the results from the latent class model. An example with latent class indicators and covariates would be extremely helpful. The issue gets more complicated because some of our class indicators can be ordered or unordered indicators (although we try avoiding the latter).
 Linda K. Muthen posted on Thursday, April 22, 2004 - 9:50 am
What you would do is fix all of the parameters in the model to the values obtained from the first sample. Then run the analysis using the data from the second sample and asking for CPROBABILITIES in the SAVEDATA command.
 Carlos posted on Thursday, April 22, 2004 - 11:20 am
That's brilliant (and it makes perfect sense)!

Thanks so much.
 Patrick Malone posted on Wednesday, June 16, 2004 - 7:14 pm

I feel like I asked this a couple of years ago, but I can't find it.

I've got a four-class model I'm happy with. I'm regressing class membership (c#1, 2, and 3) on a dichotomous treatment predictor.

I'm wanting to know if the predictor significantly predicts membership in each of the classes, taken singly. That is, are treatment Ss more likely than control Ss to be in class 2.

The regression parameter, of course, gives me that in relation to the reference class, class 4. But I want the backdrop to be the aggregate of the other three classes -- class 2 vs. anything else.

Any suggestions? Thanks.
 bmuthen posted on Wednesday, June 16, 2004 - 8:57 pm
You can express the probability of being in each of the classes as a function of tx by the usual multinomial logistic regression expression. And through this you can get the sum of probabilities of being in all classes but class 2. So this way you can take the ratio you want and get the point estimate. But I think the log of this probability ratio is not a simple function of the regression coefficients, but a non-linear function of several coefficients, so to get the SE you have to use the Delta method. Correct me if I am wrong, someone.
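One way to sketch the Delta method suggestion is shown below. The contrast is taken here as the log odds ratio of class k versus all other classes combined for tx = 1 against tx = 0, which is one possible reading of the ratio discussed above. The gradient is approximated numerically. All function names, parameter values, and the covariance matrix are hypothetical; in practice the estimates and their covariance matrix would be taken from the model output.

```python
import math


def class_probs(intercepts, slopes, x):
    """Multinomial logit probabilities; the last class is the reference (logit 0)."""
    logits = [a + b * x for a, b in zip(intercepts, slopes)] + [0.0]
    m = max(logits)
    expo = [math.exp(l - m) for l in logits]
    total = sum(expo)
    return [e / total for e in expo]


def log_odds_ratio_class(theta, k, n_free):
    """log OR (tx=1 vs tx=0) of class k versus all other classes combined."""
    intercepts, slopes = theta[:n_free], theta[n_free:]

    def logit_k(x):
        p = class_probs(intercepts, slopes, x)[k]
        return math.log(p / (1.0 - p))

    return logit_k(1.0) - logit_k(0.0)


def delta_method_se(func, theta, cov, eps=1e-6):
    """SE of func(theta) as sqrt(g' V g), with g a central-difference gradient."""
    grad = []
    for i in range(len(theta)):
        up, down = list(theta), list(theta)
        up[i] += eps
        down[i] -= eps
        grad.append((func(up) - func(down)) / (2.0 * eps))
    n = len(theta)
    var = sum(grad[i] * cov[i][j] * grad[j] for i in range(n) for j in range(n))
    return math.sqrt(var)


# Hypothetical 4-class model: 3 intercepts followed by 3 slopes on tx
theta = [0.2, -0.5, 0.1, 0.4, 0.9, -0.3]
cov = [[0.04 if i == j else 0.0 for j in range(6)] for i in range(6)]


def contrast(t):
    return log_odds_ratio_class(t, k=1, n_free=3)  # class 2 vs. the rest


estimate = contrast(theta)
se = delta_method_se(contrast, theta, cov)
```

With only two classes the contrast reduces to the ordinary slope and the Delta method SE reduces to the slope's own standard error, which is a convenient check on the code.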
 Patrick Malone posted on Thursday, June 17, 2004 - 8:11 am
Thanks, I'll look into it. Yes, I've got the point estimate.
 Anonymous posted on Thursday, June 24, 2004 - 3:35 pm
I have a gmm with a dichotomous predictor of class membership (gender). There are no females in one of my classes and the regression coefficient associating this class with gender is 70.8. I also get an error message stating that this coefficient was fixed to avoid singularity. What should I do in this case?
 bmuthen posted on Friday, June 25, 2004 - 9:09 am
You can report this solution. The large fixed reg coeff simply means that the probability is 1 for being in this class when the dichotomous predictor is 1 as opposed to 0. The value 70.8 is arbitrary - any large value, say greater than 15 (it depends on the other coefficients' sizes), suffices to give probability 1.
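To see why the exact value is arbitrary, consider a simplified two-class case in which the only competing logit is the reference class at 0. This is an assumption made for illustration; with other large coefficients in the model the threshold would shift, as noted above.

```python
import math


def prob_vs_reference(logit):
    """P(class) when the only alternative is a reference class with logit 0."""
    return 1.0 / (1.0 + math.exp(-logit))


p15 = prob_vs_reference(15.0)   # already indistinguishable from 1 in practice
p70 = prob_vs_reference(70.8)   # the fixed value reported in the output
```

Both values are probability 1 for all practical purposes, so any sufficiently large fixed coefficient describes the same solution.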
 Anonymous posted on Monday, September 06, 2004 - 7:06 pm
I have a binary item (X) that I'd like to use as a predictor of class membership. However, I don't want to include it in the GGMM (for my current analysis, I don't want X to affect the latent class solution). Rather, I'd like to get the posterior probabilities of group membership for each class, estimate the weighted proportion on X for each class, and then compare them using a test for two proportions or some other appropriate test. Would this be an appropriate strategy to assess the effect of X on class membership given my need to not include X in the model?
 Anonymous posted on Monday, September 13, 2004 - 5:46 am
Is it possible to change the reference group for a 4 class model with covariates? The default is for class 4 to be the reference; however, I would like for the reference group to be class 1.
Thanks for your help.
 Linda K. Muthen posted on Wednesday, September 29, 2004 - 4:04 pm
Re: September 6 -- Because each individual is in each class, putting them in their most likely class only will result in estimation error. Furthermore, the standard errors are incorrect because class membership is taken as observed not inferred which will distort the test of proportions.

You might want to look at the paper by B. Muthen in the book edited by Kaplan which can be downloaded from the Mplus homepage. This discusses covariates affecting class membership. This is not always undesirable.
 Linda K. Muthen posted on Wednesday, September 29, 2004 - 4:06 pm
You can't change the reference group to be other than the last class, but you can use starting values to make sure the class you want as the reference class is the last class.
 Christian Geiser posted on Sunday, October 31, 2004 - 3:59 am
I have 2 questions concerning latent class analysis with covariates. I have a latent class variable (attachment style, 20 indicators) with 3 latent classes. In my model, the LC variable is predicted by 3 continuous latent variables (factors). My first question is, as my data is sparse (many indicators for the LC variable), the p-values of the Chi**2 tests provided in Mplus do not seem appropriate for the overall fit assessment of the model. What kind of procedure do you recommend for the fit assessment of such a model? E.g., is there a possibility to obtain a bootstrap p-value for the Chi**2 tests in Mplus (as is possible in other LCA computer programs)?

The second question concerns the multinomial logistic regression of the LC variable on the covariates. I understand from the output that my predictors significantly predict my LC variable. However, Mplus does not seem to provide any kind of effect size for the predictor model (like pseudo-R**2 as is available for multinomial regression for example in SPSS) that could be used to judge whether the effect is really meaningful. Do you know of any possibility to compute an effect size measure for such a model?
Thank you very much in advance!
 bmuthen posted on Monday, November 01, 2004 - 8:04 am
Model testing against data is difficult for LCA with covariates - chi-square testing against a frequency table is not possible given the continuous LCA covariates since that does not provide "grouped" data. I would recommend chi-square testing based on the log likelihood difference between nested models, e.g. with or without some direct effects from the covariates onto the indicators. Or in the case of testing k vs k-1 classes, where parameters are on the border of their admissible space, using the LMR test in Mplus (Tech11). The LMR test is a way to avoid the heavy computations of bootstrap p values.

I think a clear way to understand the estimated effects of covariates on the latent classes is to compute predicted probabilities for the classes at given covariate values. The Mplus plot facility can be used to do this automatically; see e.g. the Version 3 User's Guide, end of Chapter 13. I think that is more clarifying than any type of R-square statistic which I don't find as natural for non-continuous outcomes.
 Christian Geiser posted on Sunday, November 07, 2004 - 2:52 am
Thank you very much for your reply. The LMR test seems to be helpful. But is it possible to compute bootstrap p-values for the LR and Pearson X**2 tests in Mplus for LCA models without covariates?
 Christian Geiser posted on Thursday, November 11, 2004 - 2:00 am
Let me add to my previous question -- would it be possible via Monte Carlo simulation in Mplus? Thanks, Christian
 bmuthen posted on Sunday, November 14, 2004 - 11:59 am
Such bootstrapping is not available in the current Mplus. I think the Mplus Monte Carlo facility can only be of limited help here. You have to draw new samples with replacement and estimate your model for each such sample. You can submit a set of such new samples to Mplus using the "external" Monte Carlo facility and thereby get summaries where you can study the distribution of your test statistics in a Monte Carlo run of those samples.
 Christian Geiser posted on Friday, November 19, 2004 - 7:19 am
A paper by Langeheine, Pannekoek & van de Pol (1996) about parametric bootstrapping of goodness of fit measures made me think of the possibility to use the Mplus MC facility as they call the procedure they describe in that article "Monte Carlo bootstrap". In addition, the steps to be performed in this bootstrap ("assume the model is true - treat fitted proportions under the model as population proportions - draw samples from this multinomial distribution with known parameters - estimate the same model for these samples and assess G**2 (LR-test) for each sample - reject model if proportion of bootstrap G**2's that are larger than original G**2 is very small." [p.495]) sound very much like that should be possible with a MC simulation. Could you elaborate a bit why you think that this can not be done in Mplus? Again, thank you very much in advance.

Langeheine, Pannekoek, & van de Pol (1996). Bootstrapping goodness of fit measures in categorical data analysis. Sociological Methods & Research, 24(4), 492-516.
 bmuthen posted on Friday, November 19, 2004 - 11:12 am
I see - you are referring to parametric bootstrapping where you are drawing samples from an estimated model (I was thinking more in terms of non-parametric drawing with replacement from the raw data). Yes, the Mplus Monte Carlo facility can be used for drawing and analyzing such samples. You enter the estimated model as population parameters and draw samples from this across many replications. The Monte Carlo summaries give distributions for the resulting tests and estimates. But the LR test for a frequency table is not part of the summaries. It is only part of the output if you do "external" (as opposed to internal) Monte Carlo in Mplus. External is when you have generated data yourself outside Mplus and then send those data to the Mplus Monte Carlo facility for analysis and summaries.
 bmuthen posted on Friday, November 19, 2004 - 11:23 am
Note that to circumvent the limited output from an internal run, you can generate your data sets in a first internal run, save them, and then use them in an external run.
 Christian Geiser posted on Saturday, November 20, 2004 - 7:54 am
Are you sure? My (internal) MC output for a model with 3 latent classes contains means, SD's and the number of successful computations for the likelihood ratio chi**2 as well as for the Pearson X**2. Or am I missing something here?
Another question: How can I get the loglikelihood for the saturated latent class model in Mplus? Thanks, Christian
 bmuthen posted on Saturday, November 20, 2004 - 10:36 am
You are right; I forgot about this recent addition to Version 3. Regarding the saturated model logL, it is not separately reported but you could deduce it from the LR chi-square value and the logL for H0 which are both printed, since the LR chi-square is 2 times the logL difference between H1 and H0.
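The arithmetic can be written out in a couple of lines. The numbers are hypothetical, and the sketch assumes the printed LR chi-square was computed over all cells of the frequency table.

```python
def saturated_loglik(loglik_h0, lr_chi_square):
    """Solve LR = 2 * (logL_H1 - logL_H0) for the saturated-model logL_H1."""
    return loglik_h0 + lr_chi_square / 2.0


# Hypothetical printed values: H0 loglikelihood and LR chi-square
loglik_h1 = saturated_loglik(-2450.0, 36.0)
```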
 Christian Geiser posted on Sunday, November 21, 2004 - 7:42 am
Your own program can do more than you thought! Isn't that a nice surprise...!

Thanks for the information about the saturated model logL, I thought about that possibility too, but isn't it a problem for the computation you suggest that Mplus computes "corrected" chi-squares (my output tells me Mplus has deleted certain extreme values (cells) in the computation of chi-squares)? Thank you, Christian
 Linda K. Muthen posted on Sunday, November 21, 2004 - 9:22 am
We are continually surprised by the unanticipated power of Mplus :)

I think if you add UCELLSIZE = 0; to the ANALYSIS command, you will avoid having any cells deleted.
 Christian Geiser posted on Thursday, November 25, 2004 - 9:24 am
Linda and Bengt, thank you. You were extremely helpful. One last question: The MC outputs for my latent class models give me expected as well as observed proportions and percentiles for the fit statistics. Though I think the observed values should be interpreted, I am somewhat uncertain. Could you briefly explain the difference? Thank you, Christian
 bmuthen posted on Friday, November 26, 2004 - 6:31 am
The expected proportions in the first column simply indicate which percentage levels are reported, so you are right that you would be interested in the results in the observed proportion column, column two. So for instance, the value 0.05 in the first column simply indicates that it is this value that you want to get close to when looking at the second column value. And, analogously for the percentiles.
 Anonymous posted on Sunday, December 12, 2004 - 7:34 pm
Dear Bengt,

I am running a three-class mixture model with multinomial logit regression on the class membership. I know that the last group (c#3 in my case) is the default reference group. I would like to know how to change the reference group to other groups, say Class 2. Thanks in advance!

Y1 with Y2;
[Y1 Y2];

c#1 c#2 ON x1 x2 x3 x4;

Y1 with Y2;
[Y1 Y2];

Y1 with Y2;
[Y1 Y2];
 bmuthen posted on Sunday, December 12, 2004 - 7:38 pm
You have to do this by using starting values that make whatever class you want the last class. So you rerun your analysis giving some key starting values for the last class which you take from class 2 of your current solution.
 Anonymous posted on Sunday, December 12, 2004 - 9:54 pm
Thanks a lot for the suggestion. I have tried it and got some strange results. When there was no starting value, the correlation between two variables in one class was positive while it became negative when starting values were given.

All other fit statistics and parameter estimates were exactly the same except the mentioned correlation.
 Linda K. Muthen posted on Monday, December 13, 2004 - 6:42 am
Send the two outputs so I can see exactly what you are doing -- the output with no starting values and the output where you use the ending values as starting values.
 Judith Baer posted on Tuesday, October 25, 2005 - 11:24 am
We are trying to determine the significance level for the covariates in a LCA with categorical variables. For example

male -.963 .095 -10.135
Blk .432 .138 3.119
Age .382 .051 7.433

What is the meaning of .095 for males?

 Linda K. Muthen posted on Tuesday, October 25, 2005 - 12:00 pm
The columns of the output are described in Chapter 17 under Summary of Analysis Results. If the value .095 is in the second column of the results, it is the standard error of the parameter estimate -.963.
 Tan Teck Kiang posted on Tuesday, December 20, 2005 - 7:42 pm
We heard that Mplus version 4 will implement a bootstrap method for p-values in latent class analysis. Dayton mentioned that for sparse data chi-square and G2 differ and should not be used for model identification. Our data have 17 binary manifest variables and the output is as follows:

Class   Chi-square (p-value)   G2 (p-value)
2       289355 (0.0000)        57781 (1.0000)
3       176063 (0.0000)        44941 (1.0000)
4       95719 (0.0000)         36958 (1.0000)
5       90990 (0.0000)         31853 (1.0000)
6       71474 (0.0000)         27569 (1.0000)
7       60354 (1.0000)         24459 (1.0000)
8       43462 (1.0000)         22227 (1.0000)

I have tried using the Bootstrap option for LCA, and the chi-square and G2 p-values are the same as those provided by the standard output. Have I done it correctly just by specifying the bootstrap option? If not, how can it be done in the current version? Anyway, does bootstrapping help in addressing the p-value?

Based on AIC, BIC and adjusted BIC, these indices keep dropping up to 12 classes; we did not go beyond 12 classes because such a large number of classes is hard to explain. We plotted the 3 indices against the number of classes, and the drop in these indices seems to occur at a decreasing rate after 7 classes. Do we need to rely on information criteria alone, or should judgement other than statistical be used?

 bmuthen posted on Wednesday, December 21, 2005 - 7:30 am
Yes, the forthcoming Mplus version 4 will give a bootstrap-based p value for the likelihood-ratio test of k-1 versus k classes. In the current version of Mplus, however, bootstrapping is only for standard errors of parameter estimates and does not help in determining the number of classes. At this point, BIC drop and the Tech11 Lo-Mendell-Rubin test can be used to help decide on the number of classes - taken together with substantive considerations.
 Kate Degnan posted on Tuesday, January 10, 2006 - 5:02 am
I am using the mixture modeling to perform a latent profile analysis. I have found interactions to predict the probabilities of membership in the profiles. Since the output gives logit information is it correct to assume that I would interpret/calculate the interaction in the same way I would for a regular logistic regression?
 bmuthen posted on Tuesday, January 10, 2006 - 8:50 am
 Michael P. posted on Monday, March 20, 2006 - 1:26 pm
Dear Drs. Muthen,

I have two questions regarding multinomial mixture.

1) If you have h latent classes, you can compute the probability of
class membership for only h-1 classes using covariate
values. Is it possible to compute this probability for the last
class? I know that in a case without covariates c#h can be computed
from the given thresholds of the other classes.

2) Class membership probability uses both thresholds and slopes,
but what if some slopes are not significant? Can you leave those
out and compute the probability only with the significant slopes?

Thanks for any response.
 Linda K. Muthen posted on Monday, March 20, 2006 - 2:19 pm
See Chapter 13 where calculating probabilities from logistic regression coefficients is described. It shows how to compute probabilities for all classes.

If a slope is part of the estimated model, it must be used to compute the probability whether it is significant or not.
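As a hedged illustration of the usual multinomial logistic expression, the sketch below computes the probabilities of all classes, including the last one, whose intercept and slopes are fixed at zero. The function name, coefficients, and covariate values are hypothetical, and every estimated slope is used regardless of its significance.

```python
import math


def class_probabilities(intercepts, slopes, x):
    """Class probabilities given covariates x.

    intercepts[k] and slopes[k][j] belong to the non-reference classes;
    the last (reference) class has intercept and slopes equal to zero.
    """
    logits = [a + sum(b * v for b, v in zip(bs, x))
              for a, bs in zip(intercepts, slopes)]
    logits.append(0.0)                      # reference class
    m = max(logits)                         # subtract the max for numerical stability
    expo = [math.exp(l - m) for l in logits]
    total = sum(expo)
    return [e / total for e in expo]


# Hypothetical 3-class model with two covariates
probs = class_probabilities(
    intercepts=[0.5, -0.2],
    slopes=[[0.8, -0.1], [0.3, 0.4]],
    x=[1.0, 2.0],
)
# probs[2] is the probability of the last class: 1 / (1 + sum of exp(logits))
```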
 Sorrel Stielstra posted on Thursday, June 22, 2006 - 8:06 am
I understand that starting values can be changed to control which group is the last class and that models can be run multiple times so that different classes can be used as the reference group in a multinomial logistic regression. However, in some instances all groups can be conveniently compared without running multiple models because "alternative parameterizations" are automatically provided in the output. This helpful element disappears when I mention the variances of variables in the model command. Is there any way to retain the "alternative parameterization" function so that all groups can be compared in just one output?
 Bengt O. Muthen posted on Thursday, June 22, 2006 - 9:53 am
I think the option is unavailable when numerical integration is involved given that such models are more computer intensive, and I don't think there is a workaround here short of rerunning with different starting values.
 Annie Desrosiers posted on Tuesday, November 07, 2006 - 12:23 pm

I have a question about LCA.
I found 4 latent classes in a model using 9 nominal variables.
Each variable has 2 or 3 response choices.
I want to know if it's possible with Mplus to identify, for each class, which response choices explain the class.
I want to be able to say that people in class # 1 choose category 2 on the first variable and people in class # 2 choose category 1.

Thank you

variable: name are id sexe y1-y9;
usevariables are y1-y9;
auxiliary = id;
classes = c(4);
nominal = y1-y9;
missing = . ;

analysis: type = mixture missing;
 Linda K. Muthen posted on Tuesday, November 07, 2006 - 1:34 pm
In your results, you will get a mean for each class for all but the last category of the nominal variable. You can turn these means into probabilities.

p = 1 / (1 + exp(-L)), where

L = means in the output
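The expression above is the two-category case. For a nominal variable with three categories, a natural generalization (stated here as an assumption, with the last category as the reference whose mean is fixed at zero) is the multinomial logit form sketched below. The function name and the numbers are hypothetical.

```python
import math


def nominal_category_probs(means):
    """Turn the reported means (logits) of a nominal variable into probabilities.

    means holds one value per category except the last, which is the
    reference category with logit 0 (an assumption of this sketch).
    """
    logits = list(means) + [0.0]
    m = max(logits)
    expo = [math.exp(l - m) for l in logits]
    total = sum(expo)
    return [e / total for e in expo]


# Hypothetical class-specific means for a 2-category and a 3-category variable
two_cat = nominal_category_probs([1.2])
three_cat = nominal_category_probs([1.2, -0.4])
```

With a single mean the first probability equals 1 / (1 + exp(-L)), so the two-category formula is recovered as a special case.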
 Annie Desrosiers posted on Wednesday, November 08, 2006 - 4:26 am
Thank you!

I have some value at 15.000 with a S.E. at 0.000.
What does that mean?
Can I use it?

 Linda K. Muthen posted on Wednesday, November 08, 2006 - 9:22 am
It means that values became large and were fixed. This should be stated in the output. Yes, you can use it.
 Annie Desrosiers posted on Tuesday, November 14, 2006 - 8:44 am
Hi Linda, I have another question:

Can I do multinomial logistic regression on nominal variables?
I tried it and the output says that I can't have nominal variables on the right-hand side of ON.

Thank You
 Linda K. Muthen posted on Tuesday, November 14, 2006 - 10:16 am
The NOMINAL option is for dependent variables. If you have an unordered categorical variable that you want to use as a covariate, you need to create a set of dummy variables as in regular regression. If you want it to be a dependent variable, it should be on the left-hand side of ON. See Example 3.6.
 Andy Ross posted on Tuesday, March 20, 2007 - 6:32 am
Dear Prof. Muthen

I have calculated estimates of the effect of a covariate within EACH of the latent classes rather than for m-1 classes by comparison to a 'reference' class (i.e. standard multinomial regression coefficients). However, I am stuck as to what to call them, and am also looking for a reference for this approach.

This is what I have done:
I have calculated probabilities from the logistic regression coefficients as outlined in chapter 13, for a LCA with four classes and 13 covariates - using the MODEL CONSTRAINT command to estimate these new parameters.

For example I have calculated the probabilities for membership of each of the latent classes for women, holding all other predictors at their mean, and have done the same for men. I have then calculated the ratio of these two probabilities: prob(women)/prob(men) in each of the classes, giving me a kind of odds (?!?) associated with being female by reference to being male in each of the latent classes

I also noted that I can get back to the original multinomial odds ratios by taking the ratio of the new estimate for class one (for example) and the last class.

However, whilst these give me a clear indication of the effect of each covariate in each of the classes without comparison to a reference class, I have no idea what to call them.

Could you please advise me.

With many thanks

 Linda K. Muthen posted on Tuesday, March 20, 2007 - 9:00 am
It sounds like you have used the results from a multinomial logistic regression to create probabilities for both males and females and then created odds from those probabilities so that the odds represents the odds of being male versus female in a certain class conditioned on the other covariates. I don't think this has a special name. I would refer to it as an odds.
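The calculation described in this exchange can be sketched numerically. The intercepts and slopes below are made up (they are not from any posted output), the model has one binary covariate instead of 13, and `class_probs` is an illustrative helper. The check at the end reproduces the observation that dividing a class's probability ratio by the last class's ratio gives back the usual multinomial odds ratio.

```python
import math

def class_probs(intercepts, slopes, x):
    """Multinomial-logit class probabilities; the last class is the reference (logit 0)."""
    logits = [a + b * x for a, b in zip(intercepts, slopes)] + [0.0]
    denom = sum(math.exp(l) for l in logits)
    return [math.exp(l) / denom for l in logits]

# Hypothetical estimates for a 4-class model with one binary covariate (female = 1).
intercepts = [0.4, -0.2, 0.1]
slopes = [0.7, -0.5, 0.3]

p_women = class_probs(intercepts, slopes, 1)
p_men = class_probs(intercepts, slopes, 0)

# Probability ratio (women / men) within each class, including the reference class.
ratios = [pw / pm for pw, pm in zip(p_women, p_men)]

# Dividing by the last class's ratio recovers the usual odds ratios exp(slope).
recovered = [r / ratios[-1] for r in ratios[:-1]]
for k, (rec, b) in enumerate(zip(recovered, slopes), start=1):
    print(f"class {k}: recovered {rec:.4f}, exp(slope) {math.exp(b):.4f}")
```

Each per-class quantity is a ratio of two probabilities, so unlike the multinomial odds ratios it depends on the values at which the other covariates are held.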
 Andy Ross posted on Tuesday, March 20, 2007 - 9:26 am
Excellent - many thanks for that. Do you happen to know of a good reference for this approach?

 Linda K. Muthen posted on Tuesday, March 20, 2007 - 9:50 am
I think the Agresti book is good for categorical outcomes.
 Andy Ross posted on Wednesday, March 21, 2007 - 4:35 am
Many thanks again.
 Yoel Greenberg posted on Monday, September 24, 2007 - 11:43 pm
I am conducting research that tries to observe changes in musical style over time (1680-1750). Each characteristic I am checking receives a 0 or 1 value per musical work, and in unclear cases, an interim value is used (in no fields are there more than 2 interim values). I understand I should use ordinal multinomial logistic regression, but I have not managed to get SPSS to plot it on a graph. How can that be done?
Please take into account that I am a novice in statistics...
 Linda K. Muthen posted on Tuesday, September 25, 2007 - 6:11 am
I would not know how to do this in SPSS as I don't use that program. You should try directing your question to SPSS support.
 Yoel Greenberg posted on Tuesday, September 25, 2007 - 6:14 am
Are there any programs you can recommend that do allow such an option?
 Linda K. Muthen posted on Tuesday, September 25, 2007 - 6:17 am
 Andrea Vocino posted on Sunday, April 06, 2008 - 11:33 pm
Can Mplus estimate mixed logit models?
 Linda K. Muthen posted on Monday, April 07, 2008 - 9:04 am
If by mixed you mean a mixture model, yes. If by mixed you mean a multilevel model, yes.
 Andrea Vocino posted on Tuesday, April 08, 2008 - 6:12 am
I was referring to

McFadden, D., Train, K., 2000. Mixed MNL models for discrete response. Journal of Applied Econometrics 15, 447–470.

but I believe your answer is still yes.

I was also wondering whether it is possible to run a model with exogenous latent variables [metric continuous] and an endogenous variable given by polytomous discrete choices. In other words, can Mplus run a discrete choice model with latent variables?
 Linda K. Muthen posted on Tuesday, April 08, 2008 - 1:26 pm
A multinomial regression with latent variables as covariates can be estimated in Mplus.
 Mohini Lokhande posted on Wednesday, August 20, 2008 - 10:42 am
I would like to compare three groups in their delinquency development (3 classes) and associated covariates. For this, I use a model as presented in Example 8.8.

However, I only get one overall result for the multinomial logistic regression, not three logistic regression analyses, one for each group, as I expected (the same as in the example output on this homepage). Is it possible to get one result for each subgroup? I would like to compare the regression coefficients between the groups.

A second question concerns the reference class in multinomial logistic regression analysis. Despite giving the "right" starting values for the intercept and slope of the 3 classes, I do not always get the classes in the order I need for the logistic regressions. Are there further suggestions for how to deal with this issue?

Thank you very much for your support in advance!
 Linda K. Muthen posted on Wednesday, August 20, 2008 - 3:42 pm
In the multinomial logistic regression of a categorical latent variable on a set of covariates, the last class is the reference class. This regression cannot vary across classes. The covariates explain the classes. When there are more than two classes, Mplus gives the results with each class as the reference class.
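The link between results under different reference classes is simple arithmetic on the logit coefficients: the slope for class k against a new reference class r is the difference between the two slopes against the old reference. A minimal sketch with invented slopes (`rereference` is a hypothetical helper, not an Mplus feature):

```python
def rereference(slopes, new_ref):
    """Re-express multinomial-logit slopes against a different reference class.

    `slopes` holds one slope per class, with the current reference class at 0.
    """
    return [b - slopes[new_ref] for b in slopes]

# Hypothetical slopes for classes 1-3, with class 3 (index 2) as the reference.
slopes_ref3 = [0.8, -0.3, 0.0]

# The same model expressed with class 1 (index 0) as the reference.
slopes_ref1 = rereference(slopes_ref3, new_ref=0)
print(slopes_ref1)
```

The class 3 versus class 1 slope is just the class 1 versus class 3 slope with its sign reversed, which is why the covariate effects themselves do not vary across classes.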

If you have further questions, send your files and license number to
 claudio barbaranelli posted on Sunday, February 15, 2009 - 3:14 am
Dear Linda, Dear Bengt,
I have a model with 2 latent continuous variables and a nominal (3 unordered categories) dependent variable. I regressed the nominal variable on the two latent variables and got my results. Now I want to do a multiple group analysis on the same data (I have two different groups), but when using the usual syntax for multiple group models I get this error message:

*** ERROR in ANALYSIS command
ALGORITHM = INTEGRATION is not available for multiple group analysis.
Try using the KNOWNCLASS option for TYPE = MIXTURE.

this is my input file:

TITLE: modello 2
DATA: FILE IS sem_2.dat;
NAMES ARE gruppo dasis1
dasis2 diff difft compeff;
nominal ARE compeff;

GROUPING IS gruppo (1 = IND 2 = DEC);

att_imp by dasis1* dasis2 (1);
att_esp by diff* difft (2);

dasis1 dasis2 (3);

att_imp @1;
att_esp @1;

compeff ON att_imp att_esp;

compeff ON att_imp att_esp;

output: samp stdy stdyx ;


Thanks in advance
 Linda K. Muthen posted on Sunday, February 15, 2009 - 10:55 am
A mixture model with one categorical latent variable for which the classes are known is the same as a multiple group analysis. See Example 8.8. Use only the KNOWNCLASS variable.
 Mary Mitchell posted on Tuesday, November 03, 2009 - 12:09 pm
Dear Linda and Bengt,
I would like to regress class membership on covariates in my model, but seem to be running into problems. I tried using "c on gender," but the proportions of my classes change wildly. I tried "auxiliary(r)=gender," but Mplus doesn't seem to recognize this (can you only use auxiliary(e)?). I also tried auxiliary=gender, but gender doesn't show up in the output anywhere. The manual suggests using the SAVEDATA option, but I'm unclear how this might be used. What I really want are estimates with odds ratios saying that males have twice the odds of belonging to class 1 as compared to class 2.
 Linda K. Muthen posted on Tuesday, November 03, 2009 - 12:34 pm
If the classes change when you regress the categorical latent variable on a covariate, this indicates that there are most likely direct effects between the latent class indicators and the covariate. The following paper which is available on the website discusses this:

Muthén, B. (2004). Latent variable analysis: Growth mixture modeling and related techniques for longitudinal data. In D. Kaplan (ed.), Handbook of quantitative methodology for the social sciences (pp. 345-368). Newbury Park, CA: Sage Publications.
 Mary Mitchell posted on Wednesday, November 04, 2009 - 7:18 am
Dear Linda,

Thanks for the quick turn-around time on your response! I reviewed the article and then found example 7.12, which gives me the code for modeling direct effects of covariates on indicators.

 Anjali Gupta posted on Saturday, November 14, 2009 - 8:36 am

In my path models with continuous outcomes, the output includes a 'with' measure of correlation for the dependent variables.

If my path model has one binary dependent variable, I don't get that correlation, probably due to the binary variable.

Is it possible to get some parallel measure for logistic path models?

I've illustrated all my models for my dissertation - and it would be nice if they all contained the same amount of information (to avoid concern).
 Bengt O. Muthen posted on Saturday, November 14, 2009 - 8:56 am
So you are asking about a model with several DVs, one of which is binary. The residual covariance (WITH) appears by default for the WLSMV estimator because then that pertains to the underlying multivariate normal DVs (so analogous to having continuous observed DVs). With ML logit, however, there is not such a natural underlying multivariate distribution. With an extra twist, you could do ML probit and add residual covariance by defining a factor

f BY DV1 DV2;
f@1; [f@0];

where the second loading carries the information about the residual covariance for DV1 and DV2. This can also be used in the ML logistic context.
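Why the second loading can be read as the residual covariance follows from the covariance a single shared factor induces between two variables, which is loading1 * loading2 * Var(f). The sketch below assumes the first loading is fixed at 1 (the usual default for BY) and uses a made-up value for the second loading; for a binary DV the quantity lives on the scale of the underlying linear predictor, not the observed 0/1 variable.

```python
def implied_covariance(loading1, loading2, factor_variance):
    """Covariance between two DVs induced by one shared factor."""
    return loading1 * loading2 * factor_variance

# Assumption: "f BY DV1 DV2;" fixes the first loading at 1, and "f@1;" fixes Var(f) at 1.
lambda2 = 0.35  # hypothetical estimate of the free (second) loading
cov_12 = implied_covariance(1.0, lambda2, 1.0)
print(cov_12)
```

With both the first loading and the factor variance at 1, the product collapses to the second loading itself.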
 Anjali Gupta posted on Saturday, November 14, 2009 - 9:28 am

Thank you for the quick reply. I entered all of the syntax you provided (not sure that's correct).

"f BY cesd03 afdc_03;

f@1; [f@0];"

And it ran - but the coefficients predicting to my binary DV changed quite a bit. Maybe I can 'ignore' that and still obtain the needed statistic?

And I'm not sure where in the output I'd find the needed covariance.

Thank you.
 Bengt O. Muthen posted on Saturday, November 14, 2009 - 10:01 am
The loading for afdc_03 on f is the covariance estimate. If it is significant, this residual covariance should be in the model and you would have to use the new estimates for the binary DV coefficients. If the residual covariance is significant, the model without it is misspecified and should not be interpreted.

Similarly, if you run this using estimator = WLSMV you can test if that WITH should be in the model and you can see how those binary DV coefficients change.

Note that you want to include these residual covariances among all your DVs.
 Bengt O. Muthen posted on Saturday, November 14, 2009 - 10:49 am
You should also check that the new factor f does not correlate with any other variables in your model - there should be no estimated

f WITH...

in your output. If you have any, fix them at zero.
 Arina Gertseva posted on Thursday, September 30, 2010 - 8:54 am
I am in the process of running latent growth mixture models to identify distinct trajectories of change in crime rates from 1981 to 2006 at the county level. I consider four crimes: homicide, aggravated assault, robbery, and simple assault. Each model takes between 70-160 hours to run.
After the distinct trajectory groups are identified, I would like to explore the extent to which trajectory groups differ in terms of time-variant covariates (population change, changes in residential mobility, poverty change, etc). What is the best way to examine the impact of these time-varying characteristics within the framework of mixture modeling?
Would it be appropriate to use a percent change in population size, poverty rate, and residential mobility between 1981 and 2006 as a time-invariant covariate for explaining the change in crime rate between 1981 and 2006?
 Bengt O. Muthen posted on Friday, October 01, 2010 - 8:28 am
In the beginning of your message you say "time-variant covariates", which implies that the covariate changes over the time period that you consider and at each time point influences the outcome at that time point. If this is what you mean, I don't see how such a variable can be related to trajectory class membership which is something that is stable over time.

At the end you say "time-invariant covariate", so either you are then talking about another matter, or that's what you meant to begin with.
 Arina Gertseva posted on Monday, October 04, 2010 - 10:15 pm
I apologize for an unclear question. The initial idea was to incorporate several “time-variant” covariates and examine the extent to which they explain the level of crime rate at each time point (overall 25 time points) for each latent class. However, I am afraid that this approach will give me too much information about each latent class, and I will be unable to draw meaningful conclusions about what makes the latent classes different.
Modeling changes in each of the selected covariates (I have 10 covariates) over time and relating them to changes in crime rates across latent classes seems an even more difficult task.
At the same time, the idea of substituting “time-variant” covariates with “time-invariant” (measured just once in time) seems insufficient for accounting for almost three decades of data on crime rate.
That is why I was thinking about a way to account for change(s) in my covariates but without complicating the models too much. For example, one of the variables I will be including is the change in population over time. Is it appropriate to include the relative population change in size of counties between two pairs of data points (1981-2006) to predict the class membership and/or the rate of change in crime rate for each latent class?
 Bengt O. Muthen posted on Tuesday, October 05, 2010 - 9:11 am
I think incorporating time-varying covariates is important in studying crime rates. I don't think it will hinder your growth mixture modeling. But you should let the time-varying covariates influence the outcomes, not the latent class membership. You are right, however, that modeling with time-varying covariates involves more complex interpretations than using only time-invariant covariates, and that is true even in a regular, single-class, growth model.

As an example of the latter, you are asking if a change in a covariate between two time points can be specified to influence the rate of change in crime rate. So a time-varying covariate influencing the slope growth factor. That seems complex. I have worked with a model where a time-varying covariate influences the intercept (level) growth factor, but not the slope. I wonder if you could apply a piece-wise model with pieces determined by when large changes in pop size occur. This can be applied also with mixtures, not having pop size influencing class membership, but focusing on the influence on the slope which is a within-class entity.
 Bengt O. Muthen posted on Tuesday, October 05, 2010 - 10:06 am
A possibility would be to score county pop. size (p) change as the time-varying covariate x_t = p_t - p_t-1 and let x_t influence the outcome y_t. This means that the growth factors are interpreted as the crime curve for zero change in county pop. size (zero values of the time-varying covariate).

This can be used for single-class as well as mixture growth, where for mixtures the influence of x_t on y_t can be different in different classes if you want.
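The change scoring x_t = p_t - p_t-1 can be sketched as follows. The population figures are invented, and `change_scores` is an illustrative helper; how to treat the first time point, which has no previous value, is left open here.

```python
def change_scores(series):
    """Return x_t = p_t - p_(t-1) for t = 2..T (the first time point has no change score)."""
    return [curr - prev for prev, curr in zip(series, series[1:])]

# Hypothetical county population at five consecutive time points.
population = [10200, 10350, 10350, 10100, 10600]
x = change_scores(population)
print(x)
```

A value of zero for x_t then marks a period with no population change, which is the condition under which the growth factors describe the crime curve.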
 Johannes Bauer posted on Tuesday, October 12, 2010 - 8:01 am

I am trying to fit a sequence of latent profile analyses (LPA) with covariates that predict latent class membership:

* Model 1 is the LPA without covariates (3 Classes).
* Model 2 includes one continuous and two binary observed variables as covariates of the LPA (age, gender, school type).
* Model 3 adds a continuous latent factor as predictor of class membership (competence).

Here are my questions:

(1) As far as I have read in another thread, there are no global fit indices to judge the fit of models 2 and 3 (right?). I could compare the log-likelihoods and information criteria of the models, but would that be meaningful? I don't think so, because Model 3 is much more complex than Model 2 since it involves an additional latent variable.

(2) Is it possible and meaningful to calculate a measure of explained variance for Models 2 and 3 (e.g. Nagelkerke)? If yes how could I do that? I know that I would need the likelihood of an intercept-only model, but how do I get this?

Many thanks in advance
 Linda K. Muthen posted on Wednesday, October 13, 2010 - 12:26 pm
1. Chi-square and related fit statistics are not available for your models. Models are nested if they contain the same set of observed dependent variables.

2. I'm not sure how you would do this.
 Finbar posted on Tuesday, October 19, 2010 - 6:22 am
Dear Bengt and Linda,
I have a multinomial logistic model with a DV containing 5 groups, including the reference group. When looking at the modification indices it does not tell me which of the four groups is being referred to. For example,
Y# ON...
Y# ON...
Y# ON...
Where I get the significant results for three of the four outcome groups. Is there a way to find out which outcome group is being referred to? If there were four groups, I would imagine that they are in order of 1-4.
For my regression results, it does not give the outcome # either, but they are in order of 1-4.
Thank you
 Linda K. Muthen posted on Tuesday, October 19, 2010 - 8:07 am
The last category of the nominal variable in a multinomial logistic regression is the reference category. I think you need to shorten your variable name so the # is printed. They are printed from lowest to highest.
 Syed Noor posted on Thursday, February 03, 2011 - 10:29 pm
Hi, I have 2 questions regarding LCA with Covariates..

I have identified 3 latent classes (C) using 5 predictors (u). Now I want to run multinomial regression to identify socio-demographic variables that are associated with class membership.

1. One of my covariates marital status has three categories. But when I look at the OUTPUT I am seeing only one odds ratio for c#1 vs C#3. Am I looking at the wrong output? If yes then,

2. Which part of the output will give me the odds ratio with 95% CI for C ON x1 x2 x3.

Thanks in advance
 Linda K. Muthen posted on Friday, February 04, 2011 - 6:13 am
You should create two dummy variables for marital status.

Use the CINTERVAL option to obtain confidence intervals.
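If the output gives a logit slope and its standard error, an odds ratio with a Wald-type 95% interval can be obtained by exponentiating the estimate and the interval limits. The numbers below are hypothetical and `odds_ratio_ci` is an illustrative helper; this is a sketch of the arithmetic, not a description of what any particular output section prints.

```python
import math

def odds_ratio_ci(estimate, se, z=1.96):
    """Odds ratio and Wald-type interval from a logit coefficient and its S.E."""
    lower = math.exp(estimate - z * se)
    upper = math.exp(estimate + z * se)
    return math.exp(estimate), lower, upper

# Hypothetical slope for one marital-status dummy in the c#1 regression (vs. the last class).
or_, lo, hi = odds_ratio_ci(estimate=0.693, se=0.250)
print(f"OR = {or_:.2f}, 95% CI [{lo:.2f}, {hi:.2f}]")
```

With two dummy variables for marital status there are two such odds ratios per class contrast, one for each non-reference category.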
 Syed Noor posted on Friday, February 04, 2011 - 12:41 pm
Thank you, Linda..Thank you very much.

 Syed Noor posted on Saturday, February 05, 2011 - 5:50 pm
Hi Linda,

Without covariates,

            2 class   3 class
BIC         7182      7163
p (LMR)     0.00      0.001
entropy     0.88      0.753

With covariates,

            2 class   3 class
BIC         7141      7105
p (LMR)     0.00      0.771
entropy     0.889     0.814

My understanding is that without covariates the 3-class solution fits the data better (lower BIC), but with covariates the 2-class solution fits the data better (p-value for LMR).

And I think final decision should be made with covariates in the model. What do you suggest?

Another thing: the 3-class model with covariates gives a very wide 95% CI for some covariates.

I think there are some direct effects from x to u but I am not sure how to model that. Any suggestion?

Thanks in advance.

 Linda K. Muthen posted on Sunday, February 06, 2011 - 10:49 am
It seems BIC chooses the three-class model in both cases. Direct effects are specified as

u ON x;
 Sanjoy Bhattacharjee posted on Friday, February 03, 2012 - 1:03 pm
Profs. Muthen,

Can we estimate the Random Parameter Logit, or Mixed Logit as coined by Prof. K. Train, in Mplus (mine is V4.0 + mixture add-on) when the coefficients associated with the covariates follow distributions other than the standard normal? For example, we want to estimate the following model:

Y = g(alpha + beta1*X1 + beta2*X2)
Y is binary (0 and 1)
g(.) is a logit link
alpha follows standard normal
beta1 follows truncated normal
beta2 follows triangular distribution

Thanks and regards
 Bengt O. Muthen posted on Friday, February 03, 2012 - 5:16 pm
You can let the coefficients have a normal distribution, but not the other two distributions.
 Linda Breeman posted on Tuesday, March 13, 2012 - 1:09 am

I have a question regarding the output of a multinomial regression analysis, where I regressed latent class membership (3 classes) on some characteristics. By default, the last class (class 3) is the reference group, and in the output I can find the estimates and associated p-values. However, beneath them there is also the heading “ALTERNATIVE PARAMETERIZATIONS FOR THE CATEGORICAL LATENT VARIABLE REGRESSION”, where I can find the output when another class is used as the reference group. When comparing the same classes, the estimates are the same in value but with reversed sign; however, the S.E.s are different and therefore the p-values are different. How is this possible?
 Linda K. Muthen posted on Tuesday, March 13, 2012 - 8:27 am
Please send the output and your license number to
 Julia Lee posted on Wednesday, April 11, 2012 - 1:58 pm
I am using LPA with multinomial logistic regression. Is there a reason why the magnitude of the estimates for one class is so much greater than for the others (4 classes vs. one referent)? Can local maxima cause this kind of result? I appreciate your response.

For example:
C#1 ON
LSF 0.115 0.436 0.265 0.791
P -1.261 1.301 -0.969 0.332
RAN 0.134 0.543 0.246 0.805
PM 0.045 0.507 0.089 0.929
OL -0.939 0.478 -1.964 0.049
SSRS -1.313 0.673 -1.949 0.051
SWAN 0.102 0.327 0.313 0.754
SPEECE 1.035 2.897 0.357 0.721
GENDER 0.253 0.654 0.387 0.699
FRL 1.299 2.212 0.587 0.557
TIER 123.064 379.908 0.324 0.746

C#2 ON
LSF -3758.058 1649.341 -2.279 0.023
PA -2112.290 823.188 -2.566 0.010
RAN -228.478 124.362 -1.837 0.066
PM -243.467 125.532 -1.939 0.052
OL -2029.686 851.741 -2.383 0.017
SSRS 1080.844 468.392 2.308 0.021
SWAN -2887.610 1144.011 -2.524 0.012
SPEECE 1519.415 552.779 2.749 0.006
GENDER 1468.333 674.166 2.178 0.029
FRL 1200.846 438.840 2.736 0.006
TIER 9583.447 3806.868 2.517 0.012
 Linda K. Muthen posted on Wednesday, April 11, 2012 - 3:57 pm
The probabilities in class 2 must be much smaller than those in the reference class. I don't think this is a local solution. Did you replicate the best loglikelihood several times?
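What coefficients of this size imply for the class probabilities can be checked numerically. The logits below are invented for a single case, with -3758 chosen only to mimic the magnitude of the C#2 estimates; `stable_softmax` is an illustrative helper that subtracts the largest logit before exponentiating so that extreme values do not overflow.

```python
import math

def stable_softmax(logits):
    """Class probabilities from logits, computed stably by subtracting the largest logit."""
    top = max(logits)
    weights = [math.exp(l - top) for l in logits]
    total = sum(weights)
    return [w / total for w in weights]

# Hypothetical logits for classes 1..5 for one case (reference class last, logit 0).
logits = [0.1, -3758.0, 0.5, -0.2, 0.0]
probs = stable_softmax(logits)
print([round(p, 4) for p in probs])
```

A logit that far below the reference class gives a class-2 probability that is numerically zero for this case, so very large estimates say that the covariates separate class 2 from the reference class almost perfectly.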
 Julia Lee posted on Wednesday, April 11, 2012 - 4:24 pm
Hi Linda,
You are right! The probabilities in class 2 are smaller than those in the reference class. Is there a way to solve this problem?



1 156.71638 0.30080
2 28.44918 0.05460
3 113.43462 0.21772
4 155.98498 0.29940
5 66.41484 0.12748
 Julia Lee posted on Wednesday, April 11, 2012 - 4:27 pm
Hi again Linda,
To answer your question, the best loglikelihood (i.e., the first LL on the list) was not replicated, while the second-best loglikelihood was replicated 19 times. Is that the cause of the problem? This is what I used for the starts. Thanks again.
STARTS 800 40;
 Linda K. Muthen posted on Wednesday, April 11, 2012 - 6:05 pm
Try STARTS = 2000 500. If you don't replicate the best loglikelihood, you have hit a local solution.
 Jennifer Buckley posted on Friday, April 20, 2012 - 6:52 am
I've calculated predicted probabilities of class membership from a multinomial logistic regression model with latent classes on binary covariates. I was wondering how I get confidence intervals to examine if the differences in probabilities between groups are significant?
 Linda K. Muthen posted on Friday, April 20, 2012 - 8:36 am
If you specify the predicted probabilities in MODEL CONSTRAINT, you will get a standard error which can be used in computing confidence intervals. You can also ask for them by using the CINTERVAL option of the OUTPUT command.
 Mario Mueller posted on Tuesday, June 12, 2012 - 1:56 am
Dear Linda,
I ran a multinomial logistic regression with continuous and binary predictors and continuous mediators. I used the MODEL CONSTRAINT option to specify the indirect effects. The output, however, produced equal estimates for each category of the outcome. Do you have any idea why?

Thanks, Mario
 Linda K. Muthen posted on Tuesday, June 12, 2012 - 8:51 am
Please send the output and your license number to
 Zsofia Ignacz posted on Tuesday, June 19, 2012 - 11:18 am
Dear Professors Muthén,

I would like to constrain two of my regression coefficients to be equal in my multinomial logistic regression. I have managed to make the desired coefficients equal; however, they are now also equal across categories and not just within categories. How should I modify the following syntax in order to have different estimates between categories, but equal estimates within categories?

nominal is jp;

MODEL: jp ON age fHU96 fHU05 fHU08 youngA transgen statshift;
jp on lostgen1 (1)
lostgen2 (1);

Thank you in advance for your help,
Zsofia Ignacz
 Bengt O. Muthen posted on Tuesday, June 19, 2012 - 4:01 pm
If jp is nominal, you refer to its categories by

 Zsofia Ignacz posted on Wednesday, June 20, 2012 - 1:59 am
Dear Professor Muthen,

Thank you very much for your answer! It works perfectly!

Kind regards to you,
Zsofia Ignacz
 christine meng posted on Tuesday, August 14, 2012 - 10:45 pm
I am running a multinomial logistic regression by regressing covariates on a 3-class GMM. I saw from previous posts that the way to change the reference group is to change the starting values. So I did the following, but I got exactly the same results. Did I miss anything?

c#3 as reference group
[i*31 s*1.5];
[i*-28 s*.1];
[i*40 s*0];

c#2 as reference group
[i*31 s*1.5];
[i*40 s*0];
[i*-28 s*.1];

c#1 as reference group
[i*40 s*0];
[i*-28 s*.1];
[i*31 s*1.5];
 Linda K. Muthen posted on Wednesday, August 15, 2012 - 12:22 pm
Please send the output before you added the starting values and the output with the starting values along with your license number to
 Lisa M. Yarnell posted on Sunday, January 06, 2013 - 2:54 am
Hi Linda and Bengt,

I see under EXAMPLE 3.6: MULTINOMIAL LOGISTIC REGRESSION in the current manual that one can use code like this (below) when one wants to give starting values or place restrictions on the parameters.

MODEL: u1#1 u1#2 ON x1 x3;

How might I adjust this code to provide starting values? My ordered dependent variable has 3 levels. I assume I would adjust it to reflect these 3 levels, as in:
MODEL: u1#1 u1#2 u1#3 ON x1 x3;

But can you give an example of what starting values to use, and where to place them in this code?
 Lisa M. Yarnell posted on Sunday, January 06, 2013 - 6:55 am
Linda and Bengt, in the post above, I should not have referred to the dependent variable as having ordered levels; it is a nominal (unordered) variable.

Also, I should add: The purpose of the starting values is to make the second class the reference class. By default the reference class is the third class. But I also want the odds ratio for Class 1 vs. Class 2.

What starting values should I add to the code above, and where would I put them? Thank you!
 Bengt O. Muthen posted on Sunday, January 06, 2013 - 5:40 pm
With observed nominal variables you don't change the order of categories by starting values as you do with latent categorical (latent class) variables. You can instead change the scoring of your observed categories so that the highest score corresponds to your reference class, because the highest-score category is chosen as the last (reference) category.
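The rescoring can be sketched as below, assuming a variable scored 1-3 where category 2 should become the reference. The data and the helper `recode_reference_last` are hypothetical; the same recoding could be done in whatever tool prepares the data file.

```python
def recode_reference_last(values, reference):
    """Rescore categories so that `reference` gets the highest score."""
    others = sorted(c for c in set(values) if c != reference)
    mapping = {old: new for new, old in enumerate(others, start=1)}
    mapping[reference] = len(others) + 1
    return [mapping[v] for v in values], mapping

# Hypothetical observed nominal variable scored 1-3; category 2 should be the reference.
u1 = [1, 2, 3, 2, 1, 3]
recoded, mapping = recode_reference_last(u1, reference=2)
print(mapping)
print(recoded)
```

After the recoding, old category 2 carries the highest score, so the slopes for the first category compare old class 1 with old class 2.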
 Lisa M. Yarnell posted on Sunday, January 06, 2013 - 11:18 pm
Thank you, Bengt, for your help!
 Jennifer Buckley posted on Friday, February 08, 2013 - 8:02 am
I am looking for any advice about calculating confidence interval for predicted probabilities of class membership from a multinomial logistic regression model. I have calculated the predicted probabilities and used the CINTERVAL command to get lower and upper estimates of the coefficients. However, I’m not sure how to use these to estimate 95 percent CIs around the predicted probabilities e.g. do I use the lower estimate for the intercept with the lower estimate for the coefficient.
I appreciate this does not relate directly to Mplus, however I’m struggling to find an answer. Any help with getting to the right calculation or useful reference would be really appreciated.
Thank you in advance.
 Linda K. Muthen posted on Friday, February 08, 2013 - 3:00 pm
You would need to use MODEL CONSTRAINT to get the predicted probabilities for the values of x you are interested in. You will then get standard errors you can use for confidence intervals.
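The reason the coefficient limits cannot simply be plugged in is that the predicted probability is a nonlinear function of several correlated estimates. A common way to get its standard error is the delta method, sketched below with a numerical gradient. Everything here is hypothetical: the estimates, the parameter covariance matrix, and the helper names; whether it reproduces the software's figures exactly is an assumption.

```python
import math

def prob_class1(params, x):
    """P(class 1 | x) in a 3-class model; params = (a1, b1, a2, b2), class 3 is reference."""
    a1, b1, a2, b2 = params
    e1, e2 = math.exp(a1 + b1 * x), math.exp(a2 + b2 * x)
    return e1 / (e1 + e2 + 1.0)

def delta_method_se(func, params, cov, x, step=1e-6):
    """Approximate S.E. of func(params, x) via a numerical gradient g and sqrt(g' V g)."""
    grad = []
    for i in range(len(params)):
        up = list(params)
        up[i] += step
        down = list(params)
        down[i] -= step
        grad.append((func(up, x) - func(down, x)) / (2 * step))
    n = len(params)
    variance = sum(grad[i] * cov[i][j] * grad[j] for i in range(n) for j in range(n))
    return math.sqrt(variance)

# Hypothetical estimates and parameter covariance matrix (not from any real output).
params = [0.5, 0.8, -0.2, 0.3]
cov = [[0.040, -0.010, 0.015, -0.004],
       [-0.010, 0.060, -0.004, 0.020],
       [0.015, -0.004, 0.050, -0.012],
       [-0.004, 0.020, -0.012, 0.070]]

p = prob_class1(params, x=1)
se = delta_method_se(prob_class1, params, cov, x=1)
print(f"p = {p:.3f}, SE = {se:.3f}, 95% CI [{p - 1.96 * se:.3f}, {p + 1.96 * se:.3f}]")
```

The interval is then the predicted probability plus or minus 1.96 standard errors, computed from the probability's own standard error and not from the limits of the individual coefficients.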
 Jennifer Buckley posted on Monday, February 11, 2013 - 7:46 am
Dear Linda,

Thank you for your response. I'm finding it difficult to work out how to use the MODEL CONSTRAINT command to get the predicted probabilities. Would you be able to provide any further instructions or reference an example?

Kind regards, Jen
 Linda K. Muthen posted on Monday, February 11, 2013 - 10:06 am
See Example 5.20 to see how MODEL CONSTRAINT is used. See also MODEL CONSTRAINT in the user's guide.

See pages 492-497 of the user's guide to see how to compute predicted probabilities.
 Karen Kochel posted on Tuesday, June 11, 2013 - 8:55 am
Hello - I conducted a multinomial logistic regression in which I regressed a Time 2 observed continuous variable on Time 1 latent categorical variable (4 classes). I am now interested in examining a Time 1 observed continuous variable as a moderator. If this is possible, can you please provide me with some direction? XWITH and KNOWNCLASS commands don’t seem appropriate here. Thanks!
 Bengt O. Muthen posted on Tuesday, June 11, 2013 - 10:24 am
I don't understand where the multinomial logistic regression comes in.
 Karen Kochel posted on Tuesday, June 11, 2013 - 10:41 am
I'm sorry I wasn't clear. Substantively, I'm interested in knowing, for example, whether high scores on peer rejection are associated with increased odds of membership in a victim class relative to a bully class. Logistic regression results suggest yes. Now, I would like to investigate whether the association between high scores on peer rejection and the odds of being identified as a bully vs. victim depends on bullies' and victims' aggression scores. So, I would like to know whether I can examine aggression, an observed continuous variable, as a moderator. Does this make sense? Thank you.
 Bengt O. Muthen posted on Tuesday, June 11, 2013 - 10:54 am
It sounds like you are saying, using regular mediation notation:

x = peer rejection score
m = aggression score
y = victim/bully binary variable

I may have misinterpreted, because this does not jibe with your first message, which said that the Time 2 variable was a continuous DV and the Time 1 variable was categorical with 4 classes.
 Karen Kochel posted on Tuesday, June 11, 2013 - 11:46 am
Thanks for your help, and I am very sorry for the confusion. Let me try to clarify. First, I conducted LPA at Time 1 and obtained a 4-class solution (thus the categorical latent variable I referenced in the initial post). Second, I conducted the regression as stated in the first post. I realize that x=T2 peer rejection, m=T1 aggression, and y=T1 categorical latent class (bully, victim, bully-victim, uninvolved) is temporally backwards. In view of this, I guess my first question is, could it be argued that this analysis makes conceptual sense in terms such as: high scores on T2 peer rejection are associated with increased odds of membership in one class compared to another at T1? If yes, can I examine aggression as a continuous moderator (not mediator)?

If no, might it be possible to examine a continuous moderator (again, let's say Time 1 aggression) of the association between the Time 1 latent categorical variable (with 4 classes) and a distal T2 outcome (such as peer rejection)? So, in this case, let's suppose a Wald test revealed that victims scored higher than bullies on rejection at T2. Can I examine whether this association depends on bullies' and victims' aggression scores (a continuous measure), or do I need to dichotomize aggression and use KNOWNCLASS? Thanks very much for your continued help.
 Bengt O. Muthen posted on Tuesday, June 11, 2013 - 12:07 pm
You can have a T1 latent class variable c (4 classes) and a distal T2 continuous outcome y (peer rejection), moderated by a T1 continuous aggression score z. You would do this by

y on z;
y on z (b1);
y on z (b4);

You can test equality of b1-b4 using Model test.
 DEC posted on Monday, August 25, 2014 - 11:34 am
Hello Drs. Muthén,

I am trying to run a multinomial logistic regression with two continuous latent factors and their latent interaction in relation to membership in externalizing trajectory classes (1=chronic, 2=increasing, 3=decreasing, 4=normative). Prior to this regression I conducted a latent class growth analysis to approximate 4 trajectory classes of externalizing behavior. I imported the class variable into Mplus to run in a multinomial logistic regression and defined it as a nominal variable.

I have two questions that I'm hoping you can help with:

1) How do I interpret a significant latent interaction in relation to a nominal outcome? I'm interested in contrasts between specific classes (e.g., chronic vs. normative, decreasing vs. chronic).

2) Would it be better to run the multinomial logistic regression and latent class growth analysis simultaneously? I am worried that class assignment will change after adding in latent predictors and their interaction.

I should also mention that I'm running these analyses on a Mac and some of the trajectory classes include few participants (i.e., chronic n = 9, increasing n = 23, decreasing n = 38, normative n = 168). Thanks so much for any recommendations you can provide!

Kindest regards,
 Bengt O. Muthen posted on Monday, August 25, 2014 - 3:01 pm
1) Multinomial logistic regression slopes are log odds ratios, typically interpreted in their exponentiated (odds ratio) form. Combine that with the fact that interactions (including latent variable interactions) can be expressed in terms of moderator functions (see our FAQ on this) and you have your answer.

2) No, do it in separate steps for simplicity - at least if your entropy is at least 0.80.
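The combination described in point 1 can be sketched in Python. This is an illustration under stated assumptions, not the FAQ's own code: the model is taken to be a multinomial logit with slopes b1 (for factor f1) and b3 (for the f1 by f2 interaction) per class relative to the reference class, so the moderator function for f1 is b1 + b3 * f2. All slope values and the names `SLOPES`, `log_odds_ratio`, and `odds_ratio` are hypothetical.

```python
import math

# Hypothetical multinomial logit slopes for each class versus the reference
# class (4 = normative): b1 for factor f1, b3 for the f1*f2 interaction.
# All values are invented for illustration.
SLOPES = {
    1: {"b1": 0.80, "b3": 0.30},   # chronic vs. normative
    2: {"b1": 0.40, "b3": 0.10},   # increasing vs. normative
    3: {"b1": 0.20, "b3": -0.20},  # decreasing vs. normative
    4: {"b1": 0.00, "b3": 0.00},   # reference class: slopes fixed at zero
}


def log_odds_ratio(class_a: int, class_b: int, f2: float) -> float:
    """Change in the log odds of class_a vs. class_b per unit of f1,
    evaluated at moderator value f2 (the moderator function)."""
    a, b = SLOPES[class_a], SLOPES[class_b]
    return (a["b1"] - b["b1"]) + (a["b3"] - b["b3"]) * f2


def odds_ratio(class_a: int, class_b: int, f2: float) -> float:
    """Exponentiated moderator function: the odds ratio for f1 at f2."""
    return math.exp(log_odds_ratio(class_a, class_b, f2))


for f2 in (-1.0, 0.0, 1.0):
    print(f"f2 = {f2:+.1f}: OR chronic vs. normative = {odds_ratio(1, 4, f2):.3f}, "
          f"decreasing vs. chronic = {odds_ratio(3, 1, f2):.3f}")
```

Contrasts between two non-reference classes (for example decreasing vs. chronic) follow from differences of the slopes, which is why the function subtracts one class's slopes from the other's.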
 Isaac J. Washburn posted on Tuesday, August 26, 2014 - 12:43 pm
Drs. Muthén,
I am using the 3-step method to first find the best-fitting categorical latent variable and then, in the final step, predicting that variable from the intercept and slope of a logistic growth model. Everything appears to be working, but when I move the intercept to a different time point, the relationship between the slope and the categorical latent variable changes. The relationship between the intercept and the categorical latent variable does not. As I understand it, this should not be happening. There is minor shifting of sample size, but nothing that would warrant this.

First Intercept:
HI -0.575 0.271 -2.122 0.034
HS -0.174 0.318 -0.547 0.584

HI 0.459 0.208 2.201 0.028
HS 0.362 0.297 1.219 0.223

Second Intercept
HI -0.574 0.270 -2.120 0.034
HS 0.399 0.323 1.235 0.217

HI 0.459 0.208 2.204 0.028
HS -0.096 0.286 -0.335 0.737
 Bengt O. Muthen posted on Tuesday, August 26, 2014 - 3:49 pm
I don't know what the rules for this are when you are using a non-linear (logistic) growth model. Perhaps a simple simulation using Mplus is in order (e.g. 1 rep with a very large sample size to get close to pop values).
 Isaac Washburn posted on Tuesday, August 26, 2014 - 3:58 pm
That is a good thought. I will try that.
 Valentina Ulloa posted on Saturday, August 22, 2015 - 4:33 am
Hello Drs. Muthén,
I'm regressing a latent class variable on some covariates (with the BCH method) and would like to know the significance of the overall effect of each predictor, given that the multinomial coefficients only provide separate estimates for each class compared to a reference class.

Just in case, this is my model, where SC is the latent variable:

SC on pstat sex educ age;
ostat on educ pstat sex;

So, if I'm right, I would like to test H0: all coefficients associated with a given variable (pstat, sex, educ, or age) are equal to zero.

 Bengt O. Muthen posted on Saturday, August 22, 2015 - 6:59 am
You can do that using Model Test.
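A joint test of this kind is a Wald test. As a rough Python sketch of the arithmetic (not of Mplus internals), assume a 3-class solution, so that one predictor such as pstat has two multinomial slopes, and assume their estimates and 2x2 covariance matrix are available. The numbers and the name `wald_test_two_coefficients` are hypothetical.

```python
import math


def wald_test_two_coefficients(b, v):
    """Wald test of H0: b[0] = b[1] = 0, given the 2x2 covariance matrix v
    of the two estimates. Returns (Wald statistic, p-value)."""
    (v11, v12), (v21, v22) = v
    det = v11 * v22 - v12 * v21
    if det <= 0:
        raise ValueError("covariance matrix must be positive definite")
    # W = b' V^{-1} b, with the 2x2 inverse written out explicitly.
    w = (v22 * b[0] ** 2 - (v12 + v21) * b[0] * b[1] + v11 * b[1] ** 2) / det
    # With 2 degrees of freedom the chi-square upper tail is exp(-w / 2).
    p_value = math.exp(-w / 2.0)
    return w, p_value


# Hypothetical slopes of classes 1 and 2 (vs. class 3) on one predictor,
# with a hypothetical covariance matrix of the two estimates.
b_pstat = [0.5, -0.3]
v_pstat = [[0.04, 0.01], [0.01, 0.09]]

w, p = wald_test_two_coefficients(b_pstat, v_pstat)
print(f"Wald chi-square (2 df) = {w:.3f}, p = {p:.4f}")
```

The closed-form p-value only holds for 2 degrees of freedom; with more classes the statistic has more degrees of freedom and needs a general chi-square tail function.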
 Valentina Ulloa posted on Monday, August 24, 2015 - 1:06 pm
Ok, thank you Dr.
And I have another question. In the latent class mixture regression analyses, the standardised coefficients of the regression of a distal outcome on covariates differ across classes (including different R-squares), but the unstandardised estimates are the same for all classes. What would be the interpretation of that?

Thanks again.

Kind regards,

 Linda K. Muthen posted on Tuesday, August 25, 2015 - 7:10 am
Unstandardized coefficients are held equal across classes as the default. Standardization is done using the class standard deviations which make the standardized solutions different. If you don't want the coefficients held equal across classes, mention them in the class-specific parts of the MODEL command.
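A small Python sketch shows why equal unstandardized slopes can still give different standardized values. It assumes the common standardization b * SD(x) / SD(y) applied within each class; the slope, the class standard deviations, and the names `stdyx` and `class_sds` are made up for illustration.

```python
def stdyx(b: float, sd_x: float, sd_y: float) -> float:
    """Standardized slope: b scaled by the SDs of predictor and outcome."""
    return b * sd_x / sd_y


# One unstandardized slope, held equal across classes (hypothetical value).
b = 0.40

# Hypothetical class-specific standard deviations of x and y.
class_sds = {
    1: {"sd_x": 1.0, "sd_y": 2.0},
    2: {"sd_x": 1.5, "sd_y": 1.2},
}

standardized = {
    k: stdyx(b, sds["sd_x"], sds["sd_y"]) for k, sds in class_sds.items()
}

for k, value in standardized.items():
    print(f"class {k}: unstandardized = {b:.2f}, standardized = {value:.2f}")
```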
 Jin Qu posted on Saturday, August 13, 2016 - 7:27 pm
I am trying to change the reference group from class 4 to class 3. I am using mRS, mSC, and mSCpRS (their product term) to predict a nominal variable (1, 2, 3, 4). As start values, I use the logistic regression odds ratios that I obtained when running the same analysis with class 4 as the reference group. Is this correct?

When I run the following code, I obtain the error message "invalid class value". I wonder what this message means and how I can revise my code? Thanks!

usevar are c mRS_6
mSC_6 mSCpRS;

missing are all (-999);
nominal is c;

center mRS_6 mSC_6 (grand mean);
mSCpRS = mRS_6*mSC_6;


c#1 on mRS_6 (b1)
mSC_6 (b2)
mSCpRS (b3);

c#2 on mRS_6 mSC_6 mSCpRS;
c#4 on mRS_6 mSC_6 mSCpRS;

[mRS_6*0.454 mSC_6*1.382 mSCpRS*0.870];

[mRS_6*0.707 mSC_6*1.223 mSCpRS*0.667 ];

[mRS_6*0.624 mSC_6*1.088 mSCpRS*0.754];
 Bengt O. Muthen posted on Sunday, August 14, 2016 - 11:15 am
Please send your full output and license number to Support.
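As general background on changing the reference category (this is not a diagnosis of the error message above), the slopes of a multinomial logit model under a new reference class are differences of the slopes under the old one. The Python sketch below assumes the usual parameterization in which slopes are log odds ratios, so an odds ratio would have to be logged before it is used on this scale. The coefficient values and the names `COEF_REF4` and `change_reference` are hypothetical.

```python
import math

# Hypothetical logit slopes for one predictor, each class vs. class 4.
# The reference class has slope 0 by definition.
COEF_REF4 = {1: 0.9, 2: -0.4, 3: 0.3, 4: 0.0}


def change_reference(coefs: dict, new_ref: int) -> dict:
    """Re-express logit slopes relative to new_ref by subtracting its slope."""
    shift = coefs[new_ref]
    return {k: b - shift for k, b in coefs.items()}


COEF_REF3 = change_reference(COEF_REF4, 3)
odds_ratios_ref3 = {k: math.exp(b) for k, b in COEF_REF3.items()}

for k in sorted(COEF_REF3):
    print(f"class {k} vs. class 3: logit = {COEF_REF3[k]:+.2f}, "
          f"odds ratio = {odds_ratios_ref3[k]:.3f}")
```

On the odds ratio scale the same change is a division: the odds ratio for class 1 vs. class 3 equals the class 1 vs. class 4 odds ratio divided by the class 3 vs. class 4 odds ratio.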
 Kathryn Modecki posted on Sunday, August 21, 2016 - 12:40 am
Hello, I am running a latent growth mixture model, using r3step to test associations with covariates. When I add a "granular" age variable (based on birthday) to the model, one of the model comparisons results in an EST/SE of 999 (the SE is 0), though this is not the comparison of interest to me. Can this solution be reported? Or would you advise that the model be revised? Thank you.
 Kathryn Modecki posted on Sunday, August 21, 2016 - 12:41 am
Sorry, to clarify: age is being added as a covariate, using the r3step command. Thanks
 Bengt O. Muthen posted on Sunday, August 21, 2016 - 4:44 pm
Please send your output and license number to Support.
 Magnus Alderling posted on Thursday, November 17, 2016 - 9:14 am

I have done a multinomial regression in Mplus and wanted to check whether SPSS gives the same results. It turned out that neither the unstandardized nor the standardized beta coefficients from Mplus were in the neighbourhood of the results obtained from SPSS. Moreover, SPSS gives reasonable estimates which correspond well to the observed data. Exactly what part do the thresholds in Mplus play, together with the beta coefficients, in interpreting the results?
As an example, taking just one of the ten levels of the multinomial outcome and having this level regressed on one of the six categories representing the independent variable Mplus gives estimate=-0.179 and S.E=0.016 with intercept for specific level on D.V=0.184 and corresponding threshold=-2.148. SPSS gives the intercept=1.69 and estimate=-3.121. Which results should I rely on?

 Bengt O. Muthen posted on Thursday, November 17, 2016 - 2:52 pm
SPSS and Mplus should give the same results (SPSS also uses ML I assume) so something is not done right for instance regarding the data. Send the SPSS output in pdf and the Mplus output to Support along with your license number.
 Ernest Boakye-Dankwa posted on Monday, June 05, 2017 - 10:27 pm
Dear All,

I am conducting LCA on 11 binary indicators using two datasets from different regions. My results indicate that a 3-class model fits one dataset well and a 2-class model represents the other dataset well. I am trying to assess the differences between the latent classes across the two regions and also the differences within each region by income status (low vs. high). Is there any material I could read to address this challenge?

Thank you.
 Bengt O. Muthen posted on Tuesday, June 06, 2017 - 6:19 pm
You can make region into a Knownclass variable and then test for region differences for some classes using Model Test. The UG has examples of Knownclass use and our courses also cover this.
 Nina Pocuca posted on Wednesday, October 25, 2017 - 10:30 pm
Dear Drs Muthen,
I’ve conducted a multinomial logistic regression examining the effect of an interaction on a categorical outcome variable with 3 categories. I’ve noticed that Mplus does not provide the output for a likelihood ratio test (as would be provided in SPSS). Is this the case? If so, would it be statistically sound to interpret the significance of the predictor variables (including interaction terms) on the DV, according to the model results provided in the Mplus output? Thank you in advance for your assistance.
 Bengt O. Muthen posted on Thursday, October 26, 2017 - 1:52 pm
I don't know which LR test SPSS does but if it is for a certain regression coefficient, you may just as well use the z-score in the Mplus output. If it is for a set of coefficients, you can use Model Test.
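For a single coefficient, the z-score and its two-sided p-value follow directly from the estimate and its standard error. The Python sketch below uses a standard normal reference distribution and reproduces the arithmetic for one row reported earlier in this thread (HI: estimate -0.575, S.E. 0.271); the function name `z_test` is hypothetical.

```python
import math


def z_test(estimate: float, se: float):
    """Return (z, two-sided p-value) for H0: coefficient = 0."""
    z = estimate / se
    # Two-sided normal tail probability: P(|Z| > |z|) = erfc(|z| / sqrt(2)).
    p_value = math.erfc(abs(z) / math.sqrt(2.0))
    return z, p_value


z, p = z_test(-0.575, 0.271)
print(f"z = {z:.3f}, two-sided p = {p:.3f}")
```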
 Juliana Gottschling  posted on Monday, November 26, 2018 - 1:37 am
Dear Drs Muthen,

I am using multinomial logistic regression with a categorical latent variable (4 categories) regressed on a continuous latent variable. Is there a way to calculate a measure of effect size for the overall model (similar to a pseudo r²)?

Thanks in advance for your help.
 Bengt O. Muthen posted on Monday, November 26, 2018 - 12:59 pm
If there is a pseudo R-2 for multinomial logistic regression with an observed DV, you can use that approach for the latent version by expressing it in Model Constraint.
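One pseudo R-square for an observed multinomial DV is McFadden's, 1 - lnL(model) / lnL(null), where the null model predicts the marginal category proportions for everyone. The Python sketch below illustrates only that observed-DV version on made-up data; how to carry it over to a latent class DV in Model Constraint is not shown here. The data and the name `mcfadden_r2` are hypothetical.

```python
import math
from collections import Counter


def mcfadden_r2(observed, predicted_probs):
    """McFadden pseudo R-square for a categorical outcome.

    observed: list of category indices (0, 1, 2, ...).
    predicted_probs: list of per-case probability lists, one per category.
    """
    n = len(observed)
    # Log-likelihood of the fitted model: log probability of each observed category.
    ll_model = sum(math.log(p[y]) for y, p in zip(observed, predicted_probs))
    # Log-likelihood of the null model: marginal proportions for every case.
    counts = Counter(observed)
    ll_null = sum(c * math.log(c / n) for c in counts.values())
    return 1.0 - ll_model / ll_null


# Made-up example: 6 cases, 3 categories, hypothetical fitted probabilities.
observed = [0, 0, 1, 1, 2, 2]
predicted_probs = [
    [0.7, 0.2, 0.1],
    [0.6, 0.3, 0.1],
    [0.3, 0.5, 0.2],
    [0.2, 0.6, 0.2],
    [0.1, 0.1, 0.8],
    [0.2, 0.3, 0.5],
]

r2 = mcfadden_r2(observed, predicted_probs)
print(f"McFadden pseudo R-square = {r2:.3f}")
```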
 Virginia Rangel posted on Thursday, April 11, 2019 - 10:58 am
I am running an LCA and, in the two-class model, I am getting very bizarre estimates for the multinomial regression (e.g., estimates such as 633 with SE=0, est/se=999.00, and p-value=0.0). When I looked at the results from the three- and four-class models, I didn't get such bizarre results. I have almost 600 cases with 10 covariates and 9 indicators. Unfortunately, I cannot send my datafile or results because of the restricted nature of my data license.
 Bengt O. Muthen posted on Thursday, April 11, 2019 - 4:14 pm
Large estimates occur when the coefficients cannot be determined (it also happens in regular logistic regression). For instance, there is no variation in an X variable for both of the 2 DV classes. This is ok - just say that the coefficient couldn't be estimated.
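The same behaviour can be reproduced in an ordinary binary logistic regression. The Python sketch below is a toy illustration with made-up data, fitted by plain gradient ascent (not the estimator Mplus uses): every case in outcome class 0 has x = 0, so x does not vary within that class, and the slope keeps growing the longer the optimizer runs instead of settling on a finite value. The names `X`, `Y`, and `fit_logistic` are hypothetical.

```python
import math

# Made-up data: x is 0 for every case with y = 0, and every case with x = 1
# has y = 1, so the likelihood keeps improving as the slope increases.
X = [0, 0, 0, 0, 1, 1]
Y = [0, 1, 0, 1, 1, 1]


def fit_logistic(n_iter: int, lr: float = 0.5):
    """Gradient ascent on the log-likelihood of logit P(y=1) = a + b * x."""
    a = b = 0.0
    for _ in range(n_iter):
        grad_a = grad_b = 0.0
        for x, y in zip(X, Y):
            p = 1.0 / (1.0 + math.exp(-(a + b * x)))
            grad_a += y - p
            grad_b += (y - p) * x
        a += lr * grad_a
        b += lr * grad_b
    return a, b


a_short, b_short = fit_logistic(100)
a_long, b_long = fit_logistic(5000)
print(f"slope after 100 iterations:  {b_short:.2f}")
print(f"slope after 5000 iterations: {b_long:.2f}")
```

Because the slope has no finite maximum-likelihood value here, software typically stops at some very large number with a meaningless standard error, which matches the kind of output described above.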