Multinomial Logistic Regression
 Vilma posted on Thursday, May 31, 2007 - 10:37 pm
We are working on a project whose aim is to estimate the probability of responding to a particular ad campaign. We have a categorical dependent variable with 5 categories, i.e. 4 ad campaigns which are not intrinsically ordered and a fifth which is non-response to any of the ads. We ran a model in SAS with various independent variables.

We are facing the following issues:

1) There is a bias toward the non-response category (we tried to reduce the bias by adding weights to our model).

2) Model Building Issues

Dummy Output

Some categories are insignificant. For example, in IV1 the categories AD1, AD2 and AD4 are insignificant. Should we drop the variable from our analysis altogether because some of its categories are insignificant? If not, kindly suggest an appropriate method to follow.

We would appreciate any insights you can provide.

Vilma
 Linda K. Muthen posted on Monday, June 04, 2007 - 10:55 am
This forum is geared toward Mplus not other statistical programs. It would be necessary to know a lot more about your research to give you suggestions and Mplus Discussion is not able to help to that extent.
 Sara May posted on Saturday, October 31, 2009 - 5:39 pm
I am running a multinomial logistic regression in Mplus. Could you please help me with the following information:

1) One of the predictor variables is ordinal. If I define this variable as ordered categorical in Mplus, how does Mplus handle this level of measurement when estimating the regression model? Would the variable be treated as if it had interval level of measurement? Or would it be better represented by a set of dummy variables (as is typical in other software)?

2) Is it possible to run a hierarchical regression model in Mplus? I.e., add one set of predictor variables in step 1 and another set of variables subsequently. If so, what coefficients does Mplus provide to estimate the improvement in model fit? I wasn't able to find indexes that are typically included in other software (e.g., Pseudo R^2). Could you please recommend papers that have used multinomial logistic regression in Mplus?

Thanks so much!
 Amir Sariaslan posted on Saturday, October 31, 2009 - 7:28 pm
Sara,

1) Predictor variables are not defined in Mplus. Both of your proposed approaches are discussed here: http://www.stat.columbia.edu/~cook/movabletype/archives/2009/10/coding_ordinal.html

2) Mplus does multilevel models ("hierarchical regression") but I presume that's not what you're looking for. Testing model fit in a step-wise fashion can be done through likelihood ratio testing. As for references, LCA/LCGA/GMM models with predictors will include a multinomial regression component. Just have a look under "Papers" on the Statmodel website.

/Amir
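
To make the likelihood ratio testing mentioned above concrete, here is a minimal Python sketch of the arithmetic. It assumes two nested models estimated with plain ML on exactly the same cases. The loglikelihoods and parameter counts are made-up illustrative values, and the function names are hypothetical. With a robust estimator such as MLR the raw difference needs a scaling correction, which this sketch does not attempt.

```python
import math

def chi2_sf(x, df):
    """Upper-tail probability of a chi-square variate with integer df >= 1."""
    if x <= 0:
        return 1.0
    if df % 2 == 0:
        # Even df: exp(-x/2) * sum_{k=0}^{df/2-1} (x/2)^k / k!
        term, total = 1.0, 1.0
        for k in range(1, df // 2):
            term *= (x / 2.0) / k
            total += term
        return math.exp(-x / 2.0) * total
    # Odd df: two-sided normal tail plus a finite series in sqrt(x)
    chi = math.sqrt(x)
    tail = math.erfc(chi / math.sqrt(2.0))
    pdf = math.exp(-x / 2.0) / math.sqrt(2.0 * math.pi)
    term, total = chi, 0.0
    for r in range(1, (df - 1) // 2 + 1):
        if r > 1:
            term *= x / (2 * r - 1)
        total += term
    return tail + 2.0 * pdf * total

def likelihood_ratio_test(ll_restricted, k_restricted, ll_full, k_full):
    """Return (statistic, df, p-value) for two nested ML models."""
    stat = 2.0 * (ll_full - ll_restricted)
    df = k_full - k_restricted
    return stat, df, chi2_sf(stat, df)

# Hypothetical step 1 (6 parameters) versus step 2 (10 parameters)
stat, df, p = likelihood_ratio_test(-1250.4, 6, -1241.9, 10)
print(f"LR chi-square = {stat:.2f}, df = {df}, p = {p:.4f}")
```

A small p-value would suggest that the covariates added in the second step improve the fit.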
 Linda K. Muthen posted on Sunday, November 01, 2009 - 9:45 am
1) In regression, covariates can be binary or continuous. In both cases, they are treated as continuous. You can leave the ordinal variable as it is or create a set of dummy variables. The CATEGORICAL option is for dependent variables.

2) Mplus does not have stepwise regression.
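
As an illustration of the dummy-variable option in point 1, the following Python sketch turns an ordinal predictor into a set of 0/1 indicators with one level left out as the reference. The variable name `edu`, its values and the helper `dummy_code` are hypothetical. In practice the recoding would be done in the data file or in the analysis software itself.

```python
def dummy_code(values, reference):
    """Return one 0/1 indicator list per non-reference level."""
    levels = sorted(set(values))
    kept = [lv for lv in levels if lv != reference]
    return {f"d{lv}": [1 if v == lv else 0 for v in values] for lv in kept}

# Hypothetical ordinal covariate with levels 1 < 2 < 3; level 1 is the reference
edu = [1, 3, 2, 2, 1, 3]
dummies = dummy_code(edu, reference=1)
for name, column in dummies.items():
    print(name, column)
```

Leaving the variable as a single numeric covariate instead estimates one slope per outcome category and treats the levels as equally spaced. The dummy set drops that assumption at the cost of more parameters.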
 fritz posted on Thursday, June 17, 2010 - 9:35 am
Hello,

I've got quite a simple multinomial logistic regression model (like example 3.6 in the User's Guide).

My nominal variable has three unordered categories, however, and I understand that the last category is taken as the reference group. Thus, I've got odds ratios for group 1 vs. group 3 and group 2 vs. group 3. But what about testing group 1 vs. group 2? Do I have to run separate logistic regressions with dummy-coded outcomes to test each group against each other? Or is there a way to calculate odds ratios for group 1 vs. group 2 from the results of my multinomial regression analysis?

 Linda K. Muthen posted on Thursday, June 17, 2010 - 10:32 am
We usually give these alternatives. I would need more information to understand why you are not getting them. Please send the full output and your license number to support@statmodel.com.
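
For readers who want the arithmetic behind such an alternative contrast: with category 3 as reference, the logit slope for group 1 vs. group 2 is the difference between the two reported slopes, and its standard error also needs the covariance between the two estimates. The Python sketch below uses made-up numbers, and the function name is hypothetical.

```python
import math

def contrast_1_vs_2(b1, b2, se1, se2, cov12, z=1.96):
    """Slope, SE, z and odds ratio for category 1 vs. 2, given slopes vs. category 3."""
    diff = b1 - b2                                   # log odds ratio, 1 vs. 2
    se = math.sqrt(se1 ** 2 + se2 ** 2 - 2.0 * cov12)
    return {
        "logit": diff,
        "se": se,
        "z": diff / se,
        "odds_ratio": math.exp(diff),
        "ci": (math.exp(diff - z * se), math.exp(diff + z * se)),
    }

# Hypothetical estimates: slopes vs. category 3, their SEs and their covariance
res = contrast_1_vs_2(b1=0.80, b2=0.30, se1=0.20, se2=0.25, cov12=0.02)
print(res)
```

Re-estimating the model with a different reference category should give the same contrast without any hand calculation.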
 Edith posted on Friday, June 17, 2011 - 11:52 am
Hello Linda,

I have two questions regarding multinomial regression (they are very similar to the questions raised in this thread before, however, I did not find an answer so far):

1. Mplus output of example 3.6 shows the estimates and the odds for Kat#1 vs Kat#3 and Kat#2 vs Kat#3 (the nominal dependent variable has three unordered categories). However, is there a way to receive the values for Kat#1 vs Kat#2?

2. Does Mplus provide a (pseudo) R-square for the nominal dependent variable (like Nagelkerke or McFadden or Cox&Snell)? The output of example 3.6 does not show a R-square. Is there a way for Mplus to calculate it?

Thank you very much for your help!
 Linda K. Muthen posted on Friday, June 17, 2011 - 1:15 pm
1. Class 3 is the reference category. You would need to change the reference category using the DEFINE command to make category two the last category.

2. No.
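
Regarding point 2, the usual pseudo R-square measures can be computed by hand from two loglikelihoods: one from the model of interest and one from an intercepts-only model fitted to the same cases. The Python sketch below uses made-up loglikelihood values and a made-up sample size, and the function names are hypothetical.

```python
import math

def mcfadden_r2(ll_model, ll_null):
    """McFadden: 1 - LL(model) / LL(intercepts only)."""
    return 1.0 - ll_model / ll_null

def cox_snell_r2(ll_model, ll_null, n):
    """Cox & Snell: 1 - exp(2 * (LL(null) - LL(model)) / n)."""
    return 1.0 - math.exp(2.0 * (ll_null - ll_model) / n)

def nagelkerke_r2(ll_model, ll_null, n):
    """Nagelkerke: Cox & Snell rescaled by its maximum attainable value."""
    max_cs = 1.0 - math.exp(2.0 * ll_null / n)
    return cox_snell_r2(ll_model, ll_null, n) / max_cs

# Hypothetical values: n = 1000 cases, three outcome categories
ll_null, ll_model, n = -1098.6, -1000.0, 1000
print(mcfadden_r2(ll_model, ll_null))
print(cox_snell_r2(ll_model, ll_null, n))
print(nagelkerke_r2(ll_model, ll_null, n))
```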
 Yijie Wang posted on Monday, October 24, 2011 - 9:19 am
Hi Dr. Muthen,

I'm conducting a multinomial logistic regression with nested data. My syntax is as follows:

usevariables are c_rlow prstrand;
nominal is c_rlow;
cluster is feeder;

ANALYSIS: type=complex;

MODEL: c_rlow on prstrand;

OUTPUT: standardized CINTERVAL ;

Strangely, in the STDYX results, all the coefficients are either 1 or -1, with p-value as 999.00. I know that I should look at the unstandardized results. But this strange output in STDYX makes me wonder if my model is running correctly. Would you please help me with this issue? Thank you very much!
 Bengt O. Muthen posted on Monday, October 24, 2011 - 8:36 pm
You don't want to use STDYX when the DV (that is Y) is nominal. STDX or raw results should be used.
 Yijie Wang posted on Monday, October 24, 2011 - 10:07 pm
Hi Dr. Muthen,

Thank you for your response! I have one more follow-up question. So I don't need to be concerned about the strange coefficients and p-values in the STDYX output? Thank you!
 Linda K. Muthen posted on Tuesday, October 25, 2011 - 2:47 pm
You should ignore STDYX with nominal outcomes.
 Yijie Wang posted on Tuesday, October 25, 2011 - 2:57 pm
Hi Dr. Muthen,

I got it. Thank you very much for your help!
 Jaana Minkkinen posted on Tuesday, June 30, 2015 - 2:14 am
I am a little confused about the proper standardization. I have a nominal DV and binary and continuous covariates, so if I understand the previous conversation correctly, I should use raw estimates or the STDY standardization for covariates. However, is it correct to use STDY standardization for all covariates, both binary and continuous? That is, if I needed a table of standardized coefficients, would it be proper to use STDY standardization for all my covariates? Are the STDY-standardized coefficients of binary and continuous covariates comparable with each other?
 Bengt O. Muthen posted on Wednesday, July 01, 2015 - 5:31 pm
Use STDX and divide the coefficients for the binary x's with their respective standard deviations.
 Jaana Minkkinen posted on Friday, July 03, 2015 - 12:58 am
Thank you for your help, it is greatly appreciated. I have one further question. When I divided the coefficient for the significant binary x with its respective standard deviation I got a bigger value than the standardized coefficients of the significant continuous x:s. Is it correctly interpreted that this binary x has a greater effect than the continuous x:s on the DV?
 Christina Dyar posted on Friday, December 11, 2015 - 12:06 pm
Dear Drs. Muthén,
I am currently running a model in which a multinomial variable (3 categories) needs to function as an outcome variable and a predictor variable. However, it appears that Mplus is unable to use a nominal variable as a predictor variable (without dummy coding the variable). I am wondering if you have any ideas about how to approach this model.

I’ve used two approaches to attempt to model it. In the first, I used the nominal version of the variable as a dependent variable and a dummy coded version of the variable (2 dummy coded variables) as the predictor within the same model, but I don’t think this is appropriate in a structural equation modeling context.

In the second approach, I’ve split the model in two, with the first model predicting the nominal outcome variable and the second model starting with the dummy coded variables as predictors.

However, I would prefer to be able to run this as a single model. Is there any way to do so? I would greatly appreciate any suggestions.
 Bengt O. Muthen posted on Saturday, December 12, 2015 - 4:53 pm
You can do this using Knownclass, that is, Type=Mixture where the categories of the observed nominal variable are set equal to latent classes. See our FAQ:

Making an observed categorical variable u equivalent to a latent class variable c
 Bengt O. Muthen posted on Saturday, December 12, 2015 - 5:04 pm
I should add that for the nominal latent class variable c,

c--> y

is then captured by different means of y in the different classes.
 Jennifer posted on Tuesday, January 10, 2017 - 6:43 am
I am using SEM for model building, where I have a number of manifest variables (dichotomous) and continuous latent variables as predictors of a nominal (4 categories) dependent variable. I have a few specific questions.

1. I am trying to run a multiple mediation model, where X is binary manifest, Ms are continuous latent variables, and U is the nominal variable. When I tried running this I encountered an error message stating “the following indirect effect is currently not supported,” referring to the indirect path between X and U. Is there any way to accomplish this analysis in Mplus?

2. It seems that my analyses might be simpler if I were able to use WLSMV as my estimator, though I realize it’s not possible to do this with multinomial regression. I was wondering if it were possible to “trick” Mplus by creating three dummy dichotomous variables and including them in the same analysis to replace the nominal variable as the outcome. I tried this but Mplus recognized that the sample correlation between these dummy variables was 1.00 so the model did not run. Is there a way to decompose a nominal variable in Mplus so that the system perceives it as three dependent categorical variables? I do not want to run three separate analyses because two of my groups have small sample sizes, so I lose a lot of the power.

 Bengt O. Muthen posted on Tuesday, January 10, 2017 - 3:46 pm
1. I am not sure what one should mean by indirect and direct effects for a nominal outcome. Effects are defined in terms of the expectation of the outcome and there is not a single value for that with a nominal outcome. I would dichotomize.

2. Because 2 of your groups have small numbers, I would simply do one binary outcome run - the big category versus the other two.
 Raffaele Zanoli posted on Monday, February 06, 2017 - 10:19 am
Hello I would like to know if it is possible to fix the intercept and/or slope of the last category (reference category) in a multinomial logistic regression to a number other than zero (e.g. 1).

I used Model constraint but I cannot make reference to the last category.

Thank you!
 Bengt O. Muthen posted on Monday, February 06, 2017 - 3:50 pm
 Raffaele Zanoli posted on Tuesday, February 07, 2017 - 12:04 am
Sorry, I am not sure I grasped the answer. How should I recode the variable to impose that the intercept of the last category is 1? Do you mean I should change the reference category and then impose the constraint?
 Bengt O. Muthen posted on Wednesday, February 08, 2017 - 2:32 pm
 Raffaele Zanoli posted on Thursday, February 09, 2017 - 12:07 pm
Still I cannot get the model I wish.
My model is:

VARIABLE:
NAMES are u1 x1 x2 x3 x4;
NOMINAL is u1;
MODEL:
u1#1 on x1 (1);
u1#2 on x2 (1);
u1#3 on x3 (1);

[u1#1 u1#2 u1#3];

The fact is that I want the last equation (u1#4) to be modelled with intercept zero (this is needed for identification) but with the slope of u1#4 on x4 set equal to the slopes of the other u1 categories.

I find it quite odd, compared to other packages doing multinomial logistic regression (e.g. the Stata command clogit), that the slope parameter is necessarily constrained to zero. Is there a way around this default zero slope?
 Bengt O. Muthen posted on Friday, February 10, 2017 - 10:29 am
clogit stands for conditional logistic regression. Mplus does not do that but instead the regular multinomial logistic regression which typically uses the standardization of zero for the last intercept and slope.
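
The reason the regular multinomial model fixes the last intercept and slope can be seen numerically. With a case-level covariate x and category-specific coefficients, adding the same constant to every intercept and every slope leaves all category probabilities unchanged, so one category has to be pinned down. The Python sketch below illustrates this with made-up coefficients, and the function name is hypothetical.

```python
import math

def category_probs(intercepts, slopes, x):
    """Multinomial logit probabilities for one covariate value x."""
    logits = [a + b * x for a, b in zip(intercepts, slopes)]
    m = max(logits)                      # subtract the max for numerical stability
    expd = [math.exp(l - m) for l in logits]
    total = sum(expd)
    return [e / total for e in expd]

# Usual parameterization: last category has intercept 0 and slope 0
base = category_probs([0.5, -0.2, 1.0, 0.0], [0.3, 0.1, -0.4, 0.0], x=2.0)

# Same model with 1.0 added to every intercept and 0.5 to every slope
shifted = category_probs([1.5, 0.8, 2.0, 1.0], [0.8, 0.6, 0.1, 0.5], x=2.0)

print(base)
print(shifted)   # matches base up to floating-point rounding
```

A conditional logit with alternative-specific covariates and one common slope is a different model, which is why its slope is not subject to this restriction.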
 timboto17@gmail.com posted on Sunday, April 02, 2017 - 3:54 pm
Dear Mplus team,
I am running a multinomial logistic regression with Mplus. My dependent variable has 6 categories, including “don’t know” and “doesn’t apply”. As I am interested in the other 4 categories of this variable, is it better to discard the two categories just mentioned or to keep them in the analysis (for instance by recoding them as a single category and not interpreting the results associated with that category)? Thank you kindly for your help.
 Bengt O. Muthen posted on Thursday, April 06, 2017 - 5:51 pm
Keep them in the analysis. You might also find something interesting for them.
 Stig Hebbelstrup Rye Rasmussen posted on Monday, November 13, 2017 - 2:25 pm
Dear Mplus team,

I am estimating a multinomial logistic regression with three outcomes, and I have some difficulties deciding on the correct estimation of confidence intervals for predicted probabilities.

In chapter 4 of your Regression and Mediation Analysis book you nicely describe how to calculate predicted probabilities, but many of the confidence intervals for my predicted probabilities contain both negative values and values larger than one.

The dataset is clustered so I use the type=complex option and cannot get bootstrapped confidence intervals where strange probabilities are usually not a problem.

For an ordinary logistic regression I can use model constraint to obtain the confidence intervals in the following way, where I first obtain the standard errors (se) by estimating the logit coefficients.

exp(logit.coef+1.96*se.logit)/(1+exp(logit.coef+1.96*se.logit))

Is it possible to do something similar for a multinomial logit regression?

E.g. for the reference category: 1/(1+exp(logit1.coef+1.96*logit1.se)+exp(logit2.coef+1.96*logit2.se))
 Tihomir Asparouhov posted on Tuesday, November 14, 2017 - 8:38 am
The best thing to do is just let model constraint give you the confidence interval.
model constraint: new(p3);
p3=1/(1+exp(l1)+exp(l2));
See User's guide example 3.10 for how to use model constraints.

If that doesn't work (very rarely it won't work) you can use model constraint to get SE for
A = log(exp(logit1.coef)+exp(logit2.coef))
then use
1/(1+exp(A+1.96*se.A))
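
The second suggestion can be sketched numerically as follows. Because the interval is built on the scale of A and then transformed, both limits stay between 0 and 1. This Python sketch does not reproduce Mplus output. The logits, their variances and their covariance are made-up values, and the delta-method SE stands in for the SE that model constraint would report for A.

```python
import math

def reference_prob_ci(l1, l2, var1, var2, cov12, z=1.96):
    """Point estimate and interval for p3 = 1 / (1 + exp(l1) + exp(l2))."""
    e1, e2 = math.exp(l1), math.exp(l2)
    a = math.log(e1 + e2)                 # A = log(exp(l1) + exp(l2))
    g1, g2 = e1 / (e1 + e2), e2 / (e1 + e2)   # dA/dl1 and dA/dl2
    se_a = math.sqrt(g1 * g1 * var1 + g2 * g2 * var2 + 2.0 * g1 * g2 * cov12)
    point = 1.0 / (1.0 + math.exp(a))
    lower = 1.0 / (1.0 + math.exp(a + z * se_a))   # larger A means smaller p3
    upper = 1.0 / (1.0 + math.exp(a - z * se_a))
    return point, lower, upper

# Hypothetical logits at the chosen covariate values, with their (co)variances
point, lower, upper = reference_prob_ci(2.5, 1.8, 0.36, 0.49, 0.10)
print(f"p3 = {point:.4f}, 95% interval = ({lower:.4f}, {upper:.4f})")
```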
 Stig Hebbelstrup Rye Rasmussen posted on Tuesday, November 14, 2017 - 10:34 am
Hi Tihomir

I used the model constraint using your first suggestion originally which gave me the strange probabilities I described above.

Your second suggestion however works perfectly! Thank you so much.
 Daria Shamrova posted on Thursday, April 19, 2018 - 8:40 am
Hello Drs. Muthén,

I have a question about the model fit summary for multinomial logistic regression. I get the loglikelihood and information criteria in the output. Is there a way to know that the model is significant based on the information provided in Mplus? What model fit information should be reported for multinomial regression results obtained in Mplus? Thank you!
 Bengt O. Muthen posted on Thursday, April 19, 2018 - 4:14 pm
Typically, no fit statistics are reported. It is difficult to provide relevant fit statistics unless all covariates are binary so that you can look at a frequency table. You report the estimates (particularly slopes) and their significance and perhaps provide a probability plot like we show in our RMA book. See also categorical data books by Agresti and Long.
 Daria Shamrova posted on Thursday, April 19, 2018 - 6:36 pm
Thank you, Dr. Muthen! It helps
 Jayme Walters posted on Thursday, May 17, 2018 - 12:26 pm
I am having some trouble with conducting a multinomial logistic regression.

1. Cases with missing data on the IVs were being deleted. I read through this forum and found this statement: “The only way to avoid the listwise deletion of covariates is to bring them into the model as dependent variables. You can do this by mentioning their variances in the MODEL command. You then make distributional assumptions about them.”

I am not clear on how to mention the variances in the model. Also, what distributional assumptions would be made?

2. Error: ONE OR MORE PARAMETERS WERE FIXED TO AVOID SINGULARITY OF THE INFORMATION MATRIX. THE SINGULARITY IS MOST LIKELY DUE TO THE MODEL IS NOT IDENTIFIED, OR DUE TO A LARGE OR A SMALL PARAMETER ON THE LOGIT SCALE. THE FOLLOWING PARAMETERS WERE FIXED:
Parameter 7, CLASS#1 ON WHITE
Parameter 14, CLASS#2 ON WHITE

The following is the syntax:

TITLE: MLR Test Day
DATA:
FILE IS "E:\WALTERS\Mason\5.14.18\MLRtestdata.dat";
VARIABLE:
NAMES ARE id flag female age income12 class black othrace white hsorless somecoll collgrad;
IDVARIABLE = id;
Usevariables are female age income12 class black white hsorless somecoll;
Nominal is class;
Useobservations are (Flag EQ 1);
MISSING ARE ALL (999);
Analysis:
estimator = ML;
Model:
class#1 class#2 on female age income12 black white hsorless somecoll;
 Bengt O. Muthen posted on Thursday, May 17, 2018 - 4:40 pm
1. To mention the variance of X in a model, say

x;

The distributional assumption is normality of X.

2. Check if the White variable has variability in the class#1 category. Same for class#2.
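
The check in point 2 amounts to cross-tabulating the binary covariate against the nominal outcome and looking for empty cells. When a category contains only 0s or only 1s on the covariate, the corresponding logit slope tends toward plus or minus infinity, which fits the message about a large or small parameter on the logit scale. The Python sketch below uses a tiny made-up data set, and the function name is hypothetical.

```python
from collections import Counter

def classes_without_variability(classes, binary):
    """Return outcome categories in which the 0/1 covariate takes a single value."""
    table = Counter(zip(classes, binary))      # cell counts of the crosstab
    flagged = []
    for c in sorted(set(classes)):
        n0, n1 = table[(c, 0)], table[(c, 1)]
        print(f"class {c}: white=0 -> {n0}, white=1 -> {n1}")
        if n0 == 0 or n1 == 0:
            flagged.append(c)
    return flagged

# Hypothetical data: everyone in class 1 has white = 1
cls = [1, 1, 1, 2, 2, 3, 3, 3]
white = [1, 1, 1, 0, 1, 0, 1, 0]
flagged = classes_without_variability(cls, white)
print("no variability in:", flagged)
```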
 Jayme Walters posted on Friday, May 25, 2018 - 9:44 am
Dr. Muthen,
Thank you for your help. I am new to Mplus so I am still struggling a bit with this.

Below is the syntax. Is this the way you mention the variance? If so, I'm struggling with interpretation. I want to see if income is a predictor of class assignment.

With my second issue, you were right. There is no variability in class 1 or 2. Does that mean I should remove that variable? Thank you!

TITLE: MLR Test Day
DATA:
FILE IS "E:\WALTERS\Mason\5.14.18\MLRtestdata.dat";

VARIABLE:
NAMES ARE id flag female age income12 class black othrace white hsorless somecoll collgrad;
IDVARIABLE = id;
Usevariables are female age income12 class black white hsorless somecoll;
Nominal is class;
Useobservations are (Flag EQ 1);
MISSING ARE ALL (999);

Analysis:
estimator = ML;

Model:
class#1 class#2 income12 on age female black white hsorless somecoll;
 Bengt O. Muthen posted on Friday, May 25, 2018 - 12:52 pm
To mention the variance - see the FAQ on our website: Missing on x's. Note that in mixtures with "c ON x", this can lead to heavy computations due to numerical integration.

Yes, remove the variable.