Multinomial Logistic Regression

Vilma posted on Thursday, May 31, 2007 - 10:37 pm

We are working on a project where the aim is to estimate the probability of responding to a particular ad campaign. We have a categorical dependent variable with 5 categories: 4 ad campaigns, which are not intrinsically ordered, and a fifth category for non-response to any of the ads. We ran a model in SAS with various independent variables.

We are facing the following issues:

1) There is a bias toward non-response to the campaign (we tried to reduce the bias by adding weights to our model).

2) Model Building Issues

Dummy Output

Parameter   ADS   Estimate   Pr > ChiSq
Intercept   AD1   -7.3657    <.0001
Intercept   AD2   -6.0348    <.0001
Intercept   AD3   -7.3328    <.0001
Intercept   AD4   -7.8386    <.0001
IV_1        AD1   -0.0240    0.5528
IV_1        AD2   -0.0600    0.7319
IV_1        AD3    0.0566    <.0001
IV_1        AD4    0.0581    0.1931
IV_2        AD1    0.4249    <.0001
IV_2        AD2    0.6094    0.1931
IV_2        AD3    0.0497    <.0001
IV_2        AD4    0.0331    <.0001

Some categories are not significant. For example, for IV_1 the coefficients for AD1, AD2, and AD4 are not significant. Should we drop the variable from the analysis altogether because some of its categories are not significant? If not, kindly suggest an appropriate method to follow.

I request you to please provide us with some insights.

Vilma
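
The intercepts in the dummy output above already show how strongly the non-response category dominates. A minimal Python sketch, assuming non-response is the reference category (the post does not say so) and all predictors are set to zero, converts the four intercepts into baseline probabilities:

```python
import math

# Intercepts copied from the dummy output in the post (AD1..AD4).
# Assumption: non-response is the reference category, so its logit is fixed at 0.
intercepts = {"AD1": -7.3657, "AD2": -6.0348, "AD3": -7.3328, "AD4": -7.8386}


def baseline_probabilities(logits):
    """Multinomial-logit probabilities with an implicit reference logit of 0."""
    denom = 1.0 + sum(math.exp(v) for v in logits.values())
    probs = {name: math.exp(v) / denom for name, v in logits.items()}
    probs["NonResponse"] = 1.0 / denom
    return probs


probs = baseline_probabilities(intercepts)
for name, p in probs.items():
    print(f"{name}: {p:.5f}")
```

Under these assumptions almost all of the probability mass sits on non-response, which is one way to see the imbalance the post describes.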

Linda K. Muthen posted on Monday, June 04, 2007 - 10:55 am

This forum is geared toward Mplus not other statistical programs. It would be necessary to know a lot more about your research to give you suggestions and Mplus Discussion is not able to help to that extent.

Sara May posted on Saturday, October 31, 2009 - 5:39 pm

I am running a multinomial logistic regression in Mplus. Could you please help me with the following information:

1) One of the predictor variables is ordinal. If I define this variable as ordered categorical in Mplus, how does Mplus handle this level of measurement when estimating the regression model? Would the variable be treated as if it had interval level of measurement? Or, should it be better represented by a set of dummy variables (as is typical in other software).

2) Is it possible to run a hierarchical regression model in Mplus? I.e., add one set of predictor variables in step 1 and another set of variables subsequently. If so, what coefficients does Mplus provide to estimate the improvement in model fit? I wasn't able to find indexes that are typically included in other software (e.g., Pseudo R^2). Could you please recommend papers that have used multinomial logistic regression in Mplus?

Thanks so much!

Amir Sariaslan posted on Saturday, October 31, 2009 - 7:28 pm

Sara,

1) Predictor variables are not defined in Mplus. Both of your proposed approaches are discussed here: http://www.stat.columbia.edu/~cook/movabletype/archives/2009/10/coding_ordinal.html

2) Mplus does multilevel models ("hierarchical regression") but I presume that's not what you're looking for. Testing model fit in a step-wise fashion can be done through likelihood ratio testing. As for references, LCA/LCGA/GMM models with predictors will include a multinomial regression component. Just have a look under "Papers" on the Statmodel website.

/Amir

Linda K. Muthen posted on Sunday, November 01, 2009 - 9:45 am

1) In regression, covariates can be binary or continuous. In both cases, they are treated as continuous. You can leave the ordinal variable as it is or create a set of dummy variables. The CATEGORICAL option is for dependent variables.

2) Mplus does not have stepwise regression.
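
The likelihood ratio testing mentioned above can be carried out by hand from the loglikelihood values printed for two nested runs. The following Python sketch is only an illustration: the loglikelihood values are hypothetical, the p-value helper uses a closed form that holds only for an even number of degrees of freedom, and McFadden's pseudo R-square is one common definition rather than something Mplus reports. With the MLR estimator or TYPE=COMPLEX, a scaling correction is needed, so the plain difference shown here applies to ML only.

```python
import math


def chi2_sf_even_df(x, df):
    """Upper-tail chi-square probability; this closed form is valid only for even df."""
    if df <= 0 or df % 2 != 0:
        raise ValueError("this closed form needs a positive, even df")
    half = x / 2.0
    term, total = 1.0, 1.0
    for i in range(1, df // 2):
        term *= half / i
        total += term
    return math.exp(-half) * total


def lr_test(ll_reduced, ll_full, df):
    """Likelihood ratio statistic and p-value for two nested models."""
    stat = 2.0 * (ll_full - ll_reduced)
    return stat, chi2_sf_even_df(stat, df)


def mcfadden_r2(ll_model, ll_null):
    """McFadden's pseudo R-square: 1 - LL(model) / LL(intercepts only)."""
    return 1.0 - ll_model / ll_null


# Hypothetical loglikelihoods from three runs on the same cases.
ll_null = -520.0    # intercepts only
ll_step1 = -480.0   # first set of predictors
ll_step2 = -470.5   # first and second set of predictors

# Assumed example: 3 outcome categories, 2 added predictors -> 2 * 2 = 4 extra slopes.
stat, p_value = lr_test(ll_step1, ll_step2, df=4)
r2_step2 = mcfadden_r2(ll_step2, ll_null)
print(f"LR = {stat:.2f}, p = {p_value:.4f}, McFadden R2 = {r2_step2:.3f}")
```

The degrees of freedom equal the number of parameters added in the second step, which for a nominal outcome is the number of added predictors times the number of non-reference categories.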

fritz posted on Thursday, June 17, 2010 - 9:35 am

Hello,

I've got quite a simple multinomial logistic regression model (like example 3.6 in the User's Guide).

My nominal variable has three unordered categories, however, and I understand that the last category is taken as the reference group. Thus, I got odds ratios for group 1 vs. group 3 and group 2 vs. group 3. But what about testing group 1 vs. group 2? Do I have to run separate logistic regressions with dummy-coded outcomes to test each group against each other? Or is there a way to calculate odds ratios for group 1 vs. group 2 from the results of my multinomial regression analysis?

Thanks in advance!

Linda K. Muthen posted on Thursday, June 17, 2010 - 10:32 am

We usually give these alternatives. I would need more information to understand why you are not getting them. Please send the full output and your license number to support@statmodel.com.

Edith posted on Friday, June 17, 2011 - 11:52 am

Hello Linda,

I have two questions regarding multinomial regression (they are very similar to the questions raised in this thread before, however, I did not find an answer so far):

1. Mplus output of example 3.6 shows the estimates and the odds for Kat#1 vs Kat#3 and Kat#2 vs Kat#3 (the nominal dependent variable has three unordered categories). However, is there a way to receive the values for Kat#1 vs Kat#2?

2. Does Mplus provide a (pseudo) R-square for the nominal dependent variable (like Nagelkerke, McFadden, or Cox & Snell)? The output of example 3.6 does not show an R-square. Is there a way for Mplus to calculate it?

Thank you very much for your help!

Linda K. Muthen posted on Friday, June 17, 2011 - 1:15 pm

1. Class 3 is the reference category. You would need to change the reference category using the DEFINE command to make category two the last category.

2. No.
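
As a cross-check on a rerun with a different reference category, the category 1 vs. category 2 contrast can also be worked out from the two sets of coefficients that share category 3 as the reference: the log odds for 1 vs. 2 is the difference of the two slopes. The Python sketch below uses hypothetical estimates, and the covariance between the two slopes is assumed to have been read from the estimated parameter covariance matrix (for example via the TECH3 output option).

```python
import math


def contrast_1_vs_2(b1, b2, se1, se2, cov12):
    """Category 1 vs. category 2 contrast from slopes that both use category 3 as reference.

    b1, b2   : slopes for category 1 vs. 3 and category 2 vs. 3 on the same covariate
    se1, se2 : their standard errors
    cov12    : covariance between the two slope estimates
    """
    diff = b1 - b2
    se = math.sqrt(se1 ** 2 + se2 ** 2 - 2.0 * cov12)
    return {
        "log_odds": diff,
        "odds_ratio": math.exp(diff),
        "se": se,
        "z": diff / se,
    }


# Hypothetical estimates, for illustration only.
result = contrast_1_vs_2(b1=0.8, b2=0.3, se1=0.2, se2=0.25, cov12=0.01)
print(result)
```

Ignoring the covariance term would generally give the wrong standard error, because the two slopes are estimated from the same model and are usually correlated.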

Yijie Wang posted on Monday, October 24, 2011 - 9:19 am

Hi Dr. Muthen,

I'm conducting a multinomial logistic regression with nested data. My syntax is as follows:

usevariables are c_rlow prstrand;
nominal is c_rlow;
cluster is feeder;

ANALYSIS: type=complex;

MODEL: c_rlow on prstrand;

OUTPUT: standardized CINTERVAL ;

Strangely, in the STDYX results, all the coefficients are either 1 or -1, with p-value as 999.00. I know that I should look at the unstandardized results. But this strange output in STDYX makes me wonder if my model is running correctly. Would you please help me with this issue? Thank you very much!

Bengt O. Muthen posted on Monday, October 24, 2011 - 8:36 pm

You don't want to use STDYX when the DV (that is Y) is nominal. STDX or raw results should be used.

Yijie Wang posted on Monday, October 24, 2011 - 10:07 pm

Hi Dr. Muthen,

Thank you for your response! I have one more follow-up question. So I don't need to be concerned about the strange coefficients and p-values in the STDYX output? Thank you!

Linda K. Muthen posted on Tuesday, October 25, 2011 - 2:47 pm

You should ignore STDYX with nominal outcomes.

Yijie Wang posted on Tuesday, October 25, 2011 - 2:57 pm

Hi Dr. Muthen,

I got it. Thank you very much for your help!

Jaana Minkkinen posted on Tuesday, June 30, 2015 - 2:14 am

I am a little bit confused about the proper standardization. I have a nominal DV and binary and continuous covariates, so I should use raw estimates or the STDY standardization for covariates, if I understand the previous conversation correctly. However, is it correct to use STDY standardization for all covariates, including both binary and continuous covariates? I mean that if I needed a table with standardized coefficients, is it proper to use STDY standardization for all my covariates? Are the coefficients of binary and continuous covariates with STDY standardization comparable with each other?

Bengt O. Muthen posted on Wednesday, July 01, 2015 - 5:31 pm

Use STDX and divide the coefficients for the binary x's with their respective standard deviations.
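
One way to read this advice is sketched below in Python. It assumes that an x-standardized slope equals the raw slope times the standard deviation of x, which is the usual definition but is not spelled out in the thread. All numbers are hypothetical. Under that assumption, dividing the standardized slope of a binary covariate by its standard deviation simply recovers the effect of moving from 0 to 1.

```python
import math


def stdx_coefficient(b_raw, sd_x):
    """Slope standardized with respect to x only (assumed definition: b * SD(x))."""
    return b_raw * sd_x


def binary_sd(p):
    """Standard deviation of a 0/1 variable with proportion p of ones."""
    return math.sqrt(p * (1.0 - p))


# Hypothetical raw slopes on the logit scale.
b_age_raw, sd_age = 0.05, 10.0        # continuous covariate
b_female_raw, p_female = 0.60, 0.40   # binary covariate

stdx_age = stdx_coefficient(b_age_raw, sd_age)                  # per 1 SD of age
stdx_female = stdx_coefficient(b_female_raw, binary_sd(p_female))
female_effect = stdx_female / binary_sd(p_female)               # back to a 0 -> 1 change

print(f"age, per SD: {stdx_age:.3f}; female, 0 vs 1: {female_effect:.3f}")
```

The two numbers then answer different questions (a one-SD shift versus a switch between groups), which is worth keeping in mind when placing them in the same table.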

Jaana Minkkinen posted on Friday, July 03, 2015 - 12:58 am

Thank you for your help, it is greatly appreciated. I have one further question. When I divided the coefficient for the significant binary x with its respective standard deviation I got a bigger value than the standardized coefficients of the significant continuous x:s. Is it correctly interpreted that this binary x has a greater effect than the continuous x:s on the DV?

Christina Dyar posted on Friday, December 11, 2015 - 12:06 pm

Dear Drs. Muthén,
I am currently running a model in which a multinomial variable (3 categories) needs to function as an outcome variable and a predictor variable. However, it appears that Mplus is unable to use a nominal variable as a predictor variable (without dummy coding the variable). I am wondering if you have any ideas about how to approach this model.

I've used two approaches to attempt to model it. In the first, I used the nominal version of the variable as a dependent variable and a dummy coded version of the variable (2 dummy coded variables) as the predictor within the same model, but I don't think this is appropriate in a structural equation modeling context.

In the second approach, I've split the model in two, with the first model predicting the nominal outcome variable and the second model starting with the dummy coded variables as predictors.

However, I would prefer to be able to run this as a single model. Is there any way to do so? I would greatly appreciate any suggestions.

Bengt O. Muthen posted on Saturday, December 12, 2015 - 4:53 pm

You can do this using Knownclass, that is, Type=Mixture where the categories of the observed nominal variable are set equal to latent classes. See our FAQ:

Making an observed categorical variable u equivalent to a latent class variable c

Bengt O. Muthen posted on Saturday, December 12, 2015 - 5:04 pm

I should add that for the nominal latent class variable c,

c--> y

is then captured by different means of y in the different classes.

Jennifer posted on Tuesday, January 10, 2017 - 6:43 am

I am using SEM for model building, where I have a number of manifest variables (dichotomous) and continuous latent variables as predictors of a nominal (4 categories) dependent variable. I have a few specific questions.

1. I am trying to run a multiple mediation model, where X is binary manifest, Ms are continuous latent variables, and U is the nominal variable. When I tried running this I encountered an error message stating "the following indirect effect is currently not supported," referring to the indirect path between X and U. Is there any way to accomplish this analysis in Mplus?

2. It seems that my analyses might be simpler if I were able to use WLSMV as my estimator, though I realize it's not possible to do this with multinomial regression. I was wondering if it were possible to "trick" Mplus by creating three dummy dichotomous variables and including them in the same analysis to replace the nominal variable as the outcome. I tried this but Mplus recognized that the sample correlation between these dummy variables was 1.00 so the model did not run. Is there a way to decompose a nominal variable in Mplus so that the system perceives it as three dependent categorical variables? I do not want to run three separate analyses because two of my groups have small sample sizes, so I lose a lot of the power.

Many thanks in advance for your help!

Bengt O. Muthen posted on Tuesday, January 10, 2017 - 3:46 pm

1. I am not sure what one should mean by indirect and direct effects for a nominal outcome. Effects are defined in terms of the expectation of the outcome and there is not a single value for that with a nominal outcome. I would dichotomize.

2. Because 2 of your groups have small numbers, I would simply do one binary outcome run - the big category versus the other two.

Raffaele Zanoli posted on Monday, February 06, 2017 - 10:19 am

Hello I would like to know if it is possible to fix the intercept and/or slope of the last category (reference category) in a multinomial logistic regression to a number other than zero (e.g. 1).

I used Model constraint but I cannot make reference to the last category.

Thank you!

Bengt O. Muthen posted on Monday, February 06, 2017 - 3:50 pm

I advise you to recode your nominal variable instead.

Raffaele Zanoli posted on Tuesday, February 07, 2017 - 12:04 am

Sorry, I am not sure I grasped the answer. How should I recode the variable to impose that the intercept of the last category is 1? Do you mean I should change the reference category and then impose the constraint?

Bengt O. Muthen posted on Wednesday, February 08, 2017 - 2:32 pm

Yes, on your last sentence.

Raffaele Zanoli posted on Thursday, February 09, 2017 - 12:07 pm

Still I cannot get the model I wish.
My model is:

VARIABLE:
NAMES are u1 x1 x2 x3 x4;
NOMINAL is u1;
MODEL:
u1#1 on x1 (1);
u1#2 on x2 (1);
u1#3 on x3 (1);

[u1#1 u1#2 u1#3];

The issue is that I want the last equation (u1#4) to be modelled with an intercept of zero (needed for identification) but with the slope of u1#4 on x4 set equal to the slopes of the other u1 categories.

I find it quite odd, compared to other packages doing multinomial logistic regression (e.g. the Stata command clogit), that the SLOPE parameter is necessarily constrained to zero. Is there a way around this default zero slope?

Bengt O. Muthen posted on Friday, February 10, 2017 - 10:29 am

clogit stands for conditional logistic regression. Mplus does not do that but instead the regular multinomial logistic regression which typically uses the standardization of zero for the last intercept and slope.

timboto17@gmail.com posted on Sunday, April 02, 2017 - 3:54 pm

Dear Mplus team,
I am running a multinomial logistic regression with Mplus. My dependent variable has 6 categories, including "don't know" and "doesn't apply". As I am interested in the other 4 categories of this variable, is it better to discard the two categories previously mentioned or to keep them in the analysis (for instance by recoding them as a single category but not interpreting the results associated with this category)? Thank you kindly for your help.

Bengt O. Muthen posted on Thursday, April 06, 2017 - 5:51 pm

Keep them in the analysis. You might also find something interesting for them.

Stig Hebbelstrup Rye Rasmussen posted on Monday, November 13, 2017 - 2:25 pm

Dear Mplus team,

I am estimating a multinomial logistic regression with three outcomes, and I have some difficulties deciding on the correct estimation of confidence intervals for predicted probabilities.

In chapter 4 of your Regression and Mediation Analysis book you nicely describe how to calculate predicted probabilities, but many of the confidence intervals for my predicted probabilities contain both negative values and values larger than one.

The dataset is clustered so I use the type=complex option and cannot get bootstrapped confidence intervals where strange probabilities are usually not a problem.

For an ordinary logistic regression I can use MODEL CONSTRAINT to obtain the confidence intervals in the following way, where I first obtain the standard errors (se) by estimating the logit coefficients.

exp(logit.coef+1.96*se.logit)/(1+exp(logit.coef+1.96*se.logit))

Is it possible to do something similar for a multinomial logit regression?

E.g. for the reference category: 1/(1+exp(logit1.coef+1.96*logit1.se)+exp(logit2.coef+1.96*logit2.se))

Tihomir Asparouhov posted on Tuesday, November 14, 2017 - 8:38 am

The best thing to do is just let model constraint give you the confidence interval.
model constraint: new(p3);
p3=1/(1+exp(l1)+exp(l2));
See User's guide example 3.10 for how to use model constraints.

If that doesn't work (very rarely it won't work) you can use model constraint to get SE for
A = log(exp(logit1.coef)+exp(logit2.coef))
then use
1/(1+exp(A+1.96*se.A))
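
The second suggestion keeps the interval inside (0, 1) because the symmetric interval is built on the scale of A and only then transformed to a probability. The Python sketch below works through it with hypothetical logits and a hypothetical covariance matrix. The standard error of A is obtained with a delta-method approximation, which is assumed here to stand in for the SE that MODEL CONSTRAINT would report.

```python
import math


def reference_probability_ci(l1, l2, cov, z=1.96):
    """Probability of the reference category with an interval built on the scale of A.

    l1, l2 : logits for categories 1 and 2 at the chosen covariate values
    cov    : 2x2 covariance matrix of (l1, l2), as nested lists
    """
    e1, e2 = math.exp(l1), math.exp(l2)
    s = e1 + e2
    a = math.log(s)                      # A = log(exp(l1) + exp(l2))
    g1, g2 = e1 / s, e2 / s              # gradient of A with respect to (l1, l2)
    var_a = g1 * g1 * cov[0][0] + 2.0 * g1 * g2 * cov[0][1] + g2 * g2 * cov[1][1]
    se_a = math.sqrt(var_a)
    p3 = 1.0 / (1.0 + math.exp(a))
    # A larger A means a smaller p3, so the upper limit of A gives the lower limit of p3.
    lower = 1.0 / (1.0 + math.exp(a + z * se_a))
    upper = 1.0 / (1.0 + math.exp(a - z * se_a))
    return p3, lower, upper


# Hypothetical values, for illustration only.
p3, lower, upper = reference_probability_ci(0.4, -0.2, [[0.09, 0.01], [0.01, 0.16]])
print(f"p3 = {p3:.3f}, 95% CI = ({lower:.3f}, {upper:.3f})")
```

The resulting interval is not symmetric around p3, which is expected for a probability close to 0 or 1.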

Stig Hebbelstrup Rye Rasmussen posted on Tuesday, November 14, 2017 - 10:34 am

Hi Tihomir

I used the model constraint using your first suggestion originally which gave me the strange probabilities I described above.

Your second suggestion however works perfectly! Thank you so much.

Daria Shamrova posted on Thursday, April 19, 2018 - 8:40 am

Hello Drs. Muthén,

I have a question about the model fit summary for multinomial logistic regression. I get the loglikelihood and information criteria in the output. Is there a way to know that the model is significant based on the information provided in Mplus? What model fit information should be reported for multinomial regression results obtained in Mplus? Thank you!

Bengt O. Muthen posted on Thursday, April 19, 2018 - 4:14 pm

Typically, no fit statistics are reported. It is difficult to provide relevant fit statistics unless all covariates are binary so that you can look at a frequency table. You report the estimates (particularly slopes) and their significance and perhaps provide a probability plot like we show in our RMA book. See also categorical data books by Agresti and Long.

Daria Shamrova posted on Thursday, April 19, 2018 - 6:36 pm

Thank you, Dr. Muthen! It helps

Jayme Walters posted on Thursday, May 17, 2018 - 12:26 pm

I am having some trouble with conducting a multinomial logistic regression.

1. Cases with missing data on the IVs were being deleted. I read through this forum and found this statement: "The only way to avoid the listwise deletion of covariates is to bring them into the model as dependent variables. You can do this by mentioning their variances in the MODEL command. You then make distributional assumptions about them."

I am not clear on how to mention the variances in the model. Also, what distributional assumptions would be made?

2. Error: ONE OR MORE PARAMETERS WERE FIXED TO AVOID SINGULARITY OF THE INFORMATION MATRIX. THE SINGULARITY IS MOST LIKELY DUE TO THE MODEL IS NOT IDENTIFIED, OR DUE TO A LARGE OR A SMALL PARAMETER ON THE LOGIT SCALE. THE FOLLOWING PARAMETERS WERE FIXED:
Parameter 7, CLASS#1 ON WHITE
Parameter 14, CLASS#2 ON WHITE

The following is the syntax:

TITLE: MLR Test Day
DATA:
FILE IS "E:\WALTERS\Mason\5.14.18\MLRtestdata.dat";
VARIABLE:
NAMES ARE id flag female age income12 class black othrace white hsorless somecoll collgrad;
IDVARIABLE = id;
Usevariables are female age income12 class black white hsorless somecoll;
Nominal is class;
Useobservations are (Flag EQ 1);
MISSING ARE ALL (999);
Analysis:
estimator = ML;
Model:
class#1 class#2 on female age income12 black white hsorless somecoll;

Bengt O. Muthen posted on Thursday, May 17, 2018 - 4:40 pm

1. To mention the variance of X in a model, say

x;

The distributional assumption is normality of X.

2. Check if the White variable has variability in the class#1 category. Same for class#2.
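
A quick way to run this check before estimating the model is to list which values of the covariate occur within each outcome category. The Python sketch below uses a small made-up data set. The column names mirror the syntax above, while the helper names are hypothetical, and missing-value codes such as 999 would need to be dropped first.

```python
from collections import defaultdict


def covariate_values_by_category(rows, outcome, covariate):
    """Collect the distinct covariate values observed within each outcome category."""
    seen = defaultdict(set)
    for row in rows:
        seen[row[outcome]].add(row[covariate])
    return dict(seen)


def categories_without_variability(rows, outcome, covariate):
    """Outcome categories in which the covariate takes a single value only."""
    by_category = covariate_values_by_category(rows, outcome, covariate)
    return sorted(cat for cat, values in by_category.items() if len(values) < 2)


# Made-up rows, for illustration only.
rows = [
    {"class": 1, "white": 1}, {"class": 1, "white": 1},
    {"class": 2, "white": 1}, {"class": 2, "white": 1},
    {"class": 3, "white": 0}, {"class": 3, "white": 1},
]

flagged = categories_without_variability(rows, "class", "white")
print("No variability in categories:", flagged)
```

A covariate that is constant within a category pushes the corresponding logit slope toward plus or minus infinity, which is consistent with the message about a large or small parameter on the logit scale.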

Jayme Walters posted on Friday, May 25, 2018 - 9:44 am

Dr. Muthen,
Thank you for your help. I am new to Mplus so I am still struggling a bit with this.

Below is the syntax. Is this the way you mention the variance? If so, I'm struggling with interpretation. I want to see if income is a predictor of class assignment.

With my second issue, you were right. There is no variability in class 1 or 2. Does that mean I should remove that variable? Thank you!

TITLE: MLR Test Day
DATA:
FILE IS "E:\WALTERS\Mason\5.14.18\MLRtestdata.dat";

VARIABLE:
NAMES ARE id flag female age income12 class black othrace
white hsorless somecoll collgrad;
IDVARIABLE = id;
Usevariables are female age income12 class black white hsorless somecoll;
Nominal is class;
Useobservations are (Flag EQ 1);
MISSING ARE ALL (999);

Analysis:
estimator = ML;

Model:
class#1 class#2 income12 on age female black white hsorless somecoll;

Bengt O. Muthen posted on Friday, May 25, 2018 - 12:52 pm

To mention the variance - see the FAQ on our website: Missing on x's. Note that in mixtures with "c ON x", this can lead to heavy computations due to numerical integration.

Yes, remove the variable.

Su Jung Park posted on Wednesday, February 20, 2019 - 11:43 am

Hello Dr. Muthen

I'm going to conduct a multinomial logit regression using 4 groups as the dependent variable. I want to set the first group (group 1) as the reference group. I understand I need the DEFINE command. However, I could not find information on how to do this. Could you write the syntax for me? Thank you so much in advance.

Bengt O. Muthen posted on Wednesday, February 20, 2019 - 2:13 pm

If I am not mistaken, the output shows you results using all possible reference groups.

Su Jung Park posted on Wednesday, February 20, 2019 - 5:38 pm

Thank you for your reply. However, the last group is set as the reference by default, and I need to change this so that the first group is the reference group. If you could let me know how to do this using a command, I would appreciate it.

Bengt O. Muthen posted on Friday, February 22, 2019 - 10:55 am

Just re-score your nominal DV so that the first category has the highest value.
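
The re-scoring can be done in Mplus itself or in the data file before it is read in. As an illustration of the latter, the Python sketch below moves category 1 to the highest code and shifts the remaining categories down by one. The function name and the example values are hypothetical.

```python
def make_first_category_reference(values, n_categories):
    """Recode 1..n so that old category 1 gets the highest code and becomes the last category."""
    mapping = {1: n_categories}
    mapping.update({k: k - 1 for k in range(2, n_categories + 1)})
    return [mapping[v] for v in values], mapping


# Hypothetical outcome codes for a 4-category nominal variable.
recoded, mapping = make_first_category_reference([1, 2, 3, 4, 1], n_categories=4)
print(mapping)   # old code -> new code
print(recoded)
```

After the recode, the category labels in the output shift as well: the new category 1 is the old category 2, and so on, so results need to be relabelled when reported.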

Jessica Smith posted on Saturday, April 20, 2019 - 11:10 pm

Hello,
I am using a manual 3step LCA.

For the odds ratio, it seems Mplus still takes the last group as the reference group, and does not provide alternatives like it does for parameters.

Can you please be specific on how to change my reference group from group#5 to group#1?

I changed starting value, that was just testing my luck, and I was not lucky to have C#1 change to C#5.

I re-scored my nominal DV, and it seems that the logits for the classification probabilities are all messed up.

Thank you very much,

Jessica Smith posted on Sunday, April 21, 2019 - 8:06 am

My apologies for double posting. I think I figured it out. e^parameter solves the problem. Thanks for providing the alternative parameterization.

Wen, Fur-Hsing posted on Tuesday, January 28, 2020 - 7:45 pm

I want to test two multinomial regression coefficients below,
but I cannot get the results of the model test.

MODEL:
cesd5g#1 on sex (a11);
cesd5g#1 on age (a12);
cesd5g#1 on spout child;

cesd5g#2 cesd5g#3 cesd5g#4 on sex age spout child;
cesd5g#1 cesd5g#2 cesd5g#3 cesd5g#4 on cra cesd moss soc ex insuff aware fmfeel ;

pg5g#1 on sex (b11);
pg5g#1 on age (b12);
pg5g#1 on spout child;

pg5g#2 pg5g#3 pg5g#4 on sex age spout child ;
pg5g#1 pg5g#2 pg5g#3 pg5g#4 on cra cesd moss soc ex insuff aware fmfeel ;

model test:
a11 = b11;
a12 = b12;

Tihomir Asparouhov posted on Thursday, January 30, 2020 - 12:05 pm

Look in the output to see what error message is printed. If that doesn't help send your example to support@statmodel.com

Jill Rabinowitz posted on Sunday, September 06, 2020 - 5:59 pm

Hi there,

I want to run a multinomial logistic regression. My data are in the long format and there are multiple observations per person.

How would I modify the code below to indicate that there are multiple observations per person?

VARIABLE:
NAMES ARE u1 x1 x3;
NOMINAL IS u1;
MODEL:
u1#1 u1#2 ON x1 x3;

Bengt O. Muthen posted on Monday, September 07, 2020 - 3:46 pm

You would use

Cluster = id;

Analysis: Type=Twolevel;

Model:

%Within%
u1#1 u1#2 ON x1 x3;
%Between%
u1#1 with u1#2;

Lili Toh posted on Monday, September 14, 2020 - 9:50 pm

I've a model where 4 latent continuous and observed scale variables (math ach and uai) predict a nominal outcome (3 categories).

I'd like to run a multigroup analysis with gender as 2 groups. I read in the Mplus 8.4 users manual that for nominal DVs, knownclass should be used with type = mixture.

However, I'm not sure about the syntax for 1. the configural invariance model (unconstrained), 2. the metric invariance model (loadings constrained equal across groups), and 3. the scalar invariance model (intercepts constrained equal across groups). Other examples I found in the manual are for latent class variables.

Lili Toh posted on Monday, September 14, 2020 - 9:51 pm

Should the configural invariance model look like this?

VARIABLE: NAMES ARE gender m_ach_9 uai uni se51 se52 se78 int44 int45 int46 imp73 imp74 imp75 ptd57 ptd58 ptd59;
Missing are all (-999);
Usevariables are gender m_ach_9 uai uni se51 se52 se78 int44 int45 int46 imp73 imp74 imp75 ptd57 ptd58 ptd59;
Nominal is uni;
KNOWNCLASS = gender (1=male 2=female);
ANALYSIS: TYPE = MIXTURE;
INTEGRATION = MONTECARLO;
MODEL: SUCCEXP BY se51 se52 se78;
INT BY int44 int45 int46;
IMP BY imp73 imp74 imp75;
PERDIFF BY ptd57 ptd58 ptd59;
SUCCEXP WITH INT IMP PERDIFF m_ach_9;
INT WITH IMP PERDIFF m_ach_9;
IMP WITH PERDIFF m_ach_9;
PERDIFF WITH m_ach_9;
UAI ON SUCCEXP INT IMP PERDIFF m_ach_9;
uni#1 uni#2 ON SUCCEXP INT IMP PERDIFF m_ach_9;

Bengt O. Muthen posted on Tuesday, September 15, 2020 - 6:07 pm

Our UG describes our recommended testing sequence - check configural etc in the index.

LT posted on Tuesday, September 15, 2020 - 6:28 pm

Thank you for directing me there! The index led me to pg. 670 about testing for model invariance.

It states that the model option to test measurement invariance is not available for nominal variables. However, throughout the UG it was stated that multiple group analysis for nominal outcome variables can be done using KNOWNCLASS. I don't have any latent class IVs so am not sure which approach is best.

Thank you again - I am a beginner to Mplus and certainly appreciate it. I am also open to getting paid help from an expert if you know of anyone to refer to.

Bengt O. Muthen posted on Wednesday, September 16, 2020 - 3:05 pm

Your nominal variable (uni) is not a factor indicator so measurement invariance is not relevant.

You can also ask general analysis questions on SEMNET.

If this doesn't help, send your output to Support along with your license number.