We are working on a project that aims to estimate the probability of responding to a particular ad campaign. Our categorical dependent variable has 5 categories: 4 ad campaigns, which are not intrinsically ordered, and a fifth category for non-response to any of the ads. We run the model in SAS with various independent variables.
We are facing the following issues:
1) There is a bias towards non-response to the campaign (we tried to reduce the bias by adding weights to our model).
2) Some categories are insignificant. For example, in IV1 the categories AD1, AD2 and AD4 are insignificant. Should we drop the variable from our analysis altogether because some of its categories are insignificant? If not, kindly suggest an appropriate method to follow.
I request you to please provide us with some insights.
This forum is geared toward Mplus not other statistical programs. It would be necessary to know a lot more about your research to give you suggestions and Mplus Discussion is not able to help to that extent.
Sara May posted on Saturday, October 31, 2009 - 5:39 pm
I am running a multinomial logistic regression in Mplus. Could you please help me with the following information:
1) One of the predictor variables is ordinal. If I define this variable as ordered categorical in Mplus, how does Mplus handle this level of measurement when estimating the regression model? Would the variable be treated as if it had an interval level of measurement? Or would it be better represented by a set of dummy variables (as is typical in other software)?
2) Is it possible to run a hierarchical regression model in Mplus? I.e., add one set of predictor variables in step 1 and another set of variables subsequently. If so, what coefficients does Mplus provide to estimate the improvement in model fit? I wasn't able to find indexes that are typically included in other software (e.g., Pseudo R^2). Could you please recommend papers that have used multinomial logistic regression in Mplus?
2) Mplus does multilevel models ("hierarchical regression") but I presume that's not what you're looking for. Testing model fit in a step-wise fashion can be done through likelihood ratio testing. As for references, LCA/LCGA/GMM models with predictors will include a multinomial regression component. Just have a look under "Papers" on the Statmodel website.
1) In regression, covariates can be binary or continuous. In both cases, they are treated as continuous. You can leave the ordinal variable as it is or create a set of dummy variables. The CATEGORICAL option is for dependent variables.
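The likelihood ratio test suggested in the reply to question 2 can be done by hand from the loglikelihood values that Mplus prints for each step. The Python sketch below is only an illustration: the loglikelihoods are made-up numbers, and the function names are hypothetical. It assumes plain ML estimation; with a robust estimator such as MLR the difference would first need the scaling correction, which is not shown here. McFadden's pseudo R-square is included because it needs nothing beyond the same loglikelihoods.

```python
import math


def chi2_sf(x, df):
    """Upper-tail probability of a chi-square variate with integer df."""
    if x <= 0:
        return 1.0
    if df % 2 == 0:
        # Even df: exp(-x/2) * sum_{i < df/2} (x/2)^i / i!
        term, total = 1.0, 1.0
        for i in range(1, df // 2):
            term *= (x / 2) / i
            total += term
        return math.exp(-x / 2) * total
    # Odd df: normal-tail term plus a finite correction series
    chi = math.sqrt(x)
    pdf = math.exp(-x / 2) / math.sqrt(2 * math.pi)
    term, total = chi, 0.0
    for r in range(1, (df - 1) // 2 + 1):
        if r > 1:
            term *= x / (2 * r - 1)
        total += term
    return math.erfc(chi / math.sqrt(2)) + 2 * pdf * total


def likelihood_ratio_test(ll_restricted, ll_full, extra_params):
    """LR statistic and p-value for two nested models fitted by ML."""
    stat = 2.0 * (ll_full - ll_restricted)
    return stat, chi2_sf(stat, extra_params)


def mcfadden_r2(ll_null, ll_model):
    """McFadden's pseudo R-square: 1 - LL(model) / LL(intercept-only)."""
    return 1.0 - ll_model / ll_null


# Hypothetical loglikelihoods (not from any real output).
ll_null = -540.0    # intercepts only
ll_step1 = -512.4   # first set of predictors
ll_step2 = -505.1   # first + second set

# With a 3-category outcome, each added covariate adds 2 slopes,
# so 2 new covariates give 4 extra parameters.
stat, p_value = likelihood_ratio_test(ll_step1, ll_step2, extra_params=4)
print(f"LR statistic = {stat:.2f}, p = {p_value:.4f}")
print(f"McFadden R2 at step 2 = {mcfadden_r2(ll_null, ll_step2):.3f}")
```

The degrees of freedom equal the number of parameters added between the two steps, which is why each new covariate counts once per non-reference category.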
I've got quite a simple multinomial logistic regression model (like example 3.6 in the User's Guide).
My nominal variable has three unordered categories, however, and I understand that the last category is taken as the reference group. Thus, I've got odds ratios for group 1 vs. group 3 and group 2 vs. group 3. But what about testing group 1 vs. group 2? Do I have to run separate logistic regressions with dummy-coded outcomes to test each group against each other? Or is there a way to calculate odds ratios for group 1 vs. group 2 from the results of my multinomial RA?
I have two questions regarding multinomial regression (they are very similar to the questions raised in this thread before, however, I did not find an answer so far):
1. Mplus output of example 3.6 shows the estimates and the odds for Kat#1 vs Kat#3 and Kat#2 vs Kat#3 (the nominal dependent variable has three unordered categories). However, is there a way to receive the values for Kat#1 vs Kat#2?
2. Does Mplus provide a (pseudo) R-square for the nominal dependent variable (like Nagelkerke, McFadden or Cox & Snell)? The output of example 3.6 does not show an R-square. Is there a way for Mplus to calculate it?
1. Class 3 is the reference category. You would need to change the reference category using the DEFINE command to make category two the last category.
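Besides re-running the model with a different reference category, the point estimate for category 1 vs. category 2 follows directly from the two sets of coefficients, because the log-odds of 1 vs. 2 is the difference between the 1 vs. 3 and 2 vs. 3 slopes. The sketch below illustrates this arithmetic with made-up numbers. The standard error needs the covariance between the two estimates, which is assumed here to be read from the parameter covariance matrix (the TECH3 output); the function name and all values are hypothetical.

```python
import math


def contrast_non_reference(b1, b2, se1, se2, cov12, z=1.96):
    """Odds ratio for category 1 vs. category 2 of a nominal outcome.

    b1, b2   : slopes of the same covariate for 1 vs. ref and 2 vs. ref
    se1, se2 : their standard errors
    cov12    : covariance between the two slope estimates
    """
    logit = b1 - b2
    se = math.sqrt(se1 ** 2 + se2 ** 2 - 2.0 * cov12)
    return {
        "logit": logit,
        "se": se,
        "odds_ratio": math.exp(logit),
        "ci": (math.exp(logit - z * se), math.exp(logit + z * se)),
    }


# Hypothetical estimates, not taken from example 3.6.
result = contrast_non_reference(b1=0.9, b2=0.4, se1=0.30, se2=0.25, cov12=0.02)
print(result)
```

Ignoring the covariance (treating it as zero) would usually overstate the standard error, since the two slopes share the same reference category and tend to be positively correlated.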
Yijie Wang posted on Monday, October 24, 2011 - 9:19 am
Hi Dr. Muthen,
I'm conducting a multinomial logistic regression with nested data. My syntax is as follows:
usevariables are c_rlow prstrand; nominal is c_rlow; cluster is feeder;
MODEL: c_rlow on prstrand;
OUTPUT: standardized CINTERVAL ;
Strangely, in the STDYX results, all the coefficients are either 1 or -1, with p-value as 999.00. I know that I should look at the unstandardized results. But this strange output in STDYX makes me wonder if my model is running correctly. Would you please help me with this issue? Thank you very much!
I am a little confused about the proper standardization. I have a nominal DV and both binary and continuous covariates, so, if I understand the previous conversation correctly, I should use either the raw estimates or the STDY standardization for the covariates. However, is it correct to use the STDY standardization for all covariates, both binary and continuous? In other words, if I need a table of standardized coefficients, is it proper to use the STDY standardization for all my covariates? Are the STDY-standardized coefficients of binary and continuous covariates comparable with each other?
Thank you for your help, it is greatly appreciated. I have one further question. When I divided the coefficient for the significant binary x by its standard deviation, I got a larger value than the standardized coefficients of the significant continuous x's. Is it correct to interpret this as the binary x having a greater effect on the DV than the continuous x's?
Dear Drs. Muthén, I am currently running a model in which a multinomial variable (3 categories) needs to function as an outcome variable and a predictor variable. However, it appears that Mplus is unable to use a nominal variable as a predictor variable (without dummy coding the variable). I am wondering if you have any ideas about how to approach this model.
I’ve used two approaches to attempt to model it. In the first, I used the nominal version of the variable as a dependent variable and a dummy coded version of the variable (2 dummy coded variables) as the predictor within the same model, but I don’t think this is appropriate in a structural equation modeling context.
In the second approach, I’ve split the model in two, with the first model predicting the nominal outcome variable and the second model starting with the dummy coded variables as predictors.
However, I would prefer to be able to run this as a single model. Is there any way to do so? I would greatly appreciate any suggestions.
I should add that for the nominal latent class variable c,
is then captured by different means of y in the different classes.
Jennifer posted on Tuesday, January 10, 2017 - 6:43 am
I am using SEM for model building, where I have a number of manifest variables (dichotomous) and continuous latent variables as predictors of a nominal (4 categories) dependent variable. I have a few specific questions.
1. I am trying to run a multiple mediation model, where X is binary manifest, Ms are continuous latent variables, and U is the nominal variable. When I tried running this I encountered an error message stating “the following indirect effect is currently not supported,” referring to the indirect path between X and U. Is there any way to accomplish this analysis in Mplus?
2. It seems that my analyses might be simpler if I were able to use WLSMV as my estimator, though I realize it’s not possible to do this with multinomial regression. I was wondering if it were possible to “trick” Mplus by creating three dummy dichotomous variables and including them in the same analysis to replace the nominal variable as the outcome. I tried this but Mplus recognized that the sample correlation between these dummy variables was 1.00 so the model did not run. Is there a way to decompose a nominal variable in Mplus so that the system perceives it as three dependent categorical variables? I do not want to run three separate analyses because two of my groups have small sample sizes, so I lose a lot of the power.
1. I am not sure what one should mean by indirect and direct effects for a nominal outcome. Effects are defined in terms of the expectation of the outcome and there is not a single value for that with a nominal outcome. I would dichotomize.
2. Because 2 of your groups have small numbers, I would simply do one binary outcome run - the big category versus the other two.
Sorry, I am not sure I grasped the answer. How should I recode the variable to impose that the intercept of the last category is 1? Do you mean I should change the reference category and then impose the constraint?
VARIABLE: NAMES are u1 x1 x2 x3 x4;
NOMINAL is u1;
MODEL: u1#1 on x1 (1);
u1#2 on x2 (1);
u1#3 on x3 (1);
[u1#1 u1#2 u1#3]
The issue is that I want the last equation (u1#4) to be modelled with its intercept fixed at zero (this is needed for identification) but with the slope of u1#4 on x4 set equal to the slopes of the other u1 categories.
I find it quite odd, compared with other packages that do multinomial logistic regression (e.g. the Stata command clogit), that the slope parameter is necessarily constrained to zero. Is there a way to get around this default zero slope?
clogit stands for conditional logistic regression. Mplus does not do that but instead the regular multinomial logistic regression which typically uses the standardization of zero for the last intercept and slope.
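Why the regular multinomial model fixes the last category's intercept and slope at zero can be seen numerically: the category probabilities depend only on differences between logits, so one set of coefficients has to be pinned down. The sketch below is a generic illustration with made-up coefficients (all names and values are hypothetical); it is not Mplus code and does not reproduce any particular output.

```python
import math


def softmax(logits):
    """Convert a list of logits into probabilities that sum to one."""
    m = max(logits)  # subtract the maximum for numerical stability
    expd = [math.exp(v - m) for v in logits]
    total = sum(expd)
    return [e / total for e in expd]


def category_probabilities(intercepts, slopes, x):
    """Probabilities for all categories of a nominal outcome.

    intercepts, slopes: one entry per non-reference category;
    the reference (last) category has intercept 0 and slope 0.
    """
    logits = [a + b * x for a, b in zip(intercepts, slopes)]
    logits.append(0.0)  # reference category
    return softmax(logits)


# Hypothetical coefficients for a 4-category outcome and one covariate.
intercepts = [0.5, -0.2, 1.0]
slopes = [0.8, 0.3, -0.4]
probs = category_probabilities(intercepts, slopes, x=1.5)
print([round(p, 3) for p in probs])

# Adding the same intercept and slope to every category, including the
# reference, leaves the probabilities unchanged, so such a shift cannot
# be estimated from the data.
shift_a, shift_b = 2.0, 0.7
shifted_logits = [a + shift_a + (b + shift_b) * 1.5
                  for a, b in zip(intercepts + [0.0], slopes + [0.0])]
probs_shifted = softmax(shifted_logits)
```

Conditional logit differs in that the covariate values themselves vary across alternatives, which is what allows a common slope there.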
Dear Mplus team, I am running a multinomial logistic regression with Mplus. My dependent variable has 6 categories, including “don’t know” and “doesn’t apply”. As I am interested in the other 4 categories of this variable, is it better to discard these two categories or to keep them in the analysis (for instance by recoding them as a single category, without interpreting the results associated with that category)? Thank you kindly for your help.
I am estimating a multinomial logistic regression with three outcomes, and I have some difficulties deciding on the correct estimation of confidence intervals for predicted probabilities.
In chapter 4 of your Regression and Mediation Analysis book you nicely describe how to calculate predicted probabilities, but many of the confidence intervals for my predicted probabilities contain negative values or values larger than one.
The dataset is clustered so I use the type=complex option and cannot get bootstrapped confidence intervals where strange probabilities are usually not a problem.
For an ordinary logistic regression I can use MODEL CONSTRAINT to obtain the confidence intervals by estimating them in the following way, where I first obtain the standard errors (se) by estimating the logit coefficients.
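The syntax this post refers to is not shown in the thread. As a generic illustration of the idea for the binary case, the sketch below compares two intervals for a predicted probability: a symmetric interval built directly on the probability scale, which can leave the range 0 to 1, and an interval built on the logit scale and then transformed, which cannot. The logit and its standard error are made-up numbers and the function names are hypothetical.

```python
import math


def inv_logit(v):
    """Map a logit to a probability."""
    return 1.0 / (1.0 + math.exp(-v))


def ci_probability_scale(logit, se, z=1.96):
    """Symmetric delta-method interval computed on the probability scale."""
    p = inv_logit(logit)
    se_p = p * (1.0 - p) * se  # derivative of inv_logit times se of the logit
    return p - z * se_p, p + z * se_p


def ci_logit_scale(logit, se, z=1.96):
    """Interval computed on the logit scale, then transformed to probabilities."""
    return inv_logit(logit - z * se), inv_logit(logit + z * se)


# Hypothetical predicted logit and standard error.
logit, se = 3.0, 1.2
sym_low, sym_high = ci_probability_scale(logit, se)
tr_low, tr_high = ci_logit_scale(logit, se)
print(f"probability      = {inv_logit(logit):.3f}")
print(f"symmetric CI     = ({sym_low:.3f}, {sym_high:.3f})")
print(f"transformed CI   = ({tr_low:.3f}, {tr_high:.3f})")
```

With three outcome categories each probability depends on two logits at once, so this one-dimensional transformation does not carry over directly; that is the difficulty the post describes.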
I have a question about the model fit summary for multinomial logistic regression. I get the loglikelihood and information criteria in the output. Is there a way to tell whether the model is significant based on the information provided in Mplus? What model fit information should be reported for multinomial regression results obtained in Mplus? Thank you!
Typically, no fit statistics are reported. It is difficult to provide relevant fit statistics unless all covariates are binary so that you can look at a frequency table. You report the estimates (particularly slopes) and their significance and perhaps provide a probability plot like we show in our RMA book. See also categorical data books by Agresti and Long.
I am having some trouble with conducting a multinomial logistic regression.
1. Cases with missing data on the IVs were being deleted. I read through this forum and found this statement: “The only way to avoid the listwise deletion of covariates is to bring them into the model as dependent variables. You can do this by mentioning their variances in the MODEL command. You then make distributional assumptions about them.”
I am not clear on how to mention the variances in the model. Also, what distributional assumptions would be made?
2. Error: ONE OR MORE PARAMETERS WERE FIXED TO AVOID SINGULARITY OF THE INFORMATION MATRIX. THE SINGULARITY IS MOST LIKELY DUE TO THE MODEL IS NOT IDENTIFIED, OR DUE TO A LARGE OR A SMALL PARAMETER ON THE LOGIT SCALE. THE FOLLOWING PARAMETERS WERE FIXED: Parameter 7, CLASS#1 ON WHITE Parameter 14, CLASS#2 ON WHITE
The following is the syntax:
TITLE: MLR Test Day
DATA: FILE IS "E:\WALTERS\Mason\5.14.18\MLRtestdata.dat";
VARIABLE: NAMES ARE id flag female age income12 class black othrace white hsorless somecoll collgrad;
IDVARIABLE = id;
Usevariables are female age income12 class black white hsorless somecoll;
Nominal is class;
Useobservations are (Flag EQ 1);
MISSING ARE ALL (999);
Analysis: estimator = ML;
Model: class#1 class#2 on female age income12 black white hsorless somecoll;
Dr. Muthen, Thank you for your help. I am new to Mplus so I am still struggling a bit with this.
Below is the syntax. Is this the way you mention the variance? If so, I'm struggling with interpretation. I want to see if income is a predictor of class assignment.
With my second issue, you were right. There is no variability in class 1 or 2. Does that mean I should remove that variable? Thank you!
TITLE: MLR Test Day DATA: FILE IS "E:\WALTERS\Mason\5.14.18\MLRtestdata.dat";
VARIABLE: NAMES ARE id flag female age income12 class black othrace white hsorless somecoll collgrad; IDVARIABLE = id; Usevariables are female age income12 class black white hsorless somecoll; Nominal is class; Useobservations are (Flag EQ 1); MISSING ARE ALL (999);
Analysis: estimator = ML;
Model: class#1 class#2 income12 on age female black white hsorless somecoll;
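The lack of variability mentioned above (a binary covariate that takes a single value within some outcome categories) is a common reason for slopes being fixed to avoid singularity, and it can be spotted before fitting with a simple cross-tabulation. The sketch below uses a small made-up data set, not the poster's file; the variable and function names are hypothetical.

```python
from collections import Counter


def empty_cells(outcome, predictor):
    """Return (outcome category, predictor level) pairs with no observations."""
    counts = Counter(zip(outcome, predictor))
    categories = sorted(set(outcome))
    levels = sorted(set(predictor))
    return [(c, l) for c in categories for l in levels if counts[(c, l)] == 0]


# Hypothetical data: nobody in class 1 or class 2 has white == 1.
class_membership = [1, 1, 1, 2, 2, 3, 3, 3, 3]
white = [0, 0, 0, 0, 0, 1, 0, 1, 0]

problem_cells = empty_cells(class_membership, white)
print(problem_cells)
```

An empty cell means the corresponding logit slope is pushed towards plus or minus infinity, which matches the "large or small parameter on the logit scale" wording of the message.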