2-level multinomial logit with random...
Mplus Discussion > Multilevel Data/Complex Sample
 Yohan Choi posted on Sunday, September 02, 2018 - 2:41 pm
How can I implement a two-level multinomial logistic model with separate but correlated random effects in Mplus?

Example: y = (1, 2, 3)
independent variables = x1, x2
cluster = pid

In Stata:
gsem (2.y <- x1 x2 M1[pid]) ///
(3.y <- x1 x2 M2[pid]), ///
cov(M1[pid]*M2[pid]) mlogit
(This is example 41g in the Stata SEM manual.)

How would this be specified in Mplus?
 Bengt O. Muthen posted on Monday, September 03, 2018 - 1:49 pm
Try using these pieces in a Type=Twolevel run (see UG chapter 9 for general twolevel setups) using the example of a 3-category nominal y variable.

Nominal = y;
within = x1 x2;

Model:
%Within%
y on x1 x2;
%Between%
y#1 with y#2;

On the between level, you can also regress the random effects y#1 and y#2 on a between-level covariate, or use them to predict some between-level DV (I don't think Stata can do that).
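As a rough sketch of the first option, the random intercepts could be regressed on a cluster-level covariate as shown here. The covariate name w is hypothetical (it does not appear in the thread), and the DATA command and NAMES list are omitted.

```mplus
VARIABLE:
  NOMINAL = y;
  CLUSTER = pid;
  WITHIN = x1 x2;
  BETWEEN = w;            ! w is a hypothetical cluster-level covariate
ANALYSIS:
  TYPE = TWOLEVEL;
MODEL:
  %WITHIN%
  y ON x1 x2;
  %BETWEEN%
  y#1 y#2 ON w;           ! random intercepts regressed on w
  y#1 y#2;                ! residual variances of the random intercepts
  y#1 WITH y#2;           ! residual covariance between them
```

Because w predicts the random intercepts, the between-level variances in this sketch are residual variances.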
 Yohan Choi posted on Monday, September 03, 2018 - 7:21 pm
Thank you, Professor.

I wonder whether the base category of y must be the highest-numbered category.

Is this fixed in Mplus, or can it be changed?
 Bengt O. Muthen posted on Tuesday, September 04, 2018 - 2:46 pm
Use Define to re-score the variable and thereby change the base category.
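A minimal sketch of such a re-scoring, assuming the three-category y from the first post and that category 1 should become the base category. The name ynew and the order of the NAMES list are assumptions made for illustration.

```mplus
VARIABLE:
  NAMES = pid y x1 x2;          ! assumed column order in the data file
  USEVARIABLES = x1 x2 ynew;    ! variables created in DEFINE go last
  NOMINAL = ynew;
  CLUSTER = pid;
  WITHIN = x1 x2;
DEFINE:
  ! Swap codes 1 and 3 so that the original category 1 becomes the
  ! highest-numbered category, which Mplus uses as the reference
  IF (y EQ 1) THEN ynew = 3;
  IF (y EQ 2) THEN ynew = 2;
  IF (y EQ 3) THEN ynew = 1;
```

After this change, ynew#1 refers to the original category 3 and ynew#2 to the original category 2, each contrasted with the original category 1.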
 Yohan Choi posted on Thursday, September 13, 2018 - 9:38 pm
Hello, Professor.
I specified the model as follows but ran into problems.

I show only part of the input. y takes the values (1, 2, 3, 4, 5).

usevariables=pid y yg11-yg14 yic11-yic14 x12-x31;
within=yg11-yg14 yic11-yic14 x12-x31;

%within%
y on yg11-yg14 yic11-yic14 x12-x31;
%between%
y#1 with y#2 y#3 y#4; y#2 with y#3 y#4; y#3 with y#4;

*** WARNING in MODEL command
One or more between-level covariances involving the following variable is free
while its between-level variance is fixed at 0. Fix all between-level covariances with this variable at 0 or free its variance. Problem with: Y#1

The same warning appears for Y#2 and Y#3.

Between Level

Y#1 WITH
Y#2 0.000
Y#3 0.000
Y#4 0.000

Y#2 WITH
Y#3 0.000
Y#4 0.000

Y#3 WITH
Y#4 0.000

Means
Y#1 -4.226
Y#2 -8.726
Y#3 -8.248
Y#4 3.603

I also need the coefficients for the constant (the intercepts), but Mplus did not report them.

Sorry for these seemingly basic questions.
I really need your help, Professor.
 Bengt O. Muthen posted on Friday, September 14, 2018 - 1:30 pm
Also mention their variances:

y#1 - y#4;

But note that this results in 4-dimensional integration, which is heavy, slow, and imprecise (you can try Integration = Montecarlo(5000)). You may want to put a factor behind these 4, as shown in the UG examples for 2-level mixture modeling (the latent class variable is also nominal).
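A minimal sketch of both suggestions, assuming the five-category y and the covariate list from the post above. The factor name f is hypothetical, and the factor alternative is shown commented out because only one of the two between-level specifications should be active at a time.

```mplus
ANALYSIS:
  TYPE = TWOLEVEL;
  INTEGRATION = MONTECARLO(5000);   ! Monte Carlo integration, 5000 points
MODEL:
  %WITHIN%
  y ON yg11-yg14 yic11-yic14 x12-x31;
  %BETWEEN%
  ! Suggestion 1: free all four variances and their covariances
  ! (4 dimensions of integration)
  y#1 - y#4;
  y#1 WITH y#2 y#3 y#4;
  y#2 WITH y#3 y#4;
  y#3 WITH y#4;
  ! Suggestion 2 (use instead of the lines above): one factor behind
  ! the random intercepts, which needs only 1 dimension of integration
  ! f BY y#1 y#2 y#3 y#4;
```

With the factor specification, the covariances among the random intercepts are implied by the loadings rather than estimated freely, so it is a more restrictive model.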
 Yohan Choi posted on Friday, September 14, 2018 - 6:57 pm
Thank you, Professor.
I appreciate your kind reply.

I have one more question.
Can I obtain predicted probabilities with 95% CIs at fixed covariate values in this model?
 Bengt O. Muthen posted on Sunday, September 16, 2018 - 12:02 pm
You can get that using Model Constraint to express the estimated logits in terms of probabilities (see e.g. end of UG chapter 14).
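A minimal sketch of this approach, assuming the three-category y with covariates x1 and x2 from the first post. The parameter labels (a1, b11, and so on), the new parameter names, and the covariate values x1 = 1 and x2 = 0 are hypothetical choices for illustration.

```mplus
MODEL:
  %WITHIN%
  y#1 ON x1 (b11)
         x2 (b12);
  y#2 ON x1 (b21)
         x2 (b22);
  %BETWEEN%
  [y#1] (a1);             ! intercepts of the two logits
  [y#2] (a2);
  y#1 y#2;                ! random-intercept variances
  y#1 WITH y#2;
MODEL CONSTRAINT:
  NEW(l1 l2 p1 p2 p3);
  ! Logits at x1 = 1, x2 = 0, with the random intercepts
  ! held at their mean of zero
  l1 = a1 + b11*1 + b12*0;
  l2 = a2 + b21*1 + b22*0;
  ! Multinomial logit probabilities; category 3 is the reference
  p1 = EXP(l1) / (1 + EXP(l1) + EXP(l2));
  p2 = EXP(l2) / (1 + EXP(l1) + EXP(l2));
  p3 = 1 / (1 + EXP(l1) + EXP(l2));
OUTPUT:
  CINTERVAL;              ! confidence intervals, including the NEW parameters
```

These probabilities are conditional on a random-intercept value of zero, so they describe a typical cluster rather than an average over clusters.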
 Yohan Choi posted on Monday, September 17, 2018 - 6:12 pm
Thank you, Professor Muthen.

Are there any other suitable ways to reduce the time to convergence?
 Bengt O. Muthen posted on Tuesday, September 18, 2018 - 5:48 pm
Only the 2 suggestions I made.
 John C posted on Tuesday, December 10, 2019 - 10:59 am

I am doing a multinomial model with TwoLevel.

Because the distribution of the dependent variable is quite uneven across the five categories, I only free the variance on two of the intercepts.

However, I noticed that I am able to obtain covariance estimates between all four identified intercepts.

I'd like to understand how Mplus is able to estimate all of these when the variances of two of the intercepts have not been explicitly freed up.
 Tihomir Asparouhov posted on Wednesday, December 11, 2019 - 2:02 pm
Send your example to support@statmodel.com. I can't replicate what you are describing. It shouldn't happen. You can add
U#3@0 U#4@0;
and that should eliminate the random effects for the additional categories. If you see covariances for these random intercepts, that means they are numerically integrated, which should be reflected in the number of dimensions of numerical integration. If only two intercepts are free, you would have just 2 dimensions of numerical integration. If you have covariances, there will be more dimensions of numerical integration and the run will be very slow. If there are covariances, there must also be variances, since the variance-covariance matrix of the random effects is always positive definite, i.e., the variances must also be free.
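A minimal sketch of a between-level specification with only two random intercepts, assuming a five-category nominal u (which has four intercepts, u#1 to u#4) and assuming that u#1 and u#2 are the two whose variances are freed.

```mplus
MODEL:
  %BETWEEN%
  u#1 u#2;            ! two free variances: 2 dimensions of integration
  u#1 WITH u#2;       ! covariance between the two random intercepts
  u#3@0 u#4@0;        ! no random effects for the remaining intercepts
```

The number of integration dimensions that Mplus reports in the output gives a quick check that only two random effects are in the model.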
 Jilian Halladay posted on Wednesday, January 08, 2020 - 6:02 am
Hi There,
I am trying to run a random-effects two-level multinomial logistic regression where the outcome is coded 0 (reference), 1, and 2. The model runs with the code below, but the effects are in the opposite direction from what I expected. Please let me know if you see any issues with the code! Thanks so much,

VARIABLE:
usevariables are s_female s_ageyrs
j_pared2 j_int j_ext median enrol js_Sel js_Sel_mean t_Sel3_mean ov2 t_selprog_m5 ;

nominal is ov2;
idvariable x_student_ID;
cluster= x_idschool;
within= s_female s_ageyrs j_pared2 j_int j_ext js_Sel ;
between= median enrol t_Sel3_mean t_selprog_m5 js_Sel_mean;

missing are all (999);
DEFINE:
center s_ageyrs
j_pared2 j_int j_ext median enrol js_Sel
js_Sel_mean t_Sel3_mean t_selprog_m5 (grandmean);
ANALYSIS:
type=twolevel random;
MODEL:
%within%
ov2 on s_female s_ageyrs j_pared2 j_int
j_ext js_Sel;
s_female s_ageyrs j_pared2 j_int
j_ext js_Sel;
%between%
ov2 on median enrol t_Sel3_mean js_Sel_mean t_selprog_m5;
median enrol t_Sel3_mean js_Sel_mean t_selprog_m5;
 Bengt O. Muthen posted on Wednesday, January 08, 2020 - 10:46 am
The highest category of the nominal variable is chosen as the reference by Mplus. Change the scoring of the nominal variable, e.g., using Define.
 Jilian Halladay posted on Wednesday, February 05, 2020 - 9:11 am
Hi there,

Thanks for the advice - that worked.

I am now trying to estimate several univariable models using FIML, each including a single predictor measured at level one, but I still want to adjust appropriately for the multilevel structure of the data. When the only predictor is measured at the individual level, do I still have to estimate the variances of the outcome at the upper level?

For example, which of the following is correct:

VARIABLE:
usevariables are

nominal is ov3;
idvariable x_student_ID;
cluster= x_idschool;! teach_id;
within= js_Sel ;
missing are all (999);
DEFINE:
if (ov2==0) then ov3=2;
if (ov2==1) then ov3=1;
if (ov2==2) then ov3=0;

ANALYSIS:
type=twolevel random;

Model OPTION 1:
%within%
ov3#1 ov3#2 on js_Sel;
%between%
ov3#1 ov3#2 ;

Model OPTION 2:
%within%
ov3#1 ov3#2 on js_Sel;

Thanks so much, Jillian
 Bengt O. Muthen posted on Thursday, February 06, 2020 - 12:32 pm
Yes, you should estimate the variances on Between as you do in Option 1. Otherwise they are zero which says that you don't need multilevel.
 Jilian Halladay posted on Thursday, February 06, 2020 - 1:29 pm
Thank you!

Do the variances need to be included at all times? Even when upper level variables are included in the model?

For example:
%within%
ov3#1 ov3#2 on js_Sel;
%between%
ov3#1 ov3#2 on js_sel_mean;
ov3#1 ov3#2 js_sel_mean ;
 Bengt O. Muthen posted on Thursday, February 06, 2020 - 3:30 pm
When you have upper-level variables as predictors like you do, the variances of your random intercepts are residual variances. If they are specified as zero, you are saying that R-square = 1.
 Jilian Halladay posted on Friday, February 07, 2020 - 6:24 am
Hi Dr. Muthen,

Sorry I am a bit confused. In the user guide, example 9.1 of a 2-level random intercept regression, there is no variance of the dependent variable added to the code.

See here:

TITLE: this is an example of a two-level regression analysis for a continuous dependent variable with a random intercept and an observed covariate
DATA: FILE = ex9.1a.dat;
VARIABLE: NAMES = y x w xm clus;
WITHIN = x;
BETWEEN = w xm;
CLUSTER = clus;
DEFINE: CENTER x (GRANDMEAN);
ANALYSIS: TYPE = TWOLEVEL;
MODEL:
%WITHIN%
y ON x;
%BETWEEN%
y ON w xm;

In my case, what would be the rationale for adding in the variance at the upper level? By indicating method twolevel, doesn't this automatically account for the clustering?
 Bengt O. Muthen posted on Friday, February 07, 2020 - 2:45 pm
With continuous DVs as in UGex9.1, Mplus estimates the Between-level variance without it having to be mentioned in the input. You can mention it if you want to. Either way, it will show up in the output. With other outcomes like your nominal DV, Mplus does not do that automatically because it may lead to too heavy computations due to many dimensions of integration. So you have to specifically mention the (residual) variances you want estimated.

The rationale for adding the residual variance in your case is that you want to be able to estimate R-square and not unrealistically set it at 1 (by analogy of regular linear regression).

No, saying Type=Twolevel is not enough to account for clustering - you also have to follow through and specify the model with some between-level variation which is how the clustering gets captured. See e.g. my 1994 paper on our website:

Muthén, B. (1994). Multilevel covariance structure analysis. In J. Hox & I. Kreft (eds.), Multilevel Modeling, a special issue of Sociological Methods & Research, 22, 376-398.