Multilevel Regression
Mplus Discussion > Multilevel Data/Complex Sample >
 Alex posted on Monday, October 17, 2005 - 8:34 pm

I am running a multilevel regression analysis using Mplus (with random intercepts and slopes).

In my model, I've entered three continuous level 2 independent variables, two dummy-coded level 1 independent variables and a continuous dependent variable.

My level 2 sample size is only 10 (level 1 sample size is 300). I am using the default estimator, MLR.

The model runs and I get the corresponding output. However, I also get the following error message:



To what extent (if at all) can I trust my results (If I counted correctly, the number of estimated parameters is 12)?

Thanks for your help!
 Boliang Guo posted on Tuesday, October 18, 2005 - 1:43 am
See Raudenbush's HLM book: the minimum number of level-2 units is 30, although he conducted a meta-analysis with 19 studies. The number of level-2 units in your study is really small, and you were also not lucky enough to get a good result. Could you model a complex OLS equation with only 10 cases?
If you have no strong theoretical background about the variance between level-2 units, then try OLS.
 Joanna Harma posted on Monday, August 13, 2007 - 5:30 am
Dear Dr. Muthen
I am trying to run a multilevel logistic regression analysis, with the ML estimator, to identify variables that significantly affect the type of school a child attends. At the micro level I have child characteristics and at the macro level I have family characteristics. I have the following questions about the analysis:

1) When I run the null model I get the intercept variance at the macro level, but I do not get the intercept coefficient. How can I get it? Is the threshold the intercept coefficient?

2) Is the fact that my model has only a random intercept and no random slope sufficient justification (taking into account the hierarchical data) for multilevel modelling?

3) Can I use var/s.e. to test whether the intercept in the null model is random or not? Or will I have to do some other test, such as fixing the variance of the intercept to zero and doing a likelihood ratio test? Is estimate/s.e. for covariates, variances and covariances a z-score or a t-value?

Many thanks
P.S. I hope you had a great vacation.
 Joanna Harma posted on Monday, August 13, 2007 - 5:31 am
One last question, which I could not post due to the post limit.

To test model fit I am using a loglikelihood ratio test. I introduce a covariate in the model and record the loglikelihood value; in a second step I fix the coefficient of that covariate to zero and record the second loglikelihood value. Then I multiply the difference (second - first) of the two recorded values by -2 and do a chi-square test. Is this correct? Or is there another way of checking model fit?

Many thanks
 Matthew Cole posted on Monday, August 13, 2007 - 1:17 pm
For the LRT, take 2 times the difference in the H0 Value (loglikelihood) from the model fit output, and take the difference in the number of model parameters as the df. For example, assume the first model has H0 = 5237.921 with 16 parameters and the second model has H0 = 5236.561 with 20 parameters: (5237.921 - 5236.561) * 2 = 2.72, df = 4. The chi-square critical value for df = 4 at the .05 level is 9.488. Because 2.72 < 9.488, p > .05, and the second model is not an improvement over the first model.
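The arithmetic in this post can be sketched in Python (standard library only). This is an illustration, not Mplus output. The function names are hypothetical, the H0 values are entered as negative loglikelihoods on the assumption that they were quoted without their sign, and the closed-form p-value used here holds only for an even df. It also assumes the ML estimator; with MLR the difference has to be scaled before it is referred to a chi-square distribution.

```python
import math

def lrt_statistic(loglik_nested, loglik_full):
    """Return 2 * (LL_full - LL_nested); both arguments are loglikelihoods."""
    return 2.0 * (loglik_full - loglik_nested)

def chi2_sf_even_df(x, df):
    """Upper-tail chi-square probability; this closed form is valid only for even df."""
    if df <= 0 or df % 2 != 0:
        raise ValueError("closed form requires a positive even df")
    half = x / 2.0
    terms = sum(half ** i / math.factorial(i) for i in range(df // 2))
    return math.exp(-half) * terms

# Values from the post; the negative signs are an assumption (see the note above).
ll_16_params = -5237.921
ll_20_params = -5236.561

stat = lrt_statistic(ll_16_params, ll_20_params)
df = 20 - 16
p_value = chi2_sf_even_df(stat, df)
print(f"chi-square({df}) = {stat:.2f}, p = {p_value:.3f}")
```

The p-value is well above .05, which agrees with comparing 2.72 against the critical value 9.488.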
 Linda K. Muthen posted on Tuesday, August 14, 2007 - 6:39 pm

If a post does not fit in the space provided in one window, it is too long for Mplus Discussion. Please do not double post in the future.

1. Yes.
2. Yes.
3. Yes as an approximation. Because you are testing a variance against zero which is on the border of admissible values, this test may have problems. There is a large literature on this topic.

I think when you test two nested models, one where the covariate coefficients are fixed to zero and one where they are free, the test describes the significance of the covariates. I don't think it says that the model with covariates is better.
 Joanna Harma posted on Wednesday, August 15, 2007 - 5:56 am
Dear Linda,
Regarding intercept coefficient in multilevel regression.
The UCLA website presents the examples from Snijders and Bosker, chapter 14, using Mplus. In all of these examples, the intercept coefficient reported by S & B appears as the threshold of the dependent variable at the between level, but with a negative sign.
Could you please clarify whether the intercept coefficient is the threshold or the threshold with a (-) sign? Also, when this intercept coefficient for the null model is converted to a probability, will it be the same as the proportion of 1s in the dependent variable?
Many thanks and sorry about the previous post.
 Linda K. Muthen posted on Wednesday, August 15, 2007 - 10:06 am
Yes, the intercept and threshold are the same except for sign.

Yes, the probability is for the u=1 not u=0.
 Joanna Harma posted on Thursday, August 16, 2007 - 12:49 am
Dear Linda,
Following what you told me, I ran the null model with just the dependent variable at the between level. The results are:

          Estimates     S.E.   Est./S.E.
Q2$1          0.709    0.650       1.090
Q2           34.909   12.985       2.688

From what you told me, my intercept coefficient is -0.709. When I convert it to the probability of u=1 it comes out as 0.33, whereas 41.3% of the observations in my sample have u=1. Why am I getting this difference? Shouldn't the two probabilities be the same?
Many thanks
 Linda K. Muthen posted on Thursday, August 16, 2007 - 8:53 am
You would need to use information from both the within and between parts of the model to obtain the probability that you want. This can be done only with numerical integration.
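A minimal Python sketch of why the two numbers differ, assuming a logit link and a normally distributed random intercept (the function names and the simple trapezoid rule are illustrative choices, not what Mplus does internally). The value 0.33 is the probability for a cluster whose random effect is exactly zero; the population-average probability requires averaging the logistic function over the random-intercept distribution.

```python
import math

def logistic(x):
    return 1.0 / (1.0 + math.exp(-x))

def marginal_probability(intercept, variance, n_points=4001, z_max=8.0):
    """Average logistic(intercept + sd * z) over a standard normal z (trapezoid rule)."""
    sd = math.sqrt(variance)
    step = 2.0 * z_max / (n_points - 1)
    total = 0.0
    for i in range(n_points):
        z = -z_max + i * step
        density = math.exp(-0.5 * z * z) / math.sqrt(2.0 * math.pi)
        weight = 0.5 if i in (0, n_points - 1) else 1.0  # trapezoid end points
        total += weight * logistic(intercept + sd * z) * density
    return total * step

threshold = 0.709   # Q2$1 from the posted output
variance = 34.909   # between-level variance of Q2 from the posted output
intercept = -threshold

p_conditional = logistic(intercept)                      # random effect fixed at 0
p_marginal = marginal_probability(intercept, variance)   # averaged over clusters
print(round(p_conditional, 3), round(p_marginal, 3))
```

With a between-level variance this large, the averaged probability is pulled toward 0.5 and lies clearly above 0.33. It need not equal the sample proportion exactly, for example when cluster sizes are unequal.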
 radanielina-hita marie louise posted on Monday, August 06, 2012 - 9:35 am

I am running a "Random Intercept Random Slope with Level 1 & 2 Predictors". My level 2 predictor is the school SES. The DV is fruit intake. I got the following error message: One or more between-level variables have variation within a cluster for one or more clusters. Check your data and format statement.

This is the code that I used:

Within = eat_frontof TV;
between= SES;

slope |fruit on eat_TV;

fruit slope on eat_TV;
fruit with slope;

Thanks for your help

 Linda K. Muthen posted on Monday, August 06, 2012 - 9:57 am
Any variable on the BETWEEN list must have the same value for every member of a cluster. Apparently SES does not meet this criterion. If you can't figure it out, please send the output, data, and your license number to
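Before sending anything, the data can be screened for this condition outside Mplus. The following Python sketch uses hypothetical data and a hypothetical helper name; it lists the clusters in which a between-level variable takes more than one value. Note that a missing-value code in some rows of a cluster would also count as a different value.

```python
from collections import defaultdict

def clusters_with_variation(rows, cluster_key, variable):
    """Return the cluster IDs in which `variable` takes more than one value."""
    values = defaultdict(set)
    for row in rows:
        values[row[cluster_key]].add(row[variable])
    return sorted(cid for cid, seen in values.items() if len(seen) > 1)

# Hypothetical data: school 2 has two different SES values.
rows = [
    {"school": 1, "SES": 3.0},
    {"school": 1, "SES": 3.0},
    {"school": 2, "SES": 2.0},
    {"school": 2, "SES": 2.5},
    {"school": 3, "SES": 4.0},
]
bad = clusters_with_variation(rows, "school", "SES")
print(bad)
```

Any cluster returned by the check has to be corrected (or the variable moved off the BETWEEN list) before the model will run.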
 Anne Casper posted on Wednesday, January 20, 2016 - 9:27 am

I am running a multilevel regression analysis with two predictors and their interaction. Both predictors are group-mean centered level 1 variables. The interaction term is significant and I would now like to do simple slope analyses.

Q1: I requested TECH3 to obtain the asymptotic covariance matrix in order to use the Preacher et al. online tool for multilevel simple slopes. All the estimates in TECH3 end with D-01, D-02, or D-03. Could you tell me what this means? Does it look correct to you?

Q2: Is it also possible to use MODEL CONSTRAINT for calculating the simple slopes? If so, how would I do this?

Many thanks in advance,
 Bengt O. Muthen posted on Wednesday, January 20, 2016 - 7:26 pm
Q1. As an example 0.5D-01 is the same as 0.05.
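The "D" is Fortran-style exponent notation for double-precision numbers, so D-01 means "times 10 to the power -1". If the values need to be pasted into another tool, a small conversion like the following Python sketch can help (the function name is hypothetical):

```python
def parse_fortran_double(text):
    """Convert Fortran-style 'D' exponent notation (e.g. 0.5D-01) to a float."""
    return float(text.strip().upper().replace("D", "E"))

print(parse_fortran_double("0.5D-01"))
print(parse_fortran_double("0.123D-02"))
```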

Q2. I don't think you need to bother with the route of TECH3 but can instead follow suggestions on our Mediation page:
 Anne Casper posted on Wednesday, January 20, 2016 - 11:56 pm
Dear Dr. Muthen,

many thanks for your quick help!

 Andrea M Reina Tamayo posted on Monday, February 06, 2017 - 3:52 am
I have a question about the covariance structure used by Mplus.
When I run a model in SPSS, I have to specify the covariance structure; I can choose from VARIANCE COMPONENTS, AUTOREGRESSIVE(1), COMPOUND SYMMETRY, and several others.
However, in Mplus, I just specify ANALYSIS:
Type is twolevel GENERAL. I do not know what kind of covariance structure Mplus uses by default. Could you inform me about this?
 Bengt O. Muthen posted on Monday, February 06, 2017 - 3:37 pm
Usually it is zero correlations. You will see in the output what has been estimated. If it is not shown it is zero.
 Andrea M Reina Tamayo posted on Wednesday, February 08, 2017 - 7:44 am
Hi Bengt, thank you for your response. I think I should add some information about my design.

I have 87 individuals, who completed 3 measures a day for 5 days. Therefore, there is dependence in my data because of the repeated measures. In SPSS and R, I can control for this with an autoregressive covariance matrix. Can I specify this in Mplus?

Also, I am not sure whether I should specify a three-level model (moments, nested within days, nested within individuals) or a two-level model, because in chapter 9 I read that for longitudinal data Mplus has one level less than other software. Therefore, I thought of keeping my long-format data and having two levels only, but how does Mplus know the order of the observations if there is no third level?

I think it is important to know that I am not examining growth or change, just regressions of the DV measured in the moment on predictors that were measured in the same moment.

I appreciate your help,

 Bengt O. Muthen posted on Wednesday, February 08, 2017 - 3:58 pm
Use wide format so that you have 15 columns. Then say

y1 on x1;
y2 on x2;

You can then use UG ex6.17 or simply correlate the residuals of adjacent time points, e.g.:

y1 WITH y2;
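Going from long to wide format means one row per person and one column per variable per occasion (here 15 occasions, i.e. 3 per day for 5 days, so x1-x15 and y1-y15). A minimal Python sketch of that reshaping, with hypothetical data, a hypothetical function name, and only 3 occasions to keep it short:

```python
def long_to_wide(records, n_occasions):
    """Reshape (person_id, occasion, x, y) records to one row per person.

    Occasions are numbered 1..n_occasions. Occasions a person skipped are
    left as None, to be recoded to the missing-value flag used in the data file.
    """
    wide = {}
    for person_id, occasion, x, y in records:
        row = wide.setdefault(person_id, {})
        row[f"x{occasion}"] = x
        row[f"y{occasion}"] = y
    columns = [f"{v}{t}" for v in ("x", "y") for t in range(1, n_occasions + 1)]
    table = {pid: [row.get(c) for c in columns] for pid, row in wide.items()}
    return table, columns

# Hypothetical long-format data; person 2 skipped occasion 2.
records = [
    (1, 1, 0.2, 1.0), (1, 2, 0.4, 1.3), (1, 3, 0.1, 0.9),
    (2, 1, 0.5, 2.0), (2, 3, 0.3, 1.7),
]
table, columns = long_to_wide(records, 3)
print(columns)
print(table[2])
```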
 Andrea M Reina Tamayo posted on Thursday, February 09, 2017 - 1:29 am
Hi Bengt,

Thank you for your reply. I am still a bit confused, though. If I use the wide format and regress
y1 on x1
y2 on x2 ...
y15 on x15

then I would have 15 slopes. I could model the autocorrelation if I follow the example in the UG, but I want to compare my results with SPSS and R. In these programs, I get one slope that takes the autocorrelation into account. How can I get 1 slope when I have 15 DVs (representing one variable) and 15 predictors (representing one variable)?
 Bengt O. Muthen posted on Thursday, February 09, 2017 - 5:00 pm
You can apply equality constraints on the slopes

y1 on x1 (1);
y2 on x2 (1);

You can also hold autocorrelations equal across time.
 Andrea M Reina Tamayo posted on Friday, February 10, 2017 - 4:36 am
Ahh ok thank you! I was just wondering why is it not possible to run this data with the type=two level? Would the estimates be wrong? Is it not multilevel data?
 Bengt O. Muthen posted on Friday, February 10, 2017 - 10:36 am
Mplus does not allow an autocorrelation in the type=twolevel, long format. That will be in Version 8.
 Andrea M Reina Tamayo posted on Monday, February 13, 2017 - 5:41 am
aah ok! Good to know :-) Thanks for all your help!

 Bengt O. Muthen posted on Monday, February 13, 2017 - 4:50 pm
 Joana Alexandra dos SAntos Costa posted on Monday, July 03, 2017 - 10:05 am
Dear Professor Muthen,
I'm now performing a Bayes twolevel regression as you suggested a couple of days ago, as I only have 20 clusters for a sample of 1229 individuals. After conducting the first step of the analysis, I found a significant degree of heterogeneity among clinical groups for the intercept in the regression of thought suppression on depression (WBSI and Dep, respectively). I found a median point estimate for the random intercept variance of 0.712, with a 95% credibility interval from .005 to 2.944. Following that, I tried to include level-2 covariates to explain the heterogeneity. Here Mplus gives me an error that I cannot understand. Did I do something wrong in the commands and, if so, what should I do?
I'll send you the output file in a second message, as the message size is limited.
Sincerely, Joana Costa
 Joana Alexandra dos SAntos Costa posted on Monday, July 03, 2017 - 10:06 am
Dear Professor Muthen, here is the output file with the error message:

WBSItot AAQII SfCpAt awar nonjudge nonreact dep GrupoCli ;

MISSING = ALL (-999);
CLUSTER = GrupoCli;
BETWEEN = AAQII SfCpAt awar nonjudge nonreact;
FBITER= 10000;

dep ON WBSItot ;

dep ON AAQII SfCpAt awar nonjudge nonreact;
NEW (icc);


One or more between-level variables have variation within a cluster for
one or more clusters. Check your data and format statement.

Between Cluster ID with variation in this variable
Variable (only one cluster ID will be listed)


Joana Costa
 Linda K. Muthen posted on Monday, July 03, 2017 - 11:20 am
Yes, the post size is limited. We ask that posts be kept within one window.

Please send your output, data set, and license number to
 Y.A. posted on Friday, March 16, 2018 - 11:16 pm
Dear Prof. Muthen,

I am reading Version 7 UG ex9.2. My question is:

For a covariate x with a random slope, if it is specified neither on WITHIN nor on BETWEEN, is the latent between covariate x actually the cluster mean of the observed covariate x? And the slope of this latent covariate x is the so-called contextual effect, right?

Thank you very much.

Best regards,

 Bengt O. Muthen posted on Saturday, March 17, 2018 - 9:19 am
Perhaps you are looking at the "third part" of ex9.2. No, in this case the latent between part of the covariate x is not the cluster mean of x. It is the latent between part of x. They are related but not the same.

The contextual effect has to do with the random intercept regression on the cluster mean. For a definition of contextual effect, see the Raudenbush-Bryk multilevel book, page 140, Table 5.11.
 Y.A. posted on Tuesday, March 20, 2018 - 12:04 am
Dear Prof. Muthen,

1. So if the covariate x is not mentioned in the within part of the code, when it is a random intercept model (e.g. second part of ex 9.1), then the covariate x is decomposed into two latent variables (latent within part of x and latent between part of x). If it is a random slope model (e.g. third part of ex 9.2), then the covariate x is also modeled on both the within and between parts, but the within part is actually the original observed covariate x (not the latent within part of x) and the between part of x is the latent between part of x?

2. In the second part of ex9.1, why is grand-mean centering used? I checked the Raudenbush-Bryk multilevel book, page 140, Table 5.11: if the contextual effect is calculated as the difference between the between slope and the within slope (as in the second part of ex9.1), group-mean centering should be used. If grand-mean centering is used, the contextual effect is directly the slope of the level-2 random intercept equation, not the difference between the between slope and the within slope. How should I understand this?

Thank you very much.

 Bengt O. Muthen posted on Tuesday, March 20, 2018 - 12:13 pm
1. This is correct. In the forthcoming Mplus version 8.1, the latent variable decomposition of x will be available also for random slopes when using Bayes, that is, the latent within-level part of x will be used with a random slope as well.

2. It is confusing - in retrospect it probably would have been clearer to not have grand-mean centered x in Define. The reason that the betac = gamma01 - gamma10 formula is used in Model Constraint is that Mplus does a "latent group-mean centering of the latent within-level covariate" (see the text on pages 274-275). That's why the group-mean centered version of betac is used even though the observed x is not group-mean centered in Define.
 Y.A. posted on Tuesday, March 20, 2018 - 7:20 pm
Dear Prof. Muthen,

The reason I asked all these questions is that I am having trouble understanding the random slope model of the multilevel mediation models, especially examples F and J here

The authors of these examples used Mplus to test the multilevel indirect effect following the guide of ex9.1 and ex9.2. From your last reply, can I interpret it as it is more intuitive to use group mean centering for the random slope model? If my goal is to get the multilevel mediation effect, can I bypass the estimation of the contextual effect? As long as I get the between slopes with the group mean centering, multiply these between slopes, then I get between level indirect effect, right?

Thank you very much.

Best regards,

 Bengt O. Muthen posted on Wednesday, March 21, 2018 - 3:03 pm
Yes on all 3 questions.
 SY Khan posted on Sunday, May 13, 2018 - 10:10 am
I have a two-level moderated mediation model. The IV is at the workplace level. The moderator, mediator and 4 outcomes are at the individual level, with a few control variables on both levels. I am estimating simple moderation at the individual level, and moderated mediation on path a in the between model. My queries:

1- Overall model fit is bad (CFI=0.456; TLI=-4.444; RMSEA=0.489). Is this a reliable model? What can I do to improve the model fit?

2- To estimate differences between public and private workplaces, I specified the grouping option. It gives within and between results separately, but not the conditional indirect effects separately. However, there is no difference in the estimated between and within regression results. How can I specify the grouping option correctly in two-level data?

Many thanks.
 Bengt O. Muthen posted on Monday, May 14, 2018 - 4:27 pm
1. Q1: No. Q2: Look at the Residual output and Modindices.

2. Send your output to Support along with your license number and point out what you are referring to.
 Hassan posted on Friday, June 15, 2018 - 12:11 am
Dear Profs. Muthen,

I am running a 2-level multilevel model.

There are three variables: teacher motivation, student self-efficacy, and students' achievement.

I specified teacher as an L2 variable, and I didn't specify self-efficacy and achievement as either between or within variables because they would exist in both models.

Regarding the relations between self-efficacy and achievement, we have two models, one within and one between. Regarding the relation between these variables in the between model, which interpretation is correct?

1) class-average self-efficacy is related to class-average achievement or

2) class-average self-efficacy is related to individual students' achievement

I have the same question about teacher motivation and students' achievement:

1) teacher's motivation is related to class-average achievement or

2) teacher's motivation is related to individual students' achievement

And the last question: are the relations in the between model (teacher's motivation with students' achievement, and self-efficacy (b) with achievement (b)) macro-micro models?

Thank you very much.
 Bengt O. Muthen posted on Friday, June 15, 2018 - 2:42 pm
2) and 2). Although Mplus can use the latent variable between part instead of the observed class-average.

The relations on between are macro-macro.

See also the video and handout for our Short Course Topic 7.
 anonymous posted on Tuesday, October 23, 2018 - 1:19 am

I am doing multilevel modelling with Mplus (7) in which I compare three groups using dummy variables. I get the different groups' means of the latent variables (always the reference group mean, as the others are controlled for), but the variance (diagonal of the covariance matrix) is always the same for the different groups. Can I get group-specific variances, from which I can calculate the SD and SE for the different groups? Actually, my main aim is to get the SE for the different groups. How can I get these?

Thank you!
 Bengt O. Muthen posted on Tuesday, October 23, 2018 - 11:23 am
Is group an L2 variable?
 anonymous posted on Tuesday, October 23, 2018 - 12:02 pm

L1 variables: math test and these groups (dummy)
L2 variables: group ratios and average test score (cluster_means)
 Bengt O. Muthen posted on Tuesday, October 23, 2018 - 5:27 pm
You need to do a multiple-group analysis of your 3 groups to let the variance vary. But a complicating factor is that the group variable is on Level 1 - this means you have to study our Web Note 16 on our web site.
 anonymous posted on Tuesday, October 23, 2018 - 10:43 pm
Okay, Thank you!
 anonymous posted on Wednesday, October 24, 2018 - 12:54 am
However, do you know, can I use this formula

SD = square root(sum(xi^2)/(N-1)-Mean^2)

to calculate the SD, and SE = SD/square root(n), as I know the means (specific to each group) and I get every term (xi) when I save factor scores?
 Bengt O. Muthen posted on Wednesday, October 24, 2018 - 5:36 pm
I thought you wanted model-estimated latent variable variances, not variances for observed variables.
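For descriptive SDs and SEs of observed or saved scores (which, as noted in this reply, are not the model-estimated latent variable variances), the usual computational form of the sample SD is sqrt((sum(xi^2) - N*Mean^2)/(N-1)). This differs slightly from the formula as posted, which divides only the sum of squares by N-1. A minimal Python sketch with hypothetical scores and a hypothetical function name, checked against the standard library:

```python
import math
import statistics

def sd_and_se(scores):
    """Sample SD via the sum-of-squares form, and the SE of the mean."""
    n = len(scores)
    mean = sum(scores) / n
    sum_sq = sum(v * v for v in scores)
    sd = math.sqrt((sum_sq - n * mean * mean) / (n - 1))
    return sd, sd / math.sqrt(n)

scores = [0.4, -0.1, 0.9, 0.3, -0.5]  # hypothetical saved scores for one group
sd, se = sd_and_se(scores)
print(round(sd, 4), round(se, 4))
print(round(statistics.stdev(scores), 4))  # same SD from the standard library
```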
 Jilian Halladay posted on Wednesday, June 19, 2019 - 5:56 am
Hi There,

I am obtaining the following error message when I run a random intercept linear model using FIML. I do not obtain this error when I am not using FIML and estimates are similar between complete case and FIML (with this error) models. Thoughts?

Parameter 21, %WITHIN%: DAYS34

Thanks in advance!
 Jilian Halladay posted on Wednesday, June 19, 2019 - 6:10 am
For reference to my above post, here is my current code. I think I might be employing FIML incorrectly as other models are also having issues (but only when applying FIML).


achiev ON fem age assets days12 days34 days56 everyday adhd;
[fem age assets adhd days12 days34 days56 everyday];

%between j_class2%

%between idschl%
achiev ON median_inc;

Thanks for your help!
 Bengt O. Muthen posted on Wednesday, June 19, 2019 - 12:23 pm
We need to see your full output - send to Support along with your license number.
 Anne Casper posted on Wednesday, August 07, 2019 - 1:21 am
I am running a multilevel regression analysis with one predictor, using the MLR estimator. The regression weight of the predictor is not significant at p=.09 but the model comparison of the null model and the model with that one predictor using the loglikelihoods reaches significance at the .05 level. I followed the recommendations for loglikelihood testing from this site

If I re-run my analyses with estimator = ML, the predictor reaches significance at .05 in the regression as well.

How can this be explained?

Thanks very much in advance,
 Bengt O. Muthen posted on Wednesday, August 07, 2019 - 5:27 pm
z-tests and likelihood-ratio chi-square tests may disagree. The ML estimator may have smaller SEs and therefore give significance - but those SEs may not be as dependable as those from MLR. You can settle the matter by requesting bootstrap confidence intervals.
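One further point when comparing loglikelihoods under MLR: the raw difference is not chi-square distributed and has to be divided by a difference-test scaling correction built from the two models' scaling correction factors and parameter counts. A minimal Python sketch of that calculation follows; the function name and all input values are illustrative assumptions, not taken from this thread.

```python
def scaled_lr_difference(ll_nested, c_nested, p_nested, ll_full, c_full, p_full):
    """Scaled loglikelihood difference test for MLR.

    ll_*: loglikelihood (H0 value), c_*: scaling correction factor,
    p_*: number of free parameters, for the nested and comparison models.
    Returns (test statistic, df, difference-test scaling correction).
    """
    cd = (p_nested * c_nested - p_full * c_full) / (p_nested - p_full)
    trd = -2.0 * (ll_nested - ll_full) / cd
    return trd, p_full - p_nested, cd

# Illustrative values only.
trd, df, cd = scaled_lr_difference(
    ll_nested=-2606.0, c_nested=1.450, p_nested=39,
    ll_full=-2583.0, c_full=1.546, p_full=47,
)
print(f"cd = {cd:.3f}, scaled chi-square({df}) = {trd:.3f}")
```

The scaling correction cd can occasionally come out negative, in which case this simple form of the test cannot be used as is.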
 anonymous posted on Friday, November 08, 2019 - 12:10 am
In the double latent model, Marsh et al. (2009) grand-mean centered the indicators that load on the latent variables at the individual level; the effect of the latent aggregated variable at the classroom level is then an aggregated effect, not a contextual effect (which is calculated separately as a beta difference in MODEL CONSTRAINT).

1. Do you know how the contextual effect of a latent aggregated variable can be directly controlled at the classroom level in the double latent model?
 Franzi KŲŖler posted on Thursday, April 23, 2020 - 2:01 am
I am testing 2 hypotheses in a mlm in a sample of students nested in classroom. Hypothesis 1 is a 1-1-1 mediation: x1 (Level 1 only) -> conflict (Level 1 & 2/ configural construct; measured @Level 1 but meaningful at both Levels) -> performance (Level 1 only); and Hypothesis 2 a 2-2-1 mediation: x2 (Level 2 only) -> conflicts (Level 1 &2) -> performance (Level 1 only). My syntax looks like this:
BETWEEN ARE x2 m_conflicts;
WITHIN ARE x1 conflicts;
CLUSTER = classroom; ! Level-2 grouping identifier
m_conflicts = cluster_mean(conflicts);
x1 conflicts perf;
mean_K ON x1 (aW);
perf ON conflicts (bW) x1;
x2 m_conflicts perf;
m_conflicts ON x2 (aB);
perf ON m_conflicts (bB) x2;
NEW(ind_W ind_B);
ind_W = aW*bW;
ind_B = aB*bB;

Everything works but I am wondering about 2 things:
1) Does it look correct this way? Can I analyze both mediations together in a single mlm?
2) Do I have to group-mean center x1? I have group-mean centered the mediator (conflict) to separate the variances of the two levels from each other, but x1 exists only at level 1 and I don't use it at level 2. Is it all right the way I have done it?
 Bengt O. Muthen posted on Thursday, April 23, 2020 - 4:24 pm
We need to see your full output - send to Support along with your license number.
 Friedrich Platz posted on Tuesday, August 25, 2020 - 2:24 am
Dear Mrs. and Mr. Muthen,

I wanted to model a three-level regression covering a repeated measures design with different conditions, in which participants' ratings were repeatedly measured (18 times) for different stimuli. Thus we defined the ratings as level 1, person ID as level 2, and stimulus ID as the level 3 variable. However, when fitting a random-intercept ("null") model as the first modeling step, I got the following error message:
"Clusters for COND with the same IDs have been found in different clusters for SUBJ. These clusters are assumed to be different because clusters for COND are not allowed to appear in more than one cluster for SUBJ."

How do I have to interpret this message and how can I solve this problem?

Thanks in advance!
 Bengt O. Muthen posted on Wednesday, August 26, 2020 - 2:16 pm
Change the cluster values so that they are unique.
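One possible reading of this advice, sketched in Python with hypothetical IDs and a hypothetical function name: give every combination of higher-level and lower-level ID its own new ID, so that no lower-level ID appears under two different higher-level clusters. This only makes sense when the lower-level units are truly nested; if the same unit really does recur across clusters, the design is crossed rather than nested and relabeling changes what the model assumes.

```python
def make_unique_ids(pairs):
    """Map each (higher_id, lower_id) pair to its own integer ID, in order of first appearance."""
    mapping = {}
    new_ids = []
    for pair in pairs:
        if pair not in mapping:
            mapping[pair] = len(mapping) + 1
        new_ids.append(mapping[pair])
    return new_ids

# Hypothetical (SUBJ, COND) values, one pair per data row.
pairs = [(1, 1), (1, 2), (2, 1), (2, 2), (1, 1)]
new_ids = make_unique_ids(pairs)
print(new_ids)
```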
 Friedrich Platz posted on Friday, August 28, 2020 - 12:39 am
Dear Mr. Muthen,

thank you for the suggestions. However, they did not solve the problem. Perhaps I have to give some more details. We used an incomplete randomized repeated measures design resulting in two nested cluster units: one cluster unit represents the repeated measures within every person, the other represents the small selection of stimuli (for every person) out of all available stimuli. I've tried several approaches, but it seems not to be possible (at least for me) to model this design in Mplus. Either there is no variation within a cluster (such as age, which is of course constant within a person), or there is variation due to the repeated appearance of the same stimulus in different units of the other cluster, "which is not allowed" (e.g. stimulus A in participants 1 and 2).
Is there any solution to represent this experimental design in MPLUS? Again, thank you for your contributions!
 Bengt O. Muthen posted on Saturday, August 29, 2020 - 4:36 pm
Why not handle this as a joint analysis of 2 groups. The groups can have different number of variables if you do it as Type=Mixture using Knownclass to specify the groups.

If this doesn't help, try SEMNET.
 Friedrich Platz posted on Sunday, August 30, 2020 - 11:29 pm
Dear Mr. Muthen,

thanks again for your valuable suggestions and contributions. The best choice for modeling my data with such cluster variables is a cross-classified regression model, and with it I was able to get results (and no error messages concerning clashes of IDs within or between cluster variables).

However, when reading and interpreting my results, I assume that the output of cross-classified regression analyses reports unstandardized regression coefficients, doesn't it? If so, how can I get standardized results? Because of the BAYES estimator, the "usual" way does not seem to work.
 Bengt O. Muthen posted on Monday, August 31, 2020 - 4:40 pm
To diagnose this, please send your output to Support along with your license number.