Centering
 Andy Cohen posted on Thursday, July 12, 2007 - 8:40 am
I am conducting a two-level analysis in which I would like to include interactions between two main effect variables in the within portion of my analysis. I am defining the variables using the DEFINE command (e.g. IntA_B = A * B). The underlying variables for the interaction terms need to be group mean centered. I have already specified group mean centering for these variables in the VARIABLE command (as that is necessary for the use of the TWOLEVEL option in the ANALYSIS command), but am wondering if the DEFINE command will use the original or centered form of the variables.

 Linda K. Muthen posted on Thursday, July 12, 2007 - 8:56 am
The transformations in the DEFINE command are done before the centering.
 Clemens Lechner posted on Thursday, February 16, 2012 - 4:23 am
Dear Dr. Muthen,

I wonder whether, as a consequence of the above, the newly defined interaction terms should be listed under the CENTERING command?

Does the order of operations (first computing the interaction terms, then centering) affect the results?
 Linda K. Muthen posted on Thursday, February 16, 2012 - 2:50 pm
Yes, the order of operations matters. Any transformations using DEFINE should be done first and the data saved. The centering should be done on the saved data.
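
To see why the order matters, here is a minimal numeric sketch in Python (not Mplus syntax). The data values and the helper name center are made up for illustration. It compares centering the product of the raw variables with multiplying the already centered variables.

```python
from statistics import mean

# Made-up illustrative data, not taken from the thread.
a = [1.0, 2.0, 3.0, 4.0]
b = [2.0, 1.0, 4.0, 5.0]

def center(values):
    """Subtract the mean of the list from every value."""
    m = mean(values)
    return [v - m for v in values]

# Order 1: multiply the raw variables, then center the product.
product_then_center = center([x * y for x, y in zip(a, b)])

# Order 2: center each variable first, then multiply.
center_then_product = [x * y for x, y in zip(center(a), center(b))]

print(product_then_center)
print(center_then_product)
```

The two lists differ, so the two orders define different interaction variables.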
 C. Lechner posted on Friday, February 17, 2012 - 6:38 am
Ok, thank you very much for your answer!
May I ask two additional questions:

#1: I suppose the same would apply to interactions between a latent variable and a manifest variable computed using the XWITH command? I would first compute the interaction, save it, and then center it along with the other variables in the model?

#2: Assume I have a multilevel model with two predictors and an interaction between the two on level 1.
One of the two predictors that interact has a random effect, the other is treated as a fixed effect. The interaction thus has to be treated as a random effect as well.
However, do BOTH predictors that are part of the interaction have to be treated as random, or will it suffice to treat one as random and the second one as fixed (as I would assume)?
Technically, both work fine, because in a regression or path model, Mplus will treat these interactions as any other variables. But is it correct?
 Bengt O. Muthen posted on Friday, February 17, 2012 - 8:22 pm
1. Using XWITH, you would center the manifest variable in Define say, but do that in the same one-step analysis. The latent variable has mean zero in most models and need not be altered.

2. I don't see that the choice of fixed or random for each of the variables has any implication for their interaction.
 Chuck Burgess posted on Monday, February 27, 2012 - 6:03 pm
Just so that I'm clear, it seems like there is no way to use the CENTER command to group mean center a set of variables and then use them in a DEFINE statement in the same procedure. For example:

names = AgencyID Gender Age T employ enroll engage housegb incany totsup infsup
formsup anysup anyinf anyform;
cluster = AgencyID;
missing are all .;
usevar = Gender Age T incany formsup Intx;
categorical are incany;
within = T formsup Intx;
between = Age Gender;
center = grand mean (Age) group mean (T formsup);

Intx = T*formsup;

This would compute the Intx variable before group mean centering T and formsup, and this isn't what I want. Is there any way around this? The manual states that the CLUSTER_MEAN option also cannot be used with subsequent DEFINE statements. So I guess that leaves me with using the SAVEDATA command to save the group means, then running another procedure using those saved variables to compute the group mean centered values, saving that data a final time, and running a third procedure calculating the interaction term with the saved, group mean centered values. Is that correct? Or did I add in an extra step somewhere? Thanks for all your help!
 Linda K. Muthen posted on Tuesday, February 28, 2012 - 10:18 am
It sounds like you are correct.
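
The computation that the multi-step procedure is meant to produce can be sketched outside Mplus. The following Python sketch uses made-up records and a hypothetical helper, group_mean_center. It group mean centers T and formsup within each agency and only then forms the product.

```python
from collections import defaultdict

def group_mean_center(values, clusters):
    """Return each value minus the mean of its own cluster."""
    sums = defaultdict(float)
    counts = defaultdict(int)
    for v, c in zip(values, clusters):
        sums[c] += v
        counts[c] += 1
    return [v - sums[c] / counts[c] for v, c in zip(values, clusters)]

# Made-up records: two agencies with three clients each.
agency  = [1, 1, 1, 2, 2, 2]
T       = [0.0, 1.0, 2.0, 0.0, 2.0, 4.0]
formsup = [1.0, 3.0, 5.0, 2.0, 2.0, 5.0]

# Step 1: group mean center both predictors.
T_c = group_mean_center(T, agency)
formsup_c = group_mean_center(formsup, agency)

# Step 2: build the interaction from the centered values.
intx = [t * f for t, f in zip(T_c, formsup_c)]
print(intx)
```

Columns prepared this way could be saved to the data file and read into the analysis as ordinary variables.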
 Karen S. Mitchell posted on Monday, July 02, 2012 - 1:19 pm
Hi, I am running a two-level model to test group differences before and after an intervention. I'm entering my own time variable to represent the number of days since baseline (see sample script below). I noticed that Mplus is automatically centering my time variable. Is there a way to not center it? I would like the baseline (T1) to = 0, as this is more meaningful. Thanks!

WIDE = DV_T1 DV_T2 DV_T3 |
T1 T2 T3 ;
LONG = DV | timeB ;

IDVARIABLE = person ;

Variable: Names are ID group DV_T1 DV_T2 DV_T3 T1 T2 T3 ;

Usevariables are
group DV timeB person ;

Cluster = person ;
Within time timeB;
Between = group ;

Missing are .;


s | DV on timeB ;

DV s on group ;
DV with s ;
 Linda K. Muthen posted on Tuesday, July 03, 2012 - 10:57 am
The REPETITION option assigns to the variable time consecutive numbers starting with zero.
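
As a rough illustration in Python (not Mplus), the sketch below reshapes one made-up person from wide to long format. It keeps two time variables apart: an occasion index numbered 0, 1, 2, and a user-supplied variable holding days since baseline. The dictionary layout and the key names are assumptions for the example only.

```python
# One made-up person in wide format: three outcomes and three day counts.
wide = {"person": 7, "DV": [10.0, 12.0, 15.0], "days": [0, 14, 45]}

# Long format: one row per occasion.
# "time" is the consecutive occasion index starting at zero;
# "timeB" carries the user's own values (days since baseline).
long_rows = [
    {"person": wide["person"], "time": k, "DV": dv, "timeB": d}
    for k, (dv, d) in enumerate(zip(wide["DV"], wide["days"]))
]

for row in long_rows:
    print(row)
```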
 Melvin C Y posted on Friday, January 25, 2013 - 4:44 am
In my two level sem model, I specified the following (there're more variables but I'll keep it simple)
Within is x1 x2; !very low ICC.

% within%
WL1 by x1 x2 x3;

Dep on WL1;

% between%
BL2 by x3;

Dep on BL2;

Fit statistics are fine except for L2 srmr (above 1.0). When I add groupmean centering to x1 and x2 in a subsequent run, the L2 srmr improved substantially.
What could be the reason for this? Should I center?
I placed x1 and x2 at within as their ICCs were very low.
 Bengt O. Muthen posted on Friday, January 25, 2013 - 4:18 pm
I would feel more comfortable with other fit indices for two-level modeling. Stay with chi-square, RMSEA, and CFI.
 Kirill Fayn posted on Thursday, May 30, 2013 - 12:45 am

I am trying to run my first MLM in Mplus and am having difficulty centring my level one variables.

The model and the error is below:

USEVARIABLES ARE Interest Cope1 Nov1 ZOpen ZInt;
WITHIN = Cope1 Nov1;
MISSING ARE all (-9999);
CLUSTER = subject;

IntCop | Interest ON Cope1; !need to make these factors
IntNov | Interest ON Nov1;
Interest IntCop IntNov ON ZOpen ZInt

*** ERROR in DEFINE command
Error in assignment statement for CENTER

Could you please help. The syntax seems to be right so I am guessing I can't centre these variables for some reason.

Thanks in advance for your time.

Best regards,

 Linda K. Muthen posted on Thursday, May 30, 2013 - 10:54 am
What version of Mplus are you using? If it is earlier than Version 7, the CENTERING option was in the VARIABLE command. If it is Version 7 or later, please send the output and your license number to
 Katerina Gk posted on Wednesday, October 09, 2013 - 4:42 am
I have a 5-factor model (job sat.) and a 3-factor model (self-eff.). I want to aggregate the observed variables of job sat. and self-eff. by school on the between level. If I use CENTERING = GRANDMEAN (x), is that enough for Mplus to understand that I need to aggregate at the between level? The observed variables are the same on the two levels.

Missing are all (999);
CLUSTER IS sxoleio;
er1_w by e1@1... ;
er2_w by e7@1... ;

a1_w by a3@1 ...;
a2_w by a1@1 ...;

er1_w ON a1_w;

er1_b by e1@1...;
er2_b by e7@1...;

a1_b by a3@1 ... ;
a2_b by a1@1 ...;

er1_b ON a1_b;
OUTPUT: standardized;
Thank you very much
 Linda K. Muthen posted on Wednesday, October 09, 2013 - 10:17 am
If you want an aggregated variable on the between level, use the CLUSTER_MEAN option of the DEFINE command to create it. See Example 9.1 where using this variable versus a latent variable decomposition of the individual-level variable is discussed.
 Katerina Gk posted on Wednesday, October 09, 2013 - 11:55 am
Thank you for your help!!
 Ute Hulsheger posted on Thursday, March 27, 2014 - 6:51 am
I am running the following random intercept multilevel model



y ON x;

y ON x;

Is it correct that in such a situation, where x is also used at the between level, x is basically group-mean centered at the within level although I specified GRANDMEAN in the Define section?

Is there a way to overrule this procedure and use grand-mean centering at Level 1?

Many thanks in advance for your time.
 Linda K. Muthen posted on Thursday, March 27, 2014 - 1:53 pm
You can put x on the WITHIN list and then create a cluster-level variable for x on between using the CLUSTER_MEAN option of the DEFINE command. See Example 9.1.
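
What this suggestion amounts to numerically can be sketched in Python (not Mplus syntax). The data are made up and the helper cluster_means is hypothetical. The within-level predictor is x minus the grand mean, and the between-level predictor is the observed cluster mean of x.

```python
from collections import defaultdict
from statistics import mean

def cluster_means(values, clusters):
    """Map each cluster id to the mean of its values."""
    sums, counts = defaultdict(float), defaultdict(int)
    for v, c in zip(values, clusters):
        sums[c] += v
        counts[c] += 1
    return {c: sums[c] / counts[c] for c in sums}

# Made-up data: two clusters with two members each.
cluster = [1, 1, 2, 2]
x = [2.0, 4.0, 7.0, 9.0]

# Within-level predictor: grand mean centered x.
grand = mean(x)
x_within = [v - grand for v in x]

# Between-level predictor: observed cluster mean, repeated per member.
means = cluster_means(x, cluster)
x_between = [means[c] for c in cluster]

print(x_within)
print(x_between)
```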
 Paraskevas Petrou posted on Tuesday, July 15, 2014 - 6:44 am

In relation to the issues of centering in multilevel models raised above, I have two questions:

1. If any transformations (e.g., interaction between observed variables) are made before centering, it means that the interaction term does not use standardized scores of the products, which violates a basic requirement for the computation of any interaction term. How can I bypass this problem? Or is it not a problem?

2. When using grandmean centering (or no centering) for my within-level predictors, the fit of the model is considerably higher than when using groupmean centering. What could be the reason? Is there a preferable centering method for within-level predictors?

Thank you very much in advance!

 Linda K. Muthen posted on Tuesday, July 15, 2014 - 11:42 am
1. It is standard to center, not standardize, variables before creating an interaction between them.

2. See the Raudenbush and Bryk book. This is a complex topic.
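
One commonly cited reason for centering before forming a product is that it tends to reduce the correlation between a predictor and the product term. A small Python sketch with made-up data shows this; the helper names center and pearson are illustrative, and the size of the effect depends on the data.

```python
from math import sqrt
from statistics import mean

def center(values):
    """Subtract the mean of the list from every value."""
    m = mean(values)
    return [v - m for v in values]

def pearson(u, v):
    """Pearson correlation computed from centered values."""
    uc, vc = center(u), center(v)
    num = sum(p * q for p, q in zip(uc, vc))
    den = sqrt(sum(p * p for p in uc) * sum(q * q for q in vc))
    return num / den

# Made-up data.
x = [1.0, 2.0, 3.0, 4.0, 5.0]
z = [2.0, 1.0, 4.0, 3.0, 5.0]

# Product of raw variables versus product of centered variables.
raw_product = [a * b for a, b in zip(x, z)]
xc, zc = center(x), center(z)
centered_product = [a * b for a, b in zip(xc, zc)]

r_raw = pearson(x, raw_product)
r_centered = pearson(xc, centered_product)
print(round(r_raw, 3), round(r_centered, 3))
```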
 Zen Goh posted on Friday, August 01, 2014 - 7:16 am
I'm running a 1-1-1 mod-med model, and have a generic question about centering.

(1) Why do we only center the X but not the M(mediator), as we do using HLM software? In HLM software, X and M are specified as group-centered predictors, so that only the within-level relationships are apparent. In Mplus, I'm only allowed to center X but not M. (see code and error message below)

(2) How does the lack of centering of the mediator in Mplus affect the results and interpretation?

 Zen Goh posted on Friday, August 01, 2014 - 7:18 am
NAMES = clust Gender Age Child WLoad WFCts WFCemo LSat MgSup;
USEVARIABLES ARE WLoad WFCts LSat Gender Age Child MgSup;
BETWEEN = Gender Age Child MgSup;
CLUSTER= clust;

*** WARNING in MODEL command
Variable on the left-hand side of an ON statement in a | statement is a
WITHIN variable. The intercept for this variable is not random.
Variable: WFCTS
*** ERROR in MODEL command
Within-level variables cannot be used on the between level.
Within-level variable used: WFCTS
*** ERROR in MODEL command
Within-level variables cannot be used on the between level.
Within-level variable used: WFCTS
*** ERROR in MODEL command
Within-level variables cannot be used on the between level.
Within-level variable used: WFCTS
The following MODEL statements are ignored:
* Statements in the BETWEEN level:
 Linda K. Muthen posted on Friday, August 01, 2014 - 10:54 am
Variables on the BETWEEN list cannot be used in the within part of the model. They are measured on the cluster level. This is not related to centering. Read Example 9.1. It goes over all of the multilevel options. Example 9.2 shows a random slope model with a cross-level interaction.
 Zen Goh posted on Friday, August 01, 2014 - 12:53 pm
I think I might not have been clear in my question.

My mediator (not moderator) variable is also a within, level-1 variable - why do we not center this as well? Would this not affect the interpretation of the results?
 Bengt O. Muthen posted on Friday, August 01, 2014 - 2:24 pm
You can center mediators as well. The error message refers to something else.
 Tina Davidson posted on Thursday, October 02, 2014 - 1:25 am

I am running a 1-1-1 path model (using manifest variables) with non-independence in my DV (high ICC1; performance rated by common supervisor). In reading about centering (e.g., Enders & Tofighi, 2007), it is clear I need to group-mean center my predictors to prevent between-group variance biasing my results. However, I wonder why we don't group-mean center the DV to "purge" out between-group variance in the DV and only include within-group variance?

In books and in papers I generally find that the DV is not centered, but in the case of non-independence, doesn't the DV then include a lot of "noise" and how exactly does Mplus deal with the DV then?

E.g., is defining the DV at the WITHIN level an option? Are there other options to take care of this?

Thank you very much for your advice,
 Linda K. Muthen posted on Thursday, October 02, 2014 - 9:26 am
You model between-group variance in multilevel modeling. There is no need for anything else.
 Eivind Ystrøm posted on Friday, October 10, 2014 - 4:12 am
Dear Muthén,
I appreciate the new define functions in version 7.2, where order of commands has significance.

In a multilevel regression model where y(within) is regressed on x1(groupmeancentered (GMC)) and x2(GMC), should the interaction term for x1 and x2 (x1*x2) be computed before or after the GMC? That is, to first calculate x1*x2, and then GMC x1, x2, and x1*x2, OR calculate x1(GMC) and x2(GMC), and then calculate x1*x2 as x1(GMC)*x2(GMC).

1) y(within) on x1(GMC) x2(GMC) x1*x2(GMC)
2) y(within) on x1(GMC) x2(GMC) x1(GMC)*x2(GMC)

(say that, for example, y is lung cancer, x1 is smoking, and x2 is working with asbestos).

Hope you can help me with this conundrum,
Eivind Ystrom
 Bengt O. Muthen posted on Friday, October 10, 2014 - 3:26 pm
You can get wider input on general centering matters from Multilevelnet.
 Angela posted on Friday, October 09, 2015 - 1:55 pm

I am running a model using type=complex and I would like to look at some interaction effects. I read on the discussion board that when running a two-level model, centering should occur after creating the interaction variables in the define statement. However, I am not looking at effects across levels, so I was not sure if this applied to my model. At what point should I center my variables when looking at interaction effects in a multilevel model?

 Bengt O. Muthen posted on Friday, October 09, 2015 - 4:10 pm
Type=Complex is not a 2-level model but a single-level model. With single-level models it is not required to center variables, but if it is done it should be before creating the interaction.
 JLuk posted on Sunday, December 13, 2015 - 7:29 pm
Cross-Level Interaction in Multilevel ZIP Model

1. I am running a two-level zip model, with drinking being the outcome. I'm interested in testing a cross-level interaction. Can this be done in Mplus (both for the zero-inflation and count part)?

2. I adapted syntax from example 9.2, with the following key changes:
- VARIABLE: a zip model is specified by "count is drink(i)"
s | drink on x;
si | drink#1 on x;
drink s on w xm;
drink with s;
drink#1 si on w xm;
drink#1 with si;

Does this look right?

3. Is the plotting function for cross-level interaction (second part of example 9.2) robust while using ZIP?

Thanks a lot for your help!
 Bengt O. Muthen posted on Tuesday, December 15, 2015 - 6:23 pm
1. Yes, I believe so.

2. Looks right.

3. Yes, use the style of
 JLuk posted on Tuesday, December 15, 2015 - 6:40 pm
Great! Thank you so much, Dr. Muthen!
 JLuk posted on Monday, December 21, 2015 - 12:41 pm
Cross-Level Interaction in Multilevel Zero-Inflated Model

1. In running a cross-level interaction, I compared using ZIP vs. ZINB models. It appears that the multilevel ZIP model ran, but the ZINB model did not. It gave the following error message:
Internal Error Code: GH1006.
An internal error has occurred. Please contact us about the error,
providing both the input and data files if possible.

I think using ZIP model would be justifiable given similar BIC. However, I'm just curious why the model did not run with ZINB.

2. In the syntax in the previous post:
s | drink on x;
si | drink#1 on x;
drink s on w xm;
drink with s;
drink#1 si on w xm;
drink#1 with si;

If I would like to include other covariates in the within-level, do I simply specify:
s | drink on x covar1 covar2;
si | drink#1 on x covar1 covar2;

3. I'm using Mplus on a Mac and am realizing that the plot function may not work on Mac. Is that still the case with the latest version? If so, any alternative resources that you'd recommend to probe the interaction?

Thank you!
 Bengt O. Muthen posted on Monday, December 21, 2015 - 6:38 pm
Please send files to Support along with your license number.

The Mac plots are listed on our website.
 Lauren Menger posted on Monday, January 18, 2016 - 1:55 pm
I have a question related to your Dec 9, 2011 2:22 post. I am running a multilevel regression model and getting different ICCs depending on which predictor variables are in my model (e.g., models with and without a moderator variable). In your previous post, you explained:

"Usually the ICC changes with the covariates due to this misspecification where a covariate is on the within list but actually it is not a within level variable because it hasn't been centered. You can use this command to fix this misspecification:
for all x variables that are on the within= list."

The only x variables I have not been centering are a binary treatment variable (treatment = 1 vs control = 0) and my interaction term because both have a meaningful zero. If I center these variables, I believe I will no longer be able to easily interpret the intercept as the score for control participants with average (due to centering) scores on all other predictors.

What is your recommendation in this situation? Which ICCs should I report and why?
 Bengt O. Muthen posted on Monday, January 18, 2016 - 2:29 pm
We have a FAQ explaining why ICCs change across models:

"Icc changes from one model to another"

One approach is to use the ICCs from a Type = Twolevel Basic analysis.

You may also want to ask this general analysis question on Multilevelnet.
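
For intuition about what an ICC computed without any covariates measures, here is a minimal Python sketch of the classical one-way ANOVA estimator for equally sized clusters. This is an assumption for illustration and is not the estimator Mplus uses; the function name anova_icc and the data are made up.

```python
from statistics import mean

def anova_icc(groups):
    """One-way ANOVA ICC for clusters that all have the same size k."""
    k = len(groups[0])
    n_groups = len(groups)
    grand = mean(v for g in groups for v in g)
    group_means = [mean(g) for g in groups]
    # Between-cluster and within-cluster mean squares.
    msb = k * sum((m - grand) ** 2 for m in group_means) / (n_groups - 1)
    msw = sum(
        (v - m) ** 2 for g, m in zip(groups, group_means) for v in g
    ) / (n_groups * (k - 1))
    return (msb - msw) / (msb + (k - 1) * msw)

# Made-up outcome values: three clusters with two members each.
y = [[1.0, 3.0], [4.0, 6.0], [8.0, 8.0]]
icc = anova_icc(y)
print(round(icc, 3))
```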
 Heather Prime posted on Thursday, May 05, 2016 - 1:44 pm

I recall reading that it has become possible to order define commands so that centering occurs before computation of interactions. However, I cannot seem to find this information in the user's guide or on the discussion board. Could you please speak to how I can do this, if possible?

 Linda K. Muthen posted on Thursday, May 05, 2016 - 4:13 pm
See the DEFINE command in the current user's guide on the website. Page 578.
 Dzifa Adjaye-Gbewonyo posted on Sunday, March 12, 2017 - 4:59 pm
I am conducting a multi-group analysis and wanted to center the age variable on the mean age for each group for more meaningful interpretation of the intercepts. However, my data is also complex survey data so I am using TYPE=Complex, but I know the default for CENTER (GROUPMEAN) is to use the cluster mean. Because I wanted to instead center age on the mean for the groups of the multi-group analysis (not the cluster mean), I used the following command

DEFINE: CENTER age (GROUPMEAN urban) where urban is the grouping variable that denotes urban/rural groups. However, I got the following warnings:

*** WARNING in DEFINE command
Specification for the CENTER function with GROUPMEAN includes extra
information that will be ignored. Extra information given below:
Categorical variable DEP contains less than 2 categories in Group 0.

Is it possible to specify a different groupmean variable than a clustering variable for complex data (i.e. in the case of a grouping variable for multi-group analysis)? It's not essential that I center age, but I just thought I would try it.
 Bengt O. Muthen posted on Sunday, March 12, 2017 - 5:25 pm
Just say

Define: age = age - x;

where x is a value that you want to use to center age.
 Dzifa Adjaye-Gbewonyo posted on Monday, March 13, 2017 - 6:23 am
Thanks, that is straightforward. Can I add "if" statements to the command to condition it on something else? For instance

Define: age= age-x if <grouping>=0
age= age-y if <grouping>=1

where grouping represents the variable that designates the different groups
 Linda K. Muthen posted on Monday, March 13, 2017 - 7:00 am
Yes. See the DEFINE command in the user's guide for the correct specification.
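
The arithmetic behind such conditional centering can be sketched in Python (this is not DEFINE syntax, which is documented in the user's guide). The data are made up, and the group-specific value is assumed here to be the group's own mean age.

```python
from statistics import mean

# Made-up data: urban is the multi-group grouping variable (0 or 1).
urban = [0, 0, 0, 1, 1, 1]
age = [30.0, 40.0, 50.0, 20.0, 25.0, 30.0]

# The value to subtract in each group (here: that group's mean age).
group_value = {
    g: mean(a for a, u in zip(age, urban) if u == g) for g in set(urban)
}

# Subtract the value that belongs to each person's own group.
age_c = [a - group_value[u] for a, u in zip(age, urban)]
print(age_c)
```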
 Frank Egloff posted on Thursday, December 14, 2017 - 10:42 am
Dear Drs. Muthen,

I have a question regarding what Mplus is doing when running this multilevel model:

iw sw | tv12Slow@0 tv22Slow@1 tv32Slow@2 tv42Slow@3 tv52Slow@4
tv62Slow@5 tv72Slow@6 tv82Slow@7;
tv12Slow - tv82Slow (1);

b | sw on iw;

ib sb qb| tv12Slow@0 tv22Slow@1 tv32Slow@2 tv42Slow@3 tv52Slow@4
tv62Slow@5 tv72Slow@6 tv82Slow@7;
tv12Slow - tv82Slow@0;

ibxcos | ib xwith cos;
sb on cos IB ibxcos;
b on cos;

Are “iw” and “sw” automatically centered at group mean?

Thank you for your answer!

 Bengt O. Muthen posted on Thursday, December 14, 2017 - 12:16 pm
The means of iw and sw are zero. See for instance the output shown on our website for UG ex 9.12.
 Frank Egloff posted on Friday, December 15, 2017 - 3:58 am
Dear Dr. Muthen,

thank you for your answer! I found in ex 9.12 that the mean of x (which is the level 1 predictor) is -0.021 (which is almost zero) which is the grand mean (not the group mean, which might possibly still differ).

I actually got a request from a reviewer who stated (drawing on Enders & Tofighi, 2007) that level 1 predictors (as "iw" (latent variable) in my case) need to be centered at group mean to allow for an unbiased cross-level-interaction parameter. I agree.

Thus, in my case I am particularly interested if the group means of “iw” are zero. When I look at the factor scores of my model and calculate the “iw”-group means by hand, they are almost zero (but not exactly zero: e.g. 0.023, -0.053, -0.074 – which are close to zero values considering an “iw”-standard deviation of 4.87).

1. What is Mplus doing (is group mean centering applied to lower level latent predictor variables by default)?

2. Why are there these small deviations from zero in group means (assumed group mean centering was applied)?

3. Do you know a source to cite which states what Mplus is doing?

Thank you so much again!
 Tihomir Asparouhov posted on Friday, December 15, 2017 - 10:31 pm
1. E(iw)=E(sw)=0. This is how you specified the model. I would not characterize this as centering. It is simply how you specified the model. There is no hidden procedure here.

2. Factor scores are based on the data in each row. I wouldn't really expect the average to be zero. If you have a population with mean zero and you draw a sample from that population, the sample mean is not zero. In some very simple models this can indeed happen (models with an explicit solution for the factor scores). This is not a simple model so I would not expect this to happen. Another argument why you should not expect the average factor score to be exactly zero is this. If the factor scores are supposed to average exactly to zero, then they should do so also in each cluster - but they don't - the model parameter estimates are based on the entire population and would not be able to do that for every cluster. Other reasons I would not expect to see a zero average are missing data and an unbalanced design.
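
The sampling part of this argument can be checked with a small Python simulation. It illustrates only that draws from a zero-mean population do not average exactly to zero; it does not estimate factor scores. The seed, the cluster count, and the cluster size are arbitrary assumptions; the standard deviation 4.87 is the value quoted in the question.

```python
import random
from statistics import mean

random.seed(42)

SD = 4.87                     # standard deviation quoted in the question
N_CLUSTERS, N_PER = 10, 100   # arbitrary sizes for illustration

# Every value is drawn from a population whose mean is exactly zero.
cluster_means = [
    mean(random.gauss(0.0, SD) for _ in range(N_PER))
    for _ in range(N_CLUSTERS)
]
overall_mean = mean(cluster_means)

# The sample means are near zero but not exactly zero.
print([round(m, 3) for m in cluster_means])
print(round(overall_mean, 3))
```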

3. Muthén, B. & Asparouhov, T. (2009). Growth mixture modeling: Analysis with non-Gaussian random effects. In Fitzmaurice, G., Davidian, M., Verbeke, G. & Molenberghs, G. (eds.), Longitudinal Data Analysis, pp. 143-165. Boca Raton: Chapman & Hall/CRC Press.
 Frank Egloff posted on Thursday, December 21, 2017 - 5:54 am
Dear Dr. Asparouhov,
thank you very much for your answers. I have a request concerning your answer to question 1. Actually, my problem is that from the Mplus syntax (that I specified) I still have difficulty concluding exactly what it does. Since you explained that E(iw)=E(sw)=0, I can conclude that these two variables have a mean of zero in some way - but I do not know if iw and sw add to zero within their groups. I think it would help me a lot if you could tell me if the following equations (focus on level 2) represent the model I specified. If yes, is “Muthén, B. & Asparouhov, T. (2009). Growth mixture modeling: Analysis with non-Gaussian random effects” the appropriate source to cite? If no, could you please write down the level 2 equations that correspond to my model?

Level 1:
Ytij = PI0ij + PI1ij * (timetij) + PI2ij * (timetij)squared + Etij

Level 2:
PI0ij = ß00j + R0ij
PI1ij = ß10j + R1ij
R1ij = ß20j + ß11j * R0ij + R2ij

Level 3:
ß00j = GAMMA000 + U00j
ß10j = GAMMA100 + GAMMA101 * (COSj) + GAMMA102 * ß00j + GAMMA103 * (COSj * ß00j) + U10j
ß11j = GAMMA200 + GAMMA201 * (COSj) + U20j
PI2ij = GAMMA300

Your help is very much appreciated!
Merry Christmas!
 Tihomir Asparouhov posted on Thursday, December 21, 2017 - 11:19 am
That reference is the correct reference. The equations are correct with the exception of one thing: you have to delete ß20j. According to the model E(iw)=E(sw)=0 not just for the entire population but also within each cluster.
 Frank Egloff posted on Thursday, January 11, 2018 - 9:56 am
Dear Dr. Asparouhov,

thank you very much for your kind answer! I understood that when specifying…

iw sw | tv12Slow@0 tv22Slow@1 tv32Slow@2 tv42Slow@3 tv52Slow@4
tv62Slow@5 tv72Slow@6 tv82Slow@7;
tv12Slow - tv82Slow (1);
b | sw on iw;

(see full input above)

…Mplus is doing automatically…

Level 2:
PI0ij = ß00j + R0ij
PI1ij = ß10j + R1ij
R1ij = ß11j * R0ij + R2ij

…which expresses that…

…R1ij is predicted by R0ij.

My second question is for a source that confirms that Mplus is doing this when we use this specification. So I looked into your recommendation (Muthén, B. & Asparouhov, T. (2009). Growth mixture modeling: Analysis with non-Gaussian random effects).
...unfortunately I could not find any equations or textual descriptions that would confirm that Mplus is doing “R1ij = ß11j * R0ij + R2ij” when specifying “b | sw on iw;” (not even when looking into papers cited in this work).

Could you help me to find such a statement?

Your help would be again, very much appreciated!

With kind regards

 Tihomir Asparouhov posted on Thursday, January 11, 2018 - 12:32 pm
I would say that you should refer to page 755 from Mplus user's guide. That page explains the Mplus language regarding
s | Y on X
i.e. it gives the language specification for how random slopes are specified in Mplus.
 Frank posted on Friday, January 12, 2018 - 3:01 am
Dear Dr. Asparouhov,

again, thank you very much for your kind answer!

I looked into the Users’ Guide (p. 755) and found that there is a relatively general description of what Mplus does when we specify: s | y ON x;

“s is a random slope in the regression of y on x where y is a continuous dependent variable and x is an independent variable.”

So actually from this I cannot be sure to infer that it is actually the residuals of x and y that are regressed on each other.

Nevertheless, I looked into example 9.2. (p.277) and found a promising additional statement (sentence two):

"The random slope s is defined by the linear regression of the dependent variable y on the observed individual-level covariate x. The within-level residual variance in the regression of y on x is estimated as the default."

Does this actually say that it is the residuals of y that are regressed on the residuals of x?

Your help would be again, very much appreciated!

With kind regards

 Tihomir Asparouhov posted on Friday, January 12, 2018 - 9:59 am
Your understanding is correct. We do not use that language, however. Instead we use this. Every variable is decomposed into within and between components.
You are calling YW the residual. In the Mplus framework we call this the within component of the variable Y, but it is the same thing. In your case IW and SW are the within-level components of the random intercept and slope.

If you are still unsure about what Mplus does, I would recommend that you conduct a little Monte Carlo study. Generate data according to your model (and your understanding of what the model is). You can of course generate such data in Mplus, but you probably want to generate it somewhere else where you program the generation yourself. Generate a large sample, then run the model through Mplus and verify that all parameters are recovered by the Mplus estimation.
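
A minimal sketch of such a self-programmed generation step, in Python, might look like the following. It assumes a much simpler model than the one in this thread: a two-level random intercept and random slope regression with made-up population values. As a stand-in for the estimation step it averages per-cluster OLS slopes, which is not what Mplus does.

```python
import random
from statistics import mean

random.seed(2018)

# Assumed population values for illustration only.
TRUE_INTERCEPT, TRUE_SLOPE = 1.0, 0.5
SD_U0, SD_U1, SD_E = 0.7, 0.2, 1.0   # random-effect and residual SDs
N_CLUSTERS, N_PER = 200, 50

def ols_slope(x, y):
    """Ordinary least squares slope of y on x."""
    mx, my = mean(x), mean(y)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    return sxy / sxx

slopes = []
for _ in range(N_CLUSTERS):
    # Cluster-specific intercept and slope.
    b0 = TRUE_INTERCEPT + random.gauss(0, SD_U0)
    b1 = TRUE_SLOPE + random.gauss(0, SD_U1)
    x = [random.gauss(0, 1) for _ in range(N_PER)]
    y = [b0 + b1 * xi + random.gauss(0, SD_E) for xi in x]
    slopes.append(ols_slope(x, y))

# With a large sample the average slope should be close to TRUE_SLOPE.
recovered_slope = mean(slopes)
print(round(recovered_slope, 3))
```

If the generated data are then analyzed in Mplus, the estimates can be compared with the values used for generation.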
 Bengt O. Muthen posted on Friday, January 12, 2018 - 11:50 am
The connection between the Mplus notation and that of multilevel growth modeling is laid out in detail in our short course Topic 8 video and handout on our website, starting with slide 58.
 Emily L posted on Tuesday, May 15, 2018 - 2:25 pm
Hello Dr. Muthen,

I am running a two level multigroup random intercepts model, I am using multiple imputation for missing data. Some of my clusters were more sampled than others.

I would like to center the between-group variables. According to the manual, when using a two-level model it appears that CENTER (GRANDMEAN) uses the mean from the within level for a given variable. It is only for three-level models that grand mean centering will use the between-level mean. Is there any way to center my level two variables in my Mplus syntax?

Moreover, two of my between-group variables are aggregates created using CLUSTER_MEAN. To confirm, it is possible to use these aggregated variables in the center command, but not if I were to calculate the center myself?
 Tihomir Asparouhov posted on Tuesday, May 15, 2018 - 4:58 pm
I think it should work fine for you. The code should look like this

center xb(grandmean);
 Emily L posted on Wednesday, May 16, 2018 - 9:00 am
Hello Dr. Asparouhov,

Thank you for your response!

Do you know if there is any way to resolve the issue of centering with uneven cluster sizes? It's my understanding that Mplus will use the grand mean across all cases, not the mean of variables at the L2 level.

That is, if there are 50 clusters of 95-105 individuals, it will use the grand mean of all individual cases for the level 2 variables (i.e., ~5000) and not the mean of the clusters (i.e., 50), consequently oversampling certain clusters.
 Tihomir Asparouhov posted on Wednesday, May 16, 2018 - 9:38 am
I don't think you are correct. Please send your example to I am pretty sure that Mplus uses the average over the 50 clusters not the 5000 individuals. You can of course use the define command to subtract whatever value you want from a variable. You can use type=basic to figure out what the mean is.
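
The two candidate means differ only when cluster sizes are unequal. A short Python sketch with made-up numbers shows the difference; it does not show which one any particular Mplus version computes.

```python
from statistics import mean

# Made-up data: (cluster id, cluster-level value, cluster size).
clusters = [("A", 2.0, 1), ("B", 4.0, 3)]

# The cluster-level value repeated once per individual in the cluster.
individual_rows = [
    value for _cid, value, size in clusters for _i in range(size)
]

# Mean over individuals weights large clusters more heavily.
mean_over_individuals = mean(individual_rows)

# Mean over clusters gives each cluster one vote.
mean_over_clusters = mean(value for _cid, value, _size in clusters)

print(mean_over_individuals, mean_over_clusters)
```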
 Ted Fong posted on Sunday, July 15, 2018 - 9:52 pm
If I use latent variable decomposition for both the predictors X and outcomes Y in a multilevel path analysis with categorical outcome, do I still need to center the individual-level component of X by grandmean/groupmean? Would doing so facilitate interpretation or would it be redundant?

 Bengt O. Muthen posted on Monday, July 16, 2018 - 2:47 pm
Q1: No.

Q2: It would be wrong.
 Ebrahim Hamedi posted on Tuesday, July 17, 2018 - 1:00 am
On page 639 of the manual, there is a paragraph about the order of execution in the define command when there is "center", which is difficult for me to understand. By asking this question it will hopefully become clear.

x1 = x1/1000;
x2 = x2/1000;
center x1 x2 (grandmean);
x1sq = x1*x1;
x2sq = x2*x2;

Here, I want the variables to be divided first, then centered, and finally squared. Is it the order followed by mplus?

 Linda K. Muthen posted on Tuesday, July 17, 2018 - 5:57 pm
 Paraskevas Petrou posted on Wednesday, February 13, 2019 - 5:49 am
Dear Mplus users,

I have a two-level model where a within-level latent variable correlates with two within-level observed variables. I would like to model these correlations at both levels of analyses. However, this is not possible if I center all variables at the groupmean. How can I use groupmean centering and model my statements at both levels of analyses?

Thank you!
 Bengt O. Muthen posted on Wednesday, February 13, 2019 - 5:51 pm
Use the Cluster_Mean option to create a between-level counterpart to the Group-mean centered within variable.
 Paraskevas Petrou posted on Thursday, February 14, 2019 - 12:52 am
Thank you Bengt!

I assume that in the define command, first, I groupmean center all the within-level variables and then I cluster_mean them. Does that mean that the cluster_mean command uses the uncentered or the centered within-level variables?

 Bengt O. Muthen posted on Friday, February 15, 2019 - 8:19 am
Cluster_Mean is done first so it uses the uncentered values.
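
A small Python sketch (not Mplus syntax) of what this ordering implies, using hypothetical data: the cluster means are taken from the uncentered values, so the between-level counterpart keeps the cluster differences. If the means were instead taken from the already group-mean centered values, they would all be zero.

```python
from collections import defaultdict


def cluster_means(cluster_ids, values):
    """Return a dict mapping each cluster id to the mean of its values."""
    sums = defaultdict(float)
    counts = defaultdict(int)
    for cid, value in zip(cluster_ids, values):
        sums[cid] += value
        counts[cid] += 1
    return {cid: sums[cid] / counts[cid] for cid in sums}


# Hypothetical data: two clusters with two observations each.
cluster_ids = [1, 1, 2, 2]
x = [2.0, 4.0, 10.0, 14.0]

# Cluster means computed first, from the uncentered values.
means = cluster_means(cluster_ids, x)                  # {1: 3.0, 2: 12.0}
x_between = [means[cid] for cid in cluster_ids]        # between-level counterpart

# Group-mean centering then removes those means on the within level.
x_within = [v - means[cid] for cid, v in zip(cluster_ids, x)]

# Cluster means of the centered values carry no between information.
means_of_centered = cluster_means(cluster_ids, x_within)

print(means, x_within, means_of_centered)
```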
 Jilian Halladay posted on Monday, August 05, 2019 - 8:26 am

When grand mean centering at an upper level, is the variable centered at the upper or the individual level?

For example if I have 100 students nested in 10 schools and I am centering a school climate (school level variable), will the school climate mean be at the school level (n=10) or the student level (n=100)?

Thanks in advance,
 Bengt O. Muthen posted on Monday, August 05, 2019 - 5:10 pm
Upper level so school level.
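
The difference between the two candidate means shows up when schools have different numbers of students. The following Python sketch (not Mplus syntax) uses hypothetical data with three schools rather than the ten in the question.

```python
# One row per student: (school id, school climate). Hypothetical values;
# school "C" has four students, the others have one each.
rows = [("A", 2.0), ("B", 4.0), ("C", 6.0), ("C", 6.0), ("C", 6.0), ("C", 6.0)]

# Mean over students: larger schools count more often.
student_level_mean = sum(value for _, value in rows) / len(rows)

# Mean over schools: each school counts once.
school_values = dict(rows)  # climate is constant within a school
school_level_mean = sum(school_values.values()) / len(school_values)

# Grand-mean centering at the school level subtracts the school-level mean.
centered = {school: value - school_level_mean
            for school, value in school_values.items()}

print(student_level_mean, school_level_mean, centered)
```

With these values the student-level mean is 5.0 and the school-level mean is 4.0, so a variable centered at the school level does not average to zero across student rows.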
 Nicole Watkins posted on Monday, September 23, 2019 - 6:29 am
Dr. Muthens,
When I am running a Multiple Group analysis to examine differences by gender, but I also have a separate continuousXcontinuous variable interaction within the model, should I center all of the continuous variables separately for boys and girls, or should I grand-mean center? When I grand-mean center and then look at the output for either boys or girls, the means are not 0 (which makes sense...). But I am not sure if that is correct, or if I should group-mean center them instead so that the mean for each gender is still 0.
 Bengt O. Muthen posted on Monday, September 23, 2019 - 5:05 pm
This general analysis strategy question is suitable for SEMNET. You don't have to center.
 Es Maths posted on Sunday, October 13, 2019 - 4:09 pm
I would like to conduct a single-level CFA with 11 binary variables before conducting a two-level CFA. Is it possible to group-mean center binary variables? When I try to center them, I get the following warning for all items:
Centering can be applied to continuous observed dependent or observed independent variables only. Unable to center variable: Q1D

Is it possible to center binary variables in this case for single-level multilevel analyses, or is centering not necessary here?

 Bengt O. Muthen posted on Monday, October 14, 2019 - 2:09 pm
No, I wouldn't try to group-mean center binary variables. Instead, go straight to the twolevel model with Between-level factors.
 Es Maths posted on Monday, October 14, 2019 - 4:28 pm
Dear Dr Muthen,

Thanks for your quick reply. Very helpful! I need to test measurement invariance at the individual level. Is it OK to ignore the between level when specifying a measurement invariance model at the individual level, or is it better/possible to specify a two-level model but test measurement invariance at only the individual level? I hope I am clear and that this is not too general a question for this topic.

Thanks and best wishes.
 Bengt O. Muthen posted on Wednesday, October 16, 2019 - 7:12 am
It sounds like you have a twolevel situation where the grouping is on Within. That requires special modeling that is described in our Web Note 16:

Asparouhov, T. & Muthén, B. (2012). Multiple group multilevel analysis. Mplus Web Notes: No. 16. November 15, 2012.
 Es Maths posted on Wednesday, October 16, 2019 - 5:29 pm
Many thanks for your response.

I have two levels (students and classes) but I have students from 6 countries. I would like to show that the student assessment measured the same construct at the individual level across six countries. (But country was not treated as a level).

Based on p. 7 of Asparouhov, T. & Muthén, B. (2012), 'Each cluster now contains observations from different groups and the cluster level random effects can be different in all the groups.', I do not think I have such a situation.

Am I right or am I missing something? I just wonder when I specify my measurement models at the individual level, do I need to take into account the nested structure?

As a note, the items are binary.

Thanks for your response.
 Bengt O. Muthen posted on Thursday, October 17, 2019 - 3:03 pm
Sounds like you have a twolevel situation with students in classes. And that you want the 6 countries captured by dummy covariates that influence the factor indicators.
 Jilian Halladay posted on Monday, November 18, 2019 - 8:04 am
Hi there,

I am conducting a 3-level linear regression and my upper level variables are not centering. I am wondering if there is a different code I need to be using to center my upper level variables?

idvariable x_student_ID;
cluster= x_idschool teach_id;
within= s_female s_ageyrs j_pared2 j_int j_ext js_Sel ;
between= (x_idschool) median enrol t_Sel3_mean js_Sel_mean

(teach_id)t_Sel3 t_selprog3;

missing are all (999);

center s_ageyrs
j_pared2 j_int j_ext median enrol js_Sel
t_Sel3 js_Sel_mean t_Sel3_mean (grandmean);

type=threelevel random;

ov on s_female s_ageyrs j_pared2 j_int
j_ext js_Sel;
s_female s_ageyrs j_pared2 j_int
j_ext js_Sel;
%between x_idschool%
ov on median enrol t_Sel3_mean js_Sel_mean;
median enrol t_Sel3_mean js_Sel_mean;
%between teach_id%
ov on t_Sel3 t_selprog3;
t_Sel3 t_selprog3;


Centering (GRANDMEAN)
 Tihomir Asparouhov posted on Monday, November 18, 2019 - 6:30 pm
See page 648 in the User's Guide. You might also find this useful
 Jilian Halladay posted on Tuesday, November 19, 2019 - 6:51 am
Hi there,

Thanks for the response.

In the user's guide, the coding is as follows:

Following is an example of how to specify grand-mean centering for

That is what I used in my model, so I am unsure why the level 1 variables are the only variables being centered.

Let me know if you have any thoughts. Thanks so much.
 Tihomir Asparouhov posted on Tuesday, November 19, 2019 - 7:35 am
I can't replicate this problem. Send the data and the input to

Note that grandmean centering doesn't affect most of the parameters.

You might want to verify that the variables didn't get centered using
savedata: file is 1.dat;
This will show you the data that was analyzed by Mplus, i.e., it should be the centered data.
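
One way to inspect the saved file outside Mplus is sketched below in Python. It assumes the file is whitespace-separated with no header row and that missing values are written as `*`; both are assumptions about the saved format, and the column order has to be taken from the Mplus output. The sample lines are hypothetical.

```python
def column_means(lines):
    """Mean of each whitespace-separated column, skipping '*' entries."""
    sums, counts = [], []
    for line in lines:
        fields = line.split()
        if not fields:
            continue  # ignore blank lines
        if not sums:
            sums = [0.0] * len(fields)
            counts = [0] * len(fields)
        for i, field in enumerate(fields):
            if field != "*":  # assumed missing-value flag
                sums[i] += float(field)
                counts[i] += 1
    return [s / n if n else float("nan") for s, n in zip(sums, counts)]


# Hypothetical saved rows: the first two columns look centered, the third does not.
sample = ["-1.500  0.250  1.000",
          " 0.500      *  1.000",
          " 1.000 -0.250  2.000"]
means = column_means(sample)
print(means)

# For the real file, something like:
# with open("1.dat") as handle:
#     print(column_means(handle))
```

These are means over rows, that is, over individuals. A variable centered over clusters need not average to zero here when cluster sizes differ.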
 anonymous posted on Tuesday, November 26, 2019 - 6:30 am
In a doubly latent MLM model, is the effect of the latent aggregated predictor always an independent classroom effect (not a contextual effect)?
 Bengt O. Muthen posted on Tuesday, November 26, 2019 - 2:53 pm
I think so because the latent variable centering is analogous to the group-mean centered approach shown in the left column of Table 5.11 on page 140 in the Raudenbush-Bryk (2002) book.
 anonymous posted on Tuesday, December 17, 2019 - 8:06 am
What do you think about centering binary variables?

Can individual-level binary predictors be centered (group-mean, grand-mean) in a similar way to continuous individual-level predictors (Enders & Tofighi, 2007)?
 Tihomir Asparouhov posted on Wednesday, December 18, 2019 - 11:52 am
Yes. Take a look at page 11
section "Multilevel regression with a categorical predictor or mediator"
 anonymous posted on Thursday, December 19, 2019 - 2:43 pm
Thank you so much for this and the source.
 Rebecca Lazarides posted on Wednesday, July 08, 2020 - 4:56 am
I modeled a two-level model with certain items included as predictors only at L1 and others at L1 and L2.

For items only included as independent variables at L1 I used group-mean centering. For items that are included at both levels (L1+L2) I used grandmean centering.

My question applies to items that are only used at L1 - does it make sense to center them and would group-mean centering be the correct option here?
Thank you for your response.

Here is an example for my model:

usevariable: x1 x2 x3 z1 z2 z3 y1 y2;

within = x1 x2 x3;

center x1 x2 x3 (groupmean);
center z1 z2 z3 (grandmean);

y by y1 y2;
x by x1 x2 x3;
z by z1 z2 z3;

y on x z;


by by y1 y2;
bz by z1 z2 z3;

by on bz;
 Bengt O. Muthen posted on Wednesday, July 08, 2020 - 5:20 pm
If x1, x2, x3 only vary on within, there is no need to center them. You would do group-mean centering of them on within if they also varied on between. For the y's and the z's, because they are DVs, Mplus automatically does a latent variable decomposition where the within part is used on within and the between part is used on between. So no need for centering there either.
 Rebecca Lazarides posted on Tuesday, July 21, 2020 - 10:14 am
Thank you very much. I have a follow-up question. If x1 is on both levels (within: student, between: classroom), what is the default in Mplus version 8 regarding the centering of x1? Is groupmean or grandmean the default if we do not mention "center =" in our syntax?

Thank you so much for your support!
 Rebecca Lazarides posted on Tuesday, July 21, 2020 - 10:19 am
sorry - a specification: My question pertains to doubly-latent models where x1 is the independent variable on both levels. However, I am generally interested in the defaults of Mplus v8 regarding the centering option.
 Bengt O. Muthen posted on Wednesday, July 22, 2020 - 5:55 pm
Mplus does not do observed-variable centering (group- or grand-mean) by default. But we have commands that will do it for you when you ask. For many models it does latent variable decomposition into within and between components which on the within level is an implicit latent variable group-mean centering that is better than the regular observed-variable group-mean centering. If your model mentions the variance of the x1 variable, Mplus will use a latent variable decomposition. See also the comments in the output that describe the centering. And see our paper:
 Rebecca Lazarides posted on Wednesday, July 22, 2020 - 11:17 pm
Hi, thank you! So when using doubly latent modeling, the default is to use group-mean centering on the within level. What happens on the between level - which is the centering mode selected by Mplus?
 Bengt O. Muthen posted on Thursday, July 23, 2020 - 5:33 pm
Mplus uses the latent between-level part of the variable. This means it is the counterpart to the observed cluster mean.
 Paraskevas Petrou posted on Monday, September 07, 2020 - 2:55 am
Dear Bengt,

Do I understand your answer above correctly? If I decompose the same within-level relationship with latent variables at both the within and the between level (see model below), I do not need to perform any centering? And does my input below look correct overall?

Thank you!

usevariable: x1 x2 x3 y1 y2;

within = ;
between =;

y by y1 y2;
x by x1 x2 x3;

y on x;


by by y1 y2;
bx by x1 x2 x3;

by on bx;
 Bengt O. Muthen posted on Monday, September 07, 2020 - 3:49 pm