Within and Between Specification

Mplus Discussion > Multilevel Data/Complex Sample >

Scott Colwell, PhD posted on Monday, November 28, 2005 - 3:35 pm

Hello:

I keep getting the following error and I can't see what I am doing wrong. Could you please help?

USEV ARE TRUST72 TRUST75 TRUST55 TRUST57
TRUST59 OPOR74 OPOR76 OPOR77 OPOR510
OPOR710 LOYAL108 LOYAL109 LOYAL114
PERF92 PERF93 PERF95 PERF96 PERF97
AGE GENDER BRBNK INTBNK ECO74 ECO710
ECO715 ECO54 ECO56 ESO72 ESO713 ESO79;

CLUSTER = mgrnum;
WITHIN = LOYAL108 LOYAL109 LOYAL114
PERF92 PERF93 PERF95 PERF96 PERF97
AGE GENDER BRBNK INTBNK;

BETWEEN = ECO74 ECO710 ECO715 ECO54 ECO56 ESO72 ESO713 ESO79;

ANALYSIS:
TYPE = TWOLEVEL;
ESTIMATOR = ML;
ALGORITHM = INTEGRATION;
MITERATIONS = 3500;

MODEL:
%WITHIN%
wtrust by TRUST72
TRUST75 (1)
TRUST55 (2)
TRUST57 (3)
TRUST59 (4);

wopor by OPOR74
OPOR76 (5)
OPOR77 (6)
OPOR510 (7)
OPOR710 (8);

wloyal by LOYAL108 LOYAL109 LOYAL114;
wperf by PERF92 PERF93 PERF95 PERF96 PERF97;
wloyal on wperf;
wperf on wtrust wopor;
wtrust wopor on age gender BRBNK INTBNK;

%BETWEEN%
btrust by TRUST72
TRUST75 (1)
TRUST55 (2)
TRUST57 (3)
TRUST59 (4);
bopor by OPOR74
OPOR76 (5)
OPOR77 (6)
OPOR510 (7)
OPOR710 (8);

LV_OCO by ECO74
ECO710 (9)
ECO715 (10)
ECO54 (11)
ECO56 (12);

LV_NCOB by ESO72
ESO713 (13)
ESO79 (14);

XINTER | LV_OCO XWITH LV_NCOB;

btrust on LV_OCO LV_NCOB XINTER;
bopor on LV_OCO LV_NCOB XINTER;

OUTPUT: SAMPSTAT STANDARDIZED TECH1 TECH8;

Linda K. Muthen posted on Monday, November 28, 2005 - 4:20 pm

I don't see any error. Please send your input, data, output, and license number to support@statmodel.com.

Tom Munk posted on Friday, December 09, 2005 - 10:19 am

I am confused about when to use the WITHIN and BETWEEN options. I'll focus on WITHIN.

The version 3 User's Guide says "the WITHIN option is used with TYPE=TWOLEVEL and estimator=ML, MLR, or MLF to identify the variables in the data set that refer to individual-level (within) information."

If this means that its use should be restricted to variables with no between variance, then that restriction is violated in Example 9.1, where the x variable has an ICC of .16.

At the same time, I am confused by the following fact. In an unconditional model, my ICC is about .30. When I add some variables to the USEVARIABLES statement, it does not change. When I declare them to be WITHIN (whether they are actually used in the model or not), the DV's ICC declines precipitously -- to about .08. Its total variance does not change, just the between/within distribution.

What are the guidelines for appropriate use of WITHIN and BETWEEN?

Linda K. Muthen posted on Friday, December 09, 2005 - 11:04 am

In Example 9.1, the data are generated so that x is a within variable. So it would not have variability on the between level.

BETWEEN is used to identify variables that are measured at the cluster level. I don't think there is any ambiguity there.

WITHIN is used if you want to use a variable only on the WITHIN level. Saying it is WITHIN just means that we won't estimate parameters for it on the BETWEEN level. It does not mean that the variable has no between-level variability.

Tom Munk posted on Saturday, December 10, 2005 - 1:31 pm

In example 9.1, groupmean centering is used for x. This implies to me that there is some degree of between level variability. My analysis suggests its ICC is .16, not negligible.

I find it hard to understand the meaning of my model. How can I explain the fact that the decision not to estimate BETWEEN-level parameters for a certain set of IVs results in a dramatic shift in DV variance from BETWEEN to WITHIN?

Linda K. Muthen posted on Sunday, December 11, 2005 - 9:34 am

Example 9.1 uses grandmean centering. And the x variable is generated as a within variable.

When looking at ICC's, you should be looking at an unrestricted model. Use TYPE=BASIC without declaring any variables as within-only, even if they are measured on the individual level. These are the relevant ICC's. Any other ICC's you see are not.

Your problem is that the between-level parts of your individual-level variables are highly correlated with your between-level variables. I suggest doing two analyses -- one with the individual-level variables in which you model the between part of the individual-level variables and a second where your unit of analysis is cluster and you model your between-level variables.

Tom Munk posted on Friday, December 16, 2005 - 8:39 am

Linda,

I'm exploring example 9.1 using TYPE = BASIC and no model; in particular the question of "what percentage of the variance in y is between levels?" This is, of course, the ICC, or the ratio of the between-level y variance to the sum of y's between- and within-level variance. My question is why it changes when other variables are added or modified.

9.1i) When usevariable is y clus; the icc is .293.
9.1j) When x is added to the usevariables, the icc becomes .302.
9.1d,e) When w is added as a between variable, the icc becomes .304
9.1c) When x is declared within, the icc becomes .336
9.1g) When x is groupmeaned, the icc changes to .322.

bmuthen posted on Friday, December 16, 2005 - 9:09 am

This is not a problem, but if you use the MUML estimator I don't think you will experience this. The MUML estimator uses unbiased estimators of Sigma-within and Sigma-between (see Tech App for V2). The default MLR estimator that you are using gives consistent ML estimates. In the ML case I think you will see the same icc's under the variations you look at only if you have balanced data (all cluster sizes the same). In the general case, ML estimation draws on information from all the variables and so icc's will change a little bit depending on which variables are included.
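
To illustrate the idea of unbiased estimators of the within and between variance, the following Python sketch computes ANOVA-type variance components for a single variable and the resulting ICC. It is a univariate simplification under stated assumptions, not Mplus's MUML implementation; the function name and the toy data are hypothetical.

```python
def anova_variance_components(clusters):
    """Return (within, between, icc) for a list of clusters of observations.

    Univariate ANOVA-type estimator; an illustrative sketch, not Mplus code.
    """
    n_total = sum(len(c) for c in clusters)
    n_groups = len(clusters)
    grand_mean = sum(sum(c) for c in clusters) / n_total
    means = [sum(c) / len(c) for c in clusters]

    # Pooled within-cluster variance: SSW / (N - G)
    ss_within = sum((y - m) ** 2 for c, m in zip(clusters, means) for y in c)
    s_pw = ss_within / (n_total - n_groups)

    # Between mean square: sum of n_g * (mean_g - grand_mean)^2 / (G - 1)
    ss_between = sum(len(c) * (m - grand_mean) ** 2 for c, m in zip(clusters, means))
    s_b = ss_between / (n_groups - 1)

    # Scaling constant; equals the common cluster size when data are balanced
    scale = (n_total ** 2 - sum(len(c) ** 2 for c in clusters)) / (n_total * (n_groups - 1))

    between = (s_b - s_pw) / scale
    icc = between / (between + s_pw)
    return s_pw, between, icc


# Hypothetical balanced data: three clusters of size three
within, between, icc = anova_variance_components([[1, 2, 3], [4, 5, 6], [7, 8, 9]])
print(within, between, icc)
```

Because each variance component here is computed from one variable alone, adding other variables to the analysis cannot change the result, which is the contrast with full-information ML drawn above.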

Tom Munk posted on Monday, December 19, 2005 - 12:11 pm

Thanks, Linda.

I was able to confirm that MUML does stabilize the ICCs in the face of added variables, but also found that MUML is incompatible with TYPE=TWOLEVEL and WITHIN.

In the case of my data, with MLR, the ICCs change a lot when variables are declared WITHIN. (from .30 to .08).

I've implemented a solution that seems to work. Stick with the default MLR estimator. Group-center my IVs, effectively eliminating their between part. Declare them WITHIN. Use more accurate group-level variables as level-2 predictors. The collinearity problem is eliminated by the group-centering. The ICCs remain stable at about .30. As described in Raudenbush and Bryk (2002, p. 140), the contextual effect of these IVs is the difference of the level 2 effect and the level 1 effect for the same variable.
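
The group-centering step and the contextual-effect arithmetic described above can be sketched in Python as follows. This is a minimal illustration, not the poster's actual procedure: the function names, data, and slope values are all hypothetical, and cluster means are shown only as one possible level-2 predictor.

```python
from collections import defaultdict


def group_mean_center(cluster_ids, values):
    """Return (centered_values, cluster_means) for one predictor."""
    totals = defaultdict(float)
    counts = defaultdict(int)
    for cid, v in zip(cluster_ids, values):
        totals[cid] += v
        counts[cid] += 1
    means = {cid: totals[cid] / counts[cid] for cid in totals}
    centered = [v - means[cid] for cid, v in zip(cluster_ids, values)]
    return centered, means


def contextual_effect(between_slope, within_slope):
    """Contextual effect under group-mean centering: level-2 minus level-1 slope."""
    return between_slope - within_slope


# Hypothetical data: two clusters, three members each
ids = ["a", "a", "a", "b", "b", "b"]
x = [1.0, 2.0, 3.0, 4.0, 6.0, 8.0]
x_centered, x_means = group_mean_center(ids, x)
print(x_centered)  # deviations from each cluster's own mean
print(x_means)     # cluster means, one possible level-2 predictor

# Hypothetical slope estimates, for illustration only
print(contextual_effect(between_slope=0.9, within_slope=0.4))
```

After centering, the predictor sums to zero within every cluster, so it carries no between-level variation.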

Linda K. Muthen posted on Monday, December 19, 2005 - 1:17 pm

Great.

anonymous posted on Wednesday, January 11, 2006 - 9:29 am

Hello Linda, hello Bengt,

I wonder what Mplus does exactly when specifying a variable as within. Just to understand properly:
- The within-model is still estimated with the pooled within variance of the items, that are specified as within (not the total variance of these items)
- Regarding the between-model: Are these items simply dropped from the estimated sigma-between (so that the sigma between has other dimensions than sigma within) or are their between-variances/covariances restricted to zero (and therefore tested)?

Thanks for the clarification!

bmuthen posted on Wednesday, January 11, 2006 - 10:58 am

"Within" says that the variable has no between variation and is considered in its totality. So, regarding the first question, the total variation in the variable (the variable itself) is considered (in line with HLM for covariates). Regarding the second question, the between Sigma elements are restricted to zero for Within variables. Note, however, that for chi-square testing not only the H0 model, but also the H1 model imposes these restrictions.

Dan Feaster posted on Thursday, April 17, 2008 - 10:10 am

How do I override the default of making an X variable (i.e. independent variable or covariate) either within or between when using numerical integration? Tihomir and Bengt mention (in Webnote 11-Constructing Covariates in Multilevel Regression) that with numerical integration the RHS variables default to be either within or between to decrease the dimensionality of the integration problem. They conclude the section by mentioning that it is possible to specify latent variable covariates within the numerical integration estimation but it is not done by default. I have not seen anything in the manual to explain how. I have tried just entering it at both within and between but I get an error saying that the covariate must be declared within OR between in the presence of numerical integration.

Thank you!

Tihomir Asparouhov posted on Thursday, April 17, 2008 - 11:16 am

Hi

Below is the equivalent of Mplus User's Guide Example 9.1b with a categorical dependent variable and numerical integration. It shows how the X covariate is both between and within. The only difference with numerical integration is in the interpretation of the within-level regression U on X. On the within level the X variable is the actual observed value (uncentered); it is not the centered XW.

Tihomir

montecarlo:
names are u x;
generate = u(1);
categorical = u;
nobservations = 1000;
ncsizes = 1;
csizes = 100 (10);
nreps = 1;

ANALYSIS:
TYPE = TWOLEVEL;
algo = int;

model population:

%WITHIN%
x*1;
u ON x*1;

%BETWEEN%
u ON xb*1;
xb by x@1;
u*1; xb*1;

model:

%WITHIN%
x*1;
u ON x*1;

%BETWEEN%
u ON xb*1;
xb by x@1;
u*1; xb*1;

Janine Neuhaus posted on Tuesday, January 27, 2009 - 5:14 am

Dear Linda and Bengt,
just a short question: What can I do to determine a contextual effect of a construct, when within and between variables are not exactly the same?
I ran a MFA (multilevel factor analysis) and found different structure patterns on both levels (within: two factors, between: one factor). Any suggestions? I would really appreciate it.
Janine

Bengt O. Muthen posted on Tuesday, January 27, 2009 - 5:25 pm

So the single contextual construct is something different from the individual's 2 constructs. You can still see how big of an effect on the individual outcome the single contextual construct has. I assume here that you have something like

%Within%
fw1 BY...
fw2 BY...
y on fw1 fw2;

%Between%
fb BY...
y on fb;

where the between-level y is the random intercept in the regression of y on fw1 fw2.

You can standardize the slopes for fw1, fw2, fb to unit factor variances and unit total y variance and then compare their sizes. This would have to be done by hand using the Mplus estimates because Mplus does not give a standardization in terms of total y variance.

Janine Neuhaus posted on Wednesday, January 28, 2009 - 3:57 am

Dear Bengt,
thank you very much! You're right with my model specification. But could you please tell me how to standardize the slopes (Is there an example in the user's guide or somewhere else on this website)?
Thanks so much,
Janine

Bengt O. Muthen posted on Friday, January 30, 2009 - 10:47 am

I don't think we have an example. You ask for RESIDUALS in the OUTPUT and there you find the estimated variances for the dependent variables (y) on Within and Between. You add those up into the total variance and take the sqrt to get the y SDs. If you look at the UG, standardization in general is explained as dividing the coefficient by the y SD and multiplying by the x SD. Your x SD is the sqrt of the variances of the factors for each of the 2 levels. Given that

y = bb*fb+ b1*fw1+b2*fw2 ...

where the bb coefficient comes from Between and the b1, b2 coeff's come from Within, this tells you how to standardize the bb, b1, and b2 coefficients, thereby making them comparable.
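
The hand calculation described above can be sketched in Python. This is only an illustration of the arithmetic (coefficient times x SD divided by total y SD); the function name and all numeric values are hypothetical and would be replaced by the estimates from your own output.

```python
import math


def standardize_slope(b, x_variance, y_within_variance, y_between_variance):
    """Standardize b to unit x variance and unit total y variance."""
    sd_x = math.sqrt(x_variance)
    # Total y variance is the sum of the Within and Between variances
    sd_y_total = math.sqrt(y_within_variance + y_between_variance)
    return b * sd_x / sd_y_total


# Hypothetical estimates, not taken from any real output
y_var_within, y_var_between = 0.75, 0.25  # e.g. read from the RESIDUALS output
bb_std = standardize_slope(0.50, 0.16, y_var_within, y_var_between)  # between factor fb
b1_std = standardize_slope(0.30, 1.00, y_var_within, y_var_between)  # within factor fw1
print(bb_std, b1_std)
```

Using the same total y SD in the denominator for every slope is what puts the Within and Between coefficients on a common scale.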

Rene Paulson posted on Wednesday, March 31, 2010 - 3:25 pm

Hello Drs Muthen,

I am running a multilevel CFA on the KSB factor before running the measurement model and SEMs. I did not reach adequate fit, so my professor suggested that I make the factor loading of (item x) within equal to the factor loading of (item x) between. Do you have suggestions for how to accomplish this in MPLUS?

TITLE: Multi-level CFA - KSB
DATA: FILE IS h:\groupdata.dat;
FORMAT IS FREE;
LISTWISE = Off;
VARIABLE: Names are id sg epia10 epib6 epia9 epia16 epib5 epib13 epia8 epia12 epia15
ks1 ks2 ks3 ks5 ks9 ks11 gl1 gl2 gl3 gl4 gl5 gl6 pstm1 pstm3 pstm5
tmsspec tmscred tmscoor gl1grp gl2grp gl3grp gl4grp gl5grp gl6grp
tmsspecg tmscredg tmscoorg ks1grp ks2grp ks3grp ks5grp ks9grp ks11grp;
usev sg ks1 ks2 ks3 ks5 ks9 ks11;
MISSING are all(-999999);
Cluster is sg;
ANALYSIS: TYPE = twolevel;
ITERATIONS is 5000;
estimator = MLF;
miterations = 5000;
H1ITERATIONS=50000;
STARTS = 100 10;

MODEL: %WITHIN%
KSBw by ks1 ks2 ks3 ks5 ks9 ks11;
%BETWEEN%
KSBb by ks1 ks2 ks3 ks5 ks9 ks11;
OUTPUT: Mod Stand Sampstat Res Tech1 Tech8;

Linda K. Muthen posted on Wednesday, March 31, 2010 - 5:08 pm

To hold the factor loading of ks2 equal across levels, do the following where (1) is an equality label:

MODEL: %WITHIN%
KSBw by ks1
ks2 (1)
ks3 ks5 ks9 ks11;
%BETWEEN%
KSBb by ks1
ks2 (1)
ks3 ks5 ks9 ks11;

Rene Paulson posted on Wednesday, March 31, 2010 - 10:36 pm

Thanks Linda. This worked. Rene

Rene Paulson posted on Saturday, April 03, 2010 - 12:00 pm

Hi Linda,

I have completed my multilevel SEM (with an interaction) and I want to standardize my raw coefficients for interpretation. I understand the formula to compute this is STDxy(b) = b*(SDx/SDy). Based on the discussion threads, I need to use tech4 to get the variances of the measures, and that tech4 output doesn't run in the random mode. I went back to my measurement model and added tech4, and received the following.

TECHNICAL 4 OUTPUT

ESTIMATES DERIVED FROM THE MODEL

ESTIMATED MEANS FOR THE LATENT VARIABLES
EPI TMSAFE KS TM GL
ESTIMATED MEANS FOR THE LATENT VARIABLES

ESTIMATED COVARIANCE MATRIX FOR THE LATENT VARIABLES

ESTIMATED COVARIANCE MATRIX FOR THE LATENT VARIABLES

ESTIMATED CORRELATION MATRIX FOR THE LATENT VARIABLES
ESTIMATED CORRELATION MATRIX FOR THE LATENT VARIABLES

Can you tell me if there is another command I should use to get the variances of the measurement items as well, or is it acceptable for me to just run them in another software like SAS or SPSS?

Thanks
Rene

Rene Paulson posted on Saturday, April 03, 2010 - 1:21 pm

Hi Linda,

I thought I would also add something. I have variances for the latent constructs(beneath intercepts);

1. just confirming that I can use those in the formula for standardized coefficients
2. for models with the interaction, I want to calculate the standardized coefficient for the paths including the moderator. How could I get the variance for that item to include in the formula?

Thanks
Rene

Linda K. Muthen posted on Sunday, April 04, 2010 - 9:51 am

Please send your full output and license number to support@statmodel.com. I need to see why you don't get TECH4.

Helen Zhao posted on Monday, December 19, 2011 - 11:02 pm

Hi Linda,

I am new to multilevel modeling, and a very basic question has confused me:

In the user guide 6.0, it says that "The BETWEEN option is used to identify the variables in the data set that are measured on the cluster level and modeled only in the between model".

I have a construct called supervisor-rated performance. That is, there are a number of groups and each group has one supervisor and a number of employees. The supervisor rated each employee differently. I wonder whether I should count such a variable as "measured on the cluster level"?

Thanks!

Linda K. Muthen posted on Tuesday, December 20, 2011 - 5:59 am

The way to know if a variable is a between variable is that each person in a cluster gets the same value on that variable. It does not sound like your variable is a between variable. You may find the Topics 7 and 8 course handouts and videos on the website helpful.
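
The rule above can be checked mechanically on a data set. The Python sketch below is a minimal illustration; the function name and the toy data are hypothetical.

```python
def is_between_variable(cluster_ids, values):
    """True if every member of a cluster has the same value on the variable."""
    seen = {}
    for cid, v in zip(cluster_ids, values):
        if cid in seen and seen[cid] != v:
            return False
        seen.setdefault(cid, v)
    return True


# Hypothetical data: two groups of three employees
groups = [1, 1, 1, 2, 2, 2]
group_size = [3, 3, 3, 3, 3, 3]         # same value for everyone in a group
rated_performance = [4, 5, 3, 2, 4, 4]  # supervisor rates each employee differently

print(is_between_variable(groups, group_size))         # True
print(is_between_variable(groups, rated_performance))  # False
```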

Jeanine Gruetter posted on Friday, May 06, 2016 - 7:16 am

Hi Linda or Bengt,

I am testing an autoregressive model with latent variables (two time points) in a multilevel framework.

I have a two-level variable (classroom norms) which I define at the individual level (in the %within% section) and at the classroom level in the %between% section as a latent variable.

The outcome is also a latent variable at the individual level (individual empathy)

When I include the prediction of classroom norms (level 2) on individual empathy (level 1) in the between section, I receive an error message that I cannot enter a within variable in the between section.
The problem is that - as far as I understand - I have to define that latent variable either in the within or the between section first, before testing the hypothesized model.

Can you give me some suggestions on how to solve this problem?

And is it possible and/or necessary to center the latent variable that is in the model once as a within variable (individual perception of norms) and once as a between variable (classroom aggregate)?

Your help is much appreciated! :-)

Bengt O. Muthen posted on Monday, May 09, 2016 - 2:11 pm

I assume that it is the empathy variable that you get the error message about. Don't put that variable on the Within list. Then you will have a latent variable decomposition with a between-level version of it to use on Between.

The centering is automatic with a latent variable decomposition - see page 261 of the UG version that is posted on our website.

If this doesn't help, send output to Support along with your license number.

Avril Kaplan posted on Wednesday, January 31, 2018 - 8:51 am

Dear Professors,

Do you know of a resource that presents the equations used to generate the within and between level variance/covariance matrices of the MPLUS output when running a two-level analysis with a WLSMV estimator?

Thanks in advance.

Bengt O. Muthen posted on Wednesday, January 31, 2018 - 4:26 pm

See the technical appendix on our website at

http://www.statmodel.com/techappen.shtml

Two-Level Weighted Least Squares Estimation. Proceedings of the Joint Statistical Meeting, August 2007, Biometrics Section