I don't see any error. Please send your input, data, output, and license number to firstname.lastname@example.org.
Tom Munk posted on Friday, December 09, 2005 - 10:19 am
I am confused about when to use the WITHIN and BETWEEN options. I'll focus on WITHIN.
The version 3 User's Guide says "the WITHIN option is used with TYPE=TWOLEVEL and estimator=ML, MLR, or MLF to identify the variables in the data set that refer to individual-level (within) information."
If this means that its use should be restricted to variables with no between variance, then the axiom was violated in Example 9.1. The x variable has an ICC of .16.
At the same time, I am confused by the following fact. In an unconditional model, my ICC is about .30. When I add some variables to the USEVARIABLES statement, it does not change. When I declare them to be WITHIN (whether they are actually used in the model or not), the DV's ICC declines precipitously -- to about .08. Its total variance does not change, just the between/within distribution.
What are the guidelines for appropriate use of WITHIN and BETWEEN?
In Example 9.1, the data are generated so that x is a within variable. So it would not have variability on the between level.
BETWEEN is used to identify variables that are measured at the cluster level. I don't think there is any ambiguity there.
WITHIN is used if you want to use a variable only on the WITHIN level. Saying it is WITHIN just means that we won't estimate parameters for it on the BETWEEN level. It does not mean that the variable has no between-level variability.
Tom Munk posted on Saturday, December 10, 2005 - 1:31 pm
In example 9.1, groupmean centering is used for x. This implies to me that there is some degree of between level variability. My analysis suggests its ICC is .16, not negligible.
I find it hard to understand the meaning of my model. How can I explain the fact that the decision not to estimate BETWEEN-level parameters for a certain set of IVs results in a dramatic shift in DV variance from BETWEEN to WITHIN?
Example 9.1 uses grandmean centering. And the x variable is generated as a within variable.
When looking at ICC's, you should be looking at an unrestricted model. Use TYPE=BASIC with no specification of variables being within only if they are measured on the individual level. These are the relevant ICC's. Any other ICC's you see are not.
Your problem is that the between-level parts of your individual-level variables are highly correlated with your between-level variables. I suggest doing two analyses -- one with the individual-level variables in which you model the between part of the individual-level variables and a second where your unit of analysis is cluster and you model your between-level variables.
Tom Munk posted on Friday, December 16, 2005 - 8:39 am
I'm exploring example 9.1 using TYPE = BASIC and no model; in particular the question of "what percentage of the variance in y is between levels?" This is, of course, the ICC, or the ratio of the between-level y variance to the sum of y's between- and within-level variance. My question is why it changes when other variables are added or modified.
9.1i) When usevariable is y clus; the icc is .293.
9.1j) When x is added to the usevariables, the icc becomes .302.
9.1d,e) When w is added as a between variable, the icc becomes .304.
9.1c) When x is declared within, the icc becomes .336.
9.1g) When x is groupmeaned, the icc changes to .322.
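The ICC definition used in this post (between-level variance divided by the sum of between- and within-level variance) can be sketched in a few lines of Python. This is only an illustration of the arithmetic: the function name icc and the total variance of 1.0 are assumptions, and the variance components would in practice be read from the Mplus output rather than computed here.

```python
def icc(between_var, within_var):
    """Intraclass correlation: between variance over total variance."""
    total = between_var + within_var
    if total <= 0:
        raise ValueError("total variance must be positive")
    return between_var / total


# Hypothetical decomposition of a total variance of 1.0 (assumed value).
# The total stays the same; only the between/within split changes,
# which is the pattern described earlier in the thread (.30 versus .08).
total_var = 1.0
for between in (0.30, 0.08):
    within = total_var - between
    print(f"between={between:.2f} within={within:.2f} icc={icc(between, within):.2f}")
```

The point of the sketch is that the ICC can move a lot without any change in the total variance of y, because it depends only on how that total is split between the two levels.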
bmuthen posted on Friday, December 16, 2005 - 9:09 am
This is not a problem, but if you use the MUML estimator I don't think you will experience this. The MUML estimator uses unbiased estimators of Sigma-within and Sigma-between (see Tech App for V2). The default MLR estimator that you are using gives consistent ML estimates. In the ML case I think you will see the same icc's under the variations you look at only if you have balanced data (all cluster sizes the same). In the general case, ML estimation draws on information from all the variables and so icc's will change a little bit depending on which variables are included.
Tom Munk posted on Monday, December 19, 2005 - 12:11 pm
I was able to confirm that MUML does stabilize the ICCs in the face of added variables, but also found that MUML is incompatible with TYPE=TWOLEVEL and WITHIN.
In the case of my data, with MLR, the ICCs change a lot when variables are declared WITHIN. (from .30 to .08).
I've implemented a solution that seems to work: stick with the default MLR estimator, group-center my IVs (effectively eliminating their between part), declare them WITHIN, and use more accurate group-level variables as level-2 predictors. The collinearity problem is eliminated by the group-centering. The ICCs remain stable at about .30. As described in Raudenbush and Bryk (2002, p. 140), the contextual effect of these IVs is the difference of the level 2 effect and the level 1 effect for the same variable.
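As a rough illustration of the two computations in this post, group-mean centering and the contextual effect as a difference of slopes, here is a minimal Python sketch. The function names, the toy data, and the slope values are all hypothetical; in an actual analysis the centering would be done in Mplus or during data preparation, and the slopes would come from the fitted model.

```python
from collections import defaultdict


def group_mean_center(values, clusters):
    """Subtract each cluster's mean from its members' values.

    Returns the centered values and a dict of cluster means.
    """
    sums = defaultdict(float)
    counts = defaultdict(int)
    for value, cluster in zip(values, clusters):
        sums[cluster] += value
        counts[cluster] += 1
    means = {cluster: sums[cluster] / counts[cluster] for cluster in sums}
    centered = [value - means[cluster] for value, cluster in zip(values, clusters)]
    return centered, means


def contextual_effect(level2_slope, level1_slope):
    """Contextual effect = level-2 slope minus level-1 slope (same variable)."""
    return level2_slope - level1_slope


# Toy data (assumed): two clusters with two members each.
x = [1.0, 3.0, 10.0, 14.0]
cluster_id = ["a", "a", "b", "b"]
x_centered, x_means = group_mean_center(x, cluster_id)
print(x_centered, x_means)

# Hypothetical slopes for the group mean (level 2) and the centered x (level 1).
print(contextual_effect(level2_slope=0.9, level1_slope=0.4))
```

After centering, every cluster mean of the centered variable is zero, so the variable carries no between-level variation and cannot be collinear with the level-2 predictors.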
anonymous posted on Wednesday, January 11, 2006 - 9:29 am
Hello Linda, hello Bengt,
I wonder what Mplus does exactly when specifying a variable as within. Just to understand properly:
- The within-model is still estimated with the pooled within variance of the items that are specified as within (not the total variance of these items)?
- Regarding the between-model: Are these items simply dropped from the estimated sigma-between (so that the sigma between has other dimensions than sigma within) or are their between-variances/covariances restricted to zero (and therefore tested)?
Thanks for the clarification!
bmuthen posted on Wednesday, January 11, 2006 - 10:58 am
"Within" says that the variable has no between variation and is considered in its totality. So, regarding the first question, the total variation in the variable (the variable itself) is considered (in line with HLM for covariates). Regarding the second question, the between Sigma elements are restricted to zero for Within variables. Note, however, that for chi-square testing not only the H0 model, but also the H1 model imposes these restrictions.
Dan Feaster posted on Thursday, April 17, 2008 - 10:10 am
How do I override the default of making an X variable (i.e. independent variable or covariate) either within or between when using numerical integration? Tihomir and Bengt mention (in Webnote 11-Constructing Covariates in Multilevel Regression) that with numerical integration the RHS variables default to be either within or between to decrease the dimensionality of the integration problem. They conclude the section by mentioning that it is possible to specify latent variable covariates within the numerical integration estimation but it is not done by default. I have not seen anything in the manual to explain how. I have tried just entering it at both within and between but I get an error saying that the covariate must be declared within OR between in the presence of numerical integration.
Below is the equivalent of Mplus User's Guide Example 9.1b with categorical dependent variable and numerical integration. It shows how the X covariate is both between and within. The only difference with numerical integration is in the interpretation of the within level regression U on X. On the within level the X variable is the actual observed value (uncentered); it is not the centered XW.
Dear Linda and Bengt, just a short question: What can I do to determine a contextual effect of a construct, when within and between variables are not exactly the same? I ran a MFA (multilevel factor analysis) and found different structure patterns on both levels (within: two factors, between: one factor). Any suggestions? I would really appreciate it. Janine
So the single contextual construct is something different from the individual's 2 constructs. You can still see how big of an effect on the individual outcome the single contextual construct has. I assume here that you have something like
%Within%
fw1 BY...
fw2 BY...
y on fw1 fw2;
%Between%
fb BY...
y on fb;
where the between-level y is the random intercept in the regression of y on fw1 fw2.
You can standardize the slopes for fw1, fw2, fb to unit factor variances and unit total y variance and then compare their sizes. This would have to be done by hand using the Mplus estimates because Mplus does not give a standardization in terms of total y variance.
Dear Bengt, thank you very much! You're right with my model specification. But could you please tell me how to standardize the slopes (Is there an example in the user's guide or somewhere else on this website)? Thanks so much, Janine
I don't think we have an example. You ask for RESIDUALS in the OUTPUT and there you find the estimated variances for the dependent variables (y) on Within and Between. You add those up into the total variance and take the sqrt to get the y SDs. If you look at the UG, standardization in general is explained as dividing the coefficient by the y SD and multiplying by the x SD. Your x SD is the sqrt of the variances of the factors for each of the 2 levels. Given that
y = bb*fb+ b1*fw1+b2*fw2 ...
where the bb coefficient comes from Between and the b1, b2 coeff's come from Within, this tells you how to standardize the bb, b1, and b2 coefficients, thereby making them comparable.
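The hand standardization described above can be written out as a short Python sketch: each slope is multiplied by the SD of its factor and divided by the SD of y, where the y variance is the sum of its Within and Between variances. The function name and all numeric values below are hypothetical placeholders; the real inputs would be the estimates read from the Mplus output.

```python
import math


def standardize_slope(b, factor_var, y_within_var, y_between_var):
    """Standardize a slope to unit factor variance and unit total y variance.

    b              raw slope from the Within or Between part of the model
    factor_var     variance of the factor the slope belongs to
    y_within_var   estimated Within variance of y
    y_between_var  estimated Between variance of y
    """
    sd_x = math.sqrt(factor_var)
    sd_y = math.sqrt(y_within_var + y_between_var)  # SD of total y variance
    return b * sd_x / sd_y


# Hypothetical estimates (assumed values, not from any real output).
y_w, y_b = 3.0, 1.0
raw = {
    "b1 (fw1)": (0.50, 4.0),  # (raw slope, factor variance)
    "b2 (fw2)": (0.30, 1.0),
    "bb (fb)": (0.80, 0.25),
}
for name, (b, fvar) in raw.items():
    print(name, round(standardize_slope(b, fvar, y_w, y_b), 3))
```

Because all three slopes are divided by the same total y SD, the resulting values are on a common scale and can be compared across levels.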
I am running a multilevel CFA on the KSB factor before running the measurement model and SEMs. I did not reach adequate fit, so my professor suggested that I make the factor loading of (item x) within equal to the factor loading of (item x) between. Do you have suggestions for how to accomplish this in MPLUS?
TITLE: Multi-level CFA - KSB
DATA: FILE IS h:\groupdata.dat;
  FORMAT IS FREE;
  LISTWISE = Off;
VARIABLE: Names are id sg epia10 epib6 epia9 epia16 epib5 epib13
    epia8 epia12 epia15 ks1 ks2 ks3 ks5 ks9 ks11
    gl1 gl2 gl3 gl4 gl5 gl6 pstm1 pstm3 pstm5
    tmsspec tmscred tmscoor
    gl1grp gl2grp gl3grp gl4grp gl5grp gl6grp
    tmsspecg tmscredg tmscoorg
    ks1grp ks2grp ks3grp ks5grp ks9grp ks11grp;
  usev sg ks1 ks2 ks3 ks5 ks9 ks11;
  MISSING are all(-999999);
  Cluster is sg;
ANALYSIS: TYPE = twolevel;
  ITERATIONS is 5000;
  estimator = MLF;
  miterations = 5000;
  H1ITERATIONS=50000;
  STARTS = 100 10;
MODEL:
  %WITHIN%
  KSBw by ks1 ks2 ks3 ks5 ks9 ks11;
  %BETWEEN%
  KSBb by ks1 ks2 ks3 ks5 ks9 ks11;
OUTPUT: Mod Stand Sampstat Res Tech1 Tech8;
I have completed my multilevel SEM (with an interaction) and I want to standardize my raw coefficients for interpretation. I understand the formula to compute this is STDxy(b) = b*(SDx/SDy). Based on the discussion threads, I need to use tech4 to get the variances of the measures, and that tech4 output doesn't run in the random mode. I went back to my measurement model and added tech4, and received the following.
TECHNICAL 4 OUTPUT
ESTIMATES DERIVED FROM THE MODEL
ESTIMATED MEANS FOR THE LATENT VARIABLES: EPI TMSAFE KS TM GL
ESTIMATED COVARIANCE MATRIX FOR THE LATENT VARIABLES
ESTIMATED CORRELATION MATRIX FOR THE LATENT VARIABLES
Can you tell me if there is another command I should use to get the variances of the measurement items as well, or is it acceptable for me to just run them in another software like SAS or SPSS?
I thought I would also add something. I have variances for the latent constructs (beneath intercepts):
1. Just confirming that I can use those in the formula for standardized coefficients.
2. For models with the interaction, I want to calculate the standardized coefficient for the paths including the moderator. How could I get the variance for that item to include in the formula?
Please send your full output and license number to email@example.com. I need to see why you don't get TECH4.
Helen Zhao posted on Monday, December 19, 2011 - 11:02 pm
I am new to multi-level modeling, and there is a very basic question that confuses me:
In the user guide 6.0, it says that "The BETWEEN option is used to identify the variables in the data set that are measured on the cluster level and modeled only in the between model".
I have a construct called supervisor-rated performance. That is, there are a number of groups and each group has one supervisor and a number of employees. The supervisor rated each employee differently. I wonder whether I should count such a variable as "measured on the cluster level"?
The way to know if a variable is a between variable is that each person in a cluster gets the same value on that variable. It does not sound like your variable is a between variable. You may find the Topics 7 and 8 course handouts and videos on the website helpful.
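The rule stated here (a between variable takes the same value for every person in a cluster) is easy to check on a data set before declaring anything BETWEEN. The following is a minimal Python sketch under assumed names and toy data; is_between_variable is a hypothetical helper, not an Mplus feature.

```python
def is_between_variable(values, clusters):
    """Return True if every member of each cluster has the same value."""
    first_seen = {}
    for value, cluster in zip(values, clusters):
        if cluster in first_seen:
            if first_seen[cluster] != value:
                return False  # values differ within this cluster
        else:
            first_seen[cluster] = value
    return True


# Toy data (assumed): two work groups with three employees each.
group = [1, 1, 1, 2, 2, 2]
supervisor_rating = [4, 2, 5, 3, 3, 1]  # differs across employees in a group
group_size = [3, 3, 3, 3, 3, 3]         # constant within each group

print(is_between_variable(supervisor_rating, group))
print(is_between_variable(group_size, group))
```

In this toy example the supervisor ratings vary within each group, so they fail the check, which matches the answer given above; a quantity such as group size passes it.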
I am testing an autoregressive model with latent variables (two time points) in a multilevel framework.
I have a two-level variable (classroom norms), which I define as a latent variable at the individual level (in the %within% section) and at the classroom level (in the %between% section).
The outcome is also a latent variable at the individual level (individual empathy).
When I include the prediction of classroom norms (level 2) on individual empathy (level 1) in the between section, I receive an error message that I cannot enter a within variable in the between section. The problem is that - as far as I understand - I have to define that latent variable either in the within or the between section first, before testing the hypothesized model.
Can you give me some suggestions on how to solve this problem?
And is it possible and/or necessary to center the latent variable that is in the model once as a within variable (individual perception of norms) and once as a between variable (classroom aggregate)?
I assume that it is the empathy variable that you get the error message about. Don't put that variable on the Within list. Then you will have a latent variable decomposition with a between-level version of it to use on Between.
The centering is automatic with a latent variable decomposition - see page 261 of the UG version that is posted on our website.
If this doesn't help, send output to Support along with your license number.