Anonymous posted on Monday, December 23, 2002 - 2:21 pm
I think that this is a simple problem - so simple that I can't find an example of it.
I have scores for children, at different ages, and want to estimate the effect of age on the scores. The model statements look like this:
%within%
s1 | score on age;
%between%
distance on sex;
s1 on sex;
And Mplus says:
*** WARNING in Model command
Variable will be assumed to be a y-variable on the BETWEEN level: AGE
*** ERROR in Model command
Variable is a y-variable on the BETWEEN level but is an x-variable on the WITHIN level: AGE
I assume that you have not mentioned age on the WITHIN list of the VARIABLE command. If you do not specify that it is only a WITHIN variable, Mplus assumes that it is also used on the BETWEEN level. And because you don't use it, Mplus warns you that it is being treated as a y variable on the BETWEEN level. You can avoid this by adding WITHIN = age to the VARIABLE command. Also, a variable cannot be used as an x on one level and a y on another. But once again, specifying age to be a WITHIN variable should solve your problem.
You will find a similar example on page 4 of the Addendum to the Mplus User's Guide which can be found at www.statmodel.com under Product Support.
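The fix described in this reply can be sketched as follows. This is only an illustration, not the poster's actual input: the file name and the variable names id, score, age and sex are assumptions, the variable `distance` from the post is left out because its role is not described, and TYPE = TWOLEVEL RANDOM is the spelling used in later Mplus versions for random slopes.

```mplus
TITLE:     Random slope of score on age (illustrative sketch);
DATA:      FILE IS scores.dat;          ! hypothetical file name
VARIABLE:  NAMES ARE id score age sex;  ! hypothetical variable names
           CLUSTER = id;                ! repeated scores nested in children
           WITHIN = age;                ! age is modeled on the within level only
           BETWEEN = sex;               ! sex is modeled on the between level only
ANALYSIS:  TYPE = TWOLEVEL RANDOM;
MODEL:     %WITHIN%
           s1 | score ON age;           ! random slope s1
           %BETWEEN%
           score s1 ON sex;             ! random intercept and slope regressed on sex
```

Because age appears on the WITHIN list, Mplus no longer treats it as a between-level y-variable, so neither the warning nor the error should be triggered.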
Anonymous posted on Tuesday, July 22, 2003 - 11:26 am
I've just made my first stab at a multilevel model in Mplus and am encountering the same problem as the poster from 12/23/2002 above.
My (abbreviated) Mplus code (following the examples in the 2.13 Users Manual) is:
. . .
BETWEEN = x1 x2;
WITHIN = r1 r2;
MODEL:
%within%
s1 | y on x1;
s2 | y on x2;
%between%
s1 s2 y on r1 r2;
. . .
Even though I've specified the level-2 covariates r1 and r2 on the BETWEEN command, Mplus produces warnings indicating that both r1 and r2 will be used as y-variables. Why is this?
I'm encountering other difficulties and have a few additional questions as well:
1. I'm trying to follow the procedure outlined by Bengt in his 1994 SMR piece (although I want to estimate a multilevel SEM, not a multilevel FA). When I request that Mplus provide me with the SIGB matrix (either correlation or covariance), Mplus produces the requested file (i.e., it shows up in my c:\Mplus directory and the Mplus output file echoes that it has been produced), but when I open the file itself I find it is empty. Have I done something incorrect?
2. Is it the case that for the above script Mplus assumes that the level-1 coefficients s1 and s2 are uncorrelated unless I specifically include a command "s1 with s2*.3", etc.? Isn't it more appropriate to assume that s1, s2, ... sN are always correlated?
3. Does the Mplus multilevel SEM model not provide an estimate of the level-1 intercept (B0) coefficient, and does it not allow this coefficient to have a hierarchical structure? (Or is this what one is in effect doing by including y in the BETWEEN model statement?) My Mplus output provides no estimate of the mean of y.
4. The above model produces a between covariance matrix that is not positive definite. Mplus suggests that I set the variance of one of the slope terms to zero or specify the term as a within variable. Is setting the variance of a slope to zero the same as saying that the slope is estimated without error? Does one have to formally specify slopes as WITHIN variables?
It would be best if you sent your complete output to firstname.lastname@example.org so that we can see your full model and analysis type and the full text of the error message.
Also, please download Version 2.14 from www.statmodel.com under Product Support. It has a fix to the sigma b matrix file being empty.
Anonymous posted on Sunday, May 16, 2004 - 8:13 pm
I would like help with a problem of partial overlap between a level-2 predictor and the outcome when analyzing a moderating effect. I intend to consider the model:
Level 1: Yij = b0j + b1j(Xij) + eij
Level 2: b0j = r00
         b1j = r10 + r11(Wj) + u1j
The outcome variable Yij is an individual characteristic, such as social competence, whereas the level-2 variable Wj is a composite group variable created from several individual variables (such as academic performance, leadership, peer acceptance, and social competence), so it also includes social competence. A multilevel confirmatory factor analysis indicated that this way of composing the level-2 variable is reasonable. Now I want to know:
(1) If I only consider the effect of the level-2 variable on the level-1 random slope, is the overlap between predictor and outcome a serious problem? My reasoning is that I am looking at the level-2 influence on the slope, not the intercept, and the slope is the association between two variables, which is a concept distinct from the level-2 variable. Am I right?
(2) If I also consider the effect of the level-2 variable on the random intercept, what should I do?
I have a problem I do not know how to solve at the moment.
I am doing multilevel modeling (repeated measures design). At the between level I look at variance in hostility scores between individuals, and at the within level I examine variance in hostility scores across three different relationships (friends, enemies, neutrals) within individuals. At the between level I have also found that low self-esteem is related to higher overall hostility. But I would like to know whether low self-esteem is especially (more strongly) related to inferring hostility in a certain type of relationship (e.g. friends). How can I look at this?
Input is as follows:
TITLE: FAIL;
DATA: FILE IS mplusi jaoks.dat;
VARIABLE: NAMES ARE ID PRO1 REA1 PRO2 REA2 PRO3 REA3 GENDER SELF RS
EXTERN INTERN ADAPTK VAEN SUMMA VEAD HOSTIL SOB VAENL TUTTAV VEAD2 VAEN2;
!sob, vaenl, tuttav - these variables represent relationship types (dummy-coded)
USEOBSERVATIONS = GENDER EQ 1;
USEVARIABLES ARE SELF SOB HOSTIL;
!SELF=SELF-ESTEEM
!SOB=FRIENDSHIP (DUMMY-CODED)
!HOSTIL=HOSTILITY SCORE
CLUSTER = ID;
WITHIN IS SOB;
BETWEEN IS SELF;
ANALYSIS: TYPE = TWOLEVEL;
ESTIMATOR = MLR;
MODEL:
%BETWEEN%
HOSTIL ON SELF;
%WITHIN%
HOSTIL ON SOB; !AT THE MOMENT: HOSTILITY TOWARDS FRIENDS (I.E. FRIENDSHIP)
OUTPUT: SAMPSTAT STANDARDIZED RES MODINDICES (0.00);
You mention that you have repeated measures but I don't see that in your MODEL command. Where is time? I think you want to see if there is an interaction between SELF and SOB. You can create an interaction variable using DEFINE by multiplying the two variables. You can use that variable as a covariate to capture the interaction. However, SELF is a BETWEEN variable and SOB is a WITHIN variable. Is this really the case?
Concerning repeated measures design, I did not measure anything over time. In other words, for me, three time points are three relationship types. Yes, SELF is at the between level, and SOB at the within level. I formed the interaction term between SELF and SOB. At the between level I want to see if children with lower self-esteem infer more hostility across all the relationship types, as compared to children with higher self-esteem. At the within level I want to test if children with low self-esteem infer more hostility from friends than from enemies or neutral acquaintances.
TITLE: FAIL;
DATA: FILE IS mplusi jaoks.dat;
VARIABLE: NAMES ARE ID PRO1 REA1 PRO2 REA2 PRO3 REA3 GENDER SELF RS
EXTERN INTERN ADAPTK VAEN SUMMA VEAD HOSTIL SOB VAENL TUTTAV VEAD2 VAEN2;
USEOBSERVATIONS = GENDER EQ 1;
USEVARIABLES HOSTIL SELF SOB INT;
!SELF=SELF-ESTEEM
!SOB=FRIENDSHIP (DUMMY-CODED)
!HOSTIL=HOSTILITY SCORE
!INT=INTERACTION BETWEEN SOB AND SELF
CLUSTER = ID;
WITHIN IS SOB INT;
BETWEEN IS SELF;
DEFINE: INT = SOB*SELF;
ANALYSIS: TYPE = TWOLEVEL;
ESTIMATOR = MLR;
MODEL:
%BETWEEN%
HOSTIL ON SELF;
%WITHIN%
HOSTIL ON SOB INT;
OUTPUT: SAMPSTAT STANDARDIZED RES MODINDICES (0.00);
The result showed that the interaction term between SELF and SOB predicted hostility (standardized path = -.44). At the same time, the path from SOB to HOSTIL (hostility score) disappeared. Could I interpret the result to mean that children with low self-esteem have higher hostility scores in the friendship situation than in the other two situations?
bmuthen posted on Thursday, June 03, 2004 - 7:46 am
You create an interaction variable as SOB*SELF in Define. Since these two variables are on different levels, it seems like you instead want to work with a random slope in addition to the random intercept you have for HOSTIL. This results in a "cross-level interaction" in multilevel modeling terms (see HLM literature) - the random slope modeling results in a regression of HOSTIL on the product of SOB and SELF, but you get the correct standard errors. So you can delete your Define statement and instead have (with type = random twolevel)
Your output shows a zero residual variance for s in the regression of s ON self. This causes a perfect negative correlation because the estimated regression says that s is a deterministic function of self. I tried the analysis without regressing s on self to see if s has significant variation. It does not. This means that there is no cross-level interaction.
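The random-slope setup discussed in these replies might look roughly like the fragment below. It is a reconstruction under stated assumptions, not the input that was originally posted: the slope name `s` is taken from the reply, TYPE = TWOLEVEL RANDOM is the later spelling of "type = random twolevel", and NAMES, USEOBSERVATIONS and CLUSTER are assumed to stay as in the poster's input.

```mplus
! Fragment only: NAMES, USEOBSERVATIONS and CLUSTER as in the poster's input
VARIABLE:  USEVARIABLES ARE HOSTIL SELF SOB;  ! no INT variable, no DEFINE
           WITHIN IS SOB;
           BETWEEN IS SELF;
ANALYSIS:  TYPE = TWOLEVEL RANDOM;
           ESTIMATOR = MLR;
MODEL:     %WITHIN%
           s | HOSTIL ON SOB;     ! random slope of hostility on friendship
           %BETWEEN%
           HOSTIL s ON SELF;      ! s ON SELF carries the cross-level interaction
           ! To check only whether s varies across children, one could
           ! replace the line above with:
           ! HOSTIL ON SELF;
           ! s;
```

If the variance of s in the second variant is not significant, there is little slope variation for SELF to explain, which is the situation the reply describes.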
Anonymous posted on Wednesday, June 09, 2004 - 5:33 am
I have multilevel data with four different situations nested within individuals. The program gives me a negative intraclass correlation for one variable, which is impossible I think? Also, trying to specify a two-level model including this variable, I get error messages about the matrix not being positive definite. How could I find out what is wrong?
The negative intraclass correlation is caused by a negative between level variance. If you do a TYPE = TWOLEVEL BASIC, you can see where the negative variance is and modify your model accordingly. This negative variance is what makes your matrix not positive definite.
Let me expand the previous answer. The negative variance is most likely caused because the variable has zero between-level variance. This variable should not be included in the between part of the model.
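A sketch of the diagnostic step suggested above, under the assumption of a file with a person identifier and three measured variables (the file name and all variable names are hypothetical):

```mplus
TITLE:     Inspect within and between sample statistics (illustrative sketch);
DATA:      FILE IS situations.dat;   ! hypothetical file name
VARIABLE:  NAMES ARE id y1 y2 y3;    ! hypothetical variable names
           CLUSTER = id;             ! situations nested within individuals
ANALYSIS:  TYPE = TWOLEVEL BASIC;    ! sample statistics only, no MODEL command
```

If the output shows a negative or near-zero between-level variance for, say, y2, one option consistent with the replies is to add `WITHIN = y2;` to the VARIABLE command in the subsequent TYPE = TWOLEVEL run so that y2 is not modeled on the between level.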
Anonymous posted on Monday, September 06, 2004 - 6:06 am
I am trying to find out if the association between X and Y (two individual-level variables) varies as a function of classroom (C) levels of W (W is measured for each child). I cannot figure out the correct input to answer this question. Do you have any suggestions?
Anonymous posted on Monday, September 06, 2004 - 7:54 am
I thought it was shown in Example 9.1, but when I plugged in my variables I got this message: *** ERROR The number of observations is 0. Check your data and format statement. Do you know what I might be doing wrong? Thank you.
It sounds like you are reading your data incorrectly. The Mplus default is listwise deletion. Any observation with a missing value on one or more analysis variables is deleted from the analysis. After listwise deletion, you may have no observations. If you can't figure out the problem, you should send your input and data to email@example.com.
Anonymous posted on Monday, October 04, 2004 - 10:04 am
My understanding is that a variable can't be x on the between level model, and y on the within level model. However, I need this variable theoretically as x on the between level and y on the within level. In this case, is it still right if I use this variable as x on the between level, and y on the within? If okay, how can I use this variable as both x and y (e.g., code)?
bmuthen posted on Monday, October 04, 2004 - 3:05 pm
for the variable v on the level that it is not a y-variable. Here, z can be any of the variables on that level.
Anonymous posted on Tuesday, October 12, 2004 - 2:44 am
I am doing multilevel modelling...and I want to report between-level and within-level variance estimates (StdYX). But they are all 1.000-s. What does that mean? Should I report unstandardized estimates then?
Kätlin posted on Thursday, October 21, 2004 - 12:06 am
I do not know how to construct a model. Maybe you could help me. I assessed children's attributions and behavioral strategies in three relationship types, that is, towards friends, enemies, and neutral acquaintances. I am doing two-level modeling, where individuals are at level 2, and different relationship types at level 1. Relationships (peers) are dummy-coded. In addition, I have measured children's externalizing, internalizing, and adaptive behaviors, and I have also calculated the same indices for friends, enemies, and neutral acquaintances. I have regarded children's behavioral indices as only between-level variables, and peers' behavioral indices as only within-level variables. Thus, the model is as follows:
....
CLUSTER IS ID;
within are kaasada kaaseks kaasint;
between are eks int ada;
ANALYSIS: TYPE = TWOLEVEL;
ESTIMATOR = MLR;
MODEL:
%BETWEEN%
intent on eks int ada;
ag on eks int ada;
!intent - hostile attributions;
!ag - aggressive solutions;
!eks int ada - behavioral indices of children;
%WITHIN%
intent on kaaseks kaasint kaasada;
ag on kaaseks kaasint kaasada;
!kaaseks kaasint kaasada - behavioral indices of peers;
OUTPUT: SAMPSTAT STANDARDIZED RES MOD (0.00);
Hopefully I have done the right thing so far. For instance, I know that less hostility is inferred from more prosocial peers (the within path from intent on kaasada is significant). But I would also like to know if the friend's and/or the child's own adaptive behaviors have an effect on cognitions towards friends, and if so, whether the effect is stronger for friends than, for instance, for neutral acquaintances. How can I analyze that? If I do simple path analyses for friends, enemies, and neutral acquaintances separately, then I do not take into account that behavioral indices of children and behavioral indices of peers are actually at different levels.
Thus, for each relationship I could construct the following model:
MODEL:
ag on eks int ada kaaseks kaasint kaasada;
intent on eks int ada kaaseks kaasint kaasada;
ag with intent;
!ag - aggressive solutions towards friends;
!intent - hostile attributions towards friends;
!eks int ada - behavioral indices of children;
!kaaseks kaasint kaasada - behavioral indices of friends;
From what I understand, you have measured several variables on a group of children. I don't believe that you need multilevel modeling because you have not measured any one variable repeatedly nor are children nested in classrooms for example. I would specify my regression relationships in a regular model.
Anonymous posted on Friday, November 12, 2004 - 5:14 am
I would like to report within- and between-level correlations. Where can I get the levels of significance? Should I specify each pair of variables under the model command, and decide the significance on the basis of z-value?
Hello, I would like to ask the following concerning the capabilities of Mplus:
1. Is it possible to calculate random slope effects within a structural equation multilevel path model (2 levels)? Please note: This question refers to cross-sectional data, not to e.g. longitudinal latent growth models.
2. Is it possible for Mplus to construct structural equation multilevel models where the indicators of the level-2 construct(s) have no equivalents on level-1?
bmuthen posted on Sunday, November 14, 2004 - 11:45 am
Type = Basic should be used if all you want is the within and between correlations, but I am not sure Mplus gives the SEs for these.
bmuthen posted on Sunday, November 14, 2004 - 11:47 am
Answer to Nov 13 - 08:39.
1. Yes. See the Version 3 User's Guide examples.
Mike Cheung posted on Thursday, February 17, 2005 - 1:26 am
I want to predict a level-2 dependent variable (Gp_Cho) by using an aggregated level-1 predictor (Ind_Cho). The selected code is:
BETWEEN IS Gp_Cho;
CLUSTER IS Gp_Num;
ANALYSIS: TYPE IS TWOLEVEL;
MODEL:
%WITHIN%
Ind_Cho;
%BETWEEN%
Gp_Cho ON Ind_Cho;
I got the error messages:
*** WARNING in Model command
Variable is uncorrelated with all other variables on the WITHIN level: IND_CHO
*** ERROR in Model command
Variable is an x-variable on the BETWEEN level but is a y-variable on the WITHIN level: IND_CHO
I know that "a variable cannot be used as an x on one level and a y on another" (by Linda at December 23, 2002). Could you explain or point me to the references why it is not possible to use the same variable as x and y at different levels?
Are there any ways to "trick" the program to use the aggregated mean (from level-1) to predict a true level-2 dependent variable?
I have a question about the Mplus output of a "TYPE=TWOLEVEL" analysis with no specified model. As far as I understand, this is equivalent to conducting a one-way ANOVA with random effects on the dependent variable. So Mplus estimates the between and within variances and also a mean for the between part. This mean does not seem to be simply the average of the dependent variable.
Is it correct that the estimated mean is a "precision weighted average" (Raudenbush & Bryk, 2002), which is described as an estimator of the true grand mean?
Many thanks in advance!
bmuthen posted on Monday, October 10, 2005 - 9:03 am
Yes. As mentioned in the book, this is also the ML estimate which is what Mplus gives.
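For reference, the precision-weighted average in the one-way random-effects ANOVA can be written as below. The notation follows the usual HLM conventions and is an assumption here, not something taken from the Mplus output: $\bar{y}_{\cdot j}$ is the mean of cluster $j$, $n_j$ its size, $\tau_{00}$ the between variance and $\sigma^2$ the within variance.

```latex
\begin{aligned}
y_{ij} &= \gamma_{00} + u_{0j} + r_{ij},
  \qquad \operatorname{Var}(u_{0j}) = \tau_{00},
  \quad \operatorname{Var}(r_{ij}) = \sigma^2 \\[4pt]
\hat{\gamma}_{00} &= \frac{\sum_j \Delta_j^{-1}\,\bar{y}_{\cdot j}}{\sum_j \Delta_j^{-1}},
  \qquad \Delta_j = \tau_{00} + \frac{\sigma^2}{n_j}
\end{aligned}
```

With equal cluster sizes all weights are identical and the estimate reduces to the plain average. With unequal cluster sizes it generally differs from the raw average of the dependent variable, which weights each cluster mean by $n_j$.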
Samuel posted on Saturday, October 29, 2005 - 8:15 am
Hello Dr. Muthén,
I have a simple question about the capabilities of Mplus to take nonindependence into account. My data are from individuals nested in workgroups. For the main analysis, I use TYPE=COMPLEX and that works just fine. But as a preliminary analysis, I would like to show that people from different age groups don't differ significantly on the dependent variables. With independent data, I would simply conduct a one-way ANOVA. How would I do that with non-independent data? Maybe a regression analysis with age group as a categorical IV and TYPE=COMPLEX to take the nonindependence into account?
Kätlin posted on Tuesday, January 24, 2006 - 7:27 am
I have a question concerning using type = two-level or type = complex. When I use complex method, I get a significant path between two variables (b on a). When I am specifying the model at two levels separately, and the same path is estimated at both levels, the path is not significant at the within level, however it is significant at the between level.
I am now confused which method to use because the interpretation is different depending on the method I use.
COMPLEX and TWOLEVEL are two different approaches for clustered data. In COMPLEX, standard errors and chi-square are computed taking into account the non-independence of observations due to clustering, whereas in TWOLEVEL parameters are modeled for both the individual and the cluster level. So to some extent, the choice has to do with your hypotheses. In your case, you might want to use TWOLEVEL because it seems to give a fuller picture of what is going on.
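The two approaches can be contrasted in a short sketch. The variable names a and b come from the question; the cluster variable `clus` is hypothetical, and DATA and NAMES are omitted, so these are two alternative fragments rather than one runnable input.

```mplus
! Alternative 1: a single regression, with standard errors and
! chi-square adjusted for clustering
VARIABLE:  CLUSTER = clus;        ! hypothetical cluster variable
ANALYSIS:  TYPE = COMPLEX;
MODEL:     b ON a;

! Alternative 2: separate regressions for the two levels
VARIABLE:  CLUSTER = clus;
ANALYSIS:  TYPE = TWOLEVEL;
MODEL:     %WITHIN%
           b ON a;                ! association among individuals within clusters
           %BETWEEN%
           b ON a;                ! association among the cluster-level components
```

Under COMPLEX there is one slope for the overall relation, while TWOLEVEL splits it into a within and a between part, which is one plausible reason the same path can be significant in one analysis and not at a given level in the other.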
Your example is unusual. If you can share the input and data, I would like to use it as an example when teaching. If so, please send them to firstname.lastname@example.org.
I am a new user of Mplus. I would like to conduct what I believe is a multilevel CFA. I have individual perceptions of leadership, with individuals nested in units. The leadership latent variable is a second-order factor with five first-order factors, each measured by 3 items. When I conducted a CFA using AMOS, I had a good model fit. But this does not take into consideration the unit level and the clustering effect. Furthermore, Rwg and ICC(2) are above the cut-off points, justifying aggregation to the unit level. So what is the language I should use: TYPE=TWOLEVEL or TYPE=COMPLEX? Does Mplus aggregate the individual-level variable to the unit level? Thanks.
Thanks, Linda. I have used the following syntax, with tl1 to tl15 being the observed variables (items of the scale). The first-order factors are vis, ic, is, sl, and pr; tl is the second-order factor. Is this fine?
Individuals are clustered in regions and I have individual-level data (Xij) and region-level data (Rj). The outcome variable (Yij) is categorical and at the individual level. I want to estimate a multilevel model with a random intercept (but no random slopes) and with latent variables, where all the latent variables are measured by region-level data. I have in mind the following model:
LR1 by R1 R2 R3
LR2 by R4 R5 R6
Y on X1 X2 LR1 LR2
Is this a TWOLEVEL model? How do I define *between* and *within* for this model?
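One way the model described in this question could be laid out in a two-level input is sketched below. It is an untested illustration rather than a recommended specification: the file name, the variable order and the cluster variable `region` are assumptions, and estimator and integration settings are left at their defaults.

```mplus
TITLE:     Region-level factors predicting a categorical outcome (sketch);
DATA:      FILE IS regions.dat;              ! hypothetical file name
VARIABLE:  NAMES ARE region y x1 x2 r1-r6;   ! hypothetical names and order
           CATEGORICAL = y;
           CLUSTER = region;
           WITHIN = x1 x2;                   ! individual-level covariates
           BETWEEN = r1-r6;                  ! region-level indicators
ANALYSIS:  TYPE = TWOLEVEL;
MODEL:     %WITHIN%
           y ON x1 x2;
           %BETWEEN%
           lr1 BY r1-r3;
           lr2 BY r4-r6;
           y ON lr1 lr2;          ! random intercept of y regressed on the factors
```

In this layout the individual-level regression sits in the within part, and the region-level factors explain variation in the random intercept of y in the between part.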
I am new to multilevel analysis and Mplus, and I am exploring what analyses are most appropriate for my purposes and data.
For the purpose of developing a questionnaire of parents’ cognitions about their child (i.e., dyadic cognitions), I have administered an initial item-pool of 55 items among around 300 parents.
I want to construct the final scale by (a) selecting items with high item-total correlations and small to moderate inter-item correlations; and (b) selecting items on theoretical grounds. Subsequently, I want to do an EFA on the final scale, examining its factor structure.
My data have a multi-level structure, because some participants are nested within the same child; that is, 200 participants are in dyads (i.e., they are husband and wife) and have reported their cognitions about the same child. The remaining 100 participants are individual (e.g., because they are single parents or their partner did not participate).
How can I best take the multilevel structure of my data into account? For example, would it be possible to compute item-total correlations, inter-item correlations, and EFA using type=twolevel?
Thank you for the comment! Before doing an EFA, however, I would like to select items that are both psychometrically strong (i.e., with high item-total correlation and relatively normal distribution) and theoretically central to the construct (selected by an expert panel). Else, I’m afraid that weak or theoretically strange items will result in uninterpretable factor solutions. What is your opinion on this?
Any item construction should include experts in the field. Any data analysis should include a thorough investigation of the univariate and bivariate descriptive statistics involving the variables. EFA can be used descriptively to see how items behave as far as if they load on the expected factor, if they have cross-loadings that are unexpected etc.
Della posted on Thursday, August 18, 2011 - 5:56 pm
Suppose you have both dichotomous and ordinal indicators for some of your factors, you are doing an MSEM, the variables are non-normal and highly skewed, and the sample size is large (over 1000).
What are the best estimators for Type=Complex and Type=Twolevel?
I have tried a two-level regression analysis for a continuous dependent variable with a random intercept (Example 9.1):
There are three independent variables (x1, x2, and x3) and four dependent variables (y1, y2, y3, y4). First I created variables holding the cluster means of x1-x3, and then I specified the following model:
NAMES= X1-X3 !independent individual values
Y1-Y4 !dependent individual values
XM1-XM3; !cluster mean values
WITHIN = X1-X3;
BETWEEN = Y1-Y4;
CLUSTER = ID;
DEFINE: CENTER x1-x3 (GRANDMEAN);
ANALYSIS: TYPE=TWOLEVEL;
MODEL:
%WITHIN%
Y1-Y4 ON X1-X3;
%BETWEEN%
Y1-Y4 ON XM1-XM3;
Question 1: On level 1, can I conclude that an individual's value on X1 has an effect on that individual's value on Y1?
Question 2: On level 2, can I conclude that the cluster level of X1 (the cluster mean) has an effect on the cluster value of Y1?
Question 3: My understanding is that aggregating individual values to a cluster mean requires sufficient reliability among the individuals within a cluster. What can I do if the reliability among individuals within clusters is poor?
The inputs will not run. You have the x's on the WITHIN list and the y's on the BETWEEN list and are using both variables on both levels. Please send any outputs and your questions to email@example.com so we can help you.
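A version of the input that avoids the problem named in this reply might look like the sketch below. It assumes that the cluster means XM1-XM3 are the intended between-level predictors and that ID is the last variable in the data file, which the post does not state. Y1-Y4 appear on neither the WITHIN nor the BETWEEN list, so they can be modeled on both levels.

```mplus
TITLE:     Two-level regression with cluster-mean predictors (sketch);
DATA:      FILE IS mydata.dat;                  ! hypothetical file name
VARIABLE:  NAMES = X1-X3 Y1-Y4 XM1-XM3 ID;      ! position of ID is an assumption
           WITHIN = X1-X3;        ! individual-level predictors
           BETWEEN = XM1-XM3;     ! cluster means, between level only
           CLUSTER = ID;
DEFINE:    CENTER X1-X3 (GRANDMEAN);
ANALYSIS:  TYPE = TWOLEVEL;
MODEL:     %WITHIN%
           Y1-Y4 ON X1-X3;
           %BETWEEN%
           Y1-Y4 ON XM1-XM3;
```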
Please note that posts on Mplus Discussion should not exceed one window. In the future, please limit your post to one window.
Huiping Xu posted on Thursday, November 21, 2013 - 9:21 pm
Dear Dr. Muthen,
In my study, I have 3 groups of subjects who are repeatedly measured on 5 items at 3 time points. I want to see whether these 5 items define 2 factors and how the three groups of subjects differ on these two factors at a fixed time and across time. Because the time effect is not linear, I will be treating time as a categorical variable. I would also like to examine whether time and group have an interaction effect. Does multilevel factor analysis seem appropriate to answer my questions?
I am reading your 1994 paper on multilevel factor analysis model. It appears that the analysis is decomposed into the between and within subject factor analysis. Two sets of factor scores can be derived from the analysis. The between subject scores are derived on the subject level so one subject has one factor score. The within subject scores are derived on the time level so each subject gets 3 factor scores, one at each visit. How should I use these factor scores to answer my questions?
I would treat this as a single-level longitudinal factor analysis where non-independence of observations is handled by multivariate modeling. It would be like Example 6.14 but without the growth model.
The two inputs yield nearly identical means, but different variances.
Where is the difference between the two approaches?
2.) I compared a null model (twolevel) with a model in which I used the cluster_mean command to create a level-2 variable. The variance at level 2 in the null model is smaller than the variance of the cluster_mean variable. Why is this so? Is there a general rule that the variance of manifest aggregated variables is higher (overestimated)?
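A standard result that bears on this question, stated here for the balanced case with $n$ observations per cluster (the symbols $\sigma_B^2$ and $\sigma_W^2$ for the between and within variances are notation introduced for this note):

```latex
\operatorname{Var}\!\left(\bar{y}_{\cdot j}\right) = \sigma_B^2 + \frac{\sigma_W^2}{n}
```

The variance of observed cluster means therefore contains a share of the within variance in addition to the between variance, so it is expected to exceed the between-level variance estimated in the null model, with the gap shrinking as cluster size grows.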