Anonymous posted on Monday, January 28, 2002 - 11:43 am
I am working on a two level SEM model (unbalanced data). I am new user. I have couple questions: 1. Is it possible to define a variable that designates those clusters with few number of cases in them so that I can run the analysis without them using USEOBSERVATIONS command? 2. Is the estimation method mentioned in Appendix 10 (page 380 of the User's guide) called MUML not included in the software at this moment? 3. Is there any articles/books that uses the software and goes through a twolevel example other than the ones I found in the Mplus web pages? 4. More specifically, what kind of constraints are required in twolevel analysis for any model to be identified? I am getting the message that one of the Beta parameters is problematic, causing underidentification? How can I fix that?
I am not sure these questions were raised before. Thank you very much for your time.
1. Unless you have a variable that contains information about cluster size, I can't think of a way to do this. 2. MUML is the estimator used for TWOLEVEL when the data are unbalanced. If the data are balanced, it is Maximum Likelihood 3. I believe that the following book contains Mplus examples. Heck, R. (2001). Multilevel modeling with SEM. In G.A. Marcoulides & R.E. Schumacker (eds.), New Developments and Techniques in Structural Equation Modeling (pp. 89-127). Lawrence Erlbaum Associates. 4. I can't answer your idenentification problem without seeing the input/output and data if possible. Please send them to email@example.com.
Anonymous posted on Thursday, July 24, 2003 - 10:31 am
I had a handful of questions about Mplus’ mulitivariate covariance structure analysis (MCSA) after reading Bengt’s 1994 SMR piece.
First, Bengt writes (pg. 388): “…If all intraclass correlations are close to zero, as is the case for many applications, it might not be worthwhile to go further…”. I was under the impression that hierarchical / multilevel modeling is *always* preferable to non-hierarchical modeling, for various reasons (see for example, Gelman et. al., 1995). Is Bengt’s point that one wouldn’t expect to see much improvement in the Level-1 coefficients if the ICCs are close to zero, or rather that obtaining convergence is sufficiently difficult that any gains in parameter estimation are probably not worth the tremendous amount of effort that would be required to build the model / obtain convergence (etc.) ?
Second, on the Mplus MCSA approach in general: am I correct in understanding that the Mplus / Muthen approach is a SEM analog to the HLM / empirical Bayes approach to hierarchical modeling ?
Third, and related to the second question above: in implementing MCSA in Mplus when using individual data as inputs, does Mplus always work from a “constructed” set of variance / covariance matrix ? Perhaps what I’m trying to understand here is if the Mplus approach to MCSA be conceptualized in another manner (i.e., Bayesian / probability notation) rather than that of fitting a variance / covariance matrix that has within and between components (and requires the estimation of St, Spw, and Sb matrices) ? (I ask only because for some reason the notation of Gelman et. al. seems a bit more intuitive to me at this point).
Finally, regarding the FIML and MUML estimators described in Bengt’s 1994 SMR piece: are the current Mplus 2.14 estimators of the same name identical to FIML / MUML as described in the article ? Is the Mplus ML option for MCSA the same as the FIML option in the case of unbalanced cluster sizes (or could you provide a citation describing the difference between the FIML, MUML, ML as applied in Mplus’ MCSA module) ? I’ve scanned the Mplus Papers / Recent Publications listing on the front of the web page and nothing seemed applicable here.
Thanks very much.
bmuthen posted on Thursday, July 24, 2003 - 11:16 am
Good questions. Things will become clearer as I finish an overview paper later this summer on FIML capabilities in Mplus.
Regarding the first question, yes I was thinking pragmatically. The parameter estimates and the SEs may not be that different - and convergence can be problematic.
Second question. Since the introduction of FIML in version 2.1, the approaches are analogous for the models that they both do - e.g. models with no latent variables. With the older MUML approach, we only had random intercepts, not random slopes. And, no missing data.
Third question. Between and within matrices are only used for MUML, while for FIML the raw data need to be used in order to handle random slopes and missing data. With the more general models that FIML includes, one can also describe the models using the Baysian conditioning style, level 1 given level 2, etc.
Final question. MUML is the same. The FIML algorithm is different in that in the earlier approach FIML was done in standard SEM software simply by using as many between cov matrices as there were distinct cluster sizes.
- Hope that gives sufficient answers for now. More in the forthcoming papers.
Anonymous posted on Thursday, July 24, 2003 - 4:44 pm
Thanks very much.
As a follow-up to your responses above (I may have one more in a bit): MUML is still (i.e., as of Mplus 2.14) the only method in Mplus to handle unbalanced clusters / "Level-2" units ?
I hope you'll be putting the paper you mention up on the Mplus website when you're finished.
bmuthen posted on Thursday, July 24, 2003 - 4:56 pm
No, FIML in Mplus (since version 2.1 of May 2002) handles the general case of unbalanced clusters, i.e. where different clusters have different number of members.
Yes, the paper will be posted on the home page with examples that can be run.
Anonymous posted on Saturday, August 09, 2003 - 6:35 am
I had a pair of questions when I run a multilevel structural model.
(1)for random intercepts model, the result presented the warning message:
THE MODEL ESTIMATION DID NOT TERMINATE NORMALLY DUE TO A NON-ZERO DERIVATIVE OF THE OBSERVED-DATA LOGLIKELIHOOD.
CONVERGENCE CRITERION FOR THE MODEL IS NOT FULFILLED. CHECK YOUR STARTING VALUES OR INCREASE THE NUMBER OF MITERATIONS. ESTIMATES CANNOT BE TRUSTED. THE LOGLIKELIHOOD DERIVATIVE FOR PARAMETER 12 IS -0.49269400D+00.
My command file as follow:
TITLE: multilevel regression analysis
DATA: FILE IS "C:\d1.dat";
VARIABLE: NAMES ARE subject group y1 y2 x z1 z2; USEVARIABLES ARE group y1 y2 x z1 z2; CLUSTER IS group; MISSING ARE ALL(999); WITHIN = x; CENTERING = GROUPMEAN(x); BETWEEN = z1 z2;
ANALYSIS: TYPE IS TWOLEVEL RANDOM MISSING;
MODEL: %WITHIN% y2 on y1; y2 on x; y1 on x;
y1 on z1 z2; y2 on z1 z2;
(2)For random slopes model: MODEL: %WITHIN% y2 on y1; s1|y2 on x; s2|y1 on x;
y1 on z1 z2; y2 on z1 z2; s1 on z1 z2; s2 on z1 z2;
Such as the follow warning massege was gaven:
THE ESTIMATED BETWEEN COVARIANCE MATRIX IS NOT POSITIVE DEFINITE AS IT SHOULD BE. COMPUTATION COULD NOT BE COMPLETED. PROBLEM INVOLVING VARIABLE ZCPSOC. THE RESIDUAL CORRELATION BETWEEN ZCPSOC00 AND ZCPSOC IS 1.000
THE MODEL ESTIMATION DID NOT TERMINATE NORMALLY DUE TO AN ERROR IN THE COMPUTATION. CHANGE YOUR MODEL AND/OR STARTING VALUES.
Please tell me how handle those problems. Thank you very much.
bmuthen posted on Saturday, August 09, 2003 - 7:19 am
For the first issue, I recommend following the advice in the error message. The derivatives should be zero for all parameter estimates, but one is far from it, so the "miterations" should be increased. This error message may be an indication that this parameter (#12 - see Tech1 output) is hard to estimate, that is, it is ill-defined from the data for this model.
For the second issue, I assume that you have used generic variable names, and not your actual variable names, in the input you give - the error message talks about other variables and I assume they are the 2 y variables, that is the between part of y, i.e. the y intercepts. If I am correct, the error message implies that the y intercepts correlate perfectly when having accounted for z1, z2.
Anonymous posted on Saturday, March 27, 2004 - 10:52 am
I am working on a multilevel SEM. At the within level 'sp' predicts 'mot'. At the between level 'lcb' predicts 'ppp' which predicts 'mot'. In other words I want to estimate the effect of between level factors on the individual level factor, 'mot'. I am not sure what to do with 'sp' in the between model. Reading p. 299 of V2. User'########, I think that I should fix the indicators of 'sp' to 0 in the between model. When I do this by fixing these variables to 0 on one of the Level 2 factors, e.g, 'ppp', then the results aren't what I would expect. If I create a new factor, 'spb' in level 2 and fix the indicators to 0 on this factor, then the results seem more like what I would expect. My code for the latter would be:
TITLE: LCB TOTAL TWO-LEVEL TRIAL 1 DATA: FILE IS C:\Documents and Settings\Owner\My Documents\LCB\lcb2.dat; FORMAT IS A21, 3F11.0, 11F11.4, 2F13.4, 3F11.4, F11.2, F11.5, F8.2; VARIABLE: NAMES ARE project courseid ugorgrad contenta lcbelief lcnonstu lcnontch posint fclp cllnd chlres indsoc sposint sfclp scllnd schlres sindsoc se tasmas pergoal finalgra zfinalgr stpp_com; CLUSTER IS courseid; BETWEEN = lcbelief lcnonstu lcnontch posint fclp cllnd chlres indsoc; USEVARIABLES ARE schlres sindsoc sposint sfclp scllnd se tasmas pergoal lcbelief lcnonstu lcnontch posint fclp cllnd chlres indsoc; MISSING = .; ANALYSIS: TYPE = TWOLEVEL; MODEL: %WITHIN% sp by sposint sfclp scllnd schlres sindsoc; mot by se tasmas pergoal; mot on sp; %BETWEEN% lcb by lcbelief lcnonstu lcnontch; ppp by posint fclp cllnd chlres indsoc; spb sposint@0sfclp@0scllnd@0schlres@0sindsoc@0; ppp on lcb; motb by se tasmas pergoal; motb on ppp;
Is this correct, if I am not interested in estimating 'sposint' 'sfclp' 'scllnd' 'schlres' and 'sindsoc' in the between level model?
Thanks in advance.
bmuthen posted on Saturday, March 27, 2004 - 11:43 am
The easiest way to get rid of sp indicators on the between level is to put them on the
list in the Variable command. That says that they don't have variation across between units.
It's good to see new Mplus out. I am impressed about the ability of handling multilevel model in the new Mplus.
I am trying models of multilevel SEM using Mplus 3.01. I found the output is different from what appeared in Heck's (2001) article. First, I can't find chi-square or for the model. Second, I found STANDARDIZED in OUTPUT seems no working, since there is no standardized coefficients shown in the output. Is there any way to get this information? Besides, can the formulas in appendix 10 in previous Mplus manual apply to random slope model in Mplus 3.01? I am preparing a paper for a conference, so just curious about the formulas and mathematical part of this kind of model in Mplus.
Ron Heck used the MUML estimator not maximum likelihood. That is why the output is different. With random slopes, you do not get a chi-square or standardized values because the variance changes. The appendices cover Version 2 not Version 3. The technical appendices will be updated as time permits.
Thanks for your reply. I wonder if you can provide me the reference for random slopes and two level model in Mplus. Besides, I am also puzzled about how to interpret the output. Here is part of my output:
-------------------------------- Between Level
S ON INDEX_1 0.425 0.018 23.764 EFF3_1 -0.370 0.026 -13.977
bmuthen posted on Monday, June 14, 2004 - 10:25 am
Yes, that would be the equation predicting del3, so using the average s value. Note that s also has a residual variance.
The reference you ask for is Asparouhov and Muthen 2003a (in preparation) with the working title given in Muthen (2004) on the Mplus web site.
Anonymous posted on Saturday, September 11, 2004 - 12:40 pm
I have one question when I run a two-level path analysis as follows:
TITLE: two-level path analysis with continuous dependent variables DATA: FILE IS c:\e1.dat; VARIABLE: NAMES ARE y1 y2 x1 x2 w clus; USEVARIABLES ARE y1 y2 x1 x2 w clus; WITHIN = x1 x2; BETWEEN = w; CLUSTER IS clus; Missing = .; ANALYSIS:TYPE = TWOLEVEL; MODEL: %WITHIN% y2 ON y1 x2; y1 ON x1 x2; %BETWEEN% y2 y1 ON w;
I got the error message as follows: *** ERROR This analysis is only available with the Multilevel or Combination Add-On.
I would greatly appreciate it if I could hear from you soon. Thank you so much.
Hello, A colleague and I are new to MPlus but attended several days of your short courses. We would like to replicate one of the analyses that you did in class for the multigroup multilevel model using the NELS:88 data set. I looked through the website but couldn't find it. Are the data on the website or could we get a copy of the data? Thank you so much.
It's not available on our website. It can be ordered because it is in the public domain. We could send you the subset of the data we worked with, but it would probably be best for you to go back to the original data if you are going to do any serious research.
Dr. Vandenberg sends his regards and suggested I post this question here. I’m working on a multi-level dataset and came up with a conceptual/practical question I thought you may be able to provide advice about. We have longitudinal data (3 time periods) at the store level for 21 stores; however, not all of the same individuals filled out the survey at all three time periods (but the same stores did). We have DVs at the individual level (e.g., empowerment) and group level (e.g., store performance). We would like to run a multi-level model growth model; however, because we are restricted by sample size (N=21 at store level) we are thinking of running some type of cohort design analysis. My questions are: 1) can we run a multi-level growth model cohort designt level, 2) what are the caveats for taking a 'cohort design' approach, and 3) can you point me to an example? Thanks in advance for your time and consideration.
bmuthen posted on Thursday, March 10, 2005 - 4:22 pm
It sounds like you want to have a type=twolevel model with cluster=store ("Between" in Mplus) and individuals spread over time points as Within. And, it sounds like on Within you want to have multiple cohorts of sets of different individuals - I assume for the same 21 stores. I haven't seen this done, but I think it can be done. But first let me ask if I am understanding you correctly.
Anonymous posted on Tuesday, March 15, 2005 - 7:50 am
Hello, in some of your papers I read that MUML is useful with unbalanced groups and small sample sizes. So, I used this estimator to compute multilevel CFAs. But the tech1 output looked quite different with MUML than for the default ML-estimator. Is this a problem of the program or is the model calculated really different?
With balanced data, MUML is ML. With unbalanced data it is not. The TECH1 matrices will look different as the model is set up differently. The results should not look that different.
Anonymous posted on Thursday, March 17, 2005 - 11:01 am
Hello, I'm new to Mplus and I'm slowly trying to build my way to a complex multi-level path model. For some reason, I get the following error message:
*** ERROR One or more between-level variables have variation within a cluster for one or more clusters. Check your data and format statement.
This is simply not true: I can check it 'manually', looking at the data, but i also know from the way level 2 variable was created that it is impossible. So I must be doing wrong something else. I'll be very grateful for your suggestions/advice.
You are most likely reading your data incorrectly. This sometimes happens in free format if you have blanks in your data. Check the data set for blanks. If you can't see what is happening, send the data and input along with your license number to firstname.lastname@example.org and I will sort it out.
In regards to the multi-level cohort design I mentioned, you are correct in your understanding. We have 21 different stores (with hundreds of individuals in each store) across 3 time periods (although the individuals are different at each time period, but the stores are the same). Thanks.
bmuthen posted on Friday, March 25, 2005 - 9:47 am
Are all the individuals different at the different time points? I assume no, since you were talking about multiple cohorts which suggests longitudinal data.
You are correct. Not all of the individuals are different at each time period. For each store, we do have quite a few indivdiuals that we know took the survey at time 1 and time 2, and time 2 and time 3 (and we can assume some took it all 3 time periods, but we don't have that information because we just asked about the previous time period and we didn't track individuals). At time 2, 50% of the sample had taken the survey at time 1. At time 3, 44% of the sample had taken the survey at time 2.
BMuthen posted on Saturday, April 02, 2005 - 8:34 pm
I will get back to you on this after April 19 if you repost this to remind me. This is a research topic that Mplus is looking into.
Anonymous posted on Friday, April 08, 2005 - 2:49 pm
I'm been running some simple random coefficient models in Mplus and HLM6.0. The model involves a random coefficient pertaining to a Level-1 dummy variable X.
I've noticed that, for clusters which don't contain both values of the dummy variable X, HLM6.0 appears to drop the entire unit from analysis. Its not clear what Mplus does in this situation, however. As a result, although my Mplus and HLM6.0 results are similar, they appear (from the HLM6.0 and Mplus outputs) to have been performed on slightly different sample sizes.
Would you provide some guidance as to how Mplus handles situations where the random coefficients pertain to dummy variables in situations where both values are not evident in some clusters ?
BMuthen posted on Sunday, April 10, 2005 - 2:48 am
Mplus does not drop these clusters because they contribute to the ML estimation of the intercept. Perhaps the differences you see are due to using REML rather than ML in HLM or because of different convergence crtieria.
Hello, I am a new user to Mplus running a Multilevel (individuals nested in schools) SEM model and would need help with the syntax for the second level. The school level variables I intend to include are “School Risk” (measured Rhsfree) and “School Location” (measured as urban). Both are dichotomous variables and I want to estimate their slopes and intercepts on “Perception” (measured as percept1)” and “Attitude” (measured by utq41a utq41b utq41c utq41d). I would be most grateful, if you would help me in the writing of the syntax. I am looking forward to hearing favorably from you. Below is my Input for the Level one analysis. Thank you.
INPUT INSTRUCTIONS TITLE: Diss DATA: FILE IS "C:\Mplus Files\Diss67.dat"; FORMAT IS 79F8.2;
The examples in Chapter 9 cover multilevel analysis. Instead of COMPLEX, you would want TWOLEVEL. And then you want to specify the within and between parts of the model. See Example 9.6 to get started. School-level variables should be listed on the BETWEEN statement of the VARIABLE command. Chapter 14 contains a section that describes special options for two-level models. You should also read this.
Thanks very much. As a follow-up, I tried following Example 9.6 but I received an error message saying “This analysis is only available with the Multilevel or Combination Add-On”. Reading through the responses to some of the questions posted at the Discussions page, I realized the problem is possibly due to the fact that I have the Base program or the Base plus the Mixture Add-on that do not have Multilevel. Please, is there a way for me to get the program that have Multilevel? Thank you.
Hi, Dr. Muthen: I tried to simulate a two-level CFA model with M-plus and the within level population parameters are exact same as a paper. I have 15 indicators predicted by 3 factors in within level. In between level model, I have 2 factors and 10 indicators. M-plus always gave the error message that "THE ESTIMATED WITHIN COVARIANCE MATRIX IS NOT POSITIVE DEFINITE AS IT SHOULD BE. THE VARIANCE OF w2(one between level indicator) APPROACHES 0." In fact the population variance of w2 is .5. I did not know how to make sure the covariance of two-level CFA model could be inverted and computable and do you have any suggestion in terms of how to set up valid population parameters of 2 level SEM model for large covariance case? I also attached my code in the following. I very appreciate your help.
Hi, Dr Muthen: I very appreciate your response and I have a question for simulating a 2 level CFA model with different ICC conditions of .2 and .3. For example, I have a 2 level CFA model with 2 factors y1& y2 at within level and yb1 & yb2 at between level respectively and y1 is correlated with y2 and yb1 is correlated with yb2, each of them predicts 3 indicators. M-plus would give intraclass correlation for 6 indicators finally and I do not know how to control ICC as one condition to simulate data? Sometimes ICC for different indicators vary significantly. Whether we need to check the average of the ICC of the six indicators or is there alternative method for calculating unique ICC? For example my model has 2 related factors yb1 and yb2 at between level and how to calculate ICC in this case? Someone recommends to change the between level factor variance and I do not know whether it works?
By changing between-level factor or factor indicator variances, you change the ICC's. You can look at the Monte Carlo examples for Chapter 9 that come on the Mplus CD to get an idea of how this works. You can also just try changing the variances and see how this affects the ICC's.
Hi, Dr. Muthen: I am running a 2-level SEM model and the output did not show CFI and SRMR shows total value and within level value. Do you know how to explain this? Does that mean CFI is not meaningful in multilevel SEM model?
Dr. Muthen: Thanks for your comments. I think probably my 2-level CFA model has the convergence problem although it has very good parameter estimates and hence results did not show CFI and TLI. Again, I still has problem with simulating a valid 2-level CFA model. Mplus always indicates that THE MODEL ESTIMATION DID NOT TERMINATE NORMALLY DUE TO AN ILL-CONDITIONED FISHER INFORMATION MATRIX. My model should be appropriate and I tried several different starting values, however, this did not work. Example11.7 is different from my case since it reads data from a real dataset. I need to simulate data using population parameters. I just wonder is there any formula or empirical population parameters example I could use for generating a valid 2-level CFA model. Thanks. Additionally, I am still confused at ICC for 2 level CFA model. Suppose I have 6 indicators in my between level model, 6 intraclass correlations are shown for first replication, do you know the formula and how to calculate them? ICC is between level variations over total variation.
Dr.Muthen: I run Mplus Monte Carlo versions of ex9.6 and it did not give CFI and TLI. My 2-level CFA simulation model has the equivalent problem; it shows Chi square, RMSEA and SRMR. I just wonder whether Mplus program did not give CFI?
I didn't realize that you were referring to a Monte Carlo output. We don't do CFI and TLI with Monte Carlo because it requires running a baseline model at each replication in addition to the regular analysis.
Dear Dr. Muthen, I am running multilevel SEM analyses using the TYPE=COMPLEX MISSING command along with the stratification, cluster, and weight commands. The analyses terminate normally providing the fit indices, however I cannot get standardized estimates or latent r-squares. Are these calculations not available when using the multilevel analyses?
i am doing two-level analysis. i am very surprised that when i delete the 'mcp with mautive'(the model required), the chi-square reduce from 51647.675 to 29.439, and the model fit change from 0 to 0.980. such a large change makes me confusing.why these happen? thanks in advanced.
ANALYSIS: TYPE = TWOLEVEL;
MODEL: %within% effort by pmeff tmeff; impul by pmimp tmimp; effort on mautive mcp; impul on mautive mcp; edrad on effort impul;
mautive with mcp; effort with impul; tmimp with tmeff;
tmimp with edrad; tmeff with edrad;
tmimp with tmeff; tmimp with edrad; tmeff with edrad; edrad; tmimp; tmeff;
You should not put WITH statements in the model for covariates. When you do this, distributional assumptions are made about these variables. Covariates should be mentioned only on the right-hand side of ON.
I have a question about estimating multiple group, multilevel SEM models. I'll start with a conceptual description. If you need the code, I can send you this.
I have nested data, with students nested in classrooms. There are 28 classrooms and 480 students. I wanted to run multilevel models to test the effects of two classroom-based interventions (total effects for X1 and X2) and also test the indirect effects of the program via theoretical mediators (A*B). I used the type=two level command.
One of my research questions is: Are indirect effects moderated by gender (moderated mediation). However, when I tried to do a multiple group model, with grouping = gender and cluster=classroom, the model would not run. I was told this is because of problems with the covariance matrix-perhaps because more than one gender is represented in each cluster. Is this correct? Is there anyway to get around this?
Also, I was unable to test indirect effects when I included the intervention variables as predictors in the level 2 (or between-level) equations (all other predictors were included in the within-level model). So, I included them in the within-level equation with the rest of the fixed effects. This is not ideal because the intervention was delivered at the classroom level. Did I do something wrong, or is this a glitch with the type=twolevel program?
I have a question about multilevel SEM models where level-1 variables only appear as outcomes of level-2 variables. Children are nested within classrooms (T4ADULT T4IGNORE T4PSOLVE T4PASSV T4REVENG T4VICT are child-level vars). Is this the correct way to specify such a model?
USEVAR ARE TID T4ADULT T4IGNORE T4PSOLVE T4PASSV T4REVENG T4VICT PARb SEPb INDb AVb ASb PUNb T4NMb T4AAb T4AVb;
Model: %within% T4VICT on T4PASSV; T4VICT on T4PSOLVE; T4VICT on T4ADULT ; T4VICT on T4REVENG;
%between% T4PASSV on PARB; T4IGNORE on PARB PUNB; T4PSOLVE on PARB INDB; T4ADULT on PUNB; T4REVENG on SEPB AVB; SEPB on PUNB T4AVB; ASB on PARB T4AAB; AVB on T4AVB; T4AVB on T4NMB;
THE LOGLIKELIHOOD DECREASED IN THE LAST EM ITERATION. CHANGE YOUR MODEL AND/OR STARTING VALUES.
THE MODEL ESTIMATION DID NOT TERMINATE NORMALLY DUE TO AN ERROR IN THE COMPUTATION. CHANGE YOUR MODEL AND/OR STARTING VALUES.
A MATRIX COULD NOT BE INVERTED DURING THE BASELINE MODEL ESTIMATION. THE ESTIMATED BETWEEN COVARIANCE MATRIX IS NOT POSITIVE DEFINITE AS IT SHOULD BE. COMPUTATION COULD NOT BE COMPLETED. THE VARIANCE OF T4PASSV APPROACHES 0. FIX THIS VARIANCE AND THE CORRESPONDING COVARIANCES TO 0, DECREASE THE MINIMUM VARIANCE, OR SPECIFY THE VARIABLE AS A WITHIN VARIABLE.
It seems you have no variability of t4passv on the between level. You should do a TYPE=BASIC TWOLEVEL; and look at the ICC for that variable. If it is small, you should put the variable on the WITHIN list and remove it from the between part of the model.
Stephan posted on Thursday, February 14, 2008 - 9:24 pm
Hello,apologize for this long questions. I’m investigating the relationship between married couples. 3 end. LV measure ‘satisfaction’,7 exog. LV are influence factors. However, there are 3 systematic differences which could have an effect on all variables. These are ‘Motives’ of marriage. Let’s say “Forced by family”, “Money” and “Love”. My model contains cluster = country of origin. (…) Variable: Names Are x1-x40 Mo1-3 Clus; Usevariables Are x1-x40 Mo1-M03 Clus; Between = Mo1-Mo3; Cluster Is clus; Analysis: Type = Twolevel; Model: %Within% F1 By x1@1 x2-x4; (…) F4 By x17@1 x18-x20; (…) F1-F3 ON F4-F11; %Between% F1a By x1@1 x2-x4; (…) F4a By x17@1 x18-x20; (...) F1a-F3a ON F4a-F7a; F1a-F7a ON Mo1-Mo3; (...) Question 1: Is it correct not to use the ‘Whithin Is’ statement? Question 2: Does it make sense to regress all latent variables in the %between% statement on the 3 Motives that might influence group differences? Question 3: Or would it be better to use these 3 motives as cluster variables. (Cluster = 1,2,3) But how should I, in that case, implement ‘country’. Question 4: Or should I use LCA (with c=3) and then regress all IV & DV latent variables on c? Any hint is highly appreciated. -Stephan
I would create two dummy variables and use them as covariates.
J.W. posted on Friday, February 15, 2008 - 1:57 pm
In multilevel modeling, it is usually assumed that a lower level unit (e.g., student) is nested within only one higher level unit (e.g., school). However, in some hierarchically structured data (e.g., snowball sample, chain referral sample, RDS sample, …) individuals are likely to be nested in multiple personal networks; e.g., A is in B’s network, as well as in C and other’s networks. In such data, difficulty of identifying a unique higher level unit for each individual makes multilevel modeling difficult. Is there any way in Mplus to handle observation dependence in such data? Thank you very much for your help!
This is referred to as crossed random effects and is not yet available in Mplus.
Stephan posted on Sunday, February 17, 2008 - 4:05 pm
Hi Linda, thanks for the response. So, do you think that there’s no nested structure? However, I am interested in model path coefficients of those who married because of love and those who have been forced etc. My hypothesis is that in case of ‘force’ all coefficients are statistical not significant, in case of ‘money’ significant but lower as in case of ‘love’, however. Thanks a lot for your help. Cheers, Stephan
I don't see there is any clustering related to the categories unless there is something you are not saying. It sounds like people are in one of three observed groups. You can either use two dummy variables or multiple group analysis.
Stephan posted on Monday, February 18, 2008 - 2:09 pm
Hi Linda, Yes, that's it. Sorry - I've mixed up multiple group and multi level analysis. Thanks a lot for the advice. -Stephan
I would like to simulate multilevel data for a CFA problem and then estimate the CFA with a non-hierarchical model. I see how to simulate the multilevel data for the CFA situation, but when I try estimating the model with the non-hierarchical model I get an error message saying I need to use either %Between% or %Within%. I've tried both and get different results. My question is, which should I use to analyze the data in the naive (non-hierarchical) way, or is there another approach that I'm missing? Thanks very much.
Hi, I'm running a two-level SEM in Mplus 4.2. I use a dichotomous variable both at within- and between- level but the version doesn't seem to support this, my question is whether the latest version suports this? I want to use the "CATEGORICAL =" command for the variables, but I get the answer that
"*** ERROR in Analysis command ALGORITHM = INTEGRATION is not available for multiple group analysis. Try using the KNOWNCLASS option for TYPE = MIXTURE."
The error message refers to the fact the you need to use the KNOWNCLASS option instead of the GROUPING option for multiple group analysis with ALGORITH=INTEGRATION. This is still the case. It has nothing to do with the CATEGORICAL option.
Hello- I am running an SEM model with nested data (students within classrooms) and trying to create a latent factor of school readiness (srw by Eclr Eltr Enmbr Eshp Esize and srb by Eclr Eltr Enmbr Eshp Esize)- I am getting an error report similar to the one posted on August 9,2003.
THE ESTIMATED BETWEEN COVARIANCE MATRIX IS NOT POSITIVE DEFINITE AS IT SHOULD BE. COMPUTATION COULD NOT BE COMPLETED. PROBLEM INVOLVING VARIABLE ECLR. THE CORRELATION BETWEEN ENMBR AND ECLR IS 0.994 THE CORRELATION BETWEEN ENMBR AND ELTR IS 0.996 THE CORRELATION BETWEEN ESHP AND ENMBR IS 0.994
THE MODEL ESTIMATION DID NOT TERMINATE NORMALLY DUE TO AN ERROR IN THE COMPUTATION. CHANGE YOUR MODEL AND/OR STARTING VALUES.
I am confused - because I thought having high correlation between indicators of a latent factor was positive because they are all indicators of the underlying construct. Could you please help me interpret this error message. Thank you.
You would need to specify the baseline model in the MODEL command.
Unai Elorza posted on Wednesday, August 26, 2009 - 6:36 am
Hello, I am a newcomer to Mplus and multilevel modeling. I am testing a 2=>1=>1 mediation analysis: level 2 variable contributing to a level 1 mediator which, in turn, contributes to a level 1 dependent variable. All of them are observed variables. When I try to interpret the output file, I have three different types of standardized coefficients: STD, STDY and STDYX. One of them (STDYX) gives me a one std coefficient higher than 1 (all correlations are lower than 1; there are no covariances higher than 1). So, my questions are: 1) Which std parameters should I take into account? 2) Having a coefficient higher than 1, means that Heywood problem is arising in the test? How could I overcome it?
1. STDYX is used for continuous covariates. STDY is used for binary covariates.
2. With more than one covariate, a regression coefficient can be greater than one. You should always check for Heywood cases.
M Hamd posted on Wednesday, April 28, 2010 - 11:29 am
I am running a MSEM model which terminates normally (no negative variances or other errors). It is a 1-1-2 mediation model such that X->M->Y.
However, the correlation between two of the latent variables (more specifically, X and M) at the group level is very high (.97) while at the individual level is much lower (.23). Q1: Does this indicate problems in the data? Or is it Normal for level-2 correlations to be higher.
Moreover, the standardized path coefficients between "X and Y" and "M and Y" are 4 and 3 respectively. Karl Joreskog suggests this is ok, but may indicate multicollinearity. However, the VIFs (for individual-level variables) are fine.
q2: Is there some other error i should explore for before concluding there results are fine?
q3: Is there some other source besides Joreskog that I can use to argue that in multilevel models std coefficients greater than 1 are not necessarily a problem?
It sounds like you have the same model on both levels, in which case the high X, M correlation on the between level could cause the multicollinearity that does produce standardized slopes greater than one. The between level also specifies a regression so the issue is the same as in regular regression - so you don't need any extra arguments beyond Joreskog.
It is common for between-level variables to correlate much higher than within. This has to do with the topic of ecological correlations.
Perhaps you want to have only M->Y on the between level. Then the intercept of M still influences the intercept of Y and therefore the individual's Y.
M Hamd posted on Thursday, April 29, 2010 - 9:33 am
Thank you. Actually my model is like this:
%within% m on x;
%between% m on x(a); y on x; y on m(b) ;
MODEL CONSTRAINT: NEW(ab); ab = a*b;
As you will see by 1-1-2, I meant, Y is purely between-level variable, and x and m are free to exist at both levels.
When i set Y on X@0; the path coeff are no longer greater than 1.0.
I guess, I will cite Joreskog to argue that std coeff > 1.0 are not an issue in this case.
Hi, I'm trying to estimate a 1, (1,1), 1 MSEM (two mediators, all measures taken at L1), but get the error warning of a non-positive definite fisher information matrix, possibly due to starting values or model nonidentification; and SEs that cannot be computed due to problems with para25. I've tried a few things (deleted clusters with no variance; set variances to zero), with little success. Para25 is the relationship that the DV has with itself (wb in the syntax below) in the psi matrix. I'm not sure what this means, or what I should do to correct for it? I've copied my syntax below. There are no missing data in the variables used in the model. Any help would be appreciated. Stacey
ANALYSIS: TYPE IS TWOLEVEL RANDOM; ITERATIONS = 1000; CONVERGENCE = 0.01; MODEL: %WITHIN% WB ON CT(BW1); WB ON AF(BW2); C|WB ON TFL; CT WITH AF; CT ON TFL(AW1); AF ON TFL(AW2); %BETWEEN% C CT AF WB; C WITH CT AF WB; WB ON CT(BB1); WB ON AF(BB2); WB ON TFL; CT WITH AF; CT ON TFL(AB1); AF ON TFL(AB2); [C]; MODEL CONSTRAINT: NEW(ABW1 ABW2 ABB1 ABB2 CONW CONB); ABW1=AW1*BW1; ABW2=AW2*BW2; ABB1=AB1*BB1; ABB2=AB2*BB2; CONW=ABW1-ABW2; CONB=ABB1-ABB2;
Tobias Koch posted on Wednesday, November 10, 2010 - 10:01 am
I'm working on a ML-CFA with two correlated factors on each level. I would like to use the new bayes estimator and set invers wishart priors for specific variance/covariance matrices. For example, for the residual covariance matrix and/or for the covariance matrices of the latent factors. How can I access specific covariance matrices in the model command, so that I can refer to them in the model prior command later on? Put differently, is it feasible to set priors for entire covariance matrices or only for single parameters in the covariance matrices? If it's possible to refer to entire covariance matrices with priors, where can I find examples of it.
Peter Halpin posted on Wednesday, February 02, 2011 - 6:36 am
I have a question about multilevel SEM that I can't seem to find an answer to in the manual. Basically, I want to regress a latent variable at the within level on a latent variable at the between level .
For example, I want to regress "student attitudes" (for which I have a measurement model at the within level) on "teaching style" (for which I have a measurement model at the between level). If both variables were manifest, I would want student attitudes to have random intercept model and use teaching attitudes as a between level predictor.
So I am thinking of something like:
%between% y1 by t1-t5;
%within% y2 by s1-s5; y2 on y1;
Is this possible? To have a heirachical model for a latent variable? Any advice here would be greatly appreciated,
I am trying to construct a two level path analysis with variables x, y, z and a cluster variable a. All are observed. X is a between variable. y and z are measured at the individual (within) level but as potential outcomes, can operate at either the within or between levels. I wish to test the following model:
%within% z on y;
%between% z on y x; y on x;
However, when I run it, I get an error: "Observed variable on the right-hand side of a between-level ON statement must be a BETWEEN variable. Problem with Y."
Yet, this seems similar in structure to example 9.5 in the user's manual (V6, April 2010). And, I don't want to restrict Y to be a between variable given the within model.
Hemant Kher posted on Wednesday, August 17, 2011 - 7:19 pm
Greetings Linda and Bengt, I have a question on my multilevel model in MPlus. I am trying to replicate results from Applied Longitudinal Data Analysis (Singer & Willett, page 163, Model D). Dependent variable is CESD, and my two independent variables are Unemp, and monBYun. In this model the intercept, as well as the slopes for the independent variables are to have both fixed and random components. The code I used to fit this model is: %within% s1 | cesd on unemp; s2 | cesd on monBYun; %between% cesd with s1 s2; s1 with s2; The model fits, but my results are slightly different from the results listed in the book (or for that matter obtained using SAS or MLWin). Results are given below. Intercept [MPlus=11.135; Book=11.267] / Intercept var [MPlus=42.779; Book=41.52] Unemp slope [Mplus=6.985; Book=6.8795] / Unemp Var [MPlus=33.269; Book=40.45] monBYun slope [MPlus=-0.300; Book=-0.3254] / monBYun Var [MPlus=0.758; Book=0.71] Res. Var [MPlus=60.379; Book=62.43] These results are for Model D; using the same data but different variable combinations, I fit 3 other models (A, B, C) and the numbers are identical (to 3 decimal place rounding). The mean intercept and slopes for Model D are fairly close as shown above, but, difference in variance, especially Unemp seems larger. I have used the same estimator (ML) as used in prior models on the data. Is my code in error by any chance?
The page 163 results don't show covariances among the random effects, but I assume they are included in their modeling. A first check is if the same number of parameters is used - Mplus shows this clearly in the output.
Next, you can check if the loglikelihoods (LL) agree. They give the Deviance which is -2 times the LL. Perhaps either their run or the Mplus run can sharpen its convergence criterion and get a better LL with agreement of estimates.
Hemant Kher posted on Thursday, August 18, 2011 - 8:32 am
Hi Professor Muthen, I am grateful for your time on my question.
The LL value for the MPlus model I ran was -2547.997. Thus -2LL=5095.994; on page 163 of the book the deviance for the same model is 5093.6. These values are very close, but not identical.
I also checked the SAS (v9.2) output for the same model on UCLA’s ATS website. SAS results are identical to those reported in the book including -2LL. From the output it appears as though there are 10 parameters estimated in SAS. My MPlus output tells me “Number of Free Parameters 10”, thus the number of parameters estimated is the same as well.
As a side, from the ATS website, only the results from MLWin and SAS are consistent for Model D. Remaining software packages, MPlus, Stata and SPSS give estimates that are close yet different. Stata had difficulty calculating standard errors for variance / covariance parameters.
The reason you are experiencing problems with this example is that the variance covariance matrix for the 3 random effects is singular - I computed the determinant and it came out negative. Different algorithms and packages will react differently on singular variance covariance matrix and most people would consider this an unacceptable model. You can introduce structured variance covariance matrix for the random effects to eliminate these problems.
Just a little more information about Mplus. If you add the technical option output:tech8; you will see the details of the convergence process. The default algorithm EMA quickly reaches the ML estimates but fails because the variance covariance matrix for the random effects is singular. At that point Mplus switches to the EM algorithm which slowly approaches the singularity, but Mplus will deliberately avoid the full convergence to avoid the singularity. In this part of the algorithm the solution is driven by the logcriterion convergence criterion. So essentially all software packages differ because the ML solution is inadmissible so they report their own version of "approximately" ML solution.
Hemant Kher posted on Thursday, August 18, 2011 - 7:32 pm
Thank you Tihomir, Linda and Bengt. These details are beyond my comprehension. I am glad that I modeled the problem correctly in MPlus and the difference in results due to the difference in how softwares tackle singular matrices.
When I made the change suggested by Tihomir my solution matches the one from Stata; Stata is unable to compute std. errors but MPlus did (as you well know). The solution time was a bit longer than before.
Eric Deemer posted on Saturday, March 17, 2012 - 1:13 pm
Hello, Like a previous poster, I am trying to fit a multilevel SEM but I'm also getting the following error message:
*** ERROR One or more between-level variables have variation within a cluster for one or more clusters. Check your data and format statement.
But I know that each person within the cluster in question has the same (mean) score. Do you think this could be data format issue? Thanks.
If you get this message and you are certain that the values for each person within a cluster are the same, you may be reading your data incorrectly, for example, blanks in the data set, the wrong number of variable names. If you cannot figure it out, please send the relevant files and your license number to email@example.com.
I am testing a multilevel SEM model with three within-level outcomes. At the within-level I control for one within-level variable and I specify all intercorrelations between the three outcomes. At the between-level (that I'm mainly interested in) I have specified 2 between-level predictors (a and b) for these outcomes, 2 between-level moderators, their 2 interaction terms and I control for the effect of 2 between-level variables on the outcomes. Everything runs normally and the fit is good. Then I include in the between-level two correlations between the two control variables and predictor a, because they are theoretically important and they also increase the fit. But I get this error message about the NON-POSITIVE DEFINITE FIRST-ORDER DERIVATIVE PRODUCT MATRIX ... etc PROBLEM INVOLVING PARAMETER 49.
Parameter 49 is the PSI for predictor a. When I remove the correlation between the control variable and predictor a, the error goes away. When I replace this correlation with another correlation between the same control and predictor b, then the error message involves the PSI of predictor b.
I am wondering if you know what could the problem be.
Thank you very much. I am wondering if I have to specify a random model, since I do not predict random slopes. Could i run the model with type is twolevel as well? Then I could request the model indirect option for stratpo on treat and comppo on treat.
I have been using Mplus version 7 to estimate Multilevel Structural Equation Models based on Dr. Preacher's syntax with TYPE IS TWOLEVEL RANDOM. Under "estimated sample statistics", Mplus produces a within correlation matrix and a between correlation matrix. I would greatly appreciate if you could clarify (1) how these matrices are calculated and (2) whether missing data has been taken into consideration during the calculations of the within and between matrices.
(1-2) The assumption is that the variables can be decomposed into two orthogonal components, within and between, in line with random effects anova. The covariance matrices for those two parts are then estimated by maximum-likelihood under the standard MAR assumption of missing data (so "FIML").
I am testing a multilevel SEM model with 1-1-1 model and 2-2-1 model. I have two problems. could you help me?
fist of all, when i testing a 1-1-1 model and 2-2-1 model separately, there's fit very good. but, once i testing both model simultaneously, fit indices are worst! is it possible? can in testing 1-1-1 model and 2-2-1model separately? I think it's not a good idea.
and At the level 1. I testing only 1-1-1 model. but, when i input control variables in this model, fit indices are getting worse. and then I'm also getting the following error message:
"variable is uncorrelated with all other variables"
In this case, do I exclude all control variable? If so, at the level 1 model, there's no control variable. that's maybe problems later. I am wondering if you know what could the problem be.
I’m fitting a random slope multilevel SEM. The code and selected output are below:
WITHIN = matage; BETWEEN = liv sr; CATEGORICAL = sex; COUNT = fitn(p); DEFINE: CENTER matage (groupmean); MODEL: %WITHIN% rs_sexmat | sex ON matage; %BETWEEN% sex ON liv; sex WITH rs_sexmat; fitn ON liv rs_sexmat sr;
Estimate S.E. Est./S.E. P-Value FITN ON RS_SEXMAT -29.287 7.648 -3.830 0.000 SEX ON LIV 0.000 0.021 -0.011 0.991 FITN ON LIV 0.028 0.012 2.319 0.020 SR 0.054 0.123 0.437 0.662 SEX WITH RS_SEXMAT -0.001 0.001 -0.728 0.467 Means RS_SEXMAT 0.014 0.005 2.566 0.010 Variances RS_SEXMAT 0.000 0.000 1.925 0.054
1. How I define a path from the random intercept of variable “SEX” on my outcome variable “FITN”? 2. Does a negative coefficient for “FITN ON RS_SEXMAT” mean that the more positive the random slope, the lower the "FITN"? 3. Is it justified to estimate a coefficient for “FITN ON RS_SEXMAT” in the case of non-significant variance for “RS_SEXMAT”?
With respect to the question 1 above, if I use that syntax I get the following error message:
*** ERROR in MODEL command Observed variable on the right-hand side of a between-level ON statement must be a BETWEEN variable. Problem with: SEX *** ERROR The following MODEL statements are ignored: * Statements in the BETWEEN level: FITN ON SEX
But if I put the variable "SEX" on the BETWEEN-list, I get this error message:
*** ERROR in MODEL command Between-level variables cannot be used on the within level. Between-level variable used: SEX
All variables measured on the between-level must be put on the between list and used only in the between part of the model. If this does not help, please send the original output and your license number to firstname.lastname@example.org.
Samuli Helle posted on Thursday, September 04, 2014 - 3:04 am
Given the model below, is there a way to constraint the variance of random slopes >0?
%WITHIN% rs_sexmat | sex ON matage; %BETWEEN% sex; sex WITH rs_sexmat; fitn ON rs_sexmat;
I tried the following but I'm not sure whether it worked because the lower 95% CI for the variance estimate was still <0.
%WITHIN% rs_sexmat | sex ON matage; %BETWEEN% sex; sex WITH rs_sexmat; fitn ON rs_sexmat; rs_sexmat (rs); MODEL CONSTRAINT: rs > 0;
Your results give the estimated variance for the random slope and you can also get the estimated variance for fitn. Using that, you can standardize the regression coefficient for fitn ON rs_sexmath (unless it already prints it), which gives the usual interpretation.
tom norton posted on Monday, October 20, 2014 - 6:33 pm
I am trying to run a multilevel SEM with level 1 predictor (X) and dependent (Y) variables, and level 2 moderator (M) and control (C) variables. Am I correct in writing my commands as follows:
The s | statement should refer to only one regression slope, so say
%within% s | y on x; y on c;
I think on Between you want to say:
s on m; y on m c;
where the first line gives the m*x cross-level interaction and the second line gives the main effects from m and c.
Star Chen posted on Wednesday, November 12, 2014 - 2:46 pm
Hi: I'm kind of new to Mplus. Lately I was trying to fit a multilevel mediation model with two mediators (each as a latent variable of several indicators). When I fit one mediator, the model runs, but when I try to model two mediators at the same time, I always get this error message. "THE MODEL ESTIMATION TERMINATED NORMALLY
THE H1 MODEL ESTIMATION DID NOT CONVERGE. CHI-SQUARE TEST AND SAMPLE STATISTICS COULD NOT BE COMPUTED. INCREASE THE NUMBER OF H1ITERATIONS."
I tried to increase H1ITERATIONS to 5000, but it still gives me the same error. When I increase it to be over 5000, Mplus appears to keep running forever, so I aborted the analysis. The model I was trying to fit is only the basic one without adjusting for any covariate. I'm wondering if there's any solutions to this problem.
Originally I also tried to model mediators as two simple variables (mean score) instead of as two latent variables, but then model estimation couldn't terminate normally even in the case of one mediator.
Djangou C posted on Thursday, March 12, 2015 - 8:08 pm
Hi Dr Muthen, I am running multilevel models in Mplus and I have 2 questions to ask. 1)This question is related to what you called ecological correlation on your posting on Thursday, April 29, 2010 - 8:09 am. The correlations of level-2 are higher than that of level-1. How do you explain this statistically? Could you please point to a paper that can explain this and I could use for reference? 2)This question is related to class separation in multilevel mixture model. In Lubke & Muthen (2007) the multivariate Mahalanobis distance was used to define class separation. I don’t see how to use the same measure in multilevel mixture model for each level as the means are only defined in one level. How do we then define the class separation in that situation? Is there another measure for class separation that only use the covariance matrices (which is available for each level)? Any reference to point to? Thank you.
Robinson, W.S. (1950). "Ecological Correlations and the Behavior of Individuals". American Sociological Review (American Sociological Review, Vol. 15, No. 3) 15 (3): 351–357. doi:10.2307/2087176.
You can also see Wikepedia for more references.
2) The same measure should be used for 2-level models. It uses the means and they are not affected by the multilevel situation - there are not separate means on the 2 levels. No references as far as I know.
Nir Madjar posted on Thursday, May 12, 2016 - 2:44 am
I wonder if there is any way to model a latent factor from between level, to be associated with a within variable. In other words, the equivalent in HLM would be a level-2 factor that is modeled on the intercept of level-1 outcome variable (e.g., class level variable that explains students’ levels outcome). For example (I know this is not a possible model in Mplus but wonder whether any other model specification will test this hypothesis):
Model: %BETWEEN% factorB BY var1 var2; %WITHIN% Outcome ON factorB;
I assume that Outcome is a student-level variable in which case you say
%Between% factorB BY... outcome on factorB;
where outcome is the latent between-level part of the Outcome variable.
Nir Madjar posted on Thursday, May 12, 2016 - 9:21 pm
Thank you for your prompt response!
I am sorry if this is my misunderstanding, but in this case the model explains latent between-level part of the Outcome variable. In my case I have a model in which students are nested within classes, and I have a latent factor at the class-level (e.g., classroom climate – BETWEEN level factor) that is hypothesized to explain variance on the student level (i.e., he model should explain latent WITHIN-level part of the Outcome variable). Is there any option to model this?