
Anonymous posted on Monday, January 28, 2002  11:43 am



I am working on a two-level SEM model (unbalanced data) and I am a new user. I have a couple of questions:
1. Is it possible to define a variable that flags clusters with only a few cases, so that I can run the analysis without them using the USEOBSERVATIONS command?
2. Is the estimation method described in Appendix 10 (page 380 of the User's Guide), called MUML, not included in the software at this moment?
3. Are there any articles or books that use the software and work through a two-level example, other than the ones I found on the Mplus web pages?
4. More specifically, what kind of constraints are required in a two-level analysis for a model to be identified? I am getting a message that one of the Beta parameters is problematic and is causing underidentification. How can I fix that?
I am not sure whether these questions have been raised before. Thank you very much for your time.


1. Unless you have a variable that contains information about cluster size, I can't think of a way to do this.
2. MUML is the estimator used for TWOLEVEL when the data are unbalanced. If the data are balanced, it is maximum likelihood.
3. I believe that the following book chapter contains Mplus examples: Heck, R. (2001). Multilevel modeling with SEM. In G.A. Marcoulides & R.E. Schumacker (Eds.), New Developments and Techniques in Structural Equation Modeling (pp. 89-127). Lawrence Erlbaum Associates.
4. I can't answer your identification question without seeing the input/output and, if possible, the data. Please send them to support@statmodel.com.
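For point 1, one way to obtain a cluster-size variable is to compute it outside Mplus and append it to the data file before reading the data in. The sketch below is only an illustration in Python: the function name, the column layout, and the toy data are assumptions, not anything from the Mplus documentation.

```python
from collections import Counter


def add_cluster_size(rows, cluster_col):
    """Return copies of rows with the size of each row's cluster appended."""
    sizes = Counter(row[cluster_col] for row in rows)
    return [list(row) + [sizes[row[cluster_col]]] for row in rows]


# Hypothetical data: column 0 is the cluster ID, column 1 is an outcome.
rows = [
    [1, 2.3], [1, 1.9], [1, 2.8],
    [2, 3.1],
    [3, 0.7], [3, 1.2],
]
augmented = add_cluster_size(rows, cluster_col=0)

# Keep only cases from clusters with at least 2 members (threshold is arbitrary).
kept = [row for row in augmented if row[-1] >= 2]
print(len(augmented), len(kept))
```

Once the extra column is saved with the data, a condition on it can be used to select observations in the analysis.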

Anonymous posted on Thursday, July 24, 2003  10:31 am



I had a handful of questions about Mplus’ multilevel covariance structure analysis (MCSA) after reading Bengt’s 1994 SMR piece.
First, Bengt writes (pg. 388): “…If all intraclass correlations are close to zero, as is the case for many applications, it might not be worthwhile to go further…”. I was under the impression that hierarchical / multilevel modeling is *always* preferable to nonhierarchical modeling, for various reasons (see, for example, Gelman et al., 1995). Is Bengt’s point that one wouldn’t expect to see much improvement in the Level-1 coefficients if the ICCs are close to zero, or rather that obtaining convergence is difficult enough that any gains in parameter estimation are probably not worth the effort required to build the model and obtain convergence?
Second, on the Mplus MCSA approach in general: am I correct in understanding that the Mplus / Muthen approach is an SEM analog to the HLM / empirical Bayes approach to hierarchical modeling?
Third, and related to the second question: in implementing MCSA in Mplus with individual data as input, does Mplus always work from a “constructed” set of variance / covariance matrices? Perhaps what I’m trying to understand is whether the Mplus approach to MCSA can be conceptualized in another manner (i.e., Bayesian / probability notation) rather than as fitting a variance / covariance matrix that has within and between components (and requires the estimation of the St, Spw, and Sb matrices). I ask only because the notation of Gelman et al. seems a bit more intuitive to me at this point.
Finally, regarding the FIML and MUML estimators described in Bengt’s 1994 SMR piece: are the current Mplus 2.14 estimators of the same name identical to FIML / MUML as described in the article? Is the Mplus ML option for MCSA the same as the FIML option in the case of unbalanced cluster sizes (or could you provide a citation describing the difference between FIML, MUML, and ML as applied in Mplus’ MCSA module)? I’ve scanned the Mplus Papers / Recent Publications listing on the front of the web page and nothing seemed applicable here. Thanks very much.

bmuthen posted on Thursday, July 24, 2003  11:16 am



Good questions. Things will become clearer as I finish an overview paper later this summer on FIML capabilities in Mplus.
Regarding the first question: yes, I was thinking pragmatically. The parameter estimates and the SEs may not be that different, and convergence can be problematic.
Second question: since the introduction of FIML in version 2.1, the approaches are analogous for the models that both can handle, e.g., models with no latent variables. With the older MUML approach, we only had random intercepts, not random slopes, and no missing data.
Third question: between and within matrices are only used for MUML, while for FIML the raw data need to be used in order to handle random slopes and missing data. With the more general models that FIML includes, one can also describe the models using the Bayesian conditioning style, level 1 given level 2, etc.
Final question: MUML is the same. The FIML algorithm is different, in that in the earlier approach FIML was done in standard SEM software simply by using as many between covariance matrices as there were distinct cluster sizes.
I hope that gives sufficient answers for now. More in the forthcoming papers.

Anonymous posted on Thursday, July 24, 2003  4:44 pm



Thanks very much. As a follow-up to your responses above (I may have one more in a bit): is MUML still (i.e., as of Mplus 2.14) the only method in Mplus that handles unbalanced clusters / "Level-2" units? I hope you'll put the paper you mention up on the Mplus website when you're finished.

bmuthen posted on Thursday, July 24, 2003  4:56 pm



No, FIML in Mplus (since version 2.1 of May 2002) handles the general case of unbalanced clusters, i.e., where different clusters have different numbers of members. Yes, the paper will be posted on the home page with examples that can be run.

Anonymous posted on Saturday, August 09, 2003  6:35 am



I have a pair of questions about running a multilevel structural model.
(1) For the random intercepts model, the output gave this warning message:
THE MODEL ESTIMATION DID NOT TERMINATE NORMALLY DUE TO A NON-ZERO DERIVATIVE OF THE OBSERVED-DATA LOGLIKELIHOOD. CONVERGENCE CRITERION FOR THE MODEL IS NOT FULFILLED. CHECK YOUR STARTING VALUES OR INCREASE THE NUMBER OF MITERATIONS. ESTIMATES CANNOT BE TRUSTED. THE LOGLIKELIHOOD DERIVATIVE FOR PARAMETER 12 IS 0.49269400D+00.
My command file is as follows:
TITLE: multilevel regression analysis
DATA: FILE IS "C:\d1.dat";
VARIABLE: NAMES ARE subject group y1 y2 x z1 z2;
  USEVARIABLES ARE group y1 y2 x z1 z2;
  CLUSTER IS group;
  MISSING ARE ALL(999);
  WITHIN = x;
  CENTERING = GROUPMEAN(x);
  BETWEEN = z1 z2;
ANALYSIS: TYPE IS TWOLEVEL RANDOM MISSING;
MODEL:
  %WITHIN%
  y2 on y1;
  y2 on x;
  y1 on x;
  %BETWEEN%
  y1 on z1 z2;
  y2 on z1 z2;
(2) For the random slopes model:
MODEL:
  %WITHIN%
  y2 on y1;
  s1 | y2 on x;
  s2 | y1 on x;
  %BETWEEN%
  y1 on z1 z2;
  y2 on z1 z2;
  s1 on z1 z2;
  s2 on z1 z2;
the following warning message was given:
THE ESTIMATED BETWEEN COVARIANCE MATRIX IS NOT POSITIVE DEFINITE AS IT SHOULD BE. COMPUTATION COULD NOT BE COMPLETED. PROBLEM INVOLVING VARIABLE ZCPSOC. THE RESIDUAL CORRELATION BETWEEN ZCPSOC00 AND ZCPSOC IS 1.000 THE MODEL ESTIMATION DID NOT TERMINATE NORMALLY DUE TO AN ERROR IN THE COMPUTATION. CHANGE YOUR MODEL AND/OR STARTING VALUES.
Please tell me how to handle these problems. Thank you very much.

bmuthen posted on Saturday, August 09, 2003  7:19 am



For the first issue, I recommend following the advice in the error message. The derivatives should be zero for all parameter estimates, but one is far from it, so MITERATIONS should be increased. This error message may indicate that this parameter (#12; see the TECH1 output) is hard to estimate, that is, it is ill-defined from the data for this model.
For the second issue, I assume that you used generic variable names, not your actual variable names, in the input you give. The error message refers to other variables, which I assume are the two y variables, that is, the between parts of y, i.e., the y intercepts. If I am correct, the error message implies that the y intercepts correlate perfectly once z1 and z2 have been accounted for. If you want, please send your input and data to support@statmodel.com.

Anonymous posted on Saturday, March 27, 2004  10:52 am



I am working on a multilevel SEM. At the within level, 'sp' predicts 'mot'. At the between level, 'lcb' predicts 'ppp', which predicts 'mot'. In other words, I want to estimate the effect of between-level factors on the individual-level factor 'mot'. I am not sure what to do with 'sp' in the between model. Reading p. 299 of the V2 User's Guide, I think that I should fix the indicators of 'sp' to 0 in the between model. When I do this by fixing these variables to 0 on one of the Level-2 factors, e.g., 'ppp', the results aren't what I would expect. If I create a new factor, 'spb', at Level 2 and fix the indicators to 0 on this factor, the results seem more like what I would expect. My code for the latter would be:
TITLE: LCB TOTAL TWOLEVEL TRIAL 1
DATA: FILE IS C:\Documents and Settings\Owner\My Documents\LCB\lcb2.dat;
  FORMAT IS A21, 3F11.0, 11F11.4, 2F13.4, 3F11.4, F11.2, F11.5, F8.2;
VARIABLE: NAMES ARE project courseid ugorgrad contenta lcbelief lcnonstu lcnontch posint fclp cllnd chlres indsoc sposint sfclp scllnd schlres sindsoc se tasmas pergoal finalgra zfinalgr stpp_com;
  CLUSTER IS courseid;
  BETWEEN = lcbelief lcnonstu lcnontch posint fclp cllnd chlres indsoc;
  USEVARIABLES ARE schlres sindsoc sposint sfclp scllnd se tasmas pergoal lcbelief lcnonstu lcnontch posint fclp cllnd chlres indsoc;
  MISSING = .;
ANALYSIS: TYPE = TWOLEVEL;
MODEL:
  %WITHIN%
  sp by sposint sfclp scllnd schlres sindsoc;
  mot by se tasmas pergoal;
  mot on sp;
  %BETWEEN%
  lcb by lcbelief lcnonstu lcnontch;
  ppp by posint fclp cllnd chlres indsoc;
  spb by sposint@0 sfclp@0 scllnd@0 schlres@0 sindsoc@0;
  ppp on lcb;
  motb by se tasmas pergoal;
  motb on ppp;
Is this correct, if I am not interested in estimating 'sposint', 'sfclp', 'scllnd', 'schlres' and 'sindsoc' in the between-level model? Thanks in advance.

bmuthen posted on Saturday, March 27, 2004  11:43 am



The easiest way to get rid of sp indicators on the between level is to put them on the Within = list in the Variable command. That says that they don't have variation across between units. 

Yifu posted on Monday, June 07, 2004  7:06 am



Hi, Dr. Muthen, It's good to see the new Mplus out. I am impressed by its ability to handle multilevel models. I am trying multilevel SEM models using Mplus 3.01, and I found that the output is different from what appears in Heck's (2001) article. First, I can't find a chi-square for the model. Second, STANDARDIZED in the OUTPUT command does not seem to work, since no standardized coefficients are shown in the output. Is there any way to get this information? Also, can the formulas in Appendix 10 of the previous Mplus manual be applied to the random slope model in Mplus 3.01? I am preparing a paper for a conference, so I am curious about the formulas and the mathematical side of this kind of model in Mplus. Thanks in advance!


Ron Heck used the MUML estimator, not maximum likelihood. That is why the output is different. With random slopes, you do not get a chi-square or standardized values because the variance changes. The appendices cover Version 2, not Version 3. The technical appendices will be updated as time permits.

Yifu posted on Monday, June 14, 2004  8:05 am



Dr. Muthen, Thanks for your reply. I wonder if you can provide me the reference for random slopes and two level model in Mplus. Besides, I am also puzzled about how to interpret the output. Here is part of my output:  Between Level S ON INDEX_1 0.425 0.018 23.764 EFF3_1 0.370 0.026 13.977 DEL3 ON EFF3_1 0.082 0.065 1.252 Intercepts CONTROL 24.183 0.153 157.710 WM1 29.864 0.160 187.122 IR1 12.403 0.100 124.388 DEL3 5.323 1.801 2.955 S 8.587 0.691 12.420  S is the random slope for del3 ON pare in within model. Pare is a latent variable with three inidcators. Could I use the following equation to interpret the model? del3=5.323  0.082Eff3_1 + 8.587pare + 0.425 (pare*Index_1)  0.37 (pare*Eff3_1). 

bmuthen posted on Monday, June 14, 2004  10:25 am



Yes, that would be the equation predicting del3, so using the average s value. Note that s also has a residual variance. The reference you ask for is Asparouhov and Muthen 2003a (in preparation) with the working title given in Muthen (2004) on the Mplus web site. 
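Written out in multilevel notation, the equation discussed here comes from substituting the between-level regressions into the within-level one. The symbols below (gamma, u, r) are notation assumed for this sketch, not Mplus output labels: the gammas stand for the between-level intercepts and ON coefficients in the output, and u_{1j} is the residual of s mentioned in the reply.

```latex
\begin{align*}
% Within level: random intercept and random slope for cluster j
\mathrm{del3}_{ij} &= \beta_{0j} + s_j\,\mathrm{pare}_{ij} + r_{ij} \\
% Between level: intercept and slope regressed on cluster-level covariates
\beta_{0j} &= \gamma_{00} + \gamma_{01}\,\mathrm{Eff3\_1}_j + u_{0j} \\
s_j &= \gamma_{10} + \gamma_{11}\,\mathrm{Index\_1}_j + \gamma_{12}\,\mathrm{Eff3\_1}_j + u_{1j} \\
% Combined equation after substitution
\mathrm{del3}_{ij} &= \gamma_{00} + \gamma_{01}\,\mathrm{Eff3\_1}_j + \gamma_{10}\,\mathrm{pare}_{ij}
  + \gamma_{11}\,\mathrm{pare}_{ij}\,\mathrm{Index\_1}_j + \gamma_{12}\,\mathrm{pare}_{ij}\,\mathrm{Eff3\_1}_j \\
  &\quad + \left(u_{0j} + u_{1j}\,\mathrm{pare}_{ij} + r_{ij}\right)
\end{align*}
```

The fixed part matches the equation in the question; the term in parentheses is the part it leaves out, and it shows why the residual variance of del3 depends on pare when the slope is random.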

Anonymous posted on Saturday, September 11, 2004  12:40 pm



I have one question about running a two-level path analysis as follows:
TITLE: twolevel path analysis with continuous dependent variables
DATA: FILE IS c:\e1.dat;
VARIABLE: NAMES ARE y1 y2 x1 x2 w clus;
  USEVARIABLES ARE y1 y2 x1 x2 w clus;
  WITHIN = x1 x2;
  BETWEEN = w;
  CLUSTER IS clus;
  Missing = .;
ANALYSIS: TYPE = TWOLEVEL;
MODEL:
  %WITHIN%
  y2 ON y1 x2;
  y1 ON x1 x2;
  %BETWEEN%
  y2 y1 ON w;
I got the following error message: *** ERROR This analysis is only available with the Multilevel or Combination Add-On. I would greatly appreciate it if I could hear from you soon. Thank you so much.


You must have the Base program or the Base plus the Mixture Add-On. Multilevel is not available with these programs.

suppawan posted on Friday, October 15, 2004  6:26 am



Dr. Muthen, Does Mplus version 3 allow for multilevel SEM with a non-recursive model, and are there any examples or references that explain this issue? Thank you so much.


Yes. We don't have an example but you can modify a recursive model. 


Hello, A colleague and I are new to MPlus but attended several days of your short courses. We would like to replicate one of the analyses that you did in class for the multigroup multilevel model using the NELS:88 data set. I looked through the website but couldn't find it. Are the data on the website or could we get a copy of the data? Thank you so much. 


It's not available on our website. It can be ordered because it is in the public domain. We could send you the subset of the data we worked with, but it would probably be best for you to go back to the original data if you are going to do any serious research. 


Dr. Vandenberg sends his regards and suggested I post this question here. I’m working on a multilevel dataset and came up with a conceptual/practical question I thought you may be able to provide advice about. We have longitudinal data (3 time periods) at the store level for 21 stores; however, not all of the same individuals filled out the survey at all three time periods (but the same stores did). We have DVs at the individual level (e.g., empowerment) and group level (e.g., store performance). We would like to run a multilevel model growth model; however, because we are restricted by sample size (N=21 at store level) we are thinking of running some type of cohort design analysis. My questions are: 1) can we run a multilevel growth model cohort designt level, 2) what are the caveats for taking a 'cohort design' approach, and 3) can you point me to an example? Thanks in advance for your time and consideration. 

bmuthen posted on Thursday, March 10, 2005  4:22 pm



It sounds like you want to have a type=twolevel model with cluster=store ("Between" in Mplus) and individuals spread over time points as Within. And it sounds like on Within you want to have multiple cohorts of sets of different individuals, I assume for the same 21 stores. I haven't seen this done, but I think it can be done. But first let me ask if I am understanding you correctly.

Anonymous posted on Tuesday, March 15, 2005  7:50 am



Hello, in some of your papers I read that MUML is useful with unbalanced groups and small sample sizes. So, I used this estimator to compute multilevel CFAs. But the TECH1 output looked quite different with MUML than with the default ML estimator. Is this a problem with the program, or is the model really calculated differently? Thank you very much!


With balanced data, MUML is ML. With unbalanced data it is not. The TECH1 matrices will look different as the model is set up differently. The results should not look that different. 

Anonymous posted on Thursday, March 17, 2005  11:01 am



Hello, I'm new to Mplus and I'm slowly trying to build my way up to a complex multilevel path model. For some reason, I get the following error message: *** ERROR One or more between-level variables have variation within a cluster for one or more clusters. Check your data and format statement. This is simply not true: I can check it 'manually' by looking at the data, but I also know from the way the level 2 variable was created that it is impossible. So I must be doing something else wrong. I'll be very grateful for your suggestions/advice.


You are most likely reading your data incorrectly. This sometimes happens in free format if you have blanks in your data. Check the data set for blanks. If you can't see what is happening, send the data and input along with your license number to support@statmodel.com and I will sort it out. 
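A quick way to look for this kind of problem before sending the files is to scan the data file outside Mplus. The following is a rough Python sketch under stated assumptions: the data are whitespace-delimited and numeric, and the function name, column positions, and toy lines are hypothetical. It flags rows that do not have the expected number of fields (which is what a blank produces in free format) and clusters in which a between-level variable takes more than one value.

```python
from collections import defaultdict


def check_free_format(lines, n_vars, cluster_col, between_col):
    """Return (row numbers with a wrong field count, clusters where the
    between-level variable takes more than one value)."""
    bad_rows = []
    values = defaultdict(set)
    for number, line in enumerate(lines, start=1):
        fields = line.split()
        if len(fields) != n_vars:
            bad_rows.append(number)  # a blank entry makes the row too short
            continue
        values[fields[cluster_col]].add(float(fields[between_col]))
    varying = sorted(c for c, seen in values.items() if len(seen) > 1)
    return bad_rows, varying


# Hypothetical file contents: cluster ID, a within variable, a between variable.
lines = [
    "1 0.52 3.0",
    "1 0.71 3.0",
    "2      4.0",   # blank where the within variable should be
    "2 0.93 4.0",
    "3 0.10 5.0",
    "3 0.44 5.5",   # between variable differs within cluster 3
]
bad_rows, varying = check_free_format(lines, n_vars=3, cluster_col=0, between_col=2)
print(bad_rows, varying)
```

With a real file, the list of lines would come from reading the data file; rows reported in the first list are the ones to inspect for blanks.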


In regards to the multilevel cohort design I mentioned, you are correct in your understanding. We have 21 different stores (with hundreds of individuals in each store) across 3 time periods (although the individuals are different at each time period, but the stores are the same). Thanks. 

bmuthen posted on Friday, March 25, 2005  9:47 am



Are all the individuals different at the different time points? I assume no, since you were talking about multiple cohorts which suggests longitudinal data. 


You are correct. Not all of the individuals are different at each time period. For each store, we do have quite a few individuals that we know took the survey at time 1 and time 2, and at time 2 and time 3 (and we can assume some took it at all 3 time periods, but we don't have that information because we only asked about the previous time period and we didn't track individuals). At time 2, 50% of the sample had taken the survey at time 1. At time 3, 44% of the sample had taken the survey at time 2.

BMuthen posted on Saturday, April 02, 2005  8:34 pm



I will get back to you on this after April 19 if you repost this to remind me. This is a research topic that Mplus is looking into. 

Anonymous posted on Friday, April 08, 2005  2:49 pm



I've been running some simple random coefficient models in Mplus and HLM6.0. The model involves a random coefficient for a Level-1 dummy variable X. I've noticed that, for clusters which don't contain both values of the dummy variable X, HLM6.0 appears to drop the entire unit from the analysis. It's not clear what Mplus does in this situation, however. As a result, although my Mplus and HLM6.0 results are similar, they appear (from the HLM6.0 and Mplus outputs) to have been computed on slightly different sample sizes. Would you provide some guidance on how Mplus handles situations where a random coefficient pertains to a dummy variable and both values are not present in some clusters? Thanks.

BMuthen posted on Sunday, April 10, 2005  2:48 am



Mplus does not drop these clusters because they contribute to the ML estimation of the intercept. Perhaps the differences you see are due to using REML rather than ML in HLM or because of different convergence criteria.


Hello, I am a new Mplus user running a multilevel (individuals nested in schools) SEM model and need help with the syntax for the second level. The school-level variables I intend to include are “School Risk” (measured as Rhsfree) and “School Location” (measured as urban). Both are dichotomous variables and I want to estimate their slopes and intercepts on “Perception” (measured as percept1) and “Attitude” (measured by utq41a utq41b utq41c utq41d). I would be most grateful if you would help me write the syntax. I look forward to hearing from you. Below is my input for the level-one analysis. Thank you.
INPUT INSTRUCTIONS
TITLE: Diss
DATA: FILE IS "C:\Mplus Files\Diss67.dat";
  FORMAT IS 79F8.2;
VARIABLE: NAMES ARE Crsswlk ms7dist ms8dist hs9dist Rhsfree utq39 utq40 utq41a utq41b utq41c utq41d n2q49 n2q50 n2q51a n2q51b n2q51c n2q51d upq40c n1q44cp7 n1q44c_7 n1q44c_8 n1q44c_9 AGE sex anyuse1 anyuse2 anyuse3 anyuse4 anyuse5 Urban Mdelig rutq40 rn2q50 rage8 rage9 percept1 percept2 percept3 percept4 percept5 white latino black other deviance rpercp1 rpercp2 rpercp3 rpercp4 rpercp5 ATTITUD1 ATTITUD2 treatms stressms ethnic Asian Amind othrace misrace Region upq37a upq37b upq37c upq37d upq37e upq37f upq37g upq37h q10c tq10c eq10g n1q7g n2q7g q10f tq10f eq10j n1q7j n2q7j instper;
  USEVARIABLES ARE utq41a utq41b utq41c utq41d rutq40 n1q44cp7 percept1 latino black other age sex anyuse1 deviance;
  CLUSTER IS hs9dist;
  MISSING IS blank;
ANALYSIS: TYPE IS COMPLEX MISSING H1;
  ITERATIONS = 1000;
  CONVERGENCE = 0.00005;
MODEL:
  attitude BY utq41a utq41b utq41c utq41d;
  attitude ON rutq40 percept1 anyuse1 deviance;
  percept1 ON rutq40 n1q44cp7 latino black other age sex anyuse1 deviance;
OUTPUT: RESIDUAL STANDARDIZED MODINDICES;


The examples in Chapter 9 cover multilevel analysis. Instead of COMPLEX, you would want TWOLEVEL. You then need to specify the within and between parts of the model. See Example 9.6 to get started. School-level variables should be listed on the BETWEEN statement of the VARIABLE command. Chapter 14 contains a section that describes special options for two-level models. You should also read this.


Thanks very much. As a follow-up, I tried following Example 9.6 but I received an error message saying “This analysis is only available with the Multilevel or Combination Add-On”. Reading through the responses to some of the questions posted on the Discussion page, I realized the problem is possibly that I have the Base program or the Base plus the Mixture Add-On, which do not include Multilevel. Is there a way for me to get the program that includes Multilevel? Thank you.


Yes, you would need the appropriate add-on to do the analysis. You can contact sales@statmodel.com to see what you would have to do to upgrade to the Base Program and the Multilevel Add-On.


Hi, Dr. Muthen: I tried to simulate a two-level CFA model with Mplus, with within-level population parameters exactly the same as in a paper. I have 15 indicators predicted by 3 factors at the within level. In the between-level model, I have 2 factors and 10 indicators. Mplus always gives the error message "THE ESTIMATED WITHIN COVARIANCE MATRIX IS NOT POSITIVE DEFINITE AS IT SHOULD BE. THE VARIANCE OF w2 (one between-level indicator) APPROACHES 0." In fact, the population variance of w2 is .5. I do not know how to make sure the covariance matrix of a two-level CFA model can be inverted and computed. Do you have any suggestions on how to set up valid population parameters for a two-level SEM model in the large covariance case? My code is below. I very much appreciate your help.
montecarlo: names are x1-x15 w1-w10;
  nobservations = 500;
  ncsizes = 1;
  csizes = 50(10);
  seed = 58459;
  nreps = 500;
  save=c:/p/1/m*.dat;
ANALYSIS: TYPE = TWOLEVEL ;
MODEL POPULATION:
  %WITHIN%
  y1-y3@1;
  y1 BY x1@.7 x2@.7 x3@.75 x4@.8 x5@.8;
  y2 BY x4@.7 x6@.7 x7@.7 x8@.75 x9@.8 x10@.8;
  y3 BY x1@.7 x11@.7 x12@.7 x13@.75 x14@.8 x15@.8;
  x1@.51 x2@.51 x3@.4375 x4@.36 x5@.36 x6@.51 x7@.51;
  x8@.4375 x9@.36 x10@.36 x11@.51 x12@.51 x13@.4375 x14@.36 x15@.36;
  y1 WITH y2@.5;
  y2 WITH y3@.3;
  y1 WITH y3@.4;
  %BETWEEN%
  yb1-yb2@1;
  yb1 BY w1@.8 w2@.8 w3@.75 w4@.7 w5@.7;
  yb2 BY w1@.7 w4@.7 w6@.8 w7@.8 w8@.75 w9@.7 w10@.7;
  w1@.51 w2@.51 w3@.4375 w4@.36 w5@.36 w6@.51 w7@.51 w8@.4375 w9@.36 w10@.36;
  yb1 WITH yb2@.6;
MODEL:
  %WITHIN%
  y1-y3;
  y1 BY x1 x2 x3 x4 x5 ;
  y2 BY x4 x6 x7 x8 x9 x10;
  y3 BY x1 x11 x12 x13 x14 x15;
  x1-x15;
  y1 WITH y2;
  y2 with y3;
  y1 with y3;
  %BETWEEN%
  yb1-yb2;
  yb1 BY w1*.1 w2 w3 w4 w5;
  yb2 BY w1*.1 w4 w6 w7 w8 w9 w10;
  w1-w10;
  yb1 WITH yb2;
output: tech9;


What you have specified is that the population residual variance of w2 is .51. If you have a problem deciding on population values, you might want to look at Example 11.7 in the User's Guide.

wendy posted on Tuesday, May 23, 2006  10:11 pm



Hi, Dr. Muthen: I very much appreciate your response. I have a question about simulating a two-level CFA model under different ICC conditions, .2 and .3. For example, I have a two-level CFA model with 2 factors, y1 and y2 at the within level and yb1 and yb2 at the between level; y1 is correlated with y2, yb1 is correlated with yb2, and each factor has 3 indicators. Mplus gives an intraclass correlation for each of the 6 indicators, and I do not know how to control the ICC as a condition when simulating data. Sometimes the ICCs of different indicators vary considerably. Do we need to check the average ICC of the six indicators, or is there an alternative method for calculating a single ICC? For example, my model has 2 related factors, yb1 and yb2, at the between level; how is the ICC calculated in this case? Someone recommended changing the between-level factor variance, but I do not know whether that works.


By changing between-level factor or factor indicator variances, you change the ICC's. You can look at the Monte Carlo examples for Chapter 9 that come on the Mplus CD to get an idea of how this works. You can also just try changing the variances and see how this affects the ICC's.
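To make the relation between the variances and the ICC concrete, here is a small Python sketch. It assumes an indicator that loads on a single factor at each level, so that its model-implied variance at a level is loading squared times factor variance plus residual variance, and it takes the ICC as the between variance over the total variance. The function names and the numbers are illustrative assumptions, not values from any Mplus example.

```python
import math


def indicator_variance(loading, factor_variance, residual_variance):
    """Model-implied variance of an indicator that loads on a single factor."""
    return loading ** 2 * factor_variance + residual_variance


def icc(between_variance, within_variance):
    """Intraclass correlation: between-level share of the total variance."""
    return between_variance / (between_variance + within_variance)


def between_variance_for_icc(target_icc, within_variance):
    """Between-level variance that yields target_icc for a given within variance."""
    return target_icc / (1.0 - target_icc) * within_variance


# Hypothetical population values for one indicator.
within = indicator_variance(loading=0.7, factor_variance=1.0, residual_variance=0.51)
between = indicator_variance(loading=0.7, factor_variance=0.25, residual_variance=0.1275)
print(round(within, 4), round(between, 4), round(icc(between, within), 4))

# Between-level variance needed for an ICC of .3 with the same within part.
needed = between_variance_for_icc(0.3, within)
print(math.isclose(icc(needed, within), 0.3))
```

For an indicator that loads on two correlated factors, the implied variance also includes twice the product of the two loadings and the factor covariance, so the first function would need that extra term.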

wendy posted on Friday, May 26, 2006  7:48 pm



Hi, Dr. Muthen: I am running a two-level SEM model. The output did not show CFI, and SRMR is shown as a total value and a within-level value. Do you know how to explain this? Does that mean CFI is not meaningful in a multilevel SEM model?


If you get chi-square, you should get CFI and TLI. Perhaps the baseline model did not converge. SRMR is given for the between and within parts of the model.

wendy posted on Saturday, May 27, 2006  3:24 pm



Dr. Muthen: Thanks for your comments. I think my two-level CFA model probably has a convergence problem, although it has very good parameter estimates, and hence the results did not show CFI and TLI. I also still have a problem with simulating a valid two-level CFA model. Mplus always indicates that THE MODEL ESTIMATION DID NOT TERMINATE NORMALLY DUE TO AN ILL-CONDITIONED FISHER INFORMATION MATRIX. My model should be appropriate and I tried several different starting values, but this did not work. Example 11.7 is different from my case since it reads data from a real data set. I need to simulate data using population parameters. Is there any formula or an example with empirical population parameters that I could use for generating a valid two-level CFA model? Thanks. Additionally, I am still confused about the ICC for a two-level CFA model. Suppose I have 6 indicators in my between-level model, and 6 intraclass correlations are shown for the first replication. Do you know the formula and how they are calculated? ICC is between-level variation over total variation.


The Mplus CD you got also contains Monte Carlo versions of each of the examples in the User's Guide. I think ex 9.6 might be close to what you want. An ill-conditioned information matrix may imply that the model you specify is not identified.

wendy posted on Saturday, May 27, 2006  6:33 pm



Dr. Muthen: I ran the Mplus Monte Carlo version of ex9.6 and it did not give CFI and TLI. My two-level CFA simulation model has the same problem; it shows chi-square, RMSEA and SRMR. Does the Mplus program simply not give CFI in this case?


I didn't realize that you were referring to a Monte Carlo output. We don't do CFI and TLI with Monte Carlo because it requires running a baseline model at each replication in addition to the regular analysis. 


You can get CFI and TLI if you do this as an external Monte Carlo. See Example 11.6 in the Mplus User's Guide. 


Dear Dr. Muthen, I am running multilevel SEM analyses using TYPE=COMPLEX MISSING along with the stratification, cluster, and weight options. The analyses terminate normally and provide the fit indices; however, I cannot get standardized estimates or latent R-squares. Are these calculations not available in multilevel analyses?


Do you have the STANDARDIZED option in the OUTPUT command? 


Yes, I do include standardized in the output command. 


Then you will need to send your input, data, output, and license number to support@statmodel.com so I have more information to answer your question. 


Ok, thanks! 


I am doing a two-level analysis. I am very surprised that when I delete the 'mcp with mautive' statement (which the model requires), the chi-square drops from 51647.675 to 29.439, and the model fit changes from 0 to 0.980. Such a large change confuses me. Why does this happen? Thanks in advance.
ANALYSIS: TYPE = TWOLEVEL;
MODEL:
  %within%
  effort by pmeff tmeff;
  impul by pmimp tmimp;
  effort on mautive mcp;
  impul on mautive mcp;
  edrad on effort impul;
  mautive with mcp;
  effort with impul;
  tmimp with tmeff;
  tmimp with edrad;
  tmeff with edrad;
  %between%
  tmimp with tmeff;
  tmimp with edrad;
  tmeff with edrad;
  edrad;
  tmimp;
  tmeff;


You should not put WITH statements in the model for covariates. When you do this, distributional assumptions are made about these variables. Covariates should be mentioned only on the right-hand side of ON.


I have a question about estimating multiple-group, multilevel SEM models. I'll start with a conceptual description. If you need the code, I can send it to you. I have nested data, with students nested in classrooms. There are 28 classrooms and 480 students. I wanted to run multilevel models to test the effects of two classroom-based interventions (total effects for X1 and X2) and also to test the indirect effects of the program via theoretical mediators (A*B). I used type=twolevel. One of my research questions is: are the indirect effects moderated by gender (moderated mediation)? However, when I tried to run a multiple-group model, with grouping = gender and cluster = classroom, the model would not run. I was told this is because of problems with the covariance matrix, perhaps because more than one gender is represented in each cluster. Is this correct? Is there any way to get around this?
Also, I was unable to test indirect effects when I included the intervention variables as predictors in the level 2 (or between-level) equations (all other predictors were included in the within-level model). So, I included them in the within-level equation with the rest of the fixed effects. This is not ideal because the intervention was delivered at the classroom level. Did I do something wrong, or is this a glitch in the type=twolevel program? Mary Terzian


Please send your input, data, output, and license number to support@statmodel.com so we can see exactly what is happening. 

abg posted on Monday, March 05, 2007  11:18 am



I have a question about multilevel SEM models where level-1 variables only appear as outcomes of level-2 variables. Children are nested within classrooms (T4ADULT T4IGNORE T4PSOLVE T4PASSV T4REVENG T4VICT are child-level vars). Is this the correct way to specify such a model?
USEVAR ARE TID T4ADULT T4IGNORE T4PSOLVE T4PASSV T4REVENG T4VICT PARb SEPb INDb AVb ASb PUNb T4NMb T4AAb T4AVb;
BETWEEN = T4NMB T4AAB T4AVB ASb INDb AVb SEPb PUNb PARb;
CLUSTER = TID;
ANALYSIS: TYPE = TWOLEVEL;
Model:
  %within%
  T4VICT on T4PASSV;
  T4VICT on T4PSOLVE;
  T4VICT on T4ADULT ;
  T4VICT on T4REVENG;
  %between%
  T4PASSV on PARB;
  T4IGNORE on PARB PUNB;
  T4PSOLVE on PARB INDB;
  T4ADULT on PUNB;
  T4REVENG on SEPB AVB;
  SEPB on PUNB T4AVB;
  ASB on PARB T4AAB;
  AVB on T4AVB;
  T4AVB on T4NMB;
The output contains these messages:
THE LOGLIKELIHOOD DECREASED IN THE LAST EM ITERATION. CHANGE YOUR MODEL AND/OR STARTING VALUES.
THE MODEL ESTIMATION DID NOT TERMINATE NORMALLY DUE TO AN ERROR IN THE COMPUTATION. CHANGE YOUR MODEL AND/OR STARTING VALUES.
A MATRIX COULD NOT BE INVERTED DURING THE BASELINE MODEL ESTIMATION. THE ESTIMATED BETWEEN COVARIANCE MATRIX IS NOT POSITIVE DEFINITE AS IT SHOULD BE. COMPUTATION COULD NOT BE COMPLETED. THE VARIANCE OF T4PASSV APPROACHES 0. FIX THIS VARIANCE AND THE CORRESPONDING COVARIANCES TO 0, DECREASE THE MINIMUM VARIANCE, OR SPECIFY THE VARIABLE AS A WITHIN VARIABLE.


It seems you have no variability of t4passv on the between level. You should do a TYPE=BASIC TWOLEVEL; and look at the ICC for that variable. If it is small, you should put the variable on the WITHIN list and remove it from the between part of the model. 
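For a rough look at the between-level variability of a variable outside Mplus, a one-way ANOVA estimate of the ICC can be computed from the raw data. This is a different estimator from the one Mplus reports, so the numbers need not match; the Python sketch below is only a screening aid, and the function name and toy data are assumptions. It expects at least two clusters and some variation in the data.

```python
def anova_icc(clusters):
    """One-way ANOVA estimate of ICC(1) for possibly unbalanced clusters.

    clusters maps a cluster ID to the list of observed values in that cluster.
    """
    groups = [values for values in clusters.values() if values]
    n_total = sum(len(g) for g in groups)
    n_clusters = len(groups)
    grand_mean = sum(sum(g) for g in groups) / n_total

    # Between- and within-cluster sums of squares.
    ss_between = sum(len(g) * (sum(g) / len(g) - grand_mean) ** 2 for g in groups)
    ss_within = sum((y - sum(g) / len(g)) ** 2 for g in groups for y in g)

    ms_between = ss_between / (n_clusters - 1)
    ms_within = ss_within / (n_total - n_clusters)

    # Average cluster size, adjusted for unequal cluster sizes.
    n0 = (n_total - sum(len(g) ** 2 for g in groups) / n_total) / (n_clusters - 1)
    return (ms_between - ms_within) / (ms_between + (n0 - 1) * ms_within)


# Hypothetical data: clear cluster differences versus none at all.
strong = {"a": [1.0, 3.0], "b": [5.0, 7.0]}
flat = {"a": [1.0, 3.0], "b": [1.0, 3.0]}
print(round(anova_icc(strong), 3), anova_icc(flat))
```

A value near zero or negative, as in the second toy data set, points to the situation described above, where the variable has essentially no between-level variance.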

abg posted on Tuesday, March 06, 2007  1:30 pm



Thanks very much. I'll do that. 

Stephan posted on Thursday, February 14, 2008  9:24 pm



Hello, I apologize for this long question. I’m investigating the relationship between married couples. Three endogenous latent variables measure ‘satisfaction’, and seven exogenous latent variables are influence factors. However, there are three systematic differences which could have an effect on all variables. These are the ‘Motives’ for marriage, let’s say “Forced by family”, “Money” and “Love”. My model contains cluster = country of origin. (…) Variable: Names Are x1-x40 Mo1-Mo3 Clus; Usevariables Are x1-x40 Mo1-Mo3 Clus; Between = Mo1-Mo3; Cluster Is clus; Analysis: Type = Twolevel; Model: %Within% F1 By x1@1 x2-x4; (…) F4 By x17@1 x18-x20; (…) F1-F3 ON F4-F11; %Between% F1a By x1@1 x2-x4; (…) F4a By x17@1 x18-x20; (...) F1a-F3a ON F4a-F7a; F1a-F7a ON Mo1-Mo3; (...) Question 1: Is it correct not to use the ‘Within Is’ statement? Question 2: Does it make sense to regress all latent variables in the %between% statement on the 3 Motives that might influence group differences? Question 3: Or would it be better to use these 3 motives as cluster variables (Cluster = 1,2,3)? But how should I, in that case, implement ‘country’? Question 4: Or should I use LCA (with c=3) and then regress all IV and DV latent variables on c? Any hint is highly appreciated. Stephan


I would create two dummy variables and use them as covariates. 
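As a small illustration of the dummy-variable suggestion, the Python sketch below turns a three-category variable into two 0/1 indicators, with one category as the reference. The function name, the category labels, and the choice of reference category are all hypothetical; in practice the coding would be done in the data file or a DEFINE step before the variables are used as covariates.

```python
def dummy_code(values, reference):
    """Return one 0/1 indicator list per non-reference category."""
    levels = sorted(set(values) - {reference})
    return {level: [1 if v == level else 0 for v in values] for level in levels}

# Hypothetical motive data for four respondents
motives = ["love", "money", "forced", "love"]
dummies = dummy_code(motives, reference="love")
print(dummies)
```

Three categories need only two indicators, because a respondent with zeros on both belongs to the reference category.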

J.W. posted on Friday, February 15, 2008  1:57 pm



In multilevel modeling, it is usually assumed that a lower-level unit (e.g., student) is nested within only one higher-level unit (e.g., school). However, in some hierarchically structured data (e.g., snowball samples, chain referral samples, RDS samples, …) individuals are likely to be nested in multiple personal networks; e.g., A is in B’s network as well as in C’s and others’ networks. In such data, the difficulty of identifying a unique higher-level unit for each individual makes multilevel modeling difficult. Is there any way in Mplus to handle observation dependence in such data? Thank you very much for your help!


This is referred to as crossed random effects and is not yet available in Mplus. 

Stephan posted on Sunday, February 17, 2008  4:05 pm



Hi Linda, thanks for the response. So, do you think that there’s no nested structure? However, I am interested in the model path coefficients of those who married because of love, those who were forced, and so on. My hypothesis is that in the case of ‘force’ all coefficients are statistically non-significant, and in the case of ‘money’ they are significant but lower than in the case of ‘love’. Thanks a lot for your help. Cheers, Stephan


I don't see that there is any clustering related to the categories unless there is something you are not saying. It sounds like people are in one of three observed groups. You can either use two dummy variables or multiple group analysis.

Stephan posted on Monday, February 18, 2008  2:09 pm



Hi Linda, Yes, that's it. Sorry, I've mixed up multiple group and multilevel analysis. Thanks a lot for the advice. Stephan


Hi, I would like to simulate multilevel data for a CFA problem and then estimate the CFA with a nonhierarchical model. I see how to simulate the multilevel data for the CFA situation, but when I try estimating the model with the nonhierarchical model I get an error message saying I need to use either %Between% or %Within%. I've tried both and get different results. My question is, which should I use to analyze the data in the naive (nonhierarchical) way, or is there another approach that I'm missing? Thanks very much. Holmes 


Please send your output and license number to support@statmodel.com. 


Dr. Muthen, regarding CFI in a multilevel model, how does Mplus define the "baseline model"? Is there any document or other information I can find on this?


The baseline model is the means and the between and within level variances of the dependent variables. 


Hello, I am considering doing a multilevel model with NAEP data. I have used HLM in the past but was wondering how or if Mplus could accommodate plausible values. Thank you.


I think plausible values are a form of multiple imputation. If so, Mplus does not create the data sets but can analyze them correctly using TYPE=IMPUTATION;
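For readers unfamiliar with how results from several imputed (or plausible-value) data sets are combined, the sketch below applies the standard combining rules of Rubin (1987) to one parameter. It is an illustration under stated assumptions: the function name and the numbers are hypothetical, and the exact computations behind TYPE=IMPUTATION should be checked in the Mplus User's Guide.

```python
from math import sqrt
from statistics import mean, variance

def pool_rubin(estimates, standard_errors):
    """Pool one parameter across m imputed data sets.

    Returns the pooled estimate and its pooled standard error.
    """
    m = len(estimates)
    pooled = mean(estimates)
    within = mean(se ** 2 for se in standard_errors)  # average sampling variance
    between = variance(estimates)                     # variance across data sets
    total = within + (1 + 1 / m) * between
    return pooled, sqrt(total)

# Hypothetical slope estimates from five plausible-value data sets
est, se = pool_rubin([0.42, 0.45, 0.40, 0.47, 0.44],
                     [0.10, 0.11, 0.10, 0.12, 0.10])
print(est, se)
```

The pooled standard error is larger than the average single-data-set standard error whenever the estimates differ across data sets, which is the uncertainty that analyzing a single plausible value would ignore.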


Hi, I'm running a two-level SEM in Mplus 4.2. I use a dichotomous variable at both the within and between levels, but this version doesn't seem to support it. My question is whether the latest version supports this. I want to use the "CATEGORICAL =" option for the variables, but I get the message "*** ERROR in Analysis command ALGORITHM = INTEGRATION is not available for multiple group analysis. Try using the KNOWNCLASS option for TYPE = MIXTURE." If I upgrade, will this be possible?


The error message refers to the fact that you need to use the KNOWNCLASS option instead of the GROUPING option for multiple group analysis with ALGORITHM=INTEGRATION. This is still the case. It has nothing to do with the CATEGORICAL option.


Hello, I am running an SEM with nested data (students within classrooms) and trying to create a latent school readiness factor (srw by Eclr Eltr Enmbr Eshp Esize and srb by Eclr Eltr Enmbr Eshp Esize). I am getting an error report similar to the one posted on August 9, 2003: THE ESTIMATED BETWEEN COVARIANCE MATRIX IS NOT POSITIVE DEFINITE AS IT SHOULD BE. COMPUTATION COULD NOT BE COMPLETED. PROBLEM INVOLVING VARIABLE ECLR. THE CORRELATION BETWEEN ENMBR AND ECLR IS 0.994 THE CORRELATION BETWEEN ENMBR AND ELTR IS 0.996 THE CORRELATION BETWEEN ESHP AND ENMBR IS 0.994 THE MODEL ESTIMATION DID NOT TERMINATE NORMALLY DUE TO AN ERROR IN THE COMPUTATION. CHANGE YOUR MODEL AND/OR STARTING VALUES. I am confused, because I thought a high correlation between indicators of a latent factor was a good thing, since they are all indicators of the underlying construct. Could you please help me interpret this error message? Thank you.


Factor indicators for the same factor should correlate. But factor indicators that correlate at or near one are statistically indistinguishable and contribute nothing beyond each other.
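A quick way to see why correlations such as 0.994 or 0.996 are a problem is to look at how little unique information is left. For two standardized variables with correlation r, the determinant of their 2x2 correlation matrix is 1 - r^2, which is also the share of variance in one variable not shared with the other. The Python sketch below computes it; the function name is hypothetical.

```python
def unique_share(r):
    """Determinant of the 2x2 correlation matrix [[1, r], [r, 1]].

    Equals 1 - r^2, the proportion of variance not shared by the two variables.
    """
    return 1.0 - r ** 2

for r in (0.5, 0.994, 0.996):
    print(r, round(unique_share(r), 4))
```

At r = 0.996 less than one percent of the variance is unique, so the between-level covariance matrix sits very close to singular, and small sampling fluctuations can push the estimate outside the positive definite range.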


How can I define the baseline model in Mplus? In other words, how can I get the parameter estimates for the baseline model in Mplus?


You would need to specify the baseline model in the MODEL command. 

Unai Elorza posted on Wednesday, August 26, 2009  6:36 am



Hello, I am a newcomer to Mplus and multilevel modeling. I am testing a 2=>1=>1 mediation model: a level-2 variable contributing to a level-1 mediator which, in turn, contributes to a level-1 dependent variable. All of them are observed variables. When I try to interpret the output file, I find three different types of standardized coefficients: STD, STDY and STDYX. One of them (STDYX) gives me a standardized coefficient higher than 1 (all correlations are lower than 1; there are no covariances higher than 1). So my questions are: 1) Which standardized parameters should I take into account? 2) Does a coefficient higher than 1 mean that a Heywood case is arising? How could I overcome it? Thank you very much in advance. Unai


1. STDYX is used for continuous covariates. STDY is used for binary covariates. 2. With more than one covariate, a regression coefficient can be greater than one. You should always check for Heywood cases. 
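The second point can be checked with a little arithmetic. With two correlated covariates, the standardized slopes follow directly from the three correlations, and a slope above one can occur even though every correlation is below one and the residual variance stays positive. The Python sketch below uses hypothetical correlations chosen to form a valid (positive definite) correlation matrix.

```python
def std_slopes(r_y1, r_y2, r_12):
    """Standardized slopes for y regressed on x1 and x2, plus R-squared.

    r_y1, r_y2: correlations of y with x1 and x2; r_12: correlation of x1 with x2.
    """
    denom = 1.0 - r_12 ** 2
    b1 = (r_y1 - r_y2 * r_12) / denom
    b2 = (r_y2 - r_y1 * r_12) / denom
    r_squared = b1 * r_y1 + b2 * r_y2
    return b1, b2, r_squared

# Hypothetical correlations: the covariates correlate 0.8 with each other
b1, b2, r2 = std_slopes(r_y1=0.8, r_y2=0.4, r_12=0.8)
print(round(b1, 3), round(b2, 3), round(r2, 3))
```

Here the first slope exceeds one while R-squared is 0.8, so the residual variance is positive and there is no Heywood case. A Heywood case would show up as a negative variance or an R-squared above one, not merely as a large standardized slope.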

M Hamd posted on Wednesday, April 28, 2010  11:29 am



Dr. Muthen, I am running an MSEM model which terminates normally (no negative variances or other errors). It is a 1-1-2 mediation model such that X -> M -> Y. However, the correlation between two of the latent variables (more specifically, X and M) at the group level is very high (.97), while at the individual level it is much lower (.23). Q1: Does this indicate problems in the data, or is it normal for level-2 correlations to be higher? Moreover, the standardized path coefficients between "X and Y" and "M and Y" are 4 and 3 respectively. Karl Joreskog suggests this is OK but may indicate multicollinearity. However, the VIFs (for individual-level variables) are fine. Q2: Is there some other error I should look for before concluding these results are fine? Q3: Is there some other source besides Joreskog that I can use to argue that in multilevel models standardized coefficients greater than 1 are not necessarily a problem?


It sounds like you have the same model on both levels, in which case the high X, M correlation on the between level could cause the multicollinearity that produces standardized slopes greater than one. The between level also specifies a regression, so the issue is the same as in regular regression and you don't need any extra arguments beyond Joreskog. It is common for between-level variables to correlate much higher than within-level variables. This has to do with the topic of ecological correlations. Perhaps you want to have only M -> Y on the between level. Then the intercept of M still influences the intercept of Y and therefore the individual's Y.

M Hamd posted on Thursday, April 29, 2010  9:33 am



Dr. Muthen, thank you. Actually my model is like this: %within% m on x; %between% m on x(a); y on x; y on m(b); MODEL CONSTRAINT: NEW(ab); ab = a*b; As you can see, by 1-1-2 I meant that Y is purely a between-level variable, and x and m are free to exist at both levels. When I set Y on X@0; the path coefficients are no longer greater than 1.0. I guess I will cite Joreskog to argue that standardized coefficients > 1.0 are not an issue in this case. Thank you very much for your response.


Hi, I'm trying to estimate a 1, (1,1), 1 MSEM (two mediators, all measures taken at L1), but I get a warning about a non-positive definite Fisher information matrix, possibly due to starting values or model non-identification, and about SEs that cannot be computed due to problems with parameter 25. I've tried a few things (deleted clusters with no variance; set variances to zero), with little success. Parameter 25 is the relationship that the DV has with itself (WB in the syntax below) in the psi matrix. I'm not sure what this means or what I should do to correct it. I've copied my syntax below. There are no missing data in the variables used in the model. Any help would be appreciated. Stacey
ANALYSIS: TYPE IS TWOLEVEL RANDOM; ITERATIONS = 1000; CONVERGENCE = 0.01;
MODEL:
%WITHIN%
WB ON CT(BW1);
WB ON AF(BW2);
CWB ON TFL;
CT WITH AF;
CT ON TFL(AW1);
AF ON TFL(AW2);
%BETWEEN%
C CT AF WB;
C WITH CT AF WB;
WB ON CT(BB1);
WB ON AF(BB2);
WB ON TFL;
CT WITH AF;
CT ON TFL(AB1);
AF ON TFL(AB2);
[C];
MODEL CONSTRAINT:
NEW(ABW1 ABW2 ABB1 ABB2 CONW CONB);
ABW1=AW1*BW1;
ABW2=AW2*BW2;
ABB1=AB1*BB1;
ABB2=AB2*BB2;
CONW=ABW1-ABW2;
CONB=ABB1-ABB2;


Please send the output and your license number to support@statmodel.com. 

Tobias Koch posted on Wednesday, November 10, 2010  10:01 am



Hi, I'm working on an ML-CFA with two correlated factors on each level. I would like to use the new Bayes estimator and set inverse-Wishart priors for specific variance/covariance matrices, for example for the residual covariance matrix and/or for the covariance matrices of the latent factors. How can I access specific covariance matrices in the MODEL command, so that I can refer to them in the MODEL PRIORS command later on? Put differently, is it feasible to set priors for entire covariance matrices, or only for single parameters in the covariance matrices? If it is possible to set priors for entire covariance matrices, where can I find examples? Thank you very much, Best regards


You can set priors for entire covariance matrices or for individual parameters. For examples, look in Section 2.2.2 of http://statmodel.com/download/BayesAdvantages18.pdf The actual inputs/outputs are in http://statmodel.com/download/examples55.zip

Peter Halpin posted on Wednesday, February 02, 2011  6:36 am



Hi, I have a question about multilevel SEM that I can't seem to find an answer to in the manual. Basically, I want to regress a latent variable at the within level on a latent variable at the between level. For example, I want to regress "student attitudes" (for which I have a measurement model at the within level) on "teaching style" (for which I have a measurement model at the between level). If both variables were manifest, I would want student attitudes to have a random intercept model and use teaching attitudes as a between-level predictor. So I am thinking of something like: %between% y1 by t1-t5; %within% y2 by s1-s5; y2 on y1; Is this possible? To have a hierarchical model for a latent variable? Any advice here would be greatly appreciated, Peter.


You cannot use a within-level variable in the between part of the model. You would need to remove t1-t5 from the WITHIN list, create both y1w and y1b, and regress y2 on y1b.


Thanks Linda! For anyone with a similar interest, I ended up with something like this and it worked. Also see Example 9.9. %between% y1 by t1-t5; y2_b by s1-s5; y2_b on y1; %within% y2_w by s1-s5;


Hi, I am trying to construct a two-level path analysis with variables x, y, z and a cluster variable a. All are observed. X is a between variable. y and z are measured at the individual (within) level but, as potential outcomes, can operate at either the within or between level. I wish to test the following model: %within% z on y; %between% z on y x; y on x; However, when I run it, I get an error: "Observed variable on the right-hand side of a between-level ON statement must be a BETWEEN variable. Problem with Y." Yet this seems similar in structure to Example 9.5 in the user's manual (V6, April 2010). And I don't want to restrict Y to be a between variable, given the within model. Can you help? Thanks.


I think you can avoid the message by creating a latent variable that is exactly the same as the observed variable y. %within% z on y; %between% f BY y; y@0; z on f x; f on x;

Hemant Kher posted on Wednesday, August 17, 2011  7:19 pm



Greetings Linda and Bengt, I have a question about my multilevel model in Mplus. I am trying to replicate results from Applied Longitudinal Data Analysis (Singer & Willett, page 163, Model D). The dependent variable is CESD, and my two independent variables are Unemp and monBYun. In this model the intercept, as well as the slopes for the independent variables, are to have both fixed and random components. The code I used to fit this model is: %within% s1 | cesd on unemp; s2 | cesd on monBYun; %between% cesd with s1 s2; s1 with s2; The model fits, but my results are slightly different from the results listed in the book (or, for that matter, obtained using SAS or MLWin). Results are given below. Intercept [MPlus=11.135; Book=11.267] / Intercept var [MPlus=42.779; Book=41.52] Unemp slope [Mplus=6.985; Book=6.8795] / Unemp Var [MPlus=33.269; Book=40.45] monBYun slope [MPlus=0.300; Book=0.3254] / monBYun Var [MPlus=0.758; Book=0.71] Res. Var [MPlus=60.379; Book=62.43] These results are for Model D; using the same data but different variable combinations, I fit 3 other models (A, B, C) and the numbers are identical (to 3 decimal place rounding). The mean intercept and slopes for Model D are fairly close, as shown above, but the difference in the variances, especially for Unemp, seems larger. I have used the same estimator (ML) as in the prior models on these data. Is my code in error by any chance?


Doesn't their Model D have a main effect of Time? It looks like you only have the interaction of Time and Unemp.

Hemant Kher posted on Thursday, August 18, 2011  6:02 am



Hi Dr. Muthen, Model D excludes the main effect of time (page 173); the only covariates are Unemp and monBYun (interaction between time and Unemp). As mentioned in the post, the model has a random effect for each fixed effect. Hemant 


You are right. The page 163 results don't show covariances among the random effects, but I assume they are included in their modeling. A first check is whether the same number of parameters is used; Mplus shows this clearly in the output. Next, you can check if the loglikelihoods (LL) agree. They give the Deviance, which is -2 times the LL. Perhaps either their run or the Mplus run can sharpen its convergence criterion and get a better LL with agreement of estimates.

Hemant Kher posted on Thursday, August 18, 2011  8:32 am



Hi Professor Muthen, I am grateful for your time on my question. The LL value for the Mplus model I ran was -2547.997. Thus -2LL = 5095.994; on page 163 of the book the deviance for the same model is 5093.6. These values are very close, but not identical. I also checked the SAS (v9.2) output for the same model on UCLA’s ATS website. The SAS results are identical to those reported in the book, including -2LL. From the output it appears that there are 10 parameters estimated in SAS. My Mplus output tells me “Number of Free Parameters 10”, thus the number of parameters estimated is the same as well. As an aside, on the ATS website only the results from MLWin and SAS are consistent for Model D. The remaining software packages (Mplus, Stata and SPSS) give estimates that are close yet different. Stata had difficulty calculating standard errors for variance/covariance parameters.


The SAS deviance translates to LL = -2546.8, so it is a little bit higher than the Mplus LL. You can try a sharper Mplus convergence criterion by saying mconvergence = 0.00001; in the Analysis command instead of the default mconv = 0.001 and see if the LL changes. In some cases the estimated between covariance matrix is close to singular (e.g., high correlations), which can prevent the algorithm from reaching the best LL.
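The conversion used here is simply LL = -deviance / 2. The Python sketch below redoes the arithmetic with the two values quoted in this thread; log-likelihoods are negative numbers, so the Mplus value is entered as -2547.997. The function name is hypothetical.

```python
def loglik_from_deviance(deviance):
    """Convert a deviance (-2 * log-likelihood) back to a log-likelihood."""
    return -deviance / 2.0

sas_ll = loglik_from_deviance(5093.6)  # deviance reported in the book and by SAS
mplus_ll = -2547.997                   # log-likelihood from the Mplus run
gap = sas_ll - mplus_ll                # positive gap: SAS reached a higher LL
print(sas_ll, round(gap, 3))
```

A gap of roughly 1.2 log-likelihood units with the same number of parameters suggests that one of the runs stopped short of the maximum, which is why tightening the convergence criterion is the natural first check.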

Hemant Kher posted on Thursday, August 18, 2011  10:47 am



Thank you for your thoughts, Professor Muthen. I used a convergence criterion even smaller than the one you suggested, but it does not change the previous results.


We haven't had a case so far where we have not been able to get perfect agreement between SAS and Mplus, so feel free to send the input, output and data to support together with the SAS output. 

Hemant Kher posted on Thursday, August 18, 2011  12:29 pm



Thanks Professor Muthen, I will send all the material to the support email address soon. 


You can decrease the logcriterion convergence criterion (add these two options: logcriterion=0.0000001; miter=10000;) and then you get the same results as in Stata. http://www.ats.ucla.edu/stat/stata/examples/alda/chapter5/aldastatach5.htm The reason you are experiencing problems with this example is that the variance-covariance matrix for the 3 random effects is singular. I computed the determinant and it came out negative. Different algorithms and packages will react differently to a singular variance-covariance matrix, and most people would consider this an unacceptable model. You can introduce a structured variance-covariance matrix for the random effects to eliminate these problems.
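The determinant check mentioned here is easy to reproduce for a 3x3 matrix. The Python sketch below uses a hypothetical covariance matrix (not the estimates from this example, whose covariances are not shown in the thread) whose variances look unremarkable but whose implied correlations are mutually inconsistent.

```python
def det3(m):
    """Determinant of a 3x3 matrix given as a list of three rows."""
    (a, b, c), (d, e, f), (g, h, i) = m
    return a * (e * i - f * h) - b * (d * i - f * g) + c * (d * h - e * g)

# Hypothetical random-effect covariance matrix with inconsistent correlations
cov = [[1.0, 0.9, -0.9],
       [0.9, 1.0, 0.9],
       [-0.9, 0.9, 1.0]]
print(det3(cov))
```

A negative determinant means at least one eigenvalue is negative, so the matrix cannot be a covariance matrix. The converse does not hold for a 3x3 matrix: a positive determinant alone is not enough, and the leading 2x2 block must also have a positive determinant.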


Just a little more information about Mplus. If you add the technical option output:tech8; you will see the details of the convergence process. The default algorithm EMA quickly reaches the ML estimates but fails because the variance covariance matrix for the random effects is singular. At that point Mplus switches to the EM algorithm which slowly approaches the singularity, but Mplus will deliberately avoid the full convergence to avoid the singularity. In this part of the algorithm the solution is driven by the logcriterion convergence criterion. So essentially all software packages differ because the ML solution is inadmissible so they report their own version of "approximately" ML solution. 

Hemant Kher posted on Thursday, August 18, 2011  7:32 pm



Thank you Tihomir, Linda and Bengt. These details are beyond my comprehension. I am glad that I modeled the problem correctly in Mplus and that the difference in results is due to differences in how software packages handle singular matrices. When I made the change suggested by Tihomir, my solution matched the one from Stata; Stata is unable to compute standard errors but Mplus did (as you well know). The solution time was a bit longer than before.

Eric Deemer posted on Saturday, March 17, 2012  1:13 pm



Hello, Like a previous poster, I am trying to fit a multilevel SEM, and I'm also getting the following error message: *** ERROR One or more between-level variables have variation within a cluster for one or more clusters. Check your data and format statement. But I know that each person within the cluster in question has the same (mean) score. Do you think this could be a data format issue? Thanks. Eric


If you get this message and you are certain that the values for each person within a cluster are the same, you may be reading your data incorrectly, for example, blanks in the data set, the wrong number of variable names. If you cannot figure it out, please send the relevant files and your license number to support@statmodel.com. 


Hi, I am testing a multilevel SEM model with three within-level outcomes. At the within level I control for one within-level variable and I specify all intercorrelations between the three outcomes. At the between level (which I'm mainly interested in) I have specified 2 between-level predictors (a and b) for these outcomes, 2 between-level moderators, their 2 interaction terms, and I control for the effect of 2 between-level variables on the outcomes. Everything runs normally and the fit is good. Then I include at the between level two correlations between the two control variables and predictor a, because they are theoretically important and they also improve the fit. But I get the error message about the NON-POSITIVE DEFINITE FIRST-ORDER DERIVATIVE PRODUCT MATRIX ... etc PROBLEM INVOLVING PARAMETER 49. Parameter 49 is the PSI for predictor a. When I remove the correlation between the control variable and predictor a, the error goes away. When I replace this correlation with another correlation between the same control and predictor b, then the error message involves the PSI of predictor b. I am wondering if you know what the problem could be. Best regards, Paris Petrou


It sounds like the predictors are binary and the message is generated because the mean and variance of a binary variable are not orthogonal. If this is the situation, you can ignore the message. 
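The point about binary variables can be made concrete: for a 0/1 variable with mean p, the variance is p(1 - p), so once the mean is known the variance carries no separate information. A minimal Python sketch follows; the function name is hypothetical.

```python
def binary_variance(p):
    """Population variance of a 0/1 variable whose mean (proportion of ones) is p."""
    return p * (1.0 - p)

for p in (0.1, 0.3, 0.5):
    print(p, binary_variance(p))
```

Because the model treats the mean and the variance of such a variable as two free parameters even though one determines the other, the derivative-based check can flag that parameter although nothing is wrong with the model itself.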


Dear Linda, Thank you for your reply. One of the two control variables (which is correlated with the predictor) is indeed binary. So, I will ignore the message. Kind regards, Paris Petrou 


Hello, I am specifying a multilevel path model with students nested in small groups, with the following variables: treatment (level 2) -> groupwork quality (level 2) -> stratpo (level 1) -> comppo (level 1). Furthermore, I want to include pretest scores for both level-1 variables, stratpre and comppre, as covariates. Is it possible to compute the indirect effect on the group level as follows?
missing = all (999.00);
usevariables are treat gwq stratpo comppo stratpre comppre;
centering is grandmean (stratpre comppre);
between is treat gwq;
cluster is group;
analysis: type is twolevel random;
model:
%within%
stratpo on stratpre;
comppo on comppre;
comppo on stratpo;
%between%
gwq on treat(a);
stratpo on gwq(b);
comppo on stratpo(c);
model constraint:
new (abc);
abc = a*b*c;
output: cinterval;


This looks correct. 
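For intuition about what the NEW parameter abc represents, the Python sketch below multiplies three path coefficients and attaches a first-order delta-method standard error. It assumes, for simplicity, that the three estimates are uncorrelated; a full computation would use their estimated covariance matrix. All names and numbers are hypothetical.

```python
from math import sqrt

def indirect_effect(a, b, c):
    """Product-of-coefficients indirect effect along a three-step path."""
    return a * b * c

def delta_se(coefs, ses):
    """First-order delta-method SE of a product of coefficients.

    Assumes the coefficient estimates are uncorrelated (a simplification).
    """
    var = 0.0
    for i, se in enumerate(ses):
        partial = 1.0
        for j, coef in enumerate(coefs):
            if j != i:
                partial *= coef  # derivative of the product w.r.t. coefficient i
        var += (partial * se) ** 2
    return sqrt(var)

# Hypothetical between-level estimates and standard errors
abc = indirect_effect(0.5, 0.4, 0.6)
se_abc = delta_se([0.5, 0.4, 0.6], [0.10, 0.12, 0.15])
print(abc, se_abc)
```

Because the sampling distribution of a product is typically skewed, symmetric intervals built from such a standard error are only a rough guide.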


Thank you very much. I am wondering whether I have to specify a random model, since I do not predict random slopes. Could I run the model with type is twolevel as well? Then I could request the model indirect option for stratpo on treat and comppo on treat.


There is no reason for you to use RANDOM for the model above. 


Dear Dr. Muthen, I have been using Mplus version 7 to estimate Multilevel Structural Equation Models based on Dr. Preacher's syntax with TYPE IS TWOLEVEL RANDOM. Under "estimated sample statistics", Mplus produces a within correlation matrix and a between correlation matrix. I would greatly appreciate if you could clarify (1) how these matrices are calculated and (2) whether missing data has been taken into consideration during the calculations of the within and between matrices. Thanks in advance! 


(1-2) The assumption is that the variables can be decomposed into two orthogonal components, within and between, in line with random-effects ANOVA. The covariance matrices for those two parts are then estimated by maximum likelihood under the standard MAR assumption of missing data (so "FIML").


Hello, I am testing a multilevel SEM with a 1-1-1 model and a 2-2-1 model, and I have two problems. Could you help me? First, when I test the 1-1-1 model and the 2-2-1 model separately, the fit is very good, but once I test both models simultaneously, the fit indices are much worse. Is this possible? Can I test the 1-1-1 model and the 2-2-1 model separately? I think it's not a good idea. Second, at level 1 I test only the 1-1-1 model, but when I add control variables to this model, the fit indices get worse, and I also get the following error message: "variable is uncorrelated with all other variables". In this case, should I exclude all control variables? If so, there would be no control variables in the level-1 model, which may cause problems later. I am wondering if you know what the problem could be. Best regards.


You must have a variable on the USEVARIABLES list that you are not using in the MODEL command. 


I’m fitting a random slope multilevel SEM. The code and selected output are below:
WITHIN = matage;
BETWEEN = liv sr;
CATEGORICAL = sex;
COUNT = fitn(p);
DEFINE: CENTER matage (groupmean);
MODEL:
%WITHIN%
rs_sexmat | sex ON matage;
%BETWEEN%
sex ON liv;
sex WITH rs_sexmat;
fitn ON liv rs_sexmat sr;
                      Estimate   S.E.   Est./S.E.  P-Value
FITN ON RS_SEXMAT      -29.287   7.648    -3.830    0.000
SEX ON LIV               0.000   0.021     0.011    0.991
FITN ON LIV              0.028   0.012     2.319    0.020
FITN ON SR               0.054   0.123     0.437    0.662
SEX WITH RS_SEXMAT       0.001   0.001     0.728    0.467
Means RS_SEXMAT          0.014   0.005     2.566    0.010
Variances RS_SEXMAT      0.000   0.000     1.925    0.054
1. How do I define a path from the random intercept of variable “SEX” to my outcome variable “FITN”? 2. Does a negative coefficient for “FITN ON RS_SEXMAT” mean that the more positive the random slope, the lower the "FITN"? 3. Is it justified to estimate a coefficient for “FITN ON RS_SEXMAT” in the case of non-significant variance for “RS_SEXMAT”?


1. You should say on Between: fitn on sex; 2. The larger rs_sexmat, the smaller fitn 3. Probably not. 


With respect to question 1 above, if I use that syntax I get the following error message: *** ERROR in MODEL command Observed variable on the right-hand side of a between-level ON statement must be a BETWEEN variable. Problem with: SEX *** ERROR The following MODEL statements are ignored: * Statements in the BETWEEN level: FITN ON SEX But if I put the variable "SEX" on the BETWEEN list, I get this error message: *** ERROR in MODEL command Between-level variables cannot be used on the within level. Between-level variable used: SEX Any advice?


One more question: if I have an OUTCOME variable that is measured on the between level, should I also include this variable in the list of BETWEEN variables?


All variables measured on the between level must be put on the BETWEEN list and used only in the between part of the model. If this does not help, please send the original output and your license number to support@statmodel.com.

Samuli Helle posted on Thursday, September 04, 2014  3:04 am



Given the model below, is there a way to constrain the variance of the random slopes to be > 0? %WITHIN% rs_sexmat | sex ON matage; %BETWEEN% sex; sex WITH rs_sexmat; fitn ON rs_sexmat; I tried the following, but I'm not sure whether it worked because the lower 95% CI limit for the variance estimate was still < 0. %WITHIN% rs_sexmat | sex ON matage; %BETWEEN% sex; sex WITH rs_sexmat; fitn ON rs_sexmat; rs_sexmat (rs); MODEL CONSTRAINT: rs > 0; Thanks.


This will keep the point estimate positive. It does not put a constraint on the confidence interval. 
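The distinction is easy to see with a symmetric (Wald-type) interval, which is computed as estimate plus or minus a multiple of the standard error and knows nothing about the constraint. The Python sketch below uses hypothetical numbers; the function name is an assumption.

```python
def wald_ci(estimate, se, z=1.96):
    """Symmetric confidence interval: estimate +/- z * se."""
    return estimate - z * se, estimate + z * se

# Hypothetical variance estimate that is positive but imprecise
lower, upper = wald_ci(0.5, 0.3)
print(lower, upper)
```

Even though the point estimate respects the constraint, the lower limit falls below zero whenever the estimate is less than about 1.96 standard errors from zero.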

Samuli Helle posted on Thursday, September 04, 2014  11:34 am



Is there a way to constrain the confidence interval as well?


Only when using Bayes where the prior keeps all values positive. 

Samuli Helle posted on Friday, September 05, 2014  12:33 am



I cannot use that because I have binary and count responses. Any idea when such a Bayesian model will become available in Mplus?


It is on our list but there is no prediction for when it will be added. 

Samuli Helle posted on Wednesday, September 10, 2014  12:21 pm



And one more question: what is the unit of the random slopes? I mean, if I run the model below and want to report the effect size of the regression "fitn ON rs_sexmat"? %WITHIN% rs_sexmat | sex ON matage; %BETWEEN% sex; sex WITH rs_sexmat; fitn ON rs_sexmat;


Your results give the estimated variance for the random slope, and you can also get the estimated variance for fitn. Using those, you can standardize the regression coefficient for fitn ON rs_sexmat (unless the output already prints it), which gives the usual interpretation.
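The standardization described here is the usual one: multiply the raw slope by the standard deviation of the predictor and divide by the standard deviation of the outcome. A minimal Python sketch follows, with hypothetical numbers; which variance of fitn is appropriate (for example, its total between-level variance) depends on the model and is left as an assumption.

```python
from math import sqrt

def standardize_slope(b, var_x, var_y):
    """Standardized slope: b * SD(x) / SD(y)."""
    return b * sqrt(var_x) / sqrt(var_y)

# Hypothetical values: raw slope, variance of the random slope, variance of the outcome
print(standardize_slope(b=2.0, var_x=4.0, var_y=16.0))
```

The result reads as the expected change in the outcome, in standard deviation units, for a one standard deviation increase in the random slope.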

tom norton posted on Monday, October 20, 2014  6:33 pm



I am trying to run a multilevel SEM with level-1 predictor (X) and dependent (Y) variables, and level-2 moderator (M) and control (C) variables. Am I correct in writing my commands as follows: %WITHIN% s | Y ON X C; %BETWEEN% M ON C; s ON M; Warm Regards


The s | statement should refer to only one regression slope, so say %within% s | y on x; y on c; I think on Between you want to say: s on m; y on m c; where the first line gives the m*x cross-level interaction and the second line gives the main effects from m and c.
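To see why regressing the random slope on m amounts to an m*x cross-level interaction, the Python sketch below writes out the expected value of y in both forms. The coefficient names (`g_s0`, `g_s1`, `g_m`, `b0`) are hypothetical labels for the slope intercept, the effect of m on the slope, the main effect of m, and the overall intercept; residuals and the control variable c are omitted for brevity.

```python
def conditional_slope(g_s0, g_s1, m):
    """Expected slope of y on x in a cluster whose moderator value is m."""
    return g_s0 + g_s1 * m

def predicted_y(x, m, b0, g_m, g_s0, g_s1):
    """Expected y using the random-slope form of the model."""
    return b0 + g_m * m + conditional_slope(g_s0, g_s1, m) * x

def predicted_y_product(x, m, b0, g_m, g_s0, g_s1):
    """Same prediction written with an explicit m*x product term."""
    return b0 + g_m * m + g_s0 * x + g_s1 * (m * x)

args = dict(x=2.0, m=3.0, b0=1.0, g_m=0.5, g_s0=0.4, g_s1=0.1)
print(predicted_y(**args), predicted_y_product(**args))
```

Both functions return the same value, which is why the coefficient of s ON m is interpreted as the interaction effect.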

Star Chen posted on Wednesday, November 12, 2014  2:46 pm



Hi: I'm fairly new to Mplus. Lately I have been trying to fit a multilevel mediation model with two mediators (each a latent variable with several indicators). When I fit one mediator, the model runs, but when I try to model two mediators at the same time, I always get this message: "THE MODEL ESTIMATION TERMINATED NORMALLY THE H1 MODEL ESTIMATION DID NOT CONVERGE. CHI-SQUARE TEST AND SAMPLE STATISTICS COULD NOT BE COMPUTED. INCREASE THE NUMBER OF H1ITERATIONS." I tried increasing H1ITERATIONS to 5000, but I still get the same message. When I increase it beyond 5000, Mplus appears to keep running forever, so I aborted the analysis. The model I was trying to fit is only the basic one without adjusting for any covariates. I'm wondering if there are any solutions to this problem. Originally I also tried to model the mediators as two simple variables (mean scores) instead of two latent variables, but then model estimation did not terminate normally even in the case of one mediator. Any suggestions would be welcome. Thank you.


Please send the output and your license number to support@statmodel.com. 

Djangou C posted on Thursday, March 12, 2015  8:08 pm



Hi Dr. Muthen, I am running multilevel models in Mplus and I have 2 questions to ask. 1) This question is related to what you called ecological correlation in your posting on Thursday, April 29, 2010  8:09 am. The correlations at level 2 are higher than those at level 1. How do you explain this statistically? Could you please point to a paper that explains this and that I could use as a reference? 2) This question is related to class separation in multilevel mixture models. In Lubke & Muthen (2007) the multivariate Mahalanobis distance was used to define class separation. I don’t see how to use the same measure in a multilevel mixture model for each level, as the means are only defined on one level. How do we then define class separation in that situation? Is there another measure of class separation that only uses the covariance matrices (which are available for each level)? Any reference to point to? Thank you.


1) The classic paper is Robinson, W.S. (1950). Ecological correlations and the behavior of individuals. American Sociological Review, 15(3), 351–357. doi:10.2307/2087176. You can also see Wikipedia for more references. 2) The same measure should be used for 2-level models. It uses the means, and they are not affected by the multilevel situation; there are not separate means on the 2 levels. No references as far as I know.
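A tiny constructed data set shows how correlations can differ across levels. In the Python sketch below (hypothetical data, three groups of three individuals), the group means of x and y line up perfectly, while within every group x and y move in opposite directions.

```python
from math import sqrt
from statistics import mean

def pearson(xs, ys):
    """Pearson correlation of two equally long lists."""
    mx, my = mean(xs), mean(ys)
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sxx = sum((x - mx) ** 2 for x in xs)
    syy = sum((y - my) ** 2 for y in ys)
    return sxy / sqrt(sxx * syy)

# Hypothetical (x, y) pairs for three groups
groups = [[(1, 3), (2, 2), (3, 1)],
          [(4, 6), (5, 5), (6, 4)],
          [(7, 9), (8, 8), (9, 7)]]

# Between level: correlation of the group means
between_r = pearson([mean(x for x, _ in g) for g in groups],
                    [mean(y for _, y in g) for g in groups])
# Within level: correlation of deviations from the group means, pooled over groups
dev_x = [x - mean(p[0] for p in g) for g in groups for x, _ in g]
dev_y = [y - mean(p[1] for p in g) for g in groups for _, y in g]
within_r = pearson(dev_x, dev_y)
# Ignoring the grouping altogether
total_r = pearson([x for g in groups for x, _ in g],
                  [y for g in groups for _, y in g])
print(between_r, within_r, total_r)
```

The three correlations answer different questions, and the between-level one is typically the largest in magnitude because averaging within groups removes individual-level noise.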

Nir Madjar posted on Thursday, May 12, 2016  2:44 am



Hello, I wonder if there is any way to model a latent factor on the between level that is associated with a within-level variable. In other words, the equivalent in HLM would be a level-2 factor that predicts the intercept of a level-1 outcome variable (e.g., a class-level variable that explains a student-level outcome). For example (I know this is not a possible model in Mplus, but I wonder whether another model specification would test this hypothesis): Model: %BETWEEN% factorB BY var1 var2; %WITHIN% Outcome ON factorB; With many thanks!


I assume that Outcome is a student-level variable, in which case you say %Between% factorB BY... outcome on factorB; where outcome is the latent between-level part of the Outcome variable.

Nir Madjar posted on Thursday, May 12, 2016  9:21 pm



Thank you for your prompt response! I am sorry if I have misunderstood, but in this case the model explains the latent between-level part of the Outcome variable. In my case, students are nested within classes, and I have a latent factor at the class level (e.g., classroom climate, a BETWEEN-level factor) that is hypothesized to explain variance on the student level (i.e., the model should explain the latent WITHIN-level part of the Outcome variable). Is there any option to model this? With many thanks.


A level-2 variable cannot influence a level-1 variable except through the level-1 variable's level-2 part. That's not an Mplus restriction but a general one.




I need to run a multilevel SEM model in which the slope is allowed to vary. Level 1: AS -> OCB -> IRB. Level 2: PJ. PJ interacts with AS in a cross-level model. PJ interacts with OCB in a cross-level model. I believe that this is an ‘intercept and slopes as outcomes’ model as described by Luke (2004).

What I did:

USEVARIABLES = AS OCB PJ IRB Supvten Educ Age Gender;
cluster = Supvcode;
within = AS OCB IRB Supvten Educ Age Gender;
between = pj; !dep var level 2
DEFINE: CENTER PJ(Grandmean);
Analysis: type=twolevel random;
Model:
%within%
beta1 | OCB on AS Supvten Educ Age Gender; !beta1 is random effect slope y on x
beta2 | IRB on OCB Supvten Educ Age Gender;
%between%
OCB on PJ; !regression of random intercept on PJ
beta1 on PJ; !regression of random slope on PJ. Is the interaction of AS and PJ sign?
OCB with beta1 ! residual covariance between intercept and slope
IRB on PJ; !regression of random intercept on PJ
beta2 on PJ; !regression of random slope on PJ. Is the interaction of OCB and PJ sign?
IRB with beta2 ! residual covariance between intercept and slope
Output: sampstat;

*** ERROR in MODEL command
Unknown variable(s) in a WITH statement: ON

It does not help to put beta1 and beta2 in the usevariables statement. The program does not recognize beta1 and beta2.


You forgot a semicolon at the end of your OCB with beta1 line. Also, your random slope statements should refer to only one variable on the RHS of ON.


Thank you for your quick response. I made the changes and got the following messages.

*** WARNING in MODEL command
Variable on the left-hand side of an ON statement in a | statement is a WITHIN variable. The intercept for this variable is not random. Variable: OCB
*** WARNING in MODEL command
Variable on the left-hand side of an ON statement in a | statement is a WITHIN variable. The intercept for this variable is not random. Variable: IRB
*** ERROR
One or more between-level variables have variation within a cluster for one or more clusters. Check your data and format statement.
Between Variable: PJ
Cluster ID with variation in this variable (only one cluster ID will be listed): 1014

I have been reading on this board about other users who have received this message for multilevel models. My between-group variable PJ is not the same within each group. I was able to analyse the data in HLM but of course could not do SEM. Do you have a suggestion?


The first warnings are clear, right? To say s | y on x; implies that the regression slope of y on x varies across clusters and therefore y varies across clusters. Hence, y cannot be a Within variable. The PJ 1014 error is also clear: a variable designated as Between is a variable that characterizes the cluster and therefore does not vary within a cluster. If the variable also varies within a cluster, it should not be on the Between list.
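Before putting a variable on the Between list, it can help to check outside Mplus whether it really is constant within every cluster. The sketch below is a minimal stand-alone check on hypothetical records; the helper name and the data values are made up for illustration, and only the variable names PJ and Supvcode and the cluster ID 1014 come from the posts above.

```python
from collections import defaultdict


def clusters_with_variation(rows, cluster_key, var):
    """Return sorted cluster IDs in which `var` takes more than one value."""
    seen = defaultdict(set)
    for row in rows:
        seen[row[cluster_key]].add(row[var])
    return sorted(cid for cid, values in seen.items() if len(values) > 1)


# Hypothetical records: PJ is constant in cluster 1001 but not in cluster 1014.
rows = [
    {"Supvcode": 1001, "PJ": 3.0},
    {"Supvcode": 1001, "PJ": 3.0},
    {"Supvcode": 1014, "PJ": 2.5},
    {"Supvcode": 1014, "PJ": 4.0},
]
print(clusters_with_variation(rows, "Supvcode", "PJ"))  # [1014]
```

An empty list would mean the variable behaves like a pure cluster characteristic in the data at hand; any listed cluster means it has within-cluster variation.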


Yes, I can see that my model is inappropriate. Is there a way to run the following model? Level 1: AS -> OCB -> IRB. Level 2: PJ. PJ interacts with AS in a cross-level model. PJ interacts with OCB in a cross-level model. I would like the intercepts and slopes to vary, but it is not necessary.


Cross-level interactions are obtained by using random slopes that PJ predicts on level 2.
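To see why a random slope predicted on level 2 amounts to a cross-level interaction, the standard two-level regression equations can be written out. The notation below is generic textbook notation, not taken from the thread: $x_{ij}$ is a level-1 predictor, $w_j$ a level-2 predictor, and the $\gamma$ symbols are assumed labels for the level-2 coefficients.

```latex
\begin{aligned}
y_{ij} &= \beta_{0j} + \beta_{1j}\, x_{ij} + r_{ij} \\
\beta_{0j} &= \gamma_{00} + \gamma_{01}\, w_j + u_{0j} \\
\beta_{1j} &= \gamma_{10} + \gamma_{11}\, w_j + u_{1j} \\
y_{ij} &= \gamma_{00} + \gamma_{01}\, w_j + \gamma_{10}\, x_{ij}
          + \gamma_{11}\, w_j x_{ij} + u_{0j} + u_{1j}\, x_{ij} + r_{ij}
\end{aligned}
```

Substituting the two level-2 equations into the level-1 equation produces the product term $\gamma_{11} w_j x_{ij}$, so the regression of the random slope on the level-2 variable carries the cross-level interaction.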


How do I write "random slopes that PJ predicts on level 2"? My level 2 variable is PJ. The ICC is 22%, which suggests to me that there is between-group variance. I received this explanation of the error message: "The PJ 1014 error is also clear: a variable designated as Between is a variable that characterizes the cluster and therefore does not vary within a cluster. If the variable also varies within a cluster, it should not be on the Between list." So PJ cannot be on the Between list, yet it is my only level 2 variable. Is this a problem that cannot be analysed using Mplus?


Q1. See the handout for our short course Topic 7, slide 45. There, w is the level-2 (Between-level) variable. If your PJ variable has an ICC, this means that it has both within- and between-level variance and should therefore not be put on the Between = list.


I am looking at the handout for short course Topic 7, slide 45. My level 2 variable is PJ. PJ is not the cluster variable but rather a variable measured at level 2. PJ has both within- and between-level variance. I think you are telling me that PJ cannot be w. Therefore I think that you are telling me that I cannot analyse a question in which my only level 2 variable varies both within and between levels, and so my question cannot be analysed in Mplus. Am I right?


By level-2 variables we typically mean variables that vary across clusters only, such as a teacher characteristic when cluster=classroom. If your PJ variable varies both on level 1 and on level 2, it sounds like it is a variable measured for "students" (to use the student/classroom example). Such a variable can have a level-2 counterpart, either the observed cluster mean (classroom mean) of the variable or the latent between part (see UG ex9.1, part 2). You can have such a variable influence the random slope on level 2 (the Between level).
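The observed-cluster-mean option mentioned above can be sketched outside Mplus as follows. This is only the observed-mean decomposition with made-up values and a hypothetical helper name; it is not the latent between part, which Mplus estimates as a latent variable rather than computing it from the data like this.

```python
from collections import defaultdict


def split_within_between(values, cluster_ids):
    """Split each value into its cluster mean and its deviation from that mean."""
    groups = defaultdict(list)
    for v, cid in zip(values, cluster_ids):
        groups[cid].append(v)
    means = {cid: sum(vs) / len(vs) for cid, vs in groups.items()}
    between = [means[cid] for cid in cluster_ids]  # level-2 counterpart
    within = [v - means[cid] for v, cid in zip(values, cluster_ids)]  # level-1 part
    return between, within


# Hypothetical PJ ratings from five students in two schools.
pj = [2.0, 4.0, 3.0, 5.0, 5.0]
school = ["a", "a", "a", "b", "b"]
between, within = split_within_between(pj, school)
print(between)  # [3.0, 3.0, 3.0, 5.0, 5.0]
print(within)   # [-1.0, 1.0, 0.0, 0.0, 0.0]
```

The cluster-mean column is constant within each school, so it could serve as a Between variable, while the deviation column carries only within-school variation.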


Dear Bengt, to make this a classroom problem I could say: Student's grade (level 1) = student IQ (level 1) x student's perception of the school (level 2). PJ is student's perception of the school. We think that PJ is measured at level 2 because this variable is about the school. PJ will vary by student (level 1) and by school (level 2). Can you refer me to an example that I can follow? Thank you 


Here is a relevant paper that should be of interest to you: Lüdtke, O., Marsh, H.W., Robitzsch, A., Trautwein, U., Asparouhov, T., & Muthén, B. (2008). The multilevel latent covariate model: A new, more reliable approach to group-level effects in contextual studies. Psychological Methods, 13, 203-229.

Daniel Lee posted on Tuesday, February 21, 2017  3:15 pm



Hello, I am trying to expand the 2-1-1 multilevel SEM to a 2-1-1-1 multilevel SEM (the numbers represent levels). When conducting such an analysis, how do I calculate the indirect effect? For the 2-1-1 analysis, I know it's (m on x)*(y on m), but I'm not sure how to expand this formula when there is an extra mediator in the chain. Thank you!!


It's just (Y on m2)*(m2 on m1)*(m1 on x). 
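The product rule above is simple arithmetic on the estimated slopes. A minimal sketch, assuming hypothetical slope values (they are not estimates from any model in this thread) and a made-up helper name:

```python
from math import prod


def indirect_effect(path_slopes):
    """Indirect effect along one chain: the product of its path slopes."""
    return prod(path_slopes)


# Hypothetical estimates for the chain x -> m1 -> m2 -> y.
m1_on_x, m2_on_m1, y_on_m2 = 0.5, 0.4, 0.3
print(indirect_effect([m1_on_x, m2_on_m1, y_on_m2]))

# The 2-1-1 case is the same rule with one mediator: (m on x) * (y on m).
print(indirect_effect([0.5, 0.3]))
```

This gives only the point estimate; standard errors or confidence intervals for the product have to come from the estimation software.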


Dear Dr. Muthen, I am running a two-level MSEM with a 2-2-1 design. Three X predictor latent factors are formative (level 2), five M latent reflective factors are mediators (level 2), and three Y latent reflective factors are outcomes at level 1. My questions are:
1. Should I run a model for the three outcomes simultaneously (in the same input), or run each outcome separately?
2. In terms of the mediators, which one is correct? Is there any difference?

! regress mb on xb, call the slope "a"
M1 M2 M3 M4 M5 ON XB1 XB2 XB3(a);
! regress yb on mb, call the slope "b"
YB1 YB2 YB3 ON M1 M2 M3 M4 M5 (b);
! regress yb on xb, too
YB1 YB2 YB3 ON XB1 XB2 XB3;
MODEL CONSTRAINT:
NEW(ab);
ab = a*b;

OR:

! regress mb on xb, call the slope "a"
M1 ON XB1 XB2 XB3 (a1);
M2 ON XB1 XB2 XB3 (a2);
...…..
! regress yb on mb, call the slope "b"
YB1 YB2 YB3 ON M1(b1);
......
! regress yb on xb, direct effect, too
YB1 YB2 YB3 ON XB1 XB2 XB3 (cdash);
MODEL CONSTRAINT:
NEW(a1b1 a2b2 a3b3 a4b4 a5b5 TOTALIND TOTAL);
! name the indirect effect
a1b1 = a1*b1; ! Specific indirect effect of X on Y via M1
...etc

I really appreciate your help.


You can't give 1 label to several slopes; see labels in Chapter 17 for how to use them. Including all outcomes at once is the same as doing them one at a time. All mediators should be included at once.


Thanks for your reply. Do you mean each predictor should be on a separate line? Sorry, I am a new user of Mplus. I have not found syntax for multiple predictors, mediators, and outcomes in a 2-2-1 model, so I borrowed the syntax from Preacher et al. I really appreciate your help.


One label per line: M1 ON XB1 XB2 XB3 (a1); M2 ON XB1 XB2 XB3 (a2); 

Bander AL posted on Thursday, March 30, 2017  12:40 pm



Thanks Linda. And in terms of regressing yb on mb, is this correct?

[YB1 YB2 YB3] (b0);
YB1 YB2 YB3 ON M1(b1);
YB1 YB2 YB3 ON M2(b2);
YB1 YB2 YB3 ON M3(b3);

If this is not correct, does it mean I need to write a separate input for each outcome (YB1 YB2 YB3)? Am I right?


If you want all three regression coefficients to be the same, it is correct. I don't think you want that. You should run some of these options and see what you get. That would help you understand better. One label after three covariates holds the coefficients equal. Check the output.
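With several predictors, mediators, and outcomes, every slope is a separate parameter, and each specific indirect effect pairs one mediator-on-predictor slope with one outcome-on-mediator slope. The sketch below is bookkeeping only, with hypothetical slope values and a reduced design (two predictors, two mediators, one outcome); the function and variable names are illustrative and it is not Mplus syntax.

```python
def indirect_effects(a, b):
    """Specific and total indirect effects for parallel mediators.

    a[k][p]: slope of mediator k on predictor p
    b[o][k]: slope of outcome o on mediator k
    Returns (specific, total) with
      specific[o][p][k] = b[o][k] * a[k][p]
      total[o][p]       = sum of specific[o][p] over mediators k
    """
    n_med, n_pred = len(a), len(a[0])
    specific = [
        [[b_o[k] * a[k][p] for k in range(n_med)] for p in range(n_pred)]
        for b_o in b
    ]
    total = [[sum(via) for via in per_pred] for per_pred in specific]
    return specific, total


# Hypothetical slopes: rows of `a` are mediators M1, M2; columns are XB1, XB2.
a = [[0.5, 0.2],
     [0.1, 0.4]]
# One outcome YB1 regressed on M1 and M2.
b = [[0.3, 0.6]]

specific, total = indirect_effects(a, b)
print(specific[0][0])  # effects of XB1 on YB1 via M1 and via M2
print(total[0][0])     # total indirect effect of XB1 on YB1
```

If several slopes shared one label they would be forced to a single value, and the distinct products above would collapse; keeping one value per slope is what makes the specific indirect effects separable.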
