

Given that standard fit indexes are not available when estimating multilevel models, what criteria can be used to evaluate those models?


The residuals are available and can be used to assess model fit. We are working on RMSEA, TLI, and CFI for multilevel models. I will post ways to use information from existing Mplus outputs to calculate these. 


A robust RMSEA for multilevel models can be calculated using results from Mplus:

RMSEA = sqrt((2 * Fmin / t) - (1/n)) * sqrt(g)

where
Fmin = the last value from the function column of TECH5
n = sample size (total of all groups if multiple group)
t = trace of the product of the u and gamma matrices (see Satorra, 1992, for a definition)
g = number of groups

For computational purposes, you can calculate RMSEA as follows:

RMSEA = sqrt((chi-square / (n * d)) - (1/n)) * sqrt(g)

where d is the degrees of freedom, n is the total sample size, chi-square is the model chi-square, and g is the number of groups. Chi-square from MLM or MLMV can be used. The RMSEA will be the same whichever one is used.
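The computational form above can be sketched in a few lines of Python. This is only an illustration: the function name and the input values are hypothetical, and flooring the quantity under the square root at zero is an added assumption (the usual convention when chi-square is smaller than its degrees of freedom), not something stated in the post.

```python
import math

def multilevel_rmsea(chisquare, df, n, groups=1):
    """RMSEA = sqrt(chisquare / (n * df) - 1 / n) * sqrt(groups).

    The term under the square root is floored at zero (an assumption,
    following the usual RMSEA convention) so the result is always real.
    """
    inside = chisquare / (n * df) - 1.0 / n
    return math.sqrt(max(inside, 0.0)) * math.sqrt(groups)

# Hypothetical values, for illustration only.
print(round(multilevel_rmsea(chisquare=40.0, df=31, n=800), 3))
```

With chi-square below the degrees of freedom the function returns 0 rather than failing on a negative square root.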


Two other fit measures can be computed for multilevel models. They are TLI and CFI. I don't know how well these fit indices perform for multilevel models. Their range is from zero to one, with one being good fit. In trying them out, I found it difficult to find a value very far from 1. However, this may not be different than for regular SEM models.

Both of these measures require information from a baseline model in addition to the model being tested. Typical baseline models have zero covariances. I looked at a multilevel baseline model with free means and variances on the between level and free variances on the within level. This is obtained in Mplus by having no statements in the MODEL command. Another choice is a model with free means, variances, and covariances on the between level and free variances on the within level. This is obtained in Mplus by using WITH statements on the between level to free the covariances. I did not try this.

TLI and CFI can be computed as follows, where
chib = chi-square for the baseline model
dfb = degrees of freedom for the baseline model
chit = chi-square for the model being tested
dft = degrees of freedom for the model being tested

TLI = (chib - (chit * dfb/dft)) / (chib - dfb)

CFI = 1 - (max(chit - dft, 0) / max(chit - dft, chib - dfb, 0))
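As a rough sketch, the two formulas translate directly into Python. The function names and the chi-square values below are hypothetical, and the guard against a zero denominator in CFI is an added assumption rather than part of the formula as posted.

```python
def tli(chib, dfb, chit, dft):
    """TLI = (chib - chit * dfb / dft) / (chib - dfb)."""
    return (chib - chit * dfb / dft) / (chib - dfb)

def cfi(chib, dfb, chit, dft):
    """CFI = 1 - max(chit - dft, 0) / max(chit - dft, chib - dfb, 0)."""
    numerator = max(chit - dft, 0.0)
    denominator = max(chit - dft, chib - dfb, 0.0)
    # Assumption: report perfect fit when neither model shows any misfit.
    if denominator == 0.0:
        return 1.0
    return 1.0 - numerator / denominator

# Hypothetical baseline and tested-model results, for illustration only.
print(round(tli(chib=500.0, dfb=45, chit=40.0, dft=31), 3))
print(round(cfi(chib=500.0, dfb=45, chit=40.0, dft=31), 3))
```

Both functions take the same four inputs, so the baseline model only has to be estimated once.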


Dr. Muthen: (1) The R-square is offered for each variable in multilevel analysis; however, is there any way to know the R-square for the overall model? How do we know how much variance is explained by the overall model? (2) Can we include contextual variable(s) at the individual level?


Regarding (1), the typical SEM focus is on R-square for each dependent variable, not variance explained in some more global sense. Perhaps you are thinking about overall model fit. In SEM, the model fit is not a direct function of variance explained, but chi-square and other fit indices are useful. Regarding (2), a level-2 variable can be allowed to influence a level-1 variable. This is represented on the between level, letting the level-2 variable influence the level-2 part of the level-1 variable.

Lee Van Horn posted on Tuesday, February 22, 2000  12:35 pm



In the above, the degrees of freedom and chi-square for the baseline model would be obtained using a model with no between or within structure specified, right? Presumably chib - dfb is going to be the largest value for the denominator of the CFI equation in most cases? The difficulty I'm having is that my independence model "failed to converge with serious problems in the iterations." I would like to be sure I was doing it right before giving up on calculating the CFI. I've been specifying a confirmatory within model with no structure for the between model. Some of the ICCs are very small and I've had problems with the estimates for the variance in the between model being 0 for a couple of variables. This makes for some errors in calculating the t scores and standardized loadings (negative t scores and standardized loadings of 999). Calculating start values with 0.5*(Sb - Sw)/c did not change the estimates. I don't, however, see any reason that this would invalidate my model, especially given that I'm primarily interested in the within model. Are there any problems with this logic? There are some theoretical reasons for sticking with the two-level model, notwithstanding the low ICCs. Lee Van Horn


Yes to your questions in your first paragraph. The independence model can fail to converge with small ICCs, and in such cases some between variances may need to be fixed at zero. Note that zero variances are ok on the between level because the between covariance matrix is not inverted. Instead, Sigma_W + s Sigma_B is inverted (see Appendix of User's Guide). Fixing some between variances at zero can also be used for your confirmatory model and does not invalidate your within model. If you are only interested in the within part, regular analysis of S_W gives very similar results.
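A tiny numeric illustration of why a zero between-level variance need not be a problem: the matrices and the value of s below are made up for the example (s stands for the cluster-size multiplier in Sigma_W + s Sigma_B), and a 2 x 2 determinant is used only to keep the sketch dependency-free.

```python
def det2(m):
    """Determinant of a 2 x 2 matrix given as nested lists."""
    return m[0][0] * m[1][1] - m[0][1] * m[1][0]

# Hypothetical matrices: the first variable has zero between-level variance.
sigma_w = [[1.0, 0.3], [0.3, 1.0]]
sigma_b = [[0.0, 0.0], [0.0, 0.2]]
s = 5  # hypothetical cluster-size multiplier

combined = [[sigma_w[i][j] + s * sigma_b[i][j] for j in range(2)]
            for i in range(2)]

print(det2(sigma_b))   # zero: Sigma_B alone cannot be inverted
print(det2(combined))  # positive: Sigma_W + s * Sigma_B can be inverted
```

The sum stays invertible as long as Sigma_W is positive definite, regardless of zeros on the diagonal of Sigma_B.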

Anonymous posted on Friday, February 16, 2001  12:38 pm



Would you provide some context, guidelines, or references for interpreting the WRMR statistic provided in Mplus 2.0? Do you feel the RMSEA and WRMR provide a better indication of overall model fit than CFI and TLI in situations where the data are suspected to deviate from the multivariate normality assumptions or sample size is of concern?


The Version 2 User's Guide gives some background for WRMR on pages 361-362 that states things better than I can do here. There is also a paper by Yu and Muthen that is now in draft form and will shortly be made available. So far, the WRMR performance looks quite good for both continuous and categorical outcomes. We have not yet studied non-normal continuous data situations, so we have to wait with a conclusion on how WRMR compares to other indices in such cases. WRMR is designed to work well here. I refer to the Hu and Bentler articles for comparisons of the other fit indices for continuous non-normal data.

Chingyun Yu posted on Saturday, February 17, 2001  9:13 pm



In our study, WRMR (with a cutoff value of .9) is a better fit index than RMSEA (cutoff value of .06) and CFI (cutoff of .95). Generally speaking, RMSEA performs better than CFI for continuous outcomes, whereas CFI performs slightly better than RMSEA for dichotomous outcomes [except for misspecified factor-loading (complex) models when N<=250]. Marsh, Balla and McDonald's (1988) simulation study on four different data sets has shown that TLI is relatively independent of sample size. Moreover, Hu and Bentler (1998; p. 446) mentioned that TLI is less sensitive to sample size and consistently performs well even when models deviate from the multivariate normality assumptions.

Chingyun Yu posted on Saturday, February 17, 2001  9:48 pm



Hu and Bentler (1998; 1999) have shown that CFI, TLI and RMSEA (ML and GLS based) perform well across different sample sizes and distributions. However, TLI and RMSEA are less preferable at small sample sizes (e.g. N<=250). 


For a complex model [p.135 User Guide], how are the degrees of freedom calculated? 


For the MLM estimator, degrees of freedom are calculated in the usual way. For the MLMV estimator, the formula for the degrees of freedom is found on page 358 of the new Mplus User's Guide. It is formula 110. 


I am conducting a two-level structural equation modeling analysis, with about 275 people and about 75 groups (unbalanced). I have found models that appear to fit well at both the within- and the between-groups levels, but two of the R-squares for the between-groups model are slightly above 1 (1.005 and 1.08). All the other model parameters have reasonable values and suggest a good fit. What do I do with the > 1 R-squares in terms of the analysis and reporting results? Thank you, Jennifer Johnson


Can you please send the output to support@statmodel.com. I'd like to take a look at it before I answer your question. 


Thanks for sending your output. In your multilevel factor analysis, your R-square values on the between level are printed as undefined in the Mplus output because the corresponding estimated residual variances on the between level are negative. The greater-than-one values are shown next to "Undefined" to alert you to the fact that you have a problem. If the negative values are very small, the residual variance can be set to zero. If not, you may want to rethink your model. It is often the case that between-level residuals (unreliabilities) are small due to between-level correlations being high relative to within-level correlations.
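The link between a negative residual variance and an R-square above one can be seen with a line of arithmetic. The sketch below assumes the usual definition, R-square = explained variance / (explained variance + residual variance); the numbers are hypothetical and chosen only so that the result resembles the 1.08 reported above.

```python
def r_square(explained_variance, residual_variance):
    """R-square as explained variance over model-implied total variance."""
    total = explained_variance + residual_variance
    return explained_variance / total

# Hypothetical between-level estimates, for illustration only.
print(round(r_square(0.27, 0.03), 2))   # ordinary case, below 1
print(round(r_square(0.27, -0.02), 2))  # negative residual variance, above 1
```

Fixing the residual variance at zero, as suggested for very small negative estimates, brings the value back to exactly 1.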

Anonymous posted on Friday, May 30, 2003  3:58 am



How do you explain the to adjustment measuring chi2 and RMSEA in Lisrel. What is similar and what is different.


I don't understand what you are asking. Can you be more specific? 

Michael Eid posted on Wednesday, January 19, 2005  5:56 am



I am interested in a two-level factor model for six ordinal variables and four factors on the within and between level. I got results for the MLF estimator but no chi-square test (although this possibility is mentioned in the manual on p. 401). What do I have to do to get a goodness-of-fit statistic for the model? Moreover, I have not got results for the MLR estimator because of memory problems. Is there any way to handle this problem?

BMuthen posted on Thursday, January 20, 2005  8:00 pm



You don't get fit statistics because there is not a single covariance matrix to test against with ordinal outcomes using maximum likelihood. I suspect that the memory problem has to do with too many dimensions of integration. To circumvent this, you can try INTEGRATION = MONTECARLO. However, I wonder why you didn't have this problem with MLF.

Anonymous posted on Thursday, March 31, 2005  7:50 am



Is it possible to get fit indices other than LL, AIC, and BIC when using a categorical dependent variable in a two-level model? Preferably, I could get CFI, RMSEA, etc. (i.e., something with recommended cutoffs to guide model acceptance). (I am using TWOLEVEL MISSING H1 and MLR.) Thanks for your help.

Thuy Nguyen posted on Thursday, March 31, 2005  1:42 pm



Two-level modeling with a categorical dependent variable requires numerical integration. The only fit indices available for analysis with numerical integration are the ones that are printed (LL, AIC, and BIC).


I am using multilevel modeling with the MLR estimator. Can I use the difference in -2LL as a chi-square statistic (comparing nested models)? Or is some scaling factor required?

bmuthen posted on Wednesday, May 25, 2005  10:58 am



MLR chi-square difference testing requires a scaling factor in line with what is described on the Mplus web site for MLM. For 2-level models without random slopes, testing against an unrestricted H1 model is available and then the scaling factor is given. If you have random slopes, but don't have strong non-normality, you can use ML and do the difference testing the usual way.
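For reference, here is a minimal sketch of the loglikelihood version of the scaled difference test as it is described on the Mplus web site: cd = (p0*c0 - p1*c1)/(p0 - p1) and TRd = -2*(L0 - L1)/cd, where L, c, and p are the loglikelihood, scaling correction factor, and number of free parameters, with subscript 0 for the nested model and 1 for the comparison model. The function name and all input values are hypothetical.

```python
def mlr_loglik_difference(l0, c0, p0, l1, c1, p1):
    """Scaled difference test from two MLR loglikelihoods.

    l0, c0, p0: loglikelihood, scaling correction factor and number of
                free parameters of the nested (more restrictive) model.
    l1, c1, p1: the same quantities for the comparison model.
    Returns (TRd, cd, difference in number of parameters).
    """
    cd = (p0 * c0 - p1 * c1) / (p0 - p1)
    trd = -2.0 * (l0 - l1) / cd
    return trd, cd, p1 - p0

# Hypothetical results from two nested runs, for illustration only.
trd, cd, df_diff = mlr_loglik_difference(-1000.0, 1.2, 10, -990.0, 1.3, 12)
print(round(trd, 3), round(cd, 3), df_diff)
```

TRd is then referred to a chi-square distribution with degrees of freedom equal to the difference in the number of free parameters.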


Thank you very much. I would like to ask another question. I am using two-level modeling with students in classrooms. I would like to distinguish gender context within the classroom. One way I used was applying multigroup modeling with girls and boys as groups, and classrooms defining the between level. This worked fine until I introduced a cross-level interaction (using random slopes). Then the deviance (-2LL) increased (it about doubled) compared with the model without this cross-level interaction. Is this a source for concern? When I declared my moderator variable as a variable at the within level and introduced the classroom means of this moderator at the between level, this increase of the deviance did not happen. Is it a sensible approach to add aggregated variables at the between level? Or is it always better to let Mplus construct the between-level variables from the predictors in the within-level part?

bmuthen posted on Wednesday, May 25, 2005  3:40 pm



Multiple-group, 2-level modeling should use groups based on between-level variables, not within-level variables. So if you have students within schools, multiple-group analysis can study Catholic versus Private versus Public schools (as in Muthen, Khoo et al). The rule must be independent samples. Due to intraclass correlation, they are not independent if you consider gender groups within schools (or classrooms). In fact, I am not aware of any published methodology for doing what you want. I have, however, seen this need in at least 3 recent instances now, and the Mplus team has an idea for exploration on how to approach it, so maybe we should get going. Regarding your second question about cross-level interaction, I would be surprised if I saw such a big change in the log likelihood. Regarding between variables, the observed cluster mean entered explicitly as an observed variable in the data should work fine.


Thanks again. Yes, I was also concerned about the dependency issue, which will also bother us when we define the between level as same-gender groups within classrooms. We then ignore the classroom level. In our case, though, it seems more important to attend to the gender context within a classroom than to the classroom context. Concerning the value of the loglikelihood, the big change was surprising. But then, it was also surprising that this big change did not occur when entering the observed cluster mean as an observed variable in an otherwise identical model. We are looking forward to the results of your exploration of the problem.


Hi, above it was stated that for -2LL difference testing "MLR chi-square difference testing requires a scaling factor in line with what is described on the Mplus web site for MLM." Can you please tell me where on the website I should go to read about this? Thanks


Go to the home page and you will find Chi-Square Difference Testing for MLM under Special Analyses With Mplus.


Using Mplus, I am testing a 2-level model with a continuous dependent variable. At level 1, there are control variables, continuous independent variables, and three interaction terms. At level 2, the slopes are fixed and the intercept is regressed on two level-2 control variables. My question concerns demonstrating that the model improves with the addition of the interaction terms, analogous to the evaluation of the change in R-squared in OLS. I've computed the difference in deviance (-2LL) between the two models, and the chi-square value is statistically significant. I interpreted this to mean that the inclusion of the interaction terms improved the fit of the model. A reviewer has challenged my approach, arguing that I should compute the change in R-squared instead. Snijders and Bosker (1999), Chapter 7, describe a method for computing change in R-squared. I certainly can do that, but I question whether it is a sufficient substitute for the -2LL chi-square difference test. If I understand Snijders and Bosker correctly, their computation approximates a relative "effect size" between nested models, but does not represent a test of whether the change in R-squared is different from zero (analogous to the F-statistic in OLS). My take is that -2LL is the important test for assessing whether there is a statistically significant improvement in the model when interaction terms are added. But I don't want to fall prey to my own assumptions. Is anyone aware of a reason why I should abandon the -2LL chi-squared difference test in favor of the Snijders and Bosker approach in this situation? Thanks!


I would agree with you. Chi-square is a test of model fit. R-square gives the variance explained for a single dependent variable. I would agree with your difference testing if you are interested in seeing if adding interactions improves overall model fit. If you do this, don't forget that in the model without the interaction(s), the interaction variable(s) must still be included. For example, if x1x2 is the interaction and the model with the interaction is

y ON x1 x2 x1x2;

then the model without the interaction must be

y ON x1 x2 x1x2@0;


Thanks, Linda. One clarification: your note helped me realize that I can test the difference in chi-squared for my two models in SEM fashion if I add the interaction terms and fix them at zero. In my earlier question, what I was doing was running one model without the interaction terms included, then running a second model with the interaction terms. I then computed "deviance" scores for each model (-2 log likelihood), and subtracted the smaller (model with interactions) from the larger (model without interactions). I then looked at a chi-square table to see if this difference is statistically significant with the change in degrees of freedom. So now I'm curious whether there are substantive differences between my approach and what you suggested. Again, thank you.


-2 times the difference in loglikelihoods for two nested models is distributed as chi-square, so it would do the same thing as a chi-square difference test. You would still need to include the interaction terms and zero them out. I don't believe the models are nested if you don't include them.
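The deviance comparison discussed here can be sketched as follows. The loglikelihoods and the number of added parameters are hypothetical; the p-value helper is a small standard-library routine written for this example (integer degrees of freedom only), not an Mplus function. The plain difference applies to ML; with MLR the scaling correction mentioned earlier in the thread would be needed.

```python
import math

def chi2_sf(x, df):
    """Upper-tail chi-square probability for integer df (stdlib only)."""
    if x <= 0:
        return 1.0
    z = x / 2.0
    # Start from the closed form for df = 2 (even) or df = 1 (odd) ...
    if df % 2 == 0:
        a, q = 1.0, math.exp(-z)
    else:
        a, q = 0.5, math.erfc(math.sqrt(z))
    # ... and step up with Q(a + 1, z) = Q(a, z) + z**a * exp(-z) / Gamma(a + 1).
    while a < df / 2.0 - 1e-9:
        q += math.exp(a * math.log(z) - z - math.lgamma(a + 1.0))
        a += 1.0
    return q

def lr_test(loglik_restricted, loglik_full, extra_params):
    """-2 times the loglikelihood difference and its chi-square p-value."""
    stat = -2.0 * (loglik_restricted - loglik_full)
    return stat, chi2_sf(stat, extra_params)

# Hypothetical: interactions fixed at zero versus three interactions freed.
stat, p = lr_test(-1500.0, -1492.0, extra_params=3)
print(round(stat, 2), p)
```

Both loglikelihoods must come from runs on the same set of variables, which is why the restricted model keeps the interaction terms in the model with coefficients fixed at zero.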

Anonymous posted on Friday, July 15, 2005  4:54 pm



Hi, I have siblings in my data, so I've been running growth curve models with type=complex, estimator=MLR. I've noticed that my CFI/TLI are very low (sometimes even negative values) with the complex command. I ran the same models without the complex command and got nice fit indices (around .9). The differences are quite striking to me. Why is there such a big difference between the two? Thank you.

bmuthen posted on Friday, July 15, 2005  6:17 pm



Please post the 2 inputs you are comparing here. 

Anonymous posted on Monday, July 18, 2005  9:12 am



Hi, here are my 2 inputs. Thank you so much for your help.

input #1 (w/ complex command)

USEVARIABLES ARE AMCORT1 AMCORT4 AMCORT7 AMCORT10 AMCORT13;
CLUSTER IS STUDYSIB;
USEOBSERVATIONS = CONDTION EQ 2;
MODEL:
I by AMCORT1-AMCORT13@1;
S by AMCORT1@0 AMCORT4@.4 AMCORT7@.7 AMCORT10@1 AMCORT13@1.3;
I S;
[I S];
I with S;
[AMCORT1-AMCORT13@0];
AMCORT1 WITH AMCORT4;
AMCORT4 WITH AMCORT7;
AMCORT7 WITH AMCORT10;
AMCORT10 WITH AMCORT13;
ANALYSIS:
TYPE = MISSING COMPLEX H1;
ESTIMATOR = MLR;
H1ITERATIONS = 50000;

input #2 (w/o complex command)

USEVARIABLES ARE AMCORT1 AMCORT4 AMCORT7 AMCORT10 AMCORT13;
USEOBSERVATIONS = CONDTION EQ 2;
MODEL:
I by AMCORT1-AMCORT13@1;
S by AMCORT1@0 AMCORT4@.4 AMCORT7@.7 AMCORT10@1 AMCORT13@1.3;
I S;
[I S];
I with S;
[AMCORT1-AMCORT13@0];
AMCORT1 WITH AMCORT4;
AMCORT4 WITH AMCORT7;
AMCORT7 WITH AMCORT10;
AMCORT10 WITH AMCORT13;
ANALYSIS:
TYPE = MISSING H1;
H1ITERATIONS = 50000;

bmuthen posted on Monday, July 18, 2005  4:58 pm



Can't see anything there; are you using version 3.12? Please send your output, data and license number to support@statmodel.com.

Samuel posted on Saturday, January 07, 2006  7:31 am



Hello Drs. Muthén, I have a question regarding the chi² difference test for MLR: I would like to compare two nested MFA models, which differ by only 1 df. The scaling correction factor of the less restrictive model is slightly higher than that of the more restrictive model (.69 vs. .62), which produces a negative value for the chi² difference test. Does this have a special meaning and how would you treat this? Many thanks for any suggestions.


This can happen and has been discussed in the literature by Bentler. I think this has also been discussed on SEMNET. You may want to look at the archives. The asymptotic behaviour did not work for your situation and the test cannot be used therefore. If there is a difference of only one parameter, you can look at the z test for that parameter in the less restricted model. 
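To see how the negative value arises, recall that the chi-square-based difference test divides by cd = (d0*c0 - d1*c1)/(d0 - d1), where d and c are the degrees of freedom and scaling correction factor of the nested (0) and comparison (1) models. The sketch below uses the two correction factors quoted above (.62 and .69); the degrees of freedom are hypothetical, since they were not posted.

```python
def difference_scaling(d0, c0, d1, c1):
    """cd = (d0 * c0 - d1 * c1) / (d0 - d1) for scaled chi-square testing."""
    return (d0 * c0 - d1 * c1) / (d0 - d1)

# c0 = .62 (more restrictive) and c1 = .69 (less restrictive) are from the
# post; the degrees of freedom below are hypothetical.
for d1 in (5, 20):
    cd = difference_scaling(d0=d1 + 1, c0=0.62, d1=d1, c1=0.69)
    print(d1, round(cd, 2))
```

With a one-df difference, cd reduces to c0 - d1*(c1 - c0), so a larger correction factor in the less restrictive model pushes cd below zero once that model has enough degrees of freedom.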

Samuel posted on Saturday, January 07, 2006  3:07 pm



Many thanks, that helps a lot! 

Marco posted on Monday, January 09, 2006  6:59 am



Hello Drs. Muthén, I have done an MFA according to your recommended steps from the 1994 paper. Regarding the EFA of SIGB/SB, the analysis showed a reasonable chi² statistic and factor structure, but the RMSEA was far above the usual cutoff criteria (.16). All other analysis steps, including the final MFA, showed acceptable chi² and fit indices. Judging from your experience, is the distorted RMSEA of the SB EFA problematic/meaningful? Thanks a lot for your help!


I think in the paper it says that chi-square (and therefore any fit statistic based on chi-square) is not correct because the maximum likelihood assumptions are not fulfilled by the estimated sigma between matrix. You should look at other descriptive measures like eigenvalues and RMSR.

Marco posted on Tuesday, January 10, 2006  1:26 am



OK, I am sorry, I was unaware that the RMSEA is directly based on the chi-square. The paper says that distorted statistics are likely with sigma-between, but I guess there's no reason to expect these with the sample-between. Thanks for your patience.

bmuthen posted on Tuesday, January 10, 2006  8:50 am



Sample-between has the problem of not being an estimator of Sigma-between. It estimates a linear combination of Sigma-between and Sigma-within. See Tech appendix for Version 2. So it is preferable to analyze the estimated Sigma-between. Just don't trust ML-based fit measures literally.
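A scalar sketch of the point: in the balanced case described in Muthen (1994), the sample between matrix S_B estimates Sigma_W + c*Sigma_B, with c the common cluster size, and the pooled-within matrix S_PW estimates Sigma_W, so Sigma_B is recovered as (S_B - S_PW)/c. The variances and cluster size below are hypothetical, and the "sample" values are set to their expectations to keep the arithmetic visible.

```python
def expected_sample_between(sigma_w, sigma_b, c):
    """What S_B estimates: Sigma_W + c * Sigma_B (scalar case)."""
    return sigma_w + c * sigma_b

def estimated_sigma_between(s_b, s_pw, c):
    """Estimate of Sigma_B: (S_B - S_PW) / c (scalar case)."""
    return (s_b - s_pw) / c

# Hypothetical population values and cluster size.
sigma_w, sigma_b, c = 1.0, 0.2, 10
s_b = expected_sample_between(sigma_w, sigma_b, c)
print(s_b)                                        # far larger than Sigma_B
print(estimated_sigma_between(s_b, sigma_w, c))   # recovers Sigma_B
```

The same subtraction is what the start-value expression 0.5*(Sb - Sw)/c mentioned earlier in the thread is built on.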

Lois Downey posted on Thursday, May 25, 2006  11:49 am



I'm doing a CFA with 801 patients clustered under 92 physicians. The indicators are dichotomous. When run as a complex model estimated with WLSMV, an 11-indicator single-factor model fits well (chi-square p>.25, CFI=1.00, TLI=1.00, RMSEA = 0.015, WRMR = 0.620). Ultimately, however, I need to run a two-level SEM in order to test both patient and physician characteristics as predictors of the latent variable. Before doing that, I'd like to re-evaluate the measurement model with a two-level CFA. Although you've given formulas for computing several measures of fit for two-level models, the formulas require components that are unavailable for categorical indicators. Is there a way to evaluate the fit of my two-level CFA model other than looking at residuals? If not, what value on the residuals would provide evidence of inadequate fit?


It is hard to evaluate fit of a model to data when the outcomes are categorical and there are many of them. Already with 11 binary outcomes you have a frequency table that has too many zero cells for the LR or Pearson chi-squares to work. And on top of that, there is the issue that you want to take the clustering into account. The WLSMV approach does not exactly test against data but against an unrestricted tetrachoric correlation matrix for the 11 (as discussed in my 1993 Bollen-Long book; see web site References). Bivariate residuals (tech10) are useful but again don't yet take the clustering into account. I would do what is often done in statistics: work with neighboring models. Using ML, you get a loglikelihood value and you can use this for LRT chi-square testing of a sequence of models. The models to be compared would be 1 and 2 factors for patients and 1 and 2 factors for physicians.


I am currently working on two projects in which I use Mplus. In the first one, we test the mediation of drinking motives in the link between personality factors and adolescent alcohol use. We have about 50 items, 5 latent variables, direct and indirect effects, and more than 2000 individuals. The second project deals with environmental factors of adolescent alcohol use and I performed a two-level SEM with 4 items at level 2 and 10 items and a latent variable at level 1. In total, we have more than 6000 participants grouped in about 400 units. In both projects, we get good SRMR and RMSEA values (around .6) but not satisfactory CFI and TLI values (around .85). Is it the case that, with complex models and high sample sizes, SRMR and RMSEA tend to perform better than CFI and TLI? I saw that the SRMR is given for the first and the second level separately but not RMSEA, CFI, and TLI. Are the latter nevertheless valid for two-level modeling or is there any rescaling of these indices necessary or anything of this kind?


I am not aware of a literature on the empirical performance of these fit indices for two-level models. Short of doing your own simulation study, here are two alternative ways to learn more. A useful approach when you have only random intercepts is to go through the 5 steps of the Muthen (1994) article in SM&R, where the first step is to analyze S pooled-within. This makes it possible to rely on the regular literature for fit indices. Another approach is to rely less on fit indices and more on chi-square difference testing of neighboring models.


When evaluating a two-level model, Mplus presents SRMR for both the individual level and the collective level. My first question is: what are the critical values for SRMR at each level? My second question is: how important is SRMR in evaluating a two-level model, because we usually look at chi-square, RMSEA, and CFI only. Thanks in advance.


I don't think critical values have been studied in the multilevel framework. It is a descriptive measure. The suggested cutoff for simple random samples is less than or equal to .07 or .08. If your SRMR is a lot larger than this, I would question why. 


I need your opinion as well as your suggestion to solve my problem. I analysed my data using multilevel mixture modelling. In the initial step I ran the null model (that is, with no level-1 or level-2 predictors) and got these results:

Loglikelihood
H0 value = -2351.213
H1 value = -2351.210
Information criteria
Number of free parameters = 3
Akaike (AIC) = 4708.427
Bayesian (BIC) = 4723.580
Sample-size adjusted BIC = 4714.051
Sample size = 1154

In the second step, I included predictors, and in my final model I got not only direct effects from level-2 variables but also several interaction effects on the slopes. I have these results:

Loglikelihood
H0 value = -6705.187
H0 scaling correction factor for MLR = 1.630
Information criteria
Number of free parameters = 38
Akaike (AIC) = 13486.374
Bayesian (BIC) = 13678.312
Sample-size adjusted BIC = 13557.612
Sample size = 1154

My problem is how I can say that the final model is the best model that I managed to get from my data. Should I use the formula BIC = -2LL + r ln N? I found that the final model has a higher -2LL. What should I do?


The log likelihood, BIC, and AIC are used to compare models, not as absolute values. They are not comparable for models with covariates and without. I would first get a good model without covariates and then get a good model with covariates. See the Muthen chapter in the book edited by Kaplan that is available on the website for how to select models.
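As a check on how the printed criteria relate to the loglikelihood, the sketch below recomputes them for the two runs in the question, taking the H0 loglikelihoods as negative (-2351.213 and -6705.187). It uses AIC = -2LL + 2r, BIC = -2LL + r ln(n), and the sample-size adjusted BIC with n* = (n + 2)/24, which is the adjustment Mplus prints in its output. The function name is hypothetical.

```python
import math

def information_criteria(loglik, n_params, n):
    """Return (AIC, BIC, sample-size adjusted BIC) from a loglikelihood."""
    deviance = -2.0 * loglik
    aic = deviance + 2 * n_params
    bic = deviance + n_params * math.log(n)
    adjusted_bic = deviance + n_params * math.log((n + 2) / 24.0)
    return aic, bic, adjusted_bic

# Values posted in the question: (loglikelihood, free parameters), n = 1154.
for loglik, r in [(-2351.213, 3), (-6705.187, 38)]:
    aic, bic, abic = information_criteria(loglik, r, n=1154)
    print(round(aic, 3), round(bic, 3), round(abic, 3))
```

The recomputed values agree with the posted output to within rounding of the loglikelihood. They also make the reply concrete: the second run's loglikelihood is computed for a different set of variables, so its larger -2LL says nothing about which model is better.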


I’m estimating a two-level SEM. There are multiple ordinal-level mediator variables included in the structural part of the model and the three factor indicators for the single continuous latent variable are all ordinal-level (the between and within factors use the same indicators). A version of my input is below. My questions:

1) Are there fit indices for this model that I need to show? If not, what do I report? (The output, of course, only includes BIC, AIC. Some of what I’ve been reading seems to indicate that fit statistics just don’t exist for SEMs with categorical data involved.)

2) Do I even need fit indices if my only interests are the path coefficients, R-squares, and the variance components? I’m not actually testing the veracity of the model itself; I’m only commenting on the importance of the hierarchical nature of the data.

Thanks

CLUSTER = cluster;
CATEGORICAL = u1 u2 u3 u4 u5;
WITHIN = x1 x2 x3;
BETWEEN = x4 x5;
ANALYSIS:
TYPE = TWOLEVEL;
ESTIMATOR = MLR;
INTEGRATION = MONTECARLO;
PROCESS = 2;
MODEL:
%WITHIN%
fw BY u1 u2 u3;
u4 u5 ON x1 x2 x3;
fw ON u4 u5 x1 x2 x3;
%BETWEEN%
fb BY u1 u2 u3;
u4 u5 ON x4 x5;
fb ON x4 x5;


1) I am not aware of overall model fit indices for two-level modeling involving categorical indicators. There is, however, a new method coming out in Mplus Version 5 which will provide this. In fact, if you like, I'd be interested in borrowing your data to try out your model with this new approach. Otherwise, the standard statistical approach of comparing your model to a less restricted neighboring model can be used, but it may be tricky to come up with such a model in your case. Checking residuals is also important. 2) Your model has restrictions (it is overidentified), so you need to know that it is a reasonable representation of your data; otherwise parameter estimates based on it should not be interpreted.


I am trying to assess model fit (path model) for a multilevel model (cross-sectional complex survey data) with categorical (dichotomous) variables. Based on my reading above, it seems that this may not be possible, at least not with a simple fit statistic. That said, in the last post (Sept 2007) Dr. Muthen alludes to this possibility being available in Version 5. Is this true? I downloaded Version 5 but don't get any fit statistics aside from loglikelihood and AIC/BIC. Do I have to request something specific? Assuming there is no fit statistic yet available, you talk about comparing to a "less restricted neighboring model". Can you clarify this idea? Do you just mean a model with fewer variables/relationships and then running the significance test to see if the original model is a statistical improvement over the original? Thanks in advance.


Yes, Version 5 has a new weighted least squares estimator which allows testing of model restrictions. To access this, you should use the option estimator = wlsm; This new approach is discussed in the Technical Appendices on our web site, see the last link at http://www.statmodel.com/techappen.shtml For path analysis a less restricted, neighboring model would for example be a model that contains all possible paths. 


Thank you for the response. However, I didn't specify earlier that I have two levels of clustering in the data, and thus need to use the two-level analysis command. It appears as though wlsm is not available for two-level. Is that right? Would the best "workaround" at this point be to do the neighboring model option? Thanks.


WLSM is available for 2-level data; that's a new feature of Version 5.


Sorry for the continued confusion, but I am using Mplus Version 5, and am running a two-level complex analysis and still getting errors when I try to use estimator=wlsm. The exact error given is

*** ERROR in Analysis command
Estimator WLSM is not allowed with TYPE=TWOLEVEL COMPLEX.

My code is below. I could email the exact output and input files if my error is still unclear. Thanks again, in advance, Nathan

Data:
file is sempv.dat;
variable:
names are cidi_id weight1 weight2 strata mplsclus cra12 crv12 overall parpush earlyalc earlymda earlyied adultalc adultied adultmda;
usevariables weight2 cra12-adultmda;
subpopulation = cra12 == 1 or cra12 == 2;
categorical = cra12-adultmda;
weight is weight2;
missing are .;
cluster is strata mplsclus;
within = cra12-adultmda;
analysis:
type = twolevel complex;
estimator = wlsm;
algorithm = integration;
integration = montecarlo;
model:
%within%
cra12 crv12 on overall;
overall on adultied adultalc adultmda earlyied;
adultied on earlyalc earlymda;
adultalc adultmda on earlyalc earlymda earlyied;
earlyalc earlymda earlyied on parpush;


You are using TWOLEVEL COMPLEX and WLSM is not available for the combination of TWOLEVEL and COMPLEX. There is a table in Chapter 15 under the ESTIMATOR option that shows which estimators are available for all analysis types. 


You could use type = complex with estimator = wlsm. 


Is there some way of getting fit statistics that are interpretable with a two-level model with all categorical variables? I get the following output but am not sure about what it means for overall model fit.

TESTS OF MODEL FIT
Loglikelihood
H0 Value -158791.491
H0 Scaling Correction Factor for MLR 1.604
Information Criteria
Number of Free Parameters 25
Akaike (AIC) 317632.982
Bayesian (BIC) 317857.084
Sample-Size Adjusted BIC 317777.634 (n* = (n + 2) / 24)


When means, variances, and covariances are not sufficient statistics for model estimation, traditional fit statistics like chi-square are not available. In these cases, nested models are tested using -2 times the loglikelihood difference, which is distributed as chi-square. With weighted least squares estimators, you will obtain traditional fit statistics.


Dr. Muthen , Could you provide the formula for SRMRW and SRMRB used in multilevel model? Thanks. Mark 


The formula for SRMR is shown in Technical Appendix 5, Formula 128. In the multilevel case, the sample statistics used depend on the estimator used. 


I want to compare two models using the MLR chi-square difference test. I have done the calculation by hand (as outlined on the website), but I am unsure how to interpret the result. For example, what does a TRd value of 7.65 mean? 


This is the chi-square difference. At the end of the testing for measurement invariance section in Chapter 13, there is a discussion of model difference testing that gives the interpretation. 
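For readers following along, here is a minimal Python sketch of the hand calculation for the MLR difference test based on loglikelihoods, as I understand the procedure described on the Mplus website. The function name and the example numbers are illustrative assumptions, not values from any post in this thread.

```python
def scaled_loglik_difference(l0, c0, p0, l1, c1, p1):
    """Scaled chi-square difference (TRd) from two MLR loglikelihoods.

    l0, c0, p0: loglikelihood, scaling correction factor, and number of
                free parameters of the nested (more restricted) model.
    l1, c1, p1: the same quantities for the less restricted model.
    Returns (TRd, difference in number of free parameters).
    """
    if p1 <= p0:
        raise ValueError("the less restricted model must have more parameters")
    # Scaling correction for the difference test.
    cd = (p0 * c0 - p1 * c1) / (p0 - p1)
    if cd <= 0:
        raise ValueError("non-positive difference scaling correction")
    # Scaled -2 loglikelihood difference.
    trd = -2.0 * (l0 - l1) / cd
    return trd, p1 - p0


# Illustrative (hypothetical) inputs.
trd, df = scaled_loglik_difference(-2606.0, 1.450, 39, -2583.0, 1.546, 47)
print(round(trd, 3), df)
```

The resulting TRd is referred to a chi-square distribution with degrees of freedom equal to the difference in the number of free parameters. With both scaling correction factors equal to 1, the function reduces to the ordinary -2 loglikelihood difference.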


Thanks Linda for your reply. Chapter 13 mentions whether or not the chi-square difference value is significant. How large does the chi-square difference value have to be to be significant? 


You need to use the difference in degrees of freedom to find the critical value in the chi-square table. 
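As an alternative to a printed table, the upper-tail probability can be computed directly. The sketch below uses only the Python standard library and the closed-form expressions for integer degrees of freedom; the function name is my own, and the degrees of freedom tried for the TRd of 7.65 are assumptions, since the post does not report them.

```python
import math


def chi2_sf(x, df):
    """Upper-tail probability P(X > x) for a chi-square with integer df."""
    if df != int(df) or df < 1:
        raise ValueError("df must be a positive integer")
    df = int(df)
    if x <= 0:
        return 1.0
    y = x / 2.0
    if df % 2 == 0:
        # Even df: exp(-y) * sum_{k=0}^{df/2 - 1} y^k / k!
        term, total = 1.0, 1.0
        for k in range(1, df // 2):
            term *= y / k
            total += term
        return math.exp(-y) * total
    # Odd df: erfc(sqrt(y)) + exp(-y) * sum_{k=1}^{(df-1)/2} y^(k-1/2) / Gamma(k+1/2)
    total = math.erfc(math.sqrt(y))
    term = math.sqrt(y) / math.gamma(1.5)
    for k in range(1, (df - 1) // 2 + 1):
        total += math.exp(-y) * term
        term *= y / (k + 0.5)
    return total


# The difference is significant at the 5% level when the p-value is below 0.05.
for df in (1, 2, 3):  # hypothetical df values for the TRd of 7.65
    print(df, chi2_sf(7.65, df) < 0.05)
```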

Hao Duong posted on Tuesday, May 27, 2008  1:39 am



Dr. Muthen, Following these discussions, I would like to ask whether BIC and AIC in Mplus 5 are acceptable fit indexes for multilevel mixture models. Thank you, Hao 


BIC and AIC are not absolute fit statistics. They are most often used to compare models. See a multilevel textbook for more information on how these are used. 

Hao Duong posted on Tuesday, May 27, 2008  6:04 pm



Dr. Muthen, Thank you for your reply. I mean to compare the models. Hao 


To compare nested models, you can use -2 times the loglikelihood difference, which is distributed as chi-square. 

RAlgesheimer posted on Tuesday, September 02, 2008  3:07 am



Dear Linda and Bengt, I have a question concerning model comparisons. I analyzed (1) an MTMM model using a traditional CFA and (2), because the data are nested, a multilevel MTMM. Because the data are nested, I would like to show that the second model predicts the data better. These are my questions: a. Is a CFA model nested in an MLCFA if the structure is replicated either on the "between" level or on both levels? b. How can I compare their fits in order to show the preferability of one model? These are the model results: (1) CFA Chi2=20 (15), p=.1572, RMSEA=.036, BIC=5260. (2) MLCFA Chi2=40 (31), p=.1217, RMSEA=.019, BIC=17152. Thank you very much in advance. Best wishes, René 


a. Not nested in the sense that you can apply chi-square testing. b. Preferability should be based on whether or not the 2-level model fits well and has significant between-level parameters. Is there significant between-level variation? 

RAlgesheimer posted on Tuesday, September 02, 2008  7:29 am



Thanks a lot for your quick reply. The service you are offering here is exemplary. Yes, the 2-level model fits very well, all between-level parameters are significant, and the design effect is about 2.6. Nevertheless, I was thinking about the preferability, because at the within level neither method effects nor trait covariances are significant (I applied an MTMM correlated uniqueness model). Best, René 


If the 2-level model has significant between-level variation, the 1-level model is wrong. Only for some situations is the 1-level model "aggregatable", in which case Type = Complex would give correct results (see Muthén & Satorra, 1995, in Soc Meth). But that is most often not the case, for instance due to a different number of factors on Within and Between, which is typical. In other words, if the 1-level MTMM gives significant methods and trait covariances and the 2-level MTMM does not, I would trust the 2-level MTMM. But make sure you have explored the number of factors on between. For instance, it is often the case that you have only 1 trait on between. 


Dear Linda and Bengt, I have a longitudinal dataset with 69 patients being tested on up to 14 occasions on various mediator and outcome measures. I have now specified a two-level random-intercept model with a mediated effect (multilevel 1>1>1 mediation) and am trying to argue in favor of one model over another. I was told that model fit indices such as AIC and BIC are calculated under i.i.d. assumptions and therefore do not make much sense with correlated data. However, regarding an R^2 equivalent for multilevel models, Snijders & Bosker (1994) have argued in favor of an index called R1^2, defined as the proportional reduction of overall prediction error due to including predictors in the model. My questions now are: a) could this index be meaningfully used for my model (which includes an indirect effect), b) could this index be used to compare different models without having to compare AICs, and c) does Mplus give this statistic? Thanks a lot! Benjamin 


We are not familiar with this index and its use so cannot comment. It is not available in Mplus. With continuous outcomes, you will obtain chi-square. I would use that. 


I'm testing a path model with categorical dependent variables and some (but not all) latent variables. In addition, I'm using MLR, type=missing complex (for clustering and missing data), and integration=montecarlo (because of a mediating continuous variable). I found that fit indices are not provided for these kinds of models with categorical variables, but is there a way to calculate them by hand? I would like to use the formula below (mentioned earlier on this forum), but I do not have information about the chi-square! And how do I decide how many groups I have? RMSEA = sqrt((chisquare/(n*d)) - (1/n))*sqrt(g) 


Chi-square and related fit measures are not available when means, variances, and covariances are not sufficient statistics for model estimation. This is the situation you are in. Nested models can be tested using -2 times the loglikelihood difference, which is distributed as chi-square. 
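For models where a chi-square is reported, the formula quoted in the question can be sketched in Python as follows. The function name, the example numbers, and the truncation at zero when chi-square is smaller than its degrees of freedom are my assumptions; g is the number of groups as defined earlier in this thread, which I take to be 1 when the analysis is not a multiple-group analysis.

```python
import math


def rmsea(chisquare, d, n, g=1):
    """RMSEA = sqrt(chisquare/(n*d) - 1/n) * sqrt(g).

    chisquare: model chi-square, d: degrees of freedom,
    n: total sample size, g: number of groups.
    """
    if d <= 0 or n <= 0 or g <= 0:
        raise ValueError("d, n, and g must be positive")
    inner = chisquare / (n * d) - 1.0 / n
    # Assumed convention: report 0 when chi-square is below its df.
    return math.sqrt(max(inner, 0.0)) * math.sqrt(g)


# Hypothetical values for illustration only.
print(round(rmsea(100.0, 50, 500), 4))
```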

V X posted on Friday, April 02, 2010  12:30 am



Dear Dr. Muthen, I have a question with regard to model comparison when "integration = MonteCarlo" is specified. If I have two nested models, may I still use deviance test when MonteCarlo integration is implemented? Thank you. 


In principle, this is possible. It is however the case that with INTEGRATION=MONTECARLO the loglikelihood is less precise. 


I am exploring some other ways to compute RMSEA for multilevel SEM, and wonder if I can get the partial elements of Fmin (the minimum of the fit function). Specifically, according to Liang & Bentler (2004), the ML fit function can be expressed as something like: Fml = Sum_g{(n_g - 1)*f1_g} + Sum_g{f2_g}, where the 1st term is based on the sample pooled-within matrix and the 2nd term is based on a weighted sum of the within and between matrices. I assume a similar fit function is used in Mplus's ML methods. If this is true, is there any way I can get the 1st and 2nd terms of Fml separately? Thanks, Jinsong 


It may require some work. The fit function expression you show is the multiple-group approach to 2-level random intercept modeling, which I wrote about in the paper: Muthén, B. (1990). Mean and covariance structure analysis of hierarchical data. Paper presented at the Psychometric Society meeting in Princeton, NJ, June 1990. UCLA Statistics Series 62. It is available as paper 32 at my UCLA web site at http://www.gseis.ucla.edu/faculty/muthen/full_paper_list.htm 


Dr. Muthen, So is this multiple-group approach also used as the fit function (Fmin) for all ML MSEM estimators (e.g., ML, MLR, MUML) in Mplus? In other words, can the ML Fmin be rewritten in two similar terms as above? (It seems Eq. 35 on p. 16 of your paper is similar.) I just found that Mplus can give the within (Sig_W) and between (Sig_B) covariance matrices. If all of this is correct, I can develop different types of fit indices (such as level-specific RMSEA and CFI) with some work, and then compare their performance with general or other level-specific indices using Monte Carlo simulation. Any comments? 


My paper shows how to write the LL as several groups when there are more than one distinct group size. The Mplus fit function does not work in this way, but instead goes straight to the raw data. 


Thanks Dr. Muthen, it seems the level-specific indices based on the ML fit function I mentioned won't work for Mplus. Or did you have another way in mind when you said it might require some work? I hope you don't mind that I keep bothering you on this issue, since it could illuminate my current research idea somewhat. 


No further thoughts on this come to mind off hand. 


Hi Linda & Bengt, I've analyzed my data via MSEM and tried to compute RMSEA using the formula posted above (the standard formula), but the value does not match the value reported by Mplus. What does that mean? Is there an alternative formula for computing RMSEA from the chi-square of the tested model? Thanks 


Please send the full output, your hand calculations of RMSEA, and your license number to support@statmodel.com. 


Dear Mplus team, I have a question regarding the fit indices of my MCFA. My CFI = .92, RMSEA = .08, SRMR (within) = .049, SRMR (between) = .275. So, all criteria are good except the SRMR (between). I have very low between-group variances for some variables and a small sample at level 2 (n = 34 teams, n = 177 people). I have read on this forum that cutoff criteria do not apply one-to-one to multilevel models, and that nonconvergence can be an issue for my problem. Is there a way I can improve the SRMR (between)? What strategies should I employ? Many thanks in advance. 


A less restrictive model should improve fit on the between level. It's hard to say more than this. 

Wu wenfeng posted on Sunday, November 07, 2010  5:47 pm



Dear Mplus team, I am doing a mediation analysis with MSEM, but the output contains no standardized coefficients. Could you help me find the reason, or suggest a way to solve this problem? Thank you! My syntax follows:
VARIABLE: MISSING ARE ALL (99);
NAMES ARE ID ageI Gender CDI_sum CHAS_SUM cog time dep str;
USEV=ID cog dep str;
! ID is the student number; cog is the negative cognitive style (measured at the initial time point); dep is the student's depressive symptoms (measured 4 times); str is the student's stress events (also measured 4 times)
CLUSTER IS ID;
BETWEEN = cog;
ANALYSIS: TYPE IS TWOLEVEL RANDOM;
MODEL: %WITHIN%
dep ON str;
%BETWEEN%
dep str cog;
cog ON str(a);
dep ON cog(b);
dep ON str;
MODEL CONSTRAINT: NEW(indb);
indb=a*b;
OUTPUT: TECH1 TECH8 CINTERVAL;


With TYPE=RANDOM, the variance of y varies as a function of x, so how to standardize is not well-defined. 

Wu wenfeng posted on Monday, November 08, 2010  4:44 pm



Thanks! Sorry to disturb you again, but I have another question. Can I include the initial value of the dependent variable as a control, changing the above syntax to:
VARIABLE: MISSING ARE ALL (99);
NAMES ARE ID ageI Gender stddep cog time dep str;
USEV=ID cog dep str stddep;
! ID is the student number; cog is the negative cognitive style (standardized and measured at the initial time point); dep is the student's depressive symptoms (measured 4 times); str is the student's stress events (also measured 4 times); stddep is the standardized initial value of the dependent variable
CLUSTER IS ID;
BETWEEN = cog stddep;
ANALYSIS: TYPE IS TWOLEVEL RANDOM;
MODEL: %WITHIN%
dep ON stddep str;
%BETWEEN%
dep str stddep cog;
cog ON str(a);
dep ON stddep cog(b);
dep ON stddep str;
MODEL CONSTRAINT: NEW(indb);
indb=a*b;
OUTPUT: TECH1 TECH8 CINTERVAL;
When I do HLM, I include the initial value of the dependent variable as a control; using MSEM, I am not sure whether this is right. 

Wu wenfeng posted on Monday, November 08, 2010  4:55 pm



Can I center the within-level independent variable "str" by group before doing the above Mplus analysis? 


I'm afraid I don't understand what you mean by "set the initial dependent variable test value as a control". You can use the CENTERING option to center. 

Callie Burt posted on Wednesday, November 10, 2010  10:59 am



Hi, Could you point me to a reference I can cite when comparing models when using numerical integration? Thanks for your help 


It's not the numerical integration per se but the kind of model for which ML requires numerical integration. These tend to be models where the means, variances, and covariances that are used for the usual chi-square model testing aren't sufficient for describing the model. In such settings statisticians instead work with BIC or, for nested models, chi-square based on loglikelihood difference testing between competing models. I know of no reference for this. 

Wu wenfeng posted on Sunday, December 26, 2010  7:09 pm



I am doing a multilevel SEM and am new to Mplus. The syntax I used is:
DATA: FILE IS E:\Mplus data\PhD dissert\mplus data\long grade3 data.dat;
VARIABLE: MISSING ARE ALL (999);
NAMES ARE ID Gender TCDI dep str ZHWLK;
USEVARIABLES ARE ID dep str ZHWLK TCDI;
CENTERING=groupmean(str ZHWLK);
WITHIN = ZHWLK str;
BETWEEN = TCDI;
CLUSTER IS ID;
ANALYSIS: TYPE = TWOLEVEL RANDOM;
MODEL: %WITHIN%
dep ON str;
dep ON ZHWLK(aw);
ZHWLK ON str(bw);
%BETWEEN%
dep ON TCDI;
MODEL CONSTRAINT: ! section for computing indirect effects
NEW(indw); ! name the indirect effects
indw=aw*bw; ! co
OUTPUT: TECH1 TECH8;
The output shows:
Chi-Square Test of Value 0.000* Degrees of Freedom 0 P-Value 0.0000 Scaling Correction Factor 1.000 for MLR
That means I can't assess the model fit. Could you please tell me what I should do to solve this problem? Thank you! 


You need to specify a model that is not saturated to be able to assess fit. 

Wu wenfeng posted on Monday, December 27, 2010  4:18 pm



thanks! 

Wu wenfeng posted on Tuesday, December 28, 2010  1:29 am



Sorry to disturb you again! Could you please give me your opinion on the syntax I posted here on December 26? I am not sure whether the syntax is right. Let me explain: the data are in long format; the variables dep, str, and ZHWLK each cover 3 measurement occasions (3 follow-up waves). dep is depression, and TCDI is the first measurement of depression; str is stress, and ZHWLK is a cognitive variable. The key question is whether specifying "dep ON TCDI" is right. Thank you! 


Example 9.16 shows how to specify a growth model when the data are in long format. 

Wu wenfeng posted on Tuesday, December 28, 2010  7:32 pm



thanks 


I am doing a twolevel regression model with observed variables. I have a small dataset: about 90 teachers in 27 schools, the cluster mean size being about 3. I have started with a null model, getting a deviance value of 880. When I add two predictors to the within level, surprisingly the deviance goes up: 1012! Next, adding the same two predictors to the between level, the deviance comes down a bit, as it should, to 1007. At the same time, however, the regression coefficients are significant, and the portion of explained variance increases, indicating that the model is meaningful. How should this be interpreted and what should I do? Thanks! 


Please send the outputs involved and your license number to support@statmodel.com. 


Hi Linda, I wonder if we can request for other fit indices (e.g. IFI) apart from what we normally receive from the standard report. Thanks. 


No, all available fit statistics are given automatically. 

Jing Zhang posted on Friday, August 26, 2011  11:13 am



Dear Dr. Muthen, Following is the syntax I used to specify a model. In the model fit information, the results show only "Loglikelihood" and "Information Criteria"; CFI, TLI, RMSEA, and SRMR do not appear. I wonder why? Thanks.
TITLE: A linear growth model with time varying covariate
DATA: FILE IS int_covariate.dat;
VARIABLE: NAMES ARE ID INT1-INT3 AGE1-AGE3 SNS1 SNP1 SNS3 SNP3 PSS1 PSP1 PSS3 PSP3;
USEVARIABLES ARE INT1-INT3 AGE1-AGE3 PSS1 PSS3;
TSCORES = AGE1-AGE3;
MISSING = ALL(999999);
ANALYSIS: TYPE=RANDOM;
MODEL: I S | INT1-INT3 AT AGE1-AGE3;
INT1@0;
st | INT1 ON PSS1;
st | INT3 ON PSS3;
PLOT: TYPE IS PLOT3;
SERIES IS INT1 INT2 INT3(*);
OUTPUT: SAMPSTAT TECH1 TECH8 MODINDICES(3.84);


With TYPE=RANDOM, chi-square and related fit statistics are not available because means, variances, and covariances are not sufficient statistics for model estimation. 

Jing Zhang posted on Friday, August 26, 2011  5:24 pm



Dear Dr. Muthen, Thanks for your quick response. Then with TYPE=RANDOM, how do we evaluate whether the model fits well without using chi-square and related fit statistics? Jing 


Nested models are tested using -2 times the loglikelihood difference, which is distributed as chi-square. BIC is also used to compare models. 


I am using complex data (teachers within schools) with 260 cases on level 1 but only 8 cases on level 2. When I compute a single-level model (without the "complex" statement), my model fit is fine: RMSEA = 0.015, CFI = 0.957, TLI = 0.955, SRMR = 0.068. But when I use the complex mode, model fit gets worse (especially CFI and TLI): RMSEA = 0.057, CFI = 0.714, TLI = 0.701, SRMR = 0.068. I also computed a model using the two-level mode with group-mean centering: RMSEA = 0.048, CFI = 0.744, TLI = 0.732, SRMR = 0.063. In my opinion, when using the complex statement or computing a two-level model, fit should become better, not worse. Could it be that 8 cases on level 2 are not enough for using the complex mode? If so, how can I model my data appropriately? 


Eight clusters is not enough to get stable results. It is recommended to have a minimum of 30-50 clusters. You can control for nonindependence of observations by creating a set of 7 dummy variables and using them in the analysis. 
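To illustrate the dummy-variable idea outside Mplus, here is a small Python sketch that turns a cluster identifier with G distinct values into G - 1 indicator variables, leaving one cluster as the reference category. The function name and the choice of reference school are assumptions for illustration only.

```python
def dummy_code(cluster_ids, reference):
    """Create G - 1 indicator columns for a cluster variable with G levels.

    Returns (kept_levels, rows): kept_levels lists the clusters that get
    their own dummy; rows holds one list of 0/1 values per observation.
    Observations in the reference cluster have all dummies equal to 0.
    """
    levels = sorted(set(cluster_ids))
    if reference not in levels:
        raise ValueError("reference cluster not found in the data")
    kept = [level for level in levels if level != reference]
    rows = [[1 if cid == level else 0 for level in kept] for cid in cluster_ids]
    return kept, rows


# Hypothetical school identifiers for a handful of teachers.
schools = [1, 2, 3, 4, 5, 6, 7, 8, 2, 8]
kept, dummies = dummy_code(schools, reference=1)
print(len(kept))  # number of dummy variables for 8 schools
```

Only G - 1 dummies enter the regressions because the full set of G would be perfectly collinear with the intercept.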


Dear Linda, I have tried that approach and created 8 dummy variables (all included in usevariables), of which I used 7 in the ON statements for all mediating and dependent variables.
DEFINE: s1 = 0; s2 = 0; s3 = 0; s4 = 0; s5 = 0; S6 = 0; S7 = 0; s8 = 0;
if (school eq 2) then s1=1;
if (school eq 3) then s2=1;
if (school eq 4) then s3=1;
if (school eq 5) then s4=1;
if (school eq 6) then s5=1;
if (school eq 7) then s6=1;
if (school eq 8) then s7=1;
if (school eq 1) then s8=1;
Model: ! only the ON and WITH statements
KF on TF LF PF s1 s2 s3 s4 s5 s6 s7;
KO on TF LF PF s1 s2 s3 s4 s5 s6 s7;
UE on TF LF PF s1 s2 s3 s4 s5 s6 s7;
MB on KF KO UE TF LF PF s1 s2 s3 s4 s5 s6 s7;
AZ on KF KO UE TF LF PF s1 s2 s3 s4 s5 s6 s7;
LMSO on MB AZ KF KO UE TF LF PF s1 s2 s3 s4 s5 s6 s7;
LMSZ on MB AZ KF KO UE TF LF PF s1 s2 s3 s4 s5 s6 s7;
LMEI on MB AZ KF KO UE TF LF PF s1 s2 s3 s4 s5 s6 s7;
TF with TA LF PF;
TA with LF PF;
LF with PF;
LMSO with LMSZ LMEI;
LMSZ with LMEI;
KF with KO UE;
KO with UE;
MB with AZ;
But now I do not get any fit statistics at all. Do you have any idea what went wrong here? 


Please send your output and license number to support@statmodel.com. 

Eva posted on Saturday, September 15, 2012  10:16 am



I am unclear on how to compare model fit for nested models. My analysis uses TYPE=TWOLEVEL and ESTIMATOR=WLSMV. It was mentioned to use the -2 loglikelihood difference; I do not see that value in the output, but see the chi-square test of model fit instead, which is not to be "used for chi-square difference testing in the regular way." DIFFTEST is not available for the TWOLEVEL analysis either, despite the output suggesting that I use that option. What to do, what to do? 


The DIFFTEST option is not currently available for TYPE=TWOLEVEL. If you want to do difference testing with TYPE=TWOLEVEL, use maximum likelihood estimation. 


Dear Linda and Bengt, I have a question about the fit of my model. I have a multilevel model with days (40 days) at the within level and persons (60 persons) at the between level. My model is a mediated model with one independent variable, three mediators and one dependent variable. The fit of my model is: RMSEA = .12 CFI = .93 TLI = .33 SRMR within = .04 SRMR between = .13 I have this problem when I use TYPE = TWOLEVEL and TYPE = COMPLEX. Can you maybe help me? Thank you very much! 


We need to see your output to say; please send it to Support. 

Suekyung Lee posted on Wednesday, December 26, 2012  9:56 am



Dear Dr. Muthen, I'm testing a path model with one categorical dependent variable and three continuous mediating variables, using a complex data set. Following the User’s Guide, I'm using WLSMV as the estimator (default), type = complex, and repse = jackknife1. I also include weight and repweight under the variable command. I found that WRMR is the only fit index provided. However, when I ran the same analysis without replicates, other fit indices were provided, and WRMR had exactly the same value regardless of whether replicates were included. I wonder if I can use these fit indices without replicates to evaluate overall model fit, and the chi-square values to perform chi-square difference tests. If not, what would you recommend to evaluate model fit and to perform chi-square difference tests? Thank you! 


Fit statistics have not been developed for replicate weights. You should ignore WRMR. It is an experimental fit measure. You should not use fit statistics for the model without replicate weights. The data being analyzed are not the same. I'm not sure if MODEL TEST is available. If it is, you can do a joint test of all of the left-out paths. 

Suekyung Lee posted on Thursday, December 27, 2012  9:57 am



Thank you for your reply. Would you be more specific about MODEL TEST and a joint test? Thank you! 


See MODEL TEST in the user's guide. A joint test means to include all of the left-out paths in MODEL TEST at the same time. 

Tom Carwell posted on Saturday, February 02, 2013  10:30 am



Using Mplus, I want to test a 2-level mediation model (TYPE=TWOLEVEL). My questions concern model comparisons: 1) I want to check whether the model improves with the addition of the mediator, analogous to evaluating the change in r-squared in OLS. Is there any way to do this? Is it acceptable to compare a first model with the mediator fixed at zero and a second model with the mediator not fixed? 2) Can I use the chi-square difference test, or should I use the -2LL chi-square difference test for this purpose? 3) In addition, I want to rule out an alternative explanation of full mediation by examining a third model that adds direct paths and comparing it to the hypothesized model. I saw in a recent article a report of similar comparisons based on comparing chi-square values and fit indices (without conducting a chi-square difference test). Is this use of chi-square and fit indices reasonable? What is the proper method for conducting this comparison? Thanks! 


You can do in two-level analysis what you do in single-level analysis. But regarding 1), I would not advocate seeing if the model gets a better r-square by including a mediator. I would simply see if the mediator gives rise to a significant indirect effect. You may want to ask on SEMNET. 2) No, because in one case you have 1 DV (the outcome) and in the other you have 2 DVs (outcome and mediator), so the metric is not the same. 3) I would start with the model y on m x; m on x; and simply see which effects are significant. For instance, if y on x is insignificant here as judged by a z-test, it will also be insignificant as judged by a 1-df chi-square test. 
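The last point can be checked numerically. A Wald chi-square with 1 degree of freedom is the square of the z statistic, so the two p-values coincide; a likelihood-ratio difference test with 1 df is only asymptotically equivalent. The sketch below uses the Python standard library; the function names and the estimate and standard error are hypothetical.

```python
import math


def z_test_p(estimate, se):
    """Two-sided p-value for z = estimate / se under a standard normal."""
    z = estimate / se
    return math.erfc(abs(z) / math.sqrt(2.0))


def wald_1df_p(estimate, se):
    """Wald chi-square (1 df) and its p-value for H0: parameter = 0."""
    w = (estimate / se) ** 2
    # For 1 df, P(chi-square > w) = erfc(sqrt(w / 2)).
    return w, math.erfc(math.sqrt(w / 2.0))


# Hypothetical estimate and standard error for the path y on x.
p_z = z_test_p(0.12, 0.09)
w, p_w = wald_1df_p(0.12, 0.09)
print(abs(p_z - p_w) < 1e-12)
```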

Tom Carwell posted on Saturday, February 02, 2013  11:15 pm



Dear Bengt, Thank you for your prompt response and suggestions. 

Beth Bynum posted on Monday, August 26, 2013  12:04 pm



I am working on a multilevel model with 10 predictor scales and one outcome variable. The outcome variable is a group-level measure of performance and the predictors are ratings made at the individual level. I am only interested in the relationship between the predictors and the group-level outcome. I've included this relationship on the %BETWEEN% level (e.g., Y ON P1 P2 P3 P4 P5), but I am unsure what to include on the %WITHIN% level to get adequate model fit. When I don't include anything on the %WITHIN% level, the model has zero degrees of freedom. When I specify only variance estimates on the %WITHIN% level, my model fits very poorly (TLI < 0). What is the best approach for using a multilevel model when you are not interested in modeling within-level relationships? Thanks! 


You can have a model of variances and covariances on within. You could also create a dataset where clusters are the observations and use all between variables. 

Eric Deemer posted on Sunday, September 29, 2013  3:05 pm



I fitted a multilevel mediation model and got a chi square value of zero. I read through this thread and it seems like this happens because means, variances, and covariances don't provide enough information for estimation. I thought this was only the case with "type = twolevel random" estimation? I used "type = twolevel." eric 

Eric Deemer posted on Sunday, September 29, 2013  3:20 pm



I noticed that the between-level covariances among my 3 predictors are all zero. I suppose that this is due to centering, and is the reason I can't get a chi-square statistic? Not enough information? Eric 


Please send the output and your license number to support@statmodel.com. 

RuoShui posted on Wednesday, December 18, 2013  5:53 pm



Dear Dr. Muthen, I am using LGCM. One model is just the regular LGCM, and in the other I used the intercept and slope growth factors to predict two continuous outcomes. Can I use the difference in -2LL, and test whether this change is significant given the change in degrees of freedom, to determine the model fit? Thank you so much! 


You can use -2LL only if your two models have the same dependent variables. You can do a joint test of whether the 4 slopes are significant by using Model Test in the model that includes the two continuous outcomes. 

RuoShui posted on Thursday, December 19, 2013  4:37 pm



Dear Bengt, Thank you so much for your time. I might be very wrong. But do you mean doing this? math on i (p1); math on s (p2); literacy on i (p3); literacy on s (p4); Model Test: p1=0; p2=0; p3=0; p4=0; 


This looks correct. 
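For intuition, MODEL TEST reports a Wald chi-square for the listed restrictions. When every restriction sets a parameter to zero, as with p1 to p4 above, the statistic is W = b' V^{-1} b, where b holds the estimates and V their estimated covariance matrix. The Python sketch below uses only the standard library; the helper names and all numbers are hypothetical, and it is not meant to reproduce Mplus output.

```python
def solve_linear(a, b):
    """Solve a x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [list(row) + [rhs] for row, rhs in zip(a, b)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        if abs(m[pivot][col]) < 1e-12:
            raise ValueError("covariance matrix is singular")
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    x = [0.0] * n
    for i in range(n - 1, -1, -1):
        s = sum(m[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (m[i][n] - s) / m[i][i]
    return x


def wald_statistic(estimates, cov):
    """Wald statistic and df for H0: all listed parameters equal zero."""
    x = solve_linear(cov, estimates)  # x = V^{-1} b
    w = sum(e * xi for e, xi in zip(estimates, x))
    return w, len(estimates)


# Hypothetical estimates for p1..p4 and a hypothetical diagonal covariance.
b = [0.30, 0.10, 0.25, 0.05]
v = [[0.010, 0.0, 0.0, 0.0],
     [0.0, 0.004, 0.0, 0.0],
     [0.0, 0.0, 0.012, 0.0],
     [0.0, 0.0, 0.0, 0.003]]
w, df = wald_statistic(b, v)
print(df)
```

The statistic is compared with a chi-square distribution whose degrees of freedom equal the number of restrictions, here 4.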

RuoShui posted on Sunday, January 05, 2014  5:44 pm



Dear Dr. Muthen, I hope you had a lovely Christmas and New Year. I have a question about the -2LL. I noticed that if my predictors use multiple indicators instead of mean scores of the same constructs, the -2LL is much larger than when I use mean scores. I understand that using latent variables makes the model more complicated, but at the same time it takes measurement error into account. As for model fit, though, the model using mean scores has a much lower -2LL and BIC. Shall I give up on using latent variables and use mean scores instead? Thank you very much. 


The LL is not in the same metric when comparing models with different DVs, so you can't compare. When you have multiple indicators they are included in the list of DVs. 

RuoShui posted on Monday, January 06, 2014  2:42 pm



Thank you very much Bengt! I see, so the indicators of predictor variables are considered as DVs as well. Am I correct that for multiple indicator growth models with latent predictors of the slope and intercept growth factors, the only way to evaluate model fit is to use the model test command to test whether the paths from each latent predictor to i and s are significant? For example: i on react (p1); s on react (p2); Model test: p1=0; p2=0; Thank you so much! 


Yes, indicators are DVs because they are regressed on the factors. No, I don't see why you can't evaluate model fit the regular way with latent predictors. 

RuoShui posted on Monday, January 06, 2014  4:36 pm



Dear Bengt, Thank you very much for your help! I am sorry I did not explain very well just now. I meant to ask: if I want to do model comparisons, what should I use, since -2LL should not be used with different DVs? Is the joint test the only way? Thank you! 


The joint test can be used to see if the latent predictor has significant effects. If you do the same test with the observed predictor, I guess a comparison of findings can be done. 

RuoShui posted on Monday, January 06, 2014  8:16 pm



Dear Bengt, Thank you so much! I have another related question: I want to compare model 1 (only estimating the intercept and slope growth factors) with model 2 (introducing predictor variables of i and s, where the predictor variables are latent variables with multiple indicators). As you said, -2LL should not be used. Then what indices should I use to argue that model 2 is a better model than model 1? Thank you! 


Maybe you can consider the amount of variance explained in the growth factors. But that requires that both models fit well. 

sojung park posted on Monday, January 20, 2014  3:28 pm



Dear Drs. Muthen, I am trying to test nested models for logistic regression using FIML. I figured out that I need to use the syntax Analysis: estimator=ML; integration = montecarlo; This way I am not losing any observations, but the output does not provide the standard set of information for testing nested models. Could you please help me? I do not want to use another estimator, since that seems to change the model to probit, which I do not want. Thank you so much for your help! 


You can test nested models based on ML using -2 times the loglikelihood difference for the two models. 


Hello, a question on multilevel model fit: I'd like to compare the fit of three models (null model, level-1 predictors only, level-1 and level-2 predictors). Using the loglikelihood, I have now computed the Satorra-Bentler chi-square difference statistics as described here: http://www.statmodel.com/chidiff.shtml With every set of predictors added, the LL becomes less negative (from -2514.07 for the null model to -1149.49 for the model including both level-1 and level-2 predictors). Computing the SB test results in significant differences between the null model and the level-1 predictor model, and between the level-1 and the level-2 predictor model (the test statistics become smaller, yet are still significant). Is my interpretation correct that model fit improves with every set of predictors added? And is my procedure okay? (I am using the default MLR estimator.) And, just to be on the safe side: Is it correct that the models with more predictors are nested in the models with fewer predictors? Thanks, Tanja 


Nested models must share the same set of dependent variables. In this case, the model with fewer predictors is nested in the model with more predictors. 


Thanks, Linda. So, are my procedure and my interpretation that model fit improves with the inclusion of more predictors all right? Best, Tanja 


If the difference test between the model with all covariates versus the model with fewer covariates is significant, the model with fewer covariates fits worse. 


Thank you very much! <3 Tanja 

Shiny7 posted on Thursday, October 23, 2014  12:41 pm



Hello Dr. Muthen, would you please help me with the following questions about multilevel modeling: a) Can I use AIC and BIC to compare models using MLR? b) Hox (2010, 51) points out that with MLR, "AIC and BIC can only be used for models that differ in the random, [but not the fixed] part." Can you explain what that means? c) Apart from the loglikelihood difference test, why can't I use the chi-square difference test (Satorra-Bentler correction) to compare models, as in SEM? Thank you very much! Shiny 


a) Yes. b) Hox doesn't talk about MLR, but RML (restricted ML). c) With MLR you can do the special difference testing that we describe at http://www.statmodel.com/chidiff.shtml 

Shiny7 posted on Friday, October 24, 2014  12:05 am



Dear Dr. Muthen, thanks a lot for your immediate reply! I am relieved. Shiny 


Hi Linda and Bengt, I am using MSEM to estimate 2-level models of a set of baseline covariates + teacher variables of interest on student achievement. I am looking at AIC and LL across models. These should generally decrease as predictors are added to the model. This is true when I compare AIC and LL between the baseline and final teacher models; fit improves when I add the teacher variables, because while these indices account for model complexity, the teacher variables add explanatory value. However, when I compare AIC and LL between the final models and an unconditional model with student achievement as the sole variable, varying on levels 1 and 2, I see that the fit of the conditional models is WORSE. That is, adding the predictors made the AIC and LL statistics INCREASE in absolute value, relative to the unconditional model. I believe that this occurs because certain variables in my conditional models are "y-variables on the BETWEEN level and x-variables on the WITHIN level" and hence "treated as a y-variable on both levels." This is fine with me. But given that the conditional models have, analytically speaking, more than one y-variable, not just student achievement, this gives a very different picture of the overall model when fit statistics are generated. Can you clarify or verify my thinking here? Thank you. 


Yes, extra y variables throw off the logL and BIC comparisons. Note that there isn't a clear consensus on AIC/BIC for 2-level models because it isn't clear whether the sample size should be the total sample size or the number of clusters. There are some articles on this, although I can't pinpoint them right now (probably in the SEM journal). 


Dear Dr. Muthen, When comparing nested models, is it possible to use CFI, RMSEA, and SRMR to decide which model is better? If so, are there certain cutoff points to indicate that one model is significantly better than the other? For example, when the difference in CFI is at least .01, then the model is significantly better. Many thanks in advance! Martijn 


There is some literature discussing this; you may for instance check the SEM journal. Or ask on SEMNET. Personally, I am not convinced about the value of being guided by these differences. 


Thanks for the quick response. If I were to use the chi-square difference test for comparing models, I'd need the scaling correction factor. However, I do not get it in the output of Mplus (estimator=MLM). I have the feeling that I'm overlooking something very obvious, but I cannot figure out what it is. Also, my model runs with ML and MLM, but not with MLR. Is there something I should pay extra attention to? Many thanks in advance 


You should send the MLM and MLR outputs and your license number to support@statmodel.com. 


I would recommend exploring why MLR has problems when ML doesn't; they differ only in the SEs. 


I'm sorry for all the follow-up questions. My study has a cohort-sequential design, which means low covariance coverage. Is it possible that MLR has trouble with this whereas ML does not? Many thanks! Martijn Van Heel 


Dear Bengt and Linda, I wish to produce the consistent AIC (CAIC) for my mixture models (LPAs) to follow the procedure for multigroup analysis of similarity recommended by Morin, Meyer, Creusier & Bietry (2015). The formula for CAIC given in Nylund, Asparouhov & Muthen (2007) is CAIC = -2*logL + p*(log(n) + 1). 1. Is p the "Number of Free Parameters" in the output? 2. Does Mplus calculate BIC using this formula: BIC = -2*logL + p*log(n)? 3. If yes, then CAIC is simply BIC + p, right? Thanks, 'Alim 


1. I would think so; I don't have the CAIC formula in front of me. 2. Yes. 3. Probably. 


Dear Bengt, In my latent profile analysis I do not obtain the same results when I add BIC + p and when I calculate CAIC as -2*logL + p*(log(n) + 1). In fact, when I calculate BIC using -2*logL + p*log(n), I don't get the same result as the one produced by Mplus. However, the AIC I get is the same as the one produced by Mplus. Is BIC calculated differently in mixture modeling? Thanks, 'Alim 


No, BIC is always calculated according to the formula you show so I don't see how you don't get agreement. I assume you use the natural log ("e-log") and have p = number of free parameters. If this doesn't help, you can send output to Support. 
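
To make the formulas in this exchange concrete, here is a small Python sketch. It assumes logL, the number of free parameters p, and the sample size n are copied by hand from the output; the function name and the numbers are hypothetical. Because AIC does not involve n or a logarithm, an AIC that matches alongside a BIC that does not may point to the log base or the value of n being used.

```python
import math


def information_criteria(logl, p, n):
    """AIC, BIC and CAIC from a loglikelihood.

    logl: model loglikelihood
    p:    number of free parameters
    n:    sample size (natural log is used, as assumed in the thread)
    """
    aic = -2.0 * logl + 2.0 * p
    bic = -2.0 * logl + p * math.log(n)
    caic = -2.0 * logl + p * (math.log(n) + 1.0)  # equals BIC + p
    return {"AIC": aic, "BIC": bic, "CAIC": caic}


# Hypothetical values for illustration only
ic = information_criteria(logl=-1500.0, p=12, n=400)
print(ic)

# Using log base 10 by mistake gives a different (wrong) BIC
wrong_bic = -2.0 * -1500.0 + 12 * math.log10(400)
print(round(ic["BIC"] - wrong_bic, 3))
```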


Hi, I am wondering how to estimate model fit in type=twolevel random. Thanks. 


No absolute fit statistics are available in this case. You can test nested models using -2 times the loglikelihood difference, which is distributed as chi-square. You can compare non-nested models with the same set of dependent variables using BIC. 


Hi Linda, Thanks for your quick response. I have learned a lot from reading this thread. Many thanks. 

Ger_Wel posted on Thursday, January 26, 2017 - 2:22 am



Dear people, I have some questions. Hopefully you can help me out. (a) I want to test a multilevel mediation model and build my model by adding variables step by step. When I start by testing an intercept-only model, I get a chi-square of zero with zero degrees of freedom. In every following step of building my model this stays the same. Is this because standard fit indexes are not available for multilevel models? I cannot imagine my model being saturated in the first step. Or should I start with all variables in the model (usevariables = [all the variables in the final model]), without specifying any relations in the MODEL part? In that case my model is not identified because there are not enough clusters (15 clusters). Mplus suggests reducing the number of parameters, but I specify none. Do you know where I can find information on how to do this? (b) Because of the MLR estimation I follow the procedure for difference testing using the loglikelihood (TRd = -2*(L0 - L1)/cd). Since there are no degrees of freedom, I use the number of free parameters reported in the output directly under 'Model fit information'. Is this correct? If not, how can I obtain the number of df? 


We need to see the full output; please send to Support along with your license number. 


(a) You don't specify a model in this run with only the PAP variable, which means that Mplus by default estimates its mean, its variance on within, and its variance on between. You see these 3 parameters in your output. Because you analyze only 1 variable you have only 3 pieces of sample information and they correspond to these 3 parameters, so you get a saturated model. In your other run you bring in predictors of PAP. Again, you have decided for some reason not to specify a model. Mplus defaults introduce some basic parameters to fill in this void. If you look at your output you see which parameters have been estimated. By default, because you don't specify a model, the predictors are not correlated. I assume this is not what you want. In terms of analysis strategies for multilevel modeling you may want to post on SEMNET or Multilevelnet and explain what you want to accomplish. 
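
The counting argument in (a) can be sketched in Python as follows. This is only an illustration under a simplifying assumption that is not stated in the thread: every analysis variable is continuous and varies on both levels, so the sample information consists of the means plus one covariance matrix per level. The function names are hypothetical.

```python
def two_level_sample_statistics(n_vars):
    """Pieces of sample information for a two-level model.

    Assumes all n_vars variables are continuous and vary on both
    levels: n_vars means, plus a within and a between covariance
    matrix with n_vars * (n_vars + 1) / 2 distinct elements each.
    """
    means = n_vars
    cov_elements = n_vars * (n_vars + 1) // 2
    return means + 2 * cov_elements


def degrees_of_freedom(n_vars, n_free_parameters):
    """Sample information minus free parameters (0 means saturated)."""
    return two_level_sample_statistics(n_vars) - n_free_parameters


# One variable: mean, within variance, between variance
print(two_level_sample_statistics(1))  # 3
print(degrees_of_freedom(1, 3))        # 0, i.e. saturated
```

Variables declared as within-only or between-only contribute fewer pieces of information, so the count above does not apply to them as written.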


Hi, I'm comparing model fit between two models. Model 1 is saturated and Model 2 is a nested non-saturated model. My questions are: 1. Can I still trust the parameter estimates in the saturated model? 2. As chi-square, CFI and TLI show perfect fit for the saturated model, could I still compare model fit between the two models using AIC and BIC? 3. Or should I change the saturated model (even though it conceptually makes sense) to compare fit? Thank you! 

Alex Leung posted on Saturday, February 25, 2017 - 6:40 pm



Dear Drs. Muthen, I am new to Mplus and am looking into multilevel mediation models. I understand that in multilevel models the chi-square difference test cannot be applied using the MLR estimator, and it's been suggested that we should use the Satorra-Bentler chi-square test to accommodate the multilevel structure. In this case, how do we assess model fit of a 2-1-1 model using the Satorra-Bentler chi-square test? Thanks! 


I don't see why MLR chi-square difference testing couldn't be used for multilevel mediation models. 

Hoda Vaziri posted on Tuesday, March 21, 2017 - 12:17 pm



I want to compare two three-level models using the Satorra-Bentler chi-square test. The first model is:
%WITHIN%
Y ON c;
s1 | Y ON x1;
%BETWEEN group%
s2 | Y ON x2;
%BETWEEN country%
Y ON x3;
The other is a model in which the effects of x1 and x2 on Y are fixed to 0. If I just remove the whole line (e.g., s1 | Y ON x1;), the models are not nested and I can't use the chi-square difference test. When I say, for example, s1 | Y ON x1@0, the intercepts for s1 and s2 are still calculated. Would you please help me with how I should specify a nested model that does not include the random effects of x1 and x2 on Y? 


This statement is an incorrect use of the Mplus language: s1 | Y ON x1@0. The s1 | statement does not refer to a (single) parameter but to a variable s1, which can have many parameters on other levels (mean, variance, regression coefficients, etc.). You can say y on x1@0. But note that a random slope model cannot be compared to a fixed slope model using chi-square difference testing because it involves setting a variance to zero, which is on the border of its parameter space. Instead, use BIC to compare the models. 


What is the minimum number of clusters needed for a two-level CFA with 13 indicators? I currently only have preliminary data available (only 3 clusters) and have received the following message: "THE STANDARD ERRORS OF THE MODEL PARAMETER ESTIMATES MAY NOT BE TRUSTWORTHY FOR SOME PARAMETERS DUE TO A NON-POSITIVE DEFINITE FIRST-ORDER DERIVATIVE PRODUCT MATRIX. THIS MAY BE DUE TO THE STARTING VALUES BUT MAY ALSO BE AN INDICATION OF MODEL NONIDENTIFICATION." 


20. 

Sara Riese posted on Tuesday, July 31, 2018 - 4:41 pm



Hello Drs Muthen, I am trying to use the Ryu and West (2009) approach to estimate model fit of my multilevel path analysis. However, when I run a fully saturated between model, the model will not run. My code is below:
TITLE: patient satisfaction
DATA: FILE IS aim2_data.dat;
VARIABLE:
  NAMES ARE facility inf hr tech1 tech2 techany inter firstANC educ parity dist public level urban patsat;
  USEVARIABLES ARE facility inf hr techany inter firstANC educ parity dist public level urban patsat;
  BETWEEN ARE inf hr public level urban;
  WITHIN ARE educ firstANC parity dist;
  CATEGORICAL IS patsat;
  CLUSTER IS facility;
  MISSING ARE ALL (99);
ANALYSIS:
  TYPE IS TWOLEVEL;
  ESTIMATOR IS wlsmv;
MODEL:
  %WITHIN%
  patsat ON firstANC educ parity dist techany inter;
  %BETWEEN%
  patsat public level urban inf hr techany inter WITH patsat public level urban inf hr techany inter;
The output stops with the following messages:
SINGULAR INFORMATION MATRIX PROBLEM OCCURRED IN THE UNIVARIATE ESTIMATION OF VARIABLE PUBLIC.
SINGULAR INFORMATION MATRIX PROBLEM OCCURRED IN THE UNIVARIATE ESTIMATION OF VARIABLE URBAN.
THE WEIGHT MATRIX COULD NOT BE COMPUTED.
Do you have any advice on how to get this model to run? Both public and urban are binary. Thanks in advance! 


Try Estimator = ML or Bayes instead. 


Hi, I am using MSEM to estimate 2-level models and I estimated two different models: Model 1, including only control variables, and Model 2, including one further independent variable and two moderator variables. According to AIC and BIC, the fit of Model 2 is worse than that of Model 1 even though further predictors are added to the model. Model 2 actually represents my "full" model used for hypothesis testing. Is there any explanation for AIC and BIC becoming greater? How would you suggest I proceed? Thank you very much in advance! 


You might be misreading AIC and BIC. Comparison applies only if the dependent variables in both models are identical, and from your statement I am thinking that this is not the case. I would suggest that you rerun Model 1 as follows: use Model 2 but fix those extra parameters that are for the additional predictors to zero. That way the set of variables used in the two models is identical. Use the likelihood ratio test, as well as the Wald test, to confirm the conclusion; see MODEL TEST in the User's Guide. 


Thank you very much for your quick response! The dependent variable is the same for both models. The only thing that changes is the number of Level 1 predictors. Do they always have to be the same to compare AIC and BIC? I ran another model, Model 3, including a Level 2 moderator, and model fit improves compared to Model 2 (BIC and AIC get smaller). Thanks! 


I have a further consideration in addition to my previous post: Could the worse model fit of Model 2 compared to Model 1 be a result of adding random slopes to fixed effects? I specified the influence of the control variables as direct fixed effects like "y on c" and the further predictors as defined slopes "s | y on x". 


The number of predictors doesn't affect AIC and BIC; again, you can run the model with added predictors and regression coefficients fixed @0 and you should get identical AIC and BIC. If you have a mismatch between the number of dependent variables, usually the problem can be spotted very easily because AIC and BIC will be on a completely different scale. Basically the rule that Mplus uses to determine if a variable is dependent or independent is this: if in Model Results you get a variance parameter the variable is dependent, otherwise it is not. It is not unusual to add more parameters to a model and get a worse AIC and BIC. This is because they have a penalty for the number of parameters to prevent overfitting. You might have to send your question to support@statmodel.com. There are many complexities that could affect things. Centering of the covariate, for example: http://www.statmodel.com/download/CenteredMediation.pdf To make sure you are comparing the same type of model, look at the likelihood. You can convert a random slope to a fixed slope by fixing its variance to 0 and the mean to the fixed slope estimate. If x is a within and between variable then s | y on x is not quite the same as Y on X, because s | y on x would use the entire X in that regression while Y on X will use only the within portion. 
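
The penalty argument can be made concrete with a short Python sketch. It assumes the standard definitions AIC = -2*logL + 2*p and BIC = -2*logL + p*ln(n) and that both models use the same set of dependent variables. The function name and numbers are hypothetical, and which n to use for a 2-level model (total sample or number of clusters) is left to the user, since the thread notes there is no consensus.

```python
import math


def min_loglik_gain(extra_params, n):
    """Smallest loglikelihood increase that still lowers AIC / BIC.

    extra_params: number of parameters added to the model
    n:            sample size used in the BIC penalty (an assumption;
                  total sample size or number of clusters)
    AIC drops only if the gain exceeds extra_params;
    BIC drops only if the gain exceeds extra_params * ln(n) / 2.
    """
    return {
        "AIC": float(extra_params),
        "BIC": extra_params * math.log(n) / 2.0,
    }


# Hypothetical example: 3 added parameters, n = 500
thresholds = min_loglik_gain(3, 500)
observed_gain = 4.2  # logL(bigger model) - logL(smaller model), made up
for name, needed in thresholds.items():
    verdict = "improves" if observed_gain > needed else "gets worse"
    print(f"{name}: need gain > {needed:.2f}; {name} {verdict}")
```

With these made-up numbers the larger model lowers AIC but raises BIC, which shows how the two criteria can disagree.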
