Evaluation of model fit
Message/Author
 Larry Farmer posted on Monday, November 01, 1999 - 10:36 pm
Given that standard fit indices are not available when estimating multilevel models, what criteria can be used to evaluate these models?
 Linda K. Muthen posted on Friday, November 05, 1999 - 9:31 am
The residuals are available and can be used to assess model fit. We are working on RMSEA, TLI, and CFI for multilevel models. I will post ways to use information from existing Mplus outputs to calculate these.
 Linda K. Muthen posted on Friday, November 12, 1999 - 5:32 pm
A robust RMSEA for multilevel models can be calculated using results from Mplus.

RMSEA = sqrt((2 * Fmin / t)- (1/n))*sqrt(g)

where Fmin = the last value from the function column of TECH5
n = sample size (total of all groups if multiple group)
t = trace of the product of the u and gamma matrices. See Satorra (1992) for a
definition.
g = number of groups

For computational purposes, you can calculate RMSEA as follows:

RMSEA = sqrt((chi-square/(n*d)) - (1/n))*sqrt(g)

where d is the degrees of freedom, n is the total sample size, chi-square is the model chi-square, and g is the number of groups. The chi-square from either MLM or MLMV can be used; the RMSEA will be the same whichever one is used.
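For readers who want to script this, the computational formula above can be sketched as follows (Python used purely for illustration; the function name and the clamping at zero are our additions, not part of any Mplus output):

```python
import math

def multilevel_rmsea(chi_square, df, n, g):
    """RMSEA for multilevel models, computed from Mplus chi-square output.

    chi_square : MLM or MLMV chi-square value
    df         : degrees of freedom of the tested model
    n          : total sample size (summed over groups if multiple group)
    g          : number of groups
    """
    # The term under the square root can go slightly negative by chance;
    # clamping at zero follows the usual RMSEA convention.
    value = chi_square / (n * df) - 1.0 / n
    return math.sqrt(max(value, 0.0)) * math.sqrt(g)

# Example: chi-square = 100 on 50 df, n = 500, one group
# sqrt(100/(500*50) - 1/500) = sqrt(0.002), about 0.045
```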
 Linda K. Muthen posted on Tuesday, November 16, 1999 - 3:16 pm
Two other fit measures can be computed for multilevel models. They are TLI and CFI. I don't know how well these fit indices perform for multilevel models. Their range is from zero to one with one being good fit. In trying them out, I found it difficult to find a value very far from 1. However, this may not be different than for regular SEM models.

Both of these measures require information from a baseline model in addition to the model being tested. Typical baseline models have zero covariances. I looked at a multilevel baseline model with free means and variances on the between level and free variances on the within level. This is obtained in Mplus by having no statements in the MODEL command. Another choice is a model with free means, variances, and covariances on the between level and free variances on the within level. This is obtained in Mplus by using WITH statements on the between level to free the covariances. I did not try this.

TLI and CFI can be computed as follows where

chib = chi-square for the baseline model
dfb = degrees of freedom for the baseline model
chit = chi-square for the model being tested
dft = degrees of freedom for the model being tested

TLI = (chib - (chit * dfb/dft))/(chib-dfb)

CFI = 1 - (max(chit - dft, 0) / max(chit - dft, chib - dfb, 0))
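As a quick sketch of the two formulas (Python for illustration; the guard against a zero denominator is our addition):

```python
def tli(chib, dfb, chit, dft):
    """Tucker-Lewis Index from baseline and tested-model chi-squares."""
    return (chib - chit * dfb / dft) / (chib - dfb)

def cfi(chib, dfb, chit, dft):
    """Comparative Fit Index from baseline and tested-model chi-squares."""
    denom = max(chit - dft, chib - dfb, 0.0)
    if denom == 0.0:  # both models fit at least as well as their df
        return 1.0
    return 1.0 - max(chit - dft, 0.0) / denom

# Example: baseline chi-square 1000 on 45 df, tested model 60 on 40 df
# gives TLI of about 0.976 and CFI of about 0.979
```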
 Shyuemeng Luu posted on Tuesday, December 21, 1999 - 7:28 am
Dr. Muthen:

(1) The R-square is given for each variable in multilevel analysis; however, is there any way to know the R-square for the overall model? How do we know how much variance is explained by the overall model?

(2) Can we include contextual variable(s) at the individual level?
 Bengt O. Muthen posted on Tuesday, December 21, 1999 - 4:23 pm
Regarding (1), the typical SEM focus is on R-square for each dependent variable, not variance explained in some more global sense. Perhaps you are thinking about overall model fit. In SEM, the model fit is not a direct function of variance explained, but chi-square and other fit indices are useful. Regarding (2), a level-2 variable can be allowed to influence a level-1 variable. This is represented on the between level, letting the level-2 variable influence the level-2 part of the level-1 variable.
 Lee Van Horn posted on Tuesday, February 22, 2000 - 12:35 pm
In the above, the degrees of freedom and chi-square for the baseline model would be obtained using a model with no between or within structure specified, right? Presumably chib-dfb is going to be the largest value for the denominator of the CFI equation in most cases? The difficulty I'm having is that my independence model "failed to converge with serious problems in the iterations." I would like to be sure I was doing it right before giving up on calculating the CFI.

I've been specifying a confirmatory within model with no structure for the between model. Some of the ICCs are very small, and I've had problems with the between-model variance estimates being 0 for a couple of variables. This produces errors in calculating the t scores and standardized loadings (negative t scores and standardized loadings of 999). Calculating start values with 0.5*(Sb-sw)/c did not change the estimates. I don't, however, see any reason that this would invalidate my model, especially given that I'm primarily interested in the within model. Are there any problems with this logic? There are theoretical reasons for sticking with the two-level model notwithstanding the low ICCs.

Lee Van Horn
 Bengt O. Muthen posted on Wednesday, February 23, 2000 - 10:06 am
Yes to your questions in your first paragraph. The independence model can fail to converge with small ICCs, and in such cases some between variances may need to be fixed at zero. Note that zero variances are OK on the between level because the between covariance matrix is not inverted. Instead, Sigma_W + s*Sigma_B is inverted (see the Appendix of the User's Guide).

Fixing some between variances at zero can also be used for your confirmatory model and does not invalidate your within model. If you are only interested in the within part, regular analysis of S_W gives very similar results.
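The ICCs mentioned here are the usual variance ratios. As a minimal sketch (Python; the function name is ours):

```python
def icc(sigma_between, sigma_within):
    """Intraclass correlation: the share of total variance that lies
    between clusters."""
    return sigma_between / (sigma_between + sigma_within)

# Example: between variance 0.05, within variance 0.95 gives ICC = 0.05,
# the kind of small ICC that can make the between part hard to estimate
```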
 Anonymous posted on Friday, February 16, 2001 - 12:38 pm
Would you provide some context, guidelines, or references for interpreting the WRMR statistic provided in Mplus 2.0?

Do you feel the RMSEA and WRMR provide a better indication of overall model fit than CFI and TLI in situations where the data are suspected to deviate from multivariate normality assumptions, or where sample size is of concern?
 Bengt O. Muthen posted on Friday, February 16, 2001 - 3:26 pm
The Version 2 User's Guide gives some background for WRMR on pages 361-362 that states things better than I can do here. There is also a paper by Yu-Muthen that is now in draft form and will shortly be made available. So far, the WRMR performance looks quite good for both continuous and categorical outcomes. We have not yet studied non-normal continuous data situations, so we have to wait with a conclusion of how WRMR compares to other indices in such cases. WRMR is designed to work well here. I refer to the Hu-Bentler articles for comparisons of the other fit indices for continuous non-normal data.
 Ching-yun Yu posted on Saturday, February 17, 2001 - 9:13 pm
In our study, WRMR (with a cutoff value of .9) is a better fit index than RMSEA (cutoff value of .06) and CFI (cutoff of .95). Generally speaking, RMSEA performs better than CFI for continuous outcomes, whereas CFI performs slightly better than RMSEA for dichotomous outcomes [except for misspecified factor-loading (complex) models when N<=250].

Marsh, Balla, and McDonald's (1988) simulation study on four different data sets has shown that TLI is relatively independent of sample size. Moreover, Hu and Bentler (1998, p. 446) mention that TLI is less sensitive to sample size and performs consistently well even when models deviate from the multivariate normality assumptions.
 Ching-yun Yu posted on Saturday, February 17, 2001 - 9:48 pm
Hu and Bentler (1998; 1999) have shown that CFI, TLI and RMSEA (ML- and GLS- based) perform well across different sample sizes and distributions. However, TLI and RMSEA are less preferable at small sample sizes (e.g. N<=250).
 Frank Lawrence posted on Monday, October 22, 2001 - 12:51 pm
For a complex model [p.135 User Guide], how are the degrees of freedom calculated?
 Linda K. Muthen posted on Monday, October 22, 2001 - 4:03 pm
For the MLM estimator, degrees of freedom are calculated in the usual way. For the MLMV estimator, the formula for the degrees of freedom is found on page 358 of the new Mplus User's Guide. It is formula 110.
 Jennifer Johnson posted on Tuesday, October 22, 2002 - 2:29 pm
I am conducting a two-level structural equation modeling analysis, with about 275 people and about 75 groups (unbalanced). I have found models that appear to fit well at both the within and the between-groups levels, but two of the r-squared's for the between-groups model are slightly above 1 (1.005 and 1.08). All the other model parameters have reasonable values and suggest a good fit. What do I do with the > 1 r-squared's in terms of the analysis and reporting results?

Thank you,
Jennifer Johnson
 Linda K. Muthen posted on Tuesday, October 22, 2002 - 3:03 pm
Can you please send the output to support@statmodel.com. I'd like to take a look at it before I answer your question.
 Linda K. Muthen posted on Thursday, October 24, 2002 - 10:38 am

In your multilevel factor analysis, your R-square values on the between level are printed as undefined in the Mplus output because the corresponding estimated residual variances on the between level are negative. The greater-than-one values are shown next to "Undefined" to alert you to the fact that you have a problem. If the negative values are very small, the residual variance can be set to zero. If not, you may want to rethink your model. It is often the case that between-level residuals (unreliabilities) are small because between-level correlations are high relative to within-level correlations.
 Anonymous posted on Friday, May 30, 2003 - 3:58 am
How do you explain the adjustment measuring chi2 and RMSEA in LISREL? What is similar and what is different?
 Linda K. Muthen posted on Friday, May 30, 2003 - 8:15 am
I don't understand what you are asking. Can you be more specific?
 Michael Eid posted on Wednesday, January 19, 2005 - 5:56 am
I am interested in a two-level factor model for six ordinal variables and four factors on the within and between level. I got results for the MLF estimator but no chi-square test (although this possibility is mentioned in the manual on p. 401). What do I have to do to get a goodness-of-fit statistic for the model?

Moreover, I have not got results for the MLR estimator because of memory problems. Is there any way to handle this problem?
 BMuthen posted on Thursday, January 20, 2005 - 8:00 pm
You don't get fit statistics because there is not a single covariance matrix to test against with ordinal outcomes using maximum likelihood. I suspect that the memory problem has to do with too many dimensions of integration. To circumvent this, you can try INTEGRATION = MONTECARLO. However, I wonder why you didn't have this problem with MLF.
 Anonymous posted on Thursday, March 31, 2005 - 7:50 am
Is it possible to get fit indices other than LL, AIC, and BIC when using a categorical dependent variable in a two-level model? Preferably CFI, RMSEA, etc. (i.e., something with recommended cutoffs to guide model acceptance).
(I am using TWOLEVEL MISSING H1 and MLR.)

 Thuy Nguyen posted on Thursday, March 31, 2005 - 1:42 pm
Two-level modeling with a categorical dependent variable requires numerical integration. The only fit indices available for analysis with numerical integration are the ones that are printed (LL, AIC, and BIC).
 Rinus Voeten posted on Wednesday, May 25, 2005 - 4:02 am
I am using multilevel modeling with the MLR estimator. Can I use the difference in -2LL as a chi-square statistic (comparing nested models)? Or is some scaling factor required?
 bmuthen posted on Wednesday, May 25, 2005 - 10:58 am
MLR chi-square difference testing requires a scaling factor in line with what is described on the Mplus web site for MLM. For 2-level models without random slopes, testing against an unrestricted H1 model is available and then the scaling factor is given. If you have random slopes, but don't have strong non-normality, you can use ML and do the difference testing the usual way.
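The scaled difference test referred to here follows the Satorra-Bentler-type correction described in the note on the Mplus website; a sketch under that reading (Python; variable names are ours):

```python
def scaled_chi2_diff(t0, c0, d0, t1, c1, d1):
    """Scaled chi-square difference test for two nested models.

    t0, c0, d0 : chi-square, scaling correction factor, and df of the
                 more restrictive (H0) model
    t1, c1, d1 : the same quantities for the less restrictive (H1) model
    Returns (scaled difference statistic, df of the difference).
    """
    # Scaling correction for the difference; in small samples this can
    # come out negative, in which case the test is not usable.
    cd = (d0 * c0 - d1 * c1) / (d0 - d1)
    trd = (t0 * c0 - t1 * c1) / cd
    return trd, d0 - d1
```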
 Rinus Voeten posted on Wednesday, May 25, 2005 - 1:45 pm
Thank you very much.

I would like to ask another question. I am using two-level modeling with students in classrooms. I would like to distinguish gender context within the classroom. One way I used was applying multi-group modeling with girls and boys as groups, and classrooms defining the between level. This worked fine until I introduced a cross-level interaction (using random slopes). Then the deviance (-2LL) increased (it was about doubled) compared with the model without this cross-level interaction. Is this a source for concern?

When I declared my moderator variable as a variable at the within level and introduced the classroom means of this moderator at the between level, then this increase of the deviance did not happen. Is it a sensible approach to add aggregated variables at the between level? Or is it always better to let Mplus construct the between-level variables from the predictors in the within-level part?
 bmuthen posted on Wednesday, May 25, 2005 - 3:40 pm
Multiple-group, 2-level modeling should use groups based on between-level variables, not within-level variables. So if you have students within schools, multiple-group analysis can study Catholic versus Private versus Public schools (as in Muthen, Khoo et al). The rule must be independent samples. Due to intraclass correlation, they are not independent if you consider gender groups within schools (or classrooms). In fact, I am not aware of any published methodology for doing what you want. I have, however, seen this need in at least 3 recent instances now, and the Mplus team has an idea for exploration on how to approach it, so maybe we should get going.

Regarding your second question about cross-level interaction, I would be surprised to see such a big change in the loglikelihood.

Regarding between variables, the observed cluster mean entered explicitly as an observed variable in the data should work fine.
 Rinus Voeten posted on Thursday, May 26, 2005 - 1:25 am
Thanks again.

Yes, I was also concerned about the dependency issue, which will also bother us when we define the between level as same-gender groups within classrooms. We then ignore the classroom level. In our case, though, it seems more important to attend to the gender context within a classroom than to the classroom context.

Concerning the value of the loglikelihood, the big change was surprising. But then, it was also surprising that this big change did not occur when entering the observed cluster mean as an observed variable in an otherwise identical model.

We are looking forward to the results of your exploration of the problem.
 Michael J. Zyphur posted on Friday, May 27, 2005 - 7:59 pm
Hi,
above it was stated that for -2LL difference testing "MLR chi-square difference testing requires a scaling factor in line with what is described on the Mplus web site for MLM." Can you please tell me where on the website I should go to read about this?

thanks
 Linda K. Muthen posted on Saturday, May 28, 2005 - 6:23 am
Go to the home page and you will find Chi-Square Difference Testing for MLM under Special Analyses With Mplus.
 Ray Sparrowe posted on Thursday, June 09, 2005 - 7:24 am
Using Mplus, I am testing a 2-level model with a continuous dependent variable. At level 1, there are control variables, continuous independent variables, and three interaction terms. At level two, the slopes are fixed and the intercept is regressed on two level-2 control variables.

My question concerns demonstrating that the model improves with the addition of the interaction terms, analogous to the evaluation of the change in r-squared in OLS. I've computed the difference in deviance (-2LL) between the two models, and the chi-square value is statistically significant. I interpreted this to mean that the inclusion of the interaction terms improved the fit of the model.

A reviewer has challenged my approach, arguing that I should compute the change in R-squared instead. Snijders and Bosker (1999), Chapter 7, describe a method for computing change in r-squared. I certainly can do that, but I question whether it is a sufficient substitute for the -2LL chi-square difference test. If I understand Snijders and Bosker correctly, their computation approximates a relative "effect size" between nested models, but does not represent a test of whether the change in r-squared is different from zero (analogous to the F-statistic in OLS).

My take is that the -2LL difference is the important test for assessing whether there is a statistically significant improvement in the model when interaction terms are added. But I don't want to fall prey to my own assumptions. Is anyone aware of a reason why I should abandon the -2LL chi-squared difference test in favor of the Snijders and Bosker approach in this situation?

Thanks!
 Linda K. Muthen posted on Thursday, June 09, 2005 - 8:34 am
I would agree with you. Chi-square is a test of model fit. R-square gives the variance explained for a single dependent variable. I would agree with your difference testing if you are interested in seeing whether adding interactions improves overall model fit. If you do this, don't forget that in the model without the interaction(s), the interaction variable(s) must be included. For example, if you have a model with an interaction where x1x2 is the interaction

y ON x1 x2 x1x2;

then you must have

y ON x1 x2 x1x2@0;

in the model without the interaction.
 Ray Sparrowe posted on Thursday, June 09, 2005 - 3:48 pm
Thanks, Linda.

One clarification: your note helped me realize that I can test the difference in chi-squared for my two models in SEM fashion if I add the interaction terms and fix them at zero. In my earlier question, what I was doing was running one model without the interaction terms included, then running a second model with the interaction terms. I then computed "deviance" scores for each model (-2 log likelihood) and subtracted the smaller (model with interactions) from the larger (model without interactions). I then looked at a chi-square table to see if this difference is statistically significant given the change in degrees of freedom.

So now I'm curious whether there are substantive differences between my approach and what you suggested.

Again, thank you.
 Linda K. Muthen posted on Thursday, June 09, 2005 - 5:13 pm
-2 times the difference in loglikelihoods for two nested models is distributed chi-square so would do the same thing as a chi-square difference test. You would still need to include the interaction terms and zero them out. I don't believe the models are nested if you don't include them.
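The -2LL difference described here is the ordinary likelihood-ratio statistic; a minimal sketch (Python; the numbers in the example are made up):

```python
def lr_test_stat(ll_restricted, ll_full):
    """-2 times the loglikelihood difference of two nested ML models.

    Compare the result to a chi-square critical value with df equal to
    the difference in the number of free parameters.
    """
    return -2.0 * (ll_restricted - ll_full)

# Example: restricted model LL = -2360.5, full model LL = -2351.2,
# 3 extra parameters: statistic 18.6, well above the 3-df .05
# critical value of 7.81, so the extra parameters improve fit
```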
 Anonymous posted on Friday, July 15, 2005 - 4:54 pm
Hi,
I have siblings in my data, so I've been running growth curve models with TYPE=COMPLEX, ESTIMATOR=MLR. I've noticed that my CFI/TLI are very low (sometimes even negative values) with the complex command. I ran the same models without the complex command and got nice fit indices (around .9). The differences are quite striking to me. Why is there such a big difference between the two? Thank you.
 bmuthen posted on Friday, July 15, 2005 - 6:17 pm
Please post the 2 inputs you are comparing here.
 Anonymous posted on Monday, July 18, 2005 - 9:12 am
Hi, here are my 2 inputs. Thank you so much for you help.

input #1 (w/ complex command)
USEVARIABLES ARE AMCORT1 AMCORT4 AMCORT7 AMCORT10 AMCORT13 ;
CLUSTER IS STUDYSIB;
USEOBSERVATIONS=CONDTION EQ 2;

MODEL:
I by AMCORT1 - AMCORT13@1;
S by AMCORT1 @0 AMCORT4 @.4 AMCORT7 @.7 AMCORT10 @1 AMCORT13 @1.3;
I S;
[I S];
I with S;
[AMCORT1 - AMCORT13 @0];
AMCORT1 WITH AMCORT4;
AMCORT4 WITH AMCORT7;
AMCORT7 WITH AMCORT10;
AMCORT10 WITH AMCORT13;

ANALYSIS: TYPE=MISSING COMPLEX H1;
ESTIMATOR=MLR;
H1ITERATIONS=50000;

input #2 (w/o complex command)
USEVARIABLES ARE AMCORT1 AMCORT4 AMCORT7 AMCORT10 AMCORT13 ;
USEOBSERVATIONS=CONDTION EQ 2;

MODEL:
I by AMCORT1 - AMCORT13@1;
S by AMCORT1 @0 AMCORT4 @.4 AMCORT7 @.7 AMCORT10 @1 AMCORT13 @1.3;
I S;
[I S];
I with S;
[AMCORT1 - AMCORT13 @0];
AMCORT1 WITH AMCORT4;
AMCORT4 WITH AMCORT7;
AMCORT7 WITH AMCORT10;
AMCORT10 WITH AMCORT13;

ANALYSIS: TYPE=MISSING H1;
H1ITERATIONS=50000;
 bmuthen posted on Monday, July 18, 2005 - 4:58 pm
Can't see anything there - are you using version 3.12? Please send your output, data and license number to support@statmodel.com.
 Samuel posted on Saturday, January 07, 2006 - 7:31 am
Hello Drs. Muthén,

I have a question regarding the chi-square difference test for MLR: I would like to compare two nested MFA models, which differ by only 1 df. The scaling correction factor of the less restrictive model is slightly higher than that of the more restrictive model (.69 vs. .62), which produces a negative value for the chi-square difference test. Does this have a special meaning, and how would you treat it? Many thanks for any suggestions.
 Linda K. Muthen posted on Saturday, January 07, 2006 - 8:18 am
This can happen and has been discussed in the literature by Bentler. I think this has also been discussed on SEMNET. You may want to look at the archives. The asymptotic behavior did not hold in your situation, and the test therefore cannot be used.

If there is a difference of only one parameter, you can look at the z test for that parameter in the less restricted model.
 Samuel posted on Saturday, January 07, 2006 - 3:07 pm
Many thanks, that helps a lot!
 Marco posted on Monday, January 09, 2006 - 6:59 am
Hello Drs. Muthén,

I have done an MFA according to your recommended steps from the 1994 paper. Regarding the EFA of SIG-B/S-B, the analysis showed a reasonable chi-square statistic and factor structure, but the RMSEA was far above the usual cutoff criteria (.16). All other analysis steps, including the final MFA, showed acceptable chi-square and fit indices. Judging from your experience, is the distorted RMSEA of the S-B EFA problematic/meaningful?

Thanks a lot for your help!
 Linda K. Muthen posted on Monday, January 09, 2006 - 8:36 am
I think the paper says that chi-square (and therefore any fit statistic based on chi-square) is not correct because the maximum likelihood assumptions are not fulfilled by the estimated sigma-between matrix. You should look at other descriptive measures like eigenvalues and RMSR.
 Marco posted on Tuesday, January 10, 2006 - 1:26 am
OK, I am sorry, I was unaware that the RMSEA is directly based on the chi-square. The paper says that distorted statistics are likely with sigma-between, but I guess there's no reason to expect these with the sample-between. Thanks for your patience.
 bmuthen posted on Tuesday, January 10, 2006 - 8:50 am
Sample-between has the problem of not being an estimator of Sigma-between. It estimates a linear combination of Sigma-between and Sigma-within. See the Tech appendix for Version 2. So it is preferable to analyze the estimated Sigma-between. Just don't trust ML-based fit measures literally.
 Lois Downey posted on Thursday, May 25, 2006 - 11:49 am
I'm doing a CFA with 801 patients clustered under 92 physicians. The indicators are dichotomous. When run as a complex model estimated with WLSMV, an 11-indicator single-factor model fits well (chi-square p>.25, CFI=1.00, TLI=1.00, RMSEA = 0.015, WRMR = 0.620).

Ultimately, however, I need to run a two-level SEM in order to test both patient and physician characteristics as predictors of the latent variable. Before doing that, I'd like to reevaluate the measurement model with a two-level CFA. Although you've given formulas for computing several measures of fit for two-level models, the formulas require components that are unavailable for categorical indicators. Is there a way to evaluate the fit of my two-level CFA model other than looking at residuals? If not, what value on the residuals would provide evidence of inadequate fit?
 Bengt O. Muthen posted on Friday, May 26, 2006 - 5:29 am
It is hard to evaluate the fit of a model to data when the outcomes are categorical and there are many of them. Already with 11 binary outcomes, you have a frequency table with too many zero cells for the LR or Pearson chi-squares to work. On top of that, there is the issue that you want to take the clustering into account. The WLSMV approach does not exactly test against data but against an unrestricted tetrachoric correlation matrix for the 11 outcomes (as discussed in my 1993 chapter in the Bollen-Long book; see web site References).

Bivariate residuals (tech10) are useful but again don't yet take the clustering into account.

I would do what is often done in statistics - work with neighboring models. Using ML, you get a loglikelihood value and you can use this for LRT chi-square testing of a sequence of models. The models to be compared would be 1 and 2 factors for patients and 1 and 2 factors for physicians.
 Emmanuel Kuntsche posted on Monday, September 11, 2006 - 4:55 am
I am currently working on two projects in which I use Mplus. In the first one, we test the mediation of drinking motives in the link between personality factors and adolescent alcohol use. We have about 50 items, 5 latent variables, direct and indirect effects, and more than 2000 individuals. The second project deals with environmental factors of adolescent alcohol use and I performed a twolevel SEM with 4 items at level two and 10 items and a latent variable at level 1. In total, we have more than 6000 participants grouped in about 400 units.

In both projects, we get good SRMR and RMSEA values (around .06) but not satisfactory CFI and TLI values (around .85). Is it the case that, with complex models and large sample sizes, SRMR and RMSEA tend to perform better than CFI and TLI?

I saw that the SRMR is given for the first and the second level separately, but not RMSEA, CFI, and TLI. Are the latter nevertheless valid for two-level modeling, or is some rescaling of these indices necessary, or anything of this kind?
 Bengt O. Muthen posted on Monday, September 11, 2006 - 9:43 am
I am not aware of a literature on the empirical performance of these fit indices for two-level models. Short of doing your own simulation study, here are two alternative ways to learn more. A useful approach when you have only random intercepts is to go through the 5 steps of the Muthen (1994) article in SM&R, where the first step is to analyze S-pooled-within. This makes it possible to rely on the regular literature for fit indices. Another approach is to rely less on fit indices and more on chi-square difference testing of neighboring models.
 Kajsa Yang-Hansen posted on Friday, December 01, 2006 - 6:19 am
When evaluating a two-level model, Mplus presents SRMR for both the individual level and the collective level. My first question is: what are the critical values for SRMR at each level? My second question is: how important is SRMR in evaluating a two-level model, since we usually look at chi-square, RMSEA, and CFI only. Thanks in advance.
 Linda K. Muthen posted on Friday, December 01, 2006 - 10:07 am
I don't think critical values have been studied in the multilevel framework. It is a descriptive measure. The suggested cutoff for simple random samples is less than or equal to .07 or .08. If your SRMR is a lot larger than this, I would question why.
 Nordin abd Razak posted on Sunday, January 14, 2007 - 4:30 am
I need your opinion as well as your suggestion to solve my problem.
I analysed my data using multilevel mixture modelling. In the initial step, I ran the null model (that is, with no level-1 or level-2 predictors) and got these results:

Loglikelihood:

H0 value = -2351.213
H1 value = -2351.210

Information criteria:

Number of free parameters = 3
Akaike (AIC) = 4708.427
Bayesian (BIC) = 4723.580

Sample size = 1154

Second step, I included predictors, and in my final model I got not only direct effects from the level-2 variables but also several interaction effects on the slopes. I have these results:

Loglikelihood:

H0 value = -6705.187
H0 scaling correction factor for MLR = 1.630

Information criteria:

Number of free parameters = 38
Akaike (AIC) = 13486.374
Bayesian (BIC) = 13678.312
Sample size = 1154

My problem is: how can I say that the final model is the best model I managed to get from my data? Should I use the formula BIC = -2LL + r*ln(N)? I found that the final model has the higher -2LL. What should I do?
 Linda K. Muthen posted on Sunday, January 14, 2007 - 10:49 am
The loglikelihood, BIC, and AIC are used to compare models, not as absolute values. They are not comparable between models with and without covariates. I would first get a good model without covariates and then a good model with covariates. See the Muthen chapter in the book edited by Kaplan, available on the website, for how to select models.
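The formula quoted in the question, BIC = -2LL + r ln N, together with AIC = -2LL + 2r, reproduces the null-model values posted above (Python sketch; the function names are ours):

```python
import math

def aic(ll, r):
    """Akaike information criterion from loglikelihood ll and number of
    free parameters r."""
    return -2.0 * ll + 2.0 * r

def bic(ll, r, n):
    """Bayesian information criterion; n is the sample size."""
    return -2.0 * ll + r * math.log(n)

# Null model posted above: LL = -2351.213, r = 3, N = 1154.
# aic(...) and bic(...) match the reported 4708.427 and 4723.580 up to
# rounding. Lower values are preferred when comparing models.
```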
 Jeremy Flaherty posted on Saturday, September 29, 2007 - 12:41 pm
Im estimating a two-level SEM. There are multiple ordinal-level mediator variables included in the structural part of the model and the three factor indicators for the single continuous latent variable are all ordinal-level (the between and within factors use the same indicators). A version of my input is below.

My questions:

1) Are there fit indices for this model that I need to show? If not, what do I report? (The output, of course, only includes BIC and AIC. Some of what I've been reading seems to indicate that fit statistics just don't exist for SEMs with categorical data involved.)

2) Do I even need fit indices if my only interests are the path coefficients, R-squares, and the variance components? I'm not actually testing the veracity of the model itself; I'm only commenting on the importance of the hierarchical nature of the data.

Thanks

CLUSTER = cluster;
CATEGORICAL = u1 u2 u3 u4 u5;
WITHIN = x1 x2 x3;
BETWEEN = x4 x5;
ANALYSIS:
TYPE = TWOLEVEL;
ESTIMATOR = MLR;
INTEGRATION = MONTECARLO;
PROCESS = 2;
MODEL:
%WITHIN%
fw BY u1 u2 u3;
u4 u5 ON x1 x2 x3;
fw ON u4 u5 x1 x2 x3;
%BETWEEN%
fb BY u1 u2 u3;
u4 u5 ON x4 x5;
fb ON x4 x5;
 Bengt O. Muthen posted on Sunday, September 30, 2007 - 10:41 am
1) I am not aware of overall model fit indices for twolevel modeling involving categorical indicators. There is however a new method coming out in Mplus Version 5 which will provide this. In fact, if you like, I'd be interested in borrowing your data to try out your model with this new approach. Otherwise, the standard statistical approach of comparing your model to a less restricted neighboring model can be used - but it may be tricky to come up with such a model in your case. Checking residuals is also important.

2) Your model has restrictions (it is overidentified) so you need to know that it is a reasonable representation of your data - otherwise parameter estimates based on it should not be interpreted.
 Nathan Tintle posted on Saturday, February 02, 2008 - 6:10 am
I am trying to assess model fit (path model) for a multilevel model (cross-sectional complex survey data) with categorical (dichotomous) variables. Based on my reading above it seems that this may not be possible--at least not with a simple fit statistic. That said, in the last post (Sept 2007) Dr. Muthen alludes to this possibility available in Version 5. Is this true? I downloaded version 5 but don't get any fit statistics aside from log-likelihood and AIC/BIC.... Do I have to request something specific?

Assuming there is no fit statistic yet available, you talk about comparing to a "less restricted neighboring model" -- can you clarify this idea? Do you just mean a model with fewer variables/relationships, and then running the significance test to see whether the original model is a statistical improvement over it?

 Bengt O. Muthen posted on Saturday, February 02, 2008 - 5:32 pm
Yes, Version 5 has a new weighted least squares estimator which allows testing of model restrictions. To access this, you should use the option

estimator = wlsm;

This new approach is discussed in the Technical Appendices on our web site, see the last link at

http://www.statmodel.com/techappen.shtml

For path analysis a less restricted, neighboring model would for example be a model that contains all possible paths.
 Nathan Tintle posted on Monday, February 04, 2008 - 9:50 am
Thank you for the response. However, I didn't specify earlier that I have two-levels of clustering in the data, and thus need to use the twolevel analysis command. It appears as though wlsm is not available for twolevel. Is that right?

Would the best "work-around" at this point be to do the neighboring model option?

Thanks.
 Bengt O. Muthen posted on Monday, February 04, 2008 - 10:04 am
WLSM is available for 2-level data - that's a new feature of Version 5.
 Nathan Tintle posted on Tuesday, February 05, 2008 - 7:01 am
Sorry for the continued confusion, but I am using Mplus Version 5, running a twolevel complex analysis, and still getting errors when I try to use estimator=wlsm.

The exact error given is

*** ERROR in Analysis command
Estimator WLSM is not allowed with TYPE=TWOLEVEL COMPLEX.

My code is below. I could email the exact output and input files if my error is still unclear. Thanks again in advance, Nathan
--------------------------------------

Data: file is sempv.dat;
variable: names are cidi_id weight1 weight2 strata mplsclus
cra12 crv12 overall parpush earlyalc earlymda earlyied;
subpopulation = cra12 == 1 or cra12==2;

weight is weight2;

missing are .;

cluster is strata mplsclus;

analysis:
type= twolevel complex;
estimator=wlsm;
algorithm=integration; integration=montecarlo;

model:
%within%
cra12 crv12 on overall ;
earlyalc earlymda earlyied on parpush;
 Linda K. Muthen posted on Tuesday, February 05, 2008 - 7:11 am
You are using TWOLEVEL COMPLEX and WLSM is not available for the combination of TWOLEVEL and COMPLEX. There is a table in Chapter 15 under the ESTIMATOR option that shows which estimators are available for all analysis types.
 Bengt O. Muthen posted on Tuesday, February 05, 2008 - 9:32 am
You could use type = complex with estimator = wlsm.
 Archana Jajodia posted on Tuesday, March 25, 2008 - 7:36 am
Is there some way of getting interpretable fit statistics for a twolevel model with all categorical variables? I get the following output but am not sure what it means for overall model fit.

TESTS OF MODEL FIT

Loglikelihood

H0 Value -158791.491
H0 Scaling Correction Factor 1.604
for MLR

Information Criteria

Number of Free Parameters 25
Akaike (AIC) 317632.982
Bayesian (BIC) 317857.084
(n* = (n + 2) / 24)
 Linda K. Muthen posted on Tuesday, March 25, 2008 - 9:24 am
When means, variances, and covariances are not sufficient statistics for model estimation, traditional fit statistics like chi-square are not available. In these cases, nested models are tested using -2 times the loglikelihood difference which is distributed as chi-square.

With weighted least squares estimators, you will obtain traditional fit statistics.
 Hsien-Yuan Hsu posted on Sunday, April 06, 2008 - 4:03 pm
Dr. Muthen ,

Could you provide the formula for SRMR-W and SRMR-B used in multilevel model?

Thanks.
Mark
 Linda K. Muthen posted on Monday, April 07, 2008 - 8:57 am
The formula for SRMR is shown in Technical Appendix 5, Formula 128. In the multilevel case, the sample statistics used depend on the estimator used.
 Orla Mc Bride posted on Monday, May 19, 2008 - 9:09 am
I want to compare two models using the MLR chi-square difference test. I have computed the calculation by hand (as outlined on the website), but I am unsure how to interpret the result. For example, what does a TRd value of 7.65 mean?
 Linda K. Muthen posted on Monday, May 19, 2008 - 9:59 am
This is the chi-square difference. At the end of the testing for measurement invariance section in Chapter 13, there is a discussion of model difference testing that gives the interpretation.
 Orla Mc Bride posted on Monday, May 19, 2008 - 10:15 am
Thanks Linda for your reply. Chapter 13 mentions whether or not the chi-square difference value is significant. How large does the chi-square difference value have to be to be significant?
 Linda K. Muthen posted on Monday, May 19, 2008 - 11:01 am
You need to use the difference in degrees of freedom to find the critical value in the chi-square table.
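Putting the last two replies together: the scaled (MLR) difference TRd is computed from the two models' chi-square values, scaling correction factors, and degrees of freedom, and is then compared against the chi-square critical value at the df difference. A minimal Python sketch, following the difference-testing formula posted on the Mplus website; all numeric inputs below are hypothetical:

```python
# Critical values of the chi-square distribution at alpha = .05, by df:
CHI2_CRIT_05 = {1: 3.841, 2: 5.991, 3: 7.815, 4: 9.488, 5: 11.070}

def scaled_chi2_diff(T0, c0, d0, T1, c1, d1):
    """Satorra-Bentler style scaled chi-square difference (TRd).
    Model 0 is the nested (more restricted) model; T = chi-square value,
    c = scaling correction factor, d = degrees of freedom."""
    cd = (d0 * c0 - d1 * c1) / (d0 - d1)   # scaling correction for the difference
    return (T0 * c0 - T1 * c1) / cd

# Hypothetical values: nested model T0=40, c0=1.2, d0=31;
# less restricted comparison model T1=20, c1=1.1, d1=28.
TRd = scaled_chi2_diff(40.0, 1.2, 31, 20.0, 1.1, 28)
significant = TRd > CHI2_CRIT_05[31 - 28]   # compare to critical value at df = 3
```

So a TRd value by itself means nothing until it is referred to the chi-square distribution with df equal to the df difference between the two models.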
 Hao Duong posted on Tuesday, May 27, 2008 - 1:39 am
Dr. Muthen,
Following these discussions, I would like to ask whether BIC and AIC in Mplus 5 are acceptable fit indices for multilevel mixture models?
Thank you
Hao
 Linda K. Muthen posted on Tuesday, May 27, 2008 - 10:25 am
BIC and AIC are not absolute fit statistics. They are most often used to compare models. See a multilevel textbook for more information on how these are used.
 Hao Duong posted on Tuesday, May 27, 2008 - 6:04 pm
Dr. Muthen,
I mean to compare the models.
Hao
 Linda K. Muthen posted on Tuesday, May 27, 2008 - 7:30 pm
To compare nested models, you can use -2 times the loglikelihood difference which is distributed as chi-square.
 RAlgesheimer posted on Tuesday, September 02, 2008 - 3:07 am
Dear Linda and Bengt,

I have a question concerning model comparisons. I analyzed (1) an MTMM model using a traditional CFA and (2), since the data are nested, a multilevel MTMM. I would like to show that the second model predicts the data more strongly. These are my questions:

a. Is a CFA model nested in a ML-CFA if the structure is replicated on either the "between" level or on both levels?

b. How can I compare their fits in order to show the preferability of one model?

These are the model results:
(1) CFA
Chi2=20 (15), p=.1572, RMSEA=.036, BIC=5260.
(2) ML-CFA
CHI2=40 (31), p=.1217, RMSEA=.019, BIC=17152.

Thank you very much in advance.

best wishes,
René
 Bengt O. Muthen posted on Tuesday, September 02, 2008 - 6:20 am
a. Not nested in a sense that you can apply chi2 testing

b. Preferability should be based on whether or not the 2-level model fits well and has significant between-level parameters. Is there significant between-level variation?
 RAlgesheimer posted on Tuesday, September 02, 2008 - 7:29 am
Thanks a lot for your quick reply. The service you are offering here is exemplary.

Yes, the 2-level model fits very well, all between level parameters are significant, and the design effect is about 2.6.

Nevertheless, I was thinking about the preferability, because at the within level neither method effects nor trait covariances are significant (I applied an MTMM correlated uniqueness model).

best,
René
 Bengt O. Muthen posted on Tuesday, September 02, 2008 - 7:57 am
If the 2-level model has significant between-level variation, the 1-level model is wrong. Only for some situations is the 1-level model "aggregatable", in which case Type = Complex would give correct results (see Muthen & Satorra, 1995, in Sociological Methodology). But that is most often not the case, for instance due to a different number of factors on Within and Between - which is typical.

In other words, if the 1-level MTMM gives significant method and trait covariances and the 2-level MTMM does not, I would trust the 2-level MTMM. But make sure you have explored the number of factors on between. For instance, it is often the case that you have only 1 trait on between.
 Benjamin Boecking posted on Wednesday, January 28, 2009 - 5:31 am
Dear Linda and Bengt,
I have a longitudinal dataset with 69 patients tested on up to 14 occasions on various mediator and outcome measures.

I have now specified a two-level random-intercept model with a mediated effect (multilevel 1->1->1 mediation) and am trying to argue in favor of one model over another.

I was told that model fit indices such as AIC and BIC are calculated under i.i.d. assumptions and therefore do not make much sense with correlated data.

However, regarding an R^2 equivalent for multilevel models, Snijders & Bosker (1994) have argued in favor of an index called R1^2, defined as the proportional reduction in overall prediction error due to including predictors in the model.

My questions now are
a) Could this index be meaningfully used for my model (which includes an indirect effect)?
b) Could this index be used to compare different models without having to compare AICs?
c) Does Mplus give this statistic?

Thanks a lot!
Benjamin
 Linda K. Muthen posted on Wednesday, January 28, 2009 - 9:08 am
We are not familiar with this index and its use so cannot comment. It is not available in Mplus. With continuous outcomes, you will obtain chi-square. I would use that.
 Sofie Wouters posted on Monday, September 28, 2009 - 8:38 am
I'm testing a path model with categorical dependent variables and some (but not all) latent variables.
In addition, I'm using MLR, type=missing complex (for clustering and missingness), and integration=montecarlo (because of a mediating continuous variable).
I found that fit indices are not provided for these kinds of models with categorical variables, but is there a way to calculate them by hand?
I would like to use the formula below (mentioned earlier on this forum), but I do not have information about the chi-square! And how do I decide how many groups I have?

RMSEA = sqrt((chi-square/(n*d)) - (1/n))*sqrt(g)
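For cases where a chi-square and its degrees of freedom are available, the quoted formula is straightforward to compute. A small Python sketch with hypothetical inputs; the max(..., 0) truncation is the usual RMSEA convention for keeping the root argument non-negative:

```python
from math import sqrt

def multilevel_rmsea(chi2, n, d, g=1):
    """RMSEA = sqrt(chi2/(n*d) - 1/n) * sqrt(g), as quoted above.
    n = total sample size, d = degrees of freedom, g = number of groups;
    a negative argument under the root is truncated to zero."""
    return sqrt(max(chi2 / (n * d) - 1.0 / n, 0.0)) * sqrt(g)

# Hypothetical single-group example: chi-square = 40 on 31 df, n = 500
r = multilevel_rmsea(40.0, 500, 31)
```

When chi-square is smaller than its expected value under the model, the truncation returns an RMSEA of exactly zero.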
 Linda K. Muthen posted on Monday, September 28, 2009 - 9:20 am
Chi-square and related fit measures are not available when means, variances, and covariances are not sufficient statistics for model estimation. This is the situation you are in. Nested models can be tested using -2 times the loglikelihood difference which is distributed as chi-square.
 V X posted on Friday, April 02, 2010 - 12:30 am
Dear Dr. Muthen,

I have a question with regard to model comparison when "integration = MonteCarlo" is specified. If I have two nested models, may I still use the deviance test when Monte Carlo integration is implemented?

Thank you.
 Linda K. Muthen posted on Friday, April 02, 2010 - 8:19 am
In principle, this is possible. It is however the case that with INTEGRATION=MONTECARLO the loglikelihood is less precise.
 Jinsong Chen posted on Monday, April 12, 2010 - 3:24 pm
I am exploring some other ways to compute RMSEA for multilevel SEM, and wonder if I can get partial elements of Fmin (the minimum fit function). Specifically, according to Liang & Bentler (2004), the ML fit function can be expressed as something like:
Fml = Sum_g{(n_g - 1)*f1_g} + Sum_g{f2_g}
where the first term is based on the sample pooled-within matrix and the second term is based on a weighted sum of the within and between matrices. I assume a similar fit function is used in Mplus's ML methods. If this is true, is there any way I can get the first and second terms of Fml separately?

Thanks
Jinsong
 Bengt O. Muthen posted on Thursday, April 15, 2010 - 11:21 am
It may require some work. The fit function expression you show is from the multiple-group approach to 2-level random intercept modeling, which I wrote about in the paper:

Muthén, B. (1990). Mean and covariance structure analysis of hierarchical data. Paper presented at the Psychometric Society meeting in Princeton, NJ, June 1990. UCLA Statistics Series 62.

which is available as paper 32 at my UCLA web site at

http://www.gseis.ucla.edu/faculty/muthen/full_paper_list.htm
 Jinsong Chen posted on Thursday, April 15, 2010 - 2:47 pm
Dr. Muthen,

So is this multiple-group approach also used as the fit function (Fmin) for all ML MSEM estimators (e.g., ML, MLR, MUML) in Mplus? In other words, can the ML Fmin be rewritten as two similar terms as above? (It seems Eq. 35 on p. 16 of your paper is similar.)

I just found that Mplus can give the within (Sig_W) and between (Sig_B) covariance matrices. If all this is correct, I can develop different types of (e.g., level-specific) fit indices (e.g., RMSEA and CFI) with some work, and then compare their performance with general or other level-specific indices using Monte Carlo simulation. Any comments?
 Bengt O. Muthen posted on Friday, April 16, 2010 - 10:32 am
My paper shows how to write the LL as several groups when there is more than one distinct group size. The Mplus fit function does not work this way, but instead goes straight to the raw data.
 Jinsong Chen posted on Friday, April 16, 2010 - 12:04 pm
Thanks Dr. Muthen. It seems the level-specific indices based on the ML fit function I mentioned won't work for Mplus. Or did you have any other way in mind when you said it might require some work? I hope you don't mind my continuing to ask about this issue; it could illuminate my current research idea somewhat.
 Bengt O. Muthen posted on Saturday, April 17, 2010 - 3:53 pm
No further thoughts on this come to mind off hand.
 Zairul Nor Deana posted on Thursday, April 22, 2010 - 6:44 am
Hi Linda & Bengt,
I've analyzed my data via MSEM and tried to compute RMSEA using the formula posted above (the standard formula), but the value does not match the value reported in Mplus. What does that mean? Is there an alternative formula to compute RMSEA using the chi-square from the tested model?

Thanks
 Linda K. Muthen posted on Thursday, April 22, 2010 - 8:44 am
 Andreas Richter posted on Friday, June 25, 2010 - 2:01 am
Dear Mplus team,
I have a question regarding the fit indices of my MCFA. My CFI = .92, RMSEA = .08, SRMR (within) = .049, SRMR (between) = .275. So all criteria are good except the SRMR (between). I have very low between-group variances for some variables and a small sample (n = 34 teams, n = 177 people) at level 2.

I have read on this forum that cutoff criteria do not apply one-to-one to multilevel models, and that non-convergence can be an issue for my problem.

Is there a way I can improve the SRMR (between)? What strategies should I employ? Many thanks in advance.
 Linda K. Muthen posted on Friday, June 25, 2010 - 8:17 am
A less restrictive model should improve fit on the between level. It's hard to say more than this.
 Wu wenfeng posted on Sunday, November 07, 2010 - 5:47 pm
Dear Mplus team, I am doing a mediated-effect analysis with MSEM, but the output has no standardized effect coefficients. Could you help me find the reason, or suggest a method to solve this problem? Thank you!
VARIABLE:
MISSING ARE ALL (-99);
NAMES ARE ID ageI Gender CDI_sum CHAS_SUM cog time dep
str;
USEV=ID cog dep str; ! ID is the student number; cog is negative cognitive style (measured at the initial time point); dep is the student's depressive symptoms (measured 4 times); str is the student's stress events (also measured 4 times)
CLUSTER IS ID;
BETWEEN = cog;
ANALYSIS: TYPE IS TWOLEVEL RANDOM;
MODEL:
%WITHIN%
dep ON str;
%BETWEEN%
dep str cog;
cog ON str(a);
dep ON cog(b);
dep ON str;
MODEL CONSTRAINT:
NEW(indb);
indb=a*b;
OUTPUT: TECH1 TECH8 CINTERVAL;
 Linda K. Muthen posted on Monday, November 08, 2010 - 9:07 am
With TYPE=RANDOM, the variance of y varies as a function of x so how to standardize is not well-defined.
 Wu wenfeng posted on Monday, November 08, 2010 - 4:44 pm
Thanks! Sorry to disturb you again; I have another question. Can I set the initial dependent variable test value as a control, and change the above syntax as follows:
VARIABLE:
MISSING ARE ALL (-99);
NAMES ARE ID ageI Gender stddep cog time dep
str;
USEV=ID cog dep str stddep; ! ID is the student number; cog is negative cognitive style (standardized, measured at the initial time point); dep is the student's depressive symptoms (measured 4 times); str is the student's stress events (also measured 4 times); stddep is the standardized initial measurement of the dependent variable
CLUSTER IS ID;
BETWEEN = cog stddep;
ANALYSIS: TYPE IS TWOLEVEL RANDOM;
MODEL:
%WITHIN%
dep ON stddep str;
%BETWEEN%
dep str stddep cog;
cog ON str(a);
dep ON stddep cog(b);
dep ON stddep str;
MODEL CONSTRAINT:
NEW(indb);
indb=a*b;
OUTPUT: TECH1 TECH8 CINTERVAL;
When I do HLM, I set the initial dependent variable test value as a control; using MSEM, I am not sure whether this is right.
 Wu wenfeng posted on Monday, November 08, 2010 - 4:55 pm
Can I group-mean center the within-level independent variable "str" before doing the above Mplus analysis?
 Linda K. Muthen posted on Tuesday, November 09, 2010 - 9:39 am
I'm afraid I don't understand what you mean by "set the initial dependent variable test value as a control".

You can use the CENTERING option to center.
 Callie Burt posted on Wednesday, November 10, 2010 - 10:59 am
Hi,
Could you point me to a reference I can cite when comparing models when using numerical integration?

 Bengt O. Muthen posted on Wednesday, November 10, 2010 - 4:05 pm
It's not the numerical integration per se but the kind of model for which ML requires numerical integration. These tend to be models where the means, variances, and covariances that are used for the usual chi-square model testing aren't sufficient for describing the model. In such settings statisticians instead work with BIC or for nested models chi-square based on loglikelihood difference testing between competing models. I know of no reference for this.
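The information criteria mentioned here are simple functions of the loglikelihood. A sketch in Python; the AIC line below reproduces the value reported earlier in this thread from H0 = -158791.491 with 25 free parameters, while the BIC sample size is hypothetical:

```python
from math import log

def aic(loglik, p):
    """AIC = -2LL + 2p, with p free parameters."""
    return -2.0 * loglik + 2.0 * p

def bic(loglik, p, n):
    """BIC = -2LL + p * ln(n); smaller values indicate a preferred model."""
    return -2.0 * loglik + p * log(n)

a = aic(-158791.491, 25)        # matches the AIC printed earlier in this thread
b = bic(-158791.491, 25, 1000)  # n = 1000 is hypothetical
```

Because both criteria penalize -2LL for the number of parameters (BIC more heavily as n grows), they can rank non-nested competing models, which the loglikelihood difference test cannot.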
 Wu wenfeng posted on Sunday, December 26, 2010 - 7:09 pm
I am doing a multilevel SEM. I am new to Mplus; the syntax I used is:
DATA:
FILE IS E:\Mplus data\PhD dissert\mplus data\long grade3 data.dat;
VARIABLE:
MISSING ARE ALL (-999);
NAMES ARE ID Gender TCDI dep str ZHWLK;
USEVARIABLES ARE
ID dep str ZHWLK TCDI;
CENTERING=groupmean(str ZHWLK);

WITHIN = ZHWLK str;
BETWEEN = TCDI;
CLUSTER IS ID;
ANALYSIS:
TYPE = TWOLEVEL RANDOM;
MODEL:
%WITHIN%
dep ON str;
dep ON ZHWLK(aw);
ZHWLK ON str(bw);
%BETWEEN%
dep ON TCDI;

MODEL CONSTRAINT: ! section for computing indirect effects
NEW(indw); ! name the indirect effects
indw=aw*bw; ! compute the indirect effect
OUTPUT: TECH1 TECH8;
the OUTPUT shows:
Chi-Square Test of Value 0.000*
Degrees of Freedom 0
P-Value 0.0000
Scaling Correction Factor 1.000
for MLR

This means I can't assess model fit. Could you please tell me what I should do to solve this problem? Thank you!
 Linda K. Muthen posted on Monday, December 27, 2010 - 6:01 am
You need to specify a model that is not saturated to be able to assess fit.
 Wu wenfeng posted on Monday, December 27, 2010 - 4:18 pm
thanks!
 Wu wenfeng posted on Tuesday, December 28, 2010 - 1:29 am
sorry to disturb you again!
Could you please give me your opinion on the syntax I posted on December 26? I am not sure whether it is right.
Let me explain: the data are in long format; the variables dep, str, and ZHWLK each cover 3 waves of measurement; dep means depression, and TCDI means the first-wave measurement of depression; str means stress, and ZHWLK is a cognitive variable. The key question is whether specifying "dep ON TCDI" is right. Thank you!
 Linda K. Muthen posted on Tuesday, December 28, 2010 - 5:05 pm
Example 9.16 shows how to specify a growth model when the data are in long format.
 Wu wenfeng posted on Tuesday, December 28, 2010 - 7:32 pm
thanks
 Annarilla Ahtola posted on Sunday, March 27, 2011 - 11:28 pm
I am doing a two-level regression model with observed variables. I have a small dataset: about 90 teachers in 27 schools, the cluster mean size being about 3.

I have started with a null model, getting a deviance value of 880. When I add two predictors to the within level, surprisingly the deviance goes up: 1012! Next, adding the same two predictors to the between level, the deviance comes down a bit, as it should, to 1007.

At the same time, however, the regression coefficients are significant, and the portion of explained variance increases, indicating that the model is meaningful.

How should this be interpreted and what should I do?

Thanks!
 Linda K. Muthen posted on Monday, March 28, 2011 - 10:14 am
 Patchara Popaitoon posted on Monday, March 28, 2011 - 10:42 am
Hi Linda,

I wonder if we can request other fit indices (e.g., IFI) apart from what we normally receive in the standard report. Thanks.
 Linda K. Muthen posted on Monday, March 28, 2011 - 10:46 am
No, all available fit statistics are given automatically.
 Jing Zhang posted on Friday, August 26, 2011 - 11:13 am
Dear Dr. Muthen,

Following is the syntax I used to specify a model. In place of the model fit information, the results showed only "Loglikelihood" and "Information Criteria"; CFI, TLI, RMSEA, and SRMR did not appear. I wonder why?

Thanks.

TITLE: A linear growth model with time varying covariate
DATA: FILE IS int_covariate.dat;
VARIABLE: NAMES ARE ID INT1-INT3 AGE1-AGE3 SNS1 SNP1 SNS3 SNP3 PSS1 PSP1 PSS3 PSP3;
USEVARIABLES ARE INT1-INT3 AGE1-AGE3 PSS1 PSS3;
TSCORES = AGE1-AGE3;
MISSING = ALL(-999999);
ANALYSIS: TYPE=RANDOM;
MODEL: I S | INT1-INT3 AT AGE1-AGE3;
INT1@0;
st | INT1 ON PSS1;
st | INT3 ON PSS3;
PLOT:
TYPE IS PLOT3;
SERIES IS INT1 INT2 INT3(*);
OUTPUT: SAMPSTAT TECH1 TECH8 MODINDICES(3.84);
 Linda K. Muthen posted on Friday, August 26, 2011 - 1:52 pm
With TYPE=RANDOM, chi-square and related fit statistics are not available because means, variances, and covariances are not sufficient statistics for model estimation.
 Jing Zhang posted on Friday, August 26, 2011 - 5:24 pm
Dear Dr. Muthen,

Thanks for your quick response. With TYPE=RANDOM, then, how do we evaluate whether the model fits well without using chi-square and related fit statistics?

Jing
 Linda K. Muthen posted on Friday, August 26, 2011 - 6:16 pm
Nested models are tested using -2 times the loglikelihood difference which is distributed as chi-square. BIC is also used to compare models.
 Marcus Pietsch posted on Friday, August 10, 2012 - 1:51 am
I am using complex data (teachers within schools) with 260 cases on level 1 but only 8 cases on level 2. Computing a single level model (without the statement "complex") my model fit is fine: RMSEA = 0.015, CFI = 0.957, TLI = 0.955, SRMR = 0.068.

But when using the complex-mode, model fit is getting worse (especially CFI and TLI): RMSEA = 0.057, CFI = 0.714, TLI = 0.701, SRMR = 0.068.

I also computed a model using the twolevel-mode with groupmeancentering: RMSEA = 0.048, CFI = 0.744, TLI = 0.732, SRMR = 0.063.

In my opinion, when using the statement complex or computing a twolevel model, fit should become better, not worse. Could it be that 8 cases on level 2 are not enough for using the complex mode? If yes, how can I model my data appropriately?
 Linda K. Muthen posted on Friday, August 10, 2012 - 11:09 am
Eight clusters is not enough to get stable results. It is recommended to have a minimum of 30-50 clusters. You can control for non-independence of observations by creating a set of 7 dummy variables and using them in the analysis.
 Marcus Pietsch posted on Tuesday, August 14, 2012 - 4:15 am
Dear Linda,

I have tried that approach and created 8 dummy variables (all included in USEVARIABLES), of which I used 7 in the ON statements for all mediating and dependent variables.

DEFINE:

s1 = 0;
s2 = 0;
s3 = 0;
s4 = 0;
s5 = 0;
S6 = 0;
S7 = 0;
s8 = 0;

if (school eq 2) then s1=1;
if (school eq 3) then s2=1;
if (school eq 4) then s3=1;
if (school eq 5) then s4=1;
if (school eq 6) then s5=1;
if (school eq 7) then s6=1;
if (school eq 8) then s7=1;
if (school eq 1) then s8=1;

Model:
! only the on- and with-statements

KF on TF LF PF s1 s2 s3 s4 s5 s6 s7;
KO on TF LF PF s1 s2 s3 s4 s5 s6 s7;
UE on TF LF PF s1 s2 s3 s4 s5 s6 s7;

MB on KF KO UE TF LF PF s1 s2 s3 s4 s5 s6 s7;
AZ on KF KO UE TF LF PF s1 s2 s3 s4 s5 s6 s7;

LMSO on MB AZ KF KO UE TF LF PF s1 s2 s3 s4 s5 s6 s7;
LMSZ on MB AZ KF KO UE TF LF PF s1 s2 s3 s4 s5 s6 s7;
LMEI on MB AZ KF KO UE TF LF PF s1 s2 s3 s4 s5 s6 s7;

TF with TA LF PF;
TA with LF PF;
LF with PF;

LMSO with LMSZ LMEI;
LMSZ with LMEI;

KF with KO UE;
KO with UE;

MB with AZ;

But now I do not get any fit statistics at all. Do you have any idea what went wrong here?
 Linda K. Muthen posted on Tuesday, August 14, 2012 - 5:58 am
 Eva posted on Saturday, September 15, 2012 - 10:16 am
I am unclear on how to compare model fit for nested models. My analysis uses TYPE=TWOLEVEL and ESTIMATOR=WLSMV. It was mentioned to use the -2 loglikelihood difference; I do not see that value in the output, but see the Chi-square test of model fit instead, which is not to be "used for chi-square difference testing in the regular way." DIFFTEST is not available for the TWOLEVEL analysis either, despite the output suggesting for me to use that option. What to do, what to do?
 Linda K. Muthen posted on Monday, September 17, 2012 - 4:17 pm
The DIFFTEST option is not currently available for TYPE=TWOLEVEL. If you want to do difference testing with TYPE=TWOLEVEL, use maximum likelihood estimation.
 Kimberley Breevaart posted on Wednesday, November 21, 2012 - 5:59 am
Dear Linda and Bengt,

I have a question about the fit of my model. I have a multilevel model with days (40 days) at the within level and persons (60 persons) at the between level. My model is a mediated model with one independent variable, three mediators and one dependent variable.

The fit of my model is:

RMSEA = .12
CFI = .93
TLI = .33
SRMR within = .04
SRMR between = .13

I have this problem when I use TYPE = TWOLEVEL and TYPE = COMPLEX.

Can you maybe help me? Thank you very much!
 Bengt O. Muthen posted on Wednesday, November 21, 2012 - 4:02 pm
We need to see your output to say - send to Support.
 Suekyung Lee posted on Wednesday, December 26, 2012 - 9:56 am
Dear Dr. Muthen,

I'm testing a path model with one categorical dependent variable and three continuous mediating variables, using a complex data set. Following the User's Guide, I'm using WLSMV as the estimator (the default), type = complex, and repse = jackknife1. I also include weight and repweight under the VARIABLE command.

I found that only WRMR is provided among the fit indices. However, when I ran the same analysis without replicates, the other fit indices were provided, and WRMR had the exact same value regardless of the inclusion of replicates.

I wonder if I can use the fit indices from the run without replicates to evaluate overall model fit, and the chi-square values to perform chi-square difference tests. If not, what would you recommend to evaluate model fit and to perform chi-square difference tests?

Thank you!
 Linda K. Muthen posted on Wednesday, December 26, 2012 - 4:07 pm
Fit statistics have not been developed for replicate weights. You should ignore WRMR; it is an experimental fit measure.

You should not use fit statistics for the model without replicate weights. The data being analyzed are not the same.

I'm not sure if MODEL TEST is available. If it is, you can do a joint test of all of the left-out paths.
 Suekyung Lee posted on Thursday, December 27, 2012 - 9:57 am

Would you be more specific about MODEL TEST and a joint test?

Thank you!
 Linda K. Muthen posted on Thursday, December 27, 2012 - 2:00 pm
See MODEL TEST in the user's guide. A joint test means to include all of the left-out paths in MODEL TEST at the same time.
 Tom Carwell posted on Saturday, February 02, 2013 - 10:30 am
Using Mplus, I want to test a 2-level mediation model (TYPE=TWOLEVEL).

My questions concerns model comparisons:

1) I want to check whether the model improves with the addition of the mediator, analogous to evaluating the change in r-squared in OLS. Is there any way to do this? Is comparing a first model with the mediator fixed at zero to a second model with the mediator free an acceptable approach?

2) Can I use the chi-square difference test, or should I use the -2LL chi-square difference test for this purpose?

3) In addition, I want to rule out an alternative explanation of full mediation by examining a third model that adds direct paths and comparing the new model to the hypothesized model. I saw in a recent article a report on similar comparisons based on comparing chi-square values and fit indices (without conducting a chi-square difference test); is this use of chi-square and fit indices reasonable? What is the proper method to conduct this comparison?

Thanks!
 Bengt O. Muthen posted on Saturday, February 02, 2013 - 10:47 am
You can do in two-level analysis what you do in single-level analysis.

But regarding 1), I would not advocate seeing if the model gets a better r-square by including a mediator. I would simply see if the mediator gives rise to a significant indirect effect. You may want to ask on SEMNET.

2) No, because in one case you have 1 DV (the outcome) and in the other you have 2 DVs (outcome and mediator), so the metric is not the same.

3) I would estimate

y on m x;
m on x;

and simply see which effects are significant. For instance, if y on x is insignificant here as judged by a z-test, it will also be insignificant as judged by a 1-df chi-square test.
 Tom Carwell posted on Saturday, February 02, 2013 - 11:15 pm
Dear Bengt,

Thank you for your prompt response and suggestions.
 Beth Bynum posted on Monday, August 26, 2013 - 12:04 pm
I am working on a multilevel model with 10 predictor scales and one outcome variable. The outcome variable is a group-level measure of performance, and the predictors are ratings made at the individual level. I am only interested in the relationship between the predictors and the group-level outcome. I've included this relationship at the %BETWEEN% level (e.g., Y ON P1 P2 P3 P4 P5), but I am unsure what to include at the %WITHIN% level to get adequate model fit. When I don't include anything at the %WITHIN% level, the model has zero degrees of freedom. When I specify only variance estimates at the %WITHIN% level, my model fits very poorly (TLI < 0).

What is the best approach for using a multilevel model when you are not interested in modeling within-level relationships? Thanks!
 Linda K. Muthen posted on Tuesday, August 27, 2013 - 2:24 pm
You can have a model of variances and covariances on within. You could also create a dataset where clusters are the observations and use all between variables.
 Eric Deemer posted on Sunday, September 29, 2013 - 3:05 pm
I fitted a multilevel mediation model and got a chi-square value of zero. I read through this thread, and it seems this happens because means, variances, and covariances don't provide enough information for estimation. I thought this was only the case with "type = twolevel random" estimation? I used "type = twolevel."

eric
 Eric Deemer posted on Sunday, September 29, 2013 - 3:20 pm
I noticed that the between-level covariances among my 3 predictors are all zero. I suppose this is due to centering and is the reason I can't get a chi-square statistic? Not enough information?

eric
 Linda K. Muthen posted on Monday, September 30, 2013 - 6:38 am
 RuoShui posted on Wednesday, December 18, 2013 - 5:53 pm
Dear Dr. Muthen,

I am using LGCM. One model is just the regular LGCM; in the other, I used the intercept and slope growth factors to predict two continuous outcomes. I am wondering whether I can use the difference in -2LL, judged against the change in degrees of freedom, to determine model fit?

Thank you so much!
 Bengt O. Muthen posted on Thursday, December 19, 2013 - 10:46 am
You can use -2LL only if your two models have the same dependent variables. You can do a joint test of whether the 4 slopes are significant by using Model Test in the model that includes the two continuous outcomes.
 RuoShui posted on Thursday, December 19, 2013 - 4:37 pm
Dear Bengt,

Thank you so much for your time.

I might be very wrong. But do you mean doing this?

math on i (p1);
math on s (p2);
literacy on i (p3);
literacy on s (p4);

Model Test:
p1=0;
p2=0;
p3=0;
p4=0;
 Linda K. Muthen posted on Friday, December 20, 2013 - 11:15 am
This looks correct.
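MODEL TEST carries out a Wald chi-square test of the listed constraints jointly. As a minimal illustration of the statistic b' V^{-1} b with just two parameters (all numbers below are hypothetical, not from any run in this thread):

```python
def wald_2param(b1, b2, v11, v22, v12):
    """Joint Wald test of b1 = b2 = 0: W = b' V^{-1} b, chi-square with 2 df.
    v11, v22 are the sampling variances of the two estimates; v12 their covariance."""
    det = v11 * v22 - v12 * v12               # determinant of the 2x2 covariance matrix
    # apply the inverse of the 2x2 covariance matrix to (b1, b2)
    return (b1 * b1 * v22 - 2.0 * b1 * b2 * v12 + b2 * b2 * v11) / det

# Hypothetical estimates 0.30 and 0.20, variances 0.01 each, covariance 0.002:
W = wald_2param(0.30, 0.20, 0.01, 0.01, 0.002)  # compare to 5.991 (chi-square, 2 df, .05)
```

With four labeled paths, as in the MODEL TEST example above, the same quadratic form uses a 4x4 covariance matrix and the statistic is referred to a chi-square with 4 df.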
 RuoShui posted on Sunday, January 05, 2014 - 5:44 pm
Dear Dr. Muthen,

I hope you had a lovely Christmas and new year.

I have a question about the -2LL. I realize that if my predictors use multiple indicators instead of mean scores of the same constructs, the -2LL is much larger than when I use mean scores. I understand that using latent variables makes the model more complicated, but at the same time it takes measurement error into account. As for model fit, though, the model using mean scores has a much lower -2LL and BIC. Shall I give up on latent variables and use mean scores instead?

Thank you very much.
 Bengt O. Muthen posted on Monday, January 06, 2014 - 8:23 am
The LL is not in the same metric when comparing models with different DVs, so you can't compare them. When you have multiple indicators, they are included in the list of DVs.
 RuoShui posted on Monday, January 06, 2014 - 2:42 pm
Thank you very much, Bengt! I see, so the indicators of predictor variables are considered DVs as well. Am I correct that for multiple-indicator growth models with latent predictors of the slope and intercept growth factors, the only way to evaluate model fit is to use the MODEL TEST command to test whether the paths from each latent predictor to i and s are significant?

For example:
i on react (p1);
s on react (p2);
Model test:
p1=0;
p2=0;

Thank you so much!
 Bengt O. Muthen posted on Monday, January 06, 2014 - 3:03 pm
Yes, indicators are DVs because they are regressed on the factors.

No, I don't see why you can't evaluate model fit the regular way with latent predictors.
 RuoShui posted on Monday, January 06, 2014 - 4:36 pm
Dear Bengt,

Thank you very much for your help!

I am sorry I did not explain very well just now. I meant to ask: if I want to do model comparisons, what should I use, since -2LL should not be used with different DVs? Is the joint test the only way?

Thank you!
 Bengt O. Muthen posted on Monday, January 06, 2014 - 5:11 pm
The joint test can be used to see if the latent predictor has significant effects. If you do the same test with the observed predictor, I guess a comparison of findings can be done.
 RuoShui posted on Monday, January 06, 2014 - 8:16 pm
Dear Bengt,

Thank you so much!

I have another related question:

I want to compare model 1 (only estimating the intercept and slope growth factor) with model 2 (introducing predictor variables of i and s, but the predictor variables are latent variables with multiple indicators). As you said, -2LL should not be used. Then what indices should I use to argue that model 2 is a better model than model 1?

Thank you!
 Bengt O. Muthen posted on Tuesday, January 07, 2014 - 8:37 am
Maybe you can consider the amount of variance explained in the growth factors. But that requires that both models fit well.
 sojung park  posted on Monday, January 20, 2014 - 3:28 pm
Dear Dr.Muthens,

I am trying to test nested models for logistic regression using FIML.

I figured that I need to use the syntax

Analysis:
estimator=ML;
integration = montecarlo;

This way I am not losing any observations, but then the output does not provide the standard set of information for testing nested models.

Could you please help me? I do not want to use another estimator, since that seems to change the model to probit, which I do not want.

thank you so much for your help!
 Bengt O. Muthen posted on Monday, January 20, 2014 - 5:06 pm
You can test nested models based on ML using -2 times the log likelihood difference for the two models.
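Outside of Mplus, this -2 times the loglikelihood difference test can be sketched in a few lines of Python. The loglikelihood values and parameter counts below are made-up placeholders, not from any actual run; the closed-form survival function shown holds for even degrees of freedom:

```python
import math

def chi2_sf_even_df(x, df):
    """Chi-square survival function (p-value), closed form valid for even df."""
    half = x / 2.0
    return math.exp(-half) * sum(half**k / math.factorial(k) for k in range(df // 2))

def lr_test(ll_restricted, ll_full, df_diff):
    """-2 times the loglikelihood difference, referred to a chi-square distribution."""
    stat = -2.0 * (ll_restricted - ll_full)
    return stat, chi2_sf_even_df(stat, df_diff)

# Hypothetical values: restricted model logL = -1200.5 (10 parameters),
# full model logL = -1195.2 (12 parameters), so the df difference is 2.
stat, p = lr_test(-1200.5, -1195.2, 2)
```

With these placeholder numbers the statistic is about 10.6 on 2 df, which would be significant at conventional levels.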
 Tanja Gabriele Baudson posted on Saturday, February 01, 2014 - 2:43 am
Hello,

a question on multilevel model fit: I'd like to compare the fit of three models (null model, level-1 predictors only, level-1 and level-2 predictors). Using the loglikelihood, I have computed the Satorra-Bentler chi-square difference statistic as described here: http://www.statmodel.com/chidiff.shtml

With every set of predictors added, the LL becomes less negative (from -2514.07 for the null model to -1149.49 for the model including both level-1 and level-2 predictors). Computing the S-B difference test yields significant differences between the null model and the Level-1 predictor model, and between the Level-1 and Level-2 predictor models (the test statistics become smaller yet remain significant). Is my interpretation correct that model fit becomes better with each set of predictors added? And is my procedure okay? (I am using the default MLR estimator.) And, just to be on the safe side: is it correct that the models with more predictors are nested in the models with fewer predictors?

Thanks,
Tanja
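The scaled difference test described at that link can be computed by hand from the loglikelihoods, parameter counts, and MLR scaling correction factors of the two models. A minimal Python sketch, using made-up values (they are not taken from the models discussed above):

```python
def sb_scaled_diff(l0, p0, c0, l1, p1, c1):
    """Satorra-Bentler scaled chi-square difference from MLR loglikelihoods.
    Model 0 is the nested (more restrictive) model; l = loglikelihood,
    p = number of free parameters, c = scaling correction factor."""
    cd = (p0 * c0 - p1 * c1) / (p0 - p1)   # difference-test scaling correction
    trd = -2.0 * (l0 - l1) / cd            # scaled test statistic
    return trd, p1 - p0                    # statistic and its degrees of freedom

# Hypothetical values for a nested model (5 parameters) vs. a fuller model (8)
trd, df = sb_scaled_diff(l0=-2514.07, p0=5, c0=1.20,
                         l1=-2450.00, p1=8, c1=1.15)
```

The resulting TRd is referred to a chi-square distribution with df equal to the difference in the number of free parameters. Note that cd can occasionally come out negative in practice; the statmodel page discusses a strictly positive variant for that case.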
 Linda K. Muthen posted on Saturday, February 01, 2014 - 2:10 pm
Nested models must share the same set of dependent variables. In this case, the model with fewer predictors is nested in the model with more predictors.
 Tanja Gabriele Baudson posted on Saturday, February 01, 2014 - 4:31 pm
Thanks, Linda. So, are my procedure and my interpretation that model fit improves with the inclusion of more predictors all right?

Best,
Tanja
 Linda K. Muthen posted on Sunday, February 02, 2014 - 10:21 am
If the difference test between the model with all covariates versus the model with fewer covariates is significant, the model with fewer covariates fits worse.
 Tanja Gabriele Baudson posted on Monday, February 03, 2014 - 6:22 am
Thank you very much! <3
Tanja
 Shiny7 posted on Thursday, October 23, 2014 - 12:41 pm
Hello Dr. Muthen,

would you please support me, regarding the following questions with respect to multilevel modeling:

a) Can I use AIC and BIC in order to compare models using MLR?

b) Hox (2010, p. 51) points out that with MLR, "AIC and BIC can only be used for models that differ in the random, [but not the fixed] part." Can you explain what that means?

c) Besides the loglikelihood difference test, why can't I use the chi-square difference test (Satorra-Bentler correction) to compare models, as in SEM?

Thank you very much!

Shiny
 Bengt O. Muthen posted on Thursday, October 23, 2014 - 2:42 pm
a) Yes.

b) Hox doesn't talk about MLR, but RML (restricted ML).

c) With MLR you can do the special difference testing that we describe at

http://www.statmodel.com/chidiff.shtml
 Shiny7 posted on Friday, October 24, 2014 - 12:05 am
Dear Dr. Muthen,

I am relieved.

Shiny
 Lisa M. Yarnell posted on Wednesday, May 13, 2015 - 1:31 pm
Hi Linda and Bengt,

I am using MSEM to estimate 2-level models of a set of baseline covariates + teacher variables of interest on student achievement.

I am looking at AIC and LL across models. AIC should generally decrease (and LL increase) as predictors with explanatory value are added to the model.

This is true when I compare AIC and LL between the baseline and final teacher models; fit improves when I add the teacher variables, because while these indices account for model complexity, the teacher variables add explanatory value.

However when I compare AIC and LL between the final models and an Unconditional model with student achievement as the sole variable, varying on levels 1 and 2, I see that fit of the conditional models is WORSE. That is, adding the predictors made the AIC and LL statistics INCREASE in absolute value, relative to the Unconditional model.

I believe that this occurs because certain variables in my conditional models are "y-variables on the BETWEEN level and x-variables on the WITHIN level" and hence "treated as a y-variable on both levels."

This is fine with me. But given that the conditional models have, analytically speaking, more than one y-variable--not just student achievement--this is giving a very different picture of the overall model when fit statistics are generated.

Can you clarify or verify my thinking here? Thank you.
 Bengt O. Muthen posted on Thursday, May 14, 2015 - 10:36 am
Yes, extra y variables throw off the logL and BIC comparisons.

Note that there isn't a clear consensus on AIC/BIC for 2-level because it isn't clear if the sample size should be the total or the number of clusters. There are some articles on this although I can't pinpoint them right now (probably in the SEM journal).
 Martijn Van Heel posted on Thursday, October 27, 2016 - 6:35 am
Dear Dr. Muthen

When comparing nested models, is it possible to use CFI, RMSEA, and SRMR to decide which model is better?
If so, are there certain cut-off points indicating that one model is significantly better than the other?
For example, when the difference in CFI is at least .01, the model is significantly better.

Martijn
 Bengt O. Muthen posted on Thursday, October 27, 2016 - 5:50 pm
There is some literature discussing this - you may for instance check the SEM journal. Or ask on SEMNET. Personally, I am not convinced about the value of being guided by these differences.
 Martijn Van Heel posted on Friday, October 28, 2016 - 3:37 am
Thanks for the quick response.

If I were to use the chi-square difference test for comparing models, I'd need the scaling correction factor.

However, I do not get it in the Mplus output (estimator = MLM). I have the feeling that I'm overlooking something very obvious, but I cannot figure out what it is.

Also, my model runs with ML and MLM, but not with MLR. Is there something I should pay extra attention to?

 Linda K. Muthen posted on Friday, October 28, 2016 - 12:14 pm
You should send the MLM and MLR outputs and your license number to support@statmodel.com.
 Bengt O. Muthen posted on Friday, October 28, 2016 - 12:14 pm
I would recommend exploring why MLR has problems when ML doesn't - they differ only in the SEs.
 Martijn Van Heel posted on Sunday, October 30, 2016 - 6:12 am
I'm sorry for all the follow-up questions.
My study has a cohort-sequential design, which means low covariance coverage.
Could it be that MLR has trouble with this whereas ML does not?

Many thanks!
Martijn Van Heel
 'Alim Beveridge posted on Tuesday, December 06, 2016 - 7:22 pm
Dear Bengt and Linda,
I wish to produce consistent AIC (CAIC) for my mixture models (LPAs) to follow the procedure for multigroup analysis of similarity recommended by Morin, Meyer, Creusier & Bietry (2015). The formula for CAIC given in Nylund, Asparouhov & Muthen (2007) is
CAIC = -2*logL + p*(log(n) + 1).

1. Is p the "Number of Free Parameters" in the output?

2. Does Mplus calculate BIC using this formula: BIC = -2*logL + p*log(n)?

3. If yes, then CAIC is simply BIC + p, right?

Thanks,
'Alim
 Bengt O. Muthen posted on Wednesday, December 07, 2016 - 10:38 am
1. I would think so; I don't have the CAIC formula in front of me.

2. Yes.

3. Probably.
 'Alim Beveridge posted on Friday, December 09, 2016 - 5:38 am
Dear Bengt,

In my latent profile analysis I do not obtain the same result when I add BIC + p as when I calculate CAIC as -2*logL + p*(log(n) + 1). In fact, when I calculate BIC as -2*logL + p*log(n), I don't get the same value as the one produced by Mplus. However, the AIC I get matches the one produced by Mplus. Is BIC calculated differently in mixture modeling?

Thanks,
'Alim
 Bengt O. Muthen posted on Friday, December 09, 2016 - 9:25 am
No, BIC is always calculated according to the formula you show, so I don't see why you don't get agreement. I assume you are using the natural log ("e-log") and have p = number of free parameters. If this doesn't help, you can send the output to Support.
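As a sanity check of those formulas, here is a small Python sketch using natural logs throughout. The loglikelihood, parameter count, and sample size are placeholder values, not from any actual analysis:

```python
import math

def info_criteria(logl, p, n):
    """AIC, BIC, and CAIC from a loglikelihood, p free parameters, sample size n.
    All logs are natural logs, matching the Mplus formulas discussed above."""
    aic = -2.0 * logl + 2.0 * p
    bic = -2.0 * logl + p * math.log(n)
    caic = -2.0 * logl + p * (math.log(n) + 1.0)
    return aic, bic, caic

# Hypothetical values: logL = -1500.0, p = 10 free parameters, n = 200
aic, bic, caic = info_criteria(-1500.0, 10, 200)
# Under these formulas, CAIC is simply BIC + p
```

If a hand calculation disagrees with the Mplus output, a base-10 log instead of the natural log is the usual culprit.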
 Jessica ZHANG posted on Saturday, December 17, 2016 - 12:30 am
Hi,
I am wondering how to evaluate model fit with TYPE = TWOLEVEL RANDOM. Thanks.
 Linda K. Muthen posted on Saturday, December 17, 2016 - 6:00 am
No absolute fit statistics are available in this case. You can test nested models using -2 times the loglikelihood difference which is distributed as chi-square. You can compare non-nested models with the same set of dependent variables using BIC.
 Jessica ZHANG posted on Saturday, December 17, 2016 - 4:59 pm
Hi Linda,