I am wondering about the barriers to creating an RMSEA index that would be appropriate for models involving categorical variables. Mplus doesn't give an RMSEA index in such cases. But in talking to quite a few people, no one has been able to describe why such an index would be inappropriate for such models. Can anyone cite any papers that describe such barriers, or alternatively, any papers that define the RMSEA index in categorical models? Can anyone describe briefly what the barrier is?
It seems possible to use RMSEA for categorical outcomes and Version 2.0 of Mplus will include this. The drawback is that little is known about how to use RMSEA for categorical outcomes in practice. This is also true for RMSEA for continuous outcomes that are non-normal. In both cases, RMSEA can be built on a robust chi-square, either mean-adjusted or mean- and variance-adjusted. The 1999 AERA paper by Nevitt and Hancock studied the continuous non-normal case for the mean-adjusted chi-square. The problem is that our limited simulation studies suggest that in these cases, the RMSEA seems to be influenced not only by model misspecification but also by the degree of non-normality in the variables: the more skewed the variables, the lower the RMSEA. This means that the normal-variable standard of good models having RMSEA values less than 0.05 cannot be relied on in all situations, and the fit evaluation becomes dependent on the data distribution. The same holds for ADF and its WLS counterpart for categorical outcomes. But perhaps such RMSEA values have some practical utility nevertheless; simulation studies are needed. Other fit indices for categorical outcomes will also be put into the next Mplus version.
Dieter Urban posted on Wednesday, November 03, 1999 - 12:27 pm
Is there any way to calculate RMSEA for models with categorical variables using Mplus 1.0 ?
Leigh Roeger posted on Wednesday, November 03, 1999 - 4:23 pm
This is a question I would also be interested in learning an answer to. For the Mplus folks I would also be keen to know when version 2 might be 'on the streets'.
I pass on that in Bollen's (1989) book 'Structural Equations with Latent Variables', on page 436, he reports that Muthen and Kaplan (1985) found that ML and GLS chi-square tests were quite robust except when the categorical variables had large skewness or kurtosis (outside the range -1 to +1). So I wonder whether, if the data are not badly non-normal, getting fit statistics for categorical models by running Mplus as if the variables were continuous might not be too much of a sin, although no doubt purists (and the more knowledgeable) among us might disagree with this.
A robust RMSEA for categorical outcomes can be calculated using results from Mplus.
RMSEA = sqrt((2 * Fmin / t)- (1/n))*sqrt(g)
where
Fmin = the last value from the function column of TECH5
n = sample size (sum of all groups if multiple group)
t = trace of the product of the u and gamma matrices (see Satorra (1992) for a definition)
g = number of groups
For computational purposes, you can calculate RMSEA as follows:
RMSEA = sqrt((chi-square/(n*d)) - (1/n))*sqrt(g)
where d is degrees of freedom, n is total sample size, chi-square is chi-square, and g is the number of groups. Chi-square from WLSM or WLSMV can be used. The RMSEA will be the same whichever one is used.
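The computational formula above can be sketched in Python as follows. This is only an illustration, not Mplus output: the function name `rmsea_from_chisq` is made up here, and truncating a negative value under the square root to zero is an assumption (the usual convention when chi-square is smaller than its degrees of freedom), not something stated in the post.

```python
import math

def rmsea_from_chisq(chi_square, df, n, groups=1):
    """RMSEA = sqrt(chi-square/(n*d) - 1/n) * sqrt(g), as given in the post."""
    radicand = chi_square / (n * df) - 1.0 / n
    # Assumption: clamp at zero when chi-square is below its degrees of freedom.
    radicand = max(radicand, 0.0)
    return math.sqrt(radicand) * math.sqrt(groups)

if __name__ == "__main__":
    # Hypothetical single-group values, chosen only for illustration.
    print(round(rmsea_from_chisq(chi_square=250.0, df=100, n=500), 4))
```

For a multiple-group run, n is the total sample size across groups and `groups` is the number of groups, matching the definitions in the post.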
Ehsan Soofi posted on Wednesday, November 10, 1999 - 12:04 pm
The ratio Chi-squared/df has been suggested, for example by Marsh and Hocevar (Psych, Bull, 1985), as an index of fit with values close to 2 more desirable. I think interpreting RMSEA in terms of Chi-squared/df has an intuitive appeal. Write the RMSEA expression as:
RMSEA = sqrt((Chi-sq/df - 1)/n).
Then, RMSEA is an average excess Chi-sq./E(Chi-sq.) - 1, with RMSEA=0 when Chi-sq.=E(Chi-sq.).
With regard to WLSMV and WLSM, I am getting WLSMV Chi-sq./df = WLSM Chi-sq./df for several of the models that I have used. I cannot see why this should hold in general.
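A quick numerical check of the rewriting in this post: for a single group (g = 1), chi-square/(n*d) - 1/n and (Chi-sq/df - 1)/n are the same quantity, so both expressions give the same RMSEA. The function names and the input values below are illustrative assumptions.

```python
import math

def rmsea_original_form(chi_square, df, n):
    # sqrt(chi-square/(n*d) - 1/n), single group
    return math.sqrt(chi_square / (n * df) - 1.0 / n)

def rmsea_ratio_form(chi_square, df, n):
    # sqrt((Chi-sq/df - 1)/n), the rewritten expression
    return math.sqrt((chi_square / df - 1.0) / n)

# Made-up values with Chi-sq/df = 2.
chi_square, df, n = 180.0, 90, 400
assert math.isclose(rmsea_original_form(chi_square, df, n),
                    rmsea_ratio_form(chi_square, df, n))
```

Both functions assume Chi-sq is at least df; a smaller chi-square would make the quantity under the square root negative.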
Rich Jones posted on Thursday, November 11, 1999 - 11:30 am
Thanks for the great message board. In follow-up to your Nov 8 post regarding a robust RMSEA for categorical outcomes, is it appropriate to use results of model fitting reported in TECH5 to compute a goodness of fit index?
GFI for categorical outcomes can be computed from information in TECH5. Fmin is the minimum value of the fitting function which is the last value in the first column of TECH5. Finit is the minimum fitting function value for a model with all parameters fixed to zero. This is obtained in TECH5 by estimating a model for the same set of variables with no statements in the MODEL command.
There's an article in the December 1998 issue of Psychological Methods by Hu and Bentler which reviews fit methods for continuous outcomes. GFI does not get a very good review. I don't know of any review of GFI for categorical outcomes. I'm not sure if its behavior has been studied in this situation. Perhaps someone else has some information on this topic.
Craig Gordon posted on Thursday, November 18, 1999 - 10:03 am
Along the same lines as the discussion regarding RMSEA, can one calculate a CFI? I assume I would have to run a separate independent model to get the denominator, or does MPLUS calculate that somewhere?
TLI and CFI can be computed for models with categorical outcomes. Both of these measures require information from a baseline model in addition to the model being tested. Typical baseline models have zero covariances. The baseline model for categorical outcomes has all parameters fixed to zero except the thresholds. This is obtained by TYPE=MEANSTRUCTURE in the ANALYSIS command and no statements in the MODEL command.
TLI and CFI can be computed as follows where
chib = chi-square for the baseline model
dfb = degrees of freedom for the baseline model
chit = chi-square for the model being tested
dft = degrees of freedom for the model being tested
I am not sure how one would get the probability of close fit for RMSEA <=.05 but I will look into it.
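The formulas themselves do not appear in the reply above. The sketch below uses the standard textbook definitions of TLI and CFI with the symbols just defined (chib, dfb, chit, dft); treating these as the formulas intended in the post is an assumption, as are the function names.

```python
def tli(chib, dfb, chit, dft):
    """Tucker-Lewis index: (chib/dfb - chit/dft) / (chib/dfb - 1)."""
    baseline_ratio = chib / dfb
    return (baseline_ratio - chit / dft) / (baseline_ratio - 1.0)

def cfi(chib, dfb, chit, dft):
    """Comparative fit index: 1 - max(chit - dft, 0) / max(chit - dft, chib - dfb, 0)."""
    numerator = max(chit - dft, 0.0)
    denominator = max(chit - dft, chib - dfb, 0.0)
    if denominator == 0.0:
        # Neither chi-square exceeds its df; returning 1 is a convention assumed here.
        return 1.0
    return 1.0 - numerator / denominator
```

With the values posted later in this thread (baseline chi-square 12329.906 on 38 df, model chi-square 1628.522 on 116 df), these functions give a TLI of about .96 and a CFI of about .88.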
Brent Hutto posted on Tuesday, January 04, 2000 - 12:56 pm
OK, we're trying to calculate RMSEA, TLI and CFI for a model with 9 first-order factors and 3 second-order factors. All 30 indicators are categorical and we are not estimating a mean structure. We're not really sure that our "baseline model" is specified correctly.
TITLE: Baseline Model, Std Data Only
DATA: FILE IS All.DAT; TYPE IS INDIVIDUAL; NGROUPS = 1;
VARIABLE: NAMES ARE Source Group Sex v01-v30; MISSING ARE .; USEOBS = Source EQ 1 OR Source EQ 2; USEVARIABLES ARE v01-v30; CATEGORICAL ARE v01-v30;
OUTPUT: RES STAND TECH5;
When we run this program, we get warnings in TECH5 of the form:
"ZERO CELL PROBLEM: IR, J & K = 1 30 29"
but we do get a pseudo-ChiSq and its degrees of freedom. Using those as "baseline" values and using the same values from the actual program (with a MODEL: statement) we do the following calculations:
n = 1219
Model ChiSq = 1628.522*, df = 116**
Baseline ChiSq = 12329.906*, df = 38**
We're a little surprised by such a large value of TLI given that RMSEA>.1 and CFI<.9 and we calculated similar values for another dataset ( RMSEA=.1019, TLI=.9739, CFI=.9344 ), although in this case TLI and CFI agree somewhat.
1) Given that we used TYPE=GENERAL in the "real model", is the above "baseline model" correct?
2) Is there a generally acceptable reference we can use when we publish RMSEA, TLI and CFI values calculated by this method?
================================================ Brent Hutto Early Alliance Project Statistician Department of Psychology University of South Carolina (803) 777-5452 or Hutto@SC.edu
It looks like your baseline model is correct and that your calculations are also correct. The warning message that you get is telling you that in the bivariate frequency table for variables 29 and 30, there is a low cell frequency. But if the model estimation terminates normally, this is nothing to worry about.
I don't see your results as that contradictory for either of your examples. Both RMSEA and CFI agree that the models fit poorly. The cutoff for CFI and TLI recommended by Hu and Bentler (Psych. Methods 1998 Volume 3) for continuous outcomes is greater than .95. Your TLIs are .96 and .97, which suggests that the cutoff for TLI should perhaps be higher for categorical outcomes.
As far as we know, there have been no published studies of the behavior of RMSEA, CFI, or TLI for categorical outcomes. Our very limited studies of RMSEA found that it does not work as well for categorical outcomes as for continuous. So we have no references to suggest. These are studies that need to be done.
Ehsan Soofi posted on Thursday, January 13, 2000 - 3:31 pm
The discrepancy between the CFI and TLI may be due to the type of Chi-sq. statistic used and its degrees of freedom. In my limited experience with these indices I have found that the ratio WLSMV/DF is about the same as WLSM/DF. When computing indices that are functions of the ratio Chi-sq./DF (e.g., RMSEA and TLI), WLSMV and WLSM produce about the same results. This may not be the case for other types of indices such as CFI, which depends on the difference Chi-sq.-DF. For numerous models that I have run, WLSM has produced remarkably close values for CFI and TLI. I have also noted that DF for WLSM is always the same as DF for MLE.
I'm working on a project that involves estimating multiple CFAs using symptom (dichotomous) data. After I establish the appropriate model for individual groups, I hope to test for invariance across groups. I had intended to rely heavily on nested chi-square tests to establish the superiority of different models both within groups (boys: 1 vs. 2 factors) and across groups (boys vs. girls). However, it appears that this strategy won't work if I use the WLSM or WLSMV estimators. Is there some other method to test the validity of competing model structures when using these estimators?
Incidentally, when I try to use the WLS estimator, Mplus gives an error implying that the weight matrix is not positive definite. Given that models using WLSM and WLSMV converged, I was surprised at this finding. The symptoms are very skewed.
Finally, I should mention that these data come from an epidemiological study so I'm using a "weight" variable.
We recommend WLS for nested model testing. Unless the sample size is very large, very skewed items can make the weight matrix of WLS not invertible. The weight matrix is not inverted in WLSM or WLSMV. This is why you get convergence with these estimators. You can try to delete very skewed items to make the weight matrix invertible. I don't know of a good alternative for assessing nested models at the present time.
I have now dipped more than a toe or two into the waters provided by Mplus, specifically, path analysis with endogenous categorical/dichotomous variables. In fact, my current application is very much like Example 15.1A (p.131) of the manual. I just have many more variables exogenous to the two endogenous ones.
I now have several major areas of questions:
(1) How do I calculate a case-by-case probability of the ultimate outcome, based on output of Mplus? FYI, to date I have concentrated on "sampstat" and "residual" output, and have not delved into any TECH output.
(2) What is the correlation matrix used by Mplus for path analysis with one (or more) endogenous dichotomous variables? How is the correlation matrix calculated from free data (all categorical, dichotomous variables)? Pending an answer to those questions, what about using correlation matrices developed specifically for binary data? I am thinking here of the half dozen variations offered by SYSTAT 9 for Windows.
(3) Has anyone examined the similarities or differences between the approach to path analysis with endogenous categorical (binary) variables implemented in Mplus 1.0x and that described by Bollen et al., a 2-stage probit regression (Demography 32(1): 111-131, Feb. 1995), entitled "Binary Outcomes and Endogenous Explanatory Variables...."?
I would very much appreciate guidance, assistance, or references in my mission to get these questions answered soon.
For questions 1 and 2 there are several Muthen-authored articles that are useful, for example Muthen, Kao, Burstein (1991) in Journal of Educational Measurement. See list of references on Mplus Discussion. Another useful article is the one by Xie. With path analysis there are usually x variables in the model, that is variables that have no model structure (covariates). In such cases, Mplus has the advantage that it uses sample statistics that are regression-based, not correlations. Regarding question 3, I have not studied that Bollen article - anybody else?
Anonymous posted on Friday, March 31, 2000 - 11:19 am
I am running a path analysis type of model similar to the one on p. 131, except that: 1) there is a single binary observed outcome 2) the predictors are all latent factors (some with continuous indicators, others with categorical indicators). I have tried several WLS(M)(V) estimators. Convergence is slow but it works. However, I do not know what kind of analysis is actually being conducted. I assume that Mplus runs a probit. Or am I wrong? Could you provide an explanation and a reference? Also, why doesn't Mplus allow the logistic option with latent variables? I tried the above analysis with "logistic" but it seems only to work with observed variables. Thanks for your response.
Logistic regression is available for only univariate observed outcomes with observed independent variables. The logistic model does not easily generalize to the multivariate latent variable model framework.
The analysis you are running with latent independent variables and a categorical outcome uses probit regression. This is described in Appendices 1 and 2 of the Mplus User's Guide. The following references describe the estimation further.
Muthén, B. (1983). Latent variable structural equation modeling with categorical data. Journal of Econometrics, 22, 43-65.
Muthén, B. (1984). A general structural equation model with dichotomous, ordered categorical, and continuous latent variable indicators. Psychometrika, 49, 115-132.
For an application, see the Xie reference on Mplus Discussion.
MikeW posted on Wednesday, April 19, 2000 - 6:21 am
In scanning the posts above, I noticed that Linda presented a formula for RMSEA as follows:
RMSEA = sqrt((chi-square/(n*d)) - (1/n))*sqrt(g)
This formula differs from that provided by Rigdon (1996) in the journal SEM, 3(4), p369-379. The difference is that Linda is using n (total sample size) where Rigdon uses n-1 (he also doesn't include info about multiple groups - g). In my application the difference b/w formulae is trivial. Clearly the larger the sample size the less impact this has. However I was wondering whether the difference was related to the specific use of WLSM, WLSMV?
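To see how little the choice between n and n - 1 matters at a sample size in the low thousands, the two versions can be compared directly. This sketch assumes the n - 1 variant simply replaces n by n - 1 everywhere in the formula quoted above; that reading of Rigdon (1996) is an assumption, as are the function name and the `use_n_minus_1` parameter. The inputs are the n = 1219 values posted earlier in this thread.

```python
import math

def rmsea(chi_square, df, n, groups=1, use_n_minus_1=False):
    # Assumption: the n - 1 variant substitutes n - 1 for n in both places.
    size = n - 1 if use_n_minus_1 else n
    radicand = max(chi_square / (size * df) - 1.0 / size, 0.0)
    return math.sqrt(radicand) * math.sqrt(groups)

with_n = rmsea(1628.522, 116, 1219)
with_n_minus_1 = rmsea(1628.522, 116, 1219, use_n_minus_1=True)
print(f"n: {with_n:.4f}  n-1: {with_n_minus_1:.4f}  "
      f"difference: {with_n_minus_1 - with_n:.5f}")
```

Both versions round to about .103 here, which agrees with the observation that the difference shrinks as the sample size grows.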
I do not get the following outputs when ALL the indicators are categorical:
* Chi-Sq. and iteration results for NULL model defined with no statements in the model command.
* Derivatives with respect to Theta for a structural variable model.
There is no problem when some indicators are continuous.
If you are using Version 1.04, you should get chi-square when there is no model statement. I suspect that you are using an earlier version. To get derivatives with respect to theta add a covariance of zero to the model command, for example,
I wish Mplus would have an option of outputting sample correlations and model correlations to an external file so that it would be easy to compute RMSR and some other indexes. Mplus' printed output breaks correlation matrices into columns of 5 variables. It is possible to use it for computing RMSR by first manually editing the output (deleting unnecessary stuff) and then exporting the values to Excel. However, to compute Rao's distance (for a direct comparison of competing models) I need a traditional square correlation matrix. I haven't figured out an easy way of transforming output file into a square correlation matrix, and doing it manually is very tedious because I have a lot of data sets and large number of items. Writing a SAS program for doing this task is possible, but promises to be quite time consuming. Does anyone have any ideas?
I am examining nested models with categorical outcomes using WLS across 19 groups and a large total sample size of about 14,000. What is the recommended approach to testing nested models under these conditions?
You would do chi-square difference testing in the regular way using WLS. You would look at the difference between chi-square for two nested models and the difference in the degrees of freedom for the two models. A chi-square table will tell you whether the difference is significant for the number of degrees of freedom. Note that difference testing is not appropriate for WLSM and WLSMV.
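The difference test described above can be sketched without a printed chi-square table. This is a minimal illustration for WLS chi-squares only (the reply notes that it is not appropriate for WLSM and WLSMV). The function names and the example values are hypothetical, and the p-value uses a simple series for the incomplete gamma function that is adequate for moderate chi-square values rather than a production-grade routine.

```python
import math

def chi2_sf(x, df):
    """Upper-tail probability of a chi-square variate with df degrees of freedom."""
    if x <= 0:
        return 1.0
    a, z = df / 2.0, x / 2.0
    # Series for the regularized lower incomplete gamma function P(a, z).
    term = total = 1.0
    for k in range(1, 10000):
        term *= z / (a + k)
        total += term
        if term < total * 1e-15:
            break
    lower = math.exp(a * math.log(z) - z - math.lgamma(a + 1.0)) * total
    return min(max(1.0 - lower, 0.0), 1.0)

def chi2_difference_test(chi2_restricted, df_restricted, chi2_free, df_free):
    """Return (chi-square difference, df difference, p-value) for two nested models."""
    diff = chi2_restricted - chi2_free
    ddf = df_restricted - df_free
    if ddf <= 0 or diff < 0:
        raise ValueError("restricted model must have more df and a larger chi-square")
    return diff, ddf, chi2_sf(diff, ddf)

# Hypothetical WLS results: a restricted model and a less restricted model.
diff, ddf, p = chi2_difference_test(112.4, 50, 98.1, 44)
print(f"chi-square difference = {diff:.1f} on {ddf} df, p = {p:.4f}")
```

A small p-value indicates that the added restrictions significantly worsen fit.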
Anonymous posted on Thursday, November 16, 2000 - 2:02 pm
I performed CFA with dichotomous items. Using WLSMV, the one-factor model had chi-square=142.188, df=84. To compute NNFI and CFI, I also ran the null model and got chi-square=776.778, df=73. What I could not understand is why the null model has a smaller df value. In conventional SEM with continuous variables, the null model has a larger df value. Can you please explain this? My computed RMSEA=.06 and NNFI=.928. Can I conclude that my model fits well?
The degrees of freedom for WLSMV are not computed in the regular way. See formula 109 on page 281 of the Mplus User's Guide.
I know of no studies that have looked at the fit measures you mention for categorical outcomes so I could not comment on what they mean.
Anonymous posted on Wednesday, December 06, 2000 - 7:32 pm
Above you suggest that weighted WLSMV and WLSM models may converge where an WLS model won't because the weight matrix is not inverted under these procedures. What do you recommend if a weighted WLSMV model doesn't converge whereas the unweighted WLS, WLSM, and WLSMV models do (Mplus returns an error message stating the model may not be identified) ? Would extreme skew of the weights generate this type of error ?
AIC is based on the maximum loglikelihood value. The WLSMV estimator does not maximize the loglikelihood so this value is not available for WLSMV. If an AIC could be computed, non-nested models can be compared using it.
Anonymous posted on Wednesday, January 24, 2001 - 10:13 am
Is there any method to distinguish between non-nested models that use the WLSMV estimator?
You can look at a variety of fit measures like SRMR, CFI, TLI, RMSEA, the new WRMR that will be in Version 2 etc. You can't say that one is statistically better than the other. It would be a qualitative comparison. You can also look at the p-values from the chi-squares.
A question came up on SEMNET why Mplus has not included the test of underlying normality for categorical outcomes. Background reading for such testing includes my chapter Muthen (1993) Goodness of fit with categorical and other nonnormal variables (pp. 205-234). In Bollen & Long (Eds.) Testing Structural Equation Models. Newbury Park: Sage. This chapter points out that these tests can be useful, but also that they may be overly sensitive and frequently lead to rejection for reasons that may not be important enough to warrant abandoning the use of polychorics/polyserials. It suggests that often only one or two cells cause the rejection e.g. due to irrelevant causes such as response style (here a polychoric may actually serve to smooth the bivariate distribution and in some sense give a better correlation than the usual Pearson product moment). Also, the approach used in Mplus makes it possible to relax the assumption of underlying normality in the frequent situation where there are covariates in the model, instead using a regression-based approach that only assumes conditional normality. Here, the normality pertains only to the residuals in the regression of the categorical outcomes on the covariates and a bivariate normality test would be irrelevant.
We have attempted to fit one and two-factor models to a set of 11 dichotomous variables, using Confirmatory Factor Analysis with tetrachoric correlations in Mplus v2.01.
Curiously, the chi-square tests of model fit for both the one and two-factor models have 27 degrees of freedom. Surely the two-factor model should have one fewer degree of freedom, due to the correlation between factors?
If you are using the WLSMV estimator, which is the default estimator for categorical outcomes, the degrees of freedom are not calculated in the regular way but according to formula 110 which can be found on page 358 of the Mplus User's Guide. I think this is probably what is happening.
Jef Kahn posted on Wednesday, May 23, 2001 - 11:08 am
Using Mplus 2, I am testing a model with five continuous latent variables and one binary observed endogenous variable. My sample size is 314. I have three questions:
1. Is the sample size large enough to accurately estimate parameters using the WLSM method? I recall reading on SEMNET that WLSM could work with small samples, but my small sample size makes me a little nervous. Do you see this as a limitation?
2. Is there any way to compute indirect effects (correlations) given that one of my endogenous variables is observed and binary? I understand that probit regressions are used, and I am not sure how that would affect the indirect effects. Actually, I do not see where Mplus provides indirect effects at all. Could you point me in the right direction?
3. Is there a way to obtain the correction factor needed to do a chi-square difference test with WLSM in Mplus?
Models with 12 observed variables were studied for WLSM and WLSMV. Sample sizes as low as 150 looked OK. So I think this should be OK although sample size considerations must take into account the number of free parameters and the distribution of the variables among other things. We recommend WLSMV for categorical outcomes.
Indirect effects are not automatically computed in Mplus. You would have to do them outside of the program. You would multiply the regression coefficients together. It doesn't matter if they are probit or regular or combinations. You can see how to compute the standard errors in Bollen's book.
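As a sketch of the hand calculation described above: the indirect effect is the product of the two regression coefficients, and a common first-order (delta method, often called Sobel) standard error is sqrt(b^2*se_a^2 + a^2*se_b^2). That this is the formula meant by the reference to Bollen's book is an assumption, as is treating the two estimates as uncorrelated. The function name and the coefficient values are hypothetical.

```python
import math

def indirect_effect(a, se_a, b, se_b):
    """Product-of-coefficients indirect effect with a first-order delta-method SE.

    a, b       : regression coefficients along the path (probit or linear)
    se_a, se_b : their standard errors
    Assumption: the estimates of a and b are uncorrelated.
    """
    estimate = a * b
    se = math.sqrt(b ** 2 * se_a ** 2 + a ** 2 * se_b ** 2)
    return estimate, se, estimate / se

# Hypothetical coefficients and standard errors read off an output file.
estimate, se, z = indirect_effect(a=0.5, se_a=0.1, b=0.4, se_b=0.2)
print(f"indirect = {estimate:.3f}, SE = {se:.3f}, z = {z:.2f}")
```

The ratio of the estimate to its standard error can be referred to a standard normal distribution as an approximate test.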
There is no scaling correction factor for WLSM or WLSMV. This is under development.
Jef Kahn posted on Thursday, May 24, 2001 - 6:00 am
I have 19 observed variables and between 139 and 145 df for the models I am testing. Is there a specific paper I can cite regarding how well different types of models behave with different sample sizes under WLSMV? Also, is there a paper that describes the benefit of using WLSMV versus WLSM with a categorical outcome? I am looking for the proper source to cite. Thank you.
Linda Muthen wrote in previous post: Models with 12 observed variables were studied for WLSM and WLSMV. Sample sizes as low as 150 looked OK. So I think this should be OK although sample size considerations must take into account the number of free parameters and the distribution of the variables among other things. We recommend WLSMV for categorical outcomes.
The following reference can be requested by emailing firstname.lastname@example.org. I believe it would be the best reference for what you want.
Muthén, B., du Toit, S.H.C., & Spisic, D. (1997). Robust inference using weighted least squares and quadratic estimating equations in latent variable modeling with categorical and continuous outcomes. Accepted for publication in Psychometrika.
I am doing a multigroup analysis using categorical, observed, dependent variables. I would like to test configural invariance of the model and I am having trouble figuring out how to specify certain parameters.
Here is what I would like to do:
A) Fix the first item's factor loading to 1 to set the scale of the factors in each group, with freely estimated factor loadings not constrained to be equal across groups ***I am pretty sure I have done this correctly***
B) Set the mean of each factor ***not sure how to do this; I know that for continuous variables I would fix the first item's intercept to 0 but this doesn't seem to work with categorical variables***
C) Relax equality constraints for intercepts (thresholds) ***not sure how to do this or if this would be the default once the mean for each factor is set***
D) Allow variances, covariances and means to be freely estimated and heterogeneous across groups ***I believe this is the default in Mplus***
E) Allow covariances between like items' uniquenesses to be estimated across groups ***Not sure how to do this***
I would be happy to send you a copy of the mplus command file that I have put together if that would help with these questions.
It would help to see your Mplus command files. If you could send them to email@example.com, I can take a look at them. Please include in the title what you are attempting in each file.
Jef Kahn posted on Thursday, November 15, 2001 - 7:05 am
I am trying to interpret a WRMR value of 1.24 in a structural model that contains a dichotomous endogenous variable (I used the WLSMV method). I know from Appendix 5 of the manual that WRMR values of .90 or lower are considered indicators of good fit. What is the range of WRMR? How far from "good fit" is 1.24?
bmuthen posted on Thursday, November 15, 2001 - 10:25 am
WRMR is a descriptive fit index and its statistical distribution is not yet known. This is much like the situation for CFI/TLI. In this sense, we don't know how far a TLI of 0.60, say, is from 1.0. Nor do we know how far 1.24 is from 1. WRMR ranges from 0 to infinity. In more recent work, we conclude that perhaps 1.0 is a better cutoff than 0.9. More studies would be needed to decide "how bad" higher ranges such as 1.0-1.3 are in practice.
Tao Xin posted on Tuesday, November 20, 2001 - 9:20 am
I have attempted to fit one and four-factor models to a set of 21 dichotomous variables, using Confirmatory Factor Analysis with tetrachoric correlations in Mplus v2.01. The output mentioned that I cannot use the chi-square difference to compare the fit of these two models. I am just wondering if there is any alternative approach to compare the fit of two models? Thanks a lot.
As a reference, the following are the model-fit indices I got by using WLSMV:
Unidimensional model: Chi-square value = 2997.549 (df = 154), CFI = .824, TLI = .943, RMSEA = .052
Four-factor correlation model: Chi-square value = 1452.813 (df = 158), CFI = .920, TLI = .975, RMSEA = .035
Four-factor second-order model: Chi-square value = 1437.654 (df = 157), CFI = .921, TLI = .975, RMSEA = .035
Bmuthen posted on Wednesday, November 21, 2001 - 8:30 am
It doesn't sound like these models are nested. Therefore, chi-square difference testing would not be appropriate. For nested models, we recommend using WLS for chi-square difference testing and then using WLSMV for the final model.
I have a question on the scale freeness and scale invariance of ULS. I ran several item factor analyses with NOHARM and Mplus using ULS on the basis of binary data. I had 124 items and 407 subjects. I used NOHARM with raw product moments as input data. The Mplus analyses were based on tetrachoric correlations.
When comparing the results, slight differences occurred in the standardized loadings for less complex models. The largest discrepancies resulted for a nested factor model (e.g. each item got a loading on one general factor and on one group factor). The ULS discrepancy functions were completely different in all model runs. Are these signs of a lack of scale freeness and invariance? Your comments are highly appreciated. Thanks, Martin.
The Mplus ULS estimator applied to tetrachoric correlations is a scale free approach. I'm not familiar with the NOHARM approach, but it may consider a different model than the tetrachoric model, which could account for the discrepancy function differences.
Anonymous posted on Tuesday, October 29, 2002 - 7:18 pm
Dear Dr Muthen,
Thank you for the great software and this useful discussion list. I have a case of the "missing SRMR". I have found with categorical indicators that when doing multi-group modelling, or do a CFA with a covariate (eg MIMIC model), or when outputting factor scores, that the SRMR no longer appears in the output. Is there a reason for this? How can I get it back (short of manually outputting the observed and fitted correlation matrices?)
The mystery of the missing SRMR is solved. We do not compute SRMR for categorical outcomes when there is one or more covariates in the model. This is because the sample statistics are not correlations in this situation. They are probit thresholds, regression coefficients, and residual correlations. WRMR looks at the difference between these sample statistics and their model estimated counterparts.
Anonymous posted on Saturday, November 02, 2002 - 2:42 pm
I have a question about WLSMV df. I have noted on posts here and from my own models that the df for the baseline model can be *smaller* than that of the fitted model, and I recognise the methods of computation of df involve a complex matrix formula. Is it possible to give a "layperson" explanation for how this can be, as it strikes me as unintuitive to estimate more parameters yet also have greater df.
Also, there have been references above in this message board to the paper Muthén, B., du Toit, S.H.C., & Spisic, D. (1997). Robust inference using weighted least squares and quadratic estimating equations in latent variable modeling with categorical and continuous outcomes. Accepted for publication in Psychometrika.
Has that been published yet, or is it "in press"?
Many thanks in advance.
bmuthen posted on Saturday, November 02, 2002 - 5:55 pm
The WLSMV df is "estimated" to get as close of an approximation to a chi-square distribution as possible and is therefore both model and data dependent. Hence it doesn't have a substantively interpretable meaning as in regular chi-square testing. The p value is what can be interpreted here.
The Muthen et al paper is not published yet and I have not yet taken the time to do the final revision for putting it in press - too many other important papers to write. But, I am happy to send the paper in its current version.
Jef Kahn posted on Sunday, March 09, 2003 - 6:14 am
I am estimating a model with 5 observed variables and 4 latent variables (with 2-3 indicators per latent variable). One of my observed variables is dichotomous, so I am using WLSMV estimation. Sample size is 204. I am using Mplus version 2.
I have estimated a measurement model, where my 9 variables are correlated, and a structural model, which has 22 more constraints than the measurement model. Fit indices are:
For the measurement model: chi-square = 100.01 df = 32 CFI = .933 TLI = .960 RMSEA = .102
For the structural model: chi-square = 86.36 df = 33 CFI = .948 TLI = .970 RMSEA = .089
I understand that the chi-square is mean and variance adjusted and that degrees of freedom are not computed the traditional way. My question is about CFI, TLI, and RMSEA. I am not used to having the structural model (with more constraints) fit better then the measurement model (with fewer constraints). I am guessing that this is an anomaly from the method of computing chi-square and degrees of freedom. Is this correct?
No, this is probably not the case. It is more likely if you look at TECH1 for each run that there is a default that is changing the structural model in a way that you are not expecting. If you can't figure this out, please send me the two outputs including TECH1.
Anonymous posted on Wednesday, August 20, 2003 - 9:33 pm
I have a few questions about nested growth curve modeling with binary outcomes. My data are highly skewed: across the six outcomes, the number of 0s ranges from 323 to 355 and the number of 1s ranges from 28 to 60. I ran two nested models with two parameterizations: 1) fixing the mean intercept at zero and holding the thresholds equal across time, and 2) fixing the first threshold at zero and freeing the intercept factor. I would like to know if my code is correct for each parameterization method.
Here is some relevant code from the second model...
I am particularly interested in the mean and significance of the intercept (i.e., whether individuals vary significantly on 'dicomp' at time 1). So, I would like to use the second method (i.e., not set the intercept to zero). However, with this method the SEs cannot be calculated and the output highlights a problem with parameter 7, the slope. Can you provide me with any suggestions?
Finally, could you please relay the RMSEA, TLI, and CFI values that would indicate a 'good or reasonable fit' for categorical outcomes? Has any research been conducted recently on this topic?
Muthén, B., du Toit, S.H.C., & Spisic, D. (1997). Robust inference using weighted least squares and quadratic estimating equations in latent variable modeling with categorical and continuous outcomes. Accepted for publication in Psychometrika. (#75)
Refer to it as paper 75. It is the best description of these estimators.
Anonymous posted on Wednesday, April 21, 2004 - 8:14 pm
In your Aug 21 2003 post: "Suggested cutoffs for categorical outcomes based on a "recent dissertation" are:
RMSEA less than or equal to .05 TLI greater than or equal to .95 CFI greater than or equal to .96 "
Would you please tell me the reference of the dissertation? Thanks.
bmuthen posted on Thursday, April 22, 2004 - 6:57 am
See the dissertation by Yu posted on the Mplus home page.
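The three cutoffs quoted above can be applied mechanically. The following is only a small Python sketch of that check; the function and dictionary names are hypothetical, and the example values are invented rather than taken from any model in this thread.

```python
# Cutoffs quoted in the post above for categorical outcomes.
CUTOFFS = {"rmsea_max": 0.05, "tli_min": 0.95, "cfi_min": 0.96}


def check_fit(rmsea, tli, cfi, cutoffs=CUTOFFS):
    """Return, per index, whether the quoted cutoff is met."""
    return {
        "rmsea": rmsea <= cutoffs["rmsea_max"],
        "tli": tli >= cutoffs["tli_min"],
        "cfi": cfi >= cutoffs["cfi_min"],
    }


# Hypothetical fit values for illustration only.
print(check_fit(rmsea=0.04, tli=0.97, cfi=0.95))
```

Returning one flag per index, rather than a single pass/fail, keeps visible the cases where the indices disagree.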
Anonymous posted on Friday, June 18, 2004 - 12:12 pm
Is there a way to form dummy variables in the define statement for a categorical independent variable?
Yes if you use the weighted least squares estimator. No if you use maximum likelihood estimation.
Anonymous posted on Tuesday, November 30, 2004 - 8:54 am
Dear Dr Muthen,
To do the chi-square difference test, is it better to use WLSM with the adjusted formula on the statmodel homepage, or WLSMV with the DIFFTEST option? In other words, which of the two estimators is more appropriate for computing the chi-square difference test?
You can use either. We recommend WLSMV as the default because we have found it performs better in most instances. I would use the DIFFTEST option and WLSMV.
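For the WLSM route mentioned in the question, the "adjusted formula" is usually given as a scaled difference: cd = (d0*c0 - d1*c1)/(d0 - d1) and TRd = (T0*c0 - T1*c1)/cd, where T is the scaled chi-square, c the scaling correction factor, and d the degrees of freedom of the nested (0) and comparison (1) models. The Python sketch below assumes that form; the function name and the input numbers are hypothetical. It does not apply to WLSMV, where the DIFFTEST option is needed.

```python
def scaled_chi_square_difference(t0, c0, d0, t1, c1, d1):
    """Scaled chi-square difference for mean-adjusted statistics (e.g. WLSM).

    t0, c0, d0: scaled chi-square, scaling correction factor, and df of the
                nested (more restrictive) model.
    t1, c1, d1: the same quantities for the comparison (less restrictive) model.
    Returns the scaled difference statistic and its degrees of freedom.
    """
    cd = (d0 * c0 - d1 * c1) / (d0 - d1)   # difference-test scaling correction
    trd = (t0 * c0 - t1 * c1) / cd          # scaled difference statistic
    return trd, d0 - d1


# Hypothetical numbers, not taken from any output in this thread.
trd, df_diff = scaled_chi_square_difference(50.0, 1.2, 10, 30.0, 1.1, 8)
print(round(trd, 3), df_diff)
```

Note that the sketch does not guard against a negative cd, which can occur with real data.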
Anonymous posted on Sunday, December 05, 2004 - 12:49 am
As a novice with Mplus, I have a few questions concerning EFA and CFA with dichotomous observed variables.
First, in EFA, I wonder why I sometimes don't get the RMSEA, for example for the 3-factor solution. What I mean is this: using WLSMV, I get the RMSEA for the 1-factor solution (RMSEA = 0.02, for example) and also for the 2-factor solution (RMSEA = 0.005, for example), but not for the 3-factor solution. Does this mean that the RMSEA is so close to 0 for the 3-factor solution that it is not in the output?
Second, I wonder if it is possible to get eigenvalues in a CFA with dichotomous data ? (with WLSMV)
Third, I would like to know if it is possible to compare different models, nested or not (with WLSMV), by a chi-square model fit test or by using AIC, BIC, ... I noticed that I can get AIC and BIC with continuous observed variables but not with dichotomous observed variables. Could you please tell me if there is an option I can add so that I can get AIC and BIC with dichotomous data (and with WLSMV)? Or could you please tell me the formula to calculate them?
I don't know exactly why this is so but I believe that there has been a change related to when RMSEA is zero. If you are not using the most recent version of Mplus, please download it. If you are, please send your output and data to firstname.lastname@example.org.
You cannot get eigenvalues through CFA but you can get them through EFA. They would be the same because they are based on the data, not on the estimated model.
You can use the DIFFTEST option to do chi-square difference tests for WLSMV. AIC and BIC are not available because they are based on maximum likelihood estimation.
Anonymous posted on Sunday, December 05, 2004 - 3:09 pm
Thanks for your reply! Just to add a few more questions: you said that "AIC and BIC are not available because they are based on maximum likelihood estimation"
But how come I can get AIC and BIC even when using WLSMV with continuous data (but not with binary data)?
About the DIFFTEST option to do chi-square difference tests for WLSMV, we can only use it to compare nested models, right? Can you explain to me what exactly "nested models" are? I want to compare models with 2 factors and models with 3 factors (with the same binary observed variables). But if these are not nested models, what can I do to find out which model (the 2-factor solution or the 3-factor solution) is the best one? (That's the reason why I want to use AIC and BIC.)
Would you suggest that I use maximum likelihood estimation to get the AIC and the BIC, even if I have binary data?
WLSMV is not allowed when outcome variables are continuous. I think if you look at your output, the estimator will be ML and there will be a warning telling you that WLSMV is not allowed.
Chi-square difference testing is appropriate only for nested models. Nested models are generally models that are special cases of the same set of observed variables. The 2 and 3 factor solutions would be nested but the difference test may not be valid because some parameters are on the border.
If you want AIC and BIC, you will need to use maximum likelihood.
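Once a maximum likelihood run provides a loglikelihood, AIC and BIC follow from the standard definitions AIC = -2 logL + 2k and BIC = -2 logL + k ln(N), with k free parameters and sample size N. A minimal Python sketch, with made-up loglikelihoods and parameter counts for a 2-factor and a 3-factor solution:

```python
import math


def aic(loglik, n_params):
    """Akaike information criterion: -2 logL + 2k."""
    return -2.0 * loglik + 2.0 * n_params


def bic(loglik, n_params, n):
    """Bayesian information criterion: -2 logL + k ln(N)."""
    return -2.0 * loglik + n_params * math.log(n)


# Hypothetical ML results; none of these numbers come from the thread.
two_factor = {"loglik": -1500.0, "n_params": 20}
three_factor = {"loglik": -1490.0, "n_params": 27}
n = 400

for name, m in (("2 factors", two_factor), ("3 factors", three_factor)):
    print(name, aic(m["loglik"], m["n_params"]),
          round(bic(m["loglik"], m["n_params"], n), 2))
```

Lower values are preferred. In this invented example AIC favors the 3-factor model and BIC the 2-factor model, which illustrates that the two criteria penalize extra parameters differently.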
Jasmin Tiro posted on Thursday, January 20, 2005 - 5:44 am
I am having a problem regarding fit indices with my mediational model. The final outcome variable is an observed ordinal variable with 4 levels. There are 2 mediators and 3 independent variables all measured as continuous latent variables. The model includes one direct path between the final outcome and 1 of the exogenous variables.
In Mplus Version 3 using WLSMV estimation:
CFI = 0.071 TLI = 0.912 RMSEA = 0.099
The problem is not due to the latent factors because the fit of the 5 factors using WLSMV estimation was not nearly as bad (CFI = 0.873; TLI = 0.861; RMSEA = 0.063).
The inconsistent fit of the CFI in the mediational model appears to stem from the estimation of the baseline/independence model. When comparing my model to the baseline model, the chi-square statistic goes down and the degrees of freedom get very small (see below). I know that WLSMV estimation computes degrees of freedom differently than ML, but I don't understand why the chi-square is decreasing when no covariation is assumed among the latent and final outcome variables.
Chi-Square Test of Model Fit Value 669.465* Degrees of Freedom 95** P-Value 0.0000
Chi-Square Test of Model Fit for the Baseline Model Value 627.692 Degrees of Freedom 9 P-Value 0.0000
When I used MLM estimation and assumed the final outcome variable was continuous, the fit was much better (CFI = 0.879; TLI = 0.869; RMSEA = 0.052) and the chi-square statistic for the baseline model increased as expected.
Chi-Square Test of Model Fit Value 1526.140* Degrees of Freedom 579 P-Value 0.0000 Scaling Correction Factor 1.318 for MLM
Chi-Square Test of Model Fit for the Baseline Model Value 8487.327 Degrees of Freedom 630 P-Value 0.0000
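The CFI and TLI values in this post can be reproduced from the four chi-square tests quoted above, assuming the usual definitions CFI = 1 - max(chi2_M - df_M, 0) / max(chi2_M - df_M, chi2_B - df_B, 0) and TLI = (chi2_B/df_B - chi2_M/df_M) / (chi2_B/df_B - 1). The Python sketch below uses hypothetical function names; that Mplus applies exactly these formulas is an assumption, although the results match the reported values to three decimals.

```python
def cfi(chi2_model, df_model, chi2_base, df_base):
    """Comparative fit index from model and baseline chi-square tests."""
    d_model = max(chi2_model - df_model, 0.0)
    d_base = max(chi2_base - df_base, 0.0)
    denom = max(d_model, d_base)
    return 1.0 if denom == 0.0 else 1.0 - d_model / denom


def tli(chi2_model, df_model, chi2_base, df_base):
    """Tucker-Lewis index from chi-square/df ratios."""
    ratio_base = chi2_base / df_base
    ratio_model = chi2_model / df_model
    return (ratio_base - ratio_model) / (ratio_base - 1.0)


# WLSMV run: model 669.465 (95 df), baseline 627.692 (9 df).
print(round(cfi(669.465, 95, 627.692, 9), 3), round(tli(669.465, 95, 627.692, 9), 3))
# MLM run: model 1526.140 (579 df), baseline 8487.327 (630 df).
print(round(cfi(1526.140, 579, 8487.327, 630), 3), round(tli(1526.140, 579, 8487.327, 630), 3))
```

In the WLSMV run the model's excess chi-square (669.465 - 95 = 574.465) is almost as large as the baseline's (627.692 - 9 = 618.692), so CFI is close to zero, while TLI works with chi-square/df ratios and stays high because the baseline has only 9 degrees of freedom.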
Dr. Muthen, I am running path models to test a theory in Mplus 3.01. I have continuous exogenous variables, and continuous and binary endogenous variables, predicting a single binary outcome. Can I get the total effect for each predictor on the outcome? I understand that I cannot simply add the indirect and direct paths with binary outcomes.
Please download Version 3.11 which is the most recent version of Mplus. You can obtain total and indirect effects using MODEL INDIRECT. See the user's guide for a description.
Anonymous posted on Thursday, February 24, 2005 - 8:28 am
When testing nested models using the DIFFTEST, is each model run separately and then the output from step 2 generates a chi-square value and df based on WLSMV for each model? Is it then appropriate to perform a chi square difference test using these values?
For WLSMV, the chi-square difference test is computed using the derivatives from the H0 and H1 models. You cannot obtain a proper chi-square difference test using the chi-square statistics from the H0 and H1 models.
Anonymous posted on Monday, March 21, 2005 - 6:01 pm
What suggestions can you give me to deal with the message below?
WARNING: THE RESIDUAL COVARIANCE MATRIX (PSI) IN GROUP LATINO IS NOT POSITIVE DEFINITE. PROBLEM INVOLVING VARIABLE GOAL2.
The model estimation terminated successfully. goal2 is a latent variable with 4 observed indicators.
I am doing a multiple group analysis.
CFI/TLI are 0.847/0.868 RMSEA 0.079
bmuthen posted on Monday, March 21, 2005 - 6:17 pm
First, you should update to version 3.12 - you received a message about this if you are a licensed user and sent in your registration card.
The message can be obtained for several reasons as is listed in the version 3.2 change of your warning message: negative factor variance, factor correlates 1 with other factors, factor is involved in linear dependencies with other factors, etc.
These are not available. So if you want them, you will need to compute them yourself. There are so many possible fit indices that we have included one from each of the most common fit index families.
Ringo Ho posted on Friday, April 15, 2005 - 5:33 am
hi Prof. Muthen
I have two questions about how the df are calculated in factor analysis models with categorical outcomes. I have three binary categorical variables and have fitted a factor analysis model with some specific constraints on the factor loadings and the residual variances. Because of the constraints on the residual variances, I use the THETA parameterization. I have two datasets that are almost the same: one has 6 response patterns (3 binary items give 8 possible patterns, but "0 0 0" and "1 1 1" are excluded), and the other also includes the "0 0 0" and "1 1 1" patterns (so all 8 possible patterns are in the data). I fitted exactly the same model to both datasets using the WLSMV estimator. My questions are: (1) How is the df of the fitted model calculated? I couldn't find the formula in the Mplus manual. Is it "(total number of threshold parameters + total number of tetrachoric correlations) minus (total number of parameters in the model)", or something else? (2) The df for the baseline model differed by 1 between the two datasets even though the same model is fitted. Why would this happen? I thought the df calculation should not be data-dependent. Did I miss something here?
Please let me know if I should send the input and output files to you.
Below is the extract of my input and output files using the 6 response pattern data set:
--------------------------------------------------
DATA: FILE IS data.txt;
  TYPE IS INDIVIDUAL;
VARIABLE: NAMES ARE y1-y3;
  CATEGORICAL are y1-y3;
ANALYSIS: TYPE=MEANSTRUCTURE;
  ESTIMATOR=WLSM;
  PARAMETERIZATION=THETA;
-------------------------------------------------
* This is the extract of output file based on the 6 response patterns data file *
-------------------------------------------------
Chi-Square Test of Model Fit
  Value 1.756*
  Degrees of Freedom 1**
  P-Value 0.1851
Chi-Square Test of Model Fit for the Baseline Model
  Value 429.792
  ---> Degrees of Freedom 2
  P-Value 0.0000
CFI/TLI
  CFI 0.998
  TLI 0.996
Number of Free Parameters 5
--------------------------------------------------
* Below is the extract of the input and output from the 8 possible response patterns data file but fitted by the same model *
---------------------------------------------------
DATA: FILE IS datamakeup.txt; ! 8 response patterns
  TYPE IS INDIVIDUAL;
VARIABLE: NAMES ARE y1-y3;
  CATEGORICAL are y1-y3;
ANALYSIS: TYPE=MEANSTRUCTURE;
  ESTIMATOR=WLSM;
  PARAMETERIZATION=THETA;
model constraint: p3=(-1)*p1+(-1)*p2;
OUTPUT: SAMPSTAT; STAND; RES; TECH1 TECH4 TECH6;
* Output !
Chi-Square Test of Model Fit
  Value 0.150*
  Degrees of Freedom 1
  P-Value 0.6981
  Scaling Correction Factor 0.610 for WLSM
Chi-Square Test of Model Fit for the Baseline Model
  Value 247.656
  ---> Degrees of Freedom 3
  P-Value 0.0000
CFI/TLI
  CFI 1.000
  TLI 1.010
Number of Free Parameters 5
---------------------------------------------------
Thank you very much for your help! Ringo
LMuthen posted on Saturday, April 16, 2005 - 4:43 am
1. The formula for the degrees of freedom for WLSMV can be found in the technical appendices on the Mplus website. I think the formula is 110 but I don't have this available right now. You can obtain degrees of freedom of the type you describe using WLS or WLSM.
2. The degrees of freedom for baseline models on two datasets can differ. Sample size is also involved in the degrees of freedom for WLSMV.
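The count described in the question (thresholds plus tetrachoric correlations, minus free parameters) is the kind of degrees of freedom that, per the reply above, WLS or WLSM provides. A small Python sketch of that count for binary items follows; the function names are hypothetical, and it does not attempt the WLSMV computation, which the reply says also involves sample size.

```python
def n_sample_statistics(n_binary_items):
    """Thresholds plus tetrachoric correlations for binary items."""
    thresholds = n_binary_items                      # one threshold per binary item
    correlations = n_binary_items * (n_binary_items - 1) // 2
    return thresholds + correlations


def model_df(n_binary_items, n_free_parameters):
    """Sample statistics minus free parameters (WLS/WLSM-style count)."""
    return n_sample_statistics(n_binary_items) - n_free_parameters


def baseline_df(n_binary_items):
    """Baseline model of uncorrelated items: one df per fixed correlation."""
    return n_binary_items * (n_binary_items - 1) // 2


print(model_df(3, 5), baseline_df(3))
```

For three binary items and 5 free parameters this gives 1 model degree of freedom and 3 baseline degrees of freedom, which agrees with the output for the 8-pattern data set; the count by itself does not explain the baseline value of 2 reported for the 6-pattern data set.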
I am estimating very similar models on the same sample, but getting very different model fit and modification indices. Specifically, I am estimating 2 models on the same sample (N=1,602)...
Model 1: X1,X2-->M1-->Y1 Model 2: X1,X2-->M1-->Y2
Where M1 is continuous, Y1 is binary (any visit vs. no visit), and Y2 is continuous (# visits; logged+1). X1 and X2 covary. I adjusted for clustering in both models (type = complex), and both have missing data on X1 and M1 variables (not on Y's).
Model 1 is estimated using WLSMV. Model fit was poor (CFI=.53; RMSEA=.07; WRMR=1.99). Modification indices suggested adding "X2 ON X1" to the model. I reestimated and model fit was acceptable.
Model 2 (estimated with MLR), immediately had acceptable model fit, and did not suggest the modifications in Model 1. Given that X1-->M are the same for both equations, I cannot understand why I'm getting such divergent model fit indices and suggested modifications in that portion of the model.
Any thoughts/suggestions? I am presenting both models in the same paper, so large differences in paths that are consistent across models are sure to raise flags.
Input is as follows:
/*=====Binary outcome=====*/ CLUSTER is VISN;
USEVARIABLES are Y1 X1-X30;
CATEGORICAL is Y1;
ANALYSIS: Type = complex meanstructure missing h1;
MODEL: F1 BY X1 X2 X3; F2 BY X4 X5 X6; F3 BY X7 X8 ;
F1 ON F2 F3 X9-X30; Y1 ON F1 F2 F3 X9-X30;
/*=====Continuous outcome=====*/ CLUSTER is VISN;
USEVARIABLES are Y2 X1-X30;
ANALYSIS: Type = complex meanstructure missing h1;
MODEL: F1 BY X1 X2 X3; F2 BY X4 X5 X6; F3 BY X7 X8 ;
I am confused by your Mplus input compared to your introductory description. In the latter you say "Modification indices suggested adding "X2 ON X1"", but x2 and x1 are factor indicators in the Mplus input. In terms of the Mplus input for the categorical run, which parameter did the MIs suggest including? Generally speaking, I wouldn't be surprised if a model with a binary outcome y1 fits differently than the same model with a different, continuous outcome. Also, how come f2 and f3 are not regressed on x9-x30?
Apologies. Let me start over. The two models are specified as follows:
Model 1 (binary Y; estimated with WLSMV):
F1 = function of (F2, F3, X9-X30) Y1 = function of (F1, F2, F3, X9-X30)
Model 2 (continuous Y; estimated with MLR):
F1 = f(F2, F3, X9-X30) Y2 = f(F1, F2, F3, X9-X30)
Where Y1 is any doctor visit versus not (0,1); Y2 is number of doctor visits (log(#visits+1)); F1-F3 are latent variables with multiple indicators; and X9-X30 are observed variables.
Model 2 fit the data well.
Model 1 did not, and suggested adding "F2 ON F3."
For context, my primary hypothesis is that F1 mediates the effects of F2,F3,X9-X30 on Y1,Y2. I will be submitting to health services research journal, where it is common to present utilization of health service results first as logit model for any vs. none, and then as OLS for total # visits...only the dependent variable changes in these models.
I would like to treat F2,F3,X9-X30 as exogenous and assume they are correlated as in multiple regression.
Returning to my problem...given that Y1 and Y2 are similar, and the rest of the model is identical, a red flag was raised when results indicated the "final model" would need to differ considerably to reach acceptable model fit.
Normally I would attribute the difference to the difference in outcome variables, but I also reran Model 1 treating Y1 as continuous, and the model fit similarly to Model 2 -- good fit with no major modifications suggested. From what I can see, it appears the difference is in the estimator (WLSMV versus MLR).
I would have thought that, if the covariances between F2,F3,X9-X30 were freed (as I would like them to be), then there would be very few degrees of freedom remaining for model fit and modification indices -- as it approaches a typical multiple regression specification. Whereas the MLR output lists covariance estimates between F2/F3 and X9-X30, the WLSMV output only lists a covariance estimate between F2 and F3.
Do I need to include WITH statements in WLSMV if I want to make the same "ceteris paribus" statement for my primary hypothesis in models 1 and 2?
If I am specifying this correctly, and I wish to present both in a paper, which model is more conservative?
I see. WLSMV does not automatically correlate x's (observed covariates) and exogenous factors (see bottom of this message), whereas this is done when all dependent variables are continuous. I would suggest making the 2 types of models (for y1 and y2) compatible by having in both
f2 f3 on x9-x30;
This then - for both types of models - makes f2 and f3 related to the x's and also lets f2 and f3 have a residual covariance. Then the results should be more compatible. Regressing the f2 and f3 factors on the x's is probably a realistic representation given that the x's may well be antecedent to the factors.
The reason why WLSMV does not correlate x's and exogenous factors is that the x's then become part of the model instead of being conditioned on, and this then forces model fitting via latent correlations instead of via the regression-based approach favored in Muthen (1984) - see Mplus references.
I have run a simple path model that exclusively involves observed independent and observed dependent variables. My independent variables are a combination of binary and continuous variables. My 3 dependent variables are all binary. Mplus provided fit indices for this model. However, I'm not clear as to whether these fit indices are appropriate to report. In other words, how are these fit indices computed in simple path analysis with no latent variables?
I'm interested in computing the population Gamma Index. But I wonder if it is possible to compute the population noncentrality index (NCI = (chi-square - df)/(N - 1)) for the WLSMV chi-square estimate. Does it make sense in the case of WLSMV?
Prof Muthen, Thank you for your reply to my posting on Fri. I did indeed leave out arrows between observed predictor variables within my model containing only observed variables. To clarify, are you saying that the fit indices (i.e., chi-square, CFI, RMSEA) are measuring to what degree I have accurately accounted for the relationships among all of my observed variables within the model, even among my observed predictor variables? Thank you!
Re: May 31 - 10:46. Typically, there would be no left-out arrows among the predictor variables (called x's in Mplus jargon). As in regression analysis, the x's are freely correlated and not part of the model. Typically, a model says something about the relationships between the x's and the y's, and among the y's. So, in typical cases, you don't even mention the correlations among the x's. But if you specify that the x's are not correlated, they are brought into the model and correlatedness among them contributes to the fit assessment.
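The arithmetic behind the index described in the post above is simple once a chi-square, df, and N are in hand. The Python sketch below assumes the commonly cited form Gamma hat = p / (p + 2 * NCI), with p the number of observed variables and NCI = (chi-square - df)/(N - 1); truncating NCI at zero is an added assumption, the function names are hypothetical, and the inputs are invented. Whether the result is meaningful when the chi-square comes from WLSMV is exactly the open question in the post, and the sketch does not settle it.

```python
def noncentrality_index(chi_square, df, n):
    """Rescaled noncentrality (chi2 - df) / (n - 1), truncated at zero."""
    return max(chi_square - df, 0.0) / (n - 1)


def gamma_hat(chi_square, df, n, n_observed_variables):
    """Gamma hat = p / (p + 2 * NCI), with p observed variables."""
    p = n_observed_variables
    return p / (p + 2.0 * noncentrality_index(chi_square, df, n))


# Hypothetical inputs for illustration only.
print(noncentrality_index(150.0, 100, 501))
print(round(gamma_hat(150.0, 100, 501, 10), 4))
```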
Prof Muthen, Thank you again for your helpful response to my posting for May 31 at 10:46am. In the model that I am running, I have multiple observed mediators. In this case, I assume that the mediators are "y" variables according to Mplus, given that they are regressed upon x variables within the path model structure. In path analyses such as this one, where multiple observed mediators exist, do the relationships among the observed mediators factor into the fit indices? I see in the output that Mplus estimates the relationship among the terminal y variables (i.e., the y variables that are the ultimate outcome in the mediation pathways), as indicated by "WITH" statements in the model output. Does Mplus generate similar estimates among mediating variables? I do not see that this is estimated in my current model output. Thank you!
If you do not see parameter estimates in the Results section of the output, then those parameters are not being estimated. If you want them to be estimated, you will need to explicitly add them to the MODEL command to override the Mplus default of having them fixed to zero.
Regarding my previous postings that referenced my path model that exclusively contains observed variables and multiple potential mediators - Forgive my ignorance as a novice Mplus user, what syntax would allow me to specify the pathways among my mediating variables? Would I use regression equation syntax (e.g., MediatorB ON MediatorC), or is there a better method? Ideally, I would prefer to account for the correlations among the mediators without specifying a hypothesized causal direction. Thank you!
ON is for regression relationships. WITH is for correlational relationships. The options of the MODEL command are described in Chapter 16 of the Mplus User's Guide.
Anonymous posted on Thursday, June 23, 2005 - 8:34 pm
I am a first-time visitor to Mplus, and have an urgent question that needs an answer right away. How can I use your service to calculate a chi-square difference test for my a priori model and a rival model to see which one has a better fit? Can you provide me with a step-by-step guide?
You should be able to find this information in any SEM textbook, for example, the Bollen book.
Deb Kloska posted on Wednesday, August 17, 2005 - 7:59 am
I have a question about SEM with categorical outcome variables. I thank you in advance for your assistance.
Measurement
Latent Outcome Y1: The single latent factor Y (Alcohol Abuse Disorder age 35) is being created from 19 ordered categorical variables (0,1,2,3) for 8008 weighted cases, no missing data. These observed variables are predominantly 0s.
Error: I would like to estimate the uncorrelated error variance of each of the observed variables, allowing for measurement error. In a separate step I would also like to set the error variance to 20%.
Independent Variable: is a single observed variable V308, heavy drinking age 18.
Structure
Latent X as measured by a single continuous indicator V308 predicts the outcome Y. X is also non-normal with a predominance at the lower end of a 1-6 scale.
Initially I ran just the measurement part of this model (step A. in the MODEL below) and the rmsea was .048 and TLI .989.
When I run the full model below I get the following message: THE MODEL ESTIMATION TERMINATED NORMALLY THE STANDARD ERRORS OF THE MODEL PARAMETER ESTIMATES COULD NOT BE COMPUTED. THE MODEL MAY NOT BE IDENTIFIED. CHECK YOUR MODEL. PROBLEM INVOLVING PARAMETER 21.
I do get model estimates STD and STDYX, but I do not get model fit statistics.
Also, in tech5 I get the following message 19 times. I know we have bivariate zero cells, but am not sure if this is the source of the problem. ZERO CELL PROBLEM: IR, J & K = 1 2 1
My questions are:
1. Have I misspecified the intended model?
2. Why do I not get any fit statistics? Is this a problem due to lack of variance?
3. How do I use the information in the tech outputs to determine where problems lie? The variable it identifies as a problem is not very different from many of the other variables in the model.
4. In a separate step I would also like to set the error variance of the observed variables to 20%. How do I do this?
Here is the model specified:
DATA: FILE IS data.RAW;
  FORMAT IS 27F9.5;
  TYPE IS INDIVIDUAL;
VARIABLE: NAMES ARE V106, V350, CLGSTAT, V308, V1708, V2708, V3708, V4708,
    C35152, C35153, C35154, C35155, C35156, C35157, C35158, C35159, C35160,
    C35161, C35162, C35163, C35164, C35165, C35166, C35197, C35199, C35200,
    C35201;
  USEVARIABLES ARE V106, v308, C35152-C35201;
  CATEGORICAL ARE C35152-C35201;
  WEIGHT IS V106;
ANALYSIS: TYPE = GENERAL;
  ESTIMATOR = WLSMV;
  PARAMETERIZATION = THETA;
MODEL:
  !A. create the latent factor Y;
  Y BY C35152@1 C35153-C35201;
  !B. estimate the error variances of the observed;
  C35152 - C35201;
  !C. create a latent factor X for the independent variable;
  X BY V308@1;
  !D. estimate the effect of x on y;
  Y on X;
OUTPUT: SAMPSTAT STANDARDIZED RES MODINDICES TECH1 TECH2 TECH3 TECH4 TECH5;
I completed several CFAs of a measurement model with five factors and 40 ordinal observed variables (six points) using both MPlus 4.0 and Lisrel 8.72. To account for the ordinal nature of the variables, I based the analyses on polychoric correlations in Lisrel (using the asymptotic covariance matrix) and I specified these variables as categorical in MPlus. The thresholds that I obtained were the same for both software packages. However, I found that the fit indices were not the same. In particular, the RMSEA was consistently higher when estimating the model in MPlus regardless of which estimator was used (WLSM or WLSMV), and the CFI was consistently lower when using the WLSMV estimator in MPlus. Here is an example:
I understand that the WLSMV estimator results in a different chi-square with estimated degrees of freedom. But can anyone explain why the other fit indices, in particular the RMSEA, are not consistently the same? I am unclear about how to interpret the RMSEA because the value obtained from Lisrel seems to be quite supportive of global fit whereas the value obtained from MPlus would be indicative of poor fit. Thank you.
What is your sample size? Are you using LISREL's diagonally weighted least-squares estimator? I don't know how LISREL computes RMSEA, but the CFI difference of 0.96 versus 0.72 would seem to be straightforward to try to understand given that the formulas are transparent in this case. I would start with comparing the baseline test of a diagonal correlation matrix that CFI uses.
Thanks Bengt. The sample size for the above is 1444 and I was using the DWLS estimator in Lisrel. My understanding is that this should be the same as the WLSM estimator in Mplus. Is that correct? I also tried different sample sizes and different CFA models, but the results consistently show a larger RMSEA and a smaller CFI when using Mplus in comparison to Lisrel. So the different global fit values remain puzzling to me. Also, although I checked that the observed correlations and thresholds are the same (and, obviously, the model specification is the same), the estimated correlations and residual correlations are slightly different when comparing both programs. Unfortunately it is beyond my mathematical capability to figure out why this might be. Should I be expecting the same results from both software programs, or is there a fundamental difference between the two with respect to CFA models? Thanks so much.
I think LISREL's DWLS is different from Mplus' WLSM and WLSMV because of differences in how the weight matrix is estimated - although asymptotically they are most likely the same. Because of this, the parameter estimates will be slightly different - as well as chi-square and the fit indices derived from chi-square. But it seems to me that a quantity such as CFI should be rather similar. If LISREL reports the fit of the "baseline" model (uncorrelated variables), you can compare that to Mplus and see if that's where the difference begins. Which approach is best seems like a question that calls for a simulation study. The Hu simulation that we have posted on the website shows that CFI works well in the WLSMV context.
Yes, the chi-square for the independence model is indeed substantially different. For your information, the following values pertain to the same model as the one I reported yesterday:
Lisrel 8.7 (DWLS estimator): baseline chi-sq (Df): 514052.97 (780)
MPlus 4.0 (WLSM estimator): baseline chi-sq (Df): 960726.69 (780)
MPlus 4.0 (WLSMV estimator): baseline chi-sq (Df): 60353.343 (49)
I am surprised by the magnitude of the difference, but this does seem to explain why the global fit indices are different. I agree that a simulation study would probably be very informative. I am still puzzled about why the CFI based on the WLSM in Mplus (CFI = 0.94, which is fairly close to the one obtained when using Lisrel) is so different from the CFI based on the WLSMV estimator (CFI = 0.72).
I am wondering about the estimation in categorical models. I have estimated the same model in Mplus (in Mplus you would call it a factor model with ordered categorical indicators, all loadings fixed to 1) as well as in SAS using PROC NLMIXED and in a specialised software for Rasch models (Winmira). PROC NLMIXED and Winmira results are only trivially different which could be expected considering that they apply different estimation techniques (marginal ML vs. conditional ML). The log-likelihood that Mplus reports is, however, non-trivially smaller than that for the other two programs. Could that have anything to do with the message: "** Of the 15625 cells in the latent class indicator table, 36 were deleted in the calculation of chi-square due to extreme values." Are "extreme value" cells already excluded when maximizing the likelihood function or only afterwards?
after digging through the technical appendices I found out that Mplus uses the proportional odds model for regressing ordered categorical variables on the latent vars, whereas the models I estimated with PROC MIXED uses an adjacent category logit link - both result in the same number of parameters, but I have not yet figured out if this could possibly influence model fit...
Hello Bengt and Linda, I have the following problem: I am testing a model with a binary dependent variable (WLSMV estimation) and a few latent factors as independent variables, with a sample size of 400. I am getting a very nice RMSEA of 0.043 and WRMR of .984, chi-square = 222.001 with df = 127. On the other hand, my CFI equals 0.781 and TLI equals 0.838. Would you think that this model has an acceptable fit, even though CFI is low?
I could not find any references other than Yu's dissertation about the good/bad cutoff points for fit indices. I thought maybe some new suggestions had come out for models with BINARY dependent variables.
I imagine that you have a p-value greater than .05 for the chi-square given the RMSEA. I suspect you have low correlations among your observed dependent variables making it hard to reject the H0 model. I would not conclude that this model fits the data.
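The low-correlation remark above can be illustrated with the numbers in the question. If CFI is taken as 1 - (chi2_M - df_M)/(chi2_B - df_B), the reported CFI and model chi-square imply how far the baseline (uncorrelated) model is from fitting. The Python sketch below is a back-calculation under that assumed formula, with a hypothetical function name; the baseline chi-square itself is not reported in the post.

```python
def implied_baseline_excess(chi2_model, df_model, cfi):
    """Solve CFI = 1 - d_model / d_baseline for d_baseline = chi2_B - df_B."""
    d_model = max(chi2_model - df_model, 0.0)
    return d_model / (1.0 - cfi)


# Values quoted in the question: chi-square 222.001, df 127, CFI 0.781.
print(round(implied_baseline_excess(222.001, 127, 0.781), 1))
```

A baseline excess of only a few hundred chi-square units means the model of uncorrelated variables is itself not dramatically wrong, so there is little room for CFI to show improvement even when RMSEA looks acceptable.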
I have been examining models with all dichotomous variables and using WLSMV to provide model fit indices. I was curious about the chi-square values that are given by using MLR. When I have run the same models, I see that MLR provides chi-square values. However, since WLSMV is default, I have trusted that this is the most appropriate estimator.
In short, is it appropriate to use the chi-square values from models using MLR both for model fit and comparisons?
Hello Linda, Thank you very much for the response. About the low correlations: This is a longitudinal model with 3 time points, with the binary dependent variable at the last time point. So in that case the correlations are not very high between time2 and time3. Is there any way I could improve the model with the low correlations?
I have tried running the same model, that has all 8 measured variables as dichotomous, using WLS, WLSMV, and MLR. WLS gives the "expected" number of degrees of freedom (28). WLSMV gives the appropriate number of degrees of freedom based on the robust correction (24). However, I cannot figure out how the df works for MLR. I think that the computation is based on total possible cells in the distribution (i.e. 2^8 = 256). However, there are 7 parameters being estimated in the model and the model reports 248 df. Are the total df computed as [(2^8)-1] = 255 in this situation?
The degrees of freedom for WLSMV are not computed in the regular way and should not be used in the regular way. The degrees of freedom for MLR should be the same as for WLS. It may be that some defaults are different and you are getting different parameters. If this does not help you, please send the input, data, and output for both WLS and MLR to email@example.com along with your license number.
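The arithmetic in the question above can be checked directly: for a chi-square test against the full frequency table, the usual count is (number of cells - 1) minus the number of free parameters. The Python sketch below uses a hypothetical function name, and that this is what the MLR output reports is the poster's conjecture, not something confirmed in the reply.

```python
def frequency_table_df(categories_per_item, n_free_parameters):
    """df for a frequency-table chi-square: cells - 1 - free parameters."""
    cells = 1
    for n_categories in categories_per_item:
        cells *= n_categories          # total response patterns
    return cells - 1 - n_free_parameters


# Eight dichotomous items and 7 free parameters, as in the question.
print(frequency_table_df([2] * 8, 7))
```

With 2^8 = 256 cells this gives 256 - 1 - 7 = 248, the value the poster reports.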
Salma Ayis posted on Friday, January 12, 2007 - 4:04 am
Two questions please! (1) How do I use BIC and AIC to judge the goodness of fit in a (one-parameter or two-parameter logistic) IRT model? (2) When do the Pearson chi-square and the likelihood ratio chi-square agree or disagree, and why? Is there a note or a reference that may help to understand these relations in a straightforward way? Many thanks for any suggestions.
Hello, I'm computing an LCA and use TECH11 (like K. Nylund in the web seminar) to get the threshold values and their related probability scales. Unfortunately, TECH11 does not display any probability scales. Any suggestions? Thanks for your help, Stephan
Hi Linda, no, I guess it's not the p-values. After computing an LCA I get threshold values and related Estimates, S.E. etc.
Here K. Nylund says that it is easier to interpret the estimates by looking at the probabilities, and switched to the next table. Hence users don't need to compute anything to interpret the estimates. Her output (TECH11) shows that people with a 1 in variable a have a z% probability of being in class x, and so on. Regards, Stephan
I think Karen must have meant that it is easier to look at the output probability values rather than the logit values. Mplus gives both logits and probabilities for some models, for example, models without covariates. See the output from Example 7.3 where you will find the results in both logit and probability scale. This has nothing to do with TECH11.
Hi Linda, thanks for your response. I've computed example 7.3 with my variables and got the desired 'RESULTS IN PROBABILITY SCALE'. Afterwards I used my syntax below and the probability scale was missing. However, I'll switch to the textbook version. Regards, Stephan
DATA: FILE IS "F:\sample.dat";
VARIABLE: NAMES ARE d7 d16;
  USEVARIABLES ARE d7 d16;
  Classes = z(2);
  CATEGORICAL ARE d7 d16;
  MISSING ARE ALL (-1234);
ANALYSIS: ALGORITHM=INTEGRATION;
  type = mixture missing;
  Starts = 20 2;
Output: TECH11 TECH1 TECH8;
PLOT: type = plot3;
o.k., good to know. I will delete this section. Thank you for your help! -Stephan
R. Carter posted on Tuesday, April 17, 2007 - 10:28 am
Hello, I'm new to Mplus and factor analysis. I'm doing a simple path analysis with 1 categorical dependent variable and 2 continuous IVs. One of the IVs also serves as a mediating variable.
In this example, I did not receive any fit indices such as a chi-square test. Is this to be expected given the type of analysis? (The syntax for my model is below.)
TITLE: Anxiety, Depression, and SI-FULL
DATA: FILE IS G:\Suicide Project\2007\FullSI.txt;
VARIABLE: NAMES ARE SI RCMCPRT CDINSI;
  CATEGORICAL IS SI;
ANALYSIS: TYPE=MEAN;
MODEL:
  CDINSI ON RCMCPRT;
  SI ON CDINSI RCMCPRT;
MODEL INDIRECT:
  SI IND RCMCPRT;
PLOT: TYPE=PLOT1 PLOT2 PLOT3;
OUTPUT: CINTERVAL Samp Stand Residual Mod(3.84) Tech1;
The model is just-identified. This is why there are no degrees of freedom.
R. Carter posted on Tuesday, April 17, 2007 - 1:49 pm
Hello Dr. Muthen, thanks so much for getting back to me so quickly.
Just a quick follow-up regarding my initial concern:
Would you recommend running separate models to free up a parameter so that I generate fit indices, and thus report them in the manuscript or is it ok to not report fit indices given my interest in both direct and indirect effects of anxiety on the DV and thus report the findings as a path analysis or regression analysis using Mplus etc.?
I am doing path analysis with both continuous and categorical dependent variables using the MLR estimator. All variables in the model are observed.
Following previous communication with MPlus support, my understanding is that loglikelihood can be used to assess model fit in this situation, as other fit indices are not available. The following values are provided in my output:
Max LL for the unrestricted (H1) model : -3553.456
LL Ho model: -3475.079 Ho scaling correction factor for MLR: 0.972
1) Is the unrestricted model a fully saturated model, ie. no degrees of freedom. If not, please explain the unrestricted/H1 model and how it is estimated.
2) Is it appropriate to assess model fit of the Ho model (my analysis model), by comparing it to the H1 model ?
So, in general, values of CFI and/or TLI below 0.95, suggest that the model does not have good fit?
In my specific case, the RMSEA is less than 0.06 and WRMR is less than 0.9, but the TLI and CFI are around 0.90. I am not sure how much weight to put on each to determine whether or not the model has a good fit.
My understanding was that if the sample is large, it is easy to find significant difference between the estimated and the "perfect" model. I am working with two samples, one is 1800 cases, and the other is about 6000 cases.
It is true that chi-square can be sensitive to sample size but that does not render it useless. You can consider doing a sensitivity analysis where you free parameters until you obtain a good fit. Then compare the results to your original model. If the results from the original model are different in the less-constrained model, this would indicate the chi-square was correct in saying the model fit poorly. If the parameters stay approximately the same, it would point to chi-square being sensitive.
I have a new dataset with ordinal items, although it's an 11-point scale. Mplus limits categorical items to 10 points. In this situation, would you recommend treating the data as continuous? Or collapsing two points (i.e., 1 and 2) to make it categorical? And does Mplus have a facility for doing that?
Also, in a single factor CFA with continuous indicators: is it possible to correlate errors of the dependent indicators?
I have two questions, which I split into two messages.
----------------------- Question 1. ----------------------- Back in 2003, you commented in a response to a user’s question regarding WRMR as follows:
"There have been few studies of the behavior of fit statistics for categorical outcomes. In your case, you have a combination of one categorical and several continuous outcomes. I know of no studies of the behavior of fit statistics in this situation. The following dissertation studied fit statistics for categorical outcomes. It can be downloaded from the homepage of our website.
Yu, C.Y. (2002). Evaluating cutoff criteria of model fit indices for latent variable models with binary and continuous outcomes. Doctoral dissertation, University of California, Los Angeles.
You may have to do a simulation study to see which fit statistic behaves best for your situation."
I wonder if you have seen any studies published since 2003 that solve this issue. I myself have a similar situation: three dichotomous mediating variables, six continuous outcome variables, and three independent variables, with the following fit indices:
CFI 0.95 TLI 0.94 RMSEA 0.12 WRMR 1.16
CFI and TLI look good. But RMSEA and WRMR look problematic. Among those four, which index should I trust?
----------------------- Question 2. ----------------------- I ran a similar model using Amos with the model that I described above. I got the following fit indices and did not understand why RMSEA was so different from the one that I got using Mplus.
CFI 0.92 TLI 0.84 RMSEA 0.75 (WRMR not calculated by Amos)
One thing that I should note is that I was originally using Amos but switched to Mplus because Amos did not sufficiently address dichotomous mediating variables. So I acknowledge that the model that I ran using Amos was not exactly the same as the one that I ran using Mplus. Yet, the coefficients are pretty similar to each other. Then I don't understand why RMSEA was so much worse in Mplus than in Amos, despite the similarity of the models. I would appreciate your help with these two regards. Thank you very much in advance.
None of the fit statistics from Mplus that you show meets the guidelines recommended in the Yu dissertation. Only CFI meets the Bentler cutoff. And you don't show chi-square which is probably the most studied fit statistic. I would not be satisfied that this model fits the data.
When you treat the variables as all continuous in Amos, you do not use the same estimator or methodology. You would expect some similarity in results but not agreement. This methodology also points to model misfit.
I am running a model with a binary outcome and have calculated a series of chi-square difference tests using DIFFTEST for WLSMV. The added paths are correlations between an exogenous and an endogenous variable. You have responded to previous inquiries that the CFI and RMSEA should be examined using standard cutoffs. Although the chi-square difference tests indicate improvement in fit, the CFI and RMSEA become worse (CFI drops from .81 to .44). I have examined the same exact model with similar continuous outcomes and the addition of these correlations has improved all fit statistics. Are the CFI and RMSEA valid for interpreting improvement in fit with binary outcomes? Are there any other peculiarities of modeling with binary outcomes that I may have overlooked?
Hello. I am running a model with a dichotomous outcome variable. I have estimates of coefficients. I would like to find the predicted probability of the outcome for each subject. Is there an Mplus command to do this? Thanks!
Is there some reason why the typical SEM fit indices cannot be calculated in categorical models using ML estimation and the logit link? The fit indices are available using WLSMV / probit.
I can imagine that one of the following is true:
1) The fit can be calculated, but it hasn't yet been implemented in MPlus. It should be possible to calculate these indices using MPlus output.
2) Fundamentally, these indices have not been defined or do not make sense.
Can you clear this up? I'd rather use ML / logit in my case because of the ease of calculating predicted probabilities. Also I am calculating indirect effects and their SEs using an equation given by Winship and Mare (1983) using the model constraint feature, but to do this using the probit link would require MPlus to have an inverse probit function. It is quite easy to do using a logit link due to the closed-form nature of the inverse logit transformation.
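The closed-form inverse logit mentioned above can be sketched as follows. This is only an illustration: the coefficient values are hypothetical, and it assumes the usual convention that a binary outcome is reported with a threshold, so the intercept is the negative of the threshold.

```python
import math


def inverse_logit(eta: float) -> float:
    """Map a linear predictor on the logit scale to a probability."""
    return 1.0 / (1.0 + math.exp(-eta))


def predicted_probability(threshold, slopes, x):
    """P(u = 1 | x) for a logistic regression reported with a threshold.

    Assumption: intercept = -threshold, slopes are logit coefficients.
    """
    eta = -threshold + sum(b * xi for b, xi in zip(slopes, x))
    return inverse_logit(eta)


# Hypothetical estimates: threshold 0.8, two slopes, one subject's covariates
p = predicted_probability(0.8, [0.5, -0.3], [1.2, 0.4])
print(round(p, 3))
```

Applying the function to each subject's covariate values gives a predicted probability per subject. The probit link would need the normal CDF in place of `inverse_logit`.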
It is a matter of what the unrestricted model (H1) is that we test our model (H0) against. With ML for categorical outcomes, H1 is the unrestricted multinomial model for the observed data frequency table. Mplus gives that, although this is not useful for large tables.
Weighted least squares fits models to the correlations for the y* variables, the underlying normal latent response variables. This gives a test of H0 against an H1 which is the unrestricted correlation model for the y* variables.
I'm testing a logistic regression model in Mplus and want to calculate the chi-square of my model by using the difference in log likelihoods between a model with 3 predictors and the empty model, but what would the syntax for this empty or unconditional model look like?
Thank you for your answer!
syntax model with predictors:
VARIABLE: NAMES ARE NR SCHOOL SEX TRACK DIPVA DIPMP ACH ACA1 WIS WET TAAL ARICHT ASCHO RSHO LEVEL RSHO1 RSUNIV;
  USEVARIABLES ARE SEX ACH ACA1 RSHO;
  CATEGORICAL ARE RSHO;
  MISSING IS ALL (-99.00);
ANALYSIS: ESTIMATOR = ML;
MODEL: RSHO ON SEX ACH ACA1;
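Once both loglikelihoods are available, the comparison described in the question is a likelihood ratio test: -2 times the loglikelihood difference, referred to a chi-square with df equal to the number of added predictors. A minimal sketch, with hypothetical loglikelihood values and a small pure-Python chi-square tail function for integer df:

```python
import math


def chi2_sf(x: float, df: int) -> float:
    """Upper-tail probability of a chi-square variable with integer df."""
    if x <= 0:
        return 1.0
    half = x / 2.0
    if df % 2 == 0:
        # Even df: finite Poisson-type sum
        term = math.exp(-half)
        total = term
        for k in range(1, df // 2):
            term *= half / k
            total += term
        return total
    # Odd df: erfc term plus a finite sum of half-integer gamma terms
    total = math.erfc(math.sqrt(half))
    term = math.sqrt(half) * math.exp(-half) / math.gamma(1.5)
    for k in range(1, (df - 1) // 2 + 1):
        total += term
        term *= half / (k + 0.5)
    return total


def lr_test(ll_restricted: float, ll_full: float, df: int):
    """Likelihood ratio chi-square and p-value for two nested ML models."""
    stat = -2.0 * (ll_restricted - ll_full)
    return stat, chi2_sf(stat, df)


# Hypothetical values: empty model vs. model with 3 predictors
stat, p = lr_test(ll_restricted=-520.0, ll_full=-500.0, df=3)
print(stat, p)
```

Both models need to be fitted to the same cases for the difference to be meaningful. With MLR instead of ML, the difference would also need the scaling correction.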
I was wondering whether, in a multiple-group CFA using WLSMV (delta), the chi-square is the sum of the chi-squares of the same model run separately in each group. I ran the following model separately in two groups (n1 = 448; n2 = 224):
VARIABLE: NAMES ARE V1 V2 V3 V4 V5 V6 V7;
  CATEGORICAL IS ALL;
  MISSING IS ALL (999 998 997);
MODEL: F1 BY V1 V2;
  F2 BY V3 V4 V5 V6 V7;
  V3 WITH V4;
  [F1@0 F2@0];
It fit the data well in both groups (just for interest here: Chi-square(g1) = 14.972, df = 11, p = 0.1838; Chi-square(g2) = 14.211, df = 9, p = 0.1150). Now, when I run the same model on both groups simultaneously, freeing the factor loadings in both groups, I obtain:
Chi-Square Test of Model Fit
Value = 40.329 df = 23 P-Value = 0.0141
Chi-Square Contributions From Each Group
G1 = 16.454 G2 = 23.874
My questions are: 1- why are the group contributions different from the chi-squares computed separately? 2- how can the model fit both groups separately but not simultaneously? I hope I'm not missing something obvious...
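The numbers in the post can be laid out side by side. This sketch only does the bookkeeping and does not explain the discrepancy. All values are copied from the question.

```python
# (chi-square, df) from the two separate single-group runs
separate = {"g1": (14.972, 11), "g2": (14.211, 9)}

# Group contributions and overall test from the simultaneous run
joint_contrib = {"g1": 16.454, "g2": 23.874}
joint_total, joint_df = 40.329, 23

sum_contrib = sum(joint_contrib.values())
sum_sep_chi2 = sum(chi2 for chi2, _ in separate.values())
sum_sep_df = sum(df for _, df in separate.values())

print(f"contributions sum to {sum_contrib:.3f} (reported total {joint_total})")
print(f"separate runs sum to {sum_sep_chi2:.3f} on {sum_sep_df} df (joint df {joint_df})")
```

The group contributions add up to the joint chi-square within rounding, while the separate runs total fewer df than the joint run. So the simultaneous analysis is not simply the two single-group analyses stacked.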
I wish to compare a number of nested models. The model has one predictor variable, two mediators, and a single outcome variable. The predictor and outcome are continuous variables. There are two mediational pathways, one path through a continuous mediator, and one path through a dichotomous variable. The nested models involve a multiple group analysis by which different pathways are allowed to vary by gender.
Here is my syntax:
grouping is f502_s (0=female 1=male);
categorical is cluster2;
missing are all(-999);
analysis:
  type = general;
  estimator = mlr;
  parameterization = theta;
  bootstrap = 5000;
model:
  map1_c on nd_c (b1);
  is_c on nd_c (b2);
  map1_c on is_c (b3);
  cluster2 on nd_c (b4);
  map1_c on cluster2 (b5);
model male:
  map1_c on nd_c (b1);
  is_c on nd_c (b2);
  map1_c on is_c (b3);
  cluster2 on nd_c (b4);
  map1_c on cluster2 (b5);
model constraint:
  new (dir ind0 ind1 ind9 tot0 tot1 tot9);
  dir=b1;
  ind0=b2*b3;
  ind1=b4*b5;
  ind9=ind0+ind1; !should estimate the TOTAL indirect effect;
  tot0=dir+ind0;
  tot1=dir+ind1;
  tot9=dir+ind9; !should estimate the TOTAL total effect;
output: standardized cinterval(bcbootstrap);
(PART 2) I was running this analysis with only the bootstrap option in the analysis statement, but it will only give me RMSEA and WRMR for model fit indices. I have included type=general, estimator=ML, parameterization=theta in hopes that this would provide me with AIC, BIC, and others, and most importantly, deviance and chi-square. However, when I run the model, it says:
*** WARNING in ANALYSIS command PARAMETERIZATION=THETA is not allowed for TYPE=MIXTURE or ALGORITHM=INTEGRATION. Setting is ignored.
*** ERROR in ANALYSIS command ALGORITHM = INTEGRATION is not available for multiple group analysis. Try using the KNOWNCLASS option for TYPE = MIXTURE.
Your help would be GREATLY appreciated! I have spent the last hours pulling my hair out over this one!
When comparing nested models using WLSMV, Mplus gives an output with the estimated delta chi-square and the delta df. Can these delta chi-squares and delta df be seen as parallel to the deltas when comparing nested models using ML? If so, what is the interpretation of 2(delta)df-(delta)chi-square (i.e. Akaike information criterion)?
Hi, I'm scaling my items in an IRT framework, using the MLR estimator in Mplus. For a Rasch model, I fix all loadings to one, for a 2 PL model, I estimate them freely. My aim is to check whether a Rasch or a 2PL model fits my data.
I have some questions concerning the output of both calculations:
Mplus gives BIC and sample-size adjusted BIC. We have data of 1187 subjects - which one to trust more?
Embretson and Reise describe a test comparing the log likelihood of both models with a chi square difference test - taking -2 times the log likelihood of both models and compare their difference to a critical chi square value. What I've done is using the log likelihood I get in Mplus. However, you do point out that for different estimators like WLSMV and MLR (the latter I use) one should use the difftest option instead of doing the chi square difference test by hand because these estimators calculate the chi square value and degrees of freedom differently. Does this apply to my case, too, where I use the log likelihood for both models?
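For reference, the information criteria in question can be computed from the loglikelihood and the number of free parameters. A minimal sketch, assuming the standard definitions and the usual sample-size adjustment n* = (n + 2) / 24. The loglikelihood values below are hypothetical, while the parameter counts (18 and 34) and n = 1187 come from the posts in this thread.

```python
import math


def information_criteria(loglik: float, n_params: int, n_obs: int) -> dict:
    """AIC, BIC and sample-size adjusted BIC from a loglikelihood."""
    deviance = -2.0 * loglik
    return {
        "AIC": deviance + 2 * n_params,
        "BIC": deviance + n_params * math.log(n_obs),
        # Adjusted BIC replaces n by (n + 2) / 24 in the penalty term
        "aBIC": deviance + n_params * math.log((n_obs + 2) / 24),
    }


# Hypothetical loglikelihoods for the Rasch and 2PL models, n = 1187
rasch = information_criteria(loglik=-10250.0, n_params=18, n_obs=1187)
two_pl = information_criteria(loglik=-10230.0, n_params=34, n_obs=1187)
print(rasch)
print(two_pl)
```

Lower values are preferred on each criterion. The adjusted BIC penalizes extra parameters less than BIC does, so the two can disagree about Rasch versus 2PL.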
Yes, thank you Linda. Can the delta chi-square and delta df be interpreted in such a way that one can compute delta AIC and select between the nested models based on the delta AIC (prioritizing parsimony)?
How to do difference testing depends on the estimator. With MLR you need to use the scaling correction factor given in the output following the directions on the website. WLSMV does not get a loglikelihood. Chi-square difference testing can be done using the DIFFTEST option.
thanks for your answer; I've done the calculations using the scaling correction factor. Just one more question concerning difference testing with MLR - what is the correct number of degrees of freedom for this test? To give an example - I have a test with 17 items. They are scaled in a 1 PL vs. 2 PL model. Is it the difference in number of free parameters I have in the Mplus output - which is (34 -18 = 16)? Or is it the actual difference in free parameters (34- 17 = 17)?
The correct number of degrees of freedom is the difference between the number of free parameters in the two models. The number of free parameters is given with the fit statistics at the beginning of the results section.
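The MLR difference test with scaling correction factors discussed above can be sketched as follows. This uses the commonly cited scaled loglikelihood difference formula. The loglikelihoods and correction factors are hypothetical, the parameter counts (18 and 34) are the ones mentioned in the question, and the function name is not an Mplus feature.

```python
def mlr_scaled_difference(ll0, c0, p0, ll1, c1, p1):
    """Scaled loglikelihood difference test for nested MLR models.

    Model 0 is the nested (more restricted) model, model 1 the comparison model.
    ll = loglikelihood, c = scaling correction factor, p = free parameters.
    """
    cd = (p0 * c0 - p1 * c1) / (p0 - p1)   # difference test scaling correction
    trd = -2.0 * (ll0 - ll1) / cd          # scaled chi-square difference
    df = p1 - p0                           # difference in free parameters
    return trd, df, cd


# Hypothetical output values for a 1PL (18 parameters) vs. 2PL (34 parameters)
trd, df, cd = mlr_scaled_difference(
    ll0=-10250.0, c0=1.10, p0=18,
    ll1=-10230.0, c1=1.20, p1=34,
)
print(round(trd, 3), df, round(cd, 4))
```

The resulting statistic is referred to a chi-square distribution with `df` degrees of freedom. With the counts reported in the output this gives 34 - 18 = 16.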
Dear MPlus experts, I am trying to compare models using the WLSMV estimator and DIFFTEST. To be precise I try to compare the following two models: 1) the partial mediation model in which both the continuous mediator and dichotomous dependent share the most antecedents, except for the mediator also impacting the dependent and 2) a “no mediation” model in which both the continuous mediator and dichotomous dependent have the same antecedents and the mediator does NOT impact the dependent (therefore the no mediation).
I would say that model 2 is nested within model 1, but the MPlus output states that this is not the case. Please see the syntaxes below. Could you help me out on this? The reviewers really want us to test a “no mediation model” as well and compare it with other models.
Many thanks and kind regards,
Step 1:
MODEL:
  failure ON growtht2 handeln industn handwn genhcm spehcm lambda24 lnncoown;
  growtht2 ON growtht1 handeln industn handwn genhcm spehcm lambda24 lnncoown;
SAVEDATA: DIFFTEST IS partmed.dat;
Step 2:
ANALYSIS: DIFFTEST IS partmed.dat;
MODEL:
  failure ON handeln industn handwn genhcm spehcm lambda24 lnncoown;
  growtht2 ON growtht1 handeln industn handwn genhcm spehcm lambda24 lnncoown;
Hello, I am using the WLSMV estimator to estimate and compare CFA-type models with ordered polytomous categorical indicators. I have a model in which all the parameters are freely estimated and I am comparing it to a nested, more restricted model in which the parameters are fixed to values that were obtained from a (separate) calibration sample. When I do the DIFFTEST (I am sure I am setting it up correctly) it is significant. The chi-square "test of model fit" value for the more restricted model is lower than the chi-square value for the freely estimated model, which leads me to believe it fits better. I have read on this board and elsewhere, however, that the chi-square value for the WLSMV estimator is wrong and doesn't mean much (it's the p-value that means something). Am I interpreting correctly that the more restricted model fits significantly better than the freed model, based on the chi-square values I am seeing? Again, the DIFFTEST is significant (p<.0001), but I don't know which model fits better because I am not sure about the meaning of the chi-square for each model under WLSMV.
I'm sorry, your response above is very clear with regard to how to interpret the DIFFTEST. But just for maximum clarification-- Which chi-sq values are meaningless? The DIFFTEST chi-sq value, the individual chi-sq values or (it seems like) all of them? Does this mean the chi-sq should never be reported (whether or not one is doing difference testing) when using WLSMV and categorical indicators? Thanks.
Very last question: Does sample size affect the sensitivity of the DIFFTEST? It would seem not to, because the df are based on the number of freed and fixed parameters and not on the sample size. But I am trying to be clear. Ultimately what I am seeing is that the DIFFTEST for WLSMV seems way too sensitive.
Thanks again. Sorry for being a bit disorganized in my questions.
Prior to Version 6, I would report only the p-value. With the new adjustment in Version 6, the chi-square values given and degrees of freedom agree with the p-value given in a chi-square table and can be reported. The difference in the chi-square values should not be used for difference testing. The DIFFTEST option should be used in this case.
I would think the DIFFTEST option is sensitive to sample size.
Dear Mplus team, I have version 5 at the office and version 6 on my home computer. I have been running some very simple models (regressing one outcome on one predictor at a time), but comparing the effect of different distributional assumptions on the fit and coefficients. I am getting vastly different outcomes for ML and MLR estimation going from v5 to v6. I see that there was a change in the way the LL is calculated between the two versions. But it seemed to be specific to models with covariates, of which these have none. Should I be concerned about this change? Thank you in advance for your help.
See Analysis Conditional on Covariates under Version History.
anonymous posted on Tuesday, April 19, 2011 - 8:29 am
I am conducting an LCA using complex survey data. The indicators are 6 different diagnoses and I've included only those individuals who meet criteria for at least one of the 6 diagnoses in the analysis (but many have multiple, hence the impetus to identify classes). I am obtaining some strange results in terms of conflicting information criteria. Whereas the Lo-Mendell-Rubin adjusted statistic is not significant at the 2-class solution, it becomes significant thereafter until the 8-class solution. Most other criteria also reach a low at the 7-class solution. Any ideas why the Lo-Mendell-Rubin test initially prefers the 1-class solution?
There is no theory for turning a weighted least squares chi-square into AIC.
I would use the same estimator with all models. I would recommend MLR if you are going to use maximum likelihood. Note that categorical data methodology handles floor and ceiling effects. These are not a problem for categorical outcomes.
Hello I am currently evaluating a CFA measurement model and subsequent structural regression model for an array of latent continuous, observed continuous/ordinal, and observed dichotomous variables and have been running into some unexpected model fit issues. The regression model is structurally just-identified (i.e., same number of paths being estimated as the measurement model), thus leading me to expect that the model chi-squared and other goodness of fit statistics should be identical with that of the measurement model. This is not, however, the case. Using WLSMV estimation, the model d.f., chi-square value, and goodness of fit statistics slightly differ when going from a measurement to structural model. Using WLSM estimation, model d.f. remains the same, but the chi-square value and goodness of fit statistics again differ when moving from the measurement to structural framework. What could cause these discrepancies? Is it something to do with how the chi-square value is estimated using WLSMV/WLSM? If I remove the observed dichotomous variables from the analysis and use maximum likelihood estimation, model chi-square and all goodness of fit statistics are identical in both measurement and structural models, as expected. Thanks, Anthony
And just to provide some additional info that I could not fit in the above post...
The CFA measurement model consists of 12 variables. The structural model involves regressing 10 outcomes (latent & observed continuous, observed dichotomous) on two predictors (1 observed continuous, 1 observed ordinal).
With maximum likelihood and categorical factor indicators, means, variances, and covariances are not sufficient statistics for model estimation so chi-square and related fit statistics are not available. In this case, nested models can be tested using loglikelihood difference testing where -2 times the loglikelihood difference is distributed as chi-square.
The chi-squares referred to in the message are chi-square tests of the observed versus the estimated frequency tables for the categorical indicators. These do not work well with more than 8 indicators.
I suggest using WLSMV if you want chi-square and related fit statistics.
Thank you for your answer. I want to use the model to compare two nested models, one without and one with an interaction between latent variables (XWITH); therefore I need to use ML to compare loglikelihoods. Q1: Can I report chi-square and related fit statistics based on WLSMV and the loglikelihood based on ML? Q2: If not, how do I examine model fit using ML in my case?
1. You should not report fit statistics from two different estimators. You should report the fit statistics from the estimator whose results are being reported.
2. The best you can do is use the fit statistics before the interaction is added and then add the interaction to see if it is significant. See the following FAQ on the website where issues related to this are discussed:
The variance of a dependent variable as a function of latent variables that have an interaction is discussed in Mooijaart and Satorra
I assume you are using TYPE=EFA and the WLSMV estimator, for which the DIFFTEST option is required for difference testing. DIFFTEST is not available for TYPE=EFA, but you can do EFA using ESEM, where it is available. See Example 5.24 (remove covariates and indirect effect) and Example 13.12.
Duru Alan posted on Thursday, August 30, 2012 - 7:19 am
I am estimating an ordered categorical cfa. I am confused with my fit indices since my chi2 is not significant and CFI also looks good, but RMSEA is very high. And there is no modification index value above 4. Does that mean this is an acceptable model?
MODEL FIT INFORMATION
Number of Free Parameters 38
Chi2 Value 8.935*
Degrees of freedom 11
P-Value 0.6279
Probability RMSEA <=.05 0.949
CFI 1.000
TLI 1.004
I don't see the high RMSEA. The probability that it is small is high.
Therese Shaw posted on Thursday, September 27, 2012 - 12:21 am
Hi I would like to test for measurement invariance using the difference in McDonald's non-centrality index (NCI) as recommended by Meade et al (2008) in "Power and Sensitivity of Alternative Fit Indices in Tests of Measurement Invariance" J Appl Psych. I am testing for invariance over time using categorical indicators in a one-factor model and have a sample size of approx 1900. I am using the difftest to get the chi-square test and calculating a difference in the CFI to compare models, but would like to also use the third measure recommended by Meade et al as I suspect the chi-square test may be affected by the large sample size. Thank you
We are comparing two repeated measures CFA models with two latent factors at two points in time using ordinal data and WLSMV. All latent factors are standardized. In model 1, all other measurement model parameters are freely estimated at both time points. In model 2, the measurement model parameters are constrained to be equal over time. The WLSMV chi-square difference test indicates, not surprisingly, that model 1 (unconstrained) fits relatively better than model 2 (constrained). However, the RMSEA is .047(.043-.050) for model 1 and .042(.039-.045) for model 2. We have checked Technical 1 to verify that all model constraints are exactly as they were intended. Do you have a possible explanation for these seemingly discrepant results of fit indices?
It sounds like the structural part is saturated or that the measurement model is just-identified. If you want a definitive answer, send the outputs and your license number to firstname.lastname@example.org.
Lin Gu posted on Friday, August 16, 2013 - 5:08 am
I tested two nested models with categorical variables by using WLSMV. The chi-square value was lower in the more restricted (H0)model than in the less restricted model (H1). I also got smaller df in H0.
I wonder why this happened because the more restricted model(H0) should have had a higher chi-square value and more df.
Can we interpret the WLSMV-based chi-square values as that the lower the value the better the model fit?
Can I still use DIFF test to compare the two models?
You must use DIFFTEST with WLSMV. You cannot directly compare the chi-square values.
Qi Shi posted on Friday, September 06, 2013 - 7:22 pm
Dear Professors, I am a beginner user of Mplus and I'm in the process of rerunning the analysis for my dissertation because LISREL didn't run for some reason, so now I'm trying to learn Mplus to answer my research question. I would really appreciate it if you could offer some help. Specifically, I used a secondary dataset, but the sample that will be used in my study is only 469. There's a lot of missing data (ranging from 10%-40%). The models I'm testing include binary categorical outcome variables. I used WLSMV. My question is: is it true that Mplus by default treats missing data using FIML? FIML works under the MAR missing data assumption; however, I think my data are MNAR. The model fit I got is: chi-square value: 22.952, df = 23, p-value: 0.46; RMSEA: 0.000, 90% CI = (0.000-0.038); CFI: 1.000; TLI: 1.001; WRMR: 0.538.
So, this model has good fit, right? My dissertation committee asked for SRMR, but how can I get SRMR using Mplus?
The default in Mplus is to use all available information rather than listwise deletion. With maximum likelihood, FIML is used. With WLSMV, pairwise present is used.
Such good fit could be the result of low power to reject the H0 model because of low correlations.
When SRMR is available, it is given automatically.
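A small check of why the reported RMSEA is exactly 0.000 for this model. The sketch uses one common form of the RMSEA point estimate, sqrt(max(chi-square - df, 0) / (df * N)). Programs differ on whether N or N - 1 appears in the denominator, and the thread does not say which Mplus uses, so the formula here is an assumption.

```python
import math


def rmsea(chi2: float, df: int, n: int) -> float:
    """RMSEA point estimate (single group), assuming N in the denominator."""
    return math.sqrt(max(chi2 - df, 0.0) / (df * n))


# Values from the post: chi-square 22.952 on 23 df, sample size 469
print(rmsea(22.952, 23, 469))
```

Because the chi-square is smaller than its degrees of freedom, the numerator is truncated at zero and the estimate is 0 whichever denominator is used. This is why a low-power situation can produce fit indices that look perfect.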
Qi Shi posted on Saturday, September 07, 2013 - 11:34 am
Thanks a lot, Professor Muthen! One follow-up question: is it OK to go ahead and interpret the results using these model fit statistics? If I have such good fit (probably due to low power to reject the H0 model), can I still conclude that this model fits the data well?
If you have low correlations, I think you need to consider that the lack of power is most likely why you see such good fit. You could do a simulation study to see if this is the case.
dvl posted on Friday, September 13, 2013 - 7:51 am
Just a quick question:
I have a simple path model with 3 dichotomous variables: 1 exogenous, 1 mediating, and 1 outcome variable (each variable represents change between two waves). I specified my two endogenous variables as categorical, and Mplus then gives me RMSEA, chi-square, CFI and WRMR statistics. RMSEA and CFI are not good at all, but is that surprising for my model? Which fit statistics are appropriate to consider for my model specification?
I would place emphasis on the following fit statistics in this order of importance:
chi-square, RMSEA, CFI
Mher B. posted on Friday, September 20, 2013 - 7:42 am
Dear professors, Some articles suggest that ULSMV (or ULS?) outperforms WLSMV (or DWLS) (Forero, 2009, Rhemtulla, 2012). Q1: What is your opinion regarding this conclusion? I decided to try ULSMV instead of WLSMV when estimating a CFA with 65 ordinal DVs, 15 latent IVs and N=230 in Mplus 7. When I do the CFA with ULSMV I do not get SRMR or WRMR. I thought it would be preferable to have more information about the fit of the model and shifted to WLSMV, which also provided me with WRMR (but not SRMR). (!) When I switch from the CFA to the ESEM framework I get both SRMR and WRMR with both the WLSMV and ULSMV estimators. The initial rationale for switching from ULSMV to WLSMV disappeared. So I think I should switch back to ULSMV, but now I am concerned about the lack of studies investigating proper cutoffs for descriptive fit indices with ULSMV estimation. (If you are aware of such studies, can you mention them?) Q2: Is it logical to refer to such studies based on WLSMV estimation (e.g. Yu, 2002) as general recommendations for categorical variable methodologies? If not, should I continue using WLSMV because we (probably) know more about fit indices based on WLSMV? Q3: Can you please comment on this discrepancy in the available fit indices for the same estimators between the CFA and ESEM frameworks? Are the interpretations of the SRMR and WRMR indices in ESEM the same as in CFA? Thank you very much!
q1. I agree that Forero-Maydeu (2009) show a certain advantage for ULSMV over WLSMV in their study, but both are expected to perform reasonably well in most cases, so I wouldn't just switch from WLSMV to ULSMV.
q2. Yes. Perhaps.
q3. I don't think you need SRMR - with CFA you already have chi-2, RMSEA, and CFI. SRMR is more of a descriptive statistic that you typically use with EFA explorations. I would not rely on WRMR very much because it sometimes gives results quite different from other fit indices.
I would also suggest including comparisons of your hypothesized model not only with the unrestricted H1 model but also with other H0 models that are only somewhat more relaxed, what I call "neighboring models".
Mher B. posted on Monday, September 23, 2013 - 3:52 pm
Thank You a lot Dr. Muthen for your very helpful assistance. Can I ask a more specific question?
I got a model with the following fit(WLSMV): X^2 = 1579.936, df = 1097, p= 0.0000; RMSEA = 0.043, C.I. 0.039-0.048, p(RMSEA <=.05)= 0.990; CFI = 0.975; TLI = 0.970; SRMR = 0.056; WRMR = 0.952.
Modifications improved only the descriptive fit indices. Kline (2011) reminds us about the importance of the X^2, while some studies report that WLSMV X^2 values are inflated (particularly in harsh conditions) (e.g. Potthast, 1993; Beauducel, 2006). Q1: Do you think it is reasonable to accept the model with the fit reported above?
I have also tried to get a non-significant X^2 by saturating the model, following the 'sensitivity analysis' for parameter estimates that you once suggested for cases where we think X^2 is overestimated. So far I have not succeeded (I run into problems with identification or convergence). Q2: Would you recommend more effort in this direction?
Q1 If you have a large sample size (several thousand observations), I would find this model to fit reasonably well if you have done a good sensitivity analysis.
q2 Sounds like you are not doing the analyses correctly.
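For readers who want to see how the RMSEA in the fit summary above relates to the chi-square, here is a minimal sketch of the usual single-group point estimate, sqrt(max(chi2 - df, 0) / (df * N)). It is not taken from the thread. The sample size N = 230 comes from the poster's earlier message and is only an assumption for this particular model, and some programs divide by N - 1 rather than N, so the function exposes both variants.

```python
import math


def rmsea(chi2: float, df: int, n: int, use_n_minus_1: bool = False) -> float:
    """Point estimate of RMSEA for a single-group model.

    chi2 and df are the model chi-square and its degrees of freedom;
    n is the sample size. The N versus N - 1 choice is left to the caller.
    """
    scale = (n - 1) if use_n_minus_1 else n
    # A chi-square below its degrees of freedom gives RMSEA = 0.
    return math.sqrt(max(chi2 - df, 0.0) / (df * scale))


# Chi-square and df as quoted in the post; N = 230 is an assumption
# carried over from the poster's earlier message.
estimate = rmsea(chi2=1579.936, df=1097, n=230)
print(round(estimate, 3))
```

With these inputs the estimate comes out at roughly 0.044, close to the reported 0.043. Exact agreement should not be expected, since the sample size actually used for that run is not stated in the post.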
Mher B. posted on Monday, September 23, 2013 - 4:13 pm
>>> Also thank you for your recommendation regarding "neighboring models".
I wanted to separate one factor from its EFA set in ESEM and test it as a nested model. Brown (CFA for applied research, 2006) says "use of the x^2 diff test is not justified when neither solution provides an acceptable fit to the data". According to this, diff tests are useless in my case, aren't they? (And I cannot make use of "neighboring models".) Q3: So I can take a factor out of the EFA set based only on negligible cross-loadings (at least I have shown negligible cross-loadings, instead of leaving the factor out of the EFA set from the beginning because it needed to serve as DV) and theoretical considerations (DV), can't I?
Diff tests are not good if both of the two models fit very poorly. You want to choose your neighboring model to fit reasonably well.
Q3. I don't understand your question. Please restate.
Mher B. posted on Thursday, September 26, 2013 - 3:31 pm
I think you have already answered that question. I wanted to separate a latent variable from an initial EFA set where it was grouped with some other latent variables. I wanted to justify that decision by the diff test and theory. As you confirmed, the diff test is not useful in this case (poorly fitting parent model), so it is not reasonable to make such modifications until I get a well-fitting (parent) model, right?
Thank You very much!
Liz Goncy posted on Monday, October 14, 2013 - 10:14 am
I ran a CFA with 26 indicators across four factors. All indicators were ordered polytomous and treated as such in the model. I'm attempting to test alternate models and would like to examine differences in BIC. I know the default estimator (WLSMV) does not provide this and have changed the estimator to ML. However, I'm getting an error message about covariances not being defined with the ML estimator that I don't get with WLSMV. The error is specific to WITH statements for items that are conceptually similar - but distinct - and I'd like to model their covariance (e.g., He slapped me. Violence Victimization VS. I slapped him. Violence Perpetration).
*** ERROR in MODEL command
Covariances for categorical, censored, count or nominal variables with other observed variables are not defined.
Problem with the statement: UN012 WITH UN025
We don't allow the WITH option to specify residual covariances because each one requires one dimension of integration when maximum likelihood estimation is used. If you don't have too many, you can specify
I have a questionnaire with 55 items (not normally distributed). The items are categorical variables (a three-point Likert scale). I need to find a factor model using EFA. I have looked at a number of solutions.
The parallel analysis suggested four-factor solutions.
WLSMV: Often gave a three-factor solution (no loadings on the fourth factor).
MLR with categorical variable gives the four-factor solution. The Fit indices of this four-factor model are better than the three-factor solution (CFA).
Is this MLR with categorical variables (EFA) more suitable for my data processing?
I think that I cannot use MLR with continuous variables (EFA) since the data are categorical?
In fact, when I run the CFA with continuous variables, I get much worse fit indices (e.g., CFI and TLI <.82). With categorical variables, the model fit is about CFI / TLI = .95 and RMSEA = .63. Why is this? Are these fit indices (CFI / TLI = .95 and RMSEA = .63) sufficient (items=51, four factors, n=618)?
You should treat the variables as categorical and decide on three or four factors based on the substantive interpretation of the two factor solutions.
Tracy Zhao posted on Monday, January 27, 2014 - 7:57 pm
Hi, I am doing a simulation study in which I analyzed simulated categorical data for a CFA model. I used the "WLSMV" option and don't see SRMR being displayed in my output. The input file is a rather simple one:
Title: XXXXX rep 1
DATA: FILE IS 'D:\XXXX1.DAT';
VARIABLE: NAMES ARE y1-y8;
  CATEGORICAL ARE y1-y8;
ANALYSIS: parameterization = theta;
  ESTIMATOR = WLSMV;
MODEL: f BY y1-y8*;
  f@1;
SAVEDATA: RESULTS ARE XXXX1.dat;
Please let me know what I should do to get the SRMR in the output. Thanks!
SRMR is not available for categorical options when the model includes thresholds. You can try adding MODEL = NOMEANSTRUCTURE to the ANALYSIS command to see if you get it then.
Lynn Vinc posted on Monday, March 17, 2014 - 5:09 am
Dear Linda, We ran a cross-lagged panel model in Mplus with categorical predictors and categorical outcome variables. As we wanted odds ratios for the outcome variables (instead of probit regression coefficients), we used ML as the estimator (estimator is ml; algorithm is integration; integration is montecarlo;). However, after including this command, we do not get fit statistics like RMSEA, chi-square, BIC and SRMR. Is there an option to still get these fit statistics? Or is this only possible for probit regression?
I'm running a latent growth curve model with three-level categorical indicators. I also have time-varying covariates, which are my main variables of interest. Observations are clustered in families, so I'm also using the CLUSTER = command. I ran the model both using WLSMV and MLR. I was planning to use the parameter estimates from MLR and the model fit statistics from WLSMV. I found though that the p-values of the parameter estimates were very different between the two models. I knew the parameter estimates would differ due to logit vs. probit, but wasn't expecting large differences in statistical significance. Do you know why that would be the case? Is it invalid to use the model fit statistics from WLSMV if I'm reporting the parameter estimates from MLR?
Yes, it is invalid to use model fit statistics from WLSMV and parameter estimates from MLR. Differences may be due to different missing data procedures in the two estimators and different models fitting the data differently.
Lynn Vinc posted on Thursday, March 27, 2014 - 4:25 am
Hi Linda, Thanks for your response. Do you have a literature reference for us in which this is explained? Kind regards
The chi-squares that are referred to here compare observed versus expected frequencies of the multiway frequency table of your categorical factor indicators. They are not useful if you have more than about 8 indicators and if they do not agree. It sounds like you have more than 8 indicators.
The chi-square that you are thinking about and related fit statistics like RMSEA etc. are not available when maximum likelihood estimation is used with categorical dependent variables. In this case, means, variances, and covariances are not sufficient statistics for model estimation.
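To illustrate why the frequency-table chi-squares mentioned in the reply above stop being useful as the number of indicators grows, here is a small sketch, not from the thread, that counts the cells of the full response-pattern table and the average number of observations per cell. The sample size of 1000 and the helper names are hypothetical.

```python
def n_response_patterns(n_items: int, n_categories: int) -> int:
    """Number of cells in the multiway frequency table of the indicators."""
    return n_categories ** n_items


def mean_count_per_cell(n_obs: int, n_items: int, n_categories: int) -> float:
    """Average number of observations available per cell of that table."""
    return n_obs / n_response_patterns(n_items, n_categories)


# Hypothetical sample of 1000 respondents answering binary items.
for items in (4, 8, 12):
    cells = n_response_patterns(items, 2)
    per_cell = mean_count_per_cell(1000, items, 2)
    print(f"{items} items: {cells} cells, {per_cell:.2f} observations per cell")
```

With 8 binary items there are already 256 cells, and with 12 there are 4096, so most cells are empty or nearly empty in samples of typical size and the observed-versus-expected comparison becomes unreliable.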
Now please, a related question is: what is(are) the best way(s) to compare different models in my case (categorical indicators, MLR estimation), as I have also a correction factor for the loglikelihood?
OK, thanks again Linda, I just needed a confirmation. In relation to the same models I am enquiring about, is it normal that the estimation of a second-order factor model (with categorical items) takes 14 hours and is still not finished? I would prefer to continue with ML-type estimation. My model is as follows:
Analysis: type is general;
  estimator is mlr;
Model: ME by x1 x2 x3 x4 x5;
  GE by y1 y2 y3 y4;
  BL by z1 z2 z3;
  G by ME GE BL;
Thank you.
Hello Dr. Muthen I'm new to this board and have been trying to find the answer to how to report fit indices for models with binary outcomes.
I have a path model with 1 continuous exogenous variable, 4 continuous mediator variables and 3 binary dependent variables. Because the outcome variables are binary I used estimator=mlr.
I have two questions about this: 1. If required, how would I report the fit statistics for this model? The MLR analysis provides me with AIC/BIC and loglikelihood measures, and I am not sure that I (or the reviewers of my paper) know how to interpret these. This is what I get:
MODEL FIT INFORMATION
Number of Free Parameters 58
Loglikelihood
  H0 Value -1926.380
  H0 Scaling Correction Factor for MLR 1.051
2. I would like to be able to control for the effect of each dependent variable on each other and was told to use parameterization = theta; However, this gives me the following message: "PARAMETERIZATION=THETA is not allowed for TYPE=MIXTURE or ALGORITHM=INTEGRATION. Setting is ignored."
1. When absolute fit indices are not available, chi-square difference testing using the loglikelihood can be used to compare nested models. BIC can be used to compare neighboring models that have the same set of dependent variables.
If you want absolute fit statistics, you can use WLSMV.
2. PARAMETERIZATION=THETA is available for WLSMV.
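As a rough illustration of the loglikelihood-based difference test mentioned in point 1, here is a minimal sketch of the scaled difference computation commonly described for MLR, which combines each model's loglikelihood, scaling correction factor, and number of free parameters. The nested-model values are those quoted in the question. The less restricted comparison model (loglikelihood -1920.000, correction factor 1.060, 60 parameters) is entirely hypothetical, and the function name is mine.

```python
def scaled_loglik_difference(l0, c0, p0, l1, c1, p1):
    """Scaled chi-square difference from two MLR loglikelihoods.

    l0, c0, p0: loglikelihood, scaling correction factor, and number of
                free parameters of the nested (more restricted) model.
    l1, c1, p1: the same quantities for the less restricted model.
    Returns the test statistic and its degrees of freedom.
    """
    if p1 <= p0:
        raise ValueError("the comparison model must have more free parameters")
    # Difference-test scaling correction.
    cd = (p0 * c0 - p1 * c1) / (p0 - p1)
    # Scaled difference statistic, referred to a chi-square with p1 - p0 df.
    trd = -2.0 * (l0 - l1) / cd
    return trd, p1 - p0


# Nested model: values from the post. Comparison model: made-up values.
trd, df = scaled_loglik_difference(
    l0=-1926.380, c0=1.051, p0=58,
    l1=-1920.000, c1=1.060, p1=60,
)
print(round(trd, 3), df)
```

The statistic would then be compared with a chi-square distribution with `df` degrees of freedom. In real data the correction `cd` can occasionally turn out negative, in which case this simple formula does not give a usable test.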
Eiko Fried posted on Monday, June 23, 2014 - 1:55 pm
Dear Dr Muthens,
I am running a MIMIC model with 11 categorical x1 through x11 and one latent factor y1. MPLUS does not provide absolute fit indices such as RMSEA & CFI. What could be the reason for this? I could not find information in the v.7 handbook.
If you are using maximum likelihood estimation, absolute fit statistics are not available for categorical outcomes.
ri ri posted on Wednesday, August 20, 2014 - 12:19 am
We conducted a CFA and wanted to see the model fit of the measurement model. We did it in AMOS first and got very good model fit (all indices met the rules of thumb). Since AMOS does not differentiate between continuous and categorical data, I re-analyzed the model in Mplus. Before I specified the categorical variables, the result was almost the same as in AMOS: RMSEA .046, CFI .98, TLI .975, WRMR .055. But when I specified the categorical variables, I got a low TLI and also a relatively low CFI.
Chi-Square Value 149.197*; DF = 91
RMSEA = 0.054
CFI = 0.909
TLI = 0.880
Chi-Square Test of Model Fit for the Baseline Model: Value 760.155, Degrees of Freedom 120
WRMR = 0.620
I wonder whether this model does not fit after all? Can I improve the model fit with the same constructs?
Here is my syntax:
USEVARIABLES ARE X1 x2 x3 m1 m2 m3 y1-y7;
CATEGORICAL = y6 y7;
ANALYSIS: ESTIMATOR = WLSMV;
MODEL: xw BY x1 x2 x3;
  mw BY m1 m2 m3;
  Y1w BY y1 y2 y3;
  Y2W BY y4 y5;
  Y6 WITH xw mw y1w y2w y7;
  Y7 WITH xw mw y1w y2w y7;
OUTPUT: TECH1 CINTERVAL;
You will obtain different results when you treat variables as categorical versus continuous. When categorical variables are treated as continuous, their correlations are attenuated. Lower correlations make it more difficult to reject the model.
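The attenuation described above can be made concrete for the simplest case. The sketch below is not from the thread. It assumes two underlying bivariate normal variables with correlation rho that are both cut at their medians into binary items, in which case the Pearson correlation of the binary items is (2/pi) * arcsin(rho). Other thresholds or more categories give different amounts of attenuation.

```python
import math


def phi_after_median_split(rho: float) -> float:
    """Pearson correlation of two binary items formed by cutting
    bivariate normal variables (correlation rho) at their medians."""
    return 2.0 * math.asin(rho) / math.pi


for rho in (0.3, 0.6, 0.9):
    attenuated = phi_after_median_split(rho)
    print(f"underlying {rho:.1f} -> observed {attenuated:.3f}")
```

An underlying correlation of 0.6, for example, shows up as roughly 0.41 between the binary items. Treating such items as continuous therefore hands the model smaller correlations to reproduce, which is the reason given in the reply for why the model is harder to reject.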
Julie Self posted on Wednesday, October 01, 2014 - 8:02 am
I am running path analysis which includes one binary mediator variable and accounts for clustering. Based on my desire for fit statistics and the suggestions from previous discussion posts, I just changed from MLR to WLSMV.
When using MLR, my code for analysis section was: Analysis: ESTIMATOR = mlr; TYPE = complex; INTEGRATION = MONTECARLO;
When switching to WLSMV, I used: Analysis: ESTIMATOR = wlsmv; PARAMETERIZATION=THETA; TYPE = complex;
No other code changed.
I got the desired model fit statistics, but my means in estimated sample statistics are way off, and I can't figure out why. For example, the mean for my dependent variable, height-for-age z-score, changed from -2.212 to -9.173 (implausible).
Can you offer any explanation or fix for this problem? I'd really like the absolute fit indices, but I don't think I can use the output knowing that my means are off.
Is there a reason why Mplus hasn't added limited-information fit statistics like M2 (Maydeu-Olivares & Joe, 2006) and M2* (Cai and Hansen, 2013)? Are there plans to include in future versions of Mplus? It would be helpful to have more tools for assessing model fit using ML for categorical outcomes.
I'm curious as to why, when I run an EFA with all categorical variables, I get model fit information (RMSEA, CFI/TLI, SRMR) but when I run the model as a CFA I only get Loglikelihood and Information Criteria? If I am trying to establish unidimensionality, can I use the model fit indices from the 1 factor EFA? Or should I use the loglikelihood/information criteria from the CFA?
I think what you are seeing may be a function of the default estimators used. For categorical outcomes with WLSMV you get the usual model fit information, but with ML you don't. With ML you need to look at TECH10 fit.
Dear Linda, We ran a two-level path model with one ordinal and two continuous outcomes, continuous x-variables (cross-lagged panel) and some covariates. Is it OK to use ULSMV (N=2600)? Which fit indices are appropriate? I read in Mplus Discussion that CFI and RMSEA are not used for the testing of nested models. Can I use the p-value in the Chi-Square Test of Model Fit for the Baseline Model? Are other indices like TLI and SRMR OK? Thank you!
When considering the model fit for a simple CFA with dichotomous items (WLSMV estimator), what is the order of importance (chi-square, TLI, CFI, RMSEA, WRMR)? Is there any research article on it? Thank you!
I am running a logistic regression analysis on complex survey data with MLR estimation. Could you tell me what the R-square is that is reported with the standardized output? Is it a pseudo R-square such as Cox-Snell? Thanks in advance.
It is the R-2 for a continuous latent response variable in line with the 1985 article by McKelvey & Zavoina.
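As a sketch of what an R-square for a continuous latent response variable looks like in the McKelvey and Zavoina sense, the function below divides the variance of the linear predictor by that variance plus the residual variance implied by the link: pi^2/3 for logit and 1 for probit. It is not taken from the thread. The function name, the use of the population variance, and the example values are my assumptions, and Mplus's exact computation (for example with sampling weights) may differ.

```python
import math
from statistics import pvariance


def mckelvey_zavoina_r2(linear_predictors, link="logit"):
    """R-square for the latent response y* = linear predictor + error."""
    if link == "logit":
        residual_variance = math.pi ** 2 / 3  # standard logistic error
    elif link == "probit":
        residual_variance = 1.0  # standard normal error
    else:
        raise ValueError("link must be 'logit' or 'probit'")
    explained = pvariance(linear_predictors)  # variance of the fitted part
    return explained / (explained + residual_variance)


# Hypothetical fitted linear predictors for four observations.
eta = [0.0, 1.0, 2.0, 3.0]
print(round(mckelvey_zavoina_r2(eta, "logit"), 3))
print(round(mckelvey_zavoina_r2(eta, "probit"), 3))
```

Because the residual variance is fixed by the link rather than estimated, the same linear predictor yields a different R-square under logit and probit.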
Bin Xie posted on Sunday, February 28, 2016 - 4:13 pm
I am running a path model with one categorical dependent variable, one categorical mediator, and one continuous predictor. I tried two models, with and without theta parameterization, following the examples in Chapter 3 (ex3.12 and ex3.13). Both models give the same model fit information: 0 for RMSEA and the Chi-square test of model fit, 1 for CFI/TLI and 0.001 for WRMR. Does this mean the model fit is really "perfect"? Are there alternative statistics I can use to evaluate how well the model fits? Thanks!
But because you have categorical M and Y, you first have to read the paper on our website:
Muthén, B. & Asparouhov, T. (2015). Causal effects in mediation modeling: An introduction with applications to latent variables. Structural Equation Modeling: A Multidisciplinary Journal, 22(1), 12-23. DOI:10.1080/10705511.2014.935843
Bin Xie posted on Sunday, February 28, 2016 - 6:10 pm
Thanks very much!
JIn Liu posted on Monday, March 14, 2016 - 4:23 pm
I have a general question regarding the fit index cut-off values when using WLSMV estimation.
Most of the cut-off guideline articles are ML based. Can these be used as references for WLSMV estimation? Probably not.
I think the suggested cut-offs should be similar.
Do you have any recommendations on the related references?
Yu, C.Y. (2002). Evaluating cutoff criteria of model fit indices for latent variable models with binary and continuous outcomes. Doctoral dissertation, University of California, Los Angeles.
JIn Liu posted on Monday, March 14, 2016 - 5:47 pm
Thanks for your reply. The dissertation only included information for the binary outcomes. I wonder if the results can be applied to ordered categorical data (items with 3 categories).
No such testing is available. You really need grouped data to do such testing, that is, many observations for each distinct combination of values on the covariates so that you can look at fit of a frequency table. Typically that is not available.