Anonymous posted on Wednesday, November 13, 2002 - 11:37 am
I have a few questions regarding analyzing the output provided in the Mplus parameter arrays.
1. When I ask Mplus to provide the model residual matrices using OUTPUT: RESIDUAL, among the matrices Mplus provides is a matrix of slope residuals. I noticed that this matrix contains slopes for relationships I don't explicitly include in the model -- for example, I regress Y1, Y2 and Y3 on X1, X2, and X3 and on a latent variable U1 which loads on I1, I2, and I3. Mplus provides the slope for Y1 on I1 even though the relationship is not explicitly modeled. How do I interpret this slope?
2. How can I ask Mplus to provide me the correlations between several latent variables in a model, i.e., U1, U2, and U3 ?
3. Mplus provides a residual matrix for model estimated intercepts and thresholds. How useful are these in analyzing model fit -- i.e., do you recommend including these residuals in plots of residuals and should these residuals be as much a matter of concern as the residuals for the VAR/COVAR matrix ?
1. With categorical outcomes, when covariates are included in the model, the sample statistics are no longer the correlations but the probit thresholds, regression coefficients, and residual correlations. These are estimated for all of the observed variables in the analysis just as correlations are estimated for all variables in the analysis if there are no covariates in the model.
2. You can say,
v1 WITH v2 v3; v2 WITH v3;
If these variables are exogenous, the covariances will be included in the model as the default.
3. The residuals of the intercepts/thresholds are usually not structured so are not important for model fit.
Anonymous posted on Monday, July 28, 2003 - 7:14 pm
I have a question. The model fit statistics for my model are not consistent enough for me to reach a conclusion. How should I interpret this result? The results are below:
********************************************* TESTS OF MODEL FIT
Chi-Square Test of Model Fit P-Value 0.0683
Chi-Square Test of Model Fit for the Baseline Model P-Value 0.0000
The CFI is a little low, indicating a rather poor fit, while the other fit indices are good. I wonder if your sample size is perhaps small, or your sample correlations low - that might account for this discrepancy.
Anonymous posted on Tuesday, August 12, 2003 - 12:36 pm
Can someone recommend a resource specifying how to translate standard model fit indices into evaluatory terms such as "unacceptable, acceptable, good, or excellent"? A simple chart is the kind of thing I want.
I think this is a good question for SEMNET. You may also want to look at the Hu and Bentler article that you can find under References at www.statmodel.com.
Anonymous posted on Sunday, September 07, 2003 - 6:16 pm
Is there currently a convenient way to do nested difference of Chi-Square tests for the Mplus WLSMV estimator ? Is this at all related to the procedure for doing similar tests for the Mplus MLM estimator ?
There is currently no way to do difference testing for WLSMV. We recommend using WLS for difference testing and WLSMV for the final model. Difference testing for WLSMV is likely to be available in Version 3.
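For the MLM side of the question, the scaled chi-square difference can be computed by hand. The sketch below is Python rather than Mplus, and it assumes the usual Satorra-Bentler scaled difference formula built from each model's scaled chi-square, scaling correction factor, and degrees of freedom. The function name and the input values are hypothetical, and the calculation applies to MLM/MLR-type scaled chi-squares only, not to WLSMV.

```python
def sb_scaled_difference(t0, c0, d0, t1, c1, d1):
    """Satorra-Bentler scaled chi-square difference.

    t0, c0, d0: scaled chi-square, scaling correction factor, and df
                of the nested (more restrictive) model.
    t1, c1, d1: the same quantities for the comparison (less restrictive) model.
    Returns the scaled difference statistic and its degrees of freedom.
    """
    # Scaling correction for the difference test
    cd = (d0 * c0 - d1 * c1) / (d0 - d1)
    # t * c recovers each model's uncorrected chi-square
    trd = (t0 * c0 - t1 * c1) / cd
    return trd, d0 - d1


# Hypothetical values, for illustration only
stat, df_diff = sb_scaled_difference(t0=100.0, c0=1.2, d0=10,
                                     t1=80.0, c1=1.1, d1=8)
print(round(stat, 3), df_diff)
```

The resulting statistic is referred to a chi-square distribution with d0 - d1 degrees of freedom.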
Anonymous posted on Tuesday, September 09, 2003 - 10:42 am
A few follow-up questions to the reply from Sept. 8 re: WLSMV difference tests:
1. Given that there's no way of performing nested tests of fit using WLSMV, isn't it possible that using nested tests of fit with WLS one could obtain a final model that would not have been obtained using WLSMV nested tests of fit (provided that they were available)? In other words, if one is testing a relatively complex model using the Mplus WLS estimator, wouldn't one be better off using the same WLS estimator in fitting the model and then in reporting / interpreting the final results?
2. Aside from the scaled Chi-Squared statistics, what specifically is "lost" in opting for the WLS over WLSMV ?
3. In estimating the coefficients and SEs for a rather complex SEM using Mplus, I notice a radical difference in the SEs obtained using WLS versus WLSMV (in some cases, the SEs almost double). Is this to be expected ?
4. Is Muthen's CFA for ordered categorical indicators still valid using the Mplus WLS as opposed to the WLSMV estimator ?
5. Is there a situation in which you would *not* recommend using the WLSMV estimator -- i.e., large number of desired parameters relative to the sample size; a model consisting of mostly categorical exogenous variables, etc ?
bmuthen posted on Tuesday, September 09, 2003 - 5:00 pm
1. Yes on the first part. For the final model it still seems worthwhile to make sure that the WLS results are good by comparing parameter estimates and SEs to those of WLSMV - and perhaps report the latter.
2. The WLS SEs may not be as good as those of WLSMV - and in some cases of smaller samples and more skewed items, the parameter estimates may not be as good.
3. Not unless you work with smaller samples and more skewed items.
4. Yes, but see 2.
5. I am not yet aware of any such situation.
Anonymous posted on Wednesday, September 10, 2003 - 11:37 am
At the risk of becoming a nuisance, I'd like to pose a final follow-up question (or two):
Regarding your response #1:
If the WLS and WLSMV parameter estimates do not agree in a fairly sizeable SEM (i.e., a SEM with many parameters relative to sample n), how would one be able to tell if the disagreement between WLS and WLSMV parameter estimates is due to the fact that WLS produced a model that does not fit by WLSMV standards (so to speak), versus the fact that the WLSMV estimates are superior to the WLS estimates for the model in question (due to sample size, skew of continuous variables in the model, etc)?
Also, purely out of curiosity: in reading the Mplus v2.0 manual technical appendices, it appears that the WLSMV estimator "uses the information available in the data twice" to produce the relevant W matrix, whereas WLS only relies on the data once (if that makes sense) to produce W. Thus, hypothetically, wouldn't one want to stay with the WLS estimator if one had less confidence in his / her sample (for whatever reason) ? I.e., wouldn't WLSMV compound errors of estimation if one was working with a sample he / she had some, but limited, confidence in ?
This is all very helpful. Thanks very much for your input.
I think it would be a good idea for you to send the data and the two outputs -- WLS and WLSMV -- to firstname.lastname@example.org. Then we can give you a more informed answer given that what you are seeing is most likely data dependent.
bmuthen posted on Thursday, September 11, 2003 - 5:52 pm
Regarding the first question, if a model doesn't fit well, the quality of WLSMV estimates is not guaranteed.
Regarding the second question, WLS and WLSMV both draw on "the full weight matrix, W" computed from the data. In WLS this matrix is used both in parameter estimation and in SE and Chi-square computations. In WLSMV, only the diagonal is used for parameter estimation and the full W only for SE and chi-square. So if you have limited confidence in your data - or in your W computed from the data - you are perhaps better off using WLSMV because the parameter estimates are not dependent on the whole W.
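The distinction between the full W and its diagonal can be illustrated with a toy calculation. The Python sketch below is not how Mplus is implemented. It assumes a generic weighted least squares discrepancy of the form r' W^-1 r, where r is the vector of differences between sample and model-implied statistics, and it uses a made-up 2 x 2 weight matrix. All names and numbers are hypothetical.

```python
def discrepancy_full(r, w):
    """r' W^-1 r using the full 2 x 2 weight matrix (WLS-style estimation)."""
    (a, b), (c, d) = w
    det = a * d - b * c
    inv = [[d / det, -b / det], [-c / det, a / det]]
    return sum(r[i] * inv[i][j] * r[j] for i in range(2) for j in range(2))


def discrepancy_diag(r, w):
    """Same quadratic form using only the diagonal of W (WLSMV-style estimation)."""
    return sum(r[i] ** 2 / w[i][i] for i in range(2))


# Hypothetical residual vector and weight matrix, for illustration only
r = [0.1, -0.2]
w = [[0.04, 0.01],
     [0.01, 0.09]]

print(round(discrepancy_full(r, w), 4))  # depends on the off-diagonal 0.01
print(round(discrepancy_diag(r, w), 4))  # unaffected by the off-diagonal
```

Changing the off-diagonal element of w alters the first value but not the second, which is the sense in which diagonal-weight parameter estimates do not depend on the whole W.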
Anonymous posted on Sunday, October 26, 2003 - 4:26 am
Hi, I use Mplus to test a model that was previously specified in a well-corroborated study. The dependent variable is ordered categorical (4 categories). I use the default estimator. The fit statistics are as follows:
CFI=0.963, TLI=0.960, RMSEA=0.041, WRMR=1.066
According to the User's Guide, WRMR should be under 0.9 for a good fit. Why do the first three indices indicate a good fit while WRMR shows a bad fit?
By the way, I have another model of which the dependent variable is binary, the fit statistics are as follows:
CFI=0.958, TLI=0.821, RMSEA=0.037, WRMR=0.586
Again, why is TLI particularly bad in this case, while CFI, RMSEA and WRMR seem good?
There have been few studies of the behavior of fit statistics for categorical outcomes. In your case, you have a combination of one categorical and several continuous outcomes. I know of no studies of the behavior of fit statistics in this situation. The following dissertation studied fit statistics for categorical outcomes. It can be downloaded from the homepage of our website.
Yu, C.Y. (2002). Evaluating cutoff criteria of model fit indices for latent variable models with binary and continuous outcomes. Doctoral dissertation, University of California, Los Angeles.
You may have to do a simulation study to see which fit statistic behaves best for your situation.
I noticed that Mplus produces matrices of residual values for first- and second-order moments when the RESIDUAL option is invoked on the OUTPUT line.
Is there a corresponding option that enables output or saving of individual record-level residuals for model equations? Or does the end user need to compute those manually from the raw input data and the parameter estimates generated by Mplus?
With many thanks for your reply,
bmuthen posted on Tuesday, December 07, 2004 - 4:48 pm
Saving individual-level residuals is currently not available.
kgreen posted on Wednesday, March 02, 2005 - 8:43 am
My question pertains to the model fit statistics for my SEM model with all continuous variables. The model has 2 latent variables with two indicators each and eight additional measured variables.
I'm confused by the RMSEA of 0. I assume it is because my DF are larger than my chi-square. Does that indicate adequate fit or a problem with the model?
Chi-Square Test of Model Fit = 34.887, DF = 36, P-Value = 0.5214
Chi-Square Test of Model Fit for the Baseline Model = 1036.027, DF = 65, P-Value = 0.0000
CFI = 1.000
TLI = 1.002
RMSEA = 0.000, 90 Percent C.I. (0.000-0.027), Probability RMSEA <= .05 = 1.000
SRMR Value = 0.030
Yes, RMSEA of zero comes about because your degrees of freedom are larger than your chi-square value. It indicates good fit.
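The reply can be checked numerically. The following is a minimal Python sketch, not Mplus code, using the commonly cited chi-square-based formulas for RMSEA, CFI, and TLI. The function names are illustrative, and the sample size (n = 300) is an assumed value because the post does not report it. Mplus may differ in small details, such as using N versus N - 1 in the RMSEA denominator.

```python
import math


def rmsea(chi2, df, n):
    # The numerator is truncated at zero, so RMSEA is 0 whenever chi2 <= df.
    return math.sqrt(max(chi2 - df, 0.0) / (df * n))


def cfi(chi2, df, chi2_base, df_base):
    num = max(chi2 - df, 0.0)
    den = max(chi2_base - df_base, chi2 - df, 0.0)
    return 1.0 if den == 0.0 else 1.0 - num / den


def tli(chi2, df, chi2_base, df_base):
    # TLI is not truncated, so it can exceed 1 when chi2 / df < 1.
    base_ratio = chi2_base / df_base
    return (base_ratio - chi2 / df) / (base_ratio - 1.0)


# Values from the posted output; n is an assumption (not given in the post)
chi2, df = 34.887, 36
chi2_base, df_base = 1036.027, 65
n = 300

print(rmsea(chi2, df, n))
print(cfi(chi2, df, chi2_base, df_base))
print(round(tli(chi2, df, chi2_base, df_base), 3))
```

Because the chi-square (34.887) is below its degrees of freedom (36), the truncated numerator is zero for any sample size, which gives RMSEA = 0 and CFI = 1, while the untruncated TLI lands slightly above 1.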
Anonymous posted on Wednesday, May 25, 2005 - 3:39 pm
Hi, I interpreted the p-value ( =0) associated with the chi-square of the fitted model as a sign of poor fit of my model. But my colleague here thinks that it has nothing to do with goodness of fit but rather is about the joint significance of the exogenous variables. Can you please help. Thanks!
bmuthen posted on Wednesday, May 25, 2005 - 3:41 pm
This p value concerns the probability of the model having generated the data, so you are right.
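To make the link between the chi-square value and its p-value concrete, here is a small Python sketch (standard library only) of the upper-tail chi-square probability. The function name is illustrative, and the series used is only suitable for moderate chi-square values. The first example uses the chi-square of 34.887 on 36 degrees of freedom from the output posted earlier in this thread; the second uses a hypothetical poorly fitting model.

```python
import math


def chi2_sf(x, df):
    """Upper-tail probability P(X > x) for a chi-square variable with df degrees of freedom.

    Uses the power series for the regularized lower incomplete gamma function.
    Intended for moderate x only; very large x would need a different algorithm.
    """
    if x <= 0:
        return 1.0
    a = df / 2.0
    z = x / 2.0
    term = 1.0 / a
    total = term
    n = 0
    while abs(term) > 1e-15 * abs(total):
        n += 1
        term *= z / (a + n)
        total += term
    log_prefix = a * math.log(z) - z - math.lgamma(a)
    lower = total * math.exp(log_prefix)
    return max(0.0, 1.0 - lower)


# Good fit: chi-square close to its degrees of freedom, large p-value
print(round(chi2_sf(34.887, 36), 4))

# Hypothetical poor fit: chi-square far above its degrees of freedom, tiny p-value
print(chi2_sf(120.0, 36))
```

A small p-value means the hypothesis that the model fits is rejected, so a p-value near zero signals poor fit rather than anything about the joint significance of the covariates.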
Anonymous posted on Friday, September 02, 2005 - 12:02 pm
Dear Professor Muthen,
Can I interpret the fit statistics (chi-square) in the MPLUS results as a test of null hypothesis, which states that the model fits the data?
For example, if p=0.4, can I say -- the null hypothesis stating that the SEM model fits the data cannot be rejected?
No, Mplus does not provide standardized residuals. I will put that on our list of things to add.
anonymous posted on Wednesday, November 02, 2005 - 2:42 am
I'm sorry, but I'm not that familiar with the names of the different matrices in SEM. Is it possible to obtain standardized residuals for the difference between the observed and predicted relationship among the indicators (= correlation residuals) in Mplus?
bmuthen posted on Wednesday, November 02, 2005 - 5:15 am
Not currently. That is what Linda said she put on her list of things to add in her message above. Note, however, that often the Modification indices are more informative than the residuals in terms of which model modifications are needed to fit the model better to the data (see Sorbom articles).
robertav posted on Monday, September 03, 2007 - 9:25 am
Dear Authors, I'm working with a SEM with both continuous and categorical (ordinal) indicators. I'm using the WLSMV estimator. Most fit indexes show a good fit; only the WRMR is much bigger than the suggested value of 0.9.
TESTS OF MODEL FIT
Chi-Square Test of Model Fit: Value 6592.727*, Degrees of Freedom 182**, P-Value 0.0000
Chi-Square Test of Model Fit for the Baseline Model: Value 49532.611, Degrees of Freedom 84, P-Value 0.0000
CFI/TLI: CFI 0.870, TLI 0.940
Number of Free Parameters: 115
RMSEA: Estimate 0.051
WRMR: Value 4.646
I saw the dissertation of Yu, C.Y. (2002), but it treats only the case of binary outcomes. What do you suggest? Should I ignore the WRMR index?
You can ignore WRMR but I don't think you can ignore CFI and TLI and chi-square depending on your sample size.
David Lin posted on Thursday, July 10, 2008 - 7:44 pm
I have done a CFA with continuous factor indicators, estimated by "MLM" (not ML). However, according to Hu and Bentler (1999), with the ML estimator the CFI and TLI should be bigger than .95 and RMSEA < .06. Is this also the case for MLM?
Secondly, my CFA model has S-B χ² = 363.38, df = 245 (p < .001), CFI = .91, TLI = .89, RMSEA = .043, SRMR = .06, and I am not sure whether it is OK. Could you give me some advice?
Thanks in advance.
-------I cited it below------
Hu, L.-T., & Bentler, P. M. (1999). Cutoff criteria for fit indexes in covariance structure analysis: Conventional criteria versus new alternatives. Structural Equation Modeling, 6(1), 1-55.
The results suggest that, for the "ML" method, a cutoff value close to .95 for TLI, ..., CFI...; a cutoff value close to .08 for SRMR; and a cutoff value close to .06 for RMSEA are needed before we can conclude that there is a relatively good fit between the hypothesized model and the observed data. (p. 1)
I would use those cutoffs for MLM also. It sounds like your model does not fit the data well. You could look at modification indices to see where the misfit lies. You could also go back to an EFA to see if the items behave as expected.
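As a quick illustration of applying those cutoffs, the Python sketch below compares the fit indices reported in the question with the Hu and Bentler (1999) values quoted above. The dictionary layout and function name are illustrative only, and the cutoffs are "close to" guidelines rather than strict thresholds.

```python
# Cutoffs as quoted from Hu and Bentler (1999) in the post above
CUTOFFS = {
    "CFI": (">=", 0.95),
    "TLI": (">=", 0.95),
    "RMSEA": ("<=", 0.06),
    "SRMR": ("<=", 0.08),
}


def check_fit(indices):
    """Return True/False per index depending on whether it meets its cutoff."""
    results = {}
    for name, value in indices.items():
        direction, cutoff = CUTOFFS[name]
        results[name] = value >= cutoff if direction == ">=" else value <= cutoff
    return results


# Fit indices reported in the question
reported = {"CFI": 0.91, "TLI": 0.89, "RMSEA": 0.043, "SRMR": 0.06}

for name, ok in check_fit(reported).items():
    print(name, reported[name], "meets cutoff" if ok else "below cutoff")
```

Here RMSEA and SRMR meet the quoted cutoffs while CFI and TLI do not, which is the mixed pattern behind the advice to inspect modification indices.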
I have been suggested to correlate indicators (using with) by looking at those with high Modification Indices, which could help to improve the model fit. I have been trying this and Chi-Square does in fact reduce, but the p value is still .0000. CFI and RMSEA also improve but they are still inadequate. My question is to what extent could I rely on this strategy to improve model fit? Or does this mean the model is not strong enough?
I would deeply appreciate any suggestions/insights.
This is the model I am using. For example: CD4 is theorized as part of anx but I included it in the other latent vars after looking at Modification Indices:
Model:
delinq by CD11 CD19 CD21;
aggr by CD1 CD2 CD4 CD5 CD6 CD7 CD8 CD9 CD12 CD18;
anx by CD3 CD4 CD13 CD14 CD15 CD16 CD17 CD22 CD23 CD25 CD20;
soma by CD4 CD24 CD26;
CD1 with CD4;
CD7 with CD8;
CD18 with CD19;
CD22 with CD23;
As a follow-up to my last post, I forgot to mention that: 1. All of my variables have been defined as categorical. 2. Correlations between CD1-CD4, CD7-CD8, CD18-CD19, and CD22-CD23 could arguably be supported by theory. For example, CD7 refers to "destroys personal belongings" and CD8 refers to "destroys others' belongings". The four pairings showed high Mod Indices.
You should have some justification like theory for adding residual correlations. If you need so many, perhaps your CFA model is not correct. I would suggest doing an EFA to see whether the factors you have specified in the CFA are in fact measured by the variables you are using for factor indicators.
Thanks a lot for your message; this is very helpful.
I had conducted an EFA in SPSS, using Oblimin and Principal Component Analysis, which gave me 6 factors. I had changed them into 4 by allocating some indicators into the factor in which they had the second highest loading. These seemed to make sense with the original scale (it is the Achenbach Child Behaviour Checklist).
After doing the EFA in MPLUS using WLSMV or WLSM instead, I need to reach 10 factors until the Chi-square gets significant at .01 and 11 factors for it to be significant at .05. (CFI=.997/.998; TLI=.997/.998; RMSEA=.012/.010). However, 10 or 11 factors seem too many for a CFA given that I have 26 indicators, don't they?
Additional info: N=2152, there are no outliers and no missing cases. The scale for responses is 1=yes,2=sometimes,3=no.
Dear M&M, I am using SEM to analyze a pathway in which I have a final binary outcome, from a matched case-control study. Because of this I am using stratification for the matching variable. So, I have only two categorical variables, in which one is the case-control status. I would like to know how I can get goodness of fit indexes from this analysis (ANALYSIS type: complex, Estimator=MLR). I also would like to know if it is possible to estimate the indirect effects p-values. Many thanks, Valéria
My equations are:
sta ON sex X1 X2 X3 X4 X5 X6 X7 X8;
X6 ON sex X1 X2 X7 X8;
X3 ON sex X1 X2 X4 X6 X9 X10;
X4 ON sex X1 X2 X11 X12;
X9 ON sex X1 X2 X3 X11;
X3 ON sex X1 X2 X6 X9 X11;
X5 ON sex X1 X2 X6 X13;
sta and sex are categorical, and I have different results with MLR and WLSMV estimation. I have 8 dependent variables and 7 independent. All observed and 2 binary. Do you think it is better if I use WLSMV? Many thanks,
I estimated several models based on my theoretical expectations. All models provide good fit, with one exception. That model looks like this: m1 BY f1 f2; m1 ON x1-x12; m2 ON x1-x12;
y1 ON m1 m2; y1 ON x2-x12;
y2 ON m1 m2; y2 ON x2-x12;
y1 with y2;
MODEL INDIRECT: y1 IND x1; y2 IND x1;
In this model, y1 and y2 are both categorical and N = 339, I used type=complex to control for clustering. The fit indices I get are: chi-square = 46.351 (45), p=.095, CFI = .844, TLI = .804, RMSEA =.031, WRMR =.762.
I am now unsure what to do. What could be the reasons for this poor fit? Because I want to test the theoretical expectations, I prefer not to change a lot. Can I, based on these fit indices, argue that the theoretical model simply does not fit the data? Or have I made mistakes, so that these fit indices are a result of wrong model specifications?
In the meantime, I also tried to estimate the models without indicating y1 and y2 as categorical. In that case, CFI=.986 and TLI = .973. The results remain highly similar.
y1 and y2 are both measured on a 3-point scale. The mean and sd of y1 is 2.11 (.045) and of y2 is 2.33 (.039), so they are skewed. The N = 339. Can I treat y1 and y2 with this N as continuous? Or do you advise me to keep them as categorical?
How can I obtain a single indicator which shows the discrepancies between sample observed and model implied variance/covariance matrix? (i.e. S-sigma or the chi-square value of goodness of fit) Is this the Chi-square Test of Model Fit in the output? If not, is there a way I could compute this?
Thank you very much for always responding promptly.
Please help me clarify further about the statistics used in the output. Now I understand that H1 loglikelihood value corresponds to the observed sample variance/covariance and H0 corresponds to variance/covariance implied by the proposed model. And the chi-square of the baseline model tests the null hypothesis that all regression coefficients in the proposed model are zero. Therefore, we want to fail to reject Chi-Square Test of Model Fit, but we want to reject Chi Square Test of Model Fit for the Baseline. Please correct if these are wrong.
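The relationship between the two loglikelihoods and the model chi-square can be written out directly. The Python sketch below assumes plain maximum likelihood, where the chi-square test of model fit is the likelihood-ratio statistic formed from the H0 and H1 loglikelihoods. With robust estimators such as MLR, the reported chi-square also involves a scaling correction that is not shown here. The function name and the loglikelihood values are hypothetical.

```python
def lr_chi_square(loglik_h0, loglik_h1):
    """Likelihood-ratio chi-square: 2 * (logL_H1 - logL_H0).

    H0 is the proposed (restricted) model and H1 the unrestricted model,
    so logL_H1 >= logL_H0 and the statistic is non-negative.
    """
    return 2.0 * (loglik_h1 - loglik_h0)


# Hypothetical loglikelihood values, for illustration only
print(lr_chi_square(loglik_h0=-1010.0, loglik_h1=-1000.0))
```

The closer the H0 loglikelihood is to the H1 loglikelihood, the smaller this single discrepancy value becomes.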
I fit a model using Mplus 4.2 with these analysis and output options:
ANALYSIS:
TYPE = MEANSTRUCTURE MISSING H1;
ESTIMATOR = WLSMV;
PARAMETERIZATION IS THETA;
ITERATIONS = 1000;
CONVERGENCE = 0.00005;
COVERAGE = 0.10;
OUTPUT: SAMPSTAT MODINDICES(0) STANDARDIZED H1SE;
And I got the following output:
TESTS OF MODEL FIT Chi-Square Test of Model Fit Value 0.000* Degrees of Freedom 0** P-Value 0.0000
* The chi-square value for MLM, MLMV, MLR, ULS, WLSM and WLSMV cannot be used for chi-square difference tests.
** The degrees of freedom for MLMV, ULS and WLSMV are estimated according to a formula given in the Mplus Tech...
What is the value and df of the Chi-square? How can I find the information about the Model Fit?
Yes as long as there are no messages to the contrary.
yan liu posted on Wednesday, March 07, 2012 - 11:42 am
Greetings! I did some analyses based on the multilevel SEM mediation modeling in Mplus (Preacher, Zyphur and Zhang, 2010). This new method uses latent variable decomposition approach, so it allows us to examine the mediation effect at both levels.
However, I found that the model fit is almost perfect (RMSEA, CFI, SRMR) and the chi-square test has zero df, so it is a just-identified model. I tried "delete 1-add 1" and found the model fit became poor. In this case, I am not sure if this is caused by the poor model fit of my final model or by the misspecified model with one path deleted.
Could you please give me some advice about how to provide evidence about model fit in this just-identified model?
My model is specified as follows (Predictor: teach, Mediator: PNS, Outcome: movat). Thanks!
MODEL:
%WITHIN%
PNS ON teach (aw);
movat ON PNS (bw);
movat ON teach;
%BETWEEN%
movat PNS teach;
PNS ON teach (ab);
movat ON PNS (bb);
movat ON teach;
MODEL CONSTRAINT:
NEW(indb indw);
indw = aw*bw;
indb = ab*bb;
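The MODEL CONSTRAINT statements above define each indirect effect as the product of two labelled paths. The Python sketch below shows that arithmetic together with a first-order (Sobel-type) standard error. It assumes the two path estimates are uncorrelated, which a full delta-method calculation would not require, and the path values and standard errors are hypothetical.

```python
import math


def indirect_effect(a, b):
    """Indirect effect as the product of the a path and the b path."""
    return a * b


def sobel_se(a, se_a, b, se_b):
    """First-order standard error of a*b, ignoring any covariance between a and b."""
    return math.sqrt(a ** 2 * se_b ** 2 + b ** 2 * se_a ** 2)


# Hypothetical within-level estimates for the paths labelled aw and bw
aw, se_aw = 0.5, 0.10
bw, se_bw = 0.4, 0.05

indw = indirect_effect(aw, bw)
print(round(indw, 4), round(sobel_se(aw, se_aw, bw, se_bw), 4))
```

The same calculation applies at the between level using the paths labelled ab and bb.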
Chi-Square Test of Model Fit for the Baseline Model Value 2443.761 Df 210 P-Value 0.0000
It was my understanding that the SRMR was typically just slightly larger than the RMSEA.
Could you explain why the SRMR is so poor while the RMSEA is not?
If it helps, I am running a single indicator SEM in which I have 7 waves of data.
At each wave, I have a predictor, a mediator, and an outcome variable defined by a total score for which I have specified the residual variance.
The model includes: a) correlations between the 3 latents at each wave, b) autoregressive paths between same latents at subsequent waves, c) paths from the predictor variable to the outcome variable 2 waves later, d) paths from the predictor variable to the mediator at each subsequent wave, and e) paths from the mediator to the outcome variable at each subsequent wave. I have constrained paths that I have not included to be zero (e.g., MODEL=NOCOVARIANCES).
Any help you could offer in understanding this would be much appreciated!
Linda, Thanks very much for replying to my student, Alison Alden, in the post above and in a subsequent email correspondence. There is another aspect of this analysis that puzzles us having to do with the matrix of residuals. Specifically, there are several places in the Residual Output in which values of 999.000 appear and we don't know what these signify. As one example, there is one in the matrix of the Standardized Residuals (z-scores) for the covariances/correlations/Resid Corr (but a 999.000 does NOT appear in the same place in the matrix of Residuals for Covariances/Correlations/Residual Correlations). So, my first question is what does a 999.00 mean in the Residual Output. My second question is whether there is a way to get Mplus to output the Residual Correlation Matrix as opposed to the Residual Covariance Matrix? Thanks!
thanks Linda. I have a few follow-up questions. First, how are the Standardized Residuals (z scores) computed? I think that would help me to understand why some of the values in a Standardized Residuals (z scores) matrix could be computed and others not. Second, for the standardized residuals that can't be computed, what impact do they have on SRMR? Finally, to get the Residual Correlation Matrix, we tried adding the statement Matrix = Correlation to the Analysis Command but we get an error message saying that Listwise deletion must be on but that would leave us with fewer subjects than parameters. We are wondering if we could compute the correlation matrix outside of Mplus and then use that as our data file rather than the raw data but we are not sure what to use as the value for our nobservations if we did so (the total number of subjects including those with some missing data?).
We now understand we cannot use the matrix=correlation option in the Analysis command. However, we assume from your earlier statement that residuals are given for the matrix that is analyzed, that if we use the Type=Correlation option in the File command (using the correlation matrix as the data file rather than the raw data that we typically use) then we can use the Residual option, is that correct? If so, we believe that we also need to use the NOBSERVATIONS option and we are not sure what to put for that. To elaborate, if we were using raw data, listwise deletion would leave us with fewer subjects than parameters so we are hesitant to compute the correlation matrix using listwise deletion and the n of subjects with complete data as our value of NOBSERVATIONS. On the other hand, it doesn't seem that Mplus has an option for putting in the number of observations each correlation is based on, so if we don't use listwise deletion we aren't sure what value to enter for the NOBSERVATIONS. Or would we just enter the total number of subjects including those who have some missing data (which doesn't seem quite right to us)? Any thoughts/guidance will be most appreciated.
P.S. is there any consideration of adding an option to Mplus such that one can request residuals in a correlation metric even if one has analyzed the covariance matrix? It seems to us that it would be both useful (as in the present case, helping us to figure out why our model is so bad/where the greatest strain is coming from in a metric we can easily understand) and feasible (given that you are already computing all of the individual elements in that matrix in order to calculate SRMR). Thanks!
Standardized residuals give much more information than residuals in the correlation metric. An even better tool for examining model misfit is modification indices. You can ask for these in the OUTPUT command by specifying:
There are several issues here. Your model should not have more free parameters than the number of observations in your data set. Chi-square cannot be trusted at your sample size. You have little power at your sample size.
Jihyun Yoon posted on Friday, December 27, 2013 - 12:24 am
Do you mean that df should not exceed the number of observations?
So, in this case, CFI and RMSEA can not be trusted as well?
No, the degrees of freedom is the number of free parameters in the H1 model minus the number of free parameters in the H0 model. The number of free parameters in the H0 model should not exceed the number of observations in your data set.
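The counting rule in this reply can be sketched in a few lines of Python. The sketch assumes continuous outcomes with an unrestricted H1 model made up of all variances, covariances, and (optionally) means; with categorical outcomes the H1 model consists of other sample statistics, so the count differs. The function names and the example numbers are hypothetical.

```python
def h1_free_parameters(p, mean_structure=True):
    """Free parameters in an unrestricted H1 model for p continuous variables."""
    n_cov = p * (p + 1) // 2          # variances and covariances
    return n_cov + (p if mean_structure else 0)


def model_df(h1_params, h0_params):
    """Degrees of freedom: H1 free parameters minus H0 free parameters."""
    return h1_params - h0_params


def exceeds_sample_size(h0_params, n_obs):
    """True if the H0 model has more free parameters than observations."""
    return h0_params > n_obs


# Hypothetical example: 10 continuous variables, 32 free H0 parameters, n = 44
h1 = h1_free_parameters(10)
print(h1, model_df(h1, 32), exceeds_sample_size(32, 44))
```

The check on sample size concerns the number of free H0 parameters, not the degrees of freedom.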
No fit statistics based on chi-square can be trusted. CFI and RMSEA are based on chi-square.
Soyoung Kim posted on Monday, December 30, 2013 - 4:44 am
Dear Dr. Muthen,
I analyzed a model with n = 44 and number of variables =15 by Mplus 6.11. And I had the results of df=198, chi-square of the model = 146.5, CFI =1, and RMSEA=0
I wonder how 'df=198' was calculated in this model.
Hello, I am running a latent difference score mediation model using syntax provided by Preacher and colleagues. My latent variables are made up of categorical indicators (they were originally continuous, but I am treating them as categorical because several of them were skewed), and I am generating bootstrapped confidence intervals for the indirect effects. While I was able to obtain fit statistics when treating my indicators as continuous (and using the ML estimator), fit statistics are no longer produced when using the WLSMV estimator (and ordinal data). I am not receiving any sort of error message. I am a beginner to SEM, so this may be a dumb question, but any idea why? Thanks!