
daniel posted on Tuesday, February 18, 2003  10:31 am



How do I carry out a chi-square difference test if the indicators in a model are all or partly categorical?


Use the WLS estimator for difference testing. Use the WLSMV estimator for the final model. 

Daniel posted on Monday, March 03, 2003  12:12 pm



For a theoretical model with all categorical indicators, the fit indices are CFI = 0.996, TLI = 0.996, RMSEA = 0.046, and SRMR = 0.070, which are acceptable. But the chi-square difference test between the theoretical model and the measurement model is significant. In this case, does the chi-square difference test really matter? Is it a crucial index among all fit indices, such that the model being tested must meet it first of all? Any suggestions?

bmuthen posted on Monday, March 03, 2003  2:41 pm



Both of your models fit the data reasonably well. However, if the two models differ in important ways substantively, the result of the chi-square difference testing is important. Difference testing is a powerful way to distinguish between models. Note, however, that you should make sure that the two models are nested.

Daniel posted on Friday, March 07, 2003  6:23 am



These two models are indeed nested. The chi-square difference test is not significant only after allowing residual covariances between the error terms of some indicators that belong to the same latent variable. Can I modify the model in this way? What are the concerns about this practice? Thank you.

Daniel posted on Friday, March 07, 2003  11:41 am



Introducing residual covariances between the error terms of some indicators belonging to the same latent variable can greatly improve the chi-square difference test. However, the residual covariances of these error terms are not significant. As I understand it, in the output of a final model that one deems acceptable, all factor loadings and path coefficients should be significantly different from zero. However, we are allowed to keep residual covariances of error terms in the model if they offer better fit, whether or not they are significant. Am I right?

bmuthen posted on Monday, March 10, 2003  7:35 am



Typically, if a residual covariance gives an improvement in chi-square, it is also significant. A good practice is to include in your model only residual covariances that are both significant and make substantive sense.

Daniel posted on Sunday, March 16, 2003  10:14 am



According to your suggestion, I should use the WLS estimator for chi-square difference testing and the WLSMV estimator for the final model. More generally, however, there are also many fit indices that are linked with the chi-square value or the degrees of freedom, e.g., NFI (normed fit index), PR (parsimony ratio), PNFI (parsimonious normed fit index), RNFI (relative normed fit index), RPR (relative parsimony ratio), and RPFI (relative parsimonious fit index). Should I use all chi-square values and degrees of freedom from the WLSMV estimate or all of those from the WLS estimate? If I use those from WLSMV, a problem occurs when calculating PR: PR = dfj/df0 > 1, where dfj is the degrees of freedom of the model being tested and df0 is the degrees of freedom of the baseline model. Theoretically PR should be smaller than one, but the actual calculation shows it to be greater than one. If we should use the chi-square values from the WLS estimate, in what circumstances can the values from WLSMV be used?


For difference testing, use WLS. For all else, I would use WLSMV. RMSEA, CFI, and TLI have been investigated for WLSMV by one of Bengt's students. They seem to work well. The other measures have not been studied as far as I know. I would definitely not use PR with WLSMV because of the fact that the degrees of freedom are not calculated in the regular way and do not have the same meaning. I'm sure that PR was developed for degrees of freedom calculated in the regular way. 

Anonymous posted on Thursday, October 07, 2004  8:22 am



Sorry, but what does chi-square difference test mean? WLSMV is not accepted in this test. Does the difference test mean that, when you look (perhaps with WLS) at the quality of your model, you compute the CFI as [(chi-square of 0-model - df of 0-model) - (chi-square of hyp. model - df of hyp. model)] / (chi-square of 0-model - df of 0-model)? If the values of the hyp. model are lower than those of the 0-model, this means a high CFI. Is it not allowed to use WLSMV for a difference test because the df are computed in another way (User's Guide: 358)? But if this is the case, what does the CFI mean when WLSMV is used? Or does chi-square difference test mean something completely different from what I wrote?


Mplus Version 3 now has a procedure for doing chi-square difference testing using WLSMV. Yes, the problem was that the degrees of freedom are not computed in the regular way.
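For readers following the CFI expression in the question above, here is a minimal Python sketch of the usual textbook CFI computation from a model chi-square and a baseline (0-model) chi-square. The function name `cfi` and the input numbers are made up for illustration and do not come from this thread. With WLSMV, the chi-square and df are adjusted, so this arithmetic is only meant to show the structure of the index, not to reproduce Mplus output.

```python
def cfi(chi2_model, df_model, chi2_baseline, df_baseline):
    """Comparative fit index from model and baseline chi-square values.

    Uses the common textbook form:
    1 - max(chi2_m - df_m, 0) / max(chi2_b - df_b, chi2_m - df_m, 0).
    """
    d_model = max(chi2_model - df_model, 0.0)
    d_base = max(chi2_baseline - df_baseline, d_model)
    if d_base == 0.0:
        # Neither model shows misfit beyond its df; report perfect fit.
        return 1.0
    return 1.0 - d_model / d_base


# Hypothetical values, chosen only for illustration.
example = cfi(chi2_model=150.0, df_model=100, chi2_baseline=2100.0, df_baseline=120)
print(round(example, 4))
```

The smaller the model's excess chi-square (chi-square minus df) relative to that of the baseline model, the closer the index is to 1.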

Anonymous posted on Wednesday, October 13, 2004  12:38 am



So, does the difference test mean the comparison between the 0-model and the hyp. model, and is the Comparative Fit Index, used to check the quality of your model, such a difference test?

bmuthen posted on Thursday, October 14, 2004  11:40 am



The difference test refers to a comparison between H0 and H1, where H0 is nested within H1. Here, H0 is the model you are focusing on. H1 can be any less restricted model. With CFI, H1 is the completely unrestricted model. 

Anonymous posted on Monday, December 06, 2004  10:44 am



I want to use the DIFFTEST command in Mplus 3 to perform a difference test of two nested models (with categorical data). Can you point me to the reference that describes the analytics of the procedure? (I will need to cite it.) Has any simulation work been done that I can cite as well? Thank you.

bmuthen posted on Monday, December 06, 2004  11:38 am



This method is equivalent to the difference testing that was previously implemented in Mplus for H1/H0 model testing (see Muthen, DuToit, Spisic). We have conducted many simulations but there is no reference yet. Simulations of this kind are unfortunately not very easy to do currently in Mplus. 


Dear Support Team, I am using the chi² difference test under WLSMV. Can you explain briefly why the two-step procedure is necessary? Are there any references about this? My second question concerns invariance testing in MGA with categorical outcomes under WLSMV. For correct interpretation of group means, invariance of intercepts and loadings must hold, right? But what about the thresholds? Can the thresholds be free while the loadings are constant over groups? In other words, is it possible to test invariance of thresholds and invariance of loadings separately? Millsap & Tein (2004) talk about a minimum number of thresholds being held constant for identification, while others let all thresholds vary over groups. What do you think is the best way of invariance testing? And in terms of identification, is it possible to set all thresholds free? Thanks for helping.


Technical appendix 4 discusses estimators in Mplus. You can request the Muthen, DuToit, and Spisic paper from bmuthen@ucla.edu. We recommend invariance testing where thresholds and intercepts are held equal or not in tandem. One reason is that item characteristic curves are based on both of these parameters. Not all thresholds can be freed. I have added a section to Chapter 13 in the Version Mplus User's Guide that describes the steps we recommend for testing measurement invariance for continuous and for categorical outcomes.


Hello Linda and Bengt, When I report a chi-square for a CFA (or other kinds of models), I routinely use the little correction of "chi-square value divided by its df" (Bollen, 1989). Although purists think it's not the best practice, to me it is a little better than the "always" significant chi-square. My question is: can this correction be applied to the DIFFTEST in Mplus? Because it is a difference test, might this militate against such a correction? I'm not so sure how it is calculated, so I wonder what you guys think about this practice. Thanks. Julien


I was also fond of the chi-square divided by df, but it seems like this is essentially what the (better motivated) RMSEA does: RMSEA = sqrt(CS/(n*df)), where CS is the chi-square, n is the sample size, and df is the degrees of freedom. No, you can't apply CS/df to DIFFTEST because it gives results in WLSMV style, where only the p-value is relevant.


Let me backtrack on the last part of my answer. In DIFFTEST, the df printed is not the difference between the number of parameters in the two models compared. Nevertheless, the value printed is a chi-square for the df printed (so the p-value is right as a function of those two). This would suggest that the CS/df descriptive approach you mention is equally motivated here, although I am not saying that I particularly endorse it.
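To make the two descriptive quantities in this exchange concrete, here is a small Python sketch. It assumes the widely used textbook RMSEA point estimate, which subtracts df from the chi-square before dividing; the shorthand in the post above omits that subtraction. Some programs use n - 1 instead of n or add a multiple-group adjustment, so treat this as an illustration rather than a reproduction of Mplus output. The function names and input values are hypothetical.

```python
import math


def chi2_per_df(chi2, df):
    """Descriptive ratio of the chi-square value to its degrees of freedom."""
    return chi2 / df


def rmsea(chi2, df, n):
    """RMSEA point estimate in the common form sqrt(max(chi2 - df, 0) / (df * n))."""
    return math.sqrt(max(chi2 - df, 0.0) / (df * n))


# Hypothetical values, not taken from any model in this thread.
print(chi2_per_df(200.0, 100))
print(round(rmsea(200.0, 100, 500), 4))
```

Both quantities rescale the chi-square by df; RMSEA additionally takes the sample size into account, which is the sense in which it is described above as better motivated.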


Greetings, When I tried to use the DIFFTEST procedures for WLSMV to compare 1- and 2-factor EFA models, I got this error: "THE CHI-SQUARE DIFFERENCE TEST COULD NOT BE COMPUTED BECAUSE THE FILE CONTAINING INFORMATION ABOUT THE H1 MODEL HAS INSUFFICIENT DATA." Should the WLSMV diff testing method work for EFA? Regards, CW


The DIFFTEST option cannot be used with TYPE=EFA; You would need to do an EFA in a CFA framework to use DIFFTEST. 

kc blackwell posted on Friday, September 01, 2006  11:35 am



Hello, When comparing two nested models (as in a series of invariance analyses) using the DIFFTEST option with WLSMV, is it possible that the chi-square value for the more restrictive model (i.e., with loadings constrained across groups) will be smaller than the chi-square value for the less restrictive model (i.e., a baseline model without these loading constraints), or does this indicate an error on my part? Thank you for your help.


The chi-square values for WLSMV cannot be used directly for difference testing. This is why we have the DIFFTEST option. These values do not follow the normal expectations. I would not be concerned.


Linda and Bengt, I have been doing some multiple group analyses using WLSMV and have run into some instances where, according to the Satorra-Bentler scaled (mean-adjusted) chi-square, the CFI, the TLI, and the RMSEA, the constrained model fits better than the unconstrained model. In addition, the estimated df for the constrained model is less than the estimate for the unconstrained model. The diff test runs (that is, it accepts that the models are nested) and shows a nonsignificant change in chi-square. It is also the case that the estimates from the unconstrained models (i.e., thresholds and factor loadings) are very close to one another across groups. Here are the results for a CFA model with 4 groups, 3 latent variables and 2 measured variables.
Unconstrained model: SB scaled chi-square = 135.08, estimate of df = 72, CFI = .98, TLI = .99, RMSEA = .065.
Constrained model: SB scaled chi-square = 89.25, estimate of df = 57, CFI = .99, TLI = 1.00, RMSEA = .052.
Diff test: diff in chi-square = 33.40, change in df = 25.
My apologies for posting a question that is similar to questions you have answered before, but after reading through prior posts I am still left wondering if I have made some sort of mistake. Thank you for your help.


To clarify, the Satorra-Bentler scaled (mean-adjusted) chi-square is part of the MLM estimator, not WLSMV. If you are using WLSMV, the chi-square values cannot be used for difference testing without using the DIFFTEST option, which it appears you are using. With WLSMV, you don't expect the chi-square value and the degrees of freedom to behave as with ML, for example.


Thank you for clearing up my confusion about the Satorra-Bentler scaled chi-square. Where can I find an explanation for the calculation of chi-square and degrees of freedom when using the WLSMV estimator? The fact that the df do not seem to correspond in a straightforward way with the number of measured variables and the number of estimated associations among variables is confusing to me.


See Technical Appendix 4 which is on the website. 


I am helping prepare a manuscript that reports on some of the analyses described in the post above using WLSMV to accommodate categorical variables in an SEM. In the methods section of the paper, we say that we used WLSMV estimates of chi-square and the derivatives difference test for change in model fit with nested models. Given that some of our reported results will still seem odd to readers familiar with ML estimates, I am considering adding the following footnote: "The degrees of freedom for the model fit chi-square test are themselves mean- and variance-adjusted when using the WLSMV estimator and do not correspond in a straightforward way with the numbers of measured variables and estimated parameters. This leads to some values that may appear counterintuitive (e.g., nested models where the estimated degrees of freedom for the constrained model are the same as or fewer than for the unconstrained model). Also, the difference in model fit for nested models that is based on the derivatives difference test does not correspond directly with the differences in estimated chi-square and degrees of freedom between the constrained and unconstrained models." Is this accurate? Even after reading the technical appendix and the Muthen, du Toit, and Spisic paper, I am still a little fuzzy on what is going on with the DIFFTEST and the chi-square and df estimates.


This seems reasonable. You could also say that the chi-square is adjusted to obtain an accurate p-value and it is the p-values that are relevant in this situation.


I would like to use the DIFFTEST to examine the chi-square difference for two models with all categorical data and using WLSMV estimation. The H0 model has five first-order correlated factors and the H1 model is the more restricted model with a second-order factor (see below). Technically these models are nested, but the software keeps giving me the message that "THE CHI-SQUARE DIFFERENCE TEST COULD NOT BE COMPUTED BECAUSE THE H0 MODEL IS NOT NESTED IN THE H1 MODEL". Is there a solution for calculating the chi-square difference between these two models when using WLSMV estimation? Here are the two models I would like to compare:

H1 MODEL:
ANALYSIS: ESTIMATOR = WLSMV;
MODEL: F1 BY y2 y5 y6 y7;
       F2 BY y8 y11 y15 y16;
       F3 BY y27 y32;
       F4 BY y18 y21 y23 y24;
       F5 BY y35 y36 y37 y38;
       F BY family friends living school self;
SAVEDATA: DIFFTEST IS test.dat;

H0 MODEL:
ANALYSIS: ESTIMATOR = WLSMV;
          DIFFTEST IS test.dat;
MODEL: F1 BY y2 y5 y6 y7;
       F2 BY y8 y11 y15 y16;
       F3 BY y27 y32;
       F4 BY y18 y21 y23 y24;
       F5 BY y35 y36 y37 y38;


My apologies, I mislabeled the second-order factor structure in my previous posting. Here are the two models that I would like to compare using the DIFFTEST:

H1 MODEL:
ANALYSIS: ESTIMATOR = WLSMV;
MODEL: F1 BY y2 y5 y6 y7;
       F2 BY y8 y11 y15 y16;
       F3 BY y27 y32;
       F4 BY y18 y21 y23 y24;
       F5 BY y35 y36 y37 y38;
       F BY F1 F2 F3 F4 F5;
SAVEDATA: DIFFTEST IS test.dat;

H0 MODEL:
ANALYSIS: ESTIMATOR = WLSMV;
          DIFFTEST IS test.dat;
MODEL: F1 BY y2 y5 y6 y7;
       F2 BY y8 y11 y15 y16;
       F3 BY y27 y32;
       F4 BY y18 y21 y23 y24;
       F5 BY y35 y36 y37 y38;


Did you run the model with the second-order factor first? It is the more restrictive model because it imposes constraints on psi.


Linda, thanks for pointing that out. I had indeed incorrectly saved the derivatives of the more restrictive model first. The difftest works fine when based on the derivatives of the less restrictive model. Thanks again. 


I am performing a DIFFTEST to assess the extent to which there are gender differences in individual regression parameters in a path model. My H1 model is a saturated model. In my H0 model, I have constrained 1 parameter to be equal across groups by including "y1 ON x1 (1);" in the model command. My output reads:

Chi-square test for difference testing
  Value                 .566
  Degrees of freedom    1**
  P-value               .4519

Does the p-value of .4519 indicate that this individual regression parameter is not significantly different by gender? Thank you for your time.


The interpretation of difference testing is described in Chapter 13 under Model Difference Testing. A nonsignificant result indicates that constraining the parameter to be equal in both groups does not significantly worsen model fit. This indicates that the parameter does not differ between the groups.
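As a quick check of the output quoted above, the p-value can be recomputed from the printed value and df. This is a minimal Python sketch that relies on the fact that, for 1 degree of freedom, the upper-tail chi-square probability equals erfc(sqrt(x/2)). The function name is made up for illustration, and the sketch does not reproduce how DIFFTEST arrives at the value and df in the first place.

```python
import math


def chi2_upper_tail_df1(x):
    """Upper-tail probability of a chi-square variable with 1 degree of freedom."""
    return math.erfc(math.sqrt(x / 2.0))


# Value from the DIFFTEST output quoted above (df = 1).
p = chi2_upper_tail_df1(0.566)
print(round(p, 4))
```

The result should agree closely with the .4519 printed in the output, which is well above conventional significance levels such as .05.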


I have a path model with 6 continuous endogenous variables being estimated using ML. I am interested in testing the extent to which there is a significant improvement in model fit with the addition of a single dichotomous mediating variable. I run into a problem because my nested model is estimated using ML while my comparison model must be estimated using WLSMV because of the categorical mediating variable. Which chi-square difference testing procedure would be appropriate in this situation? On a related note, is it possible to test the significance of the change in R-square of the outcome variables individually rather than testing the change in the overall fit of the model? Thank you for your help, Katie


A necessary condition for models to be nested is that they have the same set of observed variables. You should include the same set of observed variables in both models and fix the regression of the distal to zero in one and allow it to be estimated in the other. With WLSMV, note that the DIFFTEST option is needed for chi-square difference testing. See Example 12.12 in the Mplus User's Guide. R-square is not a test of model fit, so I would not use it in that way.


Ahhhh yes. That makes sense. I hadn't thought of that. Thank you very much for your help. Katie 


Greetings, I have a data set of dichotomous variables, have been analysing them using WLSMV, and have compared the goodness of fit of a number of different models relative to a baseline model using the chi-square difference test. Following your comments to Julien of March 20-21, 2006, I have been using the DIFFTEST chi-square divided by its degrees of freedom as the comparison value when comparing the fit of models. You suggested in your response that the df for this measure makes this division valid. (It also makes a difference to my results whether or not the chi-square value is reduced in this way.) I have a couple of questions:
- Could you please confirm that the difference test chi-square values can be validly compared when divided by the df? A reviewer of our paper has strongly criticised our using this measure.
- Is it possible to derive an AIC or similar value to allow us to compare the relative fit of pairs of non-nested models?
- Is there any way in which I could obtain a 95% confidence interval for the RMSEA when doing these analyses?
Thanks, Ruth


I think the reviewer is criticizing using a chi-square divided by the degrees of freedom in general. I would agree with this criticism. I don't advocate this practice. No. AIC is for maximum likelihood estimators. No, this has not yet been developed for weighted least squares.


Thanks, Linda. Following on from your response, when I use the difference test to compare the goodness of fit of two non-nested models relative to the baseline model, the DIFFTEST chi-square values are associated with different numbers of degrees of freedom. Can these chi-square values (both highly significant) be compared without any adjustment? That's where I had thought the division by degrees of freedom was appropriate. Is there any other way I can make a quantitative statement about their relative goodness of fit? (The variables in the data set are all dichotomous.) Ruth


If you are using the DIFFTEST option, you must be comparing two nested models and using WLSMV. With WLSMV, only the p-value is meaningful. The chi-square and degrees of freedom are adjusted to obtain a correct p-value. This is why you need to use the DIFFTEST option for chi-square difference testing. There are more fit measures than chi-square to consider. I would look at those in addition to chi-square. If you have a very large sample, it may be that chi-square is sensitive to model misfit.


I am doing an MGFA using WLSM (delta) to test invariance across 6 groups. Responses are all categorical. I am doing a chi-square difference test comparing a baseline model (thresholds, factor loadings, factor variance and covariance freed across groups) with a restricted model (parameters constrained equal across groups). Number of groups = 6; number of observations: Group SE = 321, Group GN = 166, Group PN = 196, Group PT = 161, Group DM = 160, Group PPP = 251; number of dependent variables = 36; number of independent variables = 0; number of continuous latent variables = 3. The chi-square for the baseline model is 6478.376 (3561) and the chi-square for the restricted model is 6465.057 (3726). Is it possible to have a baseline model with a higher chi-square and a lower number of df than the restricted model? It seems strange to have a negative chi-square and positive df for a chi-square difference test.


To continue from the above message: I am having the same issue with only 2 groups (same data set; I thought it might be the complexity of the model). My chi-square for the baseline model is greater than the chi-square for the restricted model, but the df for the baseline model is less than the df for the restricted model. I don't think it is my syntax or coding, because I use the exact same syntax on a different data set and I don't have the issue. Any ideas would be greatly appreciated.


Please send your input, data, output, and license number to support@statmodel.com. 


Hi, I'm fitting a CFA model with binary indicators and I test nested models using the DIFFTEST option. I have two questions: 1. Mplus prints an overall fit of the model including a chi-square. In my case, chi(128) = 155.389, p = .05. Can I interpret this overall fit as I would in the case of continuous data [even though I can't use it for chi-diff testing]? I mean: for continuous data, I would conclude that, given that my sample size is large, this chi-square indicates a well-fitting model. (Note that CFI = .999, RMSEA = .013; this indicates good fit as well, but I particularly want to know whether the overall chi-square fit can be interpreted as usual.) 2. I struggle with how to report the results of the chi-square difference tests that I get when I use the DIFFTEST option, because the degrees of freedom are not equal to the number of parameters that I constrain... Thanks in advance, Sophie


In both cases with WLSMV, the only value that is interpretable is the p-value.


Hi, I'm fitting genetic models with binary data: I use WLSMV and the MODEL CONSTRAINT option. I can't interpret the chi-square or use it for model comparison, but I also can't use the DIFFTEST option because it is incompatible with the MODEL CONSTRAINT option. How should I proceed? Best, Sophie


Try WLSM. 


If I use WLSM, I still get the warning: "*** ERROR in Analysis command DIFFTEST is not available in conjuction with nonlinear constraints through the use of MODEL CONSTRAINT. Request for DIFFTEST is ignored." and no further output...


You don't use DIFFTEST with WLSM. You use the scaling correction factor like with MLR or MLM. 


aha! that was silly. thanks so much I always appreciate your swift reactions to questions 

Wei Chun posted on Sunday, December 21, 2008  11:16 pm



How do we obtain the Satorra-Bentler chi-square statistic and its p-value in Mplus? With thanks.


This is the MLM estimator in Mplus. 

Wei Chun posted on Monday, December 22, 2008  2:41 pm



I am testing a structural model (n = 1965) using the WLSMV estimator. The other fit indices are fine, but the chi-square p-value is significant. Do you think that the model should be rejected? Many thanks.


The sample size is not that large for categorical outcomes. I would need to see the whole picture to comment further. If you send the output and your license number to support@statmodel.com, I can take a look at it. 


Hi Linda and/or Bengt, I have run into a problem conducting difference tests for models including categorical indicators. We have compared a number of nested models successfully (that use identical measurement models but differ in terms of the paths included in the structural model) but with one comparison in particular Mplus tells us that the models are not nested and we are certain that they are (the one simply frees up 4 paths in the structural model to be estimated that aren't included in the comparison model). Please let me know if you need me to send you the outputs from each of the two models, and the data file and the derivatives. Thanks! Rick 


Please send your full outputs and license number to support@statmodel.com. 


will do, thanks Linda! 


which alpha does DIFFTEST use? 0.05, 0.01 or something else? 


DIFFTEST gives the p-value.


I'm using the DELTA parameterization with WLSMV, and I need to change the default saturated model to take into account some restriction (twin 1 is identical to twin 2). I guess I should estimate a modified saturated model, and use DIFFTEST with this model? However, I'm unsure how the saturated model is estimated in Mplus, especially while using the Delta parametrization and the multigroup option. Is there a reference somewhere? Best, Guillaume 


You should run a model where what you want the H1 model to be is the H0 model, using DIFFTEST in the SAVEDATA command. Then run the H0 model using DIFFTEST in the ANALYSIS command. The following paper may discuss the saturated model in Mplus: Prescott, C.A. (2004). Using the Mplus computer program to estimate models for continuous and categorical data from twins. Behavior Genetics, 34, 17-40.

Jason Bond posted on Tuesday, December 08, 2009  12:42 pm



I'm attempting to assess whether the addition of a single dichotomous indicator of a factor to a number of other dichotomous indicators improves model fit. Alternatively, I guess this question could be formulated as "is such a question answerable using a typical chi-squared difference test?" If so, then from the first post above, your suggestion is to use WLS. But you've also mentioned in a post above that, when using chi-squared difference testing (specifically when using the DIFFTEST option, which only applies to the WLSMV estimator), the same set of variables should be in the model. So would it be correct in the null model to use: MODEL: f BY toleranc-socintpb* craving@0; f@1; when considering the single additional craving variable, or to simply exclude it from consideration (i.e., not include it in the USEVAR list, the CATEGORICAL list, or the MODEL statement)? My concern is that, when I do the latter, the Ha model chi-squared produced is larger than the H0 model chi-squared and has more degrees of freedom (due to all of the covariances excluded, I imagine, plus the excluded path), whereas traditional chi-squared difference testing with nested models exhibits the reverse. But in doing the former, should I also fix all other model parameters associated with craving to zero as well (i.e., Tau)?


I would take the former approach but still include parameters for the mean (or threshold) and variance (if any) for the variable.

Jason Bond posted on Wednesday, December 09, 2009  10:45 am



So when you refer to variance for the variable, I'm assuming that you are referring to the additional dichotomous manifest variable (i.e., craving) and not the factor variance? However, in looking through the output and TECH1 output parameters, I don't see anywhere variances for the dependent dichotomous indicators (i.e., the THETA matrix). Is this something that can be allowed? My goal is to assess the contribution of an additional variable above and beyond the other variables in the model. As it is correlated with the other variables already in the model, the chi-squared produced by fixing its factor loading to 0 is massive. However, the question of whether the additional variable contributes anything to the model fit above and beyond the other variables doesn't seem to quite be answered by this approach. Others considering this question have performed analyses with and without the additional variable included and compared the usual fit measures (BIC, RMSEA, information curves, etc.) across the two models. Which do you think might be more relevant? Thanks much again, Bengt.


To answer your questions in turn: That's right. In this case there is no variance; that's what I meant by "if any". It depends on the question (see below). I think the problem is formulated in an awkward way. I don't think one should think about whether or not adding an indicator improves model fit. Instead, think about whether or not it adds important information for the factor (assuming the model still fits). That question could be answered by information functions. It could also be answered by the reduction of SEs in structural relations that the factor is involved in.


Hi, I read the comments regarding calculation of degrees of freedom for chi-square difference testing using WLSMV. I understand that the chi-square and degrees of freedom are adjusted to obtain a correct p-value (and that they are not what one would perhaps expect to see). However, what is the formula for calculating the degrees of freedom for chi-square difference testing using WLSMV? (The Technical Appendices do not state it, and neither do Satorra and Bentler [1999], unless I am missing something; nor do they represent the difference in the degrees of freedom of the two nested models.) I understand that the degrees of freedom for the chi-square for model fit are calculated according to Appendix 4, (110). Thank you so much. Best, Serban


See the DIFFTEST technical appendix on the website. 


Dear Linda, When you stated on Monday, June 19, 2006 that "The DIFFTEST option cannot be used with TYPE=EFA; You would need to do an EFA in a CFA framework to use DIFFTEST," does that mean that I should run the resulting EFA models in CFA in order to get the correct MLR scaling factor to use in the Satorra-Bentler modification? Or can I use the MLR scaling factor that is automatically displayed below the chi-square in my categorical EFA with default estimator WLSM? The technical appendix at http://statmodel.com/chidiff.shtml does not specify what to do for WLSM estimation, although the output warning states that "MLM, MLR and WLSM chi-square difference testing is described in the Mplus Technical Appendices at www.statmodel.com. See chi-square difference testing in the index of the Mplus User's Guide." Thank you for your time, Alicia


WLSM should be treated as MLM and MLR. You may not be aware that we have a new way to do EFA as part of a CFA model. See the Version 5.1 Language and Examples Addendums on the website with the user's guide. 

Fatma Ayyad posted on Wednesday, October 20, 2010  1:47 pm



Dear Dr. Muthen, When I tried to use the DIFFTEST procedures for WLSMV to compare the parameters between two groups, I got this: THE CHI-SQUARE DIFFERENCE TEST COULD NOT BE COMPUTED BECAUSE THE H0 MODEL IS NOT NESTED IN THE H1 MODEL. Should I consider the p-value of the chi-square test of model fit? Otherwise, how should I judge my model? Thank you, Fatma


DIFFTEST is used to test two nested models. To determine the fit of a single model, use the fit statistics provided. 

Fatma Ayyad posted on Thursday, October 21, 2010  8:17 pm



Thank you! 


Hi, I want to test measurement invariance between nested models using DIFFTEST. The problem is, I have a massive dataset (250+K), so the tiniest differences are going to come up as significant. Is there any way I can specify the amount by which the models need to differ? E.g., test the probability that the models differ by more than, say, 10%? Or alternatively, would any other tests (say, change in CFI or TLI) be useful, and how does one request these in Mplus syntax? My other option is to take random samples from my whole sample, but I'd like to attempt it on the whole sample if possible. Thanks!


All fit statistics available for a particular model are given as the default. I would suggest taking random samples from the sample of a size such that you don't have any empty cells in the bivariate tables of the categorical indicators. 
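The subsampling suggestion above could be prototyped outside Mplus before running the models. The following Python sketch draws a random subsample and checks whether any bivariate table of the categorical indicators has an empty cell. The function names, the data layout (one tuple of item responses per case), and the toy data are all assumptions made for illustration; nothing here is Mplus functionality.

```python
import random
from itertools import combinations, product


def has_empty_bivariate_cell(rows, categories):
    """Return True if any pair of items has an unobserved category combination.

    rows: list of tuples, one tuple of item responses per case.
    categories: list with, for each item, the list of its possible categories.
    """
    for i, j in combinations(range(len(categories)), 2):
        observed = {(row[i], row[j]) for row in rows}
        for cell in product(categories[i], categories[j]):
            if cell not in observed:
                return True
    return False


def draw_subsample(rows, size, seed=0):
    """Draw a simple random subsample without replacement (seeded for repeatability)."""
    rng = random.Random(seed)
    return rng.sample(rows, size)


# Toy data: two binary items, every combination observed at least once.
data = [(0, 0), (0, 1), (1, 0), (1, 1)] * 25
sub = draw_subsample(data, 40, seed=1)
print(len(sub), has_empty_bivariate_cell(sub, [[0, 1], [0, 1]]))
```

One possible use is to increase the subsample size until the check returns False, and then run the nested models on that subsample.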

Kathy posted on Monday, March 21, 2011  11:38 am



Is this the right formula for calculating the chi-square difference test for categorical data using WLSM? I ask because the notes say "Chi-square testing for continuous non-normal outcomes".
cd = (d0*c0 - d1*c1)/(d0 - d1)
TRd = (T0*c0 - T1*c1)/cd


Yes, these would be the correct formulas in that case also. 
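As a worked illustration of the two formulas quoted in the question, here is a minimal Python sketch. It assumes, following the notation in the question, that model 0 is the nested (more restrictive) model and model 1 the comparison model, that T is the chi-square value as printed, c the scaling correction factor, and d the degrees of freedom. The function name and the numbers in the usage note are hypothetical.

```python
def scaled_chisq_difference(t0, c0, d0, t1, c1, d1):
    """Scaled chi-square difference from the formulas quoted above.

    Model 0: nested (more restrictive) model; model 1: comparison model.
    t: printed chi-square, c: scaling correction factor, d: degrees of freedom.
    Returns (TRd, difference in degrees of freedom, cd).
    """
    if d0 <= d1:
        raise ValueError("the nested model should have more degrees of freedom")
    # Difference test scaling correction.
    cd = (d0 * c0 - d1 * c1) / (d0 - d1)
    # Scaled difference statistic.
    trd = (t0 * c0 - t1 * c1) / cd
    return trd, d0 - d1, cd
```

For example, with made-up inputs `scaled_chisq_difference(120.0, 1.2, 50, 100.0, 1.1, 45)` gives cd = 2.1 and TRd = 34/2.1 on 5 degrees of freedom. A non-positive cd would signal that the result should not be interpreted.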


I have 19 ordinal items as indicators of 3 latent factors using WLSMV (as determined by EFA, promax rotation). I would like to test this model against a different sample, to see if the 3-factor structure holds. From the postings and the Mplus manual, it seems that I would run one CFA using both groups, specifying 3 factors, and would then run the same model but specify that the factor covariances across both groups are equivalent [i.e., f1 WITH f2 f3 (1)]. I would then use the DIFFTEST function to determine if the two models are significantly different. Is this correct? I am very new to Mplus, so I have also included a snippet of my code:
For H0 model:
GROUPING IS group (0=n561 1=n562);
MODEL:
F1 BY B10_in B19_in B28_in B54_in B61_in B66_in B71_in B79_in B80_in;
F2 BY B9_sb B18_sb B27_sb B36_sb;
F3 BY B3_wm B39_wm B48_wm B63_wm B73_wm B78_wm;
ANALYSIS: ESTIMATOR = WLSMV;
SAVEDATA: DIFFTEST IS deriv_561_562_9in_6wm_4sh.dat;
For H1 model:
GROUPING IS group (0=n561 1=n562);
MODEL:
!same as above plus next line
F1 WITH F2 F3 (1);
ANALYSIS: DIFFTEST IS deriv_561_562_9in_6wm_4sh.dat;
Is this correct? 


This looks correct. You can also consider comparing variances. See multiple group analysis in the Topic 1 and Topic 2 course handouts for measurement invariance and the comparison of structural parameters. 


Thank you for your reply. For this same data set, I would like to test the invariance of factor loadings between these two independent samples (19 categorical indicators, 3 continuous latent variables). From the course handouts you mention, it seems that the default in Mplus is to hold the factor loadings equal between groups. In order to test the factor loadings between groups, it seems that I would create an overall analysis model for both groups (as I did in my H0 model above), and then run this as a CFA saving the residuals (using DIFFTEST). For my second CFA (and subsequently the chi-square difference test), I would include a group model in addition to the overall analysis model, but the group model would need to specify that the factor loadings are to be freely estimated. According to the manual, it looks like I would list each indicator with a * to allow the factor loadings to be freely estimated. However, on the slides (#212, Topic 1) it appears that I should list the indicators in brackets. Can you please explain? I went ahead and modeled my code after slides #212-213. However, I receive the following error when attempting to run the second model and DIFFTEST: THE CHI-SQUARE DIFFERENCE TEST COULD NOT BE COMPUTED BECAUSE THE H0 MODEL IS NOT NESTED IN THE H1 MODEL. Please advise. Thank you. 


If you look through the User's Guide, you will see that bracket statements refer to either intercepts (for continuous indicators) or thresholds (for categorical indicators); they are not loadings. 


Hello. I am testing measurement and structural invariance across two groups using WLSMV. DIFFTEST works for the measurement invariance tests, but as I move from scalar to factor variance invariance, the DIFFTEST error states that the chi-square difference test could not be computed because the H0 model is not nested in the H1 model. Could you please advise? 


Please send the two outputs and your license number to support@statmodel.com. 


Hello, I have just started using Mplus (and doing SEM and path analysis). I am a bit confused about the fit of the model. My model consists of a categorical outcome variable and mostly categorical variables, with only 1 continuous variable.
Usevariables are x1 x2 x3 x4 x5 x6 y1 x7 x8;
Categorical are x1 x2 x3 x4 x5 x6;
Model:
aliend by x4 x5 x6;
alienp by x1 x2;
alienp with y1;
alienp with aliend;
y1 with aliend;
alienp y1 aliend on x7 x8;
x3 on alienp x8 y1 aliend;
Chi-square test of model: 105.518*, df: 16, p: 0.000
RMSEA Estimate: 0.022, 90% C.I.: 0.018 0.026
CFI: 0.983, TLI: 0.963
Chi-Square Test of Model Fit for the Baseline Model: 5274.423, df: 35, p: 0.000
Since the chi-square test shows significance, does that mean that the model is not fitting well? However, I thought the CFI and TLI were showing good fit. Thanks in advance 


CFI is a less stringent fit statistic than chi-square. If you are new to both Mplus and SEM, I suggest listening to our Topic 1 course video on the website and getting an SEM book. A good one for beginners is the one by Rex Kline. 

Tanya posted on Tuesday, September 27, 2011  9:58 am



will do that... thanks a lot! 

Kathy posted on Tuesday, October 04, 2011  2:08 pm



In conducting an MGFA I found non-invariance of the factor loadings/thresholds across groups (p<.001), but the CFI and RMSEA values were unchanged between the baseline model and the loading/threshold model. In other words, the difference test indicated that constraining the loadings/thresholds equal across groups resulted in a decrease in the fit of the model, but the goodness-of-fit values suggest no such decrease in model fit. The same thing has happened in several other analyses. Why would the goodness-of-fit indicate no change? Which values do you pay attention to, i.e., is there really a decrease in model fit? 


The default in Mplus is for the thresholds and factor loadings to be held equal across classes. So you should be relaxing, not imposing, these constraints. See the Topic 2 course handout under multiple group analysis to see how to do this. See also the multiple group discussion in Chapter 14 of the user's guide. 

Kathy posted on Tuesday, October 04, 2011  5:21 pm



In accordance with topic 14, my baseline model has the loadings/thresholds freed across groups, and in what I called the "loading/threshold" model the parameters were made equal (Mplus default). Is this not right? At any rate, I found non-invariance between these two models, according to the DIFFTEST (p<.001), but the CFI and RMSEA values were unchanged between the two models. My question pertains to the discrepancy between the DIFFTEST and the CFI and RMSEA. That is, the DIFFTEST suggests that constraining the loadings/thresholds to be equal decreased the fit of the model, while the CFI and RMSEA suggest that the fit of the model did not change. Why would the goodness-of-fit values indicate no change when the DIFFTEST suggests that the model fit decreased? Which values do you pay attention to? 


I would have to see the two outputs and your license number at support@statmodel.com to say anything more. 


I have been asked by a reviewer to explain how the df are calculated for the chi square difference test (in assessing invariance between a less restrictive CFA model using ordered categorical data and a more restrictive model). I have read the technical appendix for chisquare difference testing on the website, but I am afraid that I do not completely understand it. I have two questions about it. First, I do not see the scaling correction factor for either the less restrictive model (c0) or for the more restrictive model (c1) as part of my Mplus output. Second, I am hoping you can clarify how the scaling correction factor is estimated or calculated. My current understanding is that using the scaling correction is helpful for ensuring that the obtained chi square difference test value approximates a chi square distribution. But I am not entirely sure that I am correct or how the scaling correction is obtained. 


The degrees of freedom for a chi-square difference test is the difference in degrees of freedom between the two models. If you don't find the scaling correction factor, you must be using an old version of the program. The formula for the scaling correction factor is in Technical Appendix 4. This cannot be computed by hand. 

David Kosson posted on Thursday, November 10, 2011  1:22 pm



Linda, Thanks. I am guessing you are saying that this is the case even if I am using the WLSMV estimator (which I am). But this does not seem to be the case. For my less restrictive model (allowing the groups to differ on all loadings and thresholds, using no mean structure), the Chi-Square Value = 251.196*, Degrees of Freedom = 79**. For my more restrictive model (allowing the groups to differ on loadings but not thresholds, no mean structure), the Chi-Square Value = 226.604*, Degrees of Freedom = 76**. But for the chi-square difference test, Chi-Square Value = 19.033, Degrees of Freedom = 9**, P-Value = 0.0249. In case it helps, there were 13 indicators, all latent factor means were set at 0 and all scale factors (or indicators) were fixed at 1. 


If you are using a version before Version 6, the degrees of freedom for WLSMV are not calculated in the regular way. Both chi-square and the degrees of freedom are adjusted to obtain a correct p-value. Neither chi-square nor the degrees of freedom should be interpreted. To do difference testing with WLSMV, you must use the DIFFTEST option. There is no scaling correction factor involved. The difference in the number of free parameters can be used instead of the difference in degrees of freedom. 

Eric Chen posted on Wednesday, December 14, 2011  12:54 am



Dear Dr. Muthen, I am conducting a multiple group categorical CFA using WLSMV as the estimator. I wonder how to carry out the chi-square difference test when the difference between my H0 and H1 models is a nonlinear constraint. Thanks in advance. JH Chen 


Can you describe in more detail what you mean by the difference being a nonlinear constraint? 

Eric Chen posted on Wednesday, December 14, 2011  5:28 pm



Dear Dr. Muthen, I plan to use a two-group, 1-factor CFA to assess uniform and non-uniform DIF, separately. So, the 1st constraint is (threshold/loading) for the studied item to be equal across the two groups. And the 2nd constraint is (loading/residual variance) for the studied item to be equal across the two groups. Thanks! JH Chen 


These are not nonlinear constraints. You can do regular difference testing in your case. 

Eric Chen posted on Thursday, December 15, 2011  8:26 pm



Dear Dr. Muthen, Thanks for your reply. I have one more question. If I have to use WLSMV as the estimator and MODEL CONSTRAINT to specify my H0 model, how can I carry out a chi-square difference test in Mplus 6? It seems that DIFFTEST doesn't work when WLSMV and MODEL CONSTRAINT are used at the same time. JH Chen 


Please send the output that shows this problem and your license number to support@statmodel.com. 


When completing EFA on categorical data using the WLSMV estimator in the output there is a section titled 'FACTOR STRUCTURE'. What rotation is used to ascertain this output? Are they a recalculation of the Geomin rotated loadings also provided? 


The default rotation is used or the rotation specified using the ROTATION option. The factor structure is the item-factor correlations. 


Dear Mplus Team, if I treat my variables as categorical in multiple group models, taking MLR (not WLSMV) as estimator (type = mixture) and using the likelihood ratio test (LRT) for model comparisons: Should I use (in the case of categorical outcomes and MLR) the formulas on http://www.statmodel.com/chidiff.shtml (“Difference Testing Using the Loglikelihood with MLR”) based on loglikelihood values and scaling correction factors? Or is it better with categorical outcomes to use ML and the ordinary likelihood ratio test for model comparisons? My question is triggered by a posting from Tihomir Asparouhov: “If you are using the MLR estimator with categorical data you should use the unscaled likelihood ratio test. The SB is designed to be used for the case when you are treating the variables as continuous.” ( http://www.statmodel.com/discussion/messages/9/189.html ) Does it mean: only take the MLR likelihood values and calculate -2*(L0 - L1), without the difference test scaling correction? Thank you very much. 


What he meant is that MLR should be used when categorical variables are treated as continuous. If categorical variables are treated as categorical, then ML should work fine. If you use MLR, the scaling correction factor is always required. 
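For readers who do use MLR, the loglikelihood version of the scaled difference test can be sketched as follows. This is a hedged illustration of the "Difference Testing Using the Loglikelihood" calculation referenced in the question, written in Python; L is the loglikelihood, c the scaling correction factor, and p the number of free parameters, with model 0 nested in model 1. The function name is hypothetical and the numbers in the usage note are illustrative only.

```python
def mlr_loglik_difference(l0, c0, p0, l1, c1, p1):
    """Scaled loglikelihood difference test for MLR.

    Model 0: nested model (fewer parameters); model 1: comparison model.
    l: loglikelihood, c: scaling correction factor, p: number of free parameters.
    Returns (TRd, degrees of freedom, cd).
    """
    if p0 >= p1:
        raise ValueError("the nested model should have fewer free parameters")
    # Difference test scaling correction.
    cd = (p0 * c0 - p1 * c1) / (p0 - p1)
    # Scaled chi-square difference statistic.
    trd = -2.0 * (l0 - l1) / cd
    return trd, p1 - p0, cd
```

With illustrative values `mlr_loglik_difference(-2606.0, 1.450, 39, -2583.0, 1.546, 47)`, cd is 2.014 and TRd is 46/2.014 on 8 degrees of freedom.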


hi there, we are trying to conduct the difftest for a model using WLSMV, and are getting this error message: THE MODEL ESTIMATION TERMINATED NORMALLY THE CHISQUARE COMPUTATION COULD NOT BE COMPLETED BECAUSE OF A SINGULAR MATRIX. in a search of the discussion forum, it looks like this problem usually leads to y'all asking to see the input and data, but in this case we wouldn't be able to do that because some of the data can't be shared due to a legal agreement. are there any other options here that we could try to pursue? thanks, tom 


Please send the outputs from the two runs and we'll see what we can do. 


thanks, bengt. we managed to resolve that specific problem but can't get past messages that the models aren't nested (when as far as we can tell they are), so we will send the outputs in case you can help. 

sojung park posted on Tuesday, November 05, 2013  7:11 pm



Dear Drs. Muthen, I am running a regression with a binary outcome. In order to have FIML, I use the syntax ESTIMATOR=ML with INTEGRATION=MONTECARLO. How can I do a chi-square test for a series of nested models? If I use WLSMV, it seems I still have FIML, but I prefer running a logit, not a probit, model. Thank you so much! 


You do a difference test using the loglikelihoods. See Chi-Square Difference Test for MLM and MLR in the left column of the home page. 

ellen posted on Tuesday, February 18, 2014  1:19 pm



Dr. Muthen, Can I use .csv data file for the DIFFTEST? Or, does it only work for a .dat data file? I am comparing nested models, using the WLSMV estimator. Thanks! 


Either file can be used as long as it is a text file. 

jml posted on Wednesday, June 25, 2014  11:02 am



Dear Drs. Muthen, I am having the same problem as a few other people in this thread where I'm trying to conduct a chisquare difference test between two models that I believe are nested, where the indicators are categorical and the estimator is ULSMV. The error message is the following: THE MODEL ESTIMATION TERMINATED NORMALLY THE CHISQUARE COMPUTATION COULD NOT BE COMPLETED BECAUSE OF A SINGULAR MATRIX. I am using the method described on your site for ULSMV/WLSMV estimators. Thanks! 


Please send the outputs from the two steps to support along with your license number. 


To compare nested models with MLR and categorical dependent variables, there is no scaling correction factor in the output (Mplus V 7.2). A "chisquare test of model fit for the binary and ordered categorical outcomes" is provided. Is it allowable to do a traditional chisquare difference test to compare nested models without the scaling correction factor? 


Please send the full output to support@statmodel.com along with your license number so we can see your exact situation. 

Lois Downey posted on Thursday, September 03, 2015  7:41 am



I am using WLSMV and DIFFTEST in an exploratory investigation of whether there are regional differences in various categorical outcomes. Region is an 11-category nominal-scale variable, and each model uses 10 dummy indicators as predictors of one of the outcomes of interest. However, the p-value of the chi-square difference test differs considerably, depending upon which region I use as the reference group. For my final models, I've been using the category with the lowest coefficient as the reference group, thus ensuring that the coefficient estimates are all positive. Is this a reasonable strategy? Or is there a better rule of thumb for selecting the reference group in an exploratory study, given that the result depends on which region is selected? Thank you. 


Try using Model Test to see if you face the same issue. 

Lois Downey posted on Thursday, September 03, 2015  9:57 am



Thanks. I'll try that. However, I've not used Model Test before. Let me be sure I understand the procedure. Is this the correct procedure for testing a nominal-scale variable with 7 categories?
Run 1:
MODEL: Y on x1,x2,x3,x4,x5,x6 (b1-b6);
MODEL TEST: 0=b1-b6;
==========
Run 2:
MODEL: Y on x0,x2,x3,x4,x5,x6 (b1-b6);
MODEL TEST: 0=b1-b6;
Then compare the p-values for the Wald Test of Parameter Constraints from the two runs. Is that correct? 


You want to test that all of them are zero jointly, so Model Test: 0 = b1; ... 0 = b6; you can do that using a DO loop: Model Test: DO(1,6) 0 = b#; 

Lois Downey posted on Thursday, September 03, 2015  10:29 pm



Oh, I see. Thanks! Although this method gives p-values for the Wald tests that are similar when the reference category is altered, they don't match exactly. For example, looking at one outcome, I get the following p-values for omnibus tests for 5 sets of dummy indicators, depending upon the reference group selected: 0.6501 vs. 0.6523; 0.4873 vs. 0.5016; 0.4788 vs. 0.4786; 0.3385 vs. 0.3534; 0.0446 vs. 0.0447. (I perhaps should have mentioned that these are complex regressions, although I don't know whether that's relevant.) If I use the MLR estimator rather than WLSMV, and the loglikelihood and scaling factor to compute the p-value for the omnibus test, I get the following values for the 5 predictors above (irrespective of which category is used as the reference group): 0.0670; 0.5917; 0.4712; 0.2420; 0.0703. The discrepancies between the results with MLR (which is the estimator I've typically used in the past) and WLSMV are of concern, making me think that I should use MLR for my current analyses. Do you agree? Thanks very much for your help. 


Please send input, output, and data for a relevant WLSMV vs MLR comparison so we can take a look at it. Send as little as possible to pinpoint their differences. 

Daniel Lee posted on Friday, April 22, 2016  6:58 pm



Hello Dr. Muthen, I used the modification indices for categorical EFA (WLSMV) and removed an item that was contributing to a lot of model misfit. After removing the item, I would like to conduct a Difftest (as you normally would for two models w/ categorical indicators) but the deriv.dat would not save. The error message I get when I try to "SAVEDATA: Difftest is deriv.dat" for the baseline EFA model is: *** WARNING in SAVEDATA command The DIFFTEST option is not available for TYPE=EFA. Note that the DIFFTEST option is available with the use of EFA factors (ESEM). Request for DIFFTEST will be ignored. I would appreciate your guidance and resources for conducting difftests in categorical EFA models. 


You will need to do your EFA as an ESEM. See Example 5.24. This example is an EFA if you remove the covariate and direct effects. Other ESEM examples follow it. 

Daniel Lee posted on Saturday, April 23, 2016  8:13 pm



Many thanks! Makes perfect sense! 


I am running a cross-lagged autoregressive model with two main categorical variables and some continuous covariates. To handle missing data I am using the ML estimator with Monte Carlo integration. I would like to compare nested models. However, DIFFTEST is not allowed with the ML estimator. Should I use a different estimator (WLS?) just for the model comparison? 


With ML, you can look at the difference in the loglikelihoods and the difference in the number of parameters. Minus two times the loglikelihood difference is distributed as chi-square. 
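The calculation described in this reply can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not Mplus output: the function names are hypothetical, model 0 is the nested model, and the chi-square tail probability is computed with a standard-library recurrence that handles integer degrees of freedom only.

```python
import math


def chi2_sf(x, df):
    """Upper-tail probability of a chi-square variate with integer df >= 1."""
    if x <= 0:
        return 1.0
    y = x / 2.0
    # Start from df = 1 (odd df) or df = 2 (even df), then add one term per 2 df.
    if df % 2:
        a, q = 0.5, math.erfc(math.sqrt(y))
    else:
        a, q = 1.0, math.exp(-y)
    while a < df / 2.0:
        q += math.exp(a * math.log(y) - y - math.lgamma(a + 1.0))
        a += 1.0
    return q


def likelihood_ratio_test(loglik0, n_par0, loglik1, n_par1):
    """Unscaled likelihood ratio test: model 0 nested in model 1."""
    df = n_par1 - n_par0
    if df <= 0:
        raise ValueError("the nested model should have fewer free parameters")
    stat = -2.0 * (loglik0 - loglik1)
    return stat, df, chi2_sf(stat, df)
```

The loglikelihood values and the number of free parameters for each run are taken from the two outputs; no scaling correction enters because the estimator is ML rather than MLR.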


Dr. Muthen, 1) In my cross-lagged autoregressive models all variables are categorical (estimator ML), so I assume I do not need to calculate a correction factor when I calculate chi-square from the loglikelihoods? 2) Could you recommend a reference paper for calculating chi-square using loglikelihoods of nested models? 3) Also, the output does not give any fit indices. Is there a way to know if my baseline model fits the data well? Thank you, Vaiva 


1) Correct. 2) Try any SEM book. 3) There is not an overall test of fit but you can look at bivariate fit using TECH10. 


Thank you. TECH10 OUTPUT FOR CATEGORICAL VARIABLES IS NOT AVAILABLE FOR MODELS WITH COVARIATES Is there a possible solution to this? 


Not really, because you no longer have a frequency table to test the model against. Instead, you can think about the restrictions that the model imposes, such as only lag-1 relationships, and free up those restrictions to see if that model has a better logL. 


Is there a way to inspect for outliers under the Bayes estimator in Mplus? I see in Lee's (2007) text on Bayesian structural equation modeling there is a suggestion to inspect the residuals for outliers and a Q-Q plot for normality to check the fit of the model. Thanks! 


We don't have that implemented yet. 


Dear Dr. Muthen, I am doing multiple group CFAs to test for configural and metric invariance of a scale across three groups. I want to compare the configural and the metric model using chi-square difference testing. Since we are using the MLR estimator, we have to calculate the Satorra-Bentler scaled chi-square difference test (TRd) as indicated on the website. In order to do so we need to use the scaling correction factor. We are, however, not sure which scaling correction factor to use, as the output reports several: on the one hand, the output reports two scaling correction factors (H0 and H1) under the heading 'Loglikelihood', and on the other hand, the output reports a scaling correction factor under the heading 'Chi-Square Test of Model Fit'. Which one should we use? Thank you for your answer. Sara 


The chi-square difference testing that is printed already takes this into account, so there is no need to work with the scaling factors. 


Dear Dr. Muthen, Thank you for your answer. However, as far as I know, no chi-square difference test is printed. I calculated the difference test using the formulas on your website for an MLR estimator: https://www.statmodel.com/chidiff.shtml. Is this correct? Thank you. Sara 


Please send your output to Support along with your license number. 


Hi, I would like to compare a second-order factor model with a two-factor model. However, I always get the error: THE CHI-SQUARE DIFFERENCE TEST COULD NOT BE COMPUTED BECAUSE THE H0 MODEL IS NOT NESTED IN THE H1 MODEL. My syntax is:
Run 1:
MODEL:
CI BY G_1 G_3 G_5 G_6;
PE BY G_2 G_4 G_7 G_8;
SAVEDATA: DIFFTEST IS X2.dat;
Run 2:
ANALYSIS: DIFFTEST IS X2.dat;
MODEL:
CI BY G_1 G_3 G_5 G_6;
PE BY G_2 G_4 G_7 G_8;
GR BY CI* PE;
GR@1;
CI PE (1);
I also tried running the second-order factor model first, but got the same error message. Thank you very much for your help, Theres 


Hi again, I just realized that the second-order factor does not influence the model fit and the models show exactly the same model fit indices. Is there another way to find out which model fits the data better, the second-order factor model or the two-factor model? Thank you 


A second-order factor model is not testable unless you have at least 4 first-order factors. What the second-order factor model does is to put restrictions on the factor covariance matrix of the first-order factors. With 3 first-order factors this is the same as the 3 elements in that covariance matrix, so fit is the same. With only 2 first-order factor indicators the model is not identified: one factor covariance cannot identify a loading and a factor variance, nor two loadings as in your case. 
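The counting argument in this reply can be made concrete with a small Python sketch. It is only a rough parameter count under stated assumptions: a single second-order factor whose variance is fixed at 1, one free second-order loading per first-order factor, and no additional equality constraints. The function name is hypothetical.

```python
def second_order_df(n_first_order):
    """Degrees of freedom gained by replacing free first-order factor
    covariances with one second-order factor (simple counting rule).

    Negative: not identified; zero: same fit as the correlated-factor model;
    positive: the second-order structure is testable.
    """
    # Covariances among the first-order factors that the structure must reproduce.
    covariances = n_first_order * (n_first_order - 1) // 2
    # One second-order loading per first-order factor (factor variance fixed at 1).
    second_order_loadings = n_first_order
    return covariances - second_order_loadings
```

Under this rule, 2 first-order factors give -1 (one covariance for two loadings), 3 give 0, and 4 give 2, which matches the threshold of at least 4 first-order factors stated above.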


Thank you for your response, it helped me a lot! 
