Comparing CFA models
Mplus Discussion > Confirmatory Factor Analysis >
 Andrea Saul posted on Thursday, March 30, 2006 - 1:31 pm
Does anyone know of any standards or recommendations for comparing two "good fitting" models using the CFI or TLI? I'm interested in comparing three different models for a set of related psychological symptoms. The items are all dichotomous, I'm using the Mplus program, and one model ends up with consistently "better" fit across the CFI, TLI, RMSEA, and SRMR, but I have no idea how to determine whether the differences in fit are large enough to matter and haven't found anything on it. If being more specific matters: with my largest sample (1500 youth, 21 items, 2 factors), I end up with CFI values of .987, .983, and .982 and corresponding TLI values of .993, .990, and .989. With a smaller sample (N = 300), again all four fit indexes are consistently slightly better for one of the models, but the differences are even smaller, e.g. CFI of .982 vs. .980 and .979. Apart from theoretical differences between the models, is there any advice on how to judge whether one model fits statistically meaningfully better than the others? Thanks for any feedback; I (and my advisor) are stumped.
 Linda K. Muthen posted on Thursday, March 30, 2006 - 1:59 pm
I don't think there is a way to say that one CFI, TLI, RMSEA, or SRMR value is significantly better than another. I think the model choice would hinge on theoretical issues, as you mention. You don't mention chi-square. I wonder why.
 Andrea Saul posted on Thursday, March 30, 2006 - 3:55 pm
Thank you for your amazingly quick reply! I did not mention chi-square because in all my analyses the chi-square is significant. I was under the impression that this is typical with large sample sizes (my samples range from 300 to 1500 for the CFAs) regardless of model fit, but honestly my stats background is much weaker than I'd like, and I'm pretty far outside my advisor's areas of expertise with these few analyses. Perhaps I should still consider the difference in the chi-square statistics despite their significance(?) Thank you again for your initial reply; it has saved me from ongoing dead ends trying to find something on the topic.
 Bengt O. Muthen posted on Thursday, March 30, 2006 - 4:30 pm
Yes, chi-square difference testing can be quite useful for nested models. With dichotomous items and WLSMV estimation you would use DIFFTEST.
 Andrea Saul posted on Friday, March 31, 2006 - 7:22 am
Thank you! I will give that a try
 Andrea Saul posted on Friday, March 31, 2006 - 7:37 am
Sorry, one final question that will reveal my statistical knowledge limitations. I am not sure my comparison models are "nested" after looking at the DIFFTEST info in the manual. I have three competing models with three (models 1 & 2) or two (model 3) factors which the 21 symptoms load on. Thus, while the same items are used in every model, different items load on different factors across the models. Is that "nested"? If not, can I still compare the chi-squares by calculating by hand whether the difference is significant? Thanks again for your expertise!

 Bengt O. Muthen posted on Friday, March 31, 2006 - 3:24 pm
DIFFTEST will tell you, but say that you have loadings with 2 factors as follows (x is a non-zero loading):

x 0
x 0
x 0
0 x
0 x
0 x

and then for 3 factors case a)

x 0 0
x 0 0
0 0 x
0 x 0
0 x 0
0 0 x

or 3 factors case b)

x 0 0
x 0 0
x 0 x
0 x 0
0 x 0
x 0 x

then you can think of the 2-factor model as having a third factor with variance fixed at 1 and zero loadings on that factor. The 2-factor model is not nested within 3-factor case a) but is nested within 3-factor case b).
 Andrea Saul posted on Monday, April 03, 2006 - 5:42 pm
Thank you very much, your feedback has been very helpful!
 Doveh Etti posted on Wednesday, September 27, 2006 - 4:42 am
I have 2 questions; the first one was raised after reading the “Nested CFAs create linear dependency” discussion.
* Don’t you have to impose constraints on the correlations of f1 and f2 with f3?
If the answer is yes, what constraints?
I really don’t understand why the 2 factors model is nested within the “3 factors case b)” model. To be more specific:
As I understand you suggest to change:
x 0      x 0 0
x 0      x 0 0
x 0  =>  x 0 0
0 x      0 x 0
0 x      0 x 0
0 x      0 x 0
and compare it to:

x 0 0
x 0 0
x 0 x
0 x 0
0 x 0
x 0 x

My guess is that you tried to make one factor out of the 1st and 3rd factors. So shouldn't the last row of the 2-factor model be x 0 instead of 0 x?
 Bengt O. Muthen posted on Sunday, October 01, 2006 - 11:16 am
Adding a third column (factor) with zero loadings to the 2-factor model, you would fix the correlations of this extra factor with the other 2 factors to zero.

My 3b model had a mistake and should have the last row replaced by

0 x 0
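
A rough way to check this kind of nesting from the loading patterns alone is sketched below. It is only a pattern check under simplifying assumptions: the function names `parse` and `pattern_nested` are hypothetical, it pads the smaller model with all-zero factors as described above, and it ignores the restrictions on the extra factor's variance and correlations, so it does not replace what DIFFTEST reports. It uses the corrected case b) with last row 0 x 0.

```python
from itertools import permutations

def parse(pattern):
    """Turn rows like 'x 0 0' into one set of item indices per factor (column)."""
    rows = [line.split() for line in pattern.strip().splitlines()]
    n_factors = len(rows[0])
    return [{i for i, row in enumerate(rows) if row[j] == "x"}
            for j in range(n_factors)]

def pattern_nested(restricted, full):
    """True if every free loading of `restricted` is also free in `full`
    for some matching of factors, after padding with all-zero factors."""
    small, big = parse(restricted), parse(full)
    small = small + [set()] * (len(big) - len(small))  # add zero-loading factors
    if len(small) != len(big):
        return False  # the restricted model has more factors than the full one
    return any(all(s <= big[k] for s, k in zip(small, perm))
               for perm in permutations(range(len(big))))

two_factor = "x 0\nx 0\nx 0\n0 x\n0 x\n0 x"
case_a = "x 0 0\nx 0 0\n0 0 x\n0 x 0\n0 x 0\n0 0 x"
case_b = "x 0 0\nx 0 0\nx 0 x\n0 x 0\n0 x 0\n0 x 0"  # corrected last row

nested_in_a = pattern_nested(two_factor, case_a)
nested_in_b = pattern_nested(two_factor, case_b)
```

Under these assumptions the check agrees with the conclusion in this thread: the 2-factor pattern fits inside case b) but not inside case a), where items 3 and 6 have been moved to a separate factor.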
 Doveh Etti posted on Wednesday, October 04, 2006 - 1:28 am
I confirm that I received your answer to my question about nested models in factor analysis, and I thank you for it. I understand that the question of how to compare non-nested factor models remains unresolved.
 Jean-Samuel Cloutier posted on Thursday, May 17, 2012 - 5:00 pm
Is it possible to compare two nested models with the same number of degrees of freedom?

Here is my situation:

My one factor model

F4 BY Pe11 PE8 PE10 Pcom Pdiv Pemp Penv Ppro;

has the same number of degrees of freedom (20) as my three-factor uncorrelated model:

F1 BY PE11@1 PE10* PE8*;
F2 BY Pcom@1 Pdiv* Pemp*;
F3 By Penv@1 Ppro*;

Moreover, my three-factor correlated model

F1 BY PE11@1 PE10* PE8*;
F2 BY Pcom@1 Pdiv* Pemp*;
F3 By Penv@1 Ppro*;

F1 F2 f3 WITH F1 F2 F3;

has the same number of degrees of freedom (18) as my second-order model

F1 BY PE11@1 PE10* PE8*;
F2 BY Pcom@1 Pdiv* Pemp*;
F3 By Penv@1 Ppro*;

F4 BY F1@1 F2* F3*;

Can I compare them anyway, and how should I do it?

 Linda K. Muthen posted on Friday, May 18, 2012 - 11:46 am
The two models with 20 degrees of freedom are not nested. They can be compared using BIC.

The two models with 18 degrees of freedom are not nested. They are not statistically distinguishable because besides having the same degrees of freedom, they have the same fit.
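
For reference, a minimal sketch of how BIC and AIC are computed from a maximum-likelihood fit. The log-likelihood values, the parameter count, and the sample size below are made-up placeholders, not numbers from this thread; the formulas are the standard ones (lower is better for both).

```python
import math

def aic(loglik, n_params):
    """Akaike information criterion: -2 logL + 2 * (number of free parameters)."""
    return -2.0 * loglik + 2.0 * n_params

def bic(loglik, n_params, n_obs):
    """Bayesian information criterion: -2 logL + (number of free parameters) * ln(n)."""
    return -2.0 * loglik + n_params * math.log(n_obs)

# Hypothetical values for two non-nested models with the same number of parameters
n_obs = 300
n_params = 16
loglik_a = -3500.0
loglik_b = -3512.5

bic_a, bic_b = bic(loglik_a, n_params, n_obs), bic(loglik_b, n_params, n_obs)
aic_a, aic_b = aic(loglik_a, n_params), aic(loglik_b, n_params)

preferred = "A" if bic_a < bic_b else "B"
```

One thing the sketch makes visible: when two models have the same number of free parameters, the penalty terms cancel, so the BIC difference and the AIC difference both reduce to the difference in -2 logL and the two criteria rank the models identically.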
 Jean-Samuel Cloutier posted on Friday, May 18, 2012 - 12:05 pm
Why would you prefer the BIC over the AIC in this case?
 Linda K. Muthen posted on Friday, May 18, 2012 - 12:13 pm
See the following paper which is available on the website:

Nylund, K.L., Asparouhov, T., & Muthén, B. (2007). Deciding on the number of classes in latent class analysis and growth mixture modeling: A Monte Carlo simulation study. Structural Equation Modeling, 14, 535-569.
 Trang Q. Nguyen posted on Monday, January 07, 2013 - 10:25 am
I am doing CFA with categorical variables, using the default WLSMV estimator. I would like to compare two simple CFAs:

2-factor model:
f1 BY x1 x2 x6 x7 x8 x9;
f2 BY x3 x4 x5;
f1 WITH f2;

and 1-factor model:
f BY x1 x2 x3 x4 x5 x6 x7 x8 x9;

Is the 1-factor model nested in the 2-factor model, and can I use DIFFTEST to compare them? Moving from 2-factor to 1-factor, we need to constrain the correlation of f1 with f2 at 1. A professor suggested because 1 is on the boundary of the parameter space, testing may not be simple. When I searched the internet, I found a lecture by Jonathan Templin in which he says that in this case the chi-square difference is not distributed chi2(df=1), but is a mixture: 0.5chi2(df=0) + 0.5chi2(df=1). This makes me wonder if in general chi-square testing to compare CFAs with different number of factors is the way to go, and if DIFFTEST is appropriate to compare my two models estimated using WLSMV.

I appreciate your advice on this, and on any other methods I can use to compare the two models. I noticed that along with the likelihood, AIC and BIC are not available for WLSMV.

Thank you!
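
To make the boundary issue concrete, here is a small sketch of how the p-value changes if the reference distribution is the 50:50 mixture of chi2(df=0) and chi2(df=1) mentioned above instead of chi2(df=1). It assumes a likelihood-ratio-type statistic; whether the same mixture applies to a WLSMV DIFFTEST statistic is not something this sketch establishes. The function names are hypothetical.

```python
import math

def chi2_sf_df1(x):
    """Upper-tail probability of a chi-square variable with 1 degree of freedom."""
    return math.erfc(math.sqrt(x / 2.0))

def mixture_p_value(x):
    """p-value under 0.5*chi2(df=0) + 0.5*chi2(df=1).

    chi2(df=0) is a point mass at zero, so it contributes nothing to the
    upper tail for any positive test statistic.
    """
    if x <= 0:
        return 1.0
    return 0.5 * chi2_sf_df1(x)

statistic = 3.841  # the usual 5% critical value for chi2(df=1)
naive_p = chi2_sf_df1(statistic)        # about 0.05
mixture_p = mixture_p_value(statistic)  # half the naive value
```

For any positive statistic the mixture p-value is exactly half the naive one, so using chi2(df=1) in this situation is conservative.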

 Linda K. Muthen posted on Monday, January 07, 2013 - 12:18 pm
You cannot use DIFFTEST for this comparison for the reasons stated. Use maximum likelihood and compare BIC.
 Trang Q. Nguyen posted on Monday, January 07, 2013 - 12:53 pm
Thanks so much for your quick response. Would you advise using the MLR estimator and comparing BIC to select the model, and then switching back to WLSMV for later analyses when I use the factor(s) in structural models with other variables?

In some other section of this forum, you showed how to use the MODEL TEST command to do a Wald test of whether a correlation between two factors is 1. Would that be useful in my case, say I run the 2-factor model and test if the correlation is 1, and to favor the 1-factor model if the test does not reject, or the 2-factor model if the test rejects the null hypothesis? Or does this not make sense, I guess because the factors are defined conditional on the loadings in the 2-factor model?


 Bengt O. Muthen posted on Monday, January 07, 2013 - 4:35 pm
If it is not computationally heavy to use MLR, I would report those results. You can also mention the WLSMV chi-square tests of model fit.

I am not a fan of testing if the correlation is 1. I would report the two WLSMV chi-square tests and the MLR-based BIC values.
 Trang Q. Nguyen posted on Monday, January 07, 2013 - 5:15 pm
Thank you! By "the two WLSMV chi-square tests", do you mean the two tests comparing these models to the saturated model?

In order to do that, I understand I need to fit the saturated model with SAVEDATA DIFFTEST and then fit the other two models with ANALYSIS DIFFTEST. Is there special syntax to specify the saturated model? Or do I just do the following:

x1 PWITH x2-x9;
x2 PWITH x3-x9;

My sample size is over 3000. With this sample size, are chi-square tests helpful or are they likely to be significant regardless of how little misfit there is?

Thank you!
 Bengt O. Muthen posted on Monday, January 07, 2013 - 5:46 pm
You automatically get a chi-square test against the saturated model when using WLSMV.

Even with n=3000 I would report the chi-square test. Note also that you get Modindices with WLSMV to improve the model.
 Trang Q. Nguyen posted on Thursday, January 10, 2013 - 11:37 am
Thank you so much! I really appreciate it!
 Raimondo Bruno posted on Tuesday, October 07, 2014 - 4:02 am
long time lurker, first time poster.

I'm running CFA with categorical variables; comparing two unifactorial models using different definitions of two of the 11 items in the scale. There is some MCAR data (very minimal); N~1500

Using MLR, I get:
Model A: BIC=7471; chi2(df=2020)=2182
Model B: BIC=8787; chi2(df=2017)=2375

My co-authors want CFI/WRMR etc fit indices, so I have also run this using WLSMV, with model superiority running in the other direction:
Model A: chi2(df=44)=353; WRMR=2.059
Model B: chi2(df=44)=273; WRMR=1.709

My questions are:
1. I can't quite understand why the chi2 values are so different between the MLR and WLSMV models. I'm leaning towards reporting the ones from the WLSMV estimator as they make more sense to me (in the paper there is also a unifactorial structure with 3 items, so a just-identified model, and the MLR chi-square df=1 rather than 0). Could anybody provide a reference for how the chi-square is calculated under these models? (Google has failed me.)

2. Should I just stop being wishy-washy and report all statistics using the MLR estimator only? Is one of the estimators optimal under these conditions?

Knowing that the answer switches using the different estimators, it doesn't sit well with me only reporting one solution until I am clearer as to the reason for the discrepancy!
 Linda K. Muthen posted on Tuesday, October 07, 2014 - 9:08 am
Please send the two outputs and your license number to
  Jessica posted on Tuesday, August 11, 2015 - 10:59 pm
I am conducting a CFA with two latent variables (f1, f2), each with three indicators (y1-y6).

I want to verify the baseline model and the saturated model. To do that, I specify the following:

1) baseline model
Only the intercepts (6) and residual variances (6) of the observed variables are estimated. All factor loadings are fixed to 1, and the variances of the latent variables are fixed to 0. So df=15

2) saturated model
The intercepts (6), residual variances (6), and residual covariances (15) are estimated. All factor loadings are fixed to 1, and the variances of the latent variables are fixed to 0. So, df=0

Thank you!
 Linda K. Muthen posted on Wednesday, August 12, 2015 - 7:11 am
The baseline and saturated models contain only observed variables. There should be no factors in these models.
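
The degrees of freedom quoted in the question can be reproduced by counting sample moments and free parameters for observed variables only. The sketch below assumes continuous indicators with a mean structure and a baseline model that estimates means and variances with all covariances fixed at zero; the function names are hypothetical.

```python
def n_moments(p):
    """Sample moments for p continuous variables: p means plus p(p+1)/2 (co)variances."""
    return p + p * (p + 1) // 2

def df_baseline(p):
    """Baseline model: p means and p variances, all covariances fixed at zero."""
    return n_moments(p) - 2 * p

def df_saturated(p):
    """Saturated model: every mean, variance, and covariance is free."""
    n_free = p + p * (p + 1) // 2
    return n_moments(p) - n_free

p = 6  # y1-y6
baseline_df = df_baseline(p)
saturated_df = df_saturated(p)
```

With six indicators this gives 27 moments, 15 degrees of freedom for the baseline model and 0 for the saturated model, matching the counts in the question without involving any factors.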
 Anne Black posted on Monday, December 12, 2016 - 1:54 pm
I have a multiple group CFA (combination continuous and categorical indicators) and am testing the plausibility of constraints. I would like to compare model fit using BIC, but don't have this, or -2LL in my output. Is there a way to request this?
 Bengt O. Muthen posted on Monday, December 12, 2016 - 4:34 pm
You get that with Estimator = ml.
 Anne Black posted on Tuesday, December 13, 2016 - 6:50 am
Thank you, Dr. Muthen. I am using TYPE=COMPLEX, and added ESTIMATOR=ML and am getting a warning, " Estimator ML is only allowed with TYPE=COMPLEX and replicate weights." Is it that I am also using Delta parameterization for categorical indicators?
 Bengt O. Muthen posted on Tuesday, December 13, 2016 - 3:19 pm
Use Estimator = mlr.

Delta has to do with Estimator = wlsmv.
 Anne Black posted on Thursday, December 15, 2016 - 7:32 am
Forgive my persistence, but I tried estimator=MLR and get the warning,
" ALGORITHM=INTEGRATION is not available for multiple group analysis.
Try using the KNOWNCLASS option for TYPE=MIXTURE."

Is there another way to get BIC for a multiple group CFA with continuous and categorical indicators?

Thanks in advance.
 Linda K. Muthen posted on Thursday, December 15, 2016 - 8:09 am
If you use maximum likelihood and the CATEGORICAL option, multiple group analysis can be done only using the KNOWNCLASS option in conjunction with TYPE=MIXTURE. When classes are known, this is the same as multiple group analysis.