Andrea Saul posted on Thursday, March 30, 2006 - 1:31 pm
Does anyone know of any standards or recommendations for comparing two "good fitting" models using the CFI or TLI? I'm interested in comparing three different models for a set of related psychological symptoms. The items are all dichotomous, and I'm using Mplus. One model ends up with consistently "better" fit across the CFI, TLI, RMSEA, and SRMR indexes, but I have no idea how to determine whether the differences in fit are large enough to matter, and I haven't found anything on it. To be more specific: with my largest sample (1500 youth, 21 items, 2 factors), I end up with CFI values of .987, .983, and .982 and corresponding TLI values of .993, .990, and .989. With a smaller sample (N = 300), all four fit indexes are again consistently slightly better for one of the models, but the differences are even smaller, e.g. CFI of .982 vs. .980 and .979. Apart from theoretical issues between the models, is there any advice on how to comment on whether one model fits better than the others in a statistically meaningful way? Thanks for any feedback; I (and my advisor) are stumped.
I don't think there is a way to say that one CFI, TLI, RMSEA, or SRMR value is significantly better than another. I think the model choice would hinge on theoretical issues, as you mention. You don't mention chi-square. I wonder why.
Andrea Saul posted on Thursday, March 30, 2006 - 3:55 pm
Thank you for your amazingly quick reply! I did not mention chi-square because in all my analyses the chi-square is significant. I was under the impression that this is typical with large sample sizes (my samples range from 300 to 1500 for the CFAs) regardless of model fit, but honestly my stats background is much weaker than I'd like, and I'm pretty far outside my advisor's areas of expertise with these few analyses... Perhaps I should still consider the difference in the chi-square statistics despite their significance(?) Thank you again for your initial reply; it has saved me from ongoing dead ends trying to find something on the topic.
Sorry, one final question that will reveal the limits of my statistical knowledge. I am not sure my comparison models are "nested" after looking at the DIFFTEST info in the manual. I have three competing models with three (models 1 & 2) or two (model 3) factors on which the 21 symptoms load. Thus, while the set of items is the same, different items load on different factors across the models. Is that "nested"? If not, can I still compare the chi-squares by hand-calculating whether the difference is significant? Thanks again for your expertise!
DIFFTEST will tell you, but say that you have loadings with 2 factors as follows (x is a non-zero loading):
x 0
x 0
x 0
0 x
0 x
0 x
and then for the 3-factor case a)
x 0 0
x 0 0
0 0 x
0 x 0
0 x 0
0 0 x
or the 3-factor case b)
x 0 0
x 0 0
x 0 x
0 x 0
0 x 0
x 0 x
then you can think of the 2-factor model as having a third factor with variance fixed at 1 and zero loadings on that factor. The 2-factor model is not nested within 3-factor case a) but is nested within 3-factor case b).
Thank you very much, your feedback has been very helpful!
Doveh Etti posted on Wednesday, September 27, 2006 - 4:42 am
I have 2 questions; the first was raised after reading the “Nested CFAs create linear dependency” discussion. Don’t you have to impose constraints on the correlations of f1 and f2 with f3? If the answer is yes, what constraints? I really don’t understand why the 2-factor model is nested within the “3 factors case b)” model. To be more specific: as I understand it, you suggest changing: x 0 x 0 0 x 0 x 0 0 x 0 => x 0 0 0 x 0 x 0 0 x 0 x 0 0 x 0 x 0 and comparing it to:
x 0 0
x 0 0
x 0 x
0 x 0
0 x 0
x 0 x
I guess you tried to make one factor out of the 1st and 3rd factors. So shouldn't the last row of the 2-factor model be x 0 instead of 0 x?
Adding a third column (factor) with zero loadings to the 2-factor model, you would fix the correlations of this extra factor with the other 2 factors to zero.
My 3b model had a mistake and should have the last row replaced by
0 x 0
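The nesting argument in this exchange can be checked mechanically. Below is a minimal sketch in Python (not Mplus), assuming that nesting arises only from fixing loadings to zero, so that the smaller model is nested if, for some ordering of the factors, each of its free loadings is also free in the larger model. The function names are hypothetical, and the check says nothing about other routes to nesting, such as fixing a factor correlation.

```python
from itertools import permutations

def parse(rows):
    """Convert rows such as 'x 0 0' to tuples of booleans (True = free loading)."""
    return [tuple(cell == "x" for cell in row.split()) for row in rows]

def nested_by_zero_loadings(small, large):
    """Return True if, for some ordering of the factors, every free loading
    in `small` is also free in `large` (assumes the same items, in the same order)."""
    n_factors = len(large[0])
    # Pad the smaller model with extra factors that have all-zero loadings.
    padded = [row + (False,) * (n_factors - len(row)) for row in small]
    for order in permutations(range(n_factors)):
        reordered = [tuple(row[j] for j in order) for row in padded]
        if all(lg or not sm
               for s_row, l_row in zip(reordered, large)
               for sm, lg in zip(s_row, l_row)):
            return True
    return False

two_factor = parse(["x 0", "x 0", "x 0", "0 x", "0 x", "0 x"])
case_a = parse(["x 0 0", "x 0 0", "0 0 x", "0 x 0", "0 x 0", "0 0 x"])
case_b_posted = parse(["x 0 0", "x 0 0", "x 0 x", "0 x 0", "0 x 0", "x 0 x"])
case_b_corrected = parse(["x 0 0", "x 0 0", "x 0 x", "0 x 0", "0 x 0", "0 x 0"])

for name, model in [("case a", case_a),
                    ("case b as posted", case_b_posted),
                    ("case b corrected", case_b_corrected)]:
    print(name, nested_by_zero_loadings(two_factor, model))
```

Under these assumptions the check fails for case a) and for case b) as first posted, and succeeds once the last row of case b) is 0 x 0, which matches the correction given in the reply.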
Doveh Etti posted on Wednesday, October 04, 2006 - 1:28 am
I confirm that I received your answer to the question about nested models in factor analysis, and I thank you for it. I understand that the question of how to compare non-nested factor models remains unresolved.
See the following paper which is available on the website:
Nylund, K.L., Asparouhov, T., & Muthén, B. (2007). Deciding on the number of classes in latent class analysis and growth mixture modeling. A Monte Carlo simulation study. Structural Equation Modeling, 14, 535-569.
I am doing CFA with categorical variables, using the default WLSMV estimator. I would like to compare two simple CFAs:
2-factor model: f1 BY x1 x2 x6 x7 x8 x9; f2 BY x3 x4 x5; f1 WITH f2;
and 1-factor model: f BY x1 x2 x3 x4 x5 x6 x7 x8 x9;
Is the 1-factor model nested in the 2-factor model, and can I use DIFFTEST to compare them? Moving from 2-factor to 1-factor, we need to constrain the correlation of f1 with f2 at 1. A professor suggested because 1 is on the boundary of the parameter space, testing may not be simple. When I searched the internet, I found a lecture by Jonathan Templin in which he says that in this case the chi-square difference is not distributed chi2(df=1), but is a mixture: 0.5chi2(df=0) + 0.5chi2(df=1). This makes me wonder if in general chi-square testing to compare CFAs with different number of factors is the way to go, and if DIFFTEST is appropriate to compare my two models estimated using WLSMV.
I appreciate your advice on this, and on any other methods I can use to compare the two models. I noticed that along with the likelihood, AIC and BIC are not available for WLSMV.
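For readers who want to see what the 50:50 mixture mentioned in this post implies numerically, here is a minimal sketch in Python. It assumes the reference distribution is 0.5*chi2(df=0) + 0.5*chi2(df=1), where chi2(df=0) is a point mass at zero, so the p-value for a positive statistic is half the usual chi2(df=1) tail probability. The function names are hypothetical, and whether this mixture applies to a WLSMV DIFFTEST statistic is not settled in this thread.

```python
import math

def chi2_sf_df1(x):
    """Upper-tail probability of a chi-square variable with 1 df.

    Uses the identity P(chi2_1 > x) = erfc(sqrt(x / 2)).
    """
    return math.erfc(math.sqrt(x / 2.0))

def mixture_p_value(x):
    """P-value under 0.5*chi2(df=0) + 0.5*chi2(df=1).

    chi2(df=0) puts all of its mass at zero, so it contributes nothing
    to the upper tail for any x > 0.
    """
    if x <= 0:
        return 1.0
    return 0.5 * chi2_sf_df1(x)

# Hypothetical test statistic, for illustration only.
statistic = 3.0
print("naive chi2(df=1) p-value:", chi2_sf_df1(statistic))
print("mixture p-value:         ", mixture_p_value(statistic))
```

The mixture p-value is always half the naive one for a positive statistic, so the naive chi2(df=1) test is conservative in this boundary situation.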
Thanks so much for your quick response. Would you advise using MLR estimator and compare BIC to select the model, and then switch back to WLSMV for later analyses when I use the factor(s) in structural models with other variables?
In some other section of this forum, you showed how to use the MODEL TEST command to do a Wald test of whether a correlation between two factors is 1. Would that be useful in my case, say I run the 2-factor model and test if the correlation is 1, and to favor the 1-factor model if the test does not reject, or the 2-factor model if the test rejects the null hypothesis? Or does this not make sense, I guess because the factors are defined conditional on the loadings in the 2-factor model?
Thank you! By "the two WLSMV chi-square tests", do you mean the two tests comparing these models to the saturated model?
In order to do that, I understand I need to fit the saturated model with SAVEDATA DIFFTEST and then fit the other two models with ANALYSIS DIFFTEST. Is there special syntax to specify the saturated model? Or do I just do the following:
x1 PWITH x2-x9; x2 PWITH x3-x9; etc.?
My sample size is over 3000. With this sample size, are chi-square tests helpful or are they likely to be significant regardless of how little misfit there is?
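The sample-size concern raised here can be illustrated with a rough calculation. The sketch below, in Python, assumes the test statistic is approximately noncentral chi-square, so its expected value is df + (N - 1) * df * RMSEA^2 for a fixed population RMSEA. The df and RMSEA values are hypothetical and are not taken from any model in this thread; WLSMV statistics are mean- and variance-adjusted, so this is only an approximation of the general pattern.

```python
def expected_chi_square(n, df, rmsea):
    """Approximate expected model chi-square for a fixed amount of misfit.

    Assumes a noncentral chi-square with noncentrality
    (n - 1) * df * rmsea**2, so the mean is df plus that quantity.
    """
    return df + (n - 1) * df * rmsea ** 2

# Hypothetical values: the same small misfit (RMSEA = 0.02) at three sample sizes.
for n in (300, 1500, 3000):
    print(n, round(expected_chi_square(n, df=26, rmsea=0.02), 1))
```

With misfit held constant, the expected chi-square moves further above its degrees of freedom as N grows, which is why a significant chi-square at N > 3000 can coexist with very small misfit.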
I'm running CFA with categorical variables; comparing two unifactorial models using different definitions of two of the 11 items in the scale. There is some MCAR data (very minimal); N~1500
Using MLR, I get:
Model A: BIC=7471; chi2(df=2020)=2182
Model B: BIC=8787; chi2(df=2017)=2375
My co-authors want CFI/WRMR etc. fit indices, so I have also run this using WLSMV, with model superiority running in the other direction:
Model A: chi2(df=44)=353; WRMR=2.059
Model B: chi2(df=44)=273; WRMR=1.709
My questions are: 1. I can't quite understand why the chi2 values are so different between the MLR and WLSMV models. I'm leaning toward reporting the ones from the WLSMV estimator as they make more sense to me (in the paper there is also a unifactorial structure with 3 items, so a just-identified model, and the MLR chi-square has df=1 rather than 0). Could anybody provide a reference for how the chi-square is calculated under these models? (Google has failed me.)
2. Should I just stop being wishy-washy and report all statistics using the MLR estimator only? Is one of the estimators optimal under these conditions?
Knowing that the answer switches using the different estimators, it doesn't sit well with me only reporting one solution until I am clearer as to the reason for the discrepancy!
The baseline and saturated models contain only observed variables. There should be no factors in these models.
Anne Black posted on Monday, December 12, 2016 - 1:54 pm
I have a multiple group CFA (combination continuous and categorical indicators) and am testing the plausibility of constraints. I would like to compare model fit using BIC, but don't have this, or -2LL in my output. Is there a way to request this?
Anne Black posted on Tuesday, December 13, 2016 - 6:50 am
Thank you, Dr. Muthen. I am using TYPE=COMPLEX, and added ESTIMATOR=ML and am getting a warning, " Estimator ML is only allowed with TYPE=COMPLEX and replicate weights." Is it that I am also using Delta parameterization for categorical indicators?
If you use maximum likelihood and the CATEGORICAL option, multiple group analysis can be done only using the KNOWNCLASS option in conjunction with TYPE=MIXTURE. When classes are known, this is the same as multiple group analysis.
I have some questions regarding model comparison and model evaluation. I have conducted an EFA (with 1 to 5 factors) and then a CFA, both with WLSMV.
1. The chi-square model comparisons of the EFA are all significant (thus suggesting that a 5-factor model is the best), even though theoretical interpretability and parallel analysis suggest that a two-factor model is the best. Can I ignore the chi-square model comparisons?
2. WLSMV does not produce AIC and BIC values. For the CFA I tried to use DIFFTEST to compare a 2-factor CFA to a 1-factor CFA, but got the following warning: THE CHI-SQUARE DIFFERENCE TEST COULD NOT BE COMPUTED BECAUSE THE FILE CONTAINING INFORMATION ABOUT THE H1 MODEL HAS INSUFFICIENT DATA. Does this mean these two models are not nested? If so, is there any other way to compare the CFA models?
3. I would like to report some information regarding the proportion of variance explained by the factors. The output only gives the explained variance per item. Is it possible to calculate the proportion of explained variance for factors?