JPower posted on Tuesday, November 24, 2015 - 12:29 pm
I am using WLSMV estimation to conduct a confirmatory factor analysis on a measure (3 factors, 18 ordinal items) and am interested in the stability of factor loadings and thresholds across two time points. In an initial model with all of the loadings and thresholds constrained, model fit indices suggest quite good fit (CFI=0.965, TLI=0.965, RMSEA=0.045 with p=1.000, WRMR=1.651). Modification indices suggest the freeing of specific loadings/thresholds. I then freed these one at a time (based on modification indices) and conducted chi-square difference tests at each step, which indicated significant improvements in fit.
My question is whether I have gone too far in freeing the constraints - have I over-fit the models by freeing the loadings/thresholds? Even though this improved fit, model fit was good with them all constrained. I guess I'm concerned that I'm relying too much on chi-square estimates, which I believe may find small differences significant with larger sample sizes. My sample size is approximately 800. I am aware that there have been recommendations in the literature of using changes in RMSEA and CFI (Chen 2007), but these guidelines were not developed for categorical models/WLSMV estimation. In reading through the discussion board, I also understand that BIC is not an option with WLSMV. Thoughts?
Overfitting may occur due to chi-square sensitivity when n is large. You can see how much key parameters, such as the factor mean difference over time, are affected by freeing more measurement parameters. If chi-square drops significantly without such a key parameter changing in substantively meaningful ways, you have demonstrated over-sensitivity.
JPower posted on Thursday, November 26, 2015 - 8:30 am
In terms of measurement invariance testing specifically, does this apply to DIFFTEST results as well? I am very concerned about concluding that the loadings and thresholds can't be constrained across time based on DIFFTEST results, when the model fit was already quite good to begin with. Are there any guidelines as to how much loadings or thresholds would need to change for it to be meaningful? Thanks!
Q2. No, but I recommend the sensitivity approach I mentioned. And note that good fit by e.g. CFI doesn't always protect you from important misfit that chi-square can detect.
Louise Black posted on Thursday, December 20, 2018 - 7:49 am
We are conducting invariance testing in a very large sample (> 13,000 longitudinal, > 30,000 group). We are using WLSMV since we have ordinal items and so plan to rely on difftest rather than CFI given that the chi-square is not comparable in the same way as for ML.
1. In relation to mean differences and overfitting (discussed above), would this be just for the second time point/group (since the mean would be fixed@0 in the first time point/group for identification)?
2. Would a difference in mean level of .011 upon freeing an item’s loadings and threshold parameters be indicative of overfitting/is there a threshold for this?
3. Given our sample size, we are considering using a random subsample to overcome oversensitivity- do you think this is a reasonable approach?
Many thanks in advance!
2. It is the substantively important difference you want to focus on. Since you might not have a feel for a factor's values, you can translate your 0.011 to a key factor indicator mean (y-mean = intercept + loading*factormean).
3. I think there are different opinions about this - you may want to try asking on SEMNET.
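The translation suggested in point 2 (y-mean = intercept + loading*factormean) can be sketched numerically as follows. This is only an illustration: the intercept, loading, and baseline factor mean are hypothetical values, and only the 0.011 shift comes from the thread. The formula applies to a continuous indicator; for ordinal items, the later reply in this thread points to thresholds and category probabilities instead.

```python
# Sketch: how much does a 0.011 change in the factor mean move the
# model-implied mean of a continuous indicator?
# All numeric values except `shift` are assumptions for illustration.

def indicator_mean(intercept, loading, factor_mean):
    """Model-implied indicator mean: intercept + loading * factor mean."""
    return intercept + loading * factor_mean

intercept = 2.0      # hypothetical indicator intercept
loading = 0.8        # hypothetical factor loading
factor_mean = 0.30   # hypothetical factor mean in the second group/time point
shift = 0.011        # change in factor mean after freeing parameters (from the thread)

mean_before = indicator_mean(intercept, loading, factor_mean)
mean_after = indicator_mean(intercept, loading, factor_mean + shift)

# The change equals loading * shift, whatever the intercept is.
change = mean_after - mean_before
print(f"change in indicator mean: {change:.4f}")
```

Whether a change of that size matters is a substantive judgment on the indicator's own scale, not something the arithmetic decides.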
Margarita posted on Friday, February 22, 2019 - 4:32 am
Hi Dr. Muthen,
Following my colleague's post, I read your 2002 paper on Latent Variable Analysis With Categorical Outcomes, and I wanted to check whether the following formula y=lambda*alpha is correct for polytomous indicators?
With an ordinal y, you don't have means like for a continuous y. Instead, you have thresholds and probabilities for the different ordered y categories. See regression with an ordinal y, e.g. in our RMA book, Chapter 5.
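As a rough illustration of "thresholds and probabilities" for an ordinal item, the sketch below computes category probabilities conditional on a factor value, assuming a probit link and a residual standard deviation fixed at 1. The thresholds, loading, and factor values are hypothetical, and the function names are made up for this example; it is not a reproduction of any particular Mplus output.

```python
import math

def norm_cdf(z):
    """Standard normal CDF via the error function."""
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))

def category_probs(thresholds, loading, factor_value, resid_sd=1.0):
    """P(y = k | factor) for an ordinal item under a probit link.

    thresholds: increasing cut points on the latent response y*.
    resid_sd:   residual SD of y*, assumed to be 1 here.
    """
    cuts = [-math.inf] + list(thresholds) + [math.inf]
    mu = loading * factor_value
    cum = [norm_cdf((c - mu) / resid_sd) for c in cuts]
    # Probability of each category is the difference of adjacent cumulative values.
    return [cum[k + 1] - cum[k] for k in range(len(cuts) - 1)]

thresholds = [-1.0, 0.0, 1.2]   # hypothetical thresholds for a 4-category item
loading = 0.8                   # hypothetical loading

p_ref = category_probs(thresholds, loading, 0.0)
p_shifted = category_probs(thresholds, loading, 0.011)  # factor value shifted by 0.011

for k, (a, b) in enumerate(zip(p_ref, p_shifted)):
    print(f"category {k}: {a:.4f} -> {b:.4f}")
```

Comparing the two sets of probabilities is one way to judge, on the scale of the observed categories, whether a small shift in the factor mean is substantively meaningful.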
Margarita posted on Thursday, February 28, 2019 - 6:45 am
Yes you are right, I just got confused from your suggestion to Louise above for using the difference in mean y for ordinal items.
One last question, if you have the time: given that chi-square and related fit indices cannot be compared within WLSMV, could one compare the SRMR given that it's not influenced by the chi-square?
Our understanding is that difftest is still somewhat sensitive to sample size and we have very large N (> 13,000 longitudinal, > 30,000 group invariance). Published simulations using CFI difference don't seem to extend to WLSMV so we're just trying to figure out the most robust approach. Do you think SRMR could be useful in this case?
SRMR and CFI can be used to establish approximate measurement invariance simply by evaluating these for the scalar or metric MI model (so not within a difference procedure between two models, but using them directly on the constrained model).
You might find useful the MODEL option described on page 670 in the user's guide. Maybe also consider the implications of using parameterization=theta/delta.