Message/Author 

JPower posted on Tuesday, November 24, 2015  12:29 pm



Hello, I am using WLSMV estimation to conduct a confirmatory factor analysis on a measure (3 factors, 18 ordinal items) and am interested in the stability of factor loadings and thresholds across two time points. In an initial model with all of the loadings and thresholds constrained, model fit indices suggest quite good fit (CFI=0.965, TLI=0.965, RMSEA=0.045 with p=1.000, WRMR=1.651). Modification indices suggest freeing specific loadings/thresholds. I then freed these one at a time (based on modification indices) and conducted chi-square difference tests at each step, which indicated significant improvements in fit. My question is whether I have gone too far in freeing the constraints: have I overfit the models by freeing the loadings/thresholds? Even though this improved fit, model fit was good with them all constrained. I guess I'm concerned that I'm relying too much on chi-square tests, which I believe may find small differences significant with larger sample sizes. My sample size is approximately 800. I am aware that there have been recommendations in the literature to use changes in RMSEA and CFI (Chen, 2007), but these guidelines were not developed for categorical models/WLSMV estimation. In reading through the discussion board, I also understand that BIC is not an option with WLSMV. Thoughts? Thanks!


Are you using the DIFFTEST option for the difference testing? 

JPower posted on Wednesday, November 25, 2015  7:19 am



Yes, I am using DIFFTEST to compare the models as they are nested. 


Overfitting may occur due to chi-square sensitivity when n is large. You can check how much key parameters, such as the factor mean difference over time, are affected by freeing more measurement parameters. If chi-square drops significantly without such a key parameter changing in a substantively meaningful way, you have demonstrated oversensitivity.
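The large-sample sensitivity mentioned above can be illustrated with a quick sketch. For a fit function value F, the test statistic scales roughly as T = n*F, so the same tiny misfit that is nonsignificant at n = 200 becomes significant at n = 800 and overwhelming at n = 13,000. The numbers below are purely hypothetical, and a plain 1-df chi-square is used for illustration (DIFFTEST applies an adjusted statistic for WLSMV, but the scaling with n behaves similarly):

```python
import math

def chi2_sf_df1(t):
    """Survival function of a chi-square with 1 df: P(X > t)."""
    return math.erfc(math.sqrt(t / 2.0))

# Hypothetical fixed misfit in the fit function; the test statistic
# grows as T = n * F, so significance is driven by sample size.
F = 0.006  # assumed, purely illustrative

for n in (200, 800, 13000):
    T = n * F
    print(f"n = {n:6d}  chi-square(1) = {T:7.2f}  p = {chi2_sf_df1(T):.4f}")
```

With this (made-up) misfit, the p-value crosses the .05 line somewhere between n = 200 and n = 800 even though the misfit itself never changes, which is exactly why the reply above suggests judging the change in a key parameter rather than the p-value alone.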

JPower posted on Thursday, November 26, 2015  8:30 am



In terms of measurement invariance testing specifically, does this apply to DIFFTEST results as well? I am very concerned about concluding that the loadings and thresholds can't be constrained across time based on DIFFTEST results, when the model fit was already quite good to begin with. Are there any guidelines as to how much loadings or thresholds would need to change for it to be meaningful? Thanks!


Q1. Yes. Q2. No, but I recommend the sensitivity approach I mentioned. And note that good fit by, e.g., CFI doesn't always protect you from important misfit that chi-square can detect.

Louise Black posted on Thursday, December 20, 2018  7:49 am



We are conducting invariance testing in a very large sample (> 13,000 longitudinal, > 30,000 group). We are using WLSMV since we have ordinal items, and so plan to rely on DIFFTEST rather than CFI, given that the chi-square is not comparable in the same way as for ML. 1. In relation to mean differences and overfitting (discussed above), would this apply just to the second time point/group (since the mean would be fixed at 0 in the first time point/group for identification)? 2. Would a difference in mean level of .011 upon freeing an item's loadings and threshold parameters be indicative of overfitting, or is there a threshold for this? 3. Given our sample size, we are considering using a random subsample to overcome oversensitivity. Do you think this is a reasonable approach? Many thanks in advance!


1. Right (if I understand what you mean). 2. It is the substantively important difference you want to focus on. Since you might not have a feel for a factor's values, you can translate your 0.011 to a key factor indicator mean (ymean = intercept + loading*factormean). 3. I think there are different opinions about this; you may want to try asking on SEMNET.
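The translation suggested above (ymean = intercept + loading*factormean) can be sketched numerically. The intercept and loading values here are hypothetical, chosen only to show the arithmetic for the 0.011 factor-mean difference from the question:

```python
# Hypothetical measurement parameters for one key indicator
# (on the latent-response-variable metric used with ordinal items).
intercept = 0.0  # assumed indicator intercept
loading = 0.8    # assumed factor loading

def indicator_mean(factor_mean):
    """Model-implied indicator mean: intercept + loading * factor mean."""
    return intercept + loading * factor_mean

# Shift in the indicator mean implied by a 0.011 factor-mean difference
shift = indicator_mean(0.011) - indicator_mean(0.0)
print(f"implied indicator-mean shift: {shift:.4f}")
```

With a loading of 0.8 the implied shift is about 0.009 on the indicator scale, which puts the abstract factor-mean difference into a metric one can judge for substantive importance.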
