I was running a series of model estimates as part of my Ph.D. dissertation, and because of markedly skewed responses (teachers' estimates of mastery of 42 classroom-observable behaviors related to learning for kindergarten and 1st grade) I ran the models under three estimation methods: the default ML, the ML allowing missing values, and the WLSMV assuming ordered categorical data.
One thing I noted was that I obtained relatively similar values for all fit indices under the ML and ML-missing, but under the WLSMV method the CFI (and TLI, if I remember correctly) differed considerably from the values obtained under the other two methods (mean difference about 0.15), while the values of RMSEA and SRMR were very similar to those obtained under the other two methods.
In my evaluation of the results I lean towards giving the results obtained under WLSMV more weight because of the skewed data (in first grade, responses to several items are limited to the two highest points on a four-point scale), but reading this section over last night I was uncertain how to explain this difference in the behavior of the fit indexes under different estimation methods to my committee members.
Is there a theoretical rationale for this different behavior of the fit indexes, or are there empirical results that support preferring the results obtained under the WLSMV over the other two in my evaluation?
Or is the response 'because of the limits (skewness) of the data' simply a good enough response?
Sig Skulason Educational Testing Institute Reykjavik Iceland
The fit indices given with WLSMV correspond to fitting the model to different sample statistics than those used with the ML estimators. Because of this, the fit indices can be quite different, particularly with very skewed variables, where we know that sample Pearson correlations are attenuated relative to polychoric correlations. I don't know that this has been written about.
Given the skewness of your data and the fact that the variables are not continuous, WLSMV seems more appropriate.
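The attenuation mentioned above can be illustrated with a small simulation. This is only a rough sketch under assumed values, not part of the original analysis: the latent correlation of 0.6, the sample size, and the thresholds (chosen so that most responses fall in the two highest of four categories) are all hypothetical. It draws two correlated normal variables, cuts them into ordered categories, and compares the Pearson correlation before and after categorization.

```python
import numpy as np


def simulate_attenuation(rho=0.6, n=100_000, thresholds=(-2.5, -2.0, -0.5), seed=0):
    """Compare Pearson correlations before and after coarse, skewed categorization.

    All default values are illustrative assumptions, not estimates from real data.
    """
    rng = np.random.default_rng(seed)
    cov = [[1.0, rho], [rho, 1.0]]
    # Two continuous latent response variables with correlation rho.
    latent = rng.multivariate_normal([0.0, 0.0], cov, size=n)
    # Cut each latent score into ordered categories 0..len(thresholds).
    # The thresholds sit far to the left, so the top categories dominate.
    observed = np.digitize(latent, thresholds)
    pearson_latent = np.corrcoef(latent[:, 0], latent[:, 1])[0, 1]
    pearson_observed = np.corrcoef(observed[:, 0], observed[:, 1])[0, 1]
    return pearson_latent, pearson_observed


if __name__ == "__main__":
    r_latent, r_observed = simulate_attenuation()
    print(f"Pearson r on latent continuous scores: {r_latent:.3f}")
    print(f"Pearson r on skewed 4-category items:  {r_observed:.3f}")
```

The Pearson correlation of the categorized items should come out noticeably below the latent correlation. A polychoric correlation aims to recover the latent value, which is one reason fit indices based on the two kinds of sample statistics can diverge.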
Bengt, I have the same issue. So my first question is: since this post, has anyone written on this?
Second, I have 1-6 Likert scale data from a teacher effectiveness questionnaire. How much skewness is too much to use ML? I know there is a Psych Methods article, I think 97, that talks about skew and kurtosis levels of over 3 and 8 being severely non-normal.
bmuthen posted on Sunday, March 05, 2006 - 3:25 pm
Two articles in British Journal of Mathematical and Statistical Psychology by Muthen & Kaplan are relevant - see our web page under References, Categorical Outcomes, SEM.
Valeriana posted on Wednesday, March 08, 2006 - 5:58 pm
Kline (2005) stated that for WLSM, WLSMV and DWLS not all of the indexes of model fit used in ML estimation are available. If I use one of these three methods, which indexes should I use? And how should they be interpreted?
I am not familiar with the Kline article. Mplus provides chi-square, RMSEA, TLI, CFI, and SRMR for categorical outcomes. We do not provide BIC, AIC, etc. because these are maximum likelihood based. See the dissertation by Yu on the website, where the fit measures mentioned earlier have been studied for categorical outcomes.