Polychoric correlations and ULS
Mplus Discussion > Confirmatory Factor Analysis >
 Leanne Magee posted on Wednesday, October 06, 2004 - 1:31 pm
I am attempting to conduct confirmatory factor analyses using AMOS on data collected with a 5-point scale that show neither univariate nor multivariate normality. Realizing that AMOS is not sufficient for these analyses, we considered Mplus. However, my sample size is too small for the weighted least squares (WLS) categorical methods in Mplus, and the methods for continuous data are inappropriate because of the level of measurement of the item responses. We have considered fitting the model using polychoric correlations and unweighted least squares (ULS) in Mplus, because ULS might do better with a small sample than the otherwise preferable WLS methods. What would you suggest we do?
 Linda K. Muthen posted on Thursday, October 07, 2004 - 4:44 am
I don't know how small your sample is, but the WLSMV estimator has been shown to work well in small samples for some models. You can request the following reference from burnett@gseis.ucla.edu:

Muthén, B., du Toit, S.H.C. & Spisic, D. (1997). Robust inference using weighted least squares and quadratic estimating equations in latent variable modeling with categorical and continuous outcomes. Accepted for publication in Psychometrika. (#75)
 Leanne Magee posted on Thursday, October 07, 2004 - 10:00 am
Thank you for your prompt response. As it turns out, I am conducting the CFA in three samples: one has 110 participants, another 55, and the third 31. I have requested the article by contacting the given email address, but wanted to know if you had any opinion regarding the actual sizes of my samples before I am able to read the article. Thank you!
 Linda K. Muthen posted on Tuesday, October 12, 2004 - 4:36 pm
With these very small samples, ULS is most likely the best approach.
 craig neumann posted on Friday, May 04, 2007 - 11:38 am
When using ordinal items in CFA models (samples >= 250), it seems that a best practice would be to use the raw items and the WLSMV estimation procedure. However, I have seen some investigators use a polychoric correlation matrix as the data input and the ML estimation procedure. While I assume the two methods should produce very similar results, shouldn't the former approach produce more precise model results? Any references on this topic would be appreciated.
 Linda K. Muthen posted on Friday, May 04, 2007 - 2:53 pm
If you use maximum likelihood with a polychoric correlation matrix, you will obtain consistent parameter estimates, but the standard errors and chi-square will not be correct. It is often the case that polychoric correlation matrices are not positive definite.
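Positive definiteness matters here because polychoric correlations are estimated one pair of items at a time, so nothing guarantees that the assembled matrix is a valid correlation matrix. A minimal sketch of such a check, assuming a plain Cholesky attempt is an acceptable test; the function name and both matrices are hypothetical illustrations, not values from this thread:

```python
import math

def is_positive_definite(m):
    """Return True if the symmetric matrix m admits a Cholesky factorization."""
    n = len(m)
    chol = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(i + 1):
            # Subtract the contribution of the columns already factored.
            s = m[i][j] - sum(chol[i][k] * chol[j][k] for k in range(j))
            if i == j:
                if s <= 0.0:
                    return False  # a non-positive pivot means the matrix is not PD
                chol[i][j] = math.sqrt(s)
            else:
                chol[i][j] = s / chol[j][j]
    return True

# Hypothetical pairwise estimates that contradict each other:
# items 1-2 and 1-3 correlate strongly positively, yet 2-3 strongly negatively.
inconsistent = [[1.0, 0.9, 0.9],
                [0.9, 1.0, -0.9],
                [0.9, -0.9, 1.0]]

# Hypothetical, mutually consistent estimates.
consistent = [[1.0, 0.5, 0.5],
              [0.5, 1.0, 0.5],
              [0.5, 0.5, 1.0]]

print(is_positive_definite(inconsistent))  # False
print(is_positive_definite(consistent))    # True
```

Each entry of the first matrix is a legitimate correlation on its own; the matrix fails only because the three values cannot hold jointly.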
 craig neumann posted on Friday, May 04, 2007 - 5:12 pm
Thank you Linda. And so, you'd recommend using raw items as input with WLSMV as a better approach than ML?
 Linda K. Muthen posted on Friday, May 04, 2007 - 5:19 pm
Yes. Or maximum likelihood with raw data. Mplus has both estimators for categorical outcomes.
 Tracy Witte posted on Tuesday, May 27, 2008 - 8:17 am
I am conducting an EFA with 10 categorical indicators (some binary, some with 5 categories) on a sample of 1,085. The first model I ran used the ULS estimator, and I obtained a 2-factor solution that seemed quite interpretable and made sense in terms of previous work. After doing some more reading, I discovered that WLSMV is considered to be a better estimator. When I ran the analysis using WLSMV, I obtained a different solution, and one that is less interpretable/useful. I still obtained a 2-factor solution; it's just that I have more items loading on both factors and, overall, a less clear picture of how the items hang together. Am I justified in using ULS? Why would the solutions be so different from one another?
 Linda K. Muthen posted on Tuesday, May 27, 2008 - 9:03 am
I don't think the two analyses should be that different. Please send the two outputs and your license number to support@statmodel.com.
 Lim Jie Xin posted on Thursday, July 17, 2014 - 3:19 am
Dear Muthen,

Referring to your previous post (dated May 04, 2007) regarding FIML and Polychoric correlation, I am interested in the non-linear CFA (e.g. Example 5.7 in the manual) with categorical data.
I understand that LMS uses FIML. Does declaring the data as categorical produce inaccurate SEs as well?
 Linda K. Muthen posted on Thursday, July 17, 2014 - 8:13 am
No.
 Brett Holfeld posted on Thursday, July 17, 2014 - 9:57 am
I have two latent constructs, bullying and victimization, that are composed of four binary indicators (coded 0, 1) that were measured at 2 time points with a sample of over 700. I ran a CFA for each construct at each time point, and tested factorial invariance across time for each construct. However, my model fit statistics in many cases appear too good to be true (e.g., CFI 1.000 and RMSEA 0 or very close to 0). I used WLSMV as the estimator because of the categorical nature of the indicators. I just read that the polychoric correlation matrix should be included as the data input. However, I am not sure what this is, how to include it (syntax), and how it may influence the models. Any advice or suggestions would be greatly appreciated!

Brett
 Linda K. Muthen posted on Thursday, July 17, 2014 - 11:30 am
If you have ordinal variables, Mplus analyzes a polychoric correlation matrix. You do not need to provide this. Mplus uses the raw data to compute it.
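For binary items, the polychoric correlation reduces to the tetrachoric correlation. As a rough illustration of what computing it from raw data involves, the sketch below estimates it from a 2x2 table of counts by finding the correlation of an underlying bivariate normal that reproduces the observed cell proportion. This is not the Mplus implementation: the function names, the midpoint-rule integration, the bisection search, and the counts are all hypothetical choices made for the example.

```python
import math
from statistics import NormalDist

def bvn_cdf(h, k, rho, steps=2000):
    """P(X <= h, Y <= k) for a standard bivariate normal with correlation rho.

    Uses Phi2(h, k; rho) = Phi(h) * Phi(k) + integral from 0 to rho of the
    bivariate normal density at (h, k), evaluated with the midpoint rule.
    """
    nd = NormalDist()
    dr = rho / steps
    total = 0.0
    for i in range(steps):
        r = (i + 0.5) * dr
        q = 1.0 - r * r
        total += math.exp(-(h * h - 2 * r * h * k + k * k) / (2 * q)) / (
            2 * math.pi * math.sqrt(q))
    return nd.cdf(h) * nd.cdf(k) + total * dr

def tetrachoric(a, b, c, d, tol=1e-6):
    """Tetrachoric correlation for counts a=(0,0), b=(0,1), c=(1,0), d=(1,1).

    Assumes every margin is non-empty, so both thresholds are finite.
    """
    n = a + b + c + d
    nd = NormalDist()
    h = nd.inv_cdf((a + b) / n)  # threshold for the first item
    k = nd.inv_cdf((a + c) / n)  # threshold for the second item
    target = a / n               # observed proportion in the (0,0) cell
    lo, hi = -0.999, 0.999
    while hi - lo > tol:         # bvn_cdf is increasing in rho, so bisect
        mid = (lo + hi) / 2
        if bvn_cdf(h, k, mid) < target:
            lo = mid
        else:
            hi = mid
    return (lo + hi) / 2

def phi_coefficient(a, b, c, d):
    """Pearson correlation computed directly on the 0/1 codes."""
    return (a * d - b * c) / math.sqrt((a + b) * (c + d) * (a + c) * (b + d))

# Hypothetical counts for two binary items.
print(round(phi_coefficient(200, 100, 100, 200), 3))
print(round(tetrachoric(200, 100, 100, 200), 3))
```

With these made-up counts the Pearson correlation on the 0/1 codes is 1/3, while the tetrachoric estimate is close to 0.5, which shows why treating the items as categorical changes the matrix being analyzed.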
 Brett Holfeld posted on Thursday, July 17, 2014 - 12:16 pm
Thanks for the quick response, Linda! So this polychoric correlation matrix is produced automatically if the indicators are labeled as categorical and an appropriate estimator is used (e.g., WLSMV)? Do you have any idea why my model fit indices appear to be so good? For example, here are the model fit indices for the victimization construct at one time point (similar results were found at the second time point):

Number of Free Parameters 21

Chi-Square Test of Model Fit

Value 8.763*
Degrees of Freedom 15
P-Value 0.8896

* The chi-square value for MLM, MLMV, MLR, ULSMV, WLSM and WLSMV cannot be used for chi-square difference testing in the regular way. MLM, MLR and WLSM chi-square difference testing is described on the Mplus website. MLMV, WLSMV, and ULSMV difference testing is done using the DIFFTEST option.

RMSEA (Root Mean Square Error Of Approximation)

Estimate 0.000
90 Percent C.I. 0.000 0.016
Probability RMSEA <= .05 1.000

CFI/TLI

CFI 1.000
TLI 1.012
 Linda K. Muthen posted on Thursday, July 17, 2014 - 12:36 pm
Sometimes overly good fit arises because the correlations are low, which makes it difficult to reject the model.
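The boundary values in the output above also follow from how these indices are usually defined: when the chi-square falls below its degrees of freedom (8.763 against 15), the estimated misfit is truncated at zero, so RMSEA is 0 and CFI is 1 regardless of sample size, and TLI exceeds 1. A minimal sketch using common textbook formulas; programs differ in details such as N versus N - 1, the sample size of 700 stands in for "over 700", and the baseline-model values are hypothetical because they are not reported in the thread:

```python
import math

def rmsea(chi2, df, n):
    """RMSEA from a model chi-square, its degrees of freedom, and sample size n."""
    return math.sqrt(max(chi2 - df, 0.0) / (df * (n - 1)))

def cfi(chi2, df, chi2_base, df_base):
    """CFI from the fitted model and a baseline (independence) model."""
    misfit = max(chi2 - df, 0.0)
    base_misfit = max(chi2_base - df_base, misfit, 0.0)
    return 1.0 if base_misfit == 0.0 else 1.0 - misfit / base_misfit

def tli(chi2, df, chi2_base, df_base):
    """TLI, which is not truncated and can therefore exceed 1."""
    ratio_base = chi2_base / df_base
    return (ratio_base - chi2 / df) / (ratio_base - 1.0)

chi2, df = 8.763, 15               # values reported in the output above
n = 700                            # placeholder for "over 700"
chi2_base, df_base = 200.0, 21     # hypothetical baseline values, not from the thread

print(rmsea(chi2, df, n))                       # 0.0
print(cfi(chi2, df, chi2_base, df_base))        # 1.0
print(tli(chi2, df, chi2_base, df_base) > 1.0)  # True
```

Changing `n` or the baseline values does not move RMSEA or CFI off these limits as long as the chi-square stays below its degrees of freedom, which is why such values say little about how strongly the data could have contradicted the model.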
 Brett Holfeld posted on Thursday, July 17, 2014 - 1:21 pm
Okay. Do you think it would be appropriate to publish findings with fit statistics like these? Thanks!

Brett
 Linda K. Muthen posted on Friday, July 18, 2014 - 11:28 am
If you explain why the fit is so good.
 Brett Holfeld posted on Friday, July 18, 2014 - 12:39 pm
Thanks!