

WLS vs MLR in EFA w ordinal data 


Joel Nigg posted on Tuesday, January 13, 2009  10:34 am



I am conducting EFA with ordinal data (0, 1, 2) using Mplus 5.1 (48 items, N = 700). The default estimator (WLSMV) gives decent results in a few minutes. The data, however, are not multivariate normal, so I am interested in seeing an ML solution. The manual seems to indicate that MLR is an option. However, when I run estimator=MLR, (a) I got complete results for 2-3 factors, though it took a long time to run; (b) when I ran 4-6 factors, the program ran for 12 hours without completing. Is this too much computation? The data are clustered and have missing values. Should MLR in principle be possible here, with something else being wrong, or is MLR not doable in this situation? Thanks, Joel


The fact that your ordinal outcomes may have floor or ceiling effects does not mean that you should use maximum likelihood rather than weighted least squares estimation. With categorical outcomes, both weighted least squares and maximum likelihood are able to deal with this. With maximum likelihood estimation and categorical outcomes, each factor is one dimension of integration. We recommend not going over four. Weighted least squares is recommended when you have many factors and not so many factor indicators. Maximum likelihood is recommended when you have few factors and many factor indicators. Results should not differ. 
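To make the "one dimension of integration per factor" point concrete, here is a rough sketch of how a full product quadrature grid grows with the number of factors. It assumes 15 integration points per dimension (believed to be the Mplus default for standard integration, but treat that figure as an assumption), and `grid_points` is a made-up helper name, not anything from Mplus.

```python
def grid_points(n_factors, points_per_dim=15):
    """Size of a full product quadrature grid.

    points_per_dim=15 is an assumed setting for illustration only.
    """
    return points_per_dim ** n_factors


# Each extra factor multiplies the work per likelihood evaluation
# by the number of points per dimension.
for f in range(1, 7):
    print(f"{f} factor(s): {grid_points(f):>12,} grid points")
```

Under these assumptions, every likelihood evaluation visits each grid point for each observation, which is consistent with a 4-6 factor run becoming impractical while WLSMV finishes in minutes.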

Joel Nigg posted on Tuesday, January 13, 2009  3:28 pm



Linda: Very helpful as always, thanks Joel 


Hello Linda, I'd like to conduct EFA and CFA on ordinal data, comparing the WLSMV and MLR estimators (specifying the data as categorical).
1. Would you know of any references that have already compared the two on ordinal data?
2. When the data are specified as categorical, the MLR estimator is based on polychoric correlations, isn't it?
3. Unfortunately, I had to collapse five-point scales into three-point scales in order to avoid bivariate zero counts. Would you say that using MLR with such scales is pushing it a bit too far?
Thanks very much for any insight! Dorothee


1. There is a literature comparing WLSMV treating outcomes as categorical with ML treating outcomes as continuous, but it sounds like that's not what you are considering. You are considering ML (or MLR) treating outcomes as categorical. Much less is written on that. Offhand, I can only think of an EFA paper by Mislevy, but that's for binary rather than ordinal outcomes and looks at only one real data set. Other Mplus Discussion readers might know of work in this area.
2. No and yes. With MLR, it is not true that polychoric correlations are computed and the model fitted to them. Raw data are used directly in the model fitting. But with probit links the assumptions are the same as those behind polychorics: if you assume a normal factor and probit links, you get normality for the underlying continuous latent response variables for the categorical items, just like with polychorics. So in this way, IRT assumptions are the same as polychoric assumptions.
3. MLR treating outcomes as ordinal works fine with only 3 categories.
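The equivalence in point 2 can be checked numerically. The sketch below is only an illustration for a single three-category item with a made-up loading (0.8) and made-up thresholds (-0.5, 1.0), assuming a standard normal factor and a unit-variance normal residual; the function names are hypothetical. It integrates the probit item response function over the factor and compares the result with the cumulative probabilities implied by a normal latent response variable cut at the same thresholds.

```python
import math


def phi_cdf(x):
    """Standard normal CDF."""
    return 0.5 * (1.0 + math.erf(x / math.sqrt(2.0)))


def normal_pdf(x):
    """Standard normal density."""
    return math.exp(-0.5 * x * x) / math.sqrt(2.0 * math.pi)


def cum_probs_probit_irt(loading, thresholds, n_steps=4000, limit=8.0):
    """P(u <= k) by integrating Phi(tau_k - loading*eta) over eta ~ N(0, 1).

    Uses a plain trapezoid rule on [-limit, limit].
    """
    step = 2.0 * limit / n_steps
    result = []
    for tau in thresholds:
        total = 0.0
        for i in range(n_steps + 1):
            eta = -limit + i * step
            weight = 0.5 if i in (0, n_steps) else 1.0
            total += weight * phi_cdf(tau - loading * eta) * normal_pdf(eta) * step
        result.append(total)
    return result


def cum_probs_latent_response(loading, thresholds):
    """P(u <= k) when u* = loading*eta + e is normal with variance loading^2 + 1."""
    sd = math.sqrt(loading ** 2 + 1.0)
    return [phi_cdf(tau / sd) for tau in thresholds]


# Hypothetical values, chosen only for illustration.
irt = cum_probs_probit_irt(0.8, [-0.5, 1.0])
lrv = cum_probs_latent_response(0.8, [-0.5, 1.0])
print(irt, lrv)
```

The two sets of cumulative probabilities agree up to integration error, which is the sense in which a normal factor with probit links carries the same distributional assumption as polychorics. It does not show that the two estimators give identical estimates in finite samples.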


Hello, Continuing the comparison between WLSMV and MLR, I am not really sure whether what I did was correct. As written previously, I ran both WLSMV- and MLR-based EFA on the same ordinal data (specified as CATEGORICAL) and got similar factor structures and fit conclusions. However, I read in another post that, by default, WLSMV uses probit regression while MLR uses logistic regression.
1. Am I correct in thinking that I can still compare the results of both methods (used with their default features) and have more confidence in the results if both methods agree?
2. Would you mind explaining a bit further why, in MLR on categorical data, the "Raw data is used directly in the model fitting"? I thought that in factor analysis, even with ML estimation, the fitting function tries to minimize the discrepancy between the sample and the estimated covariance (or correlation) matrices. I apologize if I grossly misinterpreted my readings!
3. In another post, we were advised to look at "technical appendix 8" to better understand the MLR method. However, the appendix order seems to have changed. Could you please specify the name of the appendix about MLR?
Thank you so much for the time you take to answer this forum!


1. You can compare the two methods using the probit and logit results by looking at the ratio of the parameter estimate to the standard error of the parameter estimate (column 3 of the Mplus results). You could also use LINK=PROBIT; in the ANALYSIS command with MLR and obtain probit results for maximum likelihood.
2. With continuous outcomes and maximum likelihood, means, variances, and covariances are sufficient statistics for model estimation. With categorical outcomes and maximum likelihood, this is not the case. Raw data are needed.
3. If you go to the first Click here under Technical Appendices, you will find Technical Appendix 8. See specifically formulas 168 and 170.
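As background for point 1, the sketch below illustrates why raw logit and probit coefficients are on different scales even when they describe nearly the same curve, which is why comparing the estimate-to-standard-error ratio (or switching to LINK=PROBIT) is the safer comparison. The scaling constant 1.702 is the usual textbook approximation and is an assumption here; it is not taken from the thread or from Mplus.

```python
import math


def probit_cdf(x):
    """Standard normal CDF (probit link)."""
    return 0.5 * (1.0 + math.erf(x / math.sqrt(2.0)))


def logistic_cdf(x):
    """Standard logistic CDF (logit link)."""
    return 1.0 / (1.0 + math.exp(-x))


SCALE = 1.702  # assumed textbook constant relating logit and probit scales

# Largest gap between the probit curve and the rescaled logistic curve
# over a grid from -6 to 6 in steps of 0.01.
grid = [i / 100.0 for i in range(-600, 601)]
max_gap = max(abs(probit_cdf(x) - logistic_cdf(SCALE * x)) for x in grid)
print(f"largest difference in probability: {max_gap:.4f}")
```

Because the two curves nearly coincide after rescaling, a logit coefficient is roughly 1.7 times the corresponding probit coefficient, and its standard error scales by about the same factor, so the ratio in column 3 stays roughly comparable across the two links.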


On 1/13/2009, Linda stated that results should not differ between MLR and WLSMV estimation, yet WLSMV is the default estimator for categorical outcomes in Mplus. Just to make sure I understand, is this because WLSMV estimation is more efficient than MLR, or are there situations in which the WLSMV results are more trustworthy as well? Thanks, Jeff 


With categorical outcomes and continuous latent variables, maximum likelihood requires numerical integration which is more computationally demanding than weighted least squares. This is why weighted least squares is the default for categorical outcomes. The results should be equally trustworthy. 


Hi, I am trying to run an IRT model in which I have two continuous latent variables with binary loadings. My model looks like

F1 BY fa1 fa2 fa3 fa4 fa5 fa6 fa7
F2 BY d1 d2 d4 d4 d5
F1 on V X1
V on F1 F2 X3
F2 on X2

where V is a binary observed variable and X1, X2 and X3 are sets of observed variables. I have two problems:

1) When I run the model with the default probit WLSMV, I get a positive parameter for V on F1. However, when I run the model with MLR for logit (estimator=MLR) (logistic odds ratio is 0.860) and probit (estimator=MLR; link=probit;), I get a negative parameter for V on F1. I would be grateful if you could explain to me why WLSMV and MLR are giving results with different signs.

2) I created an interaction variable between F1 and F2 and tried to estimate a model which looks like

V on F1 F2 INT

However, I am receiving the following warning: "ONE OR MORE PARAMETERS WERE FIXED TO AVOID SINGULARITY OF THE INFORMATION MATRIX. THE SINGULARITY IS MOST LIKELY BECAUSE THE MODEL IS NOT IDENTIFIED, OR BECAUSE OF EMPTY CELLS IN THE JOINT DISTRIBUTION OF THE CATEGORICAL VARIABLES IN THE MODEL. THE FOLLOWING PARAMETERS WERE FIXED: 14 17 18" As far as I know, my model was overidentified before the inclusion of the interaction variable. Do you have any suggestions for solving this problem? Thanks, Okan


We need to see these runs to diagnose it. Please send all 3 outputs that you mention to support@statmodel.com. 


