

Dear Linda or Bengt Muthen, I am trying to describe and illustrate current similarities and differences between binary CFA and IRT for my thesis. The default estimation method in Mplus for categorical CFA is WLSMV. To run an IRT model, the example in your manual suggests using MLR as the estimation method. When I use MLR, is the data input still the tetrachoric correlation matrix, or is the original response data matrix used? Many thanks for your help. Best wishes, Eveline Gebhardt


I don't think there is a difference between CFA of categorical variables and IRT. It is sometimes claimed but I don't agree. Which estimator is typically used may differ, but that's not essential. MLR uses the raw data, not a sample tetrachoric correlation matrix. 


Thank you for your quick response. I agree, the differences do not seem fundamental. They are more a matter of tradition and they are disappearing, which doesn't make them easier to describe. Some people say the difference is where the marginalization occurs. I'm in the process of understanding this. Since ML(R) uses the raw response data, is the marginalization done on the observed categories (instead of the latent response variables)? And is the integration over the number of factors (instead of the number of items)? Can I ask for a reference to ML and MLR estimation of categorical CFA in Mplus? Thank you again.


The ML(R) approach is the same as the "marginal ML (MML)" approach described in e.g. Bock's work. It uses the raw data and integrates over the factors using numerical integration. MML is contrasted with "conditional ML", used e.g. with Rasch approaches. Assuming normal factors, probit (normal ogive) item-factor relations, and conditional independence, the assumptions are the same for ML and for WLSMV, where the latter uses tetrachorics. This is because those assumptions correspond to assuming multivariate normal underlying continuous latent response variables behind the categorical outcomes. So WLSMV only uses 1st- and 2nd-order information, whereas ML goes all the way up to the highest order. The loss of info appears small, however. ML doesn't fit the model to these sample tetrachorics, so perhaps one can say that WLSMV marginalizes in a different way. It's a matter of estimator differences rather than model differences. We have an IRT note on our web site: http://www.statmodel.com/download/MplusIRT2.pdf but again, the ML(R) approach is nothing different from what's used in IRT MML.
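To make "integrating over the factors using numerical integration" concrete, here is a minimal Python sketch of a marginal response-pattern probability for one normal factor with probit item-factor relations. It is only an illustration, not the Mplus implementation: the function names, the Gauss-Hermite rule, the number of nodes, and the slope and threshold values are all assumptions chosen for the example.

```python
import itertools
import math

import numpy as np


def normal_cdf(z):
    """Standard normal CDF (the probit / normal ogive link)."""
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))


def pattern_probability(pattern, slopes, thresholds, n_nodes=41):
    """Marginal probability of one 0/1 response pattern.

    Assumes a single factor eta ~ N(0, 1), conditional independence, and
    P(u_j = 1 | eta) = Phi(slope_j * eta - threshold_j).
    The integral over eta is approximated by Gauss-Hermite quadrature.
    """
    # Nodes/weights for the weight function exp(-x^2 / 2); dividing by
    # sqrt(2*pi) turns the weights into standard normal probabilities.
    nodes, weights = np.polynomial.hermite_e.hermegauss(n_nodes)
    weights = weights / math.sqrt(2.0 * math.pi)

    total = 0.0
    for eta, w in zip(nodes, weights):
        conditional = 1.0
        for u, a, tau in zip(pattern, slopes, thresholds):
            p = normal_cdf(a * eta - tau)
            conditional *= p if u == 1 else 1.0 - p
        total += w * conditional
    return total


# Hypothetical parameter values, for illustration only.
slopes = [0.75, 0.75, 0.75]
thresholds = [-0.5, 0.0, 0.5]

all_patterns = list(itertools.product([0, 1], repeat=len(slopes)))
pattern_probs = {
    u: pattern_probability(u, slopes, thresholds) for u in all_patterns
}
```

The log-likelihood that an MML estimator maximizes is the sum, over respondents, of the log of these pattern probabilities. Note that the integral is one-dimensional here because there is one factor, regardless of the number of items.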


Thank you, that is very helpful just like many other subjects on the discussion board. I think the pieces are falling into place now. One very final question. I have done a simple simulation study. I've generated 10 data sets with 10 items and one factor. I have analyzed these with ACER's ConQuest software for Rasch models (which also applies MML/EM). I'm able to transform the Rasch parameters into what I call typical IRT parameters (by transforming the variance of the latent factor to 1 and using the logit/probit approximation). Then I can transform these typical difficulties and discrimination into thresholds and a loading estimate (loadings are constrained to be equal). The loading, thresholds, typical IRT discrimination and difficulties are very close to the Mplus results using the WLSMV. The IRT parameters are almost identical when I use ML(R). However, the loading and thresholds are about 2.25 times larger using ML(R) than using WLSMV. The variance of the factor is 1 in both instances, but the loading in the ML(R) case is larger than 1. (I'm using Mplus 5.) Eveline 


I am confused by your second to last paragraph. First you say that the IRT par's are close to WLSMV. Then you say the IRT par's are almost the same as ML. And then you seem to contradict the first 2 sentences by saying that WLSMV is different from ML. Please clarify. Also, please upgrade to 6.12. 


Sorry for the confusion. Here is a more structured description: From ConQuest estimates, I compute (1) typical IRT parameter estimates (discrimination and difficulties) and (2) CFA parameter estimates (loading and thresholds). When I use WLSMV in Mplus, both (1) and (2) from the Mplus output are very close to my transformed ConQuest estimates. When I use MLR in Mplus, (1) is identical to my transformed ConQuest estimates, but (2) is about 2.25 times larger. The loading is bigger than 1 (about 1.4), while it is about 0.6 when I use WLSMV and ConQuest. The thresholds differ by the same factor. The variance of the factor is fixed to 1 and the loadings are constrained to be equal in all instances above. I see now that lambda(MLR)=1.7*alpha(MLR), so it has something to do with the logit-probit approximation. I think I can work it out and I'll check results with Mplus 6.12 first. Thanks again for your help. Eveline


Thanks for the clarification. WLSMV uses a probit link, so to match that in Mplus ML(R) you need to say LINK = PROBIT; in the ANALYSIS command because the ML(R) default is logit link. The 1.7 constant is only used when printing the IRT translation (below the regular estimates). Hope that helps. 
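As a rough numerical check on the role of the 1.7 constant, the Python sketch below compares the logistic curve with slope 1.7 against the normal ogive, and converts a logit-scale loading and threshold into a discrimination and difficulty. The conversion assumes a factor mean of 0 and variance of 1 and follows the relation lambda = 1.7*alpha reported earlier in this thread; the function names and the example values are hypothetical.

```python
import math

D = 1.7  # scaling constant from the logit-probit approximation


def logistic(z):
    """Logistic CDF (logit link)."""
    return 1.0 / (1.0 + math.exp(-z))


def normal_cdf(z):
    """Standard normal CDF (probit link)."""
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))


# Largest gap between logistic(D * z) and Phi(z) on a fine grid.
grid = [i / 100.0 for i in range(-500, 501)]
max_gap = max(abs(logistic(D * z) - normal_cdf(z)) for z in grid)


def logit_to_irt(loading, threshold):
    """Convert a logit-scale loading/threshold to (discrimination, difficulty).

    Assumes factor mean 0 and variance 1, so that
    P(u = 1 | theta) = logistic(D * a * (theta - b)).
    """
    discrimination = loading / D
    difficulty = threshold / loading
    return discrimination, difficulty


# Hypothetical logit-scale estimates, for illustration only.
a, b = logit_to_irt(loading=1.4, threshold=0.7)
```

On this grid the two curves stay within about 0.01 of each other, which is why logit-scale loadings and thresholds come out roughly 1.7 times larger than probit-scale ones for the same underlying item curve.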


Given the above, is it an arbitrary choice per se then whether one uses WLSMV or MLR to estimate a binary IRT (CDFA) model? My inclination is to always use the full-information method, but I am curious why then the default would be WLSMV (limited-information) in Mplus (7.11); perhaps this is because it is easier to compute, particularly with more complex models? My question is also prompted by the observation that one of my indicators behaves somewhat erratically, in that the sign of both its location and discrimination parameters changes depending on the estimator (MLR or WLSMV). In a situation where all items are low base rate, is one estimator more advantageous than another?


You have some answers in the new FAQ "Estimator choices with categorical outcomes". With low base rate the full-information estimators of ML and Bayes may have an advantage over WLSMV, particularly because low base rate tends to go together with zero cells (see FAQ).


Wow this is great. Thank you! 
