The background: In the context of some work I did comparing pseudo-R-squared measures for ordinal response models (e.g., ordinal logit or probit), a reviewer mentioned casually that an alternative to the existing measures could easily be implemented by using polychoric correlations and WLS in an SEM framework, and then *somehow* obtaining some kind of R-squared. I'm wondering how to get such an R-squared.
Here's more detail: I have data available for a manifest ordinal variable Y, along with a covariate vector of manifest Xs, all continuous or dummies. Assuming that this Y is a categorized version of a latent continuous Y*, I want to use an SEM approach to generate a pseudo R-squared measure that would estimate the "true" R-squared that would occur if Y* were regressed on the X vector. I'm not even sure here whether what I want is best conceptualized as the R-squared for the latent Y* regressed on the manifest X vector, or, more in line with the nature of polychoric correlations, the R-squared for the latent Y* regressed on the vector of latent Xs underlying my manifest Xs.
Since I am doing this for a methodological investigation, I in fact have the actual data for the continuous Y* and the associated Xs, so I know the true R-squared for the model with the underlying Y*. I have categorized Y* into various ordinal Ys, and now wish to compare an SEM-based R-squared to those true values, as well as to the various other existing pseudo-R-squared measures for non-SEM ordinal response models.
Is getting some kind of pseudo-R-squared like this possible in Mplus and/or other SEM implementations, and how might I do it?
Regards, Mike Lacy
bmuthen posted on Friday, September 02, 2005 - 3:43 pm
With an ordered categorical dependent variable Mplus gives you the R-square for the y*, using either a logit (with ML) or probit (with WLSMV) regression link. Note that Mplus does not force you to work with polychorics assuming underlying normal x* variables behind the x's - that would add unnecessarily strict assumptions.
This sounds very interesting. Since it seems many other SEM programs don't produce an R^2 for this kind of situation, let alone one based on reasonable assumptions, I'll be curious to understand how it is done. Can you point to where I should look in the documentation, or to a published source where I might get a better idea of how the R^2 is computed?
BMuthen posted on Monday, September 05, 2005 - 8:56 pm
You can look at the technical appendices on the website, specifically Appendix 1. Also, see the McKelvey and Zavoina article from 1975 in the Journal of Mathematical Sociology or the Snijders and Bosker multilevel book.
Hello, my SEM contains two binary dependent variables and the output reports the related R-squares (WLSMV estimator). Would you please be so kind as to explain what kind of R-square Mplus produces? Is it McFadden's adjusted R2, Cragg & Uhler's R2, Efron's R2, or something else? I tried to find explanations in the appendix on your website but failed. Simply having a name for the R-square in my paper would definitely be a plus in the review process. Thank you very much for your help. -stephan
For categorical outcomes, we use the underlying continuous latent response variable approach, also discussed in McKelvey & Zavoina (1975) in J of Math'l Soc. You also find this discussed in the Snijders & Bosker multilevel book. See also Formula 15 in Technical Appendix 1 and the reference given there.
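As a rough illustration of the latent response variable idea described above, the following sketch computes a McKelvey and Zavoina style R2 as the variance of the linear predictor divided by the implied total variance of y*. It is not Mplus's internal computation. The function name, the example values, and the assumption that the residual variance is fixed at 1 (probit) or pi^2/3 (logit) are illustrative assumptions.

```python
import math
from statistics import pvariance


def mckelvey_zavoina_r2(linear_predictor, link="probit"):
    """R2 for the latent y*: explained variance / (explained + residual).

    Assumes the residual variance of y* is fixed by the link:
    1 for probit, pi^2 / 3 for logit (hypothetical helper, not an Mplus API).
    """
    residual_variance = {"probit": 1.0, "logit": math.pi ** 2 / 3}[link]
    explained = pvariance(linear_predictor)  # variance of x'b across cases
    return explained / (explained + residual_variance)


# Hypothetical fitted linear predictors x'b for four cases.
xb = [-1.0, 0.0, 1.0, 2.0]
r2_probit = mckelvey_zavoina_r2(xb, link="probit")
r2_logit = mckelvey_zavoina_r2(xb, link="logit")
print(round(r2_probit, 4), round(r2_logit, 4))
```

The same linear predictor yields a smaller R2 under the logit link only because the assumed residual variance is larger; in practice the logit coefficients would also be on a larger scale.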
One related question about the R2. I have one dependent categorical variable and I am using the WLSMV estimator. However, in my model there is also a latent factor with 3 categorical indicators. The factor serves as an intervening variable that is explained by a continuous and a categorical variable. The path coefficients towards the latent factor are linear in this case. What about the R2 that is calculated for this latent factor? Is it the normal R2 or McKelvey-Zavoina? And what about the categorical indicators of this factor?
The R2 for the factor is the regular R2 for linear regression. The R2 for the categorical factor indicators with WLSMV is the R2 for the continuous latent response variables underlying the factor indicators (which is in line with McKelvey-Zavoina).
I suppose McFadden's is the most popular and the most suitable for my analysis. I know the formula, 1 - loglikelihood(estimated model)/loglikelihood(null model), but I am not sure how to run a one-level null model in this case. I read that the covariates should be in the model but with their slopes fixed at zero. This sounds simple, but I don't know how to do it because I am not so familiar with Mplus yet.
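The formula quoted above can be sketched outside Mplus once both loglikelihoods are available. The sketch below assumes a single-level model with one ordinal outcome estimated by ML, in which case the thresholds-only null model reproduces the observed category proportions and its loglikelihood can be computed from the category counts. The function names, the counts, and the fitted loglikelihood are hypothetical.

```python
import math


def null_loglik_from_counts(category_counts):
    """Loglikelihood of a thresholds-only model for one ordinal outcome.

    Assumes a single-level ML model, so the fitted category probabilities
    equal the observed proportions (hypothetical helper).
    """
    n = sum(category_counts)
    return sum(c * math.log(c / n) for c in category_counts if c > 0)


def mcfadden_r2(loglik_model, loglik_null):
    """McFadden's pseudo-R2: 1 - LL(estimated model) / LL(null model)."""
    return 1.0 - loglik_model / loglik_null


counts = [50, 30, 20]   # hypothetical counts for a 3-category outcome
ll_null = null_loglik_from_counts(counts)
ll_model = -90.0        # hypothetical loglikelihood of the estimated model
r2_mcfadden = mcfadden_r2(ll_model, ll_null)
print(round(ll_null, 3), round(r2_mcfadden, 4))
```

Both loglikelihoods must come from the same sample and the same outcome variable for the ratio to be meaningful.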
Thank you very much! Now I got McFadden's PseudoR2. However, my colleague said that we should have also Nagelkerke PseudoR2 for the article. Could you give me advice for a proper formula of Nagelkerke PseudoR2?
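For reference, the usual definition of Nagelkerke's pseudo-R2 rescales the Cox and Snell measure by its maximum attainable value, using the same two loglikelihoods as McFadden's measure plus the sample size. A minimal sketch, with hypothetical function names and input values:

```python
import math


def cox_snell_r2(loglik_model, loglik_null, n):
    """Cox and Snell pseudo-R2: 1 - exp(2 * (LL_null - LL_model) / n)."""
    return 1.0 - math.exp(2.0 * (loglik_null - loglik_model) / n)


def nagelkerke_r2(loglik_model, loglik_null, n):
    """Nagelkerke pseudo-R2: Cox-Snell divided by its maximum possible value."""
    max_cox_snell = 1.0 - math.exp(2.0 * loglik_null / n)
    return cox_snell_r2(loglik_model, loglik_null, n) / max_cox_snell


# Hypothetical loglikelihoods and sample size.
ll_null, ll_model, n_obs = -100.0, -80.0, 200
r2_cs = cox_snell_r2(ll_model, ll_null, n_obs)
r2_nk = nagelkerke_r2(ll_model, ll_null, n_obs)
print(round(r2_cs, 4), round(r2_nk, 4))
```

The rescaling is needed because the Cox and Snell measure cannot reach 1 for a categorical outcome, so the Nagelkerke value is always at least as large.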