Interpretation of R-square in categor...
Mplus Discussion > Categorical Data Modeling >
 Tor Neilands posted on Wednesday, November 03, 2004 - 2:19 pm

I am fitting the following model in Mplus:

Positive BY ppo rps ;
Negative BY npo ics as ;

PsyHlth BY avgsps bdisum1 pss1 psomsum1 ;

PsyHlth ON Positive Negative ;
Adhlt90 ON PsyHlth ;

All observed variables and latent variables are continuous, with the exception of the distal outcome Adhlt90, which is dichotomous. I am using the WLSMV estimator and theta parameterization to estimate this model.

I am curious as to what the interpretation of the R-square value reported for Adhlt90 is. Does that interpretation change if I use ML estimation or delta parameterization? What if Adhlt90 were a count or zero-inflated count outcome?

Thanks so much for your insights,

Tor Neilands
 bmuthen posted on Wednesday, November 03, 2004 - 6:03 pm
With WLSMV you get a probit regression for Adhlt90. With ML you get logit. In both cases the R-square refers to the proportion of explained variance in an underlying continuous latent response variable. For probit, this response variable has a conditionally normal density with unit variance given the covariates, while for logit it has a conditionally logistic density with variance pi^2/3. For probit, see Tech App 1 on our web site. This kind of R-square was discussed in Amemiya, and also in McKelvey & Zavoina (I think a 1975 Math Soc article); it is also discussed in Snijders and Bosker's multilevel book. I don't know what to say about the count question - do we give an R-square here?
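
The latent-response R-square described in this reply can be sketched numerically. The following is a minimal Python illustration, not Mplus output: it assumes the variance of the linear predictor is already known, and the function name latent_response_r2 is hypothetical. The residual variance is taken as 1 for probit and pi^2/3 for logit, as stated above.

```python
import math

def latent_response_r2(explained_variance, link="probit"):
    """Proportion of variance explained in the latent response y*.

    explained_variance: variance of the linear predictor (assumed known).
    link: "probit" (residual variance 1) or "logit" (residual variance pi^2/3).
    """
    residual = {"probit": 1.0, "logit": math.pi ** 2 / 3}[link]
    return explained_variance / (explained_variance + residual)

# The same explained variance gives a smaller R-square under logit,
# because the logistic residual variance (about 3.29) is larger than 1.
print(latent_response_r2(0.5, "probit"))
print(latent_response_r2(0.5, "logit"))
```

Because the residual variance is fixed by the link rather than estimated, this R-square depends only on how much variance the predictors contribute to y*.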
 Tor Neilands posted on Thursday, November 04, 2004 - 10:24 am
Thank you, Bengt. I will check out those references. I have Snijders & Bosker. Can you post the full citation for the Amemiya reference? I noticed two Amemiya citations in the Mplus appendices. One is an article; the other is a full textbook.

Regarding the R-square for count outcomes, Mplus does not produce an R-square on the output for count outcomes.

With best wishes,

 bmuthen  posted on Thursday, November 04, 2004 - 10:26 am
That's Amemiya (1981) - a good overview.
 Tor Neilands posted on Thursday, November 04, 2004 - 12:30 pm
Thank you, Bengt. I will get a copy.
 Lois Downey posted on Tuesday, March 25, 2008 - 3:55 pm
I'm running a path model with ordinal outcomes, using WLSMV. My understanding is that the R-squares provided in the output represent the estimated proportion of variance in the assumed underlying Y*s that is explained by the model. There is a column in the R-square table labeled "Scale Factors." What is the interpretation of those numbers?
 Linda K. Muthen posted on Tuesday, March 25, 2008 - 4:16 pm
You are correct about R-square. A scale factor is one divided by the standard deviation of the underlying latent response variable.
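
The definition of the scale factor above can be tied to the R-square with a little arithmetic. The Python sketch below is an illustration only: it assumes a residual variance of 1 for the latent response (as under the theta parameterization in a single-group model), and the function names are hypothetical rather than anything Mplus provides.

```python
import math

def scale_factor(y_star_variance):
    """Scale factor = 1 / SD of the latent response variable y*."""
    return 1.0 / math.sqrt(y_star_variance)

def r2_from_scale_factor(delta, residual_variance=1.0):
    """Recover R-square from a scale factor.

    Var(y*) = 1 / delta^2, so R-square = 1 - residual_variance * delta^2.
    The default residual variance of 1 is an assumption of this sketch.
    """
    return 1.0 - residual_variance * delta ** 2

explained = 0.5625              # variance of y* due to the predictors (made-up value)
total = explained + 1.0         # add the assumed unit residual variance
delta = scale_factor(total)     # 1 / 1.25
print(delta)
print(r2_from_scale_factor(delta))  # close to explained / total
```

Under these assumptions a scale factor near 1 goes with a small R-square, and a smaller scale factor goes with a larger one.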
 Heike B. posted on Friday, January 20, 2012 - 4:32 am
I have a path model containing an observed categorical mediator and observed categorical outcome variables. I see some indirect effects that are small but significant. However I worry about the R-squared of the mediator as it is very small.

1. Does the small R-squared somehow call into question the existence of my indirect paths?

2. Can I test the significance of the mediator's R-squared using an F-test? (The data were non-normal, and I used WLSMV.)

Thanks a lot in advance.
 Linda K. Muthen posted on Friday, January 20, 2012 - 9:10 am
I would look at significance of the indirect effect and not be concerned with R-square.
 Heike B. posted on Friday, January 20, 2012 - 2:19 pm
Thank you, Linda for the good news.

 Christoph Weber posted on Wednesday, April 25, 2012 - 5:11 am
Dear Dr. Muthen,
I want to compare the effect of an independent variable on a continuous and a dichotomous dependent variable (two different models).

Model 1:

Y on x;

Model 2:

categorical = y;
y On x;

Is it possible to compare the R-squared values?
Is there a transformation to make the R-squared for probit and ML (OLS) comparable?


Christoph Weber
 Linda K. Muthen posted on Wednesday, April 25, 2012 - 11:08 am
You should not compare the R-square values from a continuous and a categorical dependent variable. There is no such transformation.
 ri ri  posted on Wednesday, September 03, 2014 - 1:50 am
I have a question about the significance of R-square. I also have a categorical dependent variable. While using WLSMV, I set up a latent response variable. When I looked at the R-square's p-value, it was above .05 and thus not significant. The R-square of one of my continuous latent variables, which is a mediator, also has a nonsignificant p-value. What does this mean?

Could you help me to interpret this result?

Thank you!
 Bengt O. Muthen posted on Wednesday, September 03, 2014 - 3:16 pm
For general interpretation matters like these, you want to turn to SEMNET.
 ri ri  posted on Wednesday, September 03, 2014 - 11:46 pm
I was wondering, when interpreting explained variance, can the R-square value be directly interpreted as the variance explained by the predictors? I ask because when y is categorical, it is a probit regression.
 ri ri  posted on Thursday, September 04, 2014 - 4:47 am
I found the answer. You explained it in a previous post. Thank you!
 Chenqq posted on Saturday, May 12, 2018 - 6:05 am
Dear everyone, should R-Square be reported in multivariable regression in one paper?
 Bengt O. Muthen posted on Monday, May 14, 2018 - 4:27 pm
This question is suitable for SEMNET.
 Gaye Ildeniz posted on Saturday, July 06, 2019 - 11:42 am

I'm a little confused by the posts above.

I have a full SEM with all categorical indicators (Likert-type scales) for the latent variables, estimated using WLSMV.

1. So, can I use the R-square estimates for the DVs (latent) or not?
2. If yes: since all my variables are treated as categorical, can I compare these R-square values?

Many thanks.
 Bengt O. Muthen posted on Saturday, July 06, 2019 - 3:30 pm
1. The latent variables (factors) are continuous so using a regular R-square is fine.

2. Not sure if you are talking about the categorical indicators as DVs or if you are talking about comparing R-square across several different latent variables (factors). If the latter, no problem comparing.
 Gaye Ildeniz posted on Thursday, February 27, 2020 - 9:27 am
All items are on a Likert-type scale.
I'm running the model below:



HW BY dw1 dw2 dw3 dw4 dw6;
RET BY dw17 dw18 dw19 dw20 dw21 dw22;
EC BY dw9 dw10 dw11 dw12 dw13 dw14 dw15;
CSB BY dw24 dw25 dw26 dw27 dw28 dw30;
dw4 WITH dw6;
dw20 WITH dw21;
HAB BY ha1i ha2i ha3i ha4i ha5i ha6i;
HS BY hs1 hs2 hs3 hs4 hs5 hs6 hs7 hs8;
ECWC BY ec1 ec2 ec3 ec4 ec5 ec6 ec7 ec8;
CSBIT BY csb1 csb2 csb3 csb4 csb5 csb6;

My questions:
1. When using continuous variables, a regression coefficient (standardised beta) with a single predictor would be equal to the correlation coefficient. What happens in my case? My understanding was: when treating "indicators" as categorical, the latent variables would not necessarily be categorical. Instead they would be continuous latent variables with categorical indicators. Am I correct? How should I interpret the STDYX values as a result of the regression commands?

Or are they probit regression coefficients in this case?

2. In my output, I get R2 values for the DV. These values also have an associated p-value. Can you help me understand what that indicates? I am specifically referring to R2 value, not R2 change.

I really appreciate your help.
Thank you very much.
 Bengt O. Muthen posted on Thursday, February 27, 2020 - 12:10 pm
1. HW on HAB is a linear regression between 2 continuous variables.

2. The p-value shows if R2 is significant.

Note that you request Geomin rotation but your Model statements say CFA.
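
The point about a linear regression between two continuous variables can be checked numerically: with a single predictor, the standardized slope equals the Pearson correlation. The Python sketch below uses made-up observed data and hypothetical helper names; it illustrates the algebra only and does not reproduce how Mplus computes STDYX for latent variables.

```python
import math

def mean(values):
    return sum(values) / len(values)

def covariance(x, y):
    """Sample covariance with an n - 1 denominator."""
    mx, my = mean(x), mean(y)
    return sum((a - mx) * (b - my) for a, b in zip(x, y)) / (len(x) - 1)

def pearson_r(x, y):
    return covariance(x, y) / math.sqrt(covariance(x, x) * covariance(y, y))

def standardized_slope(x, y):
    """OLS slope of y on x, rescaled by sd(x) / sd(y)."""
    slope = covariance(x, y) / covariance(x, x)
    return slope * math.sqrt(covariance(x, x)) / math.sqrt(covariance(y, y))

x = [1.0, 2.0, 3.0, 4.0, 5.0]        # made-up predictor scores
y = [2.1, 3.9, 6.2, 7.8, 10.1]       # made-up outcome scores
print(pearson_r(x, y))
print(standardized_slope(x, y))      # matches the correlation up to rounding
```

With more than one predictor the standardized coefficients are partial effects and no longer equal the simple correlations.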