Mplus Discussion >> Interpretation of R-square in categorical data modeling

Tor Neilands posted on Wednesday, November 03, 2004 - 2:19 pm

Greetings,

I am fitting the following model in Mplus:

MODEL:
Positive BY ppo rps ;
Negative BY npo ics as ;

PsyHlth BY avgsps bdisum1 pss1 psomsum1 ;

PsyHlth ON Positive Negative ;
Adhlt90 ON PsyHlth ;

All observed variables and latent variables are continuous, with the exception of the distal outcome Adhlt90, which is dichotomous. I am using the WLSMV estimator and theta parameterization to estimate this model.

I am curious about the interpretation of the R-square value reported for Adhlt90. Does that interpretation change if I were to use ML estimation or delta parameterization? What if Adhlt90 were a count or zero-inflated count outcome?

Thanks so much for your insights,

Tor Neilands

bmuthen posted on Wednesday, November 03, 2004 - 6:03 pm

With WLSMV you get a probit regression for Adhlt90. With ML you get logit. In both cases the R-square refers to the proportion of explained variance in an underlying continuous latent response variable. For probit this response variable has a conditionally normal density with unit variance given the covariates, while for logit it has a conditionally logistic density with variance pi^2/3. For probit, see Tech App 1 on our web site. This kind of R-square was discussed in Amemiya, and also in McKelvey & Zavoina (1975, Journal of Mathematical Sociology); it is also discussed in Snijders and Bosker's multilevel book. I don't know what to say about the count question - do we give an R-square here?
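The latent-response R-square described above can be sketched numerically. This is a minimal illustration, not Mplus's internal computation: the function name `latent_response_r2` and the values in `eta` are hypothetical, and the sample variance of the linear predictor stands in for the model-implied explained variance.

```python
import math
from statistics import pvariance


def latent_response_r2(linear_predictor, link="probit"):
    """Explained-variance proportion for the latent response y*.

    The residual variance of y* is fixed by the link:
    1 for probit, pi^2/3 for logit.
    """
    residual_variance = {"probit": 1.0, "logit": math.pi ** 2 / 3}[link]
    explained = pvariance(linear_predictor)
    return explained / (explained + residual_variance)


# Hypothetical linear-predictor values (slope times predictor) for six cases.
eta = [-1.2, -0.5, 0.0, 0.3, 0.9, 1.4]

r2_probit = latent_response_r2(eta, "probit")
r2_logit = latent_response_r2(eta, "logit")
print(round(r2_probit, 3), round(r2_logit, 3))
```

The same `eta` is reused for both links only to show the role of the residual variance. In a real analysis the logit slopes would be on a different scale than the probit slopes, so the two linear predictors would not be identical.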

Tor Neilands posted on Thursday, November 04, 2004 - 10:24 am

Thank you, Bengt. I will check out those references. I have Snijders & Bosker. Can you post the full citation for the Amemiya reference? I noticed two Amemiya citations in the Mplus appendices. One is an article; the other is a full textbook.

Regarding the R-square for count outcomes, Mplus does not produce an R-square on the output for count outcomes.

With best wishes,

Tor

bmuthen posted on Thursday, November 04, 2004 - 10:26 am

That's Amemiya (1981) - a good overview.

Tor Neilands posted on Thursday, November 04, 2004 - 12:30 pm

Thank you, Bengt. I will get a copy.

Lois Downey posted on Tuesday, March 25, 2008 - 3:55 pm

I'm running a path model with ordinal outcomes, using WLSMV. My understanding is that the R-squares provided in the output represent the estimated proportion of variance in the assumed underlying Y*s that is explained by the model. There is a column in the R-square table labeled "Scale Factors." What is the interpretation of those numbers?

Linda K. Muthen posted on Tuesday, March 25, 2008 - 4:16 pm

You are correct about R-square. A scale factor is one divided by the standard deviation of the underlying latent response variable.
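The relation between the scale factor and R-square can be sketched as follows. This is an illustration under stated assumptions, not output from Mplus: the explained variance of 0.5625 is a made-up value, the residual variance of y* is assumed to be 1 (as for a probit model under the theta parameterization), and the function names are hypothetical.

```python
import math


def scale_factor(latent_response_variance):
    """Scale factor: one divided by the standard deviation of y*."""
    return 1.0 / math.sqrt(latent_response_variance)


def r2_and_scale(explained_variance, residual_variance=1.0):
    """Return (R-square, scale factor) for a latent response variable."""
    total = explained_variance + residual_variance
    return explained_variance / total, scale_factor(total)


# Hypothetical explained variance; the residual variance is assumed to be 1.
r2, delta = r2_and_scale(0.5625)
print(r2, delta)
```

Under these assumptions the total variance of y* is 1.5625, so its standard deviation is 1.25. A larger explained variance gives a larger R-square and a smaller scale factor.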

Heike B. posted on Friday, January 20, 2012 - 4:32 am

I have a path model containing an observed categorical mediator and observed categorical outcome variables. I see some indirect effects that are small but significant. However I worry about the R-squared of the mediator as it is very small.

1. Does the small R-squared somehow call into question the existence of my indirect paths?

2. Can I test the significance of the mediator's R-squared using an F-test? (The data were non-normal, and I used WLSMV.)

Thanks a lot in advance.

Linda K. Muthen posted on Friday, January 20, 2012 - 9:10 am

I would look at significance of the indirect effect and not be concerned with R-square.

Heike B. posted on Friday, January 20, 2012 - 2:19 pm

Thank you, Linda, for the good news.

Heike

Christoph Weber posted on Wednesday, April 25, 2012 - 5:11 am

Dear Dr. Muthen,
I want to compare the effect of an independent variable on a continuous and on a dichotomous dependent variable (two different models).

Model 1:

Y on x;

Model 2:

categorical = y;
y On x;

Is it possible to compare the R-squared values?
Is there a transformation that makes the R-squared for probit and ML (OLS) comparable?

Thanks

Christoph Weber

Linda K. Muthen posted on Wednesday, April 25, 2012 - 11:08 am

You should not compare the R-square values from a continuous and a categorical dependent variable. There is no such transformation.

ri ri posted on Wednesday, September 03, 2014 - 1:50 am

I have a question about the significance of R-square. I also have a categorical dependent variable. While using WLSMV, I set up a latent response variable. When I looked at the p-value of its R-square, it was above .05 and thus not significant. The R-square of one of my continuous latent variables, which is a mediator, also has a nonsignificant p-value. What does this mean?

Could you help me to interpret this result?

Thank you!

Bengt O. Muthen posted on Wednesday, September 03, 2014 - 3:16 pm

For general interpretation matters like these, you want to turn to SEMNET.

ri ri posted on Wednesday, September 03, 2014 - 11:46 pm

I was wondering: when interpreting explained variance, can the R-square value be directly interpreted as the variance explained by the predictors? I ask because when y is categorical, it is a probit regression.

ri ri posted on Thursday, September 04, 2014 - 4:47 am

I found the answer. You explained it in a previous post. Thank you!

Chenqq posted on Saturday, May 12, 2018 - 6:05 am

Dear everyone, should R-square be reported for a multivariable regression in a paper?

Bengt O. Muthen posted on Monday, May 14, 2018 - 4:27 pm

This question is suitable for SEMNET.

Gaye Ildeniz posted on Saturday, July 06, 2019 - 11:42 am

Hi,

I am a little confused by the posts above.

I have a full SEM with all categorical indicators (Likert-type scales) for the latent variables, estimated using WLSMV.

1. So, can I use the R-square estimates for the DVs (latent) or not?
2. If yes: since all my variables are treated as categorical, can I compare these R-square values?

Many thanks.

Bengt O. Muthen posted on Saturday, July 06, 2019 - 3:30 pm

1. The latent variables (factors) are continuous so using a regular R-square is fine.

2. Not sure if you are talking about the categorical indicators as DVs or if you are talking about comparing R-square across several different latent variables (factors). If the latter, no problem comparing.

Gaye Ildeniz posted on Thursday, February 27, 2020 - 9:27 am

Hello,
All items are on Likert-type scales.
I'm running the model below:

VARIABLE:
CATEGORICAL = ALL;

ANALYSIS:
ESTIMATOR IS WLSMV;
ROTATION IS GEOMIN;

MODEL:
HW BY dw1 dw2 dw3 dw4 dw6;
RET BY dw17 dw18 dw19 dw20 dw21 dw22;
EC BY dw9 dw10 dw11 dw12 dw13 dw14 dw15;
CSB BY dw24 dw25 dw26 dw27 dw28 dw30;
dw4 WITH dw6;
dw20 WITH dw21;
HAB BY ha1i ha2i ha3i ha4i ha5i ha6i;
HS BY hs1 hs2 hs3 hs4 hs5 hs6 hs7 hs8;
ECWC BY ec1 ec2 ec3 ec4 ec5 ec6 ec7 ec8;
CSBIT BY csb1 csb2 csb3 csb4 csb5 csb6;
HW ON HAB;
RET ON HS;
EC ON ECWC;
CSB ON CSBIT;

My questions:
1. With continuous variables, a standardized regression coefficient (beta) with a single predictor equals the correlation coefficient. What happens in my case? My understanding was that treating the indicators as categorical does not make the latent variables categorical; instead, they are continuous latent variables with categorical indicators. Am I correct? How should I interpret the STDYX values that result from the regression (ON) statements?

Or are they probit regression coefficients in this case?

2. In my output, I get R2 values for the DV. These values also have an associated p-value. Can you help me understand what that indicates? I am specifically referring to R2 value, not R2 change.

I really appreciate your help.
Thank you very much.

Bengt O. Muthen posted on Thursday, February 27, 2020 - 12:10 pm

1. HW on HAB is a linear regression between 2 continuous variables.

2. The p-value shows if R2 is significant.

Note that you request Geomin rotation but your Model statements say CFA.

Gaye Ildeniz posted on Wednesday, March 04, 2020 - 2:26 am

Thank you for your response.

Yes, I should remove the rotation command; thanks for the warning.

-I still don't get the concept of R2 being significant. Can you recommend any resources that I can read to understand where it's coming from? I have been trained on SPSS for a decade now and have never come across any discussion of the significance of R2. I have seen the significance of F-values (the overall model in SPSS), the significance of beta values (whether one IV significantly predicts the DV), and the significance of an R2 change after modifying a model.

-Also, in a linear regression with one IV, the standardized coefficient equals the correlation between the two variables. Would the same logic still apply in my case, because there is only one predictor for each DV?

Thank you again.

Bengt O. Muthen posted on Wednesday, March 04, 2020 - 9:54 am

- I personally don't pay attention to whether R2 is significant (significantly different from zero) or not. I am not aware of literature on that, but R2 is just like any other statistic: it has sampling variation, and we can test whether it is significantly different from zero. I typically look only at the significance of slopes.

- Yes, the same logic applies to your HW ON HAB latent variable regression.
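The single-predictor logic can be checked numerically on observed scores. This is a minimal sketch with made-up data: the lists `hab` and `hw` are hypothetical observed stand-ins for the latent factors HAB and HW, which Mplus estimates rather than observes, and the function name is an assumption for illustration.

```python
from statistics import mean, pstdev


def slope_corr_r2(x, y):
    """Return (standardized slope, correlation, R-square) for y regressed on x."""
    mx, my = mean(x), mean(y)
    cov = mean([(a - mx) * (b - my) for a, b in zip(x, y)])
    sx, sy = pstdev(x), pstdev(y)
    slope = cov / sx ** 2                # unstandardized slope
    std_slope = slope * sx / sy          # standardized (STDYX-style) slope
    corr = cov / (sx * sy)               # Pearson correlation
    intercept = my - slope * mx
    residuals = [b - (intercept + slope * a) for a, b in zip(x, y)]
    r2 = 1 - mean([e * e for e in residuals]) / sy ** 2
    return std_slope, corr, r2


# Hypothetical scores standing in for the factors HAB (predictor) and HW (DV).
hab = [1.0, 2.0, 3.0, 4.0, 5.0]
hw = [2.1, 2.9, 4.2, 4.8, 6.1]

std_slope, corr, r2 = slope_corr_r2(hab, hw)
print(std_slope, corr, r2)
```

With one predictor, the standardized slope matches the correlation and R-square matches the squared correlation, up to floating-point error.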

Miriam Forbes posted on Wednesday, August 05, 2020 - 9:27 pm

Dear Drs Muthen,

Would it be reasonable to compare the R2 values from probit and logit regression results? Specifically, we're comparing models predicting the same binary outcome using 1) latent variables as predictors in a probit regression framework versus 2) using the estimated factor scores as predictors in a logistic regression framework. I saw above that you said "In both cases does the R-square refer to explained variance proportion in an underlying continuous latent response variable", but also referred to different distributions for the latent response variables in probit vs logit.

Thanks in advance for any help you can offer!

All the best,

Miri

Bengt O. Muthen posted on Thursday, August 06, 2020 - 5:39 pm

I think not because the latent response variables refer to different models.

Miriam Forbes posted on Thursday, August 06, 2020 - 6:58 pm

Thank you!