R-Square


Tom Munk posted on Thursday, November 17, 2005 - 12:05 pm

I've tried (and failed) to replicate MPLUS R-square values.

For example, in a two-level model with no predictors, the variances were 898 (within) and 372 (between). By adding a set of predictors at each level, I obtain variances of 844 (w) and 99 (b). These are reductions of 6% (w) and 73.4% (b). I would expect MPLUS's R-sq values to closely match these, but they don't. They are 22.5% (w) and 17.6% (b).

How are the MPLUS values calculated? Can they be interpreted as the fraction of the (within- or between-) variance explained by the predictors?

Linda K. Muthen posted on Thursday, November 17, 2005 - 2:24 pm

R-square is variance explained divided by total variance. I would need to see what you are basing your numbers on to see what you are doing. Please send your input, data, output, and license number to support@statmodel.com.

Huang Xiaorui posted on Thursday, October 19, 2006 - 7:45 am

I am doing a two-level model analysis. I can get the R-square for each regression, but I can't get the whole model's R-square. How can it be displayed in Mplus? Thanks.
Within Level

Observed
Variable R-Square

PMIMP 0.491
TMIMP 0.092
PMEFF 0.720
TMEFF 0.072
PMR 0.483
TMR 0.057
MI 0.124

Latent
Variable R-Square

EFFORT 0.345
IMPUL 0.072

Between Level

Linda K. Muthen posted on Thursday, October 19, 2006 - 9:59 am

There is not an R-square for the full model.

liesbeth mercken posted on Tuesday, September 18, 2007 - 1:28 am

Hi,
If you have the following model
Y1 ON Y2;
Y2 ON X1 X2 X3;
X1 WITH X2 X3;
X2 WITH X3;

and you have to add X4 to the model in the following way:
Y1 ON Y2 X4;
Y2 ON X1 X2 X3;
Y2 WITH X4;
X1 WITH X2 X3;
X2 WITH X3;

Is it possible that the R-square for Y2 decreases after you enter X4 in the model? And is this the case because you added a correlation with Y2?

thank you,
Liesbeth

Linda K. Muthen posted on Tuesday, September 18, 2007 - 4:34 am

I'm not sure what might happen in this case. In regression, the means, variances, and covariances of the covariates are not model parameters. You should not mention x1, x2, x3, and x4 in the MODEL command except on the right-hand side of ON. Also, why do you have y1 ON x2 and not y2 ON x4?

liesbeth mercken posted on Tuesday, September 18, 2007 - 7:14 am

Hi Linda,
Thank you for your reaction.
Y2 on X2 is theoretically and empirically supported. The association between Y2 and X4 is completely new.
X4 is a new concept that is measured at the exact same time as Y2, so I can't use an ON statement as I then conclude causality in a specific direction.
But... I actually tried what you suggested Y2 ON X4,
but to be sure also X4 ON Y2.
Both models give significant and good results (also both interpretable)...
So I thought it was best to take the safe road and keep it in the model as a correlation. Would you suggest choosing a direction?
thank you

Bengt O. Muthen posted on Thursday, September 20, 2007 - 10:57 am

When you added x4 to the model you let it correlate only with y2 and y1, not x1-x3. This could cause a misspecified model with too high chi-square test of fit, in which case the estimates should not be interpreted. Perhaps you want to regress x4 on x1-x3. Also, you might have a misfitting model due to y1 not having direct influence from x1-x3.

In any case, even if you have well-fitting models, with these left-out effects I would say that the R-square behavior is not predictable.

Marko Neumann posted on Friday, February 15, 2008 - 3:57 am

To my knowledge, the basis for the R² computation are the variance components of level 1 and level 2 resulting from the empty model. However, the R² calculated by Mplus does not seem to be identical to the R² values that result from using the empty model. Mplus seems rather to refer to the estimated within- and between level parts of the (co)variance given in the sample statistics output of the specified model which differ considerably from the variance components of the empty model, depending on the specified individual level predictors.

example:
The dependent variable is school achievement. The empty model provides a within variance of 413 and a between variance of 775, and thus an intra class correlation of .652 (which is quite high, but normal in the tracked German secondary school system). Inclusion of level 1 predictor variables (e. g., prior knowledge, SES) results in an increased within variance (918) and a decreased between level variance (52) in the samp stats output. The residual variances in the specified model were 185 for the within and 52 (that means equal to the between variance part of the samp stats output) for the between level. R² for within was .80 (resulting from (918-185)/918= .798). On the contrary, computation of R² on the basis of the variance components of the empty model (like in HLM) would result in (413-185)/413= .55.

End of Part I

Marko Neumann posted on Friday, February 15, 2008 - 3:59 am

Part II

Including level 2 predictor variables in the next step (e. g., school track, mean achievement) results in the following variance components: within = 825 and between = 72; the residual variances were: within = 183 and between = 27. In Mplus within R² was .78 (resulting from (825-183)/825= .778) and between R² was .63 (resulting from (72-27)/72= .625).
Computation of R² on basis of the variance components of the empty model would result in (413-183)/413= .56 for within and (775-27)/ 775= .97 for between.

To sum up, I have three main questions: Why do the variance components at level 2 in the samp stats output decrease after inclusion of predictor variables? Why do they differ so much from the variance components in the empty model? How do I have to interpret them? I would be very grateful if you could answer my question.

Linda K. Muthen posted on Friday, February 15, 2008 - 9:30 am

The only reason that I can think of for sample statistics to differ is that the sample size has changed. If this is not the reason, please send your inputs, data, outputs, and license number to support@statmodel.com.

Marko Neumann posted on Monday, February 18, 2008 - 3:02 am

Bengt O. Muthen posted on Monday, February 18, 2008 - 9:13 am

Mplus computes R-square as the ratio of estimated explained variance in the numerator and estimated total variance in the denominator, where this is done for each level separately.

R-square on level 2 refers to proportion variance explained in random intercepts. With different level 1 predictors, the intercept definition changes (value of y when all x's = 0), so the level 2 R-square can therefore change.
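
The two ways of computing R-square contrasted in this exchange can be reproduced with a few lines of Python. This is only a sketch of the arithmetic using the variances quoted in the Part II post above; the function and variable names are illustrative and are not Mplus output labels.

```python
# Variances quoted in the school achievement example (Part II post).
null_within, null_between = 413.0, 775.0     # empty model
total_within, total_between = 825.0, 72.0    # estimated totals, model with predictors
resid_within, resid_between = 183.0, 27.0    # residual variances, model with predictors


def r_square(total, residual):
    """Proportion of variance explained: (total - residual) / total."""
    return (total - residual) / total


# Denominator = level-specific total variance estimated in the fitted model,
# which is how the posts describe the Mplus values.
mplus_within = r_square(total_within, resid_within)
mplus_between = r_square(total_between, resid_between)

# Denominator = variance component of the empty model (the HLM-style computation).
hlm_within = r_square(null_within, resid_within)
hlm_between = r_square(null_between, resid_between)

print(round(mplus_within, 3), round(mplus_between, 3))
print(round(hlm_within, 3), round(hlm_between, 3))
```

The residual variances are identical in both computations; only the denominator differs, which is why the two sets of values diverge once the level 1 predictors change the estimated within and between totals.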

Stephan posted on Sunday, March 02, 2008 - 11:32 pm

Hello,
In one of the web courses Bengt mentioned that users can have good overall model fit (incl. GFI) but a rather bad R-square for each of their latent variables. Could you please give me any further hints? Under which circumstances might that occur? Thanks a lot for your help.-Stephen

Linda K. Muthen posted on Monday, March 03, 2008 - 7:34 am

R-square is not a test of model fit. It describes the variance of the dependent variable explained by a set of covariates. A model can fit the data well even when the set of covariates does not explain the variance in the dependent variable.

Stephan posted on Monday, March 03, 2008 - 3:08 pm

Dear Linda,
thanks for the response. But R-square for latent variables is not 1-residual variance?

Stephan posted on Monday, March 03, 2008 - 3:12 pm

sorry, didn't see the output with the standardized residual variances. I am o.k. now.
Thanks for your help. Best, Stephen

Darla Kendzor posted on Tuesday, May 12, 2009 - 2:31 pm

I have submitted a manuscript for publication that describes a model which includes 4 continuous latent variables (IV/mediators) and 3 dichotomous observed outcome variables (DVs). A reviewer has asked me to include the percent of the variance accounted for by the IVs/mediators in the DVs. Mplus provides r-square values for each observed and latent variable, and I am wondering if mplus also provides an r-square for the whole model (i.e., the percent of variance in each of the outcomes [DVs] accounted for by ALL of the IVs/mediators)? If not, would it be meaningful to square the standardized total direct and indirect effect coefficient to obtain an r-square value this way? If neither of these ideas are possible, do you have any suggestions about how I might answer the reviewer's concern?

Linda K. Muthen posted on Wednesday, May 13, 2009 - 9:52 am

I would give the R-squares for each regression. A model R-square does not make sense because the aim of the model is not to maximize variance explained but to reproduce variances and covariances. R-square is not a model fit statistic.

Lois Downey posted on Tuesday, October 20, 2009 - 5:19 pm

Since Mplus doesn't provide a confidence interval around r-squared for clustered regression models, I've been computing the confidence intervals manually, using the point estimate and estimated standard error provided in the output. However, I'm invariably getting a lower bound that is negative. This seems counter-intuitive, given that r-squared cannot be less than zero. Are the negative values likely a result of rounding error, given that Mplus provides estimates rounded to 3 digits? (I'm multiplying the estimated standard error by 1.96 and then subtracting and adding the result to the point estimate.)

Thanks.

Linda K. Muthen posted on Wednesday, October 21, 2009 - 9:47 am

This can happen because there are no restrictions put on the confidence intervals in their computation. I don't think it is a rounding error.
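
A short sketch of the hand computation described in the question shows why the lower bound can fall below zero: the interval is symmetric around the estimate and nothing constrains it to the 0 to 1 range. The estimate and standard error below are hypothetical values chosen for illustration, not figures from the thread.

```python
def wald_interval(estimate, std_error, z=1.96):
    """Symmetric normal-theory interval: estimate +/- z * SE, with no bounds applied."""
    half_width = z * std_error
    return estimate - half_width, estimate + half_width


# Hypothetical R-square and standard error (illustration only).
lower, upper = wald_interval(0.07, 0.05)
print(round(lower, 3), round(upper, 3))

# A reporting choice some analysts make (not something Mplus does) is to
# truncate the interval at the logical limits of R-square.
clipped = (max(lower, 0.0), min(upper, 1.0))
print(clipped)
```

Whenever the estimate is smaller than 1.96 times its standard error, the lower bound is negative regardless of how many digits are carried.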

Alden Gross posted on Monday, May 03, 2010 - 12:41 pm

Dear Drs. Muthen,
I'd like to report R^2 values from growth curve models for latent growth factors (intercept and slope). I understand MPLUS calculates R-square values as the (variance explained by the model) divided by (total variance) of an outcome.

I have two questions. First, what, other than the stated regression paths, contributes to a latent outcome's R^2? Second, when I add demographic variables to explain the outcome, why does MPLUS' given R^2 decrease?

Here are some excerpts:
MODEL:
if sf | y1@0 y2@1 y3@2;
i1 s1 | x1v1@0 x1v2@1 x1v3@2;
i2 s2 | x2v1@0 x2v2@1 x2v3@2;
if ON i1 i2; sf ON i1 i2;

I would expect R^2 for "if" to be a sum of squared standardized regression coefficients for "if" regressed on i1 and i2. For instance, if the standardized B for if on i1=0.375 and i2=0.574, I think R^2 =0.375^2+0.574^2= 47%. However, MPLUS's given R^2 = 76%, which is correctly (1-residual/total) = (1-0.146/0.607) = 0.759.

When I then add covariates in another model (e.g., if ON age sex education;), the MPLUS-given R^2 decreases.
Thanks so much!

Linda K. Muthen posted on Tuesday, May 04, 2010 - 9:52 am

You have forgotten the covariance. You need to add:

2*cov*b1*b2
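
As a rough check of this correction, the sketch below adds the covariance term to the sum of squared standardized coefficients from the question. It assumes i1 and i2 are the only predictors of "if" and that the coefficients are fully standardized, so that "cov" is the correlation between i1 and i2. That correlation is not reported in the thread, so the code back-solves the value that would reconcile the two figures; the function name is illustrative.

```python
def r_square_two_predictors(b1, b2, corr):
    """R-square from standardized slopes b1, b2 and the predictor correlation."""
    return b1**2 + b2**2 + 2.0 * corr * b1 * b2


b1, b2 = 0.375, 0.574    # standardized coefficients quoted in the question
reported = 0.759         # R-square printed by Mplus, as quoted in the question

# Sum of squares alone, i.e. treating the predictors as uncorrelated.
sum_of_squares = r_square_two_predictors(b1, b2, corr=0.0)

# Correlation between i1 and i2 implied by the reported R-square (back-solved).
implied_corr = (reported - sum_of_squares) / (2.0 * b1 * b2)

print(round(sum_of_squares, 3), round(implied_corr, 3))
```

Under these assumptions the gap between 47% and 76% corresponds to a correlation of roughly 0.67 between the two intercept factors.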

Brondeel Ruben posted on Saturday, December 04, 2010 - 6:39 am

Hi.

I fitted a logistic regression model with complex data. I was asked how much variance is explained by the model. So I stated 'standardized' in the output command.
1. What kind of R-square is this? Is it an adjusted version of the R-square, something like a Nagelkerke's R-square?
2. Is it meaningful in this context?
3. How does it handle the complex data structure? Or is that irrelevant for the computation of the R-square (in this case, but also for the case of a continuous dependent)

Regards,
Ruben.

Linda K. Muthen posted on Sunday, December 05, 2010 - 11:12 am

The R-square is the variance explained in the latent response variable underlying the categorical variable. See the following book for further information:

Long, S. (1997). Regression models for categorical and limited
dependent variables. Thousand Oaks: Sage.

R-square is not affected by non-independence of observations. R-square is usually not used for logistic regression.
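
The latent response R-square can be sketched as follows. This is the form usually attributed to McKelvey and Zavoina and covered in Long (1997): the residual variance of the latent response is fixed by the link (pi^2/3 for logit, 1 for probit), so only the variance of the linear predictor has to be supplied. The value 1.2 is hypothetical, and the function name is illustrative rather than an Mplus feature.

```python
import math

LOGIT_RESIDUAL_VAR = math.pi**2 / 3.0   # variance of the standard logistic distribution
PROBIT_RESIDUAL_VAR = 1.0               # variance of the standard normal distribution


def latent_response_r_square(explained_var, link="logit"):
    """R-square for the latent response y*: explained / (explained + residual)."""
    residual = LOGIT_RESIDUAL_VAR if link == "logit" else PROBIT_RESIDUAL_VAR
    return explained_var / (explained_var + residual)


# Hypothetical variance of the linear predictor (illustration only).
print(round(latent_response_r_square(1.2, "logit"), 3))
print(round(latent_response_r_square(1.2, "probit"), 3))
```

Because the residual variance is a constant rather than an estimated parameter, this quantity behaves differently from the R-square of a linear regression and is not comparable to Nagelkerke-type measures.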

jmaslow posted on Tuesday, May 10, 2011 - 7:52 am

Hello,

I am able to replicate the r square produced by Mplus in latent variable SEM models. I am attempting to extend this to SEM models containing a latent interaction calculated with XWITH.

I use the formula provided by Mooijaart & Satorra (2009) to calculate % variance explained by the main effects and the interaction term and also include 2*cov*b1*b2 and the term that multiplies the b of the interaction term by the variance and covariances of the latent variables and sum all of these terms.

However, by this method, the variance explained in the model with xwith included is much lower than the variance explained by the model with only main effects (16% versus 35%). I feel that what is missing is the covariance of the main effects with the interaction term, which are not provided by Mplus. Is there a way to get these covariances from the program, calculate them, or force them to be 0?

Thank you.

Bengt O. Muthen posted on Tuesday, May 10, 2011 - 10:46 am

You say

"and the term that multiplies the b of the interaction term by the variance and covariances of the latent variables"

But, that term should be b*b*(V(f1)*V(f2)+[cov(f1,f2)]**2).

Then you say

"I feel that what is missing is the covariance of the main effects with the interaction term"

There are no such terms missing because all third-order moments are zero due to the normality assumption of the factors.
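
Putting the pieces of this reply together, the explained variance for a model y = b1*f1 + b2*f2 + b3*f1*f2 + e can be sketched as below. The sketch assumes zero-mean, bivariate normal factors, so the product term is uncorrelated with the main effects; all numeric inputs are hypothetical and the function name is illustrative.

```python
def explained_variance_with_interaction(b1, b2, b3, v1, v2, cov12):
    """Explained variance of y = b1*f1 + b2*f2 + b3*f1*f2 + e for normal factors."""
    # Main effects, including the covariance term 2*cov*b1*b2.
    main_effects = b1**2 * v1 + b2**2 * v2 + 2.0 * b1 * b2 * cov12
    # Interaction term: b*b*(V(f1)*V(f2) + cov(f1,f2)**2).
    interaction = b3**2 * (v1 * v2 + cov12**2)
    return main_effects + interaction


# Hypothetical estimates (illustration only).
explained = explained_variance_with_interaction(
    b1=0.4, b2=0.3, b3=0.2, v1=1.0, v2=1.0, cov12=0.5
)
residual_var = 0.6
r_square = explained / (explained + residual_var)

print(round(explained, 3), round(r_square, 3))
```

No cross term between the main effects and the interaction appears because, under normality, the third-order moments that would produce it are zero.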

jmaslow posted on Wednesday, May 11, 2011 - 9:19 am

Thank you, Dr. Muthen. This is very helpful. Just to be clear, then, the main effects and interaction in a model with XWITH are completely uncorrelated with each other and can be considered independent?

Linda K. Muthen posted on Wednesday, May 11, 2011 - 9:21 am

Yes.

Hans Leto posted on Thursday, May 31, 2012 - 1:56 pm

Dr. Muthen.

Which would be the formula to calculate R-squared for a three-way interaction.

For instance:
V(f4) = b1**2 V(f1) + b2**2 V(f2) + b3**2 V(f3) + 2*b1*b2 Cov(f1,f2) + b4**2 V(f1xf2xf3) +

I am not sure how to specify the Cov when I have three factors.

Thank you.

Linda K. Muthen posted on Thursday, May 31, 2012 - 3:38 pm

R-square is 1 minus the standardized residual variance of the dependent variable.

Hans Leto posted on Friday, June 01, 2012 - 2:52 am

But the standardized residual variance is not provided in the output with TYPE=RANDOM? How can I request or calculate it?

Linda K. Muthen posted on Friday, June 01, 2012 - 9:55 am

See the following FAQ on the website:

Latent variable interactions

caroline masquillier posted on Tuesday, July 17, 2012 - 5:26 am

Dear Drs. Muthen,

What kind of R-square do we get in Mplus output for a logistic regression? Is this an adjusted version of the R-square, something like a Nagelkerke's R-square?

Thank you very much

Linda K. Muthen posted on Tuesday, July 17, 2012 - 11:01 am

The R-square is for the latent response variable. It is described in the Snijders and Bosker book Multilevel Analysis.

May Yang posted on Wednesday, April 17, 2013 - 9:26 am

Hello,
I believe this is along the same lines as Tom Munk's posting on 11/17/05. I am confused about how R2 is calculated in a multilevel model. So R2 = variance explained / total variance. When I run a null model (without any predictors), I get a within variance = 2.234. To me, 2.234 is the total variance. When level 1 predictors are added, I get a variance estimate of 0.147. I would assume that R2 is then 0.147/2.234 = 0.066, but the output reports R2 (obtained using the standardized option) = 0.086. What am I missing here? Thank you.

Bengt O. Muthen posted on Wednesday, April 17, 2013 - 12:23 pm

On each of the 2 levels, R-square is explained variance divided by total variance (on that level). Yes, 2.234 is the total variance on level 1, but when adding predictors the variance parameter that gets estimated is the residual variance.

If this does not explain things, please send output to support.

Elina Dale posted on Thursday, October 10, 2013 - 4:16 pm

Dear Dr. Muthen,

As you wrote, the R-square is the variance explained in the latent response variable underlying the categorical variable. Is it proportion?

In MPlus CFA output, the R-Square estimate is provided next to observed factor indicators. So, if we see an R-Sq of 0.400 next to y1 (factor 1 indicator), does it mean that 40% of variance in factor 1 is explained by this categorical variable (y1)? But then there are 3 other indicators that also have a similar R-Sq estimate and so, when you add them up they are >100%.

Also, what do residual variance values (Column 6 in R-Square table) mean in this case?

I am giving an example of the output below, so that you understand what res variance and R-Sq values I am referring to.

R-SQUARE

Observed variable (Column 1)
Estimate (Column 2) S.E. (Column 3)
Est./S.E. (Column 4)
Two-Tailed P-value (Column 5)
Residual Variance (Column 6)

Thank you!

Linda K. Muthen posted on Friday, October 11, 2013 - 6:10 am

The R-square for y1 means that 40% of the variance of y1 is explained by the factor. Factor indicators are dependent variables and factors are independent variables in the factor model.

Residual variances are not model parameters for categorical variables. The values given under residual variance are computed after model estimation as remainders.

Johnson Song posted on Thursday, March 06, 2014 - 3:18 pm

Dear Dr. Muthen,

If the significance test of the between-level R^2 indicates that the R^2 is not statistically significant (R^2=.07, p>.05), does it mean that the portion of the variance of this latent variable explained by the predictors is not different from zero, even though the regression coefficient is statistically significant (p=.017)?

Should I still report the significant regression coefficients?

Thank you so much for your advice!

Best,
John

Bengt O. Muthen posted on Friday, March 07, 2014 - 4:39 pm

Q1. Yes. But note that the test of R-2 may not work as well as the test of the regression coefficient because the sampling distribution of R-2 may not be as close to normal.

Q2. I would do that. In general I don't report R-2 significance.

Lance Rappaport posted on Wednesday, December 03, 2014 - 11:16 am

Hello.

I understand that the stdyx option is not available when using random slopes in a multilevel context. I understand that r-square at level 1 cannot be estimated as it varies as a function of the grouping variable. My question is why an r-squared value cannot be computed for a level 2 variable?

For example, I have random slopes at level 1, which predict a level 2 endogenous variable. How could I compute an r-square for this endogenous variable? R-square for this variable should not vary as a function of anything included in the model.

Sincerely,
Lance

Bengt O. Muthen posted on Wednesday, December 03, 2014 - 4:30 pm

The R-2 for level-2 is well defined as you say. You can express it using Model Constraint with Model parameter labels.

Lance Rappaport posted on Wednesday, December 03, 2014 - 5:07 pm

Thank you very much, especially for such a prompt response. Would it be okay if I asked how to use model parameter labels to capture the standardized residual variance?

Bengt O. Muthen posted on Thursday, December 04, 2014 - 11:40 am

The general approach is shown in UG ex 5.20.

Shiny7 posted on Tuesday, January 13, 2015 - 9:53 am

Dear Drs. Muthen,

I'd like to replicate my R2 on the between level in a multilevel model.

In the null model the variance between is 7.699. In the Random Intercept Model the variance between is 0.619 (0.291 standardized).

When I calculate (7.699-0.619)/7.699 it is: R2= 0.920.

Mplus is giving me 0.709 (standardized solution).

a) Is 0.709 the standardized version of 0.920? Or is my calculation wrong?

b) The R2 of 0.709 is not significant. Is that plausible, although I have significant (and not significant) L2 predictors?

Thank you very much in advance.
Shiny
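
The two figures in this post can be lined up with a small sketch. It assumes that 0.619 is the between-level residual variance of the model with predictors and that 0.291 is that same residual variance in the standardized solution; both readings are assumptions about the output, which is not shown in the thread. Variable names are illustrative.

```python
null_between = 7.699        # between-level variance in the null model
resid_between = 0.619       # between-level residual variance, model with predictors
std_resid_between = 0.291   # standardized residual variance, as quoted

# Null-model-based computation from the post.
r2_null_based = (null_between - resid_between) / null_between

# R-square as 1 minus the standardized residual variance.
r2_from_standardized = 1.0 - std_resid_between

# Total between-level variance implied by the standardized solution,
# since standardized residual = residual / total.
implied_total = resid_between / std_resid_between

print(round(r2_null_based, 3), round(r2_from_standardized, 3), round(implied_total, 3))
```

Under these assumptions 0.709 is not a standardized version of 0.920. The two values use different denominators: the null-model variance (7.699) in one case and the between-level total variance implied by the fitted model (about 2.13) in the other.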

Bengt O. Muthen posted on Tuesday, January 13, 2015 - 2:04 pm

What's the difference in the Mplus input specification of your null model and random intercept model?

Shiny7 posted on Wednesday, January 14, 2015 - 12:37 am

Dear Dr. Muthen,

thank you very much for your quick reply.

May I send you my output(s), please? It is a quite complex model...

Maybe too complex...

Shiny

Linda K. Muthen posted on Wednesday, January 14, 2015 - 6:20 am

You can send the outputs and your license number to support@statmodel.com as long as your support contract is current.

Christine Kemp posted on Thursday, April 30, 2015 - 2:20 pm

Hi,

I am running a continuous-time survival analysis using the cox regression model (based on example 6.21). I included Output: stdyx and can see my standardized coefficients but the R-Square line is coming up blank. Do you know why this would be?

Bengt O. Muthen posted on Thursday, April 30, 2015 - 6:22 pm

What residual variance would the R-square use?

Eric Thibodeau posted on Friday, May 29, 2015 - 6:51 am

Hi,

I'm running a very simple multiple regression model with one continuous outcome and 5 predictors. Three of the five predictors are considered covariates and not target predictors. I'm interested in the added value of the target predictors. My idea was to run two models, the first with just the three covariates predicting the outcome, and the second model with the two target predictors added. I would look at the change in R-squared. I know there are ways to test whether the R-squared increase was statistically significant, via an F-test of change. Is there any way to test that in Mplus? I've been told that the chi-square difference p-value would be equivalent. If I were to compare models using the chi-square difference test, would I set the parameters of the target predictors in my nested model to zero, and then free them up in the second model? Thanks!

Eric

Bengt O. Muthen posted on Friday, May 29, 2015 - 8:09 am

No such F-test in Mplus. I would go about it the way you describe in your second to last sentence.
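
The comparison described here can be sketched for the case in the question, where two target-predictor coefficients are fixed at zero in the nested model, giving a difference test with 2 degrees of freedom. For 2 degrees of freedom the chi-square upper-tail probability is exactly exp(-x/2), so no statistics library is needed. The chi-square values below are hypothetical, and the sketch assumes ordinary ML chi-squares; with MLR a scaling correction has to be applied to the difference first.

```python
import math


def chi_square_diff_pvalue_df2(chisq_restricted, chisq_free):
    """Chi-square difference and its p-value for a 2-df nested comparison."""
    diff = chisq_restricted - chisq_free
    if diff < 0:
        raise ValueError("the restricted model cannot fit better than the free model")
    # Upper-tail probability of a chi-square variate with 2 degrees of freedom.
    return diff, math.exp(-diff / 2.0)


# Hypothetical fit values (illustration only): the free model is a saturated
# regression (chi-square 0), the restricted model fixes two slopes at zero.
diff, p_value = chi_square_diff_pvalue_df2(chisq_restricted=9.21, chisq_free=0.0)
print(round(diff, 2), round(p_value, 3))
```

A small p-value indicates that the two target predictors jointly improve the model, which answers the same question as the F-test of R-square change.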

Dirk Pelt posted on Wednesday, October 14, 2015 - 1:16 am

Dear Bengt and Linda,

I have read this whole thread and it appears that most questions relate to the fact that there appear to be two ways of calculating R² (please correct me if I'm wrong):

1. One based on comparison with a null model including a random intercept only. The formula is: (var of null model - var of model with predictors)/var of null model. This is the 'standard' way in multilevel models as described in Kreft and De Leeuw (1998), Snijders and Bosker etc. This formula can be applied to each level. This is all based on unstandardized coefficients/variables.

2. One that Mplus reports, which is 1-standardized residual or simply the sum of all standardized beta coefficients squared. Again separately for each level.

Which is the correct one to use? I believe that in the multilevel literature, standardizing of coefficients is always treated as something problematic. Thank you!

Dirk

Bengt O. Muthen posted on Wednesday, October 14, 2015 - 2:54 pm

For each level Mplus uses the standard R-square formula:

(1) (Variance explained by covariates)/(Total DV variance)

That happens to be the same as 1 - stand'd res var.

I think the formula you mention is the same as (1) because the var of null model is the total DV variance and when you say "var of model with predictors" you may be referring to the residual variance in the model with predictors. The difference is then what I call "variance explained by covariates".

li zhou posted on Thursday, December 10, 2015 - 10:37 am

Dear Dr. Muthen,

I conducted a quite simple model as follows ('pl' is the DV):

bf by bf_know bf_info bf_fami bf_expe;
bl by bl_abse bl_sale;
si by si_price si_qual si_serv;
bf_know bf_expe with bf_info bf_fami;
bl on bf;
pl on bl;
pl on bf;
pl on si;
pl on age gender gi edu shopper;

The 'STANDARDIZED MODEL RESULTS' were quite okay.
However, the results of 'R-SQUARE' were as follows:

Observed Variable    Estimate    S.E.     Est./S.E.    Two-Tailed P-Value
PL                   0.066       0.025     2.635       0.008
BF_KNOW              0.885       0.044    20.214       0.000
BF_INFO              0.949       0.051    18.582       0.000
BF_FAMI              0.852       0.065    13.128       0.000
...

My questions are (regarding the 'R-SQUARE' results):
1) Does the estimate next to 'PL' mean that only 6.6% of the variance of PL is explained by the whole model?

2) Although the other model fit indexes (CFI/TLI, SRMR, RMSEA) were quite good, the model does not explain much of the variance of the DV 'PL'. Can I still report the significant coefficients?

3) If yes, how can I explain the small value of the R-SQUARE for 'PL'?

Thanks.

Bengt O. Muthen posted on Thursday, December 10, 2015 - 6:18 pm

Please send output to Support along with your license number.

Rick Borst posted on Wednesday, November 02, 2016 - 3:36 am

Dear Professors Muthen,

I analyzed a model with and without a latent interaction term (as a matter of fact, it is one moderator which has an effect on two IVs). The outputs show that the R-square is higher without the interaction term than with it, although the interactions are significant. Is it possible to have significant interaction terms but a lower R-square, or is there something wrong with my input? If it is possible, does that mean that my model with interactions should be discarded despite the significant interactions?

Thanks.

Bengt O. Muthen posted on Wednesday, November 02, 2016 - 5:26 pm

Q1. I think so because all parameter estimates change.

Q2. No, I would go by significance, not R-2.

Andre Maharaj posted on Friday, December 16, 2016 - 3:34 pm

Hello there,

I have a similar question, and am not sure I am understanding all the comments.

I have a two level model that was run with two count dependent variables (nb).

I would like to get a (pseudo?) r-square for the amount of variance explained for each regression;

1. Is this possible? (the stdyx output reports estimates of 1 with two-tailed p values of 999.000 for the within level).

2. Is what I am looking for actually reported on the between level? (The estimates there are 0.136 for dv1 and 0.520 for dv2.)

Thanks to everyone in advance.

Bengt O. Muthen posted on Friday, December 16, 2016 - 5:47 pm

1-2. Level-1 r-square for count DVs is not defined because count regression does not have a residual variance. For Level 2 you work with continuous random effect variables so there the usual R-square for linear regression is used.

Andre Maharaj posted on Friday, December 16, 2016 - 5:59 pm

Thanks for the quick response!

So the L2 explained variance is only for the L2 predictor then, correct?

Do you know of any other way to calculate an effect size for the level 1 regressions?

If not, how appropriate / inappropriate might it be to run the standard regression model and report a "standard" pseudo r-squared?

Any advice would be greatly appreciated!

Thanks again.

Andre Maharaj posted on Saturday, December 17, 2016 - 12:32 pm

Also, and probably more importantly, I have 6 groups dummy coded in the regressions, and I would like to report an effect size for the groups with a significant difference relative to the reference group (run multiple times to change the reference groups).

Any thoughts on how I might do this would be appreciated.

Andre Maharaj posted on Saturday, December 17, 2016 - 12:41 pm

Or should I just interpret the exponentiated coefficient as the effect size?

(Sorry for the multiple posts; couldn't find a way to edit the previous ones)

Bengt O. Muthen posted on Monday, December 19, 2016 - 5:59 pm

Dec 16, 5:59 post:

Q1. Yes

Q2. No

Q3. Don't think so - I wouldn't focus on R-square for models without a residual - but check the books by Long or Hilbe.

Bengt O. Muthen posted on Monday, December 19, 2016 - 6:01 pm

Just interpret the exponentiated coefficients. You can also express the model-estimated means for the count variable and/or consider the corresponding probability of having the count e.g. = 0. See, e.g., our new book.
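
The quantities mentioned in this reply can be sketched as follows. The coefficients are hypothetical, and the probability of a zero count assumes a negative binomial model with variance mean*(1 + alpha*mean); that parameterization and all names below are assumptions made for illustration and should be checked against the model actually estimated.

```python
import math


def rate_ratio(b):
    """Exponentiated log-linear coefficient: multiplicative change in the mean count."""
    return math.exp(b)


def expected_count(intercept, b, x):
    """Model-estimated mean of the count variable at covariate value x."""
    return math.exp(intercept + b * x)


def nb_prob_zero(mean, alpha):
    """P(count = 0) for a negative binomial with variance mean * (1 + alpha * mean)."""
    return (1.0 + alpha * mean) ** (-1.0 / alpha)


# Hypothetical estimates for one dummy-coded group versus the reference group.
intercept, b_group, alpha = 0.5, 0.405, 0.8

mean_reference = expected_count(intercept, b_group, x=0)
mean_group = expected_count(intercept, b_group, x=1)

print(round(rate_ratio(b_group), 3))
print(round(mean_reference, 3), round(mean_group, 3))
print(round(nb_prob_zero(mean_reference, alpha), 3), round(nb_prob_zero(mean_group, alpha), 3))
```

The ratio of the two model-estimated means equals the exponentiated coefficient, so either presentation conveys the size of the group difference on the count scale.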

Andre Maharaj posted on Tuesday, December 20, 2016 - 9:40 pm

Thank you kindly!

Hoda Vaziri posted on Tuesday, March 21, 2017 - 12:02 pm

How can I calculate pseudo r-squared for a three-level model?
My output says standardized values are only available with ALGORITHM=INTEGRATION, but when I include this, I receive an error saying the integration algorithm is not allowed with threelevel models.

Bengt O. Muthen posted on Tuesday, March 21, 2017 - 5:58 pm

Q1: Ask on multilevelnet.

Regarding your output, we have to see it to be able to say - send to Support along with your license number.

John G posted on Sunday, June 11, 2017 - 10:27 pm

I've created a multilevel mediation model in Mplus 8 using instructions from Preacher, Zyphur, Zhang (2010). I've been asked by a reviewer to supply an R-square for the outcome variable. I've read through several of these discussion threads, and it seems that I need to request standardized outputs. However, when I add the "STANDARDIZED;" request, I receive the following error message:

"STANDARDIZED (STD, STDY, STDYX) options are available only for TYPE=TWOLEVEL RANDOM with ESTIMATOR=BAYES. Request for STANDARDIZED (STD, STDY, STDYX) is ignored."

Do you have any instructions to overcome this hurdle? Thanks in advance!

Bengt O. Muthen posted on Monday, June 12, 2017 - 5:59 pm

It sounds like your model has a random slope. This situation has not been covered in terms of standardization or R-square. Bayes in Version 8 handles it.

John G posted on Tuesday, June 13, 2017 - 5:19 pm

Thanks for the advice.

I wonder if the Bayes estimator does not work with this particular MSEM analysis. When I add "estimator is bayes" to the analysis specification, I receive the following error:

*** ERROR in MODEL command
Unrestricted x-variables for analysis with TYPE=TWOLEVEL and ESTIMATOR=BAYES
must be specified as either a WITHIN or BETWEEN variable. The following variable
cannot exist on both levels: IV1

The model ran fine with the analysis command "type is twolevel random." Do you have any further suggestions?

Bengt O. Muthen posted on Tuesday, June 13, 2017 - 6:31 pm

We need to see the full output - please send to Support along with your license number.

João Maroco posted on Friday, June 30, 2017 - 8:06 am

Hello,
Can I add the Within level R^2 and the between level R^2 and get an estimate of the overall variability in my dependent variable that is explained by predictors at level 1 plus predictors at level 2 in an HLM model?
Thanks!

Bengt O. Muthen posted on Friday, June 30, 2017 - 5:05 pm

You can get the total proportion variance explained but it is not the sum of the R-2s. Perhaps TECH4 is useful here.
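
One way to see why the two R-squares cannot simply be added is to pool the explained and total variances across levels. The sketch below assumes a random-intercept model in which the within and between parts of the dependent variable are uncorrelated, so that the total variance is the sum of the two level-specific totals. It reuses the variances from Marko Neumann's earlier post purely for illustration, and it is not presented as the computation Mplus or TECH4 performs.

```python
def pooled_r_square(total_within, resid_within, total_between, resid_between):
    """Explained variance summed over levels, divided by the summed total variance."""
    explained = (total_within - resid_within) + (total_between - resid_between)
    return explained / (total_within + total_between)


# Variances borrowed from the Part II post earlier in the thread.
total_w, resid_w = 825.0, 183.0
total_b, resid_b = 72.0, 27.0

r2_within = (total_w - resid_w) / total_w
r2_between = (total_b - resid_b) / total_b
overall = pooled_r_square(total_w, resid_w, total_b, resid_b)

print(round(r2_within + r2_between, 3))   # naive sum of the two R-squares
print(round(overall, 3))                  # pooled proportion explained
```

The naive sum exceeds 1 here, whereas the pooled value is a variance-weighted average of the two level-specific R-squares and always lies between them.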

Christian Marx posted on Tuesday, October 24, 2017 - 12:50 pm

Dear Professors Muthen,

How does Mplus calculate R-square in a latent regression model with dummies and continuous variables as IVs and a latent variable as DV?

I fail to reproduce the R-square when summing all squared stdyx coefficients, which might be due to the dummies. So, how do I have to calculate the partial R-square of the dummies?

Best

Christian

Bengt O. Muthen posted on Tuesday, October 24, 2017 - 1:02 pm

R-square is computed just like in regular regression. If the predictors are correlated it isn't enough to sum up squared regression coefficients because there are covariances too.

Christian Marx posted on Tuesday, October 24, 2017 - 1:14 pm

Thanks a lot for your answer!
Does Mplus offer partial R-Square for each predictor?

Bengt O. Muthen posted on Tuesday, October 24, 2017 - 1:39 pm

No.

Stefan Kamin posted on Tuesday, November 14, 2017 - 2:20 am

Hello,

I am estimating a path model with Bayes (continuous and categorical DVs, non-informative priors). I would be interested to know how the r-squared is calculated in the Bayes framework (both for continuous and categorical dependent variables). I believe the calculation is different from traditional ols regression as I was not able to reconstruct the r-square by hand. I would appreciate some advice.

Best
Stefan

Bengt O. Muthen posted on Wednesday, November 15, 2017 - 10:41 am

Bayes does it the usual way. For a categorical DV the residual variance is 1 due to using probit regression.

João Maroco posted on Wednesday, June 20, 2018 - 1:57 am

Dear Bengt and Linda,

I read all the posts in this thread, but still didn't find any question close to the one I have.
I am running a 2-level HLM random slopes model. I got these results:

SUMMARY OF DATA FOR THE FIRST DATA SET
Number of missing data patterns 1
Number of clusters 68
Average cluster size 23.853

Intraclass
Variable Correlation
ASRREA 0.189

My Question is:
I have no significant predictors (I am using 29 predictors to explain level 2 variance, and no level 1 predictors), but I still have a non-significant R^2 = 0.999:

R-SQUARE

Within Level

Between Level
Observed                                      Two-Tailed    Rate of
Variable    Estimate    S.E.     Est./S.E.    P-Value       Missing

ASRREA      0.999       1.636    0.611        0.541         0.000

So I am interpreting this as a problem of over-fitting. I have 29 predictors, for 68 clusters with an average cluster size of 23.853. Thus none of the 29 predictors are significant although R^2~1...

Is this interpretation correct?

Is there a rule of thumb relating the number of predictors per number of clusters/average cluster size that I can use?

Thanks for any input on this.
Warm regards,
João Marôco

Bengt O. Muthen posted on Wednesday, June 20, 2018 - 11:52 am

Send your full output to Support along with your license number.

Lena Keller posted on Monday, September 10, 2018 - 6:12 am

Dear Bengt and Linda,

I'm fitting a (manifest) simple linear regression (1 predictor, 1 DV) vs. a (manifest) quadratic regression in a multi-group model with TYPE=COMPLEX and weights using MLR. Based on classic textbook knowledge, I would expect that more complex models explain more variance than simpler models. Counterintuitively, in some cases R² is larger in the linear model than in the quadratic model. When I calculate these models with an OLS approach, the quadratic model always explains more variance than the linear model.

I wonder if the reason for this counterintuitive finding could be that OLS and MLR estimation minimize different discrepancy functions when fitting the model parameters?

Thanks in advance!

Best,
Lena

Bengt O. Muthen posted on Monday, September 10, 2018 - 2:47 pm

Perhaps your multiple-group analysis uses equality constraints across groups in which case all bets are off.

Lena Keller posted on Wednesday, September 12, 2018 - 11:54 pm

Thanks for your answer!
There are no equality constraints across groups, and we freely estimate the parameters in the linear and quadratic models, so this cannot be the reason.
Is there any another explanation for this phenomenon?

Bengt O. Muthen posted on Friday, September 14, 2018 - 1:38 pm

Please send 2 outputs showing this phenomenon to Support along with your license number.

Jesus Garcia posted on Friday, March 22, 2019 - 6:07 am

Hi Drs. Muthen, I have a question about R2. Why does my model show me an R-square for each observed variable but not for the whole factor? Is something wrong? E.g.:
MODEL:
COG_PIE BY COG_5 COG_10 COG_15 COG_30;
AF_PIE BY AF_35 AF_40 AF_50 AF_55;
BEH_PIE BY COND_65 COND_75 COND_80;
INT_5 ON COG_PIE AF_PIE BEH_PIE;

OUTPUT
R-SQUARE
Observed Variable    Estimate    S.E.     Est./S.E.    P-Value
COG_5                0.484       0.033    14.639       0.000
COG_10               0.180       0.025     7.269       0.000
COG_15               0.416       0.028    14.638       0.000
COG_30               0.213       0.030     7.146       0.000

Thanks!

Bengt O. Muthen posted on Friday, March 22, 2019 - 4:39 pm

R-square is given for factors as well; below the observed variable R-squares. If you don't see it, send your output to Support along with your license number.