The coefficients labeled Std are standardized using the variances of the continuous latent variables. The coefficients labeled StdYX are standardized using the variances of the continuous latent variables as well as the variances of the background and/or outcome variables. The Std and StdYX coefficients are the same for parameter estimates involving only latent variables, such as continuous latent variable variances, covariances, and regressions. They differ for parameter estimates involving both factors and observed variables, such as factor loadings. Only Std should be used for dummy background variables.
Anonymous posted on Tuesday, November 30, 1999 - 7:57 pm
What is the interpretation of the estimates/coefficients for paths to a categorical outcome in Mplus?
The Mplus estimates for paths from predictors to an observed categorical dependent variable are probit regression coefficients. Typically, only their signs and significance are noted. A positive sign means that the probability of the categorical dependent variable (e.g. the category 1 for a 0/1 variable) increases when the predictor value increases. A larger magnitude means that this probability increases faster. If a more detailed description of the influence is of interest, the probabilities can be plotted as a function of the predictors. See also Appendix 1 of the Mplus User’s Guide.
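To make the probit interpretation above concrete, here is a minimal sketch of how the probability of category 1 changes with a single predictor. It assumes a binary outcome with one threshold and one slope; the values of `slope` and `threshold` are hypothetical and are not taken from any Mplus output.

```python
import math

def normal_cdf(z):
    """Standard normal cumulative distribution function."""
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))

def probit_probability(x, slope, threshold):
    """P(y = 1 | x) under a probit model with one threshold."""
    return normal_cdf(slope * x - threshold)

# Hypothetical parameter values, chosen only for illustration.
slope = 0.8
threshold = 0.5

for x in (-2, -1, 0, 1, 2):
    p = probit_probability(x, slope, threshold)
    print(f"x = {x:+d}  P(y = 1) = {p:.3f}")
```

With a positive slope the printed probabilities rise as x rises, and a larger slope would make them rise faster, which is the reading described in the answer.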
Anonymous posted on Thursday, December 02, 1999 - 6:09 am
I have a good model, with a p-value greater than .05, but some standardized coefficients (both Std and StdYX) have a value of 999.000. What does it mean?
If you have standardized values of 999, you most likely have negative values of the variances/residual variances related to the parameters with standardized values of 999. If the negative values are not significant, you could set them to zero. If they are significant, you might want to consider rethinking your model.
Anonymous posted on Wednesday, April 12, 2000 - 10:46 am
I run an SEM with both latent variables and observed variables. One of the observed variables (which is also a dependent variable) is a categorical variable. Would anyone tell me how to interpret the SE (or StdYX) values? How do I know the significance level of the parameters?
The values found in the column labelled SE are the standard errors of the parameter estimates. The ratio of the parameter value to the standard error can be used to determine the statistical significance of the parameter. The values in the column labelled StdYX are standardized parameter estimates. The parameter estimates are standardized using the variances of the continuous latent variables as well as the variances of the outcome and/or background variables. In the case where the outcome variables are categorical, the variance of the y* variable is used.
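As a rough illustration of the ratio test described above, the sketch below divides an estimate by its standard error and converts the result to a two-sided p-value. It assumes the ratio can be treated as approximately standard normal in large samples; the estimate and standard error are hypothetical numbers.

```python
import math

def z_test(estimate, standard_error):
    """Return the estimate/SE ratio and its two-sided normal p-value."""
    z = estimate / standard_error
    # 2 * (1 - Phi(|z|)) written with the complementary error function.
    p_value = math.erfc(abs(z) / math.sqrt(2.0))
    return z, p_value

# Hypothetical estimate and standard error.
z, p_value = z_test(0.30, 0.12)
print(f"z = {z:.2f}, two-sided p = {p_value:.4f}")
```

A ratio beyond about 1.96 in absolute value corresponds to significance at the 5% level under this normal approximation.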
I want to test by how much the fit of nested models differs. However, I am modeling skewed outcomes (symptoms counts) and thus have to use the MLM estimator which, as I understand it precludes the use of X-squared difference tests. Is there a test you can suggest other than merely inspecting the increase in p-values for worse fitting models?
We have the formulas ready and are doing some final checks before we post them on the website.
Anonymous posted on Tuesday, August 15, 2000 - 8:58 am
We know the ratio of the parameter value to its standard error can be used to determine the significance of the parameter. This ratio is a t statistic and can be compared with +/- 1.96. But what is the df of this t statistic?
Yes, if you go to the home page of www.statmodel.com, you will see a reference to this.
Anonymous posted on Friday, December 15, 2000 - 5:26 pm
Does Mplus provide "effect decompositions", i.e., in causal models using either observed or latent variables, can the total effects in such models be decomposed into their direct and indirect effects (along with standard errors or significance tests for all three types of effects)?
No, Mplus does not provide indirect and total effects including standard errors, just direct effects.
Anonymous posted on Tuesday, May 08, 2001 - 9:55 am
Please clarify a few things for me regarding Std and StdYX values. If I'm constructing a two stage SEM where a latent (CFA) variable is used as both an outcome and a covariate.
When I want to compare the effects of various x (causally prior) variables on a continuous latent variable (call it L), I use the Mplus Std values. These comparisons are valid for both continuous and categorical x's (such as ability score and gender).
When I want to compare the effects of various x and my latent variable L on some additional outcome measure (call it Y), I must use the Mplus StdYX values (in Mplus the Std values for my x variables in this portion of the model are the same as the unstandardized variables). In this case, I can compare (if I wanted to) the magnitude of the CFA loadings with the effect of a continuous x on Y. However, the StdYX values are only valid for continuous x and L variables and there is no way to compare the relative effect of dummy x variables (such as "gender") with the effect of L on Y.
Is this correct ?
bmuthen posted on Thursday, May 10, 2001 - 10:01 am
When x is a dummy variable such as gender, you are correct that you do not want to standardize its slope by its standard deviation (sd). So for L regressed on x, you use Std. For Y regressed on L and x, you use StdYX for L, but for x you need to do a simple hand calculation. In order to get the desired standardization wrt Y but not wrt x, you can either start with the Std value and divide by the estimated Y sd, or start with the StdYX value and de-standardize wrt x by dividing by the sample sd of x; the two ways give the same result.
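The two hand calculations described above can be checked numerically. This is a minimal sketch assuming an observed dummy x, so that the Std value for its slope equals the raw coefficient; `b`, `sd_x`, and `sd_y` are hypothetical values.

```python
import math

# Hypothetical inputs: raw slope of Y on a dummy x, and the two SDs.
b = 0.40
sd_x = 0.50   # sample SD of the dummy x
sd_y = 1.25   # estimated SD of Y

# What the output columns would contain under the stated assumption.
std_value = b                   # Std: x is observed, so no latent rescaling
stdyx_value = b * sd_x / sd_y   # StdYX: standardized wrt both x and Y

# Route 1: start from Std and standardize wrt Y only.
route_from_std = std_value / sd_y

# Route 2: start from StdYX and de-standardize wrt x.
route_from_stdyx = stdyx_value / sd_x

print(route_from_std, route_from_stdyx)
assert math.isclose(route_from_std, route_from_stdyx)
```

Both routes reduce to b / SD(Y), the change in Y, in SD units, for a change in x from 0 to 1.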
Anonymous posted on Thursday, May 10, 2001 - 11:38 am
Bengt, I have two follow-up questions to your response above.
First, I should have mentioned that my additional outcome measure Y is an ordered categorical variable. I do not see where Mplus provides information on the SD of my outcome variable Y. (My model is also a multigroup model so I have allowed tau's to vary across groups and fixed the scale factors for Y to 1 for all groups.)
Second, regardless of whether or not my Y is categorical or continuous, if I follow the procedure you describe above, wouldn't I only be able to compare the effect of dummy x's on Y, but not the effects of continuous x's on Y with the dummy x's on Y, nor the effect of L on Y with the effects of dummy x's on Y ?
With a categorical dependent variable, the sd of y is not used but the sd of y* (the variable that has a linear relationship to the predictors). The y* variance is not printed, but can be deduced via the residual variance which is printed if standardized is requested. But given that the dependent variable is categorical, the second of the two alternatives that I mentioned would seem easiest - de-standardizing the coefficient for the dummy x.
The question of being able to compare a standardized value for a continuous x with a value for a dummy x is the same as in regular regression analysis. It is possible if you keep in mind that the value for the continuous x talks about the amount of sd change in y for an sd change in x, whereas the value for the dummy x talks about the amount of sd change in y for a change from male to female.
Steve Lewis posted on Monday, August 20, 2001 - 6:43 am
In my larger model of two endogenous factors, a manifest categorical indicator and one exogenous factor with five indicators. The exogenous factor has three of the indicators with standardized loadings above 1.0. How should I interpret these or should I set the Lambdas to one?
To say anything further, I would need to see your input and data. You can send them to email@example.com.
duckhye posted on Thursday, May 30, 2002 - 10:47 am
Xie (1989, in the reference section) says that "USUAL" path coefficients are different from the LISCOMP 0.1 standardized solution, because the LISCOMP 0.1 standardized solution refers to the case of unit variances of latent variables CONDITIONAL on exogenous variables. "Usual" path coefficients assume unit variances of all variables UNCONDITIONAL on exogenous variables.
When you said "For Y regressed on L and x, you use StdYX for L, but for x you need to do a simple hand calculation. In order to get the desired standardization wrt Y but not wrt x, you can either start with the Std value and divide by the estimated Y sd, or start with the StdYX value and de-standardize wrt x by dividing by the sample sd of x; the two ways give the same result", are these procedures for calculating "usual" path coefficients mentioned by Xie?
bmuthen posted on Sunday, June 02, 2002 - 12:13 pm
The answer to your last question is yes. Unlike LISCOMP, Mplus does not give standardization to variances conditional on x's.
Anonymous posted on Thursday, July 18, 2002 - 10:07 am
Will Mplus ever be adding the capacity for effect decompositions including standard errors and test statistics for indirect, direct, and total effects? If not, why not?
bmuthen posted on Tuesday, October 01, 2002 - 9:15 am
The topic of standardized coefficients greater than 1 is well treated by Joreskog, see
Anonymous posted on Sunday, January 26, 2003 - 1:23 pm
Hi all! Please help me with this: three continuous latent variables a, b, c predict variable x (which is continuous), which predicts Y (which is a dummy variable). a also predicts Y. So: 1) Should I present Std or StdYX (or StdYX for the first part and Std for the second), and what should I do with a--Y? 2) What "sort" of coefficients are calculated for a-b-c to x, x to Y, and a to Y? 3) Can I somehow compare a to b (their impact on x)? 4) Can I compare the R-square of x (or Y) in models with all three predictors (a b c) and with only two (a b)?
Thank you very much!
bmuthen posted on Monday, January 27, 2003 - 9:45 am
1) You should use StdYX whenever you want to standardize with respect to both latent and observed variables, which seems to be the case here.
2) With a continuous dependent variable you have regular linear regression coefficients, and with a categorical dependent variable you have probit coefficients.
3) Yes, that's the idea behind standardized coefficients.
Anonymous posted on Tuesday, January 28, 2003 - 2:50 am
Hi! Thanks so much! For questions 3 and 4 I just wanted to ask if there is a formal test to compare these parameters. 3. Can I apply the formula (b1-b2)/sqrt(vara+varb) to compare regression weights in a single model? If so, is it the same with probit coefficients? 4. Is there a formula to compare R-squares (to say that one model explains significantly more than the other)? 5. Should I interpret the R-square of a categorical dependent variable the same way as for a "normal" variable (the percent of explained variance)? 6. One more silly question: if I have a single variable that is a predictor of Y, then R-square is the square of its regression weight, I think. But what if I have two or more predictors, one of them has a regression weight of 0.83 (the other is -0.13, n.s.), and R-square is only 0.54? How come? Is this normal?
bmuthen posted on Tuesday, January 28, 2003 - 9:53 am
3) In principle, you can test equality of both standardized and unstandardized coefficients if the denominator of your test takes into account the variance and covariance of what's in the numerator (drawing on "the Delta method"). This follows general regression analysis rules and so is not Mplus specific. The same holds for probit.
4) See a good regression analysis book.
5) Yes, but remember that it is the R-square for y*, not y (see Mplus User's Guide, Tech App 1).
6) See a good regression analysis book.
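Point 3 above says the denominator must account for both the variances and the covariance of the two estimates, which the formula in the question omits. A minimal sketch of that test for two coefficients from the same model follows; it assumes the sampling variances and covariance are already available from the estimated covariance matrix of the parameter estimates, and all numbers are hypothetical.

```python
import math

def difference_z(b1, b2, var1, var2, cov12):
    """z statistic for H0: b1 = b2, using Var(b1 - b2) = var1 + var2 - 2*cov12."""
    se_diff = math.sqrt(var1 + var2 - 2.0 * cov12)
    return (b1 - b2) / se_diff

# Hypothetical estimates with their sampling variances and covariance.
z_diff = difference_z(b1=0.50, b2=0.20, var1=0.0100, var2=0.0144, cov12=0.0020)
print(f"z for the difference = {z_diff:.3f}")
```

Dropping the covariance term, as in (b1-b2)/sqrt(vara+varb), is only appropriate when the two estimates are uncorrelated, which is rarely the case within a single model.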
Anonymous posted on Tuesday, January 28, 2003 - 11:18 am
thank you very much!
Anonymous posted on Tuesday, August 12, 2003 - 1:34 pm
This is a question about the model covariances. I want to report the residual correlations between my DVs (call them X and Y). Under MODEL RESULTS, X WITH Y, I interpret Est./S.E. to be the residual covariance between these variables. I interpret StdYX to be the residual correlation. Is this PRECISELY correct? If not, what should I report, and what do I call it?
The value in the column labelled estimate is the residual covariance. The value in the column labelled est/se is the ratio of the parameter estimate to the standard error of the parameter estimate. This is a z-value. StdYX is the residual covariance standardized using the variances of both y and x. It is not a correlation. It is a standardized covariance.
I am running a SEM model, with independent factors that are highly collinear (they are conceptually different though.) Some of the standardized structural coefficients that I obtain are above 1. From Joreskog (June 22, 1999) I am assuming that this may happen, and I can report those results. Is that right? More troublesome, some of the structural coefficients are high, and negative, even when factor correlations show a strong, positive association among these factors. i.e. The correlation of F1 with F3 is .84. The correlation of F2 with F3 is .54. The correlation between F1 and F2 is .87 (I know this is high, but is the result of scale usage rather than being measures of the same concept.)
When I look at the regression of F3 on F1, the standardized structural coefficient is 1.36! and the coefficient for F3 on F2 is -.84! I know the quality of my data is not very good, but there is nothing I can do to change it. 1) Should I report those results, or is there any fix I may try first? 2) Is the negative coefficient an end result of multicollinearity? I've seen this before in the context of simple linear models, but never found a good explanation (and there is no reason for the sign of the coefficient to be that way.) 3) Also, I've seen this happen more often when using the WLSMV estimation method, instead of ML? Any particular reason for this?
bmuthen posted on Friday, April 30, 2004 - 9:46 am
It does sound like you suffer from multicollinearity and that you need to address that before reporting your results. Either by dropping one of the factors, or by some other approach.
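The sign flip in the question can be reproduced from the reported factor correlations alone. The sketch below assumes, as a simplification, that F3 is regressed only on F1 and F2 and applies the textbook two-predictor formula for standardized coefficients; the three correlations are the ones quoted in the question.

```python
# Factor correlations quoted in the question.
r12 = 0.87  # F1 with F2
r13 = 0.84  # F1 with F3
r23 = 0.54  # F2 with F3

# Standardized coefficients for F3 regressed on F1 and F2
# (two-predictor case; the full model is an assumption here).
denominator = 1.0 - r12 ** 2
beta_f1 = (r13 - r12 * r23) / denominator
beta_f2 = (r23 - r12 * r13) / denominator

print(f"F3 on F1: {beta_f1:.2f}")
print(f"F3 on F2: {beta_f2:.2f}")
```

This gives roughly 1.52 and -0.78, close to the 1.36 and -.84 reported. Because F1 and F2 correlate .87, the denominator is small and the coefficient for F2 turns negative even though every pairwise correlation is positive.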
Carlos posted on Friday, April 30, 2004 - 12:33 pm
I know, but from reading Joreskog's article my understanding was that the standardized coefficients, even if above 1, were OK. He mentions that that happens in the context of multicollinearity. Do you agree with that? I understand how multicollinearity may affect your standard errors or r-square, but I don't understand why the impact on the sign and size of the coefficient. We are dealing with 'importance' measures, so our measures tend to be collinear even when they represent different things. I am not doing this for an academic journal, so just need a sense of direction. I was told that other methods, such as neural networks, could handle this, but I would rather use a confirmatory method.
Thanks again. Carlos
bmuthen posted on Friday, April 30, 2004 - 7:08 pm
I, too, work with importance ratings in a marketing research context, and I also see negative signs here and there where I expect to see positive signs theoretically. The problem is the multicollinearity among the factors. (One way to determine this is to force all the covariances among the factors to zero. You will likely see the sign become positive. Of course, this is a highly unrealistic model).
BTW, I also see this in customer satisfaction models a lot. Another approach is to introduce second order factors. If the first order factors are as correlated as you say, then it is likely a higher order construct is driving the responses to your survey items (or as Bengt said, there is really only a single factor where you wish to see two or more factors). Of course, the problem with second order factors is their interpretation to the reader/client.
Thank you for your comments. I've already tried fixing the covariances to 0, but it did not work. I may check that again though. I do use second order factors once in a while, but in this case I have a model that worked pretty well among more sophisticated audiences and the general public in some countries, but failed in one region. I know this is driven more by scale issues than by cultural differences (sometimes, to deal with importance measures, we use a variant of conjoint analysis to get more discrimination among importance measures, but not in this case), but that is hard to prove with this data (I did not design the study). In any case, thanks again for your and Bengt's comments.
Perhaps adding a factor across all items measured on the same scale in addition to the original factors will capture the scaling effect. William Dillon wrote an article in JMR about using such factors in brand equity research, where scaling issues like yours are a big deal. The factor attempts to parse out the scale effect, leaving your original factors to capture the unique measure of each item. Just a thought...
That seems like a good idea. I will try that. Thanks!
Anonymous posted on Thursday, August 26, 2004 - 1:09 pm
A reviewer has asked me to provide the C.I.s for the StdYX that I am squaring to report a genetic correlation for twins similar to the Prescott paper. Is there a quick way for me to calculate these values? Thanks in advance. Tom
Anonymous posted on Tuesday, November 16, 2004 - 7:44 am
In reference to the message regarding the standardized values of 999.000: I have a cross-lagged panel design which gives "Std" values that are realistic; however, my StdYX values are 999.000. Is this simply an artifact of testing against the Poisson distribution?
I would not recommend standardizing the variables. I would use them in their raw metric.
Anonymous posted on Wednesday, April 20, 2005 - 11:25 am
I used Mplus to run some logistic regressions to use in mediation analyses. When calculating mediation with logistic regression coefficients, I want to use the standardized regression coefficients. I noticed that SAS and MPLUS do not standardize these coefficients in the same way. Is MPLUS more accurate? If so, why?
bmuthen posted on Wednesday, April 20, 2005 - 11:45 am
Mplus uses the logistic density variance of pi-squared/3 as the residual variance in its standardization. This is in line with conceptualizing the binary outcome as having an underlying continuous response variable with a logistic density for the residual (total response variable variance is the explained part plus the residual part). I don't know what SAS does.
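The standardization described above can be sketched for the simplest case of one continuous predictor. This assumes the total variance of the underlying response variable is the explained part plus a residual of pi-squared/3, as stated in the answer; the slope and predictor variance are hypothetical.

```python
import math

# Residual variance of the underlying response under a logistic density.
LOGISTIC_RESIDUAL_VARIANCE = math.pi ** 2 / 3.0

def stdyx_logistic(b, var_x):
    """Standardized logistic slope for a single predictor x."""
    explained_variance = b ** 2 * var_x
    var_ystar = explained_variance + LOGISTIC_RESIDUAL_VARIANCE
    return b * math.sqrt(var_x) / math.sqrt(var_ystar)

# Hypothetical raw logistic slope and predictor variance.
standardized = stdyx_logistic(b=1.0, var_x=1.0)
print(f"standardized slope = {standardized:.3f}")
```

With several predictors the explained part would also involve their covariances, so this single-predictor version is only a starting point.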
Anonymous posted on Wednesday, June 08, 2005 - 8:11 am
I am confused concerning the use of Std and StdYX. I am estimating a model with both categorical and continuous dependent and independent variables. I estimate a path model so I have no latent variables.
When I want to use standardized solution should I use Std or StdYX?
suppose x1 = gender (categorical), x2 = achievement (continuous) , x3 = amount of hours of math(categorical), x4 attitude(continuous)
Is it correct when I say:
1) x2 ON x4: I use Std.
2) x3 ON x4: I use Std.
3) x2 ON x1 x4 x3: I use StdYX for x4, and for x1 and x3 I divide StdYX by SDx1 or SDx3 (or divide Std by SDx2), because I don't want to standardize the slope of a dummy variable such as gender by its SD (see earlier in this discussion). However, I use the Theta parameterization, and no variances or residual variances are estimated for the categorical dependent variables. How can I make this calculation?
4) x3 ON x1 x2 x4: I use StdYX for x4, but what should I do with x1 and x2? Since no variances can be estimated for categorical variables under Theta, I cannot compute the standardized path coefficients. Is that correct?
Can you suggest a solution?
bmuthen posted on Wednesday, June 08, 2005 - 6:19 pm
3 facts help you to answer your own questions:
1. These decisions are not influenced by the dependent variable scale nor by Delta vs Theta parameterization since with a categorical dependent variable, the standardization is done with respect to the SD of y*, the underlying continuous latent response variable - so you can act as if the dependent variable is continuous.
2. You don't want to standardize with respect to a binary independent variable because you are not interested in the effect of a 1 SD change in such a variable but in the change from 0 to 1.
3. Mplus does not print out what you want according to 2. If you use StdYX, you have to unstandardize with respect to the binary independent variable (x), that is divide the StdYX value by the x SD.
Anonymous posted on Thursday, June 09, 2005 - 2:24 am
I am still confused. Could you please clarify what you mean? As I understand correctly for every binary (or categorical) independent variable I have to divide StdYX by x SD. But what do you mean by nr 2. I don't have to standardize at all and just use the raw coefficient for binary independent variables?
I do not understand how I can figure out the x SD since in my output I only get the residual variances of my continuous variables. What do I have to do to get the variances of the binary variables?
bmuthen posted on Thursday, June 09, 2005 - 6:22 pm
Let me first restate how a standardized coefficient is computed from the raw coefficient b
StdYX(b) = b*SD(X)/SD(Y).
Regarding your first question, no, you DO want to standardize, but not wrt (with respect to) x. In other words, you standardize wrt y (so divide by SD(Y)), but not wrt x (so don't multiply by SD(X)). Standardizing only wrt y gives a coefficient that tells you how many y SD units of change you get for a 1-unit change in x (a change from x=0 to x=1, say).
Regarding your second question, you get SD(X) by doing a type=basic run with x included in the Usev list.
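For a 0/1 variable, SD(X) can also be worked out from the proportion of ones, which gives a quick check on the value taken from a basic run. The sketch below uses a small hypothetical data vector and a hypothetical StdYX value; it uses the divide-by-n form of the SD, which is an assumption and may differ slightly from what a given program reports.

```python
import math
import statistics

# Hypothetical 0/1 predictor (for example, gender).
x = [0, 0, 0, 1, 1, 0, 1, 0, 1, 1]

p = sum(x) / len(x)            # proportion of ones
sd_x = statistics.pstdev(x)    # population SD (divides by n)

# For a binary variable the population SD equals sqrt(p * (1 - p)).
assert math.isclose(sd_x, math.sqrt(p * (1.0 - p)))

# Hypothetical StdYX value for the slope of y on x.
stdyx = 0.20

# Undo the standardization wrt x, keeping the standardization wrt y.
y_standardized_only = stdyx / sd_x
print(f"SD(X) = {sd_x:.3f}, y-standardized slope = {y_standardized_only:.3f}")
```

The result is the change in y, in SD units, when x goes from 0 to 1.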
I have problems interpreting Std and StdYX. I am estimating a path model, so I don't have any latent variables. I have continuous as well as binary dependent variables. In my results Std is always equal to b. In the manual is said that the coefficient in Std are standardized using the variances of the continuous latent variables. However I don't have latent variables. Or does Mplus assume that I estimate latent variables and the variances are 1 so that can explain why b and std are always exactly the same?
I understand (from previous discussion) that for binary independent variables you have to divide StdYX by SDx. Does this also apply when the dependent variable is binary?
I don't understand what is meant in the manual by 'for stdYX the coeff are stand using variances of the continuous latent variables and the var of the background and/or outcome var'. As I understand StdYX= b*SDx/SDy so in this formula what are the continuous latent var and what are the var of the background and/or outcome var? As you can see I am very confused! Could you please clarify this?
In addition to my previous question. I can perfectly calculate the stdYX when variables involved are continuous (with the formula b*SDx/SDy). But when one or both variables are binary the calculations with the formula do not correspond anymore with Mplus output.
So what is happening? How does Mplus calculate these standardized coefficients when binary or, more generally, categorical variables are involved?
BMuthen posted on Tuesday, June 14, 2005 - 9:18 am
If there are no latent variables, then STD is the same as the raw coefficient.
The answer to your second paragraph in your first message is no.
With binary dependent variables, the standardization uses the estimated variance of y*. I am not sure if we print that anywhere that you could do it by hand.
Does anybody have any idea how to print the estimated variance of y*? I need this to correct the stdYX so I can calculate the correct standardized coefficient for binary independent variables on binary dependent variables.
I don't think this can be done. Come back after July 1 and I will research it.
You don't really need to know the y* variance because the standardization choice is only dependent on the independent variable being binary or not.
Anonymous posted on Wednesday, June 22, 2005 - 2:26 pm
I have an observed, binary independent variable and an observed, continuous dependent variable. If I use regression to examine the relationship the standardized beta is the same as the StdYX. So, I am unclear why we correct the StdYX value (by dividing by SD X) here, but do not correct in standard regression?
BMuthen posted on Thursday, June 23, 2005 - 3:48 am
With binary observed independent variables, the STDYX needs to be adjusted to reflect that you are interested in the change in the dependent variable when the independent variable goes from 0 to 1 rather than increasing one standard deviation, which is not of interest with a binary independent variable. This is the same as in ordinary regression.
Anonymous posted on Monday, June 27, 2005 - 8:30 am
I've got yet another question on standardization of path coefficients involving latent variables. This refers to paths leading from a latent variable (f) to a range of continuous and categorical observed variables (y1-y5).
I had assumed that the coefficients given in the "Std" column were standardized by multiplying the "Estimates" column by the variance of f (f being the predictor). In the model I have run, however, this assumption doesn't square up with the variance of f provided by the output. Rather, the multiplier I would need to turn my "Estimates" into "Std" values is equal to the standardized loading of one of the observed variables (x) that I used to measure the factor with (the one whose unstandardized loading is set to 1).
[Approximate example: Y1 on F: EST=0.3; STD=0.2; F by X: EST= 1.0; STD=0.67; Variance (F): EST=0.5; STD=1.0]
I'd be grateful if you could help me make sense of this.
When f is the predictor, the STD column is the estimate multiplied by the standard deviation of f. You can find the variance of f in TECH4.
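Applying this rule to the approximate numbers in the question shows why the multiplier matched the standardized loading of the marker indicator. This sketch uses only the rounded values quoted in the question, so the results are approximate.

```python
import math

# Approximate values quoted in the question.
estimate_y1_on_f = 0.3
estimate_f_by_x = 1.0   # marker loading fixed to 1
variance_f = 0.5

sd_f = math.sqrt(variance_f)

# Std = raw estimate multiplied by SD(f), not by Var(f).
std_y1_on_f = estimate_y1_on_f * sd_f
std_f_by_x = estimate_f_by_x * sd_f

print(f"SD(f) = {sd_f:.3f}")
print(f"Std for Y1 on F = {std_y1_on_f:.3f}")
print(f"Std for F by X  = {std_f_by_x:.3f}")
```

Because the marker loading is fixed at 1, its Std value is simply SD(f), about 0.71 here against the approximate 0.67 quoted, and the Std for Y1 on F comes out near the quoted 0.2.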
Anonymous posted on Friday, September 02, 2005 - 12:08 pm
If I have an SEM model where the outcome is a latent variable and the predictors are observed exogenous variables, can the StdYX beta coefficients be interpreted the same way as in a regular OLS regression? In other words, if I square the StdYX beta coefficients, can I interpret them in the same way as in the OLS model?
Factors in SEM are continuous variables, therefore the coefficients are linear regression coefficients. So yes. Regarding your second question, the answer would be yes I guess, but I'm not sure what you are getting at by squaring them.
I have three continuous latent variables (X, Y, Z) and am interested in finding the partial correlation between X and Y while controlling for Z. For this I used the following model syntax:
MODEL:
  z BY var1-var6;
  y BY var7-var11;
  x BY var12-var16;
  x y ON z;
As I have understood, the correlation between (the error terms of) X and Y in this model is r = cov(xy) / (sd(x)*sd(y)), and this correlation would be the partial correlation between X and Y, controlling for Z. Unfortunately, the correlation displayed in the Mplus-output (stdYX) is much lower than the partial correlation I should obtain when using the formula above (I get "the right value" when using another SEM-program). Is it wrong to interpret the StdXY for "X WITH Y" in the way I do?
BMuthen posted on Saturday, November 12, 2005 - 5:49 pm
The StdYX value for the residual covariance between the residuals for x and y is not the correlation between these residuals but the standardized residual covariance. The correlation between the residuals is obtained as the residual covariance divided by the product of the standard deviations of the two residuals.
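The residual correlation described above is a one-line hand calculation once the residual covariance and the two residual variances have been read off the unstandardized output. A minimal sketch with hypothetical values:

```python
import math

def residual_correlation(residual_cov, residual_var_x, residual_var_y):
    """Residual covariance divided by the product of the residual SDs."""
    return residual_cov / (math.sqrt(residual_var_x) * math.sqrt(residual_var_y))

# Hypothetical unstandardized estimates.
r_resid = residual_correlation(residual_cov=0.12,
                               residual_var_x=0.36,
                               residual_var_y=0.64)
print(f"residual correlation = {r_resid:.2f}")
```

The key point is that the divisor uses the SDs of the residuals, not the SDs of x and y themselves, so the result differs from a covariance standardized by the total variances.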
rpaxton posted on Saturday, December 10, 2005 - 11:19 pm
Greetings once again,
Thanks for the amount of support that you give for Mplus, you guys have been really helpful.
On another note, I have been going through some articles in the physical activity field that are based on SEM approaches. When examining an SEM with both latent and manifest variables, which is the better symbol to use, betas or gammas? I have noticed that some researchers have used these symbols interchangeably. Could you provide a little insight on their assumptions? Before I read those articles, I assumed that all paths were standardized betas.
The choice of beta or gamma to refer to unstandardized regression coefficients is arbitrary. Beta usually refers to regressions among latent variables. And gamma is usually used for regressions where the covariates are observed.
Is it possible to constrain standardized coefficients?
I have successfully constrained unstandardized coefficients in a model in order to test whether constraining coefficients to be equal results in decreased fit. However, it occurs to me that it is really the difference between *standardized* coefficients that would matter, and I can't seem to work out how to constrain them.
For example, if I want to say that A definitely relates more strongly to B than C does to B, the standardized coefficients from A to B and from C to B are the difference of concern--is that correct?
bmuthen posted on Friday, February 10, 2006 - 7:01 am
This is possible in Mplus Version 4, which will be released at the end of the month.
anonymous posted on Saturday, February 18, 2006 - 11:35 pm
I went through the discussions on StdYX and looked up the Mplus manual and have not been able to figure out how to interpret the intercept estimates in the StdYX column.
I ran a simple regression analysis using Ex3.1.dat. I regressed y1 on x1 and got the following results:
                    Est.      S.E.  Est./S.E.       Std     StdYX
Y1 ON X1           0.986     0.050     19.891     0.986     0.665
Intrcpts Y1        0.484     0.052      9.327     0.484     0.312
What does 0.312, the estimate of the intercept Y1 in the StdYX column, denote? At first I thought the intercept would be equal to 0, as expected from a standardized regression equation (the Y1 on X1 for StdYX, 0.665, is a correlation estimate). I checked and it's not the mean of Y1 either (it's .4848). How would one interpret the estimate?
Also, how would one interpret the intercepts of the indicators in the StdYX column of CFA? My understanding is that they are from the NU vector, correct? At first I thought they would be zeros too.
Hello, I am trying on confidence intervals at the moment and I would be interested in CIs for the standardized coefficients. Is it possible to obtain these directly from MPLUS and if yes: how? Thank you!
Hello Bengt, thank you for this offer, but I just solved the problem with the help of my statistics professors: you can easily calculate the standardized confidence intervals by dividing the upper and lower bounds by the same factor that you get when you divide the unstandardized coefficient by the standardized coefficient.
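The rescaling described in this post can be written out as follows. This is a sketch of the poster's approach only; it treats the scaling factor as a fixed constant and so ignores sampling variability in the standard deviations, which makes the resulting interval an approximation. All numbers are hypothetical.

```python
import math

def rescale_interval(unstandardized, standardized, lower, upper):
    """Divide the unstandardized CI bounds by the unstandardized/standardized ratio."""
    factor = unstandardized / standardized
    return lower / factor, upper / factor

# Hypothetical coefficient, its standardized value, and its 95% CI.
std_lower, std_upper = rescale_interval(unstandardized=0.50,
                                        standardized=0.25,
                                        lower=0.30,
                                        upper=0.70)
print(f"approximate standardized CI: ({std_lower:.2f}, {std_upper:.2f})")
```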
sivani sah posted on Sunday, April 02, 2006 - 2:09 pm
Dear Dr. Muthen, We got a standardized coefficient of 1.26 in SEM. Is this estimate wrong, or can a standardized coefficient be larger than 1? If it can be, what are the reasons for getting coefficients larger than 1? If a standardized coefficient greater than 1 can be reported, is there any literature that we can refer to? I will appreciate your help in this regard. Thank you. Sivani
I am working on a research project where our team is trying to assess the nature of association between a set of multiple choice items and open ended items within a particular subject area, like math or reading.
We created parcels of the MC and OE scores, specified one latent, and specified the error variances for the two observed variables. We also fixed the MC path to 1. This was done to satisfy the t-rule. Thus, one path between the OE and the latent variable and the variance of the latent were estimated, leaving one degree of freedom.
One question of interest is if the standardized paths of the OE and the MC to the latent variable are significantly different. I was tempted to use the CINTERVAL output to determine if the estimates overlap, but was hesitant to draw any conclusions based on this. I realize that this is an improper use of CI's, and I didn't know how this would transfer to the standardized estimates.
Is there a way to test the differences of the latent correlations using the output at hand? Fisher's Z can be used with Pearson correlations, but I was reasonably sure this did not apply here.
To test equality of standardized coefficients, see Example 5.20 in the Version 4 User's Guide.
Monti Vitti posted on Thursday, August 10, 2006 - 11:05 am
Dear Linda & Bengt
I have a question regarding a cross-lagged effects model with latent variables. Joreskog (1999) and Finkel (1995) could not help. The standardized coefficient for one stability estimate is way above 1 (the residual variance is also negative, although not significant). This is from prior X to posterior X.
I've checked for multicollinearity with SPSS (between the items that compose the factors, and there isn't any). Q1) Could this anomaly be an effect of the autoregressive element, and if yes, is this acceptable? Q2) Could it be a by-product of my running the model on a homogeneous subsample? I don't get the problem in the full sample.
Q3) Setting the error variance to 0 produces normal estimates, yet the model falls apart. The best solution I have found is to delete the correlation between prior X and prior Y. But what does this mean?
It sounds like the model is misspecified - there might be restrictions imposed that are not suitable, for instance that the time 3 outcome is influenced by the time 2 outcome but not the time 1 outcome. The subsample may be systematically different from the full sample in this regard.
Laney Sims posted on Monday, October 09, 2006 - 8:05 am
I am confused about how the "estimates" are computed. I read that these are the unstandardized regression coefficients, in the sense that they are not altered by multiplying by SD(X)/SD(Y), or by 1/SD(Y). However, in general linear regression, "standardized" means that the coefficients are constrained to be between -1 and 1 (computed by subtracting the mean from each observation then dividing by the sample SD). Are your unstandardized betas still standardized in this respect, or are they the unaltered regression coefficients for the raw data?
No, the unstandardized betas are not standardized. They are unaltered regression coefficients.
Laney Sims posted on Monday, October 09, 2006 - 8:39 am
Thank you for the clarification. I have one more quick question: if the unstandardized coefficient is significant (based on the t-value), is it safe to assume the associated standardized coefficients are also significant?
I am estimating several path models (no latent variables) where: TYPE IS COMPLEX MISSING H1 ESTIMATOR IS MLR
I am interested in calculating regression coefficients that are standardized in various ways. Three scenarios:
1) I have a binary IV & continuous DV. I want to express the standard deviation (SD) change in the DV that occurs when my IV changes from a 0 to a 1. How do I derive this standardized coefficient from the provided StdYX value?
2) I have an ordinal IV (e.g., with 4 potential values) & continuous DV. Is it customary to report the SD change in the DV that occurs with one unit change in my IV, or the change that occurs in the DV with a SD change in my IV (even though my IV is ordinal)? If the latter is the correct approach, do I present the StdYX value as is?
3) Same scenarios as the above two, except that now my DV is a count variable. I guess the main difference in the question would be does one usually standardize the beta using the variance of the count variable (or would it make sense to present the unstandardized beta instead; so that the interpretation would be the unit change in the count variable that occurs with one unit change in the IV)? Finally, would the answer to this question change if the count DV was a scale of three summed count variables?
1. Multiply the StdYX coefficient by the standard deviation of x.
2. An ordinal covariate is treated as though it is continuous in regression. The regression coefficient is the change in y for a one unit change in x. You would need to create a set of dummy variables if this is not what you want.
3. Count variables don't have variances so standardized coefficients are not available.
Thank you for your replies. Follow-up questions on points 1 & 2:
1. How do I obtain the standard deviation of x in Mplus (given that I have complex survey data that require weight, strata, and cluster variables; TYPE IS COMPLEX MISSING H1)?
2. For the models including ordinal IVs, assume that I would like to present the change in Y (in standard deviations) for 1 unit change in X. Would I do the same as in point 1 (multiply the StdYX by the SD of X) to obtain this coefficient?
My understanding is that the StdYX coefficient is the change (in standard deviations) in Y for a 1 SD change in X, and the unstandardized coefficient is the unit change in Y for a 1 unit change in X. Neither of these coefficients is appropriate if I would like to present the change in Y (in standard deviations) for a 1 unit change in X. Is my understanding not correct?
Sorry to throw out another post before you had a chance to answer my first one, but I need further clarification on a point you made on November 11.
My question was: 1) I have a binary IV & continuous DV. I want to express the standard deviation (SD) change in the DV that occurs when my IV changes from a 0 to a 1. How do I derive this standardized coefficient from the provided StdYX value?
Your answer was: 1. Multiply the StdYX coefficient by the standard deviation of x.
Bengt said on June 8, 2005: If you use StdYX, you have to unstandardize with respect to the binary independent variable (x), that is divide the StdYX value by the x SD.
My X variable (predictor) is binary and my Y variable (outcome) is continuous. Wouldn't I need to divide StdYX by the standard deviation x to derive the value that reflects change in Y (in standard deviations) for one unit change in X (e.g., male to female, for the binary variable "sex")?
I apologize. My mind thought divide and my fingers typed multiplied. To clarify, the StdYX standardization of a regression coefficient is:
StdYX = Beta * sd(x) / sd (y)
so you must divide by sd (x) to take this out of the metric of standard deviation units of x. Then it becomes Beta / sd(y). This represents the change in standard deviation units of y for a one unit change in x.
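The relation just given, StdYX = Beta * sd(x) / sd(y), and the division by sd(x) can be checked with a small Python sketch. The function names and the numeric values are hypothetical and serve only to illustrate the arithmetic.

```python
def stdyx(beta, sd_x, sd_y):
    """StdYX standardization of a regression coefficient."""
    return beta * sd_x / sd_y


def sd_change_in_y_per_unit_x(stdyx_value, sd_x):
    """Undo the x part of the standardization: StdYX / sd(x) = Beta / sd(y)."""
    return stdyx_value / sd_x


# Hypothetical values for a binary x (e.g. a 0/1 covariate) and continuous y.
beta, sd_x, sd_y = 1.5, 0.5, 3.0
s = stdyx(beta, sd_x, sd_y)
effect = sd_change_in_y_per_unit_x(s, sd_x)
print(s, effect, beta / sd_y)
```

The last two printed values agree, which is the point of the correction: dividing StdYX by sd(x) leaves Beta / sd(y), the change in y in standard deviation units when x goes from 0 to 1.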
You get the standard deviation of x by doing TYPE=COMPLEX MISSING BASIC; with no MODEL command.
If you set the metric of the factors by fixing the factor variance to one and allowing all factor loadings to be free, then the factor covariance becomes a correlation. The raw coefficient in column one will be the same as the Std coefficients which standardize by factor variances which are one. StdYX will be the raw coefficient standardized using the observed variable variances.
Then you should send your input, data, output, and license number to firstname.lastname@example.org so I can see exactly what you are doing. If all factor variances are one, then it should be so there must be something else you don't see.
I would like to obtain the standardized residual covariance matrix for a CFA with ordinal data using WLSMV. Mplus 4.2 gives me the residual correlation matrix when I specify [output: residual]. Is there a way to also obtain the standardized covariance matrix or, otherwise, to calculate this by hand? For clarification, here is an example syntax of the type of model I'm referring to: ANALYSIS: ESTIMATOR = WLSMV; VARIABLE: NAMES ARE y1-y7; USEVARIABLES ARE y1-y7; CATEGORICAL ARE y1-y7; MODEL: f BY y1 y2-y7; OUTPUT: STANDARDIZED RESIDUAL;
Thanks for asking. Joreskog described a method of obtaining residuals that are standardized with respect to their asymptotic standard errors. This was described in Joreskog (2002) for ordinal data: http://www.philscience.com/hangul/statistics/ssi/lisrel/techdocs/ordinal.pdf Standardized residuals are handy for identifying residuals that are larger than what one would expect from sampling error.
Can these standardized residuals be obtained in Mplus?
The current version of Mplus does not give standardized residuals. This will be added in a future version. Modification indices which are available are typically better at pinpointing the source of misfit.
Greetings, I'm doing a CFA with categorical indicators, another CFA with continuous indicators, and a final SEM testing relations between both forms of latent variables. Like many others in this discussion, I remain confused as to the use of STD and STDYX coefficients.
My personal questions are: STD or STDYX in the following? (1) CFA with categorical indicators: factor loadings and factor correlations. (2) CFA with continuous indicators: factor loadings and factor correlations (here I believe we use STDYX all over). (3) Latent X on latent Y (where latent X relies on categorical indicators and latent Y on continuous indicators). (4) Latent X1 on latent X2. (5) Latent Y1 on latent Y2. (6) Latent Y on latent X. (7) The effect of a categorical covariate on latent X or Y (here, we use STD I think or the estimate directly). (8) The effect of a continuous covariate on latent X or Y (here, we use STD I think or the estimate directly).
However, as I realize that you have answered related questions many times in the past, I would suggest (that would certainly help me a lot) that you (if you have time) develop a summary table indicating which one to use in which case, to put either on the website and/or in the next version of the manual.
You can find the definition of Std and StdYX in the user's guide in Chapter 17. In 99 percent of the cases, StdYX is most appropriate. With binary covariates, StdYX should be adjusted so that the standardization uses only the standard deviation of y, not the standard deviation of x. Std would be used if for some reason, you want to see standardization using only the standard deviations of the latent variables.
Hello Drs. Muthen, Can you shed some light on this? I ran an SEM model with 4 latent vars predicting a manifest var. 3 of 4 IVs are NS, but the STDYX estimate for one of the NS is larger than that of the significant vars. How could this come about? Thank you for any insight you might have.
The size of a coefficient is not necessarily related to significance for raw or standardized coefficients. The size of the raw coefficient is related to the scale on which it is measured. Raw coefficients are standardized using standard deviations that may vary in size.
Laura Pierce posted on Wednesday, August 29, 2007 - 10:46 am
Monti Vitti posted on Monday, September 03, 2007 - 7:38 am
Dear Drs. Muthen,
I am running a cross-lagged effects model, using surface indicators (2 waves). When reporting coefficients of interest (cross-effects) I use unstandardized estimates, for reasons of comparison across subsamples. I use the typical template of "a 1 point change in X predicts an X.X point change on the Y scale". Yet, I have been asked to provide a more "intuitive metric". What do you think this means, and how do I do it? Many thanks, Mon.
You could use standardized coefficients which talk about standard deviation units.
Anna Siser posted on Friday, February 01, 2008 - 11:05 am
Hi, In my non-recursive model beta/standardized coefficient is greater than one. The variable in question is unobserved and is indicated by two observed variables. I have included a description of the path diagram.
Measurement Model:
Belief1<--Factual Beliefs (1)
Belief2<--Factual Beliefs
Belief3<--Factual Beliefs
Belief4<--Factual Beliefs
Belief5<--Factual Beliefs
Belief(n)<--errorbelief(n) (1)
Cuts<--Policy Preferences (1)
Limits<--Policy Preferences
Cuts<--errorcut (1)
Limits<--errorlimit (1)
Govt1<--AntiGovt (1)
Govt2<--AntiGovt
Govt(n)<--errorgovt(n) (1)
Structural Model:
Policy Preferences<--Factual Beliefs
Factual Beliefs<--policy>1)
Factual Beliefs<--know
Policy Preferences<--AntiGovt
Factual Beliefs<--ideology
Policy Preferences<--ideology
Factual Beliefs<--errorfactbelief
Policy Preferences<--errorpolpref
errorfactbelief<-->errorpolpref
Variables: Belief(n), Cuts and Limits: There are five categories. Govt(n): There are four categories. Know: Variable is dichotomous. Ideology: A three category variable.
Other information: The vars have a (rough) normal distribution. The size of the data set is n= 248.
I have noticed that in version 5 the test statistic associated w/ a given standardized effect is *different* than the test statistic associated w/ the same unstandardized effect (this was not true in earlier versions of Mplus). I did not think that this was possible (e.g., in simple OLS case, we don't get separate tests for b vs. B associated w/ same covariate).
1. Can you explain the difference in how these test statistics are computed?
2. Can you explain why this feature was added in v5 (what Q does it answer that was previously unanswerable)?
3. If I am reporting standardized effects in text, would you recommend that I report test statistic assoc w/ unstandardized or standardized effect?
Test statistics were not given for standardized coefficients in earlier versions of Mplus. The test statistic was for the raw coefficient.
Statistical significance of raw versus standardized coefficients is not necessarily the same. For a regression coefficient, a raw regression coefficient is associated with a one unit change in x while a standardized regression coefficient is associated with a one standard deviation unit change in x.
You should report the test statistic associated with the coefficient you are reporting.
Typically the two types of test statistics give the same significant or insignificant result. If you observe large differences in p values, we'd be interested in seeing them - please send to email@example.com.
Just an observation/question. Running an SEM model and requesting confidence intervals on standardized indirect effects, seems to only provide unstandardized indirect effects confidence intervals, in the standardized output. I did a search of the posts and didn't see anyone experiencing this, I wonder if I missed something? Here is my run-stream: ======================================== data: file is modeldataless1.txt; variable: names are SurveyID city PACIYM PACIM SSRWE pbrwe PSITOT FMCOH FMSUPRT SSRWCY FMCOHY; usevariables are PACIYM PACIM SSRWE pbrwe PSITOT FMCOH FMSUPRT SSRWCY FMCOHY; missing are .; model:
ssrwe on fmsuprt pacim ; pbrwe on psitot ; ssrwcy on fmcohy ; pacim on fmcoh psitot ; paciym on psitot fmcohy; fmcoh with psitot fmsuprt fmcohy; psitot with fmsuprt fmcohy; fmsuprt with fmcohy;
model indirect: ssrwe ind pacim fmcoh; ssrwe ind pacim psitot ; output: Standardized cinterval; ========================================= Thanks
I can't answer this without seeing the full output. Please send it and your license number to firstname.lastname@example.org. The STANDARDIZED and CINTERVAL options should apply to regular and MODEL INDIRECT results.
A colleague is using Mplus to regress an ordered categorical variable with three levels onto two continuous latent variables which are themselves measured by multiple ordered categorical variables with three levels each. My colleague is using ML estimation with the LOGIT link to obtain odds ratios of the regressions of the endogenous observed variable onto the two latent variables.
Mplus produces the odds ratios and 95% confidence intervals for the odds ratios for this analysis. Our understanding is that the odds ratios represent the change in the ratio of the odds of the outcome per unit change in the latent variable. We are wondering if it would be possible to obtain odds ratios that represent change per standard deviation of the explanatory latent variables? If so, how would one do it in Mplus?
So, I would take the STD value and raise e to its value for each point estimate, lower confidence limit value, and upper confidence limit value for which I was interested in obtaining the OR and 95% CIs for the OR in the standardized metric?
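The calculation proposed in this post can be sketched in Python. This is only an illustration under stated assumptions: the inputs are taken to be a logit coefficient in the Std metric (change in log odds per one standard deviation of the latent variable) together with confidence limits for that same coefficient. The function name and the numbers are hypothetical.

```python
import math


def odds_ratio_with_ci(coef, ci_lower, ci_upper):
    """Exponentiate a logit coefficient and its confidence limits.

    exp() is monotone increasing, so the limits keep their order.
    """
    return math.exp(coef), math.exp(ci_lower), math.exp(ci_upper)


# Hypothetical Std estimate and 95% limits on the logit scale.
or_est, or_lo, or_hi = odds_ratio_with_ci(0.40, 0.10, 0.70)
print(or_est, or_lo, or_hi)
```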
Hello! I entered 2 continuous and centered covariates and 2 dummy coded covariates in a multinomial logistic regression to predict class membership. mplus 5.1 gives me STD as default and STDY/STDYX when requesting Stand-solutions. Which solution should I use, when I want to have comparable coefficients? Should I use also stand. results for the dummy variables? My guess is, that I have to use STD for all covariates!?
You should use STDYX for continuous covariates and STDY for dummy coded covariates. However, these would not be comparable because STDYX is for a one standard deviation change and STDY is for a change from male to female for example.
ok, many thanks! I have some last questions. 1. That I have entered centered cont. covariates isn't a problem? 2. But I can make comparisons within both groups of cont. and dummy variables!? 3. I compute the exp(b)'s on the basis of your recommended STDYX and STDY solutions for both groups of covariates!? (I have to compute them, because mplus does not offer them using imputed data sets).
4. Is there any possibility to get raw coefficients and their SDs in a multinomial regression in Mplus 5.1? I've heard that in Mplus 4.21 one could divide the raw coefficients of the centered variables by the SE (which is similar to the SD in Mplus) to get standardized coefficients!?
Multinomial regression gives raw coefficients and their SEs. If you want coefficients standardized with respect to covariates you multiply by their SDs (from Sampstat). You won't get SEs for those standardized coefficients.
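As a rough illustration of the advice above, the following Python sketch multiplies each raw coefficient by the standard deviation of its covariate and ranks the results by absolute size. The covariate names, coefficients, and SDs are invented for the example; in practice the SDs would be read from the sample statistics output. As noted in the post, no standard errors are obtained this way.

```python
def standardize_wrt_covariates(raw_coefs, covariate_sds):
    """Multiply each raw coefficient by the SD of its covariate."""
    return {name: raw_coefs[name] * covariate_sds[name] for name in raw_coefs}


# Hypothetical raw multinomial logit coefficients and covariate SDs.
raw = {"x1": 0.8, "x2": -0.25}
sds = {"x1": 0.5, "x2": 2.0}

std = standardize_wrt_covariates(raw, sds)

# Rank covariates by the absolute size of the standardized coefficient.
ranked = sorted(std, key=lambda name: abs(std[name]), reverse=True)
print(std, ranked)
```

In this made-up case x2 has the smaller raw coefficient but the larger coefficient per standard deviation, which shows why the two metrics can rank covariates differently.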
sorry to be tenacious, but this is not the same as standardizing covariates before entering the regression and using their coefficients!? However, I checked the literature and your advice seems ok, with respect to rank the coefficients in a meaningful way. (ref.:http://www2.chass.ncsu.edu/garson/pa765/logistic.htm)
final question: is it ok to exponentiate these "standardized" coefficients and to compare these Exp (b)? Thank you!
When I use 'MODEL INDIRECT', 'STDYX' and 'CINTERVAL' in a path analysis I get some unexpected results.
Whilst 'MODEL RESULTS' differ from 'STANDARDIZED MODEL RESULTS: STDYX Standardization' and 'TOTAL, TOTAL INDIRECT, SPECIFIC INDIRECT, AND DIRECT EFFECTS' differ from 'STANDARDIZED TOTAL, TOTAL INDIRECT, SPECIFIC INDIRECT, AND DIRECT EFFECTS: STDYX Standardization', as would be expected, 'CONFIDENCE INTERVALS OF TOTAL, TOTAL INDIRECT, SPECIFIC INDIRECT, AND DIRECT EFFECTS' are IDENTICAL to 'CONFIDENCE INTERVALS OF STANDARDIZED TOTAL, TOTAL INDIRECT, SPECIFIC INDIRECT, AND DIRECT EFFECTS STDYX Standardization'.
Derek Kosty posted on Tuesday, August 26, 2008 - 3:28 pm
In this CFA with categorical outcomes I set the residual variance of a continuous latent variable to equal zero in order to make the PSI matrix positive-definite:
MODEL: mood by LMDD4 LDYS4 LDPD4; anxiety by LGOA4 LPTS4 LSPE4 LSOC4 LPAN4 LOBC4; intern by mood anxiety; anxiety@0;
As a result, the standardized factor loading from "intern" to "anxiety" is equal to one (and has no standard error). I tried working through the formula bStdYX = b*SD(x)/SD(y) in an effort to understand the issue, but I cannot seem to find SD(x) or SD(y) in my output. I need to know why this is for the writeup of these analyses.
In this application, SD(x) is the SD for intern (so sqrt of the variance that is printed) and SD(y) is the SD for anxiety (see Tech4 for the corresponding variance).
Having only 2 indicators (first-order factors) for a second-order factor gives a weak model, only identified due to the zero residual variance of anxiety. There should be at least 3 and preferably many more first-order factors.
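Why the standardized loading comes out as exactly one can be seen with a small Python sketch of b*SD(x)/SD(y). It assumes the usual model-implied variance of a first-order factor, Var(y) = b^2 * Var(x) + residual variance; the function name and the numbers are hypothetical.

```python
import math


def standardized_loading(b, var_x, resid_var_y):
    """b * SD(x) / SD(y), with Var(y) implied by the model."""
    var_y = b ** 2 * var_x + resid_var_y
    return b * math.sqrt(var_x) / math.sqrt(var_y)


# Residual variance fixed at zero (as with anxiety@0): loading is 1.
fixed_at_zero = standardized_loading(b=0.8, var_x=1.7, resid_var_y=0.0)

# Any positive residual variance gives a loading below 1.
with_residual = standardized_loading(b=0.8, var_x=1.7, resid_var_y=0.5)
print(fixed_at_zero, with_residual)
```

With a zero residual variance, SD(y) equals |b| * SD(x), so the ratio reduces to one regardless of the values of b and Var(x); there is nothing left to estimate, which is consistent with the missing standard error.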
Derek Kosty posted on Tuesday, August 26, 2008 - 5:07 pm
These are proposed models within this field of research. The aim of the study is to evaluate these models. Therefore, we cannot satisfy the preferred model that has at least three first-order factors because that's not the way it is being looked at. Do you see this as being extremely problematic? What suggestions do you have, if any?
I can't say that it is wrong, but this is not in my view how second-order factor modeling should be done. The model is just barely identified with no possibility to test it, or to adjust it for the many types of misspecification possibilities it may contain. My only suggestion would be to not draw on a second-order factor in this case.
Derek Kosty posted on Wednesday, August 27, 2008 - 10:03 am
What exactly do you mean by "not draw on"? Also, how do I determine if a negative residual variance is significant or not? If it is negative, the SE and p-value cannot be computed...
"Draw on" is here a euphemism for "don't draw strong conclusions based on", or less diplomatically - don't use.
The problem of using a second-order factor here is clear from your question about the negative residual variance. Your model
MODEL: mood by LMDD4 LDYS4 LDPD4; anxiety by LGOA4 LPTS4 LSPE4 LSOC4 LPAN4 LOBC4; intern by mood anxiety;
is not identified (I would think Mplus flags it as such), so the estimates (such as negative residual variances) therefore should not be interpreted. There is no way of knowing what the residual variances are. The model becomes identified by for instance fixing one residual variance at zero, but if that is not true, then the resulting estimates are distorted.
Derek Kosty posted on Wednesday, August 27, 2008 - 11:21 am
Many thanks. I feel much more comfortable now. I will propose to the rest of my team that we get rid of that second-order factor. In hindsight, I don't see a real need for it anyway. We will just talk about the model as being an internalizing model without actually including the second-order factor. Does this seem reasonable?
Derek Kosty posted on Friday, September 19, 2008 - 3:13 pm
When conducting a CFA with dichotomous (0,1) observed variables:
f1 by LADH4 LODD4; f2 by LCON4 LAPD4 LALC4 LPOT4 LDRG4;
I get the following warning:
"WARNING: THE RESIDUAL COVARIANCE MATRIX (THETA) IS NOT POSITIVE DEFINITE. THIS COULD INDICATE A NEGATIVE VARIANCE/RESIDUAL VARIANCE FOR AN OBSERVED VARIABLE, A CORRELATION GREATER OR EQUAL TO ONE BETWEEN TWO OBSERVED VARIABLES, OR A LINEAR DEPENDENCY AMONG MORE THAN TWO OBSERVED VARIABLES. CHECK THE RESULTS SECTION FOR MORE INFORMATION. PROBLEM INVOLVING VARIABLE LAPD4."
And the standardized factor loading of f2 by LAPD4 is greater than 1 (1.013).
I believe it is due to the nature of the variables LAPD4 and LCON4. If LAPD4=1 then LCON4=1, but LCON4=1 does not imply that LAPD4=1.
Not sure exactly what the issue is. I can calculate the correlation between the two variables and it is only .61. Any thoughts on this?
The message is not about a factor loading. I would need to see the output to answer your question. Please send it and your license number to email@example.com.
Jungeun Lee posted on Tuesday, September 30, 2008 - 12:01 pm
I am not super sure that I have a clear idea about when to go with 'Std' and when to go with 'StdYX'.
I ran a CFA and its follow-up SEM. In the CFA, I have 4 latent variables and their corresponding observed variables (continuous). To see if a specific indicator is strongly related to the corresponding latent variable than others, I am thinking to use 'StdYX'.
In its follow-up SEM, two continuous observed variables were added. In this model,1) a latent variable predicted another latent variable; 2) a combination of a latent variable and one of the added continuous variable predicted another latent variable; 3) a latent variable predicted one of the added continuous variable. I am thinking, for the part like 1), I would go with 'Std'. For parts like 2) and 3), I will go with 'StdYX'.
I am slightly confused about the definition of StdY in the manual. I have a model which has continuous latent variables, continuous manifest variables, and binary variables.
I requested both StdYX and StdY. I get something like this:
From        To          StdYX   StdY
Latent      Latent      .456    same
Binary      Manifest    .123    different
Binary      Latent      .234    different
Manifest    Manifest    .345    same
According to the Mplus manual, StdY uses the variance of CONTINUOUS LATENT variables for standardization. This explains why the StdYX and StdY coefficients are the same for Latent-->Latent, and different for Binary-->Manifest and Binary-->Latent, but why are the Manifest-->Manifest coefficients the same? They are not continuous LATENTS, so I was expecting they would be standardized differently in StdY and StdYX.
I think your question is how are dependent and independent variables defined in Mplus. An independent variable is a variable that appears only on the right-hand side of an ON statement. All other variables are dependent variables.
Garry Gelade posted on Thursday, October 16, 2008 - 10:46 am
I'd like to know why in my model I seem to get slightly different p-values for my unstandardized vs. STDYX results. In STDYX, two relations between latent variables become significant that were only marginally significant looking at the unstandardized results. My model contains continuous and binary independent variables, latent endogenous variables, and continuous dependent variables.
Is there a rule to determine which type of standardization I "should" be using? Clearly, bias would lead me to prefer the standardized over unstandardized coefficients because they produced more significant results in my model. My original rationale for looking at STDYX was that the variances of all variables in the model, including background/outcome variables, are used to standardize.
There is no rule. I would use raw coefficients unless I had a reason to use standardized. In both cases, you should be conservative regarding the p-values given the large number of tests being done. Some kind of Bonferroni type correction should be made.
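A Bonferroni-type correction of the kind mentioned above can be sketched in Python as follows. The function name and the p-values are hypothetical; the sketch uses the plain Bonferroni rule of comparing each p-value with alpha divided by the number of tests, which is only one of several possible corrections.

```python
def bonferroni_significant(p_values, alpha=0.05):
    """Flag p-values that pass the Bonferroni-adjusted threshold."""
    threshold = alpha / len(p_values)
    return [p < threshold for p in p_values]


# Hypothetical p-values from four tested paths.
flags = bonferroni_significant([0.001, 0.02, 0.04, 0.30])
print(flags)
```

With four tests the threshold is 0.05 / 4 = 0.0125, so marginal results near .05 no longer count as significant whichever standardization is reported.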
QianLi Xue posted on Wednesday, November 19, 2008 - 6:17 am
How come STDYX and STDY in a path model with only observed variables give the same estimates?
Eulalia Puig posted on Thursday, January 22, 2009 - 10:48 am
Dear Linda or Bengt,
I've read some of the posts, but I'm still unclear as to which standardization to use. My current model uses only residuals (both as independent and dependent variables), so they are continuous, non-latent variables, right? STD gives me the exact same output as the unstandardized coefficients. The other standardizations (STDYX and STDY) give me not only different output, but also different p-values.....
First, which standardization should I use?
Second, why do I get different p-values?
Thank you so much in advance. I really appreciate your work and effort on this site.
Eulalia Puig posted on Thursday, January 22, 2009 - 10:50 am
OK, I just read the answer to my second question....
I'm confused about how STDYX standardization relates to regression on Z scores. I've run a multiple regression in Mplus with 2 independent vars. When I convert the dependent and independent vars to Z scores and rerun, I don't quite get the parameter estimates shown under STDYX Standardization in the original raw data regression (STDYX coefficients = .494 and .211, coefficients from analyzing Z scores = .488 and .209). According to Pedhazur the standardized coefficients should match those calculated from Z-score data (Multiple Regression in Behavioral Research, 2nd ed., 1982, p. 53). Could you comment on this?
Could the difference arise because the STDYX standardization doesn't use the standard deviation of Y computed from the sample? The 5.0 Mplus User's Guide (p.577) reports that for the STDYX standardization "SD(x) is the sample standard deviation of x and SD(y) is the model estimated standard deviation of y". Could you please explain what is meant by the "model estimated standard deviation of y"? This appears to be something different than just the standard deviation of y computed from the sample.
Thanks for your help! I'm trying to decide whether to publish the Betas I get from the regression on Z scores or those reported under STDYX Standardization from the raw data analysis.
Thanks Linda for the prompt reply. I'm sorry for being dense, but is the only difference between the "model estimated" standard deviation and the sample standard deviation that the former is calculated using n and the latter with n-1?
Could you please explain the rationale behind why you prefer obtaining standardized coefficients from the STDYX values vs running the analysis on Z scores (beyond the obvious advantage that the STDYX approach doesn't require standardizing the raw data prior to analysis)? In multiple regression, are the standardized coefficients reported under STDYX somehow superior to those calculated by analyzing Z scores?
I appreciate your further insight into the benefits/weaknesses of the different approaches.
Q1. For just-identified (saturated) models, yes. For over-identified models, use the model-estimated SDs.
Q2. Pre-standardizing variables is risky in many settings; there is a large literature on this - see for example the underpinnings of invariance of structural coefficients in SEM. Why then pre-standardize when you can get what you want out of the standardized solution.
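The n versus n-1 distinction raised in this exchange can be illustrated in Python. The sketch assumes that, for a saturated model, the "model estimated" standard deviation corresponds to the maximum likelihood variance with divisor n, as the answer to Q1 indicates; the data values are invented.

```python
import statistics

# Hypothetical observations of y.
y = [2.0, 4.0, 4.0, 4.0, 5.0, 5.0, 7.0, 9.0]
n = len(y)

sd_divisor_n = statistics.pstdev(y)          # divisor n (ML-type estimate)
sd_divisor_n_minus_1 = statistics.stdev(y)   # divisor n - 1 (usual sample SD)

# The two differ only by the factor sqrt((n - 1) / n).
ratio = sd_divisor_n / sd_divisor_n_minus_1
print(sd_divisor_n, sd_divisor_n_minus_1, ratio)
```

The ratio approaches one as n grows, so in a saturated model the two ways of standardizing differ only slightly in moderately large samples. For over-identified models the model-estimated SDs can differ from the sample SDs for reasons beyond the divisor.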
You exponentiate the slope in the regression of a count variable on a covariate. You can do this in MODEL CONSTRAINT using the NEW option to define a new parameter. Then you will obtain a standard error for the incident rate ratio.
How to use MODEL CONSTRAINT is described on pages 555-558 of the Mplus User's Guide. If you can't get this working from these instructions, please send your input, data, output, and license number to firstname.lastname@example.org.
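For readers who want to see the arithmetic behind the incident rate ratio, here is a Python sketch. It exponentiates the slope and attaches a first-order delta-method standard error; whether this matches the standard error reported for a NEW parameter in a given Mplus run is an assumption, and the function name and numbers are hypothetical.

```python
import math


def incident_rate_ratio(b, se_b):
    """Return exp(b) and its first-order delta-method standard error.

    The derivative of exp(b) with respect to b is exp(b), so
    SE(exp(b)) is approximately exp(b) * SE(b).
    """
    irr = math.exp(b)
    return irr, irr * se_b


# Hypothetical slope and standard error from a count regression.
irr, se_irr = incident_rate_ratio(b=0.30, se_b=0.10)
print(irr, se_irr)
```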
My colleagues and I ran a seemingly unrelated logistic regression (SULR) in a multiple group framework. The sampling design is complex with weighting, stratification, and clustering variables.
We have several observed binary dependent measures and a set of observed predictors. There are no latent variables in the model. Since a given participant provided responses to each variable, errors are correlated and we would like to jointly estimate each regression equation (hence the SULR). The grouping variable has four levels, and we would like to see, for example, if the pattern (logistic coefficients) and thresholds - for the same set of predictors predicting each binary DV - are invariant across these four groups. For example,
Group A y1 on x1 x2 x3 ... y2 on x1 x2 x3 ...
Group B y1 on x1 x2 x3 ... y2 on x1 x2 x3 ...
And so on.
When we request standardized coefficients in a model that constrained b coefficients to equality across groups, the reported unstandardized coefficients, as expected, do not vary over groups. However, the STDyx estimates - which are what we want to use - DO vary over groups. For example, the unstandardized x1 coefficient above has the same value in groups A, B, C, and D, but the corresponding STDyx value varies over those groups.
Is there some sort of adjustment that needs to be made to the STDyx values?
The raw coefficients are standardized using group-specific model estimated standard deviations. This is why they differ across groups.
You should do the equality testing on the raw coefficients.
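The point of this answer, that one raw coefficient yields different standardized values when the groups have different standard deviations, can be shown with a short Python sketch. The group labels follow the post, but the coefficient and the standard deviations are invented.

```python
def stdyx(b, sd_x, sd_y):
    """StdYX standardization: b * SD(x) / SD(y)."""
    return b * sd_x / sd_y


# One raw coefficient held equal across groups (hypothetical value).
b = 0.5

# Hypothetical group-specific (SD(x), SD(y)) pairs.
group_sds = {"A": (1.0, 2.0), "B": (2.0, 2.0), "C": (1.0, 4.0)}

std_by_group = {g: stdyx(b, sd_x, sd_y) for g, (sd_x, sd_y) in group_sds.items()}
print(std_by_group)
```

The raw coefficient is identical in every group, yet the standardized values differ because each group is scaled by its own standard deviations. This is why the equality test belongs on the raw coefficients.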
pan yi posted on Wednesday, March 18, 2009 - 1:18 pm
Dear MPlus experts, I have a question about obtaining standardized regression coefficients for my SEM model. I have two latent variable interaction terms in the model and I specify the type of analysis to be random. I want to get standardized regression coefficients of latent endogenous variables on latent exogenous variables. Once I specify "type=random", the STANDARDIZED output is not available and tech4 output becomes unavailable too. I have been looking for a way to get mean and var-cov matrix of my latent variables for some time but with no success. Could you please inform me how I can achieve my goal? Thank you very much!
I'm running a path analysis X-->M-->Y, with 3 different outcome variables (1 latent, 2 continuous observed DVs). I'm using the BOOTSTRAP=500 option, MODEL INDIRECT, as well as the CINTERVAL (BCBOOTSTRAP) and STDYX output options to get CIs of standardized coefficients.
What I got in the output was: MODEL RESULTS, STDYX estimates, nonstandardized and standardized indirect and direct effects, as well as CIs for MODEL RESULTS and for the non-std and std indirect and direct effects, but none for the STDYX coefficients.
Am I forgetting to specify anything to get the CIs of the STDYX coefficients? Alternatively, should I only report CIs for the direct and indirect effects and explain that the X-->M and M-->Y coefficients are significant?
jks posted on Wednesday, November 04, 2009 - 8:02 pm
Hello, I ran the following model to test measurement invariances (configural, metric, scalar, complete) across two groups: f1 by y1-y4; f2 by y5-y8;
I was trying to write standardized estimates to an external file using the SAVEDATA command in Mplus 5.2. But Mplus didn't write the standardized estimates which were constrained to be equal across groups. Is there any way to write the standardized estimates to an external file? Or, in the formula b*SD(x)/SD(y), how can I get SD(x) and SD(y) for my model (specifically, how can I get standardized factor loadings, intercepts, and error variances from unstandardized estimates)? All variables are continuous in the model.
Hi, I am running a path analysis with y1 on x y2 on x
and have two direct significant coefficients in my output for both paths. I would like to compare the difference of the coefficients (.37 vs. .48) in this model. Is there a way to do this? Thanks for your advice, Sofie
With categorical outcomes and covariates using weighted least squares estimation, we don't give standard errors for standardized. This is not because they cannot be computed but it was not implemented in Mplus.
ela m. posted on Thursday, March 11, 2010 - 10:18 am
Hello Dr. Muthen, I'm a student working for the first time with Mplus and path analysis. We are interested to see the effects of some factors (categorical or continuous) on suicidality in children. For this we did logistic regression and as some factors have missing data, it was suggested to use path analysis, too. I used the following options: ESTIMATOR=MLR; ALGORITHM=INTEGRATION; INTEGRATION=MONTECARLO (500);
but when I wanted to use Model indirect I got an error: "MODEL INDIRECT is not available for analysis with ALGORITHM=INTEGRATION." I'm a bit confused what option should I use, or should I calculate the indirect effects from the output I got from Model specification? Also, in the output, at the model results there is no stdyx that I read it's the path coefficient. How should I get it? A general question regarding path analysis: my understanding was that we propose a large model, with a lot of path connections, but we select the best one. How should I do this in Mplus? Do I have to try all the possible submodels and compare the fit measures? Please let me know. Thank you very much for your time.
You can use MODEL CONSTRAINT to create the indirect effect if the mediator is not a categorical variable. Indirect effects for categorical mediators can be estimated only with the weighted least squares estimator and probit regression in Mplus.
If we don't give standardized estimates, there must be a reason. I would need to see the full output and license number at email@example.com to see what it is.
I am running a simple path model with count variables as the outcome variables. With continuous predictor and outcome variables, I would interpret STDYX as the change in y in y standard deviation units for a standard deviation change in x. How do I interpret the STDYX coefficient with a y variable that is a count variable rather than a continuous one? Because it is a count variable, it does not make sense to think of STDYX as the change in y in y standard deviation units, correct?
With count outcomes, I would either use the raw coefficients or use STDX which you can compute as the raw coefficient times the standard deviation of x. It would be interpreted as the log rate change for a one standard deviation change in x.
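As a rough illustration of the STDX calculation described above, the sketch below uses made-up numbers (the slope and the standard deviation of x are assumptions, not results from any model). It also assumes the count model uses a log link, so that exponentiating the STDX value gives a rate ratio.

```python
import math

# Hypothetical values for illustration only.
raw_b = 0.20   # raw slope: change in the log rate per one-unit change in x
sd_x = 1.5     # sample standard deviation of x (e.g., from TYPE=BASIC)

# STDX: raw coefficient times the standard deviation of x.
stdx = raw_b * sd_x

# Under a log link, exponentiating gives the multiplicative change in the
# expected count for a one standard deviation increase in x.
rate_ratio = math.exp(stdx)

print(f"STDX (log rate change per SD of x): {stdx:.3f}")
print(f"Rate ratio per SD of x: {rate_ratio:.3f}")
```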
Maja Cambry posted on Wednesday, March 31, 2010 - 10:19 am
I ran an SEM with categorical and continuous indicators (WLSMV estimation). Standard errors and p-values for the standardized estimates (including R-square) were not provided in the output. My questions are...
1) Can I get s.e.'s of standardized estimates with this type of model? If so, how? 2) If not, can I assume that the standardized estimates are significant if the unstandardized estimate is significant? Or, is there a way to calculate s.e. for standardized coefficients? 3)Is there a way to determine if R-square of the endogenous latent variable(s) is significant?
1. With WLSMV, standard errors of standardized coefficients are not given when the model has covariates. 2. No. 3. We don't give this.
I would report the raw results as far as significance goes and show the standardized without significance.
jing xu posted on Thursday, May 13, 2010 - 3:52 am
I tested a model that included three independent variables (IDV) and three dependent variables (DV). But all the Std. structural coefficients were either much greater than 1, or negative and significant. I found the zero-order correlation coefficients between the three IDVs are high (all around .85). Do you think this is due to multicollinearity, or a Heywood case?
I tried a second order model to include an IDV-S which has three IDVs as its dimensions. The results turned that the IDV-S significantly and positively affected the three DVs. The path coefficients were satisfactory between 0 and 1.
It would seem multicollinearity could be the issue.
jing xu posted on Thursday, May 13, 2010 - 8:35 pm
Thanks, Linda. I also tested the 3 IDVs on only one DV (eg. DV1), but found the results were okay. But when I added one more DV inside, the results were still unsatisfactory. I couldn't understand when I included all the "correlated" IDVs but only one DV, the things turned much better. So it seemed the number of DVs also count into multicollinearity.
jing xu posted on Thursday, May 13, 2010 - 9:10 pm
And I can also tell you that I tested pairs of IDVs on the 3 DVs. One pair of IDVs has correlation .87, but the structural model is quite acceptable. One pair of IDVs has correlation .84 (lower than .87), but the structural model is very unacceptable. Also it seems that higher IDVs' correlation may not 100% result in multicollinearity. In fact, only when the problems happen, we can explain by using multicollinearity.
Rachel Perl posted on Thursday, July 08, 2010 - 10:51 am
I ran the same regression in SPSS and in Mplus. The unstandardized coefficients are exactly the same for all variables but the standard errors are not. The differences are small such as .147 vs. .149 or .127 vs. .129. Mplus estimates for standard errors are consistently larger. I was wondering why this is the case. In both regressions I used listwise deletion and the number of cases is identical.
You would need to define the standardized coefficients as NEW parameters in MODEL CONSTRAINT and test the difference using MODEL TEST.
Simon O. F. posted on Friday, August 06, 2010 - 6:03 am
Thank you for your answer Linda. I would however need further assistance. To be more precise, I am trying to test equality of 2 paths coefficients, say X->Y and Z->Y. I know I obtain the standardized path from X to Y by taking beta_xy*sqrt(var(x)/var(y)). I also know how to create parameters for beta_xy and var(x), but not for var(y) since:
MODEL: Y ON X (beta_xy); X (var_x); Y (var_y);
Will only get me var_y as being the residual variance of Y and not the total variance of Y. Is there any simple way to create a parameter for the total variance of a dependent variable in Mplus?
You would need to define the variance of y in MODEL CONSTRAINT and then use it in the standardization.
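The arithmetic behind this reply can be sketched as follows for the single-predictor case in the question (Y ON X). The numbers are made up, and the names simply mirror the labels beta_xy, var_x and var_y used in the post, with var_y being the residual variance. In Mplus the same expressions would be written in MODEL CONSTRAINT.

```python
import math

# Hypothetical estimates for illustration only.
beta_xy = 0.6      # raw slope of Y on X
var_x = 4.0        # variance of X
res_var_y = 2.56   # residual variance of Y (what the label on Y gives)

# Total variance of Y with a single predictor:
# explained part plus residual part.
total_var_y = beta_xy**2 * var_x + res_var_y

# Standardized slope: beta * sqrt(var(x) / var(y)).
std_beta_xy = beta_xy * math.sqrt(var_x / total_var_y)

print(f"Total variance of Y: {total_var_y:.3f}")
print(f"Standardized slope:  {std_beta_xy:.3f}")
```

With several predictors of Y the explained part also involves the covariances among the predictors, so this single-predictor expression would no longer be enough.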
Emily Yeend posted on Friday, August 20, 2010 - 6:24 am
I am looking at modeling indirect effects and have a question about the standardized indirect effect values and p-values.
Firstly, is it useful to report the standardized indirect effect values?
Secondly, (if it is) when I am looking at a variable influencing a continuous variable via another continuous variable I think that I would use StdYX (Am I right?). However, if I am looking at a variable influencing another continuous variable via a binary variable what would I use?
Hello, I have some questions regarding my data which are based on an RCT. I used regression to see whether treatment condition predicted outcomes after controlling for covariates. I didn't use ANCOVA in SPSS, because I wanted to base my results on all available information regardless of whether participants completed treatment. 1. What exactly is the difference between FIML and EM in terms of estimating missing data? Which one is it that makes use of all available data and creates an estimated covariance matrix for the entire sample? In my regression analyses, the output was identical regardless of whether I specified Algorithm = EM, or not. 2. One of the reviewers of our paper said that he/she questioned the use of EM procedure in handling our missing data, which are close to 40% and non-random for some measures. What do you recommend as the best way of dealing with this issue? Can you direct me toward any published studies that have dealt with the issue of a high percentage of non-random missing data in a good way? 3. Finally, another comment we got was that the reviewer wanted F-values, but we only provided beta-weights because we ran regression in Mplus. Is there any way of converting a beta-weight to an F-statistic? Or do we simply answer that our analysis did not provide F-values? When I run the analyses in SPSS using ANCOVA, I get the same significant result, but it is based on a smaller N because of listwise deletion in SPSS. Thanks, Kristine
Emily: I think it could be useful to present standardized indirect effects. Comparing them would not be.
Use StdYX for both. Indirect effects with a categorical mediator should be done in Mplus only for weighted least squares analysis and probit regression where the continuous latent response variable is used for standardization.
Emily Yeend posted on Tuesday, August 24, 2010 - 2:53 am
I've been reading the user guide and I'm still a little unclear when I would use the different standardizations.
Am I right in thinking that in general I would show standardized coefficients when I would like to compare influences of variables. Typically I would use StdYX however where I have a binary covariate (or mediator) I would show StdY for that specific relationship. (If this is the case, are these standardizations still comparable?)
Which is the appropriate way to display the residual variances? Or would I always just show these in unstandardized form.
Similarly, am I right in thinking that in the unstandardized output, WITH statements show the covariance of residuals, whilst in the StdYX output they show the correlation of the residuals?
You should always use StdYX for continuous covariates and StdY for binary covariates. If you compare the standardized coefficients, you should keep in mind one is the change associated with a one standard deviation change in x and the other is the change associated with a shift from one value of x to the other.
Please send the output and your license number so I can see why you don't get StdY.
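To make the difference between the two standardizations concrete, here is a small sketch with made-up numbers. It assumes a binary covariate coded 0/1 with half the sample in each category, and a hypothetical raw slope and standard deviation of y.

```python
import math

# Hypothetical values for illustration only.
raw_b = 0.4    # raw slope of y on a 0/1 covariate
p = 0.5        # proportion of cases with x = 1
sd_y = 2.0     # model-estimated standard deviation of y

# Standard deviation of a 0/1 variable with proportion p.
sd_x = math.sqrt(p * (1 - p))

# StdY: change in y, in y standard deviation units, for a shift from 0 to 1.
stdy = raw_b / sd_y

# StdYX: change in y standard deviation units for a one SD change in x,
# which is not a meaningful step for a binary covariate.
stdyx = raw_b * sd_x / sd_y

print(f"StdY:  {stdy:.3f}")
print(f"StdYX: {stdyx:.3f}")
```

The two values differ only by the factor SD(x), so for a binary covariate StdYX depends on how the sample splits between the two categories, while StdY does not.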
I tested a latent variable multiple mediation model using standardized variables and my "c' " path is greater than one. Although the model fit is very good and there were no warnings, is this a sign of a problem? If not, how does one interpret path weights greater than one in this context and in general?
I am estimating a model where I have several covariates predicting two latent factors. These two latent factors are the same construct, measured at two time points. Therefore, I constrain the coefficients of the covariates (predicting the factors) to be equal. In my output, the raw coefficients are, in fact, constrained to be equal. However, the standardized (StdYX) coefficients are not. Can you explain why this is the case?
Thank you for your quick reply, but I am still unclear on this. Let me be more specific.
In the model, the raw coefficients predicting f1 and f2 are constrained to be equal. For example, in the output, the raw coefficient for the effect of education on health at time 1 is .5, and the raw coefficient for the effect of education on health at time 2 is .5.
For the standardized coefficients, the effect of education on health at time 1 is .2 and the effect at time 2 is .05. I do not understand why the standardized coefficients for the covariate education are not also constrained to be the same.
The coefficients at each time point are not standardized using the same standard deviations. Time 1 uses time 1 standard deviations. Time 2 uses time 2 standard deviations. Only if the standard deviations at each time point are the same will the standardized coefficients at each time point be the same.
I have run a SEM using a WLSMV estimator with a binary observed DV. I understand that the unstandardized estimates are probit regression coefficients. I have latent IVs, continuous observed IVs and dummy IVs. My understanding from prior posts is that if I want to examine the relative impact of the IVs, I would use STDYX for continuous and latent IVs and STDY for the dummy IVs. Will you please confirm that understanding is correct and answer 2 additional questions? 1) Is it reasonable to show the standardized coefficients in a graphic of the SEM (using STDYX and STDY as described above) even though this is a probit model? 2) The STDY does not appear in my output--why? Thank you
I’m a new Mplus user running version 6.1. I have a two part question:
1. I’m running an SEM model with a mixture of binary, ordinal, and continuous variables thus my default estimator is WLSMV. If I request the MODINDICES option, I see various BY, ON/BY, ON, and WITH statement. Where should I be looking if the modification is suggesting a removal of a certain pathway (i.e., a non-significant Wald Test for a certain path coefficient)?
2. Should I ever expect to see StdYX estimates exceeding 1 under the standardized model results?
1. Modification indices suggest adding not removing paths. See the user's guide for further information. Significance of paths in the model are found in the results section. See the user's guide under the OUTPUT command for a description of the output layout.
Thank you for the quick reply. While I do see the estimate, S.E., and p-values for each total and total indirect pathway under the non-standardized section, those test statistics are not given for the standardized total and total indirect effects.
Should I be using the BOOTSTRAP option rather than the default DELTA method if I wanted to obtain the standardized values?
Given I have a mixture of variable types, should I only be concerned with the standardized output?
In a classic SEM model, the output in the MODEL results section always set the first factor loading for each latent variable to 1 with zero standard error and 999 for the test. How can I have more control as to which factor loading gets fixed at 1 to fix the indeterminacy problem?
Is it simply a matter of rearranging which variables follows the BY statement or using the @1 for variable of interest?
I understand from previous postings that, with categorical outcomes and covariates using weighted least squares estimation, Mplus does not give standard errors and p-values for standardized model results.
My understanding is that it has not been implemented into Mplus to automatically calculate; however, do you have any suggestion or example of the coding needed to obtain them directly in Mplus without using other resources?
You could use MODEL CONSTRAINT to specify standard parameters and obtain standard errors and p-values in this way.
Jeff Jones posted on Tuesday, December 28, 2010 - 3:07 pm
I am trying to do a standardized regression with random predictors using the MODEL CONSTRAINT command, but I am having trouble. I have read through the forum and the technical appendix on standardized coefficients, and I am still lost. I have two questions:
1) I am trying to work out a simple two predictor problem with the delta method, and do not understand how to setup the problem to match the technical appendix specifications. I am not sure where the identity matrix comes into play.
2) I am not sure how to use the MODEL CONSTRAINT command for a random predictors regression problem. Any advice on how to set it up (or a simple example) would be very much appreciated.
Your Dec 28 post talks about the delta method in (1) so you must be asking about standard errors. Perhaps you are asking about standard errors for the standardized solution?
For (2), see UG ex 5.20. This is an example where you compute the standardized solution in Model Constraint. And that also gives you the SEs for those standardized coefficients using the delta method.
Utkun Ozdil posted on Thursday, May 05, 2011 - 10:53 pm
I want to report the coefficients of my model such that y ON gender. With the gender covariate I chose to report STDY results. But I am unclear in that are these coefficients represented as a beta or a gamma?
vf is a residual variance. The formula for standardizing a regression coefficient requires that the coefficient be multiplied by the standard deviation of x and divided by the standard deviation of y. To obtain the variance of x, use TYPE=BASIC. You will need to compute the variance of f by using the formula:
Thank you very much for your fast answer, explanation and suggestion.
I had actually simplified somewhat the problem to make it clearer. In fact, I have several covariates X1, X2, ..., XJ: F on X1 (Beta1) X2 (Beta2) ... XJ (BetaJ);
1. Does your formula generalize to: Var(F) = Beta1^2 * Var(X1) + ... + BetaJ^2 * Var(XJ) + Res. var(F) ?
2. Some of the covariates Xj are ordinal, and one of them is nominal, so that each of those covariates is dummy coded as (C-1) binary variables (where C is the number of categories of the ordinal/nominal covariate). In these cases, how do I compute Var(Xj)?
3. How about approach (A) = post-calculate Var(f) by reading factor scores from the savedata file? Should it yield the same results?
4. A rather unrelated question: Mplus does not compute factor scores for one of my models, with message: FACTOR SCORES CAN NOT BE COMPUTED FOR THIS MODEL DUE TO A REGRESSION ON A DEPENDENT VARIABLE.
I checked that Mplus indeed did not produce any save file after estimating this model.
If this behaviour is ok, could you please tell me why factor scores can not be produced by Mplus in this setting?
1. With more than one x, you need to include the covariances in the formula also.
2. Covariates are treated as continuous in regression so you should use the variances from TYPE=BASIC.
3. Factor scores are not the same as the factors in the model. How close they are can be seen by looking at the factor determinacy. I would not use factor scores.
4. You should use Version 6.11 and if you still get the message, send it along with your license number to firstname.lastname@example.org.
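Reply 1 above (the covariances among the covariates belong in the formula) can be illustrated with a two-covariate sketch. All numbers below are made up; the covariance matrix stands in for what TYPE=BASIC would report for X1 and X2.

```python
# Hypothetical values for illustration only.
betas = [0.5, 0.3]        # raw slopes of F on X1 and X2
cov_x = [                 # covariance matrix of X1 and X2
    [1.0, 0.4],
    [0.4, 2.0],
]
res_var_f = 0.5           # residual variance of F

n = len(betas)

# Explained variance as the quadratic form b' * Cov(X) * b, which contains
# both the variance terms and the covariance terms.
explained = sum(
    betas[i] * cov_x[i][j] * betas[j] for i in range(n) for j in range(n)
)
var_f = explained + res_var_f

# For comparison: the variances-only formula from question 1,
# which leaves out 2 * b1 * b2 * Cov(X1, X2).
var_f_variances_only = (
    sum(betas[i] ** 2 * cov_x[i][i] for i in range(n)) + res_var_f
)

print(f"Var(F) with covariances:    {var_f:.3f}")
print(f"Var(F) without covariances: {var_f_variances_only:.3f}")
```

The two results agree only when the covariates are uncorrelated.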
Paresh posted on Wednesday, June 01, 2011 - 5:00 pm
Dear Dr. Muthen, To check for multicollinearity, I regressed DV on a second order factor of all IVs. Unlike my hypothesized model, this model has poor fit indices(CFI=0.539, TLI=0.464, RMSEA=0.095). Does the poor fit mean that I do not have multicollinearity problem, or do I need to check the significance of the beta estimate in the second order model? Thank you for your help.
You want to first check that the second-order factor model fits without including the DV. Also, the poor fit doesn't have to do with multicollinearity (which I assume is due to highly correlated first-order factors).
How would one go about constraining a standardized structural parameter to a value in a typical SEM (e.g., constrain a beta to .2) ? I think I am missing something in the "Model Constraints" literature.
I am computing variances for my downstream variables and when I check them against Tech 4, most seem to be within rounding error. However, one of the computed values = 3.085 while Tech 4 = 3.072. While I realize that this shouldn't change substantive conclusions much, I am curious if this is still within rounding error or if there are some instances when computation and Tech 4 would not match up (aside from user error, which may be the case here).
BTW -- Six demographic variables predicting outcome (brd).
It sounds like you are getting only standardized parameter estimates with no standard errors. In this case, you cannot get a p-value without computing the standard error yourself which could prove difficult to do.
The Demo and regular version are identical except for a limit on the number of variables. If you get standard errors in one, you will also get them in the other. Perhaps you are using an old demo. It sounds like the university also has an old version. We give p-values now.
The ratio of the parameter estimate to its standard error is a z-test in large samples. You can use a z-table to find the p-value.
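Instead of a z-table, the two-sided p-value can be computed directly from the standard normal distribution. The sketch below uses only the Python standard library; the estimate and standard error are made-up numbers chosen so that z is 1.96.

```python
import math


def two_sided_p(estimate, se):
    """Return the z statistic and its two-sided normal p-value."""
    z = estimate / se
    # For a standard normal Z, P(|Z| > |z|) = erfc(|z| / sqrt(2)).
    p = math.erfc(abs(z) / math.sqrt(2))
    return z, p


# Hypothetical estimate and standard error.
z, p = two_sided_p(0.392, 0.2)
print(f"z = {z:.3f}, two-sided p = {p:.4f}")
```

As the reply notes, this is a large-sample approximation.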
c_wahl on C_PID; (let's assume this is path a) C_K on C_PID; (path b) c_iss on C_PID; (path c) c_wahl on c_iss; (path d) c_wahl on C_K; (path e) C_K on c_iss; (path f)
c_wahl ind C_PID; c_wahl ind c_iss;
1) Am I correct to assume that only for path b+f a linear regression is estimated and for the other paths a probit regression?
2) Assuming that c_iss is not categorical but a continuous variable, am I right that then for paths b+c+f a linear regression is estimated?
3) In terms of the strength of the standardized estimates, can I conclude based on the value that one predictor is stronger or weaker than the other, or am I not allowed to do this?
4) For the MODEL INDIRECT command, I got direct and indirect effects. Some of the indirect paths multiply an estimate from a linear regression by one from a probit regression. Is this a problem in comparing standardized total effects?
Dear Drs Muthen, I am running a multiple mediation path analysis with dichotomous independent variables, continuous latent variable mediators with ordinal indicators, and a continuous latent dependent variable with ordinal indicators (plus other control variables). I am using WLSMV as my estimator, theta parameterization and calculating bcboot confidence intervals. My reviewers want to know how to interpret the magnitude of the coefficients for the continuous latent dependent variables. 1) How do I determine the range and standard deviation of my continuous latent variables? Since my independent variables are dichotomous I wanted to standardize my coefficients by the dependent variable (STDY) but this option is not available for weighted least squares estimation. Also, I read in the manual that standardization uses the delta method and only allows for symmetric bootstrap confidence intervals. 2) Can I standardize by y (by doing STDYX and unstandardizing by X) and still report the significance levels based on the non-standardized bcboot confidence intervals? 3) How would I standardize by Y for the indirect effects? Is there an output that shows the standard deviation of the indirect effect using bcboot? Thank you.
(1) The means and SDs of your latent variables are obtained in TECH4 and because they are assumed normally distributed this gives you a notion of their range.
(2) You want to look at STD, that is, standardizing only with respect to the latent variables. This is because your direct and indirect effects pertain to the latent variables. Mplus does not give you bootstrapped SEs for standardized coefficients. Yes, I would say that you can report significance/CI based on the raw (unstandardized) coefficients and add reporting of standardized coefficients without significance/CI.
An alternative is to use Estimator=Bayes where you achieve the same as using bootstrap, namely allowing non-normal distributions of the estimates. With Bayes you also get this for standardized coefficients.
Thank you. I tried using Estimator=Bayes, but this estimator does not allow sampling weights and model indirect is not available. The primary table in my paper is the one reporting the indirect effects. When I try to standardize (STD) and use the model indirect command, all of the total and total indirect effects = 0, but the specific indirect and direct effects have values. This doesn't make sense because the sum of the specific indirect effects should equal the total indirect effect and the sum of the total indirect and direct effects should equal the total effect. Is this because one of my mediators is not a latent variable? I forgot to mention that in addition to three continuous latent variable mediators I have one observed continuous mediator (logged adjusted household income). I think this puts me back to trying to calculate STDY from STDYX, but I don't know how to do this for the indirect effects reported using the model indirect command. Do I just divide by the standard deviation of the independent dichotomous variable, or do I need to divide by the standard deviation of the indirect effect? If I need to divide by the standard deviation of the indirect effect, how do I get Mplus to output this? Thanks again!
Regarding your total and total indirect effects being zero, please send this output and data to Support.
Regarding Bayes, you don't need Model Indirect but can create these effects in Model Constraint. But you are right that Bayes does not yet handle weights.
Regarding standardizing an indirect effect, consider a model with x, m, y and the indirect effect from x to y via m obtained as the product of two slopes. When x is binary you don't want to standardize with respect to x. The indirect effect product is standardized with respect to y by dividing the product by the SD of y.
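The last paragraph of the reply can be written out as a short calculation. The slopes and the standard deviation of y below are made up; in practice the SD of y would come from the model-estimated variance (for example in TECH4).

```python
# Hypothetical values for illustration only.
a = 0.8      # raw slope of m on x, where x is binary (0/1)
b = 0.5      # raw slope of y on m
sd_y = 2.0   # model-estimated standard deviation of y

# Raw indirect effect of x on y via m: product of the two slopes.
indirect_raw = a * b

# Standardize with respect to y only (x is binary, so it is left alone).
indirect_stdy = indirect_raw / sd_y

print(f"Raw indirect effect:           {indirect_raw:.3f}")
print(f"Indirect effect in SD(y) units: {indirect_stdy:.3f}")
```

The result reads as the indirect change in y, in y standard deviation units, for a shift of x from 0 to 1.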
The result of above formula differs from the standardized effect output provided by Mplus, and I know it does not account for the variances and covariances of x1-x3. However, if I am wanting to interpret my model conditional on the covariates, wouldn't the above formula be appropriate, i.e., yielding a standardized effect from a to y for fixed x1-x3? Thanks so much.
Philip Jones posted on Wednesday, November 23, 2011 - 5:28 am
Thanks. I had tried that but was not getting convergence using the WLSMV estimator (convergence was no problem when I didn't explicitly include "x1-x3;"). So I thought I was doing something wrong. Do you know why that might be?
This approach does not work with WLSMV. If you use ML, you will get the standard errors for the standardized coefficients.
Arin Connell posted on Thursday, December 08, 2011 - 11:09 am
I am trying to test the equality of two regression coefficients using the model constraint commands to calculate relevant variances and standardized regression coefficients. The calculated variances appear fine (they match Tech4 output).
I am concerned that the calculated standardized regression coefficients do not match those produced in the STDYX output (calculated Beta1 = .201, Beta2 = .49, while STDYX estimates are Beta1=.634, Beta2 = .595), and am therefore not confident in the model test. Any insights would be appreciated.
negaff by NA BFI_N ; posaff by PA BFI_E ; PTSD by PSSI PSSSR ; dep by HRSD BDI ; physcon by PC_1 PC_2 ; negaff (varNA); posaff (varPA); negaff with posaff (covNAPA) ;
physcon on PTSD ; Dep on negaff (p1) posaff (p2); PTSD on negaff (p3); dep with PTSD ; PSSSR with BDI ; PSSI with HRSD ; physcon with dep@0; dep (p4) ; ptsd (p5) ;
Ahh! Great--assumed I was doing something dumb here, but it wasn't jumping out at me. Thanks!
Xiaolu Zhou posted on Saturday, January 07, 2012 - 10:20 am
I ran an SEM with Mplus: 2 independent latent variables are a and b; 4 dependent latent variables are c,d,e,f and g. Because there are too many variables, I used parcels for the observed variables. The result showed that d was not significantly related to a and b; e was significantly related to a and b. The regression I did with SPSS before, however, showed that d was significantly related to a and e was significantly related to b. What accounts for these different results between SEM and regression? Thanks a lot!
Maybe I don't understand what you are comparing. I assume that when you talk about SPSS regression, this is when you are using parcels, that is, sums of the items measuring the dependent latent variables instead of the latent variables themselves(and same for the exogenous latents). And you are comparing that to the full latent variable model with multiple indicators for the DV latents. If that understanding is correct, the different results would seem to be due to some indicators having direct effects from some of the exogenous factors - something that could be found out by requesting Modindices(All).
Xiaolu Zhou posted on Sunday, January 08, 2012 - 10:57 am
Thank you very much Bengt! Your understanding is correct. I found that one indicator of one DV latent has a direct effect from one indicator of an IV latent. I ran it and got a better result: one path is the same as the regression result now, but there is still another path that differs from the regression result. It seems that none of the remaining suggestions in the model modification indices is reasonable anymore. What should I do?
If the latent variable model fits well in terms of chi-square I would trust its results over the parcel version. The discrepancies may be due to several causes, including low reliability of the parcels.
Amber Watts posted on Monday, January 09, 2012 - 10:17 am
With regard to your earlier statement that significance tests for the standardized coefficients should not be used for the unstandardized coefficients-- what would explain why one would be significant if the other is not? For example, I am using a latent factor (Mets) to predict an observed continuous variable (memory). The unstandardized coefficient would suggest a non-significant relationship, while the stdyx suggests it is a significant relationship.
I have 1 independent variable (observed variable (x)) and 1 dependent variable (latent variable (y) made from categorical variables) and 1 dependent & independent variable (latent variable (z) made from categorical items).
Usevariables are X y z a b c d e f; Categorical are A b c d e f;
Model: Y by a b c; Z by d e f; Z on x; Y on x; Y on Z;
So I was wondering 1) since all my outcome variables are latent, does this mean that the results are not probit coefficients but linear regression coefficients? 2) since I have one observed variable I should be reporting the STDYX results, am I correct?
I’d like to use Mplus to generate some Monte Carlo data to use with PLS. In Mplus I can only specify non-standardized estimates for path coefficients etc. PLS calculates only standardized values. Is it possible to calculate non-standardized estimates from standardized estimates?
If you generate data where variables have variances of one, the data are standardized.
QianLi Xue posted on Thursday, February 23, 2012 - 7:21 pm
Hello, I understand that the correlation between the residuals of two dependent variables can be obtained as the residual covariance divided by the product of the standard deviations of the two residuals. How do I do significance testing of this calculated correlation in Mplus? Where can I find the variance-covariance matrix of the residual variance and covariance estimates? Thanks in advance for your help!
In most cases, Mplus gives the standard errors of the standardized coefficients when you ask for STANDARDIZED in the OUTPUT command. TECH3 contains the estimated covariances and correlations of the parameter estimates.
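For reference, the point estimate described in the question is a one-line calculation. The residual covariance and residual variances below are made-up numbers. The sketch gives only the correlation itself, not its standard error, which is what the STANDARDIZED output provides.

```python
import math

# Hypothetical estimates for illustration only.
res_cov = 0.30     # residual covariance between y1 and y2 (y1 WITH y2)
res_var_y1 = 1.00  # residual variance of y1
res_var_y2 = 0.36  # residual variance of y2

# Residual correlation: covariance over the product of residual SDs.
res_corr = res_cov / (math.sqrt(res_var_y1) * math.sqrt(res_var_y2))

print(f"Residual correlation: {res_corr:.3f}")
```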
Jiyeon So posted on Saturday, March 03, 2012 - 11:57 pm
Hi Prof. Muthen,
I want to see if standardized path coefficients are statistically different from each other. I heard that you can do this by using equality constraint (e.g., 0 = pathA- pathB).
I ran the model with these equality constraints and do not know what to look for in the output in order to see if the path coefficients are statistically significant. Please help me!
Jiyeon So posted on Sunday, March 04, 2012 - 9:45 pm
Thank you again!
So the second pair of path coefficients are the only statistically different pair?
It's counter-intuitive since ID = -.41, PSI = -.28, PR = -.26, TR = -.11.
So in terms of difference in the values of standardized coefficients, diff1 (= .13) was much larger than diff2 (= .02). Is it possible that diff1 is not a statistically significant difference when diff2 is a statistically significant difference?
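Whether a difference is significant depends on its standard error as well as its size. A minimal Python sketch, using the two differences quoted above together with standard errors that are purely hypothetical (the thread does not report them):

```python
import math


def wald_z_test(diff, se_diff):
    """z statistic and two-sided p-value for a difference between estimates."""
    z = diff / se_diff
    p = math.erfc(abs(z) / math.sqrt(2.0))
    return z, p


# Differences from the post; the standard errors are invented for illustration.
z1, p1 = wald_z_test(0.13, 0.10)    # larger difference, imprecisely estimated
z2, p2 = wald_z_test(0.02, 0.005)   # smaller difference, precisely estimated
```

Under these assumed standard errors the larger difference is not significant while the smaller one is, which is the pattern described in the question.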
I have a manifest path model containing ordinal exogenous variables, ordinal outcomes and ordinal mediators. Is it correct to use the STDY output to obtain the standardized path coefficients or should I use STDYX?
Which output should be used in case of several predictors where some predictors are continuous and others are ordinal?
No, scale free has nothing to do with scale parameters. For a model that is scale free, the same standardized coefficients are obtained whether the unstandardized raw data or standardized data are analyzed. Scale free models have no constraints across variables, for example, equality constraints. See the Bollen SEM book for further information.
I'd like to test some hypotheses about standardized coefficients (StdYX) after fitting a latent variable path model. Is there a way to do this using the existing Mplus parameter labeling conventions but applied to StdYX coefficients? Otherwise, the standardized coefficients I'm interested in are complicated functions of other more basic model parameters. I know I can export the parameters and the covariance matrix of the parameters and use the delta method but I'm checking to see if there is a short cut.
In the tech report, "Standardized Coefficients in Mplus", June 13, 2007, on page 2 it mentions, "We can obtain standard errors for the expression in (6-9) by the delta method if we have the joint asymptotic variance W for theta, Var(Y), and Var(eta)." Any chance that Mplus can output W?
With the unstandardized results, the scale of the factor loading is going to be determined by the scale of the factor indicators. Be sure you have no negative residual variances for the factor indicators.
Herb Marsh posted on Thursday, October 04, 2012 - 1:29 pm
For my multigroup SEM, I would like to have a solution in which the estimates are standardized in relation to a common within-group metric (available in LISREL but apparently not Mplus). Here is what I did:
1. I standardized all indicators (Mn=0, SD=1) in relation to the total sample (i.e., a common metric).
2. I ran a 'total group' analysis, disregarding the multiple groups.
3. I included a set of 25 dummy variables representing the 26 countries; I treated each of these as MIMIC variables that predicted the indicators in my model.
4. From this analysis, I took the factor loading for the first indicator of each factor.
5. I then ran my multiple group analysis based on the standardized indicators. However, instead of fixing the first factor loading of each factor to 1.0, I fixed it to the factor loading I got from the total group analysis (i.e., step 4 above).
My logic is that this is equivalent to standardization in relation to a common within-group standardization. This is relevant in that there are substantial group differences on some of the variables so that the average within-group standard deviations would be quite different than the total group standardization. For the multiple group analysis, the model is fit separately to each group so that the within-group common metric is appropriate.
Is this appropriate and is there an easier way to do it?
I don't follow your proposal, but it sounds like you want to standardize with respect to pooled-within group variances - if so, that cov matrix can be obtained in a separate run. Maybe you want to try and see if this gives the same answer as your proposal.
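For a single variable, the pooled-within variance weights each group's variance by its degrees of freedom. The following Python sketch uses toy data and a hypothetical function name; it shows how the pooled-within variance can differ sharply from the total-sample variance when group means differ.

```python
import statistics


def pooled_within_variance(groups):
    """Pooled within-group variance: sum((n_g - 1) * s_g^2) / (N - G)."""
    numerator = sum((len(g) - 1) * statistics.variance(g) for g in groups)
    denominator = sum(len(g) for g in groups) - len(groups)
    return numerator / denominator


# Toy data: two groups with the same spread but very different means.
group_a = [1.0, 2.0, 3.0]
group_b = [11.0, 12.0, 13.0]

pooled_var = pooled_within_variance([group_a, group_b])
total_var = statistics.variance(group_a + group_b)
```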
Dave Graham posted on Thursday, November 08, 2012 - 1:42 pm
Dear Professors Muthen, in my model I am regressing an observed and two latent variables on two dichotomous independent variables (this is not the full model, but the part where the problem occurs).
I get higher significance levels for the path from the first dichotomous variable to the observed outcome compared to the paths from the second dichotomous variable to the latent constructs. However, the standardized coefficients are higher for the paths to the latent variable. No matter what type of standardization I use (STDYX, STDY, STD).
I can see that the Est./S.E.-values are bigger for the coefficient estimated for the observed outcome and should therefore be more significant. However, the result does not intuitively make sense to me. Shouldn't the significance level be reflected in the coefficients after standardization?
I am not sure if I have misspecified my model. I tried different latent variables (but the same observed outcome) but the problem/phenomenon stays the same. It would really be great if you could give me a hint on whether this is an explainable and sensible result or if I might have a wrong model.
Lois Downey posted on Friday, March 01, 2013 - 12:15 pm
I need 95% confidence intervals for standardized coefficients in a complex regression model with a latent-variable outcome (ordered categorical indicators) and several manifest predictors. (My motivation for seeking standardized estimates was my belief that this would keep the estimates independent of which indicator was used to scale the latent variable.)
However, with the default WLSMV estimation, 95% CIs seem to be given for only the unstandardized coefficients. By contrast, if I specify restricted maximum likelihood estimation with a probit link, the 95% CIs are given for both the unstandardized coefficients and for coefficients standardized any of the three ways.
Several questions: 1) Is there any reason not to use the MLR/probit solution in lieu of WLSMV?
2) How do I test for effect modification in an MLR/probit model in Mplus. (Stata appears to have a utility for evaluation of interaction effects in logistic and probit models, but so far I've been unable to locate this facility in Mplus.)
3) Is there some reason for the omission of CIs for standardized coefficients with WLSMV?
4) I assume that the use of WLSMV as the default for models of this type is based on the designers' belief that it is preferable for some reason. Can you explain the reason for this preference (in lay terms)?
1. No. 2. Create the interaction terms using the DEFINE command. 3. We don't do standard errors for standardized coefficients in WLSMV for conditional models. Therefore, confidence intervals cannot be created. This will change in one of the next versions of Mplus. 4. We use WLSMV as the default because it does not require numerical integration with categorical outcomes and because residual correlations are more easily included in the model. WLSMV is not preferable to maximum likelihood.
Lois Downey posted on Saturday, March 02, 2013 - 7:16 am
Thank you for the answers. With regard to response #2, I was not specific enough in the question I asked. I understand how to compute the interaction term. However, there is an article in The Stata Journal (http://www.stata-journal.com/article.html?article=st0063) that seems to suggest that a different method is needed for correctly estimating the interaction EFFECT and its SIGNIFICANCE when the model is based on either probit or logistic regression. Is there a way to obtain these corrected values in Mplus?
I have a few questions about the regression coefficients reported in the output. I am running a simple mediation in which the IV and Mediator are continuous, and four DVs which are categorical. One of the DVs is dichotomous, the other DVs are ordered categories.
1. Do the regression coefficients reported for each of the relationships vary in type so that the IV to M path is a linear regression and the paths to the DVs are probit coefficients?
2. How does one interpret coefficients considering they express different things, e.g linear interpreted as unit changes in both variables; probits as changes in the probability of a z-score? Does the output make things 'easier' in that the regressions reported are scaled to allow for similar interpretation?
3. This is not a coefficient question per se. When using categorical variables, do ordered categorical variables need to be coded as dummy variables, or can they be used in their raw ordered form in one variable, e.g. 1, 2,3,4,5?
4. Related to 3. When doing simple regression a coefficient of .88 for predicting a dichotomous DV from a continuous IV was not significant. However a variable with multiple categories predicted from the same IV had a smaller coefficient but was significant. How does this arise?
ML is not available as I am bootstrapping the analysis. (v6.11) But if I have a mix of variable types under wlsmv (which is default I think when categoricals are used) are all coefficients probit, even those for continuous variables?
I have seen the paper but found it rather technical. However, does it apply equally to ordinal variables?
Re q4 is there any way to fix this, for example by dichotomising all dependent variables?
Hi there, I am checking my use of the Mplus output in regards to standardized coefficients. I have a simple path analysis with only observed variables: y on c x1 x2 x3; c on x1 x2; The estimator is WLSMV as c is ordered categorical (1 2 3). x1 and y are continuous; x2 and x3 are binary (0/1, as in gender) but not designated as such because they are exogenous. I will report StdYX for y on x1 and calculate StdY to report for y on c x2 x3. My confusion arises in the paths c (ordinal) on x1 (continuous) and c on x2 (binary). I think these are probit regressions but I'm unsure which form of standardized coefficient is best to use. Thanks for any advice.
The type of coefficient to use depends on the scale of the covariate. Covariates in regression can be treated as binary or continuous. For binary covariates, use StdY. For continuous covariates, use StdYX.
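The two standardizations differ only in whether the covariate's standard deviation enters: StdYX = b * SD(x) / SD(y), while StdY = b / SD(y). A small Python sketch with made-up values (the function names are illustrative, not Mplus options):

```python
def stdyx(b, sd_x, sd_y):
    """Change in y, in SD(y) units, for a one-SD change in x."""
    return b * sd_x / sd_y


def stdy(b, sd_y):
    """Change in y, in SD(y) units, for a one-unit change in x (e.g. 0 to 1)."""
    return b / sd_y


# Hypothetical unstandardized slope and standard deviations.
b = 0.8
coef_stdyx = stdyx(b, sd_x=0.5, sd_y=2.0)
coef_stdy = stdy(b, sd_y=2.0)
```

For a 0/1 covariate a one-SD change has no natural meaning, which is why StdY, the effect of moving from 0 to 1 in SD(y) units, is the one recommended above.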
lamjas posted on Tuesday, April 02, 2013 - 7:53 pm
I have a path model (no latent variables) with two binary and two continuous observed variables. The model is like this:
u1 on u2 c1 c2; u2 c1 on c2;
where u are binary variables and c are continuous variables. The estimator is WLSMV, the default.
I have questions whether I should use unstandardized or standardized coefficients when I report the results.
(1) Should I report unstandardized coefficients for paths (u1 on u2; u1 on c1; u1 on c2; u2 on c2) involving a binary DV, with odds ratios, in the results? These paths look like logistic regressions to me.
(2) For the path involving two continuous variables (c1 on c2), I believe I should use StdYX coefficients, is that right?
(3) For indirect effects, should I use the StdYX coefficients provided by the MODEL INDIRECT command?
With WLSMV, u2 as a covariate is the continuous latent response variable underlying u2. So in all cases your covariates are continuous. You should use StdYX in all cases. For binary covariates, use StdY.
lamjas posted on Thursday, April 04, 2013 - 6:15 pm
I have a follow-up question to confirm the report of indirect effects.
For the indirect effects, c2--> c1 --> u1, I believe I should use StdYX provided in Model indirect command as both direct effects are StdYX.
How about c2 --> u2 --> u1? For direct effect, c2 --> u2 is StdYX, while u2--> u1 is StdY, should I use the coefficient by StdYX times StdY?
You look at the exogenous variable. Both u1 and u2 are continuous latent response variables with WLSMV when they are used as covariates. So you would use StdYX.
Hemant Kher posted on Saturday, April 06, 2013 - 10:51 am
I ran a growth model with a predictor for the latent intercept / slope, and some distal outcomes predicted by the latent intercept / slope. When I see the raw data model results, all of the paths from the latent intercept / slope to the distal outcomes are non-significant. Yet, when I see the same paths within the STDYX portion, I see that some of the paths are (in some cases very highly) significant. I am confused as to why there are two seemingly contradictory results.
Cecily Na posted on Sunday, April 07, 2013 - 8:48 am
Dear professors, I am running a two-level model without latent factors, but with a dichotomous outcome. The unstandardized coefficient of the between level path is not significant, but the standardized path is highly significant (beta= -0.999, p< .001). What is the reason? I used STDYX. Thank you!
I wonder how to interpret the coefficients labeled "StdYX" and "StdY" correctly? Do we have the cutoffs that can be used to interpret the values? Can we use the 0.2, 0.5, and 0.7 (Cohen's effect size) for interpretation? Thank you so much!
The interpretation is shown under the STANDARDIZED option in the user's guide. These can be used as effect size for a binary covariate because they represent a mean change but not for a continuous covariate.
I am performing path analysis with various sorts of variables. When requesting standardized output, I get the following message:
"STANDARDIZED COEFFICIENTS ARE NOT AVAILABLE FOR MODELS WITH CENSORED, CATEGORICAL, NOMINAL, COUNT, OR CONTINUOUS-TIME SURVIVAL MEDIATING OR PREDICTOR VARIABLES."
(1) Is there a way to calculate the standardized coefficients myself for binary, censored and continuous covariates, or would it be sufficient to report the unstandardized regression coefficients?
(2) In a paper I reported the unstandardized regression coefficients of the path model in tables. However, the reviewers want me to visualize the path model. But it seems unusual to me to report the unstandardized regression coefficients next to the arrows in the path model. Would it be OK to show the unstandardized regression coefficients next to the arrows?
(3)When I tried to model mere correlation between two variables in the path model, I got the following message: "Covariances for categorical, censored, count or nominal variables with other observed variables are not defined." Might there be another way to take into account the relationship between those two variables?
Dear Mplus Team, with one categorical variable y and one factor f (measured by other variables u1-u10) and the probit regression y on f: What is the meaning of the STDYX Standardization solution when I'm using the MLR estimator with Probit Link? With WLSMV this can be seen as polyserial correlation (regression of the standardized underlying latent y* variable on the standardized continuous factor f)? But in the MLR-Probit-Model, how can I interpret the STDYX solution with regard to y ON f? (It is a multiple group model and I’m interested in the polyserial correlations between y and f - but I must take the MLR estimator because of empty cells).
With a categorical y, the ML(R)-probit STDYX for y ON f pertains to the linear regression coefficient of y* regressed on f. This is the same as with WLSMV, except ML-probit considers a Theta parameterization and WLSMV Delta by default.
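A sketch of what this amounts to numerically, written in Python under stated assumptions: a single predictor f, and a probit model in which the residual variance of y* is fixed at 1, so that Var(y*) = b^2 * Var(f) + 1. The function name and the values are hypothetical.

```python
import math


def stdyx_probit(b, var_f, resid_var=1.0):
    """Standardized slope of the latent response y* regressed on f."""
    var_ystar = b * b * var_f + resid_var  # implied variance of y*
    return b * math.sqrt(var_f) / math.sqrt(var_ystar)


# Hypothetical probit slope and factor variance.
coef = stdyx_probit(b=1.0, var_f=1.0)
```

With only one predictor, this standardized slope is also the model-implied correlation between y* and f.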
Tom Booth posted on Tuesday, June 25, 2013 - 2:56 am
I have a model in which a latent variable (which has count variables as indicators) is regressed on a continuous variable. I appreciate the factor loadings of this model are best reported raw coefficients, but which standardization is most meaningful for the regression path involving the latent construct and continuous variable?
I have run a structural equation model with both continuous latent and continuous observed variables.
My problem is that when I interpret the standardized coefficients, I find that some coefficients are very significant (for instance, .17, p < .001), while other, larger coefficients are not significant (p > .05). If the coefficients are standardized, it is my understanding that I should be able to directly compare them, but the output seems to defy that logic. Could you help me understand what is going on?
Yes that is exactly what I am saying. I have standardized coefficients that are small, but significant, and others that are large, but not significant. My understanding with standardized coefficients is that I should be able to compare them in terms of magnitude, and so a smaller coefficient should not be significant if a larger coefficient is not.
And said that we can use MODEL CONSTRAINT to calculate the standardised estimate manually. The problem is that how can we access to the sample covariance terms when the estimated covariance terms are specified to be zero in the model (as above)?
While I was waiting for your answer, I played around with the model to observe Mplus behaviour. It seems that it still specifies covariances between independent variables even though I did not specify WITH at all. And after I specified one covariance, degrees of freedom and goodness-of-fit indices changed. Nonetheless, they became the same when I created covariances among all independent variables.
Right now I am figuring out how to extend the same formula to a more complex model:
y ON x1 x2; x1 ON x3 x4 (p1-p2); x3 ON x4 x5 x6 (p3-p5);
I tried to create a formula under MODEL CONSTRAINT for calculating variance of x1, but the figure does not match the result (i.e. in the estimated covariance matrix). The problem is that x4 indirectly affects x1 through x3, and this is different from having multiple covariates that do not affect one another (i.e. no residual term). I know that parts of x1's variance comes from p1**2*vx3 (i.e. vx3 is variance of x3 and the calculation is correct since it matches the value in the estimated covariance matrix), p2**2*vx4, and rx1 (i.e. x1's residual term). I do not think that I can use covariance terms here since x4 directly affects x3. Could you clarify this point?
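One way to check a hand-derived variance for this model is to compute Var(x1) twice: once with an explicit covariance term, Var(x1) = p1^2*Var(x3) + p2^2*Var(x4) + 2*p1*p2*Cov(x3,x4) + rx1, and once from the reduced form in which x1 is written in terms of x4, x5, x6 and the residuals. The Python sketch below assumes residuals that are uncorrelated with each other and with the exogenous variables; all parameter values are invented.

```python
def quad_form(a, s):
    """a' S a for a weight vector a and covariance matrix S."""
    n = len(a)
    return sum(a[i] * s[i][j] * a[j] for i in range(n) for j in range(n))


# Hypothetical values: x1 ON x3 x4 (p1-p2); x3 ON x4 x5 x6 (p3-p5).
p1, p2, p3, p4, p5 = 0.5, 0.3, 0.4, 0.2, 0.1
S = [[1.0, 0.3, 0.2],   # covariance matrix of x4, x5, x6
     [0.3, 1.0, 0.1],
     [0.2, 0.1, 1.0]]
r1, r3 = 0.6, 0.7       # residual variances of x1 and x3

# Pieces needed for the formula with the covariance term.
var_x3 = quad_form([p3, p4, p5], S) + r3
cov_x3_x4 = p3 * S[0][0] + p4 * S[1][0] + p5 * S[2][0]
var_x1_formula = (p1 ** 2 * var_x3 + p2 ** 2 * S[0][0]
                  + 2 * p1 * p2 * cov_x3_x4 + r1)

# Reduced form: x1 = (p1*p3 + p2)*x4 + p1*p4*x5 + p1*p5*x6 + p1*e3 + e1.
var_x1_reduced = (quad_form([p1 * p3 + p2, p1 * p4, p1 * p5], S)
                  + p1 ** 2 * r3 + r1)

# Leaving out the covariance term gives a different (too small) value here.
var_x1_no_cov = p1 ** 2 * var_x3 + p2 ** 2 * S[0][0] + r1
```

The two full expressions agree, which suggests the term 2*p1*p2*Cov(x3,x4) is needed even though x4 is a predictor of x3; Cov(x3,x4) is itself a function of p3 to p5 and the exogenous covariances.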
I am running a SEM model with a dichotomous outcome. I don't think I can use the MLR estimator to get logit coefficients because then you don't get model fit indices (i.e., RMSEA), which I need. So I have to use the WLSMV estimator, which produces probit coefficients, because that comes with model fit indices.
However, I am struggling to interpret them. I have the formula for computing marginal probabilities, which I have seen in the forum, but I'm unsure how to use it:
P(y=1) = f(-threshold + b1*x1 + b2*x2 ....)
1) I don't know what f (the cumulative normal distribution function) is. 2) What threshold do I use: the threshold for the outcome, or the threshold for the predictor? 3) If my x1 variable is a latent factor, what value do I put in the model? 4) And if I have three paths of indirect effects, do I have to include all their coefficients in the equation?
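The formula can be evaluated as in the following Python sketch, where f is the standard normal cumulative distribution function and the threshold is that of the binary outcome. It assumes a probit model whose latent response has a residual with unit variance; the function names and numbers are hypothetical.

```python
import math


def normal_cdf(z):
    """Standard normal cumulative distribution function."""
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))


def probit_probability(threshold, coefs, values):
    """P(y = 1 | x) = F(-threshold + b1*x1 + b2*x2 + ...)."""
    linear = -threshold + sum(b * x for b, x in zip(coefs, values))
    return normal_cdf(linear)


# Hypothetical threshold and slopes, evaluated at two chosen predictor values.
p_low = probit_probability(0.5, [0.8, 0.3], [0.0, 1.0])
p_high = probit_probability(0.5, [0.8, 0.3], [1.0, 1.0])
```

For a latent predictor, one option is to plug in chosen values of the factor, for example its mean and one standard deviation above the mean, and compare the resulting probabilities.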
I have checked with maximum likelihood results (estimated matrix), and the calculation is correct, but I would like to cross-check with you that it is indeed correct.
I used this formula feeding into Bayesian path analysis, but this time the calculation results differ from the estimated variance–covariance matrix. Also, posterior predictive p value is changed just because I created new variables under MODEL CONSTRAINT. Does Mplus treat the calculation as a separate independent variable? It seems to have an impact on model fit in Bayesian analysis.
db40 posted on Friday, November 15, 2013 - 12:48 pm
Dear mplus team,
I have a fairly basic mediation model, as seen in my syntax. One of the problems I'm experiencing is a large Est./S.E. = 999.000, as shown below.
I have searched and read that I may have negative variances. Apologies, but what does this mean and how can I get around this problem? I have reduced the model significantly to see if this resolves the issue, as below, but it hasn't.
                                                    Two-Tailed
                    Estimate       S.E.  Est./S.E.    P-Value
Effects from SINGLE to SUICTHLF
  Total                0.376      0.049      7.733      0.000
  Total indirect       0.000      0.000    999.000      0.000
VARIABLE: NAMES ARE
pserial Age female male married single SWD MH suicthyr suicthlf suicatwk suicatyr Soc_2010 wt_ints1 occ_prestige high medium low N_WS WS_2 ResSex;
Raj Kumar posted on Tuesday, December 10, 2013 - 2:30 pm
I'm new to Mplus program, so forgive me if this is a trivial question.
My research question is as follows:
I would like to get standardized estimates to compare in this mediation analysis. From what I understand, I should use the STDYX for the Y on X, and M on X.
I'm confused about what to use for the mediator's standardized estimate, as it is binary and STDYX is not appropriate. I believe it is STDY, but I'm still not sure where you get the Y* in order to derive this. Also, can you compare the value of a STDYX estimate to a STDY estimate directly?
I have gone through the user manual extensively, and am still finding it unclear. I would greatly appreciate if you could walk through the steps for completing this problem.
If you use WLSMV, a continuous latent response variable behind the observed binary M is used in the modeling, so the regular STDYX estimates are fine.
Stephanie posted on Monday, January 27, 2014 - 5:19 am
As I am using the WLSMV, I do not get results for the p-values and standard errors of the standardized results. But if I would like to report the significance of these standardized estimates, can I assume that if the corresponding unstandardized results are significant the standardized are as well? And if I can’t, is there another possibility to get their p-values?
And my second question: Did I understand it correctly, that with WLSMV I should report the stdyx results for both, continuous and binary variables?
I am constructing a SEM with both binary and continuous variables (TYPE is general; estimator is ml; integration is montecarlo). Like Heidi Knipprath posted on 6/11/03, I get the error message "STANDARDIZED COEFFICIENTS ARE NOT AVAILABLE FOR MODELS WITH CENSORED, CATEGORICAL, NOMINAL, COUNT, OR CONTINUOUS-TIME SURVIVAL MEDIATING OR PREDICTOR VARIABLES". I get the same message when I request stdY or stdYX. Based on the discussion on the board, it seems that standardized coefficients should be available for binary variables.
I have a multiple group model where the grouping variable is gender. I ran an unconstrained model in which the paths were free to vary by gender. In the output, one of the parameters is significant for both males and females. Is there a way to test if the coefficient for males is significantly larger than the coefficient for females?
Hi, The standardized coefficients at the between level in a twolevel MSEM were quite large (.3 - .4), all with p-values <.001. The model ICC, however, was .012.
This is a typical school effects model. I grandmean centered the between school predictors.
Because the ICC seemed negligible, I reran the model accounting for stratified sampling but without the multilevel modeling. The standardized school effects are now very small (.01 -.03) and not significant. I am surprised by the drastic change in the standardized coefficients. Are you?
Hello. I am wondering why I get the same estimate in these two models (I have this problem with more models) in the STDYX Standardization output. Note that all the variables are binary, and I use ESTIMATOR=WLSMV and PARAMETERIZATION=THETA;
M.G. Keijer posted on Thursday, May 15, 2014 - 12:08 pm
I was wondering how I could constrain standardized coefficients to be the same among groups. In a basic multi-group model the constrained unstandardized coefficients are the same among groups, but the standardized coefficients differ, probably because the group-specific estimates differ. Is there still a way to get the same STDYX coefficients, for instance by constraining the "PSI part"? If so, how can I do this? Thank you in advance.
You should be very careful considering standardized coefficients for group comparisons. The statistical literature has many articles warning against doing that and reviewers will most likely protest. The unstandardized coefficients are the ones likely to be invariant, not the standardized ones. This is because the different groups most likely have different variances.
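A small numeric illustration of the last point, in Python with invented values: the same unstandardized slope b is held equal in two groups, but the groups differ in Var(x), so Var(y) and the StdYX coefficient differ as well.

```python
import math


def stdyx_from_model(b, var_x, resid_var):
    """StdYX for y = b*x + e, using the model-implied Var(y)."""
    var_y = b * b * var_x + resid_var
    return b * math.sqrt(var_x) / math.sqrt(var_y)


# Equal unstandardized slope and residual variance, different Var(x).
b = 0.5
std_group_a = stdyx_from_model(b, var_x=1.0, resid_var=1.0)
std_group_b = stdyx_from_model(b, var_x=4.0, resid_var=1.0)
```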
I'm running a TWOLEVEL model and would like to confirm whether the coefficient estimates in my model results are standardized. The previous answer on this string doesn't seem to apply since I don't have "Std" as a descriptor for my coefficients. It doesn't seem like my coefficients could be standardized since I have coefficients greater than 1. Thank you.
I am conducting a series of SEM-based latent growth curves with censored data due to a stacking up of the data at the top of the distribution. Based on reading through the Mplus discussion boards I am using the WLSMV estimator. I have covariates in my model, but I understand that Version 7.2 will provide fully standardized output for WLSMV with covariates. I just downloaded the new version but I am still not getting standardized standard errors and p-values in my output. Any thoughts as to why I'm still not getting the full standardized output? Thank you in advance!
Dear Linda (or Bengt)! I have a logistic model with a latent predictor and would like to present a coefficient that stands for "the increase in logit for outcome = 1 when the latent predictor increases by one SD". Is this what the std-standardized coefficient stands for?
Dear Drs. Muthén, I am testing a cross-lagged model with three assessments over time, t0, t1, t2. One of my variables is binary, and the other is continuous. When I inspect the correlations in SPSS and Mplus, I see major discrepancies for the binary variable. The t0-t1 and t1-t2 correlations in SPSS are more or less comparable, but in Mplus the same values are .39 and .82, respectively (using the exact same sample).
I see the same differences in the stability paths of the cross-lagged model. For the binary variable, the standardized (STDYX) stability coefficient from t0 to t1 is .33, whereas the stability coefficient from t1 to t2 is .82. This is weird because these observations are rather stable over time, 89% and 91% of the sample remaining in the same category from t0 to t1 and t1 to t2.
Can you please give me a hint about what is the reason for these differences in the estimated values?
Dear Linda, Thank you very much for your prompt response. Can you please also comment on the differences in the stability paths in MPlus.
"I see the same differences in the stability paths of the cross-lagge model. For the binary variable, the standardized (styx) stability coefficient from t0 to t1 is .33, whereas the stability coefficient from t1 to t2 is .82. This is weird because these observations are rather stable over time, 89% and 91% of the sample remaining in the same category from t0 to t1 and t1 to t2."
Why do we get so much different stability coefficients even though the change over time between each time periods are very similar to each other?
If I do not define t1 and t2 observations as binary, what would the estimate be based on? Logistic regression, or OLS?
Q1. The different sample statistics used (Pearson vs tetrachorics) make it impossible to predict what differences you might see, so we cannot comment on the stabilities.
Q2. Linear regressions using ML.
Mandy Cao posted on Monday, December 01, 2014 - 8:23 am
Dear Dr. Muthen,
I am struggling with one question and would appreciate your help a lot: I am using 5 IVs to predict engagement, then engagement predicts one DV. I found 3 IVs have significant path coefficients to engagement: .35, .26, .11 respectively (standardized). So I concluded that the one with .35 was the strongest predictor of work engagement.
One of my committee members said I cannot eyeball the coefficients to draw this conclusion and I need to test it (but did not tell me how). Another committee member told me to conduct a relative importance test or dominance test.
After reading two articles, I found it impossible to do these 2 tests because I loaded every item onto the relevant latent construct in the first-step CFA and then did the structural regression part. The examples I have read about relative importance or dominance analysis seem to have one score represent one variable.
I have consulted with my research professor and he told me it is impossible to do such tests. I did read the Mplus menu but did not find relevant information.
Could you please kindly provide your insights? (FYI: these 5 independent variables are correlated)
If you are saying that you want to test that the standardized coefficient x is greater than the standardized coefficient y, that is a more advanced topic. You may want to look at the article on our website:
Van de Schoot, R., Hoijtink, H., Hallquist, M. N., & Boelen, P.A. (2012). Bayesian evaluation of inequality-constrained hypotheses in SEM models using Mplus. Structural Equation Modeling, 19, 593-609.
I am running a latent profile analysis model where the latent class moderates a simple regression relationship between two continuous variables.
The latent classes are determined by four continuous variables, (that are not the exogenous/endogenous variables).
I have included the stdyx command in the output to obtain the standardized results, and I want to make sure I am interpreting the output correctly.
1. The means, and parameter estimates (coefficients and intercepts) in the unstandardized output are just the means of each variable within each class, as well and the unstandardized slope coefficients, correct?
2. are the standardized parameter estimates for the coefficient Y on X just typical, standardized coefficients (betas)?
3. How are the means of the indicators within each latent class standardized in the standardized output?
4. What is the interpretation of the standardized intercept (the endogenous variable)? In the unstandardized output, the intercept is just the mean of the endogenous variable(s) within each class, weighted by the posterior probabilities, correct?
Another question: I want to be able to determine if the parameter estimates (slope coefficients and intercepts) are statistically different from one another across the different classes. Is there a test associated with one of the tech outputs that reports this? Or, if I want to do this by hand, is it as simple as calculating confidence intervals for each estimate based on the standard errors and seeing if the confidence intervals of any two classes overlap with one another?
Hi, I've been running a 2nd order bivariate growth curve model to look at the longitudinal change (covariance among slopes) in two domains. At the first level I have factors defined by at least three manifest variables, measured in years 1,2,and 5. The second order is the linear growth across five years with Lx and Sx for the x-factors and Ly and Sy for the y-factors. All variables are continuous and estimation is FIML.
The covariance between Sx and Sy is statistically significant with t=3.5, p<.001, implying that those who change in variable x also do so in variable y. Accordingly, the correlation in STDYX between Sx and Sy is r=.76, but in the STDYX output the correlation comes out as not significant (z=1.194, p=.23).
I don't understand the reason for this? Also, if I run a nested model to obtain the likelihood ratio (LR) test with the correlation (covariance) constrained to 0 I get LR=14 with 1df, which is statistically significant - as I would have expected from the t-value of the covariance test.
I'm pretty sure there is a very easy solution to my confusion - but right now I don't seem to be able to find it.
Hello, I conducted SEM analyses using Mplus several years ago and am finally publishing. When I reported R2, researchers/reviewers asked me how the R2 was calculated. They are also asking me whether the R2 I reported is based on a Cohen's d or a Pearson's r value. Can I assume that R2 as reported in SEM from Mplus is based on Pearson's r? They said they need to know in order to evaluate the magnitude (I reported an R2=.4), since they use a different cutoff point depending on whether it is d or r based. I'm a bit rusty on how R2 is calculated or what it is based on. Can you please remind me? Thank you, Suellen Hopfer, PhD REAL Prevention
Hello Linda, The scale of the DV is categorical (vaccination: yes/no). The SEM model tests the impacts of an intervention (dummy coded for 3 versions of an intervention) and its impact on 2 latent mediators on the outcome of vaccination. Suellen
For a categorical DV, Mplus gives the R-square for a continuous latent response variable underlying the categorical observed variable. So it is the same as a linear regression R-square for that latent response variable. This was proposed in
McKelvey, R.D. & Zavoina, W. (1975). A statistical model for the analysis of ordinal level dependent variables. Journal of Mathematical Sociology, 4, 103-120.
and it is also used in other texts, such as Scott Long's nice book on Regression Models for Categorical... and the Snijders & Boskers multilevel book.
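As a rough illustration of the latent-response R-square described above, the sketch below computes explained variance over total variance for the latent response variable. The function name and the numbers are hypothetical, and it assumes the residual variance of the latent response is fixed at 1 (probit) or pi^2/3 (logit); the value Mplus reports can depend on the estimator and parameterization used.

```python
import math

def latent_response_r2(coefs, cov_x, link="probit"):
    """McKelvey-Zavoina-style R-square for the latent response y*."""
    k = len(coefs)
    # Explained variance of y*: b' * Cov(x) * b
    explained = sum(
        coefs[i] * cov_x[i][j] * coefs[j]
        for i in range(k)
        for j in range(k)
    )
    # Assumed residual variance of y*: 1 for probit, pi^2/3 for logit
    residual = 1.0 if link == "probit" else math.pi ** 2 / 3
    return explained / (explained + residual)

# Hypothetical values: one predictor with slope 0.8 and variance 1.0
r2 = latent_response_r2([0.8], [[1.0]])
print(round(r2, 3))
```

Because this is a variance-explained ratio for a linear regression on the latent response, it is read like an ordinary regression R-square rather than as something derived from Cohen's d.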
Note also that with a binary outcome you may want to consider indirect and direct effects based on counterfactuals - this is becoming the new standard for mediation modeling; see e.g.
Muthén, B. & Asparouhov, T. (2015). Causal effects in mediation modeling: An introduction with applications to latent variables. Structural Equation Modeling: A Multidisciplinary Journal, 22(1), 12-23. DOI:10.1080/10705511.2014.935843
Hello, I'm running a dyadic model where the paths connecting the same variables in both participants need to be set to be the same. I set my parameters to be equal using the following format:
x_1 on y_1 (1); x_2 on y_2 (1);
Additionally, I set my means, variances, and thresholds to be the same:
[x_1] (2); [x_2] (2);
[y_1] (3); [y_2] (3);
When I estimate my model, the unstandardized estimates are indeed equal as specified. However, the standard estimates (STDYX) are different. Shouldn't the standardized estimates still be equal to each other for the ones that were constrained?
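One possible explanation, offered as an illustration rather than as a reply from the thread: a StdYX coefficient rescales the unstandardized slope by the ratio of the predictor's and the outcome's standard deviations, so two equal slopes can standardize differently whenever those model-implied standard deviations differ between partners. The function name and all numbers below are hypothetical.

```python
def stdyx(b, sd_predictor, sd_outcome):
    """StdYX-style coefficient: slope rescaled by SD(predictor) / SD(outcome)."""
    return b * sd_predictor / sd_outcome

b = 0.50  # same unstandardized slope for both partners (equality constraint)

# Hypothetical model-implied SDs that differ between the two partners
partner1 = stdyx(b, sd_predictor=1.2, sd_outcome=2.0)
partner2 = stdyx(b, sd_predictor=0.9, sd_outcome=2.5)
print(partner1, partner2)
```

If the relevant total variances were also equal across partners, the two standardized values would coincide.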
I would like some clarification/ reassurance that I am calculating the correct standardised estimates for my model.
I am running a SEM with a CFA with two factors, and a mediation model which tests the effect of the two factors (P and I) on an outcome Y, via a mediator M. I also have some covariates (Xs) that I regress Y on. I am using WLSMV estimation, hence I obtain probit coefficients for the model.
Is it correct for me to use STDYX for continuous X variables and STDYX/ Sample Standard Deviation of X, for dummy variables?
ehrbc1 posted on Thursday, August 06, 2015 - 4:46 pm
Hi there, I am trying to compare the strength of two different indirect effects. I have followed the approach as highlighted in Lau, R. S., & Cheung, G. W. (2012). Estimating and comparing specific mediation effects in complex latent variable models. Organizational Research Methods.
So, I have created an additional model parameter that represents the difference in strength between two indirect effects using the following: MODEL CONSTRAINT: NEW (MED1 MED2 DM1); MED1 = cp*sc; MED2 = ap*sc; DM1 = MED1-MED2; ANALYSIS: BOOTSTRAP = 1000; OUTPUT: stdYX CINTERVAL(BCBOOTSTRAP); stdyx CINTERVAL; STANDARDIZED (STDYX);
From what I gather, the DM variable is based on the unstandardized indirect effects and the resulting coefficient is unstandardized. Is it possible to compare the standardized indirect effects and/or get a resulting standardized coefficient?
Hello, I am examining the relation between a CFA measurement model for depression and several binary and continuous predictors using WLSMV estimation. Question: Is it ok if I report the unstandardized estimates, the standard errors and/or the p values of the unstandardized estimates together with the STDYX estimates for factor loadings and continuous covariates, and the STD estimates for categorical covariates? (If I only reported the standardized estimates I would have no way of saying whether they are significant)
I am running a mediation model with probit regression using WLSMV estimation. For direct path coefficients, I have used STDYX/ sample standard deviation to report standardised coefficients for binary X variables.
How should I report standardised indirect effects of binary variables? I get STDYX and STD in the output, but it doesn't seem correct to follow the same method as above for indirect effects.
We recommend StdY for binary covariates and StdYX for continuous covariates. If you get only StdYX and Std, you need to adjust StdYX to StdY. The formulas are in the user's guide under the STANDARDIZED option.
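The adjustment mentioned above can be sketched as follows, assuming the usual definitions StdYX = b*SD(x)/SD(y) and StdY = b/SD(y), so that StdY = StdYX / SD(x). The helper name and the numbers are hypothetical, and the SD of the binary covariate is approximated here from its proportion; the exact variance used should be checked against the user's guide.

```python
def stdyx_to_stdy(stdyx_value, sd_x):
    """Undo the SD(x) scaling: StdY = b / SD(y) = StdYX / SD(x)."""
    return stdyx_value / sd_x

# Hypothetical binary covariate with 30% of cases coded 1
p = 0.3
sd_x = (p * (1 - p)) ** 0.5      # SD of a 0/1 variable (population formula)

stdy = stdyx_to_stdy(0.11, sd_x)  # 0.11 is a made-up StdYX value
print(round(stdy, 3))
```

StdY is then read as the change in the outcome, in outcome standard deviation units, when the binary covariate moves from 0 to 1.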
C. Lechner posted on Sunday, October 25, 2015 - 11:33 am
Suppose I want to compare the effects of a latent covariate f1 on two types of outcomes: 1. a latent outcome f2 2. a manifest outcome Y whereby Y and all manifest indicators of f2 are 5-point categorical variables.
My question is: Can I use the Wald test to statistically compare the size of the influence of f1 on f2 and of f1 on Y?
I am asking because I am unsure what the impact of Y entering the estimation equation as a categorical variable on the viability of such a comparison would be.
You can compare standardized coefficients only when the metric of the unstandardized dependent variables are the same. You cannot make the comparison you describe.
C. Lechner posted on Sunday, October 25, 2015 - 1:25 pm
Thank you, Linda, this is what I suspected. The problem is that I cannot easily bring Y to the same metric as f2 by standardising it because Y is categorical. Are you aware of a possible solution to this problem? For example, should I estimate a single-indicator latent variable for Y in order to bring it to the same metric as f2? Are there other solutions? Thanks!
I am not sure there is a satisfactory solution for this.
Irene Dias posted on Tuesday, November 24, 2015 - 10:51 am
I am performing a path analysis (saturated model) and I am trying to test the equality of two standardized regression coefficients (STDYX), using the MODEL CONSTRAINT command (and the MLR estimator), but my model implies testing two indirect effects (although I am testing the equality of the coefficients that correspond to the direct effects). However, my results are quite strange because I am obtaining a significant Wald test for very close standardized coefficients and non-significant results for very different coefficients. Therefore, I suspect that there is some problem with my code. I would appreciate any insights.
RC ON WR (p4); RC ON Rei (p1); RC ON LC (p2); Rei ON WR; Rei ON LC; WR WITH LC (covLCWR); LC (varLC); WR (varWR); RC (p3); Rei (p5);
Additionally (in relation to my previous post), when using logit/MLR (my preferred choice, as this is more common for journals in my field, education, and easier to interpret) the effects (decomposition) of my main predictor variable on my outcome variable do not add up (both unstandardized and standardized). The coefficients keep increasing slightly as the model becomes more complicated. I have checked the models, the code, etc. but cannot figure out what is going on.
* Is this to do with the different types of variables? I assume I could use the STANDARDIZED option to get all 3 types and then select the relevant option (STD for latent; STDY for binary; and STDYX for continuous?) based on the variable type. If this is the case, would they be comparable? And should the 'effects' add-up?
This message got a little confusing. But why are you dividing by sqrt(rvar_pb2)? That's just a residual, not a total, variance. And you seem to be aware of that distinction elsewhere.
Sara Geven posted on Wednesday, December 09, 2015 - 12:05 pm
Dear Professor Muthen, Sorry for the confusion and thanks for the reply. I tried out different things to see what Mplus is using, but I understand from your post that I should use the total variance. A related question: I intend to calculate contextual effects for my study (i.e., between effect - within effect). I was wondering whether I should first standardise the between effect and the within effect before I subtract them? Or should I calculate the difference first and then standardise it (i.e., by multiplying it by the SD of the level 2 predictor and dividing it by the SD of the level 1 DV)? In syntax I have seen the latter approach, but I have encountered papers in which a standardised individual-level effect and the standardised contextual effect add up to the standardised between-level effect. Sara
Dear Drs. Muthen, We are running a model with a latent variable (based on six observed indicators) as the dependent variable and four observed variables as the predictors. We also include a set of correlated errors for the latent variable's indicators. We have a fairly small number of cases, and because we think the coefficients may not follow a symmetric normal distribution, we use estimator = Bayes. The model converged. But when we tried to request standardized coefficients via STDYX, we got the fatal error below: “*** FATAL ERROR STANDARDIZED COEFFICIENTS ARE NOT AVAILABLE FOR THIS MODEL.” This error message went away when we took out the correlated error, but we are not sure why this happened. We are hoping that you can shed light on this; maybe we made some fatal error that we are not aware of. Thanks!
Below is our partial Mplus input, many thanks:
.....
analysis: estimator = bayes;
model: trait by cort11 cort12 cort21 cort22 cort31 cort32;
cort31 with cort21;
trait on edu2dicho control dimen1 dimen2;
plot: type = plot2;
output: tech1 tech8 stdyx;
Please send your output to Support along with your license number.
Jukki Lee posted on Friday, June 17, 2016 - 2:18 am
I have a mediation model X --> M --> Y and a few questions concerning the output:
1) The standardized coefficients and the related p-values change in the output when I add the command "cinterval (bootstrap)". Which coefficients should I report? The ones from the output with bootstrap or without?
2) In the output with bootstrap, the standardized model results only show StdYX estimates without any values for S.E., Est./S.E. and two-tailed p-values. Is there anything wrong with my analysis? Which numbers should I report when the p-values are missing?
I have several questions about how to explain the probit regression coefficient and calculate probability from probit regression coefficient:
1. I saw some literature explained Probit regression coefficient in this way: (For example, we have an unstandardized regression coefficient of -0.526) For every one unit change in X, the z-score of Y decreases by 0.526 or one-unit increase in X resulted in a decrease of 0.526 standard deviations in the predicted Z score of cumulative normal probability distribution of Y.
Is it a correct explanation?
2. I want to calculate probability from probit coefficient. When X = continuous latent construct and Y = binary observed outcome, using the formula from Mplus Guide, P (Y = 1 | x) = F (a + b*x), a=-threshold; b=unstandardized regression coefficient.
My question is that in my case, is this correct for x= factor score?
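The formula quoted in question 2 can be evaluated directly. The sketch below assumes F is the standard normal distribution function and uses a made-up threshold together with the slope of -0.526 from the post; the function names are hypothetical, and the x values are simply points on the factor's scale, not a claim about whether estimated factor scores are appropriate inputs.

```python
import math

def normal_cdf(z):
    """Standard normal distribution function F(z)."""
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))

def probit_probability(threshold, slope, x):
    """P(Y = 1 | x) = F(a + b*x) with a = -threshold."""
    return normal_cdf(-threshold + slope * x)

# Hypothetical threshold of 0.25; slope taken from the example in the post
for x in (-1.0, 0.0, 1.0):
    print(x, round(probit_probability(0.25, -0.526, x), 3))
```

A one-unit increase in x shifts the probit index (the z-score inside F) by the slope, but the resulting change in probability is not constant; it depends on where on the curve x sits.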
I'm interested in testing the differences between two cross-lagged effects. My model looks like this (all variables are observed):
VAR1_t2 ON VAR1_t1; VAR2_t2 ON VAR2_t1; VAR1_t2 ON VAR2_t1 (a1); VAR2_t2 ON VAR1_t1 (a2); [VAR1_t1 VAR2_t1]
I know how to test the difference between the unstandardized coefficients (a1 and a2). I would like to know how I could test the differences between the standardized coefficients. Can I do that using the MODEL CONSTRAINT option? If so, how?
The variances for your time 2 DVs (what you have in the denominators) are not correct - they have to take into account the variance contribution from the same variable at time 1.
Silvia posted on Thursday, June 23, 2016 - 6:35 am
Thanks for your answer. I'm not familiar with the general formula to compute the standardized estimates. Would you be so kind to direct me to where I can find it or to suggest me how to take into account the contribution from the same var at T1 in the above model?
All you need is how to compute the variance of a DV in a regression model with several covariates; your standardization formula is otherwise correct. This variance computation is covered in many texts - see also our new book.
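A minimal sketch of that variance computation, assuming a linear regression of a time 2 DV on both time 1 variables: the DV variance is b'Cov(x)b plus the residual variance, and each standardized slope is b times SD(x) over SD(y). The function names and every number are hypothetical.

```python
import math

def dv_variance(coefs, cov_x, resid_var):
    """Var(y) = b' Cov(x) b + residual variance."""
    k = len(coefs)
    explained = sum(
        coefs[i] * cov_x[i][j] * coefs[j]
        for i in range(k)
        for j in range(k)
    )
    return explained + resid_var

def standardized_slopes(coefs, cov_x, resid_var):
    """Standardized slope for each covariate: b * SD(x) / SD(y)."""
    sd_y = math.sqrt(dv_variance(coefs, cov_x, resid_var))
    return [b * math.sqrt(cov_x[k][k]) / sd_y for k, b in enumerate(coefs)]

# Hypothetical: VAR1_t2 ON VAR1_t1 (autoregressive) and VAR2_t1 (cross-lag)
coefs = [0.6, 0.2]
cov_x = [[1.0, 0.3],
         [0.3, 1.5]]   # variances and covariance of the time 1 variables
resid_var = 0.5

var_y = dv_variance(coefs, cov_x, resid_var)
print(var_y, standardized_slopes(coefs, cov_x, resid_var))
```

The covariance term between the two time 1 variables enters the DV variance twice, which is the piece that is easy to leave out when writing the denominator by hand.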
I think my question is a repeat from prior posts but I wanted to make sure I understand correctly. I ran an SEM that had two covariates, 4 latent factors, and a DV. We found a total R-square of 51% (4 variables with direct effects). Although we provided indirect and direct effects, a reviewer of our paper requested the specific effect size for each predictor of the DV. In a prior post, you noted that one cannot determine the precise % of variance in the DV accounted for by each of the direct predictors because the predictors are correlated. Is that true?
My confusion is that I think one can do this when using multiple regression. If I include the 6 predictors in a regression model, I would get an overall R-squared, and then partial correlation and semi-partial correlations for each predictor. Even though those predictors are related, I could square the semi-partial to get at the specific effect size for a variable. Can't we do this in SEM?
A couple of related questions include: (a) Is the Beta value for a direct effect the same thing as a partial correlation (correlation between one variable and DV after controlling for all other variables linked to DV)?
(b)Given that I obtained a Beta for each direct effect in our SEM, couldn't we simply square the Beta to get some rough estimate of the effect?
(c) Can one get a semi-partial for each predictor of a given DV?
The regular way to assess the relative importance of a predictor is by using standardized coefficients. I don't know what is meant by "the specific effect size for each predictor". You may want to ask these general analysis questions on SEMNET.
Dear Dr. Muthen, I have data containing 5079 participants nested in 183 classes. The independent variable and dependent variable were measured by 23 and 22 items, respectively. I would like to compare both unstandardized and standardized coefficients estimated from six methods. The six methods are single-level single indicator (measurement errors uncorrected), single-level single indicator (measurement errors corrected), single-level SEM, two-level single indicator (measurement errors uncorrected), two-level single indicator (measurement errors corrected), and two-level SEM. Is it reasonable to compare them?
I would like to ask you a couple of quick questions regarding standardisation please. I am running a simple mediation model: P->B->D, where P->B path is moderated by FI. All variables are latent with ordinal indicators and FI is an observed continuous variable.
From one of the earlier threads, I read that grand-mean centering helps to alleviate the multicollinearity issue, as FI would be highly correlated with the product term (P*FI). That was the case in my model, and grand-mean centering FI solved the problem. So my 3 questions are:
1) Via MODEL CONSTRAINT, I calculated indirect effects at specific moderator values. Since Mplus gives no standardised solution, I am going to calculate this manually (using the Mplus FAQ). But I realised that the SD of the observed variable is somewhat bigger (5.7) than the SDs of the latent variables (from 0.45 to 0.90). So is it OK to apply the bigger SD, or does it have to be rescaled in some way?
2) Since FI was grand-mean centered, would its SD change too?
3) What would be the formulae for manually calculating correct CIs for the manually calculated indirect effect at specific moderator values?!
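On question 2, subtracting a constant shifts the mean but leaves the spread unchanged, so grand-mean centering does not alter the SD. A quick check with made-up scores (the variable names are hypothetical):

```python
import statistics

fi = [2.0, 5.0, 9.0, 14.0, 20.0]          # hypothetical raw moderator scores
grand_mean = statistics.mean(fi)
fi_centered = [v - grand_mean for v in fi]  # grand-mean centered scores

sd_raw = statistics.pstdev(fi)
sd_centered = statistics.pstdev(fi_centered)
print(sd_raw, sd_centered)
```

The same holds for the N - 1 version of the standard deviation (`statistics.stdev`).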
There you find an example with an xz interaction which is what you have. In Model Indirect you use the MOD option to capture the moderated effect. You get a plot of CIs. And you also get this for standardized. Plus it handles latent variables.
After reading many of your answers, and the paragraph in the user's guide, I'm still in doubt about which standardized coefficients to use for my SEM. I have an SEM linking some observed variables (X) via some factors (latent variables) to some other observed dependent variables (Y), i.e. X -> factors -> Y. X and Y are both ordered categorical data (5-point Likert). We have regressed the latent factors on background information (dummy/binary variables).
From earlier posts I think that for the dummy variables I should use STDY and for both observed variables (X and Y) I should use STDYX. For the factors I should use STD. Is this correctly understood? However, this means that I should report results from different parts of the output for the same factor. An example is my first factor:
F1 ON GENDER F2 F3 BC_R;
, where F's are factors, gender is a dummy, and BC_R is observed 5-Likert variable.
Lastly, I'm not sure if I should use raw estimates when reporting the coefficients for the items on each of the latent factors, or I should use any of the standardized coefficients?
I hope it makes sense, and that you can guide me to which standardised variables I should use. Thank you.
I am conducting a moderated mediation model with two independent variables and a dichotomous moderator. I am wondering about two things and I am hoping for your help. Thanks!
First the output doesn't display standardized results for the indirect effect. Is it because of the dichotomous moderator or is there a way to demand these coefficients?
Second, underlying analyses such as simple moderation and mediation differ in significance depending on whether I look at standardized or unstandardized indirect effects. The unstandardized ones were significant using bootstrapping; the standardized ones aren't. I thought that standardization doesn't change significance?
When estimating the model, the program did not generate p-values and CIs for the standardized coefficients. I'm using Mplus version 6. How can I resolve this, since I must use standardized values?
Another question is: If I'm only working with categorical variables, can I use the raw regression coefficient?
I am doing SEM with categorical factor indicators as well as observed variables that are both continuous and categorical (i.e., continuous age, categorical sex). I know that STDY is more appropriate for reporting standardized results if an observed exogenous variable is binary (i.e. sex) since standard deviation changes in binary variables are not meaningful. However, if my observed outcome variable is also binary and I was interested in generating a standardized output, does that mean that just STD should be used (i.e., only standardized by the continuous latent variable variances)? In my final model, both the latent factors and the observed outcome are endogenous.
STD is fine in this case. You don't need standardization for a binary DV - its metric is known/clear (0/1).
Virya Koy posted on Monday, April 17, 2017 - 8:27 am
Dear Prof Muthen could you please kindly explain what is the (1) STDYX Standardization? (2) what is the estimate values for? (3) what is the S.E. values? (4) what Est./S.E.values? (5) what is the P-Value as the criteria for significant?
I'm trying to read in data from SPSS into Mplus. When I use TYPE=LISTWISE in Mplus, I get the exact same N in SPSS and Mplus. I get the exact same mean. However, the variances are not the same. Is there a known difference in the way the two programs calculate variance? Maybe N vs. N - 1?
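The N versus N - 1 suspicion in the post is easy to check by hand. The sketch below uses made-up data and only shows how the two divisors relate; which divisor each program applies should be confirmed in its own documentation.

```python
import statistics

data = [4.0, 7.0, 7.0, 9.0, 13.0]   # hypothetical scores
n = len(data)

var_n_minus_1 = statistics.variance(data)   # sum of squares / (N - 1)
var_n = statistics.pvariance(data)          # sum of squares / N

# The two differ only by the factor (N - 1) / N
print(var_n_minus_1, var_n, var_n / var_n_minus_1)
```

If multiplying one program's variance by (N - 1)/N reproduces the other's, the divisor is the whole story; with large N the difference becomes negligible.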
I have obtained some puzzling results for my path analysis with continuous latent variables as outcome measures and all continuous predictors. Whereas in the model results the parameter estimate for some betas is positive, in the STDYX results it has suddenly changed to a negative value. How can this be explained? I have found that it is sometimes due to multicollinearity, but my preliminary analyses have ruled this out.
Virya Koy posted on Tuesday, April 25, 2017 - 7:14 am
what should I do if all my direct effects are not significant?
Virya Koy posted on Tuesday, April 25, 2017 - 7:15 am