Mplus Discussion >> Correlation of exogenous variables in SEM

Joseph E. Glass posted on Tuesday, July 26, 2011 - 11:29 am

This is a basic question regarding the correlation of exogenous variables in SEM.

I have a mediation model where 1) a latent variable is regressed on several observed exogenous variables, 2) a mediator is regressed on that latent variable and the exogenous variables, 3) a dependent variable is regressed on all of the above variables.

The exogenous variables are known to be correlated (e.g. race, income, education, age). From the Mplus output, it appears that the model does not automatically estimate the correlations among exogenous variables.

How should I approach the correlation of these exogenous variables? Are there specific guidelines or rationale that I should consider? My research question does not specifically involve the exogenous variables; rather, I am adjusting for them because there is known to be sociodemographic variation in my constructs of interest.

Linda K. Muthen posted on Tuesday, July 26, 2011 - 12:44 pm

In regression, the model is estimated conditioned on observed exogenous variables. Their means, variances, and covariances are not model parameters. If you want to know their means, variances, and covariances, do a TYPE=BASIC.

Joseph E. Glass posted on Tuesday, July 26, 2011 - 12:58 pm

Thank you, Linda. I may not have been completely clear. I am not necessarily interested in obtaining estimates for these correlations.

Rather, I am trying to figure out, what are the implications of including versus not including WITH statements for the exogenous variables when conducting SEM? I.e., how will including versus not including such WITH statements impact the overall model? Is there rationale for choosing when to include these correlations?

Linda K. Muthen posted on Tuesday, July 26, 2011 - 1:42 pm

If you include them in the analysis, they are treated as dependent variables and distributional assumptions are made about them.

Mr Mo DANG-ARNOUX posted on Thursday, June 21, 2012 - 7:56 am

Dear Profs Muthen and Mplus experts,

I would like to account for correlations between covariates and an exogenous factor. I am regressing:

F2 ON F1 X1 X2; ! X1-X2 observed covariates
F2 BY V1-V5; ! F2 endogenous factor
F1 BY U1-U3; ! F1 exogenous factor

I plan to use the WLSMV estimator (indicators U_i and V_j are categorical).

A. According to the Mplus 6.1 version note, the default is to no longer include into the model the correlations between covariates, in ML estimations. Is that also the default assumption in WLSMV estimations?

B. By default, Mplus assumes an absence of correlation between the exogenous factor F1 and the covariates X1 X2, right ?

Non-zero correlations would however be more realistic here (F1=impairment, X1=pain, X2=parenting stress). Modification indices do suggest links:
F1 ON X1;
F1 ON X2;

Yet, domain-driven theory makes more sense of a reverse link: X1 X2 ON F1.

C. What are the statistical implications of replacing "F1 ON X1 X2" by:
C1. "X1 X2 ON F1"?
C2. "X1 X2 WITH F1"? Does the latter imply a zero correlation between covariates X1 and X2, unless I add "X1 WITH X2"?

Many thanks in advance for your attention and any guidance,

Mo Dang-Arnoux

Linda K. Muthen posted on Friday, June 22, 2012 - 9:46 am

A. The covariates are correlated. The correlations are not model parameters as the model is estimated conditioned on the covariates.

B. Yes.

C1. It's a different model. It is your choice which of the two specifications to use.

C2. Yes.

Mr Mo DANG-ARNOUX posted on Sunday, June 24, 2012 - 1:12 am

Dear Linda,

Thank you very much for your prompt answer, I highly appreciate your constant timeliness in answering our questions.

Just a few clarifications I would like to add:

Re A. I was wondering why your 6.1 version note mentioned only ML estimation (and not others such as WLSMV). Was it already implicit in WLSMV that correlations between covariates are also left out of the model?

In my analyses, WLSMV regression results differ considerably depending on whether or not I specify correlations between the exogenous factor F1 and the covariates X1 X2. In contrast, MLR results vary little with or without those correlations (and are close to the WLSMV results with the correlations specified). Do MLR and WLSMV handle the correlation between exogenous factors and covariates in different ways?

Re:C1-C2. Actually *both* "X1 X2 ON F1" and "X1 X2 WITH F1" imply a zero correlation between covariates X1 and X2 (conditioned on F1), unless I add "X1 WITH X2", right?

Is this added complexity the reason that

C3. Your 6.1 version note says: "Because covariates are characterized as exogenous variables, using ON with the covariates on the right-hand side is a natural approach"?

C4. Modification indices usually propose "F1 ON X1" and "F1 ON X2" instead of "X1 X2 ON F1"?

Thank you again for your helpful support.

Sincerely yours,
Mo

Linda K. Muthen posted on Sunday, June 24, 2012 - 7:15 am

Before Version 6.1, all models were estimated conditioned on the observed exogenous variables except maximum likelihood with continuous dependent variables. The model results in this case are the same whether the model is estimated conditioned on the observed exogenous variables or not. We made the change for consistency.

With WLSMV we do not recommend using WITH with observed and latent exogenous variables. We recommend using ON. WITH changes the model.

c1-c2. I think so. Try it and see.

c3. Yes.

c4. Modification indices are given for all fixed or constrained parameters. Whether it makes sense to add them is determined by the user.
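
The recommendation above (ON rather than WITH under WLSMV) can be sketched for the model posted earlier in this thread. This is only a minimal sketch under stated assumptions: the data file name is hypothetical, and the variable names simply follow the earlier post.

```mplus
TITLE:    WLSMV with ON rather than WITH (sketch)
DATA:     FILE IS example.dat;          ! hypothetical file name
VARIABLE: NAMES ARE u1-u3 v1-v5 x1 x2;  ! names follow the earlier post
          CATEGORICAL ARE u1-u3 v1-v5;  ! categorical factor indicators
ANALYSIS: ESTIMATOR = WLSMV;
MODEL:    f1 BY u1-u3;
          f2 BY v1-v5;
          f2 ON f1 x1 x2;
          f1 ON x1 x2;    ! relates f1 to the covariates with ON
                          ! (used in place of f1 WITH x1 x2)
```

With this specification x1 and x2 appear only on the right-hand side of ON, so the model stays conditioned on them.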

Mr Mo DANG-ARNOUX posted on Monday, June 25, 2012 - 12:22 am

Many thanks for your again very prompt answer, Linda. This discussion clears things up for me.

Could you please tell me in more detail the meaning of:

"We recommend using ON. WITH changes the model."

Thank you!

Mo

Linda K. Muthen posted on Monday, June 25, 2012 - 10:24 am

Using WITH, the sample statistics for model estimation are tetrachoric/polychoric correlations. With ON, they are probit regression coefficients and residual correlations.

ellen posted on Friday, September 14, 2012 - 10:32 am

This is a basic question. It appears that Mplus estimates correlations between exogenous predictor variables by default even when the WITH statements are not specified.

I have 4 exogenous predictors (e.g., A, B, C, D), 1 mediator, and 1 outcome variable. When I specified WITH statements for the covariances among three exogenous variables (A, B, C), but did NOT specify a WITH statement for those three variables with the fourth variable, D, the Mplus output somehow still showed estimates for those covariances.

If I don't want a covariance parameter to be estimated between two exogenous variables, do I need to fix it to zero or simply not estimate it? Does fixing a parameter to zero mean the same as removing this parameter from the model?

Thank you for your time!

Bengt O. Muthen posted on Friday, September 14, 2012 - 10:50 am

We need to see the output, but the answer to your last question is yes.

Stephanie posted on Thursday, November 15, 2012 - 3:39 am

We have questions regarding the correlation of exogenous variables in our SEM. (We are using MPlus Version 5.1).

In our model we have four exogenous variables: One latent variable, two continuous manifest variables and one dichotomous manifest variable. As our dependent variable is dichotomous we are using the WLSMV estimator. Our questions are:

1. Do we explicitly have to integrate a correlation with the WITH statement for all these exogenous variables, manifest and latent? Or does Mplus calculate these correlations by default?

2. If it does, where can we see the correlations with their level of significance in the output?
In our current output (without integrated correlations) we can only see correlations between the latent exogenous variable and all the other variables in the model but no correlations between this latent exogenous variable and the other exogenous variables.

3. Are correlations probably only necessary between exogenous manifest and latent variables but not between exogenous manifest variables?

We thank you very much in advance for your support!

Linda K. Muthen posted on Friday, November 16, 2012 - 12:09 pm

With WLSMV, I would not covary the observed and latent exogenous variables. I would relate these variables using the observed exogenous variables on the right-hand side of ON statements where the latent variables are dependent variables.

Stephanie posted on Wednesday, April 10, 2013 - 6:39 am

We have a further question regarding the correlation of exogenous variables in our SEM.

We are using the same model as mentioned in our last question on November 15th. But now we have only four manifest exogenous variables, of which three are continuous and one is dichotomous. Our dependent variable is still dichotomous, so we are using the WLSMV estimator.

In this case, do we have to integrate a correlation with the WITH statement for all these exogenous variables? Or does MPlus calculate these correlations by default? And if it does, where can we see the correlations with their level of significance in the output?

We thank you very much for your support!

Linda K. Muthen posted on Wednesday, April 10, 2013 - 6:51 am

In regression, the model is estimated conditional on the observed exogenous variables. You should not mention them in the model command. You can see their correlations in the descriptive statistics from the SAMPSTAT option.

Stephanie posted on Thursday, April 11, 2013 - 12:04 am

Thank you very much for your help!
I have included the SAMPSTAT option in my model command. But unfortunately the output only contains correlations between all other variables in the model but not between the four manifest exogenous variables. How is it possible to get them?

Linda K. Muthen posted on Thursday, April 11, 2013 - 6:22 am

Do a TYPE=BASIC with no MODEL command.
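
A minimal sketch of such an input, assuming a hypothetical data file and hypothetical variable names (y for the outcome, x1-x4 for the four manifest exogenous variables):

```mplus
TITLE:    Descriptive statistics for the covariates (sketch)
DATA:     FILE IS example.dat;            ! hypothetical file name
VARIABLE: NAMES ARE y x1 x2 x3 x4;        ! hypothetical variable names
          USEVARIABLES ARE x1 x2 x3 x4;   ! only the exogenous variables
ANALYSIS: TYPE = BASIC;                   ! no MODEL command is given
```

Because there is no MODEL command, nothing is conditioned on the covariates, and their sample statistics are printed.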

Andrea Norcini Pala posted on Wednesday, May 01, 2013 - 3:16 am

Dear Professors,

I performed a SEM with observed and latent variables. A reviewer stated that we "Ought provide the correlation matrix".

I am concerned about this because the SEM was performed to test a model that we had previously tested on another sample. Furthermore, the correlation matrix would refer to the observed variables rather than the latent ones.

1. Do you think it is really important to provide the correlation matrix of the variables tested within the SEM?

2. Does it make sense to you that the relationships (coefficients) obtained with the SEM differ from those obtained with correlations (e.g., a relationship that is significant in the SEM is not significant in the correlation matrix)?

Thank you very much
Andrea

Linda K. Muthen posted on Wednesday, May 01, 2013 - 8:38 am

It makes sense to provide descriptive statistics for the data that are analyzed. Means, variances, and a correlation matrix provide a good description of the data and can be used to reanalyze the data.

Carolyn CL posted on Monday, June 10, 2013 - 1:38 pm

Dear Drs. Muthen and MPLUS experts,

After reading up on the issue of correlating exogenous variables, I would like to be clear on the following:

Model:
Y1 ON Z1 X1 X2 X3; !X1-X3 are observed exogenous variables
Z1 ON X1 X2 X3;
Z1 BY X4 X5 X6 X7; !Z1 is an endogenous latent variable

1. The regression coefficients for variables X1-X3 on Y1 and Z1 are estimated conditional on each other (and on Z1, for Y1).

2. This allows me to say, in the interpretation of the results, that the effect of X1 (for example) on Y1, controls for the effects of X2, X3 and Z1.

3. If I want to say that I allowed X1-X3 to correlate with each other, I would need to add WITH terms as follows to the model:

X1 WITH X2 X3;
X2 WITH X3;

In this way, I would effectively be estimating parameters for the covariance between X1-X3.

Many thanks,

Carolyn

Bengt O. Muthen posted on Monday, June 10, 2013 - 2:56 pm

1-2 are correct, but 3 is wrong. You don't need to, and typically should not, add WITH terms for the observed exogenous variables (what we call covariates). The covariance parameters activated by WITH are not part of the model. You can think of the covariates as being correlated by default - just like they are in regular regression. The sample statistics show their values.
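
As a sketch of what this answer implies for the model posted above: the covariates X1-X3 appear only on the right-hand side of ON, and no WITH statements are written for them. The data file name and the column order in NAMES are assumptions made for illustration.

```mplus
TITLE:    Covariates left out of WITH statements (sketch)
DATA:     FILE IS example.dat;       ! hypothetical file name
VARIABLE: NAMES ARE y1 x1-x7;        ! hypothetical column order
MODEL:    z1 BY x4 x5 x6 x7;         ! z1 is the endogenous factor
          z1 ON x1 x2 x3;
          y1 ON z1 x1 x2 x3;
          ! no x1 WITH x2 x3 and no x2 WITH x3 here:
          ! the covariates are not held uncorrelated without them
OUTPUT:   SAMPSTAT;                  ! request sample statistics
```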

Carolyn CL posted on Monday, June 10, 2013 - 3:35 pm

Many thanks for this.

One final question, what would be the implication for the model and interpretation of the results if I did add covariance parameters to the model by using the WITH statement?

I ask because a reviewer noted that 'We would expect substantial bivariate correlations between the various measures X1-X3. Since these are all modeled simultaneously as manifest variables, I'm concerned about the validity of the model results'.

Bengt O. Muthen posted on Monday, June 10, 2013 - 4:47 pm

It sounds like the reviewer does not understand that the x1-x3 correlations are not held at zero in your analysis. Your analysis is ok if these 3 variables correlate, which covariates typically do.

In your case, if you have no missing data on x1-x3, if you add WITH among your 3 covariates you will get the same results, except that CFI and TLI will be inflated.

GP posted on Thursday, September 19, 2013 - 12:54 pm

Dear Drs. Muthen,

I'm running the path model below. x1, x4, and m are dichotomous; x2 and x3 are continuous. tc is censored, with corresponding time variable t. I also have a clustering variable c.

I would like to know how to model the correlation between x2/x4 and x3/x4. I understand I cannot use the WITH command, since x4 is dichotomous.

Thank you for your help.

Variable:
Names = x1 x2 x3 x4 m t tc c;
Categorical = m x4;
Cluster=c;
Survival = t (all);
Timecensored =tc (0=not 1=right);
Analysis:
Basehazard = off ;
Type=complex;
Algorithm=integration;
Integration=montecarlo;
Model:
x2 ON x1;
x3 ON x1;
x4 ON x1;
m ON x2 x3 x4;
t ON x2 x3 x4 m;

Bengt O. Muthen posted on Thursday, September 19, 2013 - 6:29 pm

This is a big topic, made more complex since you consider a continuous-time survival outcome. x4 is dichotomous and also a DV. For mediation models like this one, one approach is to work with x4*, the latent continuous response variable behind the observed x4. Then all relationships stay linear and things are easy. WLSMV does that, but doesn't handle survival analysis. You need ML for that, but ML does not use x4*, so you end up with a mixture of linear and logit/probit regressions. To do mediation modeling right, you should use ML and the causal effect approaches described by e.g. VanderWeele in the epidemiology literature. I think such survival analysis can be done in Mplus, but I haven't tried it out. I did not include that in my causal effects paper (see our website).

I know this wasn't your question, but felt I had to mention it. With ML, the residual covariances that you ask about can be handled using a factor behind the two variables; they then correlate beyond what their predictors produce.

George Acheampong posted on Sunday, January 25, 2015 - 4:45 pm

Dear Prof Muthen,
I am running a path analysis of form:

f1 on x1 x2

f1 is categorical
x1 x2 are continuous

I do not want the residuals of f1 to be correlated to x1 x2. How can I implement this in mplus. I want to use the MLR estimator. I will appreciate any help.

Linda K. Muthen posted on Sunday, January 25, 2015 - 4:49 pm

They will not be. The model is estimated conditioned on the observed exogenous covariates. Their means, variances, and covariances are not model parameters.

shaun goh posted on Thursday, October 01, 2015 - 7:03 pm

Dear Prof Muthen,

I am struggling with the specification of correlations between predictors, in a model where there are three predictors : two exogenous covariates and one endogenous latent factor.

By default, Mplus correlates the two exogenous covariates and does not bring them into the likelihood. However, the two exogenous covariates are not correlated with the one endogenous latent factor.

On one hand, I do not want to do so as one of the covariates is non-normal (i.e. gender) and using the WITH command decreases fit. On the other hand, I understand that predictors are typically correlated, and I should specify correlations between all predictors?

Thank you for your time and assistance,
Shaun

Bengt O. Muthen posted on Friday, October 02, 2015 - 7:56 am

I think it is often natural to regress the factor on the exogenous covariates to capture their association.

shaun goh posted on Saturday, October 03, 2015 - 12:25 am

Dear Bengt,

I'm sorry I was not clear. I am struggling to figure out whether I should add 'f1 with x1 x2' to the following model.

Y1 on x1 x2 f1;
f1 by t1 t2 t3;

I understand x1 and x2 are correlated by Mplus default, outside of the likelihood. However, f1 is not correlated with x1 x2.

On one hand, I have the impression that all predictors should be correlated. On the other hand, I would rather not bring x1 into the likelihood as it is binary.

I was wondering how one could proceed from here ?

Thanks again,
Shaun

Bengt O. Muthen posted on Sunday, October 04, 2015 - 5:56 pm

Say

f1 on x1 x2;

Jennifer Lee posted on Friday, April 01, 2016 - 7:33 am

Hi Drs. Muthen,

I would like to check on something that you mentioned in a post above regarding WITH terms.

"You don't need to, and typically should not, add WITH terms for the observed exogenous variables (what we call covariates). The covariance parameters activated by WITH are not part of the model. You can think of the covariates as being correlated by default - just like they are in regular regression. The sample statistics show their values."

I would like to confirm that this is also the case for type=complex and MLR estimation. That is, that the covariates are correlated by default in the model, and are recognized as such because they are on the right side of ON.

We have a mediated model in which the mediator and dependent variable are latent factors, and the independent variables are a mix of continuous and binary observed variables.

The model looks like this:

F1 by V1 V2 V3;
F2 by V4 V5 V6 V7 V8;

F2 ON V9 V10 V11 V12;
F1 ON V9 V10 V11 V12;

F1 ON F2;

F2 ON V13 V14 V15 V16;
F1 ON V13 V14 V15 V16;

MODEL INDIRECT:
F1 IND V13;
F1 IND V14;
F1 IND V15;
F1 IND V16;

Where V1-V8 are all continuous variables, V9-V12 are a mix of binary and continuous control variables, and V13-V16 are all binary independent variables.

Linda K. Muthen posted on Friday, April 01, 2016 - 8:50 am

This is true in any regression model. In regression, the model is estimated conditioned on the covariates. They are not assumed to be uncorrelated.

Irene Dias posted on Tuesday, April 12, 2016 - 3:05 pm

Dear Professors,

I have four exogenous variables (two observed [x2 and x4] and two latent [F1 and F2]), two observed mediators (x1 and x3) and two endogenous latent (F3 and F4). All observed are continuous and I am using ML.
The model is as follows:
F1 by a1 a2; !F1 exogenous factor
F2 by a3 a4; !F2 exogenous factor
F3 by a5 a6; !F3 endogenous factor
F4 by a7 a8; !F4 endogenous factor

F3 ON F1 x1 x2;
x1 ON F1 x2;

F4 ON F2 x3 x4;
x3 ON F2 x4;

Model indirect:
F3 IND x1 x2;
F3 IND x1 F1;

F4 IND x3 x4;
F4 IND x3 F2;

I am not specifying any WITH statements, but I can see in the diagram that, by default, Mplus calculates the covariance between x2 and x4 and the covariance between F1 and F2 (double-sided arrows in the diagram). However, theoretically, F1 should be correlated not only with F2 but also with x2 and x4, and likewise for F2; i.e., all four observed and latent predictors should correlate.
Should I specify the statements F1 WITH x2 x4 and F2 WITH x2 x4? If I understood what you said before, I should not specify with statements between the observed predictors, right?

Linda K. Muthen posted on Tuesday, April 12, 2016 - 4:09 pm

We show the covariance between x2 and x4 because they are not uncorrelated during model estimation although the parameter is not estimated. In regression, the model is estimated conditioned on the observed exogenous variables. If you want to relate exogenous observed and latent variables, you should use ON statements.

Deniz posted on Thursday, November 16, 2017 - 7:32 am

Dear Dr. Muthen,
I have a structural equation model in which four latent dependent variables are predicted by three latent and two manifest independent variables.
1) Does mplus take correlations between manifest and latent independent variables into account?
2) And if yes, where can I find these correlations in the output?
(I know that correlations between latent variables are given in the model results section & correlations between manifest variables are given in sampstat results).

Bengt O. Muthen posted on Thursday, November 16, 2017 - 3:44 pm

1) You will see in the output if these correlation estimates show up. If not, they are zero - and you may want to free them using WITH in the Model command.

Eric M. posted on Thursday, December 14, 2017 - 2:59 pm

Hi. I'd like some clarity about whether or not exogenous latent variables (with continuous indicators) should be correlated with exogenous observed variables (all continuous). In the example below, exogenous observed variables are used as controls. Do I need to correlate the exogenous latent variable (F1) with the exogenous observed variables (gender, age)? I noticed that if I added other exogenous latent variables, the latent variables are correlated with each other, and the observed variables are correlated with each other. However, exogenous latent variables are not correlated with exogenous observed variables in the diagram that is produced.

It was my understanding that exogenous variables should be correlated. Is this something specific to Mplus's defaults? Can you please clarify? Thank you!

For example:
F1 BY V1 V2 V3;
Z1 Y1 ON F1;
Y2 ON Y1;
Z2 ON Z1;
Y3 ON Z2 Y2;

!with covariates (observed variables) on some of the endogenous variables

Y1 Y2 Y3 ON gender;
Y1 Y2 Y3 ON age;

Eric M. posted on Thursday, December 14, 2017 - 3:25 pm

Correction and additional info.

gender (clearly not continuous).
The example I provided isn't the exact model.

However, when I add a WITH statement correlating these exogenous variables, the model fails to converge and gives the message below, where the problem parameter is one of the covariates.

THE STANDARD ERRORS OF THE MODEL PARAMETER ESTIMATES MAY NOT BE
TRUSTWORTHY FOR SOME PARAMETERS DUE TO A NON-POSITIVE DEFINITE
FIRST-ORDER DERIVATIVE PRODUCT MATRIX. THIS MAY BE DUE TO THE STARTING
VALUES BUT MAY ALSO BE AN INDICATION OF MODEL NONIDENTIFICATION. THE
CONDITION NUMBER IS 0.535D-20.

Bengt O. Muthen posted on Thursday, December 14, 2017 - 5:22 pm

Yes, with continuous DVs you should correlate the exogenous factors with the observed, exogenous variables. This is not the default due to aspects of different estimators (with categorical DVs and WLSMV you don't want to use WITH but instead ON).

The message you get is ignorable when one of your observed exogenous variables is binary. It happens because due to the WITH statement you are also estimating the mean and variance of the binary variable and those 2 parameters are mathematically related as p and p*(1-p).
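
The relation between the two parameters can be written out in one line. This is a sketch assuming the binary variable $x$ is coded 0/1 with $P(x = 1) = p$, so that $x^2 = x$:

```latex
\begin{align}
E(x) &= p, \\
\operatorname{Var}(x) &= E(x^2) - [E(x)]^2 = p - p^2 = p(1 - p).
\end{align}
```

Once the mean is known the variance is determined, so the two parameters do not carry independent information.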

Eric M. posted on Thursday, December 14, 2017 - 6:54 pm

Thank you! That really clears things up for me.

Eric M. posted on Friday, February 16, 2018 - 12:36 pm

I have an additional follow-up to a previous post about covariances among observed and latent exogenous variables. Bengt mentioned: "Yes, with continuous DVs you should correlate the exogenous factors with the observed, exogenous variables. This is not the default due to aspects of different estimators (with categorical DVs and WLSMV you don't want to use WITH but instead ON)."

However, I just noticed that by doing so, you lose df's. This wouldn't normally occur when, let's say, the variables were all observed or all latent. Is this problematic or does it change the model substantially?

Bengt O. Muthen posted on Friday, February 16, 2018 - 6:08 pm

Yes, you lose df's but zero correlations between exog factors and exog observed vbles is typically not a model part that you are interested in testing.

Tyler Moore posted on Thursday, March 22, 2018 - 12:13 pm

Hi, I have a ton of mediators that I want to allow to correlate with each other. The default seems to be to not allow them to correlate, but to specify the correlations individually would require a ton of WITH statements. Is there an easy way to tell Mplus to allow all mediators to be correlated?

Bengt O. Muthen posted on Thursday, March 22, 2018 - 3:49 pm

Using the example of 10 mediators, just say

M1-M10 WITH M1-M10;

CG posted on Friday, March 23, 2018 - 8:40 am

Hello, I have a question regarding the difference between WITH and ON statements in SEM. I have two latent variables and three observed indicator variables. I am trying to determine the relationship between the latent variables and observed indicator variables (e.g, relationship shame (latent) and fear(observed)).

When I use the ON command, and allow the latent variable to predict the observed variables, the model does not converge. If I allow the observed variables to serve as the predictor of the latent variable (theoretically sound), the model does run and is significant. However, if I use the WITH command (ie Latent WITH observed), I get the same results (model fit, parameter estimates etc.) as the ON statement above (observed predicting latent). I am unsure why this happens and do not know which command is more appropriate.

My questions are as follows:

1) What is the difference between the ON and WITH command? Why would I get the same model fit and parameter estimate results with both?

2) Will the WITH command allow me to determine the correlation between a latent variable and observed indicator variable?

3) Does it matter which order the variables are listed in the WITH command (ie X WITH Y vs Y WITH X)?

4) Can I use the WITH command to make any predictive or independent/dependent statements?

Tyler Moore posted on Friday, March 23, 2018 - 9:08 am

Hi Bengt, thanks for the quick response. What if my variables are not numbered - e.g. my mediators are "depression", "mania", "income", "height", etc., etc. Can I list them alphabetically? Like:

aardvark-zebra WITH aardvark-zebra

Bengt O. Muthen posted on Friday, March 23, 2018 - 4:21 pm

Answer for Moore:

The list function uses the order of variables given in the USEV statement unless there is no USEV statement in which case the NAMES statement is used.
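
A sketch of how this plays out with named rather than numbered mediators. All names here are hypothetical and shortened, since Mplus variable names are limited to eight characters: dep, mania, income, and height stand for the mediators, x for a predictor, and y for the outcome.

```mplus
DATA:     FILE IS example.dat;     ! hypothetical file name
VARIABLE: NAMES ARE id x y dep mania income height;
          USEVARIABLES ARE x y dep mania income height;
MODEL:    dep-height ON x;
          y ON dep-height x;
          ! the list follows the USEVARIABLES order, so it covers
          ! dep, mania, income, and height
          dep-height WITH dep-height;
```

Here the order in USEVARIABLES determines what dep-height expands to, not the alphabetical order of the names, so mediators that should fall inside the list need to be adjacent in that statement.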

Bengt O. Muthen posted on Friday, March 23, 2018 - 4:34 pm

Answer for CG:

1) with 2 latent variables and 3 observed variables you have 6 total WITH statements or 6 ON statements. Either approach makes the relations between the two sets of variables just-identified (saturated) explaining the same model fit.

2) Yes.

3) No.

4) No.

By the way, the expression "the model is significant" is not correct. If the chi-square test of the model is significant, the model is rejected.

Kyle Levesque posted on Tuesday, April 28, 2020 - 7:00 pm

Hello Bengt,
After reading this thread over a few times, I too would greatly appreciate clarification.

You previously mentioned that "for continuous DVs you should correlate the exogenous factors with the observed, exogenous variables. This is not the default due to aspects of different estimators (with categorical DVs and WLSMV you don't want to use WITH but instead ON)."

For my model, the DV is a continuous factor (fY), and I have three observed covariates (X1-X3), and one factor, fX5, that uses binary indicators (hence the model uses WLSMV).

Model:
fY on fX5 X1 X2 X3;

but then two scenarios:
1) if I simply correlated the exogenous factors with exogenous observed (fX5 with X1-X3), the fit of the model is great, but the key path fX5->fY is not significant.

2) if I follow the advice to use "ON" to correlate the exogenous factors with exogenous observed when using WLSMV (fX5 on X1-X3), the fit of model (CFI TLI) are compromised. BUT the key path fX5->fY is significant.

What is the best approach here?

Bengt O. Muthen posted on Wednesday, April 29, 2020 - 4:26 pm

This suggests that the sub model consisting of the 3 x's, fx5, and the indicators of fx5 is mis-specified. That's the only part of the model that can introduce misfit when you take approach 2). There may be some significant direct effects from the x's to the fx5 indicators, suggesting measurement non-invariance. You want to analyze this part separately as a first step.

Lori Scott posted on Monday, September 21, 2020 - 6:12 am

After reading this thread, I am hoping to get clarification on whether to include "with" statements to correlate exogenous manifest variables in the following model conditions:
Model 1 - Single level regression; Estimator is WLSMV; Y1 is categorical; X1-X3 are continuous; M is the interaction between X2 and X3; and the MODEL command is:
Y1 ON X1 X2 X3 M;
X2 and X3 are known to be correlated and results differ if "X2 WITH X3" is specified in the model. Without X2 WITH X3, the model is just-identified, and Y1 on M is not significant. With X2 WITH X3 in the model, fit indices indicate good fit, and Y1 on M is significant. However, based on my reading of this thread, "X2 WITH X3" should NOT be included in this model. Is this correct?

Model 2 - Twolevel fixed effects model with Bayes estimation, and 4 continuous DVs with both within and between variance are regressed on a mixture of covariates at both within and between levels (only 1 covariate is categorical and this is on the within level only; all other exogenous variables are continuous). Including WITH statements for all predictors at the within level and all predictors at the between level improves model fit so that the PPP value is >.05. Without these covariances between predictors, the model fit is usually poor according to the PPP. Should these covariances of predictors be included?

Bengt O. Muthen posted on Monday, September 21, 2020 - 5:00 pm

Model 1:

If you use the specification

Y1 ON X1 X2 X3 M;

Then X1-M are exogenous (what Mplus calls X variables), and their parameters, such as the covariances that WITH would specify, are not estimated. As in regular regression analysis, all such X's are allowed to correlate (so they are implicitly correlated). Don't mention their covariances unless you have to bring them into the model for reasons of missingness.

Model 2:

Here I would need to see your specific model output, for instance to see if you get a latent variable decomposition. The general rule is - if the variances or means/thresholds are seen in the output, that is, they are part of the estimated model, then these X's should be correlated using WITH unless they already are. This rule is the same as for Model 1 (if you bring the Xs into the model).
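
For Model 1, the advice above amounts to an input along the following lines. This is a sketch under assumptions: the data file name is hypothetical, and creating the interaction with DEFINE is one possible way to obtain M, since the thread does not say how M was constructed.

```mplus
DATA:     FILE IS example.dat;             ! hypothetical file name
VARIABLE: NAMES ARE y1 x1 x2 x3;
          USEVARIABLES ARE y1 x1 x2 x3 m;  ! new variable listed last
          CATEGORICAL IS y1;
DEFINE:   m = x2*x3;                       ! assumed construction of M
ANALYSIS: ESTIMATOR = WLSMV;
MODEL:    y1 ON x1 x2 x3 m;                ! no x2 WITH x3
```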

Lori Scott posted on Tuesday, September 22, 2020 - 5:36 am

Thank you, Bengt, for your response. Just a few more questions for clarification... Under what conditions would one want to bring the X covariances into the model due to missingness? Are there some types of missingness under which you would want to do this, and other types for which you would not? For Model 1 there is a great deal of missingness, which is probably not missing at random but is most likely explained by one or more variables in the model.

To hopefully get your advice on whether to include X covariances in Model 2, I will send you the model output via e-mail. Thanks again!

Bengt O. Muthen posted on Tuesday, September 22, 2020 - 9:58 am

That's a big topic. The best way to learn about this is to study the Missing data section of our Topic 11 Short Course video and handout on our website. This is also discussed at length in our book described at:

http://www.statmodel.com/Mplus_Book.shtml

LT posted on Wednesday, October 07, 2020 - 6:54 pm

I have a basic question about correlating IVs in an SEM. I understand this is automatically done in Mplus, but when there are a mix of latent and observed IVs, Mplus only correlates the latent variables and ignores observed variables.

Should I specify in the syntax using the WITH command, such that all IVs correlate with one another?

Bengt O. Muthen posted on Thursday, October 08, 2020 - 11:19 am

Right. Or, regress the latent variables on the observed.
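
The two options in this answer can be sketched as follows for continuous dependent variables. The file and variable names are hypothetical (f1 an exogenous factor with continuous indicators v1-v3, x1 and x2 observed covariates, y the outcome). Earlier posts in this thread recommend ON rather than WITH when the dependent variables are categorical and WLSMV is used.

```mplus
DATA:     FILE IS example.dat;     ! hypothetical file name
VARIABLE: NAMES ARE y v1-v3 x1 x2; ! hypothetical variable names
MODEL:    f1 BY v1-v3;
          y ON f1 x1 x2;
          f1 WITH x1 x2;           ! option 1: free the covariances
          ! f1 ON x1 x2;           ! option 2: use this line instead
```

Only one of the two lines should be active at a time. As noted earlier in the thread, option 1 also brings the means and variances of x1 and x2 into the model, whereas option 2 keeps the model conditioned on them.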