Using a control variable in SEM
 Amanda M. White posted on Wednesday, October 17, 2007 - 7:32 pm
I'd like to control for age in my SEM model. Would I use WITH to do this? Also, if I do use WITH to control, is there one line of syntax that controls for age in the entire model, or do I have to include a line for each construct or variable?
 Linda K. Muthen posted on Thursday, October 18, 2007 - 9:41 am
One controls for a variable by using it as a covariate in a regression. The ON option is used for regression. You should include age as a covariate in all ON statements.
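As a minimal sketch of that advice (the names y, m and x are hypothetical and do not come from the thread), age would appear on the right-hand side of every ON statement in the MODEL command:

```mplus
MODEL:
  ! y = outcome, m = intermediate variable, x = predictor (hypothetical names)
  m ON x age;     ! age is a covariate in the regression for m
  y ON m x age;   ! and again in the regression for y
```

There is no single statement that controls for age everywhere; the covariate is repeated in each regression where it should be held constant.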
 Emily Scheinfeld posted on Thursday, February 14, 2013 - 12:00 pm
I understand I need to control for my control variables (age and sex) on each line, but what does that look like? If I have all observed data:


How do I include age and sex as a covariate?
 Linda K. Muthen posted on Thursday, February 14, 2013 - 1:45 pm
If your factor is iv, the BY statement should be


You don't need the extra IV BY DV.

Then you can regress iv on the covariates:

iv ON age sex;
 Karen Kegel posted on Tuesday, June 04, 2013 - 6:58 am
Dear Drs Muthen:

I have 2 questions related to a covariate:

1) I want to include Age as a covariate in my SEM models. Age only correlates with some variables in my model, so I'm only featuring it in some lines of my Mplus code. For example:


This is okay to do, right? I see above you stated "You should include age as a covariate in all ON statements".

2) Also, even if MPlus automatically correlates Age with my main exogenous variable (here, SC), I would like to get the value of that path in the output. How do I ask for that? If I add a "WITH" line, some of the values (eg, the AIC) change...

 Karen Kegel posted on Tuesday, June 04, 2013 - 8:25 am
Another control variable question while I'm at it:

What if I want to have a control variable that does not correlate with the main exogenous variable? For example, I have syntax like so for one model:

udo ON sc age;
hs ON udo sc;
pd ON hs udo sc age;

I do not want MPlus to assume there should be covariance between SC and age. Is there anything I can do about this? Thanks again!
 Bengt O. Muthen posted on Wednesday, June 05, 2013 - 10:39 am
I would recommend handling the covariates by the Mplus default, namely that they are all correlated. Their correlation values can be found by using SAMPSTAT. Those correlations are not part of the model. The only exception I would make in this regard is if you have strong theoretical reasons for zero correlations, for instance with a randomized study where the tx variable is uncorrelated with a pretest, but even then it is not necessary. What you should not do is to look at the sample correlations and for almost zero sample corr's fix those corrs at zero in the model. Including a correlation that is almost zero does not hurt.

When you start using WITH among covariates, they change status in Mplus and are included among the list of DVs, and therefore BIC/AIC change.
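A sketch of how this could look for the model quoted earlier in the thread, assuming nothing beyond the three ON statements already posted plus an OUTPUT command:

```mplus
MODEL:
  udo ON sc age;
  hs  ON udo sc;
  pd  ON hs udo sc age;
  ! no "sc WITH age;" here: the covariates stay outside the model
  ! and are correlated by default

OUTPUT:
  SAMPSTAT;   ! prints sample statistics, including the sc-age correlation
```

The sc-age correlation then appears in the sample statistics section of the output rather than among the model estimates, so AIC and BIC are not affected.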
 Karen Kegel posted on Thursday, June 06, 2013 - 11:29 am
Thank you so much for your help. To clarify, you are saying that by default, MPlus includes covariance/correlation between a control variable and an exogenous variable. For one of my models, I absolutely need this covariance/correlation to be contributing to model fit, path coefficient results, etc. This is because I am comparing it vs other models where I have a predictor path running from Age to UDO. When moving UDO to become the main exogenous variable for one model, this predictor path needs to get turned into an explicit covariance path.

If you are saying that such a covariance/correlation is NOT part of model results, is there a way to explicitly and accurately indicate covariance between a control variable and the exogenous variable--besides using a WITH statement?

Sorry if this sounds confusing. Thanks so much again for your time!
 Bengt O. Muthen posted on Thursday, June 06, 2013 - 2:23 pm
If you want some covariates (say x1, x2) to be part of the model to be estimated (and to be evaluated for fit), you include them in the model by saying one of the following:

x1 x2;

[x1 x2];

x1 WITH x2;
 Estee posted on Tuesday, March 04, 2014 - 11:20 am
Dear Dr. Muthen,

I need advice on mplus syntax if I want to include control variables in a mediation model.
Let's say I have 1 independent variable (IV), 1 dependent variable (DV) and 2 mediators (M1 & M2) in a SEM model.
The syntax I used is:


IV BY a1 a2;
DV BY b1 b2;
M1 BY c1 c2;
M2 BY c3 c4;

!Direct Relationship between IV and DV

!Relationship between IV and Mediator
M1 ON IV (P1);
M2 ON IV (P2);

!Relationship between DV and Mediator
DV ON M1 (P3);
DV ON M2 (P4);


NEW (IVM1DV) = P1*P3
NEW (IVM2DV) = P2*P4

 Estee posted on Tuesday, March 04, 2014 - 11:21 am
** Continued from the previous post

I have three control variables to add: gender, father's education level and mother's education level. From previous posts, I learnt that control variables/covariates are added in ALL ON statements. As there are missing values in my covariates, I need to bring all of the covariates into the model by mentioning their variances in the MODEL command. So, is the following syntax correct?


Gender FEdu MEdu;
IV BY a1 a2;
DV BY b1 b2;
M1 BY c1 c2;
M2 BY c3 c4;

!Direct Relationship between IV and DV
DV ON IV Gender FEdu MEdu;

!Relationship between IV and Mediator
M1 ON IV Gender FEdu MEdu (P1);
M2 ON IV Gender FEdu MEdu (P2);

!Relationship between DV and Mediator
DV ON M1 Gender FEdu MEdu (P3);
DV ON M2 Gender FEdu MEdu (P4);


NEW (IVM1DV) = P1*P3
NEW (IVM2DV) = P2*P4


Also, do I need to add the control variables into the MODEL INDIRECT?
Your guidance is very much appreciated. Thank you very much.
 Linda K. Muthen posted on Tuesday, March 04, 2014 - 12:32 pm
This would appear to be correct. Please note that posts to Mplus Discussion should not exceed one window. Please follow that in the future.
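For readers adapting this syntax, one possible tidied layout is sketched below. It rests on several assumptions that are not confirmed in the thread: the MODEL and MODEL CONSTRAINT headings are supplied here, each label is placed on its own line so that it attaches only to the intended slope, the second mediator path is labelled p2 to match the product p2*p4, and the covariates are listed once in a single DV statement.

```mplus
MODEL:
  Gender FEdu MEdu;        ! mentioning the variances brings the covariates into the model
  IV BY a1 a2;
  DV BY b1 b2;
  M1 BY c1 c2;
  M2 BY c3 c4;

  M1 ON IV (p1)            ! label applies to the IV slope only
        Gender FEdu MEdu;
  M2 ON IV (p2)
        Gender FEdu MEdu;
  DV ON M1 (p3)
        M2 (p4)
        IV Gender FEdu MEdu;

MODEL CONSTRAINT:
  NEW(ivm1dv ivm2dv);      ! new parameters for the two indirect effects
  ivm1dv = p1*p3;
  ivm2dv = p2*p4;
```

An alternative to the labelled products would be MODEL INDIRECT statements such as DV IND M1 IV; which name only the variables on the mediation path.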
 Estee posted on Wednesday, March 05, 2014 - 10:24 am
Thank you very much. I apologize for the previous post which exceeded one window.
I have a question regarding moderated mediation. In a moderated mediation model, are control variables allowed to be included? For example, gender is tested as a moderator in the mediation model while family income and father education level are controlled in the model.
 Linda K. Muthen posted on Wednesday, March 05, 2014 - 10:40 am
 Yvonne LEE posted on Saturday, August 02, 2014 - 6:28 pm
I am doing SEM with WLSMV estimation. After adding social desirability (MC_C) as a control variable to the outcome variable (SES0123), the output on the standardized indirect effect section only shows the estimates without the S.E., p-value and cinterval. Is this normal? How to obtain the cinterval of the standardized indirect effect?
 Linda K. Muthen posted on Sunday, August 03, 2014 - 6:17 am
You are using an older version of Mplus where standard errors for standardized parameter estimates are not available for conditional models. Update to Version 7.2.
 Yvonne LEE posted on Thursday, August 07, 2014 - 5:58 am
I am a self-learner of Mplus, therefore would like some confirmation to make sure I do it right.

1. Social desirability is a covariate in my study. I first add social desirability (SoDesir) to the DV in SEM with the command 'DV ON SoDesir' in order to make social desirability a control variable in my model. Is the command correct?

2. Model modification indices then suggest that I add a path between social desirability and an IV for model improvement. I then add 'IV ON SoDesir'. Is this proper? Does this mean social desirability is now controlled for both DV and IV?

3. To develop a model in SEM, should I construct a model with good fit and then add the social desirability control variable? Or, I add the social desirability control in the first place? I learn that the control variable will not be included in model estimation in any way.
 Yvonne LEE posted on Thursday, August 07, 2014 - 6:48 am
sorry, left one question.

4. Can we add the social desirability control variable to latent factor apart from the observed indicator? The modification indices do suggest but I wonder what it means.
 Linda K. Muthen posted on Thursday, August 07, 2014 - 9:41 am
You should listen to the Topic 1 course video on the website. Only dependent variables go on the left-hand side of the ON statement. Independent variables and control variables go on the right-hand side of the ON statement. Covariates can be observed variables or latent variables.
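Applied to the statements in the question, the left/right rule could be sketched as follows (DV, IV and SoDesir are the poster's names; whether the second statement belongs in the model is a substantive decision the thread does not settle):

```mplus
MODEL:
  DV ON IV SoDesir;   ! DV on the left; predictor and control variable on the right
  IV ON SoDesir;      ! here IV is itself on the left, so it is a dependent variable
                      ! in this regression
```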
 Shiny posted on Thursday, August 14, 2014 - 4:07 am
I am running a sequential full mediation model, that is: x-m1-m2-y. y is categorical. Now I'd like to add one control variable, c. I want to test whether the mediation effect still exists when the control is added, and also how it influences the regression coefficient of m1 on x.

What I did in Mplus is as follows:

DATA: FILE IS dataset1.dat;
x c m1 m2 y;
y ON m2 (c);
m2 ON m1 (b);
m1 ON x (ax);
m1 ON c (ac);   ! c is a control variable
NEW(axb axc axbc acb acc acbc);
axb = ax*b;
axc = ax*c;
axbc =ax*b*c;
acb = ac*b;
acc =ac*c;
acbc =ac*b*c;

My questions are:

1. If I want to check the impact of the control variable (c) on the main mediation effect (x-m1-m2-y), can I just regress m1 on c, and skip the regression of m2 and y on c?

2. Is it up to me to decide whether an indirect effect between the control variable and y should be tested? Or should I only run regressions involving the DV and the control?

3. In the output, the control variable (c) and the IV (x) are correlated by default. Do they need to be correlated, or can I remove the covariance?

Many thanks!
 Bengt O. Muthen posted on Thursday, August 14, 2014 - 6:13 pm
1. Only if the model fits well without

y m2 on c;

2. Theory and testing should decide.

3. A good neutral stance is to let them be correlated which Mplus does automatically (not a parameter estimated in the model).
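A sketch of an input that follows these three answers, assuming the variable order given in the question and using MODEL INDIRECT in place of the labelled products (that substitution is a choice made here, not something stated in the thread):

```mplus
DATA:
  FILE IS dataset1.dat;
VARIABLE:
  NAMES ARE x c m1 m2 y;
  CATEGORICAL ARE y;      ! y is the categorical outcome
MODEL:
  m1 ON x c;              ! c controls the x -> m1 path
  m2 ON m1 c;             ! per answer 1, keep c here unless the model fits well without it
  y  ON m2 c;
  ! no "x WITH c;" statement: x and c are correlated by default
MODEL INDIRECT:
  y IND m2 m1 x;          ! specific indirect effect x -> m1 -> m2 -> y
```

Comparing the indirect effect from this run with the one from a run without c would show whether the mediation holds up once the control is included.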
 Jamie-Lee Pennesi posted on Thursday, February 04, 2016 - 3:42 pm
Is anyone able to answer this: If you have categorical variable(s) in your model do you have to also have ANALYSIS: ESTIMATOR = WLSMV in the syntax? Or can the model be run without it?
 Linda K. Muthen posted on Thursday, February 04, 2016 - 7:04 pm
It depends on the type of model. For TYPE=GENERAL, the default estimator when there is one or more categorical dependent variables in the model is WLSMV. It is not necessary to say it in that case.
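As an illustration, with hypothetical variable names, the ANALYSIS command below only restates what would already be the default for TYPE=GENERAL once y is declared categorical, so it could be left out:

```mplus
VARIABLE:
  NAMES ARE y x age;      ! hypothetical names
  CATEGORICAL ARE y;      ! declares y as a categorical dependent variable
ANALYSIS:
  ESTIMATOR = WLSMV;      ! optional: already the default in this situation
MODEL:
  y ON x age;
```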
 fouz posted on Thursday, June 07, 2018 - 9:51 pm
I have a question. I'm trying to analyze my data. My IVs are level, sex and school status, and my DV is number sense, which relates to 5 competencies. How do I build my model if I don't have items for the IV?
How do I include level, sex and school status as covariates?
 Bengt O. Muthen posted on Friday, June 08, 2018 - 1:47 pm
Answered elsewhere.
 Md Zabir Hasan posted on Sunday, September 02, 2018 - 9:29 pm
Thanks so much for your response. Just two points of clarification then:

Q1: The proper code if I want to examine the relationship between the latent variable f1 and health, controlling for gender and age, would be the following (regardless of whether I use WLSMV or MLR):

f1 by x1;

f1 by x2;

f1 by x3;

health on f1 gender age;

f1 on gender age;

Q2: If f1 was an observed, rather than latent variable, then the proper code to examine the relationship between f1 and health controlling for gender and age would be the following:

health on f1 gender age;
 Bengt O. Muthen posted on Monday, September 03, 2018 - 1:54 pm
Q1: Yes.

Q2: Yes.
 Hassan posted on Saturday, February 29, 2020 - 1:58 pm
Dear Prof. Muthen,

I have two independent variables (i.e., x1 and x2) and one dependent variable (i.e., y).

I want to see whether adding x2 explains additional variance above and beyond that of x1 in y.

This is the procedure I took: in the first model I just had "y on x1". In the second model, I added x2 next to x1: "y on x1 x2". Then I compared the value of R2 in the second model with that in the first model.

Now, I have two questions:
1) Is my procedure correct?
2) Is there a method in Mplus to test whether the difference in R2 value is statistically significant or not?

Thanks in advance for your help.
 Bengt O. Muthen posted on Monday, March 02, 2020 - 12:00 pm
1) It doesn't seem wrong. But why don't you simply run the model

y on x1 x2

and check if the x2 slope is significant? Why go via R-square?

2) Not that I can think of.
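The single-model approach suggested in point 1 might look like the fragment below. The STANDARDIZED output option is included on the assumption that the R-square for y is still of interest; the significance test reported for the x2 slope is what addresses whether x2 adds to the prediction of y beyond x1.

```mplus
MODEL:
  y ON x1 x2;      ! the test of the x2 slope is the quantity of interest
OUTPUT:
  STANDARDIZED;    ! standardized estimates, with R-square for y
```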