Remind me about the definition of a "control variable".
Pankaj P posted on Monday, June 05, 2006 - 7:13 pm
Controlling for a variable means explaining the relationship between the independent and dependent variables AFTER we extract the impact of the control variable on the DV. In other words, we run a regression on the residuals of the regression between the control variable and the DV. I read in a paper somewhere that I could do the same in SEM, i.e., run SEM on the residual covariance matrix. However, would considering indirect effects take care of controlling (besides its mediating effect)? Thanks!
For a simple mediation model, if I control for var1 and var2 on path a), is it required to control for the same variables on path c) or b)? In other words, do I need to control for the same variables on each path in the model? Thanks a lot!
You would regress the dependent variables and the mediators on the control variables. You would not regress the covariates on the control variables.
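A minimal sketch of what this advice could look like for a simple mediation model. The variable names (x, m, y, c1, c2) are hypothetical and not from the thread; the fragment only shows the MODEL part of an input file.

```mplus
! Hypothetical names: x = independent variable, m = mediator,
! y = dependent variable, c1 c2 = control variables
MODEL:
  m ON x c1 c2;      ! path a, adjusted for the controls
  y ON m x c1 c2;    ! paths b and c', adjusted for the same controls
  ! x is exogenous, so it is not regressed on c1 or c2
MODEL INDIRECT:
  y IND m x;         ! indirect effect of x on y through m
```

Here the controls appear on the right-hand side of every equation that has a dependent variable or mediator on the left, and nowhere on the left-hand side.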
Chris Chen posted on Sunday, December 26, 2010 - 4:59 pm
Dear Dr. Muthen,
Thanks for the reply.
For the statement "you would not regress the covariates on the control variables" in your message, do you mean one would not regress the "independent variables" on the control variables? It seems to me that "covariates" and "control variables" are the same term. Could you please clarify?
I'm having some issues with adding control variables to my SEM model. I have two latents and am looking at both of them as potential mediators. When I run the model I have proposed, the model fit is excellent and most of the hypothesized paths are significant. However, when I add a control variable, the model fit is terrible and I don't understand why that may be.
x2 is my control/confound variable
F1 BY y3 y7;
F2 BY y9 y10 y12-y14;
F1 ON x6;
F2 ON x6;
y15 ON F1 F2 x6;
F1 WITH F2;
F2 ON x2;
MODEL INDIRECT:
y15 IND x6;
When I run the above, model fit is horrible, but when I run this one, it's fine.
F1 BY y3 y7;
F2 BY y9 y10 y12-y14;
F1 ON x6;
F2 ON x6;
y15 ON F1 F2 x6;
F1 WITH F2;
MODEL INDIRECT:
y15 IND x6;
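One variant worth trying, following the advice earlier in this thread to regress both the dependent variables and the mediators on the control variables. This is only a sketch and not a diagnosis of the misfit: it assumes that part of the problem is that the specification with x2 fixes the paths from x2 to F1 and to y15 at zero.

```mplus
MODEL:
  F1 BY y3 y7;
  F2 BY y9 y10 y12-y14;
  F1 ON x6 x2;           ! control added to the first mediator
  F2 ON x6 x2;           ! control added to the second mediator
  y15 ON F1 F2 x6 x2;    ! control added to the outcome
  F1 WITH F2;
MODEL INDIRECT:
  y15 IND x6;
```

If the fit recovers, the earlier misfit was likely due to the omitted x2 paths rather than to the mediation structure itself.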
When I add manifest variables to my structural model (with four latent factors), they influence some of the factor loadings of these latent constructs. Some of the factor loadings drop far below the 0.40 cutoff compared with the measurement model (with all the latent factors together, where all factor loadings were above 0.40). Is this possible? Is there a solution for this?
You just regress the factors on the binary variable.
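As a rough illustration, with hypothetical names (f1 and f2 for factors, u1-u6 for their indicators, d for a 0/1 variable), regressing factors on a binary variable might be written as:

```mplus
! Hypothetical names: f1 f2 = factors, u1-u6 = indicators,
! d = observed binary (0/1) variable
MODEL:
  f1 BY u1-u3;
  f2 BY u4-u6;
  f1 f2 ON d;    ! each factor is regressed on the binary variable
```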
Jetty posted on Friday, November 08, 2013 - 9:34 am
I am testing a path analysis model of X via 4 mediators to Y (ordered categorical). I am using example 3.16 from the newest manual, except that I specify that the DV is categorical and use X1, X2, and X3 in the second MODEL command (not X2 only as in the text) because I need a fully adjusted model. My question is: given that X2 and X3 are not included in the MODEL INDIRECT statements, are the indirect effect estimates adjusted for X2 and X3? If not, what is the proper syntax for including them as control variables?
Control variables are treated as any other covariate.
dvl posted on Thursday, December 12, 2013 - 8:09 am
I'd like to ask a question about the control variables included in different regressions in a path model (no latent concepts). As my theory assumes different control variables for each of the endogenous variables in the model, each path has different control variables. As I have read above, that is not a problem, given that including the same control variables in each equation is not statistically required (in my opinion, it would make path models less interesting if it were). However, the following issues are unclear to me:
(1) How should I interpret indirect and total effects when different control variables are included in each equation? For which variables are the total and indirect effects controlled in this case and how should we report on this?
(2) In a path model all exogenous variables are correlated. For example:
X -> W -> P C1 = control variable 1 C2 = control variable 2
W on X C1; P on W C2;
Mplus assumes C1 and C2 to be correlated, even if I do not include C2 as a control in the regression of W. Is this true? So the regression of W is not controlled for C2, but somehow the correlation between C2 and C1 should affect the relationship between X and W? Do you know how I should interpret this?
I really struggle with this! I hope you can help. Thanks a lot!
Regarding my question above, I have already figured out (2), but the question that remains open to me is whether it is possible to interpret an indirect effect when the mediator variable and the dependent variable have different control variables. I am really confused about that!
In a structural model, I want to control for the effects of a nominal variable (different countries). I am not interested in the effect different countries may have on my dependent variables; I merely want to control for the variance they may explain in the model.
Would I need to construct dummy variables for each of the countries and enter them separately in the model (with one being the reference category), or can I enter a single variable in which country A =1, country B = 2, etc.?
Regarding my question above: I have a path model (with all manifest variables, no latent concepts due to data limitations). I want to include the same control variables in each equation so that I know what my indirect effects are controlled for! But in that case, I have a saturated model and no fit indices! Can I do something with that kind of model? In the literature, I see no publications using saturated models. Can you give me some advice on this?
Although it is true that chi-square model testing can't reject the model, it is not a fatal flaw to consider a saturated model if your theories don't involve excluded paths. There should be many such models in the literature. Regression analysis is another example of a saturated model.
Imaan posted on Saturday, February 15, 2014 - 2:52 am
I have three mediators in my model. Should I include control variables that have a significant relationship with the DV and the mediators? The control variables relate differently to the DV and to the mediators. How should I run this analysis? Should I include all control variables on the DV and mediator paths, or should I exclude control variables that have a nonsignificant relationship with the DV or the mediators?
I'm running a multiple mediation model with 3 mediators, income, maternal mental health and parenting behaviour. According to theory, the mediators themselves are all interlinked. More specifically, income is thought to directly affect maternal mental health and parenting behaviours. I am including a range of controls for each of my mediators and I wondered if it was acceptable to allow income to act as a control for the other two mediators even though income is itself a mediator?
I am working on a SEM model. I do not want Mplus to estimate some of the paths that it automatically does, like between some of the exogenous variables. So I was thinking of fixing the paths to 0. Will this be the right way to do it? If so, what is the syntax for that? Should I mention each interaction in a separate syntax or can they be mentioned on the same line?
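A minimal sketch of the @0 syntax for fixing a parameter at zero, assuming the variables in question are exogenous latent factors with the hypothetical names f1-f4. Whether fixing these covariances is appropriate depends on the model, and relationships among observed covariates usually do not need to be mentioned at all.

```mplus
! Hypothetical names: f1-f4 = exogenous latent factors
MODEL:
  f1 WITH f2@0;          ! fix one covariance at zero
  f1 WITH f3@0 f4@0;     ! several pairs can go in one statement
```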
lamjas posted on Thursday, June 26, 2014 - 1:32 am
I have two questions.
(1) For a simple mediation SEM using cross-sectional data, let's say IV-->M-->DV, with gender as the control variable, should I add paths from gender to all three variables, or just to M and DV?
(2) For a cross-lagged SEM, let's say A1, A2, B1, B2 (the number indicates the time point), with gender and age as control variables, am I right that I should add paths from gender and age to the Time 1 variables only?
If I have a time-variant control variable (say X1 and X2), then I should add paths of X1 to A1 and B1, whereas X2 to A2 and B2. Am I right?
The distinction is that you don't regress an exogenous IV on another exogenous variable. Your A and B variables are dependent variables.
Paula Yuma posted on Sunday, June 29, 2014 - 10:13 am
Hello, Thank you for this amazing forum!
I'm running a mediation analysis with three exogenous latent factors (HSEP, NSEP and PARKS), one mediating latent factor (PNS) and one observed outcome (PA).
I've been advised that, in addition to placing the control variables (cov1-cov8) in the regression equations (ON statements) as below:
PNS on HSEP NSEP PARKS cov1 cov2 cov3 cov4 cov5 cov6 cov7 cov8;
PA on PNS HSEP NSEP PARKS cov1 cov2 cov3 cov4 cov5 cov6 cov7 cov8;
I should also be regressing the latent factors NSEP HSEP and PARKS on all the covariates, as follows:
NSEP on cov1-cov8; HSEP on cov1-cov8; PARKS on cov1-cov8;
My model runs well with the control variables in the ON statements for the mediator and outcome. It essentially doesn't run at all when I regress the HSEP, NSEP and PARKS factors on the controls. What would you advise in terms of how to include these controls, and if possible, could you explain why?
Additionally, some of my covariates are dichotomous dummy variables for race. Should their covariances with one another be set to 0 using cov1 with cov2@0?
Many thanks! Paula Yuma Doctoral Candidate UT Austin School of Social Work
I agree that you need to allow for factors and covariates to be related. If this runs into problems as you say, you may want to send to Support.
Nothing needs to be said about the relationships among covariates. I assume that for the dummies, you have one less dummy variable than the number of categories.
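For a nominal variable with k categories, the k-1 dummy variables can be created in the DEFINE command. The sketch below assumes a hypothetical three-category variable grp (coded 1, 2, 3, with 1 as the reference category) and hypothetical variables y and x; it also assumes grp has no missing values.

```mplus
VARIABLE:
  NAMES = y x grp;
  USEVARIABLES = y x d2 d3;   ! variables created in DEFINE go last
DEFINE:
  d2 = 0;
  d3 = 0;
  IF (grp EQ 2) THEN d2 = 1;  ! dummy for category 2
  IF (grp EQ 3) THEN d3 = 1;  ! dummy for category 3
MODEL:
  y ON x d2 d3;               ! category 1 is the reference
```

No WITH statements among d2 and d3 are included, in line with the point that nothing needs to be said about the relationships among covariates.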
Paula Yuma posted on Sunday, June 29, 2014 - 12:09 pm
Thank you so much for your reply. Yes, I have one less dummy than the number of categories.
Just to clarify, do the independent factors need to be related to the covariates through a regression statement (where the IVs are regressed on the covariates), or just allowed to covary, as they would by default as they are all included in the mediator and outcome regression statements?
Typically you can let covariates and exogenous factors simply correlate using WITH. For categorical DV modeling, e.g. using WLSMV, using ON can be preferred because WITH turns the covariates into "y's" (their parameters become part of the model), and this has implications for assumptions of underlying normality.
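The two options could be sketched as follows, using the variable names from the posts above. Only one of the two options would be used in a given run; the measurement (BY) statements are omitted here.

```mplus
MODEL:
  PNS ON HSEP NSEP PARKS cov1-cov8;
  PA ON PNS HSEP NSEP PARKS cov1-cov8;

  ! Option 1: let the exogenous factors correlate with the covariates
  HSEP NSEP PARKS WITH cov1-cov8;

  ! Option 2 (use instead of Option 1, e.g. with categorical DVs
  ! and WLSMV): regress the exogenous factors on the covariates
  ! HSEP NSEP PARKS ON cov1-cov8;
```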