Anonymous posted on Friday, July 30, 2004 - 10:16 am
I would like to test a mediator relationship with dichotomous variables for the predictor variable and mediator and a continuous variable for the outcome. In other words, my model would look something like this:
Dichotomous var --> Dichotomous var --> Continuous var
Is it appropriate to do this using SEM? I usually think of SEM as being used with continuous variables only.
Thanks in advance for your help!
bmuthen posted on Friday, July 30, 2004 - 10:27 am
Yes, you can do this in Mplus SEM. For the equation where the dependent variable is dichotomous, a logit or probit regression is used (using ML or WLS). The mediation can be either via the observed dichotomous variable or via a continuous latent response variable underlying the observed dichotomous variable. You may also want to contact David MacKinnon who does research on related matters.
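A minimal sketch of what such an input might look like. The variable names x (dichotomous predictor), m (dichotomous mediator), y (continuous outcome) and the file name mydata.dat are assumptions for illustration, not taken from the original question. With weighted least squares, the mediation runs through the continuous latent response variable m* behind m:

```
TITLE:    binary x -> binary m -> continuous y (illustrative sketch)
DATA:     FILE = mydata.dat;     ! hypothetical file name
VARIABLE: NAMES = x m y;         ! hypothetical variable names
          CATEGORICAL = m;       ! only dependent variables are listed here
ANALYSIS: ESTIMATOR = WLSMV;     ! probit regression for m
MODEL:    m ON x;                ! probit: binary mediator on predictor
          y ON m x;              ! linear: y regressed on m* and x
MODEL INDIRECT:
          y IND m x;             ! indirect effect of x on y via m
```

The predictor x is not placed on the CATEGORICAL list because it is not a dependent variable in any equation.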
Anonymous posted on Monday, May 30, 2005 - 4:05 pm
I hypothesize in the following system of dichotomous variables that y2 is endogenous in y1. y1=f(x1,y2) y2=g(x2)
How do I test for my hypothesis in Mplus? Thanks much!
You would use the CATEGORICAL option of the VARIABLE command to specify y1 and y2 are categorical. Following is how to specify the model:
MODEL: y1 ON x1 y2; y2 ON x2;
The content of the model is that x2 influences y1 indirectly via y2. The default estimator is weighted least squares which estimates probit regression coefficients. You can obtain logistic regression coefficients using maximum likelihood estimation. For probit, the y1 regression uses y2* as a predictor where y2* is an underlying continuous latent variable. For logistic, the y1 regression uses the observed variable y2 as a predictor.
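Put together as a fuller input, the setup described above might be sketched as follows. The file name and the order of the variables in NAMES are assumptions; switching the commented ESTIMATOR line gives the logistic version, and depending on the Mplus version and model, ML may ask for additional settings.

```
DATA:     FILE = mydata.dat;     ! hypothetical file name
VARIABLE: NAMES = y1 y2 x1 x2;   ! assumed column order
          CATEGORICAL = y1 y2;
ANALYSIS: ESTIMATOR = WLSMV;     ! weighted least squares: probit, y2* predicts y1
!         ESTIMATOR = ML;        ! alternative: logistic, observed y2 predicts y1
MODEL:    y1 ON x1 y2;
          y2 ON x2;
```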
Anonymous posted on Monday, May 30, 2005 - 8:55 pm
It works great-- Thank you!
Bonnie posted on Thursday, July 14, 2005 - 11:02 am
I do not understand this: y1 is a continuous variable, but probit regression is still the default for the regression of y1 on y2. Shouldn't linear regression be used in that case? I would greatly appreciate your help!
This would only happen if you have a combination of categorical and continuous outcomes. The regression coefficient for the continuous variable is a simple linear regression coefficient.
deborah posted on Thursday, October 27, 2005 - 10:41 am
I would like to test a mediation model such as:
dichotomous var--> nominal var --> continuous var
What would be best way to get started?
anonymous posted on Thursday, October 27, 2005 - 10:48 am
I have a missing data problem. The predictor variable has large portions of data missing because the questionnaire used to measure it was developed after a number of subjects had already been enrolled. Should I just not use the data from the missing predictor? I have sufficient sample size.
You have two choices. Don't use the cases or treat the x variables as y variables in the model. If you do this, they will be included in the analysis. They will then have distributional assumptions made about them.
Nominal variables cannot be mediating variables in Mplus. You would need to test the model for each category of the mediating variable.
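For the missing-predictor question, the second choice (treating the x variable as a y variable) is commonly done by mentioning the variance of x in the MODEL command. A minimal sketch, assuming hypothetical variable names y and x:

```
MODEL:    y ON x;    ! x has missing values for the early enrollees
          x;         ! mentioning the variance of x brings x into the model
```

Cases with missing x are then kept in the analysis, at the price of the distributional assumptions about x mentioned above.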
bmuthen posted on Monday, October 31, 2005 - 7:15 am
As Linda mentioned, the modeling of
dichotomous var--> nominal var --> continuous var
cannot be done in a straightforward way in Mplus because the nominal --> continuous relationship would call for breaking up the nominal variable into a set of dummy variables.
A way around this is to handle the modeling in a mixture framework where the observed nominal variable is made equal to a latent categorical variable. Then the nominal --> continuous relationship could be captured as usual in mixtures, namely by the continuous variable having different means in the different latent classes. Making the observed nominal u variable the same as the categorical latent c variable is done by saying
I am working on a treatment outcome study. We are trying to find the best method for doing mediational analyses given the constraints of our study. The main constraint is that both conditions are active treatment and there is no other comparison/control group. We would like to be able to examine whether or not the change in the process variables precedes change in the outcome variable. Is there a semi-legitimate way to do this over two time points or over three time points (given that we have no control group)? Our mediational measures may also be somewhat weak.
We have been combing the literature but have not yet found a suggested method for these mediational analyses that best fits our study and its constraints.
Would you have a specific recommendation or a reference to a theoretical article or a treatment study that might be helpful?
bmuthen posted on Tuesday, November 01, 2005 - 8:13 am
I can't comment on this study specifically, but below are some general references in the psych literature. You may want to contact these authors directly. Mediation is also a (differently treated) topic within causal inference currently discussed among statisticians for treatment/intervention studies.
MacKinnon, D.P., Lockwood, C.M., Hoffman, J.M., West, S.G. & Sheets, V. (2002). A comparison of methods to test mediation and other intervening variable effects. Psychological Methods, 7, 83-104.
MacKinnon, D.P., Lockwood, C.M. & Williams, J. (2004). Confidence limits for the indirect effect: Distribution of the product and resampling methods. Multivariate Behavioral Research, 39, 99-128.
Shrout, P.E. & Bolger, N. (2002). Mediation in experimental and nonexperimental studies: New procedures and recommendations. Psychological Methods, 7, 422-445.
bmuthen posted on Thursday, November 03, 2005 - 9:05 am
Here is what Dave MacKinnon says:
The closest thing to this is the paper by Collins, L. M., Graham, J. W., & Flaherty, B. P. (1998). An alternative framework for defining mediation. Multivariate Behavioral Research, 33, 295-312.
I have tried applying some Markov models for repeated learning from my dissertation to the mediation case but I have not pursued it enough to fully develop the model. I think developing mediation in a Markov model is a good idea. The states of having the mediator and having the outcome are not necessarily absorbing so it may differ from typical Markov Models for learning.
phdstudent posted on Wednesday, March 11, 2009 - 7:38 am
I am not sure how to set up my model using mediators. The way I am setting it up now is not converging. I am looking at the following relationships. My sample is n>1500 --
I am conducting a mediational model with a nominal predictor (race/ethnicity - 3 dummy codes with 1 omitted reference group), two latent variables as the mediators, and two continuous outcomes. I want to verify that the indirect effects from Mplus can be used to identify any indirect effects in my analysis even though my predictors are dummy variables.
Sorry, in re-reading my post, I misrepresented my variables. I have a nominal predictor, 2 latent variables as mediators, and 2 dichotomous outcomes. In this case, can I use an IND command to examine the indirect effects for this model?
A set of dummy variables must be created for the nominal predictor. Otherwise it should be fine.
Linda posted on Wednesday, January 27, 2010 - 12:15 am
I am doing a path analysis with 4 observed categorical variables (1 predictor, 2 mediators, 1 outcome). These are all dichotomous variables. The data also has weights, but no clusters or stratification.
I was able to run the model using type=general and delta parameterization.
My question is in interpreting the coefficients. Are the path coefficients interpreted as log odds?
A probit regression coefficient is not a probability. See pages 406-407 to see how to translate a probit regression coefficient into a probability. See the Topic 2 course handout and video for information on categorical data analysis.
Linda posted on Thursday, January 28, 2010 - 1:21 am
Thank you for pointing me to helpful resources. So it looks like, the coefficients can be discussed in terms of either Z-scores or probit index. To estimate probabilities, I will have to follow the equation shown on page 406 in the user's manual.
I had one more question. Do you by any chance know of a published paper on mediation that describes the use of Mplus and shows probit regression coefficients? I am writing a paper, and I thought it would be helpful to see one example.
I'm conducting a linear latent growth model. I have 7 independent variables (4 of them are latent). The influence of these 7 variables on i and s is supposed to be mediated by 6 binary variables.
The model does not converge. Is there a problem with the latent independent variables? If I use only the observed independent variables there is no problem (using MLR). But the probit model results in a non-positive definite covariance matrix.
Further, I'm trying to estimate indirect effects. But this is only possible in conjunction with WLSMV estimation, isn't it?
So what would you suggest?
1.) Calculating factor scores of the independent latent Vs, and then using them as observed variables?
2.) Is there a way to calculate indirect effects with logits?
If you have not followed the following steps in your analysis, I would do so:
1. Find a well-fitting model for the latent covariates.
2. Add the observed covariates.
3. Find a well-fitting growth model.
4. Estimate a model with the growth model and covariates.
5. Add the mediators, perhaps not all at the same time.
These steps will allow you to find problems in your model.
Indirect effects are possible with both weighted least squares and maximum likelihood. In Mplus, they are currently available only for weighted least squares.
csulliva posted on Wednesday, November 10, 2010 - 1:38 pm
I have a multiple mediation model where a left-censored dependent variable is being regressed on a dichotomous predictor. We then have a set of 4-5 regressors that we believe might confound that relationship, which are then entered into the equation as mediators (some of which are dichotomous). It seems this analysis requires algorithm=integration and I am getting error messages that suggest that model indirect cannot be used in such situations. Is it possible to get indirect/total effects for these relationships?
In Mplus dichotomous mediators can be used in indirect effects only with weighted least squares estimation. You can have censored and categorical variables with weighted least squares estimation.
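A minimal sketch of such a weighted least squares setup, assuming hypothetical variable names (x for the dichotomous predictor, m1 for a dichotomous mediator, m2 for a continuous mediator, y for the left-censored outcome); the actual variables in the poster's model are not shown in the thread:

```
VARIABLE: NAMES = x m1 m2 y;     ! hypothetical variable names
          CATEGORICAL = m1;      ! dichotomous mediator
          CENSORED = y (b);      ! (b) = censored from below
ANALYSIS: ESTIMATOR = WLSMV;     ! no numerical integration involved
MODEL:    m1 m2 ON x;
          y ON m1 m2 x;
MODEL INDIRECT:
          y IND m1 x;            ! indirect effect via m1
          y IND m2 x;            ! indirect effect via m2
```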
csulliva posted on Wednesday, November 10, 2010 - 2:48 pm
Thank you very much. I ran the model with WLS, but got a nonconvergence, maximum iterations exceeded (with various settings for iterations) message. Is there an optimal "iterations=" setting for this type of model?
The model specified below attempts to look at whether the association between black and wounds (both categorical) is mediated by 3 categorical (i.e., modtheft, modviol, weapon) and 2 continuous (i.e., Zsociodis, Zsupervision) variables. Almost all of the variables are significant predictors of wounds in this model, which is very different from the results I get just running a simple probit regression trying to predict wounds. In this simple model, only a few variables are significant.
I re-ran the model including residual covariances among all of the mediators. This greatly changed the model, with many of the variables predicting wounds under the first ON statement being reduced to non-significance.
Can you provide guidance on what is going on here and what the most appropriate specification would be? Thanks.
CATEGORICAL IS wounds modtheft modviol weapon;
Missing = all (999);
Analysis:
type=missing;
Bootstrap= 500;
MODEL:
wounds on black modtheft modviol weapon Zsociodis Zsupervision;
modtheft on black;
modviol on black;
weapon on black;
Zsociodis on black;
Zsupervision on black;
MODEL INDIRECT:
wounds IND modtheft black;
wounds IND modviol black;
wounds IND weapon black;
wounds IND Zsociodis black;
wounds IND Zsupervision black;
You certainly want to correlate the residuals for the mediators because it is clear that besides "black" the mediators share many other predictors, in which case the residuals will be correlated. The model without the correlations will be misspecified and the estimates not trustworthy.
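One way the residual correlations among the five mediators could be written, as a sketch that reuses the variable names from the input above. The pairwise WITH statements are spelled out explicitly here; this is an illustration, not the poster's actual re-run.

```
MODEL:    wounds ON black modtheft modviol weapon Zsociodis Zsupervision;
          modtheft modviol weapon Zsociodis Zsupervision ON black;
          ! residual covariances among all five mediators
          modtheft  WITH modviol weapon Zsociodis Zsupervision;
          modviol   WITH weapon Zsociodis Zsupervision;
          weapon    WITH Zsociodis Zsupervision;
          Zsociodis WITH Zsupervision;
```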
I'm quite new to SEM and am wondering if it's possible to test a dual mediation model with only dichotomous variables. I have a sample of about 340 participants, one dependent variable, one independent variable, two control variables, and two hypothesized mediators, all of which are dichotomous variables. At the moment I'm using logistic regression but I think SEM would be a better test of the mediation model. Thanks in advance for any feedback!
In Mplus, your model can currently be estimated using probit regression and weighted least squares estimation where MODEL INDIRECT is available. There will be some additions to this in the fairly near future.
I have multiple mediators, some of which are continuous, some of which are ordinal and one of which is dichotomous. As I understand it, the ordinal variable can, with certain assumptions, be used in the same way as the continuous mediators. However, when I want to control my binary mediator for the same covariates as the other mediators, I encounter problems.
The fact that I use multiply imputed data limits my possibilities (e.g. I cannot use bootstrapped standard errors). What is left, I think, is the option of a different parameterization and estimation method, but if I, say, choose a probit link and WLS estimation, I can only do so for the entire model, effectively specifying this as the method to follow for all other mediators as well, right? Any ideas on how to solve this problem?
Additionally, I have a question on testing contrasts in ordinal mediators. I have tested a 3-category ordinal mediator as continuous and through dummies, but my interest is a difference test in mediation for category 1 and 3. Is this possible in Mplus? Thanks in advance!
You can try Estimator = Bayes, which like bootstrapping allows for a non-normal distribution for the indirect effects and which like WLSMV allows for latent response variable mediators suitable for binary and ordinal variables.
I am unclear about the situation you describe in your last paragraph. What kind of application raises this question and do you mean that two different dichotomizations of the ordinal mediator give different indirect effect results?
Thanks for your prompt reply, Dr. Muthen. The question in the last paragraph concerns the test of specific contrasts in the mediation through an ordinal variable. The three categories represent having no, some and many foreign-born friends and it significantly mediates the effect of my x on my y, but I am curious whether the specific difference between 'many' and 'some', or 'many' and 'none' is significant. Since I can only use one dummy if I specify the variable as categorical, the reference group always contains two categories, effectively blocking my attempt to test specific contrasts. Is there a way around this? Thanks in advance!
That sounds like it is not m*, the continuous latent response variable behind the ordinal m, that is the relevant mediator. It sounds like somehow m is the mediator you are interested in. But then you have to ask yourself how the regression of the outcome y on the mediator m should be formulated - how should m be treated in this regression? Should it be treated as continuous? If you consider it ordinal, how do you specify a regression on an ordinal predictor m without introducing m*?
Thanks for the reply. Indeed, I wouldn't know the answer either, but I was hoping there was a way to do it anyway =).
As for the estimator=bayes, how would this affect the interpretation of regression coefficients? Could I still interpret the coefficients for continuous variables as the unit increase of y when x increases by one and is a probit link used for the categorical variables? How would I interpret the coefficients for the indirect effects when one of the effects (say m on x) is estimated with probit while y on m is not? Are any additional tests, tricks or words of caution needed compared to 'regular' ML estimation? Thanks in advance!
Bayes does not change the interpretation of the estimates.
You ask about a model with binary mediator. Using Bayes with Mediator=Latent, or using WLSMV, you have two linear regressions (m* on x and y on m*, where m* is the continuous latent response variable behind m) and therefore the usual product indirect effect is fine. New approaches can also deal with the observed binary m being the mediator (more on this in a forthcoming paper), in which case ML is also available in addition to Bayes with Mediator=Observed.
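A minimal sketch of the Bayes setup with a latent response variable mediator, assuming hypothetical variable names x, m (binary mediator) and y (continuous outcome). The indirect effect is formed here as a labeled product in MODEL CONSTRAINT, which is one common way to obtain it:

```
VARIABLE: NAMES = x m y;         ! hypothetical variable names
          CATEGORICAL = m;       ! binary mediator
ANALYSIS: ESTIMATOR = BAYES;
          MEDIATOR = LATENT;     ! y is regressed on m*, not on observed m
MODEL:    m ON x (a);            ! slope of m* on x
          y ON m (b)             ! slope of y on m*
               x;
MODEL CONSTRAINT:
          NEW(ind);
          ind = a*b;             ! product indirect effect
```

Because both regressions are linear in m*, the product a*b is the usual indirect effect, and its posterior distribution does not need to be symmetric.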
Many thanks for your reply. The estimation with Bayes worked for my full model, but when I tried to run a multigroup version, I got the message that multigroup analysis is not supported with Bayesian estimation.
I tried fiddling around with mixture modelling and knownclass, but I have no experience with these kinds of models and keep getting errors. My approach was the following, given that the observed dummy for group membership is g (only relevant syntax shown):
y on m1; y on m2;(etc.) y on x; y on [covariates];
m1-m5 with m1-m5
[all the commands above repeated, with unique labels and constraints for indirect effects]
[idem with different labels]
The error I get reads: "Variances for categorical outcomes can only be specified using PARAMETERIZATION=THETA with estimators WLS, WLSM, or WLSMV.".
However, I was under the impression that Bayesian could deal with this and that I still have to specify categorical variables. Is my mixture approach ok and what to do with the error? Any help would be much appreciated.
I am running a multiple mediation model with:
3 dichotomous, 1 ordinal, and 1 continuous IV
3 continuous M's
1 dichotomous DV
(all variables are observed)
I used MLR (I am not quite sure when to use WLSMV instead) and TYPE=MISSING. However, I got an error message stating that this model can only be used in combination with Monte Carlo integration. I added INTEGRATION=MONTECARLO to the Analysis statement and the model seems to work now.
My question is: is it correct that Monte Carlo integration should be used here or did I do something wrong? What does it mean for my results? I am not familiar with Monte Carlo integration yet.
My second question concerns computing indirect effects for dichotomous outcomes. MODEL INDIRECT does not seem to work with the integration command. Can I use MODEL CONSTRAINT and compute the product of the IV-M and M-DV effects, or do I need a different approach with a dichotomous DV?
I'm looking at a mediation analysis and have a categorical IV, a categorical mediator, and a categorical DV. I'm having difficulty interpreting the direct and indirect effects, however, because the output doesn't indicate whether the regressions are probit, logit, OLS, etc.
Below is what I hope presents a 2-1-1 model with outcome 'readm' (level 1), mediator 'educat' (level 1) and main explanatory variable 'patgroup' (level 2). These are all binary variables. If I add educat to the categorical command, I get an error. Can you propose a solution?
BETWEEN = patgroup;
WITHIN = smoking gender copd;
CLUSTER is Hospital;
categorical is readm;
ANALYSIS: TYPE=TWOLEVEL; estimator=wlsmv;
model:
%WITHIN%
readm on smoking gender copd;
readm on educat (bw);
%BETWEEN%
educat ON patgroup (a);
readm ON educat (bb);
readm ON patgroup (cp);
MODEL CONSTRAINT:
NEW (direct indirect);
indirect=a*(bw+bb);
direct=(cp);
I'm working with a longitudinal data set (4 time points) where the outcome at T4 is binary (employee turnover; yes or no). All other variables are continuous.
One of the hypothesized models is: X --> M1 --> M2, M3, M4 --> Y.
Due to survey attrition (25%), I was thinking of modeling the longitudinal relationships with MLR (Monte Carlo integration). I have 4 dimensions of integration (i.e., 4 latent variables predicting Y, i.e., M1-M4) in one of my models.
Could I use MODEL INDIRECT to evaluate the indirect effects of X on Y through M1-M4?
Is MLR the best method to begin with, or would you recommend another approach (i.e., MLR probit, Bayes)? I've read extensively on the topic, but I'm still unsure as to the best way forward (WLSMV doesn't seem appropriate given the missing data).
I think you have Model Indirect available to you here also; otherwise define the effects in Model Constraint.
But you have a binary Y which calls for the new counterfactually-defined "causal" effects. We don't offer that option with multiple mediators. If you are interested in this perhaps you can use VanderWeele's approach for a rare outcome.
Read the Valeri-VanderWeele (2013) article in Psych Methods on mediation with a binary Y. There you see the rare outcome approach which can be done in Mplus.
Stay with ML for this.
Marco R posted on Tuesday, March 08, 2016 - 9:52 am
I would like to conduct a mediated moderation with 2 moderators (both dichotomous), one mediator (dichotomous), and a continuous outcome. I'm trying to use the following syntax:
VARIABLE: NAMES ARE X W M Y;
USEVARIABLES X W M Y XW;
CATEGORICAL are M;
DEFINE: XW = X*W;
ANALYSIS: BOOTSTRAP = 5000;
Model: Y ON M; Y ON X; Y ON W; Y ON XW;
M ON X; M ON W; M ON XW;
MODEL INDIRECT: Y via M XW;
a) Is this the appropriate syntax for this mediated moderation? b) Does it account for the fact that the mediator is dichotomous? c) Also, how can I then decompose this mediated moderation and test indirect effects for each combination of moderator values?
a) No, Mplus doesn't know that XW is the product of X and W. Instead, look at the MOD option of the Model Indirect command.
b) Mplus would use the proper counterfactually-defined effects if you use MOD.
c) Mplus offers moderator plots. See the document on our Mediation Page (left column of home page):
Moderated mediation plot based on User's Guide ex 3.18 using the Version 7.2 MODEL INDIRECT language
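As a rough illustration of how indirect effects at each moderator value can be formed by hand, the sketch below uses labeled parameters in MODEL CONSTRAINT. It is a different technique from the MOD option recommended above: it treats the mediator through its latent response variable M* under WLSMV and does not give the counterfactually-defined effects. It assumes W is coded 0/1; the labels a1, a3, b and the new parameter names are hypothetical.

```
VARIABLE: NAMES = X W M Y;
          USEVARIABLES = X W M Y XW;
          CATEGORICAL = M;
DEFINE:   XW = X*W;
ANALYSIS: ESTIMATOR = WLSMV;
          BOOTSTRAP = 5000;
MODEL:    Y ON M (b)
               X W XW;
          M ON X (a1)
               W
               XW (a3);
MODEL CONSTRAINT:
          NEW(indw0 indw1 diff);
          indw0 = a1*b;            ! indirect effect of X via M* when W = 0
          indw1 = (a1 + a3)*b;     ! indirect effect of X via M* when W = 1
          diff  = indw1 - indw0;   ! moderation of the indirect effect
OUTPUT:   CINTERVAL(BOOTSTRAP);    ! bootstrap confidence intervals
```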
Roy Stewart posted on Thursday, April 09, 2020 - 2:58 pm
Dear Dr. Muthén,
In various mediation analyses with a binary outcome and continuous or binary mediators, the Mplus output reports a p-value of 0.023 for an estimate under Model Results, while the corresponding odds ratio has a p-value of 0.126.
Does anyone have an explanation or literature for me?
With odds ratios you don't want to use p-values but instead request a confidence interval. This properly takes into account the fact that odds ratios have a non-symmetric sampling distribution. The p-value assumes a symmetric distribution.
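Confidence intervals are requested through the OUTPUT command. A minimal sketch, with the bootstrap variant shown as a commented alternative; the number of bootstrap draws is an arbitrary choice for illustration:

```
OUTPUT:   CINTERVAL;                 ! confidence intervals for the estimates
! Bootstrap-based alternative (requires BOOTSTRAP in ANALYSIS):
! ANALYSIS: BOOTSTRAP = 1000;
! OUTPUT:   CINTERVAL(BCBOOTSTRAP);  ! bias-corrected bootstrap intervals
```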