
Anonymous posted on Friday, July 30, 2004  10:16 am



I would like to test a mediator relationship with dichotomous variables for the predictor variable and mediator and a continuous variable for the outcome. In other words, my model would look something like this: Dichotomous var > Dichotomous var > Continuous var. Is it appropriate to do this using SEM? I usually think of SEM as being used with continuous variables only. Thanks in advance for your help!

bmuthen posted on Friday, July 30, 2004  10:27 am



Yes, you can do this in Mplus SEM. For the equation where the dependent variable is dichotomous, a logit or probit regression is used (using ML or WLS). The mediation can be either via the observed dichotomous variable or via a continuous latent response variable underlying the observed dichotomous variable. You may also want to contact David MacKinnon, who does research on related matters.

Anonymous posted on Monday, May 30, 2005  4:05 pm



Dr., I hypothesize in the following system of dichotomous variables that y2 is endogenous in y1. y1=f(x1,y2) y2=g(x2) How do I test for my hypothesis in Mplus? Thanks much! 


You would use the CATEGORICAL option of the VARIABLE command to specify y1 and y2 are categorical. Following is how to specify the model: MODEL: y1 ON x1 y2; y2 ON x2; The content of the model is that x2 influences y1 indirectly via y2. The default estimator is weighted least squares which estimates probit regression coefficients. You can obtain logistic regression coefficients using maximum likelihood estimation. For probit, the y1 regression uses y2* as a predictor where y2* is an underlying continuous latent variable. For logistic, the y1 regression uses the observed variable y2 as a predictor. 

Anonymous posted on Monday, May 30, 2005  8:55 pm



It works great Thank you! 

Bonnie posted on Thursday, July 14, 2005  11:02 am



I don't understand this: y1 is a continuous variable, but probit regression is still the default estimator for the regression of y1 on y2. Shouldn't linear regression be used here? I would greatly appreciate your help!


This would only happen if you have a combination of categorical and continuous outcomes. The regression coefficient for the continuous variable is a simple linear regression coefficient. 

deborah posted on Thursday, October 27, 2005  10:41 am



I would like to test a mediation model such as: dichotomous var> nominal var > continuous var What would be best way to get started? 

anonymous posted on Thursday, October 27, 2005  10:48 am



I have a missing data problem. The predictor variable has large portions of data missing because the questionnaire used to measure it was developed after a number of subjects had already been enrolled. Should I just not use the data from the cases with the missing predictor? I have sufficient sample size.


You have two choices. Don't use the cases or treat the x variables as y variables in the model. If you do this, they will be included in the analysis. They will then have distributional assumptions made about them. 


Nominal variables cannot be mediating variables in Mplus. You would need to test the model for each category of the mediating variable. 

bmuthen posted on Monday, October 31, 2005  7:15 am



As Linda mentioned, the modeling of dichotomous var > nominal var > continuous var cannot be done in a straightforward way in Mplus because the nominal > continuous relationship would call for breaking up the nominal variable into a set of dummy variables. A way around this is to handle the modeling in a mixture framework where the observed nominal variable is made equal to a latent categorical variable. Then the nominal > continuous relationship could be captured as usual in mixtures, namely by the continuous variable having different means in the different latent classes. Making the observed nominal u variable the same as the categorical latent c variable is done by saying %c#1% [u$1@15]; %c#2% [u$1@-15]; The u and c variables are then the same because the probability that u=0 is the same as the probability that c=1 and the probability that u=1 is the same as the probability that c=2.
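To see why fixing a threshold at a large positive or negative value ties u to c, here is a minimal Python sketch. It assumes the usual logit threshold form P(u = 1 | class) = 1 / (1 + exp(threshold)), with the threshold fixed at +15 in class 1 and -15 in class 2; the function name is made up for illustration and is not part of Mplus.

```python
import math

def prob_u_equals_1(threshold):
    """P(u = 1 | class) for a binary item with a logit threshold (assumed form)."""
    return 1.0 / (1.0 + math.exp(threshold))

p_class1 = prob_u_equals_1(15.0)    # threshold fixed at +15 in class 1
p_class2 = prob_u_equals_1(-15.0)   # threshold fixed at -15 in class 2

print(f"P(u=1 | c=1) = {p_class1:.7f}")
print(f"P(u=1 | c=2) = {p_class2:.7f}")
```

Under these assumptions the first probability is essentially 0 and the second essentially 1, so observing u=0 amounts to being in class 1 and observing u=1 amounts to being in class 2.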


I am working on a treatment outcome study. We are trying to find the best method for doing mediational analyses given the constraints of our study. The main constraint is that both conditions are active treatments and there is no other comparison/control group. We would like to be able to examine whether or not the change in the process variables precedes change in the outcome variable. Is there a semi-legitimate way to do this over two time points or over three time points (given that we have no control group)? Our mediational measures may also be somewhat weak. We have been combing the literature but have not yet found a suggested method for these mediational analyses that best fits our study and its constraints. Would you have a specific recommendation or a reference to a theoretical article or a treatment study that might be helpful?

bmuthen posted on Tuesday, November 01, 2005  8:13 am



I can't comment on this study specifically, but below are some general references in the psych literature. You may want to contact these authors directly. Mediation is also a (differently treated) topic within causal inference currently discussed among statisticians for treatment/intervention studies. MacKinnon, D.P., Lockwood, C.M., Hoffman, J.M., West, S.G. & Sheets, V. (2002). A comparison of methods to test mediation and other intervening variable effects. Psychological Methods, 7, 83-104. MacKinnon, D.P., Lockwood, C.M. & Williams, J. (2004). Confidence limits for the indirect effect: Distribution of the product and resampling methods. Multivariate Behavioral Research, 39, 99-128. Shrout, P.E. & Bolger, N. (2002). Mediation in experimental and nonexperimental studies: New procedures and recommendations. Psychological Methods, 7, 422-445.

bmuthen posted on Thursday, November 03, 2005  9:05 am



Here is what Dave MacKinnon says: The closest thing to this is the paper by Collins, L. M., Graham, J. W., & Flaherty, B. P. (1998). An alternative framework for defining mediation. Multivariate Behavioral Research, 33, 295-312. I have tried applying some Markov models for repeated learning from my dissertation to the mediation case but I have not pursued it enough to fully develop the model. I think developing mediation in a Markov model is a good idea. The states of having the mediator and having the outcome are not necessarily absorbing so it may differ from typical Markov models for learning.

phdstudent posted on Wednesday, March 11, 2009  7:38 am



I am not sure how to set up my model using mediators. The way I am setting it up now is not converging. I am looking at the following relationships (my sample is n>1500): x1 (cont latent var/categ indicators) & x2 (observed binary) > m1 (observed binary) & m2 (observed binary) > y1 (observed binary), controlling for age, educ, region etc. My code is: x1 by ind1 ind2 ind3; m1 m2 ON x1 x2; y1 ON m1 m2; y1 m1 m2 ON age educ region; 1. The model is not converging. How can I fix this? 2. Do I need to write code to look at indirect effects? If so, what would that be? Thank you in advance!


Please send your input, data, output, and license number to support@statmodel.com. 

phdstudent posted on Wednesday, March 11, 2009  8:11 pm



Will do thanks 


Hi - I am conducting a mediational model with a nominal predictor (race/ethnicity - 3 dummy codes with 1 omitted reference group), two latent variables as the mediators, and two continuous outcomes. I want to verify that the indirect effects from Mplus can be used to identify any indirect effects in my analysis even though my predictors are dummy variables. Thanks! aprile


I don't see a problem with that. 


Sorry, in rereading my post, I misrepresented my variables. I have a nominal predictor, 2 latent variables as mediators, and 2 dichotomous outcomes. In this case, can I use an IND command to examine the indirect effects for this model? Thanks! 


A set of dummy variables must be created for the nominal predictor. Otherwise it should be fine. 

Linda posted on Wednesday, January 27, 2010  12:15 am



I am doing a path analysis with 4 observed categorical variables (1 predictor, 2 mediators, 1 outcome). These are all dichotomous variables. The data also has weights, but no clusters or stratification. I was able to run the model using type=general and delta parameterization. My question is in interpreting the coefficients. Are the path coefficients interpreted as log odds? Thanks in advance! 


With WLSMV, the regression coefficients in the regression of categorical dependent variables on a set of covariates are probit regression coefficients. They are not log odds. 

Linda posted on Wednesday, January 27, 2010  12:25 pm



Thank you! Just so that I understand how to interpret probit regression coefficients. If the unstandardized path coefficient between the predictor (X) and mediator (M) was 0.58, then the interpretation would be: one unit of increase in X increases the probability of M by 0.58. And if this was a standardized path coefficient, then: one standard deviation increase in X increases the probability of M by 0.53. Is this right? 


A probit regression coefficient is not a probability. See pages 406-407 to see how to translate a probit regression coefficient into a probability. See the Topic 2 course handout and video for information on categorical data analysis.
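As a rough illustration of that translation, here is a minimal Python sketch. It assumes the standard probit form P(M = 1 | x) = Phi(-threshold + slope * x), where Phi is the standard normal CDF. The slope of 0.58 is the value quoted in the question; the threshold of 0.0 is a made-up placeholder, since the thread does not report one.

```python
import math

def std_normal_cdf(z):
    """Standard normal CDF computed from the error function."""
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))

def probit_prob(threshold, slope, x):
    """P(M = 1 | x) under the assumed probit form Phi(-threshold + slope * x)."""
    return std_normal_cdf(-threshold + slope * x)

slope = 0.58      # coefficient quoted in the question
threshold = 0.0   # hypothetical value, not reported in the thread

p0 = probit_prob(threshold, slope, 0)  # probability when X = 0
p1 = probit_prob(threshold, slope, 1)  # probability when X = 1

print(f"P(M=1 | X=0) = {p0:.3f}")
print(f"P(M=1 | X=1) = {p1:.3f}")
print(f"difference   = {p1 - p0:.3f}")
```

With this placeholder threshold the probability rises by roughly 0.22 when X goes from 0 to 1, not by 0.58, and the size of the change would differ for another threshold or another starting value of X.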

Linda posted on Thursday, January 28, 2010  1:21 am



Thank you for pointing me to helpful resources. So it looks like the coefficients can be discussed in terms of either z-scores or the probit index. To estimate probabilities, I will have to follow the equation shown on page 406 in the user's manual. I had one more question. Do you by any chance know of a published paper on mediation that describes the use of Mplus and shows probit regression coefficients? I am writing a paper, and I thought it would be helpful to see one example.


I'm afraid I don't know of such a paper. 


Dear Drs. Muthen, I'm conducting a linear latent growth model. I have 7 independent variables (4 of them are latent). The influence of these 7 variables on i and s is supposed to be mediated by 6 binary variables. The model does not converge. Is there a problem with the latent independent variables? If I use only the observed independent variables there is no problem (using MLR), but the probit model results in a non-positive definite covariance matrix. Further, I'm trying to estimate indirect effects. But this is only possible in conjunction with WLSMV estimation, isn't it? So what would you suggest? 1.) Calculating factor scores of the independent latent variables, and then using them as observed variables? 2.) Is there a way to calculate indirect effects with logits? Best regards, Christoph Weber


If you have not followed the following steps in your analysis, I would do so: 1. Find a well-fitting model for the latent covariates. 2. Add the observed covariates. 3. Find a well-fitting growth model. 4. Estimate a model with the growth model and covariates. 5. Add the mediators, perhaps not all at the same time. These steps will allow you to find problems in your model. Indirect effects are possible with both weighted least squares and maximum likelihood. In Mplus, they are currently available only for weighted least squares.

csulliva posted on Wednesday, November 10, 2010  1:38 pm



I have a multiple mediation model where a left-censored dependent variable is being regressed on a dichotomous predictor. We then have a set of 45 regressors that we believe might confound that relationship, which are then entered into the equation as mediators (some of which are dichotomous). It seems this analysis requires algorithm=integration and I am getting error messages that suggest that MODEL INDIRECT cannot be used in such situations. Is it possible to get indirect/total effects for these relationships?


In Mplus dichotomous mediators can be used in indirect effects only with weighted least squares estimation. You can have censored and categorical variables with weighted least squares estimation. 

csulliva posted on Wednesday, November 10, 2010  2:48 pm



Thank you very much. I ran the model with WLS, but got a nonconvergence, maximum iterations exceeded (with various settings for iterations) message. Is there an optimal "iterations=" setting for this type of model? 


The default is usually sufficient. You would need to send the output and your license number to support@statmodel.com for further help. 


The model specified below attempts to look at whether the association between black and wounds (both categorical) is mediated by 3 categorical (i.e., modtheft, modviol, weapon) and 2 continuous (i.e., Zsociodis, Zsupervision) variables. Almost all of the variables are significant predictors of wounds in this model, which is very different from the results I get just running a simple probit regression trying to predict wounds. In that simple model, only a few variables are significant. I reran the model including residual covariances among all of the mediators. This greatly changed the model, with many of the variables predicting wounds under the first ON statement being reduced to nonsignificance. Can you provide guidance on what is going on here and what the most appropriate specification would be? Thanks. CATEGORICAL IS wounds modtheft modviol weapon; Missing = all (999); Analysis: type=missing; Bootstrap= 500; MODEL: wounds on black modtheft modviol weapon Zsociodis Zsupervision; modtheft on black; modviol on black; weapon on black; Zsociodis on black; Zsupervision on black; MODEL INDIRECT: wounds IND modtheft black; MODEL INDIRECT: wounds IND modviol black; MODEL INDIRECT: wounds IND weapon black; MODEL INDIRECT: wounds IND Zsociodis black; MODEL INDIRECT: wounds IND Zsupervision black;


You certainly want to correlate the residuals for the mediators because it is clear that besides "black" the mediators share many other predictors, in which case the residuals will be correlated. The model without the correlations will be misspecified and the estimates not trustworthy.


Hi I have a simple mediation problem by now. x (dichotomous, intervention var) > m (continuous mediator) > y (continuous dependent variable) And I just want to include the interaction between x and m in the model. How can I do this in mplus? Just put the product of the two in the model? Thanks 


An interaction between two observed variables is the product of the two variables. You can create this using the DEFINE command. 


Hi, I'm quite new to SEM and am wondering if it's possible to test a dual mediation model with only dichotomous variables. I have a sample of about 340 participants, one dependent variable, one independent variable, two control variables, and two hypothesized mediators, all of which are dichotomous variables. At the moment I'm using logistic regression but I think SEM would be a better test of the mediation model. Thanks in advance for any feedback!


In Mplus, your model can currently be estimated using probit regression and weighted least squares estimation where MODEL INDIRECT is available. There will be some additions to this in the fairly near future. 


Hello, I have multiple mediators, some of which are continuous, some of which are ordinal and one of which is dichotomous. As I understand it, the ordinal variable can, with certain assumptions, be used in the same way as the continuous mediators. However, when I want to control my binary mediator for the same covariates as the other mediators, I encounter problems. The fact that I use multiply imputed data limits my possibilities (e.g. I cannot use bootstrapped standard errors). What is left, I think, is the option of a different parameterization and estimation method, but if I, say, choose a probit link and WLS estimation, I can only do so for the entire model, effectively specifying this as the method to follow for all other mediators as well, right? Any ideas on how to solve this problem? Additionally, I have a question on testing contrasts in ordinal mediators. I have tested a 3-category ordinal mediator as continuous and through dummies, but my interest is a difference test in mediation for categories 1 and 3. Is this possible in Mplus? Thanks in advance! Viktor


You can try Estimator = Bayes, which like bootstrapping allows for a nonnormal distribution for the indirect effects and which like WLSMV allows for latent response variable mediators suitable for binary and ordinal variables. I am unclear about the situation you describe in your last paragraph. What kind of application raises this question and do you mean that two different dichotomizations of the ordinal mediator give different indirect effect results?


Thanks for your prompt reply, Dr. Muthen. The question in the last paragraph concerns the test of specific contrasts in the mediation through an ordinal variable. The three categories represent having no, some and many foreign-born friends and it significantly mediates the effect of my x on my y, but I am curious whether the specific difference between 'many' and 'some', or 'many' and 'none' is significant. Since I can only use one dummy if I specify the variable as categorical, the reference group always contains two categories, effectively blocking my attempt to test specific contrasts. Is there a way around this? Thanks in advance!


That sounds like it is not m*, the continuous latent response variable behind the ordinal m, that is the relevant mediator. It sounds like somehow m is the mediator you are interested in. But then you have to ask yourself how the regression of the outcome y on the mediator m should be formulated: how should m be treated in this regression? Should it be treated as continuous? If you consider it ordinal, how do you specify a regression on an ordinal predictor m without introducing m*?


Dear dr. Muthen, Thanks for the reply. Indeed, I wouldn't know the answer either, but I was hoping there was a way to do it anyway =). As for the estimator=bayes, how would this affect the interpretation of regression coefficients? Could I still interpret the coefficients for continuous variables as the unit increase of y when x increases by one and is a probit link used for the categorical variables? How would I interpret the coefficients for the indirect effects when one of the effects (say m on x) is estimated with probit while y on m is not? Are any additional tests, tricks or words of caution needed compared to 'regular' ML estimation? Thanks in advance! 


Bayes does not change the interpretation of the estimates. You ask about a model with binary mediator. Using Bayes with Mediator=Latent, or using WLSMV, you have two linear regressions (m* on x and y on m*, where m* is the continuous latent response variable behind m) and therefore the usual product indirect effect is fine. New approaches can also deal with the observed binary m being the mediator (more on this in a forthcoming paper), in which case ML is also available in addition to Bayes with Mediator=Observed. 


Dear Dr. Muthen, Many thanks for your reply. The estimation with Bayes worked for my full model, but when I tried to run a multigroup version, I got the message that multigroup analysis is not supported with Bayesian estimation. I tried fiddling around with mixture modelling and knownclass, but I have no experience with these kinds of models and keep getting errors. My approach was the following, given that the observed dummy for group membership is g (only relevant syntax shown): variable: categorical= m1 m2 classes= cg(2); knownclass= cg (g=0 g=1); analysis: type=mixture; estimator=bayes; model: %overall% m1 on x; m2 on x; etc. m1 on [covariates];..etc. y on m1; y on m2;(etc.) y on x; y on [covariates]; m1-m5 with m1-m5 %cg#1% [all the commands above repeated, with unique labels and constraints for indirect effects] %cg#2% [idem with different labels] The error I get reads: "Variances for categorical outcomes can only be specified using PARAMETERIZATION=THETA with estimators WLS, WLSM, or WLSMV.". However, I was under the impression that Bayesian estimation could deal with this and that I still have to specify categorical variables. Is my mixture approach ok and what should I do about the error? Any help would be much appreciated.


Please send the output and your license number to support@statmodel.com. 


Dear Dr. Muthen, I am running a multiple mediation model with: 3 dichotomous, 1 ordinal, and 1 continuous IV; 3 continuous M's; 1 dichotomous DV (all variables are observed). I used MLR (I am not quite sure when to use WLSMV instead) and TYPE=MISSING. However, I got an error message stating that this model can only be used in combination with Monte Carlo integration. I added INTEGRATION=MONTECARLO to the Analysis statement and the model seems to work now. My question is: is it correct that Monte Carlo integration should be used here or did I do something wrong? What does it mean for my results? I am not familiar with Monte Carlo integration yet. My second question concerns computing indirect effects for dichotomous outcomes. MODEL INDIRECT does not seem to work with the integration command. Can I use MODEL CONSTRAINT and compute the product of the IV-M and M-DV effects, or do I need a different approach with a dichotomous DV? Thank you very much!


INTEGRATION=MONTECARLO is a type of numerical integration not to be confused with a Monte Carlo simulation study. It is required when there is missing data on the mediator. You can use MODEL CONSTRAINT to create the indirect effect as a product. You may want to see the following paper which is available on the website for new thoughts on mediation modeling: Muthén, B. (2011). Applications of causally defined direct and indirect effects in mediation analysis using SEM in Mplus. 
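The product form of the indirect effect mentioned above can be sketched outside Mplus as follows. This is a minimal Python illustration, not what MODEL CONSTRAINT does internally: the coefficient values are hypothetical, and the standard error uses the first-order delta-method (Sobel) approximation, which assumes the two estimates are uncorrelated and treats the product as approximately normal.

```python
import math

def indirect_effect(a, se_a, b, se_b):
    """Product-of-coefficients indirect effect with a first-order
    delta-method (Sobel) standard error, assuming a and b are uncorrelated."""
    estimate = a * b
    se = math.sqrt(a**2 * se_b**2 + b**2 * se_a**2)
    return estimate, se

# Hypothetical values: a is the IV -> M slope, b is the M -> DV slope.
a, se_a = 0.40, 0.10
b, se_b = 0.50, 0.20

est, se = indirect_effect(a, se_a, b, se_b)
z = est / se

print(f"indirect effect = {est:.3f}, SE = {se:.4f}, z = {z:.2f}")
```

Because the sampling distribution of a product is generally not normal, a normal-theory z test like this one is only a rough guide.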

Jo Brown posted on Wednesday, October 03, 2012  6:11 am



Dear Drs Muthen, I am interested in running a simple mediation analysis using structural equation model. model: Y on M; Y on X; M on X; MODEL INDIRECT: Y IND X; Y and M are latent variables but my predictor (X) is a 'yes/no' item. I am under the impression that in order to use SEM all variables in the model should be latent variables. If this is the case, what alternative options do I have? I thought about using path analysis but the total scores for Y and M are skewed... Many thanks, Joe 


It is not true that all variables need to be latent. You can handle skewed variables by using the MLR estimation procedure. 

Jo Brown posted on Thursday, October 04, 2012  1:53 am



Thanks Bengt, so I have two options: 1. Use latent scores for my M and DV while using my 1-item variable as the IV. 2. Use sum scores for my M and DV, and again my 1-item IV, while specifying MLR to deal with skewness. Correct? Thanks again


Use 1. If your factor indicators are nonnormally distributed, use MLR. 


Hi there, I'm looking at a mediation analysis and have a categorical IV, a categorical mediator, and a categorical DV. I'm having difficulty interpreting the direct and indirect effects, however, because the output doesn't indicate whether the regressions are probit, logit, OLS, etc. Is there a way to tell what is happening? 


If you are using WLSMV, the regression coefficients are probit. 


Dear Mplus team, Below is what I hope represents a 2-1-1 model with outcome 'readm' (level 1), mediator 'educat' (level 1) and main explanatory variable 'patgroup' (level 2). These are all binary variables. If I add educat to the categorical command, I get an error. Can you propose a solution? BETWEEN = patgroup; WITHIN = smoking gender copd; CLUSTER is Hospital; categorical is readm; ANALYSIS: TYPE=TWOLEVEL; estimator=wlsmv; model: %WITHIN% readm on smoking gender copd; readm on educat (bw); %BETWEEN% educat ON patgroup (a); readm ON educat (bb); readm ON patgroup (cp); MODEL CONSTRAINT: NEW (direct indirect); indirect=a*(bw+bb); direct=(cp); output: TECH1 TECH8;


What error do you get? 

db40 posted on Wednesday, April 08, 2015  12:07 pm



Hi, I am wondering if Mplus is able to estimate multilevel categorical IVs within a mediation framework, for example using the most-likely-class-membership variable if it has 6 levels. If it isn't, is there a clever workaround in Mplus that can do this?


You can have a set of dummy covariates for class membership and put them into a multilevel mediation model. 


Dr. Muthen, I'm working with a longitudinal data set (4 time points) where the outcome at T4 is binary (employee turnover; yes or no). All other variables are continuous. One of the hypothesized models is: X > M1 > M2, M3, M4 > Y. Due to survey attrition (25%), I was thinking of modeling the longitudinal relationships with MLR (Monte Carlo integration). I have 4 dimensions of integration (i.e., 4 latent variables predicting Y, i.e., M1-M4) in one of my models. Could I use MODEL INDIRECT to evaluate the indirect effects of X on Y through M1-M4? Is MLR the best method to begin with, or would you recommend another approach (i.e., MLR probit, Bayes)? I've read extensively on the topic, but I'm still unsure as to the best way forward (WLSMV doesn't seem appropriate given the missing data). Any help would be greatly appreciated! AnnRenee


I think you have Model Indirect available to you here also; otherwise define the effects in Model Constraint. But you have a binary Y which calls for the new counterfactually-defined "causal" effects. We don't offer that option with multiple mediators. If you are interested in this perhaps you can use VanderWeele's approach for a rare outcome.


Thank you, Dr. Muthen! When I run Model Indirect (e.g., Y IND X), I get the indirect effects of X on Y via all of the mediators, as one would expect. However, are these results meaningful given my Y is a binary DV? How should I interpret these results in this context? Should I consider moving to WLSMV on imputed data? 


Read the Valeri-VanderWeele (2013) article in Psych Methods on mediation with a binary Y. There you see the rare outcome approach which can be done in Mplus. Stay with ML for this.

Marco R posted on Tuesday, March 08, 2016  9:52 am



Hi, I would like to conduct a mediated moderation with 2 moderators (both dichotomous), one mediator (dichotomous), and a continuous outcome. I'm trying to use the following syntax: VARIABLE: NAMES ARE X W M Y; USEVARIABLES X W M Y XW; CATEGORICAL are M; DEFINE: XW = X*W; ANALYSIS: BOOTSTRAP = 5000; Model: Y ON M; Y ON X; Y ON W; Y ON XW; M ON X; M ON W; M ON XW; MODEL INDIRECT: Y via M XW; OUTPUT: cinterval(bootstrap); a) Is this the appropriate syntax for this mediated moderation? b)Does it account for the fact that the mediator is dichotomous? c)Also, how can I then decompose this mediated moderation and test indirect effects for each combination of moderator values? Thank you very much in advance! 


a) No, Mplus doesn't know that XW is the product of X and W. Instead, look at the MOD option of the Model Indirect command. b) Mplus would use the proper counterfactually-defined effects if you use MOD. c) Mplus offers moderator plots. See the document on our Mediation Page (left column of home page): Moderated mediation plot based on User's Guide ex 3.18 using the Version 7.2 MODEL INDIRECT language

Roy Stewart posted on Thursday, April 09, 2020  2:58 pm



Dear Dr. Muthén, In various mediation analyses with a binary outcome and continuous or binary mediators, the Mplus output reports a p-value of 0.023 for an estimate in the Model Results section, while the corresponding odds ratio has a p-value of 0.126. Does anyone have an explanation or literature for me? Many thanks in advance


With odds ratios you don't want to use p-values but instead request a confidence interval. This properly takes into account the fact that odds ratios have a nonsymmetric sampling distribution. The p-value assumes a symmetric distribution.
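The asymmetry can be illustrated with a minimal Python sketch. It assumes the interval is formed on the logit scale and then exponentiated, which is a common way to build an odds ratio interval; the coefficient and standard error below are hypothetical and are not taken from the output discussed here.

```python
import math

def odds_ratio_ci(logit_coef, se, z_crit=1.96):
    """Odds ratio with a confidence interval built on the logit scale
    and then exponentiated (assumed construction)."""
    lower = math.exp(logit_coef - z_crit * se)
    upper = math.exp(logit_coef + z_crit * se)
    return math.exp(logit_coef), lower, upper

# Hypothetical logit coefficient and standard error.
b, se = 0.50, 0.22

or_hat, ci_low, ci_high = odds_ratio_ci(b, se)

print(f"odds ratio = {or_hat:.3f}")
print(f"95% CI     = ({ci_low:.3f}, {ci_high:.3f})")
print(f"distance below = {or_hat - ci_low:.3f}, distance above = {ci_high - or_hat:.3f}")
```

The interval reaches further above the odds ratio than below it, which is the asymmetry that a p-value based on a symmetric distribution ignores. In this example the interval excludes 1.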
