Categorical endogenous and exogenous ...
Mplus Discussion > Structural Equation Modeling >
 Hadas Hawlena posted on Friday, April 20, 2012 - 5:48 am
Dear Drs. Muthen,
I have never used Mplus. Before I buy it and learn how to use it, I want to make sure that it can run the SEM analyses I need. I have a sample of 143 sites, sampled under different environmental conditions and at different localities, and at each site I have measured the abundance of three species. I wish to explore the direct and indirect effects on the relative abundance of the three species by testing the relations between these three species and ~14 different variables, and among the 14 variables themselves. Some of these variables are dichotomous, some ordinal, and some continuous.
Here are the questions:
1. If I consider the presence/absence of the three species, some of my endogenous variables become dichotomous. Is there a way to create a model that includes both endogenous and exogenous dichotomous variables together with continuous ones?
2. If I consider the relative abundance of the species, most of the sites do not have the species and only a few have many, so the distribution of the endogenous variables is negative binomial. Is there a way to account for the departure from normality in this case?
3. My main aim is to find the best model among a few candidates. Can I calculate AIC values for the models described in my previous questions?
Thanks in advance, Hadas
 Linda K. Muthen posted on Friday, April 20, 2012 - 8:05 am
It sounds like you have a mediation model with binary, continuous, and count dependent variables. This is all possible in Mplus. See the following paper which is available on the website:

Muthén, B. (2011). Applications of causally defined direct and indirect effects in mediation analysis using SEM in Mplus.
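
As a rough illustration of how a count outcome might be set up (a sketch only: the file name sites.dat and the variable names abund, x1 and x2 are hypothetical, not taken from the poster's data), the COUNT option declares the dependent variable as a count, and (nb) requests a negative binomial model. With maximum likelihood estimation, AIC and BIC are printed in the model fit information, so candidate models can be compared on them.

```
TITLE:    Sketch: negative binomial regression for a count outcome
DATA:     FILE = sites.dat;        ! hypothetical file name
VARIABLE:
  NAMES = abund x1 x2;             ! hypothetical variable names
  USEVARIABLES = abund x1 x2;
  COUNT = abund (nb);              ! (nb) = negative binomial
ANALYSIS:
  ESTIMATOR = ML;                  ! ML-based fit output includes AIC and BIC
MODEL:
  abund ON x1 x2;                  ! covariates stay on the right-hand side
```

Binary presence/absence outcomes would instead go on the CATEGORICAL list; a full mediation model with several outcomes needs more than this single regression shows.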
 Hadas Hawlena posted on Monday, April 30, 2012 - 10:13 am
Dear Dr. Muthen
Thanks for your super-fast and helpful answer. I have read through the article; however, I didn’t fully understand how one can use a nominal independent variable in the model. In particular, I didn’t understand how latent variables are used. I managed to run the model when I defined my independent variable as categorical, but not as nominal. For example, if x->y, where x is nominal with three categories and y is binary categorical, how do I formulate this in the Mplus script?
Thanks in advance, Hadas
 Linda K. Muthen posted on Monday, April 30, 2012 - 1:58 pm
If the variable x is an observed exogenous variable, you would create two dummy variables to represent the three categories.
 Hadas Hawlena posted on Tuesday, May 01, 2012 - 2:29 am
Do you mean that after I create the two dummy variables, I define these two new variables as categorical and run the analyses as I would with categorical variables?
So, if I had just 2 categories (for example, female-0 and male-1), then I would just treat the variable as categorical?
Thanks, Hadas
 Linda K. Muthen posted on Tuesday, May 01, 2012 - 6:46 am
The CATEGORICAL option is for dependent variables only. The dummy variables are independent variables. You simply use them on the right-hand side of an ON statement. In regression, independent variables can be binary or continuous. In both cases, they are treated as continuous.
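
A minimal sketch of this setup, assuming a hypothetical data file and a nominal x coded 1, 2, 3 with category 1 as the reference; the names d1 and d2 are illustrative only. The dummy variables are built in DEFINE, added at the end of the USEVARIABLES list, and used only on the right-hand side of ON. Only y appears on the CATEGORICAL list.

```
TITLE:    Sketch: nominal predictor entered as two dummy variables
DATA:     FILE = mydata.dat;       ! hypothetical file name
VARIABLE:
  NAMES = y x;                     ! y binary (0/1), x nominal (1, 2, 3)
  USEVARIABLES = y d1 d2;          ! DEFINE-created variables go last
  CATEGORICAL = y;                 ! dependent variable only
DEFINE:
  d1 = 0;
  d2 = 0;
  IF (x EQ 2) THEN d1 = 1;         ! category 2 vs. reference category 1
  IF (x EQ 3) THEN d2 = 1;         ! category 3 vs. reference category 1
ANALYSIS:
  ESTIMATOR = ML;                  ! logistic regression for binary y
MODEL:
  y ON d1 d2;
```

With this coding, the slope for d1 compares category 2 with category 1, and the slope for d2 compares category 3 with category 1, so each has its own odds ratio. Cases with missing x would need separate handling, which this sketch does not show.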
 Hadas Hawlena posted on Wednesday, May 02, 2012 - 11:26 am
When I use the dummy variables as independent variables, I get two odds ratios, one for the effect of each new variable on the dependent variable. Should I average them?

Also, what should I do if the mediator is nominal? For example, x->m->y, where x is categorical, m is nominal with three categories, and y is binary categorical. It was hard for me to follow the relevant example in the article. Should it be handled the same way?

Thank you, Hadas
 Linda K. Muthen posted on Thursday, May 03, 2012 - 11:46 am
You should not average the two odds ratios. They are for different variables.

If you want to use a nominal mediator as described in the article, you may need to get assistance if it was hard for you to follow.
 Hadas Hawlena posted on Thursday, May 03, 2012 - 10:11 pm
Thanks for all the information. Hadas
 Tracy Witte posted on Wednesday, March 20, 2013 - 6:12 am
I have noticed that with version 7, Mplus lets me specify binary predictor variables as categorical. In previous versions, I used to get an error message when I tried to do this, saying that the categorical descriptor only worked for dependent variables. Is this a new feature of version 7? If so, what is different about the modeling that allows this to be done?
 Linda K. Muthen posted on Wednesday, March 20, 2013 - 6:25 am
No, the CATEGORICAL option should be used only for dependent variables. Please send the output where the error message does not show up to
 Stephanie posted on Wednesday, January 15, 2014 - 2:36 am
I have a question about observed exogenous variables and about imputing missing values in such a variable.

I ran an SEM (WLSMV because of a binary dependent variable) with three observed exogenous variables, one of which is binary. This variable also has missing values, which I would like to impute.

To run the imputation and to avoid the deletion of cases, I additionally included a line with "x1 x2 x3;" in the MODEL command to mention the variances of all observed variables. All missing values are defined correctly. Additionally, I use theta parameterization.

My question is whether this is correct with a binary exogenous variable, since the unstandardized coefficients for these variables are rather high (e.g. 12.123), which seems implausible to me.

I thank you very much for your kind help!
 Stephanie posted on Wednesday, January 15, 2014 - 2:55 am
I am sorry! I forgot to mention that I included "x1 x2 x3;" only for the observed exogenous variables in the model that have missing values (NOT for all observed exogenous variables). One of them is the binary exogenous variable mentioned above. The other two variables are observed and continuous and also have missing values.

Maybe I should also mention that my model includes five observed exogenous variables in total. But I have only mentioned those with missing values in the "x1 x2 x3;" statement.

I apologize for the inconvenience!
 Linda K. Muthen posted on Wednesday, January 15, 2014 - 6:11 am
You should not include observed exogenous variable variances in the MODEL command when using weighted least squares estimation. This is appropriate only for maximum likelihood estimation. When you do this, you should include variances for all observed exogenous variables not a subset of them.
 Stephanie posted on Wednesday, January 15, 2014 - 7:09 am
Thank you very much for your prompt answer.

Did I understand correctly that, when using WLSMV, I should include the variances of all five exogenous variables in the MODEL command (-> "x1 x2 x3 x4 x5;") to have the missing values imputed?

I have just tried this, but unfortunately I got the following warning three times:


Three variables are dichotomous.

How could I solve this problem?
 Linda K. Muthen posted on Wednesday, January 15, 2014 - 8:07 am
No, for WLSMV you should never include the variances of the observed exogenous variables.

For maximum likelihood, if you include them, you should include all of them not a subset.
 Stephanie posted on Wednesday, January 15, 2014 - 11:36 am
Thank you once more for your answer. May I ask whether there is a way to avoid the deletion of cases when I would like to impute missing values in observed exogenous variables with WLSMV?
 Linda K. Muthen posted on Wednesday, January 15, 2014 - 2:16 pm
You would need to use DATA IMPUTATION. See the user's guide, particularly Example 11.5.
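
A two-step sketch of how this might look, assuming hypothetical file names, hypothetical variable names (y, x1-x5), a simplified regression in place of the full SEM, and -999 as the missing-value code. The first input file only generates the imputed data sets; (c) marks the binary x1 as categorical for the imputation.

```
TITLE:    Step 1 (sketch): generate imputed data sets
DATA:     FILE = mydata.dat;       ! hypothetical file name
VARIABLE:
  NAMES = y x1-x5;                 ! hypothetical variable names
  MISSING = ALL (-999);            ! assumed missing-value code
DATA IMPUTATION:
  IMPUTE = x1 (c) x2 x3;           ! (c) = treat x1 as categorical
  NDATASETS = 10;
  SAVE = imp*.dat;                 ! also writes the list file implist.dat
ANALYSIS:
  TYPE = BASIC;
```

The second input file then analyzes the saved data sets with WLSMV, with no exogenous variances mentioned in the MODEL command.

```
TITLE:    Step 2 (sketch): WLSMV analysis of the imputed data sets
DATA:     FILE = implist.dat;      ! list file written in step 1
          TYPE = IMPUTATION;
VARIABLE:
  NAMES = y x1-x5;                 ! must match the saved column order
  CATEGORICAL = y;
ANALYSIS:
  ESTIMATOR = WLSMV;
  PARAMETERIZATION = THETA;
MODEL:
  y ON x1-x5;                      ! no variances of x1-x5 mentioned
```

The NAMES list in step 2 should be checked against the variable order reported in the step 1 output before it is relied on.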