Missing Data with Categorical Variables
 learningmplus posted on Thursday, June 09, 2011 - 6:35 pm

How can I handle missing data on dichotomous variables in Mplus? Some of the variables are independent (e.g., marital status: intact vs. non-intact) and others are dependent (presence/absence of a disorder).

I used the A-B*; specification, which worked for continuous outcomes: all observations were used. However, when the independent and dependent variables are dichotomous, the cases with missing data are deleted. Is there a way to handle those cases so that I can use the full dataset? Thank you!
 Linda K. Muthen posted on Thursday, June 09, 2011 - 7:11 pm
Regression models are estimated conditioned on the independent variables. If you don't want the observations with missing on these variables deleted, mention the variances of the independent variables in the MODEL command. They will then be treated as dependent variables and distributional assumptions will be made about them.
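
A minimal sketch of what this might look like in an input file. The file name, variable names (y, x1, c1), and missing-value code are hypothetical, and y is taken to be continuous here; this is an illustration of the idea, not a tested setup for any particular dataset.

```mplus
TITLE:    Regression with missing data on covariates (sketch);
DATA:     FILE = mydata.dat;       ! hypothetical file name
VARIABLE: NAMES = y x1 c1;         ! hypothetical names; c1 is a 0/1 dummy
          USEVARIABLES = y x1 c1;
          MISSING = ALL (-999);    ! assumed missing-value code
ANALYSIS: ESTIMATOR = ML;
MODEL:    y ON x1 c1;              ! the regression of interest
          x1 c1;                   ! mentioning the variances brings x1 and c1
                                   ! into the model, so cases with missing
                                   ! values on them are not deleted
```

Without the last MODEL line, the model is estimated conditional on x1 and c1, and observations with missing values on either covariate are dropped.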
 Daniel Schaefer posted on Wednesday, September 09, 2015 - 7:06 am

I would like to conduct a multiple regression analysis using FIML. Some of the independent variables are dummy variables with missing data.
When I use the [x1 x2 c1] specification, does Mplus assume that all of these variables are normally distributed, or does it take account of their dichotomous nature? If the former, is there any other way to deal with missing data on dummy independent variables when using FIML?

Best wishes
 Linda K. Muthen posted on Wednesday, September 09, 2015 - 9:26 am
If you bring the covariates into the model and do not estimate the model conditioned on the covariates, distributional assumptions are made about them. In regression, covariates can be binary or continuous. In both cases, they are treated as continuous.
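
The distinction can be sketched in generic notation. The symbols below are illustrative and are not taken from the Mplus documentation: y_i is the outcome, x_i the covariate vector, and theta the regression parameters. A model conditioned on the covariates uses only cases with complete x_i, while a model that includes the covariates adds a normal density for x_i and integrates over whatever part of x_i is missing.

```latex
% Conditional likelihood: x_i must be fully observed for case i to contribute
L_{\text{cond}}(\theta) = \prod_{i:\, x_i \text{ observed}} f(y_i \mid x_i;\, \theta)

% Joint likelihood: covariates are modeled, assuming x_i \sim N(\mu_x, \Sigma_x)
L_{\text{joint}}(\theta, \mu_x, \Sigma_x)
  = \prod_{i} \int f(y_i \mid x_i;\, \theta)\,
      f(x_i;\, \mu_x, \Sigma_x)\; dx_i^{\text{mis}}
```

For a case with no missing covariates the integral drops out and the contribution is simply f(y_i | x_i) f(x_i). The normality assumption on x_i is what treats a binary covariate as continuous.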
 Daniel Schaefer posted on Wednesday, September 09, 2015 - 11:57 am
Dear Prof. Muthen,

Many thanks for the quick response!

If I may ask a question for the purpose of clarification on that:
I have read that including dummy variables as independent variables in a multiple regression analysis is not very problematic under listwise deletion, but that the full-information maximum likelihood (FIML) procedure for dealing with missing data requires normally distributed variables. Is there a recommended method in Mplus for dealing with missing values on categorical independent variables? (Or is it OK if I just use the "[x1 c1];" specification under MODEL after "y on x1 c1;" to deal with missing values on the categorical variable c1?)

Best regards,
 Linda K. Muthen posted on Wednesday, September 09, 2015 - 12:18 pm
Go to Version History on the website and to Version 6.1. There is a discussion of this issue there.
 Daniel Schaefer posted on Thursday, September 10, 2015 - 3:29 am
Thank you.

When I have a multiple regression analysis with a categorical independent variable that has missing values, the "[x1 c1];" specification apparently leads to a FIML procedure that bases its estimation on the problematic assumption that the categorical variable is normally distributed.
Does that mean that I should instead conduct multiple imputation (in which the missing values of the categorical variable are estimated via a logistic regression)?
 Linda K. Muthen posted on Thursday, September 10, 2015 - 10:36 am
In some limited simulation studies that we have conducted, making this assumption seems okay. You can use multiple imputation and treat the variable as categorical, but there are limitations to the options available with multiple imputation. For example, the only absolute fit statistic available is for a model with all continuous dependent variables and maximum likelihood estimation.
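
A rough sketch of the imputation step under that approach. The file names, variable names, missing-value code, and number of imputed data sets are all hypothetical choices for illustration; the (c) after c1 asks for the variable to be treated as categorical during imputation.

```mplus
TITLE:    Multiple imputation with c1 treated as categorical (sketch);
DATA:     FILE = mydata.dat;       ! hypothetical file name
VARIABLE: NAMES = y x1 c1;         ! hypothetical names; c1 is a 0/1 dummy
          MISSING = ALL (-999);    ! assumed missing-value code
DATA IMPUTATION:
          IMPUTE = c1 (c) y x1;    ! (c) marks c1 as categorical
          NDATASETS = 20;          ! arbitrary choice for this sketch
          SAVE = imp*.dat;         ! one file per imputed data set
ANALYSIS: TYPE = BASIC;
```

The regression would then be run in a second input file that reads the list of imputed data sets with TYPE = IMPUTATION in the DATA command. With SAVE = imp*.dat the list file should be named implist.dat, but it is worth checking the output to confirm.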
 Daniel Schaefer posted on Thursday, September 10, 2015 - 12:55 pm
Thank you so much! That's very helpful.
 Dex  posted on Friday, October 21, 2016 - 8:45 am

I was wondering what mechanism Mplus uses to impute multivariate categorical data. Is the procedure in Mplus the same as the chained equations approach (MICE package in R)?

Thank you for your help.
 Bengt O. Muthen posted on Friday, October 21, 2016 - 12:56 pm
See the Technical report on our website at

Asparouhov, T. & Muthén, B. (2010). Multiple imputation with Mplus. Technical Report. Version 2.