Handling missing data: Bayesian vs. MLR
 Jieying Chen posted on Saturday, April 13, 2013 - 5:43 am
Dear Dr. Muthen,

I have a model in which A interacts with B to influence C, which in turn influences D. A, B, and C are continuous, and D is categorical. D has some missing data (4.5% of around 300 people), and the missingness does not seem to be random, because most of those who are missing on D have low values on either A or B (-1 SD or lower). The distribution of A is negatively skewed (skewness = -.626, S.D. = .142). A and B are moderately correlated (around .48).

Here are my questions.
(1) I ran a path analysis without a measurement model and found that Bayesian analysis gives different results from the analysis using MLR. Why?
(2) Which is the better way to handle missing data, Bayesian or MLR, and under what circumstances?
(3) Which Bayesian method am I using when I specify ESTIMATOR=BAYES in the model described above?

 Linda K. Muthen posted on Saturday, April 13, 2013 - 1:57 pm
1. This should not be the case. Please send the two outputs and your license number to support@statmodel.com.

2. They are the same.

3. A model with non-informative priors.

 Ads posted on Sunday, March 12, 2017 - 9:02 am
I have a path model where most endogenous variables are normally distributed, but there is one mediator that is not normally distributed (example is below). I would like to use a Bayesian model to obtain asymmetric confidence intervals for the mediator (the model is multilevel and thus cannot use bootstrap estimates). I wanted to ask about how missing data would be handled for the variable that is non-normal.

Example model:

Y1 ON X1 X2
Y2 ON X1 X2 Y1
Y3 ON X1 X2 Y1 Y2
MODEL INDIRECT: Y3 IND X1 X2 Y1;

Y2 is the variable that has a non-normal distribution. My question is:
1. If Y2 has a different distribution than the other variables (e.g., Dirichlet), is it assumed MAR based on X1, X2, and Y1?
2. Or is Y2 assumed MAR based on X1, X2, Y1, and also Y3 (i.e., all variables listed in the path model, including ones that are not predictors of Y2)?
3. Or is neither of these correct, and there is another answer?
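
As background on why an interval for an indirect effect tends to be asymmetric, here is a minimal Python sketch (not Mplus code). It assumes two path coefficients with independent normal posteriors, which is a simplification, and all numbers and the function name `product_interval` are hypothetical. It draws the two coefficients, multiplies them, and reads a 95% interval off the percentiles of the product.

```python
import random


def product_interval(a_mean, a_sd, b_mean, b_sd, draws=20000, seed=0):
    """Percentile interval for a*b, with a and b drawn as independent normals.

    Assumption for illustration only: in a fitted model the two
    coefficients would come from the joint posterior, not from
    independent normals.
    """
    rng = random.Random(seed)
    products = sorted(
        rng.gauss(a_mean, a_sd) * rng.gauss(b_mean, b_sd)
        for _ in range(draws)
    )
    lower = products[int(0.025 * draws)]
    upper = products[int(0.975 * draws) - 1]
    return lower, upper


if __name__ == "__main__":
    # Hypothetical coefficient values, not taken from any real analysis.
    lower, upper = product_interval(0.3, 0.1, 0.4, 0.15)
    point = 0.3 * 0.4
    print("lower arm:", point - lower)
    print("upper arm:", upper - point)
```

With both coefficients positive, the product is right-skewed, so the upper arm of the interval is longer than the lower arm. A symmetric normal-theory interval would miss this.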

 Bengt O. Muthen posted on Sunday, March 12, 2017 - 5:33 pm
Option 2 is correct. Also, normality is assumed for all variables conditional on the covariates X1, X2. Typically, ML under MAR has some robustness to missingness on non-normal variables.

 John C posted on Monday, October 21, 2019 - 5:56 pm
I am trying to estimate a two-level model with missing data on the (binary) outcome. I am trying to bring the distribution of the covariates into the model for estimation under MAR. Because some of these are also binary, I am using the Bayes estimator with the PREDICTOR=OBSERVED setting.

However, the latter setting is not available with TYPE=TWOLEVEL. Is there any other way I can account for the non-independence of observations?

Or, is the only other option to give a full Bayesian specification with two levels of priors?

 Bengt O. Muthen posted on Tuesday, October 22, 2019 - 5:05 pm
Sounds like you've studied chapters 9 and 10 of our RMA book. Perhaps the easiest route is the approximate approach of using multilevel ML estimation and ignoring that some of the covariates are binary (treating them as continuous).

Note also the missing data discussion on pages 446-447 of the RMA book, which shows that missingness on Y is not helped by bringing covariates into the model (doing so helps only with missingness on X for subjects with non-missing Y).
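
The point about missingness on Y can be illustrated with a small simulation. This is a sketch under simplifying assumptions, not the book's example: a single-level model, one continuous covariate x that is fully observed, a continuous (not binary) outcome y, and missingness on y that depends only on x. The function names and all numeric values are hypothetical. The code compares the complete-case regression slope with the slope from an EM fit of the joint normal model for (x, y), which brings x into the model and uses every case.

```python
import random


def simulate(n=2000, seed=1):
    """Simulate (x, y) pairs; y is set to None when missing (MAR given x)."""
    rng = random.Random(seed)
    data = []
    for _ in range(n):
        x = rng.gauss(0.0, 1.0)
        y = 0.5 + 0.8 * x + rng.gauss(0.0, 1.0)
        # Missingness depends only on the observed x: low x, more missing.
        p_miss = 0.6 if x < -0.5 else 0.1
        if rng.random() < p_miss:
            y = None
        data.append((x, y))
    return data


def complete_case_slope(data):
    """OLS slope of y on x using only cases with observed y."""
    obs = [(x, y) for x, y in data if y is not None]
    n = len(obs)
    mx = sum(x for x, _ in obs) / n
    my = sum(y for _, y in obs) / n
    sxy = sum((x - mx) * (y - my) for x, y in obs)
    sxx = sum((x - mx) ** 2 for x, _ in obs)
    return sxy / sxx


def em_slope(data, iters=500):
    """Slope implied by ML for the joint normal model of (x, y), via EM."""
    n = len(data)
    # x is complete, so its mean and variance are fixed across iterations.
    mu_x = sum(x for x, _ in data) / n
    s_xx = sum((x - mu_x) ** 2 for x, _ in data) / n
    # Deliberately crude starting values for the y-related parameters.
    mu_y, s_xy, s_yy = 0.0, 0.0, 1.0
    for _ in range(iters):
        b = s_xy / s_xx
        resid_var = s_yy - b * s_xy
        sum_y = sum_xy = sum_yy = 0.0
        for x, y in data:
            if y is None:
                # E-step: conditional mean and second moment of y given x.
                ey = mu_y + b * (x - mu_x)
                eyy = ey * ey + resid_var
            else:
                ey, eyy = y, y * y
            sum_y += ey
            sum_xy += x * ey
            sum_yy += eyy
        # M-step: update moments from the expected sufficient statistics.
        mu_y = sum_y / n
        s_xy = sum_xy / n - mu_x * mu_y
        s_yy = sum_yy / n - mu_y ** 2
    return s_xy / s_xx


if __name__ == "__main__":
    data = simulate()
    print("complete-case slope:", complete_case_slope(data))
    print("joint-model EM slope:", em_slope(data))
```

In this setup the two slopes agree up to numerical tolerance: cases with missing y carry no extra information about the regression of y on x, so modeling the distribution of x does not change that estimate.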