

Propensity of missingness 


KS posted on Saturday, April 30, 2011  5:13 pm



I have a question about a missing-data experiment using Monte Carlo simulation in Mplus. The experiment uses longitudinal data with 4 time points, Y1-Y4, and 1 dichotomous variable X. I would like to generate MAR missing data on Y2, Y3 and Y4, where the cause of missingness is Y1. More specifically, I would like to control the propensity of missingness, measured by the correlation between the missingness indicator (R = 1 if missing, R = 0 otherwise) and the value of Y1, the cause of missingness. Is it possible to control the value of this correlation by adjusting the values of alpha or beta in the MODEL MISSING command?


Example 12.2 illustrates this. 

KS posted on Tuesday, May 03, 2011  11:15 am



Thank you for your suggestion. I have looked at Example 12.2. However, my MAR experiment differs from that example: my Y2 depends on the baseline value (Y1), which is not a time-invariant covariate. Here is my code.

MODEL MISSING:
[Y2@alpha];
Y2 ON Y1*beta;

Question 1: If Y1 has a normal distribution (mean = 50, variance = 100), can I estimate the total % of missing data on Y2 in my data set by substituting the mean of Y1 into this logistic model?

Probability of missing data = 1 / (1 + exp(-(alpha + beta * Y1)))

How accurate would my estimate be?

Question 2: How should I adjust the values of alpha and beta to increase the propensity of missingness while keeping the probability of missing data the same?


Yes, you have to modify ex 12.2 a little bit as you suggest.

1. Because the function is nonlinear, it is not the case that mean P = logistic(alpha + beta * mean Y1), where logistic is the inverse of the logit. A straightforward, although approximate, approach is trial and error using a large data set.

2. Same answer: trial and error.

Don't forget to say MISSING = y2-y4 in the MONTECARLO command.
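To make point 1 concrete, here is a minimal Python sketch (outside Mplus) that compares the logistic function evaluated at the mean of Y1 with the average missingness probability over a large simulated sample. It assumes Y1 ~ N(50, 100) as in the question; the alpha and beta values, the seed, and the helper name missing_probability are hypothetical and chosen only for illustration.

```python
import numpy as np


def missing_probability(y1, alpha, beta):
    """Logistic model for P(Y2 missing | Y1) with intercept alpha and slope beta."""
    return 1.0 / (1.0 + np.exp(-(alpha + beta * y1)))


# Assumption: Y1 ~ N(mean = 50, variance = 100), i.e. standard deviation 10.
rng = np.random.default_rng(2011)
y1 = rng.normal(loc=50.0, scale=10.0, size=1_000_000)

# Hypothetical coefficients, not taken from the thread.
alpha, beta = -10.0, 0.18

# Plug-in estimate: the logistic function evaluated at the mean of Y1.
p_at_mean = float(missing_probability(50.0, alpha, beta))

# Large-sample estimate of the overall missing rate: average over the Y1 distribution.
p_average = float(missing_probability(y1, alpha, beta).mean())

print(f"logistic at mean Y1 : {p_at_mean:.4f}")
print(f"average probability : {p_average:.4f}")
```

With these particular values the plug-in number understates the overall rate, because averaging over Y1 pulls the mean probability toward 0.5. The size of the gap depends on beta and on the spread of Y1, which is why trial and error on a large data set is suggested.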

KS posted on Thursday, May 19, 2011  11:25 am



Thank you for your suggestions. I would like to have your comments on this experiment. I am studying the propensity of missingness, which will be measured by the Pearson correlation coefficient between the logit, log(P/(1-P)), and the variable that causes the missingness, Z. When the correlation coefficient between the logit and Z is specified, the parameter beta can be determined for a given missing percentage P:

beta = Corr(logit(P), Z) / (sqrt(P(1-P)) * Sigma(Z)) (approximately).

After getting the value of beta, I run the experiment in Mplus to see the percentage of missing data actually obtained (P*) for different values of alpha. The value of alpha that gives a missing percentage P* equal to the specified value P is the one I will use in further study. I would like to know if this is a plausible approach to finding alpha and beta for the Mplus MODEL MISSING command when I want to fix both the propensity of missingness and the percentage of missing data. Thank you!


I haven't seen that approach and can't really comment on it. I tend to think of alpha as having to do with the degree of MCAR missingness (which you can pick any realistic level of) and beta the degree of MAR missingness.
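The trial-and-error search for alpha discussed in this thread can be sketched in Python as follows. This is only an illustration under stated assumptions, not an Mplus feature: bisection stands in for manual trial and error, Y1 is drawn from N(50, 100) as in the earlier question, and the target rate, the two beta values, and the function names are hypothetical. For each beta the sketch finds the alpha that gives the target missing rate, then draws the indicator R and reports its correlation with Y1.

```python
import numpy as np


def expected_missing_rate(alpha, beta, y1):
    """Average of the logistic missingness probabilities over the sample y1."""
    return float(np.mean(1.0 / (1.0 + np.exp(-(alpha + beta * y1)))))


def calibrate_alpha(beta, target, y1, lo=-100.0, hi=100.0, tol=1e-6):
    """Bisection on alpha; the missing rate is increasing in alpha for fixed beta."""
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        if expected_missing_rate(mid, beta, y1) < target:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)


# Assumption: Y1 ~ N(50, 100); the target rate and betas are hypothetical.
rng = np.random.default_rng(0)
y1 = rng.normal(50.0, 10.0, size=200_000)
target = 0.30

results = {}
for beta in (0.05, 0.15):
    alpha = calibrate_alpha(beta, target, y1)
    p = 1.0 / (1.0 + np.exp(-(alpha + beta * y1)))
    r = (rng.random(y1.size) < p).astype(float)  # R = 1 if Y2 is missing
    corr = float(np.corrcoef(r, y1)[0, 1])       # correlation of R with Y1
    results[beta] = (alpha, float(r.mean()), corr)
    print(f"beta={beta:.2f} alpha={alpha:.3f} rate={r.mean():.3f} corr={corr:.3f}")
```

In this sketch the larger beta gives a stronger correlation between R and Y1 at the same overall missing rate, with alpha absorbing the change. The resulting alpha and beta would still need to be checked in an actual Mplus run.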


