Mplus Discussion >> Missingness partly due to non-observed variables


ResNL posted on Monday, August 20, 2012 - 3:37 am

Dear Drs Muthen,

I have a problem with my data for which I haven't been able to find a satisfactory solution so far. The data come from people who filled out a questionnaire at 8 time points. The main variable in my analysis is the sum score per time point. Some of the data are missing. Part of the missingness depends on the sum scores at earlier time points: the higher a subject's scores at earlier time points, the higher the probability of dropping out at later time points. This part is MAR. However, the missingness probably also depends on variables that were not measured. How do I deal with both the MAR part and the part due to non-observed variables? Thanks for your help.

Linda K. Muthen posted on Monday, August 20, 2012 - 9:04 am

I think the best thing would be to treat the missing data as MAR. If you think there are other variables in your data set that may be related to missingness, you could consider the AUXILIARY (m) option.

ResNL posted on Monday, August 20, 2012 - 12:03 pm

Thanks Linda. If I consider all missingness to be MAR, and the missingness is assumed to be caused either by variables that are part of the model or by variables that were not measured, am I correct that no model for the missingness can or should be specified? Should I just use listwise deletion, or am I mistaken? Thanks again for your advice.

Bengt O. Muthen posted on Monday, August 20, 2012 - 2:22 pm

Even if you don't measure all variables that might influence missingness, it is likely that estimation under MAR does better than listwise deletion, even when the MAR assumptions are not fully met. See for instance
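The contrast between listwise deletion and estimation under MAR can be illustrated with a small simulation. This is only a sketch under stated assumptions, not the Mplus algorithm: there are two time points instead of eight, the scores are bivariate normal, and all numeric settings (sample size, correlation of 0.7, dropout slope of 1.5, seed) are made up for illustration. In this two-variable case with dropout at the second time point only, the ML estimate of the second mean under MAR has a simple closed form (the complete-case mean adjusted by the regression on the first score).

```python
import math
import random

random.seed(2012)            # arbitrary seed, for reproducibility only
n = 20000                    # hypothetical sample size
true_mean_y2 = 0.0           # population mean of the time-2 score

y1, y2, observed = [], [], []
for _ in range(n):
    a = random.gauss(0.0, 1.0)                                  # time-1 score
    b = 0.7 * a + random.gauss(0.0, math.sqrt(1 - 0.7 ** 2))    # time-2 score
    # MAR dropout: a higher time-1 score gives a higher dropout probability.
    p_drop = 1.0 / (1.0 + math.exp(-1.5 * a))
    y1.append(a)
    y2.append(b)
    observed.append(random.random() >= p_drop)

# Complete cases: subjects whose time-2 score was observed.
complete = [(a, b) for a, b, seen in zip(y1, y2, observed) if seen]
n_cc = len(complete)
mean_y1_all = sum(y1) / n
mean_y1_cc = sum(a for a, _ in complete) / n_cc
mean_y2_cc = sum(b for _, b in complete) / n_cc

# Regression slope of y2 on y1 among complete cases.
sxy = sum((a - mean_y1_cc) * (b - mean_y2_cc) for a, b in complete)
sxx = sum((a - mean_y1_cc) ** 2 for a, _ in complete)
slope = sxy / sxx

# Listwise deletion uses complete cases only.
listwise_mean = mean_y2_cc
# Estimate under MAR also uses the time-1 scores of the dropouts.
mar_mean = mean_y2_cc + slope * (mean_y1_all - mean_y1_cc)

print(f"true mean      : {true_mean_y2:.3f}")
print(f"listwise mean  : {listwise_mean:.3f}")
print(f"MAR-based mean : {mar_mean:.3f}")
```

Because subjects with high time-1 scores drop out more often, the complete cases are a selected subsample and the listwise mean is pulled downward, while the MAR-based estimate recovers the population mean by borrowing information from the time-1 scores of the dropouts.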

Muthén, B., Kaplan, D. & Hollis, M. (1987). On structural equation modeling with data that are not missing completely at random. Psychometrika, 52:3, 431-462.

which is posted on our web site.

ResNL posted on Friday, August 24, 2012 - 5:59 am

Thanks Drs Muthen. I'm opting now for multiple imputation. My question is what exactly is meant by multiple imputation with an unrestricted H1 model: what is the unrestricted H1 model? Further, after the MI procedure, I want to estimate a linear autoregressive latent trajectory (ALT) model (Bollen), but as I understand it, I cannot get the factor scores for the intercept and slope for each participant when using TYPE=IMPUTATION. How can I solve this? I need these factor scores for subsequent analysis. Thanks again for your advice.

Linda K. Muthen posted on Friday, August 24, 2012 - 8:35 am

The H1 model is a model of means, variances, and covariances.
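Written out, and assuming the 8 sum scores are treated as continuous and multivariate normal (an assumption made here for illustration, not something stated in the thread), such an unrestricted model places no structure on the means or on the covariance matrix:

```latex
\[
  \mathbf{y}_i = (y_{i1}, \dots, y_{i8})^{\top} \sim N(\boldsymbol{\mu}, \boldsymbol{\Sigma}),
  \qquad
  \boldsymbol{\mu} \in \mathbb{R}^{8} \ \text{free},
  \qquad
  \boldsymbol{\Sigma} \ (8 \times 8,\ \text{symmetric}) \ \text{free},
\]
\[
  \text{number of free parameters} = 8 + \frac{8 \cdot 9}{2} = 44 .
\]
```

Here the 8 means, 8 variances, and 28 covariances are all estimated freely, in contrast to a restricted model such as a growth model, which expresses them through a smaller set of parameters.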

If you want factor scores, you would need to use TYPE=MISSING, the default, instead of DATA IMPUTATION. I would suggest using the factors in the ALT model instead of factor scores. Unless factor determinacy is one, the two are not equivalent and the factors are preferable.

ResNL posted on Friday, August 24, 2012 - 1:00 pm

Thanks Linda. But if I use TYPE=MISSING, I would not be able to use multiple imputation, right? Or am I mistaken? Would it then be a solution to estimate my model separately in each of the 10 datasets created with MI, and then pool the estimates using Rubin's rules? Is there a technical reason why factor scores for the ALT model cannot be computed with multiple imputation? Thanks again.

Linda K. Muthen posted on Friday, August 24, 2012 - 2:30 pm

Right. Your choice is between the methodology of TYPE=MISSING and multiple imputation. The two methods are asymptotically equivalent. Factor scores are not available with multiple imputation. They have not been implemented. If you want factor scores, you would need to use TYPE=MISSING. As I said, I would use multiple imputation and use the factors in the model rather than do the analysis in two steps.
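For readers who do estimate a model separately in each imputed dataset, the pooling "in the Rubin way" mentioned earlier in the thread can be sketched as follows. This is a minimal illustration for a single parameter, not Mplus code; the function name `pool_rubin` and the five example estimates and standard errors are hypothetical.

```python
from math import sqrt
from statistics import mean, variance


def pool_rubin(estimates, std_errors):
    """Pool one parameter across m imputed datasets using Rubin's rules.

    estimates  : point estimate of the parameter from each imputed dataset
    std_errors : the corresponding standard errors
    Returns (pooled estimate, pooled standard error).
    """
    m = len(estimates)
    if m < 2 or m != len(std_errors):
        raise ValueError("need at least two imputations and matching lengths")
    pooled = mean(estimates)                       # average point estimate
    within = mean([se ** 2 for se in std_errors])  # average sampling variance
    between = variance(estimates)                  # variance across imputations (m - 1 denominator)
    total = within + (1 + 1 / m) * between         # total variance
    return pooled, sqrt(total)


# Hypothetical slope-mean estimates from five imputed datasets.
est, se = pool_rubin([0.50, 0.54, 0.46, 0.52, 0.48], [0.10] * 5)
print(f"pooled estimate = {est:.3f}, pooled SE = {se:.4f}")
```

The pooled standard error is larger than the average within-imputation standard error because the between-imputation variance reflects the extra uncertainty due to the missing data.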