Multi-level Non-ignorable Missing Dat...
Mplus Discussion > Missing Data Modeling >
 Matthew Constantinou posted on Friday, September 29, 2017 - 8:09 am
Dear Bengt, Linda and Tihomir,

I am attempting to replicate the missing data modelling approach of Falkenström, Granström, and Holmqvist (2013). In an Mplus two-level model, they regressed between-level slopes on dummy-coded variables capturing k-1 monotone missing data patterns (i.e., Pattern Mixture Modeling), and within-level intercepts on a dummy-coded variable reflecting whether observations were missing or present (i.e., Selection Modeling).
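For readers following along, the k-1 pattern dummies described above can be built from long-format data roughly as follows. This is only a minimal Python sketch, not the syntax used in the cited paper: it assumes strictly monotone dropout (once a person is missing, they stay missing), treats completers as the reference pattern, and the names `dropout_pattern_dummies` and `drop_after_<wave>` are hypothetical.

```python
def dropout_pattern_dummies(long_rows, n_waves):
    """Build person-level dropout-pattern dummies from long-format rows.

    long_rows: iterable of (person_id, wave, y), with y = None when missing
               and waves numbered 1..n_waves.
    Assumes monotone dropout, so the last observed wave defines the pattern.
    Completers (last observed wave == n_waves) are the reference pattern.
    """
    last_seen = {}
    for pid, wave, y in long_rows:
        last_seen.setdefault(pid, 0)
        if y is not None:
            last_seen[pid] = max(last_seen[pid], wave)

    dummies = {}
    for pid, last in last_seen.items():
        if last == 0:
            raise ValueError(f"person {pid} has no observed outcome")
        # One dummy per non-completer pattern: n_waves - 1 dummies in total.
        dummies[pid] = {
            f"drop_after_{w}": int(last == w) for w in range(1, n_waves)
        }
    return dummies


# Toy data (invented for illustration): 3 people, 4 waves.
rows = [
    (1, 1, 10.0), (1, 2, 9.0), (1, 3, 8.0), (1, 4, 7.0),     # completer
    (2, 1, 12.0), (2, 2, 11.0), (2, 3, None), (2, 4, None),  # drops after wave 2
    (3, 1, 15.0), (3, 2, None), (3, 3, None), (3, 4, None),  # drops after wave 1
]
pattern_dummies = dropout_pattern_dummies(rows, n_waves=4)
```

Each person gets the same dummy values on every row, which is what makes these usable as between-level covariates.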

Q1: I tried to integrate the NMAR syntax into a TWO-LEVEL model by adding the essentials of UG v8 Ex 11.3 to the %WITHIN% level and the essentials of Ex 11.4 to the %BETWEEN% level, but this failed. Is this even possible? Note that I am using a long data structure.

Q2: I manually created dummy-coded variables for each level/NMAR method. The %BETWEEN% level (Pattern Mixture) regressions work well, but the %WITHIN% level (Selection Model) variable is reported to have zero variance. It is simply a variable coded 1 where data are missing and 0 where data are present. There are no missing values at the first of the 4 time-points. Any ideas?

Best wishes,
 Bengt O. Muthen posted on Monday, October 02, 2017 - 5:20 pm
These are research questions that I don't know the answers to.
 Matthew Constantinou posted on Tuesday, October 10, 2017 - 8:25 am
Hi Bengt,

Thank you for your reply. Allow me to elaborate.

I wondered whether the DATA MISSING command was available for multilevel models (MLMs), but it appears not: DATA MISSING works on wide-format data, whereas MLMs use long format.

A potential solution is to create the necessary dummy variables manually. For Selection Modeling, this would involve regressing a variable that codes whether an observation is missing (1) or present (0) on the outcome measure at the %WITHIN% level of an MLM. The problem is that there are no missing data at the baseline time-point, which is a common scenario (e.g., STAR*D), and a zero-variance error is thrown.
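To make the baseline issue concrete, here is a small Python sketch that builds the missingness indicator on long-format rows and computes its variance wave by wave. It only illustrates that the indicator is constant at a fully observed baseline; it is not a diagnosis of the exact condition that triggers the Mplus message. The function names and toy data are hypothetical.

```python
from statistics import pvariance


def missing_indicator(long_rows):
    """Return (person_id, wave, m) rows with m = 1 if y is missing, else 0."""
    return [(pid, wave, 1 if y is None else 0) for pid, wave, y in long_rows]


def indicator_variance_by_wave(indicator_rows):
    """Population variance of the missingness indicator within each wave."""
    by_wave = {}
    for _, wave, m in indicator_rows:
        by_wave.setdefault(wave, []).append(m)
    return {wave: pvariance(vals) for wave, vals in sorted(by_wave.items())}


# Toy data (invented for illustration): nobody is missing at wave 1.
rows = [
    (1, 1, 10.0), (1, 2, 9.0), (1, 3, 8.0), (1, 4, 7.0),
    (2, 1, 12.0), (2, 2, 11.0), (2, 3, None), (2, 4, None),
    (3, 1, 15.0), (3, 2, None), (3, 3, None), (3, 4, None),
]
indicators = missing_indicator(rows)
variances = indicator_variance_by_wave(indicators)
```

In this toy data the wave-1 variance is exactly zero because every indicator value there is 0, while later waves have positive variance.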

I am wondering whether there's an alternative approach to achieve this or if you have any suggestions for amendment.

 Linda K. Muthen posted on Tuesday, October 10, 2017 - 9:06 am
DATA MISSING is simply a data transformation program. The variables are in wide format. Multilevel growth models can be in wide format. See, for example, Example 9.12 in the user's guide.
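As a rough illustration of what moving from long to wide format involves, the Python sketch below reshapes (person, wave, outcome) rows into one row per person with one column per wave. It is only a sketch under stated assumptions: waves are numbered 1..n_waves, and the numeric missing-value flag `-999` and the function name `long_to_wide` are hypothetical choices, not anything prescribed by Mplus.

```python
def long_to_wide(long_rows, n_waves, missing_code=-999):
    """Reshape long rows (person_id, wave, y) into {person_id: [y1..y_n]}.

    y = None marks a missing value; absent or missing waves are filled
    with missing_code (a hypothetical numeric missing-data flag).
    """
    wide = {}
    for pid, wave, y in long_rows:
        if not 1 <= wave <= n_waves:
            raise ValueError(f"wave {wave} outside 1..{n_waves}")
        row = wide.setdefault(pid, [missing_code] * n_waves)
        if y is not None:
            row[wave - 1] = y
    return wide


# Toy data (invented for illustration): 2 people, 4 waves.
rows = [
    (1, 1, 10.0), (1, 2, 9.0), (1, 3, 8.0), (1, 4, 7.0),
    (2, 1, 12.0), (2, 2, 11.0), (2, 3, None), (2, 4, None),
]
wide = long_to_wide(rows, n_waves=4)
```

Once the data are in this one-row-per-person layout, the repeated measures appear as separate variables, which is the layout a wide-format transformation expects.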
 Matthew Constantinou posted on Wednesday, October 11, 2017 - 11:48 am
True, but as I'm sure you know, studies in the social sciences typically use the long format to take advantage of differences in handling missing data and computational efficiency.

So basically, DATA MISSING is incompatible with long data because it is designed to transform wide-format data.

The issue is, when I manually create dummy coded variables of dropout and regress them onto the intercept and slope factors with the restrictions mentioned in the STAR*D paper, the model is not identified (I am talking about both PM and SM models).
 Bengt O. Muthen posted on Thursday, October 12, 2017 - 10:32 am
Unless you have very long longitudinal data, I think the single-level wide approach is preferable to the two-level long approach, and that's why we promote and focus on the wide approach. It is more flexible: for instance, it easily handles testing of measurement non-invariance over time and allows residual variances to vary over time. The wide and long approaches handle missing data under MAR in the same way, contrary to what you seem to suggest.

I don't think a two-level, long dropout approach could more easily handle PM and SM models.