By Diggle-Kenward, do you mean missingness that is a function of the variable which has missingness? If so, yes. If not, please send me the article describing the model.
V X posted on Wednesday, October 10, 2007 - 3:02 am
Dr. Muthen, I am also interested in learning how to use Mplus to fit the Diggle-Kenward selection model and the shared-parameter model for nonignorable missing data (that is, missing not at random). Would you provide some Mplus code examples?
I think Diggle-Kenward consider missingness as a function of the (latent) response variables y - what you would have observed if it wasn't missing. You could use DATA MISSING to create binary missing data indicators and then regress those "u's" on the "y's" that have missingness by regular ON statements (u ON y). I am not familiar with the term "shared-parameter model".
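As a rough illustration of the indicator idea in the reply above (this is Python, not Mplus, and not necessarily the coding that DATA MISSING uses), the sketch below builds one binary indicator u per occasion from a small made-up data set. The function name `missing_indicators` and the data are hypothetical; `None` stands in for a missing y.

```python
from typing import List, Optional


def missing_indicators(rows: List[List[Optional[float]]]) -> List[List[int]]:
    """Return u[i][t] = 1 if y[i][t] is missing, else 0."""
    return [[1 if y is None else 0 for y in row] for row in rows]


# Hypothetical data: three people, three occasions; None marks a missing y.
y = [
    [5.1, 4.8, 4.2],
    [6.0, 5.5, None],
    [7.3, None, None],
]

u = missing_indicators(y)
for row in u:
    print(row)
```

In a selection model of the Diggle-Kenward type, each u would then be regressed on the y at the same occasion (and typically the previous one). The dependence on the possibly unobserved current y is what makes the missingness non-ignorable.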
Dr Muthen, You said "for individuals who have missing data on y, the y variable is a latent variable". Do you mean that by creating a latent variable CY like the figure in slide 6 of your Lecture 17, we can fit a nonignorable missingness model with missing Y?
No, it is not. But a new missing data paper will be posted shortly which discusses alternative models for non-ignorable missing data, and you can then request the Mplus setups for those analyses.
Tim Stump posted on Friday, April 20, 2012 - 3:01 pm
I have a cohort of adolescents with type II diabetes, with hemoglobin a1c collected at baseline (prior to high school graduation) and at the 3, 6, 9, and 12 month time points. We know that our a1c outcome does not satisfy the MAR assumption because we could not get all chart review data from physicians' offices after the adolescents left home. Baseline a1c is not missing, but missingness increases over time. The cohort is relatively small with 180 subjects, but we would like to explore some of the models outlined in "Growth Modeling With Nonignorable Dropout: Alternative Analyses of the STAR*D Antidepressant Trial". Our goal is simply to model a1c over time and see if the trajectory differs for a couple of binary covariates and if missing a1c influences the trajectory. Would you have any suggestions as to which type of NMAR model would work better with our small sample?
I am looking for a way to model non-ignorable missing data in a LCA model.
Case is I have 3 variables, each representing the age at onset of a 3-stage process. As stages 2 and 3 are only possible if the previous one has been reached, missing data on stage 2 and/or stage 3 are non-ignorable. The missing data structure looks like a monotone missing pattern, except that they are not dropout: the missing data are informative that the next stage was not reached and I want to include this information in the model.
Structure of the database is:
s1  s2  s3
9   12  13
12  14  17
14  15  .
11  16  .
13  .   .
17  .   .
Do you have any advice on how to implement such a model in Mplus? The closest I found is the Diggle-Kenward selection model (Ex. 11.3), but there are no i, s, q components in what I model...
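To make the "missing means the stage was not reached" idea concrete, here is a small Python sketch (not Mplus) that reads the six rows from the post and derives a reached / not reached flag for stages 2 and 3. It assumes, as the poster describes, that "." always means the stage was not reached rather than that the age was simply not recorded; the function names are hypothetical.

```python
from typing import List, Optional, Tuple

# The six example rows from the post; "." marks a missing age at onset.
RAW = """9 12 13
12 14 17
14 15 .
11 16 .
13 . .
17 . ."""


def parse_row(line: str) -> List[Optional[int]]:
    """Turn one whitespace-separated row into ages, with None for '.'."""
    return [None if tok == "." else int(tok) for tok in line.split()]


def reached_flags(row: List[Optional[int]]) -> Tuple[int, int]:
    """Return (reached stage 2, reached stage 3) as 0/1 flags."""
    _s1, s2, s3 = row
    return (int(s2 is not None), int(s3 is not None))


rows = [parse_row(line) for line in RAW.splitlines()]
flags = [reached_flags(r) for r in rows]
for r, f in zip(rows, flags):
    print(r, f)
```

Whether and how such flags should enter the latent class model is a modeling question that the thread leaves open.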
So you have an LCA based on 3 continuous variables. Does s1 predict missingness on s2 and s3, with s1 always observed, so MAR? Regarding non-ignorable, are you saying that the values that would have been observed for s2 and s3 predict their missingness?
I think I would need to understand the setting better to help you and that goes beyond Mplus Discussion. You may want to ask on SEMNET. You want to make clear why LCA is of interest to you (why mixtures?) and why you want to model trajectories (trajectories of what?). Two comments: Selection modeling of missing data like Diggle-Kenward can be done without a growth model; survival modeling might be relevant given that you want to model age at which the events happened (perhaps multivariate survival; see the Masyn dissertation on our website).
I have a longitudinal dataset over 5 waves with 1600 cases. I think that the missing data is MNAR because missingness on IVs is related to the DVs in my model. How is it best to model missing data in Mplus when it is MNAR? Am I right in thinking that FIML is not appropriate?
OK many thanks. Tabachnick & Fidell (2007) state that “MNAR is inferred if the (missing variable analysis) t-tests show that missingness is related to the DV.” (p63) Would it be possible to ask for clarification since this seems to contradict what you are saying?
Would it also be possible to ask for clarification on one separate Mplus issue? In order to conduct measurement invariance for GCM is it required to have the exact same measure at each wave? E.g., if one has a measure with some different items at each wave (for example to make the measure age appropriate), am I right in thinking that standardised scores cannot be used and the only option is to use only those items that appear at each wave (wave counterpart items)?
MAR is proper if the missingness can be predicted by observed variables. NMAR is at hand if missingness is predicted by unobserved variables such as the value that would have been observed or other latent variables.
Don't use standardized scores in growth modeling. You can deal with different items at different time points if you take a multiple indicator approach with measurement invariance for items that are in common.
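The MAR / NMAR distinction in the reply above can be illustrated with a small simulation. This is a Python sketch under made-up assumptions, not an Mplus analysis: y1 is always observed, y2 is sometimes missing, and a simple regression of y2 on y1 among complete cases stands in (as a simplification) for a likelihood-based analysis that uses the observed y1. Under "MAR" the missingness of y2 depends only on the observed y1; under "NMAR" it depends on the value of y2 itself. The function name, cut-offs, and sample size are all hypothetical.

```python
import random
from statistics import mean


def simulate(mechanism: str, n: int = 20000, seed: int = 1) -> dict:
    """Compare estimates of the mean of y2 under one missingness mechanism."""
    rng = random.Random(seed)
    y1 = [rng.gauss(0, 1) for _ in range(n)]
    # y2 depends on y1; its population mean is 0 and its variance is 1.
    y2 = [0.5 * a + rng.gauss(0, 0.75 ** 0.5) for a in y1]

    if mechanism == "MAR":
        # Missingness predicted by an observed variable (y1).
        observed = [a <= 0.5 for a in y1]
    elif mechanism == "NMAR":
        # Missingness predicted by the value that would have been observed.
        observed = [b <= 0.5 for b in y2]
    else:
        raise ValueError("mechanism must be 'MAR' or 'NMAR'")

    x = [a for a, o in zip(y1, observed) if o]
    y = [b for b, o in zip(y2, observed) if o]
    mx, my = mean(x), mean(y)

    # Least-squares regression of y2 on y1 among complete cases.
    slope = sum((a - mx) * (b - my) for a, b in zip(x, y)) / sum(
        (a - mx) ** 2 for a in x
    )
    intercept = my - slope * mx

    # Predict y2 at the mean of y1 over everyone, using the observed y1.
    adjusted = intercept + slope * mean(y1)
    return {"true": mean(y2), "complete_case": my, "adjusted": adjusted}


mar = simulate("MAR")
nmar = simulate("NMAR")
print("MAR ", mar)
print("NMAR", nmar)
```

In this setup the complete-case mean is off under both mechanisms, the adjustment that uses y1 lands close to the full-sample mean under MAR, and it stays clearly off under NMAR. That is why missingness being related to observed variables is not, by itself, evidence of NMAR.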
Harmen Zoet posted on Thursday, November 10, 2016 - 12:27 am
I want to conduct an LPA with treatment outcome scores as indicators (operationalized as difference scores (T1 - T2) of items on a questionnaire). However, I have missing data which might be non-ignorable. After all, it is plausible that I have more missing data on my last point of measurement (T2) for those who do not respond to treatment.
What is now the best way to deal with my missing data? Should I first of all use multiple imputation for all items, then compute my difference scores, prior to running my LPA? Or is it also possible (and better) to first compute my difference scores, then run my LPA while at the same time using maximum likelihood estimation (or FIML)? Or do I completely miss the point and is it necessary to do this in a different way?
Use ML instead of multiple imputation whenever you can to deal with missing data.
I am not sure why you need to work with difference scores - keeping the original scores might be better and uses all available information with ML. The estimated classes will tell you if increases, decreases, or constancy forms the classes.
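A small Python sketch (not Mplus) of why the original scores can carry more information than difference scores: a difference score is missing whenever either of its two scores is missing, so a person observed only at T1 contributes nothing to an analysis of differences, whereas the T1 score itself remains available when the original scores are the indicators. The data and the function name are hypothetical.

```python
from typing import List, Optional, Tuple

Pair = Tuple[Optional[float], Optional[float]]


def difference_scores(pairs: List[Pair]) -> List[Optional[float]]:
    """T1 - T2 per person; missing if either score is missing."""
    return [
        None if t1 is None or t2 is None else t1 - t2
        for t1, t2 in pairs
    ]


# Hypothetical (T1, T2) item scores for five people; None marks missing.
data: List[Pair] = [
    (10.0, 6.0),
    (12.0, None),
    (9.0, 8.0),
    (11.0, None),
    (None, 5.0),
]

diffs = difference_scores(data)
cases_with_diff = sum(d is not None for d in diffs)
cases_with_any_score = sum(t1 is not None or t2 is not None for t1, t2 in data)

print(diffs)
print(cases_with_diff, "of", len(data), "people have a difference score")
print(cases_with_any_score, "of", len(data), "people have at least one score")
```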
Harmen Zoet posted on Thursday, November 17, 2016 - 1:12 am
Dear dr Muthen,
Thanks for your answer. I'm considering your advice, but cannot really figure it out. Would the following be correct:
Variable: Names are nummer T0B1FR (...) KIPPOST;
USEVARIABLES are T0B1FR-T2D5IN KIPPOST;
CLASSES = c (4);
MISSING is all (999);
All indicator variables are the items from the questionnaire, both pre- and post-treatment (except for KIPPOST, which is a post-treatment total severity score). I used specified starting values, because automatic starting values lead to local maxima. The weird thing, however, is that I cannot view any plot of the estimated probabilities when I run the above.
Otherwise, it is not possible to use ML while computing difference scores in Mplus (before running the LPA), right?