Multiple sources of missing data
Mplus Discussion > Missing Data Modeling >
 Nicole Watkins posted on Tuesday, June 05, 2018 - 10:26 am
I am working on my dissertation using NLSY79 data. I will be fitting a latent growth curve model of depressive symptoms at 9 points. When I ran an unconditional growth model, I received an error that the minimum covariance coverage of .10 was not met. I have a lot of missing data, but much of it is caused by the NLSY survey design and/or by participants not being age-eligible. I will briefly outline the 3 types of missing data below:

1) Planned missing: For example, in 2000, participants who responded to the depression questions in 1998 were skipped in 2000. In 1998, children over age 20 were not assessed at all. Beginning in 2012, participants age 25-28 were skipped over the depression questions. These are examples of planned missing.

2) Age-ineligible: Another source of missing data is that a child was age 25 at the last interview and so is missing the age 26/27 data because they have not yet 'aged in'.

3) Non-response: I also have some missing data due to attrition or unplanned skips (10-30% depending on the age).

When I lower the covariance coverage minimum to .07 the model runs.

So I have 2 questions:
1) What does it do to my model if I lower my covariance coverage to .07 rather than .10?
2) Is using the FIML method enough to deal with the types of missing data that I am experiencing?

Thank you in advance, I look forward to hearing some suggestions.
 Bengt O. Muthen posted on Tuesday, June 05, 2018 - 5:02 pm
First see if UG ex6.18 is relevant for at least some of the planned missingness.

1) In the cells (variable pairs) with low coverage, you have fewer subjects contributing to the estimation of parameters corresponding to those cells. For instance, if you have a residual covariance parameter between Y1 and Y4, the coverage for that pair influences the quality of that parameter's estimation. Low coverage for pairs of variables close to the diagonal (that is, variable pairs close to each other temporally in your growth model, where the diagonal concerns coverage for a single variable) is more harmful for growth model estimation than low coverage for pairs far from the diagonal.

2) FIML can be ok - it depends on the considerations I mention in 1).
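
To see which cells fall short before lowering the minimum, it can help to compute the coverage matrix directly. The following is a minimal Python sketch, not Mplus's own implementation: it assumes the repeated measures sit in a NumPy array with one row per subject, one column per time point, and NaN for missing values. The function names, the toy data, and the strict "below the minimum" comparison are all assumptions made for illustration.

```python
import numpy as np

def covariance_coverage(data):
    """Return the matrix of pairwise coverage proportions.

    Entry (i, j) is the share of rows in which both variable i and
    variable j are observed (not NaN). The diagonal holds the coverage
    of each single variable.
    """
    observed = ~np.isnan(np.asarray(data, dtype=float))
    obs = observed.astype(float)
    return obs.T @ obs / obs.shape[0]

def low_coverage_pairs(coverage, minimum=0.10):
    """List (i, j, coverage) for variable pairs i < j below the minimum."""
    n = coverage.shape[0]
    return [(i, j, float(coverage[i, j]))
            for i in range(n) for j in range(i + 1, n)
            if coverage[i, j] < minimum]

# Hypothetical toy data: 4 subjects, 3 time points, NaN = missing.
toy = np.array([
    [1.0, 2.0, np.nan],
    [1.5, np.nan, np.nan],
    [np.nan, 2.5, 3.0],
    [2.0, 3.0, np.nan],
])

cov = covariance_coverage(toy)

# Pairs that would fail a .10 minimum in this toy example.
flagged = low_coverage_pairs(cov, minimum=0.10)

# Coverage for temporally adjacent pairs (first off-diagonal), the cells
# that matter most for growth model estimation per the reply above.
adjacent = np.diag(cov, k=1)
```

Here `flagged` lists the variable pairs that no or almost no subject provides jointly, and `adjacent` shows the coverage for neighboring time points, which is the part of the matrix to check first.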