Hello, I am running a series of cross-domain LGM models, each investigating the relation between two variables: depressive symptoms and a cognitive function domain (scale score), using 6 waves of data at 3-year intervals. Several covariates measured at baseline are used to predict the intercepts and slopes of both cognition and depression.
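A cross-domain (parallel-process) LGM of this kind might be set up as sketched below. This is only a hedged illustration: the variable names (dep1-dep6, cog1-cog6, age, edu, sex), the missing-value flag, and the time scores in 3-year units are my assumptions, not the poster's actual setup.

```
TITLE:    Cross-domain LGM sketch (hypothetical variable names);
DATA:     FILE = mydata.dat;
VARIABLE: NAMES = age edu sex dep1-dep6 cog1-cog6;
          MISSING = ALL (-999);
ANALYSIS: ESTIMATOR = MLR;
MODEL:
  ! depression growth process; slope is in 3-year units
  idep sdep | dep1@0 dep2@1 dep3@2 dep4@3 dep5@4 dep6@5;
  ! cognition growth process
  icog scog | cog1@0 cog2@1 cog3@2 cog4@3 cog5@4 cog6@5;
  ! cross-domain residual covariances among growth factors
  idep sdep WITH icog scog;
  ! baseline covariates predict both sets of growth factors
  idep sdep icog scog ON age edu sex;
```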
Question 1: When using MLR estimation, are persons with missing data on the covariates excluded from the analysis?
Question 2: While MLR uses all available information, is there a minimum number of data waves that each person must have on both outcomes (depression and cognition) in order to be kept in the analysis? Is it acceptable to include persons who have data at only one time point?
Thanks. I have some follow-up questions. Q1: How can I mention the covariate means or variances in my syntax, and is this procedure recommended for binary, ordered categorical, or continuous variables? In my dataset, persons who are missing on the covariates usually are missing on the outcome variables as well, in which case including covariate means and variances may not make a great difference.
Q2: What is the reason for using all cases?
Q3: My covariance coverage ranges between 95% and 10%. Should this worry me or is this something that MLR can deal with?
Q2. There is information on some variables for some subjects - you don't want to exclude that.
Q3. You should worry with low coverage - and 10% is extremely low if it is on the diagonal of the coverage matrix. With low coverage your model assumptions play too big of a role relative to the data information.
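Regarding Q1 above, mentioning the covariates' means and variances in the MODEL command brings the covariates into the likelihood, so that cases missing only on covariates are retained rather than listwise deleted. A minimal sketch, again assuming the hypothetical covariate names age, edu, and sex:

```
MODEL:
  ! (growth model statements as before)
  ! mentioning means/variances brings the covariates into the
  ! likelihood, so cases missing on them are not listwise deleted:
  [age edu sex];    ! covariate means
  age edu sex;      ! covariate variances
```

Note that this treats the covariates as normally distributed outcomes, which is natural for continuous covariates; for binary or ordered-categorical covariates the normality assumption is only an approximation, so this step should be weighed accordingly.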
Thanks. The coverage on the diagonal ranges from about 95% at wave 1 to about 20% at wave 6, and the average coverage across the 6 waves is between 50% and 60% (depending on the model). Are these acceptable values? If not, which practical solutions would you recommend to increase coverage:
(a) Include just the first 4 waves (minimum coverage about 45%) or the first 5 waves (minimum coverage about 30%)?
(b) Exclude the oldest old participants at baseline in order to reduce follow-up attrition due to old age?
I would worry about selective attrition when diagonal coverage falls much below something like 80%. I don't know that either (a) reducing the number of waves or (b) restricting the range of inference is a good idea.
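One way to inspect how the dropout unfolds is to request the missing-data patterns and sample statistics in the OUTPUT command (a minimal sketch):

```
OUTPUT: SAMPSTAT PATTERNS;   ! PATTERNS lists the missing-data patterns;
                             ! coverage is printed under COVARIANCE COVERAGE
```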
I would instead want to focus on why so many drop out, and on whether the dropout is "informative". You may want to check for NMAR (not missing at random) missingness using a relatively simple approach such as the one shown in UG ex 11.4. For a broader view, see the paper on our website:
Muthén, B., Asparouhov, T., Hunter, A. & Leuchter, A. (2011). Growth modeling with non-ignorable dropout: Alternative analyses of the STAR*D antidepressant trial. Psychological Methods, 16, 17-33.