I am using the “TYPE = MISSING H1” option in Mplus. By default, this option assumes that data are missing at random. In my dataset, measures at the beginning of the survey were more likely to be completed, and higher order factors (interactions) are missing whenever a component factor is missing. Do I need to address these concerns when doing missing data imputation?
Remember that the "missing at random" (MAR) assumption that Type = Missing uses is not the same as assuming missing completely at random (MCAR). Under MAR, missingness can be quite selective (the terms are misleading). Listwise deletion (not using Type = Missing) is correct only under MCAR. MAR is much more flexible than MCAR. For instance, if your attrition happens for subjects having particularly high (or low) values at the first time point, MAR may be approximately true. It is probably the case that missingness is often NMAR (not missing at random) and a function of many unobserved variables, but MAR may still be a reasonable approximation. In short, use Type = Missing. I assume that the data you have are sufficient to identify your interactions.
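To make the MCAR/MAR distinction concrete, here is a minimal simulation sketch. It is not Mplus and does not reproduce Type = Missing; it only shows why listwise deletion is safe under MCAR but biased when dropout depends on an observed first time point. The coefficient 0.7, the dropout probabilities, and the sample size are all illustrative assumptions.

```python
import random

def simulate(mechanism, n=20000, seed=1):
    """Return (mean of all y2, complete-case mean of y2) for one mechanism."""
    rng = random.Random(seed)
    all_y2, observed_y2 = [], []
    for _ in range(n):
        y1 = rng.gauss(0.0, 1.0)             # time 1, always observed
        y2 = 0.7 * y1 + rng.gauss(0.0, 0.7)  # time 2, may be missing
        if mechanism == "MCAR":
            p_missing = 0.4                  # unrelated to any variable
        else:
            # MAR: dropout depends only on the observed time-1 score
            p_missing = 0.7 if y1 > 0 else 0.1
        all_y2.append(y2)
        if rng.random() >= p_missing:
            observed_y2.append(y2)
    full_mean = sum(all_y2) / len(all_y2)
    complete_case_mean = sum(observed_y2) / len(observed_y2)
    return full_mean, complete_case_mean

mcar_full, mcar_cc = simulate("MCAR")
mar_full, mar_cc = simulate("MAR")
print(f"MCAR: full-data mean {mcar_full:.3f}, listwise mean {mcar_cc:.3f}")
print(f"MAR:  full-data mean {mar_full:.3f}, listwise mean {mar_cc:.3f}")
```

Under the MAR mechanism the listwise mean is pulled downward because high scorers at time 1 drop out more often. An ML analysis that keeps y1 in the model can correct for this, which is the reason for the advice to use Type = Missing.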
Thank you for your response. I decided to test whether people with missing data were significantly different on any of our variables compared to those who had complete data. There was one significant difference (even if I were to Bonferroni correct for the number of analyses) and several p <.10 trends. In general, people with higher psychopathology were less likely to have complete data. Does this violate even more flexible MAR assumptions? Do you still recommend using the MAR approach in this case?
I have one interaction term ("sxc" in the syntax below) and N=363. I am also running a split group analysis for men (N=180) vs. women (N=183). I thought this should be enough data to identify the interaction. Do you agree? Do you have any recommendations for how to do a power analysis to test this?
MODEL:
  ib ON smsr strc;
  coprc ON smsr;
  sxc WITH strc coprc;
  strc WITH coprc;
  pswq ON smsr strc ib coprc sxc;
  rrs ON smsr strc ib coprc sxc;
  rrs WITH pswq;
If psychopathology has missing data and the missingness is related to the values of psychopathology, you do not have MAR. If psychopathology has missing data and the missingness is not related to the values of psychopathology, you have MAR.
You can see the following paper where assessing power is discussed:
Muthén, L.K. & Muthén, B.O. (2002). How to use a Monte Carlo study to decide on sample size and determine power. Structural Equation Modeling, 9(4), 599-620.
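The general idea of a Monte Carlo power study can be sketched in a few lines: generate many data sets from a population model with known parameters, fit the model to each, and count how often the parameter of interest is significant. The sketch below is only a stand-in. It uses a simple regression with one slope instead of the path model above, it is not the Mplus MONTECARLO facility, and the population slope of 0.2 is a hypothetical value. The sample sizes 180 and 363 are the ones mentioned in this thread.

```python
import math
import random

def estimate_power(n, slope, reps=500, seed=2002, z_crit=1.96):
    """Share of replications in which the OLS slope is significant."""
    rng = random.Random(seed)
    rejections = 0
    for _ in range(reps):
        # Generate one data set from the assumed population model
        x = [rng.gauss(0.0, 1.0) for _ in range(n)]
        y = [slope * xi + rng.gauss(0.0, 1.0) for xi in x]
        mx, my = sum(x) / n, sum(y) / n
        sxx = sum((xi - mx) ** 2 for xi in x)
        sxy = sum((xi - mx) * (yi - my) for xi, yi in zip(x, y))
        b = sxy / sxx                      # estimated slope
        a = my - b * mx                    # estimated intercept
        rss = sum((yi - a - b * xi) ** 2 for xi, yi in zip(x, y))
        se = math.sqrt(rss / (n - 2) / sxx)
        # Normal approximation to the t test (reasonable for these n)
        if abs(b / se) > z_crit:
            rejections += 1
    return rejections / reps

power_null = estimate_power(180, 0.0)   # should be near the 5% alpha level
power_group = estimate_power(180, 0.2)  # one group of the split analysis
power_full = estimate_power(363, 0.2)   # full sample
print(power_null, power_group, power_full)
```

For the actual model, the same loop would generate data from the hypothesized path model, including the interaction term and the expected missing data pattern, and the rejection rate for the "sxc" coefficients would be the power estimate.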
There are some not-too-technical overview papers on missing data techniques using "FIML" (which is simply ML). One is, I think, written by Werner Wothke and/or by Schumacker; you can search for that. Essentially, the EM algorithm is suitable for the unrestricted H1 model. EM makes the estimation easy by computing expected values for the missing data in each iteration. But it is not necessarily the best algorithm for the H0 computations. There you can instead estimate the model parameters directly, working through each of the missing data patterns.
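As an illustration of how EM handles an unrestricted mean and covariance structure, here is a minimal sketch for the simplest case: two normal variables, x fully observed and y missing for some cases. This is not the Mplus implementation. The data-generating values are assumptions, and the function name em_bivariate is made up for this example. In the E-step, each missing y contributes its expected value (and expected square) given x and the current estimates; the M-step then recomputes the means, variances, and covariance from those filled-in sums.

```python
import random

def em_bivariate(x, y, max_iter=500, tol=1e-12):
    """ML means/(co)variances of (x, y); x is complete, missing y is None."""
    n = len(x)
    mx = sum(x) / n
    sxx = sum((xi - mx) ** 2 for xi in x) / n   # x is complete: no EM needed
    obs_y = [yi for yi in y if yi is not None]
    my = sum(obs_y) / len(obs_y)                # complete-case starting values
    syy = sum((yi - my) ** 2 for yi in obs_y) / len(obs_y)
    sxy = 0.0
    for _ in range(max_iter):
        beta = sxy / sxx                        # current regression of y on x
        resid_var = syy - beta * sxy
        s_y = s_yy = s_xy = 0.0
        for xi, yi in zip(x, y):
            if yi is None:                      # E-step for a missing y
                ey = my + beta * (xi - mx)
                ey2 = ey * ey + resid_var
            else:
                ey, ey2 = yi, yi * yi
            s_y += ey
            s_yy += ey2
            s_xy += xi * ey
        new_my = s_y / n                        # M-step
        new_syy = s_yy / n - new_my ** 2
        new_sxy = s_xy / n - mx * new_my
        change = max(abs(new_my - my), abs(new_syy - syy), abs(new_sxy - sxy))
        my, syy, sxy = new_my, new_syy, new_sxy
        if change < tol:
            break
    return {"mean_x": mx, "mean_y": my, "var_x": sxx, "var_y": syy, "cov_xy": sxy}

# Illustrative data: y is more likely to be missing when x is high (MAR)
rng = random.Random(7)
x = [rng.gauss(0.0, 1.0) for _ in range(5000)]
y = []
for xi in x:
    yi = 0.7 * xi + rng.gauss(0.0, 0.7)
    p_missing = 0.7 if xi > 0 else 0.1
    y.append(None if rng.random() < p_missing else yi)

est = em_bivariate(x, y)
print(est)
```

With many variables and many missing data patterns, the same two steps are applied to the full mean vector and covariance matrix. When very few cases contribute to a given pair of variables, there is little information to update that covariance, which is one way the H1 EM iterations can fail to converge.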
I have a non-convergence problem with a data set with missing completely at random data (respondents had to answer a random selection of items). The coverage of the off-diagonal elements of the covariance matrix is between 16% and 20%.
I use the analysis command Type = Missing, but even for very simple models (e.g., a CFA with 4 indicators) I get the following message:
THE MISSING DATA EM ALGORITHM FOR THE H1 MODEL HAS NOT CONVERGED WITH RESPECT TO THE LOGLIKELIHOOD FUNCTION. THIS COULD BE DUE TO LOW COVARIANCE COVERAGE OR A NOT SUFFICIENTLY STRICT EM PARAMETER CONVERGENCE CRITERION. CHECK THE COVARIANCE COVERAGE, OR SHARPEN THE EM PARAMETER CONVERGENCE CRITERION, OR RERUN WITHOUT H1 TO OBTAIN H0 PARAMETER ESTIMATES AND STANDARD ERRORS.
Is there anything else I can do? I have tried to set the iterations to 10000 but nothing changed - all suggestions are warmly appreciated.
Your low coverage is causing the H1 model not to converge. This is because the H1 model uses both diagonal and off-diagonal elements. If the H1 model does not converge, you cannot get chi-square. You can add NOCHI to the OUTPUT command.
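For readers unfamiliar with the term, covariance coverage is the proportion of cases that have data on a variable (diagonal) or on both variables of a pair (off-diagonal). A minimal sketch of that calculation follows, assuming missing values are coded as None. The function name and the tiny data set are made up for illustration, and Mplus reports this matrix itself in its output.

```python
def covariance_coverage(rows):
    """Proportion of cases observed on each variable and each variable pair."""
    n = len(rows)
    p = len(rows[0])
    coverage = [[0.0] * p for _ in range(p)]
    for i in range(p):
        for j in range(i + 1):
            # Count cases with both variable i and variable j observed
            both = sum(1 for r in rows if r[i] is not None and r[j] is not None)
            coverage[i][j] = coverage[j][i] = both / n
    return coverage

# Hypothetical data: 4 respondents, 3 items, each item skipped by someone
data = [
    [1.0, 2.0, None],
    [None, 1.5, 0.3],
    [0.8, None, 0.1],
    [1.2, 2.2, None],
]
coverage = covariance_coverage(data)
for row in coverage:
    print(" ".join(f"{value:.2f}" for value in row))
```

In a design where each respondent answers a random selection of items, the off-diagonal values are driven by how often two items land in the same selection, so they can be low even when every item is answered by a fair share of the sample.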
Johan Ng posted on Thursday, September 06, 2012 - 4:03 am
Dear Bengt & Linda,
I am trying to run a path analysis using the BAYES estimator. I understand that missing data are handled using FIML by default when MLR, etc., are used. But how are they handled when I use BAYES?