

I am using the “TYPE = MISSING H1” option in MPlus. For this option, the default assumes that data are missing at random. In my dataset, measures in the beginning of the survey were more likely to be completed and higher order factors (interactions) are missing whenever a component factor is missing. Do I need to address these concerns when doing missing data imputation? Thanks for your assistance! 


Remember that the "missing at random" (MAR) assumption that TYPE = MISSING relies on is not the same as assuming missing completely at random (MCAR); under MAR, missingness can be quite selective (the terms are misleading). Listwise deletion (not using TYPE = MISSING) is correct only under MCAR. MAR is much more flexible than MCAR. For instance, if your attrition happens for subjects having particularly high (or low) values at the first time point, MAR may be approximately true. It is probably the case that missingness is often NMAR (not missing at random) and a function of many unobserved variables, but MAR may still be a reasonable approximation. In short, use TYPE = MISSING. I assume that the data you have are sufficient to identify your interactions.
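To make the MCAR/MAR contrast concrete, here is a minimal simulation sketch. It is not Mplus output and not how Mplus computes anything; the two-wave setup, the cutoff at 0, the slope of 0.7, and all function names are assumptions chosen for illustration. Wave-2 scores are dropped whenever the wave-1 score is above 0, so missingness depends only on an observed variable (MAR). The complete-case mean is then biased, while a simple likelihood-based estimate that uses everyone's wave-1 score recovers the true mean of 0.

```python
import random
import statistics


def simulate_two_waves(n=20000, slope=0.7, seed=1):
    """Return (x, y) pairs; y is None whenever the wave-1 score x is above 0.

    Missingness depends only on the always-observed x, so the data are MAR.
    """
    rng = random.Random(seed)
    resid_sd = (1 - slope ** 2) ** 0.5  # keeps Var(y) = 1
    data = []
    for _ in range(n):
        x = rng.gauss(0, 1)
        y = slope * x + rng.gauss(0, resid_sd)
        data.append((x, None if x > 0 else y))
    return data


def listwise_mean_y(data):
    """Mean of y from complete cases only (listwise deletion)."""
    return statistics.fmean([y for _, y in data if y is not None])


def mar_mean_y(data):
    """Estimate E[y] under MAR: fit y on x in the complete cases, then
    evaluate the fitted line at the mean of x taken over ALL cases."""
    complete = [(x, y) for x, y in data if y is not None]
    mx = statistics.fmean([x for x, _ in complete])
    my = statistics.fmean([y for _, y in complete])
    sxx = sum((x - mx) ** 2 for x, _ in complete)
    sxy = sum((x - mx) * (y - my) for x, y in complete)
    slope_hat = sxy / sxx
    mx_all = statistics.fmean([x for x, _ in data])
    return my + slope_hat * (mx_all - mx)


data = simulate_two_waves()
print("true mean of y: 0")
print("listwise estimate: ", round(listwise_mean_y(data), 3))
print("MAR-based estimate:", round(mar_mean_y(data), 3))
```

If the dropout rule depended on the unobserved wave-2 score itself, neither estimate would be guaranteed to be unbiased; that is the NMAR case mentioned above.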


Thank you for your response. I decided to test whether people with missing data were significantly different on any of our variables compared to those who had complete data. There was one significant difference (even if I were to Bonferroni correct for the number of analyses) and several p < .10 trends. In general, people with higher psychopathology were less likely to have complete data. Does this violate even the more flexible MAR assumptions? Do you still recommend using the MAR approach in this case? I have one interaction term ("sxc" in the syntax below) and N=363. I am also running a split-group analysis for men (N=180) vs. women (N=183). I thought this should be enough data to identify the interaction. Do you agree? Do you have any recommendations for how to do a power analysis to test this?

MODEL:
ib ON smsr strc;
coprc ON smsr;
sxc WITH strc coprc;
strc WITH coprc;
pswq ON smsr strc ib coprc sxc;
rrs ON smsr strc ib coprc sxc;
rrs WITH pswq;


If psychopathology has missing data and the missingness is related to the values of psychopathology, you do not have MAR. If psychopathology has missing data and the missingness is not related to the values of psychopathology, you have MAR. You can see the following paper where assessing power is discussed: Muthén, L.K. & Muthén, B.O. (2002). How to use a Monte Carlo study to decide on sample size and determine power. Structural Equation Modeling, 9(4), 599-620.
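The Monte Carlo idea behind that kind of power analysis can be sketched outside Mplus. The example below is only an illustration of the logic, not the procedure from the paper: it assumes a single standardized predictor with a hypothetical true slope of 0.15, simulates many samples, and counts how often the slope is significant at the two-sided 5% level. The sample sizes 363 and 180 come from this thread; everything else (function names, effect size, number of replications) is made up for the sketch.

```python
import math
import random


def slope_is_significant(n, beta, rng, z_crit=1.96):
    """Draw one sample of size n from y = beta*x + e and test the OLS slope."""
    resid_sd = math.sqrt(1 - beta ** 2)  # keeps Var(y) = 1
    xs = [rng.gauss(0, 1) for _ in range(n)]
    ys = [beta * x + rng.gauss(0, resid_sd) for x in xs]
    mx = sum(xs) / n
    my = sum(ys) / n
    sxx = sum((x - mx) ** 2 for x in xs)
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    b = sxy / sxx
    a = my - b * mx
    sse = sum((y - a - b * x) ** 2 for x, y in zip(xs, ys))
    se_b = math.sqrt(sse / (n - 2) / sxx)
    return abs(b / se_b) > z_crit


def monte_carlo_power(n, beta, reps=500, seed=2002):
    """Proportion of replications in which the slope is significant."""
    rng = random.Random(seed)
    hits = sum(slope_is_significant(n, beta, rng) for _ in range(reps))
    return hits / reps


print("power, N=363:", monte_carlo_power(363, 0.15))
print("power, N=180:", monte_carlo_power(180, 0.15))
```

For a real study the population model would be the full path model with the interaction term and a missing-data pattern resembling the one in the data, which is what the cited paper sets up inside Mplus.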


Unfortunately, our missingness is related to the values of psychopathology, so we do not have MAR. Given this limitation, is listwise deletion preferable to TYPE = MISSING? Thanks again for your help! 


I think using TYPE=MISSING is preferable even in this situation. 

Eulalia Puig posted on Wednesday, November 28, 2007  6:31 pm



Hello. What is the default in v5 for using WLSMV in SEM? Thanks! 

Eulalia Puig posted on Thursday, November 29, 2007  6:05 am



What I meant above is what the default for dealing with missing values is: MAR or listwise? Thanks.


The default in Version 5 is to estimate the model using all available data and missing data theory. Listwise deletion can be obtained using LISTWISE=ON in the DATA command. 

kirby posted on Wednesday, May 21, 2008  3:02 am



With TYPE IS MISSING; ESTIMATOR IS MLR; am I right that MPlus uses the expectation-maximization algorithm to handle missing data?


The EM algorithm is used to give ML estimates of the H1 unrestricted model, but for the H0 model other ML algorithms are used (Quasi-Newton, FS).

kirby posted on Thursday, May 22, 2008  1:47 am



Dear Bengt, thanks for your answer. Would it be possible to give me some further information about how missing data is handled with MLR? Why are the other ML algorithms (QN, FS) used? Since I do not know any papers I could have a look at to solve my questions, I really appreciate your help. Thanks a lot. 


There are some not-too-technical overview papers on missing data techniques using "FIML" (which is ML). One is, I think, written by Werner Wothke and/or by Schumacker. You can search for that. Essentially, the EM algorithm is suitable with an unrestricted H1. EM makes the estimation easy by estimating the expected missing data values in each iteration. But that is not necessarily the best algorithm in the H0 computations. There you can focus on estimating the model parameters directly, going over each of the missing data patterns.
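As a rough illustration of what EM does for an unrestricted (H1) model, here is a sketch for the simplest case: two variables, x complete and y partly missing. This is a textbook-style EM written for the example, not the Mplus implementation; the simulated data, starting values, and names are assumptions. Note that the E-step fills in the expected values of y and of y squared for the missing cases, and the M-step recomputes the mean, variance, and covariance from those completed sums.

```python
import random


def make_data(n=10000, slope=0.7, seed=3):
    """Simulate (x, y); y is None when x > 0, so missingness is MAR."""
    rng = random.Random(seed)
    resid_sd = (1 - slope ** 2) ** 0.5
    rows = []
    for _ in range(n):
        x = rng.gauss(0, 1)
        y = slope * x + rng.gauss(0, resid_sd)
        rows.append((x, None if x > 0 else y))
    return rows


def em_bivariate(rows, tol=1e-8, max_iter=2000):
    """EM estimates of the unrestricted mean/covariance of (x, y)."""
    n = len(rows)
    mu_x = sum(x for x, _ in rows) / n
    sxx = sum((x - mu_x) ** 2 for x, _ in rows) / n  # x is complete: final values
    observed = [y for _, y in rows if y is not None]
    # Starting values: complete-case moments for y, zero covariance.
    mu_y = sum(observed) / len(observed)
    syy = sum((y - mu_y) ** 2 for y in observed) / len(observed)
    sxy = 0.0
    iterations = 0
    for iterations in range(1, max_iter + 1):
        # E-step: expected y and y^2 for missing cases, given x and current values.
        b = sxy / sxx
        resid_var = syy - sxy * sxy / sxx
        sum_y = sum_yy = sum_xy = 0.0
        for x, y in rows:
            if y is None:
                ey = mu_y + b * (x - mu_x)
                eyy = ey * ey + resid_var
            else:
                ey, eyy = y, y * y
            sum_y += ey
            sum_yy += eyy
            sum_xy += x * ey
        # M-step: recompute moments from the completed sums.
        new_mu_y = sum_y / n
        new_syy = sum_yy / n - new_mu_y ** 2
        new_sxy = sum_xy / n - mu_x * new_mu_y
        change = max(abs(new_mu_y - mu_y), abs(new_syy - syy), abs(new_sxy - sxy))
        mu_y, syy, sxy = new_mu_y, new_syy, new_sxy
        if change < tol:
            break
    return {"mu_x": mu_x, "mu_y": mu_y, "sxx": sxx, "sxy": sxy, "syy": syy,
            "iterations": iterations}


estimates = em_bivariate(make_data())
print(estimates)
```

With a restricted H0 model the parameters are no longer just means and covariances, which is why direct optimization of the likelihood over the missing data patterns can be the more convenient route there.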


Dear Bengt, are there differences in ML vs. MLR in the use of algorithm in the case of missing data? thanks alex 


No. 


Thanks, two more questions. 1) Which sandwich estimator is used to obtain the corrected s.e.? 2) Are the parameter estimates obtained via raw ML (as I understand) or also via a sandwich estimator? best Alex


1. See on the website Technical Appendix 8 formula 170. 2. Sandwich estimators are used for standard errors not parameter estimates. 
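The second point can be seen in a small sketch. The code below uses the generic Huber-White sandwich form for a simple regression slope; it is an assumption-laden illustration and is not claimed to match the formula in the Technical Appendix. The simulated data have error variance that grows with |x|. The slope estimate is computed once; only the standard error changes between the model-based and the sandwich version.

```python
import math
import random


def slope_with_two_ses(xs, ys):
    """Return (slope, model-based SE, sandwich SE) for a simple regression."""
    n = len(xs)
    mx = sum(xs) / n
    my = sum(ys) / n
    sxx = sum((x - mx) ** 2 for x in xs)
    b = sum((x - mx) * (y - my) for x, y in zip(xs, ys)) / sxx
    a = my - b * mx
    resid = [y - a - b * x for x, y in zip(xs, ys)]
    # Model-based SE: assumes constant residual variance.
    se_model = math.sqrt(sum(e * e for e in resid) / (n - 2) / sxx)
    # Sandwich SE: bread = 1/sxx, meat = sum of squared score contributions.
    meat = sum(((x - mx) * e) ** 2 for x, e in zip(xs, resid))
    se_sandwich = math.sqrt(meat) / sxx
    return b, se_model, se_sandwich


rng = random.Random(8)
xs = [rng.gauss(0, 1) for _ in range(5000)]
# Hypothetical heteroskedastic errors: spread grows with |x|.
ys = [0.5 * x + abs(x) * rng.gauss(0, 1) for x in xs]
b, se_model, se_sandwich = slope_with_two_ses(xs, ys)
print("slope:", round(b, 3))
print("model-based SE:", round(se_model, 4))
print("sandwich SE:   ", round(se_sandwich, 4))
```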


Thanks Linda, do you know if the sandwich-type estimator of MLR is the same as the one used for the robust covariance matrix in EQS, or do there exist several estimators?


There are different algorithms for sandwich estimators. I'm not sure what EQS uses. 


Dear Linda, dear Bengt, I have a nonconvergence problem with a data set with missing completely at random data (respondents had to answer a random selection of items). The coverage of the off-diagonal elements of the covariance matrix is between 16% and 20%. I use the analysis command type = missing but get, even for very simple models (e.g. CFA with 4 indicators), the following message: THE MISSING DATA EM ALGORITHM FOR THE H1 MODEL HAS NOT CONVERGED WITH RESPECT TO THE LOGLIKELIHOOD FUNCTION. THIS COULD BE DUE TO LOW COVARIANCE COVERAGE OR A NOT SUFFICIENTLY STRICT EM PARAMETER CONVERGENCE CRITERION. CHECK THE COVARIANCE COVERAGE, OR SHARPEN THE EM PARAMETER CONVERGENCE CRITERION, OR RERUN WITHOUT H1 TO OBTAIN H0 PARAMETER ESTIMATES AND STANDARD ERRORS. Is there anything else I can do? I have tried to set the iterations to 10000 but nothing changed. All suggestions are warmly appreciated. Best Marcel


Your low coverage is causing the H1 model not to converge. This is because the H1 model uses both diagonal and off-diagonal elements. If the H1 model does not converge, you cannot get chi-square. You can add NOCHI to the OUTPUT command.
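For readers unsure what covariance coverage means here: for each pair of variables it is the proportion of cases in which both are observed (the diagonal is the proportion observed for each variable alone). A minimal sketch follows, with a made-up five-case, three-item data set and a hypothetical function name.

```python
def covariance_coverage(rows):
    """rows: equal-length lists with None marking a missing value.

    Returns a matrix whose [j][k] entry is the proportion of rows in which
    variables j and k are both observed.
    """
    n = len(rows)
    p = len(rows[0])
    coverage = [[0.0] * p for _ in range(p)]
    for j in range(p):
        for k in range(p):
            both = sum(1 for r in rows if r[j] is not None and r[k] is not None)
            coverage[j][k] = both / n
    return coverage


# Toy data: each respondent answers only some of the three items.
rows = [
    [1, 2, None],
    [None, 3, 4],
    [2, None, 5],
    [3, 1, None],
    [None, None, 2],
]
for line in covariance_coverage(rows):
    print([round(value, 2) for value in line])
```

In a design where every respondent gets a small random subset of items, the off-diagonal entries shrink quickly, which is the situation described in the question above.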

Johan Ng posted on Thursday, September 06, 2012  4:03 am



Dear Bengt & Linda, I am trying to run a path analysis using the BAYES estimator. I understand missing data are treated using FIML by default when MLR etc are used. But how are they treated when I use BAYES? Many thanks in advance! 


Bayes and FIML (=ML) both assume "MAR" (see the Enders 2010 book), so both use all available data.


Hello, when running a four-wave growth model using MLR estimation and a zero-inflated Poisson distribution, the following results in Chi-Square Difference tests for MCAR were obtained:

Pearson Chi-Squared: 601.499
Pearson df: 1090
Pearson p: >.999 (p = 1.000)
Likelihood Chi-Squared: 674.469
Likelihood df: 1090
Likelihood p: >.999 (p = 1.000)

Can it be assumed the data is MCAR? Thanks, Hillary


That's right. There is no evidence against that assumption. For more on this test see https://www.tandfonline.com/doi/pdf/10.1080/01621459.1982.10477795?needAccess=true 


Thanks for your response! So I can say that my data is missing completely at random? Hillary 


Yes 


Ok, great! Thanks so much!! Hillary 
