
Jo Brown posted on Monday, November 05, 2012  8:44 am



Hi Drs Muthen, I imported some data into Mplus from Stata to run SEM analyses. The data contain missing information, and in the output I can see the message "Number of missing data patterns 30". Is Mplus dealing with the missing data, or should I run SEM on complete cases only, by importing only cases with complete information on the variables of interest? Thanks


The default in Mplus is to use all available information. If you want listwise deletion, add LISTWISE=ON to the DATA command. 
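For illustration, a minimal input sketch showing where LISTWISE=ON goes. The file name, variable names, missing-value code (-999), and model are hypothetical; only the LISTWISE option of the DATA command comes from the reply above.

```
TITLE:    Sketch with hypothetical names and missing code
DATA:     FILE = mydata.dat;
          LISTWISE = ON;          ! drop cases with any missing value
VARIABLE: NAMES = y1-y4 x1 x2;
          MISSING = ALL (-999);   ! assumed numeric code used when exporting from Stata
MODEL:    f BY y1-y4;
          f ON x1 x2;
```

Leaving out the LISTWISE line keeps the default, which uses all available information.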

Jo Brown posted on Tuesday, November 06, 2012  2:59 am



Thanks Linda, Am I right in thinking that when I do not specify LISTWISE = ON, missing data are dealt with using pairwise deletion? For the purpose of preliminary analysis (I will impute missing data ultimately), is listwise deletion a better way to deal with missing data than pairwise?


See pages 7-8 of the user's guide for a brief description of how missing data are handled. It varies by the type of estimator that is used. I believe the Mplus default is the best way to deal with missing data and is preferable to listwise deletion.

Jo Brown posted on Tuesday, November 06, 2012  6:18 am



Thanks Linda, I am using WLSMV, so according to the manual, Mplus uses pairwise present analysis. Assuming my data are missing at random, can I use pairwise deletion? I am conscious that there is some debate, and I am hoping you could recommend some reading.


Pairwise present requires MCAR, not MAR. See the following paper, which is available on the website: Muthén, B., Kaplan, D. & Hollis, M. (1987). On structural equation modeling with data that are not missing completely at random. Psychometrika, 52(3), 431-462. You can also consider using multiple imputation.
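As a rough sketch of the multiple imputation route, an input along the following lines could generate imputed data sets. All file names, variable names, the missing code, and the number of data sets are hypothetical, and the exact options should be checked against the DATA IMPUTATION section of the user's guide.

```
TITLE:    Imputation sketch with hypothetical names
DATA:     FILE = mydata.dat;
VARIABLE: NAMES = u1-u4 x1 x2;
          MISSING = ALL (-999);       ! assumed missing-value code
DATA IMPUTATION:
          IMPUTE = u1-u4 (c) x1 x2;   ! (c) is meant to flag u1-u4 as categorical
          NDATASETS = 20;             ! arbitrary choice for illustration
          SAVE = imp*.dat;            ! one file per imputed data set
ANALYSIS: TYPE = BASIC;
```

The saved data sets can then be analyzed in a second run that reads the generated list file with TYPE = IMPUTATION in the DATA command.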

Jo Brown posted on Wednesday, November 07, 2012  2:02 am



Thanks Linda, Unfortunately I do not seem able to open the link to the paper; it says document not found. J 


I successfully opened it twice. Here is the link: http://www.statmodel.com/download/mkh1987.pdf 

Jo Brown posted on Wednesday, November 07, 2012  9:31 am



Thank you for the link; it worked! I tried opening from this link with no success: http://www.statmodel.com/missingdata.shtml


How did you get to that link? It's not the link associated with the paper. 

Jo Brown posted on Thursday, November 08, 2012  5:32 am



I googled: Muthén, B., Kaplan, D. & Hollis, M. (1987). On structural equation modeling with data that are not missing completely at random. Psychometrika, 52, 3, 431-462, and the first search hit directs me to the link above, which contains a list of papers on missing data including the paper you recommended. Anyway, the link you sent me worked. Thank you again!

Cecily Na posted on Wednesday, August 14, 2013  11:42 am



Dear Professor, I have four dummy variables as predictors and 20+ outcome variables. I did not put all the outcome variables in a single model, but organized them into separate groups, with each group containing more closely related outcome variables. I ended up with four or five models, each with four or five outcome variables and the same four dummy variables as predictors. My question: I used the same sample for all these models, but the model results showed different sample sizes across the models. What is the reason? By default, Mplus uses FIML only for missing outcome values, not for the predictors (exogenous variables). So as long as the predictors and sample data are the same, the sample size should be the same. Thank you in advance for clarification.


Please send an example of this and your license number to support@statmodel.com.

una posted on Wednesday, November 13, 2013  5:44 am



Dear Prof. Muthen, In my analysis with MLR, the dependent variable is a latent variable, while the independent variables are observed. I have 498 respondents. I am not specifying “listwise is on”. However, the final analysis keeps only 309 cases, with the warning “Data set contains cases with missing on x-variables. These cases were not included in the analysis. Number of cases with missing on x-variables: 189”. In this regard, I have two questions:
1. Is it correct to state: “Missing data on the dependent variables were treated with full information maximum likelihood (FIML). Missing data on independent variables were treated using listwise deletion”?
2. Why is Mplus not using all 498 cases?
Thank you very much in advance,


In regression, the model is estimated conditioned on the observed exogenous variables. Missing data theory applies only to endogenous dependent variables. If you mention the variances of all of your independent variables in the MODEL command, they will be treated as dependent variables and distributional assumptions will be made about them, but cases with missing values on them will not be excluded from the analysis.
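A minimal sketch of what mentioning the variances can look like in the MODEL command. The variable names (y1-y4, x1-x3) and the factor f are hypothetical stand-ins for the poster's model.

```
ANALYSIS: ESTIMATOR = MLR;
MODEL:    f BY y1-y4;    ! latent dependent variable
          f ON x1-x3;    ! regression on the observed independent variables
          x1-x3;         ! mentioning the variances brings x1-x3 into the model
```

With the last line included, cases with missing values on x1-x3 are kept, at the cost of making distributional assumptions about those variables.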
