
Jon Elhai posted on Wednesday, April 28, 2010  12:27 am



A couple questions about Mplus 6's ability to generate imputed datasets using MI... 1) What types of variables can be used in Mplus 6 for generating imputed datasets? Continuous and categorical variables? What about count variables? 2) Is the Bayes estimator robust to nonnormality? Or must I normalize variables before generating imputed datasets in Mplus 6? 


1. For now, imputation is available for continuous and categorical variables. 2. No. You should not normalize the variables. Instead you can use two-part modeling, mixture modeling, or treat them as categorical if they have no more than 10 values. 


I have tried to do a multiple imputation in Mplus 6 in which I impute 2 categorical variables. Everything seems to go well, but when I want to do further analyses with the imputed data, I get the following message for each data file: "Errors for replication with data file P:\Data\Radar\Eigen\SPSS\zdaadn23imp1.dat: *** ERROR Unexpected end of file reached in data file. Errors for replication with data file P:\Data\Radar\Eigen\SPSS\zdaadn23imp2.dat: *** ERROR Unexpected end of file reached in data file. etc.." What am I doing wrong? Thank you very much in advance! 

Thessa Wong posted on Thursday, May 20, 2010  12:22 pm



I just figured out what the problem was. I had defined too many variables in the command. However, that gives me a new problem: I want to use the imputations for analyses in which I also take into account other variables. I do not want these other variables to influence my data imputation, but it looks like including them in the data imputation is the only way to get them into the new dataset with the imputed data. In other words, how do I create a dataset with the imputed variables combined with other variables? 


See the AUXILIARY option in the user's guide. 

Jon Heron posted on Friday, May 21, 2010  5:56 am



Hi Thessa, you might want to read this: Graham, J. W. (2009). Missing data analysis: Making it work in the real world. Annual Review of Psychology, 60, 549-576. "The truth is that all variables in the analysis model must be included in the imputation model. The fear is that including the DV in the imputation model might lead to bias in estimating the important relationships (e.g., the regression coefficient of a program variable predicting the DV). However, the opposite actually happens. When the DV is included in the model, all relevant parameter estimates are unbiased, but excluding the DV from the imputation model for the IVs and covariates can be shown to produce biased estimates. The problem with leaving the DV out of the imputation model is this: When any variable is omitted from the model, imputation is carried out under the assumption that the correlation is r = 0 between the omitted variable and variables included in the imputation model. Thus, when the DV is omitted, the correlations between it and the IVs (and covariates) included in the model are all suppressed (i.e., biased) toward 0." 


That makes a lot of sense. Thank you very much for your thorough response! I have another question: is there any way to check the quality of the imputation? In the Mplus output, you get the averaged estimates from all the created data sets. I was wondering whether it is possible to get output for each of the individual data sets. That way I could see whether the estimates vary a lot across data sets. If they do, I guess the average estimate is not that reliable. Of course I can run the analyses separately for all the data sets, but there is probably a way that Mplus can do that for me. 

Jon Heron posted on Friday, May 21, 2010  10:11 am



Good question! One for Bengt, I think. I've been calling Mplus from Stata and running models on the individual datasets that way. It looks like you may be able to do this in R now too. 


The way to know how good the average parameter estimates are is to look at the standard errors of the parameter estimates. With imputation, these are computed using the average of the squared standard errors over the set of analyses and the between analysis parameter estimate variation (Rubin, 1987; Schafer, 1997). 
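To make the pooling described above concrete, here is a minimal Python sketch of Rubin's rules for a single parameter. The function name `pool_rubin` and the input numbers are hypothetical and chosen only for illustration; this is not Mplus code, and Mplus may differ in computational detail.

```python
import math


def pool_rubin(estimates, standard_errors):
    """Pool one parameter across m imputed-data analyses (Rubin's rules).

    estimates       -- the parameter estimate from each imputed data set
    standard_errors -- the matching standard error from each analysis
    Returns (pooled estimate, pooled standard error, between-imputation variance).
    """
    m = len(estimates)
    if m < 2 or m != len(standard_errors):
        raise ValueError("need at least two analyses and matching lengths")

    # Pooled point estimate: the simple average over imputations.
    pooled = sum(estimates) / m

    # Within-imputation variance: average of the squared standard errors.
    within = sum(se ** 2 for se in standard_errors) / m

    # Between-imputation variance: sample variance of the estimates.
    between = sum((q - pooled) ** 2 for q in estimates) / (m - 1)

    # Total variance adds the between part, inflated by (1 + 1/m).
    total = within + (1.0 + 1.0 / m) * between

    return pooled, math.sqrt(total), between


# Hypothetical estimates and standard errors from three imputed data sets.
est, se, between_var = pool_rubin([0.50, 0.54, 0.46], [0.10, 0.10, 0.10])
print(est, se, between_var)
```

If the estimates differ a lot across imputed data sets, the between-imputation variance grows and the pooled standard error grows with it, which is why the standard error already reflects how stable the averaged estimate is.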

Alvin Wee posted on Thursday, June 27, 2013  4:46 am



Hi Linda, I am using Mplus 6 and did some analyses with multiply imputed data. I understand that I should be able to get Mplus to produce between-imputation variances which, if my understanding is correct, will appear as a "% missing" column next to the parameters in the output. However, no matter what I tried, this does not show up. The commands that I use are as follows:

TITLE: Base Model
DATA: FILE = C:\Users\Alvin\Desktop\MI\Mplus.50implist.dat;
TYPE = IMPUTATION ;
VARIABLE: NAMES ARE TP1DEP TP2DEP TP3DEP TP1AN TP2AN TP3AN TP1STR TP2STR TP3STR TP1FDep TP2FDep TP3FDep TP1Sleep TP3Sleep TP1SupT ;
USEVARIABLES ARE TP1DEP-TP1SupT ;
ANALYSIS: TYPE = general ;
MODEL:
TP2DEP TP2AN TP2STR ON TP1DEP TP1AN TP1STR ;
TP3DEP TP3AN TP3STR ON TP2DEP TP2AN TP2STR ;
TP2DEP WITH TP2AN TP2STR ;
TP2AN WITH TP2STR ;
TP3DEP WITH TP3AN TP3STR ;
TP3AN WITH TP3STR ;
TP2DEP TP2AN TP2STR ON TP1FDep TP1Sleep TP1SupT TP2FDep ;
TP3DEP TP3AN TP3STR ON TP3FDep TP3Sleep ;
OUTPUT: STDYX ;

Many thanks, Alvin 

Alvin Wee posted on Thursday, June 27, 2013  12:05 pm



Got it. I upgraded to version 7. 
