Ordered Polytomous data
 Sanjoy posted on Sunday, May 01, 2005 - 10:06 pm
Dear Professors, I have a couple of quick questions regarding missing data analysis.

Mine is an SEM with 5-point categorical outcome indicators. In my final data set I don't have any missing covariates (i.e. no missing X's); missingness occurs only in the outcome indicator variables.

Following is my code:

DATA: FILE IS d:\mpluspaper1_missing.txt;
VARIABLE:
  NAMES ARE X1-X19 Y1-Y4 XB1-XB6 XP1-XP9 R1-R9 B1-B11 T1-T4 MB MR MB3-MB5;

! The M prefix in the MODEL statement indicates dependent variables
! that have missing values

MODEL: B BY MB3-MB5;
       R BY R7-R9;
       Y1 ON B R X7-X12;
       B ON R X2 X8 X9 X11 X15 X10;
       R ON B X5 X9 X10 X12;


Q1. How do I know exactly what Mplus is doing? I mean the mathematics behind it, the way we can be sure about WLS(MV) once we read your papers (83, 84, 95, 97). Actually, Professor, this is something I have to report in my thesis.

Q2. Where should I put "H1" in the ANALYSIS command? Mplus says that in order to access "sampstat" under "missing" I need to turn "H1" on.

Q3. Once we change the parameterization from theta to delta, the significance of the parameters changes. Why?

Q4. I guess my results will be better if we treat my missing data as non-ignorable. What changes are necessary in my Mplus code to get that?

Actually, Professor, apart from testing my model hypotheses I'm also checking three other things: what happens to the overall fit of the model when we replace "Don't Know" by
1. 0 (where "don't know" stands for no importance)
2. 3 (where "don't know" stands for the neutral point)
3. a genuine missing value

We have "don't know" on those 3 indicator variables, which we represent as MB3-MB5. It's quite reasonable to assume that in our particular situation "don't know"/missingness is, or could be, a function of X, such as the respondent's demographic features. As you can see from our model statement, MB3-MB5 load on the latent factor B, which in turn is regressed on different X's.

Thanks in zillions, with Regards
 bmuthen posted on Tuesday, May 03, 2005 - 10:06 am
Q1. The answers are in the Version 3 User's Guide (see e.g. chapter 1).

Q2. In the Analysis command, TYPE= ...H1;

Q3. Long story - see web note #4. Basically, this is in line with standardized slopes not having the same SEs as raw slopes.

Q4. See Q1 answer

The last questions are better put forth on SEMNET and discussed with your advisor.
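
For Q2 and Q3, a minimal sketch of where these options sit in a Version 3 style input. It assumes the WLSMV estimator discussed in this thread; the particular combination of options is an illustration, not the poster's actual setup:

```mplus
ANALYSIS: TYPE = MISSING H1;         ! H1 is listed on TYPE, next to MISSING
          ESTIMATOR = WLSMV;
          PARAMETERIZATION = THETA;  ! DELTA is the default for WLSMV

OUTPUT:   SAMPSTAT;                  ! the sample statistics that H1 makes available
```

Running the same model once with THETA and once without the PARAMETERIZATION line (i.e. DELTA) is one way to see the change in standard errors that Q3 asks about.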
 Sanjoy posted on Tuesday, May 03, 2005 - 6:32 pm
Thank you, Professor. Web note #4 is really helpful, and "H1" is working fine now. Regarding the User's Guide, it describes everything Mplus can do but not the logic behind the program, I mean the way your articles explain things. Today I got two of your articles on missing data (#47 and #93; the latter is a note), thanks to Maija. I was in fact looking for articles like these two, especially No. 93, which helped me a lot in understanding how we deal with non-ignorable missing data in the latent variable framework.

With regards
 bmuthen posted on Tuesday, May 03, 2005 - 6:38 pm
With WLSMV and no exogenous observed variables (no "x's"), Mplus simply uses the pairwise present approach (see e.g. Little & Rubin's missing data book). With x's, missingness is allowed to be predicted by x's in the MAR sense of Little-Rubin.

With ML, regular MAR is used.
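
Written in symbols, the two cases can be sketched as follows. The notation is an assumption introduced here, not taken from the posts: $r$ is the vector of missingness indicators, $y_{obs}$ and $y_{mis}$ are the observed and missing parts of the outcomes, and $x$ holds the covariates.

```latex
% ML under regular MAR: missingness may depend on the observed outcomes and on x
P(r \mid y_{obs}, y_{mis}, x) = P(r \mid y_{obs}, x)

% WLSMV with x's: missingness may depend on x, but not on the outcomes
P(r \mid y_{obs}, y_{mis}, x) = P(r \mid x)
```

Read this way, the WLSMV condition is the more restrictive of the two, since it rules out dependence on observed outcomes as well as on missing ones.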
 Sanjoy posted on Thursday, May 05, 2005 - 8:45 pm
Thank you, Professor. I hope I am now starting to grasp the issues behind missing data handling and its analysis.

Now, Professor, with Mplus, unlike other software, we can do a great deal with missing data, particularly when we have multivariate dependent variables with categorical indicators. To the best of my knowledge, no other econometric software can do such things. However, there is one thing missing here, and that is imputation. Is there a statistical reason behind this? I mean, in your experience, do you find imputation techniques inefficient or something like that?

If not, this is what I plan to do: I'm going to use your WLSMV, since it is the only estimator that can estimate my model efficiently, and I'm going to run it over 5 or 10 imputed data sets (though I suppose 5 is OK with a moderate amount of missingness).

I have three very quick questions

1. What is your advice ... should I go for

2. Are all ".dat" files the same in nature (e.g. .dat in Mplus and in GAUSS)? I'm doing the imputation in GAUSS, and I have noticed that the ".dat" file GAUSS creates is some kind of encrypted file. I can convert it back into a ".txt" file with GAUSS, but I'm just wondering.

3. I now have five files ready (in ASCII/txt format). Following your Example 12.13, how can I combine them so that I can run the analysis on the imputed data? I read the page, but I can't understand how one FILE option can take care of five files!

Thanks and regards
 Linda K. Muthen posted on Friday, May 06, 2005 - 7:09 am
Imputation and maximum likelihood estimation for missing data are asymptotically equivalent.

1. You should check the literature for the number of imputed data sets to use.

2. I don't know if all .dat files are the same.

3. Look up IMPUTATION in the index of the Mplus User's Guide. It shows how the file should look.
 Sanjoy posted on Friday, May 06, 2005 - 4:48 pm
Oops, madam, thanks for your suggestion, but I couldn't get it to run. This is what I have written. I have made 5 imputed data sets, saved on the "D" drive.

Each data set has 240 rows and 8 columns.

TITLE: imputation TEST
DATA: FILE IS d:\impute1.txt;
FILE IS d:\impute2.txt;
FILE IS d:\impute3.txt;
FILE IS d:\impute4.txt;
FILE IS d:\impute5.txt;

MODEL: a by A1-A4;
b by B1-B4;

Mplus is saying
"*** ERROR in Data command
There are fewer NOBSERVATIONS entries than groups in the analysis."

I have tried replacing 240 with 240*5=1200 in NOBSERVATIONS; it gives the same error message.

Could you suggest the correct input, please?

thanks and regards
 Linda K. Muthen posted on Friday, May 06, 2005 - 5:12 pm
Example 12.13 shows an input for multiple imputation. Please compare your input to that. The names of the five data sets should be in an external ASCII file not in the input file. The ASCII file with the names of the data sets is the file name that should be referenced in the FILE option.
 Sanjoy posted on Saturday, May 07, 2005 - 4:35 pm
Sorry, madam, I'm still struggling with this. Mplus Example 12.13 says: "the FILE option of the DATA command is used to give the names of the multiple imputation data set to be analyzed. the file named using the FILE option of the DATA command must contain a list of the names of the multiple imputation data sets to be analyzed"

I have tried it this way, which failed:

TITLE: imputation TEST
DATA: FILE IS d:\impute1.txt

Mplus is saying
*** ERROR in Data command
The file specified for the FILE option cannot be found. Check that this
file exists: d:\impute1.txt d:\impute2.txt d:\impute3.txt d:\impute4.txt d:\i

while the five data sets do exist on the "d" drive. To make sure, I ran them separately and they work.

I have ALSO tried putting ";" after each data set name; that did not work either.

How can I do this: "The names of the five data sets should be in an external ASCII file not in the input file. The ASCII file with the names of the data sets is the file name that should be referenced in the FILE option.", as you advised me earlier?

Thanks for your patience. Regards, Sanjoy
 Linda K. Muthen posted on Saturday, May 07, 2005 - 5:05 pm
That is not what it says. Following is what it says: "The FILE option is used to give the name of the file that contains the names of the multiple imputation data sets to be analyzed." So the names should be in a file. You should not list all of the names using the FILE option. If you look at the example, there is one file name, imput.dat. The file imput.dat contains the names of the data sets and this is shown in the example.
 Sanjoy posted on Saturday, May 07, 2005 - 6:51 pm
This time I got it. Thank you so much for your advice and, of course, your patience :-)

Let me know, madam, if I'm still wrong.

(For folks who are doing imputation for the first time)

1. Open a Notepad window.

2. Paste the names of the five files that you created through imputation, e.g. impute1.txt rather than d:\impute1.txt (do NOT include the directory here; mine is the "d" drive).

3. Save the file under a name, say "multiple", on the "d" drive and close the window, so in your command it will look like
DATA: FILE IS d:\multiple.txt;

(Now, if you have a partitioned drive like I have, "c" and "d", you can NOT save this file on one drive and keep those 5 imputed files on the other drive.)
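
Put together, the steps above might look like the following sketch. The list file multiple.txt sits in the same folder as the five imputed data sets and contains nothing but their names, one per line:

```mplus
impute1.txt
impute2.txt
impute3.txt
impute4.txt
impute5.txt
```

The input file then names the list file in the FILE option and declares, with TYPE = IMPUTATION, that it lists imputed data sets. The variable names are an assumption: they are taken from the MODEL statement posted earlier in the thread and suppose that the eight columns are A1-A4 followed by B1-B4.

```mplus
TITLE:    imputation TEST
DATA:     FILE IS d:\multiple.txt;    ! the list file, not a data set
          TYPE = IMPUTATION;
VARIABLE: NAMES ARE A1-A4 B1-B4;      ! assumed column order
MODEL:    a BY A1-A4;
          b BY B1-B4;
```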
 Linda K. Muthen posted on Sunday, May 08, 2005 - 6:44 am
Looks correct.
 Huang Wu posted on Saturday, June 09, 2018 - 2:52 pm
The answers are insightful. I am still wondering whether all the imputed data sets should be included in one file, with the file names used to separate them?
 Linda K. Muthen posted on Saturday, June 09, 2018 - 6:15 pm
Each imputation data set should be in a separate file. The names of the data sets are listed in a file that is specified using the FILE option. See Example 13.13 in the user's guide.