Dear Professors ... I have a couple of quick questions regarding missing data analysis ...
My model is an SEM with a 5-point-scale categorical outcome indicator. In my final data set I don't have any missing covariates (i.e. no missing X's) ... missingness occurs only in the outcome indicator variable.
Following is my code ...
TITLE: response WITH MISSING
DATA: FILE IS d:\mpluspaper1_missing.txt;
VARIABLE: NAMES ARE X1-X19 Y1-Y4 XB1-XB6 XP1-XP9 R1-R9 B1-B11 T1-T4 MB MR MB3-MB5;
  USEVARIABLES ARE X2 X5 X7-X12 X15 Y1 R7-R9 MB3-MB5;
  CATEGORICAL ARE Y1 R7-R9 MB3-MB5;
  MISSING ARE .;
! M in model statement indicates missing dependent variables
MODEL: B by MB3-MB5;
  R by R7-R9;
  Y1 on B R X7-X12;
  B on R X2 X8 X9 X11 X15 X9 X10;
  R on B X5 X9 X10 X12;
OUTPUT: STANDARDIZED; SAMPSTAT;
Q1. How do I know exactly what Mplus is doing ... I mean the mathematics behind it, the way we can be sure about WLS(MV) once we read your papers (83, 84, 95, 97)? Actually, Professor, this is something I have to report in my thesis.
Q2. Where should I put "H1" in the ANALYSIS command? Mplus says that in order to get "sampstat" under "missing" I need to turn "H1" on.
Q3. Once we change the parameterization from theta to delta, the significance of the parameters changes ... why?
Q4. I guess my results will be better if we treat my missing data as non-ignorable. What changes are needed in my Mplus code to do that?
Actually, Professor ... apart from testing my model hypotheses I'm also checking three other things ... what happens to the overall fit of the model when "Don't Know" is 1. replaced by 0 (where don't know stands for no importance), 2. replaced by 3 (where don't know stands for the neutral point), 3. treated as a genuine missing value.
We have "don't know" on three indicator variables, which we represent as MB3-MB5 ... in our particular situation it's quite reasonable to assume that "don't know"/missingness is, or could be, a function of the X's, such as the respondent's demographic features. As you can see from our model statement, MB3-MB5 load on the latent factor B, which in turn is regressed on different X's.
Q1. The answers are in the Version 3 User's Guide (see e.g. chapter 1).
Q2. In the Analysis command, TYPE= ...H1;
Q3. Long story - see web note #4. Basically, this is in line with standardized slopes not having the same SEs as raw slopes.
Q4. See Q1 answer
The last questions are better put forth on SEMNET and discussed with your advisor.
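To make the Q2 answer concrete, here is a minimal sketch of an ANALYSIS command for the input posted above. It assumes the elided part of TYPE is MISSING (as the error message about "sampstat" under "missing" suggests) and that the WLSMV estimator is wanted. The PARAMETERIZATION line relates to Q3 and is illustrative only. Please check each option against the User's Guide for your version.

```
ANALYSIS:
  TYPE = MISSING H1;          ! H1 is listed in TYPE, after MISSING (assumed setting)
  ESTIMATOR = WLSMV;          ! assumption: the estimator named in Q1
  PARAMETERIZATION = THETA;   ! or DELTA (the default with WLSMV); see Q3
```

The ANALYSIS command would sit between the VARIABLE and MODEL commands of the input.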
Sanjoy posted on Wednesday, May 04, 2005 - 12:32 am
Thank you, Professor ... web note #4 is really helpful, and "H1" is working fine now ... regarding the User's Guide, it describes everything Mplus can do but not the logic behind the program ... I mean something like the way your articles explain things ... today I got two of your articles (the later one is a note) on missing data (#47 and #93) ... thanks to Maija ... I was in fact looking for articles like these two, especially No. 93, which helped me a lot to understand how non-ignorable missing data are handled in the latent variable framework.
bmuthen posted on Wednesday, May 04, 2005 - 12:38 am
With WLSMV and no exogenous observed variables (no "x's"), Mplus simply uses the pairwise present approach (see e.g. Little & Rubin's missing data book). With x's, missingness is allowed to be predicted by x's in the MAR sense of Little-Rubin.
Thank you, Professor ... I hope I'm now starting to grasp the issues behind missing data handling and its analysis.
Now, Professor ... with Mplus, unlike other software, we can do a great deal with missing data, particularly when we have multivariate dependent variables with categorical indicators ... to the best of my knowledge there is no other econometric software that can do this. However, there is one thing missing here, and that is imputation ... is there a statistical reason behind that? I mean, in your experience, have you found imputation techniques inefficient, or something like that?
If not ... then this is what I plan to do ... I'm going to use your WLSMV, since this is the only estimator that can handle my situation efficiently ... and I'm going to run it over 5 or 10 imputed data sets (though I suppose 5 is OK under moderate missingness).
I have three very quick questions
1. What is your advice ... should I go for it?
2. Are all ".dat" files the same in nature (like .dat in Mplus or in GAUSS)? I'm doing the imputation in GAUSS, and I have noticed that the ".dat" file GAUSS creates is some kind of encrypted file ... I can convert it back into a ".txt" file with GAUSS ... but I'm just wondering.
3. I now have five files ready (in ASCII/txt format) ... following your Example 12.13, how can I combine them so that I can run the imputation analysis? I read the page, but can't understand how one "FILE" command will take care of five files!
Oops, madam ... thanks for your suggestion ... but I couldn't run it ... this is what I have written ... I have made 5 imputed data sets saved on "D" ...
each data set has 240 rows and 8 columns
TITLE: imputation TEST
DATA: FILE IS d:\impute1.txt;
  FILE IS d:\impute2.txt;
  FILE IS d:\impute3.txt;
  FILE IS d:\impute4.txt;
  FILE IS d:\impute5.txt;
  TYPE=IMPUTATION;
  NOBSERVATIONS=240;
VARIABLE: NAMES ARE A1-A4 B1-B4;
  CATEGORICAL = B1-B4;
MODEL: a by A1-A4;
  b by B1-B4;
MPlus is saying "*** ERROR in Data command There are fewer NOBSERVATIONS entries than groups in the analysis."
I have also tried replacing 240 with 240*5=1200 in NOBSERVATIONS ... it gives the same error message.
Example 12.13 shows an input for multiple imputation. Please compare your input to that. The names of the five data sets should be in an external ASCII file not in the input file. The ASCII file with the names of the data sets is the file name that should be referenced in the FILE option.
Sanjoy posted on Saturday, May 07, 2005 - 10:35 pm
Sorry, madam, I'm still struggling with this ... in Mplus Example 12.13 it says "the FILE option of the DATA command is used to give the names of the multiple imputation data set to be analyzed. the file named using the FILE option of the DATA command must contain a list of the names of the multiple imputation data sets to be analyzed"
I tried it this way, which failed:
TITLE: imputation TEST
DATA: FILE IS d:\impute1.txt impute2.txt impute3.txt impute4.txt impute5.txt;
  TYPE=IMPUTATION;
MPlus is saying *** ERROR in Data command The file specified for the FILE option cannot be found. Check that this file exists: d:\impute1.txt d:\impute2.txt d:\impute3.txt d:\impute4.txt d:\i
Yet there are five data sets, and they do exist on the "d" drive ... to make sure, I ran them separately and they work.
I have ALSO tried putting ";" after each data set name; that did not work either.
How can I do this ... "The names of the five data sets should be in an external ASCII file not in the input file. The ASCII file with the names of the data sets is the file name that should be referenced in the FILE option." ... as you advised me earlier?
That is not what it says. Following is what it says: "The FILE option is used to give the name of the file that contains the names of the multiple imputation data sets to be analyzed." So the names should be in a file. You should not list all of the names using the FILE option. If you look at the example, there is one file name, imput.dat. The file imput.dat contains the names of the data sets and this is shown in the example.
This time I got it ... thank you so much for your advice and, of course, your patience :-)
Let me know, madam, if I'm still wrong.
(for the folks who are doing imputation for the first time)
1. Open a Notepad window.
2. Paste in the names of the five files you created through imputation, e.g. impute1.txt impute2.txt impute3.txt impute4.txt impute5.txt (do NOT include the directory name here, like d:\impute1.txt ... mine are on the "d" drive).
3. Save the file under a name, say "multiple", on the "d" drive and close the window, so in your command it will look like DATA: FILE IS d:\multiple.txt; TYPE=IMPUTATION;
(If you have a partitioned drive, like my "c" and "d", you can NOT save this file on one drive and keep the 5 imputed files on the other ...)
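The steps above can be sketched as two files. The file names and the "d" drive come from the posts in this thread. The one-name-per-line layout of the list file is an assumption based on the description of Example 12.13 quoted earlier. NOBSERVATIONS is left out on the assumption that the imputed files hold individual-level data.

Contents of d:\multiple.txt (the list file):

```
impute1.txt
impute2.txt
impute3.txt
impute4.txt
impute5.txt
```

Mplus input that points at the list file:

```
TITLE: imputation TEST
DATA: FILE IS d:\multiple.txt;   ! the list file, not the imputed data sets
  TYPE = IMPUTATION;
VARIABLE: NAMES ARE A1-A4 B1-B4;
  CATEGORICAL ARE B1-B4;
MODEL: a BY A1-A4;
  b BY B1-B4;
```

Repeating FILE IS once per data set, as in the first attempt, appears to be read as a multiple-group specification, which would explain the NOBSERVATIONS error about "groups" reported earlier.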