MAR vs. MCAR
 Eric Teman posted on Tuesday, May 31, 2011 - 8:21 pm
Am I correct to say
MODEL MISSING:
[x1-x3@-1.386294361];
will produce MCAR data with a missingness probability of .20? That is, p = 1/(1 + exp(1.386294361)) = .20.

I am confused about what X1 ON X4*.4; will do. What specifically does the .4 do, and what is the relationship between logits and probabilities once the covariates are added?
 Linda K. Muthen posted on Wednesday, June 01, 2011 - 10:51 am

x1 ON x4 is a logistic regression equation. The .4 is in a logit metric. If x4 has no missing values, this results in MAR. It says that the probability of missing on x1 increases as x4 increases. This is in addition to the .20 missing rate. How much more missing is obtained can be determined by looking at the generated data. The choice of a number is guided by trial and error.
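
The logit-to-probability relationship described above can be sketched in Python (not Mplus). This is a minimal illustration, assuming the missingness model is logit(P(missing)) = intercept + slope * x4; the function name and the x4 values of -1, 0, and 1 are chosen only for the example.

```python
import math

def missing_probability(intercept, slope=0.0, x=0.0):
    """Assumed model: logit(P(missing)) = intercept + slope * x."""
    logit = intercept + slope * x
    return 1.0 / (1.0 + math.exp(-logit))

INTERCEPT = -1.386294361  # from [x1-x3@-1.386294361]
SLOPE = 0.4               # from x1 ON x4*.4

# Illustrative x4 values (an assumption, not from the thread).
for x4 in (-1.0, 0.0, 1.0):
    p = missing_probability(INTERCEPT, SLOPE, x4)
    print(f"x4 = {x4:+.1f}  P(missing on x1) = {p:.3f}")
```

At x4 = 0 the probability is the .20 baseline; under this assumed model it falls to roughly .14 at x4 = -1 and rises to roughly .27 at x4 = 1. The overall missing rate in generated data depends on the distribution of x4, which is why it is checked empirically.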
 Eric Teman posted on Wednesday, June 01, 2011 - 6:47 pm
Hi Linda,

Did you mean "this results in MCAR" instead of MAR?
 Linda K. Muthen posted on Wednesday, June 01, 2011 - 8:09 pm
No, it is MAR. With MCAR, missingness cannot be related to an observed variable like x4.
 Eric Teman posted on Wednesday, June 01, 2011 - 8:49 pm
OK, that makes sense. So if I want strictly MAR data, I would get rid of the [x1-x4@-1] and just use the logistic regression, e.g., x1 ON x2*.4;?
 Linda K. Muthen posted on Thursday, June 02, 2011 - 9:44 am
I think you have a misunderstanding. MAR is more general than MCAR. If your data are MCAR, you can use MAR analysis.

Using just [x1-x4@-1] gives MCAR. Using both [x1-x4@-1] and the ON statements gives MAR.
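
The distinction can be illustrated with a small simulation in Python (not Mplus). This is a sketch under stated assumptions: x4 is drawn as standard normal, the intercept is the -1.386294361 from earlier in the thread rather than -1, and the helper names are hypothetical. With a slope of 0 (intercept only) the missing rate on x1 should be about the same for low and high x4; with a slope of .4 it should be higher when x4 is high.

```python
import math
import random

def simulate_missing(n, intercept, slope, seed=0):
    """Return (x4, is_missing) pairs; x4 ~ N(0, 1) is an assumption."""
    rng = random.Random(seed)
    rows = []
    for _ in range(n):
        x4 = rng.gauss(0.0, 1.0)
        p = 1.0 / (1.0 + math.exp(-(intercept + slope * x4)))
        rows.append((x4, rng.random() < p))
    return rows

def missing_rate(rows, keep):
    """Share of missing x1 values among rows whose x4 satisfies keep()."""
    flags = [miss for x4, miss in rows if keep(x4)]
    return sum(flags) / len(flags)

results = {}
for label, slope in (("intercept only (MCAR)", 0.0), ("intercept + ON (MAR)", 0.4)):
    rows = simulate_missing(100_000, -1.386294361, slope)
    low = missing_rate(rows, lambda x: x < 0)
    high = missing_rate(rows, lambda x: x >= 0)
    overall = missing_rate(rows, lambda x: True)
    results[label] = (low, high, overall)
    print(f"{label}: x4<0 {low:.3f}  x4>=0 {high:.3f}  overall {overall:.3f}")
```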
 Eric Teman posted on Monday, June 06, 2011 - 5:42 pm
Regarding the data imputation command for categorical data: Does this method suffer from the problems mentioned in the literature about not rounding ordinal variables?
 Bengt O. Muthen posted on Tuesday, June 07, 2011 - 7:34 am
I think that critique applies when imputing the variables as continuous, followed by truncation, which is not what Mplus does. Do you have in mind a specific reference dealing with this issue?
 Eric Teman posted on Tuesday, June 07, 2011 - 7:30 pm
Here is a snippet from Finch (2010): "Allison (2005), however, found that when using the MI method with dichotomous data, rounding could lead to estimation bias when calculating proportions. On the other hand, other researchers have shown that imputing ordinal data with 5 or more categories using MI yielded acceptable correlation estimation results when as much as 30% of the data were missing (Leite and Beretvas, 2004; Schafer, Khare, and Ezzati-Rice, 1993)."

For example, I am using proc mi in SAS to impute ordinal data for a 5-point Likert-type scale and having SAS round to the nearest whole number. Is this bad?
 Bengt O. Muthen posted on Wednesday, June 08, 2011 - 5:26 pm
Mplus does not use any of these methods, where a continuous variable is rounded, so the critique does not apply. Mplus uses a multivariate probit regression for categorical data imputation. There is also no problem with the number of variables as has been mentioned in connection with the loglinear imputation model.
 Eric Teman posted on Friday, June 17, 2011 - 5:47 pm
How would I interpret the following statement:

MODEL MISSING:
[x1-x3@-5.293304825];
x1 ON x4*.1 x5*.1 x6*.1 x7*.1;
x2 ON x4*.1 x5*.1 x6*.1 x7*.1;
x3 ON x4*.1 x5*.1 x6*.1 x7*.1;
 Linda K. Muthen posted on Friday, June 17, 2011 - 5:55 pm
Using both [x1-x4@-1] and the ON statements gives MAR.
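
For the specific numbers in the question, the arithmetic can be sketched in Python (not Mplus). This assumes the same logistic form as above, logit(P(missing)) = intercept + the sum of slope * covariate. The covariate values used here (all 0, and all 5 as on a 5-point scale) are illustrative assumptions, since the thread does not say how x4-x7 are scaled.

```python
import math

INTERCEPT = -5.293304825  # from [x1-x3@-5.293304825]
SLOPES = {"x4": 0.1, "x5": 0.1, "x6": 0.1, "x7": 0.1}  # from the ON statements

def missing_probability(covariates):
    """Assumed model: logit(P(missing)) = intercept + sum(slope * value)."""
    logit = INTERCEPT + sum(SLOPES[name] * value for name, value in covariates.items())
    return 1.0 / (1.0 + math.exp(-logit))

# Baseline: every covariate at 0 (illustrative).
baseline = missing_probability({name: 0.0 for name in SLOPES})

# Each slope of .1 multiplies the odds of missingness by exp(.1) per unit.
odds_ratio = math.exp(0.1)

# Every covariate at 5, e.g. the top of a 5-point scale (illustrative).
all_high = missing_probability({name: 5.0 for name in SLOPES})

print(f"baseline P(missing) = {baseline:.4f}")
print(f"odds ratio per unit = {odds_ratio:.3f}")
print(f"P(missing), all covariates at 5 = {all_high:.4f}")
```

Under these assumptions the intercept alone corresponds to a missing rate of about .005, and each one-unit increase in a covariate raises the odds of missingness by roughly 10%, so the realized rate depends heavily on the covariate values.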
 Eric Teman posted on Tuesday, June 21, 2011 - 2:31 pm
To be clear, the ON statements can be used to regress the DVs on IVs, not just covariates, right?

Can the term covariate be used interchangeably with IV?
 Eric Teman posted on Tuesday, June 21, 2011 - 8:40 pm
I guess I should be clearer with my question. I am generating 7 ordinal variables to simulate responses on a survey. I am setting the missing data rate on 3 of those ordinal variables and using ON statements with the remaining 4 variables to simulate MAR data. Does it matter that all 7 variables are response variables and not covariates?
 Linda K. Muthen posted on Wednesday, June 22, 2011 - 9:42 am
The terms covariate and independent variable are interchangeable.

You can generate missingness as a function of dependent or independent variables.