
Eric Teman posted on Wednesday, June 01, 2011  2:21 am



Am I correct to say that MODEL MISSING: [x1-x3@-1.386294361]; will produce MCAR data with a missing-data probability of .20? That is, p = 1/(1+exp(1.386294361)) = .20. I am confused about what x1 ON x4*.4; will do. What specifically does the .4 do, and what is the relationship between logits and probabilities once the covariates are added?


Yes to your first question. x1 ON x4 is a logistic regression equation, and .4 is a slope in the logit metric. If x4 has no missing values, this results in MAR. It says that the probability of missing on x1 increases as x4 increases. This is in addition to the .20 missing rate. How much more missingness you get can be determined by looking at the generated data. The choice of a number is guided by trial and error.
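The logit-to-probability relationship described above can be sketched in a few lines of Python. This is only an illustration of the logistic formula, not Mplus output. It assumes the intercept is the negative logit -1.386294361 (the sign that reproduces p = .20), and the x4 values are arbitrary illustrative scores. The function name missing_probability is hypothetical.

```python
import math

def missing_probability(intercept, slope=0.0, x=0.0):
    """Probability of a missing value under a logistic regression model."""
    logit = intercept + slope * x
    return 1.0 / (1.0 + math.exp(-logit))

INTERCEPT = -1.386294361  # assumed sign: this logit corresponds to p = .20
SLOPE = 0.4               # the .4 from x1 ON x4*.4, in logit units

# With no covariate effect, the intercept alone sets the missing rate.
baseline = missing_probability(INTERCEPT)

# With the slope, the missing probability depends on each case's x4 score
# (the x4 values below are arbitrary, chosen only for illustration).
by_x4 = {x4: missing_probability(INTERCEPT, SLOPE, x4) for x4 in (-2, -1, 0, 1, 2)}

print(f"baseline: {baseline:.3f}")
for x4, p in by_x4.items():
    print(f"x4 = {x4:+d}: p(missing) = {p:.3f}")
```

The .20 rate holds exactly only for cases with x4 = 0. Cases with higher x4 are more likely to be missing, so the overall rate depends on how x4 is distributed.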

Eric Teman posted on Thursday, June 02, 2011  12:47 am



Hi Linda, Did you mean "this results in MCAR" instead of MAR? 


No, it is MAR. With MCAR, missingness cannot be related to an observed variable like x4.

Eric Teman posted on Thursday, June 02, 2011  2:49 am



OK, that makes sense. So if I want strictly MAR data, I would get rid of the [x1-x4@1] and just use the logistic regression, e.g., x1 ON x2*.4;?


I think you have a misunderstanding. MAR is more general than MCAR. If your data are MCAR, you can use MAR analysis. Using just [x1-x4@1] gives MCAR. Using both [x1-x4@1] and the ON statements gives MAR.
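The difference between the two setups can be checked with a small simulation. The Python sketch below is not Mplus code. It assumes x4 is standard normal, uses an intercept of -1.386294361 (a .20 baseline rate) and a slope of .4, and the helper name simulate_gap is hypothetical. It compares the mean of x4 among cases with and without a missing value.

```python
import math
import random

def simulate_gap(intercept, slope, n=20000, seed=1):
    """Return (missing rate, mean x4 among missing minus mean x4 among observed)."""
    rng = random.Random(seed)
    missing_x4, observed_x4 = [], []
    for _ in range(n):
        x4 = rng.gauss(0.0, 1.0)  # assumption: x4 is standard normal
        p_missing = 1.0 / (1.0 + math.exp(-(intercept + slope * x4)))
        if rng.random() < p_missing:
            missing_x4.append(x4)
        else:
            observed_x4.append(x4)
    rate = len(missing_x4) / n
    gap = sum(missing_x4) / len(missing_x4) - sum(observed_x4) / len(observed_x4)
    return rate, gap

# Intercept only: missingness is unrelated to x4 (MCAR).
mcar_rate, mcar_gap = simulate_gap(-1.386294361, 0.0)
# Intercept plus slope on x4: missingness depends on an observed variable (MAR).
mar_rate, mar_gap = simulate_gap(-1.386294361, 0.4)

print(f"MCAR: rate = {mcar_rate:.3f}, x4 gap = {mcar_gap:.3f}")
print(f"MAR:  rate = {mar_rate:.3f}, x4 gap = {mar_gap:.3f}")
```

With the intercept alone, the x4 gap should be close to zero. With the ON slope added, cases with missing values should have noticeably higher x4 on average, which is what makes the mechanism MAR rather than MCAR.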

Eric Teman posted on Monday, June 06, 2011  11:42 pm



Regarding the data imputation command for categorical data: Does this method suffer from the problems mentioned in the literature about not rounding ordinal variables? 


I think that critique applies when the variables are imputed as continuous and then truncated, which is not what Mplus does. Do you have in mind a specific reference dealing with this issue?

Eric Teman posted on Wednesday, June 08, 2011  1:30 am



Here is a snippet from Finch (2010): "Allison (2005), however, found that when using the MI method with dichotomous data, rounding could lead to estimation bias when calculating proportions. On the other hand, other researchers have shown that imputing ordinal data with 5 or more categories using MI yielded acceptable correlation estimation results when as much as 30% of the data were missing (Leite and Beretvas, 2004; Schafer, Khare, and Ezzati-Rice, 1993)." For example, I am using PROC MI in SAS to impute ordinal data for a 5-point Likert-type scale and having SAS round to the nearest whole number. Is this bad?


Mplus does not use any of these methods, where a continuous variable is rounded, so the critique does not apply. Mplus uses a multivariate probit regression for categorical data imputation. There is also no problem with the number of variables as has been mentioned in connection with the loglinear imputation model. 
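For readers wondering why rounding a continuous imputation can bias a proportion, here is a rough Python illustration. It is not how Mplus imputes and it does not reproduce the analysis in any of the cited papers. It assumes a deliberately simplified setup: a 0/1 variable with true proportion p is imputed from a normal distribution with the same mean and variance, with no covariates, and then rounded at 0.5. The function name proportion_after_rounding is hypothetical.

```python
from statistics import NormalDist

def proportion_after_rounding(p):
    """Expected share of imputed values that round to 1 when a 0/1 variable
    with true proportion p is imputed from N(p, p * (1 - p)) and cut at 0.5."""
    sd = (p * (1.0 - p)) ** 0.5
    return 1.0 - NormalDist(mu=p, sigma=sd).cdf(0.5)

for p in (0.05, 0.1, 0.2, 0.5):
    print(f"true p = {p:.2f}, after rounding = {proportion_after_rounding(p):.4f}")
```

Under these assumptions the rounded proportion matches the true one only at p = .5 and drifts away from it elsewhere. A method that imputes the categories directly never produces values that need rounding, so it avoids this particular problem.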

Eric Teman posted on Friday, June 17, 2011  11:47 pm



How would I interpret the following statement: MODEL MISSING: [x1-x3@5.293304825]; x1 ON x4*.1 x5*.1 x6*.1 x7*.1; x2 ON x4*.1 x5*.1 x6*.1 x7*.1; x3 ON x4*.1 x5*.1 x6*.1 x7*.1;


Using both [x1-x4@1] and the ON statements gives MAR.

Eric Teman posted on Tuesday, June 21, 2011  8:31 pm



To be clear, the ON statements can be used to regress the DVs on IVs, not just covariates, right? Can the term covariate be used interchangeably with IV? 

Eric Teman posted on Wednesday, June 22, 2011  2:40 am



I guess I should be clearer with my question. I am generating 7 ordinal variables to simulate responses on a survey. I am setting the missing data rate on 3 of those ordinal variables and using ON statements with the remaining 4 variables to simulate MAR data. Does it matter that all 7 variables are response variables and not covariates?


The terms covariate and independent variable are interchangeable. You can generate missingness as a function of dependent or independent variables. 
