Message/Author |
|
|
I'm running a simple regression with a single predictor called X. I want another variable, Z, included in the modeling, since the probability that X has a missing value is related to Z. With Z included, the missing data will satisfy the Missing at Random (MAR) assumption.
1. If Z is in my dataset, but not listed in the Model statement and not listed as an auxiliary variable, will its relation with X be considered so that the MAR assumption can be satisfied? Or must Z be included in the Auxiliary command with the m setting?
2. If I list Z on the Auxiliary command, does Mplus include it per Model 2 ("extra DV") or Model 3 ("saturated correlates") of Graham 2003, or in some other way? Is the way it's implemented in Mplus superior to Graham's Models 2 and 3?
3. Considering SEM models in general, is there ever a need to explicitly fit Graham's Models 2 or 3 rather than rely on the Auxiliary command with the m setting to bring the auxiliary variables I want considered into the modeling (so as to meet the MAR assumption)?
Thanks for your assistance! |
|
|
See our technical appendix document: http://www.statmodel.com/download/AuxM2.pdf
In other words:
1. Z must be included in the aux list with m.
2. Mplus uses the saturated correlates approach.
3. The Mplus approach with aux m is the saturated correlates approach, with the needed additional conveniences (see the document). |
|
|
See also my web talk on this topic at http://www.statmodel.com/webtalks.shtml |
|
|
Thanks for your reply and the links! The web talk was very clear, and the Monte Carlo simulation in it provided two good demonstrations -- both of the ease of doing simulations in Mplus and of the biases that occur when variables related to missingness are not included as auxiliary variables. |
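The kind of bias mentioned in this post can be sketched in plain Python. This is an illustrative setup only, not the web talk's simulation and not Mplus's estimator. It assumes y is missing whenever a fully observed correlate z exceeds 0 (so the data are MAR given z), and it uses a regression-based adjustment as a simple stand-in for full-information maximum likelihood with an auxiliary variable. The data-generating values and the function names (`slope`, `slopes_two`, `simulate`) are hypothetical.

```python
import random


def slope(x, y):
    """Slope from a simple least-squares regression of y on x."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    return sxy / sxx


def slopes_two(x, z, y):
    """Slopes (b_x, b_z) from a least-squares regression of y on x and z."""
    n = len(y)
    mx, mz, my = sum(x) / n, sum(z) / n, sum(y) / n
    sxx = sum((a - mx) ** 2 for a in x)
    szz = sum((c - mz) ** 2 for c in z)
    sxz = sum((a - mx) * (c - mz) for a, c in zip(x, z))
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    szy = sum((c - mz) * (b - my) for c, b in zip(z, y))
    det = sxx * szz - sxz ** 2
    return (szz * sxy - sxz * szy) / det, (sxx * szy - sxz * sxy) / det


def simulate(n=50000, seed=0, beta=0.6):
    """Return (complete-case slope ignoring z, slope recovered using z)."""
    rng = random.Random(seed)
    x = [rng.gauss(0, 1) for _ in range(n)]
    y = [beta * xi + rng.gauss(0, 0.8) for xi in x]
    z = [yi + rng.gauss(0, 1) for yi in y]  # z is a noisy correlate of y

    # Assumed missingness rule: y is unobserved whenever z > 0 (MAR given z).
    keep = [i for i in range(n) if z[i] <= 0]
    xc = [x[i] for i in keep]
    yc = [y[i] for i in keep]
    zc = [z[i] for i in keep]

    # Ignoring z: complete-case regression of y on x.
    naive = slope(xc, yc)

    # Using z: y given (x, z) is estimable from complete cases because
    # missingness depends only on z; z given x uses all cases.
    b_x, b_z = slopes_two(xc, zc, yc)
    gamma = slope(x, z)
    adjusted = b_x + b_z * gamma  # implied marginal slope of y on x
    return naive, adjusted


if __name__ == "__main__":
    naive, adjusted = simulate()
    print(f"complete-case slope ignoring z: {naive:.3f}")
    print(f"slope recovered using z:        {adjusted:.3f}")
```

Under these assumptions the complete-case slope falls noticeably below the generating value of 0.6, while the estimate that uses z lands close to it.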
|
|
I'm trying to conduct a simulation study related to auxiliary variables. For a simple bivariate regression (replicating some of Collins et al. 2001), I get different results when I use the "auxiliary" command versus when I specify the saturated correlates model manually. The following is my code using the auxiliary command:
    VARIABLE: NAMES = x y z;
    USEVAR = x y;
    MISSING = ALL(-999);
    AUXILIARY = (m) z;
    MODEL: y ON x*0.6;
which returns an average estimate of 0.5389 for the regression weight (note: I obtain an identical estimate when I exclude the auxiliary variable). The following is my model statement for the saturated correlates version of this model:
    MODEL: eta BY y; y@0;
    xi BY x; x@0;
    eta ON xi*0.6;
    aux BY z; z@0;
    aux WITH eta;
which returns an average estimate of 0.5945 for the regression weight. Can you see what I'm doing incorrectly? I should get the same results from the two approaches, right? |
|
|
One additional note on the above post. I realized there's a much more straightforward way to specify the saturated correlates model for the bivariate regression. I ran the following model MODEL: y ON x; z WITH x; z WITH y; and obtained the same results as the more complicated version in the previous post (avg. b = 0.5945), but still different results than if I use the "auxiliary" variable command. |
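For reference, the simpler specification in this post can be written out as equations. The notation (alpha, beta, epsilon, sigma) is introduced here for illustration and does not come from the thread. The reading of `z WITH y` as a covariance between z and the residual of y follows the usual Mplus convention for a dependent variable and should be treated as an assumption.

```latex
\begin{aligned}
y_i &= \alpha + \beta x_i + \varepsilon_i, \\
\operatorname{Cov}(z_i, x_i) &= \sigma_{zx} \quad \text{(free; } \texttt{z WITH x}\text{)}, \\
\operatorname{Cov}(z_i, \varepsilon_i) &= \sigma_{z\varepsilon} \quad \text{(free; } \texttt{z WITH y}\text{)}.
\end{aligned}
```

Counting parameters shows why this is a "saturated" treatment of z. The model has nine free parameters: the intercept, the slope, the residual variance of y, the mean and variance of x, the mean and variance of z, and the two covariances involving z. Three observed variables supply nine moments (three means, three variances, three covariances). The auxiliary variable therefore adds no restrictions, and with complete data the estimate of the slope would be the same with or without z.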
|
|
These differences are to be expected. This is discussed in the following paper which is on the website: Clark, S. & Muthén, B. (2009). Relating latent class analysis results to variables not included in the analysis. |
|
|
Thanks for your response. I'm not sure I understand, though, how the paper you mentioned relates to my problem. The paper seems to be about issues with auxiliary variables (i.e., distal outcomes) in a latent class framework rather than auxiliary variables related to missing data. Is there a connection that I'm not seeing? Also, if the difference in the two approaches is expected, do you have a sense of which approach is to be preferred? |
|
|
I didn't see that. Please send the two outputs that you think should agree but don't, along with your license number, to support@statmodel.com. |
|