Mplus Discussion >> Auxiliary command and Graham's models



Calvin D. Croy posted on Monday, July 27, 2009 - 10:03 am

I'm running a simple regression with a single predictor called X. I want another variable Z included in the modeling, since the probability that X has a missing value is related to Z. By including Z the missing data will satisfy the Missing at Random (MAR) assumption.

1. If Z is in my dataset, but not listed in the Model statement and not listed as an auxiliary variable, will its relation with X be considered so that the MAR assumption can be satisfied? Or must Z be included in the Auxiliary command with the m setting?

2. If I list Z on the Auxiliary command, does Mplus include it per the Model 2 ("extra DV") or Model 3 ("saturated correlates") of Graham 2003 or in some other way? Is the way it's implemented in Mplus superior to Graham's models 2 and 3?

3. Considering SEM models in general, is there ever a need to explicitly fit Graham's models 2 or 3 rather than rely on the Auxiliary command with the m setting to bring the auxiliary variables I want considered into the modeling (so as to meet the MAR assumption)?

Thanks for your assistance!

Bengt O. Muthen posted on Monday, July 27, 2009 - 10:21 am

See our technical appendix document

http://www.statmodel.com/download/AuxM2.pdf

In other words,

1. z must be included in the aux list with m

2. Mplus uses the saturated correlates approach

3. the Mplus approach with aux m is the saturated correlates approach, with the needed additional conveniences (see doc).

Bengt O. Muthen posted on Monday, July 27, 2009 - 11:12 am

See also my web talk on this topic at

http://www.statmodel.com/webtalks.shtml

Calvin D. Croy posted on Thursday, July 30, 2009 - 8:53 am

Thanks for your reply and the links! The web talk was very clear, and the Monte Carlo simulation in it provided two good demonstrations -- both of the ease of doing simulations in Mplus and of the biases that occur when variables related to missingness are not included as auxiliary variables.
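
The kind of bias mentioned here can be illustrated outside Mplus with a small simulation. The sketch below is not the simulation from the web talk; the sample size, coefficients, and missingness probabilities are all assumed values chosen for illustration. It makes X missing with a probability that depends only on Z, then compares the mean of X among complete cases with an estimate that uses Z (regress X on Z among complete cases, then evaluate the fitted line at the full-sample mean of Z). For this simple bivariate pattern, with Z fully observed, the regression-based estimate coincides with the normal-theory maximum likelihood estimate.

```python
import random
from statistics import fmean


def simulate_mean_of_x(n=50_000, seed=1):
    """Return (complete-case mean of x, estimate of the mean of x that uses z)."""
    rng = random.Random(seed)
    z = [rng.gauss(0.0, 1.0) for _ in range(n)]
    # x is correlated with z; its true mean is 0 (assumed values).
    x = [0.7 * zi + rng.gauss(0.0, 0.7) for zi in z]
    # MAR given z: x is missing with probability 0.8 when z > 0, else 0.1.
    seen = [rng.random() >= (0.8 if zi > 0 else 0.1) for zi in z]

    x_obs = [xi for xi, s in zip(x, seen) if s]
    z_obs = [zi for zi, s in zip(z, seen) if s]

    # Estimate 1: ignore z and average the observed x values.
    cc_mean = fmean(x_obs)

    # Estimate 2: regress x on z among complete cases, then plug in the
    # mean of z from the full sample (z is never missing).
    mz, mx = fmean(z_obs), fmean(x_obs)
    sxz = sum((zi - mz) * (xi - mx) for zi, xi in zip(z_obs, x_obs))
    szz = sum((zi - mz) ** 2 for zi in z_obs)
    slope = sxz / szz
    intercept = mx - slope * mz
    aux_mean = intercept + slope * fmean(z)
    return cc_mean, aux_mean


if __name__ == "__main__":
    cc_mean, aux_mean = simulate_mean_of_x()
    print(f"complete-case mean of x: {cc_mean:.3f}")
    print(f"mean of x using z:       {aux_mean:.3f}")
```

Under these assumed settings the complete-case mean is pulled well below the true value of 0, because cases with high Z (and therefore high X) are missing more often, while the estimate that uses Z stays close to 0.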

Shawn Bauldry posted on Sunday, October 24, 2010 - 5:25 pm

I'm trying to conduct a simulation study related to auxiliary variables. For a simple bivariate regression (replicating some of Collins et al. 2001), I get different results when I use the "auxiliary" command versus when I specify the saturated correlates model manually.

The following is my code using the auxiliary command:

VARIABLE: NAMES = x y z;
USEVAR = x y;
MISSING = ALL(-999);
AUXILIARY = (m) z;

MODEL: y ON x*0.6;

which returns an average estimate of 0.5389 for the regression weight (note, I obtain an identical estimate when I exclude the auxiliary variable).

The following is my model statement for the saturated correlates version of this model

MODEL:
eta BY y;
y@0;
xi BY x;
x@0;
eta ON xi*0.6;
aux BY z;
z@0;
aux WITH eta;

which returns an average estimate of 0.5945 for the regression weight.

Can you see what I'm doing incorrectly? I should get the same results from the two approaches, right?

Shawn Bauldry posted on Sunday, October 24, 2010 - 6:18 pm

One additional note on the above post. I realized there's a much more straightforward way to specify the saturated correlates model for the bivariate regression. I ran the following model

MODEL:
y ON x;
z WITH x;
z WITH y;

and obtained the same results as the more complicated version in the previous post (avg. b = 0.5945), but still different results than if I use the "auxiliary" variable command.
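
For readers who want to see what a saturated model of x, y, and z buys in a bivariate regression, here is a minimal sketch in Python rather than Mplus. It is not a reproduction of the runs above: the data-generating values, the missingness rule, and the assumption that only y has missing values are all made up for illustration. With x and z complete and y missing as a function of z, the maximum likelihood estimates for the unrestricted trivariate normal model have a closed form (regress y on x and z among complete cases, then combine with the full-sample moments of x and z), so no iterative estimator is needed.

```python
import random
from statistics import fmean


def cov(a, b):
    """Covariance with divisor n (the ML convention)."""
    ma, mb = fmean(a), fmean(b)
    return sum((ai - ma) * (bi - mb) for ai, bi in zip(a, b)) / len(a)


def simulate_slopes(n=50_000, seed=2, beta=0.6):
    """Return (complete-case slope, saturated-model ML slope) for y ON x."""
    rng = random.Random(seed)
    x = [rng.gauss(0.0, 1.0) for _ in range(n)]
    y = [beta * xi + rng.gauss(0.0, 0.8) for xi in x]
    # z is a noisy proxy for y, so it is related to both x and y (assumed).
    z = [yi + rng.gauss(0.0, 0.5) for yi in y]
    # MAR given z: y is missing with probability 0.8 when z > 0, else 0.1.
    seen = [rng.random() >= (0.8 if zi > 0 else 0.1) for zi in z]

    xo = [v for v, s in zip(x, seen) if s]
    yo = [v for v, s in zip(y, seen) if s]
    zo = [v for v, s in zip(z, seen) if s]

    # Slope that ignores z: ordinary regression on complete cases.
    cc_slope = cov(xo, yo) / cov(xo, xo)

    # Regression of y on (x, z) among complete cases via the normal equations.
    sxx, szz, sxz = cov(xo, xo), cov(zo, zo), cov(xo, zo)
    sxy, szy = cov(xo, yo), cov(zo, yo)
    det = sxx * szz - sxz ** 2
    b_x = (sxy * szz - szy * sxz) / det
    b_z = (szy * sxx - sxy * sxz) / det

    # Implied cov(x, y) / var(x), using full-sample moments of x and z.
    ml_slope = b_x + b_z * cov(x, z) / cov(x, x)
    return cc_slope, ml_slope


if __name__ == "__main__":
    cc_slope, ml_slope = simulate_slopes()
    print(f"slope ignoring z:          {cc_slope:.3f}")
    print(f"slope from saturated model: {ml_slope:.3f}")
```

Under these assumed settings the slope that ignores z is attenuated relative to the generating value of 0.6, while the slope implied by the saturated covariance matrix is close to it. Whether this accounts for the specific discrepancy reported above cannot be determined from the posts alone.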

Linda K. Muthen posted on Monday, October 25, 2010 - 12:03 pm

These differences are to be expected. This is discussed in the following paper which is on the website:

Clark, S. & Muthén, B. (2009). Relating latent class analysis results to variables not included in the analysis.

Shawn Bauldry posted on Monday, October 25, 2010 - 5:30 pm

Thanks for your response. I'm not sure I understand, though, how the paper you mentioned relates to my problem. The paper seems to be about issues with auxiliary variables (i.e., distal outcomes) in a latent class framework rather than auxiliary variables related to missing data. Is there a connection that I'm not seeing?

Also, if the difference in the two approaches is expected, do you have a sense of which approach is to be preferred?

Linda K. Muthen posted on Monday, October 25, 2010 - 6:04 pm

I didn't see that. Please send the two outputs that you think should agree but don't, along with your license number, to support@statmodel.com.