Auxiliary function of missing data
Mplus Discussion > Missing Data Modeling >
 Xu, Man posted on Tuesday, August 05, 2008 - 11:06 am
Sorry if I multi-posted this.
I just experimented with the auxiliary function in Mplus 5.1. The results are quite similar to FIML. I am not sure whether the auxiliary variables are handled with respect to the clustering effect when the data have a clustered structure. I used the complex design function to account for the cluster effect.
Can I have more information regarding this function please?
 Bengt O. Muthen posted on Tuesday, August 05, 2008 - 3:12 pm
Yes, Type=Complex is in operation when you use aux(m).
 Xu, Man posted on Thursday, August 07, 2008 - 9:26 am
Thanks! I didn't totally understand the technical appendix for this function. But if I understood correctly, this function implements the method from Graham (2003), right? My confusion is that this paper doesn't specify how this works for multilevel data. How does Mplus take this into account when Type=Complex is in operation with aux(m)?

Graham, J. W. (2003). Adding missing-data-relevant variables to FIML-based structural equation models. Structural Equation Modeling, 10(1), 80-100.
 Bengt O. Muthen posted on Thursday, August 07, 2008 - 9:43 am
Aux(m) still uses maximum-likelihood estimation, just with an extended set of variables. Type=Complex adjusts for complex sample features just like with other ML estimation - so there is no extra difficulty when aux(m) is added. Type=twolevel is another matter.
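
A minimal sketch of the setup described above, combining aux(m) with Type=Complex. It is not taken from this thread: the file name, all variable names (y1-y3, x, a1, a2, clus), the missing-value code, and the model are hypothetical placeholders chosen only for illustration.

```
TITLE:    Sketch only: aux(m) with TYPE=COMPLEX (hypothetical names)
DATA:     FILE = example.dat;          ! hypothetical data file
VARIABLE: NAMES = y1 y2 y3 x a1 a2 clus;
          USEVARIABLES = y1 y2 y3 x;   ! analysis variables only
          AUXILIARY = (m) a1 a2;       ! missing-data correlates, not in the model
          CLUSTER = clus;              ! cluster identifier
          MISSING = ALL (-999);        ! assumed missing-value code
ANALYSIS: TYPE = COMPLEX;              ! adjusts for the clustered sample
          ESTIMATOR = MLR;
MODEL:    f BY y1 y2 y3;
          f ON x;
```

The auxiliary variables are listed only on AUXILIARY, not on USEVARIABLES, so they extend the set of variables used in estimation without becoming part of the substantive model.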
 Xu, Man posted on Thursday, August 07, 2008 - 10:40 am
I see. Thanks for the reply! How is it another matter when Type=twolevel? (So sorry if my question is a bit "idiotic"...)
 Bengt O. Muthen posted on Thursday, August 07, 2008 - 6:17 pm
Well, then you have to make sure that the "saturated correlates" approach is applied correctly to the two levels - this has not been explicated in the literature. May be straightforward, but...
 Andy Ross posted on Wednesday, February 11, 2009 - 1:57 am

Following on from your last post, are there any plans to make aux(m) available for type=twolevel or type=complex?

Many thanks!
 Andy Ross posted on Wednesday, February 11, 2009 - 1:59 am
Apologies, in my last post I meant type=mixture, not type=complex.
 Linda K. Muthen posted on Wednesday, February 11, 2009 - 10:20 am
We have no plans to do this in the immediate future.
 Guillaume Filteau posted on Tuesday, April 21, 2009 - 2:15 pm
Is it possible to estimate the variance of an auxiliary covariate?

Would that result in using all individuals for the analysis, even if they are missing the auxiliary variable?

 Linda K. Muthen posted on Wednesday, April 22, 2009 - 9:36 am
No, only the mean and the standard error of the mean are given.
 Sara May posted on Sunday, October 25, 2009 - 10:27 am
Is it possible to specify both binary variables as well as continuous variables as missing data correlates?

The following syntax doesn't work if I specify the dummy variables as categorical:

auxiliary = (m) dummy1 dummy2 continuous1 continuous2;

 Linda K. Muthen posted on Sunday, October 25, 2009 - 11:17 am
This is for continuous variables only.
 Maren Winkler posted on Monday, May 10, 2010 - 3:08 am

I've estimated a SEM and didn't have problems with fit, SE etc. Now I've added auxiliary variables (aux (m)) and got the following message:


I have a model as follows:

F1 BY a b c;
F2 BY x y z;


d ON F1;


(the variable RESID is specified in order to allow an additional path from the residuum of variable d on F2 over and above the loading of F1 - on which d loads, too).
The correlation of d and RESID is .906 - which is no surprise given the model.

I'm confused because the model as above worked fine without auxiliary variables.

thanks for your help!
 Linda K. Muthen posted on Monday, May 10, 2010 - 8:04 am
I think you mean:

 Carolyn CL posted on Thursday, December 13, 2012 - 9:12 am
Hello Dr. Muthen,

I was wondering if there is a way to make the AUXILIARY = (m)x; function work when some of the dependent variables are categorical?

If I treat my variables as continuous and run the (m)x function and compare this model to a saturated correlate model, the results tend to be quite similar in terms of fit indices (CFI, RMSEA), effect sizes and significance. However I am not comfortable treating these categorical variables as continuous because of their non-normal distributions.

Would very much appreciate your suggestions.

 Linda K. Muthen posted on Thursday, December 13, 2012 - 9:43 am
AUXILIARY (m) is available only for continuous variables.
 Karen-Inge Karstoft posted on Monday, January 07, 2013 - 3:55 am

Given that it is not possible to use aux (m) with TYPE=MIXTURE, is there any other way to account for variables predicting missingness without including them in the model?

 Linda K. Muthen posted on Monday, January 07, 2013 - 10:58 am
You could use multiple imputation. See DATA IMPUTATION.
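
A minimal sketch of what the imputation step might look like, assuming continuous variables throughout. It is not taken from this thread: the file names, variable names (y1-y4, a1, a2), the missing-value code, and the number of data sets are hypothetical.

```
TITLE:    Sketch only: imputation that includes missingness predictors
DATA:     FILE = example.dat;          ! hypothetical data file
VARIABLE: NAMES = y1-y4 a1 a2;
          USEVARIABLES = y1-y4 a1 a2;  ! a1, a2 predict missingness
          MISSING = ALL (-999);        ! assumed missing-value code
DATA IMPUTATION:
          IMPUTE = y1-y4 a1 a2;        ! variables whose missing values are imputed
          NDATASETS = 20;              ! number of imputed data sets
          SAVE = imp*.dat;             ! the asterisk is replaced by the data set number
ANALYSIS: TYPE = BASIC;                ! impute without fitting a substantive model
```

The saved data sets can then be analyzed in a second run with TYPE = IMPUTATION in the DATA command, using only the model variables. The variable order in the saved files should be taken from the output of the imputation run.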
 Chris Blanchard posted on Friday, January 11, 2013 - 11:23 am
I'm new to Mplus and wanted to confirm the following. I have a repeated-measures design with 4 time points (steps per day at the end of CR, 3 mo after CR, 6 mo after CR, 9 mo after CR). In some preliminary analyses, I found that Diag_Rec (categorical), BMI, and frst_ev (categorical) were related to "missingness" at follow-up assessments. So, I just want to be sure that these variables are included in the model. Do I use the auxiliary function to do so in the analysis below? If not, do I have to use multiple imputation instead? Thanks!

NAMES ARE id Diag_Rec BMI frst_ev t2_steps t3_steps t4_steps t5_steps;

USEV ARE t2_steps t3_steps t4_steps t5_steps;

MISSING ARE t2_steps(999) t3_steps(999) t4_steps(999) t5_steps(999);

AUXILIARY = Diag_Rec BMI frst_ev;

CLASSES = c(2);

STARTS 100 5;

i s | t2_steps@-3 t3_steps@-1 t4_steps@1 t5_steps@3;
 Bengt O. Muthen posted on Friday, January 11, 2013 - 4:56 pm
I assume these 3 variables don't have a substantive role in your growth model - if they do, just include them in the model. So if not, either approach you mention is fine in principle. You have to specify Auxiliary Missing, not just Auxiliary (see UG). But I think we have not yet developed Aux Missing for Mixtures, so that won't work. Multiple imputation is possible, although you assume a 2-class model so regular unrestricted (1-class) imputation is a bit off, but probably better than not using those variables.
 Maren Formazin posted on Wednesday, February 13, 2013 - 8:04 am

I have data from 2326 subjects. For the model that I'm interested in, I use 7 items. There are 8 subjects with complete missings on these 7 items.

If I estimate my model without the AUXILIARY (M) command, Mplus warns me there are 8 cases with complete missings and tells me N = 2318.

However, if I estimate my model with the AUXILIARY (M) command with my data, Mplus warns me "1 case with complete missing data" and tells me N = 2325.

This doesn't make sense to me - there are indeed 8 subjects with complete missing data on those 7 items that constitute my model; however, all 8 participants have non-missing data on the auxiliary variables.

What can have gone wrong here?

 Linda K. Muthen posted on Wednesday, February 13, 2013 - 3:53 pm
Please send the two outputs, data, and your license number to
 Carolyn CL posted on Tuesday, March 19, 2013 - 12:11 pm
Hello Dr. Muthen,

I am running a saturated correlate structural equation model with socio-economic status as the auxiliary variable and dummies representing poverty trajectories (3 dummies, 4th reference category excluded) as independent variables predicting weight status (3 categories: normal, overweight, obese).

When I run a basic model (N = 1230):

Weight ON d_low d_inc d_dec;

The model runs fine.

When I add the auxiliary variable (N = 2120):

Weight ON d_low d_inc d_dec;

SES WITH Weight d_low d_inc d_dec;

The coefficients of the dummy variables are comparable, but the standard errors are inflated, and one sig. effect becomes ns.

When I run the full model (with additional independent and dependent variables) including the auxiliary variable SES, the model fails to converge. Increasing the number of iterations does not solve the problem. I can however run the model without the auxiliary and the model converges, but I obviously lose part of my sample.

Any idea why the full model with the auxiliary will not converge?

 Linda K. Muthen posted on Tuesday, March 19, 2013 - 12:17 pm
I would have no idea. You would need to send the outputs and your license number to for further information.
 Carolyn CL posted on Thursday, March 21, 2013 - 8:46 am
I found the problem - the model was by default allowing the dummy variables to correlate with each other once I added the auxiliary variable. Restricting the correlations to 0 allows the model to converge and provides the expected results.

Many thanks!
 Michael J. Kieffer posted on Sunday, February 23, 2014 - 11:59 am
Imagine a model-building process with a dataset that has missing data and a simple model that includes:
AUXILIARY = (m) c1 c2;
Model: y ON x1;

Then, I want to know if c1 should be a control, so I do:
AUXILIARY = (m) c2;
Model: y ON x1 c1;

In such a case, I feel like it's important to list c1 as auxiliary in the first model, because if I didn't, it seems like the two models would be fitted to different variance-covariance matrices (i.e., b/c one incorporates c1 in FIML and the other doesn't). I'm not talking about comparing the goodness of fit, but just a less formal model-building process where I'm trying to decide which covariates to include based on their z-stats. For instance, in a scenario in which I have a large number of covariates to consider, I systematically try different covariates but make sure to include them either in AUXILIARY or USEVARIABLES. Or a scenario where I want to present effect sizes for a question predictor with two different combinations of covariates.

Does this approach make sense, or am I worrying too much about leaving c1 out of both AUX and USEVARIABLES in one model and then including it in USEVARIABLES in a later model?
 Michael J. Kieffer posted on Sunday, February 23, 2014 - 12:03 pm
I ask the question above, b/c I also frequently use BOOTSTRAP, which I realize is incompatible with AUX = (m). If my concern is valid, are there any plans to make these two compatible in the future?

PS- I realize that the Model commands above also need a line like: x1 c1; to treat these as outcomes and get FIML to work. I just ran out of space.
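
For reference, the two steps described above can be written out in full roughly as follows. This is only a sketch of the poster's description: the variance lines follow the PS, and everything else in the input (DATA, NAMES, MISSING) is omitted and would need to be supplied.

```
! Step 1: c1 and c2 both serve as missing-data correlates
VARIABLE: USEVARIABLES = y x1;
          AUXILIARY = (m) c1 c2;
MODEL:    y ON x1;
          x1;                    ! mentioning the variance brings x1 into the model

! Step 2 (a separate input file): c1 moves from AUXILIARY into the model
VARIABLE: USEVARIABLES = y x1 c1;
          AUXILIARY = (m) c2;
MODEL:    y ON x1 c1;
          x1 c1;                 ! variances of both covariates mentioned
```

In both steps the same five variables (y, x1, c1, c2) inform the estimation, which is the point of the question.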
 Bengt O. Muthen posted on Monday, February 24, 2014 - 3:22 pm
I think it is difficult to deal with missing data using Auxiliary (m) at the same time as you try to decide on which covariates to control for. I would do the latter first and then, if needed, the former, particularly since Aux m often does not affect the results much. You may also want to take this to a general analysis discussion list like SEMNET.
 Stafanie Chris posted on Wednesday, March 05, 2014 - 5:44 am
Hello Dr. Muthen,

I did a latent class analysis and determined that a three-class solution has the most appropriate number of classes. Then, in the following step, I included three continuous distal outcomes using Lanza's method (DCON). To account for the clustered nature of my data (cases nested in classes), I used "TYPE=COMPLEX MIXTURE". However, I get a warning: "Auxiliary variables with DCATEGORICAL or DCONTINUOUS are not available with TYPE=MIXTURE COMPLEX."

What should I do to account for the hierarchical structure of my data in Mplus?

Many thanks!

 Bengt O. Muthen posted on Wednesday, March 05, 2014 - 10:34 am
You are right that this is not implemented yet for Lanza's method. I don't have a good answer to give you.
 Rimantas Vosylis posted on Tuesday, May 26, 2015 - 10:33 am
Dear Muthens,
I am running a large longitudinal CFA model on a dataset that has some missing data. When I run this model without aux variables it runs fine, but when I include (m) aux variables, I get a warning: THE LATENT VARIABLE COVARIANCE MATRIX (PSI) IS NOT POSITIVE DEFINITE. <...> PROBLEM INVOLVING VARIABLE.
It doesn't tell me which variable creates the problem. It may be related to one of the dependent variables in the model, whose residual variance I have fixed to zero because it was always nonsignificant in my previous models and I kept getting warnings. This variable is a bit different from the others in the model (it is grades received in school, while the others are subjective measures of scholastic competence), so I believe fixing its residual variance to zero is a good strategy (which it turned out to be, because I had no more warning messages). I think the problem I describe may be caused by the covariances that the “aux (m)” function automatically adds between this variable and the auxiliary variables. Since the residual variance of my variable is 0, the correlation cannot be estimated, which produces the warning.
Do you think this can be the case? Is it possible to fix some covariances of auxiliary variables to zero?
Maybe there is some other reason? Maybe I should ignore this message because the actual model and parameters in the model look fine?
Thanks in advance!
 Linda K. Muthen posted on Tuesday, May 26, 2015 - 10:56 am
Please send the relevant outputs and your license number to
 Aurelie Lange posted on Monday, December 12, 2016 - 1:49 am
Dear Dr. Muthén,

I am using MLR in a type=complex model. I have longitudinal data. For one of the analyses, I am only interested in the last time point. Therefore, I have added the other time points (as well as some client characteristics) to AUXILIARY so that MLR can use all available information when estimating the model.
However, the parameter estimates do not change at all when including or excluding the AUXILIARY command. Hence, I doubt whether it is working appropriately.

If auxiliary is not working well in this situation, would it be better to use multiple imputation instead? I understood that maximum likelihood and M.I. perform equally well, but if ML can't use all available information, then M.I. might be better, right?

 Bengt O. Muthen posted on Monday, December 12, 2016 - 4:31 pm
I would not use MI when you can do ML-MAR ("FIML"). For instance, you can include the second to last time point in the model and just use WITH to connect it to the last outcome - this is essentially what Aux(M) does. But if Aux (M) didn't change the estimates then this won't either - and nor would MI.
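
A minimal sketch of the WITH approach mentioned above, assuming a simple regression of the last outcome on one covariate. The variable names (y4 for the last time point, y3 for the second to last, x for a covariate) and the missing-value code are hypothetical, and the DATA and NAMES statements are omitted.

```
VARIABLE: USEVARIABLES = y4 x y3;
          MISSING = ALL (-999);   ! assumed missing-value code
ANALYSIS: ESTIMATOR = MLR;
MODEL:    y4 ON x;                ! substantive model for the last time point
          y3 WITH y4;             ! y3 covaries with the residual of y4
          y3 WITH x;              ! this also brings x into the model
```

Here y3 plays the role of a saturated correlate: it is connected to the other variables but does not alter the regression of interest. Note that mentioning x in a WITH statement makes it part of the model, with the distributional assumptions that implies.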