Xu, Man posted on Tuesday, August 05, 2008 - 11:06 am
Sorry if I multi-posted this. I just experimented with the auxiliary function in Mplus 5.1. The results are quite similar to FIML. I am not sure whether the auxiliary variables are considered in relation to the clustering effect for data that have a clustering structure. I used the complex design function to account for the cluster effect. Can I have more information regarding this function, please? Thanks!
Yes, Type=Complex is in operation when you use aux(m).
Xu, Man posted on Thursday, August 07, 2008 - 9:26 am
Thanks! I didn't totally understand the technical appendix for this function. But if I understood correctly, this function implements the method from Graham (2003), right? My confusion is that this paper doesn't specify how this works for multilevel data. How does Mplus take this into account when Type=Complex is in operation and you use aux(m)?
Graham, J. W. (2003). Adding missing-data-relevant variables to FIML-based structural equation models. Structural Equation Modeling, 10(1), 80-100.
Aux(m) still uses maximum-likelihood estimation, just with an extended set of variables. Type=Complex adjusts for complex sample features just like with other ML estimation - so there is no extra difficulty when aux(m) is added. Type=twolevel is another matter.
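For readers following along, a minimal input sketch of the setup described in this reply might look like the following. The file name, variable names, missing-value code, and the model itself are all hypothetical and only illustrate where AUXILIARY = (M), CLUSTER, and TYPE = COMPLEX go.

```mplus
TITLE:    Sketch: aux (m) together with TYPE = COMPLEX
DATA:     FILE = mydata.dat;             ! hypothetical file name
VARIABLE: NAMES = clus y1 y2 y3 x z1 z2; ! hypothetical variable names
          USEVARIABLES = y1 y2 y3 x;     ! analysis variables only
          MISSING = ALL (-999);          ! assumed missing-value code
          CLUSTER = clus;                ! cluster identifier
          AUXILIARY = (M) z1 z2;         ! missing-data correlates
ANALYSIS: TYPE = COMPLEX;
          ESTIMATOR = MLR;
MODEL:    f BY y1 y2 y3;                 ! illustrative model
          f ON x;
```

The auxiliary variables z1 and z2 are listed only under AUXILIARY, not under USEVARIABLES; they extend the variable set used by ML without appearing in the MODEL command.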
Xu, Man posted on Thursday, August 07, 2008 - 10:40 am
I see. Thanks for the reply! How is it another matter when Type=twolevel? (So sorry if my question is a bit "idiotic"...)
I've estimated an SEM and didn't have problems with fit, SEs, etc. Now I've added auxiliary variables (aux (m)) and got the following message:
"THE LATENT VARIABLE COVARIANCE MATRIX (PSI) IS NOT POSITIVE DEFINITE. THIS COULD INDICATE A NEGATIVE VARIANCE/RESIDUAL VARIANCE FOR A LATENT VARIABLE, A CORRELATION GREATER OR EQUAL TO ONE BETWEEN TWO LATENT VARIABLES, OR A LINEAR DEPENDENCY AMONG MORE THAN TWO LATENT VARIABLES. CHECK THE TECH4 OUTPUT FOR MORE INFORMATION. PROBLEM INVOLVING VARIABLE ."
(The variable RESID is specified in order to allow an additional path from the residual of variable d to F2, over and above the loading on F1, on which d also loads.) The correlation of d and RESID is .906, which is no surprise given the model.
I'm confused because the model as above worked fine without auxiliary variables.
Carolyn CL posted on Thursday, December 13, 2012 - 9:12 am
Hello Dr. Muthen,
I was wondering if there is a way to make the AUXILIARY = (m)x; function work when some of the dependent variables are categorical?
If I treat my variables as continuous and run the (m)x function and compare this model to a saturated correlate model, the results tend to be quite similar in terms of fit indices (CFI, RMSEA), effect sizes and significance. However I am not comfortable treating these categorical variables as continuous because of their non-normal distributions.
Hi, I'm new to Mplus and wanted to confirm the following. I have a repeated-measures design with 4 time points (steps per day at the end of CR, 3 mo after CR, 6 mo after CR, 9 mo after CR). In some preliminary analyses, I found that Diag_Rec (categorical), BMI, and frst_ev (categorical) were related to "missingness" at follow-up assessments. I want to be sure that these variables are included in the model, so do I use the auxiliary function to do so in the analysis below? If not, do I have to use multiple imputation instead? Thanks! chris
VARIABLE: NAMES ARE id Diag_Rec BMI frst_ev t2_steps t3_steps t4_steps t5_steps;
I assume these 3 variables don't have a substantive role in your growth model; if they do, just include them in the model. If not, either approach you mention is fine in principle. You have to specify Auxiliary Missing, not just Auxiliary (see UG). But I think we have not yet developed Aux Missing for Mixtures, so that won't work. Multiple imputation is possible, although you assume a 2-class model so regular unrestricted (1-class) imputation is a bit off, but probably better than not using those variables.
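As an illustration of the "Auxiliary Missing" specification mentioned above, here is a sketch for an ordinary single-class growth model (not the 2-class mixture, for which the reply says the option is not available). The variable names come from the post; the missing-value code and the time scores (equal 3-month spacing) are assumptions.

```mplus
VARIABLE: NAMES ARE id Diag_Rec BMI frst_ev
                    t2_steps t3_steps t4_steps t5_steps;
          USEVARIABLES ARE t2_steps t3_steps t4_steps t5_steps;
          MISSING ARE ALL (-999);        ! assumed missing-value code
          AUXILIARY = (M) Diag_Rec BMI frst_ev;
MODEL:    ! linear growth, assumed time scores 0-3
          i s | t2_steps@0 t3_steps@1 t4_steps@2 t5_steps@3;
```

The (M) setting is what distinguishes "Auxiliary Missing" from a plain AUXILIARY list, which only carries variables along without using them in estimation.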
I have data from 2326 subjects. For the model that I'm interested in, I use 7 items. There are 8 subjects with complete missings on these 7 items.
If I estimate my model without the AUXILIARY (M) command, Mplus warns me there are 8 cases with complete missings and tells me N = 2318.
However, if I estimate my model with the AUXILIARY (M) command with my data, Mplus warns me "1 case with complete missing data" and tells me N = 2325.
This doesn't make sense to me - there are indeed 8 subjects with complete missing data on those 7 items that constitute my model; however, all 8 participants have non-missing data on the auxiliary variables.
Carolyn CL posted on Tuesday, March 19, 2013 - 12:11 pm
Hello Dr. Muthen,
I am running a saturated correlate structural equation model with socio-economic status as the auxiliary variable and dummies representing poverty trajectories (3 dummies, 4th reference category excluded) as independent variables predicting weight status (3 categories: normal, overweight, obese).
When I run a basic model (N = 1230):
Weight ON d_low d_inc d_dec;
The model runs fine.
When I add the auxiliary variable (N = 2120):
Weight ON d_low d_inc d_dec;
SES WITH Weight d_low d_inc d_dec;
The coefficients of the dummy variables are comparable, but the standard errors are inflated, and one significant effect becomes non-significant.
When I run the full model (with additional independent and dependent variables) including the auxiliary variable SES, the model fails to converge. Increasing the number of iterations does not solve the problem. I can however run the model without the auxiliary and the model converges, but I obviously lose part of my sample.
Any idea why the full model with the auxiliary will not converge?
I would have no idea. You would need to send the outputs and your license number to email@example.com for further information.
Carolyn CL posted on Thursday, March 21, 2013 - 8:46 am
I found the problem - the model was by default allowing the dummy variables to correlate with each other once I added the auxiliary variable. Restricting the correlations to 0 allows the model to converge and provides the expected results.
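The post does not show the syntax used for this fix. One way the restriction described here could be written, assuming the same variable names as in the earlier post, is to fix the covariances among the dummies at zero with @0:

```mplus
MODEL:
  Weight ON d_low d_inc d_dec;
  SES WITH Weight d_low d_inc d_dec;  ! saturated-correlates part
  ! Assumed form of the fix: dummy covariances fixed at zero
  d_low WITH d_inc@0 d_dec@0;
  d_inc WITH d_dec@0;
```

This is only a sketch of the restriction as described; whether fixing these covariances at zero is appropriate for a given dataset is a separate question.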
Imagine a model-building process with a dataset that has missing data and a simple model that includes: USEVARIABLES = y x1; AUXILIARY = (m) c1 c2; Model: y ON x1;
Then, I want to know if c1 should be a control, so I do: USEVARIABLES = y x1 c1; AUXILIARY = (m) c2; Model: y ON x1 c1;
In such a case, I feel like it's important to list c1 as auxiliary in the first model, because if I didn't, it seems like the two models would be fitted to different variance-covariance matrices (i.e., b/c one incorporates c1 in FIML and the other doesn't). I'm not talking about comparing the goodness of fit, but just a less formal model-building process where I'm trying to decide which covariates to include based on their z-stats. For instance, in a scenario in which I have a large number of covariates to consider, I systematically try different covariates but make sure to include them either in AUXILIARY or USEVARIABLES. Or a scenario where I want to present effect sizes for a question predictor with two different combinations of covariates.
Does this approach make sense, or am I worrying too much about leaving c1 out of both AUXILIARY and USEVARIABLES in one model and then including it in USEVARIABLES in a later model?
I think it is difficult to deal with missing data using Auxiliary (m) at the same time as you try to decide on which covariates to control for. I would do the latter first and, if needed, then the former, particularly since Aux (m) often does not affect the results much. You may also want to run this by a general analysis discussion list like SEMNET.
I did a latent class analysis and determined that a three-class solution is the most appropriate. Then, in the next step, I included three distal continuous outcomes using Lanza's method (DCON). To account for the clustered nature of my data (cases nested in classes), I used "TYPE=COMPLEX MIXTURE". However, I get a warning: "Auxiliary variables with DCATEGORICAL or DCONTINUOUS are not available with TYPE=MIXTURE COMPLEX."
What should I do to account for the hierarchical structure of my data in Mplus?
Dear Muthens, I am running a large longitudinal CFA model on a dataset that has some missing data. When I run this model without aux variables, it runs fine, but when I include (m) aux variables, I get a warning: THE LATENT VARIABLE COVARIANCE MATRIX (PSI) IS NOT POSITIVE DEFINITE. <...> PROBLEM INVOLVING VARIABLE. It doesn't tell me which variable creates the problem. It seems it may be related to one of the dependent variables in the model, whose residual variance I have fixed to zero because the residual variance for this variable was always insignificant in my previous models and I kept getting warnings. This variable is a bit different from the others in the model (it is grades received in school, while the others are subjective measures of scholastic competence), so I believe fixing its residual variance to zero is a good strategy (which it turned out to be, because I had no more warning messages). I think the problem I describe may be caused by the covariances that "aux (m)" automatically adds between this variable and the aux variables. Since the error variance of my variable is 0, the correlation cannot be estimated and a warning is given. Do you think this can be the case? Is it possible to fix some covariances of auxiliary variables to zero? Maybe there is some other reason? Maybe I should ignore this message because the actual model and its parameters look fine? Thanks in advance!
I am using MLR in a type=complex model. I have longitudinal data. For one of the analyses, I am only interested in the last time point. Therefore, I have added the other time points (as well as some client characteristics) to AUXILIARY so that MLR can use all available information when estimating the model. However, the parameter estimates do not change at all when including or excluding the AUXILIARY command. Hence, I doubt whether it is working appropriately.
If auxiliary is not working well in this situation, would it be better to use multiple imputation instead? I understood that maximum likelihood and M.I. perform equally well, but if ML can't use all available information, then M.I. might be better, right?
I would not use MI when you can do ML-MAR ("FIML"). For instance, you can include the second to last time point in the model and just use WITH to connect it to the last outcome - this is essentially what Aux(M) does. But if Aux (M) didn't change the estimates then this won't either - and nor would MI.
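To make the WITH suggestion concrete, a sketch with hypothetical variable names (y4 for the last time point, y3 for the second to last, x for a predictor) and an assumed missing-value code could look like this:

```mplus
VARIABLE: NAMES = clus x y3 y4;   ! hypothetical variable names
          USEVARIABLES = y4 x y3;
          MISSING = ALL (-999);   ! assumed missing-value code
          CLUSTER = clus;
ANALYSIS: TYPE = COMPLEX;
          ESTIMATOR = MLR;
MODEL:    y4 ON x;                ! model of interest, last time point
          y3 WITH y4;             ! earlier time point linked via WITH
```

Here y3 is part of the model only through its covariance with the residual of y4. Graham's saturated correlates approach also correlates auxiliary variables with the predictors and with each other, so a fuller specification would add those WITH statements as well.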