Missing on x-variables
Mplus Discussion > Missing Data Modeling
 Jiyoung  posted on Friday, June 26, 2009 - 2:10 am
I added control variables to my model and got the following message. I wanted to use full information, so I defined all of the variables as having missing data. It seems that the program deleted cases using listwise deletion. With the following warning, I could not see the model estimation.

Data set contains cases with missing on x-variables.
These cases were not included in the analysis.
Number of cases with missing on x-variables: 94
1 WARNING(S) FOUND IN THE INPUT INSTRUCTIONS

What can I do to see the results of the model estimation?
 Linda K. Muthen posted on Friday, June 26, 2009 - 6:35 am
The only way to avoid the listwise deletion of covariates is to bring them into the model as dependent variables. You can do this by mentioning their variances in the MODEL command. You then make distributional assumptions about them.
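A minimal sketch of what this reply describes, not taken from the thread: the file name, variable names, and missing-value code below are all hypothetical, and the estimator is only one possible choice.

```mplus
TITLE:    Sketch: keep cases with missing on covariates;
DATA:     FILE = mydata.dat;        ! hypothetical file name
VARIABLE: NAMES = y x1 x2 x3;       ! hypothetical variable names
          USEVARIABLES = y x1 x2 x3;
          MISSING = ALL (-999);     ! hypothetical missing-value code
ANALYSIS: ESTIMATOR = MLR;
MODEL:    y ON x1 x2 x3;            ! regression of interest
          ! Mentioning the variances brings the covariates
          ! into the model as dependent variables.
          x1 x2 x3;
```

With the last MODEL line present, cases that are missing on x1-x3 should no longer be dropped, at the cost of distributional assumptions about those covariates.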
 nina chien posted on Friday, September 04, 2009 - 1:58 pm
To avoid listwise deletion of covariates, I added the covariates into the model by mentioning their variances. This resulted in a series of WARNING statements that some of my covariates are categorical (which they are): “WARNING: VARIABLE GENDERR MAY BE DICHOTOMOUS BUT DECLARED AS CONTINUOUS.”

So, I tried stating covariates as categorical, but this resulted in error statements that the covariates are not DV’s: “ERROR in VARIABLE command. CATEGORICAL option is used for dependent variables only. GENDERR is not a dependent variable.”

I'm very confused because the WARNING and ERROR messages are contradictory. Can I ignore the warning statements?

Also, the model where covariates were declared as continuous did not converge. Is this related to the WARNING?

Thanks so much.
 Linda K. Muthen posted on Friday, September 04, 2009 - 4:46 pm
You should not put covariates on the CATEGORICAL list. The message comes about because the mean and variance of a dichotomous variable are not independent of each other. I would need to see the full output and your license number at support@statmodel.com to say if this is related to convergence.
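The dependence between mean and variance mentioned in this reply can be written out for a dichotomous covariate coded 0/1. The symbol p, the probability that x equals 1, is notation introduced here for illustration.

```latex
\[
E(x) = p, \qquad
\operatorname{Var}(x) = p(1-p) = E(x)\,\bigl(1 - E(x)\bigr)
\]
```

The variance is therefore fully determined by the mean, whereas a variable treated as continuous and normal has a mean and a variance that are separate parameters.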
 Yvonne Terry-McElrath posted on Tuesday, April 05, 2011 - 6:50 pm
Drs. Muthen -

I have been asked by a reviewer to provide a reference for exactly what Mplus is doing when bringing covariates with missing data into the model as dependent variables by steps such as mentioning their variances in the MODEL command (and thus making distributional assumptions about them). If you have any recommendations for this, I would greatly appreciate it.

Many thanks.
 Linda K. Muthen posted on Wednesday, April 06, 2011 - 10:24 am
I don't know of a reference to this exactly. When you bring the x's into the model, multivariate normality is assumed for the x's and all continuous y's. This assumption is discussed in any structural equation modeling book.
 Cheung hoi Shan posted on Monday, July 16, 2012 - 6:12 pm
I ran a path analysis and one of the independent variables has missing data on several cases. Apparently listwise deletion was used. Is there any way I can avoid this? Thanks.
 Linda K. Muthen posted on Tuesday, July 17, 2012 - 7:45 am
Missing data theory does not apply to observed exogenous variables. The model is estimated conditioned on them. You can bring all of the covariates into the model by mentioning their variances in the MODEL command. Then they will be treated as dependent variables and distributional assumptions will be made about them.
 Cheung hoi Shan posted on Tuesday, August 14, 2012 - 7:43 pm
Thanks Linda. I have run the analysis, and realise that if I bring the exogenous variable with missing data into the model, df becomes higher. I am wondering if this would pose any issue with explaining to readers how the model was specified. How do people usually justify the use of this method?
 Linda K. Muthen posted on Wednesday, August 15, 2012 - 6:19 am
Your degrees of freedom should not change if you put no restrictions on the x's. Please send the output and your license number to support@statmodel.com so I can see what you did.
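One way to make sure no restrictions are placed on the x's is to write out their covariances next to their variances. The sketch below is an assumption-laden illustration: the names y, m, and x1-x3 are hypothetical, and because this thread does not say whether these covariances are free by default, they are stated explicitly.

```mplus
MODEL:    m ON x1 x2 x3;
          y ON m x1 x2 x3;
          ! Bring the covariates into the model.
          x1 x2 x3;
          ! Leave all covariances among them unrestricted.
          x1 WITH x2 x3;
          x2 WITH x3;
```

With three covariates, this adds 3 means, 3 variances, and 3 covariances, that is 9 parameters, and the same 9 sample statistics for the x's enter the unrestricted comparison model, so the degrees of freedom should stay the same. A change in df suggests that some relationship among the x's has been restricted.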
 radanielina-hita marie louise posted on Saturday, June 15, 2013 - 7:59 am
Dear Dr. Muthen

I am running a mediational model with two covariates (relationship status and Greek membership), both of which have missing values. I specified TYPE=MISSING with MLR. When I entered the covariates in the MODEL command, Mplus used the whole sample. I understand that missing data theory does not apply to exogenous observed variables. Compared with the model in which I do not include them in the MODEL command, there were no substantive changes except in one of my hypothesized effects. I want to keep the whole sample, but I am wondering what brought about the change. Can I trust the output with the observed variables included in the model?

Thanks
 Linda K. Muthen posted on Sunday, June 16, 2013 - 3:46 pm
When you mention the means, variances, or covariances of the exogenous variables in the MODEL command, they are treated as dependent variables and distributional assumptions are made about them. This can change the results if not all variables are continuous. See the Version 6.1 version history on the website for a description.
 wahideh Achbari posted on Wednesday, January 22, 2014 - 2:19 am
Dear Dr Muthen

When I am running a one-factor CFA model with continuous data, I get the following error. "Data set contains cases with missing on all variables. These cases were not included in the analysis. Number of cases with missing on all variables: 2"

Previously, I have run the same model without getting the error message.

I have checked whether the error is not due to misreading the input file, but still cannot find its source.

Thank you.
 Linda K. Muthen posted on Wednesday, January 22, 2014 - 6:20 am
Please send the outputs where you get the message and where you don't get the message along with your license number to support@statmodel.com.
 Katharina Groß posted on Monday, February 24, 2014 - 11:34 am
Dear Mplus team,

I have several questions regarding logistic regression with missing values. My data set contains 1000 cases, 300 of them with missing values. y and x1 are both binary and have missing values; x2-x4 are continuous and completely observed. I want Mplus to use all cases, including those with missing values. In order to do that I mentioned the variance of x1 and ran the following model:

ANALYSIS:
estimator = ml;
integration = montecarlo;
MODEL:
y on x1 x2 x3 x4;
x1;

My questions are as follows:

1. When I mention the variance of x1 Mplus uses 950 of 1000 cases. If I specify one further independent variable (no matter which one), all cases are included. What I don’t understand is that the coefficients of the model vary depending on which variance I choose to mention. This observation confuses me and I am not sure which variance I should mention.

2. A more general question concerns predicting probabilities. Is it right that it is not possible to predict probabilities when there are missing values on an independent variable using ML estimation?

3. Is the given R square the McKelvey & Zavoina’s R square – and can I trust the value although I have missing values on the independent variable or does this bias the R square-computation (as I would guess)?

I would appreciate any help.
 Linda K. Muthen posted on Tuesday, February 25, 2014 - 1:25 pm
1. You should include the variances of all of the covariates or none of the covariates. When you include, for example, y1 and y2, they are correlated and y3 and y4 are correlated because the model is estimated conditional on them. But there are no correlations between y1, y2 and y3, y4. The zero correlations vary depending on which variances you include in the model.

2. I don't think this can be done.

3. Yes.
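As a sketch of point 1 applied to the logistic regression posted above: the variances of all four covariates are mentioned, rather than that of x1 alone. The file name and missing-value code are hypothetical, and declaring y on the CATEGORICAL list is an assumption based on the description of y as binary.

```mplus
DATA:     FILE = mydata.dat;        ! hypothetical file name
VARIABLE: NAMES = y x1 x2 x3 x4;
          CATEGORICAL = y;          ! dependent variable only
          MISSING = ALL (-999);     ! hypothetical missing-value code
ANALYSIS: ESTIMATOR = ML;
          INTEGRATION = MONTECARLO;
MODEL:    y ON x1 x2 x3 x4;
          ! Variances of all covariates, not just x1.
          x1 x2 x3 x4;
```

Treating the covariates the same way avoids the situation described in point 1, where the set of zero correlations, and with it the coefficients, depends on which variances happen to be mentioned.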
 Hanna Esser posted on Wednesday, September 03, 2014 - 6:13 am
Dear Mplus team,

I have a question regarding missing values in my path model.

Variables x1-x6 are exogenous. Four of them are binary (gender and yes/no variables), two are continuous.
Variables y1-y7 are endogenous. y1-y6 are ordinal, y7 is binary (yes/no).
y1-y6 are also exogenous, because I compute regressions on y7.

My current sample size is 4009 and I would like to use the whole sample for my analysis. Unfortunately, there are missing data in variables x2-x6. I assume that they are MCAR. When I do the analysis without extra assumptions, there are 1031 cases with missing on x-variables, which get excluded from my analysis. Most of the missing data are on the variable “income”, because many refused to give this information, and on the variable “Parents without qualification” (yes/no).

By reading old discussions I found out that when I mention the variances, no listwise deletion happens. This is the case in my analysis: it includes all 4009 cases when I do this, but I get the warning that my exogenous variables x1-x4 are dichotomous but declared as continuous (which is true).

What can I do to include all cases and avoid the listwise deletion of 25% of my cases?

Thank you
 Bengt O. Muthen posted on Wednesday, September 03, 2014 - 3:18 pm
You can do what you did and ignore the warning.
 Shiny7 posted on Monday, September 29, 2014 - 3:37 am
Dear Drs. Muthen,

I ran a multilevel model with 7 x-variables and one continuous outcome using MLR.

As is known, Mplus does not include cases with missing values on the x-variables, and in my case that is a great many cases.

Am I right that the only solution is to mention the variances of the x-variables in the MODEL command? This is despite the fact that my data are not multivariate normally distributed (!), which is why I am using MLR.

A further problem is that I have only 21 clusters, and if I bring the variances of the x-variables into the model, the number of parameters exceeds the number of clusters, which is also a known problem.

Can you please help solving that problem?

Best regards
Shiny
 Linda K. Muthen posted on Monday, September 29, 2014 - 9:59 am
I can't see any other approach to this.
 Shiny7 posted on Monday, September 29, 2014 - 10:22 am
Dear Mrs. Muthen,

okay, thank you so much for your immediate reply...

Shiny
 Sandra Coulon posted on Monday, November 24, 2014 - 9:28 am
Basic regression with type=general is listwise deleting on x. Based on this thread and also http://www.ats.ucla.edu/stat/mplus/faq/fiml_counts.htm, I mention the predictors in my model statement as in:

model:
LWCNO24 PSS24 on age24 female studyxtx BPComp BMI24 NEIGH24 GRMnGni GxNEIGH;
[age24 female studyxtx BPComp
BMI24 CE24 NSat24 NEIGH24 GRMnGni GxNEIGH];

However, when I do that, I get the following:

THE STANDARD ERRORS OF THE MODEL PARAMETER ESTIMATES MAY NOT BE TRUSTWORTHY FOR SOME PARAMETERS DUE TO A NON-POSITIVE DEFINITE FIRST-ORDER DERIVATIVE PRODUCT MATRIX. THIS MAY BE DUE TO THE STARTING VALUES BUT MAY ALSO BE AN INDICATION OF MODEL NONIDENTIFICATION. THE CONDITION NUMBER IS 0.206D-20. PROBLEM INVOLVING THE FOLLOWING PARAMETER: Parameter 65, GXNEIGH.

I thought maybe this was a degrees of freedom issue but the "number of free parameters" is 65. What is the issue here, and is there another way to get the program not to listwise delete on X?
 Linda K. Muthen posted on Monday, November 24, 2014 - 9:35 am
This is likely due to one or more of your predictors being binary. The mean and variance of a binary variable are not orthogonal and this can trigger the message. Comment out the means. If the message disappears, you can put them back and ignore the message.
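The diagnostic step suggested here can be sketched with Mplus comment lines, which begin with an exclamation mark. The variable names come from the post above. The added variance line is an assumption that goes beyond the reply: it is there so the covariates stay in the model while the means are commented out.

```mplus
MODEL:    LWCNO24 PSS24 ON age24 female studyxtx BPComp
                           BMI24 NEIGH24 GRMnGni GxNEIGH;
          ! Means commented out as a diagnostic step.
          ! [age24 female studyxtx BPComp
          !  BMI24 CE24 NSat24 NEIGH24 GRMnGni GxNEIGH];
          ! Assumption: variances mentioned instead, so the
          ! covariates are still treated as dependent variables.
          age24 female studyxtx BPComp
          BMI24 CE24 NSat24 NEIGH24 GRMnGni GxNEIGH;
```

If the message disappears in this run, the reply indicates that the means can be restored and the message ignored.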
 Sandra Coulon posted on Monday, November 24, 2014 - 9:58 am
Thank you for the quick reply! I did comment out the means output request and the message did not go away. Other thoughts?
 Linda K. Muthen posted on Monday, November 24, 2014 - 11:10 am
Please send the two outputs, with and without the means, and your license number to support@statmodel.com.