Missing data using logistic regression
Mplus Discussion > Missing Data Modeling >
 Gail Smith posted on Tuesday, July 05, 2011 - 7:31 am
I am using logistic regression and have missing data. I added the covariates.
Below is my code.

VARIABLE:
NAMES ARE v1 - v48;
USEVARIABLES ARE v32 v5 v6 v11 v12 v13 v14 v26 v27 v30 v33 v42-v48;
MISSING ARE ALL (-99);
CATEGORICAL ARE v32 ;

ANALYSIS:
! TYPE IS MONTECARLO;
ESTIMATOR IS ml;
integration = montecarlo;
ITERATIONS = 1000;
CONVERGENCE = 0.00005;

model:
v32 on v5 v6 v11 v12 v13 v14 v26 v27 v30 v33 v42-v48 ;
v5 v6 v11 v12 v13 v14 v26 v27 v30 v33 v42-v48 ;

I get the following warning:
THE STANDARD ERRORS OF THE MODEL PARAMETER ESTIMATES MAY NOT BE
TRUSTWORTHY FOR SOME PARAMETERS DUE TO A NON-POSITIVE DEFINITE
FIRST-ORDER DERIVATIVE PRODUCT MATRIX. THIS MAY BE DUE TO THE STARTING
VALUES BUT MAY ALSO BE AN INDICATION OF MODEL NONIDENTIFICATION. THE
CONDITION NUMBER IS 0.463D-18. PROBLEM INVOLVING PARAMETER 184.

If I just list the variables that have missing values, it runs with no warnings.

If I list the variables that have missing values with an *, it runs with no warnings, but the standard errors are different from the run without the *.

Thank you for your help,
Gail
 Linda K. Muthen posted on Tuesday, July 05, 2011 - 7:48 am
Please send the two outputs and your license number to support@statmodel.com.
 Katie Witkiewitz posted on Thursday, September 24, 2015 - 4:35 pm
Hi Bengt and Linda,

I am running a simulation study of various missing data approaches for logistic regression analyses with a binary outcome and a binary predictor. When I bring the binary predictor into the model by adding the variance term, I can recover the full sample size, but the results from the ML estimation are entirely identical in most scenarios, and nearly identical in the rest, to the analyses that used listwise deletion.

I've included several auxiliary variables in the ML estimation model so I am not sure why ML is giving me the same results as listwise deletion.

Any ideas?

Thanks so much!
 Linda K. Muthen posted on Thursday, September 24, 2015 - 5:18 pm
The results will be affected only by the new cases that do not have missing on y. Only those will contribute to the estimation of the slope.
 jmaslow posted on Wednesday, September 19, 2018 - 9:20 am
I am running a multiple group logistic regression model with missing data. I attempted to bring the covariates into the model so that all cases would be included in the model, which works when I run it as a single group model. However, in the multiple group model, I receive the error message:

WARNING: VARIABLE BLACK MAY BE DICHOTOMOUS BUT DECLARED AS CONTINUOUS.

If I specify that variable as categorical, I get the error message:

*** ERROR in VARIABLE command
The CATEGORICAL option is used for dependent variables only.
The following variable is an independent variable in the model.
Problem with: BLACK


So, my question is, how can I retain all cases despite some missing data in a multiple group logistic regression model? Thank you!
 Bengt O. Muthen posted on Wednesday, September 19, 2018 - 10:56 am
Mention the variances or means of all the covariates.

See also Chapter 10 of our RMA book (Regression and Mediation Analysis Using Mplus).
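
For readers following along, a minimal single-group sketch of what "mentioning the variances or means" looks like in an input file. The file name and the variable names y, age and income are hypothetical (only BLACK comes from the question), and the multiple-group setup is deliberately left out, so treat this as an illustration rather than the poster's actual model.

```text
TITLE:    Sketch: keep cases with missing covariate values
DATA:     FILE = example.dat;            ! hypothetical file name
VARIABLE: NAMES = y black age income;    ! y, age, income are hypothetical names
          USEVARIABLES = y black age income;
          CATEGORICAL = y;               ! dependent variable only
          MISSING = ALL (-99);
ANALYSIS: ESTIMATOR = ML;
          INTEGRATION = MONTECARLO;
MODEL:    y ON black age income;
          black age income;              ! mention the variances ...
          ! [black age income];          ! ... or, alternatively, the means
```

Once a covariate is mentioned this way it is treated as a continuous variable in the model, which is consistent with the "MAY BE DICHOTOMOUS BUT DECLARED AS CONTINUOUS" message being a warning rather than an error for a binary covariate such as BLACK.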
 Zach Gassoumis posted on Sunday, December 23, 2018 - 4:54 pm
I am working with a student who is running a logistic regression model with ML. She would like to add auxiliary variables to help account for the missing data. However, it seems that the AUXILIARY option with the (m) setting is not available with a categorical DV. Is there any other way to integrate auxiliary variables for missing data into a logistic regression model?
 Bengt O. Muthen posted on Sunday, December 23, 2018 - 5:21 pm
You can try doing it manually by including the auxiliaries as DVs. Make sure the auxiliaries are correlated with the other variables. Try it out on a data set with no missing data to check that you get the same results with and without the auxiliaries.
 Zach Gassoumis posted on Thursday, January 10, 2019 - 11:08 pm
Thanks for the suggestion. We tried this approach but hit a roadblock with the error message "Covariances for categorical, censored, count or nominal variables with other observed variables are not allowed." It suggests using PARAMETERIZATION=RESCOV, but it seems like that only works for mixture models.

Where Y is a categorical DV, X is the vector of IVs, & AUX is the vector of auxiliary variables, the code we tried is:

ANALYSIS:
ESTIMATOR = MLR;
INTEGRATION = MONTECARLO;

MODEL:
Y ON X;
X;
AUX ON X;
Y WITH AUX;
AUX WITH AUX;

Do you have any recommendations on how we can get the categorical DV to correlate with the auxiliary variables?
 Bengt O. Muthen posted on Saturday, January 12, 2019 - 11:50 am
It's a bit awkward using ML. For each WITH, you can introduce a factor behind the 2 variables to allow them to correlate. But with many variables that leads to too many dimensions of integration. The Bayes estimator can do it more easily - not using factors - but that is probit, not logit.
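
A rough sketch of the "factor behind the two variables" idea for a single auxiliary variable, reusing the Y, X and AUX names from the question above. The factor name f and the choice to fix the loading on Y and the factor variance are assumptions made for illustration, not a specification confirmed in this thread.

```text
ANALYSIS:
  ESTIMATOR = MLR;
  INTEGRATION = MONTECARLO;
MODEL:
  y ON x;
  x;
  aux ON x;
  f BY y@1 aux;    ! f sits behind y and aux; the free loading on aux
                   ! lets the two be associated in either direction
  f@1;             ! factor variance fixed (assumed identification choice)
```

Each such factor adds a dimension of integration, which is why this becomes impractical with many auxiliaries. Note also that with f in the model the slope of y ON x is conditional on f, so it may not be on the same scale as the slope from the plain logistic regression.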

Another alternative is to use Multiple Imputation in a first step. It has the advantage of handling many auxiliary variables but the disadvantage of having limited analysis features in the second step where the imputed data sets are used.
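
The two-step route could look roughly like the following pair of input files. All file and variable names are hypothetical, and the number of imputed data sets is an arbitrary choice. Step 1 imputes with the auxiliaries included:

```text
TITLE:    Step 1 sketch: impute with auxiliary variables
DATA:     FILE = example.dat;                ! hypothetical file name
VARIABLE: NAMES = y x1 x2 aux1 aux2;         ! hypothetical variable names
          USEVARIABLES = y x1 x2 aux1 aux2;
          MISSING = ALL (-99);
DATA IMPUTATION:
          IMPUTE = y (c) x1 x2 aux1 aux2;    ! (c) marks y as categorical
          NDATASETS = 20;                    ! arbitrary choice
          SAVE = imp*.dat;                   ! list file is written as implist.dat
ANALYSIS: TYPE = BASIC;
```

Step 2 runs the logistic regression over the imputed data sets, with the auxiliaries no longer part of the model:

```text
TITLE:    Step 2 sketch: logistic regression on the imputed data
DATA:     FILE = implist.dat;
          TYPE = IMPUTATION;
VARIABLE: NAMES = y x1 x2 aux1 aux2;         ! check the saved variable order
                                             ! against the step 1 output
          USEVARIABLES = y x1 x2;
          CATEGORICAL = y;
ANALYSIS: ESTIMATOR = ML;
MODEL:    y ON x1 x2;
```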