
Gail Smith posted on Tuesday, July 05, 2011  7:31 am



I am using logistic regression and have missing data. I added the covariates. Below is my code:

VARIABLE:
  NAMES ARE v1-v48;
  USEVARIABLES ARE v32 v5 v6 v11 v12 v13 v14 v26 v27 v30 v33 v42-v48;
  MISSING ARE ALL (99);
  CATEGORICAL ARE v32;
ANALYSIS:
  ! TYPE IS MONTECARLO;
  ESTIMATOR IS ML;
  INTEGRATION = MONTECARLO;
  ITERATIONS = 1000;
  CONVERGENCE = 0.00005;
MODEL:
  v32 ON v5 v6 v11 v12 v13 v14 v26 v27 v30 v33 v42-v48;
  v5 v6 v11 v12 v13 v14 v26 v27 v30 v33 v42-v48;

I get the following warning:

THE STANDARD ERRORS OF THE MODEL PARAMETER ESTIMATES MAY NOT BE TRUSTWORTHY FOR SOME PARAMETERS DUE TO A NON-POSITIVE DEFINITE FIRST-ORDER DERIVATIVE PRODUCT MATRIX. THIS MAY BE DUE TO THE STARTING VALUES BUT MAY ALSO BE AN INDICATION OF MODEL NONIDENTIFICATION. THE CONDITION NUMBER IS 0.463D-18. PROBLEM INVOLVING PARAMETER 184.

If I just list the variables that have missing values, it runs with no warnings. If I list the variables that have missing values with an *, it also runs with no warnings, but the standard errors are different from the run without the *. Thank you for your help, Gail 


Please send the two outputs and your license number to support@statmodel.com. 


Hi Bengt and Linda, I am running a simulation study of various missing data approaches for logistic regression with a binary outcome and a binary predictor. When I bring the binary predictor into the model by adding its variance term, I can recover the full sample size, but the ML estimates are nearly identical to the listwise-deletion analyses in some scenarios and entirely identical in most. I've included several auxiliary variables in the ML estimation model, so I am not sure why ML is giving me the same results as listwise deletion. Any ideas? Thanks so much! 


The results will be affected only by the new cases that do not have missing on y. Only those will contribute to the estimation of the slope. 
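For readers following along, the "bring the predictor into the model via its variance" setup discussed in this exchange might look roughly like the sketch below. Variable and file names are hypothetical, and this is only an illustration of the technique, not the poster's actual input:

```
! Hypothetical sketch: retaining cases with a missing binary predictor x
! under ML by giving x a distributional assumption in the model.
DATA:     FILE = mydata.dat;
VARIABLE: NAMES ARE y x aux1 aux2;
          USEVARIABLES ARE y x;
          CATEGORICAL ARE y;        ! binary outcome
          MISSING ARE ALL (-999);
ANALYSIS: ESTIMATOR = ML;
          INTEGRATION = MONTECARLO; ! needed once x is a model variable
MODEL:    y ON x;
          x;                        ! mentioning the variance of x brings
                                    ! x into the likelihood
```

Note that once its variance is mentioned, x is treated as a (conditionally) normal model variable rather than a fixed regressor, which is what allows cases with missing x to stay in the analysis.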

jmaslow posted on Wednesday, September 19, 2018  9:20 am



I am running a multiple group logistic regression model with missing data. I attempted to bring the covariates into the model so that all cases would be included, which works when I run it as a single-group model. However, in the multiple group model, I receive the message:

WARNING: VARIABLE BLACK MAY BE DICHOTOMOUS BUT DECLARED AS CONTINUOUS.

If I specify that variable as categorical, I get the error message:

*** ERROR in VARIABLE command
The CATEGORICAL option is used for dependent variables only. The following variable is an independent variable in the model. Problem with: BLACK

So my question is: how can I retain all cases despite some missing data in a multiple group logistic regression model? Thank you! 


Mention the variances or means of all the covariates. See also Chapter 10 of our RMA book. 
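In a multiple group setup, that advice might be sketched as follows. Group labels and variable names here are hypothetical, and the exact defaults for group-specific means should be checked against the output:

```
! Hypothetical sketch: multiple group logistic regression that keeps
! cases with missing covariates by mentioning covariate variances.
VARIABLE: NAMES ARE y black age grp;
          USEVARIABLES ARE y black age;
          CATEGORICAL ARE y;          ! only the DV is declared categorical
          GROUPING IS grp (1 = g1  2 = g2);
          MISSING ARE ALL (-999);
ANALYSIS: ESTIMATOR = ML;
          INTEGRATION = MONTECARLO;
MODEL:    y ON black age;
          black age;                  ! variances: covariates enter the
                                      ! likelihood in every group
          [black age];                ! covariate means
```

A binary covariate such as BLACK stays on the x-side and is modeled as continuous here; the "MAY BE DICHOTOMOUS BUT DECLARED AS CONTINUOUS" warning is expected in that case.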


I am working with a student who is running a logistic regression model with ML. She would like to add auxiliary variables to help account for the missing data. However, it seems that the AUXILIARY option with the (m) setting is not available with a categorical DV. Is there any other way to integrate auxiliary variables for missing data into a logistic regression model? 


You can try doing it manually by including the auxiliaries as DVs. Make sure the auxiliaries are correlated with the other variables. Try it out on a data set with no missing to check that you get the same results with and without the aux's. 
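Concretely, "including the auxiliaries as DVs" might be sketched like this, with hypothetical variable names:

```
! Hypothetical sketch: auxiliary variables brought into the model as
! extra dependent variables, correlated with the analysis variables.
MODEL:    y ON x1 x2;           ! substantive logistic regression
          x1 x2;                ! covariate variances: keep cases with
                                ! missing covariate values
          aux1 aux2 ON x1 x2;   ! auxiliaries related to the covariates
          aux1 WITH aux2;       ! auxiliaries correlated with each other
          y WITH aux1 aux2;     ! linking auxiliaries to categorical y is
                                ! the restrictive part under ML
```

As the follow-up post in this thread shows, the last WITH statement is where ML runs into trouble when y is categorical.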


Thanks for the suggestion. We tried this approach but hit a roadblock with the error message "Covariances for categorical, censored, count or nominal variables with other observed variables are not allowed." It suggests using PARAMETERIZATION=RESCOV, but it seems that only works for mixture models. Where Y is a categorical DV, X is the vector of IVs, and AUX is the vector of auxiliary variables, the code we tried is:

ANALYSIS:
  ESTIMATOR = MLR;
  INTEGRATION = MONTECARLO;
MODEL:
  Y ON X;
  X;
  AUX ON X;
  Y WITH AUX;
  AUX WITH AUX;

Do you have any recommendations on how we can get the categorical DV to correlate with the auxiliary variables? 


It's a bit awkward using ML. For each WITH, you can introduce a factor behind the two variables to allow them to correlate. But with many variables that leads to too many dimensions of integration. The Bayes estimator can do it more easily, without using factors, but that is probit, not logit. Another alternative is to use multiple imputation in a first step. It has the advantage of handling many auxiliary variables, but the disadvantage of limited analysis features in the second step, where the imputed data sets are used. 
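The multiple imputation route mentioned above uses two runs. A rough sketch, with hypothetical file and variable names (the second step goes in a separate input file, and the list file name should be taken from the step-1 output):

```
! Step 1 (hypothetical sketch): impute with the auxiliaries included.
DATA:     FILE = mydata.dat;
VARIABLE: NAMES ARE y x1 x2 aux1 aux2;
          USEVARIABLES ARE y x1 x2 aux1 aux2;
          MISSING ARE ALL (-999);
DATA IMPUTATION:
          IMPUTE = y (c) x1 x2 aux1 aux2;  ! (c) marks y as categorical
          NDATASETS = 20;
          SAVE = imp*.dat;
ANALYSIS: TYPE = BASIC;

! Step 2 (separate input file): analyze and pool across imputations.
DATA:     FILE = implist.dat;              ! list file written by step 1
          TYPE = IMPUTATION;
VARIABLE: NAMES ARE y x1 x2 aux1 aux2;
          USEVARIABLES ARE y x1 x2;
          CATEGORICAL ARE y;
MODEL:    y ON x1 x2;
```

The auxiliaries do their work in step 1 and can be dropped from the step-2 analysis model.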
