Anonymous posted on Tuesday, September 28, 2004 - 4:07 pm
Which is better if you plan to do categorical SE modeling later and want to check the reliability and validity of your two expected factors with a "correlation matrix"? 1. Handle the missing values the default way: exclude any observation with one or more missing values, or 2. Use all observations by specifying TYPE=MISSING;?
bmuthen posted on Wednesday, September 29, 2004 - 10:50 pm
I am interested in conducting multinomial regression analyses, in some cases with latent categorical outcomes, and in other cases with observed categorical outcomes; in both cases with missing data on the outcomes. Per your recommendations, I have been using covariates to improve the precision of classification (my covariates have no missing data).
I have been specifying "missing" to make use of all of the possible information, but have been surprised to find that Mplus is eliminating cases that have all missing data on the latent class indicators. This would make sense to me if I were not using covariates (i.e., a case cannot be analyzed if it has no non-missing values). But since my covariates are 100% complete, I expected that Mplus would use all the cases, using information from the covariates to feed into the latent class estimates. I had the same issue when using observed categorical outcomes.
It would be great if you would provide some insight as to why this is the case with these categorical analyses in contrast to analyses with continuous outcomes in which cases are utilized if they have a non-missing predictor and/or outcome value. My worry is that the estimates are biased by excluding cases with missing data on the latent class indicators (alternately observed class in the ordinary multinomial analyses).
With u denoting your latent class indicators and x the covariates, you are essentially estimating the regression model [u | x]. People with missing on all components of the u vector but with x observed don't contribute to the likelihood for that model. So it is correct to delete such individuals. This is in line with the bivariate normal missing data example in the Little & Rubin book. Although people with missing on u but not on x contribute to estimation of [u] parameters, they don't contribute to the [u|x] parameters that Mplus reports.
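To make the likelihood argument concrete, here is a small Python sketch (not Mplus code, and not how Mplus computes anything internally). It assumes a single observed three-category outcome u with a multinomial-logit model for [u | x]. The toy data, the parameter values, and the function names class_probs and loglik are all made up for illustration. A case with u missing contributes the sum of P(u = k | x) over all categories k, which is 1, so its log-likelihood contribution is 0 whatever the parameter values are.

```python
import math

def class_probs(x, params):
    """Multinomial-logit P(u = k | x); the last category is the reference."""
    logits = [a + b * x for a, b in params] + [0.0]
    m = max(logits)  # subtract the max for numerical stability
    expd = [math.exp(v - m) for v in logits]
    total = sum(expd)
    return [e / total for e in expd]

def loglik(data, params):
    """Observed-data log-likelihood for [u | x]; u is None when missing."""
    ll = 0.0
    for x, u in data:
        p = class_probs(x, params)
        if u is None:
            # Missing outcome: sum over every possible category, which is 1.
            ll += math.log(sum(p))
        else:
            ll += math.log(p[u])
    return ll

# Hypothetical toy data: (x, u) pairs, the last two cases have u missing.
data = [(0.5, 0), (-1.2, 2), (0.3, 1), (2.0, None), (-0.7, None)]
complete = [(x, u) for x, u in data if u is not None]

# Two arbitrary parameter sets: (intercept, slope) for categories 0 and 1.
for params in ([(0.2, 1.0), (-0.5, 0.3)], [(1.5, -2.0), (0.0, 0.7)]):
    diff = loglik(data, params) - loglik(complete, params)
    print(f"difference in log-likelihood: {diff:.2e}")
```

The difference is zero up to floating-point rounding for any parameter values, so the cases with u missing cannot change the estimates of the [u | x] parameters.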
Lois Downey posted on Wednesday, September 02, 2015 - 5:18 pm
I am running regression models with ordered categorical outcomes and am interested in the best way to handle missing data.
A writeup at http://www.ats.ucla.edu/stat/mplus/faq/fiml_counts.htm indicates that when the outcome is a continuous variable, Mplus includes all available data for cases that have some missing values, but for (at least some) outcomes that are not continuous, it does not. To circumvent this problem, the author recommends that one include the means for all predictors (even dichotomous predictors) in the model.
Do you think this procedure is wise, in the interest of using all available information? If so, would you include the means of all predictors, all predictors except dichotomies, or only continuous predictors?
In a univariate regression, if you bring the covariates into the model to avoid listwise deletion of any case that has missing on one or more covariates, the regression coefficients are estimated using only the cases with information on both the covariate and the dependent variable.
Lois Downey posted on Thursday, September 03, 2015 - 12:23 am
Regarding your response from 9/2/15:
I have a model that includes a continuous predictor, 2 dichotomous predictors, and 10 indicator dummies representing an 11-category nominal scale predictor.
The coefficients are different for a model in which no means are included and a model that includes the mean for the continuous predictor. Also, the DIFFTEST shows the nominal scale predictor to be statistically significant when the mean of the continuous predictor is in the equation (p=.0369), but nonsignificant when it is omitted (p=.0671).
Given your response, I wouldn't have expected this difference. Can you explain?
There are many things about the model above that are not mentioned. I would need more information. One comment, however, is that if you include predictors in the model you must include either all of them or none of them. When you include a subset, the covariances among those included and those not included are zero, which is not what you want. When the model is estimated conditioned on the covariates and they are not brought into the model, their covariances are not zero but the sample statistic values. So bringing in only one predictor is likely the issue above.
Lois Downey posted on Thursday, September 03, 2015 - 1:13 pm
Re: your response of 9/3/15.
Thanks. The "all or none" instruction is very helpful. When I include the means for the indicators of the nominal scale variable in this model, the model doesn't converge (after a long string of warnings about my having declared a dichotomous variable as continuous). So -- at least for this model -- it looks as if I need to include NONE of the means.
A nominal predictor should be turned into a set of dummy variables. If not, it is treated as a continuous variable. Bringing the covariates into the model should be done only with maximum likelihood, not weighted least squares.