
Anonymous posted on Tuesday, September 28, 2004  4:07 pm



Which is better if you plan to do categorical SEM later and want to check the reliability and validity of your two expected factors with a "correlation matrix"? 1. Handle missing values the default way: exclude any observation with one or more missing values, or 2. Use all observations by specifying TYPE=MISSING;?

bmuthen posted on Wednesday, September 29, 2004  10:50 pm



Type = Missing should be preferred. 


I am interested in conducting multinomial regression analyses, in some cases with latent categorical outcomes, and in other cases with observed categorical outcomes; in both cases with missing data on the outcomes. Per your recommendations, I have been using covariates to improve the precision of classification (my covariates have no missing data). I have been specifying "missing" to make use of all of the possible information, but have been surprised to find that Mplus is eliminating cases that have all missing data on the latent class indicators. This would make sense to me if I were not using covariates (i.e., a case cannot be analyzed if it has no nonmissing values). But since my covariates are 100% complete, I expected that Mplus would use all the cases, using information from the covariates to feed into the latent class estimates. I had the same issue when using observed categorical outcomes. It would be great if you would provide some insight as to why this is the case with these categorical analyses in contrast to analyses with continuous outcomes in which cases are utilized if they have a nonmissing predictor and/or outcome value. My worry is that the estimates are biased by excluding cases with missing data on the latent class indicators (alternately observed class in the ordinary multinomial analyses). Thanks for your time and attention! 


With u denoting your latent class indicators and x the covariates, you are essentially estimating the regression model [u | x]. People with missing on all components of the u vector but with x observed don't contribute to the likelihood for that model. So it is correct to delete such individuals. This is in line with the bivariate normal missing data example in the Little & Rubin book. Although people with missing on u but not on x contribute to estimation of [x] parameters, they don't contribute to the [u | x] parameters that Mplus reports.
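The point about the likelihood can be checked numerically. The sketch below is not Mplus code and not how Mplus computes anything; it assumes a single binary outcome u with a logistic regression on one covariate x, and the function names and toy data are hypothetical. For a case with u missing, the observed-data likelihood sums P(u | x) over both values of u, which gives 1, so the log-likelihood contribution is 0 at every parameter value. The same summing-out argument would apply to a vector of indicators that are all missing.

```python
import math

def prob_u1(x, alpha, beta):
    """P(u = 1 | x) under a binary logistic regression (assumed model)."""
    return 1.0 / (1.0 + math.exp(-(alpha + beta * x)))

def case_loglik(u, x, alpha, beta):
    """Observed-data log-likelihood contribution of one case to [u | x].

    u is 0, 1, or None (missing). A missing u is summed out over both
    possible values, so the contribution is log(p + (1 - p)) = log(1) = 0.
    """
    p = prob_u1(x, alpha, beta)
    if u is None:
        return math.log(p + (1.0 - p))
    return math.log(p) if u == 1 else math.log(1.0 - p)

def total_loglik(cases, alpha, beta):
    """Sum of the per-case contributions."""
    return sum(case_loglik(u, x, alpha, beta) for u, x in cases)

# Hypothetical toy data: (u, x) pairs; None marks a missing outcome.
cases = [(1, 0.5), (0, -1.2), (None, 2.0), (1, 1.7), (None, -0.3), (0, 0.1)]
complete = [(u, x) for u, x in cases if u is not None]

# Compare the two log-likelihoods at a few arbitrary parameter values.
for alpha, beta in [(0.0, 1.0), (-0.5, 2.0), (1.0, -0.7)]:
    full = total_loglik(cases, alpha, beta)
    cc = total_loglik(complete, alpha, beta)
    print(f"alpha={alpha:+.1f} beta={beta:+.1f}  all={full:.6f}  complete={cc:.6f}")
```

Because the two log-likelihoods agree (up to floating-point rounding) for any alpha and beta, they have the same maximizer, so dropping the cases with u missing does not change the [u | x] estimates in this setup.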


Thanks for the quick reply! Does this logic also apply identically to observed categorical outcomes? 


Yes. 

Lois Downey posted on Wednesday, September 02, 2015  5:18 pm



I am running regression models with ordered categorical outcomes and am interested in the best way to handle missing data. A writeup at http://www.ats.ucla.edu/stat/mplus/faq/fiml_counts.htm indicates that when the outcome is a continuous variable, Mplus includes all available data for cases that have some missing values, but for (at least some) outcomes that are not continuous, it does not. To circumvent this problem, the author recommends that one include the means for all predictors (even dichotomous predictors) in the model. Do you think this procedure is wise, in the interest of using all available information? If so, would you include the means of all predictors, all predictors except dichotomies, or only continuous predictors? Thanks!


In a univariate regression, if you bring the covariates into the model to avoid listwise deletion of any case that has missing on one or more covariates, the regression coefficients are estimated using only the cases with information on both the covariate and the dependent variable. 

Lois Downey posted on Thursday, September 03, 2015  12:23 am



Regarding your response from 9/2/15: I have a model that includes a continuous predictor, 2 dichotomous predictors, and 10 indicator dummies representing an 11-category nominal scale predictor. The coefficients are different for a model in which no means are included and a model that includes the mean for the continuous predictor. Also, the DIFFTEST shows the nominal scale predictor to be statistically significant when the mean of the continuous predictor is in the equation (p=.0369), but nonsignificant when it is omitted (p=.0671). Given your response, I wouldn't have expected this difference. Can you explain? Thank you!
