I am running logistic regression in Mplus using FIML. I would like to be able to generate some of the traditional model evaluation statistics that journal reviewers expect for logistic regression, to decrease the odds of my paper being rejected due to a reviewer's lack of familiarity with modelling in Mplus. To this end, I have two questions:
1) The R-square statistic in the output (presumably for Y*, per equation 15 of the first technical appendix) is helpful, but I would also like to provide the more commonly reported McFadden R-square. McFadden's pseudo R-square is one minus the ratio of the estimated model's loglikelihood to the null model's loglikelihood, but the loglikelihood required is different from the one provided in Mplus' Model Fit section. Is there any way to obtain the loglikelihood statistic required for McFadden's R-square in the Mplus output?
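For reference, the McFadden calculation itself can be sketched in a few lines of Python. This is only an illustration: the function name `mcfadden_r2` is made up here, and it assumes both loglikelihoods are for the outcome conditional on the covariates, i.e. on the same metric.

```python
def mcfadden_r2(ll_model: float, ll_null: float) -> float:
    """McFadden's pseudo R-square: 1 - LL(model) / LL(null).

    Both values are assumed to be loglikelihoods of y given x,
    taken from the fitted model and the intercept-only model.
    """
    if ll_null == 0:
        raise ValueError("null loglikelihood must be nonzero")
    return 1.0 - ll_model / ll_null


# Placeholder values for illustration only, not from any real run.
example_r2 = mcfadden_r2(ll_model=-80.0, ll_null=-100.0)
print(round(example_r2, 3))
```

With these placeholder inputs the result is 0.2, since the fitted model's loglikelihood is 80% of the null model's.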
2) I would also like to calculate the Hosmer & Lemeshow goodness-of-fit statistic. For a model without missing data, I could calculate it from the parameter estimates. But this is a poor approach for a model with missingness on variables that strongly predict the outcome variable. Is there any way to generate predicted probabilities (or predicted Y* values for all observations in a dataset) using the Mplus model's FIML-based estimates? Or would this be a violation of the assumptions required in an FIML modelling context?
Thank you for the guidance. When I do the two runs to get McFadden's R-square, what values should I plug into the R-square equation? I tried using the Loglikelihood from the Model Fit section, but the resulting value was far from the true McFadden R-square I calculated by running the same model in SAS: 0.014 from Mplus vs. 0.149 from SAS (all variables had complete data). Note: this was a simulation dataset, and the SAS value is in line with expectations.
The loglikelihood values are very different between what you report for SAS and Mplus, so something is off here. For instance, the LL for the Mplus H0 model is -6715.375 whereas the SAS number is -531.682, which is more than 10 times smaller in magnitude.
To sort this out, send your Mplus outputs and a pdf of the SAS output to Support.
The reason the LLs are so different is that, by mentioning their variances, your Mplus runs bring the covariates into the model in the sense of estimating their parameters. That implies that you have not one but several DVs, and therefore the LLs of the two programs are not on the same metric. Since you are doing regression, you don't want to include the covariates in the model. Don't mention the variances of the covariates and you will get agreement.
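One way to write out the difference in metric, as a sketch that assumes complete data and separate parameter sets for the regression (here called \theta) and for the covariate distribution (here called \phi); the symbols are introduced only for this illustration:

```latex
\log L(y, x;\ \theta, \phi)
  \;=\; \underbrace{\log L(y \mid x;\ \theta)}_{\text{conditional LL, as in ordinary logistic regression}}
  \;+\; \underbrace{\log L(x;\ \phi)}_{\text{marginal LL of the covariates}}
```

When the covariate variances are mentioned, the reported LL corresponds to the left-hand side; a conventional logistic regression LL corresponds to the first term on the right only.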
Thank you for the clarification. The example I sent you had complete data, for simplicity - I'm sorry that I neglected to mention this. Once I apply the model to a dataset with missing data, I will need to include the covariates in the model to invoke FIML. Based on your response, I take it that the LLs cannot be used to calculate McFadden's R-square for the logistic DV when the model utilizes FIML - is that correct?
I don't know - that is a research question. Perhaps it is possible to do a run with only the covariates - a just-identified model that handles the missingness - and then subtract that LL value from each of the two LLs to eliminate the marginal covariate LL and thereby still consider "y | x" as in McFadden's approach.
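The subtraction idea could be sketched as follows. This is an untested illustration, not an established method: it assumes the joint LL splits additively into a conditional part and a covariate part, which may not hold exactly with missing data. The function and argument names are made up for the sketch.

```python
def mcfadden_r2_adjusted(ll_joint_model: float,
                         ll_joint_null: float,
                         ll_covariates: float) -> float:
    """McFadden-style pseudo R-square after removing the covariate LL.

    ll_joint_model: LL of the run with y regressed on x, covariates in the model
    ll_joint_null:  LL of the run with y not regressed on x, covariates in the model
    ll_covariates:  LL of the covariate-only, just-identified run
    """
    # Assumed decomposition: joint LL = LL(y | x) + LL(x).
    ll_model = ll_joint_model - ll_covariates
    ll_null = ll_joint_null - ll_covariates
    if ll_null == 0:
        raise ValueError("adjusted null loglikelihood must be nonzero")
    return 1.0 - ll_model / ll_null


# Placeholder values for illustration only, not from any real run.
adjusted = mcfadden_r2_adjusted(-580.0, -600.0, -500.0)
naive = 1.0 - (-580.0) / (-600.0)
print(round(adjusted, 3), round(naive, 3))
```

With these placeholder inputs the adjusted value is 0.2 while the naive ratio of joint LLs is about 0.033, which shows how a large covariate LL can shrink the unadjusted figure.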
Dr. Muthen, would you still recommend subtracting the covariate-only model LL from the estimated model LL and null model LL to calculate McFadden's R-square? I have a similar situation where I'm including the covariates in order to use maximum likelihood but want to calculate a pseudo R-square for models with nominal DVs.
Model indirect:
Cycled ind uncert MomNPart;
Cycled ind divorceatt MomNPart;
Cycled ind loveenough MomNPart;
Cycled ind uncert DadNPart;
Cycled ind divorceatt DadNPart;
Cycled ind loveenough DadNPart;
Cycled ind uncert Divorced;
Cycled ind divorceatt Divorced;
Cycled ind loveenough Divorced;
Cycled ind uncert ParentsEverCycle;
Cycled ind divorceatt ParentsCycle;
Cycled ind loveenough ParentsCycle;