Udaya Wagle posted on Friday, October 28, 2005 - 2:51 pm
I'm estimating a model with 5 latent, 20 x, and 25 y variables (many of them categorical). My chi-square value is quite large (10627 with 734 df). Thinking that a chi-square this large might be due to the large sample (n = 2812), I drew several subsamples from the data and estimated the model with the same specification. While there was no convergence when sample sizes were small (n < 400), the ratio of chi-square to df was smaller than 5 with n = 600. The full sample gives a CFI, TLI, and RMSEA of .94, .93, and .069 respectively, and these are much worse with the subsamples. I know that the dataset I am working with has a lot of missing values, which I imputed using Stata, and that the distribution is skewed in many cases, but would it still be reasonable to go ahead with it and perhaps explain why the model fit is not that good?
Also, one of the latent concepts has an unidentified R-squared and I am not sure why that is the case.
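The fit figures quoted above can be checked by hand. The sketch below is only an illustration: it uses the textbook RMSEA point estimate, sqrt(max(chi2 - df, 0) / (df * (N - 1))), which may differ slightly from what Mplus reports under WLSM (some programs divide by N rather than N - 1). The function names are made up for this example. The rescaling function assumes the minimized fit function stays the same across sample sizes, which is a rough approximation and not something the thread confirms.

```python
import math


def chi2_df_ratio(chi2, df):
    """Ratio of the chi-square statistic to its degrees of freedom."""
    return chi2 / df


def rmsea(chi2, df, n):
    """Textbook RMSEA point estimate; assumes the N - 1 convention."""
    return math.sqrt(max(chi2 - df, 0.0) / (df * (n - 1)))


def rescaled_chi2(chi2, n_full, n_sub):
    """Chi-square expected at a smaller n if the fit function minimum
    were unchanged (chi2 is roughly (N - 1) times that minimum)."""
    return chi2 * (n_sub - 1) / (n_full - 1)


if __name__ == "__main__":
    # Values reported in the post above.
    chi2, df, n = 10627, 734, 2812
    print(round(chi2_df_ratio(chi2, df), 2))
    print(round(rmsea(chi2, df, n), 3))
    print(round(chi2_df_ratio(rescaled_chi2(chi2, n, 600), df), 2))
```

With the reported values, the ratio is about 14.5 and the RMSEA comes out at about .069, matching the posted figure. Under the constant-fit-function assumption the ratio at n = 600 would be about 3.1, so a ratio below 5 in the subsample reflects the smaller n rather than better fit.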
The unidentified R-square is most likely caused by a negative residual variance. This makes the model inadmissible, so you most likely need to change your model. None of your fit statistics is very good, so I would think you need to rethink your model. Starting with an EFA might be a good idea, then a CFA, and then add the covariates once you have a well-fitting CFA model.
Udaya Wagle posted on Saturday, October 29, 2005 - 8:58 pm
Thank you, Linda. I used the process you just described in developing the complex model I estimated in the end. But because I am using the WLSM estimator, I cannot get modification indices. Also, because all of the individual coefficients are significant after arriving at the final model through a testing-down approach, I cannot really identify the problem. Is there a way to ascertain which of the indicators might be at the root of the negative residual variance?
The problem of a high chi-square to df ratio also persists in the individual CFA models for each of the concepts when I use the full sample of 2812, even though the other fit indices are much better. But do model complexity, large sample size, highly skewed distributions, etc. help to explain the situation at all?
I absolutely agree that my model is not specified well. The concepts I am using are very broad and the theoretical connection among them is not that compelling. I have estimated a similar model with different data in a different context; the fit in that case was much better and the work has in fact already been published. In this sense, my goal in estimating this model, even with quite fragmented and not well-behaved data (here I am using secondary sources), is to draw scholars' attention to the need for more serious work. Would you suggest this is not a good idea?
Modification indices are available for the WLSMV estimator.
If you are getting R-square, then you should be getting the residual variances. This is where you would see which one is negative. You should also have seen this in your EFA. If you can't see what the problem is, I suggest sending your input, data, output, and license number to email@example.com.
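As a rough illustration of the check described above, the sketch below scans a set of residual variance estimates and flags any that are negative. The variable names and values are hypothetical and are not taken from the poster's output; in practice the estimates would be read off the residual variances section of the Mplus output.

```python
def negative_residual_variances(residual_variances):
    """Return the names of variables whose residual variance is below zero."""
    return [name for name, value in residual_variances.items() if value < 0]


if __name__ == "__main__":
    # Hypothetical estimates, for illustration only.
    estimates = {"y1": 0.412, "y2": -0.037, "y3": 0.268, "f3": 0.115}
    for name in negative_residual_variances(estimates):
        print(f"{name}: negative residual variance ({estimates[name]})")
```

Any variable flagged this way is the one to look at first, since a negative residual variance is what makes the solution inadmissible.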
I can't really comment on whether the idea behind what you are doing is good. I would have to know more than I can learn from Mplus Discussion.