Chi square significance and sample size
Mplus Discussion > Structural Equation Modeling
 Peter Croy posted on Saturday, April 14, 2007 - 7:33 pm
Can someone tell me why the chi-square test is "almost always" significant when the sample size is large? I have 2500 participants and always get a significant chi-square despite good fit on other indices (e.g., CFI > 0.96). What is the technical explanation for the sensitivity of chi-square to sample size?
 Linda K. Muthen posted on Sunday, April 15, 2007 - 8:30 am
The likelihood-ratio chi-square test of model fit for the H0 model against the H1 model is 2 times the sample size times a fitting function. See Technical Appendix 1.
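
To make the dependence on sample size concrete, here is a minimal Python sketch. It takes the relation stated above (chi-square = 2 x sample size x fitting function) at face value, holds the fitting-function value fixed, and varies N. The values F = 0.01 and df = 24 are made up for illustration, and `chi2_sf_even_df` is a helper written for this sketch (exact only for even degrees of freedom), not an Mplus function.

```python
import math


def chi2_sf_even_df(x, df):
    """Upper-tail p-value of a chi-square statistic (closed form, even df only)."""
    if df <= 0 or df % 2 != 0:
        raise ValueError("this sketch only handles positive, even df")
    half = x / 2.0
    term = 1.0   # (x/2)^0 / 0!
    total = 1.0
    for i in range(1, df // 2):
        term *= half / i          # (x/2)^i / i!
        total += term
    return math.exp(-half) * total


def chi_square_from_fit(n, f_min):
    """Chi-square implied by the relation quoted above: 2 * N * F."""
    return 2.0 * n * f_min


F_MIN = 0.01  # hypothetical minimum of the fitting function (same misfit throughout)
DF = 24       # hypothetical degrees of freedom

for n in (100, 500, 2500):
    stat = chi_square_from_fit(n, F_MIN)
    p_value = chi2_sf_even_df(stat, DF)
    print(f"N={n:5d}  chi-square={stat:6.1f}  p={p_value:.4f}")
```

With the misfit held constant, the statistic grows in proportion to N, so the same small discrepancy that is non-significant at N = 100 is rejected at N = 2500.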

If you want to test whether the poor fit is actually due to the sensitivity of chi-square, you can free parameters until you get a well-fitting model according to chi-square and compare the parameter estimates from this analysis to the one with fewer parameters. If the original parameter estimates are reproduced in the less parsimonious model, then you might have a case for chi-square sensitivity.
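
The comparison described above can be sketched in Python as follows. This is only an illustration: the parameter names and estimates are hypothetical and would be copied by hand from the two Mplus outputs, and `compare_estimates` is a helper invented for this sketch. It reports how far each shared estimate moved; it does not decide how close is close enough.

```python
def compare_estimates(original, relaxed):
    """Absolute change in each parameter that appears in both models."""
    shared = sorted(set(original) & set(relaxed))
    if not shared:
        raise ValueError("the two models share no parameters")
    diffs = {name: abs(original[name] - relaxed[name]) for name in shared}
    mean_diff = sum(diffs.values()) / len(diffs)
    worst = max(diffs, key=diffs.get)  # parameter that moved the most
    return diffs, mean_diff, worst


# Hypothetical standardized estimates from the more parsimonious model.
original_model = {"f1_by_y1": 0.80, "f1_by_y2": 0.70, "f1_by_y3": 0.60}

# Hypothetical estimates after freeing one residual covariance.
relaxed_model = {
    "f1_by_y1": 0.79,
    "f1_by_y2": 0.74,
    "f1_by_y3": 0.60,
    "y1_with_y2": 0.20,  # newly freed parameter, ignored in the comparison
}

diffs, mean_diff, worst = compare_estimates(original_model, relaxed_model)
for name, value in diffs.items():
    print(f"{name}: |difference| = {value:.3f}")
print(f"mean absolute difference = {mean_diff:.3f}; largest change: {worst}")
```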
 Peter Croy posted on Sunday, April 15, 2007 - 8:11 pm
I have already correlated some residuals, but otherwise my model is based on Mplus defaults. I have three latent variables predicting a fourth LV. All LVs have at least 3 (observed) indicators. I still get a significant chi-square using ML (MLM lowered the chi-square, but it was still significant).
What parameters do you suggest that I free in order to test for chi square sensitivity?
 Linda K. Muthen posted on Monday, April 16, 2007 - 7:57 am
You can look at modification indices and free the parameters with the larger modification indices.
 Peter Croy posted on Monday, April 16, 2007 - 11:46 pm
I have already done this to correlate residuals ... there are no further MIs of any great effect size.

So, where to from here? Do I rely on the often-cited claim that a large sample size tends very strongly to produce a large chi-square and that, on that basis, chi-square tests of model fit can be ignored and indices such as CFI used instead?
 Linda K. Muthen posted on Tuesday, April 17, 2007 - 8:13 am
Eventually, if you free enough fixed parameters, you will obtain a well-fitting chi-square. The question then is whether your original model fell apart or not. If it did, you can't blame the sensitivity of chi-square. In some cases, there are no single large modification indices that reduce chi-square sufficiently, but rather a set of moderate-sized ones. This could point to a poor model. A factor analysis model is not always the most appropriate for the data.
 Rachel Dyane Upton posted on Monday, November 10, 2008 - 10:26 pm
Hello. I am trying to run a latent class analysis with ordinal indicators (there are 8 ordinal variables with between 4 and 5 categories each) and 2 latent classes. For many of my models I've been getting a p-value of 1 for the likelihood-ratio chi-square test, and a p-value of between .7 and .3 for the Pearson's chi-square test.

I am not receiving any error messages that warn me of singularity problems, etc., so should I ignore the p-values for the likelihood-ratio chi-square test, or is it in fact an indication that something serious is wrong?

Thank you.
 Linda K. Muthen posted on Tuesday, November 11, 2008 - 1:06 pm
The likelihood ratio and Pearson chi-square tests work best with around 8 or fewer items. In these cases, they are trustworthy if they agree. If they do not agree, I would not use them.
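
For readers who want to see how the two statistics can disagree, here is a small Python sketch. Both compare observed response-pattern counts with model-expected counts. A commonly given reason for disagreement is sparseness: with 8 indicators of 4 categories each, the full table has 4^8 = 65,536 cells, so many cells have tiny expected counts. The counts below are made up for illustration and do not come from any Mplus run.

```python
import math


def pearson_chi2(observed, expected):
    """Pearson statistic: sum of (O - E)^2 / E over all cells."""
    return sum((o - e) ** 2 / e for o, e in zip(observed, expected))


def lr_chi2(observed, expected):
    """Likelihood-ratio statistic: 2 * sum of O * ln(O / E); empty cells add 0."""
    return 2.0 * sum(o * math.log(o / e) for o, e in zip(observed, expected) if o > 0)


# Hypothetical well-filled table: every expected count is large.
dense_obs = [48, 52, 95, 105]
dense_exp = [50.0, 50.0, 100.0, 100.0]

# Hypothetical sparse table: several cells have expected counts below 1.
sparse_obs = [0, 3, 0, 1, 96]
sparse_exp = [0.5, 0.5, 0.5, 0.5, 98.0]

for label, obs, exp in (("dense", dense_obs, dense_exp),
                        ("sparse", sparse_obs, sparse_exp)):
    print(f"{label}: Pearson = {pearson_chi2(obs, exp):.2f}, "
          f"LR = {lr_chi2(obs, exp):.2f}")
```

In the well-filled table the two statistics nearly coincide; in the sparse one they diverge noticeably, which is the situation where neither p-value can be relied on.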
 Lois Downey posted on Friday, February 06, 2009 - 11:57 am
My 5-factor CFA model with 17 ordinal indicators, based on 1291 cases, has good fit except for the chi-square test. Following your instructions, I freed parameters until the chi-square test was non-significant -- in the process allowing 13 indicator pairs to have correlated residuals.

Can you tell me how close the parameter estimates in the two models must be in order for me to conclude that the misfit of the original model is due to chi-square sensitivity? All of the factor loadings in the less parsimonious model remain statistically significant, with the absolute difference between the standardized factor loadings for the two models varying between .000 and .066 (mean absolute difference = .019).

However, 10 of the 13 pairs of residuals have correlations significantly different from 0, and the absolute values of some of those 10 are quite large (with the largest five falling between .30 and .36).

Should I conclude that the original model shows unacceptable fit?

Thanks,
Lois Downey
 Linda K. Muthen posted on Friday, February 06, 2009 - 4:02 pm
Given the number of significant residual covariances, you might want to go back to an EFA or to the method described in the following paper, which is available on the website:

Asparouhov, T. & Muthén, B. (2008). Exploratory structural equation modeling. Accepted for publication in Structural Equation Modeling.
 Jerry Cochran posted on Wednesday, October 20, 2010 - 5:51 pm
Hi Dr. Muthen,

I have a couple of questions on LCA and chi square significance:

1) Are the Pearson Chi-Square and the Likelihood Ratio Chi-Square both supposed to have p-values greater than .05 to have a good fitting model?

2) If so, what if one or both of them become less than .05 during the process of adding classes to find the optimal number of classes?
 Linda K. Muthen posted on Thursday, October 21, 2010 - 2:22 pm
1. Yes.
2. These tests usually don't work well with more than 8 latent class indicators. If they are not pretty close, they should both be ignored.
 s v posted on Friday, November 26, 2010 - 11:18 am
Hi, I have a question relating to sample size. I have 2 categories (each subject can respond to 'test' or to 'control', not both). I’m using a Chi2 test; in order to get a confidence level of 95% (alpha = 0.05), how large would my minimum sample size have to be? Is a sample size of 10 enough?
 Félix Caballero posted on Wednesday, February 13, 2013 - 8:52 am
Hello. Is the ratio of chi-square to degrees-of-freedom also influenced by large sample sizes?

This ratio is used to assess the fit of a model, with a ratio < 3 being considered acceptable. I have a two-factor model with 22 free parameters and a sample size higher than 10,000, and I wonder whether I should use this criterion to assess the goodness-of-fit of the model.

Thanks,
Félix
 Linda K. Muthen posted on Wednesday, February 13, 2013 - 10:25 am
We don't advocate the use of this ratio. You might want to post your question on SEMNET or a general discussion forum for other opinions.
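
As a rough note on the question itself: if one takes the relation given earlier in this thread (chi-square equals 2 times the sample size times the fitting function, written here as N and F) at face value, then dividing by the degrees of freedom does not remove the dependence on N. The symbols and the numbers in the example are illustrative assumptions, not values from any model discussed here.

```latex
\chi^2 = 2 N F
\quad\Longrightarrow\quad
\frac{\chi^2}{df} = \frac{2 N F}{df}
```

For a fixed F and df, the ratio grows linearly with N. With a hypothetical F = 0.01 and df = 24, N = 1,000 gives 20/24 ≈ 0.83, while N = 10,000 gives 200/24 ≈ 8.33, so the same misfit passes a "ratio < 3" rule in the smaller sample and fails it in the larger one.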