Ian Zajac posted on Wednesday, August 15, 2007 - 8:03 pm
I am performing a CFA with dichotomous outcome variables. I get the EMPTY CELL warning regarding the bivariate table and the relevant variable names.
This, I have read, means that the tetrachoric correlation for a pair of variables is equal to 1. However, when I inspect the sample tetrachoric matrix, this is not the case at all. The correlations are generally moderate (@ .45 or so).
Am I looking at the right correlations? Also, could this be due to highly skewed variables? It appears that all the suspect variables are (i.e., >95% correct/incorrect).
Empty cells in bivariate tables are most common when variables have extreme cuts like 95/5. Empty cells imply correlations of one. I'm not sure what you are looking at, so I can't say if you are looking at the right thing. If you want further clarification, please send your input, data, output, and license number to email@example.com.
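For anyone who wants to see which pairs trigger the warning before running the model, the check can be sketched outside Mplus. This is a minimal sketch in plain Python, not anything Mplus itself runs; the function name `empty_cell_pairs` and the toy variables `u1`, `u2`, `u3` are made up for illustration.

```python
from collections import Counter
from itertools import combinations

ALL_CELLS = [(0, 0), (0, 1), (1, 0), (1, 1)]

def empty_cell_pairs(data):
    """Return (var_a, var_b, empty_cells) for every pair of 0/1 variables
    whose 2x2 cross-tabulation has at least one cell with zero cases.

    `data` maps a variable name to a list of 0/1 responses, one per case.
    """
    flagged = []
    for a, b in combinations(data, 2):
        counts = Counter(zip(data[a], data[b]))
        empty = [cell for cell in ALL_CELLS if counts[cell] == 0]
        if empty:
            flagged.append((a, b, empty))
    return flagged

# Hypothetical toy data (20 cases): u1 and u2 are both rare (10% endorsed)
# and are never endorsed by the same case; u3 is split 50/50.
data = {
    "u1": [1, 1] + [0] * 18,
    "u2": [0, 0, 1, 1] + [0] * 16,
    "u3": [1, 0] * 10,
}

for a, b, empty in empty_cell_pairs(data):
    print(f"bivariate table of {a} and {b} has empty cell(s): {empty}")
```

With skewed items such as `u1` and `u2`, a zero cell is easy to get in a modest sample even when the underlying association is not extreme, which fits the 95/5 remark above.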
Ian Zajac posted on Monday, August 20, 2007 - 6:14 pm
OUTPUT is sampstat
You get a number of statistics, including:
> SUMMARY OF CATEGORICAL DATA PROPORTIONS
> SAMPLE STATISTICS
> ESTIMATED SAMPLE STATISTICS
> SAMPLE TETRACHORIC CORRELATIONS
Given the warnings, I would expect that when I inspect the 'Sample tetrachoric correlations', I would find a correlation of 1 between the variables noted in the warnings. But this isn't the case. The correlations are actually moderate and are never equal to 1.
Dear Linda, I have the exact same problem as Ian: "I get the EMPTY CELL warning regarding the bivariate table and the relevant variable names.
This, I have read, means that the tetrachoric correlation for a pair of variables is equal to 1. However, when I inspect the sample tetrachoric matrix, this is not the case at all. The correlations are generally moderate (@ .45 or so)."
For ordinal variables, is it necessary to ensure there are no empty cells in any of the bivariate tables? I am investigating the structure in 32 variables with 4 categories and will exclude 6 of the low-frequency variables to address this problem. However, is it OK if there is the occasional empty cell remaining?
I am having just the same problem that Ian and Theresa have had. I get a slew of warning messages telling me that the bivariate table of (dichotomous variable) and (other dichotomous variable) has an empty cell. What does this mean? If I calculate the bivariate correlation between those two variables, it is not 1, and it is also not 0. It is generally moderate, just as Ian described.
I'm a little confused by what is being said here. In my case I have: THE BIVARIATE TABLE OF M7 AND M1 HAS AN EMPTY CELL. In traditional categorical data analysis you could put 0.5 in that cell or something of that sort. Is there a way to deal with this in Mplus besides just deleting a variable?
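To make the 0.5 idea concrete, here is a minimal sketch of what such an adjustment does to a 2x2 table. It uses the rough "cos-pi" approximation to the tetrachoric correlation, cos(pi / (1 + sqrt(ad / bc))), purely for illustration; this is an assumption of the sketch and not the estimator Mplus uses, and the function names and the counts are hypothetical.

```python
import math

def cos_pi_tetrachoric(a, b, c, d):
    """Rough cos-pi approximation to the tetrachoric correlation.

    Cell layout (counts): a = (0,0), b = (0,1), c = (1,0), d = (1,1).
    Illustration only; not the estimator used by Mplus.
    """
    if b * c == 0:
        if a * d == 0:
            raise ValueError("odds ratio is undefined for this table")
        return 1.0  # infinite odds ratio: the approximation hits +1
    odds_ratio = (a * d) / (b * c)
    return math.cos(math.pi / (1.0 + math.sqrt(odds_ratio)))

def add_half(a, b, c, d):
    """Add 0.5 to every cell, one traditional fix for zero cells."""
    return tuple(x + 0.5 for x in (a, b, c, d))

# Hypothetical table with one empty off-diagonal cell.
table = (90, 5, 0, 5)

raw = cos_pi_tetrachoric(*table)
adjusted = cos_pi_tetrachoric(*add_half(*table))
print(f"raw: {raw:.3f}  after adding 0.5: {adjusted:.3f}")
```

In this sketch the raw table sits on the boundary, while the adjusted table gives a value strictly inside it. An empty diagonal cell (a or d equal to zero) pushes the approximation to -1 instead, so "correlation of one" is best read in absolute value.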
Having searched high and low for why my estimates may differ in my IRT models depending on my use of WLSMV vs. MLR, I think this may be it.
In the default (WLSMV) mode, I get these warnings for a number of variables. Using MLR I don't. I have no missing data. Does the MLR estimator simply handle the empty cells differently?
ML doesn't first compute latent correlations as in WLSMV, so results may differ a bit when there are zero cells (which hurts correlations). But note also that the default ML link is logit, whereas WLSMV uses probit.
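The link difference matters when comparing raw estimates, because logit and probit coefficients are on different scales. A minimal sketch of this, assuming the conventional scaling constant of about 1.702 (a textbook value, not something stated in this thread):

```python
import math

def probit_cdf(x):
    """Standard normal CDF, the inverse of the probit link."""
    return 0.5 * (1.0 + math.erf(x / math.sqrt(2.0)))

def logistic_cdf(x):
    """Standard logistic CDF, the inverse of the logit link."""
    return 1.0 / (1.0 + math.exp(-x))

SCALE = 1.702  # conventional logit-to-probit scaling constant (assumption)

grid = [i / 100.0 for i in range(-500, 501)]

# Largest gap between the two curves when the same coefficient is used.
unscaled_gap = max(abs(probit_cdf(x) - logistic_cdf(x)) for x in grid)

# Largest gap once the logistic argument is multiplied by SCALE.
scaled_gap = max(abs(probit_cdf(x) - logistic_cdf(SCALE * x)) for x in grid)

print(f"max gap, unscaled: {unscaled_gap:.4f}")
print(f"max gap, scaled:   {scaled_gap:.4f}")
```

So a logit loading would be expected to be larger than the corresponding probit loading by a factor of very roughly 1.7 even when the two models imply nearly the same curves. Differences beyond that may reflect the estimator or the empty cells rather than the link.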
1) Most p-values for the indicators (loadings) in the WLSMV (probit) are significant; in MLR (logit) they are not. That said, ICC curves are generally in accord. The one curve that literally reverses its direction is for the indicator that has the most empty cell warnings in WLSMV. As such, is MLR (logit) somehow taking into account the rarity of occurrence for this particular behavior (indicator) differently than probit? More specifically, given the high zero-inflation of the indicators, is MLR/logit a better choice?
2) What fascinates me is that the estimates of the IRT ZIP I ran (count) mimic the results of the MLR IRT run (categorical). Thus, my inclination is that using MLR in the binary case better addresses my data, given such rare occurrences of some indicators. Or am I completely off base?
1) I don't think it is clear that ML is better than WLSMV with empty cells. Research with simulations would be needed to shed light on that; I can't recall having seen any. Anyone? I would think ML also suffers from empty cells, since you then have limited information about the association between pairs of items.
2) I am not sure one can take the count results as support for the binary ML advantage - they may suffer similarly from the empty cells.