Is it possible to model count data in an IRT framework using a zero-inflated Poisson (ZIP) model? I would like to adopt an IRT/latent trait model for the count data so that I can generate the same IRT graphics that are available in the dichotomous case. The only way I have been able to do this so far is to recode the zero-inflated data to presence/absence, and I am still not certain about the implications of doing so.
Thank you. I just noticed that in the Topic 2 notes for the NLSY example, the estimator used was MLR rather than WLSMV, the default estimator for categorical indicators. Was this decision made to better deal with the non-normality of the data?
The reason I ask is that I get somewhat different item discrimination results depending on the estimator I choose. Incidentally, the items most affected are those that, when treated as count data, have the narrowest range of values.
Thank you. Estimates from the IRT model where I treat outcomes as binary (0/1) categorical and from the zero-inflated part of the model where I treat data as count are indeed similar (both using MLR of course).
Loading estimate magnitudes for the binary IRT model using MLR are somewhat different from those for the binary IRT model using WLSMV (as are the associated p-values, most of which are NS using MLR). My best guess is that this is due to how MLR handles non-normality. Are there any other likely reasons why the loading estimates would differ somewhat between the two estimators?
I wanted to follow up on this thread, as I am following Wang (2010) in attempting to understand results from the IRT-ZIP model described above. More specifically, in the binary portion of the IRT-ZIP model, would the parameter estimates (discrimination/location) essentially describe the items' ability to predict membership in the structural zero group (where p(u#=1)), whereas in the count portion they would describe the items' ability to predict the estimated frequency/count given that the subject is in the Poisson process?
Also, would the intercepts in the CFA count model be analogous to the item difficulty/location parameters obtained when the data are treated as categorical (rather than count)? I ask because I am attempting to compare results from a 2PLM, where I treated the data as binary so as to generate ICCs, to the ZIP-IRT approach.
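For readers following the two-part question above, here is a minimal Python sketch of a generic zero-inflated Poisson distribution. It is only an illustration of the mixture, not Mplus output or the model from Wang (2010); the function names and the parameters `pi_zero` (probability of the structural zero class) and `rate` (Poisson mean for the count process) are hypothetical names chosen for this example.

```python
import math


def zip_pmf(k, pi_zero, rate):
    """P(Y = k) under a zero-inflated Poisson.

    pi_zero: probability of being in the structural zero class (assumed name)
    rate:    Poisson mean for subjects in the count process (assumed name)
    """
    poisson = math.exp(-rate) * rate ** k / math.factorial(k)
    if k == 0:
        # Zeros come from two sources: structural zeros and Poisson zeros.
        return pi_zero + (1.0 - pi_zero) * poisson
    # Positive counts can only come from the Poisson process.
    return (1.0 - pi_zero) * poisson


def zip_mean(pi_zero, rate):
    """Expected count, averaging over both classes."""
    return (1.0 - pi_zero) * rate


if __name__ == "__main__":
    # Compare the share of zeros with and without inflation.
    print("plain Poisson P(0):", math.exp(-2.0))
    print("ZIP P(0):          ", zip_pmf(0, 0.3, 2.0))
    print("ZIP mean:          ", zip_mean(0.3, 2.0))
```

The binary part governs only the extra mass at zero, while the count part governs the Poisson rate for everyone outside the structural zero class.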
In a factor analysis model such as UG ex 5.4, the latent binary u# variables are not related to the factor, but only their means are estimated. So the loadings/discriminations refer to the count part of the variable.
The intercepts for counts play the role of negative thresholds for binary items.
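The sign relationship between count intercepts and binary thresholds can be sketched as follows. This is an illustrative Python example under stated assumptions: a logit link for the binary item, a log link for the count item, and a factor with mean 0 and variance 1 when the location is computed. The function and argument names are hypothetical and not Mplus syntax.

```python
import math


def binary_prob(eta, loading, threshold):
    """Binary item: logit P(u = 1 | eta) = -threshold + loading * eta."""
    return 1.0 / (1.0 + math.exp(threshold - loading * eta))


def expected_count(eta, loading, intercept):
    """Count item: log E[count | eta] = intercept + loading * eta."""
    return math.exp(intercept + loading * eta)


def item_location(loading, threshold):
    """Factor value at which P(u = 1) = 0.5 (assumes factor mean 0, variance 1)."""
    return threshold / loading


if __name__ == "__main__":
    loading, threshold = 1.5, 0.6
    intercept = -threshold  # the intercept enters with the opposite sign
    for eta in (-1.0, 0.0, 1.0):
        print(eta,
              binary_prob(eta, loading, threshold),
              expected_count(eta, loading, intercept))
```

A higher threshold makes endorsing the binary item less likely at a given factor value, whereas a higher intercept raises the expected count, which is why the two play mirror-image roles.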
Thanks, Dr. Muthen. So if I am understanding correctly, the factor loadings are, in a sense, 'controlling' for the part of the model that is captured by the perfect zero state? Said differently, I am trying to better understand how this model (ZIP) differs from the same CFA of count data using a plain Poisson process.
I have conducted a CFA (n=9557) with seven count variables, using a negative binomial regression and MLR estimator (default).
I have not used STD (or any derivative), as you mention that count variables cannot be standardized. The chi2 suggests good model fit (X=45494.7, df=77826, p=1.00). The factor loadings look fine (.54-.95); however, the variance for the latent variable is 3.106. It is usually 1.00 when I have previously done CFA with categorical variables. Do you know why the variance for the latent variable is so large? Is this normal for CFA with count variables?