

Hello: Is it possible to model count data in an IRT framework using a ZIP model? I would like to adopt an IRT/latent trait model for the count data to generate IRT graphics like those generated in the dichotomous case. The only way I have been able to do this so far is to recode the zero-inflated data to presence/absence, the implications of which I am still not certain about.


Yes. I think we have an example of that in the UG. 


Thank you. I also consulted short course Topic 2 notes re: IRT using NLSY data as an example. I have two remaining questions. Is it possible to generate item characteristic curves when data are specified as count? In a ZIP count model, would not estimates from the binary component approximate those obtained when data are treated as categorical in prototypical IRT models? Many thanks. 


We don't give ICCs for count variables. Yes, the estimates should be similar.


Thank you. I just noticed in the Topic 2 notes for the NLSY example that the estimator used was MLR rather than WLSMV, the default estimator for categorical indicators. Was this decision made to better deal with the nonnormality of the data? The reason I ask is that I get somewhat different item discrimination results depending on the estimator I choose. Incidentally, the items most affected are those which, when treated as count data, have the narrowest range of values.


Which slide are you looking at? I am not aware of any estimator choice for a count outcome other than maximum likelihood.


Slide 98 for NLSY example with categorical indicators. 


We use MLR because we are doing a logistic IRT. WLSMV provides probit regression only. 


Thank you. Estimates from the IRT model where I treat outcomes as binary (0/1) categorical and from the zero-inflated part of the model where I treat data as count are indeed similar (both using MLR, of course). Loading estimate magnitudes for the binary IRT model using MLR are somewhat different from those for the binary IRT model when WLSMV is used (as are the associated p-values, most of which are NS using MLR). My best guess is that this is due to how MLR handles nonnormality. Are there any other likely reasons why the loading estimates would differ somewhat between the two estimators?


WLSMV is probit. MLR is logistic. Missing data handling differs between the two estimators. Normality of the observed variables is not an issue with categorical data methodology. 
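
To illustrate the probit/logistic point above: the two link functions differ in scale, so loadings from the two estimators are not directly comparable in raw magnitude. The sketch below is an illustration only, not Mplus output. It assumes the conventional scaling constant of about 1.702 and checks how closely a rescaled logistic curve tracks the normal (probit) curve. The function names are made up for this example.

```python
import math

def logistic_cdf(x):
    """Logistic response function used in a logit (MLR-type) IRT model."""
    return 1.0 / (1.0 + math.exp(-x))

def normal_cdf(x):
    """Standard normal CDF used in a probit (WLSMV-type) model."""
    return 0.5 * (1.0 + math.erf(x / math.sqrt(2.0)))

# Conventional scaling constant (an assumption of this sketch):
# logistic(D * z) is close to Phi(z) for all z.
D = 1.702

# Largest gap between the two curves on a grid from -6 to 6.
grid = [i / 100.0 for i in range(-600, 601)]
max_gap = max(abs(logistic_cdf(D * z) - normal_cdf(z)) for z in grid)

print(f"largest difference between the curves: {max_gap:.4f}")
```

Under this approximation, a logistic slope is roughly 1.7 times the corresponding probit slope, which by itself produces different loading magnitudes before any difference in missing data handling comes into play.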


I wanted to follow up on this thread as I am following Wang (2010) in attempting to understand results from the IRT-ZIP model described above. More specifically, in the binary portion of the IRT-ZIP model, would the parameter estimates (discrimination/location) essentially describe the items' ability to predict the structural zero group (where p(u#=1)), whereas in the count portion they would describe the items' ability to predict the estimated frequency/count given that the subject is in the Poisson process? Also, would the intercepts in the CFA count model be analogous to the item difficulty/location parameters when data are treated as categorical (rather than count)? I ask because I am attempting to compare results from a 2PLM, where I treated data as binary so as to generate ICCs, to the ZIP-IRT approach.


In a factor analysis model such as UG ex 5.4, the latent binary u# variables are not related to the factor, but only their means are estimated. So the loadings/discriminations refer to the count part of the variable. The intercepts for counts play the role of negative thresholds for binary items. 
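
One way to write down the model described in the reply above is sketched here. The notation is assumed for illustration and is not taken from the UG: $y_j$ is the count item, $\eta$ the factor, $\nu_j$ and $\lambda_j$ the count intercept and loading, $\pi_j = P(u\#_j = 1)$ the structural-zero probability (constant across $\eta$ in this setup), and $\tau_j$, $\lambda_j^{(b)}$ the threshold and loading of a binary 2PL item shown for comparison.

```latex
\begin{align}
\log \mu_j(\eta) &= \nu_j + \lambda_j \eta \\
P(y_j = 0 \mid \eta) &= \pi_j + (1 - \pi_j)\, e^{-\mu_j(\eta)} \\
P(y_j = k \mid \eta) &= (1 - \pi_j)\, \frac{\mu_j(\eta)^{k}\, e^{-\mu_j(\eta)}}{k!}, \qquad k = 1, 2, \dots \\
\operatorname{logit} P(u_j = 1 \mid \eta) &= -\tau_j + \lambda_j^{(b)} \eta \qquad \text{(binary item, for comparison)}
\end{align}
```

Read this way, $\nu_j$ sits in the same position in the count part as $-\tau_j$ does in the binary model, which is the sense in which the intercepts act as negative thresholds.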


Thanks, Dr. Muthen. So if I am understanding correctly, the factor loadings are, in a sense, 'controlling' for that part of the model that is captured by the perfect zero state? Said differently, I am trying to better understand how this model (ZIP) differs from the same CFA of count data using a plain Poisson process.


The factor loadings and intercepts are better estimated when ZIP is used due to a need to handle the excess number of zeros. 
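
A small numerical sketch of the excess-zeros point above, using made-up values (a Poisson mean of 2.0 and a structural-zero probability of 0.3, both assumptions for illustration). It compares the zero probability under ZIP with the zero probability of a plain Poisson that has the same overall mean.

```python
import math

def poisson_pmf(k, mu):
    """Plain Poisson probability of observing count k with mean mu."""
    return math.exp(-mu) * mu ** k / math.factorial(k)

def zip_pmf(k, mu, pi):
    """Zero-inflated Poisson: structural zero with probability pi,
    otherwise a Poisson(mu) draw (which can also be zero)."""
    base = (1.0 - pi) * poisson_pmf(k, mu)
    return pi + base if k == 0 else base

def zip_mean(mu, pi):
    """Overall mean of the zero-inflated count."""
    return (1.0 - pi) * mu

# Hypothetical values chosen only for illustration.
mu, pi = 2.0, 0.3

p0_zip = zip_pmf(0, mu, pi)
# A plain Poisson forced to match the same overall mean.
p0_poisson = poisson_pmf(0, zip_mean(mu, pi))

print(f"P(0) under ZIP:                    {p0_zip:.3f}")
print(f"P(0) under mean-matched Poisson:   {p0_poisson:.3f}")
```

With these values the plain Poisson predicts noticeably fewer zeros than the ZIP process generates, so a Poisson model fitted to such data has to distort its mean parameters to absorb the extra zeros.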


Hi there, I have conducted a CFA (n=9557) with seven count variables, using a negative binomial regression and the MLR estimator (default). I have not used STD (or any derivative), as you mention that count variables cannot be standardized. The chi-square suggests good model fit (chi-square = 45494.7, df = 77826, p = 1.00). The factor loadings look fine (.54-.95); however, the variance for the latent variable is 3.106. It usually is 1.00 when I have previously done CFA with categorical variables. Do you know why the variance for the latent variable is so large? Is this normal for CFA with count variables? Many thanks, Emily Lowthian


You can use STD, which standardizes with respect to the factors. I would not necessarily worry about these large factor loadings.
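
As a rough sketch of what standardizing with respect to the factor does to a loading: if the factor is rescaled to have variance 1, each loading is multiplied by the square root of the original factor variance. This is my reading of that kind of standardization and is an assumption, not a reproduction of Mplus output. The helper name is hypothetical; the variance and the two loadings are the values quoted in the post above.

```python
import math

def loading_with_unit_variance_factor(raw_loading, factor_variance):
    """Rescale a loading so that it refers to a factor with variance 1.

    Hypothetical helper for illustration: the factor is divided by its
    standard deviation, so the loading is multiplied by it.
    """
    return raw_loading * math.sqrt(factor_variance)

factor_variance = 3.106        # latent variable variance quoted in the post
raw_loadings = [0.54, 0.95]    # smallest and largest loadings quoted in the post

for raw in raw_loadings:
    rescaled = loading_with_unit_variance_factor(raw, factor_variance)
    print(f"raw loading {raw:.2f} -> loading on unit-variance factor {rescaled:.3f}")
```

The point of the sketch is only that the factor variance and the loadings trade off against each other: a variance of 3.106 with a given set of loadings describes the same model as a variance of 1 with proportionally larger loadings.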
