I have a 17-item, 5-point response format symptom checklist that I am using as my outcome measure with a large sample (n ~ 3000). The distribution is clearly non-normal, J-shaped as you would expect, but most of the sample endorses at least one of the symptoms at a non-zero level.
My colleague is trying to convince me that an "overdispersed Poisson" regression is the right way to model predictors, rather than OLS regression after some sort of non-linear transformation of the outcome to approximate normality. He is excited about an interaction that he reports. When I run the analyses using OLS or logistic regression (after dichotomizing the outcome), sure enough the main effects are significant, BUT the interaction is significant only when run with the Poisson.
His argument is basically that the Poisson is "more appropriate, sensitive, and powerful"... But the choice of the Poisson does not seem appropriate to me (rather, possibly opportunistic). I have used Poisson with rare events, like suicide counts or rates. Does the use of the Poisson seem appropriate in this context?
~~~~~~~~~~~~~~~~~~~~~~~ Stephen C. Messer, PhD Chief, Psychiatry Research Section
Given that the items are 5-category, treating their sum (which I assume is what's done) as ZIP (zero-inflated Poisson) does not seem all that natural to me. Linear regression, even after transformation, does not seem appropriate either, given the strong floor effects. Dichotomizing the sum seems all right, but loses a lot of information. I think you are right to be skeptical when only one modeling approach gives a certain answer. Trying other alternatives for triangulation seems wise. How about letting the dependent variable be a factor measured by the 17 categorical items? That handles the non-normality and gives you power due to a parsimonious model. It can be done using ML or WLSMV in Mplus.
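As one quick descriptive check before comparing models, it may help to look at how far the sum score departs from the Poisson assumption that the variance equals the mean. Below is a minimal Python sketch of that check, not the analysis from this thread: the function name `dispersion_ratio` and the values in `toy_scores` are made up purely for illustration. A ratio well above 1 suggests overdispersion relative to a plain Poisson; it says nothing by itself about whether a count model suits a bounded sum of ordinal items.

```python
from statistics import mean, variance


def dispersion_ratio(scores):
    """Return sample variance divided by sample mean.

    Under a Poisson distribution this ratio is expected to be near 1;
    values well above 1 indicate overdispersion.
    """
    m = mean(scores)
    if m == 0:
        raise ValueError("mean is zero; dispersion ratio is undefined")
    return variance(scores) / m


# Hypothetical toy sum scores (NOT real data): a J-shaped pile-up near zero
# with a long right tail, as described for the symptom checklist.
toy_scores = [0, 0, 1, 1, 1, 2, 2, 3, 5, 8, 13, 21]

ratio = dispersion_ratio(toy_scores)
print(f"mean = {mean(toy_scores):.2f}, variance/mean = {ratio:.2f}")
```

With real data one would apply the same function to the observed sum scores, ideally within levels of the predictors, since overdispersion is properly judged conditional on the model.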