Beginner question on count data (pois...
Mplus Discussion > Categorical Data Modeling >
 Fernando Terrés de Ercilla posted on Wednesday, March 29, 2006 - 3:29 am
I’m trying to analyze some panel data models with count variables, where my interest is in the behavior of the rate and its relationship with certain covariates. Most of my data are severely overdispersed, and when I try to use them I receive an error message related to the computation of the posterior distribution. Is that due to the overdispersion? If so, is there any way to model overdispersed or negative binomial counts with Mplus?
I have also tried to fit a short example with data that weren’t overdispersed, but I got the same error message, so I don’t really know where my mistake is.

For example, if the data is:

id   acc   aflds   lnafl
--   ---   -----   --------
1    442   5776    8.661467
2    495   6085    8.713582
3    536   5936    8.688790
4    480   6008    8.700848
5    508   6223    8.736008
6    470   6321    8.751633
7    495   6569    8.790117

(Where acc: accidents, aflds: exposure, lnafl: ln(aflds))

Then, if I try the simple null model:

acc on lnafl@1

I receive the message: “SERIOUS PROBLEM IN THE OPTIMIZATION WHEN COMPUTING THE POSTERIOR DISTRIBUTION. CHANGE YOUR MODEL AND/OR STARTING VALUES.”

(The estimated intercept should be -2.528)
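
As a quick check on that figure, here is a minimal sketch in Python (not Mplus) of where the number comes from. It assumes the null model is a Poisson regression with a log link and lnafl entering as an offset (coefficient fixed at 1). Under that assumption the maximum-likelihood intercept has a closed form: the log of total accidents over total exposure. The function name is only illustrative.

```python
import math

# Data copied from the table in the post.
acc = [442, 495, 536, 480, 508, 470, 495]            # accident counts
aflds = [5776, 6085, 5936, 6008, 6223, 6321, 6569]   # exposure

def poisson_offset_intercept(counts, exposure):
    """ML intercept of an intercept-only Poisson model with log(exposure) as offset.

    With mu_i = exp(b0) * exposure_i, setting the score
    sum(y_i - exp(b0) * exposure_i) to zero gives
    exp(b0) = sum(y) / sum(exposure).
    """
    return math.log(sum(counts) / sum(exposure))

intercept = poisson_offset_intercept(acc, aflds)
print(round(intercept, 3))  # -2.528
```

This reproduces the -2.528 quoted above, so the target value itself is consistent with the data. It says nothing about why the Mplus optimization fails.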

Thanks in advance,
Fernando.
 Linda K. Muthen posted on Wednesday, March 29, 2006 - 8:29 am
I don't believe that the problem is overdispersion. Your counts are very high, which may indicate that they can be treated as continuous. I would not be able to say more without more information. You can send your input, data, output, and license number to support@statmodel.com if you want us to look into this further.
 Thomas Olino posted on Wednesday, May 05, 2010 - 11:40 am
I was wondering if anyone had thoughts on how to handle this situation. The dependent variable of interest is a proportion score that has many zero values. If these were count data, I would use a count distribution to model them. However, that seems inappropriate for proportions.

Would it be reasonable to model zero versus non-zero as a binary outcome and then, for the non-zero proportion values, estimate those as a continuous variable?

Thanks!

Tom