I'm fitting a zero-inflated negative binomial model to a complex dataset (clustering within schools). If I don't change the starting values, I get a reasonable result. But if I increase the number of starting values, I get a result in which parameters in the zero model are fixed to avoid singularity. I was also wondering which technique is used to correct the standard errors for the complex structure.
The model below has 2 zero-inflated (Poisson) dependent variables. I would like to include a correlation between these 2 variables. 1. Is this correlation specified correctly? 2. Is the standard MLR estimator the one to use? 3. Should I be surprised that the other coefficients are quite different, given a high correlation between the 2 independent variables?
Thank you very much for providing these techniques. It's a really nice model I couldn't have fitted before using Mplus. Ruben.
Usevariables are prop_del viol_del delgroup att_viol s_contr mk_tot;
Missing are .;
categorical are delgroup;
count is prop_del viol_del(i);
cluster = school;
analysis:
  type = complex;
  starts 100 20;
model:
  att_viol on MK_tot;
  S_contr on att_viol MK_tot;
  delgroup on S_contr att_viol;
  prop_del on MK_tot att_viol S_contr delgroup;
  prop_del#1 on MK_tot att_viol S_contr delgroup;
  viol_del on MK_tot att_viol S_contr delgroup;
  viol_del#1 on MK_tot att_viol S_contr delgroup;
  f BY prop_del viol_del;
  f@1;
The model is quite stable and replicates with other start values. However, one of the coefficients is very large: the estimate for prop_del#1 on delgroup is 14.525, so the odds ratio is about 2 million. That is a strange number to report. It does make sense for this coefficient to be large, just not that large, and the rest of the model makes sense.
Can and should I somehow restrict the size of this parameter? Or should I just report it like it is?
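As a quick check of the arithmetic above: the odds ratio is the exponential of the logit coefficient. The sketch below also shows what a shift of that size does on the probability scale. The baseline logit of -5 is a hypothetical value chosen for illustration and does not come from the model output.

```python
import math

def inv_logit(x):
    """Convert a logit to a probability."""
    return 1.0 / (1.0 + math.exp(-x))

logit_coef = 14.525                 # prop_del#1 on delgroup, as reported in the post
odds_ratio = math.exp(logit_coef)   # a little over 2 million
print(f"odds ratio: {odds_ratio:,.0f}")

# Hypothetical baseline logit for the inflation (zero) part, for illustration only.
baseline_logit = -5.0
p_before = inv_logit(baseline_logit)
p_after = inv_logit(baseline_logit + logit_coef)
print(f"P(zero class) before: {p_before:.4f}, after: {p_after:.6f}")
```

A coefficient this large usually means the predicted probability is pushed very close to 0 or 1 for one group, so reporting the probabilities alongside the odds ratio may be easier to read.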
In the model above, is it possible to estimate indirect effects on the zero-inflated dependent variables? Can Mplus do this? And on what scale would they be? I assume the effect would be expressed in 2 coefficients, one for the zero part and one for the count part?
The direction of the indirect effects (positive or negative) can be read off the path model, and that is how I will report them. I was just wondering whether the size of the effect could be estimated, with confidence intervals if possible.
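To make the two parts concrete, here is a minimal sketch of a zero-inflated Poisson probability function: a logit governs membership in the "always zero" class, and a log rate governs the count part. The function name and the numeric values are hypothetical and are not estimates from this model.

```python
import math

def zip_pmf(k, log_rate, inflation_logit):
    """P(Y = k) under a zero-inflated Poisson model.

    log_rate        : linear predictor of the count part (log scale)
    inflation_logit : linear predictor of the zero part (logit scale)
    """
    p_zero_class = 1.0 / (1.0 + math.exp(-inflation_logit))
    lam = math.exp(log_rate)
    poisson = math.exp(-lam) * lam ** k / math.factorial(k)
    structural_zero = p_zero_class if k == 0 else 0.0
    return structural_zero + (1.0 - p_zero_class) * poisson

# Hypothetical linear predictors, for illustration only.
probs = [zip_pmf(k, log_rate=0.5, inflation_logit=-1.0) for k in range(60)]
print(f"P(Y=0) = {probs[0]:.3f}, total probability = {sum(probs):.6f}")
```

Because a predictor enters both linear predictors, its effect on the outcome is indeed carried by two coefficients on two different scales (logit and log rate).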
The guiding principle for being able to produce indirect effects is that the M ON X and the Y ON M regressions are both linear. This is more general than it sounds. For instance, M can be a latent response variable for a categorical (binary or ordered) observed variable, in which case we call it M*. In probit regression, for example, M* ON X is then a linear regression. For this example what is required is that Y ON M is also linear. Y can then be a latent continuous response variable for a categorical outcome, a log rate for a count, or a log hazard for a survival variable. But, continuing the example with a categorical observed measure of the mediator M, the key is that Y ON M concerns the latent continuous response variable M*, not the observed categorical measurement. So for instance, with a binary variable it is not the event itself that predicts Y but the tendency for the event to happen.
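When both regressions are linear, the indirect effect is the product of the two slopes. The sketch below illustrates that product with a first-order delta-method (Sobel) standard error and a normal-theory interval. All numbers are hypothetical, and this is a generic illustration rather than a description of how Mplus computes its output.

```python
import math

# Hypothetical estimates and standard errors, NOT taken from any output in this thread.
a, se_a = 0.40, 0.10   # slope of M ON X
b, se_b = 0.25, 0.08   # slope of Y ON M (for a count Y this is on the log-rate scale)

indirect = a * b

# First-order delta-method (Sobel) standard error of the product a*b,
# assuming the two estimates are uncorrelated.
se_indirect = math.sqrt(a ** 2 * se_b ** 2 + b ** 2 * se_a ** 2)

z = 1.96  # approximate 95% normal quantile
ci = (indirect - z * se_indirect, indirect + z * se_indirect)
print(f"indirect = {indirect:.3f}, SE = {se_indirect:.3f}, "
      f"95% CI = ({ci[0]:.3f}, {ci[1]:.3f})")
```

The symmetric interval is only an approximation, since the sampling distribution of a product of coefficients is generally skewed.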
In this example, Y is a count and I think you had a categorical mediator. That's a tricky combination which Mplus doesn't yet handle. Count Y requires ML which doesn't yet work with a latent response variable M for Y ON M. Bayes can do that, but can't yet do counts.
Ksenia posted on Monday, December 01, 2014 - 8:56 am
I am looking for software that can handle a three-level zero-inflated negative binomial model. I would appreciate it if you could tell me whether:
1a) Mplus works with three-level zero-inflated negative binomial models; 1b) Mplus works with three-level negative binomial models;
2) one can graph an interaction in a three-level negative binomial model using Mplus. Thank you very much.
Mplus does not do three-level for count variables.
Tracy Witte posted on Tuesday, February 24, 2015 - 8:10 am
I am running a zero-inflated negative binomial regression in Mplus. I have a categorical predictor with 3 levels (i.e., three different diagnoses). Thus, I am modeling the predictors as a set of two dummy variables, with one of the diagnoses as the reference variable. Since this only allows for the comparison of each of the other diagnoses with the reference group, I also ran the regression again with a different diagnosis as the reference variable so that I can get that final pairwise comparison.
For the first regression, I got this warning message: "WARNING: THE MODEL ESTIMATION HAS REACHED A SADDLE POINT OR A POINT WHERE THE OBSERVED AND THE EXPECTED INFORMATION MATRICES DO NOT MATCH. AN ADJUSTMENT TO THE ESTIMATION OF THE INFORMATION MATRIX HAS BEEN MADE. THE CONDITION NUMBER IS -0.221D-04..."
Based on previous posts, from what I understand, it's ok to ignore this message. However, I did not get this warning message for the second regression I ran. Also, for this regression, the beta weights for the pairwise comparison that is also contained in the original regression are quite different. Normally, when I run regressions with dummy variables, in the second version, one of the beta weights is identical to the first and is just a different sign.
I'm wondering if these results are trustworthy, or if I'm perhaps doing something wrong. Any assistance would be helpful!
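The sign-flip expectation described above can be checked on a toy case. The sketch below uses a plain Poisson regression with only the three-category predictor, where the maximum-likelihood dummy coefficients are log ratios of group means. This is a simplification standing in for the zero-inflated negative binomial model, and the counts and group labels are hypothetical.

```python
import math

# Hypothetical counts per diagnosis group, for illustration only.
counts = {"A": [0, 2, 1, 3], "B": [4, 6, 5, 5], "C": [1, 0, 2, 1]}

def dummy_coefs(data, reference):
    """Dummy-variable coefficients (log scale) relative to the reference group."""
    means = {g: sum(v) / len(v) for g, v in data.items()}
    return {g: math.log(means[g] / means[reference])
            for g in data if g != reference}

ref_a = dummy_coefs(counts, "A")   # compares B and C with A
ref_b = dummy_coefs(counts, "B")   # compares A and C with B

# The B-vs-A contrast appears in both runs with opposite sign.
print(ref_a["B"], ref_b["A"])
# The C-vs-B contrast from the second run equals a difference from the first run.
print(ref_b["C"], ref_a["C"] - ref_a["B"])
```

Changing the reference category only reparameterizes the model, so when both runs reach the same maximum the shared contrast should match in magnitude. A noticeable mismatch would suggest that the two runs converged to different solutions.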
Could you please help me interpret the R-square output from running the following ZINB model?
count = numdxy (nbi);
Analysis:
  Type = complex;
  Integration = Montecarlo;
Model:
  numdxy on sexf ... e04;
  numdxy#1 on sexf ... e04;
  ! the following line is included so Mplus executes FIML rather than
  ! listwise deletion;
  sexf marcoh ed12sup unemply ipovert discnty e04;
Here's the output that I need help with:
R-SQUARE

    Obs Var     Estimate       S.E.
    NUMDXY         0.479      0.233
    NUMDXY         1.000    999.000
My questions: 1) Which line for NUMDXY above is for the count data 0 and higher, and which line is for the logistic regression predicting membership in the "must be zero" group?
2) Why is the R-square estimate 1.000 with an S.E. of 999.000?
Dr. Muthen, Thank you for your quick response. May I ask for a little more clarification?
1. As far as I can tell either you didn't answer my first question or I'm not smart enough to understand your reply. So, in general, in Mplus output for the R-squares from ZINB models, does the first line containing the name of the outcome variable report the R-square for the model for all observations with outcome values >= 0, or is the first line in the R-square section for the logistic regression predicting membership in the "must have count of 0" group?
2. Does your reply mean that the reported R-square values are for the standardized coefficients? If not, how is standardization a part of computing the R-square values? Thanks for unpacking your reply.