Mplus Discussion >> Zero-inflated gamma regression

Magnus Alderling posted on Thursday, November 10, 2016 - 3:29 am

Hi!

I have first done an LCA, in which a six-cluster solution gave the best fit to the data. In the next step I want to analyze the relation between these clusters and sickness absence measured in days. Sickness absence has a preponderance of zero values, and the distribution of the values above zero is skewed to the right. I decided to use DATA TWOPART in Mplus in order to do a zero-inflated gamma regression, but the beta coefficients for my clusters make no sense in the logistic regression part. For example, one cluster has, according to the data, a larger proportion of zeros than the reference cluster, but Mplus gives the opposite.
I have found a reference for how a zero-inflated gamma regression is carried out in SAS; the syntax was written by Dale McLerran and has been referenced by others. Applying that code in SAS gives beta coefficients that fit my data better. Specifically, for every cluster the beta coefficients from the logistic regression differ between Mplus and SAS only in sign; the magnitude is exactly the same. The beta coefficients from the gamma regression, on the other hand, resemble each other but are not identical: the intercept and the five cluster coefficients in Mplus were 113%, 67%, 66%, 71%, 64%, and 57%, respectively, of the ones obtained from SAS. Why do the beta coefficients differ in either magnitude or sign between the two programs?

Magnus

Magnus Alderling posted on Thursday, November 10, 2016 - 3:36 am

My Mplus syntax:

TITLE:
Distribution drivers

DATA TWOPART:
NAMES= sjldag;
BINARY= bin1;
CONTINUOUS =cont1;

DATA:
File is sjuklonedagar.txt;

VARIABLE:
Names are personnummer1 sjldag kluster;

IdVariable is personnummer1;

Usevariables are sjldag kluster1 kluster2 kluster3 kluster4 kluster5 bin1 cont1;

CATEGORICAL is bin1;

DEFINE:
kluster1 = kluster ==1;
kluster2 = kluster ==2;
kluster3 = kluster ==3;
kluster4 = kluster ==4;
kluster5 = kluster ==5;

ANALYSIS:
Estimator = ML;
STARTS = 400 100;
LRTSTARTS = 0 0 200 40;

MODEL:
cont1 on kluster1 kluster2 kluster3 kluster4 kluster5;
bin1 on kluster1 kluster2 kluster3 kluster4 kluster5;

OUTPUT:
sampstat;
tech4;
tech7;
tech10;
tech11;
tech14;
tech15;

PLOT:
type=plot2;

SAVEDATA:
File is item_prob_plot_sjuklonedagar;

Magnus Alderling posted on Thursday, November 10, 2016 - 3:38 am

The SAS syntax:

proc nlmixed data=sjldagar;
  parms b0_f=0 b1_f=0 b2_f=0 b3_f=0 b4_f=0 b5_f=0 b6_f=0
        b0_h=0 b1_h=0 b2_h=0 b3_h=0 b4_h=0 b5_h=0 b6_h=0
        log_theta=0;
  eta_f=b0_f+b1_f*kluster1+b2_f*kluster2+b3_f*kluster3+b4_f*kluster4+b5_f*kluster5+b6_f*kluster6;
  p_yEQ0=1/(1+exp(-eta_f));
  eta_h=b0_h+b1_h*kluster1+b2_h*kluster2+b3_h*kluster3+b4_h*kluster4+b5_h*kluster5+b6_h*kluster6;
  mu=exp(eta_h);
  theta=exp(log_theta);
  r=mu/theta;
  if sjldag=0 then
    ll=log(p_yEQ0);
  else
    ll=log(1-p_yEQ0)-lgamma(theta)+(theta-1)*log(sjldag)-theta*log(r)-sjldag/r;
  model sjldag ~ general(ll);
  predict (1-p_yEQ0)*mu out=expect_zig;
  predict r out=shape;
  estimate "scale" theta;
run;
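
For readers who do not use SAS, the per-observation log-likelihood in the NLMIXED program can be sketched in Python as below. This is only a transcription under stated assumptions, not a replacement for the SAS run: the function name `zig_loglik` and its arguments are made up for illustration, and the linear predictors `eta_f` and `eta_h` are passed in directly rather than built from the cluster dummies. Read this way, `theta` acts as the gamma shape and `r = mu/theta` as the gamma scale, so that shape times scale equals the mean `mu` of the positive days.

```python
import math


def zig_loglik(y: float, eta_f: float, eta_h: float, log_theta: float) -> float:
    """Log-likelihood of one observation, mirroring the NLMIXED statements.

    eta_f     : linear predictor of the logistic part (hypothetical input)
    eta_h     : linear predictor of the gamma part (hypothetical input)
    log_theta : log of the gamma shape parameter
    """
    p_zero = 1.0 / (1.0 + math.exp(-eta_f))  # P(y = 0), called p_yEQ0 in SAS
    mu = math.exp(eta_h)                     # mean of the positive values
    theta = math.exp(log_theta)              # gamma shape
    r = mu / theta                           # gamma scale (shape * scale = mu)
    if y == 0:
        return math.log(p_zero)
    # log(1 - p_zero) plus the log density of a gamma(shape=theta, scale=r)
    return (math.log(1.0 - p_zero) - math.lgamma(theta)
            + (theta - 1.0) * math.log(y) - theta * math.log(r) - y / r)


# With all parameters at their starting value 0: p_zero = 0.5, mu = 1, theta = 1
print(zig_loglik(0.0, 0.0, 0.0, 0.0))
print(zig_loglik(2.0, 0.0, 0.0, 0.0))
```

Note that the logistic part here models the probability of a zero, which matters when comparing signs with software that models the probability of a non-zero value.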

Bengt O. Muthen posted on Thursday, November 10, 2016 - 12:15 pm

Note that 2-part modeling is not the same as zero-inflated modeling; the latter is a mixture model and the former is not. So the results should not be expected to be the same, only similar. This is described in Chapter 7 of our new book:

http://www.statmodel.com/Mplus_Book.shtml

Note also that the binary part of 2-part modeling in Mplus describes the probability of not being at zero. Zero-inflated modeling typically describes the probability of being in the zero class. The latter is the case for example using the censored-inflated model in Mplus.
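
The sign reversal in the logistic part follows directly from which event the binary variable codes. A minimal Python sketch, using invented proportions (0.60 and 0.75 are illustrative only, not the poster's data) and a single cluster dummy, shows that modeling P(y = 0) and modeling P(y > 0) give coefficients of equal magnitude and opposite sign. With one dummy predictor the logistic model is saturated, so the maximum-likelihood coefficients are simply the group log-odds and their difference.

```python
import math


def logit(p: float) -> float:
    """Log-odds of a probability p in (0, 1)."""
    return math.log(p / (1.0 - p))


# Hypothetical proportions of zero-day records (illustrative values only).
p_zero_reference = 0.60  # reference cluster
p_zero_cluster = 0.75    # comparison cluster, which has more zeros

# Coding A: the binary outcome equals 1 when the value is zero
# (the zero-class probability, as in the SAS program).
intercept_zero = logit(p_zero_reference)
slope_zero = logit(p_zero_cluster) - intercept_zero

# Coding B: the binary outcome equals 1 when the value is above zero
# (the probability of not being at zero, as in the two-part binary part).
intercept_pos = logit(1.0 - p_zero_reference)
slope_pos = logit(1.0 - p_zero_cluster) - intercept_pos

print(f"P(y = 0) coding: intercept={intercept_zero:+.4f}, slope={slope_zero:+.4f}")
print(f"P(y > 0) coding: intercept={intercept_pos:+.4f}, slope={slope_pos:+.4f}")
```

Under coding B, a cluster with more zeros than the reference gets a negative coefficient, which is consistent with the data once the coding is taken into account.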

Also, we request that postings are limited to one window. Longer questions should be sent to Mplus Support.