Anonymous posted on Tuesday, March 08, 2005 - 10:45 am
Just a quick question. If I am using c# to capture the zero-inflation, then do I have to use the ii si | u1#1@0 u2#1@1 u3#1@2 u4#1@3; as is found in 8.11? I guess I'm confused as to how to read the ii and si in the output, so if you have any suggestions where I may go to figure this part out too? I really appreciate it.
Anonymous posted on Tuesday, March 08, 2005 - 10:50 am
I am sorry. I specifically am referring to the means portion of the output. What does it mean when the ii and/or the si have a significant mean?
bmuthen posted on Tuesday, March 08, 2005 - 11:47 am
"ut#1" refers to a dichotomous latent variable where the focus is on the probability of being in the class that cannot obtain an observed count other than zero ("zero class" at time t). The estimated ii and si means are interpreted like growth modeling of a binary outcome - for instance, the mean of i is the logit for the probability of being in the zero class at the time point with time score 0 and the mean of si gives the change in that logit over time.
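As a small numeric illustration of this reading, the sketch below turns a linear logit growth line into zero-class probabilities at each time score. The numbers and the names `logit_at_zero`, `mean_si` and `zero_class_probs` are hypothetical and not taken from any output in this thread. The calculation simply plugs in growth factor means, so it only applies as written when the inflation growth factors have no variance.

```python
import math

def logistic(x):
    """Convert a logit to a probability."""
    return 1.0 / (1.0 + math.exp(-x))

def zero_class_probs(logit_at_zero, mean_si, time_scores):
    """Zero-class probability at each time score for a linear logit growth line."""
    return [logistic(logit_at_zero + mean_si * t) for t in time_scores]

# Hypothetical values chosen only for illustration.
time_scores = [0, 1, 2, 3]
probs = zero_class_probs(logit_at_zero=-1.0, mean_si=0.5, time_scores=time_scores)
for t, p in zip(time_scores, probs):
    print(f"time score {t}: P(zero class) = {p:.3f}")
```

A positive slope mean raises the logit, and therefore the zero-class probability, at later time scores; a negative one lowers it.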
Jason Bond posted on Thursday, July 13, 2006 - 10:37 am
Bengt and Linda,
When I try to run a zero-inflated Poisson LCGM, I very often encounter problems with convergence. One issue may be the range of the variables (0-365, with a fairly big pile-up at 0: 20-50% of the cases across the 4 waves) and missingness (9-30% of the cases across the 4 waves). Censored and even censored-inflated analyses seem to be a little easier to get to converge. Is the procedure quite sensitive to whether the distributional assumptions of the outcome variables are satisfied? Similar output is obtained for linear instead of quadratic growth. Even assuming only a Poisson response (not zero-inflated) did not seem to want to run, with an error of: THE LOGLIKELIHOOD DECREASED IN THE LAST EM ITERATION. CHANGE YOUR MODEL AND/OR STARTING VALUES.
Basically, the same model as below but without the (i) on the Count statement or the zero-inflated parameters line. Any thoughts you have would be appreciated.
Mplus VERSION 3.01 MUTHEN & MUTHEN 07/12/2006 5:18 PM
TITLE: LCA For Number of AA Meetings;
DATA: FILE IS "I:\MyFiles\Trajectories\AA-Tx-Careers\Rep-Orig-Traj\AA-TX-traj.dat";
VARIABLE: NAMES = id dataset2 age1829 age3049 gender blckhisp white black hisp aacapt1 aacapt2 aacapt3 aacapt4;
USEVARIABLES ARE aacapt1 aacapt2 aacapt3 aacapt4;
Classes = C(4); MISSING ARE ALL (-9); IDvariable = id; Count = aacapt1-aacapt4 (i);
SAVEDATA: FILE = "I:\MyFiles\Trajectories\AA-Tx-Careers\Rep-Orig-Traj\output.out"; SAVE = CPROBABILITIES;
ANALYSIS: TYPE = Mixture Missing; STARTS = 10 2;
OUTPUT: TECH1 TECH8;
PLOT: Type is PLOT3; Series = aacapt1 (0) aacapt2 (1) aacapt3 (3) aacapt4 (5);
Number of dependent variables 4
Number of independent variables 0
Number of continuous latent variables 6
Number of categorical latent variables 1
Observed dependent variables
Count AACAPT1 AACAPT2 AACAPT3 AACAPT4
Continuous latent variables I S Q II SI QI
Categorical latent variables C
Variables with special functions
ID variable ID
Estimator MLR
Information matrix OBSERVED
Optimization Specifications for the Quasi-Newton Algorithm for Continuous Outcomes
  Maximum number of iterations 1000
  Convergence criterion 0.100D-05
Optimization Specifications for the EM Algorithm
  Maximum number of iterations 500
  Convergence criteria
    Loglikelihood change 0.100D-06
    Relative loglikelihood change 0.100D-06
    Derivative 0.100D-05
Optimization Specifications for the M step of the EM Algorithm for Categorical Latent variables
  Number of M step iterations 1
  M step convergence criterion 0.100D-05
  Basis for M step termination ITERATION
Optimization Specifications for the M step of the EM Algorithm for Censored, Binary or Ordered Categorical (Ordinal), Unordered Categorical (Nominal) and Count Outcomes
  Number of M step iterations 1
  M step convergence criterion 0.100D-05
  Basis for M step termination ITERATION
  Maximum value for logit thresholds 15
  Minimum value for logit thresholds -15
  Minimum expected cell size for chi-square 0.100D-01
Maximum number of iterations for H1 2000
Convergence criterion for H1 0.100D-03
Optimization algorithm EMA
Random Starts Specifications
  Number of initial stage starts 10
  Number of final stage starts 2
  Number of initial stage iterations 10
  Initial stage convergence criterion 0.100D+01
  Random starts scale 0.500D+01
  Random seed for generating random starts 0
Input data file(s) I:\MyFiles\Trajectories\AA-Tx-Careers\Rep-Orig-Traj\AA-TX-traj.dat Input data format FREE
SUMMARY OF DATA
Number of patterns 0 Number of y patterns 0 Number of u patterns 0
COVARIANCE COVERAGE OF DATA
Minimum covariance coverage value 0.100
RANDOM STARTS RESULTS RANKED FROM THE BEST TO THE WORST LOGLIKELIHOOD VALUES
Initial stage loglikelihood values, seeds, and initial stage start numbers:
6 perturbed starting value run(s) did not converge.
Loglikelihood values at local maxima, seeds, and initial stage start numbers:
-51887.783 462953 7 -51887.783 127215 9
THE STANDARD ERRORS OF THE MODEL PARAMETER ESTIMATES MAY NOT BE TRUSTWORTHY FOR SOME PARAMETERS DUE TO A NON-POSITIVE DEFINITE FIRST-ORDER DERIVATIVE PRODUCT MATRIX. THIS MAY BE DUE TO THE STARTING VALUES BUT MAY ALSO BE AN INDICATION OF MODEL NONIDENTIFICATION. THE CONDITION NUMBER IS -0.178D-16. PROBLEM INVOLVING PARAMETER 10.
ONE OR MORE MULTINOMIAL LOGIT PARAMETERS WERE FIXED TO AVOID SINGULARITY OF THE INFORMATION MATRIX. THE SINGULARITY IS MOST LIKELY BECAUSE THE MODEL IS NOT IDENTIFIED, OR BECAUSE OF EMPTY CELLS IN THE JOINT DISTRIBUTION OF THE CATEGORICAL LATENT VARIABLES AND ANY INDEPENDENT VARIABLES. THE FOLLOWING PARAMETERS WERE FIXED: 2 3
THE STANDARD ERRORS OF THE MODEL PARAMETER ESTIMATES COULD NOT BE COMPUTED. THIS IS OFTEN DUE TO THE STARTING VALUES BUT MAY ALSO BE AN INDICATION OF MODEL NONIDENTIFICATION. CHANGE YOUR MODEL AND/OR STARTING VALUES. PROBLEM INVOLVING PARAMETER 5.
FINAL CLASS COUNTS AND PROPORTIONS FOR THE LATENT CLASSES BASED ON THE ESTIMATED MODEL
There have been changes to the Poisson algorithm since Version 3.01. You should upgrade to the most recent version of Mplus. I think your problems may be solved by this.
Elia Femia posted on Thursday, August 31, 2006 - 3:36 pm
Referring to the post from March 8 2005, I'd like to clarify the meaning of a significant estimated si mean. In this part of the model, are we predicting the probability of being in the zero class? If I have group membership as a covariate (0=control and 1=treatment), and the si coefficient is negative and significant, is that interpreted as the control group having a higher probability of being in the zero class over time? And what if, in addition, the qi coefficient is also significant (and positive)?
That post referred to zero-inflated Poisson modeling. So this is a 2-class model in line with the Roeder et al (1999) JASA article. One class of people follow the regular Poisson where the dependent variable is the log rate for the counts. The other class of people can only have zero counts and here the dependent variable is the probability of being in this zero class. See ex 6.7 in the User's Guide. "si" in that notation refers to the inflation part and is the slope in the growth model for changes over time in individual probabilities of being in the zero class. A negative significant si slope mean implies that the probability goes down over time. Regressing si on a covariate, you have 2 parameters: the intercept and the slope in this regression. If you are saying that this latter slope is negative, then yes your interpretation is correct. I won't answer the qi part because it is not clear if you refer to the mean of qi or the regression of qi on the covariate - in any case, it is always difficult in any growth model to single out effects on linear and quadratic slopes.
Regarding above message, with ZIP in a growth model over several time points: does ZIP assume that a portion of the sample will be zero at every single time point? OR: at each time point, there are more individuals at zero than expected for a regular poisson, but not necessarily the same individuals at zero at every time point?
My data reflects the latter situation. Trying to fit a ZIP model results in the mean of the growth inflation term to be zero at each time point. Does this suggest Poisson (without inflation) would be a better fit?
On your first paragraph, ZIP growth modeling lets people move in and out of zero across the time points, so it is not necessarily the same individuals at zero at every time point. ZIP mixture modeling can additionally be used to specify a zero class throughout if you want that.
Inflation estimates close to -15 suggest that inflation is not needed. I don't know why you would get (exactly) zero values at each time point unless you inadvertently specify that.
With a ZIP latent growth model, I am somewhat confused about the use of random effects versus fixed effects. I am fitting a ZIP LGM with three time points. The slope parameter for the binary part is negligible, so I dropped it such that the binary part is a means model, not a growth model.
When I freely estimate variances for both count and binary intercepts, I get a singularity of the information matrix (possible underidentification?). When I constrain the variance for the binary intercept part to zero (making it a fixed effect), the estimated inflation probability across waves is 9.4%. When I constrain the variance for the count part to zero (fixed effect), the estimated inflation probability jumps to 48.3%. Reading over Hall's (2000) article on ZIP and ZIB regression with random effects, I am inclined to allow the random intercept effect for the count part of the model.
The AIC favors the count random intercept model over the binary random intercept model. But what can I make of the vast differences in inflation probabilities? And, do I really need the inflation component? The non-inflated model AIC is basically comparable, but there is a preponderance (~50% at each wave) of zeros in the data, suggesting the utility of zero-inflation.
1. Count model without inflation and random intercept and slope growth factors - see BIC.
2. Count inflated model with no growth model for the inflated part of the variable and random intercept and slope factors for the continuous part of the variable -- is BIC better or worse? If worse, work with the count model without inflation and adjust steps 3 and 4.
3. Count inflated model with a growth model also for the inflated part but fixed effects for both the intercept and slope growth factors for the inflated part and random intercept and slope growth factors for the continuous part of the variable -- how does BIC compare to 1 and 2?
4. Count inflated model with a growth model for the inflated part and random effects for both the intercept and slope growth factors for both growth models -- I think this is where you get singularity -- what exactly does the message say?
Regarding the probabilities, if you are computing these for a model with random effects, I think they will be incorrect because they cannot be computed with numerical integration.
Thanks for your useful reply. I fit the model without inflation (Model 1 in your reply) and an inflated model without a growth model for the inflated piece (Model 2). The BIC for the non-inflated model is 1105, whereas the BIC for the inflated model without growth for the inflation part is 1115. So, it looks like the inflated model is not improving things (residuals look similar for Model 1 and Model 2). The thresholds for Model 2 are -15, -2.0, and -3.7 for the three waves, yielding low inflation probabilities (.00, .12, and .02, respectively). To clarify, when you mentioned that inflation probabilities will be incorrect for random effects models (because they cannot be computed by numerical integration), does that apply to a model with random effects in the count and/or inflation parts, or just to random effects in the inflation growth parameters?
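The inflation probabilities quoted in this post can be checked by hand. The sketch below treats each reported value as the logit of the inflation probability and applies the inverse logit; the helper name `logistic` is introduced here only for illustration.

```python
import math

def logistic(x):
    """Inverse logit: map a logit to a probability."""
    return 1.0 / (1.0 + math.exp(-x))

# Values reported for the three waves of Model 2 in the post above.
reported = [-15.0, -2.0, -3.7]
probs = [logistic(v) for v in reported]
print([round(p, 2) for p in probs])  # prints [0.0, 0.12, 0.02]
```

The rounded results match the .00, .12 and .02 given in the post, and they show why a value at the -15 bound amounts to no inflation at that wave.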
Although Models 3 and 4 are probably contraindicated because the BIC favors the non-inflated model, I fit them to learn more about thinking through this problem. Model 3 (fixed effects inflation growth model) yields a BIC of 1113.5 (slightly better than Model 2) and finds inflation probabilities of .05, .06, and .09 across the waves.
For Model 4, I used numerical integration (15 points) and again received the singularity of information matrix message, which was:
ONE OR MORE PARAMETERS WERE FIXED TO AVOID SINGULARITY OF THE INFORMATION MATRIX. THE SINGULARITY IS MOST LIKELY DUE TO THE MODEL IS NOT IDENTIFIED, OR DUE TO A LARGE OR A SMALL PARAMETER ON THE LOGIT SCALE. THE FOLLOWING PARAMETERS WERE FIXED: 1 4 8 9 10 11 12 13 14
Parameter 1 is the inflation intercept (mean) from the nu matrix. Parameter 4 is the slope for the inflation part from the alpha matrix. Parameters 8-14 are the covariances of the inflation intercept and slope parameters with all other growth parameters (Psi matrix), as well as the variances of the inflation intercept and slope.
The output for Model 4 looks severely out of whack (e.g., estimate for inflation slope parameter is 193!). BIC for Model 4 is 1144. I guess model 4 was not meant to be given the best BIC in Model 1. To conclude, it appears that the non-inflated model works the best. Is that accurate, and is there any more to the story that I should be thinking about? Thanks again.
The numerical integration is needed only for the probabilities related to the inflation growth parameters.
Although I don't think you are interested in Model 4, it looks like you have hit an odd solution. You can use the STARTS option in the ANALYSIS command, for example, STARTS = 100 10; This may help.
Note that posts should not exceed one window.
J.W. posted on Wednesday, March 05, 2008 - 8:52 am
I am testing LGM for a count outcome using a zero-inflated Poisson model:
1) I ran the Mplus example program ex6.7.inp. The mean of Ii was 0. Then, I freed the parameter Ii, but the estimated mean of Ii was still 0 (Mplus fixed it to zero to avoid singularity of the information matrix according to the message in Mplus output).
2) I noticed that the data set (i.e., ex6.7.dat) used in Mplus example program ex6.7.inp was generated from Monte Carlo simulation in Mplus example program mcex6.7.inp where Ii was set to 0. Then, I re-set Ii to 0.1 in the program and generated a new data set by re-running the Monte Carlo simulation.
3) I ran the program ex6.7.inp again using the new data set. The estimated mean of Ii was still zero. It seems that Ii was set to 0 by default in Mplus. Is this right?
4) I freed the parameter Ii and re-ran the model. Then, I got the message “...TO AVOID SINGULARITY OF THE...THE FOLLOWING PARAMETERS WERE FIXED: 4”. However, it was not parameter 4 (i.e., the mean of Ii), but its S.E. that was fixed to zero.
Your answers to my questions will be highly appreciated!
The mean of ii is fixed at zero as part of the growth model parameterization for the inflation part of the model. If you want to free the mean of ii, you must also fix the intercepts of the inflation outcome to zero instead of having them held equal.
Linda, thanks a lot! A few more questions:
1) Holding the intercepts of the inflation outcome equal is actually holding the thresholds equal (threshold=-intercept), right?
2) In regard to interpretations of the threshold of the inflation outcome and the mean of Ii:
-- Is the estimate of a threshold (e.g., U14#1=-2.139) the negative value of the log odds of having extra zeros in the sample at a specific time point (e.g., Time=4)?
-- Is the mean of Ii the negative value of the log odds of having extra zeros on average over time?
The model results are: Ii=-0.162 (estimated by freeing [Ii] and fixing [U11#1- U14#1@0]); U11#1=-0.262, U12#1=-0.606, U13#1=-0.801, U14#1=-2.139 (estimated by fixing [Ii@0] and freeing [U11#1- U14#1]).
Your help will be appreciated!
Look at UG ex 3.8 for ZIP regression and its explanation of inflation. In that example the u#1 on x refers to the logistic regression probability of being in the zero class, that is the class that is unable to have positive counts. That class can be seen as the inflation class (extra zeros). Here we estimate a logistic regression intercept, not a threshold. The higher the intercept, the higher the probability. And the higher the logodds for being in the zero class vs the other class. With growth modeling, this translates to an outcome at a certain time point. The mean of Ii is on the same scale as the intercept because Ii takes the role of x in regression (now regression of the count outcome on Ii). - So higher values give higher prob of zero class.
Dear Drs. Muthen, I am testing a three-wave SEM model. My independent, wave 1, variable (IV) is a four-group categorical variable (the categories are qualitatively different). It is my understanding that IVs need not be specified as such in the syntax. My model fits very well, and the path from the IV to the DV is significant. I am unsure, however, how to interpret the output concerning the regression of W2_SOCI ON W1_SOCGR being Est./S.E = -2.762 (p= 0.006). Does this mean that the greater the group dummy code (1, 2, 3, or 4), the lower the social competence at w2 (w2_soci)? Because that doesn't make any sense with my categorical variable. I've tried the define command (defining three of the four groups), but it doesn't work. Relatedly, with the new version, estimates are provided for unstand. model results, STDYX Stand., STDY stand., and Std. What is the difference between all of these? And finally, how can the indicator of a factor be significant, but variance explained in the indicator not? Thank you.
If you have a nominal independent variable, you need to create a set of dummy variables. I'm not sure why this does not work. Please send your input, data, output, and license number to firstname.lastname@example.org for help with this.
Please see the STANDARDIZED option in the user's guide for a description of the various standardized estimates.
The reason the two tests might be different is that one tests if the size of the factor loading is different from zero. The other tests whether the variance explained in the dependent variable is different from zero. The latter is a function of more than the factor loading parameter.
Dear Dr. Muthén, You mentioned in your response on March 09, 2008 that Mplus estimates “a logistic regression intercept, not a threshold” in the logit part of the ZIP model. As I recall, Mplus reports a threshold instead of an intercept for a logit model. So, when does Mplus report an intercept and when a threshold for a logit or probit model? Thanks!
When you make x a ZIP, how do you specify an effect of the inflation part of the model on the DV? When I list x as a count I get just a "bmi on x" parameter, but I am also interested in the effect of the inflation on BMI.
Thank you for the direction. I understand how to specify the inflation part of the ZIP variable when it is a DV but I get an error when I list x#1 on the right side of the ON statement. I am interested in the effect of the inflation on the continuous DV. Is there a way to do this?
I am running a latent growth curve model (6 time points) and then using the intercept and slope parameters to predict the outcome (1 time point) using a ZIP model. I am getting noticeably different estimates when I look at the unstandardized versus standardized output. For example:
Unstandardized
                Estimate      S.E.  Est./S.E.  P-Value
TDRK29C#1 ON
    I             -3.521     6.170     -0.571    0.568
    S1          -104.887   144.716     -0.725    0.469

Standardized (STDYX)
                Estimate      S.E.  Est./S.E.  P-Value
TDRK29C#1 ON
    I             -0.174     0.185     -0.937    0.349
    S1            -0.958     0.202     -4.754    0.000
Is this ok? Should I report unstandardized results? Thanks!
Unstandardized and standardized coefficients will be different. The amount of difference depends on the standard deviations of the variables involved. If you don't have a reason to use standardized, I would use unstandardized.
My colleague and I have been running lcga models using 37 waves of count data. The data have a large percentage of 0s at each wave and when 0s are disregarded, the data are highly skewed, even after top coding at 75. We are interested in comparing the results obtained using proc traj and mplus. We ran 2-class zip models in both programs (in mplus fixing the variances of the count and zero inflated portions of the model to 0). The fit in sas is ok, but the fit for the mplus model is very poor, especially for one of the classes (the posterior probabilities are near .4 and the estimated mean trajectory is much lower than the observed mean trajectory) and the model will only converge using mlf. Moreover, in mplus 90% of the cases were placed in one class while the split was 60%/40% for proc traj. Because it is unlikely that the count portion of the model follows a poisson distribution, we reran the lcga 2-class model in mplus using the zero inflated negative binomial model (zinb). The fit was much improved and the percentage of the sample falling into each class was similar to that obtained from the proc traj zip model. I have read that proc traj and mplus will not give you the exact same results for the zip model, but why are they so different? Maybe you could recommend an article that discusses this issue? Most of what I have read says that sas has more flexibility to account for dormancy and exposure time, but provides little elaboration. Thank you.
Because the models used in TRAJ are restricted special cases of the Mplus models, you get exactly the same results as in TRAJ when you set up the model correctly in Mplus. This is also true for the zip model. You need to send both the TRAJ and Mplus outputs for the zip model where your two runs disagree so we can see where the input error lies. Please include your license number. I don't think that sas has more flexibility as you suggest - perhaps you can point me to such written claims.
AeLy Park posted on Friday, August 13, 2010 - 12:40 pm
I tested a ZIP model with repeated measures first and then put covariates into the model to predict the binary part and the count part. Then I got the following warning messages. So the model loses 816 cases when I add the covariates. In the program, I put <Type = Missing; Integration=5;>
Any more function I need to put into the model? How can I use full information including covariates?
*** WARNING in ANALYSIS command
  Starting with Version 5, TYPE=MISSING is the default for all analyses. To obtain listwise deletion, use LISTWISE=ON in the DATA command.
*** WARNING
  Data set contains cases with missing on x-variables. These cases were not included in the analysis. Number of cases with missing on x-variables: 816
*** WARNING
  Data set contains cases with missing on all variables except x-variables. These cases were not included in the analysis. Number of cases with missing on all variables except x-variables: 7
3 WARNING(S) FOUND IN THE INPUT INSTRUCTIONS
Missing data theory does not apply to covariates. You can mention the variances of the covariates in the MODEL command. They will then be treated as dependent variables and distributional assumptions will be made about them. Or you can use multiple imputation to create imputed data sets. I think both approaches are just about the same.
I am running a ZIP growth model with KNOWNCLASS option in order to look at the estimated means as well as the estimated probability for the subgroups that I am interested in.
However, I don't receive a separate estimated probability for each subgroup. Therefore, I am wondering whether there is a specific save command or an alternative approach that I can use to obtain the estimated probability for the binary outcome for each subgroup?
No, but you can compute the estimated probability of being at zero or not from the estimated parameter values. This is done in line with a regular binary logit growth model for being in the zero class (see UG ex for ZIP).
I don't quite understand. Are you suggesting that I could calculate the estimated probability for time 1 by using the estimated mean, which equals 1.191, and the inflation probability of .755 at time 1?
If so, can you point out the formula for calculating Pr(Yit=0)?
I think that Pr(Yit=0)=Pr(zero-class)+Pr(yit=0|non-zero class)*Pr(non_zero class)
So, I can use .755+Pr(Yit=0|non-zero class)*(1-0.755)
I don't know the specific equation for calculating Pr(Yit=0| non-zero class). Is it e^(-1.191)*1.191^0?
You asked for "the estimated probability for binary outcomes for each subgroup", so I interpreted that to mean that you wanted the probability of being in the zero class for each subgroup. That is the probability of inflation. See our Topic 2 handout. Your P(Yit=0) is correct, but that is another matter.
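The formula confirmed in this reply can be evaluated directly. The sketch below assumes that the 1.191 quoted earlier is the Poisson mean of the non-zero class (the poster calls it only "the estimated mean", so this is an assumption) and that .755 is the inflation probability at that time point. The function name `zip_prob_zero` is made up for this illustration.

```python
import math

def zip_prob_zero(p_zero_class, poisson_mean):
    """P(Y = 0) under a zero-inflated Poisson model.

    Zeros arise either from the zero class or as Poisson zeros
    among those who are not in the zero class.
    """
    p_poisson_zero = math.exp(-poisson_mean)  # e^(-lambda) * lambda^0 / 0!
    return p_zero_class + (1.0 - p_zero_class) * p_poisson_zero

p = zip_prob_zero(p_zero_class=0.755, poisson_mean=1.191)
print(f"P(Y = 0) = {p:.4f}")
```

With these inputs the Poisson part contributes only about .07 on top of the .755 from the zero class.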
I am so sorry for the confusion. I meant to graph the trajectories of the binary outcome (being at 0) for 5 KNOWNCLASS groups across 20 time points. However, I did not know how to request that Mplus output this for me. I tried adding the "residual" command and also requested plot3. However, I only get trajectories of estimated means for each KNOWNCLASS.
Previously, I interpreted your advice to me was to compute the trajectories of the binary outcome for 5 KNOWNCLASS by a) using the estimated means across time points for each KNOWNCLASS, and b)the overall probability of inflation.
I wonder now a) is it one of the correct procedures get trajectories of binary outcome (with and without co-variates), and b) is there a better way to do this?
If you want to graph the estimated trajectory for the binary part of the ZIP model you can look at page 682 of the V6 UG where the bar (|) function is used in the SERIES option to plot two growth curves, in your case the binary and the count curves.
You get these curves for each KNOWNCLASS.
Melanie Wall posted on Wednesday, November 09, 2011 - 12:36 pm
We are trying to use Model Constraint commands to directly estimate the expected values from a LCGM Zero-Inflated Poisson, but we are not getting the same values as those output by the SERIES option in the Plot command.
Snippet of code...
model:
%overall%
i s q | dsmdep12@0 dsmdep13@.1 dsmdep14@.2 dsmdep15@.3;
ii si qi | dsmdep12#1@0 dsmdep13#1@.1 dsmdep14#1@.2 dsmdep15#1@.3;
s-q@0;
ii-qi@0;
%c#1%
[i*-0.4](a1)
[s*-0.3](a2)
[q*1.1](a3)
[si*0.5](a5);
[qi*-0.7](a6);
[dsmdep12#1-dsmdep23#1*2] (a4);
%c#2%
MODEL CONSTRAINT:
NEW (point112 point113);
point112 = (1/(1+exp(a4)))*exp(a1);
point113 = (1/(1+exp(a4+a5*.1+a6*.01)))*exp(a1+a2*.1+a3*.01);
Looks like your Model Constraint statements are correct. (In the Model command I don't think you mean dsmdep23#1*2, but dsmdep15#1*2.)
Please send input and data to Support so we can investigate the discrepancy.
Melanie Wall posted on Wednesday, November 09, 2011 - 2:04 pm
Actually thanks to our diligent colleague, Mei-Chen Hu, we figured out our constraints were wrong. Because we are allowing the intercept of the Poisson part to be random, we need to also include the variability of that intercept when calculating the mean back on the original scale. So below, we label the variance of the intercept as "av1" and then put it into the model constraints...
I missed that you had specified the intercept growth factor i as random. This means that numerical integration over i is done resulting in the values given in the RESIDUAL output which are then plotted.
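The corrected constraint is not shown in the thread, so the following is only a sketch of the kind of adjustment described. It assumes a normally distributed random intercept i with mean a1 and variance av1, for which the mean of exp(i) is exp(a1 + av1/2) rather than exp(a1). The values of a1 and a4 are the starting values from the posted input, the value of av1 is hypothetical, and `expected_count` is a name invented here.

```python
import math

def expected_count(a1, a4, av1=0.0):
    """Expected count at time score 0 for a ZIP growth model.

    a1  : mean of the count intercept growth factor (log rate scale)
    a4  : inflation logit at time score 0
    av1 : variance of the count intercept growth factor (0 = fixed effect)
    """
    p_not_zero_class = 1.0 / (1.0 + math.exp(a4))
    # Lognormal mean: E[exp(i)] = exp(mean + variance / 2) for normal i.
    return p_not_zero_class * math.exp(a1 + av1 / 2.0)

naive = expected_count(a1=-0.4, a4=2.0)              # ignores the variance
adjusted = expected_count(a1=-0.4, a4=2.0, av1=0.5)  # hypothetical variance
print(f"naive = {naive:.4f}, adjusted = {adjusted:.4f}")
```

Ignoring the variance understates the expected count by the factor exp(av1/2), which may explain a gap of the kind described between hand-computed values and the plotted ones.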
Thank you for your response, and sorry for not being clear. I followed UG 6.7, so I assume it's a zero-inflated growth model.
I have an added question. If I want to compute the estimated probabilities of p(y=0) at each time point from estimated parameters, what's the correct formula? I tried exp(I+S*time)/(1+exp(I+S*time)), but it doesn't seem correct. I think the intercept parameter (not the growth factor I) comes into play, but I haven't figured out how.
I am analyzing substance use data (count) with p(y=0) varying from .5 to .25 over time. Do you think a zero-inflated Poisson growth model is appropriate, or would you recommend another type of model, such as a two-part growth model?
UG 6.7 is a zero-inflated model, so everyone contributes to every part of its estimation. For two-part models only the ones with non-zero response contribute to the continuous part.
It is difficult for you to compute the estimated probabilities because the growth factors are random variables. This means that you can't just insert their means in the formula but have to integrate over their distributions, which Mplus does using numerical integration. I think we print the estimated probabilities.
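To illustrate why plugging in the means is not enough, the sketch below averages the inverse logit over a normally distributed random intercept on a plain equally spaced grid. This is not the integration scheme Mplus uses; the mean of -2.0, the variance of 4.0 and the function names are all hypothetical.

```python
import math

def logistic(x):
    """Inverse logit."""
    return 1.0 / (1.0 + math.exp(-x))

def marginal_prob(mean, variance, n_points=2001, width=8.0):
    """Average logistic(eta) over eta ~ Normal(mean, variance).

    Uses an equally spaced grid spanning `width` standard deviations on
    each side of the mean, with normal-density weights that are normalized
    to sum to one.
    """
    if variance == 0.0:
        return logistic(mean)
    sd = math.sqrt(variance)
    lo = mean - width * sd
    step = 2.0 * width * sd / (n_points - 1)
    total = 0.0
    weight_sum = 0.0
    for k in range(n_points):
        x = lo + k * step
        w = math.exp(-0.5 * ((x - mean) / sd) ** 2)
        total += w * logistic(x)
        weight_sum += w
    return total / weight_sum

plug_in = logistic(-2.0)              # probability at the mean of the growth factor
integrated = marginal_prob(-2.0, 4.0) # probability averaged over its distribution
print(f"plug-in = {plug_in:.3f}, integrated = {integrated:.3f}")
```

With a sizable variance the averaged probability is pulled toward .5 and can differ noticeably from the plug-in value.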
There isn't a clear choice. If you want to view this as there being 2 types of people who answer zero I would use zero-inflated modeling: Those who do not participate in the activity and those who do but didn't in the time period studied.
Laura posted on Wednesday, March 19, 2014 - 12:15 pm
I have a question on LCGA with a count variable that is very skewed: about 80% have zero values at each time point. I have compared the results of models with zip, zinb and negative binomial distributions. As for the BIC values, it seems that "zinb" fits the data best (in 2 to 5 latent class models) and "zip" is the second best alternative, although the differences in BIC values are quite small. The form of the trajectories is quite different in these models (with zip and zinb). However, both of them make sense substantively. Is it possible in this case to choose the model (zip or zinb) based on BIC values?
It's hard to make a statistical choice when BIC values are close. You can also look at TECH10 and count the number of significant bivariates; see the below Muthen-Asparouhov chapter where we use TECH10 information for crime curve model fit:
Muthén, B. & Asparouhov, T. (2009). Growth mixture modeling: Analysis with non-Gaussian random effects. In Fitzmaurice, G., Davidian, M., Verbeke, G. & Molenberghs, G. (eds.), Longitudinal Data Analysis, pp. 143-165. Boca Raton: Chapman & Hall/CRC Press.
You say LCGA, which then raises the possibility of generalizing to a GMM.
Laura posted on Tuesday, August 19, 2014 - 7:33 am
Thanks for your reply! Related to the previous post, I would still like to ask about the choice of the distribution. Should the choice be based more on substantial interpretation or on statistics, e.g. BIC values? With negative binomial and zero-inflated negative binomial I get quite similar solutions that are also substantially interesting. With ZIP, the trajectories are also clear, but one distinct group identified by ZINB and NB is missing. The BIC values are, in general, the best with ZINB, but the models do not converge very easily and the best log likelihood value is not always replicated. The BIC values are the second best with ZIP, but the model does not identify the distinct trajectory that was identified with ZINB and NB. With NB the interpretation is good (and the average posterior probabilities are the highest) but the BIC values are the worst. Overall the differences between BIC values are quite small. What do you think is more important in this kind of situation; statistical criteria (BIC, converging) or interpretation?
These are difficult choices. Being able to replicate the best logL many times is important in order to trust the solution. If BIC values are close, I would rely on interpretability and usefulness of the model - for instance by relating the classes to antecedents and consequences.
But you say LCGA - why not GMM? Our Topic 6 handout on our website, slides 127-137 discusses the choices and in particular slide 130 compares GMM and LCGA, with GMM doing better.
I am running a 5 wave LCGA on skewed count data (roughly 50-60% zeroes at any time point). Additionally, in these models I need to identify a class of people who score zero at every wave.
1. I understand from above that if I want a zero class with the same people at each time point I need to use a fixed zero class as the zero inflation allows people to move in and out at any given wave?
2. Does using a fixed zero class negate the need for zero inflation, or does this depend on model fit?
3. I have variances much larger than my means, does this indicate that negative binomial models would be better suited than poisson models?
4. I am trying to compare poisson, zip, negative binomial and zinb models all with a fixed zero class - can I do this using BIC/AIC etc? Is a larger BIC in the ZI models a true indication of poorer fit, or just that these models are more complex with more parameters than the non ZI models?
5. Finally, the ZINB and ZIP latent growth (non mixture) models have incredibly large values for inflation growth means and variance (i.e. a slope of -35.00 and variance of 1800.00). The intercept means are 0.00 and the intercept variances are even larger (e.g 41,000). How do you interpret such large values? Is this an indicator of a larger problem?
1. In my experience, BIC typically does not favor a zero class across time. Instead, a solution with a low class (almost zero) comes out as the winner.
We talk about ZIP growth modeling in the video and handout for Topic 6, slides 128-137. For a count outcome U, the inflation is referred to as U#, where u# is a binary latent inflation variable and u#=1 indicates that the individual is unable to assume any value except 0. In the output you see an instance of U# = -15 which means that there is no inflation (prob of being in the zero class for this outcome is zero). Conversely, if you want to force a zero class you use +15 and say [u#@15]; and then also fix any growth factor parameters at zero.
2. If you specify a fixed zero class you are using an inflation model.
3. Variance larger than mean typically calls for an inflation model. This doesn't mean that negbin fits better than ZIP.
4. BIC tends to make good choices. See also Topic 2 for regression examples using BIC to choose among a multitude of models variations.
5. Note that the Topic 6 slides 128-137 don't use a growth model for the inflation part, but simply use an intercept/mean parameter for it. Use that model first.
Joe posted on Thursday, February 04, 2016 - 3:26 pm
In a ZIP growth model (Ex. 6.7) for the inflation part, if the intercept of the outcome variable (e.g., u11#1) is -1.37, can I interpret this parameter as the probability of 0.25 (e^-1.37) of being unable to assume any value except zero for each time point?
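The earlier replies in this thread describe the inflation intercept as a logit, so the arithmetic in this question can be checked as follows. Exponentiating a logit gives the odds, not the probability; the sketch shows both quantities for the value quoted, using variable names chosen only for illustration.

```python
import math

logit = -1.37  # inflation intercept quoted in the question

odds = math.exp(logit)     # odds of being in the zero class
prob = odds / (1.0 + odds) # probability of being in the zero class

print(f"odds = {odds:.3f}, probability = {prob:.3f}")
```

Under that reading, e^-1.37 of roughly 0.25 is the odds, and the corresponding probability is roughly 0.20.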
Almar Kok posted on Monday, December 12, 2016 - 12:37 am
Dear Dr. Muthén,
I am in doubt whether to specify skewed variables in an LCGA as censored or let the model assume their distribution is normal.
On the one hand, given the skewness of the variables it seems plausible to define them as censored. On the other hand, in other discussions on this forum you state the following: "If you expect a latent class (mixture) model underlying your data it is natural for you to see non-normal outcomes; that's what the mixture can explain", and "the skewness is part of what is expected in mixtures and part of what determines the classes."
My questions are: 1. By these statements, do you mean that I should NOT specify skewed variables as censored in an LCGA?
2. I have compared results from a censored model vs a not-censored model, and they are quite different. The types of trajectories are about the same, but the percentages in the latent classes differ substantially. Also, the entropy in the censored models is quite a bit lower than in the non-censored models. Which one should I choose?
I hope you can provide some guidance regarding these questions.
1. Censored is only needed when you have a strong floor or ceiling effect, for example when more than 25% are at the lowest value. This is a more important factor in the choice than the skewness itself.