Message/Author 

Anonymous posted on Tuesday, March 08, 2005  10:45 am



Just a quick question. If I am using c# to capture the zero-inflation, do I have to use the statement ii si | u1#1@0 u2#1@1 u3#1@2 u4#1@3; as found in 8.11? I am confused about how to read ii and si in the output, so do you have any suggestions on where I might go to figure this part out? I really appreciate it.

Anonymous posted on Tuesday, March 08, 2005  10:50 am



I am sorry. I am specifically referring to the means portion of the output. What does it mean when ii and/or si have a significant mean?

bmuthen posted on Tuesday, March 08, 2005  11:47 am



"ut#1" refers to a dichotomous latent variable where the focus is on the probability of being in the class that cannot obtain an observed count other than zero (the "zero class" at time t). The estimated ii and si means are interpreted as in growth modeling of a binary outcome. For instance, the mean of ii is the logit for the probability of being in the zero class at the time point with time score 0, and the mean of si gives the change in that logit over time.

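As a rough numerical illustration of the interpretation above, the sketch below turns a linear growth line on the logit scale into zero-class probabilities at each time score. The values ii_mean = -1.0 and si_mean = -0.5 and the function names are made up for illustration and are not estimates from this thread. The line is evaluated at the growth factor means only, so any growth factor variances are ignored.

```python
import math

def inv_logit(x):
    """Convert a logit to a probability."""
    return 1.0 / (1.0 + math.exp(-x))

def zero_class_probs(ii_mean, si_mean, time_scores):
    """Zero-class probability at each time score for a linear logit growth line."""
    return [inv_logit(ii_mean + si_mean * t) for t in time_scores]

# Hypothetical values, not taken from any output in this thread.
time_scores = [0, 1, 2, 3]
probs = zero_class_probs(ii_mean=-1.0, si_mean=-0.5, time_scores=time_scores)
for t, p in zip(time_scores, probs):
    print(f"time score {t}: P(zero class) = {p:.3f}")
```

With a negative si mean, the printed probabilities fall from one time score to the next, which is the "change in that logit over time" read on the probability scale.
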
Jason Bond posted on Thursday, July 13, 2006  10:37 am



Bengt and Linda, When I try to run a zero-inflated Poisson LCGM, I very often encounter convergence problems. One issue may be the range of the variables (0-365, with a fairly big pile-up at 0 (20-50% of the cases across the 4 waves)) and missingness (9-30% of the cases across the 4 waves). Censored and even censored-inflated analyses seem to be a little easier to get to converge. Is the procedure quite sensitive to whether the distributional assumptions of the outcome variables are satisfied? Similar output is obtained for linear instead of quadratic growth. Even assuming only a Poisson response (not zero-inflated) did not seem to want to run, with an error of: THE LOGLIKELIHOOD DECREASED IN THE LAST EM ITERATION. CHANGE YOUR MODEL AND/OR STARTING VALUES. That was basically the same model as below, but without the (i) on the Count statement or the zero-inflated parameters line. Any thoughts you have would be appreciated. Jason
Mplus VERSION 3.01 MUTHEN & MUTHEN 07/12/2006 5:18 PM
INPUT INSTRUCTIONS
TITLE: LCA For Number of AA Meetings;
DATA: FILE IS "I:\MyFiles\Trajectories\AATxCareers\RepOrigTraj\AATXtraj.dat";
VARIABLE: NAMES = id dataset2 age1829 age3049 gender blckhisp white black hisp aacapt1 aacapt2 aacapt3 aacapt4;
USEVARIABLES ARE aacapt1 aacapt2 aacapt3 aacapt4;
Classes = C(4);
MISSING ARE ALL (9);
IDvariable = id;
Count = aacapt1-aacapt4 (i);
SAVEDATA: FILE = "I:\MyFiles\Trajectories\AATxCareers\RepOrigTraj\output.out";
SAVE = CPROBABILITIES;
ANALYSIS: TYPE = Mixture Missing;
STARTS = 10 2;
OUTPUT: TECH1 TECH8;
PLOT: Type is PLOT3;
Series = aacapt1 (0) aacapt2 (1) aacapt3 (3) aacapt4 (5);
MODEL: %OVERALL%
i s q | aacapt1@0 aacapt2@1 aacapt3@3 aacapt4@5;
ii si qi | aacapt1#1@0 aacapt2#1@1 aacapt3#1@3 aacapt4#1@5;
INPUT READING TERMINATED NORMALLY LCA For Number of AA Meetings; SUMMARY OF ANALYSIS Number of groups 1 Number of observations 389 Number of dependent variables 4 Number of independent variables 0 Number of continuous latent variables 6 Number of 
categorical latent variables 1 Observed dependent variables Count AACAPT1 AACAPT2 AACAPT3 AACAPT4 Continuous latent variables I S Q II SI QI Categorical latent variables C Variables with special functions ID variable ID Estimator MLR Information matrix OBSERVED Optimization Specifications for the QuasiNewton Algorithm for Continuous Outcomes Maximum number of iterations 1000 Convergence criterion 0.100D05 Optimization Specifications for the EM Algorithm Maximum number of iterations 500 Convergence criteria Loglikelihood change 0.100D06 Relative loglikelihood change 0.100D06 Derivative 0.100D05 Optimization Specifications for the M step of the EM Algorithm for Categorical Latent variables Number of M step iterations 1 M step convergence criterion 0.100D05 Basis for M step termination ITERATION Optimization Specifications for the M step of the EM Algorithm for Censored, Binary or Ordered Categorical (Ordinal), Unordered Categorical (Nominal) and Count Outcomes Number of M step iterations 1 M step convergence criterion 0.100D05 Basis for M step termination ITERATION Maximum value for logit thresholds 15 Minimum value for logit thresholds 15 Minimum expected cell size for chisquare 0.100D01 Maximum number of iterations for H1 2000 Convergence criterion for H1 0.100D03 Optimization algorithm EMA Random Starts Specifications Number of initial stage starts 10 Number of final stage starts 2 Number of initial stage iterations 10 Initial stage convergence criterion 0.100D+01 Random starts scale 0.500D+01 Random seed for generating random starts 0 Input data file(s) I:\MyFiles\Trajectories\AATxCareers\RepOrigTraj\AATXtraj.dat Input data format FREE SUMMARY OF DATA Number of patterns 0 Number of y patterns 0 Number of u patterns 0 COVARIANCE COVERAGE OF DATA Minimum covariance coverage value 0.100 RANDOM STARTS RESULTS RANKED FROM THE BEST TO THE WORST LOGLIKELIHOOD VALUES Initial stage loglikelihood values, seeds, and initial stage start numbers: 51887.783 462953 7 51887.783 
127215 9 51887.784 939021 8 51887.787 903420 5 60970.569 unperturbed 0 6 perturbed starting value run(s) did not converge. Loglikelihood values at local maxima, seeds, and initial stage start numbers: 51887.783 462953 7 51887.783 127215 9 THE STANDARD ERRORS OF THE MODEL PARAMETER ESTIMATES MAY NOT BE TRUSTWORTHY FOR SOME PARAMETERS DUE TO A NONPOSITIVE DEFINITE FIRSTORDER DERIVATIVE PRODUCT MATRIX. THIS MAY BE DUE TO THE STARTING VALUES BUT MAY ALSO BE AN INDICATION OF MODEL NONIDENTIFICATION. THE CONDITION NUMBER IS 0.178D16. PROBLEM INVOLVING PARAMETER 10. ONE OR MORE MULTINOMIAL LOGIT PARAMETERS WERE FIXED TO AVOID SINGULARITY OF THE INFORMATION MATRIX. THE SINGULARITY IS MOST LIKELY BECAUSE THE MODEL IS NOT IDENTIFIED, OR BECAUSE OF EMPTY CELLS IN THE JOINT DISTRIBUTION OF THE CATEGORICAL LATENT VARIABLES AND ANY INDEPENDENT VARIABLES. THE FOLLOWING PARAMETERS WERE FIXED: 2 3 THE STANDARD ERRORS OF THE MODEL PARAMETER ESTIMATES COULD NOT BE COMPUTED. THIS IS OFTEN DUE TO THE STARTING VALUES BUT MAY ALSO BE AN INDICATION OF MODEL NONIDENTIFICATION. CHANGE YOUR MODEL AND/OR STARTING VALUES. PROBLEM INVOLVING PARAMETER 5. 
FINAL CLASS COUNTS AND PROPORTIONS FOR THE LATENT CLASSES BASED ON THE ESTIMATED MODEL Latent Classes 1 29.25021 0.07519 2 29.25021 0.07519 3 301.24938 0.77442 4 29.25021 0.07519 FINAL CLASS COUNTS AND PROPORTIONS FOR THE LATENT CLASS PATTERNS BASED ON ESTIMATED POSTERIOR PROBABILITIES Latent Classes 1 29.25021 0.07519 2 29.25021 0.07519 3 301.24938 0.77442 4 29.25021 0.07519 CLASSIFICATION OF INDIVIDUALS BASED ON THEIR MOST LIKELY LATENT CLASS MEMBERSHIP Class Counts and Proportions Latent Classes 1 26 0.06684 2 23 0.05913 3 308 0.79177 4 32 0.08226 Average Latent Class Probabilities for Most Likely Latent Class Membership (Row) by Latent Class (Column) 1 2 3 4 1 0.250 0.250 0.250 0.250 2 0.250 0.250 0.250 0.250 3 0.029 0.029 0.912 0.029 4 0.250 0.250 0.250 0.250 MODEL RESULTS Estimates Latent Class 1 I  AACAPT1 1.000 AACAPT2 1.000 AACAPT3 1.000 AACAPT4 1.000 S  AACAPT1 0.000 AACAPT2 1.000 AACAPT3 3.000 AACAPT4 5.000 Q  AACAPT1 0.000 AACAPT2 1.000 AACAPT3 9.000 AACAPT4 25.000 II  AACAPT1#1 1.000 AACAPT2#1 1.000 AACAPT3#1 1.000 AACAPT4#1 1.000 SI  AACAPT1#1 0.000 AACAPT2#1 1.000 AACAPT3#1 3.000 AACAPT4#1 5.000 QI  AACAPT1#1 0.000 AACAPT2#1 1.000 AACAPT3#1 9.000 AACAPT4#1 25.000 Intercepts AACAPT1#1 1.471 AACAPT1 0.000 AACAPT2#1 1.471 AACAPT2 0.000 AACAPT3#1 1.471 AACAPT3 0.000 AACAPT4#1 1.471 AACAPT4 0.000 Means I 4.735 S 0.355 Q 0.073 II 0.000 SI 0.377 QI 0.011 Latent Class 2 I  AACAPT1 1.000 AACAPT2 1.000 AACAPT3 1.000 AACAPT4 1.000 S  AACAPT1 0.000 AACAPT2 1.000 AACAPT3 3.000 AACAPT4 5.000 Q  AACAPT1 0.000 AACAPT2 1.000 AACAPT3 9.000 AACAPT4 25.000 II  AACAPT1#1 1.000 AACAPT2#1 1.000 AACAPT3#1 1.000 AACAPT4#1 1.000 SI  AACAPT1#1 0.000 AACAPT2#1 1.000 AACAPT3#1 3.000 AACAPT4#1 5.000 QI  AACAPT1#1 0.000 AACAPT2#1 1.000 AACAPT3#1 9.000 AACAPT4#1 25.000 Intercepts AACAPT1#1 1.471 AACAPT1 0.000 AACAPT2#1 1.471 AACAPT2 0.000 AACAPT3#1 1.471 AACAPT3 0.000 AACAPT4#1 1.471 AACAPT4 0.000 Means I 4.735 S 0.355 Q 0.073 II 0.000 SI 0.377 QI 0.011 Latent Class 3 I  AACAPT1 
1.000 AACAPT2 1.000 AACAPT3 1.000 AACAPT4 1.000 S  AACAPT1 0.000 AACAPT2 1.000 AACAPT3 3.000 AACAPT4 5.000 Q  AACAPT1 0.000 AACAPT2 1.000 AACAPT3 9.000 AACAPT4 25.000 II  AACAPT1#1 1.000 AACAPT2#1 1.000 AACAPT3#1 1.000 AACAPT4#1 1.000 SI  AACAPT1#1 0.000 AACAPT2#1 1.000 AACAPT3#1 3.000 AACAPT4#1 5.000 QI  AACAPT1#1 0.000 AACAPT2#1 1.000 AACAPT3#1 9.000 AACAPT4#1 25.000 Intercepts AACAPT1#1 1.471 AACAPT1 0.000 AACAPT2#1 1.471 AACAPT2 0.000 AACAPT3#1 1.471 AACAPT3 0.000 AACAPT4#1 1.471 AACAPT4 0.000 Means I 3.524 S 0.490 Q 0.088 II 0.000 SI 0.377 QI 0.011 Latent Class 4 I  AACAPT1 1.000 AACAPT2 1.000 AACAPT3 1.000 AACAPT4 1.000 S  AACAPT1 0.000 AACAPT2 1.000 AACAPT3 3.000 AACAPT4 5.000 Q  AACAPT1 0.000 AACAPT2 1.000 AACAPT3 9.000 AACAPT4 25.000 II  AACAPT1#1 1.000 AACAPT2#1 1.000 AACAPT3#1 1.000 AACAPT4#1 1.000 SI  AACAPT1#1 0.000 AACAPT2#1 1.000 AACAPT3#1 3.000 AACAPT4#1 5.000 QI  AACAPT1#1 0.000 AACAPT2#1 1.000 AACAPT3#1 9.000 AACAPT4#1 25.000 Intercepts AACAPT1#1 1.471 AACAPT1 0.000 AACAPT2#1 1.471 AACAPT2 0.000 AACAPT3#1 1.471 AACAPT3 0.000 AACAPT4#1 1.471 AACAPT4 0.000 Means I 4.735 S 0.355 Q 0.073 II 0.000 SI 0.377 QI 0.011 Categorical Latent Variables Means C#1 0.000 C#2 0.000 C#3 2.332 TECHNICAL 1 OUTPUT PARAMETER SPECIFICATION FOR LATENT CLASS 1 PARAMETER SPECIFICATION FOR LATENT CLASS 2 PARAMETER SPECIFICATION FOR LATENT CLASS 3 PARAMETER SPECIFICATION FOR LATENT CLASS 4 PARAMETER SPECIFICATION FOR LATENT CLASS REGRESSION MODEL PART ALPHA(C) C#1 C#2 C#3 C#4 ________ ________ ________ ________ 1 1 2 3 0 PARAMETER SPECIFICATION FOR THE CENSORED/NOMINAL/COUNT MODEL PART NU(P) FOR LATENT CLASS 1 AACAPT1# AACAPT1 AACAPT2# AACAPT2 AACAPT3# ________ ________ ________ ________ ________ 1 4 0 4 0 4 NU(P) FOR LATENT CLASS 1 AACAPT3 AACAPT4# AACAPT4 ________ ________ ________ 1 0 4 0 LAMBDA(P) FOR LATENT CLASS 1 I S Q II SI ________ ________ ________ ________ ________ AACAPT1# 0 0 0 0 0 AACAPT1 0 0 0 0 0 AACAPT2# 0 0 0 0 0 AACAPT2 0 0 0 0 0 AACAPT3# 0 0 0 0 0 AACAPT3 
0 0 0 0 0 AACAPT4# 0 0 0 0 0 AACAPT4 0 0 0 0 0 LAMBDA(P) FOR LATENT CLASS 1 QI ________ AACAPT1# 0 AACAPT1 0 AACAPT2# 0 AACAPT2 0 AACAPT3# 0 AACAPT3 0 AACAPT4# 0 AACAPT4 0 ALPHA(P) FOR LATENT CLASS 1 I S Q II SI ________ ________ ________ ________ ________ 1 5 6 7 0 8 ALPHA(P) FOR LATENT CLASS 1 QI ________ 1 9 NU(P) FOR LATENT CLASS 2 AACAPT1# AACAPT1 AACAPT2# AACAPT2 AACAPT3# ________ ________ ________ ________ ________ 1 4 0 4 0 4 NU(P) FOR LATENT CLASS 2 AACAPT3 AACAPT4# AACAPT4 ________ ________ ________ 1 0 4 0 LAMBDA(P) FOR LATENT CLASS 2 I S Q II SI ________ ________ ________ ________ ________ AACAPT1# 0 0 0 0 0 AACAPT1 0 0 0 0 0 AACAPT2# 0 0 0 0 0 AACAPT2 0 0 0 0 0 AACAPT3# 0 0 0 0 0 AACAPT3 0 0 0 0 0 AACAPT4# 0 0 0 0 0 AACAPT4 0 0 0 0 0 LAMBDA(P) FOR LATENT CLASS 2 QI ________ AACAPT1# 0 AACAPT1 0 AACAPT2# 0 AACAPT2 0 AACAPT3# 0 AACAPT3 0 AACAPT4# 0 AACAPT4 0 ALPHA(P) FOR LATENT CLASS 2 I S Q II SI ________ ________ ________ ________ ________ 1 10 11 12 0 8 ALPHA(P) FOR LATENT CLASS 2 QI ________ 1 9 NU(P) FOR LATENT CLASS 3 AACAPT1# AACAPT1 AACAPT2# AACAPT2 AACAPT3# ________ ________ ________ ________ ________ 1 4 0 4 0 4 NU(P) FOR LATENT CLASS 3 AACAPT3 AACAPT4# AACAPT4 ________ ________ ________ 1 0 4 0 LAMBDA(P) FOR LATENT CLASS 3 I S Q II SI ________ ________ ________ ________ ________ AACAPT1# 0 0 0 0 0 AACAPT1 0 0 0 0 0 AACAPT2# 0 0 0 0 0 AACAPT2 0 0 0 0 0 AACAPT3# 0 0 0 0 0 AACAPT3 0 0 0 0 0 AACAPT4# 0 0 0 0 0 AACAPT4 0 0 0 0 0 LAMBDA(P) FOR LATENT CLASS 3 QI ________ AACAPT1# 0 AACAPT1 0 AACAPT2# 0 AACAPT2 0 AACAPT3# 0 AACAPT3 0 AACAPT4# 0 AACAPT4 0 ALPHA(P) FOR LATENT CLASS 3 I S Q II SI ________ ________ ________ ________ ________ 1 13 14 15 0 8 ALPHA(P) FOR LATENT CLASS 3 QI ________ 1 9 NU(P) FOR LATENT CLASS 4 AACAPT1# AACAPT1 AACAPT2# AACAPT2 AACAPT3# ________ ________ ________ ________ ________ 1 4 0 4 0 4 NU(P) FOR LATENT CLASS 4 AACAPT3 AACAPT4# AACAPT4 ________ ________ ________ 1 0 4 0 LAMBDA(P) FOR LATENT CLASS 4 I S Q II SI ________ 
________ ________ ________ ________ AACAPT1# 0 0 0 0 0 AACAPT1 0 0 0 0 0 AACAPT2# 0 0 0 0 0 AACAPT2 0 0 0 0 0 AACAPT3# 0 0 0 0 0 AACAPT3 0 0 0 0 0 AACAPT4# 0 0 0 0 0 AACAPT4 0 0 0 0 0 LAMBDA(P) FOR LATENT CLASS 4 QI ________ AACAPT1# 0 AACAPT1 0 AACAPT2# 0 AACAPT2 0 AACAPT3# 0 AACAPT3 0 AACAPT4# 0 AACAPT4 0 ALPHA(P) FOR LATENT CLASS 4 I S Q II SI ________ ________ ________ ________ ________ 1 16 17 18 0 8 ALPHA(P) FOR LATENT CLASS 4 QI ________ 1 9 STARTING VALUES FOR LATENT CLASS 1 STARTING VALUES FOR LATENT CLASS 2 STARTING VALUES FOR LATENT CLASS 3 STARTING VALUES FOR LATENT CLASS 4 STARTING VALUES FOR LATENT CLASS REGRESSION MODEL PART ALPHA(C) C#1 C#2 C#3 C#4 ________ ________ ________ ________ 1 0.000 0.000 0.000 0.000 STARTING VALUES FOR THE CENSORED/NOMINAL/COUNT MODEL PART NU(P) FOR LATENT CLASS 1 AACAPT1# AACAPT1 AACAPT2# AACAPT2 AACAPT3# ________ ________ ________ ________ ________ 1 0.321 0.000 0.321 0.000 0.321 NU(P) FOR LATENT CLASS 1 AACAPT3 AACAPT4# AACAPT4 ________ ________ ________ 1 0.000 0.321 0.000 LAMBDA(P) FOR LATENT CLASS 1 I S Q II SI ________ ________ ________ ________ ________ AACAPT1# 0.000 0.000 0.000 1.000 0.000 AACAPT1 1.000 0.000 0.000 0.000 0.000 AACAPT2# 0.000 0.000 0.000 1.000 1.000 AACAPT2 1.000 1.000 1.000 0.000 0.000 AACAPT3# 0.000 0.000 0.000 1.000 3.000 AACAPT3 1.000 3.000 9.000 0.000 0.000 AACAPT4# 0.000 0.000 0.000 1.000 5.000 AACAPT4 1.000 5.000 25.000 0.000 0.000 LAMBDA(P) FOR LATENT CLASS 1 QI ________ AACAPT1# 0.000 AACAPT1 0.000 AACAPT2# 1.000 AACAPT2 0.000 AACAPT3# 9.000 AACAPT3 0.000 AACAPT4# 25.000 AACAPT4 0.000 ALPHA(P) FOR LATENT CLASS 1 I S Q II SI ________ ________ ________ ________ ________ 1 0.000 0.000 0.000 0.000 0.000 ALPHA(P) FOR LATENT CLASS 1 QI ________ 1 0.000 NU(P) FOR LATENT CLASS 2 AACAPT1# AACAPT1 AACAPT2# AACAPT2 AACAPT3# ________ ________ ________ ________ ________ 1 0.321 0.000 0.321 0.000 0.321 NU(P) FOR LATENT CLASS 2 AACAPT3 AACAPT4# AACAPT4 ________ ________ ________ 1 0.000 0.321 0.000 
LAMBDA(P) FOR LATENT CLASS 2 I S Q II SI ________ ________ ________ ________ ________ AACAPT1# 0.000 0.000 0.000 1.000 0.000 AACAPT1 1.000 0.000 0.000 0.000 0.000 AACAPT2# 0.000 0.000 0.000 1.000 1.000 AACAPT2 1.000 1.000 1.000 0.000 0.000 AACAPT3# 0.000 0.000 0.000 1.000 3.000 AACAPT3 1.000 3.000 9.000 0.000 0.000 AACAPT4# 0.000 0.000 0.000 1.000 5.000 AACAPT4 1.000 5.000 25.000 0.000 0.000 LAMBDA(P) FOR LATENT CLASS 2 QI ________ AACAPT1# 0.000 AACAPT1 0.000 AACAPT2# 1.000 AACAPT2 0.000 AACAPT3# 9.000 AACAPT3 0.000 AACAPT4# 25.000 AACAPT4 0.000 ALPHA(P) FOR LATENT CLASS 2 I S Q II SI ________ ________ ________ ________ ________ 1 0.000 0.000 0.000 0.000 0.000 ALPHA(P) FOR LATENT CLASS 2 QI ________ 1 0.000 NU(P) FOR LATENT CLASS 3 AACAPT1# AACAPT1 AACAPT2# AACAPT2 AACAPT3# ________ ________ ________ ________ ________ 1 0.321 0.000 0.321 0.000 0.321 NU(P) FOR LATENT CLASS 3 AACAPT3 AACAPT4# AACAPT4 ________ ________ ________ 1 0.000 0.321 0.000 LAMBDA(P) FOR LATENT CLASS 3 I S Q II SI ________ ________ ________ ________ ________ AACAPT1# 0.000 0.000 0.000 1.000 0.000 AACAPT1 1.000 0.000 0.000 0.000 0.000 AACAPT2# 0.000 0.000 0.000 1.000 1.000 AACAPT2 1.000 1.000 1.000 0.000 0.000 AACAPT3# 0.000 0.000 0.000 1.000 3.000 AACAPT3 1.000 3.000 9.000 0.000 0.000 AACAPT4# 0.000 0.000 0.000 1.000 5.000 AACAPT4 1.000 5.000 25.000 0.000 0.000 LAMBDA(P) FOR LATENT CLASS 3 QI ________ AACAPT1# 0.000 AACAPT1 0.000 AACAPT2# 1.000 AACAPT2 0.000 AACAPT3# 9.000 AACAPT3 0.000 AACAPT4# 25.000 AACAPT4 0.000 ALPHA(P) FOR LATENT CLASS 3 I S Q II SI ________ ________ ________ ________ ________ 1 0.000 0.000 0.000 0.000 0.000 ALPHA(P) FOR LATENT CLASS 3 QI ________ 1 0.000 NU(P) FOR LATENT CLASS 4 AACAPT1# AACAPT1 AACAPT2# AACAPT2 AACAPT3# ________ ________ ________ ________ ________ 1 0.321 0.000 0.321 0.000 0.321 NU(P) FOR LATENT CLASS 4 AACAPT3 AACAPT4# AACAPT4 ________ ________ ________ 1 0.000 0.321 0.000 LAMBDA(P) FOR LATENT CLASS 4 I S Q II SI ________ ________ ________ ________ 
________ AACAPT1# 0.000 0.000 0.000 1.000 0.000 AACAPT1 1.000 0.000 0.000 0.000 0.000 AACAPT2# 0.000 0.000 0.000 1.000 1.000 AACAPT2 1.000 1.000 1.000 0.000 0.000 AACAPT3# 0.000 0.000 0.000 1.000 3.000 AACAPT3 1.000 3.000 9.000 0.000 0.000 AACAPT4# 0.000 0.000 0.000 1.000 5.000 AACAPT4 1.000 5.000 25.000 0.000 0.000 LAMBDA(P) FOR LATENT CLASS 4 QI ________ AACAPT1# 0.000 AACAPT1 0.000 AACAPT2# 1.000 AACAPT2 0.000 AACAPT3# 9.000 AACAPT3 0.000 AACAPT4# 25.000 AACAPT4 0.000 ALPHA(P) FOR LATENT CLASS 4 I S Q II SI ________ ________ ________ ________ ________ 1 0.000 0.000 0.000 0.000 0.000 ALPHA(P) FOR LATENT CLASS 4 QI ________ 1 0.000 TECHNICAL 8 OUTPUT INITIAL STAGE ITERATIONS TECHNICAL 8 OUTPUT FOR UNPERTURBED STARTING VALUE SET ITER LOGLIKELIHOOD ABS CHANGE REL CHANGE CLASS COUNTS ALGORITHM 1 0.66904778D+05 0.0000000 0.0000000 97.250 97.250 EM 97.250 97.250 2 0.60970582D+05 5934.1962440 0.0886962 97.248 97.251 EM 97.253 97.248 3 0.60970569D+05 0.0126947 0.0000002 97.145 97.262 EM 97.376 97.216 TECHNICAL 8 OUTPUT FOR STARTING VALUE SET 1 ITER LOGLIKELIHOOD ABS CHANGE REL CHANGE CLASS COUNTS ALGORITHM 1 0.65423857D+05 0.0000000 0.0000000 65.990 149.512 EM 95.524 77.974 2 0.53137657D+05 ************ 0.1877939 32.262 294.741 EM 32.461 29.536 3 0.51867844D+05 1269.8122781 0.0238967 29.250 301.250 EM 29.250 29.250 4 0.51887949D+05 20.1047530 0.0003876 29.250 301.250 EM 29.250 29.250 TECHNICAL 8 OUTPUT FOR STARTING VALUE SET 2 ITER LOGLIKELIHOOD ABS CHANGE REL CHANGE CLASS COUNTS ALGORITHM 1 0.72616222D+05 0.0000000 0.0000000 62.402 58.775 EM 51.383 216.440 2 0.57352476D+05 ************ 0.2101975 30.821 31.049 EM 32.522 294.608 3 0.51783555D+05 5568.9212783 0.0970999 29.250 29.251 EM 29.253 301.245 4 0.51888221D+05 104.6658649 0.0020212 29.250 29.250 EM 29.250 301.249 TECHNICAL 8 OUTPUT FOR STARTING VALUE SET 3 ITER LOGLIKELIHOOD ABS CHANGE REL CHANGE CLASS COUNTS ALGORITHM 1 0.67321388D+05 0.0000000 0.0000000 56.303 59.723 EM 141.281 131.693 2 0.56825460D+05 
************ 0.1559078 30.240 32.082 EM 292.102 34.576 3 0.52827678D+05 3997.7819975 0.0703520 29.422 29.424 EM 300.708 29.446 4 0.52901771D+05 74.0925036 0.0014025 29.404 29.404 EM 300.787 29.404 TECHNICAL 8 OUTPUT FOR STARTING VALUE SET 4 ITER LOGLIKELIHOOD ABS CHANGE REL CHANGE CLASS COUNTS ALGORITHM 1 0.59878285D+05 0.0000000 0.0000000 107.659 42.138 EM 115.763 123.440 2 0.56451827D+05 3426.4580370 0.0572237 124.445 30.052 EM 200.179 34.324 3 0.54143781D+05 2308.0451419 0.0408852 48.404 29.446 EM 281.704 29.446 4 0.53064222D+05 1079.5598127 0.0199388 29.473 29.419 EM 300.689 29.419 5 0.52902861D+05 161.3608945 0.0030409 29.405 29.405 EM 300.785 29.405 6 0.52903814D+05 0.9535982 0.0000180 29.407 29.407 EM 300.779 29.407 TECHNICAL 8 OUTPUT FOR STARTING VALUE SET 5 ITER LOGLIKELIHOOD ABS CHANGE REL CHANGE CLASS COUNTS ALGORITHM 1 0.66884666D+05 0.0000000 0.0000000 71.464 119.866 EM 110.843 86.827 2 0.55244074D+05 ************ 0.1740398 31.676 292.003 EM 33.073 32.248 3 0.52043695D+05 3200.3790473 0.0579316 29.250 301.250 EM 29.250 29.250 4 0.51888655D+05 155.0405820 0.0029790 29.250 301.250 EM 29.250 29.250 5 0.51887787D+05 0.8675706 0.0000167 29.250 301.249 EM 29.250 29.250 TECHNICAL 8 OUTPUT FOR STARTING VALUE SET 6 ITER LOGLIKELIHOOD ABS CHANGE REL CHANGE CLASS COUNTS ALGORITHM 1 0.66852854D+05 0.0000000 0.0000000 100.432 73.852 EM 83.106 131.609 2 0.57556789D+05 9296.0651102 0.1390526 145.265 31.968 EM 35.019 176.747 3 0.55004486D+05 2552.3035206 0.0443441 187.576 30.108 EM 30.108 141.209 4 0.54512109D+05 492.3769669 0.0089516 208.005 29.501 EM 29.501 121.992 5 0.54150738D+05 361.3708978 0.0066292 289.204 29.420 EM 29.420 40.955 6 0.53003087D+05 1147.6512364 0.0211936 300.734 29.422 EM 29.422 29.422 7 0.52901865D+05 101.2214061 0.0019097 300.787 29.404 EM 29.404 29.404 8 0.52903923D+05 2.0572518 0.0000389 300.779 29.407 EM 29.407 29.407 TECHNICAL 8 OUTPUT FOR STARTING VALUE SET 7 ITER LOGLIKELIHOOD ABS CHANGE REL CHANGE CLASS COUNTS ALGORITHM 1 0.84085127D+05 
0.0000000 0.0000000 88.851 60.809 EM 174.555 64.786 2 0.56756134D+05 ************ 0.3250158 78.879 32.274 EM 244.466 33.381 3 0.54042887D+05 2713.2471442 0.0478054 41.906 29.752 EM 287.591 29.752 4 0.52376326D+05 1666.5613406 0.0308378 29.251 29.251 EM 301.246 29.251 5 0.51905521D+05 470.8049273 0.0089889 29.250 29.250 EM 301.249 29.250 6 0.51887802D+05 17.7192435 0.0003414 29.250 29.250 EM 301.249 29.250 7 0.51887783D+05 0.0189753 0.0000004 29.250 29.250 EM 301.249 29.250 TECHNICAL 8 OUTPUT FOR STARTING VALUE SET 8 ITER LOGLIKELIHOOD ABS CHANGE REL CHANGE CLASS COUNTS ALGORITHM 1 0.71135676D+05 0.0000000 0.0000000 112.422 90.486 EM 94.292 91.800 2 0.57277354D+05 ************ 0.1948154 211.785 33.742 EM 33.631 109.841 3 0.53220600D+05 4056.7539099 0.0708265 294.926 29.257 EM 29.257 35.560 4 0.51906613D+05 1313.9871940 0.0246894 301.250 29.250 EM 29.250 29.250 5 0.51887819D+05 18.7942098 0.0003621 301.250 29.250 EM 29.250 29.250 6 0.51887784D+05 0.0343058 0.0000007 301.249 29.250 EM 29.250 29.250 TECHNICAL 8 OUTPUT FOR STARTING VALUE SET 9 ITER LOGLIKELIHOOD ABS CHANGE REL CHANGE CLASS COUNTS ALGORITHM 1 0.78133250D+05 0.0000000 0.0000000 59.251 59.376 EM 77.906 192.468 2 0.56783824D+05 ************ 0.2732438 32.630 32.635 EM 40.482 283.254 3 0.52338884D+05 4444.9408355 0.0782783 29.250 29.250 EM 29.250 301.249 4 0.51905364D+05 433.5195405 0.0082829 29.250 29.250 EM 29.250 301.249 5 0.51887812D+05 17.5517942 0.0003381 29.250 29.250 EM 29.250 301.249 6 0.51887783D+05 0.0294882 0.0000006 29.250 29.250 EM 29.250 301.249 TECHNICAL 8 OUTPUT FOR STARTING VALUE SET 10 ITER LOGLIKELIHOOD ABS CHANGE REL CHANGE CLASS COUNTS ALGORITHM 1 0.76622768D+05 0.0000000 0.0000000 71.071 64.460 EM 123.621 129.848 2 0.98855630D+05 ************ 0.2901600 90.931 104.362 EM 113.457 80.250 FINAL STAGE ITERATIONS TECHNICAL 8 OUTPUT FOR STARTING VALUE SET 7 7 0.51887783D+05 0.0189753 0.0000004 29.250 29.250 EM 301.249 29.250 8 0.51887783D+05 0.0000000 0.0000000 29.250 29.250 EM 301.249 29.250 
TECHNICAL 8 OUTPUT FOR STARTING VALUE SET 9 6 0.51887783D+05 0.0294882 0.0000006 29.250 29.250 EM 29.250 301.249 7 0.51887783D+05 0.0000010 0.0000000 29.250 29.250 EM 29.250 301.249 8 0.51887783D+05 0.0000000 0.0000000 29.250 29.250 EM 29.250 301.249 SAVEDATA INFORMATION Order and format of variables AACAPT1 F10.3 AACAPT2 F10.3 AACAPT3 F10.3 AACAPT4 F10.3 ID F10.3 CPROB1 F10.3 CPROB2 F10.3 CPROB3 F10.3 CPROB4 F10.3 C F10.3 Save file I:\MyFiles\Trajectories\AATxCareers\RepOrigTraj\output.out Save file format 10F10.3 Save file record length 1000 Beginning Time: 17:18:36 Ending Time: 17:18:53 Elapsed Time: 00:00:17 MUTHEN & MUTHEN 3463 Stoner Ave. Los Angeles, CA 90066 Tel: (310) 3919971 Fax: (310) 3918971 Web: www.StatModel.com Support: Support@StatModel.com Copyright (c) 19982004 Muthen & Muthen 


There have been changes to the Poisson algorithm since Version 3.01. You should upgrade to the most recent version of Mplus. I think your problems may be solved by this. 

Elia Femia posted on Thursday, August 31, 2006  3:36 pm



Referring to the post from March 8, 2005, I'd like to clarify the meaning of a significant estimated si mean. In this part of the model, are we predicting the probability of being in the zero class? If I have group membership as a covariate (0=control and 1=treatment), and the si coefficient is negative and significant, is that interpreted as the control group having a higher probability of being in the zero class over time? And what if, in addition, the qi coefficient is also significant (and positive)? Thank you for your help.


That post referred to zero-inflated Poisson modeling. So this is a 2-class model in line with the Roeder et al (1999) JASA article. One class of people follows the regular Poisson, where the dependent variable is the log rate for the counts. The other class of people can only have zero counts, and here the dependent variable is the probability of being in this zero class. See ex 6.7 in the User's Guide. "si" in that notation refers to the inflation part and is the slope in the growth model for changes over time in individual probabilities of being in the zero class. A negative significant si slope mean implies that the probability goes down over time. Regressing si on a covariate, you have 2 parameters: the intercept and the slope in this regression. If you are saying that this latter slope is negative, then yes, your interpretation is correct. I won't answer the qi part because it is not clear whether you refer to the mean of qi or the regression of qi on the covariate. In any case, it is always difficult in any growth model to single out effects on linear and quadratic slopes.
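The two-class structure described above can be sketched numerically. This is a minimal illustration of a zero-inflated Poisson probability function at a single time point, assuming a zero-class probability pi_zero and a Poisson rate for the other class. The function name and the values 0.3 and 2.0 are hypothetical and are not taken from any model in this thread.

```python
import math

def zip_pmf(k, pi_zero, rate):
    """P(Y = k) under a zero-inflated Poisson.

    pi_zero: probability of the class that can only produce zeros.
    rate:    Poisson rate for the class that follows the regular Poisson.
    """
    poisson = math.exp(-rate) * rate ** k / math.factorial(k)
    if k == 0:
        # Zeros come from the zero class and from the Poisson class.
        return pi_zero + (1.0 - pi_zero) * poisson
    return (1.0 - pi_zero) * poisson

# Hypothetical values for illustration only.
p_zero = zip_pmf(0, pi_zero=0.3, rate=2.0)
total = sum(zip_pmf(k, pi_zero=0.3, rate=2.0) for k in range(60))
print(f"P(Y = 0) = {p_zero:.4f}, sum over k = 0..59 = {total:.6f}")
```

The share of zeros exceeds what the Poisson class alone would give, and the size of that excess is what the inflation part of the model describes.
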

B Lee posted on Tuesday, March 27, 2007  8:22 pm



Regarding the above message, with ZIP in a growth model over several time points: does ZIP assume that a portion of the sample will be zero at every single time point? Or is it that, at each time point, there are more individuals at zero than expected for a regular Poisson, but not necessarily the same individuals at zero at every time point? My data reflect the latter situation. Trying to fit a ZIP model results in the mean of the growth inflation term being zero at each time point. Does this suggest that Poisson (without inflation) would be a better fit?


On your first paragraph, ZIP growth modeling lets people move in and out of zero across the time points, so it is not necessarily the same individuals at zero at every time point. ZIP mixture modeling can additionally be used to specify a zero class throughout if you want that. Inflation estimates close to -15 suggest that inflation is not needed. I don't know why you would get (exactly) zero values at each time point unless you inadvertently specify that.
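To see why an inflation estimate near the lower logit bound signals that inflation is not needed, the sketch below converts a few logits to probabilities, reading the value in the reply as a logit of -15. The comparison values -5 and 0 are arbitrary and only there for contrast.

```python
import math

def inv_logit(x):
    """Convert a logit to a probability."""
    return 1.0 / (1.0 + math.exp(-x))

# -15 is the value discussed in the reply; -5 and 0 are arbitrary comparisons.
for logit in (-15.0, -5.0, 0.0):
    print(f"logit {logit:6.1f} -> P(zero class) = {inv_logit(logit):.7f}")
```

At a logit of -15 the zero-class probability is well below one in a million, so the inflated model behaves essentially like the plain count model.
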


With a ZIP latent growth model, I am somewhat confused about the use of random effects versus fixed effects. I am fitting a ZIP LGM with three time points. The slope parameter for the binary part is negligible, so I dropped it, such that the binary part is a means model, not a growth model. When I freely estimate variances for both count and binary intercepts, I get a singularity of the information matrix (possible underidentification?). When I constrain the variance for the binary intercept part to zero (making it a fixed effect), the estimated inflation probability across waves is 9.4%. When I constrain the variance for the count part to zero (fixed effect), the estimated inflation probability jumps to 48.3%. Reading over Hall's (2000) article on ZIP and ZIB regression with random effects, I am inclined to allow the random intercept effect for the count part of the model. The AIC favors the count random intercept model over the binary random intercept model. But what can I make of the vast differences in inflation probabilities? And do I really need the inflation component? The non-inflated model AIC is basically comparable, but there is a preponderance (~50% at each wave) of zeros in the data, suggesting the utility of zero-inflation. Thanks for your help.


I would look at the following models:
1. Count model without inflation and random intercept and slope growth factors. See BIC.
2. Count inflated model with no growth model for the inflated part of the variable and random intercept and slope factors for the continuous part of the variable. Is BIC better or worse? If worse, work with the count model without inflation and adjust steps 3 and 4.
3. Count inflated model with a growth model also for the inflated part but fixed effects for both the intercept and slope growth factors for the inflated part and random intercept and slope growth factors for the continuous part of the variable. How does BIC compare to 1 and 2?
4. Count inflated model with a growth model for the inflated part and random effects for both the intercept and slope growth factors for both growth models. I think this is where you get singularity. What exactly does the message say?
Regarding the probabilities, if you are computing these for a model with random effects, I think they will be incorrect because they cannot be computed with numerical integration.
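The steps above lean on BIC comparisons. As a reminder of what is being compared, here is a minimal sketch of the standard formula, BIC = -2 logL + p ln(n), where lower values are preferred. The loglikelihoods, parameter counts, and sample size below are invented for illustration and do not come from any model in this thread.

```python
import math

def bic(loglik, n_params, n_obs):
    """Bayesian information criterion: -2*logL + p*ln(n). Lower is better."""
    return -2.0 * loglik + n_params * math.log(n_obs)

# Hypothetical fits: the inflated model gains little loglikelihood
# but pays for two extra parameters.
bic_count = bic(loglik=-530.0, n_params=5, n_obs=200)
bic_inflated = bic(loglik=-529.0, n_params=7, n_obs=200)
print(f"count model BIC    = {bic_count:.1f}")
print(f"inflated model BIC = {bic_inflated:.1f}")
print("prefer:", "count" if bic_count < bic_inflated else "inflated")
```

In this made-up case the small loglikelihood gain does not offset the penalty for the extra inflation parameters, so the count model without inflation would be kept.
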


Dr. Muthen, Thanks for your useful reply. I fit the model without inflation (Model 1 in your reply) and an inflated model without a growth model for the inflated piece (Model 2). The BIC for the non-inflated model is 1105, whereas the BIC for the inflated model without growth for the inflation part is 1115. So, it looks like the inflated model is not improving things (residuals look similar for Model 1 and Model 2). The thresholds for Model 2 are -15, -2.0, and -3.7 for the three waves, yielding low inflation probabilities (.00, .12, and .02, respectively). To clarify, when you mentioned that inflation probabilities will be incorrect for random effects models (because they cannot be computed by numerical integration), does that apply to a model with random effects in the count and/or inflation parts, or just random effects in the inflation growth parameters? Although Models 3 and 4 are probably contraindicated because the BIC favors the non-inflated model, I fit them to learn more about thinking through this problem. Model 3 (fixed effects inflation growth model) yields a BIC of 1113.5 (slightly better than Model 2) and finds inflation probabilities of .05, .06, and .09 across the waves. For Model 4, I used numerical integration (15 points) and again received the singularity of information matrix message, which was:


ONE OR MORE PARAMETERS WERE FIXED TO AVOID SINGULARITY OF THE INFORMATION MATRIX. THE SINGULARITY IS MOST LIKELY DUE TO THE MODEL IS NOT IDENTIFIED, OR DUE TO A LARGE OR A SMALL PARAMETER ON THE LOGIT SCALE. THE FOLLOWING PARAMETERS WERE FIXED: 1 4 8 9 10 11 12 13 14
Parameter 1 is the inflation intercept (mean) from the nu matrix. Parameter 4 is the slope for the inflation part from the alpha matrix. Parameters 8-14 are the covariances of the inflation intercept and slope parameters with all other growth parameters (Psi matrix), as well as the variances of the inflation intercept and slope. The output for Model 4 looks severely out of whack (e.g., estimate for inflation slope parameter is 193!). BIC for Model 4 is 1144. I guess Model 4 was not meant to be, given the best BIC in Model 1. To conclude, it appears that the non-inflated model works the best. Is that accurate, and is there any more to the story that I should be thinking about? Thanks again.
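The numbers reported in this post can be checked with a short sketch. It reads the three Model 2 values as logits of -15, -2.0, and -3.7 and converts them to probabilities, then ranks the four reported BIC values. Treating the reported values as logistic intercepts (higher value, higher zero-class probability) is an assumption of this sketch.

```python
import math

def inv_logit(x):
    """Convert a logit to a probability."""
    return 1.0 / (1.0 + math.exp(-x))

# Model 2 values for the three waves, read as logits (assumed sign).
model2_logits = [-15.0, -2.0, -3.7]
model2_probs = [round(inv_logit(v), 2) for v in model2_logits]
print("Model 2 inflation probabilities:", model2_probs)

# BIC values as reported in the post; lower is better.
bic = {"Model 1": 1105.0, "Model 2": 1115.0, "Model 3": 1113.5, "Model 4": 1144.0}
best = min(bic, key=bic.get)
print("Lowest BIC:", best)
```

The converted values match the probabilities quoted for Model 2, and the BIC ranking agrees with the conclusion that the non-inflated model is preferred.
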


The numerical integration is needed only for the probabilities related to the inflation growth parameters. Although I don't think you are interested in Model 4, it looks like you have hit an odd solution. You can use the STARTS option in the ANALYSIS command, for example, STARTS = 100 10; This may help. Note that posts should not exceed one window. 

J.W. posted on Wednesday, March 05, 2008  8:52 am



I am testing an LGM for a count outcome using a zero-inflated Poisson model:
1) I ran the Mplus example program ex6.7.inp. The mean of Ii was 0. Then I freed the parameter Ii, but the estimated mean of Ii was still 0 (Mplus fixed it to zero to avoid singularity of the information matrix, according to the message in the Mplus output).
2) I noticed that the data set (i.e., ex6.7.dat) used in Mplus example program ex6.7.inp was generated from a Monte Carlo simulation in Mplus example program mcex6.7.inp, where Ii was set to 0. Then I reset Ii to 0.1 in the program and generated a new data set by rerunning the Monte Carlo simulation.
3) I ran the program ex6.7.inp again using the new data set. The estimated mean of Ii was still zero. It seems that Ii was set to 0 by default in Mplus. Is this right?
4) I freed the parameter Ii and reran the model. Then I got the message “...TO AVOID SINGULARITY OF THE...THE FOLLOWING PARAMETERS WERE FIXED: 4”. However, it was not parameter 4 (i.e., the mean of Ii) but its S.E. that was fixed to zero.
Your answers to my questions will be highly appreciated!


The mean of ii is fixed at zero as part of the growth model parameterization for the inflation part of the model. If you want to free the mean of ii, you must also fix the intercepts of the inflation outcome to zero instead of having them held equal.

J.W. posted on Friday, March 07, 2008  10:16 am



Linda, thanks a lot! A few more questions: 1) Holding the intercepts of the inflation outcome equal is actually holding the thresholds equal (threshold=intercept), right? 2) In regard to the interpretation of the threshold of the inflation outcome and the mean of Ii: Is the estimate of a threshold (e.g., U14#1=2.139) the negative value of the log-odds of having extra zeros in the sample at a specific time point (e.g., Time=4)? Is the mean of Ii the negative value of the log-odds of having extra zeros on average over time? The model results are: Ii=0.162 (estimated by freeing [Ii] and fixing [U11#1-U14#1@0]); U11#1=0.262, U12#1=0.606, U13#1=0.801, U14#1=2.139 (estimated by fixing [Ii@0] and freeing [U11#1-U14#1]). Your help will be appreciated!


Look at UG ex 3.8 for ZIP regression and its explanation of inflation. In that example, u#1 on x refers to the logistic regression for the probability of being in the zero class, that is, the class that is unable to have positive counts. That class can be seen as the inflation class (extra zeros). Here we estimate a logistic regression intercept, not a threshold. The higher the intercept, the higher the probability, and the higher the log-odds of being in the zero class vs the other class. With growth modeling, this translates to an outcome at a certain time point. The mean of Ii is on the same scale as the intercept because Ii takes the role of x in regression (now regression of the count outcome on Ii). So higher values give a higher probability of the zero class.
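To make the logit scale concrete, here is a minimal sketch of how an inflation intercept and slope translate into a zero-class probability at each time score. It assumes the inflation growth factors are fixed (no random effects), and the numbers are hypothetical, not estimates from any output in this thread. The function names are illustrative only.

```python
import math

def inv_logit(x: float) -> float:
    """Convert a logit (log-odds) into a probability."""
    return 1.0 / (1.0 + math.exp(-x))

def zero_class_trajectory(intercept, slope, time_scores):
    """Zero-class probability at each time score.

    Assumes logit_t = intercept + slope * t with no random effects,
    which is a simplification of a full inflation growth model.
    """
    return [inv_logit(intercept + slope * t) for t in time_scores]

# Hypothetical values chosen for illustration only.
probs = zero_class_trajectory(intercept=-1.0, slope=0.5, time_scores=[0, 1, 2, 3])
print([round(p, 3) for p in probs])
```

A positive slope on the logit scale means the probability of being in the zero class rises over time, and a logit of 0 corresponds to a probability of 0.5.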


Dear Drs. Muthen, I am testing a three-wave SEM model. My independent, wave 1, variable (IV) is a four-group categorical variable (the categories are qualitatively different). It is my understanding that IVs need not be specified as such in the syntax. My model fits very well, and the path from the IV to the DV is significant. I am unsure, however, how to interpret the output concerning the regression of W2_SOCI ON W1_SOCGR being Est./S.E = 2.762 (p= 0.006). Does this mean that the greater the group dummy code (1, 2, 3, or 4), the lower the social competence at w2 (w2_soci)? Because that doesn't make any sense with my categorical variable. I've tried the DEFINE command (defining three of the four groups), but it doesn't work. Relatedly, with the new version, estimates are provided for unstandardized model results, STDYX, STDY, and STD. What is the difference between all of these? And finally, how can the indicator of a factor be significant, but the variance explained in the indicator not? Thank you.


If you have a nominal independent variable, you need to create a set of dummy variables. I'm not sure why this does not work. Please send your input, data, output, and license number to support@statmodel.com for help with this. Please see the STANDARDIZED option in the user's guide for a description of the various standardized estimates. The reason the two tests might be different is that one tests if the size of the factor loading is different from zero. The other tests whether the variance explained in the dependent variable is different from zero. The latter is a function of more than the factor loading parameter. 

J.W. posted on Tuesday, March 11, 2008  2:02 pm



Dear Dr. Muthén, You mentioned in your response on March 09, 2008 that Mplus estimates “a logistic regression intercept, not a threshold” in the logit part of the ZIP model. As I recall, Mplus reports a threshold instead of an intercept for a logit model. So when does Mplus report an intercept, and when a threshold, for a logit or probit model? Thanks!


For probit and logistic, we give thresholds. For multinomial logistic, we give intercepts and also for the inflated part of ZIP. As I am sure you know, the threshold and intercept differ only by sign. 
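The sign relationship can be checked numerically. The sketch below assumes the usual logistic parameterizations, where the probability of the "1" category is logistic(intercept) in intercept form and 1/(1+exp(threshold)) in threshold form. The function names are hypothetical.

```python
import math

def prob_from_intercept(b0: float) -> float:
    """P(u = 1) when the model is written with an intercept b0."""
    return 1.0 / (1.0 + math.exp(-b0))

def prob_from_threshold(tau: float) -> float:
    """P(u = 1) when the model is written with a threshold tau."""
    return 1.0 / (1.0 + math.exp(tau))

# Illustrative value only: an intercept of 0.8 matches a threshold of -0.8.
b0 = 0.8
print(prob_from_intercept(b0), prob_from_threshold(-b0))
```

Both calls return the same probability, so reading a threshold as if it were an intercept flips the direction of the interpretation.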


Thresholds for observed dependent variables, providing for ordered polytomous variables. Intercepts for latent binary and unordered dependent variables. 


I am interested in estimating a growth model with a continuous DV and a TVC that is ZIP. I want to compare the above model to a cross-lagged model. Is it possible to regress a count variable on a continuous outcome? And if so, can Mplus handle cross-lagged models? Thanks in advance!


Yes to both questions. 


Thank you Linda. I've tried to estimate the first model from above (a growth model where the DV is continuous and the TVC is ZIP). If this is a traditional TVC model: i s | bmi2@0 bmi3@1 bmi4@2 bmi5@3 bmi6@4; bmi2 ON x2; bmi3 ON x3; bmi4 ON x4; bmi5 ON x5; bmi6 ON x6; i s ON x1 bmi1; When you make x a ZIP variable, how do you specify an effect of the inflation part of the model on the DV? When I list x as a count I get just a "bmi on x" parameter, but I am also interested in the effect of the inflation on BMI. Thank you!


The COUNT option has various settings. See the Version 5.1 Language Addendum for the full set. You can find it at: http://www.statmodel.com/ugexcerpts.shtml See also Example 3.8. 


Linda, Thank you for the direction. I understand how to specify the inflation part of the ZIP variable when it is a DV but I get an error when I list x#1 on the right side of the ON statement. I am interested in the effect of the inflation on the continuous DV. Is there a way to do this? 


Please send your input, data, output, and license number to support@statmodel.com. 

socrates posted on Sunday, June 07, 2009  8:46 pm



Hi, do you know a reference where I can find the formula to calculate the expected counts based on the growth parameter estimates in a GMM for a count outcome using a zero-inflated Poisson model? Many thanks!


See: Hilbe, J. M. (2007). Negative binomial regression. Cambridge, UK: Cambridge University Press. 


Dear Dr. Muthen, I am running a latent growth curve model (6 time points) and then using the intercept and slope parameters to predict the outcome (1 time point) using a ZIP model. I am getting noticeably different estimates when I look at the unstandardized versus standardized output. For example:

Unstandardized
                 Estimate     S.E.  Est./S.E.  P-Value
TDRK29C#1 ON
  I                 3.521    6.170      0.571    0.568
  S1              104.887  144.716      0.725    0.469

Standardized (STDYX)
TDRK29C#1 ON
  I                 0.174    0.185      0.937    0.349
  S1                0.958    0.202      4.754    0.000

Is this ok? Should I report unstandardized results? Thanks!


Unstandardized and standardized coefficients will be different. The amount of difference depends on the standard deviations of the variables involved. If you don't have a reason to use standardized, I would use unstandardized. 

LAS posted on Tuesday, June 15, 2010  9:57 am



My colleague and I have been running LCGA models using 37 waves of count data. The data have a large percentage of 0s at each wave and, when 0s are disregarded, the data are highly skewed, even after top coding at 75. We are interested in comparing the results obtained using proc traj and Mplus. We ran 2-class ZIP models in both programs (in Mplus, fixing the variances of the count and zero-inflated portions of the model to 0). The fit in SAS is ok, but the fit for the Mplus model is very poor, especially for one of the classes (the posterior probabilities are near .4 and the estimated mean trajectory is much lower than the observed mean trajectory), and the model will only converge using MLF. Moreover, in Mplus 90% of the cases were placed in one class, while the split was 60%/40% for proc traj. Because it is unlikely that the count portion of the model follows a Poisson distribution, we reran the 2-class LCGA model in Mplus using the zero-inflated negative binomial model (ZINB). The fit was much improved and the percentage of the sample falling into each class was similar to that obtained from the proc traj ZIP model. I have read that proc traj and Mplus will not give you exactly the same results for the ZIP model, but why are they so different? Maybe you could recommend an article that discusses this issue? Most of what I have read says that SAS has more flexibility to account for dormancy and exposure time, but provides little elaboration. Thank you.


Because the models used in TRAJ are restricted, special cases of the Mplus models, you get exactly the same results as in TRAJ when you set up the model correctly in Mplus. This is also true for the ZIP model. You need to send both the TRAJ and Mplus outputs for the ZIP model where your two runs disagree so we can see where the input error lies. Please include your license number. I don't think that SAS has more flexibility as you suggest; perhaps you can point me to such written claims.

AeLy Park posted on Friday, August 13, 2010  12:40 pm



I tested a ZIP model with repeated measures first and then put covariates into the model to predict the binary part and the count part. Then I got the following warning messages. So the model loses 816 cases when I add the covariates. In the program, I put <Type = Missing; Integration=5;>. Is there any other option I need to add to the model? How can I use full information including the covariates?

*** WARNING in ANALYSIS command
Starting with Version 5, TYPE=MISSING is the default for all analyses. To obtain listwise deletion, use LISTWISE=ON in the DATA command.
*** WARNING
Data set contains cases with missing on x-variables. These cases were not included in the analysis. Number of cases with missing on x-variables: 816
*** WARNING
Data set contains cases with missing on all variables except x-variables. These cases were not included in the analysis. Number of cases with missing on all variables except x-variables: 7
3 WARNING(S) FOUND IN THE INPUT INSTRUCTIONS


Missing data theory does not apply to covariates. You can mention the variances of the covariates in the MODEL command. They will then be treated as dependent variables and distributional assumptions will be made about them. Or you can use multiple imputation to create imputed data sets. I think both approaches are just about the same. 


Hi Drs. Muthen, I am running a ZIP growth model with the KNOWNCLASS option in order to look at the estimated means as well as the estimated probability for the subgroups that I am interested in. However, I don't receive a differential estimated probability for each subgroup. Therefore, I am wondering whether there is a specific save command or an alternative approach that I can use to obtain the estimated probability for the binary outcomes for each subgroup? Thank you, ChienTi


No, but you can compute the estimated probability of being at zero or not from the estimated parameter values. This is done in line with a regular binary logit growth model for being in the zero class (see UG ex for ZIP). 


Hi Dr. Muthen, I don't quite understand. Did you suggest that I could calculate the estimated probability for time 1 by using the estimated mean, which equals 1.191, and the inflation probability = .755 at time 1? If so, can you point out the formula for calculating Pr(Yit=0)? I think that Pr(Yit=0) = Pr(zero class) + Pr(Yit=0 | non-zero class)*Pr(non-zero class). So I can use .755 + Pr(Yit=0 | non-zero class)*(1-0.755). I don't know the detailed equation for calculating Pr(Yit=0 | non-zero class); is it e^(-1.191)*1.191^0? Thank you so much, ChienTi


You asked for "the estimated probability for binary outcomes for each subgroup", so I interpreted that to mean that you wanted the probability of being in the zero class for each subgroup. That is the probability of inflation. See our Topic 2 handout. Your P(Yit=0) is correct, but that is another matter. 
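The P(Yit=0) decomposition confirmed above can be sketched in a few lines. This is a minimal illustration, assuming the count part is Poisson so that the probability of a zero within the non-zero class is exp(-mu). Treating 1.191 as the Poisson mean on the count scale is an assumption; the post does not say which scale that number is on. The function names are hypothetical.

```python
import math

def zip_prob_zero(pi_zero: float, mu: float) -> float:
    """P(Y = 0) in a zero-inflated Poisson model.

    pi_zero: probability of the zero (inflation) class.
    mu: Poisson mean within the non-zero class (count scale).
    """
    return pi_zero + (1.0 - pi_zero) * math.exp(-mu)

def zip_mean(pi_zero: float, mu: float) -> float:
    """Marginal expected count: only the non-zero class contributes."""
    return (1.0 - pi_zero) * mu

# Values quoted in the question; the scale of 1.191 is an assumption.
p0 = zip_prob_zero(0.755, 1.191)
print(round(p0, 3), round(zip_mean(0.755, 1.191), 3))
```

If the reported mean were instead on the log scale, mu would first need to be exponentiated before being passed in.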


Hello Dr. Muthen, I am so sorry for the confusion. I meant to graph the trajectories of the binary outcome (being at 0) for 5 KNOWNCLASS groups over 20 time points. However, I did not know how to request that Mplus output it for me. I tried the "residual" option and also requested plot3. However, I only get trajectories of estimated means for each KNOWNCLASS. Previously, I interpreted your advice as being to compute the trajectories of the binary outcome for the 5 KNOWNCLASS groups by using a) the estimated means across time points for each KNOWNCLASS, and b) the overall probability of inflation. I wonder now a) is this one of the correct procedures to get trajectories of the binary outcome (with and without covariates), and b) is there a better way to do this? Many thanks, ChienTi


If you want to graph the estimated trajectory for the binary part of the ZIP model you can look at page 682 of the V6 UG where the bar (|) function is used in the SERIES option to plot two growth curves, in your case the binary and the count curves. You get these curves for each KNOWNCLASS.

Melanie Wall posted on Wednesday, November 09, 2011  12:36 pm



We are trying to use Model Constraint commands to directly estimate the expected values from an LCGM zero-inflated Poisson model, but we are not getting the same values as those output by the SERIES option in the PLOT command. Snippet of code...

model:
%overall%
i s q | dsmdep12@0 dsmdep13@.1 dsmdep14@.2 dsmdep15@.3;
ii si qi | dsmdep12#1@0 dsmdep13#1@.1 dsmdep14#1@.2 dsmdep15#1@.3;
s-q@0;
ii-qi@0;
%c#1%
[i*0.4] (a1);
[s*0.3] (a2);
[q*1.1] (a3);
[si*0.5] (a5);
[qi*0.7] (a6);
[dsmdep12#1-dsmdep23#1*2] (a4);
%c#2%
MODEL CONSTRAINT:
NEW (point112 point113);
point112 = (1/(1+exp(a4)))*exp(a1);
point113 = (1/(1+exp(a4+a5*.1+a6*.01)))*exp(a1+a2*.1+a3*.01);

New/Additional Parameters
POINT112 0.071
POINT113 0.067

However, Mplus SERIES gives:
0.00000 0.07990
1.00000 0.07512

Can you help us figure out why the Model Constraint and the SERIES results do not agree?


Looks like your Model Constraint statements are correct. (In the Model command I don't think you mean dsmdep23#1*2, but dsmdep15#1*2.) Please send input and data to Support so we can investigate the discrepancy. 

Melanie Wall posted on Wednesday, November 09, 2011  2:04 pm



Actually, thanks to our diligent colleague, MeiChen Hu, we figured out that our constraints were wrong. Because we are allowing the intercept of the Poisson part to be random, we also need to include the variability of that intercept when calculating the mean back on the original scale. So below, we label the variance of the intercept as "av1" and then put it into the model constraints...

%c#1%
[i*0.4] (a1);
i (av1);
[s*0.3] (a2);
[q*1.1] (a3);
[si*0.5] (a5);
[qi*0.7] (a6);
[dsmdep12#1-dsmdep23#1*2] (a4);
MODEL CONSTRAINT:
point112 = (1/(1+exp(a4)))*exp(a1+av1/2);
point113 = (1/(1+exp(a4+a5*.1+a6*.01)))*exp(a1+av1/2+a2*.1+a3*.01);

Now we get the same estimates as the SERIES command.


I missed that you had specified the intercept growth factor i as random. This means that numerical integration over i is done resulting in the values given in the RESIDUAL output which are then plotted. 


P.S. I guess your explicit formula gives the same as the numerical integration; I haven't checked by completing the square in the exp.
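The agreement between the explicit formula and numerical integration can be checked outside Mplus. The sketch below assumes a normally distributed random intercept with mean a1 and variance av1 on the log scale, and compares exp(a1 + av1/2) with a simple trapezoid-rule integral of exp(i) over that distribution. The labels a1, av1 and a4 follow the posts above; the numeric values and function names are hypothetical.

```python
import math

def lognormal_mean_numeric(mean, var, n=20001, width=10.0):
    """E[exp(i)] for i ~ N(mean, var), by the trapezoid rule."""
    sd = math.sqrt(var)
    lo, hi = mean - width * sd, mean + width * sd
    h = (hi - lo) / (n - 1)
    total = 0.0
    for k in range(n):
        x = lo + k * h
        dens = math.exp(-0.5 * ((x - mean) / sd) ** 2) / (sd * math.sqrt(2 * math.pi))
        weight = 0.5 if k in (0, n - 1) else 1.0  # trapezoid end points
        total += weight * math.exp(x) * dens
    return total * h

def zip_expected_count(a4, a1, av1):
    """(1 - P(zero class)) * E[exp(i)], with a4 the zero-class logit."""
    return (1.0 / (1.0 + math.exp(a4))) * math.exp(a1 + av1 / 2.0)

# Hypothetical values for illustration only.
a1, av1, a4 = 0.4, 0.8, 2.0
closed_form = math.exp(a1 + av1 / 2.0)
numeric = lognormal_mean_numeric(a1, av1)
print(closed_form, numeric, zip_expected_count(a4, a1, av1))
```

The two values agree to many decimal places, which is the lognormal-mean identity that the av1/2 term relies on. Omitting av1/2 understates the expected count whenever the intercept variance is positive.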


Hello, Dr. Muthen, I have questions about the two-part growth model for zero-inflated data. Is it right that 1) all cases are included in the logistic part, and 2) the only cases included in the Poisson part are those who have non-zero responses at all time points? Thank you for your help.


Are you asking about two-part growth modeling or zero-inflated growth modeling? These two are different. Or, are you looking at some sort of combination?


Thank you for your response, and sorry for not being clear. I followed UG 6.7, so I assume it's a zero-inflated growth model. I have an added question. If I want to compute the estimated probabilities of p(y=0) at each time point from the estimated parameters, what's the correct formula? I tried exp(I+S*time)/(1+exp(I+S*time)), but it doesn't seem correct. I think the intercept parameter (not the growth factor I) comes into play, but I haven't figured out how. I am analyzing substance use data (count) with p(y=0) varying from .5 to .25 over time. Do you think a zero-inflated Poisson growth model is appropriate, or would you recommend another type of model such as a two-part growth model? Thank you very much for your help!


UG 6.7 is a zero-inflated model, so everyone contributes to every part of its estimation. For two-part models only the ones with a non-zero response contribute to the continuous part. It is difficult for you to compute the estimated probabilities because the growth factors are random variables. This means that you can't just insert their means in the formula but have to integrate over their distributions, which Mplus does using numerical integration. I think we print the estimated probabilities. There isn't a clear choice. If you want to view this as there being 2 types of people who answer zero, I would use zero-inflated modeling: those who do not participate in the activity and those who do but didn't in the time period studied.
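The point about not inserting means into the formula can be illustrated with a small sketch. It assumes a single normally distributed random effect on the logit scale and compares the plug-in probability (logistic of the mean) with the probability obtained by integrating the logistic function over the distribution. This uses a plain trapezoid rule, not the integration algorithm Mplus uses, and the values and function names are hypothetical.

```python
import math

def inv_logit(x: float) -> float:
    """Convert a logit into a probability."""
    return 1.0 / (1.0 + math.exp(-x))

def marginal_prob(mean, var, n=20001, width=10.0):
    """E[inv_logit(eta)] for eta ~ N(mean, var), by the trapezoid rule."""
    sd = math.sqrt(var)
    lo, hi = mean - width * sd, mean + width * sd
    h = (hi - lo) / (n - 1)
    total = 0.0
    for k in range(n):
        x = lo + k * h
        dens = math.exp(-0.5 * ((x - mean) / sd) ** 2) / (sd * math.sqrt(2 * math.pi))
        weight = 0.5 if k in (0, n - 1) else 1.0  # trapezoid end points
        total += weight * inv_logit(x) * dens
    return total * h

# Hypothetical logit mean and variance, for illustration only.
plug_in = inv_logit(-1.0)
integrated = marginal_prob(-1.0, 4.0)
print(round(plug_in, 3), round(integrated, 3))
```

With a sizeable variance the integrated probability is pulled toward 0.5 relative to the plug-in value, so the two can differ noticeably.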

Laura posted on Wednesday, March 19, 2014  12:15 pm



Hi, I have a question on LCGA with a count variable that is very skewed: about 80% have zero values at each time point. I have compared the results of models with ZIP, ZINB and negative binomial distributions. As for the BIC values, it seems that ZINB fits the data best (in 2 to 5 latent class models) and ZIP is the second best alternative, although the differences in BIC values are quite small. The form of the trajectories is quite different in these models (with ZIP and ZINB). However, both of them make sense substantively. Is it possible in this case to choose the model (ZIP or ZINB) based on BIC values?


It's hard to make a statistical choice when BIC values are close. You can also look at TECH10 and count the number of significant bivariates; see the Muthén-Asparouhov chapter below, where we use TECH10 information for crime curve model fit: Muthén, B. & Asparouhov, T. (2009). Growth mixture modeling: Analysis with non-Gaussian random effects. In Fitzmaurice, G., Davidian, M., Verbeke, G. & Molenberghs, G. (eds.), Longitudinal Data Analysis, pp. 143-165. Boca Raton: Chapman & Hall/CRC Press. You say LCGA, which then raises the possibility of generalizing to a GMM.
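For readers comparing models by hand, BIC can be recomputed from the loglikelihood, the number of free parameters and the sample size using the standard definition BIC = -2 logL + p ln(n). The sketch below uses made-up loglikelihoods and parameter counts; only the sample size of 389 echoes an earlier post, and nothing here comes from the models discussed above.

```python
import math

def bic(loglik: float, n_params: int, n_obs: int) -> float:
    """Standard BIC: -2 * logL + p * ln(n). Lower is better."""
    return -2.0 * loglik + n_params * math.log(n_obs)

# Hypothetical fits: the second model has a better logL but more parameters.
bic_simple = bic(loglik=-1000.0, n_params=10, n_obs=389)
bic_complex = bic(loglik=-995.0, n_params=14, n_obs=389)
print(round(bic_simple, 2), round(bic_complex, 2))
```

In this made-up case the extra parameters cost more than the gain in loglikelihood, so the simpler model has the lower BIC even though its fit is slightly worse.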

Laura posted on Tuesday, August 19, 2014  7:33 am



Thanks for your reply! Related to the previous post, I would still like to ask about the choice of the distribution. Should the choice be based more on substantive interpretation or on statistics, e.g. BIC values? With negative binomial and zero-inflated negative binomial I get quite similar solutions that are also substantively interesting. With ZIP, the trajectories are also clear, but one distinct group identified by ZINB and NB is missing. The BIC values are, in general, the best with ZINB, but the models do not converge very easily and the best loglikelihood value is not always replicated. The BIC values are the second best with ZIP, but the model does not identify the distinct trajectory that was identified with ZINB and NB. With NB the interpretation is good (and the average posterior probabilities are the highest) but the BIC values are the worst. Overall, the differences between BIC values are quite small. What do you think is more important in this kind of situation: statistical criteria (BIC, convergence) or interpretation?


These are difficult choices. Being able to replicate the best logL many times is important in order to trust the solution. If BIC values are close, I would rely on interpretability and usefulness of the model, for instance by relating the classes to antecedents and consequences. But you say LCGA; why not GMM? Our Topic 6 handout on our website, slides 127-137, discusses the choices, and in particular slide 130 compares GMM and LCGA, with GMM doing better.

namer posted on Friday, April 03, 2015  1:49 am



Dear Dr. Muthen(s), I am running a 5-wave LCGA on skewed count data (roughly 50-60% zeroes at any time point). Additionally, in these models I need to identify a class of people who score zero at every wave. 1. I understand from above that if I want a zero class with the same people at each time point I need to use a fixed zero class, as the zero inflation allows people to move in and out at any given wave? 2. Does using a fixed zero class negate the need for zero inflation, or does this depend on model fit? 3. I have variances much larger than my means; does this indicate that negative binomial models would be better suited than Poisson models? 4. I am trying to compare Poisson, ZIP, negative binomial and ZINB models, all with a fixed zero class. Can I do this using BIC/AIC etc.? Is a larger BIC in the ZI models a true indication of poorer fit, or just that these models are more complex, with more parameters than the non-ZI models? 5. Finally, the ZINB and ZIP latent growth (non-mixture) models have incredibly large values for the inflation growth means and variances (i.e. a slope of 35.00 and variance of 1800.00). The intercept means are 0.00 and the intercept variances are even larger (e.g. 41,000). How do you interpret such large values? Is this an indicator of a larger problem? Thank you for your time and help! Namer


1. In my experience, BIC typically does not favor a zero class across time. Instead, a solution with a low class (almost zero) comes out as the winner. We talk about ZIP growth modeling in the video and handout for Topic 6, slides 128-137. For a count outcome U, the inflation is referred to as U#1, where u#1 is a binary latent inflation variable and u#1=1 indicates that the individual is unable to assume any value except 0. In the output you see an instance of U#1 = -15, which means that there is no inflation (the probability of being in the zero class for this outcome is zero). Conversely, if you want to force a zero class you use +15 and say [u#1@15]; and then also fix any growth factor parameters at zero. 2. If you specify a fixed zero class you are using an inflation model. 3. Variance larger than mean typically calls for an inflation model. This doesn't mean that negbin fits better than ZIP. 4. BIC tends to make good choices. See also Topic 2 for regression examples using BIC to choose among a multitude of model variations. 5. Note that the Topic 6 slides 128-137 don't use a growth model for the inflation part, but simply use an intercept/mean parameter for it. Use that model first.
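The effect of fixing the inflation logit at a large negative or positive value can be seen numerically. This sketch assumes a ZIP outcome with inflation logit fixed at -15 or +15 (reading the values in the reply above as logits, with the sign of the first assumed to be negative) and a hypothetical Poisson mean of 2.0. The function names are illustrative only.

```python
import math

def inv_logit(x: float) -> float:
    """Convert a logit into a probability."""
    return 1.0 / (1.0 + math.exp(-x))

def zip_prob_zero_from_logit(inflation_logit: float, mu: float) -> float:
    """P(Y = 0) for a ZIP outcome given the inflation logit and Poisson mean."""
    pi_zero = inv_logit(inflation_logit)
    return pi_zero + (1.0 - pi_zero) * math.exp(-mu)

mu = 2.0  # hypothetical Poisson mean
no_inflation = zip_prob_zero_from_logit(-15.0, mu)  # close to plain Poisson
forced_zero = zip_prob_zero_from_logit(15.0, mu)    # close to 1
print(no_inflation, math.exp(-mu), forced_zero)
```

At a logit of -15 the zero-class probability is about 3 in 10 million, so the model is effectively plain Poisson; at +15 essentially every observation is a structural zero.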

Joe posted on Thursday, February 04, 2016  3:26 pm



In a ZIP growth model (Ex. 6.7) for the inflation part, if the intercept of the outcome variable (e.g., u11#1) is -1.37, can I interpret this parameter as the probability of 0.25 (e^-1.37) of being unable to assume any value except zero for each time point?


See our Topic 6 handout from our courses. 
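As a side note on the arithmetic in the question, exponentiating a logit gives the odds, not the probability. The sketch below assumes the intercept in question is -1.37 (the quoted 0.25 matches e^(-1.37), so the minus sign is taken as intended) and ignores any random effects in the inflation part. The function names are hypothetical.

```python
import math

def odds_from_logit(logit: float) -> float:
    """Odds of the zero class: exp(logit)."""
    return math.exp(logit)

def prob_from_logit(logit: float) -> float:
    """Probability of the zero class: odds / (1 + odds)."""
    odds = math.exp(logit)
    return odds / (1.0 + odds)

logit = -1.37  # assumed sign; see the note above
print(round(odds_from_logit(logit), 3), round(prob_from_logit(logit), 3))
```

Under these assumptions the odds are about 0.25 and the probability is about 0.20, so the two should not be used interchangeably.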

Almar Kok posted on Monday, December 12, 2016  12:37 am



Dear Dr. Muthén, I am in doubt whether to specify skewed variables in an LCGA as censored or let the model assume their distribution is normal. On the one hand, given the skewness of the variables it seems plausible to define them as censored. On the other hand, in other discussions on this forum you state the following: “If you expect a latent class (mixture) model underlying your data it is natural for you to see nonnormal outcomes; that's what the mixture can explain", and "the skewness is part of what is expected in mixtures and part of what determines the classes.” My questions are: 1. By these statements, do you mean that I should NOT specify skewed variables as censored in an LCGA? 2. I have compared results from a censored model vs a non-censored model, and they are quite different. The types of trajectories are about the same, but the percentages in the latent classes differ substantially. Also, the entropy in the censored models is quite a bit lower than in the non-censored models. Which one should I choose? I hope you can provide some guidance regarding these questions. Many thanks in advance!


1. Censored is only needed when you have a strong floor or ceiling effect, for example when more than 25% are at the lowest value. This is a more important factor in the choice than the skewness itself. 2. Choose according to the above. 

J Jack posted on Tuesday, May 23, 2017  6:54 am



Hello, I am running a zero-inflated Poisson LCGM but I get strange results when I check the plot options, while the output looks ok. Now I am unsure whether to trust my results. The model converges, produces classes and gives estimates that seem to make sense. But the problem is this: when I plot the observed means against the estimated means with the plot option, the estimated means for all classes lie far outside the possible range of values (all straight lines at value 999 for all classes), while the observed class values are far below, in the given range of weeks per year (0 to 52). Furthermore, if I use the numerical integration algorithm (which I suspect can be used in my case), then the plot option to check the distribution of means is not available. Does that mean that with numerical integration it is not possible to visually check how the sample means are distributed around the class means? Any hints about what could have gone wrong would be greatly appreciated.


Is it an Mplus plot you are looking at or a plot you have made outside Mplus? Remember that the model considers log(mean) and the plot is for the mean, so an exponentiation is involved. With random effects, such as growth factors, there is also numerical integration involved in the Mplus plots (your output Summary shows if integration is done).

J Jack posted on Wednesday, May 24, 2017  1:20 am



Thanks Bengt. It is indeed an Mplus plot: the plot of estimated means and observed values, and the same happens for the estimated means and observed means plot. I have been considering that it could have to do with the exponentiation. But I could not find any logical explanation why the log transformation should affect the estimated values differently than the observed values, resulting in the estimated values being far outside the range of the observed values and on the same line for all classes. It seems you have not encountered such problems with the plot function before. Unfortunately I have no idea what could be wrongly specified in the model such that all estimated means would be incredibly inflated while the model still converges and estimates classes as expected.


I would need to try this myself on your data to know exactly what you are looking at. Send files to Support along with your license number. 
