Growth Mixture Survival Analysis

Anonymous posted on Thursday, March 21, 2002 - 2:50 pm

Dear Linda & Bengt,

I am running a growth mixture survival model relating the development of aggression to later school suspension. I first ran it on incorrectly coded data, i.e., without assigning a missing code to the remaining suspension time indicators (0=no, 1=yes) once a removal has occurred. With that coding my model converged. After adding the missing values, I keep getting the same error message:
*** ERROR in Model command
Ordered thresholds 1 and 2 for class indicator U7 are not
increasing. Check your starting values.

I looked at the data file as well as the distribution of that indicator by class membership of aggression, but I did not find anything wrong. Any suggestions would be greatly appreciated.

Best,

Hanno

P.S.: Here is the input file for your information:

Data:

File is d:\data\AERA\aera.dat;

Variable: Names are prcid cluster cohort desgn11f
gender lunch_b race sctag011 sctag012
sctag021 sctag022 u3 u4 u5 u6 u7;

Missing are all (9);

Usevariables are sctag011 sctag012
sctag021 sctag022 u3-u7;
Useobservations is gender EQ 1;
categorical are u3-u7;
Classes = c(3);

Analysis: Type = Mixture Missing;

Model: %Overall%

ac by sctag011@1 sctag012@1 sctag021@1 sctag022@1;
bc by sctag011@0 sctag012@0.5 sctag021@1 sctag022@1.5;

[ac bc];
[sctag011@0 sctag012@0 sctag021@0 sctag022@0];

f by u3-u7@1;
ac with bc@0;
bc@0;
sctag011 with sctag012;

%c#1%
[ac*3.039 bc*];
[f*4];
[u3$1*3] (1);
[u4$1*3] (2);
[u5$1*2] (3);
[u6$1*2] (4);
[u7$1*1] (5);

%c#2%
[ac*2.669 bc*];
[f*1];
[u3$1*3] (1);
[u4$1*3] (2);
[u5$1*2] (3);
[u6$1*2] (4);
[u7$1*1] (5);

%c#3%
[ac*1.559 bc@0];
[f@0];
sctag011-sctag022;
ac;
[u3$1*3] (1);
[u4$1*3] (2);
[u5$1*2] (3);
[u6$1*2] (4);
[u7$1*1] (5);

Output: tech1 patterns standardized tech8 residual ;
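
The recoding described in this post (setting the remaining suspension indicators to the missing code once a removal has occurred) can also be done before the data reach Mplus. The following Python is a minimal sketch under stated assumptions: the function name `censor_after_event` and the example rows are hypothetical, while the missing code 9 and the u3-u7 ordering are taken from the input file above.

```python
MISSING = 9  # missing-value code declared in the input file above

def censor_after_event(indicators, missing=MISSING):
    """Return a copy in which every period after the first event (1) is set to missing."""
    recoded = []
    event_seen = False
    for value in indicators:
        if event_seen:
            # The event has already happened, so later periods carry no information.
            recoded.append(missing)
        else:
            recoded.append(value)
            if value == 1:
                event_seen = True
    return recoded

# Hypothetical rows holding u3, u4, u5, u6, u7 for three students
rows = [
    [0, 0, 1, 1, 1],  # first suspended at u5
    [0, 0, 0, 0, 0],  # never suspended
    [1, 0, 0, 0, 0],  # suspended at u3
]
recoded_rows = [censor_after_event(row) for row in rows]
for row in recoded_rows:
    print(row)
```

After this step each indicator should contain only 0, 1, and the missing code, which is what the CATEGORICAL statement expects for a binary event indicator.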

Linda K. Muthen posted on Thursday, March 21, 2002 - 6:42 pm

This message means that u7 is not binary but has three categories. Mplus needs a starting value for u7$2 that is greater than the starting value for u7$1. It is using zero as a default, and this is not greater than the starting value for u7$1. Could you be reading the data incorrectly? Why don't you add SAVEDATA: FILE IS and see what Mplus is reading and saving for u7?

Hanno Petras posted on Friday, March 22, 2002 - 6:31 am

Dear Linda,

as a follow-up: as I mentioned in my email, the categorical indicators u3 to u7 are binary indicators (0 or 1) with 9 as the missing value. Interestingly enough, I was able to run the model by excluding "u7". I also tried collapsing u6 and u7 in case of small cell frequencies, but this resulted in the same error message. The SAVEDATA command unfortunately does not work, since the model (even a basic one) does not run because of the threshold problem. Therefore, I cannot really check what Mplus reads against the data set I provided. Any other ideas?

Best,

Hanno

Linda K. Muthen posted on Friday, March 22, 2002 - 7:59 am

Do a TYPE=BASIC MISSING without the CATEGORICAL statement and save the data. I believe that you will find u7 is not binary. You can check the sample statistics and the save file. This should work because it will not be stopped by the threshold problem.
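
A similar check can be made outside Mplus by tabulating the distinct values in the u7 column of the raw or saved data file. This is a minimal Python sketch, assuming a whitespace-delimited file; the function name `column_counts`, the column position, and the sample rows are hypothetical and only illustrate how a stray value would show up as a third category.

```python
from collections import Counter

def column_counts(text, column_index):
    """Count how often each distinct value appears in one whitespace-delimited column."""
    counts = Counter()
    for line in text.splitlines():
        fields = line.split()
        if len(fields) > column_index:
            counts[fields[column_index]] += 1
    return counts

# Hypothetical data: columns are u3 u4 u5 u6 u7, with 9 as the missing code
sample = """\
0 0 0 0 0
0 0 1 9 9
0 0 0 0 1
0 0 0 0 2
1 9 9 9 9
"""

u7_counts = column_counts(sample, column_index=4)
non_missing = sorted(value for value in u7_counts if value != "9")
print("u7 value counts:", dict(u7_counts))
print("non-missing categories:", non_missing)
```

If more than two non-missing categories appear, or if the missing code is not the one declared in the MISSING statement, the indicator will not be treated as binary.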

Hanno Petras posted on Friday, March 22, 2002 - 9:12 am

Dear Linda,

good point. I tried that and looked at the frequency of u7. The mean is below one and the frequency reveals that the variable has in fact only two categories. What now?

Best,

Hanno

Linda K. Muthen posted on Saturday, March 23, 2002 - 8:31 am

Did you do the frequencies on the data saved by Mplus? If so, I would need the data and input to look at the problem myself.

Miles Taylor posted on Monday, April 11, 2005 - 1:07 pm

Dear Linda & Bengt,
I've been trying to run a model from the "zero-inflation" lecture (#10) of the online videos (pgs 8-15 of handout, Olsen and Schafer). Though the instruction is VERY helpful for setting up the data and the MODEL commands, I'm not sure which type of model to use. Could you please tell me...
1. whether this is a MEANSTRUCTURE or MIXED model and
2. if MEAN, do I have to assume both the observed indicators and latent "onset" variables are continuous? and
3. if MIXED, which is/are my "class" variable/s?

The only way I've gotten this model to run (identified) is to use MEANSTRUCTURE with all continuous observed & latent variables. I'm unclear now on the interpretation of the effects of "x" on "iu" and "su" since they are not binary.

BMuthen posted on Wednesday, April 13, 2005 - 11:47 pm

This can be run as a meanstructure model if you have only a single class, which is the standard Olsen and Schafer model. Not all observed variables are continuous. The binary part of this model is an observed variable that at each timepoint indicates being at zero or not. So the effects of x on iu and su are interpreted as in a growth model for a binary outcome. For more details, see the example in the Mplus User's Guide under two-part modeling.

sara posted on Wednesday, October 12, 2005 - 11:36 am

Dear Linda & Bengt,

I am running a discrete-time survival model of drug initiation.
I get the following error messages:

THE MODEL ESTIMATION DID NOT TERMINATE NORMALLY DUE TO AN ILL-CONDITIONED
FISHER INFORMATION MATRIX. CHANGE YOUR MODEL AND/OR STARTING VALUES.

THE MODEL ESTIMATION DID NOT TERMINATE NORMALLY DUE TO A NON-POSITIVE
DEFINITE FISHER INFORMATION MATRIX. THIS MAY BE DUE TO THE STARTING VALUES
BUT MAY ALSO BE AN INDICATION OF MODEL NONIDENTIFICATION. THE CONDITION
NUMBER IS 0.220D-17.

THE STANDARD ERRORS OF THE MODEL PARAMETER ESTIMATES COULD NOT BE
COMPUTED. THIS IS OFTEN DUE TO THE STARTING VALUES BUT MAY ALSO BE
AN INDICATION OF MODEL NONIDENTIFICATION. CHANGE YOUR MODEL AND/OR
STARTING VALUES. PROBLEM INVOLVING PARAMETER 2.

My input file looks like this:
VARIABLE: NAMES are
Subject
class
cond
dgu9_1
dgu9_2r
dgu9_3r
dgu9_4r
dgu9_5r
d1
d2
d3
;
usevariables are dgu9_2r dgu9_3r dgu9_4r dgu9_5r d1 d3 ;
categorical = dgu9_2r dgu9_3r dgu9_4r dgu9_5r ;
!classes=c(1);
missing = all(-999);
analysis: type= missing;
!ALGORITHM=INTEGRATION;
estimator=MLR;
!Starts = 50 5;

MODEL:
f by dgu9_2r dgu9_3r dgu9_4r dgu9_5r@1;
f on d1 d3;
!f@0;
output: sampstat ;

Any suggestion would be greatly appreciated.

Linda K. Muthen posted on Wednesday, October 12, 2005 - 2:48 pm

You don't seem to have your model set up correctly. See Example 6.18. I think you want:

MODEL:
f by dgu9_2r@1 dgu9_3r@1 dgu9_4r@1 dgu9_5r@1;
f on d1 d3;
f@0;
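
With all loadings fixed at 1 and the residual variance of f fixed at 0, the corrected setup can be read as a proportional-odds discrete-time hazard model. The following is a sketch in notation that does not appear in the thread: h_j is the hazard in period j, tau_j is the threshold Mplus estimates for the j-th event indicator, and beta_1, beta_3 are the coefficients from "f on d1 d3".

```latex
\operatorname{logit} h_j(d_1, d_3)
  = \log \frac{h_j(d_1, d_3)}{1 - h_j(d_1, d_3)}
  = -\tau_j + f,
\qquad
f = \beta_1 d_1 + \beta_3 d_3 .
```

The minus sign appears because Mplus reports thresholds rather than intercepts. Because every loading equals 1, the covariates shift the logit of the hazard by the same amount in every period.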

sara posted on Thursday, October 13, 2005 - 7:06 am

Thanks, Linda.

drgopukumar posted on Tuesday, July 18, 2006 - 11:34 pm

Hi,

I have 3 sets of data, i.e., 1st, 2nd and 3rd assessments of the same patients on different occasions. I want to fit a growth mixture model. Please help me do the analysis through SAS, and give me the formula and steps for the growth mixture model so that I can calculate it on my own.

Linda K. Muthen posted on Wednesday, July 19, 2006 - 8:03 am

The formulas are in the following paper:

Muthén, B. & Shedden, K. (1999). Finite mixture modeling with mixture outcomes using the EM algorithm. Biometrics, 55, 463-469.

This is too involved to tell you how to program it in SAS.

Nan Zhang posted on Tuesday, May 11, 2010 - 8:14 pm

Dear Dr. Muthen
I have a question about the example 6.23-- what is the parametric model used? (Exponential or Weibull or others?)

Best,
Nan

Linda K. Muthen posted on Wednesday, May 12, 2010 - 8:36 am

Exponential using the Cox model.

Nan Zhang posted on Friday, May 14, 2010 - 6:40 am

Dear Dr. Muthen
I am doing continuous-time survival analysis using a parametric
proportional hazards model with two factors influencing survival.

Part of my code is:
MODEL: f1 BY b1-b5 c1-c5 c8;
f2 BY b6 c6 c7;
[t#1-t#21];
t ON f1 f2 age famdep;
f1 on famdep;
f2 on famdep;

But I ran into the following error message even after I increased MITERATIONS to 2000:
THE MODEL ESTIMATION DID NOT TERMINATE NORMALLY DUE TO A NON-POSITIVE
DEFINITE FISHER INFORMATION MATRIX. THIS MAY BE DUE TO THE STARTING VALUES
BUT MAY ALSO BE AN INDICATION OF MODEL NONIDENTIFICATION. THE CONDITION
NUMBER IS 0.952D-15.

Could you give me some suggestions how to fix this?

Thanks in advance!

Nan

Linda K. Muthen posted on Friday, May 14, 2010 - 8:34 am

Please send your full output and license number to support@statmodel.com.

Richard E. Zinbarg posted on Tuesday, September 21, 2010 - 8:52 am

Hi Linda and/or Bengt,
I am collaborating on an analysis of predictors of the onset of marijuana use. Participants were assessed at three points in time (ages 14, 16 and 18) and the primary dependent variable is whether the P has begun marijuana use by that time. It seems to me that this calls for discrete-time survival analysis with missing data flags being used to indicate the event has already occurred in an earlier time point. My collaborator wants to use latent growth curve modeling instead (with values of 1 not only at the time point at which the individual initiated onset but at subsequent time points as well). Would the results of a latent growth curve analysis of these data be interpretable?
As always, many thanks for this outstanding discussion board!

Linda K. Muthen posted on Tuesday, September 21, 2010 - 9:41 am

I think discrete-time survival analysis would be most appropriate to assess onset of marijuana use. However, I think more than three measures are needed.

Richard E. Zinbarg posted on Wednesday, September 22, 2010 - 6:28 am

thanks for the very speedy reply Linda! As I have not done a lot of survival analyses before, I have another question that is very basic. In the examples in Ch. 6 of the User Guide, MLR is used but the examples mention that one could choose other estimators. Is MLR recommended for survival analysis? And under what circumstances would it be recommended to use a different estimator?

Linda K. Muthen posted on Wednesday, September 22, 2010 - 10:53 am

In Mplus you must use maximum likelihood to obtain results that are discrete-time survival. If you use another estimator, I don't know what your results would be, but they would not be discrete-time survival. I have not seen other estimators used for discrete-time survival.

Richard E. Zinbarg posted on Wednesday, September 22, 2010 - 6:46 pm

thanks Linda, ML it shall be then! When regressing continuous covariates in the discrete-time survival model, how does one interpret the parameter estimates provided by Mplus? Are these the regression coefficients that one would take the exponent of to get the hazard ratio?

Linda K. Muthen posted on Thursday, September 23, 2010 - 9:09 am

Yes.
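
The exponentiation asked about above can be sketched in a few lines of Python. The estimate and standard error below are hypothetical, and the function name `hazard_ratio` and the Wald-type 95% interval are assumptions added for illustration. Note that with a logit link the exponentiated coefficient is strictly a ratio of hazard odds, which is close to a hazard ratio when the per-period hazards are small.

```python
import math

def hazard_ratio(estimate, std_error, z=1.96):
    """Exponentiate a log-scale coefficient and the ends of its Wald interval."""
    point = math.exp(estimate)
    lower = math.exp(estimate - z * std_error)
    upper = math.exp(estimate + z * std_error)
    return point, lower, upper

# Hypothetical coefficient and standard error for one continuous covariate
ratio, lower, upper = hazard_ratio(0.405, 0.10)
print(f"ratio = {ratio:.2f}, 95% interval = ({lower:.2f}, {upper:.2f})")
```

A ratio above 1 means that a one-unit increase in the covariate is associated with a higher hazard of the event in each period; a ratio below 1 means a lower hazard.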

Ho Wang posted on Thursday, February 17, 2011 - 12:37 pm

For some reasons, I couldn't find the example data, input and output files for example 8.15 in Chapter 8, either on the Mplus website or in the installed "users' guide examples". Could you drop a hint? Thanks.

Bengt O. Muthen posted on Thursday, February 17, 2011 - 5:02 pm

You are right that this example is not there. Early on we didn't have a way to generate Monte Carlo data for discrete-time survival. But if you look at UG ex 12.8 you find a simulation study there, although for a single-class model.

Ho Wang posted on Thursday, February 24, 2011 - 1:38 pm

Thanks for the tip. Since the scenario of ex 8.15 is a bit different from ex 12.8, I am wondering if there are online Mplus documents and papers on using growth trajectory classes to predict survival. Thanks.

Bengt O. Muthen posted on Thursday, February 24, 2011 - 3:05 pm

There is the Muthen-Masyn (2005) JEBS article which we have on our web site.

Just try to set it up by combining pieces from different UG examples and see how it works out.

Ho Wang posted on Thursday, February 24, 2011 - 9:18 pm

Thanks

Gideon Bahn posted on Wednesday, June 22, 2011 - 8:28 am

Hi Tihomir,
I am trying to fit a model based on the paper you presented at the JSM Biometrics section in 2006. I am wondering if this setup is OK for the 3.3 time-varying latent variable model without CVs.
variable: names are ID insul prim trt time mi chf cor amput;
Usevariables = trt prim time mi chf cor amput;
survival = time (all);
timecensored = prim (0=not 1=right);
CATEGORICAL = mi chf cor amput;
Analysis: Algorithm = integration;
basehazard = on;
Model: prime1 by mi-chf@1 ;
prime2 by cor-amput@1;
time ON trt prime1 prime2;
prime1 on trt ;
prime2 on trt;
prime1 with prime2;

OUTPUT: TECH1 TECH8;

Tihomir Asparouhov posted on Wednesday, June 22, 2011 - 9:53 am

Everything looks good except
basehazard = on;

I think you want to remove that command or set this to
basehazard = off;

otherwise you would be going into parametric modeling of the basehazard.

Gideon Bahn posted on Wednesday, June 22, 2011 - 12:04 pm

Hi Tihomir,
I think you noticed that this model has two latent variables in one model, which is like multiple outcomes (two outcomes).

Gideon Bahn posted on Wednesday, June 22, 2011 - 1:30 pm

Can I also use two different constant survival times for each of the latent variables?

Tihomir Asparouhov posted on Thursday, June 23, 2011 - 11:43 am

I am not sure what you mean but the factor models
prime1 by mi-chf@1 ;
prime2 by cor-amput@1;
don't need to have loadings fixed to 1. You can leave that as
prime1 by mi-chf;
prime2 by cor-amput;

Gideon Bahn posted on Thursday, June 30, 2011 - 10:58 am

The two factors are the two latent variables (making them multiple outcomes).

Loadings, Yes.

Fernando H Andrade posted on Monday, August 19, 2013 - 10:41 am

Dr. Muthén,
I modeled the onset of alcohol, cigarettes and marijuana (see syntax for details) and got a high correlation between the onset of alc and cig (0.994). Would that be a problem? (I also got a warning.)

%Overall%
ses by z_sesw1@1;
z_sesw1@0.001;
ext10 by CBEXT10@1;
CBEXT10@0.001;
alc by yalc6-yalc17@1;
cig by ycig7-ycig17@1;
mar by ymar11-ymar17@1;

alc with cig mar;
cig with mar;

alc ON y1age z_sesw1 CBEXT10;
cig ON y1age z_sesw1 CBEXT10;
mar ON y1age z_sesw1 CBEXT10;

yalc6 ON female (1);
yalc7 ON female (1);
yalc8 ON female (2);
.
yalc10 ON female (2);
yalc11 ON female (3);
..
yalc15 ON female (3);
yalc16 ON female (4);
yalc17 ON female (4);

similar for cig and mar

Bengt O. Muthen posted on Monday, August 19, 2013 - 3:50 pm

It sounds like you get a high correlation among your dependent variable factors. I would do a factor analysis of these 2 sets of indicators to see if this high correlation persists. It is more of a substantive issue, rather than a statistical Mplus issue.

Fernando H Andrade posted on Tuesday, August 20, 2013 - 10:14 am

Sounds good, thank you very much!

Joe posted on Thursday, October 03, 2013 - 9:26 am

Hi,

I am running a discrete-time survival analysis, where the event is passing a test at time 1, 2 or 3.

f BY t1-t3@1;
f@0;
f ON S H M F L D;

I want to add time-varying covariates that indicate whether the student was close to passing the previous time (based on continuous scores, not pass/fail). For example, t2 regressed on an indicator of those students close to passing t1.

t2 ON C1;
t3 ON C2;

When I include the second in the model (t3 ON c2), the UNIVARIATE PROPORTIONS AND COUNTS FOR CATEGORICAL VARIABLES suggest that I lose those students that are censored at t2. In other words, I lose 785 students that survived t1 (0) but were missing at t2. As a result, my proportions for t1 are meaningfully affected. But I do not lose any cases at any other time, so that those censored at t2 are not lost at t3.

1) Can you help explain this to me?
2) What are your thoughts about a covariate that is contingent on the score at a previous time?

Thanks,
Joe

Linda K. Muthen posted on Thursday, October 03, 2013 - 11:50 am

You are losing cases because of missing data on the covariates. For observations with missing data on the survival variables, you can change the value for the covariates to a non-missing value. Then they will not be deleted. Their value will have no effect on the analysis. So in DEFINE say

If (t2 eq 999) then c1 = 4;

where 999 is the missing value flag for t2.

Joe posted on Thursday, October 03, 2013 - 1:20 pm

Thank you very much! I actually had to use the _MISSING command (rather than 999), but it worked.
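
The recode discussed above (giving the covariate an arbitrary non-missing value wherever the survival indicator is missing) can be illustrated outside Mplus as well. This is a minimal Python sketch: the function name and the example records are hypothetical, `None` stands in for the missing-value flag, and the filler value 4 is the one used in the reply. It is an arbitrary placeholder, since according to the reply the value has no effect on the analysis.

```python
def fill_covariate_when_outcome_missing(records, outcome, covariate, filler=4):
    """Set the covariate to a non-missing filler wherever the outcome is missing."""
    filled = []
    for record in records:
        record = dict(record)  # copy so the input rows are left untouched
        if record[outcome] is None:
            record[covariate] = filler
        filled.append(record)
    return filled

# Hypothetical students: t2 is missing once a student is censored or has already passed
students = [
    {"t1": 0, "t2": 1, "c1": 1},
    {"t1": 0, "t2": None, "c1": None},
    {"t1": 1, "t2": None, "c1": None},
]
fixed = fill_covariate_when_outcome_missing(students, outcome="t2", covariate="c1")
for row in fixed:
    print(row)
```

Rows with an observed t2 keep their original covariate value, so only cases that would otherwise be dropped for missing covariates are touched.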

Varsha Gupta posted on Monday, March 23, 2015 - 4:14 am

Hi!

I am trying to compare the trajectories derived from a growth mixture model (GMM) in Mplus. I use the criteria of minimum BIC and the parametric bootstrapped likelihood ratio test (p-value 0.05) to decide on the number of classes. I start with one class and increase the number of classes until the p-value of the bootstrap test is > 0.05 and the BIC change is insignificant. I get 4 classes. To validate the model, I used predefined clinical thresholds to determine 3 stable trajectories (within a category at all time points + mover-negative and mover-positive trajectories). But the mover-negative trajectory cannot be identified by the GMM. However, if I increase the number of classes to 6, BIC goes down but the p-value is > 0.05, and there is good overlap between the clinical definition and the GMM derivation. Is it correct to do that and use only BIC as the goodness of classification? I need my model to be consistent with the clinical definitions.

Bengt O. Muthen posted on Monday, March 23, 2015 - 1:57 pm

In my view, using theory and BIC is sufficient.
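
For reference, the BIC comparison across class solutions can be sketched as follows. The formula BIC = -2 logL + p ln(n) is standard; the loglikelihoods, parameter counts, and sample size below are hypothetical numbers chosen only for illustration, not results from the thread.

```python
import math

def bic(loglik, n_params, n_obs):
    """Bayesian information criterion: smaller values indicate a better trade-off."""
    return -2.0 * loglik + n_params * math.log(n_obs)

# Hypothetical fits: number of classes -> (loglikelihood, number of free parameters)
fits = {
    1: (-5200.0, 8),
    2: (-5100.0, 11),
    3: (-5075.0, 14),
    4: (-5060.0, 17),
    5: (-5055.0, 20),
}
n_obs = 500  # hypothetical sample size

bic_by_classes = {k: bic(ll, p, n_obs) for k, (ll, p) in fits.items()}
best_k = min(bic_by_classes, key=bic_by_classes.get)
for k, value in sorted(bic_by_classes.items()):
    print(f"{k} classes: BIC = {value:.1f}")
print("lowest BIC:", best_k, "classes")
```

As the reply notes, the BIC ranking is weighed together with theory, so a solution with a slightly higher BIC may still be preferred if it matches the substantive definitions better.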

Jordan davis posted on Wednesday, December 14, 2016 - 8:51 am

Hello! I just completed a survival mixture analysis (5-class solution) like example 6.18. I am wondering about the interpretation of the output with and without covariates.

WITHOUT CO-VARIATES:
1) are the estimates given for my survival variables (t) hazard ratios? such that a value below 1 has a lower hazard compared to the reference class?

WITH CO-VARIATES:
1) I am getting two regressions:
T ON X;
C ON X;

The results of C ON X are just the multinomial logistic regression. However, I'm having trouble with T ON X. Does this impact the survival variable (which is now represented by an "intercept")?

2) are the results interpreted separately for X on C and X on T? or is this a type of moderation?

Bengt O. Muthen posted on Wednesday, December 14, 2016 - 12:07 pm

6.18 is not a survival mixture example (in the UG on the web).

T ON x

gives regular Cox survival information. For interpretations, see our survival papers on our website under Papers, Survival Analysis such as

Asparouhov, T., Masyn, K. & Muthén, B. (2006). Continuous time survival in latent variable models. Proceedings of the Joint Statistical Meeting in Seattle, August 2006. ASA section on Biometrics, 180-187.

Muthén, B., Asparouhov, T., Boye, M., Hackshaw, M. & Naegeli, A. (2009). Applications of continuous-time survival in latent variable models for the analysis of oncology randomized clinical trial data using Mplus. Technical Report.

and also

Stoolmiller, M. and Snyder, J. (2013). Embedding multilevel survival analysis of dyadic social interaction in structural equation models: Hazard rates as both outcomes and predictors. Journal of Pediatric Psychology, DOI: 10.1093/jpepsy/jst076.