
Anonymous posted on Saturday, September 15, 2001  1:59 pm



Linda/Bengt, using Mplus, is it possible to do discrete-time survival analysis with one latent class? I want to: 1. Test whether time-invariant and/or time-varying predictors significantly influence survival functions. 2. Conduct multigroup analysis to test whether two groups have different survival functions. Can I use equality constraints to test this? 3. Test whether there are interactions between the effects of predictors and group (e.g., gender). Can I do multigroup analysis to test equality of the effects across groups? I have data from three time points. Is that enough for survival analysis in Mplus? Can I use missing data? I am looking forward to hearing your answer. Thanks so much!

bmuthen posted on Sunday, September 16, 2001  8:37 am



Yes, you do regular discrete-time survival analysis with one class (in line with our new paper #92). Regarding your question 1, the answer is yes. Re 2., yes, you can do multiple-group analysis by noting that in mixture modeling groups can be handled by latent classes that are known using training data (see papers 85, 86). Re 3., yes, this is possible. Three time points is a rather low number for survival analysis, although I haven't done simulation studies to see how the method works in this case. Missing data can be handled using MAR as usual.

Anonymous posted on Sunday, September 16, 2001  10:10 pm



Although 3 time points are not enough for survival analysis, it is possible to do the analysis, right? How many time points would you recommend? Four to seven as for latent growth modeling? 

bmuthen posted on Monday, September 17, 2001  9:19 am



Yes, it is possible with 3 time points. I am not aware of research suggesting a minimum number of time points, but maybe others reading this would know.

Anonymous posted on Thursday, March 07, 2002  9:25 pm



I know that I can do discrete-time survival analysis using Mplus. Is it also possible to do this for correlated data, for example, mortality for a family with different relationships (mother, father, sons)? I am looking forward to hearing your answer. Thanks.

bmuthen posted on Friday, March 08, 2002  9:04 am



I can think of 3 ways to handle correlated data in general. Note that DTSMA in Mplus is done via type=mixture. First, you can estimate the parameters of the model as usual, but compute the correct standard errors that take the non-independence into account. This can be done in the current Mplus using type=mixture complex. Second, you can take a multivariate approach where, say, mother, father, and son all contribute to a multivariate observation vector. I have not explored this possibility for DTSMA, but it would seem to have great potential. Third, you can take a multilevel approach. Multilevel mixture analysis is not available in the current Mplus but will be available in the future.


Following up the messages from 9/01: What does the class structure do in a one-class model? I see it in the survival analysis examples, but I'm not clear on what's different in a one-class model compared to a non-mixture model. Thanks.


There is no difference between a one-class model and a non-mixture model. Because certain features are not available in the non-mixture parts of Mplus, doing a one-class mixture model allows those features to be used for non-mixture models.


Thanks. I guess the feature availability is part of the "difference" I was asking about. What is it that makes the one-class model valuable for survival analysis? Thanks.


I think the question is what makes TYPE=MIXTURE valuable for survival analysis. One can have mixture discrete-time survival analysis where there is more than one class. See Example 25.13 in the Mplus User's Guide. Or one can have discrete-time survival analysis with one class. It is the use of the latent class indicators to represent survival during a certain period that makes the discrete-time survival analysis possible. For example, a person who is observed for five time periods and survives during those periods has u values of 0 0 0 0 0. A person who experiences the event (death, for example) in period 4 would have u values of 0 0 0 1 999. And a person who drops out after period 4 would have u values of 0 0 0 0 999. I hope this answers your question. If not, ask again.
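The coding rule in the reply above can be sketched outside Mplus as a small data-preparation step. This is only an illustration in Python: the function name `event_indicators` and its arguments are hypothetical, and 999 is used as the missing-value flag because that is the value shown in the post.

```python
MISSING = 999  # missing-value flag, as used in the post above


def event_indicators(n_periods, event_period=None, last_observed=None):
    """Build the u values for one person.

    event_period:  1-based period in which the event occurs, or None.
    last_observed: 1-based last period observed without the event
                   (dropout), or None if observed to the end.
    """
    if event_period is not None:
        observed = event_period
    elif last_observed is not None:
        observed = last_observed
    else:
        observed = n_periods
    # 0 for every observed period, missing afterwards
    u = [0] * observed + [MISSING] * (n_periods - observed)
    if event_period is not None:
        u[event_period - 1] = 1  # the event period itself is coded 1
    return u


print(event_indicators(5))                   # survives all five periods
print(event_indicators(5, event_period=4))   # event in period 4
print(event_indicators(5, last_observed=4))  # drops out after period 4
```

The three calls reproduce the three patterns described in the post: 0 0 0 0 0, 0 0 0 1 999, and 0 0 0 0 999.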


Yes, thanks. 


In private correspondence with Linda, I've been working on a survival model where the survival function is the outcome of a path model; unfortunately, she tells me that's not possible in the current version. So now I'm in brainstorming mode. I don't have a lot of possible times of failure, and I don't have any missing data, so I've really only got a handful of response patterns. How useful would it be to approximate the outcome by an ordinal variable that is the time of failure? Pat 

Chuck Green posted on Tuesday, February 18, 2003  8:29 am



I am attempting to implement the discrete-time survival model that is described in the Muthen and Masyn article, as well as in the User's Guide. I have 180 daily observations on a sample of N=253. The difficulty I seem to be having is that on some of the days there are no "failures" (i.e., events coded "1"). This results in a day on which all values are either "0" or missing. The Mplus output indicates that these days are problematic for the analysis. Is my assumption correct that the problem is that these days actually provide no information about survival? In order to deal with this problem I have tried two approaches: 1) I have collapsed observations (i.e., into weeks, and then months) such that no single measurement occasion has only one category of response. This works, but I have to collapse the data all the way to the month level. 2) I have attempted to treat those days on which only one type of response occurs as missing. The problem here is that I still receive the same error message that only one response category occurred on that particular day. How might I circumvent this problem using Mplus? Chuck Green


The current version of Mplus does not allow categorical variables with observations in only one category. In an unrestricted survival model with no covariates, one can simply manually insert an estimated hazard of zero. However, if the hazard is constrained in some way, e.g., constant hazard, or the model includes time-invariant covariate effects, simply excluding that time interval represents a loss of information and could result in a bias in the hazard and effects estimates. Collapsing is always a possibility, but you need to make the assumption that the underlying hazard is constant within each discrete time interval. Also, there is the suggestion (from Tihomir) of inserting an observation with the unobserved category in question with a weight of zero. I wonder, however, with only n=253 on 180 time intervals, how solid/stable the model will be, even if it is "allowed" to run. It may be that with so many time periods, continuous-time survival might be a much more reasonable way to go.
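To see which intervals carry no events, and what collapsing does to the event indicators, a rough Python sketch may help. The function names and the toy data are hypothetical, and one choice here is an assumption rather than anything stated in the thread: a person who is censored partway through a collapsed block without an event is coded missing for that block.

```python
MISSING = 999  # missing-value flag for the event indicators


def risk_sets_and_events(rows):
    """Count, per period, how many people are at risk and how many events occur."""
    n_periods = len(rows[0])
    at_risk = [0] * n_periods
    events = [0] * n_periods
    for row in rows:
        for j, u in enumerate(row):
            if u != MISSING:
                at_risk[j] += 1
                events[j] += u
    return at_risk, events


def collapse(row, width):
    """Collapse consecutive periods into blocks of `width` periods.

    1 = event inside the block, 0 = block fully observed without event,
    MISSING = not observed, or censored partway through (assumption).
    """
    out = []
    for start in range(0, len(row), width):
        block = row[start:start + width]
        observed = [u for u in block if u != MISSING]
        if 1 in observed:
            out.append(1)
        elif len(observed) == len(block):
            out.append(0)
        else:
            out.append(MISSING)
    return out


# Toy data: four people, six daily indicators
daily = [
    [0, 0, 1, 999, 999, 999],
    [0, 0, 0, 0, 0, 1],
    [0, 0, 0, 0, 999, 999],
    [0, 0, 0, 0, 0, 0],
]
at_risk, events = risk_sets_and_events(daily)
empty_days = [j + 1 for j, e in enumerate(events) if e == 0]
print("days with no events:", empty_days)

blocks = [collapse(row, 3) for row in daily]
print("collapsed:", blocks)
print("risk sets and events after collapsing:", risk_sets_and_events(blocks))
```

In this toy example four of the six days have no events, while the two three-day blocks each contain one, which is the situation the collapsing approach in the post is meant to reach. The constant-hazard-within-interval assumption mentioned in the reply still applies.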

Anonymous posted on Tuesday, March 04, 2003  6:26 pm



I would like to estimate a DTSA model where the probability of failure is predicted by a number of variables. One of the exogenous variables is itself an endogenous variable from the "first stage" regression estimate. That is: Y ON X1-Xm; Factor BY f1-f4; Factor ON Xm-Xz, Y. I am not sure how to set up the model. If I put the regression component before or after the %OVERALL% command or after the %C#1% command, the program won't calculate. Is there a way to set up this model? Thanks

bmuthen posted on Wednesday, March 05, 2003  8:06 am



This analysis is not yet available in Mplus. It is an example of a structure, a path model in this case, being imposed on variables predicting the binary event history indicators, which is a model that Mplus currently cannot estimate by the required maximum-likelihood approach. This will be available in Version 3.


Can DTSA be modeled using complex data (clustering and sample weights)? I remember that TYPE=COMPLEX requires continuous outcomes. 


Yes, this is possible. DTSA is done using TYPE=MIXTURE. TYPE=MIXTURE COMPLEX with weights can be used if you want to include clustering and weights. 

Anonymous posted on Wednesday, August 31, 2005  9:01 am



Hi there, I am trying to run a survival analysis with time-varying covariates using Mplus Version 3. How do I set up the model with time-varying covariates? Thanks! Chris


See Example 6.18 in the Mplus User's Guide, which is a discrete-time survival analysis, and add time-varying covariates as shown in Example 6.10.

Anonymous posted on Wednesday, August 31, 2005  10:39 am



Thanks! Can you give me an example of the data format? I appreciate your help! Chris

Anonymous posted on Wednesday, August 31, 2005  10:43 am



Dear Linda, here is the code that I have used for the survival analysis described in my earlier message. Currently this only includes gender as a covariate.
  usevariables are ev7-ev21 m;
  categorical are ev7-ev21;
  missing are all(999);
  idvar is id;
  classes = c(1);
ANALYSIS:
  Type = Mixture Missing;
  Starts 100 10;
MODEL:
  %overall%
  ev7-ev21 on m;
OUTPUT:
  Tech1 Tech8;


I think your questions are more appropriate for support. Please send them along with your license number to support@statmodel.com. 


I want to do simultaneous, discrete-time hazard ratio modelling. Specifically, I am interested in finding the determinants of out-of-wedlock childbearing. Its competing risk is marriage: once you are married, you are no longer at risk of out-of-wedlock childbearing. The recent literature suggests that you have to take into account the correlations between unobserved individual characteristics for marriage and for unpartnered childbearing, because marriage and fertility decisions are made jointly and thus each decision affects the other risk. Therefore, I have to assess the two events simultaneously. Does Mplus do this?


Just one more thing, my data set is survey data with sampling weights. 


It is possible to do a competing risk model in Mplus while incorporating shared unobserved heterogeneity. I discuss both discrete-time competing risk models and models that incorporate unobserved heterogeneity in my guest lecture #12 for Bengt's Statistical Analysis with Latent Variables class. Although I don't show the combination (competing risk with shared unobserved heterogeneity) in an example, it should be reasonably clear how to specify such a model. You will have the option of modeling the shared unobserved heterogeneity with either a latent class or latent factor variable. Video of the lectures is available at http://www.ats.ucla.edu/stat/seminars/ed231e/ Slides are available at http://www.gseis.ucla.edu/faculty/muthen/Handouts.htm You can use Type=complex in conjunction with the Weight option of the variable command to account for the sampling weights. For more information, see Chapter 9 of the Mplus User's Guide. I'm guessing that, given your area of application, you are familiar with Fiona Steele's work. If not, you should definitely have a look at it. In my experience, most of the models she presents have equivalent specifications in Mplus. Katherine Masyn


Thank you so much, Katherine, yes, I had Fiona Steele's recent studies in mind. I'll certainly take a look at the video of your lectures. 


I have data on 474 pregnant smokers randomized to one of three treatments aimed at reducing smoking prevalence during pregnancy. Women were assessed (at each biweekly or monthly hospital visit) from enrollment (baseline) to the end of term of pregnancy (week 40). I want to run a survival analysis that examines time to event (in this case quit/abstinence) and the effects of the three interventions on the time at which the women quit. However, not all women were enrolled at the same week of gestation; thus, some women had the opportunity to be exposed to the program for a longer time. Singer and Willett (2003, pg 595) discuss issues with “late entrants” and the need to reposition individuals based on a common time metric. I have repositioned women based on gestation weeks (0 to 40 weeks) at baseline (entry) to time of event (quit) or end of term without quit (censor). My question is: A. can Mplus handle “late entrants”? B. what specifications are required in the MODEL commands? C. how should the data be set up to run this type of model?


It seems that late entrance can be handled by assigning missing data to the event indicators at the time points before such a person enters. 


Hi, Michael. If I understand your data situation, it would appear that you have a couple of options depending on the time metric you use. If you use the gestation time metric, then you do indeed have late entrants and Bengt's suggestion above would apply. By coding the event indicators that correspond to the gestational periods prior to intake as missing, those cases would not be included in the risk sets for those time periods (in line with Singer & Willett's "common sense" approach on pg. 599). If instead you use the time metric where t=0 corresponds to intake, then there are no late entrants. However, you would most likely want to use gestational period at intake as a covariate. In either case, all the setup work is done at the data coding level and no special model commands are needed. Best, Katherine
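Since the two replies say that late entry is handled entirely at the data-coding level, a short Python sketch of that coding may be useful. The function name, its arguments, and the 999 missing flag are assumptions made for illustration; periods are 1-based and could be, for example, grouped gestation weeks.

```python
MISSING = 999  # hypothetical missing-value flag


def late_entry_indicators(n_periods, entry_period, exit_period, event):
    """Event indicators for one person who enters late.

    Periods before entry_period are missing (not in the risk set),
    periods from entry to exit are 0, the exit period is 1 if the
    event occurred there, and periods after exit are missing.
    """
    if not 1 <= entry_period <= exit_period <= n_periods:
        raise ValueError("need 1 <= entry_period <= exit_period <= n_periods")
    u = [MISSING] * n_periods
    for p in range(entry_period, exit_period + 1):
        u[p - 1] = 0
    if event:
        u[exit_period - 1] = 1
    return u


# Enters in period 3, quits in period 4
print(late_entry_indicators(5, 3, 4, event=True))
# Enters in period 2, censored after period 3
print(late_entry_indicators(5, 2, 3, event=False))
# Present from the start, no event by the end
print(late_entry_indicators(5, 1, 5, event=False))
```

Under the alternative metric where t=0 is intake, every person would have `entry_period=1`, and the period of entry would be carried along as a covariate instead.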


Dear Linda and Bengt, I wonder if I can use Mplus in the following situation. My data are about patients having a given disease, say D. I want to model the time from the onset of disease D until the occurrence of a given event, say E (the death of the patient, for example). One characteristic of the data is that patients enter the study at different levels of severity of the disease (continuous scale: 0-100). So the onset of disease D is not known, and hence the observed duration is not the true one. Is there any similar or closely related problem that has already been analysed with Mplus? If yes, I would be grateful if you could send me some references. Regards, A. Oulhaj


Hi, Abderrahim. Just to make sure I understand before I offer further comment... If we consider this a multiple event process, with onset of disease as the first event and then "E" as the second event, then some of your patients are left-censored with respect to disease onset in that they have already experienced the first event prior to entry into the study, but you do not know the timing of onset beyond knowing that it occurred before study intake. Is that right? Does this apply to all subjects, or only those with disease severity above 0; that is, does a disease severity of 0 represent absence of the disease? Also, could severity of the disease be used as any sort of crude proxy for the time since disease onset, e.g., those with higher severity have, on average, had the disease longer than those with lower severity? Finally, are you working with continuous or discrete-time survival data? Best, Katherine Masyn


Hi Katherine, First of all, I would like to thank you for your interest and your quick reply. My message is split in two because of the maximum post size. Before answering your questions, I will try to be more specific about the disease I am analyzing. It is Alzheimer's disease. Roughly speaking, every subject entering the study (i.e., at the first visit) was administered a neuropsychological test (composed of binary and polytomous items). The score of a given subject is defined as the sum of the answers to the items of this neuropsychological test and lies between 0 and 100. A cutoff point of 85 has already been fixed in previous studies to discriminate between patients and controls, so that if the score is less than 85 the subject is diagnosed as a patient, otherwise as a control. The study is longitudinal, so every subject (control or patient) is assessed every 6 months. I am interested in patients only and in the risk factors for their survival. The severity that I mentioned in my previous message may be defined as severity = 100 - score and is considered an approximation of the "true" severity.


Hi Katherine, Now, let me answer your questions. Answer to question 1: Yes, my patients are left-censored in the way you described (they have already experienced the disease prior to entry into the study, but I do not know the timing of onset beyond knowing that it occurred before study intake). Answer to question 2: Yes, it applies to all subjects. I have very few converted controls (those who were controls and later converted to patients). Answer to question 3: Yes, severity of the disease (or the score) may be used as a proxy for the time since disease onset, in the sense that those with higher severity have, on average, had the disease longer than those with lower severity. Another proxy is information from relatives. We ask them how long ago they noticed the change in the behaviour of the patient. But this information, alone, is not reliable. Answer to question 4: I would like to work with continuous survival time. However, if discrete time is more plausible, then that is fine. Please feel free to send any comments or other questions. Again, many thanks. Regards, Abderrahim


Hi. I also have a survival model application with late entry. Women with pelvic problems were recruited (aged 32 to 48) and observed for up to 8 years. I will use participant age to define the time scale. The outcome event is whether they had a hysterectomy. There are over 1500 women in the data set. The observed hysterectomy rate is about 10%. We will model the effects of time-varying covariates with missing values. My understanding is that, in principle, Mplus 3.12 can handle all 'challenging' aspects of this application (i.e., late entry, time-varying covariates, missing values in the X's). However, after reading the above postings, which apparently referred to earlier versions of Mplus, I am worried that for many participant ages the observed hysterectomy rate will equal zero. Question 1: Is it true that this circumstance will pose a problem for DTSA/DTSMA model estimation in Mplus 3.12? I think the answer is 'yes' and I would be required to collapse participant ages into units larger than one year to ensure nonzero event variation at each time point. Also, a colleague pointed out that using EM to estimate a DTSA with time-varying covariates may not incorporate associations of the covariates across time. Question 2: Are the DTSA/DTSMA models fit by Mplus limited in this way? Thanks. Steve


Hi. By the way, I am assuming that your responses to Abderrahim's interesting questions (April 19, 2007) were delivered 'offline.' If reasonably possible, it would be instructive if they could be posted here. Thanks. Steve 


1. You need variability in the event history indicator. This is not specific to Mplus but rather to survival modeling. 2. The default is for time-varying covariates to covary. I'm not sure if the other posting was answered offline or not at all.


Hi. Will the continuous-time survival model fit by Mplus 4 allow for left truncation/late entry? Thank you. Steve


Can you explain what you mean by this? 


Hi, Linda. Women of varying ages enrolled in an observational study. We want to use participant age to define the time scale of the survival model. Therefore, some women were observed from, say, ages 33 to 39, whereas other women were observed from 43 to 47, etc. That is, participants have different 'start times' in the survival model. My question is whether the continuous-time survival model in Mplus can accommodate this design feature, which is usually referred to as 'late entry' or 'left truncation'. Thank you in advance. Steve


For continuous-time survival, left truncation is not allowed. That being said, I wonder if age is really what you want as the time axis. An alternative might be to use measurement occasion as the time axis and age as a covariate.


Thanks, Linda. Defining the time scale as time-on-study and using age at baseline as a covariate is a possibility. However, we have a strong preference for plotting survival and smoothed hazard across age. This will be most meaningful to our clinical audience. Steve


Hi, Is it possible to combine factor analysis and survival analysis? I have a model with factors and I want to predict a dichotomous variable (decro). Decro is 0 or 1 depending on whether the event happens or not. Three factors are formed from continuous variables, and the third one is regressed on the first two. The variable “decro” depends on the third factor. This is my input model: f1 by y1-y6; f2 by x1-x3; f3 by a1-a4; f3 on f1 f2; f3 on decro; Thank you


I can't see why this would be a problem. 


Hi, I want to do a discrete-time survival analysis with time-varying and time-invariant covariates (4 panel waves). My question is: Do I simply use the conventional logistic regression procedure as described in Example 6.18 of the Mplus manual? Or is a latent class regression (LCR) required as described in Masyn's (2003) dissertation (p. 55, 75)? Is there any major difference between these two approaches? Thanks in advance, Oliver


Follow Example 6.18. At the time of Katherine's dissertation this had to be done as a one-class mixture model, which is the same as a regular analysis. You may find the following paper of interest: Muthén, B. & Masyn, K. (2005). Discrete-time survival mixture analysis. Journal of Educational and Behavioral Statistics, 30, 27-58.


I've recently conducted an experience-sampling study. Participants answered up to five questions, though they may have opted not to answer (censored). Participants completed up to 210 observations of these five samples. There's an observation-invariant variable that I want to use both as a predictor of time and as a moderator of the effects of which question was asked (I plan on dummy-coding this). Time is measured in seconds. Can Mplus do this? Can I treat this as continuous-time survival? Can anyone point me to example syntax? Thanks, Steve


There are several examples of continuous-time survival in Chapter 6 of the User's Guide. I'm not sure how this applies to your study.

Ken posted on Tuesday, April 22, 2008  6:20 pm



Is it possible to fit a continuous-time survival analysis with a stratified baseline hazard, that is, one that differs by group? If so, how?


You would do this using the KNOWNCLASS option and TYPE=MIXTURE where the classes are the groups. 

Ken posted on Wednesday, April 23, 2008  4:18 pm



Does this require the mixture add-on, even though I'm not fitting a mixture model? I'm in the process of sorting out the requirements for a new project.


Yes, this would require that. 

Michelle posted on Friday, June 26, 2009  7:32 am



Hi - Sorry to post in two threads, but I wanted to keep questions under their correct topics. I am trying to use a latent class growth-survival analysis model similar to Example 8.15 in the Mplus manual or the Muthen & Masyn (2005) paper. We have 4 waves of data over 20 yrs. We'd like to use latent classes of healthy aging (defined by a score (y1-y4)) to predict mortality (u1-u4 in the discrete-time model). We could also model mortality continuously, but I'd like to get the discrete model working b/c of time-varying covariates. In the discrete-time setup, the first wave of data has no deaths. Do I use only u2-u4, since these are the only instances of the mortality variable in which there is variation? Do I need to somehow then specify to Mplus that y1 and u2 are not at the same time point? How might I do this? Thanks for any guidance! Michelle


Yes, do DTSA and use u2-u4; regress u2 on both y1 and y2, u3 on y3, and u4 on y4.


Hi, When trying to estimate a basic discrete-time survival model where all individuals are alive at time 1, I receive the warning that "One or more variables have a variance of zero". Is it possible to include the initial time period, when no individual has yet experienced the event of death? Thanks.


No, this time period should not be included. 


Example 6.21 is a continuous-time survival analysis using Cox regression. The data file Ex6.21.dat shows 3 columns of data: time to event (t), covariate (x) score, and censoring category (tc). For cases where tc = 1 (right censored) there is a value for t. How can there be a value for t when an event hasn't happened? More importantly, for my analysis, I have time (days) from hospitalisation to death (t) and a time-censored variable: dead (tc=0)/alive (tc=1). What value(s) of t should I put for cases that are still alive at the end of the study? Many thanks, Peter


The value you see is the time the person left the study. So for your example this is the days hospitalized at the end of the study for those not dead. 
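The answer above amounts to a simple rule for building the (t, tc) pair for each case. A minimal Python sketch follows; the function name and the dates are hypothetical, and the coding tc=0 for an observed death and tc=1 for right censoring follows the question's own convention.

```python
from datetime import date


def survival_record(admitted, died, study_end):
    """Return (t, tc) for one patient.

    t  = days from hospitalisation to death, or to the end of the
         study for patients still alive (right censored).
    tc = 0 if the death was observed, 1 if right censored.
    """
    if died is not None and died <= study_end:
        return (died - admitted).days, 0
    return (study_end - admitted).days, 1


study_end = date(2010, 6, 30)  # hypothetical end of follow-up
# Died 30 days after admission
print(survival_record(date(2010, 1, 1), date(2010, 1, 31), study_end))
# Still alive at study end: t is the follow-up time so far
print(survival_record(date(2010, 3, 1), None, study_end))
```

If patients can also leave the study before it ends, the same idea would apply with the date of last contact in place of `study_end`.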


Thanks Bengt. I want to predict time to death from GMM class using Cox regression:
Variable:
  Names are sex age y94 y95 y96 y97 y98 y99 tc t;
  Usevariables are y94-y99 age tc t;
  Missing are all (999);
  classes = c(2);
  survival = t(ALL);
  timecensored = tc (1 = NOT 0 = right);
Analysis:
  Type = mixture;
  basehazard = off;
  starts = 100 5;
Model:
  %overall%
  i s | y94@0 y95@1 y96@2 y97@3 y98@4 y99@5;
  t on age c;
  %c#1%
  i s | y94@0 y95@1 y96@2 y97@3 y98@4 y99@5;
  [i*1.9 s*.2];
y94 to y99 are continuous. I get errors: "The following MODEL statements are ignored: * Statements in the OVERALL class: t ON C#1 One or more MODEL statements were ignored. These statements may be incorrect." Can I use Cox regression to predict time to death from GMM group membership? Peter


You should not say "t on c". Instead the intercept of t will vary across the c classes. 

Maja Wiest posted on Tuesday, September 07, 2010  3:59 am



I have a question regarding continuous-time survival analysis. I am trying to set up a model in which survival is predicted by a latent difference score and control variables. Mplus gives the following error: FATAL ERROR THIS MODEL CAN BE DONE ONLY WITH MONTECARLO INTEGRATION.
Variable:
  names = ls11 ls12 ls13 ls14 ls15 ls21 ls22 ls23 ls24 ls25 status age sex SurvMon school;
  usevariables are status age sex SurvMon school P1LS1 P2LS1 P1LS2 P2LS2;
  Survival = SurvMon (ALL);
  Timecensored = status;
  missing are all (99);
  categorical are sex school;
Define:
  P1LS1 = (ls14 + ls12 + s15)/3;
  P2LS1 = (ls11 + ls13)/2;
  P1LS2 = (ls24 + ls22 + s25)/3;
  P2LS2 = (ls21 + ls23)/2;
Model:
  LS1 by P1LS1 P2LS1 (1) P1LS2@1 P2LS2 (1);
  Diff by P1LS2 P2LS2 (1);
  is2 by P2LS1 P2LS2@1;
  is2 with LS1@0 diff@0 age@0;
  [P1LS1@0 P1LS2@0];
  [P2LS1 P2LS2] (2);
  [LS1 Diff];
  SurvMon on age sex school Diff;


Add INTEGRATION=MONTECARLO; to the ANALYSIS command. 


I have a question regarding Cox regression analysis. This is my first time using a Cox regression model in Mplus. I am having trouble interpreting my findings, as they are quite different from what I get when I run the same model in another statistical program. My model is specified as:
  Usevariables = El5ycap Rcd5ycap DayUse VioHx priortx;
  Survival = El5ycap (all);
  Timecensored = Rcd5ycap (0 = not 1 = right);
Analysis:
  Basehazard = OFF;
Model:
  El5ycap on DayUse VioHx priortx;
The Mplus results are as follows:
  Covariate   Estimate   S.E.    Est./S.E.   Two-Tailed P-Value
  DAYUSE      0.054      0.076   0.707       0.479
  VIOHX       0.003      0.078   0.038       0.970
  PRIORTX     0.113      0.080   1.413       0.158
Running the same model in the other statistical program I get the following:
  Covariate   B       S.E.   Odds Ratio   P-Value
  DAYUSE      .018    .118   1.018        1.018
  VIOHX       .315    .113   1.370        .005
  PRIORTX     -.055   .113   .946         .625
It may be that I am incorrectly specifying the model or that I am failing to understand how to convert the Mplus estimates into the scales I am more familiar with. I apologize for the potential simplicity of my question. Any help you can give will be greatly appreciated.


It's not possible to understand the problem without more information. Please send the outputs from the two programs, your data, and your license number to support@statmodel.com.

David Tong posted on Monday, November 15, 2010  5:22 am



Is it possible to do a continuous-time repeated events survival analysis with Mplus? The outcome variable is time to hospitalization, which unfortunately happens multiple times per individual. I wish to test whether time-invariant and/or time-varying variables significantly influence survival functions. If Mplus cannot handle this as a single-level model, would a two-level model work, with all the survival curves for each person grouped together under that individual?


You can do it as a repeated events model or as a two-level model. It would be easier to do it as a two-level model, since the number of hospitalizations varies between individuals. When done as a repeated events model, the data should have missing values for those individuals who have fewer than the maximum number of hospitalizations. That is a bit hard to set up, although you can do it with the DATA LONGTOWIDE command in Mplus; see the Mplus User's Guide. The two-level version should be straightforward.
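The padding described for the repeated events setup can be illustrated outside Mplus. The Python sketch below is only a stand-in for that reshaping idea and does not reproduce the behaviour of the DATA LONGTOWIDE command; the function name, the 999 missing flag, and the toy spells are all hypothetical.

```python
MISSING = 999  # hypothetical missing-value flag


def long_to_wide(spells):
    """Reshape repeated spells into one padded row per person.

    spells: dict mapping person id -> list of (time, censored) tuples,
            in spell order.
    Returns dict mapping person id -> [t1, tc1, t2, tc2, ...], padded
    with MISSING up to the largest number of spells in the data.
    """
    max_spells = max(len(s) for s in spells.values())
    wide = {}
    for pid, person_spells in spells.items():
        row = []
        for t, tc in person_spells:
            row.extend([t, tc])
        # Pad people with fewer spells than the maximum
        row.extend([MISSING] * (2 * (max_spells - len(person_spells))))
        wide[pid] = row
    return wide


spells = {
    "A": [(120, 0), (45, 0), (300, 1)],  # two hospitalizations, then censored
    "B": [(400, 1)],                      # never hospitalized during follow-up
}
wide = long_to_wide(spells)
for pid, row in wide.items():
    print(pid, row)
```

In the two-level alternative the data would stay in the long layout, one row per spell with a person identifier as the cluster variable, so no padding would be needed.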

David Tong posted on Tuesday, November 16, 2010  6:59 pm



Thank you very much. The two-level version would be much easier to set up and suits my current needs. If the individuals were grouped into a larger set, such as communities, then I would be forced to model it as repeated events for each individual and then group the individuals into communities for the second level. A three-level capability in Mplus would be amazing. Maybe for a future version.

QianLi Xue posted on Thursday, February 24, 2011  11:35 am



Hi, I'm trying to fit a joint latent class (C) and continuous-time survival (T) model, where C is the predictor of T (see coding below). (1) If I suspect non-proportional hazards, we can do the following, right?
  SURVIVAL = dtime1(7*1.5);
  BASEHAZARD = OFF(UNEQUAL);
(2) How do I interpret the output ("FRAILTY" is C)?
  Categorical Latent Variables
  Means
  FRAILTY#1   0.176   0.360   0.488   0.625
  FRAILTY#2   0.683   0.308   2.216   0.027
  FRAILTY#3   0.627   0.250   2.513   0.012
(3) Given that the log-rank test is no longer valid for non-proportional hazards, how can I test the effect of C on T? Thanks!
  USEVARIABLES ARE nhis1 nhis2 nhis3 nhis4 nhis5 nhis6 nhis7 death dtime1;
  MISSING ARE ALL (9999);
  CATEGORICAL ARE nhis1 nhis2 nhis3 nhis4 nhis5 nhis6 nhis7;
  CLASSES = frailty(4);
  SURVIVAL = dtime1(7*1.5);
  TIMECENSORED = death (1 = NOT 0 = RIGHT);
ANALYSIS:
  TYPE IS MIXTURE;
  ! Assuming baseline hazard is class variant
  BASEHAZARD = OFF(UNEQUAL);
  STARTS = 500 50;
  STITERATIONS=20;


(1) You don't need the command SURVIVAL = dtime1(7*1.5); That command specifies a parametric stepwise model for the baseline, so it has nothing to do with proportional hazards. The command BASEHAZARD = OFF(UNEQUAL); does give a non-proportional hazard effect from frailty to dtime1. (2) This simply gives you the distribution of the class variable frailty. (3) Here you will need to use a parametric model: SURVIVAL = dtime1(7*1.5); BASEHAZARD = ON(UNEQUAL); Then use likelihood ratio testing or MODEL TEST (Wald test) to test equality of the baseline hazard parameters across classes.

QianLi Xue posted on Monday, February 28, 2011  5:28 pm



Thanks so much for your prompt response! Just want to clarify: Re (2): so are they on the logit scale? If so, we can do the inverse calculation to get probabilities of being in each of the frailty classes, right? But the results don't seem to be right. Re (3): Version 6 of Mplus does not allow the "unequal" option following the BASEHAZARD=ON statement. However, the output gives class- and interval-specific hazard estimates, so I guess it assumes "unequal" by default. If so, how can I set it to equal? And what if I only want to set the hazard to be equal for a subset of frailty classes? Here is a new question: When I fit a conventional Cox model, the output gives the following for each class. How do I interpret the "Means" here? Latent Class 1 Means DTIME1 0.535 0.407 1.317 0.188 Thanks in advance for your help!


(2) The model is multinomial logit, as in http://en.wikipedia.org/wiki/Multinomial_logit Mplus will compute these for you on the probability scale. You can find these in the output. (3) You can set them to be equal in classes one and two as follows: %C#1% [dtime1#1-dtime1#1] (p1-p8); %C#2% [dtime1#1-dtime1#1] (p1-p8); (4) The means have the same meaning as if you regress on the constant 1 (in the Cox regression). Alternatively, you can interpret this coefficient as the effect of the class variable, i.e., as the regression coefficient of the survival variable on the class variable. These means can only be identified, though, when the baseline is held equal across classes, since otherwise the effect of the class is in the entire baseline hazard rather than a shift only.
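For point (2), the inverse calculation from multinomial logits to class probabilities can be sketched in Python. This assumes the usual parameterization in which the last class is the reference with its logit fixed at zero; the function name and the example logits are hypothetical and are not the estimates from the output quoted earlier in the thread.

```python
import math


def class_probabilities(logits):
    """Convert K-1 multinomial logits into K class probabilities.

    The last class is treated as the reference class (logit 0).
    """
    full = list(logits) + [0.0]          # append the reference class
    top = max(full)                      # subtract the max for numerical stability
    exps = [math.exp(v - top) for v in full]
    total = sum(exps)
    return [e / total for e in exps]


# Hypothetical logits for classes 1-3 of a four-class model
probs = class_probabilities([math.log(2.0), 0.0, 0.0])
print([round(p, 3) for p in probs])
```

With these made-up logits the first class has probability 0.4 and each of the other three has 0.2. A common slip in doing this by hand is to apply the binary inverse logit to each coefficient separately, which does not yield probabilities that sum to one.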

David Kerr posted on Thursday, March 17, 2011  10:46 am



My questions build on the 11/16/10 thread between Drs. Tong and Asparouhov. I'm hoping to run continuous-time multiple-spell survival analysis predicting depressive episode onsets and offsets, and am predicting there are latent classes defined in part by the regression of time-to-event on a covariate. 1) Am I reading example 8.16 in the user's guide correctly when I conclude that class is identified in part by the strength of the regression of t on x? Or is c just identified based on the survival function, and then regressed on x? 2) Tong's example of 2 events led to Asparouhov's suggestion that a 2-level model be used. Would 4 events (e.g., onset, remission, relapse, remission) require a 4-level model? I take it from Tong's response that this is not currently possible. If so, would you suggest modeling this as repeated events instead of multilevel? 3) Combining questions 1 and 2: Is it going to be possible to run a repeated-events mixture model, where class is identified in part by how strongly a covariate is associated with time-to-events (i.e., time to onset, and then time to remission, etc.)? Thanks very much. 


1) In example 8.16, C is identified by the categorical indicators U and also in part by T on X as well as T itself (as the mean of T varies across the two classes). 2) It will not require a 4-level model. It would be a two-level model; however, in your context I think that would not be appropriate, because you have different types of survival variables, i.e., onset and remission, rather than repeated observations of the same type of variable. Even if you decide to do a two-level model with multiple repeated observations, I think it should be bivariate, with the two variables onset/relapse and remission. In any case I would recommend simply doing a repeated-events model if possible. 3) It is possible, but it may be difficult to identify the class only by the covariates. 


3) Since you have multiple survival variables it should not be hard to identify classes, using basically the survival variables as indicators. 

David Kerr posted on Friday, March 18, 2011  9:21 am



This is all very helpful. Thank you. 

Lena Herich posted on Thursday, March 31, 2011  8:28 am



Hello! I am trying to fit a continuous-time survival analysis model with latent classes as predictors. My model setup is given below. What I am trying to do is investigate the effect of class 2 by estimating the intercept of the time variable. I tried the model in a couple of different settings, but the SE is always given as 0.000 and consequently Est/SE as 999. Can you tell me what might be the problem? UseVariables= cov1 cov2 cov3 u1 u2 u3 u4 u5 cens time ; Survival = time (All); Timecensored = cens (1 = NOT 0 = RIGHT); Classes= c(2); Analysis: Type=Mixture missing; Algorithm = Integration; Basehazard = Off; Starts = 50 10; process = 4(starts); Model: time on cov1 cov2 cov3; c#1 on cov1 cov2 cov3 %c#1% [time@0]; %c#2% [time*0]; 


This looks ok, so to diagnose the problem please send input, output, data and license number to support@statmodel.com. I assume you have no problem doing a regular single-class analysis. 


Hi, I would like to conduct a discrete-time survival analysis similar to the one in Example 6.19 in the user's guide, but I would like to test the assumption that the covariate x has the same influence at each time point (x is measured at time 1 and I believe its influence will decrease over time). Can this assumption be tested in discrete-time survival analysis? If so, any leads as to how would be greatly appreciated. Thanks! 


An alternative to the specification in Example 6.19 is: MODEL: u1-u4 ON x (1); You can then compare this to: MODEL: u1-u4 ON x; 
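One way to carry out that comparison is a likelihood ratio test on the two loglikelihoods. The Python sketch below is an illustration only: the loglikelihood values are hypothetical, df = 3 assumes four indicators (four free slopes versus one), and it assumes plain ML loglikelihoods. With the MLR estimator the difference test would, as far as I know, need the scaling correction factors from the output, which this sketch does not apply.

```python
import math

def chi2_sf(x, df):
    """Upper-tail probability of a chi-square variate with integer df.

    Uses Q(a + 1, y) = Q(a, y) + y**a * exp(-y) / Gamma(a + 1) with y = x / 2,
    starting from Q(1, y) = exp(-y) for even df and
    Q(1/2, y) = erfc(sqrt(y)) for odd df.
    """
    y = max(x, 0.0) / 2.0
    if df % 2 == 0:
        a, q = 1.0, math.exp(-y)
    else:
        a, q = 0.5, math.erfc(math.sqrt(y))
    while a < df / 2.0 - 1e-9:
        q += y ** a * math.exp(-y) / math.gamma(a + 1.0)
        a += 1.0
    return q

def lr_test(ll_constrained, ll_free, df):
    """Likelihood ratio statistic and p-value for two nested models."""
    stat = 2.0 * (ll_free - ll_constrained)
    return stat, chi2_sf(stat, df)

# Hypothetical loglikelihoods: equal slopes (constrained) vs. free slopes.
stat, p = lr_test(ll_constrained=-512.40, ll_free=-508.10, df=3)
print(f"LR chi-square = {stat:.2f}, df = 3, p = {p:.4f}")
```

A small p-value would speak against holding the effect of x equal across time points.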


thanks for the very speedy reply Linda as always! 


Actually, referring back to Example 6.19, I realize I have some follow-up questions. In your alternative specification, does one simply do away with f? Could one as another alternative retain f but freely estimate the loadings on it for all but the first loading (which would still be constrained to zero) and then compare that model to the one given in Example 6.19? Thanks again Linda! 


Yes, you could also do that but fix the factor loading at one not zero. 


Yes, thanks Linda, I meant constraining that first loading to one. And am I correct in thinking that in your alternative specification (u1-u4 on x) one simply does away with f? 


Yes. 


Hi, I have a question about continuous-time survival analysis with latent classes. I am using Mplus Version 5. Does this mean that the baseline hazard function will always be estimated completely unconstrained in each latent class, or is there a way in which I can constrain it to be equal across classes? 


You should update your Mplus version. Starting with Version 6 we have the option to hold the basehazard equal (by default) across the classes or unequal using the commands basehazard=OFF(equal); basehazard=OFF(unequal); In version 5 it is not possible to make the basehazard equal across class in the PH Cox model, i.e., it will always be completely unconstrained. 


Hi, I'm interested in doing a survival analysis with 3 time points. I would like to look at the effect of two covariates, one continuous and one categorical (gender), and their interaction on survival. Obviously gender is fixed, but I would like to look at its varying effect at each of the time points. My problem is that at my second time point, all of my participants that experience the 'event' are female. What can I do in this case? I don't want to collapse time points as I only have 3 to begin with! I read in a post above that I could include a male with the event and weight them 0, but I'm not sure how to do this. Alternatively, I could make the effect of gender invariant at all time points, and this fixes the output (in terms of error messages), but this is not really what I want to test. Thanks, Sarah 


A discrete-time survival model can be specified as f BY u1-u3@1; f ON x; f@0; or u1-u3 ON x (1); The data are set up in a particular way so that with maximum likelihood estimation, a discrete-time survival model is estimated. If you specify the following, I am not sure if you will still obtain a discrete-time survival model: u1-u3 ON xcont (1); u1 ON gender; u3 ON gender; 
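On the data setup: as I understand the usual convention (an assumption here, since the post does not spell it out), each person gets one binary indicator per time point, coded 0 while still at risk without the event, 1 in the period where the event occurs, and missing in every period after the event or after censoring. A minimal Python sketch of that recoding, with a hypothetical helper name and 999 as the missing value flag:

```python
def dtsa_indicators(last_period, had_event, n_periods, missing=999):
    """Build the event indicators u1..uT for one person.

    last_period: last period (1-based) in which the person was observed.
    had_event:   True if the event occurred in that period,
                 False if the person was censored after it.
    """
    row = []
    for j in range(1, n_periods + 1):
        if j < last_period:
            row.append(0)                      # at risk, no event yet
        elif j == last_period:
            row.append(1 if had_event else 0)  # event, or still event-free
        else:
            row.append(missing)                # no longer in the risk set
    return row

# Hypothetical people with three time points, as in the question above.
print(dtsa_indicators(2, True, 3))   # event in period 2
print(dtsa_indicators(2, False, 3))  # censored after period 2
print(dtsa_indicators(3, False, 3))  # event-free through period 3
```

With this coding, each indicator contributes to the likelihood only for people still at risk in that period, which is what makes the logistic regressions on u1-u3 act as a hazard model.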


Thanks for your response, Linda. I did wonder whether I could do that (i.e., exclude 'u2 ON gender'). Why exactly is this problematic? Thanks. 


The problem is that this is not a standard survival model. You are still explaining the hazard at each time point, but you may have a difficult time with reviewers given that you have only three time points and the problem with gender at time two. 


Ok, thanks so much. 


Hi, I have another question about a similar discrete-time survival model as mentioned above (1 continuous and 1 dichotomous covariate). I'm interested in looking at the effect of the interaction between these two covariates on survival at each time point. If I include something like this: u1-u3 ON xcont; u1-u3 ON gender; u1-u3 ON contXgender; and have a significant interaction for 1 or more of the time points, how do I then plot the hazard probabilities? I have been using the formulas provided in one of Dr. Masyn's lectures to get my hazard probabilities, but if I use the effect estimates from the main effects of xcont and gender to plot (I'm just plotting +1 SD for xcont), it doesn't look like I have an interaction at all. Any help would be appreciated! 


An interaction plot should have the hazard on the y axis, the continuous covariate on the x axis, and the binary covariate should be represented by a line in the plot for each value. 


Thanks Linda. How, though, can I incorporate time into the graph? That is, I would like to have my discrete time points on the x axis, hazard probability on the y axis, and plot 4 separate lines for males/+1SD, males/-1SD, females/-1SD, females/+1SD. Thanks! 


Ok, I think I've figured out my problem. I wasn't including the effect estimates for my interaction in my hazard probability equation. I've now done, for a certain time point (e.g., u2): hazard(u2, male, +1SD xcont) = 1/(1+exp(thresh for u2 - (beta for male)*(1) - (beta for xcont)*(+1SD xcont) - (beta for interaction)*(1*+1SD xcont))) Does that make sense? When I plot this it looks right. Thanks. 


Let's focus on your expression (beta for male)*(1) - (beta for xcont)*(+1SD xcont) - (beta for interaction)*(1*+1SD xcont). The easiest way to think about this is to think of regular linear regression with one binary x1 and another continuous x2: y = a + b1*x1 + b2*x2 + b3*x1*x2 + e. With x1=0 you have: y intercept = a, y slope on x2 = b2. With x1=1 you have: y intercept = a+b1, y slope on x2 = b2+b3. This gives you two lines in your plot of y on x2, one for x1=0 and one for x1=1. Now, just drop a and instead add in the threshold as you had it, do the exp that you have, and you can plot two hazard curves. 
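To make the recipe concrete, and to put time on the x axis as asked earlier, the following Python sketch computes the hazard at each time point for the four groups. All estimates are hypothetical, xcont is assumed to be centered so that the two plotted values are -1 SD and +1 SD, and the formula is the logit form used in this thread: hazard = 1/(1+exp(threshold - linear predictor)).

```python
import math

def hazard(threshold, b_male, b_xcont, b_inter, male, xcont):
    """Hazard probability at one time point from logit-scale estimates."""
    eta = b_male * male + b_xcont * xcont + b_inter * male * xcont
    return 1.0 / (1.0 + math.exp(threshold - eta))

# Hypothetical estimates: one threshold and one set of slopes per time point.
estimates = {
    "u1": dict(threshold=2.0, b_male=0.3, b_xcont=0.5, b_inter=-0.4),
    "u2": dict(threshold=1.5, b_male=0.2, b_xcont=0.6, b_inter=-0.5),
    "u3": dict(threshold=1.0, b_male=0.1, b_xcont=0.4, b_inter=-0.2),
}

sd = 1.0  # hypothetical SD of the centered continuous covariate
groups = {
    "females/-1SD": (0, -sd),
    "females/+1SD": (0, +sd),
    "males/-1SD": (1, -sd),
    "males/+1SD": (1, +sd),
}

# One line per group: hazard probability at u1, u2, u3.
for label, (male, xcont) in groups.items():
    line = [hazard(male=male, xcont=xcont, **est) for est in estimates.values()]
    print(label, [round(h, 3) for h in line])
```

Each printed row is one line of the plot. Leaving out the interaction term (setting b_inter to 0) makes the male and female lines differ by a constant on the logit scale, which is why a plot built only from the main effects shows no interaction.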


Dear Dr. Muthen, I would like to simultaneously model the onset of alcohol, cigarettes and marijuana among adolescents. I am not assuming competing risks (for example, if an adolescent experiences the onset of alcohol, they still remain at risk for the onset of cigarettes and marijuana). I am not sure about my coding. Would the following code (assuming parallel effects) be correct? The dummy indicators for the events (including censoring) are: alcohol: a6-a17 cig: c7-c17 marijuana: m11-m17 code: classes = c(1) !indicators a by a6-a17@1; c by c7-c17@1; m by m11-m17@1; a with c; a with m; m with c; !effect of covariates a on x1 x2 x3; c on x1 x2 x3; m on x1 x2 x3; Thank you, Fernando 


That looks correct. 


Thank you, Fernando 

QianLi Xue posted on Thursday, October 17, 2013  12:37 pm



Do you have example code for doing multiple-group discrete-time survival analysis? 


I don't. It would involve using the CLASSES and KNOWNCLASS options in conjunction with TYPE=MIXTURE with Example 6.19. When all classes are known, it is the same as multiple group analysis. See also Example 7.21. 

Trutz Haase posted on Sunday, February 23, 2014  9:05 pm



Hi, I have a few questions regarding discrete-time survival analysis. I would like to use a continuous latent variable to express the hazard of death, with piecewise thresholds to capture the changing influence of the explanatory variables. My question (1) is whether this is appropriate or whether I should consider alternative specifications? I intend to combine this with a latent categorical variable to capture unmeasured heterogeneity, using a disturbance term for the continuous latent variable to capture frailty. My hypothesized model looks like this: CLASSES = c (2); USEVARIABLES = u1-u24 x1-x4; CATEGORICAL = u1-u24; MISSING = ALL (999); TYPE = MIXTURE; %OVERALL% f BY q1-q24@1; f ON x1-x4; c ON x1-x4; [u1$1]; [u2$1-u4$1] (1); [u5$1-u12$1] (2); [u13$1-u24$1] (3); %C#1% f on x1-x4; f; %C#2% f on x1-x4; f; (2) Am I correct in thinking that even if there is no empirical support for multiple classes, the model should still be specified as a mixture model with a latent categorical variable (K=1)? Many thanks, Trutz 


You can do piecewise that way. The frailty model with a residual variance for f is poorly identified and should be avoided. Fix f@0 as in UG ex 6.19. Two-class DTSA is possible, but is hard to find support for. If you think there is only 1 class there is no reason to run it as a mixture model. 

Trutz Haase posted on Tuesday, February 25, 2014  3:41 am



Dear Prof. Muthen, many thanks for your observations, which are very useful. Would you therefore recommend following the example 6.19 in the UG or adopting the approach referred to in previous messages in this list, with CLASSES = c (1); and TYPE = MIXTURE;? Yours, Trutz 


Follow ex 6.19. 

Trutz Haase posted on Thursday, February 27, 2014  1:00 pm



Dear Prof. Muthen, many thanks for your reply. When I follow the example, my model converges and the results are interpretable. I would like to include a complex mediation structure upstream of the DTS outcome, using two ordinal measures (stage of disease, aggressiveness of treatment program): background characteristics -> stage; background char. + stage -> treatment; background + stage + treatment -> survival outcome. I would then like to estimate the indirect effects along these pathways and have specified these using VIA. The error messages prompt me to specify PARAMETERIZATION = THETA; and this is my only option under ANALYSIS. I get a series of warnings, e.g.: THE SAMPLE CORRELATION OF u22 AND u16 IS 0.997 DUE TO ONE OR MORE ZERO CELLS IN THEIR BIVARIATE TABLE. INFORMATION FROM THESE VARIABLES CAN BE USED TO CREATE ONE NEW VARIABLE. The model estimation terminates normally, and I obtain chi-square, CFI, RMSEA etc., which is useful. I assume that lack of fit in this kind of model is primarily due to the proportional hazards assumption in the DTSA part. The results appear to be in order. My question is: can I conclude that the THETA parameterization (which uses WLSMV here) is suitable? Many thanks once again, Trutz 


For DTSA you need to use ML. WLSMV cannot be used. What you are trying to do is from a methodological point of view a research topic. It is not clear how to define indirect effects with DTSA. You can perhaps consider an effect on the factor f behind the DTSA indicators. But it looks like you also have a binary treatment mediator, which further complicates the modeling and the effect interpretation. 


Is it possible in Mplus to fit a model with multiple survival outcomes? I am interested in time to events A, B and C, and these may occur in any combination and order? In addition, I'd like to estimate the correlation amongst them. 


Yes, this is possible. See 3.1 Frailty Models in http://statmodel.com/download/SurvivalJSM3.pdf You can use f by S1 S2 S3; to correlate the variables when these are the survival variables – there is no actual correlation. 

benedetta posted on Thursday, April 30, 2015  8:02 am



Dear Prof. Muthen, I have a question on the different assumptions behind DTSA implemented by introducing either (1) a categorical latent variable with one class or (2) a continuous latent variable. The following specifications give the same outcome: (1) MODEL: %overall% d1-d10 on X (x); or (2) MODEL: f by d1-d10@1; f@0; f on X (x); My question is, if I want X to have a time-varying effect on the events, in the case of the categorical latent variable (1) it will be: MODEL: %overall% d1-d10 on X (x1-x10); How can I obtain the same result using a continuous latent variable? Is it possible at all to have a time-varying effect within this framework? Thank you very much in advance! 


If you have MODEL: f by d1-d10@1; f@0; f on X (x); but want to have time-varying x effects, you can say MODEL: f by d1-d10@1; f@0; d1-d10 on x; But then you are not bringing f into the model. You cannot both say f on x; and d1-d10 on x; because that won't be identified. 

WenHsu Lin posted on Wednesday, November 11, 2015  12:34 am



Dear Prof. Muthen, I also have a similar question. So, if I want to see the effects of x on the survival function, I need to specify f on x. However, this also assumes that the effect of x on survival (hazard) is invariant. Is this correct? So, if I want to have varying effects, I have to specify d1-d10 on x, right? For that matter, each of these results gives an odds ratio (logistic regression on a categorical outcome), right? Second, if I have a distal outcome that will be influenced by the survival function, I will write z on f. Is this correct? 


All correct. 
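As a small illustration of the two points confirmed above, the Python sketch below turns a logit-scale slope into an odds ratio with an approximate 95% interval, and builds survival curves under a time-invariant effect (the f on x case). The thresholds, slope and standard error are hypothetical, and the hazard uses the same logit form discussed earlier in this thread, 1/(1+exp(threshold - beta*x)).

```python
import math

def odds_ratio(beta, se, z=1.96):
    """Odds ratio and approximate 95% interval from a logit-scale slope."""
    return math.exp(beta), math.exp(beta - z * se), math.exp(beta + z * se)

def survival_curve(thresholds, beta, x):
    """Survival probabilities S_t = product over j <= t of (1 - h_j)."""
    curve, surviving = [], 1.0
    for tau in thresholds:
        h = 1.0 / (1.0 + math.exp(tau - beta * x))  # hazard in this period
        surviving *= 1.0 - h
        curve.append(surviving)
    return curve

# Hypothetical estimates: five thresholds and one time-invariant slope.
thresholds = [2.5, 2.2, 2.0, 1.8, 1.7]
beta, se = 0.6, 0.2

print([round(v, 3) for v in odds_ratio(beta, se)])
print([round(s, 3) for s in survival_curve(thresholds, beta, x=0.0)])
print([round(s, 3) for s in survival_curve(thresholds, beta, x=1.0)])
```

With time-varying effects (d1-d10 on x) the same calculation applies, except that each period would use its own slope and therefore its own odds ratio.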


I am trying to estimate a model with correlated hazards. As an analogy, my application is similar to a cohort of individuals being repeatedly infected with non-lethal diseases. 1) Is it possible to estimate a model with multiple such hazards, assumed correlated, with a common set of predictors? 2) Additionally, could these be specified as repeating events? Any pointers to examples with Mplus would be great. 


Maybe the concept of competing risks is relevant: http://link.springer.com/chapter/10.1007/9781441966469_9 https://www.statmodel.com/download/StoolmillerSnyder2013.pdf Also consider Section 3.1 in http://statmodel.com/download/SurvivalJSM3.pdf and user's guide example 12.10. Repeating events or two-level models can be estimated with Mplus. 


Thank you for the references, will look them up. I am interested in correlated hazards that are non-competing; in my example above it is similar to an individual being infected with both malaria and hepatitis C. These could have common predictors, but could be non-competing. 

sangwonkim posted on Tuesday, April 19, 2016  7:48 am



Dear Prof. Muthens, I am using a panel dataset (collected yearly) and running a discrete-time survival analysis with time-varying covariates (pm4-pm8, please see the syntax below). If time-varying covariates are introduced into the model, are different odds ratios computed for each time point, like running a different logistic regression at each time point? Or could I obtain a kind of composite odds ratio? If yes, which parts of the syntax should I modify? Any references/resources that you could share would be greatly appreciated. Thank you! _________ Title: model 2 data: file is DTSA_4_recoding.dat; variable: names are gender income AVIC4 AVIC5 AVIC6 AVIC7 AVIC8 pedu_re adult pm4 pm5 pm6 pm7 pm8; categorical are AVIC4 AVIC5 AVIC6 AVIC7 AVIC8; missing are all(9); classes=c(1); analysis: type=mixture; ALGORITHM=INTEGRATION; model: %overall% avic4-avic8 on gender income adult pedu_re; avic4 on pm4; avic5 on pm5; avic6 on pm6; avic7 on pm7; avic8 on pm8; OUTPUT: STANDARDIZED cinterval; 

QianLi Xue posted on Wednesday, May 31, 2017  10:01 am



Here is the simulation code for example 6.20 in the user's guide, which is included in the Mplus Examples Monte Carlo Counterparts folder: title: this is an example of a continuous-time survival analysis using the Cox regression model montecarlo: names = t x; generate = t(s 20*1); hazardc = t (.5); survival = t; nobs = 50; nreps = 1; save = ex6.20.dat; model population: x@0; x@1; [t#1-t#21*1]; t on x*.5; analysis: model: t on x*.5; output: tech8 tech9; In the model population statement: should the 1st line "x@0" be changed to "[x@0]" for setting the mean of x to zero? 


Yes, this is a typo. The typo has no effect because the x@1; line supersedes it and the mean defaults to zero. 


Dear Prof. Muthén, I would like to run a survival analysis using latent growth curve parameters (i.e., slopes) to predict mortality (i.e., survival). Unfortunately, I have an error in the model command with regard to my SURVIVAL option: *** ERROR in MODEL command: Variances for count variables are not currently defined. Variance given for: SUR_Y In the input, I specify the SURVIVAL option as follows: SURVIVAL = sur_y (ALL); TIMECENSORED = Censor2 (0=NOT 1=RIGHT); sur_y is the years of survival. I do not understand the error message above, and tech4 is not available. How can the variances for sur_y be defined? Is it required to add a further declaration? I would be grateful for any comments on this. Thank you very much in advance! 


Please send your output to Support along with your license number. 

Anna Austin posted on Monday, September 04, 2017  8:32 am



In GMM and LCGA, is there a way to censor individuals when they are no longer at risk for the outcome? Or should these individuals have their values of the variable of interest set to missing? 


Scoring them as missing is one approach. I think there might be other ones involving various modeling but I have nothing concrete to suggest. 
