Multilevel data in discrete-time surv...
 Anonymous posted on Sunday, June 27, 2004 - 1:07 pm
Is there any way to model multilevel data in discrete-time survival analysis? Our sample consists of siblings coming from the same families. Thanks.
 bmuthen posted on Sunday, June 27, 2004 - 4:15 pm
Yes, you can do this, in two ways. One way is to do a single-level analysis where you model the variables for all of the siblings (the sample size is the number of families). The other way is to do a two-level analysis with siblings nested within families, using random effects (e.g. a random intercept) that vary across families.
 Hanno Petras posted on Wednesday, July 28, 2004 - 12:38 pm
Hi Bengt,

as a follow-up to Anonymous's question, I assume that you would have to use the "f by event indicator notation" instead of the LCA parameterization. Would "f" then be allowed to vary across clustering units? Also, is there an example output file available? Thanks.


 bmuthen posted on Wednesday, July 28, 2004 - 1:37 pm
Yes. Ex6.19 in the Version 3 User's Guide would have to be combined with the examples of the multilevel chapter 9, say ex9.6, but perhaps with only the fb factor on between.
 Chyke Doubeni posted on Monday, September 14, 2009 - 10:41 am
This is a somewhat naive question, but I would appreciate feedback and guidance.

We are estimating two-level survival analyses and need to get estimates of the random effects - variance, interquartile hazard ratio (HR) and median HR. This is discussed in the article: Chaix & Merlo. Am J Epidemiol 2005;162:171–182.

My collaborators used a Bayesian approach in SAS to estimate the variance parameter. How do I specify it in Mplus to get estimates of the random effect?

This is the current form of the model. I had also specified it as a discrete time model.

WITHIN = ... ;

BETWEEN = x2-x5;

CLUSTER = tractid;
SURVIVAL = p_yralld (ALL);
TIMECENSORED = deadcnsr (1=NOT 0=RIGHT);



p_yralld ON entr_age sex bmicur marriage ltcoll coll raceblk raceoth diabetes heart stroke smoke logcal logfiber fairpoor;

p_yralld ON x2-x5;
 Bengt O. Muthen posted on Tuesday, September 15, 2009 - 9:15 am
In your setup you estimate on Between the residual variance of the random intercept for the p_yralld survival variable. If you delete x2-x5 as covariates on Between, you will estimate the variance of the random intercept.
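Once the between-level variance of the random intercept has been estimated, a median hazard ratio can be computed from it. The following is a minimal Python sketch, assuming the random intercept is normally distributed on the log-hazard scale and using the general formula MHR = exp(sqrt(2 * variance) * z), where z is the 75th percentile of the standard normal distribution. The function name and the variance value are hypothetical, and this is not necessarily the exact definition used in the article cited above.

```python
import math
from statistics import NormalDist


def median_hazard_ratio(variance: float) -> float:
    """Median hazard ratio implied by a normal random intercept variance.

    Assumption: the random intercept acts on the log-hazard scale.
    """
    if variance < 0:
        raise ValueError("variance must be non-negative")
    z75 = NormalDist().inv_cdf(0.75)  # 75th percentile of N(0, 1), about 0.6745
    return math.exp(math.sqrt(2.0 * variance) * z75)


# Hypothetical between-level variance, for illustration only.
example_variance = 0.25
mhr = median_hazard_ratio(example_variance)
print(round(mhr, 3))
```

A variance of zero gives a median hazard ratio of 1, meaning no difference in hazard between clusters; larger variances give larger ratios.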
 Chyke Doubeni posted on Friday, September 18, 2009 - 2:08 am
Thank you so much. I modified the model as follows and used MLR:

BETWEEN = x2-x5;


p_yralld ON entr_age ;

p_yralld ON ;

1.) The standardized variances were 1.0 and se=0. Should I expect that?
2.) Also, I modeled quintiles of the BETWEEN variable. What would be your advice for deriving the IHR, and Median HR?
3.) The formula in Chaix & Merlo's article seems to be based on a continuous variable rather than dummies.
Would I need to use the continuous BETWEEN variable rather than the "quintile" variables?
 Linda K. Muthen posted on Friday, September 18, 2009 - 3:24 pm
Regarding number 1, please send the output and your license number to

Regarding 2, 3, and 4 I am not familiar with the paper.
 Chyke Doubeni posted on Tuesday, September 22, 2009 - 9:54 am
I think I figured it out. I was looking to derive the random effects (frailty) parameter with Cox frailty models. This is not possible in the multilevel framework - is it?
 Bengt O. Muthen posted on Tuesday, September 22, 2009 - 10:53 am
Yes, you can have frailties in the Cox model, and on both levels.
 Daniel Dickson posted on Thursday, May 29, 2014 - 2:52 pm

I am working on a multilevel survival analysis using Cox regression (continuous-time survival). The researcher I am working with has found meaningful person-level predictors of returning to hospitalization (only the first return to treatment). The researcher would like to test whether there is a random effect of hospitals, in particular to determine if there are different survival rates between hospitals.

I have received some recommendations that it may be best to dummy code each hospital (N=30) and include them in the analyses as a person level predictor, but I have not seen any literature to necessarily support that approach. I am wondering if there is a way to test if there are differences between hospitals (between level variables) on survival. An added difficulty is that there are no other hospital level predictors in the model.

Here is the syntax for the model I have proposed to test, but I realize that the variance estimation at the between level does not answer my question of whether survival differs as a function of hospitals:

Days_Ret ON age race11 race3
R_arr3 ExtBeh Intern CareIss1 los_627;

Thank you.
 Tihomir Asparouhov posted on Thursday, May 29, 2014 - 4:54 pm
Either a fixed-effect model with dummy variables or the random-effect model that you have above should illuminate the issue.
 Filippo Temporin posted on Sunday, January 24, 2016 - 10:30 am
Hi Bengt and Tihomir,

as a follow-up to this conversation I have another question. I've never used Mplus, and it could be my software in the future.

I'm working on a multilevel frailty model in order to study survival of children. The structure implies a household (second) level and a regional (third) level. The measurement model is a 2-level model with latent variables at the household and regional levels. These are then included as predictors in the 3-level structural model for mortality.

I wonder if there is the possibility to include the variance of the household-level latent variable as a predictor in the structural model.

In the case it could be done, is it feasible to include it in both the two- and three-level models?

Any advice would be extremely useful!
Thank you!
 Tihomir Asparouhov posted on Monday, January 25, 2016 - 8:54 am
You will need to use type=twolevel, and the household variables have to be set up as a multivariate vector (wide format). Currently only twolevel modeling is available in Mplus with survival variables.

To use the variance variable on the higher level, see section 4.1 in
or see slides 24 and 28 (two different approaches) in
 Laura Leaning posted on Monday, May 02, 2016 - 2:04 pm
Is the two-level continuous-time survival analysis using Cox regression with a random intercept model shown in Example 9.18 appropriate for use when levels correspond to repeated measurements of the same individual?

Also, is this a frailty model?

Thank you!
 Tihomir Asparouhov posted on Wednesday, May 04, 2016 - 10:11 am
Yes on both questions.
 Laura Leaning posted on Wednesday, May 04, 2016 - 10:43 am
Regarding example 9.18... If the clustering indicator is the individual, and it's a time-to-event outcome, how is this model estimating an individual random effect without another variable to estimate at the same time? (For example, event and death?)
 Tihomir Asparouhov posted on Wednesday, May 04, 2016 - 11:07 am
Take a look at sections 3.1 and 3.2 in

The individual random effect (frailty) reflects the differences in the hazard functions and can be estimated due to the repeated measurements.

The model in section 3.2 is a bivariate double frailty (one from id and one from hospital). To get the model for 9.18 you would use equation (21) only and remove eta_ijw.
 Laura Leaning posted on Wednesday, May 04, 2016 - 11:38 am
I think I am confused because there are two time-to-event variables in those examples (T1 remission to relapse and T2 relapse to death), but there is only one time-to-event variable in Example 9.18 (t).
 Laura Leaning posted on Wednesday, May 04, 2016 - 11:47 am
What I don't understand is how you can estimate a frailty term if you never see T1 and T2 on the same individual, but you always see either T1 or T2. In other words, does a frailty model make sense when they are competing events?
 Tihomir Asparouhov posted on Thursday, May 05, 2016 - 10:32 am
Frailty requires at least two observed time to event variables. This applies to two-level methods like example 9.18 or single level multivariate methods like section 3.1.
 Laura Leaning posted on Monday, May 09, 2016 - 8:44 am
Thank you very much for that clarification. If I use example 9.18 with only one time-to-event variable (t), it is not a frailty model. Does this also mean that there is no random intercept? I want to use the exact specification below, clustering repeated measures within individuals, in order to accommodate time-varying confounders.

VARIABLE: NAMES = t x w tc clus;
CLUSTER = clus;
t ON x;
t ON w;
 Tihomir Asparouhov posted on Monday, May 09, 2016 - 2:18 pm
Example 9.18 is a frailty model with a random intercept because there is more than one time-to-event variable within each cluster, assuming the cluster size is greater than one.

I would recommend reading about "DATA WIDETOLONG:" command in Mplus to understand the equivalence between multivariate and multilevel models. This of course is not related to frailty - it is a general concept.
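The equivalence mentioned above between a multivariate (wide) layout and a multilevel (long) layout can be pictured with a small reshaping sketch. This is plain Python and not the Mplus DATA WIDETOLONG command itself; the names wide_to_long, id, t1, t2, and occasion are hypothetical, and the data values are made up for illustration.

```python
def wide_to_long(rows, time_cols, id_col="id"):
    """Turn one row per cluster (wide) into one row per time-to-event variable (long)."""
    long_rows = []
    for row in rows:
        for occasion, col in enumerate(time_cols, start=1):
            value = row.get(col)
            if value is None:  # skip occasions that were not observed
                continue
            long_rows.append({"clus": row[id_col], "occasion": occasion, "t": value})
    return long_rows


# Wide format: each individual has several time-to-event variables in one row.
wide = [
    {"id": 1, "t1": 5.0, "t2": 8.5},
    {"id": 2, "t1": 3.2, "t2": None},
]

# Long format: one row per observed time, with the individual as the cluster.
long = wide_to_long(wide, ["t1", "t2"])
for record in long:
    print(record)
```

In the long layout the individual plays the role of the cluster variable, which is why a cluster needs more than one row for a cluster-level random intercept to be estimated.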
 Kirill Fayn posted on Wednesday, October 12, 2016 - 8:03 am

I am getting a new error on survival models that previously ran without any issues. The error is:

There is at least one observation in the data set where a survival variable
has a negative value or 0. Please check your data and format statement.

There are indeed 0s in the variable but this has always been the case. There are negative numbers that have been defined as missing.

The only thing that has changed is that I am running the analysis on 7.4 instead of 7.

Thanks in advance for your help
 Tihomir Asparouhov posted on Wednesday, October 12, 2016 - 3:21 pm
We made that change recently to improve the applicability of the model. Consider equation (8) in
S(0)=1 meaning that the probability that a person dies at time T=0 is zero. Here are some options - I am not sure which one is best for you.
a) Add +1 to the survival variable
b) replace 0 with 0.0001 (this leads to huge hazard in the 0 to 0.0001 interval)
c) remove these observations - it is not clear why you have these in the first place - we study a sample of alive people to see how covariates affect their survival, so why include dead people in the sample?
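The three options above can be sketched as simple data-preparation steps applied before the file is read into Mplus. This is a minimal Python illustration, assuming the survival times are held in a plain list; the function names and the example values are hypothetical.

```python
def add_one(times):
    """Option a: shift every survival time by +1."""
    return [t + 1 for t in times]


def replace_zeros(times, epsilon=0.0001):
    """Option b: replace exact zeros with a small positive value."""
    return [epsilon if t == 0 else t for t in times]


def drop_zeros(times):
    """Option c: remove observations whose survival time is zero."""
    return [t for t in times if t != 0]


# Hypothetical survival times containing zeros.
times = [0, 2.5, 7, 0, 1]

print(add_one(times))
print(replace_zeros(times))
print(drop_zeros(times))
```

Each option leaves only strictly positive survival times, which is what the newer version requires; which one is appropriate depends on what a zero means in the data.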
 Jennie Jester posted on Thursday, January 11, 2018 - 12:07 pm
I am using repeated events survival analysis to look at driving arrests. I have done a model in SAS that looks like this:
proc phreg data = drivelib.repeat covm covs(aggregate);
model (time0 time1)*repevnt(0)= audlifbft1 ;
id target;
where offcount < 9;
strata offcount;

This is conditional model A from Hosmer and Lemeshow.

I would like to do the analysis in Mplus, to take advantage of using FIML (because I will include other predictors which have some missing data). The only example I have found is UG 9.18 and I modified it to the following:
CLUSTER = target;

BETWEEN =audlifbft1;
SURVIVAL = duroff (ALL);
TIMECENSORED = repevnt(0 = NOT 1 = RIGHT);

duroff ON audlifbft1;

I don't understand what the CLUSTER variable is supposed to be - I used our ID variable, but I don't think that is correct.
Can you help me with this syntax or direct me to a citation for it?


 Tihomir Asparouhov posted on Thursday, January 11, 2018 - 8:36 pm
See equation 1 in

You can also check out Section 3.2 in
or look up examples for Frailty models in
The section 3.2 example, though, is bivariate, whereas your example is univariate and you don't have the within-level frailty.

The cluster variable should be the variable that shows how observations are nested within a higher level unit. For example


represents the random intercept (if exponentiated it is the hazard function proportionality factor) that applies to all the observations in that cluster. So yes in your case cluster is ID.

Consider doing multiple imputation for your missing data since that will give you more estimating options.
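The reply above describes the random intercept as, once exponentiated, the hazard function proportionality factor shared by all observations in a cluster. A minimal Python sketch of that relationship follows, assuming a proportional-hazards form h(t) = h0(t) * exp(b*x + u) with a cluster-level random intercept u. The function names and numeric values are hypothetical and are not taken from the thread.

```python
import math


def hazard_factor(random_intercept: float) -> float:
    """Multiplier applied to the hazard of every observation in a cluster."""
    return math.exp(random_intercept)


def cluster_hazard_ratio(u_a: float, u_b: float) -> float:
    """Hazard ratio between two clusters with the same covariate values."""
    return math.exp(u_a - u_b)


# Hypothetical random intercepts for two individuals (clusters).
u_first, u_second = 0.4, -0.1

print(hazard_factor(u_first))
print(cluster_hazard_ratio(u_first, u_second))
```

A random intercept of zero leaves the hazard unchanged, and two clusters with equal intercepts have a hazard ratio of 1.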