Jenn Tein @ ASU has a chapter that should be relevant:
Tein, J.-Y., & MacKinnon, D. P. (2003). Estimating Mediated Effects with Survival Data. H. Yanai, A. O. Rikkyo, K. Shigemasu, Y. Kano, J. J. Meulman (Eds.) New Developments on Psychometrics (pp. 405-412). Tokyo, Japan: Springer-Verlag Tokyo Inc.
I'd contact her or Dave MacKinnon directly if you have trouble tracking down the book.
Steffi posted on Thursday, June 22, 2006 - 2:29 pm
I am afraid my problem is very basic but I cannot come up with a good explanation.
Running a DTSA with four binary indicators (x1-x4), and two continuous predictors (y1 and y2) the model quickly converges (Mixture Missing) and yields reasonable results. The relevant Model statement is:
The predictors y1 and y2 are correlated by default in the Mplus analysis. Just like in regression analysis, these correlations are not part of the model's parameter set. You should not mention y1 with y2.
If you really want the mediation model that you have problems with, contact support.
Steffi posted on Thursday, June 22, 2006 - 11:26 pm
thank you for your fast reply! Indeed, the mediation is what I am actually after.
But out of curiosity: other than in "standard" CFA I assume it is not possible to constrain corr(y1, y2) to zero?
I think you are asking if in a mixture analysis you can constrain the correlation of two variables to zero if they are predictors of a continuous latent variable that in turn influences categorical observed variables. That is possible in Mplus. If you have problems doing so, please send your input, output, data, and license number to firstname.lastname@example.org.
I have a follow up question on Stan Hong, Lois Gelfand and my earlier post:
I want to test a mediation effect in discrete time survival analysis (X->M->Y) with X = continuous predictor, M = continuous mediator and Y = vector of binary event indicators (proportional hazard odds assumption). I have censored observations, and from my reading of the literature, the best approach would be the product of coefficients method (a*b, where a is a standard regression coefficient and b is a logit coefficient) using bootstrapped SEs. With a logit link function and ML, am I correct that such a test is not (yet) possible in Mplus?
The next best approach is probably to compute SE using the Delta Method as recommended by Linda somewhere, but this also needs to be done by hand, correct?
Before tackling this task, I want to make sure that there is no "better" alternative (such as a probit link function with WLSMV, as indicated in another post, but I am very hesitant to adopt such an approach since I have never seen it before in a DTSA context). What would you recommend, and can you point me to any good references for mediation effects with dichotomous outcomes? (I am aware of MacKinnon's message from Dec. 04 and many of his excellent papers, but I wonder to what extent these findings can be generalized to DTSA with censored observations.)
I want to fit a discrete-time hazard model. My data set is in long format: observations at different ages are nested within individuals, who are nested in clusters (census tracts). Can I specify the individual ID as "idvariable" and census tracts as "cluster" to fit a three-level model? Or do I have to restructure the data so that each age interval becomes one variable and take a multivariate approach at the lower level?
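For the restructuring option raised in this question, a minimal sketch of turning long (person-period) records into one wide record per individual, with one event indicator per age. The tuple layout, the `u<age>` variable names, and the use of 999 as the missing code are illustrative assumptions, not something prescribed in this thread.

```python
MISSING = 999  # assumed missing-data code, matching the code used elsewhere in this thread

def long_to_wide(rows, ages):
    """Collapse long-format rows into one record per person.

    rows: iterable of (person_id, tract_id, age, event) tuples.
    ages: the full list of age intervals that should become columns.
    Ages with no row for a person are filled with MISSING.
    """
    wide = {}
    for person_id, tract_id, age, event in rows:
        if person_id not in wide:
            record = {"id": person_id, "tract": tract_id}
            record.update({f"u{a}": MISSING for a in ages})
            wide[person_id] = record
        wide[person_id][f"u{age}"] = event
    return [wide[key] for key in sorted(wide)]

# Hypothetical data: three individuals in two census tracts.
long_rows = [
    (1, 10, 15, 0), (1, 10, 16, 0), (1, 10, 17, 1),
    (2, 10, 15, 0), (2, 10, 16, 0),
    (3, 20, 15, 1),
]
wide_rows = long_to_wide(long_rows, ages=[15, 16, 17])
for record in wide_rows:
    print(record)
```

After this step the individual level is carried by the columns, so the census tract is the only remaining clustering variable in the file.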
Also, can I introduce frailty to account for unobserved heterogeneity for individuals without restructuring the data?
I am trying to use the "Model Constraint: New" command described by Bengt above on July 26, 2006 4:25 pm to obtain the Delta method standard error of a mediation effect in a Cox regression model.
However, I get a fatal error message: "*** FATAL ERROR Internal Error Code: PR1004 - Parameter restriction split problem. An internal error has occurred. Please contact us about the error, providing both the input and data files if possible."
Am I doing something very wrong here? My syntax is below:
The problem is that one variable is continuous and the other time censored. This is not allowed in MODEL CONSTRAINT. If you are creating an indirect effect, I am not sure that the product can be used here. You can ask for TECH3 and compute the standard error of the product yourself.
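As a sketch of what computing the standard error of the product "yourself" could look like: the first-order Delta method gives Var(ab) ≈ b²·Var(a) + a²·Var(b) + 2ab·Cov(a, b), where the variances and the covariance would be read from the parameter covariance matrix printed by TECH3. The numbers below are hypothetical, and the code only does the arithmetic; it does not settle whether the product is a valid indirect effect in this model.

```python
import math

def product_se(a, b, var_a, var_b, cov_ab=0.0):
    """First-order Delta method standard error of the product a*b."""
    variance = b ** 2 * var_a + a ** 2 * var_b + 2 * a * b * cov_ab
    return math.sqrt(variance)

# Hypothetical estimates: a = path to the mediator, b = path from the mediator.
a, b = 0.5, 0.4
var_a, var_b, cov_ab = 0.01, 0.04, 0.0  # would be taken from the TECH3 output

indirect = a * b
se = product_se(a, b, var_a, var_b, cov_ab)
z = indirect / se
print(f"indirect = {indirect:.3f}, SE = {se:.4f}, z = {z:.2f}")
```

If the two paths are estimated in the same model, the covariance term should not be dropped without checking that it is close to zero.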
Hi Linda, thanks for your reply on Nov 8. Could I please confirm what you meant by not being sure whether the product can be used in the example syntax I posted on Nov 8? I would ideally like to estimate the indirect effect of RGRLEV on P_123Y (survival time). However, this would mean multiplying the estimates of the regression of FAMPROB6 on RGRLEV, and the Cox regression of P_123Y on RGRLEV. As these are different kinds of regression analyses, perhaps I cannot multiply the paths together to obtain an indirect effect. Is this correct?
Is there any way around this? I suppose I could forgo the Cox regression model and instead estimate a logistic model regressing I_123Y (survival or not) on FAMPROB6, and then multiply the path from RGRLEV to FAMPROB6 by the path from FAMPROB6 to I_123Y to obtain the indirect effect of RGRLEV on I_123Y.
I have two questions concerning discrete-time mediated survival analysis. Our data set is in long format (observations nested within individuals). The predictor is dichotomous, mediators are continuous, dependent variable is an event indicator (no missing data).
1. Our model is a survival analysis with a preceding panel regression model. Is it correct to use TYPE=COMPLEX in conjunction with the CLUSTER option to account for the nonindependence of observations (due to the long format)? Or are there any alternatives you would suggest in our case (any literature suggestions are appreciated)?
2. Is it true that indirect effects can only be estimated with a probit regression in our case? Is there any alternative model specification based on the standard logistic link function?
1. It sounds like you have longitudinal data (a "panel regression model" followed by a subsequent survival model). That seems to be best handled by letting the panel part be in wide, not long, form (single-level analysis), since the survival modeling is in wide form (so first the columns with the panel outcomes for the panel time points, followed by columns for the event indicators).
2. You can consider the indirect effect with f as end point as in regular linear regression since f (using the Mplus UG notation) is continuous.
We can see that your suggestion is a viable alternative. But: We are mainly interested in the effects of time-varying mediators on the outcome. If we use a wide data setup, we get several effects (as many as there are panel waves, in our case 8) on the respective event indicators. We feel this is quite cumbersome to depict/interpret.
Would it be correct to impose an equality constraint on the various effects per (time-varying) covariate? In this case there would be only one effect per time-varying covariate. (Still, if the model fit worsens after the constraint, we continue to have the "cumbersome" multiple effects, don't we?)
We are pretty sure the relationship is not invariant across time. But if this is the case, the results become quite technical/complicated for a non-methodological journal article, which we are trying to avoid... Besides, isn't the equality constraint just what the "usual" time-discrete event history analysis with time-varying covariates does (unless you include interactions with times of measurement)?
One last question: Would you consider the "long format" approach (every row in the data refers to an episode of observation) incorrect, even if nonindependence of observations is controlled for by adjusting standard errors (by using the Mplus CLUSTER command)? In the "classic" (i.e., non-SEM) literature, time-discrete event history analysis is usually done in long format (this is why we figured we could run our analysis that way).
I am doing a discrete time survival analysis following the models described in Muthén and Masyn's article and the Mplus user's guide. However, I am experiencing some difficulties.
1) I have 10 time points, but for the first two time points the binary U indicators have only 0 values. As the current version of Mplus doesn't allow a categorical indicator with only one value, I thought of doing the analysis without the first two time points. That strategy could, however, bias the estimation of the hazard because it discards information about those time intervals. Is that reasoning correct? If yes, is there any way to evaluate that bias? If the bias is negligible I can carry on with the discrete time survival analysis, especially because the continuous-time approach doesn't really suit my research problem.
2) In their article, Muthén and Masyn mention imposing a specific structure on the logit baseline hazard. So I was wondering how to test a linear trend on the logit baseline hazard via MODEL CONSTRAINT.
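To make question 2 concrete, here is a small sketch of what a linear trend on the logit baseline hazard implies: logit h_j = alpha + beta·j for periods j = 0, 1, ..., so adjacent logits differ by the constant beta. The values of `alpha` and `beta` are hypothetical, and the line computing thresholds assumes that, without covariates, each threshold is the negative of the logit hazard; check that sign convention against the user's guide before relying on it.

```python
import math

def linear_logit_hazards(alpha, beta, n_periods):
    """Hazards whose logits follow alpha + beta * j for j = 0..n_periods-1."""
    return [1.0 / (1.0 + math.exp(-(alpha + beta * j))) for j in range(n_periods)]

def survival_curve(hazards):
    """Probability of no event through each period: cumulative product of (1 - h_j)."""
    curve, surviving = [], 1.0
    for h in hazards:
        surviving *= 1.0 - h
        curve.append(surviving)
    return curve

alpha, beta, n_periods = -2.0, 0.3, 8  # hypothetical values; 8 = 10 time points minus the first two
hazards = linear_logit_hazards(alpha, beta, n_periods)
# Assumed sign convention: threshold_j = -(logit of hazard_j) when there are no covariates.
thresholds = [-(alpha + beta * j) for j in range(n_periods)]
survival = survival_curve(hazards)

for j, (tau, h, s) in enumerate(zip(thresholds, hazards, survival)):
    print(f"period {j}: threshold = {tau:.2f}, hazard = {h:.3f}, survival = {s:.3f}")
```

A model with these two free parameters is nested in the model with one free threshold per period, which is what makes a comparison of the two a test of the linear trend.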
I am looking forward to hearing your answer. Thank you for your invaluable help.
I am planning to do a discrete time survival analysis and had a few questions I was hoping you could answer:
1) How should the data be coded for people who have missing assessments but then come back into the study? For example, say I have six assessments and someone drops out at time 3, but then comes back into the study at time 5 and reports the event. Should it be coded:
0 0 999 999 1 999 (999=missing)
If this hypothetical person did not report the event at time 5 or 6, would it be coded:
0 0 999 999 0 0
2) How does one handle unequal spacing between time points in these models?
1. In the Muthen/Masyn implementation of discrete-time survival analysis in Mplus, only non-repeatable events such as onset of drug use are considered. Because of that, I think the coding for your two cases would be
0 0 0 0 1 999
0 0 0 0 0 0
Missing has a different meaning in this coding scheme.
2. The spacing does not matter unless you put a growth model on the thresholds.
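The coding in this reply can be written as a small rule: 0 in every period before the event, 1 in the period where the event is reported, and the missing code afterwards. The sketch below follows that rule; the function name, its arguments, and the handling of permanent dropout (missing from the first unobserved period onward) are illustrative assumptions. Note that it treats intermittently missed assessments before the event as 0, exactly as the reply does.

```python
MISSING = 999  # missing-data code used in this thread

def code_event_indicators(n_periods, event_period=None, last_observed=None):
    """Code a non-repeatable event as a vector of binary indicators.

    event_period: 1-based period in which the event is reported, or None.
    last_observed: last period under observation for someone who never
        reports the event (defaults to n_periods, i.e. no dropout).
    """
    if last_observed is None:
        last_observed = n_periods
    indicators = []
    for t in range(1, n_periods + 1):
        if event_period is not None:
            if t < event_period:
                indicators.append(0)
            elif t == event_period:
                indicators.append(1)
            else:
                indicators.append(MISSING)  # no longer at risk after the event
        else:
            indicators.append(0 if t <= last_observed else MISSING)
    return indicators

returned_with_event = code_event_indicators(6, event_period=5)
returned_without_event = code_event_indicators(6)
dropped_out = code_event_indicators(6, last_observed=2)
print(returned_with_event)
print(returned_without_event)
print(dropped_out)
```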
Thanks for the reply. Since these people have missing information for some assessments, it seems problematic to assume that the event has not occurred in these phases (i.e., 0). I thought there was a way to take periodic missingness into account in these models. If I understand correctly, dropout prior to an event occurring would be handled by coding 999 at the time point where dropout occurred, but handling periodic missingness as I outlined previously seems trickier.
I have a dataset where the units under observation are subsidiaries of multinational companies. Over a period of 20 years, new subsidiaries are set up and enter the dataset. How should I code the initial periods in which they were absent because they had not yet been founded? They are not really missing values.
Since you posted under discrete-time survival, I assume you observe some event such as failure. Perhaps you should consider as your survival time variable, the number of years between being born and the event?
Yes, we do observe a failure and that is the event of interest, not the founding/entry. As of now I am not considering time-varying covariates (country-level contextual factors), but the idea was to incorporate them at a later point. If I didn't have these, I could have coded each subject from its first year of entry. Even if I did this, it would still bias any unobserved heterogeneity for the particular calendar year; for example, there could have been bad economic conditions during a particular year in a particular country. So I reasoned that I would code it as missing, since the subsidiary is not part of the risk set in any case.
If I correctly understood your suggestion - I can incorporate age as a time-variant covariate. I am just wondering if there would be too much of a correlation between this and the dependent variables (u) and wouldn't that overshadow the effects of my other variables?
We think age should not be the covariate - it should be the dependent survival variable. Instead, we would recommend that the economic conditions you are talking about be a time-varying covariate - that can be done as in http://statmodel.com/download/lilyFinalReportV6.pdf
Discrete time survival may be a good approach here that can also accommodate time-varying covariates easily.
Yes, discrete time survival is the most appropriate approach for more than one reason for my analysis.
If I have age as the survival time dependent I cannot have macro-economic conditions as a simple time-varying covariate. Let me illustrate with an example. In a particular country consider subsidiary 1 as existing between 2006 and 2010. Subsidiary 2 exists from 2004 to 2008. I can have 7 dummy binary variables for age 1 to 7 but neither of them reached age 6. I now have 7 values for macro-economic condition. The macro-economic condition of year 2008 was fatal for one and not the other but with a simple time-varying covariate it appears that the fifth year (2008) was fatal for both.
If I correctly understood why you provided the particular link, what I can do is incorporate the existence of the subsidiary in a given year and year based time-varying variables in the latent part. This will incorporate the calendar year based effect on the survivability, through the class membership. Am I correct in interpreting what you said?
Now if you also add discrete time survival - the value 5 will also be split into binaries age1, age2,...,age7:
age1, age2,...,age7, x1, x2, ..., x7
0,0,0,0,0,999,999,0,0,1,0,0,0,0
0,0,0,0,1,999,999,0,0,1,0,0,0,0
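A separate illustration of the age-based layout, using the two subsidiaries from the question (one existing 2006 to 2010, one existing 2004 to 2008 and failing in 2008). Each calendar-year condition is placed at the age the subsidiary had in that year, so the same bad year lands in a different age column for each subsidiary. The function name, the binary "bad year" covariate, and the assumption that the first subsidiary is censored rather than failed in 2010 are all hypothetical.

```python
MISSING = 999  # missing-data code used in this thread

def age_coded_row(founded, last_year, failed, bad_years, max_age=7):
    """Return (u, x) indexed by age 1..max_age.

    u: event indicators (0 before the last observed age, 1 at that age if
       the subsidiary failed, MISSING for ages never reached).
    x: 1 if the calendar year matching that age is in bad_years, else 0.
    """
    last_age = last_year - founded + 1
    u = []
    for age in range(1, max_age + 1):
        if age < last_age:
            u.append(0)
        elif age == last_age:
            u.append(1 if failed else 0)
        else:
            u.append(MISSING)
    # Age k corresponds to calendar year founded + k - 1.
    x = [1 if (founded + age - 1) in bad_years else 0 for age in range(1, max_age + 1)]
    return u, x

bad_years = {2008}  # hypothetical: one bad macro-economic year
u1, x1 = age_coded_row(founded=2006, last_year=2010, failed=False, bad_years=bad_years)
u2, x2 = age_coded_row(founded=2004, last_year=2008, failed=True, bad_years=bad_years)
print(u1 + x1)
print(u2 + x2)
```

Here 2008 is age 3 for the first subsidiary and age 5 for the second, so only the second has the bad year in the same column as its failure.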
Dear all, we are doing a discrete-time survival analysis in which we examine whether the effect of one time-invariant predictor (proportionality assumed and confirmed) on event occurrences is mediated by a time-varying predictor (proportionality NOT confirmed). The basic model looks like this:
! c'-path
event_T1 on t_iv (1);
..
event_Tn on t_iv (1);
! b-paths
event_T1 on t_v1 (b1);
..
event_Tn on t_vn (bn);
! a-paths
t_v1 on t_iv (a1);
..
t_vn on t_iv (an);
To test mediation, we use the model constraint approach and everything works fine. However, given that the time-varying mediator is non-proportional, we get n mediation effects (in our study n = 8). This makes sense, but to make the results easier to interpret and report, I was wondering whether there is a procedure for averaging the n effects. Or would you recommend estimating the model with proportionality assumed for the time-varying predictor?
Stefano posted on Friday, January 04, 2013 - 6:33 am
I am running a discrete time survival analysis with a time-invariant predictor and a mediator (proportionality NOT confirmed). Based on the last note on this thread, I would like to obtain the average of the n indirect effects, but I didn't manage to write the correct code.
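For reference, the quantity being asked for is the mean of the period-specific products, (a1·b1 + ... + an·bn)/n. The sketch below computes it outside Mplus, together with a first-order Delta method standard error when a covariance matrix of the estimates is available (ordered a1..an, then b1..bn). All numbers are hypothetical. Inside Mplus, the same expression could presumably be defined as a NEW parameter in MODEL CONSTRAINT using the a and b labels; that is an assumption to verify against the user's guide.

```python
import math

def average_indirect(a, b, cov=None):
    """Mean of the products a_k * b_k, with an optional Delta method SE.

    cov: covariance matrix (list of lists) of the estimates, ordered
         a_1..a_n followed by b_1..b_n. If omitted, no SE is returned.
    """
    n = len(a)
    if n == 0 or len(b) != n:
        raise ValueError("a and b must be non-empty and of equal length")
    estimate = sum(ak * bk for ak, bk in zip(a, b)) / n
    if cov is None:
        return estimate, None
    # Gradient of the mean: d/da_k = b_k / n, d/db_k = a_k / n.
    grad = [bk / n for bk in b] + [ak / n for ak in a]
    size = 2 * n
    variance = sum(grad[i] * cov[i][j] * grad[j] for i in range(size) for j in range(size))
    return estimate, math.sqrt(variance)

# Hypothetical estimates for n = 2 periods, with uncorrelated parameters.
a = [0.2, 0.4]
b = [0.5, 0.1]
cov = [[0.01 if i == j else 0.0 for j in range(4)] for i in range(4)]
avg, avg_se = average_indirect(a, b, cov)
print(f"average indirect effect = {avg:.3f}, SE = {avg_se:.4f}")
```

The average is only a reporting summary; it can hide period-specific effects of opposite sign, which matters when proportionality is not supported.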