I am running a repeated-event, two-process DTSM. Subjects are measured in 8 periods, and I am using the counting process formulation. Six events of each type are possible.
My goal is to introduce a relationship between the two processes. Since the number of events reported (of either type) is likely to be positively correlated across individuals, introducing event frailties and correlating them does not work for me. Instead, I chose to work with two time-varying covariates which count the number of periods since the last event of each type. However, Mplus drops any case with a missing value on any covariate, i.e. any individual who does not experience BOTH events in the first period is ignored.
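For readers who want to see how such a covariate ends up missing, here is a minimal Python sketch (not Mplus syntax). It assumes the count is 0 in the period in which an event occurs and is undefined (None) until the first event of that type; the function name and the example sequence are hypothetical.

```python
def periods_since_last_event(events):
    """Return, for each period, the number of periods since the most
    recent event at or before that period, or None if no event of this
    type has occurred yet (assumed coding, for illustration only)."""
    covariate = []
    last_event_period = None
    for t, event in enumerate(events):
        if event == 1:
            last_event_period = t
        if last_event_period is None:
            covariate.append(None)          # not yet defined: no event so far
        else:
            covariate.append(t - last_event_period)
    return covariate


# Hypothetical subject observed over 8 periods, with events in periods 3 and 6.
events_type_a = [0, 0, 1, 0, 0, 1, 0, 0]
x_a = periods_since_last_event(events_type_a)
print(x_a)
```

Under this coding, every period before the first event carries None, which is the pattern that triggers the listwise deletion described above.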
How should I handle this? Monte Carlo integration should work, but the variables are not normally distributed and it is extremely slow. I was also considering a piecewise approach (discretizing the variable, with one category being "not yet observed"). Note, btw, that missingness is perfectly predicted by the outcome variables, and P(missing=1,t) is equal to the cumulative survival function up to period t. I'd really appreciate your insight on this. Regards, Liuben
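The piecewise idea could be sketched roughly as follows, in Python rather than Mplus syntax. The bin boundaries (0-1, 2-3, 4+) are purely illustrative assumptions, as are the function names; "not yet observed" is treated as the reference category here.

```python
def discretize(value, upper_bounds=(1, 3)):
    """Map a periods-since-last-event count to a category index.
    0 = "not yet observed"; 1, 2, ... = ordered bins defined by the
    (hypothetical) upper bounds; the last index is the open-ended bin."""
    if value is None:
        return 0
    for index, upper in enumerate(upper_bounds, start=1):
        if value <= upper:
            return index
    return len(upper_bounds) + 1


def dummies(category, n_categories, reference=0):
    """Dummy-code a category index, omitting the reference category."""
    return [1 if category == k else 0
            for k in range(n_categories) if k != reference]


covariate = [None, None, 0, 1, 2, 5]
coded = [dummies(discretize(v), n_categories=4) for v in covariate]
print(coded)
```

With this coding no row has a missing covariate, so no case would be dropped for that reason; the cost is that the effect is estimated per bin rather than as a single slope.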
PS. I chose to post this on the forum so that the community could take a look. Any ideas or examples of similar work would be highly appreciated!
No, actually. For the first event, exactly the opposite is true: since the covariate counts the time since the last observed event, it is always missing when the individual is at risk for the first event (i.e. has valid values on the outcome), and has a valid value at failure time. It always has a valid value for the second and later spells of the same type. Essentially, having a missing value on the covariate in any period corresponds to not having experienced a first event up to that period.
If I understand your problem correctly, a simple solution would be to record the missing covariates as 0 instead of as missing values. I assume the regression slope for the time-varying covariates is time-invariant. When you don't have such covariates you essentially don't want them in the regression, and recording them as 0 will do that.
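To illustrate why a zero removes the covariate from the regression, here is a small Python sketch. It assumes a logit link for the discrete-time hazard and uses hypothetical parameter values and function names; it is not Mplus output.

```python
import math


def recode_missing_as_zero(values):
    """Replace undefined covariate values (None) with 0."""
    return [0 if v is None else v for v in values]


def hazard(intercept, slope, x):
    """Discrete-time hazard under an assumed logit link:
    logit(h) = intercept + slope * x."""
    return 1.0 / (1.0 + math.exp(-(intercept + slope * x)))


x = recode_missing_as_zero([None, None, 0, 1, 2])

# With x = 0 the term slope * x vanishes, so the hazard depends on the
# period-specific intercept only, whatever the slope is.
h_small_slope = hazard(-1.0, 0.1, x[0])
h_large_slope = hazard(-1.0, 5.0, x[0])
print(h_small_slope == h_large_slope)
```

One thing to keep in mind: if the covariate can genuinely equal 0 (for example in the period of the event itself), a recoded 0 and a true 0 become indistinguishable unless a separate indicator is kept.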
Yes, the slopes are time-invariant. Thank you for the suggestion - I was considering this given that estimation is done by ML. The only thing that concerns me is that a missing value on the covariate (or a zero if changed) may be associated with long-term survival, especially for the values of the covariate in later periods. Could this somehow distort the likelihood function and/or parameter estimation? My point is, if we record missing values as zero, won't the slope be heavily biased upwards because 1) even a tiny change in x results in a large effect on y, and 2) this change happens in a large proportion of the population due to long-term survivors, giving them a large weight in the estimation of the PDF of y? Thank you again for your time!
I think at this point the question is how you define the model rather than how you estimate it. It seems to me that the covariates are missing because they are not defined, not because they are unobserved. So I would suggest that you focus on the model definition and possibly do some simulations.
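A simulation along these lines might start from something like the Python sketch below. Everything in it is an assumption made for illustration: the logit link, the parameter values, the constant hazard for the second process, the zero recode, and the function name. It also ignores the cap of six events per type for brevity. It generates person-period data in which the hazard of event A depends on the periods since the last event B, and keeps an indicator of whether that covariate is defined.

```python
import math
import random


def simulate_subject(rng, n_periods=8, hazard_b=0.3,
                     intercept_a=-1.0, slope_a=0.4):
    """Simulate one subject. Each row is
    (period, event_b, x, defined, event_a), where x is the number of
    periods since the last B event (recoded to 0 while undefined)."""
    rows = []
    last_b = None
    for t in range(n_periods):
        event_b = 1 if rng.random() < hazard_b else 0
        if event_b:
            last_b = t
        defined = last_b is not None
        x = (t - last_b) if defined else 0      # zero recode when undefined
        p_a = 1.0 / (1.0 + math.exp(-(intercept_a + slope_a * x)))
        event_a = 1 if rng.random() < p_a else 0
        rows.append((t, event_b, x, int(defined), event_a))
    return rows


rng = random.Random(12345)                      # fixed seed for reproducibility
data = [simulate_subject(rng) for _ in range(500)]

# Share of subjects whose covariate is still undefined in each period;
# this tracks the survival function of the first B event.
share_undefined = [
    sum(1 - rows[t][3] for rows in data) / len(data) for t in range(8)
]
print(share_undefined)
```

Fitting the candidate model definitions to data generated this way, with known parameter values, would show whether the slope is recovered or distorted under each treatment of the undefined periods.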