Time-varying covariates
 Anonymous posted on Wednesday, September 25, 2002 - 12:41 pm
I am considering how many time-varying covariates to include in my model.

In Example 22.1c of the manual, X3 is one time-varying covariate affecting Y1. Its effect on Y1 was modeled separately in each time period (i.e., x31 on y11, x32 on y12, x33 on y13, etc.). Is it possible to include more than one time-varying covariate? Most of the examples in the MPLUS manual and the workshop use one time-varying covariate.

What are the issues that one should consider in deciding how many time-varying covariates to include? Thank you for your time.
 bmuthen posted on Friday, September 27, 2002 - 9:58 am
It is possible to use many time-varying covariates (tvc's) at each time point. You should use the same considerations as in regular regression, that is, to avoid multicollinearity the tvc's should not be too highly correlated. You might want to center the tvc's at zero means to make interpretations easier.
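As an illustration of the reply above, here is a minimal sketch with two tvc's at each of four time points. The variable names (y1-y4, a1-a4, b1-b4) are hypothetical, the fragment is not a complete input, and the DEFINE: CENTER statement assumes a version of Mplus that supports it (older versions use a different centering option).

```mplus
DEFINE:
  CENTER a1-a4 b1-b4 (GRANDMEAN);  ! center each tvc at its sample mean
MODEL:
  i s | y1@0 y2@1 y3@2 y4@3;       ! linear growth for the outcome
  y1 ON a1 b1;                     ! two tvc's at each time point
  y2 ON a2 b2;
  y3 ON a3 b3;
  y4 ON a4 b4;
OUTPUT:
  SAMPSTAT;                        ! inspect correlations among the tvc's
```

The sample statistics can be used to check whether tvc's measured at the same time point are too highly correlated.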
 Silvia posted on Friday, April 30, 2004 - 3:43 pm
Hi Bengt and Linda,
I am running a parallel process growth model with time invariant and time varying covariates.
In my model the time varying covariates predict the indicators for both growth variables.

The Modification indices suggest a relationship between one of the time varying covariates and the slope factor of one of the parallel process variables. This would imply that drinking at time 3 is related to the overall slope of abstinence.

Would this make any sense and is it mathematically acceptable?
thanks
silvia
 bmuthen posted on Friday, April 30, 2004 - 7:10 pm
If you center at time 1, then the slope factor concerns growth starting right after time 1 which means that influence from a time 3 time-varying covariate is questionable. Perhaps it means that the growth is accelerated at time 3, but then you would need a piecewise model.
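A piecewise model of the kind mentioned above might be sketched as follows, assuming five time points with a change in growth rate at time 3. The names y1-y5 and drink3 are hypothetical, and only one of the two parallel processes is shown.

```mplus
MODEL:
  i s1 | y1@0 y2@1 y3@2 y4@2 y5@2;   ! growth from time 1 to time 3
  i s2 | y1@0 y2@0 y3@0 y4@1 y5@2;   ! growth after time 3
  s2 ON drink3;                      ! time 3 covariate predicts only the later slope
```

With this setup the time 3 covariate influences growth that starts at time 3 rather than growth that began right after time 1.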
 Anonymous posted on Wednesday, September 14, 2005 - 6:24 pm
I'd like to examine the effect of perceived opportunities on school performance (from ages 9 to 15 - measured each year). I want to determine if at-risk adolescents who perceive that they have limited opportunities are less likely to perform well in school. I'm considering two different models.

1. Model both processes as growth models and correlate the growth factors.

2. Model academic performance as a growth model and regress the residual individual scores of academic performance at each time on the individual scores of perceived opportunities at each time.

I'm wondering if one method should be preferred over the other (or if there is an alternative model that would be best) and how the interpretation of the effects of perceived opportunities on academic performance would differ.
Thank you.
 bmuthen posted on Wednesday, September 14, 2005 - 6:56 pm
Perhaps the perceived opportunities variable doesn't really follow a growth model - perhaps it bounces up and down over time - in which case a better approach might be to have a growth model for performance with perceived opportunities as a time-varying covariate.
 Anonymous posted on Thursday, September 15, 2005 - 5:09 pm
Thank you. With the type of model that you suggest - would I interpret the regression coefficients as the effect of perceived opps on academic performance after adjusting for time?
 bmuthen posted on Thursday, September 15, 2005 - 6:09 pm
Yes, I think it is fair to describe it as adjusting for time - or growth.
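The two alternatives discussed in this exchange can be sketched as two separate MODEL commands. The variable names (perf9-perf15 for school performance, opp9-opp15 for perceived opportunities) are hypothetical, and each fragment would go in its own input file.

```mplus
! Option 1: parallel process growth model with correlated growth factors
MODEL:
  ip sp | perf9@0 perf10@1 perf11@2 perf12@3
          perf13@4 perf14@5 perf15@6;
  io so | opp9@0 opp10@1 opp11@2 opp12@3
          opp13@4 opp14@5 opp15@6;
  ip sp WITH io so;

! Option 2: growth model for performance, opportunities as a tvc
MODEL:
  ip sp | perf9@0 perf10@1 perf11@2 perf12@3
          perf13@4 perf14@5 perf15@6;
  perf9  ON opp9;
  perf10 ON opp10;
  perf11 ON opp11;
  perf12 ON opp12;
  perf13 ON opp13;
  perf14 ON opp14;
  perf15 ON opp15;
```

In Option 2 the regression coefficients describe the effect of perceived opportunities at each age over and above the growth in performance.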
 Dianna Densmore posted on Wednesday, March 29, 2006 - 12:31 pm
Hi,
I have run an analysis for a multiple indicator latent growth curve model for linear growth and then run a subsequent analysis adding a TVC to the model (an unconditional TVC model) where the repeated latent factor is regressed on the TVC for synchronous and lagged effects. My data are ordinal. The models fit the data well. My question is one of interpretation when comparing the results of the models. I've reported the results of the unconditional linear growth model and then have reported the regression coefficients for the unconditional TVC model. My understanding is that the regression coefficients in the TVC model represent the effect of the TVC on the latent factor net of the underlying growth curve, so these are the values on which I've mainly focused. Should I expect that the slope and intercept will differ in the TVC model and how do I interpret this info compared to the unconditional model? The TVC slope and intercept values are slightly different from those of the unconditional no-predictor model. Is this simply a reflection of the model being conditioned on the covariate? And, if so, does the change in intercept and slope provide substantive info that I should report beyond that of the regression coefficients?
Thanks.
 Bengt O. Muthen posted on Wednesday, March 29, 2006 - 6:41 pm
You are right that the TVC effects are net of the effects of the growth factors. Or, put another way, your growth factor effects are net of the TVCs. At TVCs = 0, you see the effect of the growth factors. So, you could center the TVCs so that they have mean zero. Then the means of the growth factors may not change much when including the TVCs.
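A short worked equation may make the centering point concrete. It assumes a linear growth model with time scores $\lambda_t$; the notation is introduced here for illustration and does not come from the thread.

```latex
% Growth model with a time-varying covariate x_{it}
y_{it} = \eta_{0i} + \eta_{1i}\,\lambda_t + \beta_t\,x_{it} + \varepsilon_{it},
\qquad E[\eta_{0i}] = \alpha_0, \quad E[\eta_{1i}] = \alpha_1

% Expected outcome given the covariate
E[y_{it} \mid x_{it}] = \alpha_0 + \alpha_1\,\lambda_t + \beta_t\,x_{it}

% At x_{it} = 0 only the growth part remains
E[y_{it} \mid x_{it} = 0] = \alpha_0 + \alpha_1\,\lambda_t
```

If each TVC is centered so that its mean is zero, then $x_{it} = 0$ corresponds to an average person, and $\alpha_0 + \alpha_1\lambda_t$ is also the overall mean trajectory. That is why the growth factor means may change little when centered TVCs are added.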
 Hyoshin Kim posted on Monday, February 12, 2007 - 7:54 pm
Hi,

In Growth Mixture Modeling in Mplus, would Mplus allow time-varying covariates to estimate the probabilities of class membership?

I understand that time-invariant covariates, not time-varying covariates, are usually used as predictors of the latent class (i.e., estimating probabilities of class membership), and time-varying covariates are used to estimate the trajectories (e.g., intercept and slopes)?

But I wonder if Mplus allows time-varying covariates to estimate both the class membership probabilities and the trajectories, if I include them in the model?

Am I then, in fact, turning to latent transition modeling?

Any comments would be appreciated.
 Bengt O. Muthen posted on Tuesday, February 13, 2007 - 8:10 am
You can let time-varying covariates influence class membership. But this means that class membership probabilities change over time, so you have to consider whether this is what you want for the subject matter at hand. If you do, this may lead to reconceptualizing class membership as time varying, which as you say may lead to a type of latent transition model.
 Kaigang Li posted on Saturday, May 31, 2008 - 10:22 am
Professor Muthens,

I have a basic question about how to determine the number of classes using GMM, since this has confused me when reading GMM-related articles. From my reading I get the idea that GMM starts with determining the number of classes by unconditional LCGA using some model-fit indices (e.g. entropy, AIC, BIC, and LMR-LRT), and then covariates are included to predict the growth factors and modify the membership.

Let's say 4 classes are determined by unconditional LCGA. But when I add the covariates to the model, the number of classes may be reduced to 3 according to those indices.

So how can I determine the number of classes for the final model?

I read the chapter "Second-Generation SEM Growth Analysis", in which 3 or 4 classes based on 8 classes from LCGA are used for GMM. I cannot get a very clear idea of how you determine the number.

In another article of yours, "Integrating Person-Centered and Variable-Centered Analyses: Growth Mixture Modeling With Latent Trajectory Classes", you mentioned "three different considerations are used to decide on the number of latent classes."

Sorry about the long description, basically I am wondering if you could give any comments to clarify the process for determining the number of classes using GMM.

Thanks for your time.

Kaigang
 Bengt O. Muthen posted on Sunday, June 01, 2008 - 12:52 am
Take a look at my 2004 chapter in the Kaplan handbook which you find under Papers, Growth Mixture Modeling on this web site.
 linda beck posted on Tuesday, December 02, 2008 - 3:44 am
I want to examine the influence of parental alcohol disorder (pad) on depression during adolescence across 5 measurement points (LGM).
I want to learn more about the structural effects of 'pad' on depression (time invariant) but also about the time specific effects of the same variable 'pad' on depression (time variant).

1.) How could one disentangle both kinds of effects best? E.g. model both time variant and time invariant effects of 'pad' in separate models or in ONE model?

2.) How would one interpret effects of 'pad' on the intercept and on x1 in the latter model when these effects are modelled together (intercept is centered at t1)?

3.) Is the interpretation correct, that effects of 'pad' on e. g. x2 x3 x4 x5 are 'time specific effects beyond what is explained by systematic change'?
 linda beck posted on Tuesday, December 02, 2008 - 3:54 am
add.: in the time invariant definition, 'pad' is measured at t1.
 Bengt O. Muthen posted on Tuesday, December 02, 2008 - 9:59 am
I think you are saying that a time-invariant covariate pad has both indirect and direct effects on the depression outcome over time. The indirect effects are via pad's influence on the growth factors.

You cannot identify all the direct effects plus the effects on the growth factors. It is like a MIMIC model. I would recommend looking at the modification indices for the model with no direct effects to see if one or two direct effects seem to be needed. The indirect effect via the intercept growth factor has to do with the constant effect on all outcomes, whereas a direct effect is over and above that.
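The recommendation above could be sketched as follows. The names y1-y5 for the five depression measures and pad for the time-invariant covariate are assumptions for illustration, as is the choice of y3 for the commented-out direct effect.

```mplus
MODEL:
  i s | y1@0 y2@1 y3@2 y4@3 y5@4;
  i s ON pad;      ! indirect effects on y1-y5 via the growth factors
  ! y3 ON pad;     ! add a direct effect only if the modification indices call for it
OUTPUT:
  MODINDICES;      ! check whether any direct effects seem to be needed
```

Adding all five direct effects together with i s ON pad would not be identified, so direct effects are added one or two at a time.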
 linda beck posted on Thursday, December 11, 2008 - 8:54 am
I'm not sure if I was explicit enough, sorry! I want to conceptualize "pad" as time invariant (measured at t1) and predict growth curve factors. In addition, I have measured pad also at t2-t5. So there is a possibility to examine time specific effects of pad1-pad5 on y1-y5 (besides the effect of pad1 on growth factors), i.e. to have direct effects on the depression outcome.
a.) Does this conceptualization in general and the model (see below) make sense?

That would be the following model (in principle, since I can't model all direct effects of pad1-pad5, as you said):
i s on pad1;

b.) I guess, I should model both kinds of effects in ONE model, if my conceptualization is o.k.?

I can't use mod-indices to explore possible direct effects of pad1-pad5, since I'm utilizing numerical integration.

c.) Does it make sense then, to model each time-specific effect separately in the model above (e. g. i s on pad1 AND y1 on pad1) and to check if it is significant (and to omit when not) then go to 'i s on pad1 AND y2 on pad2'?
Thanks a lot!
linda beck

p.s. sorry for the long post. I wanted to avoid follow-up posts...
 Bengt O. Muthen posted on Friday, December 12, 2008 - 8:37 am
Note that your statement

gets expanded as

...

You can't have pad1 influence both i, s and y1-y5, but you could say

i s on pad1;
...

Then you can see what's significant.
 linda beck posted on Tuesday, December 16, 2008 - 1:36 am
That was exactly what I meant. Sorry for the wrong syntax! Thank you.
linda beck
 Nicolas M posted on Monday, August 30, 2010 - 1:22 pm
Dear Professors,

I come from a GLMM background and I try to transpose my usual models in the latent growth curve approach. When working with time-varying covariates in a GLMM, I simply write down these equations for "i" individuals and "t" time points :

y_it = gamma(1)_i + TIME_it*gamma(2)_i + beta*x_it + e_it
gamma(1)_i = alpha_0 + psi(1)_i
gamma(2)_i = alpha_1 + psi(2)_i

where gamma(1) is the intercept with an individual variance psi(1), gamma(2) the random slope with an individual variance psi(2) and x_it a time-varying covariate.
For a latent growth curve model with MPLUS, would it be the same as using this syntax:
i s | y1@0 y2@1 y3@2 ...
y1 on x1 (1);
y2 on x2 (1);
y3 on x3 (1);
...
and thus constraining the beta parameter to be the same through time?
I'd like to avoid having one parameter per time point (as in y1 on x1; y2 on x2; y3 on x3;) or having to specify a random slope for the time-varying covariate (st | y1 on x1; st | y2 on x2 ...).

Thank you in advance for your help.
 Linda K. Muthen posted on Tuesday, August 31, 2010 - 10:10 am
You have specified the time-varying covariates correctly.

Regarding the time scores, does each person have the same time score at each time point?
 Nicolas M posted on Tuesday, August 31, 2010 - 11:03 am
Unfortunately no, they don't have the same time scores. They were all interviewed each year, but starting from different ages.
I've already tried to use the TSCORES functionality of MPLUS to specify the time scores for each individual, but then I was forced to use an MLR estimator whose computation was too complicated (12 dimensions of integration, as my dependent variable is actually latent).

I then decided to use a WLSMV estimator, which also allows me (if I'm correct) to estimate the covariance between the residuals of my indicators through time without an additional computational burden.

The final solution I've retained, even if it's not as good as the TSCORES option, is to use this specification :

y_it = gamma(1)_i + TIME_it*gamma(2)_i + beta*x_it + e_it
gamma(1)_i = alpha_0 + alpha_2*STARTING-AGE_i + psi(1)_i
gamma(2)_i = alpha_1 + psi(2)_i
with TIME_it going from 0 to 9 (@0, @1...) and alpha_2 controlling for the different "starting point" of each individual.

This is the only way I could obtain an estimation for this model (I've tried the MLR estimation with TSCORES but it has never converged, even after playing with the number of Monte Carlo integration points and the MCONVERGENCE option. I gave up because the computation was too slow).
 Linda K. Muthen posted on Tuesday, August 31, 2010 - 4:25 pm
You need to use TSCORES. Please send the full output and your license number to support@statmodel.com to see why you are having convergence problems.
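For reference, a minimal sketch of a TSCORES setup combined with a time-varying covariate effect held equal over time. It assumes four observed continuous outcomes y1-y4, covariates x1-x4, and individual ages a1-a4 (all hypothetical names). It does not reproduce the latent outcome from the post above.

```mplus
VARIABLE:
  NAMES = y1-y4 x1-x4 a1-a4;
  TSCORES = a1-a4;          ! individually-varying times of observation
ANALYSIS:
  TYPE = RANDOM;
MODEL:
  i s | y1-y4 AT a1-a4;     ! growth model using each person's own time scores
  y1 ON x1 (1);             ! the shared label (1) holds beta equal over time
  y2 ON x2 (1);
  y3 ON x3 (1);
  y4 ON x4 (1);
```

The equality label plays the role of the single beta in the equations written earlier in this exchange.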
 davide morselli posted on Friday, September 16, 2011 - 12:20 am
Hello
I have a very basic question.
I have a longitudinal study with 12 waves, my DV is measured at each wave, but my IV only at three time points. I want to test whether the change (slope1) in the DV can predict the change (slope2) in the IV.
My question is: Does it make sense to consider slope1 in all 12 time points or would it be better to consider it only at the same three time points of the IV?
Furthermore, I read it would be better to have at least 4 time points, but I did not really understand why. Could you give me some hints?
Thanks
Davide
 davide morselli posted on Friday, September 16, 2011 - 5:01 am
Just one more question: how should I interpret the mean of the slope when it goes from significant to non-significant once I regress it on some covariates?
Thank you!!
 Bengt O. Muthen posted on Saturday, September 17, 2011 - 11:04 am
The slope that is the IV should not be formed from time points later than those used for the DV slope.

When you regress slope on other variables, the parameter is no longer a mean but an intercept. The intercept can be non-sig while the mean is sig. The mean is given in Tech4 when you regress the slope on other variables.
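The distinction between the intercept and the mean can be sketched as follows, assuming a hypothetical three-wave growth model with covariates x1 and x2.

```mplus
MODEL:
  i s | y1@0 y2@1 y3@2;
  s ON x1 x2;    ! [s] is now an intercept: the expected slope when x1 = x2 = 0
OUTPUT:
  TECH4;         ! estimated means of the latent variables, including s
```

The mean of s equals its intercept plus each regression coefficient times the mean of the corresponding covariate. The intercept and the mean therefore coincide only when those products sum to zero, for example when the covariates are centered.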
 Carolyn Tompsett posted on Monday, November 07, 2011 - 5:51 am
I am using TSCORES to account for individually-varying times of data collection across 7 waves of data. When someone was missing a time point I substituted a mean "days" variable for the TSCORE to avoid them being kicked out of all analyses. Now I have added a time-varying covariate, and everyone who is missing a single wave of data is kicked out of all analyses. Is there any way to avoid this, besides just mean substitution?
 Bengt O. Muthen posted on Monday, November 07, 2011 - 8:47 pm
Yes, this can be avoided. Instead of scoring the time-varying covariate as missing at this time point for this individual, score it as something else (anything will do). This will not influence the computations because the DV has a missing data flag at that time point for that individual, so it won't have an effect.
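One way to apply this suggestion inside Mplus is sketched below. It assumes the DEFINE command in your version accepts the _MISSING keyword in conditions, and it uses hypothetical names y3 and x3 for the outcome and the time-varying covariate at wave 3. The replacement value 0 is arbitrary.

```mplus
DEFINE:
  ! recode the covariate only where the outcome is also missing,
  ! so the replacement value cannot influence the estimates
  IF (y3 EQ _MISSING AND x3 EQ _MISSING) THEN x3 = 0;
  ! repeat for the other waves
```

The same recoding can be done in the data file before reading it into Mplus. Cases where the covariate is missing but the outcome is observed are not covered by this reasoning.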
 EFried posted on Friday, February 24, 2012 - 2:05 am
(1)
I want to explore in a GMM with 5 measurement points whether certain latent classes of my dependent depression variable seem to be "vulnerability" or "resilience" classes. I want to use baseline covariates to predict this.

If I understand MPLUS correctly, even with the
c ON x1 x2 x3;
statement, all that MPLUS can tell me is that covariate e.g. Neuroticism might have a significant effect on intercept and slope of class 1 but not on class 2, but that doesn't tell me if people in class 1 actually have a higher Neuroticism, right?

What would you recommend for this - running ANOVAs in a second step after the GMM?

Is there a good paper in which covariates are successfully used to predict latent classes in MPLUS?

(2)
First one should find a good model without covariates, and then add them. However, the c ON x1 x2 x3; command is bound to change the class solution, right? I use starting values found in models without covariates when adding covariates, but does this make sense when using the c ON x statement?

Thank you so much
 Linda K. Muthen posted on Friday, February 24, 2012 - 11:18 am
1. The c ON x regression predicts class membership. The i s ON x regression predicts variability in the intercept and slope growth factors. If you want to see if the mean values of your covariates vary across classes use the e setting of the AUXILIARY option. See the Growth Mixture Modeling section in Papers on the website.

2. This is likely due to the need for direct effects between the covariates and the outcomes. See on the website:

Muthén, B. (2004). Latent variable analysis: Growth mixture modeling and related techniques for longitudinal data. In D. Kaplan (ed.), Handbook of quantitative methodology for the social sciences (pp. 345-368). Newbury Park, CA: Sage Publications.
 EFried posted on Sunday, February 26, 2012 - 7:12 am
1) Thanks!

2) The paper focuses on distal outcomes, I am only working with 5 equidistant repeated measurement depression scores which are normal dependent variables. Still, adding covariates and especially the c ON x; statement sometimes drastically changes classes.

3) In the forum and handbook, it is usually recommended to first fit models without covariates and then add them step by step, fixing the classes if need be to keep them the same. The paper you mention above (Muthén, 2004) states the opposite though:
"It should not be expected that the class distribution or individual classification remains the same when adding covariates. "
"The fact that the "unconditional model" without covariates is not necessarily the most suitable for finding the number of classes has not been fully appreciated [...]. An important part of GGMM is the prediction of class membership probabilities from covariates."

I'm not sure which method to use, both seem to make sense.

Thank you, your input is highly appreciated
Efried
 Linda K. Muthen posted on Sunday, February 26, 2012 - 8:54 am
The current thinking is to find the number of classes without adding covariates. I would not recommend fixing the classes. If adding covariates changes the classes, the need for direct effects should be examined.
 EFried posted on Sunday, February 26, 2012 - 12:57 pm
Thank you. Regarding the auxiliary command (e), with which I'd like to compare whether my covariates differ between classes,

ANALYSIS:
type = MIXTURE;
auxiliary = x1 x2 (e);
MODEL:
%OVERALL%
i s q | y0@0 y1@1 y2@2 y3@3 y4@4;
q@0;
i s ON x1 x2;
c ON x1 x2;

causes the "auxiliary: unknown option" error. I find only one example about auxiliary (e) in the handbook and am not sure what I'm doing wrong.
 Linda K. Muthen posted on Sunday, February 26, 2012 - 2:33 pm
The AUXILIARY option belongs in the VARIABLE command.
 EFried posted on Monday, February 27, 2012 - 4:26 am
Sorry, that was a stupid mistake to make.

Pasting the line into the VARIABLE part gives me the following error:

NAMES ARE x1 y0-y4;
USEVAR = x1 y0-y4;
missing=all(-999);
classes=c(2);
auxiliary=x1 (e);
ANALYSIS:
type = MIXTURE;
MODEL:
%OVERALL%
i s | y0@0 y1@1 y2@2 y3@3 y4@4;
i s ON x1;
c ON x1;

*** ERROR in MODEL command
Unknown variable(s) in an ON statement: x1

Without "auxiliary=x1 (e);" x1 is detected properly so the code works. Maybe I'm using the statement wrong?
 Linda K. Muthen posted on Monday, February 27, 2012 - 6:06 am
You cannot use x1 in the model and also as an auxiliary variable.
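Based on the two replies above, a setup that avoids the error might look like the sketch below. It reuses the variable names from the posted input and leaves x1 out of both USEVARIABLES and the MODEL command.

```mplus
VARIABLE:
  NAMES = x1 y0-y4;
  USEVARIABLES = y0-y4;     ! x1 is left out of the analysis model
  MISSING = ALL (-999);
  CLASSES = c(2);
  AUXILIARY = x1 (e);       ! compare the mean of x1 across classes
ANALYSIS:
  TYPE = MIXTURE;
MODEL:
  %OVERALL%
  i s | y0@0 y1@1 y2@2 y3@3 y4@4;
```

Here x1 is used only to compare classes after estimation, so it does not predict class membership or the growth factors.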
 EFried posted on Tuesday, February 28, 2012 - 1:44 am
I would like to find out - in my final model, and that includes covariates - how the classes differ from each other.
Important aspects are: which class has more females, which class has higher neuroticism, and whether a genetic polymorphism is more common in one class than in the others.

For this question, auxiliary (e) is not helpful if I cannot include my covariates in the model predicting classes.

Is there a way to solve this in MPLUS? How do other people do this? Nearly all papers I find for GMM are descriptive and rarely add covariates, and the few I found seem to never use the c ON x1 x2; command to predict class membership with covariates.

Thank you!
 Linda K. Muthen posted on Wednesday, February 29, 2012 - 6:12 pm
See TECH4.
 Lisa posted on Wednesday, April 24, 2013 - 3:33 am
Hi

I would like to add a time varying control variable to my LGM. I have four waves of data and the control variable is a single item continuous variable (7pt Likert) measured at each time point. Could you please advise me on the best way to add this control to my LGM?

Thank you!
 Linda K. Muthen posted on Wednesday, April 24, 2013 - 6:25 am
See Example 6.12.
 Annie Robitaille posted on Wednesday, May 08, 2013 - 11:17 am
I am running a time-varying covariate model on data spanning from 2006 to 2012. Data is available for each individual when they first enter long-term care and then every 3 months following admission in long-term care. However, some entered in 2006 and therefore have many time points whereas others entered later (e.g. 2011) and have fewer time points. I was treating entry in long-term care as time 0 (baseline) for all individuals, regardless of the actual calendar date of entry. Is that ok? By setting up my data this way some have 24 time points whereas others have fewer because they entered at a later date. Can I still use FIML to treat missing values for those who have fewer time points and therefore have missing values for the later time points or should I be using a missing by design method instead? Is that possible with Mplus?
Thank you in advance for your help.
 Linda K. Muthen posted on Thursday, May 09, 2013 - 9:48 am
This sounds correct and you can still use FIML.
 Jing Zhang posted on Thursday, August 15, 2013 - 11:34 am
I am fitting a long format longitudinal model. My question is whether I can add time-varying dichotomous covariates at the “within level”?
(MD is dichotomous.)

variable: Names are subid MRN wave SEX MD pebdtot;
USEVARIABLES ARE wave pebdtot;
MISSING = ALL (-999999);
cluster = MRN;
within = wave;
analysis: TYPE = TWOLEVEL RANDOM;
model:
%within%
s | pebdtot on wave;
pebdtot ON MD;
%BETWEEN%
pebdtot ON SEX
pebdtot with s;
OUTPUT: SAMPSTAT STANDARDIZED;
Thank you very much,
Jing
 Bengt O. Muthen posted on Thursday, August 15, 2013 - 1:31 pm
Yes, like in UG ex 9.16. The MD variable should also be on the within list. And SEX on the between list.
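Applying the two corrections from the reply to the posted input gives roughly the following sketch. Adding MD and SEX to USEVARIABLES, the semicolon after "pebdtot ON SEX", and dropping STANDARDIZED from the OUTPUT command are further changes assumed here, not taken from the reply.

```mplus
VARIABLE:
  NAMES = subid MRN wave SEX MD pebdtot;
  USEVARIABLES = wave SEX MD pebdtot;
  MISSING = ALL (-999999);
  CLUSTER = MRN;
  WITHIN = wave MD;         ! time-varying variables
  BETWEEN = SEX;            ! person-level variable
ANALYSIS:
  TYPE = TWOLEVEL RANDOM;
MODEL:
  %WITHIN%
  s | pebdtot ON wave;      ! random slope for time
  pebdtot ON MD;            ! time-varying dichotomous covariate
  %BETWEEN%
  pebdtot ON SEX;
  pebdtot WITH s;
OUTPUT:
  SAMPSTAT;
```

In long format each row is one person-wave, so MD varies within MRN and belongs on the WITHIN list, while SEX is constant within MRN and belongs on the BETWEEN list.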
 T Yilmaz posted on Monday, June 30, 2014 - 2:31 am
I want to explain growth trajectory parameters (intercept, slope, acceleration) with my time-varying and time-invariant variables. All the sources suggest that I can put time-varying variables at Level 1. But then I will not be able to explain my growth parameters with my time-varying variables. Is there a way to solve this problem?
 Linda K. Muthen posted on Monday, June 30, 2014 - 10:17 am
I think you have your data in long format. If you put it in wide format, you will have more flexibility. Then you can use the time-varying covariates at both levels.
 Julia Sander posted on Monday, February 15, 2016 - 5:18 am
Hi,

I ran a growth curve model using a cohort-sequential longitudinal design with four waves and a time-varying covariate (TVC).

I used the tscores command [Mplus Version 7] to model the lifespan development of frequency of peer visits (vfriends) and controlled for the time-varying covariate employment status (work: 0 = unemployed, 1 = employed).

Age was centered at 50 years and multiplied by 0.01 in order to avoid numerically small estimates. Age range of the sample is 17 to 85 years.

The model with a linear, quadratic, and cubic slope converged, but estimated means for intercept (I), linear slope (S), quadratic slope (Q), and cubic slope (C) seem to be illogical with values off the charts, especially for mean of Q and C, where variance had to be set to zero for convergence.

In order to double-check I recoded the employment variable (work_rec). This time 1 equaled "unemployed" and 0 equaled "employed". Estimated parameters seem to be more comprehensible now.

I have no idea why this simple recoding of a dichotomous variable results in such a different model outcome (model fit parameters changed too). I somehow expected just a mirrored trajectory after recoding the binary TVC...?!

I would be very grateful for any hint. Thank you very much in advance!
Julia
 Linda K. Muthen posted on Monday, February 15, 2016 - 6:41 am
Please send the two outputs and your license number to support@statmodel.com. Include SAMPSTAT in both outputs.