
Anonymous posted on Wednesday, September 25, 2002  12:41 pm



I am considering how many time-varying covariates to include in my model. In Example 22.1c of the manual, X3 is one time-varying covariate affecting Y1. Its effect on Y1 was modeled separately in each time period (i.e., x31 on y11, x32 on y12, x33 on y13, etc.). Is it possible to include more than one time-varying covariate? Most of the examples in the Mplus manual and the workshop use one time-varying covariate. What are the issues that one should consider in deciding how many time-varying covariates to include? Thank you for your time. 

bmuthen posted on Friday, September 27, 2002  9:58 am



It is possible to use many time-varying covariates (tvc's) at each time point. You should use the same considerations as in regular regression, that is, to avoid multicollinearity the tvc's should not be too highly correlated. You might want to center the tvc's so that they have zero means, which makes interpretations easier. 
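
A minimal sketch of what two time-varying covariates per time point could look like in Mplus input, assuming four occasions and hypothetical variable names (y1-y4 for the outcome, a1-a4 and b1-b4 for the covariates). The DEFINE-based centering assumes a recent Mplus version; older versions used a CENTERING option in the VARIABLE command instead.

```
DEFINE:   CENTER a1-a4 b1-b4 (GRANDMEAN);   ! hypothetical names; gives the tvc's zero means
MODEL:    i s | y1@0 y2@1 y3@2 y4@3;        ! linear growth, centered at time 1
          y1 ON a1 b1;                      ! two tvc's at each time point
          y2 ON a2 b2;
          y3 ON a3 b3;
          y4 ON a4 b4;
```

Requesting OUTPUT: SAMPSTAT; and inspecting the correlations among the covariates before fitting is one way to screen for the multicollinearity mentioned above.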

Silvia posted on Friday, April 30, 2004  3:43 pm



Hi Bengt and Linda, I am running a parallel process growth model with time-invariant and time-varying covariates. In my model the time-varying covariates predict the indicators for both growth variables. The modification indices suggest a relationship between one of the time-varying covariates and the slope factor of one of the parallel process variables. This would imply that drinking at time 3 is related to the overall slope of abstinence. Would this make any sense, and is it mathematically acceptable? Thanks, Silvia 

bmuthen posted on Friday, April 30, 2004  7:10 pm



If you center at time 1, then the slope factor concerns growth starting right after time 1, which means that influence from a time-varying covariate at time 3 is questionable. Perhaps it means that the growth is accelerated at time 3, but then you would need a piecewise model. 
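
A piecewise model along these lines might be sketched as follows, assuming five time points, hypothetical outcome names y1-y5, a hypothetical time 3 covariate x3, and a change in growth rate after time 3. The time scores are an illustration and are not taken from the poster's model.

```
MODEL:  i s1 | y1@0 y2@1 y3@2 y4@2 y5@2;   ! first piece: growth up to time 3
        i s2 | y1@0 y2@0 y3@0 y4@1 y5@2;   ! second piece: growth after time 3
        s2 ON x3;                          ! time 3 covariate predicts only the later growth
```

With this setup the time 3 covariate influences growth that starts at or after time 3, which avoids the time-ordering problem of predicting a slope that begins at time 1.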

Anonymous posted on Wednesday, September 14, 2005  6:24 pm



I'd like to examine the effect of perceived opportunities on school performance (from ages 9 to 15, measured each year). I want to determine if at-risk adolescents who perceive that they have limited opportunities are less likely to perform well in school. I'm considering two different models. 1. Model both processes as growth models and correlate the growth factors. 2. Model academic performance as a growth model and regress the residual individual scores of academic performance at each time on the individual scores of perceived opportunities at each time. I'm wondering if one method should be preferred over the other (or if there is an alternative model that would be best) and how the interpretation of the effects of perceived opportunities on academic performance would differ. Thank you. 

bmuthen posted on Wednesday, September 14, 2005  6:56 pm



Perhaps the perceived opportunities variable doesn't really follow a growth model; perhaps it bounces up and down over time, in which case a better approach might be to have a growth model for performance with perceived opportunities as a time-varying covariate. 

Anonymous posted on Thursday, September 15, 2005  5:09 pm



Thank you. With the type of model that you suggest, would I interpret the regression coefficients as the effect of perceived opps on academic performance after adjusting for time? 

bmuthen posted on Thursday, September 15, 2005  6:09 pm



Yes, I think it is fair to describe it as adjusting for time, or growth. 


Hi, I have run an analysis for a multiple indicator latent growth curve model for linear growth and then run a subsequent analysis adding a TVC to the model (an unconditional TVC model) where the repeated latent factor is regressed on the TVC for synchronous and lagged effects. My data are ordinal. The models fit the data well. My question is one of interpretation when comparing the results of the models. I've reported the results of the unconditional linear growth model and then have reported the regression coefficients for the unconditional TVC model. My understanding is that the regression coefficients in the TVC model represent the effect of the TVC on the latent factor net of the underlying growth curve, so these are the values on which I've mainly focused. Should I expect that the slope and intercept will differ in the TVC model, and how do I interpret this info compared to the unconditional model? The TVC slope and intercept values are slightly different from those of the unconditional no-predictor model. Is this simply a reflection of the model being conditioned on the covariate? And, if so, does the change in intercept and slope provide substantive info that I should report beyond that of the regression coefficients? Thanks. 


You are right that the TVC effects are net of the effects of the growth factors. Or, put another way, your growth factor effects are net of the TVCs. At TVCs = 0, you see the effect of the growth factors. So, you could center the TVCs so that they have mean zero. Then the means of the growth factors may not change much when including the TVCs. 

Hyoshin Kim posted on Monday, February 12, 2007  7:54 pm



Hi, In Growth Mixture Modeling in Mplus, would Mplus allow time-varying covariates to estimate the probabilities of class membership? I understand that time-invariant covariates, not time-varying covariates, are usually used as predictors of the latent class (i.e., estimating probabilities of class membership), and time-varying covariates are used to estimate the trajectories (e.g., intercept and slopes). But I wonder if Mplus allows time-varying covariates to estimate both the class membership probabilities and the trajectories, if I include them in the model? Am I then, in fact, turning to latent transition modeling? Any comments would be appreciated. 


You can let time-varying covariates influence class membership. But this means that class membership probabilities change over time, so you have to consider whether this is what you want for the subject matter at hand. If you do, this may lead to reconceptualizing class membership as time varying, which as you say may lead to a type of latent transition model. 

Kaigang Li posted on Saturday, May 31, 2008  10:22 am



Professors Muthén, I have a basic question about how to determine the number of classes using GMM, since I have been confused by this when reading GMM-related articles. From my reading, I get the idea that testing a GMM starts with determining the number of classes by unconditional LCGA using some model-fit indices (e.g., entropy, AIC, BIC, and LMR-LRT), and then covariates are included to predict the growth factors and modify the membership. Let's say 4 classes are determined by unconditional LCGA. But when I add the covariates to the model, the classes may be reduced to 3 in terms of those indices. So how can I determine the number of classes for the final model? I read the chapter "Second-Generation SEM Growth Analysis", where 3 or 4 classes based on 8 classes from LCGA are used for GMM. I cannot get a very clear idea of how you determine the number. In another article of yours, "Integrating Person-Centered and Variable-Centered Analyses: Growth Mixture Modeling With Latent Trajectory Classes", you mentioned "three different considerations are used to decide on the number of latent classes." Sorry about the long description; basically I am wondering if you could give any comments to clarify the process for determining the number of classes using GMM. Thanks for your time. Kaigang 


Take a look at my 2004 chapter in the Kaplan handbook which you find under Papers, Growth Mixture Modeling on this web site. 

linda beck posted on Tuesday, December 02, 2008  3:44 am



I want to examine the influence of parental alcohol disorder (pad) on depression during adolescence across 5 measurement points (LGM). I want to learn more about the structural effects of 'pad' on depression (time invariant) but also about the time-specific effects of the same variable 'pad' on depression (time variant). 1.) How could one best disentangle both kinds of effects? E.g., model both time-variant and time-invariant effects of 'pad' in separate models or in ONE model? 2.) How would one interpret effects of 'pad' on the intercept and on x1 in the latter model when these effects are modelled together (intercept is centered at t1)? 3.) Is the interpretation correct that effects of 'pad' on e.g. x2 x3 x4 x5 are 'time-specific effects beyond what is explained by systematic change'? 

linda beck posted on Tuesday, December 02, 2008  3:54 am



add.: in the time invariant definition, 'pad' is measured at t1. 


I think you are saying that a time-invariant covariate pad has both indirect and direct effects on the depression outcome over time. The indirect effects are via pad's influence on the growth factors. You cannot identify all the direct effects plus the effects on the growth factors. It is like a MIMIC model. I would recommend looking at the modification indices for the model with no direct effects to see if one or two direct effects seem to be needed. The indirect effect via the intercept growth factor has to do with the constant effect on all outcomes, whereas a direct effect is over and above that. 

linda beck posted on Thursday, December 11, 2008  8:54 am



I'm not sure if I was explicit enough, sorry! I want to conceptualize "pad" as time invariant (measured at t1) and predict growth curve factors. In addition, I have measured pad also at t2-t5. So there is a possibility to examine time-specific effects of pad1-pad5 on y1-y5 (besides the effect of pad1 on growth factors), i.e. to have direct effects on the depression outcome. a.) Does this conceptualization in general and the model (see below) make sense? That would be the following model (in principle, since I can't model all direct effects of pad1-pad5, as you said): i s on pad1; y1-y5 on pad1-pad5; b.) I guess I should model both kinds of effects in ONE model, if my conceptualization is o.k.? I can't use modindices to explore possible direct effects of pad1-pad5, since I'm utilizing numerical integration. c.) Does it make sense, then, to model each time-specific effect separately in the model above (e.g. i s on pad1 AND y1 on pad1), to check if it is significant (and to omit it when not), and then go to 'i s on pad1 AND y2 on pad2'? Thanks a lot! linda beck p.s. sorry for the long post. I wanted to avoid follow-up posts... 


Note that your statement y1-y5 on pad1-pad5; gets expanded as y1 on pad1-pad5; ... y5 on pad1-pad5; You can't have pad1 influence both i, s and y1-y5, but you could say i s on pad1; y1 on pad1; y2 on pad2; ... y5 on pad5; Then you can see what's significant. 

linda beck posted on Tuesday, December 16, 2008  1:36 am



That was exactly what I meant. Sorry for the wrong syntax! Thank you. linda beck 

Nicolas M posted on Monday, August 30, 2010  1:22 pm



Dear Professors, I come from a GLMM background and I am trying to transpose my usual models to the latent growth curve approach. When working with time-varying covariates in a GLMM, I simply write down these equations for "i" individuals and "t" time points:
y_it = gamma(1)_i + TIME_it*gamma(2)_i + beta*x_it + e_it
gamma(1)_i = alpha_0 + psi(1)_i
gamma(2)_i = alpha_1 + psi(2)_i
where gamma(1) is the intercept with an individual variance psi(1), gamma(2) the random slope with an individual variance psi(2), and x_it a time-varying covariate. For a latent growth curve model with Mplus, would it be the same as using this syntax:
i s | y1@0 y2@1 y3@2 ...
y1 on x1 (1); y2 on x2 (1); y3 on x3 (1); ...
and thus constraining the beta parameter to be the same through time? I'd like to avoid having one parameter per time point (as in y1 on x1; y2 on x2; y3 on x3;) or having to specify a random slope for the time-varying covariate (st | y1 on x1; st | y2 on x2 ...). Thank you in advance for your help. 


You have specified the time-varying covariates correctly. Regarding the time scores, does each person have the same time score at each time point? 

Nicolas M posted on Tuesday, August 31, 2010  11:03 am



Unfortunately no, they don't have the same time scores. They were all interviewed each year, but starting from different ages. I've already tried to use the TSCORES functionality of Mplus to specify the time scores for each individual, but then I was forced to use an MLR estimator whose computation was too complicated (12 dimensions of integration, as my dependent variable is actually latent). I then decided to use a WLSMV estimator, which also allows me (if I'm correct) to estimate the covariance between the residuals of my indicators through time without an additional computational burden. The final solution I've retained, even if it's not as good as the TSCORES option, is to use this specification:
y_it = gamma(1)_i + TIME_it*gamma(2)_i + beta*x_it + e_it
gamma(1)_i = alpha_0 + alpha_2*STARTINGAGE_i + psi(1)_i
gamma(2)_i = alpha_1 + psi(2)_i
with TIME_it going from 0 to 9 (@0, @1...) and alpha_2 controlling for the different "starting point" of each individual. This is the only way I could obtain an estimation for this model (I've tried the MLR estimation with TSCORES but it has never converged, even after playing with the number of Monte Carlo integration points and the MCONVERGENCE option. I gave up because the computation was too slow). 


You need to use TSCORES. Please send the full output and your license number to support@statmodel.com to see why you are having convergence problems. 
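
For readers unfamiliar with the option, a minimal sketch of a TSCORES setup with an equality-constrained time-varying covariate effect is given below. It assumes four occasions, observed outcomes y1-y4 (the poster's outcome is latent, which is what makes his case computationally heavy), and hypothetical variables a1-a4 holding each person's time score at each occasion.

```
VARIABLE: NAMES = y1-y4 x1-x4 a1-a4;    ! hypothetical names; a1-a4 are individual time scores
          TSCORES = a1-a4;
ANALYSIS: TYPE = RANDOM;
MODEL:    i s | y1-y4 AT a1-a4;         ! individually-varying times of observation
          y1 ON x1 (1);                 ! shared label (1) = one tvc effect across time
          y2 ON x2 (1);
          y3 ON x3 (1);
          y4 ON x4 (1);
```

The AT form replaces the fixed @0, @1, ... time scores, so people who start at different ages no longer need to share the same time metric.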


Hello, I have a very basic question. I have a longitudinal study with 12 waves; my DV is measured at each wave, but my IV only at three time points. I want to test whether the change (slope1) in the DV can predict the change (slope2) in the IV. My question is: does it make sense to consider slope1 at all 12 time points, or would it be better to consider it only at the same three time points as the IV? Furthermore, I read it would be better to have at least 4 time points, but I did not really understand why; could you give me some hints? Thanks, Davide 


Just one more question: how should I interpret the mean of the slope when it goes from significant to nonsignificant once I regress it on some covariates? Thank you!! 


The slope that is the IV should not be formed from time points later than those used for the DV slope. When you regress the slope on other variables, the parameter is no longer a mean but an intercept. The intercept can be nonsignificant while the mean is significant. The mean is given in TECH4 when you regress the slope on other variables. 
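
As a small illustration of the last point, assuming four occasions and a hypothetical covariate w, the input below regresses the slope on w and requests TECH4 so that the model-estimated mean of the slope can be read from the output.

```
MODEL:  i s | y1@0 y2@1 y3@2 y4@3;
        s ON w;          ! the reported [s] parameter is now an intercept, not a mean
OUTPUT: TECH4;           ! estimated means and covariances of the latent variables
```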


I am using TSCORES to account for individually-varying times of data collection across 7 waves of data. When someone was missing a time point I substituted a mean "days" variable for the TSCORE to avoid them being kicked out of all analyses. Now I have added a time-varying covariate, and everyone who is missing a single wave of data is kicked out of all analyses. Is there any way to avoid this, besides just mean substitution? 


Yes, this can be avoided. Instead of scoring the time-varying covariate as missing at this time point for this individual, score it as something else (anything will do). This will not influence the computations because the DV has a missing data flag at that time point for that individual, so it won't have an effect. 

EFried posted on Friday, February 24, 2012  2:05 am



(1) I want to explore in a GMM with 5 measurement points whether certain latent classes of my dependent depression variable seem to be "vulnerability" or "resilience" classes. I want to use baseline covariates to predict this. If I understand Mplus correctly, even with the c ON x1 x2 x3; statement, all that Mplus can tell me is that a covariate, e.g. Neuroticism, might have a significant effect on the intercept and slope of class 1 but not of class 2, but that doesn't tell me if people in class 1 actually have higher Neuroticism, right? What would you recommend for this: running ANOVAs in a second step after the GMM? Is there a good paper in which covariates are successfully used to predict latent classes in Mplus? (2) First one should find a good model without covariates, and then add them. However, the c ON x1 x2 x3; command is bound to change the class solution, right? I use starting values found in models without covariates when adding covariates, but does this make sense when using the c ON x statement? Thank you so much 


1. The c ON x regression predicts class membership. The i s ON x regression predicts variability in the intercept and slope growth factors. If you want to see if the mean values of your covariates vary across classes, use the e setting of the AUXILIARY option. See the Growth Mixture Modeling section in Papers on the website. 2. This is likely due to the need for direct effects between the covariates and the outcomes. See on the website: Muthén, B. (2004). Latent variable analysis: Growth mixture modeling and related techniques for longitudinal data. In D. Kaplan (ed.), Handbook of quantitative methodology for the social sciences (pp. 345-368). Newbury Park, CA: Sage Publications. 

EFried posted on Sunday, February 26, 2012  7:12 am



1) Thanks! 2) The paper focuses on distal outcomes; I am only working with 5 equidistant repeated measurement depression scores, which are normal dependent variables. Still, adding covariates, and especially the c ON x; statement, sometimes drastically changes classes. 3) In the forum and handbook, it is usually recommended to first fit models without covariates and then add them step by step, fixing the classes if need be to keep them the same. The paper you mention above (Muthén, 2004) states the opposite though: "It should not be expected that the class distribution or individual classification remains the same when adding covariates." "The fact that the "unconditional model" without covariates is not necessarily the most suitable for finding the number of classes has not been fully appreciated [...]. An important part of GGMM is the prediction of class membership probabilities from covariates." I'm not sure which method to use; both seem to make sense. Thank you, your input is highly appreciated. Efried 


The current thinking is to find the number of classes without adding covariates. I would not recommend fixing the classes. If adding covariates changes the classes, the need for direct effects should be examined. 

EFried posted on Sunday, February 26, 2012  12:57 pm



Thank you. Regarding the auxiliary setting (e), with which I'd like to check whether my covariates differ between classes, the following input:
ANALYSIS: type = MIXTURE; auxiliary = x1 x2 (e);
MODEL: %OVERALL% i s q | y0@0 y1@1 y2@2 y3@3 y4@4; q@0; i s ON x1 x2; c ON x1 x2;
causes the "auxiliary: unknown option" error. I find only one example about auxiliary (e) in the handbook and am not sure what I'm doing wrong. 


The AUXILIARY option belongs in the VARIABLE command. 

EFried posted on Monday, February 27, 2012  4:26 am



Sorry, that was a stupid mistake to make. Pasting the line into the VARIABLE command gives me the following error:
NAMES ARE x1 y0-y4; USEVAR = x1 y0-y4; missing=all(999); classes=c(2); auxiliary=x1 (e);
ANALYSIS: type = MIXTURE;
MODEL: %OVERALL% i s | y0@0 y1@1 y2@2 y3@3 y4@4; i s ON x1; c ON x1;
*** ERROR in MODEL command Unknown variable(s) in an ON statement: x1
Without "auxiliary=x1 (e);" x1 is detected properly, so the code works. Maybe I'm using the statement wrong? 


You cannot use x1 in the model and also as an auxiliary variable. 
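
A sketch of one way to set this up, assuming the only goal is to compare the mean of x1 across classes: x1 is left off USEVARIABLES and out of the MODEL command and is listed only as an auxiliary variable. Variable names follow the post above.

```
VARIABLE: NAMES = x1 y0-y4;
          USEVARIABLES = y0-y4;          ! x1 is not an analysis variable here
          MISSING = ALL (999);
          CLASSES = c(2);
          AUXILIARY = x1 (e);            ! test equality of x1 means across classes
ANALYSIS: TYPE = MIXTURE;
MODEL:    %OVERALL%
          i s | y0@0 y1@1 y2@2 y3@3 y4@4;
```

The trade-off is that x1 then no longer predicts class membership or the growth factors, so the classes are formed without it.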

EFried posted on Tuesday, February 28, 2012  1:44 am



I would like to find out, in my final model (which includes covariates), how the classes differ from each other. One important aspect is: which class has more females, which class has higher neuroticism, and is a genetic polymorphism more common in one class than in the others? For this question, auxiliary (e) is not helpful if I cannot include my covariates in the model predicting classes. Is there a way to solve this in Mplus? How do other people do this? Nearly all papers I find for GMM are descriptive and rarely add covariates, and the few I found seem to never use the c ON x1 x2; command to predict class membership with covariates. Thank you! 


See TECH4. 

Lisa posted on Wednesday, April 24, 2013  3:33 am



Hi, I would like to add a time-varying control variable to my LGM. I have four waves of data and the control variable is a single-item continuous variable (7-point Likert) measured at each time point. Could you please advise me on the best way to add this control to my LGM? Thank you! 


See Example 6.12. 


I am running a time-varying covariate model on data spanning from 2006 to 2012. Data are available for each individual when they first enter long-term care and then every 3 months following admission to long-term care. However, some entered in 2006 and therefore have many time points, whereas others entered later (e.g. 2011) and have fewer time points. I was treating entry into long-term care as time 0 (baseline) for all individuals, regardless of the actual calendar date of entry. Is that OK? By setting up my data this way, some have 24 time points whereas others have fewer because they entered at a later date. Can I still use FIML to treat missing values for those who have fewer time points and therefore have missing values for the later time points, or should I be using a missing-by-design method instead? Is that possible with Mplus? Thank you in advance for your help. 


This sounds correct and you can still use FIML. 

Jing Zhang posted on Thursday, August 15, 2013  11:34 am



I am fitting a long-format longitudinal model. My question is whether I can add time-varying dichotomous covariates at the “within level”? (MD is dichotomous.)
variable: Names are subid MRN wave SEX MD pebdtot; USEVARIABLES ARE wave pebdtot; MISSING = ALL (999999); cluster = MRN; within = wave;
analysis: TYPE = TWOLEVEL RANDOM;
model: %within% s | pebdtot on wave; pebdtot ON MD; %BETWEEN% pebdtot ON SEX; pebdtot with s;
OUTPUT: SAMPSTAT STANDARDIZED;
Thank you very much, Jing 


Yes, like in UG ex 9.16. The MD variable should also be on the within list. And SEX on the between list. 
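
Putting that advice together, the input might look like the sketch below. Adding MD and SEX to USEVARIABLES is an assumption (they appear in the model, so they would need to be analysis variables); everything else follows the post above.

```
VARIABLE: NAMES = subid MRN wave SEX MD pebdtot;
          USEVARIABLES = wave MD SEX pebdtot;   ! assumption: MD and SEX added here
          MISSING = ALL (999999);
          CLUSTER = MRN;
          WITHIN = wave MD;                     ! time-varying, measured within person
          BETWEEN = SEX;                        ! time-invariant, person level
ANALYSIS: TYPE = TWOLEVEL RANDOM;
MODEL:    %WITHIN%
          s | pebdtot ON wave;                  ! random slope for time
          pebdtot ON MD;                        ! time-varying covariate effect
          %BETWEEN%
          pebdtot ON SEX;
          pebdtot WITH s;
```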

T Yilmaz posted on Monday, June 30, 2014  2:31 am



I want to explain growth trajectory parameters (intercept, slope, acceleration) with my time-varying and time-invariant variables. All the sources suggest that I can put time-varying variables at Level 1. But then I will not be able to explain my growth parameters with my time-varying variables. Is there a way to solve this problem? 


I think you have your data in long format. If you put it in wide format, you will have more flexibility. Then you can use the time-varying covariates at both levels. 
