The topic of cross-lagged panel modeling has come up a couple of times recently. I recommend the article by Hamaker et al. (2015) in Psychological Methods, "A critique of the cross-lagged panel model." Here is Hamaker's Mplus input for the proposed RI-CLPM in Figure 1:
! Constrain the measurement error variances to zero
y1-x4@0;
! Optional: Constrain observed means per variable over time
! [x1 x2 x3 x4] (mx);
! [y1 y2 y3 y4] (my);
! Specify the lagged effects between the within-person centered variables
! Optional: constrain them to be invariant over time
cx2 ON cy1 cx1;
cx3 ON cy2 cx2;
cx4 ON cy3 cx3;
cy2 ON cy1 cx1;
cy3 ON cy2 cx2;
cy4 ON cy3 cx3;
! Within-person centered variables at the first wave correlated
cx1 WITH cy1;
! Allow the residuals (dynamic errors) at subsequent waves to be correlated
cx2 WITH cy2;
cx3 WITH cy3;
cx4 WITH cy4;
! Fix the correlation between the random intercepts and the within-person centered
! variables at the first wave to zero (as by default these would be estimated)
mu_x WITH cx1@0 cy1@0;
mu_y WITH cx1@0 cy1@0;
Note that the cx and cy factors behind each outcome are used to represent the within-level (within-subject) part of the outcomes - the between-subject part is captured by the random intercepts - so that the cross-lagged regressions refer to relationships on the within level.
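The statements quoted above do not show how the random intercepts and the cx and cy factors are defined. A minimal sketch of that part, assuming four waves with observed variables x1-x4 and y1-y4 and the factor names mu_x, mu_y, cx1-cx4 and cy1-cy4 that appear in the statements above (this is an illustration under those assumptions, not a copy of the original input):

```mplus
MODEL:
  ! Random intercepts: between-person part, all loadings fixed to 1
  mu_x BY x1@1 x2@1 x3@1 x4@1;
  mu_y BY y1@1 y2@1 y3@1 y4@1;

  ! Within-person centered variables: one latent variable per observation
  cx1 BY x1@1;  cx2 BY x2@1;  cx3 BY x3@1;  cx4 BY x4@1;
  cy1 BY y1@1;  cy2 BY y2@1;  cy3 BY y3@1;  cy4 BY y4@1;
```

With the measurement error variances fixed to zero, each observed score is then the sum of its mean, the random intercept, and the within-person component for that occasion.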
I am running the RI-CLPM for motivation and different emotions. I have two questions/concerns:
a) I am confused about why some autoregressive paths are not significant (I also ran the models as traditional CLPMs, and there they were all significant). I am aware that the autoregressive parameters in the traditional CLPM are usually higher, but I do not know how to interpret these findings.
b) How do I account for measurement invariance in the RI-CLPM? Can I still model the scales as latent variables using the items (enforcing equal factor loadings across the three measurement points)?
Thank you very much for posting the code to apply RI-CLPM in Mplus. It was very helpful, and I got it working for my panel (longitudinal) data.
I am used to analyzing longitudinal data in multilevel models; therefore, the terminology that I will use below will be quite similar.
For my model, I would like to add a moderator for the cross-lagged paths (e.g., for cx1 on cy2). In multilevel terms, this would be an interaction between a level 2 variable and a level 1 variable (or: an interaction of a between-person variable that has been measured once and the within-person part of a repeated measure).
I checked the multilevel moderation and mediation sources for Mplus. However, as the RI-CLPM does not seem to take the typical multilevel approach, I'm having a hard time integrating the moderation models with the RI-CLPM.
I was wondering whether someone has any thoughts/ideas on this.
When the moderator is invariant over time, you have several options:
a) a multiple group approach based on a median split of the moderator, after which you compare parameters across the two groups; this does involve overruling some of the Mplus multiple group defaults though;
b) a DSEM (i.e., multilevel time series) approach (this requires your data to be in long format, rather than in wide format); adding a cross-level interaction is really simple then (you just add the moderator as a between-level predictor for the random slopes); the disadvantage is that it automatically assumes that all the parameters are invariant over time (e.g., the lagged relationships between waves, or the residual variances), which may be problematic when the intervals between the waves vary, and/or if the time intervals between the waves are relatively large and developmental changes may have occurred during the study; however, some of these constraints may be overcome by adding dummy variables for the different waves, and interaction terms between these and the lagged predictors;
c) add the actual interaction terms to the RI-CLPM (with data in wide format); this would require the interaction between the moderator and the within-person centered latent variables (e.g., cy1); I believe this could be done using the XWITH statement.
Note that the latter would also be the go-to option if you have a time-varying moderator.
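As a rough illustration of option c), here is a minimal sketch for a single moderated cross-lagged path. It assumes an observed, time-invariant moderator called m (a hypothetical name) and the cx/cy factors defined as in the opening post; the interaction name cy1xm is also made up for the example. Whether such a model can be estimated in practice will depend on the data.

```mplus
ANALYSIS:
  TYPE = RANDOM;
  ALGORITHM = INTEGRATION;

MODEL:
  ! ... random intercepts and within-person centered variables as usual ...

  ! Product of the within-person component cy1 and the observed moderator m
  cy1xm | cy1 XWITH m;

  ! Lagged effects on cx2, with the effect of cy1 moderated by m
  cx2 ON cx1 cy1 m cy1xm;
```

The coefficient of cy1xm is the amount by which the cross-lagged effect of cy1 on cx2 changes per unit of m. Note that XWITH interactions require numerical integration, so estimation can become slow when several paths are moderated.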
Thank you very much for your response, it was very helpful.
I worked on option c) and started with just one moderated path, however I got the following message:
THE ESTIMATED COVARIANCE MATRIX COULD NOT BE INVERTED. COMPUTATION COULD NOT BE COMPLETED IN ITERATION 1. CHANGE YOUR MODEL AND/OR STARTING VALUES.
I added the following lines to the code in the opening post, to show how I defined my time-invariant moderator and interaction terms.
ANALYSIS: TYPE = RANDOM; ALGORITHM = INTEGRATION; ! (to use the XWITH statement)
cm1 BY m1@1; m1@0; cx3 ON cy2 cx2 m1; cy2xcm1 | cy1 XWITH cm1; cx3 ON cy2xcm1;
I'm already quite skeptical that I can reach convergence with this particular model and my current data (917 observations). However, at the moment I cannot pinpoint whether that is the issue here or whether I overlooked something in my code.
Thank you for pointing me in the direction of option b), which could also be viable in my case.
Thank you for this code, very helpful. I have a model that does not converge (I have tried increasing the number of iterations already) and I think that the problem is with one of the new within person centered variables. I had previously run a normal CLPM with no problems.
You can estimate measurement error variances, although you need some constraint for identification (e.g., all measurement error variances are held equal, or the first and last are set equal to the second and second-to-last, respectively).
Adding measurement error results in Kenny and Zautra's trait-state-error (TSE) model (also known as the STARTS model). In general, it requires a larger number of repeated measurements (say, 8 waves or more) to be empirically identified.
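A minimal sketch of the first constraint mentioned above (all measurement error variances held equal per variable), assuming the four-wave variable names x1-x4 and y1-y4 from the opening post; the labels ex and ey are arbitrary:

```mplus
MODEL:
  ! Replaces the statement that fixes the measurement error variances to zero:
  ! estimate one common measurement error variance per variable
  x1 x2 x3 x4 (ex);
  y1 y2 y3 y4 (ey);
```

Giving several parameters the same label holds them equal. As noted above, this may still not be empirically identified with only a few waves.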
Dear Dr. Muthen, I am running an RI-CLPM with two variables, two waves, and three covariates. Is my syntax correct? The model does not converge. What can I do? Thanks!
MODEL:
RI_PORCSB BY PORCSB1@1 PORCSB2@1;
RI_FTS BY FTS1@1 FTS2@1;
cPORCSB1 BY PORCSB1@1;
cPORCSB2 BY PORCSB2@1;
cFTS1 BY FTS1@1;
cFTS2 BY FTS2@1;
PORCSB1-FTS2@0;
cPORCSB2 ON cPORCSB1 cFTS1 AGE1 SEX1 BMI1;
cFTS2 ON cPORCSB1 cFTS1 AGE1 SEX1 BMI1;
cPORCSB1 WITH cFTS1;
cPORCSB2 WITH cFTS2;
RI_PORCSB WITH cPORCSB1@0 cFTS1@0;
RI_FTS WITH cPORCSB1@0 cFTS1@0;
OUTPUT: TECH1 STDYX SAMPSTAT;
An RI-CLPM with only two waves is not identified: the traditional CLPM is already saturated with two waves, so adding two latent variables with a covariance is not possible. While having some covariates here may ensure that the number of parameters does not exceed the number of sample statistics (so that the model does not end up with a negative number of df), the model is probably still not identified, because with only two waves it is not possible to tell the difference between stability due to autoregression and stability due to a trait. In contrast, when you have three waves, you can tell the difference between these two forms of stability, because they imply different covariance structures (i.e., the typical simplex structure versus the one-factor structure).
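The counting argument can be written out as follows for two variables and two waves without covariates (covariance structure only; the mean structure is saturated either way):

```latex
\begin{align*}
\text{sample variances and covariances:}\quad
  & \tfrac{4 \cdot 5}{2} = 10 \\
\text{CLPM parameters:}\quad
  & \underbrace{3}_{\text{wave 1 (co)variances}}
  + \underbrace{4}_{\text{lagged effects}}
  + \underbrace{3}_{\text{wave 2 residual (co)variances}} = 10 \\
\text{RI-CLPM parameters:}\quad
  & 10 + \underbrace{3}_{\text{random intercept (co)variances}} = 13 > 10
\end{align*}
```

For the three-wave case, a simplified single-variable illustration (assuming, for the sake of the example, a stationary within-person variance $\sigma^2$, autoregression $\phi$, and trait variance $\omega^2$) shows why the two forms of stability can be separated:

```latex
\begin{align*}
\text{autoregression only (simplex):}\quad
  & \operatorname{Cov}(y_1, y_2) = \operatorname{Cov}(y_2, y_3) = \phi\,\sigma^2,
  \qquad \operatorname{Cov}(y_1, y_3) = \phi^2 \sigma^2 \\
\text{trait only (one factor):}\quad
  & \operatorname{Cov}(y_1, y_2) = \operatorname{Cov}(y_2, y_3) = \operatorname{Cov}(y_1, y_3) = \omega^2
\end{align*}
```

The lag-2 covariance decays under autoregression but not under a trait; with two waves there is only one covariance per variable, so this pattern cannot be observed.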
Thank you for your response. My understanding is that I have to go with a traditional CLPM. Here is what I have now:
MODEL:
PORCSB2 ON PORCSB1 FTS1 AGE1 BMI1 SEX1;
FTS2 ON PORCSB1 FTS1 AGE1 BMI1 SEX1;
PORCSB1 WITH FTS1;
PORCSB2 WITH FTS2;
However, age and BMI are time-varying covariates. How do I account for this in the model? My alternative would be:
PORCSB2 ON PORCSB1 FTS1 AGE2 BMI2 SEX1;
FTS2 ON PORCSB1 FTS1 AGE2 BMI2 SEX1;
PORCSB1 ON AGE1 SEX1;
FTS1 ON AGE1 SEX1;
PORCSB1 WITH FTS1;
PORCSB2 WITH FTS2;
Is this correct? Thanks.
yuxiong posted on Tuesday, October 02, 2018 - 9:40 am
I am trying to use the RI-CLPM in a multigroup comparison, but I ran into identification issues. An earlier post in this thread mentions "a) a multiple group approach based on a median split of the moderator, and then compare parameters across the two groups; this does involve overruling some of the Mplus multiple group defaults though".
Can I ask what specifically needs to be overruled? I only know that the factor mean in the reference group is set to 0 by default, but that does not seem to matter here.
Mplus will impose the following default constraints (related to strong factorial invariance):
a) equal factor loadings across groups; this is no problem here, because all factor loadings are constrained to be 1 anyway
b) equal intercepts across groups; this you typically do not want, so you need to free the intercepts in the second group; you can simply do this by specifying them as free parameters; when you have x1 to y3, you simply include for the second group: [x1-y3];
c) free latent means in the second group; this leads to trying to estimate more parameters for the mean structure than there are observed means (hence the identification problems); you need to constrain all the means of the latent variables in the second group to zero; this includes the means of the random intercept factors, and the means of the within-person centered variables per occasion
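Points b) and c) can be sketched as follows for three waves. The grouping variable modgrp and the group labels low and high are hypothetical, and the factor names follow the opening post; this is only a fragment showing the group-specific part, not a complete input.

```mplus
VARIABLE:
  ! hypothetical grouping variable and group labels
  GROUPING = modgrp (1 = low 2 = high);

MODEL:
  ! ... the full three-wave RI-CLPM, as in the single-group case ...

MODEL high:
  ! b) free the intercepts of the observed variables in the second group
  [x1 x2 x3 y1 y2 y3];
  ! c) fix all latent means in the second group to zero
  [mu_x@0 mu_y@0];
  [cx1@0 cx2@0 cx3@0 cy1@0 cy2@0 cy3@0];
```

Parameters that you want to compare across the groups (e.g., the cross-lagged effects) can then be constrained with equality labels in a second run and tested against this model.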
yuxiong posted on Tuesday, October 02, 2018 - 6:14 pm
Hi there, I'm running the RI-CLPM for 16 separate models: 8 different predictors, and 2 different outcome variables. When I run the 8 models for the first outcome variable, everything goes smoothly. When I run the models with the second outcome variable, nearly all of the models produce the following error: THE LATENT VARIABLE COVARIANCE MATRIX (PSI) IS NOT POSITIVE DEFINITE. THIS COULD INDICATE A NEGATIVE VARIANCE/RESIDUAL VARIANCE FOR A LATENT VARIABLE, A CORRELATION GREATER OR EQUAL TO ONE BETWEEN TWO LATENT VARIABLES, OR A LINEAR DEPENDENCY AMONG MORE THAN TWO LATENT VARIABLES. CHECK THE TECH4 OUTPUT FOR MORE INFORMATION. PROBLEM INVOLVING VARIABLE MU_Y. It is generally the same variable (the random intercept of the second outcome variable) that has a negative variance. Any thoughts on why this might be the case? Both outcome variables are measures of social anxiety, so I'm not sure why this is occurring; they are very similar. One idea I've had is to fix the variance of the random intercept of the second outcome variable to be similar to that of the first outcome variable. I know it's recommended to fix it to zero, but doesn't that defeat the whole purpose of the RI-CLPM?
A negative variance estimate typically means a model is too complex for the data you have. In this case, I would conclude that the second outcome variable is not characterized by stable between-person differences; rather, everyone varies around the same mean (or trend) on this variable. Hence, you can do one of two things: 1) You can set the variance of this random intercept to zero, and also set the covariance between this random intercept and the other to zero, while still estimating the random intercept of the predictor freely; this will still result in a warning about the covariance matrix, but then you can just ignore it. 2) You can adjust the model by taking the random intercept for this outcome variable out of the model, and model the lagged relationship between the within-person part of the predictor variable (note that for that variable you keep the random intercept in the model), and the original (i.e., non-decomposed) outcome variable. These two options are statistically identical (same fit etc.), and should lead to the same lagged parameter estimates.
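Option 1 can be sketched as follows, assuming the random intercepts are called mu_x (predictor) and mu_y (outcome), as in the error message and the opening post:

```mplus
MODEL:
  ! ... RI-CLPM as before ...

  ! No stable between-person differences in the outcome:
  ! fix the random intercept variance and its covariance with mu_x to zero
  mu_y@0;
  mu_y WITH mu_x@0;
```

The variance of mu_x is left free, so the predictor is still decomposed into a between-person and a within-person part.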
Dear Dr. Muthen, for an RI-CLPM with latent variables, (1) can I use factor scores as the person-centered variables, or should I write a new BY statement for the factors? (2) Does my input seem right?
1) WRLFT1, T2, T3 BY the items (factor loadings equal across time); same for WRLIT1,2,3; PEET1,2,3 and PEIT1,2,3
2) Intercepts of items equal across time
3) RI_WRLF BY WRLFT1@1 WRLFT2@1 WRLFT3@1; (same for WRLI, PEE and PEI)
4) [RI_WRLF]; [RI_WRLI]; [RI_PEE]; ...
5) Intercepts of all factors = 0: [WRLFT1@0 WRLFT2@0 WRLFT3@0]; ...
6) All measurement error variances = 0: WRLFT1@0 WRLFT2@0 WRLFT3@0 PEET1@0 ...
7) Autoregressive effects: WRLFT3 ON WRLFT2 (WRLF); WRLFT2 ON WRLFT1 (WRLF); WRLIT3 ON WRLIT2 (WRLI); ...
8) All possible cross-lagged effects, even if not hypothesized
9) Correlations between within-person variables: PEET1 WITH PEIT1 WRLFT1 WRLIT1 WRLNT1; PEIT1 WITH WRLFT1 WRLIT1 WRLNT1; ...
10) Correlated residuals at subsequent waves
11) Correlations between the RIs and the other exogenous variables = 0: e.g., RI_WRLF WITH PEET1@0 PEET2@0 ...;
Philipp Alt posted on Wednesday, February 13, 2019 - 6:42 am
I am thinking about setting up an RI-CLPM to study the development of two processes. However, I am not sure whether this approach is suitable, because of my data structure:
I have data from 9 waves with an age-range within each wave from 8 to 15 years. Because I want to study the processes as a function of age rather than wave, I restructured all the data and basically pooled the different age groups from all waves and then remerged them into one wide dataset, giving me a large dataset with age in the columns rather than wave.
But this leaves me with a couple of questions before setting up an RI-CLPM:
1) Is there a possibility to control for the cohort that the person is from within the RI-CLPM? I would think that this is mandatory.
2) Is there a way to control for the fact that people had different participation rates? Some people were measured in all 9 waves, some just once. This would also have to be addressed, I think. Is this possible with the RI-CLPM approach?
3) I also have siblings in the data set. Rather than excluding them, I was wondering if I could keep them in the data set and control for the non-independence via TYPE=COMPLEX, for example, in the RI-CLPM approach?
1) Right, age and not wave should be the time axis. You can handle this in two different ways: either using a dummy variable influencing the person-specific intercept, or using a multiple-group approach where the groups correspond to cohorts. UG ex 6.18 shows how to do the latter approach, which is very flexible, so that you can consider which RI-CLPM parameters are cohort invariant. The ex 6.18 approach is also discussed in our Short Course Topic 4 on our website; see slides 48 and onward.
2) Yes, this would be handled by standard ML under the usual MAR assumption, also called FIML. Just give missing data flags for the missing values.
3) Right, Type=Complex can be used to adjust the SEs. This assumes of course that subjects with siblings have the same parameter values as subjects without siblings (Complex does not allow for different parameter values).
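The three answers above could be combined along the following lines. This is only a fragment: the missing data flag -999, the family identifier famid, the cohort dummy cohort, and the random intercept names mu_x and mu_y are all hypothetical, and the dummy-variable route is shown for point 1).

```mplus
VARIABLE:
  MISSING = ALL (-999);   ! hypothetical missing data flag, handled by FIML
  CLUSTER = famid;        ! hypothetical family identifier for siblings

ANALYSIS:
  TYPE = COMPLEX;         ! adjusts the standard errors for clustering

MODEL:
  ! ... RI-CLPM with age as the time axis ...

  ! Cohort dummy influencing the person-specific intercepts
  mu_x mu_y ON cohort;
```

The multiple-group alternative for cohort would replace the last statement with a GROUPING variable for cohort, along the lines of UG ex 6.18.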
I am testing an RI-CLPM model but the random intercept for one of my variables does not have significant variance across individuals. The covariance between my two random intercepts is also not significant. If I estimate variances of both random intercepts and the covariance, I get an error message that says that the latent variable covariance matrix is non-positive definite. If I constrain the variance (and covariance) to 0 for the one variable, the error message disappears. I am still estimating variance in the random intercept for my other variable (which is significant). I understand that if I constrain both random intercepts my model is identical to the CLPM. But in this case, I am only constraining one. Is it okay to estimate the RI-CLPM in this way? It seems to make no substantive difference to the results.
In general, if there is no trait-like aspect in one variable but there is in the other, is it okay to still use RI-CLPM and just constrain the variance of the intercept for that variable?
When the variance of a random intercept is not significant, this implies there is not really evidence that there are individual differences in this term. Hence, fixing the variance to zero is a reasonable next step. Alternatively, you can decide to remove the entire random intercept from your model: This is actually the same thing, but it will also make the error message disappear. Either way, it means the new model does not include stable, time-invariant differences between individuals in that particular variable, while there may still be time-invariant, trait-like individual differences on the other variable, which are adequately captured by the remaining random intercept.
STDYX will standardize each regression coefficient using the variances (or standard deviations) of the predictor and the outcome variable that are associated with this regression coefficient.
Since in the RI-CLPM, the lagged coefficients are included between the within-person components, the standardization also occurs using only within-person variance. Hence, this implies it is within-person standardization.
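As an illustration, for a cross-lagged regression of the within-person component $cx_t$ on $cy_{t-1}$ with unstandardized coefficient $\beta$, the STDYX value is (a sketch of the general STDYX formula applied to this case, with the standard deviations being the model-implied ones):

```latex
\beta^{*} = \beta \cdot \frac{\mathrm{SD}(cy_{t-1})}{\mathrm{SD}(cx_{t})}
```

Because both standard deviations refer to within-person components, the between-person variance captured by the random intercepts does not enter the standardization.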
I have not done or seen this done before, but it should be possible in Mplus with multilevel SEM (use TYPE = TWOLEVEL, possibly also RANDOM if you want random regression parameters).
Note that the regular RI-CLPM is in wide-format (meaning: it is not specified as a multilevel model but as a regular SEM model). In your case you would thus have time points (level 1) as variables and persons (level 2) as cases (i.e., rows in your datafile), just as in the regular RI-CLPM.
Then you can include classroom (level 3) as the between level clusters, and decide whether you want random lagged parameters or not. You can check UG example 9.5 for an illustration of a 2-level path model with random regression coefficients.
We used a CLPM to assess two variables across three waves. The two intervals are about 10.5 and 13.5 months, respectively. Is there a way to account for the difference in interval length? Is there any Mplus syntax example for a continuous-time CLPM? Thanks!
When intervals in a CLPM design are of different length, the parameters should not be constrained to be identical over time. Such constraints are only sensible when the intervals are identical, and you assume the underlying process remains the same over time. To determine whether the underlying dynamics remain the same even though the intervals are different, a continuous time perspective is needed. This is explained in more detail here: https://ryanoisin.github.io/files/RyanKuiperHamaker_preprint.pdf To summarize the main issue: The constraints that are needed are on the matrix with lagged parameters, rather than on separate lagged parameters, making it difficult to impose them in Mplus. Alternatively, one could first estimate the model in the conventional way (without the constraints), and then convert the parameters obtained for the two intervals to refer to an interval of the same length (e.g. 12 months, see also https://ryanoisin.github.io/files/KuiperRyan_2018_DrawingConclusions_SEM.pdf). However, there is at this point no test to determine whether these converted parameters are significantly different from one another. Another option is to use software that was specifically designed for continuous time modeling, such as ctsem in R.
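The conversion mentioned above can be sketched as follows, assuming a first-order continuous-time process with drift matrix $A$, so that the matrix of lagged parameters for an interval of length $\Delta t$ is $\Phi(\Delta t)$ (the symbols are introduced here for illustration, and the matrix logarithm is assumed to exist and be real):

```latex
\begin{align*}
\Phi(\Delta t) &= e^{A \Delta t}, \\
\Phi(\Delta t^{*}) &= \exp\!\left( \frac{\Delta t^{*}}{\Delta t} \, \log \Phi(\Delta t) \right), \\
\Phi_{1}(12) &= \exp\!\left( \tfrac{12}{10.5} \, \log \Phi(10.5) \right), \qquad
\Phi_{2}(12) = \exp\!\left( \tfrac{12}{13.5} \, \log \Phi(13.5) \right).
\end{align*}
```

Here $\exp$ and $\log$ are the matrix exponential and matrix logarithm, so the conversion acts on the whole matrix of autoregressive and cross-lagged parameters at once, not on each parameter separately.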
I have tried the code above to model an RI-CLPM with 2 variables (anxiety and insomnia) across 4 time points (equal 10-week intervals), using Estimator = ML, and did not constrain observed means per variable over time, but I receive the following message: NO CONVERGENCE. NUMBER OF ITERATIONS EXCEEDED. This persists even when I increase iterations to 20000. Is there anything else I can adjust in the code to get it to converge?
We need to see your full output - and data if possible - send to Support along with your license number.
shonnslc posted on Wednesday, October 02, 2019 - 9:27 am
I am wondering if it is necessary to specify the dynamic errors in RI-CLPM:
cx2 WITH cy2; cx3 WITH cy3; cx4 WITH cy4;
What happens if this part is not specified in the model? I am doing power analysis for RI-CLPM and I encountered replication error when I specified dynamic errors but when I removed this part, there was no error message for each replication. Thanks.
We need to see your full output to say - send to Support along with your license number.
Philipp Alt posted on Tuesday, January 21, 2020 - 4:59 am
I have more of a conceptual question about the RI-CLPM:
My understanding of the RI-CLPM is that you control for stable between person differences in the between part of the model. Therefore it does not make sense to control stable covariates (variables that do not change) in the within part of the model anymore, as they are already controlled for in the between-part. Is this assumption right?
Hi Philipp, your reasoning is mostly correct. However, you could also consider regressing the observed variables on a time-invariant covariate directly (rather than through the random intercepts); this would allow for the effect of this covariate to change over time. If you constrain the regression parameters in this model to be invariant over time, the model becomes identical to the model in which the random intercept is regressed on the time-invariant covariate. Hence, you can do a chi-square test to compare these two options (time-varying effect vs constant effect).
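The two options can be sketched as follows, assuming four waves, a time-invariant covariate called z (a hypothetical name), and the variable and factor names from the opening post; the labels s1 and s2 are arbitrary:

```mplus
MODEL:
  ! ... RI-CLPM as before ...

  ! Time-varying effect: observed variables regressed on z directly
  x1 x2 x3 x4 ON z;
  y1 y2 y3 y4 ON z;

  ! Constant effect: the same regressions held equal over time,
  ! which is equivalent to regressing the random intercepts on z
  ! x1 x2 x3 x4 ON z (s1);
  ! y1 y2 y3 y4 ON z (s2);
  ! or, equivalently:
  ! mu_x mu_y ON z;
```

Running the model once with the free regressions and once with one of the constrained versions gives the two nested models for the chi-square difference test.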
I'm analyzing data for ~14,000 weighted cases across three timepoints using Hamaker's RI-CLPM. I'm hoping to break this sample into subgroups to ascertain whether the cross-lagged paths operate in the same way across groups. My code for the full sample works perfectly fine, and if I create separate datasets by subgroup the code again works just fine (e.g., dataset for White, dataset for Black, etc.). However, when I use the GROUPING command, I get warnings that the model is not identified. Is it possible to use the GROUPING option in order to use DIFFTEST without splitting the sample into separate datasets?
Here's the error message:
THE MODEL ESTIMATION TERMINATED NORMALLY THE STANDARD ERRORS OF THE MODEL PARAMETER ESTIMATES COULD NOT BE COMPUTED. THE MODEL MAY NOT BE IDENTIFIED. CHECK YOUR MODEL. PROBLEM INVOLVING THE FOLLOWING PARAMETER: Parameter 24, Group DISAB: [ CACH1 ] THE CONDITION NUMBER IS -0.688D-12. THE ROBUST CHI-SQUARE COULD NOT BE COMPUTED.
I believe you are trying to run a multiple group version of the RI-CLPM. This is a little tricky, as it requires you to overrule the multiple-group-factor-analysis-defaults that Mplus imposes. Specifically, Mplus will constrain the intercepts of observed variables to be identical across the groups, and free the latent means (i.e., for all the variables defined by a BY statement) in the second group; this leads to a model that is unidentified in this case. You can find the correct code for this (and other extensions of the RI-CLPM) here: https://ellenhamaker.github.io/RI-CLPM/mplus.html
I wonder whether there is a way to conduct statistical testing that compares the strengths of cross-lagged parameters (rather than just descriptively compare the standardized coefficients) when two sets of variables are not on the same metric?
I was thinking about running a model after standardizing all variables, and directly testing whether the difference score is greater than zero. But it is difficult to do WP standardization on the variables because of the large amount of missing values.
If I use the BP standardization, I think I can interpret the results using the relative order changes, but not sure this would be a good approach.