I am working on example 6.18 (MULTIPLE GROUP MULTIPLE COHORT GROWTH MODEL) so that I can apply the analysis to my own data. However, I do not quite understand how the analysis and the data file correspond. The data file has 10 columns of data, and I do not know what these represent. The first row of data is as follows: 3.199588 2.374718 2.877231 4.045188 -0.378137 0.299639 -0.642262 -0.880031 -2.606761 1
There are 9 possible ages at which the data could have been collected (ages 10-18), so I thought that these corresponded to the first 9 columns. However, that can't be the case, because the final column indicates which cohort each individual comes from (1, 2, or 3), and members of each cohort should have missing observations at specific ages. For example, Cohort 1 (from 1990) should be missing data at ages 11, 13, 15, 17, and 18, but everyone with a '1' in the final column has complete data (as shown in the example row above).
To complicate matters further (for me anyway), the syntax in the example says: VARIABLE: NAMES = y1-y4 x a21-a24 g; GROUPING = g (1 = 1990 2 = 1989 3 = 1988); That is 10 variables, not 9. I am also lost as to where a21-a24 comes in. Is it that we have Y1-Y4 and a21-a24 in the data file along with 'x' or with 'g'?
The ten columns of data represent the 10 variables in the NAMES statement: y1 y2 y3 y4 x a21 a22 a23 a24 g. In the multiple group approach, there are four variables that represent measurement occasions, not ages. This is described in the text of the example. Nine variables are used if you don't take the multiple group approach.
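To make the column layout concrete, here is a rough sketch in Python (not Mplus) that names the ten columns of the first data row and then spreads y1-y4 over ages 10-18, which is the nine-variable layout used without the multiple group approach. The age schedule per cohort is an assumption inferred from this thread (cohort 1990 observed at ages 10, 12, 14, 16, and so on), and `parse_row` and `to_age_layout` are hypothetical helper names.

```python
# Column order taken from the NAMES statement: y1-y4 x a21-a24 g.
NAMES = ["y1", "y2", "y3", "y4", "x", "a21", "a22", "a23", "a24", "g"]

# Assumption inferred from the thread: ages at the four measurement
# occasions for each value of the grouping variable g.
AGES_BY_GROUP = {
    1: [10, 12, 14, 16],  # cohort 1990
    2: [11, 13, 15, 17],  # cohort 1989
    3: [12, 14, 16, 18],  # cohort 1988
}


def parse_row(line):
    """Turn one whitespace-separated data line into a name -> value dict."""
    values = [float(v) for v in line.split()]
    if len(values) != len(NAMES):
        raise ValueError(f"expected {len(NAMES)} columns, got {len(values)}")
    row = dict(zip(NAMES, values))
    row["g"] = int(row["g"])  # the grouping code is an integer
    return row


def to_age_layout(row):
    """Place y1-y4 at the ages they were observed; other ages stay missing."""
    by_age = {age: None for age in range(10, 19)}
    for occasion, age in enumerate(AGES_BY_GROUP[row["g"]], start=1):
        by_age[age] = row[f"y{occasion}"]
    return by_age


first_row = ("3.199588 2.374718 2.877231 4.045188 -0.378137 "
             "0.299639 -0.642262 -0.880031 -2.606761 1")
row = parse_row(first_row)
by_age = to_age_layout(row)
print(row)
print(by_age)
```

In the occasion-based layout every cohort has complete data on y1-y4; the missing ages only show up once the same values are rearranged by age.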
Thanks Linda - I think the following is correct then: The model in the example has four measurements of the variable in the LGCM (y1-y4), which are the first four columns of the data set. Next is a predictor (x) of the intercept and slope. Then come four measurements of a time-varying covariate (a21-a24). Finally, g is the grouping variable. If I have no time-varying covariates, then I can ignore all the bits of Mplus code that include a21-a24. And if I have no predictor of the intercept/slope, I can ignore those involving 'x'. If I ignored all of those, then I'd simply evaluate the growth curve model across the accelerated design I have. Thanks again Simon
Hi again, My post above was borne out: I managed to run the model without any of the time-varying covariates and without 'x'. Thanks again for your response earlier. Best wishes Simon
Corinna posted on Monday, October 26, 2015 - 7:36 am
Dear Muthen & Muthen, I am interested in using GMM to analyze a dataset based on an accelerated longitudinal design. I saw example 6.18 (MULTIPLE GROUP MULTIPLE COHORT GROWTH MODEL) in the Mplus User's Guide, which shows how to conduct LGM with an accelerated longitudinal design. Is it possible to do that in GMM? If so, could you please give the example syntax? If not, how should I first re-organize the data in Mplus and then conduct a GMM? Thank you very much!
Hi, I'm afraid I am still struggling a little to understand this example! We have figured out most of it, but one thing still seems opaque to me: the numbers in parentheses, for example y2-y4 (22-24). Can you explain what the 22-24 refers to? The manual explains that "In the group-specific MODEL command for cohort 1988, the ON statement with the (12) equality constraint describes the linear regression of y1 on the time-varying covariate a21 for cohort 1988 at age 12" but I can't see how that parameter is designated '12'. Thanks Simon
When an age is represented by more than one cohort, the parameters need to be held equal to reflect that. Ages 12, 14, and 16 are represented by two cohorts. All three ages appear in cohorts 1988 and 1990. The equalities hold parameters equal over these cohorts.
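The overlap described above can be sketched in Python. The age schedule is again an assumption inferred from this thread, and `shared_ages` is a hypothetical helper. For each age observed in more than one cohort, the sketch lists which occasion variable carries that age in each cohort; those are the parameters that a common number in parentheses, such as (12), holds equal.

```python
from collections import defaultdict

# Assumption inferred from the thread: ages at occasions y1-y4 per cohort.
AGES = {
    1990: [10, 12, 14, 16],
    1989: [11, 13, 15, 17],
    1988: [12, 14, 16, 18],
}


def shared_ages(ages_by_cohort):
    """Map each age seen in two or more cohorts to its (cohort, variable) slots."""
    seen = defaultdict(list)
    for cohort, ages in ages_by_cohort.items():
        for occasion, age in enumerate(ages, start=1):
            seen[age].append((cohort, f"y{occasion}"))
    return {age: slots for age, slots in sorted(seen.items()) if len(slots) > 1}


shared = shared_ages(AGES)
for age, slots in shared.items():
    print(age, slots)
```

Under this schedule, age 12 is y2 in cohort 1990 and y1 in cohort 1988, which matches the manual's description of the (12) constraint on y1 ON a21 for cohort 1988. Cohort 1989 shares no ages with the other two, so it needs no such equalities.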
I'd like to ask another question about example 6.18. In the output to the example (and when I run it on my own data too) it shows estimates (slope mean, slope variance, etc) according to cohort (1990, 1989, 1988).
There are no 'overall' estimates though (that I can see). I had assumed I'd get these e.g. overall slope mean and overall slope variance, as this was the point of the design. I'd also assumed there would be some indication in the results of whether there WAS a common fitting slope (vs. whether e.g. three different slopes are a better way to describe the data), but I can't find anything that would fit the bill here either.
The output for UG ex 6.18 is on our website and shows that only one set of growth factor parameters is estimated. For instance, the means of i and s are labeled in the Overall part of the model, which means that they are held equal across groups (cohorts). They are listed in the output for each group, but the values are the same.
Onsen Juiko posted on Wednesday, April 13, 2016 - 3:00 pm
In a multiple group multiple cohort growth model with several time invariant covariates (TIC), should the mean and variance of each TIC be constrained to be equal across cohorts?
That's the investigator's choice. Parameters of the covariates are typically not part of the model, which argues for allowing them to differ across cohorts. But it might be of interest to study, in which case you can bring the covariates into the model.
Hopefully one last question - I have my model and understand the output now, so thanks for all your patience.
The only thing left is that I wonder whether it is possible to test whether the 'single' growth model is better than one where growth is allowed to differ across cohorts? The fit of my accelerated design is 'ok' but not great and when instead I fit a model with three different cohorts estimated (multi-group approach to growth curves basically) I get much better fit. Can I test the improvement in fit statistically somehow? I think I could look at the chi-square difference between the two models and assess it that way, but I'm not sure. Thanks, Simon
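For reference, the chi-square difference computation mentioned in this post can be sketched in Python as follows. This is a minimal sketch that assumes the two models are nested and were estimated with ordinary maximum likelihood; with a robust estimator the plain difference would need a scaling correction that is not shown here. The fit values are hypothetical placeholders, and `chi2_sf` and `chi_square_difference` are illustrative names, not Mplus functions.

```python
import math


def chi2_sf(x, df):
    """Upper-tail probability of a chi-square variate with df degrees of freedom.

    Uses the power series for the regularized lower incomplete gamma function.
    """
    if x <= 0:
        return 1.0
    a, z = df / 2.0, x / 2.0
    term = 1.0 / a
    total = term
    n = 1
    while abs(term) > 1e-15 * abs(total):
        term *= z / (a + n)
        total += term
        n += 1
    lower = total * math.exp(-z + a * math.log(z) - math.lgamma(a))
    return max(0.0, 1.0 - lower)


def chi_square_difference(chi2_restricted, df_restricted, chi2_free, df_free):
    """Difference test: restricted (equal growth) model vs. freer model."""
    diff = chi2_restricted - chi2_free
    ddf = df_restricted - df_free
    if diff < 0 or ddf <= 0:
        raise ValueError("the restricted model must have more df and a larger chi-square")
    return diff, ddf, chi2_sf(diff, ddf)


# Hypothetical fit statistics, for illustration only.
diff, ddf, p = chi_square_difference(85.3, 30, 52.1, 20)
print(f"chi-square difference = {diff:.1f}, df = {ddf}, p = {p:.5f}")
```

A small p-value would suggest that holding the growth parameters equal across cohorts fits significantly worse than letting them differ.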
I have a question about measuring polynomial growth in an accelerated longitudinal design. Using a study with 5 cohorts who are assessed 4 times with a 2-year overlap, I can see how the example from chapter 6 can be extended to examine cubic growth even though each person provides only 4 data points (which would typically not be enough). However, is there any issue with estimating variation/random effects for this cubic growth parameter in such a model? I ask because a recent paper suggests that each person would still need to provide as many time points as are required to examine the polynomial of interest in a typical longitudinal model, so in this case I would need at least 5 time points for each individual (the relevant comment is in Box 3 here: https://www.sciencedirect.com/science/article/pii/S1878929317300300). I am interested in examining correlates of this growth, so I wish to ensure that I can use the growth parameters (as high as cubic growth) as correlates of other variables. Thank you for your time!