Mplus Discussion >> Multi-level Growth Curve Modeling

Chaoyang Li posted on Wednesday, January 17, 2001 - 4:07 pm

We have a data set of about 2,200 subjects across 65 schools who were surveyed longitudinally on cigarette use for more than 10 years. We are interested first in the growth trend at the individual level and then in differences in the slopes and intercepts across schools. A three-level growth model might be used for these purposes. Referring to the examples in the Mplus manual and the workshop handout, we wrote the following program:

...

VARIABLE: NAMES ARE ID SCHID GROUP WNC6 WNC8
WNC9 WNC10 WNC11 WNC18 WNC21;
MISSING is .;
USEVAR = GROUP SCHID WNC6-WNC21;
CLUSTER=SCHID;

ANALYSIS:
TYPE = TWOLEVEL;
ITERATIONS = 1200;
ESTIMATOR = MLM;

MODEL:
%BETWEEN%
levelb BY WNC6-WNC21@1;
trendb BY WNC6@0 WNC7@1 WNC8@2 WNC9@3
WNC10@4 WNC11@5.5 wnc18@7
wnc19@8.5 WNC21@10;
[WNC6-WNC11@0 WNC18@0 WNC19@0 WNC21@0];
[levelb trendb];
levelb ON GROUP;
trendb ON group;
%within%
levelw BY WNC6-WNC11@1 wnc18@1 wnc19@1
wnc21@1;
trendw BY WNC6@0 WNC7@1 WNC8@2 WNC9@3
WNC10@4 WNC11@5.5 wnc18@7
wnc19@8.5 WNC21@10;

levelw ON GROUP;
trendw ON group;

OUTPUT: SAMPSTAT STANDARDIZED;

However, we got the following error message:

THE SAMPLE COVARIANCE MATRIX FOR THE VARIABLES IN THE MODEL CANNOT BE INVERTED. THIS CAN OCCUR IF A VARIABLE HAS NO VARIATION OR IF TWO VARIABLES ARE PERFECTLY CORRELATED. CHECK YOUR DATA.
*** FATAL ERROR

Could you please explain why the error occurred and how to modify the program?

Thank you for your help.

Linda K. Muthen posted on Thursday, January 18, 2001 - 11:33 am

This appears to be a problem with your data. It is most likely due to little variability on the between level. If you send the input and data to support@statmodel.com, I can take a look at it.

Anonymous posted on Wednesday, July 10, 2002 - 7:05 am

I have data on 850 children attending about 300 different schools (about 3 kids per class) with about six waves of data collection. Is this too few children to estimate a multilevel growth curve? Also, if the children change schools during the course of the study, will this impact my ability to estimate a multi-level model?

Thank you.

Anonymous posted on Thursday, July 11, 2002 - 9:14 am

It is possible to estimate a multilevel growth curve even with 2 subjects per cluster. To take into account children changing schools, you have to set up a multiple membership model. This is not very easy to do, but here is how it goes. First, form new clusters made up of schools among which children move, so that every move stays within one cluster. Then set up dummy variables for school membership for each student. Finally, specify the model as
between level intercept | y on dummy
and
between level slope | y on dummy x time.

Anonymous posted on Thursday, March 25, 2004 - 7:14 am

We have student achievement data over a five year period on tests in math, reading and writing. The students are nested in classes within schools within boards. We would like to investigate school improvement over the five years. The constraints in these data are that:
1. Different tests were taken each year, although they were equated from year to year.
2. Different students took the tests each year; for example, grade 3 students in year 1 are different from grade 3 students in year 2 in the same school.
3. There is minimal information on schools, e.g. average income.
Is it possible to fit a cross-sectional longitudinal model to examine school improvement?
Thanks

bmuthen posted on Thursday, March 25, 2004 - 4:19 pm

That's a big topic. You may be interested in looking at my UCLA colleague Yeow Meng Thum's work in this area:

http://www.gseis.ucla.edu/faculty/thum/Papers/SMR1103.pdf.

finnigan posted on Friday, August 29, 2008 - 9:35 am

Dear Linda/Bengt

I am using a multiple indicator growth model to model variability of individuals within schools, where the schools are in different regions, using 4 measurement occasions.

In this case I take it that this is a three-level model: individuals in schools within regions.

Are there any Mplus examples you are aware of that use a multiple indicator growth model at three levels?

I plan to use individual times of observation to examine within- and between-person change. Does the introduction of time as a variable add a fourth level, i.e. individuals within schools, within regions, within time?

Thanks

Bengt O. Muthen posted on Friday, August 29, 2008 - 9:47 am

First question is if you have many regions, where many means at least say 20.

If not, then treat region as a fixed mode - using dummy covariates.

If yes, then use

Type = Complex Twolevel;

where Complex covers the region and Twolevel covers the schools (see UG).

AT (individually-varying times of observation) should not add a level. You can use AT in the wide, multivariate approach to growth modeling that we prefer.

finnigan posted on Friday, August 29, 2008 - 10:06 am

Bengt

Thanks for this. Just to recap

I would have 4 regions at most:

east, west, midlands and south.
Hence I have a two-level model: individuals within schools. Region would act as a covariate, and time of observation does not add a level.

Thanks

Bengt O. Muthen posted on Friday, August 29, 2008 - 5:33 pm

That's right.
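
The setup agreed on above (two levels, with the four regions entered as dummy covariates rather than as a third level) might be sketched as follows. This is a simplified illustration under stated assumptions, not input from the posters: the variable names (id, school, region, y1-y4, west, mid, south), the region coding 1 to 4 with east as the reference category, and the single-indicator linear growth model with fixed time scores are all hypothetical. The multiple-indicator measurement part and the individually-varying times (AT) discussed in the thread are left out to keep the sketch short.

```
VARIABLE:
  NAMES = id school region y1-y4;       ! hypothetical variable names
  USEVARIABLES = y1-y4 west mid south;  ! DEFINE-created variables go last
  CLUSTER = school;                     ! level 2 = schools
  BETWEEN = west mid south;             ! region is constant within a school

DEFINE:
  ! assumed coding: 1 = east (reference), 2 = west, 3 = midlands, 4 = south
  west = 0;  mid = 0;  south = 0;
  IF (region EQ 2) THEN west = 1;
  IF (region EQ 3) THEN mid = 1;
  IF (region EQ 4) THEN south = 1;

ANALYSIS:
  TYPE = TWOLEVEL;

MODEL:
  %WITHIN%
  iw sw | y1@0 y2@1 y3@2 y4@3;
  %BETWEEN%
  ib sb | y1@0 y2@1 y3@2 y4@3;
  ib sb ON west mid south;              ! region treated as fixed, via dummies
```

With only four regions, the three dummy coefficients compare each region with the reference region on the school-level intercept and slope; no region-level variance is estimated.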

Nikolai Eton posted on Wednesday, March 18, 2009 - 6:05 am

Dear Linda & Bengt,

Is it possible to measure individuals' (nested in groups) growth on group-level outcomes? Do you have any references or Mplus examples on that?

Many thanks in advance.

Bengt O. Muthen posted on Friday, March 20, 2009 - 11:59 am

Yes, this is possible. We don't have any refs or examples, but you specify it just like you would in regular growth.

Francis Huang posted on Tuesday, March 02, 2010 - 7:02 am

I am running a multilevel (students within school) latent growth curve model looking at achievement scores over three time points. I have three covariates at the student level (race, ses, and gender-- all binary).

I am computing the intercepts (starting point) and I'm wondering why only the intercept and slope at the between level are shown by default in the output. For computing the intercept (with race, ses, and gender set to 0), shouldn't the starting year be:

intercept(between)+intercept(within)?

Thanks.

Bengt O. Muthen posted on Tuesday, March 02, 2010 - 1:13 pm

There is only one parameter for the mean of the intercept growth factor and it appears at the between level (you can think of this as the intercept growth factor having zero mean on level 1). Note that the intercept growth factor mean is the mean of the outcome at time 1. Just like there is only one outcome mean, there is only one intercept mean.

Francis Huang posted on Tuesday, March 02, 2010 - 1:37 pm

Thank you for the reply.

A few follow up questions:

1. So I take it that the within level intercepts and slopes are usually not reported? What is the main purpose of the intercept(within) in the tech4 output? When I add the intercept(between) and the intercept(within) I get the overall mean and my intercept(within) is negative and not zero (it's only zero when I don't have any covariates in the model).

2. Also- would you have a recommendation of any article that does a good job of reporting the output of analyses using MLGC?

Many thanks.

Bengt O. Muthen posted on Tuesday, March 02, 2010 - 3:22 pm

Slide 58 of our Topic 8 handout makes it clear what iw, sw, ib, and sb consist of in multilevel terms.

You see there that with a covariate x,

iw = beta*x+r

and this is why Tech4 shows a non-zero value for the mean of iw - it is simply beta*x-bar. There is no intercept parameter for iw, which is the same as no mean parameter for iw when there are no covariates.

I cannot think of articles off hand - anybody else? I would think the Raudenbush-Bryk (2002) book has examples of this kind.

Francis Huang posted on Tuesday, March 02, 2010 - 3:49 pm

Thank you! Will check out your handout.

Francis Huang posted on Wednesday, March 03, 2010 - 9:01 am

Wanted to also ask: how do you get the corresponding standard errors of the output of TECH4 (variances)?

Bengt O. Muthen posted on Wednesday, March 03, 2010 - 11:11 am

Generally, a TECH4 quantity is a function of several model parameter estimates, not a single model parameter estimate. This is the case for the variance of an endogenous variable, for example. To get the SE you would have to define a NEW parameter in Model Constraint and express the new parameter as a function of the model parameters using their labels. An approximate approach is to, say, drop covariates so that you get the variance as a model parameter.
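
As a sketch of the Model Constraint approach described here, consider the within-level variance of the intercept factor when it is regressed on a single covariate x, so that Var(iw) = beta^2 * Var(x) + residual variance of iw. The variable names (y1-y3, x), the parameter labels (b, resv, vx, viw), and the single-covariate setup are assumptions for illustration only; with several covariates the expression would also need their covariances. Note that mentioning the variance of x brings x into the model as a modeled variable, which is a different treatment of x than conditioning on it.

```
MODEL:
  %WITHIN%
  iw sw | y1@0 y2@1 y3@2;
  iw ON x (b);           ! slope of iw on the covariate, labeled b
  sw ON x;
  iw (resv);             ! residual variance of iw
  x (vx);                ! variance of x, mentioned so that it can be labeled
  %BETWEEN%
  ib sb | y1@0 y2@1 y3@2;

MODEL CONSTRAINT:
  NEW(viw);
  viw = b*b*vx + resv;   ! model-implied within-level variance of iw
```

The new parameter viw is then listed with its own estimate and standard error, which is the quantity TECH4 reports without an SE.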

Gabriela R posted on Tuesday, February 15, 2011 - 11:50 am

Hello,
I hope you can give me some advice on the following:
I have modeled a questionnaire at 4 time-points, obtaining 4 factors, one at each time point. I then applied equal structure, equal loadings and equal thresholds constraints. The next step was to apply an LGM on the 4 factors. In order to reduce the number of variables in my model, I was thinking of fitting the intercept and slope straight on the factor scores of the 4 factors. The factor scores would be obtained from the invariance model, letting the 4 factors correlate.

My question is: If I let the factors correlate in the model from which I save the factor scores, would this bias the LGM estimation?

Thank you!
Gabriela

Bengt O. Muthen posted on Tuesday, February 15, 2011 - 12:02 pm

No, correlating the factors would be in line with the LGM because LGM implies a certain factor correlation.

To use factor scores, however, you should have a sufficient number of high-loading items for the factor. It does help of course that you draw on information from all 4 time points.

You can also do a 1-step ML analysis, although with categorical items that will involve 4 dimensions of numerical integration which gives heavy computations. You can also use Bayesian analysis which avoids the integration; see papers on our web site.

William Johnston posted on Wednesday, May 23, 2012 - 11:09 am

Hello,

I am trying to run a three-level model that estimates whether neighborhood-level poverty (measured only at Wave 1) impacts individual youths' academic growth trajectories, here measured by WISC scores.

When I run the following code Mplus balks and says that 'povtycon' (the neighborhood-level covariate of interest) has no variation when in reality it does.

Any ideas?

Thanks!

-William
- - - - - - - - - - -

USEVAR =
subid nc wiscraw wiscraw2 wiscraw3 needsratio povtycon race_d2 race_d3 race_d4 race_d5 sex_r;

CLUSTER = nc;

Analysis:
Type = TWOLEVEL;
ESTIMATOR = MUML;

MODEL:
%WITHIN%
iw sw | wiscraw@0 wiscraw2@1
wiscraw3*2 (1);
iw sw ON needsratio
race_d2 race_d3 race_d4 race_d5
sex_r;

%BETWEEN%
ib sb | wiscraw@0 wiscraw2@1
wiscraw3*2 (1);
ib sb ON povtycon;

OUTPUT:
SAMPSTAT STANDARDIZED RESIDUAL;

Linda K. Muthen posted on Wednesday, May 23, 2012 - 12:22 pm

Please send the output, data, and your license number to support@statmodel.com.

Tao Yang posted on Tuesday, December 04, 2012 - 8:30 pm

Dear Dr. Muthen,
I am running a two-level linear growth model with individually varying time scores. I would like to model the cross-level moderating effect of a between-level variable (w) on the within-level effects of latent intercept and slope on an outcome variable (z) respectively. My syntax is as below.

VARIABLE:
USEVAR = y1-y5 t1-t5 z w clustid;
TSCORES = t1-t5;
MISSING ARE ALL (-9999);
CLUSTER = clustid;
BETWEEN = w;

ANALYSIS:
TYPE = TWOLEVEL RANDOM;
ALGORITHM = INTEGRATION;
INTEGRATION = MONTECARLO(5000);

MODEL:
%WITHIN%
i s|y1-y5 AT t1-t5;
si| z ON i;
ss| z ON s;
%BETWEEN%
y1-y5@0;
si ss ON w;

I got the error message "THE MODEL ESTIMATION DID NOT TERMINATE NORMALLY DUE TO A NON-ZERO DERIVATIVE OF THE OBSERVED-DATA LOGLIKELIHOOD..CHECK YOUR STARTING VALUES OR INCREASE THE NUMBER OF MITERATIONS".

I increased number of miterations to 2000 and got the same message. I then increased Montecarlo integration points to 10000 and MITERATIONS to 10000, and got this message:

"THE ESTIMATED BETWEEN COVARIANCE MATRIX COULD NOT BE INVERTED..CHANGE YOUR MODEL AND/OR STARTING VALUES.THE MODEL ESTIMATION DID NOT TERMINATE NORMALLY DUE TO AN ERROR IN THE COMPUTATION..."

I am not sure what might be the cause(s) of the error and/or whether there were errors in the model specification.

Thanks!

Linda K. Muthen posted on Wednesday, December 05, 2012 - 6:22 am

Please send the output and your license number to support@statmodel.com.

William Johnston posted on Wednesday, February 06, 2013 - 12:45 pm

Based on the 3-level LG model mentioned above, I am running a model in which child cohort membership moderates the effect of the between-level covariate neighborhood affluence ("AFF"). I also want to include a mediator ("MED"), but two issues come up. First, I do not get any fit indices. Second, I am not able to use MODEL INDIRECT to do an effect decomposition. Is it possible to test for mediated moderation within a 3-level latent growth model? If so, how would I need to change the syntax below?

ANALYSIS:
TYPE = TWOLEVEL RANDOM;

MODEL:
%WITHIN%
iw sw | wrat@0 wrat2@1 wrat3@2;
s9i | iw on cohort9;
s12i | iw on cohort12;
s9s | sw on cohort9;
s12s | sw on cohort12;
iw sw on ...;

%BETWEEN%
ib sb | wrat@0 wrat2@1 wrat3@2;

s9i@0; s12i@0; s9s@0; s12s@0; wrat@0; wrat2@0; wrat3@0;

ib sb s9i s12i s9s s12s on AFF MED; MED on AFF;

MODEL INDIRECT:
s9i IND MED AFF;
s12i IND MED AFF;
s9s IND MED AFF;
s12s IND MED AFF;
ib IND MED AFF;
sb IND MED AFF;

Linda K. Muthen posted on Thursday, February 07, 2013 - 9:55 am

Chi-square and related fit statistics are not available when means, variances, and covariances are not sufficient statistics for model estimation. This is the case with TYPE=RANDOM.

MODEL INDIRECT is also not available. You can use MODEL CONSTRAINT to specify the indirect effects.
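
A minimal sketch of that suggestion, for the two between-level growth factors only, could look like the fragment below. It is meant to be merged into the %BETWEEN% section of the posted model, not run on its own. The labels a, b1, b2 and the new parameter names indib and indsb are hypothetical, and the indirect effect is taken to be the usual product of the two path coefficients; the random slopes s9i, s12i, s9s, and s12s would follow the same pattern.

```
MODEL:
  %BETWEEN%
  MED ON AFF (a);     ! AFF -> MED
  ib ON MED (b1);     ! MED -> between-level intercept factor
  ib ON AFF;          ! direct effect of AFF on ib
  sb ON MED (b2);     ! MED -> between-level slope factor
  sb ON AFF;          ! direct effect of AFF on sb

MODEL CONSTRAINT:
  NEW(indib indsb);
  indib = a*b1;       ! indirect effect of AFF on ib via MED
  indsb = a*b2;       ! indirect effect of AFF on sb via MED
```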

Alicia Doyle-Lynch posted on Thursday, May 30, 2013 - 10:16 am

I have a negative binomial model with time (4 timepoints) nested within students (N=18921) nested within schools (N=132). I am predicting intercepts and longitudinal slopes of alcohol consumption (a negative binomial outcome). I am having trouble getting these models to converge in wide format (they take several days and often do not converge). However, when I switch the data to long format, the models run much more quickly. I understand the differences between the models in long versus wide format. But is running these models in the long format still a valid way to analyze these data?
Thanks in advance.

Alicia Doyle Lynch posted on Thursday, May 30, 2013 - 12:35 pm

One more question regarding running the model in wide format. I want to use both individual level covariates (sex, race, SES, etc.) and school level covariates (public vs. private, school locale, etc.) to predict individual intercepts and slopes. I created within-level latent intercept, slope and slope squared terms (intcptw slopew squarew). However, I get an error message when I try to use these terms on the between level ("Within-level variables cannot be used on the between level."). I am worried, however, that by creating a second set of latent terms (intcptb slopeb squareb) on the between level I am actually predicting school-level intercepts and slopes and not individual-level intercepts and slopes. How can I be sure I am predicting individual level and not school level intercepts and slopes when using "between level" variables?
MODEL:
%WITHIN%
intcptw slopew squarew | dyam1@0 dyam2@1 dyam3@6 dyam4@13 ;
intcptw slopew squarew ON sex race SES ;
dyam1-dyam4(1);
%BETWEEN%
intcptb slopeb squareb | dyam1@0 dyam2@1 dyam3@6 dyam4@13 ;
intcptb slopeb squareb ON public locale1 locale2 ;
dyam1-dyam4@0;

Bengt O. Muthen posted on Friday, May 31, 2013 - 8:44 am

For your first question you have to send the outputs for the same model done in wide and long for us to see.

For your second question, intcptb etc on Between are the between-level parts of the growth factors, so it is correct to regress them on between-level covariates. So, between-level covariates predict the between-level part of the growth factors which in turn predict the between-level part of the dyam outcomes, which therefore predict the observed dyam outcomes.

Jamie Humphrey posted on Thursday, October 23, 2014 - 12:20 pm

Using complex survey data, I am trying to run a three-level growth model where children are nested in 4 time points as well as neighborhoods. My data are in long format due to non-constant weights over time. I've written code based on example 9.16 where y2= outcome, x1-x5= time-invariant covariates, a1-a5= time-varying covariates. When I attempt to run the model I get the following error:

*** ERROR IN MODEL command
Between-level variables cannot be used on the within level. Between-level variables used: Y2.

How do I need to structure my code to avoid this error? Also, How would I modify my code to set the covariance between the random intercept and slope to zero?

TITLE: Random Intercept & Slope Model with Time-Invariant & Time Varying Covariates;
DATA: FILE IS C:...\Mplus\math_long.csv;
VARIABLE:NAMES= id clus strat iptw x1 y3 y4 y1 y2 x2-x5 a1-a5 time;
USEVARIABLE= clus strat iptw x1 y2 x2-x5 a1 a2 a4 a5 time;
MISSING= ALL (-1234);
CLUSTER= clus;
STRATIFICATION= strat;
WEIGHT= iptw;
WITHIN= time y2 a1 a2 a4 a5;
BETWEEN= x1-x5;
ANALYSIS: TYPE= TWOLEVEL COMPLEX RANDOM;
MODEL: %WITHIN%
s | y2 ON time;
y2 ON a1 a2 a4 a5;
%BETWEEN%
y2 s ON x1-x5;
y2 WITH s;
OUTPUT: SAMPSTAT TECH4 TECH8;

Bengt O. Muthen posted on Thursday, October 23, 2014 - 12:30 pm

Note that UG ex 9.16 does not put y (your y2) on the Within list. Note also that this UG ex shows that you should not say

y2 ON a1 a2 a3 a4;

Instead, this statement should only have one variable "a", which like y2 repeats 4 times.

On our website you will find UG ex9.16 input and data so you can see how the data are structured.

Jamie Humphrey posted on Thursday, October 23, 2014 - 12:53 pm

Thank you for the fast response. I apologize for being unclear. Each of the 'a' variables represents a different time-varying covariate, e.g., SES advantage, residential stability, rather than the same variable at different time points. After I removed y (my y2) from the Within list, I still received the same error.

Bengt O. Muthen posted on Thursday, October 23, 2014 - 2:44 pm

Please send your output, data, and license number to Support@statmodel.com.
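
The second question in the original post, how to set the covariance between the random intercept and the random slope to zero, is not answered in the thread. In Mplus a parameter is fixed with @, so one possible sketch, using the poster's own variable names, changes only the WITH statement on the between level. Because y2 and s are both regressed on x1-x5 there, the parameter being fixed is their residual covariance.

```
%BETWEEN%
y2 s ON x1-x5;
y2 WITH s@0;     ! residual covariance of random intercept and slope fixed at zero
```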

Minnik Findik posted on Tuesday, June 02, 2015 - 5:11 am

Hello,

I keep on getting...
THE MODEL ESTIMATION TERMINATED NORMALLY
WARNING: THE LATENT VARIABLE COVARIANCE MATRIX (PSI) IS NOT POSITIVE DEFINITE...

WITHIN = CumRiskTb;
BETWEEN = ScholClim08 ScholClim09 ScholClim10;
MISSING=ALL (-999);
IDVARIABLE IS ID;
CLUSTER IS SchId;
ANALYSIS: TYPE IS TWOLEVEL;

MODEL: %WITHIN%
iEMO sEMO| MAMS_emo_08@0 MAMS_emo_09@1 MAMS_emo_10@2;
iEMO sEMO on CumRiskTb;

%BETWEEN%
iSCLIM sSCLIM| ScholClim08b@0 ScholClim09b@1 ScholClim10b@2;
iEMOB sEMOB| MAMS_emo_08@0 MAMS_emo_09@1 MAMS_emo_10@2;
iEMOB on iSCLIM;
sEMOB on sSCLIM;

There are no negative variances. But some correlations are 999.00 (i.e. Iext & Sext).
Can the warning be due to that? If so, is there anything I can do to fix it?
Thank you for your time!

Linda K. Muthen posted on Tuesday, June 02, 2015 - 6:25 am

Please send the output and your license number to support@statmodel.com.

Bep Uink posted on Thursday, October 15, 2015 - 12:33 am

Hello, I am trying to model the effect of stress on change in emotion across the day using the univariate (i.e., multilevel) format.
I have centered time on stressful event (which is a level 1 IV). So, t=0 when stressful event occurs; t = 1, t = 2 etc. are time points after the event and t = -1, t= -2 etc. are time points before the event.
I am unsure how to interpret significant main effects of time. There is a sig. negative relationship between Time of Event and emotion, can I say that as time moves toward values > 0 (i.e. post-stress) values of emotion decrease?

Bengt O. Muthen posted on Thursday, October 15, 2015 - 3:32 pm

Yes.

Wong Yin Yee posted on Tuesday, July 25, 2017 - 8:15 pm

Hi Bengt,

I have a question about the interpretation of results.

I ran a multilevel (members in teams) latent growth model and included an individual-level time-varying covariate with a random slope that varies on both the within and between levels. So it is similar to example 9.14 in the Mplus User's Guide.

1. How do I know whether a1-a4 has an impact on y1-y4?

2. For "S" reported under variances at the within level, what does it mean if I get a positive estimate with a significant p-value? Does it mean that there is a between-person difference?

3. For "S" reported under means at the between level, I get a negative estimate with a significant p-value. What does it mean?

4. For "S" reported under variances at the between level, I get a positive estimate with a significant p-value. Does it mean that there is a between-team difference?

5. If I add a team-level predictor of "S" at the between level and get a positive estimate with a significant p-value, what does it mean?

I am a new user of Mplus, so please help answer these questions. Many thanks.

Bengt O. Muthen posted on Wednesday, July 26, 2017 - 3:55 pm

Answered on Mplus Support.

Diana Ribeiro da Silva posted on Friday, October 20, 2017 - 3:49 am

Dear Muthen & Muthen

We conducted a randomized controlled trial (treatment and control groups assessed at 4 time points) and we are interested in finding out whether different profiles of those subjects (we previously conducted an LPA based on their personality and found a 4-class solution) change differently over time (in variables such as anger, shame, paranoia, ...), also considering whether they are in the treatment or the control group.

We tried GMM, but that did not solve our problem, because we do not want to classify people according to how they change over time. We want to see how different profiles change over time, also considering treatment/control condition.

Is there any way to enter those 4 different profiles as multiple dummy variables in a LGCM?

Or is there a better way to do it?

Thank you so much

Bengt O. Muthen posted on Friday, October 20, 2017 - 5:34 pm

Look at Latent Transition Analysis examples in the UG and papers on this topic (under Papers) on our website.

Laura Alexandra posted on Sunday, March 17, 2019 - 7:17 am

Dear Mr. and Mrs Muthén

We face the problem of a rather complex longitudinal data structure based on which we would like to recover a developmental score scale (e.g., latent variable growth score) and would very much appreciate your advice.

The data structure is as follows:

- Intensive longitudinal data consisting of multiple digital assessments that students completed throughout the school year on a digital platform/formative assessment system. We face imbalanced data per student and unequal time intervals (e.g. 5 to 100 assessments per student per school year). Further, the students completed different assessments (different items, maybe some overlap), and the items are binary coded (correct/false).

Our aim is to gain insight into students' development in subject domains, and we would like to extract a developmental score for further analyses (based on as few assumptions about e.g. functional form, dimensionality etc. as possible; previous analyses on other data may show that we need to fit a multidimensional model with different dimensions for e.g. mathematics ...).

Could you give us a hint on a potentially suitable latent variable (growth) modelling technique?

Many thanks for your expertise in this regard! We highly appreciate it

Bengt O. Muthen posted on Sunday, March 17, 2019 - 4:34 pm

Regarding analysis of intensive longitudinal data, including unequal time intervals, you may want to study the article

Hamaker, E.L., Asparouhov, T., Brose, A., Schmiedek, F. & Muthén, B. (2018). At the frontiers of modeling intensive longitudinal data: Dynamic structural equation models for the affective measurements from the COGITO study. Multivariate Behavioral Research, DOI: 10.1080/00273171.2018.1446819 (Online supporting material).

See also our Short Course Topic 12 video and handout at

http://www.statmodel.com/course_materials.shtml

Growth modeling of latent variable constructs is treated in our Short
Course Topic 4. See especially the section on Multiple Indicator growth.

Rachel Dew posted on Wednesday, March 20, 2019 - 9:08 am

I have been working on a growth model of behavioral variables over four waves of data, using age as a time score. A reviewer has asked that I also use age as a control variable. Is that necessary and if not, how would I explain that?

Rachel

Bengt O. Muthen posted on Wednesday, March 20, 2019 - 5:01 pm

If the starting age varies to a substantively important degree, you should take this into account. A flexible and interesting way is to see the different starting ages as multiple-cohorts like in UG ex 6.18.