

Modeling clustered data over time 



Hello,

I am trying to test a simple model that gets complicated very quickly because my data are clustered and have repeated measures (7 time points). I am working with version 3.0 of Mplus. I have six observed variables that I believe load on a single latent variable, and I want to test whether this is indeed the case. I have collected these 6 variables quarterly for two years at 133 different facilities across the country. The facilities cluster regionally, as they are managed by 23 regional offices (each facility can belong to only one region).

I was able to successfully test for fit using only one time point (code printed below). Now I want to test the LV1 BY OV1-OV6 model using all 7 quarters of data in addition to regional clustering (I assume that the scores on these observed variables change over time; however, I want to test that the model does not). So my question is: how do I account for both time and region in the same model?

Many thanks!
Sylvia J. Hysong

ORIGINAL, SINGLE QUARTER CODE FOLLOWS:

VARIABLE: NAMES ARE region ov1-ov6 facno;
          USEVARIABLES ARE region ov1-ov6;
          CLUSTER IS region;
ANALYSIS: TYPE IS COMPLEX;
          ESTIMATOR IS MLR;
          ITERATIONS = 1000;
          CONVERGENCE = 0.00005;
MODEL:    LV1 BY ov1-ov6;
OUTPUT:   SAMPSTAT STANDARDIZED MODINDICES CINTERVAL;


One way to deal with it is shown in Example 9.12. Or you could use Example 6.1 and add the CLUSTER option to the VARIABLE command and TYPE=COMPLEX; to the ANALYSIS command. I would also suggest upgrading to the most recent version of Mplus. There have been many changes in the nine years since Version 3 came out. 
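As a rough illustration of the second suggestion, here is a minimal sketch that assumes Example 6.1 is the User's Guide linear growth model for a continuous outcome and simply adds the CLUSTER option and TYPE = COMPLEX to it. The file name wide.dat and the variable names y1-y7 (one outcome measured at seven quarters, one row per facility) are hypothetical and are not taken from the original post.

```mplus
! Sketch only: growth model with cluster-robust standard errors.
! Hypothetical names: wide.dat, facno, y1-y7.
DATA:     FILE IS wide.dat;
VARIABLE: NAMES ARE facno region y1-y7;
          USEVARIABLES ARE y1-y7;
          CLUSTER IS region;
ANALYSIS: TYPE = COMPLEX;
! Intercept (i) and slope (s) factors with fixed time scores 0-6.
MODEL:    i s | y1@0 y2@1 y3@2 y4@3 y5@4 y6@5 y7@6;
```

TYPE = COMPLEX leaves the model itself unchanged and adjusts standard errors and fit statistics for the non-independence of facilities within a region.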


Hi Linda,

Thank you for your feedback. Sorry for the delay in responding; this project was put on hold for a while and is just now starting up again. I don't know if I made this sufficiently clear in my original post, but my interest is not to model change over time in these data. My interest is just to test the LV1 BY ov1-ov6 model; it's just that I happen to have clustered and autocorrelated data because I have 7 quarters of it. So all I want to do is calculate accurate estimates for the measurement model. Does that make sense? How do I do that?

Many thanks,
Sylvia J. Hysong

P.S. My data are currently in "tall and skinny" format, that is, multiple rows of data for each case (i.e., variables = facility, region, quarter, OVnum, score). To do the analysis correctly, do I need them in "short and fat" format, that is, one row of data for each case (variables = facility, region, ov1q1, ov1q2... ov6q3, ov6q4)?


Sounds like what you want to do I would call longitudinal factor analysis. And you have clustered data, so you could use either Type = Complex or Type = Twolevel; you probably want to use Type = Complex for simplicity since you have only 23 clusters. You would arrange your data in line with ex 9.12 in the V4 UG so that you have the 6*7 outcomes as columns for a facility (plus the other variables, including the cluster the facility is in). The model would be like

f1 by ov11-ov16;
f2 by ov21-ov26;
...
f7 by ov71-ov76;

where you can test various degrees of across-time equality constraints of measurement loadings and intercepts (see Chapter 16 of the UG for such equality testing).
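The setup described in this reply might look roughly like the following. It is a sketch under stated assumptions, not a tested input file: wide.dat and facno are hypothetical names, the data are assumed to hold one row per facility, and the variables are assumed to be named ovTI, where T is the quarter (1-7) and I is the indicator (1-6). Only loading equality is shown; intercept equality would need further labels.

```mplus
TITLE:    Longitudinal CFA sketch: 6 indicators at 7 quarters;
! Hypothetical file name; one row per facility (wide format).
DATA:     FILE IS wide.dat;
VARIABLE: NAMES ARE facno region
            ov11-ov16 ov21-ov26 ov31-ov36 ov41-ov46
            ov51-ov56 ov61-ov66 ov71-ov76;
          USEVARIABLES ARE ov11-ov76;
          CLUSTER IS region;
ANALYSIS: TYPE = COMPLEX;
          ESTIMATOR = MLR;
! The first loading of each factor is fixed at 1 by default.
! Labels (1-5) hold the other five loadings equal across quarters.
MODEL:    f1 BY ov11
                ov12-ov16 (1-5);
          f2 BY ov21
                ov22-ov26 (1-5);
          f3 BY ov31
                ov32-ov36 (1-5);
          f4 BY ov41
                ov42-ov46 (1-5);
          f5 BY ov51
                ov52-ov56 (1-5);
          f6 BY ov61
                ov62-ov66 (1-5);
          f7 BY ov71
                ov72-ov76 (1-5);
OUTPUT:   SAMPSTAT STANDARDIZED;
```

Removing the (1-5) labels gives the unconstrained model, and comparing the two fits is one way to test loading invariance across quarters. With MLR, the chi-square difference needs a scaling correction rather than a plain subtraction.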

Dana Wood posted on Sunday, February 22, 2009 - 8:02 pm



Hello, I want to estimate a latent growth model for three time points. My study participants are nested within schools; however, all participants change schools over time. I was planning to use Type = Complex and the CLUSTER option. Is there any way to specify that participants are nested in different clusters at each time point? Thanks in advance!


No, this option is not currently available in Mplus. 

Jaime Booth posted on Tuesday, April 03, 2018 - 1:17 pm



Hello,

I have a question similar to the one that Dana posted in 2009. I am trying to estimate a path model with a data set in which individuals are nested in neighborhoods over 5 points in time, with about 15% of respondents moving neighborhoods at each time point. I see that no option was available in 2009 to address this issue, but I was wondering whether one has been added since. Does this address the problem?

DATA WIDETOLONG:
  wide = trt8 trt10 trt11 trt12 trt15;
  long = trt;
  IDvariable = trt;
  Repetition = age;
Variable:
  Names are id dan11 dan8 dan10 PMon_P10 PMon_P11 PMon_P12
    trt8 trt10 trt11 trt12 trt15
    Disa10 Disa11 Disa12 Disa15 Disa6 Disa8
    Cohes_10 Cont_11 Cohes_11 Cont_12 Cohes_12
    Tob17 Maj17 Alc17;
  Missing are all (9999);
  usevariables are Tob17 Maj17 Alc17 PMon_P11 PMon_P12
    Cont_11 Cohes_11 Cont_12 Cohes_12
    Disa11 Disa12 Disa15 trt;
  cluster is trt;
ANALYSIS: TYPE=COMPLEX;


We have not developed general methodology for multiple membership modeling. 


