

I ran an unconditional growth model with both a linear and a quadratic growth factor. My primary outcome is categorical, and I am treating it as such in the input file using the code below. With this code the intercept was not estimated. Why would this happen, given that it is estimated when I treat the categorical observed variable as continuous?

    CATEGORICAL ARE t1smkst t2smkst t3smkst t4smkst;
    Analysis: TYPE = MEANSTRUCTURE MISSING H1;
    MODEL: i s q | t1smkst@0 t2smkst@1 t3smkst@2 t4smkst@3;
    i with s@0;
    i with q@0;
    s with q@0;
    s@0;
    q@0;
    OUTPUT: SAMPSTAT RESIDUAL STANDARDIZED;


The growth model parameterization with categorical outcomes is to hold thresholds equal across time and not estimate the mean of the intercept growth factor. The two equivalent but different growth model parameterizations are shown in Chapter 16 where growth models are discussed. This one is used with categorical outcomes because it offers more flexibility with ordinal variables.


Hello, I have a small sample (N=86) and 3 time points with binary outcomes. I plan to run an LCGM to find trajectory classes. The time interval between T1 and T2 varies across individuals between 10 and 50 years; between T2 and T3 it is the same for each person (14 years). My idea is to center time at T2 (slope loading = 0), use a loading of 1 for T3, and allow the loading of T1 to be free. Example for a 3-class solution:

    classes = c(3);
    categorical = t1 t2 t3;
    missing are t1 t2 t3 (99);
    analysis: type=mixture;
    STARTS= 500 50;
    model:
    %overall%
    i s | t1 t2@0 t3@1;

My questions:
1. Are these conditions (time points, sample size) sufficient for this method?
2. What do you suggest is the best way to specify time?
Thank you!

Jon Heron posted on Thursday, July 29, 2010 - 1:51 pm



Hi Mario, I think you're going to need to go long format here, or wide format but model time with the AT command. With 3 fixed time points and binary data you only have 8 pieces of information: the number of occurrences of 000, 001, 011, 010, 100, 110, 101 and 111. That doesn't sound like enough to me to fit 3 classes (2 parameters), estimate means for intercept and slope in each one (another 6 parameters) and also estimate one of your loadings.
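The counting argument in the post above can be checked with a few lines of Python. This is only a sketch of the bookkeeping, not an identification proof: the parameter tally follows the count given in the post, and the variable names are made up for illustration. It also makes explicit that, because the pattern frequencies sum to N, at most 7 of the 8 are free.

```python
from itertools import product

# All possible response patterns for a binary outcome at 3 fixed time points.
n_timepoints = 3
patterns = ["".join(p) for p in product("01", repeat=n_timepoints)]

# The pattern frequencies must sum to N, so one of them is redundant.
free_frequencies = len(patterns) - 1

# Parameters in the proposed 3-class model, tallied as in the post:
n_classes = 3
class_proportions = n_classes - 1   # 2 parameters
growth_means = 2 * n_classes        # intercept and slope mean per class: 6
free_loadings = 1                   # the freed T1 loading
n_parameters = class_proportions + growth_means + free_loadings

print(len(patterns), free_frequencies, n_parameters)  # prints: 8 7 9
```

With 9 parameters against at most 7 free frequencies, the model asks for more than the data can supply under this tally.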


Hi Jon, thank you for your response. When comparing with model 8.9, is it because of only 3 time points? Is there another way to test for or to confirm trajectories of these binary data (diagnosis vs. none) across three time points? 

Jon Heron posted on Thursday, July 29, 2010 - 2:59 pm



Hi Mario, the method I obscurely referred to as "AT" is shown in example 6.12. Ignore the stuff with "ST" in it. This allows for varying ages at each observation, e.g. you might have an intended data collection when the subjects are age 12, but for a number of reasons the actual age is scattered across 11.5-12.5. You can get an equivalent model by stacking the dataset into long format and fitting an MLwiN-style growth model. This would seem a reasonable approach if your ages were spread in the way I describe; however, I am a little bit wary in your case, as you say the spread of ages is 10-50 years. Can anything be modelled in a linear fashion across such a range?


Hello Jon, thank you. Okay, I reshaped the dataset to long format and now it contains ID, "age" (0 at baseline) and individually varying ages for t1 and t2, as well as the associated diagnosis status (0/1). Is there no way to run such a model in Mplus? What about Stata? Or did I misunderstand you?


Example 9.16 shows the model specification for a growth model using the long format. 


Hello Linda, I studied the instructions for model 9.16 but do not fully understand its parts (time, a3). Is there a topic handout available? My aim is to find and/or confirm classes of trajectories over 3 time points with binary outcomes, without any covariates but with individually varying time intervals. Thank you for your support. Mario


Hello Linda, thank you for your response. I studied model 9.16 but do not fully understand its parts (e.g., time, a3). Is there a topic handout available? My aim is to find and/or confirm trajectories of a binary outcome over three time points, without any covariates but with individually varying time intervals. Mario


Sorry, got an error message after first posting! 

Jon Heron posted on Monday, August 02, 2010 - 8:58 am



Hi Mario, a3 is the time-varying covariate. This might be useful. I've fitted a growth model to 3 measures of weight (wt1-wt3) with variable times of measurement. Both these bits of syntax give the same result (wide and then long format). Should be simple to extend to multiple classes.

Wide format:

    VARIABLE:
    names = wt1 wt2 wt3 agewks1 agewks2 agewks3;
    missing are all (9999);
    tscores = agewks1 agewks2 agewks3;
    ANALYSIS:
    proc = 2;
    type = random;
    MODEL:
    i s | wt1 wt2 wt3 AT agewks1 agewks2 agewks3;
    i s;
    i with s;
    wt1 wt2 wt3 (equalvar);

Long format:

    VARIABLE:
    names = id wt agewks;
    missing are all (9999);
    cluster = id;
    within = agewks;
    ANALYSIS:
    type = twolevel random;
    algorithm = integration;
    integration = 100;
    miterations = 1000;
    MODEL:
    %within%
    s | wt on agewks;
    %between%
    wt with s;
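The two data layouts used in the post above differ only in how the repeated measures are stacked. As a rough illustration, the following Python sketch turns wide records into long ones. The variable names wt1-wt3, agewks1-agewks3 and the 9999 missing code come from the post; the `id` field in the wide records, the function name `wide_to_long`, the data values, and the choice to drop occasions with a missing weight or age are all assumptions made for this example.

```python
MISSING = 9999  # missing-value code used in the syntax above

def wide_to_long(rows, n_waves=3, missing=MISSING):
    """Stack wide records (id, wt1..wtK, agewks1..agewksK) into (id, wt, agewks) rows."""
    long_rows = []
    for row in rows:
        for k in range(1, n_waves + 1):
            wt, age = row[f"wt{k}"], row[f"agewks{k}"]
            # Assumption: skip occasions where the outcome or its time score is missing.
            if wt == missing or age == missing:
                continue
            long_rows.append({"id": row["id"], "wt": wt, "agewks": age})
    return long_rows

# Made-up records for illustration only.
wide = [
    {"id": 1, "wt1": 3.4, "wt2": 5.1, "wt3": 6.8,
     "agewks1": 0, "agewks2": 8, "agewks3": 17},
    {"id": 2, "wt1": 3.1, "wt2": 9999, "wt3": 6.2,
     "agewks1": 1, "agewks2": 9999, "agewks3": 15},
]

long_data = wide_to_long(wide)
for r in long_data:
    print(r["id"], r["wt"], r["agewks"])
```

Each subject contributes one row per observed occasion, which is the shape the long-format syntax expects (one `wt`, one `agewks`, grouped by `id`).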


Hello Jon, thanks for these examples. When I specify the corresponding model:

    variable:
    names are id dia0 dia1 dia2 zeit0 zeit1 zeit2;
    usevariables dia0 dia1 dia2 zeit0 zeit1 zeit2;
    tscores = zeit0 zeit1 zeit2;
    categorical = dia0 dia1 dia2;
    analysis:
    type=random;
    proc = 2;
    model:
    i s | dia0 dia1 dia2 at zeit0 zeit1 zeit2;
    i s;
    i with s;
    dia0 dia1 dia2 (equalvar);

...I get the error message that "proc" is unknown. Furthermore, how can I extend this to classes (type=mixture?)? Is (equalvar) part of the model? I didn't find any explanation of it. Thanks, Mario

Jon Heron posted on Tuesday, August 03, 2010 - 11:32 am



1] Proc problem. Have you got a really old version of Mplus, v4 perhaps?
2] Equalvar. Anything in round brackets implies a constraint. This is simply saying that the residual variances are constrained equal across time. Not necessary for the model, but needed to obtain the same answer as long format.
3] Add "type = mixture;" to the analysis, "classes = c(2);" to the variable section and "%overall%" immediately after "model:".


Yes, I use v4! What is it for, and can I use an equivalent command in v4?

Jon Heron posted on Wednesday, August 04, 2010 - 6:43 am



It's to utilise multiple processors if your PC has more than one. It means your programs will run faster: random starts are shared across the processors. Guess it was introduced in version 5. Your sample is small so I don't think it's a big problem here, but it makes a big difference to me as I have 100 times your sample size. I should point out that that's not the only benefit to upgrading: new routines, new estimators, better efficiency. I'll stop now cos I don't work for Mplus' sales dept ;)


Thank you, Jon! I hope it's okay that I ask all these things, but I've just begun to work with Mplus. I followed all your suggestions but two errors occurred:
1. tscores: are these relevant for the model? tscores are not compatible with mixture models.
2. The known error message: Categorical variable DIA1 contains less than 2 categories. ...although I double-checked it!
Thanks, Mario

Jon Heron posted on Thursday, August 05, 2010 - 10:54 am



1] Yes, you need them for allowing varying times of observation if you're modelling in wide format. Sounds like long format is the only option then.
2] Has your dataset been read in correctly? If you fit a very simple model, e.g.

    i | dia0@0 dia1@1 dia2@2;

then the first bit of output will be the distribution of each of your categorical measures. Does that look like it should?

Jon Heron posted on Friday, August 06, 2010 - 7:07 am



Mario, I'm struggling to estimate a 2-class growth model in long format using my data, so I am wondering if you have any other options to keep things simple. For simpler wide-format analysis you need more degrees of freedom, so how about 3-level instead of binary repeated measures? Is that possible with your data?


Hi Jon, my outcome is strictly binary...either you are below or above the diagnostic cutoff from a stress response syndrome. We used interview data containing different criteria. I will try your two suggestions from above and let you know. You are a big help to me. Thank you very much. Mario 


Oh, I mixed up dia2 with time0 (which is always 0), so Mplus didn't read two categories... it's corrected now. When I specify "mixture" I get the message:

    TSCORES option is only available with TYPE = RANDOM.

Otherwise (with "random") I get:

    *** WARNING in Variable command
    CLASSES option is only available with TYPE=MIXTURE. CLASSES option will be ignored.
    *** ERROR in Model command
    Unknown variables: %OVERALL% in line: %OVERALL%

Should I try your suggested long-format syntax?

Jason Bond posted on Saturday, December 06, 2014 - 5:59 pm



Bengt/Linda, I intend to do age-based (instead of wave-based) analyses using the NLSY dataset. Do all age-based analyses require the use of tscores? That is, although year of interview may not be exactly linearly related to age, if one simply uses age at first interview and then assumes that age increases exactly linearly with the time between interview years, is there a way to get around using the tscores option to do age-based analyses? Thanks, Jason


If you have the same distance in time between the repeated measurements for all subjects there is no need for t scores. 

Jason Bond posted on Tuesday, December 16, 2014 - 7:28 pm



I was hoping this was the case. Then how would one implement age-based analyses? My guess is that fixing time to specific values (e.g., @0 for the first wave, etc.) would not be correct? Thanks again...


See UG ex 6.18. 

Jason Bond posted on Wednesday, January 21, 2015 - 9:01 pm



Thanks for this. Beyond the question of cohort effects, however, the endpoint of our analyses is to estimate LCGM/GMM. Assuming no cohort effects are found, is there a way to set up such mixture models for age-based analyses without using TSCORES, assuming time between waves is constant across individuals? Thanks again... Jason


The approach of UG ex 6.18 can be used in a mixture setting as well. Group (cohort) becomes Knownclass. 

Jason Bond posted on Tuesday, February 10, 2015 - 6:57 pm



Related... I'm analyzing NLSY with data from 11 waves. In using TSCORES in a random slope model, say, the error message I get is:

    *** ERROR in MODEL command
    The number of fixed time scores is not sufficient for model identification in the following growth process: I S

from the corresponding analysis syntax:

    i s | f3_82 f3_83 f3_84 f3_88 f3_89 f3_94 f3_02 f3_06 f3_08 f3_10 f3_12 AT
    t_82 t_83 t_84 t_88 t_89 t_94 t_02 t_06 t_08 t_10 t_12;

Although I'm guessing it doesn't matter, age has been 'centered' around 21 (and the data truncated so that only data from ages 21-51 are analyzed) so that 'times' of measurement indicate how much older than 21 the respondent was at the corresponding wave. Thanks for any input... Jason
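The centering described in the post above amounts to simple arithmetic per respondent and wave. The Python sketch below illustrates it under stated assumptions: the wave years are inferred from the variable suffixes in the post (t_82 ... t_12), age is approximated as wave year minus birth year, the birth year 1960 is a made-up example, and `time_scores` is a hypothetical helper, not part of any package. It only shows how the individually varying time scores could be built; it does not address the error message itself.

```python
# Wave years inferred from the variable suffixes t_82 ... t_12 (an assumption).
WAVE_YEARS = [1982, 1983, 1984, 1988, 1989, 1994, 2002, 2006, 2008, 2010, 2012]

def time_scores(birth_year, wave_years=WAVE_YEARS, center=21, max_age=51):
    """Return years past age `center` at each wave, or None outside [center, max_age]."""
    scores = []
    for year in wave_years:
        age = year - birth_year  # crude age: ignores month of birth and interview
        scores.append(age - center if center <= age <= max_age else None)
    return scores

# Hypothetical respondent born in 1960.
print(time_scores(1960))
# prints: [1, 2, 3, 7, 8, 13, 21, 25, 27, 29, None]
```

Respondents with different birth years get different time scores at the same wave, which is why the scores are supplied per person rather than fixed in the model statement.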

Jason Bond posted on Tuesday, February 10, 2015 - 7:12 pm



Related to the above-mentioned NLSY data (11 waves and over 10k cases), I'm trying to estimate a single-class cubic growth model with random growth coefficients and no other covariates. Quadratic models seem to converge fine, but cubic (and higher-order) models give me:

    THE MODEL ESTIMATION DID NOT TERMINATE NORMALLY DUE TO AN ILL-CONDITIONED FISHER INFORMATION MATRIX. CHANGE YOUR MODEL AND/OR STARTING VALUES.

    THE MODEL ESTIMATION DID NOT TERMINATE NORMALLY DUE TO A NON-POSITIVE DEFINITE FISHER INFORMATION MATRIX. THIS MAY BE DUE TO THE STARTING VALUES BUT MAY ALSO BE AN INDICATION OF MODEL NONIDENTIFICATION. THE CONDITION NUMBER IS 0.204D-10.

    THE STANDARD ERRORS OF THE MODEL PARAMETER ESTIMATES COULD NOT BE COMPUTED. THIS IS OFTEN DUE TO THE STARTING VALUES BUT MAY ALSO BE AN INDICATION OF MODEL NONIDENTIFICATION. CHANGE YOUR MODEL AND/OR STARTING VALUES. PROBLEM INVOLVING PARAMETER 1.

I've tried increasing the starts to no avail, and the same LL seems to be reached for a number of starts. With so many time points per respondent available, I would assume that such a model would be estimable. Might you have any advice for fitting such a model (other than the typical fixing of parameters)? Thanks much again, Jason


Post 1: The Timescores option should be used in conjunction with the AT option of growth modeling. Post 2: We need to see the output. 
