I ran an unconditional growth model with both a linear and a quadratic growth factor. My primary outcome is categorical, and I am treating it as such in the input file using the code below. With this code, the intercept was not estimated. Why does this happen, given that it is estimated when I treat the categorical observed variable as continuous?
The growth model parameterization with categorical outcomes is to hold thresholds equal and not estimate the mean of the intercept growth factor. The two different but equivalent growth model parameterizations are shown in Chapter 16, where growth models are discussed. This one is used with categorical outcomes because it offers more flexibility with ordinal variables.
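A small numerical sketch of why the intercept mean and the thresholds cannot both be estimated freely for a binary outcome. It assumes a logistic link in which P(u = 1) depends only on the difference between the intercept factor value and the threshold. The function name and the numbers are illustrative and do not come from the thread or from Mplus output.

```python
import math

def prob_one(eta, tau):
    """P(u = 1) under a logistic link with latent value eta and threshold tau."""
    return 1.0 / (1.0 + math.exp(-(eta - tau)))

# Hypothetical values: intercept mean fixed at 0, threshold at 0.8.
p_fixed_mean = prob_one(eta=0.0, tau=0.8)

# Shift both the intercept mean and the threshold by the same constant.
shift = 1.5
p_shifted = prob_one(eta=0.0 + shift, tau=0.8 + shift)

# Both parameter sets imply the same response probability.
print(p_fixed_mean, p_shifted)
```

Because only the difference is identified, something has to be fixed. Holding the thresholds equal over time and fixing the intercept mean at zero is the choice described above.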
I have a small sample (N=86) and 3 time points with binary outcomes. I plan to run an LCGM to find trajectory classes. The time interval between T1 and T2 varies across individuals, from 10 to 50 years; the interval between T2 and T3 is the same for each person (14 years).
My idea is to center time at T2 (slope loading=0), using a loading of 1 for T3 and allow the loading of T1 to be free.
Example for a 3-class solution:
classes = c(3); categorical = t1 t2 t3; missing are t1 t2 t3 (99);
Thank you for your response. When comparing with model 8.9, is it because of only 3 time points? Is there another way to test for or to confirm trajectories of these binary data (diagnosis vs. none) across three time points?
Jon Heron posted on Thursday, July 29, 2010 - 2:59 pm
The method I obscurely referred to as "AT" is shown in example 6.12. Ignore the stuff with "ST" in it.
This allows for varying ages at each observation, e.g. you might have an intended data collection when the subjects are age 12 but due to a number of reasons the actual age is scattered across 11.5 - 12.5.
You can get an equivalent model by stacking the dataset into long format and fitting an MLwiN-style growth model.
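Stacking into long format can be sketched as follows. This is only an illustration of the reshaping step, not of the model itself. The function `wide_to_long`, the column names and the data values are all hypothetical.

```python
def wide_to_long(records, outcome_cols, time_cols, id_col="id"):
    """Stack one row per person into one row per person-occasion."""
    long_rows = []
    for rec in records:
        for y_col, t_col in zip(outcome_cols, time_cols):
            y, t = rec[y_col], rec[t_col]
            if y is None or t is None:
                continue  # skip occasions that were not observed
            long_rows.append({id_col: rec[id_col], "y": y, "time": t})
    return long_rows

# Hypothetical wide data: three outcomes with person-specific ages.
wide = [
    {"id": 1, "y1": 0, "y2": 1, "y3": 1, "age1": 11.6, "age2": 12.4, "age3": 13.1},
    {"id": 2, "y1": 0, "y2": None, "y3": 1, "age1": 12.2, "age2": None, "age3": 13.9},
]
long = wide_to_long(wide, ["y1", "y2", "y3"], ["age1", "age2", "age3"])
for row in long:
    print(row)
```

Each person contributes as many rows as observed occasions, and the person-specific time of measurement becomes an ordinary within-person covariate.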
This would seem a reasonable approach if your ages were spread in the way I describe. However, I am a little wary in your case, as you say the spread of ages is 10-50 years. Can anything be modelled in a linear fashion across such a range?
Thank you. Okay, I reshaped the dataset to long format, and it now contains ID, "age" (0 at baseline), the individually varying ages for t1 and t2, and the associated diagnosis status (0/1). Is there no way to run such a model in Mplus? What about Stata? Or did I misunderstand you?
Jon Heron posted on Monday, August 02, 2010 - 8:58 am
a3 is the time-varying covariate. This might be useful. I've fitted a growth model to 3 measures of weight (wt1-wt3) with variable times of measurement. Both these bits of syntax give the same result (wide and then long format). Should be simple to extend to multiple classes.
VARIABLE: names = wt1 wt2 wt3 agewks1 agewks2 agewks3;
          missing are all (-9999);
          tscores = agewks1 agewks2 agewks3;
ANALYSIS: proc = 2;
          type = random;
MODEL:    i s | wt1 wt2 wt3 at agewks1 agewks2 agewks3;
          i s;
          i with s;
          wt1 wt2 wt3 (equalvar);

VARIABLE: names = id wt agewks;
          missing are all (-9999);
          cluster = id;
          within = agewks;
ANALYSIS: type = twolevel random;
          algorithm = integration;
          integration = 100;
          miterations = 1000;
MODEL:    %within%
          s | wt on agewks;
          %between%
          wt with s;
model: i s | dia0 dia1 dia2 at zeit0 zeit1 zeit2; i s ; i with s ; dia0 dia1 dia2 (equalvar) ;
...I get an error message saying that "proc" is unknown. Furthermore, how can I extend this to classes (type=mixture?)? Is (equalvar) part of the model? I didn't find any explanation of it.
Jon Heron posted on Tuesday, August 03, 2010 - 11:32 am
1] Proc problem. Have you got a really old version of Mplus - v4 perhaps??
2] Equalvar. Anything in round brackets implies a constraint. This is simply saying that the residual variances are constrained to be equal across time. It is not necessary for the model, but it is needed to obtain the same answer as the long-format version.
3] Add "type = mixture;" to the analysis, "classes = c(2);" to the variable section and "%overall%" immediately after "model:"
Thank you, Jon! I hope it's okay that I ask all these things, but I've only just begun to work with Mplus.
I followed all your suggestions but two errors occurred:
1. tscores: are these relevant for the model? It seems tscores are not compatible with mixture models.
2. The known error message: Categorical variable DIA1 contains less than 2 categories. ...although I double-checked it!
Jon Heron posted on Thursday, August 05, 2010 - 10:54 am
1] Yes, you need them to allow for varying times of observation if you're modelling in wide format. Sounds like long format is the only option then.
2] Has your dataset been read in correctly? If you fit a very simple model e.g.
Oh, I mixed up dia2 with time0 (which is always 0), so Mplus didn't find two categories. It's corrected now.
When I specify "mixture" I get the message:
TSCORES option is only available with TYPE = RANDOM.
Otherwise (with "random") I get:
*** WARNING in Variable command
  CLASSES option is only available with TYPE=MIXTURE. CLASSES option will be ignored.
*** ERROR in Model command
  Unknown variables: %OVERALL% in line: %OVERALL%
Should I try your suggested long-format syntax?
Jason Bond posted on Saturday, December 06, 2014 - 5:59 pm
I intend to do age-based (instead of wave-based) analyses using the NLSY dataset. Do all age-based analyses require the use of t-scores? That is, year of interview may not be exactly linearly related to age, but if one simply uses age at first interview and assumes that age increases exactly linearly with year of interview, is there a way to do age-based analyses without the TSCORES option? Thanks,
If you have the same distance in time between the repeated measurements for all subjects there is no need for t scores.
Jason Bond posted on Tuesday, December 16, 2014 - 7:28 pm
I was hoping this was the case. Then how would one implement age-based analyses? My guess is that fixing time to specific values (e.g., @0 for the first wave, etc.) would not be correct? Thanks again...
Jason Bond posted on Wednesday, January 21, 2015 - 9:01 pm
Thanks for this. Beyond the question of cohort effects, however, the endpoint of our analyses is to estimate LCGM/GMM. Assuming no cohort effects are found, is there a way to set up such mixture models for age-based analyses without TSCORES, assuming time between waves is constant across individuals? Thanks again...
The approach of UG ex 6.18 can be used in a mixture setting as well. Group (cohort) becomes Knownclass.
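As a rough sketch of the data layout behind a multiple-cohort (age-based) setup: when waves are equally spaced for everyone, each person's wave-ordered measurements can be placed into age-indexed columns, with ages at which a person was never observed left missing. The function `to_age_columns`, the two-year spacing and the values are assumptions for illustration. This is not the UG ex 6.18 input itself.

```python
def to_age_columns(age_at_first_wave, outcomes, spacing, ages):
    """Map wave-ordered outcomes onto a fixed grid of ages; None marks unobserved ages."""
    by_age = {age: None for age in ages}
    for wave, value in enumerate(outcomes):
        age = age_at_first_wave + wave * spacing
        if age in by_age:
            by_age[age] = value
    return [by_age[age] for age in ages]

# Hypothetical age grid and two people from different cohorts.
age_grid = [21, 23, 25, 27, 29]
person_a = to_age_columns(21, [3, 4, 6], spacing=2, ages=age_grid)
person_b = to_age_columns(25, [5, 5, 7], spacing=2, ages=age_grid)
print(person_a)
print(person_b)
```

With the data arranged like this, the time scores attach to the age columns rather than to the waves, and cohort is defined by age at first wave.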
Jason Bond posted on Tuesday, February 10, 2015 - 6:57 pm
Related... I'm analyzing NLSY data from 11 waves. When using TSCORES in a random slope model, say, the error message I get is:
*** ERROR in MODEL command The number of fixed time scores is not sufficient for model identification in the following growth process: I S
from the corresponding analysis syntax:
i s | f3_82 f3_83 f3_84 f3_88 f3_89 f3_94 f3_02 f3_06 f3_08 f3_10 f3_12 AT t_82 t_83 t_84 t_88 t_89 t_94 t_02 t_06 t_08 t_10 t_12 ;
Although I'm guessing it doesn't matter, age has been 'centered' around 21 (and the data truncated so that only data from ages 21-51 are analyzed) so that 'times' of measurement indicate how much older than 21 the respondent was at the corresponding wave. Thanks for any input...
Jason Bond posted on Tuesday, February 10, 2015 - 7:12 pm
Related to the above mentioned NLSY data (11 waves and over 10k cases), I'm trying to estimate a single class cubic growth model with random growth coefficients and no other covariates. Quadratic models seem to converge fine but cubic (and higher order ones) give me:
THE MODEL ESTIMATION DID NOT TERMINATE NORMALLY DUE TO AN ILL-CONDITIONED FISHER INFORMATION MATRIX. CHANGE YOUR MODEL AND/OR STARTING VALUES.
THE MODEL ESTIMATION DID NOT TERMINATE NORMALLY DUE TO A NON-POSITIVE DEFINITE FISHER INFORMATION MATRIX. THIS MAY BE DUE TO THE STARTING VALUES BUT MAY ALSO BE AN INDICATION OF MODEL NONIDENTIFICATION. THE CONDITION NUMBER IS 0.204D-10.
THE STANDARD ERRORS OF THE MODEL PARAMETER ESTIMATES COULD NOT BE COMPUTED. THIS IS OFTEN DUE TO THE STARTING VALUES BUT MAY ALSO BE AN INDICATION OF MODEL NONIDENTIFICATION. CHANGE YOUR MODEL AND/OR STARTING VALUES. PROBLEM INVOLVING PARAMETER 1.
I've tried increasing the number of starts to no avail, and the same LL seems to be reached for a number of starts. With so many time points available per respondent, I would assume that such a model would be estimable. Might you have any advice for fitting such a model (other than the typical fixing of parameters)? Thanks much again,
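One thing that may be worth checking, offered as a guess rather than a diagnosis: with time running from 0 to 30 after centering age at 21, the cubic time scores become very large relative to the linear ones. This is a common source of numerical trouble in polynomial growth models. The sketch below only compares the size of the polynomial terms before and after dividing time by 10; the divisor and the integer time grid are arbitrary illustrative choices.

```python
def polynomial_terms(t, degree=3):
    """Return [1, t, t^2, ..., t^degree] for a single time score."""
    return [t ** p for p in range(degree + 1)]

# Hypothetical time scores: years past age 21, covering ages 21 to 51.
raw_times = list(range(0, 31))
scaled_times = [t / 10 for t in raw_times]

# Largest value of each polynomial term (constant, linear, quadratic, cubic).
max_raw = [max(col) for col in zip(*(polynomial_terms(t) for t in raw_times))]
max_scaled = [max(col) for col in zip(*(polynomial_terms(t) for t in scaled_times))]

print(max_raw)
print(max_scaled)
```

On the raw scale the cubic term reaches 27000 while the linear term reaches 30, so the corresponding growth factor variances differ by many orders of magnitude. Measuring time in decades keeps all four terms within a factor of 27 of each other.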