I am running a series of growth models with an ordered categorical outcome variable (5 categories) across 5 waves of data. I fit a series of models with increasing numbers of explanatory variables, one of which is a factor measured by two components. I am fitting models with up to quadratic growth parameters (i, s, q). Some of the explanatory variables have quadratic as well as linear terms, and some only linear terms, in relation to each of these growth parameters.
Adding extra explanatory variables results in models whose fit indices (especially CFI, TLI, and WRMR) are substantially worse than the cutoffs recommended by Yu and Muthen (2001) (TLI > 0.95, CFI > 0.95, RMSEA < 0.06, WRMR < 0.90).
Across my models, the worst WRMR value is 7.055 and the worst CFI value is 0.588.
All these models converge without errors or warnings. I find few differences in fit when altering the pattern of relationships between the explanatory variables and the growth terms. All the models give interpretable relationships of the factor to the outcome over time.
Do you have any recommendations concerning reporting of such results (including which fit indices to report) or in improving the models? Is the Yu and Muthen (2001) paper accessible anywhere?
If you add covariates and model fit becomes worse, it may mean that you need direct effects between the covariates and the outcome. You would look at modification indices.
The Yu and Muthen paper is not available but the Yu dissertation is on the website.
Hee-Jin Jun posted on Monday, September 03, 2007 - 10:09 am
I am running a trajectory analysis with a binary outcome. The sample size is 6703 and there are 18 time points with no missing values. In the output of the 2-class model, I found this note:
Chi-Square Test of Model Fit for the Binary and Ordered Categorical (Ordinal) Outcomes**
** Of the 262144 cells in the latent class indicator table, 140 were deleted in the calculation of chi-square due to extreme values.
I am not sure what this means. I don't know where the 262144 cells come from. Could you tell me where I can find a reference for this type of problem? Also, will it be possible to compare these results with the 3-class model, which said "144 were deleted"?
The number of cells comes from the multiway frequency table created from the categorical dependent variables in the analysis. For example, if you had 4 binary dependent variables, the table would have 16 cells (2*2*2*2). The chi-square test compares the observed frequency table to the model estimated frequency table. With more than 8 categorical dependent variables, this test is not useful.
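The cell count above can be verified directly: the table has one cell per possible response pattern, so the number of cells is the product of the category counts across the categorical dependent variables. A quick sketch in Python:

```python
# Number of cells in the multiway frequency table formed by the
# categorical dependent variables: the product of the category counts.

def table_cells(n_categories_per_var):
    cells = 1
    for k in n_categories_per_var:
        cells *= k
    return cells

print(table_cells([2] * 4))   # 4 binary outcomes -> 16 cells
print(table_cells([2] * 18))  # 18 binary outcomes -> 262144 cells
```

With 18 binary time points the table has 2^18 = 262144 cells, which matches the note in the output; most of those cells are empty or nearly empty, which is why the chi-square test is not useful with many categorical outcomes.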
Hee-Jin Jun posted on Monday, September 03, 2007 - 10:48 am
Thank you very much for your help. Is there any way that I can compare the models to choose the most appropriate number of classes?
I understood that when comparing models, the model with the smaller AIC indicates better global fit. On the SEMNET discussion board, it was pointed out that in some programs the opposite is true. In Mplus, when comparing two models using AIC, does the higher or lower index value represent better global fit?
Hello, in GMM, is it appropriate to use the usual indicators of model fit (adjusted BIC, the likelihood ratio tests, etc.) to compare an unconditional linear growth model with (say) 4 classes to an unconditional quadratic growth model with (say) 3 classes? I have determined that the 4-class model is the best model if I stick to linear growth; now I'd like to compare it with the 2- to 7-class solutions, but higher order. Many thanks, Tracie
You can look at the BIC and loglikelihood values but difference testing using the loglikelihood would not be appropriate. See the Nylund et al. paper on the website under Latent Transition Analysis.
Tracie B posted on Saturday, May 16, 2009 - 9:53 pm
Thank you. I've looked at the paper and realized I used misleading language with 'higher order'; I meant allowing quadratic or cubic growth. I think my question is actually much more basic: I would like to compare the fit, using GMM (not LCA or LTA, as far as I understand it), of the following:
CLASSES=c(4) ANALYSIS: TYPE=MIXTURE MISSING; MODEL: %OVERALL% I S | (...etc 10 time points)
with CLASSES=c(3) ANALYSIS: TYPE=MIXTURE MISSING; MODEL: %OVERALL% I S Q | (...etc 10 time points)
i.e., linear growth specifying 4 classes vs. quadratic growth specifying 3 classes, for example; both unconditional models. It is 'intuitive' to compare K vs. K+1 classes, but what about comparing model fit for 4 classes when only linear growth is permitted with 4 classes when quadratic growth is permitted... or even more extreme comparisons? Is it still appropriate to use the same usual indicators of model fit? Thanks again for this great site.
I think the only way to compare those models is using BIC.
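For reference, BIC penalizes the loglikelihood by the number of free parameters and the sample size, and in Mplus lower BIC indicates better fit. A small sketch with made-up loglikelihoods and parameter counts (the numbers are purely illustrative, not from any of the models discussed here):

```python
import math

def bic(loglik, n_params, n_obs):
    # BIC = -2*logL + p*ln(n); smaller values indicate better fit
    return -2.0 * loglik + n_params * math.log(n_obs)

n = 500
# Hypothetical results: linear 4-class vs. quadratic 3-class model
candidates = {
    "linear 4-class": bic(-3210.5, n_params=14, n_obs=n),
    "quadratic 3-class": bic(-3205.8, n_params=17, n_obs=n),
}
best = min(candidates, key=candidates.get)  # lower BIC wins
print(candidates)
print("preferred by BIC:", best)
```

Because BIC is not a likelihood ratio test, it can compare models that are not nested, such as a linear 4-class model against a quadratic 3-class model.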
socrates posted on Wednesday, June 17, 2009 - 3:11 am
Dear Dr. Muthén
My GMM with starting values converged well. However, the highest log-likelihood (LL) was observed for the unperturbed seed and was not replicated. Increasing STARTS and STITERATIONS improves the second-best LL, but it is still not as good as the best LL. When running the GMM with the OPTSEED option and the seed of the second-best LL, it does not converge.
Do you have any suggestions for what else I could do to be sure that there is no local maximum? Or can I be sure that I have found the optimal model given the circumstances described above?
It sounds like you are trying to get too much out of your data. You most likely need a more parsimonious model. For further information, send your output and license number to firstname.lastname@example.org.
mihyun park posted on Wednesday, December 07, 2011 - 8:39 am
Analyses I conducted separately for male and female adolescents indicated that an LGM with an intercept and a linear slope factor provided the best fit to the data for male adolescents (Chi-Square Test of Model Fit value = 13.680, degrees of freedom = 3, p-value = 0.0040, RMSEA = 0.047, CFI/TLI = 0.982/0.963).
By the way, the LGM results for female adolescents are as follows.
ESTIMATED SAMPLE STATISTICS: the estimated mean scores are 2.144, 2.118, 2.281, and 2.393 for the four years.
For the quadratic LGM: Chi-Square Test of Model Fit value = 9.655, degrees of freedom = 1, p-value = 0.0019, RMSEA = 0.081, CFI/TLI = 0.994/0.983.
For the linear LGM: Chi-Square Test of Model Fit value = 12.467, degrees of freedom = 3, p-value = 0.0059, RMSEA = 0.049, CFI/TLI = 0.989/0.978.
I think the model fit of the linear LGM was better than that of the quadratic LGM. But the means decreased a little in the second year and increased again from the third year. In that case, which LGM is best?
I would check if the mean of the quadratic growth factor is significant. If not, I would use the linear model results.
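In the Mplus output this check uses the Est./S.E. column, which is an approximate z statistic. As an illustrative sketch with hypothetical numbers (the estimate and standard error below are made up):

```python
import math

def wald_z(estimate, std_error):
    """Return the z ratio (Est./S.E.) and a two-sided normal p-value."""
    z = estimate / std_error
    # Two-sided p-value from the standard normal CDF via math.erf
    p = 2.0 * (1.0 - 0.5 * (1.0 + math.erf(abs(z) / math.sqrt(2.0))))
    return z, p

# Hypothetical quadratic growth factor mean and standard error
z, p = wald_z(estimate=-0.032, std_error=0.021)
print(round(z, 2), p < 0.05)  # not significant at the 5% level
```

If the quadratic factor mean is not significant by this criterion (|z| < 1.96 at the 5% level), the simpler linear model is preferred.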
Mihyun Park posted on Thursday, January 05, 2012 - 9:22 am
I appreciate your help. Thanks to you I have learned a lot and have done well. I added a 5th wave of data and ran the analysis for males again. I also tried a spline LGM and compared the linear model with the spline model (chi-square difference test). The spline model's fit was better. I also compared the linear model with the quadratic model; the quadratic model was better, and the slope and quadratic means of the quadratic model were significant. Descriptive mean scores increased and then decreased at the last time point. Should I then choose the quadratic LGM as the best trajectory model for my data? I am not sure how to compare the spline and quadratic models and decide between them.
If you are using maximum likelihood, you can use BIC to compare the spline and quadratic models.
Mihyun Park posted on Thursday, January 12, 2012 - 10:52 am
Thank you for your advice. Thanks to you I have learned a lot and have done well. I analyzed a parallel process latent growth model separately for male and female adolescents. Next, I want to conduct a multiple group LGM to examine the path invariance, prediction rates, and slope rates across gender. As you know, the parallel process latent growth model contains two growth models. For girls, both are linear (spline) growth models, but for boys one is a linear growth model and the other is a quadratic growth model. In other words, the configurations differ across gender. If my data have different LGM configurations across gender, can I use multiple group LGM?
To use multiple group analysis with growth models, the same growth model must fit in both groups. If not, across group comparisons cannot be made.
Mihyun Park posted on Wednesday, January 18, 2012 - 11:44 am
Thank you very much for your kind help.
I ran a parallel process latent growth model.
Model:
i1 s1 | su11@0 su22* su33* su44* su55@1;
i2 s2 | ax11@0 ax22* ax33* ax44* ax55@1;
ax11 with su11; ax22 with su22; ax33 with su33; ax44 with su44; ax55 with su55;
i2 on i1;
s2 on i1 s1;
Is this model possible? Some researchers around me said that, because su11 and ax11, su22 and ax22, etc. were assessed at the same time points, only a model such as "Model: s1 on i2; s2 on i1;" is possible, so the "i2 on i1" and "s2 on i1 s1" statements would not be appropriate; that syntax would describe a sequential latent growth model. On that view, if I use ON this way, the i1 and s1 process (measured by su11 through su55) should be measured before the i2 and s2 process (ax11 through ax55). But I have read articles that use ON in parallel process latent growth models with the same time points. Which is right? Would you give me some advice?
It is the unrestricted model of means, variances, and covariances which is used in the calculation of chi-square.
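As an illustrative sketch (the free-parameter count below is hypothetical), the chi-square degrees of freedom follow from comparing the fitted model against that unrestricted H1 model of means, variances, and covariances:

```python
# Chi-square df = (number of parameters in the unrestricted H1 model
# of means/variances/covariances) - (number of free parameters in the
# fitted growth model H0).

def h1_parameters(n_observed):
    means = n_observed
    variances_and_covariances = n_observed * (n_observed + 1) // 2
    return means + variances_and_covariances

def chi_square_df(n_observed, n_free_parameters):
    return h1_parameters(n_observed) - n_free_parameters

print(h1_parameters(4))     # 4 repeated measures -> 14 sample statistics
print(chi_square_df(4, 9))  # e.g. a growth model with 9 free parameters -> df = 5
```

The exact df in any given output depends on which parameters the fitted model frees, so the counts here are only an example of the bookkeeping.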
L. Siemons posted on Thursday, October 04, 2012 - 7:36 am
I have some questions about the growth mixture modeling procedure I used.
1. I ran a linear and a quadratic model, both with 2 and with 3 classes. Mplus gives some information about the fit of the classes with the LO-MENDELL-RUBIN ADJUSTED LRT TEST. If I understand correctly, 3 classes fit better than 2 classes when this test is significant in the 3-class run? Is there also a fit statistic that shows me whether the quadratic model fits better than the linear model? Or can I test this myself using some kind of log-likelihood ratio test? If so, which log-likelihood value should I use from the output?
2. I read something about removing non-significant quadratic terms, but retaining the linear parameters (irrespective of significance). How can I determine whether my quadratic terms are significant? Can this information be found in the output?
3. How can I save the group membership of the persons? I did try to do this using SAVE: CPROBABILITIES, but it gives an error and doesn’t run.
4. I learned from the literature that many fit indices can be examined (e.g., (sample-size adjusted) BIC, AIC, posterior probabilities, log-likelihood values, adjusted LRT). Which measures are best/sufficient to use? And which values must they take in order to indicate good fit?
I would greatly appreciate your help with answering these questions. Thank you very much!
Dear Mplus staff, I am fitting an unconditional linear GCM with eight waves of data and ~63000 people. Predictably, I have a large proportion of missing data, which generates the warning: "THE COVARIANCE COVERAGE FALLS BELOW THE SPECIFIED LIMIT. THE MISSING DATA EM ALGORITHM WILL NOT BE INITIATED."
However, model estimation converges normally (and quite fast) and generates a set of estimates. On the other hand, I am having a hard time retrieving any of the model fit indices (i.e., CFI, TLI, WRMR, or even chi-square). Please advise me what I should do in that regard. Thanks for your time, Oleg
Zuduo Zheng posted on Monday, September 23, 2013 - 10:33 pm
Hello Dear Linda,
I'm new to LGM and to Mplus.
Based on your reply to one of the previous posts, my understanding is that the p-value in the Chi-Square Test of Model Fit section should be large (say, bigger than 0.05). However, I also noticed that in a couple of old posts in this forum, the models discussed had p-values less than 0.05. So I'm writing to double-check with you.
Below is what I got from my model. May I conclude that this model does NOT fit the data well because the p-value in the Chi-Square Test of Model Fit section is close to zero?
Amber Fahey posted on Thursday, June 07, 2018 - 7:12 pm
Hello, I am running a two-level model with random intercepts and random slopes. I am obtaining LLs almost twice as large in the conditional model as in the unconditional model. I cannot use the S-B deviance statistic to compare models because I am receiving negative values. Based on the size of the LLs it looks like the unconditional model is a better fit; however, my predictors in the conditional model explain 75% and 25% of the variability in my intercept and slope, respectively (using pseudo R-squared). Is there a way to reconcile this? It appears that I have entered good predictors (theoretically driven), but I can't understand why the conditional model would appear worse.
My LL for the unconditional model is -42777.07; for the conditional model the LL is -8534.71.
I have read that the larger the absolute value, the better, but I've also read that the smaller (closer to zero), the better. Can you please clarify? Does it depend on the type of analysis, or is it computed differently in Mplus?