Mplus Discussion >> Model fit

Model fit

Mplus Discussion > Growth Modeling of Longitudinal Data >

russell Ecob posted on Wednesday, August 29, 2007 - 4:21 am

I am running a series of growth models with an ordered categorical outcome variable (5 categories) with 5 waves of data. I fit a series of models with increasing numbers of explanatory variables, one of which is a factor measured by two components. I am fitting models with up to quadratic growth parameters (i,s,q). Some of the explanatory variables have quadratic terms as well as linear, some linear only in relation to each of these growth parameters.

Adding extra explanatory variables results in models whose fit indices (especially CFI, TLI, WRMR) are substantially worse than the recommendations in Yu and Muthen (2001) (> 0.95, > 0.95, < 0.06, and < 0.90 for TLI, CFI, RMSEA, and WRMR, respectively).

Across my models, the worst value of WRMR is 7.055 and the worst value of CFI is 0.588.

All these models converge without errors or warnings. I find few differences in fit by altering the pattern of relationship of the explanatory variables to the growth terms. The models all give interpretable relationships of the factor to the outcome over time.

Do you have any recommendations concerning reporting of such results (including which fit indices to report) or in improving the models? Is the Yu and Muthen (2001) paper accessible anywhere?

Linda K. Muthen posted on Wednesday, August 29, 2007 - 8:53 am

If you add covariates and model fit becomes worse, it may mean that you need direct effects between the covariates and the outcome. You would look at modification indices.

The Yu and Muthen paper is not available but the Yu dissertation is on the website.

Hee-Jin Jun posted on Monday, September 03, 2007 - 10:09 am

Hi,

I am running a trajectory analysis with a binary outcome. The sample size is 6703 and there are 18 time points with no missing values. In the output of the 2-class model, I found this note:

Chi-Square Test of Model Fit for the Binary and Ordered Categorical (Ordinal) Outcomes**

** Of the 262144 cells in the latent class indicator table, 140 were deleted in the calculation of chi-square due to extreme values.

I am not sure what this means. Where does the figure of 262144 cells come from? Could you tell me where I can find a reference for this type of problem? Also, is it possible to compare these results with the 3-class model, which reported "144 were deleted"?

Thanks.

Hee-Jin

Linda K. Muthen posted on Monday, September 03, 2007 - 10:25 am

The number of cells comes from the multiway frequency table created from the categorical dependent variables in the analysis. For example, if you had 4 binary dependent variables, the table would have 16 cells (2*2*2*2). The chi-square test compares the observed frequency table to the model estimated frequency table. With more than 8 categorical dependent variables, this test is not useful.
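The cell count described above is just the product of the number of categories of each categorical dependent variable. A minimal sketch of that arithmetic follows; the function name `n_cells` is illustrative and is not part of Mplus.

```python
from math import prod

def n_cells(categories_per_variable):
    """Number of cells in the multiway frequency table:
    the product of the category counts of all categorical outcomes."""
    return prod(categories_per_variable)

# 4 binary outcomes, as in the example above
print(n_cells([2] * 4))    # 16

# 18 binary outcomes (one per time point), as in the question
print(n_cells([2] * 18))   # 262144
```

With 6703 observations spread over 262144 possible response patterns, most cells are necessarily empty, which is consistent with the advice that the test is not useful in this setting.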

Hee-Jin Jun posted on Monday, September 03, 2007 - 10:48 am

Dear Linda,

Thank you very much for your help.
Is there any way that I can compare the models to choose the most appropriate number of classes?

Thanks.

Hee-Jin

Linda K. Muthen posted on Monday, September 03, 2007 - 10:56 am

You can compare nested models by using -2 times the loglikelihood difference.
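The -2 times loglikelihood difference mentioned above can be sketched as follows. This is a hedged illustration, not Mplus code: the loglikelihoods and parameter counts are hypothetical, the helper names `chi2_sf` and `lr_test` are made up for this example, and the chi-square reference distribution is assumed to apply (plain ML, regular nested models). That assumption is generally not considered to hold when the two models differ in the number of latent classes.

```python
from math import erfc, exp, pi, sqrt

def chi2_sf(x, df):
    """Upper-tail probability of a chi-square variate with integer df,
    using the closed-form finite series (no external libraries)."""
    if x <= 0:
        return 1.0
    if df % 2 == 0:
        # Even df: exp(-x/2) * sum_{k=0}^{df/2-1} (x/2)^k / k!
        term, total = 1.0, 1.0
        for r in range(1, df // 2):
            term *= x / (2 * r)
            total += term
        return exp(-x / 2) * total
    # Odd df: normal tail plus a finite correction series
    chi = sqrt(x)
    term, total = chi, 0.0
    for r in range(1, (df - 1) // 2 + 1):
        if r > 1:
            term *= x / (2 * r - 1)
        total += term
    return erfc(chi / sqrt(2)) + sqrt(2 / pi) * exp(-x / 2) * total

def lr_test(ll_nested, p_nested, ll_general, p_general):
    """Likelihood ratio test: -2 * (LL_nested - LL_general)."""
    df = p_general - p_nested
    if df <= 0:
        raise ValueError("the general model must have more free parameters")
    stat = -2.0 * (ll_nested - ll_general)
    return stat, df, chi2_sf(stat, df)

# Hypothetical values: a nested model with 9 parameters against
# a more general model with 13 parameters.
stat, df, p = lr_test(-2500.0, 9, -2492.0, 13)
print(f"LR = {stat:.3f}, df = {df}, p = {p:.4f}")
```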

Sandra Acosta posted on Thursday, August 07, 2008 - 12:12 am

I understood that when comparing models, the model with the smaller AIC indicates a better global fit. On the SEMNET discussion board, it was pointed out that in some programs the opposite is true. In Mplus, when comparing two models using AIC, does the higher or lower index value represent a better global fit?

Linda K. Muthen posted on Thursday, August 07, 2008 - 9:44 am

You look for the lowest value, as with BIC.
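To make the "lowest is best" rule concrete, here is a small sketch using the standard definitions of AIC, BIC, and the sample-size adjusted BIC (the latter with n* = (n + 2) / 24). The two candidate models, their loglikelihoods, parameter counts, and the sample size are all hypothetical.

```python
from math import log

def aic(loglik, n_params):
    return -2.0 * loglik + 2.0 * n_params

def bic(loglik, n_params, n):
    return -2.0 * loglik + n_params * log(n)

def sabic(loglik, n_params, n):
    # Sample-size adjusted BIC: n is replaced by n* = (n + 2) / 24
    return -2.0 * loglik + n_params * log((n + 2) / 24.0)

# Hypothetical candidate models fitted to the same data (n = 500):
# name -> (loglikelihood, number of free parameters)
n = 500
models = {
    "linear_4class": (-5210.4, 13),
    "quadratic_3class": (-5198.7, 15),
}

for name, (ll, k) in models.items():
    print(f"{name}: AIC={aic(ll, k):.3f} BIC={bic(ll, k, n):.3f} "
          f"SABIC={sabic(ll, k, n):.3f}")

# Lower is better for all three criteria.
best = min(models, key=lambda m: bic(models[m][0], models[m][1], n))
print("lowest BIC:", best)
```

Because these criteria do not require the models to be nested, the same comparison applies to models that differ in both growth shape and number of classes, provided they are fitted to the same data.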

Tracie B posted on Friday, May 15, 2009 - 6:51 pm

Hello,
In GMM, is it appropriate to use the usual indicators of model fit (adjusted BIC, the likelihood ratio tests, etc.) to compare an unconditional linear growth model with (say) 4 classes to an unconditional quadratic growth model with (say) 3 classes? I have determined that the 4-group model is the best model if I stick to linear growth. Now I'd like to compare it with the 2- to 7-group solutions that allow higher-order growth.
Many thanks,
Tracie

Linda K. Muthen posted on Saturday, May 16, 2009 - 8:31 am

You can look at the BIC and loglikelihood values but difference testing using the loglikelihood would not be appropriate. See the Nylund et al. paper on the website under Latent Transition Analysis.

Tracie B posted on Saturday, May 16, 2009 - 9:53 pm

Thank you,
I've looked at the paper and realized I used misleading language with 'higher order'; I mean allowing quadratic or cubic growth. I think my question is actually much more basic:
I would like to compare the fit, using GMM (not LCA or LTA, as far as I understand it), of the following:

CLASSES=c(4)
ANALYSIS: TYPE=MIXTURE MISSING;
MODEL:
%OVERALL%
I S | (...etc 10 time points)

with
CLASSES=c(3)
ANALYSIS: TYPE=MIXTURE MISSING;
MODEL:
%OVERALL%
I S Q | (...etc 10 time points)

That is, linear growth with 4 classes vs. quadratic growth with 3 classes, for example; both are unconditional models. It is 'intuitive' to compare K vs. K+1 groups, but what about comparing model fit for 4 groups when only linear growth is permitted with 4 groups when quadratic growth is permitted, or even more extreme comparisons? Is it still appropriate to use the same usual indicators of model fit? Thanks again for this great site.

Linda K. Muthen posted on Sunday, May 17, 2009 - 10:46 am

I think the only way to compare those models is using BIC.

socrates posted on Wednesday, June 17, 2009 - 3:11 am

Dear Dr. Muthén

My GMM with starting values converged well. However, the highest log-likelihood (LL) was observed for the unperturbed starting values and was not replicated. Increasing STARTS and STITERATIONS improves the second-best LL, but it is still not as good as the best LL. When I run the GMM with the OPTSEED option and the seed of the second-best LL, it does not converge.

Do you have any suggestions for what else I could do to be sure that this is not a local maximum? Or can I be sure that I have found the optimal model given the circumstances mentioned above?

Thanks

Linda K. Muthen posted on Wednesday, June 17, 2009 - 9:13 am

It sounds like you are trying to get too much out of your data. You most likely need a more parsimonious model. For further information, send your output and license number to support@statmodel.com.

mihyun park posted on Wednesday, December 07, 2011 - 8:39 am

Dear Linda,

Analyses that I conducted separately for male and female adolescents indicated that an LGM with an intercept and linear slope factor provided the best fit to the data for male adolescents (Chi-Square Test of Model Fit value = 13.680, Degrees of Freedom = 3, P-Value = 0.0040, RMSEA = 0.047, CFI/TLI = 0.982/0.963).

The LGM results for female adolescents are as follows.

ESTIMATED SAMPLE STATISTICS

The estimated sample means for the four years are 2.144, 2.118, 2.281, and 2.393.

Quadratic LGM:
Chi-Square Test of Model Fit value = 9.655, Degrees of Freedom = 1, P-Value = 0.0019, RMSEA = 0.081, CFI/TLI = 0.994/0.983

Linear LGM:
Chi-Square Test of Model Fit value = 12.467, Degrees of Freedom = 3, P-Value = 0.0059, RMSEA = 0.049, CFI/TLI = 0.989/0.978

I think the model fit of the linear LGM was better than that of the quadratic LGM. However, the means decreased a little in the second year and increased again from the third year. In that case, which LGM is the best?

Linda K. Muthen posted on Thursday, December 08, 2011 - 9:46 am

I would check if the mean of the quadratic growth factor is significant. If not, I would use the linear model results.
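Checking whether the mean of the quadratic growth factor is significant amounts to a z-test of the estimate against its standard error, which is what the Est./S.E. and two-tailed p-value columns of the output report. The sketch below shows the computation; the estimate and standard error are hypothetical, and `wald_z` is an illustrative name.

```python
from math import erfc, sqrt

def wald_z(estimate, standard_error):
    """z statistic (estimate / SE) and its two-tailed normal p-value."""
    z = estimate / standard_error
    p_two_tailed = erfc(abs(z) / sqrt(2))
    return z, p_two_tailed

# Hypothetical mean of the quadratic growth factor and its standard error
z, p = wald_z(0.031, 0.012)
print(f"z = {z:.3f}, two-tailed p = {p:.4f}")

# Under the rule described above: keep the quadratic model if p < .05,
# otherwise report the linear model.
print("quadratic" if p < 0.05 else "linear")
```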

Mihyun Park posted on Thursday, January 05, 2012 - 9:22 am

Dear Linda,

I appreciate your help.
I added a fifth wave of data and ran the analysis for males again. I also fitted a spline LGM and compared the linear model with the spline model using a chi-square comparison. The spline model fit better.
I also compared the linear model with the quadratic model, and the quadratic model was better. The slope and quadratic means of the quadratic model were significant. The descriptive mean scores increased and then decreased at the last time point.
Should I then choose the quadratic LGM as the best trajectory model for my data?
I am not sure how to compare the spline model and the quadratic model and decide between them.

Linda K. Muthen posted on Friday, January 06, 2012 - 10:15 am

If you are using maximum likelihood, you can use BIC to compare the spline and quadratic models.

Mihyun Park posted on Thursday, January 12, 2012 - 10:52 am

Dear Linda,

Thank you for your advice. Thanks to you I have learned a lot and have done well.
I fitted parallel latent growth models separately for male and female adolescents. Next, I want to run a multiple-group LGM to examine path invariance, prediction rate and slope rate across gender.
As you know, a parallel latent growth model contains two growth models. For girls, both are linear (spline) growth models, but for boys one is a linear growth model and the other is a quadratic growth model. In other words, the configurations differ across gender. If the LGM configurations differ across gender in my data, can I still use a multiple-group LGM?

Linda K. Muthen posted on Thursday, January 12, 2012 - 12:06 pm

To use multiple group analysis with growth models, the same growth model must fit in both groups. If not, across group comparisons cannot be made.

Mihyun Park posted on Wednesday, January 18, 2012 - 11:44 am

Thank you very much for your kind help.

I ran a parallel latent growth model.

Model :
i1 s1 | su11@0 su22* su33* su44* su55@1;
i2 s2 | ax11@0 ax22* ax33* ax44* ax55@1;
ax11 with su11;
ax22 with su22;
ax33 with su33;
ax44 with su44;
ax55 with su55;
i2 on i1;
s2 on i1 s1;

Is this model possible?
Some researchers around me said that, because su11 and ax11, su22 and ax22, etc. were assessed at the same time points, the model (Model: s1 on i2; s2 on i1;) is possible, so the i2 on i1 and s2 on i1 s1 syntax is not possible. They say that syntax specifies a sequential latent growth model.
If I use the ON model, i1 and s1 (measured by su11, su22, su33, su44, and su55) should be measured before i2 and s2 (measured by ax11, ax22, ax33, ax44, and ax55).
However, I have read some articles that use ON in a parallel latent growth model with the same time points.
Which is right? Could you give me some advice?

Lisa M. Yarnell posted on Monday, February 06, 2012 - 8:55 am

Hello, in the output for my model with MLR estimation, I see the following Loglikelihoods and associated statistics:

Loglikelihood
H0 Value -84979.874
H0 Scaling Correction Factor for MLR 1.498

H1 Value -84286.966
H1 Scaling Correction Factor for MLR 1.454

Is the H0 model what we call the baseline model when using Chi-Square?

So if I want to compare the fit of two nested, structured models, I would utilize the H1 values from one model and the H1 values from the other model, right?

Thanks!

Linda K. Muthen posted on Monday, February 06, 2012 - 9:12 am

The H0 model is the analysis model, the model specified in the MODEL command. To compare two nested models, use the two H0 loglikelihoods.
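With MLR, the difference between the two H0 loglikelihoods is usually not referred to a chi-square distribution directly; the scaling correction factors printed next to each loglikelihood are used to form a scaled difference statistic. The sketch below follows the commonly documented formula, cd = (p0*c0 - p1*c1) / (p0 - p1) and TRd = -2*(L0 - L1) / cd, where model 0 is the nested model and model 1 the comparison model. All numbers are hypothetical, and `scaled_ll_difference` is an illustrative name.

```python
def scaled_ll_difference(ll0, c0, p0, ll1, c1, p1):
    """Scaled chi-square difference test from two H0 loglikelihoods.

    ll0, c0, p0: loglikelihood, scaling correction factor, and number of
                 free parameters of the nested (more restricted) model.
    ll1, c1, p1: the same quantities for the comparison model.
    Returns (cd, TRd, df).
    """
    if p1 <= p0:
        raise ValueError("the comparison model must have more parameters")
    cd = (p0 * c0 - p1 * c1) / (p0 - p1)   # difference test scaling correction
    trd = -2.0 * (ll0 - ll1) / cd          # scaled difference statistic
    return cd, trd, p1 - p0

# Hypothetical values for two nested models estimated with MLR
cd, trd, df = scaled_ll_difference(
    ll0=-2606.0, c0=1.450, p0=39,
    ll1=-2583.0, c1=1.546, p1=47,
)
print(f"cd = {cd:.3f}, TRd = {trd:.3f}, df = {df}")
```

Note that cd can turn out negative for some pairs of models, in which case TRd is negative and the test cannot be interpreted as a chi-square.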

Lisa M. Yarnell posted on Monday, February 06, 2012 - 10:15 am

OK, what is the H1 model, then, in a given single Mplus output? What does the H1 tell you?

Linda K. Muthen posted on Monday, February 06, 2012 - 1:41 pm

It is the unrestricted model of means, variances, and covariances which is used in the calculation of chi-square.

L. Siemons posted on Thursday, October 04, 2012 - 7:36 am

Hello,

I have some questions about the growth mixture modeling procedure I used.

1. I ran a linear and a quadratic model, each with 2 and with 3 classes. Mplus gives some information about the fit of the classes with the LO-MENDELL-RUBIN ADJUSTED LRT TEST. If I understand correctly, 3 classes fit better than 2 classes when this measure is significant in the 3-class run?
Is there also a fit statistic that shows whether the quadratic model fits better than the linear model? Or can I test this myself using some kind of log-likelihood ratio test? If so, which log-likelihood value from the output should I use?

2. I read something about removing non-significant quadratic terms but retaining the linear parameters (irrespective of significance). How can I determine whether my quadratic terms are significant? Can this information be found in the output?

3. How can I save the group membership of the persons? I tried to do this using SAVE: CPROBABILITIES, but it gives an error and doesn't run.

4. I learned from the literature that many fit indices can be examined (e.g. (sample-size adjusted) BIC, AIC, posterior probabilities, log-likelihood values, adjusted LRT). Which measures are best or sufficient to use? And which values should they take for the fit to be considered good?

I would greatly appreciate your help with answering these questions. Thank you very much!

Linda K. Muthen posted on Friday, October 05, 2012 - 11:20 am

1. With the Lo-Mendell-Rubin test, when the p-value exceeds .05 for let's say three classes, you choose the two-class solution.

2. You would look at the mean of the quadratic growth factor. If it is significant, you would choose the quadratic model.

3. Please send the output and your license number to support@statmodel.com.

4. BIC is good to use. See the Topic 5 course handout and video where this is discussed.

Lisa M. Yarnell posted on Monday, April 08, 2013 - 11:21 am

Hi Linda, I have the following select fit statistics for a model that I ran (see below). Why is the number of free parameters 12, but the degrees of freedom for the Chi-square 2?

Don't both refer to the model as a whole? Or does the Chi-square df not count certain estimated parameters, like variances?

Which should I rely on in reporting the degrees of freedom for the model? Thanks.

Number of Free Parameters 12

Loglikelihood
H0 Value -1053.922
H1 Value -1041.253

Information Criteria
Akaike (AIC) 2131.844
Bayesian (BIC) 2171.424
Sample-Size Adjusted BIC 2133.407

Chi-Square Test of Model Fit Value 25.337
Degrees of Freedom 2
P-Value 0.0000

Linda K. Muthen posted on Monday, April 08, 2013 - 1:39 pm

The number of free parameters is not equal to the degrees of freedom. The degrees of freedom is equal to the number of parameters in the H1 model minus the number of parameters in the H0 model.
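The relationship described above can be checked against the output quoted in the question. The sketch below assumes plain ML estimation (no scaling correction factor appears in that output), in which case the chi-square equals twice the difference between the H1 and H0 loglikelihoods. The H1 parameter count is inferred from the reported values, not read from the output.

```python
# Values copied from the output in the question above
ll_h0 = -1053.922          # H0 (analysis model) loglikelihood
ll_h1 = -1041.253          # H1 (unrestricted model) loglikelihood
free_params_h0 = 12        # "Number of Free Parameters"
chi_square_df = 2          # "Degrees of Freedom"

# Under ML, chi-square = 2 * (LL_H1 - LL_H0)
chi_square = 2.0 * (ll_h1 - ll_h0)

# df = (parameters in H1) - (parameters in H0), so H1 must have:
params_h1 = free_params_h0 + chi_square_df

# AIC uses the H0 loglikelihood and the number of free parameters
aic = -2.0 * ll_h0 + 2.0 * free_params_h0

print(f"chi-square = {chi_square:.3f}")   # output reports 25.337 (rounding)
print(f"H1 parameters = {params_h1}")
print(f"AIC = {aic:.3f}")                 # output reports 2131.844
```

An H1 count of 14 would be consistent with four observed variables (4 means plus 10 variances and covariances), although the number of variables is not stated in the post.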

Oleg Zaslavsky posted on Tuesday, June 11, 2013 - 2:07 am

Dear Mplus staff,
I am fitting an unconditional linear GCM with eight waves of data and ~63000 people.
Predictably, I have a large proportion of missing data, which generates this warning:
"THE COVARIANCE COVERAGE FALLS BELOW THE SPECIFIED LIMIT.
THE MISSING DATA EM ALGORITHM WILL NOT BE INITIATED."

However, model estimation converges normally (and quite fast) and generates a set of estimates.
On the other hand, I am having a hard time retrieving any of the model fit indices (i.e., CFI, TLI, WRMR, or even the chi-square).
Please advise me on what I should do in that regard.
Thanks for your time
Oleg

Linda K. Muthen posted on Tuesday, June 11, 2013 - 11:41 am

Please send the output and your license number to support@statmodel.com.

Zuduo Zheng posted on Monday, September 23, 2013 - 10:33 pm

Hello Dear Linda,

I'm new to LGM and to Mplus.

Based on your reply to one of the previous posts, my understanding is that the p value in the section of Chi-Square Test of Model Fit Value should be big (say, bigger than 0.05). However, I also noticed that in a couple of old posts (in this forum), the models discussed were with a p value less than 0.05. So, I'm writing to double check with you.

Below is what I got from my model. May I conclude that this model does NOT fit the data well because the p value in the section of Chi-Square Test of Model Fit Value is close to zero?

Any response from you will be appreciated.

Zuduo

MODEL FIT INFORMATION

Number of Free Parameters 24

Loglikelihood

H0 Value -90554.527
H1 Value -90513.727

Information Criteria

Akaike (AIC) 181157.053
Bayesian (BIC) 181314.910
Sample-Size Adjusted BIC 181238.645
(n* = (n + 2) / 24)

Chi-Square Test of Model Fit

Value 81.599
Degrees of Freedom 30
P-Value 0.0000

(the rest is omitted)

Linda K. Muthen posted on Tuesday, September 24, 2013 - 11:33 am

Please send the output and your license number to support@statmodel.com.

Amber Fahey posted on Thursday, June 07, 2018 - 7:12 pm

Hello, I am running a two-level model with random intercepts and random slopes. I am obtaining LLs almost twice as large in the conditional model as in the unconditional model. I cannot use the S-B deviance statistic to compare the models because I am receiving negative values. Based on the size of the LLs, it looks like the unconditional model is a better fit; however, the predictors in my conditional model explain 75% and 25% of the variability in my intercept and slope, respectively (using pseudo R-squared). Is there a way to reconcile this? It appears that I have entered good (theoretically driven) predictors, but I can't understand why the conditional model would appear to be worse.

Bengt O. Muthen posted on Friday, June 08, 2018 - 1:55 pm

When you say "LL's almost twice as large" are you sure you are taking into account that they have negative values? For instance, LL = -1000 is better (higher) than LL = -1500.
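Since loglikelihoods are typically negative, "higher" means closer to zero, and the corresponding deviance (-2 times the loglikelihood) is smaller. A minimal sketch of that comparison follows, with an illustrative function name and the round numbers used as an example; it is only meaningful when both models are fitted to the same observations and the same dependent variables.

```python
def compare_loglik(ll_a, ll_b):
    """Report which loglikelihood is higher (better) and the
    difference in deviance, where deviance = -2 * loglikelihood."""
    deviance_a, deviance_b = -2.0 * ll_a, -2.0 * ll_b
    if ll_a == ll_b:
        better = "tie"
    else:
        better = "a" if ll_a > ll_b else "b"
    return better, deviance_a - deviance_b

better, deviance_diff = compare_loglik(-1000.0, -1500.0)
print(better)          # "a": -1000 is higher than -1500
print(deviance_diff)   # negative: model a has the smaller deviance
```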

Amber Fahey posted on Friday, June 08, 2018 - 2:23 pm

My LL for the unconditional model is: -42777.07
For the conditional model the LL is:-8534.71

I have read that the larger the absolute value the better, but I've also read that the smaller (closer to zero) the better. Can you please clarify? Does it depend on the type of analysis, or is it computed differently here in Mplus?

Bengt O. Muthen posted on Friday, June 08, 2018 - 2:56 pm

It's strange that the unconditional model has such a much better LL. If you like, you can send your 2 outputs to Support along with your license number.

LS posted on Friday, October 11, 2019 - 6:06 am

Dear Drs. Muthen,
I have two questions with regard to model fit.

1) I am running conditional GMM models through the one-step approach.

I have 3 covariates; however, depending on how many covariates I add to the model, N drops due to missing values.

How can I compare the fit of conditional models if N changes? As far as I understand, BIC and -2LL can be used only if the models have the same N. I would like to know whether adding more covariates fails to improve model fit, so that I can retain the more parsimonious model.

2) How can I compare unconditional and conditional models (and therefore, the utility of adding covariates to the model) if N again is so different?

Thank you so much for your help,
LS

Bengt O. Muthen posted on Friday, October 11, 2019 - 2:36 pm

You can include all potential covariates and fix the slopes of those you don't want to include in a particular model. Then you use the same N and can compare BICs as well.