Model identification

Mplus Discussion > Growth Modeling of Longitudinal Data >

David Myers posted on Tuesday, May 16, 2000 - 7:32 am

In a recent conversation with Linda, we were alerted to a potential problem with empirical identification (e.g., the data could not support the model and we were finding what probably were local minima). We have adjusted our model somewhat (a growth model and the reason I am writing in this section of the board) and find we still cannot obtain convergence when we use the program's default start values. After some work with adjusting start values, we are able to obtain convergence. As a check, we have been changing the start values (all except one set at the final solution and making one start value equal to the final solution plus a one-half standard deviation adjustment). Is this sufficient to check for local minima or should we do something more drastic, such as set new start values for several parameters simultaneously with the above approach?

Linda K. Muthen posted on Wednesday, May 17, 2000 - 8:15 am

I would suggest changing the centering to see if you also get the same chi-square. This in effect changes the starting values. Perhaps just change the centering to the second time point to not make such a big change as when you center at the average time point.
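To make the centering suggestion concrete, here is a minimal sketch in Python (not Mplus syntax). It assumes four equally spaced occasions with time scores 0, 1, 2, 3, which the thread does not state, and `recenter` and `recenter_at_mean` are hypothetical helper names. Shifting the time scores moves the point at which the intercept factor is defined but leaves the spacing between occasions unchanged.

```python
def recenter(time_scores, center_index):
    """Shift time scores so the occasion at center_index gets score 0."""
    origin = time_scores[center_index]
    return [t - origin for t in time_scores]


def recenter_at_mean(time_scores):
    """Shift time scores so that their average is 0."""
    mean = sum(time_scores) / len(time_scores)
    return [t - mean for t in time_scores]


# Assumed example: four equally spaced occasions (not taken from the thread).
scores = [0, 1, 2, 3]
print(recenter(scores, 1))       # centered at the second time point
print(recenter_at_mean(scores))  # centered at the average time point
```

Centering at the second time point moves the origin by one unit, while centering at the average moves it by one and a half units here, which is why the former is the smaller change.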

Lyndon Brooks posted on Monday, August 12, 2002 - 9:43 pm

I'm new to Mplus and LGM. I've fit a simple LGM to 12 time points (n=41). This worked fine (if the fit could be better) while I didn't ask for the latent variable means [level trend], but no SEs were calculated when I did because 'the model may not be identified'. If non-identification is indeed the problem, how might the latent variable mean estimates be involved? Otherwise, things to try?

Linda K. Muthen posted on Tuesday, August 13, 2002 - 9:14 am

I would need to see your output to answer your question. Please send it to support@statmodel.com. If you have not asked for TECH1 in the OUTPUT command, please add that and rerun it before you send it to me.

Hanno Petras posted on Thursday, March 06, 2003 - 6:24 am

Dear Linda and Bengt,

yesterday, a collaborator approached me with a problem which I thought was quite puzzling: He had analyzed the growth in depression with three time points. A linear growth model over three time points without covariates has nine free parameters in the unrestricted H1 model (three means and six variances-covariances). For the one-class H0 growth model, 8 parameters are required (two means, 2 variances, 1 covariance, 3 residual variances). This leaves one degree of freedom and the model is identified. Surprisingly enough, he was able to run models with as many as five classes, which all converged perfectly (with decreasing BIC and fantastic entropy). The question is: How is that possible? I would appreciate any comments.
On a second note, what do you think of using covariates to increase the number of df in order to fit a more complex model and thereby offset the restrictions due to the smaller number of available time points?

Best,

Hanno

Linda K. Muthen posted on Thursday, March 06, 2003 - 6:40 am

The way you are thinking about degrees of freedom does not apply to mixture models where the information does not come only from the covariance matrix and mean vector but also from the raw data. See, for example, the Fisher iris data mixture example on the Mplus website.

Along this line, using covariates to increase the degrees of freedom is not necessary. Covariates in the model can help in finding classes however.
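For reference, the parameter counting in the question applies to the one-class growth model only. The following Python sketch reproduces that arithmetic; the function names are hypothetical, and the H0 count assumes a linear model with free growth factor means, variances, covariance, and one residual variance per time point, as listed in the question.

```python
def h1_parameters(n_timepoints):
    """Free parameters of the unrestricted H1 model: means plus variances/covariances."""
    p = n_timepoints
    return p + p * (p + 1) // 2


def linear_growth_parameters(n_timepoints):
    """Free parameters of a one-class linear growth model (assumed parameterization):
    2 growth factor means, 2 variances, 1 covariance, and one residual variance
    per time point."""
    return 2 + 2 + 1 + n_timepoints


def growth_df(n_timepoints):
    """Degrees of freedom of the one-class linear growth model."""
    return h1_parameters(n_timepoints) - linear_growth_parameters(n_timepoints)


for p in (3, 4, 5):
    print(p, h1_parameters(p), linear_growth_parameters(p), growth_df(p))
```

With three time points this gives 9 - 8 = 1 degree of freedom, matching the question. The same count cannot be carried over to a mixture model, for the reason given in the reply above.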

Hanno Petras posted on Friday, March 07, 2003 - 5:47 am

Dear Linda,

thank you for your response. I was wondering about the formula commonly used to determine the df in a growth mixture model. I am not quite sure that I understand when you say that the information comes not only from the covariance matrix and mean vector but also from the raw data. Could you elaborate on that? Finally, if the above is true, what is the basis for the advice to have at least four time points for a growth mixture model? Thank you for your comments.

Best,

Hanno

Linda K. Muthen posted on Friday, March 07, 2003 - 8:05 am

There is no formula for determining the degrees of freedom in a growth mixture model. With growth mixture models, the model is not fit to the mean vector and covariance matrix. These are not sufficient statistics for this type of model. There is no unrestricted model as there is in a random coefficient growth model. Therefore, no chi-square and degrees of freedom can be estimated. You can think of it like LCA and LPA. With LCA, there is an unrestricted model. It is the contingency table of the latent class indicators. Therefore, degrees of freedom and a chi-square can be estimated. In LPA, this is not the case because there is no unrestricted model.

The advice about four time points is for a growth model not a growth mixture model. Four time points are not required but are recommended to give modeling flexibility. This same recommendation applies to each class of a growth mixture model.

Hanno Petras posted on Tuesday, March 11, 2003 - 7:55 am

Dear Linda,

thank you for your feedback. Could you elaborate on the fact that the Growth Curve Models are not fit to the mean vector and covariance matrix or maybe you can suggest a recent publication? Thank you.

Best,

Hanno

Linda K. Muthen posted on Tuesday, March 11, 2003 - 8:04 am

Growth curve models are fit to mean and covariance matrices. Growth mixture models are not. Mixture models do not work with a normality assumption for the observed variables and therefore mean and covariance matrices are not sufficient statistics for the estimation of such models. Raw data are needed. Identification in mixture models is discussed to some extent in Muthen and Shedden (1999).
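The point about sufficient statistics can be illustrated with a small sketch in Python. The two toy data sets and the mixture parameters below are made up for illustration and do not come from the thread. Both data sets have the same mean and variance, so a single normal distribution assigns them the same loglikelihood, but a two-class normal mixture does not.

```python
import math


def normal_pdf(x, mean, sd):
    """Density of a normal distribution at x."""
    z = (x - mean) / sd
    return math.exp(-0.5 * z * z) / (sd * math.sqrt(2.0 * math.pi))


def loglik_normal(data, mean, sd):
    """Loglikelihood of the data under a single normal distribution."""
    return sum(math.log(normal_pdf(x, mean, sd)) for x in data)


def loglik_mixture(data, weights, means, sds):
    """Loglikelihood of the data under a finite mixture of normals."""
    total = 0.0
    for x in data:
        density = sum(w * normal_pdf(x, m, s)
                      for w, m, s in zip(weights, means, sds))
        total += math.log(density)
    return total


def mean_and_variance(data):
    """Sample mean and (maximum-likelihood) variance."""
    n = len(data)
    m = sum(data) / n
    return m, sum((x - m) ** 2 for x in data) / n


# Two made-up data sets with identical mean (0) and variance (1).
data_a = [-1.0, -1.0, 1.0, 1.0]
data_b = [-math.sqrt(2.0), 0.0, 0.0, math.sqrt(2.0)]

# Hypothetical two-class mixture: equal weights, means -1 and 1, sd 0.5.
mix = dict(weights=[0.5, 0.5], means=[-1.0, 1.0], sds=[0.5, 0.5])

print(mean_and_variance(data_a), mean_and_variance(data_b))
print(loglik_normal(data_a, 0.0, 1.0), loglik_normal(data_b, 0.0, 1.0))
print(loglik_mixture(data_a, **mix), loglik_mixture(data_b, **mix))
```

Because the mixture loglikelihood depends on where the individual observations fall, the mean vector and covariance matrix alone cannot be used to fit or test such a model.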

Anonymous posted on Thursday, January 20, 2005 - 11:29 am

I'm running a growth curve model with 5 data points assessed at age 5, 7, 10, 14, and 17. One of the reviewers has suggested that we center at the last data point. I first ran the linear model (fixed the slope factor loadings at -12, -10, -7, -3 and 0) but it didn't fit very well, so I tried a linear spline model (fixed the last two time points and freed the last three). The model worked fine (significantly improved) but the direction of the slope changed from positive to negative in the linear spline model. From the average mean values over time, I think the slope should be positive... Do you have any idea what's happening here? Thank you so much.

BMuthen posted on Thursday, January 20, 2005 - 8:13 pm

Try fixing the first to minus one and the last to zero and have them free in between. The slope then refers to the change from the first to the last time point.

Anonymous posted on Friday, January 21, 2005 - 3:29 am

I am new to Mplus and do not know how to modify my model/analysis in order to sort out the problem of a non-positive definite matrix and achieve convergence in my growth mixture model. I tried modifying the starting values (running a 1-class model and using the values from there as starting values for my 3-class model) but it didn't work. I also tried increasing the STARTS values. In the output I got a message saying "problem involving parameter 13" [my model is: i BY s0@0 s2@2;
s q BY s2@2 s7@7 s9@9 s13@13; i s q c#1 c#2 on ag;]. Can you tell me which parameter is parameter 13?
Thanks!

Linda K. Muthen posted on Saturday, January 22, 2005 - 3:23 pm

TECH1 will tell you which parameter is number 13. However, your syntax does not look like a growth model. See Chapter 6 of the Mplus User's Guide for growth model examples. Also see Chapter 16 which describes the growth modeling language. If you continue to have problems, send your complete output and data to support@statmodel.com.

Mogens Fenger posted on Thursday, February 24, 2005 - 5:20 am

Hi M&M

I'm running some hefty growth modeling using this rather amazing Mplus 3.1. Quite often the message "... variance covariance ... not trustworthy ... Condition number xxx. Problem involving parameter 40" shows up, even with a terminated process with convergence, fits, etc.
Two questions: What is the optimal value for the condition number? I read about it in the manual, but couldn't find it again (by the way, in the sentence above D is used for marking the exponent, later on E is used). What do you actually do with this parameter 40?
I've got a third question: after a normally terminated process, a lot of warnings pop up, all of them telling you about problems with residual variances and the parameter in question. I don't quite understand the significance of this, as the process terminated normally, giving fits, entropy, mean, SE etc.

Best
Mogens

bmuthen posted on Saturday, February 26, 2005 - 5:38 pm

The condition number is computed in connection with the SEs of the model and relates to the identification of the model. The number is the ratio of the lowest to the highest eigenvalue of the information matrix. That statement may not be accessible in non-technical terms, but a value equal to or close to zero means that the model is not identified and therefore has problems. "Close to zero" is about 10 to the power of -10 in Mplus. You don't want small values that go below that. When it refers to parameter 40, TECH1 will tell you which one that is and the task is then to figure out why the model isn't identified when this parameter is included in the model. If you don't see the problem, send your output to support@statmodel.com.

Warnings about residual variances have to do with negative variances or correlations that are not in the -1, 1 range. If you don't see the problem, again send to support.
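As a rough illustration of the eigenvalue ratio described above, here is a sketch in Python. It uses made-up 2 x 2 symmetric matrices in place of a real information matrix, assumes a cutoff of 1e-10, and does not claim to reproduce how Mplus computes the number internally; all function names are hypothetical. It also shows how a value printed with a D exponent, such as 0.430D-10, can be read as an ordinary floating point number.

```python
import math


def eigenvalues_2x2_symmetric(a, b, c):
    """Eigenvalues of the symmetric matrix [[a, b], [b, c]] in closed form."""
    center = (a + c) / 2.0
    radius = math.sqrt(((a - c) / 2.0) ** 2 + b * b)
    return center - radius, center + radius


def condition_number(a, b, c):
    """Ratio of the lowest to the highest eigenvalue."""
    low, high = eigenvalues_2x2_symmetric(a, b, c)
    return low / high


def parse_mplus_number(text):
    """Read a number printed with a Fortran-style D exponent, e.g. 0.430D-10."""
    return float(text.strip().upper().replace("D", "E"))


def looks_nonidentified(value, threshold=1e-10):
    """Flag values below the assumed cutoff of about 1e-10."""
    return value < threshold


print(condition_number(2.0, 0.0, 1.0))  # two separately determined parameters
print(condition_number(1.0, 1.0, 1.0))  # two perfectly redundant parameters
print(looks_nonidentified(parse_mplus_number("0.430D-10")))
```

In the second matrix the two rows are identical, so one eigenvalue is zero and the ratio is zero, which is the pattern that points to a nonidentified model.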

Allison Fuligni posted on Wednesday, October 26, 2005 - 3:45 pm

I am a non-sophisticated user of Mplus, trying to look at growth mixture modeling with multiple groups. With the User's Guide I determined I should be using the KNOWNCLASS statement to identify the groups (in my case, 3 race/ethnicity groups). However, as I attempt to build my model, moving from one latent class in addition to the 3 race classes to a 2 X 3 class model, I get error messages saying model estimation did not terminate normally - change my model or starting values. I see the examples of how to specify starting values for each class in the User's Guide, but don't know how to begin to determine which starting values I would specify. Is there another source where I could read more on this topic?

Thank you very much.

Linda K. Muthen posted on Wednesday, October 26, 2005 - 5:02 pm

I'm guessing that the message may have more to do with the model than the starting values. You may want to fit a growth model for each ethnicity group separately much like you would do a factor analysis for each group separately as a first step. You might want to read the Muthen paper in a book edited by Kaplan. This paper can be downloaded from the website.

Edward Barker posted on Tuesday, February 13, 2007 - 5:29 am

Quick question: I am using Mplus to estimate GGMM for antisocial behaviors in boys and girls. I find, for girls, the VUONG-LO-MENDELL-RUBIN LIKELIHOOD RATIO TEST FOR 2 (H0) VERSUS 3 CLASSES (p = .65) whereas the VUONG-LO-MENDELL-RUBIN LIKELIHOOD RATIO TEST FOR 3 (H0) VERSUS 4 CLASSES (p = 0.0426).

Model estimation appears to have terminated normally for each model (i.e., the loglikelihoods were replicated).

Is this something to be concerned about?

Linda K. Muthen posted on Tuesday, February 13, 2007 - 8:39 am

I would try TECH14 also. Be sure to read about the steps to use TECH14 in the most recent user's guide which is on the website. You want to use it in conjunction with the OPTSEED option.

Edward Barker posted on Thursday, February 15, 2007 - 2:53 am

Thanks! I followed steps to use TECH14.

New but related question. I now have two LRTs: 1) VUONG-LO-MENDELL-RUBIN LIKELIHOOD RATIO TEST FOR 1 (H0) VERSUS 2 CLASSES (p=.08), and 2) PARAMETRIC BOOTSTRAPPED LIKELIHOOD RATIO TEST FOR 1 (H0) VERSUS 2 CLASSES (p=.0000).

Is it a concern the two LRTs do not agree?

Are there additional checks for model stability in Mplus?

Edward Barker posted on Thursday, February 15, 2007 - 5:24 am

One more question, and this is probably related to my previous question: If model estimation terminates with the output below, is it the case that the model solution is not stable? THANKS!

Loglikelihood values at local maxima, seeds, and initial stage start numbers:

-2593.457 49221 254
-2593.457 650371 14

WARNING: WHEN ESTIMATING A MODEL WITH MORE THAN TWO CLASSES, IT MAY BE
NECESSARY TO INCREASE THE NUMBER OF RANDOM STARTS USING THE STARTS OPTION
TO AVOID LOCAL MAXIMA.

Class 1

Observed
Variable R-Square

THEFT2 0.000
THEFT3 0.000
THEFT4 Undefined -0.22204E-15
THEFT5 0.000

Class 2

Observed
Variable R-Square

THEFT2 0.000
THEFT3 0.000
THEFT4 Undefined -0.22204E-15
THEFT5 0.000

Class 3

Observed
Variable R-Square

THEFT2 0.000
THEFT3 0.000
THEFT4 Undefined -0.22204E-15
THEFT5 0.000

Linda K. Muthen posted on Thursday, February 15, 2007 - 9:11 am

TECH11 and TECH14 can disagree. In the following paper which is available on the website, TECH14 came out best:

Nylund, K.L., Asparouhov, T., & Muthen, B. (2006). Deciding on the number of classes in latent class analysis and growth mixture modeling. A Monte Carlo simulation study. Accepted for publication in Structural Equation Modeling.

See the following paper which is also available on the website for suggestions of how to assess model fit:

Muthén, B. (2004). Latent variable analysis: Growth mixture modeling and related techniques for longitudinal data. In D. Kaplan (ed.), Handbook of quantitative methodology for the social sciences (pp. 345-368). Newbury Park, CA: Sage Publications.

We give the warning whenever there are more than two classes. You have replicated your loglikelihood so this does not apply to you.

The fact that you have r-squares of zero makes me think you must have fixed some parameters to make this happen. Otherwise, you would need to send the full output and your license number to support@statmodel.com.
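The replication check mentioned above can be sketched in a few lines of Python. This is only an illustration: it assumes the random-starts listing has been copied into plain text lines with the loglikelihood as the first field, and both the helper name and the tolerance are hypothetical choices.

```python
def best_loglikelihood_replicated(lines, tolerance=1e-3):
    """Return True if the highest loglikelihood appears for at least two starts.

    Each non-empty line is expected to look like
    '<loglikelihood> <seed> <initial stage start number>'.
    """
    values = [float(line.split()[0]) for line in lines if line.strip()]
    if not values:
        return False
    best = max(values)
    matches = sum(1 for v in values if abs(v - best) <= tolerance)
    return matches >= 2


# The two lines quoted in the post above.
reported = [
    "-2593.457 49221 254",
    "-2593.457 650371 14",
]
print(best_loglikelihood_replicated(reported))
```

For the two lines quoted in the post, the best value is reached from two different seeds, which is what "replicated" refers to here. If the best value appeared only once, increasing the number of random starts with the STARTS option would be the usual next step.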

zhenli posted on Monday, April 02, 2007 - 12:30 am

Dear Dr. Muthen
When we fit a latent growth model, we usually fix two slope loadings at 0 and 1 to identify the model, whether it is linear, freely estimated, or quadratic. I wonder whether we can fix these two slope loadings at other values. Why do we fix one of the slope loadings at 1? Is that because we need to scale the latent variables (intercept and slope factors)?

Thank you!

Linda K. Muthen posted on Monday, April 02, 2007 - 8:37 am

It is not necessary to use zero and one as time scores. The intercept growth factor is defined at the time score of zero so it is convenient to have that as the first time score if you are interested in initial status or the last if you are interested in final status etc. Regarding the other time scores, they need only reflect the distance between measurement occasions. So it can be 0 1 2 or 0 .1 .2.
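A small Python sketch of why 0 1 2 and 0 .1 .2 are interchangeable: the numbers below (an intercept mean of 10 and a slope mean of 2 per unit of time) are made up for illustration, and `implied_means` is a hypothetical helper. Dividing the time scores by 10 multiplies the slope by 10 and leaves the model-implied means unchanged.

```python
def implied_means(intercept_mean, slope_mean, time_scores):
    """Model-implied outcome means of a linear growth model."""
    return [intercept_mean + slope_mean * t for t in time_scores]


# Made-up growth factor means, for illustration only.
means_unit = implied_means(10.0, 2.0, [0, 1, 2])         # time scores 0 1 2
means_tenth = implied_means(10.0, 20.0, [0.0, 0.1, 0.2])  # time scores 0 .1 .2

print(means_unit)
print(means_tenth)
```

Only the metric of the slope changes. The intercept still refers to the occasion with time score zero, which is why the choice of which occasion gets zero matters for interpretation while the unit of the time scores does not.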

Alison Ventura posted on Tuesday, August 14, 2007 - 7:01 am

Dear Dr. Muthen,

This message will be in two posts:
I am new to GMM and am trying to fit a GMM for weight status (Body Mass Index) data. I first fit a simple model with the default constraints across classes and found a three class solution was the best fit.
But, within this model I saw that the residual variances jumped quite high for the last time point.
A colleague who has a bit more experience with these types of analyses suggested I examine the distributions within each class for normality (I did, the distributions didn't look too skewed) and try rerunning things with unconstrained variances. I did, and although the three class model fit better than the constrained version, the four class model gave me the following error message:
THE MODEL ESTIMATION DID NOT TERMINATE NORMALLY DUE TO A NON-ZERO DERIVATIVE OF THE OBSERVED-DATA LOGLIKELIHOOD.

THE MCONVERGENCE CRITERION OF THE EM ALGORITHM IS NOT FULFILLED. CHECK YOUR STARTING VALUES OR INCREASE THE NUMBER OF MITERATIONS. ESTIMATES CANNOT BE TRUSTED. THE LOGLIKELIHOOD DERIVATIVE FOR PARAMETER 10 IS 0.15289800D-04.

Alison Ventura posted on Tuesday, August 14, 2007 - 7:02 am

I also tried running the model with class-specific covariation between s and i, which gave me a slightly better fitting model for 3 classes, but more error messages for 4 classes about an ill-conditioned and non-positive definite Fisher information matrix and not being able to calculate S.E.s.
I am not sure where to go from here. Do these error messages with the unconstrained models mean:
(1) that the constrained model fits the best?
(2) that the 3 class unconstrained model fits better than the 4 class?
(3) that I am doing something wrong or need to try something different in setting up the unconstrained models?

I should add that the unconstrained models also had the same pattern of increasing residual variance at the later time points, and it was more dramatic within each class when I let the variances be class-specific. Is this something I should be concerned about, or is it something that would be expected with longitudinal data?

Apologies for such a long question; thank you for any assistance you can provide.

Linda K. Muthen posted on Tuesday, August 14, 2007 - 6:27 pm

If a post does not fit in the space provided in one window, it is too long for Mplus Discussion. Please do not double post in the future.

A residual variance can be larger at one timepoint. This does not necessarily indicate a problem.

Estimating a model with class-varying variances can be problematic. I would run the model with the variances held equal across classes and ask for plots using the PLOT command. I would look at the estimated mean and observed individual plot to see if it appears that there are different variances in the classes. If one class has a smaller or larger variance than the others, I would free the variance for that class.

None of what you are seeing says anything about model fit.

Joan W. posted on Tuesday, October 14, 2008 - 8:37 am

Dear Dr. Muthen,

I have a question with regard to your recent post on June 10, 2008 on degrees of freedom. When you say degrees of freedom are relevant when means, variances and covariances are sufficient statistics for model estimation, are you referring to both latent class models with continuous as well as categorical indicators? I ask this because I have frequently read papers on how to calculate degrees of freedom for latent class analysis with categorical variables, and in those papers, df = R - q - 1, where R is equal to the number of unique response patterns, and q is equal to the number of free parameters.

A related question to the degrees of freedom is the absolute goodness-of-fit index. I remember reading an earlier post from Mplus Discussion, saying that because there are no sufficient statistics and no unrestricted model in mixture models, there is no chi-square statistic. I wonder whether this applies to continuous indicators as well as categorical indicators. Likewise, I've seen papers discussing chi-square statistics as a goodness-of-fit index for analyzing contingency tables by comparing the estimated frequency with the observed frequency in each cell. I'm just confused...

Thanks!

Linda K. Muthen posted on Tuesday, October 14, 2008 - 11:45 am

I don't see what post you are referring to from June 10, 2008. Could you give me the link?

Joan W. posted on Tuesday, October 14, 2008 - 12:11 pm

Under the section of "latent class growth analysis" through the following link:
http://www.statmodel.com/discussion/messages/14/195.html?1213117551

the latest post. thanks!

Linda K. Muthen posted on Wednesday, October 15, 2008 - 10:38 am

For latent class analysis and categorical outcomes, the H0 model has q parameters, the H1 model has R-1 parameters where R is the number of cells in the multiway frequency table so the degrees of freedom is R - q - 1. Chi-square tests are available to test the fit of the observed versus expected cell frequencies. These tests are usually not useful when there are more than 8 latent class indicators.

For continuous latent class indicators, a frequency table test is not relevant. No chi-square test is available.
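The R - q - 1 count can be worked through in Python. The sketch below assumes an unrestricted latent class model without covariates, where q consists of the class probabilities plus the class-specific response probabilities; the function names and the example with four binary indicators are illustrative, not taken from the thread.

```python
def n_cells(categories):
    """R: number of cells in the multiway frequency table."""
    cells = 1
    for c in categories:
        cells *= c
    return cells


def lca_free_parameters(n_classes, categories):
    """q for an unrestricted LCA (assumption): (K - 1) class probabilities
    plus, in each class, (c - 1) response probabilities per indicator."""
    return (n_classes - 1) + n_classes * sum(c - 1 for c in categories)


def lca_df(n_classes, categories):
    """Degrees of freedom R - q - 1 of the frequency table chi-square test."""
    return n_cells(categories) - lca_free_parameters(n_classes, categories) - 1


four_binary = [2, 2, 2, 2]
print(n_cells(four_binary), lca_free_parameters(2, four_binary), lca_df(2, four_binary))
print(n_cells([2] * 9))  # cells with nine binary indicators
```

With four binary indicators and two classes this gives 16 - 9 - 1 = 6. With nine binary indicators the table already has 512 cells, so many cells tend to be empty or sparse in samples of typical size, which is consistent with the remark that the tests are usually not useful beyond 8 indicators.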

microlily posted on Saturday, December 13, 2008 - 9:59 am

Dear Dr. Muthen:
I am running a quadratic growth model with 57 cases for posttreatment pain assessed at 7, 10, 30, and 90 hours. I got "NO CONVERGENCE. NUMBER OF ITERATIONS EXCEEDED". Is it because of the small sample, and what should I do to achieve convergence?
Thank you so much.

Bengt O. Muthen posted on Saturday, December 13, 2008 - 5:05 pm

That's hard to give a general answer to. Take a look at the "non convergence" entry in the UG index.

If that doesn't help, please send your input, output, data and license number to support@statmodel.com.

ywang posted on Wednesday, August 05, 2009 - 12:37 pm

Dear Drs. Muthen:

I have a question about the LGM with three time points on continuous indicator variables.

Mplus kept showing information like
" THE STANDARD ERRORS OF THE MODEL PARAMETER ESTIMATES MAY NOT BE
TRUSTWORTHY FOR SOME PARAMETERS DUE TO A NON-POSITIVE DEFINITE
FIRST-ORDER DERIVATIVE PRODUCT MATRIX. THIS MAY BE DUE TO THE STARTING
VALUES BUT MAY ALSO BE AN INDICATION OF MODEL NONIDENTIFICATION. THE
CONDITION NUMBER IS -0.186D-16. PROBLEM INVOLVING PARAMETER 11".

but it also showed "THE MODEL ESTIMATION TERMINATED NORMALLY"

I kept fixing the problems. However, after I fixed one problem, e.g. by fixing the variance at 0, it would tell me there is another problem involving a different parameter. What can I do with this?

Thank you very much for your help in advance!

Bengt O. Muthen posted on Wednesday, August 05, 2009 - 1:01 pm

It is not possible to diagnose this without having more information. Please send your input, output, data and license number to support@statmodel.com. Only your first analysis is needed.

Fernando H Andrade posted on Monday, February 15, 2010 - 3:10 pm

Dear Dr. Muthen
I am new to Mplus. I am trying to fit a quadratic growth curve model. The outcomes are binary (0 1) responses. I first fitted the linear model and I did not have any problems. But when I try to fit the quadratic model I get this message:

THE STANDARD ERRORS OF THE MODEL PARAMETER ESTIMATES COULD NOT BE COMPUTED. THE MODEL MAY NOT BE IDENTIFIED. CHECK YOUR MODEL. PROBLEM INVOLVING PARAMETER 23.
THE CONDITION NUMBER IS 0.430D-10.

FACTOR SCORES WILL NOT BE COMPUTED DUE TO NONCONVERGENCE OR NONIDENTIFIED MODEL.

This is the syntax:

Categorical are pda1 pda2 pda3 pda4 pda5 pda6 pda7 pda8
pda9 pda10 pda11 pda12 pda13 pda14 pda15;

Missing are all (-9999);

MODEL: i s q | pda1@0 pda2@1 pda3@2 pda4@3 pda5@4 pda6@5 pda7@6 pda8@7
pda9@8 pda10@9 pda11@10 pda12@11 pda13@12 pda14@13 pda15@14;

OUTPUT: tech1

Is there something I am missing?

thank you very much

Fernando

Linda K. Muthen posted on Monday, February 15, 2010 - 3:13 pm

Try dividing the time scores by 10, that is, 0 .1 .2 etc. If that does not help, please send the full output and your license number to support@statmodel.com.

Fernando H Andrade posted on Monday, February 15, 2010 - 6:10 pm

Dear Linda thank you very much
It worked well. May I ask why it works with increments of 0.1 and not with increments of 1?
fernando

Linda K. Muthen posted on Tuesday, February 16, 2010 - 6:47 am

The time scores provide values that are used to determine where model estimation starts. A linear score of 14 becomes a quadratic score of 196. It is better to start at 1.4 and 1.96.
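The arithmetic behind this advice, as a short Python sketch. It assumes the quadratic time scores are simply the squares of the linear ones, as the 14 and 196 in the reply suggest; `quadratic_time_scores` is a hypothetical helper name.

```python
def quadratic_time_scores(linear_scores):
    """Quadratic time scores as squares of the linear time scores."""
    return [t * t for t in linear_scores]


raw = list(range(15))             # 0, 1, ..., 14 as in the posted model
scaled = [t / 10 for t in raw]    # 0, .1, ..., 1.4 after dividing by 10

print(max(raw), max(quadratic_time_scores(raw)))
print(max(scaled), max(quadratic_time_scores(scaled)))
```

Dividing the linear scores by 10 shrinks the largest quadratic score by a factor of 100, so the three growth factors end up on much more comparable scales.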

Juned Siddique posted on Wednesday, March 31, 2010 - 3:24 pm

Hi. How do I modify the following code so that only the intercept and slope are random and the quadratic term is fixed? Thank you.

i s q | y1@0 y2@1 y3@2 y4@3

Bengt O. Muthen posted on Wednesday, March 31, 2010 - 3:40 pm

q@0;

This fixes the q variance at zero.

Nicole Nugent posted on Tuesday, August 24, 2010 - 6:12 am

Dear Drs. Linda and Bengt Muthen,

I am working with a colleague to fit a growth model with observed affect data at 5 time points. A binary event occurred at the 3rd time point, and so we set the T3 intercept/time @ 0. We generated temporal offsets (measured in hours) for the remaining four variables (V1,V2,V4,V5), reflecting the number of hours removed those observations were from the event of interest. Offsets prior to the event (T1 & T2) are negative while offsets following the event (T4 & T5) are positive.

i s q | T1@V1 T2@V2 T3@0 T4@V4 T5@V5;
event on i s q;

Our theory and findings from repeated measures ANOVA suggest that a positive quadratic is found among individuals experiencing the event. However, we have been unable to get the model to converge. We have managed convergence when we use simple time points (-2, -1, 0, 1, 2) and/or drop the quadratic term, but the quadratic is central to our hypotheses and reviewers are requiring us to parameterize time in terms of actual time before/since the event.

I'm concerned that the sample is simply too small (N = 36) to effectively fit a model of this type?

Thanks in advance for your help! Sincerely, Nicole

Linda K. Muthen posted on Tuesday, August 24, 2010 - 8:27 am

Please send the outputs and your license number to support@statmodel.com.

Janet Smith posted on Saturday, February 05, 2011 - 9:15 am

Hello

I am trying to compute a piecewise LGM for two slopes. My syntax is this:

VARIABLE: NAMES ARE x15 x16 x17 x19;
MISSING ARE ALL (-999);
USEVAR = x15 x16 x17 x19;

MODEL:
i s1 | x15@0 x16@1 x17@1 x19@1;
i s2 | x15@0 x16@0 x17@0 x19@1;

OUTPUT: TECH1 MODINDICES STDYX;

PLOT: TYPE=PLOT3;
series is x15(0) x16(1) x17(2) x19(4);

I keep getting the error message:

THE STANDARD ERRORS OF THE MODEL PARAMETER ESTIMATES COULD NOT BE COMPUTED. THE MODEL MAY NOT BE IDENTIFIED. CHECK YOUR MODEL. PROBLEM INVOLVING PARAMETER 10.

Parameter 10 is the PSI table cell between s1 and s1.

Looking at the mod indices the cell reads '.968'

I don't know what this means and don't know how to fix the problem.

Can you please help?

Linda K. Muthen posted on Saturday, February 05, 2011 - 9:18 am

Please send the full output and your license number to support@statmodel.com.

Sofie Henschel posted on Friday, May 27, 2011 - 6:16 am

Dear Dr. Muthen,
I'm trying to run a simple latent growth model with 3 timepoints. There is a problem in model identification because of a non-positive def. matrix error which shows up as a negative slope variance. I am not sure how to handle this. Fixing at least the residual variances of the first/second or second/third timepoint for a common estimation yields a model identification at last.
Do you have any idea what the problem is?
Kind regards,

Linda K. Muthen posted on Friday, May 27, 2011 - 6:22 am

A linear growth model with a continuous outcome and three time points is identified. I would have to see the problem to say what the message means. Please send the output and your license number to support@statmodel.com.

Till posted on Friday, October 14, 2011 - 3:46 am

Dear Prof. Muthén,

I'm conducting several LGM. In some models I get the "Nonconvergence" Message from Mplus.
If I fix the residual variance of some parameter X to a small positive amount (.01) the analysis works.
But I cannot theoretically justify that way of handling it, because that parameter X does not seem to have a different residual variance compared to other parameters Y and even in the data there is nothing, that would point to that special parameter.
My question is: Is this an arguable method or should I try to find some other solution?
I hope the question is not answered somewhere else, I could not find any matching posts.
Thank you in advance
Best regards
Till

Please find my Input below

variable: names= g1 e1 n1 g2 e2 n2 g3 e3 n3 l01 l02 l03 l04 l05 l06 l07 l08 l09;

usevariable=n1 n2 n3 l01 l02 l03 l04 l05 l06 l07 l08 l09;
missing=all(99);

model: i s | l01@0 l02 l03 l04 l05@-1 l06 l07 l08 l09;
F1 by n1 n2 n3;
i s on F1;

P.S. A solution would be, for example, l02@0.01;

Linda K. Muthen posted on Saturday, October 15, 2011 - 8:12 am

I would not fix the negative residual variance to a small value. It is likely due to a misspecified model. You could try holding the residual variances equal across time or changing the model.

Till posted on Monday, October 17, 2011 - 4:18 am

Dear Linda,

thank you very much for your fast response.
I would like to elaborate on the question a little.
I'm examining the impact of personality on life satisfaction. For that purpose I specify the model described above, first for three factors F1 F2 F3 with
F1 by n1 n2 n3;
F2 by e1 e2 e3;
F3 by g1 g2 g3;
and
i s on F1 F2 F3;
and then once for every factor alone.
I have the convergence problem for two of the three models with only one factor, but not for the base model, and I have problems interpreting that. I suppose that if the problems were due to misspecification, none of the models should work. In my interpretation I assume that the problem arises in part due to the very small sample size (N=99), but still I do not understand why it works sometimes and sometimes not.
Is there, in your opinion, any possible explanation?
Thank you very much for your time!
Best regards
Till

Linda K. Muthen posted on Monday, October 17, 2011 - 6:00 am

I would need to see the outputs and your license number at support@statmodel.com.

Xiaolu Zhou posted on Monday, January 30, 2012 - 10:49 am

Hi,

I am comparing two SEMs with Mplus. In one, 3 latent factors are related to an X latent factor. In the other, the same 3 latent factors are related to Y/Z latent factors. X represents the 1-factor structure of the C scale. Y and Z represent the 2-factor structure of the same C scale. The SEM for the X latent factor is good. However, I cannot get the model fit of the SEM for the Y/Z factors. The output showed: MAXIMUM LOG-LIKELIHOOD VALUE FOR THE UNRESTRICTED (H1) MODEL IS -1344.889. NO CONVERGENCE. NUMBER OF ITERATIONS EXCEEDED. Could you please tell me what accounts for this? Thanks a lot!

Linda K. Muthen posted on Monday, January 30, 2012 - 10:51 am

Please send the output and your license number to support@statmodel.com.

Isaura Olivares posted on Sunday, April 22, 2012 - 9:44 pm

I have four waves of data and I am estimating a quadratic 5-class GMM. I have found that a four-class model fits the data well, but a five-class model seems to have even better "fit" based on the BIC, best LL value, and the LMR. However, for this 5-class model, the estimates for the means of the growth factors are completely different from the estimates I found for a 4-class model. In fact, when I looked at the estimated means graph for the 5-class model, it looked completely different from the one for a 4-class model. It is my understanding that in GMM, whenever a model is estimated with a new class, the new class would be class #1 in the new model and that the other classes should remain pretty much the same. Is this correct?

This is the syntax that I am using for a 4 and 5 class model.
i s q | y1@0 y2@1.5 y3@3 y4@4.5;
q@0;

Isaura Olivares posted on Sunday, April 22, 2012 - 9:45 pm

I have found that after constraining the growth factor variances (i s q) to zero, as in latent class growth analysis, the estimates for the means of the growth factors appear to be more consistent with those of the four class model (with small differences) and include the new class. The only problem is that the BIC is higher than that for a 4 class (for which I set q@0). Would it make sense to choose the 5 class model with i-q set at zero as the correct model?

THANK YOU FOR YOUR HELP!

Isaura Olivares posted on Sunday, April 22, 2012 - 10:02 pm

Could it also be possible that four waves of data are not enough for a 5 class model? All the classes make sense theoretically, including the fifth class.

Linda K. Muthen posted on Monday, April 23, 2012 - 1:23 pm

No, it is not correct that the new class would be number 1.

No, it would not make sense to choose the 5-class model and fix i-q at zero.

No, it is not possible that four waves are not enough for a 5-class model. I would use the 5-class model with free growth factor variances.

Please limit your posts to one window in the future.

iolivares posted on Monday, April 23, 2012 - 5:14 pm

Dr. Muthen,

Thank you for your prompt response. I apologize for my multiple posts earlier. I think I am misunderstanding something, which may be basic. If growth factor estimates (means) for a 5 class model with better fit are completely inconsistent with those of a 4 class model, would it still make sense to choose the 5 class model? I am making the assumption that models with k classes have the same growth estimates as one with k-1 classes with the exception of the new class and some minor variations in estimates for the old classes which may occur as a result of extracting the new class.

Linda K. Muthen posted on Monday, April 23, 2012 - 5:16 pm

This is not true. A five-class solution does not have the four-class solution with one new class. It has 5 new classes which may or may not resemble the classes from the four-class solution.

iolivares posted on Monday, April 23, 2012 - 5:29 pm

Thank you! That certainly makes sense!

lam chen posted on Tuesday, May 22, 2012 - 3:50 am

Dear Dr. Muthen,

I am fitting a GMM with four observation points. With two classes, the model converges. However, once I increase the number of classes, the PSI matrices are not positive definite. How should I solve this problem? I also found that after adding covariates, the model converged. I guess the reason is that there are not enough observation points to fit a GMM with three or more classes. Is that right?

Linda K. Muthen posted on Tuesday, May 22, 2012 - 8:42 am

As the number of classes increases, the within class variability decreases. You may be trying to extract too many classes.

When you add covariates to the model, they contribute to the formation of the classes so more information is used than when they are not included. In addition, adding covariates increases power.
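
The point that within-class variability shrinks as classes are added can be illustrated with a crude stand-in for a mixture model. The sketch below is only an assumption-laden illustration: it splits hypothetical sorted scores into k contiguous groups (not the classes a GMM would estimate) and reports the pooled within-group variance.

```python
from statistics import pvariance


def pooled_within_variance(values, k):
    """Split sorted values into k contiguous, equal-sized groups and
    return the size-weighted average of the within-group variances."""
    ordered = sorted(values)
    size = len(ordered) // k
    groups = [ordered[j * size:(j + 1) * size] for j in range(k)]
    n = sum(len(g) for g in groups)
    return sum(len(g) * pvariance(g) for g in groups) / n


# Hypothetical intercept-like scores, not data from this thread.
scores = list(range(12))

# Pooled within-group variance for 1, 2, 3 and 4 groups.
within = [pooled_within_variance(scores, k) for k in (1, 2, 3, 4)]

for k, v in zip((1, 2, 3, 4), within):
    print(f"{k} group(s): pooled within-group variance = {v:.3f}")
```

With less variance left inside each group, variance parameters such as those in PSI have less information behind them, which is one reason estimates can drift toward inadmissible values when too many classes are requested.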

Shirley Poyau posted on Sunday, September 16, 2012 - 7:21 pm

I have been trying to fit a 2nd order growth curve to data with three time points, with 3 indicators per time point. Each time I receive an error message stating that the model may not be identified, and that there is an error with one of the parameters (the alpha matrix of the intercept; this has happened with two separate models).

I've tested and imposed measurement (scalar) invariance across time, and have set intercepts to zero for first order constructs.

Any ideas as to why this model might be unidentified?

Linda K. Muthen posted on Monday, September 17, 2012 - 6:42 am

Please send your output and license number to support@statmodel.com.

Kara Thompson posted on Wednesday, November 07, 2012 - 2:52 pm

I am running a piecewise growth model using a cohort sequential study with time as age. The model runs for the total sample of 662 and for males only (n=320), but I cannot get it to converge for females (n=335).

The error I get is NO CONVERGENCE. SERIOUS PROBLEMS IN ITERATIONS. CHECK YOUR DATA, STARTING VALUES AND MODEL.

Do you have any suggestions for what might be wrong, or what I should try?

Thanks
Kara
i s1 | y14@0 y15@.1 y16@.2
y17@.3 y18@.4 y19@.5 y20@.6
y21@.6 y22@.6 y23@.6 y24@.6 y25@.6 y26@.6 y27@.6;

i s2 | y14@0 y15@0 y16@0
y17@0 y18@0 y19@0 y20@0
y21@.1 y22@.2 y23@.3 y24@.4 y25@.5 y26@.6 y27@.7;

Linda K. Muthen posted on Wednesday, November 07, 2012 - 3:12 pm

Please send the output and your license number to support@statmodel.com.

Shraddha Kashyap posted on Monday, March 11, 2013 - 1:57 am

Hi,

I'm having some trouble with a LGCA. When I test a one-class model, I keep getting an impossible mean value for one of my variables.

I have checked my SPSS data and Mplus data file and I cannot find any problems there. The minimum value for this variable is 0 and the maximum is 5, but the output gives me a mean of 13.7 for this variable. Is there anything I can do to find the problem?

Thanks!

Linda K. Muthen posted on Monday, March 11, 2013 - 6:56 am

Please send the output and your license number to support@statmodel.com.

Markus Martini posted on Wednesday, August 14, 2013 - 1:28 am

I have a question concerning the fixing of growth parameters. When I see in my diagram that the trend of my data points goes from upper left to lower right, the parameters must be fixed at negative values, like:

t1@0 t2@-1 t3@-2 ...

Is that right? What about other (falling) curves, e.g., logarithmic ones?

Thank you very much indeed!

Linda K. Muthen posted on Wednesday, August 14, 2013 - 6:00 am

The time scores should be positive and reflect the distance between the measurement occasions. The sign of the trend will be seen in the mean of the slope growth factor. See the Topic 3 course video and handout for a discussion of these issues and other types of growth curves.
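
A small numeric sketch of this point: with positive time scores, a falling trend simply shows up as a negative slope. The means below are hypothetical, and ordinary least squares on the means is used here only as a stand-in for the slope growth factor mean that Mplus would estimate.

```python
def ols_line(times, means):
    """Least-squares intercept and slope of means regressed on time scores."""
    n = len(times)
    t_bar = sum(times) / n
    y_bar = sum(means) / n
    sxy = sum((t - t_bar) * (y - y_bar) for t, y in zip(times, means))
    sxx = sum((t - t_bar) ** 2 for t in times)
    slope = sxy / sxx
    intercept = y_bar - slope * t_bar
    return intercept, slope


# Positive, equally spaced time scores, as in t1@0 t2@1 t3@2.
time_scores = [0, 1, 2]

# Hypothetical observed means that fall over time (not data from this thread).
observed_means = [10.0, 8.0, 6.0]

intercept, slope = ols_line(time_scores, observed_means)
print(f"intercept = {intercept}, slope = {slope}")
```

The decline is carried entirely by the sign of the slope, so the time scores themselves never need to be negative.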

Paraskevas Petrou posted on Monday, October 14, 2013 - 6:16 am

Dear Linda and Bengt,

I am testing the following growth model:

MODEL:
ix sx | x1@0 x2@1 x3@2;
im sm | m1@0 m2@1 m3@2;
iy sy | y1@0 y2@1 y3@2;

xmI | ix XWITH im;
xmS | sx XWITH sm;

iy on ix im xmI;
sy on ix im xmI sx sm xmS;

When I run it, "the model estimation does not terminate normally due to a change in the likelihood during the last e step". After I increase MIterations and MConvergence, I get the same error and:
SLOW CONVERGENCE DUE TO PARAMETER 24,
which is the PSI of im.

Any help would be very much appreciated.

Thank you,
Paris

Linda K. Muthen posted on Monday, October 14, 2013 - 6:51 am

You should build the model up in parts to see when the problem occurs. I would fit each growth model separately to see if there are any problems there. If not, put them together. Then add the ON statements and then the interactions.

Paraskevas Petrou posted on Tuesday, October 15, 2013 - 1:15 am

I built the model in the order you suggested. The first error appears when I enter the second ON statement (without the interactions yet). It reads:

THE ESTIMATED COVARIANCE MATRIX COULD NOT BE INVERTED. COMPUTATION COULD NOT BE COMPLETED IN ITERATION 38. CHANGE YOUR MODEL AND/OR STARTING VALUES.

THE MODEL ESTIMATION DID NOT TERMINATE NORMALLY DUE TO AN ERROR IN THE COMPUTATION. CHANGE YOUR MODEL AND/OR STARTING VALUES.

Also, the following warning appears several times early in the process: LATENT VARIABLE COVARIANCE MATRIX (PSI) IS NOT POSITIVE DEFINITE.

Sometimes I resolve this by constraining a variance to be positive, but other times it is not clear what the problem is.

Thank you,
Paris

Linda K. Muthen posted on Tuesday, October 15, 2013 - 6:21 am

You need to send the output where the first message about the latent variable covariance matrix appears along with your license number to support@statmodel.com.

Sanne de Vries posted on Monday, September 22, 2014 - 8:25 am

Dear Prof. Muthén,

In order to assess program effects (N = 126 RCT: 1 experimental group, N = 70 and 1 control group, N = 56), I'm conducting a two-part latent growth model for a semicontinuous outcome measure (1st part: a binary variable; 2nd part: a continuous variable).
Unfortunately, I could not get fit indices (chi-square test) for the experimental group (I conducted separate analyses for each group). Also, after trying several options, I got several warnings (NEGATIVE VARIANCE/RESIDUAL VARIANCE FOR A LATENT VARIABLE), based on these input commands:

MODEL:
iu su | bin1@0 bin2@1 bin3@2;
iy sy | cont1@0 cont2@1 cont3@2;

!su@0; sy with iu@0;
!sy@0;

OUTPUT: SAMPSTAT STANDARDIZED MODINDICES TECH1 TECH8 TECH4;
PLOT: TYPE = PLOT3;
SERIES = bin1-bin3(su) | cont1-cont3(sy);

Could you explain these warnings and advise an alternative model?

Thank you in advance,

Sanne

Linda K. Muthen posted on Monday, September 22, 2014 - 9:41 am

Please send the full output and your license number to support@statmodel.com.

Ida Behrendt-Møller posted on Thursday, December 08, 2016 - 4:38 am

Dear Prof.
I have a problem with TECH11 in a cubic GMM. I ran the model from 1 to 5 classes and kept the intra-class variance fixed for the Q and Cu terms.
When I ran the 5-class model, I got this in TECH11:

Mean *****************
Standard Deviation *****************
P-Value 0.1752
I didn't receive any warnings. Can the P-values be trusted even though the mean and SD can't be computed?
To solve the problem, I've tried both using the OPTSEED option (with the seed of the best loglikelihood found previously) and increasing the number of MITERATIONS, but this gives me a new error:

THE MODEL ESTIMATION DID NOT TERMINATE NORMALLY DUE TO A NON-ZERO
DERIVATIVE OF THE OBSERVED-DATA LOGLIKELIHOOD.

THE MCONVERGENCE CRITERION OF THE EM ALGORITHM IS NOT FULFILLED.
CHECK YOUR STARTING VALUES OR INCREASE THE NUMBER OF MITERATIONS.
ESTIMATES CANNOT BE TRUSTED. THE LOGLIKELIHOOD DERIVATIVE
FOR THE FOLLOWING PARAMETER IS -0.12168984D-04:
Parameter 32, %C#5%: [ CU ].

And all fit estimates change.
I'm not sure what I should do next.

Kind regards

Bengt O. Muthen posted on Thursday, December 08, 2016 - 5:18 pm

Please send the output to Support along with your license number.

Chunhua Cao posted on Monday, June 12, 2017 - 2:35 pm

Dear Dr. Muthén,

I ran a growth curve model and got this error message, "THE SAMPLE SIZE PLUS THE PRIOR DEGREES OF FREEDOM OF PSI MUST BE GREATER THAN THE NUMBER OF LATENT VARIABLES". The model commands are:

i1 s1|B1@0 B2@1 B3@2 B4@3 B5@4 B6@5 B7@6 B8@7 B9@8 B10@9 B11@10 B12@11 B13@12;

i2 s2|T8@0 T9@1 T10@2 T11@3 T12@4 T13@5 T14@6 T15@7 T16@8 T17@9 T18@10 T19@11;

My sample size is very small (8 participants). The Bayesian estimation was used. I also tried estimating only three latent variables (i.e., i1, i2, and s2) and the code successfully ran.

Could you please clarify what the error message means? How can I calculate the prior degrees of freedom of psi?

Many thanks in advance for your help!

Tihomir Asparouhov posted on Tuesday, June 13, 2017 - 2:06 pm

The message refers to the Inverse Wishart prior for the variance-covariance matrix of i1 s1 i2 s2; this matrix is referred to in TECH1 as Psi. The default prior is not acceptable for such a small sample size (the default prior is IW(I,-5); see page 36 of http://statmodel.com/download/Bayes3.pdf).

Use MODEL PRIOR: to specify a weakly informative prior for Psi. With such a small sample size, it would also be advisable to conduct a prior sensitivity study.

Chunhua Cao posted on Tuesday, June 13, 2017 - 3:15 pm

Thank you very much for your feedback, Dr. Asparouhov. It helps a lot! I just want to clarify: why should the sample size plus the degrees of freedom be greater than the number of latent variables? Is this a requirement particular to the Inverse Wishart prior, and are the priors for other parameters fine with the small sample size?

Again thank you for your time and clarification!

Tihomir Asparouhov posted on Tuesday, June 13, 2017 - 4:16 pm

That requirement ensures that the posterior distribution used in the MCMC is a proper distribution. This prior is a multivariate prior for 10 parameters; it is not as simple as a normally distributed prior for one intercept parameter.
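
The arithmetic behind the error message and the "10 parameters" remark can be sketched as follows. The check simply encodes the condition quoted in the message; the alternative degrees-of-freedom value at the end is a hypothetical illustration, not a recommended prior.

```python
def n_cov_parameters(p):
    """Number of distinct variances and covariances in a p x p Psi matrix."""
    return p * (p + 1) // 2


def satisfies_message_condition(n, prior_df, p):
    """Condition from the error message: sample size plus prior degrees of
    freedom of Psi must be greater than the number of latent variables."""
    return n + prior_df > p


p = 4             # latent variables: i1 s1 i2 s2
n = 8             # sample size reported in the post
default_df = -5   # degrees of freedom of the default prior cited above

print(n_cov_parameters(p))                             # parameters covered by the prior
print(satisfies_message_condition(n, default_df, p))   # default prior

# Hypothetical, more informative prior df (an assumption for illustration only).
hypothetical_df = 5
print(satisfies_message_condition(n, hypothetical_df, p))
```

Under the default, 8 + (-5) = 3 is not greater than 4, which matches the error; any prior with larger degrees of freedom changes that sum, which is why the choice deserves a sensitivity study.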

Chunhua Cao posted on Wednesday, June 14, 2017 - 7:59 am

Thanks so much for your clarification, Dr. Asparouhov!

Lina posted on Saturday, June 09, 2018 - 8:48 am

Dear Professor,
I ran a group SEM and got this error message below:

THE STANDARD ERRORS OF THE MODEL PARAMETER ESTIMATES COULD NOT BE
COMPUTED. THE MODEL MAY NOT BE IDENTIFIED. CHECK YOUR MODEL. PROBLEM INVOLVING THE FOLLOWING PARAMETER: Parameter 29, Group F: JOBF WITH STAY
THE CONDITION NUMBER IS 0.276D-16.

Note: According to the TECH1 output, Parameter 29 is PSI (JOBF, STAY) in the female group. The commands used are below:

Variable: Name are NO GENDER STAY JOBF VAPF VAST PAPF PAST rebo1-rebo22;
IDVARIABLE = NO;
Usev are NO STAY JOBF VAPF VAST PAPF PAST E1 E2 E3;
GROUPING = GENDER (1=M 0=F);
Categorical=STAY;
Missing=All(-9);
DEFINE: E1=(rebo1+rebo2+rebo3)/3;
E2=(rebo6+rebo8+rebo13)/3;
E3=(rebo14+rebo16+rebo20)/3;
Model: EE by E1 E2 E3;
WPV by VAPF VAST PAPF PAST;
EE on WPV;
JOBF on EE;
STAY on EE;
MODEL M:
STAY on WPV;
PAPF WITH PAST;
MODEL F:
STAY on JOBF;
VAPF WITH PAST;
Model indirect: JOBF ind WPV;
STAY ind WPV;
output: SAMPSTAT TECH1 TECH4 Stdyx Mod;

May I have your advice on how to deal with this error? Thank you for your help!

Bengt O. Muthen posted on Sunday, June 10, 2018 - 6:04 pm

Send your full output to Support along with your license number.

Lina posted on Sunday, June 10, 2018 - 11:49 pm

Thank you for your response!

Seungmin Lee posted on Wednesday, April 03, 2019 - 10:20 am

Dear Dr. Muthen,

I am having difficulty calculating the degrees of freedom in a latent curve analysis with 7 time points and a dummy variable (i.e., intervention).

My Model is:

USEVAR ARE M0 M1 M2 M3 M4 M5 M6 INTER;
! INTER = dummy variable

MODEL:
IS_AP GR_AP | M0@0 M1@1 M2@1.00001 M3@1.00002 M4@1.00003 M5@1.00004 M6@1.00005;

IS_AP GR_AP ON INTER;

IS_AP with GR_AP;

Mplus shows 28 degrees of freedom (with 14 free parameters) for the above code.

By my calculation, the number of free parameters is also 14 (paths = 2; latent factor means = 2; residual variances = 7; latent disturbance variances = 2; covariance = 1).
However, by my calculation, the number of observations (i.e., knowns) is 8(8+1)/2 + 7 = 43, so the degrees of freedom would be 43 - 14 = 29, not 28.

Would it be possible to get any advice on how to calculate the degrees of freedom in this model?

Thank you for your time.

Seung

Bengt O. Muthen posted on Wednesday, April 03, 2019 - 4:55 pm

You should not count parameters of the marginal distribution of x (it is not modeled). But you should include the covariances between the y's and x. That is, you have

7(7+1)/2+7 + 7*1 = 42.
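
The counting rule above can be written out as a short calculation. The function and variable names are illustrative; the parameter counts are the ones listed in the question.

```python
def count_sample_statistics(n_y, n_x):
    """Sample statistics used when x is conditioned on (its marginal
    distribution is not modeled): y variances/covariances, y means,
    and y-x covariances."""
    y_cov = n_y * (n_y + 1) // 2   # variances and covariances of the y's
    y_means = n_y                  # means of the y's
    yx_cov = n_y * n_x             # covariances between the y's and x
    return y_cov + y_means + yx_cov


# Free parameters as itemized in the question.
free_parameters = {
    "paths (growth factors ON INTER)": 2,
    "growth factor intercepts": 2,
    "residual variances": 7,
    "growth factor residual variances": 2,
    "growth factor residual covariance": 1,
}

knowns = count_sample_statistics(n_y=7, n_x=1)
n_free = sum(free_parameters.values())
degrees_of_freedom = knowns - n_free

print(knowns, n_free, degrees_of_freedom)
```

The count in the question treated INTER as an eighth outcome, giving 8(8+1)/2 + 7 = 43; dropping the variance of x from that total gives the 42 shown above.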

ksk posted on Wednesday, October 16, 2019 - 4:25 pm

Dear Drs. Muthén,

I've run growth mixture modeling for a single construct measured across 3 time points (i.e., 3 variables are COR1, COR2, and COR3). Below is part of the commands for GMM. For a 1-class model, I got an error such that the residual variance of COR3 has a negative value.

MODEL:
i s | COR1@0 COR2@0.7 COR3@1;

I thought I could set the residual variance of COR3 to zero (COR3@0;). However, how can I be sure that the negative value can be treated as zero when I don't know whether the negative value is significantly different from zero or not? Is there a way to examine the negative value?

Bengt O. Muthen posted on Wednesday, October 16, 2019 - 4:58 pm

A good way to handle negative variances is to change the model. For instance, you can hold the residual variances equal across time.

Es Maths posted on Sunday, March 22, 2020 - 5:48 pm

Hello. I am working on a linear growth model with three time points, and I have nine time-invariant predictors to add to the model in order to predict growth and initial status. When I add all predictors to the model in one step, the model fit indices are not good (e.g., RMSEA = 0.118, CFI = .896). Is there any approach, such as adding predictors one at a time, to find out which predictor worsens the model fit?

Thanks.

Bengt O. Muthen posted on Monday, March 23, 2020 - 5:06 pm

If your fit is good without the covariates, this means that some of them may have direct effects onto the outcomes. You can't identify all of them but you can regress the outcome at time 2 and time 3 on all the covariates with fixed zero slopes and look at the Modification indices to see which might be needed to free up - that tells you which covariate causes the misfit.

Es Maths posted on Tuesday, March 24, 2020 - 6:33 pm

Many thanks for your prompt reply. Yes, the model without covariates is good. Could you possibly check the code below to see whether I interpreted you correctly? And why can I not run this model, and why am I getting an 'unknown variables a-h' error? Thanks.

model:

i s | NL1@0 NL2@0.6 NL3@2;

NL3@0;

i on VSWM CSC FIE DC AS BRIEF COBS NUMOP NONVIQ;

s on VSWM CSC FIE DC AS BRIEF COBS NUMOP NONVIQ;

NL2 ON VSWM (a);
NL2 ON CSC (b);
NL2 ON FIE (c);
NL2 ON DC (d);
NL2 ON BRIEF (e);
NL2 ON COBS (f);
NL2 ON NUMOP (g);
NL2 ON NONVIQ (h);

NL3 ON VSWM (j);
NL3 ON CSC (k);
NL3 ON FIE (l);
NL3 ON DC (m);
NL3 ON BRIEF (n);
NL3 ON COBS (o);
NL3 ON NUMOP (p);
NL3 ON NONVIQ (q);

a-h@0;
j-q@0;

output: stdyx;

MODINDICES (ALL);

Bengt O. Muthen posted on Wednesday, March 25, 2020 - 4:53 pm

a-h are parameter labels. Fixing them to zero needs to be done in Model Constraint. I think you should instead fix these parameters directly in the Model command.

Es Maths posted on Monday, March 30, 2020 - 5:03 pm

Thank you so much. I managed to fix these parameters directly in the Model command, but I got 'No modification indices above the minimum value'. Is there any other way of finding out what causes the misfit or how I can improve the model fit (of a linear growth model with three time points and nine time-invariant covariates)?

Many thanks.

Bengt O. Muthen posted on Monday, March 30, 2020 - 5:08 pm

See our Short Course Topic 3 video and handout.

Es Maths posted on Tuesday, March 31, 2020 - 6:26 pm

Many thanks Dr Muthen.

(1) I assume that the suggestions are related to misfit in the unconditional model. However, I have a good unconditional model. So, would it be okay to free the time scores after adding the covariates?

(2) Unfortunately, none of the suggestions (i.e., freeing factor scores or adding covariances) works to improve model fit. I am sure the problem is related to one of the covariates, but I cannot pinpoint it.

Any suggestion would be very much appreciated. Thanks.

Bengt O. Muthen posted on Wednesday, April 01, 2020 - 4:06 pm

Not sure the modindices suggest freeing the time scores when adding the covariates. But when the fit deteriorates when adding time-invariant covariates, it is usually the covariances between the covariates and the outcomes that are the source of the problem.

Es Maths posted on Wednesday, April 01, 2020 - 4:43 pm

Thank you so much again. I understand your point and have been investigating the potential sources of the misfit for a few weeks.

Modindices suggest only 'BRIEF WITH NL1' (BRIEF is one of the covariates, and NL1 is the outcome at the first time point). However, applying this suggestion makes the model worse.

Without BRIEF and with the rest of the covariates, the model fits satisfactorily. I believe that the BRIEF measure is problematic but cannot justify this based on MODINDICES.

Thanks.

Es Maths posted on Wednesday, April 08, 2020 - 5:11 pm

Thank you, Dr Muthen, for your last reply. I wonder if there is any resource of yours addressing the point that 'when the fit deteriorates when adding time-invariant covariates, it is usually the covariances between the covariates and the outcomes that are the source of the problem'. Many thanks.

Bengt O. Muthen posted on Thursday, April 09, 2020 - 4:11 pm

No, I don't have anything written on this. You may want to ask on SEMNET.