Anonymous posted on Tuesday, December 20, 2005 - 1:59 pm
I am working on a paper that uses growth mixture modeling and I have a few questions. They are as follows:
1) When I move from a 2 class model to a 3 class model I get a non-positive definite matrix error. Should this be taken as an indication that 2 classes should be considered more heavily than the 3 class model?
2) If this does not mean that 2 classes are the proper option, is it possible to specify different models for each class? That is, class 1 and class 2 are linear but class 3 is quadratic. Is this possible to specify in Mplus? If so, is there an example of the syntax in the manual that you can direct me to?
Thank you in advance for your help.
bmuthen posted on Tuesday, December 20, 2005 - 3:51 pm
1) Extracting 3 classes may give you a negative variance for a growth factor which would indicate that for this growth factor there is no within-class variation left so the factor should be fixed rather than random.
2) Yes, you can let the model be different in different classes. Setting say
because you may still have a non-zero quadratic growth factor mean.
For other suggestions, see my 2004 chapter in the Kaplan handbook on our web site.
Anonymous posted on Wednesday, December 21, 2005 - 5:55 pm
Thank you very much. This helps a lot!
Anonymous posted on Thursday, January 05, 2006 - 11:09 am
I would like to ask a follow-up question on the issue of different models in different classes. If the model is specified as linear in the overall statement (e.g., int slope), how would the syntax be written so that group 1 is linear (e.g., int slope) and group 2 is quadratic (e.g., int slope quad)? Please keep in mind that the overall model is linear (i.e., int slope).
I ask because your reply appears to assume that the model in the overall statement was quadratic rather than linear. This is rather perplexing yet exciting for me. Thank you in advance for your help.
You have to specify the model with the most parameters in the %OVERALL% part of the model. Then you change this in the class-specific model parts. So in Class 1, if you want a linear model, you fix the parameters for the quadratic growth factor to zero.
Andy Ross posted on Tuesday, April 03, 2007 - 5:03 am
Dear Prof Muthen
I am attempting to estimate a simple latent class model with 4 indicators (var levels: 2 2 2 4) and 4 latent classes. The model has 4 degrees of freedom and should be identified however I am getting a non-positive definite matrix error.
My condition index is 0.638D-15. The problematic parameter identified has a high, but not inconceivably high, threshold: 13.728. However, another threshold has an SE of 54.834 (cf. a range of 0.152 to 1.987 for the others), and in addition, separate runs give the same log-likelihood with very slightly different parameter estimates, which supports non-identification.
However, why should this model be non-identified given the degrees of freedom?
Andy Ross posted on Tuesday, April 03, 2007 - 7:15 am
As a follow up to my last post - I was able to estimate the model without error by setting the parameter with the high SE at the value given in the original estimate. Is this a viable/acceptable solution?
Extreme parameter estimates can cause a singular information matrix, followed by the non-identification message. The extreme value may be due to a probability going towards 0 or 1. Typically Mplus fixes such parameters at the high value and thereby removes it from the information matrix calculation; for some reason this did not happen in your case. So this may not be a truly undefined parameter. Fixing the parameter at the high value is typically ok. But to be sure that this is not an unidentified model you could send your input, output, data, and license number to firstname.lastname@example.org.
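To illustrate why a threshold of this size points to a probability at the boundary, here is a minimal Python sketch. It assumes a binary indicator with the usual logit parameterization, P(u = 1 | class) = 1 / (1 + exp(threshold)); the function name is made up for illustration and is not part of Mplus.

```python
import math

def prob_upper_category(threshold: float) -> float:
    """P(u = 1 | class) for a binary item, assuming P = 1 / (1 + exp(threshold))."""
    return 1.0 / (1.0 + math.exp(threshold))

# A moderate threshold versus the one reported in the post above.
for tau in (0.0, 2.0, 13.728):
    p = prob_upper_category(tau)
    print(f"threshold {tau:7.3f} -> P(u = 1 | class) = {p:.3e}")
```

Under this assumption, a threshold of 13.728 corresponds to a probability of roughly one in a million, which is effectively zero, so the data carry almost no information about its exact value.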
Note that the 4-indicator, 3-class model with binary indicators is known to have 1 df but still be non-identified. See Goodman (1974). You have 4 classes, but have a chance at identification because one of the items has 4 categories.
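The degrees of freedom quoted in this exchange can be checked with a short counting sketch in Python. It assumes an unrestricted latent class model, so the free parameters are the class probabilities plus one threshold per non-reference category per class; the function name is hypothetical. A positive df is necessary for identification but, as the Goodman example shows, not sufficient.

```python
from math import prod

def lca_degrees_of_freedom(levels, n_classes):
    """Return (free cells, free parameters, df) for an unrestricted latent class model.

    levels    -- number of categories of each indicator
    n_classes -- number of latent classes
    """
    free_cells = prod(levels) - 1                          # independent cell proportions
    class_params = n_classes - 1                           # latent class probabilities
    item_params = n_classes * sum(k - 1 for k in levels)   # thresholds within each class
    n_params = class_params + item_params
    return free_cells, n_params, free_cells - n_params

print(lca_degrees_of_freedom([2, 2, 2, 4], 4))  # (31, 27, 4)
print(lca_degrees_of_freedom([2, 2, 2, 2], 3))  # (15, 14, 1)
```

The first call reproduces the 4 df of the model with indicator levels 2 2 2 4 and 4 classes; the second reproduces the 1 df of the 4-indicator, 3-class binary model.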
Assume that I want to check up to 3 classes. Testing 2 classes leads to significant intercept and slope variances, but testing 3 classes leads to an insignificant negative slope variance. Should I report the fit statistics of all these solutions and take the 3-class solution (it fitted best) with the slope variance restricted to 0? Or should I go back and re-estimate the 2-class model with only the intercept variance, and compare that 2-class solution against the 3-class one?
In other words: during the class-finding process, should one stick to a single (co)variance structure, or may the structure vary across class solutions (e.g., in the case of a non-positive definite psi matrix)? I am concerned because restrictions lead to a better BIC, and I don't want to favour a class solution only because of restrictions that are not implemented in the other class solution. Thank you!
I don't see why one would use the same covariance structure across different number of classes. Adding classes it is natural that some additional variance is absorbed by the classes so that for example there is no longer any within-class variance of a slope. I would fix such a slope variance at zero and then use BIC to compare across classes.
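As a sketch of the comparison described here, the following Python snippet computes BIC = -2 logL + p ln(n) for two competing solutions. The log-likelihoods, parameter counts, and sample size are hypothetical values chosen only for illustration; in practice they would be read from each model's output.

```python
import math

def bic(loglik: float, n_params: int, n_obs: int) -> float:
    """Schwarz BIC: -2 * logL + p * ln(n); smaller values are preferred."""
    return -2.0 * loglik + n_params * math.log(n_obs)

# Hypothetical numbers, for illustration only (not from any real run).
n_obs = 500
candidates = {
    "2 classes, random intercept and slope": bic(-4210.0, 12, n_obs),
    "3 classes, slope variance fixed at 0": bic(-4185.0, 14, n_obs),
}

best = min(candidates, key=candidates.get)
for label, value in candidates.items():
    print(f"{label}: BIC = {value:.1f}")
print("lowest BIC:", best)
```

Fixing a variance at zero lowers the parameter count p, so the penalty term shrinks; the log-likelihood term still has to carry the comparison between class solutions.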
That sounds very logical, thank you. Following this logic, it would be very unlikely (impossible!?) that this slope variance (held equal across classes) is significant with, let's say, 4 classes when it already became insignificant with 3 classes!?
LMRT and BLRT are also comparable in cases of different covariance structures, I guess!?
I haven't seen LMR and BLRT done when the competing models differ in terms of both number of classes and number of random effects. Mplus does not do this, but holds the random effects the same in the k-1 and k-class runs.
So, as far as I understand, BIC is the only statistical criterion for deciding on the number of classes when the covariance structure changes across class solutions? I know a paper in which the authors used LMR and fixed an error variance of a growth parameter to zero in one class. But I can imagine more cases. When analyzing a single class you often get significant slope variances which get lost in LGMM, as in my case. Thank you for your time!
Referring to the first part of my post of Tuesday, 29 July, 9:19 am: does this also hold for an insignificant negative residual slope variance found with 2 classes (I then fixed it to zero)? Should I also fix it to zero before computing the following class solutions (3 and 4 classes)? I ask this beforehand because it takes one day on my PC to compute a model. I hope this is my last question :-)
Sue Lee posted on Monday, February 14, 2011 - 7:25 pm
Dear Dr. Muthen,
Running a maximum 2-order saturated structural model with the following command, VARIABLE: USEVARIABLE = mitem1-mitem38; CATEGORICAL = mitem1-mitem38; CLASSES = c(16); ANALYSIS: TYPE = MIXTURE MISSING; STARTS = 0; PROCESSORS = 4;
I got the following warning message. THE STANDARD ERRORS OF THE MODEL PARAMETER ESTIMATES MAY NOT BE TRUSTWORTHY FOR SOME PARAMETERS DUE TO A NON-POSITIVE DEFINITE FIRST-ORDER DERIVATIVE PRODUCT MATRIX. THIS MAY BE DUE TO THE STARTING VALUES BUT MAY ALSO BE AN INDICATION OF MODEL NONIDENTIFICATION. THE CONDITION NUMBER IS -0.178D-17. PROBLEM INVOLVING PARAMETER 106.
My question is how we can know whether this is a non-identified case or just a poor starting value. I was advised to turn on multiple random starts instead of fixing the option at 0 (STARTS = 0). Would that be a good follow-up after getting this warning? Any of your expertise on this matter is greatly appreciated. Thanks very much!