First-order derivative product matrix
Mplus Discussion > Latent Variable Mixture Modeling >
 Anonymous posted on Saturday, December 20, 2003 - 3:44 pm
Hello. In running a latent variable growth mixture model, I get an error message noting that the solution returned "A NON-POSITIVE DEFINITE FIRST-ORDER DERIVATIVE PRODUCT MATRIX". Is it true that I cannot trust the standard errors from such a solution? And can I still interpret the parameter estimates? I guess I'm wondering how improper this solution is.
Thanks much.
 bmuthen posted on Saturday, December 20, 2003 - 4:04 pm
It is generally a very good check of non-identification. In some cases, however, the MLR SEs that come out - if they do - are trustworthy. For example, if a binary y is treated as continuous you get a singularity between the mean and the variance, but the MLR SEs may be ok. You can do a Monte Carlo simulation to check.

If a model is truly not identified, the parameter estimates should not be interpreted.
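The binary-as-continuous singularity mentioned above can be made concrete with a minimal numerical sketch (assuming Python/NumPy; the data are simulated for illustration and are not from this thread). For 0/1 data fit by ML with a mean and a variance parameter, the ML variance estimate is an exact function of the mean estimate, so the two parameters are perfectly dependent and the first-order derivative product matrix is singular.

```python
import numpy as np

# Simulate a binary outcome (illustrative data, not from the thread).
rng = np.random.default_rng(0)
y = rng.binomial(1, 0.3, size=1000).astype(float)

m = y.mean()            # ML estimate of the mean
ml_var = y.var(ddof=0)  # ML estimate of the variance

# For 0/1 data these coincide exactly, not just approximately:
# E[y^2] - m^2 = m - m^2 = m * (1 - m).
print(m, ml_var, m * (1 - m))
assert np.isclose(ml_var, m * (1 - m))
```

Because the variance carries no information beyond the mean, the information matrix for (mean, variance) is rank-deficient, which is exactly the kind of singularity the warning flags.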
 Anonymous posted on Wednesday, December 24, 2003 - 10:58 am
Thanks for the reply, Dr. Muthen. What puzzles me is that there are other solutions that converge just fine. However, these other converged solutions had a worse loglikelihood than the solution with the warning. So should I discard this better solution because of the warning, in favor of a solution with a worse loglikelihood? Or should I suspect that the model is unidentified in general, despite the other proper solutions?
Thanks so much.
 bmuthen posted on Thursday, December 25, 2003 - 2:10 pm
When you say that you have "other solutions that converge just fine", do you mean for the same model using other starting values, or do you mean other model variations?
 Anonymous posted on Saturday, December 27, 2003 - 9:20 pm
I mean that other solutions converge just fine for the same model using other starting values.
 bmuthen posted on Sunday, December 28, 2003 - 3:16 pm
This may indicate that your data do not support a model with this many classes or this parametric structure. I would not choose a solution with worse likelihood, but change the model. Note also the part in my first response:

"if a binary y is treated as continuous you get a singularity between the mean and the variance, but the MLR SEs may be ok."
 bmuthen posted on Sunday, December 28, 2003 - 4:23 pm
Another reason you might get non-identification is that one class collapses, so that there is almost nobody in it (check the class count section in the output) - this is an indication to reduce the number of classes.

If you like, you can also send your input, output and data to Mplus support.
 Peter Martin posted on Friday, May 05, 2006 - 9:32 am
Hello,
I have run into a problem similar to the one "Anonymous" describes in the exchange above. I am trying to estimate an LCA with c=6 for five indicators and three covariates. Both the latent categorical variable and the indicators are regressed on the covariates. With random starts, I get a solution that converges fine. However, using specific starting values, I can reach a substantially different solution with a lower loglikelihood - but with this solution, I get the same warning as "Anonymous" (NON-POSITIVE DEFINITE FIRST-ORDER DERIVATIVE PRODUCT MATRIX). The warning identifies a problem parameter: a threshold estimated at 16.679. Moreover, the same warning gives a condition number (0.146D-12), which I don't know how to interpret.

1. What is the condition number, and what does the value 0.146D-12 indicate?

2. Do my results indicate that a model with fewer than 6 classes might be appropriate? Or are there other changes to the model that I could introduce to make a six-class model viable?
 Bengt O. Muthen posted on Friday, May 05, 2006 - 6:24 pm
First, a couple of general comments.

I assume that you don't regress all indicators on the covariates - that is not an identified model.

You say you get a lower loglikelihood - but you want a higher one (max).

1. Thresholds that go large like that are a harmless cause of the non-positive definite message. With large thresholds, the information matrix estimate obtained by the first-order derivative approach can be numerically singular. The degree of singularity is measured by the condition number, which is the ratio of the smallest to the largest eigenvalue of the information matrix estimate. You don't want a very small condition number, and 0.146D-12 is very small - very close to exactly zero in machine-precision terms.

2. Not necessarily. A large threshold of 16 simply indicates that for this class, this item has value 1 instead of 0 (assuming a binary item). Such items are useful for defining classes. It seems strange, however, that the automatic fixing of large thresholds does not come into play for your run. If you are using Version 4.0 and don't get SEs for this solution, please send your input, output, data, and license number to support@statmodel.com.
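The condition number described above can be illustrated with a small sketch (assuming Python/NumPy; the matrices are hypothetical stand-ins for an estimated information matrix, not output from Mplus). A nearly singular matrix gives a smallest-to-largest eigenvalue ratio close to machine zero, like the 0.146D-12 reported in the warning.

```python
import numpy as np

def mplus_style_condition(a):
    """Ratio of smallest to largest eigenvalue of a symmetric matrix."""
    eig = np.linalg.eigvalsh(a)  # eigenvalues in ascending order
    return eig[0] / eig[-1]

# Hypothetical information matrix estimates (diagonal for simplicity).
info_ok = np.diag([0.5, 1.0, 2.0])       # well-conditioned
info_bad = np.diag([1.0e-12, 1.0, 2.0])  # one near-zero eigenvalue

print(mplus_style_condition(info_ok))    # around 0.25: fine
print(mplus_style_condition(info_bad))   # tiny: effectively singular
```

A ratio near 1 means the eigenvalues are of similar size; a ratio near machine epsilon means one direction in the parameter space carries essentially no information, i.e. the model is close to unidentified.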
 Peter Martin posted on Tuesday, May 09, 2006 - 2:49 am
I realize I have to understand some basics. I did try to regress all indicators on the covariates. Why is this model not identifiable? Given that my 5 indicators each have 3 or 4 categories, I thought I had up to 3*3*3*4*4 - 1 = 431 independent pieces of information. In my attempted 6-class model, I was estimating 107 parameters - for this reason, I assumed that it's identifiable. I'd be grateful for any hints about where my thinking is flawed, or for a reference to a good introduction to the issue of identifiability in latent class models. I'm interested in understanding this thoroughly. Moreover, my immediate practical concerns are these: Can I achieve identifiability by introducing restrictions, or by reducing the number of classes?
 Bengt O. Muthen posted on Tuesday, May 09, 2006 - 5:47 am
Having more pieces of information than parameters is a necessary, but not sufficient, condition for identification. Your model regresses the latent class variable on the covariates. In addition, you try to get "direct effects" by regressing each latent class indicator on the covariates. That is not identified. Think of the information that contributes to those estimates - it is the regression of each indicator on all covariates. Say you have p indicators and q covariates, giving p*q slopes. These slopes can't be divided up into both p*q direct effects and regression slopes for the latent class variable on the covariates. You can have some direct effects, but not all.
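The counting behind this argument can be sketched in a few lines (a back-of-envelope heuristic only, not a formal identification proof; the numbers p=5, q=3, K=6 are taken from the model discussed above).

```python
# Rough counting sketch: the regressions of indicators on covariates
# supply p*q slopes of information, but the model asks those slopes to
# identify both all direct effects and the class-on-covariate slopes.
p, q, K = 5, 3, 6                # indicators, covariates, classes

information = p * q              # slopes of each indicator on each covariate
direct_effects = p * q           # every indicator regressed on the covariates
class_slopes = q * (K - 1)       # multinomial regression of class on covariates

print(information, direct_effects + class_slopes)
assert direct_effects + class_slopes > information  # more demands than supply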
 Yu Kyoum Kim posted on Friday, June 22, 2007 - 8:00 am
Dear Drs. Muthen

Is the condition number the same as the determinant of the input matrix?

Thank you so much in advance!
 Bengt O. Muthen posted on Friday, June 22, 2007 - 9:12 am
No, the condition number that Mplus prints is the ratio of the smallest to the largest eigenvalue of the estimated information matrix. The smaller it is, the closer the matrix is to being singular, that is, the closer the model is to not being identified.

The singularity of the sample statistics covariance matrix is evaluated separately.
 Yu Kyoum Kim posted on Friday, June 22, 2007 - 10:33 am
Thank you so much!
Then, can I obtain the determinant of the input matrix using Mplus?
 Linda K. Muthen posted on Friday, June 22, 2007 - 10:54 am
No, this is not available.
 Ben Chapman posted on Friday, August 31, 2007 - 3:12 pm
I am interested in the extent to which outliers in a latent profile model introduce a non-positive definite first-order derivative product matrix at large numbers of classes.

I don't have McLachlan & Peel in front of me, but I believe they offer some cautions about outliers in mixtures of normal distributions.

What happens is that the outliers tend to form one or more of their own tiny classes (the "collapsing classes" mentioned above). The parameters where the SEs are not trustworthy are in these outlier classes.

So I am assuming the small classes produced by the outlier are introducing non-identification.

My intuition says the model can't estimate, say, 5 class-specific means, 5 variances, and a class proportion from only 2 observations - but I am not sure this is technically correct, because isn't other information in the sample used to some degree to estimate these parameters?

I don't plan on retaining this model, and it is easily estimable without the outliers; I am just curious about the possible effect of outliers on normal mixtures.
 Bengt O. Muthen posted on Friday, August 31, 2007 - 6:53 pm
Small classes can produce non-identification if the number of parameters specific to such a class exceeds the number of people in the class. This would produce the first-derivative-product-based non-identification message. Class-specific parameters draw only on the information from the people in that class. So you can have a mean parameter for 1 outlying person, but not also a variance parameter specific to this person.
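The boundary case described above - a class holding a single outlier - can be sketched numerically (assuming Python/NumPy; the value 42.0 is an arbitrary illustrative outlier). A class-specific mean is estimable from one member, but the ML class-specific variance collapses to zero, so that parameter carries no information.

```python
import numpy as np

# Hypothetical class containing exactly one outlying observation.
outlier_class = np.array([42.0])

mean_hat = outlier_class.mean()      # estimable from one case
var_hat = outlier_class.var(ddof=0)  # ML variance from one case is zero

print(mean_hat, var_hat)
assert var_hat == 0.0                # variance sits on the boundary
```

Once class-specific parameters outnumber class members like this, the information matrix loses rank, which is exactly the first-derivative-product non-identification message.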
 Sung Joon Jang posted on Wednesday, April 20, 2011 - 10:16 am
I got a message about "A NON-POSITIVE DEFINITE FIRST-ORDER DERIVATIVE PRODUCT MATRIX," even though it also says, "THE MODEL ESTIMATION TERMINATED NORMALLY." Should I not trust the standard errors of the model parameter estimates even though I got a converged solution?
 Bengt O. Muthen posted on Wednesday, April 20, 2011 - 10:41 am
You most likely can, but it depends on the setting. Please send your output to support.
 jas229 posted on Friday, May 04, 2012 - 2:43 pm
Hello,

I ran a model testing a potential interaction among latent variables using TYPE=RANDOM, ALGORITHM=INTEGRATION, and the XWITH approach for creating latent variable interactions. I obtained the error message "THE STANDARD ERRORS OF THE MODEL PARAMETER ESTIMATES MAY NOT BE TRUSTWORTHY FOR SOME PARAMETERS DUE TO A NON-POSITIVE DEFINITE FIRST-ORDER DERIVATIVE PRODUCT MATRIX." The model estimation terminated normally though, and I am modeling some dichotomous variables. Can I trust these standard errors given that (based on your replies above) the dichotomous variable issue sometimes causes this error message to arise?

Thanks in advance for your time and consideration.
 Linda K. Muthen posted on Friday, May 04, 2012 - 2:48 pm
You would need to test that the dichotomous variable is the problem by removing it and seeing if the message disappears.
 jas229 posted on Friday, May 04, 2012 - 4:46 pm
Dear Dr. Muthen,

Thank you for your prompt reply. Removing the dichotomous variables did make the error message disappear. Does this mean that the standard errors in the original output should be trustworthy?

Thank you again for your help.
 Linda K. Muthen posted on Friday, May 04, 2012 - 6:01 pm
Yes. Then the message was generated because the mean and the variance of a dichotomous item are not orthogonal.