Mplus Discussion >> First-order derivative product matrix in GMMs

Anonymous posted on Saturday, December 20, 2003 - 3:44 pm

Hello. In running a latent variable growth mixture model, I get an error message that includes the note that the solution returned "A NON-POSITIVE DEFINITE FIRST-ORDER DERIVATIVE PRODUCT MATRIX". Is it true that I cannot trust the standard errors from such a solution? But can I interpret the parameter estimates? I guess I'm wondering how improper this solution is.
Thanks much.

bmuthen posted on Saturday, December 20, 2003 - 4:04 pm

This message is generally a very good check for non-identification. In some cases, however, the MLR SEs that come out - if they do - are trustworthy. For example, if a binary y is treated as continuous you get a singularity between the mean and the variance, but the MLR SEs may be ok. You can do a Monte Carlo simulation to check.

If a model is truly not identified, the parameter estimates should not be interpreted.
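The Monte Carlo check mentioned in this reply can be sketched outside Mplus. The following is a minimal illustration only, not the Mplus MONTECARLO facility: it assumes the simplest possible case, a single binary y treated as continuous, and uses the usual standard error of a mean, sqrt(s2/n), as a stand-in for the SE the program would report. The function name and the values of p, n, and reps are hypothetical choices.

```python
import math
import random

def simulate_se_check(p=0.3, n=200, reps=2000, seed=0):
    """Compare the average analytic SE of the mean with the empirical SD of the mean."""
    rng = random.Random(seed)
    means, ses = [], []
    for _ in range(reps):
        # Binary outcome that is then treated as if it were continuous.
        y = [1.0 if rng.random() < p else 0.0 for _ in range(n)]
        m = sum(y) / n
        s2 = sum((v - m) ** 2 for v in y) / n   # ML variance estimate
        means.append(m)
        ses.append(math.sqrt(s2 / n))           # analytic SE of the mean (assumed stand-in)
    grand = sum(means) / reps
    empirical_sd = math.sqrt(sum((m - grand) ** 2 for m in means) / (reps - 1))
    return sum(ses) / reps, empirical_sd

avg_se, emp_sd = simulate_se_check()
ratio = avg_se / emp_sd
print(f"average SE = {avg_se:.4f}, empirical SD = {emp_sd:.4f}, ratio = {ratio:.3f}")
```

A ratio close to 1 suggests that, in this toy setting, the reported SE tracks the true sampling variability even though the mean and variance of a binary variable are tied together.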

Anonymous posted on Wednesday, December 24, 2003 - 10:58 am

Thanks for the reply Dr. Muthen. What puzzles me is that there are other solutions that converge just fine. However, these other converged solutions had a worse loglikelihood than this solution with the warning. So should I discard this better solution because of the warning in favor of a solution with a worse loglikelihood? Or should I suspect that the model is in general unidentified despite the other proper solutions?
Thanks so much.

bmuthen posted on Thursday, December 25, 2003 - 2:10 pm

When you say that you have "other solutions that converge just fine", do you mean for the same model using other starting values, or do you mean other model variations?

Anonymous posted on Saturday, December 27, 2003 - 9:20 pm

I mean that other solutions converge just fine for the same model using other starting values.

bmuthen posted on Sunday, December 28, 2003 - 3:16 pm

This may indicate that your data do not support a model with this many classes or this parametric structure. I would not choose a solution with worse likelihood, but change the model. Note also the part in my first response:

"if a binary y is treated as continuous you get a singularity between the mean and the variance, but the MLR SEs may be ok. "

bmuthen posted on Sunday, December 28, 2003 - 4:23 pm

Another reason you might get non-identification is that one class collapses, so that there is almost nobody in it (check the class count section in the output) - this is an indication to reduce the number of classes.

If you like, you can also send your input, output and data to Mplus support.

Peter Martin posted on Friday, May 05, 2006 - 9:32 am

Hello,
I have come across a problem similar to that of "Anonymous" in the exchange above. I am trying to estimate an LCA with c=6 for five indicators and three covariates. Both the latent categorical variable and the indicators are regressed on the covariates. With random starts, I get a solution that converges fine. However, using specific starting values, I can achieve a substantially different solution with a lower loglikelihood - but with this solution, I get the same warning as "Anonymous" (NON-POSITIVE DEFINITE FIRST-ORDER DERIVATIVE PRODUCT MATRIX). The warning identifies a problem parameter: it's a threshold that's estimated to be 16.679. Moreover, the same warning gives a condition number (0.146D-12) - which I don't know how to interpret.

1. What is the condition number, and what does the value 0.146D-12 indicate?

2. Do my results indicate that a model with fewer than 6 classes might be appropriate? Or are there other changes to the model that I could introduce to make a six-class model viable?

Bengt O. Muthen posted on Friday, May 05, 2006 - 6:24 pm

First, a couple of general comments.

I assume that you don't regress all indicators on the covariates - that is not an identified model.

You say you get a lower loglikelihood - but you want a higher one (max).

1. Thresholds that go large like that are harmless causes of the non-pos def message. With large thresholds, the information matrix estimate obtained by the first-order derivative approach can be numerically determined as singular. Degree of singularity is measured by the condition number, which is the ratio of the smallest to largest eigenvalue of the info matrix estimate. You don't want a very small condition number and 0.146D-12 is very small, very close to exactly zero in machine numerical precision terms.

2. Not necessarily. A large threshold of 16 simply indicates that for this class, this item has value 1 instead of 0 (assuming a binary item). Such items are useful for defining classes. It seems strange, however, that the automatic fixing of large thresholds does not come into play for your run. If you are using Version 4.0 and don't get SEs for this solution, please send your input, output, data, and license number to support@statmodel.com.
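To make the condition number in point 1 concrete, here is a small sketch. It reads the Fortran-style "D" exponent that appears in the warning and computes the ratio of the smallest to the largest eigenvalue for a symmetric 2x2 matrix. The helper names are hypothetical, and the 2x2 restriction is only for illustration; the thread does not show how Mplus computes the value internally.

```python
import math

def parse_fortran_double(text):
    """Convert Fortran-style D-exponent notation such as '0.146D-12' to a float."""
    return float(text.strip().upper().replace("D", "E"))

def condition_number_2x2(a, b, c):
    """Smallest/largest eigenvalue of the symmetric matrix [[a, b], [b, c]]."""
    half_trace = (a + c) / 2.0
    radius = math.hypot((a - c) / 2.0, b)
    return (half_trace - radius) / (half_trace + radius)

reported = parse_fortran_double("0.146D-12")
print(reported)                              # 1.46e-13
print(condition_number_2x2(2.0, 0.0, 1.0))   # 0.5, well conditioned
print(condition_number_2x2(1.0, 1.0, 1.0))   # 0.0, exactly singular
```

Read this way, 0.146D-12 is 1.46e-13, which is close to the limit of double-precision arithmetic relative to a largest eigenvalue of 1.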

Peter Martin posted on Tuesday, May 09, 2006 - 2:49 am

I realize I have to understand some basics. I did try to regress all indicators on the covariates. Why is this model not identifiable? Given that my 5 indicators each have 3 or 4 categories, I thought I had up to 3*3*3*4*4 - 1 = 431 independent pieces of information. In my attempted 6-class model, I was estimating 107 parameters - for this reason, I assumed that it's identifiable. I'd be grateful for any hints about where my thinking is flawed, or for a reference to a good introduction to the issue of identifiability in latent class models. I'm interested in understanding this thoroughly. Moreover, my immediate practical concerns are these: Can I achieve identifiability by introducing restrictions, or by reducing the number of classes?

Bengt O. Muthen posted on Tuesday, May 09, 2006 - 5:47 am

Having more pieces of information than parameters is only a necessary, not a sufficient, condition for identification. Your model regresses the latent class variable on the covariates. In addition to that you try to get "direct effects" by regressing each latent class indicator on the covariates. That is not identified. Think of the information that contributes to those estimates - it is the regression of each indicator on all covariates. Say that you have p indicators and q covariates giving p*q slopes. These slopes can't be divided up into both p*q direct effects and regression slopes for the latent class variable on the covariates. You can have some direct effects, but not all.
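The counting argument in this exchange can be written out as a short sketch. It assumes one slope per indicator per covariate and (classes - 1) slopes per covariate for the latent class regression; both are simplifying assumptions for illustration, and the function names are hypothetical.

```python
from math import prod

def contingency_table_cells(categories):
    """Independent cell probabilities in the full cross-classification."""
    return prod(categories) - 1

def slope_count(p, q, n_classes):
    """Slopes the data can inform vs. slopes requested with all direct effects."""
    available = p * q                         # indicator-on-covariate slopes
    requested = p * q + (n_classes - 1) * q   # direct effects + class regressions
    return available, requested

print(contingency_table_cells([3, 3, 3, 4, 4]))  # 431
print(slope_count(p=5, q=3, n_classes=6))        # (15, 30)
```

Having far more cells than parameters does not help here: the 15 available indicator-on-covariate slopes cannot be split into 30 requested ones, which is why only some direct effects can be included.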

Yu Kyoum Kim posted on Friday, June 22, 2007 - 8:00 am

Dear Drs. Muthen

Is the condition number the same as the determinant of the input matrix?

Thank you so much in advance!

Bengt O. Muthen posted on Friday, June 22, 2007 - 9:12 am

No, the condition number that Mplus prints is the ratio of the smallest to the largest eigenvalue of the estimated information matrix. The smaller it is, the closer the matrix is to being singular, that is, the closer the model is to not being identified.

The singularity of the sample statistics covariance matrix is evaluated separately.

Yu Kyoum Kim posted on Friday, June 22, 2007 - 10:33 am

Thank you so much!
Then, can I obtain the determinant of the input matrix using Mplus?

Linda K. Muthen posted on Friday, June 22, 2007 - 10:54 am

No, this is not available.

Ben Chapman posted on Friday, August 31, 2007 - 3:12 pm

I am interested in the extent to which outliers in a latent profile model produce a non-positive definite first-order derivative product matrix at large numbers of classes.

I don't have McLachlan & Peel in front of me but I believe they offer some cautions about outliers in mixtures of normal distributions.

What happens is that the outliers tend to form one or more of their own tiny classes (the "collapsing classes" mentioned above). The parameters where the SEs are not trustworthy are in these outlier classes.

So I am assuming the small classes produced by the outlier are introducing non-identification.

My intuition is to say the model can't estimate, say, 5 class-specific means, 5 variances, and a class proportion from only, say, 2 observations--but I am not sure this is technically correct, because isn't other information in the sample used to some degree to estimate these parameters?

I don't plan on retaining this model and it is easily estimable without outliers; I am just curious about the possible effect of outliers on normal mixtures.

Bengt O. Muthen posted on Friday, August 31, 2007 - 6:53 pm

Small classes can produce non-identification if the number of parameters specific to such a class exceeds the number of people in the class. This would produce the first-derivative-product-based non-identification message. Class-specific parameters draw only on the information from the people in that class. So you can have a mean parameter for 1 outlying person, but not also a variance parameter specific to this person.
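The rule of thumb in this reply (class-specific parameters should not outnumber the people in the class) can be checked with a few lines. This is a sketch under stated assumptions: the class counts and parameter counts below are made-up numbers echoing the "5 means, 5 variances, 2 observations" example, and the function name is hypothetical.

```python
def overparameterized_classes(class_counts, params_per_class):
    """Return indices of classes whose class-specific parameters outnumber members."""
    return [
        k for k, (n_k, p_k) in enumerate(zip(class_counts, params_per_class))
        if p_k > n_k
    ]

# Hypothetical 3-class latent profile solution: the last class holds 2 outliers
# but carries 5 means and 5 variances of its own.
counts = [240, 158, 2]
params = [10, 10, 10]
print(overparameterized_classes(counts, params))  # [2]
```

In practice the counts would come from the class count section of the output, as suggested earlier in the thread.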

Sung Joon Jang posted on Wednesday, April 20, 2011 - 10:16 am

I got a message about "A NON-POSITIVE DEFINITE FIRST-ORDER DERIVATIVE PRODUCT MATRIX," while it says, "THE MODEL ESTIMATION TERMINATED NORMALLY." Should I not trust the standard errors of the model parameter estimates even though I got a converged solution?

Bengt O. Muthen posted on Wednesday, April 20, 2011 - 10:41 am

You most likely can, but it depends on the setting. Please send your output to support.

jas229 posted on Friday, May 04, 2012 - 2:43 pm

Hello,

I ran a model testing a potential interaction among latent variables using TYPE=RANDOM, ALGORITHM=INTEGRATION, and the XWITH approach for creating latent variable interactions. I obtained the error message "THE STANDARD ERRORS OF THE MODEL PARAMETER ESTIMATES MAY NOT BE TRUSTWORTHY FOR SOME PARAMETERS DUE TO A NON-POSITIVE DEFINITE FIRST-ORDER DERIVATIVE PRODUCT MATRIX." The model estimation terminated normally though, and I am modeling some dichotomous variables. Can I trust these standard errors given that (based on your replies above) the dichotomous variable issue sometimes causes this error message to arise?

Thanks in advance for your time and consideration.

Linda K. Muthen posted on Friday, May 04, 2012 - 2:48 pm

You would need to test that the dichotomous variable is the problem by removing it and seeing if the message disappears.

jas229 posted on Friday, May 04, 2012 - 4:46 pm

Dear Dr. Muthen,

Thank you for your prompt reply. Removing the dichotomous variables did make the error message disappear. Does this mean that the standard errors in the original output should be trustworthy?

Thank you again for your help.

Linda K. Muthen posted on Friday, May 04, 2012 - 6:01 pm

Yes. Then the message was generated because the mean and the variance of a dichotomous item are not orthogonal.
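Why a dichotomous item triggers the message can be seen in a toy case. The sketch below assumes a plain normal model with a free mean and variance fitted to one variable, and builds the first-order derivative product matrix as the sum of outer products of the per-observation scores at the ML estimates. For a 0/1 variable there are only two distinct score vectors, and because the scores sum to zero at the ML estimates those two vectors are collinear, so the matrix is singular. The data values and function names are hypothetical.

```python
def normal_scores(y, mu, s2):
    """Per-observation derivatives of the normal log-likelihood w.r.t. (mu, s2)."""
    d = y - mu
    return (d / s2, -0.5 / s2 + d * d / (2.0 * s2 * s2))

def opg_matrix(data):
    """Sum of outer products of the scores, evaluated at the ML estimates."""
    n = len(data)
    mu = sum(data) / n
    s2 = sum((v - mu) ** 2 for v in data) / n
    a = b = c = 0.0
    for v in data:
        g1, g2 = normal_scores(v, mu, s2)
        a += g1 * g1
        b += g1 * g2
        c += g2 * g2
    return a, b, c  # the symmetric matrix [[a, b], [b, c]]

def relative_determinant(a, b, c):
    """Determinant scaled by the product of the diagonal; 0 means singular."""
    return (a * c - b * b) / (a * c)

binary = [1.0] * 3 + [0.0] * 7
continuous = [0.1, 0.5, 0.9, 1.4, 2.0, 2.2, 3.1]

print(relative_determinant(*opg_matrix(binary)))      # zero up to rounding error
print(relative_determinant(*opg_matrix(continuous)))  # clearly positive
```

The singularity here reflects the fact that the variance of a binary variable is a function of its mean, not a lack of information about the mean itself.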

Ari J Elliot posted on Thursday, January 08, 2015 - 9:08 am

Hello Drs. Muthen,

I have a model with 3 latent and 1 observed DV and 4 covariates (2 binary, 2 continuous).

When I run the model, with either ML or MLR estimation, the model converges but I receive a message that "THE STANDARD ERRORS OF THE MODEL PARAMETER ESTIMATES MAY NOT BE TRUSTWORTHY FOR SOME PARAMETERS DUE TO A NON-POSITIVE DEFINITE FIRST-ORDER DERIVATIVE PRODUCT MATRIX."

The message indicates that the problem is with 1 continuous covariate and 1 binary covariate (when I remove these from the model I no longer receive the message).

I can find no reason for non-identification, the results are quite sensible, and when I ran the model in another SEM program to check I didn't receive any error message.

I would like to be sure that I can indeed trust the SEs. Any guidance would be appreciated.

Linda K. Muthen posted on Thursday, January 08, 2015 - 11:13 am

If you have brought the binary covariate into the model by mentioning its mean, variance, or covariance with another variable, you will get that message because the mean and variance of a binary variable are not orthogonal. The message can be ignored if this is the reason for it.

Rongfang Jia posted on Friday, October 07, 2016 - 1:30 pm

Hi, I ran a multi-level multi-indicator latent growth model:
MODEL: %within%
f1w BY CELF1C*(1)PLN1C(2)PP1C PWPA1W(4); f2w BY CELF3C(1)PLN3C(2)PP3C(3)PWPA3W(4); f3w BY CELF4C*(1)PLN4C(2)PP4C(3)PWPA4W(4);
iw sw | f1w@0 f2w@1 f3w@2;
%between%
f1b BY CELF1C*(5)PLN1C(6)PP1C(7)PWPA1W(8);
f2b BY CELF3C*(5)PLN3C(6)PP3C(7)PWPA3W(8);
f3b BY CELF4C*(5)PLN4C(6)PP4C(7)PWPA4W(8);
ib sb | f1b@0 f2b@1 f3b@2;

I got two warnings: 1) A NON-POSITIVE DEFINITE FIRST-ORDER DERIVATIVE PRODUCT MATRIX. 2) THE LATENT VARIABLE COVARIANCE MATRIX IS NOT POSITIVE DEFINITE.
When comparing my model with Example 9.15 in the User's Guide, I noticed that the example 1) sets the cross-level loadings equal, 2) uses WLSM, 3) doesn't use the fixed-factor method of scaling as I did, and 4) sets the between-level intercept growth factor to zero and holds the residual variances of the factors equal over time. Could any of these differences be the reason for the warnings?

Bengt O. Muthen posted on Friday, October 07, 2016 - 4:40 pm

Labels like your 1, 2, and 4 in

CELF1C*(1)PLN1C(2)PP1C PWPA1W(4);

need to be separated by a semicolon or be on separate lines.

A growth model with multiple indicators needs to have intercept invariance, not only loading invariance. See UG pages 687-692 for how to parameterize your model.

Sophie Dan posted on Saturday, April 22, 2017 - 9:44 pm

Dr. Muthen,

Hello! I ran a two-level CFA and got a warning like this: "MAXIMUM LOG-LIKELIHOOD VALUE FOR THE UNRESTRICTED (H1) MODEL IS -28091.078

THE STANDARD ERRORS OF THE MODEL PARAMETER ESTIMATES MAY NOT BE
TRUSTWORTHY FOR SOME PARAMETERS DUE TO A NON-POSITIVE DEFINITE
FIRST-ORDER DERIVATIVE PRODUCT MATRIX. THIS MAY BE DUE TO THE STARTING
VALUES BUT MAY ALSO BE AN INDICATION OF MODEL NONIDENTIFICATION. THE
CONDITION NUMBER IS -0.314D-16. PROBLEM INVOLVING PARAMETER 45.

THE NONIDENTIFICATION IS MOST LIKELY DUE TO HAVING MORE PARAMETERS THAN THE
NUMBER OF CLUSTERS. REDUCE THE NUMBER OF PARAMETERS.

THE MODEL ESTIMATION TERMINATED NORMALLY
"
But I also get all the results I need. Does the warning matter, or can I ignore it?

Another question: if I run a two-level EFA and get a negative residual variance, how can I make it positive?

Thank you! Any help from you will be greatly appreciated!

Bengt O. Muthen posted on Tuesday, April 25, 2017 - 5:37 pm

Use TECH1 to check that you don't have more between-level parameters than the number of clusters.

EFA with negative residual variances often suggests that too many factors have been extracted.

Sophie Dan posted on Wednesday, April 26, 2017 - 1:23 am

Thanks very much for your reply!

By "between-level parameters", do you mean the m*(m+1)/2 parameters? If the number of variables I use at the within level is the same as at the between level, should the number of estimated parameters at the within and between levels be the same?

And in terms of EFA with a negative residual variance: although it indicates that I should extract fewer factors, if the solution meets the requirements of theory and the factor correlations do not reach the critical value suggesting that the factors should be combined into one, can I just ignore the negative residual variance, or is the solution inadmissible?

Thank you again!!

Bengt O. Muthen posted on Wednesday, April 26, 2017 - 2:08 pm

By between-level parameters I mean the parameters that are specific to the between-level - this is clear from TECH1.

As for your other questions, see my general answer.

Diana Chirinos Medina posted on Tuesday, July 25, 2017 - 12:30 pm

Dear Dr. Muthen,

I am running a CFA with a clustering variable and I get the following error:

THE STANDARD ERRORS OF THE MODEL PARAMETER ESTIMATES MAY NOT BE
TRUSTWORTHY FOR SOME PARAMETERS DUE TO A NON-POSITIVE DEFINITE
FIRST-ORDER DERIVATIVE PRODUCT MATRIX. THIS MAY BE DUE TO THE STARTING
VALUES BUT MAY ALSO BE AN INDICATION OF MODEL NONIDENTIFICATION. THE
CONDITION NUMBER IS -0.530D-17. PROBLEM INVOLVING THE FOLLOWING PARAMETER:
Parameter 48.

THIS IS MOST LIKELY DUE TO HAVING MORE PARAMETERS THAN THE NUMBER
OF CLUSTERS MINUS THE NUMBER OF STRATA WITH MORE THAN ONE CLUSTER.

In fact, the number of observations in some clusters is quite small. I saw you responded to a similar message with the following:

Linda K. Muthen posted on Friday, June 24, 2011 - 4:50 pm
The number of clusters is the number of independent observations in your data set. The warning is telling you that you have more parameters than you have independent observations. The impact of this on the results has not been studied. This is simply a warning.

Does this mean that this is just a warning and I can interpret the results?
Or are the results not trustworthy?

Please let me know.
Thank you for your reply.
Best,
Diana

Bengt O. Muthen posted on Tuesday, July 25, 2017 - 6:07 pm

Linda's answer is still good. I would just add that you can see how many cluster-level parameters you have - if that is considerably lower than the number of clusters you may be ok. But only a simulation study would be able to tell (and so far nobody seems to jump on doing this study which we have advertised for years).
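One way to see why the warning ties the matrix to the number of clusters: if each cluster contributes one score vector, the sum of their outer products cannot have rank higher than the number of clusters. The sketch below uses made-up score vectors for a 3-parameter model; it is a toy illustration of that rank argument, not a reproduction of the Mplus computation, and the function names are hypothetical.

```python
def outer_product_sum(score_vectors):
    """Sum of outer products g g^T over cluster-level score vectors."""
    dim = len(score_vectors[0])
    total = [[0.0] * dim for _ in range(dim)]
    for g in score_vectors:
        for i in range(dim):
            for j in range(dim):
                total[i][j] += g[i] * g[j]
    return total

def det3(m):
    """Determinant of a 3x3 matrix by cofactor expansion."""
    return (
        m[0][0] * (m[1][1] * m[2][2] - m[1][2] * m[2][1])
        - m[0][1] * (m[1][0] * m[2][2] - m[1][2] * m[2][0])
        + m[0][2] * (m[1][0] * m[2][1] - m[1][1] * m[2][0])
    )

# Hypothetical score vectors: 3 parameters, first from 2 clusters, then from 3.
two_clusters = [[1.0, 2.0, -1.0], [0.0, 1.0, 3.0]]
three_clusters = two_clusters + [[2.0, 0.0, 1.0]]

print(det3(outer_product_sum(two_clusters)))    # 0.0
print(det3(outer_product_sum(three_clusters)))  # 225.0
```

With two clusters and three parameters the matrix is singular regardless of the values chosen, which matches the reading that the clusters, not the individual observations, are the independent units.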

Qiong Wu posted on Tuesday, September 26, 2017 - 9:06 pm

Hi Dr. Muthen,

I was running an SEM model with 183 participants. The model was fine. For some reason I need to exclude 8 participants (leaving 175 participants in the sample). I ran the same model again and ran into the message "THE STANDARD ERRORS OF THE MODEL PARAMETER ESTIMATES MAY NOT BE TRUSTWORTHY FOR SOME PARAMETERS DUE TO A NON-POSITIVE DEFINITE FIRST-ORDER DERIVATIVE PRODUCT MATRIX. THIS MAY BE DUE TO THE STARTING VALUES BUT MAY ALSO BE AN INDICATION OF MODEL NONIDENTIFICATION. THE CONDITION NUMBER IS 0.425D-20. PROBLEM INVOLVING PARAMETER 58."

I am wondering, since the number of observed variances-covariances = [k(k + 1)] / 2, and I did not change the number of variables (k) or the number of estimated coefficients of the model, the model should not have an identification problem. Am I correct? What can the problem be?

Thank you!

Bengt O. Muthen posted on Wednesday, September 27, 2017 - 3:29 pm

Please send your output to Support along with your license number.