
Anonymous posted on Saturday, December 20, 2003 - 3:44 pm



Hello. In running a latent variable growth mixture model, I get an error message that includes the note that the solution returned "A NON-POSITIVE DEFINITE FIRST-ORDER DERIVATIVE PRODUCT MATRIX". Is it true that I cannot trust the standard errors from such a solution? But can I interpret the parameter estimates? I guess I'm wondering how improper this solution is. Thanks much.

bmuthen posted on Saturday, December 20, 2003 - 4:04 pm



The message is generally a very good check of nonidentification. In some cases, however, the MLR SEs that come out, if they do, are trustworthy. For example, if a binary y is treated as continuous you get a singularity between the mean and the variance, but the MLR SEs may be ok. You can do a Monte Carlo simulation to check. If a model is truly not identified, the parameter estimates should not be interpreted.
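The Monte Carlo check suggested here can be sketched outside Mplus. The Python below is only an illustration under stated assumptions, not what Mplus does internally: it draws binary data with a hypothetical true proportion, treats y as continuous, and uses sqrt(s2/n) as the SE of the mean (the value a sandwich-type robust SE for the mean of a normal model reduces to, used here as a stand-in for MLR). The names simulate_coverage, true_p, n, and reps are made up for the example.

```python
import math
import random


def simulate_coverage(true_p=0.3, n=200, reps=2000, seed=12345):
    """Share of replications whose 95% interval for the mean covers true_p."""
    rng = random.Random(seed)
    covered = 0
    for _ in range(reps):
        # Binary outcome, analysed as if it were continuous.
        y = [1.0 if rng.random() < true_p else 0.0 for _ in range(n)]
        mean = sum(y) / n
        var = sum((v - mean) ** 2 for v in y) / n  # ML variance estimate
        se = math.sqrt(var / n)                    # assumed stand-in for the MLR SE
        if mean - 1.96 * se <= true_p <= mean + 1.96 * se:
            covered += 1
    return covered / reps


coverage = simulate_coverage()
print(f"95% interval coverage: {coverage:.3f}")
```

If the SEs are usable, the printed coverage should sit close to the nominal 0.95. A coverage far below that would suggest the SEs cannot be trusted for this setup.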

Anonymous posted on Wednesday, December 24, 2003 - 10:58 am



Thanks for the reply Dr. Muthen. What puzzles me is that there are other solutions that converge just fine. However, these other converged solutions had a worse loglikelihood than this solution with the warning. So should I discard this better solution because of the warning in favor of a solution with a worse loglikelihood? Or should I suspect that the model is in general unidentified despite the other proper solutions? Thanks so much. 

bmuthen posted on Thursday, December 25, 2003 - 2:10 pm



When you say that you have "other solutions that converge just fine", do you mean for the same model using other starting values, or do you mean other model variations? 

Anonymous posted on Saturday, December 27, 2003 - 9:20 pm



I mean that other solutions converge just fine for the same model using other starting values. 

bmuthen posted on Sunday, December 28, 2003 - 3:16 pm



This may indicate that your data do not support a model with this many classes or this parametric structure. I would not choose a solution with worse likelihood, but change the model. Note also the part in my first response: "if a binary y is treated as continuous you get a singularity between the mean and the variance, but the MLR SEs may be ok."

bmuthen posted on Sunday, December 28, 2003 - 4:23 pm



Another reason you might get nonidentification is that one class collapses, so that there is almost nobody in it (check the class count section in the output); this is an indication to reduce the number of classes. If you like, you can also send your input, output and data to Mplus support.


Hello, I have come across a problem similar to that of "Anonymous" in the exchange above. I am trying to estimate an LCA with c=6 for five indicators and three covariates. Both the latent categorical variable and the indicators are regressed on the covariates. With random starts, I get a solution that converges fine. However, using specific starting values, I can achieve a substantially different solution with a lower loglikelihood, but with this solution I get the same warning as "Anonymous" (NON-POSITIVE DEFINITE FIRST-ORDER DERIVATIVE PRODUCT MATRIX). The warning identifies a problem parameter: it's a threshold that's estimated to be 16.679. Moreover, the same warning gives a condition number (0.146D-12), which I don't know how to interpret. 1. What is the condition number, and what does the value 0.146D-12 indicate? 2. Do my results indicate that a model with fewer than 6 classes might be appropriate? Or are there other changes to the model that I could introduce to make a six-class model viable?


First, a couple of general comments. I assume that you don't regress all indicators on the covariates; that is not an identified model. You say you get a lower loglikelihood, but you want a higher one (max). 1. Thresholds that go large like that are harmless causes of the non-pos def message. With large thresholds, the information matrix estimate obtained by the first-order derivative approach can be numerically determined as singular. Degree of singularity is measured by the condition number, which is the ratio of the smallest to largest eigenvalue of the info matrix estimate. You don't want a very small condition number, and 0.146D-12 is very small, very close to exactly zero in machine numerical precision terms. 2. Not necessarily. A large threshold of 16 simply indicates that for this class, this item has value 1 instead of 0 (assuming a binary item). Such items are useful for defining classes. It seems strange, however, that the automatic fixing of large thresholds does not come into play for your run. If you are using Version 4.0 and don't get SEs for this solution, please send your input, output, data, and license number to support@statmodel.com.
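To make the condition number concrete, here is a minimal Python sketch. It assumes only what is stated above (ratio of smallest to largest eigenvalue) plus the fact that a value such as 0.146D-12 is Fortran-style notation for 0.146 x 10^-12. The helper names parse_d_notation and condition_number_2x2 are hypothetical, and the 2x2 matrices are made-up stand-ins for an information matrix estimate.

```python
import math


def parse_d_notation(text):
    """Convert Fortran-style output such as '0.146D-12' to a float."""
    return float(text.strip().upper().replace("D", "E"))


def condition_number_2x2(a, b, c):
    """Smallest/largest eigenvalue of the symmetric matrix [[a, b], [b, c]].

    Assumes the largest eigenvalue is positive, as it is for a nonzero
    positive semidefinite matrix.
    """
    mid = (a + c) / 2.0
    half_gap = math.hypot((a - c) / 2.0, b)
    largest = mid + half_gap
    smallest = (a * c - b * b) / largest  # determinant / largest eigenvalue
    return smallest / largest


reported = parse_d_notation("0.146D-12")
well_conditioned = condition_number_2x2(2.0, 0.0, 1.0)        # eigenvalues 2 and 1
nearly_singular = condition_number_2x2(1.0, 0.999999, 1.0)    # almost collinear

print(reported)
print(well_conditioned)
print(nearly_singular)
```

The second matrix has rows that are almost proportional, so its ratio is already tiny. The reported value is smaller still by several orders of magnitude.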


I realize I have to understand some basics. I did try to regress all indicators on the covariates. Why is this model not identifiable? Given that my 5 indicators each have 3 or 4 categories, I thought I had up to 3*3*3*4*4 - 1 = 431 independent pieces of information. In my attempted 6-class model, I was estimating 107 parameters; for this reason, I assumed that it's identifiable. I'd be grateful for any hints about where my thinking is flawed, or for a reference to a good introduction to the issue of identifiability in latent class models. I'm interested in understanding this thoroughly. Moreover, my immediate practical concerns are these: Can I achieve identifiability by introducing restrictions, or by reducing the number of classes?


Having more pieces of information than parameters is only a necessary, not a sufficient, condition for identification. Your model regresses the latent class variable on the covariates. In addition to that you try to get "direct effects" by regressing each latent class indicator on the covariates. That is not identified. Think of the information that contributes to those estimates: it is the regression of each indicator on all covariates. Say that you have p indicators and q covariates, giving p*q slopes. These slopes can't be divided up into both p*q direct effects and regression slopes for the latent class variable on the covariates. You can have some direct effects, but not all.
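The counting in this exchange can be written out as a short Python sketch. It is a bookkeeping illustration only, not an identification proof. It assumes one slope per indicator-covariate pair and a multinomial regression of the class variable with (number of classes - 1) logits; all variable names are made up for the example.

```python
from math import prod

# Necessary condition: parameters versus cells of the indicator table.
categories = [3, 3, 3, 4, 4]          # categories per indicator, from the question
cells = prod(categories)              # 432 response patterns
independent_pieces = cells - 1        # proportions sum to 1
n_parameters = 107
passes_counting_rule = n_parameters <= independent_pieces

# Slope bookkeeping for the covariate part (simplified, see assumptions above).
p, q, n_classes = 5, 3, 6
available_slopes = p * q                    # indicator-on-covariate regressions
direct_effects = p * q                      # every indicator on every covariate
class_slopes = (n_classes - 1) * q          # latent class variable on covariates
requested_slopes = direct_effects + class_slopes

print(independent_pieces, passes_counting_rule)
print(available_slopes, requested_slopes)
```

The model passes the necessary counting rule yet asks the covariate information to support more slopes than the p*q it provides, which is the point made above: some direct effects are possible, but not all of them.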


Dear Drs. Muthen, Is the condition number the same as the determinant of the input matrix? Thank you so much in advance!


No, the condition number that Mplus prints is the ratio of the smallest to the largest eigenvalue of the estimated information matrix. The smaller it is, the closer the matrix is to being singular, that is, the closer the model is to not being identified. The singularity of the sample statistics covariance matrix is evaluated separately. 


Thank you so much! Then, can I obtain the determinant of the input matrix using Mplus?


No, this is not available. 

Ben Chapman posted on Friday, August 31, 2007 - 3:12 pm



I am interested in the extent to which outliers in a latent profile model introduce a non-positive definite first-order derivative product matrix at large numbers of classes. I don't have McLachlan & Peel in front of me, but I believe they offer some cautions about outliers in mixtures of normal distributions. What happens is that the outliers tend to form one or more tiny classes of their own (the "collapsing classes" mentioned above). The parameters whose SEs are not trustworthy are in these outlier classes, so I am assuming that the small classes produced by the outliers are introducing nonidentification. My intuition is that the model can't estimate, say, 5 class-specific means, 5 variances, and a class proportion from only, say, 2 observations, but I am not sure this is technically correct, because isn't other information in the sample used to some degree to estimate these parameters? I don't plan on retaining this model, and it is easily estimable without the outliers; I am just curious about the possible effect of outliers on normal mixtures.


Small classes can produce nonidentification if the number of parameters specific to such a class exceeds the number of people in the class. This would produce the first-derivative-product-based nonidentification message. Class-specific parameters draw only on the information from the people in that class. So you can have a mean parameter for 1 outlying person, but not also a variance parameter specific to this person.
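One way to see why this happens: a first-order derivative product matrix is a sum of one outer product of derivatives per person, so its rank cannot exceed the number of people contributing. The Python sketch below illustrates that with made-up derivative vectors for 4 class-specific parameters. It simplifies by assuming that only the people in the small class contribute to those parameters (in a real mixture everyone contributes with a posterior weight, which is close to zero for a well-separated class). The function names are hypothetical.

```python
def outer_product_sum(scores):
    """Sum of s * s' over the per-person derivative vectors in scores."""
    k = len(scores[0])
    total = [[0.0] * k for _ in range(k)]
    for s in scores:
        for i in range(k):
            for j in range(k):
                total[i][j] += s[i] * s[j]
    return total


def matrix_rank(matrix, tol=1e-9):
    """Rank by Gaussian elimination with partial pivoting."""
    m = [row[:] for row in matrix]
    n_rows, n_cols = len(m), len(m[0])
    rank = 0
    for col in range(n_cols):
        pivot = max(range(rank, n_rows), key=lambda i: abs(m[i][col]))
        if abs(m[pivot][col]) <= tol:
            continue  # no usable pivot in this column
        m[rank], m[pivot] = m[pivot], m[rank]
        for i in range(rank + 1, n_rows):
            factor = m[i][col] / m[rank][col]
            for j in range(col, n_cols):
                m[i][j] -= factor * m[rank][j]
        rank += 1
        if rank == n_rows:
            break
    return rank


# Hypothetical derivative vectors, one per person, for 4 parameters.
two_people = [[1.0, -0.5, 2.0, 0.3], [-0.7, 1.2, 0.4, -1.1]]
four_people = two_people + [[0.2, 0.9, -1.3, 0.8], [1.5, 0.1, 0.6, -0.4]]

rank_small_class = matrix_rank(outer_product_sum(two_people))
rank_larger_class = matrix_rank(outer_product_sum(four_people))
print(rank_small_class, rank_larger_class)
```

With two people and four parameters the 4x4 matrix has rank 2, so it is singular. With four people it reaches full rank.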


I got a message about "A NON-POSITIVE DEFINITE FIRST-ORDER DERIVATIVE PRODUCT MATRIX," while it says, "THE MODEL ESTIMATION TERMINATED NORMALLY." Should I not trust the standard errors of the model parameter estimates even though I got a converged solution?


You most likely can, but it depends on the setting. Please send your output to support. 

jas229 posted on Friday, May 04, 2012 - 2:43 pm



Hello, I ran a model testing a potential interaction among latent variables using TYPE=RANDOM, ALGORITHM=INTEGRATION, and the XWITH approach for creating latent variable interactions. I obtained the error message "THE STANDARD ERRORS OF THE MODEL PARAMETER ESTIMATES MAY NOT BE TRUSTWORTHY FOR SOME PARAMETERS DUE TO A NON-POSITIVE DEFINITE FIRST-ORDER DERIVATIVE PRODUCT MATRIX." The model estimation terminated normally though, and I am modeling some dichotomous variables. Can I trust these standard errors given that (based on your replies above) the dichotomous variable issue sometimes causes this error message to arise? Thanks in advance for your time and consideration.


You would need to test that the dichotomous variable is the problem by removing it and seeing if the message disappears. 

jas229 posted on Friday, May 04, 2012 - 4:46 pm



Dear Dr. Muthen, Thank you for your prompt reply. Removing the dichotomous variables did make the error message disappear. Does this mean that the standard errors in the original output should be trustworthy? Thank you again for your help. 


Yes. Then the message was generated because the mean and the variance of a dichotomous item are not orthogonal. 


Hello Drs. Muthen, I have a model with 3 latent and 1 observed DV and 4 covariates (2 binary, 2 continuous). When I run the model, with either ML or MLR estimation, the model converges but I receive a message that "THE STANDARD ERRORS OF THE MODEL PARAMETER ESTIMATES MAY NOT BE TRUSTWORTHY FOR SOME PARAMETERS DUE TO A NON-POSITIVE DEFINITE FIRST-ORDER DERIVATIVE PRODUCT MATRIX." The message indicates that the problem is with 1 continuous covariate and 1 binary covariate (when I remove these from the model I no longer receive the message). I can find no reason for nonidentification, the results are quite sensible, and when I ran the model in another SEM program to check I didn't receive any error message. I would like to be sure that I can indeed trust the SEs. Any guidance would be appreciated.


If you have brought the binary covariate into the model by mentioning its mean, variance, or covariance with another variable, you will get that message because the mean and variance of a binary variable are not orthogonal. The message can be ignored if this is the reason for it. 
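The dependence between the mean and variance of a binary variable can be checked numerically. The Python sketch below assumes a plain normal model with parameters (mean, variance) evaluated at the ML estimates, which is a simplified stand-in for the relevant part of a larger model. It builds the first-order derivative product matrix and prints its condition number (smallest over largest eigenvalue). The function names and data are made up for the example.

```python
import math


def derivative_product_matrix(y):
    """Entries (a, b, c) of [[a, b], [b, c]]: the sum over observations of the
    outer product of the normal log-likelihood derivatives with respect to
    (mean, variance), evaluated at the ML estimates."""
    n = len(y)
    mu = sum(y) / n
    s2 = sum((v - mu) ** 2 for v in y) / n
    a = b = c = 0.0
    for v in y:
        d_mu = (v - mu) / s2
        d_s2 = -0.5 / s2 + (v - mu) ** 2 / (2.0 * s2 ** 2)
        a += d_mu * d_mu
        b += d_mu * d_s2
        c += d_s2 * d_s2
    return a, b, c


def condition_number(a, b, c):
    """Smallest/largest eigenvalue of the symmetric matrix [[a, b], [b, c]]."""
    largest = (a + c) / 2.0 + math.hypot((a - c) / 2.0, b)
    return ((a * c - b * b) / largest) / largest


binary_y = [1.0] * 30 + [0.0] * 70                  # proportion 0.3
continuous_y = [0.0, 0.5, 1.0, 1.5, 2.0] * 20       # five distinct values

cond_binary = condition_number(*derivative_product_matrix(binary_y))
cond_continuous = condition_number(*derivative_product_matrix(continuous_y))
print(cond_binary)       # numerically zero; may print as a tiny +/- value
print(cond_continuous)
```

A binary y takes only two values, so there are only two distinct derivative vectors, and at the ML estimates they are proportional to each other. The matrix therefore has rank 1 no matter how large the sample is, which triggers the message even though nothing is wrong with the model.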


Hi, I ran a multilevel multi-indicator latent growth model:

MODEL:
%within%
f1w BY CELF1C*(1)PLN1C(2)PP1C PWPA1W(4);
f2w BY CELF3C(1)PLN3C(2)PP3C(3)PWPA3W(4);
f3w BY CELF4C*(1)PLN4C(2)PP4C(3)PWPA4W(4);
iw sw | f1w@0 f2w@1 f3w@2;
%between%
f1b BY CELF1C*(5)PLN1C(6)PP1C(7)PWPA1W(8);
f2b BY CELF3C*(5)PLN3C(6)PP3C(7)PWPA3W(8);
f3b BY CELF4C*(5)PLN4C(6)PP4C(7)PWPA4W(8);
ib sb | f1b@0 f2b@1 f3b@2;

I got two warnings: 1) A NON-POSITIVE DEFINITE FIRST-ORDER DERIVATIVE PRODUCT MATRIX. 2) THE LATENT VARIABLE COVARIANCE MATRIX IS NOT POSITIVE DEFINITE. When comparing my model with Example 9.15 in the User's Guide, I noticed that the example 1) sets the cross-level loadings equal, 2) uses WLSM, 3) doesn't use the fixed-factor method of scaling as I did, and 4) sets the between-level intercept growth factor to zero and holds the residual variances of the factors equal over time. Could any of these differences be the reason for the warnings?


Labels like your 1, 2, and 4 in CELF1C*(1)PLN1C(2)PP1C PWPA1W(4); need to be separated by a semicolon or be on separate lines. A growth model with multiple indicators needs to have intercept invariance, not only loading invariance. See UG pages 687-692 for how to parameterize your model.

Sophie Dan posted on Saturday, April 22, 2017 - 9:44 pm



Dr. Muthen, hello! I ran a two-level CFA and got a warning like this: "MAXIMUM LOGLIKELIHOOD VALUE FOR THE UNRESTRICTED (H1) MODEL IS 28091.078 THE STANDARD ERRORS OF THE MODEL PARAMETER ESTIMATES MAY NOT BE TRUSTWORTHY FOR SOME PARAMETERS DUE TO A NON-POSITIVE DEFINITE FIRST-ORDER DERIVATIVE PRODUCT MATRIX. THIS MAY BE DUE TO THE STARTING VALUES BUT MAY ALSO BE AN INDICATION OF MODEL NONIDENTIFICATION. THE CONDITION NUMBER IS 0.314D-16. PROBLEM INVOLVING PARAMETER 45. THE NONIDENTIFICATION IS MOST LIKELY DUE TO HAVING MORE PARAMETERS THAN THE NUMBER OF CLUSTERS. REDUCE THE NUMBER OF PARAMETERS. THE MODEL ESTIMATION TERMINATED NORMALLY" But I also get all the results I need. Does the warning matter, or can I ignore it? Another problem is that if I do a two-level EFA and get a negative residual variance, how can I make it positive? Thank you! Any help from you will be greatly appreciated!


Use TECH1 to check that you don't have more between-level parameters than the number of clusters. EFA with negative residual variances often suggests that too many factors have been extracted.

Sophie Dan posted on Wednesday, April 26, 2017 - 1:23 am



Thanks very much for your reply! By "between-level parameters", do you mean the m*(m+1)/2 parameters? If the number of variables I used at the within level is the same as at the between level, should the number of estimated parameters at the within and between levels be the same? And in terms of EFA with a negative residual variance, although it indicates I should extract fewer factors, if the solution meets the requirements of theory and the factor correlation does not reach the critical value that would suggest combining the factors into one, can I just ignore the negative residual variance, or is the solution inadmissible? Thank you again!!


By between-level parameters I mean the parameters that are specific to the between level; this is clear from TECH1. As for your other questions, see my general answer.
