
Anonymous posted on Thursday, January 20, 2005  12:18 pm



Hi, I have been working in Mplus on several two-level structural equation models, and sometimes the residual variance of my observed dependent variables is negative. The model fit is good, but I am not sure how to interpret or fix these residual variances so they are no longer negative. Please advise. Thank you.

BMuthen posted on Thursday, January 20, 2005  7:57 pm



If the negative residual variances are large, this is a sign that your model is not appropriate for your data and you need to change your model. If they are small, you may want to fix them to zero. Residual variances are often small on the between level of multilevel models.

matthew posted on Sunday, November 27, 2005  3:14 am



Hi, I am a new user of SEM and face a similar problem. What are the causes of negative residual variance? I mean, how can an inappropriate model cause this problem without it being reflected in the model-fit indices? I want to ignore it (simply delete it). Is there any guideline or significance level for the value at which it would be acceptable to do this?


There is too much to say on this topic, and checking books is helpful. In sum, reasons for negative variance estimates include small sample size (so the estimate can be negative even if the population value is positive), model misspecification, and very skewed variables (floor effects). Also see my other answer today.
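
The small-sample point can be illustrated with a minimal simulation sketch. Everything in it is an assumption made for illustration: a one-factor model with three indicators, hypothetical loadings and residual variances (all positive in the population), and a simple method-of-moments estimate of the first residual variance rather than Mplus's own estimator.

```python
import random

def sample_cov(u, v):
    """Unbiased sample covariance of two equal-length lists."""
    n = len(u)
    mu, mv = sum(u) / n, sum(v) / n
    return sum((a - mu) * (b - mv) for a, b in zip(u, v)) / (n - 1)

def residual_variance_estimate(n, rng, loadings=(0.9, 0.7, 0.7), resid=(0.2, 0.5, 0.5)):
    """Simulate n cases from a one-factor model (factor variance 1) and return
    the moment estimate of the first indicator's residual variance:
    s11 - s12 * s13 / s23. The population value is resid[0] = 0.2 (positive)."""
    ys = [[], [], []]
    for _ in range(n):
        eta = rng.gauss(0.0, 1.0)  # factor score
        for j in range(3):
            ys[j].append(loadings[j] * eta + rng.gauss(0.0, resid[j] ** 0.5))
    s11 = sample_cov(ys[0], ys[0])
    s12 = sample_cov(ys[0], ys[1])
    s13 = sample_cov(ys[0], ys[2])
    s23 = sample_cov(ys[1], ys[2])
    return s11 - s12 * s13 / s23

def share_negative(n, reps, seed):
    """Proportion of replications in which the estimate is negative."""
    rng = random.Random(seed)
    negatives = sum(residual_variance_estimate(n, rng) < 0 for _ in range(reps))
    return negatives / reps

small = share_negative(n=25, reps=100, seed=1)
large = share_negative(n=2000, reps=100, seed=1)
print(f"share of negative estimates: n=25 -> {small:.2f}, n=2000 -> {large:.2f}")
```

Under these assumed values the negative estimates come from sampling error alone, since the model generating the data is correct; the share should shrink as n grows.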


Hi, could you advise me on the following problem, please? Using CFA, I established discriminant validity between 6 latent variables (x1-x6). Then, using SEM, I regressed y1, a latent variable with 3 continuous indicators, on x1-x6. The model fit very well. However, on addition of a final path, regressing u1, a binary observed variable, on y1, I received a warning that the residual covariance matrix is not positive definite. On inspection of the results, I realised there was a negative residual variance on one of the indicators of y1, and the model fit indices are very poor. What should I do? I tried dichotomising y1, as the indicators are negatively skewed, but the model fit is worse. I would be grateful for any advice. Grainne


Please send your input, data, output, and license number to support@statmodel.com. 


Hello, is it possible to constrain the residual variance of outcome variables to be greater than or equal to zero in Mplus (rather than only equal to zero)? If yes, could you please state how?


You can use MODEL CONSTRAINT to constrain them to be greater than zero, for example, MODEL CONSTRAINT: 0 < p2; where p2 is the label given to the residual variance in the MODEL command.


Hello Linda, I found a negative error variance for one of my ordinal outcomes (estimator WLSMV), and I'd like to test the inequality H0: error variance is greater than or equal to zero using the Wald test. Is this possible in Mplus? Thank you very much in advance.


I don't know what your model is, but unless you have a longitudinal or multigroup situation, the residual variances for ordinal outcomes are not free parameters. They are printed as remainders when requesting a standardized solution; perhaps that's where you see the negative value. Since they are functions of other parameters, you cannot do a test on them in a straightforward fashion. If you, for instance, consider a single factor and no covariates (in a cross-sectional, single-group setting), the residual variance remainder is theta = 1 - lambda*lambda*psi, so you would have to do the Wald test on the new parameter theta (testing against = 0, not > 0).
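
As a rough numerical illustration of the remainder theta = 1 - lambda*lambda*psi and of a Wald z statistic for it, here is a minimal sketch. The loading, factor variance, and their sampling (co)variances are hypothetical numbers, and the standard error uses the ordinary delta-method approximation; this is a hand calculation, not a reproduction of what Mplus prints.

```python
import math

def residual_remainder(lam, psi):
    """Residual variance remainder under the Delta parameterization:
    theta = 1 - lambda^2 * psi."""
    return 1.0 - lam * lam * psi

def delta_method_se(lam, psi, var_lam, var_psi, cov_lam_psi):
    """Approximate standard error of theta from the sampling variances and
    covariance of lambda and psi (first-order delta method)."""
    g_lam = -2.0 * lam * psi   # d theta / d lambda
    g_psi = -lam * lam         # d theta / d psi
    variance = (g_lam ** 2 * var_lam
                + g_psi ** 2 * var_psi
                + 2.0 * g_lam * g_psi * cov_lam_psi)
    return math.sqrt(variance)

# Hypothetical estimates: a large loading pushes the remainder below zero.
lam, psi = 1.1, 0.9
theta = residual_remainder(lam, psi)
se = delta_method_se(lam, psi, var_lam=0.01, var_psi=0.02, cov_lam_psi=-0.005)
z = theta / se  # Wald z statistic for H0: theta = 0
print(f"theta = {theta:.3f}, se = {se:.3f}, z = {z:.3f}")
```

With these assumed inputs the remainder is negative but small relative to its standard error, which is the situation the Wald test is meant to assess.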


Thanks, Bengt, for your answer. I indeed saw the negative residual variance doing ESEM with WLSMV and the Delta parameterization on a 2-factor model with a total of 5 ordinal indicators.

1. But if one uses the Theta parameterization, the residual variances should be free parameters, shouldn't they? In this case, could the Wald test be used directly on the residual variance?

2. About the Wald test, I was wondering if it is possible to do a one-sided test of H0: residual >= 0 (as opposed to the two-sided test of H0: residual = 0), as suggested in the article by Kolenikov & Bollen, "Testing negative error variances: is a Heywood case a symptom of misspecification?" http://web.missouri.edu/~kolenikovs/

Hope these questions make sense! Thank you very much for your help.


1. With the theta parameterization the residual variance is fixed to 1 (unless you have a multiple-group situation), so in a way this gives you the residual variance > 0 condition. The residual variance is not a free parameter because it is still not identified, so it has to be fixed to a value that determines the parameterization. For the theta parameterization that value is 1.

2. In principle yes; this amounts to dividing the p-value you get by 2. But again, with the theta parameterization you cannot do this at all because the residual variance is fixed to one. In the delta parameterization you can do this using the method Bengt outlined above, i.e., by creating a new parameter in MODEL CONSTRAINT that is equal to the residual variance parameter. The residual variance parameter in the model is not really a regular parameter. It is a dependent, constrained parameter that you cannot access directly, so you have to make your own duplicate of it.
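
The "divide the p-value by 2" remark can be checked with a short sketch based on the standard normal distribution. The z value is hypothetical. Note that the halving gives the one-sided p-value for H0: residual >= 0 only when the estimate is negative; for a positive estimate the one-sided p-value is 1 minus half the two-sided value.

```python
import math

def normal_cdf(z):
    """Standard normal cumulative distribution function."""
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))

def two_sided_p(z):
    """p-value for H0: parameter = 0 against a two-sided alternative."""
    return 2.0 * (1.0 - normal_cdf(abs(z)))

def one_sided_p_lower(z):
    """p-value for H0: parameter >= 0 against H1: parameter < 0."""
    return normal_cdf(z)

z = -1.96  # hypothetical Wald z for a negative residual variance estimate
print(f"two-sided p = {two_sided_p(z):.4f}")
print(f"one-sided p = {one_sided_p_lower(z):.4f}")
```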

RDU posted on Saturday, February 13, 2010  12:29 am



I have a question concerning the residual variance provided in the standardized model results for categorical outcomes. Since this is a standardized solution, then are the residual variances listed standardized residual variances or are they unstandardized? Furthermore, if they are standardized then how exactly does one obtain the unstandardized residual variances (Is it similar to what it would be in regular regression?). Thanks. 


I think you are asking about the residual variances that are printed with R-square. These are raw coefficients that are computed as a remainder from the estimated model results. They are not estimated as part of the model. Categorical outcomes do not have variance parameters.

RDU posted on Saturday, February 13, 2010  10:32 am



To better clarify my question, I was referring to the residual variances that are provided using Theta parameterization (WLSMV estimation) with categorical data, where standardized model results can be requested. As part of the standardized output, residual variances are given along with each item's R^2. Thus I was wondering whether the residual variance is standardized since it is part of the standardized model output, or whether it is an unstandardized estimate. If it is in fact a standardized residual variance, I also wanted to know if an unstandardized estimate could be obtained or calculated by hand. Thank you for your response. 


With the Theta parameterization, scale factors are given with R-square. If you have further questions about this, please send the full output as an attachment and your license number to support@statmodel.com.

RDU posted on Saturday, February 13, 2010  11:37 am



Yes, I apologize for the confusion. I believe I mistook the scale factors from Theta parameterization for the residual variances provided in delta parameterization. 

RDU posted on Saturday, February 13, 2010  1:40 pm



Given the previous question I am also curious as to whether the residual variances given for delta parameterization are standardized or unstandardized, as theta parameterization was said to provide scale factors and not residual variances...Is this correct? 


Neither the scale factors nor the residual variances presented with R-square are standardized.

Hemant Kher posted on Thursday, April 21, 2011  8:04 am



Hello Professor Muthen, I have a question, and I hope that you can provide some insights. My question is related to a multiple-indicator latent curve model. Using the CFA approach, I estimate a latent construct at 4 different time points (factors F1, F2, F3 and F4, each estimated using the same 4 items). I followed directions to establish measurement invariance (same scale indicator at each time, loadings for non-scale items constrained equal across time, and equal intercepts for non-scale items). The CFA model works fine with a good fit. However, when I fit a growth model on the factors, I get a negative residual variance for the first factor (F1); the residual variance is small and statistically insignificant (-0.026, Z = -0.354, p = 0.723). When I fix this residual variance for F1 to zero (f1@0;), the change in model fit is negligible and not statistically significant. But I am not sure if doing this (setting a factor residual variance to zero) is reasonable. Your thoughts at your convenience would be appreciated.


Negative residual variances typically reflect a misspecified model. For instance, perhaps a nonlinear growth model is more suitable. Also, instead of fixing the residual variance at zero, you could try holding the residual variances equal across time.

Hemant Kher posted on Thursday, April 21, 2011  8:26 am



Professor Muthen, thank you for a quick response. Holding the factor residual variances equal across time has solved the problem.

Katja posted on Monday, January 14, 2013  1:14 am



Hi! I have a question regarding a negative residual variance. There was a post: "If the negative residual variances are large, this is a sign that your model is not appropriate for your data and you need to change your model. If they are small, you may want to fix them to zero. Residual variances are often small on the between level of multilevel models." What is considered a large or small negative variance? I have a negative residual variance of -.083. Can I fix it to zero? Thank you!


Try fixing it to zero and see how this affects the results. 


Hello, there seem to be two approaches to handling negative residual variances. The first is to fix the residual variance to 0 or a small positive value. The second is to use MODEL CONSTRAINT to constrain the variance to be greater than zero. These approaches seem to be different because the first one yields an additional degree of freedom whereas the second does not. So I am wondering which approach you consider more appropriate and why. Many thanks, Johannes


It is common practice to fix small, insignificant negative residual variances to zero or to constrain them to be greater than zero. These approaches are basically the same. Neither is optimal. A better choice is to change the model.

Linh Nguyen posted on Saturday, July 19, 2014  9:31 pm



Hello Linda, I am also having the negative residual variance problem with my measurement and structural models. The problem is at the second-order construct, which has only 2 first-order indicators. When I run a CFA for this second-order construct alone, it is unidentified (my sample size is 364). So I have set the factor's loadings to be equal (tau-equivalence) in order to run the CFA and validate the construct individually. The CFA model then fits very well, with an insignificant chi-square, CFI = .995, RMSEA = .04, SRMR = .02. The overall measurement model (with 5 other constructs and 32 observed variables) also fits adequately. In my structural model, if I do not set the factor loadings of the above construct to be equal, I get 1 negative residual variance (for one of the construct's indicators). The negative residual variance was -.017, insignificant. If I continue to employ the tau-equivalence assumption, the model is fine. Can I set the construct's factor loadings to be equal in my structural model, as in the CFA model, to solve the problem? Thank you very much! I am looking forward to hearing from you. Kind regards, Linh

Linh Nguyen posted on Saturday, July 19, 2014  10:35 pm



With regard to the second-order construct mentioned above: even though it has 2 first-order indicators, each first-order indicator has 3 observed variables. Thanks, Linh


I would use the CFA model of equal loadings in the SEM model. 


Hi Linda, thanks so much for your quick response! It really helps. Sorry for my late reply; I could not post any message on the website until now. I am writing up my analysis and am wondering if you can give me some references for using the CFA model of equal loadings in the structural model. I have been searching for journal articles about the issue but could not find a suitable one. Thank you. Kind regards, Linh


I don't know of any reference. A second-order factor with only two first-order factors is not identified unless you impose a constraint such as equal loadings. Ideally you would have more first-order factors so this is not necessary.


Thanks Linda 

Ted Fong posted on Friday, August 01, 2014  2:12 am



Dear Dr. Muthén, I understand that a 2nd-order factor model with just two first-order factors is typically unidentified. When the model is made identified with a model constraint such as equal loadings on the 2nd-order factor, this model should have the same df as the two-factor correlated model. My question is: is it possible for an identified 2nd-order factor model with two first-order factors to have a lower df than the two-factor correlated model? I have recently come across a paper where the former model has 2 df fewer than the latter one. Thanks very much, Ted


This question is better suited for a general discussion forum like SEMNET. 


Hi Dr Muthen, I am running a full structural equation model (with a categorical outcome) and am having trouble. While it runs and terminates normally, I still get the warning message about a negative residual variance:

THE MODEL ESTIMATION TERMINATED NORMALLY

WARNING: THE LATENT VARIABLE COVARIANCE MATRIX (PSI) IS NOT POSITIVE DEFINITE....PROBLEM INVOLVING VARIABLE DLMORAL.

After reading online I fixed the residual variance to 0 (DLMORAL@0), but I am still getting this message. Can I still use the estimates provided, or is there something else I can do to fix this problem? Having it fixed @0 also means I do not have estimates for two correlations involving this variable. This is the starting model, so I want to make sure there are no issues before I modify it. I ran a CFA on all independent variables before running the full model, and there were no problems with this variable. Thanks in advance for your assistance.


I should add that my sample size is only 320 but that has not been a problem for several other models (including one similar) that use this variable. Does this mean that this issue is caused by the addition of one latent variable and one observed variable (that does not correlate with the variable in question)? Sorry about the second post. 


Please send the output and your license number to support@statmodel.com. Be sure to include TECH4. 

Linda Lin posted on Tuesday, March 10, 2015  3:04 pm



Can you explain how Mplus decomposes the total outcome variance and calculates the level-2 residual variance? I tried to demonstrate it as below. The total variance of the outcome variable can be decomposed as [level-1 residual variance] + [level-1 explained variance] + [level-2 residual variance] + [level-2 explained variance]. That is, for example:

    usevariables are class x y;
    within = ;
    between = ;
    cluster = class;
    Analysis: type = twolevel;
    Model:
      %within%
      y on x (a);
      y (b);
      x (c);
      %Between%
      y (d);
      x (e);
      y on x (a);

I expect that total Var(y) should equal b + c*a^2 + d + e*a^2. But I found that this formula does not reproduce the total variance of y accurately.


I don't know if you compare your formula to the total variance in the sample or to the model-estimated total variance. If you don't have perfect fit, those two are different. The formula is correct. You use the label "a" on both levels; is that what you intend? If this doesn't help, send the output to support@statmodel.com along with your license number.

Linda Lin posted on Wednesday, March 11, 2015  7:12 pm



Thanks for your answer, Dr. Muthen. To clarify my question above: 1) I intend to constrain the coefficients to be the same across both levels. 2) I compared the total variance estimated from the model below with the total variance estimated from the model above.

    usevariables are class y;
    within = ;
    between = ;
    cluster = class;
    Analysis: type = twolevel;
    Model:
      %within%
      y (e);
      %Between%
      y (f);

Total variance = e + f. I expected the total variances to be the same. Is this e + f the sample variance you mentioned? Thanks!


e+f is the model-estimated total variance that I referred to. Your first model was

    Model:
      %within%
      y on x (a);
      y (b);
      x (c);
      %Between%
      y (d);
      x (e);
      y on x (a);

But because you have the equality constraint "a", you won't necessarily get the same total variance, since the equality may not hold. Try it without the equality.
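
The decomposition discussed in this exchange, total Var(y) = b + c*a^2 + d + e*a^2, can be worked through numerically. The sketch below uses hypothetical parameter values and allows a separate slope on each level, to show how the model-implied total variance changes when the within and between slopes are not actually equal.

```python
def level_variance(slope, var_x, resid_var):
    """Model-implied variance of y on one level:
    explained part (slope^2 * Var(x)) plus residual variance."""
    return slope ** 2 * var_x + resid_var

def total_variance(a_within, b, c, a_between, d, e):
    """Sum of within-level and between-level implied variances of y.
    b, c: within residual variance of y and variance of x;
    d, e: between residual variance of y and variance of x."""
    return level_variance(a_within, c, b) + level_variance(a_between, e, d)

# Hypothetical values for the labeled parameters in the thread.
b, c, d, e = 1.0, 2.0, 0.3, 0.4

equal_slopes = total_variance(a_within=0.5, b=b, c=c, a_between=0.5, d=d, e=e)
free_slopes = total_variance(a_within=0.5, b=b, c=c, a_between=1.5, d=d, e=e)
print(f"slopes held equal: {equal_slopes:.2f}")
print(f"slopes differing:  {free_slopes:.2f}")
```

If the slopes truly differ across levels, forcing them equal changes the other estimates as well, so the implied total need not match e + f from the model without x.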


Dear Linda or Bengt, I am fitting a two-level factor model with WLSMV on binary data, and I would like to report the within-level residual variances, although they are not free parameters. Can I calculate them as 1 - lambdawithin^2 * phiwithin? Or are the residual variances at the within level fixed at 1, as in the theta parameterization? Thanks in advance, Sanne


If you ask for STANDARDIZED in the OUTPUT command, you will obtain residual variances, which are computed as remainders. They are at the end of the standardized output with R-square.


Dear Linda, thanks for pointing me to this output. I do not see the residual variances, but I do get scale factors, which are 1/sd(underlying response variable) if I understand correctly. I then calculated the total variance as 1 / scale factor^2, and the residual variances as total variance - explained variance. The result is around 1 for all variables: 1.002 1.002 1.000 1.001 1.003 0.997 0.999 0.998 1.001. This gives the impression that the residual variances are effectively constrained to be 1, and that the small deviations I find are due to rounding error. Could that be correct? Because then I will just report (unstandardized) residual variances of 1.


Yes, the residual variances are fixed at 1 because the Theta parameterization is used for 2-level WLSMV.
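
The back-calculation described above (total variance = 1 / scale factor^2, residual = total - explained) can be sketched as follows. The loading and factor variance are hypothetical, and the scale factor is taken to be 1/sd of the underlying response variable, as stated in the question. Rounding the scale factor to three decimals, as in printed output, is enough to produce the small deviations from 1.

```python
import math

def scale_factor(lam, psi, resid_var=1.0):
    """Scale factor as 1 / sd of the underlying response variable, with the
    residual variance fixed at 1 (Theta parameterization)."""
    total = lam ** 2 * psi + resid_var
    return 1.0 / math.sqrt(total)

def residual_from_scale(scale, lam, psi):
    """Back out the residual variance: total variance minus explained variance."""
    total = 1.0 / scale ** 2
    return total - lam ** 2 * psi

lam, psi = 0.8, 1.5              # hypothetical loading and factor variance
exact = scale_factor(lam, psi)   # full-precision scale factor
printed = round(exact, 3)        # as it would appear with three decimals

print(f"from exact scale factor:   {residual_from_scale(exact, lam, psi):.4f}")
print(f"from rounded scale factor: {residual_from_scale(printed, lam, psi):.4f}")
```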


Thank you very much! 


What is considered a large or small negative variance? I have a negative residual variance. Can I fix it to zero, and if so, how?


You say y@0; No rules of thumb, but if fixing it at zero didn't change the fit very much then it wasn't big. 
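
One way to judge whether "the fit didn't change very much" is a chi-square difference with 1 degree of freedom between the model with the variance free and the model with it fixed at zero. The sketch below uses hypothetical chi-square values and assumes ordinary ML chi-squares; with estimators such as MLR or WLSMV a plain difference is not appropriate and a scaled difference test is needed instead. Because zero is on the boundary of the admissible range, the p-value is only approximate.

```python
import math

def chi2_sf_1df(x):
    """Upper-tail probability of a chi-square variate with 1 degree of freedom."""
    return math.erfc(math.sqrt(x / 2.0))

chisq_free = 45.20   # hypothetical chi-square, residual variance estimated
chisq_fixed = 45.33  # hypothetical chi-square, residual variance fixed at 0

diff = chisq_fixed - chisq_free
p = chi2_sf_1df(diff)
print(f"chi-square difference = {diff:.2f}, p = {p:.3f}")
```

A large p-value here would be in line with the advice above: fixing the variance at zero barely changes the fit, so the negative estimate was not big.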

Bo Y posted on Monday, February 08, 2016  2:32 am



Hi Drs. Muthen, similar to the above questions, I ran into the warning message below when I fitted a cross-lagged model. The model fit indices seem acceptable after modification. I have a relatively small sample size: 102 for wave one and 80 for wave two. DCI23 is an important observed variable in this model. Is there a technique that I could apply to fix this? Thank you very much!

THE MODEL ESTIMATION TERMINATED NORMALLY

WARNING: THE LATENT VARIABLE COVARIANCE MATRIX (PSI) IS NOT POSITIVE DEFINITE. THIS COULD INDICATE A NEGATIVE VARIANCE/RESIDUAL VARIANCE FOR A LATENT VARIABLE, A CORRELATION GREATER OR EQUAL TO ONE BETWEEN TWO LATENT VARIABLES, OR A LINEAR DEPENDENCY AMONG MORE THAN TWO LATENT VARIABLES. CHECK THE TECH4 OUTPUT FOR MORE INFORMATION. PROBLEM INVOLVING VARIABLE DCI23.


No way to change this without changing the model. If fit didn't change very much as stated above, then fixing it to zero should be okay. 

Bo Y posted on Tuesday, February 09, 2016  1:34 am



Thanks a lot, Linda. I would like to make sure I have the syntax right. Would you please let me know whether the following is OK? Following your advice, I added one line to the MODEL section, "DCI23@0". The CFI then increased a bit, from .975 to .976, and the TLI increased a little too, from .960 to .962, but I still got the same warning message below. Do I need to do anything further?

THE MODEL ESTIMATION TERMINATED NORMALLY

WARNING: THE LATENT VARIABLE COVARIANCE MATRIX (PSI) IS NOT POSITIVE DEFINITE. THIS COULD INDICATE A NEGATIVE VARIANCE/RESIDUAL VARIANCE FOR A LATENT VARIABLE, A CORRELATION GREATER OR EQUAL TO ONE BETWEEN TWO LATENT VARIABLES, OR A LINEAR DEPENDENCY AMONG MORE THAN TWO LATENT VARIABLES. CHECK THE TECH4 OUTPUT FOR MORE INFORMATION. PROBLEM INVOLVING VARIABLE DCI23.


Please send the output and your license number to support@statmodel.com. 


I am estimating a structural equation model that has two time points of executive function (EF). The time 2 EF has a negative residual variance; I believe it's because the two time points are so highly related. (The standardized output shows that the beta from time 1 EF to time 2 EF is greater than one.) I would like to set the residual variance of time 2 EF to zero. Does it make sense to do that and still use time 2 EF as the dependent variable in a regression? My understanding is that a regression equation is predicting variance, so if you set the variance to zero there is nothing to predict.


I would not recommend going in that direction. I assume EF is a factor with several indicators. If so, explore across-time residual correlations for the same indicators. That may reduce the factor correlation.


Hi Drs. Muthen, I'm running a quadratic latent growth curve model and am getting an error message because there is a negative residual variance for both the slope and the quadratic factor. If I restrict the slope and quadratic factor residual variances to 0, the model fit worsens. However, if I restrict the only time point with a non-significant residual variance to zero, and restrict the slope to 0, the model fits well and q no longer has a negative residual variance. Is this o.k. for me to do? Specifically, this is my residual variance output:

Residual Variances
                Estimate    S.E.  Est./S.E.  P-Value
    M6_IBQ         0.934   0.364      2.568    0.010
    M9_IBQ         0.311   0.147      2.119    0.034
    M12_IBQ        0.915   0.322      2.841    0.005
    M24_ECBQ       4.037   3.100      1.302    0.193
    I              0.022   0.316      0.070    0.944
    S             -9.129   5.135     -1.778    0.075
    Q             -3.853   2.053     -1.877    0.061

When I restrict s@0 and q@0, the model fit worsens. However, if I restrict s@0 and M24_ECBQ@0, the model runs, q no longer has a negative residual variance, and the model fit is better. Sorry for my lengthy post; any information is greatly appreciated.
