
Anonymous posted on Thursday, January 20, 2005  12:18 pm



Hi, I have been working in Mplus on several two-level structural equation models, and sometimes the residual variance of my observed dependent variables is negative. The model fit is good, but I am not sure how to interpret or fix these residual variances so that they are no longer negative. Please advise. Thank you.

BMuthen posted on Thursday, January 20, 2005  7:57 pm



If the negative residual variances are large, this is a sign that your model is not appropriate for your data and you need to change your model. If they are small, you may want to fix them to zero. Residual variances are often small on the between level of multilevel models.

matthew posted on Sunday, November 27, 2005  3:14 am



Hi, I am a new user of SEM and face a similar problem. What causes negative residual variances? I mean, how can a misspecified model cause this problem without it being reflected in the model-fit indices? I would like to ignore it (simply delete it). Is there any guideline or significance level for the value that would make it safe to do so?


There is too much to say on this topic, and checking books is helpful. In sum, reasons for negative variance estimates include small sample size (so the estimate is negative even if the population value is positive), model misspecification, and very skewed variables (floor effects). Also see my other answer today.


Hi, could you advise me on the following problem, please? Using CFA, I established discriminant validity between 6 latent variables (x1-x6). Then, using SEM, I regressed y1, a latent variable with 3 continuous indicators, on x1-x6. The model fit very well. However, on addition of a final path, regressing u1, a binary observed variable, on y1, I received a warning that the residual covariance matrix is not positive definite. On inspection of the results, I realised there was a negative residual variance on one of the indicators of y1, and the model fit indices are very poor. What should I do? I tried dichotomising y1, as the indicators are negatively skewed, but the model fit is worse. I would be grateful for any advice. Grainne


Please send your input, data, output, and license number to support@statmodel.com. 


Hello, is it possible to constrain the residual variance of outcome variables to be greater than or equal to zero in Mplus (rather than only equal to zero)? If yes, could you please state how?


You can use MODEL CONSTRAINT to constrain them to be greater than zero, for example, MODEL CONSTRAINT: 0 < p2; where p2 is a label given to the residual variance in the MODEL command.


Hello Linda, I found a negative error variance for one of my ordinal outcomes (estimator WLSMV), and i'd like to test the inequality H0: error variance is greater than or equal to zero using the Wald test. Is it possible in Mplus? Thank you very much in advance. 


I don't know what your model is, but unless you have a longitudinal or multigroup situation, the residual variances for ordinal outcomes are not free parameters. They are printed as remainders when requesting a standardized solution; perhaps that's where you see the negative value. Since they are functions of other parameters, you cannot test them in a straightforward fashion. If you for instance consider a single factor and no covariates (in a cross-sectional, single-group setting), the residual variance remainder is theta, theta = 1 - lambda*lambda*psi, so you would have to do the Wald test on the new parameter theta (testing against = 0, not > 0).
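A rough numerical sketch of the remainder theta = 1 - lambda*lambda*psi and a delta-method Wald statistic for it, written in Python rather than Mplus. The function names and all numbers are hypothetical, and the way the software computes the standard error of a new parameter may differ in detail; this only illustrates the arithmetic.

```python
import math

def residual_remainder(lam, psi):
    """Residual variance remainder: theta = 1 - lambda^2 * psi."""
    return 1.0 - lam * lam * psi

def wald_z_for_remainder(lam, psi, var_lam, var_psi, cov_lam_psi):
    """Delta-method z statistic for H0: theta = 0 (hypothetical helper)."""
    theta = residual_remainder(lam, psi)
    # Partial derivatives of theta with respect to lambda and psi.
    d_lam = -2.0 * lam * psi
    d_psi = -lam * lam
    var_theta = (d_lam ** 2 * var_lam
                 + d_psi ** 2 * var_psi
                 + 2.0 * d_lam * d_psi * cov_lam_psi)
    return theta, theta / math.sqrt(var_theta)

# Hypothetical estimates: loading 1.1 with standard error 0.2, and a factor
# variance fixed at 1 (so psi contributes no sampling variance).
theta, z = wald_z_for_remainder(1.1, 1.0, 0.2 ** 2, 0.0, 0.0)
print(round(theta, 3), round(z, 3))
```

With a loading above 1 and a unit factor variance the remainder is negative, which is the situation described in the question.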


Thanks Bengt for your answer. I indeed saw the negative residual variance doing ESEM with WLSMV and the Delta parameterization on a 2-factor model measured by a total of 5 ordinal variables. 1. But if one uses the Theta parameterization, the residual variances should be free parameters, shouldn't they? In this case, could the Wald test be used directly on the residual variance? 2. About the Wald test, I was wondering if it is possible to do a one-sided test of H0: residual >= 0 (as opposed to the two-sided test H0: residual = 0), as suggested in the article by Kolenikov & Bollen, "Testing negative error variances: is a Heywood case a symptom of misspecification?" http://web.missouri.edu/~kolenikovs/ Hope these questions make sense! Thank you very much for your help.


1. With the theta parameterization the residual variance is fixed to 1 (unless you have a multiple-group situation), so in a way this gives you the residual variance > 0 condition. The residual variance is not a free parameter because it is still not identified, so it has to be fixed to a value that determines the parameterization. For the theta parameterization that value is 1. 2. In principle yes; this amounts to dividing the p-value you get by 2. But again, with the theta parameterization you cannot do this at all because the residual variance is fixed to one. In the delta parameterization you can do this using the method Bengt outlined above, i.e., by making a new parameter in MODEL CONSTRAINT that is equal to the residual variance parameter. The residual variance parameter in the model is not really a regular parameter; it is a dependent constrained parameter that you cannot access directly, so you have to make your own duplicate of it.
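The "divide the p-value by 2" remark can be checked with a small Python sketch. It assumes a normally distributed z statistic and the one-sided hypothesis H0: residual >= 0 against H1: residual < 0; the function names are made up for illustration. Halving the two-sided p-value is only correct when the estimate falls on the side of the alternative (a negative estimate here).

```python
import math

def normal_cdf(z):
    """Standard normal cumulative distribution function."""
    return 0.5 * math.erfc(-z / math.sqrt(2.0))

def two_sided_p(z):
    """Two-sided p-value for H0: parameter = 0."""
    return math.erfc(abs(z) / math.sqrt(2.0))

def one_sided_p_negative(z):
    """p-value for H0: parameter >= 0 against H1: parameter < 0."""
    return normal_cdf(z)

z = -1.5  # hypothetical z statistic for a negative residual variance
print(two_sided_p(z), one_sided_p_negative(z))
```

For a positive estimate the one-sided p-value is instead one minus half the two-sided value, so the null hypothesis of a non-negative variance is never rejected in that case.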

RDU posted on Saturday, February 13, 2010  12:29 am



I have a question concerning the residual variance provided in the standardized model results for categorical outcomes. Since this is a standardized solution, then are the residual variances listed standardized residual variances or are they unstandardized? Furthermore, if they are standardized then how exactly does one obtain the unstandardized residual variances (Is it similar to what it would be in regular regression?). Thanks. 


I think you are asking about the residual variances that are printed with R-square. These are raw coefficients that are computed as a remainder from the model estimated results. They are not estimated as part of the model. Categorical outcomes do not have variance parameters.

RDU posted on Saturday, February 13, 2010  10:32 am



To better clarify my question, I was referring to the residual variances that are provided using Theta parameterization (WLSMV estimation) with categorical data, where standardized model results can be requested. As part of the standardized output, residual variances are given along with each item's R^2. Thus I was wondering whether the residual variance is standardized since it is part of the standardized model output, or whether it is an unstandardized estimate. If it is in fact a standardized residual variance, I also wanted to know if an unstandardized estimate could be obtained or calculated by hand. Thank you for your response. 


With the Theta parameterization, scale factors are given with R-square. If you have further questions about this, please send the full output as an attachment and your license number to support@statmodel.com.

RDU posted on Saturday, February 13, 2010  11:37 am



Yes, I apologize for the confusion. I believe I mistook the scale factors from Theta parameterization for the residual variances provided in delta parameterization. 

RDU posted on Saturday, February 13, 2010  1:40 pm



Given the previous question I am also curious as to whether the residual variances given for delta parameterization are standardized or unstandardized, as theta parameterization was said to provide scale factors and not residual variances...Is this correct? 


Neither the scale factors nor the residual variances presented with R-square are standardized.

Hemant Kher posted on Thursday, April 21, 2011  8:04 am



Hello Professor Muthen, I have a question, and I hope that you can provide some insights. My question is related to a multiple-indicator latent curve model. Using the CFA approach, I estimate a latent construct for 4 different time points (factors F1, F2, F3 and F4, each estimated using the same 4 items). I followed directions to establish measurement invariance (same scale indicator at each time, loadings for non-scale items constrained equal across time, and equal intercepts for non-scale items). The CFA model works fine with a good fit. However, when I fit a growth model on the factors, I get a negative residual variance for the first factor (F1); the residual variance is small and statistically insignificant (-0.026, Z = -0.354, p = 0.723). When I fix this residual variance for F1 to zero (f1@0;), the change in model fit is negligible and not statistically significant. But I am not sure if doing this (setting the factor residual variance to zero) is reasonable. Your thoughts at your convenience would be appreciated.


Negative residual variances typically reflect a misspecified model. For instance, perhaps a nonlinear growth model is more suitable. Also, instead of fixing the residual variance at zero, you could try holding them equal across time. 

Hemant Kher posted on Thursday, April 21, 2011  8:26 am



Professor Muthen, thank you for a quick response. Holding the factor residual variances equal across time has solved the problem.

Katja posted on Monday, January 14, 2013  1:14 am



Hi! I have a question regarding a negative residual variance. There was a post: "If the negative residual variances are large, this is a sign that your model is not appropriate for your data and you need to change your model. If they are small, you may want to fix them to zero. Residual variance are often small on the between level of multilevel models." What is considered a large/small negative variance? I have a negative residual variance of -.083. Can I fix it to zero? Thank you!


Try fixing it to zero and see how this affects the results. 


Hello, there seem to be two approaches to handling negative residual variances. The first is to fix the residual variance to 0 or a small positive value. The second is to use MODEL CONSTRAINT to constrain the variance to be greater than zero. These approaches seem to be different because the first one delivers an additional degree of freedom whereas the second does not. So I am wondering which approach you consider more appropriate and why. Many thanks, Johannes


It is common practice to fix small insignificant negative residual variances to zero or constrain them to be greater than zero. These approaches are basically the same. Neither is optimal. A better choice is to change the model.

Linh Nguyen posted on Saturday, July 19, 2014  9:31 pm



Hello Linda, I am also having the negative residual variance problem with my measurement and structural models. The problem is at the second-order construct, which has only 2 first-order indicators. So when I run a CFA for this second-order construct, it is unidentified (my sample size is 364). I have therefore set the factor's loadings to be equal (tau-equivalence) in order to run the CFA and validate the construct individually. The CFA model then fits very well, with an insignificant chi-square, CFI = .995, RMSEA = .04, SRMR = .02. The overall measurement model (with 5 other constructs, 32 observed variables) also fits adequately. In my structural model, if I did not set the factor loadings of the above construct to be equal, I would get 1 negative residual variance (from the above construct's indicator). The negative residual variance was -.017, insignificant. If I continued to employ the tau-equivalence assumption, the model would be fine. Can I set the construct's factor loadings to be equal, as in the CFA model, for my structural model to solve the problem? Thank you very much! I am looking forward to hearing from you. Kind regards, Linh

Linh Nguyen posted on Saturday, July 19, 2014  10:35 pm



With regard to the second-order construct mentioned above, even though it has 2 first-order indicators, each first-order indicator has 3 observed variables. Thanks, Linh


I would use the CFA model of equal loadings in the SEM model. 


Hi Linda Thanks so much for your quick response! It really helps. Sorry for my late reply cause I could not post any message on the website until now. I am writing up my analysis, just wondering if you can give me some references for using the CFA model of equal loadings in the structural model? I have been searching for journal articles about the issue but could not find the proper one. Thank you Kind regards Linh 


I don't know of any reference. A model with two second-order factors is not identified unless you make some constraint like equal loadings. Ideally you would have more second-order factors so this is not necessary.


Thanks Linda 

Ted Fong posted on Friday, August 01, 2014  2:12 am



Dear Dr. Muthén, I understand that a 2nd-order factor model with just two first-order factors is typically unidentified. When the model is made identified with a model constraint such as equal loadings on the 2nd-order factor, this model should have the same df as the two-factor correlated model. My question is: is it possible for an identified 2nd-order factor model with two first-order factors to have a lower df than the two-factor correlated model? I have recently come across a paper where the former model has 2 df fewer than the latter one. Thanks very much, Ted


This question is better suited for a general discussion forum like SEMNET. 


Hi Dr Muthen I am running a full structural equation model (with categorical outcome) and am having trouble. While it runs and is terminated normally, I still get the error message about a negative residual variance: "THE MODEL ESTIMATION TERMINATED NORMALLY WARNING: THE LATENT VARIABLE COVARIANCE MATRIX (PSI) IS NOT POSITIVE DEFINITE....PROBLEM INVOLVING VARIABLE DLMORAL." After reading online I fixed the residual variance to 0 (DLMORAL@0) , but am still getting this error message. Am I able to still use the estimates provided or is there something else I can do to fix this problem? Although having it fixed @0 means I do not have estimates for two correlations involving this variable. This is the starting model so I want to make sure there are no issues before I modify it. I have run CFA on all independent variables before running the full model and there were no problems with this variable. Thanks in advance for your assistance 


I should add that my sample size is only 320 but that has not been a problem for several other models (including one similar) that use this variable. Does this mean that this issue is caused by the addition of one latent variable and one observed variable (that does not correlate with the variable in question)? Sorry about the second post. 


Please send the output and your license number to support@statmodel.com. Be sure to include TECH4. 

Linda Lin posted on Tuesday, March 10, 2015  3:04 pm



Can you explain how Mplus decomposes the total outcome variance and calculates the level-2 residual variance? I tried to demonstrate it as below: the total variance of the outcome variable can be decomposed as [level-1 residual variance] + [level-1 explained variance] + [level-2 residual variance] + [level-2 explained variance]. That is, for example:

usevariables are class x y;
within=;
between=;
cluster=class;
Analysis: type=twolevel;
Model:
%within%
y on x (a);
y (b);
x (c);
%Between%
y (d);
x (e);
y on x (a);

I expect that the total Var(y) should equal b + c*a^2 + d + e*a^2. But I found that this formula does not reproduce the total variance of y accurately.


I don't know if you compare your formula to the total variance in the sample or the model-estimated total variance. If you don't have perfect fit, those two are different. The formula is correct. You use the label "a" on both levels; is that what you intend? If this doesn't help, send output to support@statmodel along with your license number.
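The decomposition discussed here, total Var(y) = b + c*a^2 + d + e*a^2, can be written out as a short Python sketch. The function and argument names are hypothetical, and the numbers are made up; the sketch keeps separate within and between slopes so that the equal-slope case (the shared label "a") is just a special case.

```python
def model_total_variance(slope_within, resid_within, var_x_within,
                         slope_between, resid_between, var_x_between):
    """Model-estimated total variance of y in a two-level regression of y on x."""
    explained_within = slope_within ** 2 * var_x_within
    explained_between = slope_between ** 2 * var_x_between
    return resid_within + explained_within + resid_between + explained_between

# Hypothetical estimates with the slope held equal across levels (label a):
a, b, c, d, e = 0.5, 1.0, 2.0, 0.3, 0.4
total = model_total_variance(a, b, c, a, d, e)
print(total)  # b + c*a^2 + d + e*a^2
```

This is the model-estimated total variance; it need not match the sample variance of y unless the model fits perfectly.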

Linda Lin posted on Wednesday, March 11, 2015  7:12 pm



Thanks for your answer, Dr. Muthen. To clarify my question above: 1) I intend to constrain the coefficients to be the same across both levels. 2) I compared the total variance estimated from the model below with the total variance estimated from the model above.

usevariables are class y;
within=;
between=;
cluster=class;
Analysis: type=twolevel;
Model:
%within%
y (e);
%Between%
y (f);

total variance = e+f. I expected the total variances to be the same. Is this e+f the sample variance you mentioned? Thanks!


e+f is the model-estimated total variance that I referred to. Your first model was

Model:
%within%
y on x (a);
y (b);
x (c);
%Between%
y (d);
x (e);
y on x (a);

But because you have an equality constraint "a", you won't necessarily get the same total variance; the equality may not hold. Try it without the equality.


Dear Linda or Bengt, I am fitting a two-level factor model with WLSMV on binary data, and I would like to report the within-level residual variances, although they are not free parameters. Can I calculate them as 1 - lambda_within^2 * phi_within? Or are the residual variances at the within level fixed at 1, as in the theta parameterization? Thanks in advance, Sanne


If you ask for STANDARDIZED in the OUTPUT command, you will obtain residual variances which are computed as remainders. They are at the end of the standardized output with R-square.


Dear Linda, thanks for pointing me to this output. I do not see the residual variances, but I do get scale factors, which are 1/sd(underlying response variable) if I understand correctly. I then calculated the total variance as 1 / scale factor^2, and the residual variances as total variance - explained variance. The result is around 1 for all variables: 1.002 1.002 1.000 1.001 1.003 0.997 0.999 0.998 1.001. This gives the impression that the residual variances are effectively constrained to be 1, and that the small deviations I find are due to rounding error. Could that be correct? If so, I will just report (unstandardized) residual variances of 1.


Yes, the residual variances are fixed at 1 because the Theta parameterization is used for 2-level WLSMV.
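The back-calculation described in this exchange (total variance = 1 / scale factor^2, residual variance = total variance - explained variance) can be reproduced with a small Python sketch. The loading and factor variance are hypothetical, and the scale factor is rounded to three decimals to mimic printed output, which is an assumption about where the small deviations from 1 come from.

```python
def residual_from_scale_factor(scale_factor, explained_variance):
    """Residual variance implied by a scale factor = 1 / sd of the latent response."""
    total_variance = 1.0 / scale_factor ** 2
    return total_variance - explained_variance

# Hypothetical item: loading 0.9, factor variance 1, residual variance fixed at 1.
lam, psi = 0.9, 1.0
explained = lam * lam * psi
exact_scale = (explained + 1.0) ** -0.5   # 1 / sqrt(total variance)
printed_scale = round(exact_scale, 3)     # what a three-decimal printout would show

resid_exact = residual_from_scale_factor(exact_scale, explained)
resid_printed = residual_from_scale_factor(printed_scale, explained)
print(resid_exact, resid_printed)
```

The unrounded scale factor returns the fixed value of 1, while the rounded one is off in the third decimal, which is consistent with the values around 1 reported above.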


Thank you very much! 


What is considered a large/small negative variance? I have a negative residual variance. Can I fix it to zero, and how?


You say y@0; No rules of thumb, but if fixing it at zero didn't change the fit very much then it wasn't big. 

Bo Y posted on Monday, February 08, 2016  2:32 am



Hi Drs. Muthen, similar to the above questions, I ran into the warning message below when I fitted a cross-lagged model. The model fit indices seem acceptable after modification. I have a relatively small sample size: 102 for wave one and 80 for wave two. DCI23 is an important observed variable in this model. Is there a technique that I could apply to fix this? Thank you very much! THE MODEL ESTIMATION TERMINATED NORMALLY WARNING: THE LATENT VARIABLE COVARIANCE MATRIX (PSI) IS NOT POSITIVE DEFINITE. THIS COULD INDICATE A NEGATIVE VARIANCE/RESIDUAL VARIANCE FOR A LATENT VARIABLE, A CORRELATION GREATER OR EQUAL TO ONE BETWEEN TWO LATENT VARIABLES, OR A LINEAR DEPENDENCY AMONG MORE THAN TWO LATENT VARIABLES. CHECK THE TECH4 OUTPUT FOR MORE INFORMATION. PROBLEM INVOLVING VARIABLE DCI23.


No way to change this without changing the model. If fit didn't change very much as stated above, then fixing it to zero should be okay. 

Bo Y posted on Tuesday, February 09, 2016  1:34 am



Thanks a lot, Linda. I would like to make sure I put in the syntax right. Would you please let me know whether the following is OK? Following your advice, I add one line in MODEL section, which is "DCI23@0". Then the CFI increased a bit from .975 to .976, and TFI increased a little bit too from .960 to .962, but still I got the same WARNING msg below. Do I need to do anything further?  THE MODEL ESTIMATION TERMINATED NORMALLY WARNING: THE LATENT VARIABLE COVARIANCE MATRIX (PSI) IS NOT POSITIVE DEFINITE. THIS COULD INDICATE A NEGATIVE VARIANCE/RESIDUAL VARIANCE FOR A LATENT VARIABLE, A CORRELATION GREATER OR EQUAL TO ONE BETWEEN TWO LATENT VARIABLES, OR A LINEAR DEPENDENCY AMONG MORE THAN TWO LATENT VARIABLES. CHECK THE TECH4 OUTPUT FOR MORE INFORMATION. PROBLEM INVOLVING VARIABLE DCI23. 


Please send the output and your license number to support@statmodel.com. 


I am estimating a structural equation model which has two time points of executive function (EF). The time 2 EF has a negative residual variance; I believe it's because the two time points are so highly related. (The standardized output shows that the beta from time 1 EF to time 2 EF is greater than one.) I would like to set the residual variance of time 2 EF to zero. Does it make sense to do that and still use time 2 EF as the dependent variable in a regression? My understanding is that a regression equation is predicting variance, so if you set the variance to zero there is nothing to predict.


I would not recommend going in that direction. I assume EF is a factor with several indicators. If so, explore across-time residual correlations for the same indicators. That may reduce the factor correlation.


Hi Drs. Muthen, I'm running a quadratic latent growth curve model and am getting an error message because there is a negative residual variance for both the slope and quadratic factor. If I restrict the slope and quadratic factor residual variances to 0, the model fit worsens. However, if I restrict the only time point with a nonsignificant residual variance to zero, and restrict the slope to 0, the model fits well and q no longer has a negative residual variance. Is this OK for me to do? Specifically, this is my residual variance output (columns: estimate, S.E., est./S.E., two-tailed p-value):

Residual Variances
M6_IBQ     0.934  0.364  2.568  0.010
M9_IBQ     0.311  0.147  2.119  0.034
M12_IBQ    0.915  0.322  2.841  0.005
M24_ECBQ   4.037  3.100  1.302  0.193
I          0.022  0.316  0.070  0.944
S          9.129  5.135  1.778  0.075
Q          3.853  2.053  1.877  0.061

When I restrict s@0 and q@0, the model fit worsens. However, if I restrict s@0 and M24_ECBQ@0, the model runs, q no longer has a negative residual variance, and the model fit is better. Is this OK for me to do? Sorry for my lengthy post; any information is greatly appreciated.

Mplus User posted on Thursday, January 25, 2018  1:58 pm



I tested a 6-factor bifactor model in CFA and detected a negative error variance for one of my observed variables. The value was quite small (-.005) and nonsignificant, so I followed the advice given on this forum to fix the error variance. I have 2 questions: 1. Should I fix the error variance to 0 or to a small positive number (e.g., .01), or does it not matter? 2. I hope to compare the model fit of a second-order factor model with a bifactor model (the former is nested within the latter). There are no negative residual variances in the second-order factor model. If I fix the error variance in the bifactor model, should I also fix the error variance in the second-order model to be consistent?


1. It doesn't matter. 2. I would not fix the residual variance in either model when comparing the two. 

Mplus User posted on Friday, January 26, 2018  5:15 am



Thank you! 


I am conducting a two-level EFA with binary indicators (WLSMV estimator), and some of the solutions include one small, non-significant negative residual variance at the between level (z < .15). This occurs even when over-extraction does not appear to be an issue (e.g., only 2 between factors were extracted, and 1 between factor is a poor fit to the data). I would typically fix these small residual variances to zero, but since ESEM is not available with type = twolevel, I do not think that is possible. Is there a way to fix residual variances at the between level to zero in a two-level EFA? If not, can the resulting solution be interpreted, despite the small negative residual variance? Thank you very much!

Julia Wolke posted on Wednesday, May 16, 2018  1:40 am



I'm trying to run a longitudinal second-order CFA model. My model has 3 latent factors on the first level (each based on 4 indicators) and one second-order factor. There are two measurement points. I've set up the longitudinal model with configural invariance and correlated residuals over time (both for the indicators and the latent first-order factors). And here the problem already occurs: psi is not positive definite. One of the first-order latent factors has a negative residual variance (at both measurement points) and a loading on the second-order factor > 1. The factor is also very highly correlated with one of the other latent first-order factors (> .90) at both times. (Nevertheless, the model fit is quite acceptable.) I'm not sure how to deal with that. I've tried to constrain the residual variances to be > 0 as stated above, but that doesn't work ("unknown parameters"), maybe because it's a latent variable? When fixing the parameter @0, the problem/warning message remains. I'd be glad to hear any other suggestion about what to do now. Thanks a lot!


Check if the model fits well at each time point and shows no such problems. 

Julia Wolke posted on Thursday, May 17, 2018  12:42 am



The problem is indeed there at each time point (but both models fit acceptably).


It sounds like you should revise the model at each time point. 


Good afternoon Drs. Linda and Bengt Muthen, could you explain what exactly is being done by setting an indicator of a factor equal to one versus setting it equal to zero? I want to make sure that, whichever of the two I do, I understand what is being done. Thank you very much for your time.


Our Short Course Topic 1 video and handout is a good way to understand the basics of factor analysis and will give insights to these questions. 


Hi, I ran a higher-order factor model (5 first-order factors and one higher-order factor) that was estimated using ESEM-within-CFA (using results from a first-order model). I first had a large negative residual variance for one of the first-order factors. When I allowed a residual covariance between two first-order factors, I no longer had this problem, but standardised factor correlations exceeded 1. Looking at the raw results, the factor correlations are fine. My question thus is whether I can use the raw results even though the standardised results are problematic. Thanks


I think the standardized factor correlations that you refer to may instead be residual factor correlations. In any case, neither raw nor standardized results should have correlations greater than 1. Perhaps it is better to fix the residual variance at zero.


Thanks for your reply. I'm referring to the standardised output F5 WITH F4 (both first-order factors), which gives me 1.053 when I allow a residual covariance between the two. When I look at the raw output, F5 WITH F4 gives me .833. When I instead fix the residual variance of F4 to 0 (f4@0), all the residual variances are fine but I get HF BY F4 = 1.041 (HF = higher-order factor). So I assume this also just indicates that my model is not OK and I should probably stick to the first-order model, which had a good fit, right? Can I still report the fit indices for the higher-order model despite negative residual variances or correlations greater than 1? Or would I just say that the model does not converge properly? Thank you.


If you divide .833 by the product of the square roots of the F4 and F5 residual variances, you would get 1.053. Q1: right. Q2: It converges, but you get negative residual variances.
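The relation between the raw and standardised F5 WITH F4 values can be checked with a brief Python sketch. Only .833 and 1.053 come from the thread; the residual variances themselves are not reported, so the sketch backs out the implied product of their square roots and then uses equal variances purely as a hypothetical illustration.

```python
import math

def correlation_from_covariance(cov, var_a, var_b):
    """Correlation = covariance divided by the product of the standard deviations."""
    return cov / (math.sqrt(var_a) * math.sqrt(var_b))

raw_cov = 0.833    # raw F5 WITH F4 value reported in the thread
std_corr = 1.053   # standardised F5 WITH F4 value reported in the thread

# Product of the two residual standard deviations implied by those two numbers.
implied_sd_product = raw_cov / std_corr

# Hypothetical case: both residual variances equal, so each equals the product.
check = correlation_from_covariance(raw_cov, implied_sd_product, implied_sd_product)
print(round(implied_sd_product, 4), round(check, 3))
```

Because the implied product of standard deviations is smaller than the covariance, the ratio exceeds 1, which is why a raw value below 1 can still correspond to an inadmissible residual correlation.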

Pia Kreijkes posted on Wednesday, January 30, 2019  6:14 am



Thank you for your last response. I have another question about negative residual variance, this time in a two-level analysis. When I perform an EFA with type=twolevel, I get a small negative residual variance on the between level for one item (the within level is fine). Is there a way I can fix this to 0? ESEM with type=twolevel does not seem to be available yet. Could I do this in the EFA-within-CFA framework instead? Also, my variables are on an ordinal level so I define them as categorical (in case this makes a difference).


Yes you can use EFA within CFA and fix the residual variance on the between level to 0, but if it is not significant anyway maybe you don't even need to do that. 


Dear Dr Muthen, I am testing a structural equation model across three groups (multigroup analysis). I have a dichotomous outcome and two predictors, one of which is categorical. For the categorical variable I use dummies that are composed with a DEFINE statement. I use Theta and WLSMV. The model runs fine but I get this warning message: "WARNING: THE LATENT VARIABLE COVARIANCE MATRIX (PSI) IN GROUP COHORTB3 IS NOT POSITIVE DEFINITE. THIS COULD INDICATE A NEGATIVE VARIANCE/ RESIDUAL VARIANCE FOR A LATENT VARIABLE, A CORRELATION GREATER OR EQUAL TO ONE BETWEEN TWO LATENT VARIABLES, OR A LINEAR DEPENDENCY AMONG MORE THAN TWO LATENT VARIABLES. CHECK THE TECH4 OUTPUT FOR MORE INFORMATION. PROBLEM INVOLVING VARIABLE STABL." When I check my residual variances I see that there is one negative residual variance, for variable stabl in cohort 3 (one of the dummy variables). When I constrain the residual variance to 0 (stabl@0;) I get the following error: "THE STANDARD ERRORS OF THE MODEL PARAMETER ESTIMATES COULD NOT BE COMPUTED. THE MODEL MAY NOT BE IDENTIFIED. CHECK YOUR MODEL. PROBLEM INVOLVING PARAMETER 10." Do you know what might be happening with this one dummy variable in cohort 3 (I checked sample sizes; this seems not to be the problem)? Are there any other solutions that I could try? Thanks in advance!


We need to see your full output; send it to Support along with your license number.


Also, I am confused why you have a residual variance for one of the dummy variables if it is a covariate. Such parameters are not included for covariates; perhaps you have brought it into the model.
