

Hello. I am conducting a CFA (MLR estimator) with three continuous indicators across 4 groups. For one of my groups, I get one nonsignificant, negative residual variance (near zero, within the 95% CI) when my intercepts are freely estimated and the factor mean is set to 0 for all groups (my Model 1). I could set the residual variance to 0, but then I get a standardized factor loading and R-square of 1.000, which I do not consider "useful" information. If I rerun Model 1 for just that group, I get the same results. But when I constrain factor loadings to be equal across groups (free intercepts, factor mean at 0) for my Model 2, I get a positive residual variance for that variable and group (.609), a value still within the original 95% CI of Model 1. To rerun Model 1, might I set the residual variance for that variable for that group to .609 instead of zero so that I get "usable" estimates?

bmuthen posted on Saturday, February 26, 2005  4:54 pm



If the model with invariant loadings fits well, you could make an argument that what you propose is reasonable: you are borrowing information from the other groups to get a better estimate for that residual variance. It is a bit ad hoc, however, since the fixed value of .609 has sampling variability, so the resulting SEs of the model might need to be taken with a grain of salt (i.e., work with conservative tests of parameter significance).
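
As a rough sketch of the proposal above, the residual variance can be fixed at .609 in only the affected group through a group-specific MODEL command. The data file name, the variable names (y1-y3, grp), the group labels (g1-g4), and the choice of g2 and y3 as the affected group and indicator are all hypothetical; only the value .609 and the Model 1 setup (free loadings, free intercepts, factor means at 0) come from the posts.

```mplus
DATA:
  FILE IS mydata.dat;          ! hypothetical file name
VARIABLE:
  NAMES ARE y1 y2 y3 grp;      ! hypothetical variable names
  GROUPING IS grp (1 = g1 2 = g2 3 = g3 4 = g4);
ANALYSIS:
  ESTIMATOR = MLR;
MODEL:
  f BY y1 y2 y3;
  [f@0];                       ! factor mean fixed at 0 in every group
MODEL g2:
  f BY y2 y3;                  ! loadings free in this group
  [y1 y2 y3];                  ! intercepts free in this group
  y3@.609;                     ! residual variance fixed at the Model 2 estimate
MODEL g3:
  f BY y2 y3;
  [y1 y2 y3];
MODEL g4:
  f BY y2 y3;
  [y1 y2 y3];
```

Because .609 is itself an estimate, the standard errors from such a run do not reflect its sampling variability, which is why the reply suggests conservative significance tests.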

rpaxton posted on Saturday, February 18, 2006  9:06 pm



I am trying to confirm the factor structure of 2 second-order CFAs with 5 factors each. For some reason the residual variance for one of the factors is negative. Would you recommend deleting that factor? What steps should be taken to handle this situation?

bmuthen posted on Sunday, February 19, 2006  2:45 pm



You could fix that residual variance at zero. That would mean that the first-order factor is a perfect indicator of the second-order factor. This happens in some instances.

rpaxton posted on Sunday, February 19, 2006  4:27 pm



How would I fix the residual variance to zero? This is my model statement: f1 by var2-var3; . . . F10 by var27-var30; exper by f1-f5; behav by f6-f10; !............................... Should I just say F9@0 below the final statement? Thanks

bmuthen posted on Sunday, February 19, 2006  4:42 pm



That's right. 

Valeriana posted on Tuesday, March 21, 2006  7:41 am



Hi, I'm trying to use a CFA model to test convergent and discriminant validity. However, almost all the residual variances are nonsignificant. If I fix them at zero, indexes such as "composite reliability" or "average variance extracted" or any other reliability index will be inflated. What should I do? Thanks.


I would not fix insignificant residual variances at zero. Having residuals is well motivated, and the fact that they are nonsignificant may merely mean that you have a smallish sample size.


After looking through various discussion boards, I figured out how to fix my problem of having a negative residual variance for my variable dep3. I just added a line dep3@0 to the model command: VARIABLE: NAMES ARE ...; MISSING = ALL (99); USEVAR = dep1 dep2 dep3 critsen1 critgf1 das1; ANALYSIS: TYPE = MEANSTRUCTURE MISSING; MODEL: i s | dep1@1 dep2@2 dep3@3; i s ON critsen1 critgf1 das1; dep3@0; OUTPUT: TECH4 SAMPSTAT STANDARDIZED MODINDICES (3.84); I am pleased to have figured out how to fix the problem. However, I do not fully understand why setting that residual variance to zero allowed the model to run. Can you offer me any help in understanding this at a more applied level? Thanks! Renee


Fixing a negative residual variance at zero is done if the residual variance is a small negative value and not significant. Otherwise, the model should be changed. The reason that this is a problem is that variances cannot be negative by definition.

Suzanne Jak posted on Tuesday, April 22, 2008  1:09 am



Hello, I'm fitting 2 first-order factors and 1 second-order factor to 7 continuous observed variables. Using scaling in lambda, the residual variance of the first factor is negative. My model runs well when I fix the variance of this factor to 1 and keep the factor loading of the first factor at 1. Q1: Is this bad practice? Q2: I thought it is useless to have 1 indicator for a factor, so I regressed the variable 'kub' directly on the second-order factor. Is this ok? This is my input: Title: factormodel met weging sommen op 2 factoren, f1 fixed op 1 Data: FILE IS schaal.dat; Variable: NAMES ARE ana cijf fig kub som syl voc w; WEIGHT IS w; Analysis: ESTIMATOR = MLR; Model: f1 BY fig ana syl cijf som; f1@1; f2 BY som voc; f3 BY f1 f2 kub; Output: sampstat standardized modindices tech4 tech1; Thanks! Suzanne


You should not set the metric of the factor by both fixing a factor loading to one and fixing the factor variance to one. If you relax one of these restrictions, the model is not identified. You need a minimum of three first-order factors for the model to be identified without making perhaps unrealistic restrictions on the model.

JPower posted on Tuesday, January 20, 2009  9:28 am



Hello, I'm conducting a CFA of a four-factor scale with ordinal indicators using WLSMV estimation. One of the factors has only 2 indicators, and for these indicators there is limited variability in responses. Item s: Category 1 0.969, Category 2 0.026, Category 3 0.004, Category 4 0.001. Item o: Category 1 0.975, Category 2 0.016, Category 3 0.006, Category 4 0.002. My fit statistics are reasonable (CFI, TLI >0.95, RMSEA, SRMR = 0.07). However, I get a warning about theta not being positive definite and there is a negative residual variance for item s (-0.003). Would dropping this factor be reasonable given the limited variability in responses and the negative residual variance? What would you suggest as next steps? Thanks.


In my opinion a factor with two indicators is not generally believable given that it is not identified without borrowing from other parts of the model. In your case, I would use one of the factor indicators as an observed variable in the model. 


Hello, I am trying to establish measurement invariance in a measurement model before proceeding to a multigroup structural model. There are two latent variables and two groups. When I ran it constraining the factor loadings to be equal, it ran fine, but when I free the factor loadings for the unconstrained model, I get one negative residual variance in one group (leading to the "not positive definite" error message). The intercepts are also freed. The fit indices for the unconstrained model are chi sq=15.899, df=12, CFI=.993, RMSEA=0.043, SRMR=0.033. Can these results be interpreted with a negative residual variance? Thank you!


You should first find that the factor model fits well in each group before proceeding to test for measurement invariance. It sounds like this is not the case. You might want to start with an EFA in each group to establish that the same number of factors is found in each group and then proceed to a CFA in each group.


Hi Linda, Thanks for your reply. I did run separate CFAs for the two groups individually first, to establish configural invariance. Both seem to be good fits for the data. The fit indices are: Girls only: Chi sq 6.083, df 6, CFI 1.00, RMSEA 0.009 (0.000-0.101), SRMR 0.021. Boys only: Chi sq 9.816, df 6, CFI 0.981, RMSEA 0.060 (0.000-0.125), SRMR 0.042. The negative residual variance is in the boys group.


Hi again Linda, I forgot to add that even though I have good fit in both groups, you are correct in that the same error message occurs for the boys group (the group with the negative residual variance). I will run an EFA for the boys group and I am also considering parcelling these indicators. Thanks for your help, Nick 


Hello, I have a 6-factor CFA model with 13 indicators. In the first 6-factor model I fit, one indicator had a negative residual variance that was very small (-.05) and nonsignificant, which I fixed at zero without any theoretical rationale, just figuring that it was small enough to treat as zero and proceed with this good-fitting model. After testing the first, proposed model against several others with slight theoretically informed adjustments, a new model with one indicator loading on two factors shows a better fit. When I removed the zero residual variance constraint from the one indicator, the actual variance in the newer model is more negative (-.11). If having this model constraint was justified in the first place, would it still be justified in the newer model with the increase in the negative residual variance? Thanks, Matt


It sounds to me like you should start with an EFA of your factor indicators to see if the items are behaving as expected. Making adjustments to a CFA model without previously doing an EFA to study the items can result in a misspecified model. A factor with two indicators is not identified without borrowing information from other parts of the model. I would hesitate to use such a factor. 


Thank you Linda. What do you mean by misspecified? I understand how a factor with two indicators is not identified, but if the model as a whole is identified, what does misspecification mean? Here is where my question is coming from: I proposed that a certain factor structure would arise via principal components or EFA, but it was strongly suggested that I use CFA. Looking at it both ways, 6 factors will not converge in an EFA, but when I run a CFA with the proposed structure the model is a good fit with the exception of the small negative residual variance. So should I take the nonconvergence in EFA as a strong hint that 6 factors are not appropriate at all, or is there a possibility that a confirmatory model with 6 factors is still feasible? I was hoping to move to factor mixture analysis with this CFA but that may not be a good idea either... Thanks again, I really appreciate your assistance. Matt


A misspecified model is a model that does not correctly represent the data, which I think you know. Beyond that, I am saying that CFA models are often proposed based on theory and estimated using data that may not measure the constructs represented in the theory well. An EFA can often help in seeing this. If you have to modify the CFA by, for example, fixing residual variances to zero, this may point to a problem with the model. If a 6-factor CFA fits the data well but an EFA will not converge, the CFA may be a fragile model that will not be replicated with other data. Note that factor mixture analysis usually has fewer factors than a regular CFA.


I definitely had an idea that my issue was something bigger than a negative residual variance. Thank you for helping me clarify that. matt 


Hello again, I now have a question with regard to negative residual variances. This is the model I have: PHYSIO BY haz4 waz4 baz4 haemo2; MENTAL BY WJ3 WJ2 WJ5 stpea; MOTOR BY carty_1a carty_3a carty_5a car2b car6b car4b; MOTOR ON age; comp1 by CD1 CD12 CD2 CD4 CD6; comp2 by CD5 CD7 CD8 CD9 CD18 CD19; comp3 by CD3 CD11 CD13 CD14 CD15 CD16 CD17 CD20 CD25; comp4 by CD21 CD22 CD23 CD26 CD24; comp1 on sex; comp2 on sex; comp3 on sex; comp4 on sex; motor on sex; mental on sex; SUBJ BY comp1 comp2 comp3 comp4; OBJ BY PHYSIO MENTAL MOTOR; SUBJ on ANIO_INC TIPO_LOC LOCAL; OBJ on ANIO_INC TIPO_LOC LOCAL; All of the indicators are categorical, except the ones in MENTAL. I am getting negative residual variance for WAZ4. I have tried changing the model, specifically the PHYSIO part by: (a) splitting it into 2 latent variables, but I get theta not positive definite involving waz4, and psi not positive definite involving one of the new latent constructs. (b) taking out waz4 from the model. With this, I get no error messages but the loading of baz4 changes from being significant to nonsignificant. Any suggestions? I really appreciate your help, thank you very much and best regards, Laura Valadez 


I would start with an EFA to see if the CFA you specify is valid. If so, I would do the CFA with no covariates in the model until I get a well-fitting CFA model. Only then would I add covariates.


Dear Linda, Thank you for your response. Following the recommendations I read in the forum, I conducted EFAs beforehand, for each of the latent constructs and then for all the indicators. The EFA for PHYSIO comes out with the following: the maximum number of factors is set to 1, so the 4 indicators come out in 1 factor where WAZ4 bears a considerably high loading. WAZ4 3.856, HAZ4 0.192, BAZ4 0.172, HAEMO2 0.023. For this, ChiSq = 12.587 (df=2), CFI=.997, TLI=.994, RMSEA=.053. When I run EFA for all the indicators, haz4 and haemo2 share their highest loadings with indicators of MENTAL. However, haz4 = height-for-age and haemo2 = haemoglobin concentration, and the indicators for MENTAL are results of memory/cognitive tests. This makes me think that, following theory, I should still keep them under PHYSIO. When I did the CFA without controls, I got the following: ChiSq = 1175.263 (df=260), CFI=.918, TLI=.937, RMSEA=.043 and PSI not positive definite for MENTAL. Does all of this mean the model is seriously misspecified? Thanks again, your help is invaluable! Laura Valadez


If in EFA certain variables do not load on the expected factors, then their validity is questionable. If you force them on the expected factors in CFA, the model will not fit well. I think you need to consider the validity of the items that you are using. 


Thank you very much, this is really helpful. I will reexamine the indicators that I have got. Best, LV 


I have a onefactor CFA with 4 indicators. The error variance of one indicator is negative but very small and nonsignificant. I know this is not what is supposed to be. But everything else looks good, though with a warning message. And I like the results. Is it appropriate that I just leave the results there and proceed to explain them with some reasonable arguments? Thanks for your attention. 


Hello. We’re carrying out an EFA followed by CFA with covariates. There are 12 dependent variables and they are categorical (binary and ordinal). Based on EFA results, it appears that there are 4 factors, but the 4th factor only has two items and so was modeled as a correlation between those two items. When I try to add a binary covariate to the model I run into problems. First, there is a negative residual variance for one of the items (avoidanc). I looked and the residual variance is very close to zero. But, when I set that residual variance to zero (using theta parameterization), I get an unidentified model. If I remove the culprit item (avoidanc) from the model it runs without any errors. However, the avoidanc item, which is problematic, is also theoretically important and should remain in the model. So, I’m not sure how to proceed. Here is what the MPLUS model statement looks like: MODEL: tbi BY dizzy@1 headache irritabl sleep memory visual ; ptsd BY nightmar@1 avoidanc onguard detached irritabl ; dep BY LittlInt@1 depress detached sleep ; onguard with sleep ; tbi ptsd dep on AUDC ; avoidanc@0; 


Please send two full outputs, the one without the ON statement and the one with the ON statement, and your license number to support@statmodel.com. 

Siran Zhan posted on Wednesday, October 19, 2011  1:30 am



Hi Dr. Muthen, I have a model with 8 first-order factors and 3 second-order factors. I'd like to clarify if I can specify them in two equivalent ways. One way is that I fix one of the first-order loadings to 1 by default, e.g.: F1 by V1 V2 V3; . . . F8 by V22 V23 V24; F9 by F1 F2; F10 by F3-F6; F11 by F7-F8; Another way is fixing the first-order latents' variance to 1, e.g., F1 by V1* V2 V3; F1@1; . . . Could you tell me if both ways of specification are correct and equivalent? Thank you!


They are correct and equivalent. 
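
The two ways of setting the factor metric can be sketched side by side for a single factor. This is a minimal illustration using the names from the question, not the poster's full input, and the two MODEL commands are alternatives, not parts of one run.

```mplus
! Option A: first loading fixed at 1 (the Mplus default)
MODEL:
  F1 BY V1 V2 V3;

! Option B: all loadings free, variance of F1 fixed at 1
MODEL:
  F1 BY V1* V2 V3;
  F1@1;
```

When F1 is itself an indicator of a second-order factor, F1@1 refers to the residual variance of F1. Applying both restrictions at once (a loading fixed at 1 and F1@1) sets the metric twice, which an earlier reply in this thread advises against.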

SY Khan posted on Sunday, March 02, 2014  10:09 am



Hi Dr. Muthen, My EFA results show that there are 4 factors for binary observed variables. When I conduct a CFA on just one of my factors (which has 3 items), I get a message that the covariance matrix is not positive definite. I tried fixing the starting values of the problem variable (JOBDSCRT) with the following three commands (two of which don't work): 1. AUTOJD BY JOBVARTY@1 JOBDSCRT@0.5 JOBCTRL; (this produces a negative variance for JOBCTRL) 2. AUTOJD BY JOBVARTY* JOBDSCRT@0.5 JOBCTRL@1; (this again results in a negative variance) 3. AUTOJD BY JOBVARTY* JOBDSCRT@0.5 JOBCTRL; (this one works in the individual CFA of this factor). But I realised that the diagram does not show any anchor variable, which could be wrong. Is that so? How should I solve this problem? Further, if I use AUTOJD BY JOBVARTY@1 JOBDSCRT@0.5 JOBCTRL; in my overall CFA with the remaining 10 constructs used in my model, it works. Kindly advise what I am doing wrong. How can I get the appropriate result for my individual construct AUTOJD without losing an item? Many thanks


You should focus on why your CFA gives a negative residual variance and change the model. It seems that you did not translate the EFA into an appropriate CFA. 

SY Khan posted on Monday, March 03, 2014  2:41 am



Hi Dr. Muthen, Thanks for your prompt reply. Sorry, I did not explain myself clearly above. The EFA suggested 4 factors and the CFA confirmed those 4 factors. But when I ran individual CFAs on each of the four factors separately, I got a negative residual variance for an item on one of the factors, i.e. AUTOJD, which has three items: JOBVARTY, JOBDSCRT, JOBCTRL. The negative variance for JOBDSCRT: JOBDSCRT Undefined 0.11371E+01 0.137 I have read from posts above that when the negative variance is not significant, it can be fixed to zero. So I have done the following: PARAMETERIZATION=THETA; AUTOJD BY JOBVARTY JOBDSCRT JOBCTRL; JOBVARTY (1); JOBDSCRT (1); JOBCTRL (1); AUTOJD@1; 1. Doing this I get the results. Is this correct? 2. Also, with the above command the diagram shows that the error terms are the same for all items. Can you please explain why? 3. Which is the better way of dealing with a negative variance: fixing the negative variance at 0 or giving a new starting value? Thanks


It sounds like your model is very fragile if separating the factors reveals this problem. Fixing a residual variance to zero is for continuous variables only. This cannot be used with categorical variables. It should also be used only for small nonsignificant values. I don't believe .137 falls into this category. In estimating the factors together, you draw on information from other parts of the model which you do not do when you estimate each factor separately. You can consider using the four-factor model if it fits well. Given these issues, I suspect it does not. The error terms are the same because you have constrained them to be equal by placing (1) behind each one of them.
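
The effect of the (1) labels can be sketched with generic names. The variables y1-y3 and factor f below are hypothetical stand-ins, and the two MODEL commands are alternatives, not parts of one run.

```mplus
! Constrained: a shared label holds the three residual variances equal
MODEL:
  f BY y1 y2 y3;
  y1 (1);
  y2 (1);
  y3 (1);

! Unconstrained: omit the labels (or give each a different one)
MODEL:
  f BY y1 y2 y3;
  y1 (1);
  y2 (2);
  y3 (3);
```

Parameters that carry the same label are estimated as a single value, which is why the diagram showed identical error terms.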

SY Khan posted on Tuesday, March 04, 2014  6:41 am



Hi Linda, Thanks very much for the explanation. After reading your reply I thought that the solution would be to drop JOBDSCRT (a binary variable). But when I tried PARAMETERIZATION=THETA it didn't give a negative residual variance. However, the overall fit indices got a bit worse (they were better with DELTA). With Theta I also get the CFA for AUTOJD BY JOBVARTY, JOBDSCRT, JOBCTRL separately. Please advise: 1. Can I proceed with the THETA parameterization? If yes, do I need to use the THETA parameterization in all the subsequent CFAs and SEM analyses? Or is it OK to change back to DELTA where it works without a problem? 2. What would be the impact on the quality and legitimacy of the results if I did not use the same parameterization consistently? 3. I am asking this question because my independent variables are binary (4 factors, of which one is AUTOJD). The other three independent variable factors work fine with the Delta parameterization. My intermediate and outcome variables are CATEGORICAL (for CFA). But I run the SEM with latent variables (of binary items) and aggregated variables, which are treated as continuous observed variables in the SEM. These aggregated variables are generated by adding items identified through CFA of categorical items. Sorry for the lengthy question and many thanks for your guidance.

SY Khan posted on Tuesday, March 04, 2014  8:50 am



Hi, just to add a clarification to the above: I simply changed to PARAMETERIZATION=THETA without constraining the model in any other way or giving new starting values, and it worked. The rest of my questions remain as above. Thanks very much


I have no further comments other than that I would not use the Theta parametrization unless the model could not be analyzed using the Delta parametrization. 

Ting Dai posted on Monday, March 10, 2014  10:37 am



Dear Drs. Muthen, When trying to fix a Heywood case, I could 1) fix it @0; 2) constrain it to be equal with another similar residual variance; Is there a third way, for example, still have Mplus estimate it but set it to be nonnegative? Thanks! 


You can constrain it to be greater than zero using MODEL CONSTRAINT. 
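
A minimal sketch of such a constraint, assuming a hypothetical one-factor model with indicators y1-y4 in which y3 has the problematic residual variance. The label rv3 is also made up for illustration.

```mplus
MODEL:
  f BY y1 y2 y3 y4;
  y3 (rv3);          ! label the residual variance of y3
MODEL CONSTRAINT:
  rv3 > 0;           ! keep the estimate above zero
```

If the unconstrained estimate is negative, the constrained estimate will typically end up at or very near zero, so this is close in effect to fixing the variance with y3@0.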


Hi, which method or test do you use for testing the significance of a negative residual variance? Thanks!


A negative residual variance makes the results inadmissible. 


Hi Linda again, I know that a negative residual variance makes the results inadmissible. But I read an article [Chen, F., Bollen, K. A., Paxton, P., Curran, P. & Kirby, J. (2001). Improper solutions in structural equation models: Causes, consequences, and strategies. Sociological Methods & Research, 29, 468-508.] that describes different tests to check the significance of the negative residual variance, and in the Mplus output there isn't any information about which test Mplus is using. Thanks again, I really appreciate your assistance! Dave


Mplus simply reports the z-test: Est/SE.
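
As a worked illustration of the Est/SE ratio, with made-up numbers that do not come from any post in this thread (an estimate of -0.05 and a standard error of 0.04):

```latex
z = \frac{\widehat{\theta}}{\mathrm{SE}(\widehat{\theta})}
  = \frac{-0.05}{0.04} = -1.25, \qquad |z| = 1.25 < 1.96
```

Under these assumed values the negative estimate would not differ significantly from zero at the two-sided 5% level.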

Jane Doe posted on Saturday, January 23, 2016  3:03 am



I have a model where, for one of the latent factors, I need to fix all factor loadings to 1 for theoretical reasons: f BY f1@1 f2@1 f3@1 f4@1; where f1-f4 are themselves latent factors estimated within the model. But I get a negative variance for f. (All other results make both statistical and theoretical sense.) If I fix the variance of f to 1 then it works OK: f BY f1@1 f2@1 f3@1 f4@1; f@1; But I am not sure if it is OK to fix both the factor loadings and the factor variance to 1. Fixing the factor variance to zero doesn't seem to solve the problem. I still get the same error message, which makes even less sense. If I drop f from my model, it works fine.


Sounds like f1-f4 have some negative correlations among them and that is picked up by the f variance.

Jane Doe posted on Sunday, January 24, 2016  11:30 am



Thanks a lot. This makes sense. And yes there is a negative correlation between two of the factors. Allowing for that fixes the issue. 

Jack Johnny posted on Sunday, April 17, 2016  12:08 pm



Dear Dr. Muthen, I am running an SEM in which one of the latent factors, "F2", has a negative residual variance according to the warning message. When I followed your advice above and fixed it to 0, I received the following message: THE STANDARD ERRORS OF THE MODEL PARAMETER ESTIMATES COULD NOT BE COMPUTED. THE MODEL MAY NOT BE IDENTIFIED. CHECK YOUR MODEL. PROBLEM INVOLVING THE FOLLOWING PARAMETER: Parameter 64, F3 ON F2 THE CONDITION NUMBER IS 0.227D-16. The model, which was identified originally, is no longer identified. May I know why this happened? What should I do then? Best regards,


Try to respecify your model so that you don't get a negative residual variance for F2. 


Dear Dr. Muthen, Many thanks for your prompt reply! I don't quite understand what you meant by respecifying the model. The model is based on a theoretical framework that I want to test. If I respecify it, then the intended hypotheses will be changed. I think the code (see below) which I used is OK, according to the Mplus manual. Is the negative residual variance caused by my using pseudo data (borrowed from another study) for the variables tested in my model? The commands are as follows: variable: Names are EP1-EP4 ER1-ER4 AG1-AG18 EN1-EN10 AC; usevariables are EP1-AG3 EN1-EN10 AC; analysis: estimator is MLM; MODEL: F1 by EP1-EP4 ER1-ER4; F2 by AG1-AG3; F3 by EN1-EN10; AC ON F3 F2 F1; F3 ON F2 F1; F2 ON F1; MODEL INDIRECT: AC IND F1; AC IND F2; output: sampstat stdyx tech1 tech4; Looking forward to your idea! Best regards,


Run only the BY statements as a first step to see if your measurement model fits. These data may not be valid measures of the constructs the theory is based on, in which case the model would not be correct for these data.
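
A sketch of that first step, using the poster's own factor definitions with the ON and MODEL INDIRECT statements removed. The hyphenated list ranges are an assumption about how the variable lists read in the original input.

```mplus
MODEL:
  F1 BY EP1-EP4 ER1-ER4;
  F2 BY AG1-AG3;
  F3 BY EN1-EN10;
```

Since AC no longer appears in the model, it would probably also need to be dropped from the USEVARIABLES list for this run, so that it does not enter the model as an extra variable.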


Dear Dr. Muthen, I ran the model with only the BY statements and received the same warning of a negative residual variance for "F2". Moreover, I also ran a BY statement with only the latent factor "F2". As it has only three indicators, I first constrained the two residual variances of AG2 and AG3 to be equal. The model was identified without the warning message, though the TLI index was negative. Then I tried to constrain the two factor loadings of AG2 and AG3 to be equal, which resulted in a warning message that stated a negative residual variance for AG1. Based on the above info, can I say that all the warnings regarding the negative residual variance are caused by the invalid data? Best regards,


The model does not fit the data. 

Jack Johnny posted on Tuesday, April 19, 2016  12:23 am



Thank you so much! 
