Negative Residual Variance
 Stacey Farber posted on Monday, February 21, 2005 - 10:47 am
Hello. I am conducting a CFA (MLR estimator) with three continuous indicators across 4 groups. For one of my groups, I get one non-significant, negative (near zero, within 95% CI) residual variance when my intercepts are freely estimated and the factor mean is set to 0 for all groups (my Model 1). I could set the residual variance to 0, but then I get a standardized factor loading and R-square of 1.000, which I consider non-"useful" information. If I re-run Model 1 for just that group, I get the same results.

But, when I constrain factor loadings to be equal across groups (free intercepts, factor mean at 0) for my Model 2, I get a positive residual variance for that variable and group (.609), which is a value still within the original 95% CI of Model 1. To re-run Model 1, might I set the residual variance for that variable for that group to .609 instead of zero so that I might get "usable" estimates?
 bmuthen posted on Saturday, February 26, 2005 - 4:54 pm
If the model with invariant loadings fits well, you could make an argument that what you propose is reasonable - you are borrowing information from the other groups to get a better estimate for that residual variance. It is a bit ad hoc, however, since the fixed value of .609 has sampling variability so the resulting SEs of the model might need to be taken with a grain of salt (i.e. work with conservative tests of parameter significance).
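A minimal sketch of how such a fixed value might be entered in a multiple-group input, assuming hypothetical names (factor f, indicators y1-y3, group label g2 for the affected group) that do not come from the original post:

```mplus
MODEL:
  f BY y1 y2 y3;
  [f@0];          ! factor mean fixed at 0 in every group
MODEL g2:
  f BY y2 y3;     ! mentioning the loadings here frees them in this group
  [y1 y2 y3];     ! mentioning the intercepts here frees them in this group
  y3@.609;        ! residual variance fixed at the Model 2 estimate
```

The remaining groups would need their own group-specific MODEL commands to free loadings and intercepts in the same way. Because .609 is itself an estimate, the standard errors from this run do not reflect its sampling variability.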
 rpaxton posted on Saturday, February 18, 2006 - 9:06 pm
I am trying to confirm the factor structure of 2 second-order CFAs with 5 factors each. For some reason the residual variance for one of the factors is negative. Would you recommend deleting that factor? What steps should be taken to handle this situation?
 bmuthen posted on Sunday, February 19, 2006 - 2:45 pm
You could fix that residual variance at zero. That would mean that that first-order factor is a perfect indicator of the second-order factor - this happens in some instances.
 rpaxton posted on Sunday, February 19, 2006 - 4:27 pm
How would I fix the residual variance to zero? This is my model statement:
f1 by var2-var3;
F10 by var27-var30;
exper by f1-f5;
behav by f6-f10;
Should I just say F9@0 below the final statement? Thanks
 bmuthen posted on Sunday, February 19, 2006 - 4:42 pm
That's right.
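As a small illustration of where such a statement goes, here is a sketch with hypothetical names (first-order factors fa, fb, fc and second-order factor g; none of these appear in the post above):

```mplus
MODEL:
  fa BY y1-y3;
  fb BY y4-y6;
  fc BY y7-y9;
  g BY fa fb fc;   ! second-order factor
  fc@0;            ! residual variance of first-order factor fc fixed at zero
```

Because fc is a dependent variable here (it is regressed on g), the statement fc@0 refers to its residual variance, not its total variance.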
 Valeriana posted on Tuesday, March 21, 2006 - 7:41 am
I'm trying to use a CFA model to test convergent and discriminant validity. However, almost all the residual variances are non-significant. If I fix them at zero, indexes such as "composite reliability" or "average variance extracted" or any other reliability index will be inflated. What should I do?
 Bengt O. Muthen posted on Tuesday, March 21, 2006 - 7:44 am
I would not fix insignificant residual variances at zero. Having residuals is well-motivated and the fact that they are non-significant may merely mean that you have a smallish sample size.
 Renee Thompson posted on Sunday, March 18, 2007 - 5:21 pm
After looking through various discussion boards, I figured out how to fix my problem of having a negative residual variance for my variable dep3. I just added a line dep3@0 to the model command:

USEVAR = dep1 dep2 dep3 critsen1 critgf1 das1;
MODEL: i s | dep1@1 dep2@2 dep3@3;
i s ON critsen1 critgf1 das1;

I am pleased to have figured out how to fix the problem. However, I do not fully understand why setting that residual variance to zero allowed the model to run. Can you offer me any help in understanding this at a more applied level? Thanks!

 Linda K. Muthen posted on Monday, March 19, 2007 - 10:33 am
Fixing a negative residual variance is done if the residual variance is a small negative value and not significant. Otherwise, the model should be changed.

The reason that this is a problem is that variances cannot be negative by definition.
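For reference, a sketch of the growth model from the post above with the fixing statement included. The variable names and time scores are taken from that post; the placement of dep3@0 at the end of the MODEL command is an assumption, since the post does not show the final input:

```mplus
MODEL:
  i s | dep1@1 dep2@2 dep3@3;
  i s ON critsen1 critgf1 das1;
  dep3@0;    ! residual variance of dep3 fixed at zero
```

With the variance fixed at an admissible value, the estimator no longer lands on a negative estimate for that parameter. The fix only hides the symptom, which is why it is reserved for small, non-significant negative values.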
 Suzanne Jak posted on Tuesday, April 22, 2008 - 1:09 am

I'm fitting 2 first-order factors and 1 second-order factor on 7 continuous observed variables. Using scaling in lambda, the residual variance of the first factor is negative. My model runs well when I fix the variance of this factor to 1 and keep the factor loading of the first factor fixed at 1.

Q1: Is this bad practice?
Q2: I thought it is useless to have 1 indicator for a factor, so I regressed the variable 'kub' directly on the second-order factor. Is this ok?

This is my input:

Title: factormodel met weging sommen op 2 factoren, f1 fixed op 1

Data: FILE IS schaal.dat;

Variable: NAMES ARE ana cijf fig kub som syl voc w;

Analysis: ESTIMATOR = MLR;

Model: f1 BY fig ana syl cijf som;
f2 BY som voc;
f3 BY f1 f2 kub;

Output: sampstat standardized modindices tech4 tech1;

Thanks! Suzanne
 Linda K. Muthen posted on Tuesday, April 22, 2008 - 8:27 am
You should not set the metric of the factor by both fixing a factor loading to one and fixing the factor variance to one. If you relax one of these restrictions, the model is not identified. You need a minimum of three first-order factors for the model to be identified without making perhaps unrealistic restrictions on the model.
 JPower posted on Tuesday, January 20, 2009 - 9:28 am
I'm conducting a CFA of a four-factor scale with ordinal indicators using WLSMV estimation. One of the factors has only 2 indicators, and for these indicators there is limited variability in responses:
item s:
Category 1 0.969
Category 2 0.026
Category 3 0.004
Category 4 0.001
item o:
Category 1 0.975
Category 2 0.016
Category 3 0.006
Category 4 0.002
My fit statistics are reasonable (CFI, TLI >0.95, RMSEA, SRMR = 0.07). However, I get a warning about theta not being positive definite and there is a negative residual variance for item s (-0.003). Would dropping this factor be reasonable given the limited variability in responses and the negative residual variance? What would you suggest as next steps?
 Linda K. Muthen posted on Wednesday, January 21, 2009 - 9:16 am
In my opinion a factor with two indicators is not generally believable given that it is not identified without borrowing from other parts of the model. In your case, I would use one of the factor indicators as an observed variable in the model.
 Nicholas Mian posted on Wednesday, June 17, 2009 - 8:53 am

I am trying to establish measurement invariance in a measurement model before proceeding to a multi-group structural model. There are two latent variables and two groups.

When I ran it constraining the factor loadings to be equal, it ran fine, but when I free the factor loadings for the unconstrained model, I get one negative residual variance in one group (leading to the "not positive definite" error message). The intercepts are also freed. The fit indices for the unconstrained model are chi sq=15.899, df=12, CFI=.993, rmsea=0.043, srmr=0.033. Can these results be interpreted with a neg residual variance?

Thank you!
 Linda K. Muthen posted on Wednesday, June 17, 2009 - 9:16 am
You should first find that the factor model fits well in each group before proceeding to test for measurement invariance. It sounds like this is not the case. You might want to start with an EFA in each group to establish that the same number of factors is found in each group and then proceed to a CFA in each group.
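A per-group EFA of this kind could be set up roughly as follows. This is only a sketch: the variable names (y1-y6, gender), the group code, and the range of factors to extract are assumptions, not details from the thread:

```mplus
VARIABLE:
  NAMES = y1-y6 gender;
  USEVARIABLES = y1-y6;
  USEOBSERVATIONS = gender EQ 1;   ! analyze one group at a time
ANALYSIS:
  TYPE = EFA 1 3;                  ! extract 1 through 3 factors
```

The same input would then be rerun with the other group code, and the factor solutions compared before moving on to a CFA in each group.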
 Nicholas Mian posted on Wednesday, June 17, 2009 - 10:13 am
Hi Linda,

Thanks for your reply. I did run separate CFAs for the two groups individually first, to establish configural invariance. Both seem to be good fits for the data. The fit indices are:

Girls only Chi sq 6.083 df 6 CFI 1.00 RMSEA 0.009 (0.000-0.101) SRMR 0.021
Boys only Chi sq 9.816 df 6 CFI 0.981 RMSEA 0.060 (0.000-0.125) SRMR 0.042

The negative residual variance is in the boys group.
 Nicholas Mian posted on Wednesday, June 17, 2009 - 10:31 am
Hi again Linda,

I forgot to add that even though I have good fit in both groups, you are correct in that the same error message occurs for the boys group (the group with the negative residual variance).

I will run an EFA for the boys group and I am also considering parcelling these indicators.

Thanks for your help,

 Matt Thullen posted on Monday, July 27, 2009 - 11:26 am

I have a 6-factor CFA model with 13 indicators. In the first 6-factor model I fit, one indicator had a negative residual variance that was very small (.05) and non-significant, which I fixed at zero without any theoretical rationale, just figuring that it was small enough to justify as zero and proceed with this good-fitting model. After testing the first, proposed model against several others with slight, theoretically informed adjustments, a new model with one indicator loading on two factors shows a better fit. When I removed the zero residual variance constraint from the one indicator, the actual variance in the newer model is more negative (@.11). If having this model constraint was justified in the first place, would it still be justified in the newer model with the increase in the negative residual variance?


 Linda K. Muthen posted on Monday, July 27, 2009 - 4:23 pm
It sounds to me like you should start with an EFA of your factor indicators to see if the items are behaving as expected. Making adjustments to a CFA model without previously doing an EFA to study the items can result in a misspecified model. A factor with two indicators is not identified without borrowing information from other parts of the model. I would hesitate to use such a factor.
 Matt Thullen posted on Monday, July 27, 2009 - 9:42 pm
Thank you Linda.

What do you mean by misspecified? I understand how a factor with two indicators is not identified, but if the model as a whole is identified, what does misspecification mean?

Here is where my question is coming from: I proposed that a certain factor structure would arise via principal components or EFA, but it was strongly suggested that I use CFA. In looking at it both ways, 6 factors will not converge in an EFA, but when I run a CFA with the proposed structure the model is a good fit with the exception of the small negative residual variance. So should I take the non-convergence in EFA as a strong hint that 6 factors are not appropriate at all, or is there a possibility that a confirmatory model with 6 factors is still feasible? I was hoping to move to factor mixture analysis with this CFA but that may not be a good idea either...

Thanks again - I really appreciate your assistance
 Linda K. Muthen posted on Tuesday, July 28, 2009 - 7:56 am
A misspecified model is a model that does not correctly represent the data which I think you know. Beyond that I am saying that often CFA models are proposed based on theory and estimated using data that may not well measure the constructs represented in the theory. An EFA can often help in seeing this.

If you have to modify the CFA by, for example, fixing residual variances to zero, this may point to a problem with the model. If a 6-factor CFA fits the data well but an EFA will not converge, the CFA may be a fragile model that will not be replicated with other data.

Note that factor mixture analysis usually has fewer factors than a regular CFA.
 Matt Thullen posted on Tuesday, July 28, 2009 - 8:58 am
I definitely had an idea that my issue was something bigger than a negative residual variance. Thank you for helping me clarify that.

 Laura Valadez posted on Thursday, October 29, 2009 - 11:13 am
Hello again,

I now have a question with regard to negative residual variances. This is the model I have:

PHYSIO BY haz4 waz4 baz4 haemo2;
MENTAL BY WJ3 WJ2 WJ5 stpea;
MOTOR BY carty_1a carty_3a carty_5a car2b car6b car4b;
comp1 by CD1 CD12 CD2 CD4 CD6;
comp2 by CD5 CD7 CD8 CD9 CD18 CD19;
comp3 by CD3 CD11 CD13 CD14 CD15 CD16 CD17 CD20 CD25;
comp4 by CD21 CD22 CD23 CD26 CD24;
comp1 on sex;
comp2 on sex;
comp3 on sex;
comp4 on sex;
motor on sex;
mental on sex;
SUBJ BY comp1 comp2 comp3 comp4;

All of the indicators are categorical, except the ones in MENTAL.

I am getting negative residual variance for WAZ4. I have tried changing the model, specifically the PHYSIO part by:
(a) splitting it into 2 latent variables, but I get theta not positive definite involving waz4, and psi not positive definite involving one of the new latent constructs.
(b) taking out waz4 from the model. With this, I get no error messages but the loading of baz4 changes from being significant to non-significant.

Any suggestions?

I really appreciate your help,

thank you very much and best regards,

Laura Valadez
 Linda K. Muthen posted on Thursday, October 29, 2009 - 11:45 am
I would start with an EFA to see if the CFA you specify is valid. If so, I would do the CFA with no covariates in the model until I get a well-fitting CFA model. Only then would I add covariates.
 Laura Valadez posted on Thursday, October 29, 2009 - 12:22 pm
Dear Linda,

Thank you for your response.

Following the recommendations I read in the forum, I conducted EFAs beforehand, for each of the latent constructs and then for all the indicators.

The EFA for PHYSIO comes out with the following: The maximum number of factors is set to 1. So the 4 indicators come out in 1 factor where WAZ4 bears a considerably high loading.

WAZ4 3.856
HAZ4 0.192
BAZ4 0.172
HAEMO2 0.023

For this, Chi-Sq = 12.587 (df=2), CFI=.997, TLI=.994, RMSEA=.053.

When I run EFA for all the indicators, haz4 and haemo2 share their highest loadings with indicators of MENTAL. However, haz4=height-for-age and haemo2=haemoglobin concentration and the indicators for MENTAL are results of memory/cognitive tests. This makes me think that -following theory- I should still keep them under PHYSIO.

When I did the CFA without controls, I got the following:
Chi-Sq= 1175.263(df=260)
CFI=.918, TLI=.937, RMSEA=.043
and PSI not positive definite for MENTAL.

Does all of this mean the model is seriously misspecified?

thanks again, your help is invaluable!

Laura Valadez
 Linda K. Muthen posted on Thursday, October 29, 2009 - 2:35 pm
If in EFA certain variables do not load on the expected factors, then their validity is questionable. If you force them on the expected factors in CFA, the model will not fit well. I think you need to consider the validity of the items that you are using.
 Laura Valadez posted on Thursday, October 29, 2009 - 6:24 pm
Thank you very much, this is really helpful. I will re-examine the indicators that I have got.


 Isaiah Baker posted on Monday, May 10, 2010 - 11:24 am
I have a one-factor CFA with 4 indicators. The error variance of one indicator is negative but very small and non-significant. I know this is not what is supposed to be. But everything else looks good, though with a warning message. And I like the results. Is it appropriate that I just leave the results there and proceed to explain them with some reasonable arguments? Thanks for your attention.
 Erin Madden posted on Friday, July 30, 2010 - 4:09 pm
Hello. We're carrying out an EFA followed by CFA with covariates. There are 12 dependent variables and they are categorical (binary and ordinal). Based on EFA results, it appears that there are 4 factors, but the 4th factor only has two items and so was modeled as a correlation between those two items.

When I try to add a binary covariate to the model I run into problems. First, there is a negative residual variance for one of the items (avoidanc). I looked and the residual variance is very close to zero. But when I set that residual variance to zero (using theta parameterization), I get an unidentified model. If I remove the culprit item (avoidanc) from the model, it runs without any errors. However, the avoidanc item, which is problematic, is also theoretically important and should remain in the model. So I'm not sure how to proceed.

Here is what the Mplus model statement looks like:

MODEL: tbi BY dizzy@1 headache irritabl sleep memory visual ;
ptsd BY nightmar@1 avoidanc onguard detached irritabl ;
dep BY LittlInt@1 depress detached sleep ;
onguard with sleep ;
tbi ptsd dep on AUDC ;
 Linda K. Muthen posted on Friday, July 30, 2010 - 4:20 pm
Please send two full outputs, the one without the ON statement and the one with the ON statement, and your license number to
 Siran Zhan posted on Wednesday, October 19, 2011 - 1:30 am
Hi Dr. Muthen,

I have a model with 8 first-order factors and 3 second-order factors. I'd like to clarify if I can specify them in two equivalent ways.

One way is that I fix one of the first-order loadings to 1 by default, e.g.:

F1 by V1 V2 V3;
F8 by V22 V23 V24;

F9 by F1 F2;
F10 by F3-F6;
F11 by F7-F8;

Another way is fixing the first-order latents' variance to 1, e.g.,

F1 by V1* V2 V3;

Could you tell me if both ways of specification are correct and equivalent?

Thank you!
 Linda K. Muthen posted on Wednesday, October 19, 2011 - 6:26 am
They are correct and equivalent.
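The two ways of setting the metric can be written out side by side. The sketch below uses the F1 and V1-V3 names from the question; the F1@1 line is an assumption about how the second option would be completed, since the post shows only the BY statement:

```mplus
! Option A (default): first loading fixed at 1
F1 BY V1 V2 V3;

! Option B: all loadings free, variance fixed at 1
! (use instead of Option A, not together with it)
F1 BY V1* V2 V3;
F1@1;
```

When F1 is itself an indicator of a second-order factor, F1@1 applies to its residual variance. Using both restrictions at once over-restricts the model, as noted earlier in this thread.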
 SY Khan posted on Sunday, March 02, 2014 - 10:09 am
Hi Dr. Muthen,

My EFA results show that there are 4 factors for binary observed variables. When I conduct a CFA on just one of my factors (which has 3 items), I get a message that the covariance matrix is not positive definite.

I tried fixing the starting values of the problem variable (JOBDSCRT) with the following three commands (two of which don't work):

1- AUTOJD BY JOBVARTY@1 JOBDSCRT@0.5 JOBCTRL; (this produces a negative variance for JOBCTRL)

2- AUTOJD BY JOBVARTY* JOBDSCRT@0.5 JOBCTRL @1; (this again results in a negative variance)

3- AUTOJD BY JOBVARTY* JOBDSCRT@0.5 JOBCTRL; (this one works in the individual CFA of this factor). But I realised that the diagram does not show any anchor variable, which could be wrong. Is that so?

How should I solve this problem?

In my overall CFA with the remaining 10 constructs used in my model, it works.

Kindly advise what I am doing wrong. How can I get the appropriate result for my individual construct AUTOJD without losing an item?

Many thanks
 Linda K. Muthen posted on Sunday, March 02, 2014 - 5:03 pm
You should focus on why your CFA gives a negative residual variance and change the model. It seems that you did not translate the EFA into an appropriate CFA.
 SY Khan posted on Monday, March 03, 2014 - 2:41 am
Hi Dr. Muthen,

Thanks for your prompt reply. Sorry, I did not explain myself clearly above.

The EFA suggested 4 factors and the CFA confirmed those 4 factors. But when I ran individual CFAs on each of the four factors separately, I got a negative residual variance for an item on one of the factors, i.e. AUTOJD, which has three items: JOBVARTY, JOBDSCRT, JOBCTRL.

The negative variance for JOBDSCRT:
JOBDSCRT Undefined 0.11371E+01 -0.137

I have read from posts above that when the -ve variance is not significant, it can be fixed to zero. so I have done the following:


1- Doing this I get the results. Is this correct?

2- Also with the above command the diagram shows that all error terms are the same for all items? please can you explain why?

3-Which is better way of dealing with -ve variance i.e. fixing -ve variance =0 or giving a new starting value?

 Linda K. Muthen posted on Monday, March 03, 2014 - 10:16 am
It sounds like your model is very fragile if separating the factors reveals this problem. Fixing a residual variance to zero is for continuous variables only. This cannot be used with categorical variables. It should also be used only for small non-significant values. I don't believe -.137 falls into this category.

In estimating the factors together, you draw on information from other parts of the model which you do not do when you estimate each factor separately. You can consider using the four-factor model if it fits well. Given these issues, I suspect it does not.

The error terms are the same because you have constrained them to be equal by placing (1) behind each one of them.
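The command in question is not shown in the thread, so the following is only a generic sketch with hypothetical continuous indicators y1-y3. It illustrates how a shared number in parentheses imposes equality:

```mplus
MODEL:
  f BY y1 y2 y3;
  y1-y3 (1);    ! same label: the three residual variances are held equal
```

Removing the (1), or writing the variances without any label, lets each residual variance be estimated separately again.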
 SY Khan posted on Tuesday, March 04, 2014 - 6:41 am
HI Linda,

Thanks very much for explanation. After reading your reply I thought that the solution would be to drop JOBDSCRT (binary variable).

But when I tried PARAMETERIZATION=THETA it didn't give a negative residual variance. However, the overall fit indices got a bit worse (they were better with DELTA).

With Theta I get the CFA for AUTOJD BY JOBVARTY, JOBDSCRT, JOBCTRL separately too.

Please advise if:

1- Can I proceed with the THETA parameterization? If yes, do I need to use the THETA parameterization in all the subsequent CFAs and SEM analyses? Or is it OK to change back to DELTA where it works without a problem?

2- What would be the impact on the quality and legitimacy of the results if I did not use the same parameterization consistently?

3- I am asking this question because my independent variables are binary (4 factors, of which one is AUTOJD). The other three independent variable factors work fine with the Delta parameterization.

My intermediate and outcome variables are CATEGORICAL(for CFA). But I run SEM with Latent variables (of binary items) and aggregated variables which are treated as continuous observed variables in SEM. These aggregated variables are generated by adding items identified through CFA of categorical items.

Sorry for the lengthy question and many thanks for your guidance.
 SY Khan posted on Tuesday, March 04, 2014 - 8:50 am
Hi, just to add a clarification to the above: I simply changed to PARAMETERIZATION=THETA without constraining the model in any other way or giving new starting values. And it worked.

Rest of my questions remain as above.

Thanks very much
 Linda K. Muthen posted on Tuesday, March 04, 2014 - 10:36 am
I have no further comments other than that I would not use the Theta parametrization unless the model could not be analyzed using the Delta parametrization.
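For readers unfamiliar with the option, the parameterization is chosen in the ANALYSIS command. A minimal sketch for the three-item factor discussed above, assuming WLSMV estimation and a data set containing only these three variables (both are assumptions, as the full input is not shown in the thread):

```mplus
VARIABLE:
  NAMES = jobvarty jobdscrt jobctrl;
  CATEGORICAL = jobvarty jobdscrt jobctrl;
ANALYSIS:
  ESTIMATOR = WLSMV;
  PARAMETERIZATION = THETA;   ! DELTA is the default
MODEL:
  autojd BY jobvarty jobdscrt jobctrl;
```

Leaving out the PARAMETERIZATION line returns the analysis to the default Delta parameterization.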
 Ting Dai posted on Monday, March 10, 2014 - 10:37 am
Dear Drs. Muthen,

When trying to fix a Heywood case, I could 1) fix it @0;
2) constrain it to be equal with another similar residual variance;

Is there a third way, for example, still have Mplus estimate it but set it to be non-negative?

 Linda K. Muthen posted on Monday, March 10, 2014 - 10:50 am
You can constrain it to be greater than zero using MODEL CONSTRAINT.
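A minimal sketch of such a constraint, assuming a hypothetical one-factor model in which y3 is the indicator with the Heywood case and rv3 is an arbitrary label chosen for its residual variance:

```mplus
MODEL:
  f BY y1-y4;
  y3 (rv3);        ! label the residual variance of y3
MODEL CONSTRAINT:
  rv3 > 0;         ! keep the estimate positive
```

If the unconstrained estimate is negative, the constrained estimate will typically end up at or very near the boundary, so the considerations above about fixing a variance at zero still apply.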