Hello. I am conducting a CFA (MLR estimator) with three continuous indicators across 4 groups. For one of my groups, I get one non-significant, negative residual variance (near zero, with zero inside the 95% CI) when my intercepts are freely estimated and the factor mean is set to 0 for all groups (my Model 1). I could set the residual variance to 0, but then I get a standardized factor loading and R-square of 1.000, which I do not consider "useful" information. If I re-run Model 1 for just that group, I get the same results.
But, when I constrain factor loadings to be equal across groups (free intercepts, factor mean at 0) for my Model 2, I get a positive residual variance for that variable and group (.609), which is a value still within the original 95% CI of Model 1. To re-run Model 1, might I set the residual variance for that variable for that group to .609 instead of zero so that I might get "usable" estimates?
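A short worked derivation may help show why a residual variance fixed at 0 forces the standardized loading and R-square to 1.000. It assumes the usual one-factor measurement equation with the residual uncorrelated with the factor; the symbols \lambda_j (loading), \psi (factor variance), and \theta_j (residual variance) are notation introduced here, not taken from the output.

```latex
\begin{aligned}
y_j &= \nu_j + \lambda_j \eta + \varepsilon_j, \qquad
\operatorname{Var}(\eta) = \psi, \quad \operatorname{Var}(\varepsilon_j) = \theta_j \\
\operatorname{Var}(y_j) &= \lambda_j^2 \psi + \theta_j \\
\lambda_j^{\mathrm{std}} &= \frac{\lambda_j \sqrt{\psi}}{\sqrt{\lambda_j^2 \psi + \theta_j}}, \qquad
R_j^2 = \frac{\lambda_j^2 \psi}{\lambda_j^2 \psi + \theta_j} \\
\theta_j = 0 \;&\Rightarrow\; \left|\lambda_j^{\mathrm{std}}\right| = 1, \quad R_j^2 = 1
\end{aligned}
```

With any positive fixed value such as \theta_j = .609, the denominator exceeds the numerator, so both quantities fall below 1.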
bmuthen posted on Saturday, February 26, 2005 - 4:54 pm
If the model with invariant loadings fits well, you could make an argument that what you propose is reasonable - you are borrowing information from the other groups to get a better estimate for that residual variance. It is a bit ad hoc, however, since the fixed value of .609 has sampling variability so the resulting SEs of the model might need to be taken with a grain of salt (i.e. work with conservative tests of parameter significance).
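A minimal sketch of how Model 1 with that residual variance fixed at .609 might be written. All names are hypothetical: indicators y1-y3, a grouping variable g coded 1-4, group labels g1-g4, and y2 in group g3 as the indicator with the negative residual variance. The DATA command is omitted, and this is one possible way to set it up, not necessarily how the original input was written.

```
VARIABLE:  NAMES ARE y1-y3 g;        ! hypothetical names
           USEVARIABLES ARE y1-y3;
           GROUPING IS g (1=g1 2=g2 3=g3 4=g4);
ANALYSIS:  ESTIMATOR = MLR;
MODEL:     f BY y1-y3;
           [f@0];                    ! factor mean fixed at 0 in all groups
MODEL g2:  f BY y2 y3;               ! loadings free in this group
           [y1-y3];                  ! intercepts free in this group
MODEL g3:  f BY y2 y3;
           [y1-y3];
           y2@.609;                  ! residual variance fixed at Model 2 value
MODEL g4:  f BY y2 y3;
           [y1-y3];
```

Because .609 is itself an estimate, the standard errors from such a run do not reflect its sampling variability, which is the caveat raised above.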
rpaxton posted on Saturday, February 18, 2006 - 9:06 pm
I am trying to confirm the factor structure of 2 second-order CFAs with 5 factors each. For some reason the residual variance for one of the factors is negative. Would you recommend deleting that factor? What steps should be taken to handle this situation?
bmuthen posted on Sunday, February 19, 2006 - 2:45 pm
You could fix that residual variance at zero. That would mean that that first-order factor is a perfect indicator of the second-order factor - this happens in some instances.
rpaxton posted on Sunday, February 19, 2006 - 4:27 pm
How would I fix the residual variance to zero? This is my model statement:
f1 by var2-var3;
. . .
F10 by var27-var30;
exper by f1-f5;
behav by f6-f10;
!...............................
Should I just say F9@0 below the final statement? Thanks
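A minimal sketch of how that constraint could be written, assuming f9 is the first-order factor with the negative residual variance (the post does not confirm which factor it is). In Mplus, a variable name followed by @0 fixes its variance at zero, or its residual variance when the variable is a dependent variable. The elided part of the model statement stays elided.

```
MODEL:
  f1 BY var2-var3;
  ! ... remaining first-order factors as in the original input ...
  f10 BY var27-var30;
  exper BY f1-f5;
  behav BY f6-f10;
  f9@0;          ! residual variance of f9 fixed at zero
```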
bmuthen posted on Sunday, February 19, 2006 - 4:42 pm
Valeriana posted on Tuesday, March 21, 2006 - 7:41 am
Hi, I'm trying to use a CFA model to test convergent and discriminant validity. However, almost all the residual variances are non-significant. If I fix them at zero, indexes such as "composite reliability" or "average variance extracted", or any other reliability index, will be inflated. What should I do? Thanks.
After looking through various discussion boards, I figured out how to fix my problem of having a negative residual variance for my variable dep3. I just added a line dep3@0 to the model command:
VARIABLE: NAMES ARE ...;
  MISSING = ALL (99);
  USEVAR = dep1 dep2 dep3 critsen1 critgf1 das1;
ANALYSIS: TYPE = MEANSTRUCTURE MISSING;
MODEL: i s | dep1@1 dep2@2 dep3@3;
  i s ON critsen1 critgf1 das1;
  dep3@0;
OUTPUT: TECH4 SAMPSTAT STANDARDIZED MODINDICES (3.84);
I am pleased to have figured out how to fix the problem. However, I do not fully understand why setting that residual variance to zero allowed the model to run. Can you offer me any help in understanding this at a more applied level? Thanks!
Fixing a negative residual variance at zero is appropriate only if the estimate is a small negative value and not significant. Otherwise, the model should be changed.
The reason that this is a problem is that variances cannot be negative by definition.
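One way to judge whether a negative estimate is "small and not significant" is to look at the ratio of the estimate to its standard error (the Est./S.E. column) and, optionally, at its confidence interval. A minimal sketch, assuming the growth model from the post above is first run without the dep3@0 constraint and that this unconstrained run still produces estimates and standard errors:

```
MODEL:   i s | dep1@1 dep2@2 dep3@3;
         i s ON critsen1 critgf1 das1;
         ! dep3@0;   left out here so the residual variance is estimated
OUTPUT:  SAMPSTAT STANDARDIZED CINTERVAL;  ! CINTERVAL adds confidence intervals
```

If the interval for the dep3 residual variance covers zero, fixing it at zero is in line with the advice above; if it lies clearly below zero, the model itself probably needs to change.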
Suzanne Jak posted on Tuesday, April 22, 2008 - 1:09 am
I'm fitting 2 first-order factors and 1 second-order factor on 7 continuous observed variables. Using scaling in lambda, the residual variance of the first factor is negative. My model runs well when I fix the variance of this factor to 1 and leave the factor loading of the first factor at 1.
Q1: Is this bad practice? Q2: I thought it is useless to have 1 indicator for a factor, so I regressed the variable 'kub' directly on the second-order factor. Is this ok?
This is my input:
Title: factor model with weighting, sums (som) on 2 factors, f1 fixed at 1
Data: FILE IS schaal.dat;
Variable: NAMES ARE ana cijf fig kub som syl voc w; WEIGHT IS w;
Analysis: ESTIMATOR = MLR;
Model: f1 BY fig ana syl cijf som;
  f1@1;
  f2 BY som voc;
  f3 BY f1 f2 kub;
You should not set the metric of the factor by both fixing a factor loading to one and fixing the factor variance to one. If you relax one of these restrictions, the model is not identified. You need a minimum of three first-order factors for the model to be identified without making perhaps unrealistic restrictions on the model.
JPower posted on Tuesday, January 20, 2009 - 9:28 am
Hello, I'm conducting a CFA of a four-factor scale with ordinal indicators using WLSMV estimation. One of the factors has only 2 indicators, and for these indicators there is limited variability in responses:
            item s   item o
Category 1  0.969    0.975
Category 2  0.026    0.016
Category 3  0.004    0.006
Category 4  0.001    0.002
My fit statistics are reasonable (CFI, TLI >0.95, RMSEA, SRMR = 0.07). However, I get a warning about theta not being positive definite and there is a negative residual variance for item s (-0.003). Would dropping this factor be reasonable given the limited variability in responses and the negative residual variance? What would you suggest as next steps? Thanks.
In my opinion a factor with two indicators is not generally believable given that it is not identified without borrowing from other parts of the model. In your case, I would use one of the factor indicators as an observed variable in the model.
I am trying to establish measurement invariance in a measurement model before proceeding to a multi-group structural model. There are two latent variables and two groups.
When I ran it constraining the factor loadings to be equal, it ran fine, but when I free the factor loadings for the unconstrained model, I get one negative residual variance in one group (leading to the "not positive definite" error message). The intercepts are also freed. The fit indices for the unconstrained model are chi sq=15.899, df=12, CFI=.993, RMSEA=0.043, SRMR=0.033. Can these results be interpreted with a negative residual variance?
You should first find that the factor model fits well in each group before proceeding to test for measurement invariance. It sounds like this is not the case. You might want to start with an EFA in each group to establish that the same number of factors is found in each group and then proceed to a CFA in each group.
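A minimal sketch of that sequence for one group. The names are hypothetical: indicators y1-y8, a grouping variable g coded 1 and 2, and an assumed two-factor structure splitting y1-y4 and y5-y8. The DATA command is omitted, and each step would be a separate input file, rerun with g EQ 2 for the second group.

```
! Step 1: EFA in group 1 only
VARIABLE:  NAMES ARE y1-y8 g;          ! hypothetical names
           USEVARIABLES ARE y1-y8;
           USEOBSERVATIONS ARE (g EQ 1);
ANALYSIS:  TYPE = EFA 1 3;             ! extract 1 to 3 factors

! Step 2 (separate input file): CFA in group 1 only
VARIABLE:  NAMES ARE y1-y8 g;
           USEVARIABLES ARE y1-y8;
           USEOBSERVATIONS ARE (g EQ 1);
MODEL:     f1 BY y1-y4;                ! assumed factor structure
           f2 BY y5-y8;
```

Only once the same structure fits acceptably in each group separately would the multi-group invariance models be the next step.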
I have a 6-factor CFA model with 13 indicators. In the first 6-factor model I fit, one indicator had a negative residual variance that was very small (.05) and non-significant. I fixed it at zero without any theoretical rationale, figuring that it was small enough to treat as zero, and proceeded with this good-fitting model. After testing the first, proposed model against several others with slight, theoretically informed adjustments, a new model with one indicator loading on two factors shows a better fit. When I removed the zero residual variance constraint from that indicator, the estimated variance in the newer model is more negative (@.11). If this constraint was justified in the first place, would it still be justified in the newer model, given the more negative residual variance?
It sounds to me like you should start with an EFA of your factor indicators to see if the items are behaving as expected. Making adjustments to a CFA model without previously doing an EFA to study the items can result in a misspecified model. A factor with two indicators is not identified without borrowing information from other parts of the model. I would hesitate to use such a factor.
What do you mean by misspecified? I understand how a factor with two indicators is not identified, but if the model as a whole is identified, what does misspecification mean?
Here is where my question is coming from: I proposed that a certain factor structure would arise via principal components or EFA, but it was strongly suggested that I use CFA. Looking at it both ways, 6 factors will not converge in an EFA, but when I run a CFA with the proposed structure the model is a good fit, with the exception of the small negative residual variance. So should I take the non-convergence in EFA as a strong hint that 6 factors are not appropriate at all, or is there a possibility that a confirmatory model with 6 factors is still feasible? I was hoping to move to factor mixture analysis with this CFA, but that may not be a good idea either...
Thanks again - I really appreciate your assistance. matt
A misspecified model is a model that does not correctly represent the data, which I think you know. Beyond that, I am saying that CFA models are often proposed based on theory and estimated using data that may not measure the constructs represented in the theory well. An EFA can often help in seeing this.
If you have to modify the CFA by, for example, fixing residual variances to zero, this may point to a problem with the model. If a 6-factor CFA fits the data well but an EFA will not converge, the CFA may be a fragile model that will not be replicated with other data.
Note that factor mixture analysis usually has fewer factors than a regular CFA.
I now have a question with regard to negative residual variances. This is the model I have:
PHYSIO BY haz4 waz4 baz4 haemo2;
MENTAL BY WJ3 WJ2 WJ5 stpea;
MOTOR BY carty_1a carty_3a carty_5a car2b car6b car4b;
MOTOR ON age;
comp1 by CD1 CD12 CD2 CD4 CD6;
comp2 by CD5 CD7 CD8 CD9 CD18 CD19;
comp3 by CD3 CD11 CD13 CD14 CD15 CD16 CD17 CD20 CD25;
comp4 by CD21 CD22 CD23 CD26 CD24;
comp1 on sex;
comp2 on sex;
comp3 on sex;
comp4 on sex;
motor on sex;
mental on sex;
SUBJ BY comp1 comp2 comp3 comp4;
OBJ BY PHYSIO MENTAL MOTOR;
SUBJ on ANIO_INC TIPO_LOC LOCAL;
OBJ on ANIO_INC TIPO_LOC LOCAL;
All of the indicators are categorical, except the ones in MENTAL.
I am getting a negative residual variance for WAZ4. I have tried changing the model, specifically the PHYSIO part, by:
(a) splitting it into 2 latent variables, but I get theta not positive definite involving waz4, and psi not positive definite involving one of the new latent constructs;
(b) taking waz4 out of the model. With this, I get no error messages, but the loading of baz4 changes from significant to non-significant.
Following the recommendations I read in the forum, I conducted EFAs beforehand, for each of the latent constructs and then for all the indicators.
The EFA for PHYSIO comes out with the following: The maximum number of factors is set to 1. So the 4 indicators come out in 1 factor where WAZ4 bears a considerably high loading.
WAZ4    3.856
HAZ4    0.192
BAZ4    0.172
HAEMO2  0.023
For this, Chi-Sq = 12.587 (df=2), CFI=.997, TLI=.994, RMSEA=.053.
When I run EFA for all the indicators, haz4 and haemo2 share their highest loadings with indicators of MENTAL. However, haz4=height-for-age and haemo2=haemoglobin concentration and the indicators for MENTAL are results of memory/cognitive tests. This makes me think that -following theory- I should still keep them under PHYSIO.
When I did the CFA without controls, I got the following: Chi-Sq= 1175.263(df=260) CFI=.918, TLI=.937, RMSEA=.043 and PSI not positive definite for MENTAL.
Does all of this mean the model is seriously misspecified?
If in EFA certain variables do not load on the expected factors, then their validity is questionable. If you force them on the expected factors in CFA, the model will not fit well. I think you need to consider the validity of the items that you are using.
I have a one-factor CFA with 4 indicators. The error variance of one indicator is negative but very small and non-significant. I know this should not happen. But everything else looks good, apart from a warning message, and I like the results. Is it appropriate to leave the results as they are and explain them with some reasonable arguments? Thanks for your attention.
Hello. We’re carrying out an EFA followed by CFA with covariates. There are 12 dependent variables and they are categorical (binary and ordinal). Based on EFA results, it appears that there are 4 factors, but the 4th factor only has two items and so was modeled as a correlation between those two items.
When I try to add a binary covariate to the model I run into problems. First, there is a negative residual variance for one of the items (avoidanc). I looked and the residual variance is very close to zero. But, when I set that residual variance to zero (using theta parameterization), I get an unidentified model. If I remove the culprit item (avoidanc) from the model it runs without any errors. However, the avoidanc item, which is problematic, is also theoretically important and should remain in the model. So, I’m not sure how to proceed.
Here is what the MPLUS model statement looks like:
MODEL: tbi BY dizzy@1 headache irritabl sleep memory visual ;
  ptsd BY nightmar@1 avoidanc onguard detached irritabl ;
  dep BY LittlInt@1 depress detached sleep ;
  onguard with sleep ;
  tbi ptsd dep on AUDC ;
  avoidanc@0;