Negative residual variance
 anon9210 posted on Thursday, September 02, 2010 - 6:29 pm
Hi,

I am trying to run an EFA using 8 indicators (geomin rotation; WLSMV estimator; N approx. 3000). The scree plot I get strongly suggests three factors; this makes a lot of sense for theoretical reasons too.

However, it appears that one of my factors is strongly defined by just one of my indicators. I am guessing this is why that particular indicator has a loading > 1 on its factor; additionally, this leads to negative residual variance for that indicator (res. var. = -.63; s.e. = 1.98; est/s.e. = -.32). I get no error messages of any sort. The three factors are relatively uncorrelated with each other (.18, .003, and .13; the latter two being the intercorrelations of the factor I am talking about). The remaining seven indicators look ok; nothing seems out of place with them.

Fit indices for the 3-factor model are as follows: CFI = 1.000; TLI = 1.004; RMSEA = 0; SRMR = .028.

Fit indices for the 2-factor model are: CFI = .947; TLI = .895; RMSEA = .037; SRMR = .094

So:

1.) Is the 3-factor solution a problem? Or am I running into a modeling technicality?

2.) Relatedly, should I stick with the 3 factor solution, or should I drop to a lower 2 factor solution?

3.) Is there anything I can do to avoid the negative residual variance?

 Linda K. Muthen posted on Friday, September 03, 2010 - 9:05 am
Any solution with a negative residual variance is inadmissible. It sounds like the third factor has only one strong indicator, which is problematic.

You could try running the model as an ESEM (see Example 5.24), with MODEL CONSTRAINT used to keep the residual variance positive. Check to be sure this does not change model fit.
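
As a rough illustration of why a loading above 1 goes hand in hand with a negative residual variance here, the sketch below assumes a standardized indicator, so that its residual variance is 1 minus the communality (lambda' * Phi * lambda). The factor correlations are the ones quoted in the thread; the loading vector is hypothetical and chosen only for illustration, since the actual loadings were not posted.

```python
# Factor correlations quoted in the thread; factor 3 is the one
# that is defined by a single indicator.
PHI = [
    [1.0, 0.18, 0.003],
    [0.18, 1.0, 0.13],
    [0.003, 0.13, 1.0],
]


def communality(loadings, phi):
    """Return lambda' * Phi * lambda for a single indicator."""
    size = len(loadings)
    return sum(
        loadings[j] * phi[j][k] * loadings[k]
        for j in range(size)
        for k in range(size)
    )


def residual_variance(loadings, phi):
    """Residual variance of a standardized indicator: 1 - communality."""
    return 1.0 - communality(loadings, phi)


# HYPOTHETICAL loadings (not taken from the poster's output): no
# cross-loadings and a loading above 1 on the third factor.
hypothetical_loadings = [0.0, 0.0, 1.277]

resid = residual_variance(hypothetical_loadings, PHI)
print(round(resid, 3))  # negative, close to the -.63 reported above
```

With the factors nearly uncorrelated and no offsetting cross-loadings, the communality is essentially the squared loading, so any loading above 1 pushes the residual variance below zero.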
 anon9210 posted on Friday, September 03, 2010 - 4:40 pm
Thanks! On a related note, I am trying to figure out which rotation to use for my analyses. I get similar results with Oblimin and Quartimin; oddly enough, when I use Mplus' default Geomin, the results change somewhat. I am particularly interested in the significance of the correlations between my factors, and I noticed that if I use Oblimin or Quartimin, the magnitudes of the correlation coefficients all drop as a whole, but all are somewhat equal and are significant. However, when I use Geomin, there is a lot more discrepancy in the magnitudes of the correlation coefficients: some of them end up being significant and some nonsignificant. Which one would you recommend in this case?
 Linda K. Muthen posted on Saturday, September 04, 2010 - 9:05 am
We would recommend our default Geomin. See the paper below which is available on the website and also the Browne (2001) reference in the user's guide:

Sass, D.A. & Schmitt, T.A. (2010). A comparative investigation of rotation criteria within exploratory factor analysis. Multivariate Behavioral Research, 45, 73-103.
 anon9210 posted on Saturday, September 04, 2010 - 10:35 am
Thanks for the references - I will be sure to go over them. I also read Asparouhov & Muthen (2008) and noticed the footnote about geomin EFAs with more factors using larger epsilon values. I tried playing around with these some, while keeping a three-factor solution constant (based on theoretical reasons and the scree mentioned above; I have 11 indicators though now, so no negative variance problems), and noticed as the value of epsilon increased, the Geomin factor correlations resembled Quartimin and Oblimin more (i.e., more significant correlations between factors; though magnitudes of correlations were more equal and lower). So, a couple of questions again:

1.) Would this support going more with greater epsilon values, or sticking with the Mplus defaults?

2.) Additionally, lower epsilon values also lead to loadings greater than 1 (e.g., 1.048) for that one variable I mentioned previously (even with 11 indicators), though I no longer end up with negative residual variance. I checked previous threads, and if I am understanding things correctly, this is due to a couple of reasons:
(a) As geomin factor loadings are more similar to regression coefficients rather than correlation coefficients, they can be greater than 1, and
(b) Additionally, the correlation between factors might be contributing to this.
Is this correct?
 Linda K. Muthen posted on Sunday, September 05, 2010 - 11:46 am
1. Model fit and the model-estimated correlation matrix are the same for all rotations, so statistics can't help in this decision. Look at which solution seems most reasonable.

I would use the Mplus defaults unless I had a strong reason for doing otherwise.

2a. For all rotations, factor loadings are regression coefficients, which can be greater than one.

2b. Yes.
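
The points in this reply can be checked numerically with a small sketch. It assumes a two-factor model and a made-up orthogonal loading matrix `A` (hypothetical values, not from the poster's data), and uses a generic oblique transformation rather than Geomin, Oblimin, or Quartimin themselves: pattern loadings are `A (T')^-1` and factor correlations are `T'T`, where the columns of `T` have unit length. Any such `T` gives the same model-implied matrix, and a pattern loading can exceed 1 while the communality stays below 1.

```python
import math


def matmul(a, b):
    """Multiply two matrices given as lists of rows."""
    return [
        [sum(a[i][k] * b[k][j] for k in range(len(b))) for j in range(len(b[0]))]
        for i in range(len(a))
    ]


def transpose(a):
    return [list(row) for row in zip(*a)]


def inverse_2x2(m):
    det = m[0][0] * m[1][1] - m[0][1] * m[1][0]
    return [[m[1][1] / det, -m[0][1] / det], [-m[1][0] / det, m[0][0] / det]]


def oblique_rotate(a, angle1_deg, angle2_deg):
    """Return (pattern loadings, factor correlations) for a T whose two
    unit-length columns point along the given angles."""
    a1, a2 = math.radians(angle1_deg), math.radians(angle2_deg)
    t = [[math.cos(a1), math.cos(a2)], [math.sin(a1), math.sin(a2)]]
    t_t = transpose(t)
    return matmul(a, inverse_2x2(t_t)), matmul(t_t, t)


def implied(lam, phi):
    """Common part of the model-implied matrix: Lambda * Phi * Lambda'."""
    return matmul(matmul(lam, phi), transpose(lam))


# HYPOTHETICAL orthogonal loadings for four indicators on two factors.
A = [[0.95, -0.25], [0.70, 0.10], [0.10, 0.80], [0.20, 0.60]]

lam_a, phi_a = oblique_rotate(A, 0.0, 60.0)    # factor correlation 0.5
lam_b, phi_b = oblique_rotate(A, -10.0, 75.0)  # a different oblique solution

implied_a = implied(lam_a, phi_a)
implied_b = implied(lam_b, phi_b)

# Largest difference between the two implied matrices (numerically zero).
max_diff = max(
    abs(implied_a[i][j] - implied_b[i][j]) for i in range(4) for j in range(4)
)
print(max_diff)

# First indicator in the first rotation: loading above 1, communality below 1.
print(lam_a[0][0], implied_a[0][0])
```

In the first rotation the first indicator's loading on factor 1 is above 1, yet its communality is unchanged from the orthogonal solution because the negative cross-loading and the positive factor correlation offset it. This is the sense in which loadings behave like regression coefficients and correlated factors allow values above 1 without a negative residual variance.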