EFA eigenvalues
 Bill Roberts posted on Thursday, September 05, 2002 - 10:02 am
I have a question about the eigenvalues for EFA in Mplus. Using maximum likelihood for EFA in Mplus, I get 12 eigenvalues GE 1, and all of the eigenvalues are positive. The eigenvalues ranged from .35 to 6.379 for the 45 Likert-type items in the analysis. I compared these results with the same items and the same analysis using SAS. Eigenvalues from SAS ranged from -.46 to 9.25, with 6 eigenvalues GE 1. If I use the typical cut-off of an eigenvalue GE 1 to explore the underlying dimensionality of the items, SAS would indicate 6 factors and Mplus would indicate 12. I compared the rotated promax factor pattern matrix between SAS and Mplus for six factors and found that they are similar, with minor differences that are probably due to rounding. I tried to run EFA in Mplus with 12 factors and ran into a convergence problem. How should I interpret the eigenvalues given by Mplus?
 bmuthen posted on Thursday, September 05, 2002 - 5:37 pm
If SAS and Mplus have eigenvalue differences, I would assume that they are not computed for the same matrix. Assuming that your outcomes are continuous variables, the eigenvalues should all be positive. Perhaps SAS handles any missing data differently in this run, e.g. using pair-wise present data, which could lead to negative eigenvalues. Another reason for differences in the matrix used for eigenvalue computation is if an iterated principal factor method has been used in SAS, so that the sample matrix is not used, but rather a sample matrix that has an adjusted diagonal. Mplus considers the eigenvalues for the sample correlation matrix. The Mplus eigenvalues GE 1 can be used to guide in determining the number of factors, but such a guide is a very rough one - a somewhat less rough guide is to use the scree approach.
 Bill Roberts posted on Friday, September 06, 2002 - 8:06 am
Thank you for discussing reasons that could account for differences in the eigenvalues. I am fairly certain that I can rule out pair-wise deletion of missing cases. According to the sas documentation, missing cases are deleted listwise by default. The sample size is the same in both programs. The simple descriptive statistics and correlation matrix look nearly identical. Perhaps sas uses a different matrix to compute the eigenvalues, as you suggested.
 bmuthen posted on Friday, September 06, 2002 - 1:45 pm
Which factor extraction method in SAS is being used?
 Bill Roberts posted on Friday, September 06, 2002 - 2:08 pm
The method for proc factor is set to maximum likelihood and rotation is set to promax.
 Bengt O. Muthen posted on Friday, September 06, 2002 - 2:30 pm
They must be doing something to the correlation matrix; otherwise no eigenvalue would be negative, because when a correlation matrix is computed from listwise present data the correlation matrix is positive definite.
 Bill Roberts posted on Monday, September 09, 2002 - 2:55 pm
I am finding that, by default, SAS sets the prior communality for each variable to its squared multiple correlation with the other variables in the analysis when the method is maximum likelihood. After taking a closer look at the SAS output, I see that the cumulative variance exceeds 100 percent, at which point the eigenvalues become negative, bringing the cumulative variance back to 100 percent. If, however, the priors option is set to one for method = maximum likelihood, all eigenvalues are positive. When squared multiple correlations are inserted along the diagonal of the correlation matrix, the total variance to be decomposed into factors is less than the number of variables. Eigenvalues using principal components as the method of analysis in SAS were identical to what I am finding in the Mplus output using maximum likelihood as the estimator. Is there a way to specify EFA in Mplus using maximum likelihood and insert squared multiple correlations along the diagonal of the correlation matrix using nonsummary data?
 bmuthen posted on Monday, September 09, 2002 - 6:28 pm
The answer is no, and I can't see the need for it in terms of model estimation. The adjustments to the diagonal are connected with either descriptive purposes - getting a picture of relevant eigenvalues to guide in choosing number of factors - or are part of simpler estimation methods such as principal factoring. You can get Mplus to give such eigenvalues if you input an adjusted correlation matrix. But to get maximum-likelihood estimation from a correlation matrix, you don't want to adjust the diagonal of the correlation matrix.

Just a few more words on this. Mplus computes eigenvalues just like in principal component analysis (keeping the diagonal elements as they are - here 1). You can use such eigenvalues to descriptively guide you in choosing the number of factors. The idea of adjusting the diagonal is that this perhaps makes the resulting matrix more closely approximate Lambda*Psi*Lambda' in

Sigma = Lambda*Psi*Lambda' + Theta,

where the eigenvalues shed light on the rank (number of factors) of Lambda*Psi*Lambda'.
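The contrast discussed in this exchange, eigenvalues of the correlation matrix with 1's on the diagonal versus eigenvalues after squared multiple correlations (SMCs) are placed on the diagonal, can be sketched numerically. The following is only an illustration in plain Python on a made-up 4 x 4 equicorrelation matrix; the helper names `invert` and `eigenvalues` are hypothetical, and this is not the code that either Mplus or SAS runs.

```python
import math

def invert(m):
    """Gauss-Jordan inverse of a small, well-conditioned square matrix."""
    n = len(m)
    aug = [list(map(float, m[i])) + [1.0 if i == j else 0.0 for j in range(n)]
           for i in range(n)]
    for col in range(n):
        # Partial pivoting: bring the largest remaining entry onto the diagonal.
        pivot = max(range(col, n), key=lambda r: abs(aug[r][col]))
        aug[col], aug[pivot] = aug[pivot], aug[col]
        scale = aug[col][col]
        aug[col] = [x / scale for x in aug[col]]
        for r in range(n):
            if r != col:
                f = aug[r][col]
                aug[r] = [x - f * y for x, y in zip(aug[r], aug[col])]
    return [row[n:] for row in aug]

def eigenvalues(sym, sweeps=50):
    """Eigenvalues of a symmetric matrix by cyclic Jacobi rotations, largest first."""
    n = len(sym)
    a = [list(map(float, row)) for row in sym]
    for _ in range(sweeps):
        off = sum(a[i][j] ** 2 for i in range(n) for j in range(n) if i != j)
        if off < 1e-22:
            break
        for p in range(n - 1):
            for q in range(p + 1, n):
                if a[p][q] == 0.0:
                    continue
                d = a[q][q] - a[p][p]
                theta = math.pi / 4 if d == 0.0 else 0.5 * math.atan(2 * a[p][q] / d)
                c, s = math.cos(theta), math.sin(theta)
                for k in range(n):  # A <- A J (rotate columns p and q)
                    akp, akq = a[k][p], a[k][q]
                    a[k][p], a[k][q] = c * akp - s * akq, s * akp + c * akq
                for k in range(n):  # A <- J' A (rotate rows p and q)
                    apk, aqk = a[p][k], a[q][k]
                    a[p][k], a[q][k] = c * apk - s * aqk, s * apk + c * aqk
    return sorted((a[i][i] for i in range(n)), reverse=True)

# Made-up example: 4 items, every pairwise correlation equal to 0.5.
n_items = 4
r_full = [[1.0 if i == j else 0.5 for j in range(n_items)] for i in range(n_items)]

# SMC of item i with the remaining items: 1 - 1 / (i-th diagonal element of R^-1).
inverse = invert(r_full)
smc = [1.0 - 1.0 / inverse[i][i] for i in range(n_items)]
r_reduced = [[smc[i] if i == j else r_full[i][j] for j in range(n_items)]
             for i in range(n_items)]

full = eigenvalues(r_full)        # unit diagonal, as described for Mplus
reduced = eigenvalues(r_reduced)  # SMCs on the diagonal, as described for SAS priors
print([round(x, 3) for x in full], round(sum(full), 3))
print([round(x, 3) for x in reduced], round(sum(reduced), 3))
```

With the unit diagonal the eigenvalues sum to the number of items and none is negative. With SMCs on the diagonal they sum to the total of the SMCs, which is smaller, and the trailing eigenvalues of this toy matrix dip below zero. That is the pattern reported above for the SAS output.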
 Hervé CACI posted on Thursday, August 07, 2003 - 2:41 am
Bengt & Linda,

I'm using WLSMV with 4-point Likert-like item scores (Mplus 2.01). I understand that I can use other estimators as well.

I'm puzzled by the fact that Mplus outputs a different item correlation matrix as the number of factors extracted grows. I would rather assume that the correlation matrix remained unchanged, because the eigenvalues of this matrix are a guide to the number of factors to extract/rotate.

Also, I'd like to know on which correlation matrix the eigenvalues are computed. The first one printed out, or some hidden matrix?

What am I missing?

Thank you in advance.
 Linda K. Muthen posted on Thursday, August 07, 2003 - 6:46 am
The correlation matrices printed are model estimated and then after that the residuals are printed. The residuals are the model estimated minus the observed values. As the model changes, that is, as more factors are extracted, the model estimated values will change. The eigenvalues are based on the observed correlation matrix.
 Anonymous posted on Tuesday, August 24, 2004 - 9:53 am
Hi,

I am wondering how to correctly define an EFA using continuous variables using MPLUS (for write up). Since ones are placed on the diagonal is this analysis really a principal components analysis or a factor analysis with principal component extraction method? To my understanding there should not be residual variances for manifest variables using PCA. Is this correct?

Thanks
 Linda K. Muthen posted on Tuesday, August 24, 2004 - 10:25 am
It is a factor analysis using a maximum likelihood or unweighted least squares estimator. It does not use the principal components estimator.
 anonymous posted on Wednesday, October 19, 2005 - 12:02 pm
Hi Linda,

I have a similar question to the one posed above. How would one describe an EFA using categorical variables in Mplus (for write up)? Would it be correct to write, "a factor analysis using the WLSMV estimator and promax rotation"? What would you write about the extraction method?

Thanks!
 Linda K. Muthen posted on Wednesday, October 19, 2005 - 3:04 pm
Mplus does promax and varimax rotations. And the factor extraction method is the estimator, so if you are using WLSMV, it would be weighted least squares.
 James J. Prisciandaro posted on Friday, September 01, 2006 - 12:46 pm
Hi Drs. Muthen,

In EFA, Mplus outputs eigenvalues from the sample correlation matrix (i.e., with 1's on the diagonal) that can be used to determine the number of factors to retain (e.g., using scree plot, parallel analysis). However, some researchers have argued that when one is conducting an EFA, it may be more accurate to use eigenvalues from the reduced correlation matrix (e.g., with communalities on the diagonal) to determine the number of factors to retain. I was hoping you could explain to me how to obtain these latter eigenvalues in MPlus. In particular, how to do so in the context of my current research situation: EFA with binary data (WLSMV estimation; data are weighted).

Thanks,
Jim Prisciandaro
 Bengt O. Muthen posted on Monday, September 04, 2006 - 4:13 pm
Mplus does not produce eigenvalues for a reduced correlation matrix. I use a scree plot of eigenvalues for the unadjusted sample correlation matrix. Same with binary items and sampling weights.
 Julia Diemer posted on Thursday, May 03, 2007 - 11:37 pm
Hello,

Just a quick question: Is there a way of doing a parallel analysis (Horn) for determining the number of factors to extract in EFA with Mplus?

Julia Diemer
 Linda K. Muthen posted on Friday, May 04, 2007 - 7:11 am
I am not familiar with this method and it certainly isn't directly implemented in Mplus. I am not sure if it could be done in Mplus using the Monte Carlo simulation features however.
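For context, Horn's parallel analysis compares the eigenvalues of the observed correlation matrix with the average eigenvalues obtained from random, uncorrelated data of the same size, and retains factors only while the observed eigenvalue is larger. A minimal sketch in plain Python follows. It is not an Mplus feature or its implementation; the function names, the sample size of 300, and the "observed" eigenvalues are all made up for illustration.

```python
import math
import random

def eigenvalues(sym, sweeps=50):
    """Eigenvalues of a symmetric matrix by cyclic Jacobi rotations, largest first."""
    n = len(sym)
    a = [list(map(float, row)) for row in sym]
    for _ in range(sweeps):
        off = sum(a[i][j] ** 2 for i in range(n) for j in range(n) if i != j)
        if off < 1e-22:
            break
        for p in range(n - 1):
            for q in range(p + 1, n):
                if a[p][q] == 0.0:
                    continue
                d = a[q][q] - a[p][p]
                theta = math.pi / 4 if d == 0.0 else 0.5 * math.atan(2 * a[p][q] / d)
                c, s = math.cos(theta), math.sin(theta)
                for k in range(n):  # rotate columns p and q
                    akp, akq = a[k][p], a[k][q]
                    a[k][p], a[k][q] = c * akp - s * akq, s * akp + c * akq
                for k in range(n):  # rotate rows p and q
                    apk, aqk = a[p][k], a[q][k]
                    a[p][k], a[q][k] = c * apk - s * aqk, s * apk + c * aqk
    return sorted((a[i][i] for i in range(n)), reverse=True)

def correlation_matrix(data):
    """Pearson correlation matrix of complete data (rows = cases, columns = items)."""
    n, p = len(data), len(data[0])
    means = [sum(row[j] for row in data) / n for j in range(p)]
    centered = [[row[j] - means[j] for j in range(p)] for row in data]
    ss = [sum(row[j] ** 2 for row in centered) for j in range(p)]
    r = [[0.0] * p for _ in range(p)]
    for i in range(p):
        for j in range(i, p):
            cross = sum(row[i] * row[j] for row in centered)
            r[i][j] = r[j][i] = cross / math.sqrt(ss[i] * ss[j])
    return r

def random_eigenvalue_means(n_obs, n_vars, reps, seed=0):
    """Average the sorted eigenvalues over `reps` random normal data sets."""
    rng = random.Random(seed)
    totals = [0.0] * n_vars
    for _ in range(reps):
        data = [[rng.gauss(0.0, 1.0) for _ in range(n_vars)] for _ in range(n_obs)]
        for k, value in enumerate(eigenvalues(correlation_matrix(data))):
            totals[k] += value
    return [t / reps for t in totals]

def n_factors_to_retain(observed, reference):
    """Count leading observed eigenvalues that exceed the random-data reference."""
    count = 0
    for obs, ref in zip(observed, reference):
        if obs <= ref:
            break
        count += 1
    return count

# Hypothetical observed eigenvalues for 6 items (they sum to 6) and 300 cases.
observed = [3.2, 1.6, 0.5, 0.3, 0.25, 0.15]
reference = random_eigenvalue_means(n_obs=300, n_vars=6, reps=50, seed=1)
retained = n_factors_to_retain(observed, reference)
print([round(x, 3) for x in reference], retained)
```

Some variants compare against a high percentile of the random eigenvalues rather than their mean; the sketch uses the mean for simplicity.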
 Xuan Huang posted on Friday, June 08, 2007 - 3:05 pm
Dear professors,
I conducted EFA with eight 7-point scale items. I treated these variables as categorical and used wlsmv as estimator. I got 1 eigenvalue larger than 1 which is 5.419. All other eigenvalues range from .149 to .610. The eigenvalue indicates one-factor model may be good.
Here are my results:
One-factor: χ2(14)=109.876, P=0.0000, RMSEA=.154, RMSR=.0469;
Two-factor: χ2(10)=47.601, P=0.0000, RMSEA=.114, RMSR=.0280;
Three-factor: χ2(6)=21.236, P=0.0017, RMSEA=.094, RMSR=.0165;
Four-factor: χ2(2)=4.803, P=.0906, RMSEA=.07, RMSR=.009, one residual variance is negative;
I am confused about how to interpret the eigenvalues. They indicate a one-factor model, but the one-factor model has a large RMSEA value and a significant χ2.
Could you give me some hints on the inconsistency between what the eigenvalues suggest and what the model fit indices suggest?
Thank you very much!
 Linda K. Muthen posted on Friday, June 08, 2007 - 3:17 pm
When these eight items were developed, for how many dimensions were they developed?
 Xuan Huang posted on Friday, June 08, 2007 - 3:40 pm
Thanks for your reply. The eight items were developed to measure one dimension: parental warmth.
 Linda K. Muthen posted on Friday, June 08, 2007 - 4:20 pm
In view of that, the eigenvalues, and the RMSR, I would conclude one factor. I would, however, look at the other factor solutions and see which items cross-load and think about if that is what you would expect. Some items may not be behaving properly.
 QianLi Xue posted on Wednesday, September 30, 2009 - 6:31 am
Hi, Linda,
Does MPLUS provide summary statistics for the amount or % of variance explained by each of the factors in EFA?
 Bengt O. Muthen posted on Wednesday, September 30, 2009 - 8:42 am
No, because amount of variance explained is not the focus of factor analysis, but rather of principal component analysis. Also, the percentages are well-defined only for orthogonal rotations such as Varimax, which may not be an optimal rotation method. In the case of orthogonal rotation, you can compute the percentages yourself by summing the squared loadings in a column.
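The hand calculation mentioned above, summing the squared loadings in each column of an orthogonally rotated solution, is short enough to sketch. The loadings below are hypothetical, and the proportion assumes standardized items, so that the total variance equals the number of items.

```python
def variance_explained(loadings):
    """Return (sum of squared loadings, proportion of total variance) per factor.

    `loadings` holds one row per standardized item and one column per
    orthogonally rotated factor (for example, a Varimax solution).
    """
    n_items = len(loadings)
    n_factors = len(loadings[0])
    ssl = [sum(row[f] ** 2 for row in loadings) for f in range(n_factors)]
    return [(s, s / n_items) for s in ssl]

# Hypothetical Varimax-rotated loadings: four items, two factors.
example = [
    [0.8, 0.1],
    [0.7, 0.2],
    [0.1, 0.6],
    [0.2, 0.9],
]
result = variance_explained(example)
for factor, (ssl, share) in enumerate(result, start=1):
    print(f"factor {factor}: SS loadings = {ssl:.2f}, share = {share:.1%}")
```

For an oblique rotation such as Promax the factors are correlated, so these column sums no longer partition the variance, which is the caveat raised in the reply.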
 Helen Skerman posted on Wednesday, September 07, 2011 - 10:11 pm
In an EFA of categorical variables, I have negative eigenvalues for 2 of the 32 variables. Is the solution inadmissible? Can this be ignored or should I make some adjustment such as eliminating some low frequency variables? I tried "LISTWISE=ON", but this made no difference. Any other suggestions would be appreciated.
 Bengt O. Muthen posted on Thursday, September 08, 2011 - 7:20 am
I think this is ignorable. With categorical variables and WLSMV, you work with tetrachoric and polychoric correlations, which are computed for pairs of variables at a time and therefore can produce a non-positive definite sample correlation matrix, one that has some negative eigenvalues. You can still get a pos-def model-estimated correlation matrix. If the model fits well to this sample correlation matrix, you can take the view that the non-pos-def sample correlation matrix was not "significantly non-pos def." There have been ideas in the literature about deleting the eigenvalues and eigenvectors for the negative eigenvalues, recreating the sample correlation matrix this way (smoothing it), and then fitting the model, but I am not sure that is an important improvement.

If you use ML instead, this issue does not come up because ML does not fit the model to those sample correlations.
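The smoothing idea mentioned above (drop the negative eigenvalues and rebuild the matrix from the remaining eigenvalues and eigenvectors) can be sketched as follows. This is one simple reading of that idea, not something Mplus does; the 3 x 3 matrix is a made-up stand-in for a set of pairwise-estimated correlations, and `eigh` and `smooth` are hypothetical helper names.

```python
import math

def eigh(sym, sweeps=50):
    """Cyclic Jacobi: return (eigenvalues, V), eigenvectors in the columns of V."""
    n = len(sym)
    a = [list(map(float, row)) for row in sym]
    v = [[1.0 if i == j else 0.0 for j in range(n)] for i in range(n)]
    for _ in range(sweeps):
        off = sum(a[i][j] ** 2 for i in range(n) for j in range(n) if i != j)
        if off < 1e-22:
            break
        for p in range(n - 1):
            for q in range(p + 1, n):
                if a[p][q] == 0.0:
                    continue
                d = a[q][q] - a[p][p]
                theta = math.pi / 4 if d == 0.0 else 0.5 * math.atan(2 * a[p][q] / d)
                c, s = math.cos(theta), math.sin(theta)
                for k in range(n):  # rotate columns p and q of A and of V
                    akp, akq = a[k][p], a[k][q]
                    a[k][p], a[k][q] = c * akp - s * akq, s * akp + c * akq
                    vkp, vkq = v[k][p], v[k][q]
                    v[k][p], v[k][q] = c * vkp - s * vkq, s * vkp + c * vkq
                for k in range(n):  # rotate rows p and q of A
                    apk, aqk = a[p][k], a[q][k]
                    a[p][k], a[q][k] = c * apk - s * aqk, s * apk + c * aqk
    return [a[i][i] for i in range(n)], v

def smooth(r):
    """Zero out negative eigenvalues, rebuild, and rescale to a unit diagonal."""
    n = len(r)
    values, vectors = eigh(r)
    kept = [max(val, 0.0) for val in values]
    rebuilt = [[sum(vectors[i][k] * kept[k] * vectors[j][k] for k in range(n))
                for j in range(n)] for i in range(n)]
    scale = [math.sqrt(rebuilt[i][i]) for i in range(n)]
    return [[rebuilt[i][j] / (scale[i] * scale[j]) for j in range(n)]
            for i in range(n)]

# Made-up pairwise correlations that are mutually inconsistent (not pos-def).
r_bad = [
    [1.0, 0.9, 0.3],
    [0.9, 1.0, 0.9],
    [0.3, 0.9, 1.0],
]
before = sorted(eigh(r_bad)[0], reverse=True)
r_smooth = smooth(r_bad)
after = sorted(eigh(r_smooth)[0], reverse=True)
print([round(x, 4) for x in before])
print([round(x, 4) for x in after])
```

The smoothed matrix keeps a unit diagonal and has no negative eigenvalues, at the cost of slightly changing the off-diagonal correlations.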
 Lisa M. Yarnell posted on Tuesday, November 29, 2011 - 7:40 pm
Hello. When one is choosing among EFA factor solutions using criteria such as the Scree plot and overall model fit, is it true that when you pick a greater number of factors to be extracted, there is necessarily better fit (according to CFI, TLI, and RMSEA)? Or can it sometimes occur that higher numbers of factors extracted can actually result in a more poorly-fitting model than when fewer factors are extracted? Thanks.
 Linda K. Muthen posted on Wednesday, November 30, 2011 - 9:27 am
Chi-square will improve but I don't necessarily think that would hold with CFI, TLI, and RMSEA. Note that there is a maximum number of factors that can be extracted from a set of indicators. Also, you can get negative residual variances which make the solution inadmissible.
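The limit on the number of factors can be made concrete with the usual degrees-of-freedom count for an unrestricted EFA with p indicators and m factors, df = ((p - m)^2 - (p + m)) / 2; a solution with negative df is not identified. The sketch below assumes that standard count, and the function names are hypothetical. Estimators that adjust the degrees of freedom may report different values in their output.

```python
def efa_df(n_items, n_factors):
    """Degrees of freedom of an unrestricted m-factor EFA with p indicators."""
    p, m = n_items, n_factors
    # (p - m)^2 and (p + m) always have the same parity, so the division is exact.
    return ((p - m) ** 2 - (p + m)) // 2

def max_factors(n_items):
    """Largest number of factors for which the degrees of freedom stay >= 0."""
    m = 0
    while efa_df(n_items, m + 1) >= 0:
        m += 1
    return m

for p in (8, 45, 50):
    print(p, "items: at most", max_factors(p), "factors")
```

For example, `efa_df(8, 4)` gives 2, which agrees with the four-factor, eight-item result quoted earlier in the thread, and `max_factors(8)` gives 4.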
 Tracy Waasdorp posted on Thursday, March 01, 2012 - 7:11 am
In an EFA, what is the equation used to calculate the eigenvalues for ML?

Thank you
 Linda K. Muthen posted on Thursday, March 01, 2012 - 8:01 am
It is the regular algorithm for computing the eigenvalues of a matrix, here the sample correlation matrix. Try Googling eigenvalue.
 emmanuel bofah posted on Monday, September 24, 2012 - 11:48 am
Which example in Chapter 4 can save the eigenvalues, so that I can use the O’Connor (2000) macros to generate the eigenvalues for the random variables? This is a process recommended in a recent book: Hancock, G. R., & Mueller, R. O. (Eds.). (2013). Structural equation modeling: A second course (2nd ed.). Charlotte, NC: Information Age Publishing, Inc. (supplementary).
 Linda K. Muthen posted on Monday, September 24, 2012 - 12:04 pm
The eigenvalues are printed in the output. They are not saved. This test will be in Version 7 using the PARALLEL option.
 ellen posted on Tuesday, October 15, 2013 - 11:00 am
Hi,
I just installed Mplus version 7.11 yesterday. It runs well with regular SEM analyses. I wanted to perform a Parallel Analysis to determine the optimum number of factors in an exploratory factor analysis. I used the syntax below, but it's been running since yesterday evening till now for 15 hours, and it's still running with no output available yet. I am wondering whether I made a mistake in the syntax. Why is it taking so long?

VARIABLE:
NAMES ARE sex age race y1-y50;

USEVARIABLES ARE y1-y50;

MISSING IS all (-99);

ANALYSIS:
TYPE = EFA 1 50;
PARALLEL = 1000;

PLOT:
TYPE= PLOT2 ;

could you please let me know whether this is the correct syntax to run a Parallel Analysis?

Thanks so much!
 Linda K. Muthen posted on Tuesday, October 15, 2013 - 11:13 am
You are asking for 50 factor solutions with 1000 random data sets for each of the 50 solutions. I would imagine that could take some time and you have 50 items. The problem is most likely that you are trying to extract too many factors and are getting negative residual variances which could cause slow convergence. I would choose a range of factors related to the number of factors for which the data were developed. For example, if the fifty items should contain four factors, I would perhaps ask for solutions from 1 to 6 or 2 to 6.
 lydia posted on Thursday, January 28, 2016 - 12:13 pm
I ran a categorical EFA and I am finding a mismatch between the number of factors suggested by the eigenvalues and the model fit statistics. For example, I may have 1 or 2 factors with eigenvalues above 1 and the factor loadings look okay, but those solutions have a significant chi-square and poor or mixed model fit (RMSEA, CFI, TLI). If I go to the solutions that have good or better model fit, then I'm choosing factors with low eigenvalues. Do you have any suggestions as to why this happens and whether there is anything I can try to address this issue? Should I ignore one set of statistics in favor of the other? This is an EFA of a scale that hadn't been previously validated or had been modified, so it is not tied to any specific number of factors.
 Bengt O. Muthen posted on Thursday, January 28, 2016 - 6:20 pm
Eigenvalues are more of a descriptive device than a model-testing device. It is not uncommon that they disagree with chi-square testing. I would use the usual fit statistics.