
sandy posted on Thursday, June 19, 2003  9:23 am



I have a set of 11 categorical variables (some are binary and the others have up to 7 levels). I am interested in performing factor analysis on these data. All of the multivariate analysis books that I have read say that you cannot perform factor analysis on categorical data. How is it that your program can do what others say is impossible? Is there a paper that you can suggest that would help me understand why this can be done? Also, I could not find out what type of correlation matrix the program uses in the factor analysis. I found that it uses tetrachoric correlations when the data are binary, but what about when a variable has more than two levels? Thanks for your help with this.


I think what these books mean to say is that the methods they are presenting are for continuous outcomes. If you go to www.statmodel.com under References, you will find a set of papers and books related to the factor analysis of categorical outcomes. For ordered polytomous variables, factor analysis uses a polychoric correlation matrix. 


We're attempting to run an EFA on 13 categorical items. It runs when we specify a 2-factor solution, but it will not run for a 1-factor solution. The error message says: Unrecognized setting for TYPE option: 1. PLEASE HELP


You may be saying TYPE = EFA 1; You should say TYPE = EFA 1 1; because you need to give both a minimum and a maximum number of factors to extract. If this is not what you are doing, please send the output so that I can take a look at it.
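A minimal input sketch of the fix described in the reply above. The file name and item names are hypothetical placeholders; only the TYPE = EFA 1 1; setting comes from the thread.

```
TITLE:    One-factor EFA on 13 categorical items (sketch);
DATA:     FILE IS items.dat;        ! hypothetical file name
VARIABLE: NAMES ARE u1-u13;         ! hypothetical item names
          CATEGORICAL ARE u1-u13;
ANALYSIS: TYPE = EFA 1 1;           ! minimum = 1 factor, maximum = 1 factor
```

Writing TYPE = EFA 1 2; instead would request both the 1-factor and the 2-factor solutions in a single run.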

Anonymous posted on Monday, August 04, 2003  10:45 am



HI, I WOULD LIKE TO CALCULATE A CORRELATION MATRIX FOR A MIXTURE OF CATEGORICAL AND CONTINUOUS VARIABLES. HOW DO I GET A CORRELATION MATRIX OUTPUT IN MPLUS? THANKS


You would use the FILE (SAMPLE) IS option of the SAVEDATA command to obtain a correlation matrix.
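A rough sketch of how that option might sit in a complete input. The file names and variable names are hypothetical, and the option is written exactly as quoted in the reply; the option name may differ across Mplus versions (later editions of the User's Guide appear to list it as SAMPLE IS), so check the guide for your version.

```
DATA:     FILE IS mydata.dat;            ! hypothetical input file
VARIABLE: NAMES ARE y1-y3 u1-u3;         ! hypothetical names: y1-y3 continuous
          CATEGORICAL ARE u1-u3;         ! u1-u3 categorical
ANALYSIS: TYPE = BASIC;                  ! sample statistics only, no model
SAVEDATA: FILE (SAMPLE) IS corr.dat;     ! option as quoted above; hypothetical output file
```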


My EFA in Mplus and SPSS ran fine, but I am getting an error message that my correlation matrix is not positive definite when I run it with CEFA and SAS. I have 37 items with n=102. Is there a way I can calculate the determinant for various combinations of items to investigate which items are causing it to go to 0? Also, do you know if the thresholds for a 0 determinant differ, such that Mplus will run but SAS won't? Thanks


I think SAS probably has a function to invert positive definite matrices. You could try to invert the matrix to see where the problem is. The default for EFA in Mplus is the unweighted least squares estimator, for which positive definiteness is not required or checked. This may also be the case in SPSS; I don't know. If you use the ML estimator in Mplus, it will check for positive definiteness of the sample correlation matrix.


Two quick follow-up questions: Why is positive definiteness not checked or required for ULS? Also, is there a way to see the determinant in the output? Thanks again!

bmuthen posted on Friday, March 04, 2005  6:11 am



Unlike ML, there is nothing in the computations for ULS that requires positive definiteness of the sample correlation/covariance matrix. A factor model implies a positive definite model-estimated matrix, but you can fit such a model to a non-positive-definite sample counterpart, and the model may still fit well. In this sense, the non-positive-definiteness of the sample matrix may not be "significant". This is often the case with tetrachoric and polychoric correlation matrices. Currently, there is no printout of the determinant. With EFA, however, there is a printout of the eigenvalues.
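To see where positive definiteness enters, it may help to compare the textbook forms of the two fit functions. The notation here is an assumption and not from the thread: S is the p × p sample matrix and Σ(θ) is the model-implied matrix. The exact functions Mplus minimizes may differ in detail.

```latex
F_{\mathrm{ULS}}(\theta) = \tfrac{1}{2}\,\operatorname{tr}\!\left[\bigl(S - \Sigma(\theta)\bigr)^{2}\right]

F_{\mathrm{ML}}(\theta) = \ln\lvert\Sigma(\theta)\rvert
  + \operatorname{tr}\!\left(S\,\Sigma(\theta)^{-1}\right)
  - \ln\lvert S\rvert - p
```

In this form F_ML contains ln|S|, which is undefined when the determinant of S is zero or negative, whereas F_ULS only uses elementwise differences between S and Σ(θ) and never takes a determinant or inverse of S.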

Pat posted on Thursday, August 25, 2005  8:15 am



Morning Dr. Muthen, I am conducting an EFA on a set of 13 categorical manifest variables. I am new to Mplus and have been struggling with some of the finer aspects of the code. The code below gives me the basic output, but I was also hoping to see the polychoric correlation matrix for my items and the eigenvalue plot, and my foray into the code has been sketchy at best. Can you help me out?

Data: file is 'C:\Documents and Settings\Pat\Desktop\TRIG209.txt';
format is free;
type is individual;
ngroups = 1;
Variable: Names are trig1-trig13;
Categorical are trig1-trig13;
Analysis: type = EFA 1 4;
estimator = WLSMV;

Thanks, Pat


See the PLOT command for the eigenvalue plot. See the SAMPSTAT option of the OUTPUT command for the tetrachoric correlations. You can also use TYPE=BASIC in the ANALYSIS command to obtain these. If you have further questions of this type, please send them along with your license number to support@statmodel.com. 
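A sketch of how the two suggestions above might be added to the input from the question. The SAMPSTAT option is named in the reply; the specific plot setting TYPE = PLOT2 is an assumption about which plot type carries the EFA eigenvalue plot, so check the PLOT command section of the User's Guide for your version.

```
DATA:     FILE IS 'C:\Documents and Settings\Pat\Desktop\TRIG209.txt';
VARIABLE: NAMES ARE trig1-trig13;
          CATEGORICAL ARE trig1-trig13;
ANALYSIS: TYPE = EFA 1 4;
          ESTIMATOR = WLSMV;
OUTPUT:   SAMPSTAT;        ! prints sample statistics, including the sample correlations
PLOT:     TYPE = PLOT2;    ! assumed setting for the eigenvalue plot
```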


I would like to see the polychoric correlation matrix for 24 items with n=334 using SAS. PROC FREQ can only calculate the polychoric correlation coefficient for each pair of items separately and does not display the correlation matrix. Many thanks.


I'm afraid you will have to direct your question to SAS. I am not familiar with this program. 

shig posted on Thursday, February 02, 2006  5:45 pm



Mohammed, see http://support.sas.com/ctx/samples/index.jsp?sid=512 

Anonymous posted on Saturday, February 25, 2006  11:52 am



Hi, I have done an EFA for a 24-item scale with 4-point items, and the sample size was n=400. I obtained 3 solutions, by PCF, PAF and ML, using the polychoric correlation matrix. With PCF and PAF, eight factors should be retained, and each solution accounted for only 62% of the variance, but the common factors were not the same with respect to the items loading on each factor. When I performed ML, only 4 factors should be retained and the solution accounted for 100% of the variance. The ML solution was statistically more acceptable than the others with respect to RSMR & RSMP. I just wanted to know: what is the best method for performing exploratory factor analysis? And why is there a difference between the 3 solutions? Am I going about this the wrong way? Many thanks.

bmuthen posted on Saturday, February 25, 2006  12:12 pm



It does not sound like you are using Mplus. In Mplus, EFA on categorical items can be done by ULS, WLSMV, or ML. ULS and WLSMV fit the model to polychorics. I would not expect large differences between the 3 Mplus approaches. 

Anonymous posted on Sunday, February 26, 2006  3:44 am



One quick question: why can EFA on categorical items in Mplus only be done by ULS, WLSMV, or ML, with ULS and WLSMV fitting the model to polychorics? Where are principal components factoring and principal axis factoring as EFA methods? Is it for a theoretical reason or a technical reason?

bmuthen posted on Sunday, February 26, 2006  5:50 am



Principal components analysis is a biased estimator for the factor analysis model (because it assumes zero residuals). It is only used to generate starting values. Principal factoring (with iterated communalities) gives, I believe, the same results as ULS.

james tapp posted on Thursday, July 27, 2006  1:04 am



Dear Dr Muthen, You very kindly helped me with some advice regarding the rotation of factors, and I am emailing you with another query. I would much appreciate it if you could point me in the right direction. I am interested in running a factor analysis on true binary data but want to avoid some of the pitfalls of linear correlation matrices. Is there any literature available on whether it is possible to conduct a factor analysis on distance-generated matrices (e.g. Jaccard)? Thank you again for your time. James, NHS


I am not familiar with such literature. Anybody else?


Hello, I am still new to Mplus and have a question regarding EFA with ordinal data. Can I do a common factor analysis (rather than PCA) with Mplus? Thank you very much in advance, Julia Diemer 


Mplus does what is referred to as common factor analysis. 


Dear authors, I read a post in this topic that says "EFA on categorical items can be done by ULS, WLSMV, or ML. ULS and WLSMV fit the model to polychorics." But when I try to estimate an EFA on some categorical indicators with the ML estimator, Mplus (version 4.2) gives me this warning:

*** WARNING in Analysis command
Estimator ML is not available for EFA analysis with non continuous variables. Default estimator will be used.

My input code is:

VARIABLE: ...
USEVARIABLES ARE GIUDIZIO RAPCOL RAPDOC RAPNDOC STRAULE STRBLB STRLAB R1452 R1482;
CATEGORICAL ARE GIUDIZIO RAPCOL RAPDOC RAPNDOC STRAULE STRBLB STRLAB R1452 R1482;
MISSING ARE all(999);
ANALYSIS: ESTIMATOR IS ml;
TYPE = EFA 2 3;

Why? Is there something wrong in my code? Best regards


ML will be available with EFA and categorical factor indicators in Version 5. It is not currently available. 
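Until then, one way to avoid the warning is simply not to request ML. The sketch below reuses the ANALYSIS settings from the question and either relies on the default estimator or names one of the estimators mentioned earlier in this thread for categorical EFA (ULS or WLSMV); whether WLSMV suits a given data set is a separate judgment.

```
ANALYSIS: TYPE = EFA 2 3;       ! no ESTIMATOR line: the default estimator is used
! or, to be explicit about the estimator:
! ANALYSIS: TYPE = EFA 2 3;
!           ESTIMATOR = WLSMV;
```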

aleksandar posted on Sunday, March 30, 2008  2:46 pm



Dear Dr. Muthen, I am confused. I would like to conduct a factor analysis, but I don't know whether I should use principal components, maximum likelihood, or something else. THANK YOU FOR YOUR TIME.


Please have a look at the article by Fabrigar et al. (1999) on the Mplus web site under References, Continuous Outcomes, EFA.

aleksandar posted on Monday, September 01, 2008  2:57 am



I would like to ask: how high does a factor loading have to be for the variable to be considered a defining part of that factor? In my example I have binary data.


This is a big topic. You should watch the video for Topics 1 and 2 of our short courses which are on our website. 

Cecily Na posted on Friday, May 04, 2012  2:01 pm



Hello, I want to know if several indicators of the same latent factor can be in different metrics. For instance, latent variable X has three continuous indicators x1, x2 and x3: x1 is on a 0-5 scale, x2 is on a 0-100 scale, and x3 is on a 0-1 scale. Would such a composite work? Is there a suggested limit to the difference in metric, for example that the largest range among the indicators should be less than 10 times the smallest range? Thank you very much!


That would work as is. I would make sure continuous indicators have variances between one and ten. You can rescale them by dividing them by a constant using the DEFINE command if not. 
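A small sketch of the rescaling step using the indicator names from the question. The file name, the divisor, and the one-factor MODEL statement are illustrative assumptions; the right constant depends on the observed variances, which can be checked in the sample statistics.

```
DATA:     FILE IS indicators.dat;   ! hypothetical file name
VARIABLE: NAMES ARE x1 x2 x3;
DEFINE:   x2 = x2/10;               ! hypothetical constant for the 0-100 indicator
MODEL:    f BY x1 x2 x3;            ! one factor measured by the three indicators
```

Dividing a variable by 10 divides its variance by 100, so pick the constant by looking at the variance rather than the range.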

Jan Zirk posted on Wednesday, September 18, 2013  8:10 am



Is there a way to avoid listwise deletion in the Bayesian EFA? Is there a FIML-like procedure for Bayes?


Bayes does not use listwise deletion. It uses a FIML-like procedure.

Jan Zirk posted on Wednesday, September 18, 2013  8:41 am



Thank you. I have run a Bayesian EFA on 80 cases, but half of them were excluded because they had missing values.


The only cases that would be excluded in an EFA are cases with missing values on all of the dependent variables. If half are being excluded, you may be reading your data incorrectly. For example, you may have more variable names in the NAMES list than there are columns in the data, causing two records to be read instead of one.

Jan Zirk posted on Wednesday, September 18, 2013  10:04 am



Indeed! Thanks very much for this. 

Nara Jang posted on Sunday, April 06, 2014  11:38 am



Dear Dr. Muthen, Would you tell me if there is a way to include an interaction term (effect) in EFA? Thank you so much in advance for your great help.


No, you would have to use CFA for that. 

Nara Jang posted on Sunday, April 06, 2014  8:04 pm



Dear Dr. Muthen, Thank you so much!! 
