
Xu, Man posted on Wednesday, March 06, 2013  4:06 am



I'd like to run parallel analysis for some categorical data, but the parallel analysis option is not available for categorical data. I was wondering if it makes sense to use the biserial/tetrachoric correlation matrix as data input and analyse it as if it were continuous (but without estimating the thresholds). To get the biserial/tetrachoric correlation matrix based on the same sample (taking missing data into account), I would declare all data as categorical and ask for SAMPSTAT output to get the correlation matrix. Would this be sensible to do? Thanks! 


We do not provide parallel analysis for categorical data because we have found it does not work well for categorical data. 


Hi Linda, I have a couple of questions regarding the use of parallel analysis in Mplus: 1) What happens when the ML solution does not converge with the random data? I imagine that this would happen quite often when trying to extract multiple factors from data that are uncorrelated in the population. 2) Is there a particular reason why you chose not to use the principal component eigenvalues for parallel analysis? The PA procedure seems to work quite well with these eigenvalues, including cases with ordinal data and polychoric correlations (e.g., Garrido, Abad, & Ponsoda, 2012). Thanks! 


1. Convergence is not an issue. The randomly generated correlation matrices are used only to compute eigenvalues. 2. The principal component eigenvalues are used in that we compute the eigenvalues for the correlation matrix not adjusting the diagonal. 
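
For readers who want to see the procedure described in this reply, here is a minimal sketch in Python with NumPy. It is not the Mplus implementation. It assumes continuous variables, Pearson correlations, normally distributed random data, and a 95th-percentile reference line; the function name `parallel_analysis` and its parameters are hypothetical choices for illustration. As in the reply above, no factor model is fitted to the random data, so convergence never comes into play: only eigenvalues of the unadjusted correlation matrix are computed.

```python
import numpy as np

def parallel_analysis(data, n_sims=100, percentile=95, seed=0):
    """Sketch of parallel analysis on Pearson correlations.

    Assumptions (not taken from Mplus): random data are standard normal,
    and the reference line is the given percentile of the random eigenvalues.
    """
    rng = np.random.default_rng(seed)
    n, p = data.shape

    # Eigenvalues of the observed correlation matrix with ones on the
    # diagonal (principal component eigenvalues), sorted largest first.
    observed = np.sort(np.linalg.eigvalsh(np.corrcoef(data, rowvar=False)))[::-1]

    # Eigenvalues of correlation matrices from uncorrelated random data
    # with the same number of rows and columns as the observed data.
    random_eigs = np.empty((n_sims, p))
    for i in range(n_sims):
        sim = rng.standard_normal((n, p))
        random_eigs[i] = np.sort(np.linalg.eigvalsh(np.corrcoef(sim, rowvar=False)))[::-1]
    reference = np.percentile(random_eigs, percentile, axis=0)

    # Count leading observed eigenvalues that exceed the reference line,
    # stopping at the first one that does not.
    exceeds = observed > reference
    n_factors = p if exceeds.all() else int(np.argmin(exceeds))
    return observed, reference, n_factors
```

Calling `parallel_analysis(X)` on an n-by-p array `X` returns the observed eigenvalues, the reference eigenvalues, and the suggested number of factors. Applying the same idea to categorical items would mean replacing `np.corrcoef` with tetrachoric or polychoric correlations, which is the case debated in this thread.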


Thanks for the clarifications! 


Hi, Could anyone clarify why parallel analysis would not work well for determining the dimensionality of a categorical (dichotomous) data matrix? Maybe this is covered somewhere else, but I could not find it myself. Thanks, Carlos 


Look at what I have found out: http://www.csie.ntu.edu.tw/~r95038/Try/%A7O%A4H%BD%D7%A4%E5/697.pdf 


And here I found a study on why parallel analysis should be used with caution with dichotomous matrices: http://www.er.uqam.ca/nobel/r17165/RECHERCHE/COMMUNICATIONS/2006/AERA_RASCH_SIG/AERA_RASCH_SIG/SUMMARY.pdf 


We had parallel analysis developed also for tetrachoric and polychoric correlations, but my explorations of it indicated that it didn't work well, so we didn't include it in release versions. The poor performance may have to do with the fact that these correlation matrices behave differently than correlations among observed continuous variables. 


Hi, Is there a reference that could be cited to say that the performance of parallel analysis on categorical data is poor? It has been suggested that I perform parallel analysis to decide on the number of factors to keep in an EFA, so I want either to learn how to do it or to justify why not. Regards, Örjan 


We looked at parallel analysis for categorical variables using eigenvalues for latent correlations (tetrachoric, polychoric...), but it didn't seem to point to the right number of factors so we decided to not include it in Mplus, but only have it for continuous outcomes. Perhaps it has to do with the correlations not capturing all information in the data with categorical variables. I know of no references on this. 


Dear Dr. Muthen, I also wish that Parallel Analysis (PA) would be implemented in Mplus for categorical variables (as I indicated in a previous post in this thread from 2013). My colleagues and I have been working on this issue for some time and have found through large simulations that PA works quite well with categorized data (obviously not as well as with the underlying continuous variables prior to categorization, but this is common for all methods with smaller samples due to the loss of information). In fact, we recently published an article (see below) comparing the performance of PA with the CFI, TLI, RMSEA and SRMR fit indices, and PA clearly outperformed all of them with categorical data. To my knowledge there is currently no method used in the psychological/educational literature that outperforms PA with this type of data. Garrido, L. E., Abad, F. J., & Ponsoda, V. (2015, December 14). Are Fit Indices Really Fit to Estimate the Number of Factors with Categorical Variables? Some Cautionary Findings via Monte Carlo Simulation. Psychological Methods. Advance online publication. http://dx.doi.org/10.1037/met0000064 


I forgot to clarify in my previous post that we implemented PA with tetrachoric/polychoric correlations. 


Interesting. I tried out PA on sample tetrachoric correlations once we had it for Pearson correlations and my initial impression was that it didn't capture the right number of factors at all. And that's why we didn't implement it. So I will be interested in reading your new paper. Thanks for letting us know. 


Hello, I have categorical variables rated on a 4-point Likert scale, and I would like to conduct parallel analysis to determine the number of factors to retain. My version of Mplus (Version 7) will not allow parallel analysis with categorical variables. I read in an earlier post that 5-point Likert scales can be treated as continuous in order to run parallel analysis in Mplus. Would the same apply to 4-point Likert scale items? 


If you don't have strong floor or ceiling effects, 4-point variables might be well approximated as continuous. 

Kellina Pyle posted on Wednesday, August 30, 2017  10:55 am



Thank you for your prompt reply. Unfortunately I have strong floor effects. Is parallel analysis inappropriate here? What might be a better approach for determining the number of factors to retain with nonnormal categorical data? 


There was a recent article on parallel analysis for categorical variables (using latent correlations), but this is not implemented in Mplus. With the WLSMV estimator you can decide on the number of factors using chi-square and other overall model fit statistics. With ML, you can use BIC. 
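
As a rough illustration of the BIC comparison mentioned in this reply, the sketch below applies the standard formula BIC = -2 logL + k ln(n) to a set of fitted models and picks the one with the lowest value. Mplus reports BIC in its own output for ML estimation, so this is only meant to show the arithmetic. The function names are hypothetical, and the loglikelihood values and parameter counts in the usage lines are made up for illustration, not results from any real analysis.

```python
import math

def bic(log_likelihood, n_params, n_obs):
    """Standard BIC: -2 * logL + (number of free parameters) * ln(sample size)."""
    return -2.0 * log_likelihood + n_params * math.log(n_obs)

def best_by_bic(fits, n_obs):
    """Pick the number of factors with the lowest BIC.

    `fits` maps number of factors -> (loglikelihood, number of free parameters).
    """
    scores = {k: bic(ll, p, n_obs) for k, (ll, p) in fits.items()}
    return min(scores, key=scores.get), scores

# Hypothetical values, for illustration only.
fits = {1: (-1050.0, 20), 2: (-1000.0, 29), 3: (-998.0, 37)}
best, scores = best_by_bic(fits, n_obs=500)
```

In this made-up example the 3-factor model has the highest loglikelihood, but its extra parameters are penalized and the 2-factor model ends up with the lowest BIC.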
