

CFA with categorical data 


Shirley posted on Wednesday, March 09, 2016  1:36 am



Dear Dr. Muthen, We are examining the factorial structure of a 34-item instrument using data from about 100 participants. Each item score is generated by summing the subscores on dichotomously scored indicators within that item, and is treated as ordinal data in the subsequent factor analysis. To evaluate the factorial structure of the instrument, we first performed an exploratory factor analysis and then, based on its output, fit a 2-factor model to a subset of items (estimator is WLSMV). The Mplus output indicates that the model estimation terminated normally, and the final CFA model demonstrates reasonable fit (i.e., RMSEA close to .06, CFI close to .95, etc.). The pattern of factor loadings and the factor correlation also match our expectations. However, we noticed that the number of free parameters is larger than our sample size. May we ask for your advice on this? Specifically, should we be concerned when interpreting the output of this CFA model, and if so, what alternative analysis strategies could we consider? Thanks very much for your time! 


It is generally not good practice to have more parameters than observations. I don't have a reference. You might want to ask about this on a general discussion forum like SEMNET. 
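
For readers who want to check this comparison by hand, here is a minimal Python sketch of a common parameter tally for a CFA with ordinal indicators. It assumes simple structure (each item loads on one factor), the same number of categories for every item, and no free residual variances. The item count (20) and number of categories (5) are hypothetical, since the thread does not give them; only the sample size of about 100 comes from the post. The function name is also made up for illustration.

```python
def cfa_ordinal_free_params(n_items, n_categories, n_factors):
    """Tally free parameters for a simple-structure CFA with ordinal items.

    Assumptions (not from the thread): every item has the same number of
    categories, each item loads on exactly one factor, and residual
    variances are not free parameters.
    """
    # One threshold per boundary between adjacent categories.
    thresholds = n_items * (n_categories - 1)
    # Fixing one loading per factor and freeing the factor variance, or
    # freeing all loadings and fixing the variance, both net to n_items.
    loadings_and_variances = n_items
    # One covariance (or correlation) per pair of factors.
    factor_covariances = n_factors * (n_factors - 1) // 2
    return thresholds + loadings_and_variances + factor_covariances


n_obs = 100  # "about 100 participants" in the post
# Hypothetical subset: 20 items with 5 ordered categories each, 2 factors.
n_params = cfa_ordinal_free_params(n_items=20, n_categories=5, n_factors=2)
print(n_params, n_params > n_obs)
```

With these made-up inputs the thresholds alone account for 80 of the parameters, which shows how quickly ordinal items with several categories can push the count past a sample of about 100.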

Shirley posted on Thursday, March 10, 2016  1:42 am



Thank you, Dr. Muthen, for your help. I have a related question about the number of free parameters printed in the Mplus output and would appreciate your advice. Specifically, the number of free parameters is 34 in a 1-factor EFA (number of items = 34; estimator = WLSMV) when item scores are specified as ordinal data. The number of free parameters increases sharply to 102 in a 1-factor EFA (number of items = 34; estimator = ML) when item scores are treated as continuous variables. May I know how the number of parameters is determined in each of the two analyses? Thanks again! 


The difference in the number of parameters is due to the ordinal variables having more thresholds. Compare the results or TECH1 from the two outputs to see the difference in the parameters. 
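
As a quick arithmetic check on the two counts quoted in the question, the following Python sketch reproduces 102 and 34 under one possible reading. It is not a description of Mplus internals. For continuous items it counts a loading, an intercept, and a residual variance per item. For ordinal items it assumes that only the loadings enter the reported EFA total, that is, that thresholds are not included in that total and residual variances are not free; the thread does not confirm either assumption, and comparing the two outputs is the way to verify it. The function names are hypothetical.

```python
def efa_loadings(n_items, n_factors):
    # Free loadings in an EFA after removing the m(m-1)/2 rotational
    # indeterminacies for m factors.
    return n_items * n_factors - n_factors * (n_factors - 1) // 2


def efa_params_continuous(n_items, n_factors=1):
    # Loadings plus one intercept and one residual variance per item.
    return efa_loadings(n_items, n_factors) + n_items + n_items


def efa_params_ordinal_assumed(n_items, n_factors=1):
    # ASSUMPTION: only loadings are counted in the reported total;
    # thresholds and residual variances are left out of the tally.
    return efa_loadings(n_items, n_factors)


print(efa_params_continuous(34))       # count for the ML, continuous case
print(efa_params_ordinal_assumed(34))  # count for the WLSMV, ordinal case
```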


