CFA with categorical data
 Shirley  posted on Wednesday, March 09, 2016 - 1:36 am
Dear Dr. Muthen,
We are examining the factorial structure of a 34-item instrument using data from about 100 participants. Each item score is the sum of the dichotomously scored indicators within that item and is treated as ordinal data in the subsequent factor analysis. To evaluate the factorial structure of the instrument, we first performed an exploratory factor analysis and then, based on its output, fit a 2-factor model to a subset of items (estimator is WLSMV). The Mplus output indicates that the model estimation terminated normally, and the final CFA model shows reasonable fit (i.e., RMSEA close to .06, CFI close to .95, etc.). The pattern of factor loadings and the factor correlation also match our expectations. However, we noticed that the number of free parameters is larger than our sample size. May we ask for your advice on this? Specifically, should we be concerned when interpreting the output of this CFA model, and if so, what alternative analysis strategies could we consider?

Thanks very much for your time!
 Linda K. Muthen posted on Wednesday, March 09, 2016 - 7:44 am
It is generally not good practice to have more parameters than observations. I don't have a reference. You might want to ask about this on a general discussion forum like SEMNET.
 Shirley  posted on Thursday, March 10, 2016 - 1:42 am
Thank you Dr. Muthen for your help.

I have a related question about the number of free parameters printed in Mplus output and would appreciate your advice. Specifically, the number of free parameters is 34 in EFA with 1 factor (number of items=34;estimator=WLSMV) if item scores are specified as ordinal data. The number of free parameters increases sharply to 102 in EFA with 1 factor (number of items=34; estimator=ML) if item scores are treated as continuous variables. May I know how the number of parameters is determined in each of the two analyses?

Thanks again!
 Linda K. Muthen posted on Thursday, March 10, 2016 - 6:21 am
The difference in the number of parameters is due to the ordinal variable having more thresholds. Compare the results or TECH1 from the two outputs to see the difference in the parameters.
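As a rough check on the two counts quoted in the question, the arithmetic can be sketched in Python. This is only one reading that reproduces the reported numbers, not a statement of how Mplus counts internally: it assumes that the continuous ML run counts a loading, an intercept, and a residual variance for each item, and that the ordinal WLSMV run counts only the loadings. The function name `efa_free_parameters` is hypothetical, and TECH1 remains the place to see which parameters were actually free.

```python
def efa_free_parameters(n_items: int, n_factors: int, continuous: bool) -> int:
    """Count EFA free parameters under the assumptions stated in the lead-in."""
    # Loadings, minus the m(m-1)/2 restrictions that fix the rotation
    # (this term is zero when there is a single factor).
    loadings = n_items * n_factors - n_factors * (n_factors - 1) // 2
    if continuous:
        # Assumption: one intercept and one residual variance per item.
        return loadings + 2 * n_items
    # Assumption: with ordinal items only the loadings are counted here.
    return loadings


print(efa_free_parameters(34, 1, continuous=True))   # 102
print(efa_free_parameters(34, 1, continuous=False))  # 34
```

Under these assumptions the jump from 34 to 102 is simply 34 extra intercepts plus 34 extra residual variances.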
 Jorge Fernando Pereira Sinval posted on Thursday, May 05, 2016 - 7:41 am
Dear Professor,

I ran a CFA with dichotomous items using the WLSMV estimator. How can I see the residual variances in the Mplus diagrammer?

Mplus v. 7.11

Thanks.
 Linda K. Muthen posted on Thursday, May 05, 2016 - 10:03 am
Residual variances are not model parameters with binary items. They are computed as remainders after model estimation and are given if you ask for STANDARDIZED in the OUTPUT command.
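To illustrate what "computed as remainders" can mean, here is a minimal Python sketch. It assumes the Delta parameterization with the variance of each latent response variable y* fixed at 1 and each item loading on a single factor, so the residual variance is whatever is left after the factor's contribution. The function name and the example loading are hypothetical.

```python
def residual_variance(loading: float, factor_variance: float = 1.0) -> float:
    """Residual variance of y* as a remainder, assuming Var(y*) = 1."""
    # Variance explained by the factor is loading^2 * factor variance;
    # the remainder is the residual variance.
    return 1.0 - loading ** 2 * factor_variance


# Example with a hypothetical standardized loading of 0.8.
print(round(residual_variance(0.8), 2))  # 0.36
```

Because the value follows directly from the loading and the factor variance, there is nothing separate to estimate, which is consistent with it not being listed as a model parameter.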
 Julia Grant posted on Friday, June 09, 2017 - 9:01 am
Good morning,
I am conducting a CFA with categorical variables in one sample, and was hoping to apply factor score coefficients from this sample to a second smaller sample. I know that in the past, fscoefficients could only be used with continuous data. Is there another way to obtain the factor score coefficients for a CFA with categorical data?
Thank you very much!
Julie Grant
 Linda K. Muthen posted on Friday, June 09, 2017 - 1:06 pm
Factor score coefficients are not available for categorical items because factor scores must be computed iteratively.
 Julia Grant posted on Monday, June 12, 2017 - 11:11 am
Thank you for your speedy response! Given that factor score coefficients are not available, is there any other way to project the factorial architecture from a larger, representative sample to a smaller cohort (e.g. proc score in SAS)? Thanks again!
 Linda K. Muthen posted on Monday, June 12, 2017 - 11:58 am
I do not think that can be done given that the procedure is iterative. You would need to run the analysis on the smaller cohort fixing the values of all free parameters to those in the representative sample.
 Julia Grant posted on Monday, June 12, 2017 - 12:04 pm
OK, thanks. Our thought had also been to try fixing the values for the smaller sample to those in the larger sample. We will see if that works.
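The idea of scoring a second sample with all parameters held fixed can be sketched as follows. This is an illustration of the general principle, not of the Mplus implementation: it assumes a single factor with a standard normal prior, binary items, and loadings and thresholds already expressed on a conditional probit scale with residual variance 1. The function names and the parameter values in the example are hypothetical. The score is the posterior mode, found by an iterative search, which is why no fixed set of linear coefficients exists.

```python
import math


def normal_cdf(z: float) -> float:
    """Standard normal CDF, written with erfc for accuracy in the tails."""
    return 0.5 * math.erfc(-z / math.sqrt(2.0))


def log_posterior(eta, responses, loadings, thresholds):
    """Log posterior of the factor value eta, up to an additive constant."""
    total = -0.5 * eta * eta  # standard normal prior on the factor
    for u, lam, tau in zip(responses, loadings, thresholds):
        z = lam * eta - tau
        # P(u = 1 | eta) = Phi(z) and P(u = 0 | eta) = Phi(-z).
        p = normal_cdf(z if u == 1 else -z)
        total += math.log(max(p, 1e-300))  # guard against log(0)
    return total


def map_factor_score(responses, loadings, thresholds,
                     lo=-6.0, hi=6.0, tol=1e-8):
    """Posterior-mode factor score with all item parameters held fixed."""
    # The log posterior is concave in eta, so a ternary search converges.
    while hi - lo > tol:
        m1 = lo + (hi - lo) / 3.0
        m2 = hi - (hi - lo) / 3.0
        if (log_posterior(m1, responses, loadings, thresholds)
                < log_posterior(m2, responses, loadings, thresholds)):
            lo = m1
        else:
            hi = m2
    return 0.5 * (lo + hi)


# Hypothetical parameter values standing in for the large-sample estimates.
fixed_loadings = [0.9, 0.7, 1.1]
fixed_thresholds = [0.0, 0.0, 0.0]
print(map_factor_score([1, 1, 1], fixed_loadings, fixed_thresholds))
print(map_factor_score([0, 0, 0], fixed_loadings, fixed_thresholds))
```

Each response pattern in the smaller sample gets its own search, but the loadings and thresholds never change, which mirrors fixing all free parameters at the values from the representative sample.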
 shonnslc posted on Thursday, August 29, 2019 - 9:01 am
Hi,

I am doing CFA. I am not sure why Mplus automatically used MLR instead of WLSMV when my indicators are a mixture of categorical and nominal items even though I set my estimator to be WLSMV? I thought I could use WLSMV in this scenario. Thanks!
 Bengt O. Muthen posted on Saturday, August 31, 2019 - 5:12 pm
WLSMV cannot handle variables declared as nominal.
 Amanda Lemmon posted on Thursday, October 08, 2020 - 10:53 pm
Hi -

I am running a CFA model with MLR as an estimator and items declared as categorical. I got two Chi Square values -- Pearson and LR -- in the output. I was wondering if there is a reference where I can learn about the difference between them?

Thank you!
 Bengt O. Muthen posted on Saturday, October 10, 2020 - 10:35 am
Agresti has a good book on categorical data analysis. But it's easier to just google it.
 Amanda Lemmon posted on Monday, October 12, 2020 - 8:48 am
Thank you!
 Amanda Lemmon posted on Thursday, October 29, 2020 - 7:19 pm
I wanted to follow up on my previous question about Chi Squares in CFA with categorical (ordinal) variables and the MLR estimator. I read Agresti's book, but they don't describe Pearson and LR Chi Square in the context of factor analysis. 1. Is my understanding correct that H0 in factor analysis with ordinal variables and MLR is: No difference between the observed and model implied crosstabs of the variables?

I also read the "Estimator choices with categorical outcomes" paper (Muthen et al., 2015) and wanted to ask a few clarifying questions. The paper says, "Weighted least squares gives a X2 test of model fit to the sample LRV correlations. Although this does not test fit against the data, this is nevertheless a useful way to study the factor structure. Maximum likelihood does not offer such a test." (p. 5).
2. For WLS, why doesn't it test fit against the data? What does it do? And if it doesn't evaluate fit, then in what other way can it be useful?
3. For ML, so Pearson and LR Chi Square statistics are not indicators of model fit? I guess I am confused because what are they then?

Also, if I understand correctly, factor analysis with ML and categorical variables (IRT) fixes residual variances to zero. 4. Does it mean that such an analysis assumes perfect measurement of the variables? I know zero residual variances have been viewed as a con in PCA -- is it a similar con in IRT?
 Bengt O. Muthen posted on Saturday, October 31, 2020 - 1:43 pm
Q1: That's right.

Q2: It tests against an unrestricted correlation matrix for the Y* latent response variables. There is a paper on this topic on our website at

http://www.statmodel.com/bmuthen/full_paper_list.htm:

45) Muthén, B. (1993). Goodness of fit with categorical and other non-normal variables. In K. A. Bollen, & J. S. Long (Eds.), Testing Structural Equation Models (pp. 205-243). Newbury Park, CA: Sage.

Q3: They are indicators of model fit. It's just that they can't be used with many variables due to zero cell counts.

Q4: It means that the model is mis-specified.
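For readers comparing the two statistics discussed under Q1 and Q3, the textbook definitions can be sketched in Python. Both compare observed response-pattern frequencies with the frequencies expected under the model: Pearson sums (o - e)^2 / e over patterns, and the likelihood-ratio statistic sums 2 * o * ln(o / e). The function name and the frequencies below are made up for illustration, and the treatment of empty cells shown here (an empty cell contributes nothing to the LR sum) is the usual convention, not necessarily what Mplus does.

```python
import math


def pearson_and_lr(observed, expected):
    """Return (Pearson X2, likelihood-ratio G2) for pattern frequencies."""
    pearson = 0.0
    lr = 0.0
    for o, e in zip(observed, expected):
        pearson += (o - e) ** 2 / e
        if o > 0:  # convention: 0 * ln(0) is taken as 0
            lr += 2.0 * o * math.log(o / e)
    return pearson, lr


# Made-up frequencies for the four response patterns of two binary items.
observed = [30, 20, 25, 25]
expected = [25.0, 25.0, 25.0, 25.0]
x2, g2 = pearson_and_lr(observed, expected)
print(x2, g2)

# The number of possible response patterns grows as 2**p for p binary items.
print(2 ** 20)  # 1048576
```

With 20 binary items there are already more than a million possible patterns, so in ordinary samples most cells are empty. That is the zero-cell-count problem mentioned under Q3.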