G factor in EFA w/ ordinal data
Mplus Discussion > Exploratory Factor Analysis >
 EWickwire posted on Saturday, February 18, 2006 - 8:09 pm
Dear Dr. Muthen,

I'm attempting an exploratory factor analysis with 50 items. There are 538 observations, randomly selected from a total sample of 1076. (After the EFA I will perform CFA/SEM with the other 538, predicting a behavioral variable.)
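A random half split like the one described above could be drawn along these lines. This is only a sketch: the seed and the use of row indices 0 to 1075 are illustrative assumptions, not details from the post.

```python
import random

N_TOTAL = 1076     # total sample size stated in the post
SEED = 20060218    # hypothetical seed, used only to make the split reproducible


def split_half(n_total, seed):
    """Return two disjoint, sorted lists of row indices of equal size."""
    rng = random.Random(seed)
    ids = list(range(n_total))
    rng.shuffle(ids)
    half = n_total // 2
    return sorted(ids[:half]), sorted(ids[half:])


# First half for the EFA, second half held out for the CFA/SEM.
efa_rows, cfa_rows = split_half(N_TOTAL, SEED)
print(len(efa_rows), len(cfa_rows))
```

Fixing the seed means the same 538 rows can be recovered later, so the hold-out half is never touched during exploration.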

Items are scored on a 5-point scale. The midpoint is "neither" between opposing endpoints (e.g., "very good" to "very bad"). As expected, "C" (the middle choice) was the modal score, but variance was acceptable. No item had more than ~80% "C" responses.

In the initial EFA I end up with a huge first factor and a scree plot that is difficult to interpret, with the 3-, 4-, 5-, 6- and 7-factor solutions very close. I ran parallel analysis, which indicated retaining 5 factors, so that is what I have done.
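For readers unfamiliar with parallel analysis, here is a minimal sketch of the eigenvalue-based (Horn-style) version using NumPy. It is not the routine used in the post: the function name, the 95th-percentile rule, the number of simulations, and the toy one-factor data are all assumptions made for illustration.

```python
import numpy as np


def parallel_analysis(data, n_sims=100, percentile=95, seed=0):
    """Count the leading eigenvalues of the observed correlation matrix that
    exceed the chosen percentile of eigenvalues from random normal data of
    the same shape."""
    rng = np.random.default_rng(seed)
    n, p = data.shape
    observed = np.sort(np.linalg.eigvalsh(np.corrcoef(data, rowvar=False)))[::-1]

    simulated = np.empty((n_sims, p))
    for i in range(n_sims):
        noise = rng.standard_normal((n, p))
        eig = np.linalg.eigvalsh(np.corrcoef(noise, rowvar=False))
        simulated[i] = np.sort(eig)[::-1]
    threshold = np.percentile(simulated, percentile, axis=0)

    # Stop at the first eigenvalue that does not beat its random counterpart.
    n_retain = 0
    for obs, thr in zip(observed, threshold):
        if obs <= thr:
            break
        n_retain += 1
    return n_retain, observed, threshold


# Toy data (an assumption, not the poster's items): 6 items driven by one factor.
rng = np.random.default_rng(1)
factor = rng.standard_normal((500, 1))
toy = 0.7 * factor + np.sqrt(1 - 0.7 ** 2) * rng.standard_normal((500, 6))

n_retain, observed, threshold = parallel_analysis(toy)
print(n_retain)
```

With real items, `toy` would be replaced by the 538 x 50 response matrix; the comparison logic stays the same.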

There are indications of a general factor (i.e., many items load on the first factor). Inter-factor correlations are only .3-.7. I cheated and ran a CFA, and model fit is horrible with a one-factor model.

I'm using ML extraction, and switching away from SPSS helped tremendously with model fit. I'm not married to ML; I'm using principal axis as well and observing differences. I have also run the EFAs with 10-15 randomly generated items included, and the loadings on the first factor increase slightly. When I play with it, I do get 5 interpretable factors, or 4 and one messier factor (still the first and largest).

It's been suggested that I use a polychoric correlation matrix in the EFA. Items are definitely categorical and nonnormal, although no skew/kurtosis exceeded 2/7. Multivariate normality was also not satisfied (11% of cases fell outside Mardia's statistic, but they are included in the sample).
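The univariate skew/kurtosis screen mentioned above can be reproduced per item in a few lines. A minimal sketch, assuming the 2/7 cutoffs mean |skewness| < 2 and excess kurtosis (normal = 0) < 7, which the post does not spell out; the response counts below are made up to mimic an item with 80% midpoint responses.

```python
def skew_and_excess_kurtosis(values):
    """Moment-based skewness and excess kurtosis (population formulas)."""
    n = len(values)
    mean = sum(values) / n
    m2 = sum((v - mean) ** 2 for v in values) / n
    m3 = sum((v - mean) ** 3 for v in values) / n
    m4 = sum((v - mean) ** 4 for v in values) / n
    return m3 / m2 ** 1.5, m4 / m2 ** 2 - 3.0


# Hypothetical item: 100 responses on a 1-5 scale, 80% at the midpoint.
counts = {1: 2, 2: 8, 3: 80, 4: 8, 5: 2}
responses = [score for score, k in counts.items() for _ in range(k)]

skew, kurt = skew_and_excess_kurtosis(responses)
within_limits = abs(skew) < 2 and kurt < 7   # assumed reading of the 2/7 rule
print(round(skew, 4), round(kurt, 4), within_limits)
```

Even a symmetric item piles up kurtosis when most responses sit at the midpoint, which is why the kurtosis limit is usually the one to watch with this kind of scale.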

I also know that I should eliminate items but do not know details of that process. (Is it iterative, where I delete all items that don't load .3 or perhaps (?) load on 2+ factors? Then do I rerun the EFA, and adjust/rerun again if necessary?)

I've been stuck with this for a while and have heard of some of the tools, but I'm not sure how to proceed. What would you recommend for

1. Next step? eliminate items? Using which criteria? Then rerun?

2. How much would polychoric correlations improve the EFA? Should I use Mplus to factor analyze polychoric correlations?

3. Are ML/Promax correct?

I appreciate your time very much--I've been stuck for too long!

Many Thanks,

 bmuthen posted on Monday, February 20, 2006 - 7:47 am
With a dominant factor, it sounds like you want to try a general-factor-specific-factor CFA approach. In addition to a factor influencing all items, uncorrelated residual (specific) factors allow further correlations among sets of items.
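To make the general-plus-specific structure concrete, the sketch below computes the model-implied covariance matrix when one general factor loads on every item and uncorrelated specific factors load on subsets of items. All loadings are hypothetical, and every factor is assumed to have unit variance.

```python
def implied_covariance(general, specific, residual_var):
    """Sigma = g g' + S S' + diag(theta), with all factors mutually
    uncorrelated and scaled to unit variance."""
    p = len(general)
    sigma = [[0.0] * p for _ in range(p)]
    for i in range(p):
        for j in range(p):
            value = general[i] * general[j]
            value += sum(si * sj for si, sj in zip(specific[i], specific[j]))
            if i == j:
                value += residual_var[i]
            sigma[i][j] = value
    return sigma


# Hypothetical loadings for 4 items: items 0-1 share specific factor A,
# items 2-3 share specific factor B, and all four load on the general factor.
general = [0.6, 0.6, 0.6, 0.6]
specific = [[0.5, 0.0], [0.5, 0.0], [0.0, 0.5], [0.0, 0.5]]
residual_var = [1 - g ** 2 - sum(s ** 2 for s in row)
                for g, row in zip(general, specific)]

sigma = implied_covariance(general, specific, residual_var)
print(round(sigma[0][1], 2), round(sigma[0][2], 2))
```

Items that share a specific factor correlate more strongly than items linked only through the general factor, which is the extra within-set correlation the specific factors are meant to absorb.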

It doesn't sound as if you have to switch from continuous-variable ML modeling to categorical-variable modeling given your relatively symmetric distributions.
 EWickwire posted on Monday, February 20, 2006 - 8:28 am
Many thanks for a prompt reply.

A couple quick follow-ups:

1. Are you suggesting that I should drop the EFA portion of my analytic plan? Or how would I report?

2. With your CFA suggestion, do you mean I should have one factor influencing all indicators directly (i.e., "on the left" in a CFA diagram) and then uncorrelated specific factors "on the right" (in that same diagram)? I believe this is a nested hierarchical design?

3. A question on interpretability:
Based on that nested hierarchical design, from which factors would I predict my DV in the SEM? Would you expect any predictive value from the g factor OR from the specific factors, or both (since the specific factors are now residuals)?

Thank you,

 bmuthen posted on Monday, February 20, 2006 - 9:20 am
1. I would do the EFA, but noting that Varimax and Promax may not be the best rotation schemes with a dominant factor (this is well-known in the literature), and then add the CFA, where the item sets for the specific factors would be suggested from substantive theory or the EFA.

2. Yes, but I don't see that as nested. To me, nested would be first-order factors nested within second-order factors, which is a different model.

3. Both. And the decomposition of the contributions from the general and specific factors would be clearer due to them being orthogonal. For applications, see articles by J.E. Gustafsson in Intelligence.
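The point about orthogonality in item 3 can be illustrated numerically: when the general and specific factors are uncorrelated and have unit variance, each factor's share of the DV variance is simply its squared regression coefficient. The coefficients and residual variance below are hypothetical.

```python
def explained_variance_shares(coefs, residual_variance):
    """Share of DV variance attributable to each factor, assuming the factors
    are mutually uncorrelated with unit variance."""
    contributions = {name: b ** 2 for name, b in coefs.items()}
    total = sum(contributions.values()) + residual_variance
    return {name: c / total for name, c in contributions.items()}


# Hypothetical structural coefficients for the DV regressed on g and two
# specific factors; the residual variance makes the total DV variance 1.0.
coefs = {"g": 0.5, "specific_1": 0.3, "specific_2": 0.1}
shares = explained_variance_shares(coefs, residual_variance=0.65)
print({name: round(share, 2) for name, share in shares.items()})
```

With correlated predictors the shares would not add up this cleanly, because the covariance terms would have to be split among them in some arbitrary way.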
 Ewickwire posted on Monday, February 20, 2006 - 9:37 am
re: EFA

1. Which rotation would you try, or are you saying to stick with promax but note that it can be problematic with a dominant factor? (Do you have a reference for this? I've searched but must be using the wrong terms.)

2. If I understand correctly, I should:

a. Perform EFA with ML extraction (any reason to use or not use Principal Axis, or just see which produces more interpretable results?)

b. Eliminate items (load less than .3, or low communality <.20, or load on multiple factors)

c. Rerun EFA (Do I then remove items based on same criteria and rerun again?)

d. Report results of EFA (perhaps after factor analyzing specific factors, as in searching for a dominant factor?)

e. Use results as basis for CFA

f. Predict DV from both g and specific (residual) factors in CFA/SEM
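The screening rules in step b could be applied in one pass along these lines. This is a sketch under assumptions: the cutoffs are the ones quoted in the post, the item names, pattern loadings and factor correlation are invented, and communality is computed as lambda' Phi lambda so that it remains valid after an oblique (e.g., Promax) rotation.

```python
def communality(loadings, phi):
    """h2 = lambda' Phi lambda for one item's row of pattern loadings."""
    m = len(loadings)
    return sum(loadings[j] * phi[j][k] * loadings[k]
               for j in range(m) for k in range(m))


def screen_items(pattern, phi, min_loading=0.30, min_communality=0.20):
    """Return {item: [reasons]} for items that meet any removal criterion."""
    flagged = {}
    for item, row in pattern.items():
        n_salient = sum(abs(x) >= min_loading for x in row)
        reasons = []
        if n_salient == 0:
            reasons.append("no salient loading")
        if n_salient > 1:
            reasons.append("cross-loading")
        if communality(row, phi) < min_communality:
            reasons.append("low communality")
        if reasons:
            flagged[item] = reasons
    return flagged


# Hypothetical two-factor pattern matrix and factor correlation matrix.
phi = [[1.0, 0.4], [0.4, 1.0]]
pattern = {
    "item1": [0.70, 0.05],
    "item2": [0.45, 0.40],
    "item3": [0.20, 0.15],
    "item4": [0.02, 0.65],
}
flagged = screen_items(pattern, phi)
print(flagged)
```

The function only flags candidates; whether a flagged item is actually dropped remains a substantive decision.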

Dr. Muthen, you're a lifesaver. Again, thank you.

 bmuthen posted on Monday, February 20, 2006 - 6:39 pm
1. Mplus only offers Promax and Varimax, so stick with Promax. For the rotation issue, check Multivariate Behavioral Research for a recent EFA article by Cudeck and Browne (?) as well as classic EFA books such as Harman and also Gorsuch and Mulaik.

a. Yes ML
b. Perhaps; in our courses we recommend doing this in an EFA within a CFA, but if you haven't done that before, you may not want to do this now.

c - f Right
 EWickwire posted on Monday, February 20, 2006 - 8:24 pm
Would you ever consider it justified to switch from EFA to exploratory CFA because the data suggested a g factor?

In my split sample EFA-CFA design, would you still do the exploratory CFA in the first sample, then confirm it in the second sample?

What I know about exploratory in CFA context is essentially looking at modification indices and tweaking to improve model fit. If this is what you mean, is there an article that explains or provides a good example for how to write up and report?

Much appreciated.
 bmuthen posted on Tuesday, February 21, 2006 - 3:13 pm
If theory and data suggested it, yes.

Yes; surprises come up.

No article that I know of except Joreskog (1969) in Psychometrika. But that is not applied, nor is it about writing things up.
 EWickwire posted on Tuesday, February 21, 2006 - 3:19 pm
Is adjusting CFA based on modification indices what you mean by "EFA within a CFA?"

Many thanks. You have been incredibly helpful.
 Linda K. Muthen posted on Tuesday, February 21, 2006 - 4:02 pm
No, EFA in a CFA framework is a specific model. It is described in the Day 1 handout from our short courses which can be ordered on the web.
 Emerson Wickwire posted on Sunday, March 12, 2006 - 6:09 pm
Dear Drs. Muthen,

You recently provided some very helpful info for me as I complete my dissertation involving a split-sample EFA/CFA-SEM.

Based on your suggestion, I ordered your day 1 handout, which has also been very helpful.

However, I am still unsure exactly how to conduct an EFA in a CFA context.

Let's say I run an ML EFA. The loading matrix contains all cross loads, etc.

How would I proceed from there?

Sorry for the basic nature of the question but I am, well... a beginner.

Many thanks,

E Wickwire

PS: Also, I recently read somewhere that the split-sample technique is outdated, and that it is "better" to perform one analysis on the entire sample (N=1076). What is your opinion of this?
 Bengt O. Muthen posted on Sunday, March 12, 2006 - 7:29 pm
Although we would like to give a little help to beginners, I'm afraid this forum should not take the role of an advisor. So we can't give much more help than is in the handout describing the approach. Are there any particular parts of pages 126-129 that are unclear to you and your advisor? I don't think splitting samples into exploration and validation can be considered old-fashioned; if you have a large sample such as yours, I would split it.
 Emerson Wickwire posted on Sunday, March 12, 2006 - 9:07 pm
On page 126 it states that the purpose of EFA in CFA is to discover the significance of factor loadings (i.e., from the EFA). I'm assuming that this is done by allowing all items to load on all factors.

Once I know statistical significance of individual loadings, how should I proceed?

-should I remove items with multiple sig loadings? One at a time or all at once? And if one at a time, by which criteria?

-same for items with no sig loadings. Should I remove one at a time or all at once?

It just seems like with 50 variables and 5 factors, that's 250 loadings and would be a lot of iterative rerunning. Perhaps that is required; I'm just looking to verify.

Also, wouldn't the factor structure change as items are dropped from the model? Would you reassess the correct # of factors, etc, after removing items?

Thank you for your help.
 Bengt O. Muthen posted on Monday, March 13, 2006 - 9:30 am
That's a big topic. We take a couple of hours to discuss it in our annual November course in Alexandria - I would recommend it to you.

Briefly, I would remove items for which such cross-loadings were not intended or do not make sense. I would keep it simple, so all at once in both cases. And, you don't want a factor structure that is sensitive to dropping some items.
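An "all at once" pass of this kind could be sketched as follows. It assumes that loading estimates and standard errors are available from the EFA-within-CFA run and that significance is judged by a two-sided 5% z test; the cutoff, item names, numbers and the `intended` mapping are all hypothetical and not taken from the handout.

```python
Z_CRIT = 1.96  # two-sided 5% normal critical value (an assumed cutoff)


def significant_factors(estimates, std_errors, z_crit=Z_CRIT):
    """Indices of factors whose loading has |estimate / SE| above z_crit."""
    return [k for k, (est, se) in enumerate(zip(estimates, std_errors))
            if abs(est / se) > z_crit]


def flag_items(results, intended):
    """results: item -> (estimates, std_errors).
    intended: item -> set of factor indices the item is meant to load on."""
    to_review = {}
    for item, (est, se) in results.items():
        sig = set(significant_factors(est, se))
        if not sig:
            to_review[item] = "no significant loading"
        elif sig - intended[item]:
            to_review[item] = "unintended significant cross-loading"
    return to_review


# Hypothetical estimates and standard errors for three items on two factors.
results = {
    "q1": ([0.70, 0.04], [0.05, 0.05]),
    "q2": ([0.40, 0.35], [0.06, 0.06]),
    "q3": ([0.05, 0.06], [0.05, 0.05]),
}
intended = {"q1": {0}, "q2": {0}, "q3": {1}}

to_review = flag_items(results, intended)
print(to_review)
```

All flagged items come out of a single run, so they can be judged together on substantive grounds rather than through repeated one-item-at-a-time reruns.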
 Emerson Wickwire posted on Monday, March 13, 2006 - 10:56 am
Many thanks, and I will keep you posted on my progress.