Two-Level Exploratory Factor Analysis
 Alan Johnson posted on Monday, December 19, 2011 - 3:29 am
I have just been reading Bengt and Tihomir's (2011) article in Joop Hox's Handbook.

I am working with data that contain evaluations from about six members (i), nested within 60 teams (j), on four occasions (t) during 21 weeks.

On each occasion they responded to 12 items designed to reflect 3 team behavior constructs (four items for each construct). The items were all phrased with group-level identifiers, for example, "our team ..." or "We ..."

Would you recommend doing two-level EFA on all the data or on the data for each wave? My sense is to do two steps: (1) EFA on all the data to establish the factor structure at the group level, and (2) temporal invariance analyses on whatever factors emerge.

In addition, how should the EFA factors be interpreted if individuals disagree in their evaluations of the group-level construct? For example, some sub-group of a team might report that the team is "discussing all issues openly and fairly," while at the same time, other individuals report the opposite, presumably because they are feeling railroaded by the first sub-group.
 Bengt O. Muthen posted on Monday, December 19, 2011 - 8:45 pm
I would do (1) a 2-level EFA for each time point, and then (2) study invariance over the 4 occasions.

For your last question I guess that would be reflected by a large within-level variance. Or, perhaps you want to add variation as an observed measure.
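
As a rough descriptive check on how large the within-level variance is, here is a minimal sketch in Python (standard library only). The function name `describe_variance` and the dictionary layout are assumptions made for illustration. It compares the pooled within-team variance of one item (or scale score) at one occasion with the variance of the observed team means. These are plain descriptive quantities, not the model-based within and between variances that a two-level analysis estimates.

```python
from statistics import mean, variance


def describe_variance(scores_by_team):
    """Descriptive within/between split for one variable at one occasion.

    scores_by_team maps a team id to the list of its members' scores
    (a hypothetical layout chosen for this sketch). Requires at least
    two teams and more respondents than teams.
    """
    team_means = {team: mean(vals) for team, vals in scores_by_team.items()}
    n_total = sum(len(vals) for vals in scores_by_team.values())
    n_teams = len(scores_by_team)

    # Pooled within-team variance: squared deviations of each member
    # from his or her own team mean, divided by N - J.
    ss_within = sum(
        (x - team_means[team]) ** 2
        for team, vals in scores_by_team.items()
        for x in vals
    )
    pooled_within = ss_within / (n_total - n_teams)

    # Sample variance of the observed team means.
    between_means = variance(list(team_means.values()))
    return pooled_within, between_means


# Made-up scores: members of team "A" disagree, members of team "B" agree.
scores_by_team = {"A": [1, 3], "B": [4, 4]}
pooled_within, between_means = describe_variance(scores_by_team)
```

A pooled within-team variance that is large relative to the variance of the team means would be consistent with members disagreeing about the group-level construct.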
 Alan Johnson posted on Tuesday, December 20, 2011 - 11:51 am
I have done (1) the EFAs for each time point, and I have found the hypothesized factor structure at the individual level, and as you probably suspected, "a mess" at the group level. So, there is probably no point in doing (2) 2-level invariance for the occasions.

However, I have previously done CFAs for each occasion and invariance tests over the four occasions, at the individual level, and found them to be satisfactory. So that suggests that I am measuring something reliable down there, at least. Thus, as you suggested, I could proceed by adding "variation as an observed measure."

How would you recommend doing this? My sense is that for each variable, I would calculate the standard deviation (Harrison & Klein, 2007, describe s.d. as an index of 'separation') between individuals' summary scores (mean of four items) for each group on each occasion. I would then include these indices, with the original variable, as observed covariates in the subsequent regression analyses.
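
A minimal sketch of that calculation in Python (standard library only). The row layout, the function name `separation_index`, and the `sample` flag are assumptions for illustration, not part of the data set described here. It averages each member's four items into a summary score and then takes the standard deviation of those scores within each team on each occasion. The default divides by n; whether n or n - 1 is the better divisor is left open, so `sample=True` switches to n - 1.

```python
from collections import defaultdict
from statistics import mean, pstdev, stdev


def separation_index(rows, sample=False):
    """Return {(team, occasion): SD of members' summary scores}.

    rows is an iterable of (team, occasion, member, item_scores) tuples,
    a hypothetical layout chosen for this sketch.
    """
    summary = defaultdict(list)
    for team, occasion, _member, item_scores in rows:
        # Summary score = mean of the member's items for one construct.
        summary[(team, occasion)].append(mean(item_scores))

    sd = stdev if sample else pstdev
    index = {}
    for key, scores in summary.items():
        # An SD needs at least two respondents; otherwise record None.
        index[key] = sd(scores) if len(scores) > 1 else None
    return index


# Made-up responses: two teams, one occasion, two members each.
rows = [
    (1, 1, "a", [1, 2, 3, 4]),
    (1, 1, "b", [3, 4, 5, 6]),
    (2, 1, "a", [3, 3, 3, 3]),
    (2, 1, "b", [3, 3, 3, 3]),
]
sd_x = separation_index(rows)
```

The resulting values, one per team and occasion, could then be merged back into the team-level data alongside the team mean of the same variable.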

As a theoretical aside, is this an example of what Ludtke et al. (2008) were referring to as formative constructs, at the group-level?
 Bengt O. Muthen posted on Wednesday, December 21, 2011 - 11:34 am
Your variation approach seems reasonable, although I can't say if the new measure is an IV or a DV.

I am not sure that Ludtke referred to that.
 Alan Johnson posted on Wednesday, December 21, 2011 - 1:55 pm
You mean the standard deviation index based on X, sd_X, might be a consequence of X and a correlate of Y, as opposed to a covariate of X on Y? I will think about that; supporting group dynamics theory is important in either case. I have thought about the former scenario, but not the latter ...

I have another thought about analyzing these data in the two-level framework from Mplus, one that would avoid aggregating anything, IVs or DVs. Here is how I envisage it working ...

Individuals (members) 1, ..., 360 are the respondents for the DVs, and individuals (professors) 1, ..., 9 are the respondents for the DVs. However, the members are all nested within one and only one cluster (group), so they can effectively be renumbered 1, ..., 6 in each team. The professors are cross-classified with two or more clusters (groups) on each occasion, so they maintain their 1, ..., 9 level-1 identifier. However, to distinguish the professors from the members, I would need to number them 7, ..., 15. I have tried presenting the data to Mplus in this format and, amazingly to me, it seems to work!

Could this analysis strategy be legitimate? In this scenario, would I still include the deviation index, and how would I interpret it?
 Tihomir Asparouhov posted on Thursday, December 22, 2011 - 12:59 pm
I would suggest that you use a bivariate model where the members and the professors are modeled separately rather than with the same distribution.

Another approach to deal with different variance on the within level is to have a 2-class model where some clusters have large and some small within level variance.
 Alan Johnson posted on Sunday, December 25, 2011 - 9:32 am
Thank you, I was stumbling towards a similar idea, I think, where I might use the sd_X quantity as the basis for a latent class analysis. I'll try this when I get back from a couple of days away, and I'll start another thread with how I got on ...
Thank you both for your input and "joyeuse fête"
 Katrien Vangrieken posted on Thursday, January 21, 2016 - 9:31 am
Dear Dr. Muthen,

I am doing two-level EFAs on two scales (ca. 1300 individuals in 119 groups). I got several warnings for each factor solution, among them that the number of parameters exceeded the number of clusters. It was suggested to use the MLF estimator. Now I get the following message for each factor solution: STANDARD ERRORS COULD NOT BE COMPUTED.


Do you have suggestions on what to do?
Thank you!
 Linda K. Muthen posted on Thursday, January 21, 2016 - 1:09 pm
Please send the output and your license number to