

Two-Level Exploratory Factor Analysis



I have just been reading Bengt and Tihomir's (2011) article in Joop Hox's Handbook. I am working with data that contain evaluations from about six members (i), nested within 60 teams (j), on four occasions (t) over 21 weeks. On each occasion they responded to 12 items designed to reflect 3 team behavior constructs (four items for each construct). The items were all phrased with group-level identifiers, for example, "our team ..." or "We ..." Would you recommend doing two-level EFA on all the data or on the data for each wave? My sense is to do two steps: (1) EFA on all the data to establish the factor structure at the group level, and (2) temporal invariance analyses on whatever factors emerge. In addition, how should the EFA factors be interpreted if individuals disagree in their evaluations of the group-level construct? For example, some subgroup of a team might report that the team is "discussing all issues openly and fairly," while at the same time, other individuals report the opposite, presumably because they are feeling railroaded by the first subgroup.


I would (1) do a 2-level EFA for each time point, and then (2) study invariance over the 4 occasions. For your last question, I guess that would be reflected by a large within-level variance. Or, perhaps you want to add variation as an observed measure.


I have done (1) the EFAs for each time point, and I have found the hypothesized factor structure at the individual level, and as you probably suspected, "a mess" at the group level. So, there is probably no point in doing (2) 2-level invariance for the occasions. However, I have previously done CFAs for each occasion and invariance tests over the four occasions, at the individual level, and found them to be satisfactory. So that suggests that I am measuring something reliable down there, at least. Thus, as you suggested, I could proceed by adding "variation as an observed measure." How would you recommend doing this? My sense is that for each variable, I would calculate the standard deviation (Harrison & Klein, 2007, describe s.d. as an index of 'separation') between individuals' summary scores (mean of four items) for each group on each occasion. I would then include these indices, with the original variable, as observed covariates in the subsequent regression analyses. As a theoretical aside, is this an example of what Ludtke et al. (2008) were referring to as formative constructs at the group level?
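The standard deviation index described in the post above could be computed along the following lines. This is only a minimal sketch: the record layout, the toy values, and the name `separation_index` are hypothetical, and it uses the sample (n - 1) standard deviation, which is an assumption; `statistics.pstdev` could be swapped in if the population form is preferred.

```python
from collections import defaultdict
from statistics import mean, stdev

# Hypothetical long-format records: (team, occasion, member, four item scores).
records = [
    ("A", 1, 1, [4, 4, 5, 3]),
    ("A", 1, 2, [2, 2, 1, 3]),
    ("A", 1, 3, [3, 3, 3, 3]),
    ("B", 1, 1, [5, 5, 5, 5]),
    ("B", 1, 2, [5, 5, 5, 5]),
    ("B", 1, 3, [5, 5, 5, 5]),
]

def separation_index(records):
    """Return {(team, occasion): (team_mean, team_sd)} of members' summary scores."""
    scores = defaultdict(list)
    for team, occasion, _member, items in records:
        # Summary score = mean of the member's four items for one construct.
        scores[(team, occasion)].append(mean(items))
    result = {}
    for key, values in scores.items():
        # The sample SD is undefined for a single respondent, so return NaN there.
        sd = stdev(values) if len(values) > 1 else float("nan")
        result[key] = (mean(values), sd)
    return result

index = separation_index(records)
for (team, occasion), (team_mean, team_sd) in sorted(index.items()):
    print(team, occasion, round(team_mean, 2), round(team_sd, 2))
```

In this toy data, team A's members disagree while team B's members agree completely, so only team A gets a nonzero index. The resulting team-by-occasion values would then be merged back into the analysis file as observed group-level variables.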


Your variation approach seems reasonable, although I can't say if the new measure is an IV or a DV. I am not sure that Ludtke referred to that.

Alan Johnson posted on Wednesday, December 21, 2011  7:55 pm



You mean the standard deviation index based on X, sd_X, might be a consequence of X and a correlate of Y, as opposed to a covariate of X on Y. I will think about that; supporting group dynamics theory is important in either case. I have thought about the former scenario, but not the latter ... I have another thought about analyzing these data in the two-level framework from Mplus, one that would avoid aggregating anything, IVs or DVs. Here is how I envisage it working ... Individuals (members) 1, ..., 360 are the respondents for the DVs, and individuals (professors) 1, ..., 9 are the respondents for the DVs. However, the members are all nested within one and only one cluster (group), so they can effectively be renumbered 1, ..., 6 in each team. The professors are cross-classified with two or more clusters (groups) on each occasion, so they maintain their 1, ..., 9 level-1 identifier. However, to distinguish the professors from the members, I would need to number them 7, ..., 15. I have tried presenting the data to Mplus in this format and, amazingly to me, it seems to work! Could this analysis strategy be legitimate? In this scenario, would I still include the deviation index, and how would I interpret it?
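The renumbering described in the post above could be prepared roughly as follows before the data are handed to Mplus. This is a sketch of the bookkeeping only, not a statement about whether the resulting model is appropriate; the function names, the raw ID values, and the offset of 6 (the largest team size mentioned in the thread) are assumptions made for illustration.

```python
def within_team_ids(rows):
    """Map (team, raw_member_id) -> 1..k within each team, by first appearance."""
    ids = {}
    counters = {}
    for team, member in rows:
        if (team, member) not in ids:
            counters[team] = counters.get(team, 0) + 1
            ids[(team, member)] = counters[team]
    return ids

def shifted_professor_ids(professors, offset):
    """Map raw professor ids to offset+1, offset+2, ... in sorted order."""
    unique = sorted(set(professors))
    return {p: offset + i for i, p in enumerate(unique, start=1)}

# Hypothetical raw data: members carry study-wide ids, professors are 1..9.
member_rows = [("A", 101), ("A", 102), ("B", 201), ("A", 101), ("B", 202)]
member_ids = within_team_ids(member_rows)
prof_ids = shifted_professor_ids(range(1, 10), offset=6)

print(member_ids)
print(prof_ids)
```

Members restart at 1 within every team, while each professor keeps a single shifted identifier across all the teams he or she rates.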


I would suggest that you use a bivariate model where the members and the professors are modeled separately rather than with the same distribution. Another approach to dealing with differing variance on the within level is to have a 2-class model where some clusters have large and some small within-level variance.


Thank you, I was stumbling towards a similar idea, I think, where I might use the sd_X quantity as the basis for a latent class analysis. I'll try this when I get back from a couple of days away, and I'll start another thread with how I got on ... Thank you both for your input and "joyeuse fête" 


