I don't know of an article offhand. "Multilevel IRT" would be a good Google search term. In Mplus this is handled along the lines of User's Guide example 9.7 (simplified to have no covariates), where you can request estimated factor scores for the group-level factor "fb". The differing numbers of items responded to are handled as missing data.
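To illustrate the missing-data point, here is a minimal Python sketch of one way to lay out the data before handing it to Mplus: one row per person, one column per item, with a flag value wherever an item was not answered. The variable names (`u1`-`u4`), the numeric IDs, and the flag `-999` are assumptions made for illustration only; whatever flag is used would need to match the missing-value code declared in the Mplus input.

```python
# Hypothetical long-format responses: (group_id, person_id, item, score).
# Each person answers a different subset of the items.
MISSING = -999  # assumed missing-data flag; must match the Mplus input

responses = [
    (1, 1, "u1", 1), (1, 1, "u2", 0), (1, 1, "u3", 1),
    (1, 2, "u1", 0), (1, 2, "u4", 1),
    (2, 3, "u2", 1), (2, 3, "u3", 1), (2, 3, "u4", 0),
]
items = ["u1", "u2", "u3", "u4"]


def to_wide(rows, items, missing=MISSING):
    """Return one record per person: [group_id, person_id, score per item].

    Items a person did not respond to are filled with the missing flag.
    """
    wide = {}
    for group_id, person_id, item, score in rows:
        wide.setdefault((group_id, person_id), {})[item] = score
    return [
        [group_id, person_id] + [rec.get(item, missing) for item in items]
        for (group_id, person_id), rec in wide.items()
    ]


wide_rows = to_wide(responses, items)
for row in wide_rows:
    # Free-format, space-separated lines suitable for a plain data file.
    print(*row)
```

In this layout the group ID column would serve as the cluster variable, and each person contributes information only through the items that are not flagged as missing.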
Bob Houchens posted on Thursday, September 04, 2008 - 12:39 pm