
Anonymous posted on Friday, July 15, 2005 - 8:05 am



I am fitting a two-level SEM (children nested within neighborhoods) and want to calculate the ICC for the empty model. The outcome is a latent variable (PA) with 4 indicators (PA1 PA2 PA3 PA4). In the output for the empty model I receive: “Estimated Intraclass Correlations for the Y Variables”. Each of the indicators has an associated ICC. Since I am interested in the latent variable PA, do I use these estimates together to determine the ICC for the latent variable PA, or should I examine them separately? Second, how would I then go about calculating the design effect, 1 + (average cluster size - 1)*intraclass correlation? Would I use a summation of the indicator ICCs to arrive at an empty model ICC?

bmuthen posted on Friday, July 15, 2005 - 6:23 pm



I don't know if by "empty model" you mean an unrestricted model (free covariance matrix) or a baseline model (which has zero covariances). The estimated ICCs in the output refer to the estimated covariance matrices without a model imposed, so unrestricted. If you want ICCs for the factor PA then you have to estimate a 2-level latent variable model with PA variance on the within and between levels. The design effect formula you give is only for a special case of means with equal cluster sizes. I have not seen a formula for latent variable models and I doubt it can be easily expressed. The formula you give is probably a decent rough approximation.
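A minimal Python sketch of the approximate design effect formula from the question, 1 + (average cluster size - 1)*ICC. The function name and the example numbers (average cluster size 20, ICC 0.10) are illustrative assumptions, not values from this thread, and the formula is only a rough approximation for latent variable models.

```python
def design_effect(avg_cluster_size: float, icc: float) -> float:
    """Approximate design effect: 1 + (average cluster size - 1) * ICC."""
    if avg_cluster_size < 1:
        raise ValueError("average cluster size must be at least 1")
    if not 0.0 <= icc <= 1.0:
        raise ValueError("ICC must lie between 0 and 1")
    return 1.0 + (avg_cluster_size - 1.0) * icc


# Hypothetical inputs: 20 children per neighborhood on average, ICC of 0.10.
deff = design_effect(avg_cluster_size=20, icc=0.10)
print(f"approximate design effect: {deff:.2f}")
```

With one child per cluster the function returns 1, which matches the intuition that there is no clustering penalty without clustering.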


Are the "Estimated Intraclass Correlations for the Y Variables" normal ICCs or the corrected ICCs as proposed in your 1991 paper? 


They are regular ICCs for the observed variables, not the latent variable ICCs I wrote about.


Thank you, Dr. Muthén. From which covariance matrices (which OUTPUT option) can I compute the latent ICCs?


P.S. For both items and latent variables in a two-level CFA.


The latent ICCs are computed from the latent variable variances printed for the within and between parts, in line with my article that you mentioned. The ICCs for the items are printed as “Estimated Intraclass Correlations for the Y Variables” that you referred to. I assume your items are continuous and not declared as categorical.


I have a two-level model with continuous latent and observed variables. I'm a little confused about the ICCs for items. Normal ICCs for items are printed as "Estimated Intraclass Correlations for the Y Variables". Can "error-free" ICCs be computed from the "Model Estimated Covariances" matrices of the RESIDUAL output? With large sample sizes (N>10000), the expected difference between the two ICCs should decrease. Is this right?


No, error-free ICCs are computed by the formula in the article you refer to, BF/(BF + WF), where BF is the printed estimate of the between-level factor variance and WF is the printed estimate of the within-level factor variance. Sample size does not have an influence here, only the relative size of the within and between factor variances.
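A small Python sketch of the latent variable ICC, BF/(BF + WF), as described in the reply above. The variance values are hypothetical stand-ins for the factor variances one would read off the within and between parts of the MODEL RESULTS output.

```python
def latent_icc(between_factor_var: float, within_factor_var: float) -> float:
    """Latent variable ICC: BF / (BF + WF)."""
    if between_factor_var < 0 or within_factor_var < 0:
        raise ValueError("variances must be non-negative")
    total = between_factor_var + within_factor_var
    if total == 0:
        raise ValueError("total factor variance must be positive")
    return between_factor_var / total


# Hypothetical estimates: BF = 0.3 (between level), WF = 0.9 (within level).
print(f"latent ICC: {latent_icc(0.3, 0.9):.3f}")
```

Only the ratio of the two variances matters, so rescaling both by the same constant leaves the ICC unchanged.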


So I can't compute error-free ICCs for observed variables/items?! I've read a paper where error-free ICCs for the 12 items of 3 latent factors were computed. I wonder which variances the authors used.


I misled you; yes, you can. The BF/(BF+WF) formula above is defined in equation (5) of Muthén (1991), where you see that BF and WF are defined for each item, also including the item-specific loadings on between and within.


I'm really sorry to bother you again, but I'm still unsure how to compute error-free ICCs for observed variables. In formula (5) of your above-mentioned paper, the BF variance is defined as lambda² x sigma² (with lambda as the loading parameter). How is lambda estimated? By using the estimates from the MODEL RESULTS table with the 1st factor loading fixed at 1?


Yes. 
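A hedged Python sketch of the item-level error-free ICC as it is described in this exchange: BF and WF are each a squared loading times the factor variance on that level, using the unstandardized estimates from MODEL RESULTS. The function name and all numbers are hypothetical, and the sketch follows the thread's description of equation (5) rather than the article itself.

```python
def error_free_item_icc(loading_between: float, factor_var_between: float,
                        loading_within: float, factor_var_within: float) -> float:
    """Error-free ICC for one item: BF / (BF + WF), residual variances excluded."""
    bf = loading_between ** 2 * factor_var_between  # lambda_B^2 * sigma_B^2
    wf = loading_within ** 2 * factor_var_within    # lambda_W^2 * sigma_W^2
    if bf + wf == 0:
        raise ValueError("BF + WF must be positive")
    return bf / (bf + wf)


# Hypothetical unstandardized estimates for a single item.
icc_item = error_free_item_icc(loading_between=1.2, factor_var_between=0.3,
                               loading_within=0.8, factor_var_within=0.9)
print(f"error-free item ICC: {icc_item:.3f}")

# For the marker item (loading fixed at 1 on both levels) this reduces to
# the latent variable ICC BF/(BF + WF) computed from the factor variances.
print(f"marker item: {error_free_item_icc(1.0, 0.3, 1.0, 0.9):.3f}")
```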

wendy posted on Tuesday, July 11, 2006 - 7:35 pm



Hi, Dr. Muthen: I have a question regarding the ICC with error variance (indicator residual variance). I am running a 2-level CFA model in which, on both the within and between levels, there are two correlated factors and each factor predicts 3 indicators. Hence the code is:
%WITHIN%
fw1 BY y1-y3;
fw2 BY y4-y6;
y1-y6;
fw1-fw2;
fw1 WITH fw2;
%BETWEEN%
fb1 BY y1-y3;
fb2 BY y4-y6;
y1-y6;
fb1-fb2;
fb1 WITH fb2;
I calculate the ICC according to the formula BF/(BF+WF) and I also incorporate the residual variance of each indicator in the formula. However, all 6 of my indicator ICC values are slightly different from the Mplus output. Do you know why, and could you offer the formula for the ICC with error variance, that is, including the residual variance of the indicators? Another question is whether I need to think about the variance induced by the correlation of the two factors. Thanks.


The formula BF/(BF+WF) concerns the latent variable ICC, whereas Mplus prints the observed variable ICC. The latter uses the formula B/(B+W), where B is the between variance for the observed variable and W is the within variance for the observed variable. Each of these 2 variance components is the squared loading times the factor variance plus the residual variance.
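The observed variable formula B/(B+W) can be sketched in Python as follows, with each component built as squared loading times factor variance plus residual variance, as stated in the reply. The helper names and every numeric value are hypothetical and are not taken from any output in this thread.

```python
def variance_component(loading: float, factor_var: float, residual_var: float) -> float:
    """Model-implied variance on one level: loading^2 * factor variance + residual."""
    return loading ** 2 * factor_var + residual_var


def observed_icc(between_var: float, within_var: float) -> float:
    """Observed variable ICC: B / (B + W)."""
    return between_var / (between_var + within_var)


# Hypothetical unstandardized estimates for one indicator.
b = variance_component(loading=1.0, factor_var=0.3, residual_var=0.05)  # between
w = variance_component(loading=1.0, factor_var=0.9, residual_var=0.60)  # within
print(f"B = {b:.2f}, W = {w:.2f}, observed ICC = {observed_icc(b, w):.3f}")
```

Dropping the two residual variances from this calculation gives BF/(BF+WF) instead, which is the latent variable quantity rather than the one Mplus prints for the observed variables.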

student07 posted on Monday, July 23, 2007 - 1:06 am



Hello Drs. Muthen, I need to compute true ICCs for observed indicators in a multilevel factor analysis. Can you please help me understand formula (5) of your 1991 article and the Mplus 4.2 output? My question is whether the squared factor loadings (lambda²) of the within and between factors, which are weighted by the latent within and between factor variances (sigma²), are standardized loadings. Should I use "StdYX" from the output? Thanks a lot.


No, raw estimates. 

student07 posted on Wednesday, July 25, 2007 - 12:29 am



Thank you, Dr. Muthén. Now I have a follow-up question: to calculate true ICCs, is it a necessary condition that the unstandardized factor loadings for the indicator(s) of interest are invariant across the within and between levels?


Yes, and this is a weakness of the approach. 

Joop Hox posted on Tuesday, February 05, 2008 - 6:17 am



One remark and a question. To calculate an ICC for a latent variable we must assume that we measure the same thing at both levels. Thus we need measurement invariance. I think Byrne, Muthén & Shavelson argued for partial measurement invariance, meaning that a few loadings may actually be different. The question: I think I read somewhere that finding a higher ICC for the latent variable than for the observed variables is typical, but is there a reference for this?


I agree with your invariance statement. And this makes the latent variable ICC idea less useful because, if one tests for it, most often this invariance is not found. Here is a reference to an attempt at latent ICCs: Muthén, B. (1991). Multilevel factor analysis of class and student achievement components. Journal of Educational Measurement, 28, 338-354. The idea was that measurement error attenuates ICCs, given that such error variance goes into the denominator term W in ICC = B/(B + W).
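The attenuation argument can be illustrated with a short Python sketch: holding the factor variances fixed, adding measurement error variance to W pulls the ICC below its error-free value. It assumes, purely for illustration, a loading of 1 on both levels and error variance on the within level only; all numbers are hypothetical.

```python
def icc(between_var: float, within_var: float) -> float:
    """ICC = B / (B + W)."""
    return between_var / (between_var + within_var)


bf, wf = 0.3, 0.9  # hypothetical between and within factor variances
print(f"error-free (latent) ICC: {icc(bf, wf):.3f}")

# Growing within-level error variance inflates W and shrinks the ICC.
for error_var in (0.3, 0.6, 1.2):
    attenuated = icc(bf, wf + error_var)
    print(f"within error variance {error_var:.1f}: ICC = {attenuated:.3f}")
```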

V X posted on Wednesday, April 14, 2010 - 1:11 am



Dr. Muthen, I am wondering whether it is possible to calculate ICC for 1) categorical variables, and 2) semicontinuous variables? Thank you. 


You should find information on this in: Snijders, T. & Bosker, R. (1999). Multilevel analysis: An introduction to basic and advanced multilevel modeling. Thousand Oaks, CA: Sage Publications.

Sungchul Cho posted on Wednesday, October 28, 2015 - 2:07 pm



Dear Dr. Muthen, In the course of analyzing a two-level data set with individuals nested in communities, I computed ICCs for two latent variables, which were 0.622 and 0.843, respectively. I was surprised to see such a large amount of variability attributable to the community level. ICCs for the observed, reflective indicators were much smaller, ranging from 0.1 to 0.4. When I performed simple factor analyses using SPSS and computed ICCs for the resulting factors, those ICCs were also much smaller. I was wondering if I made any mistakes related to the Bayesian technique, thereby creating such a powerful level-2 effect. The code that I used is below. Do you have any suggestions? Thank you so much. Best, Sungchul
VARIABLE: NAMES ARE r a1-a12 c1-c14 v1-v6 i d e1-e5 k p1-p73 ed ic du id w p rw;
USEVARIABLES = a2-a5 a7-a8;
CATEGORICAL = a2-a5 a7 a8;
CLUSTER = r;
MODEL:
%WITHIN%
ah by a5 a7 a8;
al by a4 a2 a3;
ah (w1);
al (w2);
%BETWEEN%
ahbb by a5 a7 a8;
albb by a4 a2 a3;
ahbb (b1);
albb (b2);
MODEL CONSTRAINT:
NEW(icc1);
icc1 = b1/(b1+w1);
NEW(icc2);
icc2 = b2/(b2+w2);
ANALYSIS:
TYPE = TWOLEVEL;
estimator = bayes;
proc = 2;


In this case you need to hold the factor loadings equal across the 2 levels; otherwise the ICCs won't be meaningful.

Sungchul Cho posted on Wednesday, October 28, 2015 - 3:42 pm



Dear Dr. Muthen, Thanks for getting back to me so quickly. Though ICCs still remain high after holding factor loadings to be equal, your suggestion was a really useful correction. Sungchul 

Reeon Kang posted on Wednesday, November 23, 2016 - 4:00 am



Dear Dr. Muthen, I have an SEM with six latent variables (A, B, C, D, Ex1, Ex2), and each latent variable includes 5 observed variables. The aim of the study is to measure the indirect effects of Ex1 and Ex2 on A through the B, C, and D variables. The data contain about 4,000 students from 150 schools, so I assume that there may be a clustering effect (not sure). The relationships among the latent variables are as below. F on S O I; I on S O; O on S Ex1 Ex2; S on Ex1 Ex2; My questions are, if I want to conduct multilevel SEM for this study: 1. Can I use the 'MODEL INDIRECT' command to directly get the indirect effects of Ex1 and Ex2 on F? 2. Do I decide whether to use multilevel SEM or not based on the size of each latent ICC? 3. For simplicity, would it be better to turn all latent variables into sum variables so as to use them as observed variables? RK


1. Yes. 2. You can. 3. Not needed.
