Mean of between-level-variables
 AK14 posted on Thursday, October 01, 2009 - 9:56 am
Dear Linda,

I have a short question about how the mean of between-level variables is calculated:

I was playing with the centering options and noticed the following: When I used the grandmean centering option for a between-level variable as follows

BETWEEN = z_betw;
CENTERING = GRANDMEAN(z_betw);

the estimated mean of the between level variable is not zero but

Means
                Estimate    S.E.  Est./S.E.  Two-Tailed P-Value
    Z_BETW        -0.341   0.160     -2.128               0.033

In my dataset, cluster sizes differ considerably, and the estimated value corresponds to the unweighted mean of Z_BETW, i.e. the mean that weights every cluster equally regardless of its size. The weighted mean, which accounts for cluster size, would in fact be zero.

Two questions:

- What is your rationale for using the unweighted average?

- Is there a way to obtain the weighted mean that takes different cluster sizes into account?

Thanks for your help!
 Linda K. Muthen posted on Friday, October 02, 2009 - 9:01 am
If you use GROUPMEAN centering, I believe the mean will be zero. I think GRANDMEAN centering is weighted for cluster size, given that clusters with more members contribute more values to the computation of the grand mean.
 AK14 posted on Monday, October 05, 2009 - 10:11 am
Hi Linda,

thanks for your reply and for confirming my suspicion about the GRANDMEAN centering weighting for cluster size!

I am still wondering why you have chosen to estimate the mean of a cluster-level variable without weighting for cluster size, and whether and how I could obtain the weighted mean in Mplus as an estimated parameter. Unfortunately, GROUPMEAN centering did not do the trick: I am looking at a between-level variable, and Mplus, predictably, would not let me GROUPMEAN center it.

 Bengt O. Muthen posted on Monday, October 05, 2009 - 12:07 pm
So, just to be clear, we are talking about several different methods:

Method 1. Mplus currently grandmean centers a between-level variable by averaging it over all observations (so with 50 clusters and 2000 individuals, 2000 observations are used).

Method 2. Probably a more common approach is to average the variable over the clusters (so an average over the 50 values).

I would call Method 1 a weighted approach in the sense that bigger clusters contribute more to the grandmean.

Method 3. You can also estimate the between-level mean by Type=Basic Twolevel using ML. This mean will be a bit different from the mean of Method 2 (essentially, different estimators are used); ML gives heavier weight to larger clusters.
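
As a rough illustration of Methods 1 and 2 above, the arithmetic can be sketched in Python. This is not Mplus code and does not reproduce Mplus output: the cluster values, cluster sizes, and function names are all made up for the example, and the ML estimate of Method 3 is not covered. The sketch only shows why a variable that has been centered at the observation-level grand mean (Method 1) can still have a nonzero mean when averaged over clusters (Method 2).

```python
# Hypothetical data: one value of the between-level variable per cluster,
# plus the number of individuals in each cluster (sizes differ a lot).
z_betw = [2.0, 0.5, -1.0]
sizes = [10, 40, 150]


def observation_weighted_mean(values, sizes):
    """Method 1: average over all individuals, so larger clusters count more."""
    total_n = sum(sizes)
    return sum(v * n for v, n in zip(values, sizes)) / total_n


def cluster_mean(values):
    """Method 2: average over clusters, counting each cluster once."""
    return sum(values) / len(values)


m1 = observation_weighted_mean(z_betw, sizes)
m2 = cluster_mean(z_betw)

# Grand-mean centering in the sense of Method 1: subtract m1 from every value.
centered = [v - m1 for v in z_betw]

print("Method 1 mean:", m1)
print("Method 2 mean:", m2)
print("Centered, averaged over observations:", observation_weighted_mean(centered, sizes))
print("Centered, averaged over clusters:", cluster_mean(centered))
```

With unequal cluster sizes, the centered variable averages to (approximately) zero over observations, while its average over clusters equals the difference between the Method 2 and Method 1 means. With equal cluster sizes the two means coincide and both averages are zero.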