Complex sample data
 Daniel Rodriguez posted on Monday, September 24, 2007 - 10:56 am
If I have data collected from four high schools, and the ICC is greater than .05, say .10, can I fit an LGM at a single level, somehow adjusting for the design effect, or should I instead use multilevel modeling?
 Linda K. Muthen posted on Monday, September 24, 2007 - 11:33 am
Multilevel modeling requires more than four cluster units. You can include three dummy variables in the analysis to represent the four schools.
 Daniel Rodriguez posted on Monday, September 24, 2007 - 4:27 pm
I had a reviewer for a grant say that because we have four schools the data is likely non-independent and that it is possible our standard errors will be biased downwards. Can we account for this bias in some way? Can I use type=complex or do something else to adjust the standard errors?
 Linda K. Muthen posted on Monday, September 24, 2007 - 6:09 pm
TYPE=COMPLEX has the same requirement for number of clusters as TYPE=TWOLEVEL. I think the best you can do is use the three dummy variables as covariates in the model to control for nonindependence.
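
A minimal sketch of the dummy-variable approach suggested above might look like the following. It assumes a linear growth model with four equally spaced waves; the file name, the variable names (school, y1-y4, d2-d4), and the number of time points are hypothetical and not taken from the thread. School 1 serves as the reference category.

```
TITLE:    Sketch only: LGM with three school dummies as covariates
          ! All names below are illustrative assumptions
DATA:     FILE = schools.dat;
VARIABLE: NAMES = id school y1-y4;
          ! Variables created in DEFINE go at the end of USEVARIABLES
          USEVARIABLES = y1-y4 d2 d3 d4;
DEFINE:   d2 = 0;
          d3 = 0;
          d4 = 0;
          IF (school EQ 2) THEN d2 = 1;
          IF (school EQ 3) THEN d3 = 1;
          IF (school EQ 4) THEN d4 = 1;
MODEL:    ! Intercept and slope growth factors, linear time scores
          i s | y1@0 y2@1 y3@2 y4@3;
          ! School differences in the growth factors
          i s ON d2 d3 d4;
```

Regressing the growth factors on the dummies lets the schools differ in mean intercept and slope, which is one way to absorb school-level dependence when there are too few clusters for TYPE=COMPLEX or TYPE=TWOLEVEL.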
 Alison Riddle posted on Wednesday, December 12, 2007 - 9:12 am
I have data that is clustered at two levels (household and community - referred to as enumeration areas in my data). Can Mplus handle clustering at two levels?

Very briefly, my data is from a cross-sectional survey that was conducted in 8 southern African countries. Enumeration areas (EAs) in all the countries were first stratified by type (urban, rural, and capital). EAs were then randomly selected in each stratum. In each EA, all households and all adults in each household were interviewed, without sub-sampling. Thus, the data are clustered at the EA and household levels.

Any recommendations you may have on how to handle this data in Mplus would really be appreciated.

Cheers,
Alison
 Bengt O. Muthen posted on Wednesday, December 12, 2007 - 10:10 am
There are 2 alternatives in Mplus.

1. You could use Type = Complex Twolevel for these data. This allows 2 cluster variables, the first one for EA and the second for household. You would do Twolevel modeling of members within households while correcting for non-independent observations within EAs by the Complex feature. This also allows specifying stratification and sampling weights.

2. You could use Type = Twolevel where the single cluster variable is EA and where the household members are treated in a multivariate fashion (so if each member is observed on p variables and there are s members, you would work with s*p variables). Here too you can specify stratification and weights.
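
As a rough illustration of alternative 1, an input along these lines could be used. The variable names (ea, hh, stratum, wt, y, x) and the simple regression model are hypothetical placeholders, and the exact options available may depend on the Mplus version.

```
TITLE:    Sketch only: TYPE = COMPLEX TWOLEVEL with two cluster variables
          ! All names below are illustrative assumptions
DATA:     FILE = survey.dat;
VARIABLE: NAMES = ea hh stratum wt y x;
          USEVARIABLES = y x;
          ! First cluster variable is used by COMPLEX (EA),
          ! second by TWOLEVEL (household)
          CLUSTER = ea hh;
          STRATIFICATION = stratum;
          WEIGHT = wt;
          WITHIN = x;
ANALYSIS: TYPE = COMPLEX TWOLEVEL;
MODEL:    %WITHIN%
          ! Individuals within households
          y ON x;
          %BETWEEN%
          ! Household-level variance of the random intercept
          y;
```

Here the household is the modeled level, while the EA clustering only adjusts the standard errors and test statistics.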
 Alison Riddle posted on Wednesday, December 12, 2007 - 1:16 pm
Thank you, Bengt. This is extremely helpful.
 Alison Riddle posted on Friday, December 14, 2007 - 11:40 am
I have another quick question. I would just like to confirm what estimator is best to use for categorical data in an EFA... Is it WLSM or MLR? Or possibly another?

Thanks again.

Cheers,
Alison
 Linda K. Muthen posted on Friday, December 14, 2007 - 1:38 pm
The default is WLSM. Maximum likelihood requires numerical integration so this may not be feasible with more than a few factors.
 Alison Riddle posted on Friday, December 14, 2007 - 7:19 pm
Thanks. That may pose a problem for me, then. I have 17 dichotomous variables that I am using for an EFA. I have run the analysis so far as follows (see below). If I use eigenvalue = 1 as a cutoff, it gives me 6 factors (p = 0.0000). I have 16,707 observations in my data set. I am working on the assumption that the missingness is MAR. I do not want to get into multiple imputation. Do you have any suggestions?

Thank you!

ANALYSIS:
TYPE IS COMPLEX EFA 1 10 MISSING H1;
ESTIMATOR = WLSM;
ITERATIONS = 1000;
CONVERGENCE = 0.00005;
H1ITERATIONS = 2000;
H1CONVERGENCE = 0.0001;
COVERAGE = 0.11;
ROTATION = VARIMAX;