How to model this sample design in Mplus
 Jim Prisciandaro posted on Thursday, October 01, 2009 - 2:00 pm
Hi Drs. Muthen,

I am working with a sample that was stratified based on scores on a questionnaire. A weight variable was included with the data to adjust for selection based on the stratification. In most cases, I would use TYPE=COMPLEX to analyze these data.

However, the data come from multiple sites. The sampling design was: 1) over a dozen sites were selected based on practical concerns, 2) each site's population was stratified by questionnaire scores and a sample was selected from each stratum, 3) a weight variable was created to adjust for selection.

I understand how to incorporate the weight and stratum variables in Mplus (using TYPE=COMPLEX), but how do I incorporate the fact that the data come from different sites?

 Linda K. Muthen posted on Friday, October 02, 2009 - 8:55 am
You could use multiple group analysis or, if you have at least 30 to 50 sites, you could consider multilevel modeling.
 Jim Prisciandaro posted on Thursday, October 08, 2009 - 10:30 am
Thanks, Linda. When I run the model as a multiple group analysis, results are provided by group (i.e., there are no results for the overall sample). Is there any way to get results for the entire sample while accounting for the fact that the data come from a number of different sites (there are 14)? I am looking to "control/correct" for the fact that the data come from multiple sites, just as I "control/correct" for stratification and weighting using TYPE=COMPLEX.

 Linda K. Muthen posted on Friday, October 09, 2009 - 8:58 am
Multiple group analysis does not provide results for the overall sample. If you remove the GROUPING option, you will receive results for the entire sample.

You don't have enough sites to use TYPE=COMPLEX or TYPE=TWOLEVEL. You can control for site by using 13 dummy variables representing site as covariates in your model.
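
As a rough illustration of the dummy-variable suggestion, the fragment below sketches one way the 13 indicators might be created in the DEFINE command, with site 1 as the reference category. It assumes a hypothetical variable named site coded 1 to 14; the names site2-site14 are also made up for this sketch, and the exact DEFINE rules should be checked against the Mplus User's Guide.

```
DEFINE:
  ! Hypothetical: site is coded 1-14; site 1 is the reference.
  ! Each dummy starts at 0 and is set to 1 for its own site.
  site2 = 0;   IF (site EQ 2)  THEN site2 = 1;
  site3 = 0;   IF (site EQ 3)  THEN site3 = 1;
  site4 = 0;   IF (site EQ 4)  THEN site4 = 1;
  site5 = 0;   IF (site EQ 5)  THEN site5 = 1;
  site6 = 0;   IF (site EQ 6)  THEN site6 = 1;
  site7 = 0;   IF (site EQ 7)  THEN site7 = 1;
  site8 = 0;   IF (site EQ 8)  THEN site8 = 1;
  site9 = 0;   IF (site EQ 9)  THEN site9 = 1;
  site10 = 0;  IF (site EQ 10) THEN site10 = 1;
  site11 = 0;  IF (site EQ 11) THEN site11 = 1;
  site12 = 0;  IF (site EQ 12) THEN site12 = 1;
  site13 = 0;  IF (site EQ 13) THEN site13 = 1;
  site14 = 0;  IF (site EQ 14) THEN site14 = 1;
```

Variables created in DEFINE generally need to be listed at the end of the USEVARIABLES list before they can be used in the MODEL command.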
 Jim Prisciandaro posted on Friday, October 09, 2009 - 11:17 am
Estimated models will consist of a variety of latent factor (CFA) models. Given that I would like to control for site in my analyses, in the dummy variable scenario, would it make more sense to regress observed indicators on the dummy variables, or to regress latent variables on the dummy variables?

I'm guessing that regressing observed indicators on the dummy variables would be more appropriate, but I am not certain.

Thanks again,
 Bengt O. Muthen posted on Friday, October 09, 2009 - 5:10 pm
You would hope that sites don't have indicator-specific differences which would be hard to work with. So regressing the latent variables on the dummies would be at least the starting model (from which you may look for "item bias" - evidence of the need for direct effects; see our Topic 1 handout).
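
A minimal sketch of the starting model described above might look like the following input. All file and variable names (mydata.dat, y1-y6, wt, strat, site2-site14) are hypothetical, the site dummies are assumed to already exist in the data file, and a single factor is used only to keep the example short.

```
TITLE:    CFA with site dummies as covariates (illustrative only)
DATA:     FILE IS mydata.dat;          ! hypothetical file name
VARIABLE: NAMES ARE y1-y6 wt strat site2-site14;  ! hypothetical
          USEVARIABLES ARE y1-y6 site2-site14;
          WEIGHT IS wt;                ! sampling weight
          STRATIFICATION IS strat;     ! questionnaire-score strata
ANALYSIS: TYPE = COMPLEX;
MODEL:    f BY y1-y6;                  ! measurement model
          f ON site2-site14;           ! factor regressed on dummies
          ! Possible follow-up when probing "item bias": a direct
          ! effect of one site on one indicator, for example
          ! y3 ON site5;
```

Regressing the factor rather than each indicator on the dummies keeps the site adjustment to 13 parameters per factor; direct effects would be added only where there is evidence they are needed.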
 Jim Prisciandaro posted on Sunday, October 11, 2009 - 9:26 am
This makes sense; thank you.

Quick related question: I have been requesting "output=standardized" in these models (TYPE is COMPLEX MIXTURE; multiple group analysis using "knownclass"). Mplus is printing


STDYX Standardization

Estimate S.E. Est./S.E. P-Value

STDY Standardization

Estimate S.E. Est./S.E. P-Value

STD Standardization

Estimate S.E. Est./S.E. P-Value

but is not providing any values under these headers (nor under the R-square headers). Is there any way for me to obtain standardized estimates of model parameters (especially loadings and thresholds)? If not, how can I convert loadings and thresholds into a meaningful metric? I would like to be able to describe measurement parameters à la IRT.
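
For reference, the conversion asked about here can be sketched for a binary indicator, assuming a logit link and a single factor. The symbols are introduced only for this illustration: λ is the loading, τ the threshold, and α and ψ the factor mean and variance in a given class. Writing the factor as f = α + √ψ·θ, with θ having mean 0 and variance 1, gives the two-parameter IRT form:

```latex
\begin{aligned}
\operatorname{logit} P(u = 1 \mid f) &= -\tau + \lambda f \\
  &= -\tau + \lambda\alpha + \lambda\sqrt{\psi}\,\theta \\
  &= a(\theta - b), \\
a &= \lambda\sqrt{\psi}, \qquad
b = \frac{\tau - \lambda\alpha}{\lambda\sqrt{\psi}} .
\end{aligned}
```

Here a plays the role of the discrimination and b the difficulty. Some IRT conventions additionally divide a by a scaling constant D = 1.7, so which convention is in use should be stated when reporting.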

 Linda K. Muthen posted on Sunday, October 11, 2009 - 10:36 am
This is a support question. Please send your full output and license number to