Examples from national data sets
 Raheem Paxton posted on Monday, March 27, 2006 - 7:34 pm
Are there any examples using Mplus with the Youth Risk Behavior Survey (YRBS)? How do you include the weight factor assigned to each student record (weight), the primary sampling units (psu), and the stratum (stratum, the stratum to which the student's school was assigned)?
 Linda K. Muthen posted on Tuesday, March 28, 2006 - 4:05 pm
We don't know of any Mplus examples that use the YRBS.

Use the WEIGHT option for the weight, the CLUSTER option for the psu, and the STRATIFICATION option for the strata.
 Christian M. Connell posted on Wednesday, August 15, 2007 - 4:33 pm
In a follow-up to this older posting -- Mplus generates the following error message with the YRBS data using the weight, cluster, and stratification options:

Each stratum must contain unique cluster IDs. Clusters are not nested within strata.

The YRBS has 15 strata and 43 PSUs, but each stratum may contain from 2 to 9 PSUs, and a given PSU may fall in more than one stratum. So the clusters are not fully nested within strata.

Is there a way to work around this issue when modeling the YRBS (or similar sample)?
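Before changing the model, it can help to see which PSUs trigger the error. The following is a minimal Python sketch, not part of Mplus, that lists every cluster ID appearing in more than one stratum. The function name and the (stratum, psu) pairs are hypothetical and are not actual YRBS values.

```python
from collections import defaultdict


def clusters_spanning_strata(records):
    """Return {cluster_id: sorted strata} for clusters seen in more than one stratum.

    `records` is an iterable of (stratum, psu) pairs, one per observation.
    """
    strata_by_cluster = defaultdict(set)
    for stratum, psu in records:
        strata_by_cluster[psu].add(stratum)
    return {psu: sorted(strata)
            for psu, strata in strata_by_cluster.items()
            if len(strata) > 1}


# Hypothetical (stratum, psu) pairs for illustration only.
sample = [(1, 10), (1, 11), (2, 11), (2, 12), (3, 13)]
offenders = clusters_spanning_strata(sample)
print(offenders)
```

An empty result would mean every cluster ID belongs to exactly one stratum, which is the condition the error message asks for.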
 Linda K. Muthen posted on Thursday, August 16, 2007 - 3:44 pm
Mplus does not yet have this capability.

A simpler way to work around this issue is to ignore the cross classification -- presumably the SEs won't be underestimated that much. This is done by treating a cluster which is in two strata as two separate clusters.

It may be possible to run this model as a mixture/multiple group/known class model where the stratum variable is the grouping variable. This will allow the cross classification.
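The first workaround above, treating a cluster that appears in two strata as two separate clusters, amounts to recoding the cluster variable before the data are read into Mplus. A minimal Python sketch of one way to do that recoding follows. The function name and the sample pairs are hypothetical, and the sequential numbering is only one possible choice of new IDs.

```python
def recode_clusters(records):
    """Give each distinct (stratum, psu) pair its own integer cluster ID.

    Returns a list of (stratum, new_cluster_id) pairs in the input order.
    """
    new_ids = {}
    recoded = []
    for stratum, psu in records:
        key = (stratum, psu)
        if key not in new_ids:
            # Next unused ID; numbering starts at 1.
            new_ids[key] = len(new_ids) + 1
        recoded.append((stratum, new_ids[key]))
    return recoded


# Hypothetical data: PSU 11 appears in strata 1 and 2.
sample = [(1, 10), (1, 11), (2, 11), (2, 12), (1, 10)]
recoded = recode_clusters(sample)
print(recoded)
```

After recoding, each new cluster ID occurs in only one stratum, so the recoded variable can be supplied to the CLUSTER option alongside the original stratum variable.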
 Christian M. Connell posted on Friday, August 17, 2007 - 4:19 pm
I tried your initial suggestion -- treating each PSU that falls in multiple strata as separate clusters -- and ran a simple regression model (rather than the LCA model that I'm working on, which takes about 1.5 hrs to estimate). Ignoring stratification and treating psu as the cluster variable (with weighting) produces the following:
       Estimates    S.E.   Est./S.E.
Q_A        0.017   0.063       0.262
Q_B        0.058   0.044       1.309
Q_C        0.005   0.169       0.029
Q_D        0.045   0.086       0.528
Q_E        0.100   0.120       0.835
Q_F       -0.188   0.107      -1.760
Q_G        0.065   0.152       0.430
Q_H        0.092   0.043       2.115

With a revised psu approach I get the following:
       Estimates    S.E.   Est./S.E.
Q_A        0.017   0.119       0.139
Q_B        0.058   0.065       0.893
Q_C        0.005   0.176       0.028
Q_D        0.045   0.111       0.410
Q_E        0.102   0.061       1.676
Q_F       -0.188   0.068      -2.784
Q_G        0.065   0.210       0.310
Q_H        0.091   0.069       1.320

The primary difference is in the S.E.s for Q_F and Q_H, which changes their statistical significance. Would this suggest any concern for either approach (neither of which fully incorporates the strata variable)? Also, can you say more about your second option -- which variable would represent the group and which the known class?

Thank you in advance.
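As a quick check of which coefficients change significance between the two runs, the reported Est./S.E. ratios can be compared against a critical value. The sketch below assumes a two-sided test at the 5% level (critical value 1.96), which the post does not state. The ratios are copied from the two sets of results above.

```python
CRITICAL_Z = 1.96  # assumed: two-sided 5% critical value for a standard normal

# Est./S.E. ratios as reported for the psu-only run and the revised-psu run.
psu_only = {"Q_A": 0.262, "Q_B": 1.309, "Q_C": 0.029, "Q_D": 0.528,
            "Q_E": 0.835, "Q_F": -1.760, "Q_G": 0.430, "Q_H": 2.115}
revised = {"Q_A": 0.139, "Q_B": 0.893, "Q_C": 0.028, "Q_D": 0.410,
           "Q_E": 1.676, "Q_F": -2.784, "Q_G": 0.310, "Q_H": 1.320}


def is_significant(z, critical=CRITICAL_Z):
    """True when the absolute ratio exceeds the critical value."""
    return abs(z) > critical


# Coefficients whose significance status differs between the two runs.
flipped = [name for name in psu_only
           if is_significant(psu_only[name]) != is_significant(revised[name])]
print(flipped)
```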
 Bengt O. Muthen posted on Saturday, August 18, 2007 - 9:45 pm
These results seem strange - the SEs should decrease, not increase. If you send the data and model we might be able to resolve it.