I have survey data that I want to analyze using a multi-level model (we want to use a model-based approach, with this type of variance decomposition). The data are clustered in nature (students within schools), but no sampling information is available for the clustering, so unfortunately we are required to assume SRS within states. The data set contains weights based on population values for states, intended to make the survey representative of the state populations (an equal number of participants is selected from each state, so it's not PPS).
Can I include this type of weight as a "weight" in a multi-level model, or is this incorrect? I have read that it is only appropriate to include "conditional" level 1 weights, but I'm not sure how this applies here.
Any references that may offer information about this would also be greatly appreciated. I've read a lot about multi-level model weights, but I can't seem to find a reference to this type of circumstance.
From your description it seems that you have a stratified sample. I would recommend using strata=state; and specifying the population-based weight as the bweight=. Technically speaking, that is not exactly the weight you should use, but it can serve as an approximation: the exact weight should be based on counting the PSUs in each state (which I assume are the schools) rather than the people in the state.
Thank you very much for your response. I failed to mention that the weights are based on state population values for grade groups as well (that is, all students from the same state and grade have the same weight). Is it still appropriate to apply them as a level 2 weight, given that clusters (schools) contain samples from multiple grades?
I was wondering why it is inappropriate to apply them as a level 1 weight?
My feeling was to include dummy variables for these design elements; however, I couldn't determine whether it was entirely unsuitable to apply them as weights instead.
You will have to use the details of the sampling design to determine the probability that a school is selected (1/that probability is the bweight variable) and the probability that each student is selected, given that their school is selected (1/that probability is the within weight).
If, for example, second graders were oversampled compared with third graders in a particular school, you will have to use both within and between weights.
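As a rough illustration of the two inverse-probability weights described above, here is a minimal Python sketch. The function name and the selection probabilities are hypothetical and chosen only for the example; in practice they would have to come from the documented sampling design.

```python
def two_level_weights(p_school, p_student_given_school):
    """Return (between, within, overall) weights from selection probabilities.

    p_school: probability that the school is selected.
    p_student_given_school: probability that the student is selected,
        given that the school is selected.
    """
    for p in (p_school, p_student_given_school):
        if not 0.0 < p <= 1.0:
            raise ValueError("selection probabilities must be in (0, 1]")
    between = 1.0 / p_school                # plays the role of bweight
    within = 1.0 / p_student_given_school   # plays the role of the within weight
    return between, within, between * within


# Hypothetical school selected with probability 0.1, in which second
# graders are oversampled (0.5) relative to third graders (0.25).
grade2 = two_level_weights(0.1, 0.5)
grade3 = two_level_weights(0.1, 0.25)
print("grade 2:", grade2)
print("grade 3:", grade3)
```

In this made-up case both grades share the same between weight, because they sit in the same school, but they get different within weights. That is why both weights are needed once one grade is oversampled within a school.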
If the sampling weights vary across clusters (schools), Mplus will not let you use these as bweight=.
Unfortunately, the sampling weights you were given are computed for single-level models, and often not enough information is provided to compute the correct two-level weights.
In the absence of any additional information about the sampling, I would resort to this:
1. Compute the within weights according to the last formula on page 3 of http://statmodel.com/download/Scaling3.pdf
2. Use as bweight the original weight / within weight computed in step 1.
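The two steps above can be sketched in Python as follows. This is only an illustration under a stated assumption: the within weight is taken to be the overall weight rescaled so that the within weights in each cluster sum to the cluster sample size. That is a common scaling, but it may not be the exact formula in the linked document, so check it against page 3 before relying on it. The function name and the data are hypothetical.

```python
from collections import defaultdict


def split_weights(records):
    """Split overall weights into (within, between) pairs.

    records: list of (cluster_id, overall_weight) tuples, one per student.
    Assumed scaling: within weights sum to the cluster sample size.
    """
    totals = defaultdict(float)  # sum of overall weights per cluster
    counts = defaultdict(int)    # number of sampled students per cluster
    for cluster, weight in records:
        totals[cluster] += weight
        counts[cluster] += 1

    result = []
    for cluster, weight in records:
        # Step 1: rescale the overall weight within the cluster.
        within = weight * counts[cluster] / totals[cluster]
        # Step 2: between weight = original weight / within weight.
        between = weight / within
        result.append((within, between))
    return result


# Hypothetical data: school "A" has three students, school "B" has two.
data = [("A", 2.0), ("A", 2.0), ("A", 4.0), ("B", 1.0), ("B", 3.0)]
for (cluster, weight), (within, between) in zip(data, split_weights(data)):
    print(cluster, weight, round(within, 4), round(between, 4))
```

Under this scaling, the between weight in step 2 reduces to the mean overall weight in the school, so it is the same for every student in that school even when the original weights differ by grade.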
One final clarification: What if the assumption was made that the sampling design was not multi-stage, so the analysis is multi-level (to account for the naturally occurring clustering) but the sampling is just assumed to be SRS within state x grade groups?
In this circumstance would it be suitable to apply the overall weights as is?
One final clarification: What about the circumstance in which the design is not multi-stage (but we still wish to use a multi-level analysis to account for the natural clustering), and instead the sample was an SRS within state x grade groups?
Would it be appropriate to apply such weights as is, to a multi-level model?
In this situation I would recommend using either a two-level model with [state x grade] as the clusters or a three-level model with state as the third level and grade as the second level. Weights would not apply anywhere, as there is no unequal probability sampling anywhere. You have to be careful with the interpretation, as these models estimate the average of the state averages rather than the average for the population.