Weights

Anonymous posted on Wednesday, September 21, 2005 - 7:02 am

I have a question about using weights in a multilevel SEM model. This model has two latent variables and six observed variables at level 2, and the same setup at level 1 with different variables. Because of this, and because the data are clustered, I wanted to use multilevel SEM analysis. Also, the data come with weights. How would I incorporate the weights into my model? Is there an example in the manual that I am overlooking?

Thank you in advance for your assistance

Linda K. Muthen posted on Wednesday, September 21, 2005 - 7:34 am

You would use the WEIGHT option of the VARIABLE command to specify which variable contains the sampling weight information.

Anonymous posted on Wednesday, September 21, 2005 - 2:33 pm

I'll give it a try.

Rick Sawatzky posted on Thursday, May 24, 2007 - 9:36 am

Hi, I am using data from a national survey. A bootstrapping procedure was used to create 500 sets of sampling weights, which have been provided by the owner of the database for the purpose of estimating standard errors while taking the survey design into account. Is there a way in Mplus to combine model estimates based on these sampling weights, or do I have to manually run the models 500 times and then manually calculate the standard errors based on the distributions of the obtained estimates?

Linda K. Muthen posted on Thursday, May 24, 2007 - 10:05 am

If you have 500 data sets, each with a different weight, you can use external Monte Carlo (Example 11.6, Step 2) to analyze them. You will obtain results that are the average parameter estimates, the average standard error, etc. (see Chapter 11, Monte Carlo Output). I'm not sure this is exactly how these replicate weights should be used.
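
For reference, the manual calculation described in the question can be sketched outside Mplus. This is a minimal Python sketch, not Mplus functionality. It assumes the data provider's documentation prescribes the common bootstrap-weight formula, in which the variance is the mean squared deviation of the replicate estimates around the full-sample estimate; other replicate-weight schemes use different multipliers, so check the survey documentation. The function name and the numbers are hypothetical.

```python
import math


def bootstrap_weight_se(full_estimate, replicate_estimates):
    """Standard error from replicate-weight estimates.

    Assumption: variance = mean squared deviation of the replicate
    estimates around the estimate obtained with the main weight.
    """
    b = len(replicate_estimates)
    if b == 0:
        raise ValueError("need at least one replicate estimate")
    variance = sum((est - full_estimate) ** 2 for est in replicate_estimates) / b
    return math.sqrt(variance)


# Hypothetical numbers: one coefficient estimated with the main weight,
# then re-estimated with four replicate weights (a real file would have 500).
full = 0.50
replicates = [0.46, 0.54, 0.48, 0.52]
se = bootstrap_weight_se(full, replicates)
print(round(se, 4))
```

Under this assumption the standard error comes from the spread of the replicate point estimates, which is a different quantity from the average of the per-replicate standard errors.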

Thais Rogatko posted on Tuesday, November 11, 2008 - 10:03 am

Hi, I'm using the European Social Survey data, which has two sampling weight variables (a design weight to control for not all people being given the same chance of selection, and a population weight to accurately represent country populations). I am testing a two-level model (country level and individual level). I'm wondering how I include both weight variables. Do I say WEIGHT = DWEIGHT PWEIGHT?

Thanks for your help

Tihomir Asparouhov posted on Tuesday, November 11, 2008 - 1:33 pm

For the two-level model you should use only WEIGHT = DWEIGHT. However, if you want to estimate population totals, you should use single-level models with the weight set to the product of DWEIGHT and PWEIGHT.

Dennis Koethemann posted on Friday, April 27, 2012 - 9:03 am

Do you know of any reference explaining why we should not use PWEIGHT in a two-level SEM model?

Many thanks

Tihomir Asparouhov posted on Friday, April 27, 2012 - 3:59 pm

You should construct your weights so that the level 2 weight is 1 / Prob of including that cluster in the sample and the level 1 weight is 1 / Prob of including the observation in the sample.

See
http://statmodel.com/download/asparouhovgmms.pdf

Laura Wray-Lake posted on Monday, November 12, 2012 - 10:20 am

We are trying to estimate a latent class growth analysis using Add Health data, and we are therefore applying complex sampling weights. Importantly, we are using a subsample of the full data (i.e., only 7th graders).

We used Type=Complex TwoLevel and included wtscale=ecluster to try to accommodate for using only a portion of the full data. However, we only wish to specify a within-group model, and the results of the two-level model were uninterpretable. [All individuals were assigned to a single group even though three groups were specified.]

Can we use Type=Complex (without TwoLevel) and incorporate some other method of adjusting the weights to account for use of a subpopulation?

Linda K. Muthen posted on Monday, November 12, 2012 - 11:02 am

You can use the SUBPOPULATION option with TYPE=COMPLEX.

Diana Paksarian posted on Thursday, June 05, 2014 - 10:49 am

Hello,
I am working on a multilevel analysis using survey data and comparing different weight scaling methods following Asparouhov (2006). I am using data in which students were selected from within schools with probabilities proportional to size. Based on my initial reading of the paper I decided that this would be classified as an invariant selection mechanism since it is the same across schools and gives meaning to the ratio of weights from students in different schools. After a few re-readings I have begun to question whether I understood the issue correctly. If someone could confirm or disconfirm that would be very helpful.
Thank you,
Diana

Tihomir Asparouhov posted on Thursday, June 05, 2014 - 8:45 pm

Typically the above language translates to: the schools were selected with probability proportional to the size of the school, then in a second stage sampling a fixed number of students were selected at random from each of the selected schools. Assuming that you are modeling the school as your level 2 cluster unit in Mplus you should use the "bweight=1/prob selection=1/size of school" command and do not specify any weight on the within level.

The 2005 paper and the invariant and non-invariant selection deal with the case where you have within level weight so it would not apply to your situation.
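
The design described in this reply can be worked through numerically. The following is a minimal Python sketch under stated assumptions: schools are drawn with probability proportional to size (approximated as number of sampled schools times size divided by total enrollment, which requires that no school's probability exceeds 1), and a fixed number of students is then drawn at random within each selected school. The function name, school names, and sizes are hypothetical.

```python
def pps_two_stage_weights(school_sizes, n_schools_sampled, students_per_school):
    """Inverse-probability weights for PPS schools plus a fixed take per school."""
    total = sum(school_sizes.values())
    weights = {}
    for school, size in school_sizes.items():
        p_school = n_schools_sampled * size / total          # stage 1: PPS
        p_student_given_school = students_per_school / size  # stage 2: SRS
        weights[school] = {
            "bweight": 1.0 / p_school,
            "within_weight": 1.0 / p_student_given_school,
            "overall_student_weight": 1.0 / (p_school * p_student_given_school),
        }
    return weights


# Hypothetical frame: four schools, two sampled, 20 students taken per school.
sizes = {"A": 200, "B": 400, "C": 1000, "D": 400}
w = pps_two_stage_weights(sizes, n_schools_sampled=2, students_per_school=20)
for school in sizes:
    print(school, w[school])
```

In this sketch the between weight is proportional to 1 / school size, and the within weight is the same for every student in a given school, so it carries no information once weights are rescaled within clusters. That is consistent with specifying only the between-level weight for this design.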

Diana Paksarian posted on Friday, June 06, 2014 - 7:56 am

Thank you for your reply. I left some information out of my previous post. The schools were selected as the SSUs in a complex survey, so the school-level weight I've been using is 1 / [p(psu selection)*p(ssu selection)]. In my case, using this as the level 2 weight and omitting level 1 weights is equivalent to the scaling methods A and B that you describe in the 2006 paper.

I am somewhat confused by your comment that the invariant v. non-invariant distinction is not relevant, since I got the impression that it affects the calculation of the level 2 weights. I apologize if I am missing something basic.
Thank you,
Diana

Tihomir Asparouhov posted on Friday, June 06, 2014 - 8:36 am

Scaling A vs. B and invariance of selection are relevant only when there are within-level weights. If the within-level sampling is random, both concepts are irrelevant. The case of no within-level weights is the best situation since it simplifies so much. When there are no within-level sampling weights, that technically doesn't even qualify as a two-level model, because there is a multivariate single-level model equivalent to your two-level model; that's explained in the 2005 paper.

TD posted on Tuesday, March 28, 2017 - 4:08 pm

Dear Mplus team,

I have two survey datasets collected three years apart, with no subject selected in more than one survey. The sampling methods used in these surveys (with unequal probability) are the same, and I have sampling weights calculated (with respect to the population and non-response) for each survey. I want to combine the two surveys for a bigger sample size. My question is related to the adjustment of the weights.
Let's assume that w1i is the weight for subject i in the first sample and w2i is the weight for subject i in the second sample, that n1 and n2 are the respective sample sizes, and that N1 is the sum of the w1i and N2 is the sum of the w2i.
The first adjustment I used is: w'1i= w1i*1/2 and w'2i= w2i*1/2.
The second adjustment is: w'1i= w1i*N2/(N1+N2) and w'2i= w2i*N2/(N1+N2).
Given that neither of these adjustments necessarily produces the most efficient estimates in terms of variance, is it important to choose one adjustment instead of the other?
Is it better to use another adjustment?

Thank you in advance for your help.

Tihomir Asparouhov posted on Thursday, March 30, 2017 - 10:10 am

Whatever adjustment you use should obey this simple logic. If the first sample was random (equal weights) and the second sample was random (equal weights) the combined sample is also random so the weights should be equal. The first adjustment you propose doesn't satisfy that so I would not use that. In the second adjustment you clearly have a typo and I wouldn't speculate what you meant but the weights standardization that we typically use is to standardize the weights so they add up to the sample size. Then you can combine them.
w'1i= w1i*n1/N1
w'2i= w2i*n2/N2
This could be equivalent to your second adjustment (depending on what n1,n2,N1,N2 are). You can also run this as a multiple group and Mplus will do that for you.
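
The standardization in this reply, and the equal-weights check it is based on, can be illustrated with a short Python sketch. The function name and the example weights are hypothetical; the only rule taken from the reply is that each survey's weights are rescaled to sum to that survey's sample size before the surveys are stacked.

```python
def standardize_to_sample_size(weights):
    """Rescale weights so they sum to the number of observations (w * n / N)."""
    n = len(weights)
    total = sum(weights)  # N in the notation of the post
    return [w * n / total for w in weights]


# Two hypothetical simple random samples, so weights are equal within each survey.
survey1 = [100.0, 100.0, 100.0]            # n1 = 3, N1 = 300
survey2 = [40.0, 40.0, 40.0, 40.0, 40.0]   # n2 = 5, N2 = 200

combined = standardize_to_sample_size(survey1) + standardize_to_sample_size(survey2)
halved = [w * 0.5 for w in survey1] + [w * 0.5 for w in survey2]

print(combined)  # every weight is 1.0, as the equal-weights logic requires
print(halved)    # 50.0 for survey 1 and 20.0 for survey 2, so not equal
```

With unequal weights the same function keeps the relative weights within a survey intact; for example, weights of 10 and 30 become 0.5 and 1.5.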

Laura Alexandra posted on Friday, April 20, 2018 - 7:23 am

Dear all
I have three questions concerning weighting in multilevel analysis.
1.) Is method A (Asparouhov 2006) equal to using the options: wtscale is cluster; bwtscale is unscaled; ?
2.) The sampling design included PPS sampling of schools and sampling the same number of students within schools (SRS within); hence the within weights depend on the size of schools. Is it correct to assume that the sampling is non-invariant if the size of schools correlates with the random effects of schools (their average outcomes)? Can I check this with post-estimation of (conditional) random effects for schools which I correlate with the (unscaled) within weight?
3.) In a post above you noted that with this two-stage sampling design only the school level weights are to be included in the multilevel analysis. I do not understand why the student level weights can be neglected and only over- and undersampling at the school level is relevant?

Many thanks for your reply!

Tihomir Asparouhov posted on Friday, April 20, 2018 - 9:31 am

1) it is equal to that but technically it is using bwtscale is sample (the between scaling doesn't affect model estimates)

2) If the sampling is SRS the within weight is 1. You are in Step 2 on page 24
http://statmodel2.com/download/asparouhovgmms.pdf

3) The within weight in multilevel modeling is different from what the student weight would be if you are using a single level model to say compute a population wide average. The within level weight for your situation is 1 because, given that a school is selected the probability of selecting a particular student is the same across students.

Bhuban Dhakal posted on Tuesday, September 17, 2019 - 7:41 pm

Hi Mplus team
I am working on complex survey data. I would like to apply a post-stratification weight to correct for errors in how the samples represent the strata at survey implementation, and a post-weight factor to adjust the survey to the total population. Stata provides clear code to account for these, as below:

"svyset [pweight=selweight], strata(region) vce(jackknife) poststrata(bmark_grps) postweight(pop)"

Can you please advise me on the Mplus code to account for poststrata and postweight? If there is none, is there any similar method?
I would like to use them in a bootstrap sampling model.
Thanks.

Tihomir Asparouhov posted on Wednesday, September 18, 2019 - 1:05 pm

Mplus doesn't have a separate post stratification command but it is fairly easy to adjust the sampling weight to incorporate that information. If the target population values for the groups are
N1, N2, N3, etc ...
and the corresponding sample values are
n1, n2, n3, etc ...
all you need to do is replace selweight with selweight*(Ni/ni) if the observation is in the i-th group. Essentially you are multiplying the selection weight by the poststratification weight (Ni/ni).
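
The adjustment can be tried out on a small example before creating the new weight variable. This is a minimal Python sketch with hypothetical names and numbers. It assumes that the "sample value" ni means the group total estimated from the sample using the selection weights, which is how the Stata poststratification documentation cited later in the thread defines it; if ni is instead meant as a raw count, the divisor changes accordingly.

```python
def poststratify(sel_weights, groups, population_totals):
    """Multiply each selection weight by N_i / n_i for its poststratum.

    Assumption: n_i is the weighted sample total of group i, i.e. the
    sum of the selection weights of the observations in that group.
    """
    weighted_totals = {}
    for w, g in zip(sel_weights, groups):
        weighted_totals[g] = weighted_totals.get(g, 0.0) + w
    return [
        w * population_totals[g] / weighted_totals[g]
        for w, g in zip(sel_weights, groups)
    ]


# Hypothetical data: five respondents in two poststrata.
selweight = [10.0, 10.0, 20.0, 20.0, 20.0]
group = ["f", "f", "m", "m", "m"]
targets = {"f": 50.0, "m": 30.0}  # known population totals N_i

adjusted = poststratify(selweight, group, targets)
print(adjusted)  # [25.0, 25.0, 10.0, 10.0, 10.0]
```

After the adjustment the weights in each group sum to that group's population total, which is the property poststratification is meant to enforce.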

Bhuban Dhakal posted on Wednesday, September 18, 2019 - 8:16 pm

Tihomir Asparouhov
Your answer is great. You gave a very naïve and clear answer to my problem with the poststratification weight. What about the postweight command? Does the above approach adjust the post-weight as well? Thanks again.

Tihomir Asparouhov posted on Thursday, September 19, 2019 - 9:10 am

That is the same thing. Note also that the strata command might have to be adjusted. If you have 2 regions (strata) and you are poststratifying by gender, you would need to have 4 strata in the Mplus analysis (2 regions * 2 genders). In Mplus you can do that by using something like this:
define: region=region*1000+bmark_grps;
variable: strata=region;
as long as both region and bmark_grps are less than 1000. If you don't adjust the strata you will not get the reduction in the SE. If the poststratification is not done for each region separately, however, you should not adjust the strata variable. You will need to consult the survey design description to verify this. If the poststratification is by, say, race, make sure that the poststratification weight adjustments are specific to each region and reflect the variation in the race distribution (for the target population) across regions. If this is not the case (I don't know why it wouldn't be; it is not an optimal design), don't adjust the strata. This strata adjustment concerns only the standard errors and not the point estimates.
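
The combined stratum code can be checked for uniqueness on the actual region and group codes before it is used. This is a minimal Python sketch of the same arithmetic as the DEFINE statement; the function name is hypothetical, and it assumes both variables are coded as non-negative integers below 1000.

```python
def combined_stratum(region, bmark_grp, base=1000):
    """Combine a region code and a poststratum code into one stratum id."""
    if not (0 <= region < base and 0 <= bmark_grp < base):
        raise ValueError("codes must be non-negative integers below the base")
    return region * base + bmark_grp


# Two regions crossed with two gender groups give four distinct strata.
strata = sorted(combined_stratum(r, g) for r in (1, 2) for g in (1, 2))
print(strata)  # [1001, 1002, 2001, 2002]
```

A code of 1000 or more in bmark_grps would spill into the region digits and could make two different cells share one stratum id, which is why the sketch rejects such values.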

You might find this useful
https://www.stata.com/manuals13/svypoststratification.pdf
which explains the same formulas.
Also
https://www.statmodel.com/download/webnotes/MplusNote921.pdf

Bhuban Dhakal posted on Sunday, January 26, 2020 - 5:13 pm

Hi Tihomir Asparouhov

According to Eun Sul Lee and Ronald N. Forthofer (2006), "the approximate degrees of freedom associated with this covariance matrix are the number of PSUs minus the number of strata. Therefore, the standard likelihood-ratio test for model fit should not be used with the survey logistic regression analysis. Instead of the likelihood-ratio test, the adjusted Wald test statistic is used".

I am an Mplus user. I could not find the adjusted Wald statistic in the Mplus output.

1. Can we produce any other similar statistics (an indicator of model fit) with simple code in the Mplus output?

2. If not, can you please point me to a reference or webpage where I could find Mplus code for estimating the adjusted Wald test? The Mplus code Dong Shuyang posted on Monday, May 14, 2018 {http://www.statmodel.com/cgi-bin/discus/discus.cgi?pg=prev&topic=13&page=24111} seems cumbersome and does not adjust for the number of strata in the degrees of freedom.
Thanks

Bhuban Dhakal posted on Sunday, January 26, 2020 - 5:17 pm

Sorry, I am talking about fit statistics of complex survey data model with non-response observations and post-stratification.

Tihomir Asparouhov posted on Monday, January 27, 2020 - 4:56 pm

1. You can use LRT and Wald Test to test the estimated model against the null model where all regression coefficients are fixed to zero. To use the LRT test follow the description here
Difference Testing Using the Loglikelihood
http://statmodel.com/chidiff.shtml
See also Section 5
https://nces.ed.gov/FCSM/pdf/2005FCSM_Asparouhov_Muthen_IIA.pdf

To use the Wald test, see page 772 in the User's Guide, i.e., the Model test command.

2. I don't think this is relevant for your case as it concerns inference based on multiple imputations.

Caroline F. D. Black posted on Thursday, July 23, 2020 - 4:21 pm

Dear Dr. Muthen,
I am running a two-level meta-analysis and meta-regression using Mplus. Our team has a particular method for calculating the "weighted effect size" which is very different from traditional approaches as we have multiple effect sizes (ES) per study. We created an ES level weight ((1/ES variance)* (1/#ES per study)), which I imagine is a "within" subject weight.

How do I run a within-subject weight? Right now output says "Weight variable (cluster-size scaling) ESWGHT." Below is my syntax.
Many thanks,
Caroline

BETWEEN =
p_casem_d;

CLUSTER =
ContLevelID;

WEIGHT =
eswght;

DEFINE:
CENTER effectsz (GRANDMEAN);

ANALYSIS:
TYPE = TWOLEVEL;

MODEL:
%WITHIN%
effectsz;

%BETWEEN%
effectsz ON
p_casem_d;

Tihomir Asparouhov posted on Friday, July 24, 2020 - 10:35 am

This setup looks like the right setup for using the within level weighting. The actual weights that are used in the estimation can be found using:
savedata: file=1.dat;
The weights are standardized within each cluster so that the sum of the weight within each cluster equals the number of observations in the cluster.
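
The standardization described here can be reproduced by hand to see what the saved weights should look like. This is a minimal Python sketch of that rule only (each weight multiplied by the cluster size and divided by the cluster's weight total); it is not Mplus's internal code, and the function name, cluster labels, and weights are hypothetical.

```python
def cluster_scale(weights, clusters):
    """Rescale weights so they sum to the cluster size within each cluster."""
    sums, counts = {}, {}
    for w, c in zip(weights, clusters):
        sums[c] = sums.get(c, 0.0) + w
        counts[c] = counts.get(c, 0) + 1
    return [w * counts[c] / sums[c] for w, c in zip(weights, clusters)]


# Hypothetical effect-size weights nested in two studies.
eswght = [2.0, 6.0, 1.0, 1.0, 1.0]
study = ["s1", "s1", "s2", "s2", "s2"]

scaled = cluster_scale(eswght, study)
print(scaled)  # [0.5, 1.5, 1.0, 1.0, 1.0]
```

Only the relative sizes of the weights within a cluster survive this scaling, so a weight that is constant within a cluster becomes 1 for every observation in it.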