
Xu, Man posted on Thursday, February 12, 2009  2:32 am



Dear Dr. Muthen, I have an individual-level weighting variable for a multilevel dataset. This variable combines the school weight and the weight of the individual within the school that the individual belongs to. I specified my model with WEIGHT in conjunction with TYPE=TWOLEVEL. I didn't specify a school weight using BWEIGHT because the weighting variable already contains the school weighting information. Is this appropriate? Or should I decompose the original weight into a within weight and a between weight for the estimation to be proper? Thank you very much! Xu, Man


It is not appropriate. You should try to decompose the original weight into within weight and between weight for the estimation to be proper. 

Xu, Man posted on Friday, February 13, 2009  4:22 am



Dear Tihomir, Thank you very much for your advice. The data do include a separate school weight variable. So to get the "pure" student weight, I just need to divide the overall weight of both school and student (which sums to the target population size) by the school weight, right? One complication that confuses me is that the data I am using (PISA) focus on the student population rather than the school population, so the school weights probably won't sum to the size of the school population. Do you think this is a problem for the analysis? Thank you very much. Xu, Man


Xu, this sounds correct: divide the overall weight of both school and student by the school weight. The fact that the school weights won't sum to the size of the school population is not a problem at all. These weights are rescaled by Mplus anyway to sum to the sample size from that school; see http://statmodel.com/download/Scaling3.pdf Tihomir
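The decomposition described above can be sketched in a few lines of Python, run outside Mplus. This is only an illustration under stated assumptions: the records, the school IDs, and the function names are hypothetical, and the rescaling step mimics the "sum to the sample size from that school" behavior mentioned in the reply rather than reproducing Mplus internals.

```python
# Hypothetical records: (school_id, overall_weight, school_weight).
# The overall weight is assumed to be school weight x within-school student weight.
records = [
    ("A", 120.0, 40.0),
    ("A", 80.0, 40.0),
    ("B", 90.0, 30.0),
    ("B", 60.0, 30.0),
    ("B", 30.0, 30.0),
]


def within_weights(rows):
    """Divide each overall weight by its school weight to get the student weight."""
    return [(school, overall / school_w) for school, overall, school_w in rows]


def scale_to_cluster_size(pairs):
    """Rescale weights so that, within each school, they sum to the school's sample size."""
    totals, counts = {}, {}
    for school, w in pairs:
        totals[school] = totals.get(school, 0.0) + w
        counts[school] = counts.get(school, 0) + 1
    return [(school, w * counts[school] / totals[school]) for school, w in pairs]


student_w = within_weights(records)        # the "pure" student weights
scaled = scale_to_cluster_size(student_w)  # sums to 2 in school A and 3 in school B

for school, w in scaled:
    print(school, round(w, 3))
```

The within weights would then go to WEIGHT and the school weights to BWEIGHT, as discussed in the thread.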

Xu, Man posted on Tuesday, February 17, 2009  3:11 pm



Dear Tihomir, Thank you very much for your reply. Your advice is greatly appreciated! Xu, Man 

Paul Norris posted on Wednesday, February 23, 2011  8:46 am



Dear Tihomir, I looked at your 2008 paper on rescaling weights for multilevel analysis but am wondering how best to implement weights with my data. I have a dataset that contains individuals (level 1) drawn from 21 countries (level 2). The data include sampling weights at level 1, weighting respondents in any given country to reflect the population of that country. The number of individuals sampled within each country varies (typically 2000 respondents per country but, for instance, 1 country provides 7000). The total of the weights within any country is equal to the sample size obtained from that country. Now, I assume that weighting on just these weights (with no adjustment) would cause the countries with larger samples to have greater leverage in the estimation of any model. Given that the sample size obtained within each country is a design artefact (question design was common, but the fieldwork within each country was left to local survey companies) rather than having a substantive interpretation, I would like to weight my model so that each country has equal impact on the model (my concern is with level 2 covariates). Is it acceptable to use the level 1 weights as they are and create level 2 weights to rebalance the impact of each country, or should I rescale my level 1 weights so that, when weighted, each country appears to have an equal sample size? Best wishes, Paul


Use the level 1 weights as they are and do not create level 2 weights. All countries will then have equal weight. Level 1 weights will not cause countries with larger samples to have greater leverage. 

Paul Norris posted on Tuesday, March 01, 2011  1:32 am



Dear Tihomir, Many thanks for your quick response. Paul 

Haigen Huang posted on Thursday, September 13, 2012  8:01 am



Dear Dr. Muthen, Do you know which weights I should use for the PISA 2009 data if I fit multilevel models in Mplus? Should I use W_FSTUWT (final student weight) at level one and W_FSCHWT (final school weight) at level two? Or should I divide W_FSTUWT by W_FSCHWT to generate a within-school weight, then use that at level one and the final school weight at level two? I will appreciate your help! Haigen


Both approaches will yield the same result. The weights on the within level are scaled to add up to the cluster sample size with the Mplus defaults. 
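A small Python check illustrates why the two approaches coincide once the within-level weights are scaled to the cluster sample size. It is a sketch under two assumptions: the school weight is constant for all students in a school, and the scaling is the simple "sum to the cluster sample size" rule described in the reply. The numbers and function name are made up and are not real PISA values.

```python
import math


def scale_to_cluster_size(weights):
    """Rescale one school's weights so they sum to the school's sample size."""
    total = sum(weights)
    return [w * len(weights) / total for w in weights]


# Hypothetical values for a single school.
final_student_wt = [150.0, 100.0, 50.0]  # stands in for W_FSTUWT
final_school_wt = 25.0                   # stands in for W_FSCHWT (same for every student)

# Approach 1: use the final student weight directly at level one.
scaled_raw = scale_to_cluster_size(final_student_wt)

# Approach 2: divide by the school weight first, then scale.
ratio = [w / final_school_wt for w in final_student_wt]
scaled_ratio = scale_to_cluster_size(ratio)

# The constant school weight cancels in the rescaling, so both lists agree.
same = all(math.isclose(a, b) for a, b in zip(scaled_raw, scaled_ratio))
print(scaled_raw, scaled_ratio, same)
```

Dividing every weight in a school by the same constant changes the numerator and the denominator of the rescaling by the same factor, which is why the result is unchanged.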


Hi, I am running a three-level model on PISA data (students, schools, countries) and am mainly interested in variables at the school level (level 2). Since PISA school-level sampling is indeed informative, I want to use the school weights, specifying B2WEIGHT = W_FSCHWT; this gives the following error message (no matter whether raw, normalized, or standardized school weights are used, or whether B2WTSCALE is SAMPLE): THE LOGLIKELIHOOD DECREASED IN THE LAST EM ITERATION. CHANGE YOUR MODEL AND/OR STARTING VALUES. THE MODEL ESTIMATION DID NOT TERMINATE NORMALLY DUE TO AN ERROR IN THE COMPUTATION. CHANGE YOUR MODEL AND/OR STARTING VALUES. THE H1 MODEL ESTIMATION DID NOT CONVERGE. CHI-SQUARE TEST AND SAMPLE STATISTICS COULD NOT BE COMPUTED. The model works fine without the weights. What can I do? Any help is appreciated.


When you weight the data, you change it so a model that fits the unweighted data may not fit the weighted data. Please send the output and your license number to support@statmodel.com. 


Dear Tihomir, I am working with school-level data based on nationally representative samples of 8th, 10th, and 12th grades. Sampling is conducted separately for each grade, and each school receives a sampling probability weight. I am running two-level random models to conduct multilevel mediation where the cluster variable is state (we are interested in state policy predictors). State was *not* part of the sampling procedure, but given the nature of the model, it is being used as the cluster. We want to combine 10th and 12th grades in one model, but as I noted, sampling was done independently for each grade. I want to use wtscale=cluster, but I don't know whether I should (a) scale the data separately by grade prior to the Mplus analysis, or (b) go ahead and scale the 10th and 12th grade data together during the Mplus analysis. If I scale the data prior to the Mplus analysis, I will retain the "separate" nature of the 10th and 12th grade samples, but my weights will not sum to the cluster N (given that the scaling was done separately by grade, but both grades are included in the model). Thank you for your assistance.


What model are you considering? You might be able to utilize some three-level models or multiple-group two-level models; see https://www.statmodel.com/examples/webnotes/webnote16.pdf


We are running two-level random MLR Monte Carlo models with a logit link (dichotomous variables for X, M, and Y; 2-1-1 mediation models). State policy does not differentiate between 10th and 12th grades; it just focuses on middle vs. high school. Because there are no policy differences, and also because we want to have as large an N as possible, we want to combine 10th and 12th grades.


Assuming that no school is present in both the 10th grade and 12th grade samples, I would suggest scaling the weights for 10th grade and for 12th grade separately, so that within each cluster the 10th grade weights sum to the 10th grade cluster sample size and the 12th grade weights sum to the 12th grade cluster sample size. Then use wtscale=unscaled. The scaling of the weights would have to be done outside of Mplus before the analysis.
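The grade-by-grade scaling suggested above might look like the following Python sketch, run before the data are passed to Mplus. The rows, state codes, and function name are hypothetical. Each (state, grade) cell is rescaled so that its weights sum to the number of schools sampled in that cell.

```python
from collections import defaultdict

# Hypothetical rows: (state, grade, sampling_weight); one row per school.
schools = [
    ("NY", 10, 4.0), ("NY", 10, 6.0),
    ("NY", 12, 1.0), ("NY", 12, 2.0), ("NY", 12, 3.0),
    ("TX", 10, 5.0),
    ("TX", 12, 2.0), ("TX", 12, 8.0),
]


def scale_by_cluster_and_grade(rows):
    """Rescale weights so each (state, grade) cell sums to its own sample size."""
    totals = defaultdict(float)
    counts = defaultdict(int)
    for state, grade, w in rows:
        totals[(state, grade)] += w
        counts[(state, grade)] += 1
    return [
        (state, grade, w * counts[(state, grade)] / totals[(state, grade)])
        for state, grade, w in rows
    ]


scaled = scale_by_cluster_and_grade(schools)

# Sum of the scaled weights per state, across both grades.
state_sums = defaultdict(float)
for state, grade, w in scaled:
    state_sums[state] += w
print(dict(state_sums))
```

Because each grade's weights sum to that grade's sample size within a state, the combined weights in a state sum to the state's total number of schools. The pre-scaled weights can then be used with wtscale=unscaled, as the reply suggests.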


Thank you so very much for your responses and help. It is much appreciated! 

Nemanja posted on Saturday, February 07, 2015  12:31 pm



Dear Dr. Muthen, Do you know which weights I should use for the PISA 2009 data? I am estimating peer effects by OLS, but I need to use sample weights because the sample is stratified. Thank you very much. Sincerely, Nemanja


You should check with whoever created the data set. 


Dear Mplus Team, I want to make sure I understand the following post correctly: Tihomir Asparouhov posted on Wednesday, February 23, 2011  4:49 pm "Use the level 1 weights as they are and do not create level 2 weights. All countries will then have equal weight. Level 1 weights will not cause countries with larger samples to have greater leverage." Does this mean that, for a three-level model (student, school, country) using L1 and L2 weights, all countries have equal weight in the estimation of L3 effects (no matter what the student or school sample size or population size is)? Thanks


Yes 

Anna Austin posted on Sunday, October 15, 2017  6:13 pm



Hello! I am conducting a latent class analysis with data from a complex survey. I am using TYPE=COMPLEX to account for weighting, stratification, and clustering. However, I would like to import the data into SAS to describe the prevalence of demographic characteristics of individuals belonging to each specific class. In doing so, I want to account for each individual's posterior probability (i.e., most likely class assignment), but still need to account for the survey weights. Can I simply multiply the posterior probabilities by the sample weights for use in the analysis?


Yes, however, I would recommend that you use these better/more accurate procedures. Simply place your variables in the auxiliary command in the model you are running. auxiliary=demog(bch); or auxiliary=demog(du3step); For more information look up these commands in the User's Guide as well as web notes 15 and 21 http://statmodel.com/download/webnotes/webnote15.pdf http://statmodel.com/examples/webnotes/webnote21.pdf 
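For readers who only want the simple descriptive calculation asked about (not the recommended BCH or DU3STEP procedures), a rough Python sketch follows. It assumes each record carries a survey weight, a vector of posterior class probabilities, and a 0/1 demographic indicator. All values and names are hypothetical, and the sketch produces point estimates only, with no design-based standard errors.

```python
# Hypothetical rows: (survey_weight, posterior probabilities per class, 0/1 indicator).
rows = [
    (2.0, [0.9, 0.1], 1),
    (1.0, [0.2, 0.8], 0),
    (3.0, [0.5, 0.5], 1),
    (4.0, [0.1, 0.9], 0),
]


def weighted_class_prevalence(data, n_classes):
    """Prevalence of the indicator in each class, using weight x posterior probability."""
    prevalence = []
    for k in range(n_classes):
        numerator = sum(w * p[k] * x for w, p, x in data)
        denominator = sum(w * p[k] for w, p, x in data)
        prevalence.append(numerator / denominator)
    return prevalence


prev = weighted_class_prevalence(rows, n_classes=2)
print([round(v, 3) for v in prev])
```

Each person contributes to every class in proportion to the product of the survey weight and the posterior probability for that class, which is the multiplication described in the question.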

Anna Austin posted on Tuesday, October 17, 2017  3:49 pm



Dr. Asparouhov, Thank you for your insight! I have used a three-step method to generate odds ratios to examine associations of various demographics with the latent classes. However, for some factors the 95% CIs are extremely wide due to the low number of individuals with a particular value of a factor in a given class (e.g., only 3% with education <12 years in one class). My plan was to supplement the odds ratios by describing the weighted prevalences as a way to explain why the 95% CIs are so wide. Do you have any thoughts?


I would recommend that you report the probability result instead of the odds ratios (which as you say tend to explode when the probabilities are small). You can use the DCAT option of the auxiliary command as well. 

Anna Austin posted on Wednesday, October 18, 2017  11:38 am



Thanks for your insight! Is the probability result the information found directly above the odds ratios in the output? This section of the output gives an estimate, SE, and p-value for each factor for two of the three classes. If, for example, a binary variable has an estimate of 1.392 (p-value=0.004) for class 1, how would this probability result be interpreted?


This doesn't look correct. Your input should look something like this:

variable: Names are u1-u5 u x;
          usevar are u1-u5;
          Categorical = u1-u5;
          Classes = c(2);
          Auxiliary = u(DCAT);
data:     file=prob.dat;
Analysis: Type = Mixture;
Model:
  %Overall%
  [c#1*0];
  %c#1%
  [u1$1*1 u2$1*1 u3$1*1 u4$1*1 u5$1*1];
  %c#2%
  [u1$1*1 u2$1*1 u3$1*1 u4$1*1 u5$1*1];

The output that the Auxiliary command generates for the variable u looks like this:

EQUALITY TESTS OF MEANS/PROBABILITIES ACROSS CLASSES

U
                 Prob    S.E.   Odds Ratio   S.E.   2.5% C.I.   97.5% C.I.
  Class 1
    Category 1   0.754   0.010    1.000     0.000     1.000       1.000
    Category 2   0.246   0.010    0.909     0.076     0.772       1.072
  Class 2
    Category 1   0.736   0.010    1.000     0.000     1.000       1.000
    Category 2   0.264   0.010    1.000     0.000     1.000       1.000

                 Chi-Square   P-Value   Degrees of Freedom
  Overall test      1.286      0.257            1
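The relationship between the probabilities and the odds ratio in the output above can be checked by hand. The Python sketch below assumes the printed odds ratio compares the odds of Category 2 in Class 1 with the odds in Class 2. That reading is inferred from the numbers shown here, not taken from Mplus documentation, and the function names are hypothetical.

```python
def odds(p):
    """Convert a probability into odds."""
    return p / (1.0 - p)


def odds_ratio(p, p_ref):
    """Odds of an outcome in one class relative to a reference class."""
    return odds(p) / odds(p_ref)


# Category 2 probabilities copied from the output above.
class1_cat2 = 0.246
class2_cat2 = 0.264

or_class1_vs_class2 = odds_ratio(class1_cat2, class2_cat2)
print(round(or_class1_vs_class2, 2))  # close to the printed 0.909, up to rounding of the inputs

# With small probabilities, a tiny absolute difference gives a large odds ratio.
small_or = odds_ratio(0.03, 0.003)
print(round(small_or, 1))
```

The second calculation shows why reporting probabilities can be more informative than odds ratios when a category is rare in some class: a difference of under three percentage points corresponds to an odds ratio above 10.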

Anna Austin posted on Wednesday, October 18, 2017  12:32 pm



Thanks for sharing the above code. I have been using Vermunt's 3-step approach (R3STEP) rather than Lanza's 1-step approach (DCAT). Is it possible to get probabilities using Vermunt's 3-step approach? The output does look different from what you have provided above.


It is possible using what we call the manual 3-step procedure; see Section 3 of http://statmodel.com/download/webnotes/webnote15.pdf You would simply declare the distal outcome as categorical in the last step. A shortcut approach is to use the DU3STEP option in the auxiliary command. For a binary variable the mean is the same as the probability (given 0/1 coding of the binary variable). If you just need descriptive values I would recommend that approach (just keep in mind that the SEs will be inferior to those obtained via the full manual 3-step approach, since the DU3STEP SEs are based on a continuous-variable assumption).

Anna Austin posted on Wednesday, November 22, 2017  9:11 am



Hi Dr. Asparouhov, I have tried declaring the distal outcome as categorical in the last step, but I get an error message indicating that I cannot declare a variable that is not a dependent variable as categorical. My model command is as follows:

MODEL:
  %OVERALL%
  CLASS on OUTCOME COV1 COV2 COV3;
  %class#1%
  [ModalC#1@0.603];
  %class#2%
  [ModalC#1@3.151];

Is there something wrong with my model command?


Send your output to Support along with your license number. 
