Xu, Man posted on Thursday, February 12, 2009 - 2:32 am
Dear Dr. Muthen,
I have an individual-level weighting variable for a multilevel dataset. This variable is a combination of both the school weight and the individual weight within the school that the individual belongs to. I specified my model with WEIGHT and TYPE=TWOLEVEL in conjunction. I didn't specify a school weight using BWEIGHT, since the weighting variable already contains the school weighting information.
Is this appropriate? Or should I try to decompose the original weight into within weight and between weight for the estimation to be proper?
It is not appropriate. You should try to decompose the original weight into within weight and between weight for the estimation to be proper.
Xu, Man posted on Friday, February 13, 2009 - 4:22 am
Thank you very much for your advice. I do have a separate school weight variable in the data. So in order to get the "pure" student weight, I just need to divide the overall weight of both school and student (which sums to the target population size) by the school weight, right?
A complication that confuses me is that the data I am using (PISA) focus on the student population rather than the school population, so the school weights probably won't sum to the size of the school population. Do you think this is a problem for the analysis?
This sounds correct: divide the overall weight of both school and student by the school weight.
The fact that the school weights won't sum to the size of the school population is not a problem at all. These weights are rescaled anyway by Mplus to sum up to the sample size from that school - see http://statmodel.com/download/Scaling3.pdf
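The two steps described in this exchange can be sketched outside Mplus. The following Python illustration (not Mplus code; all numbers and school IDs are made up) decomposes an overall student weight into a within-school weight by dividing out the school weight, then rescales within each school so the weights sum to that school's sample size, which is the cluster scaling described in Scaling3.pdf:

```python
# Hypothetical sketch: decompose overall student weights and rescale them
# within clusters. Values below are invented for illustration only.
from collections import defaultdict

# (school_id, overall_weight, school_weight) for each sampled student
students = [
    ("s1", 120.0, 40.0),
    ("s1", 150.0, 40.0),
    ("s2", 200.0, 50.0),
    ("s2", 180.0, 50.0),
    ("s2", 220.0, 50.0),
]

# Step 1: within-school weight = overall weight / school weight
within = [(sid, w_total / w_school) for sid, w_total, w_school in students]

# Step 2: rescale so each school's within weights sum to its sample size n_j
sums = defaultdict(float)
counts = defaultdict(int)
for sid, w in within:
    sums[sid] += w
    counts[sid] += 1

scaled = [(sid, w * counts[sid] / sums[sid]) for sid, w in within]

# After scaling, each school's weights sum to its sample size (2 and 3 here)
for sid in ("s1", "s2"):
    print(sid, sum(w for s, w in scaled if s == sid))
```

Step 2 is shown only to make the rescaling concrete; as noted above, Mplus performs this rescaling itself, so in practice only step 1 would be done by hand.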
Xu, Man posted on Tuesday, February 17, 2009 - 3:11 pm
Thank you very much for your reply. Your advice is greatly appreciated!
Paul Norris posted on Wednesday, February 23, 2011 - 8:46 am
I looked at your 2008 paper on rescaling weights for multilevel analysis but am wondering how best to implement weights with my data. I have a dataset containing individuals (level 1) drawn from 21 countries (level 2).
The data include sampling weights at level 1 - weighting respondents in any given country to reflect the population of that country. The number of individuals sampled within each country varies (typically 2000 respondents per country but, for instance, 1 country provides 7000). The total of weights within any country is equal to the sample size obtained from that country. Now, I assume, weighting just on these weights (with no adjustment) would cause those countries with larger samples to have greater leverage in the calculation of any model.
Given that the sample size obtained within each country is a design artefact (the questionnaire design was common, but fieldwork within each country was left to local survey companies) rather than having a substantive interpretation, I would like to weight my model such that each country has equal impact on the model (my concern is with level 2 covariates).
Is it acceptable to use the level 1 weights as they are, and create level 2 weights to rebalance the impact of each country, or should I rescale my level 1 weights so that when weighted each country appears to have an equal sample size?
Hi, I run a three-level model on PISA data (students, schools, countries) and I am mainly interested in variables at the school level (level 2). Since PISA school-level sampling is indeed informative, I want to use the school weights, specifying B2WEIGHT = W_FSCHWT;
This gives the following error message (regardless of whether raw, normalized, or standardized school weights are used, or whether B2WTSCALE = SAMPLE):
THE LOGLIKELIHOOD DECREASED IN THE LAST EM ITERATION. CHANGE YOUR MODEL AND/OR STARTING VALUES.
THE MODEL ESTIMATION DID NOT TERMINATE NORMALLY DUE TO AN ERROR IN THE COMPUTATION. CHANGE YOUR MODEL AND/OR STARTING VALUES.
THE H1 MODEL ESTIMATION DID NOT CONVERGE. CHI-SQUARE TEST AND SAMPLE STATISTICS COULD NOT BE COMPUTED.
The model works fine without the weights. What can I do? Any help is appreciated.
I am working with school-level data that is based on nationally-representative samples of 8th, 10th, and 12th grades. Sampling is conducted separately for each grade, and each school receives a sampling probability weight. I am running twolevel random models to conduct multi-level mediation where the cluster variable is state (we are interested in state policy predictors). State was *not* part of the sampling procedure, but given the nature of the model, it is being used as the cluster. We want to combine 10th and 12th grades in one model, but as I noted, sampling was done independently for each grade.
I want to utilize wtscale=cluster, but I don't know if I should (a) scale the data separately by grade prior to Mplus analysis, or (b) go ahead and scale the 10th and 12th grade data together during Mplus analysis. If I scale the data prior to Mplus analysis, I will retain the "separate" nature of the 10th and 12th grade samples, but my weight will not sum to the cluster N (given that it was done separately by grade, but both grades are included in the model).
We are running twolevel random MLR montecarlo models with logit link (dichotomous variables for X, M, and Y; 2-1-1 mediation models). State policy does not differentiate between 10th and 12th grades - it just focuses on middle vs. high school. Because there are no policy differences, and also because we want to have as large an N as possible, we want to combine 10th and 12th grades.
Assuming that no school is present in both the 10th grade and 12th grade samples, I would suggest scaling the weights for 10th grade and for 12th grade separately, so that the 10th grade weights sum to the 10th grade cluster sample size and the 12th grade weights sum to the 12th grade cluster sample size. Then use wtscale=unscaled. The scaling of the weights would have to be done outside of Mplus before the analysis.
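The grade-separate scaling suggested here can be done in a few lines before the Mplus run. The following Python sketch (not Mplus code; the state and weight values are invented) scales raw weights within each (cluster, grade) cell so they sum to that cell's sample size, which is what WTSCALE = UNSCALED then expects:

```python
# Hypothetical sketch: scale weights separately per (state, grade) cell so
# each cell's weights sum to its sample size. Data below are made up.
from collections import defaultdict

records = [
    # (state, grade, raw_weight)
    ("NC", 10, 2.0), ("NC", 10, 4.0),
    ("NC", 12, 1.0), ("NC", 12, 3.0), ("NC", 12, 2.0),
    ("VA", 10, 5.0), ("VA", 10, 5.0),
]

sums = defaultdict(float)
counts = defaultdict(int)
for state, grade, w in records:
    sums[(state, grade)] += w
    counts[(state, grade)] += 1

scaled = [(state, grade, w * counts[(state, grade)] / sums[(state, grade)])
          for state, grade, w in records]

# Each (state, grade) cell's scaled weights now sum to its sample size;
# these are the weights to feed to Mplus with wtscale=unscaled.
for key in sorted(set((s, g) for s, g, _ in records)):
    print(key, sum(w for s, g, w in scaled if (s, g) == key))
```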
Dear Mplus Team, I want to make sure I understand the following post correctly:
Tihomir Asparouhov posted on Wednesday, February 23, 2011 - 4:49 pm
"Use the level 1 weights as they are and do not create level 2 weights. All countries will then have equal weight. Level 1 weights will not cause countries with larger samples to have greater leverage."
--> Does this mean that, for a three-level model (student, school, country) using L1 and L2 weights, all countries have equal weight in the estimation of the L3 effects (no matter what the student or school sample size or population size is)?
Anna Austin posted on Sunday, October 15, 2017 - 6:13 pm
I am conducting latent class analysis with data from a complex survey. I am using TYPE=COMPLEX to account for weighting, stratification, and clustering. However, I would like to import the data into SAS to describe the prevalence of demographic characteristics of individuals belonging to each specific class. In doing so, I want to account for each individual's posterior probability (i.e., most likely class assignment), but still need to account for the survey weights. Can I simply multiply the posterior probabilities by the sample weights for use in analysis?
Anna Austin posted on Tuesday, October 17, 2017 - 3:49 pm
Thank you for your insight! I have conducted a three-step method to generate odds ratios to examine associations of various demographics with latent classes. However, for some factors the 95% CIs are extremely wide due to the low number of individuals with a particular value of a factor in a given class (e.g., only 3% with education <12 years in one class). My plan was to supplement the odds ratios by describing the weighted prevalences as a way to explain why the 95% CIs were so wide. Do you have any thoughts?
I would recommend that you report the probability result instead of the odds ratios (which as you say tend to explode when the probabilities are small). You can use the DCAT option of the auxiliary command as well.
Anna Austin posted on Wednesday, October 18, 2017 - 11:38 am
Thanks for your insight!
Is the probability result the information found directly above the odds ratios in the output? This section of the output gives an estimate, SE, and p-value for each factor for two of the three classes. If, for example, a binary variable has an estimate of 1.392 (p-value=0.004) for class 1, how would this probability result be interpreted?
Anna Austin posted on Wednesday, October 18, 2017 - 12:32 pm
Thanks for sharing the above code.
I have been using Vermunt's 3-step approach (R3STEP) rather than Lanza's 1-step approach (DCAT). Is it possible to get probabilities using Vermunt's 3-step approach? The output does look different from what you have provided above.
A shortcut approach is to use the DU3step option in the auxiliary command. For a binary variable the mean is the same as the probability (given 0/1 coding of the binary variable). If you just need descriptive values I would recommend that approach (just keep in mind that the SEs will be inferior to those obtained via the full 3-step manual approach, since the DU3step SEs are based on a continuous-variable assumption).
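As a rough descriptive check (not the Mplus 3-step machinery, and not a substitute for its standard errors), the kind of weighted prevalence discussed earlier in this thread can be approximated by weighting each case with its sampling weight times its posterior class probability. The Python sketch below uses entirely made-up data; for a 0/1 variable the resulting weighted mean is the class-specific prevalence:

```python
# Hypothetical sketch: approximate the weighted prevalence of a 0/1 covariate
# in each latent class, weighting each case by
# (sampling weight x posterior class probability). All data are invented.

cases = [
    # (x, sampling_weight, [posterior prob class 1, posterior prob class 2])
    (1, 1.5, [0.9, 0.1]),
    (0, 2.0, [0.8, 0.2]),
    (1, 1.0, [0.2, 0.8]),
    (0, 0.5, [0.1, 0.9]),
]

n_classes = 2
for c in range(n_classes):
    num = sum(x * w * p[c] for x, w, p in cases)
    den = sum(w * p[c] for x, w, p in cases)
    print(f"class {c + 1}: weighted prevalence = {num / den:.3f}")
```

Because the posterior probabilities are treated as fixed known quantities here, this ignores classification uncertainty in the inferential sense; it is only meant for the descriptive supplement discussed above.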
Anna Austin posted on Wednesday, November 22, 2017 - 9:11 am
Hi Dr. Asparouhov,
I have tried declaring the distal outcome as categorical in the last step, but I get an error message indicating that I cannot declare a variable that is not a dependent variable as categorical. My model command is as follows: