

Implicit and explicit stratification ... 



Hi! I'm using a merged TIMSS 8th grade dataset: achievement (L1) regressed on science teacher attributes (L2).

a) I want to include stratification, with implicit strata nested within explicit strata, nested within country (since these are not randomly selected, and I don't need multilevel/random modelling, cf. model-based inference). My recoded stratification variable is

CountryID*100000 + ExplicitStratum*1000 + ImplicitStratum

Should implicit strata be included at all, since they are mainly used for the systematic selection of sampling frame units? Secondly, is my horizontal approach valid?

b) TIMSS' SciWgt for science teacher analyses is "the (total?) student weight divided by the number of science teachers for the student". I ignore this weight because Mplus prefers weights by level (and I prefer to do it myself). I assume it is only the L1 weight that I divide by the number of science teachers? (The actual effect is clearly minimal.) Pseudocode:

Weight = StudentBaseWeight*StudentAdjWeight/#SciTeachers
WTScale = Cluster
B2Weight = ClassBaseWeight*ClassAdjWeight
B3Weight = SchoolBaseWeight*SchoolAdjWeight
BWTScale = Unscaled

c) The original school weight was calculated within implicit stratum within explicit stratum within country. Will BWTScale = Unscaled accommodate this?

Thanks!
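For readers following along, the recoding and the weight pseudocode in the question can be sketched in Python roughly as below. This only restates the poster's proposal; it is not a recommended weighting scheme. The function names, argument names, and the example values are hypothetical, and the range checks are an added assumption: the packing is only unique when the explicit stratum code fits in two digits and the implicit stratum code in three.

```python
def combined_stratum_id(country_id, explicit_stratum, implicit_stratum):
    """Pack country, explicit and implicit stratum codes into one integer,
    following CountryID*100000 + ExplicitStratum*1000 + ImplicitStratum."""
    # Assumption: codes must fit their digit slots, otherwise two different
    # strata could collapse into the same combined ID.
    if not 0 <= implicit_stratum < 1000:
        raise ValueError("implicit stratum code must have at most 3 digits")
    if not 0 <= explicit_stratum < 100:
        raise ValueError("explicit stratum code must have at most 2 digits")
    return country_id * 100000 + explicit_stratum * 1000 + implicit_stratum


def level_weights(student_base, student_adj, n_sci_teachers,
                  class_base, class_adj, school_base, school_adj):
    """Per-level weights exactly as written in the question's pseudocode."""
    if n_sci_teachers < 1:
        raise ValueError("a student needs at least one science teacher")
    return {
        "Weight": student_base * student_adj / n_sci_teachers,  # level 1
        "B2Weight": class_base * class_adj,                     # level 2
        "B3Weight": school_base * school_adj,                   # level 3
    }


# Hypothetical example values, chosen only to show the arithmetic.
stratum = combined_stratum_id(country_id=578, explicit_stratum=3,
                              implicit_stratum=12)
weights = level_weights(student_base=20.0, student_adj=1.25, n_sci_teachers=2,
                        class_base=2.0, class_adj=1.0,
                        school_base=40.0, school_adj=1.5)
```

The WTScale and BWTScale settings are Mplus options and are deliberately left out of this sketch.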


And an extension of question b): If I decide to incorporate the features of house weight and senate weight (see TIMSS 1999 Database User Guide, p.512), "manually" (i.e. not the preconstructed weight in the dataset) where would I place the transformed weight? That is, each country (stratum) is weighted either a) equally (senate weight), b) by actual sample size (house weight), or c) by population size (total weight). Is this related to the BWTScale? 
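As a rough illustration of the three weightings described in this post, the sketch below rescales a total weight within each country so that the weights sum either to a common constant (senate weight) or to the country's sample size (house weight). The field names `country` and `totwgt`, the function name, and the constant 500 are assumptions made for illustration; the exact definitions should be checked against the TIMSS user guide.

```python
from collections import defaultdict


def rescale_within_country(records, senate_total=500.0):
    """Add senate- and house-style weights to each record.

    records: list of dicts with hypothetical keys 'country' and 'totwgt'.
    senate_total: assumed common sum of senate weights in every country.
    """
    weight_sum = defaultdict(float)
    n_cases = defaultdict(int)
    for r in records:
        weight_sum[r["country"]] += r["totwgt"]
        n_cases[r["country"]] += 1

    rescaled = []
    for r in records:
        # Share of the country's total weight carried by this case.
        share = r["totwgt"] / weight_sum[r["country"]]
        rescaled.append({
            **r,
            "senwgt": share * senate_total,           # every country sums to the same total
            "houwgt": share * n_cases[r["country"]],  # every country sums to its sample size
        })
    return rescaled


# Hypothetical toy data: two cases in country A, one in country B.
sample = [
    {"country": "A", "totwgt": 10.0},
    {"country": "A", "totwgt": 30.0},
    {"country": "B", "totwgt": 100.0},
]
result = rescale_within_country(sample)
```

Both rescalings keep the relative weights of cases within a country unchanged; only the between-country contribution differs.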


That is an interesting question which I also have conceptually. 


Stephus,

a) This all looks fine to me.

b) In deciding on the weight variable, keep in mind that the weight has to be proportional to 1/(probability of selection) for the unit you are assigning the weight to (a unit at any level). From the above description it appears that the teachers were selected first and then the students of those teachers were selected (so a student with two science teachers would be twice as likely to be selected). I don't know if this really happened in the design; you will have to check the survey description. Be very cautious about modifying the weights. This is usually done by the survey design administrators and usually incorporates all the information. Granted, weights are usually designed for computing means, usually at the lowest level. When you do a multilevel type of analysis you are better off understanding the sampling design and computing the weights specifically for the model you want to estimate.

c) I would strongly discourage you from using the WTScale and BWTScale commands. Instead use the Mplus defaults. Those were selected after conducting extensive simulations in pure experimental settings and were chosen as the best performing. Unless you are willing to conduct your own simulations that resemble the TIMSS design and make a more informed call on the weight scaling, I would use the Mplus defaults.

d) Weights are assigned to the PSU, the secondary sampling unit, and the third sampling unit (if you are doing a 3-level model). They are not assigned to strata or to additional unmodeled clustering. If the design is a multistage design the weights are multiplied, so you should not split them even though you can. If, however, you are using "house" as a PSU, then the weights should be split.

For further input please take a look at http://statmodel.com/resrchpap.shtml

It is pretty important to match design issues with the assumptions of the estimating algorithms.
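The two rules in this reply (a weight is proportional to 1/probability of selection, and weights from successive stages of a multistage design multiply) can be illustrated with a minimal Python sketch. The function names and the selection probabilities are hypothetical and are not taken from the TIMSS design.

```python
def stage_weight(selection_probability):
    """Weight for one sampling stage: the inverse of the selection probability."""
    if not 0.0 < selection_probability <= 1.0:
        raise ValueError("selection probability must be in (0, 1]")
    return 1.0 / selection_probability


def overall_weight(stage_probabilities):
    """Overall weight of a lowest-level unit in a multistage design:
    the product of the per-stage weights."""
    weight = 1.0
    for p in stage_probabilities:
        weight *= stage_weight(p)
    return weight


# Hypothetical design: school sampled with probability 0.05, one of two
# classes sampled within the school, all students in the class taken.
school_w = stage_weight(0.05)
class_w = stage_weight(0.5)
student_w = stage_weight(1.0)
total_w = overall_weight([0.05, 0.5, 1.0])
```

In a multilevel analysis the per-stage weights would be attached to the units at their own level, whereas an overall weight like `total_w` is the kind typically supplied for single-level estimates such as means.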


Dear Dr Asparouhov, thank you so much! It all makes sense now.


