Regarding skewness in my sample

shrihari sridhar posted on Wednesday, November 14, 2007 - 9:04 am
hi,

I am trying to estimate a two-level multilevel model. I have 46 units that provide a total of 3200 observations of y and x. The problem is that 45 of the units contribute a total of 400 observations and the remaining 2800 come from a single unit, so the distribution of observations across units is very skewed. I don't want to throw out the last unit, since it contributes 2800 data points, but I also don't want to introduce severe skewness in my sample, so I employed the following procedure:

a) Take the 400 observations from the 45 units and add 100 more drawn randomly from the 2800 observations that the 46th unit provides.
b) Estimate the two-level model of the effect of the x vector on the DV y with a sample size of 400 + 100 = 500.
c) Repeat a-b 2000 times.
d) Obtain the mean value of the estimate and the mean value of the standard error, and divide the former by the latter to get the mean t-value.
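
Steps a-d above can be sketched roughly as follows. This is only an illustration under stated assumptions: the function names `ols_slope` and `repeated_subsample` are hypothetical, the data are passed as lists of (x, y) pairs, and a simple OLS slope stands in for the actual two-level model, which would be fitted in Mplus.

```python
import math
import random

def ols_slope(xs, ys):
    """Stand-in estimator (an assumption): OLS slope and its standard error.
    This is NOT a two-level model; swap in the real model fit here."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    sxx = sum((x - mx) ** 2 for x in xs)
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    slope = sxy / sxx
    sse = sum(((y - my) - slope * (x - mx)) ** 2 for x, y in zip(xs, ys))
    se = math.sqrt(sse / (n - 2) / sxx)
    return slope, se

def repeated_subsample(small, big, n_draw=100, n_reps=2000, seed=0, fit=ols_slope):
    """small: all (x, y) rows from the 45 small units (kept in every repetition).
    big: all (x, y) rows from the one large unit (subsampled each time)."""
    rng = random.Random(seed)
    estimates, std_errs = [], []
    for _ in range(n_reps):
        drawn = rng.sample(big, n_draw)  # step a: draw without replacement
        rows = small + drawn
        xs = [r[0] for r in rows]
        ys = [r[1] for r in rows]
        est, se = fit(xs, ys)            # step b: fit on the combined sample
        estimates.append(est)
        std_errs.append(se)
    return estimates, std_errs           # step c: one entry per repetition
```

Step d is then `sum(estimates) / len(estimates)` divided by `sum(std_errs) / len(std_errs)`. Keeping the per-repetition estimates, rather than only their mean, also makes it possible to look at how much the estimate varies from draw to draw.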

I know this is somewhat like bootstrapping, but it is also different because my sample is drawn without replacement (at least a large part of it). I have a few questions:

Does this seem like a reasonable approach? What would you do if you were me? What would be a good way to infer whether the estimates are statistically significant? Right now, I am using the mean value of the estimate and the mean value of the standard error (across the repetitions) to get the mean t-values.

thanks!
hari

Linda K. Muthen posted on Thursday, November 15, 2007 - 8:48 am
The approach sounds reasonable but I would use the IMPUTATION option of the DATA command for the analysis to get correct standard errors.

shrihari sridhar posted on Thursday, November 15, 2007 - 11:05 am
I will definitely do the same, but just out of curiosity, what would the imputation option be correcting for?

thanks!
hari

Bengt O. Muthen posted on Thursday, November 15, 2007 - 11:13 am
Instead of only using the average SE, you add the between-imputation variation in the parameter estimates. So it is a sort of within- plus between-imputation variance, in line with the literature on multiple imputation.
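
As a rough illustration of the within plus between idea, the usual combining rules from the multiple imputation literature (Rubin's rules) can be written as below. This is a sketch, not a description of what Mplus does internally: the function name `pool_rubin` is hypothetical, and treating each repeated subsample as if it were an imputed data set is itself an assumption.

```python
import math
import statistics

def pool_rubin(estimates, std_errs):
    """Pool m point estimates and their standard errors with Rubin's rules.

    within  = mean of the squared standard errors
    between = sample variance of the point estimates
    total   = within + (1 + 1/m) * between
    """
    m = len(estimates)
    if m < 2:
        raise ValueError("need at least two sets of estimates")
    pooled_est = statistics.mean(estimates)
    within = statistics.mean(se ** 2 for se in std_errs)
    between = statistics.variance(estimates)  # divides by m - 1
    total = within + (1 + 1 / m) * between
    return pooled_est, math.sqrt(total)
```

The pooled estimate divided by the pooled standard error then takes the place of the mean t-value. Because the between term is non-negative, the pooled standard error is never smaller than the square root of the average squared SE.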