June Zhou posted on Wednesday, October 05, 2011 - 8:19 pm
I am dealing with stratified cluster sampling data with missing values.
Q1: Can I do multiple imputation first and then use the WEIGHT option to adjust the standard errors of the estimates when analyzing the imputed data sets? If yes, is the original weight of each individual still meaningful, given that imputed data are used instead of the incomplete data? If no, what should I do?
Q2: What if I use replicate weights along with sampling weight to adjust standard errors in the analysis? Will I get a more accurate standard error than just using sampling weight with CLUSTER?
Q1. Sampling weights are currently not allowed during missing data imputation, so I would say that multiple imputation is not the best way to deal with missing data when you have sampling weights. Instead, use the MLR estimator on the original data set with the missing data. The MLR estimator will yield unbiased estimates if the data are missing at random (MAR).
Q2. Theoretically speaking, the two methods are equivalent and should produce identical results when a large number of replicate weights is used. However, if the replicate weights were supplied with the data rather than generated by you, I would recommend using them, because they may carry more information about the sampling method than the CLUSTER variable alone.
June Zhou posted on Sunday, October 09, 2011 - 11:59 am
Thank you very much for your prompt and clear reply, Dr. Asparouhov!
I ran a path analysis following your suggestion, using estimator=mlr to deal with missingness, and also used the WEIGHT option to obtain more accurate standard errors.
I compared the results with the one without incorporating sampling weights and I found that the standard errors became larger.
My question is: shouldn't the standard errors of the estimates become smaller when sampling weights are used? If a probability weight is used, tests of inference will most likely be significant because the software is working with the population size rather than the sample size.
June Zhou posted on Sunday, October 09, 2011 - 12:32 pm
Sorry, I have another question here.
When I ran the same path analysis using estimator=mlr and replicate weights this time, I got a warning saying that "Replicate weights are not available for estimator MLR".
Does it mean I cannot deal with missing data by using estimator=mlr and incorporating replicate weights simultaneously?
June Zhou posted on Sunday, October 09, 2011 - 1:35 pm
When using replicate weights in the analysis, we need to specify REPSE, right?
What's the REPSE for balanced repeated replications?
When you use weights, you change the data. Standard errors can be larger or smaller.
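As a rough illustration of why weighting can move standard errors in either direction, the following Python sketch computes a linearization-type standard error for a weighted mean. It assumes independent observations with no strata or clusters. The function name `weighted_mean_se` and the toy data are made up for illustration, and this is not how Mplus computes its standard errors.

```python
import math


def weighted_mean_se(y, w):
    """Return (weighted mean, linearization SE), assuming independent draws.

    Hypothetical helper for illustration only; ignores strata and clusters.
    """
    n = len(y)
    total_w = sum(w)
    mean = sum(wi * yi for wi, yi in zip(w, y)) / total_w
    # Sum of squared weighted residuals, with the usual n / (n - 1) correction.
    ss = sum((wi * (yi - mean)) ** 2 for wi, yi in zip(w, y))
    variance = (n / (n - 1)) * ss / total_w ** 2
    return mean, math.sqrt(variance)


y = [0, 10, 5, 5, 5, 5]                  # toy outcome values
equal = [1.0] * 6                        # unweighted analysis
down = [0.2, 0.2, 1.4, 1.4, 1.4, 1.4]    # extreme cases weighted down
up = [2.0, 2.0, 0.5, 0.5, 0.5, 0.5]      # extreme cases weighted up

se_equal = weighted_mean_se(y, equal)[1]
se_down = weighted_mean_se(y, down)[1]
se_up = weighted_mean_se(y, up)[1]
print(se_down < se_equal < se_up)
```

With equal weights the function reduces to the familiar s / sqrt(n). In this toy data the standard error shrinks when the high-weight cases sit close to the mean and grows when they sit far from it.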
With replicate weights, use ML. ML and MLR give the same parameter estimates and treat missing data in the same way. Only the standard errors differ, and with replicate weights they will be computed from the replicate weights.
See Example 13.18 which describes how to use replicate weights. See also the REPSE option in the user's guide.
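For readers unfamiliar with the idea behind replicate-weight standard errors, here is a minimal Python sketch of the textbook balanced repeated replication (BRR) formula. The model is estimated once with the full-sample weight and once per replicate weight, and the spread of the replicate estimates around the full-sample estimate gives the standard error. The function name `brr_standard_error` and the numbers are hypothetical, and the sketch is not taken from Mplus; the user's guide remains the reference for the REPSE settings.

```python
import math


def brr_standard_error(full_estimate, replicate_estimates, fay_k=0.0):
    """SE from balanced repeated replication (textbook formula).

    fay_k is Fay's perturbation factor; 0.0 gives plain BRR.
    """
    if not replicate_estimates:
        raise ValueError("need at least one replicate estimate")
    if not 0.0 <= fay_k < 1.0:
        raise ValueError("fay_k must be in [0, 1)")
    r = len(replicate_estimates)
    # Squared deviations of each replicate estimate from the full-sample one.
    ss = sum((est - full_estimate) ** 2 for est in replicate_estimates)
    variance = ss / (r * (1.0 - fay_k) ** 2)
    return math.sqrt(variance)


# Hypothetical numbers: one full-sample estimate and four replicate estimates.
se = brr_standard_error(10.0, [9.0, 11.0, 10.0, 10.0])
print(round(se, 4))
```

Other replication schemes, such as the jackknife, follow the same pattern but use a different multiplier in front of the sum of squared deviations.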
June Zhou posted on Tuesday, October 11, 2011 - 9:08 pm
Thank you very much for your suggestion, Dr. Muthen! It's a great help.
Stephanie posted on Tuesday, January 21, 2014 - 1:17 pm
I would like to impute missing data in a SEM by using DATA IMPUTATION and TYPE=IMPUTATION. As my model includes a binary dependent variable, I have to use WLSMV. Additionally, I have to use a sample weight, which I would like to include through the WEIGHT option in the VARIABLE command when running TYPE=IMPUTATION. So my first question is: is weighting possible in this case? And the second: if not, how can I solve the problem, as weighting is essential for my data?
I would recommend that you do the data imputation and the analysis in two separate steps. In step 1, do the imputation using type=basic, specify the categorical variables as such, and include the weight variable as one of the imputation variables. You cannot specify the weight variable as a weight in this step, but you can use it as a predictor when imputing the missing values, which extracts whatever information the weights carry about the imputed data. In step 2, use the weights as weights:
data: type=imputation; ...
variable: weight=w; ...
Stephanie posted on Wednesday, January 22, 2014 - 2:37 pm
Thank you very much for your kind support. The model now runs almost perfectly. I have one last question regarding the model fit statistics:
I get means and standard deviations for CFI, TLI, etc. Are these values the means over all imputed data sets, for example the mean of the CFI values? And if that is the case, is it acceptable to report these means in a research paper?