I am trying to impute missing data in a complex survey data set, and would appreciate your help in getting it right. For the design I have the variables strata for strata, com for clusters, and wt for individual weights. I did:
The best solution is to add the weight variable in the imputation
USEVARIABLES = ... wt;
and remove the command
WEIGHT = wt;
Usually the weight variable is computed from other variables such as race, gender, and SES. If that is the case, the best solution is to have these variables in the imputation instead of the weight variable.
You can also add dummy variables for each stratum if you want to use that information.
Bayesian estimation (which is used for the imputation) currently cannot use the weight variable directly.
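As a rough illustration of the advice above, an imputation input along these lines might look as follows. This is only a sketch: the file name survey.dat, the analysis variables y1-y3, the missing value code -999, the number of data sets, and the pre-built stratum dummies s2 and s3 are all hypothetical; only strata, com, and wt come from the question.

```
TITLE:    Imputation sketch with wt as an ordinary variable
DATA:     FILE = survey.dat;          ! hypothetical file name
VARIABLE:
  NAMES = y1 y2 y3 s2 s3 strata com wt;
  ! wt is listed as a regular variable; s2 and s3 are
  ! hypothetical stratum dummies already in the data file
  USEVARIABLES = y1 y2 y3 s2 s3 wt;
  MISSING = ALL (-999);               ! hypothetical missing code
  ! note: there is no WEIGHT = wt; statement here
ANALYSIS:
  TYPE = BASIC;
DATA IMPUTATION:
  IMPUTE = y1 y2 y3;
  NDATASETS = 20;                     ! arbitrary choice
  SAVE = imp*.dat;                    ! also writes implist.dat
```

If the weight is a function of known variables, those variables would replace wt in the USEVARIABLES list, as described above.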
The weight variable is actually based on sampling probabilities, and it depends on which stratum/cluster one is in and not on individual characteristics. I think that because I am going to include the stratum variable, the weight variable will not carry any additional information.
And so should I still do CLUSTER = com and TYPE = BASIC TWOLEVEL? Or should I do just TYPE = BASIC?
You should use TYPE = BASIC TWOLEVEL if you can, unless the cluster effects are very small. Look at the ICC of the variables, and also take a look at https://www.statmodel.com/download/Imputations7.pdf, in particular Section 3.3 and the other sections on multilevel imputations.
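For what it is worth, a two-level version of the imputation input could be sketched like this. The file name, y1-y3, the missing value code, and the number of data sets are hypothetical placeholders; com is the cluster variable from the question. The sample statistics from this run include the estimated intraclass correlations, which is one way to judge whether the cluster effects are small.

```
TITLE:    Two-level imputation sketch
DATA:     FILE = survey.dat;          ! hypothetical file name
VARIABLE:
  NAMES = y1 y2 y3 strata com wt;
  USEVARIABLES = y1 y2 y3;
  CLUSTER = com;                      ! cluster variable from the question
  MISSING = ALL (-999);               ! hypothetical missing code
ANALYSIS:
  TYPE = BASIC TWOLEVEL;
DATA IMPUTATION:
  IMPUTE = y1 y2 y3;
  NDATASETS = 20;                     ! arbitrary choice
  SAVE = imp2l*.dat;
```

If the ICCs turn out to be close to zero, dropping CLUSTER = com and using TYPE = BASIC would be the single-level alternative mentioned in the question.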
Is it possible to use the DEFINE command to create variables in multiply imputed datasets? I would like to obtain exogenous indicator variables from a multiply imputed ordinal variable. (This seems preferable to creating the indicators at the imputation stage, since that would sacrifice information about the ordinal relationship of the indicators.)
But after adding the define command, the input file that had been working normally now produces an output file with only input file instructions--without results or error messages. Also, the "Mplus" activity box does not show that multiple datasets are being analyzed.
mdehne posted on Tuesday, February 11, 2020 - 6:24 am
I was wondering whether my output is right after 25 multiple imputations. In videos concerning multiple imputations in Mplus (maybe of older Mplus versions), the output was printed without information regarding mean and standard deviation of my fit indices based on multiple imputations. I am using Mplus 8.4 and always get the means that to my understanding are my actual model fit indices. Am I right with this assumption?
You should get the average fit indices if the model you are running has those available. You can check that by running one of the imputed data sets alone. If you are using ML estimation with all continuous variables, then the actual combined fit index is computed. If this is not what is happening for you, send your example to firstname.lastname@example.org
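To make the check concrete, here is a sketch of an analysis of the imputed data sets. The list file implist.dat, the variables y1-y4, and the one-factor model are hypothetical stand-ins for whatever was actually saved and analyzed; the order of NAMES has to match the order reported in the imputation output.

```
TITLE:    Analysis of multiply imputed data (sketch)
DATA:
  FILE = implist.dat;                 ! hypothetical list of imputed files
  TYPE = IMPUTATION;
VARIABLE:
  NAMES = y1 y2 y3 y4;                ! must match the saved order
ANALYSIS:
  ESTIMATOR = ML;
MODEL:
  f BY y1 y2 y3 y4;                   ! hypothetical one-factor model
```

To run one imputed data set alone, the same input can be used with FILE pointing at a single file (for example a hypothetical imp1.dat) and the TYPE = IMPUTATION line removed. The fit indices that appear in that single run are the ones for which averages are reported in the multiple imputation run.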