I'm running multiple imputation. The data have a little clustering by community. Is it recommended or possible to account for this clustering during multiple imputation, or is clustering only handled in the final analyses on the imputed data?
Also, is it recommended that the iterations be fixed? I have been running the imputation model for over 8 hours (at 27000 iterations) and it has yet to converge.
I also have a complicated case of missingness, and I wonder whether there is an option for handling it while imputing. I have a categorically reported income variable that is reported for each member of each household. I need to impute the income for each household member (missingness is due to the "no response" and "don't know" category options).
However, I realized that the amount of missingness for these variables is inflated because households have varying numbers of members (e.g., some households have 2 members, others have 23). As a result, households with only 2 members look like they are missing values for household members 3-23, when those values are not actually missing; those members just don't exist.
Is there a way to only impute values for households that should have reported, but exclude households that appear missing due to smaller household member size?
Maybe use three-level imputation where family is the extra level, with the data in long format.
Those extra values don't hurt. You can remove them after Mplus imputes them. However, if the family units go as high as 23, I guess your data has many variables and that is why it is running slowly.
Consider also that if you are using wide format for the family, you are using a suboptimal imputation method. For example, the imputed value for family member 23 will be based only on the very few other families with 23 members. I think you want to use 3-level imputation. That is currently available only under H0 imputation, where you are required to specify the unrestricted imputation model.
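To illustrate why the long format avoids the inflated missingness, here is a minimal sketch in plain Python (not Mplus syntax). It assumes a hypothetical household-size variable `hh_size` and hypothetical codes 98/99 for "don't know"/"no response"; none of these names or codes come from the original data. Only slots for members who actually exist become rows, so only true item nonresponse is left to impute.

```python
# Hypothetical wide records: one row per household, one income slot per member.
# None marks an empty cell; 98/99 are ASSUMED "don't know"/"no response" codes.
MISSING_CODES = {98, 99}

wide = [
    {"hh_id": 1, "hh_size": 2, "inc": [3, 99, None, None]},
    {"hh_id": 2, "hh_size": 4, "inc": [1, 2, None, 98]},
]

def wide_to_long(rows):
    """Return one record per existing household member."""
    long_rows = []
    for row in rows:
        # Only the first hh_size slots are real people; later slots do not exist.
        existing = row["inc"][: row["hh_size"]]
        for member, value in enumerate(existing, start=1):
            # Nonresponse on a real member becomes None (a value to impute).
            income = None if value is None or value in MISSING_CODES else value
            long_rows.append(
                {"hh_id": row["hh_id"], "member": member, "income": income}
            )
    return long_rows

long_data = wide_to_long(wide)
to_impute = sum(1 for r in long_data if r["income"] is None)
```

In this toy data the wide layout shows five unusable cells, but only three of them belong to real household members; the other two disappear once the data are stacked.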
Ashley posted on Wednesday, November 12, 2014 - 1:57 pm
Thank you. This is very helpful. I am new to imputation using Mplus and unsure of the potential implications of using the long format to impute for my analyses after imputation (CFA and SEM).
Does Mplus allow for these analyses to be conducted in long format? Alternatively, would it be possible for me to change the format of the imputed datasets after imputation (from long back to wide)?
Also, after imputation, would I be able to create a summed variable in the imputed datasets for a total household income score prior to conducting my analyses (CFA and SEM)?
Search the user's guide for longtowide, widetolong, H0 imputation, example 11.7. Run some 3-level imputation with the user's guide data before you start with your data. After imputing you can form the total income variable.
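As a rough picture of what going from long back to wide involves after imputation, here is a small plain-Python sketch. It is not the Mplus longtowide option and does not reproduce its syntax; the field names `hh_id`, `member`, and `income` are hypothetical. Households with fewer members are padded with None, which marks members that do not exist rather than values that still need imputing.

```python
# Hypothetical long-format data after imputation: one record per existing member.
long_imputed = [
    {"hh_id": 1, "member": 1, "income": 3},
    {"hh_id": 1, "member": 2, "income": 2},
    {"hh_id": 2, "member": 1, "income": 1},
    {"hh_id": 2, "member": 2, "income": 2},
    {"hh_id": 2, "member": 3, "income": 4},
]

def long_to_wide(long_rows):
    """Return {hh_id: [income of member 1, member 2, ...]}, padded with None."""
    max_members = max(r["member"] for r in long_rows)
    wide = {}
    for r in long_rows:
        # Create the household's row of slots the first time it is seen.
        slots = wide.setdefault(r["hh_id"], [None] * max_members)
        slots[r["member"] - 1] = r["income"]
    return wide

wide_imputed = long_to_wide(long_imputed)
```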
Three level imputation is currently available only under H0 imputations where you are required to specify the unrestricted imputation model.
Ashley posted on Wednesday, November 12, 2014 - 6:42 pm
Thank you. However, I'm a little confused, but this might be due to my limited experience imputing.
If I impute using H0, then I am specifying my ultimate model (e.g., CFA or SEM) when I impute, is that correct?
If so, I'm confused about how I might be able to create the sum income variable (which is what I ultimately need for the CFA and SEM) if the imputed datasets are based on the individual income scores.
You may consider imputing the data at the item level and creating the sum score using the imputed data.
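A minimal sketch of that idea in plain Python (not Mplus syntax): the sum score is computed separately within each imputed dataset, after imputation. The two tiny datasets and the field names are made up for illustration, and the sketch assumes the income categories have already been recoded to numeric scores that are meaningful to add.

```python
from collections import defaultdict

# Two hypothetical imputed datasets in long format (one record per member).
imputed_sets = [
    [{"hh_id": 1, "income": 3}, {"hh_id": 1, "income": 2}, {"hh_id": 2, "income": 4}],
    [{"hh_id": 1, "income": 3}, {"hh_id": 1, "income": 1}, {"hh_id": 2, "income": 4}],
]

def household_totals(long_rows):
    """Sum member-level income scores within each household."""
    totals = defaultdict(int)
    for r in long_rows:
        totals[r["hh_id"]] += r["income"]
    return dict(totals)

# One set of household totals per imputed dataset; the totals can differ
# across datasets wherever a member's income was imputed.
totals_per_set = [household_totals(d) for d in imputed_sets]
```

The later analyses would then be run on each dataset's totals and the results combined across imputations, rather than averaging the totals into a single dataset first.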
Ashley posted on Thursday, November 13, 2014 - 10:58 am
Thank you. I just want to make sure I understand correctly.
1) My understanding was that if I used an H0 imputation model, I would be running the CFA analyses when I imputed, but it sounds like the CFA analyses will actually be run in a separate step. Is that right?
2) If I have #1 wrong and I do specify my model (CFA) when I impute, then I'll still be able to create the sum score on the imputed dataset before actually running my CFA analyses. Is that correct?
3) Also, if I impute using the long format with the H0 imputation model, will I be able to change the imputed dataset to wide format before actually running my analyses (as an interim step)?
I apologize for all the questions; I want to make sure I understand this correctly. Thank you in advance for your feedback.