Missing item vs missing entire scale
 Harold Chui posted on Tuesday, June 18, 2013 - 11:21 am
Dear Dr. Muthen,

I am planning to use three-level multilevel modeling to look at a study involving therapists, clients, and sessions. I have some missing data: some involves nonresponse to items and some involves noncompletion of entire scales for particular sessions. I plan to let full information maximum likelihood (FIML) take care of the missingness, but I first need to calculate total scores from individual item scores.

How should I go about doing this? Should I first conduct some kind of imputation for sessions with missing items to calculate total scores, and then run models using variables involving total scores and choose FIML as the option?

Thanks,
Harold
 Bengt O. Muthen posted on Tuesday, June 18, 2013 - 11:33 am
Why would you use total scores instead of analyzing on the item-level?
 Harold Chui posted on Tuesday, June 18, 2013 - 11:47 am
My predictor (e.g., social support) and outcome (e.g., depression) variables are total-score variables in the MLM.
 Bengt O. Muthen posted on Tuesday, June 18, 2013 - 2:59 pm
Ok, so those 2 constructs are total score variables. Then what is the construct that has individual missing items?
 Harold Chui posted on Tuesday, June 18, 2013 - 4:19 pm
Each total score variable is computed by adding up the scores for individual items. For example, for a scale that has 10 items, I would have 10 "variables." I compute the sum of the 10 scores to create the total score variable. Individual missing items means that I might have 2 missing scores out of the 10, such that I can't compute the total score by summation alone.

Does this mean that I need to use imputation first to fill in the missing items, then compute the total score, before I run models on the total score variables using FIML?
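
A minimal sketch of the problem described above, in Python and with made-up responses (the item values are purely illustrative): once any item is missing, a plain sum is itself missing, and summing only the answered items gives a total that is not on the same footing as a complete 10-item total.

```python
import math

# Hypothetical responses to a 10-item scale for one session;
# NaN marks an item the client did not answer.
items = [3, 2, math.nan, 4, 1, math.nan, 2, 3, 4, 2]

# A plain sum propagates the missing value, so the total is missing too.
naive_total = sum(items)

# Summing only the answered items gives a number, but it covers
# 8 items rather than 10 and is not comparable to a full total.
answered = [x for x in items if not math.isnan(x)]
partial_total = sum(answered)

print(naive_total, len(answered), partial_total)
```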
 Bengt O. Muthen posted on Tuesday, June 18, 2013 - 5:05 pm
I am suggesting that you don't create a total score but instead work with the 10 items as indicators of a factor. Missingness is then handled by FIML.
 Corey Savage posted on Thursday, January 28, 2016 - 1:48 am
I am missing data on a number of Rasch scales. Some of these I intend to use as distal outcomes in a latent profile analysis and the others are used as indicators of the latent profiles. Is it appropriate to impute using MI with this type of scale? Thanks!
 Bengt O. Muthen posted on Friday, January 29, 2016 - 6:21 pm
It is a bit dicey to impute when you expect there to be a mixture structure underlying your data because the imputation does not take that into account. You can handle it by "FIML" if you do a 1-step mixture analysis, but if you use a 3-step approach then cases with missing values on the distal outcomes will be deleted.
 Corey Savage posted on Friday, January 29, 2016 - 7:42 pm
Would it be appropriate to fit the model first with FIML and then allow listwise deletion afterwards, when adding the covariates and the distal outcome via BCH?
 Bengt O. Muthen posted on Monday, February 01, 2016 - 10:12 am
If you are concerned with missing data I would use 1-step FIML mixture modeling.
 Daniel Sort posted on Monday, October 10, 2016 - 3:56 am

I want to predict Y (healthiness of food consumption) by X1 (being vegetarian or not), two moderators of X1 (e.g. number of vegetarians in social networks, 3-4 items), and some controls, using Mplus.

For Y, I have frequency of consumption for 30 products (P1-P30) and an expert rating of the healthiness of each (R1-R30). I used the frequencies P in an LCA and used the resulting types of consumption. A reviewer now asks me to use a sum score instead, of the kind: Y = P1*R1 + P2*R2 + ...

My uncertainties in dealing with this in Mplus:
1. I cannot model a factor since P1-P30 are weakly correlated. Moreover, I understood that a composite/formative variable will not work as an endogenous variable. So, I'm left to use DEFINE to compute a manifest index variable!?
2. Doing so, I could not use FIML for the imputation of missing values in P1-P30. I have to impute, otherwise I will lose 300 of 800 cases. If I use MI, would you advise doing it on the individual variables, or on the sum score?
3. If I use MI, should I also use it for the moderator variables, or should I let FIML take care of those?

 Bengt O. Muthen posted on Monday, October 10, 2016 - 2:47 pm
If P1-P30 are weakly correlated it won't help to do multiple imputation.

Also, is the missing on the Ps really at random or do the missingness patterns correspond to different consumption types? If they are random, why not take the average of the Ps that are not missing? This can be done outside Mplus.
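
A minimal sketch of the "average of the Ps that are not missing" idea, done outside Mplus in Python. The function names and the example values are hypothetical. The rating-weighted variant is an assumption that carries the same idea over to the reviewer's P*R index, and it assumes the expert ratings R are complete.

```python
import math

def available_mean(values):
    """Mean of the non-missing entries; NaN if every entry is missing."""
    observed = [v for v in values if not math.isnan(v)]
    return sum(observed) / len(observed) if observed else math.nan

def weighted_available_mean(freqs, ratings):
    """Mean of P_i * R_i over the products whose frequency P_i is observed.

    A missing P_i makes the product NaN, so that product is skipped.
    Ratings are assumed to be complete (an assumption, not from the thread).
    """
    return available_mean([p * r for p, r in zip(freqs, ratings)])

# Hypothetical respondent with three products; the second frequency is missing.
freqs = [2.0, math.nan, 4.0]
ratings = [1.0, 3.0, 0.5]

print(available_mean(freqs))                    # mean of 2.0 and 4.0
print(weighted_available_mean(freqs, ratings))  # mean of 2.0*1.0 and 4.0*0.5
```

Averaging rather than summing keeps respondents with different numbers of answered products on the same scale. The result can then be read into Mplus as an ordinary observed variable.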
 Daniel Sort posted on Tuesday, October 11, 2016 - 7:16 am
Many thanks! [OT: I'm really deeply impressed by the efforts your team put into this board. It would be interesting to know how many research projects you've helped to realize (either by showing a workable solution or by saving substantial amounts of time).]

- There doesn't seem to be a relation between the consumption types and the missingness patterns (no clustering of certain missingness patterns in the consumption types).

- Indeed, what I have done so far is to substitute missing values for each P with the median (since frequency was measured on a very ordinal scale) of members of the same group and consumption type in Stata, and then use that data set in Mplus.
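
The median substitution described above was done in Stata; the following is only a rough Python equivalent under stated assumptions. The key names "group" and "ctype" (consumption type), the item name "P1", and the data are all hypothetical.

```python
import math
from collections import defaultdict
from statistics import median

def impute_group_median(rows, item, keys=("group", "ctype")):
    """Return copies of rows with missing `item` replaced by its cell median.

    A cell is one combination of the key variables. If a cell has no
    observed values, the missing entry is left as NaN.
    """
    # Collect the observed values of the item within each cell.
    observed = defaultdict(list)
    for row in rows:
        if not math.isnan(row[item]):
            observed[tuple(row[k] for k in keys)].append(row[item])

    # Fill missing entries from the cell median, leaving the input untouched.
    filled = []
    for row in rows:
        new_row = dict(row)
        cell = tuple(row[k] for k in keys)
        if math.isnan(row[item]) and observed[cell]:
            new_row[item] = median(observed[cell])
        filled.append(new_row)
    return filled

rows = [
    {"group": "veg", "ctype": 1, "P1": 2.0},
    {"group": "veg", "ctype": 1, "P1": 4.0},
    {"group": "veg", "ctype": 1, "P1": 5.0},
    {"group": "veg", "ctype": 1, "P1": math.nan},
    {"group": "non", "ctype": 2, "P1": math.nan},  # no observed values in this cell
]
filled = impute_group_median(rows, "P1")
print(filled[3]["P1"], filled[4]["P1"])
```

With an even number of observed values, `statistics.median` averages the two middle ones, which may fall between categories of an ordinal scale; `statistics.median_low` could be swapped in if the imputed value should always be an observed category.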
 Bengt O. Muthen posted on Tuesday, October 11, 2016 - 12:03 pm
ok.