

Use MLR when missings are not MAR or ... 


Gerine L posted on Friday, December 05, 2014  4:43 am



Question: Can you use the MLR estimator if you have removed specific values for specific cases (e.g., removed outliers on a specific variable)?

Details: I have a very simple model: outcome on S M P. S, M, P and outcome are continuous variables (skewed, i.e., not normally distributed). My data were collected among around 1350 children in classrooms. S and M are self-report variables. P is based on other analyses (social relations modeling). Every kid in a classroom rates every other kid. For each child, we combine the evaluations they get from every other kid; this value is corrected for lots of things in the Social Relations Model. Using this method, you also get a reliability measure for each classroom. It is recommended to remove classrooms with low reliability (i.e., if a few kids don’t fill out the evaluations seriously, this affects ratings for every other kid in that classroom). I have removed around 120 values for P (4 classrooms) based on the analyses in the Social Relations Model.

I understand that MLR uses FIML estimation. I saw that for FIML, missings have to be MAR or MCAR. I don’t think that is the case for my P variable. Neither would it be the case if you remove certain values due to outliers, for instance. What can I do?


To avoid deleting classrooms, perhaps you can use your reliability measure in some way as a proxy for the reliability of a kid's P value. SEM can handle single-indicator factors with known/fixed reliability.
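
As a rough illustration of the fixed-reliability idea: in the usual single-indicator setup, the factor loading is fixed at 1 and the residual variance of the indicator is fixed at (1 - reliability) times its sample variance. The Python sketch below only computes that fixed value. The function name, the P scores, and the reliability of 0.80 are all hypothetical, and it assumes one reliability value for the whole sample, whereas the reliability described in this thread is per classroom.

```python
import statistics


def fixed_error_variance(values, reliability):
    """Residual variance to fix for a single-indicator factor.

    Uses the standard relation: error variance = (1 - reliability) * Var(indicator).
    `values` are the observed indicator scores; `reliability` is assumed known.
    """
    if not 0.0 <= reliability <= 1.0:
        raise ValueError("reliability must be between 0 and 1")
    # Sample variance (n - 1 denominator) of the observed indicator
    observed_variance = statistics.variance(values)
    return (1.0 - reliability) * observed_variance


# Hypothetical P scores and a hypothetical reliability of 0.80 (not from the thread)
p_scores = [2.0, 4.0, 4.0, 4.0, 5.0, 5.0, 7.0, 9.0]
theta = fixed_error_variance(p_scores, 0.80)
print(round(theta, 4))
```

The resulting number is what would be entered as the fixed residual variance of P in the model; with classroom-specific reliabilities, a single fixed value like this is only an approximation.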

Gerine L posted on Friday, December 05, 2014  11:54 am



Thank you for that suggestion! Could you maybe point out where in the manual I could find an example of that? I am a bit unsure about including the scores for those classrooms though, because it is recommended not to include these scores. (I do have to note that this method was developed to measure much smaller groups, e.g., scores within families; this is one of the first studies in which a round-robin design is used for ratings in an entire classroom. Thus, excluding 1 family is less drastic than excluding 1 classroom of 30 kids.) In addition, I was wondering, more on a general level, how you would usually advise dealing with such cases. For instance, imagine an analysis in which, for some other reason, some values have been removed for some variables but not others (e.g., removing outliers, or removing values for participants who got the wrong instructions on some task).


I think removing classrooms can introduce bias through selective missingness, as you first indicated, and that is hard to compensate for in the modeling. I am not familiar with modeling these types of data. You may want to ask on SEMNET. Single-indicator modeling with fixed reliability is shown in our handout and video for Topic 1 on our website. It is a standard SEM approach, but it may be too novel in your setting.


