Mplus Discussion >> Confirmatory factor analysis with NMAR data



Jukka Matias Marjanen posted on Monday, September 08, 2014 - 11:10 am

I have data consisting of pupils' grades in different school subjects. I'm trying to fit a bi-factor model with a general factor and specific factors for math/science and language subjects. The problem is that for some of the subjects the students can decide whether to take an advanced or a basic course, so the data are NMAR.

I guess this biases the estimation, but as far as I understand, some of the bias may be corrected by using a selection model to account for the missing data (Muthen 1987). All the simple examples I've been able to find deal with longitudinal data, and I have not been able to figure out how to apply them in the context of CFA. So how would I specify the model (what is the syntax) in the following situation:

- I have 4 subjects (Gen1-Gen4) that load only onto the general factor

- Math/Sci subjects load onto the general and math factors, and one of the subjects is divided into a basic and an advanced course (Variables: Mat1-Mat2, Mat3Adv, Mat3Bas)

- The language factor is similar to the math/sci factor (Variables: Lan1-Lan2, Lan3Adv, Lan3Bas)

So my problem is not the bi-factor model itself, but the selection model. How many latent factors should this model have? Do I use the grades themselves or dichotomous variables as indicators? What correlates with what and so on?

Many thanks for your help!

Bengt O. Muthen posted on Monday, September 08, 2014 - 5:12 pm

The data may still be MAR if the variables that are observed for a subject predict his/her missingness.

For instance, if all people have data on gen1-gen4 and mat1-mat2 and are missing on only either mat3adv or mat3bas, perhaps it is reasonable that the choice of taking mat3adv or mat3bas (and therefore being missing on the other) is predicted by those observed variables.

In this case you simply specify the bi-factor model as usual, and with ML estimation you will get estimates assuming MAR by default.
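
A minimal simulation sketch of the MAR argument, in Python, may help make it concrete. Everything in it is an assumption chosen for illustration (the sample size, the coefficients, the logistic choice rule, and the simplification to one always-observed grade `x` and one advanced-course grade `y`); it is not Mplus output and not the bi-factor model itself. When the choice of course depends only on the observed grade, the mean of the observed advanced-course grades is biased, while the ML estimate for a bivariate normal with this missing-data pattern (regress `y` on `x` among takers, then evaluate at the full-sample mean of `x`) is close to the population value.

```python
import math
import random
import statistics

random.seed(2014)

N = 20000
TRUE_MEAN_Y = 0.0  # population mean of the advanced-course grade (simulation setting)

x_all, x_obs, y_obs = [], [], []
for _ in range(N):
    x = random.gauss(0.0, 1.0)            # grade observed for every pupil
    y = 0.7 * x + random.gauss(0.0, 0.7)  # grade the pupil would get in the advanced course
    x_all.append(x)
    # MAR: the probability of choosing the advanced course depends on x only
    takes_advanced = random.random() < 1.0 / (1.0 + math.exp(-2.0 * x))
    if takes_advanced:
        x_obs.append(x)
        y_obs.append(y)

# Naive estimate: mean of y among pupils who took the advanced course
cc_mean = statistics.fmean(y_obs)

# ML under MAR for this pattern: regress y on x among takers,
# then predict at the mean of x over ALL pupils
mx, my = statistics.fmean(x_obs), statistics.fmean(y_obs)
sxy = sum((a - mx) * (b - my) for a, b in zip(x_obs, y_obs))
sxx = sum((a - mx) ** 2 for a in x_obs)
slope = sxy / sxx
intercept = my - slope * mx
ml_mean = intercept + slope * statistics.fmean(x_all)

print(f"observed-only mean of y: {cc_mean:.3f}")
print(f"ML-under-MAR mean of y:  {ml_mean:.3f}")
```

If the choice rule also depended on `y` itself, the missingness would be NMAR and the second estimate would no longer be protected.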

Jukka Matias Marjanen posted on Monday, September 08, 2014 - 11:53 pm

Thanks for your quick reply!

I think I'll go ahead and assume MAR data then, it seems quite plausible. But just out of interest, how would the selection model work if I was going to give it a try? Or does it even make any sense in this situation?

Bengt O. Muthen posted on Tuesday, September 09, 2014 - 10:16 am

When you say selection model and refer to Muthen 1987, do you refer to the Muthen-Kaplan-Hollis paper or something else?

Jukka Matias Marjanen posted on Wednesday, September 10, 2014 - 12:07 am

Yes, that is the one I meant.

Bengt O. Muthen posted on Wednesday, September 10, 2014 - 12:49 pm

That article actually shows how to do ML under MAR using a multiple-group approach where groups correspond to missing data patterns. That now has mostly historical and pedagogical value. You get the same thing just using the Mplus ML (or MLR) default, without having to split the data into missing data pattern groups.
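
The equivalence can be sketched with a toy Python example. The data, the parameter values, and the helper names (`normal_logpdf`, `bivariate_logpdf`, `case_loglik`) are all hypothetical, and this is not how Mplus is implemented. The observed-data log-likelihood for a bivariate normal is accumulated once case by case and once by first splitting the cases into missing data pattern groups; the two totals agree.

```python
import math
from collections import defaultdict

def normal_logpdf(v, mean, var):
    """Log density of a univariate normal."""
    return -0.5 * (math.log(2 * math.pi * var) + (v - mean) ** 2 / var)

def bivariate_logpdf(a, b, mu, var, cov):
    """Log density of a bivariate normal with means mu, variances var, covariance cov."""
    det = var[0] * var[1] - cov ** 2
    da, db = a - mu[0], b - mu[1]
    quad = (var[1] * da * da - 2 * cov * da * db + var[0] * db * db) / det
    return -0.5 * (2 * math.log(2 * math.pi) + math.log(det) + quad)

def case_loglik(case, mu, var, cov):
    """Each case contributes the density of whatever it has observed."""
    a, b = case
    if a is not None and b is not None:
        return bivariate_logpdf(a, b, mu, var, cov)
    if a is not None:
        return normal_logpdf(a, mu[0], var[0])
    if b is not None:
        return normal_logpdf(b, mu[1], var[1])
    return 0.0

# Toy data: None marks a missing value (hypothetical numbers)
data = [(1.2, 0.8), (0.3, None), (-0.5, -0.1), (None, 0.9), (2.0, None), (0.1, 0.4)]
mu, var, cov = (0.5, 0.3), (1.0, 0.8), 0.4  # arbitrary parameter values

# Casewise accumulation over the whole sample
casewise = sum(case_loglik(c, mu, var, cov) for c in data)

# Multiple-group accumulation: one group per missing data pattern
groups = defaultdict(list)
for c in data:
    pattern = tuple(v is not None for v in c)
    groups[pattern].append(c)
grouped = sum(
    sum(case_loglik(c, mu, var, cov) for c in cases) for cases in groups.values()
)

print(math.isclose(casewise, grouped, abs_tol=1e-9))
```

Because both routes evaluate the same likelihood, maximizing either one gives the same estimates; the pattern groups are only a bookkeeping device.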

Jukka Matias Marjanen posted on Wednesday, September 10, 2014 - 11:42 pm

Ok. Thanks anyway!