Mplus Discussion >> EFA - Estimator with Planned Missing Data

Yaacov Petscher posted on Tuesday, March 15, 2011 - 10:54 am

I presently have data on 158 categorical (1/0) items which were administered to 1,000 individuals. The data were collected in a planned missing data design, where a total of five different forms were used. Each form had a set of unique and common items (approximately 20% of the 158 are common across the five forms). When running an EFA on these data, there are a number of missing cells due to the nature of the design. The output I get from the analysis using WLSMV or ULS produces the WARNING: BIVARIATE TABLE OF X_ AND X_ HAS AN EMPTY CELL. Although I can get AIC/BIC using ML, I'm interested in the other ancillary statistics for model evaluation. Is there a way to control for this type of design and use the WLSMV or ULS estimators?
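
For readers following along, the coverage problem in a design like this can be sketched in a few lines of Python. The form layout is an assumption made only for illustration: 32 common items, with the remaining 126 dealt round-robin across the five forms. It is not the actual design described in the post. The sketch counts item pairs that never appear together on any form, which is one plausible reason limited-information estimators have no bivariate table to work with for those pairs.

```python
from itertools import combinations

def build_forms(n_items, n_common, n_forms):
    """Return one set of item indices per form (hypothetical layout).

    Items 0..n_common-1 appear on every form; the remaining items are
    dealt round-robin so each one appears on exactly one form.
    """
    common = set(range(n_common))
    forms = [set(common) for _ in range(n_forms)]
    for item in range(n_common, n_items):
        forms[(item - n_common) % n_forms].add(item)
    return forms

def uncovered_pairs(n_items, forms):
    """Item pairs that are never administered together on any form."""
    return [
        (a, b)
        for a, b in combinations(range(n_items), 2)
        if not any(a in form and b in form for form in forms)
    ]

# Assumed layout: 158 items, 32 of them common, 5 forms.
forms = build_forms(n_items=158, n_common=32, n_forms=5)
missing = uncovered_pairs(158, forms)
total_pairs = 158 * 157 // 2
print(len(missing), "of", total_pairs, "item pairs are never jointly observed")
```

Under this assumed layout, every pair of unique items taken from two different forms is uncovered, so the share of such pairs grows quickly as the proportion of common items shrinks.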

Linda K. Muthen posted on Wednesday, March 16, 2011 - 8:55 am

You would need to use multiple group analysis where each form is a group. EFA does not allow multiple groups but you could use ESEM. See the examples at the end of Chapter 5.

Selahadin Ibrahim posted on Tuesday, August 16, 2011 - 6:19 am

Dear Linda/Bengt,
I am doing an exploratory factor analysis on binary items measuring safety culture in firms. Some items are not applicable to smaller firms and are recorded as missing. What is the best method to deal with this missingness? I have a variable measuring firm size.

thanks,

Linda K. Muthen posted on Tuesday, August 16, 2011 - 10:26 am

Using maximum likelihood estimation would be the best, but with binary items each factor adds one dimension of integration, which makes EFA computationally difficult. You could use DATA IMPUTATION to obtain imputed data sets and then use TYPE=IMPUTATION and weighted least squares to do the EFA.
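
As a rough illustration of what combining results over imputed data sets involves, the sketch below pools one parameter with Rubin's rules: the point estimates are averaged, and the pooled variance is the average within-imputation variance plus (1 + 1/m) times the between-imputation variance. The function name and the numbers are made up for illustration and are not Mplus output; it is a sketch of the general technique, not a description of the exact computations Mplus performs.

```python
from math import sqrt
from statistics import mean, variance

def pool_rubin(estimates, std_errors):
    """Pool one parameter across m imputed data sets using Rubin's rules.

    Returns (pooled_estimate, pooled_standard_error).
    """
    m = len(estimates)
    pooled = mean(estimates)
    within = mean([se ** 2 for se in std_errors])  # average sampling variance
    between = variance(estimates)                  # sample variance, divisor m - 1
    total = within + (1 + 1 / m) * between
    return pooled, sqrt(total)

# Hypothetical loading estimates and standard errors from 3 imputed data sets.
est, se = pool_rubin([0.5, 0.6, 0.7], [0.1, 0.1, 0.1])
print(round(est, 3), round(se, 3))
```

The between-imputation term is what carries the extra uncertainty due to the missing values, so the pooled standard error is larger than any single-data-set standard error whenever the estimates differ across imputations.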

Selahadin Ibrahim posted on Tuesday, August 16, 2011 - 1:24 pm

Thanks a lot.

Selahadin Ibrahim posted on Wednesday, August 17, 2011 - 12:11 pm

Hi Linda,
This is a follow-up to my previous question. How do I justify MAR (required for direct maximum likelihood) when the missingness is 'Not applicable'? Can you suggest any literature on this?

Thanks in advance

Linda K. Muthen posted on Wednesday, August 17, 2011 - 1:58 pm

You might want to check Enders 2010 missing data book to see if he covered this.

Selahadin Ibrahim posted on Thursday, August 18, 2011 - 5:20 am

Thanks, I do have the book. For data that are MNAR, he discusses selection and pattern mixture models. I did not see anything about the performance of ML and multiple imputation when we have 'not applicable' items. If I understand it correctly, a section in the paper by Schafer and Graham (Missing data: Our view of the state of the art) says that when the missing values are out of scope, we can assume MAR.

thanks,

Bengt O. Muthen posted on Friday, August 19, 2011 - 6:19 pm

The question is what an (implicitly) imputed value means for a subject for whom the question is not applicable. It's similar to imputing values for someone who drops out of a study due to death. There may be other approaches for this case. You may want to ask on SEMNET to hear if someone has a good reference.

Aiden M A Thornton posted on Thursday, January 05, 2017 - 3:58 pm

I'm attempting to identify the latent factors within a complex data set that is comprised of:

- assessments that were administered in multiple different training cohorts
- 5 different versions of the same assessment were administered (each of which uses a different number of items such that version 1 uses 18 items, version 2 uses 36 items, version uses 30 items, etc ... so significant planned missingness)
- a total of 55 ordinal items across all versions of assessment
- 3 different scorers across all versions of assessment
- 3 test-times (t1, t2, t3)

I hope to take an exploratory approach to factor analysing using as much data as possible to preserve power.

It seems to me that the best I can hope to do is:
- use ESEM rather than EFA
- analyse the 3 test-times separately rather than together in the same analysis (I know that I can test for measurement invariance using CFA, but I want to do an exploratory analysis instead of something confirmatory if possible)
- use multiple groups where groups would be (i) training cohorts and (ii) assessment version

Is this correct? Or is there a more constructive way of structuring my thinking about this? Many thanks.

Bengt O. Muthen posted on Friday, January 06, 2017 - 3:20 pm

ESEM seems a good choice. It can be used for testing measurement invariance and multiple group analysis - see the UG examples.

Aiden M A Thornton posted on Thursday, January 19, 2017 - 12:24 am

Dear Bengt,

Thank you for your response above.

From the UG I can see that you can use ESEM:
* at multiple time points (Example 5.26) and also
* for multiple groups (Example 5.27).
* I've also understood that multilevel analysis is unavailable for ESEM.

But is it possible to conduct an ESEM that takes into account:
* one group variable such as "training cohort"
* a second group variable such as "version of assessment"
* and multiple time points all simultaneously?

I'm having trouble locating an example that integrates all these variables in one analysis.

Many thanks, A.

Bengt O. Muthen posted on Thursday, January 19, 2017 - 5:56 pm

I don't know what this means:

* one group variable such as "training cohort"
* a second group variable such as "version of assessment"

Aiden M A Thornton posted on Thursday, January 19, 2017 - 7:26 pm

Sorry for not being clearer, let me try again.

It means that the data can be grouped in two different ways:

* the first way to group the data would be according to specific "training cohorts" which people participated in
* the second way to group the data would be according to the particular "version of the assessment" they completed (as there were 5 different versions of the assessment used in total)

I can't seem to find an example of ESEM syntax that incorporates two different ways in which data can be grouped (i.e. by which training cohort they participated in & by which version of the assessment they took) in addition to multiple time points.

Bengt O. Muthen posted on Friday, January 20, 2017 - 6:02 pm

So with 2 training cohorts and 3 assessment versions you would have 6 groups? If so, that's how it would be done - in combination with multiple time points.
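
The crossing described in this reply can be prepared before the data reach the analysis software. A minimal Python sketch follows, assuming hypothetical column names `cohort` and `version` and integer codes for each. Every (cohort, version) combination is mapped to one group code, so 2 cohorts and 3 versions yield 6 groups.

```python
from itertools import product

def cross_groups(cohorts, versions):
    """Map each (cohort, version) pair to a single integer group code, starting at 1."""
    return {
        pair: code
        for code, pair in enumerate(product(cohorts, versions), start=1)
    }

# Hypothetical coding: cohorts 1-2, assessment versions 1-3.
codes = cross_groups(cohorts=[1, 2], versions=[1, 2, 3])

# Hypothetical records; a real data set would have one row per person.
records = [
    {"id": 101, "cohort": 1, "version": 3},
    {"id": 102, "cohort": 2, "version": 1},
]
for record in records:
    record["group"] = codes[(record["cohort"], record["version"])]

print(len(codes), "groups:", codes)
```

The resulting `group` column is a single grouping variable of the kind a multiple-group analysis expects. Whether every crossed group has enough cases to estimate a model is a separate question that the sketch does not address.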