High-dimensional EFA (optimize run ti...
 Jaime Derringer posted on Friday, August 06, 2010 - 9:52 am
I'm running a series of EFAs on 60-80 items, extracting up to 10 factors. We are using the ML estimator, and the data are missing completely at random (by design) and weights are applied.

I am currently running this using M+ v6 on a 64-bit Windows 7 quad-core machine - the first EFA has now been running for 6 days.

We are planning to purchase a server on which to run these analyses, and were wondering what kind of machine parameters would optimize run times, and how much M+ uses machine features like on-board RAM vs. virtual RAM on the hard drive.
 Linda K. Muthen posted on Friday, August 06, 2010 - 9:59 am
Are the factor indicators categorical or continuous?
 Jaime Derringer posted on Friday, August 06, 2010 - 10:08 am
Categorical (4 options)
 Linda K. Muthen posted on Friday, August 06, 2010 - 10:24 am
With maximum likelihood and categorical factor indicators, each factor is one dimension of integration. We do not recommend models with more than four dimensions of integration. I suggest using WLSMV, which is less computationally demanding in this case. I would think that you have some idea of how many factors are represented by the set of items. If it is four, for example, perhaps extracting from three to five or two to six would be sufficient.
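The computational point above can be made concrete: with full numerical integration, the number of quadrature points grows exponentially with the number of factors (dimensions of integration). A rough sketch, assuming 15 points per dimension; the exact Mplus defaults and its adaptive scheme differ:

```python
# Total quadrature points for full numerical integration grows as
# points_per_dim ** n_factors. The 15-points-per-dimension figure is
# an assumption for illustration, not the documented Mplus default.
def total_integration_points(n_factors, points_per_dim=15):
    return points_per_dim ** n_factors

for m in (2, 4, 10):
    print(m, total_integration_points(m))
# 2 -> 225; 4 -> 50625; 10 -> 576650390625
```

With 10 factors, each likelihood evaluation would require over half a trillion points, which is why a 10-factor categorical ML EFA can run for days.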
 Jaime Derringer posted on Friday, August 06, 2010 - 11:30 am
Unfortunately, we need to use ML to handle the missingness in our data (which is substantial, so we lose more than 1/2 of our subjects otherwise). We are constructing a measure, and for one of the sections (with 80 items), the proposed number of factors is 10, so we also need to run up to the 10-factor solution.

Is there a hardware solution to optimize M+'s running of a categorical ML large EFA analysis, without changing the analysis itself?
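The complete-case loss mentioned above can be illustrated with a toy calculation. If each of k items were observed independently with probability p, a subject would be complete with probability p**k, which collapses quickly for long forms (real planned-missing designs assign blocks of items rather than independent items, so this is only a simplification):

```python
# Probability that a subject has no missing items when each of k items
# is observed independently with probability p. A simplification of a
# planned-missing design, where blocks of items are assigned instead.
def complete_case_rate(k, p):
    return p ** k

print(complete_case_rate(80, 0.9))   # ~0.0002: almost no complete cases
print(complete_case_rate(80, 0.99))  # ~0.448: still under half complete
```

Even 1% missingness per item leaves fewer than half the subjects complete on an 80-item section, consistent with losing more than half the sample under listwise deletion.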
 Bengt O. Muthen posted on Friday, August 06, 2010 - 12:23 pm
Note that WLSMV does not use listwise deletion.

Staying with ML, both integ=3 and integ=montecarlo can present numerical precision problems with that many factors.

A more practical full-information approach would be to do Bayesian multiple imputation followed by WLSMV. The approach is studied in Section 3.1 of

Asparouhov, T. & Muthén, B. (2010). Multiple imputation with Mplus. Technical Report.

which is on our web site under Papers, Bayesian Analysis. UG example 11.5 shows how to do the multiple imputation step.
 Jaime Derringer posted on Saturday, August 07, 2010 - 6:45 am
If WLSMV is not using listwise deletion, how exactly is it handling missing data? And how does that work if it's based on polychorics and the associated asymptotic weight matrix?
 Bengt O. Muthen posted on Saturday, August 07, 2010 - 8:29 am
Pairwise present.
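For readers wondering what "pairwise present" means in practice: each correlation in the matrix is computed from only the subjects observed on that particular pair of variables, so different cells can rest on different subsamples. A sketch with Pearson correlations standing in for the polychorics that WLSMV actually uses:

```python
import math

# Pairwise-present correlation: for each variable pair, keep only the
# rows where both values are observed (None marks missing). Pearson
# correlation stands in here for WLSMV's polychoric correlations.
def pairwise_corr(x, y):
    pairs = [(a, b) for a, b in zip(x, y) if a is not None and b is not None]
    n = len(pairs)
    mx = sum(a for a, _ in pairs) / n
    my = sum(b for _, b in pairs) / n
    sxy = sum((a - mx) * (b - my) for a, b in pairs)
    sx = math.sqrt(sum((a - mx) ** 2 for a, _ in pairs))
    sy = math.sqrt(sum((b - my) ** 2 for _, b in pairs))
    return sxy / (sx * sy)

x = [1, 2, 3, None, 5]
y = [2, 4, None, 8, 10]
print(pairwise_corr(x, y))  # uses only the 3 rows where both are observed
```

No subject is discarded outright: a subject missing one variable still contributes to every pair on which they are observed.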