J.B. posted on Sunday, February 15, 2015 - 8:38 am
For the past month and a half I have been struggling with a multilevel EFA/CFA (MEFA/MCFA) cross-validation study.
The dataset comes from ~35,000 teachers clustered within ~7,000 schools, yielding an average cluster size of ~5. There are 49 ordinal teacher-level indicators and no school-level indicators.
The data were randomly split into two halves by school, one for the MEFA and one for the MCFA cross-validation. Unweighted and weighted univariate proportions for the indicators were checked in both halves and are roughly the same.
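For readers who want to reproduce this step, a minimal sketch of a by-school split is below. It assigns whole schools (not individual teachers) to the two halves, so no school's teachers are divided across halves. The function name, the seed, and the example school IDs are illustrative, not from the original study.

```python
import random

def split_by_cluster(cluster_ids, seed=0):
    """Randomly assign whole clusters (schools) to two halves,
    so every teacher in a school lands in the same half.
    Returns one flag per teacher record: True = EFA half, False = CFA half."""
    clusters = sorted(set(cluster_ids))   # unique school IDs
    rng = random.Random(seed)             # fixed seed for a reproducible split
    rng.shuffle(clusters)
    efa_half = set(clusters[: len(clusters) // 2])
    return [cid in efa_half for cid in cluster_ids]

# Example: 6 teachers in 4 schools (IDs are made up)
flags = split_by_cluster([101, 101, 102, 103, 103, 104])
```

Re-running with a different `seed` gives the re-splits mentioned below; comparing weighted and unweighted indicator proportions across the two halves is then a separate check.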
The dataset provides weights at both the teacher level and the school level. I constructed multilevel weights following Mplus Web Note 8.1 and Asparouhov, Muthén & Muthén (2004).
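As one illustration of what such a construction involves, the sketch below applies a common within-level scaling, in which raw teacher weights are rescaled so they sum to the number of sampled teachers within each school. This is only one of the scaling options discussed in that literature; the function name is mine, and the exact method should be checked against the Web Note rather than taken from this sketch.

```python
from collections import defaultdict

def scale_within_weights(weights, cluster_ids):
    """Rescale raw within-level (teacher) weights so that, within each
    school, they sum to that school's sampled cluster size.
    This is one common multilevel scaling, not necessarily the one
    used in the original study."""
    totals = defaultdict(float)   # sum of raw weights per school
    counts = defaultdict(int)     # number of teachers per school
    for w, c in zip(weights, cluster_ids):
        totals[c] += w
        counts[c] += 1
    return [w * counts[c] / totals[c] for w, c in zip(weights, cluster_ids)]

# Example: two teachers in school "a", one in school "b" (made-up values)
scaled = scale_within_weights([2.0, 4.0, 1.0], ["a", "a", "b"])
```

After this scaling, the two weights in school "a" sum to 2 (its cluster size) while preserving their 1:2 ratio.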
My MEFA analysis suggested that the best-fitting model for the MEFA half of the data has six within-level factors, six between-level factors, and retains only 42 of the original 49 indicators. Overall the most recent model seemed to fit reasonably well: CFI ~0.94, RMSEA ~0.02, SRMR-within ~0.03, SRMR-between ~0.05, with one cross-loading indicator at the within level and thirteen cross-loading indicators at the between level.
J.B. posted on Sunday, February 15, 2015 - 8:39 am
Running the best-fitting model on the MCFA half of the data consistently results in either (a) non-convergence or (b) very poor model fit (CFI ~0.60, SRMR ~0.10 or higher at both levels). The reason seems to be that two factors at the teacher (within) level consistently have a correlation greater than one.
I have tried re-splitting the data randomly into MEFA and MCFA halves several times, thinking the poor fit might be due to non-equivalent halves. I have also recalculated all of the multilevel weights in case I made an error in that process. Finally, I have tried dichotomizing all of the indicators, which yields a more parsimonious MEFA model overall yet still fails in the MCFA.
Thank you in advance for your suggestions and advice!