Mplus Discussion >> CFA and missing data

CFA and missing data

R McDowell posted on Monday, February 09, 2015 - 2:06 am

I am fitting a factor model in which one or more of the indicators may be missing for any given observation. Missing values are coded -999 in the data set, and the line "Missing are all (-999)" has been added to the input.

Mplus output states that there are 34 missing data patterns and 15 observations have all the indicators missing. I expected Mplus would fit the model using only the observations where all indicators had been recorded but it only excludes the 15 observations where no values are present (obviously). Could you explain why and how Mplus is using the observations with partly recorded data in the analysis? Thank you.
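The pattern count described in this post can be reproduced outside Mplus as a sanity check. The following is a minimal sketch in Python, assuming the -999 code from the post; the data values and the helper name `missing_patterns` are made up for illustration and are not part of Mplus.

```python
from collections import Counter

MISSING = -999  # missing-value code used in the post

def missing_patterns(rows, missing=MISSING):
    """Count how often each missingness pattern occurs ('x' = observed, '.' = missing)."""
    patterns = Counter()
    for row in rows:
        patterns[tuple("." if v == missing else "x" for v in row)] += 1
    return patterns

# Hypothetical data: 5 observations on 3 indicators
data = [
    [2.1, 3.4, 1.0],
    [1.9, MISSING, 0.7],
    [MISSING, MISSING, MISSING],
    [2.5, 3.0, MISSING],
    [2.2, MISSING, 1.1],
]

patterns = missing_patterns(data)
all_missing = patterns[(".", ".", ".")]                        # rows with no information at all
usable = [r for r in data if any(v != MISSING for v in r)]     # rows with at least one value
complete = [r for r in data if all(v != MISSING for v in r)]   # rows a listwise analysis would keep

for pattern, count in sorted(patterns.items()):
    print("".join(pattern), count)
```

In this toy data set only one row is complete, while four rows carry at least some information, which is the distinction the question is about.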

Linda K. Muthen posted on Monday, February 09, 2015 - 5:48 am

This is explained on pages 7-8 of the current user's guide. It varies by the estimator being used.

R McDowell posted on Friday, February 13, 2015 - 8:24 am

Thank you. I've had a look at the manual but am still not exactly clear what Mplus is doing in the default setting. My indicators are continuous and I'm using MLR to estimate the factor model. Is Mplus using all available indicators to estimate the factor loadings, but not trying to estimate the missing indicators in doing so, or is it trying to estimate these before calculating the loadings? What is the missingness assumption of this default setting?

Linda K. Muthen posted on Friday, February 13, 2015 - 9:34 am

Mplus uses all available information to estimate the full model at the same time. You should see the reference in the user's guide for further information.
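One common way to use "all available information" with continuous indicators is full-information maximum likelihood: each observation contributes a normal log-likelihood computed only from the indicators it actually has, and nothing is imputed. Approaches of this kind are usually justified under a missing-at-random assumption. The following is a minimal sketch of that idea in Python, not the Mplus implementation; the one-factor parameter values, the data, and all function names are hypothetical.

```python
import math

MISSING = -999  # missing-value code from the thread

def implied_sigma(loadings, factor_var, resid_vars):
    """Model-implied covariance matrix of a one-factor model."""
    p = len(loadings)
    return [[loadings[i] * loadings[j] * factor_var + (resid_vars[i] if i == j else 0.0)
             for j in range(p)] for i in range(p)]

def cholesky(a):
    """Lower-triangular Cholesky factor of a symmetric positive definite matrix."""
    n = len(a)
    low = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(i + 1):
            s = sum(low[i][k] * low[j][k] for k in range(j))
            if i == j:
                low[i][j] = math.sqrt(a[i][i] - s)
            else:
                low[i][j] = (a[i][j] - s) / low[j][j]
    return low

def case_loglik(y, mu, sigma, missing=MISSING):
    """Normal log-likelihood of one case, using only its observed indicators."""
    obs = [i for i, v in enumerate(y) if v != missing]
    if not obs:
        return None  # a case with nothing observed carries no information
    dev = [y[i] - mu[i] for i in obs]
    sub = [[sigma[i][j] for j in obs] for i in obs]
    low = cholesky(sub)
    # Forward substitution: solve low * z = dev, so that dev' * inv(sub) * dev = z'z
    z = []
    for i in range(len(obs)):
        z.append((dev[i] - sum(low[i][k] * z[k] for k in range(i))) / low[i][i])
    logdet = 2.0 * sum(math.log(low[i][i]) for i in range(len(obs)))
    return -0.5 * (len(obs) * math.log(2.0 * math.pi) + logdet + sum(v * v for v in z))

def fiml_loglik(rows, mu, sigma):
    """Sum the casewise contributions, skipping cases with no observed indicators."""
    contributions = [case_loglik(y, mu, sigma) for y in rows]
    return sum(c for c in contributions if c is not None)

# Hypothetical parameter values and data (three indicators, one factor)
mu = [0.0, 0.0, 0.0]
sigma = implied_sigma([1.0, 0.8, 0.6], 1.0, [0.5, 0.5, 0.5])
data = [
    [0.3, -0.1, 0.4],
    [0.3, MISSING, MISSING],
    [MISSING, MISSING, MISSING],
]
total = fiml_loglik(data, mu, sigma)
```

An estimator would search for the loadings, variances, and means that maximize `total`; the sketch only evaluates it at fixed values to show how partially observed cases enter the likelihood.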

R McDowell posted on Sunday, February 15, 2015 - 9:37 am

Thanks, I gather it's using FIML. When I run a similar model with categorical indicators, missing data and the WLSMV estimator, I see that by default only the observations that have all data missing are excluded here as well. What is happening in this instance?

Linda K. Muthen posted on Sunday, February 15, 2015 - 11:06 am

Yes, it is FIML with MLR. As it says in the user's guide, with WLSMV and no covariates it is pairwise present.
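"Pairwise present" means that each pair of variables is summarized from the cases that have both variables observed, so the sample size can differ from pair to pair. The following is a minimal sketch of that bookkeeping in Python. It uses ordinary covariances on made-up continuous data purely for illustration; with categorical indicators the pairwise statistics would be of a different kind, and the function names here are hypothetical.

```python
from itertools import combinations

MISSING = -999  # missing-value code from the thread

def pairwise_present(rows, i, j, missing=MISSING):
    """Value pairs for variables i and j from cases where both are observed."""
    return [(r[i], r[j]) for r in rows if r[i] != missing and r[j] != missing]

def covariance(pairs):
    """Sample covariance (n - 1 denominator) of a list of (x, y) pairs."""
    n = len(pairs)
    mean_x = sum(x for x, _ in pairs) / n
    mean_y = sum(y for _, y in pairs) / n
    return sum((x - mean_x) * (y - mean_y) for x, y in pairs) / (n - 1)

# Hypothetical data: 5 observations on 3 variables
data = [
    [1.0, 2.0, 3.0],
    [2.0, 4.0, MISSING],
    [3.0, MISSING, 5.0],
    [4.0, 8.0, 9.0],
    [MISSING, MISSING, MISSING],
]

# Number of cases available for each pair, versus complete cases only
pair_n = {(i, j): len(pairwise_present(data, i, j))
          for i, j in combinations(range(3), 2)}
listwise_n = sum(all(v != MISSING for v in r) for r in data)

cov_01 = covariance(pairwise_present(data, 0, 1))
```

Here the pair (0, 1) is based on three cases even though only two cases are complete, and the all-missing case contributes to no pair at all.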

R McDowell posted on Sunday, February 15, 2015 - 11:14 am

Many thanks for clarifying this for me.

Shirley posted on Monday, April 11, 2016 - 1:21 am

Dear Dr. Muthen,
I am trying to conduct a confirmatory factor analysis, where the item data were modeled as ordinal variables. I have several questions:
1. Due to the discontinue rule (i.e., test administration is discontinued after X consecutive scores of 0), there is a considerable amount of missing data. Currently, I specify MLR as the estimator in the Analysis section, with the goal of using all the information available. May I check whether this is a correct strategy?

2. The estimation of the CFA model terminated normally. However, the following warning message was printed in the output:

*** ONE OR MORE PARAMETERS WERE FIXED TO AVOID SINGULARITY OF THE INFORMATION MATRIX. THE SINGULARITY IS MOST LIKELY BECAUSE THE MODEL IS NOT IDENTIFIED, OR BECAUSE OF EMPTY CELLS IN THE JOINT DISTRIBUTION OF THE CATEGORICAL VARIABLES IN THE MODEL. THE FOLLOWING PARAMETERS WERE FIXED: Parameter 14, f BY Q15, Parameter 10, f BY Q11, and Parameter 9, f BY Q10. ***

There are blank cells in the cross-tabulations of item 10 with other items, and the same is true for item 15. For item 11, I didn't find any blank cells in its cross-tabulation with any other item, and I couldn't figure out what could cause a problem for this item. I would appreciate your insight.

Thank you!
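The check for blank cells described in point 2 can be automated across all item pairs. The following is a minimal sketch in Python, assuming items scored 0, 1, or 2 and `None` as the missing marker; the scoring range, the data, and the function name `empty_cells` are hypothetical and not taken from the post.

```python
from collections import Counter
from itertools import combinations

def empty_cells(rows, i, j, categories, missing=None):
    """Category combinations of items i and j that never occur together in the data."""
    seen = Counter((r[i], r[j]) for r in rows
                   if r[i] is not missing and r[j] is not missing)
    return [(a, b) for a in categories for b in categories if seen[(a, b)] == 0]

# Hypothetical responses to three items; later items are missing
# for respondents who hit the discontinue rule.
rows = [
    (0, 0, None),
    (1, 0, 0),
    (2, 1, 1),
    (1, 1, 2),
    (2, 2, 2),
    (0, 1, None),
]
categories = [0, 1, 2]

# Report the empty cells for every pair of items
report = {(i, j): empty_cells(rows, i, j, categories)
          for i, j in combinations(range(3), 2)}
for pair, cells in report.items():
    print(pair, cells)
```

A discontinue rule tends to remove exactly the low-scoring respondents from later items, so empty cells involving low categories of late items are to be expected in data of this kind.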

Linda K. Muthen posted on Monday, April 11, 2016 - 10:06 am

Please send the output and your license number to support@statmodel.com.

Aurelie Lange posted on Monday, January 22, 2018 - 2:06 am

Dear Dr Muthen,

we are running a CFA with 3 factors and 31 items. We have 483 observations. The questionnaire has been completed between 1 and 4 times by each respondent. We therefore use TYPE=COMPLEX to account for the nested data structure.

When running the analysis we get the following message:

THE MISSING DATA EM ALGORITHM FOR THE H1 MODEL HAS NOT CONVERGED WITH RESPECT TO THE PARAMETER ESTIMATES. THIS MAY BE DUE TO SPARSE DATA LEADING TO A SINGULAR COVARIANCE MATRIX ESTIMATE.
INCREASE THE NUMBER OF H1 ITERATIONS.
NOTE THAT THE NUMBER OF H1 PARAMETERS (MEANS, VARIANCES, AND COVARIANCES) IS GREATER THAN THE NUMBER OF OBSERVATIONS.
NUMBER OF H1 PARAMETERS : 527
NUMBER OF OBSERVATIONS : 483
NO CONVERGENCE. NUMBER OF ITERATIONS EXCEEDED.

Our data does not seem to be sparse as the covariance coverage is > .95 for all items and item-pairs.
We have tried to increase the number of H1 iterations. We then merely get the message 'no convergence'.

Would you have any suggestions?

Sincerely,
Aurelie Lange
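The H1 parameter count in the message above follows directly from the number of items: the unrestricted model has one mean and one variance per item plus one covariance per pair of items. A short sketch in Python, with a hypothetical function name, reproduces the figure for 31 items:

```python
def h1_parameter_count(p):
    """Means, variances, and covariances of an unrestricted model for p variables."""
    means = p
    variances = p
    covariances = p * (p - 1) // 2
    return means + variances + covariances

n_items = 31          # from the post
n_observations = 483  # from the post

n_h1 = h1_parameter_count(n_items)
print(n_h1, n_h1 > n_observations)
```

With 31 items this gives 31 + 31 + 465 = 527, which matches the output and exceeds the 483 observations.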

Linda K. Muthen posted on Monday, January 22, 2018 - 6:06 am

Please send the output and your license number to support@statmodel.com.

Aurelie Lange posted on Monday, January 22, 2018 - 6:11 am

Thank you for your quick reply. Fortunately, we found the solution. We forgot to specify some of the missing values.
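A missing-value code that is not declared is read as a real score, which can distort the estimated means and covariances. The post does not say which codes were overlooked, so the following is only a sketch of one way to look for them in Python: flag every value outside the valid response range that is not a declared missing code. The 1 to 5 range, the codes, the data, and the function name are all hypothetical.

```python
from collections import Counter

def undeclared_codes(rows, low, high, declared_missing):
    """Count values outside [low, high] that are not declared as missing codes."""
    suspects = Counter()
    for row in rows:
        for value in row:
            if value in declared_missing:
                continue  # already handled as missing
            if not (low <= value <= high):
                suspects[value] += 1
    return suspects

# Hypothetical item responses on a 1-5 scale; only -999 has been declared missing
data = [
    [1, 4, -999],
    [5, -99, 3],
    [2, 2, -99],
    [999, 3, 1],
]

suspects = undeclared_codes(data, low=1, high=5, declared_missing={-999})
print(dict(suspects))
```

Any value reported here is either a data-entry error or a missing code that still needs to be declared in the input.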