FIML and warning message

Mplus Discussion > Missing Data Modeling >

James McMahon posted on Monday, April 06, 2015 - 6:46 am

I am performing standard multivariable linear regression (interval dependent variable) with a dataset that has 12% missing cases under listwise deletion. I am assuming MCAR or MAR. I would like to avail of full information maximum likelihood (FIML) estimation in Mplus as a means of handling the missing data.
I am using the following estimator option:
ANALYSIS: ESTIMATOR = MLR;
However, the output indicates that 29 cases are excluded from the analysis (see warning message below), which I would not expect under FIML. How do I get Mplus to run FIML? Thank you. –James McMahon

*** WARNING
Data set contains cases with missing on x-variables.
These cases were not included in the analysis.
Number of cases with missing on x-variables: 29

Linda K. Muthen posted on Monday, April 06, 2015 - 7:43 am

The model is estimated conditioned on the x variables. Their means, variances, and covariances are not model parameters. If you don't want to lose these cases, mention the variances of all of the x variables in the MODEL command. Then distributional assumptions will be made about these variables, and their means, variances, and covariances will be model parameters.

James McMahon posted on Monday, April 06, 2015 - 10:38 am

Yes, that worked. I added the x variables to the model command to model distributional parameters, as follows:

MODEL: y on x1 x2 x3;
x1 x2 x3;

Using MLR estimator, all cases are now included in the analysis. Thank you
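
For readers who want to see the whole input, a minimal sketch along these lines might look as follows. The file name and the missing-value code are assumptions made for illustration and were not given in the thread; the variable names follow the MODEL command above.

```
DATA: FILE = mydata.dat;        ! hypothetical file name
VARIABLE: NAMES = y x1 x2 x3;
  USEVARIABLES = y x1 x2 x3;
  MISSING = ALL (-999);         ! assumed missing-value code
ANALYSIS: ESTIMATOR = MLR;
MODEL: y ON x1 x2 x3;
  x1 x2 x3;                     ! mentioning the variances brings the x's into the model
```

With the x variances mentioned, the warning about cases with missing on x-variables should no longer exclude those cases, at the cost of making distributional assumptions about the x's.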

Lauren Molloy Elreda posted on Wednesday, May 24, 2017 - 12:59 pm

Similar to the original poster, I am performing standard multivariate linear regression with a dataset of 188 cases that has 32.5% missing cases under listwise deletion, but I would like to use FIML. As you recommended to the original poster, I've included a line in the MODEL command that lists all of the x variables, as follows:

MODEL: y on x1 x2 x3;
x1 x2 x3;

However, I have two follow-up questions:

1) We also have some missing data on the y variable. Is it okay to list the y variable on that second line as well, so that our full sample of 188 cases is included? E.g.:

MODEL: y on x1 x2 x3;
y x1 x2 x3;

2) Our model has a large number of observed covariates. If I include all of those covariates in the first line (regression line) as well as in the second line (to model distributional parameters), Mplus gives me an error that I have more parameters than observations. If I exclude just a few of the variables from the distributional parameters line, I avoid this problem (while still keeping the full sample of 188), but is it problematic to model the distributional parameters of some, but not all, of the variables in the regression? E.g., is this okay to do?:

MODEL: y on x1 x2 x3 x4 x5 x6;
x1 x2 x4;

If so, is there a principled way (either conceptually- or statistically-driven) to choose which variables to leave out of the distributional parameters line?

Bengt O. Muthen posted on Wednesday, May 24, 2017 - 5:49 pm

This is discussed in section 10.4.2 of our new book.

1) You don't have to list the residual variance of y because it is in the model as the default. Note, however, that subjects with complete data on x's and missing on y do not contribute to estimation of the regression coefficients (see that section).

2) If you exclude some of the x's they won't be allowed to correlate with the other x's. There really isn't a good solution to this problem.

Jill Rabinowitz posted on Monday, July 30, 2018 - 4:03 pm

Hi there,

I am running an LCA with a number of count indicators. I am using the ALGORITHM=INTEGRATION command. Is this the same thing as FIML? If not, what is the main difference?

J

Bengt O. Muthen posted on Monday, July 30, 2018 - 5:27 pm

Yes, Algo=Int gives FIML. That is, it's an algorithm that is used with ML(R) under the standard MAR assumption for missing data.

The need for Algo=Int with LCA, however, occurs only when you introduce a factor in addition to the latent class variable.

Jill Rabinowitz posted on Monday, August 06, 2018 - 7:34 am

Thanks, Bengt. A reviewer recently stated that FIML may not be an appropriate method for estimating missing data in an LCA when you're working with non-normal data. Is this correct? If it's not, do you have any web notes that support that FIML is an appropriate estimation method for non-normal data in an LCA framework?

Bengt O. Muthen posted on Monday, August 06, 2018 - 3:28 pm

Are the outcomes continuous or categorical?

Jill Rabinowitz posted on Wednesday, August 08, 2018 - 7:40 pm

They are count outcomes.

Bengt O. Muthen posted on Thursday, August 09, 2018 - 2:07 pm

"FIML" is ML estimation under the MAR (missing at random) assumption. For continuous outcomes, normality is assumed. This is probably what the reviewer refers to - "FIML" isn't perfect when you have non-normal continuous outcomes. But normality is not assumed with categorical or count outcomes. Therefore, there is no problem applying ML under MAR to count outcomes. I don't know which source could be a good reference - perhaps the book by Little & Rubin talks about counts; I don't have it here at the moment.
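
As a rough illustration of the setup discussed here, an LCA with count indicators estimated by ML under MAR might be specified as in the sketch below. The file name, indicator names, number of classes, and missing-value code are all assumptions for illustration; none were given in the thread. No ALGORITHM=INTEGRATION line is included, following the earlier reply that it is needed only when a factor is added to the latent class model.

```
DATA: FILE = counts.dat;        ! hypothetical file name
VARIABLE: NAMES = u1-u4;        ! hypothetical count indicators
  COUNT = u1-u4;                ! declares the outcomes as counts, so normality is not assumed
  CLASSES = c(2);               ! assumed number of latent classes
  MISSING = ALL (-999);         ! assumed missing-value code
ANALYSIS: TYPE = MIXTURE;
  ESTIMATOR = MLR;
```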

Ali posted on Tuesday, March 26, 2019 - 8:37 am

Hello, I ran an LCA model in which the four items are nominal variables. The four items have missing data, and I typed ALGORITHM=INTEGRATION in the LCA analysis; therefore, Mplus should use FIML. However, I got the warning message "These cases were not included in the analysis. Number of cases with missing on all variables:". Does this warning message occur under FIML?

My other question: the missing data are MAR (missing at random), but the four items are nominal variables. Is it appropriate to use FIML in the LCA in such a situation?

Thank you!

Bengt O. Muthen posted on Tuesday, March 26, 2019 - 5:22 pm

First, FIML does not mean that the outcomes are continuous - the FIML principle (which is really better called ML under the MAR assumption) applies to all kinds of outcomes.

If a subject has missing data on all your variables, FIML can't help you - it needs at least one variable not missing.

Ali posted on Tuesday, March 26, 2019 - 10:36 pm

Thank you! Yes, a subject had missing data on the four items due to the survey design. Does that mean that Mplus uses pairwise deletion in such a situation?

Bengt O. Muthen posted on Wednesday, March 27, 2019 - 4:13 pm

No, Mplus does not use pairwise here. You said that you have four items and you say that all four are missing - when that happens there is no observed variable for FIML to work with.

Ali posted on Wednesday, March 27, 2019 - 9:43 pm

Sorry for confusing you. In my dataset, there are four items, but every item has around 30% missing data, because the survey design is rotated. There are Form A, Form B, and Form C, and these four items only appeared in Form A and Form B. In other words, 33% of people did not answer these four items, while 67% of people did. In the analysis, I also included the 30% of people who did not answer these four items. So, I would like to ask whether FIML works in such a situation. If it doesn't, which method does Mplus apply?
In my output, I got the following warning message, but the output showed that the estimator is MLR. Do you mean that FIML does not work for the 30% of people who did not answer the four items, but still works for the 67% of people who did?

*** WARNING
Data set contains cases with missing on all variables.
These cases were not included in the analysis.
Number of cases with missing on all variables: 13032
1 WARNING(S) FOUND IN THE INPUT INSTRUCTIONS

Bengt O. Muthen posted on Thursday, March 28, 2019 - 5:49 pm

If your model analyzes only these 4 items, the 30% who did not answer them don't contribute to the estimation (FIML doesn't apply to those 30%). FIML would apply to people with missing on some but not all of the 4 items. If you bring other items into the model (from other forms) that the 30% answer, then FIML would kick in for the 4 items. Usually, with multiple forms, you analyze all items from all forms using all people and the missingness for some people on some items is handled by FIML.
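
A hedged sketch of the multiple-forms setup described above: items from all forms are listed together, so that people who skipped the four items still contribute through the items they did answer. The item names (a1-a4 for the four items, b1-b4 for items taken from another form), the choice to treat all of them as nominal, the number of classes, and the missing-value code are hypothetical and serve only as illustration.

```
DATA: FILE = forms.dat;         ! hypothetical file name
VARIABLE: NAMES = a1-a4 b1-b4;  ! hypothetical items from different forms
  NOMINAL = a1-a4 b1-b4;        ! assumed to be nominal, as the four items are
  CLASSES = c(3);               ! assumed number of latent classes
  MISSING = ALL (-999);         ! assumed missing-value code
ANALYSIS: TYPE = MIXTURE;
  ESTIMATOR = MLR;
```

Under this kind of specification, only cases with missing on every listed item would be dropped.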

Ali posted on Friday, May 03, 2019 - 8:19 am

I have a question regarding FIML. My model is: y on x1 x2 x3, and there are missing data on y, x1, x2, and x3. When I included x1, x2, and x3 in the MODEL section, all of the original cases were retained. The output showed the means, variances, and covariances for x1, x2, and x3. It also showed that x1, x2, and x3 were correlated. However, I did not hypothesize that x1, x2, and x3 were correlated in my model. So suppose my MODEL section, using FIML to deal with the missing data, is as below:
MODEL: y on x1 x2 x3;
x1 x2 x3;

Did this code change my hypothesized model (i.e. without any correlations among x1,x2,x3)?

Bengt O. Muthen posted on Friday, May 03, 2019 - 3:36 pm

If your model is y on x1 x2 x3, this says that you have a regression model. In a regression model, all x's are correlated even though those correlations are not part of the regression model parameters. In other words, there is no restriction on the marginal distribution of the x's (such as uncorrelatedness) in a regular regression model.

If you don't want the x's correlated, just say

x1-x3 with x1-x3@0;
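
Written out pair by pair, and assuming the same three predictors, the zero-covariance restriction might look like the sketch below. This is an illustrative expansion rather than a confirmed replacement for the list form above. Note that fixing these covariances at zero imposes a restriction that a regular regression model does not make.

```
MODEL: y ON x1 x2 x3;
  x1 x2 x3;                     ! variances of the x's (brings them into the model)
  x1 WITH x2@0 x3@0;            ! covariances of x1 with x2 and x3 fixed at zero
  x2 WITH x3@0;                 ! covariance of x2 with x3 fixed at zero
```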

Ana Martina Greco posted on Thursday, September 17, 2020 - 7:49 am

Hello!

I am trying to use FIML in a logistic regression model that has several categorical predictors. My problems are:

1) I keep getting a warning saying that some predictors "MAY BE DICHOTOMOUS BUT DECLARED AS CONTINUOUS". It is true that I only used CATEGORICAL command for the outcome because it only applies to dependent variables. Can I just ignore this warning? Or how can I solve it?

2) I keep getting an error telling me to check my model. I tried increasing the number of iterations, but it still does not converge. Any ideas on what that could be? Could it be related to the way I am handling missing data or to the number of categorical predictors I am including in the model?

Thanks a lot for your help and time!

Bengt O. Muthen posted on Thursday, September 17, 2020 - 6:06 pm

1) You can ignore it.

2) We need to see the full output to diagnose this - send to Support along with your license number.

Ana Martina Greco posted on Friday, September 18, 2020 - 12:42 am

Will do, thank you very much!