Estimators and Missing Data

Thomas A. Schmitt posted on Thursday, June 12, 2008 - 1:29 pm

Hello Linda,

I have several questions concerning growth models in Mplus:

(1) I have estimated a two-level multilevel model with the code below. Is there an example that shows how to estimate the parameters in the SEM framework, and if so, does the SEM framework give the same parameter estimates as the multilevel modeling framework?

(2) Why is Mplus able to “handle” missing data at both level one and level two as opposed to other programs such as HLM and SAS PROC MIXED which can only “handle” missing data at level one; is this due to the optimization algorithm? Is this true both in the

(3) I’m trying to understand what default algorithm Mplus uses to estimate parameters in the presence of missing data. I’m fitting a two-level model with random intercepts and slopes. Is my understanding correct that the MLR estimator is used along with the EM algorithm and that FIML is not being used?

Best,

Tom

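TITLE:
  This is an example of a one-level growth
  model for a continuous outcome (two-level analysis)
DATA:
  FILE IS willet_mplus_missing.dat;
  FORMAT IS FREE;
VARIABLE:
  NAMES = id covar y time;
  USEVARIABLES = id y time;
  WITHIN = time;          ! time varies only within person
  CLUSTER = id;           ! repeated measures are nested in id
  MISSING = y (-99);      ! -99 flags a missing value of y
ANALYSIS:
  TYPE = TWOLEVEL RANDOM;
MODEL:
  %WITHIN%
  s | y ON time;          ! random slope for the regression of y on time
  %BETWEEN%
  y s;                    ! variances of the random intercept and slope
  y WITH s;               ! intercept-slope covariance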

Linda K. Muthen posted on Thursday, June 12, 2008 - 3:59 pm

1. Example 6.1 in the user's guide, with the residual variances held equal, would be the same.

2. It is due to the optimization procedure.

3. MLR is FIML. The same parameter estimates are obtained; only the standard errors and chi-square differ. In most cases, Quasi-Newton is used. In some cases, EM is used.

Thomas A. Schmitt posted on Friday, June 13, 2008 - 11:11 am

Thank you! I’m still struggling with the algorithm being used. The MS-DOS window indicates that EM is largely invoked when there are no missing data. When there are missing data, the window indicates that EM is mostly used along with QN. Also, I thought FIML was a direct method, in that model parameters and standard errors are estimated directly from the data, whereas the EM algorithm is an indirect ML method, in that it provides ML estimates of the covariance matrix and mean vector that are then used for further analysis. So my confusion is that FIML is being used along with the EM algorithm.

Also:
(1) Is missing data handled the same way, in terms of ML methods, for hierarchical growth data in the multilevel and SEM frameworks?
(2) Can auxiliary variables be incorporated into the Mplus approach that uses multilevel modeling?

Thomas A. Schmitt posted on Friday, June 13, 2008 - 11:39 am

One more question. What about the algorithm allows missingness to be modeled at level 2? Is there a reference that discusses this? Is it related to the multivariate approach to growth modeling that Mplus uses?

Bengt O. Muthen posted on Friday, June 13, 2008 - 5:54 pm

One source of confusion is that "EM" is often thought of only as a method to estimate a mean vector and covariance matrix when there is missing data. This is too narrow a view. EM is an algorithm that is used with maximum-likelihood estimation in general. It is true that it is typically used to estimate an unrestricted mean vector and covariance matrix, which with model fit testing provides the "H1" model estimates to which the fit of the "H0" model is compared. But EM can also be used for H0 models, that is, for the model you are ultimately interested in. It is used with missing data in a general sense, which includes latent variables such as the random intercepts and slopes that you have in your multilevel example. EM for H0 models can be slow and is therefore often accelerated by algorithms such as Quasi-Newton, Fisher Scoring, etc. When that is done, Mplus shows it in the technical output for the iterations.

The term "FIML" is also confusing and unfortunate. Full-information ML is the usual ML, but for some reason the FI prefix is added in missing data contexts. And ML with missing data - using any algorithm - is as you say estimating the H0 parameters directly, not doing imputations of the missing data first as in Bayesian multiple imputation. So FIML is what Mplus does when ML is requested with missing data, again using various algorithms.

(1) The ML missing data principle is not different for single- and multi-level models.

(2) I don't think so, but that's a support question.

Regarding your question about missing data and the algorithm for two-level modeling, the Mplus group is trying to finish up a paper describing this.
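To make the distinction above between estimator and algorithm concrete, here is a minimal sketch of how both could be requested explicitly for the two-level model posted earlier in this thread. It assumes the ESTIMATOR, ALGORITHM, and TECH8 options as documented in the Mplus User's Guide. The settings shown are illustrative rather than a recommendation from the posters, and whether a given ALGORITHM setting is accepted for a particular model type should be checked against the guide.

```mplus
ANALYSIS:
  TYPE = TWOLEVEL RANDOM;
  ESTIMATOR = MLR;     ! ML estimates with robust standard errors
  ALGORITHM = EMA;     ! accelerated EM; EM, FS, and ODLL are other settings
OUTPUT:
  TECH8;               ! print the optimization history for the iterations
```

ESTIMATOR determines what is maximized and how the standard errors are computed. ALGORITHM only determines how the maximum is found, so changing it should not change the ML estimates at convergence.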

Thomas A. Schmitt posted on Monday, June 16, 2008 - 11:28 am

Thank you Bengt for your detailed answer!

Best,

Tom

Thomas A. Schmitt posted on Tuesday, June 17, 2008 - 5:30 pm

Hello Linda and Bengt:

I have a couple more questions concerning this just so I'm clear. Is FIML being used for both the random and fixed effects? Also, I get the error message below and I'm not sure what to make of it because it sounds like listwise deletion.

Best,

Tom

Data set contains cases with missing on all variables except
x-variables. These cases were not included in the analysis.
Number of cases with missing on all variables except x-variables: 10
1 WARNING(S) FOUND IN THE INPUT INSTRUCTIONS

Bengt O. Muthen posted on Tuesday, June 17, 2008 - 5:59 pm

Regarding your FIML (that is, ML) question, I can interpret it in two ways. (1) You are asking if ML is used both for models with fixed effects and for models with random effects. Then the answer is yes. (2) You are asking if ML is used both for parameter estimation and for random effect estimation for each individual in the sample. Then the answer is no. ML is used for parameter estimation. The random effects are not parameters of the model (their means and variances are), but the individual effects can be estimated after the model has been estimated. The individual effects are estimated using the posterior distribution (given the model and the data), typically using the expected value of this distribution. This is also referred to as Empirical Bayes.
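As a rough sketch of how the individual effects described here might be requested after estimation, the fragment below uses the SAVEDATA command with SAVE = FSCORES. The file name is hypothetical, and it is an assumption that factor score saving is available for the TYPE = TWOLEVEL RANDOM model in this thread; the user's guide lists the supported analysis types.

```mplus
SAVEDATA:
  FILE IS random_effects.dat;   ! hypothetical output file name
  SAVE = FSCORES;               ! save the estimated individual effects
```

The saved values are estimates computed after the model parameters have been obtained. They are not additional model parameters.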

Regarding the x variable missingness, yes, such cases are deleted. Missing data handling of the x's needs to make assumptions about the distribution of the x's, and this is not part of the original model. You can however bring the x's into the model (by mentioning their means or variances), in which case the default is to add a normality assumption, and missingness is handled as we have discussed for dependent variables.
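A minimal sketch of what "mentioning the variance" looks like in a MODEL command, using the hypothetical variable names y and x rather than anything from the data in this thread:

```mplus
MODEL:
  y ON x;
  x;        ! mentioning the variance of x brings x into the model
```

With the second statement, x is treated as part of the model under a normality assumption, so cases with missing x are no longer dropped for that reason alone.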

Thomas A. Schmitt posted on Wednesday, June 18, 2008 - 11:12 am

Thank you Bengt! As always you and Linda provide timely and lucid responses.

Best,

Tom

Hemant Kher posted on Wednesday, November 03, 2010 - 4:35 am

Hello Professors Muthen & Muthen,

I have a question on LGM analysis. I have 2 repeated variables in my data set – y1-y4 and x1-x4 – each recorded at 4 occasions. There’s missing data in my sample. When I fit a growth model to each, n is 230; with a dual growth model, n is again 230. But when I fit a growth model for y1-y4 with x1-x4 as a TVC, n shrinks to 102. This coincides with the fact that 102 students provided data at all 4 points.

The Mplus message I get is: "Data set contains cases with missing on x-variables. These cases were not included in the analysis. Number of cases with missing on x-variables: 128"

Any insights on why the sample shrinks in the different models?

Thanks for your time; much appreciated.

Hemant

Hemant Kher posted on Wednesday, November 03, 2010 - 5:29 am

Hello again Professors Muthen & Muthen,

I would like to take my question (above) back. On the discussion board there is a suggestion from you to include the Xs in the model by mentioning their variances. I did this and the problem of shrinking sample size appears resolved. Now the output lists a WITH for each X variable (e.g. X1 with I, S; X2 with I, S, etc.).

So a follow-up question: does the presence of these WITH estimates, which I did not request, alter the interpretation of the TVC model?

Thanks for your time as always.

Linda K. Muthen posted on Wednesday, November 03, 2010 - 12:30 pm

If you do not want x1 correlated with i, say:

x1 WITH i@0;
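For context, a growth model with time-varying covariates of the kind discussed in this exchange might be sketched as follows. The variable names y1-y4 and x1-x4 come from the question; the linear time scores 0 to 3 and the choice of which covariances to fix are assumptions made for illustration only.

```mplus
MODEL:
  i s | y1@0 y2@1 y3@2 y4@3;   ! linear growth; time scores 0-3 are assumed
  y1 ON x1;                    ! time-varying covariate at each occasion
  y2 ON x2;
  y3 ON x3;
  y4 ON x4;
  x1-x4;                       ! variances mentioned, so the x's are in the model
  x1 WITH i@0;                 ! optional: fix a covariance with a growth factor at zero
  x1 WITH s@0;
```

Omitting the last two statements leaves the covariances between x1 and the growth factors free, which is the situation reported above where the output lists WITH estimates for each x variable.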

Hemant Kher posted on Wednesday, November 03, 2010 - 1:11 pm

Thank you for a quick response Dr. Muthen. When I include statements like above, the fit deteriorates significantly. Thus keeping x-variables in the model and letting them correlate with growth parameters seems a better option.

My only concern is if the interpretations are affected by the presence of the correlations; my inclination is that they are not.

Hemant

Bengt O. Muthen posted on Wednesday, November 03, 2010 - 1:29 pm

Bringing the time-varying covariates into the model like this highlights the shortcomings of the typical assumptions made in multilevel modeling. For instance, it is likely that a time-varying covariate influences the slope growth factor. Now, a time-varying covariate measured at time t (t>1) could of course not influence the growth intercept if that is defined at time 1, as is customarily done. But a time-varying covariate measured at time 1 could at least be correlated with the growth intercept. See also the handout from our Topic 3 course on growth modeling (slide 138 and on), where data on math development are analyzed by alternative models.

Dex posted on Wednesday, October 14, 2015 - 8:57 am

Hi Drs. Muthen, when using FIML to deal with missing data in SEM, does Mplus 7 produce robust standard errors by default? Thanks

Bengt O. Muthen posted on Wednesday, October 14, 2015 - 1:33 pm

Use Estimator = mlr;

Amanda Lemmon posted on Thursday, July 16, 2020 - 11:57 am

Hello -

I wanted to ask a question about MLR and missing data. How much missing data is too much for MLR? I am doing CFA if the type of analysis matters.

Thank you!

Bengt O. Muthen posted on Friday, July 17, 2020 - 5:59 pm

It depends on too many factors to give a general statement. The more missing data you have, the more you rely on distributional and other assumptions and the less you can rely on your data. You are typically better off with, for example, less than 20% missing and worse off with, for example, 40% missing. The coverage information we give also shows the missingness (or rather its opposite, coverage) not only for each variable but also for pairs of variables, which is relevant for factor analysis since information on the estimates largely comes from pairs (covariances/correlations). Have a look at chapter 10 of our RMA book to learn the ins and outs.
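As a sketch of where the coverage and missing data pattern information can be requested or controlled for a CFA, the fragment below assumes a single hypothetical factor f measured by hypothetical indicators y1-y6. It relies on the COVERAGE option of the ANALYSIS command and the PATTERNS option of the OUTPUT command as documented in the user's guide.

```mplus
ANALYSIS:
  ESTIMATOR = MLR;
  COVERAGE = 0.10;    ! minimum covariance coverage (0.10 is the documented default)
MODEL:
  f BY y1-y6;         ! hypothetical one-factor CFA
OUTPUT:
  PATTERNS;           ! summarize the missing data patterns
```

The pairwise coverage values in the output are the quantities referred to above: a low value for a pair of indicators means that little data informs their covariance.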