Wrong mean estimates with FIML?
Mplus Discussion > Missing Data Modeling >
 Tony Jung posted on Friday, May 12, 2006 - 8:32 pm
I'm running a simple path model with 8 manifest variables. My sample size is 210 and I have missing data (about 30-50% missing for four variables). I ran the model in AMOS and Mplus. Both AMOS and SPSS give me the same listwise-deleted means for the variables. However, Mplus is giving me smaller means for variables that have missing values. I think what's going on is that Mplus is dividing by the total N=210, instead of by the listwise-deleted N.

Using SPSS, I saved the dataset as a fixed-ASCII (.dat) file to be read by Mplus. However, when I open the .dat file, the missing values are just left blank. I think Mplus is just assigning missing as zero, instead of seeing it as system-missing. Hence, the lower means.

My concern is that my parameter estimates and test statistics are different between AMOS and Mplus. Also, should I be reporting the FIML mean values or the listwise mean values? Clearly the FIML mean values are misleading since they are always less than or equal to the listwise-deleted means.

Any thoughts?
 Linda K. Muthen posted on Saturday, May 13, 2006 - 6:37 am
Mplus computes sample statistics using the n for the number of observations used to compute the sample statistics. This would be the listwise deleted mean if listwise deletion is used.

Blanks are not allowed with free format. If you have blanks and free format, they are skipped and the data are not read correctly. If you have fixed format, you need to declare the blanks as missing or they will be treated as zero. This is most likely why your means are incorrect.

For fit statistics, Mplus uses n and I think AMOS uses n-1. The two are asymptotically equivalent, but the difference can be seen in a small sample like yours.

If your model is estimated using TYPE=MISSING, you should report those sample statistics. If your model is estimated using listwise deletion, you should report those sample statistics.
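
The effect of undeclared blanks in a fixed-format file can be illustrated outside Mplus. The following is a minimal Python sketch, not Mplus code; the records, column positions, and helper names are hypothetical. It shows how reading a blank field as zero pulls the mean down compared with treating it as missing.

```python
# Hypothetical fixed-width records: columns 0-1 hold quiz1, columns 2-3 hold quiz2.
records = [" 8 7", " 6  ", " 9 8", " 7  "]

def parse_field(text, blank_as_missing):
    """Convert one fixed-width field to a number, or None when it is missing."""
    stripped = text.strip()
    if stripped == "":
        return None if blank_as_missing else 0.0
    return float(stripped)

def column_mean(start, stop, blank_as_missing):
    """Mean of one column, skipping values flagged as missing."""
    values = [parse_field(r[start:stop], blank_as_missing) for r in records]
    observed = [v for v in values if v is not None]
    return sum(observed) / len(observed)

# Blanks read as zero: the two empty quiz2 fields count as scores of 0.
mean_blank_as_zero = column_mean(2, 4, blank_as_missing=False)

# Blanks declared missing: only the two observed quiz2 scores are averaged.
mean_blank_as_missing = column_mean(2, 4, blank_as_missing=True)

print(mean_blank_as_zero, mean_blank_as_missing)
```

With these made-up records the first mean is half the second, which is the same pattern as means that look as if they were divided by the total N.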
 Tony Jung posted on Monday, May 15, 2006 - 5:01 am
Thank you for your quick reply. Changing to "Missing = blank" from "Missing = ." helped a lot. The mean estimates from Mplus are identical to what I get from AMOS. However, there are still noticeable differences from the SPSS means. By any chance, do you know why I am getting different mean estimates using SPSS? I believe SPSS is giving me listwise means.

Here are the means from SPSS, Mplus, and AMOS respectively:

From SPSS:

Descriptive Statistics
N Mean
quiz1 91 7.5275
quiz2 78 7.9103
quiz3 76 8.0132
quiz4 79 7.5949
quiz5 93 7.9462
Valid N (listwise) 4


From Mplus:

QUIZ1 7.548
QUIZ2 7.928
QUIZ3 7.861
QUIZ4 7.745
QUIZ5 7.838

From AMOS:

Means: (Group number 1 - Default model)
Estimate S.E. C.R. P Label
QUIZ1 7.548 .244 30.939 ***
QUIZ2 7.928 .167 47.364 ***
QUIZ3 7.861 .242 32.489 ***
QUIZ4 7.745 .241 32.077 ***
QUIZ5 7.838 .181 43.295 ***

Thank you in advance for your help.
 Linda K. Muthen posted on Monday, May 15, 2006 - 5:52 am
I think SPSS uses all possible observations for each mean.
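
The difference between "all possible observations for each mean" and listwise deletion can be sketched in Python with NumPy. The data below are hypothetical, and this is only an illustration of the two definitions, not a reproduction of what SPSS does internally.

```python
import numpy as np

# Hypothetical scores; np.nan marks a missing value.
data = np.array([
    [8.0, 7.0, np.nan],
    [6.0, np.nan, 9.0],
    [9.0, 8.0, 7.0],
    [7.0, 9.0, np.nan],
])

# Available-case means: each column uses every row where that column is observed,
# so each variable can have a different N.
available_case_means = np.nanmean(data, axis=0)
n_per_variable = (~np.isnan(data)).sum(axis=0)

# Listwise means: only rows with no missing value on any column are used.
complete_rows = ~np.isnan(data).any(axis=1)
listwise_means = data[complete_rows].mean(axis=0)

print(n_per_variable, available_case_means)
print(int(complete_rows.sum()), listwise_means)
```

A per-variable N that changes from row to row of a descriptives table, alongside a much smaller listwise N, is the signature of the available-case approach.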
 Jeff Williams posted on Wednesday, March 21, 2007 - 4:06 pm
Drs. Muthen,

I am having a similar problem. I'm doing an EFA with missing data, but the sample means reported in MPlus do not match those reported in SAS, even though the .txt file I used in MPlus was created from the SAS dataset. There is no problem with the .txt file. MPlus reports the same number of observations used as SAS. Furthermore, when I re-import the .txt file into SAS, I get the same means (as I did in SAS originally). For example,

var n mean
1 188 2.77
2 181 2.16
3 188 2.70

1 188 2.91
2 181 2.27
3 188 2.63
 Linda K. Muthen posted on Wednesday, March 21, 2007 - 5:09 pm
You would need to send the SAS output that uses the same data as Mplus, the Mplus input, the data, the Mplus output, and your license number to support@statmodel.com for an explanation of this.
 Linda K. Muthen posted on Thursday, March 22, 2007 - 10:08 am
Thanks for sending the outputs. The means from Mplus are estimated using TYPE=MISSING; with a sample size of 429. The means in SAS have a different number of observations ranging from 181 to 310 depending on the amount of missing data. This is why you see differences.
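
Why a mean estimated from all cases can differ from the mean of the observed values can be seen in a simple special case. For two normally distributed variables where x is complete and y is partly missing, the maximum-likelihood mean of y has a well-known closed form: the complete-case mean of y, shifted by the regression slope times the gap between the full-sample and complete-case means of x. The Python sketch below uses hypothetical data and only illustrates this special case; it is not a description of the algorithm Mplus uses.

```python
import numpy as np

# Hypothetical data: x is complete, y is missing for the last three cases.
x = np.array([1.0, 2.0, 3.0, 4.0, 5.0, 6.0])
y = np.array([2.0, 4.0, 6.0, np.nan, np.nan, np.nan])

observed = ~np.isnan(y)
x_cc, y_cc = x[observed], y[observed]

# Slope of y on x among the complete cases.
slope = np.sum((x_cc - x_cc.mean()) * (y_cc - y_cc.mean())) / np.sum((x_cc - x_cc.mean()) ** 2)

# Available-case mean of y: ignores what x says about the cases with missing y.
available_case_mean_y = y_cc.mean()

# ML mean of y for this missing-data pattern: uses x from all six cases.
ml_mean_y = y_cc.mean() + slope * (x.mean() - x_cc.mean())

print(available_case_mean_y, ml_mean_y)
```

Here the cases with missing y have higher x values, so the all-case estimate of the mean of y is higher than the mean of the observed y values.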
 P Aria posted on Saturday, March 21, 2009 - 3:23 pm
I am running a path analysis with some factors derived from a factor analysis in SPSS. The mean of the factor scores is 0.000 in SPSS and Excel with SE of 1 (normally distributed). However, Mplus reports means ranging from -0.161 to 1.453 with SEs between 0.24 and 0.3.

My dataset has no missing values.

What may have caused this problem?

 Bengt O. Muthen posted on Saturday, March 21, 2009 - 6:12 pm
If the factor is a dependent variable in your path model, Mplus reports its intercept and residual variance, not its mean and variance. You get the mean and variance in TECH4.
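
The distinction between an intercept and a mean can be shown with a one-predictor regression. The Python sketch below uses hypothetical data and ordinary least squares via NumPy rather than Mplus; the variable names x and m are illustrative only.

```python
import numpy as np

# Hypothetical path: m is regressed on x (m = intercept + slope * x + residual).
x = np.array([1.0, 2.0, 3.0, 4.0, 5.0])
m = np.array([2.1, 3.9, 6.2, 7.8, 10.0])

# np.polyfit returns coefficients from highest degree down: slope, then intercept.
slope, intercept = np.polyfit(x, m, 1)

# The mean of the dependent variable combines the intercept with the
# slope times the mean of the predictor.
implied_mean_m = intercept + slope * x.mean()

print(intercept, implied_mean_m, m.mean())
```

The intercept is the expected value of m when x is zero, so it matches the mean of m only when the predictors have mean zero or no effect.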
 P Aria posted on Saturday, March 21, 2009 - 7:10 pm
Thanks for the fast response. The factors are mediators in my model. When I use TECH4, only covariance matrices are reported but not the means. I got the mean and SE using type=basic. Isn't that the right mean/SE from the program?

Also, I suspected that a difference in the number of decimals read by the two programs might explain the differences in means. I have about 15 decimals in my SPSS data. How many decimals does Mplus read?

Many thanks.
 Linda K. Muthen posted on Sunday, March 22, 2009 - 9:49 am
It sounds like you are using an old program where TYPE=MEANSTRUCTURE is not the default.

I do not believe the difference in means has anything to do with the number of decimals. It is most likely for the same reason you had differences with SAS.

If you have further problems of this kind, please send them along with your license number to support@statmodel.com.
 Stefanie App posted on Sunday, March 10, 2013 - 8:52 am
I am running an SEM model with missing data and have done group-mean centering in SPSS, so my new means should be zero overall. When I calculate the means in SPSS they are zero, but when I do this in Mplus the means differ. For example, instead of zero I get 0.004.
When I exclude the cases with missing data entirely, I get the same means in Mplus as in SPSS. So I seem to have a problem with the missing data.
I am concerned that this might affect my model fit.
Any thoughts?
 Linda K. Muthen posted on Sunday, March 10, 2013 - 3:46 pm
In SPSS, your centering is being done using listwise deletion I suspect. I think you should center in Mplus.
 Shraddha Kashyap posted on Tuesday, June 18, 2013 - 2:15 am

I'm having some trouble understanding why I am getting an incorrect mean on one of 14 variables. The mean should be 6.25, but is coming up as 254. All other variables show the correct means (i.e. the same as SPSS), so I assume Mplus is recognising 99 as the missing value code. I also double-checked the .dat data file to make sure there weren't any incorrect values, and everything looks fine. Does anyone know why the last variable might be different?

 Linda K. Muthen posted on Tuesday, June 18, 2013 - 5:54 am
It sounds like you are reading the data incorrectly. Please send the input, data, output, and your license number to support@statmodel.com.
 Tania Bartolo posted on Sunday, October 05, 2014 - 11:14 am
Hi There,

I just had a question regarding ensuring your data transferred correctly from SPSS to Mplus. I compared descriptive statistics using listwise deletion with my data both in SPSS and Mplus and found that the means and correlations matched exactly, but my covariances were slightly different. I was wondering why this would be the case and whether I should be concerned that something transferred incorrectly. They are not drastically different though so I'm wondering if they are calculated slightly differently from one program to the other? Any clarification would be much appreciated.

 Linda K. Muthen posted on Sunday, October 05, 2014 - 11:49 am
Try multiplying the Mplus covariance by n and then dividing by n-1.
 Tania Bartolo posted on Sunday, October 05, 2014 - 12:15 pm
Thanks for the quick reply Linda! It works if I multiply the Mplus covariance by n and divide by n-1 -- I get the SPSS covariance value exactly.

What is the rationale for doing so? Is it merely a calculation difference? And can I then safely assume that the SPSS data was read correctly by Mplus?

 Linda K. Muthen posted on Sunday, October 05, 2014 - 1:38 pm
Yes, you can say the data are read correctly by Mplus. SPSS uses n-1. We use n because that is the correct divisor when the covariance matrix is used for model estimation by maximum likelihood. In large samples, this difference is not noticeable.
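
The rescaling reported above (multiply by n, divide by n-1) converts a covariance computed with divisor n into one computed with divisor n-1. A small Python check with NumPy, using hypothetical data:

```python
import numpy as np

# Hypothetical paired observations.
x = np.array([2.0, 4.0, 6.0, 8.0])
y = np.array([1.0, 3.0, 2.0, 6.0])
n = len(x)

# ddof=0 divides the sum of cross-products by n; ddof=1 divides by n-1.
cov_divide_by_n = np.cov(x, y, ddof=0)[0, 1]
cov_divide_by_n_minus_1 = np.cov(x, y, ddof=1)[0, 1]

# Rescaling the divisor-n covariance recovers the divisor-(n-1) covariance.
rescaled = cov_divide_by_n * n / (n - 1)

print(cov_divide_by_n, cov_divide_by_n_minus_1, rescaled)
```

Means do not involve this divisor, and in a correlation it cancels between the covariance and the standard deviations, which is why only the covariances differ between the two programs.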
 Zahra Bagheri posted on Sunday, October 05, 2014 - 11:32 pm
Dear Sir/Madam

I want to investigate measurement invariance across two groups (the variable is shown by group in the following code) by means of MIMIC in the Mplus 6 software. In addition, I want to control the effect of age and sex. I have some questions about the syntax and the results.

First, sex is a categorical variable in the structural part of the model; how should I specify in the syntax that this variable is categorical?

Second, if I write: physical on group; physical on sex; physical on age; does it mean that the effects of age and sex are also taken into account, or is the code wrong?

Third, in the output two different parts are shown for modification indices, "on statement" and "with statement"; which one should be used?

Fourth, if I want to consider the correlation between two specified items, should I write "q1 with q2" in the Model statement or "q1 with q2 (cov)"?

Would you kindly help me to write the correct syntax?
 Linda K. Muthen posted on Monday, October 06, 2014 - 8:43 am
Covariates can be binary or continuous. In both cases, they are treated as continuous.

See the Topic 1 course handout and video where both the MIMIC and multiple group approach to testing measurement invariance are described.

There is no difference between the two statements except the label (cov) in the second one.
 Louise van Elst posted on Sunday, May 31, 2015 - 2:14 pm
I am new to Mplus and am working with the demo version to see if I'd like to use it for growth mixture modeling. The dataset is large (n=11000) with 26 time points. For the demo version I am only using y0 y1 y2 x t1 t2 t3 with no missings.

I ran a basic analysis of y0 y1 y2 and x and noticed that the means, minimums and maximums of y1 and x differ substantially from my SPSS data. For both variables the minimum is 0 in Mplus, but this is not correct. I opened the data file in Mplus and it has the correct data.

What could be the problem?
 Linda K. Muthen posted on Sunday, May 31, 2015 - 3:12 pm
You may have blanks in the data set which with free format causes the data to be misread. Or the number of variable names may not match the number of columns in the data set.
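
Both causes can be screened for before running Mplus by counting the values on each line of the data file. The following Python sketch is only an illustration; the data lines, variable names, and the helper find_misaligned_lines are hypothetical.

```python
# Hypothetical free-format data lines and the variable names declared for them.
variable_names = ["y0", "y1", "y2", "x"]
lines = [
    "3.1 2.0 4.5 1",
    "2.7  4.1 0",      # a blank where y1 should be: only three values are found
    "3.3 2.2 4.0 1",
]

def find_misaligned_lines(lines, expected_count):
    """Return (line_number, field_count) for lines whose field count is off."""
    problems = []
    for number, line in enumerate(lines, start=1):
        count = len(line.split())  # whitespace-separated values, as in free format
        if count != expected_count:
            problems.append((number, count))
    return problems

problems = find_misaligned_lines(lines, len(variable_names))
print(problems)
```

Any line reported here would shift later values into the wrong variables when read in free format.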