Outliers and model fit
 Christian S posted on Wednesday, September 01, 2010 - 1:47 pm
Dear Drs. Muthen,

I am running a CFA. Multivariate outliers were identified with Mahalanobis D: a case was considered an outlier if OUTMAHAP was 0.001 or below (a relatively conservative approach).

When comparing model fit (chi-square, CFI, etc.), the fit is better with the outliers included than after they are removed.

Is that an indication of a problem? What could be the reason for it?

Thanks in advance.

Best Regards,

Christian
 Linda K. Muthen posted on Wednesday, September 01, 2010 - 4:27 pm
Using outlier detection based on the Mplus loglikelihood outlier detection processes should yield a better fitting model when outliers are removed. I don't think this is necessarily the case with Mahalanobis D.
 Elizabeth Barrett-Cheetham posted on Sunday, April 07, 2013 - 12:10 am
Dear Linda and Bengt,

I am currently trying to improve the fit of my model by removing outliers. I understand that Mplus offers four different ways of detecting outliers.

I have read an Mplus discussion (CFA > Outliers and model fit > 01 Sep, 2010) where Linda suggested that "Using outlier detection based on the Mplus loglikelihood outlier detection processes should yield a better fitting model when outliers are removed. I don't think this is necessarily the case with Mahalanobis D".

Would you suggest that I use the Mahalanobis, loglikelihood, influence, or Cook's D measure to try to improve my model fit? Also, for whichever you suggest, could you please explain the criterion I should use to decide which outlying cases to delete or modify? I have searched the user's guides and the Mplus discussions but can't seem to find anything.

Many thanks for your assistance,
Elizabeth
 Linda K. Muthen posted on Monday, April 08, 2013 - 11:46 am
I would plot the loglikelihood on the y-axis and an important dependent variable on the x-axis and examine the outlier. If you use an IDVARIABLE, you can hold the mouse on the point and see the ID of the outlier. There are references for the other outlier measures in the SAVEDATA command.
 Paulo Alexandre Ferreira Martins posted on Wednesday, October 19, 2016 - 9:11 am
Hello!
I've been trying hard to handle outliers in Mplus, without success.

Even though I detected them in the saved data file (i.e., by looking at the last column, where "outmahap" < 0.001), I still can't figure out how to select these outliers and remove them using Mplus syntax.

I also tried another way: I exported the data to an Excel file, deleted the outliers, and saved the result as a CSV file so that I could run the Mplus syntax on it, but it didn't work.

Thank you!
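One way to drop the flagged cases outside Mplus, sketched in Python under stated assumptions: the file written by SAVEDATA is whitespace-delimited with no header row, and the column order used here (id, y1, y2, outmaha, outmahap) is hypothetical and must be replaced with the order listed in your own Mplus output. The sketch drops cases whose p-value is 0.001 or below, as in the first post, and writes the remaining rows back without a header or quoting, which a spreadsheet export might otherwise add.

```python
import io

# Hypothetical column order; replace it with the order listed in your Mplus output.
COLUMNS = ["id", "y1", "y2", "outmaha", "outmahap"]


def drop_flagged_cases(lines, columns=COLUMNS, p_column="outmahap", cutoff=0.001):
    """Split rows into (kept, dropped); a row is dropped when its p-value <= cutoff."""
    p_index = columns.index(p_column)
    kept, dropped = [], []
    for line in lines:
        fields = line.split()
        if not fields:
            continue  # skip blank lines
        if len(fields) != len(columns):
            raise ValueError(f"expected {len(columns)} fields, got {len(fields)}: {line!r}")
        try:
            p_value = float(fields[p_index])
        except ValueError:
            kept.append(fields)  # non-numeric p-value (e.g. a missing flag): keep the case
            continue
        if p_value <= cutoff:
            dropped.append(fields)
        else:
            kept.append(fields)
    return kept, dropped


def write_rows(rows, handle):
    """Write rows as space-separated text with no header, one case per line."""
    for fields in rows:
        handle.write(" ".join(fields) + "\n")


# Made-up sample standing in for the saved data file.
SAMPLE = """\
1 2.5 3.1 4.20 0.1220
2 9.9 0.2 15.80 0.0004
3 2.7 2.9 1.10 0.5770
"""

kept, dropped = drop_flagged_cases(SAMPLE.splitlines())
out = io.StringIO()  # swap for open("cleaned.dat", "w") to write a real file
write_rows(kept, out)
cleaned_text = out.getvalue()
print("dropped ids:", [row[0] for row in dropped])
```

To analyze the cleaned file, the FILE option of the DATA command would point to it, NAMES would list every column in the file (including the saved outlier columns), and USEVARIABLES would restrict the analysis to the model variables.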
 Linda K. Muthen posted on Wednesday, October 19, 2016 - 3:01 pm
For the Mac, plots are done using R. Go to http://www.statmodel.com/platforms.shtm where there is a link to plot information using R. On the Mac, you cannot left-click on the plot to see the ID and value. You can read the value off of the plot and exclude the person based on the value.
 Paulo Alexandre Ferreira Martins posted on Saturday, October 22, 2016 - 2:29 pm
Hi!
OK, I've already downloaded the HDF5 package, but I still haven't managed to open my gh5 file in RStudio.

In the RStudio script window I can see only the Mplus input and output files, not the newly created file: ....newdata.gh5...


Thank you!
 Paulo Alexandre Ferreira Martins posted on Sunday, October 23, 2016 - 11:38 am
Sorry, but I think that in your mplus.R tutorial the instructions for the R source code, "mplus.R", are written for a Windows user:

"Open R. In Windows, go to Start -> Programs -> R.
Under the File menu, choose the Source R code... option. Browse to the folder with the mplus.R source code...."

Do you happen to know the commands that OS X users need to download mplus.R?
I've been trying with both R and RStudio, without success.

Thank you!
 Linda K. Muthen posted on Monday, October 24, 2016 - 9:32 am
You don’t use R to download mplus.R. Just use your browser to go to our website. Then download mplus.R to your computer. Make sure you save it as plain text. In R, go to the File menu and choose “Source File…”. Locate the mplus.R file you have downloaded. Then click Open.
 Paulo Alexandre Ferreira Martins posted on Thursday, October 27, 2016 - 2:10 pm
Thank you for your kind attention!

I did as you said but, unlike with other files, I still cannot open the gh5 file. I just can't figure out why.

Thank you
 Linda K. Muthen posted on Thursday, October 27, 2016 - 5:22 pm
You will need to send a detailed description of what you are doing and what you are experiencing along with your license number to support@statmodel.com. Send screen shots to make it clear.
 Paulo Alexandre Ferreira Martins posted on Wednesday, November 16, 2016 - 9:31 am
Dear Linda and Bengt Muthen,
I have two questions on which I would very much appreciate your guidance.

In my quest to find multivariate outliers via mplus.R (on a Macintosh), I got as far as the following:

In Mplus I requested Mahalanobis D and, using SPSS and/or RStudio to sort the values, I detected 50 outliers (i.e., OUTMAHAP < 0.001).
However, after deleting these potential outliers and returning to Mplus, the model fit became worse.

1. Following one of your previous suggestions ("I would plot the loglikelihood on the y-axis and an important dependent variable on the x-axis and examine the outlier"), I did the same and plotted the loglikelihood against a relevant DV (i.e., a 2nd factor variable) in a scatterplot using mplus.R in RStudio.

- The problem is that it is not possible to see IDs (or potential outliers) in mplus.R plots on a Mac.
But there must be a way to overcome this problem, right?


2. Finally, following the same procedure, I also plotted INFLUENCE and Cook's D against the same relevant DV.
The graphs showed some cases lying apart from the rest.
As I said, since I'm not able to see IDs in a graph generated by mplus.R on a Mac, I requested the influence and Cook's D values (i.e., with mplus.get.data) and obtained some extreme results. Should cases with these values really be deleted?

Thank you very much for your attention!
 Linda K. Muthen posted on Wednesday, November 16, 2016 - 2:01 pm
You can't see the IDs using the Mac and R plots. You have to try to isolate the outlier based on the score on the variable. The only option is to use the Windows version where you can see the ID.
 Paulo Alexandre Ferreira Martins posted on Sunday, November 20, 2016 - 11:55 am
Right.

As I had already bought the Mac version, I installed only an Mplus demo version on my PC to try to get around this problem.
Is it possible to do this with a demo version? I've already tried, but a warning message pops up because my model has more than 6 DVs and 2 IVs.

On the other hand, regarding your suggestion to "try to isolate [identify] the outlier based on the score on the variables":
What do you mean?


In the Mplus input I saved Mahalanobis D, Influence, Cook's D, and Maximum Loglikelihood.
For Mahalanobis D, I tried to improve my data by deleting cases with p-values < 0.001 (outmahap), but my model didn't improve.
As for the other measures, is there a relevant procedure I can use to identify and isolate outliers?

Thank you very much!
 Linda K. Muthen posted on Sunday, November 20, 2016 - 5:54 pm
You must do a smaller analysis on the Demo. It is limited in the number of variables that can be used.

If you are looking for an outlier for y, you will see in the plot the value of y for the outlier. Exclude cases based on that.
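The suggestion above can be sketched in Python as a lookup: read the approximate y value of the isolated point off the plot, then search the saved data for cases whose y lies within a small tolerance of it. This assumes the ID variable was saved in the same file as y. The column positions, the tolerance, and the sample values are made up for illustration.

```python
def find_cases_near(rows, y_index, target, tolerance, id_index=0):
    """Return the IDs of rows whose value in column y_index is within tolerance of target."""
    matches = []
    for fields in rows:
        try:
            value = float(fields[y_index])
        except ValueError:
            continue  # skip non-numeric entries such as a missing-data flag
        if abs(value - target) <= tolerance:
            matches.append(fields[id_index])
    return matches


# Made-up sample: id, y, loglikelihood contribution.
SAMPLE = """\
101 2.50 -3.10
102 9.90 -12.40
103 2.70 -2.95
104 9.10 -3.30
"""
rows = [line.split() for line in SAMPLE.splitlines() if line.strip()]

# Suppose the isolated point sits at roughly y = 10 on the plot.
candidate_ids = find_cases_near(rows, y_index=1, target=10.0, tolerance=0.25)
print(candidate_ids)
```

If several cases fall inside the tolerance, narrowing it or also matching on the value of the second plotted variable would help single out the point.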
 Paulo Alexandre Ferreira Martins posted on Monday, November 21, 2016 - 11:30 am
As I have a second-order factor with 3 endogenous variables and 17 IVs in my model, I can't run it in the demo version.

As we have already seen, in RStudio with the mplus.R functions we simply can't read the values of y off the plots on a Mac.

Outliers based on Mahalanobis distance are easy to find, simply by looking at "outmahap" (p-values < 0.001). In this case I do not need any plot, just my data file.

But what about the maximum loglikelihood?
Plotting a relevant DV against "outlogl" is of no use in the RStudio app for Mac users.
I am still able to get the values of "outlogl", "outinfl", or "outcook" simply by running mplus.get.data in RStudio.

But which procedure do I need to follow afterwards?

Many Thanks!

PS:
I wonder whether it is possible to resolve this issue with MplusAutomation.
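A possible next step once the saved values are available, sketched in Python rather than taken from the thread: rank the cases by each saved measure and list the most extreme ones with their IDs. It assumes the saved file contains the ID variable together with the outlogl, outinfl, and outcook columns named above, in a hypothetical order. It treats the lowest loglikelihood contributions and the highest influence or Cook's D values as the most extreme. It only lists candidates for inspection, because the thread gives no cutoff for these measures.

```python
# Hypothetical column order; replace it with the order listed in your Mplus output.
COLUMNS = ["id", "y1", "outlogl", "outinfl", "outcook"]


def most_extreme(rows, column, k=3, lowest=False, columns=COLUMNS):
    """Return the k most extreme (id, value) pairs for one saved measure."""
    value_index = columns.index(column)
    id_index = columns.index("id")
    scored = []
    for fields in rows:
        try:
            scored.append((float(fields[value_index]), fields[id_index]))
        except ValueError:
            continue  # skip non-numeric entries such as a missing-data flag
    # Ascending order when the lowest values are of interest, descending otherwise.
    scored.sort(key=lambda pair: pair[0], reverse=not lowest)
    return [(case_id, value) for value, case_id in scored[:k]]


# Made-up sample standing in for the saved data file.
SAMPLE = """\
101 2.50 -3.10 0.02 0.01
102 9.90 -12.40 1.35 0.90
103 2.70 -2.95 0.01 0.02
104 9.10 -3.30 0.40 0.15
"""
rows = [line.split() for line in SAMPLE.splitlines() if line.strip()]

lowest_logl = most_extreme(rows, "outlogl", k=2, lowest=True)
highest_cook = most_extreme(rows, "outcook", k=2)
print(lowest_logl)
print(highest_cook)
```

Cases that appear at the top of more than one list are natural candidates to examine first.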
 Linda K. Muthen posted on Monday, November 21, 2016 - 3:45 pm
If you want to identify the ID of the outlier, you will need to use Windows.