

Dear all, is there any rule of thumb or literature on how large the difference between two BIC or AIC values needs to be in order to be considered substantial? Thanks


I have never heard of any. 


In Krueger et al (J. of Abnormal Psych, v. 111, p. 415), it is written "BIC provides a quantitative index of the extent to which each model maximizes correspondence between the observed and model predicted variances and covariances while minimizing the number of parameters. Better fitting models have more negative values, and the difference in BIC values relates to the posterior odds—the odds ratio formed by taking the probability that the second model is correct, given the data, over the probability that the first model is correct given the data. When comparing models, a difference in BIC of 10 corresponds to the odds being 150:1 that the model with the more negative value is the better fitting model and is considered “very strong” evidence in favor of the model with the more negative BIC value (Raftery, 1995)." 
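The 150:1 figure quoted above can be checked with a short sketch. It assumes the usual approximation that the Bayes factor is exp(difference / 2) for BIC values on the Schwarz scale (-2 logL + k ln n); the function name `bayes_factor` is made up for illustration and is not taken from any of the cited papers.

```python
import math

def bayes_factor(delta_bic):
    """Approximate Bayes factor in favor of the lower-BIC model.

    Assumes delta_bic is a difference of Schwarz-scale BIC values
    (-2 logL + k ln n), so the approximation is exp(delta_bic / 2).
    """
    return math.exp(delta_bic / 2.0)

# A difference of 10 gives roughly 148, which Raftery rounds to 150.
print(round(bayes_factor(10.0), 1))
```

On this scale a difference of 10 gives odds of about 148:1, which is where the rounded "150:1" comes from.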


The Nagin (1999) Psych Methods article has a table of Wasserman's "Bayes factor" values which give guidance related to BIC differences between models. 


Please check this paper for some information: Spiegelhalter DJ, Best NG, Carlin BP and Van der Linde A, "Bayesian Measures of Model Complexity and Fit (with Discussion)", Journal of the Royal Statistical Society, Series B, 2002, 64(4):583-616.


Greetings, This discussion, combined with another one ("selecting the number of classes"), raises a question. In the other discussion, B. Muthén showed that NaginBIC = 2 * MplusBIC. In the 1999 paper, Nagin showed that to obtain the Bayes factor approximation when comparing two models, one should use e^(difference between the two BICs) and then use the Table 2 that Dr. B. Muthén referred to above (Jeffreys' scale of evidence as reported by Wasserman). As an approximation, this means that BIC differences of 2.3026 or higher indicate strong evidence in favour of the model with the highest BIC (with Nagin's BIC, which is negative), since ln(10) = 2.3026. Being no mathematician, my question may appear naive. Given the difference between NaginBIC and MplusBIC, should we still use the formula proposed by Nagin but first divide MplusBIC by 2 (or simply divide the difference between the two BICs by 2)? Or am I missing something obvious? If that is the case, a difference of 4.6052 or higher would indicate strong evidence in favour of the model with the lowest BIC in Mplus.


I think you would divide 2.3026 by 2. The Nagin BIC is 2 times the Mplus BIC. 


Thank you Linda, I hate it when I miss something like that... 


Greetings, Having recovered from my miscalculation, I still have a follow-up. Going back to the Krueger et al. citation above: this paper states that a BIC difference of 10 gives a Bayes factor of 150 (odds of 150:1). This is also what Raftery (1995) reports. According to Nagin's method of e^(BIC difference), a BIC difference of 5 gives the 150:1 ratio. This would be equivalent to an Mplus BIC difference of 2.5. So either Nagin's method does not work, or Raftery uses yet another BIC? Can anybody help with this one?


It may be that Raftery uses another BIC. You should check his work. 


Yes, he seems to use something else. I checked his work and I just can't work out the relationship between his BIC and the Mplus BIC in order to do the conversion... My guess would be to multiply BIC(Mplus) by 4, since a BIC(Mplus) difference of 2.5 = a BIC(Nagin) difference of 5, and a BIC(Nagin) difference of 5 gives (according to Nagin's formula) e^5 = a Bayes factor of 150 (which Raftery equates to a BIC difference of 10 points). But this assumes that Nagin's formula is right, and my deduction is unrelated to the formulas reported in Raftery. I will send you the paper just in case you can easily work out the conversion (the relevant part is pp. 130-135, especially equations 21 and 23). I believe it may be useful to Mplus users other than me.


Got it! For those interested, see my other posting (Friday 25th) under "Selecting the number of classes". If Mplus BIC = 2 times Nagin BIC rather than the reverse, it means that Raftery's table can be directly applied to Mplus BICs. Everybody but Nagin, including Raftery, appears to be using the Schwarz BIC (which is also used in Mplus). According to Raftery, a BIC difference of 10 = a Bayes factor of 150. According to Nagin, a BIC difference of 5 = a Bayes factor of 150 (e^5 ≈ 150). This correspondence between a BIC of -2 logL + r log n and Raftery's tables is also supported in McLachlan & Peel's (2000) book on pages 209-211.
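A small sketch may make the two scales easier to compare. It assumes, as this thread concludes, that Nagin's BIC is logL - 0.5 k ln(n) (negative, larger is better) and that the Schwarz/Mplus BIC is -2 logL + k ln(n) (smaller is better), so one is -2 times the other. The function names and the example BIC values are hypothetical.

```python
import math

def nagin_to_schwarz(nagin_bic):
    # Assumed relation from the thread: Schwarz BIC = -2 * Nagin BIC.
    return -2.0 * nagin_bic

def bf_nagin(nagin_a, nagin_b):
    # Nagin scale: Bayes factor for model A over B is exp(BIC_A - BIC_B).
    return math.exp(nagin_a - nagin_b)

def bf_schwarz(schwarz_a, schwarz_b):
    # Schwarz scale: Bayes factor for model A over B is exp((BIC_B - BIC_A) / 2).
    return math.exp((schwarz_b - schwarz_a) / 2.0)

# Hypothetical models: A is better by 5 Nagin points, i.e. 10 Schwarz points.
nagin_a, nagin_b = -60.0, -65.0
print(bf_nagin(nagin_a, nagin_b))
print(bf_schwarz(nagin_to_schwarz(nagin_a), nagin_to_schwarz(nagin_b)))
```

Both calls return the same Bayes factor (about 148), which is why a Nagin difference of 5 and a Schwarz difference of 10 describe the same evidence.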

Jon Elhai posted on Monday, April 28, 2008 - 12:04 pm



To Alexandre Morin: This clarification on BIC is helpful. Your post the other day got me thinking about this; I'm sure many of us on the listserv feel less confused now. 

Rob Dvorak posted on Friday, July 02, 2010 - 6:54 pm



Just to clarify (since it's been two years since anyone posted here): Mplus uses the Schwarz BIC (the same as Raftery), meaning that the tables from Raftery can be directly applied to Mplus BIC differences... correct? Therefore, M1 with a BIC of 130 would be rejected in favor of M2 with a BIC of 120 (i.e., a Bayes factor of ~150:1), giving a very strong posterior probability that M2 is the preferred model.


I think this is correct. Can you give the Raftery reference? 
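For the M1 = 130 versus M2 = 120 example, the posterior probability can be sketched as follows. This assumes equal prior probabilities for the two models and the exp(difference / 2) approximation on the Schwarz scale; `posterior_prob_better` is a hypothetical helper, not an Mplus function.

```python
import math

def posterior_prob_better(bic_worse, bic_better):
    """Posterior probability of the lower-BIC model, assuming equal priors.

    Uses the approximation BF = exp((bic_worse - bic_better) / 2),
    then converts the odds BF:1 to a probability BF / (1 + BF).
    """
    bf = math.exp((bic_worse - bic_better) / 2.0)
    return bf / (1.0 + bf)

# M1 has BIC 130, M2 has BIC 120 (values from the post above).
print(round(posterior_prob_better(130.0, 120.0), 4))
```

Under these assumptions the posterior probability for M2 comes out at a little over 0.99.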

Rob Dvorak posted on Saturday, July 03, 2010 - 1:39 pm



Hi Linda, Here's the Raftery reference. Raftery, A. E. (1995). Bayesian Model Selection in Social Research. Sociological Methodology, 25, 111-163.

Alex Walker posted on Sunday, December 19, 2010 - 4:02 pm



Are there specific guidelines with respect to the significance of BIC differences? E.g., in LR, is a difference of 4.4 between two models significant? (M1 BIC=71.01, MC BIC=66.61) Thanks, Alex


Here are some BIC citations of interest from Bengt: Wasserman (2000) in J of Math Psych gives a formula (27) which implies that a BIC-related difference between two models is logBij, where B is the Bayes factor for choosing between models i and j. Wasserman's (27) says that logBij is approximately what Mplus calls minus 1/2 BIC. This means that 2log Bij is on the Mplus BIC scale apart from the ignorable sign difference. Kass and Raftery (1995) in J of the Am Stat Assoc give rules of evidence on page 777 for 2log_e Bij which say that >10 is very strong evidence in favor of the model with the largest value. So, to conclude, this says that an Mplus BIC difference > 10 is very strong evidence against the model with the highest Mplus BIC value (I hope I got that right). Raftery has a Soc Meth chapter: Raftery, A. E. (1995). Bayesian Model Selection in Social Research. Sociological Methodology, 25, 111-163, which talks about Bij from a SEM perspective. There's also a good discussion about this here: http://www.statmodel.com/discussion/messages/23/2232.html?1209409498
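As a rough illustration of the Kass and Raftery (1995) rules of evidence, the sketch below maps a Schwarz-scale BIC difference (treated as 2 ln B) to their four grades: 0 to 2, 2 to 6, 6 to 10, and above 10. The function name is hypothetical, and how the exact boundary values are assigned is a choice made here, not something the paper specifies.

```python
def evidence_grade(delta_bic):
    """Kass & Raftery (1995) grade for a BIC difference read as 2 ln B."""
    d = abs(delta_bic)
    if d < 2:
        return "not worth more than a bare mention"
    if d < 6:
        return "positive"
    if d <= 10:
        return "strong"
    return "very strong"

# Alex's example: M1 BIC = 71.01, MC BIC = 66.61, a difference of about 4.4.
print(evidence_grade(71.01 - 66.61))
```

On this reading, the 4.4 difference asked about above falls in the "positive" band in favor of the model with the lower BIC, well short of "very strong".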

Doug posted on Wednesday, October 05, 2011 - 1:13 pm



This sounds interesting. How do you output BIC and AIC values in Mplus? Can they be used with the WLSMV estimator? 


BIC and AIC are not available for weighted least squares estimation. They are available for maximum likelihood estimation. They are printed automatically when available. 
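For readers who want to check the printed values by hand, here is a minimal sketch using the standard textbook definitions, AIC = -2 logL + 2k and BIC = -2 logL + k ln(n). It assumes these match what Mplus prints under maximum likelihood (the thread above indicates Mplus uses the Schwarz BIC); the log-likelihood, parameter count, and sample size below are made-up numbers.

```python
import math

def aic(loglik, k):
    # Akaike information criterion: -2 logL + 2k.
    return -2.0 * loglik + 2.0 * k

def bic(loglik, k, n):
    # Schwarz Bayesian information criterion: -2 logL + k ln(n).
    return -2.0 * loglik + k * math.log(n)

# Hypothetical model: logL = -50, 3 free parameters, n = 100 observations.
print(aic(-50.0, 3))
print(round(bic(-50.0, 3, 100), 3))
```

Both criteria need a log-likelihood, which is consistent with them being unavailable for weighted least squares estimation.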
