Bayesian iterations

Mplus Discussion > Structural Equation Modeling >

Ray Reichenberg posted on Tuesday, April 02, 2013 - 4:48 pm

I have a few questions about specifying the number of MCMC iterations with ESTIMATOR=BAYES.

1) When I use BITERATIONS to specify a min/max number of draws, does that number include what Mplus discards as burn-in? For example, if I use BITERATIONS=(50000) and CHAINS=1, are my posterior distributions going to be based off of 50,000 draws, or 25,000 (i.e., Mplus discards 50%)?

2) Does Mplus always discard half the draws regardless of when the PSR values become acceptably small?

3) If I don't specify the number of iterations, am I to understand that the number of iterations shown in the TECH8 output is the total number (i.e., including those that were discarded), or only those draws beyond those discarded?

4) Finally, when BITERATIONS is not specified, draws are taken until the PSR falls below the desired criterion value and then half of the draws are discarded while the rest constitute the posterior (if I am reading the manual correctly); in that case, aren't some of the draws that the posterior is based off of potentially coming from the MCMC process before the chain(s) have converged at the "true" posterior (assuming we define convergence as the moment the PSR meets the convergence criterion)?

Thank you in advance and sorry for all of the questions.

Bengt O. Muthen posted on Wednesday, April 03, 2013 - 11:47 am

1. BITER = (50000) refers to the minimum number of total iterations, including the discards, so depending on when convergence occurs, more than 25,000 iterations may be used for the posterior distribution. With FBITER = 50000, the posterior is based on 25,000 iterations, that is, the last half.

2. Mplus stops when the PSR based on only the last half of the iterations is low enough. The half mark keeps being moved forward, so that after 1000 iterations, the PSR is based on the last 500 iterations and after 1200 iterations the PSR is based on the last 600 iterations. So if PSR is fulfilled at a certain point, the last half consists of iterations that are deemed converged and those are the iterations that the posterior distribution is based on.

3. TECH8 shows the total number of iterations, including the discards.

4. See the answer in 2.
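
A minimal Python sketch of the moving half mark described in answer 2 may help. It is not Mplus code: the function names are hypothetical, and the PSR formula is a textbook-style simplification (square root of (W + B) / W, with W the mean within-chain variance and B the variance of the chain means) assumed here for illustration only. Mplus may compute PSR differently in detail.

```python
from statistics import mean, variance


def last_half(total_iterations):
    """Return the 1-based (first, last) iterations that form the last half."""
    kept = total_iterations // 2
    return total_iterations - kept + 1, total_iterations


def psr_last_half(chains):
    """Simplified PSR computed only on the last half of each chain.

    Assumption: PSR = sqrt((W + B) / W). Needs at least two chains of
    equal length, with some within-chain variation in the kept half.
    """
    first, last = last_half(len(chains[0]))
    kept = [chain[first - 1:last] for chain in chains]
    w = mean(variance(c) for c in kept)   # mean within-chain variance
    b = variance([mean(c) for c in kept])  # variance of the chain means
    return ((w + b) / w) ** 0.5


# The half mark moves forward as iterations accumulate.
print(last_half(1000))  # (501, 1000): PSR uses the last 500 iterations
print(last_half(1200))  # (601, 1200): PSR uses the last 600 iterations
```

Because the PSR is always computed on the current last half, the iterations that pass the check are the same iterations that end up in the posterior.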

Ray Reichenberg posted on Wednesday, April 03, 2013 - 4:34 pm

Thank you for the clarification. Is there a way to get Mplus to print the number of iterations on the posterior graph outputs similar to what Win/OpenBugs does? I think that would have alleviated some of my confusion.

Bengt O. Muthen posted on Wednesday, April 03, 2013 - 4:44 pm

You see the number of iterations in the trace plot.

Ray Reichenberg posted on Thursday, April 04, 2013 - 11:45 am

Good point. Just so I'm clear, the region of the trace plot to the left of the vertical bar (the black region) contains the draws that were discarded and, thus, are not included in the posterior. That would make the total number of draws in the posterior = (total length of trace plot - black region of trace plot) * # of chains?

Bengt O. Muthen posted on Thursday, April 04, 2013 - 12:11 pm

Right. That is the same as the total number of iterations TECH8 reports at PSR convergence, divided by 2 and multiplied by the number of chains. So using the default of 2 chains (which I always use together with PROCESSORS = 2), TECH8 actually gives you the right number.

Hope I got that right.

Ray Reichenberg posted on Thursday, April 04, 2013 - 12:35 pm

Now I am a bit more confused. Dividing the total iterations shown in TECH8 by 2 implies that half of the draws are discarded as burn-in. Item #1 from your first response to this thread gave me the impression that it is possible to have more than half of the draws from a chain included in the posterior when a) BITER is used to specify a minimum, and b) the total computational cycles needed to reach convergence is less than the specified minimum. For example, if I were to specify BITER=(2000) and a PSR value of, say, 1.001 was returned after the 100th iteration (assuming one chain), 1,900 additional draws would be taken to meet the specified minimum (correct?) and my posterior would be comprised of 50 of those first 100 iterations (the other 50 discarded as burn-in) plus the additional 1,900, for a total of 1,950. In that case, dividing the number shown in TECH8 by 2 would yield the incorrect number of total draws in the posterior. Oddly, no matter how many iterations I specify using BITER, the trace plots always seem to indicate that half of the total iterations were discarded.

Bengt O. Muthen posted on Thursday, April 04, 2013 - 1:16 pm

If you specify BITER = (2000), PSR is ignored until you get to 2000 iterations. PSR may well be too high at 2000 iterations and therefore more iterations are done. Say that PSR is good at 2200 iterations. That means that the last 1100 iterations are used to compute this acceptable PSR and that 1100 iterations are used in the posterior (from each chain).

So those 50 iterations you mention would not be used for the posterior.

In my example, TECH8 would show 2200 iterations. You divide by 2 to get rid of the burn-in part. If you multiply by 2 chains, this is also how many draws are used for the posterior.
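
The arithmetic in this example can be written out as a short Python sketch. The function name is hypothetical and the code only restates the rule given above (halve the TECH8 total, then multiply by the number of chains); it is not part of Mplus.

```python
def posterior_draws(tech8_iterations, chains=2):
    """Return (draws kept per chain, draws kept in total).

    The first half of each chain is discarded as burn-in, so each chain
    contributes half of the iterations that TECH8 reports.
    """
    per_chain = tech8_iterations // 2
    return per_chain, per_chain * chains


per_chain, total = posterior_draws(2200, chains=2)
print(per_chain, total)  # 1100 2200
```

With the default of 2 chains the total equals the TECH8 number itself, which is why TECH8 "gives you the right number" in that case.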

Ray Reichenberg posted on Thursday, April 04, 2013 - 3:00 pm

Ah. Things are becoming clearer now. So if, in your example, I were to specify BITER=(2000) with a model that is computationally simple such that convergence isn't really an issue, then the software would run for 2,000 iterations without monitoring PSR. At the 2,000th iteration, PSR would be calculated using either the last 1,000 iterations from each chain (#chains>1) or by splitting the last 1,000 iterations (#chains=1). If that PSR value is acceptable, then no additional draws are taken, 1,000 draws from each chain are discarded as burn-in, and the posterior(s) are defined by the (1,000 * #chains) draws, correct? If that is the case, then are the following three statements true: 1) if BITER is used, the posterior distribution(s) will NEVER contain LESS THAN (BITER/2 * #chains) draws, 2) if BITER is used and convergence theoretically occurs before the specified minimum number of draws, then the posteriors will ALWAYS contain EXACTLY (BITER/2 * #chains) draws, and 3) the commands BITER=(5000) 5000; and FBITER=5000; are equivalent? Thank you for all of your patience and assistance.

Bengt O. Muthen posted on Thursday, April 04, 2013 - 3:29 pm

I believe all your statements are correct.
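
The three statements can be checked against a toy version of the stopping rule. This is a rough Python sketch under stated assumptions, not a description of Mplus internals: `psr_at` is a hypothetical stand-in for whatever PSR the sampler would report at a given iteration, and the `criterion`, `check_every`, and `max_iterations` defaults are illustrative values chosen here.

```python
def stop_iteration(min_iterations, psr_at, criterion=1.05,
                   check_every=100, max_iterations=50000):
    """Return (iteration at which sampling stops, whether PSR was met).

    PSR is ignored before min_iterations, then checked every
    check_every iterations until it is met or the maximum is reached.
    """
    t = min_iterations
    while True:
        if psr_at(t) < criterion:
            return t, True
        if t + check_every > max_iterations:
            return t, False
        t += check_every


def easy_model(t):
    return 1.0  # hypothetical model that has converged long before the minimum


def slow_model(t):
    return 1.20 if t < 2200 else 1.01  # hypothetical: PSR is fine from 2200 on


chains = 2
for model in (easy_model, slow_model):
    stopped, converged = stop_iteration(2000, model)
    print(stopped, converged, (stopped // 2) * chains)

# Minimum equal to maximum: the run always ends at 5000 iterations.
print(stop_iteration(5000, slow_model, max_iterations=5000))
```

In this sketch the easy model stops exactly at the minimum and keeps (2000/2 * 2) draws, the slow model stops later and keeps more, and setting the minimum equal to the maximum fixes the run length, which is the sense in which statement 3 compares it to FBITER.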

Ray Reichenberg posted on Thursday, April 04, 2013 - 4:31 pm

Thank you and, again, thank you for your assistance.

Lindsay Bell posted on Friday, March 21, 2014 - 8:27 am

Hello -

I am running a multiple imputation model and specified in the IMPUTATION command:

NDATASETS = 150;
THIN = 500;

Generating 150 datasets separated by 500 draws each would require 75,000 iterations, correct? But my TECH8 output indicates that there were only 19,200 iterations. Can you explain the discrepancy?

Also, I am using Mplus on a Mac, so I am using the R code to view the graphs, but I am only able to view the autocorrelations and autocorrelation plots for the first 30 parameters (I have 621). Is there something I need to specify when generating the plots in order to get autocorrelations for more than the first 30 parameters?

Thank you,
Lindsay

Tihomir Asparouhov posted on Monday, March 24, 2014 - 9:25 am

> Generating 150 datasets separated by 500 draws each would require 75,000 iterations, correct?

Yes

> But my TECH8 output indicates that there were only 19,200 iterations. Can you explain the discrepancy?

This is the number of iterations until model convergence. Mplus first estimates an unrestricted model and then performs the imputations based on that unrestricted model.
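
As a small worked example of the two counts in this exchange, the Python sketch below keeps them separate. The variable names are hypothetical; the numbers are the ones quoted in the question.

```python
ndatasets = 150   # NDATASETS from the question
thin = 500        # THIN from the question

# Draws needed to produce the imputed data sets, 500 draws apart.
imputation_iterations = ndatasets * thin

# Iterations TECH8 reported, i.e. those needed for model convergence.
tech8_iterations = 19200

print(imputation_iterations)  # 75000
print(tech8_iterations)       # 19200
```

The two numbers answer different questions, so there is no reason to expect them to match.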

> Is there something I need to specify when generating the plots in order to get autocorrelations for more than the first 30 parameters?

If you use the THIN option of the ANALYSIS command you can get autocorrelations at larger lags. For example, if you use THIN=10, the autocorrelations you will get are at lags 10, 20, 30, ..., 300 (that is, the usual lags multiplied by 10).
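
A short Python sketch of how thinning stretches the lags, assuming (from the 10, 20, ..., 300 example above) that 30 lags are reported. The function name and the `n_lags` default are hypothetical.

```python
def reported_lags(thin, n_lags=30):
    """Lags, in original iterations, covered after keeping every thin-th draw."""
    return [thin * k for k in range(1, n_lags + 1)]


lags = reported_lags(10)
print(lags[:3], lags[-1])  # [10, 20, 30] 300
```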

Yoonjeong Kang posted on Wednesday, April 09, 2014 - 2:05 pm

Hello.

The Mplus User's Guide (Version 7) says:

" The FBITERATIONS option is used to specify a fixed number of
iterations for each Markov chain Monte Carlo (MCMC) chain when the
potential scale reduction (PSR) convergence criterion (Gelman & Rubin,
1992) is not used. There is no default. When using this option, it is
important to use other means to determine convergence."

My question is: how can I check convergence if I use the FBITERATIONS option?

Bengt O. Muthen posted on Wednesday, April 09, 2014 - 4:38 pm

I would use BITERATIONS = (x) option instead, where x is the minimum number of iterations that you want, but keeping the PSR criterion as the stopping mechanism. Once PSR has indicated convergence, you may want to run again with x set to twice as high as the number of iterations at which it stopped. I am saying this because PSR may prematurely indicate convergence in that it can "bounce" (just like a ball thrown out on the ground, the ground being PSR=1, with smaller and smaller bounces as the iterations progress).
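
The "bounce" check and the rerun rule above can be sketched in Python as follows. This is only an illustration: the function names are hypothetical, the PSR trace is made-up input typed in by hand (for instance, values read off TECH8), and the criterion of 1.05 is an assumed value.

```python
def first_convergence(psr_trace, criterion):
    """Return the first iteration whose PSR is below the criterion, or None."""
    for iteration, psr in psr_trace:
        if psr < criterion:
            return iteration
    return None


def bounced(psr_trace, criterion):
    """True if PSR rises back to or above the criterion after first meeting it."""
    first = first_convergence(psr_trace, criterion)
    if first is None:
        return False
    return any(psr >= criterion for it, psr in psr_trace if it > first)


def suggested_minimum(stopped_at):
    """Minimum for the rerun: twice the iteration where the first run stopped."""
    return 2 * stopped_at


# Made-up (iteration, PSR) pairs for illustration only.
trace = [(100, 1.30), (200, 1.04), (300, 1.12), (400, 1.03)]
print(first_convergence(trace, 1.05))  # 200
print(bounced(trace, 1.05))            # True
print(suggested_minimum(200))          # 400
```

In this made-up trace the PSR dips below the criterion at 200 iterations and then rises again, which is the situation the longer rerun is meant to catch.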

Yoonjeong Kang posted on Thursday, April 10, 2014 - 7:58 am

Thanks for your answer!!

Justin posted on Thursday, June 01, 2017 - 11:43 am

Hello,

I am interested in obtaining plausible values for factor scores. I know you can determine the number of posterior draws for the plausible values with SAVE=FSCORES (#). Many examples in the user's guide use 20 iterations for plausible values. Is there an optimal number or range of posterior iterations for obtaining trustworthy/stable plausible values?

Thank you,
Justin

Bengt O. Muthen posted on Thursday, June 01, 2017 - 6:54 pm

No optimal range but 100 might be called for. Try different values and see how results change.