I have a few questions about specifying the number of MCMC iterations with ESTIMATOR=BAYES.
1) When I use BITERATIONS to specify a min/max number of draws, does that number include what Mplus discards as burn-in? For example, if I use BITERATIONS=(50000) and CHAINS=1, are my posterior distributions going to be based off of 50,000 draws, or 25,000 (i.e., Mplus discards 50%)?
2) Does Mplus always discard half the draws regardless of when the PSR values become acceptably small?
3) If I don't specify the number of iterations, am I to understand that the number of iterations shown in the TECH8 output is the total number (i.e., including those that were discarded), or only those draws beyond those discarded?
4) Finally, when BITERATIONS is not specified, draws are taken until the PSR falls below the desired criterion value and then half of the draws are discarded while the rest constitute the posterior (if I am reading the manual correctly); in that case, aren't some of the draws that the posterior is based off of potentially coming from the MCMC process before the chain(s) have converged at the "true" posterior (assuming we define convergence as the moment the PSR meets the convergence criterion)?
Thank you in advance and sorry for all of the questions.
1. BITER = (50000) refers to the minimum number of total iterations, including the discards, so depending on when convergence occurs, more than 25,000 iterations may be used for the posterior distribution. With FBITER = 50000, the posterior is based on 25,000 iterations, that is, the last half.
2. Mplus stops when the PSR based on only the last half of the iterations is low enough. The half mark keeps being moved forward, so that after 1000 iterations the PSR is based on the last 500 iterations, and after 1200 iterations the PSR is based on the last 600 iterations. So if the PSR criterion is met at a certain point, the last half consists of iterations that are deemed converged, and those are the iterations that the posterior distribution is based on.
3. TECH8 shows the total number of iterations, including the discards.
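The moving half-mark rule described in the answers above can be sketched in a few lines of Python. This is only an illustration, not Mplus's implementation: the function names, the check interval of 100 iterations, the 1.05 threshold, and the textbook Gelman-Rubin form of the PSR are all assumptions made for the sketch, and it needs at least two chains.

```python
import math
import random
import statistics


def psr(windows):
    """Textbook Gelman-Rubin PSR over equal-length windows (one per chain).

    Assumption: the PSR formula Mplus uses may differ in detail.
    """
    n = len(windows[0])
    within = statistics.fmean([statistics.variance(w) for w in windows])
    between = statistics.variance([statistics.fmean(w) for w in windows])
    return math.sqrt(((n - 1) / n * within + between) / within)


def run_until_converged(step, n_chains=2, min_iter=0, max_iter=5000,
                        check_every=100, threshold=1.05):
    """Draw until the PSR of the LAST HALF of every chain is below threshold.

    step(chain_index, t) is a hypothetical sampler returning one draw.
    PSR is not checked before min_iter, mirroring a BITER minimum.
    Returns (total iterations per chain, retained last halves).
    """
    chains = [[] for _ in range(n_chains)]
    for t in range(1, max_iter + 1):
        for c, chain in enumerate(chains):
            chain.append(step(c, t))
        if t >= min_iter and t % check_every == 0:
            # The half mark moves forward as t grows.
            kept = [chain[t // 2:] for chain in chains]
            if psr(kept) < threshold:
                return t, kept
    return max_iter, [chain[max_iter // 2:] for chain in chains]


def demo_step(c, t):
    """Toy sampler: chains start apart and drift to a common distribution."""
    return random.gauss(0.0, 1.0) + 5.0 * c * math.exp(-t / 200)


random.seed(1)
total, kept = run_until_converged(demo_step, min_iter=2000)
print(total, [len(k) for k in kept])
```

Whatever iteration the loop stops at, each chain contributes exactly the second half of its draws, which is why the retained draws are the same ones that produced the acceptable PSR.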
Thank you for the clarification. Is there a way to get Mplus to print the number of iterations on the posterior graph outputs, similar to what WinBUGS/OpenBUGS does? I think that would have alleviated some of my confusion.
Good point.
Just so I'm clear, the region of the trace plot to the left of the vertical bar (the black region) contains the draws that were discarded and, thus, are not included in the posterior. Would that make the total number of draws in the posterior = (total length of trace plot - black region of trace plot) * # of chains?
Right. That is the same as the total number of iterations TECH8 reports at PSR convergence, divided by 2 and multiplied by the number of chains. So with the default of 2 chains (which I always use together with PROCESSORS = 2), the iteration count in TECH8 is itself the number of draws in the posterior.
Now I am a bit more confused. Dividing the total iterations shown in TECH8 by 2 implies that half of the draws are discarded as burn-in. Item #1 from your first response to this thread gave me the impression that it is possible to have more than half of the draws from a chain included in the posterior when a) BITER is used to specify a minimum, and b) the total number of iterations needed to reach convergence is less than the specified minimum. For example, if I were to specify BITER=(2000) and a PSR value of, say, 1.001 was returned after the 100th iteration (assuming one chain), 1,900 additional draws would be taken to meet the specified minimum (correct?) and my posterior would consist of 50 of those first 100 iterations (50 discarded as burn-in) plus the additional 1,900, for a total of 1,950. In that case, dividing the number shown in TECH8 by 2 would yield the incorrect number of total draws in the posterior. Oddly, no matter how many iterations I specify using BITER, the trace plots always seem to indicate that half of the total iterations were discarded.
If you specify BITER = (2000), PSR is ignored until you get to 2000 iterations. PSR may well be too high at 2000 iterations, in which case more iterations are done. Say that PSR is good at 2200 iterations. That means that the last 1100 iterations are used to compute this acceptable PSR and that those 1100 iterations are used in the posterior (from each chain).
So those 50 iterations you mention would not be used for the posterior.
In my example, TECH8 would show 2200 iterations. You divide by 2 to get rid of the burn-in part. If you multiply by 2 chains, this is also how many draws are used for the posterior.
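The arithmetic in this reply can be written out as a small helper. The function name `posterior_draws` is hypothetical, and rounding down for an odd iteration count is an assumption; the thread only discusses even totals.

```python
def posterior_draws(tech8_iterations: int, chains: int = 2) -> int:
    """Number of draws behind the posterior, given the TECH8 iteration total.

    The first half of each chain is discarded as burn-in, so each chain
    contributes the last half. Integer division for odd totals is an
    assumption of this sketch.
    """
    per_chain = tech8_iterations // 2
    return per_chain * chains


print(posterior_draws(2200, chains=2))   # 2200: 1100 from each of 2 chains
print(posterior_draws(2200, chains=1))   # 1100
print(posterior_draws(50000, chains=1))  # 25000, the FBITER = 50000 case
```

With the default of 2 chains, the halving and the doubling cancel, which is why the TECH8 iteration count coincides with the number of posterior draws.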
Ah. Things are becoming clearer now. So if, in your example, I were to specify BITER=(2000) with a model that is computationally simple such that convergence isn't really an issue, then the software would run for 2,000 iterations without monitoring PSR. At the 2,000th iteration, PSR would be calculated using either the last 1,000 iterations from each chain (#chains>1) or by splitting the last 1,000 iterations (#chains=1). If that PSR value is acceptable, then no additional draws are taken, the first 1,000 draws from each chain are discarded as burn-in, and the posterior(s) are defined by the remaining (1,000 * #chains) draws, correct? If that is the case, then are the following three statements true: 1) if BITER is used, the posterior distribution(s) will NEVER contain LESS THAN (BITER/2 * #chains) draws, 2) if BITER is used and convergence theoretically occurs before the specified minimum number of draws, then the posteriors will ALWAYS contain EXACTLY (BITER/2 * #chains) draws, and 3) the commands BITER=(5000) 5000; and FBITER=5000; are equivalent? Thank you for all of your patience and assistance.