Mplus Discussion >> Bayesian non-informative priors?

Bayesian non-informative priors?

Mplus Discussion > Dynamic Structural Equation Modeling >

Fredrik Falkenström posted on Sunday, March 01, 2020 - 12:09 pm

This question is not specifically about DSEM, but about a Random Intercept Cross-Lagged Panel model estimated by Type = Bayesian and the default Mplus priors. We got a response from a journal editor saying the following: "many prominent Bayesian analysts (e.g., Andrew Gelman and the development of Stan, www.mc-stan.org/) have moved away from recommending uninformative priors in general, and Inverse-Wishart priors in particular (one of the only options for covariance matrices in Mplus). This has occurred because such priors actually do seem to influence estimation, especially in small-variance hierarchical models with small to medium samples." Would you say that this is true in the context of a RI-CLPM?

Best,

Fredrik Falkenström

Tihomir Asparouhov posted on Monday, March 02, 2020 - 11:55 am

In small to medium samples, priors influence the results, whatever the prior is. For between-level parameters in particular, the priors have influence when the number of clusters is small. It is a good idea to:

a. Conduct a prior sensitivity study.
b. Replace random effects with fixed effects when the variances are small and not significant.
c. Standardize variables, as that simplifies sensitivity analysis and removes unnecessary variable scales.
d. If possible, run ML estimation and use the results to construct reasonable weakly informative priors.
e. Conduct simulation studies to evaluate the performance of different prior strategies.
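
As a rough illustration of the prior sensitivity idea in (a), the sketch below uses the textbook conjugate update for a single variance with a known mean of zero, not Mplus itself. The three inverse-gamma priors, the true variance of 0.05, and the function names are assumptions made for the example. It treats the random effects as if they were observed, which a real model does not, so it only shows the direction of the effect: the spread of posterior means across priors shrinks as the number of clusters grows.

```python
# Illustrative only: conjugate update for a variance with a known mean of 0.
# A prior IG(a, b) has density proportional to x**(-a - 1) * exp(-b / x),
# so a = -1, b = 0 is flat on the variance (and improper).

def ig_posterior_mean(a, b, n, sum_sq):
    """Posterior mean of a variance under an IG(a, b) prior.

    n is the number of clusters; sum_sq is the sum of squared mean-zero
    random effects, treated here as if they were observed directly.
    """
    a_post = a + n / 2.0
    b_post = b + sum_sq / 2.0
    if a_post <= 1.0:
        raise ValueError("posterior mean undefined: need a + n/2 > 1")
    return b_post / (a_post - 1.0)

# Hypothetical priors chosen only for this comparison.
PRIORS = {
    "flat IG(-1, 0)": (-1.0, 0.0),
    "IG(0.001, 0.001)": (0.001, 0.001),
    "IG(1, 1)": (1.0, 1.0),
}

def sensitivity(n, true_var=0.05):
    """Posterior mean under each prior, for n clusters."""
    # Simplification: set the sum of squares to its expectation, n * true_var.
    sum_sq = n * true_var
    return {name: ig_posterior_mean(a, b, n, sum_sq)
            for name, (a, b) in PRIORS.items()}

def spread(results):
    """Range of posterior means across the priors."""
    return max(results.values()) - min(results.values())

for n in (10, 159):
    res = sensitivity(n)
    print(n, {k: round(v, 4) for k, v in res.items()},
          "spread:", round(spread(res), 4))
```

In an actual analysis the same comparison would be made by refitting the model under each prior and comparing the posterior summaries of the parameters of interest.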

Fredrik Falkenström posted on Monday, March 02, 2020 - 12:19 pm

Thanks, but I think the issue here was rather whether it is better to use weakly informative priors than the Mplus default "improper" priors with infinite variances? The sample size here was not extremely small (N = 159).

Tihomir Asparouhov posted on Monday, March 02, 2020 - 2:45 pm

Absolutely. Weakly informative priors, chosen with proper consideration, will almost surely improve the estimation when the sample size is small. I am not sure what you mean by N = 159. Again, the number of clusters is what is important for the between-level parameters, not the total sample size.

Fredrik Falkenström posted on Monday, March 02, 2020 - 11:38 pm

Thanks. N = 159 is the number of clusters. The model is not estimated as a multilevel model in Mplus; it is the RI-CLPM, so it is estimated in wide format, but it is of course similar to a 2-level model.

Tihomir Asparouhov posted on Tuesday, March 03, 2020 - 8:25 am

I see. Two things come to my mind. First, keep an eye on the identifiability part of the model, meaning how wide the standard errors are for particular parameters, because that can be somewhat entangled with the whole story about priors and small sample size. To give you an example, take a parameter estimated at 0.1 with a standard error of 3. If you change the prior and get an estimate of 0.2 with a standard error of 3, that doesn't necessarily mean that there is a lot of dependence on the prior. It is more a reflection that the parameter is a bit hard to identify than anything else, and if you ran the model with an infinite number of MCMC iterations (which is more of a theoretical concept) you might find that the estimates are closer even with different priors. This is why it is important to trim the model when parameters are not significant.
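
One way to read the example above is to express the change between two prior settings in units of the posterior standard deviation. The helper below is a hypothetical convenience function, not an Mplus feature. The 0.1, 0.2 and 3 come from the example, while the second call uses a made-up, well-identified parameter for contrast.

```python
def shift_in_sd_units(est_a, est_b, posterior_sd):
    """Absolute difference between two estimates, in posterior-SD units."""
    if posterior_sd <= 0:
        raise ValueError("posterior_sd must be positive")
    return abs(est_a - est_b) / posterior_sd

# Numbers from the example: estimates 0.1 and 0.2, standard error 3.
weakly_identified = shift_in_sd_units(0.1, 0.2, 3.0)

# Hypothetical contrast: the same raw shift with a standard error of 0.02.
well_identified = shift_in_sd_units(0.1, 0.2, 0.02)

print(round(weakly_identified, 3), round(well_identified, 3))
```

A shift that is a small fraction of the posterior SD is hard to distinguish from estimation noise, whereas a shift of several SDs for a tightly estimated parameter would point to real prior dependence.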

Second, you might find this paper useful
Browne, W. J., and Draper, D. (2006). A comparison of Bayesian and
likelihood-based methods for fitting multilevel models. Bayesian Analysis, 1, 473-514.

Fredrik Falkenström posted on Wednesday, March 18, 2020 - 7:51 am

I did a few simulation studies to explore this, and found that with small samples and missing data I get large coefficient bias (around 30%) when I estimate variances for the observed variables in the RI-CLPM. The model is too large to post here, but essentially it separates between- and within-person variance using latent variables. Usually the variances of the observed variables are constrained to 0, but then the Bayes estimator doesn't converge. I've previously found that estimating these variances (constrained to equality over time) makes convergence much easier, but now I see that this is problematic with small samples.

Do you think it is possible to get this to work if I put informative priors on some parameters, and if so which ones would be most important?
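
For readers reproducing this kind of simulation check, relative bias is usually computed as the mean estimate across replications minus the true value, divided by the true value. The sketch below assumes that definition. The replication estimates and the true coefficient are made-up numbers, not results from the study described above.

```python
import statistics

def relative_bias(estimates, true_value):
    """(mean of estimates - true value) / true value."""
    if true_value == 0:
        raise ValueError("relative bias is undefined for a true value of 0")
    return (statistics.fmean(estimates) - true_value) / true_value

# Hypothetical estimates of one cross-lagged coefficient over 4 replications.
replication_estimates = [0.26, 0.27, 0.25, 0.26]
true_coefficient = 0.20

bias = relative_bias(replication_estimates, true_coefficient)
print(f"relative bias: {bias:.0%}")
```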

Tihomir Asparouhov posted on Wednesday, March 18, 2020 - 12:49 pm

I would not constrain the variances of the observed variables to 0; you can use 0.01 instead.

Fredrik Falkenström posted on Thursday, March 19, 2020 - 6:15 am

Thank you very much for your response! It seems to work if I use a larger value, e.g. 0.5 or 1, but I don't get convergence with 0.01. Was your suggestion to use 0.01 based on the assumption that observed variables are standardised before analysis (variance = 1)? In my case, the observed variables have variances around 30-40. Would it then make sense to fix the variances at 0.3-0.4 (i.e. 0.01 * 30 or 0.01 * 40)?

Tihomir Asparouhov posted on Thursday, March 19, 2020 - 10:02 am

Yes, 0.01 is mostly for standardized or approximately standardized variables and would be adjusted higher for larger variances. Alternatively, you can divide the variables by 5 to reduce the variance of the observed variables.
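
The arithmetic behind both options can be checked quickly. Dividing a variable by 5 divides its variance by 25, so variances of 30-40 become 1.2-1.6, and scaling the fixed value to 1% of the observed variance gives 0.3-0.4 on the original scale. The sketch below uses made-up scores and hypothetical helper names purely to show the two computations.

```python
import statistics

def rescale(values, divisor):
    """Divide every observation by a constant."""
    return [v / divisor for v in values]

def small_fixed_variance(values, fraction=0.01):
    """A small fixed variance set to a fraction of the sample variance."""
    return fraction * statistics.variance(values)

# Hypothetical raw scores, used only to demonstrate the scaling rule.
raw = [12.0, 25.0, 18.0, 30.0, 21.0, 15.0, 27.0, 9.0]

raw_var = statistics.variance(raw)
scaled_var = statistics.variance(rescale(raw, 5.0))

# Dividing by 5 shrinks the variance by a factor of 5**2 = 25.
print(round(raw_var, 3), round(scaled_var, 3), round(raw_var / scaled_var, 3))

# Option 1: keep the original scale and fix the variance at 1% of it.
print(round(small_fixed_variance(raw), 4))

# Option 2: rescale first, then 0.01 is closer to 1% of the variance.
print(round(small_fixed_variance(rescale(raw, 5.0)), 4))
```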

Fredrik Falkenström posted on Thursday, March 19, 2020 - 1:33 pm

Great, thanks!