I have a model involving 16 observed variables (3 latents & 4 categorical measures) and 75 freely estimated parameters.
I am playing around with the BAYES estimator to try to get a better understanding of this alternative estimation method.
The posterior predictive p-value for this model is < .001 and the 95% credible interval for the difference in chi-square values is (37.790, 149.462), each of which indicates that the hypothesized model does not fit the data well.
I was wondering if it is meaningful under the Bayesian approach to consider something like a standardized root mean square residual as a supplementary measure of global fit.
Following "Bayesian Analysis In Mplus: A Brief Introduction" (Version 3): for a CFA with the BAYES estimator, is it still appropriate to use ML for categorical DVs? For instance, should the syntax still look like the following: ANALYSIS: ESTIMATOR = BAYES; PROCESS = 2; FBITER = 20000; STVAL = ML;
I just attended your workshop at UConn. Very informative. The Bayesian estimation works nicely and is quite fast. One question: Is there some way to save the chains of values for the posterior distributions of the parameters? I would like to be able to use them to make some statements regarding the probability that an effect of a specified magnitude, or greater, exists. Thanks, Chuck Green
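Once the posterior draws for a parameter are available outside Mplus, the kind of statement asked about here is just a proportion of draws. A minimal sketch, assuming the draws have already been read into a plain Python list (the function name and the simulated values are hypothetical stand-ins, not Mplus output):

```python
import random


def prob_effect_at_least(draws, magnitude):
    """Share of posterior draws whose value is >= magnitude."""
    draws = list(draws)
    if not draws:
        raise ValueError("need at least one posterior draw")
    return sum(d >= magnitude for d in draws) / len(draws)


# Simulated stand-in for saved post-burn-in draws of one parameter
# (hypothetical values; real draws would come from the saved chains).
rng = random.Random(1)
beta_draws = [rng.gauss(0.30, 0.10) for _ in range(20000)]

# Estimated posterior probability that the effect is 0.20 or larger.
print(prob_effect_at_least(beta_draws, 0.20))
```

The same function works for a negative effect by flipping the sign of both the draws and the magnitude.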
I am running a conditional latent growth model using Bayes as the estimator. I have asked for tech4 to get the mean intercept and slope. Is there a way to request or calculate a credibility interval for the mean intercept and slope? Thanks!
Thank you - that is very helpful. One follow-up question: Shouldn't the estimate I get from MODEL CONSTRAINT be the same as the one from TECH4? Mine is not. I have both continuous and dichotomous predictors, and I am wondering if there is something wrong with how I am asking for the intercept estimate...
I think the problem is that you are putting the binary covariates on the CATEGORICAL list and referring to their thresholds. The CATEGORICAL list is for dependent variables. You should remove them from the CATEGORICAL list and refer to their means not their thresholds.
I am running a small sample (N=79) SEM model and using the Bayes estimator. What kinds of fit statistics are customary in the literature for Bayes models? I noticed that the AIC and BIC are not generated with the ESTIMATOR = BAYES command. I should note that I have no missing data. Thanks!
See the following paper which is available on the website:
Muthén, B. (2010). Bayesian analysis in Mplus: A brief introduction. Technical Report. Version 3.
Jan Zirk posted on Wednesday, April 25, 2012 - 8:36 am
Dear Linda or Bengt, I have three questions concerning the Bayes estimator.
1) The rule of thumb says that under 'traditional' maximum-likelihood estimation of SEM models we need at least about 10-20 cases per variable for a model to be adequately stable. As Bayesian estimation does not rely on large-sample normality theory, does that mean the Bayesian approach is more suitable when a model has more variables than the ML rule of thumb allows for? And do you know of any rule of thumb concerning sample size for Bayesian estimation?
2) As the Bayes estimator does not require normal distributions, would it be appropriate not to declare a binary dependent (or mediating) variable with the 'categorical are' command in the Bayesian input, in order to obtain the DIC index?
3) Is there an equivalent of the chi-square difference test for testing nested models under Bayesian estimation?
1) I don't know of a rule of thumb - even a rule for ML is debatable and highly dependent on the context. But in general, Bayes could work better than ML for smaller samples.
2) No, you still have to use the proper model, in this case logistic/probit.
3) There are "Bayes factors".
Jan Zirk posted on Wednesday, April 25, 2012 - 10:13 am
Dear Bengt, Thanks very much for your quick response. As to 3), is there any literature with an example showing how to use Bayes factors for nested model comparison in Mplus? I did not find it in the User's Guide.
We do not currently have nested model testing in Bayes.
Jan Zirk posted on Thursday, April 26, 2012 - 2:14 pm
I see. Thanks very much.
Jan Zirk posted on Wednesday, May 02, 2012 - 8:16 am
Dear Linda or Bengt, I would like to ask about the parameterization under Bayes estimator.
According to "Bayesian Analysis of Latent Variable Models using Mplus" (http://www.statmodel.com/download/BayesAdvantages18.pdf) parameterization PX outperforms V and L. Is PX default for BAYES? Is it possible to change parameterization under Bayes (as it is for ML- delta vs. theta)?
I am trying to run "run15.inp" from the paper Muthen, B. & Asparouhov, T. (2012). Bayesian SEM: A more flexible representation of substantive theory. Psychological Methods, 17, 313-335.
I have one question regarding the syntax:
----------------------------------
define: standardize y1-y15;
MODEL: y1 on y2@0; ! to get stdy
----------------------------------
Why do you standardize y1-y15? Why can you get stdy by adding "y1 on y2@0"?
I standardize because I want the priors to work in a standardized metric. A certain prior variance has different implications for observed variables with different variances. I am allowed to standardize here because the model is "scale-free".
The y1 on y2@0 statement is just a trick - ignore it.
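The point about a given prior variance having different implications for variables with different variances can be made concrete with a little arithmetic. This is only a sketch under a stated assumption: the factor variance is 1, so a standardized loading equals the raw loading divided by the indicator's standard deviation. The function name and the example standard deviations are hypothetical.

```python
import math


def prior_sd_in_standardized_metric(prior_variance, observed_sd):
    """Prior SD implied for a standardized loading when the prior is on the raw loading.

    Assumes the factor variance is 1, so standardized = raw / observed_sd.
    """
    return math.sqrt(prior_variance) / observed_sd


# The same N(0, 0.01) prior on the raw loading, for indicators with different SDs.
for sd_y in (0.5, 1.0, 4.0):
    sd_std = prior_sd_in_standardized_metric(0.01, sd_y)
    # Columns: indicator SD, implied prior SD, implied 95% prior limit (+/-).
    print(sd_y, round(sd_std, 3), round(1.96 * sd_std, 3))
```

With the indicators standardized first (SD of 1), the prior variance means the same thing for every indicator, which is the motivation given above.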
Jan Zirk posted on Thursday, October 17, 2013 - 4:04 pm
I have a problem with XY standardized coefficients in my objective Bayesian SEM. They are larger than 1 and so the reviewers may be critical. What could be a good solution to get rid of this problem? How would you set the priors or model constraints to help with these? Only 2 standardized coefficients are larger than 1. The remaining have usual values.
We have a FAQ on "Standardized coefficient greater than 1". This usually has to do with highly correlated predictors; so that's the real issue. I'm not sure you want to use priors to get rid of it, but perhaps instead re-formulate the model.
Jan Zirk posted on Friday, October 18, 2013 - 1:47 am
Thanks very much.
Jan Zirk posted on Friday, October 18, 2013 - 10:58 am
When the variances and means of predictors are included under ML, it is possible to avoid listwise deletion thanks to FIML. What is the 'Bayesian FIML'? Including variances and means in a Bayesian regression also avoids listwise deletion; may I ask you for a reference for this FIML-like solution under the Bayesian approach?
Making the covariates part of the model is possible also in Bayes. I don't know that there is a reference for this; it goes back to basic principles that any variable, the parameters of which are part of the model, is handled by MAR-type missing data theory.
Jan Zirk posted on Friday, October 18, 2013 - 8:51 pm
After reading your paper "Muthen, B. & Asparouhov, T. (2012). Bayesian SEM: A more flexible representation of substantive theory. Psychological Methods, 17, 313-335.", I am trying to modify "run15.inp" for my data set.
The indicators for the CFA are categorical variables. Therefore, I added a "CATEGORICAL =" statement to my syntax.
I wonder whether there are any references or examples I can read/follow regarding the specification of priors for cross-loadings and residual correlations.
In order to run a Bayesian CFA with categorical indicators (6-point Likert scale), I read your paper "Bayesian Analysis Using Mplus: Technical Implementation" and have a few questions.
Q1. On P.10, you provide a matrix (11) with partially a correlation matrix and partially a covariance matrix. What is the partially covariance matrix about? (the covariances between which parameters??)
Q2. Do I have to give priors for the thresholds of each categorical variable?
Q3. Do you have any suggestion regarding the priors for thresholds? Normal distribution with mean zero and variance 6?
Q4. Can I give a prior of N(0,.01) for cross-loadings?
Q5. Do I need to standardize the categorical indicators? (My thought is NO because my priors are in a standardized metric)
Dear Bengt and Linda, where can I find the data and input for Figure 3 and Tables 16-18 in "Muthen, B. & Asparouhov, T. (2012). Bayesian SEM: A more flexible representation of substantive theory. Psychological Methods, 17, 313-335."? This is the Bayesian SEM reanalysis of Kaplan's (2009) model using the National Educational Longitudinal Study of 1988.
DavidBoyda posted on Wednesday, March 19, 2014 - 3:29 am
I have a question regarding the Bayes estimator. If I were to compare a model using the Bayes estimator to the same model using MLR, would I expect wildly differing results between these two estimators, in the same way perhaps that ML might give slightly differing results from MLR?
I have a question regarding the Bayes estimator: it outputs a one-tailed p-value, yet it also outputs a 95% CI. I'm under the impression that the CI covers both tails, so I don't really understand the one-tailed approach.
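The two quantities can be read off the same set of posterior draws, which may help reconcile them. A sketch, assuming the one-tailed p-value is the proportion of the posterior on the opposite side of zero from the point estimate (taken here as the median) and that the interval is built from the 2.5th and 97.5th percentiles; both function names are hypothetical.

```python
def one_tailed_p(draws):
    """Proportion of draws on the opposite side of zero from the posterior median."""
    s = sorted(draws)
    median = s[len(s) // 2]
    if median >= 0:
        return sum(d < 0 for d in s) / len(s)
    return sum(d > 0 for d in s) / len(s)


def credible_interval(draws, level=0.95):
    """Equal-tailed interval from empirical percentiles of the draws."""
    s = sorted(draws)
    tail = (1.0 - level) / 2.0
    lower = s[int(tail * (len(s) - 1))]
    upper = s[int((1.0 - tail) * (len(s) - 1))]
    return lower, upper


# Tiny illustrative input: one of four draws is below zero.
print(one_tailed_p([-1, 1, 2, 3]))
print(credible_interval(list(range(101))))
```

Under these assumptions a one-tailed p-value below .025 goes together with a 95% equal-tailed interval that excludes zero, since each tail of that interval holds 2.5% of the draws.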
You can say what you have in the quote. I am not sure about a good source - check the Schafer book we refer to in our UG. Perhaps you can also refer to our Bayes implementation papers:
Asparouhov, T. & Muthén, B. (2010). Bayesian analysis of latent variable models using Mplus. Technical Report. Version 4.
Asparouhov, T. & Muthén, B. (2010). Bayesian analysis using Mplus: Technical implementation. Technical Report. Version 3.
Yes. You must do this via TYPE=MIXTURE with the KNOWNCLASS and CLASSES options.
Tyler Moore posted on Thursday, August 25, 2016 - 1:54 pm
Hi Bengt and Linda, I'm not sure whether this is the appropriate place for this question, but are there any papers out there presenting strong arguments in favor of the BAYES estimator over others? I'm anticipating some reviewer questions RE: why I decided to use BAYES rather than more common estimators that output conventional fit indices (CFI, RMSEA, etc.). The answer is that BAYES was the only estimator that didn't result in a non-positive-definite residual covariance matrix, but I'd like to have more support for that decision besides "the others didn't work." Any suggestions or references you'd point me to? Thanks!
Muthén, B. (2010). Bayesian analysis in Mplus: A brief introduction. Technical Report. Version 3.
Bayes priors are such that negative residual variances are not possible so I am not sure avoiding the non-pos-def res cov matrix is a strong argument in favor of Bayes. Perhaps the model instead needs to be modified in some way.
Hello, I have been reading the "Prior-Posterior Predictive P-values" paper and I have a question.
In the paper, when the PPPP value does not reject, that means that the minor parameters are approximately 0. I take it that the PPPP value can also be used with priors whose mean is not 0? For instance, if one uses a prior of ~N(-0.3,0.01) for a slope, the likelihood suggests a slope with mean 0, and the posterior lies at some point in between, would a PPPP value of 0.4 suggest that the obtained slope is consistent with the prior?
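The "posterior lies at some point in between" part of this scenario can be illustrated with the textbook normal-normal update. This is only a sketch of that one step, assuming the information in the data about the slope is summarized by a normal estimate with a known variance; it does not compute a PPPP value. The data-based variance of 0.01 is hypothetical.

```python
def normal_posterior(prior_mean, prior_var, est, est_var):
    """Posterior mean and variance for a normal prior and a normal likelihood summary."""
    prior_prec = 1.0 / prior_var
    data_prec = 1.0 / est_var
    post_var = 1.0 / (prior_prec + data_prec)
    # Precision-weighted average of the prior mean and the data-based estimate.
    post_mean = post_var * (prior_prec * prior_mean + data_prec * est)
    return post_mean, post_var


# Prior N(-0.3, 0.01) from the question; a data-based slope of 0.0 with
# variance 0.01 is a hypothetical value chosen for illustration.
print(normal_posterior(-0.3, 0.01, 0.0, 0.01))
```

With equal prior and data precision the posterior mean falls halfway between the two; a less precise data estimate pulls it toward the prior mean.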
Fred posted on Wednesday, October 18, 2017 - 11:04 pm
Dear Drs. Muthen,
On page 16 of your paper, Bayesian Analysis Using Mplus: Technical Implementation (Version 3), there is a statement regarding missing values on categorical variables. I am a little confused by this and have two questions:
1. Are the missing values in a categorical variable (say X) handled directly when the continuous variable X* of X is generated? If so, how does this work exactly?
2. Or are the missing values on X handled through X*, where first the missing values are handled with conditional normal distributions (as on page 17) and then the threshold parameters are used to account for the categorical nature of the variable?
X* is generated at each MCMC iteration from the conditional distribution of X* on all observed data and parameters. X* and the thresholds uniquely determine X. On page 16 we simply make a note on how the conditional distribution of X* changes in the case when X is missing. The missing data handling is likelihood based and guarantees consistent estimation as long as the missing data is missing at random (MAR).
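A rough sketch of one such draw may make the description easier to follow. It assumes a normal conditional distribution for X* with a given mean and SD (in the actual sampler these would come from the model, the other data, and the current parameter values), truncated to the interval implied by the observed category, and it assumes that for a missing X the truncation is simply dropped. The function names and numbers are hypothetical.

```python
import random
from statistics import NormalDist


def draw_latent_response(mean, sd, thresholds, category, rng):
    """Draw X* ~ N(mean, sd^2), truncated to the category's interval if X is observed.

    category is 0-based; pass None when X is missing (assumed: no truncation).
    """
    if category is None:
        return rng.gauss(mean, sd)
    dist = NormalDist(mean, sd)
    cuts = [float("-inf")] + list(thresholds) + [float("inf")]
    lower, upper = cuts[category], cuts[category + 1]
    p_lo = 0.0 if lower == float("-inf") else dist.cdf(lower)
    p_hi = 1.0 if upper == float("inf") else dist.cdf(upper)
    # Inverse-CDF sampling restricted to (p_lo, p_hi); clamp to keep inv_cdf valid.
    u = p_lo + (p_hi - p_lo) * rng.random()
    u = min(max(u, 1e-12), 1.0 - 1e-12)
    return dist.inv_cdf(u)


def categorize(x_star, thresholds):
    """X* and the thresholds determine X: count the thresholds below X*."""
    return sum(x_star > t for t in thresholds)


rng = random.Random(7)
taus = [-0.5, 0.5]
x_star = draw_latent_response(0.0, 1.0, taus, 1, rng)     # X observed in middle category
x_miss = draw_latent_response(0.0, 1.0, taus, None, rng)  # X missing
print(categorize(x_star, taus), categorize(x_miss, taus))
```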
I am trying to run a bivariate longitudinal Cross-Lagged Panel Model with three waves of data. I have tried MLR estimator and the model does not work. The message is the following:
THE STANDARD ERRORS FOR H1 ESTIMATED SAMPLE STATISTICS COULD NOT BE COMPUTED. THIS MAY BE DUE TO LOW COVARIANCE COVERAGE. THE ROBUST CHI-SQUARE COULD NOT BE COMPUTED.
THE MODEL ESTIMATION TERMINATED NORMALLY THE STANDARD ERRORS OF THE MODEL PARAMETER ESTIMATES MAY NOT BE TRUSTWORTHY FOR SOME PARAMETERS DUE TO A NON-POSITIVE DEFINITE FIRST-ORDER DERIVATIVE PRODUCT MATRIX. THIS MAY BE DUE TO THE STARTING VALUES BUT MAY ALSO BE AN INDICATION OF MODEL NONIDENTIFICATION. THE CONDITION NUMBER IS -0.521D-16. PROBLEM INVOLVING THE FOLLOWING PARAMETER: THIS IS MOST LIKELY DUE TO HAVING MORE PARAMETERS THAN THE SAMPLE SIZE.
I tried the same model with Bayes and it works. Given that error with MLR, should I trust the results from the Bayes estimator? The PPPs that I get are higher than 0.05.
Thank you very much for your quick response Dr. Muthen!
I have tried with several variables and realised that problem only arises with the variables with the most missing data in the dataset. Does it make sense? I have read that bayes estimator is better for missing data. Otherwise I am not able to figure out the problem.
Dear Drs. Muthen, Upon a reviewer's suggestion, I am re-running an MLR mediation model using the Bayes estimator to address concerns with small sample size. When I do this, I get similar patterns in significance for parameters of interest; however, the posterior predictive p-value indicates poor model fit (.000/.004 with non/informative priors). PSR in TECH8 is close to 1. Do you have suggestions for why this might be? My data are nested within schools (9), which I had dummy coded in my MLR model since there are fewer than ~20 clusters. I kept them dummy coded in the Bayes model as type=complex does not work, but wonder if this is causing some of the model fit issue? Many thanks.
Check what your left-out arrows (paths) are in your model. Check if your MLR run has non-zero degrees of freedom. Perhaps df=0, as it often is with mediation models, so that no overall test of fit is obtained.
You can get the log-likelihood value from plots: Bayesian posterior predictive checking scatter plots - use the observed chi-square then divide by 2 and subtract from the H1 model log-likelihood. I would recommend that you use DIC, however, and not do any of these computations.
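The arithmetic described here is small enough to write out. A sketch with hypothetical numbers; the function name is mine, and the inputs would be read from the output and the plot rather than computed.

```python
def h0_loglikelihood(h1_loglikelihood, observed_chi_square):
    """Model log-likelihood: H1 log-likelihood minus half the observed chi-square."""
    return h1_loglikelihood - observed_chi_square / 2.0


# Hypothetical values for illustration only.
print(h0_loglikelihood(-1234.5, 50.0))  # -1259.5
```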
I don't know what you mean here. The posterior predictive checking that Mplus offers refers to testing the overall fit of the model.
A posterior distribution is provided for each parameter, including parameters defined in Model Constraint.
Daniel Lee posted on Friday, October 11, 2019 - 10:20 am
Hi Dr. Muthen,
I am using Mplus to conduct Bayesian multiple imputation. After conducting the imputations, I checked the convergence plot and autocorrelation and all looked good. However, the posterior predictive checking p-value was < .01.
When doing Bayesian multiple imputation in Mplus, does this mean there was a problem with the imputations and that they shouldn't be used? How would one remedy this situation (e.g., add more variables to the imputation)?
The PPP value is like a chi-square. What this means is that the model that you estimated (and used for the data imputation) doesn't fit the data very well. That also means that the imputed values you obtained are not good enough. Try to modify the model so you get a good PPP value or use type=basic for the data imputation (and no model), see user's guide example 11.5. With type=basic we use the unrestricted variance covariance model for the data imputation so this will be avoided altogether.
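For readers who want to see what sits behind the PPP value, here is a sketch of the usual construction: at each MCMC iteration a chi-square discrepancy is computed for the observed data and for data replicated from the model, and the PPP is the proportion of iterations in which the replicated value is the larger one. The interval for the observed-minus-replicated difference is taken from percentiles. The function name and the toy numbers are hypothetical, and this is not presented as the exact Mplus computation.

```python
def posterior_predictive_check(observed, replicated, level=0.95):
    """PPP and an equal-tailed interval for observed minus replicated chi-square."""
    if len(observed) != len(replicated) or not observed:
        raise ValueError("need two equally long, non-empty sequences")
    diffs = sorted(o - r for o, r in zip(observed, replicated))
    n = len(diffs)
    # Share of iterations where the replicated chi-square exceeds the observed one.
    ppp = sum(d < 0 for d in diffs) / n
    tail = (1.0 - level) / 2.0
    lower = diffs[int(tail * (n - 1))]
    upper = diffs[int((1.0 - tail) * (n - 1))]
    return ppp, (lower, upper)


# Toy per-iteration chi-square values (hypothetical).
print(posterior_predictive_check([10, 10, 10, 10], [5, 12, 8, 20]))
```

A PPP near 0.5 with an interval straddling zero suggests the observed data look like data the model would generate; a very small PPP with an entirely positive interval points to misfit.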
Daniel Lee posted on Friday, October 11, 2019 - 8:03 pm
Thank you so much for the response. I have a follow-up question.
If I use type=basic (the unrestricted variance/covariance model for the data imputation), are there any quality checks I can implement to show that my imputation is acceptable? For example, in the Bayesian multiple imputation approach, I was able to produce autocorrelation plots, convergence plots, etc. With the proposed approach (using type=basic), I wonder if there are comparable quality checks I can do.
To do that you will have to replace
ANALYSIS: TYPE = BASIC;
with
ANALYSIS: estimator=bayes;
model: Y1-Y10 with Y1-Y10; Y1-Y10 on X1-X5;
assuming you have 10 dependent variables and 5 covariates. Essentially you would write the H1 model manually.
Take a look at User's guide page 576 (as well as the entire section on data imputation). The diagram summarizes the different ways to do imputations in Mplus. You would have to move back to the box with example 11.7.