Inspired by the new features in mplus version 6 I am trying to estimate a multilevel latent covariate model using Bayes estimation in a sample of 192 individuals in 32 teams. I have a couple of questions regarding this analysis.
a) Is there a way to obtain any fit statistics for this model using Bayes?
b) In a multilevel CFA with three factor indicators I get abnormally large variance estimates for the between level factor under Bayes but not under MLR when centering (grandmean) the factor indicators. The input I use is:
Centering = GRANDMEAN(Factor1 Factor2 Factor3);
ANALYSIS: Type = twolevel; Estimator = bayes; Process = 2; STVAL = ML; MODEL:
I have two short questions with regards to a twolevel analysis using estimator = bayes.
1. Is there a way to get fit statistics for a twolevel bayes analysis? 2. I'm freely estimating the first factor loading of a factor and fixing the variance of the factor at 1. Is it ok to do this for only one factor in the model or do I have to use the same parametrization for all factors?
1. In principle, yes. In current Mplus, no. Compare neighboring models instead.
2. It should be ok in principle, although I don't off hand recall if it violates the current Bayes restrictions on which type of Psi matrices can be handled - try it.
ywang posted on Thursday, September 23, 2010 - 1:49 pm
Dear Drs. Muthen: Parallel growth modeling with Bayesian method for categorical indicator variables (fixed time scores) can not converge even if I increased the integration number to 5000. Here is the error message and input. Any advice? Thanks in advance.
THE CONVERGENCE CRITERION IS NOT SATISFIED. INCREASE THE MAXIMUM NUMBER OF ITERATIONS OR INCREASE THE CONVERGENCE CRITERION.
categorical are bmi1 bmi2 bmi3 stnbdi_b stnbdi_6 stnbdi_8;
Analysis: integration = 5000; ESTIMATOR=BAYES; process = 4; Model: i1 s1| stnbdi_b @0 stnbdi_6 @0.8 stnbdi_8@2; i2 s2| bmi1@email@example.com@2; s2 on i1 ; s1 on i2 ; i1 with i2; s1 s2 i1 i2 on male; s2 on intervention;
The INTEGRATION option should not be used with ESTIMATOR=BAYES; I suspect that you are not actually getting BAYES but MLR.
I would suggest running each process separately as a first step. Then put the processes together. After success with this, you can add the regression among the growth factors and the covariates. If you continue to have problems, send the files and your license number to firstname.lastname@example.org.
Ewan Carr posted on Thursday, December 16, 2010 - 8:11 am
I'm interested in a two-level analysis using the Bayes estimator. It was mentioned above that fit statistics are not currently available for such models in Mplus, and that we should "compare nested models", instead.
1. Is this still the case?
2. How would I compare nested models in this case (with a two level model, using the Bayes estimator)?
1. Yes. 2. We aren't able to do that at this time.
Ewan Carr posted on Wednesday, March 16, 2011 - 12:35 pm
I'm getting slightly confused about the "THIN" option for Bayesian estimation.
I'm running a two-level factor model, using estimator = BAYES. I want to run 50,000 or 100,000 iterations, and then thin the output by 50, giving 1000 or 2000 samples, respectively.
I started out using:
FBITER = 100000; THIN = 50;
Thinking this would achieve the above. However, the model is taking a very (very) long time to run. It then occurred to me that these settings might be causing the model to run for 5 million (50 x 100000) iterations.. which would explain the slow progress.
If that is the case, then to achieve the above I should use:
FBITER = 2000; THIN = 50;
Some clarification about what the FBITER and THIN settings achieve would be much appreciated.
This is correct Ewan. FBITER refers to the actual number of iterations that are recorded, i.e., it does not include those that are thinned out. So FBITER = 2000; THIN = 50; would result in 100000 computed MCMC iterations of which every 50-th is recorded for estimation purposes.
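To make the arithmetic in the reply above concrete, here is a minimal sketch. It assumes only what is stated in this thread: FBITER counts the recorded iterations, and THIN is the spacing between recorded iterations. The function name is hypothetical and is not part of Mplus.

```python
def computed_iterations(fbiter: int, thin: int) -> int:
    """Total MCMC iterations computed when every `thin`-th draw is recorded.

    Assumes FBITER is the number of recorded draws, as described in the thread.
    """
    if fbiter <= 0 or thin <= 0:
        raise ValueError("fbiter and thin must be positive")
    return fbiter * thin


# The two settings discussed in the posts above.
slow_run = computed_iterations(100000, 50)   # the run that took very long
intended_run = computed_iterations(2000, 50)  # 2000 recorded draws
print(slow_run, intended_run)
```

So to keep 1000 or 2000 draws after thinning by 50, FBITER would be set to 1000 or 2000, not to the total number of iterations.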
Ewan Carr posted on Thursday, March 17, 2011 - 5:03 am
That would explain why things were taking so long..
For a positive estimate as you have, the p-value is the proportion of the posterior that is less than zero, so it is not a contradiction. From a frequentist point of view you can think of that as the chance that the true value is of opposite sign.
Rob Dvorak posted on Saturday, August 20, 2011 - 6:06 am
First, let me apologize for my ignorance. But it seems a lot of us out here are running into the issue, so I thought I'd post on it. I'm trying to wrap my head around the use of a one-tailed test using estimator=bayes. I would like to use the bayes estimator for the reasons you mention in your intro to bayes paper, however, my working knowledge of bayes is weak. Can you recommend a reading for a scientist (not a statistician) that explains the logic of the one-tailed test in bayes for those of us who are used to two-tailed b/c we're trained as frequentists? I'm sure I will need to justify the use of a one-tailed test in any papers I publish, so a reference (and rationale) for this would be great. Thanks.
Muthén, B. (2010). Bayesian analysis in Mplus: A brief introduction. Technical Report. Version 3.
"The third column gives a one-tailed p-value based on the posterior distribution. For a positive estimate, the p-value is the proportion of the posterior distribution that is below zero. For a negative estimate, the p-value is the proportion of the posterior distribution that is above zero."
So if you get a positive effect estimate, this one-tailed p-value can be seen as the probability that it is negative, that is, that it's not the effect you had expected.
However, I would instead report the more common 95% Bayesian credibility interval (CI) that we show and note if that covers zero or not. But at a first glance, instead of looking for CIs covering zero, the one-tailed p-value is a quick way to scan the results for almost zero p-values - which imply that the CIs likely don't cover zero.
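As an illustration of the definition quoted above, the following sketch computes a one-tailed p-value and an equal-tail 95% interval from a set of posterior draws. The draws are simulated from a normal distribution purely for illustration, the point estimate is taken to be the posterior median (an assumption), and the function names are hypothetical rather than anything Mplus provides.

```python
import random


def one_tailed_p(draws):
    """Share of posterior draws on the opposite side of zero from the point estimate.

    The point estimate is taken to be the posterior median (an assumption here).
    """
    ordered = sorted(draws)
    median = ordered[len(ordered) // 2]
    if median >= 0:
        return sum(d < 0 for d in draws) / len(draws)
    return sum(d > 0 for d in draws) / len(draws)


def credibility_interval(draws, level=0.95):
    """Equal-tail interval read off the sorted draws."""
    ordered = sorted(draws)
    n = len(ordered)
    lower = ordered[int((1 - level) / 2 * n)]
    upper = ordered[min(n - 1, int((1 + level) / 2 * n))]
    return lower, upper


# Simulated stand-in for posterior draws of one positive effect.
rng = random.Random(1)
draws = [rng.gauss(0.30, 0.15) for _ in range(10000)]
p_value = one_tailed_p(draws)
lower, upper = credibility_interval(draws)
print(p_value, lower, upper)
```

A p-value near zero goes together with an interval that stays away from zero, which is why the p-value column works as a quick scan.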
Rob Dvorak posted on Sunday, August 21, 2011 - 4:00 am
I have discovered that in my output, the "estimate" reported, which I had assumed was the mode of the posterior parameter distribution, is slightly different from the mode of the distribution that is shown when I view the posterior distribution graphs (i.e., estimate in output is .27, mode in graph is .31). The 95% CCI bounds, however, are identical in the output and the graphs. Can you help me understand this discrepancy and advise me on which point estimate to report?
Thanks for your answer. I have come across another issue this weekend. When I run the analysis as a standard two level model with the default estimator, I get an error message about a non-positive definite matrix because the model has more parameters than there are clusters. However, when I run the analysis with the Bayesian estimator, I don't get that error message. Is that because it's ok to have more parameters than clusters in a Bayesian two-level analysis? Can I feel comfortable with those results or do I need to reduce the number of parameters?
The missing error message should not be interpreted as an endorsement of one method over the other. ML is entirely based on asymptotic assumptions driven by the number of clusters converging to infinity. Bayes is not but if the number of clusters is smaller than 10 the estimates will be sensitive to the priors.
The error message is based on a technical threshold point - we use the MLF matrix for standard errors but that matrix is singular if the number of parameters is more than the number of clusters - thus the model behaves somewhat as an unidentified model and our ability to confirm model identification is limited. Bayes doesn't have this threshold as it uses different methods for identification etc. (with Bayes the number of clusters should be bigger than the number of random effects though).
Ok, thank you, that helps. I have 35 clusters so I'm not concerned about sensitivity to priors.
I have also noticed that the error message doesn't appear when I use multiply imputed data, even when using a ML estimator with more parameters than clusters. Is there a reason that more parameters than clusters would be ok with multiply imputed data?
No Lindsay ... but I didn't say it was not ok. We have conducted simulation studies that show that ML works fine even when the number of clusters is smaller than the number of parameters. The error message says that it is not possible to confirm that the model is identified. If you already know the model is identified you should just ignore that warning.
Thanks very much, Tihomir. I've been trying to figure out how to get quantiles of the posterior parameter distributions so that I can determine, for example, what percentage of two parameters' posterior distributions overlap, or where the cutoff is for 80% of the distribution. Can you tell me how I can get quantiles for the posterior distributions? I haven't been able to figure it out from the manual.
So sorry for the multiple posts. Additionally, I am trying to determine whether the model fits better when all the data are analyzed together vs. separate models for subgroups. However, because the GROUPING option is not available with ESTIMATOR=BAYES, I can't figure out how to allow parameters to be estimated separately for groups in the same model so that I can compare the fit to the model with all parameters forced to be equal. Do you have any advice on how I can compare the fit of the subgroup model to the full model?
From my understanding, I don't think I can use MIXTURE because I don't have a latent class. I also don't know how to create a new parameter from the subgroup parameters, because as of now, the only way I can think to analyze the subgroups separately is to use USEOBSERVATIONS and select each group one at a time. Can you elaborate on how I could do the subgroup analysis in one model?
Also, a separate question: I've been trying to figure out how to get quantiles of the posterior parameter distributions so that I can determine, for example, what percentage of two parameters' posterior distributions overlap, or where the cutoff is for 80% of the distribution. Can you tell me how I can get quantiles for the posterior distributions? I haven't been able to figure it out from the manual.
Thank you, I understand now. I apologize for my confusion.
I am still struggling with how to assess model fit. I was hoping to be able to compare the Deviance Information Criterion from the unconstrained model to the model with all parameters constrained to be equal across groups, but with the MIXTURE analysis, the DIC is not appearing in the output. Is there a way for me to get the DIC with TYPE=MIXTURE?
If not, is there another index of fit I could use? The posterior predictive p-value is not showing up in the output, which I am assuming is because I am using TYPE=MIXTURE COMPLEX, but I'm not sure about that.
I apologize, but I am still having trouble understanding your recommendation for determining what percentage of two parameters' posterior distributions are overlapping.
For example, in Group A, the 95% CCI for parameter 1 is (-.12, .29); in Group B, it is (.23, .83), so these intervals overlap between .23 and .29. I would like to know what percentage of the posterior distribution for Group A is >.23 and for Group B is <.29.
I created a new parameter that is the difference of the two groups' parameters. This new difference parameter has a CCI of (.06, .49). However, I don't know how to translate this information into the answer I'm looking for.
Ewan Carr posted on Friday, July 13, 2012 - 2:48 pm
I'm trying to a fit a two-level model with the Bayes estimator.
I have a binary mediator (x3 is binary; everything else is continuous):
x3 ON x1 x2; y ON x1 x2 x3;
y ON w1 w2; x3 ON w1 w2; y ON x3;
Everything works fine, except I'm having a problem with the between-level threshold of the binary mediator (i.e. [x3$1]).
The mixing of the chains for this threshold is really, really bad (traceplot), and the parameters — for both the threshold and the residual variances — tend to infinity (they increase with the number of iterations). Diagnostics for other parameters seem fine.
Is there anything (obvious) I can do about this? Specifically:
Are there any alternative priors that might improve mixing for the threshold?
How important is it to set a binary mediator as categorical?
I would say try this alternative parameterization that eliminates the threshold and instead estimates a mean for X3 via regression on ONE. I can't quite see where the poor mixing comes from but it could be due to a small number of between-level clusters. You can also run this model with the WLSMV estimator and ML, but for ML it is a bit harder to set up. If you want to try changing priors - the place to look would be the variances on the between level - IG(1,1) often improves the situation.
Here are the commands you need for the alternative parameterization (X3 on ONE is minus the threshold)
data: ... variance=nocheck;
variables: usevar= y x1-x3 w1-w2 ONE between=ONE;
%WITHIN% x3 ON x1 x2; y ON x1 x2 x3;
%BETWEEN% [x3$1@0]; y ON w1 w2; x3 ON w1 w2 ONE; y ON x3;
Just realized where the poor mixing comes from -- the threshold parameter is highly correlated with X3_between, which is particularly poor when the size of the clusters is large. The above parameterization should resolve your problem - if not send it to email@example.com.
Ewan Carr posted on Friday, July 13, 2012 - 3:48 pm
That is amazing — the chains are mixing very well now, and the parameters don't tend off to infinity.
You can use TYPE=IMPUTATION with TYPE=THREELEVEL RANDOM. Try running this on the first data set.
Tanja Ka posted on Wednesday, March 06, 2013 - 11:03 am
Hello, I'm estimating a multilevel model with a random slope in Mplus7. When using the default gibbs algorithm, the model converges and everything looks fine. But when I switch to the gibbs (rw) algorithm, I don't get an output for the model. The model converges obviously (as I see by the PSR during the iterations), but the output is just a reproduction of the input file. It happens only in models with random slopes. Could this possibly be a minor bug? I'd like to switch to the gibbs (rw) algorithm to estimate two random slopes in one model. Thanks a lot!
I fitted a two-level model (binary dependent variable) using ESTIMATOR=bayes. I used the default prior, but the bayesian estimates did not resemble the maximum likelihood results (the model was fitted the same way) as they were supposed to. Any idea what could be the reason?
There isn't anything automatically produced. One approach is to use "neighboring" models that are less restrictive. For instance, BSEM can be used to allow residual covariances that can be checked for significance. See the article:
Muthén, B. & Asparouhov, T. (2012). Bayesian SEM: A more flexible representation of substantive theory. Psychological Methods, 17, 313-335.
Linda Guo posted on Sunday, December 01, 2013 - 10:21 am
Hi Linda and Bengt:
I am trying to confirm results of a two-level model with a dichotomized outcome, by comparing estimations from MLWin and that from Mplus. I used Bayes estimation in Mplus and mcmc estimation in MLWin. However, the estimations from Mplus and MLWin appear to be quite different. Below are algorithms I specified in the two software packages.
For MLWin, I specified mcmc(burnin(2000) chain(20000)), and set the starting value to be the estimation from ML methods.
For Mplus, I specified Type = twolevel; Estimator= BAYES; Biterations = 2000; Fbiteration = 20000; and also manually set the starting value to be the estimation from ML methods.
The results of the default ML methods from the two software packages were the same, by the way. I just started on Mplus, and am certainly not familiar with the algorithm used by Mplus. Is there a difference between Bayes estimation and MCMC estimation in the two software packages? Any suggestions on why the results are quite different? Thank you.
Have you checked that the programs use the same model, for instance the same number of parameters and logit/probit link?
MCMC is a technique to get Bayesian estimates that is used by both programs and the two programs should get the same results if set up to estimate the same model.
Linda Guo posted on Monday, December 02, 2013 - 10:07 am
I used logistic regression in MLWin; for Mplus, I specified in the variable section: categorical = my outcome; I first ran the ML algorithm and got the same results from both software packages. However, after I turned on MCMC in MLWin and Bayes in Mplus, the results started to differ. In both packages, I have the same number of samples being used in the regression, and the same number of parameters. I'm not sure how to specify the link in Mplus, but I'm assuming that after I specify categorical = my outcome, it should run the model as logistic regression? Here's the command I use to set up the model in Mplus: Variable: Names ARE VAR1 VAR2 VAR3 VAR4 VAR5 VAR6; Missing ARE all (-9999); Categorical = VAR5; Cluster = VAR6; Within = VAR1 VAR2; Between = VAR3 VAR4; Analysis: Type = twolevel; Estimator= BAYES; Biterations = 20000; Fbiteration = 200000; Model: %Within% VAR5 ON VAR1*-0.050 VAR2*-0.175; %Between% VAR5 ON VAR3*0.048 VAR4*0.167; Regardless of whether I specify the starting values to be the ones from ML estimation, or not specify the starting values for each parameter, the results from Mplus are different from those I got from MLWin. Is there a problem in the code I use? Or where else do you think could have gone wrong?
Bayes in Mplus uses probit link; see the Bayes papers we have posted. Mplus does not have logit link for Bayes. So request probit link for your MLWiN run.
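To show why the link matters when comparing the two programs, here is a small sketch of the two response functions. It uses the commonly cited scaling constant 1.702 (an assumption brought in for illustration, not something stated in this thread) under which the logistic curve closely tracks the normal CDF. In practice this means logit coefficients tend to be roughly 1.7 times their probit counterparts, so estimates from a logit run and a probit run are not directly comparable.

```python
import math


def probit_prob(x: float) -> float:
    """Standard normal CDF, the probit response function."""
    return 0.5 * (1.0 + math.erf(x / math.sqrt(2.0)))


def logit_prob(x: float) -> float:
    """Logistic response function."""
    return 1.0 / (1.0 + math.exp(-x))


SCALE = 1.702  # commonly used logit-to-probit scaling constant (assumption)

# Largest gap between the probit curve and the rescaled logistic curve
# over a grid from -6 to 6.
grid = [i / 100.0 for i in range(-600, 601)]
max_gap = max(abs(probit_prob(x) - logit_prob(SCALE * x)) for x in grid)
print(max_gap)
```

After rescaling the two curves nearly coincide, so a remaining large discrepancy between programs would point to something other than the link, such as different priors or a different model specification.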
ML starting values are not needed for Bayes in Mplus. No starting values need to be given. This avoids high-dimensional integration with ML for models with many latent variables.
Linda Guo posted on Monday, December 02, 2013 - 12:04 pm
I called the probit link in MLWiN, and didn't change anything else (distribution is binomial, number of parameters stays the same), but the results are still different compared to those from Mplus. I removed the starting values in Mplus, and still didn't get the same results. What else could go wrong here?
This is a question to make sure my conceptual understanding is correct. In models of clustered data estimated using ML or least squares, standard errors tend to be incorrect (underestimated in most cases) due to the non-independence of observations. Thus standard errors are 'corrected' using a sandwich estimator. Am I correct in my understanding that this is not the case in Bayesian estimation because the standard errors and p-values are derived from the posterior distribution of parameters, which is generated using Markov chain Monte Carlo (MCMC)? This procedure does not assume that observations are independent. Please correct me if I am wrong. Thanks, 'Alim
Hi, Is it possible to estimate between-level differences in within-level variances using a "random factor loadings" approach in cases where there is only a single outcome/indicator variable? I would like both intercept and variance of the random loading to be freely estimated on the between level (in order to enter between-level predictors). Below is what I tried (without person covariates). To avoid poor mixing, I fixed the residual within-level variance at an arbitrary value >0 (but smaller than the within-variance from a simple multilevel "null" model), and estimated the mean within-person variance (mwvar) as this residual plus the estimated mean loading (squared). The resulting model estimates do not appear completely unreasonable. However, (a) the value for “mwvar” always seems lower when using random loadings than when using a fixed loading, and (b) “mwvar” differs depending on the selected value of the residual variance. Especially (b) makes me believe I am doing something wrong. Your input would be greatly appreciated!
ANALYSIS: estimator = bayes; MODEL: %within% sigma | f by y; f@1; firstname.lastname@example.org; %between% [sigma] (p1); sigma; y; MODEL CONSTRAINT: new (mwvar); mwvar = 0.1 + p1**2;
The entire section 5 in that paper discusses this issue, but for a latent variable. Your model also seems fine but I think the above model mixes better.
As a measure of stability of the model Var(sigma_j) should be small ... smaller than 0.25 so that sigma_j>0 (otherwise interpretation would be hard). Now you can easily add predictors both for the random intercept and for the random variance.
For your model, your MODEL CONSTRAINT statement implicitly makes a mistake: if X and Y are independent, Var(XY) is not E(X)*E(X)*Var(Y);
it is Var(X)*Var(Y) + E(X)*E(X)*Var(Y) + E(Y)*E(Y)*Var(X).
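The variance formula for a product of independent variables can be checked exactly on a small example. The two discrete distributions below are made up for illustration; exact fractions are used so that no simulation error enters the comparison.

```python
from fractions import Fraction
from itertools import product


def mean(dist):
    """Expected value of a discrete distribution given as (value, probability) pairs."""
    return sum(p * v for v, p in dist)


def var(dist):
    """Variance of a discrete distribution given as (value, probability) pairs."""
    m = mean(dist)
    return sum(p * (v - m) ** 2 for v, p in dist)


# Two independent toy variables (hypothetical values).
X = [(Fraction(1), Fraction(1, 2)), (Fraction(3), Fraction(1, 2))]  # E = 2, Var = 1
Y = [(Fraction(0), Fraction(1, 3)), (Fraction(3), Fraction(2, 3))]  # E = 2, Var = 2

# Distribution of the product XY under independence.
XY = [(x * y, px * py) for (x, px), (y, py) in product(X, Y)]

var_xy = var(XY)
full_formula = var(X) * var(Y) + mean(X) ** 2 * var(Y) + mean(Y) ** 2 * var(X)
incomplete = mean(X) ** 2 * var(Y)  # the single term that is not enough on its own

print(var_xy, full_formula, incomplete)
```

In this example the single term E(X)*E(X)*Var(Y) accounts for only part of the variance of the product, while the three-term expression matches it exactly.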
%within% ! students f1w BY y1-y6; f2w BY y7-y10; f3w BY y11-y20; f4w BY y21-y25;
%between% ! countries f1b BY y1-y6; f2b BY y7-y10; f3b BY y11-y20; f4b BY y21-y25;
OUTPUT: tech1 tech8 stdyx cinterval(hpd); ...
The output looks quite nice and the estimation terminated normally. Checking the plots suggests convergence.
However, I do not get the DIC and the pD value. I also looked into your webnote#18 and the supplementary files (26-countries modeled as clusters). In there, the output of the Bayesian multilevel model also did not show the DIC and pD. Is there a general problem with these values in multilevel models? Do I need to tell Mplus to give me these values by using an additional command?
I have a mediation model with twins in my analysis, and therefore I used TYPE = TWOLEVEL and ESTIMATOR = BAYES. I would like to report the standardized estimates of the variables, but as mentioned, Mplus does not give standardized estimates of the direct, indirect and total effects in this case.
could you help and tell me what should I report instead?
Use SD(x) and SD(y) in the usual way for standardization at the level where mediation is considered.
Tor Neilands posted on Wednesday, November 11, 2015 - 4:27 pm
I am planning to fit cross-classified multilevel models using a binary outcome with missing x-side and y-side data. My understanding is that the only supported estimator for such models in Mplus currently is BAYES.
To determine the significance of multi-category predictors such as race/ethnicity, if I were using maximum likelihood, I would use MODEL TEST. For analyses involving the BAYES estimator, what is the recommended method for assessing the significance of a multi-category predictor such as race/ethnicity?
With Bayes you can use Model Constraint where you express NEW parameters as functions of Model parameters. So you can express a difference between 2 parameters and then the Bayes posterior gives you the estimate and credibility interval for that new difference parameter.
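The logic of a NEW difference parameter can be sketched outside Mplus as follows: the difference is formed draw by draw, and its posterior is summarized like any other parameter. The draws here are simulated and the parameter names are hypothetical; this is only an illustration of the idea, not Mplus output.

```python
import random


def summarize(draws, level=0.95):
    """Posterior median and equal-tail interval from a list of draws."""
    ordered = sorted(draws)
    n = len(ordered)
    median = ordered[n // 2]
    lower = ordered[int((1 - level) / 2 * n)]
    upper = ordered[min(n - 1, int((1 + level) / 2 * n))]
    return median, lower, upper


# Simulated stand-ins for posterior draws of two dummy-variable slopes.
rng = random.Random(2015)
b_group1 = [rng.gauss(0.50, 0.10) for _ in range(5000)]
b_group2 = [rng.gauss(0.10, 0.10) for _ in range(5000)]

# The new parameter is computed within each draw, then summarized.
diff = [a - b for a, b in zip(b_group1, b_group2)]
median, lower, upper = summarize(diff)
print(median, lower, upper)
```

If the interval for the difference excludes zero, the two slopes are credibly different. Each such difference is a single comparison, not an omnibus test.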
Can this approach be used if there are 3 or more differences being tested? For instance, suppose I have a 4-category race/ethnicity variable 0=White, 1=Black, 2=Latino/Hispanic, 3=Other/Multi-Ethnic, represented by three 0/1 dummy variables coded 1 for Black, Latino/Hispanic, and Other/Multi-Ethnic, respectively, and zero otherwise. If I want an overall or omnibus test for race/ethnicity in an MLE-based analysis, I would use MODEL TEST (assuming White is the reference category) with the 3 DF hypothesis: Black vs. White dummy = 0, Latino/Hispanic vs. White dummy = 0, and Other/Multi-Ethnic vs. White dummy = 0. My understanding is that Model Constraint can test single degree-of-freedom hypotheses, but not multiple DF hypotheses such as the one described above. So I could obtain the posterior for each of the three race dummies and use Model Constraint to set up pairwise comparisons of differences of the race dummies, but I'm not sure how to use it for a test with 3 or more degrees of freedom like the one described above.
I am conducting a 3-level CFA using Bayesian estimation. I have 171 observations at Level 1, 57 L2 clusters, and 19 L3 clusters. I have noticed in several cases that if I reduce the indicators of L1 or L2 LVs from 6 to 4, the PPP improves considerably (from 0.030 to 0.322). The indicators I removed had good loadings, sometimes better than the indicators that remained. I wonder, is it possible that simply reducing the number of indicators, and therefore parameters, is enough to improve the PPP given the small n or number of clusters?
Dear Professor Muthén, As I said in my previous post in the Mplus discussion group, I'm a new user of Mplus. I'm now performing a Bayes two-level regression as suggested a couple of days ago, as I only have 20 clusters in a sample of 1229 individuals. A few days ago I contacted you about an error in the output, and you suggested that the variables on the BETWEEN list must have the same value for each cluster member. The data violate this, as I hypothesized that level 2 variables also vary between clusters. I read the recommended Bayes paper (Muthén, 2010) and it says that the intercept is random and the slope is fixed. Is this the point that is conditioning my analysis? I tried to understand, but I'm having a lot of difficulty. Is it possible to have both level 1 and level 2 predictors as random in a Bayes analysis? Sincerely, Joana
Between-level variables vary across clusters, not within clusters. So there is not the violation you mention.
Intercepts and slopes can be random, that is, vary across clusters. Level-2 predictors cannot have random intercepts and slopes in a two-level analysis - because there is not a third level at which they would vary.
MLsem posted on Thursday, September 28, 2017 - 2:39 am
Is it possible to estimate the model for equality of loadings between levels with "ESTIMATOR IS BAYES" without reparametrizing the error covariances in the model?
ANALYSIS: TYPE = TWOLEVEL;
fw1 by item1*(1); fw1 by item2 (2); fw1 by item3 (3); fw1 by item4 (4); fw1 by item5 (5); fw1@1;
item2 with item3; item4 with item5;
fb1 by item1*(1); fb1 by item2 (2); fb1 by item3 (3); fb1 by item4 (4); fb1 by item5 (5); fb1@1;
"Bayes is not but if the number of clusters is smaller than 10 the estimates will be sensitive to the priors."
I am surprised to see that indeed, if I have a cluster size of approximately 20, the posterior seems very little affected by the prior.
The overall sample size (within or level 1) is large, but I thought this should have negligible influence on the impact of priors at the between level. Reading Tihomir's post above, I wonder whether priors generally will fail to have any substantial effect at the between level when the number of clusters approaches 20 (which I thought should still be considered an area of small-sample size statistics).
I'm running a multilevel CFA with repeated measures using the Bayes estimator. I would like to compare the results across three groups. It seems that KNOWNCLASS isn't available for multilevel Bayes. How else could I compare the results across groups?
You might find it easier to do this with ML or WLSMV. If you want to use Bayes you would have to use a trick where you stack up the data next to each other.
For example, if you have 5 dependent variables - the new model would have 15 variables where y1-y5 would represent group 1, y6-y10 would represent group 2, y11-y15 would represent group 3. Since clusters won't be of the same size among the groups you would fill in missing values to make them equal. The model should be written so that the model for y1-y5 is independent of the model for y6-y10 which would be independent also of the model for y11-y15. Not super easy unfortunately but this is your only option with Bayes. A future version of Mplus will make KNOWNCLASS available. If the grouping variable is a within level grouping variable things will get even more complex. You might find this note useful in that context http://statmodel.com/examples/webnotes/webnote16.pdf
Thank you for this very helpful explanation. If I've set it up correctly, the factors are now estimated separately for each group at the within level. This seems like it is different than estimating the factor structure for the whole sample, regressing the factors on time, and then comparing the regression coefficients across groups. Does this group-specific analysis affect how the factors are being estimated?
Similarly, would using a more limited sample (approximately 1/3 of the original sample for each treatment group) affect the results?
It does not affect the estimation. The same amount of information goes in and in principle you should get the same result. If you are using the whole sample you would be constraining the loadings to be the same across the three groups etc.
You have to note that with this model the indicators get different between parts for the different groups. If you don't want that you can constrain them on the between level like this Y1B by Y1@1y6@1y11@1; Y1@0.01Y6@email@example.com; I point this out since this is the big difference between the "whole sample" model vs. the "group-specific" model.
Thank you, Dr. Asparouhov, that's very helpful. I have two follow-up questions:
1. In my initial attempts with the set-up you describe, I get the error message: "THE VARIANCE COVARIANCE MATRIX IS NOT SUPPORTED. ONLY FULL VARIANCE COVARIANCE BLOCKS ARE ALLOWED. USE ALGORITHM=GIBBS(RW) TO RESOLVE THIS PROBLEM." Is this to be expected with the data set-up you describe?
2. Is it possible to compare the groups in any other way (e.g., using saved Bayesian Plausible Values, or another manual approach)?
1. You have to make sure that the covariances between the three sets of variables are all fixed to 0
2. You can add dummy variables as predictors for the three groups. You can also run the three groups one at a time and compare parameter estimates - since the parameter estimates are independent Var(a-b) = Var(a)+Var(b)
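A minimal sketch of the second suggestion, comparing one parameter across two separately run groups: the posterior SDs are combined using Var(a-b) = Var(a)+Var(b). The numbers are made up, and the 1.96 multiplier assumes an approximately normal posterior for the difference, which is an added assumption rather than something stated in the reply.

```python
import math


def compare_independent(est_a, sd_a, est_b, sd_b):
    """Difference between two independent estimates, its SD, and an approximate 95% interval.

    Uses Var(a - b) = Var(a) + Var(b); the interval assumes approximate normality.
    """
    diff = est_a - est_b
    sd_diff = math.sqrt(sd_a ** 2 + sd_b ** 2)
    interval = (diff - 1.96 * sd_diff, diff + 1.96 * sd_diff)
    return diff, sd_diff, interval


# Hypothetical estimates and posterior SDs from two separate group runs.
diff, sd_diff, interval = compare_independent(0.50, 0.03, 0.40, 0.04)
print(diff, sd_diff, interval)
```

This only works because the groups are analyzed in separate runs on separate cases. Parameters estimated within one model are generally correlated, and then the covariance term cannot be dropped.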
I don't think so. If the factors are on the within level then you don't need to do that. If they are on the between level then you do (i.e. that kind of code would just go on the between level to approximate multiple-group modeling)
The value 0.01 is needed because Mplus will not accept 0 and it generally works best when the variables are close to being standardized, meaning Var in the range near 1. This is also why the model is really an approximation of the true model with 0.
Incidentally you don't need to fix it to 0 or 0.01. Fixing it to 0 essentially guarantees that the between level component is time invariant but that doesn't necessarily apply in practice. Average school scores may change across time in a non-linear fashion (some schools can go up others can go down) so you could potentially remove that @0.01 and see for yourself if this holds really or not. There is some discussion on that in http://statmodel.com/examples/webnotes/webnote16.pdf
Thank you; that explanation helps a lot. The model runs but I get the following error in the Tech 8 output:
THE KOLMOGOROV-SMIRNOV DISTRIBUTION TEST FOR PARAMETER 67 HAS A P-VALUE 0.0000, INDICATING DISCREPANT POSTERIOR DISTRIBUTIONS IN THE DIFFERENT MCMC CHAINS. THIS MAY INDICATE NON-CONVERGENCE DUE TO AN INSUFFICIENT NUMBER OF MCMC ITERATIONS OR IT MAY INDICATE A NON-IDENTIFIED MODEL. SPECIFY A LARGER NUMBER OF MCMC ITERATIONS USING THE FBITER OPTION OF THE ANALYSIS COMMAND TO INVESTIGATE THE PROBLEM.
I am currently using FBITER = 80000; THIN = 2;
Given the current number of iterations, would it be reasonable to add more iterations and/or a different thinning rate to try to address this error? Alternatively, could it be a problem with the model set-up?
I have a follow-up question about declaring variables as Within. I am running two analyses (both with Bayes estimator):
1. A multilevel CFA (within-level only, nothing on between) examining change in factors over time. All items are declared as Within variables.
2. A follow-up multilevel CFA using group-specific variables, based on your input above, also examining change over time. Items are not declared as Between or Within variables, and the Between parts of the group variables are constrained using Y1B by Y1@1y6@1y11@1; Y1@0.01Y6@firstname.lastname@example.org;
Declaring items as Within vs. not declaring them changes the results slightly in analysis 1. However, for analysis 2, I can’t declare at least some of the items as Within (y1, y6, y11) because they are used on the Between level. I am wondering: 1) How important it is that the variables are declared the same way in the two analyses; 2) Whether constraining the Between parts in analysis 2 is advisable in my case (the groups received different treatments).
Any suggestions would be greatly appreciated. Thank you in advance.
A multilevel CFA (within-level only, nothing on between) is not a multilevel model. There has to be something on the between level to make this a multilevel CFA. You might want to look into User's Guide example 9.16.
Thank you for your reply and for pointing this out. I have reviewed example 9.16 as you suggested. I am wondering if the set-up below might address the Between part of the model sufficiently (i.e., allowing items to be modeled as random intercepts on the Between level)?
Analysis 1: Do not declare variables as Between or Within. Under the Model command, include the factor structure under %WITHIN% only.
Analysis 2 (group-specific variables): Do not declare variables as Between or Within. Under the Model command, include the factor structure under %WITHIN%, and under %BETWEEN%, constrain the between parts of the groups to be equal with Y1B by y1@1y6@1y11@1; email@example.com@firstname.lastname@example.org;
If not, are there any other modifications you suggest? I have run into issues with computational demands and/or convergence when I tried the model in other ways, including linear growth model, which is why I have come to the current approach.
I would recommend you take a step back in model complexity and use the ML estimator first. The reason I pointed out example 9.16 is because it describes how two-level models are used to model repeated measurements. There is no clustering of subjects, rather observations are nested within subject. I am not sure if you are using MSEM because you have repeated measurements or because subjects are nested within some kind of clusters.
Thank you for your response and for recommending the book. I have downloaded it and have been finding it helpful. I have also reviewed Webnote 16 again, which you referenced earlier.
I previously tried the ML and WLSMV estimators but encountered convergence issues (I believe because items are categorical and some are endorsed at low frequencies). The model converges with the Bayes estimator, so I would like to proceed with this approach. However, I would like to correctly free or constrain the Between parts of the model.
Data are repeated measures, and each subject belongs to one of three conditions. I have created condition-specific variables, as you suggested above, to compare the conditions.
Based on Webnote 16, it seems that groups (conditions) should not have different between parts because the clusters (subjects) can only belong to one group. Therefore, I think I should constrain the Between parts of the model to be equal, as you described above. Is this the correct use of the Between constraint in my case?
"Data are repeated measures, and each subject belongs to one of three conditions. I have created condition-specific variables, as you suggested above, to compare the conditions."
I would say that you have a 3 group single level model - not multilevel model. To have a multilevel model you must have subjects be nested within schools - where the second level model would be fitting school level averages essentially.
Take a look at User's guide example 6.1 and 6.18.
ML and WLSMV should be the easiest estimators to work with. If the ML estimator has convergence problems, you might have to focus on that. Low frequencies are generally not a problem, but very low frequencies are: such variables carry no information and just make models unidentifiable.
Also, the concept of multiple groups is not necessarily the right one for you. The most basic approach is to create two dummy variables for the second and the third group and treat these as covariates.
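As a minimal sketch of the dummy-variable approach, the Python snippet below builds two 0/1 indicators for a three-condition variable, with the first condition as the reference. The function name `dummy_code` and the condition codes 1, 2, 3 are assumptions for illustration; they are not taken from the poster's data.

```python
def dummy_code(condition, reference, levels):
    """Return 0/1 indicators for every level except the reference."""
    if condition not in levels:
        raise ValueError(f"unknown condition: {condition!r}")
    return [1 if condition == lvl else 0 for lvl in levels if lvl != reference]


levels = [1, 2, 3]  # hypothetical condition codes
for cond in levels:
    # Condition 1 is the reference, so it gets zeros on both dummies.
    print(cond, dummy_code(cond, reference=1, levels=levels))
```

The two resulting columns would then be added to the data set and used as covariates in place of a grouping variable.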
Hello, I'm running a multilevel model that explores within-person change in behavior as a function of task feedback. Using Bayesian analysis, I would like to extract an individual difference estimate that essentially represents the change in each subject's behavior as a function of feedback. Diez Roux (2002) notes that "empirical bayes estimates of parameters for a given group can be derived from multilevel models using estimates of the group level errors" (pp.590). Is there any way to "extract" such an estimate for each subject in Mplus? I am interested in using the estimate as a predictor variable for follow-up analyses. I hope my question is clear. Many thanks for your time.
I would suggest that you look at User's Guide example 9.2, but I would recommend the syntax on page 278 (example 9.2c in the Mplus installation directory). You would have to change the estimator to Bayes with analysis: estimator=bayes; and add savedata: file is 1.dat; save=fs(50); to get the individual-level effects. See the savedata information section at the end of the output file, which will help you locate the cluster-specific regression parameters.
It should be noted that the "follow-up analysis" can be done (best option) in the same analysis. You can specify the random slope as a predictor in the same model.
I am fitting a multilevel path analysis using Mplus's latent cluster-mean centering to separate within- and between-level effects. It is unclear to me why Mplus adds mean and variance parameters of the predictors to the model, as this is not done by default in ML estimation. I know that this can be done to handle missing data in X, but I have no missing data in X. With reference to what was said earlier in this topic, I do have a high variance for one of my predictors (SSADE), which also inflates the variance of its interaction (INT) with XTEMP. So, could I just exclude the mean and variance parameters of the predictors from my model?
In two-level ML without numerical integration (all continuous), all covariates that are within-between do get a mean and variance estimated (mean on the between level, and variance-covariance on the within and between levels). These are estimated behind the scenes and are not reported as model parameters, but they are estimated as part of the EM algorithm, i.e., the covariates are latent mean centered.
The variances you are reporting look quite large and you might want to consider re-scaling the variables to be near standardized metric.
You don't have the entire model here so I am not 100% sure about this but I would not recommend this approach. Xtemp-between is the average of the product of the two variables, not the product of the averages. I would recommend Preacher's approach instead. See section 3.3 and 3.4 http://statmodel.com/download/LVinteractions.pdf and the scripts are here http://statmodel.com/download/WebNote23.zip and are very specific and involve latent variable behind each of the observed variables on both levels and then using XWITH for the latent variables.
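The distinction between the average of a product and the product of the averages can be seen with a few numbers. The sketch below is plain Python with made-up values for a single cluster; the function name `product_summaries` is hypothetical.

```python
from statistics import mean


def product_summaries(x, z):
    """Return (mean of x*z, mean(x) * mean(z)) for one cluster."""
    if len(x) != len(z) or not x:
        raise ValueError("x and z must be non-empty and of equal length")
    mean_of_product = mean([xi * zi for xi, zi in zip(x, z)])
    product_of_means = mean(x) * mean(z)
    return mean_of_product, product_of_means


# Hypothetical within-cluster observations (illustration only).
x = [1.0, 2.0, 3.0, 4.0]
z = [2.0, 1.0, 4.0, 3.0]
print(product_summaries(x, z))
```

The two quantities differ by the within-cluster covariance of the two variables, so the cluster mean of an observed product term is not the interaction of the two cluster means.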
Samuli Helle posted on Thursday, December 12, 2019 - 1:08 pm
Many thanks Tihomir! I think we are interested in Preacher's A1 hypothesis (1*(1-1)) here, i.e. the moderation is at the within-level. Would the code below do the job?
About the standardized metric: when you fix the variance to 0, it is actually not fixed to 0, but to a small value. That small value is an Analysis option in Mplus called the variance= option. The default is 0.0001, so essentially this is what you are using. It is small enough that it can be considered 0, and it allows the MCMC to "move". The value 0.0001, however, is relative to the size of the variables. If the total variance of the variable is near 1, that value will be about 1/10,000 of the variance of the observed variable, so it will not impact the model. If, however, your variable has a variance of 100, to keep the same ratio you will need to set the variance to 0.01, i.e., you will need to change that option in Mplus (otherwise convergence will be very slow). We discuss this in Section 3.3.2 and at the top of page 12 of http://statmodel.com/download/LVinteractions.pdf
One more question: is it possible to combine Preacher's A1 and A3 hypotheses in the same model (I could not find such an example in your LVinteractions document)? I mean looking at the interaction at both levels simultaneously. Or are there any reasons to "control" for the between-level interaction if we're only interested in the within-level interaction of the given variables?
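The scaling rule described above is simple arithmetic, sketched here in plain Python (not Mplus syntax). The function name `scaled_small_variance` is hypothetical; the 0.0001 default is the value quoted in the post.

```python
def scaled_small_variance(total_variance, default=0.0001, reference_variance=1.0):
    """Small 'pseudo-zero' variance that keeps the same ratio to the
    variable's total variance as `default` has to a variance of 1."""
    if total_variance <= 0:
        raise ValueError("total_variance must be positive")
    return default * total_variance / reference_variance


for var in (1.0, 100.0):
    # A variance of 100 calls for a value of about 0.01.
    print(var, scaled_small_variance(var))
```

The result is the value one would consider supplying to the variance= option; rescaling the variable to a variance near 1 achieves the same ratio without changing the option.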
Samuli Helle posted on Wednesday, December 18, 2019 - 12:54 pm
Given, e.g., the model below, why does Mplus by default estimate a covariance between the predictors at both levels? Is the estimation of other parts of the model compromised if those covariances are fixed to zero in this particular interaction model?
If you have large size clusters (>100) the correlations (included or not) wouldn't matter.
If the clusters are smaller though it would matter as the between level components would be estimated more accurately if say large correlations exist and are included in the model, i.e., including the correlations can indeed improve the remaining parts of the model for the case of smaller size clusters.
Hailey Lee posted on Wednesday, February 26, 2020 - 3:39 pm
I am fitting a 3-level multilevel model, with about 600 level 1 units, 170 level 2 units, and 24 level 3 units. The variability of my outcome at levels 2 and 3 is pretty small, but I am testing the effects of a cluster randomized trial and therefore would like to retain all levels. I am fitting the 3-level model for about 25 outcomes (separate models for each outcome). For many of the outcomes, estimator = MLR works, but there are a few outcomes that do not converge with MLR. They do converge under estimator = BAYES. I do not have any prior information, so I am not sure whether a Bayesian estimator is appropriate here. Would you recommend Bayes in this case, or is there another estimator that would work better for my small level 3 sample size and the small ICCs?
Bayes does not need prior information - Mplus uses uninformative priors as the default. The only exception is that variances are held positive. It may be that MLR has convergence problems due to zero or negative variances.
Hailey Lee posted on Wednesday, February 26, 2020 - 5:32 pm
Thank you! I have a follow-up question then. I'd like to fit a multi-group model with Bayes. I found that some of the earlier postings mentioned that the use of knownclass/mixture can be an option but also came across that knownclass isn’t available for multilevel Bayes. Which one is correct? My goal is to fit a multigroup/multilevel model with Bayes.
When I run the null and fixed models, I get all the model fit information (DIC and PPP), and the values are more than satisfactory. However, when I add the random part, DIC improves substantially but no PPP is reported, and I don't get any error message.
I'm about to submit the paper, and I was wondering what I can write about the fit of the final model. DIC suggests that it is the best model and that I should keep it, but there is no PPP. Can you help me with this?
There is no PPP because the PPP is based on comparing the unrestricted variance-covariance at each level to the model-estimated variance-covariance. If your model is unrestricted (as in User's Guide example 9.25), the PPP doesn't mean anything, as it is comparing the model to itself. So be careful with that.
The best way to justify the random slopes is to report the confidence interval for the variance of the random slope as well as the SE. The Z-score Estimate/SE > 3 is sufficient evidence that the model is improved.
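The Estimate/SE rule of thumb above can be written as a one-line check. The sketch below is plain Python; the function name `slope_variance_z` and the numbers are hypothetical, and in practice the estimate and its SE would be read from the Mplus output for the random slope variance.

```python
def slope_variance_z(estimate, se, threshold=3.0):
    """Return the Estimate/SE ratio and whether it exceeds the threshold."""
    if se <= 0:
        raise ValueError("se must be positive")
    z = estimate / se
    return z, z > threshold


# Hypothetical random-slope variance and its SE (illustration only).
z, supported = slope_variance_z(estimate=0.12, se=0.03)
print(round(z, 2), supported)
```

The threshold of 3 is the value suggested in the reply; the interval for the variance should be reported alongside it.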
Youngshin Ju posted on Wednesday, October 14, 2020 - 7:33 am
Dear Mplus team,
I read the paper M&A (2018), Recent Method for the Study of Measurement Invariance With Many Groups: Alignment and Random Effects, and I ran a Bayesian two-level analysis following this paper (random intercepts and random loadings). However, my indicators are all categorical.
The analysis ran successfully, but I have a question about the between-level results. My indicators have thresholds because they are categorical variables. Below are the between-level residual variances for the variables. REGIS-LAN are random intercepts (thresholds) and s1-s7 are random loadings.
Residual Variances
    REGIS    0.015
    RECAL    0.014
    DRAW     0.013
    TIME     0.012
    PLACE    0.012
    ATTEN    0.010
    LAN      0.014
    S1       0.031
    S2       0.014
    S3       0.031
    S4       0.017
    S5       0.018
    S6       0.034
    S7       0.023
My question is: why are the residual variances for REGIS-LAN not expressed in terms of thresholds? Are they the variances of the random intercepts? Is this not possible to implement for categorical variables in a Bayesian two-level analysis?
Youngshin Ju posted on Wednesday, October 14, 2020 - 7:44 am
Also, my input code is below:
VARIABLE:
  NAMES = NO PID AGE SEX TIME1 TIME2 TIME3 PLACE1 PLACE2 REGIS
          ATTEN1 ATTEN2 ATTEN3 ATTEN4 ATTEN5 RECAL NAME1 NAME2
          REPEAT COMMAND READ WRITE DRAW TIME PLACE ATTEN LAN
          NAME group;
  USEVARIABLE= REGIS RECAL DRAW TIME PLACE ATTEN LAN group;
  CLUSTER = group;
  CATEGORICAL= REGIS RECAL DRAW TIME PLACE ATTEN LAN;
With ordinal categorical outcomes, there is one slope per predictor and also one intercept. So you get one random intercept. To get random effects for each category (minus 1), you would treat the variable as nominal instead.
We ask that postings be limited to one window. Longer questions should be sent to Support.
Youngshin Ju posted on Wednesday, October 14, 2020 - 5:15 pm
Thank you for your reply! However, I don't fully understand how to set the variables as nominal instead of ordered.
Can I declare nominal variables in the input? If not, should I use something like dummy coding for each response category?