
Tom Dietz posted on Sunday, December 03, 2000  6:10 am



I recently purchased Mplus and so am quite naive about its use. In a pilot study, I have about 100 subjects. Each was asked to make 14 decisions, say d. Two characteristics vary across the 14 decisions, say X1 and X2. These take on the same values for a particular decision across all subjects. I'd like to model the decision with a logit, with X1 and X2 predicting the probability of choosing d=1 rather than d=0 for each subject. Then, given logit coefficients B1 and B2 for each subject, I'd like to use B1 and B2 as dependent variables that are a function of individual characteristics W1, W2, W3, etc., using the individuals as the between level. Of course, this is a pilot, so the numbers are small for justifying estimation, but I'm trying to get a sense of the procedures. Is this possible in Mplus? From the manual it wasn't clear how to model the coefficients at the within level as dependent variables at the between level. Thanks, Tom Dietz George Mason University 


Multilevel SEM currently does not allow for cross level interactions. Further, you can not have binary dependent variables in the two level Mplus models. I think your best solution in a multilevel framework is to use HLM. You could also, I think, do this in a one level model in which there are 14 binary outcomes with X1 and X2 as predictors and including interactions between the X and W variables. The advantage to this is that it allows the B's to vary between the different decisions, which may be a more realistic model. The disadvantage is that the interactions are not as clean and you have 14 dependent variables. Bengt or Linda can correct me if I'm wrong. 8) Best of luck. Lee 

bmuthen posted on Monday, December 04, 2000  10:09 am



I think Van Horn's answer is to the point. Mplus currently does not offer multilevel modeling with categorical outcomes or random slopes. But Van Horn's suggestion of a multivariate approach (14 binary outcomes) is a possibility because your decision characteristics x1 and x2 do not vary across individuals. The multivariate approach is analogous to how growth modeling of repeated measures is done in a latent variable framework. Here you have in essence 14 "time points" and 2 "growth factors" for your 100 individuals, so a 14-variable, 2-factor, single-level categorical outcomes problem. The growth factors are the x1 and x2 slopes (probit instead of logit in Mplus), which vary across individuals. The growth factor time scores (i.e. loadings) are your x1 and x2 values, which do not vary across individuals (fixed parameters). The growth factors can be regressed on the w variables. A reference to this type of analysis for continuous outcomes is paper #79 on the Mplus web site under growth modeling. Paper #64 discusses growth modeling of categorical outcomes. 

Anonymous posted on Saturday, January 06, 2001  11:30 am



If I have found a two-factor model with an EFA, but it does not converge as a two-factor model in a multilevel framework, are there any cites that I could use to argue that the two-factor model found in the EFA is an artifact of the nested nature of the data? 

Anonymous posted on Tuesday, May 18, 2004  11:51 pm



I would like help with a problem of partial overlap between a level-2 predictor and the outcome in an analysis of a moderating effect. I intend to consider the model:

Level 1: Yij = b0j + b1j(Xij) + eij
Level 2: b0j = r00
         b1j = r10 + r11(Wj) + u1

The outcome variable Yij is an individual characteristic, such as social competence. The level-2 variable Wj is a composite group variable created from several individual variables (such as academic performance, leadership, peer acceptance, and social competence), so it also includes social competence. A multilevel confirmatory factor analysis indicated that this way of composing the level-2 variable is reasonable. Now I want to know: (1) If I only consider the effect of the level-2 variable on the level-1 random slope, is the overlap between predictor and outcome a serious problem? My reasoning is that I am looking at the level-2 influence on the slope, not the intercept, and the slope is the association between two variables, which is a distinct concept from the level-2 variable. Am I right? (2) If I also consider the effect of the level-2 variable on the random intercept, what should I do? Thank you very much for any comment. 

bmuthen posted on Wednesday, May 19, 2004  8:01 am



I am not sure about this one. On the one hand, one can argue that w is a cluster-level variable that reflects the environment, and even if it is created via aggregation using the individual-level y variable, the aggregated variable means something different. That would say that one could allow w to predict both b0j and b1j. On the other hand, one can argue that this way of creating w introduces a spurious correlation between w and y. To me, that might then call for allowing both b0j and b1j to be predicted by w so that the w and y correlation can be most freely represented. I am inclined to favor the first view, but let's hear what other readers think. 

Anonymous posted on Friday, April 08, 2005  11:05 am



Is there no convenient means of generating Empirical Bayes (EB) residuals for models with random slopes and intercepts using Mplus 3.12 output? I'm interested in the EB residuals because I wish to: (1) test for omitted variables using frequency distributions and scatter plots (etc.) of the random-coefficient-specific residuals; and (2) calculate a Mahalanobis distance for each Level-2 unit so that I can get a sense of outliers, problem cases, and overall model fit (in the manner suggested by Bryk and Raudenbush, Chapter 9). As I understand it, to calculate the EB residuals I would need the Level-2 unit-specific EB slopes and intercepts (which Mplus 3.12 does not provide), or the Level-2 unit-specific reliabilities and the Level-2 unit-specific OLS coefficients for each of the random coefficients (which Mplus 3.12 does not provide). Are you aware of any way to calculate these quantities using Mplus? 

BMuthen posted on Saturday, April 09, 2005  3:48 am



The level-2 unit-specific EB slopes and intercepts are available in Mplus using the FSCORES option of the SAVEDATA command. 

Anonymous posted on Saturday, April 09, 2005  6:06 am



Hi Prof. Muthen, I want to fit a multilevel multiple regression model as follows: Y on X1 X2 X3, with each slope random. How can I write the Mplus program? I do not know how to define each random slope in terms of S. Thanks for your instruction. 


See the examples in Chapter 9 of the Mplus User's Guide and the description of the | command. 

Anonymous posted on Friday, June 24, 2005  3:39 pm



I’m looking for a sanity check on a multilevel model I’ve built in Mplus. The model has two outcome variables: a categorical variable (CAT) with three response levels, and a continuous variable (CONT), both of which I posit as having BETWEEN and WITHIN cluster variation. For WITHIN predictor variables X and BETWEEN predictor variables Z, I specify the following model in Mplus:

%WITHIN%
CONT on X;
CAT on X;
%BETWEEN%
CONT on Z;
I by CAT@1;
CAT@0;
I ON Z;

I obtain from the model a set of WITHIN and BETWEEN parameter estimates (“Beta”s), plus BETWEEN cluster Tau1 and Tau2 estimates. The residual variance of I is nonzero. I’m interested in using the Mplus parameter estimates to predict values of CAT (CATHAT) and seeing how well they match the “true” values of CAT. I use the probit conversions:

p(CAT=1 | X,Z) = cdf.normal(Tau1 - X*Beta - Z*Beta)
p(CAT=2 | X,Z) = cdf.normal(Tau2 - X*Beta - Z*Beta) - cdf.normal(Tau1 - X*Beta - Z*Beta)
p(CAT=3 | X,Z) = 1 - cdf.normal(Tau2 - X*Beta - Z*Beta)

My results are surprising. I substantially underpredict the number of respondents that fall in response category CAT=2. It’s possible that my data do not fit the model assumptions, but I also wonder if I’m thinking about the Mplus output / model parameterization the right way. 1. Am I neglecting a “cluster-level” mean term in the P(CAT | X,Z) formulas (i.e., a mean cluster-level p(CAT))? 2. If the factor scores for I obtained from the above model are like empirical Bayes residuals, can they be used to get a mean, cluster-level p(CAT)? Would this be obtained via: CLUST_MEAN_CAT = EB_RESID - Z*Beta? Thanks. 

BMuthen posted on Sunday, June 26, 2005  2:15 am



In your probability calculations, you need to take into account that the residual variances also include the between-level variation. Also, check the output to see that you have the weighted least squares estimator and not maximum likelihood. If you have maximum likelihood, you have logistic regressions, not probit regressions. 

Anonymous posted on Sunday, June 26, 2005  4:55 pm



Regarding your first point: are you saying that the BETWEEN-level factor scores need to be included in the probability calculations, i.e.: p(CAT=1 | X,Z) = cdf.normal(Tau1 - X*Beta - Z*Beta - EB_RESID)? Regarding your second point: the Mplus 3.0 User Guide suggests that TYPE=TWOLEVEL {ETC.} and TYPE=RANDOM {ETC.} can only be used with ML, MLR, or MLF when one of the dependent variables is categorical (and when I try to run the two-level model in question with WLS or WLSMV, Mplus 3.12 reverts to ML). Am I misunderstanding your response? I don’t see how the WLS estimators can be used to construct a multilevel SEM with concurrent categorical and ordinal outcomes, both displaying WITHIN and BETWEEN variation. Thanks again. 

BMuthen posted on Tuesday, June 28, 2005  8:18 am



No, I meant that the argument for the normal cdf function needs to be divided by a standard deviation that includes the between-level residual variance. WLSMV is never available for TWOLEVEL. With CATEGORICAL outcomes and TWOLEVEL and RANDOM, you only have the estimators you mention. There is a table in Chapter 15 under the ESTIMATOR command that summarizes the estimators available. 

Anonymous posted on Thursday, June 30, 2005  4:08 pm



Thanks for your response. 1. I am aware of the table you note from Chapter 15 of the Mplus 3.0 User’s Guide. From your first response, I got the impression that you were suggesting I use an Mplus WLS estimator, and that without doing so (i.e., using an ML estimator) Mplus would treat the categorical outcome as a dichotomous outcome. I’m not sure I understand your original message from June 26 on this point. 2. I don’t understand why the cdf.normal argument needs to be divided by an SD. Could you provide a citation or further guidance? In a garden-variety HLM, each Level-2 (BETWEEN) unit is assumed to have a different intercept, and the output from typical HLM software provides the overall mean BETWEEN intercept. The EB residuals can be used to get the BETWEEN units’ actual intercepts. In the Mplus multilevel model for ordered categorical variables, does Mplus assume all BETWEEN-level units have the same Tau’s but different underlying (latent) means? If so, shouldn’t the “latent mean” be included in the p(CAT=1 | X,Z) = cdf.normal(…) calculations? Perhaps the difficulty comes in because I use the workaround: %BETWEEN% CONT on Z; I by CAT@1; CAT@0; I ON Z; Thanks very much. 

bmuthen posted on Sunday, July 03, 2005  5:56 pm



1. Because you were using the normal distribution function ("cdf normal"), I assumed you were using WLSMV, since that is where probit is used (I overlooked that you had two-level modeling). With ML, the logistic function is used, so the probabilities have to be computed using that function instead.

2. You can read about related matters in Technical Appendix 1 on our web site (see the "latent response variable formulation"), although that is for single-level modeling. Two-level logit is a bit more complex since you have a logit link combined with normally distributed coefficients (varying across the level-2 units). Say that you have a two-level logistic regression with a random intercept. This can be written in terms of a continuous latent response variable u* as

(1) u*_ij = alpha_j + beta*x_ij + e_ij

where j is the level-2 subscript and e is a residual with a logistic density and so has variance V(e) = pi^2/3 (see e.g. Maddala's book). For simplicity, no second-level predictor of alpha_j is included. Now, the intercept alpha has a mean, say a, and a variance, say v. In Mplus, a threshold (or several thresholds for a polytomous outcome) is estimated instead of the intercept, where the threshold is the negative of the intercept. So the threshold that Mplus prints should be taken as -a. We are interested in the probability of u and therefore have to relate u* to u. This is done by postulating that u=1 when u* > 0, or equivalently (in Mplus style) when

(2) beta*x_ij + (alpha_j - a) + e_ij

exceeds the threshold (-a). To compute this we need the mean and the SD of this latent response given x. The mean given x is beta*x and the SD given x is sqrt(v+V(e)). Now we have to revisit single-level logistic regression, u*_i = alpha + beta*x_i + e_i, so that

(3) P(u* > 0 | x) = P(e < alpha + beta*x)

which, because e has a logistic density, is the logistic function 1/(1+exp(-alpha - beta*x)). I hope I got that right. The last step holds because e is logistic (and therefore has variance pi^2/3). 
But in (1), if we don't condition on alpha_j but only on x, we have further variance in u* given x due to the random intercept. I think this leads to the need to integrate numerically to get the probability you want, by considering the integral over alpha_j of the expression P(u* > 0 | alpha_j) * [alpha_j], where the first part is a logistic function and the second part is the normal density for the random intercept. There must be literature on this in the first publications on 2-level logistic regression with a random intercept. Do we have anyone who can point us to that? 
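The integral described in this post can be sketched numerically. The Python below is only an illustration under stated assumptions, not an Mplus routine: it takes a binary outcome, a random intercept alpha_j that is normal with mean `intercept_mean` and variance `intercept_var`, and averages the logistic probability over that normal density with a plain midpoint rule. All function and argument names are hypothetical. Note that Mplus reports a threshold, which is the negative of the intercept mean used here.

```python
import math


def logistic_cdf(z):
    """Standard logistic CDF, 1 / (1 + exp(-z))."""
    return 1.0 / (1.0 + math.exp(-z))


def normal_pdf(a, mean, var):
    """Density of a normal distribution with the given mean and variance."""
    return math.exp(-(a - mean) ** 2 / (2.0 * var)) / math.sqrt(2.0 * math.pi * var)


def marginal_prob(x, beta, intercept_mean, intercept_var, n_points=2001, width=8.0):
    """P(u = 1 | x), integrating the logistic probability over alpha_j.

    Uses a midpoint rule on intercept_mean +/- width standard deviations.
    With zero intercept variance it reduces to ordinary logistic regression.
    """
    if intercept_var <= 0.0:
        return logistic_cdf(intercept_mean + beta * x)
    sd = math.sqrt(intercept_var)
    lower = intercept_mean - width * sd
    step = 2.0 * width * sd / n_points
    total = 0.0
    for k in range(n_points):
        a = lower + (k + 0.5) * step  # midpoint of the k-th slice
        total += logistic_cdf(a + beta * x) * normal_pdf(a, intercept_mean, intercept_var) * step
    return total
```

With a nonzero intercept variance the marginal probability is pulled toward 0.5 relative to the probability evaluated at the mean intercept, which is one reason a formula that ignores the between-level variance can misfit the observed category proportions.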

Anonymous posted on Tuesday, July 05, 2005  8:46 am



Thanks very much for your detailed response. I now understand your comments on WLSMV vs. ML  I was not aware that Mplus automatically used the Logit parameterization when NUMERICAL INTEGRATION is used in MLSEMs with ordinal categorical outcome variables (at L1 and L2). It occurs to me after reading the second portion of your response that there may be one or two recent pieces in JEBS which address these matters. I'll be away from my office for a few days, but will be able to look into this further when I return. 

bmuthen posted on Tuesday, July 05, 2005  5:03 pm



Yes, JEBS sounds like a likely outlet. Please let me know if you find something relevant there. 

Anonymous posted on Tuesday, July 19, 2005  3:29 pm



I am back in the office and responding to your post of July 3. Below, I provide a handful of references which may be useful to Mplus users interested in the issues alluded to in the latter portion of this thread. As you suggest, there is a rich literature on multilevel models for nominal outcomes, but as best I can determine at present, there is little on using the estimated coefficients from these models to verify the model (or to predict the outcome variable values for future cases using parameter estimates). You may recall my interest is in fitting an MLSEM which features a categorical and an ordinal variable (CAT and CONT, respectively), both of which have WITHIN and BETWEEN variance components. I then wish to use the Mplus-generated coefficients and my original covariates / data to determine the extent to which my Mplus model replicates my original data. Afshartous and de Leeuw (2005) suggest that this is a worthwhile exercise (although my application differs slightly from theirs). Mindful of your comments from July 3 as well as the references cited below, the remaining questions I have regarding the Mplus parameterization of my model and associated output are as follows: 1. I still do not understand how the Mplus model allows for Level-2 unit-specific effects. According to the Rabe-Hesketh and Skrondal pieces, each Level-2 unit in such models should have a unique Level-2 intercept, but all Level-2 units share the same fixed threshold (Tau) parameters regardless of the link function / parameterization used (logit / probit). It appears that these Level-2 unit-specific effects are included in the standard calculations to find p(CAT=1 | X, Z), etc. (where X are a series of Level-1 variables and Z are a series of Level-2 variables) via a CDF.NORMAL or CDF.LOGISTIC distribution, depending on the link function. 
My understanding of Rabe-Hesketh and Skrondal, and of Hedeker and Gibbons, is that provided Mplus can generate the EB residuals for the intercepts Beta0j, such calculations can be readily performed after the user obtains each Level-2 unit’s specific Beta0j (where j indexes Level-2 units in the sample, and the Beta0j are assumed to be drawn from a common normal distribution as in a conventional HLM). Your comments from July 3 seem to imply that there is no intercept in the Mplus ML-LOGISTIC model. If this is the case, I fail to see where the Level-2 unit-specific contribution to the Level-1 CAT probabilities enters into the Mplus parameterization. The only other way I can think of to include Level-2 unit variation in the calculation of the CAT probabilities is to assume each Level-2 unit has a different mean in the CDF.LOGISTIC calculations (but the same Tau’s, X*Beta, and CDF scales), or that each Level-2 unit has its own Tau’s; neither of these have I seen discussed anywhere (nor do they appear to make sense). Thus it seems more likely that each Level-1 (i.e., WITHIN) unit has a Level-2 (i.e., BETWEEN) intercept (but not a Level-1 intercept since, as you note, threshold parameters are used), and the Level-2 intercept must be included in generating the probabilities in “verification” models of the sort I’m interested in. 2. My MLSEM features two sets of variables with BETWEEN and WITHIN sources of variation, yet Mplus 3.12 only provides one set of factor scores (EB residuals). My sense is these are the factor scores pertaining to the (latent) variable I. Note that the above “I parameterization” is the current Mplus-recommended workaround for including categorical variables at BETWEEN and WITHIN in an MLSEM (i.e., I by CAT@1; CAT@0; I on Z). Can I be sure that the factor scores pertain to the categorical outcome? How does one obtain the factor scores / EB residuals for the continuous outcome (CONT) also included in the above model? 
The latter are important for verifying the quality of the model for both outcomes. 3. You noted on July 3 that whenever numerical integration is used in Mplus, the link is logistic, a point I missed initially. Thus in trying to replicate my own data I use CDF.LOGISTIC with the CDF mean set to zero and scale set to 1 (corresponding to a variance of (pi^2)/3). After experimenting with various interpretations of the Mplus-provided coefficients for the CAT outcome portion of the model, the best I do is the following (rounded %’s):

CAT category    “True coding”    Mplus predicted value
1               38%              52%
2               36%              40%
3               21%              8%

The values in the third column are obtained via probability calculations of the general form CDF.LOGISTIC(Tau - Beta0j - X*Beta). Despite various attempts to reconstruct my CAT variable using the Mplus output, my estimates still appear to be quite a bit off. Does this degree of mismatch appear reasonable? Provided that my calculations are correct, the only other source of error I can think of is that the continuous variable, CONT, is actually Poisson-distributed, something I do not take account of in my MLSEM. Could this small omission be affecting my results so much? 4. It’s possible I’m misunderstanding your last point from July 3 regarding “integrating over alpha_j” to get p(u* > 0 | X); but since my interest is in replicating my data rather than inference, and given that the WITHIN and BETWEEN level errors are independent, can’t one simply use the estimated BETWEEN intercepts Beta0j obtained from the EB residuals to verify the MLSEM? Skrondal and Rabe-Hesketh (2003) perform an integration of the type you appear to be mentioning in one of their examples, but in the service of obtaining a population average. Afshartous and de Leeuw (2005) make predictive inferences using garden-variety HLMs (continuous outcomes) without performing any such integration. 
If numerical integration has to be used to average over alpha_j, it would seem that multilevel SEMs with ordinal outcome variables cannot be readily used to make predictive inferences. Is this your sense? Thank you.

REFERENCES; MATERIAL OF POTENTIAL INTEREST

Afshartous, David and Jan de Leeuw. 2005. “Prediction in Multilevel Models” Journal of Educational and Behavioral Statistics 30: 109-140.
Gibbons, Robert D. and Donald Hedeker. 1997. “Random Effects Probit and Logistic Regression Models for Three-Level Data” Biometrics 53: 1527-1537.
Gibbons, Robert D., Donald Hedeker, Sara C. Charles, and Paul Frisch. 1994. “A Random Effects Probit Model for Predicting Medical Malpractice Claims” Journal of the American Statistical Association 89: 760-767.
Rabe-Hesketh, S. and Skrondal, A. 2001. “Parameterization of Multivariate Random Effects Models for Categorical Data” Biometrics 57: 1256-1264.
Skrondal, Anders and Sophia Rabe-Hesketh. 2003. “Some Applications of Generalized Linear Latent and Mixed Models in Epidemiology” Norsk Epidemiologi 13: 265-278.
Hedeker, Donald, and Robert D. Gibbons. 1994. “A Random-Effects Ordinal Regression Model for Multilevel Analysis” Biometrics 50: 933-944.
Wong, George Y. and William M. Mason. 1985. “The Hierarchical Logistic Regression Model for Multilevel Analysis” Journal of the American Statistical Association 80: 513-524.

Also of potential interest (refer also to Skrondal and Rabe-Hesketh, 2003):

Agresti, Booth, Hobert, and Caffo. 2000. “Random Effects Modeling of Categorical Data” Sociological Methodology: 27-80 (Mark P. Becker, Ed.).
Wong, George Y. and William M. Mason. 1991. “Contextually Specific Effects and Other Generalizations of the Hierarchical Linear Model for Comparative Analysis” Journal of the American Statistical Association 86: 487-503.

Rabe-Hesketh and Skrondal have also recently authored a Chapman-Hall text on multilevel models that may be worth a look (although I have not had a chance to look at it myself). 

bmuthen posted on Wednesday, July 20, 2005  9:45 am



Let me first answer your point 1. Say that you have a 2-level logistic regression with a random intercept and a random slope, just like in UG ex 9.2. Say that the dependent variable is ordered polytomous. This model then has a set of thresholds to be estimated (which do not vary across level-2 units), and there is a random intercept which is taken to be normal with mean zero (the mean cannot be identified separately from the set of thresholds) and a variance to be estimated. So with logit (unlike the fixed-effects-only probit of Mplus) you can have an intercept that varies across level-2 units. You can compute estimated scores on this intercept "factor" by requesting factor scores. The same goes for the random slope. 

bmuthen posted on Wednesday, July 20, 2005  6:48 pm



Here is more on point 1., also responding to point 2. The most straightforward Mplus setup I think would be:

%WITHIN%
CONT on X;
CAT on X;
%BETWEEN%
CONT on Z;
CAT ON Z;

ML estimation uses a logit link for CAT (assumed ordered polytomous) in line with my answer above (July 20, 09:45). This modeling gives a random intercept for CONT and a random intercept for CAT (both of which are modeled on the Between level, i.e. on level 2). Estimates of the level-2 values of these random intercepts are obtained when requesting factor scores. From this model one can compute estimates of the marginal probability (1) P(CAT | x, z) or the conditional probability (2) P(CAT | x, z, a_j), where a_j is the estimated random intercept varying across the level-2 units j. Because CAT is categorical, has a logit link, and has a normally distributed intercept, (1) has to be obtained by numerical integration. This is quite feasible, although not included in Mplus. In contrast, (2) conditions on the intercept estimate for a specific level-2 unit j and can therefore be computed using a standard logistic regression expression. So the question is which probability is most useful for your purposes. Hope this is of some help. I will also take a look at the JEBS article you mention since I haven't read it. 
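The conditional probability (2) can be sketched outside Mplus as follows. This is a minimal Python illustration, not Mplus output or an Mplus routine: it assumes an ordered logit with thresholds common to all level-2 units, a within-level slope `beta_x`, and an estimated random intercept `a_j` for the unit (taken here to already include the effect of z, as a factor score for the intercept would). The function and argument names are hypothetical.

```python
import math


def logistic_cdf(z):
    """Standard logistic CDF, 1 / (1 + exp(-z))."""
    return 1.0 / (1.0 + math.exp(-z))


def category_probs(thresholds, beta_x, x, a_j):
    """Ordered-logit category probabilities for one observation, given a_j.

    thresholds must be in increasing order; the result has one more entry
    than there are thresholds (category 1 first).
    """
    eta = beta_x * x + a_j
    # Cumulative probabilities P(CAT <= k); the last category closes at 1.
    cumulative = [logistic_cdf(tau - eta) for tau in thresholds] + [1.0]
    probs = []
    previous = 0.0
    for c in cumulative:
        probs.append(c - previous)
        previous = c
    return probs
```

For a three-category outcome this returns three probabilities that sum to one. A larger a_j moves probability mass toward the highest category, which is how level-2 unit differences enter even though the thresholds are shared.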

bmuthen posted on Monday, July 25, 2005  6:12 pm



I realize that the current Mplus version does not *directly* give between-level factor scores for random intercepts when integration is required (as with categorical outcomes). But the workaround that was suggested above will give them:

%Between%
cont on z;
i by cat@1;
cat@0;
i on z;

So here i is a factor, and it is the random intercept for which you get scores for each between unit when requesting factor score computations. Now if you want only the residual of i in the regression on z, then you have to subtract the beta*z term for each between unit outside Mplus. Mplus just created a little Excel routine to compute probabilities for cat in these models, that is, P(cat | x, z). It involves numerical integration but is perfectly doable. 
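The subtraction of the beta*z term could be done outside Mplus along these lines. This is a small Python sketch under assumptions: the factor scores for i and the z values have been read in as parallel lists, one entry per between unit, and `beta_z` is the estimated slope of i on z. All names are hypothetical.

```python
def intercept_residuals(factor_scores, z_values, beta_z):
    """Residual of the random intercept i after removing its regression on z.

    factor_scores[k] and z_values[k] must refer to the same between unit.
    """
    if len(factor_scores) != len(z_values):
        raise ValueError("factor_scores and z_values must have the same length")
    return [score - beta_z * z for score, z in zip(factor_scores, z_values)]
```

The residuals, rather than the raw factor scores, are the quantities to plot when checking for omitted between-level predictors.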

anonymous posted on Wednesday, November 02, 2005  6:32 am



Hello Linda, hello Bengt, I would like to detect multivariate outliers within nonindependent data. Is there a way to compute Mahalanobis distances with Mplus when using "Type=Complex" for clustered data? Many thanks for the support here in the Mplus discussion, it is invaluable!! 


Several outlier detection measures including Mahalanobis distance will be available in Version 4. 
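For readers who want to see what the measure computes, here is a minimal Python sketch of squared Mahalanobis distances for two variables, computed outside Mplus. It is an illustration only: it uses the ordinary sample mean and covariance, ignores clustering, and is not claimed to match what Mplus reports. The function name is hypothetical.

```python
def mahalanobis_sq_2d(points):
    """Squared Mahalanobis distance of each (x, y) point from the sample mean.

    Uses the sample covariance matrix with an n - 1 divisor and the
    closed-form inverse of a 2 x 2 matrix.
    """
    n = len(points)
    if n < 3:
        raise ValueError("need at least three points")
    mean_x = sum(p[0] for p in points) / n
    mean_y = sum(p[1] for p in points) / n
    sxx = sum((p[0] - mean_x) ** 2 for p in points) / (n - 1)
    syy = sum((p[1] - mean_y) ** 2 for p in points) / (n - 1)
    sxy = sum((p[0] - mean_x) * (p[1] - mean_y) for p in points) / (n - 1)
    det = sxx * syy - sxy ** 2
    if det <= 0.0:
        raise ValueError("covariance matrix is singular")
    distances = []
    for x, y in points:
        dx, dy = x - mean_x, y - mean_y
        distances.append((syy * dx * dx - 2.0 * sxy * dx * dy + sxx * dy * dy) / det)
    return distances
```

Large values flag observations that are far from the centroid relative to the spread and correlation of the variables.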


Regarding detection measures for outliers in version 4. When you use: PLOT: TYPE IS PLOT3; PLOT: OUTLIERS ARE MAHALANOBIS; The graphs available that I get under the "V" button are: Histograms (sample values, outliers, estimated values) Scatterplots (sample values, outliers, estimated values) When I view these graphs, are they only showing me the outliers? I wonder this since it only shows me a portion of the total sample. 


When you ask for histograms or scatterplots, the outliers are available as variable choices. When you choose these variables, you should see the entire sample. If you do not, you should send your input, data, output, and license number to support@statmodel.com. Note that it is not necessary to have PLOT twice. 

Liu Xiao posted on Monday, April 23, 2007  5:10 am



Hi, Dr. Muthen, I am fitting a hierarchical model:

analysis: type=twolevel random missing;
model:
%within%
s | y on x;
%between%
s on z;
y on z;

Because STAND in the OUTPUT command is not allowed with type=random, is it possible to get the standardized values of s and y's intercept? Thank you very much. 


Standardized parameters are not given in this situation because the variance of y varies for each value of x. This makes it unclear what variance to use for standardization. This is a research topic. 

c parker posted on Friday, July 13, 2007  11:40 am



I have fit a multilevel model, with individuals nested within neighborhoods. At level 1 I have a predictor of family type(6 categories, so I am working with 5 dummy variables at level 1). At level 2 I have a predictor of neighborhood type (3 categories, so I am working with 2 dummy variables at level 2). I have performed analyses using the raw metric, grand mean centering, and group mean centering, and am looking at a random intercept model. I have run several analyses, and have changed the reference variables to verify that my results are consistent (e.g., a. using family type 1, neighborhood type 3 as the reference groups (i.e., the omitted dummy codes) in one analysis, b. using family type 2, neighborhood type 3 as the reference groups in another analysis.) It is my understanding that the coefficient associated with family type 2 from analysis a. should be equal but of the opposite sign as the coefficient associated with family type 1 from analysis b. I've obtained parameter estimates consistent with this when using group mean centered dummy variables and dummy variables in their raw metric. However, when using grand mean centered dummy variables, I am not obtaining parameter estimates that are consistent with this. Since grand mean centering is simply a rescaling of the variables, shouldn't I obtain results that follow the equal but of opposite sign situation described above? Any feedback would be greatly appreciated. 


I am not familiar with centering dummy variables because the numbers represent categories. 

Joy Oliver posted on Monday, August 13, 2007  12:40 pm



Hi, I am very new to MPlus, and I am getting an error statement from what I thought was a relatively simple multilevel regression. It is posted below. Could you help me understand what this means and how I can fix it? I have increased the iterations and nothing seems to work. Thanks! THE ESTIMATED BETWEEN COVARIANCE MATRIX IS NOT POSITIVE DEFINITE AS IT SHOULD BE. COMPUTATION COULD NOT BE COMPLETED. PROBLEM INVOLVING VARIABLE VI [WithinLevel Y] THE MODEL ESTIMATION DID NOT TERMINATE NORMALLY DUE TO AN ERROR IN THE COMPUTATION. CHANGE YOUR MODEL AND/OR STARTING VALUES. THE H1 MODEL ESTIMATION DID NOT CONVERGE. SAMPLE STATISTICS COULD NOT BE COMPUTED. INCREASE THE NUMBER OF H1ITERATIONS. 


Please send your input, data, output, and license number to support@statmodel.com. 


Dear Bengt and Linda, I plan to do some multilevel analysis with IEA PIRLS data. The International Association for the Evaluation of Educational Achievement (IEA) conducted the Progress in International Reading Literacy Study (PIRLS) to examine trends in the reading achievement of 9/10-year-olds around the world. Data collection is done every 5 years, and the data are obtained by a stratified two-stage cluster sampling scheme. The first PIRLS was in 2001, and now we have PIRLS 2006 data available. 28 countries took part in both studies. The two studies used the same questionnaire instruments. So we can say that at the country level it is a longitudinal design, but at the lower levels (student, classroom, and school levels) it is not (we have the same variables but different individuals). Having given the background, my first question is: I want to look at the effects of changes in some reading-related variables, such as SES, early home reading activities, school resources, etc., on changes in reading achievement (by change I mean the differences in scores on the reading-related variables and reading achievement between 2001 and 2006). Is it possible to do any multilevel analysis of change in Mplus? (continued in the next post) 


If not, my second question: I want to do a multilevel analysis with the reading related variables at students, teacher and school levels to predict reading achievement, and I want to compare the 28 countries simultaneously in the model. Should I create a set of dummy variables, one for each country, and bring them into the model or are there other solutions in Mplus? I am looking forward to your suggestions. Thanks in advance. Best regards, Kajsa. 


I don't think you can do much of a change analysis here. If you aggregate your variables to the country level (i.e. use country-level variables), you are right that you have longitudinal data for the 28 countries, but only 2 timepoints. That only affords, say, a random intercept, fixed slope model. As for the second question, I would use 27 dummy variables for country. 

Xiaorui posted on Friday, April 17, 2009  9:00 am



I am using multilevel regression analysis. The model runs normally except for a warning. Can I fix it, and can these results be used?

WARNING: THE MLR STANDARD ERRORS COULD NOT BE COMPUTED. THE MLF STANDARD ERRORS WERE COMPUTED INSTEAD. THE MLR CONDITION NUMBER IS 0.128D-03. PROBLEM INVOLVING PARAMETER 35 {which is psi (c1r3, between-level)}. THIS MAY BE DUE TO NEAR SINGULARITY OF THE RANDOM EFFECT VARIANCE/COVARIANCE OR INCOMPLETE CONVERGENCE.

The syntax is:

ANALYSIS: LOGCRITERION=.002; type=twolevel; SDITERATIONS=200; H1iterations=10000; MITERATIONS=50000;
MODEL:
%within%
C1R3 on ms g;
C2R3 ON ms g C1R3;
C4R3 ON ms g C2R3;
C5R3 ON ms g C4R3;
C6R3 ON ms g C5R3;
%between%
C1R3 on MS ZMC1;
C2R3 ON ms ZMC2;
C4R3 ON ms ZMC4;
C5R3 ON ms ZMC5;
C6R3 ON ms ZMC6;

Thank you very much. Xiaorui 


The results are probably ok here, but if you want to make sure you need to send your input, output, data and license number to support@statmodel.com. 


We surveyed people with stratified random sampling, where the strata were combinations of gender and age. Then we constructed variables from census data that show the proportion of each blockgroup population that has different characteristics. We appended the census blockgroup proportions to the survey participants by matching on blockgroup (var BlockgID). For some blockgroups we have just one observation; for others we have many. I want to fit a two-level random intercept, random slope model. I want to see how individual characteristics (e.g. age) influence a continuous dependent variable (level 1), and I also want to see how the level 1 intercept and slopes depend on the census proportions (the level 2 models). Our obs are correlated because some occur in the same blockgroup. However, cluster sampling was not used: our sampling made no use of blockgroups. 1. Do I need to indicate that the obs are grouped within blockgroups by specifying Cluster = BlockgID? 2. Do I need to somehow indicate that the Level 2 proportion variables represent blockgroups? 3. Or can I just skip mentioning anything about blockgroups and specify the stratification and weight variables? 4. Should I specify Type = Complex or Type = twolevel random? If it makes a difference, why is one of these types preferred for this analysis? Thanks so very much for your assistance!


Even though your sample consists of individuals from different blockgroups, if they were not actually sampled from blockgroups, the CLUSTER option is not appropriate. It sounds like you would need to use TYPE=COMPLEX with the STRATIFICATION and WEIGHT options. 


Linda, thanks for your reply. I hope my May 6 post above showed that we want to examine random effects representing variation in the intercepts and coefficients across the block groups, even though we did not sample using the block groups. 1. If I just specify type = complex rather than type = twolevel random, how do I tell Mplus that the grouping variable for the random effects is the blockgroup ID variable (BlockgID) without listing it in the cluster statement (which you said I don't need)? I couldn't find any examples in Chpt 9 of the user's manual about how to run this type of analysis using type = complex. All the multilevel analyses shown there (that I could understand) used type = twolevel. 2. Please consider the following multilevel model where depression and drink are binary variables: Level 1: depression = b1*Drink Level 2: b1 = b2*(Pctrenters in blockgrp) + u With type = twolevel random I'd list Drink on the WITHIN statement and Pctrenters on the BETWEEN statement. How do I do that using just type = Complex as you've said I should do? 3. Do I need to do anything special so that with FIML no observations will be omitted because of missing values for depression, Drink, or Pctrenters? Thanks for your help! 


P.S. Could you show me the syntax I should use in my input file for the simple two level model in my above post that predicts the occurrence of depression? Thanks again so much. Much appreciated! 


If you want to estimate random intercepts and random slopes, you can use TYPE=TWOLEVEL with blockgroups as a cluster variable. See Example 9.2. 
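
A minimal sketch of what such an input could look like for the depression example in this thread, patterned on Example 9.2. The variable names (bgid, depress, drink, pctrent) and their order in the data file are assumptions for illustration; sampling weights and any further covariates are left out.

```
VARIABLE:
  NAMES = bgid depress drink pctrent;  ! hypothetical names and order
  USEVARIABLES = depress drink pctrent;
  CATEGORICAL = depress;               ! binary outcome
  CLUSTER = bgid;                      ! blockgroup ID
  WITHIN = drink;                      ! level 1 predictor
  BETWEEN = pctrent;                   ! level 2 predictor
ANALYSIS:
  TYPE = TWOLEVEL RANDOM;
  ALGORITHM = INTEGRATION;
MODEL:
  %WITHIN%
  s | depress ON drink;                ! random slope s
  %BETWEEN%
  depress s ON pctrent;                ! intercept and slope on pctrent
```

On the between level, depress refers to the random intercept, so the last statement regresses both the intercept and the slope on the blockgroup proportion.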


Thanks for the clarification, Linda. I was just confused by your earlier reply saying to avoid specifying a cluster variable. Could you also answer my question #3 above? It appears that if I just wanted to run a one-level logistic regression under FIML, to avoid having observations omitted because of missing values on the predictors, I would have to bring the predictors into the model by making assumptions about their distributions (e.g. normality) by mentioning their variances in the model statement. Will I have to do this "bringing the predictors into the model" in my two-level random intercepts and random slopes logistic regression model? If yes, what syntax should I use in order to tell Mplus that the dichotomous predictors should follow a binomial distribution and not a normal one? Thanks again for your help!


In all models except for all continuous outcomes and TYPE=GENERAL, if you don't want observations with missing values on covariates eliminated, you need to mention their variances in the MODEL command. 


Linda, thanks for the confirmation. I'm sorry, but I need some more hand holding. It would appear that if I only mentioned the variances of the dichotomous predictors in the Model command, Mplus would assume they're normally distributed. Right? Rather than assuming they're normally distributed, it appears I need to tell Mplus that the dichotomous predictors follow a Bernoulli distribution with variance = p(1-p). How do I specify the value of p (the proportion of successes) or their actual variance p*(1-p) in my Mplus syntax? I know in multiple imputation that dichotomous variables are sometimes treated as normally distributed, but if I don't want to do that, what syntax should I use in my random intercepts and random slopes model so that observations with missing values for dichotomous predictors (e.g. gender) aren't omitted? I appreciate your continuing guidance on this issue!


In regression, all covariates are treated as continuous whether they are binary or continuous. You should not specify anything about their scale. You have two choices with covariates that have missing data. You can estimate the model conditioned on the covariates, in which case all observations with missing values on one or more covariates will be eliminated from the analysis, or you can bring them into the model and make distributional assumptions about them.


Linda, I need help understanding your answer. Does "and make distributional assumptions about them" mean binary predictors must be treated as normally distributed in an Mplus FIML logistic reg to avoid listwise deletion? Did Bengt suggest below that normality was just one possible solution under FIML for dichotomous predictors? Or is normality for all vars assumed with FIML because it's the basis of the likelihood function for each observation? Could you or Bengt please clarify? How egregious is it to avoid omitting obs in logistic regression by assuming normality for dichotomous predictors? Would reviewers say we made an absurd distribution assumption in our multilevel model? As Bengt alluded, people often impute binary variables via Markov chain Monte Carlo though it assumes multivariate normal data. Imputation with regression switching (chained equations, MICE, ICE) was invented to address the normality assumption. Thus maybe some people wouldn't accept treating binary covariates as normally distributed. Your thoughts? Thanks for your attention!  Bengt’s 11706 partial reply to a previous post of mine: Missingness in covariates can be handled by adding to the original logistic regression model an assumption of (for example) normality for the covariates. Imputation techniques often use this assumption as a proxy even when some covariates are dichotomous. 


Linda and I are saying the same thing here: by bringing the covariates into the model by mentioning their means or variances, Mplus treats the covariates as continuous-normal. There are imputation programs such as Schafer's that acknowledge that not all variables to be imputed are continuous-normal but that some variables are categorical. I don't know how wrong you will be ignoring the categorical nature or how sensitive reviewers are to ignoring it. Perhaps others have experience? Related to this, you don't want to put categorical covariates on the CATEGORICAL= list because then you change the model, no longer conditioning on the covariates.


Bengt, thank you very much for the explanation. When I attempted to run my multilevel logistic regression, I got this error message: "Clusters are not nested within strata. Each stratum must contain unique cluster IDs. Cluster ID 1 appears in more than one stratum." In this analysis I listed the variable STR on the Stratification = statement, SITEWT on the Weight = statement, and BGID on the Cluster = statement (based on Linda's Aug 7 6:13PM reply above). The values of STR identify strata formed from combinations of gender and age ranges. The values of BGID identify blockgroups. Our data were collected by interviewing people using stratified sampling within the gender x age strata that ignored their blockgroup membership, and now we want to examine variation in the intercepts and slopes across census blockgroups. How can I do this in Mplus?


Typically, cluster units are different in different strata with strata being different geographical regions. Your application seems different because the same cluster unit appears in several different strata. As a simplified approach, perhaps you should use cluster=bgid and let gender x age groups be handled by covariates. 


Bengt, the approach you suggest of including the gender x age groups as covariates is exactly the same suggestion we got a while back from the support service for MLwiN, who said that MLwiN could not allow for stratification variables. Thus we thought we'd try Mplus for our multilevel modeling. Since you've suggested including the gender x age groups as covariates "as a simplified approach", is there an "unsimplified" approach in Mplus that will take our stratified sampling into account so that the standard errors in our multilevel model will be as small as possible? The principal investigator for this project, who has considerable experience with MLwiN, thought that Mplus might provide a superior analysis. Thanks again for your help. It is truly appreciated!


There does not seem to be an established multilevel method for handling non-nested structures like these (where the sampling and the modeling are not nested), and one would have to choose between these two slightly disadvantaged alternatives. Alternative 1. Split each block into gender x age subblocks and have a different random effect for each subblock. You do this by adding the following command in the DEFINE section (assuming BGID<10000): BGID = BGID+10000*STR. This is actually a more general multilevel model than the one you want because it has individual random effects for each subblock (the model you want is the one where each block has one random effect, which is the same as highly correlated subblock random effects). Alternative 2. Have one random effect for each block. Since Mplus does not support non-nested structures, you have no choice but to drop the strata variable and make it a covariate (and possibly add interactions between the strata and other covariates where you think there is a strata advantage).


One has to keep track of what the stratification really delivers: very often it's not as helpful as one would think. So keep track of the design effect (and maybe run some single-level models first to evaluate the design effect). Careful consideration should help you make the right choice. If you don't have a sizable design effect, you can pursue Alternative 2 without adding the strata variable as a covariate and still have a simple model. The disadvantage of Alternative 1 is that you get more but smaller clusters, which may have bigger measurement error for the random effect, and that will counteract the gain from the stratification.
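
Alternative 1 from the reply above could be set up roughly as follows. This is only a sketch: it assumes the cluster and stratum variables are named BGID and STR as in the earlier post, that every BGID value is below 10000, and that the recoded variable is then used as the cluster ID as the reply describes. Weights and the model itself are omitted.

```
VARIABLE:
  CLUSTER = bgid;      ! STR is no longer listed under STRATIFICATION
DEFINE:
  ! one cluster ID per blockgroup-by-stratum subblock
  ! e.g. blockgroup 123 in stratum 4 becomes 123 + 10000*4 = 40123
  bgid = bgid + 10000*str;
ANALYSIS:
  TYPE = TWOLEVEL RANDOM;
```

Because each recoded ID occurs in exactly one stratum, the subblocks are nested within strata, at the cost of more and smaller clusters.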


Thank you for your ideas. They give me a lot to think about! 

Susan M. posted on Friday, April 23, 2010  8:38 am



I have a very general question: I have been using SAS for an HLM model that has repeated measures for patients nested within doctors nested within clinics. This nested model can’t run due to memory constraints (using Proc MIXED or NLMIXED). The newer SAS proc (HPMIXED) will allow continuous outcome models to run because it uses sparse matrix techniques. However, the investigator really prefers nonlinear outcomes. I was curious whether Mplus might be using estimation methods that would enable these particular nested nonlinear models to run, but now that I am reading more in the manual I am concerned that I will encounter similar problems. At least my current thinking is that my model will be most similar to the examples that carry the notation below. “* Example uses numerical integration in the estimation of the model. This can be computationally demanding depending on the size of the problem” In addition, as I was playing with a very simple model just using clinic nestings, it appeared I would need to create dummy variables for the 20 clinics. That is certainly fine and doable, but I have hundreds of doctors and many thousands of patients (which wouldn’t be doable). Am I missing something in my understanding? I don’t expect a detailed answer so much as I am seeking insight as to whether other folks have successfully created large population nested models with categorical outcomes in Mplus. Thank you!


I'm not sure whether there would be problems in Mplus. If you send an Mplus input and your data to support@statmodel.com, I can try it and see. With categorical outcomes and maximum likelihood, numerical integration is needed only if the model includes latent variables with categorical factor indicators.


Hello mplus team. I've been using your software for 6 months and it's really great. I have a question about multilevel modeling. I've seen more and more HLM models in the management literature lately. In some cases researchers tend to regress unit level variables on individual level variables directly instead of separating the variance of the individual variable to within and between. I'm a bit surprised since in my mind this procedure could bias the relationship by either over or underestimating the effect. Am I mistaken thinking that a group level variable should only influence the intercept or slope of the group rather than the variance of individuals? 


In Mplus a between-level variable cannot be regressed directly on a within-level variable. See Examples 9.1 and 9.2 of the user's guide for further information about how this is handled in Mplus.


Thank you Linda for your quick answer. Sorry for not being more specific. What I'm trying to get my head around is whether the results would be biased if I would instead run the analysis at one level as type = general where I regress a variable measured at the between level on within level observations? 


Sorry about the numerous messages. However, I'm trying to figure out when to use group and grand mean centering. I read an article that grand mean centering can sometimes lead to biased results in a between level mediation analysis where a variable measured at the group level is mediated by a variable measured at the individual level on an outcome also measured at the individual level. However, I could not apply group mean centering on the mediator because it seems group mean centering can only be applied to X variables. 


I think that the point estimate will be correct but that there will be a distortion of the standard errors. You can see if this is true by generating data where y is a between variable and x is both a between and a within variable using TYPE=TWOLEVEL, for example, model population: %within% [x@0]; x@1; %between% y on x*1 ; y*.5; and save it. See mcex9.1.inp and Example 12.6 in the user's guide. In a second step you can do an external Monte Carlo where you analyze the data as TYPE=GENERAL. 


Thank you Linda very much! It works as you suggested. The standard errors differ between the models. Further, I noticed that when both X and Y variables are modeled on within and between the relationship between the two can be significant within and not significant on between or vice versa. 

Rob Dvorak posted on Thursday, October 21, 2010  10:15 am



Hi, I have a general analysis question. I'm estimating a two-level model with an interaction between 2 of the level 1 slopes. The output indicates a random variance component for the interaction, and I am wondering if allowing the interaction slope to vary is ok. I'm wondering if there is some sort of dependency since it's an interaction, or if it's simply handled as if it were any other model predictor. Similarly, what if I added a quadratic slope? If this had a random variance component, would I allow it to vary even though the slope it was built from is in the model? Thanks, Rob


It sounds to me like you have a model with three random slopes: one for one level 1 variable, a second for another level 1 variable, and a third for the interaction between the first two variables. These should be correlated on level 2 as should a quadratic growth factor if you have one. 

Anne Chan posted on Sunday, December 12, 2010  4:31 pm



Hello! I ran a two-level regression, including "ANALYSIS: TYPE = TWOLEVEL RANDOM MISSING;" in my syntax. There are missing data in my level 2 data, and I found that Mplus did not include the cases with missing data in the analyses. (There was a warning statement in the output: Data set contains cases with missing on x-variables. These cases were not included in the analysis.) May I ask whether Mplus excludes cases with missing values at level 2 when these variables are predictors in the model?


Missing data theory does not apply to observed exogenous variables. The model is estimated conditioned on these variables. You can mention the variances of the observed exogenous variables in the MODEL command and these variables will be treated as dependent variables and distributional assumptions will be made about them but cases with missing on them will not be excluded. 


First of all, I apologize for having asked this same question yesterday (under a different thread). So, Linda's post (above) from 12/12 answers the question I posted yesterday about why cases are being excluded from the analysis. However, I don't understand the practical implications of Linda's answer to Anne Chan. How would I "mention the variances of the observed exogenous variables in the MODEL command"? My understanding of Linda's 12/12 response is that if I did this, the cases would not be excluded (am I right about this? that she is saying that by bringing the variances of the observed exogenous variables into the model command, the model will treat these variances as dependent variables and, as a result, the cases will not be excluded?). Thanks.


You mention the variance by mentioning the variable name in the MODEL command. This is how variances/residual variances are referred to in Mplus. If you do this, the variables will be treated as dependent variables and the observations with missing data on them will not be excluded from the analysis. 
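
As a sketch of what this looks like in a two-level input, assuming a hypothetical outcome y, a within-level covariate x, and a between-level covariate w that has missing values:

```
MODEL:
  %WITHIN%
  y ON x;
  %BETWEEN%
  y ON w;
  w;        ! mentioning the variance of w brings it into the model
```

The trade-off noted earlier in this thread still applies: once w is mentioned this way, it is treated as a dependent variable and distributional assumptions are made about it.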


Hello, I have a question about mixed effects logistic regression with nested random effects. In my data I have individual trees sampled within plots that are nested within prescribed fires. I have been modeling the effects of fire on tree mortality in R using the lme4 package and am trying to replicate the results in Mplus. However, I am getting different results in R and Mplus and think it is due to how I have specified the model. My Mplus script is:
VARIABLE: CATEGORICAL = Status; USEVAR = Status4 DBH VolSc; WITHIN = DBH VolSc; STRATIFICATION = Site; CLUSTER = Plot;
ANALYSIS: TYPE = TWOLEVEL COMPLEX;
MODEL: %WITHIN% Status4 ON VolSc DBH; %BETWEEN%
The Mplus output is:
Akaike (AIC) 349.021
Within Level
STATUS4 ON      Estimate   S.E.
  VOLSC         0.047      0.004
  DBH           0.027      0.010
Between Level
Thresholds
  STATUS4$1     1.149      0.445
Variances
  STATUS4       0.299      0.259
The output I get from R is:
AIC     BIC     logLik   deviance
348.4   368.8   169.2    338.4
Random effects:
Groups        Variance   Std.Dev.
PlotID:Site   0.094263   0.30702
Site          0.322407   0.56781
Fixed effects:
              Estimate   Std. Error
(Intercept)   1.466283   0.467578
DBH           0.024831   0.005512
VolSc         0.048221   0.004925
Thank you for your help!


You could try sharpening the convergence criteria in the programs. In Mplus, you would use mconvergence, say mconv = 0.00001; You can tell which program reaches the best solution by comparing their loglikelihood values (high is good). 


I sharpened the convergence as you suggested, but I think there is a more fundamental problem with how I am specifying the model in Mplus that I am not recognizing. I think I am somehow specifying the models differently in Mplus and R because the coefficient standard errors of the fixed effects (DBH, VolSc) are quite different, and the variance estimates of the random effects are different as well. In Mplus, when I set STRATIFICATION = Site and CLUSTER = Plot in the VARIABLE command and use TYPE = TWOLEVEL COMPLEX, are both grouping variables being treated as random effects? I also made sure to set REML=FALSE in the R script, as I believe Mplus model estimation is based on ML, correct?


No. When there is clustering due to both primary and secondary sampling stages, the standard errors and chi-square test of model fit are computed taking into account the clustering due to the primary sampling stage using TYPE=COMPLEX, whereas clustering due to the secondary sampling stage is modeled using TYPE=TWOLEVEL.


OK, I see. I do get the same results between R and Mplus when I just use one grouping variable and TYPE=TWOLEVEL, so my problem definitely is related to how I am specifying the first-order grouping variable (Site) in Mplus. So my question is how can I specify in the model that both grouping variables should be treated as nested random effects? I am not interested in Site or Plot effects, but I do want to account for the nested structure in the data so that I am accurately estimating the individual-level effects of fire damage and tree size on mortality. I know averaging the data at the Plot level is one option, but I don't think this is ideal and would like to avoid doing so if possible. Thank you both for your prompt responses. Your help has been much appreciated!


Both cannot be random in the current version of Mplus. 


Hello Dr. Muthen, I am currently working on my dissertation, which entails an analytic approach involving a 2-2-1 multilevel SEM framework (MSEM). More specifically, I'll be assessing multilevel mediation at level 2. Level 1 involves student outcomes and level 2 involves teacher/classroom predictors (commitment and burnout). My question is whether this type of approach is appropriate with N = 204 students at level 1 and J = 75 teachers at level 2. If so, do you happen to know of any supporting literature? Additionally, the number of students nested within classrooms is unequal. Some classrooms have one student, while others may have 3 to 4. Does this create any biases in estimating parameters, or can these biases be rectified by the planned MSEM approach? Thanks so much for the help!


I think a good way to get more information on this is to do a Monte Carlo simulation study, which can be done in Mplus. See Chapter 12 of the Version 6 UG. As an alternative to the Monte Carlo approach, there are articles on analytically-determined power by several authors including Raudenbush (Google it), but I am not sure they cover general enough situations to be helpful to you. And they do not cover quality of recovery of parameter estimates, SE estimates, and chi-square test of model fit, which you can get out of a Monte Carlo study.


Hi, I just purchased Mplus and tried to run a two-level regression for my dissertation. The cluster variable is SCHOOL. How do I write the MODEL statement? %WITHIN% y on x; %BETWEEN% ??? Example ex9.1a states y on w xm; and I'm not sure how to get w and xm from x. Thank you so much!


w is a cluster-level variable; if you don't have one, you ignore this. xm is the cluster mean of x, which you can create by the cluster_mean(x) option (see the UG). If you don't want that, just say %Between% y;
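
Put together, the reply above might translate into an input along these lines. This is a sketch that assumes a version of Mplus that supports CLUSTER_MEAN in the DEFINE command and a data file with hypothetical variables school, y, and x; the new variable xm is listed last on USEVARIABLES because it is created in DEFINE.

```
VARIABLE:
  NAMES = school y x;        ! hypothetical names and order
  USEVARIABLES = y x xm;
  CLUSTER = school;
  WITHIN = x;
  BETWEEN = xm;
DEFINE:
  xm = CLUSTER_MEAN(x);      ! school mean of x
ANALYSIS:
  TYPE = TWOLEVEL;
MODEL:
  %WITHIN%
  y ON x;
  %BETWEEN%
  y ON xm;
```

Without any between-level predictor, the between part reduces to the single statement y; which estimates the variance of the random intercept.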


Thank you for your response! I'm still struggling the multilevel analysis. Below is the incomplete statement I tried. These data are nested in 19 different courses. I don't have any independent variables for cluster level. What BETWEEN statement would be and did I miss anything else? TITLE: MULTI LEVEL; DATA: FILE IS Motivation_path.csv; TYPE IS INDIVIDUAL; VARIABLE: NAMES ARE ID COURSE GRADE Int Tech_eff Pre_Eng Auto_Sup Const Post_Eng Pre_Mot Post_Mot; USEVARIABLES Grade Tech_eff Auto_Sup Const Pre_Mot Post_Mot Int; CLUSTER IS COURSE; ANALYSIS: TYPE=GENERAL TWOLEVEL; MODEL: %WITHIN% Post_Mot ON Int Pre_Mot Tech_eff Auto_Sup Const; Grade ON Post_Mot; %BETWEEN% ????? Thank you so much! 


%BETWEEN% Post_Mot; Grade; Post_Mot WITH Grade; For simplicity, you also want to add in the VARIABLE command: WITHIN = Tech_eff Auto_Sup Const Pre_Mot Int; 


Thank you so much! 


This time I have a question about two-level SEM analysis. I don't have independent variables for the between level. I just want to test the proposed model at two levels because students are nested in different courses. Could you take a look at the between statement? WITHIN = Pre_Mot Tech_Eff Auto_S Const; CLUSTER IS COURSE; ANALYSIS: TYPE IS TWOLEVEL; MODEL: %WITHIN% Const BY CON1-CON7; Pre_Mot BY Pre_M1 Pre_M2; Post_Mot BY Post_M1 Post_M2; Auto_S BY AS Cont; Post_Mot ON Pre_Mot Tech_eff Auto_S Const; Grade ON Post_Mot; %BETWEEN% Post_M; Grade; Post_M WITH Grade; Thank you so much!


It seems post_m should be on the BETWEEN list. The best way to know if an input is correct is to run it and see if you get what you want. 


I'm sorry to ask the same question again. This is my first time trying SEM analysis, so I'm a beginner. I have done a one-level (student-level) SEM just fine. But because students are nested in different courses, I would like to take the course effect into account when testing the model. I don't have different variables at the between level. Q1: Is two-level SEM in Mplus the right method to account for the course effect? Q2: When I tried the input below, an error showed up saying within-level variables cannot be used on the between level. What statement should go in the between level, then? Could you please help me? I'd greatly appreciate it. WITHIN = CON1-CON7 Pre_M1 Pre_M2 Post_M1 Post_M2 AS1 AS2 Tech_eff; CLUSTER IS COURSE; ANALYSIS: TYPE IS TWOLEVEL; MODEL: %WITHIN% Const BY CON1-CON7; Pre_Mot BY Pre_M1 Pre_M2; Post_Mot BY Post_M1 Post_M2; AS BY AS1 AS2; Post_Mot ON Pre_Mot Tech_eff AS Const; Grade ON Post_Mot; %BETWEEN% Post_Mot; Grade; Post_Mot WITH Grade;


If you don't put variables on either the WITHIN or BETWEEN list, they can be used at both levels. Please read Example 9.1. These issues are described. 


Hello, I am running a 2-level multilevel model, and I have a latent variable at the aggregate level that I would like to interact with an observed binary variable at the individual level (a cross-level interaction). I would be grateful if you could let me know whether this is possible in Mplus (i.e., is there a way of combining Type=Twolevel with the Type=Random option?). Thank you for your time, Jan


See Example 9.2, which illustrates a cross-level interaction.

ElineB posted on Friday, April 20, 2012  1:18 am



Hello, I am trying to test 3 random slopes in a two-level, multivariate regression model with 3 dependent (binary) variables. I included WITH statements to estimate the covariances between the Y's on both levels. However, I get the following errors: *** ERROR in MODEL command Covariances for categorical, censored, count or nominal variables with other observed variables are not defined. Problem with the statement: Y1 WITH Y2 *** ERROR The following MODEL statements are ignored: * Statements in the WITHIN level: Y1 with Y2; Y2 with Y3; Y1 with Y3; In short, this is the syntax I used: VARIABLE: NAMES = Sample Y1 Y2 Y3 X1 X2 X3 X4; CLUSTER = Sample; USEVAR = Y1 Y2 Y3 X1 X2 X3 X4; CATEGORICAL = Y1 Y2 Y3; MISSING = ALL (9999); WITHIN = X1 X2 X3 X4; ANALYSIS: ALGORITHM = INTEGRATION; INTEGRATION = 5; TYPE = TWOLEVEL RANDOM; MODEL: %WITHIN% Y1 on X1 X2 X3; Y2 on X1 X2 X3; Y3 on X1 X2 X3; S1 | Y1 on X4; S2 | Y2 on X4; S3 | Y3 on X4; Y1 with Y2; Y2 with Y3; Y1 with Y3; %BETWEEN% S1 with Y1; S2 with Y2; S3 with Y3; Y1 with Y2; Y2 with Y3; Y1 with Y3; OUTPUT: SAMPSTAT; I hope you can help me solve this problem. Thanks in advance!


With maximum likelihood and categorical outcomes, you cannot use WITH to specify residual covariances. Each residual covariance requires one dimension of integration. You can use BY to specify them as follows: f1 BY y1@1 y2; f1@1; [f1@0]; You will find the residual covariance in the factor loading for y2. 


Hello, I am handling a dataset with a hierarchical structure. In more detail: I have measures of different constructs at the day level (for example states) and often the same measures as traits at the person level. My sample consists of 73 participants, each assessed on three days. Furthermore, I would like to control for common confounds like age, sex, and so on. My problem is that I don't know how to include more than one level-2 predictor in my analysis. It would be great if you could help me out on this one. Thanks in advance, and a big compliment for keeping up with all these requests!


In Mplus this would be a single-level model. See Example 6.10, which is a growth model with both time-invariant and time-varying covariates.

Jinni Su posted on Tuesday, August 28, 2012  7:55 am



Hi, Dr. Muthen, I am running a multilevel regression model. In the output file, all S.E.s are 0 and all p-values are 1. Could you please provide some insight into how/why this happens? Here is part of the input: Analysis: Type = twolevel ; h1iteration = 5000; Model: %within% normuse on age gender black hispanic hmong asian drugdisa parinv pardisa peeruse famchaos ; %between% normuse on mage mwhite mdruguse mschcon mschprob mdrugdis ; Thanks, Jinni


Please send the output and your license number to support@statmodel.com. 


Hi all, I'm running a simple linear regression (regressing a continuous variable onto a dichotomous variable), but using Mplus to account for a complex survey design. How can I get Mplus to tell me the mean/SD for each level of the dichotomous predictor variable? Thanks! Brian Feinstein 


Use the binary covariate as a GROUPING variable and do a TYPE = COMPLEX BASIC with no MODEL command. 
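
A sketch of how that suggestion might be written, assuming hypothetical variable names (y for the outcome, grp for the 0/1 predictor, and strat, psu, wt for the survey design variables) and hypothetical group labels:

```
VARIABLE:
  NAMES = y grp strat psu wt;           ! hypothetical names
  USEVARIABLES = y;
  GROUPING = grp (0 = group0 1 = group1);
  STRATIFICATION = strat;
  CLUSTER = psu;
  WEIGHT = wt;
ANALYSIS:
  TYPE = COMPLEX BASIC;
```

No MODEL command is given; the sample statistics are then reported separately for each group. Only the design options that apply to the survey at hand need to be listed.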

SABA posted on Tuesday, December 15, 2015  6:55 am



Hi, I want to do a hierarchical regression model in Mplus. By this I do not mean the multilevel (hierarchical) model. I mean controlling for, or taking into account, the impact of a different set of independent variables on the dependent variable and model. However, I could not find in the user's guide how to specify a model for that. Can Mplus do that? If yes, could you please tell me how to specify the model? Thank you


If it is regression or mediation, you simply add the variables as further x's. 

Xiaoqiao posted on Wednesday, December 16, 2015  10:54 am



What about blocks? Is it possible to obtain indices of fit (e.g., R-square) for specific blocks of predictors and to evaluate the change in fit after adding a new block for statistical significance?


No, that's not available. 


The output of my stepwise regression model contains a correlation matrix as part of the sample statistics. The matrix does not come with p-values. How can I get this information?


Currently this is obtained using a Model statement that has WITH for all variable pairs; look at the STDYX standardization. 
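
For instance, with four hypothetical variables y, x1, x2, and x3, the suggestion above could be sketched as:

```
MODEL:
  y WITH x1 x2 x3;     ! every pair of variables appears once
  x1 WITH x2 x3;
  x2 WITH x3;
OUTPUT:
  STDYX;               ! standardized covariances, i.e. correlations
```

In the STDYX section of the output, each WITH parameter is then a correlation reported with its standard error and p-value.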


Dear Ma'am, I have just gone through an article "When to Use Hierarchical Linear Modeling - Huta (2014)". In that article: 1) the Within-level Slope is negative, and there are 100 of them (i.e., there are 100 Within-level Slopes); 2) the Between-level Slope is positive. In "Multilevel Structural Equation Modeling", there would be: 1) a Within-level Structural Model and 2) a Between-level Structural Model. In the "Between-level Structural Model", there is a "Single Regression Coefficient" (which is positive) between the "Latent Construct (DV)" and the "Latent Construct (IV)". But in the "Within-level Structural Model", there are "100 Regression Coefficients" (i.e., 100 Within-level Slopes for 100 individuals, which are negative) between the "Latent Construct (DV)" and the "Latent Construct (IV)". My question is: how are the "100 Regression Coefficients" at the Within level represented by a single value (i.e., as a Single Regression Coefficient), as is done in the "Between-level Structural Model"? Do we average all the "100 Within-level Regression Coefficients" and convert them into a single value?


The 100 regression coefficients are values on the Within-level random slope variable. This variable has a mean and a variance estimated on the Between level. If you are using Mplus, you might want to study our Short Course Topic 7 video and handout.
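
In Mplus syntax, the idea can be sketched as follows, using hypothetical observed variables y and x rather than the latent constructs from the question:

```
ANALYSIS:
  TYPE = TWOLEVEL RANDOM;
MODEL:
  %WITHIN%
  s | y ON x;    ! s holds one slope value per cluster
  %BETWEEN%
  [s];           ! mean of the slopes
  s;             ! variance of the slopes
```

The single within-level number that gets reported is the estimated mean of s, and the variance of s describes how much the individual slopes spread around that mean.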


Thank you, Ma'am. Where do I get that? Could you please provide me the weblink?? 


Our Short Courses are found at http://www.statmodel.com/course_materials.shtml 


Thank you, Ma'am. 


Hello, for an analysis I used 3-level data (students, classes, schools) and applied Example 9.20 to the data. The model can be summarized as follows: school variables (level 3) have an impact on individual achievement (level 1) mediated by classroom variables (level 2). Unfortunately, I am somewhat lost with regard to the interpretation of the slopes s1 and s12. For example, in my data I found that a level 3 variable is positively associated with slope s12 but has no significant association with slope s1. If I understand Example 9.20 correctly, slope s12 moderates the association between a variable at the second level and the intercept of the slope s1. Further, slope s1 moderates the association between a level 1 independent variable and the intercept of a dependent variable. Is this right? Can you please provide some guidance with regard to the interpretation of the slopes s1 and s12 and/or refer to a publication where such a model has been applied? Thanks a lot!


Send your output to Support along with your license number. 


I am regressing a between-level outcome on cluster means. It's a single-level regression. I want to know the degrees of freedom. How can I get this? The output provides the number of observations, but this includes all the level 1 cases. I need to know the number of cluster means included in the analysis in order to compute df. Any ideas?


I guess it would be "Number of clusters?" 


That's right. 
