
Jessica Li posted on Thursday, July 03, 2014  11:06 am



How do I get/calculate the predicted value of an outcome in a multilevel ordinal logistic regression? Even outside Mplus. Where are the intercepts?

MODEL RESULTS
                                                    Two-Tailed
                    Estimate       S.E.  Est./S.E.    P-Value

Within Level

 O          ON
    C1                 0.011      0.014      0.816      0.414
    C2                 0.063      0.099      0.635      0.525
    C3                 0.051      0.117      0.436      0.663
    C4                 0.113      0.045      2.498      0.012
    C5                 0.019      0.007      2.568      0.010
    C6                 0.009      0.013      0.673      0.501
    C7                 0.034      0.052      0.657      0.511
    C8                 0.124      0.091      1.364      0.173
    R1                 0.382      0.067      5.693      0.000
    R2                 0.652      0.121      5.405      0.000
    R3                 0.236      0.065      3.604      0.000
    U1                 0.194      0.074      2.615      0.009
    U2                 0.195      0.056      3.477      0.001
    U3                 0.139      0.100      1.397      0.162
    U4                 0.271      0.094      2.883      0.004

Between Level

 Thresholds
    O$1                4.734      0.673      7.034      0.000
    O$2                2.945      0.668      4.406      0.000

Jessica Li posted on Thursday, July 03, 2014  11:07 am



======== Here is my input =============

INPUT INSTRUCTIONS

TITLE: Org model
DATA: FILE ="\Desktop\g07023.csv";
VARIABLE:
  NAMES are sid c1 c2 c3 c4 c5 c6 c7 c8 r1 r2 r3 u1 u2 u3 u4 u o;
  categorical are o;
  Missing are all (9999);
  USEVARIABLES are sid c1 c2 c3 c4 c5 c6 c7 c8 r1 r2 r3 u1 u2 u3 u4 o;
  WITHIN are c1 c2 c3 c4 c5 c6 c7 c8 r1 r2 r3 u1 u2 u3 u4;
  CLUSTER are sid;
ANALYSIS:
  TYPE are TWOLEVEL;
  Estimator are ml;
MODEL:
  %WITHIN%
  o ON c1 c2 c3 c4 c5 c6 c7 c8 r1 r2 r3 u1 u2 u3 u4;
  %BETWEEN%
  o;

Jessica Li posted on Thursday, July 03, 2014  11:09 am



I should clarify. I was trying to get the predicted value (either categorical or continuous is fine) of the outcome for every case/observation. Thanks.


If you have a binary DV, the intercept is the negative of the threshold. You have an ordinal DV with 3 categories, so there are two thresholds. You can compute the predicted probability for different random intercept values as shown on slide 66 of our Topic 7 handout, where slides 60-66 deal with understanding two-level logistic regression. See the handout and video on our website. We ask that you limit postings to one window.
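
A minimal sketch of that calculation in Python, assuming the cumulative logit form P(O <= k) = logistic(threshold_k - eta), where eta is the within-level linear predictor plus the random intercept. The threshold values, the fixed part, and the between-level SD below are hypothetical placeholders, not numbers from the output above.

```python
import math

def logistic(z):
    """Standard logistic CDF."""
    return 1.0 / (1.0 + math.exp(-z))

def ordinal_logit_probs(thresholds, eta):
    """Category probabilities under a cumulative logit model.

    thresholds: increasing list of thresholds (two for a 3-category outcome)
    eta: linear predictor, i.e. sum of slope * covariate plus the random intercept
    """
    # Cumulative probabilities P(O <= k); the last category closes at 1.
    cum = [logistic(t - eta) for t in thresholds] + [1.0]
    probs, prev = [], 0.0
    for c in cum:
        probs.append(c - prev)
        prev = c
    return probs

# Hypothetical illustration values (assumptions, not estimates from the thread).
thresholds = [-1.0, 1.0]   # two thresholds -> three categories
fixed_part = 0.4           # sum of slope * covariate value for one person
sd_between = 0.9           # SD of the random intercept

for label, u in [("-1 SD", -sd_between), ("mean", 0.0), ("+1 SD", sd_between)]:
    probs = ordinal_logit_probs(thresholds, fixed_part + u)
    print(label, [round(p, 3) for p in probs])
```

With a single threshold the same function reduces to the binary case, where the intercept is the negative of the threshold.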

ljc posted on Monday, September 29, 2014  7:07 am



Slide 66 of topic 7 only has the patterns or cluster sizes. Am I looking in the wrong place? 


Slide 66 refers to the Larsen-Merlo article; this is a good one to study. The slide looks like:

Understanding The Between-Level Intercept Variance

• Intraclass correlation
  – ICC = 0.807/(π²/3 + 0.807) = 0.20
• Odds ratios
  – Larsen & Merlo (2005). Appropriate assessment of neighborhood effects on individual health: Integrating random and fixed effects in multilevel logistic regression. American Journal of Epidemiology, 161, 81-88.
  – Larsen proposes MOR: "Consider two persons with the same covariates, chosen randomly from two different clusters. The MOR is the median odds ratio between the person of higher propensity and the person of lower propensity."
    MOR = exp( √(2σ²) × Φ⁻¹(0.75) )
    In the current example, ICC = 0.20, MOR = 2.36
• Probabilities
  – Compare β0j = -1 SD and β0j = +1 SD from the mean: For males at the aggression mean the probability varies from 0.14 to 0.50
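
The two summary numbers on the slide can be checked with a short Python sketch. It uses only the between-level intercept variance 0.807 quoted on the slide; the function names are illustrative, and the logistic residual variance π²/3 is the standard latent-response convention the ICC formula relies on.

```python
import math
from statistics import NormalDist

def icc_logistic(var_between):
    """ICC for a two-level logistic model: between variance over total,
    with the within-level latent residual variance fixed at pi^2 / 3."""
    return var_between / (math.pi ** 2 / 3 + var_between)

def median_odds_ratio(var_between):
    """Larsen & Merlo's MOR = exp( sqrt(2 * sigma^2) * Phi^{-1}(0.75) )."""
    return math.exp(math.sqrt(2 * var_between) * NormalDist().inv_cdf(0.75))

var_between = 0.807  # between-level intercept variance from the slide

print(round(icc_logistic(var_between), 2))
print(round(median_odds_ratio(var_between), 2))
```

After rounding to two decimals these should match the slide's ICC = 0.20 and MOR = 2.36.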

ljc posted on Monday, September 29, 2014  11:38 am



Sorry, I hate to be dense, but I don't understand the &# notation. I think your last sentence has the answer I am looking for, which is the formula for the predicted value for each cluster. I think I am supposed to add (or subtract) the standard deviation to something, but I am not sure what. Just as a note, I only have a random intercept in my particular example.


The text got garbled when copying from the PPT pdf; check the handout instead.

ljc posted on Monday, September 29, 2014  2:54 pm



I found it. It is slide 58 using the version that is on the Mplus homepage. But it still doesn't help me with probabilities for specific clusters. It just helps me get a range. I can get cluster-specific predictions easily with SAS, but SAS will delete cases with missing x and Mplus won't. I hope you consider adding predicted values to your save command in the future. Thanks.


Respected Prof. Muthen, I have a similar request: with the ML or MLR estimator, how do I get 1) predicted values (yhat) and 2) residuals (resid)? For example, I have a latent variable (LV) with three indicators. This LV is a dependent variable Y in the model. I want the residual of this LV after it is predicted by another independent variable X, which is also latent with three indicators, i.e. Y by y1 y2 y3; X by x1 x2 x3; Y on X; I found an interesting way to get this if X and Y were not latent variables: I can get it from the scatter plot > save plot data. However, for latent variables this option is not there. Please help. (In Stata we get this using the predict command for regressions without latent variables.)


Answer to ljc: Perhaps what you are asking for is answered by getting factor scores for the cluster effects, that is, the random intercepts. You get this by Save=FSCORES. Then you plug that into the formula. Regarding the slide number, I am looking at the 3/29/11 Topic 7 handout at our usual site http://www.statmodel.com/course_materials.shtml 


Answer to S.Arunachalam: You can get estimates (posterior means) for Y and X using savedata: file is 1.dat; save=FSCORES; The residuals in the Y on X regression can be computed manually. Just use the estimated coefficient beta from that regression and the estimates for Y and X to get the "Y - beta*X" residual.
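
The manual residual step can be sketched in Python as follows, assuming the factor scores for Y and X have already been read from the saved file into two lists of equal length. The score values and the coefficient below are hypothetical, and the layout of the saved file is not assumed here.

```python
def latent_residuals(y_scores, x_scores, beta):
    """Residuals Y - beta*X computed from estimated factor scores.

    y_scores, x_scores: posterior-mean factor scores, one pair per case
    beta: estimated coefficient from the Y ON X regression
    """
    if len(y_scores) != len(x_scores):
        raise ValueError("y_scores and x_scores must have the same length")
    return [y - beta * x for y, x in zip(y_scores, x_scores)]

# Hypothetical factor scores and coefficient (placeholders, not real output).
y_scores = [0.5, -0.2, 1.1]
x_scores = [1.0, 0.0, 2.0]
beta = 0.4

print(latent_residuals(y_scores, x_scores, beta))
```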


I am trying to calculate predicted probabilities for a multilevel mediation model with a 4-category ordinal DV. The mediation pathway is not significant, so I am only interested in the level-2 direct effect of x on y. Using MODEL CONSTRAINT and the mean value of x yields implausible results. The probabilities do all add to 1, but the distribution is extremely unlikely. Below is a pared-down version of my model. Am I using the correct equation given the model?

CATEGORICAL = y ;
WITHIN = [level-1 IVs] ;
BETWEEN = x ;
CLUSTER = country ;
DEFINE: x = log(x) ;
ANALYSIS:
  TYPE = TWOLEVEL ;
  ESTIMATOR = BAYES ;
MODEL:
  %WITHIN%
  y ON [individual-level variables] ;
  m ON [individual-level variables] ;
  %BETWEEN%
  m ON x (a) ;
  y ON m (b) ;
  y ON x (coef) ;
  [y$1] (tau1) ;
  [y$2] (tau2) ;
  [y$3] (tau3) ;
MODEL CONSTRAINT:
  NEW(indb p1 p2 p3 p4) ;
  indb = a*b ;
  p1 = phi(tau1 - 2.125*coef) ;
  p2 = phi(tau2 - 2.125*coef) - phi(tau1 - 2.125*coef) ;
  p3 = phi(tau3 - 2.125*coef) - phi(tau2 - 2.125*coef) ;
  p4 = phi(-tau3 + 2.125*coef) ;


Your problem might be solved simply by centering all covariates. The main problem is that when you say "predicted probabilities" you have to clarify whether these are conditional or unconditional probabilities. The probit regression gives you P(Y|X). To get the unconditional probabilities P(Y) you have to do something like this:

p1 = phi((tau1 - Mean(X)*Beta)/sqrt(1+Beta*Var(X)*Beta^T+VBY)) ;
p2 = phi((tau2 - Mean(X)*Beta)/sqrt(1+Beta*Var(X)*Beta^T+VBY)) - phi((tau1 - Mean(X)*Beta)/sqrt(1+Beta*Var(X)*Beta^T+VBY)) ;
...

where VBY is the between variance of Y and Beta*Var(X)*Beta^T is the total variance for the beta*X predictor. I notice that you skipped the within-level model completely (but you shouldn't in general). If you skip it, that means you condition on all within-level X being zero, which might be inappropriate or irrelevant. Even the above approach assumes a normal distribution for X; it may be best to average P(Y|X) over all X in your data set. We offer that now for single-level models via the individual predicted values (see Web Note #20), but not for two-level models yet.
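
A rough Python sketch of the unconditional formula above, under the same assumption that X is normally distributed. Beta and Mean(X) are passed as lists and Var(X) as a nested list so that Beta*Var(X)*Beta^T is computed as a quadratic form. Apart from 2.125, the mean of x quoted in the question, every number below is a hypothetical placeholder.

```python
import math
from statistics import NormalDist

def unconditional_probit_probs(thresholds, beta, mean_x, var_x, vby):
    """Unconditional category probabilities P(Y) for an ordinal probit model.

    thresholds: increasing list of thresholds (tau1, tau2, ...)
    beta:   list of regression coefficients
    mean_x: list of predictor means
    var_x:  covariance matrix of the predictors (list of lists)
    vby:    between-level variance of Y
    """
    phi = NormalDist().cdf
    k = len(beta)
    mean_part = sum(beta[i] * mean_x[i] for i in range(k))
    # Quadratic form Beta * Var(X) * Beta^T
    quad = sum(beta[i] * var_x[i][j] * beta[j] for i in range(k) for j in range(k))
    scale = math.sqrt(1.0 + quad + vby)
    cum = [phi((t - mean_part) / scale) for t in thresholds] + [1.0]
    probs, prev = [], 0.0
    for c in cum:
        probs.append(c - prev)
        prev = c
    return probs

# Hypothetical inputs; only the mean 2.125 comes from the question above.
thresholds = [-1.0, 0.0, 1.2]
beta = [0.5]
mean_x = [2.125]
var_x = [[0.8]]
vby = 0.3

probs = unconditional_probit_probs(thresholds, beta, mean_x, var_x, vby)
print([round(p, 3) for p in probs])
```

The last category is obtained as 1 minus the final cumulative probability, so the probabilities sum to 1 by construction.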


Thank you for the quick reply. 1) Grand-mean centering mostly worked, and betas were similar in both models, a good sign. I did not grand-mean center the mediator, as doing so switched the sign of the level-2 coefficient. There is no meaningful zero value for the mediator (it's on a 1-5 scale), so it's difficult to tell whether the raw or centered scores are correct. Would grand-mean centering a mediator cause problems with the partitioning of within and between variance, or do you think the raw scores are the more likely problem? 2) Just to verify some figures in the denominator: VBY is the residual variance of y and Beta^T is the squared posterior s.d.? (All level-1 variables are dummy-coded such that the combined omitted categories represent the benchmark person. I omitted it because I didn't want to clutter the page.)


1) You would have to use the above formula that I gave. 2) Beta^T is the regression coefficient transposed. 


I now understand the formula; I forgot that SEM must be thought of in terms of matrices. While I understand this theoretically, I'm still at a loss as to which specific numbers to use from the output. Beta*Beta^T suggests Beta is squared to make it conformable with the rest of the equation, but I'm unsure if Var(X) is the residual variance or some other number. Based on page 46 of the Topic 2 handout (https://www.statmodel.com/course_materials.shtml), it does seem to be the residual variance of Beta. Do I understand this correctly?


We don't actually compute Var(X) in the case when X is a covariate. You can compute this separately; it is the variance-covariance matrix of all predictors, not the residual variance. However, I would still recommend that you figure out how to compute P(Y|X) for one observation. That has no Var(X). It would use the observed values for M and X and the estimated random intercept factor score:

p1 = phi(tau1 - b*m - X*Beta - YB) ;
p2 = phi(tau2 - b*m - X*Beta - YB) - phi(tau1 - b*m - X*Beta - YB) ;
...

where YB is the posterior mean of the random intercept.
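
For a single observation, the conditional calculation in this reply can be sketched in Python as below. It assumes the probit form P(Y <= k | X) = phi(tau_k - b*m - X*Beta - YB) with a scalar x; all input values are hypothetical placeholders rather than estimates from the model discussed here.

```python
from statistics import NormalDist

def conditional_probit_probs(thresholds, b, m, beta, x, yb):
    """Conditional category probabilities P(Y | X) for one observation.

    thresholds: increasing list of thresholds (tau1, tau2, tau3 for 4 categories)
    b, m:    coefficient of the mediator and its observed value
    beta, x: coefficient of x and its observed value
    yb:      posterior mean (factor score) of the random intercept
    """
    phi = NormalDist().cdf
    eta = b * m + beta * x + yb
    cum = [phi(t - eta) for t in thresholds] + [1.0]
    probs, prev = [], 0.0
    for c in cum:
        probs.append(c - prev)
        prev = c
    return probs

# Hypothetical values for one observation in one cluster.
probs = conditional_probit_probs(
    thresholds=[-0.5, 0.4, 1.3], b=0.2, m=3.0, beta=0.1, x=2.0, yb=-0.3
)
print([round(p, 3) for p in probs])
```

Averaging these per-observation probabilities over all cases in the data set would give the sample-based alternative to the normality assumption mentioned earlier in the thread.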
