Mplus Discussion >> Predicted Probabilities


Predicted Probabilities


Jonika Hash posted on Tuesday, October 10, 2017 - 1:09 pm

Hello! I am a PhD student looking for help with predicted probabilities.

I have a path model with a latent mediator and a final dichotomous outcome (u2). I would like to calculate predicted probabilities of u2 at different levels of two predictors (x3 & x4), holding all else constant. I am using ML with Monte Carlo integration and the logit link function. The model is:

i1 | y1@1 y2@1 y3@1;

u2 ON i1 x2 x3 x4 x5 x6;

i1 ON x1 x2 x3;

x1 x2 x5;

where x6 is an interaction term = x3*x4

I think my equations are:

i1 = a1 + b11*x1 + b12*x2 + b13*x3

u2 = a2 + B21*i1 + b22*x2 + b23*x3 + b24*x4 + b25*x5 + b26*x3*x4

where

a2 = u2 threshold estimate in Mplus output

a1 = i1 intercept estimate in output

b's = regression coefficients for corresponding x's

B21 = regression coefficient for u2 ON i1

Then to calculate logits at varying levels of x3 & x4, holding all else constant at 0:

logit = -a2 + B21*a1 + B21*b13*x3 + b23*x3 + b24*x4 + b26*x3*x4

And predicted probabilities:

P(u2 = 1) = 1/(1 + e^(-logit))

Could you let me know if this does not look correct? Thank you.

Bengt O. Muthen posted on Tuesday, October 10, 2017 - 1:58 pm

It is tempting to think of that as the answer, but what's not taken into account is the residual variance in i1, which has to be "integrated out". With a logistic link and a normal i1 residual, this calls for numerical integration, so it is not something done by hand. But with a probit link the expression simplifies to a standard univariate normal distribution function whose argument is the expectation that you have on the right-hand side of logit, divided by the square root of a residual variance expression. This is shown on page 310 of our book Regression and Mediation Analysis using Mplus. You can also figure it out by adding residuals to both of your 2 equations, where the probit residual variance is 1.
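A minimal sketch of the calculation described in this reply, written in Python rather than Mplus. It is one reading of the description, not a copy of the book's formula. All parameter values are made-up placeholders, not estimates from this thread, and `psi` is an assumed name for the i1 residual variance. The probit version divides the expectation by sqrt(B21^2 * psi + 1); the integration routine checks that shortcut numerically and also handles the logit link, where no closed form is available.

```python
import math

def norm_cdf(z):
    """Standard normal CDF, written with the error function."""
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))

def logistic(v):
    """Inverse logit."""
    return 1.0 / (1.0 + math.exp(-v))

# Placeholder values only (hypothetical, NOT estimates from this thread).
params = {
    "tau": 0.5,   # u2 threshold (a2 in the post)
    "a1": 0.2,    # i1 intercept
    "B21": 0.8,   # u2 ON i1
    "b13": 0.3,   # i1 ON x3
    "b23": -0.4,  # u2 ON x3
    "b24": 0.6,   # u2 ON x4
    "b26": 0.25,  # u2 ON x3*x4
    "psi": 1.5,   # i1 residual variance (symbol name is an assumption)
}

def expectation(x3, x4, p):
    """Right-hand side of the post's logit line, with x1 = x2 = x5 = 0."""
    return (-p["tau"] + p["B21"] * p["a1"] + p["B21"] * p["b13"] * x3
            + p["b23"] * x3 + p["b24"] * x4 + p["b26"] * x3 * x4)

def prob_probit(x3, x4, p):
    """Probit link: the i1 residual adds B21^2 * psi to the probit residual variance of 1."""
    scale = math.sqrt(p["B21"] ** 2 * p["psi"] + 1.0)
    return norm_cdf(expectation(x3, x4, p) / scale)

def prob_integrated(x3, x4, p, link, n=4001, width=8.0):
    """Average link(expectation + B21 * r) over r ~ N(0, psi) with the trapezoid rule."""
    sd = math.sqrt(p["psi"])
    eta = expectation(x3, x4, p)
    step = 2.0 * width * sd / (n - 1)
    total = 0.0
    for k in range(n):
        r = -width * sd + k * step
        density = math.exp(-0.5 * (r / sd) ** 2) / (sd * math.sqrt(2.0 * math.pi))
        weight = 0.5 if k in (0, n - 1) else 1.0
        total += weight * link(eta + p["B21"] * r) * density
    return total * step

p_probit = prob_probit(-1, 1, params)                         # closed form
p_probit_check = prob_integrated(-1, 1, params, norm_cdf)     # same quantity, integrated
p_logit_plugin = logistic(expectation(-1, 1, params))         # ignores the i1 residual
p_logit_integrated = prob_integrated(-1, 1, params, logistic) # integrates it out
print(p_probit, p_probit_check, p_logit_plugin, p_logit_integrated)
```

With these placeholder values the plug-in logit probability and the integrated one differ, which is the point of the reply: the hand formula ignores the spread that the i1 residual adds.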

Jonika Hash posted on Tuesday, October 10, 2017 - 3:21 pm

Hi Dr. Muthen,

Thank you so much for your help! I am so glad I asked. I am not as familiar with probit regression as I am with logit, so I am hoping I understand. Does this now look correct?:

Using the probit link function and taking residuals into account, I think my equations would now be:

i1 = a1 + b11*x1 + b12*x2 + b13*x3 + i1residual

u2 = a2 + B21*i1 + b22*x2 + b23*x3 + b24*x4 + b25*x5 + b26*x3*x4 + u2residual

where i1residual = the i1 residual variance estimate given in my Mplus output

u2residual = 1 (I think this is what you meant by the probit residual variance being 1?)

Then, I calculate my probabilities as shown in the Mplus User’s guide using the formula:

P(u = 1 | x) = F(a + b*x) = F(-t + b*x)

For example, I would calculate the probability of u2 being 1 when x3 = -1 and x4 = 1 using the formula:

P(u = 1 | x3 = -1, x4 = 1) = F(-a2 + B21*a1 + B21*b13*(-1) + B21*i1residual + b23*(-1) + b24*1 + b26*(-1)*1 + 1)

Would you help clarify if I am not understanding this? Thank you so much for your time!

Bengt O. Muthen posted on Tuesday, October 10, 2017 - 4:33 pm

No, that's not right. It's a bit much to explain so I refer to the book page I mentioned.

Geilson Lima Santana Junior posted on Tuesday, October 17, 2017 - 7:14 pm

Dear professors,

I'm trying to calculate predicted probabilities from the probit regression y ON x gender age

Y – dichotomous
X – continuous (number of events: 0-8)
Gender – dichotomous
Age – continuous

Output:

Y ON
X 0.179
SEX 0.197
AGE -0.009

Thresholds
Y$1 1.931

P(y=1|x) = F(-1.931 + 0.179*x + 0.197*sex - 0.009*age)

1. Sex and age are control variables. Should I put their sample means in the equation?

P(y=1|x=0) = F(-1.931 + 0.179*0 + 0.197*0.472 - 0.009*39.052) = F(-2.189) = 0.014
...
P(y=1|x=8) = F(-1.931 + 0.179*8 + 0.197*0.472 - 0.009*39.052) = F(-0.757) = 0.224

2. Is this correct?
3. Is there any way to do these calculations in Mplus, or do I have to do them by hand?
4. Can I plot these predicted probabilities directly in Mplus?

Thank you!!!

Bengt O. Muthen posted on Wednesday, October 18, 2017 - 2:37 pm

Looks ok, but I wouldn't use the mean of gender (what would that mean?); instead, consider one gender at a time.

You can let Mplus do this by using Model Constraint with Loop and Plot. For an example, see the first example of our Topic 11 short course video and handout at

http://www.statmodel.com/course_materials.shtml
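The hand calculation from the question can be reproduced with a short script. This is a sketch in Python, not Mplus output. The threshold, slopes, and mean age are the ones quoted in the post; the 0/1 coding of sex and the helper name `prob_y1` are assumptions. It first recomputes the poster's two values, then follows the suggestion in the reply and evaluates each sex separately at the mean age.

```python
import math

def norm_cdf(z):
    """Standard normal CDF (the F in the post's formula)."""
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))

# Estimates quoted in the post.
THRESHOLD = 1.931
B_X, B_SEX, B_AGE = 0.179, 0.197, -0.009
MEAN_AGE = 39.052  # sample mean of age used in the post

def prob_y1(x, sex, age=MEAN_AGE):
    """P(y = 1 | x, sex, age) under the probit model; hypothetical helper."""
    return norm_cdf(-THRESHOLD + B_X * x + B_SEX * sex + B_AGE * age)

# The post's own numbers, with sex set to its sample mean of 0.472.
p_x0_meansex = prob_y1(0, 0.472)
p_x8_meansex = prob_y1(8, 0.472)
print(round(p_x0_meansex, 3), round(p_x8_meansex, 3))

# One sex at a time (assumed coding 0/1), for x = 0..8 at the mean age.
table = {sex: [prob_y1(x, sex) for x in range(9)] for sex in (0, 1)}
for sex, probs in table.items():
    print(sex, [round(p, 3) for p in probs])
```

The first print gives 0.014 and 0.224, matching the values in the question.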

Joanna Davies posted on Monday, March 09, 2020 - 6:40 am

Hello,
I have produced a useful loop plot of an interaction effect:
DEFINE:
CENTER age(GRANDMEAN);
xz = wealth*age;

MODEL:
pod ON age
sex
wealth (beta2)
edqual somatic access social
xz (beta3);
[pod$1] (tau);

MODEL CONSTRAINT:
LOOP(age, -26, 19, 0.1);
PLOT(effect);
effect = beta2+beta3*age;

I am trying to do a similar plot but with probabilities on the y axis:
LOOP(age, -26, 19, 0.1);
PLOT(propens);
propens = PHI(-tau+beta2+beta3*age);

I don't think the propens plot is right: the curve is flat across the age range, which is not what I see in the effect plot. Do I have the formula for the propens plot wrong?

thank you

Joanna Davies posted on Monday, March 09, 2020 - 10:04 am

On a related issue to the post above (sorry for the multiple posts!), I have used Model Constraint to get the effects of wealth at different ages, to try to understand more about the age*wealth interaction:
NEW(agelo age0 agehi);
agelo = beta2+beta3*(-9.86);
age0 = beta2;
agehi = beta2+beta3*(9.86);

Results:
Threshold: -0.721
New effects:
agelo: -0.068
age0: -0.055
agehi: -0.042

and converted these to probabilities using:
P = F(0.721-0.068)
P = F(0.653)
P = 0.7422 (using a zscore table)

Is this correct?
I have flipped the threshold sign from negative to positive; is that right?

Regarding the interaction (the effect of wealth at different ages), I think the difference in terms of probabilities is small, which is why the propens plot (earlier post) looks so flat, even though the differences look larger when I plot the probits and the CI crosses 0 at the oldest ages. I think this shows the importance of using probabilities to interpret results?

thank you
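The arithmetic in these two posts can be checked numerically. The following is a rough Python sketch that applies the poster's formula exactly as written, PHI(-tau + beta2 + beta3*age), which amounts to wealth = 1 with every other term set to 0. The threshold and the three effects come from the post. beta3 is not printed there, so it is backed out from agehi = beta2 + beta3*9.86, and the sketch assumes both posts refer to the same model run.

```python
import math

def norm_cdf(z):
    """Standard normal CDF, the same function Mplus calls PHI."""
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))

TAU = -0.721    # threshold reported in the post
BETA2 = -0.055  # "age0" effect: the wealth slope at centered age 0
# Not printed in the post; derived from agehi = beta2 + beta3 * 9.86 = -0.042.
BETA3 = (-0.042 - BETA2) / 9.86

def propens(age):
    """The post's formula as written: wealth = 1, all other covariates 0."""
    return norm_cdf(-TAU + BETA2 + BETA3 * age)

p_agelo = propens(-9.86)
p_age0 = propens(0.0)
p_agehi = propens(9.86)

# Mimic LOOP(age, -26, 19, 0.1) by stepping over the same range.
ages = [-26 + 0.1 * k for k in range(451)]
curve = [propens(a) for a in ages]
spread = max(curve) - min(curve)

print(round(p_agelo, 4), round(p_age0, 4), round(p_agehi, 4), round(spread, 4))
```

Under these assumptions the exact CDF gives about 0.743 for F(0.653); the table value of 0.7422 corresponds to z rounded to 0.65. The probabilities change by less than 0.01 between agelo and agehi and by about 0.02 over the whole loop range, which is consistent with a plot that looks flat.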

Bengt O. Muthen posted on Monday, March 09, 2020 - 4:02 pm

Your first post says that you have several covariates. You evaluate the probability as if their means are zero - and if they are not, you are evaluating the probability at the wrong range of x-axis values. You can center the covariates and see if that changes things.

Joanna Davies posted on Tuesday, March 10, 2020 - 4:05 am

Thank you for the suggestion. I have tried it, but it makes little difference. The other covariates all have a meaningful zero, in that there are actual cases with zero edqual, zero somatic, etc.

Is it plausible to see the different results I describe in my first post?

Also, in my second post, am I using the correct formula to convert the probit into a probability?

Many thanks,

Franziska Meinck posted on Tuesday, June 16, 2020 - 9:59 am

Dear Drs Muthén,
I have a model in which two dichotomous outcomes are regressed (probit) on the same dichotomous exogenous variables. The two outcomes are correlated. I used PROPENSITY for the predicted probability of the outcomes for each combination of the three predictors.
But I also need to report the 95% CI for these probabilities.
1) Is it possible within Mplus to get the 95% CI for the predicted probabilities as yielded by PROPENSITY? I’d really be grateful for the code.
2) I seem to remember that in a particular circumstance with dichotomous outcomes the exogenous variables should be explicitly correlated. When should this still be done? Is this regression such an instance?
Many thanks.

USEVARIABLES ARE Eay Pay PosPar NoHun ;
CATEGORICAL ARE Eay Pay;
ANALYSIS:
ESTIMATOR=WLSMV;
MODEL:
Eay ON PosPar (b11)
NoHun (b12);
[Eay$1] (t1);
Pay ON PosPar (b21)
NoHun (b22);
[Pay$1] (t2);
Eay WITH Pay;

MODEL CONSTRAINT:
NEW (? ) ;

OUTPUT:
STDYX CINTERVAL ;
SAVEDATA:
FILE = prop_EayPay.dat ;
SAVE = PROPENSITY ;

Bengt O. Muthen posted on Tuesday, June 16, 2020 - 5:42 pm

You have to use Model Constraint to express the probabilities using parameter labels given in the Model command. Then you get SEs as well.
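To illustrate what an SE and a 95% CI for a probability that is a function of estimated parameters involves, here is a delta-method sketch in Python. It is an illustration only and is not a description of how Mplus computes its standard errors. The estimates and their covariance matrix are invented placeholders, and `prob_with_ci` is a hypothetical helper. In practice the probability would be written in Model Constraint as a NEW parameter using the labels from the Model command.

```python
import math
from itertools import product

def norm_cdf(z):
    """Standard normal CDF."""
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))

def norm_pdf(z):
    """Standard normal density, the derivative of the CDF."""
    return math.exp(-0.5 * z * z) / math.sqrt(2.0 * math.pi)

def prob_with_ci(est, cov, x, z_crit=1.96):
    """Return (p, lower, upper) for p = PHI(-threshold + sum(slope_k * x_k)).

    est = [threshold, slope_1, ..., slope_k]; cov = covariance matrix of est.
    """
    z = -est[0] + sum(b * v for b, v in zip(est[1:], x))
    p = norm_cdf(z)
    # Gradient of p with respect to (threshold, slopes).
    grad = [-norm_pdf(z)] + [norm_pdf(z) * v for v in x]
    m = len(grad)
    var = sum(grad[i] * cov[i][j] * grad[j] for i in range(m) for j in range(m))
    se = math.sqrt(var)
    return p, p - z_crit * se, p + z_crit * se

# Hypothetical values for one outcome (NOT estimates from this thread):
# threshold, slope for the first predictor, slope for the second predictor.
EST = [0.40, 0.30, -0.25]
COV = [[0.010, 0.002, 0.001],
       [0.002, 0.020, 0.000],
       [0.001, 0.000, 0.015]]

# Every combination of two dichotomous (0/1) predictors.
results = {x: prob_with_ci(EST, COV, x) for x in product((0, 1), repeat=2)}
for x, (p, lo, hi) in results.items():
    print(x, round(p, 3), round(lo, 3), round(hi, 3))
```

A symmetric interval like this can fall outside 0 and 1 when the probability is near either bound, so it is best treated as a rough guide in those cases.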