

Calculating probabilities from probit... 



Hello,

My SEM model includes:

f1 by y1-y4
f2 by y5-y8
f3 by y9-y11
f4 by u1-u6 (3 ordered categories each)
f5 by f1-f4
1 outcome (ordered categorical)
1 covariate (binary)

I use public data and apply panel weights. My questions concern calculating probabilities from the probit regression:

1. Do I need to account for the weights? (I read a post on accounting for frequency weights; I am not sure whether this applies to the type of weights in my study, or in what way.)

2. When the outcome is regressed on both f4 and f5, how should I calculate probabilities on the outcome at 1 SD above/below the means? Should I change only one of x1 or x2 and keep the other at its mean, or is there another, more sophisticated way?

3. I am interested in group differences on the outcome. How should I explore these? One way would be to run two separate analyses, one per group. This would result in estimated probabilities that are group-specific (meaning an individual is compared to her group-specific mean). Right? Another way would be to run a MIMIC model. Then I would have the following estimates:

b1 for f5 on covariate
b2 for outcome on f5
3 thresholds (t1-t3)

Given that the outcome is not directly regressed on the covariate, so b1 and b2 are not both direct effects on the outcome, how would I calculate the probabilities?

Thanks!
Hadar


1. No, the weights are already accounted for in the estimates.

2. We need to see your full model to answer this. Send input, data, output, and license number to support.

3. You would want to do a multiple-group analysis where you first confirm measurement invariance.
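
For readers following the thread, here is a minimal sketch of how category probabilities are generally obtained from an ordered probit once a value of the linear predictor has been chosen. It is not Mplus output and not the poster's model: the threshold values, the linear predictor values, and the function names are hypothetical. The `resid_sd` argument is also an assumption; it defaults to 1 and would only need changing if the residual of the underlying response were not scaled to unit variance.

```python
import math


def normal_cdf(z):
    """Standard normal cumulative distribution function."""
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))


def ordered_probit_probs(linear_predictor, thresholds, resid_sd=1.0):
    """Category probabilities for an ordered probit.

    P(y <= k) = F((t_k - linear_predictor) / resid_sd), and each category
    probability is the difference of adjacent cumulative probabilities.
    Thresholds must be in increasing order.
    """
    cumulative = [normal_cdf((t - linear_predictor) / resid_sd) for t in thresholds]
    cumulative = [0.0] + cumulative + [1.0]
    return [cumulative[i + 1] - cumulative[i] for i in range(len(cumulative) - 1)]


# Hypothetical thresholds t1-t3 (illustrative values, not estimates).
thresholds = [-1.0, 0.0, 1.0]

# Evaluate at a few hypothetical values of the linear predictor,
# e.g. the predictor 1 SD below its mean, at its mean, 1 SD above.
for eta in (-1.0, 0.0, 1.0):
    probs = ordered_probit_probs(eta, thresholds)
    print(eta, [round(p, 3) for p in probs])
```

Three thresholds give four category probabilities that sum to one for each value of the linear predictor.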


Hello! I am interested in studying how x affects the y_2 probabilities directly and indirectly in a probit model with a latent mediator variable in the causal path. For example, assume

y_1 = g_1*x + e_1,
y*_2 = b*y_1 + g_2*x + g_3*lat + e_2,

where y_1 represents socioeconomic status, x is maternal education, lat is a latent variable (lipids), and y*_2 denotes the underlying latent response variable in the probit model (b and the g's are regression coefficients). It follows that

y*_2 = b*g_1*x + g_2*x + g_3*lat + b*e_1 + e_2.

Then

P(y_2 = 1 | x) = P(y*_2 > t_2 | x) = F(-t_2 + b*g_1*x + g_2*x + g_3*lat),

where t_2 is the threshold for y_2 and F is the standard normal distribution function found in tables. I know that I have to fix x and lat to get these probabilities, but it is not clear to me at what values I can fix lat, since it is a latent variable. Any help on calculating/interpreting predicted probabilities, or on using some other approach to interpretation, would be greatly appreciated. Thanks.


Your expression should say P(y_2 = 1 | x, lat) if you want to plug in values for lat. If lat has mean zero and variance 1, you can, for example, plug in -1, 0, 1.
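
As an illustration of that suggestion, here is a minimal sketch that evaluates P(y_2 = 1 | x, lat) = F(-t_2 + b*g_1*x + g_2*x + g_3*lat) at lat = -1, 0, 1, assuming lat is standardized. All coefficient values and the chosen x are hypothetical, picked only to make the example run; the function names are not from any package. The `resid_sd` argument is an added assumption: it defaults to 1 so the code matches the expression as posted, and would need to be set to the standard deviation of b*e_1 + e_2 if that composite residual does not have unit variance.

```python
import math


def normal_cdf(z):
    """Standard normal cumulative distribution function F."""
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))


def prob_y2_equals_1(x, lat, t2, b, g1, g2, g3, resid_sd=1.0):
    """P(y_2 = 1 | x, lat) = F((-t2 + b*g1*x + g2*x + g3*lat) / resid_sd)."""
    z = (-t2 + b * g1 * x + g2 * x + g3 * lat) / resid_sd
    return normal_cdf(z)


# Hypothetical estimates (illustrative values only).
params = dict(t2=0.5, b=0.3, g1=0.4, g2=0.2, g3=0.6)

# Fix x at one value and vary lat over -1, 0, 1 (lat assumed to have
# mean zero and variance 1, so these are 1 SD below, at, and above its mean).
x_value = 1.0
probs = {lat: prob_y2_equals_1(x_value, lat, **params) for lat in (-1.0, 0.0, 1.0)}

for lat, p in probs.items():
    print(f"lat = {lat:+.0f}: P(y_2 = 1 | x, lat) = {p:.3f}")
```

Repeating the loop for a second value of x and comparing the two sets of probabilities shows how the effect of x on the probability scale depends on where lat is held.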


