Threshold and probability
Mplus Discussion > Categorical Data Modeling >
 John Lee posted on Monday, June 22, 2009 - 6:50 am
Hi,

The Mplus User's Guide V5 (p. 407) states that P(u=1|x)=1/(1+exp(-a-b*x)).

I have just fitted a simple model on a binary variable (0: 100 times vs 1: 200 times):
TITLE: binary variable
DATA: FILE IS binary1.dat;
VARIABLE: NAMES x w; ! w: frequency
FREQWEIGHT IS w;
CATEGORICAL ARE x;
USEVARIABLES ARE x;
MODEL:
[x$1];
OUTPUT:

The following is part of the output:
SUMMARY OF ANALYSIS

SUMMARY OF CATEGORICAL DATA PROPORTIONS
X
Category 1 0.333
Category 2 0.667

MODEL RESULTS
                                                Two-Tailed
              Estimate    S.E.    Est./S.E.     P-Value

Thresholds
  X$1           -0.431    0.075     -5.754       0.000


The estimated probability should clearly be .667 (200/300). When I do the calculation based on the estimated threshold, I get P(u=1|x)=1/(1+exp(0.431))=.394. Even if I use 1-.394=.606, it is still different from the expected value of .667.

Did I miss something? Thanks.
 Bengt O. Muthen posted on Monday, June 22, 2009 - 1:55 pm
You should use the formula

P(u=1|x)=1/(1+exp(-0.431))=.606

because the threshold is -a. But then you also have to take into account the freqweight values.
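
The sign convention above can be checked numerically. The following is a minimal Python sketch, assuming a logit link and a model with only a threshold (no covariates); the function name logit_prob_from_threshold is hypothetical and is not part of Mplus.

```python
import math

def logit_prob_from_threshold(tau: float) -> float:
    """P(u=1) under a logit link for a threshold-only model.

    The threshold is the negative of the intercept, so a = -tau.
    """
    a = -tau
    return 1.0 / (1.0 + math.exp(-a))

# Threshold estimate reported in the output above.
p_logit = logit_prob_from_threshold(-0.431)
print(round(p_logit, 3))  # 0.606
```

This reproduces the .606 figure, which only applies if the analysis actually used a logit link.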
 Linda K. Muthen posted on Monday, June 22, 2009 - 2:58 pm
Also, I think you are using the formula for logistic regression rather than probit regression.
 John Lee posted on Tuesday, June 23, 2009 - 1:16 am
Dear Bengt and Linda,

Thanks for the replies. Yes, I am using the formula for logistic regression as I thought that the default link function is logistic rather than probit.

In my data file, the data are:
x w
0 100
1 200

where w is the frequency weight. I have also tried a version without the frequency weight. That is, I created 100 "0" and 200 "1".

The estimated probability based on the formula is P(u=1|x)=1/(1+exp(-0.431))=.606.

My concern is that the "correct" estimate of the probability should clearly be 200/300=.667.
 Linda K. Muthen posted on Tuesday, June 23, 2009 - 12:05 pm
For maximum likelihood, the default link is logit. However, the default estimator for categorical outcomes is weighted least squares and probit regression. What you posted does not show which estimator you used. Please send your full output and license number to support@statmodel.com.
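
The difference between the two links can be illustrated with the numbers from this thread. The following Python sketch assumes the run used the default weighted least squares estimator with a probit link; the thread does not confirm which estimator was used, so this is a plausibility check rather than a diagnosis. Under probit, P(u=1) is the standard normal CDF evaluated at minus the threshold. The helper name std_normal_cdf is hypothetical.

```python
import math

def std_normal_cdf(z: float) -> float:
    """Standard normal CDF, computed from the error function."""
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))

tau = -0.431  # threshold estimate X$1 from the posted output

# Probit link (assumed here): P(u=1) = Phi(-tau)
p_probit = std_normal_cdf(-tau)

# Logit link: P(u=1) = 1 / (1 + exp(tau))
p_logit_link = 1.0 / (1.0 + math.exp(tau))

# Observed proportion, using the frequency weights 100 and 200.
observed = 200 / (100 + 200)

# Threshold a logit model would need in order to reproduce 200/300.
tau_logit_needed = -math.log(observed / (1.0 - observed))

print(round(p_probit, 3))          # 0.667
print(round(p_logit_link, 3))      # 0.606
print(round(tau_logit_needed, 3))  # -0.693
```

Under the probit assumption, the threshold of -0.431 matches the observed proportion of .667. A logit model fitted to the same data would instead report a threshold near -0.693.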