Setting up the Heckman Model with Med...
 Scott R. Colwell posted on Tuesday, October 14, 2014 - 9:13 am
I am trying to set up a Heckman model whereby:

1) x predicts a binary u variable
2) x predicts a continuous y variable
3) y is only observed when u = 1
4) y mediates the relationship between x and z (z being continuous)

For 1-3 I have the following based on your MC simulation:

u on x; ! the selection model
y on x; ! the outcome model
f by u@1 y; f@1;

To make this a mediation model, I am thinking that the latent variable f represents y when u = 1. In that case, if y (given u = 1) mediates the relationship between x and z, I think the model would be:

u y on x;
f by u@1 y; f@1;
z on F;

Is this correct?
 Bengt O. Muthen posted on Tuesday, October 14, 2014 - 5:37 pm
It is an interesting question. But the factor f is only there to capture the residual covariance between the binary u and continuous y DVs, so the mediator is y (which is only partly observed). So I think you would have

z on y;

Try it out in a simulation.
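
A minimal sketch of such a simulation, written in Python rather than Mplus so it can be run stand-alone. It generates data from the structure discussed above (u on x, y on x, z on y, with a shared factor f standing in for `f by u@1 y; f@1;`) and compares the slope of y on x in the full sample with the slope among cases where u = 1. All coefficient values (0.5, 1.0, 0.7, and the loading `lam`) and the function names are hypothetical choices for illustration, not values from the Mplus Monte Carlo setup.

```python
import random

def simulate(n=20000, lam=1.0, seed=1):
    """Draw (x, u, y, z) from a selection model with a shared residual factor f."""
    rng = random.Random(seed)
    rows = []
    for _ in range(n):
        x = rng.gauss(0, 1)
        f = rng.gauss(0, 1)                       # factor with variance 1 (f@1)
        u_star = 0.5 * x + f + rng.gauss(0, 1)    # selection: u on x, loading of u fixed at 1
        u = 1 if u_star > 0 else 0
        y = 1.0 * x + lam * f + rng.gauss(0, 1)   # outcome: y on x, loading lam on f
        z = 0.7 * y + rng.gauss(0, 1)             # z on y (y is the mediator)
        rows.append((x, u, y, z))
    return rows

def slope(pairs):
    """Ordinary least squares slope of the second element on the first."""
    n = len(pairs)
    mean_a = sum(a for a, _ in pairs) / n
    mean_b = sum(b for _, b in pairs) / n
    sab = sum((a - mean_a) * (b - mean_b) for a, b in pairs)
    saa = sum((a - mean_a) ** 2 for a, _ in pairs)
    return sab / saa

rows = simulate()
full = slope([(x, y) for x, u, y, z in rows])                 # uses every y
selected = slope([(x, y) for x, u, y, z in rows if u == 1])   # y seen only when u = 1
print("slope of y on x, full sample:    ", round(full, 3))
print("slope of y on x, selected sample:", round(selected, 3))
```

In this setup the selected-sample slope falls below the full-sample slope, because the residual of y is correlated with the residual of the selection equation through f. This is the bias that the Heckman specification is meant to remove.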
 Alejandro Sevilla posted on Wednesday, October 14, 2015 - 8:33 am
I am using your Heckman MC simulation as reference to correct for sample selection on a mediation model.
Selection occurs in the mediator (continuous) and outcome variable (binary).
My questions are
1. What is (g) accounting for in the correlation? Could you suggest a reference?

- Heckman MC
y on x*1 (g);
y*1 (v);
f by y@1 u*-1 (cov);
u on x*-1;
model constraint:
new (corr*-.5);
corr = cov/(sqrt(g*g+v)*sqrt(cov*cov+1));

- My model has 3 dummy variables as Xs (rhed4_2-rhed4_4); gpa_m is my continuous mediator (here the response variable)
select on rhed4_2 rhed4_3 rhed4_4 shh_4g male;
gpa_m on rhed4_2 rhed4_3 rhed4_4;
f by select@1 gpa_m (cov);

Model constraint:
new (corr);
corr = cov/(sqrt(?+v)*sqrt(cov*cov+1));

Thank you.
 Alejandro Sevilla posted on Wednesday, October 14, 2015 - 8:39 am
2. How should I estimate the correlation of the residuals in the mediation model?
(gpa_m=continuous, tr_pse=binary)

select on rhed4_2 rhed4_3 rhed4_4 shh_4g male;
gpa_m on rhed4_2 rhed4_3 rhed4_4;
tr_pse on gpa_m rhed4_2 rhed4_3 rhed4_4;
f by select@1 gpa_m (cov1)
tr_pse (cov2);

Model Indirect:
tr_pse IND rhed4_2;
tr_pse IND rhed4_3;
tr_pse IND rhed4_4;

Model constraint:
new (corr);
corr = ?/(sqrt(?+?)*sqrt(?*?+1));

Thank you.
 Bengt O. Muthen posted on Wednesday, October 14, 2015 - 3:01 pm
See the corrected Heckman modeling FAQ on our web site.
 Alejandro Sevilla posted on Wednesday, October 14, 2015 - 3:33 pm
Thank you Prof. Muthen, it is very helpful.

If it is not too much trouble, could you have a look at my question 2: is just adding my binary outcome to the factor enough to correct for selection in my outcome model? In that case, would that mean having 2 lambdas to estimate the correlation?

 Bengt O. Muthen posted on Wednesday, October 14, 2015 - 6:01 pm
You can use the factor as you have it in order to capture selection, except have only 1 label per line, so change to

f by select@1
gpa_m (cov1)
tr_pse (cov2);

Note that the Model Constraint correlation computation is not necessary for the estimation of the model.

Just be sure that both gpa_m and tr_pse have missingness when the individual is not selected. Mplus can do that automatically using DATA TWOPART.
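
Although the correlation is not needed for estimation, it can be computed afterwards from the estimates. The following Python sketch derives the residual correlation between the latent response behind select and gpa_m that is implied by `f by select@1 gpa_m (cov1)`. It rests on assumptions that are not stated in the thread: the factor variance is `psi` (set to 1 by default here), the probit residual variance of select is 1, and `v` is the residual variance of gpa_m. The function name and the numeric inputs are hypothetical, and this is a derivation under those assumptions, not the formula from the Mplus FAQ.

```python
import math

def residual_corr(lam, v, psi=1.0):
    """Residual correlation implied by a shared factor.

    select* residual = f + e_s,        Var = psi + 1   (probit residual variance 1)
    gpa_m residual   = lam * f + e_y,  Var = lam^2 * psi + v
    covariance       = lam * psi
    """
    cov = lam * psi
    sd_select = math.sqrt(psi + 1.0)
    sd_outcome = math.sqrt(lam * lam * psi + v)
    return cov / (sd_select * sd_outcome)

# Hypothetical estimates: loading 1.0, residual variance 1.0, factor variance 1.0
print(residual_corr(lam=1.0, v=1.0))
```

Under these assumptions, with the factor variance fixed at 1 and the select loading fixed at 1, the magnitude of this correlation stays below 1/sqrt(2) however large the loading becomes.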
 Alejandro Sevilla posted on Thursday, October 22, 2015 - 9:08 am
Dear Prof. Muthen,
Apologies for continuing with the discussion on this topic.

I was able to correct for sample selection for a binary outcome. Using the parametrization that divides the outcome model by sqrt(2) and the selection model by sqrt(lam^2+1), my estimates are similar to Stata's heckprobit.

However, when the outcome is continuous, using a regular probit for the selection means that the residual correlation cannot vary over the full range [-1,1]. Skrondal & Rabe-Hesketh (2004, pp. 107-108) suggest that, for a response in a linear regression, the variances of the selection and outcome models should be estimated and restricted to be equal.
They propose a rescaled probit for the selection, where the underlying variance is not constrained to 1. Stata's gsem can fit a rescaled probit using a censored regression.

In Mplus, I tried the Theta parametrization to estimate the variance of the probit selection model, but my estimates and correlation are still too different from Stata's heckman. A censored selection model does not allow me to restrict its variance to be equal to that of a linear regression. Since my full model is more complex, I need the computational speed of Mplus, which is superior to Stata's.

I was hoping you might have a suggestion for how to estimate a scaled probit to correct for sample selection (SS) in a linear model.
 Bengt O. Muthen posted on Thursday, October 22, 2015 - 6:59 pm
I can send you the Mplus version of the Skrondal-Rabe-Hesketh Stata Heckman run using their wage data. It matches their Stata results without difficulties or variance restrictions.
 Alejandro Sevilla posted on Friday, October 23, 2015 - 1:47 am
Thank you. It is much appreciated.
 S.Arunachalam posted on Saturday, July 23, 2016 - 9:37 am
Respected Prof. Muthen. Question from the Mplus book on Heckman modeling - "Table 7.5, p. 293" and "Table 7.6, p. 294"

In the output of these two examples, I am trying to locate the estimates for the inverse Mills ratio (IMR), which is normally obtained using Stata's Heckman two-step estimator.

Is it not estimated? If so could you guide me as to how to get the estimates for inverse Mills ratio please?
When should I estimate it or avoid estimating it?

My apologies if I am missing something obvious.
 Bengt O. Muthen posted on Saturday, July 23, 2016 - 12:32 pm
You need the Mills ratio for Heckman's 2-step estimator but not for the ML estimator that Mplus uses.

If you still want to compute it for the final estimates and a certain value of the x covariates you can do so using Model Constraint.
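
As a numeric illustration of the quantity involved, here is a small Python sketch of the inverse Mills ratio, phi(a)/Phi(a), evaluated at a selection index a = gamma0 + gamma1*x. The coefficient names `gamma0` and `gamma1`, their values, and the chosen x are hypothetical; in practice they would be the final selection-equation estimates and a covariate value of interest. This is not Mplus Model Constraint syntax.

```python
import math

def normal_pdf(a):
    """Standard normal density phi(a)."""
    return math.exp(-0.5 * a * a) / math.sqrt(2.0 * math.pi)

def normal_cdf(a):
    """Standard normal distribution function Phi(a), via the error function."""
    return 0.5 * (1.0 + math.erf(a / math.sqrt(2.0)))

def inverse_mills(a):
    """Inverse Mills ratio phi(a) / Phi(a) for a selection index a."""
    return normal_pdf(a) / normal_cdf(a)

# Hypothetical selection-equation estimates and covariate value
gamma0, gamma1 = -0.2, 0.8
x = 1.0
imr = inverse_mills(gamma0 + gamma1 * x)
print(imr)
```

The ratio shrinks as the selection index grows, that is, as selection becomes more likely. For strongly negative indices the distribution function in the denominator underflows to zero in floating point, so a production version would need a numerically safer formula.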
 S.Arunachalam posted on Saturday, July 23, 2016 - 7:44 pm
Thank you Prof. Muthen.

-> Is there any plan to implement the 2-step estimator in a future version of Mplus?

-> Yes, I would like to compute it. Could you help me with the code please? I understand I can use the PHI command for the PDF (which goes in the numerator), but for the CDF I am not sure how to get it in Mplus.

-> But computing it using Model Constraint does not estimate the effect of the IMR (inverse Mills ratio) on the continuous part. So should I then save it using SAVEDATA and use it in a next step?

P.S. The book is a goldmine. My sincere gratitude to you and the entire team! Wishlist: another one on multilevel modeling for the same topics covered in this book. Sorry to be greedy; it is just that Mplus is so good it is addictive.
 Alejandro Sevilla posted on Friday, July 28, 2017 - 2:34 pm
I was wondering if a multinomial sample-selection model
can be estimated by separate (binary) probit models (generating a
multinomial Probit model), such as

estimator = WLSMV;
y ON x; !model, continuous DV
y (v);
sel_1 ON x (s1_1) !selection to group 1
z (s1_2) ;
[sel_1$1] (thre_1);
sel_2 ON x (s2_1) !selection to group 2
z (s2_2);
[sel_2$1] (thre_2);

f BY sel_1 sel_2 !generates the multinomial probit

lat BY f@1
y (lam);

NEW (rho selint_1 selint_2 sels1_1 sels1_2 sels2_1 sels2_2);
rho = (lam)/(sqrt(lam*lam+v)*sqrt(sel_1*sel_1+1)*sqrt(sel_2*sel_2+1));
selint_1 = -thre_1/sqrt(sel_1*sel_1+1);
selint_2 = -thre_2/sqrt(sel_2*sel_2+1);
sels1_1 = s1_1/sqrt(sel_1*sel_1+1);
sels1_2 = s1_2/sqrt(sel_1*sel_1+1);
sels2_1 = s2_1/sqrt(sel_2*sel_2+1);
sels2_2 = s2_2/sqrt(sel_2*sel_2+1);

Is this model correctly specified?
Or in turn can a multinomial logistic model be used for sample selection?
 Alejandro Sevilla posted on Friday, July 28, 2017 - 2:44 pm
I mistakenly omitted the labels in the parameters of factor 'f', it should be

f BY sel_1 (lga1)
sel_2 (lga2); !generates the multinomial probit model

Therefore, rho and the proposed reparametrization would be

rho = (lam)/(sqrt(lam*lam+v)*sqrt(lga1*lga1+1)*sqrt(lga2*lga2+1));
selint_1 = -thre_1/sqrt(lga1*lga1+1);
selint_2 = -thre_2/sqrt(lga2*lga2+1);
sels1_1 = s1_1/sqrt(lga1*lga1+1);
sels1_2 = s1_2/sqrt(lga1*lga1+1);
sels2_1 = s2_1/sqrt(lga2*lga2+1);
sels2_2 = s2_2/sqrt(lga2*lga2+1);
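
The rescaling in these expressions can be sketched numerically in Python. The idea, under the assumption that f has variance 1 and each probit residual has variance 1, is that the latent response behind a selection indicator has residual variance loading^2 + 1, so each slope and intercept is divided by sqrt(loading^2 + 1). The helper `rescale` and all numeric values below are hypothetical; the dictionary keys simply reuse the labels from the post. The sketch does not address whether the rho expression itself is correct.

```python
import math

def rescale(coefs, loading):
    """Divide each coefficient by sqrt(loading^2 + 1).

    Assumes the shared factor has variance 1 and the probit residual has
    variance 1, so the total residual variance is loading^2 + 1.
    """
    scale = math.sqrt(loading * loading + 1.0)
    return {name: value / scale for name, value in coefs.items()}

# Hypothetical estimates for the first selection equation
thre_1 = -0.4
raw = {"sels1_1": 0.6, "sels1_2": -0.3, "selint_1": -thre_1}
rescaled = rescale(raw, loading=0.75)   # sqrt(0.75^2 + 1) = 1.25
print(rescaled)
```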
 Bengt O. Muthen posted on Friday, July 28, 2017 - 5:37 pm
I don't know about the multinomial sample-selection model done via separate binaries. Mplus can do Heckman and can do switching regressions (both described in our new RMA book). See also articles posted under Papers - both under the heading

Paired-Comparison and Ranking Data SEM

and under the heading Miscellaneous.
 Alejandro Sevilla posted on Friday, July 28, 2017 - 8:40 pm
Thank you Prof. Muthen. I have your book; I'll have a look at it and the papers too.

My first thought was that, given that no multinomial probit is implemented in Mplus, it might be approximated by a latent variable on the residuals of multiple binary probits using the WLSMV estimator, as I read in one of your comments in the multinomial logistic regression discussion.

Then, another latent variable could be used to correlate these binary probits with the response model (y), as you did to correct for sample selection a la Heckman.

Does that make sense?
 Bengt O. Muthen posted on Sunday, July 30, 2017 - 5:05 pm
Try it and compare the results to software where this is implemented.
 Alejandro Sevilla posted on Monday, July 31, 2017 - 3:40 pm
I will do so, thank you.

Leaving aside the sample-selection model, I'm still wondering if a multinomial probit can be estimated using a system of binary probits in Mplus, by adding a factor to the binary DVs as I did in my previous post

f BY sel_1 (lga1)
sel_2 (lga2);
 Bengt O. Muthen posted on Monday, July 31, 2017 - 5:40 pm
Sounds like a good topic to research.
 Jennie Jester posted on Sunday, January 27, 2019 - 1:03 pm
I am analyzing data from an intervention where participants were not randomly assigned and have varying levels of dosage of treatment, including dropping out early. The outcome can be modeled as binary or multinomial.
I would like to use a Heckman model to take selection bias into account. I am wondering if I could do fancier things, like using a latent growth model as a moderator of the intervention effect? And also using a latent growth model or a growth mixture model as a mediator of the effect of dose of treatment on outcome?

 Bengt O. Muthen posted on Monday, January 28, 2019 - 1:16 pm
Selective dropout can be modeled as in

Muthén, B., Asparouhov, T., Hunter, A. & Leuchter, A. (2011). Growth modeling with non-ignorable dropout: Alternative analyses of the STAR*D antidepressant trial. Psychological Methods, 16, 17-33.

As for Heckman, you need good predictors of the assignment probability to handle the selectivity.