It is an interesting question. But the factor f is only there to capture the residual covariance between the binary u and continuous y DVs, so the mediator is y (which is only partly observed). So I think you would have
I am using your Heckman MC simulation as a reference to correct for sample selection in a mediation model. Selection occurs in the mediator (continuous) and in the outcome variable (binary). My questions are: 1. What is (g) accounting for in the correlation? Could you suggest a reference?
- Heckman MC model:
    y on x*1 (g);
    [y*0];
    y*1 (v);
    f by y@1 u*-1 (cov);
    f@1;
    u on x*-1;
    model constraint:
    new (corr*-.5);
    corr = cov/(sqrt(g*g+v)*sqrt(cov*cov+1));
- My model has 3 dummy variables as Xs (rhed4_2-rhed4_4); gpa_m is my continuous mediator (here the response variable).
    Model:
    select on rhed4_2 rhed4_3 rhed4_4 shh_4g male;
    gpa_m on rhed4_2 rhed4_3 rhed4_4;
    gpa_m (v);
    f by select@1 gpa_m (cov);
    f@1;
    Model constraint:
    new (corr);
    corr = cov/(sqrt(?+v)*sqrt(cov*cov+1));
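As a quick numerical check of the constraint in the Heckman MC model above, the formula can be evaluated outside Mplus. The sketch below is Python, not Mplus syntax. The function name `heckman_corr` is hypothetical, and the inputs are simply the starting values from the MC setup (cov = -1, g = 1, v = 1). It evaluates the expression exactly as written and does not settle what (g) represents.

```python
import math


def heckman_corr(cov, g, v):
    """Evaluate corr = cov/(sqrt(g*g+v)*sqrt(cov*cov+1)) as written in the constraint."""
    return cov / (math.sqrt(g * g + v) * math.sqrt(cov * cov + 1))


# Starting values taken from the MC input: u*-1 (cov), x*1 (g), y*1 (v).
corr = heckman_corr(cov=-1.0, g=1.0, v=1.0)
print(round(corr, 3))  # -0.5
```

With these values the expression gives -0.5, which agrees with the `corr*-.5` value supplied in the `new` statement.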
If it is not too much trouble, could you have a look at my question 2: is just adding my binary outcome to the factor enough to correct for selection in my outcome model? In that case, would that mean having 2 lambdas to estimate the correlation?
Dear Prof. Muthen, Apologies for continuing with the discussion on this topic.
I was able to correct for sample selection for a binary outcome. Using the parametrization that divides the outcome model by sqrt(2) and the selection model by sqrt(lam^2+1), my estimates are similar to Stata's heckprobit.
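The rescaling described above can be sketched as follows. This is Python, not Mplus syntax, and it rests on an assumption not spelled out in the thread: that each probit equation's residual is the sum of a factor term (loading times f, with f@1) and a unit-variance probit residual, so that the total residual variance is loading^2 + 1. With the loading fixed at 1 this gives the sqrt(2) divisor. The function name and all numeric inputs are hypothetical.

```python
import math


def to_unit_probit_scale(coef, loading):
    """Divide a coefficient by the assumed total residual SD, sqrt(loading^2 + 1)."""
    return coef / math.sqrt(loading ** 2 + 1.0)


# Outcome equation: loading fixed at 1, so the divisor is sqrt(2).
outcome_coef = to_unit_probit_scale(1.2, loading=1.0)

# Selection equation: free loading lam, so the divisor is sqrt(lam^2 + 1).
lam = -0.7  # hypothetical estimate
selection_coef = to_unit_probit_scale(0.8, loading=lam)

print(outcome_coef, selection_coef)
```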
However, when the outcome is continuous, using a regular probit for the selection means that the residual correlation cannot vary over the full range [-1,1]. Skrondal & Rabe-Hesketh (2004, 107-108) suggest that, for a response in a linear regression, the variances of the selection and outcome models should be estimated and constrained to be equal. They propose a rescaled probit for the selection, where the underlying variance is not constrained to 1. Stata's gsem can fit a rescaled probit using a censored regression.
In Mplus, I tried the Theta parametrization to estimate the variance of the probit selection model, but my estimates and correlation are still too different from Stata's heckman. A censored selection model does not allow me to constrain its variance to be equal to that of a linear regression. Since my full model is more complex, I need Mplus's computational speed, which is superior to Stata's.
I was hoping you might have a suggestion for how to estimate a scaled probit to correct for sample selection in a linear model.
-> Is there any plan to implement the 2-step estimator in a future version of Mplus?
-> Yes, I would like to compute it. Could you help me with the code, please? I understand I can use the PHI command for the PDF (which goes in the numerator), but for the CDF I am not sure how to get it in Mplus.
-> But computing it using Model Constraint does not estimate the effect of the IMR (inverse Mills ratio) on the continuous part. So should I then save it using SAVEDATA and use it in a next step?
p.s.: The book is a goldmine. My sincere gratitude to you and the entire team! Wishlist: another one on multilevel for the same topics covered in this book. Sorry to be greedy, it's just that Mplus is so good, so addictive.
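For reference, the inverse Mills ratio asked about above is the standard normal density divided by the standard normal distribution function, both evaluated at the selection equation's linear predictor. Below is a minimal sketch in Python (not Mplus syntax) using only the standard library. The function names and the example value of z are hypothetical.

```python
import math


def norm_pdf(z):
    """Standard normal density (the numerator of the ratio)."""
    return math.exp(-0.5 * z * z) / math.sqrt(2.0 * math.pi)


def norm_cdf(z):
    """Standard normal distribution function, via the complementary error function."""
    return 0.5 * math.erfc(-z / math.sqrt(2.0))


def inverse_mills_ratio(z):
    """IMR = pdf(z) / cdf(z), where z is the first-stage probit linear predictor."""
    return norm_pdf(z) / norm_cdf(z)


# Hypothetical linear predictor for one observation.
imr = inverse_mills_ratio(0.0)
print(imr)
```

In a two-step approach, this quantity would be computed for each selected observation from the first-stage probit and then entered as an additional covariate in the outcome regression. For extremely negative z the denominator can underflow to zero in floating point, so a production version would need a guard for that case.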
I don't know about the multinomial sample-selection model done via separate binaries. Mplus can do Heckman and can do switching regressions (both described in our new RMA book). See also articles posted under Papers - both under the heading
Thank you, Prof. Muthen. I have your book; I'll have a look at it and at the papers too.
My first thought was that, given that no multinomial probit is implemented in Mplus, it might be approximated by a latent variable on the residuals of multiple binary probits using the WLSMV estimator, as I read in one of your comments in the multinomial logistic regression discussion.
Then another latent variable could be used to correlate these binary probits with the response model (y), as you did to correct for sample selection a la Heckman.
Leaving aside the sample-selection model, I'm still wondering whether a multinomial probit can be estimated using a system of binary probits in Mplus, by adding a factor to the binary DVs as I did in my previous post.
I am analyzing data from an intervention where participants were not randomly assigned and have varying levels of treatment dosage, including dropping out early. The outcome can be modeled as binary or multinomial. I would like to use a Heckman model to take selection bias into account. I am wondering whether I could do fancier things, like using a latent growth model as a moderator of the intervention effect, and also using a latent growth model or a growth mixture model as a mediator of the effect of treatment dose on the outcome.
Muthén, B., Asparouhov, T., Hunter, A. & Leuchter, A. (2011). Growth modeling with non-ignorable dropout: Alternative analyses of the STAR*D antidepressant trial. Psychological Methods, 16, 17-33.
As for Heckman, you need good predictors of the assignment probability to handle the selectivity.