Resolving endogeneity using mixture m...
Mplus Discussion > Latent Variable Mixture Modeling
 Michel van der Borgh posted on Tuesday, September 17, 2013 - 5:20 am

I want to estimate a model but have a potential bias due to endogeneity. The problem is that I do not have reliable Instrumental Variables. I know that researchers have developed “frugal” IV methods to address this problem. These methods resolve the endogeneity problem without using observable instruments (e.g., Ebbes et al. 2005). The key idea of this approach is to introduce a binary unobserved IV that partitions endogenous predictors into two components, one uncorrelated and the other correlated with the error term in the main equation.

A simple model would look like:
Y(i) = b0 + b1*m1 + e(i),
M(i) = t*z(i) + v(i),

with i = 1, ..., n, and t an (m x 1) vector of category means. z(i) is an unobserved discrete instrument with more than one category. It is assumed that z is independent of the error terms (e, v).
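
As a hedged illustration of why this decomposition matters, the Python sketch below simulates the two equations with a binary latent instrument and correlated errors (e, v). It is not the Ebbes et al. estimator, which treats z as unobserved and estimates the model through a mixture likelihood; here the true z is used in an "oracle" group-means comparison, which is only possible in a simulation. The function names, the parameter values (b0, b1, the category means, the error correlation rho) and the normal error distribution are all assumptions made for the illustration, and x denotes the endogenous predictor.

```python
import numpy as np


def simulate_liv(n=20000, b0=1.0, b1=0.5, means=(0.0, 3.0), rho=0.6, seed=0):
    """Simulate y = b0 + b1*x + e and x = t*z + v with corr(e, v) = rho.

    All default values are hypothetical and chosen only for illustration.
    """
    rng = np.random.default_rng(seed)
    z = rng.integers(0, 2, size=n)  # latent binary instrument (0 or 1)
    cov = [[1.0, rho], [rho, 1.0]]  # e and v share variance, so x is endogenous
    e, v = rng.multivariate_normal([0.0, 0.0], cov, size=n).T
    x = np.asarray(means, dtype=float)[z] + v  # category mean plus error v
    y = b0 + b1 * x + e
    return y, x, z


def ols_slope(y, x):
    """Ordinary least squares slope of y on x (ignores endogeneity)."""
    xc = x - x.mean()
    return float(xc @ (y - y.mean()) / (xc @ xc))


def oracle_iv_slope(y, x, z):
    """Ratio of group-mean differences using the TRUE z (simulation only)."""
    dy = y[z == 1].mean() - y[z == 0].mean()
    dx = x[z == 1].mean() - x[z == 0].mean()
    return float(dy / dx)


if __name__ == "__main__":
    y, x, z = simulate_liv()
    print("OLS slope:      ", ols_slope(y, x))
    print("Oracle IV slope:", oracle_iv_slope(y, x, z))
```

With these assumed values, the OLS slope is pulled away from b1 because x shares the error component v with e, while the ratio based on the true z stays close to b1, since z is independent of both error terms.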

My question is how to implement this (if possible) in Mplus?

Ebbes, Peter, Michel Wedel, Ulf Böckenholt, and Ton Steerneman (2005), “Solving and Testing for Regressor-Error (in)Dependence When No Instrumental Variables Are Available: With New Evidence for the Effect of Education on Income,” Quantitative Marketing and Economics, 3 (4), 365–92.

 Bengt O. Muthen posted on Tuesday, September 17, 2013 - 1:14 pm
Just to clarify, what is M(i) and what is m1?
 Michel van der Borgh posted on Tuesday, September 17, 2013 - 2:36 pm
Dear Bengt,

Sorry for the confusion.

They are the same and indicate the beta-coefficient. The m stands for mediator in the original model. Both terms should be x(i).

Hope this clarifies things.

 Bengt O. Muthen posted on Friday, September 20, 2013 - 5:47 pm
I think you are writing out eqn (1) in the Ebbes et al article. Eqn (1) says that z_i is a latent variable and pi is observed, which makes x_i a latent variable. So in Mplus you would say:


x BY;
z BY x@pi;
y ON x;

Perhaps there are several z variables, I don't know. If this model is identified, Mplus can estimate it.
 Lucy Busija posted on Thursday, November 28, 2013 - 9:37 pm
I am trying to estimate a model that is similar to Michel's above, with random effects for the instrumental variable.

I have a dichotomous outcome (0,1) and a continuous predictor and I am trying to find a break point in the predictor that best separates 0s from 1s on the outcome variable.

The model is specified as follows:
x BY;
z BY x@predictor;
outcome ON x;

My questions are:
Is this the correct specification to address my question?
Is the threshold value for outcome$1 in the output my break point?
How do I introduce random effects of time of day on z into the equation?

Thank you in anticipation.
 Bengt O. Muthen posted on Friday, November 29, 2013 - 9:25 am
I don't understand this model - how a break point is explored.
 Lucy Busija posted on Friday, November 29, 2013 - 1:02 pm
I would like to apply the instrumental latent variable approach (Ebbes et al. 2005) to find a 'threshold' point on a continuous predictor, above which an outcome of 1 is more probable than an outcome of 0. The threshold is unknown in advance and needs to be estimated. Essentially, the 'threshold' behaves like an instrumental variable (z_i in Ebbes et al. 2005, eqn (1)) in the sense that it 'decomposes x into a systematic part that is uncorrelated with error term and one that is possibly correlated with error term'.

Is there a way to estimate this type of model in Mplus?

Also, the dataset that I work with contains repeated observations for each person, and I would like to use this information to derive random effects of a person on the 'threshold'. Again, is there a way to do this in Mplus? (The data are in long format.)
 Bengt O. Muthen posted on Friday, November 29, 2013 - 5:23 pm
I am not familiar with that approach so I can't say if it can be done in Mplus. There is not an automatic IV estimator in Mplus.

Random effects can be handled in Mplus in both single-level and two-level models.
 Lucy Busija posted on Friday, November 29, 2013 - 11:57 pm
Thank you for clarifying, Dr Muthen. There is an alternative model that I would like to explore: dose-response threshold (Hunt, DL, Rai, SN. A new threshold dose-response model including random effects for data from developmental toxicity studies. J Appl Toxicol. 2005;25:435–439).
The model summarises the relationship between a continuous predictor and an outcome (0,1). It assumes the existence of a threshold: no association between outcome and predictor below the threshold and a logistic association above the threshold. Mathematically, the model takes on the following form:
logit(P_ij) = beta_0 + sigma_ij,                        for d_i < tau
logit(P_ij) = beta_0 + beta_1(d_i - tau) + sigma_ij,    for d_i >= tau

where P_ij is the probability of outcome at jth time for ith individual (i = 1, . . . g; j = 1, . . . , m_i);
tau is the threshold level of the predictor (unknown in advance);
d_i is the observed level of the predictor;
beta_0 is the 'background' response;
beta_1 is the slope above threshold;
sigma_ij is the random effect.
According to Hunt and Rai (2005), the “model corresponds to the random effects logistic model … when tau = 0”.

So my question is: can this type of model be programmed to run in Mplus and if so, would it be possible to model tau as a random parameter?
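
The two-case form above can be written as a single "hinge" expression, beta_0 + beta_1 * max(d - tau, 0) + sigma. The Python sketch below is only a minimal illustration of that linear predictor and the implied probability; it is not Mplus syntax and not the estimation procedure of Hunt and Rai. The function names and the default of a zero random effect are assumptions for the example.

```python
import math


def threshold_logit(d, tau, beta0, beta1, sigma=0.0):
    """Linear predictor of the threshold model.

    Below tau the predictor has no effect (only beta0 + sigma);
    at or above tau the logit rises linearly in (d - tau).
    sigma stands in for the random effect and defaults to 0 (an assumption).
    """
    excess = max(d - tau, 0.0)  # zero below the threshold
    return beta0 + beta1 * excess + sigma


def response_probability(d, tau, beta0, beta1, sigma=0.0):
    """Inverse-logit of the linear predictor, giving P(outcome = 1)."""
    eta = threshold_logit(d, tau, beta0, beta1, sigma)
    return 1.0 / (1.0 + math.exp(-eta))
```

For a fixed candidate tau this is an ordinary random-intercept logistic regression on the derived variable max(d - tau, 0), which is why tau itself is the difficult parameter: it enters the model nonlinearly.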
 Lucy Busija posted on Sunday, December 01, 2013 - 6:16 pm
Further to my post above (and to simplify my question):
In essence, I have three sets of equations to estimate simultaneously.

1: Outcome=beta0_1+beta1_1(log_predictor)+beta2_1(log_predictor**2);

2: tau=exp(1-beta1_1/(2*beta2_1));

3: Outcome=beta0_2+beta1_2(predictor-tau);

Is it possible to implement this in Mplus in a single step?

Thank you for your help.
 Y.A. posted on Wednesday, May 09, 2018 - 9:18 pm
Dear Prof. Muthen,

I am having similar trouble with instrumental variables. The reviewer of my manuscript asked me to add instrumental variables, which I am not able to find. The difference between my situation and the ones described by colleagues above is that my predictor and outcome are latent classes. I have one 3-class predictor and one 3-class outcome (this is the class solution I have now, without the instrumental variables).

Could you please help me incorporate the latent classes into eq. (1) of Ebbes et al. (2005)?

yi = b0 + b1*xi + ei
xi = pi'zi + vi

Also, why does the code example you posted on September 20, 2013 - 5:47 pm have no indicator?

x by;

Thank you very much!

Best regards,

 Bengt O. Muthen posted on Thursday, May 10, 2018 - 3:20 pm
I am not familiar with the Ebbes approach - try SEMNET instead.

x BY; is a trick to get a latent variable X that is then used for some purpose.