Jon Heron posted on Thursday, August 03, 2017 - 7:04 am
We've made good progress recently in deriving Tyler's 4-way decomposition with binary X, M and Y.
Have spent the last few days seeing how many different combinations of nominal/categorical variables we could get to work.
X binary (assumed cts) with M and Y categorical is the most obvious, but declaring all three as nominal is potentially useful if one is interested in measurement error (or 3-step LCA modelling). It also provides an interesting link between transition analysis and mediation. Not to mention the ability to extend any/all variables beyond the standard binary setting.
Three combinations currently prove elusive, such as where M and Y are nominal but X is not. Not sure if the limiting factor here is mathematics, software or mental ingenuity.
I always wonder about a nominal Y - how should the expectation be considered? You don't want to give scores for the different nominal categories. Do you define an effect for each category?
Good if you can write this up for us all to see.
Jon Heron posted on Friday, August 04, 2017 - 12:03 am
Well my thinking was that for nominal X and/or nominal Y you are effectively fitting multiple binary models at once.
You have me thinking now though.
I wonder if we are actually just using the nominal set up as a trick rather than truly estimating a nominal regression model.
Something else that has me pondering is nominal M. Specifically, how we consider controlled direct effects in this situation. A clinician would be justified in having an interest in the impact of moving one specific M class into a better state rather than moving everyone at once as there may be different interventions to target different impairments.
Jon Heron posted on Friday, August 04, 2017 - 2:18 am
The light-bulb went on (or the coffee took effect) and I twigged the subtlety of what you were saying.
We probably don't want to derive effects for Y_c2, Y_c3, Y_c4 relative to some reference outcome category Y_c1, we want to examine Y_c2 in reference to its complement and then again with the other outcome levels.
I suspect both are useful, but only one may permit a causal interpretation which is ultimately what we seek.
Jon Heron posted on Wednesday, August 23, 2017 - 4:54 am
Well we continue to make progress as well as uncovering further aspects we do not understand.
On a positive note it was fairly trivial to reparameterize the nominal-Y model to produce "proper" odds-ratios and these we found to agree with those from a succession of binary-logistic models, so there is a simpler solution unless Y is both nominal AND latent.
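As a rough illustration of the "proper" odds ratios mentioned above, here is a minimal Python sketch, assuming a nominal Y regressed on a single binary X with the first category as reference and no covariates. The function names and coefficient values are hypothetical and are not taken from any Mplus output. In this saturated setting each category-versus-complement odds ratio should match the one from a binary logistic regression of that category's indicator on X.

```python
import math

def category_probs(intercepts, slopes, x):
    """Multinomial-logit probabilities for a nominal Y given binary X.

    The first category is the reference (its logit is fixed at 0);
    intercepts and slopes hold the coefficients of the other categories.
    """
    logits = [0.0] + [a + b * x for a, b in zip(intercepts, slopes)]
    denom = sum(math.exp(v) for v in logits)
    return [math.exp(v) / denom for v in logits]

def complement_odds_ratios(intercepts, slopes):
    """Odds ratio (X=1 vs X=0) of each category against all other categories."""
    p0 = category_probs(intercepts, slopes, 0)
    p1 = category_probs(intercepts, slopes, 1)
    return [(q1 / (1 - q1)) / (q0 / (1 - q0)) for q0, q1 in zip(p0, p1)]

# Hypothetical coefficients for a 3-category outcome (illustration only).
intercepts = [0.2, -0.5]
slopes = [0.7, -0.3]
ors = complement_odds_ratios(intercepts, slopes)
for k, value in enumerate(ors, start=1):
    print(f"category {k} vs rest: OR = {value:.3f}")
```

With only two categories the complement odds ratio collapses to the usual exp(slope), which is a quick sanity check on the reparameterization.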
However we are now grappling with what appears to be little more than conventions.
For binary M/binary Y, Tyler's 4-way decomp (and indeed his 2-way, which is used to estimate the PNIE/TNDE produced by Stata's -paramed- routine) relies on the rare outcome assumption and makes some approximations, such that the total effect yielded by summing across the 4 (or 2) decomposed components does not equal that from a simple Y on X model.
In contrast, in Mplus, PNIE+TNDE does indeed perfectly reproduce the Y on X total effect.
I note that Bengt refers to Pearl 2011, where I see that it states (p3) that probit/logistic assumptions are not needed, merely arithmetic. This raises the question: why not do it Bengt's/Pearl's way?
The inability to replicate results across packages due to this is certainly proving a hindrance.
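To make the "merely arithmetic" point concrete, here is a minimal Python sketch of the counterfactual bookkeeping for binary M and binary Y on the probability scale. It assumes logistic models for M and Y with an X*M interaction and no covariates; the parameter names (g0, g1, b0 to b3) and values are hypothetical, and this is not the code used by Mplus or -paramed-. The point is only that effects built from E[Y(x, M(x'))] add up to the total effect exactly, without a rare-outcome approximation.

```python
import math

def expit(z):
    return 1.0 / (1.0 + math.exp(-z))

def mean_y(x, x_m, g0, g1, b0, b1, b2, b3):
    """E[Y(x, M(x_m))]: sum over m of P(Y=1 | x, m) * P(M=m | x_m)."""
    p_m1 = expit(g0 + g1 * x_m)                 # P(M=1 | X=x_m)
    p_y_m0 = expit(b0 + b1 * x)                 # P(Y=1 | X=x, M=0)
    p_y_m1 = expit(b0 + b1 * x + b2 + b3 * x)   # P(Y=1 | X=x, M=1)
    return p_y_m0 * (1 - p_m1) + p_y_m1 * p_m1

def decompose(**par):
    y00 = mean_y(0, 0, **par)
    y01 = mean_y(0, 1, **par)   # exposure 0, mediator as under exposure 1
    y10 = mean_y(1, 0, **par)   # exposure 1, mediator as under exposure 0
    y11 = mean_y(1, 1, **par)
    return {
        "TE": y11 - y00,
        "TNDE": y11 - y01,
        "PNIE": y01 - y00,
        "PNDE": y10 - y00,
        "TNIE": y11 - y10,
    }

# Hypothetical parameter values, for illustration only.
effects = decompose(g0=-0.4, g1=0.9, b0=-1.0, b1=0.5, b2=0.8, b3=0.3)
for name, value in effects.items():
    print(f"{name}: {value:.4f}")
```

Both TNDE + PNIE and PNDE + TNIE telescope to the total effect by construction, whatever link function is used for the two models.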
Jon Heron posted on Wednesday, August 23, 2017 - 9:47 am
I hope you don't mind if I add further thoughts, as I've been re-reading Tyler's paper for his 2-way decomp:
VanderWeele TJ, Vansteelandt S. Odds Ratios for Mediation Analysis for a Dichotomous Outcome. American Journal of Epidemiology. 2010;172(12):1339-1348. doi:10.1093/aje/kwq332.
I think one issue here is that this method is designed to appeal to people who want to fit a series of regression models and who, without this method, would fall back on Baron and Kenny. The 4-way decomp carries on where this first paper leaves off with similar assumptions and approximations.
So if we produce code for 4-way decomp in Mplus for binary-M/binary-Y using Tyler's code, the 4 components would be expected to re-compose in various ways, e.g. INT-MED + PIE = TNIE; however, none of the larger components will agree with Bengt's published code as it relies on a slightly different method.
This raises the question of whether a Pearl-esque approach could also be used for a 4-way decomposition.
With continuous data we don't appear to have an issue, and indeed the 4-way code for that is already up on the statmodel website.
I was wondering if someone could explain these four lines of code to me in plain English. Below each line of code I put what I think it calculates.
cde = (t1 + t3*mstar)*(a1-a0);
The controlled direct effect, which is the amount of change in the Y variable when the mediator is constant, but the X variable is changed from conditional value a0 to a1. In a typical mediation model this would be change in the direct path from X to Y.
intref = t3*(b0 + b1*a0 + bcc - mstar)*(a1-a0);
The interaction effect, the amount of average change in the Y variable attributed to the interaction effect when the mediator is held constant, but the conditional values change from a0 to a1.
intmed = t3*b1*(a1-a0)*(a1-a0);
The mediated interaction, the amount of average change in the Y variable attributed to both the interaction and mediation when the X variable is changed from a0 to a1.
pie = (t2*b1 + t3*b1*a0)*(a1-a0);
The pure indirect effect, the amount of variance in the X to Y relationship accounted for by the relationship between X and M and M and Y when the X variable changes from a0 to a1.
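One way to check what the four expressions are doing is to confirm numerically that they add up to the total effect. The Python sketch below assumes linear models for a continuous mediator and outcome, M = b0 + b1*A + bcc and Y = t0 + t1*A + t2*M + t3*A*M, with bcc read as the covariate contribution to M at the chosen covariate values. That reading of bcc, the function names and the numbers are assumptions for illustration, not taken from the FAQ input.

```python
def four_way(t1, t2, t3, b0, b1, bcc, a0, a1, mstar):
    """The four components, written exactly as in the MODEL CONSTRAINT lines."""
    d = a1 - a0
    return {
        "cde": (t1 + t3 * mstar) * d,
        "intref": t3 * (b0 + b1 * a0 + bcc - mstar) * d,
        "intmed": t3 * b1 * d * d,
        "pie": (t2 * b1 + t3 * b1 * a0) * d,
    }

def total_effect(t1, t2, t3, b0, b1, bcc, a0, a1):
    """E[Y_a1] - E[Y_a0] under the assumed linear models (t0 cancels)."""
    def mean_y(a):
        m = b0 + b1 * a + bcc          # expected mediator at exposure a
        return t1 * a + t2 * m + t3 * a * m
    return mean_y(a1) - mean_y(a0)

# Hypothetical parameter values, for illustration only.
par = dict(t1=0.4, t2=0.6, t3=0.25, b0=1.0, b1=0.5, bcc=0.2, a0=0.0, a1=1.0)
parts = four_way(mstar=0.0, **par)
te = total_effect(**par)
print(parts)
print("sum of components:", sum(parts.values()), " total effect:", te)
```

The sum matches the total effect for any choice of mstar, because mstar only moves effect between cde and intref.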
See pages 372-373 of the VanderWeele 2015 book or page 750 of the 2014 VanderWeele article that we mention in the FAQ:
Mediation: Effects using a 4-way decomposition
Tor Neilands, who worked on this, will add some comments on your wording in a few days.
Tor Neilands posted on Wednesday, November 29, 2017 - 10:18 am
After having re-read the VanderWeele material, I agree with your definitions of the CDE and PIE and the spirit of your INTREF and INTMED definitions, but might suggest taking a second look at the exact wording for those latter two definitions with respect to the role of A0, especially for INTREF based on what appears below.
The 2014 VanderWeele paper has a nice intuitive explanation for these 4 effects which I'll post separately. The article also shows the following counterfactual-based expected values of the effects, which I found helpful:
where p_am = E(Y|A=a, M=m). The article goes on to comment that, "With such average measures, it is possible to assess how much of the total effect is due to neither mediation nor interaction (the first component); how much is due to interaction but not mediation (the second component); how much is due to both mediation and interaction (the third component); and how much of the effect is due to mediation but not interaction (the fourth component)." Table 1 on p. 751 also helpfully lays out the counterfactual definition, empirical analog, and a very brief natural language interpretation of each of the four effects.
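The expected values themselves did not come through in the post. For binary A and M with reference level m* = 0 and no unmeasured confounding, the averages given in VanderWeele (2014) are, as far as they can be reconstructed here (worth checking against the paper), in the same order as the four components in the quotation:

```latex
\begin{aligned}
E[\mathrm{CDE}(0)] &= p_{10} - p_{00} \\
E[\mathrm{INT_{ref}}(0)] &= (p_{11} - p_{10} - p_{01} + p_{00})\, P(M=1 \mid A=0) \\
E[\mathrm{INT_{med}}] &= (p_{11} - p_{10} - p_{01} + p_{00})\,\{P(M=1 \mid A=1) - P(M=1 \mid A=0)\} \\
E[\mathrm{PIE}] &= (p_{01} - p_{00})\,\{P(M=1 \mid A=1) - P(M=1 \mid A=0)\}
\end{aligned}
```

Adding the four lines gives E(Y|A=1) - E(Y|A=0), so the components account for the whole total effect.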
Tor Neilands posted on Wednesday, November 29, 2017 - 10:23 am
From VanderWeele, 2014: "The intuition behind this decomposition is that if the exposure affects the outcome for a particular individual, then at least 1 of 4 things must be the case. One possibility is that the exposure might affect the outcome through pathways that do not require the mediator (ie, the exposure affects the outcome even when the mediator is absent); in other words, the first component is non-zero. A second possibility is that the exposure effect might operate only in the presence of the mediator (ie, there is an interaction), with the exposure itself not necessary for the mediator to be present (ie, the mediator itself would be present in the absence of the exposure, although the mediator is itself necessary for the exposure to have an effect on the outcome); in other words, the second component is non-zero. A third possibility is that the exposure effect might operate only in the presence of the mediator (ie, there is an interaction), with the exposure itself needed for the mediator to be present (ie, the exposure causes the mediator, and the presence of the mediator is itself necessary for the exposure to have an effect on the outcome); in other words, the third component is non-zero. The fourth possibility is that the mediator can cause the outcome in the absence of the exposure, but the exposure is necessary for the mediator itself to be present; in other words, the fourth component is non-zero."
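The "at least 1 of 4 things must be the case" claim can be checked by brute force for a binary exposure, mediator and outcome. The Python sketch below enumerates every combination of an individual's potential outcomes Y_am and potential mediators M_a and verifies that the four individual-level components (with m* = 0) sum to Y_1 - Y_0. The function and variable names are hypothetical; the component expressions follow the individual-level decomposition as I understand it from the 2014 paper.

```python
from itertools import product

def components(y00, y01, y10, y11, m0, m1):
    """Individual-level CDE(0), INTref(0), INTmed and PIE; y_am = Y when A=a, M=m."""
    inter = y11 - y10 - y01 + y00
    cde = y10 - y00
    intref = inter * m0
    intmed = inter * (m1 - m0)
    pie = (y01 - y00) * (m1 - m0)
    return cde, intref, intmed, pie

def total_effect(y00, y01, y10, y11, m0, m1):
    """Y_1 - Y_0, where the mediator takes its natural value under each exposure."""
    y = {(0, 0): y00, (0, 1): y01, (1, 0): y10, (1, 1): y11}
    return y[(1, m1)] - y[(0, m0)]

checked = 0
for combo in product((0, 1), repeat=6):
    parts = components(*combo)
    te = total_effect(*combo)
    assert sum(parts) == te                      # the identity holds exactly
    if te != 0:
        assert any(c != 0 for c in parts)        # some component must be non-zero
    checked += 1
print("response patterns checked:", checked)
```

Because the identity holds for every response pattern, it also holds on average, which is what the expected-value versions of the four components express.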
Tor Neilands posted on Wednesday, November 29, 2017 - 10:38 am
A final thought (sorry for needing multiple posts to cover this complex topic): After reading Brandon's post and the VanderWeele materials several times I realized that Brandon was using the MODEL CONSTRAINT syntax involving the parameters of the regression models for Y and M to generate his definitions whereas I relied on the counterfactual definitions shown in VanderWeele's paper. To understand how the counterfactual definitions and their empirical analogs get translated into expressions of regression model parameters, including how A=0 for INTREF can result in an expression that includes (a1-a0), I recommend looking at the appendix of VanderWeele, 2014. It also shows the SAS NLMIXED code that I replicated in MODEL CONSTRAINT in Mplus.
Caveat: I don't consider myself an expert in this and welcome comments, corrections, or suggested improvements.
This has been helpful. I think the method is interesting. I'd like some more clarification about how to set values for examining this. a1 and a0 seem straightforward, but what about mstar? Additionally, assuming one does find significant effects for the interaction or intmed, what would be the appropriate approach for modeling follow-up simple slopes?
Tor Neilands posted on Thursday, December 21, 2017 - 11:21 pm
In the appendix of the 2014 VanderWeele paper on the 4-way decomposition method, he points out in his description of the SAS example on page 9 of the appendix that the analyst must choose the level of mstar at which to compute the controlled direct effect and the remainder of the decomposition. He assumes M=mstar=0, which seems like a sensible reference value to me for most applications. He goes on to add that if a different value from zero makes more sense, that could be used instead.
For the simple slopes question, I'm not sure how well the logic of simple slopes applies to the effects estimated via the 4-way decomposition. You could get simple slopes for the A*M product in the expectation of Y given exposure a, mediator m, and covariates c:
E[Y | a, m, c] = θ0 + θ1 a + θ2 m + θ3 a m + θ4'c
from page 1 of the VanderWeele 2014 appendix (the equation is restated here in its standard linear form, with θ1 to θ3 corresponding to t1 to t3 in the MODEL CONSTRAINT code). The problem is that a*m is the standard multiplicative interaction and the expectations of IntRef and IntMed differ (they are additive interactions). To interpret IntRef and IntMed, I wonder if a plot along the lines of the ones generated in Ex. 3.18 in the Mplus User's Guide might be an option?
I understand that it says mstar=0, but why is that sensible? Are we assuming that 0 means the mediator is not present, or is 0 a sample average that results from centering? In one case we would be interpreting, say, intref at the sample average of the mediator, and in the other when the mediator is not present.
I think your idea about plotting it like in 3.18 would be worth trying.
The mstar value for CDE should be chosen as a value that is substantively of policy interest. See e.g. VanderWeele and Vansteelandt (2009) in Statistics and Its Interface and also Jackson et al (2014) in Am J of Epi.
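To see what is at stake in the choice of mstar, here is a small Python sketch that evaluates the cde and intref expressions from the MODEL CONSTRAINT code at several mstar values. It assumes the same linear models for M and Y as that code; the parameter values are hypothetical, and bcc is read as the covariate contribution to the mediator model.

```python
def cde_and_intref(t1, t3, b0, b1, bcc, a0, a1, mstar):
    """cde and intref as in the MODEL CONSTRAINT expressions."""
    d = a1 - a0
    cde = (t1 + t3 * mstar) * d
    intref = t3 * (b0 + b1 * a0 + bcc - mstar) * d
    return cde, intref

# Hypothetical parameter values, for illustration only.
params = dict(t1=0.4, t3=0.25, b0=1.0, b1=0.5, bcc=0.0, a0=0.0, a1=1.0)

# Expected mediator under the reference exposure a0.
m_expected_a0 = params["b0"] + params["b1"] * params["a0"] + params["bcc"]

results = {}
for mstar in (0.0, m_expected_a0, 2.0):
    cde, intref = cde_and_intref(mstar=mstar, **params)
    results[mstar] = (cde, intref, cde + intref)
    print(f"mstar={mstar:4.1f}  cde={cde:6.3f}  intref={intref:6.3f}  sum={cde + intref:6.3f}")
```

Two things stand out under these assumptions. The sum cde + intref (the pure direct effect) does not depend on mstar, so mstar only shifts effect between the two components. And intref vanishes when mstar equals the expected mediator under a0, so with a mean-centered mediator mstar=0 answers a different question than mstar=0 meaning "mediator absent".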