Professional Documents
Culture Documents
Causes of Effects Learning Individual Responses From Population Data
Causes of Effects Learning Individual Responses From Population Data
R-505
January 2022
3.2 Mediation
Partial Mediator
0,
P (yx ) − P (yx′ ),
U Z max ≤ PNS (14)
P (y) − P (yx′ ),
P (yx ) − P (y)
X Y
P (yx ),
P (yx′ ′ ),
Figure 2: Mediator Z with direct effect
P (y, x) + P (y ′ , x′ ),
In figure 2, Z is a descendant of X, so we cannot use theo- min P (yx ) − P (yx′ )+ ≥ PNS (15)
rems 4 and 5. However, the absence of confounders between
+P (y, x′ ) + P (y ′ , x),
Z and Y and between X and Y permits us to bound PNS as
′ ′
follows: Σz Σz′ ̸=z min{P (y|z), P′ (y′ |z )}×
min{P (z|x), P (z |x )}
Proof. See Appendix. They suggest that the fraction of beneficiaries can be as low as
0% or as high as 29.7%. Now, consider the bounds in theorem
The core terms for theorems 6 and 7 added to the upper 5 which takes into account the position of Z in the diagram.
bounds notably only require observational data. Since Z satisfies the back-door criterion, we can use equations
10 and 11 to compute 0 ⩽ P N S ⩽ 0.01. The conclusion now
4 Examples is obvious. At most 7 out of 314 patients’ recoveries can be
4.1 Credit to the Treatment credited to the drug. This is strong evidence that counters the
manufacturer’s claim.
The manufacturer of a drug wants to claim that a non-trivial
number of recovered patients who were given access to the 4.2 Inflammation Mediator
drug owe their recovery to the drug. So they conduct an obser-
vational study; they record the recovery rates of 700 patients. As before, let X and Y represent drug consumption and re-
464 patients chose to take the drug and 236 patients did not. covery. Let Z represent acute inflammation with z being
The results of the study are in table 1. The manufacturer claims present and z ′ being absent. In some people, the drug causes
success for their drug because the overall recovery rate from acute inflammation, which has adverse effects on recovery, so
the observational study has increased from 54% to 68% for the causal structure is depicted in figure 3. We observe the
non-drug-takers to drug-takers. following proportions among drug takers, non-takers, with
inflammation, and without inflammation:
Table 1: Results of a drug study with gender taken into account P (y|z) = 0.5, P (z|x) = 0.1,
′
Drug No Drug P (y|z ) = 0.5, P (z|x′ ) = 0.1.
1 out of 110 13 out of 120
Women The Tian-Pearl PNS upper bound is:
recovered (1%) recovered (11%)
Men
313 out of 354 114 out of 116 PNS ⩽ min {P (y|x), P (y ′ |x′ )} = 0.5.
recovered (88%) recovered (98%)
314 out of 464 127 out of 236 Given that the lower bound is 0, these bounds are not very
Overall informative. If we knew that an individual would react to
recovered (68%) recovered (54%)
the drug with acute inflammation, we would only look at the
data comprising of people reacting to the drug with acute
The number of recovered patients that should credit the inflammation. Since we are conditioning on z, P N S = 0
drug for their recovery are those who would recover if they because the outcome, Y , will have the same result regardless
had taken the drug and would not recover if they had not taken of whether the person consumed the drug. So knowing a
the drug. This is the PNS. person’s inflammation response to the drug narrows PNS from
Let X = x denote the event that the patient took the drug a wide [0, 0.5] to a point estimate of 0. Imagine, for this drug,
and X = x′ denote the event that the patient did not take that we can’t know ahead of time how a person will react
the drug. Let Y = y denote the event that the patient has inflammation-wise. We can only observe acute inflammation
recovered and Y = y ′ denote the event that the patient has not after the drug is administered. Since we have population data
recovered. Let Z = z represent female patients and Z = z ′ from patients who have already taken the drug, we can utilize
represent male patients. Suppose we know an additional fact, this mediator to bound the PNS for new patients who haven’t
estrogen has a negative effect on recovery, so women are less yet taken the drug:
likely to recover than men, regardless of the drug. Additionally,
P (y|z) · P (z|x) + P (y|z ′ ) · P (z|x′ ),
as we can see from the data, men are significantly more likely
P (y|z) · P (z ′ |x′ ) + P (y|z ′ ) · P (z ′ |x),
to take the drug than women are. The causal diagram is shown PNS ⩽ min ′ ′
in Figure 1a. P (y ′ |z ′ ) · P (z|x)
+ P (y ′ |z) · P (z|x′ ),
P (y |z ) · P (z ′ |x′ ) + P (y ′ |z) · P (z ′ |x)
Node Z on the graph satisfies the back-door criterion, there-
fore we can compute the causal effect P (yx ) and P (yx′ ) via = 0.1.
the adjustment formula [Pearl, 1993] and observational data
from table 1, where, The mediator-improved PNS upper bound is significantly
X smaller than what the Tian-Pearl upper bound provides, 0.1
P (yx ) = P (y|x, z)P (z) = 0.597, vs 0.5. The new upper bound can now be effectively weighed
z against other factors like cost and side-effects.
X
P (y ) =
x′ P (y|x′ , z)P (z) = 0.696,
4.3 Ancestral Covariate
z
P (yx′ ′ ) = 1 − P (yx′ ) = 0.304. Let’s continue from the introduction, where X represents vac-
cination with x being vaccinated and x′ being unvaccinated
Therefore, the bounds of PNS computed using equations 4 and Y represents survival with y is surviving and y ′ is suc-
and 5 are 0 ≤ P N S ≤ 0.297, where the diagram was used cumbing to the pandemic. Instead of classifying by age, let’s
only to identify the causal effects yx and yx′ . These bounds assume our machine learning algorithm uncovers a correla-
aren’t informative enough to conclude whether or not the drug tion between survival and ancestry. Let Z represent ancestry
was the cause of recovery for a meaningful number of patients. and, for simplicity, there are only two ancestries, z and z ′ .
Either graph of figure 1 is representative of this. Our RCT For each causal diagram, 100 out of 100000 sample distribu-
data reveals: tions are randomly selected, sorted by lower and upper PNS
P (Z = z) = 0.5, bound, and then drawn with and without considering the causal
P (yx |Z = z ′ ) = 0.25, diagram (Figures 5 to 8).
P (yx |Z = z) = 0.75,
P (yx′ |Z = z ′ ) = 0.6.
P (yx′ |Z = z) = 0.2, Table 2: Performance metrics for Theorems 4 (Non-desc), 5 (Suff
covar), 6 (Part med & Part med 2), and 7 (Pure med)
We now have four different bounds on PNS:
Tian-Pearl =⇒ 0.1 ⩽ P N S ⩽ 0.5 Incr’d Decr’d
Tian-Pearl Theorems Samples
lower upper
Covariate-improved =⇒ 0.275 ⩽ P N S ⩽ 0.5 PNS gap PNS gap benefiting
bound bound
Person has ancestry z =⇒ 0.55 ⩽ P N S ⩽ 0.75 Non-desc 0.0238 0.0237 0.2673 0.2197 85.622%
Person has ancestry z ′ =⇒ 0 ⩽ P N S ⩽ 0.25 Suff covar 0.0266 0.0264 0.2197 0.1668 75.025%
Part med 0.0000 0.0047 0.2289 0.2242 12.532%
Part med 2 0.0000 0.0382 0.2768 0.2386 100.00%
As expected, using the causal diagram and ancestral Z yields Pure med 0.0000 0.0935 0.2605 0.1670 100.00%
narrower bounds than the Tian-Pearl bounds. However, it’s
surprising that knowing a person has either ancestry z or z ′ Table 2 shows the average gaps between Tian-Pearl PNS
gives us bounds outside of our new bounds. In fact, they bounds and Theorem 6’s bounds are similar for the partial me-
are completely outside the wider Tian-Pearl bounds. This is diator of Figure 4b (Part med in Table 2). This is because only
discussed in section 6. 12.532% of samples are narrowed by the proposed Theorem 6.
In the meantime, it’s important to recognize that the last two A second set of sample distributions were generated repeatedly
ancestry-specific PNS bounds are what should be referred to until 100000 narrowed samples were available (Part med 2 in
if an individual knows their ancestry. The covariate-improved Table 2). This time the difference in gaps were significant,
PNS bounds should only be referred to if a person’s ancestry is which is important if the costs of including partial mediator
unknown. This might be because the person was adopted with data are low.
no hint as to whether they’re from ancestry z or z ′ (physical
features are right in between or indistinguishable).
5 Simulation Results
Z
Z
W
X Y
X Y (b) Z is a partial mediator
(a) Z is a non-descendant of X
Figure 4: Causal diagrams for simulation Figure 5: PNS bounds for causal diagram of Figure 4a
References
[Balke and Pearl, 1997] Alexander Balke and Judea Pearl.
Probabilistic Counterfactuals: Semantics, Computation,
and Applications. PhD thesis, University of California, Los
Angeles, 1997.
[Balke and Pearl, 2013] Alexander Balke and Judea Pearl.
Counterfactuals and policy analysis in structural models.
arXiv preprint arXiv:1302.4929, 2013.
[Dawid et al., 2017] Philip Dawid, Monica Musio, and
Rossella Murtas. The probability of causation. Law, Prob-
ability and Risk, (16):163–179, 2017.
[Galles and Pearl, 1998] David Galles and Judea Pearl. An
axiomatic characterization of causal counterfactuals. Foun-
dations of Science, 3(1):151–182, 1998.
[Halpern, 2000] Joseph Y Halpern. Axiomatizing causal rea-
soning. Journal of Artificial Intelligence Research, 12:317–
337, 2000.
[Koller and Friedman, 2009] Daphne Koller and Nir Fried-
man. Probabilistic Graphical Models: Principles and Tech-
niques. MIT press, 2009.
[Li and Pearl, 2019] Ang Li and Judea Pearl. Unit selection
based on counterfactual logic. In Proceedings of the 28th
International Joint Conference on Artificial Intelligence,
pages 1793–1799. AAAI Press, 2019.
[Li and Pearl, 2021] A. Li and J. Pearl. Unit selec-
tion with causal diagram. Technical Report R-510,
<http://ftp.cs.ucla.edu/pub/stat ser/r510.pdf>, To appear
in AAAI Conference on Artificial Intelligence (AAAI-
2022), Department of Computer Science, University of
California, Los Angeles, CA, 2021.
[Mueller and Pearl, 2020] Scott Mueller and Judea Pearl.
Which Patients are in Greater Need: A counterfactual
analysis with reflections on COVID-19, Apr 2020. https:
//ucla.in/39Ey8sU+.
[Pearl, 1993] J Pearl. Aspects of graphical models connected
with causality. Proceedings of the 49th Session of the
International Statistical Institute, Italy, pages 399–401,
1993.
[Pearl, 1995] Judea Pearl. Causal diagrams for empirical
research. Biometrika, 82(4):669–688, 1995.
[Pearl, 1999] Judea Pearl. Probabilities of Causation: Three
counterfactual interpretations and their identification. Syn-
these, 121(1-2):93–149, 1999.
[Pearl, 2009] Judea Pearl. Causality. Cambridge University
Press, Second edition, 2009.
[Spirtes et al., 2000] Peter Spirtes, Clark N Glymour, Richard
Scheines, and David Heckerman. Causation, Prediction,
and Search. MIT press, 2000.