Journal of Econometrics: Yanqin Fan, Sang Soo Park

Journal of Econometrics 167 (2012) 330–344

Journal of Econometrics
journal homepage: www.elsevier.com/locate/jeconom
Confidence intervals for the quantile of treatment effects in
randomized experiments
Yanqin Fan (a), Sang Soo Park (b,*)
(a) Department of Economics, Vanderbilt University, VU Station B #351819, 2301 Vanderbilt Place, Nashville, TN 37235-1819, USA
(b) Department of Economics, Korea University, Anam-dong, Sungbuk-gu, Seoul 136-701, Republic of Korea
Article info
Article history: Available online 1 October 2011
JEL classification: C14, C15, C19
Keywords: Heterogeneous treatment effects; Partial identification; Quantile treatment effects; Order statistic approach
Abstract
In this paper, we explore partial identification and inference for the quantile of treatment effects
for randomized experiments. First, we propose nonparametric estimators of sharp bounds on the
quantile of treatment effects and establish their asymptotic properties under general conditions. Second,
we construct confidence intervals for the bounds and the true quantile by using the approach in
Chernozhukov et al. (2009). Third, under additional conditions, we develop a new approach to construct
confidence intervals for the bounds and the true quantile and refer to it as the order statistic approach. A
simulation study is conducted to investigate the finite sample performance of both approaches.
© 2011 Elsevier B.V. All rights reserved.
1. Introduction
Consider a binary treatment with two treatment states referred
to as the treated and the control states. We define the individual
treatment effect as the difference between the two potential
outcomes: the individual's potential outcome when he/she is
assigned to the treated group and the potential outcome when
he/she is assigned to the control group. The effect of a treatment
or social program may differ across individuals when responses
to treatment/program differ among them. When the treatment
effect is heterogeneous, its distribution is often called for in
order to answer many interesting policy questions; see Heckman
et al. (1997) and Fan and Park (2009, 2010) for discussion and
references.
Given that only one of the two potential outcomes is observed for any individual, the researcher does not observe the individual's treatment effect. This missing data problem is the fundamental obstacle to the identification of the distribution of treatment effects when there is heterogeneous response to treatment. Without imposing strong dependence structure on the potential outcomes, Fan and Park (2009, 2010) investigated the (partial) identification of the distribution of the effects of a binary treatment in a randomized experiment. In a randomized experiment, the marginal distributions of the potential outcomes are identified and the distribution of treatment effects is partially identified. Fan and Park (2009, 2010) established asymptotic properties of nonparametric estimators of sharp bounds on the distribution of treatment effects and provided valid inference procedures for the true bounds.
We are grateful to three referees, Han Hong, Chung-Ming Kuan, and Yoon-Jae Whang for many constructive comments on the previous version of this paper that have led to a much improved paper. We are also grateful to participants of the 2008 International Symposium on Econometric Theory and Applications for helpful comments and M.J. Lee for drawing our attention to an important reference.
* Corresponding author.
E-mail address: sangsoopark@unc.edu (S.S. Park).
In this paper, we consider an alternative approach for investigating heterogeneous treatment effects, the quantile approach. Let $p$ denote a given quantile level, $0 < p < 1$, and $Q_{TE}(p)$ denote the $p$ quantile of the treatment effects. $Q_{TE}(p)$ is of direct interest in many applications. For example, the effectiveness of a policy intervention may be measured by the median of treatment effects, that is $Q_{TE}(0.5)$, rather than the average treatment effect. It is worthwhile to note that $Q_{TE}(p)$ is different from what is known in the current literature as the Quantile Treatment Effect (QTE); QTE is defined as the difference between the quantiles of the outcomes of the treated and control groups at a given quantile level $p$. QTE has received sustained and considerable attention, both in theory and in applications; see Lehmann (1974), Doksum (1974), Heckman et al. (1997), Abadie et al. (2002), Bitler et al. (2006), Chernozhukov and Hansen (2005, 2006), Firpo (2007) and Djebbari and Smith (2008) among others. Although the QTE is informative and useful in analyzing certain types of treatment effects, Fan and Park (2010) showed that the QTE thus defined is the same as $Q_{TE}(p)$ only under the assumption that the two potential outcomes are perfectly positively dependent (Firpo, 2007 called this a rank preservation assumption) and when the QTE is nondecreasing in the quantile level $p$. QTE and $Q_{TE}(p)$ are two different treatment effect parameters and are useful in answering different policy questions.
0304-4076/$ - see front matter © 2011 Elsevier B.V. All rights reserved.
doi:10.1016/j.jeconom.2011.09.019
Statistically, QTE, being the difference between the quantiles of marginal distributions of the potential outcomes, is identified as long as the marginal distributions are identified, whereas $Q_{TE}(p)$ can only be partially identified/bounded unless strong dependence structure is imposed on the potential outcomes. Estimation and statistical inference procedures for the QTE have been established (see, for example, Chernozhukov and Hansen (2005, 2006) in the parametric framework and Firpo (2007) in the nonparametric framework) while, to the best of our knowledge, no systematic study on inference procedures for $Q_{TE}(p)$ is currently available. Besides work in the probability literature on sharp bounds on the quantile of a difference between two random variables with given marginal distributions, thoroughly reviewed in Fan and Park (2009, 2010), Heckman et al. (1997) attempted to find bounds for $Q_{TE}(p)$ by extending the concept of QTE with various dependence measures between the potential outcomes, and Lee (2000) showed conditions under which the sign of $Q_{TE}(1/2)$ is identified.
In this paper, we explore the partial identification of $Q_{TE}(p)$ and provide a systematic study of its inference procedures in the context of randomized experiments. In particular, we make several contributions to the current treatment effect literature. First, we propose nonparametric estimators of sharp bounds on the quantile of treatment effects $Q_{TE}(p)$ and establish their asymptotic properties under general conditions. Second, we construct confidence intervals (CIs) for the bounds and the true quantile by using the approach in Chernozhukov et al. (2009). Third, under additional conditions that guarantee the asymptotic normality of nonparametric estimators of the sharp bounds, we develop a new approach to constructing CIs for the bounds and the true quantile and refer to it as the order statistic approach. Although CIs based on the order statistic approach rely on additional conditions, they are easy to implement and are completely data-driven. A simulation study is conducted to investigate the finite sample performance of both approaches.
Since $Q_{TE}(p)$ is partially identified, this paper also belongs to the recent, but rapidly growing area of inference for partially identified parameters; see Imbens and Manski (2004), Bugni (2010), Canay (2010), Chernozhukov et al. (2007), Fan and Park (2009, 2010), Galichon and Henry (2009), Horowitz and Manski (2000), Romano and Shaikh (2008), Stoye (2009), Rosen (2008), Beresteanu and Molinari (2009), Andrews and Guggenberger (2009) and Andrews and Soares (2010) among others.
The rest of this paper is organized as follows. In Section 2, we introduce sharp bounds on $Q_{TE}(p)$, provide their nonparametric estimators, and develop the asymptotic theory for these estimators under general conditions. Section 3 presents CIs for each bound and the true quantile using two different approaches, that of Chernozhukov et al. (2009) and the order statistic approach. Monte Carlo simulation results are presented in Section 4. Section 5 concludes. Technical proofs are gathered in Appendix A. Appendix B describes data generating processes (DGPs) used in our Monte Carlo experiment.
Throughout the rest of this paper, we use $\Rightarrow$ to denote weak convergence. All the limits are taken as $n \equiv n_1 + n_0$, the sum of the two sample sizes, goes to $\infty$; see (A1) in Section 2.2.
2. Nonparametric estimators of sharp bounds on $Q_{TE}(p)$ and their asymptotic properties
2.1. Sharp bounds on $Q_{TE}(p)$ and their estimators
The notation in this paper follows the convention in the treatment effect literature. We consider a binary treatment and use $Y_1$ to denote the potential outcome from receiving the treatment and $Y_0$ the potential outcome without the treatment. Let $F(y_1, y_0)$ denote the joint distribution function of $Y_1, Y_0$ with continuous marginal distribution functions $F_1(\cdot)$ and $F_0(\cdot)$ respectively. Let $\Delta = Y_1 - Y_0$ denote the individual treatment effect and $F_\Delta(\cdot)$ its distribution function.
Let $F_\Delta^{-1}(\cdot)$ denote the generalized inverse of $F_\Delta(\cdot)$. For a given quantile level $p \in (0, 1)$, $Q_{TE}(p) \equiv F_\Delta^{-1}(p)$. Given the marginals $F_1$ and $F_0$, sharp bounds on $Q_{TE}(p)$ can be found in Williamson and Downs (1990). They are restated in the following lemma. Let $q \equiv 1 - p$ and $\phi_j(u) \equiv F_j^{-1}(u)$, $j = 1, 0$.
Lemma 2.1. For $0 < p < 1$, $Q^L(p) \le Q_{TE}(p) \le Q^U(p)$, where
$$Q^L(p) = \sup_{u \in (0,p)} [\phi_1(u) - \phi_0(u + q)],$$
$$Q^U(p) = \inf_{u \in (p,1)} [\phi_1(u) - \phi_0(u - p)],$$
and these bounds are sharp.
For a given $0 < p < 1$, Lemma 2.1 implies that $Q_{TE}(p)$ can take any value in the interval $[Q^L(p), Q^U(p)]$ and the bounds $Q^L(p)$, $Q^U(p)$ are point identified, provided that the marginal distributions $F_1$ and $F_0$ are point identified. For notational compactness, let
$$\psi_L(u, p) = \phi_1(u) - \phi_0(u + q) \quad \text{and} \quad \psi_U(u, p) = \phi_1(u) - \phi_0(u - p).$$
Then Lemma 2.1 implies:
$$Q^L(p) = \sup_{u \in (0,p)} \psi_L(u, p) \quad \text{and} \quad Q^U(p) = \inf_{u \in (p,1)} \psi_U(u, p).$$
Further, let
$$\mathcal{U}_{\sup,p} \equiv \arg\sup_{u \in (0,p)} \psi_L(u, p) \quad \text{and} \quad \mathcal{U}_{\inf,p} \equiv \arg\inf_{u \in (p,1)} \psi_U(u, p).$$
Suppose random samples $\{Y_{1i}\}_{i=1}^{n_1} \sim F_1$ and $\{Y_{0i}\}_{i=1}^{n_0} \sim F_0$ are available. Let $n \equiv n_1 + n_0$, and let $F_{1n}(\cdot)$ and $F_{0n}(\cdot)$ denote the empirical distribution functions defined as
$$F_{jn}(y) = \frac{1}{n_j} \sum_{i=1}^{n_j} 1\{Y_{ji} \le y\}, \quad j = 1, 0,$$
and
$$\psi_{nL}(u, p) = \phi_{1n}(u) - \phi_{0n}(u + q) \quad \text{and} \quad \psi_{nU}(u, p) = \phi_{1n}(u) - \phi_{0n}(u - p),$$
where $\phi_{1n}(\cdot)$ and $\phi_{0n}(\cdot)$ are the empirical quantile functions defined as follows:
$$\phi_{jn}(p) = Y_{jn(i)} \text{ for } p \in \left(\frac{i-1}{n_j}, \frac{i}{n_j}\right], \quad i = 1, \ldots, n_j, \qquad \phi_{jn}(0) = Y_{jn(1)},$$
where $Y_{jn(1)} \le \cdots \le Y_{jn(n_j)}$ are the order statistics of $\{Y_{ji}\}_{i=1}^{n_j}$.
For a given quantile level $p \in (0, 1)$, we propose the following estimators of $Q^L(p)$ and $Q^U(p)$:
$$Q^L_n(p) = \sup_{u \in (0,p)} \psi_{nL}(u, p), \qquad Q^U_n(p) = \inf_{u \in (p,1)} \psi_{nU}(u, p). \tag{1}$$
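The plug-in estimators in (1) can be illustrated with a short Python sketch. It assumes the reconstruction of the empirical quantile function given above, and it approximates the supremum and infimum on a finite interior grid, which is a simplification introduced here and not part of the paper. The names `empirical_quantile`, `quantile_bounds` and `grid_size` are hypothetical.

```python
import math

def empirical_quantile(sorted_y, u):
    """Empirical quantile: Y_(i) for u in ((i-1)/n, i/n], and Y_(1) at u = 0."""
    n = len(sorted_y)
    i = max(1, math.ceil(u * n - 1e-12))  # small tolerance guards float round-off
    return sorted_y[min(i, n) - 1]

def quantile_bounds(y1, y0, p, grid_size=200):
    """Grid approximation (an assumption of this sketch) of Q^L_n(p) and Q^U_n(p)."""
    s1, s0 = sorted(y1), sorted(y0)
    q = 1.0 - p
    # interior grid points of (0, p) and of (p, 1)
    lower_grid = [p * (k + 0.5) / grid_size for k in range(grid_size)]
    upper_grid = [p + q * (k + 0.5) / grid_size for k in range(grid_size)]
    # sup over u of phi_1n(u) - phi_0n(u + q)
    q_lower = max(empirical_quantile(s1, u) - empirical_quantile(s0, u + q)
                  for u in lower_grid)
    # inf over u of phi_1n(u) - phi_0n(u - p)
    q_upper = min(empirical_quantile(s1, u) - empirical_quantile(s0, u - p)
                  for u in upper_grid)
    return q_lower, q_upper
```

Because both empirical quantile functions are nondecreasing, the estimated lower bound can never exceed the estimated upper bound, whatever the grid.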
In the next subsection, we will investigate asymptotic properties of $Q^L_n(p)$, $Q^U_n(p)$ including the asymptotic distributions of $\sqrt{n}\,(Q^L_n(p) - Q^L(p))$, $\sqrt{n}\,(Q^U_n(p) - Q^U(p))$ and the weak convergence of the normalized stochastic processes associated with $\psi_{nL}(\cdot, p)$ and $\psi_{nU}(\cdot, p)$. Suppose $F_1$ and $F_0$ are absolutely continuous with respect to the Lebesgue measure with probability density functions $f_1$ and $f_0$. To define the latter processes, we let
$$\sigma_L^2(u, p) = \frac{u(1-u)}{\lambda_1 [f_1(\phi_1(u))]^2} + \frac{(u+q)(p-u)}{\lambda_0 [f_0(\phi_0(u+q))]^2},$$
$$\sigma_U^2(u, p) = \frac{u(1-u)}{\lambda_1 [f_1(\phi_1(u))]^2} + \frac{(u-p)(1-u+p)}{\lambda_0 [f_0(\phi_0(u-p))]^2},$$
and
$$\sigma_{nL}^2(u, p) = \frac{u(1-u)}{\lambda_{1n} [\hat f_1(\phi_{1n}(u))]^2} + \frac{(u+q)(p-u)}{\lambda_{0n} [\hat f_0(\phi_{0n}(u+q))]^2},$$
$$\sigma_{nU}^2(u, p) = \frac{u(1-u)}{\lambda_{1n} [\hat f_1(\phi_{1n}(u))]^2} + \frac{(u-p)(1-u+p)}{\lambda_{0n} [\hat f_0(\phi_{0n}(u-p))]^2},$$
where for $j = 1, 0$, $\lambda_{jn} = n_j/n$, $\lambda_j = \lim \lambda_{jn}$, and $\hat f_j(y_j)$ is a uniformly consistent estimator of $f_j(y_j)$. Then the normalized stochastic processes associated with $\psi_{nL}(\cdot, p)$ and $\psi_{nU}(\cdot, p)$ are defined as
$$Z_{nL}(u, p) = \frac{\sqrt{n}\,(\psi_{nL}(u, p) - \psi_L(u, p))}{\sigma_{nL}(u, p)} \quad \text{for } u \in (0, p), \tag{2}$$
$$Z_{nU}(u, p) = \frac{\sqrt{n}\,(\psi_U(u, p) - \psi_{nU}(u, p))}{\sigma_{nU}(u, p)} \quad \text{for } u \in (p, 1). \tag{3}$$
2.2. Asymptotic properties
We make the following assumptions.
(A1) (i) The two samples $\{Y_{1i}\}_{i=1}^{n_1}$ and $\{Y_{0i}\}_{i=1}^{n_0}$ are each i.i.d. and are independent of each other; (ii) For $j = 1, 0$, we assume $n_j/n \to \lambda_j$ as $n \to \infty$ with $0 < \lambda_j < 1$.
(A2) (i) The supports of $Y_1, Y_0$ are compact and assumed to be $[0, 1]$; (ii) The distribution functions $F_1$ and $F_0$ are twice continuously differentiable on $(0, 1)$ with density functions $f_1$ and $f_0$ on $(0, 1)$ satisfying $\inf_{y \in (0,1)} f_j(y) > 0$ for $j = 1, 0$; (iii) $\sup_{y \in (0,1)} f_j(y) < \infty$ for $j = 1, 0$.
Assumption (A1) is satisfied in a randomized experiment. Assumptions (A2)(i) and (ii) ensure the weak convergence of the quantile processes $\sqrt{n}\,(F_{1n}^{-1}(\cdot) - F_1^{-1}(\cdot))$ and $\sqrt{n}\,(F_{0n}^{-1}(\cdot) - F_0^{-1}(\cdot))$ in the space $\ell^\infty((0, 1))$. Without (A2)(i) and (ii), the convergence must be restricted to $\ell^\infty((a, b))$, where $[a, b]$ is a strict subset of $[0, 1]$; see e.g., van der Vaart and Wellner (1996). The common support $[0, 1]$ in Assumption (A2)(i) is made for convenience. We can replace it with the assumption that the supports of $Y_1, Y_0$ are compact. Assumption (A2)(iii), together with the other assumptions and uniform consistency of the density estimators, is used to ensure the uniform consistency of $\sigma_{nL}^2(\cdot, p)$ and $\sigma_{nU}^2(\cdot, p)$.
Theorem 2.2. Suppose (A1) and (A2) hold. Then (i)
$$Z_{nL}(\cdot, p) \Rightarrow Z_L(\cdot, p), \qquad Z_{nU}(\cdot, p) \Rightarrow Z_U(\cdot, p),$$
where the convergences are in the spaces $\ell^\infty((0, p))$ and $\ell^\infty((p, 1))$ respectively and $Z_L(\cdot, p)$ and $Z_U(\cdot, p)$ are zero mean continuous Gaussian processes with covariance functions given by
$$\mathrm{Cov}(Z_L(u, p), Z_L(v, p)) = \frac{1}{\sigma_L(u, p)\,\sigma_L(v, p)} \left[ \frac{\min\{u, v\} - uv}{\lambda_1 f_1(\phi_1(u)) f_1(\phi_1(v))} + \frac{\min\{u+q, v+q\} - (u+q)(v+q)}{\lambda_0 f_0(\phi_0(u+q)) f_0(\phi_0(v+q))} \right],$$
$$\mathrm{Cov}(Z_U(u, p), Z_U(v, p)) = \frac{1}{\sigma_U(u, p)\,\sigma_U(v, p)} \left[ \frac{\min\{u, v\} - uv}{\lambda_1 f_1(\phi_1(u)) f_1(\phi_1(v))} + \frac{\min\{u-p, v-p\} - (u-p)(v-p)}{\lambda_0 f_0(\phi_0(u-p)) f_0(\phi_0(v-p))} \right],$$
respectively; (ii) $Q^L_n(p) = Q^L(p) + o_p(1)$ and $Q^U_n(p) = Q^U(p) + o_p(1)$. Moreover, we get:
$$\sqrt{n}\,(Q^L_n(p) - Q^L(p)) \Rightarrow \sup_{u \in \mathcal{U}_{\sup,p}} [\sigma_L(u, p) Z_L(u, p)],$$
$$\sqrt{n}\,(Q^U_n(p) - Q^U(p)) \Rightarrow \inf_{u \in \mathcal{U}_{\inf,p}} [\sigma_U(u, p) Z_U(u, p)].$$
Theorem 2.2(i) implies that Condition D.1 in Chernozhukov et al. (2009) is satisfied. In the next section, we will apply their approach to constructing CIs for $Q^L(p)$ and $Q^U(p)$. Theorem 2.2(ii) implies that, in general, the asymptotic distributions of $Q^L_n(p)$ and $Q^U_n(p)$ are not normal unless $\mathcal{U}_{\sup,p}$ and $\mathcal{U}_{\inf,p}$ are singletons. We summarize this result as a corollary below.
(A3) The function $u \mapsto \psi_L(u, p)$ has a unique maximum at $u_{\sup,p}$ in the interior of $[0, p]$.
(A4) The function $u \mapsto \psi_U(u, p)$ has a unique minimum at $u_{\inf,p}$ in the interior of $[p, 1]$.
Under Assumptions (A3) and (A4), $\mathcal{U}_{\sup,p} = \{u_{\sup,p}\}$ and $\mathcal{U}_{\inf,p} = \{u_{\inf,p}\}$. In addition, by the first order conditions for the optimization problems involved in $Q^L(p)$ and $Q^U(p)$, we obtain
$$f_1(\phi_1(u_{\sup,p})) = f_0(\phi_0(u_{\sup,p} + q)) \quad \text{and} \quad f_1(\phi_1(u_{\inf,p})) = f_0(\phi_0(u_{\inf,p} - p)). \tag{4}$$
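Corollary 2.3. Suppose (A1)–(A4) hold. Then
$$\sqrt{n}\,(Q^L_n(p) - Q^L(p)) \Rightarrow N(0, \sigma_L^2(u_{\sup,p}, p)) \quad \text{and} \quad \sqrt{n}\,(Q^U_n(p) - Q^U(p)) \Rightarrow N(0, \sigma_U^2(u_{\inf,p}, p)),$$
where
$$\sigma_L^2(u_{\sup,p}, p) = \frac{\lambda_0 u_{\sup,p}(1 - u_{\sup,p}) + \lambda_1 (u_{\sup,p} + q)(p - u_{\sup,p})}{\lambda_1 \lambda_0 [f_1(\phi_1(u_{\sup,p}))]^2},$$
$$\sigma_U^2(u_{\inf,p}, p) = \frac{\lambda_0 u_{\inf,p}(1 - u_{\inf,p}) + \lambda_1 (u_{\inf,p} - p)(1 - u_{\inf,p} + p)}{\lambda_1 \lambda_0 [f_1(\phi_1(u_{\inf,p}))]^2}.$$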
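The closed forms in Corollary 2.3 follow from the general variance of Section 2.1 once the first order condition (4) is substituted. The step for the lower bound can be written out as below; this is a sketch in the notation used here, where $u \equiv u_{\sup,p}$ and $f^{*} \equiv f_1(\phi_1(u_{\sup,p}))$ are shorthands introduced only for this display. The upper bound is handled in the same way with $(u - p)(1 - u + p)$ in place of $(u + q)(p - u)$.

```latex
\begin{align*}
\sigma_L^2(u, p)
  &= \frac{u(1-u)}{\lambda_1 [f_1(\phi_1(u))]^2}
   + \frac{(u+q)(p-u)}{\lambda_0 [f_0(\phi_0(u+q))]^2} \\
  &= \frac{u(1-u)}{\lambda_1 (f^{*})^2}
   + \frac{(u+q)(p-u)}{\lambda_0 (f^{*})^2}
   && \text{by (4): } f_0(\phi_0(u+q)) = f_1(\phi_1(u)) = f^{*} \\
  &= \frac{\lambda_0\, u(1-u) + \lambda_1 (u+q)(p-u)}{\lambda_1 \lambda_0 (f^{*})^2}.
\end{align*}
```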
3. Confidence intervals
3.1. The Chernozhukov–Lee–Rosen approach
Lemma 2.1 shows that for a given quantile level $p \in (0, 1)$, the true quantile $Q_{TE}(p)$ is interval identified: $Q^L(p) \le Q_{TE}(p) \le Q^U(p)$, where $Q^L(p) = \sup_{u \in (0,p)} \psi_L(u, p)$ and $Q^U(p) = \inf_{u \in (p,1)} \psi_U(u, p)$. Given the sup (inf) structure of $Q^L(p)$ ($Q^U(p)$), statistical inference for $Q^L(p)$ ($Q^U(p)$) is closely related to recent work on testing for stochastic dominance and on inference for parameters defined by conditional moment inequalities; see e.g., Linton et al. (2010), Fan (2008), Andrews and Shi (2009), Galichon and Henry (2009) and Chernozhukov et al. (2009). Inference for the true quantile $Q_{TE}(p)$ falls within the framework of intersection bounds studied by Chernozhukov et al. (2009).
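In this subsection, we will apply the approach of Chernozhukov et al. (2009) to constructing CIs for the bounds $Q^L(p)$, $Q^U(p)$ and the true quantile $Q_{TE}(p)$ and will refer to it as the CLR approach. The CLR approach does not restrict the sets $\mathcal{U}_{\sup,p}$, $\mathcal{U}_{\inf,p}$. They can be estimated respectively by
$$\hat{\mathcal{U}}_{\sup,p} = \{u \in [0, p] : \psi_{nL}(u, p) \ge Q^L_n(p) - l_{nL}\},$$
$$\hat{\mathcal{U}}_{\inf,p} = \{u \in [p, 1] : \psi_{nU}(u, p) \le Q^U_n(p) + l_{nU}\},$$
where
$$l_{nL} = 2\sqrt{\log n / n}\, \sup_{u \in (0,p)} \sigma_{nL}(u, p), \qquad l_{nU} = 2\sqrt{\log n / n}\, \sup_{u \in (p,1)} \sigma_{nU}(u, p). \tag{5}$$
We first consider inference for the bounds. Define
$$CI^L_{CLR} = \left[ \sup_{u \in \hat{\mathcal{U}}_{\sup,p}} \left\{ \psi_{nL}(u, p) - \hat c_L(1-\alpha)\, \frac{\sigma_{nL}(u, p)}{\sqrt{n}} \right\}, +\infty \right), \tag{6}$$
$$CI^U_{CLR} = \left( -\infty, \inf_{u \in \hat{\mathcal{U}}_{\inf,p}} \left\{ \psi_{nU}(u, p) + \hat c_U(1-\alpha)\, \frac{\sigma_{nU}(u, p)}{\sqrt{n}} \right\} \right], \tag{7}$$
where $\hat c_L(1-\alpha)$ is a consistent estimator of the $(1-\alpha)$ quantile of $\sup_{u \in \mathcal{U}_{\sup,p}} Z_L(u, p)$ and $\hat c_U(1-\alpha)$ is a consistent estimator of the $(1-\alpha)$ quantile of $\sup_{u \in \mathcal{U}_{\inf,p}} [Z_U(u, p)]$.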
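How the estimated set in (5) feeds into the lower endpoint in (6) can be sketched in a few lines of Python. The sketch assumes that $\psi_{nL}(u, p)$ and $\sigma_{nL}(u, p)$ have already been evaluated on a common grid of $u$ values and that a critical value is supplied by the user. The function names and the list-based interface are hypothetical, not the authors' implementation.

```python
import math

def estimated_argmax_set(psi_vals, sigma_vals, n):
    """Indices k with psi_nL(u_k, p) >= Q^L_n(p) - l_nL, as in (5).

    psi_vals[k] and sigma_vals[k] are psi_nL and sigma_nL at grid point u_k
    (the grid itself is an assumption of this sketch)."""
    slack = 2.0 * math.sqrt(math.log(n) / n) * max(sigma_vals)  # l_nL
    q_lower = max(psi_vals)                                     # Q^L_n(p) on the grid
    return [k for k, v in enumerate(psi_vals) if v >= q_lower - slack]

def clr_lower_endpoint(psi_vals, sigma_vals, n, crit):
    """Left endpoint of the one-sided interval in (6) for a given critical value."""
    keep = estimated_argmax_set(psi_vals, sigma_vals, n)
    return max(psi_vals[k] - crit * sigma_vals[k] / math.sqrt(n) for k in keep)
```

The right endpoint in (7) is the mirror image: keep grid points whose $\psi_{nU}$ value is within $l_{nU}$ of the minimum, then take the minimum of $\psi_{nU} + \hat c_U(1-\alpha)\,\sigma_{nU}/\sqrt{n}$ over them.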
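Theorem 3.1. Suppose (A1) and (A2) hold. Then (i) $\lim \Pr(Q^U(p) \in CI^U_{CLR}) = 1 - \alpha$, provided the following condition holds:
$$\psi_U(u, p) - Q^U(p) \ge (c_U\, d(u, \mathcal{U}_{\inf,p}))^{\rho_U} \wedge \delta_U \tag{8}$$
for any $u \in [p, 1] \setminus \mathcal{U}_{\inf,p}$ for some positive constant $\delta_U$ and constants $c_U$ and $\rho_U$, where $d(u, \mathcal{U}_{\inf,p}) = \inf_{u' \in \mathcal{U}_{\inf,p}} |u - u'|$. (ii) $\lim \Pr(Q^L(p) \in CI^L_{CLR}) = 1 - \alpha$, provided the following condition holds:
$$\psi_L(u, p) - Q^L(p) \le -\left[ (c_L\, d(u, \mathcal{U}_{\sup,p}))^{\rho_L} \wedge \delta_L \right] \tag{9}$$
for any $u \in [0, p] \setminus \mathcal{U}_{\sup,p}$ for some positive constant $\delta_L$ and constants $c_L$ and $\rho_L$, where $d(u, \mathcal{U}_{\sup,p}) = \inf_{u' \in \mathcal{U}_{\sup,p}} |u - u'|$.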
The conditions (8) and (9) extend the usual identification conditions in extremum estimation to allow for non-singleton $\mathcal{U}_{\inf,p}$ and $\mathcal{U}_{\sup,p}$. In the lemma below, we provide sufficient conditions for (8) and (9) to hold. Without loss of generality, we assume $\mathcal{U}_{\inf,p} \neq [p, 1]$ and $\mathcal{U}_{\sup,p} \neq [0, p]$; otherwise the sets $[p, 1] \setminus \mathcal{U}_{\inf,p}$ and $[0, p] \setminus \mathcal{U}_{\sup,p}$ are empty and (8) and (9) hold trivially.
Lemma 3.2. Suppose (A2) holds. (i) Suppose $\mathcal{U}_{\inf,p} \neq [p, 1]$ and $\mathcal{U}_{\inf,p} = \bigcup_{j \in J_U} I_j$ for some index set $J_U \subseteq \mathbb{N}$, where $I_j = [a_j, b_j]$ for $a_j, b_j \in [p, 1]$, $a_j \le b_j$, satisfying $I_j \cap I_k = \emptyset$ whenever $j \neq k$. Further suppose $\psi_U(u, p)$ is twice continuously right, left differentiable at $p$, $1$ respectively,[1] and
$$\inf_{u_0 \in \partial\mathcal{U}_{\inf,p}} \frac{\partial^2}{\partial u^2} \psi_U(u_0, p) > 0,$$
where $\partial\mathcal{U}_{\inf,p} = \{a_j, b_j, \text{ for } j \in J_U\}$; then (8) holds. (ii) Suppose $\mathcal{U}_{\sup,p} \neq [0, p]$ and $\mathcal{U}_{\sup,p} = \bigcup_{j \in J_L} I_j$ for some index set $J_L \subseteq \mathbb{N}$, where $a_j, b_j \in [0, p]$, $a_j \le b_j$, satisfying $I_j \cap I_k = \emptyset$ whenever $j \neq k$. Further suppose $\psi_L(u, p)$ is twice continuously right, left differentiable at $0$, $p$ respectively, and
$$\sup_{u_0 \in \partial\mathcal{U}_{\sup,p}} \frac{\partial^2}{\partial u^2} \psi_L(u_0, p) < 0,$$
where $\partial\mathcal{U}_{\sup,p} = \{a_j, b_j, \text{ for } j \in J_L\}$; then (9) holds.
[1] For notational simplicity, we use the same notations for the first and second order partial derivatives of $\psi_L(u, p)$ and $\psi_U(u, p)$ at an interior point $u$ to denote the corresponding left or right partial derivatives at the appropriate boundary points.
We note that Lemma 3.2 allows $\mathcal{U}_{\inf,p}$ to be comprised of isolated points only, in which case $\partial\mathcal{U}_{\inf,p} = \mathcal{U}_{\inf,p}$, and of both isolated points and line segments.
Now we consider inference for the true quantile $Q_{TE}(p)$. Given Assumption (A2), we show below that for all $p \in (0, 1)$, $Q^L(p) < Q^U(p)$, i.e., $Q_{TE}(p)$ is not point identified. This simplifies the construction of CIs for $Q_{TE}(p)$ greatly. For any $u_{\sup,p} \in \mathcal{U}_{\sup,p}$ and $u_{\inf,p} \in \mathcal{U}_{\inf,p}$, it must be the case that $u_{\sup,p} \le u_{\inf,p}$ and the equality holds iff $u_{\sup,p} = u_{\inf,p} = p$. This with Lemma 2.1 leads to the following lemma.
Lemma 3.3. Under (A2), for any $0 < p < 1$, we have: $Q^L(p) < Q^U(p)$.
Lemma 3.3 implies that under (A2), $Q_{TE}(p)$ is not point identified. In this case, it is known from Imbens and Manski (2004) and Stoye (2009) that an asymptotically valid confidence interval for the true parameter can be constructed from one-sided CIs for each bound. Let
$$CI_{CLR} = \left[ \sup_{u \in \hat{\mathcal{U}}_{\sup,p}} \left\{ \psi_{nL}(u, p) - \hat c_L(1-\alpha)\, \frac{\sigma_{nL}(u, p)}{\sqrt{n}} \right\},\ \inf_{u \in \hat{\mathcal{U}}_{\inf,p}} \left\{ \psi_{nU}(u, p) + \hat c_U(1-\alpha)\, \frac{\sigma_{nU}(u, p)}{\sqrt{n}} \right\} \right]. \tag{10}$$
Theorem 3.4. Suppose the conditions of Theorem 3.1 hold. Then
$$\lim \inf_{Q_{TE}(p) \in [Q^L(p), Q^U(p)]} \Pr(Q_{TE}(p) \in CI_{CLR}) \ge 1 - \alpha.$$
We emphasize here that Theorem 3.4 shows only pointwise asymptotic validity of $CI_{CLR}$ in the distribution of the data, and that $CI_{CLR}$ may behave poorly in finite samples if $Q_{TE}(p)$ is nearly point identified. It would be interesting to extend $CI_{CLR}$ to a CI that is asymptotically uniformly valid in the distribution of the data. We hope to explore this possibility in future work.
The critical values $\hat c_L(1-\alpha)$ and $\hat c_U(1-\alpha)$ may be obtained either by simulation or by bootstrap. In the simulation study in Section 4, we used the following bootstrap procedure. First we compute $\psi_{nL}(u, p)$, $\psi_{nU}(u, p)$, $\sigma_{nL}(u, p)$, and $\sigma_{nU}(u, p)$ with data $\{Y_{1i}\}_{i=1}^{n_1}$ and $\{Y_{0i}\}_{i=1}^{n_0}$. For all $v_{\sup} \in \hat{\mathcal{U}}_{\sup,p}$ and all $v_{\inf} \in \hat{\mathcal{U}}_{\inf,p}$, we then compute $\psi^*_{nL}(v_{\sup}, p)$ and $\psi^*_{nU}(v_{\inf}, p)$ using naive bootstrap samples $\{Y^*_{1i}\}_{i=1}^{n_1}$ and $\{Y^*_{0i}\}_{i=1}^{n_0}$. Finally we construct bootstrap versions $Z^*_{nL}(v_{\sup}, p)$ and $Z^*_{nU}(v_{\inf}, p)$ of formulas (2) and (3):
$$Z^*_{nL}(v_{\sup}, p) = \frac{\sqrt{n}\,(\psi^*_{nL}(v_{\sup}, p) - \psi_{nL}(v_{\sup}, p))}{\sigma_{nL}(v_{\sup}, p)} \quad \text{for } v_{\sup} \in \hat{\mathcal{U}}_{\sup,p},$$
$$Z^*_{nU}(v_{\inf}, p) = \frac{\sqrt{n}\,(\psi_{nU}(v_{\inf}, p) - \psi^*_{nU}(v_{\inf}, p))}{\sigma_{nU}(v_{\inf}, p)} \quad \text{for } v_{\inf} \in \hat{\mathcal{U}}_{\inf,p}.$$
Defining the conditional distributions of $\sup_{v_{\sup} \in \hat{\mathcal{U}}_{\sup,p}} Z^*_{nL}(v_{\sup}, p)$ and $\sup_{v_{\inf} \in \hat{\mathcal{U}}_{\inf,p}} Z^*_{nU}(v_{\inf}, p)$ given the observations as $\mathcal{L}^*_L$ and $\mathcal{L}^*_U$, the bootstrap critical values are $c^*_L(1-\alpha) = (\mathcal{L}^*_L)^{-1}(1-\alpha)$ and $c^*_U(1-\alpha) = (\mathcal{L}^*_U)^{-1}(1-\alpha)$. The theorem below shows that the conclusions of Theorems 3.1 and 3.4 are valid with the bootstrap critical values.
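The bootstrap steps for the lower bound can be sketched in Python as follows. This is an illustration under stated assumptions, not the authors' code: the estimated set is represented by a finite list `u_set`, the values of $\sigma_{nL}$ on that set are supplied by the caller in `sigma` (they require a density estimate that is not computed here), and all function and parameter names are hypothetical.

```python
import math
import random

def emp_quantile(sorted_y, u):
    """Empirical quantile Y_(i) for u in ((i-1)/n, i/n]; Y_(1) at u = 0."""
    n = len(sorted_y)
    i = max(1, math.ceil(u * n - 1e-12))  # tolerance guards float round-off
    return sorted_y[min(i, n) - 1]

def psi_lower(sorted_y1, sorted_y0, u, p):
    """psi_nL(u, p) = phi_1n(u) - phi_0n(u + q), with q = 1 - p."""
    return emp_quantile(sorted_y1, u) - emp_quantile(sorted_y0, u + 1.0 - p)

def bootstrap_crit_lower(y1, y0, p, u_set, sigma, alpha=0.05, n_boot=500, seed=0):
    """(1 - alpha) quantile of sup over u_set of Z*_nL(u, p), naive bootstrap.

    sigma[k] is a user-supplied estimate of sigma_nL(u_set[k], p)."""
    rng = random.Random(seed)
    n = len(y1) + len(y0)
    s1, s0 = sorted(y1), sorted(y0)
    psi_hat = [psi_lower(s1, s0, u, p) for u in u_set]
    sup_draws = []
    for _ in range(n_boot):
        # resample each arm with replacement, independently of the other arm
        b1 = sorted(rng.choices(y1, k=len(y1)))
        b0 = sorted(rng.choices(y0, k=len(y0)))
        z_star = [math.sqrt(n) * (psi_lower(b1, b0, u, p) - ph) / s
                  for u, ph, s in zip(u_set, psi_hat, sigma)]
        sup_draws.append(max(z_star))
    sup_draws.sort()
    k = max(0, math.ceil((1.0 - alpha) * n_boot) - 1)
    return sup_draws[k]
```

The critical value for the upper bound would follow the same pattern with the sign of the centred difference reversed, as in the bootstrap version of (3).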
Theorem 3.5. Suppose the conditions of Theorem 3.1 hold. Then the conclusions in Theorems 3.1 and 3.4 hold with $\hat c_L(1-\alpha) = (\mathcal{L}^*_L)^{-1}(1-\alpha)$ and $\hat c_U(1-\alpha) = (\mathcal{L}^*_U)^{-1}(1-\alpha)$.
To implement the CIs in (6), (7) and (10), we need to choose uniformly consistent estimators of $f_1$, $f_0$. In the next subsection, we show that if (A3) and (A4) hold in addition to (A1) and (A2), then we are able to construct asymptotically valid CIs without the need to estimate $f_1$, $f_0$.
3.2. The order statistic approach
Corollary 2.3 implies that when (A1)–(A4) hold, $Q^L_n(p)$ and $Q^U_n(p)$ are asymptotically normally distributed. In this case, we can apply the standard approach to constructing CIs for each bound. For example, let $z_\alpha$ denote the $\alpha$ quantile of the standard normal distribution; then the following lemma holds. For ease of exposition, we define
$$\hat u_{\sup,p} = \inf\left\{ \arg\sup_{u \in (0,p)} \psi_{nL}(u, p) \right\} \quad \text{and} \quad \hat u_{\inf,p} = \inf\left\{ \arg\inf_{u \in (p,1)} \psi_{nU}(u, p) \right\}.$$
Then we have
$$Q^L_n(p) = F_{1n}^{-1}(\hat u_{\sup,p}) - F_{0n}^{-1}(\hat u_{\sup,p} + q),$$
$$Q^U_n(p) = F_{1n}^{-1}(\hat u_{\inf,p}) - F_{0n}^{-1}(\hat u_{\inf,p} - p).$$
Lemma 3.6. (i) Suppose (A1)–(A3) hold. Then,
$$\lim \Pr\left( Q^L_n(p) - z_{1-\alpha}\, \frac{\sigma_{nL}(\hat u_{\sup,p}, p)}{\sqrt{n}} \le Q^L(p) \right) = 1 - \alpha.$$
(ii) Suppose (A1), (A2), and (A4) hold. Then,
$$\lim \Pr\left( Q^U(p) \le Q^U_n(p) + z_{1-\alpha}\, \frac{\sigma_{nU}(\hat u_{\inf,p}, p)}{\sqrt{n}} \right) = 1 - \alpha.$$
The CIs provided in Lemma 3.6 for the bounds require a consistent estimator of the density function $f_1$. Although a kernel density estimator of $f_1$ may be used, the choice of the bandwidth in kernel estimation is troublesome. To avoid the estimation of $f_1$, we propose a new approach, which extends the well-known CIs for univariate quantiles based on order statistics (see e.g., van der Vaart, 1998) to our case. We call this approach the Order Statistic Approach (OSA). There are two issues we have to address in applying this approach to our case. First, the sharp bounds are given by differences between two univariate quantiles instead of a single quantile; see Lemma 2.1. Second, the quantile levels involved are unknown, as $u_{\sup,p}$ and $u_{\inf,p}$ are unknown. Below, we explain how we handle both problems.
Assuming $u_{\sup,p}$ and $u_{\inf,p}$ are known, the following lemma presents CIs for $Q^L(p)$ and $Q^U(p)$ using order statistics of the observed outcomes. Let $[k]$ be the largest integer that does not exceed $k$.
Lemma 3.7. (i) Suppose (A1)–(A3) hold. Define $u_{1nL}$ and $u_{0nL}$ as in Box I. Then
$$\lim \Pr\left( Y_{1([n_1 u_{1nL}])} - Y_{0([n_0 u_{0nL}])} \le Q^L(p) \right) = 1 - \alpha.$$
(ii) Suppose (A1), (A2), and (A4) hold. Define $u_{1nU}$ and $u_{0nU}$ as in Box II. Then
$$\lim \Pr\left( Q^U(p) \le Y_{1([n_1 u_{1nU}])} - Y_{0([n_0 u_{0nU}])} \right) = 1 - \alpha.$$
Although $Q^L(p)$ and $Q^U(p)$ are defined by differences between two univariate quantiles, under (A3) and (A4), their quantile levels are related by the first order conditions for the corresponding optimization problems; see (4), which is why we are able to construct CIs in Lemma 3.7.
In the next result, we show that estimating $u_{\sup,p}$ and $u_{\inf,p}$ in the CIs in Lemma 3.7 by $\hat u_{\sup,p}$ and $\hat u_{\inf,p}$ does not affect their validity. Let
$$CI^L_{OSA} = \left[ Y_{1([n_1 \hat u_{1nL}])} - Y_{0([n_0 \hat u_{0nL}])},\ +\infty \right),$$
$$CI^U_{OSA} = \left( -\infty,\ Y_{1([n_1 \hat u_{1nU}])} - Y_{0([n_0 \hat u_{0nU}])} \right],$$
where $\hat u_{1nL}$, $\hat u_{0nL}$, $\hat u_{1nU}$ and $\hat u_{0nU}$ are given in Box III.
Theorem 3.8. (i) Suppose (A1)–(A3) hold. Then $\lim \Pr(Q^L(p) \in CI^L_{OSA}) = 1 - \alpha$.
(ii) Suppose (A1), (A2), and (A4) hold. Then $\lim \Pr(Q^U(p) \in CI^U_{OSA}) = 1 - \alpha$.
In contrast to the CIs in Lemma 3.6 or $CI^L_{CLR}$ and $CI^U_{CLR}$, the CIs in Theorem 3.8 do not require estimating $f_1$ and are completely data-driven. We can also extend the order statistic approach to $Q_{TE}(p)$. Let
$$CI_{OSA} = \left[ Y_{1([n_1 \hat u_{1nL}])} - Y_{0([n_0 \hat u_{0nL}])},\ Y_{1([n_1 \hat u_{1nU}])} - Y_{0([n_0 \hat u_{0nU}])} \right]. \tag{11}$$
Theorem 3.9. Suppose (A1)–(A4) hold. Then
$$\lim \inf_{Q_{TE}(p) \in [Q^L(p), Q^U(p)]} \Pr\{Q_{TE}(p) \in CI_{OSA}\} = 1 - \alpha.$$
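The interval in (11) needs only sorted data and the adjusted quantile levels of Box III, which the Python sketch below puts together. Several choices are assumptions of the sketch and not statements from the paper: the maximizer and minimizer are searched on a finite interior grid, order-statistic indices that fall outside $1, \ldots, n_j$ are clamped to the nearest valid index, and the names `order_stat`, `osa_interval` and `grid_size` are hypothetical.

```python
import math
from statistics import NormalDist

def emp_quantile(sorted_y, u):
    """Empirical quantile Y_(i) for u in ((i-1)/n, i/n]; Y_(1) at u = 0."""
    n = len(sorted_y)
    i = max(1, math.ceil(u * n - 1e-12))
    return sorted_y[min(i, n) - 1]

def order_stat(sorted_y, level):
    """Y_([n * level]) with [.] the integer part; index clamped to 1..n (assumption)."""
    n = len(sorted_y)
    k = min(max(math.floor(n * level), 1), n)
    return sorted_y[k - 1]

def osa_interval(y1, y0, p, alpha=0.05, grid_size=400):
    """Order-statistic interval in the spirit of (11), using Box III levels."""
    s1, s0 = sorted(y1), sorted(y0)
    n1, n0 = len(s1), len(s0)
    lam1, lam0 = n1 / (n1 + n0), n0 / (n1 + n0)
    q = 1.0 - p
    z = NormalDist().inv_cdf(1.0 - alpha)
    lower_grid = [p * (k + 0.5) / grid_size for k in range(grid_size)]
    upper_grid = [p + q * (k + 0.5) / grid_size for k in range(grid_size)]
    # max/min return the first extremiser, i.e. the smallest maximiser/minimiser on the grid
    u_sup = max(lower_grid, key=lambda u: emp_quantile(s1, u) - emp_quantile(s0, u + q))
    u_inf = min(upper_grid, key=lambda u: emp_quantile(s1, u) - emp_quantile(s0, u - p))
    # lower endpoint: shift the treated level down and the control level up
    a_l = lam0 * u_sup * (1.0 - u_sup)
    b_l = lam1 * (u_sup + q) * (p - u_sup)
    u1_l = u_sup - z * a_l / math.sqrt(n1 * lam0 * (a_l + b_l))
    u0_l = (u_sup + q) + z * b_l / math.sqrt(n0 * lam1 * (a_l + b_l))
    # upper endpoint: shift the treated level up and the control level down
    a_u = lam0 * u_inf * (1.0 - u_inf)
    b_u = lam1 * (u_inf - p) * (1.0 - u_inf + p)
    u1_u = u_inf + z * a_u / math.sqrt(n1 * lam0 * (a_u + b_u))
    u0_u = (u_inf - p) - z * b_u / math.sqrt(n0 * lam1 * (a_u + b_u))
    lower = order_stat(s1, u1_l) - order_stat(s0, u0_l)
    upper = order_stat(s1, u1_u) - order_stat(s0, u0_u)
    return lower, upper
```

No density estimate or bandwidth enters the computation, which is the practical appeal of the approach when (A3) and (A4) are plausible.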
4. Monte Carlo simulations
4.1. Simulation design
This section presents results from a small simulation study on the finite sample performance of the CIs for each bound and for the true quantiles developed in Section 3. In this study, we used two sets of DGPs: the truncated mixture of normal distributions (DGP-MN) and the truncated beta distributions (DGP-B). The graphs of $f_1$ and $f_0$ for DGP-MNs are presented in Fig. B.1 in Appendix B. For DGP-MN, we chose three values for $p$: 0.35, 0.67, and 0.85. As can be seen from Fig. B.2, at $p = 0.35$, the identification interval $[Q^L(p), Q^U(p)] \approx [-0.6374, 0.4666]$ is the widest for all $p$ (or very close to the maximum width) and both $Q^L(p)$ and $Q^U(p)$ are relatively unsmooth or jumpy around $p = 0.35$; $Q_{TE}(0.67)$ has the narrowest, or at least very close to the narrowest, identification interval, $[Q^L(p), Q^U(p)] \approx [0.6429, 0.8527]$, and both $Q^L(p)$ and $Q^U(p)$ are very smooth around $p = 0.67$; at $p = 0.85$, $Q^L(p)$ and $Q^U(p)$ are smooth. Figs. B.3–B.5 show that the DGP-MNs, satisfying (A2)–(A4) for the selected $p$'s, have $\mathcal{U}_{\inf,p} = \{u_{\inf,p}\}$ and $\mathcal{U}_{\sup,p} = \{u_{\sup,p}\}$; hence the conditions in Lemma 3.2 hold. Therefore both the CLR and OSA approaches are applicable to DGP-MNs.
For DGP-B, we used $p = 0.5$. Both $Q^L(p)$ and $Q^U(p)$ are smooth around $p = 0.5$; see Fig. B.6. Fig. B.7 shows that the DGP-B satisfies only (A2) for the selected $p$. The conditions in Lemma 3.2 are apparently satisfied for this DGP; hence only the CLR approach is applicable.
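$$u_{1nL} = u_{\sup,p} - z_{1-\alpha}\, \frac{\lambda_0 u_{\sup,p}(1 - u_{\sup,p})}{\sqrt{n_1 \lambda_0 \left[ \lambda_0 u_{\sup,p}(1 - u_{\sup,p}) + \lambda_1 (u_{\sup,p} + q)(p - u_{\sup,p}) \right]}},$$
$$u_{0nL} = (u_{\sup,p} + q) + z_{1-\alpha}\, \frac{\lambda_1 (u_{\sup,p} + q)(p - u_{\sup,p})}{\sqrt{n_0 \lambda_1 \left[ \lambda_0 u_{\sup,p}(1 - u_{\sup,p}) + \lambda_1 (u_{\sup,p} + q)(p - u_{\sup,p}) \right]}}.$$
Box I.
$$u_{1nU} = u_{\inf,p} + z_{1-\alpha}\, \frac{\lambda_0 u_{\inf,p}(1 - u_{\inf,p})}{\sqrt{n_1 \lambda_0 \left[ \lambda_0 u_{\inf,p}(1 - u_{\inf,p}) + \lambda_1 (u_{\inf,p} - p)(1 - u_{\inf,p} + p) \right]}},$$
$$u_{0nU} = (u_{\inf,p} - p) - z_{1-\alpha}\, \frac{\lambda_1 (u_{\inf,p} - p)(1 - u_{\inf,p} + p)}{\sqrt{n_0 \lambda_1 \left[ \lambda_0 u_{\inf,p}(1 - u_{\inf,p}) + \lambda_1 (u_{\inf,p} - p)(1 - u_{\inf,p} + p) \right]}}.$$
Box II.
$$\hat u_{1nL} = \hat u_{\sup,p} - z_{1-\alpha}\, \frac{\lambda_{0n} \hat u_{\sup,p}(1 - \hat u_{\sup,p})}{\sqrt{n_1 \lambda_{0n} \left[ \lambda_{0n} \hat u_{\sup,p}(1 - \hat u_{\sup,p}) + \lambda_{1n} (\hat u_{\sup,p} + q)(p - \hat u_{\sup,p}) \right]}},$$
$$\hat u_{0nL} = (\hat u_{\sup,p} + q) + z_{1-\alpha}\, \frac{\lambda_{1n} (\hat u_{\sup,p} + q)(p - \hat u_{\sup,p})}{\sqrt{n_0 \lambda_{1n} \left[ \lambda_{0n} \hat u_{\sup,p}(1 - \hat u_{\sup,p}) + \lambda_{1n} (\hat u_{\sup,p} + q)(p - \hat u_{\sup,p}) \right]}},$$
$$\hat u_{1nU} = \hat u_{\inf,p} + z_{1-\alpha}\, \frac{\lambda_{0n} \hat u_{\inf,p}(1 - \hat u_{\inf,p})}{\sqrt{n_1 \lambda_{0n} \left[ \lambda_{0n} \hat u_{\inf,p}(1 - \hat u_{\inf,p}) + \lambda_{1n} (\hat u_{\inf,p} - p)(1 - \hat u_{\inf,p} + p) \right]}},$$
$$\hat u_{0nU} = (\hat u_{\inf,p} - p) - z_{1-\alpha}\, \frac{\lambda_{1n} (\hat u_{\inf,p} - p)(1 - \hat u_{\inf,p} + p)}{\sqrt{n_0 \lambda_{1n} \left[ \lambda_{0n} \hat u_{\inf,p}(1 - \hat u_{\inf,p}) + \lambda_{1n} (\hat u_{\inf,p} - p)(1 - \hat u_{\inf,p} + p) \right]}}.$$
Box III.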
The sample sizes are $n_1 = n_0 = 100$, $n_1 = n_0 = 200$, and $n_1 = n_0 = 400$.[2] The nominal coverage level is $1 - \alpha = 0.95$. The number of replications is 50,000. To implement $CI^L_{CLR}$, $CI^U_{CLR}$, and $CI_{CLR}$, we need estimators of $f_1$ and $f_0$. In the simulation, we used kernel density estimators with Gaussian kernel and a rule-of-thumb for the bandwidth selection: $h_j = c_h \hat\sigma_j \left( \ln(n_j)/n_j \right)^{2/9}$, where $c_h = 1.06$ and $1.06/4$. The critical values in $CI^L_{CLR}$, $CI^U_{CLR}$, and $CI_{CLR}$ are obtained via the bootstrap procedure described in Section 3.1. The number of bootstrap replications is 50,000.
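The density estimator used for the CLR intervals can be sketched in Python as below. The bandwidth formula in the text lost a symbol in extraction; the sketch assumes the scale factor is the sample standard deviation of arm $j$ and that the exponent is $2/9$, so both should be treated as assumptions. The function names are hypothetical.

```python
import math
from statistics import stdev

def rule_of_thumb_bandwidth(sample, c_h=1.06):
    """h = c_h * s * (ln(n) / n) ** (2 / 9); s = sample std dev (assumed scale factor)."""
    n = len(sample)
    return c_h * stdev(sample) * (math.log(n) / n) ** (2.0 / 9.0)

def gaussian_kde(sample, h):
    """Return the Gaussian kernel density estimate y -> f_hat(y) with bandwidth h."""
    n = len(sample)
    norm = 1.0 / (n * h * math.sqrt(2.0 * math.pi))
    def f_hat(y):
        return norm * sum(math.exp(-0.5 * ((y - x) / h) ** 2) for x in sample)
    return f_hat
```

Evaluating `f_hat` at the empirical quantiles $\phi_{1n}(u)$ and $\phi_{0n}(u + q)$ gives the ingredients of $\sigma_{nL}^2(u, p)$; running the procedure with both values of $c_h$ shows how sensitive the resulting intervals are to the bandwidth.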
4.2. Results
Table 1 presents the true $Q^L(p)$ and $Q^U(p)$, averages of $Q^L_n(p)$, $Q^U_n(p)$, CIs for the lower bounds denoted as $CI^L_{CLR}$ and $CI^L_{OSA}$, and CIs for the upper bounds denoted as $CI^U_{CLR}$ and $CI^U_{OSA}$ in various settings.
Several conclusions emerge from Table 1. First, $Q^L_n(0.35)$ in DGP-MN and $Q^L_n(0.5)$ in DGP-B seem upward biased and $Q^U_n(0.35)$ in DGP-MN and $Q^U_n(0.5)$ in DGP-B downward biased. Second, the average widths of CIs using the CLR approach depend on the bandwidth used in the kernel estimation or the value of $c_h$. Regardless of the DGPs considered, $c_h = 1.06$ generated CIs with narrower lengths on average than $c_h = 1.06/4$. For DGP-MN where the order statistic approach is applicable, we see that the order statistic approach generated CIs with the shortest lengths on average for $p = 0.35$ and $p = 0.67$, but the longest average lengths for $p = 0.85$.
[2] We also tried different values for $n_1$ and $n_0$ and obtained qualitatively similar results.
Table 2 presents the coverage rates for both $Q^L(p)$ and $Q^U(p)$ in each setting, where the underlined is the closest to the nominal level in each setting. We note that the value of $c_h$ affects the coverage rates of $CI^L_{CLR}$ in small samples such as $n = 100$; see e.g., the coverage rates for $Q^L(0.35)$ in DGP-MN.[3] As the sample size increases, the difference disappears. In most cases, the coverage rates are larger than the nominal level. For DGP-MN where the order statistic approach is applicable, we observe that the coverage rates of $CI^L_{OSA}$ and $CI^U_{OSA}$ are reasonably close to the nominal level.
Table 3 shows the minimum coverage rates of various CIs for the true quantile in which the underlined is the closest to the nominal level in each setting. Similar to coverage rates of CIs for each bound, CIs for the true quantile overcovered with an exception of $CI_{OSA}$ for $Q_{TE}(0.67)$, in which case the coverage rates are close to the nominal level even for sample sizes as small as 100. Given that the CIs based on the order statistic approach are simple to implement and are free from the choice of any parameter/bandwidth, we would recommend their use whenever applicable. As one referee suggests, Assumptions (A3) and (A4) that the order statistic approach relies on are testable assumptions. Although developing such formal tests is beyond the scope of this paper, a heuristic approach would be to plot $\psi_{nL}(u, p)$ ($\psi_{nU}(u, p)$) for $u \in [0, p]$ ($u \in [p, 1]$) to see if it has a unique interior maximum (minimum).
[3] We note that when (A3) and (A4) hold, CIs in Lemma 3.6 are asymptotically valid and their small sample performances depend on $c_h$ as well.
Table 1
Averages of $Q^L_n(p)$, $Q^U_n(p)$, and CIs.

                               DGP-MN                          DGP-B
                               p = 0.35   p = 0.67   p = 0.85   p = 0.5
True Q^L(p)                    -0.6374    0.6429     0.8252     -0.4932
Sample sizes: 100
  Q^L_n(p)                     -0.4191    0.6503     0.8301     -0.4589
  CI^L_CLR (c_h = 1.06)        -1.0652    0.0682     0.3582     -0.6577
  CI^L_CLR (c_h = 1.06/4)      -1.2193    0.0346     0.3148     -0.6892
  CI^L_OSA                     -1.0388    0.2355     0.4595     -0.5884
Sample sizes: 200
  Q^L_n(p)                     -0.4771    0.6474     0.8288     -0.4654
  CI^L_CLR (c_h = 1.06)        -1.0808    0.2108     0.4702     -0.6099
  CI^L_CLR (c_h = 1.06/4)      -1.1458    0.1997     0.4458     -0.6249
  CI^L_OSA                     -1.0705    0.3358     0.5124     -0.5550
Sample sizes: 400
  Q^L_n(p)                     -0.5351    0.6459     0.8289     -0.4654
  CI^L_CLR (c_h = 1.06)        -1.0835    0.2728     0.5336     -0.5765
  CI^L_CLR (c_h = 1.06/4)      -1.0416    0.3150     0.5540     -0.5816
  CI^L_OSA                     -1.0678    0.4401     0.5659     -0.5370

True Q^U(p)                    0.4666     0.8527     1.0697     0.4932
Sample sizes: 100
  Q^U_n(p)                     0.3452     0.8565     1.0711     0.4520
  CI^U_CLR (c_h = 1.06)        1.2877     1.3070     1.3981     0.6396
  CI^U_CLR (c_h = 1.06/4)      1.2648     1.3507     1.3793     0.6673
  CI^U_OSA                     1.1456     1.2873     1.7624     0.5795
Sample sizes: 200
  Q^U_n(p)                     0.4063     0.8521     1.0659     0.4654
  CI^U_CLR (c_h = 1.06)        1.1910     1.2229     1.3685     0.6086
  CI^U_CLR (c_h = 1.06/4)      1.2462     1.3204     1.3734     0.6244
  CI^U_OSA                     1.0428     1.1949     1.6434     0.5508
Sample sizes: 400
  Q^U_n(p)                     0.4436     0.8506     1.0668     0.4742
  CI^U_CLR (c_h = 1.06)        1.0146     1.1193     1.3114     0.5760
  CI^U_CLR (c_h = 1.06/4)      1.1338     1.2408     1.3569     0.5813
  CI^U_OSA                     0.9193     1.1010     1.5242     0.5367
Table 2
Coverage rates for each bound.

Sample sizes   Method ($c_h$)           DGP-MN                            DGP-B
                                        p = 0.35   p = 0.67   p = 0.85    p = 0.5
100            $CI^L_{CLR}$ (1.06)      0.8536     0.9996     0.9992      0.9892
               $CI^L_{CLR}$ (1.06/4)    0.9642     0.9948     0.9983      0.9918
               $CI^L_{OSA}$             0.9616     0.9661     0.9689      -
200            $CI^L_{CLR}$ (1.06)      0.9279     1.0        0.9999      0.9914
               $CI^L_{CLR}$ (1.06/4)    0.9912     0.9997     0.9998      0.9926
               $CI^L_{OSA}$             0.9820     0.9706     0.9781      -
400            $CI^L_{CLR}$ (1.06)      0.9783     1.0        1.0         0.9919
               $CI^L_{CLR}$ (1.06/4)    0.9980     1.0        1.0         0.9914
               $CI^L_{OSA}$             0.9954     0.9701     0.9779      -
100            $CI^U_{CLR}$ (1.06)      0.9786     1.0        0.9999      0.9845
               $CI^U_{CLR}$ (1.06/4)    0.9747     0.9997     0.9990      0.9885
               $CI^U_{OSA}$             0.9914     0.9801     0.9941      -
200            $CI^U_{CLR}$ (1.06)      0.9892     1.0        1.0         0.9921
               $CI^U_{CLR}$ (1.06/4)    0.9916     1.0        1.0         0.9932
               $CI^U_{OSA}$             0.9985     0.9848     0.9955      -
400            $CI^U_{CLR}$ (1.06)      0.9976     1.0        1.0         0.9919
               $CI^U_{CLR}$ (1.06/4)    0.9990     1.0        1.0         0.9919
               $CI^U_{OSA}$             0.9987     0.9763     0.9931      -
5. Conclusion
This paper is the first to develop inference procedures for the quantile $Q_{TE}(p)$ of the effect of a binary treatment defined as the difference between the two potential outcomes. Specifically, for randomized experiments, (i) we proposed nonparametric estimators of sharp bounds on the quantile of treatment effects $Q_{TE}(p)$ and established their asymptotic properties under general conditions; (ii) we constructed CIs for the bounds and the true quantile by using the approach in Chernozhukov et al. (2009); (iii) under additional conditions that guarantee the asymptotic normality of nonparametric estimators of the sharp bounds, we developed the order statistic approach to constructing CIs for the bounds and the true quantile. Although CIs based on the order statistic approach rely on additional conditions, they are easy to implement and are completely data-driven. A simulation study is conducted to investigate the finite sample performance of both approaches.
Table 3
Coverage rates for $Q_{TE}(p)$.

Sample sizes   Method ($c_h$)         MN                                B
                                      p = 0.35   p = 0.67   p = 0.85    p = 0.5
100            $CI_{CLR}$ (1.06)      0.8332     0.9996     0.9992      0.9738
               $CI_{CLR}$ (1.06/4)    0.9389     0.9945     0.9973      0.9803
               $CI_{OSA}$             0.9530     0.9462     0.9631      -
200            $CI_{CLR}$ (1.06)      0.9171     1.0        0.9999      0.9835
               $CI_{CLR}$ (1.06/4)    0.9828     0.9997     0.9998      0.9860
               $CI_{OSA}$             0.9805     0.9554     0.9736      -
400            $CI_{CLR}$ (1.06)      0.9758     1.0        1.0         0.9840
               $CI_{CLR}$ (1.06/4)    0.9970     1.0        1.0         0.9834
               $CI_{OSA}$             0.9941     0.9465     0.9710      -
Much work remains to be done. In terms of the sharp bounds, those in this paper are the worst-case bounds in the sense that they do not make use of any prior information on the possible dependence between the potential outcomes. When such information is available, these bounds can be tightened. The focus on randomized experiments in this paper allows the identification of the marginal distributions. In cases where the marginal distributions themselves are not identifiable but bounds on them can be placed (see, e.g., Manski, 1994, 2003; Manski and Pepper, 2000; Shaikh and Vytlacil, 2010; Blundell et al., 2007; and Honoré and Lleras-Muney, 2006), we can also place bounds on the quantile function of treatment effects.
Acknowledgment
The second author is very grateful for the support of the University of North Carolina, where part of this research was done.
Appendix A. Technical proofs
Proof of Theorem 2.2. (i) We only provide a proof for Z
nL
(, p);
that for Z
nU
(, p) is similar and thus omitted. Define q = 1 p.
First, we note:
n (
nL
(u, p)
L
(u, p))
=

n
_
F
1
1n
(u) F
1
1
(u)
_
n
_
F
1
0n
(u +q) F
1
0
(u +q)
_
.
Under (A1) and (A2)(i),(ii), it follows from the weak convergence of the empirical quantile process (see, e.g., van der Vaart and Wellner, 1996) that
n
_
F
1
1n
() F
1
1
()
_
G
1
() on l
(0, p) ,
n
_
F
1
0n
( +q) F
1
0
( +q)
_
G
0
() on l
(0, p) ,
where G
1
() and G
0
() are zero mean Gaussian processes with
respective covariance functions given by
Cov (G
1
(u) , G
1
(v)) =
min {u, v} uv
1
f
1
_
F
1
1
(u)
_
f
1
_
F
1
1
(v)
_ ,
Cov (G
0
(u) , G
0
(v)) =
min {(u +q) , (v +q)} (u +q) (v +q)
0
f
0
_
F
1
0
(u +q)
_
f
0
_
F
1
0
(v +q)
_ .
Moreover, G
1
() and G
0
() are independent of each other. Thus, we
get:
n (
nL
(, p)
L
(, p)) G
1
() G
0
() on l
(0, p) .
Under (A1) and (A2), $F_{1n}^{-1}(u)$ and $F_{0n}^{-1}(u + q)$ are uniformly consistent estimators of $F_1^{-1}(u)$ and $F_0^{-1}(u + q)$ respectively for $u \in (0, p)$, see p. 176 of Csörgő and Révész (1981). Given the uniform consistency of $\hat{f}_1$ and $\hat{f}_0$, we obtain the uniform consistency of $\sigma^2_{nL}(u, p)$ for $u \in (0, p)$. Now since $\sigma^2_L(u, p)$ is bounded uniformly in $u \in (0, p)$ above and away from zero, by Slutsky's theorem, we conclude:
n (
nL
(, p)
L
(, p))
nL
(, p)
G
1
() G
0
()
L
(, p)
on l
(0, p) .
(ii) The consistency of $Q^L_n(p)$ follows from the uniform consistency of $F_{1n}^{-1}(u)$ and $F_{0n}^{-1}(u + q)$ for $u \in (0, p)$ and the continuous mapping theorem. The asymptotic distribution of $\sqrt{n}\left(Q^L_n(p) - Q^L(p)\right)$ follows from (i), the uniform consistency of $\sigma^2_{nL}(u, p)$ for $u \in (0, p)$, and the continuous mapping theorem.
Proof of Corollary 2.3. This follows from Theorem 2.2 and (4).
Proof of Theorem 3.1. We provide a proof of (i) for the upper bound $Q^U(p)$ only. The proof of (ii) for the lower bound is similar and thus omitted. (i) It follows from Theorem 2.2 that Condition D.1 in Chernozhukov et al. (2009) is satisfied for the process $Z_{nU}(\cdot, p)$. Theorem 2 in Chernozhukov et al. (2009) implies that $\hat{U}_{\inf,p}$ satisfies the requirement of Condition C.2. This and Lemma 1 in Chernozhukov et al. (2009) imply the conditions for part 2 of Theorem 1 in Chernozhukov et al. (2009) and thus
limP
_
Q
U
(p) CI
U
CLR
_
= 1 .
Proof of Lemma 3.2. We prove (i) only. This proof is straightfor-
wardly applicable to (ii) since sup
L
(u, p) = inf (
L
(u, p)).
For any u
0
U
inf,p
, define
B
(u
0
) = [p, 1] \ U
inf,p

_
u
0

u
0
, u
0
+
u
0
_
,
where
u
0
is small enough such that for any u B
(u
0
) , d
_
u,
U
inf,p
_
= d (u, u
0
). Since d
_
u, U
inf,p
_
= d
_
u, U
inf,p
_
for such u,
we keep the notation d
_
u, U
inf,p
_
whenever relevant. Let
= inf
u
0
U
inf,p
2
u
2
U
(u
0
, p)
> 0.
For any u
0
U
inf,p
, we have one of the followings: (i) u
0
[p, 1]
and

u
U
(u
0
, p) = 0 and

2
u
2
U
(u
0
, p) > 0; (ii) u
0
= p and
U
(u
0
, p) > 0 and

2
u
2
U
(u
0
, p) > 0; (iii) u
0
= 1 and
U
(u
0
, p) < 0 and

2
u
2
U
(u
0
, p) > 0; (iv) u
0
= p and
U
(u
0
, p) > 0 and

2
u
2
U
(u
0
, p) < < 0; (v) u
0
= 1 and
U
(u
0
, p) < 0 and

2
u
2
U
(u
0
, p) < 0.
We choose
u
0
in the following way:
(Case 1) For any u
0
as in (i), (ii), or (iii), there exists
u
0
> 0 such
that

2
u
2
U
(u, p) > /2 for all u B
(u
0
) [p, 1] \ U
inf,p

_
u
0

u
0
, u
0
+
u
0
_
;
(Case 2) For any u
0
as in (iv) or (v), there exists
u
0
> 0 such that

2
u
2
U
(u, p)
> /2 and
U
(u, p) is monotonic (i.e.,

u
U
(u, p)
0 or

u
U
(u, p) 0) for all u B
(u
0
) [p, 1] \ U
inf,p

_
u
0

u
0
, u
0
+
u
0
_
.
Let B
(u
0
) = B
(u
0
) B
(u
0
) and B
_
U
inf,p
_
=
u
0
U
inf,p
B
(u
0
). Clearly B
_
U
inf,p
_
is open relative to [p, 1] \ U
inf,p
. Define
U
min
u[p,1]\U
inf,p
\B
(U
inf,p)
_
U
(u, p) Q
U
(p)
_
.
Since [p, 1] \U
inf,p
\B
_
U
inf,p
_
is closed (hence compact) relative
to [p, 1]\U
inf,p
and
U
(u, p) is continuous,
U
exists. Further, since
U
(u, p) Q
U
(p) > 0 for all u [p, 1] \ U
inf,p
\ B
_
U
inf,p
_
, we
obtain:
U
> 0.
Now we show there exist
U
and c
U
such that
U
>
U
(u, p)
Q
U
(p)
_
c
U
d
_
u, U
inf,p
__
U
for all u
_
u
:
U
_
u
, p
_
Q
U
(p)
(0,
U
)
_
, which will complete the proof. Since, by the con-
struction of B
(u
0
) and
U
, there is a unique u
0
U
inf,p
such
that d
_
u, U
inf,p
_
= d
_
u, U
inf,p
_
= d (u, u
0
) for a u
_
u
U
_
u
, p
_
Q
U
(p) (0,
U
)
_
, we consider u
0
such that
d
_
u, U
inf,p
_
= d
_
u, U
inf,p
_
= d (u, u
0
) for a given u
_
u
U
_
u
, p
_
Q
U
(p) (0,
U
)
_
in what follows.
(Case 1) By the Theorem 6 in Andrews (1999),
U
(u, p) Q
U
(p) =

u
U
(u
0
, p) (u u
0
)
+
1
2
2
u
2
U
(u
, p) [u u
0
]
2
where u
lies between u
0
and u and

2
u
2
U
(u
, p) > /2. Since
U
(u
0
, p) (u u
0
) 0 in this case, we have
U
(u, p) Q
U
(p) =

u
U
(u
0
, p) (u u
0
)
+
1
2
2
u
2
U
(u
, p) [u u
0
]
2
1
2
2
u
2
U
(u
, p) [u u
0
]
2
>
4
_
d
_
u, U
inf,p
__
2
.
(Case 2) Again, by the Theorem 6 in Andrews (1999), we have
U
(u, p) Q
U
(p) =

u
U
(u
0
, p) (u u
0
)
+
1
2
2
u
2
U
(u
, p) [u u
0
]
2
where u
lies between u
0
and u and

2
u
2
U
(u
, p) < /2. Since
U
(u
0
, p) (u u
0
) > 0 in this case, we have
U
(u, p) Q
U
(p)
=
U
(u
0
, p)
|u u
0
|
1
2
2
u
2
U
(u
, p)
[|u u
0
|]
2
=
1
2
2
u
2
U
(u
, p)
_
_
U
(u
0
, p)
1
2
2
u
2
U
(u
, p)
|u u
0
|
_
_
|u u
0
| .
For
U
(u, p) Q
U
(p) to be positive, either
U
(u
0
,p)
1
2
2
u
2

U
(u
,p)
>
|u u
0
| or |u u
0
| > 0 must hold. Also by construction,
U
(u, p)
Q
U
(p) is monotonic. There are two cases to consider.
(Case 2-1) When u
0
= p, the following inequality holds:
1
2
2
u
2
U
(u
, p)
_
_
U
(u
0
, p)
1
2
2
u
2
U
(u
, p)
(u p)
_
_
(u p) 0
which implies: u
_
p, p +
U
(1,p)

2
u
2

U
(u
,p)
_
and, for such u, we get:
U
(u, p) Q
U
(p)
1
2
2
u
2
U
(u
, p)
[u p]
2
=
2
u
2
U
(u
, p)
_
_
p +
U
(p, p)

2
u
2
U
(u
, p)
u
_
_
(u p) 0;
(Case 2-2) When u
0
= 1, the following inequality holds:
1
2
2
u
2
U
(u
, p)
_
_
U
(u
0
, p)
1
2
2
u
2
U
(u
, p)
(1 u)
_
_
(1 u) 0
which implies: u
_
1
U
(1,p)

2
u
2

U
(u
,p)
, 1
_
and, for such u, we ob-
tain:
U
(u, p) Q
U
(p)
1
2
2
u
2
U
(u
, p)
[1 u]
2
=
2
u
2
U
(u
, p)
_
_
u 1 +
U
(1, p)

2
u
2
U
(u
, p)
_
_
(1 u) 0.
Therefore, it must be true that
U
(u, p) Q
U
(p)
1
2
2
u
2
U
(u
, p)
[u u
0
]
2

4
_
d
_
u, U
inf,p
__
2
.
Summing up, we have
U
= 2 and c
U
=

/2.
Proof of Lemma 3.3. We prove this result by contradiction. Suppose there exists a $p \in (0, 1)$ such that $Q^L(p) = Q^U(p)$. Then there exist $u_{\sup,p} \in U_{\sup,p} \subseteq [0, p]$ and $u_{\inf,p} \in U_{\inf,p} \subseteq [p, 1]$ such that
$$F_1^{-1}(u_{\sup,p}) - F_0^{-1}(1 + u_{\sup,p} - p) = F_1^{-1}(u_{\inf,p}) - F_0^{-1}(u_{\inf,p} - p).$$
Re-arranging this, we get
$$F_1^{-1}(u_{\sup,p}) - F_1^{-1}(u_{\inf,p}) = F_0^{-1}(1 + u_{\sup,p} - p) - F_0^{-1}(u_{\inf,p} - p).$$
Since $F_1^{-1}(u_{\sup,p}) \le F_1^{-1}(u_{\inf,p})$ and $F_0^{-1}(1 + u_{\sup,p} - p) \ge F_0^{-1}(u_{\inf,p} - p)$, the left-hand side is non-positive and the right-hand side is non-negative, so the above equality holds iff
$$F_1^{-1}(u_{\sup,p}) = F_1^{-1}(u_{\inf,p}) \quad \text{and} \quad F_0^{-1}(1 + u_{\sup,p} - p) = F_0^{-1}(u_{\inf,p} - p).$$
But the equality $F_1^{-1}(u_{\sup,p}) = F_1^{-1}(u_{\inf,p})$ implies $u_{\sup,p} = u_{\inf,p} = p$, which implies $F_0^{-1}(1) = F_0^{-1}(0)$. On the other hand, $F_0^{-1}(1 + u_{\sup,p} - p) = F_0^{-1}(u_{\inf,p} - p)$ implies $u_{\inf,p} - u_{\sup,p} = 1$, which in turn implies $u_{\inf,p} = 1$ and $u_{\sup,p} = 0$, i.e., $F_1^{-1}(1) = F_1^{-1}(0)$. Both $F_0^{-1}(1) = F_0^{-1}(0)$ and $F_1^{-1}(1) = F_1^{-1}(0)$ contradict (A2).
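The monotonicity argument in the proof above also applies to sample analogs: for any $u \le p \le v$, a non-decreasing empirical quantile function gives $F_{1n}^{-1}(u) - F_{0n}^{-1}(u + 1 - p) \le F_{1n}^{-1}(v) - F_{0n}^{-1}(v - p)$, so plug-in estimates of the two bounds are weakly ordered. The Python sketch below illustrates this numerically. It is not the paper's implementation: the grid approximation of the supremum and infimum, the left-continuous empirical quantile, and the name `plug_in_bounds` are assumptions made for illustration.

```python
import math
import random


def empirical_quantile(sorted_sample, u):
    """Left-continuous empirical quantile: the ceil(n*u)-th order statistic,
    with the rank clamped to the range 1..n."""
    n = len(sorted_sample)
    rank = min(max(math.ceil(n * u), 1), n)
    return sorted_sample[rank - 1]


def plug_in_bounds(y1, y0, p, grid_size=200):
    """Grid approximations of
        sup over u in [0, p] of F1n^{-1}(u) - F0n^{-1}(u + 1 - p)  (lower bound)
        inf over u in [p, 1] of F1n^{-1}(u) - F0n^{-1}(u - p)      (upper bound)
    """
    s1, s0 = sorted(y1), sorted(y0)
    lower_grid = [p * j / grid_size for j in range(grid_size + 1)]
    upper_grid = [p + (1 - p) * j / grid_size for j in range(grid_size + 1)]
    q_lower = max(
        empirical_quantile(s1, u) - empirical_quantile(s0, u + 1 - p)
        for u in lower_grid
    )
    q_upper = min(
        empirical_quantile(s1, u) - empirical_quantile(s0, u - p)
        for u in upper_grid
    )
    return q_lower, q_upper


if __name__ == "__main__":
    # Illustrative data only: two simulated normal samples (an assumption).
    rng = random.Random(1)
    treated = [rng.gauss(1.0, 1.0) for _ in range(200)]
    control = [rng.gauss(0.0, 1.0) for _ in range(200)]
    print(plug_in_bounds(treated, control, p=0.5))
```

When both samples are identical, the estimated lower bound is non-positive and the estimated upper bound is non-negative, which is a convenient sanity check on any implementation.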
Proof of Theorem 3.4. We follow the proof of Theorem 3 in
Chernozhukov et al. (2009). Let Q
TE
(p) be an arbitrary sequence
of constants within the identified set, so that its value can change
depending on n. Let = Q
U
(p) Q
L
(p)
U
+
L
, where
U
= Q
U
(p) Q
TE
(p) 0 and
L
= Q
TE
(p) Q
L
(p) 0. The
probability that Q
TE
(p) CI
CLR
is
Pr (Q
TE
(p) CI
CLR
)
Pr
_
Q
TE
(p) < sup
u
U
sup,p
_
nL
(u, p) c
L
(1 )
nL
(u, p)
n
__
+ Pr
_
Q
TE
(p) > inf
u
U
inf,p
_
nU
(u, p)
+c
U
(1 )
nU
(u, p)
n
__
= Pr
_
Q
L
(p) <
L
+ sup
u
U
sup,p
_
nL
(u, p) c
L
(1 )
nL
(u, p)
n
_
_
+ Pr
_
Q
U
(p) >
U
+ inf
u
U
inf,p
_
nU
(u, p) +c
U
(1 )
nU
(u, p)
n
__
A
n
+B
n
.
First we analyze A
n
. It is given by
A
n
= Pr
_
Q
L
(p) <
L
+ sup
u
U
sup,p
_
nL
(u, p) c
L
(1 )
nL
(u, p)
n
__
= Pr
_
0 <
L
+ sup
u
U
sup,p
_
{
nL
(u, p)
L
(u, p)}
+
_
L
(u, p) Q
L
(p)
_
c
L
(1 )
nL
(u, p)
n
__
Pr
_
0 <
L
+ sup
u
U
sup,p
_
_
nL
(u, p)
L
(u, p)
_
c
L
(1 )
nL
(u, p)
n
__
= Pr
_
0 <
n
L
+ sup
u
U
sup,p
nL
(u, p)
n {
nL
(u, p)
L
(u, p)}
nL
(u, p)
c
L
(1 )
__
Pr
_
0 <
n
L
+
_
sup
u
U
sup,p
nL
(u, p)
_
sup
u
U
sup,p
_
n {
nL
(u, p)
L
(u, p)}
nL
(u, p)
c
L
(1 )
__
= Pr
_

n
L
sup
u
U
sup,p
nL
(u, p)
+c
L
(1 )
< sup
u
U
sup,p
_
n {
nL
(u, p)
L
(u, p)}
nL
(u, p)
__
.
Similarly, we have:
B
n
= Pr
_
Q
U
(p) >
U
+ inf
u
U
inf,p
_
nU
(u, p)
+c
U
(1 )
nU
(u, p)
n
__
Pr
_
sup
u
U
inf,p
_
n {
U
(u, p)
nU
(u, p)}
nU
(u, p)
_
>
n
U
sup
u
U
inf,p
nU
(u, p)
+c
U
(1 )
_
.
Under (A2), > 0 by Lemma 3.3. So, either
L
> 0 or
U
> 0 or
both.
Suppose (a)
L
> 0 and
U
= 0. Then limA
n
= 0 and
limB
n
. Suppose (b)
L
= 0 and
U
> 0. Then limA
n
and
limB
n
= 0. Suppose (c)
L
> 0 and
U
> 0. Then limA
n
= 0 and
limB
n
= 0. In cases (a) and (b), limPr (Q
TE
(p) CI
CLR
) 1 and
in case (c), limPr (Q
TE
(p) CI
CLR
) = 1 1 . The conclusion in
(i) follows since Q
TE
(p) is anarbitrary sequence of constants within
the identified set, dependent upon n.
Proof of Theorem 3.5. Given Theorems 3.1 and 3.4, it suffices to
showthat
_
L
L
_
1
(1 ) and
_
L
U
_
1
(1 ) are respectively con-
sistent estimators of the (1 ) quantiles of sup
uU
sup,p
Z
L
(u, p)
and sup
uU
inf,p
Z
U
(u, p). We provide a proof for
_
L
L
_
1
(1 )
only. A proof for
_
L
U
_
1
(1 ) can be similarly constructed and
thus omitted. Note that we only need to prove conditional weak
convergence of sup
v
sup
U
sup,p
Z
nL
_
v
sup
, p
_
to sup
uU
sup,p
Z
L
(u, p)
given the observations. It follows from Theorem 23.9 in van
der Vaart (1998) and the uniform consistency of
nL
(, p) that
Z
nL
(, p) converges weakly to Z
L
(, p) conditional on the ob-
servations. This implies the conditional weak convergence of
sup
v
sup
U
sup,p
Z
nL
_
v
sup
, p
_
tosup
uU
sup,p
Z
L
(u, p). It remains toshow
that
sup
v
sup
U
sup,p
Z
nL
_
v
sup
, p
_
sup
v
sup
U
sup,p
Z
nL
_
v
sup
, p
_
= o
p
(1) ,
where o
p
(1) denotes convergence in probability to zero condi-
tional onthe observations. Similar to the proof of Lemma 1inCher-
nozhukov et al. (2009), we obtain:
sup
v
sup
U
sup,p
Z
nL
_
v
sup
, p
_
sup
v
sup
U
sup,p
Z
nL
_
v
sup
, p
_
sup
|vv
|<d
H(
U
sup,p
,U
sup,p)
nL
(v, p) Z
nL
_
v
, p
_
,
where d
H
(, ) denotes the Hausdorff distance. The expression on
the right hand side of the above inequality converges to zero in
probability conditional on the observations due to the stochastic
equicontinuity of Z
nL
(, p) and Theorem 2 of Chernozhukov et al.
(2009).
Proof of Lemma 3.7. We prove (i), since (ii) can be proved
similarly. Lemma 21.7 in van der Vaart (1998), along with
f
1
_
F
1
1
_
u
sup,p
__
= f
0
_
F
1
0
_
1 +u
sup,p
p
__
, implies Y
1([n
1
u
1nL
])

Y
0([n
0
u
0nL
])
which is given in Box IV, where Q
L
n
(p) = F
1
1n
_
u
sup,p
_
F
1
0n
_
1 +u
sup,p
p
_
. The conclusion in (i) follows from:
n
_
Q
L
n
(p) Q
L
(p)
_
N
_
0,
2
L
_
u
sup,p
, p
__
.
Y
1([n
1
u
1nL
])
Y
0([n
0
u
0nL
])
= F
1
1n
_
u
sup,p
_
z
1
0
u
sup,p
_
1 u
sup,p
_
f
1
_
F
1
1
_
u
sup,p
__
n
1
0
_
0
u
sup,p
_
1 u
sup,p
_
+
1
(1 +u
sup,p
p)(p u
sup,p
)
F
1
0n
_
1 +u
sup,p
p
_
z
1
1
(1 +u
sup,p
p)(p u
sup,p
)
f
0
_
F
1
0
_
1 +u
sup,p
p
__
n
0
1
_
0
u
sup,p
_
1 u
sup,p
_
+
1
(1 +u
sup,p
p)(p u
sup,p
)
+o
p
_
1
n
_
= F
1
1n
_
u
sup,p
_
F
1
0n
_
1 +u
sup,p
p
_
z
1
_
0
u
sup,p
_
1 u
sup,p
_
+
1
(1 +u
sup,p
p)(p u
sup,p
)
n
1
0
f
1
_
F
1
1
_
u
sup,p
__ +o
p
_
1
n
_
= Q
L
n
(p) z
1
L
_
u
sup,p
, p
_
n
+o
p
_
1
n
_
Box IV.
Proof of Theorem 3.8. First we show: u
sup,p
u
sup,p
= O
p
_
n
1/3
_
.
The proof of this consists of two steps:
1. We show that u
sup,p
u
sup,p
= o
p
(1);
2. We show that u
sup,p
u
sup,p
= O
p
_
n
1/3
_
.
Proof of 1. Under (A1) and (A2), $F_{1n}^{-1}(u)$ and $F_{0n}^{-1}(u + q)$ are uniformly consistent estimators of $F_1^{-1}(u)$ and $F_0^{-1}(u + q)$ respectively for $u \in (0, p)$, see p. 176 of Csörgő and Révész (1981). This and (A3) imply that the sequence $\hat{u}_{\sup,p}$ converges in probability to $u_{\sup,p}$, see e.g., Theorem 5.7 in van der Vaart (1998).
Proof of 2. We use Theorem 3.2.5 in van der Vaart and Wellner
(1996) to establish the rate of convergence for u
sup,p
. Given (A2)
and (A3), the map: u (u, p) is twice differentiable and has a
unique maximum at u
sup,p
. By (A3), the first condition of Theorem
3.2.5 in van der Vaart and Wellner (1996) is satisfied. To check the
second condition of Theorem 3.2.5 in van der Vaart and Wellner
(1996), we consider the centered process:
n
1
(
n
) (u, p)
=

n
1
_
F
1
1n
F
1
1
_
(u)
n
1
_
F
1
0n
F
1
0
_
(u +1 p)
=
1
f
1
_
F
1
1
(u)
_
n
1
_
F
1n
_
F
1
1
(u)
_
u
_
+
1
f
1
_
F
1
1
(u)
_ R
n1
(u)
+
1
f
0
_
F
1
0
(1 +u p)
_
n
1
n
0
n
0
_
F
0n
_
F
1
0
(u +1 p)
_
(u +1 p)
_
+
1
f
0
_
F
1
0
(1 +u p)
_
n
1
n
0
R
n0
(u +1 p)
G
n1
(u)
n
1
n
0
G
n0
(u +1 p) +
1
f
1
_
F
1
1
(u)
_ R
n1
(u)
+
1
f
0
_
F
1
0
(1 +u p)
_
n
1
n
0
R
n0
(u +1 p) ,
where
G
n1
(u) =
1
n
n
1
i=1
[I {F
1
(Y
1i
) u} u]
f
1
_
F
1
1
(u)
_ ,
G
n0
(u) =
1
n
n
1
i=1
[I {F
0
(Y
0i
) u} u]
f
0
_
F
1
0
(1 +u p)
_ ,
R
n1
(u) =

n
1
_
F
1n
_
F
1
1
(u)
_
u
_
+
n
1
_
F
1
1n
F
1
1
_
(u) f
1
_
F
1
1
(u)
_
,
R
n0
(u) =

n
0
_
F
0n
_
F
1
0
(u)
_
u
_
+
n
0
_
F
1
0n
F
1
0
_
(u) f
0
_
F
1
0
(u)
_
.
The processes $\{R_{n1}(\cdot)\}$ and $\{R_{n0}(\cdot)\}$ are known as Bahadur–Kiefer processes or standardized empirical difference processes. Thus, we obtain
E sup
|uu
sup,p|<
n
1
(
n
) (u, p)
n
1
(
n
)
_
u
sup,p
, p
_
E sup
|uu
sup,p|<
G
n1
(u) G
n1
_
u
sup,p
_
+E sup
|uu
sup,p|<
R
n1
(u) R
n1
_
u
sup,p
_
+
_
n
1
n
0
E sup
|uu
sup,p|<
G
n0
(u +1 p) G
n0
_
u
sup,p
+1 p
_
+
_
n
1
n
0
E sup
|uu
sup,p|<
R
n0
(u +1 p) R
n0
_
u
sup,p
+1 p
_
.
Assuming inf
u
f
1
_
F
1
1
(u)
_
> 0 and inf
u
f
0
_
F
1
0
(u)
_
> 0, we
conclude that
E sup
|uu
sup,p
|<
|G
n1
(u) G
n1
_
u
sup,p
_
|
1/2
(12)
and
E sup
|uu
sup,p
|<
|G
n0
(1 +u p) G
n0
_
1 +u
sup,p
p
_
|
1/2
. (13)
Indeed, the envelope function of the class of functions
_
I {(, u]} I
_
, u
sup,p
_
: u [u
sup,p
, u
sup,p
+]
_
is bounded by I
_
(u
sup,p
, u
sup,p
+)
_
which has a squared
L
2
-norm bounded by 2. Since the class of functions I {Y
1i
} has
a finite uniform entropy integral, Lemma 19.38 in van der Vaart
(1998) implies the above results.
Consequently,
E sup
|uu
sup,p
|<
|
n
1
(
n
)(u, p)
n
1
(
n
)(u
sup,p
, p)|

1/2
_
1 +
_
n
1
n
0
_
+2E sup
u
|R
n1
(u)|
+2
_
n
1
n
0
E sup
u
|R
n0
(u +1 p)| .
Using Theorem 5.2.3 in Csörgő and Révész (1981), we conclude that the second condition of Theorem 3.2.5 in van der Vaart and Wellner (1996) is satisfied, leading to the rate of $n_1^{1/3}$.
Under (A1) and (A2)(i),(ii), it follows from the weak conver-
gence of the empirical quantile process, see e.g., van der Vaart and
Wellner (1996), and the proof of Lemma 3.7 that
Y
1([n
1
u
1nL])
Y
0([n
0
u
0nL])
= F
1
1n
_
u
1nL
_
F
1
0n
_
u
0nL
_
=
__
F
1
1n
_
u
1nL
_
F
1
0n
_
u
0nL
__
_
F
1
1
_
u
1nL
_
F
1
0
_
u
0nL
___
+
_
F
1
1
_
u
1nL
_
F
1
0
_
u
0nL
__
=
_
F
1
1n
(u
1nL
) F
1
0n
(u
0nL
)
_
_
F
1
1
(u
1nL
) F
1
0
(u
0nL
)
_
+
_
F
1
1
_
u
1nL
_
F
1
0
_
u
0nL
__
+o
p
_
1
n
_
=
_
Y
1([n
1
u
1nL
])
Y
0([n
0
u
0nL
])
_
+
__
F
1
1
_
u
1nL
_
F
1
0
_
u
0nL
__
_
F
1
1
(u
1nL
)
F
1
0
(u
0nL
)
__
+o
p
_
1
n
_
= Q
L
n
(p) z
1
L
_
u
sup,p
, p
_
n
+
__
F
1
1
_
u
1nL
_
F
1
0
_
u
0nL
__
_
F
1
1
(u
1nL
) F
1
0
(u
0nL
)
__
+o
p
_
1
n
_
= Q
L
n
(p) z
1
L
_
u
sup,p
, p
_
n
1
+o
p
_
1
n
_
,
where
_
F
1
1
_
u
1nL
_
F
1
0
_
u
0nL
__

_
F
1
1
(u
1nL
) F
1
0
(u
0nL
)
_
=
o
p
_
n
1/2
_
, by Taylor series expansion, f
1
_
F
1
1
_
u
sup,p
__
= f
0
_
F
1
0
_
1+u
sup,p
p
__
, and u
sup,p
u
sup,p
= O
p
_
n
1/3
_
. The conclusion in
(i) follows from:
n
_
Q
L
n
(p) Q
L
(p)
_
N
_
0,
2
L
_
u
sup,p
, p
__
.
Proof of Theorem 3.9. Let Q
TE
(p) be an arbitrary sequence of
constants within the identified set, so that its value can change
depending on n. It follows from the proof of Theorem 3.8 that
Pr {Q
TE
(p) CI
OSA
} = Pr
_
Y
1([n
1
u
1nL])
Y
0([n
0
u
0nL])
Q
TE
(p) Y
1([n
1
u
1nU])
Y
0([n
0
u
0nU])
_
= Pr
_
Q
L
n
(p) z
1
L
_
u
sup,p
, p
_
n
1
Q
TE
(p)
Q
U
n
(p) +z
1
U
_
u
inf,p
, p
_
n
1
_
+o (1) .
Let = Q
U
(p) Q
L
(p)
U
+
L
, where
U
= Q
U
(p)
Q
TE
(p) 0 and
L
= Q
TE
(p) Q
L
(p) 0. Then
Pr {Q
TE
(p) CI
OSA
}
= Pr
_
_
_
_
Q
L
n
(p) z
1
L
_
u
sup,p
, p
_
n
1
Q
TE
(p) ,
Q
TE
(p) Q
U
n
(p) z
1
U
_
u
inf,p
, p
_
n
1
_
_
_
_
+o (1)
= Pr
_
_
_
_
Q
L
n
(p) Q
L
(p) z
1
L
_
u
sup,p
, p
_
n
1

L
,
U
Q
U
n
(p) Q
U
(p) +z
1
U
_
u
inf,p
, p
_
n
1
_
_
_
_
+o (1)
= Pr
_
_
_
_
n
1
Q
L
n
(p) Q
L
(p)
L
_
u
sup,p
, p
_ z
1
+
L
n
1
L
_
u
sup,p
, p
_ ,
n
1
U
_
u
inf,p
, p
_ z
1

n
1
Q
U
n
(p) Q
U
(p)
U
_
u
inf,p
, p
_
_
_
_
_
+o (1) .
Under (A2), > 0 by Lemma 2.1. So, either
L
> 0 or
U
> 0
or both. Suppose (a)
L
> 0 and
U
= 0. Then
Pr {Q
TE
(p) CI
OSA
}
= Pr
_
z
1

n
1
Q
U
n
(p) Q
U
(p)
U
_
u
inf,p
, p
_
_
+o (1)
= 1 +o (1) .
Suppose (b)
L
= 0 and
U
> 0. Then
Pr {Q
TE
(p) CI
OSA
}
= Pr
_
n
1
Q
L
n
(p) Q
L
(p)
L
_
u
sup,p
, p
_ z
1
_
+o (1)
= 1 +o (1) .
Suppose (c)
L
> 0 and
U
> 0. Then Pr {Q
TE
(p) CI
OSA
} = 1 +
o (1). The conclusion follows since Q
TE
(p) is an arbitrary sequence
of constants within the identified set, dependent upon n.
Appendix B. Simulation design
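Consider the following mixtures of normal distributions with $G_1$ and $G_0$ the distribution functions respectively:
$$G_1(y) = \Pr[W_1 \le y] = \frac{1}{3}\,\Phi\!\left(\frac{y + 3/4}{2/15}\right) + \frac{2}{3}\,\Phi\!\left(\frac{y - 3/4}{4/15}\right),$$
$$G_0(y) = \Pr[W_0 \le y] = \frac{1}{20}\,\Phi\!\left(\frac{y - 4/5}{1/4}\right) + \frac{1}{20}\,\Phi\!\left(\frac{y + 4/5}{1/4}\right) + \frac{9}{10}\,\Phi\!\left(\frac{y}{1/50}\right),$$
where $\Phi$ denotes the standard normal c.d.f. We truncated these two distributions to obtain the final distributions for $Y_1$ and $Y_0$:
$$F_1(y) = \Pr[Y_1 \le y] = \frac{G_1(y) - G_1(-1)}{G_1(1.32) - G_1(-1)} \quad \text{for } y \in [-1, 1.32],$$
$$F_0(y) = \Pr[Y_0 \le y] = \frac{G_0(y) - G_0(-1)}{G_0(1) - G_0(-1)} \quad \text{for } y \in [-1, 1].$$
Let $f_1$ and $f_0$ denote the p.d.f.s of $F_1$ and $F_0$ respectively. Then,
$$\min_{y \in [-1, 1.32]} f_1(y) \wedge \min_{y \in [-1, 1]} f_0(y) = 0.0017 > 0,$$
hence (A2) is satisfied. Note from the graphs of $f_1$ and $f_0$ below that $f_1$ is asymmetric and bimodal, while $f_0$ is symmetric and trimodal.

Fig. B.1. Graphs of $f_1(y)$ and $f_0(y)$ for DGP-MN.
Fig. B.2. Graphs of $Q^L(p)$ and $Q^U(p)$ for DGP-MN.
Fig. B.3. Graphs of $F_1^{-1}(\cdot) - F_0^{-1}(\cdot + 1 - p)$ and $F_1^{-1}(\cdot) - F_0^{-1}(\cdot - p)$ for DGP-MN: p = 0.35.
Fig. B.4. Graphs of $F_1^{-1}(\cdot) - F_0^{-1}(\cdot + 1 - p)$ and $F_1^{-1}(\cdot) - F_0^{-1}(\cdot - p)$ for DGP-MN: p = 0.67.
Fig. B.5. Graphs of $F_1^{-1}(\cdot) - F_0^{-1}(\cdot + 1 - p)$ and $F_1^{-1}(\cdot) - F_0^{-1}(\cdot - p)$ for DGP-MN: p = 0.85.
Fig. B.6. Graphs of $Q^L(p)$ and $Q^U(p)$ for DGP-B.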
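The positivity of the truncated mixture densities can be checked numerically. The Python sketch below evaluates both truncated densities on a fine grid and returns their minima. It rests on assumptions that the text does not state explicitly: the denominators inside the normal c.d.f.s (2/15, 4/15, 1/4, 1/50) are read as standard deviations, the truncation intervals are taken to be $[-1, 1.32]$ and $[-1, 1]$, and all names (`MIX_1`, `MIX_0`, `min_truncated_density`) are hypothetical.

```python
import math


def norm_pdf(z):
    """Standard normal density."""
    return math.exp(-0.5 * z * z) / math.sqrt(2.0 * math.pi)


def norm_cdf(z):
    """Standard normal c.d.f. via the error function."""
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))


# Each component is (weight, mean, standard deviation); reading the scale
# parameters as standard deviations is an assumption.
MIX_1 = [(1 / 3, -3 / 4, 2 / 15), (2 / 3, 3 / 4, 4 / 15)]
MIX_0 = [(1 / 20, 4 / 5, 1 / 4), (1 / 20, -4 / 5, 1 / 4), (9 / 10, 0.0, 1 / 50)]


def mix_cdf(y, mix):
    """c.d.f. of a normal mixture at y."""
    return sum(w * norm_cdf((y - m) / s) for w, m, s in mix)


def mix_pdf(y, mix):
    """p.d.f. of a normal mixture at y."""
    return sum(w * norm_pdf((y - m) / s) / s for w, m, s in mix)


def min_truncated_density(mix, lo, hi, grid_size=20000):
    """Minimum over an equally spaced grid on [lo, hi] of the density of the
    mixture truncated to [lo, hi]."""
    mass = mix_cdf(hi, mix) - mix_cdf(lo, mix)
    grid = (lo + (hi - lo) * j / grid_size for j in range(grid_size + 1))
    return min(mix_pdf(y, mix) for y in grid) / mass


if __name__ == "__main__":
    m1 = min_truncated_density(MIX_1, -1.0, 1.32)
    m0 = min_truncated_density(MIX_0, -1.0, 1.0)
    print("min f1:", m1, "min f0:", m0, "smaller of the two:", min(m1, m0))
```

A strictly positive minimum on the grid is consistent with the densities being bounded away from zero on their supports, which is the part of (A2) the text verifies here.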
The bounds for $Q_{TE}(p)$ for $p \in (0, 1)$ are plotted in Fig. B.2. In the simulation in Section 4, we chose to use the bounds for $Q_{TE}(0.35)$, $Q_{TE}(0.67)$, and $Q_{TE}(0.85)$ (corresponding to $p_1$, $p_2$, $p_3$ in Fig. B.2 respectively). We can see the identification regions for different values of $p$ from Fig. B.2. For $p = 0.35, 0.67, 0.85$, the functions $F_1^{-1}(u) - F_0^{-1}(u + 1 - p)$ for $u \in [0, p]$ and $F_1^{-1}(u) - F_0^{-1}(u - p)$ for $u \in [p, 1]$ are graphed in Figs. B.3–B.5. In each figure, the left panel shows the former and the right panel the latter.
Figs. B.3–B.5 indicate that (A3) and (A4) are satisfied for all three values of $p$.
For the Beta distributions, let $B(y; \alpha, \beta)$ denote the c.d.f. of the Beta distribution with parameters $\alpha$ and $\beta$. We created
$$F(y) = \frac{B\left(y; \tfrac{1}{2}, \tfrac{1}{2}\right) - B\left(0.005; \tfrac{1}{2}, \tfrac{1}{2}\right)}{B\left(0.995; \tfrac{1}{2}, \tfrac{1}{2}\right) - B\left(0.005; \tfrac{1}{2}, \tfrac{1}{2}\right)}$$
and generated $Y_1$ and $Y_0$ from the same truncated Beta distribution. The p.d.f. of $B\left(\cdot; \tfrac{1}{2}, \tfrac{1}{2}\right)$ is bounded away from zero from below but unbounded from above. The truncation, however, made this new distribution bounded both from above and below, hence it satisfies (A2).
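Sampling from this truncated Beta distribution is straightforward by inverting the c.d.f., because the Beta(1/2, 1/2) c.d.f. has the closed form $(2/\pi)\arcsin(\sqrt{y})$. The Python sketch below is one way to do it, not necessarily how the simulations in the paper were run; the function names and the inverse-c.d.f. approach are assumptions made for illustration.

```python
import math
import random

LO, HI = 0.005, 0.995  # truncation points used in the text


def arcsine_cdf(y):
    """c.d.f. of the Beta(1/2, 1/2) (arcsine) distribution on [0, 1]."""
    return (2.0 / math.pi) * math.asin(math.sqrt(y))


def truncated_cdf(y):
    """c.d.f. of Beta(1/2, 1/2) truncated to [LO, HI]."""
    return (arcsine_cdf(y) - arcsine_cdf(LO)) / (arcsine_cdf(HI) - arcsine_cdf(LO))


def truncated_quantile(u):
    """Inverse of truncated_cdf for u in [0, 1]."""
    b = arcsine_cdf(LO) + u * (arcsine_cdf(HI) - arcsine_cdf(LO))
    return math.sin(0.5 * math.pi * b) ** 2


def draw_sample(n, rng):
    """Draw n observations by applying the quantile function to uniforms."""
    return [truncated_quantile(rng.random()) for _ in range(n)]


if __name__ == "__main__":
    rng = random.Random(0)
    y1 = draw_sample(100, rng)  # treated outcomes
    y0 = draw_sample(100, rng)  # control outcomes, same distribution
    print(min(y1), max(y1), min(y0), max(y0))
```

Because the untruncated density $1/(\pi\sqrt{y(1-y)})$ is smallest at $y = 1/2$ and grows without bound only near 0 and 1, cutting off both tails leaves a density that is bounded above and bounded away from zero, as the text states.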
The bounds for $Q_{TE}(p)$ for $p \in (0, 1)$ are plotted in Fig. B.6. In this setup, we chose $p = 0.5$. The functions $F_1^{-1}(u) - F_0^{-1}(u + 1 - p)$ for $u \in [0, p]$ (left panel) and $F_1^{-1}(u) - F_0^{-1}(u - p)$ for $u \in [p, 1]$ (right panel) are presented in Fig. B.7.
As seen from Fig. B.7, this set of distributions does not satisfy (A3) and (A4).
Fig. B.7. Graphs of $F_1^{-1}(\cdot) - F_0^{-1}(\cdot + 1 - p)$ and $F_1^{-1}(\cdot) - F_0^{-1}(\cdot - p)$ for DGP-B: p = 0.5.
References
Abadie, A., Angrist, J., Imbens, G., 2002. Instrumental variables estimation of quantile treatment effects. Econometrica 70, 91–117.
Andrews, D.W.K., 1999. Estimation when a parameter is on a boundary. Econometrica 67, 1341–1383.
Andrews, D.W.K., Guggenberger, P., 2009. Validity of subsampling and plug-in asymptotic inference for parameters defined by moment inequalities. Econometric Theory 25, 669–709.
Andrews, D.W.K., Shi, X., 2009. Inference based on conditional moment inequalities. Unpublished Manuscript. Yale University, New Haven, CT.
Andrews, D.W.K., Soares, G., 2010. Inference for parameters defined by moment inequalities using generalized moment selection. Econometrica 78, 119–157.
Beresteanu, A., Molinari, F., 2009. Asymptotic properties for a class of partially identified models. Econometrica 76, 763–814.
Bitler, M., Gelbach, J., Hoynes, H.W., 2006. What mean impacts miss: distributional effects of welfare reform experiments. American Economic Review 96, 988–1012.
Blundell, R., Gosling, A., Ichimura, H., Meghir, C., 2007. Changes in the distribution of male and female wages accounting for employment composition using bounds.
Bugni, F.A., 2010. Bootstrap inference in partially identified models defined by moment inequalities: coverage of the identified set. Econometrica 78, 735–753.
Canay, I.A., 2010. EL inference for partially identified models: large deviations optimality and bootstrap validity. Journal of Econometrics 156, 408–425.
Chernozhukov, V., Hansen, C., 2005. An IV model of quantile treatment effects.
Chernozhukov, V., Hansen, C., 2006. Instrumental quantile regression inference for structural and treatment effect models. Journal of Econometrics 132, 491–525.
Chernozhukov, V., Hong, H., Tamer, E., 2007. Parameter set inference in a class of econometric models. Econometrica 75, 1243–1284.
Chernozhukov, V., Lee, S., Rosen, A.M., 2009. Intersection bounds: estimation and inference. Centre for Microdata Method and Practice, CEMMAP. Working Paper CWP19/09.
Csörgő, M., Révész, P., 1981. Strong Approximations in Probability and Statistics. Academic Press, New York, NY.
Djebbari, H., Smith, J., 2008. Heterogeneous program impacts of the PROGRESA program. Journal of Econometrics 145, 64–80.
Doksum, K., 1974. Empirical probability plots and statistical inference for nonlinear models in the two-sample case. Annals of Statistics 2, 267–277.
Fan, Y., 2008. Confidence sets for parameters defined by conditional moment inequalities/equalities. Unpublished manuscript. Vanderbilt University, Nashville, TN.
Fan, Y., Park, S., 2009. Partial identification of the distribution of treatment effects and its confidence sets. Advances in Econometrics 24, 3–70.
Fan, Y., Park, S., 2010. Sharp bounds on the distribution of the treatment effects and their statistical inference. Econometric Theory 26, 931–951.
Firpo, S., 2007. Efficient semiparametric estimation of quantile treatment effects.
Galichon, A., Henry, M., 2009. A test of non-identifying restrictions and confidence regions for partially identified parameters. Journal of Econometrics 152, 186–196.
Heckman, J., Smith, J., Clements, N., 1997. Making the most out of programme evaluations and social experiments: accounting for heterogeneity in programme impacts. Review of Economic Studies 64, 487–535.
Honoré, B.E., Lleras-Muney, A., 2006. Bounds in competing risks models and the war on cancer. Econometrica 74, 1675–1698.
Horowitz, J., Manski, C.F., 2000. Nonparametric analysis of randomized experiments with missing covariates and outcome data. Journal of the American Statistical Association 95, 77–84.
Imbens, G.W., Manski, C.F., 2004. Confidence intervals for partially identified parameters. Econometrica 72, 1845–1857.
Lee, M.J., 2000. Median treatment effect in randomized trials. Journal of the Royal Statistical Society: Series B (Statistical Methodology) 62, 595–604.
Lehmann, E.L., 1974. Nonparametrics: Statistical Methods Based on Ranks. Holden-Day Inc., San Francisco, California.
Linton, O., Song, K., Whang, Y., 2010. An improved bootstrap test of stochastic dominance. Journal of Econometrics 154, 186–202.
Manski, C.F., 1994. The selection problem. In: Sims, C. (Ed.), Advances in Econometrics. Sixth World Congress, vol. 1. Cambridge University Press.
Manski, C.F., 2003. Partial Identification of Probability Distributions. Springer-Verlag, New York, NY.
Manski, C.F., Pepper, J., 2000. Monotone instrumental variables: with application to the returns to schooling. Econometrica 68, 997–1010.
Romano, J.P., Shaikh, A.M., 2008. Inference for identifiable parameters in partially identified econometric models. Journal of Statistical Planning and Inference 138, 2786–2807.
Rosen, A.M., 2008. Confidence sets for partially identified parameters that satisfy a finite number of moment inequalities. Journal of Econometrics 146, 107–117.
Shaikh, A.M., Vytlacil, E., 2010. Partial identification in triangular systems of equations with binary dependent variables. National Bureau of Economic Research Working Paper t0307.
Stoye, J., 2009. More on confidence intervals for partially identified parameters.
van der Vaart, A.W., 1998. Asymptotic Statistics. Cambridge University Press, New York, NY.
van der Vaart, A.W., Wellner, J., 1996. Weak Convergence and Empirical Processes With Applications to Statistics. Springer-Verlag Inc., New York, NY.
Williamson, R.C., Downs, T., 1990. Probabilistic arithmetic I: numerical methods for calculating convolutions and dependency bounds. International Journal of Approximate Reasoning 4, 89–158.
