This Content Downloaded From 193.0.118.39 On Sun, 23 May 2021 13:27:39 UTC

A Multivariate Extension of Hoeffding's Lemma
Author(s): Henry W. Block and Zhaoben Fang

Source: The Annals of Probability , Oct., 1988, Vol. 16, No. 4 (Oct., 1988), pp. 1803-1820
Published by: Institute of Mathematical Statistics
Stable URL: https://www.jstor.org/stable/2243993
JSTOR is a not-for-profit service that helps scholars, researchers, and students discover, use, and build upon a wide
range of content in a trusted digital archive. We use information technology and tools to increase productivity and
facilitate new forms of scholarship. For more information about JSTOR, please contact support@jstor.org.
Your use of the JSTOR archive indicates your acceptance of the Terms & Conditions of Use, available at
https://about.jstor.org/terms
Institute of Mathematical Statistics is collaborating with JSTOR to digitize, preserve and

extend access to The Annals of Probability
This content downloaded from

193.0.118.39 on Sun, 23 May 2021 13:27:39 UTC
All use subject to https://about.jstor.org/terms
The Annals of Probability
1988, Vol. 16, No. 4,1803-1820
A MULTIVARIATE EXTENSION OF HOEFFDING'S LEMMA
BY HENRY W. BLOCK1 2 AND ZHAOBEN FANG2
University of Pittsburgh
Hoeffding's lemma gives an integral representation of the covariance of

two random variables in terms of the difference between their joint and
marginal probability functions, i.e.,
cov(X, Y) f J {P(X > x, Y >y) - P(X > x)P(Y > y)} dxdy.
00 00
This identity has been found to be a usefu

structure of various random vectors.
A generalization of this result for more than two random variables is
given. This involves an integral representation of the multivariate joint
cumulant. Applications of this include characterizations of independence.
Relationships with various types of dependence are also given.
1. Introduction. It is well known that if a random variable (r.v.) X has

distribution function (d.f.) F(x) with finite expectation, then
(1) EX = (1 - F(x))
0
dx - F(x) dx.
-00
The extensi
(2) EXn = n[ Xn-l(l - F(x)) dx -0 xn-lF(x) dx].

Hoeffding (1940) gave a bivariate version of identity (1), which is mentioned in
Lehmann (1966). Let Fx, y(x, y), Fx(x), Fy(y) denote the joint and marginal
distributions of random vector (X, Y), where EIXYI, EIXI, El YI are assumed
finite. Hoeffding's lemma is
(3) EXY- EXEY= J {Fx, y(x, y) - Fx(x)Fy(y)1 dxdy.

-00 -00
Lehmann (1966) used this result to cha

things, and Jogdeo (1968) extended Lehmann's bivariate characterization of
independence. Jogdeo obtained an extension of formula (3) which we now give.
Let (Y1, Y2, Y3) be a triplet independent of (X1, X2, X3) and having the same
distribution as (- X1, X2, X3). Then
(4) E(X1 - Yl)(X2 - Y2)(X3 - Y3) = J K(ul, u2, u3) du, du2 du3, 00
Received December 1986; revised February 1988.

'Supported by ONR Contract N00014-84-K-0084.
2Supported by AFOSR Grant AFOSR-84-0113.
AMS 1980 subject classifications. Primary 62H05; secondary 60E05.
Key words and phrases. Hoeffding's lemma, joint cumulant, characterization of independence,
inequalities for characteristic functions, positive dependence, association.
1803

193.0.118.39 on Sun, 23 May 2021 13:27:39 UTC
1804 H. W. BLOCK AND Z. FANG
where
K(u1, U2, U3) = {P(B1A2A3) + P(B1)P(A2A3)

-P(A2)P(B1A3) - P(A3)P(B1A2)}
- {P(A1A2A3) + P(A1)P(A2A3)
-P(A2)P(A1A3) - P(A3)P(A1A2)}
and Ai = {Xi < ui}, i = 1,2,3, B1 = {X1 2 -u1}. Jogdeo mentioned that a
similar result holds for n ? 3. We give a different generalization of Hoeffding's
lemma. Notice that expression (3) can be rewritten as
(5) Cov(X, Y) = ff Cov(Xx(x) Xy(y)) cdy,

-00 -00
where Xx(x) = 1 if X > x, 0 otherwise, and that the covariance is the second-
order joint cumulant for the random vector (X, Y). In the following we extend
the results to the rth-order joint cumulant where r 2 3.
2. Main results. Consider a random vector (X1,..., Xr), where ElXilr < 00,
i-1,...,r.
DEFINITION 1. The rth-order joint cumulant of (X1,..., Xr) denoted by

cum(X1,..., Xr) is defined by
(6) cum(Xl,..., Xr) = (1)P' (p - 1)! (E H Xj )...( E Xj),
where summation extends over all partitions (v1,..., vp), p = 1,2

{1,**., r}.
It can be shown [see Brillinger (1975)] that cum(X1,..., Xr) is the co

of the term (i)rtl * tr in the Taylor series expansion of log E(ex
Furthermore the following properties are easy to check:
(i) cum(a1Xl, ..., arXr) = a, ... ar cum(Xi, * * Xr);

(ii) cum(X1,..., Xr) is symmetric in its arguments;
(iii) if any group of the X's are independent of the remaining X's, then
cum(X1,..., Xr) = O;
(iv) for the random variable (Y1, X1,..., Xr), cum(X1 + Y1, X2,..., Xr) =
cum(XV,..., Xr) + cum(Y1, X2,.., Xr);
(v) for ,u constant, r ? 2, cum(Xl + ,u, X2,..., Xr) = cum(Xv,..., Xr
(vi) for (X1, .. ., Xr), (Y1,..., Yr) independent
cum(Xi + Yi ... ** Xr + Yr) = cum(XX,.. ., Xr) + cum(Yl,. . ., Y );
(vii) cum X- = EXj, cum(Xj, Xj) = VarXj and cum(Xi, Xj) = cov(Xi,
To represent certain moments by cumulants, we have the following useful
identity.

193.0.118.39 on Sun, 23 May 2021 13:27:39 UTC
AN EXTENSION OF HOEFFDING'S LEMMA 1805
LEMMA 1. If EjXij' < o0,

EX, .. Xm-EX, EXm
(7) = Ecum(Xk, k Ep Dl) ... cum(Xk, k E Pp),
where E extends over allpartitions (vl, * ., rp), p = 1,..., m - 1, of (
PROOF. In the case of m= 2, p = m-1 = 1 and (7) reduces to the

known
EX1X2 - EX1EX2 = cum(Xk, kE ej = cov(X1, X2).

Notice that
EX1 Xm-EX, EXm = EX, Xm-2Xm-lXm

(8) - EX1 * EXm-2EXmi1Xm
+ EX1 * EXm-.2COV(Xm-1i Xm).
Introduce the new notation Y, = X1, i = 1,..., m - 2, Ym-1 = X,-,~Xm

Theorem 2.3.2 in Brillinger (1975), page 21, and induction we get (7). z]
Our main result is the following.
THEOREM 1. For the random vector (X1,..., Xr), r > 1, if EIXijr < oo,
i = 1,2,..., r, then
(9) cum(X1,... Xr) = J f cum(Xx1(Xi),.*., XXr(Xr)) dX1 dXr,

-00 -00
where xi(xi) = 1 if Xi >
To prove the theorem,

interest.
LEMMA 2. If EIX1 ... Xrj < 0o, we have
EX1 . Xr= (-1)r .. | (F(X) - E ?(Xj)F(X(j))

00 -00 j=1
(10) + E E(xi)E(xj)F(x(' j) )
i<j
+ *fI * ~j * + (-1 ... dxr,
where E(xi) = 1 if xi > C, 0 otherwise. Here x(il ik) represents

(x1, ..., X X ., i 'X 2-" Xi2+1' " Xik-1 Xik+l ' ... Xr). Also F(x(i, , ik)) is
the marginal d. f. of X(i 0i.)*W. We omit the subscripts for F for simplicity when
there is no ambiguity, e.g., F(x(1)) is the marginal of (X2,..., Xr).

193.0.118.39 on Sun, 23 May 2021 13:27:39 UTC
PROOF. First, we have the identity
(11) f_(eXi | (Xi) - I(-.,x.(Xi)) dXL,
where I(-,_ x(Xi) = 1 if Xi < xL, 0 otherwise. Then by
EX, Xr = E n f [e(xi) - I(,xi(Xi)1 dxi}
... I H[ [-(xi) - I(x, xi,(Xi)] dX1 .
Jf__ x lxE(H E e(xi) -I I(-x,x](Xi)] dxl * C r
II1(Xi) E rl ?(Xk)F(xj)
-o -00 i-1 j]1 k*j
+ E rl e(xk)F(xi, xj)
i<j k~i, j
which is just the right side of (10). 0
REMARK 1. It is easy to see that (1) can be written as EX = fr00(e(x) -

F(x)) dx, which is a special case of (10). Thus Lemma 2 is an extension of (1).
REMARK 2. Using the identity
X['i f_| ni ]i[e(xi) -I(_oxil(Xi)] dxi,
we can also obtain an extension of (2), i.e.,
nl ... nk
= (- nl * . . nk|0 1 .
-00 0
x 'F(xl, ..., xk) - E E(X1)F(x(j))

(12) j=1
+ E E
i<j
+ ** +(-1)knE(Xi) dX .*. dx. ,
where ni1, n, + +nk < r.

193.0.118.39 on Sun, 23 May 2021 13:27:39 UTC
REMARK 3. When the X.'s are nonnegative (12) reduces to
EXfl1 ... = | fn* ... * nX-1 ... xnk-1

(13) -_ 0
x-F( XP ... .
where F(x1,..., Xk)

variate case of (13) w
The proof of the T

theorem and Lemma 2. We have
cum(X1,..., Xr)
= E(-1)p(p p-)! (E X1) ... (E HI Xj)
= E(X1 *... * Xr)- E rE XjX)E( H Xj)

JEVl JEV2
+ +(-1)rl(r-1)! HI EXj
j=1
(-1)r .. * (F(x) - F, e(xj)F(xlj))

-oo -o0 j=1
+ E E(xi)E(xj)F(x
i<j
+ * +(_ )r H E(Xj) dx1 ... dXr
_(1)fl~l+fP2f f !F(xji j E v)
-00 - \J~J00
- 5?. E(Xk)F(xj, j E v, \
kev,
kG V1 ~ EV
+ ..+(_1)n"', FB E(Xj)
X F(xi E k=V2
I) 2)- e(xk)F(x i E gJ 2 \ k)
E(X)Fx2
kev2
+ +(-1)nP2 HI E(Xi)} dx.. dXr
r~~~~~~~~
+ *@ * +(-1)r(r -.1)!-00
*. f-00
[E(Xi)
i=
- Fi(Xi)] dX1 ... dXr,
where n1,i is the number of indic

193.0.118.39 on Sun, 23 May 2021 13:27:39 UTC
vi and Fi(x) is the marginal of Xi. All terms w

quantities Y2 1n,., j = 2,..., p, are all equal to r. Thus
cum(X1, ... , Xr)
=(-1)r f. (F(x) -F(xj, j E V1)F(xi, i e

-00 -00
+ *** +(-f0(rc-)1)! .Fi(xi) &x1 ..) )* dr
( 0 _ 00o ? E ( _l)P- (p - 1)!F(xj, j E- vi)
X ... X F( XP E Vrp ) d'X 1... d-Xr
cum(X_ X)r) * fU( c uXX(Xx), ... ., X-Xr (Xr)) dx1 ... dxr
This completes theoproof
-0000 -
= ..***. cm CMX XlFX
The last equality follow

This completes the proo
REMARK 4. The result
-00 -00
The integral can then be expresse
(-1) cum(Xx,(xi), i E A
where A U B = {1, 2, .. ., r}. We
tion and/or survival function in
(i) for A = 4, card B = r the integrand is
(-1)r(F(X) - EF(Xj, i E VJ)F(x, i E v2)

r
+ +((1)r (r-1)!H Fi(xi));

(ii) for B = 4 the integrand is
r
F~(x) - F(X j e vE)F(xi, i E v2) + i * +(-1)r(r - 1)! Hi(xi).
3. Applications. In some sense, the cumulant is a measure of the indepen-

dence of certain classes of r.v.'s.
The following result was shown by Jogdeo (1968). Let Fxl, x2, xx(x, x2, x3)
belong to the family 0(3), where 0(3) denotes the class of trivariate distribu-

193.0.118.39 on Sun, 23 May 2021 13:27:39 UTC
tions such that there exists a choice of and Ai, i = 1, 2

3
(14) P(XlAlXl, X2A2X2, X3A3X3)AH P(XiAiXi)X

i=1
for all x1, x2, X3, where the A, Ai each denote one of the ineq
Then Xi, Xj for all i # j are uncorrelated and EX1X2X3 = EX
only if the Xi's are mutually independent.
Using Theorem 1, we get this conclusion directly. The "if" part is trivial.
Conversely, since F E= #(3) we know Fxix.(xi, xj) E= .(2) [#(n) can be def
similarly]. Since Xi and Xj are uncorrelated this implies the Xi's are pair
independent by Hoeffding's lemma. Thus, using Remark 4, (9) becomes
EX1X2X3 - EX1EX2EX3
- ?JJJ__{P(XlAlxl, X2A2x2, X3A3X3)
-P(XlAlXl)P(X2A2x2)P(X3A3x3)} dX1 dx2 dx3.

Now since F E #(3) the integrand will not change sign, so that EX1X2X3 =
EX1EX2EX3 implies
P(XlAlXl, X2A 2X2' X3A3X3) = P(X1A1X1)P(X2A2X2)P(X3A3X3),

for all x1, X2, X3 which means that the Xi's are independent.
The n-dimension extension is straightforward and is given in the following
discussion.
THEOREM 2. If Fxl,..., xn(xl,..., Xn) E #(n), then EXi* ... Xik= =11 EXj
for all subsets {il..., i}, of 1, ... ., n} if and only if Xl,..., Xn are independent.
PROOF. FX. Xn(x *... * Xn) E J(n) means Fxl,..., Xl(X4l * * Xik) e .#(k)
for any subset (il,..., Wk*. By induction on n, using Theorem 1, we obtain
n
EX1 ... Xn - rl EX.

j=1
= +| . J * H |. P(XAixi) i = 1,..., n)- r|P(XiAixi)

-oo -oo0 i=l
The integrand
Xi are mutual
Several authors have discussed dependence structures in which uncorrelated-

ness implies independence. Among them are Lehmann (1966), Jogdeo (1968),
Joag-Dev (1983) and Chhetry, Kimeldorf and Zahed (1986).
We now give a definition from Joag-Dev (1983). Let X = (X1,..., Xn) be a
random vector, A be a subset of (1,..., n} and x = (x1,..., x ) a vector of
constants.

193.0.118.39 on Sun, 23 May 2021 13:27:39 UTC
DEFINITION 2. Random vectors are said to be PUOD (positive upper orthant

dependent) if (a) (which follows) holds, PLOD (positive lower orthant depen-
dent) if (b) holds and POD (positive orthant dependent) if (a) and (b) hold,
where
n
(a) P(X > x) H P(Xi > xi),

n
(b) P(X < x)? P(Xi<xi).

i=1
If the reverse inequalities between the probabilities in (a) and (b) hold the three
concepts are called NUOD, NLOD and NOD, respectively.
NOTE. In Definition 2 in Block and Ting (1981), POD is used for what is
called PUOD in this paper.
DEFINITION 3. A vector X is said to be SPOD (strongly positively orthant

dependent) if for every set of indices A and for all x the following three
conditions hold:
(c) P(X > x) 2 P( Xi > xi, i E- A)P(Xj > x;, j E- AC),
(d) P(X < x) 2 P(Xi < xi, i E A)P(Xj < x;, j E AC),
(e) P(Xi >xi~ i E-A, Xj <xj, j E=Ac)
< P(X. > xi, i E- A)P(Xj < x;, j E- AC).
The relationships among these definitions are as follows:
v PLOD <
(15) Association => SPOD => POD 1(n).
Since association, SPOD, POD, PLOD, PUOD are all subclasses of X#(n),
Theorem 2 generalizes some results in Lehmann (1966) and it gives us another
proof of Theorem 2 in Joag-Dev (1983) as well as some new characterizations of
independence for POD random variables. Corollary 1 is the result of Joag-Dev.
COROLLARY 1. Let X1, .... Xn be SPOD and assume cov(Xi, Xj) = 0 for all
i ] j. Then X1,..., Xn are mutually independent.
PROOF. Since X1, ..., Xn SPOD implies (X1,..., Xj E) e 1(n) by Theorem 2

we need only check EXil ... Xik = 1EXi for all subsets {i1..., iWj of
(1,..., n}. When n = 2 SPOD is equivalent to POD and uncorrelatedness
implies X1, X2 independent. By induction on n we may assume all subsets with

193.0.118.39 on Sun, 23 May 2021 13:27:39 UTC
n - 1 r.v.'s are mutually independent and thus EXi ... Xik = for all
1 < k < n - 1. Hence cum(Xk, k e vp) = 0, whenever 1 < card(vp) < n - 1.
So we only need to check EX1 ... Xn = HP7 jEXj. By Lemma 1, Theorem 1 and
because of the independence of any (n - 1) r.v.'s,
n
EX1 ... Xn -H EXj

j=1
= Scum(Xk, k E v.).. cum(Xi, k E vp)
(16) = cum(i ... * * Xn)
= . | P(X > x) - I P(Xi > Xi)) CdX1 .. C @Xn 2 0-
Similarly,
n
EX1 Xn- HEXj

j=1
= E(-X1)(-X2)X3 .. Xn- E(-X1)E(-X2)EX3 ... EXn
= cum(-XP -X2X X3,***, Xn)
= f f... {P(-X1 > X1, -X2 > X2, X3 > X3 ... Xn > Xn)
-00 -00
(17) -P(-X1 > X1)P(-X2

00 00,
.. | P(X1 < -XP, X2 < -X2, Xi > Xi, i = 3, ... , n)

n
-P(Xl < -XJ)P(X2 < -X2) H P(Xi > Xi)} CI1 .. Cln
i=3
.. | ( P(Xj < -xj, j = 1,2, Xi > xi, i = 3, ... , n)

-00 -00
-P(Xi < -xj, j = 1,2

< 0.
The last equality holds by the induction assumption of mutual independence and
the last inequality is due to SPOD. Combining (16) and (17) completes the
proof. O
THEOREM 3. Let X1, X2, X3 be POD and assume Xi, Xj for all i # j
uncorrelated. Then X1, X2, X3 are mutually independent.

193.0.118.39 on Sun, 23 May 2021 13:27:39 UTC
PROOF. The following two summands are nonnegative since X1, X2, X3 are
POD. By Lemma 1 we then have
[P(Xl > xl, X2 > X2, X3 > X3)- P(Xi > Xi)]
+ [P(Xl < X1, X2 < X2, X3 < X3)- PMX < Xi)]
= cum(Xx,(xl), XX2(X2), Xx3(X3))
+ E P(Xi > Xi)CoV(XXj(Xj)XXk(Xk))

i*j*k
+cum(1 - xx(xD,1 - xx2(X2), - Xx3(X3))
+ E P(Xi < Xi)cov(xXj(Xj),xXk(Xk))

i*j*ok
= E cov(xXi(xi), Xx1(xi)).
i*j
Since Xi, Xj POD and cov(Xi, Xj) = 0 we obtain cov(Xx (xi), xx (xj)) = 0.
Thus P(Xi > xi, i = 1, 2,3) - HFVP(Xi > xi) = 0, i.e., X1, X2, X3 are mutually
independent. 5
REMARK 5. For three r.v.'s X1, X2, X3 the mixed positive dependence de-
fined in Chhetry, Kimeldorf and Zahed (1986) implies POD but the converse is
not true as shown by an example in Joag-Dev (1983). Notice that since the mixed
positive dependence implies POD in Corollary 1, the SPOD can be relaxed to this
mixed condition.
THEOREM 4. Assume n = 21 + 1 is an odd positive integer and X1,..., Xn

are POD. Then if E(Xl ... Xik) = EXil ... EXik, where 2 < k < 21 for any
subset {ill..., ik3 of {1,... ,21 + 1), it follows that X1,..., Xn are mutually
indepenrnt.
PROOF. By Theorem 2, we need only check EX1 ... Xn = EX1 ... EX,. On
the one hand,
EX1 .. Xn-EX1 ... EXn

=cllm(Xll...-,Xn)
f ... f* jP(Xi > xi, i n)-i fP(Xj >x) dxl * C*Xn

- o - j=0
2 0.

193.0.118.39 on Sun, 23 May 2021 13:27:39 UTC
On the other hand,
EX1 ... Xn-EX1 *- EXn
= (-1)21+ {E(-X1) * - - (-Xn)- E(-X1) ... E(-Xn))

= (-1)21 1cum(-X1, ... * -Xn)
= (-1)2l f*|(...f P(Xi<-xi,i=1,...,n)

n
-H P
j=1
REMARK 6. For n = 4 we construct in Example 1 POD r.v.'s such that any

three Xi's are independent but the Xi's are not mutually independent. This
shows that the conditions of Theorem 4 are reasonable. In Example 2 we show
that for POD r.v.'s cov(Xi, Xj) = 0 is not enough to give mutual independenc
when 21 + 1 > 3.
EXAMPLE 1. Let X1,..., X4 have the following distribution. It is easy to

check that for i J# j k, Xi, X1, Xk are mutually independent and th
X1,..., X4 are POD.
Xl X2 X3 X4 Pr
1 1 1 1 1/8
1 1 0 0 1/8
1 0 1 0 1/8
o 1 1 0 1/8
1 0 0 1 1/8
o 1 0 1 1/8
o 0 1 1 1/8
o o 0 0 1/8
Since P(Xi > 1/2, i = 1,...,4) - FH1P(Xi > 1/2) = 1/16 > 0, X1,..., X4 are
not mutually independent. Notice also that
P(X1 < X2 <X2X2, X3 > X3, X4 > X4)

-P(X1 < X1 X2 < X2)P(X3 > X3, X4 > X4)
= cum(1 - xX1(xD, 1 - Xx2(X2), Xx3(X3), Xx4(X4))
= cum(Xxi(xi) i= 1,... 4)
4
= P(Xi > Xi, i = 1,---, ) 4) - 1_1 PM > xi)
> 0,
so these r.v.'s are not SPOD.

193.0.1ffff:ffff:ffff:ffff on Thu, 01 Jan 1976 12:34:56 UTC
EXAMPLE 2. Let X1, ... , X5 have the following distribution:
X1 X2 X3 X4 X5 Pr
1 1 1 1 1 1/16
1 1 0 0 1 1/16
1 0 1 0 1 1/16
o 1 1 0 1 1/16
1 0 0 1 1 1/16
o 1 0 1 1 1/16
o o 1 1 1 1/16
o 0 0 0 1 1/16
1 1 1 1 0 1/16
1 1 0 0 0 1/16
1 0 1 0 0 1/16
o 1 1 0 0 1/16
1 0 0 1 0 1/16
o 1 0 1 0 1/16
o o 1 1 0 1/16
o o 0 0 0 1/16
It is easy to check that this is PUOD and PLOD, thus it is POD. However
EXiXj = 4/16 and EXi = 1/2 for all i, j.
In this example we can use Theorem 3 to prove that any Xi, X1, Xk are
mutually independent since subsets of POD r.v.'s are still POD.
Newman and Wright (1981), using an inequality for the ch.f.'s of r.v.'s
X1,..., Xm, provided another proof for the characterization of the independence
of associated r.v.'s. This is Theorem 1 of Newman and Wright (1981). These
authors proved that if X1,..., Xm are associated with finite variance, joint and
marginal ch.f.'s 4(rl,..., rm) and 4)(rj), then
m
(18) , rm - H4j(ri) < I Eri IrkIcov(XI, Xk).

j1l jok
To extend this inequality, we need the following lemma.
LEMMA 3. For the r.v. (X,,..., Xm) with EIXilm < oo, m > 1,
cum(exp( irXl), .. *I* exp( irmXm ))
(19) = ... imr . rmexp i Erjxj)

-00 -00
Xcum(xXX(Xi).--
where r1,..., rm are real n

wise.

193.0.118.39 on Sun, 23 May 2021 13:27:39 UTC
PROOF. This proof of the result is similar to that of Lemma 2. Use the
identity
exp( irk Xk) - 1 if rkexP(irkxk) (E(xk) I(_, Xk](Xk)) dXk-
Notice that
-() I X(Xi) {XXk(Xk) forXk > O.

XXk(Xk) -1for Xk < 0.
After doing the obvious calculation, we obtain by property (v) (following D

tion 1) of the joint cumulant that
cum(exp(irkXk), k = 1 ... ., m)
= cum(exp(irkXk) - 1, k = m)
=(-1) (p- 1)! 1=1

H [E
ke,-[H (exp(irkXk) 1)]
=r**| irl ** rmexpti E rjxj)

00 -00
= 1:0 ....
00 -00 j=l
Xcum(Xxl(xl),..., Xxm(xm)) dxi ... dXm E
Using Lemma 3, we can obtain a result parallel to (18) for certain classes of
r.v.'s.
THEOREM 5. If X1,..., Xm are r.v.'s such that EIXjIm < oo, j = 1,...,
and CUm(XX(X4),..., XX (Xk)) has the same sign for all subsets {il.... ik
{1,...,m} and all xl,.... .Xk. Then
|(ri , rm)- rl 0i (ri)|

(20) J=
m
< HrIj IIcum(X k E v- I ... Icum(Xk, k E vp)I.

J=1
Here 44r1, . .., rm) and 4f,(r1) are the joint and marginal ch. f.'s of

193.0.118.39 on Sun, 23 May 2021 13:27:39 UTC
E extends over all partitions (vl,..., vp), p = 1,..., m - 1, and whenever

card(v1) = 1, IrkI - Icum(Xk, k E PI)I is replaced by 1. Here k E vP.
PROOF. From Lemma 3, Theorem 1 and the fact that the cum(Xx(xi),
... ,x (xm)) all have the same sign we have for m > 1,
Icum(exp( irXj), ... , exp(irmXm))I
= ..*-. im
-00 -00
x cUm( XX(Xi),..X --,~xmm)) dXi dXm
(2) < Irl *-- Irmlf f00 | cum(xl(xi),...,xm(xm))Idxi C1X d~
-00 -00
__ 00
I rll *Irmi f_ | |... cumXX,(xl )X... XX m(xm) ) dX... dXm&
= Irl .. Irml lcum(xl),... Xm).
For m = 1, cum(exp(ir1X1)) = E(exp ir1Xl) = 41(rl) which is bounded by 1.

Combining (21) and Lemma 1, we get
m
|0(ri , rm) - H oi(ri)|

1=1
m m
= E n exp(irk Xk) - H E exp(irk Xk)

k=1 k=1
=|E cum(exp irjX
< E Icum(exp irjX
< Irl. IrmlElcum
Whenever card vl = 1
REMARK 7. In Example 3 we define r.v.'s which are uncorrelated but not

mutually independent. By Corollary 1 they cannot be associated so that Theo-
rem 1 of Newman and Wright (1981) does not apply. However, Theorem 4 gives
an upper bound for the difference of ch.f.'s, since it is easy to check
cum(XX.(xd) Xxj(Xj)) = 0, i ? j, and cum(Xx(xl), XX2(X2), Xx3(X3) 2 0 for
X1, X2, X3.

193.0.118.39 on Sun, 23 May 2021 13:27:39 UTC
EXAMPLE 3. Consider the r.v.'s X1, X2, X3 with the following distribution:
X1 X2 X3 Pr
1 1 1 1/4
1 0 0 1/4
0 1 0 1/4
0 0 1 1/4
These are PUOD but not POD.
For nonnegative r.v.'s we can go further.
THEOREM 6. If the r.v.'s Xl,..., Xm are nonnegative (nonpositive) and

PUOD (PLOD) with finite mth moments, then
m
(22) p(r,. ..,r)-|
< Irll Irml 1EX1 .. Xm
PROOF. We prove the PUOD case only. Using Lemmas 1 and 3 and
Remark 3,
| Orl, * ,rm) - H M>ri)|

j=1
= E exp(i E r1X) - H E exp(irjXj)
= j *... fimr, ... rmexp(i E rixi)
(23) x [ Axl) ... xm) - -F(xl) *F *m(Xm)I dxl ... dxm
? IrIl ... Irml ... IF(xl,..., xm)

-00 o
-Fl(xl) ...
=Irl .. Irml
-.F(xl) .F
= lrll ... Irml 1E

193.0.118.39 on Sun, 23 May 2021 13:27:39 UTC
COROLLARY 2. Under the conditions of Theorem 5, if EX1 *- X,=

EX1 ... EXm, then X1,..., Xn are independent.
4. Cumulants and dependence. Cumulants provide us with useful mea-

sures of the joint statistical dependence of random variables. However, the
relationships with positive and negative dependence are not similar to those in
the bivariate (covariance) case. We give some examples to illustrate the relation-
ship between the sign of the cumulant and dependence in the trivariate case.
REMARK 8. By property (iii) of cumulants if any group of X's is indepen-

dent of the remaining X's, then cum(X1,..., Xr) = 0. The converse is true for
normal distributions when r = 2 but not for r > 2. For the trivariate normal, we
can have cum((X1, X2, X3) = 0, where X1, X2, X3 are not necessarily indepen-
dent.
REMARK 9. Assume EXi > 0 for i = 1, 2,3. Also assume that cov(X,, Xj) 2 0
for i, j = 1, 2, 3 [or the even stronger conditions cov(Xx (xj), xx (xj)) ? 0 and
cum(X1, X2, X3) 2 0]. These do not imply PUOD as is shown in the following
example.
EXAMPLE 4. Let X1, X2, X3 take the values 0, + 1 with: P(X1 = x1, X2 =
X3 = X3 X1X2X3 ? 0) = 0; P(X1 = X2 = X3 =0) =0; P(Xi =0 Xi = xi'
Xk = Xk, XiXk > 0) = 1/9, i, j, k = 1,2,3, X= = 1 or x; = Xk = -1; and
P(X1 = X1, X2 = X2 X3 = X3) = 1/36 for the remaining cases. It is easy to
check that EXi = EX1X2X3 = 0, EXiXj > 0 and cum(X1, X2, X3) =
P(X1 > 0 X2 > 0, X3 > 0) - P(X1 > o)P(X2 > o)P(X3 > 0)
= -(11/36)3 < 0.
REMARK 10. Let EXi > 0 and assume (X1, X2, X3) PUOD. This does not
imply cum(X1, X2, X3) ? 0 as is shown in Example 5.
EXAMPLE 5. Let (X1, X2, X3) have the following distribution. It is easy to
check that (X1, X2, X3) is PUOD and that EXi = 0, but cum(X1, X2, X3) =
-0.15 < 0.
Xi X2 X3 Pr
1 1 1 0.35
1 1 -1 0.05
1 -1 1 0.05
-1 1 1 0.05
0 0 -1 0.05
-1 0 0 0.05
0 -1 0 0.05
-1 -1 -1 0.35

193.0.118.39 on Sun, 23 May 2021 13:27:39 UTC
REMARK 11. Let (X1, X2, X3) be associated. It need not be true that
cum(X1, X2, X3) 2 0 as is shown in Example 6.
EXAMPLE 6. Assume (X1, X2, X3) are binary r.v.'s with distribution P(X1 =
X2 = X3 = 0) = 0.3; P(X1 = X1, X2 = X2, X3 = X3) = 0.1 for all othe
{X1, X2, X3} E (0, 1}3.
Checking all binary nondecreasing functions F(Xj, X2, X3) and A(X1, X2
we have cov(F, A) 2 0. Thus (X1, X2, X3) are associated but cum(Xl, X2, X3
-0.012 < 0.
REMARK 12. If (X, Y) are binary and cov(X, Y) 2 0, then (X, Y) is associ-
ated as was shown in Barlow and Proschan (1981). However, if (X1, X2, X3) are
binary, then cov(Xi, Xj) ? 0, i, j = 1,2,3, and cum(X1, X2, X3) > 0 do n
imply (X1, X2, X3) associated as is seen in Example 7.
EXAMPLE 7. Assume (X1, X2, X3) are binary r.v.'s with the following distri-
bution. Then cov(Xi, Xj) = 1/180 > 0 and cum(Xl, X2, X3) = 1/135 > 0. Ho
ever, for the increasing functions max(Xl, X2) and max(Xl, X3),
cov(max(Xl, X2),max(Xl, X3)) = -1/900 < 0,

so (X1, X2, X3) are not associated.
X1 X2 X3 Pr
0 0 0 0
0 0 1 1/30
0 1 0 1/30
1 0 0 1/30
1 1 0 1/10
1 0 1 1/10
0 1 1 1/10
1 1 1 6/10
If we add some restrictions, some results can be obtained. We will give these
and omit the easy proofs.
PROPOSITION 1. If cov(Xi, Xj) = Q for i, j = 1,2,3, then (X1, X2, X3)

PUOD implies cum(X1, X2, X3) > 0 and (X1, X2, X3) PLOD implies
cum(X1, X2, X3) < 0.
REMARK 13. Notice that under the preceding assumptions we have the
peculiar situation that PUOD o NLOD and PLOD NUOD.

193.0.118.39 onf:ffff:ffff on Thu, 01 Jan 1976 12:34:56 UTC
PROPOSITION 2. Let (X1, X2, X3) be a binary trivariate r.v. If

cov(Xi, Xj) 2 0, cum(Xl, X2, X3) > 0 and additionally conditio
then (X1, X2, X3) is associated for i, j, k = 1, 2,3.
(M) f COV(Xi 11 X Xk, X 1L Xk) 2 0,

Cov(Xi J Xi, Xi L Xk) ? 0
where
Xi A1 Xi= 1 - (1 - Xj (l - Xi) = max(Xi, Xi).
To prove Proposition 2, we need to check for all binary increasing functions r

and A that cov(F(Xl, X2, X3), A(X1, X2, X3)) 2 0. We leave this to the reader.
REFERENCES
BARLOW, R. and PROSCHAN, F. (1981). Statistical Theory of Reliability and Lif

Again, Silver Springs, Md.
BLOCK, H. and TING, M. (1981). Some concepts of multivariate dependence. Comm. Statist.
A -Theory Methods 10 749-762.
BRILLINGER, D. R. (1975). Time Series Data Analysis and Theory. Holt, Rinehart and Winston
New York.
CHHETRY, D., KIMELDORF, G. and ZAHED, H. (1986). Dependence structures in which uncorrela
ness implies independence. Statist. Probab. Lett. 4 197-201.
HOEFFDING, W. (1940). Masstabinvariante Korrelations-theorie. Schr. Math. Inst. Univ. Berlin 5
181-233.
JOAG-DEv, K. (1983). Independence via uncorrelatedness under certain dependence structures
Probab. 11 1037-1041.
JOGDEO, K. (1968). Characterizations of independence in certain families of bivariate and mult
variate distributions. Ann. Math. Statist. 39 433-441.
LEHMANN, E. L. (1966). Some concepts of dependence. Ann. Math. Statist. 37 1137-1153.
NEWMAN, C. M. and WRIGHT, A. L. (1981). An invariance principle for certain dependent sequences.
Ann. Probab. 9 671-675.
DEPARTMENT OF MATHEMATICS DEPARTMENT OF MATHEMATICS

AND STATISTICS UNIVERSITY OF SCIENCE AND TECHNOLOGY
UNIVERSITY OF PITTSBURGH OF CHINA
PITTSBURGH, PENNSYLVANIA 15260 HEFEI, ANHUI 230029
THE PEOPLE'S REPUBLIC OF CHINA

193.0.118.39 on Sun, 23 May 2021 13:27:39 UTC

This Content Downloaded From 193.0.118.39 On Sun, 23 May 2021 13:27:39 UTC

Uploaded by

Document Information

Original Description:

Original Title

Copyright

Available Formats

Share this document

Share or Embed Document

Sharing Options

Did you find this document useful?

Is this content inappropriate?

Copyright:

Available Formats

This Content Downloaded From 193.0.118.39 On Sun, 23 May 2021 13:27:39 UTC

Uploaded by

Copyright:

Available Formats

A Multivariate Extension of Hoeffding's Lemma

Author(s): Henry W. Block and Zhaoben Fang

Stable URL: https://www.jstor.org/stable/2243993

Institute of Mathematical Statistics is collaborating with JSTOR to digitize, preserve and

This content downloaded from

A MULTIVARIATE EXTENSION OF HOEFFDING'S LEMMA

BY HENRY W. BLOCK1 2 AND ZHAOBEN FANG2

Hoeffding's lemma gives an integral representation of the covariance of

This identity has been found to be a usefu

1. Introduction. It is well known that if a random variable (r.v.) X has

(2) EXn = n[ Xn-l(l - F(x)) dx -0 xn-lF(x) dx].

(3) EXY- EXEY= J {Fx, y(x, y) - Fx(x)Fy(y)1 dxdy.

Lehmann (1966) used this result to cha

Received December 1986; revised February 1988.

This content downloaded from

K(u1, U2, U3) = {P(B1A2A3) + P(B1)P(A2A3)

(5) Cov(X, Y) = ff Cov(Xx(x) Xy(y)) cdy,

DEFINITION 1. The rth-order joint cumulant of (X1,..., Xr) denoted by

(6) cum(Xl,..., Xr) = (1)P' (p - 1)! (E H Xj )...( E Xj),

where summation extends over all partitions (v1,..., vp), p = 1,2

It can be shown [see Brillinger (1975)] that cum(X1,..., Xr) is the co

(i) cum(a1Xl, ..., arXr) = a, ... ar cum(Xi, * * Xr);

This content downloaded from

LEMMA 1. If EjXij' < o0,

PROOF. In the case of m= 2, p = m-1 = 1 and (7) reduces to the

EX1X2 - EX1EX2 = cum(Xk, kE ej = cov(X1, X2).

EX1 Xm-EX, EXm = EX, Xm-2Xm-lXm

Introduce the new notation Y, = X1, i = 1,..., m - 2, Ym-1 = X,-,~Xm

Our main result is the following.

(9) cum(X1,... Xr) = J f cum(Xx1(Xi),.*., XXr(Xr)) dX1 dXr,

where xi(xi) = 1 if Xi >

To prove the theorem,

LEMMA 2. If EIX1 ... Xrj < 0o, we have

EX1 . Xr= (-1)r .. | (F(X) - E ?(Xj)F(X(j))

+ *fI * ~j * + (-1 ... dxr,

where E(xi) = 1 if xi > C, 0 otherwise. Here x(il ik) represents

This content downloaded from

PROOF. First, we have the identity

(11) f_(eXi | (Xi) - I(-.,x.(Xi)) dXL,

where I(-,_ x(Xi) = 1 if Xi < xL, 0 otherwise. Then by

EX, Xr = E n f [e(xi) - I(,xi(Xi)1 dxi}

... I H[ [-(xi) - I(x, xi,(Xi)] dX1 .

Jf__ x lxE(H E e(xi) -I I(-x,x](Xi)] dxl * C r

which is just the right side of (10). 0

REMARK 1. It is easy to see that (1) can be written as EX = fr00(e(x) -

REMARK 2. Using the identity

X['i f_| ni ]i[e(xi) -I(_oxil(Xi)] dxi,

we can also obtain an extension of (2), i.e.,

x 'F(xl, ..., xk) - E E(X1)F(x(j))

+ ** +(-1)knE(Xi) dX .*. dx. ,

where ni1, n, + +nk < r.

This content downloaded from

REMARK 3. When the X.'s are nonnegative (12) reduces to

EXfl1 ... = | fn* ... * nX-1 ... xnk-1

where F(x1,..., Xk)

The proof of the T

= E(-1)p(p p-)! (E X1) ... (E HI Xj)

= E(X1 *... * Xr)- E rE XjX)E( H Xj)

(-1)r .. * (F(x) - F, e(xj)F(xlj))

+ * +(_ )r H E(Xj) dx1 ... dXr

+ +(-1)nP2 HI E(Xi)} dx.. dXr

where n1,i is the number of indic

+ fI ~j * + (-1 ... dxr,

= E(X1 ... Xr)- E rE XjX)E( H Xj)