23 Aos2339
Methodologically, we deal with a different problem from Leung and Drton (2018) and Drton,
Han and Shi (2020). They developed tests for mutual independence among the coordinates of
a single high-dimensional random vector, whereas our proposed tests focus on testing inde-
pendence between two random vectors when at least one dimension is large. This implies that
the components of either x or y may not be independent of each other, even under the null hypothesis studied in this paper. This makes the theoretical derivations for our proposed tests much more challenging than those for the tests studied by Leung and Drton (2018) and Drton, Han and Shi (2020) under their null hypothesis. In particular, our derivations shed some light
on the behaviors of U -statistics in the high-dimensional setting under dependence. When
max(p, q) → ∞, which is very unlike the fixed dimension or mutual independence cases,
both the dependence between variables and the high dimensions bring ambiguity in assessing
the relative orders of terms in the Hoeffding decompositions. Even when x and y are indepen-
dent, the leading term in the variance decomposition for the proposed test statistics depends
on unknown dependence structures. Moreover, the proposed tests allow for universal asymp-
totics, in the sense that their null asymptotic distributions hold without imposing any explicit
restriction on the growth rates of p and q when both are regarded as functions of n, although
a few implicit restrictions are imposed in Assumption 1. The details will be discussed in
Propositions 2 and 3. By contrast, Drton, Han and Shi (2020) require that $\log(p) = o(n^{\theta})$ for a constant $\theta$. We further provide an explicit uniform bound on the error of normal approximation, and study the asymptotic relative efficiency of our proposed test based on nonlinear
and nonmonotone dependence metrics.
We summarize our main contributions as follows:
(a) Under H0 , we show that all three rank-based tests have null distributions that converge
to normal ones under the setting that max(p, q) diverges to infinity as n grows. The critical
values can thus be determined by normal approximation, avoiding expensive computation to
obtain critical values through permutation. We further derive the explicit rates of convergence
for these rank-based tests. Thanks to the blessing of dimension, the normal approximation is
more accurate as the dimensions increase.
(b) For bivariate normal variables with correlation ρ, Drton, Han and Shi (2020) estab-
lished the relationship that
$$D,\ R,\ \tau^{*} \asymp \rho^{2} \quad \text{as } \rho \to 0.$$
However, the explicit relationships between D, R, τ∗ and ρ remain unexplored. We fill this gap
and use the explicit relationships to conduct power comparisons from the asymptotic effi-
ciency perspective.
(c) Theoretically, we compare the relative efficiency with the distance covariance/correla-
tion based tests for high-dimensional data: Székely and Rizzo (2013)’s Student t-test and Zhu
et al. (2020)’s aggregated distance covariance based test. We show that, under a Gaussian
equicorrelation alternative, our proposed tests have significant power gain when the components of x or y have very different scales. In addition, the rank tests based on D, τ∗ and R are sorted in descending order of asymptotic efficiency.
This paper is organized as follows. In Section 2, we propose the rank-based tests, and study
their properties. In Section 3, we compare the asymptotic relative efficiency of our proposed
tests with the distance covariance or correlation based tests. In Section 4, we assess the finite-
sample performance of our proposed tests through simulation studies and an application. We
discuss some extensions in Section 5. All proofs and additional simulations are relegated to
the Supplementary Material (Zhou et al. (2024)).
2.1. Rank-based indices defined by U -statistic theory. Suppose that we draw a ran-
dom sample, {(xi , yi ) : i = 1, . . . , n}, from the joint distribution of (x, y), where xi =
TESTING HIGH-DIMENSIONAL INDEPENDENCE 187
$(X_{i,1}, \dots, X_{i,p})^{T}$ and $y_i = (Y_{i,1}, \dots, Y_{i,q})^{T}$. To avoid ties among the observations, we consider only continuous random vectors in the sequel. We are interested in the class of correlation measures that are I-consistent, D-consistent and monotone transformation-invariant, and focus on three typical examples in this paper: Hoeffding’s D, Blum–Kiefer–Rosenblatt’s R and Bergsma–Dassios–Yanagimoto’s τ∗. To lay down the foundation of these
three tests, we first review the estimates of these correlations under the framework of U -
statistic theory.
Let $h : (\mathbb{R}^1 \times \mathbb{R}^1)^d \to \mathbb{R}^1$ be a fixed kernel of order $d \ge 2$ that is symmetric, that is, invariant to permutations of its arguments. For any pair of indices $k \in \{1, \dots, p\}$ and $l \in \{1, \dots, q\}$, a standard U-statistic induced by the kernel h takes the form
$$U_h^{(kl)} = \{C(n, d)\}^{-1} \sum_{1 \le i_1 < \cdots < i_d \le n} h\{(X_{i_1,k}, Y_{i_1,l}), \dots, (X_{i_d,k}, Y_{i_d,l})\},$$
where C(n, d) denotes the number of all combinations of d distinct elements from {1, . . . , n}.
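As a concrete illustration of the definition above, the following minimal Python sketch averages a symmetric kernel over all C(n, d) index subsets. The names `u_statistic` and `kendall_kernel` are hypothetical, and the order-2 Kendall kernel is used only because it is short; it is not one of the three kernels studied in this paper. Brute-force enumeration costs on the order of n^d kernel evaluations, so the sketch is meant for small n only.

```python
from itertools import combinations
from math import comb


def u_statistic(kernel, x, y, d):
    """Average a symmetric order-d kernel over all C(n, d) index subsets."""
    n = len(x)
    if len(y) != n or n < d:
        raise ValueError("x and y must have the same length n >= d")
    total = 0.0
    for idx in combinations(range(n), d):
        # Each kernel argument is one paired observation (X_{i,k}, Y_{i,l}).
        total += kernel(*[(x[i], y[i]) for i in idx])
    return total / comb(n, d)


def kendall_kernel(z1, z2):
    """Illustrative order-2 kernel (Kendall's tau), an assumption for this demo."""
    s = (z1[0] - z2[0]) * (z1[1] - z2[1])
    return (s > 0) - (s < 0)


# Toy data with a perfectly increasing relationship.
x = [0.3, 1.2, 2.5, 3.1, 4.8]
y = [1.0, 2.0, 3.5, 4.0, 9.0]
tau_hat = u_statistic(kendall_kernel, x, y, d=2)
```

Any of the higher-order kernels can be plugged in the same way, provided the kernel accepts d paired observations and is symmetric in them.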
For notational clarity, we define $z_{i,kl} = (X_{i,k}, Y_{i,l})$ for $1 \le i \le n$, $1 \le k \le p$ and $1 \le l \le q$, and $P_m$ as the set of all permutations of $\{1, \dots, m\}$. We further define $\psi(z_1, z_2, z_3) = I(z_2 < z_1) - I(z_3 < z_1)$, where $I(\cdot)$ is an indicator function, and $\omega(z_1, z_2, z_3, z_4) = I(z_1 \vee z_3 < z_2 \wedge z_4) + I(z_1 \wedge z_3 > z_2 \vee z_4) - I(z_1 \vee z_2 < z_3 \wedge z_4) - I(z_1 \wedge z_2 > z_3 \vee z_4)$, where $a \vee b = \max(a, b)$ and $a \wedge b = \min(a, b)$.
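The two building blocks ψ and ω can be written directly as indicator arithmetic. The sketch below is illustrative only and the function names are ours. It assumes the usual concordance reading of ω, in which a "greater than" indicator requires the minimum of one pair to exceed the maximum of the other, and it cross-checks ω against the equivalent sign form sign(|z1 − z2| + |z3 − z4| − |z1 − z3| − |z2 − z4|) on all orderings of four distinct values.

```python
from itertools import permutations


def psi(z1, z2, z3):
    """psi(z1, z2, z3) = I(z2 < z1) - I(z3 < z1)."""
    return int(z2 < z1) - int(z3 < z1)


def omega(z1, z2, z3, z4):
    """Concordance-type indicator combination (assumed min/max convention)."""
    return (int(max(z1, z3) < min(z2, z4)) + int(min(z1, z3) > max(z2, z4))
            - int(max(z1, z2) < min(z3, z4)) - int(min(z1, z2) > max(z3, z4)))


def omega_sign_form(z1, z2, z3, z4):
    """Equivalent sign representation, used here only as a cross-check."""
    s = abs(z1 - z2) + abs(z3 - z4) - abs(z1 - z3) - abs(z2 - z4)
    return (s > 0) - (s < 0)


# Integer inputs keep the absolute-value arithmetic exact.
forms_agree = all(
    omega(*z) == omega_sign_form(*z) for z in permutations((1, 2, 3, 4))
)
```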
The kernels of the aforementioned three correlations are given as follows.
$$g_h^{(c)}(z_{1,kl}, \dots, z_{c,kl}) = h_c(z_{1,kl}, \dots, z_{c,kl}) - \sum_{j=1}^{c-1} \sum_{1 \le i_1 < \cdots < i_j \le c} g_h^{(j)}(z_{i_1,kl}, \dots, z_{i_j,kl}), \tag{2}$$
188 ZHOU, XU, ZHU AND LI
where $g_h^{(1)}(z_{1,kl}) = h_1(z_{1,kl})$. The kernel h and the U-statistic $U_h^{(kl)}$ are nondegenerate if $\mathrm{var}\{g_h^{(1)}(z_{1,kl})\} > 0$, and degenerate if $\mathrm{var}\{g_h^{(1)}(z_{1,kl})\} = 0$. It can be verified that, for $j \in \{i_1, \dots, i_6\}$,
$$E\{\psi(Z_{i_1}, Z_{i_2}, Z_{i_3})\psi(Z_{i_1}, Z_{i_4}, Z_{i_5}) \mid Z_j\} = E\{\psi(Z_{i_1}, Z_{i_3}, Z_{i_4})\psi(Z_{i_2}, Z_{i_5}, Z_{i_6}) \mid Z_j\} = E\{\omega(Z_{i_1}, Z_{i_2}, Z_{i_3}, Z_{i_4}) \mid Z_j\} = 0. \tag{3}$$
Therefore, $h^{(D)}$, $h^{(R)}$ and $h^{(\tau^*)}$ are degenerate if $X_k$ and $Y_l$ are independent.
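2.2. The family of proposed tests. To test H0 in (1), we consider the test statistic that aggregates the pairwise U-statistics $U_h^{(kl)}$ to form
$$T_h = \{\vartheta_h C(d, 2)\}^{-1} \{C(n, 2)\}^{1/2} \sum_{k=1}^{p} \sum_{l=1}^{q} U_h^{(kl)}, \tag{4}$$
where d is the order of the kernel, and $\vartheta_h$ is a finite adjustment factor,
$$d = \begin{cases} 5, & \text{if } h = h^{(D)}, \\ 6, & \text{if } h = h^{(R)}, \\ 4, & \text{if } h = h^{(\tau^*)}, \end{cases} \quad \text{and} \quad \vartheta_h = \begin{cases} 40, & \text{if } h = h^{(D)}, \\ 60, & \text{if } h = h^{(R)}, \\ 2/3, & \text{if } h = h^{(\tau^*)}. \end{cases}$$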
We now provide an estimate for the variance of $T_h$. Let
$$A_1^{(k)}(u, v) = F_{X_k}^2(u) + F_{X_k}^2(v) - 2\max\{F_{X_k}(u), F_{X_k}(v)\} + 2/3, \quad k = 1, \dots, p,$$
$$A_2^{(l)}(u, v) = F_{Y_l}^2(u) + F_{Y_l}^2(v) - 2\max\{F_{Y_l}(u), F_{Y_l}(v)\} + 2/3, \quad l = 1, \dots, q,$$
where $F_{X_k}(u) = \mathrm{pr}(X_k \le u)$ and $F_{Y_l}(v) = \mathrm{pr}(Y_l \le v)$. Both $A_1^{(k)}(u, v)$ and $A_2^{(l)}(u, v)$ satisfy $E\{A_1^{(k)}(X_k, v)\} = E\{A_1^{(k)}(u, X_k)\} = E\{A_2^{(l)}(Y_l, v)\} = E\{A_2^{(l)}(u, Y_l)\} = 0$. By the Hoeffding decomposition, we have, under H0,
$$\{\vartheta_h C(d, 2)\}^{-1} U_h^{(kl)} = \{C(n, 2)\}^{-1} \sum_{1 \le i < j \le n} A_1^{(k)}(X_{ik}, X_{jk}) A_2^{(l)}(Y_{il}, Y_{jl}) + R_h^{(kl)},$$
where $R_h^{(kl)}$ is a remainder term. It follows that $T_h = J^{(1)} + J_h^{(2)}$, where
$$J^{(1)} = \{C(n, 2)\}^{-1/2} \sum_{1 \le i < j \le n} \sum_{k=1}^{p} \sum_{l=1}^{q} A_1^{(k)}(X_{ik}, X_{jk}) A_2^{(l)}(Y_{il}, Y_{jl}), \quad \text{and} \quad J_h^{(2)} = \{C(n, 2)\}^{1/2} \sum_{k=1}^{p} \sum_{l=1}^{q} R_h^{(kl)}.$$
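The centering property of the functions A_1 and A_2 stated above can be checked numerically. The sketch below is only an illustration: it works on the CDF scale, using the fact that F(X) is Uniform(0, 1) for a continuous X, and approximates the expectation over the first argument by a midpoint rule. The function names and the grid size are our own choices.

```python
def a_centered(fu, fv):
    """A(u, v) expressed through the CDF values fu = F(u) and fv = F(v)."""
    return fu ** 2 + fv ** 2 - 2.0 * max(fu, fv) + 2.0 / 3.0


def mean_over_first_argument(fv, grid_size=20000):
    """Midpoint-rule approximation of E{A(X, v)} with F(X) ~ Uniform(0, 1)."""
    total = 0.0
    for i in range(grid_size):
        fu = (i + 0.5) / grid_size
        total += a_centered(fu, fv)
    return total / grid_size


# The conditional mean should vanish for every fixed second argument.
checks = [mean_over_first_argument(fv) for fv in (0.1, 0.5, 0.9)]
```

All three entries of `checks` are zero up to discretization error, in line with the degeneracy used in the Hoeffding decomposition.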
If both p and q are fixed, by the standard U-statistic theory, $J^{(1)}$ plays a dominating role in determining the asymptotic null distributions of our proposed rank-based indices. This motivates us to anticipate that the phenomenon persists in high-dimensional settings. In Lemma 1 of the Supplementary Material, we show that, under H0, $J^{(1)}$ remains the leading term and $J_h^{(2)}$ is asymptotically negligible even when max(p, q) → ∞. Therefore, under H0, the variance of $T_h$, denoted as $S^2$, is dominated by
$$\mathrm{var}\{J^{(1)}\} = \sum_{k_1=1}^{p} \sum_{k_2=1}^{p} E\{A_1^{(k_1)}(X_{1k_1}, X_{2k_1}) A_1^{(k_2)}(X_{1k_2}, X_{2k_2})\} \times \sum_{l_1=1}^{q} \sum_{l_2=1}^{q} E\{A_2^{(l_1)}(Y_{1l_1}, Y_{2l_1}) A_2^{(l_2)}(Y_{1l_2}, Y_{2l_2})\}. \tag{5}$$
By the definition of J(1) , the variance term, S 2 , does not depend on h under H0 in an asymp-
totic sense.
Proposition 1 serves as a basis to construct an unbiased estimate of S 2 .
$I(X_{3k_2} < X_{2k_2}) \mid X_{1k_1}, X_{2k_2}\}]$, for $k_1, k_2 = 1, \dots, p$. In parallel, $E\{A_2^{(l_1)}(Y_{1l_1}, Y_{2l_1}) A_2^{(l_2)}(Y_{1l_2}, Y_{2l_2})\}$ can be simplified to $4E[\mathrm{cov}^2\{I(Y_{3l_1} < Y_{1l_1}), I(Y_{3l_2} < Y_{2l_2}) \mid Y_{1l_1}, Y_{2l_2}\}]$ for $l_1, l_2 = 1, \dots, q$.
which implies that $S^2 \to \infty$ when either p or q or both diverge to infinity. Under H0, the unbiased estimate of $S^2$ is defined as follows:
$$\widehat{S}^2 = \sum_{k_1=1}^{p} \sum_{k_2=1}^{p} \sum_{l_1=1}^{q} \sum_{l_2=1}^{q} \widehat{S}_1^{(k_1 k_2)} \widehat{S}_2^{(l_1 l_2)}, \tag{6}$$
where
$$\widehat{S}_1^{(k_1 k_2)} = \{n(n-1)(n-2)(n-3)(n-4)(n-5)\}^{-1} \sum_{(i_1, \dots, i_6)}^{n} \psi(X_{i_1 k_1}, X_{i_3 k_1}, X_{i_4 k_1}) \psi(X_{i_2 k_1}, X_{i_5 k_1}, X_{i_6 k_1})$$
Next, we establish the asymptotic null distribution of Th , which serves as a basis to provide a
decision rule for our proposed testing procedure.
We use the martingale central limit theorem (Hall and Heyde (1980)) to derive the asymptotic
null distribution of Th . The following assumption, which is closely related to condition (2.1)
of Hall (1984), is imposed to facilitate the use of martingale central limit theorem.
Assumption 1 is indeed very mild. To gain insights into this assumption, we define
(Hk gk )(·) = E{Vk (z, ·)gk (z)}, where z can be x or y, and k = 1, 2. Let {λkl , l = 1, . . .} be
the eigenvalues of $H_k$. To obtain a normal limit for the statistic, the following condition is typically required (Hall (1984), p. 3). To be specific, it is assumed that
$$\Big(\sum_{l=1}^{\infty} \lambda_{kl}^{t}\Big)^{2} \Big/ \Big(\sum_{l=1}^{\infty} \lambda_{kl}^{2}\Big)^{t} \to 0, \tag{7}$$
for some t > 2. By the definitions of the $H_k$'s, we have
$$E\{V_k(z_1, z_2) V_k(z_2, z_3) V_k(z_3, z_4) V_k(z_4, z_1)\} = \sum_{l=1}^{\infty} \lambda_{kl}^{4}, \qquad E\{V_k(z_1, z_2)^2\} = \sum_{l=1}^{\infty} \lambda_{kl}^{2}.$$
Assumption 1 is equivalent to condition (7) with t = 4. Similar assumptions are widely used in the literature. See, for example, conditions (18) and (19) of Theorem 1 in Gao et al. (2021),
Assumption D5 in the supplement of Zhu et al. (2020) and Assumption B.1 in Chakraborty
and Zhang (2021).
We remark that Assumption 1 does not depend on the ordering of the components of
x or y. Therefore, as long as Assumption 1 is satisfied with a particular permutation, it will
be satisfied with all possible permutations.
In the sequel, we explore under what circumstances Assumption 1 can be satisfied. To
be precise, Proposition 2 reveals that this assumption is satisfied with banded dependence
structure under very general designs; Proposition 3 ensures that it is satisfied with an auto-
regressive structure, covariance structures with bounded or spiked eigenvalues under Gaus-
sian designs.
We introduce some notation. Let πm be a permutation of {1, . . . , m} for a generic integer
m, and Pm be a collection of all possible permutations. By definition, πm ∈ Pm . Let πm (k)
be the kth element of πm .
In general, as long as m1 and m2 are not very large, Assumption 1 holds true under mild conditions. It is important to remark here that Assumption 1 itself imposes an implicit restriction between the sample size n and the dimensions p and q. In particular, if m1 = 0 or m2 = 0, which corresponds to the special case that the coordinates of x or those of y are all independent, Assumption 1 takes a much simpler form. To be precise, if the coordinates of x are all independent, $E\{V_1(x_1, x_2)^4\}/[nE^2\{V_1(x_1, x_2)^2\}] = O\{1/(np) + 1/n\}$ and $E\{V_1(x_1, x_2)V_1(x_2, x_3)V_1(x_3, x_4)V_1(x_4, x_1)\}/E^2\{V_1(x_1, x_2)^2\} = O(1/p)$, both of which converge to 0 as n and p diverge to infinity. Similar claims can be made for y in parallel. In other words, in the particular case that the coordinates of x or those of y are all independent, Assumption 1 is always true as long as p or q diverges in an arbitrary way that is completely free of n.
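PROPOSITION 3. Suppose both x and y are normally distributed with mean zero, $\mathrm{cov}(x, x^T) = \Sigma_1 = (\sigma_{1,k_1k_2})$, and $\mathrm{cov}(y, y^T) = \Sigma_2 = (\sigma_{2,l_1l_2})$. Then Assumption 1 is satisfied if, as p → ∞,
$$\frac{\sum_{k_1,k_2=1}^{p} \big(\sum_{k_3=1}^{p} \sigma_{1,k_1k_3}\sigma_{1,k_3k_2}\big)^{2} + \frac{1}{n} \sum_{k_1=1}^{p} \big(\sum_{k_2=1}^{p} \sigma_{1,k_1k_2}^{2}\big)^{3}}{\big(\sum_{k_1,k_2=1}^{p} \sigma_{1,k_1k_2}^{2}\big)^{2}}$$
and, as q → ∞,
$$\frac{\sum_{l_1,l_2=1}^{q} \big(\sum_{l_3=1}^{q} \sigma_{2,l_1l_3}\sigma_{2,l_3l_2}\big)^{2} + \frac{1}{n} \sum_{l_1=1}^{q} \big(\sum_{l_2=1}^{q} \sigma_{2,l_1l_2}^{2}\big)^{3}}{\big(\sum_{l_1,l_2=1}^{q} \sigma_{2,l_1l_2}^{2}\big)^{2}}$$
converge to 0.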
The above conditions can be satisfied by a broad class of commonly seen covariance structures, including the following three examples. Throughout, we only consider $\Sigma_1$; similar derivations apply to $\Sigma_2$ as well. The first example is the autoregressive dependence structure, which has the form $\Sigma_1 = (\rho_1^{|k_1 - k_2|})$, for $\rho_1 \in (-1, 1)$. Simple algebra yields that
$$\sum_{k_1,k_2=1}^{p} \Big(\sum_{k_3=1}^{p} \sigma_{1,k_1k_3}\sigma_{1,k_3k_2}\Big)^{2} = \frac{p\{1 - \rho_1^{8} + 8\rho_1^{2}(1 - \rho_1^{4})\}}{(1 - \rho_1^{2})^{4}} \{1 + o(1)\}$$
and
$$\sum_{k_1=1}^{p} \Big(\sum_{k_2=1}^{p} \sigma_{1,k_1k_2}^{2}\Big)^{3} = \frac{p\{1 - \rho_1^{12} + 3\rho_1^{2}(1 + \rho_1^{2})(1 - \rho_1^{6})\}}{(1 - \rho_1^{2})^{4}(1 + \rho_1^{2} + \rho_1^{4})} \{1 + o(1)\}.$$
In addition,
$$\sum_{k_1,k_2=1}^{p} \sigma_{1,k_1k_2}^{2} = \frac{p(1 + \rho_1^{2})}{(1 - \rho_1^{2})} \{1 + o(1)\}.$$
It can be clearly seen from the above displays that Assumption 1 holds for the autoregressive covariance structure.
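The last two autoregressive closed forms can be sanity-checked by brute force. The following sketch is illustrative only: the function name, the dimension p = 400 and the value ρ1 = 0.5 are our own choices, and the finite-p ratios differ from 1 by boundary effects of order 1/p.

```python
def ar_sums(p, rho):
    """Brute-force sums for sigma_{k1 k2} = rho ** |k1 - k2|.

    Returns (sum over k1, k2 of sigma^2,
             sum over k1 of (sum over k2 of sigma^2) ** 3).
    """
    r = rho * rho  # sigma_{k1 k2}^2 = r ** |k1 - k2|
    row_sums = [sum(r ** abs(k1 - k2) for k2 in range(p)) for k1 in range(p)]
    return sum(row_sums), sum(s ** 3 for s in row_sums)


p, rho = 400, 0.5
sum_sq, sum_cubed_rows = ar_sums(p, rho)

r2 = rho ** 2
limit_sq = p * (1 + r2) / (1 - r2)
limit_cubed = (
    p * (1 - rho ** 12 + 3 * r2 * (1 + r2) * (1 - rho ** 6))
    / ((1 - r2) ** 4 * (1 + r2 + rho ** 4))
)

# Both ratios should be close to 1 for large p.
ratio_sq = sum_sq / limit_sq
ratio_cubed = sum_cubed_rows / limit_cubed
```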
The second example is the covariance structure with all eigenvalues being bounded. To be precise, let $\lambda_{11} \le \cdots \le \lambda_{1p}$ be the eigenvalues of $\Sigma_1$, all of which are bounded. A direct consequence is that the eigenvalues of $\Sigma_1^{2}$, $\Sigma_1^{4}$ and $\Sigma_1^{6}$ are all bounded. It can thus be verified that
$$\sum_{k_1,k_2=1}^{p} \Big(\sum_{k_3=1}^{p} \sigma_{1,k_1k_3}\sigma_{1,k_3k_2}\Big)^{2} \le p\lambda_{1p}^{4}, \qquad \sum_{k_1=1}^{p} \Big(\sum_{k_2=1}^{p} \sigma_{1,k_1k_2}^{2}\Big)^{3} \le p\lambda_{1p}^{6}, \qquad \text{and} \qquad \sum_{k_1,k_2=1}^{p} \sigma_{1,k_1k_2}^{2} \ge p\lambda_{11}^{2}.$$
Assumption 1 thus holds trivially as long as the dimensions diverge.
The third example is the covariance structure with spiked eigenvalues. To be precise, we assume that the largest $b_1$ eigenvalues of $\Sigma_1$ are unbounded, and the remaining $(p - b_1)$ eigenvalues are bounded from above by a finite constant $M_1$. We further assume that $(p - b_1) \to \infty$ and $(p - b_1)\lambda_{11}^{4} \to \infty$. One of the two quantities with regard to $\Sigma_1$ in Proposition 3 is bounded by
$$\frac{M_1^{4}(p - b_1) + b_1\lambda_{1p}^{4} + n^{-1}\{M_1^{6}(p - b_1) + b_1\lambda_{1p}^{6}\}}{(p - b_1)^{2}\lambda_{11}^{4} + 2(p - b_1)b_1\lambda_{11}^{2}\lambda_{1,p-b_1+1}^{2}}.$$
It follows that sufficient conditions to ensure Assumption 1 are
$$\lambda_{1p} = O\{(p - b_1)^{1/3}\lambda_{11}^{2/3}b_1^{-1/6}\} \quad \text{or} \quad \lambda_{1p} = O\{(p - b_1)^{1/6}\lambda_{11}^{1/3}\lambda_{1,p-b_1+1}^{1/3}\},$$
where $b_1$ can be bounded or divergent, and $\lambda_{11}$ is allowed to shrink to zero.
THEOREM 1. Suppose x and y are continuous, and Assumption 1 holds. Under H0 in (1), as max(p, q) → ∞ and n → ∞,
$$\{\vartheta_h C(d, 2)\}^{-1} \{C(n, 2)\}^{1/2} \sum_{k=1}^{p} \sum_{l=1}^{q} U_h^{(kl)} \Big/ S \xrightarrow{D} N(0, 1),$$
where $\xrightarrow{D}$ stands for “convergence in distribution.”
The following theorem shows that $\widehat{S}^2$ is an unbiased and ratio consistent estimate of $S^2$ under the null.
THEOREM 2. Under the conditions in Theorem 1, $\widehat{S}^2/S^2 \xrightarrow{P} E(\widehat{S}^2/S^2) = 1$, where $\xrightarrow{P}$ denotes “convergence in probability.” Accordingly, $T_h \xrightarrow{D} N(0, 1)$.
Theorems 1 and 2 establish the asymptotic normality of $T_h$ under H0. The critical value at significance level α can be computed from the 100(1 − α)% normal quantile. An important implication of Theorems 1 and 2 is that our proposed tests are applicable to high-dimensional situations where at least one of the two dimensions is sufficiently large.
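In practice, the decision rule implied by Theorems 1 and 2 is a one-sided comparison with a normal quantile. A minimal Python sketch follows. It assumes that the supplied statistic has already been standardized so that it is approximately N(0, 1) under H0, and the function name is hypothetical.

```python
from statistics import NormalDist


def reject_independence(t_stat, alpha=0.05):
    """One-sided normal test: reject H0 when t_stat exceeds the upper quantile.

    t_stat is assumed to be standardized (approximately N(0, 1) under H0).
    Returns (reject, critical_value, p_value).
    """
    standard_normal = NormalDist()
    critical_value = standard_normal.inv_cdf(1.0 - alpha)
    p_value = 1.0 - standard_normal.cdf(t_stat)
    return t_stat > critical_value, critical_value, p_value


reject, critical_value, p_value = reject_independence(2.0, alpha=0.05)
```

No permutation step is needed, which is the computational advantage of the normal approximation noted in the contributions.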
Let $z_i = (x_i^T, y_i^T)^T$, for i = 1, . . . , n. Define $V(z_1, z_2) = V_1(x_1, x_2)V_2(y_1, y_2)$. The following theorem provides an explicit uniform bound on the error of normal approximation to the null distribution of $T_h$.
To reveal how the rate of convergence depends on the sample size and dimensions, we consider the situation in which all the coordinates of x and y are statistically independent. In such a situation, we have
$$\sup_{x \in \mathbb{R}} |P(T_h \le x) - \Phi(x)| \le C\{(pq)^{-1/5} + n^{-1/5}\},$$
which suggests that the normal approximation of our proposed test becomes more and more accurate as n and (pq) diverge to infinity.
for c = 1, . . . , d, where
$$G_h^{(c)}(z_1, \dots, z_c) = \sum_{k=1}^{p} \sum_{l=1}^{q} g_h^{(c)}(z_{1,kl}, \dots, z_{c,kl}).$$
To explore the power performance of our proposed test, we consider local alternatives where some $X_k$'s and $Y_l$'s are dependent. When max(p, q) → ∞ and n → ∞, we assume that the class of local alternatives satisfies
$$\zeta_h^{(c)} = o(n^{c-2} S_h^{2}), \quad c \in \{1, 3, \dots, d\}, \tag{8}$$
$$E\{G_h^{(2)}(z_1, z_2) G_h^{(2)}(z_2, z_3) G_h^{(2)}(z_3, z_4) G_h^{(2)}(z_4, z_1)\} = o(S_h^{4}), \tag{9}$$
$$E\{G_h^{(2)}(z_1, z_2)^{4}\} = o(n S_h^{4}), \tag{10}$$
where
$$S_h^{2} = \mathrm{var}\Big\{\sum_{k=1}^{p} \sum_{l=1}^{q} h_2(z_{1,kl}, z_{2,kl})\Big\} = \vartheta_h^{-2} S^{2}.$$
Suppose x and y are continuous. Under H0 in (1), the conditions in (8)–(10) are satisfied if Assumption 1 is true, and the test statistic $T_h$ is a degenerate U-statistic. Under the local alternatives, the conditions in (8)–(10) prescribe a small difference from H0. These conditions can be viewed as an extension of the conditions from the linear case (Yamada, Hyodo and Nishiyama (2017)) to the nonlinear case. Although this class of local alternatives is somewhat abstract, it is still sensible because the rank-based indices, such as Hoeffding’s D, Blum–Kiefer–Rosenblatt’s R and Bergsma–Dassios–Yanagimoto’s τ∗, can target a very broad class of alternatives, and allow for arbitrarily nonlinear dependence.
To illustrate the local alternatives in high dimensions, we consider the mixture models used by Shi et al. (2022) and Shi, Drton and Han (2022a). To be specific, we consider the following $\epsilon^*$-contamination model:
$$(x, y) \sim F_{x,y} = (1 - \epsilon^*) F_{x,y}^{(0)} + \epsilon^* F_{x,y}^{(1)}, \tag{11}$$
where $F_{x,y}^{(0)}$ and $F_{x,y}^{(1)}$ are two distributions, and $0 < \epsilon^* < 1$. In particular, x and y are dependent if $(x, y) \sim F_{x,y}^{(1)}$ and independent if $(x, y) \sim F_{x,y}^{(0)}$. We further restrict $F_{x,y}^{(0)}$ to satisfy Assumption 1.
PROPOSITION 4. For an arbitrary distribution $F_{x,y}^{(1)}$ such that x and y are dependent, there exists a small contamination ratio $\epsilon^*$, which depends on both $F_{x,y}^{(0)}$ and $F_{x,y}^{(1)}$, such that the
Condition (8) ensures that $C(d, 2)\{C(n, 2)\}^{-1} M_h^{(2)}$ is a leading term and $C(d, c)\{C(n, c)\}^{-1} M_h^{(c)}$, for c = 1, 3, . . . , d, are asymptotically negligible. This observation, together with conditions (9) and (10), is used under local alternatives to establish the asymptotic normality of
$$\sum_{k=1}^{p} \sum_{l=1}^{q} U_h^{(kl)}.$$
$$\widetilde{B}_{ij} = B_{ij} - (n - 2)^{-1} \sum_{j=1}^{n} B_{ij} - (n - 2)^{-1} \sum_{i=1}^{n} B_{ij} + \{(n - 1)(n - 2)\}^{-1} \sum_{i,j=1}^{n} B_{ij},$$
t-distribution with n(n − 3)/2 − 1 degrees of freedom under H0 , as the dimensions diverge
to infinity.
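The centering displayed above can be sketched in a few lines of Python. This is only an illustration under assumptions that are not stated here: B is taken to be a symmetric matrix of pairwise distances with zero diagonal, and the diagonal of the centered matrix is set to zero, as is common for this type of centering (often called U-centering). The name `u_center` is ours.

```python
def u_center(b):
    """Center a symmetric, zero-diagonal n x n matrix (list of lists), n >= 3."""
    n = len(b)
    if n < 3:
        raise ValueError("need n >= 3")
    row = [sum(b[i]) for i in range(n)]
    col = [sum(b[i][j] for i in range(n)) for j in range(n)]
    total = sum(row)
    centered = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(n):
            if i != j:  # diagonal kept at zero by assumption
                centered[i][j] = (
                    b[i][j]
                    - row[i] / (n - 2)
                    - col[j] / (n - 2)
                    + total / ((n - 1) * (n - 2))
                )
    return centered


# Toy one-dimensional sample; B holds absolute pairwise differences.
points = [0.0, 1.0, 3.0, 7.0, 8.5]
b = [[abs(u - v) for v in points] for u in points]
b_tilde = u_center(b)
row_sums = [sum(r) for r in b_tilde]
```

For a symmetric input with zero diagonal, every row of the centered matrix sums to zero, which is a quick way to validate an implementation.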
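As an alternative to Székely and Rizzo (2013), Zhu et al. (2020) considered the marginal sample distance covariance based test statistic,
$$\mathrm{ZZYS}(x, y) = \{n(n - 3)/2 - 1\}^{1/2}\, \widehat{\mathrm{mdCor}}^{2}(x, y) \Big/ \{1 - \widehat{\mathrm{mdCor}}^{4}(x, y)\}^{1/2},$$
where
$$\widehat{\mathrm{mdCor}}^{2}(x, y) = \widehat{\mathrm{mdCov}}^{2}(x, y) \Big/ \{\widehat{\mathrm{mdCov}}^{2}(x, x)\, \widehat{\mathrm{mdCov}}^{2}(y, y)\}^{1/2}, \qquad \widehat{\mathrm{mdCov}}^{2}(x, y) = \sum_{k=1}^{p} \sum_{l=1}^{q} \widehat{\mathrm{dCov}}^{2}(X_k, Y_l).$$
Zhu et al. (2020) showed that under H0, both SZ and ZZYS converge to the standard normal distribution as n ∧ p ∧ q → ∞.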
3.2. Connection with Pearson correlation. We establish the explicit relationships between D, R, τ∗ and Pearson’s correlation for bivariate normal random variables, which play a critical role in the subsequent power analysis.
TABLE 1
Ratios for different ρ (all numbers are multiplied by 100)

|ρ|                        0.1    0.2    0.3    0.4    0.5    0.6    0.7    0.8    0.9    1.0
M_{h^(D)}/ρ^2             0.85   0.86   0.89   0.92   0.97   1.05   1.16   1.33   1.65   3.33
\hat{M}_{h^(D)}/ρ^2       0.83   0.86   0.88   0.92   0.97   1.05   1.15   1.33   1.65   3.33
M_{h^(R)}/ρ^2             0.85   0.85   0.86   0.87   0.89   0.91   0.93   0.97   1.02   1.11
\hat{M}_{h^(R)}/ρ^2       0.83   0.85   0.86   0.87   0.88   0.91   0.93   0.97   1.02   1.11
M_{h^(τ∗)}/ρ^2           30.48  30.75  31.22  31.92  32.91  34.29  36.26  39.22  44.34  66.67
\hat{M}_{h^(τ∗)}/ρ^2     29.97  30.58  31.14  31.87  32.88  34.27  36.25  39.21  44.34  66.67
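(3)
$$\inf_{\rho_{kl} \ne 0} \frac{M_{h^{(D)}}(\rho_{kl})}{\rho_{kl}^{2}} = (12\pi^{2})^{-1}, \qquad \sup_{\rho_{kl} \ne 0} \frac{M_{h^{(D)}}(\rho_{kl})}{\rho_{kl}^{2}} = 1/30,$$
$$\inf_{\rho_{kl} \ne 0} \frac{M_{h^{(R)}}(\rho_{kl})}{\rho_{kl}^{2}} = (12\pi^{2})^{-1}, \qquad \sup_{\rho_{kl} \ne 0} \frac{M_{h^{(R)}}(\rho_{kl})}{\rho_{kl}^{2}} = 1/90,$$
$$\inf_{\rho_{kl} \ne 0} \frac{M_{h^{(\tau^*)}}(\rho_{kl})}{\rho_{kl}^{2}} = 3\pi^{-2}, \qquad \sup_{\rho_{kl} \ne 0} \frac{M_{h^{(\tau^*)}}(\rho_{kl})}{\rho_{kl}^{2}} = 2/3.$$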
Heuristically, as the dependence between $X_k$ and $Y_l$ increases, $E\{U_h^{(kl)}\}$ may deviate from zero at the same rate as $\rho_{kl}^{2}$. Bearing this concept in mind, Drton, Han and Shi ((2020), Lemma 4.1) investigated the power of maximum-type tests using consistent rank correlation statistics from an asymptotic minimax perspective. Theorem 5 firms up this concept by establishing an elegant relation between $E\{U_h^{(kl)}\}$ and Pearson correlation under the normal distribution, which is the key to comparing the consistent rank tests and the distance covariance/correlation based tests from an asymptotic efficiency perspective.
To better understand Theorem 5, we consider a toy example with a bivariate normal random vector with mean zero and covariance matrix
$$\begin{pmatrix} 1 & \rho \\ \rho & 1 \end{pmatrix}.$$
We vary ρ from 0.1 to 1.0, and set n = 1000. Each experiment is repeated 1000 times. We summarize the ratios of the estimated means $\widehat{M}_{h^{(D)}}$, $\widehat{M}_{h^{(R)}}$ and $\widehat{M}_{h^{(\tau^*)}}$ to $\rho^{2}$ in Table 1, and compare them with the theoretically exact ratios.
From Table 1, we have the following three findings.
(i) The estimated means $\widehat{M}_{h^{(D)}}$, $\widehat{M}_{h^{(R)}}$ and $\widehat{M}_{h^{(\tau^*)}}$ are close to the values $M_{h^{(D)}}$, $M_{h^{(R)}}$ and $M_{h^{(\tau^*)}}$ in the first statement of Theorem 5.
FIG. 1. The standardized $M_{h^{(D)}}$ (red dashed line), standardized $M_{h^{(R)}}$ (black dotted line), standardized $M_{h^{(\tau^*)}}$ (green dot-dash line) and $\rho^{2}$ (blue solid line) in the bivariate normal case.
3.3. Asymptotic relative efficiency. We now compare the powers of the tests built on $T_h$, SZ and ZZYS from the asymptotic efficiency perspective. According to the results from Section 2, the asymptotic power of the test statistic $T_h$ is essentially controlled by the signal to noise ratio (SNR)
$$\mathrm{SNR}_{T_h} = \{\vartheta_h C(d, 2)\}^{-1} \{C(n, 2)\}^{1/2} \sum_{k=1}^{p} \sum_{l=1}^{q} \theta_h^{(kl)} \Big/ S.$$
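By comparison, Zhu et al. ((2020), Proposition 2.2.2) showed that the asymptotic powers of the test statistics SZ and ZZYS depend on
$$\mathrm{SNR}_{\mathrm{SZ}} = \{C(n, 2)\}^{1/2} \Big\{\sum_{k_1,k_2=1}^{p} \mathrm{cov}^{2}(X_{k_1}, X_{k_2}) \sum_{l_1,l_2=1}^{q} \mathrm{cov}^{2}(Y_{l_1}, Y_{l_2})\Big\}^{-1/2} \times \sum_{k=1}^{p} \sum_{l=1}^{q} \mathrm{cov}^{2}(X_k, Y_l),$$
$$\mathrm{SNR}_{\mathrm{ZZYS}} = \{C(n, 2)\}^{1/2} \Big\{\sum_{k_1,k_2=1}^{p} \mathrm{dCov}^{2}(X_{k_1}, X_{k_2}) \sum_{l_1,l_2=1}^{q} \mathrm{dCov}^{2}(Y_{l_1}, Y_{l_2})\Big\}^{-1/2} \times \sum_{k=1}^{p} \sum_{l=1}^{q} \mathrm{dCov}^{2}(X_k, Y_l).$$
Thus, the asymptotic relative efficiency (ARE) of $T_h$ with respect to SZ is
$$\mathrm{ARE}(T_h, \mathrm{SZ}) = \{\vartheta_h C(d, 2)\}^{-1} \Big\{\sum_{k=1}^{p} \sum_{l=1}^{q} \theta_h^{(kl)}\Big\} \Big\{\sum_{k=1}^{p} \sum_{l=1}^{q} \mathrm{cov}^{2}(X_k, Y_l)\Big\}^{-1} \times \Big\{\sum_{k_1,k_2=1}^{p} \mathrm{cov}^{2}(X_{k_1}, X_{k_2}) \sum_{l_1,l_2=1}^{q} \mathrm{cov}^{2}(Y_{l_1}, Y_{l_2})\Big\}^{1/2} \Big/ S,$$
and the ARE of $T_h$ with respect to ZZYS is
$$\mathrm{ARE}(T_h, \mathrm{ZZYS}) = \{\vartheta_h C(d, 2)\}^{-1} \Big\{\sum_{k=1}^{p} \sum_{l=1}^{q} \theta_h^{(kl)}\Big\} \Big\{\sum_{k=1}^{p} \sum_{l=1}^{q} \mathrm{dCov}^{2}(X_k, Y_l)\Big\}^{-1} \times \Big\{\sum_{k_1,k_2=1}^{p} \mathrm{dCov}^{2}(X_{k_1}, X_{k_2}) \sum_{l_1,l_2=1}^{q} \mathrm{dCov}^{2}(Y_{l_1}, Y_{l_2})\Big\}^{1/2} \Big/ S.$$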
Owing to the monotone transformation-invariant property, the proposed rank tests do not require moment conditions, and hence are applicable to generally distributed random vectors, including heavy-tailed ones. By contrast, the distance covariance/correlation based tests are valid only under moment conditions. Therefore, to appreciate the implication of the AREs, we consider Gaussian random vectors. We expect that similar conclusions hold for other types of distributions.
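To ease illustration, we assume that x and y have independent and identically distributed coordinates with $\mathrm{var}(X_k) = \sigma_1^{(k)}$, $k \in \{1, \dots, p\}$, and $\mathrm{var}(Y_l) = \sigma_2^{(l)}$, $l \in \{1, \dots, q\}$. Let $\Sigma_{12} \stackrel{\mathrm{def}}{=} \mathrm{cov}(x, y^T) = (\sigma_{kl}) \in \mathbb{R}^{p \times q}$ denote the covariance matrix. Let $I \subseteq \{(k, l) : 1 \le k \le p, 1 \le l \le q\}$ be the index set of the signals. For the alternative class, we consider Gaussian distributions with equal correlations, namely $\sigma_{kl} = \rho\{\sigma_1^{(k)}\sigma_2^{(l)}\}^{1/2}$ for $(k, l) \in I$ and $\sigma_{kl} = 0$ for $(k, l) \notin I$. From a minimax point of view, a similar class of alternatives was discussed by Yao, Zhang and Shao (2018) and Leung and Drton (2018). By Proposition 1, Theorem 5 and Székely, Rizzo and Bakirov ((2007), Theorem 7), $\mathrm{ARE}(T_h, \mathrm{SZ})$ and $\mathrm{ARE}(T_h, \mathrm{ZZYS})$ can be simplified to
$$\mathrm{ARE}(T_h, \mathrm{SZ}) = 15(2\pi^{2})^{-1} \{\#(I)\}^{1/2} \Big\{\sum_{(k,l) \in I} \sigma_1^{(k)}\sigma_2^{(l)}\Big\}^{-1} \times \Big[\sum_{k=1}^{p} \{\sigma_1^{(k)}\}^{2} \sum_{l=1}^{q} \{\sigma_2^{(l)}\}^{2}\Big]^{1/2} \{1 + o(1)\}, \tag{13}$$
$$\mathrm{ARE}(T_h, \mathrm{ZZYS}) = 30(1 + \pi/3 - 3^{1/2})\pi^{-2} \{\#(I)\}^{1/2} \Big\{\sum_{(k,l) \in I} \sigma_1^{(k)}\sigma_2^{(l)}\Big\}^{-1} \times \Big[\sum_{k=1}^{p} \{\sigma_1^{(k)}\}^{2} \sum_{l=1}^{q} \{\sigma_2^{(l)}\}^{2}\Big]^{1/2} \{1 + o(1)\}, \tag{14}$$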
Homoscedastic case. We assume without loss of generality that $\sigma_1^{(1)} = \cdots = \sigma_1^{(p)} = \sigma_2^{(1)} = \cdots = \sigma_2^{(q)} = 1$. Because $(pq)/\#(I) \ge 1$, some simple algebra gives
$$\lim_{\max(p,q) \to \infty} \mathrm{ARE}(T_h, \mathrm{SZ}) = 15(2\pi^{2})^{-1} \{pq/\#(I)\}^{1/2} \ge 15(2\pi^{2})^{-1} \approx 0.7599089, \tag{15}$$
$$\lim_{\max(p,q) \to \infty} \mathrm{ARE}(T_h, \mathrm{ZZYS}) = 30(1 + \pi/3 - 3^{1/2})\pi^{-2} \{pq/\#(I)\}^{1/2} \ge 30(1 + \pi/3 - 3^{1/2})\pi^{-2} \approx 0.9579312, \tag{16}$$
for $h \in \{h^{(D)}, h^{(R)}, h^{(\tau^*)}\}$.
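The two numerical lower bounds in (15) and (16) are easy to reproduce. The short Python sketch below evaluates the constants and the limiting AREs for a given sparsity level. The function and variable names are our own, and `n_signals` stands for the cardinality of the signal set I.

```python
from math import pi, sqrt

# Leading constants of the limiting AREs in the homoscedastic case.
C_SZ = 15.0 / (2.0 * pi ** 2)
C_ZZYS = 30.0 * (1.0 + pi / 3.0 - sqrt(3.0)) / pi ** 2


def limiting_are_homoscedastic(p, q, n_signals):
    """Return (ARE vs SZ, ARE vs ZZYS) when all variances equal one."""
    if not 0 < n_signals <= p * q:
        raise ValueError("n_signals must lie in (0, p * q]")
    sparsity = sqrt(p * q / n_signals)
    return C_SZ * sparsity, C_ZZYS * sparsity


# Dense signals attain the lower bounds; sparser signals favour the rank tests.
dense = limiting_are_homoscedastic(50, 50, 2500)
sparse = limiting_are_homoscedastic(50, 50, 16)
```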
The limiting ARE values in (15) and (16) suggest that the proposed tests in high dimensions suffer little power loss under the homoscedastic scenario. Furthermore, the AREs are inversely proportional to $\{\#(I)\}^{1/2}$. That is, compared to SZ and ZZYS, our proposed tests have an advantage when the signals in I are relatively sparse. We next reveal the efficiency gain of the proposed tests in the high-dimensional heteroscedastic case. For simplicity, we assume the number of dependency signals in I is fixed.
$$\lim_{\max(p,q) \to \infty} \mathrm{ARE}(T_h, \mathrm{ZZYS}) = \infty. \tag{20}$$
If one of the conditions (17) and (18) holds as max(p, q) → ∞, the ARE results in (19) and (20) guarantee that the test based on $T_h$ is substantially more powerful than those based on distance covariance/correlation (Székely and Rizzo (2013), Zhu et al. (2020)). The conditions in (17) and (18) are trivially true in the case that the $\sigma_1^{(k)}$'s and $\sigma_2^{(l)}$'s have different scales. To appreciate this, let us consider an explicit scenario: for 1 ≤ k ≤ p and 1 ≤ l ≤ q, there are $\delta_1 > 0$, $\delta_2 > 0$, which do not depend on the dimensions p or q, such that
$$\sigma_1^{(k)} \asymp k^{\delta_1}, \qquad \sigma_2^{(l)} \asymp l^{\delta_2},$$
where, for two sequences $\{a_k\}$ and $\{b_k\}$, we write $a_k \asymp b_k$ if there exists a positive constant C such that $C^{-1} \le \liminf_k a_k/b_k \le \limsup_k a_k/b_k \le C$. It is easy to verify that as p → ∞ and q → ∞,
$$\sum_{k=1}^{p} \{\sigma_1^{(k)}\}^{2} \asymp p\, p^{2\delta_1} \to \infty, \quad \text{and} \quad \sum_{l=1}^{q} \{\sigma_2^{(l)}\}^{2} \asymp q\, q^{2\delta_2} \to \infty.$$
According to the aforementioned analysis, we can conclude that even for light-tailed, high-dimensional normal observations, the tests built on these monotone transformation-invariant correlations have substantially higher power than the distance covariance/correlation based tests (Székely and Rizzo (2013), Zhu et al. (2020)) in the heteroscedastic scenario. The conditions in (17) and (18) correspond to an extreme case that allows a few variance components to diverge to infinity. This extreme case helps illustrate the advantages of our proposed test over existing tests in the heteroscedastic scenario. However, we emphasize that, even when all variance components are uniformly bounded, our proposed test can also be more powerful than existing ones in the heteroscedastic scenario. We elaborate on this phenomenon in the sequel.
We consider a general case in which all variance components are uniformly bounded but not necessarily the same. To be precise, there exist two positive constants $c_1$ and $c_2$, such that $\sigma_1^{(k)} = \sigma_2^{(l)} = c_1$ for $(k, l) \in I$ and $\sigma_1^{(k)} = \sigma_2^{(l)} = c_2$ for $(k, l) \notin I$. This implies immediately that
$$\sum_{k=1}^{p} \{\sigma_1^{(k)}\}^{2} = O(p), \quad \text{and} \quad \sum_{l=1}^{q} \{\sigma_2^{(l)}\}^{2} = O(q).$$
In this case, the asymptotic relative efficiencies have the form $\mathrm{ARE}(T_h, \mathrm{SZ}) = 15(2\pi^{2})^{-1}\{1 + (pq/\#(I) - 1)(c_2/c_1)^{4}\}^{1/2}\{1 + o(1)\}$ and $\mathrm{ARE}(T_h, \mathrm{ZZYS}) = 30(1 + \pi/3 - 3^{1/2})\pi^{-2}\{1 + (pq/\#(I) - 1)(c_2/c_1)^{4}\}^{1/2}\{1 + o(1)\}$. It can be verified that, as long as $\{1 + (pq/\#(I) - 1)(c_2/c_1)^{4}\}^{1/2} \ge \pi^{2}\max(2/15, 1/\{30(1 + \pi/3 - 3^{1/2})\})$, $\mathrm{ARE}(T_h, \mathrm{SZ})$ and $\mathrm{ARE}(T_h, \mathrm{ZZYS})$ are simultaneously greater than or equal to 1. In other words, our proposed test can be more powerful than the distance covariance/correlation based tests as long as the variance components exhibit a certain level of heterogeneity. Moreover, the AREs increase with $c_2/c_1$, which indicates that the proposed tests are more advantageous when the signal to overall noise ratio becomes smaller.
FIG. 2. The adjusted $c_{n,p,q}^{-1}\mathrm{SNR}_{T_{h^{(D)}}}$ (red dashed line), adjusted $c_{n,p,q}^{-1}\mathrm{SNR}_{T_{h^{(R)}}}$ (black dotted line) and adjusted $c_{n,p,q}^{-1}\mathrm{SNR}_{T_{h^{(\tau^*)}}}$ (green dot-dash line) under the Gaussian alternatives with equal correlations.
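The bounded-variance ARE expressions and the threshold for both AREs to reach 1 can be evaluated directly. The following Python sketch ignores the {1 + o(1)} factors, so it describes the leading terms only. The function name and the example values of p, q, the signal count and the constants c1, c2 are illustrative assumptions.

```python
from math import pi, sqrt


def are_bounded_variances(p, q, n_signals, c1, c2):
    """Leading-order (ARE vs SZ, ARE vs ZZYS) with variances c1 on I, c2 off I."""
    inflation = sqrt(1.0 + (p * q / n_signals - 1.0) * (c2 / c1) ** 4)
    are_sz = 15.0 / (2.0 * pi ** 2) * inflation
    are_zzys = 30.0 * (1.0 + pi / 3.0 - sqrt(3.0)) / pi ** 2 * inflation
    return are_sz, are_zzys


# Both AREs are at least 1 once the inflation factor reaches this threshold.
threshold = pi ** 2 * max(2.0 / 15.0, 1.0 / (30.0 * (1.0 + pi / 3.0 - sqrt(3.0))))

equal_scale = are_bounded_variances(50, 50, 16, c1=1.0, c2=1.0)
small_noise_scale = are_bounded_variances(50, 50, 16, c1=1.0, c2=0.2)
```

With equal scales the inflation factor is 12.5, well above the threshold, whereas with c2/c1 = 0.2 it is about 1.12, so only the comparison with ZZYS exceeds 1 in this toy setting.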
Next, we compare the asymptotic efficiency among the three proposed tests under (17) and (18). Under the Gaussian alternatives with equal correlations, the signal to noise ratio $\mathrm{SNR}_{T_h}$ reduces to
$$\mathrm{SNR}_{T_h} = c_{n,p,q} \{\vartheta_h C(d, 2)\}^{-1} M_h(\rho),$$
where $c_{n,p,q} = \{C(n, 2)\}^{1/2}\#(I)/S$ and $\#(I)$ denotes the cardinality of the set I. The factor $c_{n,p,q}$ does not depend on the kernel $h \in \{h^{(D)}, h^{(R)}, h^{(\tau^*)}\}$, which is convenient for comparison. We plot the adjusted $c_{n,p,q}^{-1}\mathrm{SNR}_{T_h}$ with respect to ρ in Figure 2. It can be seen that the tests based on D, τ∗ and R are sorted in descending order of asymptotic efficiency, regardless of whether x and y are heteroscedastic.
In the bivariate case, Mudholkar and Wilding (2003) compared the power performance of Hoeffding’s D and Blum–Kiefer–Rosenblatt’s R tests through simulations for bivariate normal models. In particular, their empirical observation is that, with a small number of observations, Hoeffding’s D test is advantageous over the Blum–Kiefer–Rosenblatt R test for negative dependence alternatives, and yet disadvantageous for positive ones. However, they did not explore the theoretical power properties of these two tests, even for bivariate normal models. In the present context, we explore the power properties of the three tests in high dimensions. From Figure 2, it can be seen that Hoeffding’s D test is the most powerful against the Gaussian equicorrelation alternatives, for both negative and positive correlations. Thorough investigations for nonnormal distributions and other dependence structures, particularly in high dimensions, are highly desirable.
4. Numerical studies.
FIG. 3. The density curves of the asymptotic null distribution of test statistics based on D (red dashed line), R (black dotted line) and τ∗ (green dot-dash line) and the standard normal density (blue solid line) under the AR structure of covariances.
4.1. Simulations. We first use simulations to examine the performance of our proposed tests based on Hoeffding’s D, Blum–Kiefer–Rosenblatt’s R and Bergsma–Dassios–Yanagimoto’s τ∗. We compare the performance of these tests with that of the tests based on the bias-corrected distance correlation (Székely and Rizzo (2013)) and the aggregated distance covariance (Zhu et al. (2020)). We implement these two tests through the R package energy. R code for our proposed methods can be downloaded at https://github.com/Yeqing-TJ/Rank-based-test-in-high-dimension. We fix the sample size n = 100, and set the dimensions p = q = 50, 150 and 300. Each experiment is repeated 500 times at the nominal level 0.05. The parameter δ is used to control the degree of heteroscedasticity (Li et al. (2023)). We consider three different structures for the covariance matrix $\Sigma = (\sigma_{st})$.
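Independent structure: $\sigma_{ss} = s^{\delta}$ and $\sigma_{st} = 0$ for $s \ne t$.
Autoregressive structure: $\sigma_{ss} = s^{\delta}$ and $\sigma_{st} = 0.5^{|s-t|}\sqrt{\sigma_{ss}\sigma_{tt}}$ for $s \ne t$.
Banded structure: $\sigma_{ss} = s^{\delta}$ and $\sigma_{st} = 0.3\sqrt{\sigma_{ss}\sigma_{tt}}$ for $s \ne t$ and $|s - t| \le 3$, $\sigma_{st} = 0$ for $s \ne t$ and $|s - t| > 3$.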
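The three covariance structures can be generated with a short helper. The sketch below is a plain-Python illustration and is not taken from the authors' R code; the function name and the string labels for the structures are our own.

```python
from math import sqrt


def covariance_matrix(p, delta, structure):
    """Build the p x p covariance matrix (list of lists) with sigma_ss = s ** delta."""
    if structure not in ("independent", "autoregressive", "banded"):
        raise ValueError("unknown structure: " + structure)
    variances = [float(s) ** delta for s in range(1, p + 1)]
    sigma = [[0.0] * p for _ in range(p)]
    for s in range(p):
        for t in range(p):
            lag = abs(s - t)
            scale = sqrt(variances[s] * variances[t])
            if lag == 0:
                sigma[s][t] = variances[s]
            elif structure == "autoregressive":
                sigma[s][t] = 0.5 ** lag * scale
            elif structure == "banded" and lag <= 3:
                sigma[s][t] = 0.3 * scale
            # all remaining off-diagonal entries stay at zero
    return sigma


independent = covariance_matrix(4, 1.0, "independent")
autoregressive = covariance_matrix(4, 0.0, "autoregressive")
banded = covariance_matrix(5, 1.0, "banded")
```

Setting `delta = 0` gives unit variances, which corresponds to the homoscedastic case; larger values of `delta` make the scales of the coordinates increasingly different.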
We generate x and y independently from the multivariate normal distribution $N(0, \Sigma)$ with these three covariance structures. To evaluate the accuracy of the normal approximation, we plot the kernel density estimates of the test statistics as well as the standard normal density in Figure 3. It is clear that the empirical distributions of the test statistics are reasonably close to N(0, 1). We summarize the empirical sizes of all aforementioned tests under different covariance structures and dimensions in Table 2 and in Tables S.1 and S.2 of the Supplementary Material. Due to space limits, the results for the independent and banded structures are reported in the Supplementary Material. Owing to the scale-invariant property, for given n and p, the empirical sizes of the tests based on D, R and τ∗ remain stable under different values of δ. Across the three covariance structures, all the tests control the type-I error rates well.
Next, we compare the empirical powers of the above tests in detecting linear and nonlinear
dependence. We consider the following two models.
Linear model: We draw x from $N(0, \Sigma)$ with the three different covariance structures.
We generate $Y_j = 0.5X_j + \varepsilon_j$ for $j = 1, \ldots, [p/3]$, where the
$\varepsilon_j$'s are generated from $N(0, j^2)$, and $[\cdot]$ denotes the integer part of a
given number. The remaining $(q - [p/3])$ components of y are independent of x and follow
$N(0, \Sigma)$.
Nonlinear model: We draw x from $N(0, \Sigma)$ with the three different covariance structures.
We generate $Y_j = X_j^2 + \varepsilon_j$ for $j = 1, \ldots, [p/3]$, where the
$\varepsilon_j$'s are generated from $N(0, j^2)$. We generate an intermediate vector
$\tilde{x} = (\tilde{X}_1, \ldots, \tilde{X}_{q-[p/3]})^T$ independently from $N(0, \Sigma)$.
The remaining components of y are given by $Y_j = \tilde{X}_j^2$ for $j = [p/3] + 1, \ldots, q$.
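The two data-generating models can be sketched as follows for a single observation. This is an illustration under stated assumptions, not the authors' code: the noise variance $j^2$ is taken as printed, the coordinates of y that do not depend on x are drawn from independent standard normals (the independent structure with δ = 0) rather than a general $N(0, \Sigma)$, and the function name `generate_y` is hypothetical.

```python
import random


def generate_y(x, q, model, rng):
    """Generate one q-vector y from one p-vector x under the chosen model."""
    if model not in ("linear", "nonlinear"):
        raise ValueError("model must be 'linear' or 'nonlinear'")
    k = len(x) // 3  # integer part [p/3]
    y = []
    for j in range(1, k + 1):
        eps = rng.gauss(0.0, j)  # standard deviation j, i.e. variance j**2
        if model == "linear":
            y.append(0.5 * x[j - 1] + eps)
        else:
            y.append(x[j - 1] ** 2 + eps)
    for _ in range(q - k):
        # Simplifying assumption: independent N(0, 1) draws unrelated to x.
        z = rng.gauss(0.0, 1.0)
        y.append(z if model == "linear" else z * z)
    return y


if __name__ == "__main__":
    rng = random.Random(0)
    x = [rng.gauss(0.0, 1.0) for _ in range(6)]
    print(generate_y(x, 6, "linear", rng))
    print(generate_y(x, 6, "nonlinear", rng))
```

In both models only the first [p/3] coordinates of y carry information about x, so the dependence signal is confined to a fraction of the coordinates.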
TABLE 2
The empirical size under the AR structure of covariances. The random seeds are fixed
throughout and all rank tests are monotone transformation-invariant. Therefore, the empirical
sizes of all rank tests are unchanged as δ varies from 0.00 to 1.00.
p δ D R τ∗ ZZYS SZ

We report the empirical powers for the linear model in Tables 3, S.3 and S.4 and for the
nonlinear model in Tables 4, S.5 and S.6, respectively. In the linear model, δ = 0 corresponds
to the homoscedastic case where each component of x has the same variance. The empirical
powers of all five tests decrease as the dimensions p and q grow. Compared to the distance
covariance/correlation based tests of Székely and Rizzo (2013) and Zhu et al. (2020), our
proposed tests suffer little power loss, which is in line with the theoretical analysis in
Section 3. When δ is nonzero, the three proposed tests exhibit significant advantages owing to
their invariance under monotone transformations. The empirical powers of the tests based on D,
R and $\tau^*$ share a similar trend, and gradually approach one as δ increases. The power of
the D-based test is slightly better than those of the $\tau^*$- and R-based tests in most
settings. Under the heteroscedastic scenario, the rank tests significantly outperform the
tests of Székely and Rizzo (2013) and Zhu et al. (2020). In the nonlinear model, it is clear
that cov(Xk, Yl) = 0 for k = 1, . . . , p and l = 1, . . . , q. Because the test of Székely
and Rizzo (2013) cannot detect dependence that is nonlinear but componentwise uncorrelated,
its empirical power is close to the nominal level. By contrast, our proposed tests show good
capability in detecting nonlinear dependence.
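The zero-covariance claim for the nonlinear model can be verified directly. The following short derivation assumes, as the model description suggests but does not state explicitly, that each noise term $\varepsilon_j$ has mean zero and is independent of x. For $j \le [p/3]$:

```latex
\begin{align*}
\operatorname{cov}(X_k, Y_j)
  &= \operatorname{cov}(X_k, X_j^2) + \operatorname{cov}(X_k, \varepsilon_j)
   = E(X_k X_j^2) - E(X_k)\,E(X_j^2)
   = E(X_k X_j^2), \\
E(X_k X_j^2)
  &= E\{(-X_k)(-X_j)^2\}
   = -E(X_k X_j^2)
  \quad\Longrightarrow\quad E(X_k X_j^2) = 0 .
\end{align*}
```

The second line uses the fact that a centered normal pair $(X_k, X_j)$ has the same distribution as $(-X_k, -X_j)$. For $j > [p/3]$, $Y_j$ depends only on the intermediate vector, which is drawn independently of x, so the covariance is again zero. Every pair $(X_k, Y_l)$ is therefore uncorrelated even though y depends on x through the squared terms.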
4.2. An application. We consider a gene expression microarray data set, which was collected
from 120 male rats by Scheetz et al. (2006), to demonstrate the practical usefulness of our
proposed tests. It contains 18,976 gene probe sets, which exhibit sufficient expression
signals. The gene TRIM32 at probe 1389163_at is the targeted response, which was validated
to cause Bardet–Biedl syndrome (Chiang et al. (2006)). We use the 500 probe sets that have
the largest variances as the covariates. The empirical observations of previous studies reveal
that there exists a nonlinear relationship between the response and the covariates. To verify
this conclusion, we apply the proposed tests to examine independence between the response and
the covariates, and compare them with the tests of Székely and Rizzo (2013) and Zhu et al.
(2020).
To evaluate the power performance of the five tests, we randomly select subsets of size
n = 45, 60, 75 and 90 from the whole data set to calculate the test statistics. Based on 200
repetitions, the empirical powers of the five tests are summarized in Table 5. We can see that
the three tests based on D, R and $\tau^*$ are more powerful than the tests of Székely and
Rizzo (2013) and Zhu et al. (2020). This may be owing to the highly skewed distribution of the
response, as shown in Figure 2 of Zhou and Zhu (2018).
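The subsampling scheme behind these empirical powers can be sketched as follows. This is a minimal illustration, assuming subsets are drawn without replacement; `subsample_power` and the `reject` callback are hypothetical names, and `reject` stands for any rule that takes the selected row indices and returns whether the test rejects on that subset.

```python
import random


def subsample_power(n_total, sizes, reps, reject, seed=0):
    """Rejection proportion over `reps` random subsets for each subset size."""
    rng = random.Random(seed)
    power = {}
    for n in sizes:
        hits = 0
        for _ in range(reps):
            # Draw n distinct row indices from the full data set.
            rows = rng.sample(range(n_total), n)
            if reject(rows):
                hits += 1
        power[n] = hits / reps
    return power


if __name__ == "__main__":
    # Toy rule used only to make the sketch runnable; a real rule would
    # compute a test statistic on the selected rows and compare it with
    # its critical value.
    toy_reject = lambda rows: len(rows) >= 60
    print(subsample_power(120, [45, 60, 75, 90], 200, toy_reject))
```

Because each subset is drawn from the same 120 observations, the repetitions are not independent samples from the population, so the reported proportions are best read as a comparison between tests rather than as exact power estimates.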
TABLE 3
The empirical power under the AR structure of covariances for the linear model
p δ D R τ∗ ZZYS SZ
TABLE 4
The empirical power under the AR structure of covariances for the nonlinear model
p δ D R τ∗ ZZYS SZ
TABLE 5
The empirical power of the five tests
Acknowledgments. The authors thank the Editor, the Associate Editor and the anonymous
reviewers for their constructive comments, which have led to a substantial improvement of the
earlier version of this article. The authors are also very grateful to Dr. Hongjian Shi for
providing us with the R code of Shi et al. (2022). The authors have contributed equally to
this paper. All correspondence should be addressed to Liping Zhu (the corresponding author)
at zhu.liping@ruc.edu.cn.
SUPPLEMENTARY MATERIAL
Supplement to “Rank-based indices for testing independence between two high-dimensional
vectors” (DOI: 10.1214/23-AOS2339SUPP; .pdf). The supplement (Zhou et al. (2024)) contains
additional simulation results and technical proofs of all theorems and propositions in the
main text.
REFERENCES
ALBERT, M., LAURENT, B., MARREL, A. and MEYNAOUI, A. (2022). Adaptive test of independence based on HSIC measures. Ann. Statist. 50 858–879. MR4404921 https://doi.org/10.1214/21-aos2129
ANDERSON, T. W. (2003). An Introduction to Multivariate Statistical Analysis, 3rd ed. Wiley Series in Probability and Statistics. Wiley-Interscience, New York. MR1990662
BAO, Z. (2019). Tracy–Widom limit for Kendall’s tau. Ann. Statist. 47 3504–3532. MR4025750 https://doi.org/10.1214/18-AOS1786
BAO, Z., LIN, L.-C., PAN, G. and ZHOU, W. (2015). Spectral statistics of large dimensional Spearman’s rank correlation matrix and its application. Ann. Statist. 43 2588–2623. MR3405605 https://doi.org/10.1214/15-AOS1353
BERGSMA, W. and DASSIOS, A. (2014). A consistent test of independence based on a sign covariance related to Kendall’s tau. Bernoulli 20 1006–1028. MR3178526 https://doi.org/10.3150/13-BEJ514
BERRETT, T. B. and SAMWORTH, R. J. (2019). Nonparametric independence testing via mutual information. Biometrika 106 547–566. MR3992389 https://doi.org/10.1093/biomet/asz024
BLUM, J. R., KIEFER, J. and ROSENBLATT, M. (1961). Distribution free tests of independence based on the sample distribution function. Ann. Math. Stat. 32 485–498. MR0125690 https://doi.org/10.1214/aoms/1177705055
BODNAR, T., DETTE, H. and PAROLYA, N. (2019). Testing for independence of large dimensional vectors. Ann. Statist. 47 2977–3008. MR3988779 https://doi.org/10.1214/18-AOS1771
CHAKRABORTY, S. and ZHANG, X. (2021). A new framework for distance and kernel-based metrics in high dimensions. Electron. J. Stat. 15 5455–5522. MR4352549 https://doi.org/10.1214/21-ejs1889
CHATTERJEE, S. (2021). A new coefficient of correlation. J. Amer. Statist. Assoc. 116 2009–2022. MR4353729 https://doi.org/10.1080/01621459.2020.1758115
CHIANG, A. P., BECK, J. S., YEN, H., TAYEH, M. K., SCHEETZ, T. E., SWIDERSKI, R. E., NISHIMURA, D. Y., BRAUN, T. A., KIM, K.-Y. A. et al. (2006). Homozygosity mapping with SNP arrays identifies TRIM32, an E3 ubiquitin ligase, as a Bardet–Biedl syndrome gene (BBS11). Proc. Natl. Acad. Sci. USA 103 6287–6292.
DEB, N. and SEN, B. (2023). Multivariate rank-based distribution-free nonparametric testing using measure transportation. J. Amer. Statist. Assoc. 118 192–207. MR4571116 https://doi.org/10.1080/01621459.2021.1923508
DRTON, M., HAN, F. and SHI, H. (2020). High-dimensional consistent independence testing with maxima of rank correlations. Ann. Statist. 48 3206–3227. MR4185806 https://doi.org/10.1214/19-AOS1926
GAO, L., FAN, Y., LV, J. and SHAO, Q.-M. (2021). Asymptotic distributions of high-dimensional distance correlation inference. Ann. Statist. 49 1999–2020. MR4319239 https://doi.org/10.1214/20-aos2024
GORSKY, S. and MA, L. (2022). Multi-scale Fisher’s independence test for multivariate dependence. Biometrika 109 569–587. MR4472834 https://doi.org/10.1093/biomet/asac013
GRETTON, A., FUKUMIZU, K., TEO, C., SONG, L., SCHÖLKOPF, B. and SMOLA, A. (2008). A kernel statistical test of independence. In Advances in Neural Information Processing Systems 585–592.
HALL, P. (1984). Central limit theorem for integrated square error of multivariate nonparametric density estimators. J. Multivariate Anal. 14 1–16. MR0734096 https://doi.org/10.1016/0047-259X(84)90044-7
HALL, P. and HEYDE, C. C. (1980). Martingale Limit Theory and Its Application. Probability and Mathematical Statistics. Academic Press, New York. MR0624435
HAN, F., CHEN, S. and LIU, H. (2017). Distribution-free tests of independence in high dimensions. Biometrika 104 813–828. MR3737306 https://doi.org/10.1093/biomet/asx050
HELLER, R., HELLER, Y. and GORFINE, M. (2013). A consistent multivariate test of association based on ranks of distances. Biometrika 100 503–510. MR3068450 https://doi.org/10.1093/biomet/ass070
HOEFFDING, W. (1948). A non-parametric test of independence. Ann. Math. Stat. 19 546–557. MR0029139 https://doi.org/10.1214/aoms/1177730150
JIANG, T. and YANG, F. (2013). Central limit theorems for classical likelihood ratio tests for high-dimensional normal distributions. Ann. Statist. 41 2029–2074. MR3127857 https://doi.org/10.1214/13-AOS1134
KENDALL, M. G. (1938). A new measure of rank correlation. Biometrika 30 81–93.
LEE, D., ZHANG, K. and KOSOROK, M. R. (2023). The binary expansion randomized ensemble test. Statist. Sinica 33 2381–2403. MR4647039
LEUNG, D. and DRTON, M. (2018). Testing independence in high dimensions with sums of rank correlations. Ann. Statist. 46 280–307. MR3766953 https://doi.org/10.1214/17-AOS1550
LI, R., XU, K., ZHOU, Y. and ZHU, L. (2023). Testing the effects of high-dimensional covariates via aggregating cumulative covariances. J. Amer. Statist. Assoc. 118 2184–2194. MR4646635 https://doi.org/10.1080/01621459.2022.2044334
MOON, H. and CHEN, K. (2022). Interpoint-ranking sign covariance for the test of independence. Biometrika 109 165–179. MR4374647 https://doi.org/10.1093/biomet/asab011
MUDHOLKAR, G. S. and WILDING, G. E. (2003). On the conventional wisdom regarding two consistent tests of bivariate independence. Statistician 52 41–57. MR1973881 https://doi.org/10.1111/1467-9884.00340
PEARSON, K. (1900). On the criterion that a given system of deviations from the probable in the case of a correlated system of variables is such that it can be reasonably supposed to have arisen from random sampling. Philos. Mag. Ser. 5 50 157–175.
SCHEETZ, T. E., KIM, K.-Y. A., SWIDERSKI, R. E., PHILP, A. R., BRAUN, T. A., KNUDTSON, K. L., DORRANCE, A. M., DIBONA, G. F., HUANG, J. et al. (2006). Regulation of gene expression in the mammalian eye and its relevance to eye disease. Proc. Natl. Acad. Sci. USA 103 14429–14434.
SHEN, C., PRIEBE, C. E. and VOGELSTEIN, J. T. (2020). From distance correlation to multiscale graph correlation. J. Amer. Statist. Assoc. 115 280–291. MR4078463 https://doi.org/10.1080/01621459.2018.1543125
SHI, H., DRTON, M. and HAN, F. (2022a). On the power of Chatterjee’s rank correlation. Biometrika 109 317–333. MR4430960 https://doi.org/10.1093/biomet/asab028
SHI, H., DRTON, M. and HAN, F. (2022b). Distribution-free consistent independence tests via center-outward ranks and signs. J. Amer. Statist. Assoc. 117 395–410. MR4399094 https://doi.org/10.1080/01621459.2020.1782223
SHI, H., HALLIN, M., DRTON, M. and HAN, F. (2022). On universally consistent and fully distribution-free rank tests of vector independence. Ann. Statist. 50 1933–1959. MR4474478 https://doi.org/10.1214/21-aos2151
SPEARMAN, C. (1904). The proof and measurement of association between two things. Amer. J. Psychol. 15 72–101.
SZÉKELY, G. J. and RIZZO, M. L. (2013). The distance correlation t-test of independence in high dimension. J. Multivariate Anal. 117 193–213. MR3053543 https://doi.org/10.1016/j.jmva.2013.02.012
SZÉKELY, G. J., RIZZO, M. L. and BAKIROV, N. K. (2007). Measuring and testing dependence by correlation of distances. Ann. Statist. 35 2769–2794. MR2382665 https://doi.org/10.1214/009053607000000505
WEIHS, L., DRTON, M. and MEINSHAUSEN, N. (2018). Symmetric rank covariances: A generalized framework for nonparametric measures of dependence. Biometrika 105 547–562. MR3842884 https://doi.org/10.1093/biomet/asy021
XU, K. and ZHU, L. (2022). Power analysis of projection-pursuit independence tests. Statist. Sinica 32 417–433. MR4359639 https://doi.org/10.5705/ss.202019.0457
YAMADA, Y., HYODO, M. and NISHIYAMA, T. (2017). Testing block-diagonal covariance structure for high-dimensional data under non-normality. J. Multivariate Anal. 155 305–316. MR3607897 https://doi.org/10.1016/j.jmva.2016.12.009
YANAGIMOTO, T. (1970). On measures of association and a related problem. Ann. Inst. Statist. Math. 22 57–63.
YANG, Y. and PAN, G. (2015). Independence test for high dimensional data based on regularized canonical correlation coefficients. Ann. Statist. 43 467–500. MR3316187 https://doi.org/10.1214/14-AOS1284
YAO, S., ZHANG, X. and SHAO, X. (2018). Testing mutual independence in high dimension via distance covariance. J. R. Stat. Soc. Ser. B. Stat. Methodol. 80 455–480. MR3798874 https://doi.org/10.1111/rssb.12259
ZHANG, K. (2019). BET on independence. J. Amer. Statist. Assoc. 114 1620–1637. MR4047288 https://doi.org/10.1080/01621459.2018.1537921
ZHOU, Y., XU, K., ZHU, L. and LI, R. (2024). Supplement to “Rank-based indices for testing independence between two high-dimensional vectors.” https://doi.org/10.1214/23-AOS2339SUPP
ZHOU, Y. and ZHU, L. (2018). Model-free feature screening for ultrahigh dimensional data through a modified Blum–Kiefer–Rosenblatt correlation. Statist. Sinica 28 1351–1370. MR3821008
ZHU, C., ZHANG, X., YAO, S. and SHAO, X. (2020). Distance-based and RKHS-based dependence metrics in high dimension. Ann. Statist. 48 3366–3394. MR4185812 https://doi.org/10.1214/19-AOS1934
ZHU, L., XU, K., LI, R. and ZHONG, W. (2017). Projection correlation between two random vectors. Biometrika 104 829–843. MR3737307 https://doi.org/10.1093/biomet/asx043
ZHU, L., ZHANG, Y. and XU, K. (2018). Measuring and testing for interval quantile dependence. Ann. Statist. 46 2683–2710. MR3851752 https://doi.org/10.1214/17-AOS1635