The Annals of Statistics
2024, Vol. 52, No. 1, 184–206

https://doi.org/10.1214/23-AOS2339
© Institute of Mathematical Statistics, 2024
RANK-BASED INDICES FOR TESTING INDEPENDENCE BETWEEN TWO HIGH-DIMENSIONAL VECTORS
BY YEQING ZHOU1,a, KAI XU2,b, LIPING ZHU3,c AND RUNZE LI4,d

1 School of Mathematical Sciences and Key Laboratory of Intelligent Computing and Applications, Tongji University,
a zhouyeqing@tongji.edu.cn
2 School of Mathematics and Statistics, Anhui Normal University, b tjxxukai@ahnu.edu.cn
3 Center for Applied Statistics, Institute of Statistics and Big Data, Renmin University of China
Zhijiang Institute of Big Data and Statistics, School of Statistics and Mathematics, Zhejiang Gongshang University,
c zhu.liping@ruc.edu.cn
4 Department of Statistics, The Pennsylvania State University at University Park, d rzli@psu.edu
To test independence between two high-dimensional random vectors, we propose three tests
based on the rank-based indices derived from Hoeffding’s D, Blum–Kiefer–Rosenblatt’s R and
Bergsma–Dassios–Yanagimoto’s τ ∗ . Under the null hypothesis of independence, we show that
the distributions of the proposed test statistics converge to normal ones if the dimensions
diverge arbitrarily with the sample size. We further derive an explicit rate of convergence.
Thanks to the monotone transformation-invariant property, these distribution-free tests can
be readily applied to generally distributed random vectors, including heavy-tailed ones. We
further study the local power of the proposed tests and compare their relative efficiencies
with two classic distance covariance/correlation based tests in high-dimensional settings.
We establish explicit relationships between D, R, τ ∗ and Pearson’s correlation for bivariate
normal random variables. The relationships serve as a basis for power comparison. Our
theoretical results show that under a Gaussian equicorrelation alternative: (i) the proposed
tests are superior to the two classic distance covariance/correlation based tests if the
components of the random vectors have very different scales; (ii) the asymptotic efficiencies
of the proposed tests based on D, τ ∗ and R are sorted in descending order.
1. Introduction. Testing independence between random vectors plays an important
role in statistics, economics, machine learning and many other scientific fields. Let x =
(X1 , . . . , Xp )T ∈ Rp and y = (Y1 , . . . , Yq )T ∈ Rq be two random vectors with possibly dif-
ferent dimensions. We aim to test
(1) H0 : x and y are independent, versus, H1 : otherwise,
under the asymptotic regime where either the dimension of x, or that of y, or both, diverge to
infinity as the sample size n grows.
In the fixed-dimensional setting, the problem of testing independence has been extensively
studied. For a pair of univariate random variables, the widely used metrics to quantify de-
pendence include the classic Pearson’s correlation (Pearson (1900)), Kendall’s tau (Kendall
(1938)) and Spearman’s rho (Spearman (1904)). These correlations, however, can only de-
tect linear or monotone associations, hence are not consistent in independence tests. To ad-
dress this issue, several rank-based metrics have been proposed, including Hoeffding’s D
(Hoeffding (1948)), Blum–Kiefer–Rosenblatt’s R (Blum, Kiefer and Rosenblatt (1961)) and
Bergsma–Dassios–Yanagimoto’s τ ∗ (Bergsma and Dassios (2014), Yanagimoto (1970)). For
absolutely continuous bivariate distributions, the three measures D, R and τ ∗ possess the
Received May 2022; revised August 2023.

MSC2020 subject classifications. Primary 62G10; secondary 62G20.
Key words and phrases. Bergsma–Dassios–Yanagimoto’s τ ∗ , Blum–Kiefer–Rosenblatt’s R, degenerate U -
statistics, Hoeffding’s D.
I-consistent, D-consistent and Monotone transformation-invariant properties that are elaborated
by Weihs, Drton and Meinshausen (2018) and Drton, Han and Shi (2020). The tests
based on these metrics are consistent. Modifications of these correlations are used by Zhou
and Zhu (2018) for feature screening, and Zhu, Zhang and Xu (2018) to quantify the degree
of interval quantile dependence. Recently, Chatterjee (2021) introduced a simple rank-based
correlation, which yields an asymptotically normal test under independence. However, Shi,
Drton and Han (2022a) showed that Chatterjee’s coefficient is rate suboptimal compared to
D, R and τ ∗ , and hence advocated using D, R and τ ∗ for testing independence. For mul-
tivariate x and y, the tests of independence can be constructed through distance covariance
(Székely, Rizzo and Bakirov (2007)), Hilbert–Schmidt independence criterion (Gretton et al.
(2008), Albert et al. (2022)), ranks of distances (Heller, Heller and Gorfine (2013)), projection
covariance (Xu and Zhu (2022), Zhu et al. (2017)), symmetric rank covariance (Weihs, Drton
and Meinshausen (2018)), mutual information (Berrett and Samworth (2019)), binary expan-
sion (Zhang (2019), Lee, Zhang and Kosorok (2023)), multiscale graph covariance (Shen,
Priebe and Vogelstein (2020)), interpoint-ranking sign covariance (Moon and Chen (2022)),
multiscale Fisher’s test (Gorsky and Ma (2022)), center-outward ranks and signs (Shi, Dr-
ton and Han (2022b)), multivariate ranks (Deb and Sen (2023)) and so forth. However, the
theoretical properties of these nonlinear metrics may be totally different in high-dimensional
settings. For example, Zhu et al. (2020) found that the distance/Hilbert–Schmidt covariance
based tests can only detect linear dependence in high dimensions. A similar phenomenon also
appears in the projection covariance based tests.
To deal with high-dimensional x and y, two lines of research have been pursued. The first
line focuses on testing a block diagonal structure of the covariance matrix. Jiang and
Yang (2013), Yang and Pan (2015) and Bodnar, Dette and Parolya (2019) corrected tradi-
tional likelihood ratio and trace criteria tests (Anderson (2003)) under the situation where the
ratio between the dimensions and the sample size converges to a positive constant. Yamada,
Hyodo and Nishiyama (2017) used an empirical distance approach to conduct the test un-
der “large dimension, small sample size” paradigm. These covariance-based tests are limited
to testing the uncorrelatedness due to the limitation of Pearson correlation. Recently, using
rank correlations such as Kendall’s tau and Spearman’s rho, Bao et al. (2015), Han, Chen
and Liu (2017), Leung and Drton (2018) and Bao (2019) proposed tests for mutual indepen-
dence in high dimensions. The second line of research is based on the arbitrarily nonlinear,
nonmonotone dependence metrics. To be specific, Székely and Rizzo (2013) developed a
bias-corrected distance correlation based test when p and q diverge and n is fixed. This test,
however, has trivial limiting power when x and y are componentwise uncorrelated. To address
this issue, Zhu et al. (2020) suggested using an aggregation of marginal sample distance
covariances and Hilbert–Schmidt covariances as test statistics. Their limiting null distribu-
tions are investigated under the setting, where p and q grow more rapidly than n. Moreover,
Gao et al. (2021) considered a rescaled distance correlation and derived an explicit rate of
convergence to the limiting distribution. It is worth noting that distance correlation is con-
sistent but not invariant under strictly monotone transformations of any coordinates (Weihs,
Drton and Meinshausen (2018)).
In this paper, we propose three tests of independence between two high-dimensional vec-
tors through rank-based indices: D of Hoeffding (1948), R of Blum, Kiefer and Rosenblatt
(1961) and τ ∗ of Bergsma and Dassios (2014). Due to the monotone transformation-invariant
property, the proposed tests do not require moment conditions, and hence are readily appli-
cable to generally distributed random vectors including the heavy-tailed ones. Moreover, the
proposed tests are resistant to heteroscedastic vectors. It is worth mentioning that our pro-
posed tests are intrinsically different from existing ones, such as Leung and Drton (2018)
and Drton, Han and Shi (2020), that are also built on rank based indices in high dimensions.
Methodologically, we deal with a different problem from Leung and Drton (2018) and Drton,
Han and Shi (2020). They developed tests for mutual independence among the coordinates of
a single high-dimensional random vector, whereas our proposed tests focus on testing inde-
pendence between two random vectors when at least one dimension is large. This implies that
the components of either x or y may not be independent of each other, even under the null
hypothesis studied in this paper. This makes the theoretical derivations for our proposed tests
much more challenging than those of the tests studied by Leung and Drton (2018) and Drton,
Han and Shi (2020) under their null hypothesis. In particular, our derivations shed some light
on the behaviors of U -statistics in the high-dimensional setting under dependence. When
max(p, q) → ∞, which is very unlike the fixed dimension or mutual independence cases,
both the dependence between variables and the high dimensions bring ambiguity in assessing
the relative orders of terms in the Hoeffding decompositions. Even when x and y are indepen-
dent, the leading term in the variance decomposition for the proposed test statistics depends
on unknown dependence structures. Moreover, the proposed tests allow for universal asymp-
totics, in the sense that their null asymptotic distributions hold without imposing any explicit
restriction on the growth rates of p and q when both are regarded as functions of n, although
a few implicit restrictions are imposed in Assumption 1. The details will be discussed in
Propositions 2 and 3. By contrast, Drton, Han and Shi (2020) require that log(p) = o(n^θ) for
a constant θ. We further provide an explicit uniform bound on the error of normal
approximation, and study the asymptotic relative efficiency of our proposed test based on nonlinear
and nonmonotone dependence metrics.
We summarize our main contributions as follows:
(a) Under H0 , we show that all three rank-based tests have null distributions that converge
to normal ones under the setting that max(p, q) diverges to infinity as n grows. The critical
values can thus be determined by normal approximation, avoiding expensive computation to
obtain critical values through permutation. We further derive the explicit rates of convergence
for these rank-based tests. Thanks to the blessing of dimension, the normal approximation is
more accurate as the dimensions increase.
(b) For bivariate normal variables with correlation ρ, Drton, Han and Shi (2020) established
the relationship that
$$D,\ R,\ \tau^* \asymp \rho^2 \quad \text{as } \rho \to 0.$$
However, the explicit relationships between D, R, τ ∗ and ρ remain unexplored. We fill this gap
and use the explicit relationships to conduct power comparisons from the asymptotic effi-
ciency perspective.
(c) Theoretically, we compare the relative efficiency with the distance covariance/correla-
tion based tests for high-dimensional data: Székely and Rizzo (2013)’s Student t-test and Zhu
et al. (2020)’s aggregated distance covariance based test. We show that, under a Gaussian
equicorrelation alternative, our proposed tests have significant power gain when the components
of x or y have very different scales. In addition, the asymptotic efficiencies of the rank
tests based on D, τ ∗ and R are sorted in descending order.
This paper is organized as follows. In Section 2, we propose the rank-based tests, and study
their properties. In Section 3, we compare the asymptotic relative efficiency of our proposed
tests with the distance covariance or correlation based tests. In Section 4, we assess the finite-
sample performance of our proposed tests through simulation studies and an application. We
discuss some extensions in Section 5. All proofs and additional simulations are relegated to
the Supplementary Material (Zhou et al. (2024)).
2. Independence tests through rank based correlations.
2.1. Rank-based indices defined by U -statistic theory. Suppose that we draw a ran-
dom sample, {(xi , yi ) : i = 1, . . . , n}, from the joint distribution of (x, y), where xi =
(Xi,1 , . . . , Xi,p )T and yi = (Yi,1 , . . . , Yi,q )T . To avoid ties among the observations, we merely
consider continuous random vectors in the sequel. We are interested in the class of correlation
measures that satisfy the I-consistent, D-consistent and Monotone transformation-invariant
properties, and focus on three typical examples: Hoeffding’s D, Blum–Kiefer–Rosenblatt’s
R and Bergsma–Dassios–Yanagimoto’s τ ∗ in this paper. To lay down the foundation of these
three tests, we first review the estimates of these correlations under the framework of U -
statistic theory.
Let $h : (\mathbb{R} \times \mathbb{R})^d \to \mathbb{R}$ be a fixed kernel of order d ≥ 2 that is symmetric, that is, invariant
to permutations of its arguments. For any pair of indices k ∈ {1, . . . , p} and l ∈
{1, . . . , q}, a standard U -statistic induced by the kernel h takes the form
$$U_h^{(kl)} = \{C(n,d)\}^{-1} \sum_{1 \le i_1 < \cdots < i_d \le n} h\{(X_{i_1,k}, Y_{i_1,l}), \ldots, (X_{i_d,k}, Y_{i_d,l})\},$$
where C(n, d) denotes the number of all combinations of d distinct elements from {1, . . . , n}.
For the sake of notation clarity, we define zi,kl = (Xi,k , Yi,l ) for 1 ≤ i ≤ n, 1 ≤ k ≤ p and 1 ≤
l ≤ q, and Pm as the set of all permutations of {1, . . . , m}. We further define ψ(z1 , z2 , z3 ) =
I (z2 < z1 ) − I (z3 < z1 ), where I (·) is an indicator function, and ω(z1 , z2 , z3 , z4 ) = I (z1 ∨
z3 < z2 ∧ z4 ) + I (z1 ∨ z3 > z2 ∧ z4 ) − I (z1 ∨ z2 < z3 ∧ z4 ) − I (z1 ∨ z2 > z3 ∧ z4 ), where
a ∨ b = max(a, b) and a ∧ b = min(a, b).
The kernels of the aforementioned three correlations are given as follows.
DEFINITION 1. (a) The kernel $h^{(D)}$ for Hoeffding’s D is defined as
$$h^{(D)}(z_{1,kl}, \ldots, z_{5,kl}) = \frac{1}{480} \sum_{(i_1,\ldots,i_5) \in \mathcal{P}_5} \psi(X_{i_1,k}, X_{i_2,k}, X_{i_5,k})\, \psi(X_{i_3,k}, X_{i_4,k}, X_{i_5,k})\, \psi(Y_{i_1,l}, Y_{i_2,l}, Y_{i_5,l})\, \psi(Y_{i_3,l}, Y_{i_4,l}, Y_{i_5,l}).$$
(b) The kernel $h^{(R)}$ for Blum–Kiefer–Rosenblatt’s R is defined as
$$h^{(R)}(z_{1,kl}, \ldots, z_{6,kl}) = \frac{1}{2880} \sum_{(i_1,\ldots,i_6) \in \mathcal{P}_6} \psi(X_{i_1,k}, X_{i_2,k}, X_{i_5,k})\, \psi(X_{i_3,k}, X_{i_4,k}, X_{i_5,k})\, \psi(Y_{i_1,l}, Y_{i_2,l}, Y_{i_6,l})\, \psi(Y_{i_3,l}, Y_{i_4,l}, Y_{i_6,l}).$$
(c) The kernel $h^{(\tau^*)}$ for Bergsma–Dassios–Yanagimoto’s τ ∗ is defined as
$$h^{(\tau^*)}(z_{1,kl}, \ldots, z_{4,kl}) = \frac{1}{24} \sum_{(i_1,\ldots,i_4) \in \mathcal{P}_4} \omega(X_{i_1,k}, X_{i_2,k}, X_{i_3,k}, X_{i_4,k})\, \omega(Y_{i_1,l}, Y_{i_2,l}, Y_{i_3,l}, Y_{i_4,l}).$$

In Definition 1, the kernels $h^{(D)}$, $h^{(R)}$ and $h^{(\tau^*)}$ are symmetric and bounded. Their
expectations $E\{h^{(D)}\}$, $E\{h^{(R)}\}$ and $E\{h^{(\tau^*)}\}$ are nonnegative, and equal zero if and only if
independence holds.
Let $h_c(z_{1,kl}, \ldots, z_{c,kl}) = E\{h(z_{1,kl}, \ldots, z_{d,kl}) \mid z_{1,kl}, \ldots, z_{c,kl}\}$ be the projections of h onto
the lower-dimensional spaces, $\tilde h = h - E\{h(z_{1,kl}, \ldots, z_{d,kl})\}$ and $\tilde h_c = h_c - E\{h(z_{1,kl}, \ldots, z_{d,kl})\}$
for c = 1, . . . , d. Let
$$g_h^{(c)}(z_{1,kl}, \ldots, z_{c,kl}) = \tilde h_c - \sum_{j=1}^{c-1} \sum_{1 \le i_1 < \cdots < i_j \le c} g_h^{(j)}(z_{i_1,kl}, \ldots, z_{i_j,kl}), \tag{2}$$
where $g_h^{(1)}(z_{1,kl}) = \tilde h_1(z_{1,kl})$. The kernel h and the U -statistic $U_h^{(kl)}$ are nondegenerate if
$\mathrm{var}\{g_h^{(1)}(z_{1,kl})\} > 0$, and degenerate if $\mathrm{var}\{g_h^{(1)}(z_{1,kl})\} = 0$. It can be verified that, for
$j \in \{i_1, \ldots, i_6\}$,
$$E\{\psi(Z_{i_1}, Z_{i_2}, Z_{i_3})\psi(Z_{i_1}, Z_{i_4}, Z_{i_5}) \mid Z_j\} = E\{\psi(Z_{i_1}, Z_{i_3}, Z_{i_4})\psi(Z_{i_2}, Z_{i_5}, Z_{i_6}) \mid Z_j\} = E\{\omega(Z_{i_1}, Z_{i_2}, Z_{i_3}, Z_{i_4}) \mid Z_j\} = 0. \tag{3}$$
Therefore, $h^{(D)}$, $h^{(R)}$ and $h^{(\tau^*)}$ are degenerate if $X_k$ and $Y_l$ are independent.
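2.2. The family of proposed tests. To test H0 in (1), we consider the test statistic, which
aggregates the pairwise U -statistics $U_h^{(kl)}$ to form
$$T_h = \kappa_h \{C(d,2)\}^{-1} \{C(n,2)\}^{1/2} \sum_{k=1}^{p} \sum_{l=1}^{q} U_h^{(kl)}, \tag{4}$$
where d is the order of the kernel, and $\kappa_h$ is a finite adjustment factor,
$$d = \begin{cases} 5, & \text{if } h = h^{(D)}, \\ 6, & \text{if } h = h^{(R)}, \\ 4, & \text{if } h = h^{(\tau^*)}, \end{cases} \qquad \text{and} \qquad \kappa_h = \begin{cases} 40, & \text{if } h = h^{(D)}, \\ 60, & \text{if } h = h^{(R)}, \\ 2/3, & \text{if } h = h^{(\tau^*)}. \end{cases}$$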
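The aggregation in (4) is simple arithmetic once the pairwise U-statistics are available. The sketch below assumes they are supplied as a p × q nested list; the names `KERNEL_CONSTANTS`, `scale_factor` and `aggregate_statistic` are hypothetical, and the adjustment factor is called `kappa` in the code.

```python
from math import comb, sqrt

# (order d, adjustment factor) for each kernel, as listed below equation (4).
KERNEL_CONSTANTS = {
    "D": (5, 40.0),
    "R": (6, 60.0),
    "tau_star": (4, 2.0 / 3.0),
}


def scale_factor(kernel):
    """Adjustment factor divided by C(d, 2) for the chosen kernel."""
    d, kappa = KERNEL_CONSTANTS[kernel]
    return kappa / comb(d, 2)


def aggregate_statistic(u_matrix, n, kernel):
    """T_h: rescaled sum of all pairwise U-statistics for sample size n."""
    total = sum(sum(row) for row in u_matrix)
    return scale_factor(kernel) * sqrt(comb(n, 2)) * total


# Toy 2 x 2 matrix of pairwise U-statistics (made-up numbers for illustration).
u_toy = [[0.1, 0.2], [0.3, 0.4]]
t_toy = aggregate_statistic(u_toy, n=10, kernel="tau_star")
```

With these constants the factor multiplying the sum of U-statistics is 4 for both D and R and 1/9 for τ ∗, before the common term {C(n, 2)}^{1/2}.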
We now provide an estimate for the variance of $T_h$. Let
$$A_1^{(k)}(u,v) = F_{X_k}^2(u) + F_{X_k}^2(v) - 2\max\{F_{X_k}(u), F_{X_k}(v)\} + 2/3, \quad k = 1, \ldots, p,$$
$$A_2^{(l)}(u,v) = F_{Y_l}^2(u) + F_{Y_l}^2(v) - 2\max\{F_{Y_l}(u), F_{Y_l}(v)\} + 2/3, \quad l = 1, \ldots, q,$$
where $F_{X_k}(u) = \mathrm{pr}(X_k \le u)$ and $F_{Y_l}(v) = \mathrm{pr}(Y_l \le v)$. Both $A_1^{(k)}(u,v)$ and $A_2^{(l)}(u,v)$ satisfy
$E\{A_1^{(k)}(X_k, v)\} = E\{A_1^{(k)}(u, X_k)\} = E\{A_2^{(l)}(Y_l, v)\} = E\{A_2^{(l)}(u, Y_l)\} = 0$. By Hoeffding
decomposition, we have, under H0 ,
$$\kappa_h \{C(d,2)\}^{-1} U_h^{(kl)} = \{C(n,2)\}^{-1} \sum_{1 \le i < j \le n} A_1^{(k)}(X_{ik}, X_{jk}) A_2^{(l)}(Y_{il}, Y_{jl}) + R_h^{(kl)},$$
where $R_h^{(kl)}$ is a remainder term. It follows that $T_h = J^{(1)} + J_h^{(2)}$, where
$$J^{(1)} = \{C(n,2)\}^{-1/2} \sum_{1 \le i < j \le n} \sum_{k=1}^{p} \sum_{l=1}^{q} A_1^{(k)}(X_{ik}, X_{jk}) A_2^{(l)}(Y_{il}, Y_{jl}), \quad \text{and} \quad J_h^{(2)} = \{C(n,2)\}^{1/2} \sum_{k=1}^{p} \sum_{l=1}^{q} R_h^{(kl)}.$$
If both p and q are fixed, by the standard U -statistic theory, $J^{(1)}$ plays a dominating role in
determining the asymptotic null distributions of our proposed rank-based indices. This motivates
us to anticipate that the phenomenon remains true in high-dimensional settings. In
Lemma 1 of the Supplementary Material, we show that, under H0 , $J^{(1)}$ remains the
leading term and $J_h^{(2)}$ is asymptotically negligible even when max(p, q) → ∞. Therefore,
under H0 , the variance of $T_h$, denoted as $S^2$, is dominated by
$$\mathrm{var}\{J^{(1)}\} = \sum_{k_1=1}^{p} \sum_{k_2=1}^{p} E\{A_1^{(k_1)}(X_{1k_1}, X_{2k_1}) A_1^{(k_2)}(X_{1k_2}, X_{2k_2})\} \times \sum_{l_1=1}^{q} \sum_{l_2=1}^{q} E\{A_2^{(l_1)}(Y_{1l_1}, Y_{2l_1}) A_2^{(l_2)}(Y_{1l_2}, Y_{2l_2})\}. \tag{5}$$
By the definition of $J^{(1)}$, the variance term, $S^2$, does not depend on h under H0 in an
asymptotic sense.
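The two basic properties of the functions A used in the variance decomposition can be checked numerically on the probability scale, where F(X) is uniform on (0, 1) for a continuous X. The sketch below is a midpoint-rule approximation with hypothetical helper names; it illustrates the centring property and the second moment 2/45, whose square 4/2025 is the per-pair constant in the lower bound on S² given after Proposition 1.

```python
def a_uniform(s, t):
    """A(u, v) written on the probability scale s = F(u), t = F(v)."""
    return s * s + t * t - 2.0 * max(s, t) + 2.0 / 3.0


def midpoint_grid(m):
    """Midpoints of m equal cells partitioning (0, 1)."""
    return [(i + 0.5) / m for i in range(m)]


def centred_mean(s, m=400):
    """Approximates E{A(u, X)} for fixed s = F(u) and F(X) uniform on (0, 1)."""
    return sum(a_uniform(s, t) for t in midpoint_grid(m)) / m


def second_moment(m=200):
    """Approximates E{A(X1, X2)^2} for independent copies X1 and X2."""
    grid = midpoint_grid(m)
    return sum(a_uniform(s, t) ** 2 for s in grid for t in grid) / (m * m)


mean_at_03 = centred_mean(0.3)     # close to 0
moment = second_moment()           # close to 2/45
pairwise_constant = moment ** 2    # close to 4/2025
```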
Proposition 1 serves as a basis to construct an unbiased estimate of S 2 .
PROPOSITION 1. Suppose both x and y are continuous random vectors. It follows
that $E\{A_1^{(k_1)}(X_{1k_1}, X_{2k_1}) A_1^{(k_2)}(X_{1k_2}, X_{2k_2})\}$ can be simplified to
$4E[\mathrm{cov}^2\{I(X_{3k_1} < X_{1k_1}), I(X_{3k_2} < X_{2k_2}) \mid X_{1k_1}, X_{2k_2}\}]$, for $k_1, k_2 = 1, \ldots, p$. In parallel,
$E\{A_2^{(l_1)}(Y_{1l_1}, Y_{2l_1}) A_2^{(l_2)}(Y_{1l_2}, Y_{2l_2})\}$ can be simplified to
$4E[\mathrm{cov}^2\{I(Y_{3l_1} < Y_{1l_1}), I(Y_{3l_2} < Y_{2l_2}) \mid Y_{1l_1}, Y_{2l_2}\}]$ for $l_1, l_2 = 1, \ldots, q$.
Proposition 1 ensures that
$$0 < \frac{4pq}{2025} = \sum_{k=1}^{p} \sum_{l=1}^{q} E\{A_1^{(k)}(X_{1k}, X_{2k})^2\}\, E\{A_2^{(l)}(Y_{1l}, Y_{2l})^2\} \le S^2,$$
which implies that $S^2 \to \infty$ when either p or q or both diverge to infinity. Under H0 , the
unbiased estimate of $S^2$ is defined as follows:
$$\hat S^2 = \sum_{k_1=1}^{p} \sum_{k_2=1}^{p} \hat S_1^{(k_1 k_2)} \sum_{l_1=1}^{q} \sum_{l_2=1}^{q} \hat S_2^{(l_1 l_2)}, \tag{6}$$
where
$$\hat S_1^{(k_1 k_2)} = \{n(n-1)(n-2)(n-3)(n-4)(n-5)\}^{-1} \sum_{(i_1,\ldots,i_6)}^{n} \psi(X_{i_1 k_1}, X_{i_3 k_1}, X_{i_4 k_1})\, \psi(X_{i_2 k_1}, X_{i_5 k_1}, X_{i_6 k_1})\, \psi(X_{i_1 k_2}, X_{i_3 k_2}, X_{i_4 k_2})\, \psi(X_{i_2 k_2}, X_{i_5 k_2}, X_{i_6 k_2}),$$
and
$$\hat S_2^{(l_1 l_2)} = \{n(n-1)(n-2)(n-3)(n-4)(n-5)\}^{-1} \sum_{(i_1,\ldots,i_6)}^{n} \psi(Y_{i_1 l_1}, Y_{i_3 l_1}, Y_{i_4 l_1})\, \psi(Y_{i_2 l_1}, Y_{i_5 l_1}, Y_{i_6 l_1})\, \psi(Y_{i_1 l_2}, Y_{i_3 l_2}, Y_{i_4 l_2})\, \psi(Y_{i_2 l_2}, Y_{i_5 l_2}, Y_{i_6 l_2}).$$
In the above displays, the summations are taken over all possible permutations of distinct
indices. Directly calculating $\hat S_1^{(k_1 k_2)}$ and $\hat S_2^{(l_1 l_2)}$ involves a computational complexity of order
$O(n^6)$, which is prohibitive even for moderate sample sizes. To reduce the computational burden,
an efficient algorithm introduced by Weihs, Drton and Meinshausen ((2018), lines 14–17,
page 556) is implemented, which reduces the computational cost of calculating $\hat S_1^{(k_1 k_2)}$ and
$\hat S_2^{(l_1 l_2)}$ from $O(n^6)$ to $O(n^2)$. With $\hat S$ defined in (6), the normalized test statistic has the form
$$\hat T_h = T_h / \hat S.$$
Next, we establish the asymptotic null distribution of $\hat T_h$, which serves as a basis to provide a
decision rule for our proposed testing procedure.
2.3. Asymptotic analysis under the null. Define
$$V_1(x_1, x_2) = \sum_{k=1}^{p} A_1^{(k)}(X_{1k}, X_{2k}), \quad \text{and} \quad V_2(y_1, y_2) = \sum_{l=1}^{q} A_2^{(l)}(Y_{1l}, Y_{2l}).$$
We use the martingale central limit theorem (Hall and Heyde (1980)) to derive the asymptotic
null distribution of $T_h$. The following assumption, which is closely related to condition (2.1)
of Hall (1984), is imposed to facilitate the use of the martingale central limit theorem.
ASSUMPTION 1. Assume that, as p → ∞,
$$\frac{E\{V_1(x_1, x_2)^4\}}{n E^2\{V_1(x_1, x_2)^2\}} \to 0, \qquad \frac{E\{V_1(x_1, x_2) V_1(x_2, x_3) V_1(x_3, x_4) V_1(x_4, x_1)\}}{E^2\{V_1(x_1, x_2)^2\}} \to 0;$$
and in parallel, assume that, as q → ∞,
$$\frac{E\{V_2(y_1, y_2)^4\}}{n E^2\{V_2(y_1, y_2)^2\}} \to 0, \qquad \frac{E\{V_2(y_1, y_2) V_2(y_2, y_3) V_2(y_3, y_4) V_2(y_4, y_1)\}}{E^2\{V_2(y_1, y_2)^2\}} \to 0.$$
Assumption 1 is indeed very mild. To gain insights into this assumption, we define
(Hk gk )(·) = E{Vk (z, ·)gk (z)}, where z can be x or y, and k = 1, 2. Let {λkl , l = 1, . . .} be
the eigenvalues of $H_k$. To obtain a normal limit for the statistic, the following condition is
typically required (Hall (1984), p. 3). To be specific, it is assumed that
$$\Big(\sum_{l=1}^{\infty} \lambda_{kl}^{t}\Big)^{2} \Big/ \Big(\sum_{l=1}^{\infty} \lambda_{kl}^{2}\Big)^{t} \to 0, \tag{7}$$
for some t > 2. By the definitions of the $H_k$'s, we have
$$E\{V_k(z_1, z_2) V_k(z_2, z_3) V_k(z_3, z_4) V_k(z_4, z_1)\} = \sum_{l=1}^{\infty} \lambda_{kl}^{4}, \qquad E\{V_k(z_1, z_2)^2\} = \sum_{l=1}^{\infty} \lambda_{kl}^{2}.$$
Assumption 1 is equivalent to condition (7) with t = 4. Similar assumptions are widely used
in literature. See, for example, conditions (18) and (19) of Theorem 1 in Gao et al. (2021),
Assumption D5 in the supplement of Zhu et al. (2020) and Assumption B.1 in Chakraborty
and Zhang (2021).
We remark here that, Assumption 1 does not depend on the ordering of the components of
x or y. Therefore, as long as Assumption 1 is satisfied with a particular permutation, it will
be satisfied with all possible permutations.
In the sequel, we explore under what circumstances Assumption 1 can be satisfied. To
be precise, Proposition 2 reveals that this assumption is satisfied with banded dependence
structure under very general designs; Proposition 3 ensures that it is satisfied with an auto-
regressive structure, covariance structures with bounded or spiked eigenvalues under Gaus-
sian designs.
We introduce some notation. Let πm be a permutation of {1, . . . , m} for a generic integer
m, and Pm be a collection of all possible permutations. By definition, πm ∈ Pm . Let πm (k)
be the kth element of πm .
PROPOSITION 2. Suppose there exist permutations $\pi_p \in \mathcal{P}_p$ and $\pi_q \in \mathcal{P}_q$, and measurable
functions $\{f_{1k}, k = 1, \ldots, p\}$ and $\{f_{2l}, l = 1, \ldots, q\}$, such that
$X_{\pi_p(k)} = f_{1\pi_p(k)}(\varepsilon_{1,k}, \ldots, \varepsilon_{1,k+m_1})$ and $Y_{\pi_q(l)} = f_{2\pi_q(l)}(\varepsilon_{2,l}, \ldots, \varepsilon_{2,l+m_2})$, where the $\varepsilon_{i,j_i}$'s are
independent random variables, for i = 1, 2, $j_1 = 1, \ldots, p + m_1$ and $j_2 = 1, \ldots, q + m_2$. Under this
circumstance, Assumption 1 is satisfied if $m_1 = o(p^{1/3})$ for divergent p, or $m_2 = o(q^{1/3})$ for
divergent q.
In general, as long as m1 and m2 are not very large, Assumption 1 holds true under mild
conditions. It is important to remark here that Assumption 1 itself imposes an implicit re-
striction between the sample size n and the dimensions p and q. In particular, if m1 = 0
or m2 = 0, which corresponds to the special case that the coordinates of x or those of y
are all independent, Assumption 1 takes a much easier form. To be precise, if the coordi-
nates of x are all independent, E{V1 (x1 , x2 )4 }/[nE 2 {V1 (x1 , x2 )2 }] = O{1/(np) + 1/n} and
E{V1 (x1 , x2 )V1 (x2 , x3 )V1 (x3 , x4 )V1 (x4 , x1 )}/E 2 {V1 (x1 , x2 )2 } = O(1/p), both converge to
0 as n and p diverge to infinity. Similar claims can be made for y in parallel. In other words, in
the particular case that the coordinates of x or those of y are all independent, Assumption 1
is always true as long as p or q diverges in an arbitrary way that is completely free of n.
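PROPOSITION 3. Suppose both x and y are normally distributed with mean zero,
$\mathrm{cov}(x, x^{T}) = \Sigma_1 = (\sigma_{1,k_1 k_2})$, and $\mathrm{cov}(y, y^{T}) = \Sigma_2 = (\sigma_{2,l_1 l_2})$. Then Assumption 1 is
satisfied if, as p → ∞,
$$\frac{\sum_{k_1,k_2=1}^{p} \big(\sum_{k_3=1}^{p} \sigma_{1,k_1 k_3} \sigma_{1,k_3 k_2}\big)^{2} + n^{-1} \sum_{k_1=1}^{p} \big(\sum_{k_2=1}^{p} \sigma_{1,k_1 k_2}^{2}\big)^{3}}{\big(\sum_{k_1,k_2=1}^{p} \sigma_{1,k_1 k_2}^{2}\big)^{2}}$$
and, as q → ∞,
$$\frac{\sum_{l_1,l_2=1}^{q} \big(\sum_{l_3=1}^{q} \sigma_{2,l_1 l_3} \sigma_{2,l_3 l_2}\big)^{2} + n^{-1} \sum_{l_1=1}^{q} \big(\sum_{l_2=1}^{q} \sigma_{2,l_1 l_2}^{2}\big)^{3}}{\big(\sum_{l_1,l_2=1}^{q} \sigma_{2,l_1 l_2}^{2}\big)^{2}}$$
converge to 0.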
The above conditions can be satisfied by a broad class of commonly seen covariance
structures, including the following three examples. Throughout, we only consider $\Sigma_1$. We remark
here that similar derivations apply to $\Sigma_2$ as well. The first example is the autoregressive
dependence structure, which has the form of $\Sigma_1 = (\rho_1^{|k_1 - k_2|})$, for $\rho_1 \in (-1, 1)$. Simple algebra
yields that
$$\sum_{k_1,k_2=1}^{p} \Big(\sum_{k_3=1}^{p} \sigma_{1,k_1 k_3} \sigma_{1,k_3 k_2}\Big)^{2} = \frac{p\{1 - \rho_1^{8} + 8\rho_1^{2}(1 - \rho_1^{4})\}}{(1 - \rho_1^{2})^{4}} \{1 + o(1)\}$$
and
$$\sum_{k_1=1}^{p} \Big(\sum_{k_2=1}^{p} \sigma_{1,k_1 k_2}^{2}\Big)^{3} = \frac{p\{1 - \rho_1^{12} + 3\rho_1^{2}(1 + \rho_1^{2})(1 - \rho_1^{6})\}}{(1 - \rho_1^{2})^{4}(1 + \rho_1^{2} + \rho_1^{4})} \{1 + o(1)\}.$$
In addition,
$$\sum_{k_1,k_2=1}^{p} \sigma_{1,k_1 k_2}^{2} = \frac{p(1 + \rho_1^{2})}{1 - \rho_1^{2}} \{1 + o(1)\}.$$
It can be clearly seen from the above displays that Assumption 1 holds for the autoregressive
covariance structure.
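The three autoregressive sums can be compared with their leading terms by direct computation for a moderate p. The sketch below uses plain Python lists and hypothetical function names; the choices p = 120, ρ1 = 0.5 and n = 50 are arbitrary illustrations, and the finite-p values differ from the leading terms by boundary effects of relative order 1/p.

```python
def ar_covariance(p, rho):
    """Autoregressive covariance matrix with entries rho^{|k1 - k2|}."""
    return [[rho ** abs(i - j) for j in range(p)] for i in range(p)]


def proposition3_sums(sigma):
    """Return the three sums appearing in the condition of Proposition 3."""
    # Entries of sigma @ sigma; sigma is symmetric, so row j serves as column j.
    quartic = 0.0
    for row_i in sigma:
        for row_j in sigma:
            entry = sum(a * b for a, b in zip(row_i, row_j))
            quartic += entry * entry
    row_squares = [sum(v * v for v in row) for row in sigma]
    cubic = sum(r ** 3 for r in row_squares)
    square = sum(row_squares)
    return quartic, cubic, square


def ar_leading_terms(p, rho):
    """Leading terms of the three sums for the autoregressive structure."""
    r2 = rho * rho
    quartic = p * (1 - rho ** 8 + 8 * r2 * (1 - rho ** 4)) / (1 - r2) ** 4
    cubic = (p * (1 - rho ** 12 + 3 * r2 * (1 + r2) * (1 - rho ** 6))
             / ((1 - r2) ** 4 * (1 + r2 + rho ** 4)))
    square = p * (1 + r2) / (1 - r2)
    return quartic, cubic, square


def proposition3_ratio(sigma, n):
    """Ratio required to vanish in Proposition 3 for sample size n."""
    quartic, cubic, square = proposition3_sums(sigma)
    return (quartic + cubic / n) / square ** 2


exact = proposition3_sums(ar_covariance(120, 0.5))
leading = ar_leading_terms(120, 0.5)
relative_gaps = [abs(e / l - 1.0) for e, l in zip(exact, leading)]
```

Evaluating `proposition3_ratio` for increasing p with n held fixed gives a rough numerical picture of how the ratio decays.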
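The second example is the covariance structure with all eigenvalues being bounded. To be
precise, we let $\lambda_{11} \le \cdots \le \lambda_{1p}$ be the eigenvalues of $\Sigma_1$, all of which are bounded from above.
A direct consequence is that the eigenvalues of $\Sigma_1^{2}$, $\Sigma_1^{4}$ and $\Sigma_1^{6}$ are all bounded. It can thus
be verified that
$$\sum_{k_1,k_2=1}^{p} \Big(\sum_{k_3=1}^{p} \sigma_{1,k_1 k_3} \sigma_{1,k_3 k_2}\Big)^{2} \le p\lambda_{1p}^{4}, \qquad \sum_{k_1=1}^{p} \Big(\sum_{k_2=1}^{p} \sigma_{1,k_1 k_2}^{2}\Big)^{3} \le p\lambda_{1p}^{6}, \qquad \text{and} \qquad \sum_{k_1,k_2=1}^{p} \sigma_{1,k_1 k_2}^{2} \ge p\lambda_{11}^{2}.$$
Assumption 1 holds trivially as long as the dimensions diverge.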
The third example is the covariance structure with spiked eigenvalues. To be precise, we
assume that the largest $b_1$ eigenvalues of $\Sigma_1$ are unbounded, and the remaining $(p - b_1)$
eigenvalues are bounded from above by a finite constant $M_1$. We further assume that
$(p - b_1) \to \infty$ and $(p - b_1)\lambda_{11}^{4} \to \infty$. One of the two quantities with regard to $\Sigma_1$ in Proposition 3
is bounded by
$$\frac{M_1^{4}(p - b_1) + b_1 \lambda_{1p}^{4} + n^{-1}\{M_1^{6}(p - b_1) + b_1 \lambda_{1p}^{6}\}}{(p - b_1)^{2} \lambda_{11}^{4} + 2(p - b_1) b_1 \lambda_{11}^{2} \lambda_{1,p-b_1+1}^{2}}.$$
It follows that sufficient conditions to ensure Assumption 1 are
$$\lambda_{1p} = O\{(p - b_1)^{1/3} \lambda_{11}^{2/3} b_1^{-1/6}\} \quad \text{or} \quad \lambda_{1p} = O\{(p - b_1)^{1/6} \lambda_{11}^{1/3} \lambda_{1,p-b_1+1}^{1/3}\},$$
where $b_1$ can be bounded or divergent, and $\lambda_{11}$ is allowed to shrink to zero.
THEOREM 1. Suppose x and y are continuous, and Assumption 1 holds. Under H0 in (1),
as max(p, q) → ∞ and n → ∞,
$$\kappa_h \{C(d,2)\}^{-1} \{C(n,2)\}^{1/2} \sum_{k=1}^{p} \sum_{l=1}^{q} U_h^{(kl)} \Big/ S \xrightarrow{D} N(0, 1),$$
where $\xrightarrow{D}$ stands for “convergence in distribution.”
The following theorem shows that $\hat S^2$ is an unbiased and ratio consistent estimate of $S^2$
under the null.
THEOREM 2. Under the conditions in Theorem 1, $\hat S^2 / S^2 \xrightarrow{P} E(\hat S^2 / S^2) = 1$, where $\xrightarrow{P}$
denotes “convergence in probability.” Accordingly, $\hat T_h \xrightarrow{D} N(0, 1)$.
Theorems 1 and 2 establish the asymptotic normality property of Th under H0 . The critical
value under the significance level α can be computed by 100(1 − α)% normal quantiles. An
important implication of Theorems 1 and 2 is that our proposed tests are applicable to
high-dimensional situations where at least one of the two dimensions is sufficiently large.
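The resulting decision rule needs only a standard normal quantile. The sketch below assumes a one-sided rejection region (large values of the normalized statistic), which is the form consistent with the power approximation in (12); the function names are hypothetical.

```python
from statistics import NormalDist


def normal_critical_value(alpha):
    """Upper 100(1 - alpha)% quantile of the standard normal distribution."""
    return NormalDist().inv_cdf(1.0 - alpha)


def reject_independence(normalized_statistic, alpha=0.05):
    """Reject H0 when the normalized statistic exceeds the normal quantile."""
    return normalized_statistic > normal_critical_value(alpha)


critical_5_percent = normal_critical_value(0.05)
```

No permutation step is involved, which is the computational advantage noted in contribution (a).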
Let zi = (xTi , yTi )T , for i = 1, . . . , n. Define V (z1 , z2 ) = V1 (x1 , x2 )V2 (y1 , y2 ). The follow-
ing theorem provides an explicit uniform bound on the error of normal approximation to the
null distribution of Th .
THEOREM 3. Under the conditions in Theorem 1, we have
$$\sup_{x \in \mathbb{R}} \big| P(\hat T_h \le x) - \Phi(x) \big| \le C \left[ \frac{E\{V(z_1, z_2) V(z_2, z_3) V(z_3, z_4) V(z_4, z_1)\} + n^{-1} E\{V(z_1, z_2)^4\}}{E^2\{V(z_1, z_2)^2\}} \right]^{1/5},$$
where C is a universal constant, which is completely independent of n, p and q, and $\Phi(\cdot)$ is
the standard normal distribution function.
To reveal how the rate of convergence depends on the sample size and dimensions, we
consider the situation in which all the coordinates of x and y are statistically independent. In
such a situation, we have
$$\sup_{x \in \mathbb{R}} \big| P(\hat T_h \le x) - \Phi(x) \big| \le C \{(pq)^{-1/5} + n^{-1/5}\},$$
which suggests that the normal approximation of our proposed test becomes more
and more accurate as n and pq diverge to infinity.
2.4. Asymptotic analysis under local alternatives. Define
$$\theta_h^{(kl)} = E\{h(z_{1,kl}, \ldots, z_{d,kl})\},$$
$$\zeta_h^{(c)} = \mathrm{var}\Big\{\sum_{k=1}^{p} \sum_{l=1}^{q} h_c(z_{1,kl}, \ldots, z_{c,kl})\Big\}, \quad \text{and}$$
$$M_h^{(c)} = \sum_{1 \le i_1 < \cdots < i_c \le n} G_h^{(c)}(z_{i_1}, \ldots, z_{i_c}),$$
for c = 1, . . . , d, where
$$G_h^{(c)}(z_1, \ldots, z_c) = \sum_{k=1}^{p} \sum_{l=1}^{q} g_h^{(c)}(z_{1,kl}, \ldots, z_{c,kl}).$$
To explore the power performance of our proposed test, we consider local alternatives
where some $X_k$'s and $Y_l$'s are dependent. When max(p, q) → ∞ and n → ∞, we assume that
the class of local alternatives satisfies
$$\zeta_h^{(c)} = o(n^{c-2} S_h^{2}), \quad c \in \{1, 3, \ldots, d\}, \tag{8}$$
$$E\{G_h^{(2)}(z_1, z_2) G_h^{(2)}(z_2, z_3) G_h^{(2)}(z_3, z_4) G_h^{(2)}(z_4, z_1)\} = o(S_h^{4}), \tag{9}$$
$$E\{G_h^{(2)}(z_1, z_2)^{4}\} = o(n S_h^{4}), \tag{10}$$
where
$$S_h^{2} = \mathrm{var}\Big\{\sum_{k=1}^{p} \sum_{l=1}^{q} h_2(z_{1,kl}, z_{2,kl})\Big\} = \kappa_h^{-2} S^{2}.$$
Suppose x and y are continuous, under H0 in (1). Then the conditions in (8)–(10) are satisfied
if Assumption 1 is true, and the test statistic Th is a degenerate U-statistic. Under the local
alternatives, the conditions in (8)–(10) prescribe a small difference from H0 . These condi-
tions can be viewed as an extension of the conditions from the linear case (Yamada, Hyodo
and Nishiyama (2017)) to the nonlinear case. Although this class of local alternatives is somewhat
abstract, it is still a sensible class because the rank-based indices, such as Hoeffding’s
D, Blum–Kiefer–Rosenblatt’s R and Bergsma–Dassios–Yanagimoto’s τ ∗ , can target a very
broad class of alternatives, and allow for arbitrarily nonlinear dependence.
To illustrate the local alternatives in high dimensions, we consider the mixture models
used by Shi et al. (2022) and Shi, Drton and Han (2022a). To be specific, we consider the
following $\epsilon^*$-contamination model:
$$(x, y) \sim F_{x,y} = (1 - \epsilon^*) F_{x,y}^{(0)} + \epsilon^* F_{x,y}^{(1)}, \tag{11}$$
where $F_{x,y}^{(0)}$ and $F_{x,y}^{(1)}$ are two distributions, and $0 < \epsilon^* < 1$. In particular, x and y are
dependent if $(x, y) \sim F_{x,y}^{(1)}$ and independent if $(x, y) \sim F_{x,y}^{(0)}$. We further restrict $F_{x,y}^{(0)}$ to satisfy
Assumption 1.
PROPOSITION 4. For an arbitrary distribution $F_{x,y}^{(1)}$ such that x and y are dependent,
there exists a small contamination ratio $\epsilon^*$, that depends on both $F_{x,y}^{(0)}$ and $F_{x,y}^{(1)}$, such that the
conditions in (8)–(10) are satisfied.
Condition (8) ensures that $C(d,2)\{C(n,2)\}^{-1} M_h^{(2)}$ is a leading term and
$C(d,c)\{C(n,c)\}^{-1} M_h^{(c)}$, for c = 1, 3, . . . , d, are asymptotically negligible. This observation, together with
conditions (9) and (10), is used under local alternatives to establish the asymptotic normality
of
$$\sum_{k=1}^{p} \sum_{l=1}^{q} U_h^{(kl)}.$$
THEOREM 4. Suppose $\{U_h^{(kl)} : 1 \le k \le p, 1 \le l \le q\}$ have a common kernel h that
satisfies conditions (8)–(10). As max(p, q) → ∞ and n → ∞,
$$\kappa_h \{C(d,2)\}^{-1} \{C(n,2)\}^{1/2} \sum_{k=1}^{p} \sum_{l=1}^{q} \big(U_h^{(kl)} - \theta_h^{(kl)}\big) \Big/ S \xrightarrow{D} N(0, 1).$$
It follows that the power function of $T_h$ can be approximated by
$$\Phi\Big(-z_\alpha + \kappa_h \{C(d,2)\}^{-1} \{C(n,2)\}^{1/2} \sum_{k=1}^{p} \sum_{l=1}^{q} \theta_h^{(kl)} \Big/ S\Big). \tag{12}$$
This allows us to compare our proposed test with existing important tests for (1), in terms of
their limiting efficiency.
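The power approximation in (12) can be evaluated directly once the aggregate signal and the scale S are given. In the sketch below, `theta_sum` stands for the double sum of the kernel expectations, `s` for S, and `kappa` for the adjustment factor of the chosen kernel; all argument names are hypothetical and the inputs would have to come from a specific alternative.

```python
from math import comb, sqrt
from statistics import NormalDist


def approximate_power(theta_sum, s, n, d, kappa, alpha=0.05):
    """Normal approximation to the power, following the form of (12)."""
    z_alpha = NormalDist().inv_cdf(1.0 - alpha)
    shift = kappa / comb(d, 2) * sqrt(comb(n, 2)) * theta_sum / s
    return NormalDist().cdf(-z_alpha + shift)


# With no signal the approximation returns the nominal level alpha.
power_under_null = approximate_power(0.0, 1.0, n=50, d=4, kappa=2.0 / 3.0)
# A positive aggregate signal (made-up value) increases the approximate power.
power_with_signal = approximate_power(0.5, 1.0, n=50, d=4, kappa=2.0 / 3.0)
```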
3. Asymptotic relative efficiency.
3.1. Two distance correlation/covariance based tests in high dimensions. To measure
the dependence between x and y, Székely, Rizzo and Bakirov (2007) introduced distance
covariance, which measures the L2 -distance between the joint characteristic function of x and
y and the product of marginal characteristic functions. To be precise, the distance covariance
is defined as
$$\mathrm{dCov}^{2}(x, y) = \int \frac{\big| E \exp(i\langle t, x \rangle + i\langle s, y \rangle) - E \exp(i\langle t, x \rangle) E \exp(i\langle s, y \rangle) \big|^{2}}{c_p c_q \|t\|^{1+p} \|s\|^{1+q}} \, dt \, ds,$$
where $c_p = \pi^{(1+p)/2} / \Gamma\{(1+p)/2\}$, $\langle \cdot, \cdot \rangle$ denotes inner product and $\|\cdot\|$ is the Frobenius
norm. Székely and Rizzo (2013) propose to estimate it with
$$\widehat{\mathrm{dCov}}^{2}(x, y) = \{n(n-3)\}^{-1} \sum_{i \ne j} \tilde A_{ij} \tilde B_{ij},$$
where
$$\tilde A_{ij} = A_{ij} - (n-2)^{-1} \sum_{j=1}^{n} A_{ij} - (n-2)^{-1} \sum_{i=1}^{n} A_{ij} + \{(n-1)(n-2)\}^{-1} \sum_{i,j=1}^{n} A_{ij},$$
$$\tilde B_{ij} = B_{ij} - (n-2)^{-1} \sum_{j=1}^{n} B_{ij} - (n-2)^{-1} \sum_{i=1}^{n} B_{ij} + \{(n-1)(n-2)\}^{-1} \sum_{i,j=1}^{n} B_{ij},$$
$A_{ij} = \|x_i - x_j\|$ and $B_{ij} = \|y_i - y_j\|$ for 1 ≤ i, j ≤ n.

Székely and Rizzo (2013) propose the following statistic to test independence:
$$T_{\mathrm{SZ}} = \{n(n-3)/2 - 1\}^{1/2} \, \frac{\widehat{\mathrm{dCor}}^{2}(x, y)}{\{1 - \widehat{\mathrm{dCor}}^{4}(x, y)\}^{1/2}},$$
where
$$\widehat{\mathrm{dCor}}^{2}(x, y) = \frac{\widehat{\mathrm{dCov}}^{2}(x, y)}{\{\widehat{\mathrm{dCov}}^{2}(x, x)\, \widehat{\mathrm{dCov}}^{2}(y, y)\}^{1/2}}.$$
Székely and Rizzo ((2013), Theorem 1) indicated that, for a fixed sample size n ≥ 4, if the
coordinates of x and y are all independent and identically distributed, $T_{\mathrm{SZ}}$ converges to the Student
t-distribution with n(n − 3)/2 − 1 degrees of freedom under H0 , as the dimensions diverge
to infinity.
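The bias-corrected distance covariance of Székely and Rizzo (2013) and the associated t-type statistic can be sketched in a few lines. The code below is a direct O(n²) illustration with hypothetical function names, assuming each observation is given as a tuple of coordinates and n ≥ 4; the toy data are made up for illustration.

```python
from math import dist, sqrt


def distance_matrix(points):
    """Pairwise Euclidean distances between observations (tuples of floats)."""
    return [[dist(a, b) for b in points] for a in points]


def u_center(a):
    """U-centred version of a symmetric distance matrix; diagonal set to 0."""
    n = len(a)
    row = [sum(r) for r in a]
    total = sum(row)
    centred = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(n):
            if i != j:
                centred[i][j] = (a[i][j] - row[i] / (n - 2) - row[j] / (n - 2)
                                 + total / ((n - 1) * (n - 2)))
    return centred


def dcov2(x, y):
    """Bias-corrected squared sample distance covariance."""
    n = len(x)
    a, b = u_center(distance_matrix(x)), u_center(distance_matrix(y))
    s = sum(a[i][j] * b[i][j] for i in range(n) for j in range(n) if i != j)
    return s / (n * (n - 3))


def dcor2(x, y):
    """Bias-corrected squared sample distance correlation."""
    return dcov2(x, y) / sqrt(dcov2(x, x) * dcov2(y, y))


def sz_statistic(x, y):
    """t-type statistic with n(n - 3)/2 - 1 degrees of freedom."""
    n = len(x)
    r = dcor2(x, y)
    return sqrt(n * (n - 3) / 2 - 1) * r / sqrt(1.0 - r * r)


x_toy = [(0.0, 1.0), (1.0, 0.5), (2.0, -1.0), (0.5, 0.3), (1.5, 2.0), (3.0, 0.1)]
y_toy = [(1.0,), (0.2,), (0.7,), (2.1,), (1.3,), (0.4,)]
t_toy = sz_statistic(x_toy, y_toy)
```

A useful sanity check is that the off-diagonal entries in each row of a U-centred matrix sum to zero, and that the bias-corrected distance correlation of a sample with itself equals one.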
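As an alternative to Székely and Rizzo (2013), Zhu et al. (2020) considered the marginal
sample distance covariance based test statistic,
$$T_{\mathrm{ZZYS}} = \{n(n-3)/2 - 1\}^{1/2} \, \frac{\widehat{\mathrm{mdCor}}^{2}(x, y)}{\{1 - \widehat{\mathrm{mdCor}}^{4}(x, y)\}^{1/2}},$$
where
$$\widehat{\mathrm{mdCor}}^{2}(x, y) = \frac{\widehat{\mathrm{mdCov}}^{2}(x, y)}{\{\widehat{\mathrm{mdCov}}^{2}(x, x)\, \widehat{\mathrm{mdCov}}^{2}(y, y)\}^{1/2}}, \qquad \widehat{\mathrm{mdCov}}^{2}(x, y) = \sum_{k=1}^{p} \sum_{l=1}^{q} \widehat{\mathrm{dCov}}^{2}(X_k, Y_l).$$
Zhu et al. (2020) showed that under H0 , both $T_{\mathrm{ZZYS}}$ and $T_{\mathrm{SZ}}$ converge to the standard normal
distribution as n ∧ p ∧ q → ∞.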
3.2. Connection with Pearson correlation. We establish the explicit relationships between D, R, τ∗ and Pearson’s correlation for bivariate normal random variables, which play a critical role in the subsequent power analysis.
THEOREM 5. Assume $(X_k, Y_l) \in \mathbb{R}^2$ is bivariate Gaussian with Pearson correlation $\rho_{kl}$, for $1 \le k \le p$ and $1 \le l \le q$. Then
(1) $E\{U_h^{(kl)}\} = M_h(\rho_{kl})$, where
−1 2
Mh(D) (ρkl ) = 4π 2 (arcsin ρkl )2 − arcsin(ρkl /2)
arcsin ρkl
−2 2 sin x
+π arcsin dx
0 2 cos 2x + 1
arcsin ρkl 1/2 !
2 2 cos 2x − 1
+ arcsin sin x dx
0 6 cos 2x + 3
arcsin ρkl
2 −1 sin x
− 2π arcsin dx
0 3
arcsin ρkl !
2 2 cos 2x + 3
+ arcsin sin x dx ,
0 2 cos 2x + 1
arcsin ρkl
2 −1 2 2 cos 2x + 3
Mh(R) (ρkl ) = 2π arcsin sin x dx
0 2 cos 2x + 1
arcsin ρkl !
2 sin x
− arcsin dx ,
0 2 cos 2x + 1
2
Mh(τ ∗ ) (ρkl ) = 3π −2 (arcsin ρkl )2 − arcsin(ρkl /2)
arcsin ρkl 1/2 !
−2 2 2 cos 2x − 1
+ 12π arcsin sin x dx
0 6 cos 2x + 3
arcsin ρkl
sin x
− 6π −2 arcsin dx
0 3
arcsin ρkl !
2 2 cos 2x + 3
− arcsin sin x dx ;
0 2 cos 2x + 1
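(2) $M_h(\rho_{kl}) = M_h(-\rho_{kl})$ and $M_h(\rho_{kl})/\rho_{kl}^2$ is nondecreasing in $|\rho_{kl}|$;
(3)
\[
\inf_{\rho_{kl} \neq 0} \frac{M_{h^{(D)}}(\rho_{kl})}{\rho_{kl}^2} = (12\pi^2)^{-1}, \qquad \sup_{\rho_{kl} \neq 0} \frac{M_{h^{(D)}}(\rho_{kl})}{\rho_{kl}^2} = 1/30,
\]
\[
\inf_{\rho_{kl} \neq 0} \frac{M_{h^{(R)}}(\rho_{kl})}{\rho_{kl}^2} = (12\pi^2)^{-1}, \qquad \sup_{\rho_{kl} \neq 0} \frac{M_{h^{(R)}}(\rho_{kl})}{\rho_{kl}^2} = 1/90,
\]
\[
\inf_{\rho_{kl} \neq 0} \frac{M_{h^{(\tau^*)}}(\rho_{kl})}{\rho_{kl}^2} = 3\pi^{-2}, \qquad \sup_{\rho_{kl} \neq 0} \frac{M_{h^{(\tau^*)}}(\rho_{kl})}{\rho_{kl}^2} = 2/3.
\]

TABLE 1
Ratios for different ρ (all numbers are multiplied by 100). Rows labelled M give the theoretical ratios; rows labelled M-hat give the estimated ratios.
|ρ|                  0.1    0.2    0.3    0.4    0.5    0.6    0.7    0.8    0.9    1.0
M_h(D)/ρ²           0.85   0.86   0.89   0.92   0.97   1.05   1.16   1.33   1.65   3.33
M-hat_h(D)/ρ²       0.83   0.86   0.88   0.92   0.97   1.05   1.15   1.33   1.65   3.33
M_h(R)/ρ²           0.85   0.85   0.86   0.87   0.89   0.91   0.93   0.97   1.02   1.11
M-hat_h(R)/ρ²       0.83   0.85   0.86   0.87   0.88   0.91   0.93   0.97   1.02   1.11
M_h(τ∗)/ρ²         30.48  30.75  31.22  31.92  32.91  34.29  36.26  39.22  44.34  66.67
M-hat_h(τ∗)/ρ²     29.97  30.58  31.14  31.87  32.88  34.27  36.25  39.21  44.34  66.67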
Heuristically, as the dependence between $X_k$ and $Y_l$ increases, $E\{U_h^{(kl)}\}$ may deviate from zero at the same rate as $\rho_{kl}^2$. Bearing this concept in mind, Drton, Han and Shi ((2020), Lemma 4.1) investigated the power of maximum-type tests using consistent rank correlation statistics from an asymptotic minimax perspective. Theorem 5 firms up this concept by establishing an elegant relation between $E\{U_h^{(kl)}\}$ and the Pearson correlation under the normal distribution, which is the key to comparing the consistent rank and distance covariance/correlation based tests from an asymptotic efficiency perspective.
To better understand Theorem 5, we consider a toy example. We take a bivariate normal random vector with mean zero and covariance matrix
\[
\begin{pmatrix} 1 & \rho \\ \rho & 1 \end{pmatrix}.
\]
We vary ρ from 0.1 to 1.0, and set n = 1000. Each experiment is repeated 1000 times. We summarize the ratios of the estimated means $\widehat{M}_h^{(D)}$, $\widehat{M}_h^{(R)}$ and $\widehat{M}_h^{(\tau^*)}$ to $\rho^2$ in Table 1, and compare them with the exact theoretical ratios.
From Table 1, we have the following three findings.
(i) The estimated means $\widehat{M}_h^{(D)}$, $\widehat{M}_h^{(R)}$ and $\widehat{M}_h^{(\tau^*)}$ are close to the values $M_{h^{(D)}}$, $M_{h^{(R)}}$ and $M_{h^{(\tau^*)}}$ in the first statement of Theorem 5.
(ii) As |ρ| increases, $\widehat{M}_h^{(D)}/\rho_{kl}^2$, $\widehat{M}_h^{(R)}/\rho_{kl}^2$ and $\widehat{M}_h^{(\tau^*)}/\rho_{kl}^2$ are all nondecreasing, which is consistent with the second statement of Theorem 5. We plot the relationship between the standardized $M_h$ and ρ in Figure 1.
(iii) When ρ = 0.1, $\widehat{M}_h^{(D)}/\rho_{kl}^2$ and $\widehat{M}_h^{(R)}/\rho_{kl}^2$ approximate the infimum $(12\pi^2)^{-1}$, and $\widehat{M}_h^{(\tau^*)}/\rho_{kl}^2$ is close to $3\pi^{-2}$. When ρ = 1, $\widehat{M}_h^{(D)}/\rho_{kl}^2$, $\widehat{M}_h^{(R)}/\rho_{kl}^2$ and $\widehat{M}_h^{(\tau^*)}/\rho_{kl}^2$ approach the respective suprema 1/30, 1/90 and 2/3. This coincides with the third statement of Theorem 5.
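The third finding can be checked with simple arithmetic. The sketch below evaluates the bounds in the third statement of Theorem 5 on the scale of Table 1 (multiplied by 100) and sets them beside the theoretical table entries at |ρ| = 0.1 and |ρ| = 1.0; the dictionary names are illustrative only.

```python
import math

# Infimum and supremum of M_h(rho)/rho^2 from Theorem 5 (3), multiplied by 100.
bounds = {
    "D": (100 / (12 * math.pi ** 2), 100 / 30),
    "R": (100 / (12 * math.pi ** 2), 100 / 90),
    "tau*": (300 / math.pi ** 2, 200 / 3),
}

# Theoretical entries of Table 1 at |rho| = 0.1 and |rho| = 1.0.
table1_ends = {
    "D": (0.85, 3.33),
    "R": (0.85, 1.11),
    "tau*": (30.48, 66.67),
}

if __name__ == "__main__":
    for name, (lower, upper) in bounds.items():
        at_small, at_one = table1_ends[name]
        print(f"{name:>4}: inf={lower:.4f} (Table 1 at 0.1: {at_small}), "
              f"sup={upper:.4f} (Table 1 at 1.0: {at_one})")
```

The lower bounds are roughly 0.844 for D and R and 30.40 for τ∗, slightly below the table entries at |ρ| = 0.1, while the upper bounds agree with the entries at |ρ| = 1.0 to two decimals.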
FIG. 1. The standardized $M_{h^{(D)}}$ (red dashed line), standardized $M_{h^{(R)}}$ (black dotted line), standardized $M_{h^{(\tau^*)}}$ (green dot-dash line) and $\rho^2$ (blue solid line) in the bivariate normal case.
3.3. Asymptotic relative efficiency. We now compare the powers of the tests built on $T_h$, SZ and ZZYS from the asymptotic efficiency perspective. According to the results from Section 2, the asymptotic power of the test statistic $T_h$ is essentially controlled by the signal to noise ratio (SNR)
\[
\mathrm{SNR}_{T_h} = \{\varsigma_h C(d, 2)\}^{-1} \{C(n, 2)\}^{1/2} \sum_{k=1}^{p} \sum_{l=1}^{q} \theta_h^{(kl)} \big/ S.
\]
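By comparison, Zhu et al. ((2020), Proposition 2.2.2) showed that the asymptotic powers of the test statistics SZ and ZZYS depend on
\[
\mathrm{SNR}_{\mathrm{SZ}} = \{C(n, 2)\}^{1/2} \Bigl\{ \sum_{k_1,k_2=1}^{p} \mathrm{cov}^2(X_{k_1}, X_{k_2}) \sum_{l_1,l_2=1}^{q} \mathrm{cov}^2(Y_{l_1}, Y_{l_2}) \Bigr\}^{-1/2} \times \sum_{k=1}^{p} \sum_{l=1}^{q} \mathrm{cov}^2(X_k, Y_l),
\]
\[
\mathrm{SNR}_{\mathrm{ZZYS}} = \{C(n, 2)\}^{1/2} \Bigl\{ \sum_{k_1,k_2=1}^{p} \mathrm{dCov}^2(X_{k_1}, X_{k_2}) \sum_{l_1,l_2=1}^{q} \mathrm{dCov}^2(Y_{l_1}, Y_{l_2}) \Bigr\}^{-1/2} \times \sum_{k=1}^{p} \sum_{l=1}^{q} \mathrm{dCov}^2(X_k, Y_l).
\]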
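Thus, the asymptotic relative efficiency (ARE) of $T_h$ with respect to SZ is
\[
\mathrm{ARE}(T_h, \mathrm{SZ}) = \{\varsigma_h C(d, 2)\}^{-1} \sum_{k=1}^{p} \sum_{l=1}^{q} \theta_h^{(kl)} \Bigl\{ \sum_{k=1}^{p} \sum_{l=1}^{q} \mathrm{cov}^2(X_k, Y_l) \Bigr\}^{-1} \times \Bigl\{ \sum_{k_1,k_2=1}^{p} \mathrm{cov}^2(X_{k_1}, X_{k_2}) \sum_{l_1,l_2=1}^{q} \mathrm{cov}^2(Y_{l_1}, Y_{l_2}) \Bigr\}^{1/2} \big/ S,
\]
and the ARE of $T_h$ with respect to ZZYS is
\[
\mathrm{ARE}(T_h, \mathrm{ZZYS}) = \{\varsigma_h C(d, 2)\}^{-1} \sum_{k=1}^{p} \sum_{l=1}^{q} \theta_h^{(kl)} \Bigl\{ \sum_{k=1}^{p} \sum_{l=1}^{q} \mathrm{dCov}^2(X_k, Y_l) \Bigr\}^{-1} \times \Bigl\{ \sum_{k_1,k_2=1}^{p} \mathrm{dCov}^2(X_{k_1}, X_{k_2}) \sum_{l_1,l_2=1}^{q} \mathrm{dCov}^2(Y_{l_1}, Y_{l_2}) \Bigr\}^{1/2} \big/ S.
\]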
Owing to the monotone transformation-invariant property, the proposed rank tests do not require moment conditions, and hence are applicable to generally distributed random vectors, including heavy-tailed ones. By contrast, the distance covariance/correlation based tests are valid only under moment conditions. Therefore, to appreciate the implication of the AREs, we consider Gaussian distributed random vectors. We expect that similar conclusions hold for other types of distributions.
To ease illustration, we assume that x and y have independent and identically distributed coordinates with $\mathrm{var}(X_k) = \sigma_1^{(k)}$, $k \in \{1, \ldots, p\}$, and $\mathrm{var}(Y_l) = \sigma_2^{(l)}$, $l \in \{1, \ldots, q\}$. Let $\Sigma_{12} \stackrel{\mathrm{def}}{=} \mathrm{cov}(x, y^{T}) = (\sigma_{kl}) \in \mathbb{R}^{p \times q}$ denote the covariance matrix. Let $I \subseteq \{(k, l) : 1 \le k \le p,\ 1 \le l \le q\}$ be the index set of the signals. For the alternative class, we consider Gaussian distributions with equal correlations, namely $\sigma_{kl} = \rho \{\sigma_1^{(k)} \sigma_2^{(l)}\}^{1/2}$ for $(k, l) \in I$ and $\sigma_{kl} = 0$ for $(k, l) \notin I$. From a minimax point of view, a similar class of alternatives was discussed by Yao, Zhang and Shao (2018) and Leung and Drton (2018). By Proposition 1, Theorem 5 and Székely, Rizzo and Bakirov ((2007), Theorem 7), $\mathrm{ARE}(T_h, \mathrm{SZ})$ and $\mathrm{ARE}(T_h, \mathrm{ZZYS})$ can be simplified to
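\[
\mathrm{ARE}(T_h, \mathrm{SZ}) = 15 (2\pi^2)^{-1} \{\#(I)\}^{1/2} \Bigl\{ \sum_{(k,l) \in I} \sigma_1^{(k)} \sigma_2^{(l)} \Bigr\}^{-1} \times \Bigl\{ \sum_{k=1}^{p} \sum_{l=1}^{q} \bigl(\sigma_1^{(k)}\bigr)^2 \bigl(\sigma_2^{(l)}\bigr)^2 \Bigr\}^{1/2} \{1 + o(1)\}, \tag{13}
\]
\[
\mathrm{ARE}(T_h, \mathrm{ZZYS}) = 30 (1 + \pi/3 - 3^{1/2}) \pi^{-2} \{\#(I)\}^{1/2} \Bigl\{ \sum_{(k,l) \in I} \sigma_1^{(k)} \sigma_2^{(l)} \Bigr\}^{-1} \times \Bigl\{ \sum_{k=1}^{p} \sum_{l=1}^{q} \bigl(\sigma_1^{(k)}\bigr)^2 \bigl(\sigma_2^{(l)}\bigr)^2 \Bigr\}^{1/2} \{1 + o(1)\}, \tag{14}
\]
as $\rho \to 0$, where $\#(I)$ stands for the cardinality of $I$.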
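Homoscedastic case. We assume without loss of generality that $\sigma_1^{(1)} = \cdots = \sigma_1^{(p)} = \sigma_2^{(1)} = \cdots = \sigma_2^{(q)} = 1$. Because $(pq)/\#(I) \ge 1$, some simple algebra gives
\[
\lim_{\max(p,q) \to \infty} \mathrm{ARE}(T_h, \mathrm{SZ}) = 15 (2\pi^2)^{-1} \{pq/\#(I)\}^{1/2} \ge 15 (2\pi^2)^{-1} \approx 0.7599089, \tag{15}
\]
\[
\lim_{\max(p,q) \to \infty} \mathrm{ARE}(T_h, \mathrm{ZZYS}) = 30 (1 + \pi/3 - 3^{1/2}) \pi^{-2} \{pq/\#(I)\}^{1/2} \ge 30 (1 + \pi/3 - 3^{1/2}) \pi^{-2} \approx 0.9579312, \tag{16}
\]
for $h \in \{h^{(D)}, h^{(R)}, h^{(\tau^*)}\}$.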
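The two constants in (15) and (16), and the effect of signal sparsity on the limiting AREs, can be reproduced numerically. This is a small sketch; the function name `homoscedastic_are` and the argument `n_signals` (standing for the cardinality of I) are hypothetical.

```python
import math

# Leading constants of the limiting AREs in (15) and (16).
C_SZ = 15 / (2 * math.pi ** 2)
C_ZZYS = 30 * (1 + math.pi / 3 - math.sqrt(3)) / math.pi ** 2


def homoscedastic_are(p: int, q: int, n_signals: int):
    """Limiting AREs (vs SZ, vs ZZYS) when all variances equal one.

    `n_signals` is the number of dependent coordinate pairs, 1 <= n_signals <= p*q.
    """
    if not 1 <= n_signals <= p * q:
        raise ValueError("n_signals must lie between 1 and p*q")
    sparsity = math.sqrt(p * q / n_signals)
    return C_SZ * sparsity, C_ZZYS * sparsity


if __name__ == "__main__":
    print(f"C_SZ   = {C_SZ:.7f}")
    print(f"C_ZZYS = {C_ZZYS:.7f}")
    for n_signals in (2500, 625, 25):
        print(n_signals, homoscedastic_are(50, 50, n_signals))
```

When every pair carries signal, the AREs equal the constants themselves; both exceed one as soon as pq divided by the number of signals is larger than about 1.73.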
The limiting ARE values in (15) and (16) suggest that the proposed tests in high dimensions suffer little power loss under the homoscedastic scenario. Furthermore, the AREs are inversely proportional to $\{\#(I)\}^{1/2}$. That is, compared to SZ and ZZYS, our proposed tests have an advantage when the signals in I are relatively sparse. We next reveal the efficiency gain of the proposed tests in the high-dimensional heteroscedastic case. For simplicity, we assume the number of dependency signals in I is fixed.
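Heteroscedastic case. When p or q diverges at the rate such that
\[
p^{-1} \sum_{k=1}^{p} \bigl(\sigma_1^{(k)}\bigr)^2 \to \infty, \tag{17}
\]
\[
q^{-1} \sum_{l=1}^{q} \bigl(\sigma_2^{(l)}\bigr)^2 \to \infty, \tag{18}
\]
it is straightforward to verify that
\[
\lim_{\max(p,q) \to \infty} \mathrm{ARE}(T_h, \mathrm{SZ}) = \infty, \tag{19}
\]
\[
\lim_{\max(p,q) \to \infty} \mathrm{ARE}(T_h, \mathrm{ZZYS}) = \infty. \tag{20}
\]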
If one of the conditions (17) and (18) holds as $\max(p, q) \to \infty$, the ARE results in (19) and (20) guarantee that the test based on $T_h$ is substantially more powerful than those based on distance covariance/correlation (Székely and Rizzo (2013), Zhu et al. (2020)). The conditions in (17) and (18) hold trivially when the $\sigma_1^{(k)}$'s and $\sigma_2^{(l)}$'s have different scales. To appreciate this, let us consider an explicit scenario: for $1 \le k \le p$ and $1 \le l \le q$, there are $\delta_1 > 0$, $\delta_2 > 0$, which do not depend on the dimensions p or q, such that
\[
\sigma_1^{(k)} \asymp k^{\delta_1}, \qquad \sigma_2^{(l)} \asymp l^{\delta_2},
\]
where, for two sequences $\{a_k\}$ and $\{b_k\}$, we write $a_k \asymp b_k$ if there exists a positive constant C such that $C^{-1} \le \liminf_k a_k/b_k \le \limsup_k a_k/b_k \le C$. It is easy to verify that as $p \to \infty$ and $q \to \infty$,

p
q

(k) 2 (l) 2
σ1 pp 2δ1
→ ∞, and σ2 p q 2δ2 → ∞.
k=1 l=1
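Condition (17) can be illustrated numerically in this explicit scenario. The sketch below makes the simplifying assumption that $\sigma_1^{(k)} = k^{\delta}$ holds exactly, whereas the text only requires the two sides to be of the same order; the function name `mean_squared_scale` is hypothetical.

```python
def mean_squared_scale(p: int, delta: float) -> float:
    """Left-hand side of (17), p^{-1} * sum_k (sigma_1^{(k)})^2,
    under the assumed exact form sigma_1^{(k)} = k ** delta."""
    return sum(k ** (2 * delta) for k in range(1, p + 1)) / p


if __name__ == "__main__":
    for delta in (0.25, 0.5, 1.0):
        values = [mean_squared_scale(p, delta) for p in (10, 100, 1000)]
        print(f"delta={delta}: " + ", ".join(f"{v:.2f}" for v in values))
```

For any positive δ the averaged quantity keeps growing with p, so the left-hand side of (17) diverges; the same computation with q and $\delta_2$ applies to (18).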
According to the aforementioned analysis, we can conclude that even for light-tailed and high-dimensional normal observations, the tests built on these monotone transformation-invariant correlations have substantially higher power than the distance covariance/correlation based tests (Székely and Rizzo (2013), Zhu et al. (2020)) in the heteroscedastic scenario.
The conditions in (17) and (18) correspond to an extreme case that allows a few variance components to diverge to infinity. This extreme case helps us illustrate the advantages of our proposed test over existing tests in the heteroscedastic scenario. However, we emphasize that, even when all variance components are uniformly bounded, our proposed test can still be more powerful than existing ones in the heteroscedastic scenario. We elaborate on this phenomenon below.
We consider a general case in which all variance components are uniformly bounded but not necessarily the same. To be precise, there exist two positive constants $c_1$ and $c_2$, such that $\sigma_1^{(k)} = \sigma_2^{(l)} = c_1$ for $(k, l) \in I$ and $\sigma_1^{(k)} = \sigma_2^{(l)} = c_2$ for $(k, l) \notin I$. This implies immediately that
\[
\sum_{k=1}^{p} \bigl(\sigma_1^{(k)}\bigr)^2 = O(p), \quad \text{and} \quad \sum_{l=1}^{q} \bigl(\sigma_2^{(l)}\bigr)^2 = O(q).
\]
In this case, the asymptotic relative efficiencies have the form $\mathrm{ARE}(T_h, \mathrm{SZ}) = 15(2\pi^2)^{-1} \{1 + (pq/\#(I) - 1)(c_2/c_1)^4\}^{1/2} \{1 + o(1)\}$ and $\mathrm{ARE}(T_h, \mathrm{ZZYS}) = 30(1 + \pi/3 - 3^{1/2})\pi^{-2} \{1 + (pq/\#(I) - 1)(c_2/c_1)^4\}^{1/2} \{1 + o(1)\}$. It can be verified that, as long as $\{1 + (pq/\#(I) - 1)(c_2/c_1)^4\}^{1/2} \ge \pi^2 \max(2/15, 1/\{30(1 + \pi/3 - 3^{1/2})\})$, $\mathrm{ARE}(T_h, \mathrm{SZ})$ and $\mathrm{ARE}(T_h, \mathrm{ZZYS})$ are simultaneously greater than or equal to 1. In other words, our proposed test can be more powerful than the distance covariance/correlation based tests as long as the variance components exhibit a certain level of heterogeneity. Moreover, the AREs increase with $c_2/c_1$, which indicates that the proposed tests are more advantageous when the signal to overall noise ratio becomes smaller.
FIG. 2. The adjusted $c_{n,p,q}^{-1} \mathrm{SNR}_{T_{h^{(D)}}}$ (red dashed line), adjusted $c_{n,p,q}^{-1} \mathrm{SNR}_{T_{h^{(R)}}}$ (black dotted line) and adjusted $c_{n,p,q}^{-1} \mathrm{SNR}_{T_{h^{(\tau^*)}}}$ (green dot-dash line) under the Gaussian alternatives with equal correlations.
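The bounded-variance ARE expressions and the sufficient condition for both AREs to reach one can be evaluated directly. The following sketch ignores the $\{1 + o(1)\}$ factor; the names `bounded_variance_are`, `THRESHOLD` and `n_signals` (the cardinality of I) are hypothetical.

```python
import math

# Leading constants of ARE(T_h, SZ) and ARE(T_h, ZZYS).
C_SZ = 15 / (2 * math.pi ** 2)
C_ZZYS = 30 * (1 + math.pi / 3 - math.sqrt(3)) / math.pi ** 2

# Sufficient level of the heterogeneity factor for both AREs to be at least one.
THRESHOLD = math.pi ** 2 * max(2 / 15, 1 / (30 * (1 + math.pi / 3 - math.sqrt(3))))


def heterogeneity_factor(p, q, n_signals, c1, c2):
    """{1 + (pq/#(I) - 1)(c2/c1)^4}^{1/2} for signal variance c1 and noise variance c2."""
    return math.sqrt(1 + (p * q / n_signals - 1) * (c2 / c1) ** 4)


def bounded_variance_are(p, q, n_signals, c1, c2):
    """Leading terms of the AREs (vs SZ, vs ZZYS), with the 1 + o(1) factor dropped."""
    factor = heterogeneity_factor(p, q, n_signals, c1, c2)
    return C_SZ * factor, C_ZZYS * factor


if __name__ == "__main__":
    print(f"threshold on the heterogeneity factor: {THRESHOLD:.4f}")
    for c2 in (0.5, 1.0, 2.0):
        print(c2, bounded_variance_are(50, 50, 16, 1.0, c2))
```

The threshold equals the reciprocal of the smaller constant, about 1.316, so the condition is met whenever the sparsity-weighted variance ratio pushes the heterogeneity factor past that value.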
Next, we compare the asymptotic efficiency among the proposed three tests under (17) and (18). Under the Gaussian alternatives with equal correlations, the signal to noise ratio $\mathrm{SNR}_{T_h}$ reduces to
\[
\mathrm{SNR}_{T_h} = c_{n,p,q} \{\varsigma_h C(d, 2)\}^{-1} M_h(\rho),
\]
where $c_{n,p,q} = \{C(n, 2)\}^{1/2} \#(I)/S$ and $\#(I)$ denotes the cardinality of the set $I$. The factor $c_{n,p,q}$ does not depend on the kernel $h \in \{h^{(D)}, h^{(R)}, h^{(\tau^*)}\}$, which brings convenience for comparison. We plot the adjusted $c_{n,p,q}^{-1} \mathrm{SNR}_{T_h}$ with respect to ρ in Figure 2. It can be seen that the tests based on D, τ∗ and R are sorted in descending order of asymptotic efficiency, regardless of whether x and y are heteroscedastic.
In the bivariate case, Mudholkar and Wilding (2003) compared the power performance of Hoeffding’s D and Blum–Kiefer–Rosenblatt’s R tests through simulations for bivariate normal models. In particular, their empirical observation is that, with a small number of observations, Hoeffding’s D test is advantageous over the Blum–Kiefer–Rosenblatt R test for negative dependence alternatives, and yet disadvantageous for positive ones. However, they did not explore the theoretical properties of the power performance of these two tests, even for bivariate normal models. In the present context, we explore the power properties of the three tests in high dimensions. From Figure 2, it can be seen that Hoeffding’s D test is the most powerful against the Gaussian equicorrelation alternatives, for both negative and positive correlations. Thorough investigations for nonnormal distributions and other dependence structures, particularly in high dimensions, are highly desirable.
4. Numerical studies.
4.1. Simulations. We first use simulations to examine the performances of our proposed tests based on Hoeffding’s D, Blum–Kiefer–Rosenblatt’s R and Bergsma–Dassios–Yanagimoto’s τ∗. We compare the performance of these tests with that of the tests based on the bias-corrected distance correlation (Székely and Rizzo (2013)) and the aggregated distance covariance (Zhu et al. (2020)). We implement these two tests through the R package energy. R code for our proposed methods can be downloaded at https://github.com/Yeqing-TJ/Rank-based-test-in-high-dimension. We fix the sample size n = 100, and set the dimensions p = q = 50, 150 and 300. Each experiment is repeated 500 times at the nominal level 0.05. The parameter δ is used to control the degree of heteroscedasticity (Li et al. (2023)). We consider three different structures for the covariance matrix $\Sigma = (\sigma_{st})$.
Independent structure: $\sigma_{ss} = s^{\delta}$ and $\sigma_{st} = 0$ for $s \neq t$.
Autoregressive structure: $\sigma_{ss} = s^{\delta}$ and $\sigma_{st} = 0.5^{|s-t|} \sqrt{\sigma_{ss}\sigma_{tt}}$ for $s \neq t$.
Banded structure: $\sigma_{ss} = s^{\delta}$ and $\sigma_{st} = 0.3 \sqrt{\sigma_{ss}\sigma_{tt}}$ for $s \neq t$ and $|s - t| \le 3$, $\sigma_{st} = 0$ for $s \neq t$ and $|s - t| > 3$.
FIG. 3. The density curves of the asymptotic null distribution of test statistics based on D (red dashed line), R (black dotted line) and τ∗ (green dot-dash line) and the standard normal density (blue solid line) under the AR structure of covariances.
We generate x and y independently from the multivariate normal distribution $N(0, \Sigma)$ with these three covariance structures. To evaluate the accuracy of the normal approximation, we plot the kernel density estimates of the test statistics as well as the standard normal density in Figure 3. It is clear that the empirical distributions of the test statistics are reasonably close to N(0, 1). We summarize the empirical sizes of all aforementioned tests under different covariance structures and dimensions in Table 2 and in Tables S.1 and S.2 of the Supplementary Material. Owing to space limits, we report the results for the independent and banded structures in the Supplementary Material. Owing to the scale-invariant property, for a given n and p, the empirical sizes of the tests based on D, R and τ∗ stay stable under different values of δ. Across the three covariance structures, all the tests control the type-I error rates well.
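The three covariance structures used in these simulations can be written down compactly. The sketch below is a plain-Python illustration, not the authors' R code; the function name `covariance_matrix` and the structure labels `"independent"`, `"ar"` and `"banded"` are hypothetical.

```python
import math


def covariance_matrix(p: int, delta: float, structure: str):
    """Return Sigma = (sigma_st) as a list of lists, with sigma_ss = s ** delta.

    structure: "independent", "ar" (autoregressive) or "banded".
    Rows and columns are indexed s, t = 1, ..., p as in the text.
    """
    if structure not in ("independent", "ar", "banded"):
        raise ValueError(f"unknown structure: {structure!r}")
    diag = [float(s) ** delta for s in range(1, p + 1)]
    sigma = [[0.0] * p for _ in range(p)]
    for s in range(p):
        for t in range(p):
            lag = abs(s - t)
            scale = math.sqrt(diag[s] * diag[t])
            if lag == 0:
                sigma[s][t] = diag[s]
            elif structure == "ar":
                sigma[s][t] = 0.5 ** lag * scale
            elif structure == "banded" and lag <= 3:
                sigma[s][t] = 0.3 * scale
            # otherwise the entry stays zero
    return sigma


if __name__ == "__main__":
    for name in ("independent", "ar", "banded"):
        print(name)
        for row in covariance_matrix(5, 0.5, name):
            print("  " + " ".join(f"{v:6.3f}" for v in row))
```

Setting δ = 0 gives unit variances, while larger δ makes the variances grow with the coordinate index, which is how the degree of heteroscedasticity is controlled in the experiments.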
Next, we compare the empirical powers of the above tests in detecting linear and nonlinear dependence. We consider the following two models.
Linear model: We draw x from $N(0, \Sigma)$ with three different covariance structures. We generate $Y_j = 0.5X_j + \varepsilon_j$ for $j = 1, \ldots, [p/3]$, where the $\varepsilon_j$'s are generated from $N(0, j^2)$, and [·] denotes the integer part of a given number. The remaining $(q - [p/3])$ components of y are independent of x and follow $N(0, \Sigma)$.
Nonlinear model: We draw x from $N(0, \Sigma)$ with three different covariance structures. We generate $Y_j = X_j^2 + \varepsilon_j$ for $j = 1, \ldots, [p/3]$, where the $\varepsilon_j$'s are generated from $N(0, j^2)$. We generate an intermediate vector $\tilde{x} = (\tilde{X}_1, \ldots, \tilde{X}_{q-[p/3]})^{T}$ independently from $N(0, \Sigma)$. The remaining components of y are given by $Y_j = \tilde{X}_j^2$ for $j = [p/3] + 1, \ldots, q$.
We report the empirical powers for the linear model in Tables 3, S.3 and S.4 and for the nonlinear model in Tables 4, S.5 and S.6, respectively. In the linear model, δ = 0 corresponds to the homoscedastic case where each component of x has the same variance. The empirical powers of all five tests decrease as the dimensions p and q grow. Compared to the distance covariance/correlation based tests of Székely and Rizzo (2013) and Zhu et al. (2020), our proposed tests suffer little power loss, which is in line with the theoretical analysis in Section 3. When δ is nonzero, due to the monotone transformation-invariant property, the proposed three tests exhibit significant advantages. The empirical powers of the tests based on D, R and τ∗ share a similar trend, and gradually approach one as δ increases. The power of the D based test is slightly better than those of the τ∗ and R based ones in most settings. Under the heteroscedastic scenario, the rank tests significantly outperform the tests of Székely and Rizzo (2013) and Zhu et al. (2020). In the nonlinear model, it is clear that $\mathrm{cov}(X_k, Y_l) = 0$ for $k = 1, \ldots, p$ and $l = 1, \ldots, q$. Because the test of Székely and Rizzo (2013) cannot detect the nonlinearly dependent but componentwise uncorrelated case, its empirical power is very close to the nominal level. By contrast, our proposed tests show good capability in detecting nonlinear dependence.

TABLE 2
The empirical size under the AR structure of covariances. The random seeds are fixed throughout and all rank tests are monotone transformation-invariant. Therefore, the empirical sizes of all rank tests are unchanged, though δ varies from 0.00 to 1.00
p     δ      D      R      τ∗     ZZYS   SZ
50    0.00   0.052  0.054  0.050  0.048  0.048
      0.25   0.052  0.054  0.050  0.044  0.042
      0.50   0.052  0.054  0.050  0.042  0.038
      0.75   0.052  0.054  0.050  0.048  0.050
      1.00   0.052  0.054  0.050  0.052  0.052
150   0.00   0.040  0.050  0.046  0.048  0.056
      0.25   0.040  0.050  0.046  0.048  0.060
      0.50   0.040  0.050  0.046  0.048  0.052
      0.75   0.040  0.050  0.046  0.046  0.052
      1.00   0.040  0.050  0.046  0.052  0.060
300   0.00   0.054  0.054  0.054  0.044  0.036
      0.25   0.054  0.054  0.054  0.048  0.032
      0.50   0.054  0.054  0.054  0.048  0.042
      0.75   0.054  0.054  0.054  0.052  0.046
      1.00   0.054  0.054  0.054  0.052  0.056
4.2. An application. We consider a gene expression microarray data set, which was col-
lected from 120 male rats by Scheetz et al. (2006), to demonstrate the practical usefulness
of our proposed tests. It contains 18,976 gene probe sets, which exhibit sufficient expression
signals. The gene TRIM32 at probe 1389163_at is the targeted response, which was validated
to cause the Bardet–Biedl syndrome (Chiang et al. (2006)). We use 500 probe sets that have
the largest variances as the covariates. The empirical observations of previous studies reveal
that there exists nonlinear relationship between the response and the covariates. To verify this
conclusion, we apply the proposed tests to examine independence between the response and
the covariates, with comparison to Székely and Rizzo (2013)’s test and Zhu et al. (2020)’s
test.
To evaluate the power performances of the five tests, we randomly select subsets of size n = 45, 60, 75 and 90 from the whole data set to calculate the test statistics. Based on 200 repetitions, the empirical powers of the five tests are summarized in Table 5. We can see that the three tests based on D, R and τ∗ are more powerful than the tests of Székely and Rizzo (2013) and Zhu et al. (2020). This may be owing to the highly skewed distribution of the response, as shown in Figure 2 of Zhou and Zhu (2018).
TABLE 3
The empirical power under the AR structure of covariances for linear model
p     δ      D      R      τ∗     ZZYS   SZ
50    0.00   0.288  0.260  0.274  0.336  0.362
      0.25   0.380  0.366  0.368  0.368  0.434
      0.50   0.594  0.570  0.578  0.504  0.522
      0.75   0.870  0.848  0.858  0.588  0.456
      1.00   0.986  0.978  0.982  0.624  0.304
150   0.00   0.134  0.144  0.148  0.160  0.136
      0.25   0.200  0.198  0.202  0.242  0.210
      0.50   0.380  0.376  0.374  0.346  0.318
      0.75   0.736  0.724  0.732  0.508  0.452
      1.00   0.992  0.992  0.992  0.656  0.334
300   0.00   0.096  0.102  0.102  0.094  0.090
      0.25   0.106  0.110  0.116  0.106  0.120
      0.50   0.230  0.230  0.226  0.212  0.224
      0.75   0.640  0.626  0.628  0.460  0.368
      1.00   0.996  0.994  0.994  0.688  0.312

TABLE 4
The empirical power under the AR structure of covariances for nonlinear model
p     δ      D      R      τ∗     ZZYS   SZ
50    0.00   0.206  0.176  0.186  0.370  0.080
      0.25   0.354  0.298  0.312  0.536  0.084
      0.50   0.670  0.548  0.594  0.626  0.068
      0.75   0.978  0.824  0.902  0.580  0.060
      1.00   1.000  0.938  0.982  0.454  0.046
150   0.00   0.126  0.108  0.112  0.172  0.076
      0.25   0.254  0.228  0.238  0.426  0.086
      0.50   0.732  0.586  0.652  0.706  0.054
      0.75   0.998  0.920  0.976  0.690  0.072
      1.00   1.000  0.990  1.000  0.532  0.076
300   0.00   0.076  0.078  0.072  0.112  0.050
      0.25   0.188  0.162  0.170  0.284  0.062
      0.50   0.720  0.556  0.644  0.670  0.054
      0.75   1.000  0.954  0.986  0.726  0.040
      1.00   1.000  0.996  1.000  0.588  0.064

TABLE 5
The empirical power of five tests
Selected samples   D      R      τ∗     ZZYS   SZ
45                 0.485  0.495  0.495  0.395  0.415
60                 0.635  0.655  0.650  0.535  0.540
75                 0.835  0.835  0.830  0.750  0.760
90                 0.965  0.970  0.970  0.920  0.940

5. Discussion. In this paper, we develop independence tests between high-dimensional random vectors through three correlations: Hoeffding’s D, Blum–Kiefer–Rosenblatt’s R and Bergsma–Dassios–Yanagimoto’s τ∗. We obtain the limiting null distributions for our proposed tests and provide explicit convergence rates when at least one dimension of the two vectors diverges to infinity. We compare the asymptotic relative efficiency of the proposed tests with two distance covariance/correlation based tests in high-dimensional settings. Similar techniques can be readily applied to test mutual independence and to compare power with respect to the distance covariance/correlation based tests. Moreover, our asymptotic theory can be extended to test for conditional independence. These will be investigated in our future research.
Acknowledgments. The authors thank the Editor, the Associate Editor and the anony-
mous reviewers for their constructive comments, which have led to a dramatic improvement
of the earlier version of this article. The authors are also very grateful to Dr. Hongjian Shi for
providing us with the R codes of Shi et al. (2022). The authors have contributed equally to
this paper. All correspondence should be addressed to Liping Zhu (the corresponding author)
at zhu.liping@ruc.edu.cn.
Funding. Yeqing Zhou is supported by National Natural Science Foundation of China

(12001405), Natural Science Foundation of Shanghai (23ZR1469000) and Fundamental Re-
search Funds for the Central Universities (22120210557).
Kai Xu is supported by National Natural Science Foundation of China (12271005,
11901006), Natural Science Foundation of Anhui Province (2308085Y06, 1908085QA06)
and Young Scholars Program of Anhui Province (2023).
Liping Zhu is supported by Renmin University of China (22XNA026), National Natural
Science Foundation of China (12225113, 12171477).
Runze Li’s research is partially supported by National Institute of Health (NIH) grants
R01AI170249 and R01AI136664. The content is solely the responsibility of the authors and
does not necessarily represent the official views of the NIH.
SUPPLEMENTARY MATERIAL
Supplement to “Rank-based indices for testing independence between two high-
dimensional vectors” (DOI: 10.1214/23-AOS2339SUPP; .pdf). The supplement (Zhou et al.
(2024)) contains additional simulation results and technical proofs of all theorems and propo-
sitions in the main context.
REFERENCES
A LBERT, M., L AURENT, B., M ARREL , A. and M EYNAOUI , A. (2022). Adaptive test of independence based on
HSIC measures. Ann. Statist. 50 858–879. MR4404921 https://doi.org/10.1214/21-aos2129
A NDERSON , T. W. (2003). An Introduction to Multivariate Statistical Analysis, 3rd ed. Wiley Series in Probability
and Statistics. Wiley-Interscience, New York. MR1990662
BAO , Z. (2019). Tracy–Widom limit for Kendall’s tau. Ann. Statist. 47 3504–3532. MR4025750
https://doi.org/10.1214/18-AOS1786
BAO , Z., L IN , L.-C., PAN , G. and Z HOU , W. (2015). Spectral statistics of large dimensional Spearman’s
rank correlation matrix and its application. Ann. Statist. 43 2588–2623. MR3405605 https://doi.org/10.1214/
15-AOS1353
B ERGSMA , W. and DASSIOS , A. (2014). A consistent test of independence based on a sign covariance related to
Kendall’s tau. Bernoulli 20 1006–1028. MR3178526 https://doi.org/10.3150/13-BEJ514
B ERRETT, T. B. and S AMWORTH , R. J. (2019). Nonparametric independence testing via mutual information.
Biometrika 106 547–566. MR3992389 https://doi.org/10.1093/biomet/asz024
B LUM , J. R., K IEFER , J. and ROSENBLATT, M. (1961). Distribution free tests of independence based on the sam-
ple distribution function. Ann. Math. Stat. 32 485–498. MR0125690 https://doi.org/10.1214/aoms/1177705055
B ODNAR , T., D ETTE , H. and PAROLYA , N. (2019). Testing for independence of large dimensional vectors. Ann.
Statist. 47 2977–3008. MR3988779 https://doi.org/10.1214/18-AOS1771
C HAKRABORTY, S. and Z HANG , X. (2021). A new framework for distance and kernel-based metrics in high
dimensions. Electron. J. Stat. 15 5455–5522. MR4352549 https://doi.org/10.1214/21-ejs1889
C HATTERJEE , S. (2021). A new coefficient of correlation. J. Amer. Statist. Assoc. 116 2009–2022. MR4353729
https://doi.org/10.1080/01621459.2020.1758115
C HIANG , A. P., B ECK , J. S., Y EN , H., TAYEH , M. K., S CHEETZ , T. E. S WIDERSKI , R. E. N ISHIMURA , D. Y.
B RAUN , T. A., K IM , K.-Y. A. et al. (2006). Homozygosity mapping with SNP arrays identifies TRIM32, an
E3 ubiquitin ligase, as a bardet–biedl syndrome gene (BBS11). Proc. Natl. Acad. Sci. USA 103 6287–6292.
D EB , N. and S EN , B. (2023). Multivariate rank-based distribution-free nonparametric testing using measure trans-
portation. J. Amer. Statist. Assoc. 118 192–207. MR4571116 https://doi.org/10.1080/01621459.2021.1923508
D RTON , M., H AN , F. and S HI , H. (2020). High-dimensional consistent independence testing with maxima of
rank correlations. Ann. Statist. 48 3206–3227. MR4185806 https://doi.org/10.1214/19-AOS1926
G AO , L., FAN , Y., LV, J. and S HAO , Q.-M. (2021). Asymptotic distributions of high-dimensional distance cor-
relation inference. Ann. Statist. 49 1999–2020. MR4319239 https://doi.org/10.1214/20-aos2024
G ORSKY, S. and M A , L. (2022). Multi-scale Fisher’s independence test for multivariate dependence. Biometrika
109 569–587. MR4472834 https://doi.org/10.1093/biomet/asac013
G RETTON , A., F UKUMIZU , K., T EO , C., S ONG , L., S CHÖLKOPF, B. and S MOLA , A. (2008). A kernel statistical
test of independence. In Advances in Neural Information Processing Systems 585–592.
H ALL , P. (1984). Central limit theorem for integrated square error of multivariate nonparametric density estima-
tors. J. Multivariate Anal. 14 1–16. MR0734096 https://doi.org/10.1016/0047-259X(84)90044-7
H ALL , P. and H EYDE , C. C. (1980). Martingale Limit Theory and Its Application. Probability and Mathematical
Statistics. Academic Press, New York. MR0624435
H AN , F., C HEN , S. and L IU , H. (2017). Distribution-free tests of independence in high dimensions. Biometrika
104 813–828. MR3737306 https://doi.org/10.1093/biomet/asx050
H ELLER , R., H ELLER , Y. and G ORFINE , M. (2013). A consistent multivariate test of association based on ranks
of distances. Biometrika 100 503–510. MR3068450 https://doi.org/10.1093/biomet/ass070
H OEFFDING , W. (1948). A non-parametric test of independence. Ann. Math. Stat. 19 546–557. MR0029139
https://doi.org/10.1214/aoms/1177730150
J IANG , T. and YANG , F. (2013). Central limit theorems for classical likelihood ratio tests for high-dimensional
normal distributions. Ann. Statist. 41 2029–2074. MR3127857 https://doi.org/10.1214/13-AOS1134
K ENDALL , M. G. (1938). A new measure of rank correlation. Biometrika 30 81–93.
L EE , D., Z HANG , K. and KOSOROK , M. R. (2023). The binary expansion randomized ensemble test. Statist.
Sinica 33 2381–2403. MR4647039
L EUNG , D. and D RTON , M. (2018). Testing independence in high dimensions with sums of rank correlations.
Ann. Statist. 46 280–307. MR3766953 https://doi.org/10.1214/17-AOS1550
L I , R., X U , K., Z HOU , Y. and Z HU , L. (2023). Testing the effects of high-dimensional covariates via aggre-
gating cumulative covariances. J. Amer. Statist. Assoc. 118 2184–2194. MR4646635 https://doi.org/10.1080/
01621459.2022.2044334
M OON , H. and C HEN , K. (2022). Interpoint-ranking sign covariance for the test of independence. Biometrika
109 165–179. MR4374647 https://doi.org/10.1093/biomet/asab011
M UDHOLKAR , G. S. and W ILDING , G. E. (2003). On the conventional wisdom regarding two consistent tests of
bivariate independence. Statistician 52 41–57. MR1973881 https://doi.org/10.1111/1467-9884.00340
P EARSON , K. (1900). On the criterion that a given system of deviations from the probable in the case of a
correlated system of variables is such that it can be reasonably supposed to have arisen from random sampling.
Philos. Mag. Ser. 5 50 157–175.
S CHEETZ , T. E., K IM , K.-Y. A., S WIDERSKI , R. E., P HILP, A. R., B RAUN , T. A., K NUDTSON , K. L., D OR -
RANCE , A. M., D I B ONA , G. F., H UANG , J. et al. (2006). Regulation of gene expression in the mammalian
eye and its relevance to eye disease. Proc. Natl. Acad. Sci. USA 103 14429–14434.
S HEN , C., P RIEBE , C. E. and VOGELSTEIN , J. T. (2020). From distance correlation to multiscale graph correla-
tion. J. Amer. Statist. Assoc. 115 280–291. MR4078463 https://doi.org/10.1080/01621459.2018.1543125
S HI , H., D RTON , M. and H AN , F. (2022a). On the power of Chatterjee’s rank correlation. Biometrika 109 317–
333. MR4430960 https://doi.org/10.1093/biomet/asab028
S HI , H., D RTON , M. and H AN , F. (2022b). Distribution-free consistent independence tests via center-outward
ranks and signs. J. Amer. Statist. Assoc. 117 395–410. MR4399094 https://doi.org/10.1080/01621459.2020.
1782223
S HI , H., H ALLIN , M., D RTON , M. and H AN , F. (2022). On universally consistent and fully distribution-free rank
tests of vector independence. Ann. Statist. 50 1933–1959. MR4474478 https://doi.org/10.1214/21-aos2151
S PEARMAN , C. (1904). The proof and measurement of association between two things. Amer. J. Psychol. 15
72–101.
S ZÉKELY, G. J. and R IZZO , M. L. (2013). The distance correlation t-test of independence in high dimension.
J. Multivariate Anal. 117 193–213. MR3053543 https://doi.org/10.1016/j.jmva.2013.02.012
S ZÉKELY, G. J., R IZZO , M. L. and BAKIROV, N. K. (2007). Measuring and testing dependence by correlation
of distances. Ann. Statist. 35 2769–2794. MR2382665 https://doi.org/10.1214/009053607000000505
W EIHS , L., D RTON , M. and M EINSHAUSEN , N. (2018). Symmetric rank covariances: A generalized framework
for nonparametric measures of dependence. Biometrika 105 547–562. MR3842884 https://doi.org/10.1093/
biomet/asy021
X U , K. and Z HU , L. (2022). Power analysis of projection-pursuit independence tests. Statist. Sinica 32 417–433.
MR4359639 https://doi.org/10.5705/ss.202019.0457
YAMADA , Y., H YODO , M. and N ISHIYAMA , T. (2017). Testing block-diagonal covariance structure for high-
dimensional data under non-normality. J. Multivariate Anal. 155 305–316. MR3607897 https://doi.org/10.
1016/j.jmva.2016.12.009
YANAGIMOTO , T. (1970). On measures of association and a related problem. Ann. Inst. Statist. Math. 22 57–63.
YANG , Y. and PAN , G. (2015). Independence test for high dimensional data based on regularized canonical
correlation coefficients. Ann. Statist. 43 467–500. MR3316187 https://doi.org/10.1214/14-AOS1284
YAO , S., Z HANG , X. and S HAO , X. (2018). Testing mutual independence in high dimension via distance covari-
ance. J. R. Stat. Soc. Ser. B. Stat. Methodol. 80 455–480. MR3798874 https://doi.org/10.1111/rssb.12259
Z HANG , K. (2019). BET on independence. J. Amer. Statist. Assoc. 114 1620–1637. MR4047288
https://doi.org/10.1080/01621459.2018.1537921
Z HOU , Y., X U , K., Z HU , L. and L I , R. (2024). Supplement to “Rank-based indices for testing independence
between two high-dimensional vectors.” https://doi.org/10.1214/23-AOS2339SUPP
Z HOU , Y. and Z HU , L. (2018). Model-free feature screening for ultrahigh dimensional data through a modified
Blum–Kiefer–Rosenblatt correlation. Statist. Sinica 28 1351–1370. MR3821008
ZHU, C., ZHANG, X., YAO, S. and SHAO, X. (2020). Distance-based and RKHS-based dependence metrics in
high dimension. Ann. Statist. 48 3366–3394. MR4185812 https://doi.org/10.1214/19-AOS1934
ZHU, L., XU, K., LI, R. and ZHONG, W. (2017). Projection correlation between two random vectors. Biometrika
104 829–843. MR3737307 https://doi.org/10.1093/biomet/asx043
ZHU, L., ZHANG, Y. and XU, K. (2018). Measuring and testing for interval quantile dependence. Ann. Statist.
46 2683–2710. MR3851752 https://doi.org/10.1214/17-AOS1635

Received May 2022; revised August 2023.
