
A study on new measure of association

Niranjan Dey

Mentor: Dr. Subhra Sankar Dhar

Introduction
Since the last century, several measures of association have been proposed to detect association between random variables. Some of the most popular are Kendall's τ (1938), Spearman's ρ (1904), Hoeffding's coefficient (1948), the Blum-Kiefer-Rosenblatt coefficient (Blum, Kiefer, and Rosenblatt, 1961), distance covariance (Székely, Rizzo, and Bakirov, 2007), and the Kolmogorov-Smirnov and Cramér-von Mises tests (1980).
There is a well-known illustration of the fact that a joint distribution is not uniquely determined from knowledge of the marginals:

FX,Y(x, y; α) = FX(x)FY(y){1 + α(x, y)(1 − FX(x))(1 − FY(y))}, ∀(x, y)


The p.d.f. version is as follows:

fX,Y(x, y; α) = fX(x)fY(y){1 + α(x, y)(2FX(x) − 1)(2FY(y) − 1)}, ∀(x, y)

Here fX(x) and fY(y) are the probability density functions of X and Y, and FX(x) and FY(y) are the corresponding distribution functions. For −1 ≤ α ≤ 1 the joint p.d.f. and c.d.f. are fX,Y(x, y; α) and FX,Y(x, y; α). For any α in this range we have ∫_{−∞}^{∞} fX,Y(x, y; α) dx = fY(y) and ∫_{−∞}^{∞} fX,Y(x, y; α) dy = fX(x), i.e. the marginals do not depend on α; with the same marginals and varying α we can obtain various joint distributions. Interestingly, α takes the value 0 if and only if X and Y are independent, as stated and proven in the following result. The population version of α is

α(x, y) = [FX,Y(x, y)/(FX(x)FY(y)) − 1] / [(1 − FX(x))(1 − FY(y))].
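As a quick numerical sanity check of the marginal-invariance claim, the following minimal R sketch (assuming standard normal marginals and a constant α, which are assumptions of this sketch rather than part of the derivation above) integrates fX,Y(x, y; α) over x and compares the result with fY(y):

# numerical check that the x-marginal of f_XY(x, y; alpha) equals f_Y(y) for any alpha in [-1, 1]
# standard normal marginals and a constant alpha are assumed here
fxy <- function(x, y, alpha) {
  dnorm(x) * dnorm(y) * (1 + alpha * (2 * pnorm(x) - 1) * (2 * pnorm(y) - 1))
}
y0 <- 0.7
for (alpha in c(-1, -0.3, 0, 0.5, 1)) {
  marg <- integrate(function(x) fxy(x, y0, alpha), -Inf, Inf)$value
  cat("alpha =", alpha, " x-marginal at y0 =", marg, " f_Y(y0) =", dnorm(y0), "\n")
}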

Result 1 α(x, y) as defined earlier takes value 0 for all (x, y) ∈ R² if and only if X and Y are independent.

If part: assuming X and Y are independent, we show α(x, y) = 0, ∀(x, y).

From the definition of probabilistic independence we have FX,Y(x, y; α) = FX(x)FY(y) ———(1)
As defined earlier, FX,Y(x, y; α) = FX(x)FY(y){1 + α(x, y)(1 − FX(x))(1 − FY(y))} ———(2)
Comparing (1) and (2) we get
FX(x)FY(y){1 + α(x, y)(1 − FX(x))(1 − FY(y))} = FX(x)FY(y), ∀(x, y)
FX(x)FY(y) + α(x, y)FX(x)FY(y)(1 − FX(x))(1 − FY(y)) = FX(x)FY(y), ∀(x, y)
so α(x, y)FX(x)FY(y)(1 − FX(x))(1 − FY(y)) = 0, ∀(x, y); since FX(x)FY(y)(1 − FX(x))(1 − FY(y)) ≠ 0 whenever 0 < FX(x) < 1 and 0 < FY(y) < 1, it follows that
α(x, y) = 0, ∀(x, y)

Only if part: assuming α(x, y) = 0, ∀(x, y), we show X and Y are independent.
As defined earlier, FX,Y(x, y; α) = FX(x)FY(y){1 + α(x, y)(1 − FX(x))(1 − FY(y))}
For α(x, y) = 0, ∀(x, y), FX,Y(x, y; α) = FX(x)FY(y) ———(3)
From (3) it is clear that α(x, y) = 0, ∀(x, y) implies X and Y are independent.



Theorem. Let (X1, Y1), (X2, Y2), (X3, Y3), ..., (Xn, Yn) be paired sample points of (X, Y). For fixed (x, y), our statistic based on the sample points is

α̂n(x, y) = [F̂n,X,Y(x, y)/(F̂n,X(x)F̂n,Y(y)) − 1] / [(1 − F̂n,X(x))(1 − F̂n,Y(y))],

where F̂n,X,Y(x, y) = (1/n) Σ_{i=1}^{n} 1{Xi ≤ x, Yi ≤ y}, F̂n,X(x) = (1/n) Σ_{i=1}^{n} 1{Xi ≤ x} and F̂n,Y(y) = (1/n) Σ_{i=1}^{n} 1{Yi ≤ y}. The population counterpart is

α(x, y) = [FX,Y(x, y)/(FX(x)FY(y)) − 1] / [(1 − FX(x))(1 − FY(y))],

where FX,Y(x, y), FX(x), and FY(y) are the joint distribution function of (X, Y) and the marginal distribution functions of X and Y. Then √n(α̂n(x, y) − α(x, y)) converges weakly to a product of two normal random variables.
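Before turning to the proof, here is a minimal R sketch of the plug-in estimator at a single fixed point (x, y); the helper name alpha_hat is hypothetical and is not part of the appendix code, and points where an empirical marginal c.d.f. equals 0 or 1 are not handled:

# plug-in estimator alpha_hat_n(x, y) at one fixed point (x, y)
alpha_hat <- function(xs, ys, x, y) {
  Fxy <- mean(xs <= x & ys <= y)   # empirical joint c.d.f.
  Fx  <- mean(xs <= x)             # empirical marginal c.d.f. of X
  Fy  <- mean(ys <= y)             # empirical marginal c.d.f. of Y
  (Fxy / (Fx * Fy) - 1) / ((1 - Fx) * (1 - Fy))
}
set.seed(1)
xs <- rnorm(2000); ys <- rnorm(2000)   # independent sample, so alpha(x, y) = 0
alpha_hat(xs, ys, x = 0.5, y = -0.3)   # should be close to 0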

Here we have 1{Xi ≤ x, Yi ≤ y} ∼ Bernoulli(FX,Y(x, y)), 1{Xi ≤ x} ∼ Bernoulli(FX(x))
and 1{Yi ≤ y} ∼ Bernoulli(FY(y)).
From Bernoulli's WLLN, F̂n,X(x) →P FX(x) and F̂n,Y(y) →P FY(y).
By the Lindeberg-Lévy CLT, since the (Xi, Yi) are i.i.d., the 1{Xi ≤ x, Yi ≤ y} are also i.i.d., and
√n(F̂n,X,Y(x, y) − FX,Y(x, y)) →d N(0, FX,Y(x, y)(1 − FX,Y(x, y))),
√n(F̂n,Y(y) − FY(y)) →d N(0, FY(y)(1 − FY(y))) and √n(F̂n,X(x) − FX(x)) →d N(0, FX(x)(1 − FX(x))).
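As an illustration of the marginal CLT statement above, a small Monte Carlo sketch (assuming X standard normal and the fixed point x = 0.5, both assumptions of this sketch) compares the simulated variance of √n(F̂n,X(x) − FX(x)) with FX(x)(1 − FX(x)):

# Monte Carlo check of the CLT for the empirical marginal c.d.f. at x = 0.5
set.seed(1)
n <- 2000; x <- 0.5
z <- replicate(1000, sqrt(n) * (mean(rnorm(n) <= x) - pnorm(x)))
var(z)                     # simulated variance
pnorm(x) * (1 - pnorm(x))  # theoretical variance F_X(x) * (1 - F_X(x))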
We define
β̂n(x, y) = F̂n,X,Y(x, y)/(F̂n,X(x)F̂n,Y(y)) − 1,  β(x, y) = FX,Y(x, y)/(FX(x)FY(y)) − 1,
γ̂n(x, y) = (1 − F̂n,X(x))(1 − F̂n,Y(y)), and γ(x, y) = (1 − FX(x))(1 − FY(y)).
Now,
β̂n(x, y) − β(x, y)
= F̂n,X,Y(x, y)/(F̂n,X(x)F̂n,Y(y)) − FX,Y(x, y)/(FX(x)FY(y))
= F̂n,X,Y(x, y)/(F̂n,X(x)F̂n,Y(y)) − FX,Y(x, y)/(F̂n,X(x)F̂n,Y(y)) + FX,Y(x, y)/(F̂n,X(x)F̂n,Y(y)) − FX,Y(x, y)/(FX(x)FY(y))
= [1/(F̂n,X(x)F̂n,Y(y))](F̂n,X,Y(x, y) − FX,Y(x, y)) + FX,Y(x, y)[1/(F̂n,X(x)F̂n,Y(y)) − 1/(FX(x)FY(y))]
So we have √n(F̂n,X,Y(x, y) − FX,Y(x, y))/(F̂n,X(x)F̂n,Y(y)) (= An, say) →d N(0, FX,Y(x, y)(1 − FX,Y(x, y))/(FX(x)FY(y))² (= σ1², say)) (= A, say) [by Slutsky's Theorem]
Also,
√n(F̂n,X(x)F̂n,Y(y) − FX(x)FY(y))
= √n(F̂n,X(x)F̂n,Y(y) − F̂n,X(x)FY(y) + F̂n,X(x)FY(y) − FX(x)FY(y))
= √n[F̂n,X(x)(F̂n,Y(y) − FY(y)) + FY(y)(F̂n,X(x) − FX(x))]
= F̂n,X(x)(√n(F̂n,Y(y) − FY(y))) + FY(y)(√n(F̂n,X(x) − FX(x)))
We have √n[F̂n,X(x)(F̂n,Y(y) − FY(y))] (= Bn, say) →d N(0, (FX(x))²FY(y)(1 − FY(y)) (= σ2², say)) (= B, say) [by Slutsky's Theorem]
and √n[FY(y)(F̂n,X(x) − FX(x))] (= Cn, say) →d N(0, (FY(y))²FX(x)(1 − FX(x)) (= σ3², say)) (= C, say).

Result 2 A necessary and sufficient condition for (X1, X2, ..., Xn) to have a multivariate normal distribution is that, for every (a1, a2, ..., an) ∈ Rⁿ, the linear combination Σ_{i=1}^{n} aiXi is univariate normal.



We have a1Bn + a2Cn →d a1B + a2C ≡ N(0, a1²σ2² + a2²σ3² + 2a1a2 cov(Bn, Cn)) for every (a1, a2) ∈ R², which implies that (B, C) is bivariate normal.
Taking a1 = 1 and a2 = 1 gives Bn + Cn = √n(F̂n,X(x)F̂n,Y(y) − FX(x)FY(y)) →d N(0, σ2² + σ3² + 2cov(Bn, Cn) (= σ4², say)).


Now, by the Delta method with g(t) = 1/t,
√n(1/(F̂n,X(x)F̂n,Y(y)) − 1/(FX(x)FY(y))) →d N(0, σ4²/(FX(x)FY(y))⁴)
So √n FX,Y(x, y)(1/(F̂n,X(x)F̂n,Y(y)) − 1/(FX(x)FY(y))) (= Dn, say) →d N(0, (FX,Y(x, y))²σ4²/(FX(x)FY(y))⁴ (= σ5², say))
Similarly, from Result 2, √n(β̂n(x, y) − β(x, y)) = An + Dn →d N(0, σ1² + σ5² + 2cov(An, Dn)).
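The Delta-method step above can be illustrated with a simpler toy example; the Bernoulli(p) setup and g(t) = 1/t below are assumptions of this sketch, not the quantities used in the derivation:

# Delta method illustration: sqrt(n) * (1/xbar - 1/p) should have variance near p(1-p)/p^4
set.seed(1)
n <- 5000; p <- 0.6
z <- replicate(2000, {
  xbar <- mean(rbinom(n, 1, p))
  sqrt(n) * (1 / xbar - 1 / p)
})
var(z)              # simulated variance
p * (1 - p) / p^4   # Delta-method variance g'(p)^2 * p * (1 - p)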

Now,
γ̂n(x, y) − γ(x, y)
= (1 − F̂n,X(x))(1 − F̂n,Y(y)) − (1 − FX(x))(1 − FY(y))
= 1 − F̂n,X(x) − F̂n,Y(y) + F̂n,X(x)F̂n,Y(y) − 1 + FX(x) + FY(y) − FX(x)FY(y)
= (F̂n,X(x)F̂n,Y(y) − FX(x)FY(y)) − (F̂n,X(x) − FX(x)) − (F̂n,Y(y) − FY(y))
We already have √n(F̂n,X(x)F̂n,Y(y) − FX(x)FY(y)) →d N(0, σ4²).
Define Pn = √n(F̂n,X(x) − FX(x)) and Qn = √n(F̂n,Y(y) − FY(y)). Then
√n(γ̂n(x, y) − γ(x, y)) →d N(0, σ4² + FX(x)(1 − FX(x)) + FY(y)(1 − FY(y)) − 2cov(Bn + Cn, Pn) − 2cov(Bn + Cn, Qn) + 2cov(Pn, Qn) (= σ6², say)).
Again by the Delta method with g(t) = 1/t,
√n(1/γ̂n(x, y) − 1/γ(x, y)) →d N(0, σ6²/(γ(x, y))⁴)
Finally,
√n(α̂n(x, y) − α(x, y))
= √n[β̂n(x, y)/γ̂n(x, y) − β(x, y)/γ(x, y)]
= √n[β̂n(x, y)/γ̂n(x, y) − β̂n(x, y)/γ(x, y) + β̂n(x, y)/γ(x, y) − β(x, y)/γ(x, y)]
= √n[β̂n(x, y)(1/γ̂n(x, y) − 1/γ(x, y)) + (1/γ(x, y))(β̂n(x, y) − β(x, y))]
= √n[(β̂n(x, y) − β(x, y))(1/γ̂n(x, y) − 1/γ(x, y)) + β(x, y)(1/γ̂n(x, y) − 1/γ(x, y)) + (1/γ(x, y))(β̂n(x, y) − β(x, y))]
From this decomposition it can be seen that √n(α̂n(x, y) − α(x, y)) converges weakly to a product of two normal random variables.
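To see this limiting behaviour empirically, the following self-contained sketch assumes independent standard normal X and Y (an assumption of this sketch), so that α(x0, y0) = 0 at the chosen point:

# empirical look at sqrt(n) * (alpha_hat_n(x, y) - alpha(x, y)) under independence
set.seed(1)
n <- 2000; x0 <- 0.5; y0 <- -0.3
lim <- replicate(1000, {
  xs <- rnorm(n); ys <- rnorm(n)
  Fxy <- mean(xs <= x0 & ys <= y0); Fx <- mean(xs <= x0); Fy <- mean(ys <= y0)
  sqrt(n) * ((Fxy / (Fx * Fy) - 1) / ((1 - Fx) * (1 - Fy)))   # alpha(x0, y0) = 0 here
})
hist(lim, breaks = 40, main = "sqrt(n) * alpha_hat_n under independence")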

Implementation of Test of Independence and asymptotic power study


➪ Our test is H0 : X and Y are independent against H1 : X and Y are not independent.

➪ Here our proposed test statistic is median_{(x,y)} α̂n(x, y).

➪ We simulated samples (X, Y) from the Bivariate Normal Distribution; we know that when ρ = 0, X and Y are independent.



➪ Calculated the empirical distribution functions F̂n,X,Y(x, y), F̂n,X(x), F̂n,Y(y) and median_{(x,y)} α̂n(x, y).

➪ The product of two normal random variables takes positive values more often than negative values.

[Figure: Histogram of Test statistic; x-axis: Values of Statistic, y-axis: Frequency]

➪ Critical values for these samples (set.seed(100)) at the 5% level of significance are the 97.5% quantile = 1.295183 and the 2.5% quantile = 1.052632; a decision-rule sketch based on these quantiles appears after this list.
➪ Computed power of the test for various values of ρ.

➪ Power curve plot:

[Figure: Power curve of Test statistic; x-axis: rho, y-axis: T_pow]

➪ From the histogram it is clear that the test statistic is positively skewed. A positively skewed statistic rejects more often for large values of the test statistic, which explains the shape of the power curve.
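In terms of the objects computed in the appendix below (alp_nxy, smx, smy, La, Ua), the two-sided decision rule at the 5% level can be sketched as follows; this is a usage sketch under those definitions, not additional appendix code:

# reject H0 (independence) when the observed statistic falls outside [La, Ua]
T_obs <- median(alp_nxy(smx, smy))     # observed test statistic for the data at hand
reject_H0 <- (T_obs < La) | (T_obs > Ua)
reject_H0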



Appendix (R code)

library(MASS)
set.seed(100)
mu=c(0,0)
multinorm <- function(mu, rho, N = 2e3)
{
Sigma <- matrix(c(1, rho, rho, 1), nrow = 2, ncol = 2)
# Eigenvalue (spectral) decomposition
decomp <- eigen(Sigma)
# Finding matrix square-root
Sig.sq <- decomp$vectors %*% diag(decomp$values^(1/2)) %*% solve(decomp$vectors)
samp <- matrix(0, nrow = N, ncol = 2)
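# each draw below is mu + Sigma^(1/2) %*% Z with Z ~ N(0, I_2), so samp[i, ] ~ N(mu, Sigma)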
for(i in 1:N)
{
Z <- rnorm(2)
samp[i, ] <- mu + Sig.sq %*% Z
}
return(samp)
}

set.seed(100)
smx = multinorm(mu=mu,rho=0)[,1]
smy = multinorm(mu=mu,rho=0)[,2]

Fxy = function(smx,smy)
{
x = seq(-5,5,length=1e2)
y = seq(-5,5,length=1e2)
XY = matrix(ncol=length(x),nrow=length(y))
for( i in 1:length(x))
{
for(j in 1:length(y))
{
XY[i,j]= mean(smx <= x[i] & smy <= y[j])
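# clamp exact 0 and 1 values so that later divisions by F and (1 - F) in alp_nxy stay finite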
if(XY[i,j] ==0)
{
XY[i,j] = 0.0000001
}
if(XY[i,j] ==1)
{
XY[i,j] = 0.9999999
}
}
}
return(XY)
}



Fx = function(smx)
{
x = seq(-5,5,length=1e2)
X = numeric(length(x))
for(i in 1:length(x))
{
X[i] = mean(smx <= x[i])
if(X[i] ==0)
{
X[i] = 0.0000001
}
if(X[i] ==1)
{
X[i] = 0.9999999
}
}
return(X)
}

Fy = function(smy)
{
y = seq(-5,5,length=1e2)
Y = numeric(length(y))
for(i in 1:length(y))
{
Y[i] = mean(smy <= y[i])
if(Y[i] ==0)
{
Y[i] = 0.0000001
}
if(Y[i] ==1)
{
Y[i] = 0.9999999
}
}
return(Y)
}

alp_nxy = function(smx,smy)
{
x = seq(-5,5,length=1e2)
y = seq(-5,5,length=1e2)
alp_XY = matrix(nrow = length(x),ncol = length(y))
FXY=Fxy(smx=smx,smy=smy)
FX=Fx(smx=smx)
FY=Fy(smy=smy)
for(i in 1:length(x))
{



for(j in 1:length(y))
{
alp_XY[i,j] = ((FXY[i,j]/(FX[i]*FY[j]))-1)/((1-FX[i])*(1-FY[j]))
}
}
return(as.vector(alp_XY))
}
max(alp_nxy(smx,smy))
mean(alp_nxy(smx,smy))
median(alp_nxy(smx,smy))

sim = list()
Tst = numeric(length=100)
mu=c(0,0)

for(i in 1:100)
{
sim[[i]] = multinorm(mu=mu,rho=0)
Tst[i] = median(alp_nxy(sim[[i]][,1],sim[[i]][,2]))
}
hist(Tst, xlab = "Values of Statistic", main = "Histogram of Test statistic")

#critical values
Ua = quantile(Tst, 0.975);Ua
La = quantile(Tst, 0.025);La
mean(Tst <=Ua & Tst>=La)

#for power
sim = list()
T_p = numeric(100)
mu=c(0,0)
Tst_val = function(rho)
{
for(i in 1:100)
{
sim[[i]] = multinorm(mu=mu,rho)
T_p[i] = median(alp_nxy(sim[[i]][,1], sim[[i]][,2]))
}
return(T_p)
}

rho = seq(-1,1,length=100)

T_pow = numeric(length(rho))
for(i in 1:length(rho))
{
# evaluate Tst_val() once per rho; calling it twice would compare two different simulated samples
tv = Tst_val(rho[i])
T_pow[i] = 1-mean(tv>=La & tv<=Ua)
}

size = min(T_pow)
plot(rho,T_pow,main="Power curve of Test statistic",type="l")
abline(h=0.03,col="red",lwd=2)

References
[1] G. Jogesh Babu and C. Radhakrishna Rao. Joint asymptotic distribution of marginal quantiles and
quantile functions in samples from a multivariate population. 27:15–23, 1988.

[2] Gutti Jogesh Babu and B. L. S. Prakasa Rao. Asymptotic theory of statistical inference. 83:1217,
1988.

[3] Wicher Bergsma and Angelos Dassios. A consistent test of independence based on a sign covariance related to Kendall's tau. 20, 2014.

[4] J. R. Blum, J. Kiefer, and M. Rosenblatt. Distribution free tests of independence based on the
sample distribution function. Annals of Mathematical Statistics, 32:485–498, 1961.

[5] Subhra Sankar Dhar, Wicher Bergsma, and Angelos Dassios. Testing independence of covariates and errors in non-parametric regression. 45:421–443, 2018.

[6] Subhra Sankar Dhar, Angelos Dassios, and Wicher Bergsma. A study of the power and robustness
of a new test for independence against contiguous alternatives. 10, 2016.

[7] Jean Dickinson Gibbons and Subhabrata Chakraborti. Nonparametric statistical inference. Stat.,
Textb. Monogr. Boca Raton, FL: CRC Press, 5th ed. edition, 2011.

[8] Wassily Hoeffding. A class of statistics with asymptotically normal distribution. 19:293–325, 1948.

[9] M. G. Kendall. A new measure of rank correlation. Biometrika, 30:81–93, 1938.

[10] A. Rényi. On measures of dependence. 10:441–451, 1959.

[11] Robert J. Serfling. Approximation theorems of mathematical statistics, 1980. Includes bibliography
and indexes.

[12] C. Spearman. The proof and measurement of association between two things. 15:72, 1904.

[13] Gábor J. Székely, Maria L. Rizzo, and Nail K. Bakirov. Measuring and testing dependence by
correlation of distances. The Annals of Statistics, 35(6):2769–2794, 2007.

[14] A. W. van der Vaart. Asymptotic statistics, 1998.



Acknowledgment
I would like to thank and express my heartfelt gratitude to Prof. Dr. Subhra Sankar Dhar for providing me with this interesting project and for helping me learn several interesting results and facts on the topic. I shall remain ever grateful to the SURGE program for pushing my limits and introducing me to real research work.

† ∗ ∗ ∗ ∗ THANK YOU ∗ ∗ ∗ ∗ †

