Rank Test


Correlation and Regression 10·39

Probable error also enables us to find the limits within which the
population correlation coefficient can be expected to vary. The limits are
$r \pm \text{P.E.}(r)$.
10·6. Rank Correlation. Let us suppose that a group of n individuals
is arranged in order of merit or proficiency in possession of two characteristics A
and B. These ranks in the two characteristics will, in general, be different. For
example, if we consider the relation between intelligence and beauty, it is not
necessary that a beautiful individual is intelligent also. Let $(x_i, y_i)$; $i = 1, 2, \ldots, n$
be the ranks of the ith individual in the two characteristics A and B respectively.
The Pearsonian coefficient of correlation between the ranks $x_i$'s and $y_i$'s is called the
rank correlation coefficient between A and B for that group of individuals.
Assuming that no two individuals are bracketed equal in either
classification, each of the variables X and Y takes the values 1, 2, ..., n.
Hence
$$\bar{x} = \bar{y} = \frac{1}{n}(1 + 2 + 3 + \cdots + n) = \frac{n+1}{2}$$
$$\sigma_X^2 = \frac{1}{n}\sum_{i=1}^{n} x_i^2 - \bar{x}^2 = \frac{1}{n}\left(1^2 + 2^2 + \cdots + n^2\right) - \left(\frac{n+1}{2}\right)^2 = \frac{n(n+1)(2n+1)}{6n} - \frac{(n+1)^2}{4} = \frac{n^2-1}{12}$$
$$\therefore \quad \sigma_X^2 = \frac{n^2-1}{12} = \sigma_Y^2$$
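In general $x_i \neq y_i$. Let $d_i = x_i - y_i$. Then
$$d_i = (x_i - \bar{x}) - (y_i - \bar{y}) \qquad (\because\ \bar{x} = \bar{y})$$
Squaring and summing over i from 1 to n, we get
$$\sum d_i^2 = \sum \left[(x_i - \bar{x}) - (y_i - \bar{y})\right]^2 = \sum (x_i - \bar{x})^2 + \sum (y_i - \bar{y})^2 - 2\sum (x_i - \bar{x})(y_i - \bar{y})$$
Dividing both sides by n, we get
$$\frac{1}{n}\sum d_i^2 = \sigma_X^2 + \sigma_Y^2 - 2\,\mathrm{Cov}(X, Y) = \sigma_X^2 + \sigma_Y^2 - 2\rho\,\sigma_X \sigma_Y,$$
where $\rho$ is the rank correlation coefficient between A and B. Since $\sigma_X^2 = \sigma_Y^2$,
$$\frac{1}{n}\sum d_i^2 = 2\sigma_X^2 - 2\rho\,\sigma_X^2 \quad\Rightarrow\quad 1 - \rho = \frac{\sum d_i^2}{2n\sigma_X^2}$$
$$\Rightarrow\quad \rho = 1 - \frac{\sum_{i=1}^{n} d_i^2}{2n\sigma_X^2} = 1 - \frac{6\sum_{i=1}^{n} d_i^2}{n(n^2-1)} \tag{10·7}$$
which is Spearman's formula for the rank correlation coefficient.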
Remark. We always have
$$\sum d_i = \sum (x_i - y_i) = \sum x_i - \sum y_i = n(\bar{x} - \bar{y}) = 0 \qquad (\because\ \bar{x} = \bar{y})$$
This serves as a check on the calculations.
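The derivation above can be checked numerically. The following is a minimal Python sketch and not part of the original text: the function names `spearman_rho` and `pearson_r` and the five illustrative ranks are assumptions chosen for the demonstration. It evaluates formula (10·7) for untied ranks, applies the check $\sum d_i = 0$ from the Remark, and compares the result with the Pearsonian coefficient computed directly on the ranks.

```python
import math
from fractions import Fraction


def spearman_rho(x_ranks, y_ranks):
    """Formula (10.7): rho = 1 - 6*sum(d^2) / (n(n^2 - 1)), untied ranks only."""
    n = len(x_ranks)
    expected = list(range(1, n + 1))
    # (10.7) assumes each series is a permutation of 1, 2, ..., n.
    if n < 2 or sorted(x_ranks) != expected or sorted(y_ranks) != expected:
        raise ValueError("each series must be the ranks 1..n with no ties")
    d = [x - y for x, y in zip(x_ranks, y_ranks)]
    # Check from the Remark: the differences always sum to zero.
    assert sum(d) == 0
    return 1 - Fraction(6 * sum(di * di for di in d), n * (n * n - 1))


def pearson_r(x, y):
    """Ordinary product-moment correlation, used here only for comparison."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


# Illustrative (hypothetical) ranks of five individuals.
x = [1, 2, 3, 4, 5]
y = [2, 1, 4, 3, 5]
rho = spearman_rho(x, y)
print(rho, pearson_r(x, y))
```

For these ranks $\sum d_i^2 = 4$, so both computations give 4/5, as the derivation predicts for untied ranks.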
10·40 Fundamentals of Mathematical Statistics

10·6·1. Tied Ranks. If some of the individuals receive the same rank in a
ranking of merit, they are said to be tied. Let us suppose that m of the
individuals, say, the (k + 1)th, (k + 2)th, ..., (k + m)th, are tied. Then each of these m
individuals is assigned a common rank, which is the arithmetic mean of the
ranks k + 1, k + 2, ..., k + m.
Derivation of $\rho(X, Y)$: We have:

$$\rho(X, Y) = \frac{\sum xy}{\sqrt{\sum x^2 \cdot \sum y^2}} \tag{*}$$

where $x = X - \bar{X}$, $y = Y - \bar{Y}$.
If X and Y each takes the values 1, 2, ..., n, then we have
$$\bar{X} = \frac{n+1}{2} = \bar{Y}$$
and
$$n\sigma_X^2 = \sum x^2 = \frac{n(n^2-1)}{12} \quad\text{and}\quad n\sigma_Y^2 = \sum y^2 = \frac{n(n^2-1)}{12} \tag{**}$$

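Also
$$\sum d^2 = \sum (X - Y)^2 = \sum \left[(X - \bar{X}) - (Y - \bar{Y})\right]^2 = \sum (x - y)^2$$
$$\therefore \quad \sum d^2 = \sum x^2 + \sum y^2 - 2\sum xy \quad\Rightarrow\quad \sum xy = \frac{1}{2}\left[\sum x^2 + \sum y^2 - \sum d^2\right] \tag{***}$$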
We shall now investigate the effect of common ranking (in case of ties) on
the sum of squares of the ranks. Let $S^2$ and $S_1^2$ denote the sum of the squares of
untied and tied ranks respectively.
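Then we have:
$$\begin{aligned}
S^2 &= (k+1)^2 + (k+2)^2 + \cdots + (k+m)^2 \\
    &= mk^2 + (1^2 + 2^2 + \cdots + m^2) + 2k(1 + 2 + \cdots + m) \\
    &= mk^2 + \frac{m(m+1)(2m+1)}{6} + mk(m+1)
\end{aligned}$$
$$\begin{aligned}
S_1^2 &= m\,(\text{average rank})^2 = m\left[\frac{(k+1) + (k+2) + \cdots + (k+m)}{m}\right]^2 \\
      &= m\left(k + \frac{m+1}{2}\right)^2 = mk^2 + \frac{m(m+1)^2}{4} + mk(m+1)
\end{aligned}$$
$$\therefore \quad S^2 - S_1^2 = \frac{m(m+1)}{12}\left[2(2m+1) - 3(m+1)\right] = \frac{m(m^2-1)}{12}$$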
Thus the effect of tying m individuals (ranks) is to reduce the sum of the
squares by $m(m^2-1)/12$, though the mean value of the ranks remains the same,
viz., (n + 1)/2.
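The reduction $m(m^2-1)/12$ can be verified by direct computation. The sketch below is an illustration only; the helper name `tie_reduction` and the ranges of k and m tried are assumptions, not part of the original text. It computes $S^2 - S_1^2$ exactly for m tied ranks k + 1, ..., k + m and compares it with the closed form.

```python
from fractions import Fraction


def tie_reduction(k, m):
    """Return S^2 - S_1^2 for the m tied ranks k+1, ..., k+m."""
    untied = [k + j for j in range(1, m + 1)]
    s2 = sum(r * r for r in untied)      # sum of squares of untied ranks
    average = Fraction(sum(untied), m)   # common rank given to all m individuals
    s1_2 = m * average * average         # sum of squares of tied ranks
    return s2 - s1_2


# Compare with m(m^2 - 1)/12 over a small grid of k and m.
for k in range(0, 5):
    for m in range(1, 7):
        assert tie_reduction(k, m) == Fraction(m * (m * m - 1), 12)

print(tie_reduction(4, 3))  # ranks 5, 6, 7 all replaced by 6
```

The result does not depend on k, which is why the correction for a tied set involves only its size m.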
Suppose that there are s such sets of ranks to be tied in the X-series so that
the total sum of squares due to them is
$$\frac{1}{12}\sum_{i=1}^{s} m_i(m_i^2 - 1) = \frac{1}{12}\sum_{i=1}^{s} (m_i^3 - m_i) = T_X \ \text{(say)} \tag{10·7a}$$
Similarly suppose that there are t such sets of ranks to be tied with respect
to the other series Y so that the sum of squares due to them is:
$$\frac{1}{12}\sum_{j=1}^{t} m_j'(m_j'^2 - 1) = \frac{1}{12}\sum_{j=1}^{t} (m_j'^3 - m_j') = T_Y \ \text{(say)} \tag{10·7b}$$

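Thus, in the case of ties, the new sums of squares are given by:
$$n\,\mathrm{Var}'(X) = \sum x^2 - T_X = \frac{n(n^2-1)}{12} - T_X$$
$$n\,\mathrm{Var}'(Y) = \sum y^2 - T_Y = \frac{n(n^2-1)}{12} - T_Y$$
and
$$\begin{aligned}
n\,\mathrm{Cov}'(X, Y) &= \frac{1}{2}\left[\sum x^2 - T_X + \sum y^2 - T_Y - \sum d^2\right] \qquad [\text{From } (***)] \\
&= \frac{1}{2}\left[\frac{n(n^2-1)}{12} - T_X + \frac{n(n^2-1)}{12} - T_Y - \sum d^2\right] \\
&= \frac{n(n^2-1)}{12} - \frac{1}{2}\left[(T_X + T_Y) + \sum d^2\right]
\end{aligned}$$
$$\begin{aligned}
\therefore \quad \rho(X, Y) &= \frac{\dfrac{n(n^2-1)}{12} - \dfrac{1}{2}\left[T_X + T_Y + \sum d^2\right]}{\left[\dfrac{n(n^2-1)}{12} - T_X\right]^{1/2}\left[\dfrac{n(n^2-1)}{12} - T_Y\right]^{1/2}} \\
&= \frac{\dfrac{n(n^2-1)}{6} - \left[\sum d^2 + T_X + T_Y\right]}{\left[\dfrac{n(n^2-1)}{6} - 2T_X\right]^{1/2}\left[\dfrac{n(n^2-1)}{6} - 2T_Y\right]^{1/2}}
\end{aligned} \tag{10·7c}$$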
where $T_X$ and $T_Y$ are given by (10·7a) and (10·7b).
Remark. If we adjust only the covariance term, i.e., $\sum xy$, and not the
variances $\sigma_X^2$ (or $\sum x^2$) and $\sigma_Y^2$ (or $\sum y^2$) for ties, then the formula (10·7c)
reduces to:
$$\rho(X, Y) = \frac{\dfrac{n(n^2-1)}{6} - \left(\sum d^2 + T_X + T_Y\right)}{n(n^2-1)/6} = 1 - \frac{6\left[\sum d^2 + T_X + T_Y\right]}{n(n^2-1)} \tag{10·7d}$$
a formula which is commonly used in practice for numerical problems. For
illustration, see Example 10·18.
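To see how much (10·7c) and (10·7d) differ in practice, the two formulas can be evaluated side by side. The following is a rough sketch under stated assumptions: the function names `tie_term`, `rho_full` and `rho_simplified` are hypothetical, and the inputs (n = 10, $\sum d^2 = 72$, one pair and one triple tied in X, one pair tied in Y) are read off the data and table of Example 10·18 in this section.

```python
import math
from fractions import Fraction


def tie_term(group_sizes):
    """T = (1/12) * sum(m^3 - m) over the tied sets, as in (10.7a) and (10.7b)."""
    return Fraction(sum(m ** 3 - m for m in group_sizes), 12)


def rho_full(n, sum_d2, x_ties, y_ties):
    """Formula (10.7c): covariance and both variances adjusted for ties."""
    base = Fraction(n * (n * n - 1), 6)
    tx, ty = tie_term(x_ties), tie_term(y_ties)
    numerator = base - (sum_d2 + tx + ty)
    denominator = math.sqrt(float((base - 2 * tx) * (base - 2 * ty)))
    return float(numerator) / denominator


def rho_simplified(n, sum_d2, x_ties, y_ties):
    """Formula (10.7d): only the covariance term adjusted for ties."""
    tx, ty = tie_term(x_ties), tie_term(y_ties)
    return 1 - 6 * (sum_d2 + tx + ty) / Fraction(n * (n * n - 1))


# Inputs taken from Example 10.18: sizes of the tied sets in each series.
n, sum_d2 = 10, 72
x_ties, y_ties = [2, 3], [2]

print(tie_term(x_ties), tie_term(y_ties))
print(rho_simplified(n, sum_d2, x_ties, y_ties))
print(rho_full(n, sum_d2, x_ties, y_ties))
```

With these inputs $T_X = 5/2$ and $T_Y = 1/2$, and (10·7d) gives 6/11, while (10·7c) gives a slightly larger value because its denominator is reduced by the tie terms. When there are no ties both functions reduce to (10·7).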
Example 10·16. The ranks of the same 16 students in Mathematics and
Physics are as follows. The two numbers within brackets denote the ranks of the
students in Mathematics and Physics.
(1, 1) (2, 10) (3, 3) (4, 4) (5, 5) (6, 7) (7, 2) (8, 6) (9, 8)
(10, 11) (11, 15) (12, 9) (13, 14) (14, 12) (15, 16) (16, 13).
Calculate the rank correlation coefficient for proficiencies of this group in
Mathematics and Physics.
Solution.

Ranks in Maths (X)     1   2   3   4   5   6   7   8   9  10  11  12  13  14  15  16   Total
Ranks in Physics (Y)   1  10   3   4   5   7   2   6   8  11  15   9  14  12  16  13
d = X - Y              0  -8   0   0   0  -1   5   2   1  -1  -4   3  -1   2  -1   3       0
d²                     0  64   0   0   0   1  25   4   1   1  16   9   1   4   1   9     136

Rank correlation coefficient is given by
$$\rho = 1 - \frac{6\sum d^2}{n(n^2-1)} = 1 - \frac{6 \times 136}{16 \times 255} = 1 - \frac{1}{5} = \frac{4}{5} = 0{\cdot}8$$
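The arithmetic of this solution can be reproduced in a few lines. This is only a sketch for checking the table; the variable names are assumptions, and the ranks are those listed in the example.

```python
from fractions import Fraction

# Ranks of the 16 students, as given in Example 10.16.
maths = list(range(1, 17))
physics = [1, 10, 3, 4, 5, 7, 2, 6, 8, 11, 15, 9, 14, 12, 16, 13]

d = [x - y for x, y in zip(maths, physics)]
sum_d = sum(d)                       # should be 0 (check on the calculations)
sum_d2 = sum(v * v for v in d)       # total of the d^2 row
n = len(maths)

# Spearman's formula (10.7), kept as an exact fraction.
rho = 1 - Fraction(6 * sum_d2, n * (n * n - 1))
print(sum_d, sum_d2, rho, float(rho))
```

Keeping the coefficient as an exact fraction makes it easy to compare with the hand computation 1 - 1/5 = 4/5.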
Example 10·17. Ten competitors in a musical test were ranked by the
three judges A, B and C in the following order:
Ranks by A : 1 6 5 10 3 2 4 9 7 8
Ranks by B : 3 5 8 4 7 10 2 1 6 9
Ranks by C : 6 4 9 8 1 2 3 10 5 7
Using rank correlation method, discuss which pair of judges has the nearest
approach to common likings in music.
Solution. Here n = 10.

Ranks   Ranks   Ranks     d1        d2        d3       d1²     d2²     d3²
by A    by B    by C    = X - Y   = X - Z   = Y - Z
 (X)     (Y)     (Z)

  1       3       6       -2        -5        -3        4      25       9
  6       5       4        1         2         1        1       4       1
  5       8       9       -3        -4        -1        9      16       1
 10       4       8        6         2        -4       36       4      16
  3       7       1       -4         2         6       16       4      36
  2      10       2       -8         0         8       64       0      64
  4       2       3        2         1        -1        4       1       1
  9       1      10        8        -1        -9       64       1      81
  7       6       5        1         2         1        1       4       1
  8       9       7       -1         1         2        1       1       4

Total                   Σd1 = 0   Σd2 = 0   Σd3 = 0   Σd1² = 200   Σd2² = 60   Σd3² = 214

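$$\rho(X, Y) = 1 - \frac{6\sum d_1^2}{n(n^2-1)} = 1 - \frac{6 \times 200}{10 \times 99} = 1 - \frac{40}{33} = -\frac{7}{33}$$
$$\rho(X, Z) = 1 - \frac{6\sum d_2^2}{n(n^2-1)} = 1 - \frac{6 \times 60}{10 \times 99} = 1 - \frac{4}{11} = \frac{7}{11}$$
$$\rho(Y, Z) = 1 - \frac{6\sum d_3^2}{n(n^2-1)} = 1 - \frac{6 \times 214}{10 \times 99} = 1 - \frac{214}{165} = -\frac{49}{165}$$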
Since $\rho(X, Z)$ is maximum, we conclude that the pair of judges A and C
has the nearest approach to common likings in music.
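The three pairwise coefficients can be cross-checked with a short script. This is a minimal sketch; the helper `spearman` and the dictionary layout are assumptions made for illustration, and the first rank given by judge C is taken as 6, as in the first row of the solution table.

```python
from fractions import Fraction
from itertools import combinations

# Ranks awarded by the three judges in Example 10.17.
ranks = {
    "A": [1, 6, 5, 10, 3, 2, 4, 9, 7, 8],
    "B": [3, 5, 8, 4, 7, 10, 2, 1, 6, 9],
    "C": [6, 4, 9, 8, 1, 2, 3, 10, 5, 7],
}


def spearman(x, y):
    """Spearman's formula (10.7) for untied ranks, as an exact fraction."""
    n = len(x)
    sum_d2 = sum((a - b) ** 2 for a, b in zip(x, y))
    return 1 - Fraction(6 * sum_d2, n * (n * n - 1))


# One coefficient for each pair of judges: (A, B), (A, C), (B, C).
results = {
    pair: spearman(ranks[pair[0]], ranks[pair[1]])
    for pair in combinations(ranks, 2)
}
best = max(results, key=results.get)

for pair, rho in results.items():
    print(pair, rho)
print("closest pair:", best)
```

Only the pair (A, C) has a positive coefficient, which agrees with the conclusion drawn above.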
10·6·2. Repeated Ranks (Continued). If any two or more
individuals are bracketed equal in any classification with respect to characteristics
A and B, or if there is more than one item with the same value in the series,
then Spearman's formula (10·7) for calculating the rank correlation
coefficient breaks down, since in this case each of the variables X and Y does
not assume the values 1, 2, ..., n and consequently the variances $\sigma_X^2$ and
$\sigma_Y^2$ need no longer equal $(n^2-1)/12$, as the derivation of (10·7) assumes.
In this case, common ranks are given to the repeated items. This common
rank is the average of the ranks which these items would have assumed if they
were slightly different from each other, and the next item will get the rank next to
the ranks already assumed. As a result of this, the following adjustment or
correction is made in the rank correlation formula [cf. (10·7c) and (10·7d)].
In the formula, we add the factor $m(m^2-1)/12$ to $\sum d^2$, where m is the
number of times an item is repeated. This correction factor is to be added for
each repeated value in both the X-series and Y-series.
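The rule for assigning common ranks and the resulting correction can be sketched in code. The example below is illustrative only: the function names `average_ranks` and `tie_correction` and the seven scores are hypothetical, and it assumes that the largest value receives rank 1.

```python
from collections import Counter


def average_ranks(values):
    """Rank values in descending order; tied values share the mean of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i], reverse=True)
    ranks = [0.0] * len(values)
    pos = 0
    while pos < len(order):
        end = pos
        # Extend the block while the next value equals the current one.
        while end + 1 < len(order) and values[order[end + 1]] == values[order[pos]]:
            end += 1
        # Positions pos..end would have had ranks pos+1..end+1; use their mean.
        common = (pos + 1 + end + 1) / 2
        for j in range(pos, end + 1):
            ranks[order[j]] = common
        pos = end + 1  # the next item gets the rank next to those already used
    return ranks


def tie_correction(values):
    """Sum of m(m^2 - 1)/12 over every value that is repeated m times."""
    return sum(m * (m * m - 1) for m in Counter(values).values()) / 12


# Hypothetical scores: 85 occurs twice and 60 occurs three times.
scores = [90, 85, 85, 70, 60, 60, 60]
print(average_ranks(scores))
print(tie_correction(scores))
```

Here the two scores of 85 share rank 2·5, the next score 70 gets rank 4, and the three scores of 60 share rank 6. A value that occurs once contributes nothing to the correction, since m(m² - 1) = 0 when m = 1.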
Example 10·18. Obtain the rank correlation coefficient for the following
data:
X : 68 64 75 50 64 80 75 40 55 64
Y : 62 58 68 45 81 60 68 48 50 70
Solution.
CALCULATIONS FOR RANK CORRELATION

              Rank X   Rank Y
  X     Y      (x)      (y)     d = x - y     d²
 68    62      4        5          -1          1
 64    58      6        7          -1          1
 75    68      2·5      3·5        -1          1
 50    45      9       10          -1          1
 64    81      6        1           5         25
 80    60      1        6          -5         25
 75    68      2·5      3·5        -1          1
 40    48     10        9           1          1
 55    50      8        8           0          0
 64    70      6        2           4         16
                                 Σd = 0     Σd² = 72
In the X-series we see that the value 75 occurs 2 times. The common rank
given to these values is 2·5, which is the average of 2 and 3, the ranks which
these values would have taken if they were different. The next value, 68, then
gets the next rank, which is 4. Again we see that the value 64 occurs thrice. The
common rank given to it is 6, which is the average of 5, 6 and 7. Similarly in
