
Simple correlation

Basic concept:

Coefficient of correlation: The coefficient of correlation is a measure of the linear relationship between two random variables. It measures both the strength and the direction of the linear relationship between the two variables.

Karl Pearson developed a formula to measure the degree of linear relationship between two variables.

If X and Y are two random variables, then the coefficient of correlation between X and Y is denoted by $\rho_{XY}$ or $r_{XY}$ and is defined by

$$r_{XY} = \frac{Cov(X, Y)}{\sqrt{V(X)\,V(Y)}} = \frac{\frac{1}{n}\sum_{i=1}^{n}(X_i-\bar{X})(Y_i-\bar{Y})}{\sqrt{\frac{1}{n}\sum_{i=1}^{n}(X_i-\bar{X})^2 \cdot \frac{1}{n}\sum_{i=1}^{n}(Y_i-\bar{Y})^2}} = \frac{\sum_{i=1}^{n}(X_i-\bar{X})(Y_i-\bar{Y})}{\sqrt{\sum_{i=1}^{n}(X_i-\bar{X})^2\,\sum_{i=1}^{n}(Y_i-\bar{Y})^2}}$$

Example: i) Height and weight of a person.

ii) Price and demand of a thing.
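
The definition above translates directly into code. Below is a minimal sketch in plain Python (no libraries) that computes r from the population (1/n) formulas; the height/weight figures are made-up illustrative numbers, not data from the text.

```python
import math

def pearson_r(x, y):
    """r = Cov(X, Y) / sqrt(V(X) * V(Y)), population (1/n) form."""
    n = len(x)
    x_bar = sum(x) / n
    y_bar = sum(y) / n
    cov = sum((a - x_bar) * (b - y_bar) for a, b in zip(x, y)) / n
    v_x = sum((a - x_bar) ** 2 for a in x) / n
    v_y = sum((b - y_bar) ** 2 for b in y) / n
    return cov / math.sqrt(v_x * v_y)

# Hypothetical height (cm) / weight (kg) pairs: taller people tend to weigh more
heights = [150, 155, 160, 165, 170, 175]
weights = [50, 53, 55, 60, 64, 66]
r = pearson_r(heights, weights)   # close to +1
```

Since the 1/n factors cancel between numerator and denominator, the same function also matches the last form of the formula above.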

Different kinds of correlation:

Positive correlation: If the two variables vary in the same direction, i.e. an increase or decrease in one variable results in a corresponding increase or decrease in the other variable, the correlation is called positive.

Example: Height and weight of a person.


Negative correlation: If two variables consistently vary in opposite directions, i.e. when one variable increases the other decreases, or when one decreases the other increases, the correlation is said to be negative.

Example: Price and demand of crops.

Non-sense correlation: When two variables X and Y are linearly independent, the value of the correlation coefficient is zero. But r = 0 does not mean that the two variables X and Y are unrelated. This type of correlation is known as non-sense correlation.

Example: The height and age of university student.

Properties of correlation coefficient:

i) The correlation coefficient is independent of the change of origin and scale of measurement.
ii) The value of the correlation coefficient lies between -1 and +1.
iii) The correlation coefficient is the geometric mean of the two regression coefficients.
iv) The correlation coefficient is a symmetric measure.
v) The correlation coefficient is a dimensionless quantity; it is not expressed in any units of measurement.

Measure of correlation:

The strength of correlation can be read from the value of r on the following scale:

r = -1: perfect indirect (negative) correlation
-1 < r ≤ -0.75: strong negative
-0.75 < r ≤ -0.25: intermediate negative
-0.25 < r < 0: weak negative
r = 0: no relation
0 < r < 0.25: weak positive
0.25 ≤ r < 0.75: intermediate positive
0.75 ≤ r < 1: strong positive
r = +1: perfect direct (positive) correlation
Example question: Discuss the situation when r = -1, r = +1, r = -0.93 and r = 0.65.

r = -1 means there exists a perfect negative relationship between the two variables X and Y.

r = +1 means there exists a perfect positive relationship between them.

r = -0.93 indicates that there exists a strong negative relationship between X and Y.

r = 0.65 means there exists a moderate positive relationship between them.
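
This interpretation can be mechanized. A small sketch, assuming the 0.25 / 0.75 cut-offs from the scale above (the function name describe_r is my own choice):

```python
def describe_r(r):
    """Map r in [-1, 1] to a verbal label using the 0.25 / 0.75 cut-offs."""
    if not -1.0 <= r <= 1.0:
        raise ValueError("r must lie between -1 and +1")
    if r == 0:
        return "no relation"
    direction = "positive" if r > 0 else "negative"
    size = abs(r)
    if size == 1:
        strength = "perfect"
    elif size >= 0.75:
        strength = "strong"
    elif size >= 0.25:
        strength = "intermediate"
    else:
        strength = "weak"
    return f"{strength} {direction}"
```

For the example question: describe_r(-1) gives "perfect negative", describe_r(-0.93) gives "strong negative", and describe_r(0.65) gives "intermediate positive".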

Theorem: Prove that the correlation coefficient lies between -1 and +1.

Proof: Let $X = x_1, x_2, x_3, \ldots, x_n$ and $Y = y_1, y_2, y_3, \ldots, y_n$ be two random variables having means $\bar{X}$ and $\bar{Y}$. Their corresponding variances are $\sigma_X^2$ and $\sigma_Y^2$.

Now by the definition of variance we can write,

$$V(X) = \sigma_X^2 = \frac{1}{n}\sum_{i=1}^{n}(X_i-\bar{X})^2 \;\Rightarrow\; n\sigma_X^2 = \sum_{i=1}^{n}(X_i-\bar{X})^2$$

And

$$V(Y) = \sigma_Y^2 = \frac{1}{n}\sum_{i=1}^{n}(Y_i-\bar{Y})^2 \;\Rightarrow\; n\sigma_Y^2 = \sum_{i=1}^{n}(Y_i-\bar{Y})^2$$

By the definition of the correlation coefficient we can write,

$$r = \frac{Cov(X, Y)}{\sqrt{V(X)\,V(Y)}} \;\Rightarrow\; r = \frac{\frac{1}{n}\sum_{i=1}^{n}(X_i-\bar{X})(Y_i-\bar{Y})}{\sqrt{\sigma_X^2\,\sigma_Y^2}}$$

$$\therefore\; nr\sigma_X\sigma_Y = \sum_{i=1}^{n}(X_i-\bar{X})(Y_i-\bar{Y}) \quad ------(i)$$
Let us consider the following expression, which, being a sum of squares, is always non-negative:

$$\sum_{i=1}^{n}\left(\frac{X_i-\bar{X}}{\sigma_X} \pm \frac{Y_i-\bar{Y}}{\sigma_Y}\right)^2 \ge 0$$

Expanding the square and taking the sum we get,

$$\sum_{i=1}^{n}\left(\frac{X_i-\bar{X}}{\sigma_X}\right)^2 + \sum_{i=1}^{n}\left(\frac{Y_i-\bar{Y}}{\sigma_Y}\right)^2 \pm 2\sum_{i=1}^{n}\left(\frac{X_i-\bar{X}}{\sigma_X}\right)\left(\frac{Y_i-\bar{Y}}{\sigma_Y}\right) \ge 0$$

$$\Rightarrow \frac{1}{\sigma_X^2}\sum_{i=1}^{n}(X_i-\bar{X})^2 + \frac{1}{\sigma_Y^2}\sum_{i=1}^{n}(Y_i-\bar{Y})^2 \pm 2\cdot\frac{1}{\sigma_X\sigma_Y}\sum_{i=1}^{n}(X_i-\bar{X})(Y_i-\bar{Y}) \ge 0$$

Substituting $n\sigma_X^2$, $n\sigma_Y^2$ and equation (i),

$$\Rightarrow \frac{1}{\sigma_X^2}\,n\sigma_X^2 + \frac{1}{\sigma_Y^2}\,n\sigma_Y^2 \pm 2\cdot\frac{1}{\sigma_X\sigma_Y}\,nr\sigma_X\sigma_Y \ge 0$$

$$\Rightarrow 2n \pm 2rn \ge 0$$

$$\Rightarrow 2n(1 \pm r) \ge 0$$

$$\therefore\; 1 \pm r \ge 0$$

Considering the positive sign,

$$1 + r \ge 0 \;\Rightarrow\; r \ge -1 \;\therefore\; -1 \le r \quad -------(iii)$$

Considering the negative sign,

$$1 - r \ge 0 \;\Rightarrow\; r \le 1 \;\therefore\; r \le 1 \quad -------(iv)$$

From equations (iii) and (iv) we can write,

$$-1 \le r \le +1$$

So the correlation coefficient lies between -1 and +1.

(Proved)
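
The bound can also be checked empirically: for arbitrary random samples, the computed r never leaves [-1, 1]. A quick sketch (sample size, number of trials and seed are arbitrary choices):

```python
import random

def pearson_r(x, y):
    """Pearson correlation; the 1/n factors cancel, so plain sums suffice."""
    n = len(x)
    x_bar, y_bar = sum(x) / n, sum(y) / n
    cov = sum((a - x_bar) * (b - y_bar) for a, b in zip(x, y))
    v_x = sum((a - x_bar) ** 2 for a in x)
    v_y = sum((b - y_bar) ** 2 for b in y)
    return cov / (v_x * v_y) ** 0.5

random.seed(0)
rs = []
for _ in range(500):
    x = [random.gauss(0, 1) for _ in range(20)]
    y = [random.gauss(0, 1) for _ in range(20)]
    rs.append(pearson_r(x, y))
all_in_bounds = all(-1.0 <= r <= 1.0 for r in rs)
```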

Theorem: Show that the coefficient of correlation is independent of the change of origin and scale of measurement.

Proof: Let X and Y be two random variables. Then the correlation coefficient between X and Y is defined as,

$$r_{XY} = \frac{Cov(X, Y)}{\sqrt{V(X)\,V(Y)}} = \frac{\sum_{i=1}^{n}(X_i-\bar{X})(Y_i-\bar{Y})}{\sqrt{\sum_{i=1}^{n}(X_i-\bar{X})^2\,\sum_{i=1}^{n}(Y_i-\bar{Y})^2}} \quad ------(i)$$

Let us suppose,

$$U_i = \frac{X_i - A}{h} \quad \text{and} \quad V_i = \frac{Y_i - B}{k}$$

That means we have shifted the origin of X to A and the origin of Y to B, and changed the scale of X by h and the scale of Y by k.

Now,

$$U_i = \frac{X_i - A}{h} \;\Rightarrow\; X_i = A + hU_i$$

Summing over i and dividing by n,

$$\frac{1}{n}\sum_{i=1}^{n} X_i = \frac{1}{n}\sum_{i=1}^{n} A + h\cdot\frac{1}{n}\sum_{i=1}^{n} U_i \;\therefore\; \bar{X} = A + h\bar{U}$$

And

$$V_i = \frac{Y_i - B}{k} \;\Rightarrow\; Y_i = B + kV_i$$

$$\frac{1}{n}\sum_{i=1}^{n} Y_i = \frac{1}{n}\sum_{i=1}^{n} B + k\cdot\frac{1}{n}\sum_{i=1}^{n} V_i \;\therefore\; \bar{Y} = B + k\bar{V}$$

Putting these values in equation (i) we get,

$$r_{XY} = \frac{\sum_{i=1}^{n}(A + hU_i - A - h\bar{U})(B + kV_i - B - k\bar{V})}{\sqrt{\sum_{i=1}^{n}(A + hU_i - A - h\bar{U})^2\,\sum_{i=1}^{n}(B + kV_i - B - k\bar{V})^2}}$$

$$= \frac{hk\sum_{i=1}^{n}(U_i-\bar{U})(V_i-\bar{V})}{\sqrt{h^2\sum_{i=1}^{n}(U_i-\bar{U})^2 \cdot k^2\sum_{i=1}^{n}(V_i-\bar{V})^2}}$$

$$= \frac{hk\sum_{i=1}^{n}(U_i-\bar{U})(V_i-\bar{V})}{hk\sqrt{\sum_{i=1}^{n}(U_i-\bar{U})^2\,\sum_{i=1}^{n}(V_i-\bar{V})^2}} \quad \text{[taking } h, k > 0\text{]}$$

$$= r_{UV}$$

$$\therefore\; r_{XY} = r_{UV}$$

So the correlation coefficient is independent of the change of origin and scale of measurement.
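
This invariance is easy to verify numerically. In the sketch below the shift and scale constants A = 10, h = 2, B = -3, k = 5 are arbitrary choices (both scales positive, as the proof's last step assumes):

```python
def pearson_r(x, y):
    """Pearson correlation from sums of deviations."""
    n = len(x)
    x_bar, y_bar = sum(x) / n, sum(y) / n
    cov = sum((a - x_bar) * (b - y_bar) for a, b in zip(x, y))
    return cov / (sum((a - x_bar) ** 2 for a in x)
                  * sum((b - y_bar) ** 2 for b in y)) ** 0.5

x = [2.0, 4.0, 5.0, 7.0, 9.0]
y = [1.0, 3.0, 4.0, 6.0, 9.0]
# U = (X - A)/h and V = (Y - B)/k with A = 10, h = 2, B = -3, k = 5
u = [(xi - 10.0) / 2.0 for xi in x]
v = [(yi + 3.0) / 5.0 for yi in y]
r_xy = pearson_r(x, y)
r_uv = pearson_r(u, v)   # equal to r_xy up to rounding
```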
Theorem: Prove that the correlation coefficient is the geometric mean of the two regression coefficients.

Proof: Let X and Y be two variables. Then the regression coefficient of Y on X is defined as,

$$b_{Y|X} = \frac{\sum_{i=1}^{n}(X_i-\bar{X})(Y_i-\bar{Y})}{\sum_{i=1}^{n}(X_i-\bar{X})^2} \quad ---(i)$$

The regression coefficient of X on Y is defined as,

$$b_{X|Y} = \frac{\sum_{i=1}^{n}(X_i-\bar{X})(Y_i-\bar{Y})}{\sum_{i=1}^{n}(Y_i-\bar{Y})^2} \quad ---(ii)$$

By the definition of the correlation coefficient we know that,

$$r_{XY} = \frac{\sum_{i=1}^{n}(X_i-\bar{X})(Y_i-\bar{Y})}{\sqrt{\sum_{i=1}^{n}(X_i-\bar{X})^2\,\sum_{i=1}^{n}(Y_i-\bar{Y})^2}}$$

Multiplying equations (i) and (ii) we get,

$$b_{Y|X} \times b_{X|Y} = \frac{\sum_{i=1}^{n}(X_i-\bar{X})(Y_i-\bar{Y})}{\sum_{i=1}^{n}(X_i-\bar{X})^2} \times \frac{\sum_{i=1}^{n}(X_i-\bar{X})(Y_i-\bar{Y})}{\sum_{i=1}^{n}(Y_i-\bar{Y})^2}$$

$$\Rightarrow b_{Y|X} \times b_{X|Y} = \frac{\left\{\sum_{i=1}^{n}(X_i-\bar{X})(Y_i-\bar{Y})\right\}^2}{\sum_{i=1}^{n}(X_i-\bar{X})^2\,\sum_{i=1}^{n}(Y_i-\bar{Y})^2}$$

$$\Rightarrow \sqrt{b_{Y|X} \times b_{X|Y}} = \frac{\sum_{i=1}^{n}(X_i-\bar{X})(Y_i-\bar{Y})}{\sqrt{\sum_{i=1}^{n}(X_i-\bar{X})^2\,\sum_{i=1}^{n}(Y_i-\bar{Y})^2}}$$

$$\Rightarrow \sqrt{b_{Y|X} \times b_{X|Y}} = r_{XY}$$

$$\therefore\; r_{XY} = \sqrt{b_{Y|X} \times b_{X|Y}}$$

∴ The correlation coefficient is the geometric mean of the two regression coefficients.

(Showed)
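
A numerical check of the identity, using small made-up data (the helper name s_xy and the variable names b_yx, b_xy are my own):

```python
def mean(v):
    return sum(v) / len(v)

def s_xy(x, y):
    """Sum of products of deviations -- the shared numerator above."""
    mx, my = mean(x), mean(y)
    return sum((a - mx) * (b - my) for a, b in zip(x, y))

x = [1.0, 2.0, 4.0, 5.0, 7.0]
y = [2.0, 1.5, 3.0, 4.5, 5.0]
b_yx = s_xy(x, y) / s_xy(x, x)   # regression coefficient of Y on X
b_xy = s_xy(x, y) / s_xy(y, y)   # regression coefficient of X on Y
r = s_xy(x, y) / (s_xy(x, x) * s_xy(y, y)) ** 0.5
```

Here r is positive, so r equals the positive square root of the product; when both regression coefficients are negative, r is the negative root, with the sign taken from the covariance.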

Problem: aX + bY + c = 0 is an equation. Prove that the correlation coefficient between X and Y is -1 if the signs of a and b are alike, and +1 if they are different.

Proof: The given equation is,

$$aX + bY + c = 0 \quad -------(i)$$

$$aE(X) + bE(Y) + c = 0 \quad ---(ii) \quad \text{[taking expectation on both sides]}$$

Subtracting (ii) from (i) we get,

$$aX + bY - aE(X) - bE(Y) = 0$$

$$\Rightarrow a\{X - E(X)\} + b\{Y - E(Y)\} = 0$$

$$\therefore\; \{X - E(X)\} = -\frac{b}{a}\{Y - E(Y)\} \quad -----(iii)$$

By the definition of covariance we know,

$$Cov(X, Y) = E[\{X - E(X)\}\{Y - E(Y)\}] = E\left[-\frac{b}{a}\{Y - E(Y)\}\{Y - E(Y)\}\right] = -\frac{b}{a}E[\{Y - E(Y)\}^2] = -\frac{b}{a}\sigma_Y^2$$

Squaring both sides of equation (iii) and taking expectation we get,

$$E[\{X - E(X)\}^2] = \frac{b^2}{a^2}E[\{Y - E(Y)\}^2] \;\Rightarrow\; V(X) = \frac{b^2}{a^2}\sigma_Y^2$$

By the definition of correlation we get,

$$r = \frac{Cov(X, Y)}{\sqrt{V(X)\,V(Y)}} = \frac{-\frac{b}{a}\sigma_Y^2}{\sqrt{\frac{b^2}{a^2}\sigma_Y^2 \cdot \sigma_Y^2}} = \frac{-\frac{b}{a}\sigma_Y^2}{\left|\frac{b}{a}\right|\sigma_Y^2} = \frac{-\frac{b}{a}}{\left|\frac{b}{a}\right|}$$
When a and b have opposite signs, $\frac{b}{a} < 0$, so $\left|\frac{b}{a}\right| = -\frac{b}{a}$ and

$$r = \frac{-\frac{b}{a}}{-\frac{b}{a}} = +1 \;\therefore\; r = +1$$

When a and b have the same sign, $\frac{b}{a} > 0$, so $\left|\frac{b}{a}\right| = \frac{b}{a}$ and

$$r = \frac{-\frac{b}{a}}{\frac{b}{a}} = -1 \;\therefore\; r = -1$$

So the correlation coefficient between X and Y is -1 if signs of a and b are alike and
+1 if they are different.

(Proved)
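
A quick numerical confirmation. The coefficients below are arbitrary examples: with a = 2, b = 3 (same sign) every point lies on a line of negative slope, so r = -1; with a = 2, b = -3 (different signs) the slope is positive and r = +1.

```python
def pearson_r(x, y):
    """Pearson correlation from sums of deviations."""
    n = len(x)
    x_bar, y_bar = sum(x) / n, sum(y) / n
    cov = sum((a - x_bar) * (b - y_bar) for a, b in zip(x, y))
    return cov / (sum((a - x_bar) ** 2 for a in x)
                  * sum((b - y_bar) ** 2 for b in y)) ** 0.5

x = [1.0, 2.0, 3.0, 4.0, 5.0]
# aX + bY + c = 0  =>  Y = -(aX + c)/b
y_same = [-(2.0 * xi + 1.0) / 3.0 for xi in x]    # a, b same sign  -> r = -1
y_diff = [-(2.0 * xi + 1.0) / -3.0 for xi in x]   # a, b different  -> r = +1
```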

Problem: If X and Y are uncorrelated, find the correlation coefficient between (X+Y) and (X-Y).

Solution: Let,

$$U = X + Y \;\therefore\; \bar{U} = \bar{X} + \bar{Y} \quad \text{[taking the sum and dividing by n]}$$

$$V = X - Y \;\therefore\; \bar{V} = \bar{X} - \bar{Y} \quad \text{[taking the sum and dividing by n]}$$

By the definition of correlation we know that,

$$r_{UV} = \frac{Cov(U, V)}{\sqrt{V(U)\,V(V)}} \quad -----(i)$$
Let,

$$V(X) = \sigma_X^2 \quad \text{and} \quad V(Y) = \sigma_Y^2$$

Now,

$$V(U) = V(X+Y) = V(X) + V(Y) + 2Cov(X, Y) = \sigma_X^2 + \sigma_Y^2 + 0 = \sigma_X^2 + \sigma_Y^2 \quad \text{[since X and Y are uncorrelated]}$$

And

$$V(V) = V(X-Y) = V(X) + V(Y) - 2Cov(X, Y) = \sigma_X^2 + \sigma_Y^2 - 0 = \sigma_X^2 + \sigma_Y^2$$

Again,

$$Cov(U, V) = E[(U - \bar{U})(V - \bar{V})] = E[(X + Y - \bar{X} - \bar{Y})(X - Y - \bar{X} + \bar{Y})]$$

$$= E[\{(X - \bar{X}) + (Y - \bar{Y})\}\{(X - \bar{X}) - (Y - \bar{Y})\}]$$

$$= E[(X - \bar{X})^2] - E[(Y - \bar{Y})^2] = \sigma_X^2 - \sigma_Y^2$$

Now putting these values in equation (i) we get,

$$r_{UV} = \frac{\sigma_X^2 - \sigma_Y^2}{\sqrt{(\sigma_X^2 + \sigma_Y^2)(\sigma_X^2 + \sigma_Y^2)}} = \frac{\sigma_X^2 - \sigma_Y^2}{\sqrt{(\sigma_X^2 + \sigma_Y^2)^2}}$$

$$\therefore\; r_{UV} = \frac{\sigma_X^2 - \sigma_Y^2}{\sigma_X^2 + \sigma_Y^2}$$

(Showed)
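
A simulation check of this formula. The standard deviations are assumed values for illustration: with σ_X = 2 and σ_Y = 1 the predicted correlation is (4 - 1)/(4 + 1) = 0.6.

```python
import random

def pearson_r(x, y):
    """Pearson correlation from sums of deviations."""
    n = len(x)
    x_bar, y_bar = sum(x) / n, sum(y) / n
    cov = sum((a - x_bar) * (b - y_bar) for a, b in zip(x, y))
    return cov / (sum((a - x_bar) ** 2 for a in x)
                  * sum((b - y_bar) ** 2 for b in y)) ** 0.5

random.seed(7)
n = 50_000
x = [random.gauss(0, 2.0) for _ in range(n)]   # sigma_X = 2, so V(X) = 4
y = [random.gauss(0, 1.0) for _ in range(n)]   # sigma_Y = 1, so V(Y) = 1
u = [a + b for a, b in zip(x, y)]              # U = X + Y
v = [a - b for a, b in zip(x, y)]              # V = X - Y
r = pearson_r(u, v)   # close to (4 - 1)/(4 + 1) = 0.6
```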

Problem: If $X_1$, $X_2$ and $X_3$ are three uncorrelated variables with equal variance $\sigma^2$, show that the correlation coefficient between $X_1 + X_2$ and $X_2 + X_3$ is $\frac{1}{2}$.

Solution: Let the variance of the variables be,

$$V(X_1) = V(X_2) = V(X_3) = \sigma^2$$

As the variables are uncorrelated,

$$Cov(X_1, X_2) = Cov(X_2, X_3) = Cov(X_1, X_3) = 0$$

Let,

$$U = X_1 + X_2 \;\therefore\; \bar{U} = \bar{X}_1 + \bar{X}_2 \quad \text{[taking the sum and dividing by n]}$$

$$V = X_2 + X_3 \;\therefore\; \bar{V} = \bar{X}_2 + \bar{X}_3 \quad \text{[taking the sum and dividing by n]}$$

By the definition of correlation we know that,

$$r_{UV} = \frac{Cov(U, V)}{\sqrt{V(U)\,V(V)}} \quad -----(i)$$
We know,

$$Cov(U, V) = E[(U - \bar{U})(V - \bar{V})] = E[(X_1 + X_2 - \bar{X}_1 - \bar{X}_2)(X_2 + X_3 - \bar{X}_2 - \bar{X}_3)]$$

$$= E[\{(X_1 - \bar{X}_1) + (X_2 - \bar{X}_2)\}\{(X_2 - \bar{X}_2) + (X_3 - \bar{X}_3)\}]$$

$$= Cov(X_1, X_2) + V(X_2) + Cov(X_1, X_3) + Cov(X_2, X_3)$$

$$= 0 + \sigma^2 + 0 + 0 = \sigma^2$$

Again,

$$V(U) = E[(U - \bar{U})^2] = E[(X_1 + X_2 - \bar{X}_1 - \bar{X}_2)^2] = V(X_1) + V(X_2) + 2Cov(X_1, X_2) = \sigma^2 + \sigma^2 + 0 = 2\sigma^2$$

Similarly,

$$V(V) = V(X_2) + V(X_3) + 2Cov(X_2, X_3) = 2\sigma^2$$

Putting these values in equation (i) we get,

$$r_{UV} = \frac{\sigma^2}{\sqrt{2\sigma^2 \cdot 2\sigma^2}} = \frac{\sigma^2}{2\sigma^2} = \frac{1}{2}$$

(Showed)
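
A simulation sketch of this last result (sample size and seed are arbitrary): three independent standard-normal samples stand in for the uncorrelated variables, and the sample correlation of X1 + X2 with X2 + X3 lands near 1/2.

```python
import random

def pearson_r(x, y):
    """Pearson correlation from sums of deviations."""
    n = len(x)
    x_bar, y_bar = sum(x) / n, sum(y) / n
    cov = sum((a - x_bar) * (b - y_bar) for a, b in zip(x, y))
    return cov / (sum((a - x_bar) ** 2 for a in x)
                  * sum((b - y_bar) ** 2 for b in y)) ** 0.5

random.seed(42)
n = 50_000
x1 = [random.gauss(0, 1) for _ in range(n)]
x2 = [random.gauss(0, 1) for _ in range(n)]
x3 = [random.gauss(0, 1) for _ in range(n)]
u = [a + b for a, b in zip(x1, x2)]   # X1 + X2
v = [b + c for b, c in zip(x2, x3)]   # X2 + X3
r = pearson_r(u, v)   # close to 0.5
```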
