
Correlation

Correlation Coefficient
Statisticians use a measure called the correlation coefficient to determine the
strength of the linear relationship between two variables. The symbol for the
sample correlation coefficient is r.
The range of the correlation coefficient is from −1 to +1. If there is a strong
positive linear relationship between the variables, the value of r will be close to 1.
If there is a strong negative linear relationship between the variables, the value of r
will be close to −1. When there is no linear relationship between the variables or
only a weak relationship, the value of r will be close to 0.
$$r = \frac{n \sum xy - \sum x \sum y}{\sqrt{\left[n \sum x^2 - \left(\sum x\right)^2\right]\left[n \sum y^2 - \left(\sum y\right)^2\right]}}$$
Example:
Find a correlation coefficient for the data shown for car rental companies in the
United States for a recent year.
Company    Cars (in ten thousands)    Revenue (in billions of dollars)
A          63.0                       7.0
B          29.0                       3.9
C          20.8                       2.1
D          19.1                       2.8
E          13.4                       1.4
F          8.5                        1.5
Solution:
x (cars)    y (revenue)    xy        x^2        y^2
63.0        7.0            441       3969       49
29.0        3.9            113.1     841        15.21
20.8        2.1            43.68     432.64     4.41
19.1        2.8            53.48     364.81     7.84
13.4        1.4            18.76     179.56     1.96
8.5         1.5            12.75     72.25      2.25

$\sum x = 153.8$, $\sum y = 18.7$, $\sum xy = 682.77$, $\sum x^2 = 5859.26$, $\sum y^2 = 80.67$

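$$r = \frac{n \sum xy - \sum x \sum y}{\sqrt{\left[n \sum x^2 - \left(\sum x\right)^2\right]\left[n \sum y^2 - \left(\sum y\right)^2\right]}}$$

$$= \frac{6(682.77) - (153.8)(18.7)}{\sqrt{\left[6(5859.26) - (153.8)^2\right]\left[6(80.67) - (18.7)^2\right]}} = \frac{1220.56}{\sqrt{(11501.12)(134.33)}} \approx 0.982$$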

There is a strong positive linear relationship between x and y.
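
The hand calculation above can be checked with a few lines of Python. This is only a sketch of the sums formula for r applied to the six data pairs; the function name `correlation_coefficient` and the variable names are illustrative choices, not part of the original notes.

```python
import math

# Data from the example: cars (ten thousands) and revenue (billions of dollars)
cars = [63.0, 29.0, 20.8, 19.1, 13.4, 8.5]
revenue = [7.0, 3.9, 2.1, 2.8, 1.4, 1.5]


def correlation_coefficient(x, y):
    """Sample correlation coefficient r computed from the sums formula."""
    n = len(x)
    sum_x = sum(x)
    sum_y = sum(y)
    sum_xy = sum(xi * yi for xi, yi in zip(x, y))
    sum_x2 = sum(xi ** 2 for xi in x)
    sum_y2 = sum(yi ** 2 for yi in y)
    numerator = n * sum_xy - sum_x * sum_y
    denominator = math.sqrt(
        (n * sum_x2 - sum_x ** 2) * (n * sum_y2 - sum_y ** 2)
    )
    return numerator / denominator


r = correlation_coefficient(cars, revenue)
print(round(r, 3))  # 0.982
```

The value is close to +1, which agrees with the conclusion that the relationship is strong and positive.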


Regression Line Equation:
The equation of the regression line is written as
$$y' = a + bx$$
where
$$a = \frac{\sum y \sum x^2 - \sum x \sum xy}{n \sum x^2 - \left(\sum x\right)^2}$$
$$b = \frac{n \sum xy - \sum x \sum y}{n \sum x^2 - \left(\sum x\right)^2}$$
Example:
Find the equation of the regression line for the data shown for car rental companies
in the United States for a recent year.
Company    Cars (in ten thousands)    Revenue (in billions of dollars)
A          63.0                       7.0
B          29.0                       3.9
C          20.8                       2.1
D          19.1                       2.8
E          13.4                       1.4
F          8.5                        1.5
Then find the predicted revenue when the number of cars is 55 (in ten thousands).
Solution:
x (cars)    y (revenue)    xy        x^2        y^2
63.0        7.0            441       3969       49
29.0        3.9            113.1     841        15.21
20.8        2.1            43.68     432.64     4.41
19.1        2.8            53.48     364.81     7.84
13.4        1.4            18.76     179.56     1.96
8.5         1.5            12.75     72.25      2.25

$\sum x = 153.8$, $\sum y = 18.7$, $\sum xy = 682.77$, $\sum x^2 = 5859.26$, $\sum y^2 = 80.67$

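$$a = \frac{\sum y \sum x^2 - \sum x \sum xy}{n \sum x^2 - \left(\sum x\right)^2} = \frac{(18.7)(5859.26) - (153.8)(682.77)}{6(5859.26) - (153.8)^2} \approx 0.40$$

$$b = \frac{n \sum xy - \sum x \sum y}{n \sum x^2 - \left(\sum x\right)^2} = \frac{6(682.77) - (153.8)(18.7)}{6(5859.26) - (153.8)^2} \approx 0.11$$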
The regression equation is
$$y' = a + bx$$
$$y' = 0.40 + 0.11x$$
Substituting $x = 55$ gives $y' = 0.40 + 0.11(55) = 6.45$, that is, a predicted revenue of
about 6.45 billion dollars.
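
As a check on the arithmetic, the same coefficients can be computed in Python. The sketch below applies the formulas for a and b to the six data pairs; the function name `regression_coefficients` is an illustrative choice. It also shows how much the prediction depends on rounding: the notes round a and b to two decimals before substituting, while unrounded coefficients give a somewhat lower value.

```python
# Data from the example: cars (ten thousands) and revenue (billions of dollars)
cars = [63.0, 29.0, 20.8, 19.1, 13.4, 8.5]
revenue = [7.0, 3.9, 2.1, 2.8, 1.4, 1.5]


def regression_coefficients(x, y):
    """Return intercept a and slope b of y' = a + b*x from the sums formulas."""
    n = len(x)
    sum_x = sum(x)
    sum_y = sum(y)
    sum_xy = sum(xi * yi for xi, yi in zip(x, y))
    sum_x2 = sum(xi ** 2 for xi in x)
    denominator = n * sum_x2 - sum_x ** 2
    a = (sum_y * sum_x2 - sum_x * sum_xy) / denominator
    b = (n * sum_xy - sum_x * sum_y) / denominator
    return a, b


a, b = regression_coefficients(cars, revenue)

# Round to two decimals before predicting, as the worked solution does.
a_rounded, b_rounded = round(a, 2), round(b, 2)
prediction_rounded = a_rounded + b_rounded * 55

# Prediction with unrounded coefficients, for comparison.
prediction_unrounded = a + b * 55

print(a_rounded, b_rounded)            # 0.4 0.11
print(round(prediction_rounded, 2))    # 6.45
print(round(prediction_unrounded, 2))  # 6.23
```

Keeping three decimals (a ≈ 0.396, b ≈ 0.106) before substituting would bring the hand calculation closer to the unrounded result.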
Residual:
A residual is the difference between the actual value $y$ and the predicted value $y'$.
$$e = \text{actual value} - \text{predicted value} = y - y'$$
Example:
Find the residuals for the example above, and draw the regression line.
Solution:
We have already calculated $y' = 0.40 + 0.11x$.

x (cars)    y (revenue)    y' = 0.40 + 0.11x    e = y − y'
63.0        7.0            7.3                  −0.3
29.0        3.9            3.6                  0.3
20.8        2.1            2.7                  −0.6
19.1        2.8            2.5                  0.3
13.4        1.4            1.9                  −0.5
8.5         1.5            1.3                  0.2

$\sum x = 153.8$, $\sum y = 18.7$
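
The residual table can be reproduced with a short Python sketch. It assumes the rounded regression equation y' = 0.40 + 0.11x from the notes and rounds each result to one decimal place, as the table does; the helper name `predict` is illustrative.

```python
# Data from the example: cars (ten thousands) and revenue (billions of dollars)
cars = [63.0, 29.0, 20.8, 19.1, 13.4, 8.5]
revenue = [7.0, 3.9, 2.1, 2.8, 1.4, 1.5]


def predict(x):
    """Predicted revenue from the rounded regression equation y' = 0.40 + 0.11x."""
    return 0.40 + 0.11 * x


# Predicted values and residuals e = y - y', rounded to one decimal place.
predicted = [round(predict(x), 1) for x in cars]
residuals = [round(y - predict(x), 1) for x, y in zip(cars, revenue)]

for x, y, y_hat, e in zip(cars, revenue, predicted, residuals):
    print(f"{x:5.1f} {y:4.1f} {y_hat:4.1f} {e:5.1f}")
```

A negative residual means the line overestimates the actual revenue for that company, and a positive residual means it underestimates it.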

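Draw the regression line:

Because the regression equation describes a straight line, two points are enough to draw
it. For example, $x = 8.5$ gives $y' = 1.3$ and $x = 63.0$ gives $y' = 7.3$; plot these two
points and join them with a straight line.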
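
A minimal sketch of the two-point idea in Python: it evaluates the rounded equation y' = 0.40 + 0.11x at the smallest and largest x values in the data. Choosing these two particular x values is an assumption made here for illustration; any two distinct x values would define the same line.

```python
# Cars (in ten thousands) from the example
cars = [63.0, 29.0, 20.8, 19.1, 13.4, 8.5]


def predict(x):
    """Predicted revenue from the rounded regression equation y' = 0.40 + 0.11x."""
    return 0.40 + 0.11 * x


# Two points on the line: one at each end of the observed x range.
x_left, x_right = min(cars), max(cars)
endpoints = [(x_left, predict(x_left)), (x_right, predict(x_right))]

for x, y_hat in endpoints:
    print(f"x = {x:.1f}, y' = {y_hat:.1f}")
```

Joining these two points on a scatter plot of the data gives the regression line.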
