Lec - 00 7 - The Correlation
The Correlation
By: Dr. Firas M. AlFiky
2021-2022 Lec_007
Correlation
• Finding the relationship between two quantitative variables, without being able to infer causal relationships.
• Correlation is a statistical technique used to determine the degree to which two variables are related.
Scatter diagram
❖ Rectangular coordinate.
❖ Two quantitative variables.
❖ The horizontal axis is (X) and the vertical axis is (Y).
❖ Points are not joined.
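❖ No frequency table.
[Figure: scatter diagram with X on the horizontal axis and Y on the vertical axis; the points are plotted but not joined.]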
Example
Wt. (kg)      67   69   85   83   74   81   97   92  114   85
SBP (mmHg)   120  125  140  160  130  180  150  140  200  130
➢ Positive relationship
➢ Negative relationship
➢ No relationship
Positive relationship
[Figure: scatter diagram of Height in cm (vertical axis) against Age in weeks (horizontal axis).]
Negative relationship
[Figure: scatter diagram of Reliability (vertical axis) against Age of Car (horizontal axis).]
No relation
Correlation Coefficient
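A statistic showing the sign and the degree of the relation between two variables.
➢ A positive sign indicates a direct relation (an increase in one variable is associated with an increase in the other variable).
➢ A negative sign indicates an indirect relation (an increase in one variable is associated with a decrease in the other).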
➢ The value of r ranges between (-1) and (+1).
[Figure: scale of r with tick marks at -1, -0.75, -0.25, 0, 0.25, 0.75 and 1; r = -1 is perfect indirect correlation, r = 0 is no relation, r = +1 is perfect direct correlation.]
If r = 0, there is no association or correlation between the two variables.
If r = ±1, there is perfect correlation.
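The reading of the sign and size of r can be sketched as a small helper. This is only an illustration: the function name `describe_r` is hypothetical, and the cut-offs 0.25 and 0.75 are assumptions taken from the tick marks of the scale, not rules stated in the lecture.

```python
def describe_r(r, weak_below=0.25, strong_from=0.75):
    """Return a verbal label for a correlation coefficient r in [-1, 1].

    The default cut-offs (0.25 and 0.75) are assumptions read off the
    tick marks of the scale; adjust them if your course uses other limits.
    """
    if not -1.0 <= r <= 1.0:
        raise ValueError("r must lie between -1 and +1")
    if r == 0:
        return "no relation"
    direction = "direct" if r > 0 else "indirect"
    size = abs(r)
    if size == 1:
        strength = "perfect"
    elif size >= strong_from:
        strength = "strong"
    elif size >= weak_below:
        strength = "intermediate"
    else:
        strength = "weak"
    return f"{direction} {strength} correlation"


print(describe_r(0.7398))
print(describe_r(-0.14))
```

With these assumed cut-offs, the two calls give "direct intermediate correlation" and "indirect weak correlation", which matches the wording used for those values in the lecture's worked examples.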
How to compute the simple correlation coefficient (r)
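$$
r = \frac{\sum xy - \dfrac{\sum x \sum y}{n}}{\sqrt{\left(\sum x^2 - \dfrac{(\sum x)^2}{n}\right) \cdot \left(\sum y^2 - \dfrac{(\sum y)^2}{n}\right)}}
$$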
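The formula above translates almost term by term into code. The following is a minimal sketch, not a reference implementation; the function name `pearson_r` and the small data set are made up for illustration and do not come from the lecture.

```python
from math import sqrt


def pearson_r(x, y):
    """Simple correlation coefficient from the five column totals."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("x and y must have the same length (at least 2)")
    n = len(x)
    sum_x, sum_y = sum(x), sum(y)
    sum_xy = sum(a * b for a, b in zip(x, y))
    sum_x2 = sum(a * a for a in x)
    sum_y2 = sum(b * b for b in y)
    numerator = sum_xy - sum_x * sum_y / n
    denominator = sqrt((sum_x2 - sum_x ** 2 / n) * (sum_y2 - sum_y ** 2 / n))
    if denominator == 0:
        raise ValueError("r is undefined when one variable is constant")
    return numerator / denominator


# Made-up data: y moves exactly in step with x, then exactly against it.
x = [1, 2, 3, 4]
print(pearson_r(x, [2, 4, 6, 8]))   # perfect direct relation
print(pearson_r(x, [8, 6, 4, 2]))   # perfect indirect relation
```

The two made-up series sit at the two ends of the range of r: the first gives +1 and the second gives -1.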
Example:
A sample of 6 children was selected, and their age in years and weight in kilograms were recorded as shown in the following table. Find the correlation between age and weight.
serial No Age (years) Weight (Kg)
1 7 12
2 6 8
3 8 12
4 5 10
5 6 11
6 9 13
Both variables are quantitative. One variable (age) is called the independent variable and is denoted (X); the other (weight) is called the dependent variable and is denoted (Y). To find the relation between age and weight, compute the simple correlation coefficient using the following formula:
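$$
r = \frac{\sum xy - \dfrac{\sum x \sum y}{n}}{\sqrt{\left(\sum x^2 - \dfrac{(\sum x)^2}{n}\right) \cdot \left(\sum y^2 - \dfrac{(\sum y)^2}{n}\right)}}
$$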
Serial   Age (years) (x)   Weight (kg) (y)    xy    x²    y²
1        7                 12                 84    49    144
2        6                 8                  48    36    64
3        8                 12                 96    64    144
4        5                 10                 50    25    100
5        6                 11                 66    36    121
6        9                 13                 117   81    169
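∑x = 41   ∑y = 66   ∑xy = 461   ∑x² = 291   ∑y² = 742
$$
r = \frac{461 - \dfrac{41 \times 66}{6}}{\sqrt{\left(291 - \dfrac{(41)^2}{6}\right) \cdot \left(742 - \dfrac{(66)^2}{6}\right)}} = \frac{10}{\sqrt{10.83 \times 16}} \approx 0.76
$$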
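The column totals for the children example can be checked with a short script. This is a sketch that simply re-adds the columns of the table and substitutes them into the formula; the variable names are illustrative.

```python
from math import sqrt

age = [7, 6, 8, 5, 6, 9]             # x, in years
weight = [12, 8, 12, 10, 11, 13]     # y, in kg
n = len(age)

# Column totals, matching the xy, x² and y² columns of the table.
sum_x = sum(age)
sum_y = sum(weight)
sum_xy = sum(a * w for a, w in zip(age, weight))
sum_x2 = sum(a * a for a in age)
sum_y2 = sum(w * w for w in weight)
print(sum_x, sum_y, sum_xy, sum_x2, sum_y2)   # 41 66 461 291 742

# Substitute the totals into the computational formula for r.
numerator = sum_xy - sum_x * sum_y / n
denominator = sqrt((sum_x2 - sum_x ** 2 / n) * (sum_y2 - sum_y ** 2 / n))
r = numerator / denominator
print(round(r, 2))
```

The totals 291 and 742 are the two values that appear in the substitution step, and r comes out at roughly 0.76.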
x     y     x²     y²     xy
10    2     100    4      20
8     3     64     9      24
2     9     4      81     18
1     7     1      49     7
5     6     25     36     30
6     5     36     25     30
∑x = 32   ∑y = 32   ∑x² = 230   ∑y² = 204   ∑xy = 129
Calculating Correlation Coefficient
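$$
r = \frac{\sum xy - \dfrac{\sum x \sum y}{n}}{\sqrt{\left(\sum x^2 - \dfrac{(\sum x)^2}{n}\right) \cdot \left(\sum y^2 - \dfrac{(\sum y)^2}{n}\right)}}
$$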
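For the six pairs in the table above, the formula built from column totals can be compared with the equivalent definition of r built from deviations about the two means. The deviation form is not shown in the lecture; it is included here only as a cross-check, and the variable names are illustrative.

```python
from math import sqrt

x = [10, 8, 2, 1, 5, 6]
y = [2, 3, 9, 7, 6, 5]
n = len(x)

# Computational form: built from the column totals of the table.
sum_x, sum_y = sum(x), sum(y)
sum_x2 = sum(a * a for a in x)
sum_y2 = sum(b * b for b in y)
sum_xy = sum(a * b for a, b in zip(x, y))
r_totals = (sum_xy - sum_x * sum_y / n) / sqrt(
    (sum_x2 - sum_x ** 2 / n) * (sum_y2 - sum_y ** 2 / n)
)

# Deviation form: built from distances to the two means.
mean_x, mean_y = sum_x / n, sum_y / n
s_xy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
s_xx = sum((a - mean_x) ** 2 for a in x)
s_yy = sum((b - mean_y) ** 2 for b in y)
r_deviations = s_xy / sqrt(s_xx * s_yy)

print(sum_x, sum_y, sum_x2, sum_y2, sum_xy)   # 32 32 230 204 129
print(round(r_totals, 3), round(r_deviations, 3))
```

Both forms agree up to floating-point rounding and give an r close to -0.94. The negative sign reflects that y falls as x rises in this data set.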
Wt. (kg)      67   69   85   83   74   81   97   92  114   85
SBP (mmHg)   120  125  140  160  130  180  150  140  200  130
Answer:
No.   Wt. (kg) (x)   SBP (mmHg) (y)   x²       y²       xy
1     67             120              4489     14400    8040
2     69             125              4761     15625    8625
3     85             140              7225     19600    11900
4     83             160              6889     25600    13280
5     74             130              5476     16900    9620
6     81             180              6561     32400    14580
7     97             150              9409     22500    14550
8     92             140              8464     19600    12880
9     114            200              12996    40000    22800
10    85             130              7225     16900    11050
∑x = 847   ∑y = 1475   ∑x² = 73495   ∑y² = 223525   ∑xy = 127325
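Answer: Calculating Correlation Coefficient
n = 10
$$
r = \frac{\sum xy - \dfrac{\sum x \sum y}{n}}{\sqrt{\left(\sum x^2 - \dfrac{(\sum x)^2}{n}\right) \cdot \left(\sum y^2 - \dfrac{(\sum y)^2}{n}\right)}}
$$
$$
r = \frac{2392.5}{\sqrt{1754.1 \times 5962.5}} = \frac{2392.5}{3234} = 0.7398
$$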
Direct intermediate correlation
Example
The scores of 9 students in statistics and physics are as follows. Find the relationship between them by computing Pearson's correlation coefficient (r).
Student   Statistics   Physics
1         35           65
2         55           70
3         45           40
4         50           70
5         25           45
6         45           50
7         29           52
8         52           46
9         45           58
Answer:
Student   Physics (x)   Statistics (y)   x²      y²      xy
1         65            35               4225    1225    2275
2         70            55               4900    3025    3850
3         40            45               1600    2025    1800
4         70            50               4900    2500    3500
5         45            25               2025    625     1125
6         50            45               2500    2025    2250
7         52            29               2704    841     1508
8         46            52               2116    2704    2392
9         58            45               3364    2025    2610
∑x = 496   ∑y = 381   ∑x² = 28334   ∑y² = 16995   ∑xy = 21310
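Answer: Calculating Correlation Coefficient
n = 9
$$
r = \frac{\sum xy - \dfrac{\sum x \sum y}{n}}{\sqrt{\left(\sum x^2 - \dfrac{(\sum x)^2}{n}\right) \cdot \left(\sum y^2 - \dfrac{(\sum y)^2}{n}\right)}}
$$
$$
r = \frac{21310 - \dfrac{496 \times 381}{9}}{\sqrt{\left(28334 - \dfrac{(496)^2}{9}\right) \cdot \left(16995 - \dfrac{(381)^2}{9}\right)}} = \frac{312.66}{\sqrt{998.888 \times 866}} = \frac{312.66}{930} = 0.336
$$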
Direct intermediate correlation
Spearman Rank Correlation Coefficient (rs)
It is a non-parametric measure of correlation.
This procedure makes use of the two sets of ranks that may be assigned
to the sample values of x and y.
$$
r_s = 1 - \frac{6 \sum d_i^2}{n(n^2 - 1)}
$$
A 67 69 84 83 74 81 97 92 114 85
B 120 125 145 160 130 180 150 140 200 135
Answer:
A (x)   B (y)   Rank x   Rank y   di = Rank x − Rank y   di²
67      120     10       10       0                      0
69      125     9        9        0                      0
84      145     5        5        0                      0
83      160     6        3        3                      9
74      130     8        8        0                      0
81      180     7        2        5                      25
97      150     2        4        -2                     4
92      140     3        6        -3                     9
114     200     1        1        0                      0
85      135     4        7        -3                     9
∑di² = 56
$$
r_s = 1 - \frac{6 \sum d_i^2}{n(n^2 - 1)}
$$
n = 10, then:
$$
r_s = 1 - \frac{6 \times 56}{10 (10^2 - 1)} = 1 - \frac{336}{10 \times 99} = 1 - \frac{336}{990} = 1 - 0.34 = 0.66
$$
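The ranking step and the formula used in this example can be sketched in a few lines. This sketch assumes there are no repeated values (true for series A and B), gives rank 1 to the largest value as in the table, and uses illustrative function names.

```python
def descending_ranks(values):
    """Rank 1 goes to the largest value; assumes no repeated values."""
    order = sorted(values, reverse=True)
    return [order.index(v) + 1 for v in values]


def spearman_rs(x, y):
    """Return (rs, sum of squared rank differences) for untied data."""
    rank_x = descending_ranks(x)
    rank_y = descending_ranks(y)
    n = len(x)
    sum_d2 = sum((rx - ry) ** 2 for rx, ry in zip(rank_x, rank_y))
    return 1 - 6 * sum_d2 / (n * (n ** 2 - 1)), sum_d2


a = [67, 69, 84, 83, 74, 81, 97, 92, 114, 85]
b = [120, 125, 145, 160, 130, 180, 150, 140, 200, 135]

rs, sum_d2 = spearman_rs(a, b)
print(descending_ranks(a))
print(descending_ranks(b))
print(sum_d2, round(rs, 2))   # 56 0.66
```

Ranking in ascending order instead would give the same result, because only the differences between the two sets of ranks enter the formula.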
Student   Statistics (x)   Physics (y)   Rank x   Rank y   di     di²
1         35               65            7        3        4      16
2         55               70            1        1.5      -0.5   0.25
3         45               40            5        9        -4     16
4         50               70            3        1.5      1.5    2.25
5         25               45            9        8        1      1
6         45               50            5        6        -1     1
7         29               52            8        5        3      9
8         52               46            2        7        -5     25
9         45               58            5        4        1      1
∑di² = 71.5
$$
r_s = 1 - \frac{6 \sum d_i^2}{n(n^2 - 1)}
$$
n = 9, then:
$$
r_s = 1 - \frac{6 \times 71.5}{9 (9^2 - 1)} = 1 - \frac{429}{9 \times 80} = 1 - \frac{429}{720} = 1 - 0.596 = 0.404
$$
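In this example some scores are repeated, so tied values share the average of the ranks they would otherwise occupy (three scores of 45 take ranks 4, 5 and 6 and each receive 5; two scores of 70 each receive 1.5). A minimal sketch of that averaging, with an illustrative function name, is:

```python
def average_ranks(values):
    """Descending ranks; tied values share the mean of their positions."""
    order = sorted(values, reverse=True)
    ranks = []
    for v in values:
        first = order.index(v) + 1      # first position held by this value
        count = order.count(v)          # number of times the value occurs
        ranks.append(first + (count - 1) / 2)
    return ranks


statistics_scores = [35, 55, 45, 50, 25, 45, 29, 52, 45]
physics_scores = [65, 70, 40, 70, 45, 50, 52, 46, 58]

rank_x = average_ranks(statistics_scores)
rank_y = average_ranks(physics_scores)
n = len(rank_x)
sum_d2 = sum((rx - ry) ** 2 for rx, ry in zip(rank_x, rank_y))
rs = 1 - 6 * sum_d2 / (n * (n ** 2 - 1))

print(rank_x)
print(rank_y)
print(sum_d2, round(rs, 3))   # 71.5 0.404
```

The ranks and the total of 71.5 match the table. Note that when ties are present this formula is commonly treated as an approximation.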
∑di² = 64
$$
r_s = 1 - \frac{6 \times 64}{7 \times 48} = -0.14
$$
Comment:
There is an indirect weak correlation between level
of education and income.