MRS.DIANA-CORRELATION ANALYSIS-NOTES
Simple correlation: The study of the relationship between only two variables.
When more than two variables are involved, we speak of multiple correlation.
Linear correlation: The correlation is linear when all the points on a scatter diagram
seem to lie near a straight line. If Y tends to increase as X increases, the correlation is
positive, and if Y tends to decrease as X increases, the correlation is negative. If the
points seem to lie near some curve, the correlation is non-linear.
If all the points lie on a straight line we have perfect linearity, while if all the points
lie on a curve we have perfect non-linearity between the two variables.
Examples: One variable might be the number of hunters in a region and the other
variable could be the deer population. Perhaps as the number of hunters increases, the
deer population decreases. This is an example of a negative correlation: as one
variable increases, the other decreases. A positive correlation is where the two
variables react in the same way, increasing or decreasing together. For example, an
increase in the amount of rain will be accompanied by an increase in the sales of umbrellas.
The correlation coefficient is used to measure the strength of the relationship between
two variables. Its value ranges from -1 to +1.
This means that any value outside this range is the result of an error in the correlation
calculation.
A correlation of 0.0 means no linear relationship between the movement of the two
variables.
The following correlation graphs show examples of different ranges of values for a
correlation coefficient:
[Figure: scatter plots showing positive correlation, negative correlation, and no correlation]
There are several types of correlation coefficients, Pearson's correlation (r) being
the most common.
It measures the strength and direction of the linear relationship between two
variables and cannot capture nonlinear relationships between them.
Pearson's product-moment correlation (r)
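A person can tell whether there is a correlation by how closely the data resemble a line.
If the points are scattered about, there may be no correlation. If the points would
closely fit a quadratic or exponential equation, etc., then they have a nonlinear
correlation. Pearson's correlation coefficient is computed from:
$$r = \frac{n\sum xy - \sum x \sum y}{\sqrt{\left[n\sum x^2 - \left(\sum x\right)^2\right]\left[n\sum y^2 - \left(\sum y\right)^2\right]}}$$
Where:
n = Number of pairs of values
Σx = Sum of the first values
Σy = Sum of the second values
Σxy = Sum of product of first and second value
Σx² = Sum of squares of the first values
Σy² = Sum of squares of the second values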
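The sums-based form of Pearson's r (n Σxy minus Σx Σy, divided by the square root of [n Σx² minus (Σx)²][n Σy² minus (Σy)²]) can be sketched in a few lines of Python. This is a minimal illustration and not part of the original notes: the function name pearson_r and the small data sets are hypothetical, chosen only to show r = +1, r = -1, and r = 0 for a purely curved relationship.

```python
from math import sqrt

def pearson_r(xs, ys):
    """Pearson's r computed from raw sums (hypothetical helper for illustration)."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two equal-length samples with at least 2 points")
    n = len(xs)
    sum_x, sum_y = sum(xs), sum(ys)
    sum_xy = sum(x * y for x, y in zip(xs, ys))
    sum_x2 = sum(x * x for x in xs)
    sum_y2 = sum(y * y for y in ys)
    numerator = n * sum_xy - sum_x * sum_y
    denominator = sqrt((n * sum_x2 - sum_x ** 2) * (n * sum_y2 - sum_y ** 2))
    if denominator == 0:
        raise ValueError("r is undefined when either variable is constant")
    return numerator / denominator

# Hypothetical data, not taken from the notes.
x = [-2, -1, 0, 1, 2]
r_linear_up = pearson_r(x, [2 * v + 1 for v in x])   # points on a rising line
r_linear_down = pearson_r(x, [-3 * v for v in x])    # points on a falling line
r_curve = pearson_r(x, [v * v for v in x])           # points on a symmetric parabola
print(r_linear_up, r_linear_down, r_curve)
```

The parabola case shows why r only measures linear association: Y is completely determined by X, yet the numerator n Σxy minus Σx Σy is zero.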
Example 01
The local ice cream shop keeps track of how much ice cream they sell versus the noon
temperature on that day. Here are their figures for the last 12 days:
a) Sketch the scatter plot
b) Find the correlation coefficient (r)
Solution
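b) Find the correlation coefficient (r)
Table columns: X (Temperature, °C), Y (Ice cream sales, $), X², Y², XY
From the formula for r, with n = 12, Σx = 224.1, Σy = 4829, Σx² = 4362.1 and Σxy = 95506.6
(Σy² is taken from the Y² column of the table):
$$r = \frac{12(95506.6) - (224.1)(4829)}{\sqrt{\left[12(4362.1) - (224.1)^2\right]\left[12\sum y^2 - (4829)^2\right]}} = 0.95737$$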
There is a very strong positive correlation between the temperature (°C) and ice cream
sales ($).
±1 Perfect correlation
0 No correlation
QUESTIONS
1. The following data refer to the proportion of households owning a television set
and social class index in ten different towns.
% with T.V's (X): 57 54 49 42 38 32 30 24 20 18
X 1 2 3 4 5 6 7 8
a) Plot a scatter diagram for this information and comment on its features.
b) Compute the Pearson correlation coefficient between X and Y.
Spearman's rank correlation is a measure of the relationship between two ordinal
variables that are related, but not necessarily linearly.
Spearman’s Rank correlation formula
Spearman's rank correlation coefficient quantifies the degree and direction of
association between two ranked variables. It measures the monotonicity of the
relationship between two variables, that is, how well a monotonic function can represent
that relationship.
$$\rho = 1 - \frac{6\sum d_i^2}{n(n^2-1)}$$
Where:
d_i = Difference between the two ranks of the i-th observation
n = Number of observations
The Spearman rank correlation coefficient can be anywhere between -1 and +1, such
that -1 ≤ ρ ≤ +1
Example 01
Physics 35 23 47 17 10 43 9 6 28
Mathematics 30 33 45 23 8 49 12 4 31
Calculate the students' ranks in the two subjects and compute the Spearman rank
correlation.
Solution
First, find the rank for each individual subject. Assign the rank 1 to the highest score, 2
to the next highest and so on. Thus we have
Physics   Rank   Mathematics   Rank   Difference between ranks (d)   d²
35        3      30            5      -2                             4
23        5      33            3       2                             4
47        1      45            2      -1                             1
17        6      23            6       0                             0
10        7      8             8      -1                             1
43        2      49            1       1                             1
9         8      12            7       1                             1
6         9      4             9       0                             0
28        4      31            4       0                             0
Σd² = 12
From
$$\rho = 1 - \frac{6\sum d_i^2}{n(n^2-1)}$$
$$\rho = 1 - \frac{6(12)}{9(9^2-1)} = 1 - \frac{72}{720}$$
ρ = 0.9
The Spearman's rank correlation for this set of data is 0.9, which implies a very strong
positive correlation.
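The ranking and arithmetic in this example can be checked with a short Python sketch. It recomputes the ranks (rank 1 for the highest mark), Σd² and ρ from the Physics and Mathematics marks given in the example. The helper names ranks_descending and spearman_rho are hypothetical, and the sketch assumes there are no tied marks, so it does not cover data that need average ranks.

```python
def ranks_descending(scores):
    """Rank 1 = highest score. Assumes no ties (hypothetical helper)."""
    if len(set(scores)) != len(scores):
        raise ValueError("tied scores need average ranks, which this sketch does not handle")
    order = sorted(scores, reverse=True)
    return [order.index(s) + 1 for s in scores]

def spearman_rho(xs, ys):
    """Return (rho, sum of squared rank differences) using 1 - 6*sum(d^2) / (n(n^2 - 1))."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two equal-length samples with at least 2 points")
    n = len(xs)
    rank_x, rank_y = ranks_descending(xs), ranks_descending(ys)
    sum_d2 = sum((a - b) ** 2 for a, b in zip(rank_x, rank_y))
    return 1 - 6 * sum_d2 / (n * (n ** 2 - 1)), sum_d2

# Marks from the worked example.
physics = [35, 23, 47, 17, 10, 43, 9, 6, 28]
mathematics = [30, 33, 45, 23, 8, 49, 12, 4, 31]

rho, sum_d2 = spearman_rho(physics, mathematics)
print(sum_d2, round(rho, 3))
```

With n = 9 and Σd² = 12, the formula gives 1 - 72/720 = 0.9.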
QUESTION
Calculate the Spearman’s rank correlation coefficient of the data in the table given
below
X 10 8 12 15 8 10
Y 7 4 6 7 9 8