Correlation Analysis and Its Types


Correlation Analysis
Bike sales grow with income

As the gross national income per capita rises, the number of two-wheelers also increases.

An exponential growth in a country's two-wheeler fleet might result in a rapid increase in the number of motorcycle crashes.
• India's per capita income grew by 28% between 2013 and 2017, while two-wheeler registrations increased by 46% (compared with 44% for new vehicle registrations overall) over the same period. Last year, 21.2 million two-wheelers were sold in India, and annual sales are projected to reach 26.6 million units by 2025 at a 2.6% growth rate, according to UnivDatos.
Helmets and trauma Injury

Not wearing helmets resulted in the deaths of 44,666 people (30,148 drivers and 14,518 pillion riders), or 29.82% of total road accident fatalities, during 2019, according to the ministry data.

Correct helmet use can lead to a 42% reduction in the risk of fatal injuries and a 69% reduction in the risk of head injuries, according to the WHO.
• Income growth
• Population growth
• Literacy growth
• Economy growth
• Rural-urban migration growth
VS. Two-Wheeler Growth
Learning Outcomes

• Understand the meaning of correlation
• Identify the direction and strength of a correlation between variables
• Understand the types of correlation


Meaning of Correlation Analysis
• Correlation refers to a process for establishing whether or not a relationship exists between two variables.
• Correlation analysis shows us how to determine both the nature and the strength of the relationship between two variables.
• The correlation coefficient is usually given the symbol r, and it ranges from -1 to +1.
Types of correlation

On the basis of degree of correlation:
• Positive correlation
• Negative correlation

On the basis of number of variables:
• Simple correlation
• Partial correlation
• Multiple correlation

On the basis of linearity:
• Linear correlation
• Non-linear correlation
Correlation: On the basis of degree
• Positive correlation
If, as one variable increases, the other variable also increases on average, the correlation is positive.
Example of Positive Correlation
• The more time you spend marketing your business, the more new customers you will have.
• As the temperature goes up, ice cream sales also go up.
Correlation: On the basis of degree
• Negative correlation
If, as one variable increases, the other variable decreases on average, the correlation is negative.
Example of Negative Correlation
• As the weather gets colder, air conditioning costs decrease.
• As the price of goods and services increases, sales decrease.
Scatter Plot for Positive and Negative Correlation
Correlation: On the basis of number of variables
• Simple correlation
Correlation is said to be simple when only two variables are analyzed.

For example:
The correlation between demand and supply, or between income and expenditure, is a simple correlation.
Correlation: On the basis of number of variables
• Partial correlation:
When three or more variables are considered for analysis, but the relationship between only two of them is studied while the remaining influencing variables are kept constant.
For example:
Correlation analysis is done with demand, supply and income, where income is kept constant.
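The idea of keeping a third variable constant can be made concrete with the first-order partial correlation coefficient, which the slides do not state. The following is a minimal Python sketch, assuming the three pairwise Pearson coefficients are already known. The function name `partial_correlation` and all numeric values are hypothetical and chosen only for illustration.

```python
import math

def partial_correlation(r_xy: float, r_xz: float, r_yz: float) -> float:
    """First-order partial correlation between X and Y with Z held constant."""
    denominator = math.sqrt((1 - r_xz ** 2) * (1 - r_yz ** 2))
    if denominator == 0:
        raise ValueError("undefined when X or Y is perfectly correlated with Z")
    return (r_xy - r_xz * r_yz) / denominator

# Hypothetical pairwise coefficients (not taken from the slides):
# X = demand, Y = supply, Z = income.
r_demand_supply = 0.6
r_demand_income = 0.5
r_supply_income = 0.4

r_partial = partial_correlation(r_demand_supply, r_demand_income, r_supply_income)
print(round(r_partial, 3))
```

When income is uncorrelated with both demand and supply, the partial coefficient equals the simple coefficient between demand and supply.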
Correlation: On the basis of number of variables
• Multiple correlation:
In multiple correlation, three or more variables are studied simultaneously.
For example:
When rainfall, production of rice and price of rice are studied simultaneously, it is known as multiple correlation.
Correlation: On the basis of linearity
• Linear correlation:
If a change in the amount of one variable tends to produce a change in the amount of the other variable at a constant ratio, the correlation is said to be linear.
For example: the ratio of change between the variables is the same.
X: 10 20 30 40 50
Y: 20 40 60 80 100
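The constant ratio of change in the X and Y values above can be checked directly. This is a small Python sketch; the variable names are illustrative.

```python
import math

x = [10, 20, 30, 40, 50]
y = [20, 40, 60, 80, 100]

# Change in Y divided by change in X for each pair of consecutive observations.
ratios = [(y[i + 1] - y[i]) / (x[i + 1] - x[i]) for i in range(len(x) - 1)]
print(ratios)  # [2.0, 2.0, 2.0, 2.0]

# The relationship is linear when every ratio of change is the same.
is_linear = all(math.isclose(ratio, ratios[0]) for ratio in ratios)
print(is_linear)  # True
```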
Correlation: On the basis of linearity
• Non-linear correlation:
If a change in the amount of one variable tends to produce a change in the amount of the other variable, but not at a constant ratio, the correlation is said to be non-linear.
For example:
If the amount of fertilizer is doubled, the yield of wheat would not necessarily be doubled.
Importance of correlation analysis:
• Measures the degree and direction of the relationship, i.e. whether it is positive or negative.
• Helps estimate values of variables, i.e. if two variables are highly correlated, we can estimate the value of one variable from the given value of the other.
• Helps in understanding economic behavior.
Karl Pearson Coefficient of Correlation
A Parametric Test
COEFFICIENT OF CORRELATION
• The ratio indicating the degree of relationship between two related variables.
• For a perfect POSITIVE CORRELATION the coefficient of correlation is +1.
• For a perfect NEGATIVE CORRELATION the coefficient of correlation is -1.
• A positive coefficient of correlation varies from 0 to +1.
• A negative coefficient of correlation varies from 0 to -1.
Interpretation of correlation coefficient
Five examples of correlation coefficient
KARL PEARSON METHOD
• The Karl Pearson method was developed by Karl Pearson and is named after him.
• It is represented as ‘r’.
The standard formula used in the computation of Pearson’s product moment correlation coefficient is as follows:

N ( XY )   X
r
Y
[N  X  ( X ) ][N Y 
2 2 2

(Y ) ]
2
N (X Y )   X Y
r 
[NX 2 (X)2 ][NY 2(Y)2 ]

Where,
N - the total number of scores or cases
Ʃ - the summation of the items indicated
ƩX - the sum of all X scores
ƩX² - each X score is squared and then those squares are summed (the sum of the squared X scores)
(ƩX)² - the X scores are summed and the total is squared (the square of the sum of all the X scores)
ƩY - the sum of all Y scores
ƩY² - each Y score is squared and then those squares are summed
(ƩY)² - the Y scores are summed and the total is squared
There are five subjects, and the students have to appear in two tests for each subject. We want to find out whether there is any correlation between the marks scored in Test 1 and Test 2 of each subject.

SUBJECT   SCORES IN TEST 1 (X)   SCORES IN TEST 2 (Y)
A         5                      12
B         3                      15
C         2                      11
D         8                      10
E         6                      18


SUBJECT   SCORES IN TEST 1 (X)   SCORES IN TEST 2 (Y)   XY        X²        Y²
A         5                      12                     60        25        144
B         3                      15                     45        9         225
C         2                      11                     22        4         121
D         8                      10                     80        64        100
E         6                      18                     108       36        324
N=5       ƩX=24                  ƩY=66                  ƩXY=315   ƩX²=138   ƩY²=914

ƩX: summation of X scores; ƩY: summation of Y scores; ƩXY: summation of the products XY; ƩX²: summation of the squares of X; ƩY²: summation of the squares of Y.
N ( X Y )   X
r 
[N  X 2 Y
 (  X )2 ] [ N  Y 2

(  Y )2 ]
(5  315)  ( 2 4  66)
r 
[5  1 3 8  24 2
][5  9 1 4 
662 ]
r 
( 6 9 0  1 55 77 65 ) ( 41557804  4 3 5 6 )
 9
r 
1 1 4  2 1 4
 9
r 
2 4 3 9 6
 9
r 
1 5 6 .2
r   0 .0 5 7 6

i.e, coefficient of correlation (r) = -0.0576
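The worked example can be checked with a short Python sketch that applies the same raw-score formula to the Test 1 and Test 2 marks. The function name `pearson_r` is illustrative and not part of the slides.

```python
import math

def pearson_r(x, y):
    """Pearson's r computed with the raw-score (sum of products) formula."""
    n = len(x)
    sum_x = sum(x)
    sum_y = sum(y)
    sum_xy = sum(a * b for a, b in zip(x, y))
    sum_x2 = sum(a * a for a in x)
    sum_y2 = sum(b * b for b in y)
    numerator = n * sum_xy - sum_x * sum_y
    denominator = math.sqrt(
        (n * sum_x2 - sum_x ** 2) * (n * sum_y2 - sum_y ** 2)
    )
    return numerator / denominator

# Marks for subjects A to E from the worked example.
test1 = [5, 3, 2, 8, 6]
test2 = [12, 15, 11, 10, 18]

r = pearson_r(test1, test2)
print(round(r, 4))  # -0.0576
```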


INTERPRETATION OF CORRELATION COEFFICIENT
• ‘r’ from 0.00 to ±0.20 denotes negligible correlation.
• ‘r’ from ±0.20 to ±0.40 denotes low correlation.
• ‘r’ from ±0.40 to ±0.70 denotes substantial or marked correlation.
• ‘r’ from ±0.70 to ±1.00 denotes high to very high correlation.

± denotes direction:
‘+’ means positive correlation
‘-’ means negative correlation.
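The bands above can be expressed as a small lookup function. This is a Python sketch under one assumption the slides do not settle: a value that falls exactly on a boundary (for example 0.20) is placed in the higher band. The function name `interpret_r` is illustrative.

```python
def interpret_r(r: float) -> str:
    """Describe the strength and direction of r using the bands listed above."""
    if not -1.0 <= r <= 1.0:
        raise ValueError("r must lie between -1 and +1")
    size = abs(r)
    # Assumption: boundary values (0.20, 0.40, 0.70) go to the higher band.
    if size < 0.20:
        strength = "negligible"
    elif size < 0.40:
        strength = "low"
    elif size < 0.70:
        strength = "substantial or marked"
    else:
        strength = "high to very high"
    if r > 0:
        direction = "positive"
    elif r < 0:
        direction = "negative"
    else:
        direction = "no direction"
    return f"{strength} ({direction})"

print(interpret_r(-0.0576))  # negligible (negative)
```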
Interpretation of ‘r’

Since the value of ‘r’ is -0.0576, there is negligible correlation between the marks of Test 1 and Test 2.
Example 2

Table 15.2 shows the sales revenue and advertisement expenses of a company for the past 10 months. Find the coefficient of correlation between sales and advertisement.
Calculation of correlation coefficient between sales and advertisement
Problem for Practice
There are five individuals, and each of them has to appear in two tests. We want to find out whether there is any correlation between the marks scored in Test 1 and Test 2 by each individual.

INDIVIDUAL   SCORE IN TEST 1 (X)   SCORE IN TEST 2 (Y)
A            15                    60
B            25                    70
C            20                    40
D            30                    50
E            35                    30
Non-Parametric Statistics
Spearman’s Rank Correlation
When data are of ordinal level (ranked data), the Pearson correlation coefficient r cannot be applied.

In this case, Spearman’s rank correlation can be used to determine the degree of association between two variables.
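The slides do not reproduce the formula here. A minimal Python sketch follows, assuming no tied values, in which case the coefficient is 1 - 6Ʃd² / (n(n² - 1)), where d is the difference between the two ranks of each observation. The function names and the data are hypothetical and used only for illustration.

```python
def ranks(values):
    """Rank values from 1 (smallest) to n; this sketch assumes no ties."""
    if len(set(values)) != len(values):
        raise ValueError("this sketch does not handle tied values")
    ordered = sorted(values)
    return [ordered.index(v) + 1 for v in values]

def spearman_rho(x, y):
    """Spearman's rank correlation using the no-ties shortcut formula."""
    n = len(x)
    d_squared = sum((rx - ry) ** 2 for rx, ry in zip(ranks(x), ranks(y)))
    return 1 - (6 * d_squared) / (n * (n ** 2 - 1))

# Hypothetical observations (not taken from the slides).
scores_a = [10, 20, 30, 40, 50]
scores_b = [1, 3, 2, 5, 4]

rho = spearman_rho(scores_a, scores_b)
print(round(rho, 2))  # 0.8
```

With tied values, average ranks and a corrected formula would be needed, which this sketch deliberately leaves out.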
Example 14.9

A social science researcher wants to find out the degree of association between sugar prices and wheat prices. The researcher has collected data relating to the price of sugar and wheat in 14 randomly selected months from the last 20 years.

How can he compute the Spearman’s rank correlation from the data provided in the table?
Thank you
