
Chapter – IV

Correlation

Correlation analysis
Correlation analysis is the statistical tool that we can use to describe the degree to which
one variable is linearly related to another. Frequently, correlation analysis is used in conjunction
with regression analysis to measure how well the least squares line fits the data. Correlation
analysis can also be used by itself, however, to measure the degree of association between two
variables.
In this section we present two measures for describing the correlation between two variables: the
coefficient of determination and the coefficient of correlation.
According to A. M. Tuttle, “an analysis of covariation of two or more variables is usually called
correlation”.
Coefficient of correlation is the measure of the strength of the linear relationship between two
variables. It is generally denoted by r.

Significance of the Study of Correlation


The study of correlation is of immense use in practical life because of the following reasons:
1. Most variables show some kind of relationship, for example between price and supply, or income and
expenditure. With the help of correlation analysis we can measure in one figure the
degree of relationship existing between the variables.
2. Once we know that two variables are closely related, we can estimate the value of one
variable given the value of another.
3. Correlation analysis contributes to the understanding of economic behavior, aids in locating the
critically important variables on which others depend, may reveal to the economist the connection by
which disturbances spread and suggest to him the paths through which stabilizing forces
become effective.
4. Progressive development in the methods of science and philosophy has been characterized
by an increase in the knowledge of relationships or correlations. Nature has been found to be a
multiplicity of inter-related forces.

Types of Correlation
Correlation is described or classified in several different ways. Three of the most important are:
a. Positive and negative;
b. Simple, partial and multiple; and
c. Linear and non-linear.
By the number of variables studied, there are three types of correlation. They are:
i. Simple correlation
ii. Partial correlation and
iii. Multiple correlation.
The distinction between simple, partial and multiple correlation is based upon the number of variables
studied. When only two variables are studied it is a problem of simple correlation. When three or
more variables are studied it is a problem of either multiple or partial correlation. In multiple
correlation three or more variables are studied simultaneously. For example, when we study the
relationship between the yield of rice per acre and both the amount of rainfall and the amount of
fertilizers used, it is a problem of multiple correlation. Similarly, the relationship of plastic
hardness, temperature and pressure is multivariate. In partial correlation we recognize more than
two variables, but consider only two variables to be influencing each other, the effect of the other
influencing variables being kept constant. For example, in the rice problem taken above, if we limit
our correlation analysis of yield and rainfall to periods when a certain average daily temperature
existed, it becomes a problem of partial correlation.
N.B.: Here we shall discuss only simple correlation.

Simple Correlation
The coefficient of correlation
Definition: Karl Pearson's product-moment coefficient of correlation (or simply, the coefficient
of correlation) r is a measure of the strength of the linear relationship between two variables x and
y. It is computed (for a sample of n measurements on x and y) as follows:
$$ r = \frac{\sum (x - \bar{x})(y - \bar{y})}{\sqrt{\sum (x - \bar{x})^2 \, \sum (y - \bar{y})^2}} $$
where x = the values of the first variable
y = the values of the second variable
$\bar{x}$ = the mean of the x variable
$\bar{y}$ = the mean of the y variable
n = the number of observations
The numerator $\sum (x - \bar{x})(y - \bar{y})$ determines the direction of the movement, i.e., the nature of
the correlation (positive or negative), and contributes to the magnitude of the coefficient. The value of
$\sum (x - \bar{x})(y - \bar{y})$ will be positive if large values of one series occur with large values of the other
and small values go with small values, and in such a case r will be positive. Similarly, if the value
of $\sum (x - \bar{x})(y - \bar{y})$ is negative, the value of r will be negative, indicating negative correlation.
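The definition above can be sketched in a few lines of Python. This is a minimal illustration, not a reference implementation: the function name `pearson_r` and the five data pairs are hypothetical and are not taken from the text.

```python
import math

def pearson_r(xs, ys):
    """Coefficient of correlation by the direct (deviation-from-mean) formula."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("xs and ys must be paired and hold at least two values")
    n = len(xs)
    x_bar = sum(xs) / n
    y_bar = sum(ys) / n
    # Numerator: sum of products of deviations; its sign is the sign of r.
    s_xy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
    # Denominator terms: sums of squared deviations of each variable.
    s_xx = sum((x - x_bar) ** 2 for x in xs)
    s_yy = sum((y - y_bar) ** 2 for y in ys)
    if s_xx == 0 or s_yy == 0:
        raise ValueError("r is undefined when a variable has no variation")
    return s_xy / math.sqrt(s_xx * s_yy)

# Hypothetical illustrative data (assumed for this sketch only).
x = [1, 2, 3, 4, 5]
y = [2, 4, 5, 4, 5]

r_pos = pearson_r(x, y)        # large x tends to go with large y
r_neg = pearson_r(x, y[::-1])  # reversing y flips the sign of the numerator

print(round(r_pos, 4))  # 0.7746
print(round(r_neg, 4))  # -0.7746
```

Here the numerator is 6 and the denominator is the square root of 10 × 6, so r is about 0.7746; reversing the y series makes the numerator -6 and r negative.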

Formula:
Computation of simple co-efficient of Correlation

1. Computation for Ungrouped Data


(a) Direct Method

Co-efficient of correlation
$$ r = \frac{\sum (x - \bar{x})(y - \bar{y})}{\sqrt{\sum (x - \bar{x})^2 \, \sum (y - \bar{y})^2}} $$

(b) Short-cut Method: Calculation of co-efficient of correlation can also be done by short-cut
method. This method has got the advantage of ease in calculation. The formulae for
correlation co-efficient according to the short-cut method are:
d d
 d x d y  xn y
i. r 

 ( d x ) 2 
 ( d y ) 2 

 d x
2

n
 d y 
 
2

n


  

n  dxd y   dx  d y
ii. r 
n  d x
2

 ( d x ) 2 n  d y  ( d y ) 2
2

 x y
 xy 
iii . r  n
 (  x ) 2
  ( y ) 2 
 x    y 
2 2

 n  n 

where x = the values of the variable x
y = the values of the variable y
$d_x = x - A_x$ = the deviation of x from $A_x$
$d_y = y - A_y$ = the deviation of y from $A_y$
$A_x$ = the assumed mean (an arbitrary value) in the x series
$A_y$ = the assumed mean (an arbitrary value) in the y series
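As a rough check of the short-cut method, the sketch below applies the assumed-mean form (the version multiplied through by n) in Python. The function name `pearson_r_shortcut`, the data, and the chosen assumed means are hypothetical. Taking both assumed means as zero reduces the formula to the raw-value form in x and y.

```python
import math

def pearson_r_shortcut(xs, ys, a_x, a_y):
    """Coefficient of correlation from deviations about assumed means a_x, a_y."""
    n = len(xs)
    dx = [x - a_x for x in xs]  # deviations of x from the assumed mean
    dy = [y - a_y for y in ys]  # deviations of y from the assumed mean
    sum_dx, sum_dy = sum(dx), sum(dy)
    sum_dxdy = sum(u * v for u, v in zip(dx, dy))
    sum_dx2 = sum(u * u for u in dx)
    sum_dy2 = sum(v * v for v in dy)
    numerator = n * sum_dxdy - sum_dx * sum_dy
    denominator = math.sqrt(
        (n * sum_dx2 - sum_dx ** 2) * (n * sum_dy2 - sum_dy ** 2)
    )
    return numerator / denominator

# Hypothetical illustrative data (assumed for this sketch only).
x = [1, 2, 3, 4, 5]
y = [2, 4, 5, 4, 5]

r_assumed = pearson_r_shortcut(x, y, a_x=2, a_y=5)  # arbitrary assumed means
r_raw = pearson_r_shortcut(x, y, a_x=0, a_y=0)      # raw-value form

print(round(r_assumed, 4))  # 0.7746
print(round(r_raw, 4))      # 0.7746
```

Both calls give the same value because r does not depend on the choice of assumed mean; the assumed mean only changes how large the intermediate sums are.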
2. Computation of r from Grouped Data

(a) Direct Method


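The formula for computation of r from grouped data by direct method is-
$$ r = \frac{\sum f x y}{\sqrt{\sum f x^2 \cdot \sum f y^2}} $$
which may also be written as
$$ r = \frac{\sum f x y}{n \, \sigma_x \sigma_y} $$
where $x = X - \bar{x}$ and $\bar{x} = \dfrac{\sum f X}{n}$
$y = Y - \bar{y}$ and $\bar{y} = \dfrac{\sum f Y}{n}$
X = the mid-value of the class interval of the x variable
Y = the mid-value of the class interval of the y variable
f = the frequency of the cell and $n = \sum f$ = the total frequency
$\sigma_x = \sqrt{\dfrac{\sum f x^2}{n}}$, $\sigma_y = \sqrt{\dfrac{\sum f y^2}{n}}$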
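A minimal Python sketch of the direct method for grouped data follows. It assumes the bivariate frequency table is stored as a dictionary from (X mid-value, Y mid-value) to cell frequency; that layout, the function name `grouped_r_direct`, and the table values are all hypothetical.

```python
import math

def grouped_r_direct(table):
    """r from a bivariate frequency table {(X_mid, Y_mid): f}, direct method."""
    n = sum(table.values())  # total frequency
    x_bar = sum(f * X for (X, _), f in table.items()) / n
    y_bar = sum(f * Y for (_, Y), f in table.items()) / n
    # Frequency-weighted sums of products and squares of deviations.
    s_fxy = sum(f * (X - x_bar) * (Y - y_bar) for (X, Y), f in table.items())
    s_fx2 = sum(f * (X - x_bar) ** 2 for (X, _), f in table.items())
    s_fy2 = sum(f * (Y - y_bar) ** 2 for (_, Y), f in table.items())
    r_first = s_fxy / math.sqrt(s_fx2 * s_fy2)
    # Equivalent form using the standard deviations of the two series.
    sigma_x = math.sqrt(s_fx2 / n)
    sigma_y = math.sqrt(s_fy2 / n)
    r_second = s_fxy / (n * sigma_x * sigma_y)
    return r_first, r_second

# Hypothetical table: X classes with mid-values 5, 15, 25; Y classes 10, 30.
table = {(5, 10): 4, (15, 10): 2, (15, 30): 2, (25, 30): 4}

r_first, r_second = grouped_r_direct(table)
print(round(r_first, 4))   # 0.8165
print(round(r_second, 4))  # 0.8165
```

For this table the means are 15 and 20, the weighted sum of products is 800, and the weighted sums of squares are 800 and 1200, so both forms give r of about 0.8165.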
(b) Short-cut Method
The formula for computation of r by short-cut method from grouped data is given by-
$$ r = \frac{\sum f d_x d_y - \dfrac{\sum f d_x \sum f d_y}{n}}{\sqrt{\left[\sum f d_x^2 - \dfrac{(\sum f d_x)^2}{n}\right]\left[\sum f d_y^2 - \dfrac{(\sum f d_y)^2}{n}\right]}} $$
where $d_x = \dfrac{X - A_x}{C_x}$ and X = the mid-value of the class in the x variable
$d_y = \dfrac{Y - A_y}{C_y}$ and Y = the mid-value of the class in the y variable
$A_x$ = the assumed mean in the x series
$A_y$ = the assumed mean in the y series
$C_x$ = the class interval in the x series
$C_y$ = the class interval in the y series
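The step-deviation calculation can be sketched in Python as below. This is an illustration under stated assumptions: the function name `grouped_r_shortcut`, the dictionary layout of the frequency table, the table values, and the chosen assumed means and class intervals are all hypothetical.

```python
import math

def grouped_r_shortcut(table, a_x, c_x, a_y, c_y):
    """r from {(X_mid, Y_mid): f} using step deviations d = (mid - A) / C."""
    n = sum(table.values())  # total frequency
    s_fdx = s_fdy = s_fdxdy = s_fdx2 = s_fdy2 = 0.0
    for (X, Y), f in table.items():
        dx = (X - a_x) / c_x  # step deviation in the x series
        dy = (Y - a_y) / c_y  # step deviation in the y series
        s_fdx += f * dx
        s_fdy += f * dy
        s_fdxdy += f * dx * dy
        s_fdx2 += f * dx * dx
        s_fdy2 += f * dy * dy
    numerator = s_fdxdy - s_fdx * s_fdy / n
    denominator = math.sqrt(
        (s_fdx2 - s_fdx ** 2 / n) * (s_fdy2 - s_fdy ** 2 / n)
    )
    return numerator / denominator

# Hypothetical table: X classes with mid-values 5, 15, 25; Y classes 10, 30.
table = {(5, 10): 4, (15, 10): 2, (15, 30): 2, (25, 30): 4}

r_step = grouped_r_shortcut(table, a_x=15, c_x=10, a_y=10, c_y=20)
print(round(r_step, 4))  # 0.8165
```

With these choices the step deviations are small integers (-1, 0, 1 for x and 0, 1 for y), which is the practical point of the method when working by hand. The result does not depend on the assumed means or class intervals chosen.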

Other Formulae:
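1. The value of the correlation co-efficient lies between -1 and +1.
$$ r = \frac{\sum (x - \bar{x})(y - \bar{y})}{\sqrt{\sum (x - \bar{x})^2 \, \sum (y - \bar{y})^2}} = \frac{\sum XY}{\sqrt{\sum X^2 \, \sum Y^2}} $$
Since $(\sum XY)^2 \le \sum X^2 \, \sum Y^2$ (the Cauchy-Schwarz inequality), it follows that
Answer: $-1 \le r \le +1$

Here, $X = x - \bar{x}$ and $Y = y - \bar{y}$, and $\bar{x}$ and $\bar{y}$ are the means of x and y.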

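2. The value of the correlation coefficient is ±1 when y = a + bx with b ≠ 0: it is +1 if b > 0 and -1 if b < 0.

$$ r = \frac{\sum (x - \bar{x})(y - \bar{y})}{\sqrt{\sum (x - \bar{x})^2 \, \sum (y - \bar{y})^2}} = \frac{b \sum (x - \bar{x})^2}{|b| \sum (x - \bar{x})^2} = \pm 1 $$ Answer
Here, y = a + bx and $\bar{y} = a + b\bar{x}$, so that $y - \bar{y} = b(x - \bar{x})$.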
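The two properties can be checked numerically with a short Python sketch. The helper `pearson_r` and all data values are hypothetical; note that for an exact straight line the sign of r follows the sign of the slope b.

```python
import math

def pearson_r(xs, ys):
    """Coefficient of correlation by the direct (deviation-from-mean) formula."""
    n = len(xs)
    x_bar = sum(xs) / n
    y_bar = sum(ys) / n
    s_xy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
    s_xx = sum((x - x_bar) ** 2 for x in xs)
    s_yy = sum((y - y_bar) ** 2 for y in ys)
    return s_xy / math.sqrt(s_xx * s_yy)

# Hypothetical illustrative data (assumed for this sketch only).
x = [1, 2, 3, 4, 5]
y_up = [3 + 2 * v for v in x]    # exact line y = a + bx with b > 0
y_down = [3 - 2 * v for v in x]  # exact line y = a + bx with b < 0
y_scatter = [2, 4, 5, 4, 5]      # points that do not lie on one line

r_up = pearson_r(x, y_up)
r_down = pearson_r(x, y_down)
r_scatter = pearson_r(x, y_scatter)

print(math.isclose(r_up, 1.0))     # True
print(math.isclose(r_down, -1.0))  # True
print(-1.0 <= r_scatter <= 1.0)    # True
```

The comparisons use `math.isclose` because floating-point rounding can leave a computed r a hair away from exactly ±1.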
