
Correlation and Regression

Correlation

Correlation indicates whether there is any relation between the variables and the correlation
coefficient measures the extent of relationship between them.

Simple Correlation and Multiple Correlation

When we study only two variables, the relationship is described as simple correlation.

When we study more than two variables then the relationship is described as multiple
correlation.

Height and weight of a person: Example of simple correlation.

Price, demand and supply of a commodity: Example of multiple correlation.

Positive Correlation and Negative Correlation


If the increase (decrease) in one variable results in the corresponding increase (decrease) in the
others, the variables are positively correlated. For example, the heights and weights of a group of
persons are positively correlated.

If the increase (decrease) in one variable results in the corresponding decrease (increase) in the
others, the variables are negatively correlated. For example, price and demand of a commodity
are negatively correlated.

Karl Pearson’s correlation coefficient


Karl Pearson’s correlation coefficient is calculated from the following formula:

\[ r = \frac{\operatorname{Cov}(x, y)}{\sqrt{V(x)\,V(y)}} \]

where,

\[ \operatorname{Cov}(x, y) = \frac{1}{n}\sum_{i=1}^{n} (x_i - \bar{x})(y_i - \bar{y}) \]

\[ V(x) = \frac{1}{n}\sum_{i=1}^{n} (x_i - \bar{x})^2, \qquad V(y) = \frac{1}{n}\sum_{i=1}^{n} (y_i - \bar{y})^2 \]

Working formula of Karl Pearson’s Correlation Coefficient

\[ r = \frac{\sum_{i=1}^{n} x_i y_i - n\bar{x}\bar{y}}{\sqrt{\left(\sum_{i=1}^{n} x_i^2 - n\bar{x}^2\right)\left(\sum_{i=1}^{n} y_i^2 - n\bar{y}^2\right)}} \]
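As a quick check that the general formula and the working formula agree, the sketch below computes r both ways in plain Python. The function names and the height/weight figures are made up for illustration; they are not taken from the class material.

```python
import math

def pearson_r_definition(x, y):
    """r = Cov(x, y) / sqrt(V(x) V(y)), using 1/n in every moment."""
    n = len(x)
    x_bar = sum(x) / n
    y_bar = sum(y) / n
    cov = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y)) / n
    var_x = sum((xi - x_bar) ** 2 for xi in x) / n
    var_y = sum((yi - y_bar) ** 2 for yi in y) / n
    return cov / math.sqrt(var_x * var_y)

def pearson_r_working(x, y):
    """Working formula built from the raw sums of x*y, x^2 and y^2."""
    n = len(x)
    x_bar = sum(x) / n
    y_bar = sum(y) / n
    numerator = sum(xi * yi for xi, yi in zip(x, y)) - n * x_bar * y_bar
    term_x = sum(xi ** 2 for xi in x) - n * x_bar ** 2
    term_y = sum(yi ** 2 for yi in y) - n * y_bar ** 2
    return numerator / math.sqrt(term_x * term_y)

# Hypothetical data: heights (cm) and weights (kg) of five persons.
heights = [150, 160, 165, 170, 180]
weights = [50, 56, 61, 66, 75]

r_def = pearson_r_definition(heights, weights)
r_work = pearson_r_working(heights, weights)
print(round(r_def, 4), round(r_work, 4))
```

Both functions should print the same value up to rounding, since the working formula is only an algebraic rearrangement of the general one.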

Derivation of working formula from the general formula:

See the class notes…
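
For reference, one standard route from the general formula to the working formula is sketched here; it may differ in detail from the derivation in the class notes. It uses only the identities \(\sum x_i = n\bar{x}\) and \(\sum y_i = n\bar{y}\).

```latex
\begin{aligned}
\sum_{i=1}^{n}(x_i-\bar{x})(y_i-\bar{y})
  &= \sum_{i=1}^{n} x_i y_i - \bar{y}\sum_{i=1}^{n} x_i
     - \bar{x}\sum_{i=1}^{n} y_i + n\bar{x}\bar{y} \\
  &= \sum_{i=1}^{n} x_i y_i - n\bar{x}\bar{y} - n\bar{x}\bar{y} + n\bar{x}\bar{y}
   = \sum_{i=1}^{n} x_i y_i - n\bar{x}\bar{y}, \\[4pt]
\sum_{i=1}^{n}(x_i-\bar{x})^2 &= \sum_{i=1}^{n} x_i^2 - n\bar{x}^2,
\qquad
\sum_{i=1}^{n}(y_i-\bar{y})^2 = \sum_{i=1}^{n} y_i^2 - n\bar{y}^2, \\[4pt]
r &= \frac{\frac{1}{n}\left(\sum x_i y_i - n\bar{x}\bar{y}\right)}
          {\sqrt{\frac{1}{n}\left(\sum x_i^2 - n\bar{x}^2\right)\cdot
                 \frac{1}{n}\left(\sum y_i^2 - n\bar{y}^2\right)}}
   = \frac{\sum x_i y_i - n\bar{x}\bar{y}}
          {\sqrt{\left(\sum x_i^2 - n\bar{x}^2\right)\left(\sum y_i^2 - n\bar{y}^2\right)}}.
\end{aligned}
```

The factor \(1/n\) in the numerator cancels against the \(\sqrt{1/n^2}\) in the denominator, which is why n appears in the working formula only through the terms \(n\bar{x}\bar{y}\), \(n\bar{x}^2\) and \(n\bar{y}^2\).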

Scatter diagram: A diagrammatic representation of bivariate data is called a scatter diagram.

Types of scatter diagram for different values of correlation coefficient:

The value of the correlation coefficient always lies between -1 and +1.
• When r = +1, there is a perfect positive correlation between the variables.
• When r = -1, there is a perfect negative correlation between the variables.
• When r = 0, there is no linear relation between the variables.
• The closer r is to +1, the stronger the positive linear relationship.
• The closer r is to -1, the stronger the negative linear relationship.
• The closer r is to 0, the weaker the linear relationship.

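The three boundary cases can be illustrated with a small sketch. The data are made up: two exact straight lines and one curve, y = x², chosen to show that r = 0 rules out only a linear relation, since y is fully determined by x in that case. The helper name pearson_r is an assumption of this sketch.

```python
import math

def pearson_r(x, y):
    """Pearson's r from deviations about the means (the 1/n factors cancel)."""
    n = len(x)
    x_bar, y_bar = sum(x) / n, sum(y) / n
    s_xy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
    s_xx = sum((xi - x_bar) ** 2 for xi in x)
    s_yy = sum((yi - y_bar) ** 2 for yi in y)
    return s_xy / math.sqrt(s_xx * s_yy)

x = [-2, -1, 0, 1, 2]

r_pos = pearson_r(x, [2 * xi + 3 for xi in x])    # points on a rising line
r_neg = pearson_r(x, [-2 * xi + 3 for xi in x])   # points on a falling line
r_zero = pearson_r(x, [xi ** 2 for xi in x])      # curve symmetric about x = 0

print(r_pos, r_neg, r_zero)
```
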
Properties of Correlation Coefficient

1. Correlation coefficient lies between -1 and +1. Symbolically, -1 ≤ r ≤ +1.


2. Correlation coefficient is independent of change of origin and scale of measurement (provided the two scale factors have the same sign).
3. Correlation coefficient has no unit.
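
Property 2 can be checked numerically. The sketch below assumes the transformation u = (x - a)/h, v = (y - b)/k with arbitrarily chosen constants and made-up data; it also shows that r changes sign, but not magnitude, when exactly one scale factor is negative.

```python
import math

def pearson_r(x, y):
    """Pearson's r from deviations about the means (the 1/n factors cancel)."""
    n = len(x)
    x_bar, y_bar = sum(x) / n, sum(y) / n
    s_xy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
    s_xx = sum((xi - x_bar) ** 2 for xi in x)
    s_yy = sum((yi - y_bar) ** 2 for yi in y)
    return s_xy / math.sqrt(s_xx * s_yy)

# Hypothetical data: heights (cm) and weights (kg).
heights = [150, 160, 165, 170, 180]
weights = [50, 56, 61, 66, 75]

# Change of origin (a = 165, b = 60) and scale (h = 5, k = 2).
u = [(x - 165) / 5 for x in heights]
v = [(y - 60) / 2 for y in weights]

r_original = pearson_r(heights, weights)
r_transformed = pearson_r(u, v)

# Same change of origin, but with a negative scale factor on x only.
u_neg = [(x - 165) / -5 for x in heights]
r_flipped = pearson_r(u_neg, v)

print(r_original, r_transformed, r_flipped)
```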

Advantages of Correlation Coefficient

1. It is simple to understand and easy to calculate.


2. It is very useful in the case of data which are quantitative in nature.

Regression Analysis

The objective of regression analysis is to analyze the relationship between dependent and
independent variables and to express that relationship as a mathematical equation.

Simple Regression Analysis and Multiple Regression Analysis:

Regression analysis that consists of one dependent variable and one independent variable is
called simple regression analysis.

Regression analysis that consists of one dependent variable and more than one independent
variable is called multiple regression analysis.

Regression Equation:

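The linear regression equation is:

\[ y_i = \beta_0 + \beta_1 x_i + \varepsilon_i \]

Where,

\(y_i\) is the dependent variable

\(x_i\) is the independent variable

\(\beta_0\) (intercept) and \(\beta_1\) (slope) are called the regression coefficients of the model

\(\varepsilon_i\) is the error term.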


Estimated Regression Equation:
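The estimated linear regression equation is given as:

\[ \hat{y}_i = \hat{\beta}_0 + \hat{\beta}_1 x_i \]

Where,

\[ \hat{\beta}_1 = \frac{\sum_{i=1}^{n} x_i y_i - \dfrac{\sum_{i=1}^{n} x_i \sum_{i=1}^{n} y_i}{n}}{\sum_{i=1}^{n} x_i^2 - \dfrac{\left(\sum_{i=1}^{n} x_i\right)^2}{n}} \]

\[ \hat{\beta}_0 = \bar{y} - \hat{\beta}_1 \bar{x} \]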
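
The two estimates need only four sums: Σx, Σy, Σxy and Σx². A minimal sketch in plain Python follows; the function name fit_simple_regression and the data are made up for illustration.

```python
def fit_simple_regression(x, y):
    """Return (beta0_hat, beta1_hat) for the line y = beta0 + beta1 * x."""
    n = len(x)
    sum_x = sum(x)
    sum_y = sum(y)
    sum_xy = sum(xi * yi for xi, yi in zip(x, y))
    sum_x2 = sum(xi ** 2 for xi in x)

    # Slope: (sum xy - sum x * sum y / n) / (sum x^2 - (sum x)^2 / n)
    beta1_hat = (sum_xy - sum_x * sum_y / n) / (sum_x2 - sum_x ** 2 / n)
    # Intercept: y_bar - beta1_hat * x_bar
    beta0_hat = sum_y / n - beta1_hat * sum_x / n
    return beta0_hat, beta1_hat

# Hypothetical data, roughly following y = 2x.
x = [1, 2, 3, 4, 5]
y = [2.1, 3.9, 6.2, 7.8, 10.1]

beta0_hat, beta1_hat = fit_simple_regression(x, y)
print(round(beta0_hat, 4), round(beta1_hat, 4))

# Predicted value of y at a new x, using the estimated equation.
y_hat_at_6 = beta0_hat + beta1_hat * 6
```

When the points lie exactly on a line, the function returns that line's intercept and slope, which is a convenient way to sanity-check a hand calculation.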

Interpretation of the Regression Coefficients

\(\beta_0\) is the intercept of the regression line. It describes the expected value of the dependent variable y
when x is equal to zero.

\(\beta_1\) is the slope of the regression line. It describes the change in the expected value of the
dependent variable y for every one-unit change in x.

Coefficient of Determination

Coefficient of determination is one of the available tools for checking whether the regression
model is a good model or not. The coefficient of determination, written \(r^2\), is the square of the
correlation coefficient.
In general, the higher the value, the better the model fits the data. It indicates what
percentage of the variation in Y can be explained by the variation in the X variable. Suppose
\(r^2\) = 92.7%; this means that almost 93% of the variability of the dependent variable is explained by
the independent variable.
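
For simple linear regression, the square of the correlation coefficient equals the share of the variation in y explained by the fitted line, 1 - SSE/SST. The sketch below checks this on made-up data; SSE (sum of squared residuals) and SST (total sum of squares about the mean) are names introduced here for the illustration.

```python
import math

# Hypothetical data, roughly following y = 2x.
x = [1, 2, 3, 4, 5]
y = [2.1, 3.9, 6.2, 7.8, 10.1]

n = len(x)
x_bar, y_bar = sum(x) / n, sum(y) / n
s_xy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
s_xx = sum((xi - x_bar) ** 2 for xi in x)
s_yy = sum((yi - y_bar) ** 2 for yi in y)

# Square of Pearson's correlation coefficient.
r = s_xy / math.sqrt(s_xx * s_yy)
r_squared = r ** 2

# Least-squares line and the variation it leaves unexplained.
beta1_hat = s_xy / s_xx
beta0_hat = y_bar - beta1_hat * x_bar
sse = sum((yi - (beta0_hat + beta1_hat * xi)) ** 2 for xi, yi in zip(x, y))
sst = s_yy
explained_share = 1 - sse / sst

print(round(r_squared, 4), round(explained_share, 4))
```

The two printed numbers agree; multiplied by 100, either one gives the percentage reading used in the text.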

Difference between Correlation and Regression

Correlation                                    Regression
---------------------------------------------  ---------------------------------------------
It finds out the degree of relationship        It indicates the cause and effect relation
between two or more variables.                 between the variables and establishes a
                                               functional relation.

Correlation coefficient is independent of      Regression coefficients are independent of
change of origin and scale of measurement.     change of origin but not of scale of
                                               measurement.

Correlation coefficient lies between -1 and    Regression coefficients lie between -∞ and
+1, i.e. -1 ≤ r ≤ +1.                          +∞.
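
The second row of the comparison can be checked for the regression slope with a short sketch. The helper name slope and the data are made up, and the check covers the slope only; the intercept is not examined here.

```python
def slope(x, y):
    """Least-squares slope of y on x, computed from raw sums."""
    n = len(x)
    sum_x, sum_y = sum(x), sum(y)
    sum_xy = sum(xi * yi for xi, yi in zip(x, y))
    sum_x2 = sum(xi ** 2 for xi in x)
    return (sum_xy - sum_x * sum_y / n) / (sum_x2 - sum_x ** 2 / n)

# Hypothetical data.
x = [1.0, 2.0, 3.0, 4.0, 5.0]
y = [2.1, 3.9, 6.2, 7.8, 10.1]

b = slope(x, y)

# Change of origin: add a constant to x and subtract one from y.
b_shifted = slope([xi + 10 for xi in x], [yi - 4 for yi in y])

# Change of scale: express x in units 100 times smaller.
b_scaled = slope([100 * xi for xi in x], y)

print(b, b_shifted, b_scaled)
```

The shifted data give the same slope, while rescaling x by 100 divides the slope by 100.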

Exercise 1: Do the exercise given in class on finding the correlation coefficient.

Exercise 2: Do the exercise given in class on estimating the regression
coefficients.
