
ASSOCIATION BETWEEN VARIABLES
Lecture 5-Class Discussion Notes
BM & EBL Year 1
Kelebogile Kenalemang
OUTLINE
• Introduction
• Correlation analysis
• Scatter plots
• Correlation Coefficient
• Coefficient of determination, $r^2$
• Regression analysis
OBJECTIVES
• To be able to draw and interpret scatter diagrams
• To be able to calculate the correlation coefficient and coefficient of
determination.
• To understand and be able to use the least squares method to
estimate the regression line
Introduction
• This lecture introduces two techniques:
• Correlation, to measure the strength of the association between two variables, and
regression, to obtain the relationship between the two variables.
• For example, you might suspect that the cost of production depends
on the quantity produced, or that sales of a product are related to
its price.
Correlation Analysis
• The technique of correlation measures the strength of the association
between two variables.
• Two variables are said to be associated if a change in the value of one
variable is accompanied by a change in the value of the other variable.
• For example, we may want to determine the strength of association
between the cost of producing an item and its price, between
advertising and sales revenue, or between the number of deliveries and
the time taken to deliver.
Determining and Measuring Correlation
1. Scatter Diagrams
2. Pearson’s product moment correlation coefficient
Scatter plots/Scatter diagrams
• A scatter plot is simply a way of representing a set of bivariate data to
determine correlation between the two variables.
• The notion of dependence arises here. We should always plot on the y-axis
the variable which is likely to be dependent on, i.e. influenced by or
caused by, the other variable. This other variable, the independent
variable (cause), is plotted on the x-axis.
• The resulting pattern of the plotted points can give us a lot of
information about the association between the variables and the
suitability of the data set before we spend any time on calculations.
Degrees of Association/Correlation
• When we examine the scatter plot, we should consider several issues
before reaching a conclusion about the association between the
variables.
• The first issue is whether there is evidence of a pattern in
the plotted points.
1. Perfectly correlated (+,-), all the pairs of values lie on a straight line
to show an exact linear relationship between the two variables.
2. Partly/ Partially correlated
3. Uncorrelated
Pearson’s Product Moment Correlation Coefficient

• Pearson’s correlation coefficient is used to measure how strong the
relationship between two variables is.
• The value of the correlation coefficient indicates the sense (positive or negative)
and the strength of the association.
• The correlation coefficient is denoted by $r$ and always has a value between -1
and +1.
• A value close to +1 (0.9 for example) shows a strong, positive, linear association
between two factors.
• A value close to –1 (-0.9 for example) shows a strong, negative, linear association.
• When r is zero, there is no linear relationship between the variables. This does
not necessarily mean that there is no relationship of any kind.
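The last point can be checked numerically. The sketch below is illustrative only: the data are made up (not from the lecture) so that y depends exactly on x through y = x², and the helper name `pearson_r` is an assumption. It uses the standard raw-sums formula for Pearson's r.

```python
from math import sqrt

def pearson_r(xs, ys):
    """Pearson's r computed from the raw-sums formula."""
    n = len(xs)
    sum_x, sum_y = sum(xs), sum(ys)
    sum_xy = sum(x * y for x, y in zip(xs, ys))
    sum_x2 = sum(x * x for x in xs)
    sum_y2 = sum(y * y for y in ys)
    numerator = n * sum_xy - sum_x * sum_y
    denominator = sqrt((n * sum_x2 - sum_x ** 2) * (n * sum_y2 - sum_y ** 2))
    return numerator / denominator

# Made-up data: y is determined exactly by x, but not linearly.
xs = [-2, -1, 0, 1, 2]
ys = [x ** 2 for x in xs]

r_curve = pearson_r(xs, ys)
print(r_curve)  # the numerator is 0, so r is 0
```

Here r is zero even though y is completely determined by x, because the relationship is a curve rather than a straight line.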
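Formula for $r$

$$r = \frac{n\sum xy - \sum x \sum y}{\sqrt{\left[n\sum x^2 - \left(\sum x\right)^2\right]\left[n\sum y^2 - \left(\sum y\right)^2\right]}}$$

• Where x and y represent the paired values of the two variables
• n is the number of pairs of data used in the analysis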
Example 1:
• Calculate Pearson’s product moment correlation coefficient for the
data given below:
Units produced (x)    Production cost (y)
1                     5.0
2                     10.5
3                     15.5
4                     25.0
5                     16.0
6                     22.5
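$$r = \frac{n\sum xy - \sum x \sum y}{\sqrt{\left[n\sum x^2 - \left(\sum x\right)^2\right]\left[n\sum y^2 - \left(\sum y\right)^2\right]}}$$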
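As a check on the hand calculation, here is a minimal Python sketch that works through Example 1 step by step with the raw-sums formula for r. The variable names are illustrative assumptions; the data are those of Example 1.

```python
from math import sqrt

# Example 1 data: units produced (x) and production cost (y)
units = [1, 2, 3, 4, 5, 6]
cost = [5.0, 10.5, 15.5, 25.0, 16.0, 22.5]

n = len(units)                                   # 6 pairs
sum_x = sum(units)                               # 21
sum_y = sum(cost)                                # 94.5
sum_xy = sum(x * y for x, y in zip(units, cost)) # 387.5
sum_x2 = sum(x * x for x in units)               # 91
sum_y2 = sum(y * y for y in cost)                # 1762.75

numerator = n * sum_xy - sum_x * sum_y           # 340.5
denominator = sqrt((n * sum_x2 - sum_x ** 2) * (n * sum_y2 - sum_y ** 2))

r = numerator / denominator
print(round(r, 3))  # 0.819
```

A value of about 0.82 suggests a fairly strong positive linear association between units produced and production cost.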
Class exercise:
• Plot a scatter diagram for the data given below
• Calculate Pearson’s product moment correlation coefficient for the
data given below:
Table 1: Data of policy and overtime hours
Policy (X)    Overtime (Y)
150           10
300           20
100           10
400           40
350           30
500           35
Coefficient of Determination
• Before a regression can be used effectively as a predictor for the
dependent variable, it is necessary to decide how well it fits the data.
• The coefficient of determination measures the proportion of the total
variation in the dependent variable explained by the variation in the
independent variable.
• It is given by $r^2$, which is the square of the product moment correlation
coefficient.
• Calculate $r^2$ for the previous two examples
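The "proportion of variation explained" reading can be illustrated with a short Python sketch on the Example 1 data. As an assumption for illustration, it fits the least squares line using deviations from the means (an equivalent form of the usual formulas) and then divides the variation of the fitted values by the total variation in y. The variable names are hypothetical.

```python
# Example 1 data: units produced (x) and production cost (y)
units = [1, 2, 3, 4, 5, 6]
cost = [5.0, 10.5, 15.5, 25.0, 16.0, 22.5]

n = len(units)
mean_x = sum(units) / n   # 3.5
mean_y = sum(cost) / n    # 15.75

# Least squares line in deviation form (assumed equivalent form)
s_xy = sum((x - mean_x) * (y - mean_y) for x, y in zip(units, cost))
s_xx = sum((x - mean_x) ** 2 for x in units)
b = s_xy / s_xx
a = mean_y - b * mean_x
fitted = [a + b * x for x in units]

# Total variation in y, and the part reproduced by the fitted line
total_variation = sum((y - mean_y) ** 2 for y in cost)
explained_variation = sum((f - mean_y) ** 2 for f in fitted)

r_squared = explained_variation / total_variation
print(round(r_squared, 3))  # 0.671
```

So roughly 67% of the variation in production cost is explained by the variation in units produced, which agrees with squaring r ≈ 0.819.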
Linear Regression Analysis
• The technique of linear regression attempts to define the relationship
between two variables (dependent and independent) by means of a
linear equation.
• The linear regression technique focuses on estimating the line of best
fit between two variables.
• The least squares method of linear regression analysis uses the
formulas that follow to estimate how the dependent variable
changes with the independent variable.
The Linear Regression Model
• This model is given in the form:

$$y = a + bx$$

• Where y is the dependent variable
• x is the independent variable
• a is the constant (the intercept)
• b is the slope (gradient)
The values of a and b
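• The values of a and b that minimize the sum of squared errors are given by the
equations:

$$b = \frac{n\sum xy - \sum x \sum y}{n\sum x^2 - \left(\sum x\right)^2}, \qquad a = \frac{\sum y - b\sum x}{n}$$

• You can think of b as the slope of the regression line and a as the value of the
intercept on the y-axis.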
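A minimal Python sketch of the least squares calculation follows. It assumes the standard raw-sums formulas, b = (nΣxy − ΣxΣy) / (nΣx² − (Σx)²) and a = (Σy − bΣx) / n, and reuses the Example 1 data (units produced and production cost) purely for illustration. The function name `least_squares` is hypothetical.

```python
def least_squares(xs, ys):
    """Return (a, b) for the fitted line y = a + b*x."""
    n = len(xs)
    sum_x, sum_y = sum(xs), sum(ys)
    sum_xy = sum(x * y for x, y in zip(xs, ys))
    sum_x2 = sum(x * x for x in xs)
    b = (n * sum_xy - sum_x * sum_y) / (n * sum_x2 - sum_x ** 2)  # slope
    a = (sum_y - b * sum_x) / n                                   # intercept
    return a, b

# Example 1 data, reused here as an illustration
units = [1, 2, 3, 4, 5, 6]
cost = [5.0, 10.5, 15.5, 25.0, 16.0, 22.5]

a, b = least_squares(units, cost)
print(f"y = {a:.2f} + {b:.2f}x")  # y = 4.40 + 3.24x
```

On these data the slope says that each extra unit produced is associated with an increase of about 3.24 in production cost, and the intercept of 4.40 is the estimated cost when no units are produced.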
Example
• Calculate the regression line for the data given below;
Class exercise
• Suppose we have the following data about output and cost. Calculate
the regression line for the data:

Output (x) ('000 units)    Costs (Y) (P'000)
20                         82
16                         70
24                         90
22                         85
18                         73
Interpretation of a and b
• After calculating a and b, interpret these coefficients.
• Write the regression equation and plot the regression line on a scatter
diagram
