REGRESSION


STATISTICS TOPIC 2


JANUARY 1, 2023
STEVEN
REGRESSION
 Regression is the determination of a statistical relationship between two or
more variables.
 Regression is a technique for determining the statistical relationship
between two or more variables, where a change in the dependent variable is
associated with, and depends on, a change in one or more independent
variables.
INDEPENDENT AND DEPENDENT VARIABLES
INDEPENDENT VARIABLE (x)
 An independent variable is a variable that is manipulated to determine the
value of a dependent variable. Its own value is not influenced or controlled
by the other variables under study.
DEPENDENT VARIABLE (y)
 A dependent variable is a factor or phenomenon that changes in response to an
associated factor or phenomenon called the independent variable.

SCATTER DIAGRAM
 A scatter diagram is a suitable way to represent the relationship between
dependent and independent variables pictorially.
 The data are presented on a graph so that the relationship is easier to see.
 Each pair of numbers provides one point on the diagram.
Example: The following data relate to rainfall and subsequent crop yield
over five years.

                      Year 1   Year 2   Year 3   Year 4   Year 5
Rainfall in inches    4        2        5        7        8
Yield in tons         50       25       40       70       85

[Figure: Scatter Diagram. Horizontal axis: Rainfall in inches; vertical axis: Yields in tons]
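As a minimal sketch of how each pair of observations becomes one point on the scatter diagram, the rainfall and yield figures from the example can be paired up in Python. The variable names are illustrative assumptions, and the plotting itself is left to whatever charting tool is available.

```python
# Rainfall (inches) and crop yield (tons) for Year 1 to Year 5, from the example.
rainfall = [4, 2, 5, 7, 8]
yields = [50, 25, 40, 70, 85]

# Each (x, y) pair is one point on the scatter diagram:
# x = independent variable (rainfall), y = dependent variable (yield).
points = list(zip(rainfall, yields))

for year, (x, y) in enumerate(points, start=1):
    print(f"Year {year}: point ({x}, {y})")
```

Five years of data give five points, one per year.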

TYPES OF LINEAR REGRESSION

 Simple linear regression
 Multiple linear regression

SIMPLE LINEAR REGRESSION

 Simple linear regression is the process of determining the statistical relationship
between one dependent variable and only one independent variable.
 A simple linear regression is a line relating one dependent variable (y) to
only one independent variable (x), which may be expressed as
$$y = \beta_1 + \beta_2 x + \varepsilon$$
where β1 and β2 are constant parameters,
- β2 is the gradient or slope of the line,
- ε is called the error: the difference between the actual value and the estimated
value, ε = y − ŷ

Meaning of β1 and β2

$$y_i = \beta_1 + \beta_2 x_i + \varepsilon_i$$

β1 = intercept (the value of y when x = 0)

β2 = slope (= rise/run)

β2 > 0 is a positive slope

β2 < 0 is a negative slope

 We cannot compute the parameters β1 and β2 directly from the equation.

 By taking a paired sample of size n from the same population, we can estimate their values.

 The estimated parameters give what we call the line of best fit (sample regression line).

Linear regression model:
$$y_i = \beta_1 + \beta_2 x_i + \varepsilon_i$$
Line of best fit:
$$\hat{y} = \hat{\beta}_1 + \hat{\beta}_2 x$$

To estimate the values of β1 and β2 we use the following formulae:

$$\hat{\beta}_2 = \frac{n\sum xy - \sum x \sum y}{n\sum x^2 - \left(\sum x\right)^2}$$

$$\hat{\beta}_1 = \frac{\sum y - \hat{\beta}_2 \sum x}{n}$$
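The two estimation formulas above translate almost line for line into code. The following is a minimal sketch, not a definitive implementation: the function name `fit_line` and the four check points are hypothetical, chosen so that the data lie exactly on y = 1 + 2x.

```python
def fit_line(xs, ys):
    """Return (b1, b2): the estimated intercept and slope of the line of best fit."""
    n = len(xs)
    sum_x = sum(xs)
    sum_y = sum(ys)
    sum_xy = sum(x * y for x, y in zip(xs, ys))
    sum_x2 = sum(x * x for x in xs)

    # Slope: (n*sum(xy) - sum(x)*sum(y)) / (n*sum(x^2) - (sum(x))^2)
    b2 = (n * sum_xy - sum_x * sum_y) / (n * sum_x2 - sum_x ** 2)
    # Intercept: (sum(y) - b2*sum(x)) / n
    b1 = (sum_y - b2 * sum_x) / n
    return b1, b2


# Hypothetical data lying exactly on y = 1 + 2x, used only to check the formulas.
b1, b2 = fit_line([1, 2, 3, 4], [3, 5, 7, 9])
print(b1, b2)
```

Because the check points sit exactly on a line, the estimates recover its intercept 1 and slope 2. With real data the points scatter around the fitted line instead.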
[Figure: Scatter Diagram of Y against X]

EXAMPLE
 A company keeps extensive records on its sales people on the premise that
sales should increase with experience. A random sample of six new
sales people produces the data on experience and sales provided in the
table below.

Months on job (X)              2      4      6      8      12     14
Monthly sales (Tshs. '000')    2.4    7.0    8      11.3   15.0   18.0

a) Plot a scatter diagram and estimate the line of the best fit
b) Determine the linear regression model that exists between the two
variables.
c) Project the monthly sales for 9 months' experience on the job.
[Figure: Scatter Diagram. Horizontal axis: Months on Job; vertical axis: Monthly Sales]

       x      y       xy       x^2     y^2
       2      2.4     4.8      4       5.76
       4      7       28       16      49
       6      8       48       36      64
       8      11      88       64      121
       12     15      180      144     225
       14     18      252      196     324
Sum    46     61.4    600.8    460     788.76
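Parts (b) and (c) of the example can be worked directly from the column totals in the table. The sketch below assumes those totals as given (the table uses 11 for the fourth sales figure, so the totals are taken as printed); the variable names are illustrative.

```python
# Column totals taken from the table (n = 6 sales people).
n = 6
sum_x, sum_y, sum_xy, sum_x2 = 46, 61.4, 600.8, 460

# Estimated slope and intercept from the least-squares formulas.
b2 = (n * sum_xy - sum_x * sum_y) / (n * sum_x2 - sum_x ** 2)
b1 = (sum_y - b2 * sum_x) / n

# Part (c): projected monthly sales (Tshs '000) after 9 months on the job.
months = 9
projected = b1 + b2 * months

print(f"slope     = {b2:.4f}")
print(f"intercept = {b1:.4f}")
print(f"projected sales at {months} months = {projected:.2f}")
```

With these totals the fitted line is roughly ŷ = 0.94 + 1.21x, which projects monthly sales of about 11.85 (Tshs '000) at 9 months.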

CORRELATION
 Correlation is the determination of the degree of relationship between two or
more variables.
 Correlation analysis is the process of examining how strongly the
variables are related.
 The degree is measured by the coefficient of correlation, r, which can be determined by
different formulas, but we will see only two:
i. Karl Pearson's product-moment correlation coefficient
ii. Spearman's rank correlation coefficient

THE COEFFICIENT r IS DETERMINED BY


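$$r = \frac{n\sum xy - \sum x \sum y}{\sqrt{\left\{n\sum x^2 - \left(\sum x\right)^2\right\}\left\{n\sum y^2 - \left(\sum y\right)^2\right\}}}$$

$$r = \frac{S_{xy}}{\sqrt{S_{xx} S_{yy}}}, \qquad -1 \le r \le 1$$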
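The second form of r relies on S_xy, S_xx and S_yy, which these notes do not define. Assuming the usual corrected sums of squares and products, they can be written as:

```latex
S_{xy} = \sum xy - \frac{\sum x \sum y}{n}, \qquad
S_{xx} = \sum x^2 - \frac{\left(\sum x\right)^2}{n}, \qquad
S_{yy} = \sum y^2 - \frac{\left(\sum y\right)^2}{n}
```

Under that assumption, multiplying the numerator and the denominator of S_xy / √(S_xx S_yy) by n gives the first formula, so the two expressions for r agree.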
TYPES OF CORRELATION
1. PERFECT POSITIVE CORRELATION
 All points lie on a straight line that slopes upward
 Correlation coefficient, r = +1
 Called a perfect positive linear relationship

[Figure: Scatter Diagram of Y against X]

2. HIGH POSITIVE CORRELATION

 Many points lie on or close to a straight line that slopes upward
 Correlation coefficient, 0 < r < +1
 Called a high (or weak, moderate, strong) positive relationship
[Figure: Scatter Diagram of Y against X]

3. PERFECT NEGATIVE CORRELATION

 All points lie on a straight line that slopes downward
 Correlation coefficient, r = -1
 Called a perfect negative linear relationship


4. HIGH NEGATIVE CORRELATION

 Many points lie on or close to a straight line that slopes downward
 Correlation coefficient, -1 < r < 0
 Called a high (or weak, moderate, strong) negative relationship
[Figure: Scatter Diagram of Y against X]
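The four types above differ only in where r falls between -1 and +1. As a rough sketch, the mapping can be written as a small function. The name `describe_correlation` and the tolerance are assumptions, and the r = 0 case is not one of the four listed types, so it is labelled separately here. No cut-offs for weak, moderate or strong are given in these notes, so none are invented.

```python
import math


def describe_correlation(r, tol=1e-9):
    """Map a correlation coefficient r to one of the listed types."""
    if not (-1 - tol <= r <= 1 + tol):
        raise ValueError("r must lie between -1 and +1")
    if math.isclose(r, 1.0, abs_tol=tol):
        return "perfect positive"
    if math.isclose(r, -1.0, abs_tol=tol):
        return "perfect negative"
    if r > 0:
        return "positive (0 < r < +1)"
    if r < 0:
        return "negative (-1 < r < 0)"
    # r == 0 is not among the four listed types (assumption: no linear correlation).
    return "no linear correlation"


for value in (1.0, 0.6, -0.4, -1.0):
    print(value, "->", describe_correlation(value))
```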

EXAMPLE
 The following data, obtained from claims drawn on life assurance
policies for a particular category of employment, relate age at
official retirement to age at death for nine males.

Age at retirement    57   62   60   57   65   60   58   62   56
Age at death         71   70   66   70   69   67   69   63   70

 Calculate the product-moment coefficient of correlation between
the age at retirement and the age at death.
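One way to carry out this calculation is to apply the product-moment formula directly to the nine pairs. The sketch below is illustrative only; the function name `pearson_r` and the variable names are assumptions.

```python
import math

# Data from the example: age at retirement (x) and age at death (y).
retire_age = [57, 62, 60, 57, 65, 60, 58, 62, 56]
death_age = [71, 70, 66, 70, 69, 67, 69, 63, 70]


def pearson_r(xs, ys):
    """Product-moment correlation coefficient from the raw sums."""
    n = len(xs)
    sum_x, sum_y = sum(xs), sum(ys)
    sum_xy = sum(x * y for x, y in zip(xs, ys))
    sum_x2 = sum(x * x for x in xs)
    sum_y2 = sum(y * y for y in ys)
    numerator = n * sum_xy - sum_x * sum_y
    denominator = math.sqrt(
        (n * sum_x2 - sum_x ** 2) * (n * sum_y2 - sum_y ** 2)
    )
    return numerator / denominator


r = pearson_r(retire_age, death_age)
print(f"r = {r:.4f}")
```

For these nine observations the sketch gives r of about -0.41, which falls in the -1 < r < 0 range.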

SPEARMAN RANK CORRELATION

 Spearman rank correlation measures the degree of association
between two variables.
 It shows whether the variables concerned have some association.
 Spearman rank correlation coefficient, r:
$$r = 1 - \frac{6\sum d^2}{n(n^2 - 1)}, \qquad -1 \le r \le 1$$
where d is the difference between the paired rankings
of the two variables and
n is the number of pairs of rankings
EXAMPLE
 The manager of a company with ten operating plants of similar size
producing small components has observed the following pattern
of expenditure on inspection and defective parts delivered to the
customer.
Observation   Inspection expenditure   Defective parts per
              (Tshs '000)              1000 units delivered
1             25                       50
2             30                       35
3             15                       60
4             75                       15
5             40                       46
6             65                       20
7             45                       28
8             24                       45
9             35                       42
10            70                       22
Find the rank correlation between the inspection expenditure and the number of
defective units.
x     Rank   y     Rank   d     d^2
15    10     60    1      9     81
24    9      45    4      5     25
25    8      50    2      6     36
30    7      35    6      1     1
35    6      42    5      1     1
40    5      46    3      2     4
45    4      28    7      -3    9
65    3      20    9      -6    36
70    2      22    8      -6    36
75    1      15    10     -9    81
                          Sum = 310
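The ranking table can be reproduced and the coefficient completed with a short sketch. It assumes the same convention as the table (rank 1 for the largest value) and that there are no tied values, which holds for this data set. The helper name `rank_descending` is hypothetical.

```python
# Data from the example, in observation order 1 to 10.
expenditure = [25, 30, 15, 75, 40, 65, 45, 24, 35, 70]
defectives = [50, 35, 60, 15, 46, 20, 28, 45, 42, 22]


def rank_descending(values):
    """Rank 1 = largest value. Assumes no ties, as in this data set."""
    ordered = sorted(values, reverse=True)
    return [ordered.index(v) + 1 for v in values]


rank_x = rank_descending(expenditure)
rank_y = rank_descending(defectives)

n = len(expenditure)
# d is the difference between the paired ranks; only d^2 enters the formula.
sum_d2 = sum((rx - ry) ** 2 for rx, ry in zip(rank_x, rank_y))
r = 1 - 6 * sum_d2 / (n * (n ** 2 - 1))

print(f"sum of d^2 = {sum_d2}")
print(f"r = {r:.4f}")
```

The sum of d^2 matches the 310 in the table, and substituting it with n = 10 gives r of about -0.88, so higher inspection expenditure goes with fewer defective parts delivered.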
