Regression Analysis 1 2020
BUSINESS STATISTICS I
MTU 07203
[Figure: chart with axis label "Rainfall in Inch"]
Types of Linear Regression
Linear regression analysis is classified into two types:
Simple linear regression
The relationship between one dependent variable and
only one independent (explanatory) variable.
Multiple linear regression
The relationship between one dependent variable and
more than one independent (explanatory) variable.
Simple Linear Regression
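Linear regression is the statistical process of
determining the line of best fit.
A simple linear regression model is any line connecting a
dependent variable (y) and only one independent
variable (x), and may be expressed as
$$y_i = \beta_1 + \beta_2 x_i + \varepsilon_i, \quad i = 1, \dots, n$$
where $\beta_1$ is the y-intercept and $\beta_2$ is the slope (= rise/run).
[Figure: the line $y = \beta_1 + \beta_2 x + \varepsilon$, showing the y-intercept $\beta_1$ and the slope $\beta_2$ as rise over run]
The fitted line is
$$\hat{y}_i = \hat{\beta}_1 + \hat{\beta}_2 x_i, \quad i = 1, \dots, n$$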
Consider the Scatter diagram
[Figure: scatter diagram of Y against X]
Which line has the best “fit” to the data?
[Figure: scatter diagram of Y against X with candidate lines marked "?"]
Estimation of least squares
Using the method of least squares we get $\hat{\beta}_1$ and $\hat{\beta}_2$
as the estimates of $\beta_1$ and $\beta_2$.
Least Squares Graphically
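LS minimizes
$$\sum_{i=1}^{n} \hat{\varepsilon}_i^2 = \hat{\varepsilon}_1^2 + \hat{\varepsilon}_2^2 + \hat{\varepsilon}_3^2 + \hat{\varepsilon}_4^2$$
[Figure: four data points and the fitted line $\hat{Y}_i = \hat{\beta}_1 + \hat{\beta}_2 X_i$, with the residuals $\hat{\varepsilon}_1, \dots, \hat{\varepsilon}_4$ marked; for example, $Y_2 = \hat{\beta}_1 + \hat{\beta}_2 X_2 + \hat{\varepsilon}_2$]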
Least Squares
1. "Best fit" means the differences between the
actual Y values and the predicted Y values are
a minimum. But positive differences offset
negative ones, so we square the errors:
$$\sum_{i=1}^{n} \left(Y_i - \hat{Y}_i\right)^2 = \sum_{i=1}^{n} \hat{\varepsilon}_i^2$$
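where the residual is
$$e_i = y_i - \hat{y}_i = y_i - \hat{\beta}_1 - \hat{\beta}_2 x_i$$
Let $q = \sum_{i=1}^{n} e_i^2$; then the method involves solving for the values of
$\hat{\beta}_1$ and $\hat{\beta}_2$ that minimize $q$, which gives
$$\hat{\beta}_2 = \frac{\sum (x_i - \bar{x})(y_i - \bar{y})}{\sum (x_i - \bar{x})^2} = \frac{n \sum xy - \sum x \sum y}{n \sum x^2 - \left(\sum x\right)^2}$$
and
$$\hat{\beta}_1 = \bar{y} - \hat{\beta}_2 \bar{x} = \frac{\sum y - \hat{\beta}_2 \sum x}{n}$$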
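The minimization step can be spelled out as follows. This is the standard calculus argument, given here as a sketch in the same symbols ($q$, $\hat{\beta}_1$, $\hat{\beta}_2$); the slides themselves show only the result.

```latex
\begin{align*}
\frac{\partial q}{\partial \hat{\beta}_1}
  &= -2 \sum_{i=1}^{n} \left(y_i - \hat{\beta}_1 - \hat{\beta}_2 x_i\right) = 0
  \;\Longrightarrow\; \sum y = n \hat{\beta}_1 + \hat{\beta}_2 \sum x \\
\frac{\partial q}{\partial \hat{\beta}_2}
  &= -2 \sum_{i=1}^{n} x_i \left(y_i - \hat{\beta}_1 - \hat{\beta}_2 x_i\right) = 0
  \;\Longrightarrow\; \sum xy = \hat{\beta}_1 \sum x + \hat{\beta}_2 \sum x^2 \\
\text{From the first equation:}\quad
  \hat{\beta}_1 &= \frac{\sum y - \hat{\beta}_2 \sum x}{n} \\
\text{Substituting into the second:}\quad
  n \sum xy - \sum x \sum y &= \hat{\beta}_2 \left( n \sum x^2 - \Big(\sum x\Big)^2 \right)
\end{align*}
```

Dividing the last line by $n \sum x^2 - (\sum x)^2$ gives the slope formula, and the third line gives the intercept.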
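By defining the following
$$S_{xx} = \sum x^2 - \frac{1}{n}\left(\sum x\right)^2, \qquad S_{yy} = \sum y^2 - \frac{1}{n}\left(\sum y\right)^2, \qquad S_{xy} = \sum xy - \frac{1}{n} \sum x \sum y$$
one can simplify
$$\hat{\beta}_2 = \frac{S_{xy}}{S_{xx}}$$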
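As an illustration, the $S_{xx}$ and $S_{xy}$ formulas translate directly into a few lines of Python. This is a minimal sketch, not part of the original slides: the function name `fit_line` and the four sample points are hypothetical, chosen so that the points lie exactly on y = 1 + 2x.

```python
def fit_line(xs, ys):
    """Return (b1, b2): intercept and slope of the least-squares line."""
    n = len(xs)
    if n != len(ys) or n < 2:
        raise ValueError("need two equal-length samples with at least 2 points")
    sum_x, sum_y = sum(xs), sum(ys)
    # S_xx = sum(x^2) - (sum x)^2 / n ;  S_xy = sum(xy) - (sum x)(sum y) / n
    s_xx = sum(x * x for x in xs) - sum_x ** 2 / n
    s_xy = sum(x * y for x, y in zip(xs, ys)) - sum_x * sum_y / n
    if s_xx == 0:
        raise ValueError("all x values are identical; the slope is undefined")
    b2 = s_xy / s_xx                   # slope
    b1 = sum_y / n - b2 * sum_x / n    # intercept = y-bar - b2 * x-bar
    return b1, b2


# Hypothetical points lying exactly on y = 1 + 2x
b1, b2 = fit_line([0, 1, 2, 3], [1, 3, 5, 7])
print(b1, b2)  # 1.0 2.0
```

Because the sample points sit exactly on a line, the fit recovers the intercept 1 and slope 2, which is a quick sanity check on the formulas.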
The method of least squares …
The line of best fit can be roughly estimated from a
scatter diagram plotted using the paired (x, y) values.
Application
The purpose of linear regression is to
develop a mathematical relationship (model)
between variables that can be used to predict
the value of one variable if the value of
another variable is known
Example
A company keeps extensive records on its salespeople on
the premise that sales should increase with experience. A
random sample of six new salespeople produces data on the
experience and sales provided in the table below.
a) Plot a scatter diagram and estimate the line of best fit.
b) Determine the linear regression model that exists
between the two variables.
c) Project the monthly sales for 9 months of experience on the job.
Scatter Diagram
[Figure: scatter diagram of Monthly Sales (vertical axis) against Months on Jobs (horizontal axis)]
• Summary data
         x       y       xy      x²       y²
         2     2.4      4.8       4     5.76
         4     7       28        16    49
         6     8       48        36    64
         8    11       88        64   121
        12    15      180       144   225
        14    18      252       196   324
Sum     46    61.4    600.8     460   788.76
From the table we find that
$$\sum x = 46, \quad \sum y = 61.4, \quad \sum xy = 600.8, \quad \sum x^2 = 460, \quad \sum y^2 = 788.76$$
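Now $\hat{y} = \hat{\beta}_1 + \hat{\beta}_2 x$.
Then,
$$\hat{\beta}_2 = \frac{n \sum xy - \sum x \sum y}{n \sum x^2 - \left(\sum x\right)^2}, \qquad \hat{\beta}_1 = \frac{\sum y - \hat{\beta}_2 \sum x}{n}$$
$$\hat{\beta}_2 = \frac{6(600.8) - 46(61.4)}{6(460) - 46^2} = \frac{780.4}{644} = 1.21$$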
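The arithmetic for this example can be checked with a short script. This is a sketch using the six (x, y) pairs from the summary table; the variable names are illustrative, and the sales units are whatever the original table used. It also evaluates the fitted line at x = 9 for part (c).

```python
x = [2, 4, 6, 8, 12, 14]       # months on the job
y = [2.4, 7, 8, 11, 15, 18]    # monthly sales (units as in the source table)

n = len(x)
sum_x, sum_y = sum(x), sum(y)
sum_xy = sum(a * b for a, b in zip(x, y))
sum_x2 = sum(a * a for a in x)

# Slope and intercept from the sum formulas
slope = (n * sum_xy - sum_x * sum_y) / (n * sum_x2 - sum_x ** 2)
intercept = (sum_y - slope * sum_x) / n

# Part (c): projected monthly sales after 9 months on the job
predicted_9 = intercept + slope * 9

print(round(slope, 2), round(intercept, 2), round(predicted_9, 2))
# 1.21 0.94 11.85
```

Carrying full precision through the calculation gives a slope of about 1.21 and an intercept of about 0.94, so the projected monthly sales at 9 months are about 11.85.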
We also use the correlation coefficient
$$r = \frac{S_{xy}}{\sqrt{S_{xx} S_{yy}}}$$
which can take any value in the range $-1 \le r \le 1$.
[Figure: scatter diagrams of Y against X; the first panel is titled "Perfect Positive correlation"]
Solution
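Referring to the data,
$$S_{xx} = 107.3, \quad S_{yy} = 160.4, \quad S_{xy} = 130.1, \quad \hat{\beta}_1 = 0.94, \quad \hat{\beta}_2 = 1.21$$
$$r = \frac{S_{xy}}{\sqrt{S_{xx} S_{yy}}} = \frac{130.1}{\sqrt{107.3 \times 160.4}} = 0.99$$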
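The correlation coefficient for the sales example can be reproduced in the same way. The sketch below assumes the six (x, y) pairs from the summary table; `pearson_r` is an illustrative name, not something from the slides.

```python
from math import sqrt


def pearson_r(xs, ys):
    """Correlation coefficient r = S_xy / sqrt(S_xx * S_yy)."""
    n = len(xs)
    sum_x, sum_y = sum(xs), sum(ys)
    s_xx = sum(v * v for v in xs) - sum_x ** 2 / n
    s_yy = sum(v * v for v in ys) - sum_y ** 2 / n
    s_xy = sum(a * b for a, b in zip(xs, ys)) - sum_x * sum_y / n
    return s_xy / sqrt(s_xx * s_yy)


r = pearson_r([2, 4, 6, 8, 12, 14], [2.4, 7, 8, 11, 15, 18])
print(round(r, 2))  # 0.99
```

A value this close to 1 indicates a strong positive linear relationship between months on the job and monthly sales.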
Age at retirement  57 62 60 57 65 60 58 62 56
Age at death       71 70 66 70 69 67 69 63 70
$$r = 1 - \frac{6 \sum d^2}{n(n^2 - 1)}, \qquad -1 \le r \le 1$$
$$r = 1 - \frac{6(310)}{10(100 - 1)}$$
$$r = -0.88$$
Two commentators gave ratings out of
100 for sports personalities. The ratings
are shown in the table below.
Personality A B C D E F G
Commentator I 73 76 78 65 86 82 91
Commentator II 77 78 79 80 86 89 95
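The question attached to this table is not shown. Assuming it asks for the rank correlation between the two commentators, the formula $r = 1 - 6\sum d^2 / (n(n^2 - 1))$ can be applied as in the sketch below. The helper names `ranks`, `rho_from_d2` and `spearman` are illustrative, and the ranking helper assumes no tied ratings, which holds for this table.

```python
def ranks(values):
    """Rank values from 1 (smallest) upward; this sketch assumes no ties."""
    if len(set(values)) != len(values):
        raise ValueError("tied values need average ranks, not handled here")
    order = sorted(values)
    return [order.index(v) + 1 for v in values]


def rho_from_d2(sum_d2, n):
    """Rank correlation from the sum of squared rank differences."""
    return 1 - 6 * sum_d2 / (n * (n * n - 1))


def spearman(xs, ys):
    """Rank both samples, then apply the sum-of-d^2 formula."""
    sum_d2 = sum((a - b) ** 2 for a, b in zip(ranks(xs), ranks(ys)))
    return rho_from_d2(sum_d2, len(xs))


commentator_1 = [73, 76, 78, 65, 86, 82, 91]
commentator_2 = [77, 78, 79, 80, 86, 89, 95]
rho = spearman(commentator_1, commentator_2)
print(rho)  # 0.75

# The same formula with sum d^2 = 310 and n = 10
print(round(rho_from_d2(310, 10), 2))  # -0.88
```

Here the squared rank differences sum to 14, so r = 1 - 84/336 = 0.75, which suggests the two commentators broadly agree in how they order the personalities.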