Simple Regression Analysis

Pravat Uprety
Simple Linear Regression Equation (Prediction Line)
The simple linear regression equation provides an estimate of the population regression line:
$$\hat{Y}_i = b_0 + b_1 X_i$$
where
• Ŷi = estimated (or predicted) Y value for observation i
• b0 = estimate of the regression intercept
• b1 = estimate of the regression slope
• Xi = value of X for observation i
The individual random error terms ei have a mean of zero.
06/09/2021 Prepared by Pravat Uprety

Computation of slope (b1) and Y-intercept (b0)
• By using the least squares method, the values of b0 and b1 are obtained as
$$b_1 = \frac{n \sum XY - (\sum X)(\sum Y)}{n \sum X^2 - (\sum X)^2}$$
$$b_0 = \frac{\sum Y}{n} - b_1 \frac{\sum X}{n}$$
Then the estimating equation becomes
$$\hat{Y} = b_0 + b_1 X$$
Example
Hhno    Income in 000 (X)   Expenditure in 000 (Y)   XY      X²      Y²
1       18                  10                       180     324     100
2       20                  12                       240     400     144
3       20                  15                       300     400     225
4       25                  15                       375     625     225
5       28                  17                       476     784     289
6       30                  20                       600     900     400
Total   141                 89                       2171    3433    1383

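Substituting the totals into the least squares formulas gives
b1 = (6 × 2171 − 141 × 89) / (6 × 3433 − 141²) = 477/717 = 0.665
b0 = 89/6 − 0.665 × 141/6 = −0.794
The estimating equation is
$$\hat{Y} = -0.794 + 0.665X$$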
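The least squares computation above can be checked with a short script. This is a minimal sketch using only the Python standard library; the function name `least_squares` and the variable names are illustrative choices, not part of the original slides.

```python
# Household data from the example table (both in thousands).
income = [18, 20, 20, 25, 28, 30]        # X
expenditure = [10, 12, 15, 15, 17, 20]   # Y


def least_squares(x, y):
    """Return (b0, b1) for the line y = b0 + b1*x using the sums formulas."""
    n = len(x)
    sum_x, sum_y = sum(x), sum(y)
    sum_xy = sum(xi * yi for xi, yi in zip(x, y))
    sum_x2 = sum(xi ** 2 for xi in x)
    # Slope: (n*SumXY - SumX*SumY) / (n*SumX^2 - (SumX)^2)
    b1 = (n * sum_xy - sum_x * sum_y) / (n * sum_x2 - sum_x ** 2)
    # Intercept: mean of Y minus slope times mean of X
    b0 = sum_y / n - b1 * sum_x / n
    return b0, b1


b0, b1 = least_squares(income, expenditure)
print(f"b0 = {b0:.3f}, b1 = {b1:.3f}")
```

Carried at full precision the intercept comes out near −0.801; the slide's −0.794 results from rounding b1 to 0.665 before computing b0.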
The standard error of estimate is
$$S_{YX} = \sqrt{\frac{\sum Y^2 - b_0 \sum Y - b_1 \sum XY}{n - 2}}$$
$$= \sqrt{\frac{1383 - (-0.794)(89) - 0.665(2171)}{6 - 2}} = 1.57$$
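As a cross-check, the shortcut formula for the standard error of estimate can be compared with the direct sum of squared residuals. The sketch below assumes the six-household data from the example; all variable names are illustrative.

```python
import math

income = [18, 20, 20, 25, 28, 30]        # X
expenditure = [10, 12, 15, 15, 17, 20]   # Y
n = len(income)

sum_x, sum_y = sum(income), sum(expenditure)
sum_xy = sum(x * y for x, y in zip(income, expenditure))
sum_x2 = sum(x ** 2 for x in income)
sum_y2 = sum(y ** 2 for y in expenditure)

# Least squares coefficients, kept at full precision.
b1 = (n * sum_xy - sum_x * sum_y) / (n * sum_x2 - sum_x ** 2)
b0 = sum_y / n - b1 * sum_x / n

# Shortcut form used on the slide: SumY^2 - b0*SumY - b1*SumXY
sse_shortcut = sum_y2 - b0 * sum_y - b1 * sum_xy
# Direct form: sum of squared residuals (Y - Y_hat)^2
sse_direct = sum((y - (b0 + b1 * x)) ** 2 for x, y in zip(income, expenditure))

s_yx = math.sqrt(sse_shortcut / (n - 2))
print(f"SSE (shortcut) = {sse_shortcut:.4f}, SSE (direct) = {sse_direct:.4f}")
print(f"Syx = {s_yx:.4f}")
```

The two sums of squares agree because b0 and b1 satisfy the least squares normal equations. With unrounded coefficients Syx is about 1.577, which the slide reports as 1.57.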

Coefficient of determination
$$R^2 = \frac{b_0 \sum Y + b_1 \sum XY - n\bar{Y}^2}{\sum Y^2 - n\bar{Y}^2}$$
For the previous example
$$R^2 = \frac{-0.794 \times 89 + 0.665 \times 2171 - 6 \times (14.83)^2}{1383 - 6 \times (14.83)^2} = \frac{53.47}{63.42} = 0.8431$$
84.31% of the variation in expenditure (Y) is explained by income (X).
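The coefficient of determination can also be computed as the ratio of the regression sum of squares to the total sum of squares. The following is a sketch under that definition, using the example data; the variable names are assumptions made for illustration.

```python
income = [18, 20, 20, 25, 28, 30]        # X
expenditure = [10, 12, 15, 15, 17, 20]   # Y
n = len(income)

mean_x = sum(income) / n
mean_y = sum(expenditure) / n

# Corrected sums of squares and cross-products.
s_xx = sum((x - mean_x) ** 2 for x in income)
s_yy = sum((y - mean_y) ** 2 for y in expenditure)
s_xy = sum((x - mean_x) * (y - mean_y) for x, y in zip(income, expenditure))

b1 = s_xy / s_xx
b0 = mean_y - b1 * mean_x

# Regression (explained) and total sums of squares.
ssr = sum((b0 + b1 * x - mean_y) ** 2 for x in income)
sst = s_yy

r_squared = ssr / sst
print(f"R^2 = {r_squared:.4f}")
```

At full precision this gives roughly 0.842; the slide's 0.8431 reflects the rounded values of b0, b1 and the mean of Y used in the hand calculation.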

Inferences about population slope
• Population slope (regression coefficient) = β1 (parameter)
• Sample slope = b1 (statistic/estimator)
• Standard error of estimate = Syx
• Standard error of the slope or regression coefficient (Sb1):
$$S_{b_1} = \frac{S_{YX}}{\sqrt{\sum (X - \bar{X})^2}} = \frac{S_{YX}}{\sqrt{\sum X^2 - n\bar{X}^2}}$$
Confidence interval estimate for the population slope
• The 100(1 − α)% confidence interval estimate for the population slope is
$$b_1 \pm t_{n-2,\,\alpha}\, S_{b_1}$$
where tn-2, α is the two-tailed tabulated t value with n − 2 degrees of freedom.
Taking the -ve sign gives the lower limit (LL).
Taking the +ve sign gives the upper limit (UL).
Prob (LL ≤ β1 ≤ UL) = (1 − α)

Example
• For the previous example we have
n = 6, b1 = 0.665, Syx = 1.57, ∑X = 141 and ∑X² = 3433
The standard error of the regression coefficient is
$$S_{b_1} = \frac{S_{YX}}{\sqrt{\sum X^2 - n\bar{X}^2}} = \frac{1.57}{\sqrt{3433 - 6\left(\frac{141}{6}\right)^2}} = 0.1436$$
• Now the 95% confidence interval estimate is
b1 ± t6-2, 0.05 Sb1
= 0.665 ± 2.776 × 0.1436
= 0.665 ± 0.3986
Taking the -ve sign, lower limit = 0.2664
Taking the +ve sign, upper limit = 1.0636
The 95% confidence interval estimate is
Prob (0.2664 ≤ β1 ≤ 1.0636) = 95%
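The interval above can be reproduced from the summary figures alone. In this sketch the tabulated value 2.776 is taken from the slide rather than computed, and the variable names are illustrative.

```python
import math

# Summary figures quoted in the example.
n = 6
b1 = 0.665
s_yx = 1.57
sum_x = 141
sum_x2 = 3433
t_tab = 2.776  # tabulated two-tailed t for 4 df at alpha = 0.05 (from the slide)

mean_x = sum_x / n

# Standard error of the slope: Syx / sqrt(SumX^2 - n * mean_x^2)
s_b1 = s_yx / math.sqrt(sum_x2 - n * mean_x ** 2)

margin = t_tab * s_b1
lower, upper = b1 - margin, b1 + margin
print(f"Sb1 = {s_b1:.4f}")
print(f"95% CI for the slope: ({lower:.4f}, {upper:.4f})")
```

The limits agree with the hand calculation to about three decimal places; small differences in the fourth decimal come from rounding Sb1 to 0.1436 before multiplying.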

Hypothesis testing for population slope (regression coefficient)
• Case (a) If no past value is given, or zero is given
Null Hypothesis (H0): β1 = 0 (i.e. there is no significant relationship between Y and X)
Or
Null Hypothesis (H0): β1 ≥ 0 (i.e. there is no significant -ve relationship between Y and X)
Or
Null Hypothesis (H0): β1 ≤ 0 (i.e. there is no significant +ve relationship between Y and X)
Alternative Hypothesis (H1): β1 ≠ 0 (i.e. there is a significant relationship between Y and X)
Or
Alternative Hypothesis (H1): β1 < 0 (i.e. there is a significant -ve relationship between Y and X)
Or
Alternative Hypothesis (H1): β1 > 0 (i.e. there is a significant +ve relationship between Y and X)
• Test statistic
$$t = \frac{b_1 - \beta_1}{S_{b_1}} = \frac{b_1}{S_{b_1}}$$
Calculated t = |t|
Tabulated t = tn-2, α
Decision
If Cal value ≤ Tab value, we do not reject H0.
If Cal value > Tab value, we reject H0.
Hypothesis testing for the previous example (Test of significance)
• Null Hypothesis (H0): β1 = 0
There is no significant relationship between income (X) and expenditure (Y).
Alternative Hypothesis (H1): β1 ≠ 0
There is a significant relationship between income (X) and expenditure (Y).
• Test statistic
$$t = \frac{b_1 - \beta_1}{S_{b_1}} = \frac{b_1}{S_{b_1}} = \frac{0.665}{0.1436} = 4.63$$
Calculated t = |t| = 4.63
Tabulated t = tn-2, α = t4, 0.05 = 2.776
Decision
Here, Cal value (4.63) > Tab value (2.776), so we reject H0.
There is a significant relationship between income (X) and expenditure (Y).
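The decision rule for the two-tailed test can be written as a small helper. This is a sketch only: the function name `slope_t_test` and its parameters are hypothetical, and the tabulated t value is supplied by the caller from a t table, as on the slide.

```python
def slope_t_test(b1, s_b1, t_tab, beta_null=0.0):
    """Two-tailed t test for the slope.

    Returns the t statistic and True when H0: beta1 = beta_null is rejected,
    i.e. when the calculated |t| exceeds the tabulated t.
    """
    t = (b1 - beta_null) / s_b1
    reject = abs(t) > t_tab
    return t, reject


# Figures from the example: b1 = 0.665, Sb1 = 0.1436, t(4, 0.05) = 2.776
t_stat, reject_h0 = slope_t_test(b1=0.665, s_b1=0.1436, t_tab=2.776)
print(f"t = {t_stat:.2f}, reject H0: {reject_h0}")
# prints: t = 4.63, reject H0: True
```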
• Case (b) If a past value is given
Null Hypothesis (H0): β1 = given value (i.e. the population slope has not significantly changed from its past value)
Or
Null Hypothesis (H0): β1 ≥ given value (i.e. the population slope has not significantly decreased from its past value)
Or
Null Hypothesis (H0): β1 ≤ given value (i.e. the population slope has not significantly increased from its past value)
Alternative Hypothesis (H1): β1 ≠ given value (i.e. the population slope has significantly changed from its past value)
Or
Alternative Hypothesis (H1): β1 < given value (i.e. the population slope has significantly decreased from its past value)
Or
Alternative Hypothesis (H1): β1 > given value (i.e. the population slope has significantly increased from its past value)
• Test statistic
$$t = \frac{b_1 - \beta_1}{S_{b_1}}$$
Calculated t = |t|
Tabulated t = tn-2, α
Decision
If Cal value ≤ Tab value, we do not reject H0.
If Cal value > Tab value, we reject H0.
Hypothesis testing for the previous example (Test of significant change from its past value of 0.85)
• Null Hypothesis (H0): β1 = 0.85
The population slope has not significantly changed from its past value of 0.85.
Alternative Hypothesis (H1): β1 ≠ 0.85
The population slope has significantly changed from its past value of 0.85.
• Test statistic
$$t = \frac{b_1 - \beta_1}{S_{b_1}} = \frac{0.665 - 0.85}{0.1436} = -1.288$$
Calculated t = |t| = 1.288
Tabulated t = tn-2, α = t4, 0.05 = 2.776
Decision
Here, Cal value (1.288) < Tab value (2.776), so we do not reject H0.
The population slope has not significantly changed from its past value of 0.85.
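For a two-tailed test at the same α, rejecting H0 is equivalent to the hypothesised slope falling outside the confidence interval for β1. The sketch below illustrates this with the example figures for both hypothesised values, 0 and 0.85; the function name `two_tailed_decision` is a hypothetical helper.

```python
# Figures from the example.
b1 = 0.665
s_b1 = 0.1436
t_tab = 2.776  # tabulated two-tailed t for 4 df at alpha = 0.05

# 95% confidence interval for the population slope.
lower = b1 - t_tab * s_b1
upper = b1 + t_tab * s_b1


def two_tailed_decision(beta_null):
    """Return (t, reject by t test, reject because beta_null is outside the CI)."""
    t = (b1 - beta_null) / s_b1
    reject_by_t = abs(t) > t_tab
    reject_by_ci = not (lower <= beta_null <= upper)
    return t, reject_by_t, reject_by_ci


for beta_null in (0.0, 0.85):
    t, by_t, by_ci = two_tailed_decision(beta_null)
    print(f"H0: beta1 = {beta_null}: t = {t:.3f}, "
          f"reject by t test: {by_t}, outside CI: {by_ci}")
```

The value 0 lies outside the interval (0.2664, 1.0636) and is rejected, while 0.85 lies inside it and is not, matching both hand-worked tests.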
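Estimating Mean Values and Predicting Individual Values
Goal: Form intervals around Y to express uncertainty about the value of Y for a given Xi
• Confidence interval for the mean of Y, given Xi
• Prediction interval for an individual Y, given Xi
[Figure: the fitted line Ŷ = b0 + b1Xi plotted against X, with the confidence interval for the mean of Y and the prediction interval for an individual Y marked at a given Xi]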
Example
Hhno    Income in 000 (X)   Expenditure in 000 (Y)
1       18                  10
2       20                  12
3       20                  15
4       25                  15
5       28                  17
6       30                  20
For X = 29, Y = ? Find the lower and upper limits (LL and UL) of the confidence interval for the mean of Y and of the prediction interval for an individual Y.

Confidence Interval for the Average/Mean Y, Given X
Confidence interval estimate for the mean value of Y given a particular Xi
Confidence interval for μY|X=Xi:
$$\hat{Y} \pm t_{n-2,\,\alpha}\, S_{YX} \sqrt{h_i}$$
Size of interval varies according to distance away from the mean, X̄
$$h_i = \frac{1}{n} + \frac{(X_i - \bar{X})^2}{\sum (X_i - \bar{X})^2} = \frac{1}{n} + \frac{(X_i - \bar{X})^2}{\sum X^2 - n\bar{X}^2}$$
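To see how the interval width depends on the distance from the mean of X, hi can be evaluated at several X values. This sketch uses the income data from the example; the helper name `h_value` and the chosen X values are illustrative assumptions.

```python
income = [18, 20, 20, 25, 28, 30]   # X
n = len(income)
mean_x = sum(income) / n

# Denominator: SumX^2 - n * mean_x^2, equal to the sum of (X - mean_x)^2
s_xx = sum(x ** 2 for x in income) - n * mean_x ** 2


def h_value(x_i):
    """h_i = 1/n + (x_i - mean_x)^2 / sum of (X - mean_x)^2"""
    return 1 / n + (x_i - mean_x) ** 2 / s_xx


# h_i is smallest at the mean of X and grows as x_i moves away from it.
for x_i in (mean_x, 25, 29, 35):
    print(f"X = {x_i}: h = {h_value(x_i):.4f}")
```

At the mean of X the second term vanishes and hi reduces to 1/n, so the confidence interval is narrowest there.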
Prediction Interval for an Individual Y, Given X
Prediction interval estimate for an individual value of Y given a particular Xi
Prediction interval for YX=Xi:
$$\hat{Y} \pm t_{n-2,\,\alpha}\, S_{YX} \sqrt{1 + h_i}$$
The extra term 1 under the square root adds to the interval width to reflect the added uncertainty for an individual case.
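Both intervals can be evaluated for the example at X = 29. The sketch below recomputes every quantity from the raw data at full precision and takes the tabulated t value 2.776 (4 degrees of freedom, α = 0.05) from the earlier slides; the variable names are illustrative.

```python
import math

income = [18, 20, 20, 25, 28, 30]        # X
expenditure = [10, 12, 15, 15, 17, 20]   # Y
n = len(income)
t_tab = 2.776  # tabulated two-tailed t for n - 2 = 4 df at alpha = 0.05

mean_x = sum(income) / n
mean_y = sum(expenditure) / n
s_xx = sum((x - mean_x) ** 2 for x in income)
s_xy = sum((x - mean_x) * (y - mean_y) for x, y in zip(income, expenditure))

# Least squares line and standard error of estimate.
b1 = s_xy / s_xx
b0 = mean_y - b1 * mean_x
sse = sum((y - (b0 + b1 * x)) ** 2 for x, y in zip(income, expenditure))
s_yx = math.sqrt(sse / (n - 2))

x_new = 29
y_hat = b0 + b1 * x_new
h = 1 / n + (x_new - mean_x) ** 2 / s_xx

# Half-widths: sqrt(h) for the mean of Y, sqrt(1 + h) for an individual Y.
ci_half = t_tab * s_yx * math.sqrt(h)
pi_half = t_tab * s_yx * math.sqrt(1 + h)

print(f"Predicted Y at X = {x_new}: {y_hat:.3f}")
print(f"95% CI for mean Y:       ({y_hat - ci_half:.3f}, {y_hat + ci_half:.3f})")
print(f"95% PI for individual Y: ({y_hat - pi_half:.3f}, {y_hat + pi_half:.3f})")
```

The prediction interval is always wider than the confidence interval at the same X, because 1 + hi exceeds hi.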
