
Simple Linear Regression

Learning Objectives

1. Describe the Linear Regression Model


2. State the Regression Modeling Steps
3. Explain Least Squares
4. Compute Regression Coefficients
5. Explain Correlation
6. Predict Response Variable
Models
• Representation of some phenomenon
• Mathematical model is a mathematical
expression of some phenomenon
• Often describe relationships between
variables
• Types
– Deterministic models
– Probabilistic models
Deterministic Models
• Hypothesize exact relationships
• Suitable when prediction error is negligible
• Example: force is exactly mass times
acceleration
– F = m·a

© 1984-1994 T/Maker Co.


Probabilistic Models
• Hypothesize two components
– Deterministic
– Random error
• Example: sales volume (y) is 10 times
advertising spending (x) + random error
– y = 10x + ε
– Random error may be due to factors
other than advertising
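The two components can be made concrete with a small simulation. This is only a sketch: the slides do not specify a distribution for the random error, so the normal error with standard deviation 2, the fixed seed, and the function name `simulate_sales` are all assumptions made for illustration.

```python
import random


def simulate_sales(ad_spend, error_sd=2.0, seed=0):
    """Return (deterministic, error, observed) triples for y = 10x + error.

    error_sd and seed are hypothetical choices, not values from the slides.
    """
    rng = random.Random(seed)
    rows = []
    for x in ad_spend:
        deterministic = 10 * x            # exact part of the model
        error = rng.gauss(0.0, error_sd)  # assumed: normal error with mean 0
        rows.append((deterministic, error, deterministic + error))
    return rows


rows = simulate_sales([1, 2, 3, 4, 5])
for det, err, y in rows:
    print(f"deterministic={det:5.1f}  error={err:6.2f}  y={y:6.2f}")
```

Each observed y differs from 10x only by the error term, which stands in for every factor other than advertising.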
Types of
Probabilistic Models
• Regression models
• Correlation models
Regression Models
• Answers ‘What is the relationship between the
variables?’
• Equation used
– One numerical dependent (response) variable
   • What is to be predicted
– One or more numerical or categorical
independent (explanatory) variables
• Used mainly for prediction and estimation
Regression Modeling
Steps
1. Hypothesize deterministic component
2. Estimate unknown model parameters
3. Specify probability distribution of random
error term
• Estimate standard deviation of error
4. Evaluate model
5. Use model for prediction and estimation
Model Specification
Specifying the Model

1. Define variables
• Conceptual (e.g., Advertising, price)
• Empirical (e.g., List price, regular price)
• Measurement (e.g., $, Units)
2. Hypothesize nature of relationship
• Expected effects (i.e., Coefficients’ signs)
• Functional form (linear or non-linear)
• Interactions
Model Specification
Is Based on Theory
• Theory of field (e.g., Sociology)
• Mathematical theory
• Previous research
• ‘Common sense’
Thinking Challenge:
Which Is More Logical?
[Figure: four panels, each plotting Sales (vertical axis) against Advertising (horizontal axis)]
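Types of Relationships
(continued)
• Strong relationships
• Weak relationships
[Figure: four scatterplots of Y against X, strong relationships on the left and weak relationships on the right]
Types of Relationships
(continued)
• No relationship
[Figure: scatterplot with X on the horizontal axis showing no relationship]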
Types of
Regression Models
• Simple (1 explanatory variable)
   – Linear
   – Non-linear
• Multiple (2+ explanatory variables)
   – Linear
   – Non-linear
Linear Regression Model

Relationship between variables is a linear function

\[ y = \beta_0 + \beta_1 x + \varepsilon \]

– \(y\): dependent (response) variable
– \(x\): independent (explanatory) variable
– \(\beta_0\): population y-intercept
– \(\beta_1\): population slope
– \(\varepsilon\): random error
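Line of Means

\[ E(y) = \beta_0 + \beta_1 x \quad \text{(line of means)} \]

• \(\beta_1\) = Slope = (Change in y) / (Change in x)
• \(\beta_0\) = y-intercept
[Figure: the line of means drawn with y on the vertical axis and x on the horizontal axis]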
Population & Sample
Regression Models
• Population: unknown relationship
\[ y = \beta_0 + \beta_1 x + \varepsilon \]
• Random sample:
\[ y = \hat{\beta}_0 + \hat{\beta}_1 x + \hat{\varepsilon} \]
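Population Linear
Regression Model

\[ y_i = \beta_0 + \beta_1 x_i + \varepsilon_i \quad \text{(observed value)} \]
\[ E(y) = \beta_0 + \beta_1 x \]

• \(\varepsilon_i\) = Random error
[Figure: observed values plotted against x, with the line \(E(y) = \beta_0 + \beta_1 x\)]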
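Sample Linear Regression
Model

\[ y_i = \hat{\beta}_0 + \hat{\beta}_1 x_i + \hat{\varepsilon}_i \]
\[ \hat{y}_i = \hat{\beta}_0 + \hat{\beta}_1 x_i \]

• \(\hat{\varepsilon}_i\) = Random error
[Figure: observed values and an unsampled observation plotted against x, with the fitted line \(\hat{y}_i = \hat{\beta}_0 + \hat{\beta}_1 x_i\)]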
Estimating Parameters:
Least Squares Method
Scattergram
1. Plot of all \((x_i, y_i)\) pairs
2. Suggests how well model will fit
[Figure: scattergram with the x and y axes each running from 0 to 60]
Thinking Challenge
• How would you draw a line through the points?
• How do you determine which line ‘fits best’?
[Figure: scattergram with the x and y axes each running from 0 to 60]
Least Squares
• ‘Best fit’ means the differences between actual y
values and predicted y values are a minimum
– But positive differences offset negative ones, so the
differences are squared before they are summed

\[ \sum_{i=1}^{n} (y_i - \hat{y}_i)^2 = \sum_{i=1}^{n} \hat{\varepsilon}_i^2 \]

• Least Squares minimizes the Sum of the
Squared Differences (SSE)
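A short sketch of why the differences are squared. It uses the five (advertising, sales) pairs from the Least Squares Example later in this deck and compares two lines: a hypothetical flat line ŷ = 2, chosen here purely for illustration, and the least squares line ŷ = -.1 + .7x derived in that example.

```python
# Data from the Least Squares Example later in this deck.
xs = [1, 2, 3, 4, 5]
ys = [1, 1, 2, 2, 4]


def residuals(b0, b1):
    """Differences between actual y and predicted y for the line b0 + b1*x."""
    return [y - (b0 + b1 * x) for x, y in zip(xs, ys)]


def sse(b0, b1):
    """Sum of the squared differences for the line b0 + b1*x."""
    return sum(e ** 2 for e in residuals(b0, b1))


flat_sum = sum(residuals(2.0, 0.0))  # raw differences cancel out
flat_sse = sse(2.0, 0.0)             # squared differences do not
ls_sse = sse(-0.1, 0.7)              # least squares line from the example

print(f"flat line: sum of differences = {flat_sum:.2f}, SSE = {flat_sse:.2f}")
print(f"LS line:   SSE = {ls_sse:.2f}")
```

The raw differences for the flat line sum to zero even though it ignores the upward trend, so the plain sum cannot rank lines. The SSE can: it is about 6 for the flat line and about 1.1 for the least squares line.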
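Least Squares Graphically

LS minimizes
\[ \sum_{i=1}^{n} \hat{\varepsilon}_i^2 = \hat{\varepsilon}_1^2 + \hat{\varepsilon}_2^2 + \hat{\varepsilon}_3^2 + \hat{\varepsilon}_4^2 \]

\[ y_2 = \hat{\beta}_0 + \hat{\beta}_1 x_2 + \hat{\varepsilon}_2 \]
\[ \hat{y}_i = \hat{\beta}_0 + \hat{\beta}_1 x_i \]
[Figure: four observations and the fitted line, with the vertical deviations \(\hat{\varepsilon}_1\) to \(\hat{\varepsilon}_4\) marked]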
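Coefficient Equations
Prediction Equation
\[ \hat{y} = \hat{\beta}_0 + \hat{\beta}_1 x \]
Slope
\[ \hat{\beta}_1 = \frac{SS_{xy}}{SS_{xx}} = \frac{\sum_{i=1}^{n} x_i y_i - \dfrac{\left(\sum_{i=1}^{n} x_i\right)\left(\sum_{i=1}^{n} y_i\right)}{n}}{\sum_{i=1}^{n} x_i^2 - \dfrac{\left(\sum_{i=1}^{n} x_i\right)^2}{n}} \]
y-intercept
\[ \hat{\beta}_0 = \bar{y} - \hat{\beta}_1 \bar{x} \]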
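The coefficient equations translate almost line for line into code. The following is a minimal sketch; the function name `least_squares` and the exactly linear test data (y = 1 + 2x) are illustrative choices, not part of the slides.

```python
def least_squares(xs, ys):
    """Return (b0, b1) computed as SSxy / SSxx and y-bar - b1 * x-bar."""
    n = len(xs)
    if n != len(ys) or n < 2:
        raise ValueError("need at least two (x, y) pairs of equal length")

    sum_x = sum(xs)
    sum_y = sum(ys)
    sum_xy = sum(x * y for x, y in zip(xs, ys))
    sum_x2 = sum(x * x for x in xs)

    ss_xy = sum_xy - sum_x * sum_y / n   # numerator of the slope
    ss_xx = sum_x2 - sum_x ** 2 / n      # denominator of the slope
    if ss_xx == 0:
        raise ValueError("all x values are identical; slope is undefined")

    b1 = ss_xy / ss_xx
    b0 = sum_y / n - b1 * sum_x / n      # y-bar minus b1 times x-bar
    return b0, b1


# Hypothetical data lying exactly on y = 1 + 2x, so the fit should recover it.
b0, b1 = least_squares([0, 1, 2, 3], [1, 3, 5, 7])
print(b0, b1)
```

Because the test points lie exactly on a line, the function should return that line's intercept and slope, which gives a quick sanity check before applying it to real data.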


Computation Table

xi      yi      xi²     yi²     xiyi
x1      y1      x1²     y1²     x1y1
x2      y2      x2²     y2²     x2y2
:       :       :       :       :
xn      yn      xn²     yn²     xnyn
Σxi     Σyi     Σxi²    Σyi²    Σxiyi
Interpretation of Coefficients
1. Slope (\(\hat{\beta}_1\))
• Estimated y changes by \(\hat{\beta}_1\) for each 1-unit increase
in x
   – If \(\hat{\beta}_1\) = 2, then Sales (y) is expected to increase by 2
for each 1-unit increase in Advertising (x)
2. Y-Intercept (\(\hat{\beta}_0\))
• Average value of y when x = 0
   – If \(\hat{\beta}_0\) = 4, then Average Sales (y) is expected to be
4 when Advertising (x) is 0
Least Squares Example
You’re a marketing analyst for Hasbro Toys.
You gather the following data:
Ad $ Sales (Units)
1 1
2 1
3 2
4 2
5 4
Find the least squares line relating
sales and advertising.
Scattergram
Sales vs. Advertising
[Figure: scattergram of Sales (vertical axis, 0 to 4) against Advertising (horizontal axis, 0 to 5)]
Parameter Estimation
Solution Table
        xi    yi    xi²   yi²   xiyi
        1     1     1     1     1
        2     1     4     1     2
        3     2     9     4     6
        4     2     16    4     8
        5     4     25    16    20
Total   15    10    55    26    37
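Parameter Estimation
Solution

\[ \hat{\beta}_1 = \frac{\sum_{i=1}^{n} x_i y_i - \dfrac{\left(\sum_{i=1}^{n} x_i\right)\left(\sum_{i=1}^{n} y_i\right)}{n}}{\sum_{i=1}^{n} x_i^2 - \dfrac{\left(\sum_{i=1}^{n} x_i\right)^2}{n}} = \frac{37 - \dfrac{(15)(10)}{5}}{55 - \dfrac{(15)^2}{5}} = .70 \]

\[ \hat{\beta}_0 = \bar{y} - \hat{\beta}_1 \bar{x} = 2 - (.70)(3) = -.10 \]

\[ \hat{y} = -.1 + .7x \]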
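The arithmetic above can be checked by recomputing the column totals from the five data pairs and substituting them into the coefficient equations. This is a sketch for verification only; the variable names are illustrative.

```python
# (advertising, sales) pairs from the Least Squares Example.
data = [(1, 1), (2, 1), (3, 2), (4, 2), (5, 4)]
n = len(data)

# Column totals of the solution table.
sum_x = sum(x for x, _ in data)        # 15
sum_y = sum(y for _, y in data)        # 10
sum_x2 = sum(x * x for x, _ in data)   # 55
sum_y2 = sum(y * y for _, y in data)   # 26
sum_xy = sum(x * y for x, y in data)   # 37

ss_xy = sum_xy - sum_x * sum_y / n     # 37 - (15)(10)/5
ss_xx = sum_x2 - sum_x ** 2 / n        # 55 - (15)^2/5

b1 = ss_xy / ss_xx                     # slope
b0 = sum_y / n - b1 * sum_x / n        # intercept: y-bar - b1 * x-bar

print(f"totals: {sum_x} {sum_y} {sum_x2} {sum_y2} {sum_xy}")
print(f"b1 = {b1:.2f}, b0 = {b0:.2f}")
```

The totals match the bottom row of the solution table, and the rounded coefficients match the slope of .70 and intercept of -.10 obtained by hand.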
Parameter Estimation
Computer Output
Parameter Estimates

                 Parameter   Standard   T for H0:
Variable   DF    Estimate    Error      Param=0     Prob>|T|
INTERCEP   1     -0.1000     0.6350     -0.157      0.8849      (\(\hat{\beta}_0\))
ADVERT     1      0.7000     0.1914      3.656      0.0354      (\(\hat{\beta}_1\))

\[ \hat{y} = -.1 + .7x \]
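The Standard Error and T columns of the output can be approximately reproduced from the data. The slides do not give formulas for these columns, so the ones below are the usual textbook expressions for simple linear regression and should be read as an assumption about how the output was produced: s² = SSE / (n - 2), SE(β̂1) = s / √SSxx, and SE(β̂0) = s · √(1/n + x̄² / SSxx). The Prob>|T| column is left out because it needs the t distribution.

```python
import math

xs = [1, 2, 3, 4, 5]
ys = [1, 1, 2, 2, 4]
b0, b1 = -0.1, 0.7            # estimates from the output

n = len(xs)
x_bar = sum(xs) / n
ss_xx = sum((x - x_bar) ** 2 for x in xs)

# Sum of squared differences between actual and predicted y.
sse = sum((y - (b0 + b1 * x)) ** 2 for x, y in zip(xs, ys))

# Assumed textbook formulas (not stated in the slides).
s = math.sqrt(sse / (n - 2))                        # estimated std. dev. of error
se_b1 = s / math.sqrt(ss_xx)                        # standard error of the slope
se_b0 = s * math.sqrt(1 / n + x_bar ** 2 / ss_xx)   # standard error of the intercept

t_b1 = b1 / se_b1             # T for H0: slope = 0
t_b0 = b0 / se_b0             # T for H0: intercept = 0

print(f"INTERCEP  se={se_b0:.4f}  t={t_b0:.3f}")
print(f"ADVERT    se={se_b1:.4f}  t={t_b1:.3f}")
```

The results agree with the printed output to about three decimal places; a difference in the last digit is consistent with rounding in the output.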
Coefficient Interpretation
Solution
1. Slope (\(\hat{\beta}_1\))
• Sales Volume (y) is expected to increase by .7
units for each $1 increase in Advertising (x)

2. Y-Intercept (\(\hat{\beta}_0\))
• Average value of Sales Volume (y) is -.10 units
when Advertising (x) is 0
   – Difficult to explain to marketing manager
   – Expect some sales without advertising
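The fitted line can also be used for prediction. The sketch below evaluates ŷ = -.1 + .7x and flags inputs outside the observed advertising range of 1 to 5. The range check is an added illustration rather than something the slides prescribe, and it shows one reason the intercept is hard to interpret: x = 0 lies outside the data. The function name `predict_sales` is hypothetical.

```python
OBSERVED_MIN, OBSERVED_MAX = 1, 5   # range of Ad $ in the example data


def predict_sales(advertising, b0=-0.1, b1=0.7):
    """Return (predicted sales, True if x lies inside the observed range)."""
    y_hat = b0 + b1 * advertising
    in_range = OBSERVED_MIN <= advertising <= OBSERVED_MAX
    return y_hat, in_range


for x in (0, 4):
    y_hat, in_range = predict_sales(x)
    note = "" if in_range else "  (outside observed data)"
    print(f"advertising={x}: predicted sales={y_hat:.2f}{note}")
```

At an advertising level of 4 the predicted sales volume is about 2.7 units, while the prediction at 0 is the negative intercept, which comes from extending the line beyond the data.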
Regression Line Fitted
to the Data
[Figure: scattergram of Sales (vertical axis, 0 to 4) against Advertising (horizontal axis, 0 to 5) with the fitted line \(\hat{y} = -.1 + .7x\)]
Least Squares
Thinking Challenge
You’re an economist for the county cooperative.
You gather the following data:
Fertilizer (lb.) Yield (lb.)
4 3.0
6 5.5
10 6.5
12 9.0
Find the least squares line relating
crop yield and fertilizer.
