Simple Linear Regression
Learning Objectives
Types of
Probabilistic Models
[Diagram: probabilistic models divide into regression models and correlation models.]
Regression Models
• Answers ‘What is the relationship between the
variables?’
• Equation used
– One numerical dependent (response) variable:
what is to be predicted
– One or more numerical or categorical
independent (explanatory) variables
• Used mainly for prediction and estimation
Regression Modeling
Steps
1. Hypothesize deterministic component
2. Estimate unknown model parameters
3. Specify probability distribution of random
error term
• Estimate standard deviation of error
4. Evaluate model
5. Use model for prediction and estimation
Model Specification
1. Define variables
• Conceptual (e.g., Advertising, price)
• Empirical (e.g., List price, regular price)
• Measurement (e.g., $, Units)
2. Hypothesize nature of relationship
• Expected effects (i.e., Coefficients’ signs)
• Functional form (linear or non-linear)
• Interactions
Model Specification
Is Based on Theory
• Theory of field (e.g., Sociology)
• Mathematical theory
• Previous research
• ‘Common sense’
Thinking Challenge:
Which Is More Logical?
[Figure: four candidate plots of Sales (vertical axis) against Advertising (horizontal axis).]
Types of Relationships
(continued)
[Figure: scatterplots of Y against X contrasting strong relationships with weak relationships.]
Types of Relationships
(continued)
No relationship
[Figure: scatterplot showing no relationship with X.]
Types of
Regression Models
• Simple (1 explanatory variable): linear or non-linear
• Multiple (2+ explanatory variables): linear or non-linear
Linear Regression Model
$y = \beta_0 + \beta_1 x + \varepsilon$
• y = dependent (response) variable
• x = independent (explanatory) variable
Line of Means
[Figure: the line of means, $E(y) = \beta_0 + \beta_1 x$, drawn with y against x.]
• $\beta_1$ = slope = (change in y) / (change in x)
• $\beta_0$ = y-intercept
Population & Sample
Regression Models
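[Diagram: a random sample is drawn from the population.]
• Population (unknown relationship): $y = \beta_0 + \beta_1 x + \varepsilon$
• Random sample: $y = \hat{\beta}_0 + \hat{\beta}_1 x + \hat{\varepsilon}$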
Population Linear
Regression Model
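$y_i = \beta_0 + \beta_1 x_i + \varepsilon_i$
• $\varepsilon_i$ = random error
• Line of means: $E(y) = \beta_0 + \beta_1 x$
[Figure: observed values scattered around the line of means, y against x.]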
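The population model can be made concrete with a small simulation. The sketch below is illustrative only: the parameter values are hypothetical, and the random error is assumed to be normal with mean 0, which the slides do not state. The function name `simulate_population` is also an assumption of this sketch.

```python
import random


def simulate_population(beta0, beta1, xs, sigma, seed=0):
    """Draw y_i = beta0 + beta1 * x_i + eps_i for each x_i in xs."""
    rng = random.Random(seed)  # fixed seed so the sketch is repeatable
    ys = []
    for x in xs:
        mean_y = beta0 + beta1 * x   # E(y): the point on the line of means
        eps = rng.gauss(0.0, sigma)  # random error (normality is assumed here)
        ys.append(mean_y + eps)
    return ys


# Hypothetical parameter values, chosen only for illustration.
xs = [1, 2, 3, 4, 5]
ys = simulate_population(beta0=1.0, beta1=2.0, xs=xs, sigma=0.5)
for x, y in zip(xs, ys):
    print(f"x={x}  E(y)={1.0 + 2.0 * x:.1f}  observed y={y:.2f}")
```

Each observed value differs from its point on the line of means by its own random error; setting `sigma` to 0 puts every observation exactly on the line.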
Sample Linear Regression
Model
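$y_i = \hat{\beta}_0 + \hat{\beta}_1 x_i + \hat{\varepsilon}_i$
• $\hat{\varepsilon}_i$ = random error
• Fitted line: $\hat{y}_i = \hat{\beta}_0 + \hat{\beta}_1 x_i$
[Figure: the fitted line plotted with y against x, marking an observed value and an unsampled observation.]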
Estimating Parameters:
Least Squares Method
[Figure: scattergram of y against x, both axes running from 0 to 60.]
Thinking Challenge
• How would you draw a line through the points?
• How do you determine which line ‘fits best’?
[Figure: scattergram of y against x, both axes running from 0 to 60.]
Least Squares
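• ‘Best fit’ means the differences between actual y
values and predicted y values are a minimum
– But positive differences offset negative ones
– So least squares minimizes the sum of squared differences (SSE):
$\sum_{i=1}^{n} (y_i - \hat{y}_i)^2 = \sum_{i=1}^{n} \hat{\varepsilon}_i^2$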
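The offsetting problem is easy to see numerically. The sketch below uses the five advertising data points from the example later in this deck and compares two lines: a flat line at the mean of y (a hypothetical competitor chosen for illustration) and the line with intercept -.1 and slope .7 reported in that example. The helper names `residuals` and `sse` are assumptions of this sketch.

```python
def residuals(xs, ys, b0, b1):
    """Differences between actual y values and the line b0 + b1 * x."""
    return [y - (b0 + b1 * x) for x, y in zip(xs, ys)]


def sse(xs, ys, b0, b1):
    """Sum of squared differences for the line b0 + b1 * x."""
    return sum(e * e for e in residuals(xs, ys, b0, b1))


xs = [1, 2, 3, 4, 5]  # Ad $
ys = [1, 1, 2, 2, 4]  # Sales (units)

candidates = {
    "flat line at mean of y": (2.0, 0.0),  # hypothetical competitor
    "least squares line": (-0.1, 0.7),     # line reported in the example
}

for name, (b0, b1) in candidates.items():
    total = abs(sum(residuals(xs, ys, b0, b1)))
    print(f"{name}: |sum of differences| = {total:.2f}, "
          f"SSE = {sse(xs, ys, b0, b1):.2f}")
```

Both lines have differences that sum to zero (up to rounding), so the plain sum cannot tell them apart. The squared differences can: the flat line has a much larger SSE.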
xi     yi     xi²     yi²     xiyi
x1     y1     x1²     y1²     x1y1
x2     y2     x2²     y2²     x2y2
:      :      :       :       :
xn     yn     xn²     yn²     xnyn
Σxi    Σyi    Σxi²    Σyi²    Σxiyi
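The computation table translates directly into code. The following is a minimal sketch, assuming the data arrive as two equal-length lists; the function name `computation_table` is hypothetical, and the sample data are the five advertising points from the example that follows.

```python
def computation_table(xs, ys):
    """Return the table rows (x, y, x^2, y^2, xy) and the column totals."""
    rows = [(x, y, x * x, y * y, x * y) for x, y in zip(xs, ys)]
    totals = tuple(sum(column) for column in zip(*rows))
    return rows, totals


rows, totals = computation_table([1, 2, 3, 4, 5], [1, 1, 2, 2, 4])

header = ("x", "y", "x^2", "y^2", "xy")
print("".join(f"{h:>6}" for h in header))
for row in rows:
    print("".join(f"{value:>6}" for value in row))
print("".join(f"{total:>6}" for total in totals))  # bottom row: column sums
```

The last printed row holds the five sums that feed the coefficient formulas.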
Interpretation of Coefficients
1. Slope ($\hat{\beta}_1$)
• Estimated y changes by $\hat{\beta}_1$ for each 1-unit increase
in x
– If $\hat{\beta}_1 = 2$, then Sales (y) is expected to increase by 2
for each 1-unit increase in Advertising (x)
2. Y-Intercept ($\hat{\beta}_0$)
• Average value of y when x = 0
– If $\hat{\beta}_0 = 4$, then average Sales (y) is expected to be
4 when Advertising (x) is 0
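Both interpretations can be checked by evaluating a fitted line. The sketch below combines the two illustrative values from this slide (intercept 4, slope 2) into a single line, which is an assumption made for demonstration; the function name `predict` is hypothetical.

```python
def predict(x, b0, b1):
    """Estimated mean of y at a given x for the line b0 + b1 * x."""
    return b0 + b1 * x


b0, b1 = 4.0, 2.0  # illustrative intercept and slope from the slide

# Intercept: the estimated average of y when x = 0.
print(predict(0, b0, b1))                      # 4.0

# Slope: the change in estimated y for a 1-unit increase in x.
print(predict(3, b0, b1) - predict(2, b0, b1))  # 2.0
```

The difference between predictions one unit apart equals the slope wherever on the line it is taken.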
Least Squares Example
You’re a marketing analyst for Hasbro Toys.
You gather the following data:
Ad $    Sales (Units)
1       1
2       1
3       2
4       2
5       4
Find the least squares line relating
sales and advertising.
Scattergram
Sales vs. Advertising
[Figure: scattergram of Sales (0 to 4) against Advertising (0 to 5).]
Parameter Estimation
Solution Table
xi     yi     xi²     yi²     xiyi
1      1      1       1       1
2      1      4       1       2
3      2      9       4       6
4      2      16      4       8
5      4      25      16      20
15     10     55      26      37     (column totals)
Parameter Estimation
Solution
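$\hat{\beta}_1 = \dfrac{\sum_{i=1}^{n} x_i y_i - \dfrac{\left(\sum_{i=1}^{n} x_i\right)\left(\sum_{i=1}^{n} y_i\right)}{n}}{\sum_{i=1}^{n} x_i^2 - \dfrac{\left(\sum_{i=1}^{n} x_i\right)^2}{n}} = \dfrac{37 - \dfrac{(15)(10)}{5}}{55 - \dfrac{(15)^2}{5}} = .70$
$\hat{\beta}_0 = \bar{y} - \hat{\beta}_1 \bar{x} = 2 - (.70)(3) = -.10$
$\hat{y} = -.1 + .7x$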
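The slope can also be computed from deviations about the means, which is algebraically equivalent to the sums-based formula. The sketch below uses that deviation form as a cross-check on the hand calculation; the function name `least_squares` is an assumption of this sketch, not something named in the slides.

```python
def least_squares(xs, ys):
    """Return (intercept, slope) of the least squares line."""
    n = len(xs)
    x_bar = sum(xs) / n
    y_bar = sum(ys) / n
    # Sums of cross-products and squares of deviations about the means.
    ss_xy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
    ss_xx = sum((x - x_bar) ** 2 for x in xs)
    slope = ss_xy / ss_xx
    intercept = y_bar - slope * x_bar
    return intercept, slope


ad_dollars = [1, 2, 3, 4, 5]
sales_units = [1, 1, 2, 2, 4]
b0, b1 = least_squares(ad_dollars, sales_units)
print(f"y-hat = {b0:.2f} + {b1:.2f} x")  # prints: y-hat = -0.10 + 0.70 x
```

Here the deviation sums are 7 and 10, the same numerator and denominator that the sums-based formula produces (37 - 30 and 55 - 45).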
Parameter Estimation
Computer Output
[Computer output: Parameter Estimates table, with the coefficient estimates marked.]
$\hat{y} = -.1 + .7x$
Coefficient Interpretation
Solution
1. Slope ($\hat{\beta}_1$)
• Sales Volume (y) is expected to increase by .7
units for each $1 increase in Advertising (x)
2. Y-Intercept ($\hat{\beta}_0$)
• Average value of Sales Volume (y) is -.10 units
when Advertising (x) is 0
— Difficult to explain to marketing manager
— Expect some sales without advertising
Regression Line Fitted
to the Data
[Figure: Sales (0 to 4) against Advertising (0 to 5) with the fitted line $\hat{y} = -.1 + .7x$.]
Least Squares
Thinking Challenge
You’re an economist for the county cooperative.
You gather the following data:
Fertilizer (lb.) Yield (lb.)
4 3.0
6 5.5
10 6.5
12 9.0
Find the least squares line relating
crop yield and fertilizer.
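One way to check a hand calculation for this challenge is to run the sums-based formulas in code. This is a minimal sketch; the function name `fit_line` and the variable names are hypothetical, and the data are the four observations listed above.

```python
def fit_line(xs, ys):
    """Return (intercept, slope) using the sums-based least squares formulas."""
    n = len(xs)
    sum_x, sum_y = sum(xs), sum(ys)
    sum_xx = sum(x * x for x in xs)
    sum_xy = sum(x * y for x, y in zip(xs, ys))
    slope = (sum_xy - sum_x * sum_y / n) / (sum_xx - sum_x ** 2 / n)
    intercept = sum_y / n - slope * sum_x / n
    return intercept, slope


fertilizer_lb = [4, 6, 10, 12]
yield_lb = [3.0, 5.5, 6.5, 9.0]
b0, b1 = fit_line(fertilizer_lb, yield_lb)
print(f"y-hat = {b0:.2f} + {b1:.2f} x")
```

Work the problem by hand first, then compare the printed intercept and slope with your own.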