Regression Analysis For Forecasting: Yosef Daryanto
Forecasting
Yosef Daryanto
Atma Jaya Yogyakarta University
2015
Simple Regression Forecasting
Method
Yosef Daryanto
Faculty of Economy
Atma Jaya Yogyakarta University
2015
Overview of forecasting techniques
Quantitative: Sufficient quantitative information is available
*Time series: Predicting the continuation of historical patterns such as the
growth in sales or gross national product
*Explanatory: Understanding how explanatory variables such as prices and
advertising affect sales
One should use graphical techniques to inspect the data, looking especially for
trend, seasonal, and cyclical components, as well as for outliers.
Least Squares Estimation
Linear relationship between Y and X given by
Y = a + bX + e
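$$ b = \frac{\sum_{i=1}^{n} (X_i - \bar{X})(Y_i - \bar{Y})}{\sum_{i=1}^{n} (X_i - \bar{X})^2} = \frac{n \sum XY - (\sum X)(\sum Y)}{n \sum X^2 - (\sum X)^2} $$
$$ a = \bar{Y} - b\bar{X} $$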
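The least squares formulas for a and b can be sketched in a few lines of Python. This is a minimal illustration, not part of the original slides: the function names `least_squares` and `slope_from_sums` and the small data set are assumptions made up for the example. It computes the slope in both the deviation form and the raw-sums form to show that they agree.

```python
def least_squares(xs, ys):
    """Return (a, b) for Y = a + bX, using deviations from the means."""
    n = len(xs)
    x_bar = sum(xs) / n
    y_bar = sum(ys) / n
    sxy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
    sxx = sum((x - x_bar) ** 2 for x in xs)
    b = sxy / sxx
    a = y_bar - b * x_bar
    return a, b


def slope_from_sums(xs, ys):
    """Return b from the equivalent raw-sums form of the formula."""
    n = len(xs)
    sum_x, sum_y = sum(xs), sum(ys)
    sum_xy = sum(x * y for x, y in zip(xs, ys))
    sum_x2 = sum(x * x for x in xs)
    return (n * sum_xy - sum_x * sum_y) / (n * sum_x2 - sum_x ** 2)


# Illustrative data only (made up for this sketch, not from the slides).
xs = [1, 2, 3, 4, 5]
ys = [2.1, 3.9, 6.2, 7.8, 10.1]

a, b = least_squares(xs, ys)
b_alt = slope_from_sums(xs, ys)
print(f"a = {a:.3f}, b = {b:.3f}, b (raw sums) = {b_alt:.3f}")
```

The deviation form is usually the safer choice numerically; the raw-sums form is convenient for hand calculation from a table of sums.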
Cross-sectional Forecasting
Example: Based on the data in the table, develop the relationship between
world pulp price and pulp shipment. What level of shipment could we expect
if the world pulp price increases to 90 $/ton?
Pulp Shipment (millions metric tons) World Pulp Price ($/ton)
10.4 79
11.4 86
11.1 80
11.7 71
12.7 72
14.0 74
15.1 76
15.2 75
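One way to work this example is sketched below. It assumes shipment is treated as the dependent variable (Y) and price as the independent variable (X), as the question implies; the variable and function names are illustrative.

```python
# Data from the example table: price in $/ton, shipment in millions of metric tons.
price = [79, 86, 80, 71, 72, 74, 76, 75]
shipment = [10.4, 11.4, 11.1, 11.7, 12.7, 14.0, 15.1, 15.2]


def fit_line(xs, ys):
    """Least squares estimates (a, b) for Y = a + bX."""
    n = len(xs)
    x_bar = sum(xs) / n
    y_bar = sum(ys) / n
    b = (sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
         / sum((x - x_bar) ** 2 for x in xs))
    a = y_bar - b * x_bar
    return a, b


a, b = fit_line(price, shipment)
forecast_90 = a + b * 90  # expected shipment at a price of 90 $/ton

print(f"Shipment = {a:.2f} + ({b:.4f}) * Price")
print(f"Expected shipment at 90 $/ton: {forecast_90:.1f} million metric tons")
```

With these eight observations the fitted slope is negative and the forecast is roughly 10.6 million metric tons. Note that 90 $/ton lies above every observed price (71 to 86), so the forecast extrapolates beyond the data and should be read with caution.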
Cross-sectional Forecasting
Exercise: A company operates in eight cities. The table below shows the
most recent year's sales and the population of each city.
Develop a simple regression model to predict sales based on
population.
Population (000) Sales (000)
505 372
351 275
186 214
175 135
132 81
115 144
108 90
79 97
Time series forecast
In fitting the linear regression, we have ignored the time ordering
of the data.
To use the equation to make a forecast for time series data, we need only
substitute the appropriate values for time (T).
Car Sales
Month Time period Sales
Jan’09 1 100
Feb’09 2 96
Mar’09 3 107
Apr’09 4 98
May’09 5 103
Jun’09 6 99
Jul’09 7 126
Aug’09 8 128
Sep’09 9 122
Oct’09 10 130
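A short sketch of the time series case follows, using the car sales data with the time period T as the independent variable. The function name is illustrative, and the choice of T = 11 (the month after the last observation) as the forecast period is an assumption for the example.

```python
# Car sales data: time period T = 1..10 and monthly sales.
periods = list(range(1, 11))
sales = [100, 96, 107, 98, 103, 99, 126, 128, 122, 130]


def fit_trend(ts, ys):
    """Least squares estimates (a, b) for Sales = a + b*T."""
    n = len(ts)
    t_bar = sum(ts) / n
    y_bar = sum(ys) / n
    b = (sum((t - t_bar) * (y - y_bar) for t, y in zip(ts, ys))
         / sum((t - t_bar) ** 2 for t in ts))
    a = y_bar - b * t_bar
    return a, b


a, b = fit_trend(periods, sales)

# Forecast by substituting the next time period into the fitted equation.
next_period = 11
forecast = a + b * next_period

print(f"Sales = {a:.2f} + {b:.2f} * T")
print(f"Forecast for T = {next_period}: {forecast:.1f}")
```

For this data the trend is about 3.9 additional cars per month, which gives a forecast of roughly 132 for period 11.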
Correlation Coefficient (rxy)
It often occurs that two variables are related to each other,
even though it might be incorrect to say that the value of
one of the variables depends upon, or is influenced by,
changes in the value of the other variable.
The coefficient of correlation, r, is a relative measure of the
linear association between two numerical variables.
$$ r_{xy} = \frac{\mathrm{Cov}_{xy}}{S_x S_y} = \frac{\sum (X_i - \bar{X})(Y_i - \bar{Y})}{\sqrt{\sum (X_i - \bar{X})^2 \sum (Y_i - \bar{Y})^2}} $$
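The correlation coefficient can be computed directly from the sums of squared and cross deviations. The sketch below applies it to the pulp price and shipment data from the earlier example; the function name `correlation` is illustrative.

```python
import math


def correlation(xs, ys):
    """Pearson correlation coefficient r_xy between two numeric lists."""
    n = len(xs)
    x_bar = sum(xs) / n
    y_bar = sum(ys) / n
    sxy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
    sxx = sum((x - x_bar) ** 2 for x in xs)
    syy = sum((y - y_bar) ** 2 for y in ys)
    return sxy / math.sqrt(sxx * syy)


# Pulp data from the cross-sectional example.
price = [79, 86, 80, 71, 72, 74, 76, 75]
shipment = [10.4, 11.4, 11.1, 11.7, 12.7, 14.0, 15.1, 15.2]

r = correlation(price, shipment)
print(f"r = {r:.3f}")
```

For this data r is about -0.41, a moderate negative linear association. Because the common factor 1/(n - 1) cancels between the covariance and the two standard deviations, the function works with the raw sums.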
Correlation Coefficient (rxy)
[Table fragment: rows 13 (17.1, 3.7) and 14 (12.4, 3.3)]
[Figure: Advertising Expenditure vs. Sales. Scatter plots of Y (Sales) against advertising expenditures and against the transformed values 1/X, Log X, Sq Root X, and Square X, with an annotation marking the more linear relationship.]
Multiple Regression
Analysis
Yosef Daryanto
Universitas Atma Jaya Yogyakarta
2015
Introduction
In simple linear regression, the relationship between
a single independent variable and a dependent
variable is investigated. The relationship between
two variables frequently allows one to accurately
predict the dependent variable from knowledge of
the independent variable.
Unfortunately, many real life forecasting situations
are not so simple. More than one independent
variable is usually necessary in order to predict a
dependent variable accurately.
Regression models with more than one independent
variable are called multiple regression models.
Mr. Charlie observes the selling price and sales
volume of gallons of milk for 10 randomly selected weeks,
presented in the table below. Construct the linear
relationship between the two variables and evaluate the
equation using r².
Week Selling Price (X) Sales volume (Y)
1 1.30 10
2 2.00 6
3 1.70 5
4 1.50 12
5 1.60 10
6 1.20 15
7 1.60 5
8 1.40 12
9 1.00 17
10 1.10 20
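A possible worked solution is sketched below. It fits sales volume on selling price and evaluates the fit with r², computed here as the ratio of explained variation to total variation. Function and variable names are illustrative.

```python
# Mr. Charlie's data: selling price (X) and sales volume (Y) for 10 weeks.
price = [1.30, 2.00, 1.70, 1.50, 1.60, 1.20, 1.60, 1.40, 1.00, 1.10]
volume = [10, 6, 5, 12, 10, 15, 5, 12, 17, 20]


def fit_line(xs, ys):
    """Least squares estimates (a, b) for Y = a + bX."""
    n = len(xs)
    x_bar = sum(xs) / n
    y_bar = sum(ys) / n
    b = (sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
         / sum((x - x_bar) ** 2 for x in xs))
    a = y_bar - b * x_bar
    return a, b


a, b = fit_line(price, volume)

# r^2 = explained variation / total variation.
y_bar = sum(volume) / len(volume)
fitted = [a + b * x for x in price]
explained = sum((y_hat - y_bar) ** 2 for y_hat in fitted)
total = sum((y - y_bar) ** 2 for y in volume)
r_squared = explained / total

print(f"Y = {a:.2f} + ({b:.2f}) X")
print(f"r^2 = {r_squared:.3f}")
```

The fitted line is approximately Y = 32.14 - 14.54X with r² of about 0.746, so price alone accounts for close to three quarters of the variation in weekly sales volume.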
Introduction
In forecasting the sales volume of gallons of milk
from knowledge of price per gallon, Mr. Charlie
is faced with the problem of making a prediction that
is not entirely accurate. He can explain almost 75%
of the differences in gallons of milk sold by using
one independent variable. Thus, 25% (1 - r²) of the
total variation is unexplained.
To do a more accurate job of forecasting, he needs
to find another predictor variable that will enable him
to explain more of the total variation. If Mr. Charlie
can reduce the unexplained variation, his forecast
will involve less uncertainty and be more accurate.
Introduction
A search must be conducted for another
independent variable that is related to sales volume
of gallons of milk. However, this new independent
variable cannot relate too highly to the independent
variable already in use (price per gallon).
If the two independent variables are highly related to
each other, they will explain the same variation, and
the addition of the second variable will not improve
the forecast.
This problem is often referred to as multicollinearity.
Correlation Matrix
Mr. Charlie decides that advertising might help improve
his forecast of weekly sales volume. He investigates the
relationship among advertising expense, sales volume,
and price per gallon by examining a correlation matrix.
A correlation matrix is constructed by computing the simple
correlation coefficients for each combination of pairs of
variables.
Variables   1     2     3
1           r11   r12   r13
2                 r22   r23
3                       r33
$$ R^2 = r_{Y\hat{Y}}^2 = \frac{\sum (\hat{Y}_i - \bar{Y})^2}{\sum (Y_i - \bar{Y})^2} $$
$$ r^2 = (r)^2 $$
Example
Mr. Charlie’s data:
Week Sales, Y (1,000) Price per Gallon, X1 ($) Advertising, X2 ($100)
1 10 1.3 9
2 6 2 7
3 5 1.7 5
4 12 1.5 14
5 10 1.6 15
6 15 1.2 12
7 5 1.6 6
8 12 1.4 10
9 17 1 15
10 20 1.1 21
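The correlation matrix for Mr. Charlie's three variables can be sketched as follows. The variable ordering (1 = sales, 2 = price, 3 = advertising) and the helper names are assumptions chosen for this illustration.

```python
import math

# Mr. Charlie's data for 10 weeks.
sales = [10, 6, 5, 12, 10, 15, 5, 12, 17, 20]                  # Y, in 1,000
price = [1.3, 2.0, 1.7, 1.5, 1.6, 1.2, 1.6, 1.4, 1.0, 1.1]     # X1, in $
advertising = [9, 7, 5, 14, 15, 12, 6, 10, 15, 21]             # X2, in $100


def correlation(xs, ys):
    """Pearson correlation coefficient between two numeric lists."""
    n = len(xs)
    x_bar = sum(xs) / n
    y_bar = sum(ys) / n
    sxy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
    sxx = sum((x - x_bar) ** 2 for x in xs)
    syy = sum((y - y_bar) ** 2 for y in ys)
    return sxy / math.sqrt(sxx * syy)


variables = [sales, price, advertising]
labels = ["Sales", "Price", "Advertising"]

# Full symmetric matrix; the diagonal entries are each variable with itself.
matrix = [[correlation(u, v) for v in variables] for u in variables]

for label, row in zip(labels, matrix):
    print(f"{label:<12}" + "".join(f"{value:8.2f}" for value in row))
```

For this data the correlations are roughly -0.86 (sales and price), 0.89 (sales and advertising), and -0.65 (price and advertising). Advertising is strongly related to sales, while its relationship to price is only moderate, which is the pattern one looks for when checking a candidate second predictor for multicollinearity.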