The Principles and Practice of Time Series Forecasting and Business Modelling Using Neural Nets
This paper is intended as a 'hands-on' practical discussion of how and why neural networks are used in forecasting and business modelling. The need for forecasting is briefly examined. The theory of the multilayer perceptron neural network is then covered both qualitatively and in mathematical detail, including the methods of back-propagation of error and independent validation. The advantages of the neural net approach to forecasting, namely nonlinear modelling capability, plausible interpolations and extrapolations, robustness to noise, ill-conditioning and insufficient data, and ease of use, are discussed. Finally, some working notes are offered for the practical implementation of neural nets in forecasting, and four real-life examples are given from the pursuits of econometrics, sales forecasting, market modelling, and risk evaluation.

Keywords: Sales forecasting; Market modelling; Risk evaluation

Original manuscript received 19 February 1992
Correspondence and offprint requests to: R.G. Hoptroff, Right Information Systems Ltd., 14 St Christopher's Place, London W1M 5HB, UK.

1. Introduction

Forecasting is a difficult and relevant problem in business. The increasing availability of computers at work can make the forecasting job easier and the results more accurate. With the new freedom of extensive computing facilities, new approaches to forecasting and modelling can be considered. One of the most attractive approaches is the neural network, because it combines accuracy with common-sense performance and ease of use. The neural network approach will be examined here, and the practical benefits assessed.

2. Motivations for Forecasting and Modelling in Business

Forecasting is the rational prediction of future events on the basis of information about past and current events. The process is very similar to modelling, where the outcome of an unknown variable is predicted from known or controllable variables. The relationship between known and unknown variables, or between past and future events, may be derived either through rational deduction or statistical analysis of historical relationships, or through a combination of the two. If relevant exemplary information about past relationships is available, the second approach is invariably more reliable than the first, because it simply reflects patterns in data in an unbiased way. This second approach is termed 'technical analysis', and is the approach that will be examined here.

There are several motivations for forecasting and modelling in business and economics. Short-, medium- and long-term forecasts serve a variety of purposes; modelling may be useful in a business as a management tool for decision making, or it may be a central function of the business itself:

1. Short-term forecasts: typically, a short-term forecast is for the week or the month ahead, and is used for stock control, monitoring cash flow, etc. Short-term forecasts are usually based on models of current trends and seasonalities in demand.

2. Medium-term forecasts: the medium-term forecasts typically look at the position over the next few months or years to help manage long-term cash utilisation and budgeting. Medium-term forecasts incorporate independent influential variables into the trend/season forecast, to take into account such factors as the cyclical nature of the economy, and the effects of different marketing strategies.

3. Long-term forecasts: forecasting many years ahead aids long-term strategic decision making and capital investment programming, and is used both in business and in government. These forecasts are the most difficult of all because of the need to quantify the effects of changes in the fundamental structure of the system. As there is rarely any relevant data upon which to construct econometric analyses in these situations, it is common to use traditional economic modelling for such forecasts.

4. Modelling as a management tool: by far the greatest interest in modelling for business is market modelling. For example, if the effects of price, promotional activity and advertising spend on demand can be modelled, then the cost effectiveness of the different marketing strategies can be quantified. This has an obvious advantage for both marketing companies/departments and their clients. There are a large number of other applications for such quantifiable cause and effect analyses, for example in project costing, risk estimation and shortlisting (shortlisting anything from personnel to oil drilling sites).

5. Modelling as a central function of a business: modelling is the central profit-generating activity in a number of businesses. Invariably, the model is some form of price model, whether the business be in insurance, bookmaking, valuation or speculative trading in financial markets.

Conventional forecasting methods are almost all based on linear or linearised models such as the auto-regressive moving average method. The practical success of these approaches is limited by their linearity, their ravenous data requirements, and because one needs to be reasonably skilled to obtain a good forecast. The interested reader is directed towards Makridakis et al. [1] for a detailed exposition of traditional forecasting and modelling methods.

3. Principles of the Multilayer Perceptron (MLP) Neural Network

In this section the properties of the MLP neural network are presented, first qualitatively and then in mathematical detail. The MLP originated in biological neural network theory, the mathematical modelling of how the human brain works [2]. Since then it has found a wide scope of applications beyond its original field. The MLP has three properties relevant to forecasting and modelling:

1. MLP transfer function: the MLP is a complex 'mathematical function box' which translates (or maps) an input vector into an output vector. (A vector is an ordered list of numbers, e.g. coordinates). The mapping from input to output is smooth and nonlinear (i.e. a graph relating input to output is a continuous arbitrary curve). This mapping is wholly dictated by a series of parameters called weights.

2. Training algorithm: training algorithms exist - in particular, back propagation of error - for tuning the weights so that the MLP mapping is the 'best fit', according to some measure of error, to a set of training data, i.e. a series of example input/output data pairs.

These properties are covered in detail in, for example, Rumelhart et al. [2] or Wasserman [3]. Observe that the MLP's function is in essence a nonlinear extension of the linear mapping function of multiple regression. There is one further aspect to the MLP machine whose relevance is a little more subtle:

3. Independent validation: a method, which we will term independent validation, exists by which an independent test set may be used to verify the quality of the training data. The method allows the training algorithm to extract what information it can from the data before identifying a point where, if it tried to extract further information, it would begin to be misled by noise, ill-conditioning or simply insufficient data to draw further conclusion. Indeed, the method can even be used to markedly improve on the performance of traditional linear regressions constructed with small quantities of noisy or ill-conditioned data. (Ill-conditioning usually arises when similar, or linearly dependent, input vectors are associated with very different output vectors.)

The detailed argument for independent validation is as follows. The back-propagation of the error training process may be initialised so that, before training starts, the MLP mapping is completely impartial (or, perhaps, it reflects a priori knowledge derived from a source other than the training set). Usually, this state is such that the output is constantly equal to the training set's mean. Back propagation training, which is an iterative process, then proceeds by making small changes in the weights so as to minimise the error as quickly as possible. In time,

6.1. Forecasting Turning-Points in the UK Economy

... that a growth period similar to that of 1982-1983 is forecast for 1992-1993.

Fig. 2. Comparison of neural net and longer leading indicator forecasts of the coincident indicator. Note the neural net forecasts two years ahead while the longer leading indicator forecasts one year ahead.

Fig. 3. Forecasts of total advertising revenue against the cumulative total in the months leading up to a magazine's publication. (Horizontal axis: days to launch.)

... two months ahead. The neural net achieves +/-10%, which is a significant improvement. Given the small amount of training data (indeed, no sensible technical forecast is possible at all without independent validation), a sensitivity analysis is probably worthwhile for this problem to weed out some of the less relevant variables. Figure 3 shows how the forecast varies for one issue during the run-up to publication. Ideally, the 'forecast revenue' line is horizontal and intersects the cumulative revenue line at five days to publication (the last day on which adverts are accepted).

6.3. Sales Forecasting for Demand and Market Modelling

A household product has seasonal sales for which monthly data is available since 1983 (Fig. 4a). During that time, various marketing tools have been mixed, including advertising, price positioning and promotions. The production team needs to forecast demand for the year ahead given the marketing strategy planned for next year. The marketing department needs to quantify how each marketing ingredient influences sales so that the optimum marketing mix can be prescribed in the future.

Figure 4 also shows the basic promotions (Fig. 4b), pricing (Fig. 4c) and advertising (Fig. 4d) input data fed to the MLP. Inputs were also provided to represent seasonality and time trend. Figure 4a shows the corresponding product sales. The 4-element single layer MLP was trained to forecast sales using January 1984-June 1989 data (1983 being used for independent validation), and was tested by forecasting July 1989-December 1990 sales. The test demonstrates excellent results (Fig. 5a). Most interesting, however, are the cross-sections of the model which are obtained by varying each variable in turn over its relevant range while keeping all other variables fixed. Figure 5b shows that the effect of advertising on sales is roughly linear; the slope of the line determines whether or not advertising is worthwhile. Figure 5c shows that, up to the 300 level, promotions have a positive effect on sales. Beyond 300, however, the benefit of promotions on sales begins to level off. Finally, Fig. 5d shows how sales vary with product price. The product is a premium brand, and consequently higher priced than average. The cross-section reveals how, above a 20% premium, sales fall off significantly, while below 15% premium, sales do not increase drastically.

Fig. 4. Product data 1983-1989 for market modelling. a Sales. b Advertising activity. c Promotional activity. d Price strategy.

Fig. 5. Modelling results from the data in Fig. 4. a Sales forecast (--: actuals; ---: forecast). b Effect of advertising. c Effect of promotions (horizontal axis: promotions value). d Effect of price point (horizontal axis: ratio of brand price to average price).

6.4. Modelling Company Performance for Risk and Return Evaluation

The performance of the top 100 UK construction companies was modelled for the industry's annual review, UK Construction 1991. Looking at the scattergram in Fig. 6, it appears that the smaller companies operate at higher profit margins, and one might conclude that these companies are more efficient. However, it is not immediately clear whether the conclusion is statistically justifiable or not, nor what the exact numerical relationship is. Independent validation is used here as a test of statistical significance. A 4-hidden unit MLP was trained using independent validation to model company profitability on the basis of company size alone. The MLP cross-section (the curve in Fig. 6) shows that there is sufficient information in the data to conclude that companies of below [illegible] million annual turnover tend to be up to twice as profitable as their larger competitors.

In a similar test, company size was used to model recent (1990) company performance. Not only was a model of the relationship built, but a second model was generated to estimate the mean squared error of the first as this varied with company size. Figure 7 shows that in 1990, [illegible] million construction companies were twice as volatile as their [illegible] billion turnover competitors. This information can be used to support investment decisions: a share in a [illegible] million construction company offers twice the potential return of a [illegible] billion company, but at twice the total risk. However, Sharpe's work on capital asset pricing [4] shows that a greater proportion of risk can be diversified away for small companies than large companies. Hence this work indicates that investing in a portfolio of small
References