
CHAPTER 13

Planning Under Uncertainty

Downloaded from https://academic.oup.com/book/53915/chapter/422193805 by OUP site access user on 12 May 2024

The winners of tomorrow will deal proactively with chaos, will look at the chaos per se as the source for market advantage, not as a problem to be got around. Chaos and uncertainty are (will be) market opportunities for the wise.
Tom Peters, Thriving on Chaos, 1987

The analyst who attempts to build a mathematical model for a real-world system is often faced with the problem of uncertain, noisy, incomplete or erroneous data. This is true for several application domains. In business applications noisy data are prevalent. Returns of financial instruments, demand for a firm's products, the cost of fuel, and consumption of power and other resources are examples of model data that are known with some probabilistic distribution at best. In social sciences data are often incomplete—for example, partial census surveys are carried out periodically in lieu of a complete census of the population. In the physical sciences and engineering data are usually subject to measurement errors, as in models of image restoration from remote sensing experiments.
For some applications not much is lost by assuming that the value of the
uncertain data is known and then developing a deterministic mathemati­
cal programming model. Worst case or mean values can be used in this
respect because they provide reasonable approximations when either the
level of uncertainty is low, or when the uncertain parameters have a minor
impact on the system we want to model. For many applications, however,
uncertainty plays a key role in the performance of the real-world system:
worst case analysis often leads to conservative and potentially expensive
solutions, and solving the mean value problem, i.e., a problem where all random variables are replaced by their mean values, can even lead to nonsensical solutions since the mean of a random variable might not be a value that can be realized in practice.
A general approach to dealing with uncertainty is to assign to the unknown parameters a probability distribution, which should then be incorporated into an appropriate mathematical programming model. This chapter addresses the problem of planning under uncertainty and develops mathematical programming formulations using stochastic linear programming

models and robust optimization models. Section 13.2 discusses a classic example to highlight the issues involved. Sections 13.3 and 13.4 give mathematical programming models. Section 13.5 discusses diverse real-world applications and Section 13.6 focuses on an application from financial planning. Section 13.7 introduces a broad class of stochastic programming applications, namely that of stochastic network problems. Section 13.8 gives a row-action iterative algorithm for solving this special class of problems. Notes and references in Section 13.9 conclude this chapter.

Parallel Optimization, Yair Censor, Oxford University Press (1997), © 1997 by Oxford University Press, Inc., DOI: 10.1093/9780195100624.003.0013

13.1 Preliminaries
We introduce first some basic definitions on probability spaces that are
needed throughout this section. Additional background material on proba­
bility theory can be found, e.g., in Billingsley (1995) and, with emphasis on
stochastic programming, in Kall (1976) and Wets (1989). In this chapter
boldface Greek characters are used to denote random vectors which belong
to some probability space as defined below.
Let Ω be an arbitrary space or set of points. A σ-field for Ω is a family Σ of subsets of Ω such that Ω itself, the complement with respect to Ω of any set in Σ, and any union of countably many sets in Σ are all in Σ. The members of Σ are called measurable sets, or events in the language of probability theory. The set Ω with the σ-field Σ is called a measurable space and is denoted by (Ω, Σ).
Let Ω be a (linear) vector space and Σ a σ-field. A probability measure P on (Ω, Σ) is a real-valued function defined over the family Σ, which satisfies the following conditions: (i) 0 ≤ P(A) ≤ 1 for A ∈ Σ; (ii) P(∅) = 0 and P(Ω) = 1; and (iii) if {A_k} is a sequence of disjoint sets A_k ∈ Σ and if ∪_{k=1}^∞ A_k ∈ Σ, then P(∪_{k=1}^∞ A_k) = ∑_{k=1}^∞ P(A_k). The triplet (Ω, Σ, P) is called a probability space. The support of (Ω, Σ, P) is the smallest subset of Ω with probability 1. If the support is a countable set then the probability measure is said to be discrete. The term scenario is used for the elements of Ω of a probability space with a discrete distribution.
A proposition is said to hold almost surely (abbreviated a.s.) or P-almost surely if it holds on a subset A ⊂ Ω with P(A) = 1. The expected value of a random variable Q on (Ω, Σ, P) is the Stieltjes integral of Q with respect to the measure P:

E[Q] = ∫_Ω Q dP = ∫_Ω Q(ω) dP(ω).

The expectation of a constant function is also constant and it is easy to see that E[Q₁ + Q₂] = E[Q₁] + E[Q₂]. The kth moment of Q is the expected value of Q^k, i.e., E[Q^k] = ∫_Ω Q^k(ω) dP(ω). The variance of the random variable Q is defined as Var[Q] = E[Q²] − (E[Q])².
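On a discrete probability space these identities are easy to verify numerically. The following sketch (the scenario values and probabilities are invented for illustration, not taken from the text) checks linearity of expectation and the variance formula:

```python
# Sketch: E[Q1 + Q2] = E[Q1] + E[Q2] and Var[Q] = E[Q^2] - (E[Q])^2
# on a discrete probability space with three scenarios (invented data).

def expect(values, probs):
    """Expected value of a discrete random variable."""
    return sum(v * p for v, p in zip(values, probs))

probs = [0.2, 0.5, 0.3]   # a discrete measure P on three scenarios
q1 = [1.0, 4.0, 2.0]      # Q1(omega) per scenario
q2 = [3.0, 0.0, 5.0]      # Q2(omega) per scenario

# Linearity of expectation
lhs = expect([a + b for a, b in zip(q1, q2)], probs)
rhs = expect(q1, probs) + expect(q2, probs)
assert abs(lhs - rhs) < 1e-12

# Variance via the second moment
var_q1 = expect([v**2 for v in q1], probs) - expect(q1, probs) ** 2
print(var_q1)
```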



Finally, we give a formal but restricted (to our needs) definition of a conditional expectation. Let (Ω, Σ, P) be a probability space and suppose that A₁, A₂, ..., A_K is a finite partition of the set Ω. From this partition we form a σ-field A which is a subfield of Σ. Then the conditional expectation of the random variable Q(ω) on (Ω, Σ, P) given A at ω is denoted by E[Q | A] and defined as

E[Q | A](ω) = (1 / P(A_i)) ∫_{A_i} Q(ω) dP(ω)

for ω ∈ A_i, assuming that P(A_i) > 0.
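For a finite partition of a discrete sample space, the conditional expectation is simply a cell-by-cell average. A sketch with invented data (the sample points, measure, and partition below are illustrative only):

```python
# Sketch: E[Q | A] on a finite partition {A_1, ..., A_K} of a discrete
# sample space; all numbers are invented for illustration.

omega = [0, 1, 2, 3]                  # sample points
P = {0: 0.1, 1: 0.4, 2: 0.2, 3: 0.3}  # probability of each point
Q = {0: 5.0, 1: 1.0, 2: 2.0, 3: 4.0}  # random variable Q(omega)
partition = [{0, 1}, {2, 3}]          # the cells A_i generating the subfield A

def cond_expectation(w):
    """E[Q | A] at the point w: the P-average of Q over the cell containing w."""
    cell = next(A for A in partition if w in A)
    p_cell = sum(P[u] for u in cell)
    assert p_cell > 0                 # the definition requires P(A_i) > 0
    return sum(Q[u] * P[u] for u in cell) / p_cell

# E[Q | A] is constant on each cell of the partition
assert cond_expectation(0) == cond_expectation(1)

# Tower property: E[ E[Q | A] ] = E[Q]
tower = sum(cond_expectation(w) * P[w] for w in omega)
assert abs(tower - sum(Q[w] * P[w] for w in omega)) < 1e-12
print(cond_expectation(0), cond_expectation(3))
```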

13.2 The Newsboy Problem


To develop an understanding of stochastic programming problems we consider first the following simple problem of planning under uncertainty. On a street corner a young entrepreneur is selling newspapers that he buys from a local distributor each morning. He sells these papers for a profit p⁺ per unit, and any papers that remain at the end of the day are sold as scrap paper, in which case a net loss p⁻ is realized per unit. The demand for newspapers is a random variable ω which belongs to a probability space with support denoted by Ω = {ω ∈ ℝ | 0 ≤ ω < ∞} and probability distribution function P(ω). The problem is to choose the optimal number of papers x that should be bought from the local distributor.
An approach to modeling this situation is to consider a policy x as optimal if it maximizes the expected profit. Profit is a function of the policy and the demand random variable ω. Let F(x, ω) be the profit function:

F(x, ω) = { p⁺x                  if x ≤ ω,
          { p⁺ω − p⁻(x − ω)      if x > ω.

The expected value of the profit function is the Stieltjes integral with respect to the distribution function:

E[F(x, ω)] = ∫_Ω F(x, ω) dP(ω)
           = ∫₀^x (p⁺ω − p⁻(x − ω)) dP(ω) + ∫_x^∞ p⁺x dP(ω),
and the mathematical model for the newsboy problem is the following optimization problem with respect to x,

Maximize E[F(x, ω)]   (13.1)
s.t. x ≥ 0.           (13.2)
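As a concrete numerical illustration of (13.1)–(13.2), the sketch below (with invented prices and a discrete demand distribution) maximizes the expected profit by direct enumeration:

```python
# Numerical sketch of the newsboy model (13.1)-(13.2) for a discrete
# demand distribution; the prices and scenario data are invented.

p_plus, p_minus = 0.50, 0.25   # unit profit p+ / unit scrap loss p-
demand = [10, 20, 30, 40]      # demand scenarios
probs  = [0.1, 0.4, 0.3, 0.2]  # their probabilities

def profit(x, w):
    """F(x, omega): all x papers sold if x <= omega, else scrap the surplus."""
    if x <= w:
        return p_plus * x
    return p_plus * w - p_minus * (x - w)

def expected_profit(x):
    return sum(p * profit(x, w) for p, w in zip(probs, demand))

# Maximize E[F(x, omega)] over candidate order quantities
best = max(range(0, 41), key=expected_profit)
print(best, expected_profit(best))
```

For these data the maximizer agrees with the classical newsvendor critical-fractile rule: order the smallest x with P(ω ≤ x) ≥ p⁺/(p⁺ + p⁻) = 2/3, which gives x = 30.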

This is a simple example of a problem of planning under uncertainty. It is represented by an adaptive model, since decisions adapt as more information becomes available, i.e., as newspapers are sold to customers that arrive during the day. The model has fixed recourse, meaning that the reaction to the observed demand is fixed. That is, the number of newspapers sold for a profit is uniquely determined by the number of customers. The same is true for the surplus created at the end of the day, which is sold for scrap at a loss. Other forms of recourse action might have been possible, such as purchasing additional papers at a higher cost later during the day, or returning newspapers before the end of the day at a value higher than that of scrap paper. This simple, fixed recourse model does not allow for such considerations, and it also assumes that all risk preferences are captured by the expected value of the profit. Higher moments of the distribution of the profit function F(x, ω) are ignored. The next section presents mathematical models for planning under uncertainty in more complicated settings.

13.3 Stochastic Programming Problems


The following problem is the general formulation of stochastic programming:

Minimize E[f₀(x, ω)]
s.t. E[f_i(x, ω)] = 0,  i = 1, 2, ..., m,   (13.3)
x ∈ X ⊆ ℝⁿ.

The following notation is used: ω is a random vector with support Ω ⊂ ℝ^N, and P = P(ω) is a probability distribution function on ℝ^N. Also f₀ : ℝⁿ × Ω → ℝ ∪ {+∞}, f_i : ℝⁿ × Ω → ℝ, i = 1, 2, ..., m, and X is a closed set. Inequality constraints can be incorporated into this formulation with the use of slack variables.
The expectation functions

E[f_i(x, ω)] = ∫_Ω f_i(x, ω) dP(ω)   (13.4)

are assumed finite for all i = 0, 1, ..., m unless the set {ω | f₀(x, ω) = +∞} has a nonzero probability, in which case E[f₀(x, ω)] = +∞. The feasibility set

X̄ ≜ {x | E[f_i(x, ω)] = 0, i = 1, 2, ..., m} ∩ {x | E[f₀(x, ω)] < +∞}

is assumed to be nonempty.
The model (13.3) is a nonlinear programming problem, whose constraints and objective functions are represented by integrals. Much of the theory of stochastic programming is concerned with identifying the properties of these integral functions and devising suitable approximation schemes for their evaluation. Optimality conditions are derived from those for nonlinear programming, with the aid of subdifferential calculus for the expectation functions. However, the computation of solutions for these nonlinear programs poses serious challenges, since evaluation of the integrals can be an extremely difficult task, especially when the expectation functionals are multidimensional. There are even cases when the integrands are neither differentiable, nor convex, nor even continuous. A broad class of stochastic programming models, however, can be formulated as large-scale linear or nonlinear programs with a specially structured constraints matrix. Most of the work on parallel computing for stochastic programming focuses on the development of decomposition algorithms that exploit this special structure. In the next subsections we look at further refinements of the general stochastic programming formulation.

13.3.1 Anticipative models


Consider now the following situation. A decision x must be made in an uncertain world where the uncertainty is described by the random vector ω. The decision does not in any way depend on future observations, but prudent planning has to anticipate possible future realizations of the random vector.
In anticipative models feasibility is expressed in terms of probabilistic (or chance) constraints. For example, a reliability level α, where 0 < α < 1, is specified and constraints are expressed in the form

P{ω | g_i(x, ω) ≤ 0, i = 1, 2, ..., m} ≥ α,

where g_i : ℝⁿ × Ω → ℝ, i = 1, 2, ..., m. This constraint can be cast in the form of the general model (13.3) by defining f_i as follows:

f_i(x, ω) = { α − 1  if g_i(x, ω) ≤ 0,
            { α      otherwise.

The objective function may also be of a reliability type, such as P{ω | g₀(x, ω) ≤ γ}, where γ is a constant.
In summary, an anticipative model selects a policy that leads to some desirable characteristics of the constraint and objective functionals under the realizations of the random vector. In the example above it is desirable that the probability of satisfying the constraints is at least the prespecified reliability level α. The precise value of α depends on the application at hand, the cost of constraint violation, and other similar considerations.
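A chance constraint of this form can be checked numerically for a candidate decision. The sketch below (the constraint function g and the Gaussian demand model are invented for illustration) estimates the reliability level by Monte Carlo sampling:

```python
# Sketch: estimating whether a candidate decision x satisfies a chance
# constraint P{ g(x, omega) <= 0 } >= alpha by Monte Carlo sampling.
# The constraint g and the demand distribution are invented.

import random

random.seed(0)
alpha = 0.95
x = 12.0                       # candidate capacity decision

def g(x, w):
    """Constraint function: demand w must not exceed capacity x."""
    return w - x

# Sample the random parameter and estimate the reliability level
samples = [random.gauss(10.0, 1.0) for _ in range(100_000)]
reliability = sum(g(x, w) <= 0 for w in samples) / len(samples)
print(reliability, reliability >= alpha)
```

Here the true reliability is Φ(2) ≈ 0.977, so the estimate comfortably exceeds α = 0.95.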

13.3.2 Adaptive models


In an adaptive model, observations related to uncertainty become available before a decision x is made, such that optimization takes place in a learning environment. It is understood that observations provide only partial information about the random variables because otherwise the model would simply wait to observe the values of the random variables, and then make a decision x by solving a deterministic mathematical program. In contrast to this approach we have the other extreme situation where all observations are made after the decision x has been made, and the model becomes anticipative.

Let A be the collection of all the relevant information that could become available by making an observation. This A is a subfield of the σ-field of all possible events, generated from the support set Ω of the random vector ω. The decisions x depend on the events that could be observed, and x is termed A-adapted or A-measurable. Using the conditional expectation with respect to A, E[· | A], the adaptive stochastic program can be written as:

Minimize E[f₀(x(ω), ω) | A]
s.t. E[f_i(x(ω), ω) | A] = 0,  i = 1, 2, ..., m,   (13.5)
x(ω) ∈ X almost surely,

where the mapping x : Ω → X is such that x(ω) is A-measurable. This problem can be addressed by solving for every ω the following deterministic programs

Minimize E[f₀(x, ·) | A](ω)                         (13.6)
s.t. E[f_i(x, ·) | A](ω) = 0,  i = 1, 2, ..., m,    (13.7)
x ∈ X.                                              (13.8)

Each such problem for a given ω is of the canonical form (13.3).


The two extreme cases (i.e., complete information with A = Σ, or no information at all) deserve special mention. The case of no information reduces the model to the form of the anticipative model; when there is complete information the model (13.5) is known as the distribution model. The goal in this latter case is to characterize the distribution of the optimal objective function value. The precise values of the objective function and the optimal policy x are determined after realizations of the random vector ω are observed. The most interesting situations arise when partial information becomes available after some decisions have been made, and models to address such situations are discussed next.

13.3.3 Recourse models


The recourse problem combines the anticipative and adaptive models in a common mathematical framework. The problem seeks a policy that not only anticipates future observations but also takes into account that observations are made about uncertainty, and thus can adapt by taking recourse decisions. For example, a portfolio manager specifies the composition of a portfolio considering both future movements of stock prices (anticipation) and that the portfolio will be rebalanced as prices change (adaptation).

The two-stage version of this model has been studied extensively. It is amenable to formulations as a large-scale deterministic nonlinear program with a special structure of the constraints matrix. These formulations yield naturally to solution via decomposition algorithms and parallel computations. To formulate the two-stage stochastic program with recourse we need two vectors of decision variables to distinguish between the anticipative policy and the adaptive policy. The following notation is used.

x ∈ ℝ^{n₀} denotes the vector of first-stage decisions. These decisions are made before the random variables are observed and are anticipative.

y ∈ ℝ^{n₁} denotes the vector of second-stage decisions. These decisions are made after the random variables have been observed and are adaptive. They are constrained by decisions made at the first stage, and depend on the realization of the random vector.
We formulate the second-stage problem in the following manner. Once a first-stage decision x has been made, some realization of the random vector can be observed. Let q(y, ω) denote the second-stage cost function, and let {T(ω), W(ω), h(ω) | ω ∈ Ω} be the model parameters. These parameters are functions of the random vector ω and are, therefore, random parameters. T is the technology matrix of dimension m₁ × n₀. It contains the technology coefficients that convert the first-stage decision x into resources for the second-stage problem. W is the recourse matrix of dimension m₁ × n₁. h is the second-stage resource vector of dimension m₁.

The second-stage problem seeks a policy y that optimizes the cost of the second-stage decision for a given value of the first-stage decision x. We denote the optimal value of the second-stage problem by Q(x, ω). This value depends on the random parameters and on the value of the first-stage variables x. Q(x, ω) is the optimal value, for any given ω ∈ Ω, of the following nonlinear program

Minimize q(y, ω)
s.t. W(ω)y = h(ω) − T(ω)x,   (13.9)
y ∈ ℝ^{n₁}.

If this second-stage problem is infeasible then we set Q(x, ω) = +∞. The model (13.9) is an adaptation model in which y is the recourse decision and Q(x, ω) is the recourse cost function.
The two-stage stochastic program with recourse is an optimization problem in the first-stage variables x, which optimizes the sum of the cost of the first-stage decisions, f(x), and the expected cost of the second-stage decisions. It is written as follows.

Minimize f(x) + E[Q(x, ω)]
s.t. ⟨aⁱ, x⟩ = b_i,  i = 1, 2, ..., m₀,   (13.10)
x ∈ ℝ^{n₀},

where aⁱ denotes the transpose of the ith row of the m₀ × n₀ matrix A, and b_i is the ith component of the m₀-vector b. ⟨aⁱ, x⟩ = b_i, i = 1, 2, ..., m₀, are linear restrictions on the first-stage variables. This model can be cast in the general formulation (13.3) simply by denoting f₀(x, ω) = f(x) + Q(x, ω) and f_i(x, ω) = ⟨aⁱ, x⟩ − b_i.
A formulation that combines (13.9) and (13.10) is the following:

Minimize f(x) + E[ Min_{y ∈ ℝ^{n₁}} { q(y, ω) | T(ω)x + W(ω)y = h(ω) } ]
s.t. Ax = b,   (13.11)
x ∈ ℝ^{n₀},

where Min denotes the minimal function value.


Let K₁ = {x ∈ ℝ^{n₀} | Ax = b} denote the feasible set for the first-stage problem. Let also K₂ = {x ∈ ℝ^{n₀} | E[Q(x, ω)] < +∞} denote the set of induced constraints. This is the set of first-stage decisions x for which the second-stage problem is feasible. Problem (13.10) is said to have complete recourse if K₂ = ℝ^{n₀}, that is, if the second-stage problem is feasible for any value of x. The problem has relatively complete recourse if K₁ ⊆ K₂, that is, if the second-stage problem is feasible for any value of the first-stage variables that satisfies the first-stage constraints. Simple recourse refers to the case when the recourse matrix is W = (I, −I) and the recourse constraints take the simple form Iy⁺ − Iy⁻ = h(ω) − T(ω)x, where I is the identity matrix, and the recourse vector y is written as y = y⁺ − y⁻ with y⁺ ≥ 0, y⁻ ≥ 0.

Deterministic Equivalent Formulation


We consider now the case where the random vector has a discrete and finite distribution, with support Ω = {ω¹, ω², ..., ω^N}. In this case the set Ω is called a scenario set. Denote by p_s the probability of realization of the sth scenario ω^s. That is, for every s = 1, 2, ..., N,

p_s = Prob(ω = ω^s)
    = Prob{(q(y, ω), W(ω), h(ω), T(ω)) = (q(y, ω^s), W(ω^s), h(ω^s), T(ω^s))}.

It is assumed that p_s > 0 for all ω^s ∈ Ω, and that ∑_{s=1}^N p_s = 1.


The expected value of the second-stage optimization problem can be expressed as

E[Q(x, ω)] = ∑_{s=1}^N p_s Q(x, ω^s).   (13.12)

For each realization of the random vector ω^s ∈ Ω a different second-stage decision is made, which is denoted by y^s. The resulting second-stage problems can then be written as

Minimize q(y^s, ω^s)
s.t. W(ω^s)y^s = h(ω^s) − T(ω^s)x,   (13.13)
y^s ∈ ℝ^{n₁}.

Combining now (13.12) and (13.13) we reformulate the stochastic nonlinear program (13.11) as the following large-scale deterministic equivalent nonlinear program

Minimize f(x) + ∑_{s=1}^N p_s q(y^s, ω^s)               (13.14)
s.t. Ax = b,                                            (13.15)
T(ω^s)x + W(ω^s)y^s = h(ω^s) for all ω^s ∈ Ω,           (13.16)
x ∈ ℝ^{n₀},                                             (13.17)
y^s ∈ ℝ^{n₁}.                                           (13.18)

The constraints (13.15)–(13.18) for this deterministic equivalent program can be combined into a matrix equation with block-angular structure

⎛ A                            ⎞ ⎛ x   ⎞   ⎛ b      ⎞
⎜ T(ω¹)  W(ω¹)                 ⎟ ⎜ y¹  ⎟   ⎜ h(ω¹)  ⎟
⎜ T(ω²)         W(ω²)          ⎟ ⎜ y²  ⎟ = ⎜ h(ω²)  ⎟   (13.19)
⎜   ⋮                  ⋱       ⎟ ⎜ ⋮   ⎟   ⎜   ⋮    ⎟
⎝ T(ω^N)                W(ω^N) ⎠ ⎝ y^N ⎠   ⎝ h(ω^N) ⎠
Split-Variable Formulation
The system of linear equations in (13.19) can be rewritten in a form that is, for some algorithms, more amenable to decomposition and parallel computations. In particular, in the absence of the x variables the system (13.19) becomes block-diagonal. The split-variable formulation replicates the first-stage variable vector x into a set of vectors x^s ∈ ℝ^{n₀}, one for each ω^s ∈ Ω. Once a different first-stage decision is allowed for each scenario, the stochastic program decomposes into N independent problems. Of course, the first-stage variables must be nonanticipative, that is, they cannot depend on scenarios that have not yet been observed when the first-stage decisions are made. This requirement is enforced by adding the restrictions x¹ = x² = ... = x^N. The split-variable formulation is then equivalent to the original stochastic problem (13.14)–(13.18), for which equation (13.19) can be written in the equivalent form

⎛ A                                           ⎞ ⎛ x¹  ⎞   ⎛ b      ⎞
⎜ T(ω¹)  W(ω¹)                                ⎟ ⎜ y¹  ⎟   ⎜ h(ω¹)  ⎟
⎜               A                             ⎟ ⎜ x²  ⎟   ⎜ b      ⎟
⎜               T(ω²)  W(ω²)                  ⎟ ⎜ y²  ⎟ = ⎜ h(ω²)  ⎟   (13.20)
⎜                             ⋱               ⎟ ⎜ ⋮   ⎟   ⎜   ⋮    ⎟
⎜                               A             ⎟ ⎜ x^N ⎟   ⎜ b      ⎟
⎜                               T(ω^N) W(ω^N) ⎟ ⎜ y^N ⎟   ⎜ h(ω^N) ⎟
⎜ I       −I                                  ⎟ ⎜     ⎟   ⎜ 0      ⎟
⎝    ⋱         ⋱                              ⎠ ⎝     ⎠   ⎝   ⋮    ⎠

where the last block of rows enforces the nonanticipativity conditions x¹ = x² = ... = x^N.
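The effect of the nonanticipativity rows in (13.20) can be illustrated numerically. In the sketch below (invented data), dropping the constraints x¹ = x² = ... lets each scenario optimize its own first-stage copy; the resulting decomposed ("wait-and-see") value can only be at least as good as the nonanticipative ("here-and-now") optimum:

```python
# Sketch of the split-variable idea from (13.20), with invented data.
# Each scenario s gets its own copy x^s of the first-stage decision;
# without the nonanticipativity rows the problem decomposes into
# independent scenario subproblems.

cost, revenue = 10.0, 15.0
scenarios = [(0.5, 30.0), (0.5, 50.0)]  # (p_s, demand)

def scenario_cost(x, w):
    """Cost of scenario s when its first-stage copy is x^s = x."""
    return cost * x - revenue * min(x, w)

# Decomposed problem: each scenario optimizes its own x^s independently
wait_and_see = sum(p * min(scenario_cost(x, w) for x in range(0, 61))
                   for p, w in scenarios)

# Nonanticipative problem: one shared x (the rows x^1 = x^2 in (13.20))
here_and_now = min(sum(p * scenario_cost(x, w) for p, w in scenarios)
                   for x in range(0, 61))

print(wait_and_see, here_and_now)
assert wait_and_see <= here_and_now   # relaxing nonanticipativity can only help
```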

Multistage Recourse Problems


The recourse problem is not restricted to the two-stage formulation. It is possible that observations are made at K different stages and are captured in the information sets {A_t}_{t=1}^K, with A₁ ⊂ A₂ ⊂ ... ⊂ A_K. These sets are subfields of the underlying σ-field Σ of all possible observations. A multistage stochastic program with recourse will have a recourse problem at stage τ conditioned on the information provided by A_τ, which includes all information provided by the information sets A_t, for t = 1, 2, ..., τ. The program also anticipates the information in A_t, for t = τ + 1, ..., K.

Let the random vector have support Ω = Ω₁ × Ω₂ × ... × Ω_K, which is the product set of all individual support sets Ω_t, t = 1, 2, ..., K; ω is written componentwise as ω = (ω¹, ..., ω^K). Denote the first-stage variable vector by y⁰. For each stage t = 1, 2, ..., K define the recourse variable vector y^t ∈ ℝ^{n_t}, the random cost function q_t(y^t, ω^t), and the random parameters {T_t(ω^t), W_t(ω^t), h_t(ω^t) | ω^t ∈ Ω_t}.
The multistage program, which extends the two-stage model (13.11), is formulated as the following nested optimization problem

Minimize_{y⁰ ∈ ℝ^{n₀}}  f(y⁰) + E[ Min_{y¹ ∈ ℝ^{n₁}} q₁(y¹, ω¹) + ... + E[ Min_{y^K ∈ ℝ^{n_K}} q_K(y^K, ω^K) ] ... ]

s.t.  T₁(ω¹)y⁰ + W₁(ω¹)y¹ = h₁(ω¹),
      ⋮                                              (13.21)
      T_K(ω^K)y^{K−1} + W_K(ω^K)y^K = h_K(ω^K).

For the case of discrete and finitely distributed probability distributions it is again possible to formulate the multistage model into a deterministic equivalent large-scale nonlinear program. Section 13.9 provides references to the literature on multistage programs.



13.4 Robust Optimization Problems
This section considers an alternative approach to handling uncertain or noisy data. This approach is applicable to optimization models that have two distinct components: a structural component that is fixed and free of any noise in its input data; and a control component that is subject to noisy input data. In some cases the robust optimization model is identical to a two-stage stochastic program with recourse. But it also allows additional flexibility in dealing with noise. In order to define the model we introduce two sets of variables:

x ∈ ℝ^{n₀} denotes the vector of decision variables that depend only on the noise-free structural constraints. These are the design variables whose values are independent of realizations of the noisy parameters.

y ∈ ℝ^{n₁} denotes the vector of control variables that can be adjusted once the uncertain parameters are observed. Their optimal values depend both on the realization of uncertain parameters, and on the optimal values of the design variables.

This terminology is borrowed from the flexibility analysis of production and distribution systems. The design variables determine the structure of the system and the size of production modules; the control variables are used to adjust the mode and level of production in response to disruptions in the system, changes in demand or production yield, and so on.
The optimization model we are interested in has the following structure

Minimize ⟨c, x⟩ + ⟨d, y⟩   (13.22)
s.t. Ax = b,               (13.23)
Bx + Cy = e,               (13.24)
x ∈ ℝ^{n₀},                (13.25)
y ∈ ℝ^{n₁},                (13.26)

where b, c, d, e are given vectors and A, B, C are given matrices. Equation (13.23) denotes the structural constraints that are free of noise. Equation (13.24) denotes the control constraints. The coefficients of these constraints, i.e., the elements of B, C, and e, are subject to noise. The cost vector d is also subject to noise, while A, b, and c are not.

To define the robust optimization problem we introduce an index set Ω = {1, 2, ..., S}. With each index s ∈ Ω we associate the scenario set {d(s), B(s), C(s), e(s)} of realizations of the control coefficients. Reference
to an index s implies reference to the scenario set associated with this index. The probability of the sth scenario is p_s, and ∑_{s∈Ω} p_s = 1. Now the following question is posed: What are the desirable characteristics of a solution to problem (13.22)–(13.26) when the coefficients of the constraints (13.24) take values from some given set of scenarios? The solution is considered robust with respect to optimality if it remains close to optimal for any realization of the scenario index s ∈ Ω. The problem is then termed solution robust. The solution is robust with respect to feasibility if it remains almost feasible for any realization of s. The problem is then termed model robust. The concepts of close and almost are precisely defined later through the choice of appropriate norms.
It is unlikely that a solution to the mathematical program will remain
both feasible and optimal for all realizations of s. If the system being
modeled has substantial built-in redundancies, then it might be possible to
find solutions that remain both feasible and optimal. Otherwise a model
is needed that permits a trade-off between solution and model robustness.
The model developed next formalizes a way to measure this trade-off.
First let us introduce a set {y¹, y², ..., y^S} of control variables, one for each scenario s ∈ Ω, and another set {z¹, z², ..., z^S} of feasibility error vectors that measure the infeasibility of the control constraints under each scenario. The real-valued objective function ξ(x, y) = ⟨c, x⟩ + ⟨d, y⟩ is a random variable taking the value ξ_s(x, y^s) = ⟨c, x⟩ + ⟨d(s), y^s⟩ with probability p_s. Hence, there is no longer a simple single choice for an aggregate objective function. The expected value

σ(·) = ∑_{s∈Ω} p_s ξ_s(·)   (13.27)

is precisely the objective function used in the stochastic programming formulations studied in the previous section. Another choice is to employ worst case analysis and minimize the maximal value. The objective function is then defined by

σ(·) = max_{s∈Ω} ξ_s(·).   (13.28)

The robust optimization formulation also allows the introduction of higher moments of the distribution of ξ(·) in the optimization model. Indeed, the introduction of higher moments is one of the features of robust optimization that distinguishes it from the stochastic programming model of the previous sections. For example, we could use a nonlinear utility function that embodies a trade-off between mean value and variability in this mean value. If U(ξ_s) denotes the utility of ξ_s, then the function

σ(·) = ∑_{s∈Ω} p_s U(ξ_s(·))   (13.29)

captures the risk preference of the user. A popular choice of utility functions, for portfolio management applications, is the logarithmic function U(ξ_s) = log ξ_s. The general robust optimization model includes a term σ(x, y¹, y², ..., y^S) in the objective function to denote the dependence of the function value on the scenario index s. This term controls solution robustness, and can take different forms depending on the application. The examples mentioned above are some popular choices.
The robust optimization model introduces a second term in the objective function to control model robustness. This term is a feasibility penalty function, denoted by ρ(z¹, z², ..., z^S), and it is used to penalize violations of the control constraints under some of the scenarios. The introduction of this penalty function also distinguishes the robust optimization model from the stochastic programming approach for dealing with noisy data. In particular, the model recognizes that it may not always be possible to arrive at a feasible solution to a problem under all scenarios. Infeasibilities will inevitably arise, and they will be dealt with outside the optimization model. The robust optimization model generates solutions that present the modeler with the fewest infeasibilities to be dealt with outside the model.

The specific choice of penalty function is problem dependent, and it also has implications for the choice of a solution algorithm. Two suitable penalty functions are the following:

ρ(z¹, z², ..., z^S) = ∑_{s∈Ω} p_s ||z^s||₂². This quadratic penalty function (i.e., a weighted ℓ₂-norm) is applicable to equality control constraints where both positive and negative violations of the constraints are equally undesirable. The resulting quadratic programming problem is twice continuously differentiable, and can be solved using standard quadratic programming algorithms, although it is typically large scale.

ρ(z¹, z², ..., z^S) = ∑_{s∈Ω} p_s max{0, max_j z_j^s}. This penalty function is applicable to inequality control constraints when only positive violations are of interest (negative values of some z_j^s indicate slack in the inequality constraints). With this choice of penalty function, however, the resulting mathematical program is nondifferentiable. It is possible to use an ε-smoothing of the exact penalty function, and employ the Linear Quadratic Penalty (LQP) algorithm (Chapter 7). The result is a differentiable problem that is easier to solve and produces a solution that lies within ε of the solution of the nondifferentiable problem.
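A small sketch comparing the two penalties on invented error vectors; the smoothing shown is one standard quadratic construction for max{0, t}, illustrative of the idea rather than the exact LQP smoothing of Chapter 7:

```python
# Sketch of the two penalty functions rho discussed above, plus a
# quadratic epsilon-smoothing of the nondifferentiable one (invented data).

probs = [0.3, 0.7]
z = [[0.5, -1.0], [2.0, 0.0]]   # error vectors z^s, one per scenario

# Weighted quadratic penalty: sum_s p_s * ||z^s||_2^2
quad = sum(p * sum(zj**2 for zj in zs) for p, zs in zip(probs, z))

# Exact penalty for inequality constraints: positive violations only
exact = sum(p * max(0.0, max(zs)) for p, zs in zip(probs, z))

def smooth_max0(t, eps=1e-2):
    """A quadratic epsilon-smoothing of max(0, t): differentiable at t = 0."""
    if t <= 0.0:
        return 0.0
    if t >= eps:
        return t - eps / 2.0
    return t * t / (2.0 * eps)

smooth = sum(p * smooth_max0(max(zs)) for p, zs in zip(probs, z))
print(quad, exact, smooth)   # smooth stays within eps/2 of the exact penalty
```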

The robust optimization model takes a multicriteria objective form. A goal programming weight parameter λ is used to derive a spectrum of answers that trade off solution for model robustness. The general formulation of the robust optimization model is stated as follows:

Minimize σ(x, y¹, y², ..., y^S) + λρ(z¹, z², ..., z^S)
s.t. Ax = b,
B(s)x + C(s)y^s + z^s = e(s) for all s ∈ Ω,
x ∈ ℝ^{n₀},
y^s ∈ ℝ^{n₁},
z^s ∈ ℝ^{m₁}.
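The role of λ can be seen in a one-dimensional sketch (all data invented, with σ a linear cost and ρ the quadratic feasibility penalty): as λ grows, the minimizer of σ + λρ moves toward scenario feasibility, tracing the trade-off between solution and model robustness:

```python
# Sketch (invented one-dimensional instance) of the lambda trade-off:
# minimizing sigma(x) + lambda * rho(x) for increasing lambda drives the
# solution toward model robustness (smaller constraint violations).

scenarios = [(0.5, 30.0), (0.5, 50.0)]   # (p_s, e(s)): scenario targets for x

def sigma(x):
    return 1.0 * x                        # solution-robustness term (cost)

def rho(x):
    # quadratic feasibility penalty: z^s = e(s) - x enters ||z^s||^2
    return sum(p * (e - x) ** 2 for p, e in scenarios)

def argmin(lam, grid):
    return min(grid, key=lambda x: sigma(x) + lam * rho(x))

grid = [i / 10.0 for i in range(0, 601)]          # candidate x in [0, 60]
xs = [argmin(lam, grid) for lam in (0.01, 0.5, 5.0)]
print(xs)   # minimizers move toward the scenario targets as lambda grows
```

(In closed form the unconstrained minimizer is x = 40 − 1/(2λ), so the sweep climbs toward the mean scenario target.)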

13.5 Applications
In this section we discuss real-world applications where uncertainty is prevalent and wherein it is handled using the models introduced above. We give first (subsection 13.5.1) an illustration of the robust optimization framework, using the classic diet problem as an example. The other subsections discuss models for planning under uncertainty in production and inventory management (subsection 13.5.2) and models for matrix balancing (subsection 13.5.3), with a separate section devoted to models for financial planning under uncertainty (Section 13.6).
13.5.1 Robust optimization for the diet problem
The well-known diet problem is used here as an example to illustrate the
feature of model robustness. This feature is particularly interesting in the
context of optimization formulations, since feasibility has traditionally been
overemphasized.
The problem is to find a minimum-cost diet that satisfies certain nutritional requirements. The origins of this problem date back to the 1940s and to the works of G.J. Stigler and G.B. Dantzig, where it was soon recognized as a problem of robust optimization, since the nutritional content of some food products may not be certain. Dantzig was still intrigued by this ambiguity when he wrote, several decades later:
When is an apple an apple and what do you mean by its cost
and nutritional content? For example, when you say apple
do you mean a Jonathan, or McIntosh, or Northern Spy, or
Ontario, or Winesap, or Winter Banana? You see, it can make
a difference, for the amount of ascorbic acid (vitamin C) can
vary from 2.0 to 20.8 units per 100 grams depending upon the
type of apple. (Dantzig, 1990.)
The standard linear programming formulation assumes an average nu­
tritional content for each food product and produces a diet. However, as
consumers buy food products of varying nutritional content they will soon
build a deficit or surplus of some nutrients. This situation may be irrele­
vant for a healthy individual over long periods of time, or it may require
remedial action in the form of vitamin supplements.

We develop here a robust optimization formulation for this problem. Let x_i denote the total cost of food type i in the diet; let a_{ij} denote the content of food type i in nutrient j per unit cost; and let b_j be the required daily allowance of nutrient j. Let c denote one specific nutrient, e.g., vitamin C, from the set of nutrients, and let A denote one specific food, e.g., apples, from the set of foods. A point estimate for the content of apples in vitamin C is a_{Ac} = \bar{a} per unit cost. For the sake of the example assume that this coefficient can take any value \{a_{Ac}^s\} for s in a scenario set \Omega. The robust optimization formulation of the diet problem is then
Minimize   \sum_i x_i + \lambda \sum_{s \in \Omega} \Big( b_c - \sum_{i \neq A} a_{ic} x_i - a_{Ac}^s x_A \Big)^2      (13.30)

s.t.   \sum_i a_{ij} x_i = b_j   for all j \neq c,      (13.31)

       x_i \geq 0   for all i.      (13.32)

The weight λ is used to trade off feasibility robustness with cost. For λ = 0, and also allowing the index j in (13.31) to take the value j = c, with a_{Ac} = \bar{a}, we obtain the classic linear programming formulation.
The diets obtained with larger values of λ have vitamin C content that varies very little with respect to the quality of apples. Figure 13.1 illustrates the error in vitamin C intake of alternative robust optimization solutions under different scenarios of vitamin C content. This figure illustrates the efficacy of the robust optimization model in hedging against alternative realizations of the data. For example, if an error of ±2ε units in total vitamin C intake is acceptable, no remedial action will be needed for the robust optimization dieter, no matter what quality of apples is included in the diet. On the other hand, the linear programming dieter will need remedial treatment for several of the available apple qualities (note, from Figure 13.1, that the error in vitamin C intake exceeds the allowable margin of ±2ε under seven scenarios).
The robust optimization diet is somewhat more expensive than the diet
produced by the linear program. Figure 13.2 shows the increase in the cost
of the diet as it becomes more robust with respect to nutritional content.
This simple example clarifies the meaning of a robust solution and shows
that robust solutions are indeed possible, but at some cost.
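The trade-off can be reproduced on a toy robust diet. The sketch below uses hypothetical data (one uncertain food and one certain supplement, echoing Dantzig's 2.0 to 20.8 units of vitamin C per 100 grams) and a brute-force grid search; the numbers, names, and the search strategy are ours, chosen only to mimic the behavior of formulation (13.30)-(13.32).

```python
import numpy as np

# Hypothetical data: vitamin C per dollar of apples varies by scenario,
# while a supplement delivers a certain amount per dollar.
a_apple = np.array([2.0, 8.0, 14.0, 20.8])  # units per dollar, one per scenario
a_supp = 10.0                               # units per dollar, certain
b_c = 60.0                                  # required daily allowance

def objective(x_apple, x_supp, lam):
    """Cost plus lam times the summed squared scenario violations."""
    violation = b_c - a_apple * x_apple - a_supp * x_supp
    return x_apple + x_supp + lam * np.sum(violation ** 2)

def best_diet(lam, grid=np.linspace(0.0, 8.0, 161)):
    """Brute-force minimization over a coarse grid (illustration only)."""
    best = min((objective(xa, xs, lam), xa, xs) for xa in grid for xs in grid)
    return best[1], best[2]  # (dollars of apples, dollars of supplement)
```

Raising λ pushes the diet toward the supplement, whose vitamin C content is certain: the robust diet costs slightly more, but its worst-case vitamin C error shrinks, just as Figures 13.1 and 13.2 suggest.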
13.5.2 Robust optimization for planning capacity expansion
Manufacturing and service firms have to plan for capacity expansion in
order to meet increasing demand for their products and services over time.
Demand, however, is usually highly uncertain. Product demand depends
on general economic conditions, competition, technological changes, and
the general business cycle. Demand may also exhibit seasonal variations,
which are particularly difficult to address in the context of service operations since an inventory of services cannot be created during periods of low demand. Public utility companies (power and water distribution) face both seasonal and daily variations. The hotel and travel businesses are also highly seasonal. Stochasticities are not restricted to demand alone; equipment failure, delivery of material by suppliers, and routine maintenance operations also show varying degrees of uncertainty.

Figure 13.1 Error (negative for deficit, positive for surplus) of the dieter's intake of vitamin C. The leftmost scenario corresponds to the lowest vitamin C content, while the rightmost scenario corresponds to the highest vitamin C content. The diet obtained with the linear programming formulation (i.e., λ = 0) is very sensitive to the vitamin content of the food products, whereas the diet obtained with the robust optimization model (i.e., λ = 5) is much less sensitive.
Planning for capacity expansion must account for the stochastic aspects of the system. A conservative plan may be developed by estimating capacity based on worst case analysis, wherein capacity must be sufficient to meet the demands of the last customer during the peak season. This strategy is quite expensive, and the prudent manager must carefully weigh the marginal profit from the last sale against the cost of maintaining underutilized manufacturing or service facilities. The United States automobile industry arrived at this conclusion in the early eighties and slashed capacity to increase utilization of its facilities, even at the cost of losing some potential customers. The decision made front-page news in The Wall Street Journal on October 7, 1986. Ford's chief financial officer summarized the essence of this strategy:



Figure 13.2 Trade-off between cost (i.e., solution robustness) and expected error in the vitamin C contents (i.e., model robustness) for diets obtained using increasing values of the weight parameter λ in the robust optimization model.

“We arrived at a conscious willingness to give up the last vehicle we needed in peak years.”

In this section we formulate a stochastic programming model that plans capacity expansion for a manufacturing firm.

The Multiperiod Stochastic Program


We consider, for simplicity, the example of a firm producing a single product
at multiple manufacturing sites. The model deals with uncertain product
demand, and can be easily extended to handle multiple products, which
is a more realistic situation. We begin with a stochastic programming
formulation such that solution robustness is dealt with as an extension of
the stochastic model.
The model makes capacity expansion decisions at K plant sites during a planning horizon of T time periods. There are H + 1 different decisions that can be made with respect to each plant site, denoted by h = 0, 1, 2, ..., H, where h = 1 signifies the current state of the plant, h = 0 indicates that the plant is shut down, and h = 2, 3, ..., H signal the various retooling options that are possible at each plant. Stochasticity in demand is dealt with by postulating a set of scenarios Ω = {1, 2, ..., N}.
The model makes first-stage decisions on capacity planning, i.e., which plants to shut down or to retool, and which plants to maintain at their current status. The recourse decisions are the production levels for each time period and under the different scenarios.

Notation
We define first the parameters of the model: use k = 1, 2, ..., K to denote plant sites, h = 0, 1, 2, ..., H to denote plant configurations, s ∈ Ω to denote scenarios, and t = 1, 2, ..., T to denote time periods.



p_{st} : the probability that scenario s occurs at time period t.
d_{st} : the demand for the product under scenario s, during time period t.
α : the fraction of unmet demand that is translated to profit by being diverted to other products of the same firm.
The capacity parameters are:
U_{kht} : the capacity available at site k under configuration h during the tth time period.
a_{kh} : the production coefficients, indicating the capacity required to produce one unit of the product at site k under configuration h.
L_{kht} : the capacity lost during the retooling process if site k is retooled into configuration h during the tth time period.
The cost coefficients, given next, are discounted to the initial time period, using an appropriate discount rate:
F_{kht} : the fixed cost for changing the configuration of site k from h = 1 to h = 0, 2, ..., H during the tth time period.
r_{kht} : the marginal contribution of producing and selling one unit of the product at site k, using configuration h at the tth time period.
r : the marginal contribution realized when there is unmet demand and a fraction α of it is diverted to other products.
Now we define the decision variables. There are two sets of continuous variables that denote the schedule of production and the level of unmet demand. Two sets of integer variables are used to denote retooling decisions and plant configurations.
x_{khst} : the number of units produced at site k, operating under configuration h under scenario s during the tth time period.
z_{st} : the number of units of unmet demand under scenario s during the tth time period.
y_{kht} = 1 if site k is in configuration h at time period t, and 0 otherwise.
w_{kht} = 1 if site k is retooled to configuration h at time period t, and 0 otherwise.
The continuous variables are constrained to be nonnegative. The integer variables are binary, that is, y_{kht} and w_{kht} are either 0 or 1.
Model Formulation
We now define the model by specifying precisely the objective function and
the constraints.

Objective function: The objective function that must be maximized has two
terms that account for direct profits from sales of the product and from
indirect profits from diverted demand, and a third term that accounts for
the retooling cost. It takes the form:



\sum_{s=1}^{N} \sum_{t=1}^{T} p_{st} \Big( \sum_{k=1}^{K} \sum_{h=0}^{H} r_{kht} x_{khst} \Big) + \alpha\, r \sum_{s=1}^{N} \sum_{t=1}^{T} p_{st} z_{st} - \sum_{t=1}^{T} \sum_{k=1}^{K} \sum_{h=0}^{H} F_{kht} w_{kht}.      (13.33)
Demand constraints: For each time period and for each scenario the total
production from all sites under all configurations plus the unmet demand
diverted to other products is equal to the total realized demand.

\sum_{k=1}^{K} \sum_{h=0}^{H} x_{khst} + z_{st} = d_{st}   for all s = 1, 2, \ldots, N,  t = 1, 2, \ldots, T.      (13.34)

Capacity constraints: The total production capacity utilized at each site cannot exceed the capacity available at the site under the given configuration, taking into account losses of production capacity due to retooling operations. This is described for all k = 1, 2, ..., K, all h = 0, 1, ..., H, all s = 1, 2, ..., N, and all t = 1, 2, ..., T by

a_{kh} x_{khst} \leq U_{kht} y_{kht} - L_{kht} w_{kht}.      (13.35)

Retooling constraints: We consider now the logical conditions between the retooling decisions and the plant configurations. A plant cannot be in a given configuration h ≠ 1 unless it has first been retooled from its original configuration h = 1. For the first time period this condition is imposed by the constraint y_{kh1} \leq w_{kh1}. For subsequent time periods we impose the constraints y_{kht} - y_{kh(t-1)} \leq w_{kht} for all k = 1, 2, ..., K, all h = 0, ..., H, and all t = 2, 3, ..., T.
Operational considerations usually dictate that a plant cannot be retooled more than once during the planning horizon. Hence, we add the constraint:

\sum_{h=0}^{H} \sum_{t=1}^{T} w_{kht} \leq 1   for all k = 1, 2, \ldots, K.      (13.36)

Finally, we require that each plant operate under some configuration or be shut down. These considerations are imposed by the constraints:

\sum_{h=0}^{H} y_{kht} = 1   for all k = 1, 2, \ldots, K,  t = 1, 2, \ldots, T,      (13.37)

w_{k0t} \leq y_{k0t}   for all k = 1, 2, \ldots, K,  t = 1, 2, \ldots, T.      (13.38)
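The logical constraints (13.36)-(13.38), together with the retooling logic, can be checked mechanically for any candidate binary arrays. A minimal sketch follows; the function name and the 0-based index layout are our own conventions, with h = 1 playing the role of the original configuration.

```python
import numpy as np

def retooling_feasible(y, w):
    """Check (13.36)-(13.38) and the retooling logic for binary arrays
    y[k, h, t] and w[k, h, t], with h = 0..H and t = 0..T-1 (0-based)."""
    K, H1, T = y.shape
    for k in range(K):
        if w[k].sum() > 1:                 # (13.36): at most one retooling
            return False
        for t in range(T):
            if y[k, :, t].sum() != 1:      # (13.37): exactly one configuration
                return False
            if w[k, 0, t] > y[k, 0, t]:    # (13.38): shutdown logic
                return False
            for h in range(H1):
                # before the first period the plant sits in configuration h = 1
                prev = y[k, h, t - 1] if t > 0 else (1 if h == 1 else 0)
                if y[k, h, t] - prev > w[k, h, t]:  # y_kht - y_kh(t-1) <= w_kht
                    return False
    return True
```

For example, moving a plant from configuration 1 to configuration 2 in the second period is feasible only if the corresponding retooling variable is switched on.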

Robustness Considerations
In practical capacity expansion applications, the expected cost or profit of
the decision is not the only objective. Because large amounts of capital
and other resources are involved, and the careers of many employees are



at stake, some form of risk measure should be incorporated into the model with the objective of reducing it. Models of robust optimization, and especially those with solution robustness, provide the framework for dealing with risk.
In order to produce capacity expansion plans that are solution robust we define first a scenario-dependent measure P_{st} of marginal profit from production at each time period t. We also define the total profit P_s under each scenario s as the total marginal profit from production during the planning horizon, less the fixed cost of the capacity expansion plan. We then impose additional constraints on the optimization model (13.34)-(13.38) so that some acceptable level of profit is realized under all scenarios.
The marginal profit realized under scenario s, from a given capacity plan and production schedule, is given by:

P_{st} = \sum_{k=1}^{K} \sum_{h=0}^{H} r_{kht} x_{khst} + \alpha\, r z_{st}.      (13.39)

The total profit for a given scenario s, accounting for the fixed cost of a capacity expansion plan, is given by:

P_s = \sum_{t=1}^{T} P_{st} - \sum_{t=1}^{T} \sum_{k=1}^{K} \sum_{h=0}^{H} F_{kht} w_{kht}.      (13.40)

One way to introduce solution robustness now is to obtain solutions that have maximum expected profit for a given level of variance of profit. This approach, which is common in the finance literature, is known as the Markowitz criterion. A range of such solutions can be obtained by trading off expected profit for variance, that is, reducing the level of acceptable variance, which also reduces the expected profit. This trade-off can be achieved by setting up the objective of the optimization model as follows:

Maximize   E[P_s] - \lambda\, Var[P_s],      (13.41)

where E[·] and Var[·] denote the expected value and the variance of the random variable respectively, and λ is a user-specified parameter. Large values of λ reduce variance at the expense of reduced profits, while smaller values allow the variance to increase, producing higher returns at a penalty of increased risk.
Calculation of the variance requires evaluation of a quadratic function
and the resulting optimization program becomes a nonlinear program with

continuous and integer variables. Such problems are very difficult to solve with currently available software systems, and they are likely to remain so in the future. Moreover, the use of a variance term as a measure of risk is inappropriate when the distribution of profits is not symmetric. The decision-maker wishes to reduce only downside risk, but by reducing variance the model reduces both upside potential and downside risk. An alternative formulation for robust capacity expansion planning imposes constraints that limit downside risk alone and gives rise to a linear optimization program. A target level of profit p is set and linear constraints are added to the optimization program of the previous section such that the profit is greater than p for all scenarios, i.e., P_s \geq p for all s = 1, 2, ..., N. If p is assigned a large negative value then these constraints are not binding and the model obtains a capacity expansion plan that maximizes expected profit. As p is increased the model is forced to seek capacity plans that are guaranteed to have a profit of at least p under all scenarios. Such plans, however, have reduced expected profit.
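The two risk criteria are easy to contrast on toy data. In the sketch below, three hypothetical capacity plans with equiprobable scenario profits are scored by the Markowitz objective (13.41) and filtered by the downside-risk constraint P_s ≥ p; all numbers and names are illustrative assumptions, not data from the text.

```python
import numpy as np

# Hypothetical scenario profits for three candidate capacity plans
# (rows: plans; columns: equiprobable scenarios).
P = np.array([[100.0, 100.0, 100.0],   # safe plan
              [ 60.0, 120.0, 180.0],   # moderate plan
              [ 40.0, 150.0, 230.0]])  # aggressive plan

def best_plan(lam):
    """Index of the plan maximizing E[P_s] - lam * Var[P_s] (Markowitz)."""
    return int(np.argmax([row.mean() - lam * row.var() for row in P]))

def best_plan_downside(p_target):
    """Maximize expected profit among plans with P_s >= p_target in every
    scenario; returns None if no plan meets the target."""
    ok = [i for i in range(len(P)) if P[i].min() >= p_target]
    return max(ok, key=lambda i: P[i].mean()) if ok else None
```

Sweeping λ from 0 upward moves the choice from the aggressive plan to the safe one, tracing the trade-off described above; raising the target p has the same qualitative effect while keeping the program linear.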
13.5.3 Robust optimization for matrix balancing
The problem of matrix balancing was examined earlier (Chapter 9), and entropy optimization models were developed wherein the problem data were assumed consistent, so that the sets of feasible solutions to Problems 9.2.3 and 9.2.4 were nonempty. For inconsistent data an interval-constrained formulation was proposed that allowed the solution of the matrix balancing problem in a way that satisfied the constraints within an error of ε. Here we develop a robust optimization model that provides an alternative way to overcome difficulties due to data inconsistency.
Consider the following equality-constrained optimization problem

Minimize   \sum_{i=1}^{m} \sum_{j=1}^{n} x_{ij} \log\Big(\frac{x_{ij}}{a_{ij}}\Big)      (13.42)

s.t.   \sum_{j=1}^{n} x_{ij} = u_i   for i = 1, 2, \ldots, m,      (13.43)

       \sum_{i=1}^{m} x_{ij} = v_j   for j = 1, 2, \ldots, n,      (13.44)

       X \geq 0.      (13.45)

When the observation vectors u and v are noisy it is possible that this optimization problem has no solution. Clearly, if \sum_{i=1}^{m} u_i \neq \sum_{j=1}^{n} v_j the optimization problem has no feasible solution. Several approaches can be pursued in order to overcome this difficulty. Tradition suggests that the vectors u and v be first scaled using the transformation u_i \leftarrow u_i \big( (\sum_j v_j) / (\sum_i u_i) \big) for all i so that feasibility is restored. Alternatively,

the interval-constrained formulation suggested in Problem 9.2.5 may be used, because with sufficiently large values of the interval parameter ε the program becomes feasible.
The robust optimization formulation of the matrix estimation problem explicitly accounts for potential infeasibilities in the linear constraints. It then introduces a penalty term in the objective function that minimizes a norm of the infeasibilities. Let y \in \mathbb{R}^m and z \in \mathbb{R}^n denote the infeasibility vectors for the constraints (13.43) and (13.44) respectively. The robust optimization model is written as

Minimize   \sum_{i=1}^{m} \sum_{j=1}^{n} x_{ij} \log\Big(\frac{x_{ij}}{a_{ij}}\Big) + \frac{\lambda}{2} \Big( \sum_{i=1}^{m} y_i^2 + \sum_{j=1}^{n} z_j^2 \Big)      (13.46)

s.t.   \sum_{j=1}^{n} x_{ij} - y_i = u_i   for i = 1, 2, \ldots, m,      (13.47)

       \sum_{i=1}^{m} x_{ij} - z_j = v_j   for j = 1, 2, \ldots, n,      (13.48)

       x \geq 0.      (13.49)

This formulation is a direct application of the robust optimization model with quadratic penalty for infeasibilities. It is possible to arrive at a similar mathematical formulation beginning with statistical arguments on the desirable properties of the balanced matrix. In particular, this formulation results in a Bayesian estimate of the matrix, that is, it maximizes (the logarithm of) the probability of the matrix X = (x_{ij}), conditional on the noisy observations \{u_i, v_j\}. The entropy term in the objective function estimates the matrix that is the least biased (or maximally uncommitted) with respect to missing information, conditioned on the observations \{a_{ij}\}. The quadratic terms are the logarithms of the probability distribution function of the error (i.e., noise) term, conditioned on the matrix X = (x_{ij}), assuming that the errors are normally distributed with mean zero and standard deviations that are identical for all observations.

Row-action Algorithm for Robust Optimization of Matrix Balancing


Both the quadratic and entropy terms in the objective function of the
robust optimization models are Bregman functions (see Section 2.1) and
so is their sum. It is easy to verify that both functions have the strong
zone consistency property with respect to the hyperplanes specified by the
equality constraints (13.47)-(13.48). Hence, we can apply a row-action
algorithm (Algorithm 6.4.1) to solve this model.
Consider first the application of the iterative step of Algorithm 6.4.1 to the ith row of the constraints (13.47). At the νth iteration it takes the form

x_{ij}^{\nu+1} = x_{ij}^{\nu} \exp(\beta)   for all j = 1, 2, \ldots, n,      (13.50)

y_i^{\nu+1} = y_i^{\nu} - \frac{\beta}{\lambda}.      (13.51)



It is also important to observe that since the algorithm is applied to equality constraints there is no need to explicitly update the dual variables, and so they are omitted from Algorithm 13.5.1 below. The projection parameter β is calculated such that (x^{\nu+1}, y^{\nu+1}) satisfy the ith constraint. Hence, substituting (13.50)-(13.51) into the ith equation of (13.47) we obtain

\sum_{j=1}^{n} x_{ij}^{\nu} \exp(\beta) - y_i^{\nu} + \frac{\beta}{\lambda} = u_i.      (13.52)

Let \Psi(\beta) be the nonlinear function

\Psi(\beta) = \sum_{j=1}^{n} x_{ij}^{\nu} \exp(\beta) - y_i^{\nu} + \frac{\beta}{\lambda} - u_i.      (13.53)

We seek a β* such that \Psi(\beta^*) = 0. We may use any nonlinear equation solver (e.g., Newton's method) to solve \Psi(\beta) = 0. However, a sufficient approximation to β* can be obtained by taking a single Newton step, starting from β = 0, similar to the use of secant approximations discussed earlier (Sections 6.9 and 12.4.2). The asymptotic convergence of the row-action algorithm is preserved if this approximation is used in the iterative step instead of the exact value β*.
It is worth mentioning that this approximation can be calculated in closed form, whereas the exact calculation of β* requires the use of an iterative procedure. Obtaining closed-form solutions to the nonlinear system of equations has important implications for the efficient implementation of the algorithm (see Section 12.4.2). The approximate solution of the nonlinear equation is obtained as

\beta = -\frac{\Psi(0)}{\Psi'(0)},      (13.54)

where \Psi' denotes the first derivative with respect to β. Straightforward calculations yield

\beta = \frac{u_i + y_i^{\nu} - \sum_{j=1}^{n} x_{ij}^{\nu}}{\sum_{j=1}^{n} x_{ij}^{\nu} + \frac{1}{\lambda}}.      (13.55)

This value of β is used in (13.50)-(13.51) to complete the iterative step over the constraint equations (13.47). Following a similar argument we obtain the iterative step of the row-action algorithm over the constraint equations (13.48). The complete algorithm is summarized below.



Algorithm 13.5.1 Row-Action Algorithm for Robust Matrix Balancing Optimization Model

Step 0: (Initialization.) Set ν = 0 and choose x^0 \in \mathbb{R}^{mn}, y^0 \in \mathbb{R}^m, z^0 \in \mathbb{R}^n, such that the initialization conditions of Algorithm 6.4.1 are satisfied. For example, set x_{ij}^0 = a_{ij}, y_i^0 = 0, z_j^0 = 0 for all i = 1, 2, ..., m, j = 1, 2, ..., n.

Step 1: (Iterative step over rows of the matrix, i.e., equations (13.47).) For all i = 1, 2, ..., m, calculate:

\beta_i = \frac{u_i + y_i^{\nu} - \sum_{j=1}^{n} x_{ij}^{\nu}}{\sum_{j=1}^{n} x_{ij}^{\nu} + \frac{1}{\lambda}},      (13.56)

x_{ij}^{\nu+\frac{1}{2}} = x_{ij}^{\nu} \exp(\beta_i),      (13.57)

y_i^{\nu+1} = y_i^{\nu} - \frac{\beta_i}{\lambda}.      (13.58)

Step 2: (Iterative step over columns of the matrix, i.e., equations (13.48).) For all j = 1, 2, ..., n, calculate:

\beta_j = \frac{v_j + z_j^{\nu} - \sum_{i=1}^{m} x_{ij}^{\nu+\frac{1}{2}}}{\sum_{i=1}^{m} x_{ij}^{\nu+\frac{1}{2}} + \frac{1}{\lambda}},      (13.59)

x_{ij}^{\nu+1} = x_{ij}^{\nu+\frac{1}{2}} \exp(\beta_j),      (13.60)

z_j^{\nu+1} = z_j^{\nu} - \frac{\beta_j}{\lambda}.      (13.61)

Step 3: Replace ν ← ν + 1, and return to Step 1.
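Algorithm 13.5.1 is short enough to state directly in code. The numpy sketch below (the function name and test data are ours) applies the single-Newton-step projections (13.56)-(13.61); any inconsistency between the row and column totals is absorbed by the infeasibility vectors y and z.

```python
import numpy as np

def robust_balance(a, u, v, lam=1.0, iters=1000):
    """Row-action sketch for the robust matrix balancing model
    (13.46)-(13.49): balance matrix a so that row sums track u and
    column sums track v, with infeasibilities y (rows) and z (columns)."""
    x = a.astype(float).copy()
    y = np.zeros(len(u))   # row infeasibility vector
    z = np.zeros(len(v))   # column infeasibility vector
    for _ in range(iters):
        # Step 1: single Newton step onto each row constraint (13.47)
        for i in range(len(u)):
            s = x[i, :].sum()
            beta = (u[i] + y[i] - s) / (s + 1.0 / lam)   # (13.56)
            x[i, :] *= np.exp(beta)                      # (13.57)
            y[i] -= beta / lam                           # (13.58)
        # Step 2: single Newton step onto each column constraint (13.48)
        for j in range(len(v)):
            s = x[:, j].sum()
            beta = (v[j] + z[j] - s) / (s + 1.0 / lam)   # (13.59)
            x[:, j] *= np.exp(beta)                      # (13.60)
            z[j] -= beta / lam                           # (13.61)
    return x, y, z
```

On inconsistent data (row totals summing to 6 against column totals summing to 4, say) the iterates converge to a positive balanced matrix whose residual inconsistency is carried entirely by y and z.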

13.6 Stochastic Programming for Portfolio Management

Portfolio management problems can be viewed as multiperiod dynamic decision problems where transactions take place at discrete time points.

At each point in time the manager has to assess the prevailing market
conditions (such as prices and interest rates) and the composition of the
existing portfolio. The manager has also to assess potential fluctuations in
interest rates, prices, and cashflows. This information is incorporated into a



sequence of actions of buying or selling securities, and short-term borrowing
or lending. Thus, at the next point in time the portfolio manager has a
seasoned portfolio and, faced with a new set of possible future movements,
must incorporate the new information so that transactions can be executed.
Portfolio management of equities is based on the notion of diversification introduced by H. Markowitz in his seminal work in the 1950s. Diversification is achieved by minimizing the variance of returns during a holding period, subject to constraints on the mean value of the returns. There is only one time interval under consideration. Therefore, future transactions are not incorporated and this is a single-period (myopic) model. The portfolio management strategy for fixed-income securities has been that of portfolio immunization, i.e., portfolios are developed that are hedged against small changes from the current term structure of interest rates. Such models are again single-period and ignore future transactions. Furthermore, they ignore the truly stochastic nature of interest rates, and merely hedge against (small) changes from currently observed data. The idea of immunization dates back to the actuary F.M. Redington in the 1950s, and it has been used extensively since the mid-70s.
The increased complexity of fixed-income securities and the increased volatility of the financial markets during the 1980s have motivated interest in mathematical models that more accurately capture the dynamic (i.e., multiperiod) and stochastic nature of the portfolio management problem. Stochastic programming models with recourse provide a versatile tool for the representation of a wide variety of portfolio management problems. This section formulates a multistage stochastic programming model for managing portfolios of fixed-income securities. We assume that readers have some familiarity with basic concepts of finance. As an introductory text we recommend Bodie, Kane, and Marcus (1989) and, for more advanced material, Elton and Gruber (1984) and Zenios (1993a).
The model specifies a sequence of investment decisions at discrete time
points. Decisions are made at the beginning of each time period. The
portfolio manager starts with a given portfolio and a set of scenarios about
future states of the economy which she incorporates into an investment
decision. The precise composition of the portfolio depends on transactions
at the previous decision point and on the realized scenario. Another set of
investment decisions are made that incorporate both the current status of
the portfolio and new information about future scenarios.
We develop a three-stage model, with decisions made at time instances t_0, t_1, and t_2. Extension to a multistage model is straightforward. Scenarios unfold between t_0 and t_1, and then again between t_1 and t_2. A simple



Figure 13.3 The evolution of scenarios on a binomial lattice
and the structure of the portfolio investment decisions.

three-stage problem is illustrated in Figure 13.3, where it is assumed that scenarios evolve on a binomial lattice. At instance t_0 two scenarios are anticipated, and by instance t_1 this uncertainty is resolved. Denote these scenarios by s_0^1 and s_0^2. At t_1 two more scenarios are anticipated, s_1^1 and s_1^2. A complete path is denoted by a pair of scenarios. In this example there are four paths from t_0 to t_2, denoted by the pairs (s_0^1, s_1^1), (s_0^1, s_1^2), (s_0^2, s_1^1), and (s_0^2, s_1^2).
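The path bookkeeping is straightforward to mechanize. A minimal sketch follows; the scenario labels and the equal path probabilities are illustrative assumptions, not data from the text.

```python
from itertools import product

# Scenario labels for the two stages of the binomial lattice in Figure 13.3
# (names are ours; the text only requires two scenarios per stage).
S0 = ["s0_up", "s0_down"]   # resolved between t0 and t1
S1 = ["s1_up", "s1_down"]   # resolved between t1 and t2

# A complete path is a pair (s0, s1); pi assigns each path a probability.
paths = list(product(S0, S1))
pi = {path: 0.25 for path in paths}   # equally likely, for illustration
```

In a real model pi(s0, s1) would come from the lattice's branch probabilities; expectations of terminal wealth are then sums over these four paths.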
The model is a nonlinear program maximizing the expected value of a
utility function of terminal wealth. Expectations are computed over the set
of postulated scenarios. The value of the utility function is obtained from
the solution of a multistage program, which is structured in such a way
that decisions at every time period anticipate future scenarios, while they
adapt to new information on market conditions as it becomes available.

13.6.1 Notation
The model is developed using cashflow accounting equations. Investment
decisions are in dollars of face value. We define first the parameters of the
model.

S_0, S_1 : the sets of scenarios anticipated at t_0 and t_1 respectively. We use s_0 and s_1 to denote scenarios from S_0 and S_1, respectively. Paths are denoted by pairs of the form (s_0, s_1), and with each path we associate a probability \pi(s_0, s_1).
I : the set of available securities. The cardinality of I (i.e., the number of available financial instruments) is m.
c_0 : the dollar amount of the riskless asset available at t_0.
b^0 \in \mathbb{R}^m : a vector whose components denote the composition of the initial portfolio.

\xi^0, \zeta^0 \in \mathbb{R}^m : vectors of bid and ask prices respectively, at t_0. These prices are known with certainty. In order to buy an instrument the buyer has to pay the bid price, and in order to sell it the owner is asking for the ask price.
\xi^1(s_0), \zeta^1(s_0) \in \mathbb{R}^m, for all s_0 \in S_0 : vectors of bid and ask prices, respectively, realized at t_1. These prices depend on the scenario s_0.
\xi^2(s_0, s_1), \zeta^2(s_0, s_1) \in \mathbb{R}^m, for all s_0 \in S_0 and all s_1 \in S_1 : vectors of bid and ask prices, respectively, realized at t_2. These prices depend on the path (s_0, s_1).
\alpha^0(s_0), \alpha^1(s_0, s_1) \in \mathbb{R}^m, for all s_0 \in S_0 and all s_1 \in S_1 : vectors of amortization factors during the time intervals [t_0, t_1) and [t_1, t_2) respectively. The amortization factors indicate the fraction of outstanding face value of the securities at the end of the interval compared to the outstanding face value at the beginning of the interval. These factors capture the effects of any embedded options, such as prepayments and calls, or the effect of lapse behavior. For example, a corporate security that is called during the interval has an amortization factor of 0, and an uncalled bond has an amortization factor of 1. A mortgage security that experiences a 10 percent prepayment and that pays, through scheduled payments, an additional 5 percent of the outstanding loan has an amortization factor of 0.85. These factors depend on the scenarios.
k^0(s_0), k^1(s_0, s_1) \in \mathbb{R}^m, for all s_0 \in S_0 and all s_1 \in S_1 : vectors of cash accrual factors during the intervals [t_0, t_1) and [t_1, t_2) respectively. These factors indicate cash generated during the interval, per unit face value of the security, due to scheduled payments and exercise of the embedded options, accounting for accrued interest. For example, a corporate security that is called at the beginning of a one-year interval, in a 10 percent interest rate environment, will have a cash accrual factor of 1.10. These factors depend on the scenarios.
\rho_0(s_0), \rho_1(s_0, s_1) : short-term riskless reinvestment rates during the intervals [t_0, t_1) and [t_1, t_2) respectively. These rates depend on the scenarios.
L_1(s_0), L_2(s_0, s_1) : liability payments at t_1 and t_2 respectively. Liabilities may depend on the scenarios.

Now let us define the decision variables. We have four distinct decisions at each point in time: how much of each security to buy, sell, or hold in the portfolio, and how much to invest in the riskless asset. All variables are constrained to be nonnegative.
First-stage variables, at t_0:
x^0 \in \mathbb{R}^m : the components of the vector denote the face value of each security bought.

yO IRm . denotes componentwise the face value of each security sold.


z° e IRm : denotes componentwise the face value of each security held in
the portfolio .
vq : the dollar amount invested in the riskless asset.

Downloaded from https://academic.oup.com/book/53915/chapter/422193805 by OUP site access user on 12 May 2024


Second-stage variables, at $t_1$, for each scenario $s_0$:

$x^1(s_0) \in \mathbb{R}^m$: denotes the vector of the face values of each security bought.
$y^1(s_0) \in \mathbb{R}^m$: denotes the vector of the face values of each security sold.
$z^1(s_0) \in \mathbb{R}^m$: denotes the vector of the face values of each security held in the portfolio.
$v_1(s_0)$: the dollar amount invested in the riskless asset.

Third-stage variables, at $t_2$, for each scenario $(s_0,s_1)$:

$x^2(s_0,s_1) \in \mathbb{R}^m$: denotes the vector of the face values of each security bought.
$y^2(s_0,s_1) \in \mathbb{R}^m$: denotes the vector of the face values of each security sold.
$z^2(s_0,s_1) \in \mathbb{R}^m$: denotes the vector of the face values of each security held in the portfolio.
$v_2(s_0,s_1)$: the dollar amount invested in the riskless asset.

13.6.2 Model formulation


The constraints of the model express cashflow accounting for the riskless asset and inventory balance for each security at all time periods.

First-stage constraints: At the first stage (i.e., at time $t_0$) all prices are known with certainty. The cashflow accounting equation specifies that the original endowment in the riskless asset, plus any proceeds from liquidating part of the existing portfolio, equal the amount invested in the purchase of new securities plus the amount invested in the riskless asset, i.e.,
$$C_0 + \sum_{i=1}^m \zeta_i^0 y_i^0 = \sum_{i=1}^m \zeta_i^0 x_i^0 + v_0. \qquad (13.62)$$

For each security in the portfolio we have an inventory balance constraint:

$$b_i^0 + x_i^0 = y_i^0 + z_i^0 \quad \text{for all } i \in I. \qquad (13.63)$$
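The two first-stage constraints can be checked numerically. The sketch below uses made-up prices, holdings, and endowment for two securities; all numbers are purely illustrative and are not data from the model.

```python
# Hypothetical two-security check of the first-stage constraints
# (13.62)-(13.63).  zeta0: prices, b0: initial holdings, C0: cash.
zeta0 = [0.98, 1.02]      # prices per unit face value (assumed)
b0    = [100.0, 50.0]     # initial portfolio (face values)
C0    = 10.0              # initial cash endowment

# A candidate first-stage decision: sell part of security 1 and
# buy some of security 2.
x0 = [0.0, 40.0]          # face value bought
y0 = [50.0, 0.0]          # face value sold
z0 = [b0[i] + x0[i] - y0[i] for i in range(2)]   # balance (13.63)

# Cashflow accounting (13.62): endowment plus sale proceeds equals
# purchases plus the riskless investment v0.
v0 = C0 + sum(zeta0[i]*y0[i] for i in range(2)) \
        - sum(zeta0[i]*x0[i] for i in range(2))

assert all(zi >= 0.0 for zi in z0) and v0 >= 0.0   # nonnegativity
```

A decision is feasible at the first stage exactly when the implied holdings `z0` and riskless investment `v0` come out nonnegative.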

Second-stage constraints: Decisions made at the second stage (i.e., at time $t_1$) depend on the scenario $s_0$ realized during the interval $[t_0,t_1)$. Hence, we have one constraint for each scenario. These decisions also depend on the investment decisions made at the first stage.

Cashflow accounting ensures that the amount invested in the purchase of new securities and the riskless asset is equal to the income generated by the existing portfolio during the holding period, plus any cash generated

from sales, less the liability payments. There is one constraint for each scenario:

$$\rho_0(s_0)v_0 + \sum_{i=1}^m k_i^0(s_0)\, z_i^0 + \sum_{i=1}^m \zeta_i^1(s_0)\, y_i^1(s_0) = v_1(s_0) + \sum_{i=1}^m \zeta_i^1(s_0)\, x_i^1(s_0) + L_1(s_0), \quad \text{for all } s_0 \in S_0. \qquad (13.64)$$

Inventory balance equations constrain the amount of each security sold or remaining in the portfolio to be equal to the outstanding amount of face value at the end of the first period, plus any amount purchased at the beginning of the second stage. There is one constraint for each security and for each scenario:

$$a_i^0(s_0)\, z_i^0 + x_i^1(s_0) = y_i^1(s_0) + z_i^1(s_0), \quad \text{for all } i \in I,\ s_0 \in S_0. \qquad (13.65)$$

Third-stage constraints: Decisions made at the third stage (i.e., at time $t_2$) depend on the path $(s_0,s_1)$ realized during the period $[t_1,t_2)$ and on the decisions made at $t_1$. The constraints are similar to those of the second stage. The cashflow accounting equation is

$$\rho_1(s_0,s_1)v_1(s_0) + \sum_{i=1}^m k_i^1(s_0,s_1)\, z_i^1(s_0) + \sum_{i=1}^m \zeta_i^2(s_0,s_1)\, y_i^2(s_0,s_1) = v_2(s_0,s_1) + \sum_{i=1}^m \zeta_i^2(s_0,s_1)\, x_i^2(s_0,s_1) + L_2(s_0,s_1), \qquad (13.66)$$

for all paths $(s_0,s_1)$ such that $s_0 \in S_0$ and $s_1 \in S_1$.


The inventory balance equation is:

a)(s0,si)z| (so) + ar?(s0,Si) = l/2(so,si) + 22(s0,si), (13.67)

for all i e I, and all paths (so,Si) such that so € So and «i G Si.
Objective function: The objective function maximizes the expected utility of terminal wealth. In order to measure terminal wealth all securities in the portfolio are marked-to-market, in accordance with recent U.S. Financial Accounting Standards Board (FASB) regulations that require reporting portfolio market and book values. The composition of the portfolio and its market value depend on the scenarios $(s_0,s_1)$. The objective of the portfolio optimization model is

$$\text{Maximize} \quad \sum_{(s_0,s_1)\in S_0\times S_1} \pi(s_0,s_1)\, U\big(W(s_0,s_1)\big),$$

where $\pi(s_0,s_1)$ is the probability associated with the path $(s_0,s_1)$; $W(s_0,s_1)$ denotes terminal wealth; and $U$ denotes the utility function. Terminal wealth is given by

$$W(s_0,s_1) = v_2(s_0,s_1) + \sum_{i=1}^m \zeta_i^2(s_0,s_1)\, z_i^2(s_0,s_1). \qquad (13.68)$$
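Once a scenario tree is fixed, the terminal-wealth expression (13.68) and the expected-utility objective can be evaluated directly. The sketch below uses a hypothetical two-by-two tree with equal path probabilities, a single security, and a logarithmic utility; every number is illustrative.

```python
import math

# Hypothetical scenario paths (s0, s1) with equal probabilities.
S0, S1 = ["up", "down"], ["up", "down"]
prob = {(a, b): 0.25 for a in S0 for b in S1}            # pi(s0, s1)

v2    = {(a, b): 5.0 for a in S0 for b in S1}            # riskless cash
zeta2 = {("up", "up"): [1.04], ("up", "down"): [1.01],
         ("down", "up"): [0.99], ("down", "down"): [0.96]}  # final prices
z2    = {(a, b): [100.0] for a in S0 for b in S1}        # final holdings

def wealth(s):
    """Terminal wealth W(s0, s1), eq. (13.68): cash plus market value."""
    return v2[s] + sum(p*q for p, q in zip(zeta2[s], z2[s]))

U = math.log                                              # a concave utility
expected_utility = sum(prob[s]*U(wealth(s)) for s in prob)
```

An optimizer would search over the buy/sell/hold variables to maximize `expected_utility`; here the portfolio is frozen just to show how the objective is assembled.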

13.7 Stochastic Network Models


We revisit the deterministic equivalent formulation of the two-stage model (13.14)-(13.18) and address the special case where the constraints (13.15)-(13.16) can be represented by a generalized network structure. (See Section 12.1 for the definition of networks and generalized networks.) For this case we assume that $\binom{A}{T(\omega^s)}$ and $W(\omega^s)$ are node-arc incidence matrices with exactly two nonzero entries per column for each scenario $\omega^s$ in a scenario set $\Omega = \{\omega^1, \omega^2, \ldots, \omega^S\}$; the complete matrix in (13.19), however, is not a node-arc incidence matrix, due to the occurrence of $T(\omega^s)x$ for all scenarios $\omega^s \in \Omega$. The matrix $(A^T \mid T(\omega^1)^T \mid \cdots \mid T(\omega^S)^T)^T$ has more than two nonzero entries in every column. The recourse problem has a network structure, but the first-stage variables $x$ are complicating variables as they link the recourse problem constraints (via the linear equations (13.16)). The next section gives a detailed formulation of the stochastic network problem, where the network structure becomes more discernible.
The financial modeling applications of Section 13.6 can be represented
using stochastic network structures. For each fixed-income security at each
time period we associate a network node; and for each transaction we
associate an arc. For example, an arc linking two different securities at the
same time period is used to denote sale of one security and purchase of the
other, while an arc linking the same security across different time periods
is used to denote the inventory of that particular security. Figure 13.4
illustrates the structure of the network flow problem for two securities and
three time periods. If all data are known with complete certainty the
problem is the classic network flow problem. The stochastic programming
model uses a different network flow problem for each scenario, but the flows
on the arcs representing first-stage variables are common across scenarios.
Problems of planning hydroelectric power scheduling can also be represented as stochastic network problems. Power generation decisions made today for the coming hours depend on the current state of the system, electricity demand, water inflows, and so on. The necessary input data for modeling such a complex system consist of: the network topology, specified by the geographical location of dams and the associated hydropower generation units, and their interconnections; limits on reservoir storage, level of

Figure 13.4 The structure of a generalized network model for portfolio management of two securities over three time periods.

turbine operations, pumping, and spillage; and hydroelectric production coefficients for each reservoir, obtained from engineering analysis of its storage capacity and the turbine technology. Important input data are also the water level in the reservoirs and the electricity demand. Both of these quantities are uncertain; furthermore, demand for electricity exhibits both a daily and a seasonal variation that can be estimated, at best, by a set of scenarios. The same is also true for the water level, which depends on rainfall. These uncertainties are fundamental to the operation of the system. They can be incorporated in a stochastic network model.
Another complex problem that has been modeled using stochastic networks is that of planning air-traffic ground holding policies (see also Section 12.3.2). The air traffic system is a complex web of airports, aircraft, and traffic controllers at all airports. In the United States a centralized flow control facility in Washington, DC, coordinates the control of this system. The complexity of the air traffic system in Europe is intensified by the need to coordinate several control centers among different countries, a situation which has become further complicated with the integration of east European countries into the system.
The systems are highly congested; traffic flow is carefully monitored and
controlled so that flights proceed without risk to safety. One key control
mechanism is ground holding, whereby a flight is delayed for departure if
congestion is anticipated at the destination airport. Ground holding is
a safe and relatively inexpensive solution, as opposed to holding aircraft
in flight before granting landing clearance. While the air traffic control
system does an excellent job monitoring traffic so that high safety standards
are maintained, there is substantial room for improvement especially with
regard to cost effectiveness. It is estimated that ground delays in the United
States in 1986 averaged 2000 hours per day, equivalent to grounding a total

of 250 airplanes (a carrier the size of Delta Airlines). A study by the West
German Institute for Technology estimated the avoidable cost of air traffic
delays in 1990 due to ground holding alone at 1.5 billion US dollars.
The ground holding policy problem seeks optimal holding policies, based



on the number of flights scheduled for departure during the planning horizon and the travel time to the destination airport. Even for the simple case where only a single destination airport is analyzed, its capacity is uncertain due to weather conditions. The problem is complicated further by the presence of multiple airports: ground holding decisions at each one have a cascade effect on all others. The single destination airport problem has been modeled using stochastic network optimization models. Application of the model to data obtained from Logan airport (Boston, Mass., USA) showed that substantial reductions in total delay can be realized when using the stochastic programming dynamic models as opposed to more commonly used static models.
13.7.1 Split-variable formulation of stochastic network models

We consider here an alternative form of the deterministic equivalent formulation (13.14)-(13.18) that better illustrates the network structure, and is also more suitable for the development of parallel optimization algorithms (Section 13.8). The split-variable formulation (see also Section 13.3.3) breaks the stochastic network problem into a large number of independent deterministic network flow problems with some additional coupling constraints, by replicating the first-stage variables $x$ into a set of variables $x^s \in \mathbb{R}^{n_0}$, one for each $\omega^s \in \Omega$. Once a different first-stage decision is allowed for each scenario, the stochastic program decomposes into $S$ independent problems. Of course, the first-stage variables must be non-anticipative; that is, they cannot depend on as yet unobserved scenarios, a requirement that is enforced by adding the condition that $x^1 = x^2 = \cdots = x^S$. The split-variable formulation is then equivalent to the original stochastic problem (13.14)-(13.18). It can be written as follows:

$$\text{Minimize} \quad \sum_{s=1}^S p_s\big(f(x^s) + q_s(y^s, \omega^s)\big) \qquad (13.69)$$
$$\text{s.t.} \quad A x^s = b \quad \text{for all } s \in \Omega, \qquad (13.70)$$
$$T(\omega^s)x^s + W(\omega^s)y^s = h(\omega^s) \quad \text{for all } s \in \Omega, \qquad (13.71)$$
$$x^1 - x^s = 0 \quad \text{for all } s \in \Omega, \qquad (13.72)$$
$$x^s \in \mathbb{R}^{n_0}_+, \qquad (13.73)$$
$$y^s \in \mathbb{R}^{n_1}_+. \qquad (13.74)$$

The constraints (13.72) are known as non-anticipativity constraints and they ensure that first-stage decisions $x^s$ do not depend on future realizations. For two scenarios $s_1$ and $s_2$ that are indistinguishable when the first-stage decisions are made, we have that $x^{s_1} = x^{s_2}$.
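The replication idea can be sketched in a few lines, assuming nothing beyond the text: each scenario carries its own copy of the first-stage vector, and non-anticipativity is simply the requirement that all copies agree.

```python
# Split-variable sketch: replicate the first-stage decision per
# scenario, then test the non-anticipativity conditions (13.72),
# i.e., x^1 = x^2 = ... = x^S.  The data are hypothetical.
scenarios = ["s1", "s2", "s3"]
x = {s: [10.0, 5.0] for s in scenarios}   # replicated first-stage vars

def nonanticipative(x):
    """True when every scenario copy equals the first one."""
    first = x[scenarios[0]]
    return all(x[s] == first for s in scenarios[1:])

assert nonanticipative(x)
x["s3"] = [10.0, 6.0]            # a scenario-dependent first stage...
assert not nonanticipative(x)    # ...violates (13.72)
```

Dropping the agreement requirement leaves `len(scenarios)` fully independent problems, which is exactly the structure the decomposition algorithms exploit.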


With this reformulation the model (13.69)-( 13.74) is a network with side
constraints. In the absence of the (side) constraints (13.72), the constraint
set decomposes completely into S independent problems. Each problem

Downloaded from https://academic.oup.com/book/53915/chapter/422193805 by OUP site access user on 12 May 2024


has a network structure, since (AT | T(ljs)t)t and IV(u,s) are node-arc
incidence matrices.
The constraints matrix of the split-variable formulation has a block­
diagonal structure with additional (coupling) rows for the non-anticipativity
constraints.
Let $M = S\cdot(m_0 + m_1) + (S-1)\cdot n_0$, $N = S\cdot(n_0 + n_1)$, and let $I$ denote the $n_0 \times n_0$ identity matrix. Recall that $m_0$ and $n_0$ are the numbers of first-stage constraints and variables respectively, and $m_1, n_1$ are the numbers of second-stage constraints and variables respectively. The constraint matrix for (13.70)-(13.72) has dimension $M \times N$. We denote this matrix by $\Phi$ (see also equation (13.20)), and it is defined as follows:

$$\Phi = \begin{pmatrix}
A & & & & & & \\
T(\omega^1) & W(\omega^1) & & & & & \\
& & A & & & & \\
& & T(\omega^2) & W(\omega^2) & & & \\
& & & & \ddots & & \\
& & & & & A & \\
& & & & & T(\omega^S) & W(\omega^S) \\
I & & -I & & & & \\
\vdots & & & & \ddots & & \\
I & & & & & -I &
\end{pmatrix} \qquad (13.75)$$

It is evident from the structure of the constraint matrix that the problem decomposes by scenario if the non-anticipativity constraints are ignored.

Let $\gamma \in \mathbb{R}^M$ denote the right-hand side of (13.70)-(13.72) and let $z \in \mathbb{R}^N$ be the vector of decision variables, i.e.,

$$z = \big((x^1)^T \mid (y^1)^T \mid \cdots \mid (x^S)^T \mid (y^S)^T\big)^T. \qquad (13.76)$$

We also introduce, for completeness, a vector $u \in \mathbb{R}^N$ whose components denote upper bounds on the variables. Finally, let $F(z)$ denote the objective function (13.69).

The split-variable formulation with bounded variables can be written in matrix form as

$$\text{Minimize} \quad F(z) \qquad (13.77)$$
$$\text{s.t.} \quad \begin{pmatrix} \Phi \\ I_N \end{pmatrix} z \;\begin{matrix} = \\ \le \end{matrix}\; \begin{pmatrix} \gamma \\ u \end{pmatrix}, \qquad (13.78)$$
$$z \in \mathbb{R}^N_+, \qquad (13.79)$$

where $I_N$ is the $N \times N$ identity matrix.
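The block layout of (13.75) is easy to assemble programmatically. The sketch below builds $\Phi$ for a hypothetical instance with $1\times 1$ blocks and $S = 2$ scenarios, purely to make the placement of the scenario blocks and the non-anticipativity rows concrete; the block entries are arbitrary.

```python
# Assemble Phi of (13.75) for a tiny instance: m0 = m1 = n0 = n1 = 1,
# S = 2.  A, T(w^s), W(w^s) are 1x1 "matrices" with illustrative values.
S = 2
A, T, W = [[1.0]], {1: [[2.0]], 2: [[3.0]]}, {1: [[1.0]], 2: [[1.0]]}
m0, m1, n0, n1 = 1, 1, 1, 1
M = S*(m0 + m1) + (S - 1)*n0       # total rows
N = S*(n0 + n1)                    # total columns

Phi = [[0.0]*N for _ in range(M)]
for s in range(S):                 # scenario diagonal blocks
    row, col = s*(m0 + m1), s*(n0 + n1)
    Phi[row][col] = A[0][0]                    # A x^s = b
    Phi[row + 1][col] = T[s + 1][0][0]         # T(w^s) x^s ...
    Phi[row + 1][col + n0] = W[s + 1][0][0]    # ... + W(w^s) y^s
for s in range(1, S):              # non-anticipativity rows x^1 - x^s
    row = S*(m0 + m1) + (s - 1)*n0
    Phi[row][0] = 1.0
    Phi[row][s*(n0 + n1)] = -1.0
```

Scaling the same loops to matrix-valued blocks (e.g., with a sparse-matrix library) preserves the block-diagonal-plus-coupling-rows structure that the decomposition methods rely on.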

13.7.2 Component-wise representation of the stochastic network problem

The previous sections illustrated the macro structure of the problem and the decomposable nature of the constraint matrix. This section gives a component-wise formulation of the stochastic network problem and illustrates its micro structure by specifying algebraically all equations. We assume for simplicity the same underlying network structure for all scenario problems, and with the notation of Section 9.2 we represent this structure by the graph $G = (\mathcal{N}, \mathcal{A})$, where $\mathcal{N} = \{1,2,\ldots,m_0+m_1\}$ is the set of nodes and $\mathcal{A} = \{(i,j) \mid i,j \in \mathcal{N}\} \subseteq \mathcal{N}\times\mathcal{N}$ is the set of arcs. Let $\delta_i^+ = \{j \mid (i,j) \in \mathcal{A}\}$ be the set of nodes having an arc with origin node $i$, and $\delta_j^- = \{i \mid (i,j) \in \mathcal{A}\}$ be the set of nodes having an arc with destination node $j$. We partition the set of all nodes into two disjoint sets, $\mathcal{N}_0$ and $\mathcal{N}_1$. The set $\mathcal{N}_0$ consists of the $m_0$ nodes whose incident arcs are all first stage, so that their flow conservation constraints do not depend on the realization of the uncertain quantities. The resources (i.e., supply or demand) for these nodes are real numbers, denoted by $b_i$ for all $i \in \mathcal{N}_0$. The set $\mathcal{N}_1 = \mathcal{N} \setminus \mathcal{N}_0$ consists of the $m_1$ nodes with stochastic right-hand sides or incident second-stage arcs. The resources for these nodes are denoted by $r_i^s$ for all $i \in \mathcal{N}_1$, $s \in \Omega$.
We also partition the arc set $\mathcal{A}$ into two disjoint sets $\mathcal{A}_0$ and $\mathcal{A}_1$, corresponding to replicated first- and second-stage decisions respectively. The numbers of arcs in these sets are denoted by $n_0$ and $n_1$ respectively. Denote by $x_{ij}^s$ for $(i,j) \in \mathcal{A}_0$ and $y_{ij}^s$ for $(i,j) \in \mathcal{A}_1$ the flows on the arc with origin node $i$ and destination node $j$ under scenario index $s \in \Omega$. The upper bound of a replicated first-stage arc flow $x_{ij}^s$ is denoted by $u_{ij}$, and the upper bound of a second-stage arc flow $y_{ij}^s$ is denoted by $v_{ij}^s$. The multiplier on arc $(i,j)$ is denoted by $m_{ij}$ for $(i,j) \in \mathcal{A}_0$ and by $m_{ij}^s$ for $(i,j) \in \mathcal{A}_1$. The network optimization model for a fixed scenario index $s \in \Omega$ is given by:

$$\underset{x^s \in \mathbb{R}^{n_0},\ y^s \in \mathbb{R}^{n_1}}{\text{Minimize}} \quad \sum_{(i,j)\in\mathcal{A}_0} p_s f_{ij}(x_{ij}^s) + \sum_{(i,j)\in\mathcal{A}_1} p_s q_{ij}(y_{ij}^s) \qquad (13.80)$$

s.t.

$$\sum_{j\in\delta_i^+} x_{ij}^s - \sum_{k\in\delta_i^-} m_{ki}\, x_{ki}^s = b_i \quad \text{for all } i \in \mathcal{N}_0, \qquad (13.81)$$

$$\sum_{j\in\delta_i^+\cap\mathcal{N}_0} x_{ij}^s - \sum_{k\in\delta_i^-\cap\mathcal{N}_0} m_{ki}\, x_{ki}^s + \sum_{j\in\delta_i^+\cap\mathcal{N}_1} y_{ij}^s - \sum_{k\in\delta_i^-\cap\mathcal{N}_1} m_{ki}^s\, y_{ki}^s = r_i^s \quad \text{for all } i \in \mathcal{N}_1, \qquad (13.82)$$

$$0 \le x_{ij}^s \le u_{ij} \quad \text{for all } (i,j)\in\mathcal{A}_0, \qquad (13.83)$$

$$0 \le y_{ij}^s \le v_{ij}^s \quad \text{for all } (i,j)\in\mathcal{A}_1. \qquad (13.84)$$

The complete stochastic network problem (13.69)-(13.72) is obtained by replicating the network problem (13.80)-(13.84) for each scenario and including the non-anticipativity constraints

$$x_{ij}^1 - x_{ij}^s = 0 \quad \text{for all } s \in \Omega \text{ and for all } (i,j) \in \mathcal{A}_0. \qquad (13.85)$$

In this section we have been referring to quantities pertaining to an arc $(i,j) \in \mathcal{A}$ under scenario $s \in \Omega$ by using subscripts $(i,j)$ and a superscript $s$, respectively. To establish the correspondence between the matrix/vector notation of (13.77)-(13.78) and the component-wise notation of this section, we impose a lexicographic order (see an example of such an order on page 362) on the arcs in $\mathcal{A}$ and let $(i_1,j_1)$ denote the first arc in $\mathcal{A}$. Then $z_1$, the first component of $z$ in (13.76), and $x_{i_1 j_1}^1$ refer to the same variable, and so on.
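The conservation equations (13.81)-(13.82) say that, at each node, supply equals outflow minus multiplier-scaled inflow. A small sketch with hypothetical arcs, flows, and multipliers:

```python
# Flow-conservation residual for a generalized network, following the
# sign convention of (13.81): outgoing flows at face value, incoming
# flows scaled by the arc multipliers m_ki.  All data are illustrative.
arcs = {("a", "b"): 1.0, ("b", "c"): 0.5}   # (i, j) -> flow on arc
mult = {("a", "b"): 1.1, ("b", "c"): 1.0}   # arc multipliers

def surplus(node, supply, arcs, mult):
    """supply - (outflow - multiplier-weighted inflow) at a node."""
    out_flow = sum(f for (i, j), f in arcs.items() if i == node)
    in_flow  = sum(mult[(i, j)]*f for (i, j), f in arcs.items() if j == node)
    return supply - (out_flow - in_flow)
```

Node `"b"` receives `1.1 * 1.0` and sends `0.5`, so with zero supply its surplus is `0.6`; a flow is conservation-feasible exactly when every node's surplus vanishes.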

13.8 Iterative Algorithm for Stochastic Network Optimization

In this section we develop an algorithm for stochastic network problems with a quadratic objective function using the split-variable formulation of the problem (Section 13.7.1). The algorithm is a specialization of the general row-action algorithm (Section 6.2) to the network structure. Since the algorithm works with the replicated problem it is convenient to use $x_{ij}^s$ to denote both first- and second-stage variables. That is, $x_{ij}^s$ for $(i,j) \in \mathcal{A}_0$ is a replicated first-stage variable, while $x_{ij}^s$ for $(i,j) \in \mathcal{A}_1$ is a second-stage variable. The objective function $F$ takes the form

$$F(z) = \sum_{(i,j)\in\mathcal{A}}\ \sum_{s\in\Omega} \Big( \tfrac{1}{2} w_{ij}^s (x_{ij}^s)^2 + c_{ij}^s x_{ij}^s \Big), \qquad (13.86)$$

where $w_{ij}^s > 0$ and $c_{ij}^s$ are constants which are obtained by evaluating the summations in the minimand of (13.80) when the functions $f_{ij}$ and $q_{ij}$ are quadratic, and then adding up over all scenarios to get the expected value of the objective function values.

Let $M_1 = S(m_0 + m_1)$. Then rows $1,2,\ldots,M_1$ of the constraint matrix $\Phi$ (cf. equation (13.78)) correspond to network flow conservation constraints, and rows $M_1+1,\ldots,M$ correspond to the non-anticipativity constraints that take the simple form

$$x_{ij}^1 - x_{ij}^s = 0,$$

for all $(i,j) \in \mathcal{A}_0$ and all $s \in \Omega$. The dual price $\pi_\ell$, $\ell \in \{1,2,\ldots,M_1\}$, associated with the flow conservation constraint for node $i \in \mathcal{N}$ under scenario $s \in \Omega$, is denoted by $\pi_i^s$. The dual price $\pi_\ell$, $\ell \in \{M+1,\ldots,M+N\}$, associated with the simple bound constraint for $x_{ij}^s$ (i.e., the reduced cost of $x_{ij}^s$), is denoted by $\pi_{ij}^s$. We now develop the specific projection formulae for use in the iterative steps of Algorithm 6.4.1. The row-action algorithm iterates one row at a time on the constraint matrix $\Phi$. The precise formulae for the iterative step are obtained by using the equation corresponding to the chosen matrix row, as given by equations (13.81)-(13.82). We develop the equations for only a single iterative step of the algorithm over all constraints, i.e., flow conservation equality constraints, bounds on the variables, and non-anticipativity equality constraints. The complete algorithm is summarized below as Algorithm 13.8.1.
Projection on Flow Conservation Constraints

First we derive the iterative step of the algorithm when the chosen row of the matrix $\Phi$ corresponds to the flow conservation constraint (13.81). Consider the flows on the incoming arcs, $x_{ki}^s$ for $k \in \delta_i^-$, and the flows on the outgoing arcs, $x_{ij}^s$ for $j \in \delta_i^+$, for a given node $i \in \mathcal{N}_0$ under some scenario $s \in \Omega$.

The generalized projection $\hat z$ of the current iterate $z$ onto the hyperplane $H_i = \{z \mid \langle \phi^i, z\rangle = \gamma_i\}$, where $\phi^i$ is the $i$th column of $\Phi^T$, and $\gamma_i$ is the $i$th component of $\gamma$ (see (13.77)-(13.78)), determined by the flow conservation constraint at node $i$, is obtained by solving the system (see Lemma 2.2.1):

$$\nabla F(\hat z) = \nabla F(z) + \beta_i^s \phi^i, \qquad (13.87)$$
$$\hat z \in H_i. \qquad (13.88)$$

Of course, if $z \in H_i$ then $\beta_i^s = 0$ and $\hat z = z$. If the current iterate $z$ does not satisfy flow conservation at the $i$th node, define the node surplus $\alpha_i^s$ as

$$\alpha_i^s = b_i - \Big(\sum_{j\in\delta_i^+} x_{ij}^s - \sum_{k\in\delta_i^-} m_{ki}\, x_{ki}^s\Big). \qquad (13.89)$$

Applying the iterative step (12.33) to the functional form of the objective function (13.86), and using the structure of the rows of the constraint matrix $\Phi$ that correspond to network flow constraints, we get

$$\hat x_{ij}^s = x_{ij}^s + \frac{\beta_i^s}{w_{ij}^s} \quad \text{for } j \in \delta_i^+, \qquad (13.90)$$
$$\hat x_{ki}^s = x_{ki}^s - \beta_i^s \frac{m_{ki}}{w_{ki}^s} \quad \text{for } k \in \delta_i^-. \qquad (13.91)$$

Substituting these expressions for $\hat x_{ij}^s$ and $\hat x_{ki}^s$ into (13.81) we get

$$\sum_{j\in\delta_i^+}\Big(x_{ij}^s + \frac{\beta_i^s}{w_{ij}^s}\Big) - \sum_{k\in\delta_i^-} m_{ki}\Big(x_{ki}^s - \beta_i^s\frac{m_{ki}}{w_{ki}^s}\Big) = b_i. \qquad (13.92)$$

From this and (13.89) we obtain

$$\beta_i^s = \frac{\alpha_i^s}{\displaystyle\sum_{j\in\delta_i^+}\frac{1}{w_{ij}^s} + \sum_{k\in\delta_i^-}\frac{(m_{ki})^2}{w_{ki}^s}}. \qquad (13.93)$$

Using this result in (13.90) and (13.91) gives the desired formulae for updating all primal variables incident to node $i$. The dual variable $\pi_i^s$ for this node is updated by subtracting $\beta_i^s$ from its current value, i.e., $\pi_i^s \leftarrow \pi_i^s - \beta_i^s$. Similar algebraic manipulations lead to the updating formulae required to apply the row-action algorithm to rows corresponding to constraints (13.82).
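The derivation (13.89)-(13.93) is easy to check numerically on a single node. The sketch below uses one incoming and one outgoing arc with illustrative data and verifies that the projected flows satisfy the conservation constraint exactly.

```python
# One generalized projection onto the flow-conservation hyperplane of
# a single node: one outgoing arc (weight w_out) and one incoming arc
# (weight w_in, multiplier m).  All numbers are illustrative.
b_i = 2.0                          # supply at node i
x_out, w_out = 1.0, 2.0            # outgoing flow and quadratic weight
x_in, w_in, m = 3.0, 4.0, 0.5      # incoming flow, weight, multiplier

alpha = b_i - (x_out - m*x_in)           # node surplus, eq. (13.89)
beta = alpha / (1.0/w_out + m*m/w_in)    # projection parameter, (13.93)

x_out_new = x_out + beta/w_out           # eq. (13.90)
x_in_new  = x_in - beta*m/w_in           # eq. (13.91)

# After the projection, flow conservation holds exactly at node i.
residual = b_i - (x_out_new - m*x_in_new)
```

Note how arcs with larger weights $w$ move less: the projection distributes the surplus in inverse proportion to the curvature of the objective along each arc.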

Projection on Simple Bound Constraints

Now we develop the specific projections on the simple bounds (13.83)-(13.84) for the general row-action Algorithm 6.4.1. We do it in detail for (13.83) only. Denote by $x_{ij}^s$ the current value of the variable and by $\hat x_{ij}^s$ the projected value. If $x_{ij}^s < 0$, we get from (6.32) that

$$0 = \hat x_{ij}^s = x_{ij}^s + \frac{\beta}{w_{ij}^s}. \qquad (13.94)$$

The primal variable is set to zero and the projection parameter is $\beta = -w_{ij}^s x_{ij}^s$. The dual price of the constraint is updated by subtracting $\beta$ from its current value, i.e., $\pi_{ij}^s \leftarrow \pi_{ij}^s - \beta$.

If $x_{ij}^s > u_{ij}$ we similarly set the primal variable $x_{ij}^s$ to the upper bound $u_{ij}$, and from (6.32) compute the projection parameter $\beta = w_{ij}^s(u_{ij} - x_{ij}^s)$ and update the dual price of the bound constraint by subtracting $\beta$.

Finally, if $0 \le x_{ij}^s \le u_{ij}$, we get from (6.61)

$$\hat x_{ij}^s = x_{ij}^s + \frac{\pi_{ij}^s}{w_{ij}^s}, \qquad (13.95)$$

and then set the dual price $\pi_{ij}^s$ to zero.
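The three cases of the bound projection can be collected in a small helper. The function below is a sketch of (13.94)-(13.95) together with the corresponding dual-price updates, not the book's code; variable names are ours.

```python
# Projection onto the simple bounds 0 <= x <= u with dual-price
# update, following (13.94)-(13.95) and Step 1.2 of Algorithm 13.8.1.
def project_bounds(x, pi, w, u):
    """Return (projected primal value, updated dual price)."""
    if x > u:                       # clip at the upper bound
        beta = w*(u - x)
        return u, pi - beta
    if x < 0.0:                     # clip at the lower bound
        beta = -w*x
        return 0.0, pi - beta
    return x + pi/w, 0.0            # interior: restore dual, zero it

x_new, pi_new = project_bounds(-0.5, 0.0, 2.0, 1.0)
```

The interior branch first adds back the previously subtracted dual contribution before resetting the price, which is what keeps the primal-dual pair consistent across sweeps.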



Projections on Non-anticipativity Constraints

We now develop the iterative step of the algorithm for the equality non-anticipativity constraints. A non-anticipativity constraint (13.85) has the form

$$x_{ij}^1 - x_{ij}^s = 0, \qquad (13.96)$$

for some $(i,j) \in \mathcal{A}_0$ and some $s \in \Omega$. Let $\mu(s)$ be a row index such that, for $s = 2,3,\ldots,S$, $\phi^{\mu(s)}$ is the row of $\Phi$ that corresponds to the constraint $x_{ij}^1 - x_{ij}^s = 0$.

If $z$ is the current iterate, the generalized projection onto the hyperplane represented by this constraint is the point $\hat z$ which solves the system

$$\nabla F(\hat z) = \nabla F(z) + \lambda \phi^{\mu(s)}, \qquad (13.97)$$
$$\hat x_{ij}^1 = \hat x_{ij}^s. \qquad (13.98)$$

Noting that the $\mu(s)$th row of the constraint matrix has only two nonzero entries (cf. equation (13.96)) we can write this system as

$$w_{ij}^1 \hat x_{ij}^1 = w_{ij}^1 x_{ij}^1 + \lambda, \qquad w_{ij}^s \hat x_{ij}^s = w_{ij}^s x_{ij}^s - \lambda, \qquad \hat x_{ij}^1 = \hat x_{ij}^s.$$

Solving this, we get

$$\hat x_{ij}^1 = \hat x_{ij}^s = \frac{w_{ij}^1 x_{ij}^1 + w_{ij}^s x_{ij}^s}{w_{ij}^1 + w_{ij}^s}, \qquad (13.99)$$

i.e., the point $(x_{ij}^1, x_{ij}^s)$ is projected upon the point with coordinates equal to the weighted average of $x_{ij}^1$ and $x_{ij}^s$, with $w_{ij}^1$ and $w_{ij}^s$ being the weights.
Consider now the effect of repeated projections of the row-action algorithm on the non-anticipativity constraints (13.85). We can take advantage of the almost cyclic control of the algorithm in a way that would not have been possible with cyclic control alone. The almost cyclic control of the row-action algorithm allows repeated projections upon these constraints alone until convergence (within some tolerance) of the variables $x_{ij}^s$, for any fixed $(i,j) \in \mathcal{A}_0$, to a limit $x_{ij}^*$. We show that $x_{ij}^*$ can be obtained analytically, rather than using the iterative scheme. This result has important

implications for implementations, since the effect of repeated application of equation (13.99) for all $s \in \Omega$ can then be calculated in closed form.

The non-anticipativity constraints for the replications of a single first-stage variable take the form

$$x_{ij}^1 - x_{ij}^2 = 0, \quad x_{ij}^1 - x_{ij}^3 = 0, \quad \ldots, \quad x_{ij}^1 - x_{ij}^S = 0. \qquad (13.100)$$

Let $\nabla F_{ij} : \mathbb{R}^S \to \mathbb{R}^S$ denote the subvector of the gradient $\nabla F$ corresponding to the $S$ replications of the first-stage variable, $x_{ij}^1, \ldots, x_{ij}^S$, and, similarly, let $\Phi_{(ij)}$ denote the submatrix of $\Phi$ consisting of the columns corresponding to $x_{ij}^1, \ldots, x_{ij}^S$.

By repeated projection onto these non-anticipativity constraints, such that the $\nu$th projection is onto the constraint $x_{ij}^1 - x_{ij}^{\ell(\nu)} = 0$, we obtain a sequence of points $x^\nu \in \mathbb{R}^S$ satisfying

$$\nabla F_{ij}(x^\nu) = \nabla F_{ij}(y) + \sum_{k=1}^{\nu} \lambda_k\, \phi_{(ij)}^{\mu(\ell(k))}, \qquad (13.101)$$

where $\phi_{(ij)}^{\mu(\ell(k))}$ is the row of the matrix $\Phi_{(ij)}$ corresponding to the $\mu(\ell(k))$th constraint, $\lambda_k$ is the projection parameter corresponding to the $k$th projection, $y$ is the starting point, and $\mu(\cdot)$ is the row index of the non-anticipativity constraints; see the discussion on page 408. The limit point $x^* \in \mathbb{R}^S$ satisfies

$$\nabla F_{ij}(x^*) = \nabla F_{ij}(y) + \sum_{k=1}^{\infty} \lambda_k\, \phi_{(ij)}^{\mu(\ell(k))} \qquad (13.102)$$

and must, by the non-anticipativity constraints (13.100), have all components identical, i.e., $x^* = (x_{ij}^*, \ldots, x_{ij}^*)$ for some $x_{ij}^* \in \mathbb{R}$. Let now $\Lambda_s = \sum_{\{k \mid \ell(k)=s\}} \lambda_k$ for $s = 2,\ldots,S$. Using the fact that $F(y)$ is the quadratic function (13.86), rewrite (13.102) as the system in $S$ variables

$$x_{ij}^* = y_{ij}^1 + \frac{1}{w_{ij}^1}\sum_{s=2}^S \Lambda_s,$$
$$x_{ij}^* = y_{ij}^s - \frac{\Lambda_s}{w_{ij}^s}, \quad s = 2,\ldots,S. \qquad (13.103)$$

In matrix form, this is

$$Ht = y, \qquad (13.104)$$

where

$$H = \begin{pmatrix}
1 & -\dfrac{1}{w_{ij}^1} & -\dfrac{1}{w_{ij}^1} & \cdots & -\dfrac{1}{w_{ij}^1} \\
1 & \dfrac{1}{w_{ij}^2} & 0 & \cdots & 0 \\
1 & 0 & \dfrac{1}{w_{ij}^3} & \cdots & 0 \\
\vdots & & & \ddots & \\
1 & 0 & 0 & \cdots & \dfrac{1}{w_{ij}^S}
\end{pmatrix} \qquad (13.105)$$

and $t = (x_{ij}^*, \Lambda_2, \ldots, \Lambda_S)^T$. By inverting $H$ we can solve for $t$. Since we are only interested in $x_{ij}^*$ (not in $\Lambda_2, \ldots, \Lambda_S$), we need only calculate the first row of $H^{-1}$, denoted by $h = (h_1, \ldots, h_S)$. Using the special structure of $H$ we easily get

$$h = \frac{1}{\det H}\left(\prod_{s=1}^S \frac{1}{w_{ij}^s}\right)\big(w_{ij}^1, w_{ij}^2, \ldots, w_{ij}^S\big),$$

where $\det H$ is the determinant of $H$. The inner product of the first column of $H$, which consists of all ones, and the first row of $H^{-1}$ must equal 1. Therefore $\sum_{s=1}^S h_s = 1$. Hence

$$h_s = \frac{w_{ij}^s}{\sum_{\sigma=1}^S w_{ij}^\sigma}, \quad s = 1,\ldots,S.$$

Note that $\det H > 0$, so that the system (13.104) has a unique solution.
Solving for $x_{ij}^*$ we get

$$x_{ij}^* = \langle h, y \rangle = \frac{\displaystyle\sum_{s=1}^S w_{ij}^s\, y_{ij}^s}{\displaystyle\sum_{s=1}^S w_{ij}^s}. \qquad (13.106)$$

Since $x_{ij}$ is a first-stage variable,

$$\frac{w_{ij}^1}{p_1} = \frac{w_{ij}^2}{p_2} = \cdots = \frac{w_{ij}^S}{p_S}.$$

Also, $\sum_{s=1}^S p_s = 1$, so the result can be simplified to

$$x_{ij}^* = \sum_{s=1}^S p_s\, y_{ij}^s. \qquad (13.107)$$
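The closed form (13.107) can be confirmed against the iterative scheme: repeated pairwise projections (13.99) drive all replications to the probability-weighted average whenever the weights satisfy $w^s = p_s \bar w$. A numerical sketch with illustrative data:

```python
# Repeated pairwise projections (13.99) versus the closed form (13.107).
p = [0.2, 0.3, 0.5]                 # scenario probabilities
w = [p_s*4.0 for p_s in p]          # weights w^s proportional to p_s
x = [1.0, 2.0, 4.0]                 # replicated first-stage values

closed_form = sum(p_s*x_s for p_s, x_s in zip(p, x))    # eq. (13.107)

for _ in range(200):                # cycle the projections x^1 <-> x^s
    for s in range(1, len(x)):
        avg = (w[0]*x[0] + w[s]*x[s]) / (w[0] + w[s])   # eq. (13.99)
        x[0] = x[s] = avg

assert max(abs(x_s - closed_form) for x_s in x) < 1e-9
```

Each pairwise projection preserves the weighted sum $\sum_s w^s x^s$, which is why the consensus value the iterations approach is exactly the weighted average that (13.106)-(13.107) compute in one step.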

The Row-action Algorithm for Quadratic Stochastic Networks

We have now completed all the components required to specialize the row-action algorithm to quadratic stochastic network problems. The complete algorithm proceeds as follows.

Algorithm 13.8.1 Row-Action Algorithm for Quadratic Stochastic Networks

Step 0: (Initialization.) Set $\nu = 0$ and get $\pi^0$ and $z^0$ such that

$$\nabla F(z^0) = -\begin{pmatrix} \Phi \\ I_N \end{pmatrix}^T \pi^0.$$

For example, $\pi^0 = 0$ and

$$(x_{ij}^s)^0 = -\frac{c_{ij}^s}{w_{ij}^s} \quad \text{for all } (i,j) \in \mathcal{A}_0,\ s \in \Omega, \qquad (13.108)$$

$$(y_{ij}^s)^0 = -\frac{c_{ij}^s}{w_{ij}^s} \quad \text{for all } (i,j) \in \mathcal{A}_1,\ s \in \Omega. \qquad (13.109)$$

Step 1: (Iterative step for the scenario subproblems.) For all $s \in \Omega$, do the following:

Step 1.1: (Iterative step for the flow conservation constraints.) Let

$$(\alpha_i^s)^{\nu+1} = b_i - \Big(\sum_{j\in\delta_i^+} (x_{ij}^s)^\nu - \sum_{k\in\delta_i^-} m_{ki}\,(x_{ki}^s)^\nu\Big), \qquad (13.110)$$

$$(\beta_i^s)^{\nu+1} = \frac{(\alpha_i^s)^{\nu+1}}{\displaystyle\sum_{j\in\delta_i^+}\frac{1}{w_{ij}^s} + \sum_{k\in\delta_i^-}\frac{(m_{ki})^2}{w_{ki}^s}}. \qquad (13.111)$$

For all first-stage nodes $i \in \mathcal{N}_0$:

$$(x_{ij}^s)^{\nu+1} = (x_{ij}^s)^\nu + \frac{(\beta_i^s)^{\nu+1}}{w_{ij}^s} \quad \text{for all } j \in \delta_i^+, \qquad (13.112)$$

$$(x_{ki}^s)^{\nu+1} = (x_{ki}^s)^\nu - (\beta_i^s)^{\nu+1}\frac{m_{ki}}{w_{ki}^s} \quad \text{for all } k \in \delta_i^-, \qquad (13.113)$$

$$(\pi_i^s)^{\nu+1} = (\pi_i^s)^\nu - (\beta_i^s)^{\nu+1}. \qquad (13.114)$$

For all second-stage nodes $i \in \mathcal{N}_1$ (with $r_i^s$ in place of $b_i$ in (13.110)):

$$(y_{ij}^s)^{\nu+1} = (y_{ij}^s)^\nu + \frac{(\beta_i^s)^{\nu+1}}{w_{ij}^s} \quad \text{for all } j \in \delta_i^+, \qquad (13.115)$$

$$(y_{ki}^s)^{\nu+1} = (y_{ki}^s)^\nu - (\beta_i^s)^{\nu+1}\frac{m_{ki}^s}{w_{ki}^s} \quad \text{for all } k \in \delta_i^-, \qquad (13.116)$$

$$(\pi_i^s)^{\nu+1} = (\pi_i^s)^\nu - (\beta_i^s)^{\nu+1}. \qquad (13.117)$$

Step 1.2: (Iterative step for the simple bounds.)

For all first-stage arcs $(i,j) \in \mathcal{A}_0$:

$$(x_{ij}^s)^{\nu+1} = \begin{cases} u_{ij} & \text{if } (x_{ij}^s)^{\nu+\frac12} > u_{ij}, \\[2pt] 0 & \text{if } (x_{ij}^s)^{\nu+\frac12} < 0, \\[2pt] (x_{ij}^s)^{\nu+\frac12} + \dfrac{(\pi_{ij}^s)^\nu}{w_{ij}^s} & \text{if } 0 \le (x_{ij}^s)^{\nu+\frac12} \le u_{ij}, \end{cases} \qquad (13.118)$$

and

$$(\pi_{ij}^s)^{\nu+1} = \begin{cases} (\pi_{ij}^s)^\nu - w_{ij}^s\big(u_{ij} - (x_{ij}^s)^{\nu+\frac12}\big) & \text{if } (x_{ij}^s)^{\nu+\frac12} > u_{ij}, \\[2pt] (\pi_{ij}^s)^\nu + w_{ij}^s\,(x_{ij}^s)^{\nu+\frac12} & \text{if } (x_{ij}^s)^{\nu+\frac12} < 0, \\[2pt] 0 & \text{if } 0 \le (x_{ij}^s)^{\nu+\frac12} \le u_{ij}. \end{cases} \qquad (13.119)$$

For all second-stage arcs $(i,j) \in \mathcal{A}_1$:

$$(y_{ij}^s)^{\nu+1} = \begin{cases} v_{ij}^s & \text{if } (y_{ij}^s)^{\nu+\frac12} > v_{ij}^s, \\[2pt] 0 & \text{if } (y_{ij}^s)^{\nu+\frac12} < 0, \\[2pt] (y_{ij}^s)^{\nu+\frac12} + \dfrac{(\pi_{ij}^s)^\nu}{w_{ij}^s} & \text{if } 0 \le (y_{ij}^s)^{\nu+\frac12} \le v_{ij}^s, \end{cases} \qquad (13.120)$$

and

$$(\pi_{ij}^s)^{\nu+1} = \begin{cases} (\pi_{ij}^s)^\nu - w_{ij}^s\big(v_{ij}^s - (y_{ij}^s)^{\nu+\frac12}\big) & \text{if } (y_{ij}^s)^{\nu+\frac12} > v_{ij}^s, \\[2pt] (\pi_{ij}^s)^\nu + w_{ij}^s\,(y_{ij}^s)^{\nu+\frac12} & \text{if } (y_{ij}^s)^{\nu+\frac12} < 0, \\[2pt] 0 & \text{if } 0 \le (y_{ij}^s)^{\nu+\frac12} \le v_{ij}^s. \end{cases} \qquad (13.121)$$
Step 2: (Iterative step for non-anticipativity constraints.)
For all first-stage arcs $(i,j) \in \mathcal{A}_0$ set:

$$\bar x_{ij} = \sum_{s=1}^S p_s\, (x_{ij}^s)^{\nu+1}, \qquad (13.122)$$

$$(x_{ij}^s)^{\nu+1} = \bar x_{ij} \quad \text{for all } s \in \Omega. \qquad (13.123)$$

Step 3: Let $\nu \leftarrow \nu + 1$ and return to Step 1.
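A toy end-to-end run of the algorithm helps make the step structure concrete. The sketch below strips the problem down to a single replicated variable with bounds, so Step 1.1 is vacuous; all data are illustrative, and the loop is only meant to show Steps 0, 1.2, and 2 interacting and converging to the constrained minimizer.

```python
# Toy run of Algorithm 13.8.1: one replicated first-stage variable,
# two scenarios, bounds 0 <= x <= u, and no network nodes.
p = [0.5, 0.5]                       # scenario probabilities
w = [p[0]*2.0, p[1]*2.0]             # weights w^s = p_s * w_bar
c = [-3.0*p[0], 1.0*p[1]]            # scenario-dependent linear costs
u = 2.0

x  = [-c[s]/w[s] for s in range(2)]  # Step 0: eq. (13.108)
pi = [0.0, 0.0]

for _ in range(50):
    for s in range(2):               # Step 1.2: bound projection
        if x[s] > u:
            pi[s] -= w[s]*(u - x[s]); x[s] = u
        elif x[s] < 0.0:
            pi[s] -= -w[s]*x[s]; x[s] = 0.0
        else:
            x[s] += pi[s]/w[s]; pi[s] = 0.0
    xbar = sum(p[s]*x[s] for s in range(2))   # Step 2: eq. (13.122)
    x = [xbar, xbar]                          # eq. (13.123)

# The iterates settle at the minimizer of sum_s (w^s/2) x^2 + c^s x
# over [0, u] with x common: clip(-(c[0]+c[1])/(w[0]+w[1]), 0, u) = 0.5.
```

Step 2 applies the closed form (13.122) rather than cycling through the pairwise projections (13.99), exactly as the text prescribes.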


Decompositions for Parallel Computing

The calculations in Step 1 of Algorithm 13.8.1 are repeated for multiple independent scenarios. Hence, these calculations can be executed concurrently, utilizing as many processors as the number of scenarios.

The calculations in Step 1.1, for a given scenario index $s$, are performed for all first- and second-stage nodes. On first examination these calculations are not independent, since nodes have arcs in common, and the calculations for a given node, say $i$, cannot change the flow on arc $(i,j)$ at the same time that the calculations for node $j$ are updating the flow on this arc. However, it is possible to execute the calculations for multiple nodes concurrently if we identify nodes that do not have arcs in common. Such sets of nodes are identified by coloring the underlying graph, and iterating on same-color nodes simultaneously. For example, in a time-staged network it is possible to iterate concurrently on all nodes corresponding to odd-order time periods, and then iterate concurrently on all nodes in even-order time periods. Another alternative is to employ a Jacobi variant of the algorithm described in Step 1.1 (see Section 13.9 for references to related literature). The calculations in Step 1.2 can be executed concurrently for all arcs and all scenarios, utilizing as many processors as the number of scenarios times the number of arcs.

The calculations in Step 2 can be executed concurrently for all first-stage arcs, although the calculations involved in this step are trivial compared to the amount of work performed in Step 1. Sections 14.5 and 15.5 give details on the implementation of this algorithm, and report computational results with large-scale test problems.
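The coloring idea can be sketched with a greedy sequential coloring; the graph below is hypothetical, and greedy coloring is only one of several ways to extract independent node sets.

```python
# Greedy coloring sketch: nodes with the same color share no arc, so
# their Step 1.1 projections can run concurrently.
arcs = [("a", "b"), ("b", "c"), ("c", "d"), ("a", "c")]
nodes = sorted({n for arc in arcs for n in arc})

adj = {n: set() for n in nodes}        # undirected adjacency
for i, j in arcs:
    adj[i].add(j); adj[j].add(i)

color = {}
for n in nodes:                        # assign smallest unused color
    used = {color[m] for m in adj[n] if m in color}
    color[n] = min(k for k in range(len(nodes)) if k not in used)

groups = {}                            # color -> independent node set
for n, k in color.items():
    groups.setdefault(k, []).append(n)
```

The odd/even time-period schedule mentioned above is exactly a two-coloring of a time-staged network; general networks may need more colors, and each color class becomes one batch of concurrent node updates.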

13.9 Notes and References

Stochastic programming models were first formulated as mathematical programs in the 1950s by Dantzig (1955) and Beale (1955). Programs with probabilistic constraints were introduced by Charnes and Cooper (1959). For general references on stochastic programming see Dempster (1980), Ermoliev and Wets (1988), Kall (1976), Kall and Wallace (1994), and Wets (1989). A textbook treatment of stochastic programming is Kall and Wallace (1994).
13.1 For further discussion on probability theory as it applies to stochastic programming see Kall (1976), Wets (1989), and Frauendorfer's thesis (1992). For general background on probability theory refer to Billingsley (1995) or Parzen (1960).

13.3 The problem formulations can be found in the general references cited above. See also Walkup and Wets (1966) and Wets (1966a, 1966b, 1972, 1983). Wets (1974) develops the deterministic equivalent formulation. Multistage programs are discussed in Birge (1985, 1988), Olsen (1976), Gassmann (1990), Ermoliev and Wets (1988), and Wets (1989). Dupacova (1995) compiled a bibliography.



13.4 The robust optimization model was suggested by Mulvey, Vanderbei, and Zenios (1995). The terminology of structural and control variables is borrowed from the flexibility analysis of manufacturing systems; see Seider, Brengel, and Widagdo (1991). Applications of robust optimization are discussed in Gutierrez and Kouvelis (1995), King et al. (1988), Malcolm and Zenios (1994), Paraskevopoulos, Karakitsos, and Rustem (1991), and Sengupta (1991).

13.5.1 The diet problem was studied by Stigler (1945) and used by Dantzig (1963) as the first test problem for the simplex method. See also Dantzig (1990).

13.5.2 Capacity planning and expansion has been a fertile ground for the application of optimization models. For a textbook treatment, in the context of manufacturing applications, see Hayes and Wheelwright (1984). The robust optimization approach to capacity planning for a multiproduct, multifacility production firm was suggested by Eppen, Martin, and Schrage (1989), who applied their model to plan car manufacturing facilities for the General Motors Company. The model described in this section is a simplified version of their application. The same reference discusses the merits of a robust optimization formulation for the capacity expansion planning model. They use a model of expected downside risk, and illustrate its performance with numerical results. The Markowitz criterion was introduced by Markowitz (1952); see also Perold (1984) and Dahl, Meeraus, and Zenios (1993). The downside risk function used in this section was suggested by Zenios and Kang (1993) in the context of portfolio management applications. It is the limiting case of the mean-absolute deviation models of Sharpe (1971) and Konno and Yamazaki (1991) with asymmetric risk functions. See Speranza (1993) for further analysis of the properties of asymmetric, piecewise linear, penalty functions.

Robust optimization models for capacity expansion planning have been
developed by Gutierrez and Kouvelis (1995), who consider outsourcing
(i.e., subcontracting part of the manufacturing requirements) as the
means to achieving robustness in manufacturing capacity while
reducing costs. Stochastic programming models for capacity expansion
for power generation firms have been proposed by Murphy, Sen, and
Soyster (1982), Granville et al. (1988), and Dantzig et al. (1989).
Extensions of these models, using robust optimization, are developed by
Malcolm and Zenios (1994).

13.5.3 The robust optimization model for matrix balancing was developed
by Zenios and Zenios (1992). A similar model, for the more general
problem of image reconstruction from projections, was suggested
earlier by Elfving (1989), who also suggested the use of a single Newton
step for the estimation of the Bregman parameter. Both references
develop solution algorithms for the respective models, using the
row-action algorithms of Chapter 6. The use of a secant approximation for
the estimation of the Bregman parameter is discussed in Section 6.9;
see references in Section 6.10.
13.6 Textbook treatments of investments and portfolio management are
given by Bodie, Kane, and Marcus (1989) and Elton and Gruber (1984).
The classic models for portfolio management, namely Markowitz's
mean/variance model and the portfolio immunization model, are
discussed by Markowitz (1952) and Redington (1952), respectively. See
Dahl, Meeraus, and Zenios (1993) for a modern treatment of these
models and the associated mathematical programming formulations.

The applications of stochastic programming models in portfolio
management are numerous. For a simple illustration of dynamic
programming formulations for multiperiod portfolio optimization see
Bertsekas (1987, pp. 73-77). General references are collected in Zenios
(1993a) and Ziemba and Vickson (1975). The application of stochastic
programming to address problems in short-term financial planning
was suggested by Kallberg, White, and Ziemba (1982). Its application
to bond portfolio management was suggested by Bradley and Crane
(1972); for application to bank asset/liability management see Kusy
and Ziemba (1986); for application to asset allocation see Mulvey
and Vladimirou (1992) and Mulvey (1993); for application to fixed-income
portfolio management see Zenios (1991c, 1993b); and for
application to funding of insurance products see Nielsen and Zenios (1996b).
The model discussed in this section is adapted from Golub et al. (1995)
and Holmer et al. (1993).
13.7 There is an extensive literature on stochastic programming problems
with network recourse. The stochastic transportation problem was
introduced by Williams (1963). See also the papers by Cooper and
LeBlanc (1977), Wallace (1986, 1987), Mulvey and Vladimirou (1991),
Nielsen and Zenios (1993c, 1996a), and the PhD theses by Vladimirou
(1990) and Nielsen (1992).

Applications of stochastic network models to hydroelectric power
scheduling are reported in Dembo et al. (1990) and Dembo, Mulvey, and
Zenios (1989). Applications to air traffic control were developed by
Richetta (1991); see also Zenios (1991a). Transportation and logistics
models are developed in Frantzeskakis and Powell (1989).

The split-variable formulation for the decomposition of mathematical
programs is fairly standard in large-scale optimization; see, for
example, Bertsekas and Tsitsiklis (1989, page 231). The use of
split-variable formulations for stochastic programming problems was
suggested by Rockafellar and Wets (1991). It was used by Mulvey and
Vladimirou (1989, 1991) and Nielsen and Zenios (1993a, 1996a) as a
device for exploiting the special structure of stochastic programs with
network recourse.
13.8 The row-action iterative algorithm for the two-stage stochastic
programming problems with network recourse was developed by Nielsen
and Zenios (1993a). It was further extended to the multistage problem
by Nielsen and Zenios (1996a), and was used for the solution of linear
stochastic networks within the context of the PMD algorithm (Chapter 3)
by Nielsen and Zenios (1993d). The partitioning of network
structures using graph coloring to increase the amount of parallelism
was suggested in Zenios and Mulvey (1988a). They describe a graph
coloring heuristic, based on earlier work of Christofides (1971), used
for their test problems, and compare the idea of graph coloring to
the use of a Jacobi variant. As one may expect, the parallel scheme
based on graph coloring exhibits a faster (practical) convergence rate
than the Jacobi algorithm. However, the Jacobi algorithm permits the
use of more processors. For large-scale problems, and on computers
with sufficiently many processors, the Jacobi parallel algorithm could
be substantially faster, in solution time, than parallel
implementations based on graph coloring. This was demonstrated in Zenios and
Lasken (1988a, 1988b).
