
CHAPTER 13

Planning Under Uncertainty

Downloaded from https://academic.oup.com/book/53915/chapter/422193805 by OUP site access user on 12 May 2024

The winners of tomorrow will deal proactively with chaos, will look at the chaos per se as the source for market advantage, not as a problem to be got around. Chaos and uncertainty are (will be) market opportunities for the wise.
Tom Peters, Thriving on Chaos, 1987

The analyst who attempts to build a mathematical model for a real-world system is often faced with the problem of uncertain, noisy, incomplete or erroneous data. This is true for several application domains. In business applications noisy data are prevalent. Returns of financial instruments, demand for a firm's products, the cost of fuel, and consumption of power and other resources are examples of model data that are known with some probabilistic distribution at best. In social sciences data are often incomplete—for example, partial census surveys are carried out periodically in lieu of a complete census of the population. In the physical sciences and engineering data are usually subject to measurement errors, as in models of image restoration from remote sensing experiments.
For some applications not much is lost by assuming that the value of the
uncertain data is known and then developing a deterministic mathemati­
cal programming model. Worst case or mean values can be used in this
respect because they provide reasonable approximations when either the
level of uncertainty is low, or when the uncertain parameters have a minor
impact on the system we want to model. For many applications, however,
uncertainty plays a key role in the performance of the real-world system:
worst case analysis often leads to conservative and potentially expensive
solutions, and solving the mean value problem, i.e., a problem where all random variables are replaced by their mean values, can even lead to nonsensical solutions since the mean of a random variable might not be a value that can be realized in practice.
A general approach to dealing with uncertainty is to assign to the unknown parameters a probability distribution, which should then be incorporated into an appropriate mathematical programming model. This chapter addresses the problem of planning under uncertainty and develops mathematical programming formulations using stochastic linear programming

models and robust optimization models. Section 13.2 discusses a classic example to highlight the issues involved. Sections 13.3 and 13.4 give mathematical programming models. Section 13.5 discusses diverse real-world applications and Section 13.6 focuses on an application from financial planning. Section 13.7 introduces a broad class of stochastic programming applications, namely that of stochastic network problems. Section 13.8 gives a row-action iterative algorithm for solving this special class of problems. Notes and references in Section 13.9 conclude this chapter.

Parallel Optimization, Yair Censor, Oxford University Press (1997), © 1997 by Oxford University Press, Inc., DOI: 10.1093/9780195100624.003.0013

13.1 Preliminaries
We introduce first some basic definitions on probability spaces that are
needed throughout this section. Additional background material on proba­
bility theory can be found, e.g., in Billingsley (1995) and, with emphasis on
stochastic programming, in Kall (1976) and Wets (1989). In this chapter
boldface Greek characters are used to denote random vectors which belong
to some probability space as defined below.
Let Ω be an arbitrary space or set of points. A σ-field for Ω is a family Σ of subsets of Ω such that Ω itself, the complement with respect to Ω of any set in Σ, and any union of countably many sets in Σ are all in Σ. The members of Σ are called measurable sets, or events in the language of probability theory. The set Ω with the σ-field Σ is called a measurable space and is denoted by (Ω, Σ).
Let Ω be a (linear) vector space and Σ a σ-field. A probability measure P on (Ω, Σ) is a real-valued function defined over the family Σ, which satisfies the following conditions: (i) 0 ≤ P(A) ≤ 1 for A ∈ Σ; (ii) P(∅) = 0 and P(Ω) = 1; and (iii) if {A_k} is a sequence of disjoint sets A_k ∈ Σ and if ∪_{k=1}^∞ A_k ∈ Σ, then P(∪_{k=1}^∞ A_k) = ∑_{k=1}^∞ P(A_k). The triplet (Ω, Σ, P) is called a probability space. The support of (Ω, Σ, P) is the smallest subset of Ω with probability 1. If the support is a countable set then the probability measure is said to be discrete. The term scenario is used for the elements of Ω of a probability space with a discrete distribution.
A proposition is said to hold almost surely (abbreviated a.s.) or P-almost surely if it holds on a subset A ⊂ Ω with P(A) = 1. The expected value of a random variable Q on (Ω, Σ, P) is the Stieltjes integral of Q with respect to the measure P:

E[Q] = ∫_Ω Q dP = ∫_Ω Q(ω) dP(ω).

The expectation of a constant function is also constant and it is easy to see that E[Q₁ + Q₂] = E[Q₁] + E[Q₂]. The kth moment of Q is the expected value of Q^k, i.e., E[Q^k] = ∫_Ω Q^k(ω) dP(ω). The variance of the random variable Q is defined as Var[Q] = E[Q²] − (E[Q])².
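On a discrete probability space these identities are easy to verify numerically. The following sketch (the scenario values and probabilities are invented for illustration, not taken from the text) checks linearity of expectation and the variance formula:

```python
# Sketch: E[Q1 + Q2] = E[Q1] + E[Q2] and Var[Q] = E[Q^2] - (E[Q])^2
# on a discrete probability space with three scenarios (invented data).

def expect(values, probs):
    """Expected value of a discrete random variable."""
    return sum(v * p for v, p in zip(values, probs))

probs = [0.2, 0.5, 0.3]   # a discrete measure P on three scenarios
q1 = [1.0, 4.0, 2.0]      # Q1(omega) per scenario
q2 = [3.0, 0.0, 5.0]      # Q2(omega) per scenario

# Linearity of expectation
lhs = expect([a + b for a, b in zip(q1, q2)], probs)
rhs = expect(q1, probs) + expect(q2, probs)
assert abs(lhs - rhs) < 1e-12

# Variance via the second moment
var_q1 = expect([v**2 for v in q1], probs) - expect(q1, probs) ** 2
print(var_q1)
```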



Finally, we give a formal but restricted (to our needs) definition of a conditional expectation. Let (Ω, Σ, P) be a probability space and suppose that A₁, A₂, ..., A_K is a finite partition of the set Ω. From this partition we form a σ-field A which is a subfield of Σ. Then the conditional expectation of the random variable Q(ω) on (Ω, Σ, P) given A at ω is denoted by E[Q | A] and defined as

E[Q | A](ω) = (1 / P(A_i)) ∫_{A_i} Q(ω) dP(ω)

for ω ∈ A_i, assuming that P(A_i) > 0.
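For a finite partition of a discrete sample space, the conditional expectation is simply a cell-by-cell average. A sketch with invented data (the sample points, measure, and partition below are illustrative only):

```python
# Sketch: E[Q | A] on a finite partition {A_1, ..., A_K} of a discrete
# sample space; all numbers are invented for illustration.

omega = [0, 1, 2, 3]                  # sample points
P = {0: 0.1, 1: 0.4, 2: 0.2, 3: 0.3}  # probability of each point
Q = {0: 5.0, 1: 1.0, 2: 2.0, 3: 4.0}  # random variable Q(omega)
partition = [{0, 1}, {2, 3}]          # the cells A_i generating the subfield A

def cond_expectation(w):
    """E[Q | A] at the point w: the P-average of Q over the cell containing w."""
    cell = next(A for A in partition if w in A)
    p_cell = sum(P[u] for u in cell)
    assert p_cell > 0                 # the definition requires P(A_i) > 0
    return sum(Q[u] * P[u] for u in cell) / p_cell

# E[Q | A] is constant on each cell of the partition
assert cond_expectation(0) == cond_expectation(1)

# Tower property: E[ E[Q | A] ] = E[Q]
tower = sum(cond_expectation(w) * P[w] for w in omega)
assert abs(tower - sum(Q[w] * P[w] for w in omega)) < 1e-12
print(cond_expectation(0), cond_expectation(3))
```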

13.2 The Newsboy Problem


To develop an understanding of stochastic programming problems we consider first the following simple problem of planning under uncertainty. On a street corner a young entrepreneur is selling newspapers that he buys from a local distributor each morning. He sells these papers for a profit p⁺ per unit, and any papers that remain at the end of the day are sold as scrap paper, in which case a net loss p⁻ is realized per unit. The demand for newspapers is a random variable ω which belongs to a probability space with support denoted by Ω = {ω ∈ ℝ | 0 ≤ ω < ∞} and probability distribution function P(ω). The problem is to choose the optimal number of papers x that should be bought from the local distributor.
An approach to modeling this situation is to consider a policy x as optimal if it maximizes the expected profit. Profit is a function of the policy and the demand random variable ω. Let F(x, ω) be the profit function:

F(x, ω) = { p⁺x                  if x ≤ ω,
          { p⁺ω − p⁻(x − ω)      if x > ω.

The expected value of the profit function is the Stieltjes integral with respect to the distribution function:

E[F(x, ω)] = ∫_Ω F(x, ω) dP(ω)
           = ∫₀^x (p⁺ω − p⁻(x − ω)) dP(ω) + ∫_x^∞ p⁺x dP(ω),
and the mathematical model for the newsboy problem is the following optimization problem with respect to x,

Maximize E[F(x, ω)]   (13.1)
s.t. x ≥ 0.           (13.2)
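As a concrete numerical illustration of (13.1)–(13.2), the sketch below (with invented prices and a discrete demand distribution) maximizes the expected profit by direct enumeration:

```python
# Numerical sketch of the newsboy model (13.1)-(13.2) for a discrete
# demand distribution; the prices and scenario data are invented.

p_plus, p_minus = 0.50, 0.25   # unit profit p+ / unit scrap loss p-
demand = [10, 20, 30, 40]      # demand scenarios
probs  = [0.1, 0.4, 0.3, 0.2]  # their probabilities

def profit(x, w):
    """F(x, omega): all x papers sold if x <= omega, else scrap the surplus."""
    if x <= w:
        return p_plus * x
    return p_plus * w - p_minus * (x - w)

def expected_profit(x):
    return sum(p * profit(x, w) for p, w in zip(probs, demand))

# Maximize E[F(x, omega)] over candidate order quantities
best = max(range(0, 41), key=expected_profit)
print(best, expected_profit(best))
```

For these data the maximizer agrees with the classical newsvendor critical-fractile rule: order the smallest x with P(ω ≤ x) ≥ p⁺/(p⁺ + p⁻) = 2/3, which gives x = 30.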

This is a simple example of a problem of planning under uncertainty. It is represented by an adaptive model, since decisions adapt as more information becomes available, i.e., as newspapers are sold to customers that arrive during the day. The model has fixed recourse, meaning that the reaction to the observed demand is fixed. That is, the number of newspapers sold for a profit is uniquely determined by the number of customers. The same is true for the surplus created at the end of the day, which is sold for scrap at a loss. Other forms of recourse action might have been possible, such as purchasing additional papers at a higher cost later during the day, or returning newspapers before the end of the day at a value higher than that of scrap paper. This simple, fixed recourse model does not allow for such considerations, and it also assumes that all risk preferences are captured by the expected value of the profit. Higher moments of the distribution of the profit function F(x, ω) are ignored. The next section presents mathematical models for planning under uncertainty in more complicated settings.

13.3 Stochastic Programming Problems


The following problem is the general formulation of stochastic programming:

Minimize E[f₀(x, ω)]
s.t. E[f_i(x, ω)] = 0,  i = 1, 2, ..., m,   (13.3)
x ∈ X ⊆ ℝⁿ.

The following notation is used: ω is a random vector with support Ω ⊂ ℝ^N, and P = P(ω) is a probability distribution function on ℝ^N. Also f₀ : ℝⁿ × Ω → ℝ ∪ {+∞}, f_i : ℝⁿ × Ω → ℝ, i = 1, 2, ..., m, and X is a closed set. Inequality constraints can be incorporated into this formulation with the use of slack variables.
The expectation functions

E[f_i(x, ω)] = ∫_Ω f_i(x, ω) dP(ω)   (13.4)

are assumed finite for all i = 0, 1, ..., m unless the set {ω | f₀(x, ω) = +∞} has a nonzero probability, in which case E[f₀(x, ω)] = +∞. The feasibility set

X̄ ≜ {x | E[f_i(x, ω)] = 0, i = 1, 2, ..., m} ∩ {x | E[f₀(x, ω)] < +∞}

is assumed to be nonempty.
The model (13.3) is a nonlinear programming problem, whose constraints and objective functions are represented by integrals. Much of the theory of stochastic programming is concerned with identifying the properties of these integral functions and devising suitable approximation schemes for their evaluation. Optimality conditions are derived from those for nonlinear programming, with the aid of subdifferential calculus for the expectation functions. However, the computation of solutions for these nonlinear programs poses serious challenges, since evaluation of the integrals can be an extremely difficult task, especially when the expectation functionals are multidimensional. There are even cases when the integrands are neither differentiable, nor convex, nor even continuous. A broad class of stochastic programming models, however, can be formulated as large-scale linear or nonlinear programs with a specially structured constraints matrix. Most of the work on parallel computing for stochastic programming focuses on the development of decomposition algorithms that exploit this special structure. In the next subsections we look at further refinements of the general stochastic programming formulation.

13.3.1 Anticipative models


Consider now the following situation. A decision x must be made in an uncertain world where the uncertainty is described by the random vector ω. The decision does not in any way depend on future observations, but prudent planning has to anticipate possible future realizations of the random vector.
In anticipative models feasibility is expressed in terms of probabilistic (or chance) constraints. For example, a reliability level α, where 0 < α < 1, is specified and constraints are expressed in the form

P{ω | g_i(x, ω) ≤ 0, i = 1, 2, ..., m} ≥ α,

where g_i : ℝⁿ × Ω → ℝ, i = 1, 2, ..., m. This constraint can be cast in the form of the general model (13.3) by defining f_i as follows:

f_i(x, ω) = { α − 1  if g_i(x, ω) ≤ 0,
            { α      otherwise.

The objective function may also be of a reliability type, such as P{ω | g₀(x, ω) ≤ γ}, where γ is a constant.
In summary, an anticipative model selects a policy that leads to some desirable characteristics of the constraint and objective functionals under the realizations of the random vector. In the example above it is desirable that the probability of satisfying the constraints is at least the prespecified reliability level α. The precise value of α depends on the application at hand, the cost of constraint violation, and other similar considerations.
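A chance constraint of this form can be checked numerically for a candidate decision. The sketch below (the constraint function g and the Gaussian demand model are invented for illustration) estimates the reliability level by Monte Carlo sampling:

```python
# Sketch: estimating whether a candidate decision x satisfies a chance
# constraint P{ g(x, omega) <= 0 } >= alpha by Monte Carlo sampling.
# The constraint g and the demand distribution are invented.

import random

random.seed(0)
alpha = 0.95
x = 12.0                       # candidate capacity decision

def g(x, w):
    """Constraint function: demand w must not exceed capacity x."""
    return w - x

# Sample the random parameter and estimate the reliability level
samples = [random.gauss(10.0, 1.0) for _ in range(100_000)]
reliability = sum(g(x, w) <= 0 for w in samples) / len(samples)
print(reliability, reliability >= alpha)
```

Here the true reliability is Φ(2) ≈ 0.977, so the estimate comfortably exceeds α = 0.95.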

13.3.2 Adaptive models


In an adaptive model, observations related to uncertainty become available before a decision x is made, such that optimization takes place in a learning environment. It is understood that observations provide only partial information about the random variables because otherwise the model would simply wait to observe the values of the random variables, and then make a decision x by solving a deterministic mathematical program. In contrast to this approach we have the other extreme situation where all observations are made after the decision x has been made, and the model becomes anticipative.

Let A be the collection of all the relevant information that could become available by making an observation. This A is a subfield of the σ-field of all possible events, generated from the support set Ω of the random vector ω. The decisions x depend on the events that could be observed, and x is termed A-adapted or A-measurable. Using the conditional expectation with respect to A, E[· | A], the adaptive stochastic program can be written as:

Minimize E[f₀(x(ω), ω) | A]
s.t. E[f_i(x(ω), ω) | A] = 0,  i = 1, 2, ..., m,   (13.5)
x(ω) ∈ X almost surely,

where the mapping x : Ω → X is such that x(ω) is A-measurable. This problem can be addressed by solving for every ω the following deterministic programs

Minimize E[f₀(x, ·) | A](ω)                         (13.6)
s.t. E[f_i(x, ·) | A](ω) = 0,  i = 1, 2, ..., m,    (13.7)
x ∈ X.                                              (13.8)

Each such problem for a given ω is of the canonical form (13.3).


The two extreme cases (i.e., complete information with A = Σ, or no information at all) deserve special mention. The case of no information reduces the model to the form of the anticipative model; when there is complete information the model (13.5) is known as the distribution model. The goal in this latter case is to characterize the distribution of the optimal objective function value. The precise values of the objective function and the optimal policy x are determined after realizations of the random vector ω are observed. The most interesting situations arise when partial information becomes available after some decisions have been made, and models to address such situations are discussed next.

13.3.3 Recourse models


The recourse problem combines the anticipative and adaptive models in a common mathematical framework. The problem seeks a policy that not only anticipates future observations but also takes into account that observations are made about uncertainty, and thus can adapt by taking recourse decisions. For example, a portfolio manager specifies the composition of a portfolio considering both future movements of stock prices (anticipation) and that the portfolio will be rebalanced as prices change (adaptation).

The two-stage version of this model has been studied extensively. It is amenable to formulations as a large-scale deterministic nonlinear program with a special structure of the constraints matrix. These formulations yield naturally to solution via decomposition algorithms and parallel computations. To formulate the two-stage stochastic program with recourse we need two vectors of decision variables to distinguish between the anticipative policy and the adaptive policy. The following notation is used.

x ∈ ℝ^{n₀} denotes the vector of first-stage decisions. These decisions are made before the random variables are observed and are anticipative.

y ∈ ℝ^{n₁} denotes the vector of second-stage decisions. These decisions are made after the random variables have been observed and are adaptive. They are constrained by decisions made at the first stage, and depend on the realization of the random vector.
We formulate the second-stage problem in the following manner. Once a first-stage decision x has been made, some realization of the random vector can be observed. Let q(y, ω) denote the second-stage cost function, and let {T(ω), W(ω), h(ω) | ω ∈ Ω} be the model parameters. These parameters are functions of the random vector ω and are, therefore, random parameters. T is the technology matrix of dimension m₁ × n₀. It contains the technology coefficients that convert the first-stage decision x into resources for the second-stage problem. W is the recourse matrix of dimension m₁ × n₁. h is the second-stage resource vector of dimension m₁.

The second-stage problem seeks a policy y that optimizes the cost of the second-stage decision for a given value of the first-stage decision x. We denote the optimal value of the second-stage problem by Q(x, ω). This value depends on the random parameters and on the value of the first-stage variables x. Q(x, ω) is the optimal value, for any given ω ∈ Ω, of the following nonlinear program

Minimize q(y, ω)
s.t. W(ω)y = h(ω) − T(ω)x,   (13.9)
y ∈ ℝ^{n₁}.

If this second-stage problem is infeasible then we set Q(x, ω) = +∞. The model (13.9) is an adaptation model in which y is the recourse decision and Q(x, ω) is the recourse cost function.
The two-stage stochastic program with recourse is an optimization problem in the first-stage variables x, which optimizes the sum of the cost of the first-stage decisions, f(x), and the expected cost of the second-stage decisions. It is written as follows.

Minimize f(x) + E[Q(x, ω)]
s.t. ⟨aⁱ, x⟩ = b_i,  i = 1, 2, ..., m₀,   (13.10)
x ∈ ℝ^{n₀},

where aⁱ denotes the transpose of the ith row of the m₀ × n₀ matrix A, and b_i is the ith component of the m₀-vector b. ⟨aⁱ, x⟩ = b_i, i = 1, 2, ..., m₀, are linear restrictions on the first-stage variables. This model can be cast in the general formulation (13.3) simply by denoting f₀(x, ω) = f(x) + Q(x, ω) and f_i(x, ω) = ⟨aⁱ, x⟩ − b_i.
A formulation that combines (13.9) and (13.10) is the following:

Minimize f(x) + E[ Min_{y ∈ ℝ^{n₁}} { q(y, ω) | T(ω)x + W(ω)y = h(ω) } ]
s.t. Ax = b,   (13.11)
x ∈ ℝ^{n₀},

where Min denotes the minimal function value.


Let K₁ = {x ∈ ℝ^{n₀} | Ax = b} denote the feasible set for the first-stage problem. Let also K₂ = {x ∈ ℝ^{n₀} | E[Q(x, ω)] < +∞} denote the set of induced constraints. This is the set of first-stage decisions x for which the second-stage problem is feasible. Problem (13.10) is said to have complete recourse if K₂ = ℝ^{n₀}, that is, if the second-stage problem is feasible for any value of x. The problem has relatively complete recourse if K₁ ⊆ K₂, that is, if the second-stage problem is feasible for any value of the first-stage variables that satisfies the first-stage constraints. Simple recourse refers to the case when the recourse matrix is W = (I, −I) and the recourse constraints take the simple form Iy⁺ − Iy⁻ = h(ω) − T(ω)x, where I is the identity matrix, and the recourse vector y is written as y = y⁺ − y⁻ with y⁺ ≥ 0, y⁻ ≥ 0.

Deterministic Equivalent Formulation


We consider now the case where the random vector has a discrete and finite distribution, with support Ω = {ω¹, ω², ..., ω^N}. In this case the set Ω is called a scenario set. Denote by p_s the probability of realization of the sth scenario ω^s. That is, for every s = 1, 2, ..., N,

p_s = Prob(ω = ω^s)
    = Prob{(q(y, ω), W(ω), h(ω), T(ω)) = (q(y, ω^s), W(ω^s), h(ω^s), T(ω^s))}.

It is assumed that p_s > 0 for all ω^s ∈ Ω, and that ∑_{s=1}^N p_s = 1.


The expected value of the second-stage optimization problem can be expressed as

E[Q(x, ω)] = ∑_{s=1}^N p_s Q(x, ω^s).   (13.12)

For each realization of the random vector ω^s ∈ Ω a different second-stage decision is made, which is denoted by y^s. The resulting second-stage problems can then be written as

Minimize q(y^s, ω^s)
s.t. W(ω^s)y^s = h(ω^s) − T(ω^s)x,   (13.13)
y^s ∈ ℝ^{n₁}.

Combining now (13.12) and (13.13) we reformulate the stochastic nonlinear program (13.11) as the following large-scale deterministic equivalent nonlinear program

Minimize f(x) + ∑_{s=1}^N p_s q(y^s, ω^s)               (13.14)
s.t. Ax = b,                                            (13.15)
T(ω^s)x + W(ω^s)y^s = h(ω^s) for all ω^s ∈ Ω,           (13.16)
x ∈ ℝ^{n₀},                                             (13.17)
y^s ∈ ℝ^{n₁}.                                           (13.18)

The constraints (13.15)–(13.18) for this deterministic equivalent program can be combined into a matrix equation with block-angular structure

⎛ A                            ⎞ ⎛ x   ⎞   ⎛ b      ⎞
⎜ T(ω¹)  W(ω¹)                 ⎟ ⎜ y¹  ⎟   ⎜ h(ω¹)  ⎟
⎜ T(ω²)         W(ω²)          ⎟ ⎜ y²  ⎟ = ⎜ h(ω²)  ⎟   (13.19)
⎜   ⋮                  ⋱       ⎟ ⎜ ⋮   ⎟   ⎜   ⋮    ⎟
⎝ T(ω^N)                W(ω^N) ⎠ ⎝ y^N ⎠   ⎝ h(ω^N) ⎠
Split-Variable Formulation
The system of linear equations in (13.19) can be rewritten in a form that is, for some algorithms, more amenable to decomposition and parallel computations. In particular, in the absence of the x variables the system (13.19) becomes block-diagonal. The split-variable formulation replicates the first-stage variable vector x into a set of vectors x^s ∈ ℝ^{n₀}, one for each ω^s ∈ Ω. Once a different first-stage decision is allowed for each scenario, the stochastic program decomposes into N independent problems. Of course, the first-stage variables must be nonanticipative, that is, they cannot depend on scenarios that have not yet been observed when the first-stage decisions are made. This requirement is enforced by adding the restrictions x¹ = x² = ... = x^N. The split-variable formulation is then equivalent to the original stochastic problem (13.14)–(13.18), for which equation (13.19) can be written in the equivalent form

⎛ A                                           ⎞ ⎛ x¹  ⎞   ⎛ b      ⎞
⎜ T(ω¹)  W(ω¹)                                ⎟ ⎜ y¹  ⎟   ⎜ h(ω¹)  ⎟
⎜               A                             ⎟ ⎜ x²  ⎟   ⎜ b      ⎟
⎜               T(ω²)  W(ω²)                  ⎟ ⎜ y²  ⎟ = ⎜ h(ω²)  ⎟   (13.20)
⎜                             ⋱               ⎟ ⎜ ⋮   ⎟   ⎜   ⋮    ⎟
⎜                               A             ⎟ ⎜ x^N ⎟   ⎜ b      ⎟
⎜                               T(ω^N) W(ω^N) ⎟ ⎜ y^N ⎟   ⎜ h(ω^N) ⎟
⎜ I       −I                                  ⎟ ⎜     ⎟   ⎜ 0      ⎟
⎝    ⋱         ⋱                              ⎠ ⎝     ⎠   ⎝   ⋮    ⎠

where the last block of rows enforces the nonanticipativity conditions x¹ = x² = ... = x^N.
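The effect of the nonanticipativity rows in (13.20) can be illustrated numerically. In the sketch below (invented data), dropping the constraints x¹ = x² = ... lets each scenario optimize its own first-stage copy; the resulting decomposed ("wait-and-see") value can only be at least as good as the nonanticipative ("here-and-now") optimum:

```python
# Sketch of the split-variable idea from (13.20), with invented data.
# Each scenario s gets its own copy x^s of the first-stage decision;
# without the nonanticipativity rows the problem decomposes into
# independent scenario subproblems.

cost, revenue = 10.0, 15.0
scenarios = [(0.5, 30.0), (0.5, 50.0)]  # (p_s, demand)

def scenario_cost(x, w):
    """Cost of scenario s when its first-stage copy is x^s = x."""
    return cost * x - revenue * min(x, w)

# Decomposed problem: each scenario optimizes its own x^s independently
wait_and_see = sum(p * min(scenario_cost(x, w) for x in range(0, 61))
                   for p, w in scenarios)

# Nonanticipative problem: one shared x (the rows x^1 = x^2 in (13.20))
here_and_now = min(sum(p * scenario_cost(x, w) for p, w in scenarios)
                   for x in range(0, 61))

print(wait_and_see, here_and_now)
assert wait_and_see <= here_and_now   # relaxing nonanticipativity can only help
```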

Multistage Recourse Problems


The recourse problem is not restricted to the two-stage formulation. It is possible that observations are made at K different stages and are captured in the information sets {A_t}_{t=1}^K, with A₁ ⊂ A₂ ⊂ ... ⊂ A_K. These sets are subfields of the underlying σ-field Σ of all possible observations. A multistage stochastic program with recourse will have a recourse problem at stage τ conditioned on the information provided by A_τ, which includes all information provided by the information sets A_t, for t = 1, 2, ..., τ. The program also anticipates the information in A_t, for t = τ + 1, ..., K.

Let the random vector have support Ω = Ω₁ × Ω₂ × ... × Ω_K, which is the product set of all individual support sets Ω_t, t = 1, 2, ..., K; ω is written componentwise as ω = (ω¹, ..., ω^K). Denote the first-stage variable vector by y⁰. For each stage t = 1, 2, ..., K define the recourse variable vector y^t ∈ ℝ^{n_t}, the random cost function q_t(y^t, ω^t), and the random parameters {T_t(ω^t), W_t(ω^t), h_t(ω^t) | ω^t ∈ Ω_t}.
The multistage program, which extends the two-stage model (13.11), is formulated as the following nested optimization problem

Minimize_{y⁰ ∈ ℝ^{n₀}}  f(y⁰) + E[ Min_{y¹ ∈ ℝ^{n₁}} q₁(y¹, ω¹) + ... + E[ Min_{y^K ∈ ℝ^{n_K}} q_K(y^K, ω^K) ] ... ]

s.t.  T₁(ω¹)y⁰ + W₁(ω¹)y¹ = h₁(ω¹),
      ⋮                                              (13.21)
      T_K(ω^K)y^{K−1} + W_K(ω^K)y^K = h_K(ω^K).

For the case of discrete and finitely distributed probability distributions it is again possible to formulate the multistage model into a deterministic equivalent large-scale nonlinear program. Section 13.9 provides references to the literature on multistage programs.



13.4 Robust Optimization Problems
This section considers an alternative approach to handling uncertain or noisy data. This approach is applicable to optimization models that have two distinct components: a structural component that is fixed and free of any noise in its input data; and a control component that is subject to noisy input data. In some cases the robust optimization model is identical to a two-stage stochastic program with recourse. But it also allows additional flexibility in dealing with noise. In order to define the model we introduce two sets of variables:

x ∈ ℝ^{n₀} denotes the vector of decision variables that depend only on the noise-free structural constraints. These are the design variables whose values are independent of realizations of the noisy parameters.

y ∈ ℝ^{n₁} denotes the vector of control variables that can be adjusted once the uncertain parameters are observed. Their optimal values depend both on the realization of uncertain parameters, and on the optimal values of the design variables.

This terminology is borrowed from the flexibility analysis of production and distribution systems. The design variables determine the structure of the system and the size of production modules; the control variables are used to adjust the mode and level of production in response to disruptions in the system, changes in demand or production yield, and so on.
The optimization model we are interested in has the following structure

Minimize ⟨c, x⟩ + ⟨d, y⟩   (13.22)
s.t. Ax = b,               (13.23)
Bx + Cy = e,               (13.24)
x ∈ ℝ^{n₀},                (13.25)
y ∈ ℝ^{n₁},                (13.26)

where b, c, d, e are given vectors and A, B, C are given matrices. Equation (13.23) denotes the structural constraints that are free of noise. Equation (13.24) denotes the control constraints. The coefficients of these constraints, i.e., the elements of B, C, and e, are subject to noise. The cost vector d is also subject to noise, while A, b, and c are not.

To define the robust optimization problem we introduce an index set Ω = {1, 2, ..., S}. With each index s ∈ Ω we associate the scenario set {d(s), B(s), C(s), e(s)} of realizations of the control coefficients. Reference
to an index s implies reference to the scenario set associated with this index. The probability of the sth scenario is p_s, and ∑_{s∈Ω} p_s = 1. Now the following question is posed: What are the desirable characteristics of a solution to problem (13.22)–(13.26) when the coefficients of the constraints (13.24) take values from some given set of scenarios? The solution is considered robust with respect to optimality if it remains close to optimal for any realization of the scenario index s ∈ Ω. The problem is then termed solution robust. The solution is robust with respect to feasibility if it remains almost feasible for any realization of s. The problem is then termed model robust. The concepts of close and almost are precisely defined later through the choice of appropriate norms.
It is unlikely that a solution to the mathematical program will remain
both feasible and optimal for all realizations of s. If the system being
modeled has substantial built-in redundancies, then it might be possible to
find solutions that remain both feasible and optimal. Otherwise a model
is needed that permits a trade-off between solution and model robustness.
The model developed next formalizes a way to measure this trade-off.
First let us introduce a set {y¹, y², ..., y^S} of control variables, one for each scenario s ∈ Ω, and another set {z¹, z², ..., z^S} of feasibility error vectors that measure the infeasibility of the control constraints under each scenario. The real-valued objective function ξ(x, y) = ⟨c, x⟩ + ⟨d, y⟩ is a random variable taking the value ξ_s(x, y^s) = ⟨c, x⟩ + ⟨d(s), y^s⟩ with probability p_s. Hence, there is no longer a simple single choice for an aggregate objective function. The expected value

σ(·) = ∑_{s∈Ω} p_s ξ_s(·)   (13.27)

is precisely the objective function used in the stochastic programming formulations studied in the previous section. Another choice is to employ worst case analysis and minimize the maximal value. The objective function is then defined by

σ(·) = max_{s∈Ω} ξ_s(·).   (13.28)

The robust optimization formulation also allows the introduction of higher moments of the distribution of ξ(·) in the optimization model. Indeed, the introduction of higher moments is one of the features of robust optimization that distinguishes it from the stochastic programming model of the previous sections. For example, we could use a nonlinear utility function that embodies a trade-off between mean value and variability in this mean value. If U(ξ_s) denotes the utility of ξ_s, then the function

σ(·) = ∑_{s∈Ω} p_s U(ξ_s(·))   (13.29)

captures the risk preference of the user. A popular choice of utility functions, for portfolio management applications, is the logarithmic function U(ξ_s) = log ξ_s. The general robust optimization model includes a term σ(x, y¹, y², ..., y^S) in the objective function to denote the dependence of the function value on the scenario index s. This term controls solution robustness, and can take different forms depending on the application. The examples mentioned above are some popular choices.
The robust optimization model introduces a second term in the objective function to control model robustness. This term is a feasibility penalty function, denoted by ρ(z¹, z², ..., z^S), and it is used to penalize violations of the control constraints under some of the scenarios. The introduction of this penalty function also distinguishes the robust optimization model from the stochastic programming approach for dealing with noisy data. In particular, the model recognizes that it may not always be possible to arrive at a feasible solution to a problem under all scenarios. Infeasibilities will inevitably arise, and they will be dealt with outside the optimization model. The robust optimization model generates solutions that present the modeler with the fewest infeasibilities to be dealt with outside the model.

The specific choice of penalty function is problem dependent, and it also has implications for the choice of a solution algorithm. Two suitable penalty functions are the following:

ρ(z¹, z², ..., z^S) = ∑_{s∈Ω} p_s ||z^s||₂². This quadratic penalty function (i.e., a weighted ℓ₂-norm) is applicable to equality control constraints where both positive and negative violations of the constraints are equally undesirable. The resulting quadratic programming problem is twice continuously differentiable, and can be solved using standard quadratic programming algorithms, although it is typically large scale.

ρ(z¹, z², ..., z^S) = ∑_{s∈Ω} p_s max{0, max_j z_j^s}. This penalty function is applicable to inequality control constraints when only positive violations are of interest (negative values of some z_j^s indicate slack in the inequality constraints). With this choice of penalty function, however, the resulting mathematical program is nondifferentiable. It is possible to use an ε-smoothing of the exact penalty function, and employ the Linear Quadratic Penalty (LQP) algorithm (Chapter 7). The result is a differentiable problem that is easier to solve and produces a solution that lies within ε of the solution of the nondifferentiable problem.
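A small sketch comparing the two penalties on invented error vectors; the smoothing shown is one standard quadratic construction for max{0, t}, illustrative of the idea rather than the exact LQP smoothing of Chapter 7:

```python
# Sketch of the two penalty functions rho discussed above, plus a
# quadratic epsilon-smoothing of the nondifferentiable one (invented data).

probs = [0.3, 0.7]
z = [[0.5, -1.0], [2.0, 0.0]]   # error vectors z^s, one per scenario

# Weighted quadratic penalty: sum_s p_s * ||z^s||_2^2
quad = sum(p * sum(zj**2 for zj in zs) for p, zs in zip(probs, z))

# Exact penalty for inequality constraints: positive violations only
exact = sum(p * max(0.0, max(zs)) for p, zs in zip(probs, z))

def smooth_max0(t, eps=1e-2):
    """A quadratic epsilon-smoothing of max(0, t): differentiable at t = 0."""
    if t <= 0.0:
        return 0.0
    if t >= eps:
        return t - eps / 2.0
    return t * t / (2.0 * eps)

smooth = sum(p * smooth_max0(max(zs)) for p, zs in zip(probs, z))
print(quad, exact, smooth)   # smooth stays within eps/2 of the exact penalty
```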

The robust optimization model takes a multicriteria objective form. A goal programming weight parameter λ is used to derive a spectrum of answers that trade off solution for model robustness. The general formulation of the robust optimization model is stated as follows:

Minimize σ(x, y¹, y², ..., y^S) + λρ(z¹, z², ..., z^S)
s.t. Ax = b,
B(s)x + C(s)y^s + z^s = e(s) for all s ∈ Ω,
x ∈ ℝ^{n₀},
y^s ∈ ℝ^{n₁},
z^s ∈ ℝ^{m₁}.
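The role of λ can be seen in a one-dimensional sketch (all data invented, with σ a linear cost and ρ the quadratic feasibility penalty): as λ grows, the minimizer of σ + λρ moves toward scenario feasibility, tracing the trade-off between solution and model robustness:

```python
# Sketch (invented one-dimensional instance) of the lambda trade-off:
# minimizing sigma(x) + lambda * rho(x) for increasing lambda drives the
# solution toward model robustness (smaller constraint violations).

scenarios = [(0.5, 30.0), (0.5, 50.0)]   # (p_s, e(s)): scenario targets for x

def sigma(x):
    return 1.0 * x                        # solution-robustness term (cost)

def rho(x):
    # quadratic feasibility penalty: z^s = e(s) - x enters ||z^s||^2
    return sum(p * (e - x) ** 2 for p, e in scenarios)

def argmin(lam, grid):
    return min(grid, key=lambda x: sigma(x) + lam * rho(x))

grid = [i / 10.0 for i in range(0, 601)]          # candidate x in [0, 60]
xs = [argmin(lam, grid) for lam in (0.01, 0.5, 5.0)]
print(xs)   # minimizers move toward the scenario targets as lambda grows
```

(In closed form the unconstrained minimizer is x = 40 − 1/(2λ), so the sweep climbs toward the mean scenario target.)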

13.5 Applications
In this section we discuss real-world applications where uncertainty is prevalent and wherein it is handled using the models introduced above. We give first (subsection 13.5.1) an illustration of the robust optimization framework, using the classic diet problem as an example. The other subsections discuss models for planning under uncertainty in production and inventory management (subsection 13.5.2) and models for matrix balancing (subsection 13.5.3), with a separate section devoted to models for financial planning under uncertainty (Section 13.6).
13.5.1 Robust optimization for the diet problem
The well-known diet problem is used here as an example to illustrate the
feature of model robustness. This feature is particularly interesting in the
context of optimization formulations, since feasibility has traditionally been
overemphasized.
The problem is to find a minimum-cost diet that satisfies certain nutritional requirements. The origins of this problem date back to the 1940s and to the works of G.J. Stigler and G.B. Dantzig, where it was soon recognized as a problem of robust optimization, since the nutritional content of some food products may not be certain. Dantzig was still intrigued by this ambiguity when he wrote, several decades later:
When is an apple an apple and what do you mean by its cost
and nutritional content? For example, when you say apple
do you mean a Jonathan, or McIntosh, or Northern Spy, or
Ontario, or Winesap, or Winter Banana? You see, it can make
a difference, for the amount of ascorbic acid (vitamin C) can
vary from 2.0 to 20.8 units per 100 grams depending upon the
type of apple. (Dantzig, 1990.)
The standard linear programming formulation assumes an average nu­
tritional content for each food product and produces a diet. However, as
consumers buy food products of varying nutritional content they will soon
build a deficit or surplus of some nutrients. This situation may be irrele­
vant for a healthy individual over long periods of time, or it may require
remedial action in the form of vitamin supplements.

We develop here a robust optimization formulation for this problem. Let x_i denote the total cost of food type i in the diet; let a_{ij} denote the content of food type i in nutrient j per unit cost; and let b_j be the required daily allowance of nutrient j. Let c denote one specific nutrient, e.g., vitamin C, from the set of nutrients, and let A denote one specific food, e.g., apples, from the set of foods. A point estimate for the content of apples in vitamin C is a_{Ac} = \bar{a} per unit cost. For the sake of the example assume that this coefficient can take any value \{a_{Ac}^s\} for s in a scenario set \Omega. The robust optimization formulation of the diet problem is then
Minimize   \sum_i x_i + \lambda \sum_{s \in \Omega} \Big( b_c - \sum_{i \neq A} a_{ic} x_i - a_{Ac}^s x_A \Big)^2      (13.30)

s.t.   \sum_i a_{ij} x_i = b_j   for all j \neq c,      (13.31)

       x_i \geq 0   for all i.      (13.32)

The weight λ is used to trade off feasibility robustness with cost. For λ = 0, and also allowing the index j in (13.31) to take the value j = c, with a_{Ac} = \bar{a}, we obtain the classic linear programming formulation.
The diets obtained with larger values of λ have vitamin C content that varies very little with respect to the quality of apples. Figure 13.1 illustrates the error in vitamin C intake of alternative robust optimization solutions under different scenarios of vitamin C content. This figure illustrates the efficacy of the robust optimization model in hedging against alternative realizations of the data. For example, if an error of ±2ε units in total vitamin C intake is acceptable, no remedial action will be needed for the robust optimization dieter, no matter what quality of apples is included in the diet. On the other hand, the linear programming dieter will need remedial treatment for several of the available apple qualities (note, from Figure 13.1, that the error in vitamin C intake exceeds the allowable margin of ±2ε under seven scenarios).
The robust optimization diet is somewhat more expensive than the diet
produced by the linear program. Figure 13.2 shows the increase in the cost
of the diet as it becomes more robust with respect to nutritional content.
This simple example clarifies the meaning of a robust solution and shows
that robust solutions are indeed possible, but at some cost.
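The trade-off can be reproduced on a toy robust diet. The sketch below uses hypothetical data (one uncertain food and one certain supplement, echoing Dantzig's 2.0 to 20.8 units of vitamin C per 100 grams) and a brute-force grid search; the numbers, names, and the search strategy are ours, chosen only to mimic the behavior of formulation (13.30)-(13.32).

```python
import numpy as np

# Hypothetical data: vitamin C per dollar of apples varies by scenario,
# while a supplement delivers a certain amount per dollar.
a_apple = np.array([2.0, 8.0, 14.0, 20.8])  # units per dollar, one per scenario
a_supp = 10.0                               # units per dollar, certain
b_c = 60.0                                  # required daily allowance

def objective(x_apple, x_supp, lam):
    """Cost plus lam times the summed squared scenario violations."""
    violation = b_c - a_apple * x_apple - a_supp * x_supp
    return x_apple + x_supp + lam * np.sum(violation ** 2)

def best_diet(lam, grid=np.linspace(0.0, 8.0, 161)):
    """Brute-force minimization over a coarse grid (illustration only)."""
    best = min((objective(xa, xs, lam), xa, xs) for xa in grid for xs in grid)
    return best[1], best[2]  # (dollars of apples, dollars of supplement)
```

Raising λ pushes the diet toward the supplement, whose vitamin C content is certain: the robust diet costs slightly more, but its worst-case vitamin C error shrinks, just as Figures 13.1 and 13.2 suggest.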
13.5.2 Robust optimization for planning capacity expansion
Manufacturing and service firms have to plan for capacity expansion in
order to meet increasing demand for their products and services over time.
Demand, however, is usually highly uncertain. Product demand depends
on general economic conditions, competition, technological changes, and
the general business cycle. Demand may also exhibit seasonal variations,
which are particularly difficult to address in the context of service operations since an inventory of services cannot be created during periods of low demand. Public utility companies (power and water distribution) face both seasonal and daily variations. The hotel and travel businesses are also highly seasonal. Stochasticities are not restricted to demand alone; equipment failure, delivery of material by suppliers, and routine maintenance operations also show varying degrees of uncertainty.

Figure 13.1 Error (negative for deficit, positive for surplus) of the dieter's intake of vitamin C. The leftmost scenario corresponds to the lowest vitamin C content, while the rightmost scenario corresponds to the highest vitamin C content. The diet obtained with the linear programming formulation (i.e., λ = 0) is very sensitive to the vitamin content of the food products, whereas the diet obtained with the robust optimization model (i.e., λ = 5) is much less sensitive.
Planning for capacity expansion must account for the stochastic aspects of the system. A conservative plan may be developed by estimating capacity based on worst case analysis, wherein capacity must be sufficient to meet the demands of the last customer during the peak season. This strategy is quite expensive, and the prudent manager must carefully weigh the marginal profit from the last sale against the cost of maintaining underutilized manufacturing or service facilities. The United States automobile industry arrived at this conclusion in the early eighties and slashed capacity to increase utilization of its facilities, even at the cost of losing some potential customers. The decision made front-page news in The Wall Street Journal on October 7, 1986. Ford's chief financial officer summarized the essence of this strategy:



Figure 13.2 Trade-off between cost (i.e., solution robustness) and expected error in the vitamin C contents (i.e., model robustness) for diets obtained using increasing values of the weight parameter λ in the robust optimization model.

“We arrived at a conscious willingness to give up the last vehicle we needed in peak years.”

In this section we formulate a stochastic programming model that plans capacity expansion for a manufacturing firm.

The Multiperiod Stochastic Program


We consider, for simplicity, the example of a firm producing a single product
at multiple manufacturing sites. The model deals with uncertain product
demand, and can be easily extended to handle multiple products, which
is a more realistic situation. We begin with a stochastic programming
formulation such that solution robustness is dealt with as an extension of
the stochastic model.
The model makes capacity expansion decisions at K plant sites during a planning horizon of T time periods. There are H + 1 different decisions that can be made with respect to each plant site, denoted by h = 0, 1, 2, ..., H, where h = 1 signifies the current state of the plant, h = 0 indicates that the plant is shut down, and h = 2, 3, ..., H signal the various retooling options that are possible at each plant. Stochasticity in demand is dealt with by postulating a set of scenarios Ω = {1, 2, ..., N}.
The model makes first-stage decisions on capacity planning, i.e., which plants to shut down or to retool, and which plants to maintain at their current status. The recourse decisions are the production levels for each time period and under the different scenarios.

Notation
We define first the parameters of the model: use k = 1, 2, ..., K to denote plant sites, h = 0, 1, 2, ..., H to denote plant configurations, s ∈ Ω to denote scenarios, and t = 1, 2, ..., T to denote time periods.



p_{st} : the probability that scenario s occurs at time period t.
d_{st} : the demand for the product under scenario s, during time period t.
α : the fraction of unmet demand that is translated to profit by being diverted to other products of the same firm.
The capacity parameters are:
U_{kht} : the capacity available at site k under configuration h during the tth time period.
a_{kh} : the production coefficients, indicating the capacity required to produce one unit of the product at site k under configuration h.
L_{kht} : the capacity lost during the retooling process if site k is retooled into configuration h during the tth time period.
The cost coefficients, given next, are discounted to the initial time period, using an appropriate discount rate:
F_{kht} : the fixed cost for changing the configuration of site k from h = 1 to h = 0, 2, ..., H during the tth time period.
r_{kht} : the marginal contribution of producing and selling one unit of the product at site k, using configuration h at the tth time period.
r : the marginal contribution realized when there is unmet demand and a fraction α of it is diverted to other products.
Now we define the decision variables. There are two sets of continuous variables that denote the schedule of production and the level of unmet demand. Two sets of integer variables are used to denote retooling decisions and plant configurations.
x_{khst} : the number of units produced at site k, operating under configuration h under scenario s during the tth time period.
z_{st} : the number of units of unmet demand under scenario s during the tth time period.
y_{kht} = 1 if site k is in configuration h at time period t, and 0 otherwise.
w_{kht} = 1 if site k is retooled to configuration h at time period t, and 0 otherwise.
The continuous variables are constrained to be nonnegative. The integer variables are binary, that is, y_{kht} and w_{kht} are either 0 or 1.
Model Formulation
We now define the model by specifying precisely the objective function and
the constraints.

Objective function: The objective function that must be maximized has two
terms that account for direct profits from sales of the product and from
indirect profits from diverted demand, and a third term that accounts for
the retooling cost. It takes the form:



\sum_{s=1}^{N} \sum_{t=1}^{T} p_{st} \Big( \sum_{k=1}^{K} \sum_{h=0}^{H} r_{kht} x_{khst} \Big) + \alpha\, r \sum_{s=1}^{N} \sum_{t=1}^{T} p_{st} z_{st} - \sum_{t=1}^{T} \sum_{k=1}^{K} \sum_{h=0}^{H} F_{kht} w_{kht}.      (13.33)
Demand constraints: For each time period and for each scenario the total
production from all sites under all configurations plus the unmet demand
diverted to other products is equal to the total realized demand.

\sum_{k=1}^{K} \sum_{h=0}^{H} x_{khst} + z_{st} = d_{st}   for all s = 1, 2, \ldots, N,  t = 1, 2, \ldots, T.      (13.34)

Capacity constraints: The total production capacity utilized at each site cannot exceed the capacity available at the site under the given configuration, taking into account losses of production capacity due to retooling operations. This is described for all k = 1, 2, ..., K, all h = 0, 1, ..., H, all s = 1, 2, ..., N, and all t = 1, 2, ..., T by

a_{kh} x_{khst} \leq U_{kht} y_{kht} - L_{kht} w_{kht}.      (13.35)

Retooling constraints: We consider now the logical conditions between the retooling decisions and the plant configurations. A plant cannot be in a given configuration h ≠ 1 unless it has first been retooled from its original configuration h = 1. For the first time period this condition is imposed by the constraint y_{kh1} \leq w_{kh1}. For subsequent time periods we impose the constraints y_{kht} - y_{kh(t-1)} \leq w_{kht} for all k = 1, 2, ..., K, all h = 0, ..., H, and all t = 2, 3, ..., T.
Operational considerations usually dictate that a plant cannot be retooled more than once during the planning horizon. Hence, we add the constraint:

\sum_{h=0}^{H} \sum_{t=1}^{T} w_{kht} \leq 1   for all k = 1, 2, \ldots, K.      (13.36)

Finally, we require that each plant operate under some configuration or be shut down. These considerations are imposed by the constraints:

\sum_{h=0}^{H} y_{kht} = 1   for all k = 1, 2, \ldots, K,  t = 1, 2, \ldots, T,      (13.37)

w_{k0t} \leq y_{k0t}   for all k = 1, 2, \ldots, K,  t = 1, 2, \ldots, T.      (13.38)
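The logical constraints (13.36)-(13.38), together with the retooling logic, can be checked mechanically for any candidate binary arrays. A minimal sketch follows; the function name and the 0-based index layout are our own conventions, with h = 1 playing the role of the original configuration.

```python
import numpy as np

def retooling_feasible(y, w):
    """Check (13.36)-(13.38) and the retooling logic for binary arrays
    y[k, h, t] and w[k, h, t], with h = 0..H and t = 0..T-1 (0-based)."""
    K, H1, T = y.shape
    for k in range(K):
        if w[k].sum() > 1:                 # (13.36): at most one retooling
            return False
        for t in range(T):
            if y[k, :, t].sum() != 1:      # (13.37): exactly one configuration
                return False
            if w[k, 0, t] > y[k, 0, t]:    # (13.38): shutdown logic
                return False
            for h in range(H1):
                # before the first period the plant sits in configuration h = 1
                prev = y[k, h, t - 1] if t > 0 else (1 if h == 1 else 0)
                if y[k, h, t] - prev > w[k, h, t]:  # y_kht - y_kh(t-1) <= w_kht
                    return False
    return True
```

For example, moving a plant from configuration 1 to configuration 2 in the second period is feasible only if the corresponding retooling variable is switched on.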

Robustness Considerations
In practical capacity expansion applications, the expected cost or profit of
the decision is not the only objective. Because large amounts of capital
and other resources are involved, and the careers of many employees are



at stake, some form of risk measure should be incorporated into the model with the objective of reducing it. Models of robust optimization, and especially those with solution robustness, provide the framework for dealing with risk.
In order to produce capacity expansion plans that are solution robust we define first a scenario-dependent measure P_{st} of marginal profit from production at each time period t. We also define the total profit P_s under each scenario s as the total marginal profit from production during the planning horizon, less the fixed cost of the capacity expansion plan. We then impose additional constraints on the optimization model (13.34)-(13.38) so that some acceptable level of profit is realized under all scenarios.
The marginal profit realized under scenario s, from a given capacity plan and production schedule, is given by:

P_{st} = \sum_{k=1}^{K} \sum_{h=0}^{H} r_{kht} x_{khst} + \alpha\, r z_{st}.      (13.39)

The total profit for a given scenario s, accounting for the fixed cost of a capacity expansion plan, is given by:

P_s = \sum_{t=1}^{T} P_{st} - \sum_{t=1}^{T} \sum_{k=1}^{K} \sum_{h=0}^{H} F_{kht} w_{kht}.      (13.40)

One way to introduce solution robustness now is to obtain solutions that have maximum expected profit for a given level of variance of profit. This approach, which is common in the finance literature, is known as the Markowitz criterion. A range of such solutions can be obtained by trading off expected profit for variance, that is, reducing the level of acceptable variance, which also reduces the expected profit. This trade-off can be achieved by setting up the objective of the optimization model as follows:

Maximize   E[P_s] - \lambda\, Var[P_s],      (13.41)

where E[·] and Var[·] denote the expected value and the variance of the random variable respectively, and λ is a user-specified parameter. Large values of λ reduce variance at the expense of reduced profits, while smaller values allow the variance to increase, producing higher returns at a penalty of increased risk.
Calculation of the variance requires evaluation of a quadratic function
and the resulting optimization program becomes a nonlinear program with

continuous and integer variables. Such problems are very difficult to solve with currently available software systems, and they are likely to remain so in the future. Moreover, the use of a variance term as a measure of risk is inappropriate when the distribution of profits is not symmetric. The decision-maker wishes to reduce only downside risk, but by reducing variance the model reduces both upside potential and downside risk. An alternative formulation for robust capacity expansion planning imposes constraints that limit downside risk alone and gives rise to a linear optimization program. A target level of profit p is set and linear constraints are added to the optimization program of the previous section such that the profit is greater than p for all scenarios, i.e., P_s \geq p for all s = 1, 2, ..., N. If p is assigned a large negative value then these constraints are not binding and the model obtains a capacity expansion plan that maximizes expected profit. As p is increased the model is forced to seek capacity plans that are guaranteed to have a profit of at least p under all scenarios. Such plans, however, have reduced expected profit.
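The two risk criteria are easy to contrast on toy data. In the sketch below, three hypothetical capacity plans with equiprobable scenario profits are scored by the Markowitz objective (13.41) and filtered by the downside-risk constraint P_s ≥ p; all numbers and names are illustrative assumptions, not data from the text.

```python
import numpy as np

# Hypothetical scenario profits for three candidate capacity plans
# (rows: plans; columns: equiprobable scenarios).
P = np.array([[100.0, 100.0, 100.0],   # safe plan
              [ 60.0, 120.0, 180.0],   # moderate plan
              [ 40.0, 150.0, 230.0]])  # aggressive plan

def best_plan(lam):
    """Index of the plan maximizing E[P_s] - lam * Var[P_s] (Markowitz)."""
    return int(np.argmax([row.mean() - lam * row.var() for row in P]))

def best_plan_downside(p_target):
    """Maximize expected profit among plans with P_s >= p_target in every
    scenario; returns None if no plan meets the target."""
    ok = [i for i in range(len(P)) if P[i].min() >= p_target]
    return max(ok, key=lambda i: P[i].mean()) if ok else None
```

Sweeping λ from 0 upward moves the choice from the aggressive plan to the safe one, tracing the trade-off described above; raising the target p has the same qualitative effect while keeping the program linear.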
13.5.3 Robust optimization for matrix balancing
The problem of matrix balancing was examined earlier (Chapter 9), and entropy optimization models were developed wherein the problem data were assumed consistent, so that the sets of feasible solutions to Problems 9.2.3 and 9.2.4 were nonempty. For inconsistent data an interval-constrained formulation was proposed that allowed the solution of the matrix balancing problem in a way that satisfied the constraints within an error of ε. Here we develop a robust optimization model that provides an alternative way to overcome difficulties due to data inconsistency.
Consider the following equality-constrained optimization problem

Minimize   \sum_{i=1}^{m} \sum_{j=1}^{n} x_{ij} \log\Big(\frac{x_{ij}}{a_{ij}}\Big)      (13.42)

s.t.   \sum_{j=1}^{n} x_{ij} = u_i   for i = 1, 2, \ldots, m,      (13.43)

       \sum_{i=1}^{m} x_{ij} = v_j   for j = 1, 2, \ldots, n,      (13.44)

       X \geq 0.      (13.45)

When the observation vectors u and v are noisy it is possible that this optimization problem has no solution. Clearly, if \sum_{i=1}^{m} u_i \neq \sum_{j=1}^{n} v_j the optimization problem has no feasible solution. Several approaches can be pursued in order to overcome this difficulty. Tradition suggests that the vectors u and v be first scaled using the transformation u_i \leftarrow u_i \big( (\sum_j v_j) / (\sum_i u_i) \big) for all i so that feasibility is restored. Alternatively,

the interval-constrained formulation suggested in Problem 9.2.5 may be used, because with sufficiently large values of the interval parameter ε the program becomes feasible.
The robust optimization formulation of the matrix estimation problem explicitly accounts for potential infeasibilities in the linear constraints. It then introduces a penalty term in the objective function that minimizes a norm of the infeasibilities. Let y \in \mathbb{R}^m and z \in \mathbb{R}^n denote the infeasibility vectors for the constraints (13.43) and (13.44) respectively. The robust optimization model is written as

Minimize   \sum_{i=1}^{m} \sum_{j=1}^{n} x_{ij} \log\Big(\frac{x_{ij}}{a_{ij}}\Big) + \frac{\lambda}{2} \Big( \sum_{i=1}^{m} y_i^2 + \sum_{j=1}^{n} z_j^2 \Big)      (13.46)

s.t.   \sum_{j=1}^{n} x_{ij} - y_i = u_i   for i = 1, 2, \ldots, m,      (13.47)

       \sum_{i=1}^{m} x_{ij} - z_j = v_j   for j = 1, 2, \ldots, n,      (13.48)

       x \geq 0.      (13.49)

This formulation is a direct application of the robust optimization model with quadratic penalty for infeasibilities. It is possible to arrive at a similar mathematical formulation beginning with statistical arguments on the desirable properties of the balanced matrix. In particular, this formulation results in a Bayesian estimate of the matrix, that is, it maximizes (the logarithm of) the probability of the matrix X = (x_{ij}), conditional on the noisy observations \{u_i, v_j\}. The entropy term in the objective function estimates the matrix that is the least biased (or maximally uncommitted) with respect to missing information, conditioned on the observations \{a_{ij}\}. The quadratic terms are the logarithms of the probability distribution function of the error (i.e., noise) term, conditioned on the matrix X = (x_{ij}), assuming that the errors are normally distributed with mean zero and standard deviations that are identical for all observations.

Row-action Algorithm for Robust Optimization of Matrix Balancing


Both the quadratic and entropy terms in the objective function of the
robust optimization models are Bregman functions (see Section 2.1) and
so is their sum. It is easy to verify that both functions have the strong
zone consistency property with respect to the hyperplanes specified by the
equality constraints (13.47)-(13.48). Hence, we can apply a row-action
algorithm (Algorithm 6.4.1) to solve this model.
Consider first the application of the iterative step of Algorithm 6.4.1 to the ith row of the constraints (13.47). At the νth iteration it takes the form

x_{ij}^{\nu+1} = x_{ij}^{\nu} \exp(\beta)   for all j = 1, 2, \ldots, n,      (13.50)

y_i^{\nu+1} = y_i^{\nu} - \frac{\beta}{\lambda}.      (13.51)



It is also important to observe that since the algorithm is applied to equality constraints there is no need to explicitly update the dual variables, and so they are omitted from Algorithm 13.5.1 below. The projection parameter β is calculated such that (x^{\nu+1}, y^{\nu+1}) satisfy the ith constraint. Hence, substituting (13.50)-(13.51) into the ith equation of (13.47) we obtain

\sum_{j=1}^{n} x_{ij}^{\nu} \exp(\beta) - y_i^{\nu} + \frac{\beta}{\lambda} = u_i.      (13.52)

Let \Psi(\beta) be the nonlinear function

\Psi(\beta) = \sum_{j=1}^{n} x_{ij}^{\nu} \exp(\beta) - y_i^{\nu} + \frac{\beta}{\lambda} - u_i.      (13.53)

We seek a β* such that \Psi(\beta^*) = 0. We may use any nonlinear equation solver (e.g., Newton's method) to solve \Psi(\beta) = 0. However, a sufficient approximation to β* can be obtained by taking a single Newton step, starting from β = 0, similar to the use of secant approximations discussed earlier (Sections 6.9 and 12.4.2). The asymptotic convergence of the row-action algorithm is preserved if this approximation is used in the iterative step instead of the exact value β*.
It is worth mentioning that this approximation can be calculated in closed form, whereas the exact calculation of β* requires the use of an iterative procedure. Obtaining closed-form solutions to the nonlinear system of equations has important implications for the efficient implementation of the algorithm (see Section 12.4.2). The approximate solution of the nonlinear equation is obtained as

\beta = -\frac{\Psi(0)}{\Psi'(0)},      (13.54)

where \Psi' denotes the first derivative with respect to β. Straightforward calculations yield

\beta = \frac{u_i + y_i^{\nu} - \sum_{j=1}^{n} x_{ij}^{\nu}}{\sum_{j=1}^{n} x_{ij}^{\nu} + \frac{1}{\lambda}}.      (13.55)

This value of β is used in (13.50)-(13.51) to complete the iterative step over the constraint equations (13.47). Following a similar argument we obtain the iterative step of the row-action algorithm over the constraint equations (13.48). The complete algorithm is summarized below.



Algorithm 13.5.1 Row-Action Algorithm for Robust Matrix Balancing Optimization Model

Step 0: (Initialization.) Set ν = 0 and choose x^0 \in \mathbb{R}^{mn}, y^0 \in \mathbb{R}^m, z^0 \in \mathbb{R}^n, such that the initialization conditions of Algorithm 6.4.1 are satisfied. For example, set x_{ij}^0 = a_{ij}, y_i^0 = 0, z_j^0 = 0 for all i = 1, 2, ..., m, j = 1, 2, ..., n.

Step 1: (Iterative step over rows of the matrix, i.e., equations (13.47).) For all i = 1, 2, ..., m, calculate:

\beta_i = \frac{u_i + y_i^{\nu} - \sum_{j=1}^{n} x_{ij}^{\nu}}{\sum_{j=1}^{n} x_{ij}^{\nu} + \frac{1}{\lambda}},      (13.56)

x_{ij}^{\nu+\frac{1}{2}} = x_{ij}^{\nu} \exp(\beta_i),      (13.57)

y_i^{\nu+1} = y_i^{\nu} - \frac{\beta_i}{\lambda}.      (13.58)

Step 2: (Iterative step over columns of the matrix, i.e., equations (13.48).) For all j = 1, 2, ..., n, calculate:

\beta_j = \frac{v_j + z_j^{\nu} - \sum_{i=1}^{m} x_{ij}^{\nu+\frac{1}{2}}}{\sum_{i=1}^{m} x_{ij}^{\nu+\frac{1}{2}} + \frac{1}{\lambda}},      (13.59)

x_{ij}^{\nu+1} = x_{ij}^{\nu+\frac{1}{2}} \exp(\beta_j),      (13.60)

z_j^{\nu+1} = z_j^{\nu} - \frac{\beta_j}{\lambda}.      (13.61)

Step 3: Replace ν ← ν + 1, and return to Step 1.
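Algorithm 13.5.1 is short enough to state directly in code. The numpy sketch below (the function name and test data are ours) applies the single-Newton-step projections (13.56)-(13.61); any inconsistency between the row and column totals is absorbed by the infeasibility vectors y and z.

```python
import numpy as np

def robust_balance(a, u, v, lam=1.0, iters=1000):
    """Row-action sketch for the robust matrix balancing model
    (13.46)-(13.49): balance matrix a so that row sums track u and
    column sums track v, with infeasibilities y (rows) and z (columns)."""
    x = a.astype(float).copy()
    y = np.zeros(len(u))   # row infeasibility vector
    z = np.zeros(len(v))   # column infeasibility vector
    for _ in range(iters):
        # Step 1: single Newton step onto each row constraint (13.47)
        for i in range(len(u)):
            s = x[i, :].sum()
            beta = (u[i] + y[i] - s) / (s + 1.0 / lam)   # (13.56)
            x[i, :] *= np.exp(beta)                      # (13.57)
            y[i] -= beta / lam                           # (13.58)
        # Step 2: single Newton step onto each column constraint (13.48)
        for j in range(len(v)):
            s = x[:, j].sum()
            beta = (v[j] + z[j] - s) / (s + 1.0 / lam)   # (13.59)
            x[:, j] *= np.exp(beta)                      # (13.60)
            z[j] -= beta / lam                           # (13.61)
    return x, y, z
```

On inconsistent data (row totals summing to 6 against column totals summing to 4, say) the iterates converge to a positive balanced matrix whose residual inconsistency is carried entirely by y and z.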

13.6 Stochastic Programming for Portfolio Management

Portfolio management problems can be viewed as multiperiod dynamic decision problems where transactions take place at discrete time points.

At each point in time the manager has to assess the prevailing market
conditions (such as prices and interest rates) and the composition of the
existing portfolio. The manager has also to assess potential fluctuations in
interest rates, prices, and cashflows. This information is incorporated into a



sequence of actions of buying or selling securities, and short-term borrowing
or lending. Thus, at the next point in time the portfolio manager has a
seasoned portfolio and, faced with a new set of possible future movements,
must incorporate the new information so that transactions can be executed.
Portfolio management of equities is based on the notion of diversification introduced by H. Markowitz in his seminal work in the 1950s. Diversification is achieved by minimizing the variance of returns during a holding period, subject to constraints on the mean value of the returns. There is only one time interval under consideration. Therefore, future transactions are not incorporated and this is a single-period (myopic) model. The portfolio management strategy for fixed-income securities has been that of portfolio immunization, i.e., portfolios are developed that are hedged against small changes from the current term structure of interest rates. Such models are again single-period and ignore future transactions. Furthermore, they ignore the truly stochastic nature of interest rates, and merely hedge against (small) changes from currently observed data. The idea of immunization dates back to the actuary F.M. Redington in the 1950s, and it has been used extensively since the mid-70s.
The increased complexity of fixed-income securities and the increased volatility of the financial markets during the 1980s have motivated interest in mathematical models that more accurately capture the dynamic (i.e., multiperiod) and stochastic nature of the portfolio management problem. Stochastic programming models with recourse provide a versatile tool for the representation of a wide variety of portfolio management problems. This section formulates a multistage stochastic programming model for managing portfolios of fixed-income securities. We assume that readers have some familiarity with basic concepts of finance. As an introductory text we recommend Bodie, Kane, and Marcus (1989) and, for more advanced material, Elton and Gruber (1984) and Zenios (1993a).
The model specifies a sequence of investment decisions at discrete time
points. Decisions are made at the beginning of each time period. The
portfolio manager starts with a given portfolio and a set of scenarios about
future states of the economy which she incorporates into an investment
decision. The precise composition of the portfolio depends on transactions
at the previous decision point and on the realized scenario. Another set of
investment decisions are made that incorporate both the current status of
the portfolio and new information about future scenarios.
We develop a three-stage model, with decisions made at time instances t_0, t_1, and t_2. Extension to a multistage model is straightforward. Scenarios unfold between t_0 and t_1, and then again between t_1 and t_2. A simple



Figure 13.3 The evolution of scenarios on a binomial lattice
and the structure of the portfolio investment decisions.

three-stage problem is illustrated in Figure 13.3, where it is assumed that scenarios evolve on a binomial lattice. At instance t_0 two scenarios are anticipated, and by instance t_1 this uncertainty is resolved. Denote these scenarios by s_0^1 and s_0^2. At t_1 two more scenarios are anticipated, s_1^1 and s_1^2. A complete path is denoted by a pair of scenarios. In this example there are four paths from t_0 to t_2, denoted by the pairs (s_0^1, s_1^1), (s_0^1, s_1^2), (s_0^2, s_1^1), and (s_0^2, s_1^2).
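The path bookkeeping is straightforward to mechanize. A minimal sketch follows; the scenario labels and the equal path probabilities are illustrative assumptions, not data from the text.

```python
from itertools import product

# Scenario labels for the two stages of the binomial lattice in Figure 13.3
# (names are ours; the text only requires two scenarios per stage).
S0 = ["s0_up", "s0_down"]   # resolved between t0 and t1
S1 = ["s1_up", "s1_down"]   # resolved between t1 and t2

# A complete path is a pair (s0, s1); pi assigns each path a probability.
paths = list(product(S0, S1))
pi = {path: 0.25 for path in paths}   # equally likely, for illustration
```

In a real model pi(s0, s1) would come from the lattice's branch probabilities; expectations of terminal wealth are then sums over these four paths.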
The model is a nonlinear program maximizing the expected value of a
utility function of terminal wealth. Expectations are computed over the set
of postulated scenarios. The value of the utility function is obtained from
the solution of a multistage program, which is structured in such a way
that decisions at every time period anticipate future scenarios, while they
adapt to new information on market conditions as it becomes available.

13.6.1 Notation
The model is developed using cashflow accounting equations. Investment
decisions are in dollars of face value. We define first the parameters of the
model.

S_0, S_1 : the sets of scenarios anticipated at t_0 and t_1 respectively. We use s_0 and s_1 to denote scenarios from S_0 and S_1, respectively. Paths are denoted by pairs of the form (s_0, s_1), and with each path we associate a probability \pi(s_0, s_1).
I : the set of available securities. The cardinality of I (i.e., the number of available financial instruments) is m.
c_0 : the dollar amount of the riskless asset available at t_0.
b^0 \in \mathbb{R}^m : a vector whose components denote the composition of the initial portfolio.

\xi^0, \zeta^0 \in \mathbb{R}^m : vectors of bid and ask prices respectively, at t_0. These prices are known with certainty. In order to buy an instrument the buyer has to pay the bid price, and in order to sell it the owner is asking for the ask price.
\xi^1(s_0), \zeta^1(s_0) \in \mathbb{R}^m, for all s_0 \in S_0 : vectors of bid and ask prices, respectively, realized at t_1. These prices depend on the scenario s_0.
\xi^2(s_0, s_1), \zeta^2(s_0, s_1) \in \mathbb{R}^m, for all s_0 \in S_0 and all s_1 \in S_1 : vectors of bid and ask prices, respectively, realized at t_2. These prices depend on the path (s_0, s_1).
\alpha^0(s_0), \alpha^1(s_0, s_1) \in \mathbb{R}^m, for all s_0 \in S_0 and all s_1 \in S_1 : vectors of amortization factors during the time intervals [t_0, t_1) and [t_1, t_2) respectively. The amortization factors indicate the fraction of outstanding face value of the securities at the end of the interval compared to the outstanding face value at the beginning of the interval. These factors capture the effects of any embedded options, such as prepayments and calls, or the effect of lapse behavior. For example, a corporate security that is called during the interval has an amortization factor of 0, and an uncalled bond has an amortization factor of 1. A mortgage security that experiences a 10 percent prepayment and that pays, through scheduled payments, an additional 5 percent of the outstanding loan has an amortization factor of 0.85. These factors depend on the scenarios.
k^0(s_0), k^1(s_0, s_1) \in \mathbb{R}^m, for all s_0 \in S_0 and all s_1 \in S_1 : vectors of cash accrual factors during the intervals [t_0, t_1) and [t_1, t_2) respectively. These factors indicate cash generated during the interval, per unit face value of the security, due to scheduled payments and exercise of the embedded options, accounting for accrued interest. For example, a corporate security that is called at the beginning of a one-year interval, in a 10 percent interest rate environment, will have a cash accrual factor of 1.10. These factors depend on the scenarios.
\rho_0(s_0), \rho_1(s_0, s_1) : short-term riskless reinvestment rates during the intervals [t_0, t_1) and [t_1, t_2) respectively. These rates depend on the scenarios.
L_1(s_0), L_2(s_0, s_1) : liability payments at t_1 and t_2 respectively. Liabilities may depend on the scenarios.

Now let us define the decision variables. We have four distinct decisions at each point in time: how much of each security to buy, sell, or hold in the portfolio, and how much to invest in the riskless asset. All variables are constrained to be nonnegative.
First-stage variables, at t_0:
x^0 \in \mathbb{R}^m : the components of the vector denote the face value of each security bought.

yO IRm . denotes componentwise the face value of each security sold.


z° e IRm : denotes componentwise the face value of each security held in
the portfolio .
vq : the dollar amount invested in the riskless asset.

Downloaded from https://academic.oup.com/book/53915/chapter/422193805 by OUP site access user on 12 May 2024


Second-stage variables, at $t_1$, for each scenario $s_0$:

$x^1(s_0) \in \mathbb{R}^m$: denotes the vector of the face values of each security bought.
$y^1(s_0) \in \mathbb{R}^m$: denotes the vector of the face values of each security sold.
$z^1(s_0) \in \mathbb{R}^m$: denotes the vector of the face values of each security held in the portfolio.
$v_1(s_0)$: the dollar amount invested in the riskless asset.

Third-stage variables, at $t_2$, for each scenario $(s_0,s_1)$:

$x^2(s_0,s_1) \in \mathbb{R}^m$: denotes the vector of the face values of each security bought.
$y^2(s_0,s_1) \in \mathbb{R}^m$: denotes the vector of the face values of each security sold.
$z^2(s_0,s_1) \in \mathbb{R}^m$: denotes the vector of the face values of each security held in the portfolio.
$v_2(s_0,s_1)$: the dollar amount invested in the riskless asset.

13.6.2 Model formulation


The constraints of the model express cashflow accounting for the riskless asset and inventory balance for each security at all time periods.

First-stage constraints: At the first stage (i.e., at time $t_0$) all prices are known with certainty. The cashflow accounting equation specifies that the original endowment in the riskless asset, plus any proceeds from liquidating part of the existing portfolio, equal the amount invested in the purchase of new securities plus the amount invested in the riskless asset, i.e.,
$$C_0 + \sum_{i=1}^m \zeta_i^0 y_i^0 = \sum_{i=1}^m \zeta_i^0 x_i^0 + v_0. \qquad (13.62)$$

For each security in the portfolio we have an inventory balance constraint:

$$b_i^0 + x_i^0 = y_i^0 + z_i^0 \quad \text{for all } i \in I. \qquad (13.63)$$
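The two first-stage constraints can be checked numerically. The sketch below uses made-up prices, holdings, and endowment for two securities; all numbers are purely illustrative and are not data from the model.

```python
# Hypothetical two-security check of the first-stage constraints
# (13.62)-(13.63).  zeta0: prices, b0: initial holdings, C0: cash.
zeta0 = [0.98, 1.02]      # prices per unit face value (assumed)
b0    = [100.0, 50.0]     # initial portfolio (face values)
C0    = 10.0              # initial cash endowment

# A candidate first-stage decision: sell part of security 1 and
# buy some of security 2.
x0 = [0.0, 40.0]          # face value bought
y0 = [50.0, 0.0]          # face value sold
z0 = [b0[i] + x0[i] - y0[i] for i in range(2)]   # balance (13.63)

# Cashflow accounting (13.62): endowment plus sale proceeds equals
# purchases plus the riskless investment v0.
v0 = C0 + sum(zeta0[i]*y0[i] for i in range(2)) \
        - sum(zeta0[i]*x0[i] for i in range(2))

assert all(zi >= 0.0 for zi in z0) and v0 >= 0.0   # nonnegativity
```

A decision is feasible at the first stage exactly when the implied holdings `z0` and riskless investment `v0` come out nonnegative.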

Second-stage constraints: Decisions made at the second stage (i.e., at time $t_1$) depend on the scenario $s_0$ realized during the interval $[t_0,t_1)$. Hence, we have one constraint for each scenario. These decisions also depend on the investment decisions made at the first stage.

Cashflow accounting ensures that the amount invested in the purchase of new securities and the riskless asset is equal to the income generated by the existing portfolio during the holding period, plus any cash generated

from sales, less the liability payments. There is one constraint for each scenario:

$$\rho_0(s_0)v_0 + \sum_{i=1}^m k_i^0(s_0)\, z_i^0 + \sum_{i=1}^m \zeta_i^1(s_0)\, y_i^1(s_0) = v_1(s_0) + \sum_{i=1}^m \zeta_i^1(s_0)\, x_i^1(s_0) + L_1(s_0), \quad \text{for all } s_0 \in S_0. \qquad (13.64)$$

Inventory balance equations constrain the amount of each security sold or remaining in the portfolio to be equal to the outstanding amount of face value at the end of the first period, plus any amount purchased at the beginning of the second stage. There is one constraint for each security and for each scenario:

$$a_i^0(s_0)\, z_i^0 + x_i^1(s_0) = y_i^1(s_0) + z_i^1(s_0), \quad \text{for all } i \in I,\ s_0 \in S_0. \qquad (13.65)$$

Third-stage constraints: Decisions made at the third stage (i.e., at time $t_2$) depend on the path $(s_0,s_1)$ realized during the period $[t_1,t_2)$ and on the decisions made at $t_1$. The constraints are similar to those of the second stage. The cashflow accounting equation is

$$\rho_1(s_0,s_1)v_1(s_0) + \sum_{i=1}^m k_i^1(s_0,s_1)\, z_i^1(s_0) + \sum_{i=1}^m \zeta_i^2(s_0,s_1)\, y_i^2(s_0,s_1) = v_2(s_0,s_1) + \sum_{i=1}^m \zeta_i^2(s_0,s_1)\, x_i^2(s_0,s_1) + L_2(s_0,s_1), \qquad (13.66)$$

for all paths $(s_0,s_1)$ such that $s_0 \in S_0$ and $s_1 \in S_1$.


The inventory balance equation is:

a)(s0,si)z| (so) + ar?(s0,Si) = l/2(so,si) + 22(s0,si), (13.67)

for all i e I, and all paths (so,Si) such that so € So and «i G Si.
Objective function: The objective function maximizes the expected utility of terminal wealth. In order to measure terminal wealth all securities in the portfolio are marked-to-market, in accordance with recent U.S. Financial Accounting Standards Board (FASB) regulations that require reporting portfolio market and book values. The composition of the portfolio and its market value depend on the scenarios $(s_0,s_1)$. The objective of the portfolio optimization model is

$$\text{Maximize} \quad \sum_{(s_0,s_1)\in S_0\times S_1} \pi(s_0,s_1)\, U\big(W(s_0,s_1)\big),$$

where $\pi(s_0,s_1)$ is the probability associated with the path $(s_0,s_1)$; $W(s_0,s_1)$ denotes terminal wealth; and $U$ denotes the utility function. Terminal wealth is given by

$$W(s_0,s_1) = v_2(s_0,s_1) + \sum_{i=1}^m \zeta_i^2(s_0,s_1)\, z_i^2(s_0,s_1). \qquad (13.68)$$
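Once a scenario tree is fixed, the terminal-wealth expression (13.68) and the expected-utility objective can be evaluated directly. The sketch below uses a hypothetical two-by-two tree with equal path probabilities, a single security, and a logarithmic utility; every number is illustrative.

```python
import math

# Hypothetical scenario paths (s0, s1) with equal probabilities.
S0, S1 = ["up", "down"], ["up", "down"]
prob = {(a, b): 0.25 for a in S0 for b in S1}            # pi(s0, s1)

v2    = {(a, b): 5.0 for a in S0 for b in S1}            # riskless cash
zeta2 = {("up", "up"): [1.04], ("up", "down"): [1.01],
         ("down", "up"): [0.99], ("down", "down"): [0.96]}  # final prices
z2    = {(a, b): [100.0] for a in S0 for b in S1}        # final holdings

def wealth(s):
    """Terminal wealth W(s0, s1), eq. (13.68): cash plus market value."""
    return v2[s] + sum(p*q for p, q in zip(zeta2[s], z2[s]))

U = math.log                                              # a concave utility
expected_utility = sum(prob[s]*U(wealth(s)) for s in prob)
```

An optimizer would search over the buy/sell/hold variables to maximize `expected_utility`; here the portfolio is frozen just to show how the objective is assembled.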

13.7 Stochastic Network Models


We revisit the deterministic equivalent formulation of the two-stage model (13.14)-(13.18) and address the special case where the constraints (13.15)-(13.16) can be represented by a generalized network structure. (See Section 12.1 for the definition of networks and generalized networks.) For this case we assume that $\binom{A}{T(\omega^s)}$ and $W(\omega^s)$ are node-arc incidence matrices with exactly two nonzero entries per column for each scenario $\omega^s$ in a scenario set $\Omega = \{\omega^1, \omega^2, \ldots, \omega^S\}$; the complete matrix in (13.19), however, is not a node-arc incidence matrix, due to the occurrence of $T(\omega^s)x$ for all scenarios $\omega^s \in \Omega$. The matrix $(A^T \mid T(\omega^1)^T \mid \cdots \mid T(\omega^S)^T)^T$ has more than two nonzero entries in every column. The recourse problem has a network structure, but the first-stage variables $x$ are complicating variables as they link the recourse problem constraints (via the linear equations (13.16)). The next section gives a detailed formulation of the stochastic network problem, where the network structure becomes more discernible.
The financial modeling applications of Section 13.6 can be represented
using stochastic network structures. For each fixed-income security at each
time period we associate a network node; and for each transaction we
associate an arc. For example, an arc linking two different securities at the
same time period is used to denote sale of one security and purchase of the
other, while an arc linking the same security across different time periods
is used to denote the inventory of that particular security. Figure 13.4
illustrates the structure of the network flow problem for two securities and
three time periods. If all data are known with complete certainty the
problem is the classic network flow problem. The stochastic programming
model uses a different network flow problem for each scenario, but the flows
on the arcs representing first-stage variables are common across scenarios.
Problems of planning hydroelectric power scheduling can also be represented as stochastic network problems. Power generation decisions made today for the coming hours depend on the current state of the system, electricity demand, water inflows, and so on. The necessary input data for modeling such a complex system consist of: the network topology, specified by the geographical location of dams and the associated hydropower generation units, and their interconnections; limits on reservoir storage, level of

Figure 13.4 The structure of a generalized network model for portfolio management of two securities over three time periods.

turbine operations, pumping, and spillage; and hydroelectric production coefficients for each reservoir, obtained from engineering analysis of its storage capacity and the turbine technology. Important input data are also the water level in the reservoirs and the electricity demand. Both of these quantities are uncertain; furthermore, demand for electricity exhibits both a daily and a seasonal variation that can be estimated, at best, by a set of scenarios. The same is also true for the water level, which depends on rainfall. These uncertainties are fundamental to the operation of the system. They can be incorporated in a stochastic network model.
Another complex problem that has been modeled using stochastic networks is that of planning air-traffic ground holding policies (see also Section 12.3.2). The air traffic system is a complex web of airports, aircraft, and traffic controllers at all airports. In the United States a centralized flow control facility in Washington, DC, coordinates the control of this system. The complexity of the air traffic system in Europe is intensified by the need to coordinate several control centers among different countries, a situation which has become further complicated with the integration of east European countries into the system.
The systems are highly congested; traffic flow is carefully monitored and
controlled so that flights proceed without risk to safety. One key control
mechanism is ground holding, whereby a flight is delayed for departure if
congestion is anticipated at the destination airport. Ground holding is
a safe and relatively inexpensive solution, as opposed to holding aircraft
in flight before granting landing clearance. While the air traffic control
system does an excellent job monitoring traffic so that high safety standards
are maintained, there is substantial room for improvement especially with
regard to cost effectiveness. It is estimated that ground delays in the United
States in 1986 averaged 2000 hours per day, equivalent to grounding a total

of 250 airplanes (a carrier the size of Delta Airlines). A study by the West
German Institute for Technology estimated the avoidable cost of air traffic
delays in 1990 due to ground holding alone at 1.5 billion US dollars.
The ground holding policy problem seeks optimal holding policies, based



on the number of flights scheduled for departure during the planning horizon and the travel time to the destination airport. Even for the simple case where only a single destination airport is analyzed, its capacity is uncertain due to weather conditions. The problem is complicated further by the presence of multiple airports: ground holding decisions at each one have a cascade effect on all others. The single destination airport problem has been modeled using stochastic network optimization models. Application of the model to data obtained from Logan airport (Boston, Mass., USA) showed that substantial reductions in total delay can be realized when using the stochastic programming dynamic models as opposed to more commonly used static models.
13.7.1 Split-variable formulation of stochastic network models

We consider here an alternative form of the deterministic equivalent formulation (13.14)-(13.18) that better illustrates the network structure, and is also more suitable for the development of parallel optimization algorithms (Section 13.8). The split-variable formulation (see also Section 13.3.3) breaks the stochastic network problem into a large number of independent deterministic network flow problems with some additional coupling constraints, by replicating the first-stage variables $x$ into a set of variables $x^s \in \mathbb{R}^{n_0}$, one for each $\omega^s \in \Omega$. Once a different first-stage decision is allowed for each scenario, the stochastic program decomposes into $S$ independent problems. Of course, the first-stage variables must be non-anticipative; that is, they cannot depend on as yet unobserved scenarios, a requirement that is enforced by adding the condition that $x^1 = x^2 = \cdots = x^S$. The split-variable formulation is then equivalent to the original stochastic problem (13.14)-(13.18). It can be written as follows:

$$\text{Minimize} \quad \sum_{s=1}^S p_s\big(f(x^s) + q_s(y^s, \omega^s)\big) \qquad (13.69)$$
$$\text{s.t.} \quad A x^s = b \quad \text{for all } s \in \Omega, \qquad (13.70)$$
$$T(\omega^s)x^s + W(\omega^s)y^s = h(\omega^s) \quad \text{for all } s \in \Omega, \qquad (13.71)$$
$$x^1 - x^s = 0 \quad \text{for all } s \in \Omega, \qquad (13.72)$$
$$x^s \in \mathbb{R}^{n_0}_+, \qquad (13.73)$$
$$y^s \in \mathbb{R}^{n_1}_+. \qquad (13.74)$$

The constraints (13.72) are known as non-anticipativity constraints and they ensure that first-stage decisions $x^s$ do not depend on future realizations. For two scenarios $s_1$ and $s_2$ that are indistinguishable when the first-stage decisions are made, we have that $x^{s_1} = x^{s_2}$.
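The replication idea can be sketched in a few lines, assuming nothing beyond the text: each scenario carries its own copy of the first-stage vector, and non-anticipativity is simply the requirement that all copies agree.

```python
# Split-variable sketch: replicate the first-stage decision per
# scenario, then test the non-anticipativity conditions (13.72),
# i.e., x^1 = x^2 = ... = x^S.  The data are hypothetical.
scenarios = ["s1", "s2", "s3"]
x = {s: [10.0, 5.0] for s in scenarios}   # replicated first-stage vars

def nonanticipative(x):
    """True when every scenario copy equals the first one."""
    first = x[scenarios[0]]
    return all(x[s] == first for s in scenarios[1:])

assert nonanticipative(x)
x["s3"] = [10.0, 6.0]            # a scenario-dependent first stage...
assert not nonanticipative(x)    # ...violates (13.72)
```

Dropping the agreement requirement leaves `len(scenarios)` fully independent problems, which is exactly the structure the decomposition algorithms exploit.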


With this reformulation the model (13.69)-( 13.74) is a network with side
constraints. In the absence of the (side) constraints (13.72), the constraint
set decomposes completely into S independent problems. Each problem

Downloaded from https://academic.oup.com/book/53915/chapter/422193805 by OUP site access user on 12 May 2024


has a network structure, since (AT | T(ljs)t)t and IV(u,s) are node-arc
incidence matrices.
The constraints matrix of the split-variable formulation has a block­
diagonal structure with additional (coupling) rows for the non-anticipativity
constraints.
Let $M = S\cdot(m_0 + m_1) + (S-1)\cdot n_0$, $N = S\cdot(n_0 + n_1)$, and let $I$ denote the $n_0 \times n_0$ identity matrix. Recall that $m_0$ and $n_0$ are the numbers of first-stage constraints and variables respectively, and $m_1, n_1$ are the numbers of second-stage constraints and variables respectively. The constraint matrix for (13.70)-(13.72) has dimension $M \times N$. We denote this matrix by $\Phi$ (see also equation (13.20)), and it is defined as follows:

$$\Phi = \begin{pmatrix}
A & & & & & & \\
T(\omega^1) & W(\omega^1) & & & & & \\
& & A & & & & \\
& & T(\omega^2) & W(\omega^2) & & & \\
& & & & \ddots & & \\
& & & & & A & \\
& & & & & T(\omega^S) & W(\omega^S) \\
I & & -I & & & & \\
\vdots & & & & \ddots & & \\
I & & & & & -I &
\end{pmatrix} \qquad (13.75)$$

It is evident from the structure of the constraint matrix that the problem decomposes by scenario if the non-anticipativity constraints are ignored.

Let $\gamma \in \mathbb{R}^M$ denote the right-hand side of (13.70)-(13.72) and let $z \in \mathbb{R}^N$ be the vector of decision variables, i.e.,

$$z = \big((x^1)^T \mid (y^1)^T \mid \cdots \mid (x^S)^T \mid (y^S)^T\big)^T. \qquad (13.76)$$

We also introduce, for completeness, a vector $u \in \mathbb{R}^N$ whose components denote upper bounds on the variables. Finally, let $F(z)$ denote the objective function (13.69).

The split-variable formulation with bounded variables can be written in matrix form as

$$\text{Minimize} \quad F(z) \qquad (13.77)$$
$$\text{s.t.} \quad \begin{pmatrix} \Phi \\ I_N \end{pmatrix} z \;\begin{matrix} = \\ \le \end{matrix}\; \begin{pmatrix} \gamma \\ u \end{pmatrix}, \qquad (13.78)$$
$$z \in \mathbb{R}^N_+, \qquad (13.79)$$

where $I_N$ is the $N \times N$ identity matrix.
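The block layout of (13.75) is easy to assemble programmatically. The sketch below builds $\Phi$ for a hypothetical instance with $1\times 1$ blocks and $S = 2$ scenarios, purely to make the placement of the scenario blocks and the non-anticipativity rows concrete; the block entries are arbitrary.

```python
# Assemble Phi of (13.75) for a tiny instance: m0 = m1 = n0 = n1 = 1,
# S = 2.  A, T(w^s), W(w^s) are 1x1 "matrices" with illustrative values.
S = 2
A, T, W = [[1.0]], {1: [[2.0]], 2: [[3.0]]}, {1: [[1.0]], 2: [[1.0]]}
m0, m1, n0, n1 = 1, 1, 1, 1
M = S*(m0 + m1) + (S - 1)*n0       # total rows
N = S*(n0 + n1)                    # total columns

Phi = [[0.0]*N for _ in range(M)]
for s in range(S):                 # scenario diagonal blocks
    row, col = s*(m0 + m1), s*(n0 + n1)
    Phi[row][col] = A[0][0]                    # A x^s = b
    Phi[row + 1][col] = T[s + 1][0][0]         # T(w^s) x^s ...
    Phi[row + 1][col + n0] = W[s + 1][0][0]    # ... + W(w^s) y^s
for s in range(1, S):              # non-anticipativity rows x^1 - x^s
    row = S*(m0 + m1) + (s - 1)*n0
    Phi[row][0] = 1.0
    Phi[row][s*(n0 + n1)] = -1.0
```

Scaling the same loops to matrix-valued blocks (e.g., with a sparse-matrix library) preserves the block-diagonal-plus-coupling-rows structure that the decomposition methods rely on.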

13.7.2 Component-wise representation of the stochastic network problem

The previous sections illustrated the macro structure of the problem and the decomposable nature of the constraint matrix. This section gives a component-wise formulation of the stochastic network problem and illustrates its micro structure by specifying algebraically all equations. We assume for simplicity the same underlying network structure for all scenario problems, and with the notation of Section 9.2 we represent this structure by the graph $G = (\mathcal{N}, \mathcal{A})$, where $\mathcal{N} = \{1,2,\ldots,m_0+m_1\}$ is the set of nodes and $\mathcal{A} = \{(i,j) \mid i,j \in \mathcal{N}\} \subseteq \mathcal{N}\times\mathcal{N}$ is the set of arcs. Let $\delta_i^+ = \{j \mid (i,j) \in \mathcal{A}\}$ be the set of nodes having an arc with origin node $i$, and $\delta_j^- = \{i \mid (i,j) \in \mathcal{A}\}$ be the set of nodes having an arc with destination node $j$. We partition the set of all nodes into two disjoint sets, $\mathcal{N}_0$ and $\mathcal{N}_1$. The set $\mathcal{N}_0$ consists of the $m_0$ nodes whose incident arcs are all first stage, so that their flow conservation constraints do not depend on the realization of the uncertain quantities. The resources (i.e., supply or demand) for these nodes are real numbers, denoted by $b_i$ for all $i \in \mathcal{N}_0$. The set $\mathcal{N}_1 = \mathcal{N} \setminus \mathcal{N}_0$ consists of the $m_1$ nodes with stochastic right-hand sides or incident second-stage arcs. The resources for these nodes are denoted by $r_i^s$ for all $i \in \mathcal{N}_1$, $s \in \Omega$.
We also partition the arc set $\mathcal{A}$ into two disjoint sets $\mathcal{A}_0$ and $\mathcal{A}_1$, corresponding to replicated first- and second-stage decisions respectively. The numbers of arcs in these sets are denoted by $n_0$ and $n_1$ respectively. Denote by $x_{ij}^s$ for $(i,j) \in \mathcal{A}_0$ and $y_{ij}^s$ for $(i,j) \in \mathcal{A}_1$ the flows on the arc with origin node $i$ and destination node $j$ under scenario index $s \in \Omega$. The upper bound of a replicated first-stage arc flow $x_{ij}^s$ is denoted by $u_{ij}$, and the upper bound of a second-stage arc flow $y_{ij}^s$ is denoted by $v_{ij}^s$. The multiplier on arc $(i,j)$ is denoted by $m_{ij}$ for $(i,j) \in \mathcal{A}_0$ and by $m_{ij}^s$ for $(i,j) \in \mathcal{A}_1$. The network optimization model for a fixed scenario index $s \in \Omega$ is given by:

$$\underset{x^s \in \mathbb{R}^{n_0},\ y^s \in \mathbb{R}^{n_1}}{\text{Minimize}} \quad \sum_{(i,j)\in\mathcal{A}_0} p_s f_{ij}(x_{ij}^s) + \sum_{(i,j)\in\mathcal{A}_1} p_s q_{ij}(y_{ij}^s) \qquad (13.80)$$

s.t.

$$\sum_{j\in\delta_i^+} x_{ij}^s - \sum_{k\in\delta_i^-} m_{ki}\, x_{ki}^s = b_i \quad \text{for all } i \in \mathcal{N}_0, \qquad (13.81)$$

$$\sum_{j\in\delta_i^+\cap\mathcal{N}_0} x_{ij}^s - \sum_{k\in\delta_i^-\cap\mathcal{N}_0} m_{ki}\, x_{ki}^s + \sum_{j\in\delta_i^+\cap\mathcal{N}_1} y_{ij}^s - \sum_{k\in\delta_i^-\cap\mathcal{N}_1} m_{ki}^s\, y_{ki}^s = r_i^s \quad \text{for all } i \in \mathcal{N}_1, \qquad (13.82)$$

$$0 \le x_{ij}^s \le u_{ij} \quad \text{for all } (i,j)\in\mathcal{A}_0, \qquad (13.83)$$

$$0 \le y_{ij}^s \le v_{ij}^s \quad \text{for all } (i,j)\in\mathcal{A}_1. \qquad (13.84)$$

The complete stochastic network problem (13.69)-(13.72) is obtained by replicating the network problem (13.80)-(13.84) for each scenario and including the non-anticipativity constraints

$$x_{ij}^1 - x_{ij}^s = 0 \quad \text{for all } s \in \Omega \text{ and for all } (i,j) \in \mathcal{A}_0. \qquad (13.85)$$

In this section we have been referring to quantities pertaining to an arc $(i,j) \in \mathcal{A}$ under scenario $s \in \Omega$ by using subscripts $(i,j)$ and a superscript $s$, respectively. To establish the correspondence between the matrix/vector notation of (13.77)-(13.78) and the component-wise notation of this section, we impose a lexicographic order (see an example of such an order on page 362) on the arcs in $\mathcal{A}$ and let $(i_1,j_1)$ denote the first arc in $\mathcal{A}$. Then $z_1$, the first component of $z$ in (13.76), and $x_{i_1 j_1}^1$ refer to the same variable, and so on.
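The conservation equations (13.81)-(13.82) say that, at each node, supply equals outflow minus multiplier-scaled inflow. A small sketch with hypothetical arcs, flows, and multipliers:

```python
# Flow-conservation residual for a generalized network, following the
# sign convention of (13.81): outgoing flows at face value, incoming
# flows scaled by the arc multipliers m_ki.  All data are illustrative.
arcs = {("a", "b"): 1.0, ("b", "c"): 0.5}   # (i, j) -> flow on arc
mult = {("a", "b"): 1.1, ("b", "c"): 1.0}   # arc multipliers

def surplus(node, supply, arcs, mult):
    """supply - (outflow - multiplier-weighted inflow) at a node."""
    out_flow = sum(f for (i, j), f in arcs.items() if i == node)
    in_flow  = sum(mult[(i, j)]*f for (i, j), f in arcs.items() if j == node)
    return supply - (out_flow - in_flow)
```

Node `"b"` receives `1.1 * 1.0` and sends `0.5`, so with zero supply its surplus is `0.6`; a flow is conservation-feasible exactly when every node's surplus vanishes.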

13.8 Iterative Algorithm for Stochastic Network Optimization

In this section we develop an algorithm for stochastic network problems with a quadratic objective function using the split-variable formulation of the problem (Section 13.7.1). The algorithm is a specialization of the general row-action algorithm (Section 6.2) to the network structure. Since the algorithm works with the replicated problem it is convenient to use $x_{ij}^s$ to denote both first- and second-stage variables. That is, $x_{ij}^s$ for $(i,j) \in \mathcal{A}_0$ is a replicated first-stage variable, while $x_{ij}^s$ for $(i,j) \in \mathcal{A}_1$ is a second-stage variable. The objective function $F$ takes the form

$$F(z) = \sum_{(i,j)\in\mathcal{A}}\ \sum_{s\in\Omega} \Big( \tfrac{1}{2} w_{ij}^s (x_{ij}^s)^2 + c_{ij}^s x_{ij}^s \Big), \qquad (13.86)$$

where $w_{ij}^s > 0$ and $c_{ij}^s$ are constants which are obtained by evaluating the summations in the minimand of (13.80) when the functions $f_{ij}$ and $q_{ij}$ are quadratic, and then adding up over all scenarios to get the expected value of the objective function values.

Let $M_1 = S(m_0 + m_1)$. Then rows $1,2,\ldots,M_1$ of the constraint matrix $\Phi$ (cf. equation (13.78)) correspond to network flow conservation constraints, and rows $M_1+1,\ldots,M$ correspond to the non-anticipativity constraints that take the simple form

$$x_{ij}^1 - x_{ij}^s = 0,$$

for all $(i,j) \in \mathcal{A}_0$ and all $s \in \Omega$. The dual price $\pi_\ell$, $\ell \in \{1,2,\ldots,M_1\}$, associated with the flow conservation constraint for node $i \in \mathcal{N}$ under scenario $s \in \Omega$, is denoted by $\pi_i^s$. The dual price $\pi_\ell$, $\ell \in \{M+1,\ldots,M+N\}$, associated with the simple bound constraint for $x_{ij}^s$ (i.e., the reduced cost of $x_{ij}^s$), is denoted by $\pi_{ij}^s$. We now develop the specific projection formulae for use in the iterative steps of Algorithm 6.4.1. The row-action algorithm iterates one row at a time on the constraint matrix $\Phi$. The precise formulae for the iterative step are obtained by using the equation corresponding to the chosen matrix row, as given by equations (13.81)-(13.82). We develop the equations for only a single iterative step of the algorithm over all constraints, i.e., flow conservation equality constraints, bounds on the variables, and non-anticipativity equality constraints. The complete algorithm is summarized below as Algorithm 13.8.1.
Projection on Flow Conservation Constraints

First we derive the iterative step of the algorithm when the chosen row of the matrix $\Phi$ corresponds to the flow conservation constraint (13.81). Consider the flows on the incoming arcs, $x_{ki}^s$ for $k \in \delta_i^-$, and the flows on the outgoing arcs, $x_{ij}^s$ for $j \in \delta_i^+$, for a given node $i \in \mathcal{N}_0$ under some scenario $s \in \Omega$.

The generalized projection $\hat z$ of the current iterate $z$ onto the hyperplane $H_i = \{z \mid \langle \phi^i, z\rangle = \gamma_i\}$, where $\phi^i$ is the $i$th column of $\Phi^T$, and $\gamma_i$ is the $i$th component of $\gamma$ (see (13.77)-(13.78)), determined by the flow conservation constraint at node $i$, is obtained by solving the system (see Lemma 2.2.1):

$$\nabla F(\hat z) = \nabla F(z) + \beta_i^s \phi^i, \qquad (13.87)$$
$$\hat z \in H_i. \qquad (13.88)$$

Of course, if $z \in H_i$ then $\beta_i^s = 0$ and $\hat z = z$. If the current iterate $z$ does not satisfy flow conservation at the $i$th node, define the node surplus $\alpha_i^s$ as

$$\alpha_i^s = b_i - \Big(\sum_{j\in\delta_i^+} x_{ij}^s - \sum_{k\in\delta_i^-} m_{ki}\, x_{ki}^s\Big). \qquad (13.89)$$

Applying the iterative step (12.33) to the functional form of the objective function (13.86), and using the structure of the rows of the constraint matrix $\Phi$ that correspond to network flow constraints, we get

$$\hat x_{ij}^s = x_{ij}^s + \frac{\beta_i^s}{w_{ij}^s} \quad \text{for } j \in \delta_i^+, \qquad (13.90)$$
$$\hat x_{ki}^s = x_{ki}^s - \beta_i^s \frac{m_{ki}}{w_{ki}^s} \quad \text{for } k \in \delta_i^-. \qquad (13.91)$$

Substituting these expressions for $\hat x_{ij}^s$ and $\hat x_{ki}^s$ into (13.81) we get

$$\sum_{j\in\delta_i^+}\Big(x_{ij}^s + \frac{\beta_i^s}{w_{ij}^s}\Big) - \sum_{k\in\delta_i^-} m_{ki}\Big(x_{ki}^s - \beta_i^s\frac{m_{ki}}{w_{ki}^s}\Big) = b_i. \qquad (13.92)$$

From this and (13.89) we obtain

$$\beta_i^s = \frac{\alpha_i^s}{\displaystyle\sum_{j\in\delta_i^+}\frac{1}{w_{ij}^s} + \sum_{k\in\delta_i^-}\frac{(m_{ki})^2}{w_{ki}^s}}. \qquad (13.93)$$

Using this result in (13.90) and (13.91) gives the desired formulae for updating all primal variables incident to node $i$. The dual variable $\pi_i^s$ for this node is updated by subtracting $\beta_i^s$ from its current value, i.e., $\pi_i^s \leftarrow \pi_i^s - \beta_i^s$. Similar algebraic manipulations lead to the updating formulae required to apply the row-action algorithm to rows corresponding to constraints (13.82).
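The derivation (13.89)-(13.93) is easy to check numerically on a single node. The sketch below uses one incoming and one outgoing arc with illustrative data and verifies that the projected flows satisfy the conservation constraint exactly.

```python
# One generalized projection onto the flow-conservation hyperplane of
# a single node: one outgoing arc (weight w_out) and one incoming arc
# (weight w_in, multiplier m).  All numbers are illustrative.
b_i = 2.0                          # supply at node i
x_out, w_out = 1.0, 2.0            # outgoing flow and quadratic weight
x_in, w_in, m = 3.0, 4.0, 0.5      # incoming flow, weight, multiplier

alpha = b_i - (x_out - m*x_in)           # node surplus, eq. (13.89)
beta = alpha / (1.0/w_out + m*m/w_in)    # projection parameter, (13.93)

x_out_new = x_out + beta/w_out           # eq. (13.90)
x_in_new  = x_in - beta*m/w_in           # eq. (13.91)

# After the projection, flow conservation holds exactly at node i.
residual = b_i - (x_out_new - m*x_in_new)
```

Note how arcs with larger weights $w$ move less: the projection distributes the surplus in inverse proportion to the curvature of the objective along each arc.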

Projection on Simple Bound Constraints

Now we develop the specific projections on the simple bounds (13.83)-(13.84) for the general row-action Algorithm 6.4.1. We do it in detail for (13.83) only. Denote by $x_{ij}^s$ the current value of the variable and by $\hat x_{ij}^s$ the projected value. If $x_{ij}^s < 0$, we get from (6.32) that

$$0 = \hat x_{ij}^s = x_{ij}^s + \frac{\beta}{w_{ij}^s}. \qquad (13.94)$$

The primal variable is set to zero and the projection parameter is $\beta = -w_{ij}^s x_{ij}^s$. The dual price of the constraint is updated by subtracting $\beta$ from its current value, i.e., $\pi_{ij}^s \leftarrow \pi_{ij}^s - \beta$.

If $x_{ij}^s > u_{ij}$ we similarly set the primal variable $x_{ij}^s$ to the upper bound $u_{ij}$, and from (6.32) compute the projection parameter $\beta = w_{ij}^s(u_{ij} - x_{ij}^s)$ and update the dual price of the bound constraint by subtracting $\beta$.

Finally, if $0 \le x_{ij}^s \le u_{ij}$, we get from (6.61)

$$\hat x_{ij}^s = x_{ij}^s + \frac{\pi_{ij}^s}{w_{ij}^s}, \qquad (13.95)$$

and then set the dual price $\pi_{ij}^s$ to zero.
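The three cases of the bound projection can be collected in a small helper. The function below is a sketch of (13.94)-(13.95) together with the corresponding dual-price updates, not the book's code; variable names are ours.

```python
# Projection onto the simple bounds 0 <= x <= u with dual-price
# update, following (13.94)-(13.95) and Step 1.2 of Algorithm 13.8.1.
def project_bounds(x, pi, w, u):
    """Return (projected primal value, updated dual price)."""
    if x > u:                       # clip at the upper bound
        beta = w*(u - x)
        return u, pi - beta
    if x < 0.0:                     # clip at the lower bound
        beta = -w*x
        return 0.0, pi - beta
    return x + pi/w, 0.0            # interior: restore dual, zero it

x_new, pi_new = project_bounds(-0.5, 0.0, 2.0, 1.0)
```

The interior branch first adds back the previously subtracted dual contribution before resetting the price, which is what keeps the primal-dual pair consistent across sweeps.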



Projections on Non-anticipativity Constraints

We now develop the iterative step of the algorithm for the equality non-anticipativity constraints. A non-anticipativity constraint (13.85) has the form

$$x_{ij}^1 - x_{ij}^s = 0, \qquad (13.96)$$

for some $(i,j) \in \mathcal{A}_0$ and some $s \in \Omega$. Let $\mu(s)$ be a row index such that, for $s = 2,3,\ldots,S$, $\phi^{\mu(s)}$ is the row of $\Phi$ that corresponds to the constraint $x_{ij}^1 - x_{ij}^s = 0$.

If $z$ is the current iterate, the generalized projection onto the hyperplane represented by this constraint is the point $\hat z$ which solves the system

$$\nabla F(\hat z) = \nabla F(z) + \lambda \phi^{\mu(s)}, \qquad (13.97)$$
$$\hat x_{ij}^1 = \hat x_{ij}^s. \qquad (13.98)$$

Noting that the $\mu(s)$th row of the constraint matrix has only two nonzero entries (cf. equation (13.96)) we can write this system as

$$w_{ij}^1 \hat x_{ij}^1 = w_{ij}^1 x_{ij}^1 + \lambda, \qquad w_{ij}^s \hat x_{ij}^s = w_{ij}^s x_{ij}^s - \lambda, \qquad \hat x_{ij}^1 = \hat x_{ij}^s.$$

Solving this, we get

$$\hat x_{ij}^1 = \hat x_{ij}^s = \frac{w_{ij}^1 x_{ij}^1 + w_{ij}^s x_{ij}^s}{w_{ij}^1 + w_{ij}^s}, \qquad (13.99)$$

i.e., the point $(x_{ij}^1, x_{ij}^s)$ is projected upon the point with coordinates equal to the weighted average of $x_{ij}^1$ and $x_{ij}^s$, with $w_{ij}^1$ and $w_{ij}^s$ being the weights.
Consider now the effect of repeated projections of the row-action algorithm on the non-anticipativity constraints (13.85). We can take advantage of the almost cyclic control of the algorithm in a way that would not have been possible with cyclic control alone. The almost cyclic control of the row-action algorithm allows repeated projections upon these constraints alone until convergence (within some tolerance) of the variables $x_{ij}^s$, for any fixed $(i,j) \in \mathcal{A}_0$, to a limit $x_{ij}^*$. We show that $x_{ij}^*$ can be obtained analytically, rather than using the iterative scheme. This result has important

implications for implementations, since the effect of repeated application of equation (13.99) for all $s \in \Omega$ can then be calculated in closed form.

The non-anticipativity constraints for the replications of a single first-stage variable take the form

$$x_{ij}^1 - x_{ij}^2 = 0, \quad x_{ij}^1 - x_{ij}^3 = 0, \quad \ldots, \quad x_{ij}^1 - x_{ij}^S = 0. \qquad (13.100)$$

Let $\nabla F_{ij} : \mathbb{R}^S \to \mathbb{R}^S$ denote the subvector of the gradient $\nabla F$ corresponding to the $S$ replications of the first-stage variable, $x_{ij}^1, \ldots, x_{ij}^S$, and, similarly, let $\Phi_{(ij)}$ denote the submatrix of $\Phi$ consisting of the columns corresponding to $x_{ij}^1, \ldots, x_{ij}^S$.

By repeated projection onto these non-anticipativity constraints, such that the $\nu$th projection is onto the constraint $x_{ij}^1 - x_{ij}^{\ell(\nu)} = 0$, we obtain a sequence of points $x^\nu \in \mathbb{R}^S$ satisfying

$$\nabla F_{ij}(x^\nu) = \nabla F_{ij}(y) + \sum_{k=1}^{\nu} \lambda_k\, \phi_{(ij)}^{\mu(\ell(k))}, \qquad (13.101)$$

where $\phi_{(ij)}^{\mu(\ell(k))}$ is the row of the matrix $\Phi_{(ij)}$ corresponding to the $\mu(\ell(k))$th constraint, $\lambda_k$ is the projection parameter corresponding to the $k$th projection, $y$ is the starting point, and $\mu(\cdot)$ is the row index of the non-anticipativity constraints; see the discussion on page 408. The limit point $x^* \in \mathbb{R}^S$ satisfies

$$\nabla F_{ij}(x^*) = \nabla F_{ij}(y) + \sum_{k=1}^{\infty} \lambda_k\, \phi_{(ij)}^{\mu(\ell(k))} \qquad (13.102)$$

and must, by the non-anticipativity constraints (13.100), have all components identical, i.e., $x^* = (x_{ij}^*, \ldots, x_{ij}^*)$ for some $x_{ij}^* \in \mathbb{R}$. Let now $\Lambda_s = \sum_{\{k \mid \ell(k)=s\}} \lambda_k$ for $s = 2,\ldots,S$. Using the fact that $F(y)$ is the quadratic function (13.86), rewrite (13.102) as the system in $S$ variables

$$x_{ij}^* = y_{ij}^1 + \frac{1}{w_{ij}^1}\sum_{s=2}^S \Lambda_s,$$
$$x_{ij}^* = y_{ij}^s - \frac{\Lambda_s}{w_{ij}^s}, \quad s = 2,\ldots,S. \qquad (13.103)$$

In matrix form, this is

$$Ht = y, \qquad (13.104)$$

where

$$H = \begin{pmatrix}
1 & -\dfrac{1}{w_{ij}^1} & -\dfrac{1}{w_{ij}^1} & \cdots & -\dfrac{1}{w_{ij}^1} \\
1 & \dfrac{1}{w_{ij}^2} & 0 & \cdots & 0 \\
1 & 0 & \dfrac{1}{w_{ij}^3} & \cdots & 0 \\
\vdots & & & \ddots & \\
1 & 0 & 0 & \cdots & \dfrac{1}{w_{ij}^S}
\end{pmatrix} \qquad (13.105)$$

and $t = (x_{ij}^*, \Lambda_2, \ldots, \Lambda_S)^T$. By inverting $H$ we can solve for $t$. Since we are only interested in $x_{ij}^*$ (not in $\Lambda_2, \ldots, \Lambda_S$), we need only calculate the first row of $H^{-1}$, denoted by $h = (h_1, \ldots, h_S)$. Using the special structure of $H$ we easily get

$$h = \frac{1}{\det H}\left(\prod_{s=1}^S \frac{1}{w_{ij}^s}\right)\big(w_{ij}^1, w_{ij}^2, \ldots, w_{ij}^S\big),$$

where $\det H$ is the determinant of $H$. The inner product of the first column of $H$, which consists of all ones, and the first row of $H^{-1}$ must equal 1. Therefore $\sum_{s=1}^S h_s = 1$. Hence

$$h_s = \frac{w_{ij}^s}{\sum_{\sigma=1}^S w_{ij}^\sigma}, \quad s = 1,\ldots,S.$$

Note that $\det H > 0$, so that the system (13.104) has a unique solution.
Solving for $x_{ij}^*$ we get

$$x_{ij}^* = \langle h, y \rangle = \frac{\displaystyle\sum_{s=1}^S w_{ij}^s\, y_{ij}^s}{\displaystyle\sum_{s=1}^S w_{ij}^s}. \qquad (13.106)$$

Since $x_{ij}$ is a first-stage variable,

$$\frac{w_{ij}^1}{p_1} = \frac{w_{ij}^2}{p_2} = \cdots = \frac{w_{ij}^S}{p_S}.$$

Also, $\sum_{s=1}^S p_s = 1$, so the result can be simplified to

$$x_{ij}^* = \sum_{s=1}^S p_s\, y_{ij}^s. \qquad (13.107)$$
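The closed form (13.107) can be confirmed against the iterative scheme: repeated pairwise projections (13.99) drive all replications to the probability-weighted average whenever the weights satisfy $w^s = p_s \bar w$. A numerical sketch with illustrative data:

```python
# Repeated pairwise projections (13.99) versus the closed form (13.107).
p = [0.2, 0.3, 0.5]                 # scenario probabilities
w = [p_s*4.0 for p_s in p]          # weights w^s proportional to p_s
x = [1.0, 2.0, 4.0]                 # replicated first-stage values

closed_form = sum(p_s*x_s for p_s, x_s in zip(p, x))    # eq. (13.107)

for _ in range(200):                # cycle the projections x^1 <-> x^s
    for s in range(1, len(x)):
        avg = (w[0]*x[0] + w[s]*x[s]) / (w[0] + w[s])   # eq. (13.99)
        x[0] = x[s] = avg

assert max(abs(x_s - closed_form) for x_s in x) < 1e-9
```

Each pairwise projection preserves the weighted sum $\sum_s w^s x^s$, which is why the consensus value the iterations approach is exactly the weighted average that (13.106)-(13.107) compute in one step.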

The Row-action Algorithm for Quadratic Stochastic Networks

We have now completed all the components required to specialize the row-action algorithm to quadratic stochastic network problems. The complete algorithm proceeds as follows.

Algorithm 13.8.1 Row-Action Algorithm for Quadratic Stochastic Networks

Step 0: (Initialization.) Set $\nu = 0$ and get $\pi^0$ and $z^0$ such that

$$\nabla F(z^0) = -\begin{pmatrix} \Phi \\ I_N \end{pmatrix}^T \pi^0.$$

For example, $\pi^0 = 0$ and

$$(x_{ij}^s)^0 = -\frac{c_{ij}^s}{w_{ij}^s} \quad \text{for all } (i,j) \in \mathcal{A}_0,\ s \in \Omega, \qquad (13.108)$$

$$(y_{ij}^s)^0 = -\frac{c_{ij}^s}{w_{ij}^s} \quad \text{for all } (i,j) \in \mathcal{A}_1,\ s \in \Omega. \qquad (13.109)$$

Step 1: (Iterative step for the scenario subproblems.) For all $s \in \Omega$, do the following:

Step 1.1: (Iterative step for the flow conservation constraints.) Let

$$(\alpha_i^s)^{\nu+1} = b_i - \Big(\sum_{j\in\delta_i^+} (x_{ij}^s)^\nu - \sum_{k\in\delta_i^-} m_{ki}\,(x_{ki}^s)^\nu\Big), \qquad (13.110)$$

$$(\beta_i^s)^{\nu+1} = \frac{(\alpha_i^s)^{\nu+1}}{\displaystyle\sum_{j\in\delta_i^+}\frac{1}{w_{ij}^s} + \sum_{k\in\delta_i^-}\frac{(m_{ki})^2}{w_{ki}^s}}. \qquad (13.111)$$

For all first-stage nodes $i \in \mathcal{N}_0$:

$$(x_{ij}^s)^{\nu+1} = (x_{ij}^s)^\nu + \frac{(\beta_i^s)^{\nu+1}}{w_{ij}^s} \quad \text{for all } j \in \delta_i^+, \qquad (13.112)$$

$$(x_{ki}^s)^{\nu+1} = (x_{ki}^s)^\nu - (\beta_i^s)^{\nu+1}\frac{m_{ki}}{w_{ki}^s} \quad \text{for all } k \in \delta_i^-, \qquad (13.113)$$

$$(\pi_i^s)^{\nu+1} = (\pi_i^s)^\nu - (\beta_i^s)^{\nu+1}. \qquad (13.114)$$

For all second-stage nodes $i \in \mathcal{N}_1$ (with $r_i^s$ in place of $b_i$ in (13.110)):

$$(y_{ij}^s)^{\nu+1} = (y_{ij}^s)^\nu + \frac{(\beta_i^s)^{\nu+1}}{w_{ij}^s} \quad \text{for all } j \in \delta_i^+, \qquad (13.115)$$

$$(y_{ki}^s)^{\nu+1} = (y_{ki}^s)^\nu - (\beta_i^s)^{\nu+1}\frac{m_{ki}^s}{w_{ki}^s} \quad \text{for all } k \in \delta_i^-, \qquad (13.116)$$

$$(\pi_i^s)^{\nu+1} = (\pi_i^s)^\nu - (\beta_i^s)^{\nu+1}. \qquad (13.117)$$

Step 1.2: (Iterative step for the simple bounds.)

For all first-stage arcs $(i,j) \in \mathcal{A}_0$:

$$(x_{ij}^s)^{\nu+1} = \begin{cases} u_{ij} & \text{if } (x_{ij}^s)^{\nu+\frac12} > u_{ij}, \\[2pt] 0 & \text{if } (x_{ij}^s)^{\nu+\frac12} < 0, \\[2pt] (x_{ij}^s)^{\nu+\frac12} + \dfrac{(\pi_{ij}^s)^\nu}{w_{ij}^s} & \text{if } 0 \le (x_{ij}^s)^{\nu+\frac12} \le u_{ij}, \end{cases} \qquad (13.118)$$

and

$$(\pi_{ij}^s)^{\nu+1} = \begin{cases} (\pi_{ij}^s)^\nu - w_{ij}^s\big(u_{ij} - (x_{ij}^s)^{\nu+\frac12}\big) & \text{if } (x_{ij}^s)^{\nu+\frac12} > u_{ij}, \\[2pt] (\pi_{ij}^s)^\nu + w_{ij}^s\,(x_{ij}^s)^{\nu+\frac12} & \text{if } (x_{ij}^s)^{\nu+\frac12} < 0, \\[2pt] 0 & \text{if } 0 \le (x_{ij}^s)^{\nu+\frac12} \le u_{ij}. \end{cases} \qquad (13.119)$$

For all second-stage arcs $(i,j) \in \mathcal{A}_1$:

$$(y_{ij}^s)^{\nu+1} = \begin{cases} v_{ij}^s & \text{if } (y_{ij}^s)^{\nu+\frac12} > v_{ij}^s, \\[2pt] 0 & \text{if } (y_{ij}^s)^{\nu+\frac12} < 0, \\[2pt] (y_{ij}^s)^{\nu+\frac12} + \dfrac{(\pi_{ij}^s)^\nu}{w_{ij}^s} & \text{if } 0 \le (y_{ij}^s)^{\nu+\frac12} \le v_{ij}^s, \end{cases} \qquad (13.120)$$

and

$$(\pi_{ij}^s)^{\nu+1} = \begin{cases} (\pi_{ij}^s)^\nu - w_{ij}^s\big(v_{ij}^s - (y_{ij}^s)^{\nu+\frac12}\big) & \text{if } (y_{ij}^s)^{\nu+\frac12} > v_{ij}^s, \\[2pt] (\pi_{ij}^s)^\nu + w_{ij}^s\,(y_{ij}^s)^{\nu+\frac12} & \text{if } (y_{ij}^s)^{\nu+\frac12} < 0, \\[2pt] 0 & \text{if } 0 \le (y_{ij}^s)^{\nu+\frac12} \le v_{ij}^s. \end{cases} \qquad (13.121)$$
Step 2: (Iterative step for non-anticipativity constraints.)
For all first-stage arcs $(i,j) \in \mathcal{A}_0$ set:

$$\bar x_{ij} = \sum_{s=1}^S p_s\, (x_{ij}^s)^{\nu+1}, \qquad (13.122)$$

$$(x_{ij}^s)^{\nu+1} = \bar x_{ij} \quad \text{for all } s \in \Omega. \qquad (13.123)$$

Step 3: Let $\nu \leftarrow \nu + 1$ and return to Step 1.
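A toy end-to-end run of the algorithm helps make the step structure concrete. The sketch below strips the problem down to a single replicated variable with bounds, so Step 1.1 is vacuous; all data are illustrative, and the loop is only meant to show Steps 0, 1.2, and 2 interacting and converging to the constrained minimizer.

```python
# Toy run of Algorithm 13.8.1: one replicated first-stage variable,
# two scenarios, bounds 0 <= x <= u, and no network nodes.
p = [0.5, 0.5]                       # scenario probabilities
w = [p[0]*2.0, p[1]*2.0]             # weights w^s = p_s * w_bar
c = [-3.0*p[0], 1.0*p[1]]            # scenario-dependent linear costs
u = 2.0

x  = [-c[s]/w[s] for s in range(2)]  # Step 0: eq. (13.108)
pi = [0.0, 0.0]

for _ in range(50):
    for s in range(2):               # Step 1.2: bound projection
        if x[s] > u:
            pi[s] -= w[s]*(u - x[s]); x[s] = u
        elif x[s] < 0.0:
            pi[s] -= -w[s]*x[s]; x[s] = 0.0
        else:
            x[s] += pi[s]/w[s]; pi[s] = 0.0
    xbar = sum(p[s]*x[s] for s in range(2))   # Step 2: eq. (13.122)
    x = [xbar, xbar]                          # eq. (13.123)

# The iterates settle at the minimizer of sum_s (w^s/2) x^2 + c^s x
# over [0, u] with x common: clip(-(c[0]+c[1])/(w[0]+w[1]), 0, u) = 0.5.
```

Step 2 applies the closed form (13.122) rather than cycling through the pairwise projections (13.99), exactly as the text prescribes.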


Decompositions for Parallel Computing

The calculations in Step 1 of Algorithm 13.8.1 are repeated for multiple independent scenarios. Hence, these calculations can be executed concurrently, utilizing as many processors as the number of scenarios.

The calculations in Step 1.1, for a given scenario index $s$, are performed for all first- and second-stage nodes. On first examination these calculations are not independent, since nodes have arcs in common, and the calculations for a given node, say $i$, cannot change the flow on arc $(i,j)$ at the same time that the calculations for node $j$ are updating the flow on this arc. However, it is possible to execute the calculations for multiple nodes concurrently if we identify nodes that do not have arcs in common. Such sets of nodes are identified by coloring the underlying graph, and iterating on same-color nodes simultaneously. For example, in a time-staged network it is possible to iterate concurrently on all nodes corresponding to odd-order time periods, and then iterate concurrently on all nodes in even-order time periods. Another alternative is to employ a Jacobi variant of the algorithm described in Step 1.1 (see Section 13.9 for references to related literature). The calculations in Step 1.2 can be executed concurrently for all arcs and all scenarios, utilizing as many processors as the number of scenarios times the number of arcs.

The calculations in Step 2 can be executed concurrently for all first-stage arcs, although the calculations involved in this step are trivial compared to the amount of work performed in Step 1. Sections 14.5 and 15.5 give details on the implementation of this algorithm, and report computational results with large-scale test problems.
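The coloring idea can be sketched with a greedy sequential coloring; the graph below is hypothetical, and greedy coloring is only one of several ways to extract independent node sets.

```python
# Greedy coloring sketch: nodes with the same color share no arc, so
# their Step 1.1 projections can run concurrently.
arcs = [("a", "b"), ("b", "c"), ("c", "d"), ("a", "c")]
nodes = sorted({n for arc in arcs for n in arc})

adj = {n: set() for n in nodes}        # undirected adjacency
for i, j in arcs:
    adj[i].add(j); adj[j].add(i)

color = {}
for n in nodes:                        # assign smallest unused color
    used = {color[m] for m in adj[n] if m in color}
    color[n] = min(k for k in range(len(nodes)) if k not in used)

groups = {}                            # color -> independent node set
for n, k in color.items():
    groups.setdefault(k, []).append(n)
```

The odd/even time-period schedule mentioned above is exactly a two-coloring of a time-staged network; general networks may need more colors, and each color class becomes one batch of concurrent node updates.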

13.9 Notes and References

Stochastic programming models were first formulated as mathematical programs in the 1950s by Dantzig (1955) and Beale (1955). Programs with probabilistic constraints were introduced by Charnes and Cooper (1959). For general references on stochastic programming see Dempster (1980), Ermoliev and Wets (1988), Kall (1976), Kall and Wallace (1994), and Wets (1989). A textbook treatment of stochastic programming is Kall and Wallace (1994).
13.1 For further discussion on probability theory as it applies to stochastic programming see Kall (1976), Wets (1989), and Frauendorfer's thesis (1992). For general background on probability theory refer to Billingsley (1995) or Parzen (1960).

13.3 The problem formulations can be found in the general references cited above. See also Walkup and Wets (1966) and Wets (1966a, 1966b, 1972, 1983). Wets (1974) develops the deterministic equivalent formulation. Multistage programs are discussed in Birge (1985, 1988), Olsen (1976), Gassmann (1990), Ermoliev and Wets (1988), and Wets (1989). Dupacova (1995) compiled a bibliography.



13.4 The robust optimization model was suggested by Mulvey, Vanderbei, and Zenios (1995). The terminology of structural and control variables is borrowed from the flexibility analysis of manufacturing systems; see Seider, Brengel, and Widagdo (1991). Applications of robust optimization are discussed in Gutierrez and Kouvelis (1995), King et al. (1988), Malcolm and Zenios (1994), Paraskevopoulos, Karakitsos, and Rustem (1991), and Sengupta (1991).

13.5.1 The diet problem was studied by Stigler (1945) and used by Dantzig (1963) as the first test problem for the simplex method. See also Dantzig (1990).

13.5.2 Capacity planning and expansion has been a fertile ground for the application of optimization models. For a textbook treatment, in the context of manufacturing applications, see Hayes and Wheelwright (1984). The robust optimization approach to capacity planning for a multiproduct, multifacility production firm was suggested by Eppen, Martin, and Schrage (1989), who applied their model to plan car manufacturing facilities for the General Motors Company. The model described in this section is a simplified version of their application. The same reference discusses the merits of a robust optimization formulation for the capacity expansion planning model. They use a model of expected downside risk, and illustrate its performance with numerical results. The Markowitz criterion was introduced by Markowitz (1952); see also Perold (1984) and Dahl, Meeraus, and Zenios (1993). The downside risk function used in this section was suggested by Zenios and Kang (1993) in the context of portfolio management applications. It is the limiting case of the mean-absolute deviation models of Sharpe (1971) and Konno and Yamazaki (1991) with asymmetric risk functions. See Speranza (1993) for further analysis of the properties of asymmetric, piecewise linear, penalty functions.

Robust optimization models for capacity expansion planning have been
developed by Gutierrez and Kouvelis (1995), who consider outsourcing
(i.e., subcontracting part of the manufacturing requirements) as the
means to achieving robustness in manufacturing capacity while
reducing costs. Stochastic programming models for capacity expansion
for power generation firms have been proposed by Murphy, Sen, and
Soyster (1982), Granville et al. (1988), and Dantzig et al. (1989).
Extensions of these models, using robust optimization, are developed by
Malcolm and Zenios (1994).

13.5.3 The robust optimization model for matrix balancing was developed
by Zenios and Zenios (1992). A similar model, for the more general
problem of image reconstruction from projections, was suggested
earlier by Elfving (1989), who also suggested the use of a single Newton
step for the estimation of the Bregman parameter. Both references
develop solution algorithms for the respective models, using the
row-action algorithms of Chapter 6. The use of a secant approximation for
the estimation of the Bregman parameter is discussed in Section 6.9;
see references in Section 6.10.
13.6 Textbook treatments of investments and portfolio management are
given by Bodie, Kane, and Marcus (1989) and Elton and Gruber (1984).
The classic models for portfolio management, namely Markowitz's
mean/variance model and the portfolio immunization model, are
discussed by Markowitz (1952) and Redington (1952), respectively. See
Dahl, Meeraus, and Zenios (1993) for a modern treatment of these
models and the associated mathematical programming formulations.

The applications of stochastic programming models in portfolio
management are numerous. For a simple illustration of dynamic
programming formulations for multiperiod portfolio optimization see
Bertsekas (1987, pp. 73-77). General references are collected in Zenios
(1993a) and Ziemba and Vickson (1975). The application of stochastic
programming to address problems in short-term financial planning
was suggested by Kallberg, White, and Ziemba (1982). Its application
to bond portfolio management was suggested by Bradley and Crane
(1972); for application to bank asset/liability management see Kusy
and Ziemba (1986); for application to asset allocation see Mulvey
and Vladimirou (1992) and Mulvey (1993); for application to fixed-income
portfolio management see Zenios (1991c, 1993b); and for
application to funding of insurance products see Nielsen and Zenios (1996b).
The model discussed in this section is adapted from Golub et al. (1995)
and Holmer et al. (1993).
13.7 There is an extensive literature on stochastic programming problems
with network recourse. The stochastic transportation problem was
introduced by Williams (1963). See also the papers by Cooper and
LeBlanc (1977), Wallace (1986, 1987), Mulvey and Vladimirou (1991),
Nielsen and Zenios (1993c, 1996a), and the PhD theses by Vladimirou
(1990) and Nielsen (1992).

Applications of stochastic network models to hydroelectric power
scheduling are reported in Dembo et al. (1990) and Dembo, Mulvey, and
Zenios (1989). Applications to air traffic control were developed by
Richetta (1991); see also Zenios (1991a). Transportation and logistics
models are developed in Frantzeskakis and Powell (1989).

The split-variable formulation for the decomposition of mathematical
programs is fairly standard in large-scale optimization; see, for
example, Bertsekas and Tsitsiklis (1989, page 231). The use of
split-variable formulations for stochastic programming problems was
suggested by Rockafellar and Wets (1991). It was used by Mulvey and
Vladimirou (1989, 1991) and Nielsen and Zenios (1993a, 1996a) as a
device for exploiting the special structure of stochastic programs with
network recourse.
13.8 The row-action iterative algorithm for the two-stage stochastic
programming problems with network recourse was developed by Nielsen
and Zenios (1993a). It was further extended to the multistage problem
by Nielsen and Zenios (1996a), and was used for the solution of linear
stochastic networks within the context of the PMD algorithm (Chapter 3)
by Nielsen and Zenios (1993d). The partitioning of network
structures using graph coloring to increase the amount of parallelism
was suggested in Zenios and Mulvey (1988a). They describe a graph
coloring heuristic, based on earlier work of Christofides (1971), used
for their test problems, and compare the idea of graph coloring to
the use of a Jacobi variant. As one may expect, the parallel scheme
based on graph coloring exhibits a faster (practical) convergence rate
than the Jacobi algorithm. However, the Jacobi algorithm permits the
use of more processors. For large-scale problems, and on computers
with sufficiently many processors, the Jacobi parallel algorithm could
be substantially faster, in solution time, than parallel
implementations based on graph coloring. This was demonstrated in Zenios and
Lasken (1988a, 1988b).
