Decomposition Methods in Stochastic Linear Programming

Gus Gassmann
Dalhousie University

Uploaded by Gus Gassmann on 13 November 2014.
Abstract
Benders decomposition has been shown to be an efficient method to solve dynamic stochastic linear programming problems. Upon termination, the method yields optimal values for all the decision variables, but dual variables (for sensitivity analysis and value of information) are much harder to come by. This paper describes some possible avenues to pursue. Preliminary numerical results for a standard set of test problems are also reported.
Introduction
This paper is concerned with computational methods to recover dual variables in multiperiod stochastic linear programs. An algorithm based on nested Benders decomposition has been shown by Birge [4] and Gassmann [12] to be a computationally viable strategy to find the primal variables, but no attention has been given to the dual variables. Recovering the dual variables is of interest in obtaining the value of information (Rockafellar and Wets [19], Dempster [6,7]), since various refinement schemes could then be implemented which identify "important" points in the distribution of the random elements.
Only discrete random variables will be considered in this paper, so that the stochastic linear programming problem can be solved by working on the deterministic equivalent problem, which is a large scale (deterministic) LP.
Work on decomposition methods has been appearing since the publication of the first papers by Dantzig and Wolfe [5] and Benders [3]. It is known that the two methods are equivalent if applied to problems that are dual to each other.
Ho and various co-authors (Ho and Manne [15], Ho and Loute [14], Ament et al. [2]) produced a series
of papers about Dantzig-Wolfe decomposition, showing that the method terminates with the optimal dual
variables and describing ways to recover the optimal primal variables.
Benders decomposition was studied by Abrahamson [1], Wittrock [24] and Scott [22]. If this method is
used, the optimal primal decision variables are obtained directly upon termination, but the dual variables
will have to be recovered later. Because of the duality between Dantzig-Wolfe and Benders decomposition,
it is possible to translate (dualize) the methods of Ho and Manne [15] to this purpose.
Fourer [9,10] describes a specialized version of the simplex algorithm to take advantage of the special structure of staircase problems. His method is not a decomposition method at all and does not have to solve subproblems over and over with slight modifications based on partial information gained during previous iterations.
Stochastic problems involve variable replication arising from the different values of the random parameters. The number of replications grows from period to period, so a straight application of deterministic
decomposition methods leads to larger and larger subproblems. These subproblems exhibit a great deal of
special structure that has to be exploited. Benders decomposition was adapted to this setup by Van Slyke and
Wets [23] for two-period problems and extended to more than two periods by Birge [4] and Gassmann [12].
Some additional computational results are reported in Gassmann [11].
A similar algorithm is also described in Ruszczynski [21]. His algorithm uses a regularizing quadratic term that is designed to stabilize the primal variables so that they do not fluctuate too much from one iteration of a subproblem solution to the next. The quadratic function has a particularly simple form, and this structure can be exploited to speed up computation.
Dempster [7] and Rockafellar and Wets [20] describe alternative methods for solving the stochastic programming problem based on augmented Lagrangian techniques. Mulvey and Vladimirou [16,17,18] report some computational success with an experimental implementation of Rockafellar and Wets' algorithm. Dempster et al. [8] give a comparison and preliminary relative evaluation of Lagrangian techniques versus Benders decomposition which strongly favors the latter.
The deterministic version of the multiperiod linear programming problem considered in this paper can be formulated mathematically as follows:

$$\begin{array}{rl}
\min & c_0'x_0 + c_1'x_1 + \cdots + c_T'x_T\\
\mbox{s.t.} & A_0x_0 = b_0\\
& B_1x_0 + A_1x_1 = b_1\\
& \qquad \ddots\\
& \qquad B_Tx_{T-1} + A_Tx_T = b_T\\
& x_t \ge 0, \quad t = 0, 1, \ldots, T.
\end{array} \eqno(1)$$

In this form it is amenable to solution by repeated application of Benders decomposition, as the detached coefficient structure clearly shows:

$$\begin{array}{ccccc|c}
c_0' & c_1' & \cdots & c_{T-1}' & c_T' & \\ \hline
A_0 & & & & & b_0\\
B_1 & A_1 & & & & b_1\\
& \ddots & \ddots & & & \vdots\\
& & & B_T & A_T & b_T
\end{array}$$
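The staircase structure above is easy to assemble mechanically. The following is a minimal numpy sketch (the stage matrices and right-hand sides are invented, purely for illustration) of how the detached coefficient matrix of (1) can be built:

```python
import numpy as np

def staircase_matrix(A, B, b):
    """Assemble the block-staircase constraint matrix of problem (1).

    A: list of stage matrices A_0..A_T (A_t is m_t x n_t)
    B: list of linking matrices B_1..B_T (B_t is m_t x n_{t-1})
    b: list of right-hand sides b_0..b_T
    Returns the full coefficient matrix M and the stacked rhs.
    """
    m = [a.shape[0] for a in A]
    n = [a.shape[1] for a in A]
    M = np.zeros((sum(m), sum(n)))
    r = c = 0
    for t, At in enumerate(A):
        M[r:r + m[t], c:c + n[t]] = At            # diagonal block A_t
        if t > 0:                                  # subdiagonal block B_t
            M[r:r + m[t], c - n[t - 1]:c] = B[t - 1]
        r += m[t]
        c += n[t]
    return M, np.concatenate(b)

# invented three-period instance (T = 2)
A = [np.eye(2), np.ones((1, 2)), np.eye(2)]
B = [0.5 * np.ones((1, 2)), -np.eye(2)]
b = [np.ones(2), np.array([3.0]), np.zeros(2)]
M, rhs = staircase_matrix(A, B, b)
print(M.shape)   # (5, 6)
```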
$$\begin{array}{rl}
\min & c_1'x_1\\
\mbox{s.t.} & A_1x_1 = b_1 - B_1x_0\\
& x_1 \ge 0,
\end{array} \eqno(4)$$

which implies that $v_i'A_1 = 0$ for each row $v_i'$ of $V_0$ and $w_i'A_1 = c_1'$ for each row $w_i'$ of $W_0$.

Now $(\hat u_0', \hat\sigma_0', \hat\pi_0')$ are dual feasible for (2), so that $\hat u_0'A_0 + \hat\sigma_0'D_0 + \hat\pi_0'E_0 = c_0'$, $\hat\pi_0'\mathbf{1} = 1$ and $\hat\sigma_0, \hat\pi_0 \ge 0$. Define $\tilde u_1' = \hat\sigma_0'V_0 + \hat\pi_0'W_0$. Then $\tilde u_1'A_1 = (\hat\sigma_0'V_0 + \hat\pi_0'W_0)A_1 = \hat\sigma_0' \cdot 0 + \hat\pi_0'\mathbf{1}\,c_1' = c_1'$ and $\hat u_0'A_0 + \tilde u_1'B_1 = \hat u_0'A_0 + \hat\sigma_0'V_0B_1 + \hat\pi_0'W_0B_1 = \hat u_0'A_0 + \hat\sigma_0'D_0 + \hat\pi_0'E_0 = c_0'$, which proves that $(\hat u_0', \tilde u_1')$ is dual feasible for (4). Moreover, $\hat u_0'b_0 + \tilde u_1'b_1 = \hat u_0'b_0 + \hat\sigma_0'V_0b_1 + \hat\pi_0'W_0b_1 = \hat u_0'b_0 + \hat\sigma_0'd_0 + \hat\pi_0'e_0 = c_0'\hat x_0 + \hat\theta_0 = c_0'\hat x_0 + c_1'\hat x_1$. The last relationship follows from the stopping criterion of the nested Benders decomposition method.
2. For $T = 3$ observe that the cuts (2.3) and (2.4) can be derived either from the second-stage problem

$$\begin{array}{rl}
\min & c_1'x_1 + \theta_1\\
\mbox{s.t.} & A_1x_1 = b_1 - B_1x_0\\
& D_1x_1 \ge d_1\\
& E_1x_1 + \theta_1\mathbf{1} \ge e_1\\
& x_1 \ge 0,
\end{array}$$

or from the aggregate problem

$$\begin{array}{rl}
\min & c_1'x_1 + c_2'x_2\\
\mbox{s.t.} & A_1x_1 = b_1 - B_1x_0\\
& B_2x_1 + A_2x_2 = b_2\\
& x_1 \ge 0, \quad x_2 \ge 0,
\end{array}$$

by application of part 1. The general case then follows by stepwise reduction from $T$ to $T-1$, $T-2$, etc.
By storing the matrices $V_t$ and $W_t$ one therefore has a method to recover the dual variables one period at a time. Initially, $\hat u_0'$ are the optimal dual variables from the master problem at period 1. For time periods $t > 1$ set $\hat u_{t+1}' = \hat\sigma_t'V_t + \hat\pi_t'W_t$, where $V_t$ and $W_t$ are the matrices of extreme rays and extreme points used to generate the cuts at period $t$.
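The period-by-period recovery step is nothing more than a pair of matrix products. A schematic numpy sketch (all numbers invented; sigma and pi stand for the multipliers of the active feasibility and optimality cuts):

```python
import numpy as np

def next_stage_duals(sigma_t, pi_t, V_t, W_t):
    """One recovery step: u'_{t+1} = sigma'_t V_t + pi'_t W_t.

    sigma_t: multipliers of the active feasibility cuts (rows of V_t)
    pi_t:    multipliers of the active optimality cuts (rows of W_t)
    """
    return sigma_t @ V_t + pi_t @ W_t

# invented numbers, only to show the shapes involved
V = np.array([[1.0, 0.0, 2.0]])           # one feasibility cut
W = np.array([[0.5, 0.5, 0.0],
              [0.0, 1.0, 1.0]])           # two optimality cuts
sigma = np.array([0.2])
pi = np.array([0.7, 0.3])                 # convexity: pi sums to 1
u_next = next_stage_duals(sigma, pi, V, W)
print(u_next)
```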
$$\begin{array}{ccccc|c}
c_T' & c_{T-1}' & c_{T-2}' & \cdots & c_0' & \\ \hline
A_T & B_T & & & & b_T\\
& A_{T-1} & B_{T-1} & & & b_{T-1}\\
& & \ddots & \ddots & & \vdots\\
& & & & A_0 & b_0
\end{array}$$
Here the master problem is the period $T$ problem, and the hierarchy of problems is running backwards in time. After a finite number of steps the Dantzig-Wolfe decomposition method stops with the optimal dual variables, but the primal solution has to be fixed up in a phase III method analogous to the iterative procedure described before.
In the stochastic case the deterministic equivalent problem takes the form

$$\begin{array}{rl}
\min & c_0'x_0 + \displaystyle\sum_{k_1=1}^{K_1} p_{k_1}c_{k_1}'x_{k_1} + \sum_{k_2=K_1+1}^{K_2} p_{k_2}c_{k_2}'x_{k_2} + \cdots + \sum_{k_T=K_{T-1}+1}^{K_T} p_{k_T}c_{k_T}'x_{k_T}\\
\mbox{s.t.} & A_0x_0 = b_0\\
& B_{k_1}x_0 + A_{k_1}x_{k_1} = b_{k_1}, \quad k_1 = 1, \ldots, K_1\\
& B_{k_2}x_{a(k_2)} + A_{k_2}x_{k_2} = b_{k_2}, \quad k_2 = K_1+1, \ldots, K_2\\
& \qquad \ddots\\
& B_{k_T}x_{a(k_T)} + A_{k_T}x_{k_T} = b_{k_T}, \quad k_T = K_{T-1}+1, \ldots, K_T\\
& x_{k_t} \ge 0, \quad t = 1, \ldots, T.
\end{array} \eqno(5)$$
If the random variables exhibit period-to-period independence, then the number of replications grows
geometrically from one time period to the next.
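As a quick illustration of this growth (branching factors invented): if the period-$t$ random vector has $r_t$ independent realizations, the tree branches $r_t$ ways at every node, so the node counts multiply from period to period.

```python
# Number of nodes per period when the period-t random vector has r[t]
# independent realizations: the tree branches r[t] ways at every node.
r = [3, 3, 4]            # hypothetical branching factors for periods 1..3
nodes = [1]              # one root node (period 0)
for rt in r:
    nodes.append(nodes[-1] * rt)
print(nodes)             # [1, 3, 9, 36]
```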
What makes the problem computationally tractable is the fact that the subproblems arising from an
application of Benders decomposition are separable, so that in the end all that has to be solved to obtain a
primal feasible solution are problems of the form
$$\begin{array}{rll}
\min & c_{k_t}'x_{k_t} + \theta_{k_t} & (6.1)\\
\mbox{s.t.} & A_{k_t}x_{k_t} = b_{k_t} - B_{k_t}\hat x_{a(k_t)} & (6.2)\\
& D_{k_t}x_{k_t} \ge d_{k_t} & (6.3)\\
& E_{k_t}x_{k_t} + \theta_{k_t}\mathbf{1} \ge e_{k_t} & (6.4)\\
& x_{k_t} \ge 0, &
\end{array}$$

where $a(k_t)$ is used to denote the immediate ancestor of node $k_t$ in the decision tree.
The iterative procedure of the last section can be used again to recover the dual variables in a phase III of Benders decomposition, at the cost of storing all the dual variables used to calculate the active cuts. The next section will investigate the possibility of finding the dual variables without storing all this information.
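To make the phase II / phase III division concrete, here is a minimal two-stage sketch of Benders decomposition in the spirit of Van Slyke and Wets [23], written with scipy's LP solver. All problem data are invented; relatively complete recourse is assumed (so no feasibility cuts arise), and the subproblem duals play the role of the rows of $W$ in the text.

```python
import numpy as np
from scipy.optimize import linprog

# Invented two-stage instance:
#   min c0'x0 + sum_k p_k Q_k(x0),  A0 x0 = b0, x0 >= 0, where
#   Q_k(x0) = min{ ck'xk : Ak xk = bk - Bk x0, xk >= 0 }.
# theta >= 0 is a valid initial bound because all costs are nonnegative.
c0, A0, b0 = np.array([1.0, 2.0]), np.array([[1.0, 1.0]]), np.array([1.0])
p = [0.5, 0.5]
ck = [np.array([1.0, 3.0]), np.array([2.0, 2.0])]
Ak = [np.array([[1.0, 1.0]]), np.array([[1.0, 1.0]])]
bk = [np.array([2.0]), np.array([3.0])]
Bk = [np.array([[1.0, 0.0]]), np.array([[0.0, 2.0]])]

def subproblem(k, x0):
    """Solve subproblem k; return its value and dual vector (a row of W)."""
    rhs = bk[k] - Bk[k] @ x0
    res = linprog(ck[k], A_eq=Ak[k], b_eq=rhs, method="highs")
    y = res.eqlin.marginals
    if not np.isclose(res.fun, y @ rhs):   # guard against dual sign conventions
        y = -y
    return res.fun, y

cuts_E, cuts_e = [], []                    # optimality cuts: E x0 + theta >= e
ub = np.inf
for it in range(20):
    # master: min c0'x0 + theta  s.t.  A0 x0 = b0, cuts, x0 >= 0, theta >= 0
    A_ub = np.array([np.append(-E, -1.0) for E in cuts_E]) if cuts_E else None
    b_ub = -np.array(cuts_e) if cuts_e else None
    res = linprog(np.append(c0, 1.0), A_ub=A_ub, b_ub=b_ub,
                  A_eq=np.hstack([A0, np.zeros((A0.shape[0], 1))]),
                  b_eq=b0, method="highs")
    x0_hat, lb = res.x[:-1], res.fun
    # expected recourse cost and aggregated cut  E x0 + theta >= e
    Q, E_row, e_val = 0.0, np.zeros_like(c0), 0.0
    for k in range(len(p)):
        fk, yk = subproblem(k, x0_hat)
        Q += p[k] * fk
        E_row += p[k] * (yk @ Bk[k])       # from theta >= sum p_k yk'(bk - Bk x0)
        e_val += p[k] * (yk @ bk[k])
    ub = min(ub, c0 @ x0_hat + Q)
    if ub - lb <= 1e-8:                    # phase II stopping criterion
        break
    cuts_E.append(E_row)
    cuts_e.append(e_val)

print(x0_hat, ub)
```

Upon termination `x0_hat` carries the optimal first-stage decision; the subproblem duals `yk` are exactly the quantities a phase III recovery would have to store.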
Dantzig-Wolfe decomposition, on the other hand, uses the period $T$ problem as the master problem, and this problem cannot be decomposed further, so a direct computation of the dual variables appears difficult in this setup.
A Lagrangian method
Proposition 1.
When the decomposition method stops, the dual variables $\hat u_0'$ associated with constraints (7.2) of the first-period subproblem

$$\begin{array}{rll}
\min & c_0'x_0 + \theta_0 & (7.1)\\
\mbox{s.t.} & A_0x_0 = b_0 & (7.2)\\
& D_0x_0 \ge d_0 & (7.3)\\
& E_0x_0 + \theta_0\mathbf{1} \ge e_0 & (7.4)\\
& x_0 \ge 0 &
\end{array}$$

form part of the optimal dual solution of the full problem (5).
Consider next the two-period problem

$$\begin{array}{rll}
\min & c_0'x_0 + \displaystyle\sum_{k_1=1}^{K_1} p_{k_1}c_{k_1}'x_{k_1} + \sum_{k_1=1}^{K_1} p_{k_1}\theta_{k_1} & (8.1)\\
\mbox{s.t.} & A_0x_0 = b_0 & (8.2)\\
& B_{k_1}x_0 + A_{k_1}x_{k_1} = b_{k_1} & (8.3)\\
& D_{k_1}x_{k_1} \ge d_{k_1} & (8.4)\\
& E_{k_1}x_{k_1} + \theta_{k_1}\mathbf{1} \ge e_{k_1} & (8.5)\\
& x_0,\ x_{k_1} \ge 0, \quad k_1 = 1, \ldots, K_1, &
\end{array}$$

where the constraints (8.4) and (8.5) consist of all the cuts which are active in all the period two subproblems upon termination of the decomposition routine.
In the same way as lemma 1 one proves that the dual variables $\hat u_0'$ and $\hat u_{k_1}'$ corresponding to constraints (8.2) and (8.3), respectively, are part of the optimal dual solution to (5).
Similarly we have the $t$-period LP problem

$$\begin{array}{rll}
\min & c_0'x_0 + \displaystyle\sum_{k_1=1}^{K_1} p_{k_1}c_{k_1}'x_{k_1} + \sum_{k_2=K_1+1}^{K_2} p_{k_2}c_{k_2}'x_{k_2} + \cdots + \sum_{k_t=K_{t-1}+1}^{K_t} p_{k_t}c_{k_t}'x_{k_t} + \sum_{k_t=K_{t-1}+1}^{K_t} p_{k_t}\theta_{k_t} & \\
\mbox{s.t.} & A_0x_0 = b_0 & \\
& B_{k_1}x_0 + A_{k_1}x_{k_1} = b_{k_1}, \quad k_1 = 1, \ldots, K_1 & \\
& B_{k_2}x_{a(k_2)} + A_{k_2}x_{k_2} = b_{k_2}, \quad k_2 = K_1+1, \ldots, K_2 & \\
& \qquad \ddots & \\
& B_{k_t}x_{a(k_t)} + A_{k_t}x_{k_t} = b_{k_t} & \\
& D_{k_t}x_{k_t} \ge d_{k_t} & (9.5)\\
& E_{k_t}x_{k_t} + \theta_{k_t}\mathbf{1} \ge e_{k_t} & (9.6)\\
& x_{k_t} \ge 0, \quad k_t = K_{t-1}+1, \ldots, K_t, &
\end{array} \eqno(9)$$

where constraints (9.5) and (9.6) correspond to all the active cuts in all the period $t$ subproblems.

Let $(\hat u_0', \hat u_1', \ldots, \hat u_{K_t}', \hat\sigma_{K_{t-1}+1}', \ldots, \hat\sigma_{K_t}', \hat\pi_{K_{t-1}+1}', \ldots, \hat\pi_{K_t}')$ denote the optimal dual solution to problem (9). Then $(\hat u_0', \hat u_1', \ldots, \hat u_{K_t}')$ form part of the optimal dual solution of the full stochastic linear programming problem (5).
Consider first the case $t = 2$. The constraint $A_0x_0 = b_0$, whose optimal dual variables $\hat u_0'$ are already known from Proposition 1, can be put into the objective, giving the problem

$$\begin{array}{rll}
\min & (c_0' - \hat u_0'A_0)x_0 + \displaystyle\sum_{k_1=1}^{K_1} p_{k_1}c_{k_1}'x_{k_1} + \sum_{k_1=1}^{K_1} p_{k_1}\theta_{k_1} & \\
\mbox{s.t.} & B_{k_1}x_0 + A_{k_1}x_{k_1} = b_{k_1} & (10.3)\\
& D_{k_1}x_{k_1} \ge d_{k_1} & (10.4)\\
& E_{k_1}x_{k_1} + \theta_{k_1}\mathbf{1} \ge e_{k_1} & (10.5)\\
& x_0,\ x_{k_1} \ge 0, \quad k_1 = 1, \ldots, K_1. &
\end{array}$$
This problem does not decompose, because (10.3) provides linkage between the different realizations in period 2, but with a warm start the computational effort is moderate. A reasonably good starting basis is given by assembling the bases that were optimal for the second period subproblems upon termination of phase II. This starting basis is block-diagonal, so it can be inverted readily. Moreover, since the first period primal decision variables have also been identified, all the entering variables are known. They correspond to those components of $\hat x_0$ that were basic in cuts in the master problem. (Note that the first period cuts have disappeared from problem (10).) Typically the number of active constraints is small, so few iterations are required to solve problem (10) to optimality.
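The remark about the block-diagonal starting basis can be made concrete: to solve a system with such a basis one factors each diagonal block separately instead of the whole matrix. A small numpy sketch with invented blocks:

```python
import numpy as np

def solve_block_diagonal(blocks, rhs):
    """Solve B z = rhs where B = diag(blocks), without ever forming B.

    blocks: list of square nonsingular matrices (the per-scenario bases)
    rhs:    right-hand side, partitioned conformally with the blocks
    """
    z = []
    start = 0
    for Bi in blocks:
        n = Bi.shape[0]
        z.append(np.linalg.solve(Bi, rhs[start:start + n]))
        start += n
    return np.concatenate(z)

# two hypothetical 2x2 scenario bases
blocks = [np.array([[2.0, 0.0], [0.0, 4.0]]),
          np.array([[1.0, 1.0], [0.0, 1.0]])]
rhs = np.array([2.0, 8.0, 3.0, 1.0])
z = solve_block_diagonal(blocks, rhs)
print(z)   # [1. 2. 2. 1.]
```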
Now consider the case $t = 3$. As formulated, problem (9) is not separable, but the first and second period dual variables $\hat u_0', \hat u_1', \ldots, \hat u_{K_1}'$ have been identified. These constraints can therefore be eliminated by putting them into the objective. This gives the following equivalent representation:

$$\begin{array}{rl}
\min & \Bigl(c_0' - \hat u_0'A_0 - \displaystyle\sum_{k_1=1}^{K_1} p_{k_1}\hat u_{k_1}'B_{k_1}\Bigr)x_0 + \sum_{k_1=1}^{K_1} p_{k_1}(c_{k_1}' - \hat u_{k_1}'A_{k_1})x_{k_1} + \sum_{k_2=K_1+1}^{K_2} p_{k_2}c_{k_2}'x_{k_2} + \sum_{k_2=K_1+1}^{K_2} p_{k_2}\theta_{k_2}\\
\mbox{s.t.} & B_{k_2}x_{a(k_2)} + A_{k_2}x_{k_2} = b_{k_2}\\
& D_{k_2}x_{k_2} \ge d_{k_2}\\
& E_{k_2}x_{k_2} + \theta_{k_2}\mathbf{1} \ge e_{k_2}\\
& x_{k_2} \ge 0, \quad k_2 = K_1+1, \ldots, K_2.
\end{array}$$
This problem is (partially) separable; there is one separate two-stage subproblem for each realization of the second period random elements. Again, a good starting basis is available, and all the entering variables have been identified at the end of phase II.
The same idea works for general $t$: Because of the staircase structure of the stochastic linear programming problem, the dual variables can be found by solving a number of two-stage problems.
Computational results
These ideas were coded into a modification of the experimental stochastic programming code MSLiP (see [12]) and applied to a number of standard test problems from the Ho and Loute [13] set of staircase problems as randomized by Birge [4]. Table 1 shows the number of simplex iterations and CPU times during the solution phase of the algorithm and during the computation of dual variables. These figures are somewhat difficult to interpret, because the problem size in phase III is larger than the problem size in phases I and II. It should also be remembered that the decomposition phase uses a relatively polished code, whereas the dual variable routine is a first rough implementation. What comes out, though, is the relatively modest number of simplex iterations in phase III and a CPU time which compares favorably with phases I and II, especially on the larger problems. The overall advantage of Benders decomposition over the simplex algorithm (demonstrated in Gassmann [11,12]) is not seriously compromised by the added work necessary to recover the dual variables.
Table 1. Simplex iterations and CPU times.

Problem           Iterations               Time (sec.)
             Phase I and II  Phase III  Input  Phase I and II  Phase III
SC205/3            55            12       3.6        3.5          10.4
SCRS8/3           109            14       4.4       10.0         105.4
SCSD8/3            80            79       6.6        9.4          96.9
SCAGR7/3          259            17       4.0       23.3          22.9
SCTAP1/3          129            45       6.0       17.3          84.1
SCFXM1/4         5106           178      11.0     1811.4         244.0
FOR1/7          11078           200       6.5      604.3          75.2
FOR2/7          72230           455       7.2     4031.8         557.1
Nonanticipativity constraints
As demonstrated in Rockafellar and Wets [19] and Dempster [6,7], the expected value of perfect information for a stochastic programming problem can be inferred from its information prices. These are dual variables associated with nonanticipativity constraints which state, in essence, that the decision variables in period $t$ have to be chosen before the random variables in periods $t+1, \ldots, T$ have been observed. These constraints have been substituted implicitly into formulation (5) and in general have alternative implicit [19] and explicit [6,7] representations. An explicit account can also be taken by using the following equivalent program.
$$\begin{array}{rl}
\min & \displaystyle\sum_{k_T=K_{T-1}+1}^{K_T} p_{k_T}\bigl[c_0'x_{1,k_T} + c_{a_1(k_T)}'x_{2,k_T} + c_{a_2(k_T)}'x_{3,k_T} + \cdots + c_{k_T}'x_{T,k_T}\bigr]\\
\mbox{s.t.} & A_0x_{1,k_T} = b_0\\
& B_{a_1(k_T)}x_{1,k_T} + A_{a_1(k_T)}x_{2,k_T} = b_{a_1(k_T)}\\
& B_{a_2(k_T)}x_{2,k_T} + A_{a_2(k_T)}x_{3,k_T} = b_{a_2(k_T)}\\
& \qquad \ddots\\
& B_{k_T}x_{T-1,k_T} + A_{k_T}x_{T,k_T} = b_{k_T}, \quad k_T = K_{T-1}+1, \ldots, K_T\\
& x_{t,k_T} \ge 0, \quad t = 1, \ldots, T\\
& x_0 - x_{1,k_T} = 0\\
& x_{a_1(k_T)} - x_{2,k_T} = 0\\
& \qquad \vdots\\
& x_{a_{T-1}(k_T)} - x_{T,k_T} = 0, \quad k_T = K_{T-1}+1, \ldots, K_T,
\end{array} \eqno(12)$$

where $a_t(k_T)$ denotes the (unique) node in period $t$ which is an ancestor of terminal node $k_T$ in the decision tree.
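The ancestor map $a_t(k_T)$ is cheap to evaluate when each node stores its parent; a hypothetical sketch on a small three-period tree (node numbering invented):

```python
# parent[k] is the immediate ancestor a(k); node 0 is the root.
# Hypothetical 3-period tree: root 0; period-1 nodes 1,2; period-2 nodes 3..6.
parent = {1: 0, 2: 0, 3: 1, 4: 1, 5: 2, 6: 2}
period = {0: 0, 1: 1, 2: 1, 3: 2, 4: 2, 5: 2, 6: 2}

def ancestor(t, k):
    """Return a_t(k): the unique period-t ancestor of node k."""
    while period[k] > t:
        k = parent[k]
    return k

print(ancestor(1, 5))   # -> 2
```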
This program is much larger than (5) because of the explicit replication of each decision variable for each terminal node in the decision tree. It also is not a staircase problem any more. On the other hand, there is no need to solve problem (12) from scratch. Instead one should work with problem (5) as long as possible. Once an optimal basis for (5) has been identified, one can construct an optimal basis for (12): If a variable in (5) is off its bounds, then all its replications in (12) must belong to the optimal basis. Once this basis has been constructed, it can be inverted (taking advantage of special structure) to find the optimal dual variables. This will be the topic of future research.
Summary
This paper explored methods to compute the optimal dual variables for the multistage stochastic linear programming problem. If the problem exhibits staircase structure, then the dual variables can be obtained by solving problems which aggregate two periods at a time, consisting of a non-terminal node together with all its immediate descendants. Numerical results indicate that the computational effort is not unreasonable.
Acknowledgements
This research was supported in part by a grant from the Natural Sciences and Engineering Research
Council of Canada (NSERC).
I am grateful to M.A.H. Dempster for many fruitful discussions on this topic.
References
[1] P.G. Abrahamson, "A nested decomposition approach for solving staircase linear programs", Technical Report SOL 83-4, Systems Optimization Laboratory, Stanford University (Stanford, 1983).
[2] D. Ament, J. Ho, E. Loute and M. Remmelswaal, "LIFT: A nested decomposition algorithm for solving lower block diagonal linear programs", in: G.B. Dantzig, M.A.H. Dempster and M.J. Kallio (eds.), Large Scale Linear Programming, Vol. 1, IIASA Collaborative Proceedings Series CP-81-S1, Laxenburg, Austria, 1981, pp. 383-408.
[3] J.F. Benders, "Partitioning procedures for solving mixed-variables programming problems", Numerische Mathematik 4 (1962) 238-252.
[4] J.R. Birge, "Decomposition and partitioning methods for multistage stochastic linear programs", Operations Research 33 (1985) 989-1007.
[5] G.B. Dantzig and P. Wolfe, "Decomposition principle for linear programs", Operations Research 8 (1960) 101-111.
[6] M.A.H. Dempster, "The expected value of perfect information in the optimal evolution of stochastic systems", in: M. Arato, D. Vermes and A.V. Balakrishnan (eds.), Stochastic Differential Systems, Vol. 36, Springer Lecture Notes in Control and Information Sciences, 1981, pp. 25-40.
[7] M.A.H. Dempster, "On stochastic programming II: Dynamic problems under risk", Stochastics 25 (1988) 15-42.
[8] M.A.H. Dempster, H.I. Gassmann, E.A. Gunn and R.R. Merkovsky, "Lagrangean dual decomposition methods for solving stochastic programmes with recourse" (in preparation).
[9] R. Fourer, "Solving staircase linear programs by the simplex method, I: Inversion", Mathematical Programming 23 (1982) 274-313.
[10] R. Fourer, "Solving staircase linear programs by the simplex method, II: Pricing", Mathematical Programming 25 (1983) 251-292.
[11] H.I. Gassmann, "Optimal harvest of a forest in the presence of uncertainty", Canadian Journal of Forest Research (to appear).
[12] H.I. Gassmann, "MSLiP: A computer code for the multistage stochastic linear programming problem", Mathematical Programming (to appear).
[13] J.K. Ho and E. Loute, "A set of staircase linear programming test problems", Mathematical Programming 20 (1981) 245-250.
[14] J.K. Ho and E. Loute, "An advanced implementation of the Dantzig-Wolfe decomposition algorithm for linear programming", Mathematical Programming 20 (1981) 303-326.
[15] J.K. Ho and A.S. Manne, "Nested decomposition for dynamic models", Mathematical Programming 6 (1974) 121-140.
[16] J.M. Mulvey and H. Vladimirou, "Solving multistage stochastic networks: An application of scenario aggregation", Working Paper SOR-88-1, Princeton University, Princeton, New Jersey, 1988.
[17] J.M. Mulvey and H. Vladimirou, "Stochastic network optimization models for investment planning", Working Paper SOR-88-2, Princeton University, Princeton, New Jersey, 1988.
[18] J.M. Mulvey and H. Vladimirou, "Evaluation of a distributed hedging algorithm for stochastic network programming", Working Paper SOR-88-14, Princeton University, Princeton, New Jersey, 1988.
[19] R.T. Rockafellar and R.J-B. Wets, "Nonanticipativity and L1-martingales in stochastic optimization problems", Mathematical Programming Study 6 (1976) 170-187.
[20] R.T. Rockafellar and R.J-B. Wets, "The principle of scenario aggregation in optimization under uncertainty", Working Paper WP-87-119, International Institute for Applied Systems Analysis, Laxenburg, Austria, 1987.
[21] A. Ruszczynski, "A regularized decomposition method for minimizing a sum of polyhedral functions", Mathematical Programming 35 (1986) 309-333.
[22] D.M. Scott, "A dynamic programming approach to time staged convex programs", Technical Report SOL 85-3, Systems Optimization Laboratory, Stanford University (Stanford, 1985).
[23] R. Van Slyke and R.J-B. Wets, "L-shaped linear programs with application to optimal control and stochastic optimization", SIAM Journal on Applied Mathematics 17 (1969) 638-663.
[24] R.J. Wittrock, "Advances in a nested decomposition algorithm for solving staircase linear programs", Technical Report SOL 83-2, Systems Optimization Laboratory, Stanford University (Stanford, 1983).