PII S1052623497318700. 1999; published electronically February 29, 2000. This work was supported by the New Zealand Public Good Science Fund, FRST contract 403. http://www.siam.org/journals/siopt/10-3/31870.html
† Mathematical Sciences Division, Argonne National Lab, Argonne, IL 60439 (zakeri@mcs.anl.gov).
‡ Operations Research Group, Department of Engineering Science, University of Auckland, Private
To avoid possible confusion, we remark that our use of the term inexact is less general here than that of Au, Higle, and Sen [1]. At each iteration i of their inexact subgradient algorithm (applied to minimize a general objective function f(x)), they construct an approximate subgradient at the current point xi by computing a subgradient of an approximating function fi and taking a projected step from xi in (the negative of) that direction. With certain restrictions on the convergence of {fi} to f, they prove convergence of xi to the minimum of f under the assumption that the subgradients of fi at xi form a bounded sequence. Our results are confined to Benders decomposition (where f(x) = cT x + Q(x) and each fi is defined by the inexact cuts at iteration i), but we do not require that the subgradients of fi at each iterate (namely, c − πiT T in our special case) form a bounded sequence.
In the next section we describe a Benders decomposition algorithm that terminates the solution of the subproblem before optimality to produce an inexact cut. The steps of the algorithm ensure that this cut separates the optimal solution from the current iterate. In section 3 we consider the convergence of the inexact cut algorithm under the above assumptions, and in section 4 we discuss the implications of our results for Dantzig–Wolfe decomposition. In section 5 we give some computational results.
2. The algorithm. We start the inexact cut algorithm by choosing a convergence tolerance δ, setting an iteration counter i := 0, and choosing some decreasing sequence {εi} that converges to 0. We also set U0 := ∞ and L0 := −∞. The remaining steps of the algorithm are as follows.
Inexact cut algorithm.
While Ui − Li > δ
(1) Set i := i + 1.
(2) Solve MP to obtain (xi, θi).
(3) Set Li := cT xi + θi.
(4) Perform an inexact optimization to generate a vector πi feasible for the dual of SP(xi) such that
(1) πiT (h − T xi) + εi > Q(xi).
(5) Set Ui := min{Ui−1, cT xi + πiT (h − T xi) + εi}.
(6) If πiT (h − T xi) > θi, then add the cut πiT (h − T x) ≤ θ to MP,
else set i := i + 1, xi+1 := xi, θi+1 := θi, Li+1 := Li, Ui+1 := Ui and go to step 4.¹
We denote by vi the value of the inexact optimization in step 4; thus vi = πiT (h − T xi). In step 6 of each iteration we check whether vi > θi, which ensures that the hyperplane πiT (h − T x) = θ will strictly separate the current iterate (xi, θi) from any optimal solution of P. If this check fails, then we decrease the duality gap tolerance and continue with the solution of SP(xi), until either εi → 0 with no change in (xi, θi) or (xi, θi) is separated from an optimal solution of P by a cut.
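To make the loop concrete, here is a sketch in Python on a toy one-dimensional instance (minimize x + Q(x) with Q(x) = max(1 − x, 0) and 0 ≤ x ≤ 2, so the optimal value is 1). Everything here is illustrative: the problem data, the halving εi schedule, and the use of scipy's linprog are our own choices, and for simplicity the subproblem dual is solved exactly, which satisfies condition (1) for any εi ≥ 0.

```python
# Toy two-stage problem:  minimize x + Q(x),  0 <= x <= 2,
# Q(x) = min{ y1 : y1 - y2 = 1 - x, y >= 0 } = max(1 - x, 0),
# so here c = h = T = 1 and the dual of SP(x) is max pi*(1 - x), 0 <= pi <= 1.
from scipy.optimize import linprog

c, h, T = 1.0, 1.0, 1.0
delta = 1e-3
cuts_A, cuts_b = [], []          # master cuts:  pi_i*(h - T x) <= theta
U, L, i = float("inf"), float("-inf"), 0

while U - L > delta:
    i += 1
    eps_i = 0.5 ** i             # illustrative decreasing sequence {eps_i}
    # (2)-(3): solve the master problem MP in variables (x, theta);
    # theta >= 0 is a valid initial bound since Q >= 0 for this instance
    mp = linprog([c, 1.0],
                 A_ub=cuts_A or None, b_ub=cuts_b or None,
                 bounds=[(0, 2), (0, None)])
    x_i, theta_i = mp.x
    L = c * x_i + theta_i        # lower bound
    # (4): dual of SP(x_i):  max pi*(h - T x_i)  s.t.  0 <= pi <= 1
    sp = linprog([-(h - T * x_i)],
                 A_ub=[[1.0], [-1.0]], b_ub=[1.0, 0.0],
                 bounds=[(None, None)])
    pi_i = sp.x[0]
    v_i = pi_i * (h - T * x_i)   # v_i = Q(x_i) here (exact oracle)
    # (5): upper bound, padded by the current tolerance
    U = min(U, c * x_i + v_i + eps_i)
    # (6): add the cut only if it separates (x_i, theta_i)
    if v_i > theta_i:
        cuts_A.append([-pi_i * T, -1.0])   # pi_i*(h - T x) - theta <= 0
        cuts_b.append(-pi_i * h)
    # else: no cut; the next pass re-solves MP (same cuts, same point)
    # with the smaller eps_{i+1}, as in the else branch of step 6

print(round(L, 3))   # the optimal value of the toy problem is 1
```

Note the step-6 check: a cut is added only when vi > θi; otherwise the same point is retried with a smaller εi, which is what drives Ui − Li below δ.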
To show that this algorithm converges we make use of the following simple results.
Lemma 2.1. −πiT T is an εi-subgradient of Q at xi.
Proof. Since πi is dual feasible for SP(xi), it is dual feasible for every possible subproblem SP(x). Hence for every x of suitable dimension, we have
Q(x) ≥ πiT (h − T x),
and by (1)
Q(xi) ≤ vi + εi.
Thus
Q(x) ≥ πiT (h − T xi) + (−πiT T)(x − xi) ≥ Q(xi) − εi + (−πiT T)(x − xi),
giving the result.
¹ Note that in this case x and θ remain fixed and only εi (possibly) changes.
646 G. ZAKERI, A. B. PHILPOTT, AND D. M. RYAN
Lemma 2.2. At each iteration of the inexact cut algorithm,
0 ≤ Ui − Li ≤ vi + εi − θi.
Proof. Since
Ui ≤ cT xi + vi + εi
and Li = cT xi + θi, we have
0 ≤ Ui − Li ≤ cT xi + vi + εi − cT xi − θi = vi + εi − θi.
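Lemma 2.1 can be restated as the pair of inequalities that make an inexact cut useful: it never overestimates Q, and at xi it underestimates by at most εi. In the paper's notation:

```latex
% validity: dual feasibility of \pi_i gives a global lower bound on Q
Q(x) \;\ge\; \pi_i^{T}(h - Tx) \quad \text{for all } x,
\qquad
% accuracy: condition (1) gives near-tightness at x_i
\pi_i^{T}(h - Tx_i) \;>\; Q(x_i) - \varepsilon_i .
```

So the cut θ ≥ πiT (h − T x) cuts off no feasible point of P, yet supports Q to within εi at the current iterate.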
where K is some infinite index set. This can be shown to be equivalent to our equation (6) below. (Lemma 3.9 shows that this equation holds for our algorithm.) As [10, Theorem 9] is not directly applicable, we present a self-contained proof of our convergence result.
INEXACT CUTS IN BENDERS DECOMPOSITION 647
This contradicts the assumption that the epigraph of max1≤k≤N {bTk x + βk} lies in H. Hence we must have ‖bk‖ ≤ M as required.
Lemma 3.2. If dom Q = Rn, then the sequence {−πiT T} is bounded.
Proof. Let {π̂k | 1 ≤ k ≤ N} be the set of basic feasible solutions of W T π ≤ q. Recall that for every x ∈ Rn, πi is dual feasible for SP(x). So for every such x we have
Q(x) = max1≤k≤N π̂kT (h − T x) ≥ πiT (h − T x),
where the equation follows by virtue of dom Q = Rn. Therefore the epigraph of Q lies in the half-space
H = {(x, µ) | µ ≥ bT x + β},
where bT = −πiT T and β = πiT h. The conclusion is then immediate from Lemma 3.1.
Next we will show that the inexact cut algorithm terminates in a finite number of
iterations with a δ-optimal solution. If the inexact cut algorithm does not terminate
in a finite number of iterations, then it will produce an infinite sequence {(xi , θi )}
that satisfies one of the following conditions:
(1) There exists m such that θi ≥ vi for all i ≥ m.
(2) There exists a subsequence {(xσ(i) , θσ(i) )} such that θσ(i) < vσ(i) .
The following lemmas show that in either case Ui − Li ↓ 0, so that the algorithm in fact terminates with a δ-optimal solution.
Lemma 3.3. If there exists m such that θi ≥ vi for all i ≥ m, then Ui − Li ↓ 0.
Proof. Since θi ≥ vi, Lemma 2.2 implies
0 ≤ Ui − Li ≤ vi + εi − θi ≤ εi,
and since {εi} converges to 0 and {Ui − Li} is decreasing, Ui − Li ↓ 0.
Now (xτ(i), θτ(i)) → (x∗, θ∗) by assumption. Furthermore, from the algorithm we have
Q(xτ(i)) − ετ(i) ≤ vτ(i) ≤ Q(xτ(i)),
and therefore, since Q is continuous on Rn and ετ(i) → 0, vτ(i) → Q(x∗), which implies lim vτ(i) − vτ(i−1) = 0. Furthermore, (2) and (3) imply
θτ(i) ≥ vτ(i−1) − πτT(i−1) T (xτ(i) − xτ(i−1)),
so if we let Vτ(i) = vτ(i) + ετ(i) − θτ(i), then
(4) 0 < Vτ(i) ≤ vτ(i) − vτ(i−1) + πτT(i−1) T (xτ(i) − xτ(i−1)) + ετ(i),
and by Lemma 3.2 the sequence {−πiT T} is bounded, so πτT(i−1) T (xτ(i) − xτ(i−1)) → 0.
Substituting into (4) and taking the limit as τ(i) → ∞ yields Vτ(i) → 0. Since Uτ(i) − Lτ(i) is bounded above by Vτ(i) and below by 0, it must converge to 0. Now by their definitions, {Ui} is decreasing and {Li} is increasing. Hence {Ui − Li} is decreasing, and since a subsequence of this sequence converges, it follows that the whole sequence converges.
Theorem 3.6. If {x ≥ 0 | Ax = b} is bounded and dom Q = Rn, the inexact cut algorithm terminates in a finite number of iterations with a δ-optimal solution of P.
Proof. From Lemma 3.3 and Lemma 3.5 we have that Ui − Li ↓ 0. Therefore there exists some I such that UI − LI < δ, so the algorithm terminates in at most I iterations. Let xk be such that UI = cT xk + vk + εk. Then
cT xk + Q(xk) ≤ cT xk + vk + εk < LI + δ,
and since LI is a lower bound on the optimal value of P, xk is a δ-optimal solution of P.
We do this by showing in Lemma 3.9 that for some subsequence {xσ(i)} of {xτ(i)},
(7) lim inf −πσT(i−1) T (xσ(i) − xσ(i−1)) ≥ 0.
we get
lim sup −πσT(i−1) T (xσ(i) − xσ(i−1)) ≤ 0,
K = {x | bTj x ≤ βj, 1 ≤ j ≤ k}
bTj x∗ = βj if 1 ≤ j ≤ k,
bTj x∗ < βj otherwise.
Then there is some λ > 0 and N such that for every y in the recession cone of {x | bTj x ≤ βj, 1 ≤ j ≤ k},
i > N ⇒ xi + λ y/‖y‖ ∈ G.
K = {x | bTj x ≤ βj, 1 ≤ j ≤ k},
(9) xi + λy ∈ K, λ ≥ 0, y ∈ C.
Now since bTj x∗ < βj for k < j ≤ m, we may choose λ > 0 so that
bTj (xi + λ y/‖y‖) < βj for k < j ≤ m and all sufficiently large i.
Furthermore by (9),
bTj (xi + λ y/‖y‖) ≤ βj, 1 ≤ j ≤ k,
and so xi + λ y/‖y‖ ∈ G.
We now apply the above lemmas to prove Lemma 3.9. The proof proceeds by showing that for an appropriately chosen convergent subsequence {xσ(i)}, the projection of πσT(i) T in the direction of xσ(i+1) − xσ(i) is uniformly bounded. Once this is established the conclusion of Lemma 3.9 is immediate.
Lemma 3.9. Suppose {(xτ(i), θτ(i))} is a subsequence of the sequence of solutions generated by the inexact cut algorithm, and let {πτ(i)} be the corresponding approximately optimal solutions to the dual of SP(xτ(i)). Then there exists a subsequence of {xτ(i)} (indexed by σ(i)) such that xσ(i) → x∗ and
lim inf −πσT(i) T (xσ(i+1) − xσ(i)) ≥ 0.
Proof. Since X is bounded, convex, and polyhedral, the (finite) collection of all relative interiors of the faces of X partition it [16, Theorem 18.2]. Hence there is a subsequence of {xτ(i)}, indexed by γ(i), such that {xγ(i)} lies in the relative interior of a face G of X and converges to a point x∗ ∈ G. (We shall henceforth denote the relative interior of G by ri G.) Since G is polyhedral we may represent it by
G = {x | biT x ≤ βi, 1 ≤ i ≤ m}.
where B is the open unit ball and aff G is the affine hull of G. Now for γ(i) large enough we have that x∗ − xγ(i) ∈ B, and so if we choose σ(k) = γ(i), then
−λπσT(i−1) T (xσ(i−1) − xσ(i)) / ‖xσ(i−1) − xσ(i)‖ ≤ Q(x) − Q(xσ(i−1)) + εσ(i−1)
≤ supx∈G Q(x) − infx∈G Q(x) + εσ(i−1).
If we set M = supx∈G Q(x) − infx∈G Q(x) + ε1, then since {εi} is decreasing we obtain
−πσT(i−1) T (xσ(i−1) − xσ(i)) / ‖xσ(i−1) − xσ(i)‖ ≤ M/λ.
Therefore
−πσT(i−1) T (xσ(i) − xσ(i−1)) ≥ −(M/λ) ‖xσ(i−1) − xσ(i)‖,
which implies
lim inf −πσT(i−1) T (xσ(i) − xσ(i−1)) ≥ 0.
Theorem 3.10. If X = {x ≥ 0 | Ax = b} is bounded and X ⊆ dom Q, the
inexact cut algorithm terminates in a finite number of iterations with a δ-optimal
solution of P.
Proof. The proof is similar to that of Theorem 3.6. We will start by showing
Ui − Li ↓ 0. If there exists m such that θi ≥ vi for all i ≥ m, then Lemma 3.3
delivers the conclusion. Otherwise, there exists a subsequence {(xτ (i) , θτ (i) )} such
that θτ (i) < vτ (i) , and since X is bounded, without loss of generality we may assume
that {(xτ (i) , θτ (i) )} converges to (x∗ , θ∗ ), say. Then by Lemma 3.4,
This yields Uσ(i) − Lσ(i) → 0, implying that the decreasing sequence {Ui − Li }
tends to 0, which then gives the result as in the proof of Theorem 3.6.
4. Dantzig–Wolfe decomposition. It is well known that Benders decomposition is dual to Dantzig–Wolfe decomposition. Therefore some form of inexact optimization procedure should apply to the latter algorithm in a way that mirrors the steps of the inexact cut algorithm described in section 2. In fact such a scheme has been outlined in the literature by Kim and Nazareth [15], who discuss the computational advantages of using interior-point methods in such an approach. We digress briefly in this section to explore the asymptotic convergence properties of such an algorithm.
The dual problem of P can be formulated as
D: maximize bT u + hT v
subject to AT u + T T v ≤ c,
W T v ≤ q.
Suppose for the moment that the set V = {v | W T v ≤ q} is bounded with extreme points {vi}. Then Dantzig–Wolfe decomposition solves a restricted master problem
MD: maximize bT u + Σi λi hT vi
subject to AT u + Σi λi T T vi ≤ c,
Σi λi = 1,
λ ≥ 0,
where the summations are taken over a subset of {vi }. New extreme points are added
iteratively to this subset by solving MD, obtaining optimal dual variables (x, θ), and
then solving the subproblem
SD(x): maximize (hT − xT T T )v
subject to W T v ≤ q,
to give a new column (T T vi, 1)T to be added to the restricted master problem, in the event that this column has a positive reduced cost defined by
(hT − xT T T )vi − θ.
(6) If viT (h − T xi) > θi, then add the column (T T vi, 1)T to MD,
else set i := i + 1, xi+1 := xi, θi+1 := θi, Li+1 := Li, Ui+1 := Ui and go to step 4.
Here V (SD(xi )) is the optimal value of SD(xi ). Since the dual of SD(xi ) is easily
seen to be SP(xi ), V (SD(xi )) = Q(xi ), and so step 4 of this algorithm is identical to
the same step of the inexact cut algorithm of section 2.
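Since the argument appeals to this duality, it may help to display the two problems side by side; taking multipliers y ≥ 0 for the rows of W T v ≤ q gives

```latex
\begin{aligned}
\mathrm{SD}(x):\;& \max_{v}\ (h - Tx)^{T} v &&\text{s.t. } W^{T} v \le q,\\
\mathrm{SP}(x):\;& \min_{y}\ q^{T} y &&\text{s.t. } W y = h - Tx,\ y \ge 0,
\end{aligned}
```

and linear programming duality gives V(SD(x)) = V(SP(x)) = Q(x) whenever either value is finite.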
In classical Dantzig–Wolfe decomposition, each solution vi obtained for SD is an
extreme point, of which there is a finite number, thus guaranteeing finite termination.
In the inexact algorithm, this is no longer true. However, Theorem 3.10 may be
invoked to yield the following corollary.
Corollary 4.1. If X = {x ≥ 0 | Ax = b} is a bounded set and for every x ∈ X
the problem SD(x) is bounded, then the inexact Dantzig–Wolfe algorithm terminates
in a finite number of iterations with a δ-optimal solution of D.
Since SD(x) will always have a feasible solution (if D does), the boundedness condition on SD(x) is equivalent to SP(x) being feasible, which is the relatively complete recourse assumption of the previous section. The other assumption, that X is
bounded, appears to be rather restrictive in the current context, and it fails to hold
in the case when A and b are both absent, a typical situation in many applications of
Dantzig–Wolfe decomposition. The convergence proof requires X to be bounded to
enable the extraction of convergent subsequences. Even when A and b fail to bound
X, we can still extract convergent subsequences as long as we have a guarantee that
the sequence {xi } lies in a bounded set. In Benders decomposition we can enforce this
condition in practice by placing a priori bounds on the components of x. Similarly, in
inexact Dantzig–Wolfe decomposition we can impose a priori bounds on the optimal
dual variables for the master problem constraints (by placing a priori penalties on
infeasibilities in these constraints).
5. Computational results. We conclude by presenting some computational
results of applying the inexact cut algorithm to a set of problems that arise in the
planning of hydroelectric power generation. The problems are all based on a multistage stochastic programming model developed by Broad [4], in which the New
Zealand electricity system is represented as a side-constrained network model with
nodes representing hydroelectric reservoirs, hydroelectric generation facilities, thermal
generation facilities, and demand points and arcs with constant losses representing
the transmission network. The model consists of six reservoirs, six thermal stations,
and 22 hydrostations.
Each stage is a week long, and demand in each week is represented by a piecewise
linear load duration curve with three linear sections. At each stage several random
outcomes are possible for the inflows into the reservoirs in the current week. We impose
a lower bound on the final level of the reservoirs at the end of the final stage. This lower
bound is a fixed fraction of the initial level of the reservoirs in the very first
stage. Additional side constraints include DC load flow constraints that govern the
transmission flows and conservation of water flow equations in hydroelectric systems.
The linear program for each stage has 273 variables and 120 constraints. The objective
in each stage is to minimize the cost of thermal electricity generation over the current
week plus the expected future cost of thermal generation.
The multistage models described above were converted into two-stage and three-
stage problems by aggregating consecutive stages into larger problems. For example, to
obtain a two-stage problem from a multistage problem we aggregate each second-stage
problem and its descendants into a single deterministic equivalent linear program.
Table 1
Problem sizes.
Table 2
Performance comparison.
Table 2 contains a comparison of the computational results for the two methods. The termination criterion for both algorithms requires a relative gap of 10−5 between the upper and the lower bounds (i.e., we stop when (U − L)/U < 10−5). All times are reported on an SGI Power Challenge. Column 1 contains the problem identifiers. Columns 2 and 3 contain the number of cuts under the exact and inexact cut algorithms, respectively. Columns 4 and 5 contain the timing in seconds for the exact and inexact methods, respectively. The last column contains the percentage improvement of the inexact cut algorithm over the exact Benders decomposition algorithm. The entries in this column are calculated as ((exact time − inexact time)/exact time) × 100%.
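For concreteness, the last column is just this arithmetic (the timings below are hypothetical, not values from Table 2):

```python
# Percentage improvement of the inexact cut algorithm over exact Benders,
# as defined in the text: ((exact time - inexact time) / exact time) x 100%.
def improvement_pct(exact_time: float, inexact_time: float) -> float:
    return (exact_time - inexact_time) / exact_time * 100.0

# Hypothetical timings for illustration only.
print(improvement_pct(200.0, 150.0))  # a 50-second saving on 200 s is 25.0%
```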
Note that traditionally the subproblems are not aggregated and they are solved using the (dual) simplex method with warm starting. For some problems this is more efficient than using an interior-point method on an aggregated subproblem, although in other cases (e.g., P3, P7, and P11) we experienced significant speed-up by aggregating and using the interior-point method versus Benders decomposition with warm starting simplex. It may be possible to warm start the interior-point method effectively when solving the subproblems, using recent research developed to this end (see, for example, [19, 9]).
6. Conclusions. In every one of our problems the inexact cut algorithm improved the time to obtain a solution with the same accuracy as that of the Benders decomposition algorithm. In our experiments, the choice of {εi} is made independently of the problem. Further improvements in speed can be achieved by making a problem-dependent choice of {εi}. In Table 2 the greatest improvements were obtained in cases where the Benders decomposition required a large number of cuts. In these cases we observed that often during the course of the exact algorithm the lower bounds did not change over several iterations. The inexact cut algorithm does not display this behavior, and it reaches an approximately optimal solution with fewer cuts. This suggests that computing cuts inexactly is a promising and simple improvement strategy for operations research practitioners who observe similar behavior in Benders decomposition applied to their stochastic linear programming models.
REFERENCES
[1] K. T. Au, J. L. Higle, and S. Sen, Inexact subgradient methods with applications in stochastic
programming, Math. Programming, 63 (1994), pp. 65–82.
[2] O. Bahn, O. Du Merle, J.-L. Goffin, and J.-P. Vial, A cutting plane method from analytic
centers for stochastic programming, Math. Programming Ser. B, 69 (1995), pp. 45–73.
[3] J. F. Benders, Partitioning procedures for solving mixed-variables programming problems,
Numer. Math., 4 (1962), pp. 238–252.
[4] K. P. Broad, Power Generation Planning Using Scenario Aggregation, M.S. thesis, University
of Auckland, Auckland, New Zealand, 1996.
[5] G. B. Dantzig and P. Wolfe, Decomposition principle for linear programs, Oper. Res., 8
(1960), pp. 101–111.
[6] G. B. Dantzig and P. Wolfe, The decomposition algorithm for linear programs, Economet-
rica, 29 (1961), pp. 767–778.
[7] E. Flippo and A. Rinnooy Kan, Decomposition in general mathematical programming, Math.
Programming, 60 (1993), pp. 361–382.
[8] A. M. Geoffrion, Generalized Benders decomposition, J. Optim. Theory Appl., 10 (1972),
pp. 237–260.
[9] J. Gondzio, Warm start of the primal-dual method applied in the cutting-plane scheme, Math.
Programming Ser. A, 83 (1998), pp. 125–143.
[10] J. L. Higle and S. Sen, On the convergence of algorithms with implications for stochastic and
nondifferentiable optimization, Math. Oper. Res., 17 (1992), pp. 112–131.
[11] W. Hogan, Application of general convergence theory for outer approximation algorithms,
Math. Programming, 5 (1973), pp. 151–168.
[12] J. Jacobs, G. Freeman, J. Grygier, D. Morton, G. Schultz, K. Staschus, and J. Stedinger, SOCRATES: A system for scheduling hydro-electric generation under uncertainty, Ann. Oper. Res., 59 (1995), pp. 99–133.
[13] P. Kall and S. W. Wallace, Stochastic Programming, John Wiley, New York, 1994.
[14] J. E. Kelley, Jr., The cutting-plane method for convex programs, J. Soc. Indust. Appl. Math.,
8 (1960), pp. 703–712.
[15] K. Kim and J. L. Nazareth, The decomposition principle and algorithm for linear program-
ming, Linear Algebra Appl., 152 (1991), pp. 119–133.
[16] R. T. Rockafellar, Convex Analysis, Princeton University Press, Princeton, NJ, 1970.
[17] R. M. Van Slyke and R. Wets, L-shaped linear programs with applications to optimal control
and stochastic programming, SIAM J. Appl. Math., 17 (1969), pp. 638–663.
[18] R. J. Wittrock, Advances in a Nested Decomposition Algorithm for Solving Staircase Linear
Programs, Report SOL 83-2, Systems Optimization Laboratory, Department of Operations
Research, Stanford University, Stanford, CA, 1983.
[19] G. Zakeri, D. M. Ryan, and A. B. Philpott, Techniques for Solving Large Scale Set Parti-
tioning Problems, Technical report, University of Auckland, New Zealand, 1996.