Generalized Benders Decomposition Method to Solve Big
Mixed-Integer Nonlinear Optimization Problems with Convex
Objective and Constraints Functions
Andrzej Karbowski

Research and Academic Computer Network NASK—National Research Institute, ul. Kolska 12,
01-045 Warsaw, Poland; andrzej.karbowski@nask.pl

Abstract: The paper presents the Generalized Benders Decomposition (GBD) method, which is now one of the basic approaches to solving big mixed-integer nonlinear optimization problems. It concentrates on the basic formulation with convex objective and constraint functions. Apart from the classical projection and representation theorems, a unified formulation of the master problem with nonlinear and linear cuts will be given. For the latter case, the most effective and, at the same time, easy to implement computational algorithms will be pointed out.

Keywords: optimization; mixed-integer nonlinear programming; decomposition; convex problems; bilinear problems; integer programming; Generalized Benders Decomposition; hierarchical optimization; branch and cut




Citation: Karbowski, A. Generalized Benders Decomposition Method to Solve Big Mixed-Integer Nonlinear Optimization Problems with Convex Objective and Constraints Functions. Energies 2021, 14, 6503. https://doi.org/10.3390/en14206503

Academic Editor: Abu-Siada Ahmed

Received: 9 August 2021; Accepted: 30 September 2021; Published: 11 October 2021

Publisher's Note: MDPI stays neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Copyright: © 2021 by the author. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (https://creativecommons.org/licenses/by/4.0/).

1. Introduction

In the early 1960s, a Dutch mathematician, Jacques F. Benders, was considering optimization problems of the form [1]:

max_{x,v} c^T x + f(v)    (1)

subject to the following constraints (hereinafter, the abbreviation "s.t." will be used):

Ax + F(v) ≤ b, b ∈ R^m    (2)

x ∈ X ⊆ R^n, v ∈ V ⊆ R^q    (3)

He called them mixed-variables programming problems. In these problems, both the objective functions and the functions defining constraints were sums of two components: a linear one, dependent on the vector variable x ∈ X, and a nonlinear one, dependent on the vector variable v ∈ V. Benders called v a vector of complicating variables, because when they were fixed, the problem simplified—it became linear. In addition, he suggested that these variables can be discrete, i.e., the problem with respect to the x vector itself became continuous and was already very easy to solve with any linear programming solver (e.g., based on the simplex algorithm).

In his work [1], Benders proposed an iterative solution procedure for this optimization problem, consisting in solving alternately auxiliary problems with respect to one or the other vector of variables. These problems were related to the dual representation of the initial problem and to optimality conditions. We will call the problem with respect to the complicating variables v the master problem, and the problem with respect to the x variables the primal problem. In its subsequent launches, the primal problem provided the master problem with constraints related to the approximation of the objective function or of the feasible set, computed at different points of the X set. The former were added when, for a given trial vector v, there existed a solution feasible in the initial problem; the latter otherwise.
The generalization of this approach to nonlinear problems was presented by Arthur Geoffrion [2]. He called it the Generalized Benders Decomposition (GBD). As in the original work by Benders, the constraints in the master problem were defined with the help of Lagrangians of the respective primal problems. The objective values at the optimal points, obtained after each iteration of the master problem and of the primal problem (when the latter was feasible), were used to estimate the optimal objective of the initial problem. In the case of a minimization problem, the master problem provided the estimate from below, LBD (from the Lower BounD), and the primal problem from above, UBD (from the Upper BounD). When the primal problem was infeasible, the so-called feasibility problem, consisting in the minimization of the maximum exceedance of constraints, was solved. The stopping test consisted in checking whether the upper and the lower estimates of the objective function were equal with a given accuracy.
Geoffrion's approach in a slightly modified version was presented in the works of Floudas [3,4], which assumed that the v variables are binary and supplemented the inequality constraints of the problem with equality ones. Unfortunately, the equality constraints are not relaxed in the feasibility problem (unlike the inequality constraints), which can cause this approach to lead to a deadlock when, for a trial point selected from the set V, some of these constraints cannot be satisfied by any x ∈ X. Floudas provided several versions of the computing algorithm, but they concerned the same specific problems, actually being one algorithm written in different ways.
Unfortunately, neither the works of Floudas [3,4] nor Geoffrion's source work [2] presented the version of fundamental importance for problems convex with respect to the entire vector (x, v)—the one with linear cuts in the master problem. This is despite it being the easiest version to implement; moreover, it was mentioned already by Benders [1], and in more recent works on mixed-integer nonlinear programming [5,6] it is most often presented as the only one.
Deficiencies of the above-mentioned presentations of the Benders method and the lack
of a coherent and uniform description of this approach convinced the author of the need to
put this knowledge in order, which resulted in this work. The importance of this approach
is growing as more and more practical problems of mixed-integer nonlinear program-
ming are solved, related to, e.g., energy [7], telecommunications [8], transport [9,10], gas
networks [11], production processes [12], water systems [13], two-stage nonlinear stochas-
tic control [14,15]. When these problems exceed a certain size, even the best commercial
solvers fail [14]. It is then necessary to write one's own specialized code, using the structural properties of the problem. That is what the Benders method is great for. It is also a kernel of
many complicated algorithms solving optimization problems with nonconvex functions,
especially in the field of chemical engineering [16–20].
Generalized Benders Decomposition is not the only decomposition method to solve
MINLP problems. The other most popular classical approaches are: Lagrangian relaxation, branch and bound, column generation, outer approximation [6,21]. Recently, the
Alternating Directions Multiplier Method has been gaining a lot of popularity [22,23].

2. Problem Formulation
Let us consider an optimization problem of the following form:

min_{x,v} f(x, v)    (4)

s.t.

g(x, v) ≤ 0    (5)

x ∈ X ⊆ R^n    (6)

v ∈ V ⊆ R^q    (7)
where
f : R^n × R^q → R,
g : R^n × R^q → R^m
The possible equality constraints

h_j(x, v) = 0, j = 1, …, s    (8)

can be taken into account using standard transformations, e.g.,

h_1(x, v) ≤ 0
⋮
h_s(x, v) ≤ 0
∑_{j=1}^s h_j(x, v) ≥ 0    (9)

or
h_j^2(x, v) ≤ 0, j = 1, …, s    (10)
For the sake of simplicity, the equality constraints will be further omitted. We will
mention them in Section 8, suggesting a specific way to treat them in computational
algorithms.
Assumptions about the functions and sets appearing in the formulation will be given in the following statements. For now, we will only assume that they are such that a solution of the problem (4)–(7) exists. It is only worth saying something about the V set. It is supposed to be contained in R^q. Therefore, this formulation covers also MINLP problems (from Mixed Integer NonLinear Programming), that is, mixed continuous–discrete nonlinear problems, where V ⊆ Z^q; Z—the set of all integers.
It is suggested to define the vector v of complicating variables in such a way that, when we fix it, one of the following occurs:
1. The problem can be decomposed into a number of independent subproblems, each using a different subproblem vector x_i, i = 1, …, p:

x = (x_1, x_2, …, x_p)    (11)

Most often, the total objective function is additive:

min_{x,v} f(x, v) = min_{x_1, x_2, …, x_p, v} [ ∑_{i=1}^p f_i(x_i, v) + f_0(v) ]    (12)

s.t.

g_{ij}(x_i, v) ≤ 0, j = 1, …, m_i; i = 1, …, p    (13)

x_i ∈ X_i ⊆ R^{n_i}, i = 1, …, p, v ∈ V ⊆ R^q, ∑_{i=1}^p n_i = n    (14)

In this case, it is worthwhile to use not only decomposition, but also a parallel solution
to the resulting primal subproblems.
2. The problem with respect to x variables takes a specific structure for which efficient
algorithms exist—e.g., it is linear or quadratic.
3. The problem is convex with respect to x with v fixed and vice-versa, although it is
non-convex, if we consider it for the concatenated vector variable ( x, v).
Problems convex with respect to one subvector when the other is fixed, for example, bilinear problems, are very often formulated in engineering [12,24].

3. Examples of Optimization Problems That Can Be Solved with the Benders Method
1. Problem with a separable objective function and separable constraint functions:

min_{y_1,…,y_6} [ 2y_1^2 − y_1·y_2 + 4y_2^2 + (y_3 − 4)^2 + (y_4 − 3)^2 + 8y_5^2 + y_6^2 − 3y_5 ]    (15)

s.t.

y_1 ≥ 1, y_2 ≥ 0, y_3 ≥ 3, y_4 ≥ 2, y_5 ≥ 3, y_6 ≥ 0    (16)

y_1 + y_2 + y_3^2 + y_4^2 + y_5 + y_6^2 ≤ 50    (17)
In problems where both the objective function and the constraint functions are sums of functions dependent on subvectors of decision variables, to obtain the standard form (12)–(14) one needs to group the variables accordingly into subvectors and to introduce additional v variables separating the constraints.
Let us assume p = 3 and denote:

x_1 = (x_{1,1}, x_{1,2}) ≜ (y_1, y_2), x_2 = (x_{2,1}, x_{2,2}) ≜ (y_3, y_4), x_3 = (x_{3,1}, x_{3,2}) ≜ (y_5, y_6)    (18)
Now:

f_1(x_1) = 2x_{1,1}^2 − x_{1,1} x_{1,2} + 4x_{1,2}^2    (19)

X_1 = { [x_{1,1}, x_{1,2}] ∈ R^2 : x_{1,1} ≥ 1, x_{1,2} ≥ 0 }    (20)

f_2(x_2) = (x_{2,1} − 4)^2 + (x_{2,2} − 3)^2    (21)

X_2 = { [x_{2,1}, x_{2,2}] ∈ R^2 : x_{2,1} ≥ 3, x_{2,2} ≥ 2 }    (22)

f_3(x_3) = 8x_{3,1}^2 + x_{3,2}^2 − 3x_{3,1}    (23)

X_3 = { [x_{3,1}, x_{3,2}] ∈ R^2 : x_{3,1} ≥ 3, x_{3,2} ≥ 0 }    (24)

f_0(v) ≡ 0    (25)
Let us also introduce artificial variables v_1, v_2, v_3 for the components of the cumulative constraint (17) binding the primal subvector variables x_1, x_2 and x_3, respectively:

x_{1,1} + x_{1,2} ≤ v_1    (26)

x_{2,1}^2 + x_{2,2}^2 ≤ v_2    (27)

x_{3,1} + x_{3,2}^2 ≤ v_3    (28)

We obtain the following constraint functions in the problem (12)–(14) format:

g_{11}(x_1, v) = x_{1,1} + x_{1,2} − v_1    (29)

g_{21}(x_2, v) = x_{2,1}^2 + x_{2,2}^2 − v_2    (30)

g_{31}(x_3, v) = x_{3,1} + x_{3,2}^2 − v_3    (31)

as well as the V set:

V = { [v_1, v_2, v_3] ∈ R^3 : v_1 + v_2 + v_3 ≤ 50 }    (32)
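To make the decomposition tangible, here is a minimal sketch (in Python with NumPy and SciPy; not part of the original paper) that evaluates z(v) for this example by solving the three independent primal subproblems for an assumed trial point v; the trial point and the starting points are arbitrary choices.

import numpy as np
from scipy.optimize import minimize

v = np.array([10.0, 30.0, 10.0])   # assumed trial point, v1 + v2 + v3 <= 50

f = [lambda x: 2*x[0]**2 - x[0]*x[1] + 4*x[1]**2,   # f1, Equation (19)
     lambda x: (x[0] - 4)**2 + (x[1] - 3)**2,       # f2, Equation (21)
     lambda x: 8*x[0]**2 + x[1]**2 - 3*x[0]]        # f3, Equation (23)
g = [lambda x: x[0] + x[1],                         # left-hand sides of (26)-(28)
     lambda x: x[0]**2 + x[1]**2,
     lambda x: x[0] + x[1]**2]
bounds = [[(1, None), (0, None)],                   # X1, X2, X3: (20), (22), (24)
          [(3, None), (2, None)],
          [(3, None), (0, None)]]

z = 0.0
for i in range(3):   # the three subproblems are independent once v is fixed
    res = minimize(f[i], x0=[4.0, 4.0], bounds=bounds[i],
                   constraints=[{'type': 'ineq',
                                 'fun': lambda x, i=i: v[i] - g[i](x)}])
    z += res.fun
print('z(v) =', z)

Because the subproblems share no variables once v is fixed, the loop above could equally well be run in parallel, which is the point of choosing v this way.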
2. Problem with a chain (ring) of constraints.

In problems of this type, successive constraints bind several subsequent decision variables, i.e., every variable appears in several constraints (at least in two) together with the neighboring variables with a lower and a higher index. This structure is quite typical for optimal control problems after discretization of differential equations over time and replacing the resulting system of s nonlinear equations with s + 1 inequality constraints using the transformation (9). The idea for splitting the problem and obtaining the standard form (12)–(14) is to treat some variables in the vector (e.g., those which divide it into several equal parts) as complicating variables v_i, and then the constraints in which these variables appear as mixed constraints of the type (13).

Consider the following problem:

min_{y ∈ R^9} y_1^2 + y_2^2 + … + y_9^2

s.t.

y_{k+1} − y_k ≤ sin k, k = 1, …, 8

y_1 − y_9 ≤ 0.5
Let us denote:

v_1 ≜ y_3, v_2 ≜ y_6, v_3 ≜ y_9

The remaining elements will form subvectors of dimension 2. We will denote them as x_1, x_2, x_3, that is:

x_1 = (x_{1,1}, x_{1,2}) ≜ (y_1, y_2), x_2 = (x_{2,1}, x_{2,2}) ≜ (y_4, y_5), x_3 = (x_{3,1}, x_{3,2}) ≜ (y_7, y_8)
These problems can be presented in the format (12)–(14) assuming:

f_1(x_1) = x_{1,1}^2 + x_{1,2}^2    (33)

X_1 = { [x_{1,1}, x_{1,2}] ∈ R^2 : x_{1,2} − x_{1,1} ≤ sin 1 }    (34)

g_{11}(x_1, v) = x_{1,1} − v_3 − 0.5    (35)

g_{12}(x_1, v) = v_1 − x_{1,2} − sin 2    (36)

f_2(x_2) = x_{2,1}^2 + x_{2,2}^2    (37)

X_2 = { [x_{2,1}, x_{2,2}] ∈ R^2 : x_{2,2} − x_{2,1} ≤ sin 4 }    (38)

g_{21}(x_2, v) = x_{2,1} − v_1 − sin 3    (39)

g_{22}(x_2, v) = v_2 − x_{2,2} − sin 5    (40)

f_3(x_3) = x_{3,1}^2 + x_{3,2}^2    (41)

X_3 = { [x_{3,1}, x_{3,2}] ∈ R^2 : x_{3,2} − x_{3,1} ≤ sin 7 }    (42)

g_{31}(x_3, v) = x_{3,1} − v_2 − sin 6    (43)

g_{32}(x_3, v) = v_3 − x_{3,2} − sin 8    (44)

f_0(v) = ∑_{i=1}^3 v_i^2    (45)
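As a sanity check, this small problem can also be solved directly, without decomposition; the following sketch (assuming SciPy's default SLSQP treatment of inequality constraints) computes the reference optimum that the Benders iterations should approach.

import numpy as np
from scipy.optimize import minimize

# y_{k+1} - y_k <= sin k for k = 1,...,8, and y_1 - y_9 <= 0.5,
# written in SciPy's "fun(y) >= 0" convention
cons = [{'type': 'ineq',
         'fun': lambda y, k=k: np.sin(k + 1) - (y[k + 1] - y[k])}
        for k in range(8)]
cons.append({'type': 'ineq', 'fun': lambda y: 0.5 - (y[0] - y[8])})

res = minimize(lambda y: np.sum(y**2), x0=np.zeros(9), constraints=cons)
print(res.fun, res.x)   # reference optimum of the undecomposed problem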
4. Decomposition

The problem (4)–(7) can be represented as follows [2]:

min_{v∈V} inf_{x∈X} f(x, v)    (46)

s.t.

g(x, v) ≤ 0    (47)

The infimum in the internal problem appears due to the fact that for some v this problem can be unbounded.
Let us denote

z(v) ≜ inf_{x∈X} f(x, v)    (48)

s.t.

g(x, v) ≤ 0    (49)
We will call the problem (48)–(49), solved for a fixed v, the primal problem.
The problem (46)–(47) can now be written as

min_{v ∈ V∩V_0} z(v)    (50)

where

V_0 = { v : ∃x ∈ X, g(x, v) ≤ 0 }    (51)

and interpreted as a projection of the problem (46)–(47) onto the space of the variable v [2]. The formulation (50) is called the master problem.
The requirement in (50) that v ∈ V ∩ V_0 results from the necessity to guarantee the existence of a solution—the value of z(v). The set V_0 is called the solvability set.
The problem is that we only know z(v) and V0 indirectly, through their generic
definitions.
The following theorems are valid [2]:

Theorem 1 (Projection).
1. The problem (4)–(7) has no solution or is unbounded if and only if the same is true for the problem (50).
2. If (x̂, v̂) is the optimal solution of the problem (4)–(7), then v̂ is the optimal solution of the problem (50).
3. If v̂ is the optimal solution of the problem (50) and x̂ reaches the infimum in the problem (48)–(49) at v = v̂, then (x̂, v̂) is the optimal solution of the problem (4)–(7).

Theorem 2 (Representation of V_0).
Assume that X is a nonempty convex set and that the function g is convex on X for each fixed v ∈ V. Suppose further that the set

Z_v = { z ∈ R^m : ∃x ∈ X, g(x, v) ≤ z }

is closed for each fixed v ∈ V. Then a point v* ∈ V belongs also to the set V_0 if and only if:

inf_{x∈X} L_f(x, v*, λ) ≤ 0, ∀λ ∈ Λ    (52)

where

Λ = { λ ∈ R^m : λ ≥ 0, ∑_{j=1}^m λ_j = 1 }    (53)

and

L_f(x, v, λ) = λ^T g(x, v)    (54)

Theorem 3 (Representation of z(v)).
Assume that X is a nonempty convex set and that the functions f and g are convex on X for each fixed v = v* ∈ V. Assume further that for v*, at least one of the following conditions is met:
1. z(v*) is finite and in the problem (48)–(49) there exists an optimal vector of Lagrange multipliers;
2. z(v*) is finite, g(x, v*) and f(x, v*) are continuous on X, the set X is closed, and the set of optimal solutions of the problem (48)–(49) with an accuracy ε is nonempty and bounded for some ε ≥ 0.
Then

z(v) = sup_{λ≥0} inf_{x∈X} L_o(x, v, λ), ∀v ∈ V ∩ V_0    (55)

where

L_o(x, v, λ) = f(x, v) + λ^T g(x, v)    (56)

The last theorem results directly from the strong duality theorem [25].
Substituting in the problem (50) the expression (55) for z(v) and (52) for the v ∈ V0
constraint, we obtain an equivalent problem:

min_{v∈V} sup_{λ≥0} inf_{x∈X} L_o(x, v, λ)    (57)

s.t.

inf_{x∈X} L_f(x, v, λ) ≤ 0, ∀λ ∈ Λ    (58)

Using the definition of the supremum as the least upper bound and introducing an additional scalar variable µ, we obtain the following form of the master problem, equivalent to (57)–(58):

min_{v∈V, µ} µ    (59)

s.t.

inf_{x∈X} L_o(x, v, λ) ≤ µ, ∀λ ≥ 0    (60)

inf_{x∈X} L_f(x, v, λ) ≤ 0, ∀λ ∈ Λ    (61)

In practice, it can be assumed that the function z(v) (see formula (48)) is bounded for all v ∈ V, the set X is compact, and the functions f(x, v) and g(x, v) are continuous throughout the domain. Therefore, we can replace the infimum with the minimum. Then the master problem will take the following form:

min_{v∈V, µ} µ    (62)

s.t.

min_{x∈X} L_o(x, v, λ) ≤ µ, ∀λ ≥ 0    (63)

min_{x∈X} L_f(x, v, λ) ≤ 0, ∀λ ∈ Λ    (64)

where the L_o and L_f functions are given by the formulas (56) and (54), respectively.
The problem (62)–(64) is very difficult to solve due to the constraints, which have to be satisfied at an infinite and even uncountable number of points (for all λ with nonnegative coordinates, or with nonnegative coordinates summing up to unity), and due to the existence of internal optimization subproblems (minimization with respect to x).
These difficulties are overcome by relaxing this problem, more precisely by solving it with the use of successive, more and more precise approximations of the functions on the left-hand side of the constraints (63) and (64). They will be made up of pieces of the L_o or L_f functions related to the optimal solutions of the primal problem, depending on whether the next trial point v* from the master problem belongs to V_0 or not.

5. Basic Properties of the Primal Problem

The primal problem is the initial problem (4)–(7) solved for a fixed v = v* ∈ V:

min_{x∈X} f(x, v*)    (65)

s.t.

g(x, v*) ≤ 0    (66)
While solving this problem, there may be two cases: when the primal problem is
feasible, i.e., it has a solution, and when it has no solution.

5.1. The Case When the Primal Problem Has a Solution

Suppose that for v* obtained from the relaxed master problem, the primal problem (65)–(66) has a solution, to which corresponds the optimal vector of Lagrange multipliers λ*. These multipliers will be the multipliers of the new constraint of type (63). So a constraint of the type

min_{x∈X} L_o(x, v, λ*) ≤ µ    (67)

should be added to the relaxed master problem.

5.2. The Case When the Primal Problem Has No Solution

A more complicated situation occurs when the primal problem has no solution. According to Theorem 2, for a fixed v = v* the primal problem has a solution if the following condition is met:

min_{x∈X} L_f(x, v*, λ) = min_{x∈X} [ λ^T g(x, v*) ] ≤ 0, ∀λ ∈ Λ    (68)

where

Λ = { λ ∈ R^m : λ ≥ 0, ∑_{j=1}^m λ_j = 1 }    (69)

If a solver detects that there is no solution of the primal problem, one should find a vector λ̂ ∈ Λ for which

min_{x∈X} λ̂^T g(x, v*) > 0    (70)

Then it is taken that λ* = λ̂ and the following inequality is added to the relaxed master problem:

min_{x∈X} L_f(x, v, λ*) ≤ 0    (71)

Theorem 4. If the primal problem (65)–(66) is infeasible, the vector of Lagrange multipliers λ̂ ∈ Λ satisfying the condition (70) can be determined by solving the following auxiliary problem (the so-called feasibility problem):

min_{x∈X} max_{j=1,…,m} g_j(x, v*)    (72)

Proof. The problem (72) is equivalent to the problem:

min_{x∈X, α} α    (73)

s.t.

g_j(x, v*) ≤ α, j = 1, …, m    (74)
It certainly has a solution due to the previously made assumptions concerning the function g(x, v).
Without loss of generality, let us assume that the set X is defined by inequality constraints r(x) ≤ 0; that is, we have the problem:

min_{x,α} α    (75)

s.t.

g_j(x, v*) ≤ α, j = 1, …, m    (76)

r_j(x) ≤ 0, j = 1, …, t    (77)
We assume that all constraint functions g and r are continuously differentiable and convex, and that the points where they are active are regular. The Lagrangian for this problem is as follows:

L_{αf}(x, α, v*, λ) = α + ∑_{j=1}^m λ_j^g · (g_j(x, v*) − α) + ∑_{j=1}^t λ_j^r · r_j(x)    (78)

where λ^g ≥ 0, λ^r ≥ 0 are the vectors of Lagrange multipliers for the constraints (76) and (77), respectively, and λ = (λ^g, λ^r). From the Karush–Kuhn–Tucker optimality conditions for the α variable we obtain the equation:

1 − ∑_{j=1}^m λ̂_j^g = 0    (79)

that is, comparing with the definition of Λ (69), λ̂^g ∈ Λ. Taking into account the equality (79) in the Lagrangian formula (78) at the optimal point, as well as the complementarity conditions related to the constraints (77), we obtain:

L_{αf}(x*, α*, v*, λ̂) = ∑_{j=1}^m λ̂_j^g · g_j(x*, v*) = L_f(x*, v*, λ̂^g)    (80)

So the vector of Lagrange multipliers for the constraints (76), calculated at the optimal point of the problem (75)–(77), is the sought vector λ̂.

Modern optimization solvers usually return such a vector of Lagrange multipliers when they state that a feasible solution does not exist.
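As an illustration, the sketch below (a hypothetical helper, not library code) sets up the feasibility problem in the epigraph form (75)–(77) over the variables (x, α) and recovers λ̂ from the constraint multipliers. It assumes SciPy's trust-constr method, whose result object exposes the constraint multipliers in the attribute v; since multiplier sign conventions differ between solvers, absolute values are taken before the normalization suggested by (79).

import numpy as np
from scipy.optimize import minimize, NonlinearConstraint

def solve_feasibility(g, vstar, x0, bounds_x):
    """g(x, v) returns the m-vector of constraint values g_j(x, v)."""
    m = np.asarray(g(x0, vstar)).size
    z0 = np.append(x0, np.max(g(x0, vstar)))        # start: (x0, alpha0)
    # the m constraints g_j(x, v*) - alpha <= 0, cf. (76)
    con = NonlinearConstraint(lambda z: g(z[:-1], vstar) - z[-1],
                              -np.inf, np.zeros(m))
    res = minimize(lambda z: z[-1], z0, method='trust-constr',
                   constraints=[con],
                   bounds=list(bounds_x) + [(-np.inf, np.inf)])
    lam = np.abs(np.asarray(res.v[0]).ravel())      # multipliers of (76)
    s = lam.sum()
    lam_hat = lam / s if s > 0 else lam             # normalize onto Lambda, cf. (79)
    return res.x[:-1], res.x[-1], lam_hat           # x*, alpha*, lambda-hat

If the returned α* is positive, the condition (70) holds for λ̂ and the feasibility cut (71) can be added to the relaxed master problem.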

6. Basic Properties of the Master Problem: Cuts

It would be best if, from the solutions of the primal problem for different v = v*, it were possible to use not only the Lagrange multipliers at the optimal point, λ*, but also the optimal vectors of the primal variables, x*. To obtain a solution of the basic problem (4)–(7) using this approach, certain conditions must be met. The most important is as follows:

Assumption 1. Both Lagrangians L_o(x, v, λ) and L_f(x, v, λ) for any x ∈ X, v ∈ V and λ ≥ 0 can be written as composite functions:

L_o(x, v, λ) = Q_o(w_o(x, λ), v, λ)    (81)

L_f(x, v, λ) = Q_f(w_f(x, λ), v, λ)    (82)

where w_o, w_f are scalar functions of x and λ, and Q_o, Q_f are increasing in the first argument and convex in the second.
Theorem 5. Let us suppose that for the problem (4)–(7) Assumption 1 is satisfied. Let x* be the optimal solution of the problem

min_{x∈X} L_o(x, v*, λ*)    (83)

Then also for v ≠ v*

min_{x∈X} L_o(x, v, λ*) = L_o(x*, v, λ*)    (84)

Proof. According to Assumption 1, due to the increasing dependency of the Lagrange function on the first argument, from the formula (81) we have:

Q_o(w_o(x*, λ*), v*, λ*) = L_o(x*, v*, λ*) = min_{x∈X} L_o(x, v*, λ*) = min_{x∈X} Q_o(w_o(x, λ*), v*, λ*) = Q_o(min_{x∈X} w_o(x, λ*), v*, λ*)    (85)

Hence, taking into account the injectivity of strictly monotone functions:

w_o(x*, λ*) = min_{x∈X} w_o(x, λ*)    (86)

So, for v ≠ v*:

min_{x∈X} L_o(x, v, λ*) = min_{x∈X} Q_o(w_o(x, λ*), v, λ*) = Q_o(min_{x∈X} w_o(x, λ*), v, λ*) = Q_o(w_o(x*, λ*), v, λ*) = L_o(x*, v, λ*)    (87)

Consequently, if Assumption 1 is in effect, the constraint (63) in the master problem can be approximated around v = v* by the constraint:

µ ≥ L_o(x*, v, λ*)    (88)

where the vectors x*, λ* come from the solution of the primal problem (65)–(66). From now on, this constraint will be called a cut.
From Assumption 1, yet another convenience follows. Given the convexity of the Lagrange function L_o with respect to v, in the case when it is differentiable with respect to this vector, we obtain:

L_o(x*, v, λ*) ≥ L_o(x*, v*, λ*) + (∂L_o/∂v)^T (x*, v*, λ*) (v − v*)    (89)

In this way, the cut (88) can be relaxed and replaced in the master problem by a linear cut:

µ ≥ L_o(x*, v*, λ*) + (∂L_o/∂v)^T (x*, v*, λ*) (v − v*)    (90)

If the Lagrange function is not differentiable, the gradient can be replaced with a subgradient.
Analogous reasoning applies to the feasibility constraint (64).
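Where the gradient ∂L_o/∂v is inconvenient to derive analytically, the cut (90) can be assembled numerically; the helper below is a sketch under that assumption (the function and parameter names are this sketch's own, and forward differences stand in for the exact gradient or subgradient).

import numpy as np

def linear_optimality_cut(f, g, xstar, vstar, lamstar, h=1e-6):
    """Return (L0, grad) such that the cut reads: mu >= L0 + grad @ (v - vstar)."""
    vstar = np.asarray(vstar, dtype=float)
    def L(v):                                      # L_o(x*, v, lambda*), cf. (56)
        return f(xstar, v) + lamstar @ g(xstar, v)
    L0 = L(vstar)
    grad = np.array([(L(vstar + h * e) - L0) / h   # forward differences
                     for e in np.eye(vstar.size)])
    return L0, grad

An analogous helper built on L_f(x, v, λ) = λ^T g(x, v) produces the linear feasibility cut.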
It is easy to check that Assumption 1 is satisfied for two classes of problems:
1. Separable—when the f, g functions are sums of components dependent on x and v, that is:

f(x, v) = f_1(x) + f_2(v)    (91)

g(x, v) = g_1(x) + g_2(v)    (92)

where the components f_2, g_2 are convex on V.
2. Variable factor programming of the form:

min_{x,v} f(x, v) = ∑_{i=1}^q v_i f_i(x_i)    (93)

s.t.

g(x, v) = ∑_{i=1}^q v_i x_i − c ≤ 0    (94)

Av ≤ b, v ≥ 0, v ∈ R^q    (95)

x_i ≥ 0, x_i, c ∈ R^m    (96)

where v_i, i = 1, …, q are non-negative scalars and c, x_i, i = 1, …, q are vectors of the dimension m. A special case of this type of problem is the bilinear problem, which is nonconvex [24].
If the problem (4)–(7) is neither separable nor with variable factors, but convex with
respect to the full vector of variables ( x, v), the cuts (90) can also be used.

Theorem 6. For problems convex with respect to the full vector (x, v) the estimate (90) is valid.

Proof. For convex problems with differentiable functions (if the Lagrange function is not differentiable, the gradient can be replaced with a subgradient) we have, for fixed λ = λ* and all x and v:

L_o(x, v, λ*) ≥ L_o(x*, v*, λ*) + (∂L_o/∂x)^T (x*, v*, λ*) (x − x*) + (∂L_o/∂v)^T (x*, v*, λ*) (v − v*)    (97)

This inequality will be preserved when we compute the minimum with respect to x on both sides:

min_{x∈X} L_o(x, v, λ*) ≥ min_{x∈X} { L_o(x*, v*, λ*) + (∂L_o/∂x)^T (x*, v*, λ*) (x − x*) + (∂L_o/∂v)^T (x*, v*, λ*) (v − v*) } = L_o(x*, v*, λ*) + (∂L_o/∂v)^T (x*, v*, λ*) (v − v*) + min_{x∈X} (∂L_o/∂x)^T (x*, v*, λ*) (x − x*)    (98)

Note that the last component of the expression (98) is equal to zero, as the point x* is optimal for v = v*, λ = λ* (there is no feasible direction of improvement, i.e., with a negative directional derivative), so the inequality (90) is valid.

7. Computational Algorithm

We will now consider the case when at the optimal point x* of the primal problem, in the version both for v* ∈ V ∩ V_0, that is (65)–(66), and for v* ∈ V \ V_0, that is (73)–(74), there exists a vector of Lagrange multipliers λ*. Therefore, these problems must fulfill certain regularity conditions (more on that below).
A computational algorithm based on the generalized Benders method can be formulated as follows (a sketch of the whole loop in code is given after the listing):
GBD Algorithm
1. Choose a starting point v^0 ∈ V; set K_o := ∅, K_f := ∅, k := 0, UBD := +∞, LBD := −∞, and take a convergence tolerance ε > 0.
2. For the fixed v = v^k try to solve the primal problem (65)–(66).
• If successful, set k := k + 1, remember as x^k and λ^k the obtained optimal point and the optimal vector of Lagrange multipliers, and set K_o := K_o ∪ {k}. Update UBD = min{UBD, z(v^k)}. If there has been an improvement in the estimate of the upper bound UBD, remember the pair (x^k, v^k) as the best solution so far. If UBD − LBD ≤ ε, then STOP.
• Otherwise, solve the feasibility problem (73)–(74) (as far as the solver used does not solve it itself when facing infeasibility), set k := k + 1, remember as x^k and λ^k the obtained optimal point and the vector of optimal Lagrange multipliers corresponding to the constraints, and set K_f := K_f ∪ {k}.
3. Solve the relaxed master problem:
• in the nonlinear version:

min_{v∈V, µ} µ    (99)

s.t.

L_o(x^l, v, λ^l) ≤ µ, l ∈ K_o    (100)

L_f(x^l, v, λ^l) ≤ 0, l ∈ K_f    (101)

• or linearized:

min_{v∈V, µ} µ    (102)

s.t.

L_o(x^l, v^l, λ^l) + (∂L_o/∂v)^T (x^l, v^l, λ^l) (v − v^l) ≤ µ, l ∈ K_o    (103)

L_f(x^l, v^l, λ^l) + (∂L_f/∂v)^T (x^l, v^l, λ^l) (v − v^l) ≤ 0, l ∈ K_f    (104)
Let (v^k, µ^k) be the optimal solution of the above problem. Then µ^k is the lower estimate of the optimal objective of the original problem. Let LBD = µ^k. If UBD − LBD ≤ ε, then STOP.
4. Go to 2.
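The following is a high-level sketch of this loop in Python; solve_primal, solve_feasibility and solve_master are hypothetical helpers standing for the problems (65)–(66), (73)–(74) and the relaxed master (102)–(104) (the last one typically handled by a MILP solver when V is discrete).

import numpy as np

def gbd(v0, solve_primal, solve_feasibility, solve_master,
        eps=1e-4, max_iter=100):
    v, UBD, LBD, best = v0, np.inf, -np.inf, None
    opt_cuts, feas_cuts = [], []                     # play the role of K_o, K_f
    for _ in range(max_iter):
        feasible, x, lam, zval = solve_primal(v)     # step 2
        if feasible:
            opt_cuts.append((x, v, lam))             # data for cut (103)
            if zval < UBD:
                UBD, best = zval, (x, v)             # best solution so far
        else:
            x, alpha, lam = solve_feasibility(v)     # problem (73)-(74)
            feas_cuts.append((x, v, lam))            # data for cut (104)
        if UBD - LBD <= eps:
            break
        v, mu = solve_master(opt_cuts, feas_cuts)    # step 3
        LBD = mu                                     # lower bound estimate
        if UBD - LBD <= eps:
            break
    return best, UBD, LBD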

Convergence of the Algorithm

It has been proved that the GBD algorithm converges in a finite number of steps for any ε > 0, when [2]:
1. V is a finite discrete set and the assumptions of Theorems 2 and 3 for Case 1 are satisfied (then the same is also true for ε = 0).
2. V is a nonempty, compact subset of V_0, X is a nonempty compact convex set, the functions f and g are convex on X for each fixed v ∈ V and continuous on X × V, the set of optimal Lagrange multipliers in the problem (65)–(66) is non-empty for all v ∈ V_0, and the constraints satisfy Slater's regularity condition: ∃ x̄ ∈ X, v̄ ∈ V: g(x̄, v̄) < 0. In the case when the inequality constraints result from the application of the transformation (9) or (10) to the equality constraints (8), Slater's condition will be fulfilled when they are relaxed using the epsilon tube method.
3. V is not a subset of V_0, the constraint function g(x, v) is separable, and the set X is defined using linear constraints; the other conditions are as in point 2 [4].
A very important assumption is that Lagrange multipliers exist for all v ∈ V ∩ V_0. If this is not checked, serious errors can appear. Consider the following example given (in the context of a slightly different approach to decomposition of optimization problems) by Grothey et al. [26]. For the sake of completeness, an X set has been added, defined as a box being a Cartesian product of the intervals [−10, 10] along every coordinate. It does not matter for the reasoning.
min_{x_1, x_2, v} v^2 − x_2    (105)

s.t.

(x_1 − 1)^2 + x_2^2 ≤ ln v    (106)

(x_1 + 1)^2 + x_2^2 ≤ ln v    (107)

v ≥ 1    (108)

−10 ≤ x_i ≤ 10, i = 1, 2    (109)
The admissible area for the case when v > e is presented in Figure 1.

[Figure 1. Cross-section through the admissible set in the problem (105)–(109) for v > e: the intersection of two disks of radius sqrt(ln v), centered at x_1 = −1 and x_1 = 1 on the x_1 axis.]

For v* = e we will have the problem:

min_{x_1, x_2} e^2 − x_2    (110)

s.t.

(x_1 − 1)^2 + x_2^2 ≤ 1    (111)

(x_1 + 1)^2 + x_2^2 ≤ 1    (112)

−10 ≤ x_i ≤ 10, i = 1, 2    (113)
Its optimal solution is the only feasible point, (0, 0). It is easy to check that at this point the Lagrange multipliers for the constraints (111)–(112) do not exist, so the expressions (100) and (103) lose their sense. This results from the fact that the Fiacco–McCormick regularity condition (that the gradients of active constraints are linearly independent) is not satisfied at this point. Interestingly, the problem (105)–(109), considered with respect to the full vector (x, v), is convex and regular—the condition is met also at the point (x, v) = (0, 0, e).
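This degeneracy is easy to verify numerically; the snippet below (an illustration added here, not taken from [26]) checks that the x-gradients of the two constraints active at (0, 0) are collinear, while the objective gradient is not in their span.

import numpy as np

x = np.zeros(2)                                  # the only feasible point
grad_g1 = np.array([2*(x[0] - 1), 2*x[1]])       # gradient of (x1 - 1)^2 + x2^2 - 1
grad_g2 = np.array([2*(x[0] + 1), 2*x[1]])       # gradient of (x1 + 1)^2 + x2^2 - 1
grad_f = np.array([0.0, -1.0])                   # gradient of e^2 - x2 w.r.t. x
J = np.vstack([grad_g1, grad_g2])
print(np.linalg.matrix_rank(J))                  # prints 1: LICQ fails at (0, 0)
# Both constraint gradients lie on the x1-axis, while grad_f has a nonzero
# x2-component, so no multipliers can satisfy the KKT stationarity condition.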

8. Effective Algorithms for Solving the Master Problem for Discrete Variables in the Version with Linear Cuts

Here we will discuss how to effectively solve the linear master problem (102)–(104) when V ⊆ Z^q.
The GBD algorithm has some serious disadvantages. The most important of them is the number of constraints, which grows with every solution of the master problem, making it more complicated and, hence, extending its maximum and average execution time (average in a certain window, because there can always be short single executions of the master problem, e.g., if the next optimal point v* is close to a given starting point, but this is not typical). Moreover, there is no guarantee that in the next steps the value UBD of the approximation of the feasible solution will decrease. Therefore, it is worthwhile to use more sophisticated algorithms.
Nowadays the most useful as well as the most commonly used are the algorithms
based on the cutting plane method, especially in conjunction with the interior point
method [27].
The classical cutting plane method (more precisely, the "cutting off plane method") by Kelley [28], to which the master problem is reduced in the version with linearized cuts, has a number of disadvantages. First of all, it converges slowly. There are assessments which state that, to reach a solution with an accuracy of ε > 0, this method requires O(1/ε^{q+1}) iterations [29]. A serious problem, mentioned above, is also that, as the calculations progress, the growing number of cuts causes subsequent iterations to be longer and longer. Unfortunately, despite many studies having been conducted, including recent machine learning techniques [8], there are no reliable and simple rules for removing old cuts, even those which are inactive at the current solution of the problem (102)–(104) [30]. One can also observe some kind of instability, when the next point generated by the algorithm may be far from the previous one, even though the previous one was already close to the optimum, or even at it [25]. This effect may be alleviated by additional distance control constraints for each set of binary variables, limiting the integer update to a certain distance from the actual integer variable [31].
The above drawbacks have been largely removed through modifications of the Kelley method [27]. Unfortunately, from the most popular of them, the bundle method, which consists in adding to the linear approximation of the objective function a quadratic proximal component (a penalty for deviation from the last significant solution) [32,33], very little can be expected in theory in the discrete case, and "the mere production of a feasible point is never guaranteed" [34].
Much better, perhaps the best in this (i.e., MINLP) case, is the largest inscribed sphere method [35,36]. In this method, the concept of a localization set is used. For the best estimate so far (say, up to the k-th iteration of the master problem) of the upper optimal value of the objective function of the initial problem—let us denote it by UBD_k—it will be the set:

L_k = { (v, µ) ∈ Z^q × R :
    µ ≤ UBD_k,
    φ(v^l) + ∇φ(v^l)^T (v − v^l) ≤ µ, l ∈ K_o,
    ξ(v^l) + ∇ξ(v^l)^T (v − v^l) ≤ 0, l ∈ K_f }    (114)
where:

φ(v^l) = L_o(x^l, v^l, λ^l) = f(x^l, v^l) + (λ^l)^T g(x^l, v^l)    (115)

∇φ(v^l) = ∂L_o/∂v (x^l, v^l, λ^l) = ∂f/∂v (x^l, v^l) + ∂g/∂v (x^l, v^l) λ^l    (116)

ξ(v^l) = L_f(x^l, v^l, λ^l) = (λ^l)^T g(x^l, v^l)    (117)

∇ξ(v^l) = ∂L_f/∂v (x^l, v^l, λ^l) = ∂g/∂v (x^l, v^l) λ^l    (118)
Note that, by assuming u = (v, µ), we can express the constraints that describe this set in the following way:

A u ≤ b    (119)

where the matrix A is S × (q + 1), with S = |K_o| + |K_f| + 1. The largest inscribed sphere method is one of the central point methods (from the larger family of interior point methods), which rely on looking, in the subsequent steps k, not for the point minimizing the objective function (102), but for a point which, according to some measure, lies "in the middle" of the localization set L_k (114).
The center of the largest inscribed sphere for the set U = { u ∈ Z^q × R : A u ≤ b } (called also the Chebyshev center) is determined by solving the linear programming problem:

max_{u ∈ Z^q × R, σ ≥ 0} σ    (120)

s.t.

a_i^T u + ||a_i|| σ ≤ b_i, i = 1, …, S    (121)
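For the continuous relaxation of the localization set (the exact problem (120)–(121) additionally requires the v-part of u to be integer, so in general it calls for a MILP solver), the Chebyshev center can be obtained with an ordinary LP solver; below is a sketch using scipy.optimize.linprog, which minimizes, so −σ is minimized.

import numpy as np
from scipy.optimize import linprog

def chebyshev_center(A, b):
    """Largest inscribed sphere of {u : A u <= b}; variables are (u, sigma)."""
    S, d = A.shape
    norms = np.linalg.norm(A, axis=1)            # the ||a_i|| in (121)
    A_ub = np.hstack([A, norms[:, None]])        # a_i^T u + ||a_i|| sigma <= b_i
    c = np.zeros(d + 1)
    c[-1] = -1.0                                 # maximize sigma
    res = linprog(c, A_ub=A_ub, b_ub=b,
                  bounds=[(None, None)] * d + [(0, None)])
    return res.x[:d], res.x[-1]                  # center u and radius sigma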
The advantage of this method is, in addition to the linear formulation, the ease of eliminating unnecessary (distant) cuts, based on zero values of the Lagrange multipliers (inactive constraints (121)), if in subsequent iterations of the master problem the value of σ̂ decreased [35]. This method needs O((q + 1) ln(1/ε)) iterations to achieve the optimal solution with the accuracy ε [29].
The next possibility is solving at the master level, instead of (102)–(104), the so-called outer approximation (OA) problem [37]:

min_{x∈X, v∈V, µ∈R} µ    (122)

s.t.

f(x^l, v^l) + ∇f(x^l, v^l)^T (x − x^l; v − v^l) ≤ µ, l ∈ K_o    (123)

g(x^l, v^l) + ∇g(x^l, v^l)^T (x − x^l; v − v^l) ≤ 0, l ∈ K_f    (124)
GBD can be regarded as a particular case of the OA method (because the constraints in GBD are linear combinations of those in OA), and the lower bounds of GBD are weaker (that is, not of a greater value) than those of OA. This means that OA used at the master level requires a smaller number of iterations, however at a higher cost of solving the master problem, since the number of constraints added per iteration is greater [38].
Instead of repeatedly calling an external Mixed Integer Linear Programming (MILP) solver at the master level, which is time-costly, more advanced users can write their own specialized solvers which avoid it. The simplest way is to use the branch-and-cut algorithm proposed in [39], which combines the above OA method and the branch and bound algorithm. Cuts—both optimality and feasibility—are made when the relaxed (to continuous) master problem delivers a discrete solution v* ∈ V; otherwise, if the relaxed solution is better than the best discrete solution so far, a partition of the domain is performed (branching) along a chosen variable v_i* with a fractional value or, in the opposite case, the given region (node) is fathomed [39]. Some improvements to this approach were proposed in [40].
In Boolean problems, that is, where V ⊆ {0, 1}^q, it may be useful to add the following cuts, assuring that the old solutions will not appear again [41]:

∑_{j∈B^l} v_j − ∑_{j∈N^l} v_j ≤ |B^l| − 1, l ∈ K_o ∪ K_f    (125)

where B^l = { j : v_j^l = 1 } and N^l = { j : v_j^l = 0 }.
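A generator of such cuts can be a one-liner; the sketch below (the helper name is this sketch's choice) returns the coefficients of the inequality a^T v ≤ rhs.

import numpy as np

def no_good_cut(vl):
    # coefficients of (125): the sum over B^l of v_j minus the sum over N^l
    # of v_j must not exceed |B^l| - 1
    a = np.where(np.asarray(vl) > 0.5, 1.0, -1.0)   # +1 on B^l, -1 on N^l
    rhs = np.count_nonzero(a > 0) - 1.0             # |B^l| - 1
    return a, rhs

For example, for v^l = (1, 0, 1) the cut is v_1 − v_2 + v_3 ≤ 1, which is violated by v^l itself and by no other binary point.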


Last but not least, it is always possible to extend the master level algorithm with some
heuristics or machine learning techniques, connected more or less to specific features of the
practical problems solved, e.g., [8,42–48].
The latest tests confirm that, owing to the Benders method, a problem as big as the three-scenario batch plant design problem, with 1956 binary and 57,616 continuous variables, can be solved with a 9.45% relative optimality gap in 729 s on a machine with 12 Intel Xeon processors, while the general solver BARON delivers, in 50,000 s, a solution with a 22.3% optimality gap (such solvers as SBB, Alpha-ECP and DICOPT failed to deliver a feasible solution in 50,000 s) [14].

9. Conclusions
The Benders method, although it was proposed sixty years ago, is still very vital.
It is often used, especially in the version generalized by Geoffrion, to solve many large
practical problems in the field of technology, mainly related to all types of networks,
e.g., telecommunications, energy, gas, transport. These problems are characterized by
mixed, discrete–continuous nature of decision variables, which is often accompanied by a
nonconvexity of objective functions and constraints, but of such a type that, after fixing
one of two subvectors of the problem variables, the problem becomes convex with respect
to the second subvector (such are, for example, bilinear problems). Then the Benders
method converges in a finite number of steps to exact solutions. This method copes well
with the infeasibility of primal problems—with respect to the vector x—for some values of
the complicating variables v, solving the feasibility problem from which new constraints
are obtained—feasibility cuts. When the primal problem is feasible, the optimality cut is
generated, eliminating part of the admissible set of complicating variables V, for which the
objective function has worse values than those obtained so far.
There are two versions of the Benders method: with nonlinear and with linear cuts. Both can be used when the condition of independence of the solution of the Lagrangian minimization problem in the primal problem from the vector of complicating variables is met and the Lagrangian is convex with respect to the complicating variables. Linear cuts can also be used in problems convex with respect to the full vector (x, v).
The method with linear cuts (which in fact is a version of the Kelley cutting plane method) seems to be more practical, due to the ease of solving the master problem, e.g., using linear or quadratic solvers. It has many advantages, especially when using the largest inscribed sphere method (from the area of nondifferentiable optimization), the outer approximation method or the outer-approximation-based branch-and-cut method. In particular practical problems, heuristics and machine learning techniques can additionally increase the efficiency of the master level algorithm.

Funding: This research received no external funding.


Institutional Review Board Statement: Not applicable.
Informed Consent Statement: Not applicable.
Data Availability Statement: Not applicable.
Conflicts of Interest: The author declares no conflict of interest.

Abbreviations
The following abbreviations are used in this manuscript:

ADMM   Alternating Directions Multiplier Method
GBD    Generalized Benders Decomposition
LBD    Lower Bound
UBD    Upper Bound
MILP   Mixed Integer Linear Programming
MINLP  Mixed Integer Nonlinear Programming
OA     outer approximation (algorithm)

References
1. Benders, J.F. Partitioning Procedures for Solving Mixed-Variables Programming Problems. Numer. Math. 1962, 4, 238–252.
[CrossRef]
2. Geoffrion, A.M. Generalized Benders Decomposition. J. Optim. Theory Appl. 1972, 10, 237–260. [CrossRef]
3. Floudas, C.A. Nonlinear and Mixed-Integer Optimization; Oxford University Press: New York, NY, USA, 1995.
4. Floudas, C.A. Generalized Benders Decomposition GBD. In Encyclopedia of Optimization, 2nd ed.; Floudas, C.A., Pardalos, P.M.,
Eds.; Springer: Boston, MA, USA, 2009.
5. Lee, J.; Leyffer, S. Mixed Integer Nonlinear Programming; Springer: New York, NY, USA, 2012.
6. Li, D.; Sun, X. Nonlinear Integer Programming; Springer: New York, NY, USA, 2006.
7. Chung, K.H.; Kim, B.H.; Hur, D. Distributed implementation of generation scheduling algorithm on interconnected power
systems. Energy Convers. Manag. 2011, 52, 3457–3464. [CrossRef]
8. Lee, M.Y.; Ma, N.; Yu, G.D.; Dai, H.Y. Accelerating Generalized Benders Decomposition for Wireless Resource. IEEE Trans.
Wirel. Commun. 2021, 20, 1233–1247. [CrossRef]
9. Geoffrion, A.M.; Graves, G.W. Multicommodity Distribution System Design by Benders Decomposition. Manag. Sci. 1974, 20,
822–844. [CrossRef]
10. Lu, J.; Gupte, A.; Huang, Y. A mean-risk mixed integer nonlinear program for transportation network protection. Eur. J. Oper. Res.
2018, 265, 277–289. [CrossRef]
11. Li, X. Parallel nonconvex generalized Benders decomposition for natural gas production network planning under uncertainty.
Comput. Chem. Eng. 2013, 55, 97–108. [CrossRef]
12. Osman, H.; Demirli, K. A bilinear goal programming model and a modified Benders decomposition algorithm for supply chain
reconfiguration and supplier selection. Int. J. Prod. Econ. 2010, 124, 97–105. [CrossRef]
13. Cai, X.; McKinney, D.C.; Lasdon, L.S.; Watkins, D.W., Jr. Solving Large Nonconvex Water Resources Management Models Using
Generalized Benders Decomposition. Oper. Res. 2001, 49, 235–245. [CrossRef]
14. Li, C.; Grossmann, I.E. An improved L-shaped method for two-stage convex 0–1 mixed integer nonlinear stochastic programs. Comput. Chem. Eng. 2018, 112, 165–179. [CrossRef]
15. Li, C.; Grossmann, I.E. A finite ε convergence algorithm for two-stage stochastic convex nonlinear programs with mixed-binary
first and second-stage variables. J. Glob. Optim. 2019, 75, 921–947. [CrossRef]
16. Li, X.; Tomasgard, A.; Barton, P.I. Nonconvex Generalized Benders Decomposition for Stochastic Separable Mixed-Integer
Nonlinear Programs. J. Optim. Theory Appl. 2011, 151, 425–454. [CrossRef]
17. Floudas, C.A.; Aggarwal, A.; Ciric, A.R. Global Optimum Search for Nonconvex NLP and MINLP Problems. Comput. Chem. Eng.
1989, 13, 1117–1132. [CrossRef]
18. Aggarwal, A.; Floudas, C.A. A Decomposition Approach for Global Optimum Search in QP, NLP and MINLP Problems.
Ann. Oper. Res. 1990, 25, 119–146. [CrossRef]
19. Türkay, M.; Grossmann, I.E. Logic-Based MINLP Algorithms for the Optimal Synthesis of Process Networks. Comput. Chem. Eng.
1996, 20, 959–978. [CrossRef]
20. Li, X.; Chen, Y.; Barton, P.I. Nonconvex Generalized Benders Decomposition with Piecewise Convex Relaxations for Global
Optimization of Integrated Process Design and Operation Problems. Ind. Eng. Chem. Res. 2012, 51, 7287–7299. [CrossRef]
21. Nowak, I. Relaxation and Decomposition Methods for Mixed Integer Nonlinear Programming; Birkhäuser: Basel, Switzerland, 2005.
22. Boyd, S. Distributed optimization and statistical learning via the alternating direction method of multipliers. Found. Trends
Mach. Learn. 2011, 3, 1–122. [CrossRef]
23. Li, X.; Bi, S.; Wang, H. Optimizing Resource Allocation for Joint AI Model Training and Task Inference in Edge Intelligence
Systems. IEEE Wirel. Commun. Lett. 2021, 10, 532–536. [CrossRef]
24. Adams, W.P.; Sherali, H.D. Mixed-integer bilinear programming problems. Math. Program. 1993, 59, 279–305. [CrossRef]
25. Bertsekas, D.P. Nonlinear Programming, 2nd ed.; Athena Scientific: Belmont, MA, USA, 1999.
26. Grothey, A.; Leyffer, S.; McKinnon, K.I.M. A Note on Feasibility in Benders Decomposition. In Numerical Analysis Report NA/188;
Dundee University: Dundee, UK, 1999.
27. Elhedhli, S.; Goffin, J.-L.; Vial, J.-P. Nondifferentiable optimization: Cutting plane methods. In Encyclopedia of Optimization, 2nd
ed.; Floudas, C.A., Pardalos, P.M., Eds.; Springer: Boston, MA, USA, 2009.
28. Kelley, J.E. The Cutting-Plane Method for Solving Convex Programs. J. Soc. Ind. Appl. Math. 1960, 8, 703–712. [CrossRef]
29. Goffin, J.-L.; Vial, J.-P. Convex nondifferentiable optimization: A survey focused on the analytic center cutting plane method.
Optim. Methods Softw. 2002, 17, 805–867. [CrossRef]
30. Ruszczyński, A. Nonlinear Optimization; Princeton University Press: Princeton, NJ, USA, 2006.
31. Franke, M.B. Mixed-integer optimization of distillation sequences with Aspen Plus: A practical approach. Comput. Chem. Eng.
2019, 131, 106583. [CrossRef]
32. Lemaréchal, C. An extension of Davidon methods to nondifferentiable problems. In Mathematical Programming Study 3;
Balinski, M.L., Wolfe, P., Eds.; North-Holland: Amsterdam, The Netherlands, 1975; pp. 95–109.
33. Kiwiel, K.C. Methods of Descent for Nondifferentiable Optimization; Springer: Berlin/Heidelberg, Germany, 1985.
34. Daniilidis, A.; Lemaréchal, C. On a primal-proximal heuristic in discrete optimization. Math. Program. Ser. A 2005, 104, 105–128.
[CrossRef]
35. Elzinga, J.; Moore, T.G. A central cutting plane algorithm for the convex programming problem. Math. Program. 1975, 8, 134–145.
[CrossRef]
36. Kronqvist, J.; Bernal, D.E.; Lundell, A.; Westerlund, T. A center-cut algorithm for quickly obtaining feasible solutions and solving
convex MINLP problems. Comput. Chem. Eng. 2019, 122, 105–113. [CrossRef]
37. Quesada, I.; Grossmann, I.E. An LP/NLP Based Branch and Bound Algorithm for Convex MINLP Optimization Problems.
Comput. Chem. Eng. 1992, 16, 937–947. [CrossRef]
38. Grossmann, I.E. Review of Nonlinear Mixed-Integer and Disjunctive Programming Techniques. Optim. Eng. 2002, 3, 227–252.
[CrossRef]
39. Bonami, P.; Biegler, L.T.; Conn, A.R.; Cornuejols, G.; Grossmann, I.E.; Laird, C.D.; Lee, J.; Lodi, A.; Margot, F.; Sawaya, N.; et al. An algorithmic framework for convex mixed integer nonlinear programs. Discret. Optim. 2008, 5, 186–204. [CrossRef]
40. Su, L.J.; Tang, L.X.; Grossmann, I.E. Computational strategies for improved MINLP algorithms. Comput. Chem. Eng. 2015, 75,
40–48. [CrossRef]
41. Duran, M.A.; Grossmann, I.E. An outer-approximation algorithm for a class of mixed-integer nonlinear programs. Math. Program.
1986, 36, 307–339. [CrossRef]
42. Raman, R.; Grossmann, I.E. Integration of Logic and Heuristic Knowledge in MINLP Optimization for Process Synthesis.
Comput. Chem. Eng. 1992, 16, 155–171. [CrossRef]
43. Sirikum, J.; Techanitisawad, A.; Kachitvichyanukul, V. A new efficient GA-Benders’ decomposition method: For power generation
expansion planning with emission controls. IEEE Trans. Power Syst. 2007, 22, 1092–1100. [CrossRef]
44. Franke, M.B.; Nowotny, N.; Ndocko, E.N.; Górak, A.; Strube, J. Design and Optimization of a Hybrid Distillation/Melt
Crystallization Process. AICHE J. 2008, 54, 2925–2942. [CrossRef]
45. Naoum-Sawaya, J.; Elhedhli, S. A Nested Benders Decomposition Approach for Telecommunication Network Planning.
Nav. Res. Logist. 2010, 57, 519–539. [CrossRef]
46. Chen, S.; Geunes, J. Optimal allocation of stock levels and stochastic customer demands to a capacitated resource. Ann. Oper. Res.
2013, 203, 33–54. [CrossRef]
47. Marufuzzaman, M.; Eksioglu, S.D. Managing congestion in supply chains via dynamic freight routing: An application in the
biomass supply chain. Transp. Res. Part E-Logist. Transp. Rev. 2017, 99, 54–76. [CrossRef]
48. Meshgi, H.; Zhao, D.M.; Zheng, R. Optimal Resource Allocation in Multicast Device-to-Device Communications Underlaying
LTE Networks. IEEE Trans. Veh. Technol. 2017, 66, 8357–8371. [CrossRef]
