Sameh Hosny Math MSC Dissertation

A Thesis
By
Sameh Hosny
2017
Abstract
Many problems in the field of wireless communication consist of a very large number of variables
and constraints and therefore fit in the platform of large scale linear programming.
Advances in computing over the past decade have allowed us to routinely solve
such large scale linear programs. There are many software packages that implement
methods for doing so, e.g. AMPL, GAMS and Matlab.
This dissertation explains some of these techniques, in particular the delayed column
generation method and the decomposition method. It also gives
concrete examples of how to use various software packages to solve large scale linear
programming problems stemming from examples in the context of wireless
communication.
To the soul of my father, to my beloved mother, to my great wife, Doaa Eid
Acknowledgments
Ghaith Hiary. You have been a tremendous mentor for me. It has been an honor for
me to be one of your students. I appreciate all your contributions of time and ideas to
make my M.Sc. experience productive and stimulating. The joy and enthusiasm you
have for your research was contagious and motivational for me, even during tough
times in the M.Sc. pursuit. I am also thankful to all the professors who taught me
from the math department. I am really grateful to them all for their dedication and
devotion to the courses they educate. These courses helped me create a strong and
The members of the IPS lab have contributed immensely to my personal and
professional time at The Ohio State University. The group has been a source of
friendships as well as good advice and collaboration. I am especially thankful to
my colleague John Tadrous for his continuous help and generosity. He was always
willing to help me during my study. Moreover, I am thankful to my colleague Faisal Alotaibi for the great time we
spent working together and having useful technical discussions in our group meetings.
My time at OSU was made enjoyable in large part due to the many friends and
groups that became a part of my life. I would like to extend my special thanks to my
best Egyptian friend Sameh Shohdy and his great family for their kindness, support,
and hospitality. They supported me and my family until everything was settled
in Columbus. I also express my thanks to our great American friends, Betty Rocke
and Randy, for supporting our stay in Columbus and helping my son Mohammed.
Lastly, I would like to thank my family for all their love and encouragement: for
my parents who raised me with a love of science and supported me in all my pursuits,
and most of all for my loving, supportive, encouraging, and patient wife Doaa Eid,
whose faithful support during all stages of this M.Sc. is so appreciated.
Vita
Publications
S. Hosny, F. Alotaibi, H. E. Gamal and A. Eryilmaz, "Towards a P2P mobile contents
trading," 2015 49th Asilomar Conference on Signals, Systems and Computers, Pacific
Grove, CA, 2015, pp. 338-342.
Fields of Study
Table of Contents
Page
Abstract . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . ii
Dedication . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . iii
Acknowledgments . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . iv
Vita . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . vi
List of Tables . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . x
List of Figures . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . xi
1. Introduction . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 1

2. Review of Linear Programming . . . . . . . . . . . . . . . . . . . . . . . 3

2.1 Introduction . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 3
2.2 Geometry of a Linear Program . . . . . . . . . . . . . . . . . . . . 7
2.3 Degeneracy . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 12
2.4 The Simplex Method . . . . . . . . . . . . . . . . . . . . . . . . . . 13
2.4.1 Implementation of the Simplex Method . . . . . . . . . . . 16
2.4.2 Comparisons and Performance Enhancements . . . . . . . . 20
2.5 The Duality Theory . . . . . . . . . . . . . . . . . . . . . . . . . . 24
2.6 Example Problems for Wireless Communication Networks . . . . . 27
2.6.1 Power Control in a Wireless Network . . . . . . . . . . . . . 27
2.6.2 Multicommodity Network Flow . . . . . . . . . . . . . . . . 28
2.6.3 D2D Caching Networks . . . . . . . . . . . . . . . . . . . . 30
3. Large Scale Linear Programs . . . . . . . . . . . . . . . . . . . . . . . . . 33
Appendices . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 68
Bibliography . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 79
List of Tables
Table Page
2.2 The Different Possibilities for the Primal and Dual Problems . . . . . . . . 26
List of Figures
Figure Page
Chapter 1: Introduction
The importance of Linear Programming (LP) derives in part from its many ap-
plications and in part from the existence of efficient techniques to solve it. These
techniques are fast and reliable over a substantial range of problem sizes, inputs and
applications. Linear programming has been proven to be valuable for modeling di-
Industries that make use of LP and its extensions include transportation, energy,
variables and constraints. This makes the problem more complicated and requires
fast memory and higher computational speed. For this reason, a number of special-
ized procedures, such as column generation and cutting-plane methods, have been
developed to effectively solve such large-scale linear programs. Yet, in other cases,
the LP problem may have a special structure where the decomposition methods can
be useful.
This dissertation draws on example problems from
wireless communications, e.g. network flow, power control, caching networks, etc.
Most of these applications deal with a very large number of variables and constraints.
For example, caching networks deal with a very large number of users and a tremendous
number of data items. In this dissertation, we study linear
programming methods for such large scale problems. We also investigate some software
packages to implement and solve these problems. Thanks to the advances in
computing over the past decade, linear programs in a few thousand variables and
constraints are nowadays viewed as "small" problems. Problems having tens, or even
hundreds, of thousands of variables and constraints can be handled by software
packages such as AMPL, GAMS, Matlab, etc. Large-scale LP software packages utilize
special techniques from numerical linear algebra, such as sparse matrix techniques,
together with refinements developed through years of experience. However, this is
not the focus of this dissertation. This dissertation presents some common
large scale LP algorithms, and shows how to use software packages that implement
them, such as AMPL, GAMS, and Matlab, to solve some large scale LP examples from the context of wireless
communication.
Chapter 2: Review of Linear Programming
2.1 Introduction
A linear programming problem is an optimization problem in which the objective function
is linear in the unknowns and the constraints consist of linear equalities and linear
inequalities [1]. In a general linear programming problem, we are given a cost vector
c = (c1, · · · , cn)′ and we seek to minimize a linear cost function c′x = Σⁿᵢ₌₁ cᵢxᵢ over
all n-dimensional vectors x, subject to a set of linear equality and
inequality constraints. Any linear program can be transformed into the following
standard form:
minimize c1x1 + c2x2 + · · · + cnxn
subject to a11x1 + a12x2 + · · · + a1nxn = b1
a21x1 + a22x2 + · · · + a2nxn = b2
· · ·
am1x1 + am2x2 + · · · + amnxn = bm
x1 ≥ 0, x2 ≥ 0, · · · , xn ≥ 0,
where the bi's, ci's and aij's are fixed real constants, and the xi's are real numbers to
be determined. We always assume that each equation has been multiplied by minus
unity, if necessary, so that each bi ≥ 0. In matrix notation, a linear programming problem of the form

minimize c′x
subject to Ax = b        (2.2)
x ≥ 0

is said to be in standard form.
The variables x1, · · · , xn are called decision variables, and a vector x satisfying all of
the constraints is called a feasible solution or feasible vector. The set of all feasible
solutions is called the feasible set or feasible region. The function c′x is called the
objective function or cost function. A feasible solution x∗ that minimizes the objective
function (that is, c′x∗ ≤ c′x for all feasible x) is called an optimal feasible solution
or, simply, an optimal solution. The value of c′x∗ is then called the optimal cost.

An equality constraint a′ix = bi is equivalent to the two constraints a′ix ≤ bi
and a′ix ≥ bi. In addition, any constraint of the form a′ix ≤ bi can be rewritten as
(−ai)′x ≥ −bi. Finally, constraints of the form xj ≥ 0 or xj ≤ 0 are special cases of
constraints of the form a′ix ≥ bi, where ai is a unit vector and bi = 0. We conclude that
the feasible set in a general linear programming problem can be expressed exclusively
in terms of inequality constraints of the form a′ix ≥ bi. Suppose that there is a total
of m such constraints.
Example 1. The following is a linear programming problem:

minimize x1 − x2 + x3
subject to x1 + x2 + x4 ≤ 2
x2 − x3 = 5
x3 + x4 ≥ 3
x1 ≥ 0
x3 ≤ 0.

The same problem can be rewritten exclusively in terms of inequality constraints of
the form a′ix ≥ bi:

minimize x1 − x2 + x3
subject to −x1 − x2 − x4 ≥ −2
x2 − x3 ≥ 5
−x2 + x3 ≥ −5
x3 + x4 ≥ 3
x1 ≥ 0
−x3 ≥ 0.
A general linear programming problem can always be transformed into an equivalent
problem in standard form (2.2). We say that the two problems are equivalent, that is,
given a feasible solution to one problem, we can construct a feasible solution to the
other, with the same cost. In particular, the two problems have the same optimal cost
and given an optimal solution to one problem, we can construct an optimal solution
to the other. The transformation relies on two main devices:
(a) Elimination of free variables: Any free (unrestricted in sign) variable xj can be
written as the difference xj = xj+ − xj− of two nonnegative
variables on which we impose the sign constraints xj+ ≥ 0 and xj− ≥ 0.

(b) Elimination of inequality constraints: An inequality constraint Σⱼ aij xj ≤ bi can
be put in standard form by introducing a slack variable si and the constraints
Σⱼ aij xj + si = bi, si ≥ 0. Similarly, an inequality constraint Σⱼ aij xj ≥ bi can be put in standard
form by introducing a surplus variable si and the constraints Σⱼ aij xj − si =
bi, si ≥ 0.
As an example, consider a problem with the constraints

x1 + x2 ≥ 3
3x1 + 2x2 = 14
x1 ≥ 0,

in which x2 is a free variable. Eliminating x2 via x2 = x2+ − x2− and introducing
a surplus variable x3 for the first constraint, we obtain the equivalent standard form
constraints

x1 + x2+ − x2− − x3 = 3
3x1 + 2x2+ − 2x2− = 14
x1, x2+, x2−, x3 ≥ 0.
For example, given the feasible solution (x1, x2) = (6, −2) to the original problem, we
obtain the feasible solution (x1, x2+, x2−, x3) = (6, 0, 2, 1) to the standard form problem, which
has the same cost. Conversely, given the feasible solution (x1, x2+, x2−, x3) = (8, 1, 6, 0) to
the standard form problem, we obtain the feasible solution (x1, x2) = (8, −5) to the original
problem, with the same cost.
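The correspondence in this example can be spelled out in code. The helper names below are ours, for illustration only; x3 denotes the surplus variable of the constraint x1 + x2 ≥ 3:

```python
def to_standard(x1, x2):
    """Map a feasible (x1, x2) of the original problem to the
    standard-form variables (x1, x2+, x2-, x3): the free variable
    x2 is split as x2 = x2+ - x2-, and x3 is the surplus variable
    of the constraint x1 + x2 >= 3."""
    return (x1, max(x2, 0), max(-x2, 0), x1 + x2 - 3)

def from_standard(x1, x2p, x2m, x3):
    """Recover a feasible (x1, x2) of the original problem."""
    return (x1, x2p - x2m)

print(to_standard(6, -2))         # -> (6, 0, 2, 1)
print(from_standard(8, 1, 6, 0))  # -> (8, -5)
```

Either direction preserves feasibility, which is exactly the equivalence of the two formulations described above.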
We can also visualize standard form problems geometrically. For example, consider
the problem
minimize − x1 − x2
subject to x1 + 2x2 ≤3
2x1 + x2 ≤3
x1 , x2 ≥ 0.
The feasible set is the shaded region in Figure 2.1. In order to find an optimal
solution, we identify the cost vector c = (−1, −1) and for any given z, we consider
the line described by the equation −x1 − x2 = z. We change z to move this line in the
direction of the vector −c as much as possible as long as we do not leave the feasible
region. The best we can do is z = −2, and the vector x = (1, 1) is an optimal solution.
In a standard form problem with m equality constraints, the feasible set lies in an
(n − m)-dimensional set. For example, a feasible set in R3 defined by a single equality
constraint together with x1, x2, x3 ≥ 0 is a two-dimensional triangle lying in a three-dimensional
space. Furthermore, each edge of the triangle corresponds to one of the inequality
constraints. The optimal solution lies inside the shaded triangle in Figure 2.2.
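The graphical solution of the two-variable example above can be double-checked by brute force: a vertex of the feasible set is the intersection of two constraint boundaries that satisfies all the constraints, and an optimal solution can be found among the vertices. The following is only an illustrative sketch of our own, not one of the methods discussed later:

```python
from itertools import combinations

# constraints of the example, written as a . x <= b:
#   x1 + 2x2 <= 3,  2x1 + x2 <= 3,  -x1 <= 0,  -x2 <= 0
A = [(1, 2), (2, 1), (-1, 0), (0, -1)]
b = [3, 3, 0, 0]
c = (-1, -1)  # cost vector of  minimize -x1 - x2

def vertices():
    """Yield the vertices: intersections of two constraint
    boundaries that satisfy every constraint."""
    for (a1, b1), (a2, b2) in combinations(zip(A, b), 2):
        det = a1[0] * a2[1] - a1[1] * a2[0]
        if abs(det) < 1e-12:
            continue  # parallel boundaries do not intersect
        x1 = (b1 * a2[1] - b2 * a1[1]) / det   # Cramer's rule
        x2 = (a1[0] * b2 - a2[0] * b1) / det
        if all(a[0] * x1 + a[1] * x2 <= bb + 1e-9 for a, bb in zip(A, b)):
            yield (x1, x2)

best = min(vertices(), key=lambda x: c[0] * x[0] + c[1] * x[1])
print(best)  # -> (1.0, 1.0), with cost -2
```

This confirms the graphical answer x = (1, 1) with optimal cost −2. Of course, such enumeration is hopeless for large problems, which is precisely why the simplex method visits only a cost-reducing sequence of vertices.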
[Figure 2.1: the feasible set of the example, with the level sets −x1 − x2 = z of the cost function and the cost vector c.]

[Figure 2.2: (a) an n-dimensional view of the feasible set; (b) an (n − m)-dimensional view of the same set.]
In general, a linear programming problem may exhibit several possibilities:

(a) There exists a unique optimal solution.

(b) There exist multiple optimal solutions; in this case, the set of optimal solutions
is a face of the feasible set.

(c) The optimal cost is −∞, and no feasible solution is optimal.

(d) The feasible set is empty, and the problem is infeasible.
As a preliminary investigation, we can say that if the problem has at least one
optimal solution, then an optimal solution can be found among the corners of the
feasible set. This idea is the core of how to solve a linear program. To develop it
formally, we need a few definitions.
Definition 1. A polyhedron is a set that can be described in the form {x ∈ Rn | Ax ≥ b},
where A is an m × n matrix and b is a vector in Rm.

Definition 2. A vector x ∈ P is an extreme point of a polyhedron P if we
cannot find two vectors y, z ∈ P, both different from x, and a scalar λ ∈ [0, 1] such that
x = λy + (1 − λ)z.

An alternative definition, which can also be used to identify a unique optimal solution, is
that of a vertex of a polyhedron.

Definition 3. A vector x ∈ P is a vertex of a polyhedron P if there exists some c such
that c′x < c′y for every y ∈ P with y ≠ x.
Definition 4. If a vector x∗ satisfies a′ix∗ = bi for some i in M1, M2, or M3, we say that the
corresponding constraint is active or binding at x∗.
If there are n constraints that are active at a vector x∗, then x∗ satisfies a certain
system of n linear equations in n unknowns. This system has a unique solution if
and only if these n equations are "linearly independent". Since we have m equality
constraints in the standard form problems, we need to find n − m inequality constraints
which are also active. Once we have n linearly independent active constraints, a
unique vector x∗ is determined; however, it may violate some of the remaining constraints, in
the latter case we say that we have a basic (but not basic feasible) solution.
Definition 5. (a) A vector x∗ is a basic solution if:

(i) All equality constraints are active at x∗;

(ii) Out of the constraints that are active at x∗, there are n of them that are linearly
independent.

(b) If x∗ is a basic solution that satisfies all of the constraints, we say that it is a basic
feasible solution.
Theorem 1. Let P be a nonempty polyhedron and let x∗ ∈ P . Then, the following are
equivalent: (a) x∗ is a vertex; (b) x∗ is an extreme point; (c) x∗ is a basic feasible solution.
Every basic solution must satisfy the equality constraints Ax = b, which provides
m active constraints. In order to obtain a total of n active constraints, we
choose n − m of the variables xi and set them to zero, which makes the corresponding
nonnegativity constraints xi ≥ 0 active. Let B(1), · · · , B(m) be the indices of the
remaining variables, and suppose that the corresponding columns AB(1), · · · , AB(m)
of A are linearly independent. We then
solve the system of m equations BxB = b for the unknowns xB(1), · · · , xB(m). Here
the variables xB(1), · · · , xB(m) are called basic variables; the remaining variables are
called nonbasic. The columns AB(1), · · · , AB(m) are called basic columns and, since
they are linearly independent, they form a basis of Rm. By arranging the m basic
columns next to each other, we obtain the basis matrix B; we
can also define a vector xB with the values of the basic variables. Thus,

B = [AB(1) AB(2) · · · AB(m)],   xB = (xB(1), · · · , xB(m))′.

The basic variables are determined by solving the equation BxB = b, whose unique
solution is xB = B−1b.
Theorem 3. Consider the LP problem of minimizing c′x over a polyhedron P. Suppose that
P has at least one extreme point and that there exists an optimal solution. Then, there exists
an optimal solution which is an extreme point of P.
2.3 Degeneracy
A basic solution was defined by requiring n linearly
independent active constraints. This allows for the possibility of the number of active
constraints being greater than n, of which at most n
can be linearly independent. This also means that we may have more than n − m
variables with the value of zero. In this case, we say that we have a degenerate basic
solution.

Definition 6. A basic solution x is said to be degenerate if more than n
constraints are active at x. In other words, for a standard form problem, x is degenerate
if more than n − m of the components of x take the value zero.
If the entries of A or b were chosen at random, this would almost never happen.
For example, consider the polyhedron defined by the constraints

x1 + x2 + 2x3 ≤ 8
x2 + 6x3 ≤ 12
x1 ≤ 4
x2 ≤ 6
x1, x2, x3 ≥ 0.
The vector x = (2, 6, 0) is a nondegenerate basic feasible solution, because there
are exactly three active and linearly independent constraints, namely x1 + x2 + 2x3 ≤ 8,
x2 ≤ 6, and x3 ≥ 0. The vector x = (4, 0, 2) is a degenerate basic feasible solution,
because there are four active constraints, three of them linearly independent.
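Degeneracy can be checked mechanically by counting the active constraints at a point. The helper below is our own illustration; it lists each constraint of the example (including the sign constraints) as a pair of a coefficient row and a bound:

```python
def active_constraints(x, rows, tol=1e-9):
    """Return the indices of constraints a'x (<=, >= or =) b
    that hold with equality at the point x."""
    return [i for i, (a, bound) in enumerate(rows)
            if abs(sum(ai * xi for ai, xi in zip(a, x)) - bound) <= tol]

# constraints of the example: x1+x2+2x3 <= 8, x2+6x3 <= 12,
# x1 <= 4, x2 <= 6, and the sign constraints x1, x2, x3 >= 0
rows = [((1, 1, 2), 8), ((0, 1, 6), 12), ((1, 0, 0), 4),
        ((0, 1, 0), 6), ((1, 0, 0), 0), ((0, 1, 0), 0), ((0, 0, 1), 0)]

print(len(active_constraints((2, 6, 0), rows)))  # 3 active: nondegenerate
print(len(active_constraints((4, 0, 2), rows)))  # 4 active: degenerate
```

With n = 3, the point (2, 6, 0) has exactly three active constraints, while (4, 0, 2) has four, of which only three can be linearly independent in R3.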
2.4 The Simplex Method

If a linear program in standard form has an optimal solution, then there exists a
basic feasible solution that is optimal. The simplex method searches for an optimal
solution by moving from one basic feasible solution to another, along the edges of
the feasible set, in a cost reducing direction. For general optimization problems, a
locally optimal solution need not be globally optimal. In linear programming, however, local
optimality implies global optimality, because we are minimizing a convex cost function
over a convex set. Therefore, the simplex method terminates once a locally optimal
(hence optimal) solution is found.
Now suppose that we are at a point x ∈ P and that we move away from x along the
jth basic direction d, obtained by selecting a nonbasic variable xj
and increasing it to a positive value θ, while keeping the remaining nonbasic variables
at zero. At the same time, the vector xB of basic variables changes to xB + θdB, where
dB is the vector with the basic components of the direction d. Since we are interested
in feasible solutions, we
require A(x + θd) = b, and since x is feasible, we have Ax = b. Thus, for θ > 0, we
need Ad = 0. Then

0 = Ad = Σⁿᵢ₌₁ Ai di = Σᵐᵢ₌₁ AB(i) dB(i) + Aj = B dB + Aj,

which yields

dB = −B−1Aj.
Now, if d is the jth basic direction, then the rate c′d of cost change along the direction
d is given by c′B dB + cj, where cB = (cB(1), · · · , cB(m)). This is defined as the reduced
cost c̄j = cj − c′B B−1Aj of moving in this direction. Note that cj is the cost per unit
increase in the variable xj, and the term −c′B B−1Aj is the cost of the compensating
change in the basic variables. For a basic variable xB(i), we have

c̄B(i) = cB(i) − c′B B−1AB(i) = cB(i) − c′B ei = cB(i) − cB(i) = 0,

that is, the reduced cost of every basic variable is zero. The following theorem illustrates
the importance of the reduced costs.
Theorem 4. Consider a basic feasible solution x associated with a basis matrix B, and let
c̄ be the corresponding vector of reduced costs. (a) If c̄ ≥ 0, then x is optimal. (b) If x is
optimal and nondegenerate, then c̄ ≥ 0.

Thus, to decide whether a nondegenerate basic feasible solution is
optimal, we need only to check whether all reduced costs are nonnegative, which is
the same as examining the n − m basic directions. If x is a degenerate basic feasible
solution, an analogous result
is not available. Therefore, to assert that a certain basic solution is optimal, we need
it to satisfy two conditions, feasibility and nonnegativity of the reduced costs:

(a) B−1b ≥ 0;

(b) c̄′ = c′ − c′B B−1A ≥ 0′.
A basis matrix B satisfying these two conditions is said to be optimal; the associated basic solution then satisfies
the optimality conditions, and is therefore optimal. Let us assume that every basic
feasible solution is nondegenerate, and suppose
that we have computed the reduced costs c̄j of the nonbasic variables. If the reduced
cost c̄j of a nonbasic variable xj is negative, the jth basic direction d is a feasible
direction of cost reduction. While moving along this direction d, the nonbasic variable
xj becomes positive and all other nonbasic variables remain at zero. We describe this
situation by saying that xj (or Aj) enters or is brought into the basis, replacing a
variable that exits the basis.
The following theorem states that, in the nondegenerate case, the simplex method
Theorem 5. Assume that the feasible set is nonempty and that every basic feasible solution
is nondegenerate. Then, the simplex method terminates after a finite number of iterations.
Algorithm 1 An iteration of the simplex method
1. We start with a basis consisting of the basic columns AB(1) , · · · , AB(m) and an asso-
ciated basic feasible solution x.
2. Compute the reduced costs c̄j = cj − c′B B−1Aj for all nonbasic indices j. If they are
all nonnegative, then x is optimal and the algorithm terminates; else, choose some j
for which c̄j < 0.
3. Compute u = B−1Aj. If no component of u is positive, the optimal cost is −∞,
and the algorithm terminates.

4. If some component of u is positive, let θ∗ = min{xB(i)/ui | ui > 0}.

5. Let l be such that θ∗ = xB(l)/ul. Form a new basis by replacing AB(l) with Aj. If y is
the new basic feasible solution, the values of the new basic variables are yj = θ∗ and
yB(i) = xB(i) − θ∗ui for i ≠ l.
The simplex method terminates in one of two ways:

(a) We have an optimal basis B and an associated basic feasible solution which is optimal.

(b) We have found a vector d satisfying Ad = 0, d ≥ 0, c′d < 0, and the optimal cost is
−∞.
2.4.1 Implementation of the Simplex Method

From the previous section, we notice that the vectors B−1Aj play a key role in the
simplex method. If these vectors are available, the reduced costs c̄j, the direction of
motion d, and the step size θ∗ are easily computed. Thus, the main difference between
alternative implementations lies in the way that the vectors B−1Aj are computed and
in the amount of related information carried from one iteration to the next.
Naive Implementation
At the beginning of a typical iteration, we have the indices B(1), · · · , B(m) of the
current basic variables. For the basis matrix B, we compute the vector p′ = c′B B−1,
which is called the vector of simplex multipliers associated with the basis B. The
reduced cost of any variable xj is then obtained by c̄j = cj − p′Aj. Depending on
the pivoting rule employed, we may have to compute all of the reduced costs or we
may compute them one at a time until a variable with a negative reduced cost is
encountered. Once a column Aj is selected to enter the basis, we solve the linear
system Bu = Aj for u = B−1Aj, so that we
can form the direction along which we will be moving away from the current basic
feasible solution. We finally determine θ∗ and the variable that will exit the basis, and
construct the new basic feasible solution. This iteration is repeated until all reduced
costs are nonnegative. Note that each iteration requires the solution of two linear
systems of equations, p′B = c′B and Bu = Aj.

Revised Simplex Method

In the revised simplex method, the
matrix B−1 is made available at the beginning of each iteration, and the vectors
c′B B−1 and B−1Aj are computed by matrix-vector multiplication. However, we
need an efficient method for updating the matrix B−1 at each basis change. Let
B = [AB(1) · · · AB(m)] be the basis matrix at the beginning of an iteration and let
B̄ = [AB(1) · · · AB(l−1) Aj AB(l+1) · · · AB(m)] be the basis matrix at the beginning
of the next iteration. These two basis matrices have the same columns except
that the lth column AB(l) (the one that exits the basis) has been replaced by Aj
17
(the one that enters the basis). Thus, B−1 contains information that can be exploited
in the computation of B̄−1. Since B−1B = I, we see that B−1AB(i) is the ith unit
vector ei, and therefore

B−1B̄ = [e1 · · · el−1 u el+1 · · · em],   where u = B−1Aj.

We can now apply a sequence of elementary row operations that will change the
above matrix to the identity matrix. This sequence of elementary row operations
is equivalent to left-multiplying B−1B̄ by a certain invertible matrix Q; since the result
is the identity, we
have QB−1B̄ = I, which yields QB−1 = B̄−1. So, applying the same sequence of
row operations to the matrix B−1, we obtain B̄−1. A typical iteration of the revised
simplex method includes the same steps as in Algorithm 1, with one more step added
at the end to compute B̄−1: form the m × (m + 1) matrix [B−1 | u] and add to each one
of its rows a multiple of the lth row so as to make the last column equal to the unit vector
el; the first m columns of the result give B̄−1.
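The update just described can be sketched as follows. This is an illustrative dense implementation of our own: given B−1 and u = B−1Aj, the elementary row operations that turn u into the lth unit vector are applied to B−1 to produce B̄−1.

```python
def update_inverse(Binv, u, l):
    """Given Binv = B^-1 and u = B^-1 A_j, apply the elementary row
    operations that turn u into the l-th unit vector; applying the same
    operations to B^-1 yields the inverse of the new basis matrix, in
    which A_j has replaced the l-th basic column."""
    m = len(Binv)
    new = [row[:] for row in Binv]
    new[l] = [v / u[l] for v in new[l]]      # scale the pivot row
    for i in range(m):
        if i != l and u[i]:
            # subtract u[i] times the (scaled) pivot row from row i
            new[i] = [vi - u[i] * vl for vi, vl in zip(new[i], new[l])]
    return new

# tiny check: B = I, replace column 0 by A_j = (1, 2); then u = (1, 2),
# the new basis matrix is [[1, 0], [2, 1]], whose inverse is below
print(update_inverse([[1.0, 0.0], [0.0, 1.0]], [1.0, 2.0], 0))
# -> [[1.0, 0.0], [-2.0, 1.0]]
```

The update costs O(m²) operations, in contrast to the O(m³) cost of inverting B̄ from scratch, which is the point of the revised simplex method.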
Full Tableau Implementation

Instead of storing and updating the matrix B−1, we store and update the m ×
(n + 1) matrix [B−1b | B−1A], with columns B−1b and B−1A1, · · · , B−1An. This matrix
is called the simplex tableau. The column B−1b, called the zeroth column, contains
the values of the basic variables. The column B−1 Ai is called the ith column of the
tableau. The column u = B−1 Aj corresponding to the variable that enters the basis
is called the pivot column. If the lth basic variable exits the basis, the lth row of
the tableau is called the pivot row. Finally, the element belonging to both the pivot
row and the pivot column is called the pivot element. Note that the pivot element
is ul and is always positive (unless u ≤ 0, in which case the algorithm has already
terminated with optimal cost −∞). Note that given the current basis matrix B, the equality constraint Ax = b can
be rewritten as B−1b = B−1Ax, which is precisely the information in the tableau.
At the end of each iteration, we need to update the tableau [B−1b | B−1A] and compute
[B̄−1b | B̄−1A]. This can be accomplished following the same idea as in the revised simplex
method. To determine the exiting column AB(l) and the step size θ∗, Steps 4 and 5 in
Algorithm 1 amount to the following: xB(i)/ui is the ratio of the ith entry in the
zeroth column of the tableau to the ith entry in the pivot column of the tableau. We
only consider those i for which ui is positive. The smallest such ratio is equal to θ∗ and
determines the index l. It is customary to augment the simplex tableau by a top row, to
be referred to as the zeroth row. The entry at the top left corner contains the value
−c′B xB, which is the negative of the current cost. The rest of the zeroth row is the
row vector of reduced costs c̄′ = c′ − c′B B−1A. The structure of the tableau therefore is:

−c′B B−1b | c′ − c′B B−1A
B−1b | B−1A
Algorithm 2 An iteration of the full tableau implementation
1. A typical iteration starts with the tableau associated with a basis matrix B and the
corresponding basic feasible solution x.
2. Examine the reduced costs in the zeroth row of the tableau. If they are all nonnega-
tive, the current basic feasible solution is optimal, and the algorithm terminates; else,
choose some j for which c̄j < 0.
3. Consider the vector u = B−1 Aj , which is the jth column (the pivot column) of the
tableau. If no component of u is positive, the optimal cost is −∞, and the algorithm
terminates.
4. For each i for which ui is positive, compute the ratio xB(i) /ui . Let l be the index of a
row that corresponds to the smallest ratio. The column AB(l) exits the basis and the
column Aj enters the basis.
5. Add to each row of the tableau a constant multiple of the lth row (the pivot row)
so that ul (the pivot element) becomes one and all other entries of the pivot column
become zero.
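As an illustration of the mechanics of Algorithm 2, the following is a minimal dense implementation of our own for small problems. It assumes, for simplicity, that the last m columns of A form an identity matrix (e.g. slack variables with b ≥ 0), so that an initial basic feasible solution is available; it has no anti-cycling rule and does not exploit sparsity:

```python
def simplex_tableau(c, A, b):
    """Full tableau simplex for  min c'x  s.t.  Ax = b, x >= 0,
    starting from the basis given by the last m columns of A
    (assumed to form an identity matrix, with b >= 0).
    Returns (x, optimal cost), or None if the cost is unbounded."""
    m, n = len(A), len(A[0])
    basis = list(range(n - m, n))
    # rows 1..m of the tableau hold [B^-1 b | B^-1 A]; row 0 holds
    # the negative of the current cost and the reduced costs
    tab = [[0.0] * (n + 1) for _ in range(m + 1)]
    for i in range(m):
        tab[i + 1][0] = float(b[i])
        for j in range(n):
            tab[i + 1][j + 1] = float(A[i][j])
    for j in range(n):
        tab[0][j + 1] = float(c[j])
    for i, bi in enumerate(basis):  # zero out the basic reduced costs
        if c[bi]:
            f = c[bi]
            tab[0] = [v0 - f * v1 for v0, v1 in zip(tab[0], tab[i + 1])]
    while True:
        # Step 2: entering column = any negative reduced cost
        j = next((k for k in range(1, n + 1) if tab[0][k] < -1e-9), None)
        if j is None:
            break  # all reduced costs nonnegative: optimal
        rows = [i for i in range(1, m + 1) if tab[i][j] > 1e-9]
        if not rows:
            return None  # Step 3: pivot column u <= 0, cost is -infinity
        # Step 4: ratio test picks the exiting row
        l = min(rows, key=lambda i: tab[i][0] / tab[i][j])
        basis[l - 1] = j - 1
        # Step 5: make the pivot element one, the rest of the column zero
        piv = tab[l][j]
        tab[l] = [v / piv for v in tab[l]]
        for i in range(m + 1):
            if i != l and tab[i][j]:
                f = tab[i][j]
                tab[i] = [vi - f * vl for vi, vl in zip(tab[i], tab[l])]
    x = [0.0] * n
    for i, bi in enumerate(basis):
        x[bi] = tab[i + 1][0]
    return x, -tab[0][0]

# the example of Section 2.2 in standard form (x3, x4 are slack variables):
#   min -x1 - x2   s.t.   x1 + 2x2 + x3 = 3,   2x1 + x2 + x4 = 3
x, cost = simplex_tableau([-1, -1, 0, 0],
                          [[1, 2, 1, 0], [2, 1, 0, 1]],
                          [3, 3])
print(x, cost)  # optimal at x1 = x2 = 1 with cost -2
```

On this instance, the method starts from the slack basis, brings x1 and then x2 into the basis, and stops with all reduced costs nonnegative, recovering the optimal vertex found graphically.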
2.4.2 Comparisons and Performance Enhancements

Computing the inverse of B, or solving a linear system of the form Bx = b, takes O(m3) arithmetic
operations. Computing the reduced costs
of all variables requires O(mn) arithmetic operations, because we need to form the
inner product of the vector p with each one of the nonbasic columns Aj. Thus, the
total computational effort per iteration, for the naive implementation, is O(m3 + mn).
In some cases, however, the matrix B has a special structure whereby the systems p′B = c′B and
Bu = Aj can be solved very fast, in which case the naive implementation can be of
practical interest.
The full tableau method requires a constant (and small) number of arithmetic
operations for updating each entry of the tableau. Thus, the amount of computation
per iteration is proportional to the size of the tableau, which is O(mn). The revised
simplex method uses similar computations to update B−1 and c′B B−1, and since only
O(m2) entries are updated, the computational requirements per iteration are O(m2).
In addition, the reduced cost of each variable xj can be computed by forming the inner
product p′Aj, which requires O(m) operations. In the worst case, the reduced cost
of every variable is computed, for a total of O(mn) computations per iteration; the
worst-case computational effort per iteration is therefore O(mn)
under either implementation. On the other hand, if we consider a pivoting rule that
evaluates one reduced cost at a time, until a negative reduced cost is found, a typical
iteration of the revised simplex method might require a lot less work. In the best case,
if the first reduced cost computed is negative, and the corresponding variable is chosen
to enter the basis, the total computational effort is only O(m2 ). The conclusion is
that the revised simplex method cannot be slower than the full tableau method, and
can be much faster in the best case, where the computational requirements per iteration
are reduced from O(mn) to O(m2). As n is often much larger than m, this effect can
be quite significant. It could be counterargued that the memory requirements of the
revised simplex method are also O(mn) because of the need to store the matrix A.
However, in most large scale problems that arise in applications, the matrix A is very
sparse (has many zero entries) and can be stored compactly. (Note that the sparsity
of A does not usually help in the storage of the full simplex tableau because even if A
and B are sparse, B−1A is not sparse in general). The following table summarizes this
discussion, where memory is the storage space required, time is the worst-case computational
effort per iteration, and best-case time applies when the first computed reduced cost is
negative.

Full tableau: memory O(mn), time O(mn), best-case time O(mn)
Revised simplex: memory O(m2), time O(mn), best-case time O(m2)
Some ideas from numerical linear algebra can help us to enhance the performance
of the simplex method. The following are some examples of these ideas:
1. Recall that at each iteration of the revised simplex method, the inverse basis
matrix B−1 is updated according to certain rules. Each such iteration may
introduce roundoff errors that accumulate, so it is customary to recompute
("reinvert") the matrix B−1 from scratch once in a while. The efficiency of such
computations can be
greatly enhanced by using suitable data structures and certain techniques from
numerical linear algebra.
2. Now, suppose that a reinversion has just been carried out and B−1 is available.
Subsequent to the current iteration of the revised simplex method, we have the
option of generating explicitly and storing the new inverse basis matrix B̄−1.
An alternative is to store instead the matrix Q, introduced earlier, such
that QB−1 = B̄−1. Note that Q basically prescribes which elementary row
operations are to be performed; it is essentially an identity
matrix, and can be completely specified in terms of m coefficients: for each row,
we need to know what multiple of the pivot row must be added to it.
3. Subsequent to a "reinversion," one does not usually compute B−1 explicitly; instead,
B−1 is represented in terms of sparse triangular factors with a special
structure.
These methods are designed to accomplish two objectives: improve numerical stability
(minimize the effect of roundoff errors) and exploit sparsity in the problem data
to improve both running time and memory requirements. These methods have a
major impact in practice: besides producing
trustworthy results, they can also speed up considerably the running time of the
simplex method. Duality helps us to check the correctness of an obtained solution by
solving both the primal and dual problems and comparing the results. Therefore,
we review the duality theory in the next section.
2.5 The Duality Theory

Consider the standard form problem

minimize c′x
subject to Ax = b
x ≥ 0,

which we call the primal problem; let x∗ be an optimal solution and assume it exists.
We form a relaxed problem in which the constraint Ax = b is replaced by a penalty term
p′(b − Ax), where p is a price vector of the same dimension as b. In the relaxed problem
we have

minimize c′x + p′(b − Ax)
subject to x ≥ 0.
Let g(p) be the optimal cost for the relaxed problem, as a function of the price vector
p. We see that g(p) is no larger than the optimal primal cost c′x∗, since

g(p) = min_{x≥0} [c′x + p′(b − Ax)] ≤ c′x∗ + p′(b − Ax∗) = c′x∗,

where the inequality follows from the fact that x∗ is a feasible solution to the
primal problem, and satisfies Ax∗ = b. Thus, each p leads to a lower bound g(p) for
the optimal cost c′x∗. The problem
maximize g(p)
subject to no constraints,

which is interpreted as a search for the tightest possible lower bound of this type,
is known as the dual problem. Now, using the definition of g(p), we have

g(p) = min_{x≥0} [c′x + p′(b − Ax)] = p′b + min_{x≥0} (c′ − p′A)x.
Note that

min_{x≥0} (c′ − p′A)x = 0 if c′ − p′A ≥ 0′, and −∞ otherwise.

In maximizing g(p), we therefore only need to consider those values of p for which g(p) is not equal to
−∞, and the dual problem becomes

maximize p′b
subject to p′A ≤ c′.

Moreover, if we transform the dual into an equivalent minimization problem and then
form its dual, we obtain a problem equivalent to the original problem, i.e. "the dual
of the dual is the primal". Since g(p) provides a lower bound for the optimal cost,
we obtain the following theorem.

Theorem 6. (Weak Duality) If x is a feasible solution to the primal problem and p is a
feasible solution to the dual problem, then p′b ≤ c′x.
Although the weak duality theorem is not a deep result, it does provide some
useful information about the relation between the primal and the dual as stated in
Corollary 1. If the optimal cost in the primal is −∞, then the dual problem must be
infeasible. Moreover, if the optimal cost in the dual is +∞, then the primal problem must
be infeasible.
Corollary 2. Let x and p be feasible solutions to the primal and dual problems, respectively,
and suppose that p′b = c′x. Then x and p are optimal solutions to the primal and
dual problems, respectively.
Theorem 7. (Strong Duality) If a linear programming problem has an optimal solution,
so does its dual, and the respective optimal costs are equal.
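As a concrete check of weak and strong duality, take the two-variable example of Section 2.2, written with ≥ constraints so that the dual has the form max p′b subject to p′A ≤ c′, p ≥ 0. The dual solution below was computed by hand for this illustration:

```python
# primal:  minimize c'x  subject to  Ax >= b, x >= 0
A = [(-1, -2), (-2, -1)]   # -x1 - 2x2 >= -3,  -2x1 - x2 >= -3
b = (-3, -3)
c = (-1, -1)

x_opt = (1, 1)        # primal optimal solution (found earlier graphically)
p_opt = (1/3, 1/3)    # dual optimal solution (hand-computed)

# dual feasibility: p >= 0 and (p'A)_j <= c_j for every j
assert all(p >= 0 for p in p_opt)
assert all(sum(A[i][j] * p_opt[i] for i in range(2)) <= c[j] + 1e-9
           for j in range(2))

primal_cost = sum(ci * xi for ci, xi in zip(c, x_opt))
dual_cost = sum(pi * bi for pi, bi in zip(p_opt, b))
print(primal_cost, dual_cost)  # equal, as strong duality promises
```

Here every dual-feasible p gives a lower bound p′b ≤ −2 on the primal cost (weak duality), and the bound is attained at p = (1/3, 1/3) (strong duality).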
Recall that in a linear programming problem, exactly one of the following three
possibilities occurs:

(a) There is an optimal solution.

(b) The problem is "unbounded"; that is, the optimal cost is −∞ (for minimization
problems).

(c) The problem is infeasible.
in Table 2.2. By the strong duality theorem, if one problem has an optimal solution,
so does the other. Furthermore, the weak duality theorem implies that if one problem
is unbounded, the other must be infeasible. This allows us to mark some of the entries
Table 2.2: The Different Possibilities for the Primal and Dual Problems

Primal \ Dual: Finite optimum | Unbounded | Infeasible
Finite optimum: possible | impossible | impossible
Unbounded: impossible | impossible | possible
Infeasible: impossible | possible | possible
There is another interesting relation between the primal and the dual which is
known as Clark's theorem (Clark, 1961). It asserts that unless both problems are
infeasible, at least one of them must have an unbounded feasible set.
2.6 Example Problems for Wireless Communication Networks
In this section we present some example problems from wireless communication networks. Solving
them can require large scale linear programming techniques to overcome their growing
size.

2.6.1 Power Control in a Wireless Network

Consider a wireless network in which n users communicate with a
single base station, as shown in Figure 2.4. For each i = 1, 2, · · · , n, user i transmits
a signal to the base station with power pi and an attenuation factor of hi (i.e., the
actual signal power received at the base station from user i is hi pi ). When the
base station is receiving from user i, the total power received from all other users is
considered as interference (i.e., Σ_{j≠i} hj pj). For the communication with user i
to be reliable, the ratio of the "signal" to the interference must exceed a threshold γi, where the
"signal" is the power received from user i. We are interested in minimizing the total
power transmitted by all users subject to having reliable communication for all users.
The problem can be stated as follows:
minimize p1 + p2 + · · · + pn
subject to hi pi / (Σ_{j≠i} hj pj) ≥ γi,  i = 1, 2, · · · , n
p1, p2, · · · , pn ≥ 0.
We can rewrite the problem as a linear programming problem as follows:

minimize p1 + p2 + · · · + pn
subject to hi pi − γi Σ_{j≠i} hj pj ≥ 0,  i = 1, 2, · · · , n
p1, p2, · · · , pn ≥ 0.
Note that the complexity of this problem increases with the number of users in the
network.

[Figure 2.4: n users transmitting with powers p1, p2, · · · , pn to a single base station, which receives h1p1, h2p2, · · · , hnpn.]
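The rewritten constraints are straightforward to check programmatically. In the sketch below, the attenuation factors and thresholds are made-up values for illustration:

```python
def feasible(p, h, gamma, tol=1e-9):
    """Check the constraints  h_i p_i - gamma_i * sum_{j != i} h_j p_j >= 0
    for every user i, given powers p, attenuations h and thresholds gamma."""
    n = len(p)
    for i in range(n):
        interference = sum(h[j] * p[j] for j in range(n) if j != i)
        if h[i] * p[i] - gamma[i] * interference < -tol:
            return False
    return True

# made-up attenuation factors for three users
h = [0.8, 1.0, 0.6]
print(feasible([1.0, 1.0, 1.0], h, [0.2, 0.2, 0.2]))  # True
print(feasible([1.0, 1.0, 1.0], h, [2.0, 2.0, 2.0]))  # False
```

Each constraint is linear in the powers p1, · · · , pn, which is what makes the problem a linear program; the number of constraints grows with the number of users, as noted above.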
2.6.2 Multicommodity Network Flow

Consider a communication network consisting of a set of nodes, where a link from node i to node
j is described by an ordered pair (i, j). Let A be the set of all links. We assume that
each link (i, j) ∈ A can carry up to uij bits per second. There is a positive charge
cij per bit transmitted along that link. Each node k generates data, at the rate of bkl
bits per second, that have to be transmitted to node l, either through a direct link
(k, l) or by tracing a sequence of links. The problem is to choose paths along which
all data reach their intended destinations, while minimizing the total cost. We allow
the data with the same origin and destination to be split and be transmitted along
different paths.
To formulate this problem, we introduce
variables x^{kl}_{ij} indicating the amount of data with origin k and destination l that
traverses link (i, j). Let b^{kl}_i = b_{kl} if i = k, b^{kl}_i = −b_{kl} if i = l, and b^{kl}_i = 0 otherwise.
Thus, b^{kl}_i is the net inflow at node i, from outside the network, of data with origin k
and destination l. The problem can then be written as

minimize Σ_{(i,j)∈A} Σ_k Σ_l c_{ij} x^{kl}_{ij}
subject to Σ_{j:(i,j)∈A} x^{kl}_{ij} − Σ_{j:(j,i)∈A} x^{kl}_{ji} = b^{kl}_i, for all i, k, l
Σ_k Σ_l x^{kl}_{ij} ≤ u_{ij}, for all (i, j) ∈ A
x^{kl}_{ij} ≥ 0, for all (i, j) ∈ A and all k, l.

The first constraint is a flow conservation constraint at node i for data with origin
k and destination l: the first sum is the amount of such data that leave node i along
some link, the second sum is the amount that enter node i through some link, and their
difference must equal the amount of
data that enter node i from outside the network. The second constraint expresses
the requirement that the total traffic through a link (i, j) cannot exceed the link's
capacity. The last constraint is a non-negativity constraint, required of the amounts
of data transmitted.
[Figure: An example network with nodes a, b, c, d, e, f, g; each link is labeled with a numeric value.]
2.6.3 Proactive Caching in D2D Networks

Consider a network with N users and a carrier who supplies M data items upon demand. Each data item m ∈ {1, 2, · · · , M } can be requested by any user, and the carrier divides the day into T time slots. The probability that user k requests item m in time slot t is denoted by p^m_{k,t} , and it depends on the user's location; locations include places like airports, schools, shopping malls, stadiums or governmental buildings, where high demand can be related to user mobility. The probability that user n will be present at location l in time slot t is denoted by θ^l_{n,t} , where ∑_{l=1}^{L} θ^l_{n,t} = 1, ∀ n, t.
Each user n has an isolated cache memory of size Zn . The carrier caches an amount x^m_n of content m in the device of user n at time slot 0 and then lets users share it together for t ≥ 1. Therefore, the carrier smooths out the network load by caching some of the data items at the network edge, and exploits user mobility to enhance content delivery. Device-to-device (D2D) communication is allowed and can be used to transfer data items between users. Users occupy part of
their devices' memory for caching these data items and consume some of their battery energy in sharing them, so the carrier incurs a cost corresponding to each cached byte, denoted by r > 0. The carrier's objective is to find
an optimal proactive service policy x^{m∗}_n , ∀ n, m, which minimizes the time-averaged expected cost while delivering the requested data items on time to all users. The resulting formulation involves the aggregate demand profile

α^m_{n,t} = ∑_{l=1}^{L} θ^l_{n,t} ∑_{k=1}^{N} p^m_{k,t} θ^l_{k,t} .        (2.7)
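The profile in (2.7) is a direct double sum and can be computed as below; the container layout (indexing by user, then slot, then location or item) and the sample numbers are illustrative assumptions, not from the text:

```python
def demand_profile(theta, p, n, m, t):
    """alpha^m_{n,t} = sum_l theta[n][t][l] * sum_k p[k][t][m] * theta[k][t][l],
    i.e. equation (2.7) with theta[user][slot][location] and p[user][slot][item]."""
    L = len(theta[n][t])
    N = len(theta)
    return sum(
        theta[n][t][l] * sum(p[k][t][m] * theta[k][t][l] for k in range(N))
        for l in range(L)
    )

# Two users, one slot, one item, two locations (hypothetical numbers).
theta = [[[0.7, 0.3]], [[0.2, 0.8]]]   # presence probabilities, rows sum to 1
p = [[[0.5]], [[0.4]]]                 # request probabilities
print(round(demand_profile(theta, p, n=0, m=0, t=0), 4))  # 0.442
```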
Note that α^m_{n,t} captures the demand and mobility profiles of all users. Also, we notice that α^m_{n,t} grows with the probability that item m will be requested at location l in time slot t. So, the higher the probability that item m will be requested at location l in time slot t, and the higher the probability that user n will be present at that location in this time slot, the larger the amount of this item that will be cached at user n. Also, for each user n and for every content m, when r > α^m_{n,t} the caching cost outweighs the expected benefit, and item m is not cached at user n.
[Figure: The D2D caching network. Data sources feed a service provider, which serves end users that exchange cached items over D2D links.]
Chapter 3: Large Scale Linear Programs
In the previous chapter, we presented linear programming and duality theory in detail. Furthermore, we showed how to use duality theory to solve linear optimization problems. In this chapter, we extend our discussion to consider large scale linear optimization problems whose size grows with the requirements of the system. For example, in the proactive caching problem mentioned in Section 2.6.3, we need to consider the case when the number of users and the number of constraints are very large. This type of linear optimization problem is referred to as a large scale problem. The complexity of these problems arises when the dimension of the matrix A increases. For instance, in the simplex method, identifying the entering and exiting variables becomes increasingly expensive as the dimension grows.
In this chapter, we present some methods for solving linear programming problems
with a large number of variables or constraints. We shed light on delayed column gen-
eration where columns of the matrix A are generated only when they are required.
We also present its dual, the "cutting plane" method, in which the feasible set is approximated using only a subset of the constraints. We also introduce the decomposition method, which applies to problems whose constraints can be divided into two sets: the first set includes general constraints Ax ≥ b, while the second set has constraints with a special structure. These methods are illustrated through a classical application, the cutting-stock problem presented by Gilmore and Gomory [4]. We close this chapter by surveying some recent applications.
The delayed column generation method was first presented by Dantzig and Wolfe in 1960 [3] and Gomory and Gilmore in 1961 [4]. This method still attracts great interest and has many recent applications in the literature [5, 6]. For example, the generalized bin packing problem (GBPP) is a novel packing problem arising in many transportation and logistic settings, characterized by multiple item and bin attributes. A survey of the main mathematical models and an experimental evaluation of the main available software tools for the one-dimensional bin packing problem is introduced in [8]. A variant in which, instead of single items, we are packing groups of items is discussed in [9]. Such a general model finds applications in various practical problems, e.g., delivering bundles of goods.
Some linear programming problems become intractable because of the large number of variables involved. In many such problems (e.g., when the number of variables is much larger than the number of constraints), most of the variables will
be non-basic and hence only a subset of variables need to be considered when solving
the problem. Column generation leverages this idea to generate only the variables
which have the potential to improve the objective function; that is, to find variables
with negative reduced cost. The problem being solved is split into two problems: the
master problem and the sub-problem. The master problem is the original problem
with only a subset of variables being considered. The sub-problem is a new problem formulated to identify a variable that can improve the objective. Consider the linear programming problem

minimize    c′x

subject to  Ax = b

            x ≥ 0.
with x ∈ Rn and b ∈ Rm , where the rows of A are linearly independent. If the number of columns of A is very large, then it is not practical to generate and store the entire matrix A in memory, as is done in the full tableau method, for example. Moreover, in
many problems, most of the columns of A never enter the basis. Therefore, we don’t
have to generate these unused columns. In particular, the revised simplex method,
at any given iteration, requires the current basic columns and the column which is to
enter the basis. Consequently, we need an efficient method for recognizing variables
xi with negative reduced costs c̄i without having to generate all columns. Sometimes, this can be accomplished by solving the problem

minimize    c̄i = ci − p′Ai        (3.1)

over all i, where p′ = c′B B−1 . In many cases, this optimization problem has a special structure, so that a smallest c̄i can be found without computing every c̄i . If the minimum of this optimization problem is greater than or equal to 0, all reduced costs are non-negative and we have
an optimal solution to the original linear programming problem. On the other hand,
if the minimum is negative, the minimizing variable xi has negative reduced cost, and the column Ai can enter the basis. The key to this approach is our ability to solve the optimization problem (3.1) efficiently without generating all columns.
In the delayed column generation method, the columns that exit the basis are
discarded from memory. In a variant of this method, the algorithm retains in memory
all or some of the columns that have been generated in the past, and proceeds by solving a restricted linear programming problem that involves only the retained columns. To clarify
the idea, let us consider a sequence of master iterations. At the beginning of a master
iteration, we have a basic feasible solution to the original problem, and an associated
basis matrix. For each master iteration, we do the steps defined in Algorithm 3.
Note that step (1) in Algorithm 3 may require going over all columns of A to find a variable with negative reduced cost. An alternative is to solve the
master problem for some set I. From this solution, we are able to obtain dual prices
for each of the constraints in the master problem (recall that the dual prices are the elements of the vector p′ = c′B B−1 ). This information is then utilized in the objective
function of the subproblem. After solving the subproblem, if the objective value of
the subproblem is negative, a variable with negative reduced cost has been identified.
Algorithm 3 Delayed Column Generation Master Iteration
1: We search for a variable with negative reduced cost, possibly by minimizing c̄i over all
i using (3.1). If none is found, the algorithm terminates and this solution is optimal.
2: Suppose that we have found some j such that c̄j < 0. We form a collection of columns Ai , i ∈ I, which contains all of the basic columns, the entering column Aj , and possibly some other columns as well.
3: Define the restricted problem

   minimize    ∑_{i∈I} ci xi

   subject to  ∑_{i∈I} Ai xi = b        (3.2)

               xi ≥ 0,   i ∈ I.
4: The basic variables at the current basic feasible solution to the original problem are
among the columns that have been kept in the restricted problem. Therefore, we have
a basic feasible solution to the restricted problem, which can be used as a starting point
for its optimal solution.
5: We perform as many simplex iterations as needed until we find an optimal solution to
the restricted problem.
This variable is then added to the master problem, and the master problem is re-
solved. Re-solving the master problem will generate a new set of dual values, and the
process is repeated until no variables with negative reduced cost are identified.
The delayed column generation method is a special case of the revised simplex method, with special rules for choosing the entering variable that give priority to the variables xi , i ∈ I; only when the reduced costs of these variables are all non-negative are other columns examined. We wish to give priority to variables for which the corresponding columns have already been generated and stored in memory. There are several variants of this method, depending on how the set I is chosen at each iteration. We summarize these variants as follows:
(a) I is just the set of indices of the current basic variables together with the entering variable. A variable that exits the basis is immediately dropped from the set I. Since the restricted problem has m + 1 variables and m constraints, its feasible set is at most one-dimensional and the restricted problem is easily solved.
(b) I is the set of indices of all variables that have become basic at some point in the past; equivalently, no variables are ever dropped, and each entering variable is added to I. The set I keeps growing, and hence this option is not preferred when memory is limited.
(c) The set I is kept to a moderate size by dropping those variables that have exited
the basis in the remote past and have not reentered again.
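Variant (c) amounts to simple bookkeeping on the retained set I. A minimal sketch of that bookkeeping follows; the function name, container choices, and the aging threshold are illustrative assumptions:

```python
def prune_retained(retained, basic, idle, max_idle):
    """Variant (c): 'retained' is the set I, 'basic' holds the current basic
    indices, and idle[i] counts master iterations since column i was last
    basic.  Columns idle for more than max_idle iterations are dropped."""
    for i in list(retained):
        if i in basic:
            idle[i] = 0
        else:
            idle[i] = idle.get(i, 0) + 1
            if idle[i] > max_idle:
                retained.discard(i)
                idle.pop(i)
    return retained

I = {1, 2, 3, 4}
idle = {3: 5, 4: 2}          # column 3 has been idle for 5 iterations already
print(sorted(prune_retained(I, basic={1, 2}, idle=idle, max_idle=5)))  # [1, 2, 4]
```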
In the presence of degeneracy, cycling can be avoided by using the lexicographic tie-breaking rule.
Delayed column generation methods in terms of the dual variables can be described
as delayed constraint generation or cutting plane methods. Consider the dual problem

maximize    p′b

subject to  p′Ai ≤ ci ,    i = 1, 2, · · · , n.        (3.3)

We assume that it is impractical to generate and store each one of the columns Ai . Instead, we consider a subset I of the constraints and form the relaxed dual problem
maximize    p′b

subject to  p′Ai ≤ ci ,    i ∈ I.        (3.4)
Let p∗ be an optimal basic feasible solution to the relaxed dual problem. There are
two possibilities:
(a) p∗ is a feasible solution to the original problem (3.3). Any feasible solution p to the original problem (3.3) is also a feasible solution for the relaxed problem (3.4), so p′b ≤ (p∗)′b and p∗ is an optimal solution to (3.3).
(b) If p∗ is infeasible for the original problem (3.3), we find a violated constraint,
add it to the constraints of the relaxed dual problem and continue similarly.
Therefore, we need a method for checking the feasibility of the vector p∗ for the original dual problem (3.3). We also need an efficient method to identify a violated constraint; such a constraint acts as a cutting plane that separates p∗ from the dual feasible set. This can be done by solving the problem
minimize    ci − (p∗)′Ai        (3.5)
over all i. If the optimal value of this problem is non-negative, p∗ is feasible for (3.3); otherwise, a minimizing index i identifies a violated constraint.
The success of this approach hinges on our ability to solve the problem (3.5) efficiently; fortunately, this is sometimes possible. In addition, there are cases where the optimization problem (3.5) is not easily solved but one can test for feasibility and
identify violated constraints using other means such as those used for integer linear
programming [12].
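When the columns can be enumerated, the separation step (3.5) is a direct scan for the most violated dual constraint. The sketch below shows that scan; the data are made up:

```python
def most_violated(p_star, columns, c):
    """Evaluate c_i - p*'A_i for every column and return the minimizing index
    and value (equation (3.5)).  A negative minimum identifies a violated dual
    constraint p'A_i <= c_i to add to the relaxed problem (3.4)."""
    best_i, best_val = None, float("inf")
    for i, col in enumerate(columns):
        slack = c[i] - sum(pj * aj for pj, aj in zip(p_star, col))
        if slack < best_val:
            best_i, best_val = i, slack
    return best_i, best_val

# Hypothetical data: two dual variables, three columns.
cols = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
c = [1.0, 1.0, 1.5]
i, val = most_violated([0.8, 0.9], cols, c)
print(i, round(val, 6))  # 2 -0.2  (constraint 2 is violated)
```

In large problems this enumeration is exactly what one wants to avoid; the point of the cutting plane method is that (3.5) can often be solved by exploiting structure instead.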
Applying the cutting plane method to the dual problem is identical to applying the
delayed column generation method to the primal. Furthermore, the relaxed problem
(3.4) is the dual of the restricted primal problem (3.2). In some cases, we may have
a primal problem (not in a standard form) that has relatively few variables but a
very large number of constraints. In that case, it makes sense to apply the cutting
plane algorithm to the primal; equivalently, we can form the dual problem and solve it by delayed column generation. We next describe the decomposition algorithm proposed by Dantzig and Wolfe [3]. Dantzig-Wolfe decomposition has been
an important tool to solve large structured models that could not be solved using
a standard Simplex algorithm as they exceeded the capacity of those solvers. With
the current generation of simplex and interior point LP solvers and the enormous
progress in standard hardware (both in terms of raw CPU speed and availability of
large amounts of memory), the Dantzig-Wolfe algorithm has become less popular.
The decomposition algorithm is a procedure for the solution of linear programs using
a sequence of linear programs, each of smaller size than the original. To describe Dantzig-Wolfe decomposition, consider a linear programming problem of the form
minimize    c1′ x1 + c2′ x2

subject to  D1 x1 + D2 x2 = b0

            F1 x1 = b1        (3.6)

            F2 x2 = b2

            x1 , x2 ≥ 0.
Suppose that x1 and x2 are vectors of dimensions n1 and n2 , respectively, and that b0 , b1 , b2 , D1 , D2 , F1 , and F2 have compatible dimensions. The constraints D1 x1 + D2 x2 = b0 , which involve both x1 and x2 , are called the coupling constraints. Often, the number of coupling constraints is a small fraction of the total number of constraints.
The first step of this method is to introduce an equivalent problem, with fewer
equality constraints, but many more variables. The original problem is reformulated
into a master program and n subprograms. This reformulation relies on the fact that
any element of a polyhedron that has at least one extreme point can be represented as a convex combination of its extreme points plus a non-negative linear combination of its extreme rays. Recall that a nonzero element w of the recession cone of a polyhedron in Rn is called an extreme ray if there are n − 1 linearly independent constraints that are active at w.
Now, we define

P1 := { x1 ≥ 0 : F1 x1 = b1 },

P2 := { x2 ≥ 0 : F2 x2 = b2 };
we assume that P1 and P2 are non-empty. Then the problem stated in (3.6) can be
rewritten as

minimize    c1′ x1 + c2′ x2

subject to  D1 x1 + D2 x2 = b0

            x1 ∈ P1

            x2 ∈ P2 .
For i = 1, 2, let x^j_i , j ∈ Ji , be the extreme points of Pi , and let w^k_i , k ∈ Ki , be the extreme rays of Pi . By the resolution theorem, any xi ∈ Pi can be written as

xi = ∑_{j∈Ji} λ^j_i x^j_i + ∑_{k∈Ki} θ^k_i w^k_i ,

where the coefficients λ^j_i and θ^k_i are nonnegative and satisfy

∑_{j∈Ji} λ^j_i = 1,    i = 1, 2.        (3.7)

Substituting into the problem above and appending the convexity constraints (3.7), we obtain

minimize    ∑_{j∈J1} λ^j_1 c1′x^j_1 + ∑_{k∈K1} θ^k_1 c1′w^k_1 + ∑_{j∈J2} λ^j_2 c2′x^j_2 + ∑_{k∈K2} θ^k_2 c2′w^k_2

subject to  ∑_{j∈J1} λ^j_1 [D1 x^j_1 ; 1 ; 0] + ∑_{j∈J2} λ^j_2 [D2 x^j_2 ; 0 ; 1]
            + ∑_{k∈K1} θ^k_1 [D1 w^k_1 ; 0 ; 0] + ∑_{k∈K2} θ^k_2 [D2 w^k_2 ; 0 ; 0] = [b0 ; 1 ; 1]        (3.8)

            λ^j_i ≥ 0, θ^k_i ≥ 0,    ∀ i, j, k,

where [v ; s ; t] denotes the column vector v with the scalars s and t appended.
This problem is called the master problem. Note that the master problem has m0 + 2 equality constraints, which are the coupling constraints plus the convexity constraints in (3.7). On the other hand, the number of decision variables in the master problem could be extremely large, because the number of extreme points and rays is usually exponential in the number of variables and constraints. Therefore, the delayed column generation method is well suited to the master problem: a column is generated only after it is found to have a negative reduced cost and is about to enter the basis. We need to use the revised simplex method which, at any iteration, involves only m0 + 2 basic variables and a basis matrix of dimension (m0 + 2) × (m0 + 2).
Suppose that we have a basic feasible solution to the master problem associated with a basis matrix B, and that B−1 is available. Since we have m0 + 2 equality constraints, the dual vector p′ = c′B B−1 has dimension m0 + 2. Its first m0 components, denoted by q, are the dual variables associated with the equality coupling constraints in (3.8). The last two components, denoted by r1 and r2 , are the dual variables associated with the two convexity constraints, so that p = (q, r1 , r2 ). We need to examine the reduced costs of the different variables and check whether any one of them is negative. The reduced cost of the variable λ^j_1 is given by

c1′ x^j_1 − [q′ r1 r2 ] [D1 x^j_1 ; 1 ; 0] = (c1′ − q′ D1 ) x^j_1 − r1 .
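As a sanity check, this reduced cost can be evaluated directly from q, r1, and an extreme point x^j_1. The toy numbers below are made up for illustration:

```python
def reduced_cost_lambda1(c1, q, r1, D1, x):
    """(c1' - q'D1) x - r1: reduced cost of the master variable attached to
    extreme point x of P1, with one row of D1 per coupling constraint."""
    n = len(c1)
    m0 = len(q)
    # Components of the row vector c1' - q'D1.
    adj = [c1[j] - sum(q[i] * D1[i][j] for i in range(m0)) for j in range(n)]
    return sum(adj[j] * x[j] for j in range(n)) - r1

c1 = [3.0, 2.0]
q = [1.0]                  # one coupling constraint (m0 = 1)
D1 = [[1.0, 1.0]]
x = [2.0, 0.0]             # an assumed extreme point of P1
print(reduced_cost_lambda1(c1, q, 1.5, D1, x))  # (3-1)*2 + (2-1)*0 - 1.5 = 2.5
```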
Instead of evaluating the reduced cost of every variable λ^j_1 and θ^k_1 and checking its sign, we form the problem

minimize    (c1′ − q′ D1 ) x1

subject to  x1 ∈ P1 ,
which is called the first subproblem and can be solved by the simplex method. Simi-
larly, for the variables λ^j_2 and θ^k_2 , we can form the second subproblem

minimize    (c2′ − q′ D2 ) x2

subject to  x2 ∈ P2 ,
and solve it using the simplex method as well. The decomposition method is summarized in Algorithm 4. Note that the sub-problems are smaller linear programming problems that are employed as an economical search method for discovering columns with negative reduced cost. The method extends directly to problems of the form

minimize    c1′ x1 + c2′ x2 + · · · + ct′ xt

subject to  D1 x1 + D2 x2 + · · · + Dt xt = b0        (3.9)

            Fi xi = bi ,    i = 1, 2, · · · , t

            x1 , x2 , · · · , xt ≥ 0.
The only difference is that at each iteration of the revised simplex method with the de-
layed column generation for the master problem, we may have to solve t sub-problems.
In fact, the method is applicable even if t = 1. Consider the linear programming problem

minimize    c′x

subject to  Dx = b0        (3.10)

            Fx = b

            x ≥ 0.
The equality constraints have been partitioned into two sets, and we define the polyhedron P := { x ≥ 0 : F x = b }. By representing the elements of P in terms of its extreme points and extreme rays, we obtain a master problem with a large number of columns,
but a smaller number of equality constraints. Searching for columns with negative
reduced cost in the master problem is then accomplished by solving a single sub-
problem, which is a minimization over the set P . This approach can be useful if the
subproblem has a special structure and can be solved very fast. Finally, note that the decomposition method assumes that all constraints are in standard form and that the feasible sets Pi of the sub-problems are also in standard form. This assumption is hardly necessary: for example, if we assume that the sets Pi have at least one extreme point, the resolution theorem and the same line of development apply.
3.4 The Cutting Stock Problem

The cutting stock problem is the problem of cutting standard-sized pieces of stock
material, such as paper rolls or sheet metal, into pieces of specified sizes while min-
imizing the material wasted. It is an NP-hard problem reducible to the knapsack problem [13]. It can also be approached through integer linear programming, by solving the real-valued cutting stock problem and then approximating the results to integers. It was first formulated by Kantorovich; Gilmore and Gomory later showed how to optimize the use of material at the cutting stage with the help of linear programming [15]. The problem can be described as follows. Consider a paper company that has a supply of large rolls of paper, each of width W . Demands of bi rolls of each width wi , i = 1, 2, · · · , m, need to be produced. We also assume that each wi is an integer and that wi ≤ W, ∀ i. Smaller
rolls are obtained by slicing a large roll in a certain way, called a pattern. For example, a large roll of width 70 can be cut into three rolls of width w1 = 17 and one roll of width w2 = 15, with a waste of 4. In general, the jth pattern can be represented by a column vector Aj whose ith entry aij indicates how many rolls of width wi are produced by that pattern. For example, the pattern described earlier is represented by the vector (3, 1, 0, · · · , 0)′ . For Aj to be a feasible pattern, its components must be non-negative integers and we must have
∑_{i=1}^{m} aij wi ≤ W.        (3.11)
Let n be the number of all feasible patterns, and consider the m × n matrix A with columns Aj , j = 1, 2, · · · , n. The goal is to minimize the number of large rolls used while satisfying customer demand. Let xj be the number of large rolls
cut according to pattern j. Then, the problem will be

minimize    ∑_{j=1}^{n} xj

subject to  ∑_{j=1}^{n} aij xj = bi ,    i = 1, 2, · · · , m,        (3.12)

            xj ≥ 0,    j = 1, 2, · · · , n.

Strictly speaking, each xj should be an integer, and adding this integrality requirement makes the problem much harder.
However, rounding the solution of (3.12) often provides a feasible solution to the integer
programming problem, which is fairly close to optimal at least if the demands bi are
reasonably large.
The difficulty of the problem lies in the large number of cutting patterns (columns)
that may be encountered [4]. For example, with a standard roll of 200 in. and
demands for 40 different lengths ranging from 20 in. to 80 in., the number of cutting
patterns can easily exceed 10 million or even 100 million. This happens in practical
problems and, in this case, we are facing a complicated linear programming problem.
However, the problem can be solved efficiently by using the revised simplex method and by generating columns of A only as needed.
For an initial basic solution, we may let the jth pattern consist of one roll of width
wj for j = 1, 2, · · · , m, and none of the other widths. Then the first m columns of
A form a basis that leads to a basic feasible solution. Now, suppose we have a basis
matrix B and an associated basic feasible solution, and that we wish to carry out the
next iteration of the revised simplex method. Because the cost coefficient of every variable xj is equal to 1, the reduced cost 1 − p′Aj , with p′ = c′B B−1 , is associated with every column (pattern) Aj . To find a column with negative reduced cost, we consider the problem
minimize    1 − p′Aj        (3.13)

over all j. This is the same as maximizing p′Aj over all j. If the maximum is less than or
equal to 1, all reduced costs are non-negative and we have an optimal solution. On
the other hand, if the maximum is greater than 1, the column Aj corresponding to
a maximizing j has negative reduced cost and enters the basis. We now have the
problem
maximize    ∑_{i=1}^{m} pi ai

subject to  ∑_{i=1}^{m} wi ai ≤ W        (3.14)

            ai ≥ 0,    i = 1, 2, · · · , m,

            ai integer,    i = 1, 2, · · · , m.
This problem is called the integer knapsack problem. Solving the knapsack problem
requires some effort, but for the range of numbers that arise in the cutting stock
problem, this can be done fairly efficiently. The knapsack problem has well-known
methods to solve it, such as branch and bound [16] and dynamic programming [17].
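For integer widths and a modest W, the pricing problem (3.14) yields to a short dynamic program over the residual width. The sketch below (with made-up dual prices) returns the best value and the corresponding pattern; in the cutting stock method, the returned pattern would enter the basis when its value exceeds 1:

```python
def best_pattern(p, w, W):
    """Integer knapsack by DP: maximize sum_i p[i]*a[i] subject to
    sum_i w[i]*a[i] <= W with a[i] >= 0 integer (unbounded repetition,
    as in a cutting pattern).  best[c] is the best value with capacity c."""
    best = [0.0] * (W + 1)
    choice = [None] * (W + 1)      # (item, remaining capacity) backpointers
    for c in range(1, W + 1):
        best[c], choice[c] = best[c - 1], choice[c - 1]
        for i, wi in enumerate(w):
            if wi <= c and best[c - wi] + p[i] > best[c]:
                best[c] = best[c - wi] + p[i]
                choice[c] = (i, c - wi)
    # Recover the pattern a from the recorded choices.
    a, c = [0] * len(w), W
    while choice[c] is not None:
        i, c = choice[c]
        a[i] += 1
    return best[W], a

# Hypothetical dual prices for widths 17 and 15, raw roll of width 70.
value, pattern = best_pattern([0.4, 0.3], [17, 15], 70)
print(round(value, 6), pattern)  # 1.6 [4, 0]
```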
Although the delayed column generation method goes back to the 1960s, it has recently found its way into many applications in the wireless communication and machine learning fields. Researchers have started to pay attention to large scale linear programming as wireless networks grow in size and complexity. In the same vein, machine learning and big data science involve many linear problems with a very large number of variables and constraints. In this section, we shed light on
some examples to illustrate the importance of the large scale linear programming in
these fields.
A first example is the multi-commodity flow (MCF) problem, discussed in Section 2.6.2, augmented with a scheduling constraint derived from the
conflict graph associated with the network. A fundamental issue with the conflict
graph based MCF formulation is that finding all independent sets (ISs) for scheduling is computationally prohibitive; moreover, the constraint matrix will contain a very large number of columns, with each IS being
associated with one column. The complexity of this approach is resolved using the
delayed column generation (DCG) method. Furthermore, the DCG method is also
applied to multi-radio multi-channel (MR-MC) networks. It was shown that the
DCG method achieves the most preferred trade-off between computation complexity
and network capacity, and maintains good scalability when addressing large-scale networks. Scheduling with average power constraints is studied in [19], where network utility optimization problems are formulated over the feasible transmission modes. The structure of the optimal solution is a time-sharing across a small set of such modes. This structure was used to develop an efficient heuristic approach to finding these modes, which converges quite fast in simulations and provides a tool for wireless network planning.
Routing in Delay-Tolerant Networks (DTN) has drawn much research effort re-
cently. Since many different kinds of networks fall in the DTN category, many routing
approaches have been proposed. Such systems can benefit from a previously proposed
routing algorithm based on linear programming that minimizes the average message
delay [20, 21]. This algorithm, however, is known to have performance issues that have motivated further work in this area; a refined formulation contains fewer LP constraints and has a structure better suited to such networks.
A joint caching, routing, and channel assignment for video delivery over coordi-
nated small-cell cellular systems of the future Internet is considered in [23]. The prob-
lem of maximizing the throughput of the system was formulated as a linear program
in which the number of variables is very large. To address channel interference, the
proposed formulation incorporates the conflict graph that arises when wireless links
interfere with each other due to simultaneous transmission. The column generation
method was used to solve the problem by breaking it into a restricted master subprob-
lem that involves a select subset of variables and a collection of pricing subproblems
that select the new variable to be introduced into the restricted master problem. The results show gains in the rate at which the video data can be delivered to the users over the state-of-the-art femtocaching scheme, which translate to analogous gains in video application quality, thereby enhancing the user experience considerably.
Chapter 4: Implementation of Large Scale Linear Programs
The advances in computing in the past decades have made available many software packages for linear programming. Nowadays, problems with thousands of variables and constraints can be seen as small problems, while problems with tens or even hundreds of thousands of variables and constraints are routinely solved. These packages come in two different kinds. Some of them are algorithmic codes devoted
to finding optimal solutions to specific linear programs. They take the input as a
compact list of the linear program constraint coefficients (A, b, c and related values
in the standard form) and produce the output as a compact list of optimal solution
values and related information. Other packages are considered as modeling systems
which allow people to formulate their own linear programs and analyze their solutions.
Most modeling systems support a variety of algorithmic codes, while the more popular
codes can be used with many different modeling systems. Conversion to the forms required by the algorithmic codes is handled by the modeling systems. In this chapter, we shed light on some popular modeling software packages and illustrate
how to use them through the examples discussed in Chapter 2. We investigate how
to implement the cutting stock problem, discussed in Section 3.4, using AMPL. The multi-commodity network flow problem, discussed in Section 2.6.2, is implemented using GAMS. The implementation of the D2D caching network example, discussed in Section 2.6.3, is also considered.

4.1 AMPL

AMPL is an algebraic modeling language that can be used to describe and solve high-complexity problems for large-scale mathematical computation. It was developed by Robert Fourer, David Gay, and Brian Kernighan at Bell Laboratories [25]. AMPL offers an interactive command environment for setting up and solving
optimization problems. Many solvers are available for use with AMPL, both open source and commercial, including CBC, CPLEX, FortMP, Gurobi, MINOS, IPOPT, SNOPT, KNITRO, and LGO. It has a syntax very similar to the mathematical notation of optimization problems, which allows for a concise and readable definition of problems. Once optimal solutions have been found, they are translated back to the modeler's notation. AMPL has a variety of options to format data for browsing and printing reports. It is also easy to
experiment: the AMPL web site, www.ampl.com, provides free downloadable student
versions and representative solvers that run on Windows, Unix/Linux, and Mac OS
X. In this section, we briefly discuss how to use AMPL to model large scale linear
programs like the cutting stock problem, discussed in Section 3.4. Our objective is to cover
the main features of this software package and how to use it in solving these problems.
4.1.1 Implementation of The Cutting Stock Problem using AMPL
The cutting stock problem, discussed in Section 3.4 is a typical example to il-
lustrate the column generation method. In this problem, we wish to cut up long
raw widths of some commodity, such as rolls of paper, into a combination of smaller
widths that meet given orders with as little waste as possible. The Gilmore-Gomory
procedure defines a cutting pattern to be any feasible way in which a raw roll can be
cut. Thus, a pattern is a vector consisting of a certain number of rolls of each desired
width, such that their total width does not exceed the raw width. The Gilmore-
main problem finds the minimum number of raw rolls that need be cut, given a
collection of known cutting patterns that may be used. The sub-problm seeks to
identify a new pattern that can be used in the cutting optimization, either to reduce
the number of raw rolls needed, or to determine that no such new pattern exists. The
variables of this model are the numbers of each desired width in the new pattern; the
feasibility constraint (3.11) ensures that the total width of the pattern does not exceed the raw width.
The complete implementation code is provided in Appendix A. AMPL allows us
to define two problem statements, one for the main problem and another one for the
sub-problem.
option relax_integrality 1;
option relax_integrality 0;
The first statement defines a problem named Cutting Opt that consists of the
Cut variables, the Fill constraints, and the objective Number. This is defined in
the statement
minimize Number:
Comparing the definition of the Cutting Opt problem with (3.12), we see that
Number is the objective function, Cut represents the optimization variables xj , Fill
represents the constraint where nbr are the coefficients aij and orders are the con-
straint values bi . In a similar way, we define a problem Pattern Gen that consists of
the Use variables, the Width Limit constraint, and the objective Reduced Cost.
minimize Reduced_Cost:
subject to Width_Limit:
Comparing the definition of the Pattern Gen problem with (3.14), we see that Use represents the variables ai and Width Limit corresponds to the constraint (3.11).
The for loop creates the initial cutting patterns, after which the main repeat loop begins. The statement
solve Cutting_Opt;
sets the Cutting Opt as the current problem, along with its environment, and
solves the associated linear program. A similar statement is defined for the Pattern Gen
problem. An example instance for this problem has a roll width of 110 and required demands
of 48, 35, 24, 10 and 8 for finished rolls of widths 20, 45, 50, 55 and 75, respectively.
Running the model produces solver output of the form

0 branch-and-bound nodes
No basis.

for each iteration (repeated output abridged), followed by the final set of generated patterns:
nbr [*,*]:=
: 1 2 3 4 5 6 7 8
20 5 0 0 0 0 1 1 3
45 0 2 0 0 0 0 2 0
50 0 0 2 0 0 0 0 1
55 0 0 0 2 0 0 0 0
75 0 0 0 0 1 1 0 0;
The final fractional solution means that a pattern of (0, 0, 2, 0, 0) will be generated 8.25 times, a pattern of (0, 0, 0, 2, 0) will be generated 5 times, a pattern of (1, 0, 0, 0, 1) will be generated 8 times, a pattern of (1, 2, 0, 0, 0) will be generated 17.5 times and finally a pattern of (3, 0, 1, 0, 0) will be generated 7.5 times. The best fractional solution cuts 46.25 raw rolls in five different patterns, using 48 rolls if the fractional values are rounded up.
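The fractional solution can be checked directly against the demands. In the sketch below, the counts 8.25 and 5 are inferred from the demand-balance equations where the dissertation's text is truncated; the check confirms demand coverage and the 46.25 total:

```python
# Reported patterns (entries ordered by widths 20, 45, 50, 55, 75) and the
# fractional number of times each pattern is cut.
patterns = {
    (0, 0, 2, 0, 0): 8.25,
    (0, 0, 0, 2, 0): 5.0,
    (1, 0, 0, 0, 1): 8.0,
    (1, 2, 0, 0, 0): 17.5,
    (3, 0, 1, 0, 0): 7.5,
}
demands = [48, 35, 24, 10, 8]

covered = [sum(a[i] * times for a, times in patterns.items()) for i in range(5)]
total = sum(patterns.values())
print(covered == [float(b) for b in demands], total)  # True 46.25
```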
4.2 GAMS

GAMS (the General Algebraic Modeling System) is a high-level modeling system for mathematical optimization. GAMS is designed for modeling and solving linear, nonlinear, and mixed-integer optimization problems. The system is tailored for complex, large-scale modeling applications and allows the user to build large maintainable models that can be adapted to new situations. GAMS was first presented at the International Symposium on Mathematical Programming in 1976. GAMS allows the user to concentrate on the modeling problem by making
the setup simple. The system takes care of the time-consuming details of the specific solver interfaces. Among the available solvers are BARON, COIN-OR solvers, CONOPT, CPLEX, DICOPT, Gurobi, MOSEK, SNOPT, SULUM, and XPRESS [26]. In this section, we highlight how GAMS can be used to implement a multi-commodity network flow problem, following an example adapted from [27]. The definition of this problem was discussed in
Section 2.6.2 and the complete implementation code is provided in Appendix B. In the beginning, we define the settings of the problem, including the number of nodes and the number of commodities. GAMS allows us to specify indices in a straightforward way: declare and name the sets (here, i, k and e(i, k)), and enumerate their elements.
k commodities / k1*k%comm% /
e(i,i) edges
alias (i,j)
Indexed parameters are defined to store the cost of each link cij , the balance bki ,
the demand bk and the capacity uij . Notice that the commodity is indexed by k only
instead of using k and l as discussed in Section 2.6.2. GAMS also allows us to place
explanatory text (shown in lower case) throughout the model, as we develop it. These
comments are automatically incorporated into the output report, at the appropriate
places.
parameters
bal(k,i) balance
kdem(k) demand
Decision variables are expressed with their indices specified, where cost cor-
responds to cij in (2.5), bal corresponds to bki , kdem corresponds to bk and cap
corresponds to uij . From this general form, GAMS generates each instance of the
variable in the domain. Variables are specified as to type: FREE, POSITIVE, NEG-
ATIVE, BINARY, or INTEGER. The default is FREE. The objective variable (z,
here) is simply declared without an index. Here, the optimization variable x is defined
variables
z objective
positive variable x;
The objective function and constraint equations are first declared by giving them names; then their general algebraic formulae are described. GAMS now has enough information (from the data entered above and from the algebraic relationships specified in the equations) to generate each individual constraint automatically.
equations
defobj;
=e= bal(k,i);
defcap(e).. sum(k, x(k,e)) =l= cap(e);
The model is given a unique name (here, mcf multi-commodity flow problem),
and the modeler specifies which equations should be included in this particular for-
mulation. In this case we specified ALL which indicates that all equations are part
of the model.
A random instance is generated here for testing. However, we could set exact values for the link cost cij , the balance bki , the demand bk and the capacity uij . The model checks whether the generated instance is feasible before solving.
The solve statement tells GAMS which model to solve, selects the solver to use (in
this case an LP solver), indicates the direction of the optimization, either MINIMIZING
or MAXIMIZING, and specifies the objective variable. We have two problems in this
model, the master problem and the pricing sub-problem, and separate parameters,
variables and equations are defined for each problem. These problems are solved
repeatedly within the column generation loop.
The steps defined in Algorithm 4 are implemented in the last part of the model.
The master problem and the pricing sub-problem are solved by the statements
solve master using lp minimizing z;
solve pricing using lp minimizing z;
where each statement is at its appropriate place in the code (see Appendix B). Running
this code on the GAMS platform generates a report that includes many results.
(Report excerpt: the SOLVE SUMMARY shows 0 infeasible and 0 unbounded rows,
followed by a table of nonzero flows, e.g. k1.n2 123.400, k2.n4 29.264, k3.n6 51.062,
k5.n2 52.463 and k5.n8 52.463, for the single and serial solves.)
The report shows that the CPLEX solver was used to find the optimal solution. The
final objective value is 1726.1151, and the nonzero flow values of the final solution are
listed in the report excerpt.
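The master/pricing alternation above is the same delayed column generation pattern used for the cutting stock problem of Appendix A. As a cross-check outside GAMS and AMPL, here is an illustrative Python sketch (not the dissertation's code) that solves the restricted master LP with SciPy and the knapsack pricing step with a small dynamic program, using the Appendix A data (roll width 110):

```python
import numpy as np
from scipy.optimize import linprog

roll_width = 110
widths = [20, 45, 50, 55, 75]
orders = np.array([48.0, 35.0, 24.0, 10.0, 8.0])

# Initial patterns: cut each width alone, as many pieces as fit in a roll.
patterns = []
for i, w in enumerate(widths):
    col = [0] * len(widths)
    col[i] = roll_width // w
    patterns.append(col)

def solve_master(patterns):
    """Restricted master LP: minimize rolls used subject to filling orders."""
    nbr = np.array(patterns, dtype=float).T      # rows: widths, cols: patterns
    res = linprog(np.ones(nbr.shape[1]),
                  A_ub=-nbr, b_ub=-orders,       # i.e. nbr @ cut >= orders
                  bounds=[(0, None)] * nbr.shape[1], method="highs")
    prices = -res.ineqlin.marginals              # duals of the Fill constraints
    return res, prices

def price_pattern(prices):
    """Pricing step: unbounded knapsack by dynamic programming."""
    best = [0.0] * (roll_width + 1)
    pick = [None] * (roll_width + 1)
    for cpt in range(1, roll_width + 1):
        best[cpt], pick[cpt] = best[cpt - 1], None
        for i, w in enumerate(widths):
            if w <= cpt and best[cpt - w] + prices[i] > best[cpt]:
                best[cpt], pick[cpt] = best[cpt - w] + prices[i], i
    use, cpt = [0] * len(widths), roll_width
    while cpt > 0:                               # recover the chosen pattern
        if pick[cpt] is None:
            cpt -= 1
        else:
            use[pick[cpt]] += 1
            cpt -= widths[pick[cpt]]
    return use, best[roll_width]

# Delayed column generation: add columns while the reduced cost is negative.
while True:
    res, prices = solve_master(patterns)
    use, value = price_pattern(prices)
    if 1.0 - value > -1e-7:                      # no improving pattern remains
        break
    patterns.append(use)

print(res.fun)  # optimal number of raw rolls in the LP relaxation
```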
Another important platform which allows us to solve large scale linear programs is
Matlab. Matlab supports matrix manipulation, plotting of functions and data,
implementation of algorithms, creation of user interfaces, and interfacing with
programs written in other languages, including C, C++, Java, Fortran, and Python.
Matlab includes an optimization toolbox which provides functions for solving
constrained and unconstrained optimization problems. The toolbox includes solvers
for linear programming, mixed-integer linear programming, quadratic programming,
nonlinear optimization, and nonlinear least squares.
Linear optimization problems can be solved using the linprog function from
the toolbox. The optimization toolbox includes three algorithms used to solve linear
programming problems:
• The interior-point algorithm is a primal-dual algorithm for solving linear
programming problems. Interior point is especially useful for large-scale problems
that have structure or can be defined using sparse matrices.

• The dual-simplex algorithm applies the simplex method to the dual of the linear
program; it is the algorithm chosen in the example below.

• The active-set algorithm minimizes the objective at each iteration over the ac-
tive set (a subset of the constraints that are locally active) until it reaches a
solution.
The linprog function is called as
[x,fval] = linprog(f,A,b,Aeq,beq,lb,ub,x0,options);
which solves min f^T x such that Ax ≤ b. It also includes equality constraints
Aeq x = beq. It defines a set of lower and upper bounds on the design variables, x, so
that the solution is always in the range lb ≤ x ≤ ub. Moreover, x0 is the starting
point from which the algorithm starts searching for the optimal solution. The options
of this optimization function are defined in options using the optimset function.
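For readers experimenting without Matlab, SciPy's linprog takes essentially the same arguments (cost vector, inequality and equality constraints, bounds). This is an outside analogue, not part of the Matlab toolbox, shown on a tiny invented problem:

```python
from scipy.optimize import linprog

# Mirror of [x,fval] = linprog(f,A,b,Aeq,beq,lb,ub,...): minimize f'x
# subject to A x <= b and lb <= x <= ub (no equality constraints here).
f = [-1.0, -2.0]              # equivalent to maximizing x1 + 2*x2
A = [[1.0, 1.0]]              # single inequality: x1 + x2 <= 4
b = [4.0]
res = linprog(f, A_ub=A, b_ub=b, A_eq=None, b_eq=None,
              bounds=[(0, 3), (0, 3)], method="highs")
print(res.x, res.fun)         # optimal point and optimal objective value
```

Here the optimum pushes x2 to its upper bound 3 and then x1 to 1, giving an objective value of -7.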
There are many options that can be set using this function. We focus here on two
of them, Algorithm and LargeScale. This function returns the optimal value of the
objective function, fval, together with an optimal point x. The first three algorithms
are large-scale algorithms, while the last two are not.
An optimization algorithm is large scale when it uses linear algebra that does not
need to store, nor operate on, full matrices. This may be done internally by storing
sparse matrices, and by using sparse linear algebra for computations whenever pos-
sible. Furthermore, the internal algorithms either preserve sparsity, such as a sparse
factorization method, or avoid forming full matrices altogether. The option LargeScale
can be set to 'on' (default), with one of the mentioned large scale algorithms, to solve
large size problems, or to 'off' when the large scale machinery is not needed.
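The effect of sparse storage can be illustrated in Python with SciPy (an analogue, not Matlab): an identity-shaped constraint matrix needs only n stored entries instead of n^2, and the HiGHS-based linprog accepts the sparse matrix directly:

```python
import numpy as np
from scipy import sparse
from scipy.optimize import linprog

# Each of the n constraints touches a single variable, so the constraint
# matrix is an identity: dense storage wastes n*(n-1) zeros.
n = 2000
A_dense = np.eye(n)
A_sparse = sparse.eye(n, format="csr")
print(A_dense.size, A_sparse.nnz)     # n*n stored entries versus n

# The sparse matrix is passed to the solver without densifying it.
c = -np.ones(n)                       # maximize the sum of the variables
res = linprog(c, A_ub=A_sparse, b_ub=np.ones(n),
              bounds=[(0, None)] * n, method="highs")
print(res.fun)                        # every x_i hits its bound of 1
```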
In this section we shed light on how Matlab can help us to solve the D2D caching
problem discussed in Section 2.6.3. When the number of users and the number of data
items grow large in this problem we have a large scale linear optimization problem.
We consider the case when N = 1000 and M = 10^5 and solve the problem by increasing the number of
users involved in the system. We generate a random instance of the demand and
mobility profiles using the demand gen and mobility gen functions respectively.
We assume that each user can store up to 10% of these data items in his device. We
also assume that the carrier pays back 0.5 units for each byte cached and shared by
every user (i.e. r = 0.5). We initialize some variables to store the values obtained
at each iteration:
X_optimal = zeros(NMax*M,NMax);
Cost_optimal = zeros(1,NMax);
Gain_optimal = zeros(1,NMax);
Memory_optimal = zeros(1,NMax);
Each time we run the loop, the statistics corresponding to the involved users are
captured to prepare alpha. After that we set the parameters of the optimization
function linprog. We set upper and lower bounds on the decision variable so that
the solution remains feasible. These bounds represent the third constraint in (2.6).
LB = zeros(1,N*M);
UB = zeros(1,N*M);
for n=1:N
UB(M*(n-1)+1:M*n) = Sm;
end
Matrix A1 encodes the memory constraint, which is the first constraint in (2.6). Matrix
A2 encodes the second constraint in (2.6). These two matrices are merged into one
constraint matrix:
A = [A1 ; A2];
b = [b1; b2];
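A miniature Python analogue of this constraint assembly (dimensions are invented for illustration, and a random nonnegative matrix stands in for the mobility-weighted matrix I of (2.6)):

```python
import numpy as np

# Miniature dimensions: N users, M data items, L locations.
N, M, L = 3, 4, 2
Zn = np.full(N, 50.0)                       # per-user memory budgets

# A1: memory constraint (first constraint in (2.6)). With the caching
# variable flattened item-by-item, A1 is M identity blocks side by side,
# so row n sums user n's cached bytes over all items.
A1 = np.tile(np.eye(N), (1, M))
b1 = Zn

# A2: stand-in for the second constraint in (2.6).
A2 = np.abs(np.random.randn(M * L, M * N))
b2 = np.full(M * L, 100.0)

# Merge into one constraint system A x <= b, as in the Matlab code.
A = np.vstack([A1, A2])
b = np.concatenate([b1, b2])
print(A.shape, b.shape)
```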
An initial point x0 = (0, 0, · · · , 0) is chosen. The most important part of this code
is to choose the optimization algorithm and to set the large scale option to 'on':
options = optimset('Algorithm','dual-simplex', ...
    'Display','off','LargeScale','on');
Now, everything is ready to call the linprog function and solve the problem
[xopt,costP]=linprog(cost_fun,A,b,[],[],LB,UB,x0,options);
This function returns the optimal solution xopt and the optimal value of the
cost function costP. The results of this system are depicted in Figure 4.1. The
carrier achieves more gain and uses less memory when more users are engaged in the
network. Moreover, we notice that memory usage decays as O(1/N). More users help
all parties to gain more and, at the same time, less memory is required for this caching
as the network expands. When a user requests a certain data item and more users are
located around him, he gets that item either from his local cache or from other users
through the D2D communication. This helps the carrier to smooth out the network
load and minimize the incurred service cost. Notice that the LargeScale option of
Matlab allowed us to find the optimal solution even when the number of users N and
the number of data contents M increase. We refer the reader to [2] for more details.
Figure 4.1: Carrier gain (%) and memory usage (%) versus the number of users N.
Appendix A: AMPL Implementation of Column Generation
data;
# -----------------------------------------
# SETTING THE ROLL WIDTH AND ORDER DETAILS
# -----------------------------------------
param roll_width := 110 ;
param: WIDTHS: orders :=
20 48
45 35
50 24
55 10
75 8 ;
# ----------------------------------------
# SETTING MODEL FILE AND SOLVER
# ----------------------------------------
model cut.mod;
data cut.dat;
option solver cplex, solution_round 6;
option display_1col 0, display_transpose -10;
# ----------------------------------------
# DEFINING THE PROBLEMS
# ----------------------------------------
problem Cutting_Opt: Cut, Number, Fill;
option relax_integrality 1;
problem Pattern_Gen: Use, Reduced_Cost, Width_Limit;
option relax_integrality 0;
# ----------------------------------------
# GENERATING THE PATTERNS
# ----------------------------------------
let nPAT := 0;
for {i in WIDTHS} {
let nPAT := nPAT + 1;
let nbr[i,nPAT] := floor (roll_width/i);
let {i2 in WIDTHS: i2 <> i} nbr[i2,nPAT] := 0;
}
# ----------------------------------------
# RUNNING THE PROCEDURE
# ----------------------------------------
repeat {
solve Cutting_Opt;
let {i in WIDTHS} price[i] := Fill[i].dual;
solve Pattern_Gen;
if Reduced_Cost < -0.00001 then {
let nPAT := nPAT + 1;
let {i in WIDTHS} nbr[i,nPAT] := Use[i];
}
else break;
}
display nbr, Cut;
# ----------------------------------------
# FINAL ROUND AND OUTPUT DISPLAY
# ----------------------------------------
option Cutting_Opt.relax_integrality 0;
solve Cutting_Opt;
display Cut;
# ----------------------------------------
# CUTTING STOCK USING PATTERNS
# ----------------------------------------
param roll_width > 0; # width of raw rolls
set WIDTHS; # set of widths to be cut
param orders {WIDTHS} > 0; # number of each width to be cut
param nPAT integer >= 0; # number of patterns
set PATTERNS = 1..nPAT; # set of patterns
param nbr {WIDTHS,PATTERNS} integer >= 0;
check {j in PATTERNS}:
sum {i in WIDTHS} i * nbr[i,j] <= roll_width;
# defn of patterns: nbr[i,j] = number
# of rolls of width i in pattern j
var Cut {PATTERNS} integer >= 0; #rolls cut using each pattern
minimize Number: # minimize total raw rolls cut
sum {j in PATTERNS} Cut[j];
subject to Fill {i in WIDTHS}:
sum {j in PATTERNS} nbr[i,j] * Cut[j] >= orders[i];
# for each width, total
# rolls cut meets total orders
# ----------------------------------------
# KNAPSACK SUBPROBLEM FOR CUTTING STOCK
# ----------------------------------------
param price {WIDTHS} default 0.0;
var Use {WIDTHS} integer >= 0;
minimize Reduced_Cost:
1 - sum {i in WIDTHS} price[i] * Use[i];
subject to Width_Limit:
sum {i in WIDTHS} i * Use[i] <= roll_width;
Appendix B: GAMS Implementation of Multi-Commodity Network
Flow Problem
* ----------------------------------------
* Problem Settings
* ----------------------------------------
$Eolcom !
$setddlist nodes comm maxtime
$if NOT set nodes $set nodes 20
$if NOT set comm $set comm 5
$if NOT set maxtime $set maxtime 50
$if NOT errorfree $abort wrong double dash parameters:
--nodes=n --comm=n --maxtime=secs
* ----------------------------------------
* Defining SETS
* ----------------------------------------
sets i nodes / n1*n%nodes% /
k commodities / k1*k%comm% /
e(i,i) edges;
alias (i,j);
* ----------------------------------------
* Defining Indexed Parameters
* ----------------------------------------
parameters
cost(i,j) cost for edge use
bal(k,i) balance
kdem(k) demand
cap(i,j) bundle capacity ;
* ----------------------------------------
* Declaring Variables
* ----------------------------------------
variables
x(k,i,j) multi commodity flow
z objective ;
positive variable x;
* ----------------------------------------
* Objective Functions and Constraints
* ----------------------------------------
equations
defbal(k,i) balancing constraint
defcap(i,j) bundling capacity
defobj;
defobj.. z =e= sum((k,e), cost(e)*x(k,e));
defbal(k,i).. sum(e(i,j),x(k,e))-sum(e(j,i),x(k,e))=e=bal(k,i);
defcap(e).. sum(k, x(k,e)) =l= cap(e);
* ----------------------------------------
* Defining Model
* ----------------------------------------
model mcf multi-commodity flow problem /all/;
* ----------------------------------------
* Making a Random Instance
* ----------------------------------------
scalars inum, edgedensity /0.3/ ;
e(i,j) = uniform(0,1) < edgedensity; e(i,i) = no;
cost(e) = uniform(1,10);
cap(e) = uniform(50,100)*log(card(k));
loop(k,
kdem(k) = uniform(50,150);
inum = uniformInt(1,card(i));
bal(k,i)$(ord(i)=inum) = kdem(k);
inum = uniformInt(1,card(i));
bal(k,i)$(ord(i)=inum) = bal(k,i) - kdem(k);
kdem(k) = sum(i$(bal(k,i)>0), bal(k,i)) );
* ----------------------------------------
* See if the random model is feasible
* ----------------------------------------
option limrow=0, limcol=0;
option solprint=off, solvelink=%solvelink.CallModule%;
solve mcf min z using lp;
abort$(mcf.modelstat <> %modelstat.Optimal%)
'problem not feasible. Increase edge density.';
parameter xsingle(k,i,j) single solve;
xsingle(k,e) = x.l(k,e)$[x.l(k,e) > 1e-6];
display$(card(i) < 30) xsingle;
* ----------------------------------------
* Define Master Model
* ----------------------------------------
set p paths idents / p1*p100 /
    ap(k,p) active path
    pe(k,p,i,j) edge path incidence vector;
parameter
    pcost(k,p) path cost;
positive variable xp(k,p), slack(k);
equations
mdefcap(i,j) bundle constraint
mdefbal(k) balance constraint
mdefobj objective;
mdefobj.. z=e=sum(ap,pcost(ap)*xp(ap))+sum(k,999*slack(k));
mdefbal(k).. sum(ap(k,p), xp(ap)) + slack(k) =e= kdem(k);
mdefcap(e).. sum(pe(ap,e), xp(ap)) =l= cap(e);
model master / mdefobj, mdefbal, mdefcap /;
* ----------------------------------------
* Define Pricing Model: Shortest Path
* ----------------------------------------
parameter ebal(i);
positive variable xe(i,j);
equations
pdefbal(i) balance constraint
pdefobj objective;
pdefobj.. z =e= sum(e, (cost(e)-mdefcap.m(e))*xe(e));
pdefbal(i).. sum(e(i,j), xe(e)) - sum(e(j,i),xe(e))=e=ebal(i);
model pricing / pdefobj, pdefbal /;
* ----------------------------------------
* Solving Master and Pricing Problems
* ----------------------------------------
Scalar done loop indicator, iter iteration counter;
Set nextp(k,p) next path to be added ;
* clear path data
done = 0; iter = 0;
ap(k,p) = no; pe(k,p,e) = no;
pcost(k,p) = 0;
nextp(k,p) = no; nextp(k,’p1’) = yes;
While(not done, iter=iter+1;
solve master using lp minimizing z;
done = 1;
loop(k$kdem(k),
ebal(i) = bal(k,i)/kdem(k);
solve pricing using lp minimizing z;
pricing.solprint=%solprint.Quiet%;
! turn off all outputs for pricing model
if (mdefbal.m(k) - z.l > 1e-6, ! add new path
ap(nextp(k,p)) = yes;
pe(nextp(k,p),e) = round(xe.l(e));
pcost(nextp(k,p)) = sum(pe(nextp,e), cost(e));
nextp(k,p) = nextp(k,p-1);
! bump the path to the next free one
abort$(sum(nextp(k,p),1)=0) 'set p too small';
done = 0 ) ) );
abort$(abs(master.objval-mcf.objval)>1e-3)
’different objective values’, master.objval, mcf.objval;
parameter xserial(k,i,j);
xserial(k,e) = sum(pe(ap(k,p),e), xp.l(ap));
display$(card(i) < 30) xserial;
Appendix C: Matlab Implementation of D2D Caching Example
% ----------------------------------------
% System Parameters
% ----------------------------------------
clc, clear all; close all; % Clearing Everything
NInit = 5; % Initial Number of Users
step = 5; % Step Size in For Loop
L = 4; % No. of Locations
NMax = 1000; % No. of Users
M = 1e05; % No. of Data Items
Theta_all = mobility_gen(NMax,L,1); % Mobility Profile
P_all = demand_gen(NMax,M,1); % Demand Profile
Sm = 100*ones(1,M); % Data Items Sizes
Zn_all = (M/10)*100*ones(NMax,1); % Memory Sizes
r = 0.50; % Reward Factor
% ----------------------------------------
% Initialization
% ----------------------------------------
X_optimal = zeros(NMax*M,NMax);
Cost_optimal = zeros(1,NMax);
Gain_optimal = zeros(1,NMax);
Memory_optimal = zeros(1,NMax);
% ----------------------------------------
% Running Loop on No. of Users
% ----------------------------------------
for N=NInit:step:NMax
disp(['N = ' num2str(N)]);
% Taking Chunk
PIndex = [];
for m=1:M
PIndex = [PIndex (1:N)+(m-1)*NMax];
end
P = P_all(PIndex);
TIndex = [];
for l=1:L
TIndex = [TIndex (1:N)+(l-1)*NMax];
end
Theta = Theta_all(TIndex);
Zn = Zn_all(1:N);
% Caching Decisions
x = zeros(1,N*M);
% Preparing Matrix (S)
S = zeros(1,M*L);
for m=1:M
S((m-1)*L+1:m*L)=Sm(m)*ones(1,L);
end
% Preparing Matrix (I) & (II)
I = zeros(M*L,M*N);
II = zeros(M*N,M*L);
for m=1:M
I((m-1)*L+1:m*L,(m-1)*N+1:m*N)= reshape(Theta,N,L)’;
II((m-1)*N+1:m*N,(m-1)*L+1:m*L)= reshape(Theta,N,L);
end
% Preparing Matrix (III)
III = zeros(1,M*L);
for m=1:M
for l=1:L
III((m-1)*L+l)=P((m-1)*N+1:m*N)*Theta((l-1)*N+1:l*N)’;
end
end
% Preparing Alpha
alpha = III*I;
% Reactive Load Calculation
loadR=0;
for m=1:M
loadR = loadR + Sm(m)*ones(1,N)*P((m-1)*N+1:m*N)’;
end
% Reactive Cost (Linear)
costR = loadR;
% ----------------------------------------
% Optimization Parameters
% ----------------------------------------
% Upper and Lower Bounds
LB = zeros(1,N*M);
UB = zeros(1,N*M);
for n=1:N
UB(M*(n-1)+1:M*n) = Sm;
end
% Constraint 1
A1 = zeros(N,N*M);
for m=1:M
A1(1:N,(m-1)*N+1:m*N)=eye(N,N);
end
b1 = Zn(1:N);
% Constraint 2
A2 = I;
b2 = S’;
% Constraint Matrix
A = [A1 ; A2];
b = [b1; b2];
% Initial Point
x0 = zeros(1,N*M);
% Option Setting
options = optimset('Algorithm','dual-simplex', ...
    'Display','off','LargeScale','on');
% ----------------------------------------
% Solving the Problem - Proactive Cost
% ----------------------------------------
cost_fun = r - alpha;
[xopt,costP] = linprog(cost_fun,A,b,[],[],LB,UB,x0,options);
costP = costP + costR;
X_optimal(1:N*M,N) = xopt;
Cost_optimal(N) = costP;
Gain_optimal(N) = 100*(costR-costP)/costR;
Memory_optimal(N) = 100*sum(xopt)/sum(Zn(1:N));
end
Cost_LB = Cost_optimal;
X_LB = X_optimal;
% ----------------------------------------
% Plotting Results
% ----------------------------------------
figure(3); subplot(1,2,1);
plot([NInit:step:NMax],Gain_optimal(NInit:step:NMax), ...
    'r-','LineWidth',2);
grid on; xlabel('No. of Users (N)'); ylabel('Gain (%)');
title('Carrier Gain vs No. of Users (N)');
figure(3); subplot(1,2,2);
plot([NInit:step:NMax],Memory_optimal(NInit:step:NMax), ...
    'b-','LineWidth',2);
grid on; xlabel('No. of Users (N)'); ylabel('Memory Usage (%)');
title('Overall Memory Usage vs No. of Users (N)');
Bibliography
[7] M. M. Baldi and M. Bruglieri, “On the generalized bin packing problem,” Inter-
national Transactions in Operational Research, 2016.
[8] M. Delorme, M. Iori, and S. Martello, “Bin packing and cutting stock problems:
Mathematical models and exact algorithms,” European Journal of Operational
Research, 2016.
[9] L. Chen and G. Zhang, “Packing groups of items into multiple knapsacks,”
in LIPIcs-Leibniz International Proceedings in Informatics, vol. 47. Schloss
Dagstuhl-Leibniz-Zentrum fuer Informatik, 2016.
[11] R. G. Bland, “New finite pivoting rules for the simplex method,” Mathematics
of Operations Research, vol. 2, no. 2, pp. 103–107, 1977.
[12] R. E. Gomory, “An algorithm for integer solutions to linear programs,” Recent
advances in mathematical programming, vol. 64, pp. 260–302, 1963.
[20] J. Alonso and K. Fall, “A linear programming formulation of flows over time
with piecewise constant capacity and transit times,” Intel Research Technical
Report IRB-TR-03-007, Tech. Rep., 2003.
[21] S. Jain, K. Fall, and R. Patra, Routing in a delay tolerant network. ACM, 2004,
vol. 34, no. 4.
[23] A. Khreishah, J. Chakareski, and A. Gharaibeh, “Joint caching, routing, and
channel assignment for collaborative small-cell cellular networks,” arXiv preprint
arXiv:1605.09307, 2016.