
LARGE SCALE LINEAR OPTIMIZATION FOR

WIRELESS COMMUNICATION SYSTEMS

A Thesis

Presented in Partial Fulfillment of the Requirements for the Degree


Master of Mathematical Science in the Graduate School of The Ohio
State University

By

Sameh Hosny, M.S.

Graduate Program in Department of Mathematics

The Ohio State University

2017

Master’s Examination Committee:


Prof. Ghaith Hiary, Advisor
Prof. Facundo Memoli
© Copyright by

Sameh Hosny

2017
Abstract

Linear Programming has many applications in the domain of wireless commu-

nication. Many problems in this field consist of a very large number of variables

and constraints and therefore fit in the platform of large scale linear programming.

Advancements in computing over the past decade have allowed us to routinely solve
linear programs in thousands of variables and constraints, using specialized methods
from large scale linear programming. There are many software packages that implement
such methods, e.g. AMPL, GAMS and Matlab. This dissertation gives a

concise survey of linear programming fundamentals with a focus on techniques for

large scale linear programming problems in the context of wireless communication.

The dissertation explains some of these techniques, in particular the delayed column

generation method and the decomposition method. It also draws on examples from

the active field of wireless communication. The dissertation is concluded by giving

concrete examples of how to use various software packages to solve large scale lin-

ear programming problems stemming from our examples in the context of wireless

communication.

To the soul of my father, to my beloved mother, to my great wife, Doaa Eid

and my kids Rinad, Rawan and Mohammed.

Acknowledgments

I would like to express my special appreciation and thanks to my advisor Professor

Ghaith Hiary. You have been a tremendous mentor for me. It has been an honor for

me to be one of your students. I appreciate all your contributions of time and ideas to

make my M.Sc. experience productive and stimulating. The joy and enthusiasm you

have for your research was contagious and motivational for me, even during tough

times in the M.Sc. pursuit. I am also thankful to all the professors who taught me

from the math department. I am really grateful to them all for their dedication and
devotion to the courses they teach. These courses helped me create a strong and
rigorous background in both my major and minor fields. They allowed me to improve
my research skills and to change my perspective on many things.

The members of the IPS lab have contributed immensely to my personal and

professional time at The Ohio State University. The group has been a source of

friendships as well as good advice and collaboration. I would like to acknowledge

my colleague John Tadrous for his continuous help and generosity. He was always

supporting me with all the information I needed especially in the beginning of my

study. Moreover, I am thankful to my colleague Faisal Alotaibi for the great time we

spent working together and having useful technical discussions in our group meetings.

My time at OSU was made enjoyable in large part due to the many friends and

groups that became a part of my life. I would like to extend my special thanks to my

best Egyptian friend Sameh Shohdy and his great family for their kindness, support,

and hospitality. They supported me and my family until everything was settled
in Columbus. I also express my thanks to our great American friends, Betty Rocke

and Randy, for supporting our stay in Columbus and helping my son Mohammed in

learning so many things.

Lastly, I would like to thank my family for all their love and encouragement. For

my parents who raised me with a love of science and supported me in all my pursuits.

And most of all for my loving, supportive, encouraging, and patient wife Doaa Eid

whose faithful support during all stages of this Ph.D. is so appreciated. For spending

many nights waiting for me to accomplish my hard tasks. Thank you.

Vita

December 11, 1978 . . . . . . . . . . . . . . . . . . . . . . . . . Born - Cairo, Egypt

2001 . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . B.S. Electrical and Computer Engineering

2010 . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . M.S. Electrical and Computer Engineering

2014-present . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . Ph.D. Student, Electrical and Computer Engineering, The Ohio State University

2015-present . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . M.S. Student, Mathematics, The Ohio State University

Publications

(Accepted) S. Hosny, A. Eryilmaz and H. El Gamal, "Impact of User Mobility on
D2D Caching Networks," IEEE Global Communications Conference, Washington, DC,
USA, 2016.

(Accepted) S. Hosny, A. Eryilmaz and H. El Gamal, "Mobility-Aware Centralized
D2D Caching Networks," 54th Annual Allerton Conference on Communication, Control,
and Computing, Illinois, USA, 2016.

(Submitted) S. Hosny, F. Alotaibi, J. Tadrous, A. Eryilmaz and H. El Gamal, "Content
Trading in D2D Caching Networks," IEEE/ACM Transactions on Networking.

(To be submitted) S. Hosny, A. Abouzeid, A. Eryilmaz and H. El Gamal, "Mobility-Aware
D2D Caching Networks," IEEE Transactions on Wireless Communications.

F. Alotaibi, S. Hosny, H. El Gamal and A. Eryilmaz, "A game theoretic approach to
content trading in proactive wireless networks," 2015 IEEE International Symposium
on Information Theory (ISIT), Hong Kong, 2015, pp. 2216-2220.

S. Hosny, F. Alotaibi, H. El Gamal and A. Eryilmaz, "Towards a P2P mobile contents
trading," 2015 49th Asilomar Conference on Signals, Systems and Computers, Pacific
Grove, CA, 2015, pp. 338-342.

S. Hosny, F. Alotaibi, H. El Gamal and A. Eryilmaz, "Towards a mobile content
marketplace," 2015 IEEE 16th International Workshop on Signal Processing Advances
in Wireless Communications (SPAWC), Stockholm, 2015, pp. 675-679.

F. Alotaibi, S. Hosny, J. Tadrous, H. El Gamal and A. Eryilmaz, "Towards a marketplace
for mobile content: Dynamic pricing and proactive caching," arXiv preprint
arXiv:1511.07573, 2015.

Fields of Study

Major Field: Electrical & Computer Engineering

Table of Contents

Page

Abstract . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . ii

Dedication . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . iii

Acknowledgments . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . iv

Vita . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . vi

List of Tables . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . x

List of Figures . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . xi

1. Introduction . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 1

2. Review of Linear Programming . . . . . . . . . . . . . . . . . . . . . . . 3

2.1 Introduction . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 3
2.2 Geometry of a Linear Program . . . . . . . . . . . . . . . . . . . . 7
2.3 Degeneracy . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 12
2.4 The Simplex Method . . . . . . . . . . . . . . . . . . . . . . . . . . 13
2.4.1 Implementation of the Simplex Method . . . . . . . . . . . 16
2.4.2 Comparisons and Performance Enhancements . . . . . . . . 20
2.5 The Duality Theory . . . . . . . . . . . . . . . . . . . . . . . . . . 24
2.6 Example Problems for Wireless Communication Networks . . . . . 27
2.6.1 Power Control in a Wireless Network . . . . . . . . . . . . . 27
2.6.2 Multicommodity Network Flow . . . . . . . . . . . . . . . . 28
2.6.3 D2D Caching Networks . . . . . . . . . . . . . . . . . . . . 30

3. Large Scale Linear Programs . . . . . . . . . . . . . . . . . . . . . . . . . 33

3.1 Delayed Column Generation Method . . . . . . . . . . . . . . . . . 34


3.2 Cutting Plane Method . . . . . . . . . . . . . . . . . . . . . . . . . 38
3.3 Dantzig-Wolfe Decomposition . . . . . . . . . . . . . . . . . . . . . 40
3.4 The Cutting Stock Problem . . . . . . . . . . . . . . . . . . . . . . 45
3.5 Applications in Wireless Communication . . . . . . . . . . . . . . . 48

4. Implementation of Large Scale Linear Programs . . . . . . . . . . . . . . 51

4.1 AMPL Programming Language . . . . . . . . . . . . . . . . . . . . 52


4.1.1 Implementation of The Cutting Stock Problem using AMPL 53
4.2 GAMS Programming Language . . . . . . . . . . . . . . . . . . . . 57
4.2.1 Implementation of Dantzig-Wolfe Decomposition Method using GAMS 57
4.3 Matlab Programming Language . . . . . . . . . . . . . . . . . . . . 62
4.3.1 Matlab Implementation of D2D Caching Example . . . . . . 64

Appendices 68

A. AMPL Implementation of Column Generation . . . . . . . . . . . . . . . 68

B. GAMS Implementation of Multi-Commodity Network Flow Problem . . 71

C. Matlab Implementation of D2D Caching Example . . . . . . . . . . . . . 75

Bibliography . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 79

List of Tables

Table Page

2.1 Comparison between Simplex implementation methods . . . . . . . . . . . 22

2.2 The Different Possibilities for the Primal and Dual Problems . . . . . . . . 26

List of Figures

Figure Page

2.1 Graphical solution of a linear program example. . . . . . . . . . . . . . . . 8

2.2 Visualization of standard form problems . . . . . . . . . . . . . . . . . . . 8

2.3 Full Tableau Structure . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 19

2.4 An illustration of the power control example. . . . . . . . . . . . . . . . . 28

2.5 An illustration of the multi-commodity network flow example. . . . . . . . 30

2.6 An illustration of the D2D caching networks example. . . . . . . . . . . . . 32

4.1 System Performance of the D2D Caching Network . . . . . . . . . . . . . 67

Chapter 1: Introduction

The importance of Linear Programming (LP) derives in part from its many ap-

plications and in part from the existence of efficient techniques to solve it. These

techniques are fast and reliable over a substantial range of problem sizes, inputs and

applications. Linear programming has been proven to be valuable for modeling di-

verse types of problems in planning, routing, scheduling, assignment, and design.

Industries that make use of LP and its extensions include transportation, energy,

telecommunications, health care, finance and manufacturing. In a number of these

applications, a realistic model gives rise to a LP problem with a large number of

variables and constraints. This makes the problem more complicated and requires

substantial computational resources to solve it; in particular, a substantial amount of

fast memory and higher computational speed. For this reason, a number of special-

ized procedures, such as column generation and cutting-plane methods, have been

developed to effectively solve such large-scale linear programs. Yet, in other cases,

the LP problem may have a special structure where decomposition methods can

be useful.

Linear programming has numerous and important applications in the domain of

wireless communications, e.g. network flow, power control, caching networks, etc.

Most of these applications deal with very large number of variables and constraints.

For example, caching networks deal with a very large number of users and a tremen-

dous amount of data contents. Therefore, we focus in this dissertation on linear

programming methods for such large scale problems. We also investigate some soft-

ware packages to implement and solve these problems. Thanks to the advances in

computing over the past decade, linear programs in a few thousand variables and

constraints are nowadays viewed as "small" problems. Problems having tens, or even
hundreds, of thousands of continuous variables are regularly solved using software
packages such as AMPL, GAMS, Matlab, etc. Large-scale LP software packages utilize
special techniques from numerical linear algebra, such as sparse matrix techniques,
together with refinements developed through years of experience. These techniques,
however, are not the focus of this dissertation. Instead, this dissertation presents some
common examples of wireless communication systems and illustrates how to solve them
using these software packages.

The dissertation is organized as follows: Chapter 2 is a review of the fundamentals
of linear programming and fixes the notation. In Chapter 3, we describe some important
large scale LP algorithms, focusing on algorithms that have proven themselves in
practice. In Chapter 4, we illustrate how to implement the large scale LP algorithms
discussed in Chapter 3, using software packages such as AMPL, GAMS and Matlab,
to solve some large scale LP examples from the context of wireless communication.

Chapter 2: Review of Linear Programming

2.1 Introduction

A linear program (LP) is an optimization problem in which the objective function

is linear in the unknowns and the constraints consist of linear equalities and linear

inequalities [1]. In a general linear programming problem, we are given a cost vector
$c = (c_1, \dots, c_n)$ and we seek to minimize a linear cost function
$c'x = \sum_{i=1}^{n} c_i x_i$ over all $n$-dimensional vectors $x = (x_1, \dots, x_n)$
subject to a set of linear equality and

inequality constraints. Any linear program can be transformed into the following

standard form:
\[
\begin{aligned}
\underset{x}{\text{minimize}} \quad & c_1 x_1 + c_2 x_2 + \cdots + c_n x_n \\
\text{subject to} \quad & a_{11} x_1 + a_{12} x_2 + \cdots + a_{1n} x_n = b_1 \\
& a_{21} x_1 + a_{22} x_2 + \cdots + a_{2n} x_n = b_2 \\
& \qquad \vdots \\
& a_{m1} x_1 + a_{m2} x_2 + \cdots + a_{mn} x_n = b_m \\
& x_1 \ge 0, \; x_2 \ge 0, \; \dots, \; x_n \ge 0,
\end{aligned}
\tag{2.1}
\]

where the bi ’s, ci ’s and aij ’s are fixed real constants, and the xi ’s are real numbers to

be determined. We always assume that each equation has been multiplied by minus

unity, if necessary, so that each bi ≥ 0. A linear programming problem of the form:
\[
\begin{aligned}
\underset{x}{\text{minimize}} \quad & c'x \\
\text{subject to} \quad & Ax = b \\
& x \ge 0,
\end{aligned}
\tag{2.2}
\]

is said to be in standard form. Suppose that $x$ has dimension $n$, and let
$a_1', \dots, a_m'$ denote the rows of the $m \times n$ matrix $A$, with $b$
an $m$-dimensional column vector.

The variables x1 , · · · , xn are called decision variables, and a vector x satisfying all of

the constraints is called a feasible solution or feasible vector. The set of all feasible
solutions is called the feasible set or feasible region. The function $c'x$ is called the
objective function or cost function. A feasible solution $x^*$ that minimizes the objective
function (that is, $c'x^* \le c'x$ for all feasible $x$) is called an optimal feasible solution
or, simply, an optimal solution. The value of $c'x^*$ is then called the optimal cost.
An equality constraint $a_i'x = b_i$ is equivalent to the two constraints $a_i'x \le b_i$
and $a_i'x \ge b_i$. In addition, any constraint of the form $a_i'x \le b_i$ can be rewritten as
$(-a_i)'x \ge -b_i$. Finally, constraints of the form $x_j \ge 0$ or $x_j \le 0$ are special cases of
constraints of the form $a_i'x \ge b_i$, where $a_i$ is a unit vector and $b_i = 0$. We conclude that
the feasible set in a general linear programming problem can be expressed exclusively
in terms of inequality constraints of the form $a_i'x \ge b_i$. Suppose that there is a total
of $m$ such constraints, indexed by $i = 1, \dots, m$, let $b = (b_1, \dots, b_m)$, and let $A$ be
the $m \times n$ matrix whose rows are the row vectors $a_1', \dots, a_m'$. Then the linear
programming problem can be written as:

\[
\begin{aligned}
\underset{x}{\text{minimize}} \quad & c'x \\
\text{subject to} \quad & Ax \ge b.
\end{aligned}
\tag{2.3}
\]

Example 1. The following is a linear programming problem:

minimize x1 − x2 + x3
x

subject to x1 + x2 + x4 ≤2

x2 − x3 =5

x3 + x4 ≥3

x1 ≥0

x3 ≤ 0.

It can be rewritten as follows:

minimize x1 − x2 + x3
x

subject to − x1 − x2 − x4 ≥ −2

x2 − x3 ≥5

− x2 + x3 ≥ −5

x3 + x4 ≥3

x1 ≥0

− x3 ≥ 0.

with $c = (1, -1, 1, 0)$,
\[
A = \begin{pmatrix}
-1 & -1 & 0 & -1 \\
0 & 1 & -1 & 0 \\
0 & -1 & 1 & 0 \\
0 & 0 & 1 & 1 \\
1 & 0 & 0 & 0 \\
0 & 0 & -1 & 0
\end{pmatrix},
\]
and $b = (-2, 5, -5, 3, 0, 0)$.

Any general linear programming problem can be transformed into an equivalent

problem in standard form (2.2). We say that the two problems are equivalent; that is,

given a feasible solution to one problem, we can construct a feasible solution to the

other, with the same cost. In particular, the two problems have the same optimal cost

and given an optimal solution to one problem, we can construct an optimal solution

to the other. The problem transformation is based on two steps:

(a) Elimination of free variables: Any real number can be written as the dif-

ference of two non-negative numbers. Hence, any unrestricted variable xj in a


problem in general form can be replaced by $x_j^+ - x_j^-$, where $x_j^+$ and $x_j^-$ are new
variables on which we impose the sign constraints $x_j^+ \ge 0$ and $x_j^- \ge 0$.

(b) Elimination of inequality constraints: Given an inequality constraint of


the form $\sum_{j=1}^{n} a_{ij} x_j \le b_i$, we introduce a new variable $s_i$ and the standard form
constraints $\sum_{j=1}^{n} a_{ij} x_j + s_i = b_i$, $s_i \ge 0$. Such a variable $s_i$ is called a slack
variable. Similarly, an inequality constraint $\sum_{j=1}^{n} a_{ij} x_j \ge b_i$ can be put in standard
form by introducing a surplus variable $s_i$ and the constraints $\sum_{j=1}^{n} a_{ij} x_j - s_i = b_i$,
$s_i \ge 0$.

Example 2. Given the problem:

minimize 2x1 + 4x2

subject to x1 + x2 ≥3

3x1 + 2x2 = 14

x1 ≥ 0.

is equivalent to the standard form problem:

\[
\begin{aligned}
\text{minimize} \quad & 2x_1 + 4x_2^+ - 4x_2^- \\
\text{subject to} \quad & x_1 + x_2^+ - x_2^- - x_3 = 3 \\
& 3x_1 + 2x_2^+ - 2x_2^- = 14 \\
& x_1, \, x_2^+, \, x_2^-, \, x_3 \ge 0.
\end{aligned}
\]

For example, given the feasible solution (x1 , x2 ) = (6, −2) to the original problem, we ob-

tain the feasible solution $(x_1, x_2^+, x_2^-, x_3) = (6, 0, 2, 1)$ to the standard form problem,
which has the same cost. Conversely, given the feasible solution
$(x_1, x_2^+, x_2^-, x_3) = (8, 1, 6, 0)$ to the standard form problem, we obtain the feasible
solution $(x_1, x_2) = (8, -5)$ to the original problem with the same cost.
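To make this equivalence concrete, the following minimal sketch solves both versions of Example 2 and confirms that the optimal costs coincide. The use of SciPy's linprog here is our own choice for illustration; the software packages discussed in this dissertation (AMPL, GAMS and Matlab) appear in Chapter 4.

# A minimal numerical check of Example 2 (SciPy is assumed).
import numpy as np
from scipy.optimize import linprog

# Original problem: minimize 2*x1 + 4*x2
#   s.t. x1 + x2 >= 3 (rewritten as -x1 - x2 <= -3),
#        3*x1 + 2*x2 = 14, x1 >= 0, x2 free.
orig = linprog(c=[2, 4],
               A_ub=[[-1, -1]], b_ub=[-3],
               A_eq=[[3, 2]], b_eq=[14],
               bounds=[(0, None), (None, None)],
               method="highs")

# Standard form: variables (x1, x2plus, x2minus, x3), all >= 0.
std = linprog(c=[2, 4, -4, 0],
              A_eq=[[1, 1, -1, -1],
                    [3, 2, -2, 0]],
              b_eq=[3, 14],
              bounds=(0, None),
              method="highs")

print(orig.fun, std.fun)  # both report the same optimal cost (here -4)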

2.2 Geometry of a Linear Program

We can also visualize standard form problems geometrically. For example, consider

the problem
minimize − x1 − x2

subject to x1 + 2x2 ≤3

2x1 + x2 ≤3

x1 , x2 ≥ 0.
The feasible set is the shaded region in Figure 2.1. In order to find an optimal

solution, we identify the cost vector c = (−1, −1) and for any given z, we consider

the line described by the equation −x1 − x2 = z. We change z to move this line in the

direction of the vector −c as much as possible as long as we do not leave the feasible

region. The best we can do is z = −2 and the vector x = (1, 1) is an optimal solution

which is a corner in the feasible region.

Now assume that m ≤ n and that the constraints Ax = b force x to lie on an

(n − m)-dimensional set. For example, consider the feasible set in R3 defined by the

constraints x1 + x2 + x3 = 1 and x1 , x2 , x3 ≥ 0 and note that n = 3 and m = 1. The

plane defined by the equality constraint appears as a triangle in a two-dimensional

space. Furthermore, each edge of the triangle corresponds to one of the inequality

constraints. The optimal solution lies inside the shaded triangle in Figure 2.2.

Figure 2.1: Graphical solution of a linear program example.

Figure 2.2: Visualization of standard form problems. (a) An (n − m)-dimensional view of the same set. (b) An n-dimensional view of the feasible set.

In general, for any linear program, we have the following possibilities:

(a) There exists a unique optimal solution.

(b) There exist multiple optimal solutions; in this case, the set of optimal solutions

can be either bounded or unbounded.

(c) The optimal cost is −∞, and no feasible solution is optimal.

(d) The feasible set is empty.

As a preliminary investigation, we can say that if the problem has at least one

optimal solution, then an optimal solution can be found among the corners of the

feasible set. This idea is the core of how to solve a linear program. To develop it,

let us start with some basic definitions.

Definition 1. A polyhedron is a set that can be described in the form {x ∈ Rn |Ax ≥ b},

where A is an m × n matrix and b is a vector in Rm .

Definition 2. Let P be a polyhedron. A vector x ∈ P is an extreme point of P if we

cannot find two vectors y, z ∈ P , both different from x, and a scalar λ ∈ [0, 1] such that

x = λy + (1 − λ)z.

An alternative definition, which is also used to find the unique optimal solution, is
that of a vertex of a polyhedron.

Definition 3. Let P be a polyhedron. A vector x ∈ P is a vertex of P if there exists some


$c$ such that $c'x < c'y$ for all $y$ satisfying $y \in P$ and $y \ne x$.

In other words, x is a vertex of P if and only if P is on one side of a hyperplane

which meets P only at the point x. Consider a polyhedron P ⊂ Rn defined in terms

of the linear equality and inequality constraints


\[
\begin{aligned}
a_i'x &\ge b_i, \quad i \in M_1, \\
a_i'x &\le b_i, \quad i \in M_2, \\
a_i'x &= b_i, \quad i \in M_3,
\end{aligned}
\]
where M1 , M2 and M3 are finite index sets, each ai is a vector in Rn , and each bi is a

scalar. This allows us to see the following definition.

Definition 4. If a vector $x^*$ satisfies $a_i'x^* = b_i$ for some $i \in M_1$, $M_2$, or $M_3$, we say that the

corresponding constraint is active or binding at x∗ .

If there are n constraints that are active at a vector x∗ , then x∗ satisfies a certain

system of n linear equations in n unknowns. This system has a unique solution if

and only if these n equations are "linearly independent". Since we have m equality

constraints in the standard form problems, we need to find n−m inequality constraints

which are also active. Once we have n linearly independent active constraints, a

unique vector x∗ is determined. However, this procedure has no guarantee of leading

to a feasible vector x∗ , because some of the inactive constraints could be violated; in

the latter case we say that we have a basic (but not basic feasible) solution.

Definition 5. Consider a polyhedron P defined by linear equality and inequality con-

straints, and let x∗ be an element of Rn .

(a) The vector x∗ is a basic solution if:

i All equality constraints are active;

ii Out of the constraints that are active at x∗ , there are n of them that are linearly

independent.

(b) If x∗ is a basic solution that satisfies all of the constraints, we say that it is a basic

feasible solution.

Now, we relate these definitions together in the following theorem:

Theorem 1. Let P be a nonempty polyhedron and let x∗ ∈ P . Then, the following are

equivalent: (a) x∗ is a vertex; (b) x∗ is an extreme point; (c) x∗ is a basic feasible solution.

Every basic solution must satisfy the equality constraints Ax = b, which provides

us with m active constraints. These active constraints are linearly independent by

the assumption on the rows of A. To obtain a total of n active constraints, we need to

choose n − m of the variables xi and set them to zero, which makes the corresponding

constraint xi ≥ 0 active. However, to get a set of n linearly independent active

constraints, the choice of these n − m variables is not entirely arbitrary.

Theorem 2. A vector x ∈ Rn is a basic solution if and only if we have Ax = b and there

exist indices B(1), · · · , B(m) such that:

(a) The columns AB(1) , · · · , AB(m) are linearly independent;

(b) if i 6= B(1), · · · , B(m), then xi = 0.

Therefore, to find a basic solution, we need to choose m linearly independent

columns AB(1) , · · · , AB(m) . We let xi = 0 for all i 6= B(1), · · · , B(m). Then, we

solve the system of m equations $Bx_B = b$ for the unknowns $x_{B(1)}, \dots, x_{B(m)}$. If the

result of this procedure is nonnegative, then it is feasible. If x is a basic solution,

the variables xB(1) , · · · , xB(m) are called basic variables; the remaining variables are

called nonbasic. The columns AB(1) , · · · , AB(m) are called basic columns and, since

they are linearly independent, they form a basis of Rn . By arranging the m basic

columns next to each other, we obtain an m × m matrix B called a basis matrix. We

can also define a vector xB with the values of the basic variables. Thus,
 
\[
B = \begin{pmatrix}
| & | & & | \\
A_{B(1)} & A_{B(2)} & \cdots & A_{B(m)} \\
| & | & & |
\end{pmatrix},
\qquad
x_B = \begin{pmatrix} x_{B(1)} \\ \vdots \\ x_{B(m)} \end{pmatrix}.
\]

The basic variables are determined by solving the equation $Bx_B = b$, whose unique
solution is given by $x_B = B^{-1}b$. We end this section with the following theorem.

Theorem 3. Consider the LP problem of minimizing $c'x$ over a polyhedron $P$. Suppose that

P has at least one extreme point and that there exists an optimal solution. Then, there exists

an optimal solution which is an extreme point of P.
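To make the construction of basic solutions above concrete, the following small sketch (with toy data of our own) chooses m linearly independent columns, sets the remaining variables to zero, and solves $Bx_B = b$:

# Computing a basic solution for a toy standard form problem (NumPy).
import numpy as np

A = np.array([[1.0, 1.0, 2.0, 0.0],
              [0.0, 1.0, 1.0, 1.0]])    # m = 2, n = 4
b = np.array([3.0, 2.0])

basic = [0, 1]                    # candidate basic indices B(1), B(2)
B = A[:, basic]                   # basis matrix
x = np.zeros(4)
x[basic] = np.linalg.solve(B, b)  # basic variables x_B = B^{-1} b

print(x)              # [1. 2. 0. 0.]
print(x.min() >= 0)   # nonnegative, so this basic solution is feasible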

2.3 Degeneracy

According to the previous definitions, at a basic solution we must have n linearly
independent active constraints. This allows for the possibility that the number of active
constraints is greater than n. Of course, in n dimensions, no more than n of them

can be linearly independent. This also means that we will have more than n − m

variables with the value of zero. In this case, we say that we have a degenerate basic

solution.

Definition 6. A basic solution x ∈ Rn is said to be degenerate if more than n of the

constraints are active at x. In other words, if more than n − m of the components of x take

the value of zero.

If the entries of A or b were chosen at random, this would almost never happen.

However, in practical problems, the entries of A and b often have a nonrandom

structure and degeneracy is more common.

Example 3. Consider the polyhedron P defined by the constraints

x1 + x2 + 2x3 ≤ 8

x2 + 6x3 ≤ 12

x1 ≤ 4

x2 ≤ 6

x1 , x2 , x3 ≥ 0.

The vector x = (2, 6, 0) is a nondegenerate basic feasible solution, because there

are exactly three active and linearly independent constraints, namely
$x_1 + x_2 + 2x_3 \le 8$, $x_2 \le 6$, and $x_3 \ge 0$. The vector $x = (4, 0, 2)$ is a
degenerate basic feasible solution, because there are four active constraints, of which
three are linearly independent,

namely x1 + x2 + 2x3 ≤ 8, x2 + 6x3 ≤ 12, x1 ≤ 4 and x2 ≥ 0.
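The counts in this example are easy to verify numerically; the following short sketch (NumPy, with our own tolerance choice) counts the constraints active at each of the two vectors:

# Counting active constraints at a point of the polyhedron in Example 3.
import numpy as np

G = np.array([[1.0, 1.0, 2.0],    # x1 + x2 + 2*x3 <= 8
              [0.0, 1.0, 6.0],    # x2 + 6*x3      <= 12
              [1.0, 0.0, 0.0],    # x1             <= 4
              [0.0, 1.0, 0.0]])   # x2             <= 6
h = np.array([8.0, 12.0, 4.0, 6.0])

def active_count(x, tol=1e-9):
    upper = np.abs(G @ x - h) < tol   # active inequality constraints
    signs = np.abs(x) < tol           # active sign constraints x_i >= 0
    return upper.sum() + signs.sum()

print(active_count(np.array([2.0, 6.0, 0.0])))  # 3 -> nondegenerate
print(active_count(np.array([4.0, 0.0, 2.0])))  # 4 > n - m ... degenerate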

2.4 The Simplex Method

If a linear program in standard form has an optimal solution, then there exists a

basic feasible solution that is optimal. The simplex method searches for an optimal

solution by moving from one basic feasible solution to another, along the edges of

the feasible set, in a cost reducing direction. For general optimization problems, a

locally optimal solution need not be globally optimal. In linear programming, local

optimality implies global optimality, because we are minimizing a convex cost function

over a convex set. Therefore, the simplex method terminates once an optimal solution

is found.

Now suppose that we are at point x ∈ P and we are moving away from x in the

direction of a vector d ∈ Rn . Clearly, we should not consider those choices of d which

take us outside the feasibility set. We say that d ∈ Rn is a feasible direction at x,

if there is a positive scalar θ for which x + θd ∈ P . We are moving away from x,

to a new vector x + θd by selecting a nonbasic variable xj (which is initially zero)

and increasing it to a positive value θ, while keeping the remaining nonbasic variables

at zero. Algebraically, dj = 1, and di = 0 for every nonbasic index i other than

j. At the same time, the vector xB of basic variables changes to xB + θdB , where

dB = (dB(1) , dB(2) , · · · , dB(m) ). Since we are only interested in feasible solutions, we

require A(x + θd) = b, and since x is feasible, we have Ax = b. Thus, for θ > 0, we

need Ad = 0. Then
\[
0 = Ad = \sum_{i=1}^{n} A_i d_i = \sum_{i=1}^{m} A_{B(i)} d_{B(i)} + A_j = B d_B + A_j.
\]
Since $B$ is invertible, we obtain
\[
d_B = -B^{-1} A_j.
\]

Now, if $d$ is the $j$th basic direction, then the rate $c'd$ of cost change along the direction
$d$ is given by $c_B'd_B + c_j$, where $c_B = (c_{B(1)}, \dots, c_{B(m)})$. This is defined as the reduced
cost $\bar{c}_j = c_j - c_B'B^{-1}A_j$ of moving in this direction. Note that $c_j$ is the cost per unit
increase in the variable $x_j$, and the term $-c_B'B^{-1}A_j$ is the cost of the compensating
change in the basic variables necessitated by the constraint $Ax = b$. Since $B$ is the
matrix $[A_{B(1)} \cdots A_{B(m)}]$, we have $B^{-1}[A_{B(1)} \cdots A_{B(m)}] = I$, where $I$ is the $m \times m$
identity matrix. Therefore, for every basic variable $x_{B(i)}$, we have
\[
\bar{c}_{B(i)} = c_{B(i)} - c_B'B^{-1}A_{B(i)} = c_{B(i)} - c_B'e_i = c_{B(i)} - c_{B(i)} = 0,
\]

that is, the reduced cost of every basic variable is zero. The following theorem illus-

trates the optimality conditions.

Theorem 4. Consider a basic feasible solution x associated with a basis matrix B, and let

c̄ be the corresponding vector of reduced costs.

(a) If c̄ ≥ 0, then x is optimal.

(b) If x is optimal and nondegenerate, then c̄ ≥ 0.

That is, in order to decide whether a nondegenerate basic feasible solution is

optimal, we need only to check whether all reduced costs are nonnegative, which is

the same as examining n − m basic directions. If x is a degenerate basic feasible

solution, an equally simple computational test for determining whether x is optimal

is not available. Therefore, to assert that a certain basic solution is optimal, we need

to satisfy two conditions: feasibility and nonnegativity of the reduced costs. This

leads us to the following definition.

Definition 7. A basis matrix $B$ is said to be optimal if:

(a) $B^{-1}b \ge 0$, and

(b) $\bar{c}' = c' - c_B'B^{-1}A \ge 0'$.

If an optimal basis is found, the corresponding basic solution is feasible, satisfies

the optimality conditions, and is therefore optimal. Let us assume that every basic

feasible solution is nondegenerate. Suppose we are at a basic feasible solution x and

that we have computed the reduced costs c̄j of the nonbasic variables. If the reduced

cost c̄j of a nonbasic variable xj is negative, the jth basic direction d is a feasible

direction of cost reduction. While moving along this direction d, the nonbasic variable

xj becomes positive and all other nonbasic variables remain at zero. We describe this

situation by saying that xj (or Aj ) enters or is brought into the basis and replaces

one of the columns in B. An iteration of the simplex method is described as follows:

Algorithm 1 An iteration of the simplex method

1. We start with a basis consisting of the basic columns $A_{B(1)}, \dots, A_{B(m)}$ and an associated basic feasible solution $x$.

2. Compute the reduced costs $\bar{c}_j = c_j - c_B'B^{-1}A_j$ for all nonbasic indices $j$. If they are all nonnegative, then $x$ is optimal and the algorithm terminates; else, choose some $j$ for which $\bar{c}_j < 0$.

3. Compute $u = B^{-1}A_j$. If no component of $u$ is positive, we have $\theta^* = \infty$, the optimal cost is $-\infty$, and the algorithm terminates.

4. If some component of $u$ is positive, let
\[
\theta^* = \min_{\{i = 1, \dots, m \,|\, u_i > 0\}} \frac{x_{B(i)}}{u_i}.
\]

5. Let $l$ be such that $\theta^* = x_{B(l)}/u_l$. Form a new basis by replacing $A_{B(l)}$ with $A_j$. If $y$ is the new basic feasible solution, the values of the new basic variables are $y_j = \theta^*$ and $y_{B(i)} = x_{B(i)} - \theta^* u_i$, $i \ne l$.

The following theorem states that, in the nondegenerate case, the simplex method
works correctly and terminates after a finite number of iterations.

Theorem 5. Assume that the feasible set is nonempty and that every basic feasible solution
is nondegenerate. Then, the simplex method terminates after a finite number of iterations.
At termination, there are the following two possibilities:

(a) We have an optimal basis $B$ and an associated basic feasible solution which is optimal.

(b) We have found a vector $d$ satisfying $Ad = 0$, $d \ge 0$, $c'd < 0$, and the optimal cost is
$-\infty$.
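As an illustration, the following compact sketch implements one iteration of Algorithm 1 in Python, written for clarity rather than efficiency. The toy data and the rule of entering on the first negative reduced cost are our own choices, and degeneracy and cycling are not handled.

# One iteration of the simplex method (Algorithm 1), naive linear algebra.
import numpy as np

def simplex_iteration(A, b, c, basic):
    B = A[:, basic]
    x_B = np.linalg.solve(B, b)             # current basic variables
    p = np.linalg.solve(B.T, c[basic])      # simplex multipliers p' = c_B' B^{-1}
    c_bar = c - A.T @ p                     # reduced costs (zero for basic columns)
    nonbasic = [j for j in range(A.shape[1]) if j not in basic]
    entering = [j for j in nonbasic if c_bar[j] < -1e-9]
    if not entering:
        return basic, "optimal"
    j = entering[0]                         # any column with negative reduced cost
    u = np.linalg.solve(B, A[:, j])         # u = B^{-1} A_j
    if np.all(u <= 1e-9):
        return basic, "optimal cost is -infinity"
    ratios = [x_B[i] / u[i] if u[i] > 1e-9 else np.inf for i in range(len(u))]
    l = int(np.argmin(ratios))              # ratio test determines the exiting index
    new_basic = list(basic)
    new_basic[l] = j                        # A_j replaces A_B(l) in the basis
    return new_basic, "pivoted"

# Toy problem: minimize -x1 - x2 s.t. x1 + 2*x2 + x3 = 3, 2*x1 + x2 + x4 = 3.
A = np.array([[1.0, 2.0, 1.0, 0.0],
              [2.0, 1.0, 0.0, 1.0]])
b = np.array([3.0, 3.0])
c = np.array([-1.0, -1.0, 0.0, 0.0])
basis, status = [2, 3], "pivoted"           # start from the slack basis
while status == "pivoted":
    basis, status = simplex_iteration(A, b, c, basis)
print(sorted(basis), status)   # [0, 1] optimal -> x = (1, 1), as in Figure 2.1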

2.4.1 Implementation of the Simplex Method

From the previous section, we notice that the vectors B−1 Aj play a key role in the

simplex method. If these vectors are available, the reduced cost c̄, the direction of

motion d, and the step size θ∗ are easily computed. Thus, the main difference between

alternative implementations lies in the way that the vectors B−1 Aj are computed and

the complexity of this computation. We introduce a comparison between alternative

implementations and performance enhancement in Section 2.4.2.

Naive Implementation

At the beginning of a typical iteration, we have the indices B(1), · · · , B(m) of the
current basic variables. For the basis matrix $B$, we compute the vector $p' = c_B'B^{-1}$,
which is called the vector of simplex multipliers associated with the basis $B$. The
reduced cost of any variable $x_j$ is then obtained by $\bar{c}_j = c_j - p'A_j$. Depending on

the pivoting rule employed, we may have to compute all of the reduced costs or we

may compute them one at a time until a variable with a negative reduced cost is

encountered. Once a column Aj is selected to enter the basis, we solve the linear

system $Bu = A_j$ in order to determine the vector $u = B^{-1}A_j$. At this point, we

can form the direction along which we will be moving away from the current basic

feasible solution. We finally determine θ∗ and the variable that will exit the basis, and

construct the new basic feasible solution. This iteration is repeated until all reduced

costs are nonnegative.

Revised Simplex Method

The computational complexity of the naive implementation is due to the need

for solving two linear systems of equations. In the revised simplex method, the

matrix B−1 is made available at the beginning of each iteration, and the vectors
$c_B'B^{-1}$ and $B^{-1}A_j$ are computed by matrix-vector multiplication. However, we

need an efficient method for updating the matrix B−1 for each basis change. Let
 
$B = [A_{B(1)} \cdots A_{B(m)}]$ be the basis matrix at the beginning of an iteration and let
$\bar{B} = [A_{B(1)} \cdots A_{B(l-1)} \; A_j \; A_{B(l+1)} \cdots A_{B(m)}]$ be the basis matrix at the beginning
of the next iteration. These two basis matrices have the same columns except
that the $l$th column $A_{B(l)}$ (the one that exits the basis) has been replaced by $A_j$

(the one that enters the basis). Thus, B−1 contains information that can be exploited

in the computation of B̄−1 . Since B−1 B = I, we see that B−1 AB(i) is the ith unit

vector ei and hence we have


 
| | | | |
B−1 B̄ =  e1 · · · el−1 u el+1 · · · em  , u = B−1 Aj
| | | | |

We can now apply a sequence of elementary row operations that will change the

above matrix to the identity matrix. This sequence of elementary row operations

is equivalent to left-multiplying B−1 B̄ by a certain invertible matrix Q. Hence, we

have QB−1 B̄ = I, which yields QB−1 = B̄−1 . So, applying the same sequence of

row operations to the matrix B−1 , we obtain B̄−1 . A typical iteration of the revised

simplex method includes the same steps as in Algorithm 1 with one more step added

at the end to compute $\bar{B}^{-1}$: form the $m \times (m+1)$ matrix $[B^{-1} \,|\, u]$, and add to each one
of its rows a multiple of the $l$th row to make the last column equal to the unit vector
$e_l$. The first $m$ columns of the result give the matrix $\bar{B}^{-1}$.
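The following short sketch (with randomly generated toy data) carries out this update and checks it against direct inversion:

# Updating the inverse basis matrix by elementary row operations (NumPy).
import numpy as np

def update_inverse(Binv, u, l):
    m = Binv.shape[0]
    T = np.hstack([Binv, u.reshape(-1, 1)])  # the m x (m+1) matrix [B^{-1} | u]
    T[l, :] /= T[l, m]                       # make the pivot element equal to one
    for i in range(m):
        if i != l:
            T[i, :] -= T[i, m] * T[l, :]     # zero out the rest of the last column
    return T[:, :m]                          # first m columns give B_bar^{-1}

# Check: replace column l of a random B by a new column Aj.
rng = np.random.default_rng(0)
B = rng.standard_normal((4, 4))
Aj = rng.standard_normal(4)
l = 2
Binv = np.linalg.inv(B)
u = Binv @ Aj
Bbar = B.copy()
Bbar[:, l] = Aj
print(np.allclose(update_inverse(Binv, u, l), np.linalg.inv(Bbar)))  # True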

The Full Tableau Implementation

Instead of storing and updating the matrix B−1 , we store and update the m ×

(n + 1) matrix $B^{-1}[b \,|\, A]$ with columns $B^{-1}b$ and $B^{-1}A_1, \dots, B^{-1}A_n$. This matrix
 

is called the simplex tableau. The column B−1 b, called the zeroth column, contains

the values of the basic variables. The column B−1 Ai is called the ith column of the

tableau. The column u = B−1 Aj corresponding to the variable that enters the basis

is called the pivot column. If the lth basic variable exits the basis, the lth row of

the tableau is called the pivot row. Finally, the element belonging to both the pivot

row and the pivot column is called the pivot element. Note that the pivot element

is $u_l$ and is always positive (unless $u \le 0$, in which case the algorithm has met the

termination condition in Step 3).

Note that given the current basis matrix B, the equality constraint Ax = b can

be rewritten as B−1 b = B−1 Ax, which is precisely the information in the tableau.

At the end of each iteration, we need to update the tableau $B^{-1}[b \,|\, A]$ and compute
$\bar{B}^{-1}[b \,|\, A]$. This can be accomplished following the same idea as the revised simplex

method. To determine the exiting column $A_{B(l)}$ and the step size $\theta^*$, Steps 4 and 5 in
Algorithm 1 amount to the following: $x_{B(i)}/u_i$ is the ratio of the $i$th entry in the

zeroth column of the tableau to the ith entry in the pivot column of the tableau. We

only consider those i for which ui is positive. The smallest ratio is equal to θ∗ and

determines l. We need to augment the simplex tableau by including a top row, to

be referred to as the zeroth row. The entry at the top left corner contains the value
$-c_B'x_B$, which is the negative of the current cost. The rest of the zeroth row is the
row vector of the reduced costs $\bar{c}' = c' - c_B'B^{-1}A$. The structure of the tableau is

shown in Figure 2.3.

A summary of the full tableau implementation method is described in the following

algorithm.

\[
\begin{array}{c|c}
-c_B'B^{-1}b & c' - c_B'B^{-1}A \\
\hline
B^{-1}b & B^{-1}A
\end{array}
\]

Figure 2.3: Full Tableau Structure

Algorithm 2 An iteration of the full tableau implementation

1. A typical iteration starts with the tableau associated with a basis matrix B and the
corresponding basic feasible solution x.

2. Examine the reduced costs in the zeroth row of the tableau. If they are all nonnega-
tive, the current basic feasible solution is optimal, and the algorithm terminates; else,
choose some j for which c̄j < 0.

3. Consider the vector u = B−1 Aj , which is the jth column (the pivot column) of the
tableau. If no component of u is positive, the optimal cost is −∞, and the algorithm
terminates.

4. For each i for which ui is positive, compute the ratio xB(i) /ui . Let l be the index of a
row that corresponds to the smallest ratio. The column AB(l) exits the basis and the
column Aj enters the basis.

5. Add to each row of the tableau a constant multiple of the lth row (the pivot row)
so that ul (the pivot element) becomes one and all other entries of the pivot column
become zero.

2.4.2 Comparisons and Performance Enhancements

When comparing different implementations, it is important to keep the following

facts in mind. If B is a given m × m matrix and b ∈ Rm is a given vector, computing

the inverse of B or solving a linear system of the form $Bx = b$ takes $O(m^3)$ arithmetic
operations. Computing a matrix-vector product $Bb$ takes $O(m^2)$ operations. Finally,
computing an inner product $p'b$ of two m-dimensional vectors takes $O(m)$ arithmetic

operations.

Note that, in the naive implementation, we need $O(m^3)$ arithmetic operations to
solve the systems $p'B = c_B'$ and $Bu = A_j$. In addition, computing the reduced costs
of all variables requires $O(mn)$ arithmetic operations, because we need to form the
inner product of the vector $p$ with each one of the nonbasic columns $A_j$. Thus, the
total computational effort per iteration, for the naive implementation, is $O(m^3 + mn)$.
The alternative implementations require only $O(m^2 + mn)$ arithmetic operations.

Therefore, the naive implementation is rather inefficient, in general. On the other


hand, for certain problems with a special structure, the linear systems $p'B = c_B'$ and

Bu = Aj can be solved very fast, in which case the naive implementation can be of

practical interest.

The full tableau method requires a constant (and small) number of arithmetic

operations for updating each entry of the tableau. Thus, the amount of computation

per iteration is proportional to the size of the tableau, which is O(mn). The revised
simplex method uses similar computations to update $B^{-1}$ and $c_B'B^{-1}$, and since only
$O(m^2)$ entries are updated, the computational requirements per iteration are $O(m^2)$.
In addition, the reduced cost of each variable $x_j$ can be computed by forming the inner
product $p'A_j$, which requires $O(m)$ operations. In the worst case, the reduced cost
of every variable is computed, for a total of $O(mn)$ computations per iteration. Since
$m \le n$, the worst-case computational effort per iteration is $O(mn + m^2) = O(mn)$,

under either implementation. On the other hand, if we consider a pivoting rule that

evaluates one reduced cost at a time, until a negative reduced cost is found, a typical

iteration of the revised simplex method might require a lot less work. In the best case,

if the first reduced cost computed is negative, and the corresponding variable is chosen

to enter the basis, the total computational effort is only $O(m^2)$. The conclusion is

that the revised simplex method cannot be slower than the full tableau method, and

could be much faster during most iterations.

Another advantage to the revised simplex method is that memory requirements

are reduced from $O(mn)$ to $O(m^2)$. As $n$ is often much larger than $m$, this effect can

be quite significant. It could be counterargued that the memory requirements of the

revised simplex method are also O(mn) because of the need to store the matrix A.

However, in most large scale problems that arise in applications, the matrix A is very

sparse (has many zero entries) and can be stored compactly. (Note that the sparsity

of A does not usually help in the storage of the full simplex tableau because even if A

and B are sparse, B−1 A is not sparse in general). The following table summarizes this

discussion. Here, memory is the storage space required, time is the computational
effort per iteration, and the best-case time assumes that the first reduced cost
computed is negative.

Table 2.1: Comparison between Simplex implementation methods

                   Naive            Revised     Full Tableau
Memory             $O(m)$           $O(m^2)$    $O(mn)$
Worst-case time    $O(m^3 + mn)$    $O(mn)$     $O(mn)$
Best-case time     $O(m^3)$         $O(m^2)$    $O(mn)$

Some ideas from numerical linear algebra can help us to enhance the performance

of the simplex method. The following are some examples of these ideas:

1. Recall that at each iteration of the revised simplex method, the inverse basis

matrix B−1 is updated according to certain rules. Each such iteration may

introduce roundoff or truncation errors which accumulate and may eventually

lead to highly inaccurate results. For this reason, it is customary to recompute
("reinvert") the matrix $B^{-1}$ from scratch once in a while. The efficiency of such
reinversions can be greatly enhanced by using suitable data structures and certain
techniques from numerical linear algebra, such as LU factorization and sparse matrices.

2. Now, suppose that a reinversion has been just carried out and B−1 is available.

Subsequent to the current iteration of the revised simplex method, we have the

option of generating explicitly and storing the new inverse basis matrix B̄−1 .

An alternative that carries the same information, is to store a matrix Q such

that QB−1 = B̄−1 . Note that Q basically prescribes which elementary row

operations need to be applied to B−1 in order to produce B̄−1 . It is not a full

matrix, and can be completely specified in terms of m coefficients: for each row,

we need to know what multiple of the pivot row must be added to it.

3. Subsequent to a "reinversion," one does not usually compute $B^{-1}$ explicitly, but

B−1 is instead represented in terms of sparse triangular matrices with a special

structure.

These methods are designed to accomplish two objectives: improve numerical stability

(minimize the effect of roundoff errors) and exploit sparsity in the problem data

to improve both running time and memory requirements. These methods have a

critical effect in practice. Besides having a better chance of producing numerically

trustworthy results, they can also speed up considerably the running time of the

simplex method. Duality helps us check the efficiency of the algorithm used, by

solving both primal and dual problems and comparing the obtained results. Therefore,

we discuss the duality theory in the following section.

2.5 The Duality Theory

Consider the standard form problem


\[
\begin{aligned}
\underset{x}{\text{minimize}} \quad & c'x \\
\text{subject to} \quad & Ax = b \\
& x \ge 0,
\end{aligned}
\]

which we call the primal problem, and let x∗ be an optimal solution, assuming one exists.

We introduce a relaxed problem in which the constraint Ax = b is replaced by a


penalty $p'(b - Ax)$, where $p$ is a price vector of the same dimension as $b$. Then,
we have
\[
\begin{aligned}
\underset{x}{\text{minimize}} \quad & c'x + p'(b - Ax) \\
\text{subject to} \quad & x \ge 0.
\end{aligned}
\]
Let g(p) be the optimal cost for the relaxed problem, as a function of the price vector
$p$. We see that $g(p)$ is no larger than the optimal primal cost $c'x^*$, since
\[
g(p) = \min_{x \ge 0} \left[ c'x + p'(b - Ax) \right] \le c'x^* + p'(b - Ax^*) = c'x^*,
\]
where the latter inequality follows from the fact that $x^*$ is a feasible solution to the
primal problem, and satisfies $Ax^* = b$. Thus, each $p$ leads to a lower bound $g(p)$ for
the optimal cost $c'x^*$. The problem

\[
\begin{aligned}
\underset{p}{\text{maximize}} \quad & g(p) \\
\text{subject to} \quad & \text{no constraints,}
\end{aligned}
\]

which is interpreted as a search for the tightest possible lower bound of this type, is
known as the dual problem. Now, using the definition of $g(p)$, we have
\[
g(p) = \min_{x \ge 0} \left[ c'x + p'(b - Ax) \right] = p'b + \min_{x \ge 0} (c' - p'A)x.
\]

Note that
\[
\min_{x \ge 0} (c' - p'A)x =
\begin{cases}
0, & \text{if } c' - p'A \ge 0', \\
-\infty, & \text{otherwise.}
\end{cases}
\]
To maximize g(p), we only consider the values of p for which g(p) is not equal to

−∞. Therefore, we conclude that the dual problem is as follows


\[
\begin{aligned}
\underset{p}{\text{maximize}} \quad & p'b \\
\text{subject to} \quad & p'A \le c'.
\end{aligned}
\tag{2.4}
\]

Moreover, if we transform the dual into an equivalent minimization problem and then

form its dual, we obtain a problem equivalent to the original problem, i.e. "the dual
of the dual is the primal". Since g(p) provides a lower bound for the optimal cost,

we can now state the weak duality theorem as follows.

Theorem 6. (Weak Duality) If x is a feasible solution to the primal problem and p is a


feasible solution to the dual problem, then $p'b \le c'x$.

Although the weak duality theorem is not a deep result, it does provide some

useful information about the relation between the primal and the dual as stated in

the following corollaries.

Corollary 1. If the optimal cost in the primal is −∞, then the dual problem must be

infeasible. Moreover, if the optimal cost in the dual is +∞, then the primal problem must

be infeasible.

Corollary 2. Let $x$ and $p$ be feasible solutions to the primal and dual problems, respectively,
and suppose that $p'b = c'x$. Then $x$ and $p$ are optimal solutions to the primal and

the dual problems, respectively.

The next theorem is the central result on linear programming duality.

Theorem 7. (Strong Duality) If a linear programming problem has an optimal solution,

so does its dual, and the respective optimal costs are equal.
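The following minimal sketch illustrates strong duality numerically on a small standard form problem; the data and the use of SciPy's linprog are our own choices:

# Primal and dual of a small standard form LP have equal optimal costs.
import numpy as np
from scipy.optimize import linprog

c = np.array([2.0, 3.0, 1.0, 0.0])
A = np.array([[1.0, 1.0, 1.0, 0.0],
              [1.0, 2.0, 0.0, 1.0]])
b = np.array([4.0, 6.0])

# Primal: minimize c'x s.t. Ax = b, x >= 0.
primal = linprog(c, A_eq=A, b_eq=b, bounds=(0, None), method="highs")

# Dual: maximize p'b s.t. p'A <= c' (p free). linprog minimizes, so we
# minimize -b'p and write the constraint as A'p <= c.
dual = linprog(-b, A_ub=A.T, b_ub=c, bounds=(None, None), method="highs")

print(primal.fun, -dual.fun)   # equal optimal costs, as Theorem 7 asserts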

Recall that in a linear programming problem, exactly one of the following three

possibilities will occur:

(a) There is an optimal solution.

(b) The problem is "unbounded"; that is, the optimal cost is −∞ (for minimization
problems) or +∞ (for maximization problems).

(c) The problem is infeasible.

This leads to nine possible combinations for the primal and the dual, which are shown

in Table 2.2. By the strong duality theorem, if one problem has an optimal solution,

so does the other. Furthermore, the weak duality theorem implies that if one problem

is unbounded, the other must be infeasible. This allows us to mark some of the entries

in Table 2.2 as "impossible."

Table 2.2: The Different Possibilities for the Primal and Dual Problems

                    Finite Optimum    Unbounded     Infeasible
Finite Optimum      Possible          Impossible    Impossible
Unbounded           Impossible        Impossible    Possible
Infeasible          Impossible        Possible      Possible

There is another interesting relation between the primal and the dual which is

known as Clark’s theorem (Clark, 1961). It asserts that unless both problems are

infeasible, at least one of them must have an unbounded feasible set.

2.6 Example Problems for Wireless Communication Networks

We are interested in the connection between linear programming and wireless

communication networks. In this section we introduce some examples of wireless

communication problems which fit in the platform of linear programming. All of

them can require large scale linear programming techniques to overcome their growing

complexity with the system parameters.

2.6.1 Power Control in a Wireless Network

Consider a wireless communication system consisting of n mobile users and a

single base station as shown in Figure 2.4. For each i = 1, 2, · · · , n, user i transmits

a signal to the base station with power pi and an attenuation factor of hi (i.e., the

actual signal power received at the base station from user i is hi pi ). When the

base station is receiving from user $i$, the total power received from all other users is
considered as interference (i.e., $\sum_{j \ne i} h_j p_j$). For the communication with user $i$

to be reliable, the signal-to-interference ratio must exceed a threshold γi , where the

“signal” is the power received from user i. We are interested in minimizing the total

power transmitted by all users subject to having reliable communication for all users.

The total transmitted power is p1 + p2 + · · · + pn . The signal-to-interference ratio for

user $i$ is $h_i p_i / \sum_{j \ne i} h_j p_j$. Hence, the problem can be written as:
\[
\begin{aligned}
\text{minimize} \quad & p_1 + p_2 + \cdots + p_n \\
\text{subject to} \quad & \frac{h_i p_i}{\sum_{j \ne i} h_j p_j} \ge \gamma_i, \quad i = 1, 2, \dots, n \\
& p_1, p_2, \dots, p_n \ge 0.
\end{aligned}
\]

We can rewrite the problem as a linear programming problem as follows:

\[
\begin{aligned}
\text{minimize} \quad & p_1 + p_2 + \cdots + p_n \\
\text{subject to} \quad & h_i p_i - \gamma_i \sum_{j \ne i} h_j p_j \ge 0, \quad i = 1, 2, \dots, n \\
& p_1, p_2, \dots, p_n \ge 0.
\end{aligned}
\]

Note that the complexity of this problem increases with the number of users in the

network and hence it fits under large scale linear programming.
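A hedged numerical sketch of this LP is given below, using SciPy's linprog. Note that the formulation above is feasible with p = 0, so its optimum is trivially zero; to obtain a nontrivial optimum the sketch adds a receiver-noise term $\sigma^2$ to the interference (our own assumption, not part of the formulation in the text).

# Power control LP with an assumed noise term:
#   h_i p_i - gamma_i * sum_{j != i} h_j p_j >= gamma_i * sigma2.
import numpy as np
from scipy.optimize import linprog

h = np.array([1.0, 0.8, 0.9])      # toy attenuation factors
gamma = np.array([0.2, 0.2, 0.2])  # toy SIR thresholds
sigma2 = 0.1                       # assumed receiver noise power
n = len(h)

G = np.zeros((n, n))
for i in range(n):
    for j in range(n):
        G[i, j] = h[j] if i == j else -gamma[i] * h[j]

# linprog uses "<=" constraints, so negate both sides of G p >= gamma*sigma2.
res = linprog(c=np.ones(n), A_ub=-G, b_ub=-gamma * sigma2,
              bounds=(0, None), method="highs")
print(res.x, res.fun)   # minimal total transmitted power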

Figure 2.4: An illustration of the power control example.

2.6.2 Multicommodity Network Flow

Consider a communication network consisting of n nodes. Nodes are connected

by communication links. A link allowing one-way transmission from node i to node

j is described by an ordered pair (i, j). Let A be the set of all links. We assume that

each link $(i, j) \in A$ can carry up to $u_{ij}$ bits per second. There is a positive charge
$c_{ij}$ per bit transmitted along that link. Each node $k$ generates data, at the rate of $b^{kl}$

bits per second, that have to be transmitted to node l, either through a direct link

(k, l) or by tracing a sequence of links. The problem is to choose paths along which

all data reach their intended destinations, while minimizing the total cost. We allow

the data with the same origin and destination to be split and be transmitted along

different paths.

In order to formulate this problem as a linear programming problem, we introduce

variables $x_{ij}^{kl}$ indicating the amount of data with origin $k$ and destination $l$ that
traverse link $(i, j)$. Let
\[
b_i^{kl} =
\begin{cases}
b^{kl}, & i = k, \\
-b^{kl}, & i = l, \\
0, & \text{otherwise.}
\end{cases}
\]
Thus, $b_i^{kl}$ is the net inflow at node $i$, from outside the network, of data with origin $k$

and destination l. We then have the following formulation:


\[
\begin{aligned}
\text{minimize} \quad & \sum_{(i,j) \in A} \sum_{k=1}^{n} \sum_{l=1}^{n} c_{ij} x_{ij}^{kl} \\
\text{subject to} \quad & \sum_{\{j \,|\, (i,j) \in A\}} x_{ij}^{kl} - \sum_{\{j \,|\, (j,i) \in A\}} x_{ji}^{kl} = b_i^{kl}, \quad i, k, l = 1, 2, \dots, n, \\
& \sum_{k=1}^{n} \sum_{l=1}^{n} x_{ij}^{kl} \le u_{ij}, \quad (i,j) \in A, \\
& x_{ij}^{kl} \ge 0, \quad (i,j) \in A, \; k, l = 1, 2, \dots, n.
\end{aligned}
\tag{2.5}
\]

The first constraint is a flow conservation constraint at node i for data with origin

k and destination l. The expression $\sum_{\{j \,|\, (i,j) \in A\}} x_{ij}^{kl}$ represents the amount of data
with origin and destination $k$ and $l$, respectively, that leave node $i$ along some link.
The expression $\sum_{\{j \,|\, (j,i) \in A\}} x_{ji}^{kl}$ represents the amount of data with the same origin and
destination that enter node $i$ through some link. Finally, $b_i^{kl}$ is the net amount of such
data that enter node $i$ from outside the network. The second constraint expresses

the requirement that the total traffic through a link (i, j) cannot exceed the link’s

capacity. The last constraint is a non-negativity constraint which is required for the

feasibility of the solution.
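A tiny single-commodity instance of formulation (2.5) can be assembled and solved directly; the network, capacities and costs below are toy assumptions of ours:

# Three nodes, one commodity (k, l) = (0, 2) sending 10 units.
import numpy as np
from scipy.optimize import linprog

links = [(0, 1), (1, 2), (0, 2)]   # the set A, as ordered pairs
cost = np.array([1.0, 1.0, 3.0])   # c_ij per bit on each link
cap = np.array([8.0, 8.0, 5.0])    # u_ij for each link
rate = 10.0                        # b^{02}

# Flow conservation rows, one per node i: out(i) - in(i) = b_i^{kl}.
n_nodes, n_links = 3, len(links)
A_eq = np.zeros((n_nodes, n_links))
for a, (i, j) in enumerate(links):
    A_eq[i, a] += 1.0              # flow leaving node i
    A_eq[j, a] -= 1.0              # flow entering node j
b_eq = np.array([rate, 0.0, -rate])

res = linprog(cost, A_eq=A_eq, b_eq=b_eq,
              bounds=[(0, u) for u in cap], method="highs")
print(res.x, res.fun)  # 8 units via 0->1->2 and 2 units via 0->2; cost 22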

Figure 2.5: An illustration of the multi-commodity network flow example.

2.6.3 D2D Caching Networks

We consider a wireless network consisting of a set of N users N = {1, 2, · · · , N }

and a carrier who supplies M data items upon demand. Each data item m ∈

{1, 2, · · · , M } has a size Sm > 0. We also consider a time-slotted system where

the carrier divides the day into T time slots. The probability that user n requests

item $m$ in time slot $t$ is denoted by $p_{n,t}^m$. We assume that users are moving around

many locations. The carrier is interested in L popular locations L = {1, 2, · · · , L}

like airports, schools, shopping malls, stadiums or governmental buildings where high

demand can be related to user mobility. The probability that user $n$ will be present
at location $l$ in time slot $t$ is denoted by $\theta_{n,t}^l$, where
$\sum_{l=1}^{L} \theta_{n,t}^l = 1$ $\forall n, t$.

Each user n has an isolated cache memory of size Zn . The carrier caches an amount

$x_n^m$ of content $m$ in the device of user $n$ at time slot 0 and then lets users share it

together for t ≥ 1. Therefore, the carrier smooths out the network load by caching

some of the data items at the network edge and exploits user mobility to enhance

its caching decision. We assume that a device-to-device (D2D) communication is

allowed and can be used to transfer data items between users. Users occupy part of

their devices' memory for caching these data items and consume some of their battery

to transfer it through the D2D communication. We capture this cost by a reward

corresponding to each cached byte denoted by r > 0. The carrier’s objective is to find

an optimal proactive service policy $x_n^{m*}$, $\forall n, m$, which minimizes the time-averaged

expected cost while delivering the requested data items on time to all users. The

optimal solution is found by solving the following problem:


T X
X N X
M  
minimize xm
n r− m
αn,t
t=1 n=1 m=1
XM
subject to xmn ≤ Zn , ∀n,
m=1
(2.6)
N
X
l
θn,t xm
n ≤ Sm , ∀m, l, t
n=1
0 ≤ xm
n ≤ Sm , ∀n, m.

where
\[
\alpha_{n,t}^m = \sum_{l=1}^{L} \theta_{n,t}^l \sum_{k=1}^{N} p_{k,t}^m \theta_{k,t}^l. \tag{2.7}
\]
Note that $\alpha_{n,t}^m$ captures the demand and mobility profiles of all users. Also, we notice
that the term $\sum_{k=1}^{N} p_{k,t}^m \theta_{k,t}^l$ captures the total expected number of requests for item
$m$ at location $l$ in time slot $t$. So, the higher the probability that item $m$ will be
requested at location $l$ in time slot $t$, and the higher the probability that user $n$ will
be present at that location in this time slot, the larger the amount of this item that will be
cached at user $n$. Also, for each user $n$ and for every content $m$, when $r > \alpha_{n,t}^m$ the
carrier decides $x_n^{m*} = 0$, and otherwise decides $x_n^{m*} \ge 0$ based on the remaining space
in the user's memory. More details and results are provided in [2].
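The following sketch builds and solves a small random instance of problem (2.6) with SciPy; all dimensions and distributions below are toy assumptions of ours (the thesis's own Matlab implementation appears in Chapter 4 and Appendix C):

# A small random instance of the D2D proactive caching LP (2.6).
import numpy as np
from scipy.optimize import linprog

rng = np.random.default_rng(1)
N, M, L, T = 4, 3, 2, 5                    # users, items, locations, slots
S = rng.uniform(1.0, 2.0, M)               # item sizes S_m
Z = rng.uniform(1.0, 3.0, N)               # cache sizes Z_n
r = 0.3                                    # reward per cached byte
p = rng.dirichlet(np.ones(M), (N, T))      # p[n, t, m]: demand profile
theta = rng.dirichlet(np.ones(L), (N, T))  # theta[n, t, l]: mobility profile

# alpha[n, t, m] = sum_l theta[n,t,l] * sum_k p[k,t,m] * theta[k,t,l]  -- (2.7)
alpha = np.einsum('ntl,ktm,ktl->ntm', theta, p, theta)

# Stack x_n^m into one vector; the coefficient of x_n^m is sum_t (r - alpha).
c = (T * r - alpha.sum(axis=1)).reshape(N * M)

# Cache-size constraints: sum_m x_n^m <= Z_n, one row per user n.
A_cache = np.kron(np.eye(N), np.ones((1, M)))
# Supply constraints: sum_n theta[n,t,l] * x_n^m <= S_m for all m, l, t.
rows, rhs = [], []
for m in range(M):
    for l in range(L):
        for t in range(T):
            row = np.zeros((N, M))
            row[:, m] = theta[:, t, l]
            rows.append(row.reshape(-1))
            rhs.append(S[m])
A_ub = np.vstack([A_cache] + rows)
b_ub = np.concatenate([Z, rhs])

bounds = [(0, S[m]) for n in range(N) for m in range(M)]
res = linprog(c, A_ub=A_ub, b_ub=b_ub, bounds=bounds, method="highs")
print(res.x.reshape(N, M))                 # optimal caching policy x_n^m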

Figure 2.6: An illustration of the D2D caching networks example.

Chapter 3: Large Scale Linear Programs

In the previous chapter, we discussed the basics of linear programming in de-

tail. Furthermore, we showed how to use duality theory to solve linear optimization

problems. In this chapter, we extend our discussion to consider large scale linear op-

timization problems. Many practical applications require a large number of variables

or constraints which leads to a tremendous increase in memory and computational

requirements of the system. For example, in the proactive caching problem men-

tioned in Section 2.6.3, we need to consider the case when the number of users and

the number of constraints are very large. This type of linear optimization problems

requires specialized algorithms to find an optimal solution efficiently.

Recent improvements in linear optimization techniques allow us to deal with large

scale problems. The complexity of these problems arises when the dimension of matrix

A increases. For instance, in the simplex method, identifying the entering and exiting

columns among a massive number of columns in an (m × n) matrix A consumes most

of the memory resources. A proper modification of the simplex method allows us to

find a solution to such large scale problems.

In this chapter, we present some methods for solving linear programming problems

with a large number of variables or constraints. We shed light on delayed column gen-

eration where columns of the matrix A are generated only when they are required.

We also present its dual, the "cutting plane" method, in which the feasible set is

approximated using only a subset of the constraints. We also introduce the decom-

position algorithm developed by Dantzig and Wolfe [3]. It is used for linear programming

problems whose constraints can be divided into two sets: the first set includes gen-

eral constraints Ax ≥ b, while the second set has constraints with a special structure.

These methods are illustrated through a classical application, the cutting-stock prob-

lem presented by Gilmore and Gomory [4]. We close this chapter by surveying some

of the applications to the mentioned methods in wireless communication networks.

3.1 Delayed Column Generation Method

The delayed column generation method was first presented by Dantzig and Wolfe

in 1960 [3] and by Gilmore and Gomory in 1961 [4]. This method still attracts great interest and has many recent applications in the literature [5, 6]. For example, the generalized

bin packing problem (GBPP) is a novel packing problem arising in many transporta-

tion and logistics settings, characterized by multiple item and bin attributes and the

presence of both compulsory and non-compulsory items. The computational com-

plexity and the approximability of the GBPP are discussed in [7]. A presentation of

the main mathematical models and an experimental evaluation of the main available

software tools for the one-dimensional bin packing problem is introduced in [8]. A

generalization of the classical multiple knapsack problem, in which instead of packing

single items we are packing groups of items, is discussed in [9]. Such a general model

finds applications in various practical problems, e.g., delivering bundles of goods and

also in caching networks.

Some linear programming problems become intractable because of the large num-

ber of variables involved. Moreover, it becomes more difficult to find an optimal

solution satisfying a huge number of constraints. Assuming $n \gg m$ (i.e. the number

of variables is much larger than the number of constraints), most of the variables will

be non-basic and hence only a subset of variables need to be considered when solving

the problem. Column generation leverages this idea to generate only the variables

which have the potential to improve the objective function; that is, to find variables

with negative reduced cost. The problem being solved is split into two problems: the

master problem and the sub-problem. The master problem is the original problem

with only a subset of variables being considered. The sub-problem is another problem

created to identify a new variable to enter the basis.

Now, consider the standard form problem


\[
\begin{aligned}
\underset{x}{\text{minimize}} \quad & c'x \\
\text{subject to} \quad & Ax = b \\
& x \ge 0,
\end{aligned}
\]

with $x \in \mathbb{R}^n$ and $b \in \mathbb{R}^m$, and where the rows of $A$ are linearly independent. If the number of columns of $A$ is very large, then it is not practical to generate and store the entire

matrix A in memory as done in the full tableau method for example. Moreover, in

many problems, most of the columns of A never enter the basis. Therefore, we don’t

have to generate these unused columns. In particular, the revised simplex method,

at any given iteration, requires the current basic columns and the column which is to

enter the basis. Consequently, we need an efficient method for recognizing variables

$x_i$ with negative reduced costs $\bar{c}_i$ without having to generate all columns. Sometimes,

this can be accomplished by solving the problem

\[
\underset{i}{\text{minimize}} \;\; \bar{c}_i \tag{3.1}
\]

In many cases, this optimization problem has a special structure, meaning that a smallest $\bar{c}_i$ can be found without computing every $\bar{c}_i$. If the minimum of this optimization

problem is greater than or equal to 0, all reduced costs are non-negative and we have

an optimal solution to the original linear programming problem. On the other hand,

if the minimum is negative, the variable xi corresponding to a minimizing index i

has negative reduced cost, and the column Ai can enter the basis. The key to this

approach is our ability to solve the optimization problem (3.1) efficiently without

using too much memory.

In the delayed column generation method, the columns that exit the basis are

discarded from memory. In a variant of this method, the algorithm retains in memory

all or some of the columns that have been generated in the past, and proceeds in terms

of restricted linear programming that involves only the retained columns. To clarify

the idea, let us consider a sequence of master iterations. At the beginning of a master

iteration, we have a basic feasible solution to the original problem, and an associated

basis matrix. For each master iteration, we do the steps defined in Algorithm 3.

Note that step (1) in Algorithm 3 may require going over all columns of A to find

a variable with negative reduced cost. An alternative to this method is to solve the

master problem for some set I. From this solution, we are able to obtain dual prices

for each of the constraints in the master problem (recall that the dual prices are the
elements of the vector $p' = c_B' B^{-1}$). This information is then utilized in the objective

function of the subproblem. After solving the subproblem, if the objective value of

the subproblem is negative, a variable with negative reduced cost has been identified.

Algorithm 3 Delayed Column Generation Master Iteration
1: We search for a variable with negative reduced cost, possibly by minimizing c̄i over all
i using (3.1). If none is found, the algorithm terminates and this solution is optimal.
2: Suppose that we have found some $j$ such that $\bar{c}_j < 0$. We form a collection of columns
Ai , i ∈ I, which contains all of the basic columns, the entering column Aj , and possi-
bly some other columns as well.
3: Define the restricted problem
\[
\begin{aligned}
\underset{x}{\text{minimize}} \quad & \sum_{i \in I} c_i x_i \\
\text{subject to} \quad & \sum_{i \in I} A_i x_i = b \\
& x \ge 0.
\end{aligned}
\tag{3.2}
\]

4: The basic variables at the current basic feasible solution to the original problem are
among the columns that have been kept in the restricted problem. Therefore, we have
a basic feasible solution to the restricted problem, which can be used as a starting point
for its optimal solution.
5: We perform as many simplex iterations as needed until we find an optimal solution to
the restricted problem.

This variable is then added to the master problem, and the master problem is re-

solved. Re-solving the master problem will generate a new set of dual values, and the

process is repeated until no negative reduced cost variables are identified. We can

conclude that the solution to the master problem is optimal.
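To make this loop between the master problem and the pricing sub-problem concrete, the following is a minimal sketch in Matlab (not the thesis implementation). It assumes a feasible initial set of columns and a hypothetical pricing oracle, passed as a function handle, that returns a column with negative reduced cost or an empty matrix when none exists. The sign convention of the dual multipliers returned by linprog has varied across releases, so the oracle must use the matching convention.

% A minimal sketch of delayed column generation, assuming a feasible
% set of initial columns Acols (m x n0) with costs c, demands b, and a
% hypothetical oracle: [Anew, cnew] = pricing(p) returns a new column
% with negative reduced cost, or [] when none exists.
function [x, Acols, c] = colgen(Acols, c, b, pricing)
    opts = optimset('Display', 'off');
    while true
        n = size(Acols, 2);
        % Solve the restricted master problem over the retained columns.
        [x, ~, ~, ~, duals] = linprog(c, [], [], Acols, b, ...
                                      zeros(n, 1), [], [], opts);
        p = duals.eqlin;           % dual prices of the equality constraints
        [Anew, cnew] = pricing(p); % search for a column with negative reduced cost
        if isempty(Anew)
            break;                 % all reduced costs >= 0: current solution optimal
        end
        Acols = [Acols, Anew];     % retain the generated column
        c = [c; cnew];
    end
end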

The delayed column generation method is a special case of the revised simplex method with a special rule for choosing the entering variable: priority is given to the variables $x_i$, $i \in I$, and other variables are examined only when the reduced costs of these variables are all non-negative. The motivation is to give priority to variables for which the corresponding columns have already been generated and stored in memory. There are several variants of this

method, depending on how the set I is chosen at each iteration. We summarize these

variants as follows:

(a) I is just the set of indices of the current basic variables together with the entering

variable. A variable that exits the basis is immediately dropped from the set I.

Since the restricted problem has m + 1 variables and m constraints, its feasible

set is at most one-dimensional, and it gets solved in a single simplex iteration,

that is, as soon as the column Aj enters the basis.

(b) I is the set of indices of all variables that have become basic at some point in

the past; equivalently, no variables are ever dropped, and each entering variable

is added to I. The set I keeps growing and hence this option is not preferred

when the number of master iterations is large.

(c) The set I is kept to a moderate size by dropping those variables that have exited

the basis in the remote past and have not reentered again.

These variants are guaranteed to terminate in the absence of degeneracy. In the

presence of degeneracy, cycling can be avoided by using, for example, the lexicographic tie-breaking rule [10], [11].

3.2 Cutting Plane Method

Viewed in terms of the dual variables, delayed column generation can be described as delayed constraint generation, also known as the cutting plane method. Consider the dual problem

of the standard form problem


\[
\begin{aligned}
\underset{p}{\text{maximize}} \quad & p'b \\
\text{subject to} \quad & p'A_i \le c_i, \quad i = 1, 2, \cdots, n.
\end{aligned}
\tag{3.3}
\]

We assume that it is impractical to generate and store each one of the columns Ai

because the number n is very large. Instead, we consider a subset I of {1, 2, · · · , n}

and form the relaxed dual problem
\[
\begin{aligned}
\underset{p}{\text{maximize}} \quad & p'b \\
\text{subject to} \quad & p'A_i \le c_i, \quad i \in I.
\end{aligned}
\tag{3.4}
\]

Let p∗ be an optimal basic feasible solution to the relaxed dual problem. There are

two possibilities:

(a) $p^*$ is a feasible solution to the original problem (3.3). Any other feasible solution $p$ to the original problem (3.3) is also a feasible solution to the relaxed problem (3.4), because the latter has fewer constraints. Therefore, by optimality of $p^*$ for the relaxed problem (3.4), we have $p'b \le (p^*)'b$. Hence, $p^*$ is an optimal solution to the original problem (3.3) and the algorithm terminates.

(b) If p∗ is infeasible for the original problem (3.3), we find a violated constraint,

add it to the constraints of the relaxed dual problem and continue similarly.

Therefore, we need a method for checking the feasibility of the vector $p^*$ for the original

dual problem (3.3). We also need an efficient method to identify a violated constraint.

This is known as the separation problem, because it amounts to finding a hyperplane

that separates p∗ from the dual feasible set. This can be done by solving the problem

\[
\underset{i}{\text{minimize}} \;\; c_i - (p^*)' A_i \tag{3.5}
\]

over all $i$. If the optimal cost of this problem is non-negative, we have a feasible solution to the original dual problem. If it is negative, the corresponding minimizing index $i$ satisfies $c_i < (p^*)' A_i$, and we have identified a violated constraint.

The success of this approach hinges on our ability to solve the problem (3.5) effi-

ciently; fortunately, this is sometimes possible. In addition, there are cases where the

optimization problem (3.5) is not easily solved but one can test for feasibility and

identify violated constraints using other means such as those used for integer linear

programming [12].

Applying the cutting plane method to the dual problem is identical to applying the

delayed column generation method to the primal. Furthermore, the relaxed problem

(3.4) is the dual of the restricted primal problem (3.2). In some cases, we may have

a primal problem (not in a standard form) that has relatively few variables but a

very large number of constraints. In that case, it makes sense to apply the cutting

plane algorithm to the primal; equivalently, we can form the dual problem and solve

it using the delayed column generation method.
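For illustration, the following is a minimal Matlab sketch of the cutting plane loop (not the thesis implementation); it assumes the columns $A_i$ are stored in a matrix A and that a hypothetical separation oracle is available for solving (3.5).

% A minimal sketch of the cutting plane loop of Section 3.2, assuming
% dual data (A, b, c), a starting index set I0 of retained constraints,
% and a hypothetical oracle: i = separation(p) returns an index with
% c(i) < p'*A(:,i), or [] when p is feasible for (3.3).
function p = cutting_plane(A, b, c, I0, separation)
    opts = optimset('Display', 'off');
    I = I0;
    while true
        % Relaxed dual (3.4): maximize p'b s.t. A(:,I)'*p <= c(I).
        % linprog minimizes, so we negate the objective vector b.
        p = linprog(-b, A(:, I)', c(I), [], [], [], [], [], opts);
        i = separation(p);     % solve the separation problem (3.5)
        if isempty(i)
            break;             % p is feasible, hence optimal for (3.3)
        end
        I = [I, i];            % add the violated constraint (a "cut")
    end
end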

3.3 Dantzig-Wolfe Decomposition

Another method for solving large scale linear programs is the decomposition

algorithm proposed by Dantzig and Wolfe [3]. Dantzig-Wolfe decomposition has been

an important tool to solve large structured models that could not be solved using

a standard Simplex algorithm as they exceeded the capacity of those solvers. With

the current generation of simplex and interior point LP solvers and the enormous

progress in standard hardware (both in terms of raw CPU speed and availability of

large amounts of memory), the Dantzig-Wolfe algorithm has become less popular.

The decomposition algorithm is a procedure for the solution of linear programs using

a generalized extension of the simplex method. The solution is obtained by solving

a sequence of linear programs each of smaller size than the original. Dantzig–Wolfe

decomposition relies on delayed column generation for improving the tractability of

large-scale linear programs. To illustrate the idea of this decomposition method,

consider a linear programming problem of the form
\[
\begin{aligned}
\underset{x}{\text{minimize}} \quad & c_1'x_1 + c_2'x_2 \\
\text{subject to} \quad & D_1 x_1 + D_2 x_2 = b_0 \\
& F_1 x_1 = b_1 \\
& F_2 x_2 = b_2 \\
& x_1, x_2 \ge 0.
\end{aligned}
\tag{3.6}
\]

Suppose that $x_1$ and $x_2$ are vectors of dimensions $n_1$ and $n_2$, respectively, and that

b0 , b1 , b2 have dimensions m0 , m1 , m2 , respectively. Thus, besides nonnegativity con-

straints, x1 satisfies m1 constraints, x2 satisfies m2 constraints, and x1 , x2 together

satisfy m0 coupling constraints. Note that, D1 , D2 , F1 , F2 are matrices of appropriate

dimensions. Often, the number of coupling constraints is a small fraction of the total number of constraints (i.e., $m_0$ is much smaller than $m_1 + m_2$).

The first step of this method is to introduce an equivalent problem, with fewer

equality constraints, but many more variables. The original problem is reformulated

into a master program and several subprograms. This reformulation relies on the fact that

any element of a polyhedron that has at least one extreme point can be represented

as convex combination of extreme points plus a nonnegative linear combination of

extreme rays.

Definition 8. A nonzero element $x$ of a polyhedral cone $C = \{x \in \mathbb{R}^n \mid Ax \ge 0\}$ is called an extreme ray if there are $n - 1$ linearly independent constraints that are active at $x$. Moreover, an extreme ray of the characteristic cone $C$ associated with a nonempty polyhedron $P = \{x \in \mathbb{R}^n \mid Ax \ge b\}$ is also called an extreme ray of $P$.

Now, we define
\[
P_1 := \left\{ x_1 \ge 0 : F_1 x_1 = b_1 \right\}, \qquad P_2 := \left\{ x_2 \ge 0 : F_2 x_2 = b_2 \right\},
\]
and we assume that $P_1$ and $P_2$ are non-empty. Then the problem stated in (3.6) can be

rewritten as
\[
\begin{aligned}
\underset{x}{\text{minimize}} \quad & c_1'x_1 + c_2'x_2 \\
\text{subject to} \quad & D_1 x_1 + D_2 x_2 = b_0 \\
& x_1 \in P_1, \ x_2 \in P_2.
\end{aligned}
\]
For $i = 1, 2$, let $x_i^j$, $j \in J_i$, be the extreme points of $P_i$. Let also $w_i^k$, $k \in K_i$, be a complete set of extreme rays of $P_i$. Using Minkowski's (resolution) theorem, any element $x_i$ of $P_i$ can be represented in the form


\[
x_i = \sum_{j \in J_i} \lambda_i^j x_i^j + \sum_{k \in K_i} \theta_i^k w_i^k,
\]

where the coefficients $\lambda_i^j$ and $\theta_i^k$ are nonnegative and satisfy
\[
\sum_{j \in J_i} \lambda_i^j = 1, \quad i = 1, 2. \tag{3.7}
\]

The original problem (3.6) can be rewritten as


\[
\begin{aligned}
\underset{\lambda,\,\theta}{\text{minimize}} \quad & \sum_{j \in J_1} \lambda_1^j \, c_1' x_1^j + \sum_{k \in K_1} \theta_1^k \, c_1' w_1^k + \sum_{j \in J_2} \lambda_2^j \, c_2' x_2^j + \sum_{k \in K_2} \theta_2^k \, c_2' w_2^k \\
\text{subject to} \quad & \sum_{j \in J_1} \lambda_1^j \begin{pmatrix} D_1 x_1^j \\ 1 \\ 0 \end{pmatrix} + \sum_{j \in J_2} \lambda_2^j \begin{pmatrix} D_2 x_2^j \\ 0 \\ 1 \end{pmatrix} + \sum_{k \in K_1} \theta_1^k \begin{pmatrix} D_1 w_1^k \\ 0 \\ 0 \end{pmatrix} + \sum_{k \in K_2} \theta_2^k \begin{pmatrix} D_2 w_2^k \\ 0 \\ 0 \end{pmatrix} = \begin{pmatrix} b_0 \\ 1 \\ 1 \end{pmatrix} \\
& \lambda_i^j \ge 0, \ \theta_i^k \ge 0, \quad \forall i, j, k.
\end{aligned}
\tag{3.8}
\]
This problem is called the master problem. Note that the original problem has

$m_0 + m_1 + m_2$ equality constraints, while this master problem has only $m_0 + 2$ equality constraints, which are the coupling constraints plus the convexity constraints in (3.7). On

the other hand, the number of decision variables in the master problem could be ex-

tremely large because the number of extreme points and rays is usually exponential

in the number of variables and constraints. Therefore, we can see that delayed column generation is the centerpiece of the decomposition algorithm, where a column is generated only after it is found to have a negative reduced cost and is about to enter the basis. We use the revised simplex method which, at any iteration, involves only $m_0 + 2$ basic variables and a basis matrix of dimension $(m_0 + 2) \times (m_0 + 2)$.

Suppose that we have a basic feasible solution to the master problem associated

with a basis matrix $B$ and that $B^{-1}$ is available. Since we have $m_0 + 2$ equality constraints, the dual vector $p' = c_B' B^{-1}$ has dimension $m_0 + 2$. Its first $m_0$ components, denoted by $q$, are the dual variables associated with the equality coupling constraints in (3.8). The last two components, denoted by $r_1$ and $r_2$, are the dual variables associated with the convexity constraints (3.7) for $i = 1, 2$, respectively. In particular, $p = (q, r_1, r_2)$. We need to examine the reduced costs of different variables and check

whether any one of them is negative. The reduced cost of the variable λj1 is given by
\[
c_1' x_1^j - \begin{pmatrix} q' & r_1 & r_2 \end{pmatrix} \begin{pmatrix} D_1 x_1^j \\ 1 \\ 0 \end{pmatrix} = (c_1' - q' D_1) x_1^j - r_1 .
\]
Similarly, the reduced cost of the variable $\theta_1^k$ is given by
\[
c_1' w_1^k - \begin{pmatrix} q' & r_1 & r_2 \end{pmatrix} \begin{pmatrix} D_1 w_1^k \\ 0 \\ 0 \end{pmatrix} = (c_1' - q' D_1) w_1^k .
\]


Instead of evaluating the reduced cost of every variable $\lambda_1^j$ and $\theta_1^k$ and checking its sign, we form the following linear programming problem
\[
\begin{aligned}
\underset{x_1}{\text{minimize}} \quad & (c_1' - q' D_1) x_1 \\
\text{subject to} \quad & x_1 \in P_1,
\end{aligned}
\]

which is called the first subproblem and can be solved by the simplex method. Simi-

larly, for the variables $\lambda_2^j$ and $\theta_2^k$, we can form the second subproblem
\[
\begin{aligned}
\underset{x_2}{\text{minimize}} \quad & (c_2' - q' D_2) x_2 \\
\text{subject to} \quad & x_2 \in P_2,
\end{aligned}
\]
and solve it using the simplex method as well. The decomposition method is sum-

marized in Algorithm 4. Note that the sub-problems are smaller linear programming

problems that are employed as an economical search method for discovering columns

with negative reduced costs.

Algorithm 4 Dantzig-Wolfe Decomposition Algorithm


1: Start with a basic feasible solution to the master problem, the corresponding inverse
0
basis matrix B−1 and the dual vector p = (q, r1 , r2 ) = cB B−1 .
2: Form and solve the two sub-problems. If the optimal cost in the first sub-problem is
≥ r1 and the optimal cost in the second sub-problem is ≥ r2 , then all reduced costs in
the master problem are non-negative, we have an optimal solution, and the algorithm
terminates.
3: If the optimal cost in the $i$th sub-problem is $-\infty$, we obtain an extreme ray $w_i^k$, associated with a variable $\theta_i^k$, whose reduced cost is negative. This variable can enter the basis in the master problem.
4: If the optimal cost in the $i$th sub-problem is finite and less than $r_i$, we obtain an extreme point $x_i^j$, associated with a variable $\lambda_i^j$, whose reduced cost is negative. This variable can enter the basis in the master problem.
5: Having chosen a variable to enter the basis, generate the column associated with that
variable, carry out an iteration of the revised simplex method for the master problem
and update B−1 and p.
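As a rough illustration of steps 2-4 (a sketch under stated assumptions, not the thesis implementation), one pricing step for the first subproblem could be coded in Matlab as follows, assuming the data $(c_1, D_1, F_1, b_1)$ of (3.6) and the current duals $q$ and $r_1$ are available. Note that linprog signals unboundedness through its exit flag but does not return the extreme ray itself, which would have to be recovered separately.

% A hedged sketch of one pricing step of Algorithm 4 for the first
% subproblem, assuming hypothetical data c1, D1, F1, b1 and duals q, r1.
opts = optimset('Display', 'off');
cbar = c1 - D1' * q;                 % subproblem cost vector (c1' - q'D1)'
[x1, val, flag] = linprog(cbar, [], [], F1, b1, ...
                          zeros(size(c1)), [], [], opts);
if flag == -3
    % Optimal cost is -inf: an extreme ray exists and its theta
    % variable prices out negatively (step 3 of Algorithm 4).
elseif flag == 1 && val < r1
    % Extreme point x1 with negative reduced cost (step 4): generate
    % the master column [D1*x1; 1; 0] for the entering lambda variable.
end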

This method generalizes to problems of the form


\[
\begin{aligned}
\underset{x}{\text{minimize}} \quad & c_1'x_1 + c_2'x_2 + \cdots + c_t'x_t \\
\text{subject to} \quad & D_1 x_1 + D_2 x_2 + \cdots + D_t x_t = b_0 \\
& F_i x_i = b_i, \quad i = 1, 2, \cdots, t \\
& x_1, x_2, \cdots, x_t \ge 0.
\end{aligned}
\tag{3.9}
\]

The only difference is that at each iteration of the revised simplex method with the de-

layed column generation for the master problem, we may have to solve t sub-problems.

In fact, the method is applicable even if $t = 1$. Consider the linear programming problem
\[
\begin{aligned}
\underset{x}{\text{minimize}} \quad & c'x \\
\text{subject to} \quad & Dx = b_0 \\
& Fx = b \\
& x \ge 0.
\end{aligned}
\tag{3.10}
\]

The equality constraints have been partitioned into two sets, and we define the polyhedron $P = \{x \ge 0 \mid Fx = b\}$. By expressing each element of $P$ in terms of extreme

points and extreme rays, we obtain a master problem with a large number of columns,

but a smaller number of equality constraints. Searching for columns with negative

reduced cost in the master problem is then accomplished by solving a single sub-

problem, which is a minimization over the set P . This approach can be useful if the

subproblem has a special structure and can be solved very fast. Finally, note that

the decomposition method assumes that all constraints are in standard form and that the feasible sets $P_i$ of the sub-problems are also in standard form. This assumption is hardly necessary. For example, if we assume that the sets $P_i$ have at least one extreme point, the resolution theorem and the same line of development apply.

3.4 The Cutting Stock Problem

The cutting stock problem is the problem of cutting standard-sized pieces of stock

material, such as paper rolls or sheet metal, into pieces of specified sizes while min-

imizing material wasted. It is an optimization problem in mathematics that arises

from applications in industry. In terms of computational complexity, the problem

is an NP-hard problem reducible to the knapsack problem [13]. It can be formulated as an integer linear programming problem; in practice, one often solves the fractional (linear programming) relaxation and then rounds the results to integers. It was first formulated by

Kantorovich in 1939 [14]. In 1951, before computers became widely available, L. V.

Kantorovich and V. A. Zalgaller suggested solving the problem of the economical

use of material at the cutting stage with the help of linear programming [15]. The

proposed technique was later called the column generation method.

Consider a paper company that has a supply of large rolls of paper, each of width

W . We assume that W is a positive integer. The company receives customer demand

for smaller widths of paper. In particular, bi rolls of width wi , i = 1, 2, · · · , m, need to

be produced. We also assume that each wi is an integer and that wi ≤ W, ∀i. Smaller

rolls are obtained by slicing a large roll in a certain way, called a pattern. For example,

a large roll of width 70 can be cut into three rolls of widths w1 = 17 and one roll

of width w2 = 15, with a waste of 4. In general, the jth pattern can be represented

by a column vector Aj whose ith entry aij indicates how many rolls of width wi are

produced by that pattern. For example, the pattern described earlier is represented

by the vector (3, 1, 0, · · · , 0). For a vector (a1j , · · · , amj ) to be a representation of a

feasible pattern, its components must be non-negative integers and we must have
\[
\sum_{i=1}^{m} a_{ij} w_i \le W. \tag{3.11}
\]

Let n be the number of all feasible patterns and consider the m × n matrix A with

columns Aj , j = 1, 2, · · · , n. The goal of the company is to minimize the number of

large rolls used while satisfying customer demand. Let xj be the number of large rolls

cut according to pattern j. Then, the problem will be
\[
\begin{aligned}
\underset{x}{\text{minimize}} \quad & \sum_{j=1}^{n} x_j \\
\text{subject to} \quad & \sum_{j=1}^{n} a_{ij} x_j = b_i, \quad i = 1, 2, \cdots, m, \\
& x_j \ge 0, \quad j = 1, 2, \cdots, n.
\end{aligned}
\tag{3.12}
\]

Naturally, each xj should be an integer and we have an integer programming problem.

However, rounding the solution of (3.12) often provides a feasible solution to the integer

programming problem, which is fairly close to optimal at least if the demands bi are

reasonably large.

The difficulty of the problem lies in the large number of cutting patterns (columns)

that may be encountered [4]. For example, with a standard roll of 200 in. and

demands for 40 different lengths ranging from 20 in. to 80 in., the number of cutting

patterns can easily exceed 10 million or even 100 million. This happens in practical

problems and, in this case, we are facing a complicated linear programming problem.

However, the problem can be solved efficiently, by using the revised simplex method

and by generating columns of A as needed rather than in advance.

For an initial basic solution, we may let the jth pattern consist of one roll of width

wj for j = 1, 2, · · · , m, and none of the other widths. Then the first m columns of

A form a basis that leads to a basic feasible solution. Now, suppose we have a basis

matrix B and an associated basic feasible solution, and that we wish to carry out the

next iteration of the revised simplex method. Because the cost coefficient of every

variable $x_j$ is unity, every component of the vector $c_B$ is equal to 1. We compute the simplex multipliers $p' = c_B' B^{-1}$. Instead of computing the reduced cost $\bar{c}_j = 1 - p'A_j$

associated with every column (pattern) Aj , we consider the problem

\[
\underset{j}{\text{minimize}} \;\; 1 - p'A_j . \tag{3.13}
\]

This is the same as maximizing $p'A_j$ over all $j$. If the maximum is less than or

equal to 1, all reduced costs are non-negative and we have an optimal solution. On

the other hand, if the maximum is greater than 1, the column Aj corresponding to

a maximizing j has negative reduced cost and enters the basis. We now have the

problem
\[
\begin{aligned}
\underset{a}{\text{maximize}} \quad & \sum_{i=1}^{m} p_i a_i \\
\text{subject to} \quad & \sum_{i=1}^{m} w_i a_i \le W \\
& a_i \ge 0, \quad i = 1, 2, \cdots, m, \\
& a_i \ \text{integer}, \quad i = 1, 2, \cdots, m.
\end{aligned}
\tag{3.14}
\]
This problem is called the integer knapsack problem. Solving the knapsack problem

requires some effort, but for the range of numbers that arise in the cutting stock

problem, this can be done fairly efficiently. The knapsack problem has well-known

methods to solve it, such as branch and bound [16] and dynamic programming [17].
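For illustration, the following is a minimal dynamic programming sketch in Matlab (not the thesis implementation) for the unbounded integer knapsack (3.14), assuming integer widths w, prices p and an integer capacity W.

% Dynamic programming for the integer knapsack (3.14): V(c+1) holds
% the best value achievable with capacity c, choice(c+1) the item used.
function [best, a] = knapsack_dp(p, w, W)
    V = zeros(1, W + 1);
    choice = zeros(1, W + 1);
    for cap = 1:W
        for i = 1:numel(w)
            if w(i) <= cap && p(i) + V(cap - w(i) + 1) > V(cap + 1)
                V(cap + 1) = p(i) + V(cap - w(i) + 1);
                choice(cap + 1) = i;
            end
        end
    end
    best = V(W + 1);
    a = zeros(size(w));              % recover a pattern by backtracking
    cap = W;
    while cap > 0 && choice(cap + 1) > 0
        i = choice(cap + 1);
        a(i) = a(i) + 1;
        cap = cap - w(i);
    end
end

For instance, with all dual prices equal to one, knapsack_dp(ones(1,5), [20 45 50 55 75], 110) returns the pattern of five rolls of width 20, the most pieces that fit in a raw roll of width 110.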

3.5 Applications in Wireless Communication

Although the delayed column generation method goes back to the 1960s, it has recently found its way into many applications in wireless communication and machine learning. Researchers have started to pay attention to large scale linear programming techniques to solve resource allocation and caching problems. In the same vein, machine learning and big data science involve many linear problems with a very large number of variables and constraints. In this section, we shed light on

some examples to illustrate the importance of the large scale linear programming in

these fields.

Optimizing the throughput capacity over a multihop wireless network is studied

in [18]. The main thread is to apply a multi-commodity flow (MCF) formulation,

discussed in Section 2.6.2, augmented with a scheduling constraint derived from the

conflict graph associated with the network. A fundamental issue with the conflict

graph based MCF formulation is that finding all independent sets (ISs) for scheduling

is NP-hard in general. By expressing the MCF formulation in a matrix format, the

constraint matrix will contain a very large number of columns, with each IS being

associated with one column. The complexity of this approach is resolved using the

delayed column generation (DCG) method. Furthermore, the DCG method is also

applied to multi-radio multi-channel (MR-MC) networks. It was shown that the

DCG method achieves the most preferred trade-off between computation complexity

and network capacity and maintains good scalability when addressing large-scale

networks, particularly in the complex MR-MC context.

A joint power control and transmission scheduling problem in wireless networks

with average power constraints is studied in [19]. A network utility optimization problem involving time-sharing across different "transmission modes" was introduced.

The structure of the optimal solution is a time-sharing across a small set of such

modes. This structure was used to develop an efficient heuristic approach to finding

a suboptimal solution through column generation iterations. This heuristic approach

converges quite fast in simulations, and provides a tool for wireless network planning.

Routing in Delay-Tolerant Networks (DTN) has drawn much research effort re-

cently. Since many different kinds of networks fall in the DTN category, many routing

approaches have been proposed. Such systems can benefit from a previously proposed

routing algorithm based on linear programming that minimizes the average message

delay [20, 21]. This algorithm, however, is known to have performance issues that

limit its applicability to very simple scenarios. An alternative linear programming

approach for routing in Delay-Tolerant Networks is proposed in [22]. It was shown

that the proposed formulation is equivalent to that presented in a seminal work in

this area, but it contains fewer LP constraints and has a structure suitable to the

application of Column Generation (CG). Simulation shows that the proposed CG

implementation arrives at an optimal solution up to three orders of magnitude faster

than the original linear program in the considered DTN examples.

A joint caching, routing, and channel assignment for video delivery over coordi-

nated small-cell cellular systems of the future Internet is considered in [23]. The prob-

lem of maximizing the throughput of the system was formulated as a linear program

in which the number of variables is very large. To address channel interference, the

proposed formulation incorporates the conflict graph that arises when wireless links

interfere with each other due to simultaneous transmission. The column generation

method was used to solve the problem by breaking it into a restricted master subprob-

lem that involves a select subset of variables and a collection of pricing subproblems

that select the new variable to be introduced into the restricted master problem. The

proposed framework demonstrates considerable gains in average transmission rate at

which the video data can be delivered to the users, over the state-of-the-art femto-

caching systems, of up to 46%. These operational gains in system performance map

to analogous gains in video application quality, thereby enhancing the user experience

considerably.

Chapter 4: Implementation of Large Scale Linear Programs

The advances in computing in the past decades allowed us to find many software

packages to solve linear programs. Nowadays, problems with a few thousand variables and constraints can be seen as small, while problems with tens or even hundreds of thousands of variables are usually solvable. Linear programming software

packages come in two different kinds. Some of them are algorithmic codes devoted

to finding optimal solutions to specific linear programs. They take the input as a

compact list of the linear program constraint coefficients (A, b, c and related values

in the standard form) and produce the output as a compact list of optimal solution

values and related information. Other packages are considered as modeling systems

which allow people to formulate their own linear programs and analyze their solutions.

Most modeling systems support a variety of algorithmic codes, while the more popular

codes can be used with many different modeling systems. Conversion to the forms

required by algorithmic codes is done automatically in these modeling systems [24]. In

this chapter we shed light on some popular modeling software packages and illustrate

how to use them through the examples discussed in Chapter 2. We investigate how

to implement the cutting stock problem, discussed in Section 3.4, using AMPL. The

multi-commodity network flow example, discussed in Section 2.6.2, is implemented

using GAMS. The implementation of the D2D caching network example, discussed

in Section 2.6.3, is introduced using Matlab.

4.1 AMPL Programming Language

A Mathematical Programming Language (AMPL) is a modeling language that

can be used to describe and solve high-complexity problems for large-scale mathe-

matical computing (i.e., large-scale optimization and scheduling-type problems). It

was developed by Robert Fourer, David Gay, and Brian Kernighan at Bell Laborato-

ries [25]. AMPL offers an interactive command environment for setting up and solving

mathematical programming problems. A flexible interface enables several solvers to

be available, both open source and commercial software, including CBC, CPLEX,

FortMP, Gurobi, MINOS, IPOPT, SNOPT, KNITRO, and LGO. It has a syntax

very similar to the mathematical notation of optimization problems, which allows for

a very concise and readable definition of problems in the domain of optimization.

Once optimal solutions have been found, they are translated back to the modeler’s

form so that they can be viewed and analyzed easily.

AMPL has a variety of options to format data for browsing, printing reports,

or preparing input to other programs. In addition, AMPL is readily available for

experiment: the AMPL web site, www.ampl.com, provides free downloadable student

versions and representative solvers that run on Windows, Unix/Linux, and Mac OS

X. In this section, we briefly discuss how to use AMPL to model large scale linear

programs like the cutting stock problem, discussed in Section 3.4. Our objective is to cover

the main features of this software package and how to use it in solving these problems.

4.1.1 Implementation of The Cutting Stock Problem using AMPL

The cutting stock problem, discussed in Section 3.4, is a typical example to il-

lustrate the column generation method. In this problem, we wish to cut up long

raw widths of some commodity, such as rolls of paper, into a combination of smaller

widths that meet given orders with as little waste as possible. The Gilmore-Gomory

procedure defines a cutting pattern to be any feasible way in which a raw roll can be

cut. Thus, a pattern is a vector consisting of a certain number of rolls of each desired

width, such that their total width does not exceed the raw width. The Gilmore-

Gomory procedure consists of a main problem and a knapsack sub-problem. The

main problem finds the minimum number of raw rolls that need be cut, given a

collection of known cutting patterns that may be used. The sub-problem seeks to

identify a new pattern that can be used in the cutting optimization, either to reduce

the number of raw rolls needed, or to determine that no such new pattern exists. The

variables of this model are the numbers of each desired width in the new pattern; the

feasibility constraint (3.11) ensures that the total width of the pattern does not exceed

the raw width. This procedure is described in the following algorithm.

Algorithm 5 The Gilmore-Gomory Procedure


Pick initial patterns sufficient to meet demand
repeat
Solve the (fractional) cutting stock optimization problem
Let price[i] equal Fill[i].dual for each width i
Solve the pattern generation sub-problem
if the optimal value is < 0 then
add a new pattern that cuts Use[i] rolls of each width i
else
find a final integer solution and stop
end if
until stop

The complete implementation code is provided in Appendix A. AMPL allows us

to define two problem statements, one for the main problem and another one for the

sub-problem.

problem Cutting_Opt: Cut, Number, Fill;

option relax_integrality 1;

problem Pattern_Gen: Use, Reduced_Cost, Width_Limit;

option relax_integrality 0;

The first statement defines a problem named Cutting Opt that consists of the

Cut variables, the Fill constraints, and the objective Number. This is defined in

the statement

minimize Number:

sum {j in PATTERNS} Cut[j];

subject to Fill {i in WIDTHS}:

sum {j in PATTERNS} nbr[i,j] * Cut[j] >= orders[i];

Comparing the definition of the Cutting Opt problem with (3.12), we see that

Number is the objective function, Cut represents the optimization variables xj , Fill

represents the constraint where nbr are the coefficients aij and orders are the con-

straint values bi . In a similar way, we define a problem Pattern Gen that consists of

the Use variables, the Width Limit constraint, and the objective Reduced Cost.

It is defined as

minimize Reduced_Cost:

1 - sum {i in WIDTHS} price[i] * Use[i];

subject to Width_Limit:

sum {i in WIDTHS} i * Use[i] <= roll_width;

Comparing the definition of the Pattern Gen problem with (3.14), we see that

Use corresponds to ai , price corresponds to pi and roll width corresponds to W .

The for loop creates the initial cutting patterns, after which the main repeat loop

carries out the Gilmore-Gomory procedure as described previously. The statement

solve Cutting_Opt;

sets the Cutting Opt as the current problem, along with its environment, and

solves the associated linear program. A similar statement is defined for the Pattern Gen

problem. An example for this problem is for a roll width of 110 and required demands

of 48, 35, 24, 10 and 8 for finished rolls of widths 20, 45, 50, 55 and 75, respectively.

Running the script mentioned in Appendix A, we get the following result

CPLEX 12.6.3.0: optimal solution; objective 52.1

0 dual simplex iterations (0 in phase I)

CPLEX 12.6.3.0: optimal integer solution; objective -0.2

2 MIP simplex iterations

0 branch-and-bound nodes

No basis.

CPLEX 12.6.3.0: optimal solution; objective 50.5

2 dual simplex iterations (0 in phase I)

CPLEX 12.6.3.0: optimal integer solution; objective -0.2

0 MIP simplex iterations

0 branch-and-bound nodes

No basis.

CPLEX 12.6.3.0: optimal solution; objective 47

1 dual simplex iterations (0 in phase I)

CPLEX 12.6.3.0: optimal integer solution; objective -0.1

1 MIP simplex iterations

0 branch-and-bound nodes

No basis.

CPLEX 12.6.3.0: optimal solution; objective 46.25

2 dual simplex iterations (0 in phase I)

CPLEX 12.6.3.0: optimal integer solution; objective -1e-06

9 MIP simplex iterations

0 branch-and-bound nodes

No basis.

nbr [*,*]:=

: 1 2 3 4 5 6 7 8

20 5 0 0 0 0 1 1 3

45 0 2 0 0 0 0 2 0

50 0 0 2 0 0 0 0 1

55 0 0 0 2 0 0 0 0

75 0 0 0 0 1 1 0 0;

Cut [*] := 1 0 2 0 3 8.25 4 5 5 0 6 8 7 17.5 8 7.5;

The final fractional solution means that the pattern (0, 0, 2, 0, 0) is cut 8.25 times, the pattern (0, 0, 0, 2, 0) is cut 5 times, the pattern (1, 0, 0, 0, 1) is cut 8 times, the pattern (1, 2, 0, 0, 0) is cut 17.5 times and, finally, the pattern (3, 0, 1, 0, 0) is cut 7.5 times. The best fractional solution cuts 46.25 raw rolls in five different patterns, using 48 rolls if the fractional values are rounded up to the next integer.

4.2 GAMS Programming Language

The General Algebraic Modeling System (GAMS) is a high-level modeling system

for mathematical optimization. GAMS is designed for modeling and solving linear,

nonlinear, and mixed-integer optimization problems. The system is tailored for com-

plex, large-scale modeling applications and allows the user to build large maintainable

models that can be adapted to new situations. GAMS was first presented at the In-

ternational Symposium on Mathematical Programming (ISMP), Budapest, Hungary

in 1976. GAMS allows the user to concentrate on the modeling problem by making

the setup simple. The system takes care of the time-consuming details of the specific

machine and system software implementation. GAMS contains an integrated devel-

opment environment (IDE) and is connected to a group of third-party optimization

solvers. Among these solvers are BARON, COIN-OR solvers, CONOPT, CPLEX,

DICOPT, Gurobi, MOSEK, SNOPT, SULUM, and XPRESS [26]. We illustrate how

to use GAMS to implement the multi-commodity network flow example, discussed in

Section 2.6.2, using the Dantzig-Wolfe decomposition algorithm.

4.2.1 Implementation of Dantzig-Wolfe Decomposition Method using GAMS

The implementation of Dantzig-Wolfe decomposition algorithm was discussed in

[27]. In this section, we highlight how GAMS can be used to implement a multi-

commodity network flow problem. The definition of this problem was discussed in

Section 2.6.2 and the complete implementation code is provided in Appendix B. In

the beginning, we define the settings of the problem including the number of nodes

and commodities. In this example, we have 20 nodes and 5 commodities.

$if NOT set nodes $set nodes 20

$if NOT set comm $set comm 5

GAMS allows us to specify indices in a straightforward way: declare and name

the set (here, i, k and e(i, k)), and enumerate their elements.

sets i nodes / n1*n%nodes% /

k commodities / k1*k%comm% /

e(i,i) edges

alias (i,j)

Indexed parameters are defined to store the cost of each link cij , the balance bki ,

the demand bk and the capacity uij . Notice that the commodity is indexed by k only

instead of using k and l as discussed in Section 2.6.2. GAMS also allows us to place

explanatory text (shown in lower case) throughout the model, as we develop it. These

comments are automatically incorporated into the output report, at the appropriate

places.

parameters

cost(i,j) cost for edge use

bal(k,i) balance

kdem(k) demand

cap(i,j) bundle capacity ;

Decision variables are expressed with their indices specified, where cost cor-

responds to cij in (2.5), bal corresponds to bki , kdem corresponds to bk and cap

corresponds to uij . From this general form, GAMS generates each instance of the

variable in the domain. Variables are specified as to type: FREE, POSITIVE, NEG-

ATIVE, BINARY, or INTEGER. The default is FREE. The objective variable (z,

here) is simply declared without an index. Here, the optimization variable x is defined

to be a positive variable, which enforces the nonnegativity (feasibility) constraint.

variables

x(k,i,j) multi commodity flow

z objective

positive variable x;

The objective function and constraint equations are first declared by giving them

names. Then their general algebraic formulae are described. GAMS now has enough

information (from data entered above and from the algebraic relationships specified in

the equations) to automatically generate each individual constraint statement. Notice

that these equations are typically defined as mentioned in Section 2.6.2.

equations

defbal(k,i) balancing constraint

defcap(i,j) bundling capacity

defobj;

defobj.. z =e= sum((k,e), cost(e)*x(k,e));

defbal(k,i).. sum(e(i,j), x(k,e)) - sum(e(j,i),x(k,e))

=e= bal(k,i);

defcap(e).. sum(k, x(k,e)) =l= cap(e);

The model is given a unique name (here, mcf multi-commodity flow problem),

and the modeler specifies which equations should be included in this particular for-

mulation. In this case we specified ALL which indicates that all equations are part

of the model.

model mcf multi-commodity flow problem /all/;

A random instance is generated here for testing. However, we could set exact

values for the link cost cij , the balance bki , the demand bk and the capacity uij . The

model checks whether the generated instance is feasible. In this case, the model is

solved by this statement

solve mcf min z using lp;

The solve statement tells GAMS which model to solve, selects the solver to use (in

this case an LP solver), indicates the direction of the optimization, either MINIMIZ-

ING or MAXIMIZING, and specifies the objective variable. We have two problems

here to solve, a master problem and a pricing sub-problem. Corresponding indices,

parameters, variables and equations are defined for each problem. These problems are

defined by the statements

model master / mdefobj, mdefbal, mdefcap /;

model pricing / pdefobj, pdefbal /;

The steps defined in Algorithm 4 are implemented in the last part of the model.

The master problem and the pricing sub-problem are solved by the statements

solve master using lp minimizing z;

solve pricing using lp minimizing z;

where each statement is placed at its appropriate point in the code (see Appendix B). Running this code on the GAMS platform generates a report that includes many results. We emphasize here the most important messages in this report:

S O L V E S U M M A R Y

MODEL mcf OBJECTIVE z

TYPE LP DIRECTION MINIMIZE

SOLVER CPLEX FROM LINE 81

**** SOLVER STATUS 1 Normal Completion

**** MODEL STATUS 1 Optimal

**** OBJECTIVE VALUE 1726.1151

RESOURCE USAGE, LIMIT 0.078 1000.000

ITERATION COUNT, LIMIT 10 2000000000

**** REPORT SUMMARY : 0 NONOPT

0 INFEASIBLE

0 UNBOUNDED

---- 84 PARAMETER xsingle single solve

n1 n4 n5 n6 n8

k1.n2 123.400

k2.n4 29.264

k2.n5 29.264 100.296

k3.n6 51.062

k5.n2 52.463

k5.n8 52.463

---- 203 PARAMETER xall summary of flows

single serial

k1.n2.n5 123.400 123.400

k2.n4.n6 29.264 29.264

k2.n5.n4 29.264 29.264

k2.n5.n6 100.296 100.296

k3.n6.n5 51.062 51.062

k5.n2.n8 52.463 52.463

k5.n8.n1 52.463 52.463

The report shows that the CPLEX solver was used to find the optimal solution, with a final objective value of 1726.1151. The optimal flows for the randomly generated instance are also included.

4.3 Matlab Programming Language

Another important platform which allows us to solve large scale linear programs is

the Matrix Laboratory or Matlab. Matlab is a multi-paradigm numerical computing

environment and a programming language developed by MathWorks. Matlab allows

matrix manipulations, plotting of functions and data, implementation of algorithms,

creation of user interfaces, and interfacing with programs written in other languages,

including C, C++, Java, Fortran, and Python. Matlab includes an optimization tool-

box which provides functions for solving constrained and unconstrained optimization

problems. The toolbox includes solvers for linear programming, mixed-integer linear

programming, quadratic programming, nonlinear optimization, and nonlinear least

squares.

Linear optimization problems can be solved using the linprog function from

the toolbox. The optimization toolbox includes three algorithms used to solve linear

programming problems:

• The simplex algorithm is a systematic procedure for generating and testing

candidate vertex solutions to a linear program. The simplex algorithm is the

most widely used algorithm for linear programming.

• The interior point algorithm is based on a primal-dual predictor-corrector algo-

rithm used for solving linear programming problems. Interior point is especially

useful for large-scale problems that have structure or can be defined using sparse

matrices.

• The active-set algorithm minimizes the objective at each iteration over the ac-

tive set (a subset of the constraints that are locally active) until it reaches a

solution.

The syntax of this function is as follows

[x,fval] = linprog(f,A,b,Aeq,beq,lb,ub,x0,options);

which minimizes the linear objective $f'x$ subject to $Ax \le b$. It also includes equality constraints $A_{eq} x = b_{eq}$. It defines a set of lower and upper bounds on the design variables, $x$, so

that the solution is always in the range lb ≤ x ≤ ub. Moreover, x0 is the starting

point from which the algorithm starts searching for the optimal solution. The options

of this optimization function are defined in options using the optimset function.

There are many options that can be set using this function. We focus here on two

of them, Algorithm and LargeScale. This function returns the optimal value of

decision parameter x corresponding to the optimal value of objective function fval.

We can choose the optimization algorithm from one of interior-point-legacy

(default), interior-point, dual-simplex, active-set or simplex. The

first three algorithms are large-scale algorithms, while last two algorithms are not.

An optimization algorithm is large scale when it uses linear algebra that does not

need to store, nor operate on, full matrices. This may be done internally by storing

sparse matrices, and by using sparse linear algebra for computations whenever pos-

sible. Furthermore, the internal algorithms either preserve sparsity, such as a sparse

Cholesky decomposition, or do not generate matrices, such as a conjugate gradient

method. The option LargeScale can be set to ’on’ (default), with one of the

mentioned large scale algorithms, to solve large size problems or ’off’ when we

intend to solve medium or small size problems.
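As a toy usage example (not tied to the thesis models), the following solves a small linear program with the same calling convention used in Section 4.3.1.

% Minimize -x1 - 2*x2 subject to x1 + x2 <= 4, x1 <= 3, 0 <= x <= 4.
f = [-1; -2];
A = [1 1; 1 0];
b = [4; 3];
lb = zeros(2, 1);
ub = 4 * ones(2, 1);
x0 = [];                       % empty starting point; ignored by most algorithms
options = optimset('Algorithm','dual-simplex', ...
    'Display','off','LargeScale','on');
[x, fval] = linprog(f, A, b, [], [], lb, ub, x0, options);
% Expected optimum: x = [0; 4] with fval = -8.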

4.3.1 Matlab Implementation of D2D Caching Example

In this section we shed light on how Matlab can help us to solve the D2D caching

problem discussed in Section 2.6.3. When the number of users and the number of data

items grow large in this problem we have a large scale linear optimization problem.

A complete implementation code is provided in Appendix C. We consider the case

when $N = 1000$ and $M = 10^5$, and solve the problem while increasing the number of

users involved in the system. We generate a random instance of the demand and

mobility profiles using the demand gen and mobility gen functions respectively.

We assume that each user can store up to 10% of these data items in his device. We

also assume that the carrier pays back 0.5 units for each byte cached and shared by

every user (i.e. r = 0.5). We initialize some variables to store the values obtained

after each run of the loop.

X_optimal = zeros(NMax,NMax*M);

Cost_optimal = zeros(1,NMax);

Gain_optimal = zeros(1,NMax);

Memory_optimal = zeros(1,NMax);

Each time we run the loop, the statistics corresponding to involved users are

captured to prepare alpha. After that we set the parameters of the optimization

function linprog. We set an upper and lower bound on the decision parameter for

the feasibility of the solution. These bounds represent the third constraint in (2.6).

LB = zeros(1,N*M);

UB = zeros(1,N*M);

for n=1:N

UB(M*(n-1)+1:M*n) = Sm;

end

We define the inequality constraints by generating two matrices, A1 and A2.

Matrix A1 is for the memory constraint which is the first constraint in (2.6). Matrix

A2 is the second constraint in (2.6). These two matrices are merged in one matrix to

be set as an option to the linprog function.

A = [A1 ; A2];

b = [b1; b2];

An initial point x0 = (0, 0, · · · , 0) is chosen. The most important part of this code

is to choose the optimization algorithm and to turn the large scale option to ’on’

options = optimset('Algorithm','dual-simplex', ...
    'Display','off','LargeScale','on');

Now, everything is ready to call the linprog function and solve the problem

[xopt,costP]=linprog(cost_fun,A,b,[],[],LB,UB,x0,options);

This function returns the optimal solution xopt and the optimal value of the

cost function costP. The results of this system are depicted in Figure 4.1. The

carrier achieves more gain and uses less memory when more users are engaged in the

network. Moreover, we notice that memory usage decays as $O(1/N)$. More users help all parties to gain more and, at the same time, less memory is required for this caching

as the network expands. When a user requests a certain data item and more users are

located around him, he gets that item either from his local cache or from other users

through the D2D communication. This helps the carrier to smooth out the network

load and minimize the incurred service cost. Notice that the LargeScale option of

Matlab allowed us to find the optimal solution even when the number of users N and

the number of data contents M increase. We refer the reader to [2] for more details.

[Figure 4.1: System Performance of the D2D Caching Network. Two panels plot the carrier gain (%) and the memory usage (%) against the number of users N.]
Appendix A: AMPL Implementation of Column Generation

The Data File:

data;

# -----------------------------------------
# SETTING THE ROLL WIDTH AND ORDER DETAILS
# -----------------------------------------
param roll_width := 110 ;
param: WIDTHS: orders :=
20 48
45 35
50 24
55 10
75 8 ;

The Run File:

# ----------------------------------------
# SETTING MODEL FILE AND SOLVER
# ----------------------------------------
model cut.mod;
data cut.dat;
option solver cplex, solution_round 6;
option display_1col 0, display_transpose -10;

# ----------------------------------------
# DEFINING THE PROBLEMS
# ----------------------------------------
problem Cutting_Opt: Cut, Number, Fill;
option relax_integrality 1;
problem Pattern_Gen: Use, Reduced_Cost, Width_Limit;
option relax_integrality 0;

# ----------------------------------------
# GENERATING THE PATTERNS
# ----------------------------------------
let nPAT := 0;
for {i in WIDTHS} {
let nPAT := nPAT + 1;
let nbr[i,nPAT] := floor (roll_width/i);
let {i2 in WIDTHS: i2 <> i} nbr[i2,nPAT] := 0;
}

# ----------------------------------------
# RUNNING THE PROCEDURE
# ----------------------------------------
repeat {
solve Cutting_Opt;
let {i in WIDTHS} price[i] := Fill[i].dual;

solve Pattern_Gen;
if Reduced_Cost < -0.00001 then {
let nPAT := nPAT + 1;
let {i in WIDTHS} nbr[i,nPAT] := Use[i];
}
else break;
}
display nbr, Cut;

# ----------------------------------------
# FINAL ROUND AND OUTPUT DISPLAY
# ----------------------------------------
option Cutting_Opt.relax_integrality 0;
solve Cutting_Opt;
display Cut;

The Mod File:

# ----------------------------------------
# CUTTING STOCK USING PATTERNS
# ----------------------------------------
param roll_width > 0; # width of raw rolls
set WIDTHS; # set of widths to be cut
param orders {WIDTHS} > 0; # number of each width to be cut
param nPAT integer >= 0; # number of patterns
set PATTERNS = 1..nPAT; # set of patterns
param nbr {WIDTHS,PATTERNS} integer >= 0;

check {j in PATTERNS}:
sum {i in WIDTHS} i * nbr[i,j] <= roll_width;
# defn of patterns: nbr[i,j] = number
# of rolls of width i in pattern j
var Cut {PATTERNS} integer >= 0; #rolls cut using each pattern
minimize Number: # minimize total raw rolls cut
sum {j in PATTERNS} Cut[j];
subject to Fill {i in WIDTHS}:
sum {j in PATTERNS} nbr[i,j] * Cut[j] >= orders[i];
# for each width, total
# rolls cut meets total orders

# ----------------------------------------
# KNAPSACK SUBPROBLEM FOR CUTTING STOCK
# ----------------------------------------
param price {WIDTHS} default 0.0;
var Use {WIDTHS} integer >= 0;
minimize Reduced_Cost:
1 - sum {i in WIDTHS} price[i] * Use[i];
subject to Width_Limit:
sum {i in WIDTHS} i * Use[i] <= roll_width;

Appendix B: GAMS Implementation of Multi-Commodity Network
Flow Problem

# ----------------------------------------
# Problem Settings
# ----------------------------------------
$Eolcom !
$setddlist nodes comm maxtime
$if NOT set nodes $set nodes 20
$if NOT set comm $set comm 5
$if NOT set maxtime $set maxtime 50
$if NOT errorfree $abort wrong double dash parameters:
--nodes=n --comm=n --maxtime=secs

# ----------------------------------------
# Defining SETS
# ----------------------------------------
sets i nodes / n1*n%nodes% /
k commodities / k1*k%comm% /
e(i,i) edges
alias (i,j)

# ----------------------------------------
# Defining Indexed Parameters
# ----------------------------------------
parameters
cost(i,j) cost for edge use
bal(k,i) balance
kdem(k) demand
cap(i,j) bundle capacity ;

# ----------------------------------------
# Declaring Variables
# ----------------------------------------
variables

x(k,i,j) multi commodity flow
z objective
positive variable x;

# ----------------------------------------
# Objective Functions and Constraints
# ----------------------------------------
equations
defbal(k,i) balancing constraint
defcap(i,j) bundling capacity
defobj;
defobj.. z =e= sum((k,e), cost(e)*x(k,e));
defbal(k,i).. sum(e(i,j),x(k,e))-sum(e(j,i),x(k,e))=e=bal(k,i);
defcap(e).. sum(k, x(k,e)) =l= cap(e);

# ----------------------------------------
# Defining Model
# ----------------------------------------
model mcf multi-commodity flow problem /all/;

# ----------------------------------------
# Making a Random Instance
# ----------------------------------------
scalars inum, edgedensity /0.3/ ;
e(i,j) = uniform(0,1) < edgedensity; e(i,i) = no;
cost(e) = uniform(1,10);
cap(e) = uniform(50,100)*log(card(k));
loop(k,
kdem(k) = uniform(50,150);
inum = uniformInt(1,card(i));
bal(k,i)$(ord(i)=inum) = kdem(k);
inum = uniformInt(1,card(i));
bal(k,i)$(ord(i)=inum) = bal(k,i) - kdem(k);
kdem(k) = sum(i$(bal(k,i)>0), bal(k,i)) );

# ----------------------------------------
# See if the random model is feasible
# ----------------------------------------
option limrow=0, limcol=0;
option solprint=off, solvelink=%solvelink.CallModule%;
solve mcf min z using lp;
abort$(mcf.modelstat <> %modelstat.Optimal%)
'problem not feasible. Increase edge density.'
parameter xsingle(k,i,j) single solve;

xsingle(k,e) = x.l(k,e)$[x.l(k,e) > 1e-6];
display$(card(i) < 30) xsingle;

# ----------------------------------------
# Define Master Model
# ----------------------------------------
set p paths idents / p1*p100 /
ap(k,p) active path
pe(k,p,i,j) edge path incidence vector
parameter
pcost(k,p) path cost
positive variable xp(k,p), slack(k);
equations
mdefcap(i,j) bundle constraint
mdefbal(k) balance constraint
mdefobj objective;
mdefobj.. z=e=sum(ap,pcost(ap)*xp(ap))+sum(k,999*slack(k));
mdefbal(k).. sum(ap(k,p), xp(ap)) + slack(k) =e= kdem(k);
mdefcap(e).. sum(pe(ap,e), xp(ap)) =l= cap(e);
model master / mdefobj, mdefbal, mdefcap /;

# ----------------------------------------
# Define Pricing Model: Shortest Path
# ----------------------------------------
parameter ebal(i)
positive variable xe(i,j)
equations
pdefbal(i) balance constraint
pdefobj objective;
pdefobj.. z =e= sum(e, (cost(e)-mdefcap.m(e))*xe(e));
pdefbal(i).. sum(e(i,j), xe(e)) - sum(e(j,i),xe(e))=e=ebal(i);
model pricing / pdefobj, pdefbal /;

# ----------------------------------------
# Solving Master and Pricing Problems
# ----------------------------------------
Scalar done loop indicator, iter iteration counter;
Set nextp(k,p) next path to be added ;
* clear path data
done = 0; iter = 0;
ap(k,p) = no; pe(k,p,e) = no;
pcost(k,p) = 0;
nextp(k,p) = no; nextp(k,’p1’) = yes;
While(not done, iter=iter+1;

solve master using lp minimizing z;
done = 1;
loop(k$kdem(k),
ebal(i) = bal(k,i)/kdem(k);
solve pricing using lp minimizing z;
pricing.solprint=%solprint.Quiet%;
! turn off all output for the pricing model
if (mdefbal.m(k) - z.l > 1e-6, ! add new path
ap(nextp(k,p)) = yes;
pe(nextp(k,p),e) = round(xe.l(e));
pcost(nextp(k,p)) = sum(pe(nextp,e), cost(e));
nextp(k,p) = nextp(k,p-1);
! bump the path to the next free one
abort$(sum(nextp(k,p),1)=0) 'set p too small';
done = 0 ) ) );
abort$(abs(master.objval-mcf.objval)>1e-3)
'different objective values', master.objval, mcf.objval;
parameter xserial(k,i,j);
xserial(k,e) = sum(pe(ap(k,p),e), xp.l(ap));
display$(card(i) < 30) xserial;

Appendix C: Matlab Implementation of D2D Caching Example

The Main File:

% ----------------------------------------
% System Parameters
% ----------------------------------------
clc, clear all; close all; % Clearing Everything
NInit = 5; % Initial Number of Users
step = 5; % Step Size in For Loop
L = 4; % No. of Locations
NMax = 1000; % No. of Users
M = 1e05; % No. of Data Items
Theta_all = mobility_gen(NMax,L,1); % Mobility Profile
P_all = demand_gen(NMax,M,1); % Demand Profile
Sm = 100*ones(1,M); % Data Items Sizes
Zn_all = (M/10)*100*ones(NMax,1); % Memory Sizes
r = 0.50; % Reward Factor

% ----------------------------------------
% Initialization
% ----------------------------------------
X_optimal = zeros(NMax*M,NMax); % one column of caching decisions per N
Cost_optimal = zeros(1,NMax);
Gain_optimal = zeros(1,NMax);
Memory_optimal = zeros(1,NMax);

% ----------------------------------------
% Running Loop on No. of Users
% ----------------------------------------
for N=NInit:step:NMax
disp(['N = ' num2str(N)]);
% Taking Chunk
PIndex = [];
for m=1:M
PIndex = [PIndex (1:N)+(m-1)*NMax];

end
P = P_all(PIndex);
TIndex = [];
for l=1:L
TIndex = [TIndex (1:N)+(l-1)*NMax];
end
Theta = Theta_all(TIndex);
Zn = Zn_all(1:N);
% Caching Decisions
x = zeros(1,N*M);
% Preparing Matrix (S)
S = zeros(1,M*L);
for m=1:M
S((m-1)*L+1:m*L)=Sm(m)*ones(1,L);
end
% Preparing Matrix (I) & (II)
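% Each L-by-N block of I weighs the users' caching decisions for one
% item by the mobility profile Theta, giving the expected amount of
% that item available at each location; II stacks the transposed blocks.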
I = zeros(M*L,M*N);
II = zeros(M*N,M*L);
for m=1:M
I((m-1)*L+1:m*L,(m-1)*N+1:m*N) = reshape(Theta,N,L)';
II((m-1)*N+1:m*N,(m-1)*L+1:m*L) = reshape(Theta,N,L);
end
% Preparing Matrix (III)
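% III(m,l) pairs the demand profile P with the mobility profile Theta,
% i.e. the expected demand for item m generated at location l.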
III = zeros(1,M*L);
for m=1:M
for l=1:L
III((m-1)*L+l) = P((m-1)*N+1:m*N)*Theta((l-1)*N+1:l*N)';
end
end
% Preparing Alpha
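% alpha maps each caching variable to the expected demand it can serve
% locally; it offsets the reward factor r in the proactive cost below.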
alpha = III*I;
% Reactive Load Calculation
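% Reactive load: total traffic when every request is served on demand,
% i.e. item sizes weighted by the aggregate demand profile.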
loadR=0;
for m=1:M
loadR = loadR + Sm(m)*ones(1,N)*P((m-1)*N+1:m*N)';
end
% Reactive Cost (Linear)
costR = loadR;

% ----------------------------------------
% Optimization Parameters
% ----------------------------------------
% Upper and Lower Bounds
LB = zeros(1,N*M);

UB = zeros(1,N*M);
% the decision vector is item-major: entry (m-1)*N+n is user n's cache of item m
for m=1:M
UB((m-1)*N+1:m*N) = Sm(m)*ones(1,N);
end
% Constraint 1
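% Constraint 1: the total volume cached by each user, summed over all
% items, may not exceed that user's memory size Zn.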
A1 = zeros(N,N*M);
for m=1:M
A1(1:N,(m-1)*N+1:m*N)=eye(N,N);
end
b1 = Zn(1:N);
% Constraint 2
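% Constraint 2: the expected amount of each item cached per location
% (matrix I applied to x) may not exceed the item size.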
A2 = I;
b2 = S';
% Constraint Matrix
A = [A1 ; A2];
b = [b1; b2];
% Initial Point
x0 = zeros(1,N*M);
% Option Setting
options = optimset('Algorithm','dual-simplex', ...
    'Display','off','LargeScale','on');

% ----------------------------------------
% Solving the Problem - Proactive Cost
% ----------------------------------------
cost_fun = r - alpha;
[xopt,costP] = linprog(cost_fun,A,b,[],[],LB,UB,x0,options);
costP = costP + costR;
X_optimal(1:N*M,N) = xopt;
Cost_optimal(N) = costP;
Gain_optimal(N) = 100*(costR-costP)/costR;
Memory_optimal(N) = 100*sum(xopt)/sum(Zn(1:N));
end

Cost_LB = Cost_optimal;
X_LB = X_optimal;

% ----------------------------------------
% Plotting Results
% ----------------------------------------
figure(3); subplot(1,2,1);
plot(NInit:step:NMax, Gain_optimal(NInit:step:NMax), ...
    'r-','LineWidth',2);
grid on; xlabel('No. of Users (N)'); ylabel('Gain (%)');

title('Carrier Gain vs No. of Users (N)');
figure(3); subplot(1,2,2);
plot(NInit:step:NMax, Memory_optimal(NInit:step:NMax), ...
    'b-','LineWidth',2);
grid on; xlabel('No. of Users (N)'); ylabel('Memory Usage (%)');
title('Overall Memory Usage vs No. of Users (N)');

Demand Generation Function:


function P = demand_gen(N,M,S)
P = zeros(S,N*M);
for s=1:S
for m=1:M
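% zipf is a user-supplied helper (not listed here); it is assumed to
% return N Zipf-distributed popularity weights for the given exponent.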
Z = zipf(rand,1,N);
P(s,(m-1)*N+1:m*N) = Z(randperm(N))./max(Z);
end
end
end
Mobility Generation Function:
function T = mobility_gen(N,L,S)
T = rand(S,L*N);
M = zeros(N,N*L);
if N==1
M = ones(1,L); % with a single user, temp reduces to the sum over locations
else
for l=1:L
M(1:N,(l-1)*N+1:l*N)=eye(N,N);
end
end
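% Normalize each user's entries so its probabilities over the L
% locations sum to one.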
for s=1:S
temp = M*T(s,:)';
for l=1:L
T(s,(l-1)*N+1:l*N) = T(s,(l-1)*N+1:l*N)./temp';
end
end
end
