LNO Notes
Linear and Non-Linear Optimization
Nicholas Cron

1. Introduction
2. Linear Programming
3. Integer Programming
4. Networks
5. Computational Complexity
6. Non-Linear Programming

Each chapter will be divided into sections.
Please appreciate that the books listed represent a very small sample of all those available. There are dozens more – with titles including words such as 'Operations Research' or 'Operational Research' or 'OR' or 'linear programming' or 'nonlinear programming' or 'optimisation' and so on.

Also very many websites, including complete sets of lecture notes. Here is one chosen almost at random after a google search:
http://www.cs.toronto.edu/~stacho/public/IEOR4004-notes1.pdf
(Looks good but different choice of topics, and no NLP)

Do understand that, for books and web pages:
• The content may differ (irrelevant topics included, relevant topics omitted);
• The level may differ (too easy, too advanced);
• Notation may differ;
• Explanations may differ (e.g. proofs of results included or omitted);
• Discussion may be better or worse than these notes.

So it is strongly recommended that these sources be treated as supplementary (or complementary) rather than as the primary resource.
1.2 History and Scope of Optimization Problems

Optimization problems occur very widely in mathematics, statistics and elsewhere. Typically, these problems ask for the maximum or largest or smallest or least or best or optimal. Examples:

• Simple Calculus Problems – a cylindrical can is to be constructed to contain 500g of soup. What should the dimensions be to minimise the surface area, i.e. to use the smallest amount of metal?
• Simple Calculus Problems – what is the area of the largest rectangle that can be inscribed in a circle of radius 4?
• Scheduling Problems – assign crews to different airline flights to minimize total cost while ensuring that a crew rotation begins and ends in the same city.
• Revenue Management – for different classes of airline tickets, determine how many seats to sell or hold back as flight date approaches to maximize profits.
• Cutting Stock Problems – given large paper sheets, and demand for units of smaller sizes, determine the cutting pattern of large into small pieces that meets demand while minimizing waste.
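The two calculus examples can be verified numerically. This is a quick sketch, not part of the notes; it assumes the 500 g of soup occupies 500 cm³ (an illustrative figure):

```python
import math

# Soup can: minimise surface area A(r) = 2*pi*r^2 + 2*V/r for fixed volume V.
V = 500.0                             # assumed volume in cm^3
r = (V / (2 * math.pi)) ** (1 / 3)    # root of A'(r) = 4*pi*r - 2*V/r**2
h = V / (math.pi * r ** 2)
assert abs(h - 2 * r) < 1e-9          # optimal height equals the diameter

# Rectangle in a circle of radius 4: area 64*sin(t)*cos(t) over half-angles t.
best = max(64 * math.sin(t) * math.cos(t)
           for t in (k * math.pi / 2 / 10000 for k in range(10001)))
assert abs(best - 32) < 1e-4          # the optimum is a square of area 32
```

The h = 2r result (height equals diameter) holds for any fixed volume, which is why the assertion needs no reference to the particular 500 cm³ assumption.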
History and Scope of Optimization Problems

• Electricity Planning – given forecast demand by period and operating cost for each generator, determine which generators should be run in each time interval to satisfy demand and minimize cost. Note that the forecasts are critical.
• Maximum Likelihood Estimation in Statistics.

More examples will be given as the course progresses. For now, note that:

• Most of the contexts are highly practical.
• Most problems are explicitly or implicitly in the form of maximizing or minimizing (i.e. optimizing) a function, often subject to constraints; the functions may be linear or non-linear.
• The problems are of different types from different contexts, but are capable of being represented mathematically.

Most of these problems would normally be considered as coming under the ambit of Operational Research. We give a tentative definition of OR shortly, but hold onto the idea that OR is practical, uses mathematics extensively and typically involves optimisation of some sort.

Even for those approaches that are specifically statistical (e.g. MLE), mathematical (e.g. optimization in non-Euclidean spaces), economic, financial and so on, OR techniques are transferable.

It is therefore convenient to describe some of the most important and useful optimization methods against the background of OR.
History and Scope of Optimization Problems

Typical problems:

• Transatlantic convoys. Too large and they will be more visible to German U-boats, will go slower (a convoy can only go as fast as its slowest vessel) and will be more vulnerable to attack. Too small and fewer vital supplies will be transported. What is the optimal size? Conclusion: a few large convoys perform better than many small ones.
• U-boat search strategy. Allied spies were able to advise when U-boats were launched from key sites (Bordeaux, Brest etc.) Planes had a small window of opportunity to locate and attack the submarines before they submerged. What is best: to search in a spiral shape, a rectangular shape etc.? Simulations 40 years later confirm the correct approach: a spiral strategy.
• U-boat attack strategy. At what height should depth charges be released? Too high means less precision, too low means less chance of a 'kill'. In 1941, only 2 or 3% of attacks resulted in a sinking; this rose to 40% in 1944 and as high as 60% in the last few months of the war, thanks to OR.
• Aircraft maintenance. Too frequent means wastage of scarce resources. Too infrequent means sub-standard aircraft being used on critical missions.
• Attacks on land targets. How many planes to use? Too many means waste of resources and greater risk of anti-aircraft fire; too few means less effectiveness.
• Dogfight strategy. How close to approach an enemy fighter? Too close means greater chance of a hit but greater chance of being hit. Too distant and the opposite risks arise. A complete solution had to await the development of game theory after the war.
History and Scope of Optimization Problems

Notice how many problems can be expressed in optimization terms (too many, too few etc.)

A key test of the newly formed OR team at Stanmore came in May 1940 as German troops advanced through France. The French Government requested 120 extra fighter aircraft to defend their country. Churchill passed the request to the OR team for analysis. Using (simple) mathematical and statistical tools based on current losses and knowledge of German capability, the team clearly showed the futility of sending the planes.

No aircraft were sent and most of those currently in France were recalled. The Nazis conquered France. But the aircraft and pilots were crucial during the subsequent Battle of Britain. OR had passed its first major test.

After the war, the methods were extended to many non-military organisations (business, government, engineering etc.). Nowadays, most large efficient companies undertake OR, either explicitly through an OR division, or in collaboration with mathematicians, statisticians and other researchers.

Typical areas of work:
• Location problems: Where should a city locate a new airport? What is the optimal location of a fire station?
• Route scheduling: How should police officers' beats be organised? Postman delivery rounds?
History and Scope of Optimization Problems History and Scope of Optimization Problems
• Inventory Problems: How much of a commodity should be Typical Steps in an OR Project
stored in a warehouse? What restocking policy should be 1. Identify the problem (general).
used?
• Auction Strategy: At what level should a bid be made to e.g. How can time and money spent by the Post
maximise chances of success for the smallest price? What Office be reduced?
is the optimal timing? 2. Formulate the problem (specific).
e.g. Minimise time spent on postal deliveries. Is
Many such problems involve us in maximising a the project viable?
function (e.g. profit) or minimising a function (e.g. cost or
3. Observe the system.
time) subject to constraints. This is the ‘classic’ context for
e.g. Find layout of town, union regulations etc.
OR and optimization generally, enabling researchers to
determine the most efficient way to carry out some operation. 4. Meet interested parties. Be aware of ‘territorial disputes’.
Tact needed.
27 28
History and Scope of Optimization Problems

An OR consultant/practitioner may need to undertake most or all of these tasks. Tact, common sense and a flexible approach are at least as important as technical ability.

Hard and Soft OR

Current academic thinking distinguishes two approaches.

Hard systems approaches (hard OR) assume:
– objective reality of systems in the world
– well-defined problem to be solved
– technical factors foremost
– scientific approach to problem-solving
– an ideal solution.

Soft systems approaches (soft OR) assume:
– organisational problems are 'messy' or poorly defined
– stakeholders interpret problems differently (no objective reality)
– human factors important
– creative, intuitive approach to problem-solving
– outcomes are learning, better understanding, rather than a 'solution'.

HARD systems provide rigid techniques and procedures to provide unambiguous solutions to well-defined data and processing problems, focused on computer implementations.

SOFT systems provide a loose framework of tools to be used at the discretion of the analyst, focused on improvements to organisational procedures.
History and Scope of Optimization Problems

Summarising further, 'hard' OR is primarily technical and mathematical, 'soft' OR sociological and managerial.

There is often discord between the two: 'hard' OR is too 'mechanistic'; 'soft' OR is too 'touchy feely'. There should be room for both, but we concentrate on 'hard' technical approaches in this course.

Pedantic point: all optimization methods can in principle be considered as methods of OR. Some OR approaches such as soft OR are not optimization. So arguably optimization is a proper subset of OR … but let's not get hung up on definitions.

OR overlaps with other disciplines, as indicated earlier:
- Mathematics (e.g. game theory, optimization theory)
- Statistics (e.g. regression, MLE, time series and forecasting, simulation)
- Management (e.g. leadership, organizational structure)
- Economics (e.g. pricing policy).

And (for soft OR):
- Sociology (e.g. structure of organizations)
- Philosophy (e.g. approaches to problems, scientific method)
- Psychology (e.g. negotiation)
etc.
Typical Problems

LP Example

A brewery makes four beers: Light, Dark, Ale and Premium. There are three main important ingredients: Malt (M), Hops (H) and Yeast (Y). [Assume limitless supply of water, sugar etc.]

1 barrel of Light requires 1kg of M, 2kg of H, 1kg of Y
1 barrel of Dark requires 1kg of M, 1kg of H, 1kg of Y
1 barrel of Ale requires 2kg of H, 1kg of Y only
1 barrel of Premium requires 3kg of M, 1kg of H, 4kg of Y

The revenue for each barrel produced is £6, £5, £3, £7 for Light, Dark, Ale, Premium respectively. What quantities of each should be produced to maximise revenue?

Let x1, x2, x3, x4 be the barrels of Light, Dark, Ale, Premium. We have the following LP:

max 6x1 + 5x2 + 3x3 + 7x4 (revenue)
s.t. x1 + x2 + 3x4 ≤ 500 (malt)
     2x1 + x2 + 2x3 + x4 ≤ 1500 (hops)
     x1 + x2 + x3 + 4x4 ≤ 800 (yeast)
     x1, x2, x3, x4 ≥ 0 (all quantities non-negative)
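A small sketch (not part of the notes) encoding the LP data and checking a candidate production plan; the plan used here is illustrative, not the optimum:

```python
# LP data: columns are (Light, Dark, Ale, Premium)
revenue = [6, 5, 3, 7]
use = {"malt":  [1, 1, 0, 3],
       "hops":  [2, 1, 2, 1],
       "yeast": [1, 1, 1, 4]}
supply = {"malt": 500, "hops": 1500, "yeast": 800}

def feasible(x):
    """True if plan x satisfies every resource limit and is non-negative."""
    return all(v >= 0 for v in x) and all(
        sum(a * v for a, v in zip(use[r], x)) <= supply[r] for r in use)

def rev(x):
    return sum(c * v for c, v in zip(revenue, x))

plan = [100, 100, 100, 100]          # 100 barrels of each beer
assert feasible(plan) and rev(plan) == 2100
```

Checks like this are useful throughout the chapter: any claimed optimum must at least pass the feasibility test.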
Here xj is the holding in asset j. Write R(x) = Σj μjxj for the expected return (μj the expected return of asset j) and V(x) = Σi Σj σi,j xixj for the variance (σi,j the covariance of assets i and j). Clearly, we have an NLP. So V(x) represents portfolio risk.

One way to consider the trade-off between the two factors is to use V as objective function to be minimised and insist that R be no less than the minimum acceptable expected return. This gives the NLP

min V(x)
s.t. R(x) ≥ L
     Σj Pjxj ≤ B

(the last constraint keeping the cost of the portfolio, at prices Pj, within budget B).

Various alternatives and extensions are possible. For example, we might wish to maximise the total expected return where the risk should be no greater than a specified amount. There is no guarantee the two formulations given yield the same result.
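The two functions are easy to evaluate; a minimal sketch with hypothetical data (the returns mu and covariance matrix sigma below are invented for illustration):

```python
mu = [0.08, 0.12]                    # hypothetical expected returns
sigma = [[0.04, 0.01],               # hypothetical covariance matrix
         [0.01, 0.09]]

def R(x):                            # expected return, sum_j mu_j * x_j
    return sum(m * v for m, v in zip(mu, x))

def V(x):                            # portfolio variance, sum_i sum_j sigma_ij * x_i * x_j
    n = len(x)
    return sum(sigma[i][j] * x[i] * x[j] for i in range(n) for j in range(n))

x = [0.5, 0.5]                       # an equal split between the two assets
assert abs(R(x) - 0.10) < 1e-12
assert abs(V(x) - 0.0375) < 1e-12    # V is quadratic in x: an NLP objective
```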
Formulation

Suppose x, y, z are tonnes produced by processes X, Y, Z.

How can we maximise revenue? Read the question carefully, maybe draw a picture…

The amount of A produced is ¾x + ¼y + ½z, with income 285(¾x + ¼y + ½z). Similarly the income on B is 105(¼x + ¾y + ½z). Costs are 60x + 40y + 25z.
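Collecting terms gives the net revenue per tonne of each process, so the objective works out as max 180x + 110y + 170z; a quick exact-arithmetic check:

```python
from fractions import Fraction as F

price_A, price_B = 285, 105
cost = {"x": 60, "y": 40, "z": 25}
frac_A = {"x": F(3, 4), "y": F(1, 4), "z": F(1, 2)}   # tonnes of A per tonne run
frac_B = {"x": F(1, 4), "y": F(3, 4), "z": F(1, 2)}   # tonnes of B per tonne run

# net revenue coefficient = income from A + income from B - running cost
net = {p: price_A * frac_A[p] + price_B * frac_B[p] - cost[p] for p in cost}
assert net == {"x": 180, "y": 110, "z": 170}
```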
Simplex Method

STEP 2: Find the most negative value in the bottom row (here, -25); this fixes the pivot column k. Then find the row with the smallest positive ratio bi/aik, taken over rows with aik > 0. Here we choose min{48/6, 30/3, 9/¼} = 8. [The 'minimum ratio' rule.] This defines a pivot (here, the 6 in the x2 column):

x1    x2    x3   x4   x5   RHS
 1     6     1    0    0    48
 3     3     0    1    0    30
 1     ¼     0    0    1     9
-20   -25    0    0    0     0

STEP 3: Using row operations, make the pivot 1 and all other column entries 0 (here R1 → 1/6 R1, then R2 → R2 - ½R1, R3 → R3 - 1/24 R1, Rz → Rz + 25/6 R1, each using the original R1; Rz is the bottom row). This leads to a new tableau. Proceed as before: find column, find pivot, carry out row operations.

x1        x2   x3       x4   x5   RHS
 0.167     1    0.167    0    0     8
 2.5       0   -0.5      1    0     6
 0.958     0   -0.042    0    1     7
-15.833    0    4.167    0    0   200
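Steps 2 and 3 are mechanical, so they are easy to script. A minimal sketch (not from the notes) in exact arithmetic, reproducing the tableau update above:

```python
from fractions import Fraction as F

def pivot(T, r, c):
    """Make entry (r, c) equal to 1 and clear the rest of column c."""
    T = [row[:] for row in T]
    p = T[r][c]
    T[r] = [v / p for v in T[r]]
    for i in range(len(T)):
        if i != r and T[i][c] != 0:
            f = T[i][c]
            T[i] = [v - f * w for v, w in zip(T[i], T[r])]
    return T

# Columns x1..x5 then RHS; the last row is the objective row.
T = [[F(1), F(6), F(1), F(0), F(0), F(48)],
     [F(3), F(3), F(0), F(1), F(0), F(30)],
     [F(1), F(1, 4), F(0), F(0), F(1), F(9)],
     [F(-20), F(-25), F(0), F(0), F(0), F(0)]]

# STEP 2: column x2 (most negative entry -25); min ratio 48/6 = 8 picks row 1.
T1 = pivot(T, 0, 1)
assert [row[5] for row in T1] == [8, 6, 7, 200]      # RHS column of the new tableau
assert T1[1][0] == F(5, 2) and T1[3][0] == F(-95, 6) # 2.5 and -15.833...
```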
Simplex Method

1. Some bottom row coefficient is negative but every other entry in that column is non-positive. This is unbounded form; the LP has no finite optimum.
2. For some i, bi ≠ 0 (RHS coefficient non-zero) but aij = 0 for all j (all other row entries zero). The feasible region is the empty set; the constraints are incompatible. The LP is infeasible.
3. bi > 0 for some i but aij ≤ 0 for all j. Again, the constraints are incompatible and again there is no solution. The LP is infeasible.

Further Example

We solve max -x1 - 2x2 + x3
s.t. x1 - x2 + x3 ≤ 1
     x1 + x2 - 2x3 ≤ 4
     x1 ≥ 0, x2, x3 free.

Let x2 = x2' - x2'', x3 = x3' - x3'', x4, x5 slack. Tableaux:

x1   x2'   x2''   x3'   x3''   x4   x5   RHS
 1    -1     1     1     -1     1    0     1
 1     1    -1    -2      2     0    1     4
 1     2    -2    -1      1     0    0     0

x1   x2'   x2''   x3'   x3''   x4   x5   RHS
 1    -1     1     1     -1     1    0     1
 2     0     0    -1      1     1    1     5
 3     0     0     1     -1     2    0     2
Simplex Method

x1   x2'   x2''   x3'   x3''   x4   x5   RHS
 3    -1     1     0      0     2    1     6
 2     0     0    -1      1     1    1     5
 5     0     0     0      0     3    1     7

Solution x1 = 0, x2'' = 6, x3'' = 5, i.e. (x1, x2, x3) = (0, -6, -5), value 7.

Important Point 1: The method as described assumes that all bi ≥ 0, and some columns of the tableau have an identity matrix, with objective function coefficients zero. Sometimes that is not the case – so alternative procedures are needed.

Important Point 2: The method as described assumes the LP is in standard form. If not, the technique will not yield the LP optimum in general. Convert to standard form!

2.5 Degeneracy and Cycling

Consider the LP:

max ¾x1 - 20x2 + ½x3 - 6x4 + 3 = z
s.t. ¼x1 - 8x2 - x3 + 9x4 ≤ 0
     ½x1 - 12x2 - ½x3 + 3x4 ≤ 0
     x3 ≤ 1
     x1, x2, x3, x4 ≥ 0

and the following sequence of tableaux:
x1    x2     x3    x4    x5   x6   x7   RHS
 1   -32     -4    36     4    0    0     0
 0     4    1.5   -15    -2    1    0     0
 0     0      1     0     0    0    1     1
 0    -4   -3.5    33     3    0    0     3

x1      x2   x3    x4      x5     x6    x7   RHS
 1/8     0    1   -10.5   -1.5    1     0     0
-3/64    1    0    3/16    1/16  -1/8   0     0
-1/8     0    0    10.5    1.5   -1     1     1
 1/4     0    0   -3      -2      3     0     3

x1      x2      x3   x4   x5      x6      x7   RHS
-2.5     56      1    0    2      -6       0     0
-0.25    5.333   0    1    0.333  -0.667   0     0
 2.5    -56      0    0   -2       6       1     1
-0.5     16      0    0   -1       1       0     3

x1      x2    x3      x4   x5   x6      x7   RHS
-1.25    28    0.5     0    1   -3       0     0
-0.167   -4   -0.167   1    0    0.333   0     0
 0        0    1       0    0    0       1     1
-1.75    44    0       0    0   -2       0     3

x1    x2    x3   x4   x5   x6   x7   RHS
 ¼    -8    -1    9    1    0    0     0
 ½   -12    -½    3    0    1    0     0
 0     0     1    0    0    0    1     1
-¾    20    -½    6    0    0    0     3

The last tableau (with slacks x5, x6, x7) is exactly the initial tableau for the LP: the pivots have brought the method back to its starting point, so the simplex method cycles and, without special precautions, never terminates.
Initialisation: Big-M Method

Write the LP with M a huge positive constant as

max x1 + 4x2 + 9x3 - MR
s.t. -2x1 + x2 - 4x3 - x4 + R = 3
     -3x1 + 2x2 - 7x3 + x5 = 5
     x1, x2, x3, x4, x5, R ≥ 0

R has been introduced to create an 'artificial' identity matrix structure. The term MR is present to penalise positive values of R; it is designed to compel R = 0 if this can be done. If the modified problem has a solution with R = 0, it will solve the original LP. If the modified problem has a solution with R > 0, there is no solution to the original LP without introducing R, so the LP is infeasible.

Set up the initial tableau and pivot as shown, initially to give 'identity structure' (a small step but a crucial one), then a standard simplex pivot. Notice that we reach optimal form (M is very large) but R is non-zero. Deduce that the feasible region is empty; there is no solution to the initial problem.
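The infeasibility can also be certified directly: twice the first original constraint plus the second gives -x1 - x3 ≥ 1, which no x ≥ 0 can satisfy. A sketch of that check (not in the notes):

```python
from fractions import Fraction as F

# Original constraints, written as a_i . x >= b_i with x >= 0:
#   c1: -2x1 +  x2 - 4x3 >= 3
#   c2:  3x1 - 2x2 + 7x3 >= -5
a = [[F(-2), F(1), F(-4)], [F(3), F(-2), F(7)]]
b = [F(3), F(-5)]

y = [F(2), F(1)]        # non-negative multipliers: combine 2*c1 + 1*c2
comb = [y[0] * a[0][j] + y[1] * a[1][j] for j in range(3)]
rhs = y[0] * b[0] + y[1] * b[1]

# Any feasible x >= 0 would need -x1 - x3 >= 1, which is impossible.
assert comb == [-1, 0, -1] and rhs == 1
```

Certificates of this kind are exactly what duality theory (section 2.9) formalises.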
Initialisation

Modify the problem slightly.

max x1 + 4x2 + 9x3
s.t. -2x1 + x2 - 4x3 ≥ 3
     3x1 - 2x2 + 7x3 ≥ -7
     x1, x2, x3 ≥ 0

Leading to

max x1 + 4x2 + 9x3 - MR
s.t. -2x1 + x2 - 4x3 - x4 + R = 3
     -3x1 + 2x2 - 7x3 + x5 = 7
     x1, x2, x3, x4, x5, R ≥ 0
Initialisation

Notice that R has been driven out of the basis. At optimum R = 0. We can read off the solution: x1 = 0, x2 = 7, x3 = 1, z = 37.

With two or more artificial variables, proceed similarly; both (all) will need to be removed as basis variables, if possible.

Two-Phase Method

Phase 1: Again use artificial variables. Create a new objective function consisting of the sum of the artificial variables. Use simplex to minimise this function subject to the given constraints. If this new artificial function can be reduced to zero, then each of the (non-negative) artificial variables will be zero. Then all the original constraints are satisfied: proceed to phase 2. If not, we deduce at once that the original problem is infeasible.

Phase 2: Use the basic feasible solution from phase 1, ignoring the artificial variables which no longer play any part, as starting point for the original problem with the original objective function. Apply ordinary simplex to yield the optimum.

Return to the problem:

max x1 + 4x2 + 9x3
s.t. -2x1 + x2 - 4x3 ≥ 3
     3x1 - 2x2 + 7x3 ≥ -5
     x1, x2, x3 ≥ 0
Initialisation

Solve:

min R
s.t. -2x1 + x2 - 4x3 - x4 + R = 3
     -3x1 + 2x2 - 7x3 + x5 = 5
     x1, x2, x3, x4, x5, R ≥ 0

Set up the tableau, then pivot, initially for identity structure, then usual simplex.
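Phase 1 for this problem takes just two pivots. A sketch in exact arithmetic (not from the notes, but following the steps described above); it finds min R = ½ > 0, agreeing with the infeasibility found earlier by Big-M:

```python
from fractions import Fraction as F

def pivot(T, r, c):
    """Make entry (r, c) equal to 1 and clear the rest of column c."""
    T = [row[:] for row in T]
    p = T[r][c]
    T[r] = [v / p for v in T[r]]
    for i in range(len(T)):
        if i != r and T[i][c] != 0:
            f = T[i][c]
            T[i] = [v - f * w for v, w in zip(T[i], T[r])]
    return T

# Columns x1 x2 x3 x4 x5 R, then RHS; the last row is the phase-1
# objective (maximise -R, written as z + R = 0).
T = [[F(-2), F(1), F(-4), F(-1), F(0), F(1), F(3)],
     [F(-3), F(2), F(-7), F(0), F(1), F(0), F(5)],
     [F(0), F(0), F(0), F(0), F(0), F(1), F(0)]]

T = pivot(T, 0, 5)   # give R its identity column (clear it from the z-row)
T = pivot(T, 1, 1)   # x2 enters; the second constraint row wins the ratio test (5/2 < 3)

assert all(v >= 0 for v in T[2][:6])   # optimal form reached
assert T[2][6] == F(-1, 2)             # max(-R) = -1/2, i.e. min R = 1/2 > 0
```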
Initialisation

Consider the modified problem again, in the form:

min R
s.t. -2x1 + x2 - 4x3 - x4 + R = 3
     -3x1 + 2x2 - 7x3 + x5 = 7
     x1, x2, x3, x4, x5, R ≥ 0

For phase 1, proceed as before. We now have a solution to the phase 1 problem with x2 = 3, x5 = 1 and, crucially, R = 0. A solution to the original problem exists. Find it by dropping the artificial variable, restoring the original objective function, pivoting to make x2 basic, then using standard simplex.
Initialisation

We obtain the same solution as earlier.

Comparison

Which is better, Big-M or Two-Phase? Big-M may be simpler and has the advantage of carrying out the optimisation in one pass. However, it has a serious computational disadvantage. In running the algorithm, we would frequently need to multiply a very large number (M) by a very small number (R), and computer arithmetic can lead to serious round-off error.

For this reason, the two-phase method is more widely used.
2.8 Practical LP: Computing

We see how we can solve an LP simply by computer.

min 4x1 + 11x2 - 13x3 + 5x4
s.t. x1 + 4x2 - 9x3 - x4 ≥ 21
     2x1 - 7x2 + 10x3 + 10x4 = 13
     7x1 + 2x2 + 5x3 - 2x4 = 17
     x1 + x2 + x3 + x4 ≥ 8
     x1, x2, x3, x4 ≥ 0

Having four constraints, with awkward numbers, and ≥ and equality constraints, means that use of Big-M or the two-phase method, or otherwise, directly will be tiresome and prone to error. For larger problems, manual solution is almost impossible.

Excel

Open Solver under Tools – an Add-in may be needed. One way to proceed is to assume that variables x1, x2, x3, x4 are in cells A1-A4.

B1 contains the value of the objective function =4*A1+11*A2-13*A3+5*A4.
C1 contains the LHS of the first constraint =A1+4*A2-9*A3-A4 and similarly, D1-F1 the LHSs for the other constraints.

In the Solver dialogue window:
Set Target Cell $B$1
Equal to Min
By Changing Cells $A$1:$A$4
Subject to Constraints – use Add to give $C$1>=21 etc.
Options – Assume Non-Negative and Assume Linear Model
Computing

Using Solver now gives the solution
x1 = 2.165, x2 = 5.970, x3 = 0, x4 = 5.046, z = 99.565 (to 3 d.p.)

Some other options are available. For example, Excel supplies a brief sensitivity analysis if required.

R

More than one way to do this. Maybe simplest is to use the package lpSolve. This must be downloaded from the Packages menu.

Then use the following syntax. Fairly self-explanatory, but note:
- Syntax must be used exactly as written; R is case-sensitive
- Coefficients are entered by columns rather than rows.
Computing

this.lp=lp(objective.in=c(4,11,-13,5),
  const.mat=matrix(c(1,2,7,1,4,-7,2,1,-9,10,5,1,-1,10,-2,1),nrow=4),
  const.rhs=c(21,13,17,8),
  const.dir=c(">=","==","==",">="),direction="min")
this.lp$solution

We obtain the same solution as before:

[1] 2.164557 5.970464 0.000000 5.046414

Alternatives exist in R, such as solveLP (in the linprog package) and simplex (in the boot package). Of course, many packages other than R and Excel can perform LP. Considerations should include speed, the size of the LP and the amount of output desired.

2.9 Duality

Early in the development of LP theory, it was realised that every LP has an associated LP, its dual, and the solutions to the two are closely related. This is important for both theoretical and practical reasons.

Consider an LP in the form
max cTx s.t. Ax ≤ b, x ≥ 0

Its dual is
min bTy s.t. ATy ≥ c, y ≥ 0
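As a quick sanity check (not in the notes), the solution reported for the computing example can be substituted back into its constraints:

```python
x = [2.164557, 5.970464, 0.0, 5.046414]    # solution reported above
A = [[1, 4, -9, -1],
     [2, -7, 10, 10],
     [7, 2, 5, -2],
     [1, 1, 1, 1]]
lhs = [sum(a * v for a, v in zip(row, x)) for row in A]

assert lhs[0] >= 21 - 1e-4                 # >= 21 (binding, up to rounding)
assert abs(lhs[1] - 13) < 1e-4             # = 13
assert abs(lhs[2] - 17) < 1e-4             # = 17
assert lhs[3] >= 8                         # >= 8 (slack)

z = sum(c * v for c, v in zip([4, 11, -13, 5], x))
assert abs(z - 99.565) < 1e-3              # matches the reported objective
```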
Duality

For example, the dual of

max 2x1 + x2
s.t. x1 + x2 ≤ 6
     x1 - x2 ≤ 2
     x2 ≤ 3
     x1, x2 ≥ 0

is

min 6y1 + 2y2 + 3y3
s.t. y1 + y2 ≥ 2
     y1 - y2 + y3 ≥ 1
     y1, y2, y3 ≥ 0

Note the correspondence between primal L and dual L*:
• One is a maximisation, the other a minimisation;
• Both have inequality constraints with opposite signs;
• To form the dual, matrix A is transposed;
• The objective function in one is the RHS vector in the other, and vice versa;
• Each primal variable corresponds to a dual constraint, and vice versa.
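Forming the dual is purely mechanical (swap b and c, transpose A), so it can be scripted; a minimal sketch (not from the notes) checked against the example above:

```python
def dual(c, A, b):
    """Dual of max c'x s.t. Ax <= b, x >= 0 is min b'y s.t. A'y >= c, y >= 0."""
    AT = [[A[i][j] for i in range(len(A))] for j in range(len(A[0]))]
    return b, AT, c          # dual objective, constraint matrix, constraint RHS

c = [2, 1]
A = [[1, 1], [1, -1], [0, 1]]
b = [6, 2, 3]

obj, AT, rhs = dual(c, A, b)
assert obj == [6, 2, 3]                   # min 6y1 + 2y2 + 3y3
assert AT == [[1, 1, 0], [1, -1, 1]]      # y1 + y2 >= 2 and y1 - y2 + y3 >= 1
assert rhs == [2, 1]
```

Applying the same transposition again recovers (c, A, b), which is the content of Result 1 below.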
Duality

Result 1
The dual of the dual of an LP is the original LP. [We usually refer to the original LP as the primal.]

Proof
Let L (primal) be the LP max cTx s.t. Ax ≤ b, x ≥ 0. Notice that every LP can be written in this way.

Now … what about equality constraints?

Let us find the dual of L: min cTx s.t. Ax = b, x ≥ 0. Write L as

min cTx s.t. [A; -A] x ≥ [b; -b], x ≥ 0

(the equality Ax = b replaced by the pair of inequalities Ax ≥ b and -Ax ≥ -b). Thus L* can be written

max bT(u - v) s.t. AT(u - v) ≤ c, u, v ≥ 0

or max bTz s.t. ATz ≤ c, where z = u - v is a vector of free variables.

We can deduce a further correspondence between an LP and its dual (partial proof given):

Result 2
The dual variable defined by an equality constraint is unrestricted in sign, and vice versa.

Result 3 (Weak Duality Theorem)
Consider the primal-dual LPs: max cTx s.t. Ax ≤ b, x ≥ 0 and min bTy s.t. ATy ≥ c, y ≥ 0. Then if y is feasible for the minimisation and x is feasible for the maximisation, cTx ≤ bTy.

Proof
Since x is feasible, Ax ≤ b, x ≥ 0. Thus (Ax)T ≤ bT, hence xTAT ≤ bT and, as y ≥ 0, xTATy ≤ bTy.
But y is feasible, so ATy ≥ c, y ≥ 0. So, as x ≥ 0, xTATy ≥ xTc.
It follows that xTc ≤ bTy. But xTc is a scalar, so xTc = (xTc)T = cTx. The result follows.
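Weak duality is easy to observe numerically on the earlier example; the feasible points below are chosen by hand for illustration:

```python
c = [2, 1]
b = [6, 2, 3]
A = [[1, 1], [1, -1], [0, 1]]

x = [2, 3]        # primal-feasible: 5 <= 6, -1 <= 2, 3 <= 3
y = [2, 0, 1]     # dual-feasible:   2 + 0 >= 2, 2 - 0 + 1 >= 1

assert all(sum(a * v for a, v in zip(row, x)) <= bi for row, bi in zip(A, b))
assert all(sum(A[i][j] * y[i] for i in range(3)) >= c[j] for j in range(2))

cTx = sum(ci * xi for ci, xi in zip(c, x))   # primal value: 7
bTy = sum(bi * yi for bi, yi in zip(b, y))   # dual value: 15
assert cTx <= bTy                            # as Weak Duality guarantees
```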
Duality

Result 3 is usually called the Weak Duality Theorem. In words, it says that the value of the objective function of the minimum problem is always greater than or equal to that of the maximum problem. Some consequences, for a primal-dual pair:

Corollary 3.1 (direct from Weak Duality)
The value of the objective function of the maximum problem for any feasible solution is a lower bound to the minimum value of the minimum objective function.

Corollary 3.2 (direct from Weak Duality)
The value of the objective function of the minimum problem for any feasible solution is an upper bound to the maximum value of the maximum objective function.

Corollary 3.3 (direct from Weak Duality)
If the maximum problem is feasible but its objective function is unbounded, the minimum problem cannot have a feasible solution. [If the minimum problem has a feasible solution y*, then cTx ≤ bTy* for all solutions x of the maximum problem, which cannot occur when the maximum problem is unbounded.]

Corollary 3.4 (direct from Weak Duality)
If the minimum problem is feasible but its objective function is unbounded, the maximum problem cannot have a feasible solution. [If the maximum problem has a feasible solution x*, then cTx* ≤ bTy for all solutions y of the minimum problem, which cannot occur when the minimum problem is unbounded.]
Duality

Corollary 3.5
If both problems have feasible vectors, both have optimal vectors (contrapositive of 3.3 and 3.4).

Result 4
Both primal and dual may be infeasible.

Proof
A simple example will suffice, e.g.

L: maximize 2x1 - x2
subject to x1 - x2 ≤ 1
           -x1 + x2 ≤ -2
           x1, x2 ≥ 0

Easy to see that both L and L* are infeasible.

Result 5 (Strong Duality Theorem)
If one of the problems has an optimum, then so does the other and the optimal values are equal. (Proof omitted.)

This result lies at the heart of duality theory. Unfortunately, I am not aware of a satisfactory elementary proof; in fact I am almost convinced that no elementary proof exists. The proofs I have seen are either short and dubious, or long and hard to follow. If you find a good one, let me know. But see Bazaraa, Sherali and Shetty (not elementary). Hillier and Lieberman do not give a proof. Some other texts give a matrix based proof which does not fit with the approach adopted in this course. See, for example
www.math.ubc.ca/~anstee/math340/340strongduality.pdf
Duality

You should know this very important result but its proof will not be required.

Corollary 5.1 (logical consequence of earlier results)
The following are the only possible relationships between the primal and dual problems:
a) Both are feasible and bounded, so have (equal) optimal solutions;
b) One is feasible and unbounded, the other infeasible;
c) Both are infeasible.

Note that the finiteness of all solutions implies the existence of an optimum for an LP. This may not be true for an NLP. Consider, for example, the problem min e^x for x ≤ 0. This does not have an optimum.

Result 6 (Complementary Slackness)
Consider a primal-dual pair of feasible LPs, one in standard form. If a constraint of either LP is slack at optimum, then in the other problem, the corresponding dual variable is zero at optimum. If a variable of either problem is non-zero at optimum, then in the dual the corresponding constraint is binding at optimum.

Example
Suppose we are asked to solve the LP (call it P):

min 12w1 + 20w2 + w3
s.t. 0.5w1 + w2 + 1/16 w3 - 1200w4 ≥ 24
     w1 + w2 + 1/24 w3 + 800w4 ≥ 20
     w1, w2, w3, w4 ≥ 0.
Duality

This is not straightforward. But consider the dual LP (say D):

max 24x1 + 20x2
s.t. 0.5x1 + x2 ≤ 12
     x1 + x2 ≤ 20
     1/16 x1 + 1/24 x2 ≤ 1
     -1200x1 + 800x2 ≤ 0
     x1, x2 ≥ 0.

It is easy to check (for example, graphically) that x1 = 12, x2 = 6, z* = 408 is optimal for D. So we know (Result 5, strong duality) that P has value z = 408. But what are w1, w2, w3, w4 at optimum? We use complementary slackness.

x1 > 0 → first constraint of P is binding at optimum (holds with equality)
x2 > 0 → second constraint of P is binding at optimum
Constraint 2 of D is slack → w2 = 0
Constraint 4 of D is slack → w4 = 0

Putting the above four conditions together, we find
0.5w1 + 1/16 w3 = 24
w1 + 1/24 w3 = 20

Solving gives w1 = 6, w3 = 336, so that (w1, w2, w3, w4)T = (6, 0, 336, 0)T is optimal for P. (Check: we have z = 408.)
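The final 2x2 system can be solved in exact arithmetic; a quick check of the figures above:

```python
from fractions import Fraction as F

# Complementary slackness leaves: 0.5*w1 + w3/16 = 24 and w1 + w3/24 = 20
a11, a12, b1 = F(1, 2), F(1, 16), F(24)
a21, a22, b2 = F(1), F(1, 24), F(20)

det = a11 * a22 - a12 * a21            # Cramer's rule on the 2x2 system
w1 = (b1 * a22 - b2 * a12) / det
w3 = (a11 * b2 - a21 * b1) / det

assert (w1, w3) == (6, 336)
assert 12 * w1 + w3 == 408 == 24 * 12 + 20 * 6   # primal value = dual value
```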
Duality

Proof of Complementary Slackness
We present an algebraic (not matrix) proof. Write the primal P, with slacks, as:

P: max Σj cjxj = z
s.t. Σj aijxj + si = bi for all i
     xj, si ≥ 0 for all i, j

We can always write a primal in this way, including slack variables si.

Write the dual D as:

D: min Σi biyi = w
s.t. Σi aijyi - tj = cj for all j
     yi, tj ≥ 0 for all i, j

We can always write the dual in this way, including surplus variables tj.

Both P and D are assumed feasible, so have optimal solutions z*, w*. At optimum, we have
w* - z* = Σi biyi - Σj cjxj
        = Σi (Σj aijxj + si) yi - Σj (Σi aijyi - tj) xj
        = Σi siyi + Σj tjxj

By strong duality, w* = z*, so Σi siyi + Σj tjxj = 0. But all variables are non-negative. Therefore siyi = 0 for all i, and tjxj = 0 for all j.
Duality

If a constraint is slack at optimum, either si ≠ 0, when yi = 0, or tj ≠ 0, when xj = 0. If a variable is non-zero at optimum, either yi ≠ 0, when si = 0, or xj ≠ 0, when tj = 0. This is precisely the assertion of the Complementary Slackness Theorem.

Apart from its theoretical importance, duality can sometimes be put to computational advantage by drastically reducing the number of pivots needed to attain the optimum.

Consider the following LP, for an arbitrary positive integer M (2 variables, 2M constraints):

max x1 + x2
s.t. (i - M)x1 + x2 ≤ i(i - 1) for i = 1, …, M
     x1 + (i - M)x2 ≤ i(i - 1) for i = 1, …, M
     x1, x2 ≥ 0

We can show that, using ordinary simplex, we can find the optimum at (M(M-1), M(M-1)), but (M+1) pivots are needed to attain this point.
Duality

Now consider the dual LP. This is

min 0y1 + 2y2 + … + M(M-1)yM + 0yM+1 + 2yM+2 + … + M(M-1)y2M
s.t. -(M-1)y1 - … - 1yM-1 + 0yM + 1yM+1 + 1yM+2 + … + 1y2M ≥ 1
     1y1 + 1y2 + … + 1yM - (M-1)yM+1 - … - 1y2M-1 + 0y2M ≥ 1
     yi ≥ 0 (i = 1, 2, …, 2M)

Apart from non-negativity, there are only two constraints, and the optimum at (0, …, 0, 1, 0, …, 0, 1), i.e. yM = y2M = 1, is attained with two pivots. Of course, the value 2M(M-1) is the same for both primal and dual. But the work required to obtain optimal form is much less, for large M, through using the dual, since

pivots to solve dual / pivots to solve primal = 2/(M+1).

Summary of Duality Relations

Four Possible Primal-Dual Problems

                            Dual
                 Finite Optimum   Unbounded   Infeasible
Primal
Finite Optimum        ✓              ✗            ✗
Unbounded             ✗              ✗            ✓
Infeasible            ✗              ✓            ✓

(✓ = possible, ✗ = impossible; cf. Corollary 5.1.)
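The claimed primal optimum is easy to verify for a small instance; a sketch with M = 3, where the optimum should be (6, 6) with value 12 = 2M(M-1):

```python
M = 3
x1 = x2 = M * (M - 1)                  # claimed optimum: x1 = x2 = M(M-1)

for i in range(1, M + 1):
    assert (i - M) * x1 + x2 <= i * (i - 1)     # first family of constraints
    assert x1 + (i - M) * x2 <= i * (i - 1)     # second family
assert x1 + x2 == 2 * M * (M - 1)               # objective value, 12 here
```

Only the i = M constraints are binding at this point, which is consistent with the dual optimum putting all its weight on yM and y2M.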
In standard form, with slack variables added, this is:

-max -2x1 - 15x2 - 18x3
s.t. -x1 + 2x2 - 6x3 + x4 = -10
     x2 + 2x3 + x5 = 6
     2x1 + 11x3 + x6 = 19
     -x1 + x2 + x7 = -2
     x1, x2, x3, x4, x5, x6, x7 ≥ 0

This would be in optimal form but two RHS coefficients are negative. The tableau is primal infeasible. It would give a solution with x4, x7 < 0. Rather than apply a two-phase method, or the Big-M method, it is simpler to apply dual simplex to the dual feasible tableau. Proceed as follows.

x1   x2    x3   x4    x5   x6    x7   RHS
 1   3.5    0   2.5   -2  -0.5    0     7
 0   1.5    1   1.5   -1  -0.5    0     4
 0   1.5    0   1.5   -1  -0.5    1    -2
 0   0.5    0   0.5    2   0.5    0    -7

Solution x1 = 7, x2 = 0, x3 = 4, x4 = 0, z = -(-7) = 7.

A lot easier than phase 1 and phase 2!
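Pivoting this tableau on the x6 entry of its third row (a dual simplex step: the row with negative RHS leaves, and x6 wins the dual ratio test) produces an optimal tableau; a sketch in exact arithmetic, not from the notes:

```python
from fractions import Fraction as F

def pivot(T, r, c):
    """Make entry (r, c) equal to 1 and clear the rest of column c."""
    T = [row[:] for row in T]
    p = T[r][c]
    T[r] = [v / p for v in T[r]]
    for i in range(len(T)):
        if i != r and T[i][c] != 0:
            f = T[i][c]
            T[i] = [v - f * w for v, w in zip(T[i], T[r])]
    return T

# Columns x1..x7 then RHS; the last row is the objective row.
T = [[F(1), F(7, 2), F(0), F(5, 2), F(-2), F(-1, 2), F(0), F(7)],
     [F(0), F(3, 2), F(1), F(3, 2), F(-1), F(-1, 2), F(0), F(4)],
     [F(0), F(3, 2), F(0), F(3, 2), F(-1), F(-1, 2), F(1), F(-2)],
     [F(0), F(1, 2), F(0), F(1, 2), F(2), F(1, 2), F(0), F(-7)]]

T2 = pivot(T, 2, 5)    # row 3 (RHS -2) leaves, x6 enters

expect = [[1, 2, 0, 1, -1, 0, -1, 9],
          [0, 0, 1, 0, 0, 0, -1, 6],
          [0, -3, 0, -3, 2, 1, -2, 4],
          [0, 2, 0, 2, 1, 0, 1, -9]]
assert T2 == [[F(v) for v in row] for row in expect]
```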
Dual Simplex Method

A final pivot on a36 (the x6 entry of row 3) leads to the optimal (both primal and dual feasible) tableau

x1   x2   x3   x4   x5   x6   x7   RHS
 1    2    0    1   -1    0   -1     9
 0    0    1    0    0    0   -1     6
 0   -3    0   -3    2    1   -2     4
 0    2    0    2    1    0    1    -9

Read off the solution x1 = 9, x2 = 0, x3 = 6, x4 = 0.

A lot simpler than completely resolving the problem, as well as Phase 1/Phase 2.

2.11 Sensitivity Analysis

Also known as post-optimal analysis. After solving an LP, some specification may change. We can sometimes solve the modified problem without starting all over again. Additionally, we can sometimes find what changes can occur in a given value without altering the solution.

Recall the brewery problem - slightly modified:

max 6x1 + 5x2 + 3x3 + 7x4 (revenue)
s.t. x1 + x2 + 3x4 ≤ 50 (malt)
     2x1 + x2 + 2x3 + x4 ≤ 150 (hops)
     x1 + x2 + x3 + 4x4 ≤ 80 (yeast)
     x1, x2, x3, x4 ≥ 0
so the new optimum is (x1, x2, x3, x4)T = (44, 0, 28, 2)T and z = 362.
Sensitivity Analysis

Setting s3 = ½ gives (x1, x2, x3, x4)T = (41, 9, 29.5, 0)T and z = 379.5.

Similarly, suppose x2 = 8 instead of x2 = 10. Consider the constraint x2 + 7x4 - s2 + 2s3 = 10. The change implies x4 = 2/7 or s3 = 1. Since z = 380 - 7x4 - 3s1 - s2 - s3, z will reduce to 378 in the former case and 379 in the latter. Choose s3 = 1. Now, using the tableau on the previous slide, we have
(x1, x2, x3, x4)T = (40 + 2s3, 10 - 2s3, 30 - s3, 0)T = (42, 8, 29, 0)T,
and z = 379.

Reality check: all modified solutions are less than 380.

Changes in Resources

Suppose the available quantity of malt/hops/yeast increases or decreases? If, at optimum, all of a resource is not used up (slack variable positive), then increasing the resource, or decreasing it by less than the slack, will not affect the optimum. So only consider changes in resources whose slack variables are zero.

Let the amount of yeast available be 80 + a instead of 80 (a could be positive or negative). Modify the initial tableau, replacing 80 by 80 + a; call this M1. Perform the same operations used to derive M* from M. This gives tableau M1*:
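The 'reality check' above can be automated by substituting each modified plan into the modified brewery constraints; a quick sketch, not from the notes:

```python
use = {"malt":  [1, 1, 0, 3],
       "hops":  [2, 1, 2, 1],
       "yeast": [1, 1, 1, 4]}
supply = {"malt": 50, "hops": 150, "yeast": 80}
revenue = [6, 5, 3, 7]

def value(x):
    """Check feasibility of plan x, then return its revenue."""
    for r in use:
        assert sum(a * v for a, v in zip(use[r], x)) <= supply[r]
    return sum(c * v for c, v in zip(revenue, x))

assert value([40, 10, 30, 0]) == 380      # original optimum
assert value([41, 9, 29.5, 0]) == 379.5   # the s3 = 1/2 modification
assert value([42, 8, 29, 0]) == 379       # the s3 = 1 modification
```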
x1 x2 x3 x4 s1 s2 s3 RHS
1 1 0 3 1 0 0 50
2 1 2 1 0 1 0 150
1 1 1 4 0 0 1 80
-6-q -5 -3 -7 0 0 0 0
Now briefly consider the effect of changing prices for non-basic variables. Notice that if a variable is non-basic, it is not profitable at optimum. It will surely not become profitable (i.e. enter the basis) if its price is lowered, but if the price is raised sufficiently it may enter the basis.

Premium is not in the basis. Suppose 7+r is the new selling price. Again this can be written from M*. The optimum is unchanged provided 7-r ≥ 0, that is, the selling price of Premium does not exceed 14. If the selling price is raised beyond this level, the tableau will not be optimal and a further pivot is needed.
Sensitivity Analysis

Suppose r = 8. The tableau is

x1 x2 x3 x4 s1 s2 s3 RHS
0  1  0  7  0  -1  2  10
1  0  0 -4  1   1 -2  40
0  0  1  1 -1   0  1  30
0  0  0 -1  3   1  1  380

A standard simplex pivot (x4 enters; the ratio test picks the first row) leads to:

x1 x2   x3 x4 s1 s2   s3   RHS
0  1/7  0  1  0  -1/7 2/7  10/7
1  4/7  0  0  1  3/7  -6/7 320/7
0  -1/7 1  0  -1 1/7  5/7  200/7
0  1/7  0  0  3  6/7  9/7  2670/7

The new optimum is (x1, x2, x3, x4)T = (320/7, 0, 200/7, 10/7)T ≈ (45.71, 0, 28.57, 1.43)T and z = 2670/7 = 381 3/7.

Sensitivity Analysis

New Constraints

These are usually handled by dual simplex. Look at the original final solution again: (x1, x2, x3, x4)T = (40, 10, 30, 0)T.

Suppose we require that the total amounts of Light and Dark are at least the total amounts of Ale and Premium, i.e. x1 + x2 ≥ x3 + x4. Since 40+10 ≥ 30+0, the constraint is non-binding, the original solution still stands and no further steps are needed.

But if the total amounts of Light and Dark are to be at least twice the total amounts of Ale and Premium, the picture changes. Since 40+10 < 2(30+0), the constraint must be used.
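The pivot just described can be checked mechanically. Here is a small Python sketch (exact arithmetic via the standard fractions module; the tableau layout matches the one used in these notes, with the objective row last) performing the standard simplex pivot on the r = 8 tableau:

```python
from fractions import Fraction

def pivot(T, r, c):
    """One simplex pivot on tableau T at row r, column c."""
    T = [row[:] for row in T]
    p = T[r][c]
    T[r] = [v / p for v in T[r]]          # scale pivot row to make entry 1
    for i, row in enumerate(T):
        if i != r and row[c] != 0:        # clear column c in every other row
            f = row[c]
            T[i] = [v - f * w for v, w in zip(row, T[r])]
    return T

F = Fraction
# Tableau with Premium's price raised to 15 (r = 8); columns x1..x4, s1..s3, RHS.
T = [[F(0), F(1), F(0), F(7),  F(0), F(-1), F(2),  F(10)],
     [F(1), F(0), F(0), F(-4), F(1), F(1),  F(-2), F(40)],
     [F(0), F(0), F(0 + 1), F(1), F(-1), F(0), F(1), F(30)],
     [F(0), F(0), F(0), F(-1), F(3), F(1),  F(1),  F(380)]]

# x4 has a negative reduced cost; the ratio test picks row 0 (10/7 < 30/1).
T2 = pivot(T, 0, 3)
print(T2[-1][-1])  # 2670/7, i.e. z = 381 3/7
```

Exact fractions avoid the 0.142857-style rounding noise that decimal tableaux produce.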
Sensitivity Analysis

The following sequence of tableaux leads to the optimum.

Add the constraint x1 + x2 ≥ 2(x3 + x4), i.e. -x1 - x2 + 2x3 + 2x4 + s4 = 0:

x1 x2 x3 x4 s1 s2 s3 s4 RHS
0  1  0  7  0  -1  2  0  10
1  0  0 -4  1   1 -2  0  40
0  0  1  1 -1   0  1  0  30
-1 -1 2  2  0   0  0  1  0
0  0  0  7  3   1  1  0  380

Choose pivots to give a tableau suitable for dual simplex (eliminate the basic variables x1, x2, x3 from the new row):

x1 x2 x3 x4 s1 s2 s3 s4 RHS
0  1  0  7  0  -1  2  0  10
1  0  0 -4  1   1 -2  0  40
0  0  1  1 -1   0  1  0  30
0  0  0  3  3   0 -2  1  -10
0  0  0  7  3   1  1  0  380

Now apply dual simplex, pivoting on the -2 in the s3 column:

x1 x2 x3 x4   s1   s2 s3 s4   RHS
0  1  0  10   3   -1  0  1    0
1  0  0  -7  -2    1  0 -1    50
0  0  1  2.5  0.5  0  0  0.5  25
0  0  0 -1.5 -1.5  0  1 -0.5  5
0  0  0  8.5  4.5  1  0  0.5  375

Optimal form: x1 = 50, x2 = 0, x3 = 25, x4 = 0.
A useful idea is to exploit the 'duality gap'. We have seen that for a dual pair of LPs, cTx ≤ bTy with the usual notation. More precisely, cTx < bTy except at optimum, where cTx = bTy. Then proximity to optimum can be assessed by the size of the duality gap bTy - cTx.

See http://www.maths.ed.ac.uk/~gondzio/software/hopdm.html or http://www.sztaki.hu/~meszaros/bpmpd/ for two software packages that have been used.

This discussion is extended somewhat in the Moodle miscellaneous section (non-examinable).
3.1 Introduction

A linear integer program (IP) is an LP where some or all xj (j = 1, 2, …, n) are required to be integer. [We shall not consider non-linear integer programs.]

Sometimes all variables must be integer, other times just some. So we can have:
- A pure integer problem (PIP)
- A mixed integer problem (MIP)

Such problems arise naturally when xj represents an indivisible unit (car, aircraft, factory, human being etc.). Additionally, many problems occur when xj is binary. Usually we then assume xj ∈ {0, 1}. If all variables are binary, we have a binary integer problem (BIP). Various combinations can occur for the range of each xj.

Introduction

At first glance, it may seem that IPs are easier to solve than LPs, because there are fewer potential solutions. In fact, they are harder, because there is no universal algorithm such as simplex, although there are universal approaches such as branch and bound, which may still take a very long time to attain the optimum.

While LPs and IPs are superficially similar, there are major differences:
i) The IP solution may be a long way from the LP solution;
ii) Rounding the LP solution up or down may not work;
iii) Examining every possible IP solution may be impractical.
Introduction

The difficulties may be seen from a sketch (omitted); notice that IP solutions can only occur at lattice points.
To see i) and ii), consider the LP
max z=x+ y
s.t. -2x + 2y ≤ 1
16x - 14y ≤ 7
x ≥ 0, y ≥ 0
and the corresponding IP with x, y integer.
The LP solution is (x, y)T = (7, 7.5), z = 14.5.
The rounded LP solutions are (x, y)T = (7, 7), z = 14 and (x, y)T
= (7, 8), z = 15 but unfortunately both of these are infeasible.
The optimum IP solution is (x, y)T = (3, 3), z = 6.
In fact there are four feasible points for the IP: (0,0), (1,1), (2,2)
and (3,3).
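The claims about this tiny IP are easy to verify by exhaustive enumeration - harmless here precisely because the problem is so small (a throwaway Python check; the search box of 0..8 comfortably contains the LP optimum (7, 7.5)):

```python
# Exhaustive check of the small IP: max x + y subject to
# -2x + 2y <= 1, 16x - 14y <= 7, x, y >= 0 integer.
feasible = [(x, y) for x in range(9) for y in range(9)
            if -2*x + 2*y <= 1 and 16*x - 14*y <= 7]
best = max(feasible, key=lambda p: p[0] + p[1])
print(feasible)            # [(0, 0), (1, 1), (2, 2), (3, 3)]
print(best, best[0] + best[1])  # (3, 3) 6
```

This confirms both that the rounded LP solutions (7, 7) and (7, 8) are infeasible and that the IP optimum is far from the LP optimum.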
Introduction

To illustrate fact iii) - that exhaustive enumeration is usually impractical - consider a fairly modest IP where 20 integer variables can each take 20 possible values. There are 20^20 ≈ 10^26 solutions to verify. Even if this were still within range of some machines, a small increase in size would put the method out of reach. A BIP with 87 variables requires a similar number of enumerations (2^87 ≈ 10^26). In practice, these are quite small scale problems. Clearly, fresh methods are needed.

3.2 Examples

We have already seen some examples, e.g. the nurse rostering problem might be solved as a PIP, the knapsack problem as a BIP. Three further examples:

Facility Location (MIP)

Suppose that facilities for distributing a product to n customers can be placed at m possible locations. If location i is used as a distribution point, there is a fixed set-up cost Fi, and cij is the cost of transporting one unit of production from location i to customer j, including loading/unloading costs at i and j (i = 1, …, m, j = 1, …, n). Customer j demands dj units of the product.

At which of the m possible locations should facilities be placed, and how much should be shipped to each customer to meet demand and minimise total costs?
Examples

Let yi = 1 if a facility is established at location i, and 0 otherwise. Let xij be the amount shipped from location i to customer j.

Notice that if yi = 0, nothing can be shipped from i. If yi = 1, the total demand from all customers should surely not be exceeded by the amount shipped from i, so Σ_{j=1}^{n} xij ≤ Σ_{j=1}^{n} dj for those i with yi = 1. Combining the two cases,

Σ_{j=1}^{n} xij ≤ yi Σ_{j=1}^{n} dj   (i = 1, …, m)

where xij ≥ 0 for all i, j, and yi = 0 or 1.

We also need to ensure that customer demand is met, so Σ_{i=1}^{m} xij ≥ dj for each j. This gives the MIP:

min Σ_{i=1}^{m} Σ_{j=1}^{n} cij xij + Σ_{i=1}^{m} Fi yi
s.t. Σ_{j=1}^{n} xij ≤ yi Σ_{j=1}^{n} dj   (i = 1, …, m)
     Σ_{i=1}^{m} xij ≥ dj   (j = 1, …, n)
     xij ≥ 0   (i = 1, …, m, j = 1, …, n)
     yi ∈ {0, 1}   (i = 1, …, m)

(Various modifications possible.)
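For very small instances, this MIP can be solved by brute force over the binary vector y: because the capacity constraint only forbids shipping from closed locations, once y is fixed each customer buys entirely from the cheapest open location. The data below are invented purely for illustration:

```python
from itertools import product

# Hypothetical data: 3 candidate locations, 4 customers.
F = [50, 60, 45]            # fixed set-up costs F_i
c = [[2, 3, 4, 5],          # unit transport costs c_ij
     [4, 2, 1, 3],
     [3, 3, 2, 2]]
d = [10, 5, 8, 7]           # customer demands d_j

best = None
for y in product([0, 1], repeat=len(F)):      # enumerate all 2^m openings
    open_locs = [i for i, yi in enumerate(y) if yi]
    if not open_locs:
        continue                              # demand cannot be met
    cost = sum(F[i] for i in open_locs)       # set-up costs
    cost += sum(d[j] * min(c[i][j] for i in open_locs)   # cheapest supplier
                for j in range(len(d)))
    if best is None or cost < best[0]:
        best = (cost, y)
print(best)  # (120, (0, 0, 1)): open only location 3
```

Real instances need integer-programming solvers, since the 2^m enumeration explodes exactly as described in 3.1.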
Examples

Assignment Problem (BIP)

max z = Σ_{i=1}^{n} Σ_{j=1}^{n} cij xij

Project Planning (BIP)

s.t. Σ_{i=1}^{n} aij xi ≤ Ej   (j = 1, …, m)

viz. aij is the projected amount for each project over each year.
We can often incorporate logical constraints in a BIP. Thus:
• If both projects A and B should not be included together, then xA + xB ≤ 1.
• If at least one of A and B should be included, then xA + xB ≥ 1.
• If exactly one of A and B should be included, then xA + xB = 1.
• If exactly two of A, B, C, D should be included, then xA + xB + xC + xD = 2.
• If C must be included if either A or B is included, then xC ≥ xA and xC ≥ xB.
• If C must be included if both A and B are included, then xC ≥ xA + xB - 1.

3.3 Branch and Bound

To solve an IP, two naïve approaches can be tried:
a) Solve the associated LP (the 'LP relaxation') and round off. We have seen this may not work.
b) Examine every integer lattice point. We have seen this may be impractical.

The branch and bound method incorporates useful features of both approaches. We successively eliminate those parts of the feasible region that cannot contain the optimum, using a tree structure.
Branch and Bound

Consider the PIP:

max z = -4x1 + 6x2
s.t. -x1 + x2 ≤ 1
     x1 + 3x2 ≤ 9
     3x1 + x2 ≤ 15
     x1, x2 ≥ 0 and integer.

Let F denote the feasible region for the LP. The LP relaxation can be written max z s.t. x ∈ F. Solution (simplex or graph): x1 = 1.5, x2 = 2.5, z = 9. This is useless for the IP.

But a solution for the IP must satisfy either x ∈ F, x1 ≤ 1 or x ∈ F, x1 ≥ 2. Construct two new LPs with these constraints.
1. max z s.t. x ∈ F, x1 ≤ 1 → x1 = 1, x2 = 2, z = 8.
2. max z s.t. x ∈ F, x1 ≥ 2 → x1 = 2, x2 = 7/3, z = 6.

The process of constructing new problems is called branching. Here, we branched on x1 to produce two subproblems from the master problem.

The solution to 1 is integer feasible, and 2 cannot lead to a higher objective value, so the solution to 1 is optimal.

[Note: adding further constraints cannot increase the optimal value in a maximisation problem.]
Branch and Bound

STEP 0 - Initialisation:
Solve the LP relaxation of the IP. If the solution has integer values, stop. If not, let L be the objective value at any feasible integer solution. If several such are known, let L be the greatest of these. If none are known, let L be a large negative number.

STEP 1 - Branching:
Select a remaining subset of feasible solutions (on the first run, select F, the set of all feasible solutions). Choose a non-integer component of the solution to the subproblem, and divide the subset into two by adding constraints to exclude the non-integer value.

STEP 2 - Bounding:
For each subset formed, obtain an upper bound on the objective value, say z.

STEP 3 - Fathoming:
For each subset that may contain the optimum, exclude it from consideration if
a) z < L;
b) The subset has no feasible points;
c) z is attained at an integer feasible point and z > L.
In case c), call the integer feasible point the incumbent solution, let L = z, and repeat step 3 for further unfathomed subsets for which z is known.

STEP 4 - Testing:
If no subsets are unfathomed, stop; the incumbent solution is optimal. If not, return to step 1.
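Steps 0-4 can be sketched in code. The Python below is a minimal illustration, not production code: it solves each two-variable LP relaxation by enumerating vertices of the constraint polygon (adequate only for tiny, bounded problems) and applies the branching and fathoming rules above to the PIP from the worked example:

```python
from itertools import combinations

def lp_solve(cons, obj):
    """Maximise obj over {a*x + b*y <= c} by vertex enumeration.
    Returns (value, (x, y)), or None if infeasible."""
    best = None
    for (a1, b1, c1), (a2, b2, c2) in combinations(cons, 2):
        det = a1 * b2 - a2 * b1
        if abs(det) < 1e-9:
            continue
        x = (c1 * b2 - c2 * b1) / det          # Cramer's rule
        y = (a1 * c2 - a2 * c1) / det
        if all(a * x + b * y <= c + 1e-7 for a, b, c in cons):
            val = obj[0] * x + obj[1] * y
            if best is None or val > best[0]:
                best = (val, (x, y))
    return best

def branch_and_bound(cons, obj):
    """Maximise obj with x, y integer, following steps 0-4."""
    incumbent = None
    stack = [cons]
    while stack:                                # step 4: unfathomed subsets left?
        sub = stack.pop()
        relax = lp_solve(sub, obj)              # step 2: bound from LP relaxation
        if relax is None:
            continue                            # step 3b: infeasible subset
        val, (x, y) = relax
        if incumbent and val <= incumbent[0] + 1e-9:
            continue                            # step 3a: bound no better than L
        for i, v in enumerate((x, y)):
            if abs(v - round(v)) > 1e-6:        # step 1: branch on a fractional x_i
                lo, hi = int(v // 1), int(v // 1) + 1
                coef = [0, 0]
                coef[i] = 1
                stack.append(sub + [(coef[0], coef[1], lo)])     # x_i <= floor(v)
                stack.append(sub + [(-coef[0], -coef[1], -hi)])  # x_i >= ceil(v)
                break
        else:
            incumbent = (val, (round(x), round(y)))  # step 3c: new incumbent
    return incumbent

# -x1+x2<=1, x1+3x2<=9, 3x1+x2<=15, x1>=0, x2>=0
cons = [(-1, 1, 1), (1, 3, 9), (3, 1, 15), (-1, 0, 0), (0, -1, 0)]
print(branch_and_bound(cons, (-4, 6)))  # z = 8 at (1, 2), as found by hand
```

The run reproduces the hand computation: the relaxation gives (1.5, 2.5), branching on x1 yields the subproblems with values 8 and 6, and the integer point (1, 2) becomes the incumbent.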
Introduction and Definitions

A graph can be represented pictorially.

A path in G is a set of nodes {i1, i2, …, in} and a set of distinct arcs {a1, a2, …, an-1} such that ak = (ik, ik+1) for k = 1, 2, …, n-1. We refer to a path from i1 to in; e.g. a path from node 1 to node 4 consists of nodes {1,2,3,4} and arcs {a1, a2, a5} in the above graph. We could write this as 1,a1,2,a2,3,a5,4.

A graph is connected if there exists a path between any two vertices. Otherwise, it is disconnected.
Introduction and Definitions

We define a network as a digraph containing no loops or multiple arcs, where each arc has a capacity, one or more identified nodes are sources, and one or more identified nodes are sinks. Assume all networks are weakly connected. (There is not always complete consistency about definitions between textbooks.)
Example 2 - The Transportation Problem

There are m producers and n consumers; ai (i = 1, 2, …, m) are the available supplies from the producers, and bj (j = 1, 2, …, n) are the consumers' demands. Notice that the Transportation Problem can also be written as an LP:

min Σ_{i=1}^{m} Σ_{j=1}^{n} cij xij
If the sink cannot be labelled, the current flow is maximum; stop. If the sink is labelled, proceed to step 3.

STEP 4 (optional): Verify the current flow is maximal by finding the capacity of a suitable cut.

[Worked example; figures omitted.] This flow is optimal, of value 5, since the sink cannot be labelled. Or apply step 4: {(v,s),(v,x),(u,z)} is a cut of capacity 5.
Extensions

2. With flows permitted both ways along an arc, it is simplest to add an arc.

3. With node capacities, add an extra arc.

A widely used extension arises when there are costs on arcs. We require the minimum cost flow of given value from source to sink. It is a hybrid of a transportation problem and a network flow problem. Solution methods are known.

Further extensions exist for networks with gains or losses (for example, due to heating in electrical circuits, or taxation for money flows) and where there are lower capacities on arcs (e.g. where there are contractual obligations to use certain routes).
4.5 The Minimal Connector Problem

Example 1: A new underground system is to be built in a city. A number of stations are proposed servicing the city centre and suburban locations. What is the shortest length of track that is needed so that one can travel from any station to any other, not necessarily directly?

Example 2: Design a central heating system in a large house so that every room is in the system, yet the total length of piping is minimal.

Example 3: Design a telecommunications system (e.g. fibre optic network) so that an efficient path exists between every pair of vertices. Choosing the connections optimally could lead to significant cost savings.

These are all problems on graphs (not digraphs or networks) which share the same essential features. First, a few more definitions.

A graph is simple if it has no loops or multiple edges. A cycle in a simple graph is a path {i1, i2, …, in} with i1 = in and n > 1. A connected graph with no cycles is a tree.

The following equivalent properties hold for a tree on n vertices, and may be used as defining properties:
• There is exactly one path between any two vertices;
• The graph is connected with n-1 edges;
• The graph contains no cycles, but the addition of any new edge creates exactly one cycle.
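The tree characterisations above lend themselves to a mechanical check. The union-find sketch below (an illustration only; vertices assumed numbered 0..n-1, edges given as pairs) tests whether a graph is a tree by detecting cycles while counting edges:

```python
def is_tree(n, edges):
    """A graph on n vertices is a tree iff it is acyclic with n-1 edges
    (one of the equivalent defining properties). Union-find detects cycles."""
    parent = list(range(n))
    def find(v):
        while parent[v] != v:
            parent[v] = parent[parent[v]]   # path halving
            v = parent[v]
        return v
    for u, v in edges:
        ru, rv = find(u), find(v)
        if ru == rv:
            return False                    # this edge would close a cycle
        parent[ru] = rv                     # merge the two components
    return len(edges) == n - 1              # acyclic + n-1 edges => connected

print(is_tree(4, [(0, 1), (1, 2), (1, 3)]))   # True: a star on 4 vertices
print(is_tree(4, [(0, 1), (1, 2), (2, 0)]))   # False: contains a cycle
```

The same union-find routine is the core of Kruskal-style algorithms for the minimal connector problem itself.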
Nicholas Cron
The vertices give rise to the 7 points in the projective
plane, the edges to the 7 lines in the plane, and the
notion of incidence carries across from graph to finite
geometry.
Computational Complexity

The data are encrypted by B with some mathematical method, often involving modular arithmetic. (The technical details can get a bit involved - google, for example, RSA encryption.) The method will depend on p and q. Only A can decrypt the message as she is the only one with the private key. And when she wants to reply, she simply repeats the process, encrypting her message to B using p and q.

Many might like to crack the code by finding p and q given n. In principle there is an algorithm that from given n will find p and q: try all 2^512 possible p's - but that is an astronomical number. In practice no fast algorithm is known for this problem, and the security of many codes depends on this fact. There is no known polynomial time algorithm to factorise large integers, so the larger n is, the more secure the coding.

Computational Complexity

Are computers not getting faster? Yes, but not fast enough!

Moore's Law (an empirical rule) states that the number of transistors in an integrated circuit doubles about every two years.

If, by some technological miracle, we were able to find the prime factors of a huge number with 1024 bits, we could simply increase the magnitude to 2048 or 4096 bits to put the solution beyond reach, at least until the speed catches up.
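To see why brute-force factoring is hopeless at cryptographic sizes, here is naive trial division (a sketch that checks odd divisors only, so it assumes n is a product of odd primes); its loop runs on the order of √n times, which is astronomical when n has hundreds of digits:

```python
def factor(n):
    """Return (p, q) with p*q == n by trial division, or None if n is prime.
    Assumes n is odd; the loop does roughly sqrt(n)/2 divisions."""
    d = 3
    while d * d <= n:
        if n % d == 0:
            return d, n // d
        d += 2
    return None

print(factor(101 * 103))  # (101, 103)
```

For a 1024-bit n, √n has about 154 digits, so even at 10^6 (or 10^12) operations per second the loop could never finish.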
Computational Complexity

As mentioned earlier, there are special situations where algorithms that are sometimes wrong can still be useful.

Comparing Algorithms

We consider such functions as
O(1) - Constant
O(log n) - Logarithmic
O(n) - Linear
O(n^2) - Quadratic
O(n^k) - Polynomial
O(k^n) - Exponential

These are given in increasing order of time required. In general we would expect an O(n) algorithm to take longer than an O(log n) algorithm (at least for large n). Now O(n^k) and O(k^n) are very different. The latter grows very much faster with n, no matter what the value of k > 1 (since k^n > n^k for all k > 1 for large enough n).

The following fact is useful: if a function f(n) is a sum of functions, one of which grows faster than the others, then the fastest growing one determines the order of f(n). For example: if f(n) = 100 log(n) + 50n + 7n log n + 3n^2 + 0.01n^3, then f(n) is O(n^3). [We write f(n) is O(n^3), not f(n) = O(n^3).]

Some authors use mathematical/logical notation:
f(n) is O(n log n) ↔ ∃C, N s.t. ∀n > N, f(n) ≤ C n log n
although saying f has order n log n seems more transparent.
Comparing Algorithms Comparing Algorithms
The efficiency of an algorithm depends on various factors, The constant C accounts for extraneous factors mainly related
including: to hardware over which we may have little control. The function
• CPU (time) usage of n that gives the order is inherently related to the nature of the
• memory usage algorithm used and is our primary concern. The constant N
accounts for the fact that for small problem sizes, the algorithm
• network usage may not show its worst case performance, and indeed may
All are important but we mostly talk about time complexity behave quite anomalously.
(CPU usage).
Distinguish between: Complexity affects performance but not the other way around.
• Performance: how much memory/disk/... is actually used When we are trying to find the complexity of an algorithm, we
when a program is run. This depends largely on the are not mainly interested in the exact number of operations that
machine, compiler, etc. as well as the code. are being performed. Rather, we are interested in the relation of
• Complexity: how do the resource requirements of a program or algorithm scale, i.e., what happens as the size of the problem being solved gets larger?
Comparing Algorithms

For an O(n) algorithm, doubling the problem size doubles the time taken. For an O(n^3) algorithm, doubling the problem size increases the time taken by a factor of 8. For an O(2^n) algorithm, doubling the problem size increases the time taken by a factor of 2^n.

The following table gives the running times of different algorithms on inputs of increasing size, for a processor performing 10^6 instructions per second.

        n        n log2 n  n^2      n^3          1.5^n          2^n            n!
n=10    < 1 sec  < 1 sec   < 1 sec  < 1 sec      < 1 sec        < 1 sec        4 sec
n=30    < 1 sec  < 1 sec   < 1 sec  < 1 sec      < 1 sec        18 min         10^25 years
n=50    < 1 sec  < 1 sec   < 1 sec  < 1 sec      11 min         36 years       > 10^25 years
n=100   < 1 sec  < 1 sec   < 1 sec  1 sec        12892 years    10^17 years    > 10^25 years
n=1000  < 1 sec  < 1 sec   1 sec    18 min       > 10^25 years  > 10^25 years  > 10^25 years
n=10^4  < 1 sec  < 1 sec   2 min    12 days      > 10^25 years  > 10^25 years  > 10^25 years
n=10^5  < 1 sec  2 sec     3 hours  32 years     > 10^25 years  > 10^25 years  > 10^25 years
n=10^6  1 sec    20 sec    12 days  31710 years  > 10^25 years  > 10^25 years  > 10^25 years

We see the deleterious effect of using a high order function. We can regard polynomial problems of order at most n^k for some k as reasonably tractable, and exponential problems as intractable.
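The entries in the table are easy to reproduce. For instance, at 10^6 instructions per second, an O(2^n) algorithm at n = 50 needs about 36 years:

```python
# Running-time estimates behind the table: a processor doing 10**6
# instructions per second, with the cost taken as f(n) instructions.
SECONDS_PER_YEAR = 60 * 60 * 24 * 365

def years(ops):
    return ops / 10**6 / SECONDS_PER_YEAR

print(round(years(2**50)))   # 36 (years) - the n = 50 entry in the 2^n column
print(2**30 / 10**6 / 60)    # about 18 (minutes) - the n = 30 entry
```

Doubling n from 30 to 60 turns minutes into millions of years, which is the whole point of the table.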
Comparing Algorithms

It is very important to know if an algorithm runs in polynomial time or not, and whether it is O(n^2), O(n^3), etc. Stress that we seek approximate results for large n, based on worst case performance, simply as a function of n.

Recall that, since time is roughly proportional to the number of operations performed, an algorithm is O(f(n)) if the execution time, or the operation count, is at most Cf(n) for large n and suitable C. Notice also that the value of the constant C may affect time in a major way (so there are still practical reasons to use a fast and efficient computer).

Theoretical computer scientists classify algorithms. More can be found in a textbook on algorithms or the theory of computation. For a good elementary account, see
http://cs.stackexchange.com/questions/9556/in-basic-terms-what-is-the-definition-of-p-np-np-complete-and-np-hard/9566#9566
We provide a very simplified account that is adequate for the purposes of the course.

One class of problems (class P) consists of those problems for which an algorithm to solve the problem exists and runs in polynomial time. Another class (class NP) consists of those problems for which a proposed solution to any instance can be verified in polynomial time. NP stands for 'non-deterministic polynomial', not 'not polynomial'.

For example, any solution to sudoku can be checked quickly. But there is no known universal polynomial time algorithm, only strategies which may or may not work in individual cases.
Comparing Algorithms

A lot of work has been done to extend this discussion:
• Is there a 'gap' between class NP and class U (decidable problems not verifiable in polynomial time)? Yes, but there are no simply explained examples I know of.
• Even for class P (polynomial time) algorithms, it is important to find an efficient algorithm, in particular to ensure the polynomial is of smallest possible order.
• Class NP problems, not known to be class P, can often be addressed by heuristic methods that are tractable and will frequently yield a solution close to the true solution.
• The existence of undecidable problems was proved in the 1930s by Alan Turing. His arguments were theoretical. More recently, actual examples of such problems have been found. So U ≠ ∅.
• This discussion can be extended to include mathematical logic and Turing machines.

Summary

• Algorithms are important. They are very widely used in computing and in our general lives.
• Simply stated problems can be hard to solve. For example, TSP or the Subset Sum Problem.
• Simple ideas don't always work. Choosing the nearest city at each stage will not in general solve TSP.
• Simple algorithms can be very slow. Brute-force factoring, TSP.
• For some problems, even the best known algorithms are slow. TSP again.
This was for long an open question, but we now know the answer is no. See Y.V. Matiyasevich, Hilbert's Tenth Problem, MIT Press, Cambridge, Massachusetts, 1993. The problem is undecidable (Class U).

The left hand inequality becomes an equality iff G is a tree, when removing any arc disconnects the graph. The right hand inequality becomes an equality for 'complete graphs' (every pair of distinct nodes joined by an arc), when adding a further arc leaves a graph that is no longer simple.
Some consequences:
• Problems in NLP are as much about solution as formulation; LP solutions via simplex or interior point methods are more standard.
• Different starting points may lead to different final solutions. We may end up at different local optima; worse, the solution may vary with the solution method used.
• We frequently need numerical methods rather than precise algorithms.
• Fresh approaches are needed. NLP solutions are generally based on calculus (whereas LP is based on linear algebra).
• There is seldom a clear determination of the optimum. We may believe we have an optimum - at best we can often only check conditions that ensure a local optimum. (This is because we lack sufficient conditions for optimality.)
• There is no quick fix to the necessary/sufficient issue in general.
Introduction and Examples

Terminology: the notions 'objective function', 'constraint', 'feasible region', etc. carry over from work on LPs.

For a maximisation, a point x* in the feasible region is a global optimal solution to the NLP if f(x*) ≥ f(x) for all points x in the feasible region. (Similarly for a minimisation.)

A feasible point x' = (x'1, x'2, …, x'n) is a local optimum if, for some sufficiently small ε > 0, any feasible point x = (x1, x2, …, xn) with |xi - x'i| < ε (i = 1, 2, …, n) satisfies f(x') ≥ f(x). (Similarly for a minimisation.)

So a global optimum is a local optimum, but not necessarily vice versa.

Classification: it is convenient to classify NLPs.
• If there is only one variable, the problem is univariate, otherwise it is multivariate. Univariate problems are important and non-trivial.
• If there are no constraints, the problem is unconstrained, otherwise it is constrained. Unconstrained problems are important and non-trivial.
• If there is an analytic solution, the method is exact; if a numerical procedure is used, the method is approximate.

So conceptually there are 8 types of NLP. Whilst a couple are straightforward, most pose serious difficulties.
Let's see how this can be achieved by computer.

In Excel, we can use Solver again. Values for x and y are in cells A1 and A2.
B1 contains =200*SQRT(((A1-5)^2)+((A2-10)^2))
Similarly for C1, D1, E1 to give the remaining terms.
F1 contains =SUM(B1:E1)
In the Solver dialogue window: Set Target Cell $F$1 Equal to Min, By Changing Cells $A$1:$A$2. Constraints can easily be accommodated.

In R, use the code (minimisation is the default):

f=function(x)
200*sqrt((x[1]-5)^2+(x[2]-10)^2) + 150*sqrt((x[1]-10)^2+(x[2]-5)^2) +
200*sqrt(x[1]^2+(x[2]-12)^2) + 300*sqrt((x[1]-12)^2+x[2]^2)
optim(c(8,8),f)$par

This uses starting point (8,8); $par suppresses all output except the solution values. A range of options is available, such as changing the search method. R gives the solution (9.313, 5.029), almost the same as Excel. Do not expect complete agreement for these approximate methods. We consider some approaches later in the chapter.
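For comparison, the same minimisation can be done without a general-purpose optimiser. Weiszfeld's algorithm, the classical fixed-point iteration for a weighted sum of distances, takes a few lines of Python (point coordinates and weights copied from the objective function above):

```python
from math import dist

# Weighted Weber problem from the notes: minimise
# 200*d((5,10)) + 150*d((10,5)) + 200*d((0,12)) + 300*d((12,0)).
pts = [(5, 10), (10, 5), (0, 12), (12, 0)]
w = [200, 150, 200, 300]

p = (8.0, 8.0)                              # same starting point as optim()
for _ in range(200):
    ds = [max(dist(p, a), 1e-12) for a in pts]   # guard against division by 0
    num_x = sum(wi * a[0] / di for wi, a, di in zip(w, pts, ds))
    num_y = sum(wi * a[1] / di for wi, a, di in zip(w, pts, ds))
    den = sum(wi / di for wi, di in zip(w, ds))
    p = (num_x / den, num_y / den)          # weighted-average update

print(p)  # close to the optim() answer (9.313, 5.029)
```

Each step replaces the current point by a distance-weighted average of the towns; for this convex objective the iterates converge to the global minimum.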
…as many chairs as tables. How many of each to produce? Suppose there are x1 tables, x2 chairs. We have:

max 20x1 + 15x2
s.t. 2x1 + x2 ≤ 700
     x2 ≥ 2x1

…are produced. [So x1 = 1 → 2 units, x1 = 100 → 1.92 units etc.] There may be similar economies of scale for the chairs, so use a function 0.5 + 0.5/(0.994 + 0.004x1 + 0.002x2).
Result 5
A function f is convex iff -f is concave.
Proof: Direct from the definitions of convexity and concavity.

Result 7
If f and g are convex functions, then so is f+g.
Proof: Apply the definition of convexity to both f and g, and add the resulting inequalities.
Convexity and Concavity

In practice, we often need to determine if a given function is convex or concave. How can this be done?

It is simple in the single variable case. If f(x) is convex, the line joining any two points is never below the curve, so the slope of f(x) must be non-decreasing for all x. Hence we have:

Proposition
If f''(x) exists for all x in a convex set S, then f(x) is a convex function iff f''(x) ≥ 0 for all x ∈ S.

And similarly:

Proposition
If f''(x) exists for all x in a convex set S, then f(x) is a concave function iff f''(x) ≤ 0 for all x ∈ S.

Now consider n > 1 variables. How can we tell whether f(x1, x2, …, xn) is convex or concave on a convex set S in Rn? Assume throughout that f has continuous second order derivatives. Recall that the Hessian of f(x1, x2, …, xn) is the n×n matrix whose (i,j) entry is ∂²f/∂xi∂xj. [Since we know that ∂²f/∂xi∂xj = ∂²f/∂xj∂xi, the Hessian is a symmetric matrix.]

We let H(x1, x2, …, xn) (or H(x), or simply H where the context is clear) denote the Hessian matrix at (x1, x2, …, xn). If f(x1, x2) = x1³ + 2x1x2 + x2², then H(x1, x2) = [6x1 2; 2 2].
An ith principal minor of an n×n matrix is the determinant of an i×i matrix obtained by deleting (n-i) rows and the corresponding (n-i) columns from the matrix. For example, for the matrix [-2 -1; -1 -4], the first principal minors are -2 and -4, and the second principal minor is 8 - 1 = 7.

The kth leading principal minor of an n×n matrix is the determinant of the k×k matrix obtained by deleting the last (n-k) columns and the last (n-k) rows.

The important test is now given. Proofs, while not very difficult, involve some linear algebra and are omitted.

Main Theorem 1 (Convexity/Concavity Testing)
Suppose f(x1, x2, …, xn) has continuous second order derivatives for each point (x1, x2, …, xn) ∈ S. Then
i) f is a convex function on S iff, for all x ∈ S, all principal minors of H are non-negative;
ii) f is a concave function on S iff, for all x ∈ S, all kth non-zero principal minors of H have the same sign as (-1)^k for k = 1, 2, …, n.
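Main Theorem 1 is easy to apply by machine for small Hessians. The Python sketch below computes every principal minor by brute force (cofactor-expansion determinants, fine for tiny matrices) and is applied to the constant Hessian H = [2 1; 1 2] that appears in a later example:

```python
from itertools import combinations

def det(M):
    """Determinant by cofactor expansion along the first row."""
    n = len(M)
    if n == 1:
        return M[0][0]
    return sum((-1) ** j * M[0][j]
               * det([row[:j] + row[j+1:] for row in M[1:]])
               for j in range(n))

def principal_minors(H):
    """All ith principal minors: delete matching rows/columns in every way."""
    n = len(H)
    return [det([[H[i][j] for j in S] for i in S])
            for k in range(1, n + 1)
            for S in combinations(range(n), k)]

# Hessian of f(x1,x2) = x1^2 + x1x2 + x2^2 - 3x1 - 3x2 + 3 (constant in x):
H = [[2, 1], [1, 2]]
print(principal_minors(H))  # [2, 2, 3] - all non-negative, so f is convex
```

Since every principal minor is non-negative at every point (H is constant here), part i) of the theorem certifies convexity on all of R2.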
Newton's iteration x_{n+1} = x_n - f'(x_n)/f''(x_n), applied to f(x) = exp(-x) + x². In R:

f=function(x) {exp(-x)+x^2}
curve(f, from = 0, to = 1)
fprime=function(x) {2*x-exp(-x)}
fprimeprime=function(x) {exp(-x)+2}
x=c(0.5,rep(NA,6))
fval=rep(NA,7)

[Plot of f(x) = exp(-x) + x² on [0, 1] omitted.]
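The same univariate Newton iteration in Python, for comparison (starting point 0.5 as in the R snippet):

```python
from math import exp

# Newton's method for min f(x) = exp(-x) + x^2:
# iterate x <- x - f'(x)/f''(x) from x = 0.5.
fprime = lambda x: 2*x - exp(-x)
fprimeprime = lambda x: exp(-x) + 2

x = 0.5
for _ in range(6):
    x -= fprime(x) / fprimeprime(x)
print(round(x, 4))  # 0.3517, the point where f'(x) = 0, i.e. 2x = exp(-x)
```

Since f''(x) = exp(-x) + 2 > 0 everywhere, f is convex and this stationary point is the global minimum.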
At stationary points, all partial derivatives are zero, so x̄ is a stationary point of f iff ∇f(x̄) = 0. This defines a system of n equations which may have no solution, one solution or multiple solutions. As in the univariate case, numerical methods of solution are often needed.

The nature of stationary points can be tested in various ways (eigenvalues, definiteness of a matrix, quadratic forms etc.). The following result is important. It uses Hk, the kth leading principal minor of the Hessian matrix.

Just to check on leading principal minors: if

H = [8 7 6; 5 4 3; 2 1 0]

then H1 = 8, H2 = det[8 7; 5 4] = -3, and H3 = |H| = 0.
Multivariate Unconstrained Problems: Exact Methods

Main Theorem 2 (Examination of Stationary Points)
1. If Hk(x̄) > 0 for all k = 1, 2, …, n, then a stationary point x̄ is a local minimum for the NLP.
2. If Hk(x̄) ≠ 0 and has the same sign as (-1)^k for all k = 1, 2, …, n, then a stationary point x̄ is a local maximum for the NLP.
3. If Hn(x̄) ≠ 0 and the conditions of 1 and 2 fail to hold, a stationary point x̄ is not a local extremum.
4. If Hn(x̄) = 0, no conclusions can be drawn; the tests are inconclusive.
(Some extensions to part 4 can be given.)

These results are important but, as mentioned, what often concerns us is to test for global optima of an NLP.

Main Theorem 3 (Sufficient Condition for Global Optimum)
Consider an NLP in the form of a maximisation. Suppose the feasible region S is a convex set. If the objective function f0 is concave on S, then any local maximum is an optimal solution, so solves the NLP.

Proof
If the result is false, there is a local maximum x' that is not a global maximum. Then for some x ∈ S, f0(x) > f0(x'). Since f0 is concave, for 0 ≤ c < 1,
f0(cx' + (1-c)x) ≥ c f0(x') + (1-c) f0(x)
               > c f0(x') + (1-c) f0(x')   because f0(x) > f0(x')
               = f0(x')
Multivariate Unconstrained Problems: Exact Methods

Now, x' is a local maximum, so there is a neighbourhood N of x' such that f0(x') ≥ f0(x) for all x ∈ N. But we have seen f0(cx' + (1-c)x) > f0(x'), and cx' + (1-c)x ∈ S (0 ≤ c ≤ 1) since x ∈ S, x' ∈ S and S is convex. Choose c close enough to 1 so that cx' + (1-c)x ∈ N. Then f0(cx' + (1-c)x) ≤ f0(x'), an evident contradiction. So every local maximum must be a global maximum. (We are implicitly using the fact that a convex function on Rd is continuous.)

We can similarly prove the corresponding
Corollary
For a minimisation over a convex feasible region with convex objective function, any local minimum is a global minimum.

A stationary point that is not a local extremum is sometimes called a saddle point.

Example
min f(x1, x2) = x1² + x1x2 + x2² - 3x1 - 3x2 + 3

∇f(x1, x2) = (2x1 + x2 - 3, x1 + 2x2 - 3)ᵀ = (0, 0)ᵀ

Solving, the only extreme point can be (x1, x2) = (1, 1).

H(x1, x2) = [2 1; 1 2]
Multivariate Unconstrained Problems: Exact Methods

For a concave function, a single stationary point must be a global maximum, and for a convex function a single stationary point must be a global minimum. These results are not true for an arbitrary function on Rn. Note again that ∇f = 0 is necessary but not sufficient for an optimum.

Pros and cons of exact methods:
1. Relatively simple, when applicable.
2. Can often locate global optima. But…
3. Usually not applicable.
4. Assumes differentiability.
5. Solution of ∇f = 0 may be impractical or impossible.
Chapter 6. Non-Linear Programming

6.1 Introduction and Examples
6.2 Convexity and Concavity
6.3 Univariate Problems: Approximate Methods
6.4 Multivariate Unconstrained Problems: Exact Methods
6.5 Multivariate Unconstrained Problems: Approximate Methods
6.6 Multivariate Equality Constrained Problems
6.7 Multivariate Inequality Constrained Problems
6.8 Quadratic Programming
6.9 Penalty Functions

6.5 Multivariate Unconstrained Problems: Approximate Methods

Newton's Method (Multivariate)

Assume differentiability. In the univariate case, we approximated f using Taylor's Theorem. The multivariate case is a simple generalisation, although results are only established here for two variables. Taylor says:

f(x, y) = f(a, b) + (x-a) ∂f/∂x + (y-b) ∂f/∂y
        + ½[(x-a)² ∂²f/∂x² + 2(x-a)(y-b) ∂²f/∂x∂y + (y-b)² ∂²f/∂y²] + …

with the derivatives on the right evaluated at (a, b).
Multivariate Unconstrained Problems: Approximate Methods

Take (x1(n), x2(n)) as approximate value for the optimum (a, b) at the nth iteration. We want an expression for (x1(n+1), x2(n+1)). Taylor about (x1(n), x2(n)) gives (derivatives on the RHS evaluated at (x1(n), x2(n))):

    f(x1, x2) = f(x1(n), x2(n)) + (x1 − x1(n)) ∂f/∂x1 + (x2 − x2(n)) ∂f/∂x2
                + ½ [ (x1 − x1(n))² ∂²f/∂x1² + 2(x1 − x1(n))(x2 − x2(n)) ∂²f/∂x1∂x2
                    + (x2 − x2(n))² ∂²f/∂x2² ] + …

Differentiate with respect to x1 and x2 in turn, and set the derivatives to zero at the new point (x1(n+1), x2(n+1)):

    0 = ∂f/∂x1 + (x1(n+1) − x1(n)) ∂²f/∂x1² + (x2(n+1) − x2(n)) ∂²f/∂x1∂x2
    0 = ∂f/∂x2 + (x1(n+1) − x1(n)) ∂²f/∂x1∂x2 + (x2(n+1) − x2(n)) ∂²f/∂x2²
Multivariate Unconstrained Problems: Approximate Methods

In matrix form, this is 0 = ∇f + H(x(n+1) − x(n)), which can be expressed

    x(n+1) = x(n) − H⁻¹∇f

This last expression provides an iteration, Newton's method, based on a suitable starting value, intended to converge to a local, if not necessarily global, optimum. Stress that at any iteration H⁻¹ and ∇f are each evaluated at (x1(n), x2(n)), and that a good starting value is important.

Example
min f(x1, x2) = x1² + x1x2 + x2² − 3x1 − 3x2 + 3
Take starting point (x1(1), x2(1)) = (0, 0).

    ∇f = [ 2x1 + x2 − 3 ]     H = [ 2  1 ]     H⁻¹ = (1/3) [  2  −1 ]
         [ x1 + 2x2 − 3 ]         [ 1  2 ]                 [ −1   2 ]

    [ x1(n+1) ]   [ x1(n) ]         [  2  −1 ] [ 2x1(n) + x2(n) − 3 ]
    [ x2(n+1) ] = [ x2(n) ] − (1/3) [ −1   2 ] [ x1(n) + 2x2(n) − 3 ]

Starting from (0, 0):

    x(2) = (0, 0) − (1/3) [  2  −1 ] [ −3 ] = [ 1 ]
                          [ −1   2 ] [ −3 ]   [ 1 ]

Example
Now a function with

    ∇f = [ 4(x1 − 1)³ ]     H = [ 12(x1 − 1)²       0      ]     H⁻¹ = [ 1/(12(x1 − 1)²)        0         ]
         [ 4(x2 − 2)³ ]         [      0       12(x2 − 2)² ]           [        0        1/(12(x2 − 2)²) ]

Note that H⁻¹ exists for x1 ≠ 1, x2 ≠ 2. Starting from (0, 0) obtain a sequence of points:
Multivariate Unconstrained Problems: Approximate Methods

Iteration       1       2            3            4            5
(x1(n), x2(n))  (0,0)   (0.33,0.67)  (0.56,1.11)  (0.70,1.41)  (0.80,1.61)

Iteration       6            7            8            9
(x1(n), x2(n))  (0.87,1.74)  (0.91,1.82)  (0.94,1.88)  (0.96,1.92)

We see that the iteration is converging, but rather too slowly.

If we had naively used starting point (1, 2), the singularity of H would have meant we couldn't run the algorithm at all.

Example
Apply Newton to the function f(x, y) = sin(x²/2 − y²/4)cos(2x − e^y). Run the algorithm a number of times with starting points around (1.5, 0.5):

Start point   End point            f at end point
(1.4, 0.4)    (0.0407, −2.5073)    −1.0000
(1.4, 0.5)    (0.1180, 3.3447)      0.3403
(1.4, 0.6)    (−1.5532, 6.0200)    −1.0000
(1.5, 0.4)    (2.8371, 5.3540)      0.0000
(1.5, 0.5)    (0.0407, −2.5073)    −1.0000
(1.5, 0.6)    (0.0000, 0.0000)      0.0000
(1.6, 0.4)    (−0.5584, −0.7897)    0.0000
(1.6, 0.5)    (−0.2902, −0.2305)    0.0056
(1.6, 0.6)    (−1.5529, −3.3326)    1.0000
The process is clearly unstable, although Newton has been able to find the true maximum and minimum (1 and −1 respectively) for some starting values. The method may converge to saddle points, as well as to maxima and minima. Further, unless very close to a local optimum, it is possible to move in unexpected directions.

The algorithm did indeed converge each time, but to very different destinations, despite the starting points being quite close. We are no closer to finding optima!

This example, and earlier examples, are designed to make the point that Newton's Method suffers serious potential defects and is not a viable practical tool.

Summarising, the main advantage is that convergence may be very rapid in some cases. But:
1. It needs considerable computing power (e.g. storing and evaluating H⁻¹ for large n).
2. May converge to local, not global optima.
3. May wander badly, converge slowly or diverge.
4. Assumes f twice differentiable, and that derivatives have an explicit analytic form.
5. Not always robust; sensitive to initial estimate.
6. Assumes H invertible and well conditioned.
7. Can sometimes encounter a stationary iteration point or an infinite cycle.
8. Discarding the terms beyond the quadratic may have an adverse effect on convergence.
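The slow-convergence example above can be checked with a short sketch. The objective is not stated explicitly, but the gradient components 4(x1 − 1)³ and 4(x2 − 2)³ correspond (up to a constant) to f(x1, x2) = (x1 − 1)⁴ + (x2 − 2)⁴, whose diagonal Hessian makes H⁻¹ trivial to apply; starting at (1, 2) triggers the singular-H failure noted earlier.

```python
# Newton's method x <- x - H^-1 grad f for the quartic-style example.
# f(x1, x2) = (x1 - 1)^4 + (x2 - 2)^4 is inferred from the stated
# gradient (4(x1-1)^3, 4(x2-2)^3); its Hessian is diagonal:
#   H = diag(12(x1-1)^2, 12(x2-2)^2), singular when x1 = 1 or x2 = 2.

def newton_quartic(x1, x2, steps):
    history = [(x1, x2)]
    for _ in range(steps):
        g1, g2 = 4 * (x1 - 1) ** 3, 4 * (x2 - 2) ** 3    # gradient
        h1, h2 = 12 * (x1 - 1) ** 2, 12 * (x2 - 2) ** 2  # Hessian diagonal
        if h1 == 0 or h2 == 0:
            raise ZeroDivisionError("H singular at (%g, %g)" % (x1, x2))
        x1, x2 = x1 - g1 / h1, x2 - g2 / h2              # Newton update
        history.append((x1, x2))
    return history

path = newton_quartic(0.0, 0.0, 8)
# path[1] is (1/3, 2/3) ~ (0.33, 0.67) and path[8] ~ (0.96, 1.92),
# matching the iteration table: each step closes only 1/3 of the
# remaining gap to the optimum (1, 2), hence the slow convergence.
```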
Multivariate Unconstrained Problems: Approximate Methods
Steepest Descent

A very useful method for unconstrained optimisation. This uses the iterative scheme

    x(n+1) = x(n) + αn+1∇f(x(n))

[The signs of ∇f(x(n)) and αn+1 determine the direction of movement from one solution to the next.]

We can think of the method as quasi-Newton with Hn = I for all n, although it is not usually classified as such. Alternatively, more intuitively, this is an example of a generic algorithm:

Step 1: Initialise at a point x(0).
Step 2: Find a direction of movement away from the current solution so as to improve the value of f(x).
Step 3: Determine how far to move in this direction, i.e. find a suitable step size.
Step 4: Repeat steps 2 and 3 until no further improvement can be made, or until the improvement is less than a specified tolerance. On termination, the current solution is taken as optimum.

Since the direction of maximum rate of decrease (or increase) of a function at a point is along the gradient vector, we move in that direction at step 2. The step length tells us how far to go.

As always, a good starting point speeds the algorithm. A more formal statement follows:
Multivariate Unconstrained Problems: Approximate Methods

Step 1: Select initial solution x(0). Set n = 0.
Step 2: Evaluate ∇f(x) at x(n).
Step 3: Choose αn+1 to minimise (or maximise) f(x(n+1)) = f(x(n) + αn+1∇f(x(n))) by univariate search.
Step 4: Obtain new solution x(n+1) = x(n) + αn+1∇f(x(n)).
        If |xj(n+1) − xj(n)| < ε (small, specified) for all components xj, set x* = x(n+1) as optimum and stop; otherwise set n → n+1 and return to step 2.

Example
min f(x1, x2) = x1² + x1x2 + x2² − 3x1 − 3x2 + 3
(we know the true minimum is at (1,1)).
Take starting solution x(0) = (x1(0), x2(0)) = (1, 0)T.

Search direction:

    ∇f(x) = [ 2x1 + x2 − 3 ] = [ −1 ]
            [ x1 + 2x2 − 3 ]   [ −2 ]

Choose step size α1 to minimise

    f(x(1)) = f(x(0) + α1∇f(x(0)))
            = f((1, 0) + α1(−1, −2))
            = f(1 − α1, −2α1)
            = (1 − α1)² + (1 − α1)(−2α1) + (−2α1)² − 3(1 − α1) − 3(−2α1) + 3
            = 7α1² + 5α1 + 1
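The whole iteration can be sketched as follows. For this quadratic the univariate search has a closed form: writing the descent direction as d = −∇f (equivalent to the notes' x + α∇f with α allowed to be negative), the minimising step is α = (dᵀd)/(dᵀAd), where A = [[2, 1], [1, 2]] is the Hessian.

```python
# Steepest descent for f(x1, x2) = x1^2 + x1*x2 + x2^2 - 3*x1 - 3*x2 + 3,
# starting from (1, 0). Writing f(x) = 0.5 x'Ax + b'x + c with
# A = [[2, 1], [1, 2]], the exact line-search step along d = -grad f
# is alpha = (d'd)/(d'Ad), so no numerical univariate search is needed.

def steepest_descent(x1, x2, steps):
    for _ in range(steps):
        g1, g2 = 2 * x1 + x2 - 3, x1 + 2 * x2 - 3   # gradient = Ax + b
        d1, d2 = -g1, -g2                           # steepest descent direction
        ad1, ad2 = 2 * d1 + d2, d1 + 2 * d2         # A d
        denom = d1 * ad1 + d2 * ad2                 # d'Ad
        if denom == 0:                              # gradient already zero
            break
        alpha = (d1 * d1 + d2 * d2) / denom         # exact step size
        x1, x2 = x1 + alpha * d1, x2 + alpha * d2
    return x1, x2

x_star = steepest_descent(1.0, 0.0, 25)
# First step: alpha = 5/14, moving (1, 0) to (19/14, 5/7); thereafter
# the iterates zig-zag towards the true minimum (1, 1).
```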
Multivariate Equality Constrained Problems

[Figure: stock level plotted against time – gradual depletion, instant restocking]

…whereupon the firm immediately restocks with x.

The firm requires X units of the commodity each year and, on average, the firm reorders with a frequency of y times a year. If the requirement is to be met, it is therefore necessary that xy = X.

The cost of holding one unit of the commodity in stock for a year is d. Since the average amount held in stock is ½x, the yearly holding cost is ½dx.

The cost of reordering is e. The yearly reordering cost is therefore ey.

The firm faces the problem of minimising cost C(x, y) = ½dx + ey subject to xy = X.

Hence λ⁻¹ = ±√(2X/(ed)). Solutions are x = −e/λ = √(2eX/d) and y = −d/(2λ) = √(dX/(2e)).

The firm should therefore order (2eX/d)^½ of the commodity (dX/(2e))^½ times a year. The yearly cost will be (2eXd)^½.

There are many more complex inventory models – shortages allowed, varying depletion rates etc. Lagrange multipliers are a key solution tool.
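The Lagrange solution can be checked numerically. The figures below (holding cost d = 2, reorder cost e = 25, annual requirement X = 400) are hypothetical, chosen only to make the arithmetic come out cleanly.

```python
# Check the Lagrange solution of min C(x, y) = 0.5*d*x + e*y s.t. xy = X,
# using hypothetical figures: holding cost d = 2 per unit-year,
# reorder cost e = 25, annual requirement X = 400 units.
import math

d, e, X = 2.0, 25.0, 400.0

x_star = math.sqrt(2 * e * X / d)    # order quantity  (2eX/d)^0.5 = 100
y_star = math.sqrt(d * X / (2 * e))  # orders per year (dX/2e)^0.5 = 4

def yearly_cost(x):
    # substitute y = X/x, forced by the constraint xy = X
    return 0.5 * d * x + e * X / x

# yearly_cost(x_star) = (2edX)^0.5 = 200, and nearby feasible order
# quantities cost more, confirming a minimum rather than a maximum.
```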
If there are no Lagrange points, we can deduce there is no optimum. If there is a unique solution to the Lagrange equations, and there is an optimum, we must have located it.

(This reflects some changes that were recently put into effect. Univariate exact problems and quadratic programming have been relegated to the miscellaneous section of Moodle.)
6.7 Multivariate Inequality Constrained Problems

The most general type of NLP. Consider a problem of form

    min f0(x) for x ∈ Rn
    subject to fi(x) ≤ 0 (i = 1, 2, …, m).

Fairly clear that any real programming problem can be expressed this way. Of course, many will be maximisations or have different types of constraints. The form is convenient, and it is quite important to be consistent here in view of subsequent work.

Kuhn-Tucker is the main tool, enabling exact solutions in some cases, and underlying theory in others. First, briefly consider a couple of other approaches.

As in other areas, can again use a plot – only with 2 or 3 variables. Alternatively, observe that at optimum, each constraint either holds as an equality, or is not binding – in which case, it can be ignored.

The following method is therefore suggested:
• If there are m constraints, solve 2^m subproblems: ignoring all constraints, ignoring all but one constraint which is treated as an equality, ignoring all but two constraints which are treated as equalities, and so on.
• Any solutions found which satisfy all current constraints are candidates for optimum; any solutions which violate one or more constraints can be discarded.

Naturally, the method is only viable when m is quite small, even if modifications can reduce the labour somewhat.
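The subproblem method can be sketched on a small hypothetical problem (not from the notes): min f0(x) = (x1 − 2)² + (x2 − 2)² subject to x1 + x2 − 2 ≤ 0 and x1 − 1.5 ≤ 0. Because this f0 is a squared distance to (2, 2), each subproblem with a chosen subset of constraints forced to equality reduces to projecting (2, 2) onto an affine set.

```python
# Enumerate all 2^m subsets of active constraints, solve each
# equality-constrained subproblem, keep the feasible candidates,
# and take the best. Problem (hypothetical, for illustration):
#   min f0(x) = (x1-2)^2 + (x2-2)^2
#   s.t. x1 + x2 - 2 <= 0  and  x1 - 1.5 <= 0
from itertools import combinations

TARGET = (2.0, 2.0)
# constraints a1*x1 + a2*x2 - b <= 0, stored as (a1, a2, b)
CONSTRAINTS = [(1.0, 1.0, 2.0), (1.0, 0.0, 1.5)]

def project(active):
    """Minimiser of f0 with the given constraints forced to equality."""
    if len(active) == 0:
        return TARGET                          # unconstrained subproblem
    if len(active) == 1:                       # project onto one line
        (a1, a2, b), (t1, t2) = active[0], TARGET
        lam = (a1 * t1 + a2 * t2 - b) / (a1 * a1 + a2 * a2)
        return (t1 - lam * a1, t2 - lam * a2)
    (a1, a2, b), (c1, c2, dd) = active         # two lines: 2x2 system
    det = a1 * c2 - a2 * c1
    return ((b * c2 - a2 * dd) / det, (a1 * dd - b * c1) / det)

def feasible(x):
    return all(a1 * x[0] + a2 * x[1] - b <= 1e-9 for a1, a2, b in CONSTRAINTS)

def f0(x):
    return (x[0] - 2.0) ** 2 + (x[1] - 2.0) ** 2

candidates = []
for k in range(len(CONSTRAINTS) + 1):          # 2^m subsets in total
    for subset in combinations(CONSTRAINTS, k):
        x = project(list(subset))
        if feasible(x):                        # discard violators
            candidates.append(x)

best = min(candidates, key=f0)                 # here: (1.0, 1.0), f0 = 2.0
```

With m constraints there are 2^m subsets, so the enumeration explodes quickly, matching the remark that the method is only viable for small m.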
Penalty Functions

Generally, the use of penalty functions provides a useful extra tool for solution of NLPs. The biggest problem is potentially slow convergence, due to the awkward nature of the composite functions.

The decision as to which NLP algorithm to use is delicate, depending on the exact nature of the problem, the degree of accuracy required, available software and so on. Developments over the last 50 years or so provide a range of options.

THE END!!!

https://www.youtube.com/watch?v=0b-tN0LBfOs