Tutorial: Operations Research and Constraint Programming

John Hooker
Carnegie Mellon University
June 2008

Slide 1
Why Integrate OR and CP?

Complementary strengths
Computational advantages
Outline of the Tutorial

Slide 2
Complementary Strengths

• CP:
  – Inference methods
  – Modeling
  – Exploits local structure
• OR:
  – Relaxation methods
  – Duality theory
  – Exploits global structure

Let's bring them together!

Slide 3
Computational Advantage of Integrating CP and OR
Using CP + relaxation from MILP

Study | Problem | Speedup
Focacci, Lodi & Milano (1999) | Lesson timetabling | 2 to 50 times faster than CP
Refalo (1999) | Piecewise linear costs | 2 to 200 times faster than MILP
Hooker & Osorio (1999) | Flow shop scheduling, etc. | 4 to 150 times faster than MILP
Thorsteinsson & Ottosson (2001) | Product configuration | 30 to 40 times faster than CP, MILP

Slide 4
Computational Advantage of Integrating CP and MILP
Using CP + relaxation from MILP

Study | Problem | Speedup
Sellmann & Fahle (2001) | Automatic recording | 1 to 10 times faster than CP, MILP
Van Hoeve (2001) | Stable set problem | Better than CP in less time
Bollapragada, Ghattas & Hooker (2001) | Structural design (nonlinear) | Up to 600 times faster than MILP; 2 problems: <6 min vs. >20 hrs for MILP
Beck & Refalo (2003) | Scheduling with earliness & tardiness costs | Solved 67 of 90 instances; CP solved only 12
Slide 5
Computational Advantage of Integrating CP and MILP
Using CP-based Branch and Price

Study | Problem | Speedup
Yunes, Moura & de Souza (1999) | Urban transit crew scheduling | Optimal schedule for 210 trips, vs. 120 for traditional branch and price
Easton, Nemhauser & Trick (2002) | Traveling tournament scheduling | First to solve 8-team instance

Slide 6
Computational Advantage of Integrating CP and MILP
Using CP/MILP Benders methods

Study | Problem | Speedup
Jain & Grossmann (2001) | Min-cost planning & scheduling | 20 to 1000 times faster than CP, MILP
Thorsteinsson (2001) | Min-cost planning & scheduling | 10 times faster than Jain & Grossmann
Timpe (2002) | Polypropylene batch scheduling at BASF | Solved previously insoluble problem in 10 min

Slide 7
Computational Advantage of Integrating CP and MILP
Using CP/MILP Benders methods

Study | Problem | Speedup
Benoist, Gaudin & Rottembourg (2002) | Call center scheduling | Solved twice as many instances as traditional Benders
Hooker (2004) | Min-cost, min-makespan planning & cumulative scheduling | 100-1000 times faster than CP, MILP
Hooker (2005) | Min tardiness planning & cumulative scheduling | 10-1000 times faster than CP, MILP

Slide 8
Outline of the Tutorial
• Why Integrate OR and CP?
• A Glimpse at CP
• Initial Example: Integrated Methods
• CP Concepts
• CP Filtering Algorithms
• Linear Relaxation and CP
• Mixed Integer/Linear Modeling
• Cutting Planes
• Lagrangean Relaxation and CP
• Dynamic Programming in CP
• CP-based Branch and Price
• CP-based Benders Decomposition

Slide 9
Detailed Outline

• Why Integrate OR and CP?


• Complementary strengths
• Computational advantages
• Outline of the tutorial
• A Glimpse at CP
• Early successes
• Advantages and disadvantages
• Initial Example: Integrated Methods
• Freight Transfer
• Bounds Propagation
• Cutting Planes
• Branch-infer-and-relax Tree

Slide 10
Detailed Outline

• CP Concepts
• Consistency
• Hyperarc Consistency
• Modeling Examples
• CP Filtering Algorithms
• Element
• Alldiff
• Disjunctive Scheduling
• Cumulative Scheduling
• Linear Relaxation and CP
• Why relax?
• Algebraic Analysis of LP
• Linear Programming Duality
• LP-Based Domain Filtering
• Example: Single-Vehicle Routing
• Disjunctions of Linear Systems
Slide 11
Detailed Outline
• Mixed Integer/Linear Modeling
• MILP Representability
• Disjunctive Modeling
• Knapsack Modeling
• Cutting Planes
• 0-1 Knapsack Cuts
• Gomory Cuts
• Mixed Integer Rounding Cuts
• Example: Product Configuration
• Lagrangean Relaxation and CP
• Lagrangean Duality
• Properties of the Lagrangean Dual
• Example: Fast Linear Programming
• Domain Filtering
• Example: Continuous Global Optimization

Slide 12
Detailed Outline
• Dynamic Programming in CP
• Example: Capital Budgeting
• Domain Filtering
• Recursive Optimization
• CP-based Branch and Price
• Basic Idea
• Example: Airline Crew Scheduling
• CP-based Benders Decomposition
• Benders Decomposition in the Abstract
• Classical Benders Decomposition
• Example: Machine Scheduling

Slide 13
Background Reading

This tutorial is based on:


• J. N. Hooker, Integrated Methods for Optimization, Springer
(2007). Contains 295 exercises.
• J. N. Hooker, Operations research methods in constraint
programming, in F. Rossi, P. van Beek and T. Walsh, eds.,
Handbook of Constraint Programming, Elsevier (2006), pp.
527-570.

Slide 14
A Glimpse at Constraint Programming

Early Successes
Advantages and Disadvantages

Slide 15
What is constraint programming?

• It is a relatively new technology developed in the computer


science and artificial intelligence communities.
• It has found an important role in scheduling, logistics and supply
chain management.

Slide 16
Early commercial successes

• Container port scheduling (Hong Kong and Singapore)
• Circuit design (Siemens)
• Real-time control (Siemens, Xerox)

Slide 17
Applications

• Job shop scheduling
• Assembly line smoothing and balancing
• Cellular frequency assignment
• Nurse scheduling
• Shift planning
• Maintenance planning
• Airline crew rostering and scheduling
• Airport gate allocation and stand planning

Slide 18
Applications

• Production scheduling: chemicals, aviation, oil refining, steel, lumber, photographic plates, tires
• Transport scheduling (food, nuclear fuel)
• Warehouse management
• Course timetabling

Slide 19
Advantages and Disadvantages

CP vs. Mathematical Programming

MP | CP
Numerical calculation | Logic processing
Relaxation | Inference (filtering, constraint propagation)
Atomistic modeling (linear inequalities) | High-level modeling (global constraints)
Branching | Branching
Independence of model and algorithm | Constraint-based processing

Slide 20
Programming ≠ programming

• In constraint programming:
• programming = a form of computer programming
(constraint-based processing)

• In mathematical programming:
• programming = logistics planning (historically)

Slide 21
CP vs. MP

• In mathematical programming, equations (constraints) describe the problem but don't tell how to solve it.

• In constraint programming, each constraint invokes a procedure that screens out unacceptable solutions.
  • Much as each line of a computer program invokes an operation.

Slide 22
Advantages of CP

• Better at sequencing and scheduling
  • …where MP methods have weak relaxations.
• Adding messy constraints makes the problem easier.
  • The more constraints, the better.
• More powerful modeling language.
  • Global constraints lead to succinct models.
  • Constraints convey problem structure to the solver.
• "Better at highly constrained problems"
  • Misleading – better when constraints propagate well, or when constraints have few variables.

Slide 23
Disadvantages of CP

• Weaker for continuous variables.
  • Due to lack of numerical techniques.
• May fail when constraints contain many variables.
  • These constraints don't propagate well.
• Often not good for finding optimal solutions.
  • Due to lack of relaxation technology.
• May not scale up.
  • Discrete combinatorial methods.
• Software is not robust.
  • Younger field.
Slide 24
Obvious solution…

• Integrate CP and MP.


• More on this later.

Slide 25
Trends

• CP is better known in continental Europe, Asia.


• Less known in North America, seen as threat to OR.
• CP/MP integration is growing
• Eclipse, Mozart, OPL Studio, SIMPL, SCIP, BARON
• Heuristic methods increasingly important in CP
• Discrete combinatorial methods
• MP/CP/heuristics may become a single technology.

Slide 26
Initial Example: Integrated Methods

Freight Transfer
Bounds Propagation
Cutting Planes
Branch-infer-and-relax Tree

Slide 27
Example: Freight Transfer

• Transport 42 tons of freight using 8 trucks, which come in 4 sizes…

Truck size | Number available | Capacity (tons) | Cost per truck
1 | 3 | 7 | 90
2 | 3 | 5 | 60
3 | 3 | 4 | 50
4 | 3 | 3 | 40
Slide 28
Let xi = the number of trucks of type i.

min 90x1 + 60x2 + 50x3 + 40x4
7x1 + 5x2 + 4x3 + 3x4 ≥ 42   (knapsack covering constraint)
x1 + x2 + x3 + x4 ≤ 8        (knapsack packing constraint)
xi ∈ {0,1,2,3}
Slide 29
Bounds propagation

min 90x1 + 60x2 + 50x3 + 40x4
7x1 + 5x2 + 4x3 + 3x4 ≥ 42
x1 + x2 + x3 + x4 ≤ 8
xi ∈ {0,1,2,3}

x1 ≥ ⌈(42 − 5⋅3 − 4⋅3 − 3⋅3)/7⌉ = 1

Slide 30
Bounds propagation

min 90x1 + 60x2 + 50x3 + 40x4
7x1 + 5x2 + 4x3 + 3x4 ≥ 42
x1 + x2 + x3 + x4 ≤ 8
x1 ∈ {1,2,3}, x2, x3, x4 ∈ {0,1,2,3}   (reduced domain)

x1 ≥ ⌈(42 − 5⋅3 − 4⋅3 − 3⋅3)/7⌉ = 1

Slide 31
Bounds consistency

• Let {Lj, …, Uj} be the domain of xj.
• A constraint set is bounds consistent if for each j:
  • xj = Lj in some feasible solution and
  • xj = Uj in some feasible solution.
• Bounds consistency ⇒ we will not set xj to any infeasible values during branching.
• Bounds propagation achieves bounds consistency for a single inequality.
  • 7x1 + 5x2 + 4x3 + 3x4 ≥ 42 is bounds consistent when the domains are x1 ∈ {1,2,3} and x2, x3, x4 ∈ {0,1,2,3}.
• But not necessarily for a set of inequalities.
Slide 32
Bounds consistency

Bounds propagation may not achieve bounds consistency for a set of constraints.

Consider the set of inequalities

x1 + x2 ≥ 1
x1 − x2 ≥ 0

with domains x1, x2 ∈ {0,1} and solutions (x1,x2) = (1,0), (1,1).

Bounds propagation has no effect on the domains. But the constraint set is not bounds consistent, because x1 = 0 in no feasible solution.

Slide 33
Cutting Planes

Begin with a continuous relaxation:

min 90x1 + 60x2 + 50x3 + 40x4
7x1 + 5x2 + 4x3 + 3x4 ≥ 42
x1 + x2 + x3 + x4 ≤ 8
0 ≤ xi ≤ 3, x1 ≥ 1   (domains replaced with bounds)

This is a linear programming problem, which is easy to solve. Its optimal value provides a lower bound on the optimal value of the original problem.

Slide 34
Cutting planes (valid inequalities)

min 90x1 + 60x2 + 50x3 + 40x4
7x1 + 5x2 + 4x3 + 3x4 ≥ 42
x1 + x2 + x3 + x4 ≤ 8
0 ≤ xi ≤ 3, x1 ≥ 1

We can create a tighter relaxation (larger minimum value) by adding cutting planes.

Slide 35
Cutting planes (valid inequalities)

min 90x1 + 60x2 + 50x3 + 40x4
7x1 + 5x2 + 4x3 + 3x4 ≥ 42
x1 + x2 + x3 + x4 ≤ 8
0 ≤ xi ≤ 3, x1 ≥ 1

All feasible solutions of the original problem satisfy a cutting plane (i.e., it is valid). But a cutting plane may exclude ("cut off") solutions of the continuous relaxation.

[Figure: the feasible solutions, the continuous relaxation, and a cutting plane.]
Slide 36
Cutting planes (valid inequalities)

min 90x1 + 60x2 + 50x3 + 40x4
7x1 + 5x2 + 4x3 + 3x4 ≥ 42
x1 + x2 + x3 + x4 ≤ 8
0 ≤ xi ≤ 3, x1 ≥ 1

{1,2} is a packing, because 7x1 + 5x2 alone cannot satisfy the inequality, even with x1 = x2 = 3.

Slide 37
Cutting planes (valid inequalities)

min 90x1 + 60x2 + 50x3 + 40x4
7x1 + 5x2 + 4x3 + 3x4 ≥ 42
x1 + x2 + x3 + x4 ≤ 8
0 ≤ xi ≤ 3, x1 ≥ 1

{1,2} is a packing. So

4x3 + 3x4 ≥ 42 − (7⋅3 + 5⋅3)

which implies the knapsack cut

x3 + x4 ≥ ⌈(42 − (7⋅3 + 5⋅3)) / max{4,3}⌉ = 2
Slide 38
Cutting planes (valid inequalities)

Let xi have domain [Li, Ui] and let a ≥ 0. In general, a packing P for ax ≥ a0 satisfies

∑_{i∉P} ai xi ≥ a0 − ∑_{i∈P} ai Ui

and generates a knapsack cut

∑_{i∉P} xi ≥ ⌈( a0 − ∑_{i∈P} ai Ui ) / max_{i∉P} {ai} ⌉
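As an illustration (a minimal Python sketch added in this rewrite, not part of the tutorial), the cut formula can be applied mechanically by enumerating packings; the maximal packings reproduce the three cuts tabulated on the next slide:

from math import ceil
from itertools import combinations

def knapsack_cuts(a, a0, U):
    # Enumerate packings P for sum_i a[i]*x[i] >= a0 (a >= 0, x[i] <= U[i])
    # and emit cuts: sum over i not in P of x[i] >= ceil(rhs / max coefficient).
    n = len(a)
    cuts = []
    for r in range(1, n):
        for P in combinations(range(n), r):
            rest = [i for i in range(n) if i not in P]
            rhs = a0 - sum(a[i] * U[i] for i in P)
            if rhs > 0:   # nontrivial cut: the remaining terms are still needed
                cuts.append((rest, ceil(rhs / max(a[i] for i in rest))))
    return cuts

# Freight transfer: 7x1 + 5x2 + 4x3 + 3x4 >= 42, Ui = 3 (0-based indices)
for rest, rhs in knapsack_cuts([7, 5, 4, 3], 42, [3, 3, 3, 3]):
    print("sum of x over", rest, ">=", rhs)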

Slide 39
Cutting planes (valid inequalities)

min 90x1 + 60x2 + 50x3 + 40x4
7x1 + 5x2 + 4x3 + 3x4 ≥ 42
x1 + x2 + x3 + x4 ≤ 8
0 ≤ xi ≤ 3, x1 ≥ 1

Maximal packings | Knapsack cuts
{1,2} | x3 + x4 ≥ 2
{1,3} | x2 + x4 ≥ 2
{1,4} | x2 + x3 ≥ 3

Knapsack cuts corresponding to nonmaximal packings can be nonredundant.
Slide 40
Continuous relaxation with cuts

min 90x1 + 60x2 + 50x3 + 40x4
7x1 + 5x2 + 4x3 + 3x4 ≥ 42
x1 + x2 + x3 + x4 ≤ 8
0 ≤ xi ≤ 3, x1 ≥ 1
x3 + x4 ≥ 2
x2 + x4 ≥ 2   (knapsack cuts)
x2 + x3 ≥ 3

The optimal value of 523⅓ is a lower bound on the optimal value of the original problem.

Slide 41
Branch-infer-and-relax tree

Root node: x1 ∈ {1,2,3}, x2, x3, x4 ∈ {0,1,2,3}; relaxation solution x = (2⅓, 3, 2⅔, 0), value = 523⅓.

Propagate bounds and solve the relaxation of the original problem.

Slide 42
Branch-infer-and-relax tree

At the root, x = (2⅓, 3, 2⅔, 0) with value 523⅓. Branch on a variable with a nonintegral value in the relaxation: x1 ∈ {1,2} or x1 = 3.

Slide 43
Branch-infer-and-relax tree

Left branch (x1 ∈ {1,2}): propagate bounds and solve the relaxation. Propagation reduces the domains to x1 ∈ {1,2}, x2 ∈ {2,3}, x3 ∈ {1,2,3}, x4 ∈ {1,2,3}, and the relaxation is infeasible, so backtrack.

Slide 44
Branch-infer-and-relax tree

Right branch (x1 = 3): propagate bounds and solve the relaxation. Domains become x1 ∈ {3}, x2, x3, x4 ∈ {0,1,2,3}; relaxation solution x = (3, 2.6, 2, 0), value = 526. Branch on the nonintegral variable x2: x2 ∈ {0,1,2} or x2 = 3.

Slide 45
Branch-infer-and-relax tree

Branch again. At the node with x1 = 3, x2 ∈ {0,1,2}: the domains are x1 ∈ {3}, x2 ∈ {0,1,2}, x3 ∈ {1,2,3}, x4 ∈ {0,1,2,3}; relaxation solution x = (3, 2, 2¾, 0), value = 527½. Branch on x3: x3 ∈ {1,2} or x3 = 3.

Slide 46
Branch-infer-and-relax tree

At the node with x3 ∈ {1,2}: the domains are x1 ∈ {3}, x2 ∈ {1,2}, x3 ∈ {1,2}, x4 ∈ {1,2,3}; relaxation solution x = (3, 2, 2, 1), value = 530. The solution of the relaxation is integral and therefore feasible in the original problem. This becomes the incumbent solution.
Slide 47
Branch-infer-and-relax tree

At the node with x3 = 3: the domains are x1 ∈ {3}, x2 ∈ {0,1,2}, x3 ∈ {3}, x4 ∈ {0,1,2}; relaxation solution x = (3, 1½, 3, ½), value = 530. The solution is nonintegral, but we can backtrack due to the bound, because the value of the relaxation is no better than the incumbent solution.
Slide 48
Branch-infer-and-relax tree

At the node with x2 = 3: the domains are x1 ∈ {3}, x2 ∈ {3}, x3 ∈ {0,1,2}, x4 ∈ {0,1,2}; relaxation solution x = (3, 3, 0, 2), value = 530. Another feasible solution is found. It is no better than the incumbent solution, which is optimal because the search has finished.
Slide 49
Two optimal solutions…

x = (3,2,2,1)

x = (3,3,0,2)

Slide 50
Constraint Programming Concepts

Consistency
Hyperarc Consistency
Modeling Examples

Slide 51
Consistency

• A constraint set is consistent if every partial assignment to the variables that violates no constraint is feasible.
  • i.e., can be extended to a feasible solution.
• Consistency ≠ feasibility.
  • Consistency means that any infeasible partial assignment is explicitly ruled out by a constraint.
• Fully consistent constraint sets can be solved without backtracking.

Slide 52
Consistency

Consider the constraint set

x1 + x100 ≥ 1
x1 − x100 ≥ 0
xj ∈ {0,1}

It is not consistent, because x1 = 0 violates no constraint and yet is infeasible (no solution has x1 = 0). Adding the constraint x1 = 1 makes the set consistent.

Slide 53
x1 + x100 ≥ 1
x1 − x100 ≥ 0
other constraints
xj ∈ {0,1}

Branching on x1 = 0 versus x1 = 1: the left subtree (x1 = 0) has 2^99 nodes but no feasible solution. By adding the constraint x1 = 1, the left subtree is eliminated.

Slide 54
Hyperarc Consistency

• Also known as generalized arc consistency.


• A constraint set is hyperarc consistent if every value in
every variable domain is part of some feasible solution.
• That is, the domains are reduced as much as
possible.
• If all constraints are “binary” (contain 2 variables),
hyperarc consistent = arc consistent.
• Domain reduction is CP’s biggest engine.

Slide 55
Graph coloring problem that can be solved by arc consistency maintenance alone: color the nodes red, green, and blue with no two adjacent nodes having the same color.

[Figures on Slides 56-62: successive arc consistency steps, each reducing a node's color domain until a proper coloring remains.]
Slide 62
Modeling Examples with Global Constraints
Traveling Salesman

Traveling salesman problem: let cij = the distance from city i to city j. Find the shortest route that visits each of n cities exactly once.

Slide 63
Popular 0-1 model

Let xij = 1 if city i immediately precedes city j, 0 otherwise.

min ∑_{ij} cij xij
s.t. ∑_i xij = 1, all j
     ∑_j xij = 1, all i
     ∑_{i∈V} ∑_{j∈W} xij ≥ 1, all disjoint V, W ⊂ {1,…,n}   (subtour elimination constraints)
     xij ∈ {0,1}

Slide 64
A CP model

Let yk = the kth city visited. The model would be written in a specific constraint programming language but would essentially say:

min ∑_k c_{y_k y_{k+1}}   (variable indices)
s.t. alldiff(y1, …, yn)   ("global" constraint)
     yk ∈ {1,…,n}

Slide 65
An alternate CP model

Let yk = the city visited after city k.

min ∑_k c_{k y_k}
s.t. circuit(y1, …, yn)   (Hamiltonian circuit constraint)
     yk ∈ {1,…,n}

Slide 66
Element constraint

The constraint c_y ≤ 5 can be implemented:

z ≤ 5
element(y, (c1,…,cn), z)   (assign z the yth value in the list)

The constraint x_y ≤ 5 can be implemented:

z ≤ 5
element(y, (x1,…,xn), z)   (adds the constraint z = x_y; this is a slightly different constraint)

Slide 67
Modeling example: Lot sizing and scheduling

[Figure: which product (A or B) is manufactured on each of days 1-8.]

• At most one product manufactured on each day.
• Demands for each product on each day.
• Minimize setup + holding cost.
Slide 68
Integer programming model (Wolsey), with many variables:

min ∑_{t,i} ( hi sit + ∑_{j≠i} qij δijt )
s.t. si,t−1 + xit = dit + sit, all i, t
     zit ≥ yit − yi,t−1, all i, t
     zit ≤ yit, all i, t
     zit ≤ 1 − yi,t−1, all i, t
     δijt ≥ yi,t−1 + yjt − 1, all i, j, t
     δijt ≤ yi,t−1, all i, j, t
     δijt ≤ yjt, all i, j, t
     xit ≤ C yit, all i, t
     ∑_i yit = 1, all t
     yit, zit, δijt ∈ {0,1}
     xit, sit ≥ 0
Slide 69
CP model

Minimize holding and setup costs:

min ∑_t ( q_{y_{t−1} y_t} + ∑_i hi sit )   (variable indices in the setup cost)
s.t. si,t−1 + xit = dit + sit, all i, t    (inventory balance)
     0 ≤ xit ≤ C, sit ≥ 0, all i, t       (production capacity)
     (yt ≠ i) → (xit = 0), all i, t

where xit = production level of product i in period t, and yt = product manufactured in period t.
Slide 71
Cumulative scheduling constraint

• Used for resource-constrained scheduling.
• Total resources consumed by jobs at any one time must not exceed L.

cumulative((t1,…,tn), (p1,…,pn), (c1,…,cn), L)

where t1,…,tn are the job start times (variables), p1,…,pn the job processing times, and c1,…,cn the job resource requirements.

Slide 72
Cumulative scheduling constraint

Minimize makespan (no deadlines, all release times = 0):

[Figure: five jobs packed under the resource limit L = 7; min makespan = 8.]

min z
s.t. cumulative((t1,…,t5), (3,3,3,5,5), (3,3,3,2,2), 7)
     z ≥ t1 + 3
     ⋮
     z ≥ t5 + 2

where (3,3,3,5,5) are the processing times, (3,3,3,2,2) the resources used, and t1,…,t5 the job start times.
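To make the constraint concrete, this Python sketch (illustrative, not from the tutorial) checks a candidate schedule against the resource limit and confirms a makespan of 8 for the five-job example:

def cumulative_ok(start, dur, res, limit):
    # Check the cumulative constraint: at every time point, the total resource
    # use of the jobs in progress must not exceed the limit.
    horizon = max(s + d for s, d in zip(start, dur))
    return all(
        sum(r for s, d, r in zip(start, dur, res) if s <= t < s + d) <= limit
        for t in range(horizon)
    )

# Processing times (3,3,3,5,5), resource use (3,3,3,2,2), limit 7.
# One min-makespan schedule (makespan 8):
starts = [0, 3, 5, 0, 0]
print(cumulative_ok(starts, [3, 3, 3, 5, 5], [3, 3, 3, 2, 2], 7))   # True
print(max(s + d for s, d in zip(starts, [3, 3, 3, 5, 5])))          # 8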
Slide 73
Modeling example: Ship loading

• Will use ILOG's OPL Studio modeling language.
• Example is from the OPL manual.
• The problem:
  • Load 34 items on the ship in minimum time (min makespan).
  • Each item requires a certain time and a certain number of workers.
  • Total of 8 workers available.

Slide 74
Problem data

Item | Duration | Labor      Item | Duration | Labor
1 | 3 | 4                    18 | 2 | 7
2 | 4 | 4                    19 | 1 | 4
3 | 4 | 3                    20 | 1 | 4
4 | 6 | 4                    21 | 1 | 4
5 | 5 | 5                    22 | 2 | 4
6 | 2 | 5                    23 | 4 | 7
7 | 3 | 4                    24 | 5 | 8
8 | 4 | 3                    25 | 2 | 8
9 | 3 | 4                    26 | 1 | 3
10 | 2 | 8                   27 | 1 | 3
11 | 3 | 4                   28 | 2 | 6
12 | 2 | 5                   29 | 1 | 8
13 | 1 | 4                   30 | 3 | 3
14 | 5 | 3                   31 | 2 | 3
15 | 2 | 3                   32 | 1 | 3
16 | 3 | 3                   33 | 2 | 3
17 | 2 | 6                   34 | 2 | 3

Slide 75
Precedence constraints

1 → 2,4 11 →13 22 →23


2 →3 12 →13 23 →24
3 →5,7 13 →15,16 24 →25
4 →5 14 →15 25 →26,30,31,32
5 →6 15 →18 26 → 27
6 →8 16 →17 27 → 28
7 →8 17 →18 28 → 29
8 →9 18 →19 30 → 28
9 →10 18 →20,21 31 → 28
9 →14 19 →23 32 → 33
10 →11 20 → 23 33 → 34
10 →12 21 → 22

Slide 76
Use the cumulative scheduling constraint.

min z
s.t. z ≥ t1 + 3, z ≥ t2 + 4, etc.
cumulative ( (t1,…, t34 ),(3,4,…,2),(4,4,…,3),8 )
t2 ≥ t1 + 3, t 4 ≥ t1 + 3, etc.

Slide 77
OPL model

int capacity = 8;
int nbTasks = 34;
range Tasks 1..nbTasks;
int duration[Tasks] = [3,4,4,6,…,2];
int totalDuration =
sum(t in Tasks) duration[t];
int demand[Tasks] = [4,4,3,4,…,3];
struct Precedences {
int before;
int after;
}
{Precedences} setOfPrecedences = {
<1,2>, <1,4>, …, <33,34> };

Slide 78
scheduleHorizon = totalDuration;
Activity a[t in Tasks](duration[t]);
DiscreteResource res(8);
Activity makespan(0);
minimize
makespan.end
subject to {
forall(t in Tasks)
a[t] precedes makespan;
forall(p in setOfPrecedences)
a[p.before] precedes a[p.after];
forall(t in Tasks)
a[t] requires(demand[t]) res;
};

Slide 79
Modeling example: Production scheduling with intermediate storage

[Figure: a manufacturing unit feeds storage tanks of capacities C1, C2, C3, which in turn feed packing units.]
Slide 80
Filling of storage tank

[Figure: tank level over time. Filling starts at time t (manufacturing rate r, batch size b) and ends at t + b/r; packing starts at u (packing rate s) and ends at u + b/s. The capacity constraint need be enforced only at the peak level.]
Slide 81
min T
s.t. T ≥ uj + bj/sj, all j                       (makespan)
     tj ≥ Rj, all j                              (job release time)
     cumulative(t, v, e, m)                      (m storage tanks)
     vi = ui + bi/si − ti, all i                 (job duration)
     bi(1 − si/ri) + si(ui − ti) ≤ Ci, all i     (tank capacity)
     cumulative(u, (b1/s1, …, bn/sn), e, p)      (p packing units)
     uj ≥ tj ≥ 0
     e = (1,…,1)
Slide 82
Modeling example: Employee scheduling

• Schedule four nurses in 8-hour shifts.


• A nurse works at most one shift a day, at least 5 days a week.
• Same schedule every week.
• No shift staffed by more than two different nurses in a week.
• A nurse cannot work different shifts on two consecutive days.
• A nurse who works shift 2 or 3 must do so at least two days in
a row.

Slide 83
Two ways to view the problem

Assign nurses to shifts


Sun Mon Tue Wed Thu Fri Sat
Shift 1 A B A A A A A
Shift 2 C C C B B B B
Shift 3 D D D D C C D

Assign shifts to nurses


Sun Mon Tue Wed Thu Fri Sat
Nurse A 1 0 1 1 1 1 1
Nurse B 0 1 0 2 2 2 2
Nurse C 2 2 2 0 3 3 0
Nurse D 3 3 3 3 0 0 3

0 = day off


Slide 84
Use both formulations in the same model!
First, assign nurses to shifts.
Let wsd = nurse assigned to shift s on day d

alldiff(w1d, w2d, w3d), all d

The variables w1d, w2d, w3d take different values; that is, schedule 3 different nurses on each day.

Slide 85
Use both formulations in the same model!
First, assign nurses to shifts.
Let wsd = nurse assigned to shift s on day d

alldiff(w1d, w2d, w3d), all d
cardinality(w | (A,B,C,D), (5,5,5,5), (6,6,6,6))

A occurs at least 5 and at most 6 times in the array w, and similarly for B, C, D. That is, each nurse works at least 5 and at most 6 days a week.

Slide 86
Use both formulations in the same model!
First, assign nurses to shifts.
Let wsd = nurse assigned to shift s on day d

alldiff(w1d, w2d, w3d), all d
cardinality(w | (A,B,C,D), (5,5,5,5), (6,6,6,6))
nvalues(w_{s,Sun}, …, w_{s,Sat} | 1, 2), all s

The variables w_{s,Sun}, …, w_{s,Sat} take at least 1 and at most 2 different values. That is, at least 1 and at most 2 nurses work any given shift.
Slide 87
Remaining constraints are not easily expressed in this
notation.
So, assign shifts to nurses.
Let yid = shift assigned to nurse i on day d

alldiff(y1d, y2d, y3d), all d

Assign a different nurse to each shift on each day. This constraint is redundant of previous constraints, but redundant constraints speed solution.
Slide 88
Remaining constraints are not easily expressed in this
notation.
So, assign shifts to nurses.
Let yid = shift assigned to nurse i on day d

alldiff(y1d, y2d, y3d), all d
stretch(y_{i,Sun}, …, y_{i,Sat} | (2,3), (2,2), (6,6), P), all i

Every stretch of 2's has length between 2 and 6, and every stretch of 3's has length between 2 and 6. So a nurse who works shift 2 or 3 must do so at least two days in a row.

Slide 89
Remaining constraints are not easily expressed in this
notation.
So, assign shifts to nurses.
Let yid = shift assigned to nurse i on day d

alldiff(y1d, y2d, y3d), all d
stretch(y_{i,Sun}, …, y_{i,Sat} | (2,3), (2,2), (6,6), P), all i

Here P = {(s,0), (0,s) | s = 1,2,3}. Whenever a stretch of a's immediately precedes a stretch of b's, (a,b) must be one of the pairs in P. So a nurse cannot switch shifts without taking at least one day off.
Slide 90
Now we must connect the wsd variables to the yid variables.
Use channeling constraints:

w_{y_{id}, d} = i, all i, d
y_{w_{sd}, d} = s, all s, d

Channeling constraints increase propagation and make the problem easier to solve.

Slide 91
The complete model is:

alldiff(w1d, w2d, w3d), all d
cardinality(w | (A,B,C,D), (5,5,5,5), (6,6,6,6))
nvalues(w_{s,Sun}, …, w_{s,Sat} | 1, 2), all s

alldiff(y1d, y2d, y3d), all d
stretch(y_{i,Sun}, …, y_{i,Sat} | (2,3), (2,2), (6,6), P), all i

w_{y_{id}, d} = i, all i, d
y_{w_{sd}, d} = s, all s, d

Slide 92
CP Filtering Algorithms

Element
Alldiff
Disjunctive Scheduling
Cumulative Scheduling

Slide 93
Filtering for element

element(y, (x1,…,xn), z)

Variable domains can be easily filtered to maintain hyperarc consistency:

Dz ← Dz ∩ ( ∪_{j∈Dy} Dxj )
Dy ← Dy ∩ { j | Dz ∩ Dxj ≠ ∅ }
Dxj ← Dz if Dy = {j}; Dxj otherwise

Slide 94
Filtering for element

Example… element(y, (x1, x2, x3, x4), z)

Initial domains:                Reduced domains:
Dz = {20,30,60,80,90}           Dz = {80,90}
Dy = {1,3,4}                    Dy = {3}
Dx1 = {10,50}                   Dx1 = {10,50}
Dx2 = {10,20}                   Dx2 = {10,20}
Dx3 = {40,50,80,90}             Dx3 = {80,90}
Dx4 = {40,50,70}                Dx4 = {40,50,70}
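The reduced domains can be reproduced by coding the three rules directly; this Python sketch (illustrative, added in this rewrite) applies one pass of the filtering:

def filter_element(Dy, Dx, Dz):
    # Hyperarc-consistency filtering for element(y, (x1,...,xn), z).
    # Dx maps each index j to the domain of xj; all domains are Python sets.
    Dz = Dz & set().union(*(Dx[j] for j in Dy))   # z must equal some xj, j in Dy
    Dy = {j for j in Dy if Dz & Dx[j]}            # keep indices that can match z
    if len(Dy) == 1:                              # y is fixed, so that xj equals z
        j = next(iter(Dy))
        Dx = dict(Dx, **{j: Dz})
    return Dy, Dx, Dz

Dy, Dx, Dz = filter_element(
    {1, 3, 4},
    {1: {10, 50}, 2: {10, 20}, 3: {40, 50, 80, 90}, 4: {40, 50, 70}},
    {20, 30, 60, 80, 90},
)
print(Dy, Dz, Dx[3])   # {3} {80, 90} {80, 90}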

Slide 95
Filtering for alldiff
alldiff ( y1,…, y n )

Domains can be filtered with an algorithm based on maximum


cardinality bipartite matching and a theorem of Berge.
It is a special case of optimality conditions for max flow.

Slide 96
Filtering for alldiff

Consider the domains

y1 ∈ {1}
y 2 ∈ {2,3,5}
y 3 ∈ {1,2,3,5}
y 4 ∈ {1,5}
y 5 ∈ {1,2,3,4,5,6}

Slide 97
Indicate domains with edges

[Figure on Slides 98-105: bipartite graph linking variables y1,…,y5 to values 1,…,6, one edge per domain value.]

Find a maximum cardinality bipartite matching.
Mark edges in alternating paths that start at an uncovered vertex.
Mark edges in alternating cycles.
Remove unmarked edges not in the matching.


Slide 105
Filtering for alldiff

Domains have been filtered:

y1 ∈ {1}            →  y1 ∈ {1}
y2 ∈ {2,3,5}        →  y2 ∈ {2,3}
y3 ∈ {1,2,3,5}      →  y3 ∈ {2,3}
y4 ∈ {1,5}          →  y4 ∈ {5}
y5 ∈ {1,2,3,4,5,6}  →  y5 ∈ {4,6}

Hyperarc consistency achieved.
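For intuition, the same filtering can be checked by brute force. The sketch below is illustrative only (Régin's matching-based algorithm achieves this in polynomial time; this version is exponential): a value survives only if it occurs in some all-different assignment.

from itertools import permutations

def filter_alldiff(domains):
    # Brute-force hyperarc consistency for alldiff: keep a value only if it
    # appears in some feasible all-different assignment.
    values = sorted(set().union(*domains))
    feasible = [
        assign for assign in permutations(values, len(domains))
        if all(v in dom for v, dom in zip(assign, domains))
    ]
    return [{assign[i] for assign in feasible} for i in range(len(domains))]

doms = [{1}, {2, 3, 5}, {1, 2, 3, 5}, {1, 5}, {1, 2, 3, 4, 5, 6}]
print(filter_alldiff(doms))
# [{1}, {2, 3}, {2, 3}, {5}, {4, 6}]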

Slide 106
Disjunctive scheduling

Consider a disjunctive scheduling constraint:


disjunctive ( (s1, s2 , s3 , s5 ),( p1, p2 , p3 , p5 ) )

Start time variables

Slide 107
Edge finding for disjunctive scheduling

Consider a disjunctive scheduling constraint:


disjunctive ( (s1, s2 , s3 , s5 ),( p1, p2 , p3 , p5 ) )

Processing times

Slide 108
Edge finding for disjunctive scheduling

Consider a disjunctive scheduling constraint:


disjunctive ( (s1, s2 , s3 , s5 ),( p1, p2 , p3 , p5 ) )
Variable domains defined by time
windows and processing times
s1 ∈ [0,10 − 1]
s2 ∈ [0,10 − 3]
s3 ∈ [2,7 − 3]
s5 ∈ [4,7 − 2]

Slide 109
Edge finding for disjunctive scheduling

Consider a disjunctive scheduling constraint:


disjunctive ( (s1, s2 , s3 , s5 ),( p1, p2 , p3 , p5 ) )

A feasible (min makespan) solution:

Time window

Slide 110
Edge finding for disjunctive scheduling

But let’s reduce 2 of the deadlines to 9:

Slide 111
Edge finding for disjunctive scheduling

But let’s reduce 2 of the deadlines to 9:

We will use edge finding


to prove that there is no
feasible schedule.

Slide 112
Edge finding for disjunctive scheduling

We can deduce that job 2 must precede jobs 3 and 5: 2 ≪ {3,5}.

Because if job 2 is not first, there is not enough time for all 3 jobs within the time windows:

L{2,3,5} − E{3,5} < p{2,3,5}

where L{2,3,5} is the latest deadline, E{3,5} the earliest release time, and p{2,3,5} the total processing time. Here 7 < 3 + 3 + 2.
Slide 116
Edge finding for disjunctive scheduling

We can deduce that job 2 must precede jobs 3 and 5: 2 ≪ {3,5}.

So we can tighten the deadline of job 2 to the minimum of

L{3} − p{3} = 4,  L{5} − p{5} = 5,  L{3,5} − p{3,5} = 2

Since the time window of job 2 is now too narrow, there is no feasible schedule.
Slide 117
Edge finding for disjunctive scheduling

In general, we can deduce that job k must precede all the jobs in set J (k ≪ J) if there is not enough time for all the jobs after the earliest release time of the jobs in J:

L_{J∪{k}} − E_J < p_{J∪{k}}    (e.g., L{2,3,5} − E{3,5} < p{2,3,5})

Slide 118
Edge finding for disjunctive scheduling

In general, we can deduce that job k must precede all the jobs in set J (k ≪ J) if there is not enough time for all the jobs after the earliest release time of the jobs in J:

L_{J∪{k}} − E_J < p_{J∪{k}}

Now we can tighten the deadline for job k to

min_{J′⊆J} { L_{J′} − p_{J′} }    (e.g., L{3,5} − p{3,5} = 2)

Slide 119
Edge finding for disjunctive scheduling

There is a symmetric rule (k ≫ J): if there is not enough time for all the jobs before the latest deadline of the jobs in J,

L_J − E_{J∪{k}} < p_{J∪{k}}

then we can tighten the release date for job k to

max_{J′⊆J} { E_{J′} + p_{J′} }
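The test and the deadline update can be stated directly in code. This Python sketch (illustrative; real implementations avoid the subset enumeration, as the next slides explain) reproduces the example above:

from itertools import combinations

def must_precede(k, J, E, L, p):
    # Edge-finding test: job k must precede every job in J if
    # L(J + {k}) - E(J) < p(J + {k}).
    return max(L[j] for j in J | {k}) - min(E[j] for j in J) < sum(p[j] for j in J | {k})

def tightened_deadline(k, J, L, p):
    # Once k << J is established, k's deadline drops to the minimum over
    # nonempty subsets J' of J of L(J') - p(J').
    best = L[k]
    for r in range(1, len(J) + 1):
        for Jp in combinations(J, r):
            best = min(best, max(L[j] for j in Jp) - sum(p[j] for j in Jp))
    return best

# Slide example: jobs 2, 3, 5 with release times E, reduced deadlines L, times p.
E, L, p = {2: 0, 3: 2, 5: 4}, {2: 9, 3: 7, 5: 7}, {2: 3, 3: 3, 5: 2}
print(must_precede(2, {3, 5}, E, L, p))      # True: 9 - 2 = 7 < 8
print(tightened_deadline(2, {3, 5}, L, p))   # 2: window [0,2] too short for p2 = 3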

Slide 120
Edge finding for disjunctive scheduling

Problem: how can we avoid enumerating all subsets J of jobs to find edges (L_{J∪{k}} − E_J < p_{J∪{k}}), and all subsets J′ of J to tighten the bounds (min_{J′⊆J} { L_{J′} − p_{J′} })?

Slide 121
Edge finding for disjunctive scheduling

Key result: we only have to consider sets J whose time windows lie within some interval (e.g., J = {3,5}).

Slide 122
Edge finding for disjunctive scheduling

Key result: we only have to consider sets J whose time windows lie within some interval (e.g., J = {3,5}).

Removing a job from those within an interval only weakens the test L_{J∪{k}} − E_J < p_{J∪{k}}. There are a polynomial number of intervals defined by release times and deadlines.
Slide 123
Edge finding for disjunctive scheduling

Key result: we only have to consider sets J whose time windows lie within some interval (e.g., J = {3,5}).

Note: edge finding does not achieve bounds consistency, which is an NP-hard problem.

Slide 124
Edge finding for disjunctive scheduling
One O(n²) algorithm is based on the Jackson preemptive schedule (JPS). Using a different example, the JPS is:

[Figure: the Jackson preemptive schedule for the example.]

Slide 125
Edge finding for disjunctive scheduling
One O(n²) algorithm is based on the Jackson preemptive schedule (JPS). Using a different example:

For each job i:
  Scan jobs k ∈ Ji in decreasing order of Lk,
    where Ji = the jobs unfinished at time Ei in the JPS.
  Select the first k for which Lk − Ei < pi + p_{Jik},
    where Jik = the jobs j ≠ i in Ji with Lj ≤ Lk.
  Conclude that i ≫ Jik.
  Update Ei to JPS(i,k), the latest completion time in the JPS of the jobs in Jik.

Slide 126
Not-first/not-last rules
We can deduce that job 4 cannot precede jobs 1 and 2: ¬(4 ≪ {1,2}).

Because if job 4 is first, there is too little time to complete the jobs before the later deadline of jobs 1 and 2:

L{1,2} − E4 < p1 + p2 + p4    (6 < 1 + 3 + 3)
Slide 127
Not-first/not-last rules
We can deduce that job 4 cannot precede jobs 1 and 2: ¬(4 ≪ {1,2}).

Now we can tighten the release time of job 4 to the minimum of

E1 + p1 = 3,  E2 + p2 = 4
Slide 128
Not-first/not-last rules
In general, we can deduce that job k cannot precede all the jobs in J (¬(k ≪ J)) if there is too little time after the release time of job k to complete all jobs before the latest deadline in J:

L_J − E_k < p_{J∪{k}}

Now we can update Ek to min_{j∈J} { Ej + pj }.

Slide 129
Not-first/not-last rules
There is a symmetric not-last rule. The rules can be applied in polynomial time, although an efficient algorithm is quite complicated.

Slide 130
Cumulative scheduling

Consider a cumulative scheduling constraint:

cumulative((s1,s2,s3), (p1,p2,p3), (c1,c2,c3), C)

A feasible solution:

[Figure: resource profile of the three jobs under capacity C.]

Slide 131
Edge finding for cumulative scheduling

We can deduce that job 3 must finish after the others finish: 3 ≫ {1,2}.

Because the total energy required exceeds the area between the earliest release time and the later deadline of jobs 1, 2:

e3 + e{1,2} > C ⋅ (L{1,2} − E{1,2,3})

[Figure: total energy required = 22; area available = 20.]


Slide 134
Edge finding for cumulative scheduling

We can deduce that job 3 must finish after the others finish: 3 ≫ {1,2}. We can update the release time of job 3 to

E{1,2} + ( e{1,2} − (C − c3)(L{1,2} − E{1,2}) ) / c3

[Figure: energy available for jobs 1, 2 if space is left for job 3 to start anytime = 10; excess energy required by jobs 1, 2 = 4; so the release time of job 3 moves up 4/c3 = 4/2 = 2 units beyond E{1,2}.]
Slide 137
Edge finding for cumulative scheduling

In general, if e_{J∪{k}} > C ⋅ (L_J − E_{J∪{k}}), then k ≫ J, and update Ek to

max over J′ ⊆ J with e_{J′} − (C − ck)(L_{J′} − E_{J′}) > 0 of
E_{J′} + ( e_{J′} − (C − ck)(L_{J′} − E_{J′}) ) / ck

Symmetrically, if e_{J∪{k}} > C ⋅ (L_{J∪{k}} − E_J), then k ≪ J, and update Lk to

min over J′ ⊆ J with e_{J′} − (C − ck)(L_{J′} − E_{J′}) > 0 of
L_{J′} − ( e_{J′} − (C − ck)(L_{J′} −E_{J′}) ) / ck
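The release-time rule can be coded directly. In the sketch below, the job data are invented to match the totals in the figure (C = 4; jobs 1 and 2 in window [0,5] with energies 9 and 5; job 3 with p = 4, c = 2, energy 8), so it is illustrative rather than taken from the slides:

def release_update(k, J, E, L, p, c, C):
    # Cumulative edge finding: if e(J + {k}) > C*(L(J) - E(J + {k})), then
    # k >> J, and E[k] can be raised using the excess energy of J. This is
    # the naive version with J' = J; an O(n^2) algorithm finds all updates.
    e = lambda S: sum(p[j] * c[j] for j in S)
    EJ, LJ = min(E[j] for j in J), max(L[j] for j in J)
    assert e(J | {k}) > C * (LJ - min(E[j] for j in J | {k}))   # k finishes last
    excess = e(J) - (C - c[k]) * (LJ - EJ)
    return max(E[k], EJ + excess / c[k]) if excess > 0 else E[k]

E, L = {1: 0, 2: 0, 3: 0}, {1: 5, 2: 5, 3: 10}
p, c = {1: 3, 2: 5, 3: 4}, {1: 3, 2: 1, 3: 2}
print(release_update(3, {1, 2}, E, L, p, c, 4))   # 2.0: release time moves up 2 units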

Slide 138
Edge finding for cumulative scheduling

There is an O(n²) algorithm that finds all applications of the edge finding rules.

Slide 139
Other propagation rules for cumulative
scheduling

• Extended edge finding.


• Timetabling.
• Not-first/not-last rules.
• Energetic reasoning.

Slide 140
Linear Relaxation
Why Relax?
Algebraic Analysis of LP
Linear Programming Duality
LP-Based Domain Filtering
Example: Single-Vehicle Routing
Disjunctions of Linear Systems
Slide 141
Why Relax?
Solving a relaxation of a problem can:

• Tighten variable bounds.


• Possibly solve original problem.
• Guide the search in a promising direction.
• Filter domains using reduced costs or Lagrange multipliers.
• Prune the search tree using a bound on the optimal value.
• Provide a more global view, because a single OR relaxation
can pool relaxations of several constraints.

Slide 142
Some OR models that can provide relaxations:

• Linear programming (LP).


• Mixed integer linear programming (MILP)
– Can itself be relaxed as an LP.
– LP relaxation can be strengthened with cutting planes.
• Lagrangean relaxation.
• Specialized relaxations.
– For particular problem classes.
– For global constraints.

Slide 143
Motivation

• Linear programming is remarkably versatile for representing


real-world problems.
• LP is by far the most widely used tool for relaxation.
• LP relaxations can be strengthened by cutting planes.
- Based on polyhedral analysis.
• LP has an elegant and powerful duality theory.
- Useful for domain filtering, and much else.
• The LP problem is extremely well solved.

Slide 144
Algebraic Analysis of LP

An example…

min 4x1 + 7x2
2x1 + 3x2 ≥ 6
2x1 + x2 ≥ 4
x1, x2 ≥ 0

[Figure: the feasible region; the optimal solution is x = (3,0), where 4x1 + 7x2 = 12.]


Slide 145
Algebraic Analysis of LP

Rewrite

min 4x1 + 7x2              min 4x1 + 7x2
2x1 + 3x2 ≥ 6       as     2x1 + 3x2 − x3 = 6
2x1 + x2 ≥ 4               2x1 + x2 − x4 = 4
x1, x2 ≥ 0                 x1, x2, x3, x4 ≥ 0

In general an LP has the form min { cx : Ax = b, x ≥ 0 }.

Slide 146
Algebraic analysis of LP

Write min { cx : Ax = b, x ≥ 0 } as

min cB xB + cN xN
s.t. BxB + NxN = b
     xB, xN ≥ 0

where A = [B N] is an m × n matrix and B is any set of m linearly independent columns of A (these form a basis for the space spanned by the columns); xB are the basic variables and xN the nonbasic variables.

Slide 147
Algebraic analysis of LP

Solve the constraint equation for xB: xB = B⁻¹b − B⁻¹N xN.

All solutions can be obtained by setting xN to some value. The solution is basic if xN = 0. It is a basic feasible solution if xN = 0 and xB ≥ 0.

Slide 148
Example…

min 4x1 + 7x2
2x1 + 3x2 − x3 = 6
2x1 + x2 − x4 = 4
x1, x2, x3, x4 ≥ 0

[Figure: the feasible region in the (x1, x2) plane. Each choice of two basic variables ((x2,x3), (x2,x4), (x1,x2), (x1,x4), (x3,x4), (x1,x3)) corresponds to an intersection of constraint lines; the basic feasible solutions are the vertices of the feasible region.]
Slide 149
Algebraic analysis of LP

With xB = B⁻¹b − B⁻¹N xN, express the cost in terms of the nonbasic variables:

cB B⁻¹b + (cN − cB B⁻¹N) xN

where cN − cB B⁻¹N is the vector of reduced costs. Since xN ≥ 0, the basic solution (xB, 0) is optimal if the reduced costs are nonnegative.
Slide 150
Example…

min 4x1 + 7x2
2x1 + 3x2 − x3 = 6
2x1 + x2 − x4 = 4
x1, x2, x3, x4 ≥ 0

[Figure: consider the basic feasible solution with x1, x4 basic.]
Slide 151
Example…

With x1, x4 basic and x2, x3 nonbasic, write the problem as min cB xB + cN xN:

min [4 0] [x1; x4] + [7 0] [x2; x3]
s.t. [2 0; 2 −1] [x1; x4] + [3 −1; 1 0] [x2; x3] = [6; 4]
     [x1; x4], [x2; x3] ≥ 0

so that B = [2 0; 2 −1], N = [3 −1; 1 0], and b = [6; 4].
Slide 152
Example…

The basic solution is

xB = B⁻¹b − B⁻¹N xN = B⁻¹b
[x1; x4] = [1/2 0; 1 −1] [6; 4] = [3; 2]

[Figure: the corresponding vertex of the feasible region, with x1, x4 basic.]
Slide 154
Example…

The basic solution is xB = B⁻¹b = [3; 2]. The reduced costs are

cN − cB B⁻¹N = [7 0] − [4 0] [1/2 0; 1 −1] [3 −1; 1 0] = [1 2] ≥ [0 0]

so the solution is optimal.
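The computation is easy to verify numerically; this NumPy sketch (illustrative, added in this rewrite) recovers the basic solution, the dual solution, and the reduced costs:

import numpy as np

# Basis check for: min 4x1 + 7x2 s.t. 2x1 + 3x2 - x3 = 6, 2x1 + x2 - x4 = 4,
# x >= 0, with x1, x4 basic and x2, x3 nonbasic.
A = np.array([[2.0, 3.0, -1.0, 0.0],
              [2.0, 1.0, 0.0, -1.0]])
c = np.array([4.0, 7.0, 0.0, 0.0])
b = np.array([6.0, 4.0])
basic, nonbasic = [0, 3], [1, 2]

B, N = A[:, basic], A[:, nonbasic]
xB = np.linalg.solve(B, b)              # basic solution: [3, 2]
lam = c[basic] @ np.linalg.inv(B)       # dual solution cB B^-1: [2, 0]
reduced = c[nonbasic] - lam @ N         # reduced costs: [1, 2] >= 0, so optimal
print(xB, lam, reduced)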
Slide 155
Linear Programming Duality

An LP can be viewed as an inference problem…

min { cx : Ax ≥ b, x ≥ 0 }  =  max { v : Ax ≥ b implies cx ≥ v for all x ≥ 0 }

Dual problem: find the tightest lower bound on the objective function that is implied by the constraints.

Slide 156
An LP can be viewed as an inference problem…

Ax ≥ b implies cx ≥ v (for x ≥ 0) exactly when some surrogate (nonnegative linear combination) λAx ≥ λb of Ax ≥ b dominates cx ≥ v.

From the Farkas Lemma: if Ax ≥ b, x ≥ 0 is feasible, then Ax ≥ b implies cx ≥ v iff λA ≤ c and λb ≥ v for some λ ≥ 0.
Slide 157
An LP can be viewed as an inference problem…

min { cx : Ax ≥ b, x ≥ 0 }  =  max { v : Ax ≥ b implies cx ≥ v }  =  max { λb : λA ≤ c, λ ≥ 0 }

The last problem is the classical LP dual. It follows from the Farkas Lemma: if Ax ≥ b, x ≥ 0 is feasible, then Ax ≥ b implies cx ≥ v iff λAx ≥ λb dominates cx ≥ v for some λ ≥ 0, i.e., λA ≤ c and λb ≥ v.
Slide 158
This equality is called strong duality:

min { cx : Ax ≥ b, x ≥ 0 }  =  max { λb : λA ≤ c, λ ≥ 0 }   if Ax ≥ b, x ≥ 0 is feasible.

Note that the dual of the dual is the primal (i.e., the original LP).

Slide 159
Example

Primal:                      Dual:
min 4x1 + 7x2 = 12           max 6λ1 + 4λ2 = 12
2x1 + 3x2 ≥ 6   (λ1)         2λ1 + 2λ2 ≤ 4   (x1)
2x1 + x2 ≥ 4    (λ2)         3λ1 + λ2 ≤ 7    (x2)
x1, x2 ≥ 0                   λ1, λ2 ≥ 0

A dual solution is (λ1, λ2) = (2, 0). Multiplying by the dual multipliers,

2x1 + 3x2 ≥ 6   ⋅ (λ1 = 2)
2x1 + x2 ≥ 4    ⋅ (λ2 = 0)

gives the surrogate 4x1 + 6x2 ≥ 12, which dominates 4x1 + 7x2 ≥ 12, the tightest bound on cost.
Slide 160
Weak Duality

If x* is feasible in the primal problem min { cx : Ax ≥ b, x ≥ 0 } and λ* is feasible in the dual problem max { λb : λA ≤ c, λ ≥ 0 }, then cx* ≥ λ*b.

This is because cx* ≥ λ*Ax* ≥ λ*b: the first inequality holds since λ*A ≤ c and x* ≥ 0, the second since Ax* ≥ b and λ* ≥ 0.

Slide 161
Dual multipliers as marginal costs

Suppose we perturb the RHS of an LP (i.e., change the requirement levels):

min { cx : Ax ≥ b + ∆b, x ≥ 0 }

The dual of the perturbed LP has the same constraints as the original dual:

max { λ(b + ∆b) : λA ≤ c, λ ≥ 0 }

So an optimal solution λ* of the original dual is feasible in the perturbed dual.

Slide 162
Dual multipliers as marginal costs

Suppose we perturb the RHS of an LP: min { cx : Ax ≥ b + ∆b, x ≥ 0 }.

By weak duality, the optimal value of the perturbed LP is at least λ*(b + ∆b) = λ*b + λ*∆b, where λ*b is the optimal value of the original LP (by strong duality).

So λi* is a lower bound on the marginal cost of increasing the i-th requirement by one unit (∆bi = 1). If λi* > 0, the i-th constraint must be tight (complementary slackness).

Slide 163
Dual of an LP in equality form

Primal:                    Dual:
min cB xB + cN xN          max λb
BxB + NxN = b   (λ)        λB ≤ cB   (xB)
xB, xN ≥ 0                 λN ≤ cN   (xN)
                           λ unrestricted

Slide 164
Dual of an LP in equality form

Primal: min { cB xB + cN xN : BxB + NxN = b, xB, xN ≥ 0 }   (dual variables λ)
Dual:   max { λb : λB ≤ cB (xB), λN ≤ cN (xN), λ unrestricted }

Recall that the reduced cost vector is cN − cB B⁻¹N = cN − λN, where λ = cB B⁻¹ solves the dual if (xB, 0) solves the primal.

Check: λB = cB B⁻¹B = cB, and λN = cB B⁻¹N ≤ cN because the reduced cost is nonnegative at the optimal solution (xB, 0).

In the example,

λ = cB B⁻¹ = [4 0] [1/2 0; 1 −1] = [2 0]

Note that the reduced cost of an individual variable xj is rj = cj − λAj, where Aj is column j of A.


Slide 168
LP-based Domain Filtering

Let min { cx : Ax ≥ b, x ≥ 0 } be an LP relaxation of a CP problem.

• One way to filter the domain of xj is to minimize and maximize xj subject to Ax ≥ b, x ≥ 0.
  – This is time consuming.
• A faster method is to use dual multipliers to derive valid inequalities.
  – A special case of this method uses reduced costs to bound or fix variables.
  – Reduced-cost variable fixing is a widely used technique in OR.

Slide 169
Suppose:

min { cx : Ax ≥ b, x ≥ 0 } has optimal solution x*, optimal value v*, and optimal dual solution λ*;

…and λi* > 0, which means the i-th constraint is tight (complementary slackness);
…and the LP is a relaxation of a CP problem;
…and we have a feasible solution of the CP problem with value U, so that U is an upper bound on the optimal value.

Slide 170
Supposing min { cx : Ax ≥ b, x ≥ 0 } has optimal solution x*, optimal value v*, and optimal dual solution λ*:

If x were to change to a value other than x*, the LHS of the i-th constraint Aix ≥ bi would change by some amount ∆bi. Since the constraint is tight, this would increase the optimal value as much as changing the constraint to Aix ≥ bi + ∆bi. So it would increase the optimal value at least λi*∆bi.

Slide 171
We have found: a change in x that changes Aix by ∆bi increases the optimal value of the LP by at least λi*∆bi. Since the optimal value of the LP ≤ the optimal value of the CP ≤ U, we have λi*∆bi ≤ U − v*, or

∆bi ≤ (U − v*) / λi*

Slide 172
Since ∆bi = Aix − Aix* = Aix − bi, this implies the inequality

Aix ≤ bi + (U − v*) / λi*

…which can be propagated.
Slide 173
Example

min 4x1 + 7x2
2x1 + 3x2 ≥ 6   (λ1 = 2)
2x1 + x2 ≥ 4    (λ2 = 0)
x1, x2 ≥ 0

Suppose we have a feasible solution of the original CP with value U = 13. Since the first constraint is tight, we can propagate the inequality

A1x ≤ b1 + (U − v*)/λ1*,  i.e.,  2x1 + 3x2 ≤ 6 + (13 − 12)/2 = 6.5
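The whole derivation can be reproduced with an LP solver. The sketch below is illustrative and assumes a recent SciPy whose HiGHS-based linprog exposes dual values via res.ineqlin.marginals (with a sign flip, since the ≥ rows are passed as negated ≤ rows):

import numpy as np
from scipy.optimize import linprog

# LP relaxation: min 4x1 + 7x2 s.t. 2x1 + 3x2 >= 6, 2x1 + x2 >= 4, x >= 0.
res = linprog(c=[4, 7], A_ub=[[-2, -3], [-2, -1]], b_ub=[-6, -4],
              bounds=[(0, None), (0, None)], method="highs")
v_star = res.fun                             # 12.0
lam = -np.asarray(res.ineqlin.marginals)     # duals of the >= rows: [2, 0]

U = 13.0                                     # value of a known CP-feasible solution
b = [6.0, 4.0]
for i in range(2):
    if lam[i] > 1e-9:                        # tight constraint, positive multiplier
        print("row", i + 1, "can be propagated with RHS", b[i] + (U - v_star) / lam[i])
# row 1: propagate 2x1 + 3x2 <= 6.5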

Slide 174
Reduced-cost domain filtering

Suppose xj* = 0, which means the constraint xj ≥ 0 is tight. The inequality Aix ≤ bi + (U − v*)/λi* becomes

xj ≤ (U − v*) / rj

The dual multiplier for xj ≥ 0 is the reduced cost rj of xj, because increasing xj (currently 0) by 1 increases the optimal cost by rj.

Similar reasoning can bound a variable below when it is at its upper bound.

Slide 175
Example

min 4x1 + 7x2
2x1 + 3x2 ≥ 6   (λ1 = 2)
2x1 + x2 ≥ 4    (λ2 = 0)
x1, x2 ≥ 0

Suppose we have a feasible solution of the original CP with value U = 13. Since x2* = 0, we have

x2 ≤ (U − v*)/r2 = (13 − 12)/2 = 0.5

If x2 is required to be integer, we can fix it to zero. This is reduced-cost variable fixing.

Slide 176
Example: Single-Vehicle Routing

A vehicle must make several stops and return home, perhaps subject to time windows. The objective is to find the order of stops that minimizes travel time. This is also known as the traveling salesman problem (with time windows).

[Figure: stops i and j with travel time cij.]

Slide 177
Assignment Relaxation

Let xij = 1 if stop i immediately precedes stop j.

min ∑_{ij} cij xij
s.t. ∑_j xij = ∑_j xji = 1, all i   (stop i is preceded and followed by exactly one stop)
     xij ∈ {0,1}, all i, j

Slide 178
Assignment Relaxation

Relax the integrality constraint:

min ∑_{ij} cij xij
s.t. ∑_j xij = ∑_j xji = 1, all i
     0 ≤ xij ≤ 1, all i, j

Because this problem is totally unimodular, it can be solved as an LP. The relaxation provides a very weak lower bound on the optimal value, but reduced-cost variable fixing can be very useful in a CP context.
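Because of total unimodularity, the LP optimum is integral, so a combinatorial assignment solver gives the same bound. A small sketch with made-up travel times (illustrative, not from the tutorial):

import numpy as np
from scipy.optimize import linear_sum_assignment

# Assignment relaxation of a 5-stop routing problem with random travel times.
rng = np.random.default_rng(0)
c = rng.integers(1, 10, size=(5, 5)).astype(float)
np.fill_diagonal(c, 1e9)               # a large cost forbids a stop preceding itself

rows, cols = linear_sum_assignment(c)  # integral optimum of the LP relaxation
print(cols, c[rows, cols].sum())       # successor of each stop; lower bound on the tour
# The assignment may contain subtours, which is why the bound is weak.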

Slide 179
Disjunctions of linear systems

Disjunctions of linear systems often occur naturally in problems and can be given a convex hull relaxation. A disjunction of linear systems represents a union of polyhedra:

min cx
s.t. ∨_k (A^k x ≥ b^k)

Slide 180
Relaxing a disjunction of linear systems

We want a convex hull relaxation (the tightest linear relaxation) of

min cx
s.t. ∨_k (A^k x ≥ b^k)

Slide 181
Relaxing a disjunction of linear systems

The closure of the convex hull of min { cx : ∨_k (A^k x ≥ b^k) } is described by

min cx
s.t. A^k x^k ≥ b^k y^k, all k
     ∑_k y^k = 1
     x = ∑_k x^k
     0 ≤ y^k ≤ 1
Slide 182
Why?

To derive the convex hull relaxation of a disjunction, write each
solution x as a convex combination of points x^k in the polyhedra:

min cx
Ak x^k ≥ bk, all k
∑k yk = 1
x = ∑k yk x^k
0 ≤ yk ≤ 1

This gives the convex hull relaxation (tightest linear relaxation).

LSE tutorial, June 2007
Slide 183
Why?

The change of variable x^k ← yk x^k linearizes the model:

min cx
Ak x^k ≥ bk yk, all k
∑k yk = 1
x = ∑k x^k
0 ≤ yk ≤ 1

LSE tutorial, June 2007
Slide 184
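A runnable sketch of this disaggregated (convex hull) relaxation for the one-dimensional disjunction x ∈ [0,1] ∨ x ∈ [3,4], with an extra side constraint x ≥ 2 added to show that the relaxation can be strictly looser than the disjunction itself; the instance is invented.

```python
# Sketch: hull relaxation of x in [0,1] or x in [3,4], minimizing x
# subject to x >= 2. Variables z = [x, x1, x2, y1, y2]. The true
# disjunctive optimum is 3; the relaxation yields 2 (y relaxed to [0,1]).
from scipy.optimize import linprog

c = [1, 0, 0, 0, 0]
A_ub = [[0, 1, 0, -1, 0],    # x1 <= 1*y1   (A1 x1 >= b1 y1 in <= form)
        [0, 0, -1, 0, 3],    # x2 >= 3*y2
        [0, 0, 1, 0, -4],    # x2 <= 4*y2
        [-1, 0, 0, 0, 0]]    # x >= 2 (side constraint)
b_ub = [0, 0, 0, -2]
A_eq = [[1, -1, -1, 0, 0],   # x = x1 + x2
        [0, 0, 0, 1, 1]]     # y1 + y2 = 1
b_eq = [0, 1]
bounds = [(0, None)] * 3 + [(0, 1)] * 2

res = linprog(c, A_ub, b_ub, A_eq, b_eq, bounds)
print(res.fun)               # 2.0: a valid lower bound on the true optimum 3
```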
Mixed Integer/Linear Modeling

MILP Representability
Disjunctive Modeling
Knapsack Modeling

LSE tutorial, June 2007
Slide 185
Motivation

A mixed integer/linear programming (MILP) problem has the form

min cx + dy
Ax + By ≥ b
x, y ≥ 0
y integer

• We can relax a CP problem by modeling some constraints with an MILP.
• If desired, we can then relax the MILP by dropping the integrality
  constraint, to obtain an LP.
• The LP relaxation can be strengthened with cutting planes.
• The first step is to learn how to write MILP models.

LSE tutorial, June 2007
Slide 186
MILP Representability

A subset S of Rⁿ is MILP representable if it is the projection onto x
of some MILP constraint set of the form

Ax + Bu + Dy ≥ b
x, y ≥ 0
x ∈ Rⁿ, u ∈ Rᵐ, yk ∈ {0,1}

LSE tutorial, June 2007
Slide 187
MILP Representability

A subset S of Rⁿ is MILP representable if it is the projection onto x
of some MILP constraint set of the form

Ax + Bu + Dy ≥ b
x, y ≥ 0
x ∈ Rⁿ, u ∈ Rᵐ, yk ∈ {0,1}

Theorem. S ⊂ Rⁿ is MILP representable if and only if S is the union of
finitely many polyhedra having the same recession cone.

[Figure: a polyhedron and its recession cone.]

LSE tutorial, June 2007
Slide 188
Example: Fixed charge function

Minimize a fixed charge function:

min x2
x2 ≥ 0         if x1 = 0
x2 ≥ f + cx1   if x1 > 0
x1 ≥ 0

LSE tutorial, June 2007
Slide 189
Example

min x2
x2 ≥ 0         if x1 = 0
x2 ≥ f + cx1   if x1 > 0
x1 ≥ 0

[Figure: the feasible set in (x1, x2) space is the union of two
polyhedra: P1, the vertical ray x1 = 0, x2 ≥ 0, and P2, the region on
or above the line x2 = f + cx1 for x1 > 0.]

LSE tutorial, June 2007
Slides 190–192
Example

min x2
x2 ≥ 0         if x1 = 0
x2 ≥ f + cx1   if x1 > 0
x1 ≥ 0

The polyhedra P1 and P2 have different recession cones.

[Figure: P1, P2, and their recession cones.]

LSE tutorial, June 2007
Slide 193
Example

Add an upper bound on x1:

0 ≤ x1 ≤ M

Now the polyhedra P1 and P2 have the same recession cone.

[Figure: P1, P2, and their (identical) recession cones.]

LSE tutorial, June 2007
Slide 194
Modeling a union of polyhedra

Start with a disjunction of linear systems to represent the union of
polyhedra:

min cx
∨k (Akx ≥ bk)

The kth polyhedron is {x | Akx ≥ bk}.

Introduce a 0-1 variable yk that is 1 when x is in polyhedron k.
Disaggregate x to create an x^k for each k:

min cx
Ak x^k ≥ bk yk, all k
∑k yk = 1
x = ∑k x^k
yk ∈ {0,1}

LSE tutorial, June 2007
Slide 195
Example

Start with a disjunction of linear systems to represent the union of
polyhedra:

min x2
( x1 = 0 )     ( 0 ≤ x1 ≤ M   )
( x2 ≥ 0 )  ∨  ( x2 ≥ f + cx1 )

LSE tutorial, June 2007
Slide 196
Example

Introduce a 0-1 variable yk that is 1 when x is in polyhedron k.
Disaggregate x to create an x^k for each k:

min x2
x1^1 = 0,  x2^1 ≥ 0
0 ≤ x1^2 ≤ M y2,  −c x1^2 + x2^2 ≥ f y2
y1 + y2 = 1,  yk ∈ {0,1}
x = x^1 + x^2

LSE tutorial, June 2007
Slide 197
Example

To simplify: replace x1^2 with x1, x2^2 with x2, and y2 with y.

This yields

min x2              or, equivalently:   min fy + cx
0 ≤ x1 ≤ My                             0 ≤ x ≤ My
x2 ≥ fy + cx1                           y ∈ {0,1}
y ∈ {0,1}                               (the “Big M” model)

LSE tutorial, June 2007
Slide 198
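A minimal sketch of the resulting big-M model, solved with SciPy's MILP interface (assuming SciPy ≥ 1.9 is available); the numbers f, c, M, and the demand d are illustrative.

```python
# Sketch of the big-M fixed-charge model: min c*x + f*y, x <= M*y.
import numpy as np
from scipy.optimize import milp, LinearConstraint, Bounds

f, c, M, d = 5.0, 2.0, 100.0, 3.0   # fixed cost, unit cost, big M, demand

obj = np.array([c, f])              # variables z = [x, y]
cons = [
    LinearConstraint(np.array([[1.0, -M]]), -np.inf, 0.0),  # x - M*y <= 0
    LinearConstraint(np.array([[1.0, 0.0]]), d, np.inf),    # x >= d
]
res = milp(obj, constraints=cons,
           integrality=np.array([0, 1]),          # y integer, x continuous
           bounds=Bounds([0, 0], [np.inf, 1]))
print(res.x, res.fun)   # expect x = d, y = 1, cost = c*d + f
```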
Disjunctive Modeling

Disjunctions often occur naturally in problems and can be given an
MILP model.

Recall that a disjunction of linear systems (representing polyhedra
with the same recession cone)

min cx
∨k (Akx ≥ bk)

…has the MILP model

min cx
Ak x^k ≥ bk yk, all k
∑k yk = 1
x = ∑k x^k
yk ∈ {0,1}

LSE tutorial, June 2007
Slide 199
Example: Uncapacitated facility location

Locate factories among m possible locations to serve n markets so as
to minimize total fixed cost and transport cost. There is no limit on
the production capacity of each factory.

fi  = fixed cost of a factory at location i
cij = transport cost from location i to market j

LSE tutorial, June 2007
Slide 200
Uncapacitated facility location

xij = fraction of market j’s demand satisfied from location i
(m possible factory locations, n markets)

Disjunctive model:

min ∑i zi + ∑ij cij xij

( xij = 0, all j )     ( 0 ≤ xij ≤ 1, all j )
(    zi = 0      )  ∨  (    zi ≥ fi         ) ,  all i
 no factory at i          factory at i

∑i xij = 1, all j

LSE tutorial, June 2007
Slide 201
Uncapacitated facility location

The disjunctive model translates into the MILP formulation

min ∑i fi yi + ∑ij cij xij
0 ≤ xij ≤ yi, all i, j
∑i xij = 1, all j
yi ∈ {0,1}

(yi = 1 when there is a factory at location i, 0 otherwise.)

LSE tutorial, June 2007
Slide 202
Uncapacitated facility location

Beginner’s model (based on the capacitated location model):

min ∑i fi yi + ∑ij cij xij
∑j xij ≤ n yi, all i          (n = maximum output from location i)
yi ∈ {0,1}

It has a weaker continuous relaxation than the formulation with
0 ≤ xij ≤ yi (the relaxation is obtained by replacing yi ∈ {0,1} with
0 ≤ yi ≤ 1). This beginner’s mistake can be avoided by starting with
the disjunctive formulation.

LSE tutorial, June 2007
Slide 203
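The gap between the two relaxations can be observed numerically. The sketch below (an invented toy instance, not from the slides) builds both LP relaxations and compares their bounds; the tight formulation's bound is never smaller.

```python
# Sketch comparing LP relaxations of the tight ("xij <= yi") and
# beginner's ("sum_j xij <= n*yi") facility location models.
import numpy as np
from scipy.optimize import linprog

m, n = 3, 4                               # locations, markets
rng = np.random.default_rng(0)
f = rng.uniform(5, 10, m)                 # fixed costs
c = rng.uniform(1, 4, (m, n))             # transport costs

def lp_bound(tight):
    # Variables: y (m entries) then x (m*n entries, row-major).
    obj = np.concatenate([f, c.ravel()])
    A_eq = np.zeros((n, m + m * n)); b_eq = np.ones(n)
    for j in range(n):
        A_eq[j, m + j::n] = 1             # sum_i xij = 1
    rows = []
    if tight:
        for i in range(m):
            for j in range(n):
                r = np.zeros(m + m * n)
                r[m + i * n + j] = 1; r[i] = -1       # xij - yi <= 0
                rows.append(r)
    else:
        for i in range(m):
            r = np.zeros(m + m * n)
            r[m + i * n: m + (i + 1) * n] = 1; r[i] = -n  # sum_j xij <= n*yi
            rows.append(r)
    res = linprog(obj, np.array(rows), np.zeros(len(rows)), A_eq, b_eq,
                  bounds=[(0, 1)] * (m + m * n))
    return res.fun

print(lp_bound(tight=True), lp_bound(tight=False))  # tight bound >= weak bound
```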
Knapsack Modeling

• Knapsack models consist of knapsack covering and knapsack packing
  constraints.
• The freight transfer model presented earlier is an example.
• We will consider a similar example that combines disjunctive and
  knapsack modeling.
• Most OR professionals are unlikely to write a model as good as the
  one presented here.

LSE tutorial, June 2007
Slide 204
Note on tightness of knapsack models

• The continuous relaxation of a knapsack model is not in general a
  convex hull relaxation. A disjunctive formulation would provide a
  convex hull relaxation, but there are exponentially many disjuncts.
• Knapsack cuts can significantly tighten the relaxation.

LSE tutorial, June 2007
Slide 205
Example: Package transport

Each package j has size aj. Each truck i has capacity Qi and costs ci
to operate. xij = 1 if truck i carries package j; yi = 1 if truck i
is used.

Disjunctive model:

min ∑i zi

∑i Qi yi ≥ ∑j aj ;   ∑i xij = 1, all j      (knapsack constraints)

( yi = 1              )     ( yi = 0          )
( zi = ci             )  ∨  ( zi = 0          ) ,  all i
( ∑j aj xij ≤ Qi      )     ( xij = 0, all j  )
( 0 ≤ xij ≤ 1, all j  )
  truck i used                truck i not used

xij, yi ∈ {0,1}

LSE tutorial, June 2007
Slide 206
Example: Package transport

The disjunctive model yields the MILP model

min ∑i ci yi
∑i Qi yi ≥ ∑j aj ;   ∑i xij = 1, all j
∑j aj xij ≤ Qi yi, all i
xij ≤ yi, all i, j
xij, yi ∈ {0,1}

LSE tutorial, June 2007
Slide 207
Example: Package transport

MILP model:

min ∑i ci yi
∑i Qi yi ≥ ∑j aj ;  ∑i xij = 1, all j   ← Most OR professionals would omit
                                          the first constraint, since it is
                                          the sum over i of the next one. But
                                          it generates very effective
                                          knapsack cuts.
∑j aj xij ≤ Qi yi, all i
xij ≤ yi, all i, j                      ← Modeling trick; unobvious without
                                          the disjunctive approach.
xij, yi ∈ {0,1}

LSE tutorial, June 2007
Slide 208
Cutting Planes

0-1 Knapsack Cuts
Gomory Cuts
Mixed Integer Rounding Cuts
Example: Product Configuration

LSE tutorial, June 2007
Slide 209
To review…

A cutting plane (cut, valid inequality) for an MILP model:

• …is valid: it is satisfied by all feasible solutions of the model.
• …cuts off solutions of the continuous relaxation. This makes the
  relaxation tighter.

[Figure: a cutting plane separating part of the continuous relaxation
from the feasible solutions.]

LSE tutorial, June 2007
Slide 210
Motivation

• Cutting planes (cuts) tighten the continuous relaxation of an MILP
  model.
• Knapsack cuts:
  - Generated for individual knapsack constraints.
  - We saw general integer knapsack cuts earlier.
  - 0-1 knapsack cuts and lifting techniques are well studied and
    widely used.
• Rounding cuts:
  - Generated for the entire MILP; they are widely used.
  - Gomory cuts, for integer variables only.
  - Mixed integer rounding cuts, for any MILP.

LSE tutorial, June 2007
Slide 211
0-1 Knapsack Cuts

0-1 knapsack cuts are designed for knapsack constraints with 0-1
variables. The analysis is different from that of general knapsack
constraints, to exploit the special structure of 0-1 inequalities.

LSE tutorial, June 2007
Slide 212
0-1 Knapsack Cuts

Consider a 0-1 knapsack packing constraint ax ≤ a0. (Knapsack covering
constraints are analyzed similarly.)

Index set J is a cover if  ∑_{j∈J} aj > a0

The cover inequality  ∑_{j∈J} xj ≤ |J| − 1  is a 0-1 knapsack cut for
ax ≤ a0.

Only minimal covers need be considered.

LSE tutorial, June 2007
Slide 213
Example

J = {1,2,3,4} is a cover for

6x1 + 5x2 + 5x3 + 5x4 + 8x5 + 3x6 ≤ 17

This gives rise to the cover inequality

x1 + x2 + x3 + x4 ≤ 3

LSE tutorial, June 2007
Slide 214
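A heuristic separation routine for cover cuts might look like the following sketch; the greedy ordering by the fractional LP values x*j is one common choice, not the only one.

```python
# Sketch of heuristic cover-cut separation for a 0-1 knapsack ax <= a0,
# given a fractional LP solution x_star. Greedy, not exact.
def cover_cut(a, a0, x_star):
    # Prefer items with x*_j close to 1: they make the cut most violated.
    order = sorted(range(len(a)), key=lambda j: 1 - x_star[j])
    J, weight = [], 0
    for j in order:
        J.append(j); weight += a[j]
        if weight > a0:
            break
    else:
        return None                       # no cover exists
    # Shrink to a minimal cover: drop items while the weight stays > a0.
    for j in sorted(J, key=lambda j: a[j]):
        if weight - a[j] > a0:
            J.remove(j); weight -= a[j]
    # Cut: sum_{j in J} x_j <= |J| - 1; report its violation at x*.
    violation = sum(x_star[j] for j in J) - (len(J) - 1)
    return J, violation

a = [6, 5, 5, 5, 8, 3]
print(cover_cut(a, 17, [1, 1, 0.8, 0.8, 0.2, 0]))   # ([0, 1, 2, 3], 0.6)
```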
Sequential lifting

• A cover inequality can often be strengthened by lifting it into a
  higher-dimensional space, that is, by adding variables.
• Sequential lifting adds one variable at a time.
• Sequence-independent lifting adds several variables at once.

LSE tutorial, June 2007
Slide 215
Sequential lifting

To lift a cover inequality  ∑_{j∈J} xj ≤ |J| − 1,
add a term to the left-hand side:

∑_{j∈J} xj + πk xk ≤ |J| − 1

where πk is the largest coefficient for which the inequality is still
valid. So

πk = |J| − 1 − max { ∑_{j∈J} xj | ∑_{j∈J} aj xj ≤ a0 − ak, xj ∈ {0,1} }

This can be done repeatedly (by dynamic programming).

LSE tutorial, June 2007
Slide 216
Example

Given 6x1 + 5x2 + 5x3 + 5x4 + 8x5 + 3x6 ≤ 17, to lift

x1 + x2 + x3 + x4 ≤ 3

add a term to the left-hand side:

x1 + x2 + x3 + x4 + π5 x5 ≤ 3

where

π5 = 3 − max { x1 + x2 + x3 + x4 | 6x1 + 5x2 + 5x3 + 5x4 ≤ 17 − 8,
               xj ∈ {0,1} }

This yields x1 + x2 + x3 + x4 + 2x5 ≤ 3. Further lifting leaves the cut
unchanged. But if the variables are added in the order x6, x5, the
result is different:

x1 + x2 + x3 + x4 + x5 + x6 ≤ 3

LSE tutorial, June 2007
Slide 217
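The inner maximization is a small 0-1 knapsack, so πk can be computed by the dynamic program sketched below (the helper names are ours):

```python
# Sketch: compute the lifting coefficient pi_k for a cover inequality by
# solving max{ sum_j x_j : sum_j a_j x_j <= budget, x_j in {0,1} }.
def max_items(weights, budget):
    best = {0: 0}                       # reachable weight -> max item count
    for w in weights:
        for s, cnt in list(best.items()):
            if s + w <= budget:
                best[s + w] = max(best.get(s + w, -1), cnt + 1)
    return max(best.values())

def lift_coefficient(cover_weights, a0, ak, rhs):
    return rhs - max_items(cover_weights, a0 - ak)

print(lift_coefficient([6, 5, 5, 5], 17, 8, 3))   # pi_5 = 2
```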
Sequence-independent lifting

• Sequence-independent lifting usually yields a weaker cut than
  sequential lifting.
• But it adds all the variables at once and is much faster.
• It is commonly used in commercial MILP solvers.

LSE tutorial, June 2007
Slide 218
Sequence-independent lifting

To lift a cover inequality  ∑_{j∈J} xj ≤ |J| − 1,
add terms to the left-hand side:

∑_{j∈J} xj + ∑_{j∉J} ρ(aj) xj ≤ |J| − 1

where

ρ(u) = j                if Aj ≤ u ≤ Aj+1 − ∆ and j ∈ {0,…, p−1}
     = j + (u − Aj)/∆   if Aj − ∆ ≤ u < Aj and j ∈ {1,…, p−1}
     = p + (u − Ap)/∆   if Ap − ∆ ≤ u

with  ∆ = ∑_{j∈J} aj − a0,   Aj = a1 + ⋯ + aj,   A0 = 0,   J = {1,…, p}.

LSE tutorial, June 2007
Slide 219
Example

Given 6x1 + 5x2 + 5x3 + 5x4 + 8x5 + 3x6 ≤ 17, to lift

x1 + x2 + x3 + x4 ≤ 3

add the terms ρ(8)x5 and ρ(3)x6, where ρ(u) is the lifting function
above: ρ(8) = 5/4 and ρ(3) = 1/4.

This yields the lifted cut

x1 + x2 + x3 + x4 + (5/4)x5 + (1/4)x6 ≤ 3

LSE tutorial, June 2007
Slide 220
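A direct transcription of ρ(u) as a sketch, checked against the slide's values ρ(8) = 5/4 and ρ(3) = 1/4:

```python
# Sketch of the sequence-independent lifting function rho(u) for cover
# J = {1,...,p}; a_cover holds the cover weights, Delta = sum(a_J) - a0.
def make_rho(a_cover, a0):
    p = len(a_cover)
    delta = sum(a_cover) - a0
    A = [0]
    for w in a_cover:
        A.append(A[-1] + w)             # partial sums A_0 .. A_p
    def rho(u):
        for j in range(p):              # flat pieces: A_j <= u <= A_{j+1}-D
            if A[j] <= u <= A[j + 1] - delta:
                return j
        for j in range(1, p):           # ramps: A_j - D <= u < A_j
            if A[j] - delta <= u < A[j]:
                return j + (u - A[j]) / delta
        return p + (u - A[p]) / delta   # final ramp: u >= A_p - D
    return rho

rho = make_rho([6, 5, 5, 5], 17)
print(rho(8), rho(3))                   # 1.25 and 0.25
```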
Gomory Cuts

• When an integer programming problem has a nonintegral solution, we
  can generate at least one Gomory cut to cut off that solution. This
  is a special case of a separating cut, because it separates the
  current solution of the relaxation from the feasible set.
• Gomory cuts are widely used and very effective in MILP solvers.

[Figure: a separating cut between the solution of the continuous
relaxation and the feasible solutions.]

LSE tutorial, June 2007
Slide 221
Gomory cuts

Given an integer programming problem

min cx
Ax = b
x ≥ 0 and integral

let (xB, 0) be an optimal solution of the continuous relaxation, where

xB = b̂ − N̂ xN,   b̂ = B⁻¹b,   N̂ = B⁻¹N

Then if xi is nonintegral in this solution, the following Gomory cut
is violated by (xB, 0):

xi + ⌊N̂i⌋ xN ≤ ⌊b̂i⌋

LSE tutorial, June 2007
Slide 222
Example

min 2x1 + 3x2            or, with slacks:    min 2x1 + 3x2
x1 + 3x2 ≥ 3                                 x1 + 3x2 − x3 = 3
4x1 + 3x2 ≥ 6                                4x1 + 3x2 − x4 = 6
x1, x2 ≥ 0 and integral                      xj ≥ 0 and integral

The optimal solution of the continuous relaxation has

xB = (x1, x2) = (1, 2/3)

N̂ = [  1/3  −1/3 ]        b̂ = (  1  )
    [ −4/9   1/9 ]            ( 2/3 )

LSE tutorial, June 2007
Slide 223
Example

The Gomory cut  xi + ⌊N̂i⌋ xN ≤ ⌊b̂i⌋  for the fractional basic
variable x2 is

x2 + ⌊−4/9⌋ x3 + ⌊1/9⌋ x4 ≤ ⌊2/3⌋

or  x2 − x3 ≤ 0.  In (x1, x2)-space this is  x1 + 2x2 ≥ 3.

LSE tutorial, June 2007
Slide 224
Example

[Figure: the feasible region with the Gomory cut x1 + 2x2 ≥ 3 cutting
off the fractional solution (1, 2/3), and a second Gomory cut obtained
after re-solving the LP with the previous cut added.]

LSE tutorial, June 2007
Slide 225
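The cut can be reproduced numerically from the example's basis; the sketch below assumes NumPy and the basis/nonbasis split shown on the previous slides.

```python
# Sketch: derive the Gomory cut x2 - x3 <= 0 from the example's basis.
import numpy as np

B = np.array([[1.0, 3.0], [4.0, 3.0]])    # columns of x1, x2 (basic)
N = np.array([[-1.0, 0.0], [0.0, -1.0]])  # columns of slacks x3, x4
b = np.array([3.0, 6.0])

N_hat = np.linalg.solve(B, N)             # B^{-1} N
b_hat = np.linalg.solve(B, b)             # B^{-1} b

i = 1                                     # row of x2, fractional (2/3)
cut_coeffs = np.floor(N_hat[i])           # [-1, 0]
cut_rhs = np.floor(b_hat[i])              # 0
print(f"x2 + {cut_coeffs[0]:.0f}*x3 + {cut_coeffs[1]:.0f}*x4 <= {cut_rhs:.0f}")
```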
Mixed Integer Rounding Cuts

• Mixed integer rounding (MIR) cuts can be generated for solutions of
  any relaxed MILP in which one or more integer variables has a
  fractional value.
• Like Gomory cuts, they are separating cuts.
• MIR cuts are widely used in commercial solvers.

LSE tutorial, June 2007
Slide 226
MIR cuts

Given an MILP problem

min cx + dy
Ax + Dy = b
x, y ≥ 0 and y integral

let, in an optimal solution of the continuous relaxation,

J = { j | yj is nonbasic },   K = { j | xj is nonbasic },
N = nonbasic columns of [A D],   N̂ = B⁻¹N,   b̂ = B⁻¹b.

Then if yi is nonintegral in this solution, the following MIR cut is
violated by the solution of the relaxation:

yi + ∑_{j∈J1} ⌈N̂ij⌉ yj + ∑_{j∈J2} ( ⌊N̂ij⌋ + frac(N̂ij)/frac(b̂i) ) yj
   + (1/frac(b̂i)) ∑_{j∈K} N̂ij⁺ xj ≥ ⌈b̂i⌉

where  J1 = { j ∈ J | frac(N̂ij) ≥ frac(b̂i) },   J2 = J \ J1.

LSE tutorial, June 2007
Slide 227
Example

3x1 + 4x2 − 6y1 − 4y2 = 1
x1 + 2x2 − y1 − y2 = 3
xj, yj ≥ 0, yj integer

Take the basic solution (y1, x1) = (8/3, 17/3). Then

N̂ = [  1/3  2/3 ]        b̂ = (  8/3 )
    [ −2/3  8/3 ]            ( 17/3 )

J = {2}, K = {2}, J1 = ∅, J2 = {2}

The MIR cut is

y1 + ( ⌊1/3⌋ + frac(1/3)/frac(8/3) ) y2 + (1/frac(8/3)) (2/3)⁺ x2 ≥ ⌈8/3⌉

or  y1 + (1/2) y2 + x2 ≥ 3

LSE tutorial, June 2007
Slide 228
Example: Product Configuration

This example illustrates:

• combination of propagation and relaxation;
• processing of variable indices;
• continuous relaxation of the element constraint.

LSE tutorial, June 2007
Slide 229
The problem

Choose what type of each component, and how many.

[Figure: a personal computer assembled from memory units, disk drives,
and power supplies.]

LSE tutorial, June 2007
Slide 230
Model of the problem

min ∑j cj vj
vj = ∑i qi Aij,ti , all j      (ti is a variable index)
Lj ≤ vj ≤ Uj , all j

where
  vj      = amount of attribute j produced (< 0 if consumed):
            memory, heat, power, weight, etc.
  cj      = unit cost of producing attribute j
  qi      = quantity of component i installed
  Aij,ti  = amount of attribute j produced by type ti of component i

LSE tutorial, June 2007
Slide 231
To solve it:

• Branch on the domains of ti and qi.
• Propagate element constraints and bounds on vj.
  – Each variable index is converted to a specially structured element
    constraint.
  – Valid knapsack cuts are derived and propagated.
• Use linear continuous relaxations.
  – A special-purpose MILP relaxation is used for element.

LSE tutorial, June 2007
Slide 232
Propagation

min ∑j cj vj
vj = ∑i qi Aij,ti , all j
Lj ≤ vj ≤ Uj , all j       ← this is propagated in the usual way

LSE tutorial, June 2007
Slide 233
Propagation

The constraint  vj = ∑i qi Aij,ti  is rewritten as

vj = ∑i zi , all j
element( ti, (qi Aij1, …, qi Aijn), zi ), all i, j

The bounds  Lj ≤ vj ≤ Uj  are propagated in the usual way.

LSE tutorial, June 2007
Slide 234
Propagation

vj = ∑i zi , all j
element( ti, (qi Aij1, …, qi Aijn), zi ), all i, j

This can be propagated by
(a) using specialized filters for element constraints of this form…

LSE tutorial, June 2007
Slide 235
Propagation

vj = ∑i zi , all j
element( ti, (qi Aij1, …, qi Aijn), zi ), all i, j

This is propagated by
(a) using specialized filters for element constraints of this form,
(b) adding knapsack cuts for the valid inequalities

∑i max{ Aijk | k ∈ Dti } qi ≥ vj , all j
∑i min{ Aijk | k ∈ Dti } qi ≤ vj , all j

and (c) propagating the knapsack cuts.

([vjL, vjU] denotes the current domain of vj.)

LSE tutorial, June 2007
Slide 236
Relaxation

min ∑j cj vj
vj = ∑i qi Aij,ti , all j
Lj ≤ vj ≤ Uj , all j       ← this is relaxed as  vjL ≤ vj ≤ vjU

LSE tutorial, June 2007
Slide 237
Relaxation

vj = ∑i zi , all j
element( ti, (qi Aij1, …, qi Aijn), zi ), all i, j
                           ← this is relaxed by relaxing the element
                             constraints and adding the knapsack cuts
min ∑j cj vj
vj = ∑i qi Aij,ti , all j
Lj ≤ vj ≤ Uj , all j       ← this is relaxed as  vjL ≤ vj ≤ vjU

LSE tutorial, June 2007
Slide 238
Relaxation

vj = ∑i zi , all j
element( ti, (qi Aij1, …, qi Aijn), zi ), all i, j

This is relaxed by replacing each element constraint with a disjunctive
convex hull relaxation:

zi = ∑_{k∈Dti} Aijk qik ,    qi = ∑_{k∈Dti} qik

LSE tutorial, June 2007
Slide 239
Relaxation

So the following LP relaxation is solved at each node of the search
tree to obtain a lower bound:

min ∑j cj vj
vj = ∑i ∑_{k∈Dti} Aijk qik , all j
qi = ∑_{k∈Dti} qik , all i
vjL ≤ vj ≤ vjU , all j
qiL ≤ qi ≤ qiU , all i
knapsack cuts for  ∑i max{ Aijk | k ∈ Dti } qi ≥ vj , all j
knapsack cuts for  ∑i min{ Aijk | k ∈ Dti } qi ≤ vj , all j
qik ≥ 0, all i, k

LSE tutorial, June 2007
Slide 240
Computational Results

[Chart: solution time in seconds (log scale, 0.01 to 1000) for CPLEX,
CLP, and the hybrid method on problem sizes 8x10, 16x20, 20x24, 20x30.]

LSE tutorial, June 2007
Slide 241
Lagrangean Relaxation

Lagrangean Duality
Properties of the Lagrangean Dual
Example: Fast Linear Programming
Domain Filtering
Example: Continuous Global Optimization

LSE tutorial, June 2007
Slide 242
Motivation

• Lagrangean relaxation can provide better bounds than LP relaxation.
• The Lagrangean dual generalizes LP duality.
• It provides domain filtering analogous to that based on LP duality.
  This is a key technique in continuous global optimization.
• Lagrangean relaxation gets rid of troublesome constraints by
  dualizing them, that is, moving them into the objective function.
  The Lagrangean relaxation may decouple.

LSE tutorial, June 2007
Slide 243
Lagrangean Duality

Consider an inequality-constrained problem

min f(x)
g(x) ≥ 0     (hard constraints)
x ∈ S        (easy constraints)

The object is to get rid of (dualize) the hard constraints by moving
them into the objective function.

LSE tutorial, June 2007
Slide 244
Lagrangean Duality

The problem is related to an inference problem:

max v
such that  g(x) ≥ 0 ⇒ f(x) ≥ v  for all x ∈ S

Lagrangean dual problem: find the tightest lower bound on the
objective function that is implied by the constraints.

LSE tutorial, June 2007
Slide 245
Primal                          Dual
min f(x)                        max v
g(x) ≥ 0                        such that g(x) ≥ 0 ⇒ f(x) ≥ v
x ∈ S                           for all x ∈ S

Let us say that the surrogate λg(x) ≥ 0 (for some λ ≥ 0) dominates
f(x) − v ≥ 0 iff

λg(x) ≤ f(x) − v   for all x ∈ S

That is,  v ≤ f(x) − λg(x)  for all x ∈ S,

or  v ≤ min_{x∈S} { f(x) − λg(x) }.

So the dual becomes

max v
v ≤ min_{x∈S} { f(x) − λg(x) }   for some λ ≥ 0

LSE tutorial, June 2007
Slides 246–248
Now we have…

Primal                 Dual
min f(x)               max v
g(x) ≥ 0               v ≤ min_{x∈S} { f(x) − λg(x) } for some λ ≥ 0
x ∈ S
                       or, equivalently,

                       max_{λ≥0} θ(λ)   where   θ(λ) = min_{x∈S} { f(x) − λg(x) }

Here the constraints g(x) ≥ 0 are dualized, λ is a vector of Lagrange
multipliers, and min_{x∈S} { f(x) − λg(x) } is the Lagrangean
relaxation.

The Lagrangean dual can be viewed as the problem of finding the
Lagrangean relaxation that gives the tightest bound.

LSE tutorial, June 2007
Slide 249
Example

min 3x1 + 4x2
−x1 + 3x2 ≥ 0
2x1 + x2 − 5 ≥ 0
x1, x2 ∈ {0,1,2,3}

The Lagrangean relaxation is

θ(λ1, λ2) = min_{xj∈{0,…,3}} { 3x1 + 4x2 − λ1(−x1 + 3x2) − λ2(2x1 + x2 − 5) }
          = min_{xj∈{0,…,3}} { (3 + λ1 − 2λ2)x1 + (4 − 3λ1 − λ2)x2 + 5λ2 }

The Lagrangean relaxation is easy to solve for any given λ1, λ2:

x1 = 0 if 3 + λ1 − 2λ2 ≥ 0, and 3 otherwise
x2 = 0 if 4 − 3λ1 − λ2 ≥ 0, and 3 otherwise

The optimal solution of the original problem is (2,1).

LSE tutorial, June 2007
Slide 250
Example

min 3x1 + 4x2
−x1 + 3x2 ≥ 0
2x1 + x2 − 5 ≥ 0
x1, x2 ∈ {0,1,2,3}

θ(λ1, λ2) is piecewise linear and concave.

[Contour plot of θ over (λ1, λ2) omitted; contours shown at
θ(λ) = 0, 5, 7.5, and 9 2/7.]

Solution of the Lagrangean dual: (λ1, λ2) = (5/7, 13/7), θ(λ) = 9 2/7.
The optimal solution of the original problem is (2,1), with value 10.
Note the duality gap between 10 and 9 2/7 (no strong duality).

LSE tutorial, June 2007
Slide 251
Example

Note: in this example, the Lagrangean dual provides the same bound
(9 2/7) as the continuous relaxation of the IP. This is because the
Lagrangean relaxation can be solved as an LP:

θ(λ1, λ2) = min_{xj∈{0,…,3}} { (3 + λ1 − 2λ2)x1 + (4 − 3λ1 − λ2)x2 + 5λ2 }
          = min_{0 ≤ xj ≤ 3} { (3 + λ1 − 2λ2)x1 + (4 − 3λ1 − λ2)x2 + 5λ2 }

Lagrangean duality is useful when the Lagrangean relaxation is tighter
than an LP but nonetheless easy to solve.

LSE tutorial, June 2007
Slide 252
Properties of the Lagrangean dual

Weak duality: for any feasible x* and any λ* ≥ 0, f(x*) ≥ θ(λ*).
In particular,

min { f(x) | g(x) ≥ 0, x ∈ S } ≥ max_{λ≥0} θ(λ)

Concavity: θ(λ) is concave. It can therefore be maximized by local
search methods.

Complementary slackness: if x* and λ* are optimal, and there is no
duality gap, then λ*g(x*) = 0.

LSE tutorial, June 2007
Slide 253
Solving the Lagrangean dual

Let λ^k be the kth iterate, and let

λ^{k+1} = λ^k + αk ξ^k

where ξ^k is a subgradient of θ(λ) at λ = λ^k.

If x^k solves the Lagrangean relaxation for λ = λ^k, then ξ^k = −g(x^k).
This is because θ(λ) ≤ f(x^k) − λg(x^k) for all λ, with equality at
λ = λ^k.

The stepsize αk must be adjusted so that the sequence converges, but
not before reaching a maximum.

LSE tutorial, June 2007
Slide 254
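A sketch of subgradient ascent on the example of the earlier slide; the 1/k step sizes are one standard (illustrative) choice, and the iteration count is arbitrary.

```python
# Sketch: subgradient ascent on the Lagrangean dual of
#   min 3x1 + 4x2  s.t. -x1 + 3x2 >= 0, 2x1 + x2 - 5 >= 0, xj in {0,...,3}
def solve_relaxation(lam1, lam2):
    # Coefficients of x1 and x2 in the Lagrangean; each xj in {0,...,3}
    c1 = 3 + lam1 - 2 * lam2
    c2 = 4 - 3 * lam1 - lam2
    x1 = 0 if c1 >= 0 else 3
    x2 = 0 if c2 >= 0 else 3
    return c1 * x1 + c2 * x2 + 5 * lam2, (x1, x2)

lam, best = [0.0, 0.0], float("-inf")
for k in range(1, 2001):
    theta, (x1, x2) = solve_relaxation(*lam)
    best = max(best, theta)
    g = (-x1 + 3 * x2, 2 * x1 + x2 - 5)      # constraint values g(x^k)
    step = 1.0 / k
    # Ascend along -g(x^k), keeping the multipliers nonnegative
    lam = [max(0.0, lam[i] - step * g[i]) for i in range(2)]
print(best)   # approaches the dual optimum 9 2/7 = 9.2857...
```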
Example: Fast Linear Programming

• In CP contexts, it is best to process each node of the search tree
  very rapidly.
• Lagrangean relaxation may allow very fast calculation of a lower
  bound on the optimal value of the LP relaxation at each node.
• The idea is to solve the Lagrangean dual at the root node (which is
  an LP) and use the same Lagrange multipliers to get an LP bound at
  other nodes.

LSE tutorial, June 2007
Slide 255
At the root node, solve

min cx
Ax ≥ b    (λ)        ← dualize
Dx ≥ d               ← special structure, e.g. variable bounds
x ≥ 0

The (partial) LP dual solution λ* solves the Lagrangean dual in which

θ(λ) = min_{Dx ≥ d, x ≥ 0} { cx − λ(Ax − b) }

LSE tutorial, June 2007
Slide 256
At the root node, solve

min cx
Ax ≥ b    (λ)        ← dualize
Dx ≥ d               ← special structure, e.g. variable bounds
x ≥ 0

The (partial) LP dual solution λ* solves the Lagrangean dual in which

θ(λ) = min_{Dx ≥ d, x ≥ 0} { cx − λ(Ax − b) }

At another node, the LP is

min cx
Ax ≥ b    (λ)
Dx ≥ d
Hx ≥ h               ← branching constraints, etc.
x ≥ 0

Here θ(λ*) is still a lower bound on the optimal value of the LP and
can be quickly calculated by solving a specially structured LP.

LSE tutorial, June 2007
Slide 257
Domain Filtering

Suppose:

min f(x)
g(x) ≥ 0
x ∈ S

has optimal solution x*, optimal value v*, and optimal Lagrangean dual
solution λ*;

…and λi* > 0, which means the ith constraint is tight (complementary
slackness);
…and the problem is a relaxation of a CP problem;
…and we have a feasible solution of the CP problem with value U, so
that U is an upper bound on the optimal value.

LSE tutorial, June 2007
Slide 258
Supposing  min f(x), g(x) ≥ 0, x ∈ S  has optimal solution x*, optimal
value v*, and optimal Lagrangean dual solution λ*:

If x were to change to a value other than x*, the LHS of the ith
constraint gi(x) ≥ 0 would change by some amount ∆i. Since the
constraint is tight, this would increase the optimal value as much as
changing the constraint to gi(x) − ∆i ≥ 0. So it would increase the
optimal value by at least λi*∆i.

(It is easily shown that Lagrange multipliers are marginal costs. Dual
multipliers for LP are a special case of Lagrange multipliers.)

LSE tutorial, June 2007
Slide 259
Supposing  min f(x), g(x) ≥ 0, x ∈ S  has optimal solution x*, optimal
value v*, and optimal Lagrangean dual solution λ*:

We have found: a change in x that changes gi(x) by ∆i increases the
optimal value by at least λi*∆i.

Since optimal value of this problem ≤ optimal value of the CP ≤ U,
we have λi*∆i ≤ U − v*, or

∆i ≤ (U − v*) / λi*

LSE tutorial, June 2007
Slide 260
Supposing  min f(x), g(x) ≥ 0, x ∈ S  has optimal solution x*, optimal
value v*, and optimal Lagrangean dual solution λ*:

We have found: a change in x that changes gi(x) by ∆i increases the
optimal value by at least λi*∆i.

Since optimal value of this problem ≤ optimal value of the CP ≤ U,
we have λi*∆i ≤ U − v*, or

∆i ≤ (U − v*) / λi*

Since ∆i = gi(x) − gi(x*) = gi(x), this implies the inequality

gi(x) ≤ (U − v*) / λi*

…which can be propagated.

LSE tutorial, June 2007
Slide 261
Example: Continuous Global Optimization

• Some of the best continuous global solvers (e.g., BARON) combine
  OR-style relaxation with CP-style interval arithmetic and domain
  filtering.
• The use of Lagrange multipliers for domain filtering is a key
  technique in these solvers.

LSE tutorial, June 2007
Slide 262
Continuous Global Optimization

max x1 + x2
4x1x2 = 1
2x1 + x2 ≤ 2
x1 ∈ [0,1], x2 ∈ [0,2]

[Figure: the feasible set, the global optimum, and a local optimum in
(x1, x2) space.]

LSE tutorial, June 2007
Slide 263
To solve it:

• Search: split the interval domains of x1, x2. Each node of the
  search tree is a problem restriction.
• Propagation: interval propagation, domain filtering. Use Lagrange
  multipliers to infer a valid inequality for propagation;
  reduced-cost variable fixing is a special case.
• Relaxation: use function factorization to obtain a linear continuous
  relaxation.

LSE tutorial, June 2007
Slide 264
Interval propagation

Propagate the intervals [0,1], [0,2] through the constraints to obtain

x1 ∈ [1/8, 7/8],   x2 ∈ [1/4, 7/4]

LSE tutorial, June 2007
Slide 265
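One round of this propagation can be written directly; the update order below is an illustrative choice.

```python
# Sketch: one round of interval propagation for
#   4*x1*x2 = 1  and  2*x1 + x2 <= 2,  x1 in [0,1], x2 in [0,2].
x1lo, x1hi = 0.0, 1.0
x2lo, x2hi = 0.0, 2.0

# From 4*x1*x2 = 1: x1 = 1/(4*x2), so x1 >= 1/(4*x2hi); likewise for x2.
x1lo = max(x1lo, 1.0 / (4.0 * x2hi))      # 1/8
x2lo = max(x2lo, 1.0 / (4.0 * x1hi))      # 1/4
# From 2*x1 + x2 <= 2: x1 <= (2 - x2lo)/2 and x2 <= 2 - 2*x1lo.
x1hi = min(x1hi, (2.0 - x2lo) / 2.0)      # 7/8
x2hi = min(x2hi, 2.0 - 2.0 * x1lo)        # 7/4
print([x1lo, x1hi], [x2lo, x2hi])         # [0.125, 0.875] [0.25, 1.75]
```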
Relaxation (function factorization)

Factor complex functions into elementary functions that have known
linear relaxations.

Write 4x1x2 = 1 as 4y = 1 where y = x1x2. This factors 4x1x2 into the
linear function 4y and the bilinear function x1x2. The linear function
4y is its own linear relaxation.

LSE tutorial, June 2007
Slide 266
Relaxation (function factorization)

Factor complex functions into elementary functions that have known
linear relaxations.

Write 4x1x2 = 1 as 4y = 1 where y = x1x2. This factors 4x1x2 into the
linear function 4y and the bilinear function x1x2. The linear function
4y is its own linear relaxation.

The bilinear function y = x1x2 has the (McCormick) relaxation

x2L x1 + x1L x2 − x1L x2L ≤ y ≤ x2L x1 + x1U x2 − x1U x2L
x2U x1 + x1U x2 − x1U x2U ≤ y ≤ x2U x1 + x1L x2 − x1L x2U

where the domain of xj is [xjL, xjU].

LSE tutorial, June 2007
Slide 267
Relaxation (function factorization)

The linear relaxation becomes:

max x1 + x2
4y = 1
2x1 + x2 ≤ 2
x2L x1 + x1L x2 − x1L x2L ≤ y ≤ x2L x1 + x1U x2 − x1U x2L
x2U x1 + x1U x2 − x1U x2U ≤ y ≤ x2U x1 + x1L x2 − x1L x2U
xjL ≤ xj ≤ xjU , j = 1,2

LSE tutorial, June 2007
Slide 268
Relaxation (function factorization)

Solve the linear relaxation.

[Figure: the relaxation solution in (x1, x2) space.]

LSE tutorial, June 2007
Slide 269
Relaxation (function factorization)

Solve the linear relaxation. Since its solution is infeasible, split
an interval and branch:

x2 ∈ [1, 1.75]   and   x2 ∈ [0.25, 1]

LSE tutorial, June 2007
Slide 270
[Figure: the two subproblems, x2 ∈ [1, 1.75] and x2 ∈ [0.25, 1].]

LSE tutorial, June 2007
Slide 271
In the branch x2 ∈ [0.25, 1], the solution of the relaxation is
feasible, with value 1.25. This becomes the incumbent solution.

LSE tutorial, June 2007
Slide 272
In the branch x2 ∈ [0.25, 1], the solution of the relaxation is
feasible, with value 1.25. This becomes the incumbent solution.

In the branch x2 ∈ [1, 1.75], the solution of the relaxation is not
quite feasible, with value 1.854. Also use Lagrange multipliers for
domain filtering…

LSE tutorial, June 2007
Slide 273
Relaxation (function factorization)

max x1 + x2
4y = 1
2x1 + x2 ≤ 2         ← the associated Lagrange multiplier in the
                       solution of the relaxation is λ2 = 1.1
x2L x1 + x1L x2 − x1L x2L ≤ y ≤ x2L x1 + x1U x2 − x1U x2L
x2U x1 + x1U x2 − x1U x2U ≤ y ≤ x2U x1 + x1L x2 − x1L x2U
xjL ≤ xj ≤ xjU , j = 1,2

LSE tutorial, June 2007
Slide 274
Relaxation (function factorization)

This yields a valid inequality for propagation:

2x1 + x2 ≥ 2 − (1.854 − 1.25)/1.1 = 1.451

where 1.854 is the value of the relaxation, 1.25 the value of the
incumbent solution, and 1.1 the Lagrange multiplier.

LSE tutorial, June 2007
Slide 275
Dynamic Programming in CP

Example: Capital Budgeting
Domain Filtering
Recursive Optimization

LSE tutorial, June 2007
Slide 276
Motivation

• Dynamic programming (DP) is a highly versatile technique that can
  exploit recursive structure in a problem.
• Domain filtering is straightforward for problems modeled as a DP.
• DP is also important in designing filters for some global
  constraints, such as the stretch constraint (employee scheduling).
• Nonserial DP is related to bucket elimination in CP and exploits the
  structure of the primal graph.
• DP modeling is the art of keeping the state space small while
  maintaining a Markovian property.
• We will examine only one simple example of serial DP.

LSE tutorial, June 2007
Slide 277
Example: Capital Budgeting

We wish to build power plants with a total cost of at most 12 million
Euros. There are three types of plants, costing 4, 2, or 3 million
Euros each. We must build one or two of each type.

The problem has a simple knapsack packing model:

4x1 + 2x2 + 3x3 ≤ 12
xj ∈ {1,2}          (xj = number of plants of type j)

LSE tutorial, June 2007
Slide 278
Example: Capital Budgeting

4x1 + 2x2 + 3x3 ≤ 12,  xj ∈ {1,2}

In general the recursion for ax ≤ b is

fk(sk) = max_{xk ∈ Dxk} { fk+1(sk + ak xk) }

where the state sk is the sum of the first k terms of ax, and
fk(sk) = 1 if there is a path from state sk to a feasible solution,
0 otherwise.

For example,

f3(8) = max{ f4(8 + 3·1), f4(8 + 3·2) } = max{ f4(11), f4(14) }
      = max{1, 0} = 1

LSE tutorial, June 2007
Slide 279
Example: Capital Budgeting

4x1 + 2x2 + 3x3 ≤ 12,  xj ∈ {1,2}

In general the recursion for ax ≤ b is

fk(sk) = max_{xk ∈ Dxk} { fk+1(sk + ak xk) }

with boundary condition

fn+1(sn+1) = 1 if sn+1 ≤ b, and 0 otherwise

[State-transition diagram showing fk(sk) for each state sk omitted.]

LSE tutorial, June 2007
Slide 280
Example: Capital Budgeting

4x1 + 2x2 + 3x3 ≤ 12,  xj ∈ {1,2}

The problem is feasible. Each path ending in a feasible final state is
a feasible solution:

Path 1: x = (1,2,1)
Path 2: x = (1,1,2)
Path 3: x = (1,1,1)

Possible costs are 9, 11, 12.

LSE tutorial, June 2007
Slide 281
Domain Filtering

4x1 + 2x2 + 3x3 ≤ 12,  xj ∈ {1,2}

To filter domains: observe which values of xk occur on feasible paths.

Dx1 = {1},   Dx2 = {1,2},   Dx3 = {1,2}

LSE tutorial, June 2007
Slide 282
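The recursion, boundary condition, and filtering for this instance fit in a short sketch (the state representation is an assumption of ours):

```python
# Sketch of the serial DP for 4x1 + 2x2 + 3x3 <= 12, xj in {1,2}.
# f(k, s) = 1 if state s at stage k can reach a feasible solution.
from functools import lru_cache

a, b, domain = [4, 2, 3], 12, [1, 2]
n = len(a)

@lru_cache(maxsize=None)
def f(k, s):
    if k == n:
        return 1 if s <= b else 0
    return max(f(k + 1, s + a[k] * x) for x in domain)

print(f(0, 0))                       # 1: the problem is feasible

# Domain filtering: keep xk only if it lies on some feasible path.
reachable = [{0}]
for k in range(n):
    reachable.append({s + a[k] * x for s in reachable[k] for x in domain})

filtered = [
    sorted({x for s in reachable[k] for x in domain if f(k + 1, s + a[k] * x)})
    for k in range(n)
]
print(filtered)                      # [[1], [1, 2], [1, 2]] as on the slide
```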
Recursive Optimization

max 15x1 + 10x2 + 12x3        (maximize revenue)
4x1 + 2x2 + 3x3 ≤ 12
xj ∈ {1,2}

The recursion includes arc values:

fk(sk) = max_{xk ∈ Dxk} { ck xk + fk+1(sk + ak xk) }

where fk(sk) is the value of a maximum-value path from state sk to the
final stage (the value to go), and ck xk is the arc value.

For example,

f3(8) = max{ 12·1 + f4(8 + 3·1), 12·2 + f4(8 + 3·2) }
      = max{12, −∞} = 12

LSE tutorial, June 2007
Slide 283
Recursive optimization

max 15x1 + 10x2 + 12x3
4x1 + 2x2 + 3x3 ≤ 12,  xj ∈ {1,2}

The recursion includes arc values:

fk(sk) = max_{xk ∈ Dxk} { ck xk + fk+1(sk + ak xk) }

Boundary condition:

fn+1(sn+1) = 0 if sn+1 ≤ b, and −∞ otherwise

[Diagram of fk(sk) for each state sk omitted; e.g. f1(0) = 49,
f2(4) = 34, f3(6) = 24, f3(8) = 12.]

LSE tutorial, June 2007
Slide 284
Recursive optimization

max 15x1 + 10x2 + 12x3
4x1 + 2x2 + 3x3 ≤ 12,  xj ∈ {1,2}

The maximum revenue is 49. The optimal path is easy to retrace:

(x1, x2, x3) = (1,1,2)

LSE tutorial, June 2007
Slide 285
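Extending the same sketch with arc values reproduces the maximum revenue and the optimal path:

```python
# Sketch of the DP with arc values for max 15x1 + 10x2 + 12x3
# subject to 4x1 + 2x2 + 3x3 <= 12, xj in {1,2}.
from functools import lru_cache

a, c, b = [4, 2, 3], [15, 10, 12], 12
domain = [1, 2]
n = len(a)

@lru_cache(maxsize=None)
def f(k, s):                    # value-to-go from state s at stage k
    if k == n:
        return 0 if s <= b else float("-inf")
    return max(c[k] * x + f(k + 1, s + a[k] * x) for x in domain)

# Retrace the optimal path using the value function.
s, path = 0, []
for k in range(n):
    x = max(domain, key=lambda v: c[k] * v + f(k + 1, s + a[k] * v))
    path.append(x); s += a[k] * x
print(f(0, 0), path)            # 49 and [1, 1, 2]
```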
CP-based Branch and Price

Basic Idea
Example: Airline Crew Scheduling

LSE tutorial, June 2007
Slide 286
Motivation

• Branch and price allows solution of integer programming problems
  with a huge number of variables.
• The problem is solved by a branch-and-relax method. The difference
  lies in how the LP relaxation is solved.
• Variables are added to the LP relaxation only as needed.
• Variables are priced to find which ones should be added.
• CP is useful for solving the pricing problem, particularly when
  constraints are complex.
• CP-based branch and price has been successfully applied to airline
  crew scheduling, transit scheduling, and other transportation-related
  problems.

LSE tutorial, June 2007
Slide 287
Basic Idea

Suppose the LP relaxation of an integer programming problem has a huge
number of variables:

min cx
Ax = b
x ≥ 0

We will solve a restricted master problem, which has a small subset J
of the variables:

min ∑_{j∈J} cj xj
∑_{j∈J} Aj xj = b   (λ)       (Aj = column j of A)
xj ≥ 0

Adding xk to the problem would improve the solution if xk has negative
reduced cost:

rk = ck − λAk < 0

LSE tutorial, June 2007
Slide 288
Basic Idea

Adding xk to the problem would improve the solution if xk has negative
reduced cost:

rk = ck − λAk < 0

Computing the reduced cost of xk is known as pricing xk.

So we solve the pricing problem

min { cy − λy | y is a column of A }

where cy is the cost of column y. If the solution y* satisfies
cy* − λy* < 0, then we can add the column y* to the restricted master
problem.

LSE tutorial, June 2007
Slide 289
Basic Idea

The pricing problem need not be solved to optimality, so long as we
find a column with negative reduced cost. However, when we can no
longer find an improving column, we solve the pricing problem to
optimality to make sure we have the optimal solution of the LP.

If we can state constraints that the columns of A must satisfy, CP may
be a good way to solve the pricing problem.

LSE tutorial, June 2007
Slide 290
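Abstractly, the column-generation loop looks like the sketch below; solve_restricted_lp and price_out are hypothetical application-specific helpers (not a library API).

```python
# Sketch of the column-generation loop. solve_restricted_lp returns the
# duals of the restricted master LP; price_out returns a (column,
# reduced_cost) pair minimizing reduced cost, e.g. found by CP search.
def column_generation(columns, solve_restricted_lp, price_out, tol=1e-8):
    while True:
        duals = solve_restricted_lp(columns)
        column, reduced_cost = price_out(duals)
        if reduced_cost >= -tol:
            return columns          # LP optimal: no improving column exists
        columns.append(column)      # add the priced-out column, re-solve
```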
Example: Airline Crew Scheduling

We want to assign crew members to flights to minimize cost while
covering the flights and observing complex work rules.

[Flight data table with start and finish times omitted.]

A roster is the sequence of flights assigned to a single crew member.
The gap between two consecutive flights in a roster must be from 2 to
3 hours. Total flight time for a roster must be between 6 and 10
hours. For example, flight 1 cannot immediately precede 6, and flight
4 cannot immediately precede 5.

The possible rosters are: (1,3,5), (1,4,6), (2,3,5), (2,4,6)

LSE tutorial, June 2007
Slide 291
Airline Crew Scheduling

There are 2 crew members, and the possible rosters are:

(1,3,5), (1,4,6), (2,3,5), (2,4,6)     (numbered 1–4)

The LP relaxation of the problem is built from:

• variables xik = 1 if we assign crew member i to roster k, and
  0 otherwise;
• the cost cik of assigning crew member i to roster k;
• a constraint that each crew member is assigned to exactly 1 roster;
• a covering constraint for each flight that at least 1 crew member is
  assigned to it: flight 1 is covered by rosters 1 and 2; flight 2 by
  rosters 3 and 4; flight 3 by rosters 1 and 3; flight 4 by rosters 2
  and 4; flight 5 by rosters 1 and 3; flight 6 by rosters 2 and 4.

In a real problem, there can be millions of rosters.

LSE tutorial, June 2007
Slides 292–299
Airline Crew Scheduling

We start by solving the problem with a subset of the columns. The dual
variables are u1, u2 (one per crew member) and v1, …, v6 (one per
flight); let (u, v) be the optimal dual solution.

The reduced cost of an excluded roster k for crew member i is

cik − ui − ∑_{j in roster k} vj

We will formulate the pricing problem as a shortest path problem.

LSE tutorial, June 2007
Slides 300–302
Pricing problem

[Network diagrams for crew members 1 and 2 omitted.] Each crew member
has an s–t network whose internal nodes are flights; each s–t path
corresponds to a roster, provided the total flight time is within
bounds.

Arc lengths: the cost of flight j if it immediately follows flight i,
offset by the dual multiplier for flight i; the cost of transferring
from home to the first flight, offset by the dual multiplier for the
crew member (the multiplier is omitted for one crew member to break
symmetry).

The length of a path is the reduced cost of the corresponding roster.

LSE tutorial, June 2007
Slides 303–307
Pricing problem

Arc lengths using the dual solution of the LP relaxation.

[Network diagrams with numeric arc lengths for crew members 1 and 2
omitted.]

LSE tutorial, June 2007
Slide 308
Pricing problem

Solution of the shortest path problems:

Crew member 1: reduced cost = −1. Add x12 to the problem.
Crew member 2: reduced cost = −2. Add x23 to the problem.

After x12 and x23 are added to the problem, no remaining variable has
negative reduced cost.

LSE tutorial, June 2007
Slide 309
Pricing problem

The shortest path problem cannot be solved by traditional shortest
path algorithms, due to the bounds on total path length. It can be
solved by CP:

Path( Xi, zi, G ), all crew members i        (path global constraint)
Tmin ≤ ∑_{j∈Xi} (fj − sj) ≤ Tmax             (setsum global constraint)
Xi ⊂ {flights},  zi < 0,  all i

where Xi is the set of flights assigned to crew member i, zi the path
length, G the graph, and fj − sj the duration of flight j.

LSE tutorial, June 2007
Slide 310
CP-based Benders Decomposition

Benders Decomposition in the Abstract
Classical Benders Decomposition
Example: Machine Scheduling

LSE tutorial, June 2007
Slide 312
Motivation

• Benders decomposition allows us to apply CP and OR to different
  parts of the problem.
• It searches over values of certain variables that, when fixed,
  result in a much simpler subproblem.
• The search learns from past experience by accumulating Benders cuts
  (a form of nogood).
• The technique can be generalized far beyond the original OR
  conception.
• Generalized Benders methods have resulted in the greatest speedups
  achieved by combining CP and OR.

LSE tutorial, June 2007
Slide 313
Benders Decomposition in the Abstract

Benders decomposition can be applied to problems of the form

min f(x, y)
S(x, y)
x ∈ Dx, y ∈ Dy

When x is fixed to some value, the resulting subproblem is much easier:

min f(x, y)
S(x, y)
y ∈ Dy

…perhaps because it decouples into smaller problems.

For example, suppose x assigns jobs to machines, and y schedules the
jobs on the machines. When x is fixed, the problem decouples into a
separate scheduling subproblem for each machine.

LSE tutorial, June 2007
Slide 314
Benders Decomposition

We will search over assignments to x. This is the master problem.

In iteration k we assume x = x^k and solve the subproblem

min f(x^k, y)
S(x^k, y)
y ∈ Dy

obtaining optimal value vk.

We generate a Benders cut (a type of nogood)

v ≥ Bk+1(x)

that satisfies Bk+1(x^k) = vk, where v is the cost in the original
problem.

The Benders cut says that if we set x = x^k again, the resulting cost
v will be at least vk. To do better than vk, we must try something
else. It also says that any other x will result in a cost of at least
Bk+1(x), perhaps due to some similarity between x and x^k.

We add the Benders cut to the master problem, which becomes

min v
v ≥ Bi(x), i = 1, …, k+1      (Benders cuts generated so far)
x ∈ Dx

LSE tutorial, June 2007
Slides 315–316
Benders Decomposition

We now solve the master problem

min v
v ≥ Bi(x), i = 1, …, k+1
x ∈ Dx

to get the next trial value x^{k+1}.

The master problem is a relaxation of the original problem, and its
optimal value is a lower bound on the optimal value of the original
problem. The subproblem is a restriction, and its optimal value is an
upper bound. The process continues until the bounds meet.

The Benders cuts partially define the projection of the feasible set
onto x. We hope not too many cuts are needed to find the optimum.

LSE tutorial, June 2007
Slide 317
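The abstract loop can be sketched as follows; solve_master and solve_subproblem are hypothetical problem-specific callbacks, not a library API.

```python
# Sketch of the abstract Benders loop. solve_master returns a trial x
# and the master's optimal value (a lower bound); solve_subproblem
# returns the subproblem value (an upper bound) and a new Benders cut.
def benders(solve_master, solve_subproblem, tol=1e-6):
    cuts, upper = [], float("inf")
    while True:
        x, lower = solve_master(cuts)        # relaxation: lower bound
        value, cut = solve_subproblem(x)     # restriction: upper bound
        upper = min(upper, value)
        if upper - lower <= tol:
            return x, upper                  # bounds have met
        cuts.append(cut)                     # add v >= B_{k+1}(x)
```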
Classical Benders Decomposition

The classical method applies to problems of the form

min f(x) + cy
g(x) + Ay ≥ b
x ∈ Dx, y ≥ 0

The subproblem is an LP,

min f(x^k) + cy
Ay ≥ b − g(x^k)   (λ)
y ≥ 0

whose dual is

max f(x^k) + λ(b − g(x^k))
λA ≤ c
λ ≥ 0

Let λ^k solve the dual. By strong duality, Bk+1(x) = f(x) + λ^k(b − g(x))
is the tightest lower bound on the optimal value v of the original
problem when x = x^k. Even for other values of x, λ^k remains feasible
in the dual, so by weak duality, Bk+1(x) remains a lower bound on v.

LSE tutorial, June 2007
Slide 318
Classical Benders

So the master problem becomes

min v
v ≥ f(x) + λ^i(b − g(x)), i = 1, …, k+1
x ∈ Dx

In most applications the master problem is
• an MILP,
• a nonlinear programming problem (NLP), or
• a mixed integer/nonlinear programming problem (MINLP).

LSE tutorial, June 2007
Slide 319
Example: Machine Scheduling

• Assign 5 jobs to 2 machines (A and B), and schedule the jobs
  assigned to each machine within time windows.
• The objective is to minimize makespan (the time lapse between the
  start of the first job and the end of the last job).

• Assign the jobs in the master problem, to be solved by MILP.
• Schedule the jobs in the subproblem, to be solved by CP.

LSE tutorial, June 2007
Slide 320
Machine Scheduling

[Job data table omitted.]

Once jobs are assigned, we can minimize overall makespan by minimizing
makespan on each machine individually. So the subproblem decouples.

LSE tutorial, June 2007
Slide 321
Machine Scheduling

[Figure: minimum makespan schedule for jobs 1, 2, 3, 5 on machine A.]

Once jobs are assigned, we can minimize overall makespan by minimizing
makespan on each machine individually. So the subproblem decouples.

LSE tutorial, June 2007
Slide 322
Machine Scheduling

The problem is

min M
M ≥ sj + p_{xj,j} , all j                    (sj = start time of job j)
rj ≤ sj ≤ dj − p_{xj,j} , all j              (time windows)
disjunctive( (sj | xj = i), (pij | xj = i) ), all i
                                             (jobs on a machine cannot
                                              overlap)

LSE tutorial, June 2007
Slide 323
Machine Scheduling

The problem is

min M
M ≥ sj + p_{xj,j} , all j
rj ≤ sj ≤ dj − p_{xj,j} , all j
disjunctive( (sj | xj = i), (pij | xj = i) ), all i

For a fixed assignment x, the subproblem on each machine i is

min M
M ≥ sj + pij , all j with xj = i
rj ≤ sj ≤ dj − pij , all j with xj = i
disjunctive( (sj | xj = i), (pij | xj = i) )

LSE tutorial, June 2007
Slide 324
Benders cuts

Suppose we assign jobs 1, 2, 3, 5 to machine A in iteration k. We can
prove that 10 is the optimal makespan by proving that the schedule is
infeasible with makespan 9.

Edge finding derives infeasibility by reasoning only with jobs 2, 3,
5. So these jobs alone create a minimum makespan of 10. So we have a
Benders cut

v ≥ Bk+1(x) = 10 if x2 = x3 = x5 = A, and 0 otherwise

LSE tutorial, June 2007
Slide 325
Benders cuts

We want the master problem to be an MILP, which is good for assignment
problems. So we write the Benders cut

v ≥ Bk+1(x) = 10 if x2 = x3 = x5 = A, and 0 otherwise

using 0-1 variables:

v ≥ 10 (xA2 + xA3 + xA5 − 2)
v ≥ 0

(xAj = 1 if job j is assigned to machine A.)

LSE tutorial, June 2007
Slide 326
Master problem

The master problem is an MILP:

min v
∑_{j=1}^{5} pAj xAj ≤ 10, etc.        ← constraints derived from time windows
∑_{j=1}^{5} pBj xBj ≤ 10, etc.        ← constraints derived from release times
v ≥ ∑_{j=1}^{5} pij xij ,  v ≥ 2 + ∑_{j=3}^{5} pij xij , etc.,  i = A, B
v ≥ 10 (xA2 + xA3 + xA5 − 2)          ← Benders cut from machine A
v ≥ 8 xB4                             ← Benders cut from machine B
xij ∈ {0,1}

LSE tutorial, June 2007
Slide 327
Stronger Benders cuts

If all release times are the same, we can strengthen the Benders cuts.
We are now using the cut

v ≥ Mik ( ∑_{j∈Jik} xij − |Jik| + 1 )

where Mik is the minimum makespan on machine i in iteration k, and Jik
the set of jobs assigned to machine i in iteration k.

A stronger cut provides a useful bound even if only some of the jobs
in Jik are assigned to machine i:

v ≥ Mik − ∑_{j∈Jik} (1 − xij) pij

These results can be generalized to cumulative scheduling.

LSE tutorial, June 2007
Slide 328
LSE tutorial, June 2007
Slide 329