Tutorial: Operations Research and Constraint Programming

John Hooker
Carnegie Mellon University
June 2008

Slide 1
Why Integrate OR and CP?

Complementary strengths
Computational advantages
Outline of the Tutorial

Slide 2
Complementary Strengths

• CP:
  – Inference methods
  – Modeling
  – Exploits local structure
• OR:
  – Relaxation methods
  – Duality theory
  – Exploits global structure

Let's bring them together!

Slide 3
Computational Advantage of Integrating CP and OR
Using CP + relaxation from MILP

Study | Problem | Speedup
Focacci, Lodi & Milano (1999) | Lesson timetabling | 2 to 50 times faster than CP
Refalo (1999) | Piecewise linear costs | 2 to 200 times faster than MILP
Hooker & Osorio (1999) | Flow shop scheduling, etc. | 4 to 150 times faster than MILP
Thorsteinsson & Ottosson (2001) | Product configuration | 30 to 40 times faster than CP, MILP

Slide 4
Computational Advantage of Integrating CP and MILP
Using CP + relaxation from MILP

Study | Problem | Speedup
Sellmann & Fahle (2001) | Automatic recording | 1 to 10 times faster than CP, MILP
Van Hoeve (2001) | Stable set problem | Better than CP in less time
Bollapragada, Ghattas & Hooker (2001) | Structural design (nonlinear) | Up to 600 times faster than MILP; 2 problems: <6 min vs. >20 hrs for MILP
Beck & Refalo (2003) | Scheduling with earliness & tardiness costs | Solved 67 of 90 instances; CP solved only 12
Slide 5
Computational Advantage of Integrating CP and MILP
Using CP-based Branch and Price

Study | Problem | Speedup
Yunes, Moura & de Souza (1999) | Urban transit crew scheduling | Optimal schedule for 210 trips, vs. 120 for traditional branch and price
Easton, Nemhauser & Trick (2002) | Traveling tournament scheduling | First to solve 8-team instance

Slide 6
Computational Advantage of Integrating CP and MILP
Using CP/MILP Benders methods

Study | Problem | Speedup
Jain & Grossmann (2001) | Min-cost planning & scheduling | 20 to 1000 times faster than CP, MILP
Thorsteinsson (2001) | Min-cost planning & scheduling | 10 times faster than Jain & Grossmann
Timpe (2002) | Polypropylene batch scheduling at BASF | Solved previously insoluble problem in 10 min

Slide 7
Computational Advantage of Integrating CP and MILP
Using CP/MILP Benders methods

Study | Problem | Speedup
Benoist, Gaudin & Rottembourg (2002) | Call center scheduling | Solved twice as many instances as traditional Benders
Hooker (2004) | Min-cost, min-makespan planning & cumulative scheduling | 100-1000 times faster than CP, MILP
Hooker (2005) | Min tardiness planning & cumulative scheduling | 10-1000 times faster than CP, MILP

Slide 8
Outline of the Tutorial
• Why Integrate OR and CP?
• A Glimpse at CP
• Initial Example: Integrated Methods
• CP Concepts
• CP Filtering Algorithms
• Linear Relaxation and CP
• Mixed Integer/Linear Modeling
• Cutting Planes
• Lagrangean Relaxation and CP
• Dynamic Programming in CP
• CP-based Branch and Price
• CP-based Benders Decomposition

Slide 9
Detailed Outline

• Why Integrate OR and CP?


• Complementary strengths
• Computational advantages
• Outline of the tutorial
• A Glimpse at CP
• Early successes
• Advantages and disadvantages
• Initial Example: Integrated Methods
• Freight Transfer
• Bounds Propagation
• Cutting Planes
• Branch-infer-and-relax Tree

Slide 10
Detailed Outline

• CP Concepts
• Consistency
• Hyperarc Consistency
• Modeling Examples
• CP Filtering Algorithms
• Element
• Alldiff
• Disjunctive Scheduling
• Cumulative Scheduling
• Linear Relaxation and CP
• Why relax?
• Algebraic Analysis of LP
• Linear Programming Duality
• LP-Based Domain Filtering
• Example: Single-Vehicle Routing
• Disjunctions of Linear Systems
Slide 11
Detailed Outline
• Mixed Integer/Linear Modeling
• MILP Representability
• Disjunctive Modeling
• Knapsack Modeling
• Cutting Planes
• 0-1 Knapsack Cuts
• Gomory Cuts
• Mixed Integer Rounding Cuts
• Example: Product Configuration
• Lagrangean Relaxation and CP
• Lagrangean Duality
• Properties of the Lagrangean Dual
• Example: Fast Linear Programming
• Domain Filtering
• Example: Continuous Global Optimization

Slide 12
Detailed Outline
• Dynamic Programming in CP
• Example: Capital Budgeting
• Domain Filtering
• Recursive Optimization
• CP-based Branch and Price
• Basic Idea
• Example: Airline Crew Scheduling
• CP-based Benders Decomposition
• Benders Decomposition in the Abstract
• Classical Benders Decomposition
• Example: Machine Scheduling

Slide 13
Background Reading

This tutorial is based on:


• J. N. Hooker, Integrated Methods for Optimization, Springer
(2007). Contains 295 exercises.
• J. N. Hooker, Operations research methods in constraint
programming, in F. Rossi, P. van Beek and T. Walsh, eds.,
Handbook of Constraint Programming, Elsevier (2006), pp.
527-570.

Slide 14
A Glimpse at Constraint Programming

Early Successes
Advantages and Disadvantages

Slide 15
What is constraint programming?

• It is a relatively new technology developed in the computer


science and artificial intelligence communities.
• It has found an important role in scheduling, logistics and supply
chain management.

Slide 16
Early commercial successes

• Container port scheduling (Hong Kong and Singapore)
• Circuit design (Siemens)
• Real-time control (Siemens, Xerox)

Slide 17
Applications

• Job shop scheduling
• Assembly line smoothing and balancing
• Cellular frequency assignment
• Nurse scheduling
• Shift planning
• Maintenance planning
• Airline crew rostering and scheduling
• Airport gate allocation and stand planning

Slide 18
Applications

• Production scheduling: chemicals, aviation, oil refining, steel, lumber, photographic plates, tires
• Transport scheduling (food, nuclear fuel)
• Warehouse management
• Course timetabling

Slide 19
Advantages and Disadvantages

CP vs. Mathematical Programming

MP | CP
Numerical calculation | Logic processing
Relaxation | Inference (filtering, constraint propagation)
Atomistic modeling (linear inequalities) | High-level modeling (global constraints)
Branching | Branching
Independence of model and algorithm | Constraint-based processing

Slide 20
Programming ≠ programming

• In constraint programming:
• programming = a form of computer programming
(constraint-based processing)

• In mathematical programming:
• programming = logistics planning (historically)

Slide 21
CP vs. MP

• In mathematical programming, equations (constraints) describe the problem but don't tell how to solve it.

• In constraint programming, each constraint invokes a procedure that screens out unacceptable solutions.
  • Much as each line of a computer program invokes an operation.

Slide 22
Advantages of CP

• Better at sequencing and scheduling
  • …where MP methods have weak relaxations.
• Adding messy constraints makes the problem easier.
  • The more constraints, the better.
• More powerful modeling language.
  • Global constraints lead to succinct models.
  • Constraints convey problem structure to the solver.
• "Better at highly constrained problems"
  • Misleading – better when constraints propagate well, or when constraints have few variables.

Slide 23
Disadvantages of CP

• Weaker for continuous variables.
  • Due to lack of numerical techniques.
• May fail when constraints contain many variables.
  • These constraints don't propagate well.
• Often not good for finding optimal solutions.
  • Due to lack of relaxation technology.
• May not scale up.
  • Discrete combinatorial methods.
• Software is not robust.
  • Younger field.
Slide 24
Obvious solution…

• Integrate CP and MP.


• More on this later.

Slide 25
Trends

• CP is better known in continental Europe, Asia.


• Less known in North America, seen as threat to OR.
• CP/MP integration is growing
• Eclipse, Mozart, OPL Studio, SIMPL, SCIP, BARON
• Heuristic methods increasingly important in CP
• Discrete combinatorial methods
• MP/CP/heuristics may become a single technology.

Slide 26
Initial Example: Integrated Methods

Freight Transfer
Bounds Propagation
Cutting Planes
Branch-infer-and-relax Tree

Slide 27
Example: Freight Transfer

• Transport 42 tons of freight using 8 trucks, which come in 4 sizes…

Truck size | Number available | Capacity (tons) | Cost per truck
1 | 3 | 7 | 90
2 | 3 | 5 | 60
3 | 3 | 4 | 50
4 | 3 | 3 | 40
Slide 28
Let xi = the number of trucks of type i.

min 90x1 + 60x2 + 50x3 + 40x4
7x1 + 5x2 + 4x3 + 3x4 ≥ 42   (knapsack covering constraint)
x1 + x2 + x3 + x4 ≤ 8        (knapsack packing constraint)
xi ∈ {0,1,2,3}
Slide 29
Bounds propagation

min 90x1 + 60x2 + 50x3 + 40x4
7x1 + 5x2 + 4x3 + 3x4 ≥ 42
x1 + x2 + x3 + x4 ≤ 8
xi ∈ {0,1,2,3}

x1 ≥ ⌈(42 − 5⋅3 − 4⋅3 − 3⋅3)/7⌉ = 1

Slide 30
Bounds propagation

min 90x1 + 60x2 + 50x3 + 40x4
7x1 + 5x2 + 4x3 + 3x4 ≥ 42
x1 + x2 + x3 + x4 ≤ 8
x1 ∈ {1,2,3}, x2, x3, x4 ∈ {0,1,2,3}   (reduced domain)

x1 ≥ ⌈(42 − 5⋅3 − 4⋅3 − 3⋅3)/7⌉ = 1

Slide 31
Bounds consistency

• Let {Lj, …, Uj} be the domain of xj.
• A constraint set is bounds consistent if for each j:
  • xj = Lj in some feasible solution and
  • xj = Uj in some feasible solution.
• Bounds consistency ⇒ we will not set xj to any infeasible values during branching.
• Bounds propagation achieves bounds consistency for a single inequality.
  • 7x1 + 5x2 + 4x3 + 3x4 ≥ 42 is bounds consistent when the domains are x1 ∈ {1,2,3} and x2, x3, x4 ∈ {0,1,2,3}.
• But not necessarily for a set of inequalities.
Slide 32
Bounds consistency

Bounds propagation may not achieve bounds consistency for a set of constraints.

Consider the set of inequalities

x1 + x2 ≥ 1
x1 − x2 ≥ 0

with domains x1, x2 ∈ {0,1} and solutions (x1,x2) = (1,0), (1,1).

Bounds propagation has no effect on the domains. But the constraint set is not bounds consistent, because x1 = 0 in no feasible solution.

Slide 33
Cutting Planes

Begin with a continuous relaxation:

min 90x1 + 60x2 + 50x3 + 40x4
7x1 + 5x2 + 4x3 + 3x4 ≥ 42
x1 + x2 + x3 + x4 ≤ 8
0 ≤ xi ≤ 3, x1 ≥ 1   (domains replaced with bounds)

This is a linear programming problem, which is easy to solve. Its optimal value provides a lower bound on the optimal value of the original problem.

Slide 34
Cutting planes (valid inequalities)

min 90x1 + 60x2 + 50x3 + 40x4
7x1 + 5x2 + 4x3 + 3x4 ≥ 42
x1 + x2 + x3 + x4 ≤ 8
0 ≤ xi ≤ 3, x1 ≥ 1

We can create a tighter relaxation (larger minimum value) by adding cutting planes.

Slide 35
Cutting planes (valid inequalities)

min 90x1 + 60x2 + 50x3 + 40x4
7x1 + 5x2 + 4x3 + 3x4 ≥ 42
x1 + x2 + x3 + x4 ≤ 8
0 ≤ xi ≤ 3, x1 ≥ 1

All feasible solutions of the original problem satisfy a cutting plane (i.e., it is valid). But a cutting plane may exclude ("cut off") solutions of the continuous relaxation.

[Figure: the feasible solutions, the continuous relaxation, and a cutting plane.]
Slide 36
Cutting planes (valid inequalities)

min 90x1 + 60x2 + 50x3 + 40x4
7x1 + 5x2 + 4x3 + 3x4 ≥ 42
x1 + x2 + x3 + x4 ≤ 8
0 ≤ xi ≤ 3, x1 ≥ 1

{1,2} is a packing, because 7x1 + 5x2 alone cannot satisfy the inequality, even with x1 = x2 = 3.

Slide 37
Cutting planes (valid inequalities)

min 90x1 + 60x2 + 50x3 + 40x4
7x1 + 5x2 + 4x3 + 3x4 ≥ 42
x1 + x2 + x3 + x4 ≤ 8
0 ≤ xi ≤ 3, x1 ≥ 1

{1,2} is a packing. So

4x3 + 3x4 ≥ 42 − (7⋅3 + 5⋅3)

which implies the knapsack cut

x3 + x4 ≥ ⌈(42 − (7⋅3 + 5⋅3)) / max{4,3}⌉ = 2
Slide 38
Cutting planes (valid inequalities)

Let xi have domain [Li, Ui] and let a ≥ 0. In general, a packing P for ax ≥ a0 satisfies

∑_{i∉P} ai xi ≥ a0 − ∑_{i∈P} ai Ui

and generates a knapsack cut

∑_{i∉P} xi ≥ ⌈( a0 − ∑_{i∈P} ai Ui ) / max_{i∉P} {ai} ⌉
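As an illustration (a minimal Python sketch added in this rewrite, not part of the tutorial), the cut formula can be applied mechanically by enumerating packings; the maximal packings reproduce the three cuts tabulated on the next slide:

from math import ceil
from itertools import combinations

def knapsack_cuts(a, a0, U):
    # Enumerate packings P for sum_i a[i]*x[i] >= a0 (a >= 0, x[i] <= U[i])
    # and emit cuts: sum over i not in P of x[i] >= ceil(rhs / max coefficient).
    n = len(a)
    cuts = []
    for r in range(1, n):
        for P in combinations(range(n), r):
            rest = [i for i in range(n) if i not in P]
            rhs = a0 - sum(a[i] * U[i] for i in P)
            if rhs > 0:   # nontrivial cut: the remaining terms are still needed
                cuts.append((rest, ceil(rhs / max(a[i] for i in rest))))
    return cuts

# Freight transfer: 7x1 + 5x2 + 4x3 + 3x4 >= 42, Ui = 3 (0-based indices)
for rest, rhs in knapsack_cuts([7, 5, 4, 3], 42, [3, 3, 3, 3]):
    print("sum of x over", rest, ">=", rhs)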

Slide 39
Cutting planes (valid inequalities)

min 90x1 + 60x2 + 50x3 + 40x4
7x1 + 5x2 + 4x3 + 3x4 ≥ 42
x1 + x2 + x3 + x4 ≤ 8
0 ≤ xi ≤ 3, x1 ≥ 1

Maximal packings | Knapsack cuts
{1,2} | x3 + x4 ≥ 2
{1,3} | x2 + x4 ≥ 2
{1,4} | x2 + x3 ≥ 3

Knapsack cuts corresponding to nonmaximal packings can be nonredundant.
Slide 40
Continuous relaxation with cuts

min 90x1 + 60x2 + 50x3 + 40x4
7x1 + 5x2 + 4x3 + 3x4 ≥ 42
x1 + x2 + x3 + x4 ≤ 8
0 ≤ xi ≤ 3, x1 ≥ 1
x3 + x4 ≥ 2
x2 + x4 ≥ 2   (knapsack cuts)
x2 + x3 ≥ 3

The optimal value of 523⅓ is a lower bound on the optimal value of the original problem.

Slide 41
Branch-infer-and-relax tree

Root node: x1 ∈ {1,2,3}, x2, x3, x4 ∈ {0,1,2,3}; relaxation solution x = (2⅓, 3, 2⅔, 0), value = 523⅓.

Propagate bounds and solve the relaxation of the original problem.

Slide 42
Branch-infer-and-relax tree

At the root, x = (2⅓, 3, 2⅔, 0) with value 523⅓. Branch on a variable with a nonintegral value in the relaxation: x1 ∈ {1,2} or x1 = 3.

Slide 43
Branch-infer-and-relax tree

Left branch (x1 ∈ {1,2}): propagate bounds and solve the relaxation. Propagation reduces the domains to x1 ∈ {1,2}, x2 ∈ {2,3}, x3 ∈ {1,2,3}, x4 ∈ {1,2,3}, and the relaxation is infeasible, so backtrack.

Slide 44
Branch-infer-and-relax tree

Right branch (x1 = 3): propagate bounds and solve the relaxation. Domains become x1 ∈ {3}, x2, x3, x4 ∈ {0,1,2,3}; relaxation solution x = (3, 2.6, 2, 0), value = 526. Branch on the nonintegral variable x2: x2 ∈ {0,1,2} or x2 = 3.

Slide 45
Branch-infer-and-relax tree

Branch again. At the node with x1 = 3, x2 ∈ {0,1,2}: the domains are x1 ∈ {3}, x2 ∈ {0,1,2}, x3 ∈ {1,2,3}, x4 ∈ {0,1,2,3}; relaxation solution x = (3, 2, 2¾, 0), value = 527½. Branch on x3: x3 ∈ {1,2} or x3 = 3.

Slide 46
Branch-infer-and-relax tree

At the node with x3 ∈ {1,2}: the domains are x1 ∈ {3}, x2 ∈ {1,2}, x3 ∈ {1,2}, x4 ∈ {1,2,3}; relaxation solution x = (3, 2, 2, 1), value = 530. The solution of the relaxation is integral and therefore feasible in the original problem. This becomes the incumbent solution.
Slide 47
Branch-infer-and-relax tree

At the node with x3 = 3: the domains are x1 ∈ {3}, x2 ∈ {0,1,2}, x3 ∈ {3}, x4 ∈ {0,1,2}; relaxation solution x = (3, 1½, 3, ½), value = 530. The solution is nonintegral, but we can backtrack due to the bound, because the value of the relaxation is no better than the incumbent solution.
Slide 48
Branch-infer-and-relax tree

At the node with x2 = 3: the domains are x1 ∈ {3}, x2 ∈ {3}, x3 ∈ {0,1,2}, x4 ∈ {0,1,2}; relaxation solution x = (3, 3, 0, 2), value = 530. Another feasible solution is found. It is no better than the incumbent solution, which is optimal because the search has finished.
Slide 49
Two optimal solutions…

x = (3,2,2,1)

x = (3,3,0,2)

Slide 50
Constraint Programming Concepts

Consistency
Hyperarc Consistency
Modeling Examples

Slide 51
Consistency

• A constraint set is consistent if every partial assignment to the variables that violates no constraint is feasible.
  • i.e., can be extended to a feasible solution.
• Consistency ≠ feasibility.
  • Consistency means that any infeasible partial assignment is explicitly ruled out by a constraint.
• Fully consistent constraint sets can be solved without backtracking.

Slide 52
Consistency

Consider the constraint set

x1 + x100 ≥ 1
x1 − x100 ≥ 0
xj ∈ {0,1}

It is not consistent, because x1 = 0 violates no constraint and yet is infeasible (no solution has x1 = 0). Adding the constraint x1 = 1 makes the set consistent.

Slide 53
x1 + x100 ≥ 1
x1 − x100 ≥ 0
other constraints
xj ∈ {0,1}

Branching on x1 = 0 versus x1 = 1: the left subtree (x1 = 0) has 2^99 nodes but no feasible solution. By adding the constraint x1 = 1, the left subtree is eliminated.

Slide 54
Hyperarc Consistency

• Also known as generalized arc consistency.


• A constraint set is hyperarc consistent if every value in
every variable domain is part of some feasible solution.
• That is, the domains are reduced as much as
possible.
• If all constraints are “binary” (contain 2 variables),
hyperarc consistent = arc consistent.
• Domain reduction is CP’s biggest engine.

Slide 55
Graph coloring problem that can be solved by arc consistency maintenance alone: color the nodes red, green, and blue with no two adjacent nodes having the same color.

[Figures on Slides 56-62: successive arc consistency steps, each reducing a node's color domain until a proper coloring remains.]
Slide 62
Modeling Examples with Global Constraints
Traveling Salesman

Traveling salesman problem: let cij = the distance from city i to city j. Find the shortest route that visits each of n cities exactly once.

Slide 63
Popular 0-1 model

Let xij = 1 if city i immediately precedes city j, 0 otherwise.

min ∑_{ij} cij xij
s.t. ∑_i xij = 1, all j
     ∑_j xij = 1, all i
     ∑_{i∈V} ∑_{j∈W} xij ≥ 1, all disjoint V, W ⊂ {1,…,n}   (subtour elimination constraints)
     xij ∈ {0,1}

Slide 64
A CP model

Let yk = the kth city visited. The model would be written in a specific constraint programming language but would essentially say:

min ∑_k c_{y_k y_{k+1}}   (variable indices)
s.t. alldiff(y1, …, yn)   ("global" constraint)
     yk ∈ {1,…,n}

Slide 65
An alternate CP model

Let yk = the city visited after city k.

min ∑_k c_{k y_k}
s.t. circuit(y1, …, yn)   (Hamiltonian circuit constraint)
     yk ∈ {1,…,n}

Slide 66
Element constraint

The constraint c_y ≤ 5 can be implemented:

z ≤ 5
element(y, (c1,…,cn), z)   (assign z the yth value in the list)

The constraint x_y ≤ 5 can be implemented:

z ≤ 5
element(y, (x1,…,xn), z)   (adds the constraint z = x_y; this is a slightly different constraint)

Slide 67
Modeling example: Lot sizing and scheduling

[Figure: which product (A or B) is manufactured on each of days 1-8.]

• At most one product manufactured on each day.
• Demands for each product on each day.
• Minimize setup + holding cost.
Slide 68
Integer programming model (Wolsey), with many variables:

min ∑_{t,i} ( hi sit + ∑_{j≠i} qij δijt )
s.t. si,t−1 + xit = dit + sit, all i, t
     zit ≥ yit − yi,t−1, all i, t
     zit ≤ yit, all i, t
     zit ≤ 1 − yi,t−1, all i, t
     δijt ≥ yi,t−1 + yjt − 1, all i, j, t
     δijt ≤ yi,t−1, all i, j, t
     δijt ≤ yjt, all i, j, t
     xit ≤ C yit, all i, t
     ∑_i yit = 1, all t
     yit, zit, δijt ∈ {0,1}
     xit, sit ≥ 0
Slide 69
CP model

Minimize holding and setup costs:

min ∑_t ( q_{y_{t−1} y_t} + ∑_i hi sit )   (variable indices in the setup cost)
s.t. si,t−1 + xit = dit + sit, all i, t    (inventory balance)
     0 ≤ xit ≤ C, sit ≥ 0, all i, t       (production capacity)
     (yt ≠ i) → (xit = 0), all i, t

where xit = production level of product i in period t, and yt = product manufactured in period t.
Slide 71
Cumulative scheduling constraint

• Used for resource-constrained scheduling.
• Total resources consumed by jobs at any one time must not exceed L.

cumulative((t1,…,tn), (p1,…,pn), (c1,…,cn), L)

where t1,…,tn are the job start times (variables), p1,…,pn the job processing times, and c1,…,cn the job resource requirements.

Slide 72
Cumulative scheduling constraint

Minimize makespan (no deadlines, all release times = 0):

[Figure: five jobs packed under the resource limit L = 7; min makespan = 8.]

min z
s.t. cumulative((t1,…,t5), (3,3,3,5,5), (3,3,3,2,2), 7)
     z ≥ t1 + 3
     ⋮
     z ≥ t5 + 2

where (3,3,3,5,5) are the processing times, (3,3,3,2,2) the resources used, and t1,…,t5 the job start times.
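To make the constraint concrete, this Python sketch (illustrative, not from the tutorial) checks a candidate schedule against the resource limit and confirms a makespan of 8 for the five-job example:

def cumulative_ok(start, dur, res, limit):
    # Check the cumulative constraint: at every time point, the total resource
    # use of the jobs in progress must not exceed the limit.
    horizon = max(s + d for s, d in zip(start, dur))
    return all(
        sum(r for s, d, r in zip(start, dur, res) if s <= t < s + d) <= limit
        for t in range(horizon)
    )

# Processing times (3,3,3,5,5), resource use (3,3,3,2,2), limit 7.
# One min-makespan schedule (makespan 8):
starts = [0, 3, 5, 0, 0]
print(cumulative_ok(starts, [3, 3, 3, 5, 5], [3, 3, 3, 2, 2], 7))   # True
print(max(s + d for s, d in zip(starts, [3, 3, 3, 5, 5])))          # 8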
Slide 73
Modeling example: Ship loading

• Will use ILOG's OPL Studio modeling language.
• Example is from the OPL manual.
• The problem:
  • Load 34 items on the ship in minimum time (min makespan).
  • Each item requires a certain time and a certain number of workers.
  • Total of 8 workers available.

Slide 74
Problem data

Item | Duration | Labor      Item | Duration | Labor
1 | 3 | 4                    18 | 2 | 7
2 | 4 | 4                    19 | 1 | 4
3 | 4 | 3                    20 | 1 | 4
4 | 6 | 4                    21 | 1 | 4
5 | 5 | 5                    22 | 2 | 4
6 | 2 | 5                    23 | 4 | 7
7 | 3 | 4                    24 | 5 | 8
8 | 4 | 3                    25 | 2 | 8
9 | 3 | 4                    26 | 1 | 3
10 | 2 | 8                   27 | 1 | 3
11 | 3 | 4                   28 | 2 | 6
12 | 2 | 5                   29 | 1 | 8
13 | 1 | 4                   30 | 3 | 3
14 | 5 | 3                   31 | 2 | 3
15 | 2 | 3                   32 | 1 | 3
16 | 3 | 3                   33 | 2 | 3
17 | 2 | 6                   34 | 2 | 3

Slide 75
Precedence constraints

1 → 2,4 11 →13 22 →23


2 →3 12 →13 23 →24
3 →5,7 13 →15,16 24 →25
4 →5 14 →15 25 →26,30,31,32
5 →6 15 →18 26 → 27
6 →8 16 →17 27 → 28
7 →8 17 →18 28 → 29
8 →9 18 →19 30 → 28
9 →10 18 →20,21 31 → 28
9 →14 19 →23 32 → 33
10 →11 20 → 23 33 → 34
10 →12 21 → 22

Slide 76
Use the cumulative scheduling constraint.

min z
s.t. z ≥ t1 + 3, z ≥ t2 + 4, etc.
cumulative ( (t1,…, t34 ),(3,4,…,2),(4,4,…,3),8 )
t2 ≥ t1 + 3, t 4 ≥ t1 + 3, etc.

Slide 77
OPL model

int capacity = 8;
int nbTasks = 34;
range Tasks 1..nbTasks;
int duration[Tasks] = [3,4,4,6,…,2];
int totalDuration =
sum(t in Tasks) duration[t];
int demand[Tasks] = [4,4,3,4,…,3];
struct Precedences {
int before;
int after;
}
{Precedences} setOfPrecedences = {
<1,2>, <1,4>, …, <33,34> };

Slide 78
scheduleHorizon = totalDuration;
Activity a[t in Tasks](duration[t]);
DiscreteResource res(8);
Activity makespan(0);
minimize
makespan.end
subject to {
forall(t in Tasks)
a[t] precedes makespan;
forall(p in setOfPrecedences)
a[p.before] precedes a[p.after];
forall(t in Tasks)
a[t] requires(demand[t]) res;
};

Slide 79
Modeling example: Production scheduling with intermediate storage

[Figure: a manufacturing unit feeds storage tanks of capacities C1, C2, C3, which in turn feed packing units.]
Slide 80
Filling of storage tank

[Figure: tank level over time. Filling starts at time t (manufacturing rate r, batch size b) and ends at t + b/r; packing starts at u (packing rate s) and ends at u + b/s. The capacity constraint need be enforced only at the peak level.]
Slide 81
min T
s.t. T ≥ uj + bj/sj, all j                       (makespan)
     tj ≥ Rj, all j                              (job release time)
     cumulative(t, v, e, m)                      (m storage tanks)
     vi = ui + bi/si − ti, all i                 (job duration)
     bi(1 − si/ri) + si(ui − ti) ≤ Ci, all i     (tank capacity)
     cumulative(u, (b1/s1, …, bn/sn), e, p)      (p packing units)
     uj ≥ tj ≥ 0
     e = (1,…,1)
Slide 82
Modeling example: Employee scheduling

• Schedule four nurses in 8-hour shifts.


• A nurse works at most one shift a day, at least 5 days a week.
• Same schedule every week.
• No shift staffed by more than two different nurses in a week.
• A nurse cannot work different shifts on two consecutive days.
• A nurse who works shift 2 or 3 must do so at least two days in
a row.

Slide 83
Two ways to view the problem

Assign nurses to shifts


Sun Mon Tue Wed Thu Fri Sat
Shift 1 A B A A A A A
Shift 2 C C C B B B B
Shift 3 D D D D C C D

Assign shifts to nurses


Sun Mon Tue Wed Thu Fri Sat
Nurse A 1 0 1 1 1 1 1
Nurse B 0 1 0 2 2 2 2
Nurse C 2 2 2 0 3 3 0
Nurse D 3 3 3 3 0 0 3

0 = day off


Slide 84
Use both formulations in the same model!
First, assign nurses to shifts.
Let wsd = nurse assigned to shift s on day d

alldiff(w1d, w2d, w3d), all d

The variables w1d, w2d, w3d take different values; that is, schedule 3 different nurses on each day.

Slide 85
Use both formulations in the same model!
First, assign nurses to shifts.
Let wsd = nurse assigned to shift s on day d

alldiff(w1d, w2d, w3d), all d
cardinality(w | (A,B,C,D), (5,5,5,5), (6,6,6,6))

A occurs at least 5 and at most 6 times in the array w, and similarly for B, C, D. That is, each nurse works at least 5 and at most 6 days a week.

Slide 86
Use both formulations in the same model!
First, assign nurses to shifts.
Let wsd = nurse assigned to shift s on day d

alldiff(w1d, w2d, w3d), all d
cardinality(w | (A,B,C,D), (5,5,5,5), (6,6,6,6))
nvalues(w_{s,Sun}, …, w_{s,Sat} | 1, 2), all s

The variables w_{s,Sun}, …, w_{s,Sat} take at least 1 and at most 2 different values. That is, at least 1 and at most 2 nurses work any given shift.
Slide 87
Remaining constraints are not easily expressed in this
notation.
So, assign shifts to nurses.
Let yid = shift assigned to nurse i on day d

alldiff(y1d, y2d, y3d), all d

Assign a different nurse to each shift on each day. This constraint is redundant of previous constraints, but redundant constraints speed solution.
Slide 88
Remaining constraints are not easily expressed in this
notation.
So, assign shifts to nurses.
Let yid = shift assigned to nurse i on day d

alldiff(y1d, y2d, y3d), all d
stretch(y_{i,Sun}, …, y_{i,Sat} | (2,3), (2,2), (6,6), P), all i

Every stretch of 2's has length between 2 and 6, and every stretch of 3's has length between 2 and 6. So a nurse who works shift 2 or 3 must do so at least two days in a row.

Slide 89
Remaining constraints are not easily expressed in this
notation.
So, assign shifts to nurses.
Let yid = shift assigned to nurse i on day d

alldiff(y1d, y2d, y3d), all d
stretch(y_{i,Sun}, …, y_{i,Sat} | (2,3), (2,2), (6,6), P), all i

Here P = {(s,0), (0,s) | s = 1,2,3}. Whenever a stretch of a's immediately precedes a stretch of b's, (a,b) must be one of the pairs in P. So a nurse cannot switch shifts without taking at least one day off.
Slide 90
Now we must connect the wsd variables to the yid variables.
Use channeling constraints:

w_{y_{id}, d} = i, all i, d
y_{w_{sd}, d} = s, all s, d

Channeling constraints increase propagation and make the problem easier to solve.

Slide 91
The complete model is:

alldiff(w1d, w2d, w3d), all d
cardinality(w | (A,B,C,D), (5,5,5,5), (6,6,6,6))
nvalues(w_{s,Sun}, …, w_{s,Sat} | 1, 2), all s

alldiff(y1d, y2d, y3d), all d
stretch(y_{i,Sun}, …, y_{i,Sat} | (2,3), (2,2), (6,6), P), all i

w_{y_{id}, d} = i, all i, d
y_{w_{sd}, d} = s, all s, d

Slide 92
CP Filtering Algorithms

Element
Alldiff
Disjunctive Scheduling
Cumulative Scheduling

Slide 93
Filtering for element

element(y, (x1,…,xn), z)

Variable domains can be easily filtered to maintain hyperarc consistency:

Dz ← Dz ∩ ( ∪_{j∈Dy} Dxj )
Dy ← Dy ∩ { j | Dz ∩ Dxj ≠ ∅ }
Dxj ← Dz if Dy = {j}; Dxj otherwise

Slide 94
Filtering for element

Example… element(y, (x1, x2, x3, x4), z)

Initial domains:                Reduced domains:
Dz = {20,30,60,80,90}           Dz = {80,90}
Dy = {1,3,4}                    Dy = {3}
Dx1 = {10,50}                   Dx1 = {10,50}
Dx2 = {10,20}                   Dx2 = {10,20}
Dx3 = {40,50,80,90}             Dx3 = {80,90}
Dx4 = {40,50,70}                Dx4 = {40,50,70}
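The reduced domains can be reproduced by coding the three rules directly; this Python sketch (illustrative, added in this rewrite) applies one pass of the filtering:

def filter_element(Dy, Dx, Dz):
    # Hyperarc-consistency filtering for element(y, (x1,...,xn), z).
    # Dx maps each index j to the domain of xj; all domains are Python sets.
    Dz = Dz & set().union(*(Dx[j] for j in Dy))   # z must equal some xj, j in Dy
    Dy = {j for j in Dy if Dz & Dx[j]}            # keep indices that can match z
    if len(Dy) == 1:                              # y is fixed, so that xj equals z
        j = next(iter(Dy))
        Dx = dict(Dx, **{j: Dz})
    return Dy, Dx, Dz

Dy, Dx, Dz = filter_element(
    {1, 3, 4},
    {1: {10, 50}, 2: {10, 20}, 3: {40, 50, 80, 90}, 4: {40, 50, 70}},
    {20, 30, 60, 80, 90},
)
print(Dy, Dz, Dx[3])   # {3} {80, 90} {80, 90}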

Slide 95
Filtering for alldiff
alldiff ( y1,…, y n )

Domains can be filtered with an algorithm based on maximum


cardinality bipartite matching and a theorem of Berge.
It is a special case of optimality conditions for max flow.

Slide 96
Filtering for alldiff

Consider the domains

y1 ∈ {1}
y 2 ∈ {2,3,5}
y 3 ∈ {1,2,3,5}
y 4 ∈ {1,5}
y 5 ∈ {1,2,3,4,5,6}

Slide 97
Indicate domains with edges

[Figure on Slides 98-105: bipartite graph linking variables y1,…,y5 to values 1,…,6, one edge per domain value.]

Find a maximum cardinality bipartite matching.
Mark edges in alternating paths that start at an uncovered vertex.
Mark edges in alternating cycles.
Remove unmarked edges not in the matching.


Slide 105
Filtering for alldiff

Domains have been filtered:

y1 ∈ {1}            →  y1 ∈ {1}
y2 ∈ {2,3,5}        →  y2 ∈ {2,3}
y3 ∈ {1,2,3,5}      →  y3 ∈ {2,3}
y4 ∈ {1,5}          →  y4 ∈ {5}
y5 ∈ {1,2,3,4,5,6}  →  y5 ∈ {4,6}

Hyperarc consistency achieved.
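For intuition, the same filtering can be checked by brute force. The sketch below is illustrative only (Régin's matching-based algorithm achieves this in polynomial time; this version is exponential): a value survives only if it occurs in some all-different assignment.

from itertools import permutations

def filter_alldiff(domains):
    # Brute-force hyperarc consistency for alldiff: keep a value only if it
    # appears in some feasible all-different assignment.
    values = sorted(set().union(*domains))
    feasible = [
        assign for assign in permutations(values, len(domains))
        if all(v in dom for v, dom in zip(assign, domains))
    ]
    return [{assign[i] for assign in feasible} for i in range(len(domains))]

doms = [{1}, {2, 3, 5}, {1, 2, 3, 5}, {1, 5}, {1, 2, 3, 4, 5, 6}]
print(filter_alldiff(doms))
# [{1}, {2, 3}, {2, 3}, {5}, {4, 6}]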

Slide 106
Disjunctive scheduling

Consider a disjunctive scheduling constraint:


disjunctive ( (s1, s2 , s3 , s5 ),( p1, p2 , p3 , p5 ) )

Start time variables

Slide 107
Edge finding for disjunctive scheduling

Consider a disjunctive scheduling constraint:


disjunctive ( (s1, s2 , s3 , s5 ),( p1, p2 , p3 , p5 ) )

Processing times

Slide 108
Edge finding for disjunctive scheduling

Consider a disjunctive scheduling constraint:


disjunctive ( (s1, s2 , s3 , s5 ),( p1, p2 , p3 , p5 ) )
Variable domains defined by time
windows and processing times
s1 ∈ [0,10 − 1]
s2 ∈ [0,10 − 3]
s3 ∈ [2,7 − 3]
s5 ∈ [4,7 − 2]

Slide 109
Edge finding for disjunctive scheduling

Consider a disjunctive scheduling constraint:


disjunctive ( (s1, s2 , s3 , s5 ),( p1, p2 , p3 , p5 ) )

A feasible (min makespan) solution:

Time window

Slide 110
Edge finding for disjunctive scheduling

But let’s reduce 2 of the deadlines to 9:

Slide 111
Edge finding for disjunctive scheduling

But let’s reduce 2 of the deadlines to 9:

We will use edge finding


to prove that there is no
feasible schedule.

Slide 112
Edge finding for disjunctive scheduling

We can deduce that job 2 must precede jobs 3 and 5: 2 ≪ {3,5}.

Because if job 2 is not first, there is not enough time for all 3 jobs within the time windows:

L{2,3,5} − E{3,5} < p{2,3,5}

where L{2,3,5} is the latest deadline, E{3,5} the earliest release time, and p{2,3,5} the total processing time. Here 7 < 3 + 3 + 2.
Slide 116
Edge finding for disjunctive scheduling

We can deduce that job 2 must precede jobs 3 and 5: 2 ≪ {3,5}.

So we can tighten the deadline of job 2 to the minimum of

L{3} − p{3} = 4,  L{5} − p{5} = 5,  L{3,5} − p{3,5} = 2

Since the time window of job 2 is now too narrow, there is no feasible schedule.
Slide 117
Edge finding for disjunctive scheduling

In general, we can deduce that job k must precede all the jobs in set J (k ≪ J) if there is not enough time for all the jobs after the earliest release time of the jobs in J:

L_{J∪{k}} − E_J < p_{J∪{k}}    (e.g., L{2,3,5} − E{3,5} < p{2,3,5})

Slide 118
Edge finding for disjunctive scheduling

In general, we can deduce that job k must precede all the jobs in set J (k ≪ J) if there is not enough time for all the jobs after the earliest release time of the jobs in J:

L_{J∪{k}} − E_J < p_{J∪{k}}

Now we can tighten the deadline for job k to

min_{J′⊆J} { L_{J′} − p_{J′} }    (e.g., L{3,5} − p{3,5} = 2)

Slide 119
Edge finding for disjunctive scheduling

There is a symmetric rule (k ≫ J): if there is not enough time for all the jobs before the latest deadline of the jobs in J,

L_J − E_{J∪{k}} < p_{J∪{k}}

then we can tighten the release date for job k to

max_{J′⊆J} { E_{J′} + p_{J′} }
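The test and the deadline update can be stated directly in code. This Python sketch (illustrative; real implementations avoid the subset enumeration, as the next slides explain) reproduces the example above:

from itertools import combinations

def must_precede(k, J, E, L, p):
    # Edge-finding test: job k must precede every job in J if
    # L(J + {k}) - E(J) < p(J + {k}).
    return max(L[j] for j in J | {k}) - min(E[j] for j in J) < sum(p[j] for j in J | {k})

def tightened_deadline(k, J, L, p):
    # Once k << J is established, k's deadline drops to the minimum over
    # nonempty subsets J' of J of L(J') - p(J').
    best = L[k]
    for r in range(1, len(J) + 1):
        for Jp in combinations(J, r):
            best = min(best, max(L[j] for j in Jp) - sum(p[j] for j in Jp))
    return best

# Slide example: jobs 2, 3, 5 with release times E, reduced deadlines L, times p.
E, L, p = {2: 0, 3: 2, 5: 4}, {2: 9, 3: 7, 5: 7}, {2: 3, 3: 3, 5: 2}
print(must_precede(2, {3, 5}, E, L, p))      # True: 9 - 2 = 7 < 8
print(tightened_deadline(2, {3, 5}, L, p))   # 2: window [0,2] too short for p2 = 3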

Slide 120
Edge finding for disjunctive scheduling

Problem: how can we avoid enumerating all subsets J of jobs to find edges (L_{J∪{k}} − E_J < p_{J∪{k}}), and all subsets J′ of J to tighten the bounds (min_{J′⊆J} { L_{J′} − p_{J′} })?

Slide 121
Edge finding for disjunctive scheduling

Key result: we only have to consider sets J whose time windows lie within some interval (e.g., J = {3,5}).

Slide 122
Edge finding for disjunctive scheduling

Key result: we only have to consider sets J whose time windows lie within some interval (e.g., J = {3,5}).

Removing a job from those within an interval only weakens the test L_{J∪{k}} − E_J < p_{J∪{k}}. There are a polynomial number of intervals defined by release times and deadlines.
Slide 123
Edge finding for disjunctive scheduling

Key result: we only have to consider sets J whose time windows lie within some interval (e.g., J = {3,5}).

Note: edge finding does not achieve bounds consistency, which is an NP-hard problem.

Slide 124
Edge finding for disjunctive scheduling
One O(n²) algorithm is based on the Jackson preemptive schedule (JPS). Using a different example, the JPS is:

[Figure: the Jackson preemptive schedule for the example.]

Slide 125
Edge finding for disjunctive scheduling
One O(n²) algorithm is based on the Jackson preemptive schedule (JPS). Using a different example:

For each job i:
  Scan jobs k ∈ Ji in decreasing order of Lk,
    where Ji = the jobs unfinished at time Ei in the JPS.
  Select the first k for which Lk − Ei < pi + p_{Jik},
    where Jik = the jobs j ≠ i in Ji with Lj ≤ Lk.
  Conclude that i ≫ Jik.
  Update Ei to JPS(i,k), the latest completion time in the JPS of the jobs in Jik.

Slide 126
Not-first/not-last rules
We can deduce that job 4 cannot precede jobs 1 and 2: ¬(4 ≪ {1,2}).

Because if job 4 is first, there is too little time to complete the jobs before the later deadline of jobs 1 and 2:

L{1,2} − E4 < p1 + p2 + p4    (6 < 1 + 3 + 3)
Slide 127
Not-first/not-last rules
We can deduce that job 4 cannot precede jobs 1 and 2: ¬(4 ≪ {1,2}).

Now we can tighten the release time of job 4 to the minimum of

E1 + p1 = 3,  E2 + p2 = 4
Slide 128
Not-first/not-last rules
In general, we can deduce that job k cannot precede all the jobs in J (¬(k ≪ J)) if there is too little time after the release time of job k to complete all jobs before the latest deadline in J:

L_J − E_k < p_{J∪{k}}

Now we can update Ek to min_{j∈J} { Ej + pj }.

Slide 129
Not-first/not-last rules
There is a symmetric not-last rule. The rules can be applied in polynomial time, although an efficient algorithm is quite complicated.

Slide 130
Cumulative scheduling

Consider a cumulative scheduling constraint:

cumulative((s1,s2,s3), (p1,p2,p3), (c1,c2,c3), C)

A feasible solution:

[Figure: resource profile of the three jobs under capacity C.]

Slide 131
Edge finding for cumulative scheduling

We can deduce that job 3 must finish after the others finish: 3 ≫ {1,2}.

Because the total energy required exceeds the area between the earliest release time and the later deadline of jobs 1, 2:

e3 + e{1,2} > C ⋅ (L{1,2} − E{1,2,3})

[Figure: total energy required = 22; area available = 20.]


Slide 134
Edge finding for cumulative scheduling

We can deduce that job 3 must finish after the others finish: 3 ≫ {1,2}. We can update the release time of job 3 to

E{1,2} + ( e{1,2} − (C − c3)(L{1,2} − E{1,2}) ) / c3

[Figure: energy available for jobs 1, 2 if space is left for job 3 to start anytime = 10; excess energy required by jobs 1, 2 = 4; so the release time of job 3 moves up 4/c3 = 4/2 = 2 units beyond E{1,2}.]
Slide 137
Edge finding for cumulative scheduling

In general, if e_{J∪{k}} > C ⋅ (L_J − E_{J∪{k}}), then k ≫ J, and update Ek to

max over J′ ⊆ J with e_{J′} − (C − ck)(L_{J′} − E_{J′}) > 0 of
E_{J′} + ( e_{J′} − (C − ck)(L_{J′} − E_{J′}) ) / ck

Symmetrically, if e_{J∪{k}} > C ⋅ (L_{J∪{k}} − E_J), then k ≪ J, and update Lk to

min over J′ ⊆ J with e_{J′} − (C − ck)(L_{J′} − E_{J′}) > 0 of
L_{J′} − ( e_{J′} − (C − ck)(L_{J′} −E_{J′}) ) / ck
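The release-time rule can be coded directly. In the sketch below, the job data are invented to match the totals in the figure (C = 4; jobs 1 and 2 in window [0,5] with energies 9 and 5; job 3 with p = 4, c = 2, energy 8), so it is illustrative rather than taken from the slides:

def release_update(k, J, E, L, p, c, C):
    # Cumulative edge finding: if e(J + {k}) > C*(L(J) - E(J + {k})), then
    # k >> J, and E[k] can be raised using the excess energy of J. This is
    # the naive version with J' = J; an O(n^2) algorithm finds all updates.
    e = lambda S: sum(p[j] * c[j] for j in S)
    EJ, LJ = min(E[j] for j in J), max(L[j] for j in J)
    assert e(J | {k}) > C * (LJ - min(E[j] for j in J | {k}))   # k finishes last
    excess = e(J) - (C - c[k]) * (LJ - EJ)
    return max(E[k], EJ + excess / c[k]) if excess > 0 else E[k]

E, L = {1: 0, 2: 0, 3: 0}, {1: 5, 2: 5, 3: 10}
p, c = {1: 3, 2: 5, 3: 4}, {1: 3, 2: 1, 3: 2}
print(release_update(3, {1, 2}, E, L, p, c, 4))   # 2.0: release time moves up 2 units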

Slide 138
Edge finding for cumulative scheduling

There is an O(n²) algorithm that finds all applications of the edge finding rules.

Slide 139
Other propagation rules for cumulative
scheduling

• Extended edge finding.


• Timetabling.
• Not-first/not-last rules.
• Energetic reasoning.

Slide 140
Linear Relaxation
Why Relax?
Algebraic Analysis of LP
Linear Programming Duality
LP-Based Domain Filtering
Example: Single-Vehicle Routing
Disjunctions of Linear Systems
Slide 141
Why Relax?
Solving a relaxation of a problem can:

• Tighten variable bounds.


• Possibly solve original problem.
• Guide the search in a promising direction.
• Filter domains using reduced costs or Lagrange multipliers.
• Prune the search tree using a bound on the optimal value.
• Provide a more global view, because a single OR relaxation
can pool relaxations of several constraints.

Slide 142
Some OR models that can provide relaxations:

• Linear programming (LP).


• Mixed integer linear programming (MILP)
– Can itself be relaxed as an LP.
– LP relaxation can be strengthened with cutting planes.
• Lagrangean relaxation.
• Specialized relaxations.
– For particular problem classes.
– For global constraints.

Slide 143
Motivation

• Linear programming is remarkably versatile for representing


real-world problems.
• LP is by far the most widely used tool for relaxation.
• LP relaxations can be strengthened by cutting planes.
- Based on polyhedral analysis.
• LP has an elegant and powerful duality theory.
- Useful for domain filtering, and much else.
• The LP problem is extremely well solved.

Slide 144
Algebraic Analysis of LP

An example…

min 4x1 + 7x2
2x1 + 3x2 ≥ 6
2x1 + x2 ≥ 4
x1, x2 ≥ 0

[Figure: the feasible region; the optimal solution is x = (3,0), where 4x1 + 7x2 = 12.]


Slide 145
Algebraic Analysis of LP

Rewrite

min 4x1 + 7x2              min 4x1 + 7x2
2x1 + 3x2 ≥ 6       as     2x1 + 3x2 − x3 = 6
2x1 + x2 ≥ 4               2x1 + x2 − x4 = 4
x1, x2 ≥ 0                 x1, x2, x3, x4 ≥ 0

In general an LP has the form min { cx : Ax = b, x ≥ 0 }.

Slide 146
Algebraic analysis of LP

Write min { cx : Ax = b, x ≥ 0 } as

min cB xB + cN xN
s.t. BxB + NxN = b
     xB, xN ≥ 0

where A = [B N] is an m × n matrix and B is any set of m linearly independent columns of A (these form a basis for the space spanned by the columns); xB are the basic variables and xN the nonbasic variables.

Slide 147
Algebraic analysis of LP

Solve the constraint equation for xB: xB = B⁻¹b − B⁻¹N xN.

All solutions can be obtained by setting xN to some value. The solution is basic if xN = 0. It is a basic feasible solution if xN = 0 and xB ≥ 0.

Slide 148
Example…

min 4x1 + 7x2
2x1 + 3x2 − x3 = 6
2x1 + x2 − x4 = 4
x1, x2, x3, x4 ≥ 0

[Figure: the feasible region in the (x1, x2) plane. Each choice of two basic variables ((x2,x3), (x2,x4), (x1,x2), (x1,x4), (x3,x4), (x1,x3)) corresponds to an intersection of constraint lines; the basic feasible solutions are the vertices of the feasible region.]
Slide 149
Algebraic analysis of LP

With xB = B⁻¹b − B⁻¹N xN, express the cost in terms of the nonbasic variables:

cB B⁻¹b + (cN − cB B⁻¹N) xN

where cN − cB B⁻¹N is the vector of reduced costs. Since xN ≥ 0, the basic solution (xB, 0) is optimal if the reduced costs are nonnegative.
Slide 150
Example…

min 4x1 + 7x2
2x1 + 3x2 − x3 = 6
2x1 + x2 − x4 = 4
x1, x2, x3, x4 ≥ 0

[Figure: consider the basic feasible solution with x1, x4 basic.]
Slide 151
Example…

With x1, x4 basic and x2, x3 nonbasic, write the problem as min cB xB + cN xN:

min [4 0] [x1; x4] + [7 0] [x2; x3]
s.t. [2 0; 2 −1] [x1; x4] + [3 −1; 1 0] [x2; x3] = [6; 4]
     [x1; x4], [x2; x3] ≥ 0

so that B = [2 0; 2 −1], N = [3 −1; 1 0], and b = [6; 4].
Slide 152
Example…

The basic solution is

xB = B⁻¹b − B⁻¹N xN = B⁻¹b
[x1; x4] = [1/2 0; 1 −1] [6; 4] = [3; 2]

[Figure: the corresponding vertex of the feasible region, with x1, x4 basic.]
Slide 154
Example…

The basic solution is xB = B⁻¹b = [3; 2]. The reduced costs are

cN − cB B⁻¹N = [7 0] − [4 0] [1/2 0; 1 −1] [3 −1; 1 0] = [1 2] ≥ [0 0]

so the solution is optimal.
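The computation is easy to verify numerically; this NumPy sketch (illustrative, added in this rewrite) recovers the basic solution, the dual solution, and the reduced costs:

import numpy as np

# Basis check for: min 4x1 + 7x2 s.t. 2x1 + 3x2 - x3 = 6, 2x1 + x2 - x4 = 4,
# x >= 0, with x1, x4 basic and x2, x3 nonbasic.
A = np.array([[2.0, 3.0, -1.0, 0.0],
              [2.0, 1.0, 0.0, -1.0]])
c = np.array([4.0, 7.0, 0.0, 0.0])
b = np.array([6.0, 4.0])
basic, nonbasic = [0, 3], [1, 2]

B, N = A[:, basic], A[:, nonbasic]
xB = np.linalg.solve(B, b)              # basic solution: [3, 2]
lam = c[basic] @ np.linalg.inv(B)       # dual solution cB B^-1: [2, 0]
reduced = c[nonbasic] - lam @ N         # reduced costs: [1, 2] >= 0, so optimal
print(xB, lam, reduced)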
Slide 155
Linear Programming Duality

An LP can be viewed as an inference problem…

min { cx : Ax ≥ b, x ≥ 0 }  =  max { v : Ax ≥ b implies cx ≥ v for all x ≥ 0 }

Dual problem: find the tightest lower bound on the objective function that is implied by the constraints.

Slide 156
An LP can be viewed as an inference problem…

Ax ≥ b implies cx ≥ v (for x ≥ 0) exactly when some surrogate (nonnegative linear combination) λAx ≥ λb of Ax ≥ b dominates cx ≥ v.

From the Farkas Lemma: if Ax ≥ b, x ≥ 0 is feasible, then Ax ≥ b implies cx ≥ v iff λA ≤ c and λb ≥ v for some λ ≥ 0.
Slide 157
An LP can be viewed as an inference problem…

min { cx : Ax ≥ b, x ≥ 0 }  =  max { v : Ax ≥ b implies cx ≥ v }  =  max { λb : λA ≤ c, λ ≥ 0 }

The last problem is the classical LP dual. It follows from the Farkas Lemma: if Ax ≥ b, x ≥ 0 is feasible, then Ax ≥ b implies cx ≥ v iff λAx ≥ λb dominates cx ≥ v for some λ ≥ 0, i.e., λA ≤ c and λb ≥ v.
Slide 158
This equality is called strong duality:

min { cx : Ax ≥ b, x ≥ 0 }  =  max { λb : λA ≤ c, λ ≥ 0 }   if Ax ≥ b, x ≥ 0 is feasible.

Note that the dual of the dual is the primal (i.e., the original LP).

Slide 159
Example

Primal:                      Dual:
min 4x1 + 7x2 = 12           max 6λ1 + 4λ2 = 12
2x1 + 3x2 ≥ 6   (λ1)         2λ1 + 2λ2 ≤ 4   (x1)
2x1 + x2 ≥ 4    (λ2)         3λ1 + λ2 ≤ 7    (x2)
x1, x2 ≥ 0                   λ1, λ2 ≥ 0

A dual solution is (λ1, λ2) = (2, 0). Multiplying by the dual multipliers,

2x1 + 3x2 ≥ 6   ⋅ (λ1 = 2)
2x1 + x2 ≥ 4    ⋅ (λ2 = 0)

gives the surrogate 4x1 + 6x2 ≥ 12, which dominates 4x1 + 7x2 ≥ 12, the tightest bound on cost.
Slide 160
Weak Duality

If x* is feasible in the primal problem min { cx : Ax ≥ b, x ≥ 0 } and λ* is feasible in the dual problem max { λb : λA ≤ c, λ ≥ 0 }, then cx* ≥ λ*b.

This is because cx* ≥ λ*Ax* ≥ λ*b: the first inequality holds since λ*A ≤ c and x* ≥ 0, the second since Ax* ≥ b and λ* ≥ 0.

Slide 161
Dual multipliers as marginal costs

Suppose we perturb the RHS of an LP (i.e., change the requirement levels):

min { cx : Ax ≥ b + ∆b, x ≥ 0 }

The dual of the perturbed LP has the same constraints as the original dual:

max { λ(b + ∆b) : λA ≤ c, λ ≥ 0 }

So an optimal solution λ* of the original dual is feasible in the perturbed dual.

Slide 162
Dual multipliers as marginal costs

Suppose we perturb the RHS of an LP: min { cx : Ax ≥ b + ∆b, x ≥ 0 }.

By weak duality, the optimal value of the perturbed LP is at least λ*(b + ∆b) = λ*b + λ*∆b, where λ*b is the optimal value of the original LP (by strong duality).

So λi* is a lower bound on the marginal cost of increasing the i-th requirement by one unit (∆bi = 1). If λi* > 0, the i-th constraint must be tight (complementary slackness).

Slide 163
Dual of an LP in equality form

Primal:                    Dual:
min cB xB + cN xN          max λb
BxB + NxN = b   (λ)        λB ≤ cB   (xB)
xB, xN ≥ 0                 λN ≤ cN   (xN)
                           λ unrestricted

Slide 164
Dual of an LP in equality form

Primal: min { cB xB + cN xN : BxB + NxN = b, xB, xN ≥ 0 }   (dual variables λ)
Dual:   max { λb : λB ≤ cB (xB), λN ≤ cN (xN), λ unrestricted }

Recall that the reduced cost vector is cN − cB B⁻¹N = cN − λN, where λ = cB B⁻¹ solves the dual if (xB, 0) solves the primal.

Check: λB = cB B⁻¹B = cB, and λN = cB B⁻¹N ≤ cN because the reduced cost is nonnegative at the optimal solution (xB, 0).

In the example,

λ = cB B⁻¹ = [4 0] [1/2 0; 1 −1] = [2 0]

Note that the reduced cost of an individual variable xj is rj = cj − λAj, where Aj is column j of A.


Slide 168
LP-based Domain Filtering

Let min { cx : Ax ≥ b, x ≥ 0 } be an LP relaxation of a CP problem.

• One way to filter the domain of xj is to minimize and maximize xj subject to Ax ≥ b, x ≥ 0.
  – This is time consuming.
• A faster method is to use dual multipliers to derive valid inequalities.
  – A special case of this method uses reduced costs to bound or fix variables.
  – Reduced-cost variable fixing is a widely used technique in OR.

Slide 169
Suppose:

min { cx : Ax ≥ b, x ≥ 0 } has optimal solution x*, optimal value v*, and optimal dual solution λ*;

…and λi* > 0, which means the i-th constraint is tight (complementary slackness);
…and the LP is a relaxation of a CP problem;
…and we have a feasible solution of the CP problem with value U, so that U is an upper bound on the optimal value.

Slide 170
Supposing min { cx : Ax ≥ b, x ≥ 0 } has optimal solution x*, optimal value v*, and optimal dual solution λ*:

If x were to change to a value other than x*, the LHS of the i-th constraint Aix ≥ bi would change by some amount ∆bi. Since the constraint is tight, this would increase the optimal value as much as changing the constraint to Aix ≥ bi + ∆bi. So it would increase the optimal value at least λi*∆bi.

Slide 171
We have found: a change in x that changes Aix by ∆bi increases the optimal value of the LP by at least λi*∆bi. Since the optimal value of the LP ≤ the optimal value of the CP ≤ U, we have λi*∆bi ≤ U − v*, or

∆bi ≤ (U − v*) / λi*

Slide 172
Since ∆bi = Aix − Aix* = Aix − bi, this implies the inequality

Aix ≤ bi + (U − v*) / λi*

…which can be propagated.
Slide 173
Example

min 4x1 + 7x2
2x1 + 3x2 ≥ 6   (λ1 = 2)
2x1 + x2 ≥ 4    (λ2 = 0)
x1, x2 ≥ 0

Suppose we have a feasible solution of the original CP with value U = 13. Since the first constraint is tight, we can propagate the inequality

A1x ≤ b1 + (U − v*)/λ1*,  i.e.,  2x1 + 3x2 ≤ 6 + (13 − 12)/2 = 6.5
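The whole derivation can be reproduced with an LP solver. The sketch below is illustrative and assumes a recent SciPy whose HiGHS-based linprog exposes dual values via res.ineqlin.marginals (with a sign flip, since the ≥ rows are passed as negated ≤ rows):

import numpy as np
from scipy.optimize import linprog

# LP relaxation: min 4x1 + 7x2 s.t. 2x1 + 3x2 >= 6, 2x1 + x2 >= 4, x >= 0.
res = linprog(c=[4, 7], A_ub=[[-2, -3], [-2, -1]], b_ub=[-6, -4],
              bounds=[(0, None), (0, None)], method="highs")
v_star = res.fun                             # 12.0
lam = -np.asarray(res.ineqlin.marginals)     # duals of the >= rows: [2, 0]

U = 13.0                                     # value of a known CP-feasible solution
b = [6.0, 4.0]
for i in range(2):
    if lam[i] > 1e-9:                        # tight constraint, positive multiplier
        print("row", i + 1, "can be propagated with RHS", b[i] + (U - v_star) / lam[i])
# row 1: propagate 2x1 + 3x2 <= 6.5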

Slide 174
Reduced-cost domain filtering

Suppose xj* = 0, which means the constraint xj ≥ 0 is tight. The inequality Aix ≤ bi + (U − v*)/λi* becomes

xj ≤ (U − v*) / rj

The dual multiplier for xj ≥ 0 is the reduced cost rj of xj, because increasing xj (currently 0) by 1 increases the optimal cost by rj.

Similar reasoning can bound a variable below when it is at its upper bound.

Slide 175
Example

min 4x1 + 7x2
2x1 + 3x2 ≥ 6   (λ1 = 2)
2x1 + x2 ≥ 4    (λ2 = 0)
x1, x2 ≥ 0

Suppose we have a feasible solution of the original CP with value U = 13. Since x2* = 0, we have

x2 ≤ (U − v*)/r2 = (13 − 12)/2 = 0.5

If x2 is required to be integer, we can fix it to zero. This is reduced-cost variable fixing.

Slide 176
Example: Single-Vehicle Routing

A vehicle must make several stops and return home, perhaps subject to time windows. The objective is to find the order of stops that minimizes travel time. This is also known as the traveling salesman problem (with time windows).

[Figure: stops i and j with travel time cij.]

Slide 177
Assignment Relaxation

Let xij = 1 if stop i immediately precedes stop j.

min ∑_{ij} cij xij
s.t. ∑_j xij = ∑_j xji = 1, all i   (stop i is preceded and followed by exactly one stop)
     xij ∈ {0,1}, all i, j

Slide 178
Assignment Relaxation

Relax the integrality constraint:

min ∑_{ij} cij xij
s.t. ∑_j xij = ∑_j xji = 1, all i
     0 ≤ xij ≤ 1, all i, j

Because this problem is totally unimodular, it can be solved as an LP. The relaxation provides a very weak lower bound on the optimal value, but reduced-cost variable fixing can be very useful in a CP context.
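Because of total unimodularity, the LP optimum is integral, so a combinatorial assignment solver gives the same bound. A small sketch with made-up travel times (illustrative, not from the tutorial):

import numpy as np
from scipy.optimize import linear_sum_assignment

# Assignment relaxation of a 5-stop routing problem with random travel times.
rng = np.random.default_rng(0)
c = rng.integers(1, 10, size=(5, 5)).astype(float)
np.fill_diagonal(c, 1e9)               # a large cost forbids a stop preceding itself

rows, cols = linear_sum_assignment(c)  # integral optimum of the LP relaxation
print(cols, c[rows, cols].sum())       # successor of each stop; lower bound on the tour
# The assignment may contain subtours, which is why the bound is weak.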

Slide 179
Disjunctions of linear systems

Disjunctions of linear systems often occur naturally in problems and can be given a convex hull relaxation. A disjunction of linear systems represents a union of polyhedra:

min cx
s.t. ∨_k (A^k x ≥ b^k)

Slide 180
Relaxing a disjunction of linear systems

We want a convex hull relaxation (the tightest linear relaxation) of

min cx
s.t. ∨_k (A^k x ≥ b^k)

Slide 181
Relaxing a disjunction of linear systems

The closure of the convex hull of min { cx : ∨_k (A^k x ≥ b^k) } is described by

min cx
s.t. A^k x^k ≥ b^k y^k, all k
     ∑_k y^k = 1
     x = ∑_k x^k
     0 ≤ y^k ≤ 1
Slide 182
Why?

To derive the convex hull relaxation of a disjunction, write each
solution x as a convex combination of points x^k in the polyhedra:

min cx
Ak x^k ≥ bk, all k
∑k yk = 1
x = ∑k yk x^k
0 ≤ yk ≤ 1

This gives the convex hull relaxation (tightest linear relaxation).

LSE tutorial, June 2007
Slide 183
Why?

The change of variable x^k ← yk x^k linearizes the model:

min cx
Ak x^k ≥ bk yk, all k
∑k yk = 1
x = ∑k x^k
0 ≤ yk ≤ 1

LSE tutorial, June 2007
Slide 184
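A runnable sketch of this disaggregated (convex hull) relaxation for the one-dimensional disjunction x ∈ [0,1] ∨ x ∈ [3,4], with an extra side constraint x ≥ 2 added to show that the relaxation can be strictly looser than the disjunction itself; the instance is invented.

```python
# Sketch: hull relaxation of x in [0,1] or x in [3,4], minimizing x
# subject to x >= 2. Variables z = [x, x1, x2, y1, y2]. The true
# disjunctive optimum is 3; the relaxation yields 2 (y relaxed to [0,1]).
from scipy.optimize import linprog

c = [1, 0, 0, 0, 0]
A_ub = [[0, 1, 0, -1, 0],    # x1 <= 1*y1   (A1 x1 >= b1 y1 in <= form)
        [0, 0, -1, 0, 3],    # x2 >= 3*y2
        [0, 0, 1, 0, -4],    # x2 <= 4*y2
        [-1, 0, 0, 0, 0]]    # x >= 2 (side constraint)
b_ub = [0, 0, 0, -2]
A_eq = [[1, -1, -1, 0, 0],   # x = x1 + x2
        [0, 0, 0, 1, 1]]     # y1 + y2 = 1
b_eq = [0, 1]
bounds = [(0, None)] * 3 + [(0, 1)] * 2

res = linprog(c, A_ub, b_ub, A_eq, b_eq, bounds)
print(res.fun)               # 2.0: a valid lower bound on the true optimum 3
```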
Mixed Integer/Linear Modeling

MILP Representability
Disjunctive Modeling
Knapsack Modeling

LSE tutorial, June 2007
Slide 185
Motivation

A mixed integer/linear programming (MILP) problem has the form

min cx + dy
Ax + By ≥ b
x, y ≥ 0
y integer

• We can relax a CP problem by modeling some constraints with an MILP.
• If desired, we can then relax the MILP by dropping the integrality
  constraint, to obtain an LP.
• The LP relaxation can be strengthened with cutting planes.
• The first step is to learn how to write MILP models.

LSE tutorial, June 2007
Slide 186
MILP Representability

A subset S of Rⁿ is MILP representable if it is the projection onto x
of some MILP constraint set of the form

Ax + Bu + Dy ≥ b
x, y ≥ 0
x ∈ Rⁿ, u ∈ Rᵐ, yk ∈ {0,1}

LSE tutorial, June 2007
Slide 187
MILP Representability

A subset S of Rⁿ is MILP representable if it is the projection onto x
of some MILP constraint set of the form

Ax + Bu + Dy ≥ b
x, y ≥ 0
x ∈ Rⁿ, u ∈ Rᵐ, yk ∈ {0,1}

Theorem. S ⊂ Rⁿ is MILP representable if and only if S is the union of
finitely many polyhedra having the same recession cone.

[Figure: a polyhedron and its recession cone.]

LSE tutorial, June 2007
Slide 188
Example: Fixed charge function

Minimize a fixed charge function:

min x2
x2 ≥ 0         if x1 = 0
x2 ≥ f + cx1   if x1 > 0
x1 ≥ 0

LSE tutorial, June 2007
Slide 189
Example

min x2
x2 ≥ 0         if x1 = 0
x2 ≥ f + cx1   if x1 > 0
x1 ≥ 0

[Figure: the feasible set in (x1, x2) space is the union of two
polyhedra: P1, the vertical ray x1 = 0, x2 ≥ 0, and P2, the region on
or above the line x2 = f + cx1 for x1 > 0.]

LSE tutorial, June 2007
Slides 190–192
Example

min x2
x2 ≥ 0         if x1 = 0
x2 ≥ f + cx1   if x1 > 0
x1 ≥ 0

The polyhedra P1 and P2 have different recession cones.

[Figure: P1, P2, and their recession cones.]

LSE tutorial, June 2007
Slide 193
Example

Add an upper bound on x1:

0 ≤ x1 ≤ M

Now the polyhedra P1 and P2 have the same recession cone.

[Figure: P1, P2, and their (identical) recession cones.]

LSE tutorial, June 2007
Slide 194
Modeling a union of polyhedra

Start with a disjunction of linear systems to represent the union of
polyhedra:

min cx
∨k (Akx ≥ bk)

The kth polyhedron is {x | Akx ≥ bk}.

Introduce a 0-1 variable yk that is 1 when x is in polyhedron k.
Disaggregate x to create an x^k for each k:

min cx
Ak x^k ≥ bk yk, all k
∑k yk = 1
x = ∑k x^k
yk ∈ {0,1}

LSE tutorial, June 2007
Slide 195
Example

Start with a disjunction of linear systems to represent the union of
polyhedra:

min x2
( x1 = 0 )     ( 0 ≤ x1 ≤ M   )
( x2 ≥ 0 )  ∨  ( x2 ≥ f + cx1 )

LSE tutorial, June 2007
Slide 196
Example

Introduce a 0-1 variable yk that is 1 when x is in polyhedron k.
Disaggregate x to create an x^k for each k:

min x2
x1^1 = 0,  x2^1 ≥ 0
0 ≤ x1^2 ≤ M y2,  −c x1^2 + x2^2 ≥ f y2
y1 + y2 = 1,  yk ∈ {0,1}
x = x^1 + x^2

LSE tutorial, June 2007
Slide 197
Example

To simplify: replace x1^2 with x1, x2^2 with x2, and y2 with y.

This yields

min x2              or, equivalently:   min fy + cx
0 ≤ x1 ≤ My                             0 ≤ x ≤ My
x2 ≥ fy + cx1                           y ∈ {0,1}
y ∈ {0,1}                               (the “Big M” model)

LSE tutorial, June 2007
Slide 198
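A minimal sketch of the resulting big-M model, solved with SciPy's MILP interface (assuming SciPy ≥ 1.9 is available); the numbers f, c, M, and the demand d are illustrative.

```python
# Sketch of the big-M fixed-charge model: min c*x + f*y, x <= M*y.
import numpy as np
from scipy.optimize import milp, LinearConstraint, Bounds

f, c, M, d = 5.0, 2.0, 100.0, 3.0   # fixed cost, unit cost, big M, demand

obj = np.array([c, f])              # variables z = [x, y]
cons = [
    LinearConstraint(np.array([[1.0, -M]]), -np.inf, 0.0),  # x - M*y <= 0
    LinearConstraint(np.array([[1.0, 0.0]]), d, np.inf),    # x >= d
]
res = milp(obj, constraints=cons,
           integrality=np.array([0, 1]),          # y integer, x continuous
           bounds=Bounds([0, 0], [np.inf, 1]))
print(res.x, res.fun)   # expect x = d, y = 1, cost = c*d + f
```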
Disjunctive Modeling

Disjunctions often occur naturally in problems and can be given an
MILP model.

Recall that a disjunction of linear systems (representing polyhedra
with the same recession cone)

min cx
∨k (Akx ≥ bk)

…has the MILP model

min cx
Ak x^k ≥ bk yk, all k
∑k yk = 1
x = ∑k x^k
yk ∈ {0,1}

LSE tutorial, June 2007
Slide 199
Example: Uncapacitated facility location

Locate factories among m possible locations to serve n markets so as
to minimize total fixed cost and transport cost. There is no limit on
the production capacity of each factory.

fi  = fixed cost of a factory at location i
cij = transport cost from location i to market j

LSE tutorial, June 2007
Slide 200
Uncapacitated facility location

xij = fraction of market j’s demand satisfied from location i
(m possible factory locations, n markets)

Disjunctive model:

min ∑i zi + ∑ij cij xij

( xij = 0, all j )     ( 0 ≤ xij ≤ 1, all j )
(    zi = 0      )  ∨  (    zi ≥ fi         ) ,  all i
 no factory at i          factory at i

∑i xij = 1, all j

LSE tutorial, June 2007
Slide 201
Uncapacitated facility location

The disjunctive model translates into the MILP formulation

min ∑i fi yi + ∑ij cij xij
0 ≤ xij ≤ yi, all i, j
∑i xij = 1, all j
yi ∈ {0,1}

(yi = 1 when there is a factory at location i, 0 otherwise.)

LSE tutorial, June 2007
Slide 202
Uncapacitated facility location

Beginner’s model (based on the capacitated location model):

min ∑i fi yi + ∑ij cij xij
∑j xij ≤ n yi, all i          (n = maximum output from location i)
yi ∈ {0,1}

It has a weaker continuous relaxation than the formulation with
0 ≤ xij ≤ yi (the relaxation is obtained by replacing yi ∈ {0,1} with
0 ≤ yi ≤ 1). This beginner’s mistake can be avoided by starting with
the disjunctive formulation.

LSE tutorial, June 2007
Slide 203
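The gap between the two relaxations can be observed numerically. The sketch below (an invented toy instance, not from the slides) builds both LP relaxations and compares their bounds; the tight formulation's bound is never smaller.

```python
# Sketch comparing LP relaxations of the tight ("xij <= yi") and
# beginner's ("sum_j xij <= n*yi") facility location models.
import numpy as np
from scipy.optimize import linprog

m, n = 3, 4                               # locations, markets
rng = np.random.default_rng(0)
f = rng.uniform(5, 10, m)                 # fixed costs
c = rng.uniform(1, 4, (m, n))             # transport costs

def lp_bound(tight):
    # Variables: y (m entries) then x (m*n entries, row-major).
    obj = np.concatenate([f, c.ravel()])
    A_eq = np.zeros((n, m + m * n)); b_eq = np.ones(n)
    for j in range(n):
        A_eq[j, m + j::n] = 1             # sum_i xij = 1
    rows = []
    if tight:
        for i in range(m):
            for j in range(n):
                r = np.zeros(m + m * n)
                r[m + i * n + j] = 1; r[i] = -1       # xij - yi <= 0
                rows.append(r)
    else:
        for i in range(m):
            r = np.zeros(m + m * n)
            r[m + i * n: m + (i + 1) * n] = 1; r[i] = -n  # sum_j xij <= n*yi
            rows.append(r)
    res = linprog(obj, np.array(rows), np.zeros(len(rows)), A_eq, b_eq,
                  bounds=[(0, 1)] * (m + m * n))
    return res.fun

print(lp_bound(tight=True), lp_bound(tight=False))  # tight bound >= weak bound
```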
Knapsack Modeling

• Knapsack models consist of knapsack covering and knapsack packing
  constraints.
• The freight transfer model presented earlier is an example.
• We will consider a similar example that combines disjunctive and
  knapsack modeling.
• Most OR professionals are unlikely to write a model as good as the
  one presented here.

LSE tutorial, June 2007
Slide 204
Note on tightness of knapsack models

• The continuous relaxation of a knapsack model is not in general a
  convex hull relaxation. A disjunctive formulation would provide a
  convex hull relaxation, but there are exponentially many disjuncts.
• Knapsack cuts can significantly tighten the relaxation.

LSE tutorial, June 2007
Slide 205
Example: Package transport

Each package j has size aj. Each truck i has capacity Qi and costs ci
to operate. xij = 1 if truck i carries package j; yi = 1 if truck i
is used.

Disjunctive model:

min ∑i zi

∑i Qi yi ≥ ∑j aj ;   ∑i xij = 1, all j      (knapsack constraints)

( yi = 1              )     ( yi = 0          )
( zi = ci             )  ∨  ( zi = 0          ) ,  all i
( ∑j aj xij ≤ Qi      )     ( xij = 0, all j  )
( 0 ≤ xij ≤ 1, all j  )
  truck i used                truck i not used

xij, yi ∈ {0,1}

LSE tutorial, June 2007
Slide 206
Example: Package transport

The disjunctive model yields the MILP model

min ∑i ci yi
∑i Qi yi ≥ ∑j aj ;   ∑i xij = 1, all j
∑j aj xij ≤ Qi yi, all i
xij ≤ yi, all i, j
xij, yi ∈ {0,1}

LSE tutorial, June 2007
Slide 207
Example: Package transport

MILP model:

min ∑i ci yi
∑i Qi yi ≥ ∑j aj ;  ∑i xij = 1, all j   ← Most OR professionals would omit
                                          the first constraint, since it is
                                          the sum over i of the next one. But
                                          it generates very effective
                                          knapsack cuts.
∑j aj xij ≤ Qi yi, all i
xij ≤ yi, all i, j                      ← Modeling trick; unobvious without
                                          the disjunctive approach.
xij, yi ∈ {0,1}

LSE tutorial, June 2007
Slide 208
Cutting Planes

0-1 Knapsack Cuts
Gomory Cuts
Mixed Integer Rounding Cuts
Example: Product Configuration

LSE tutorial, June 2007
Slide 209
To review…

A cutting plane (cut, valid inequality) for an MILP model:

• …is valid: it is satisfied by all feasible solutions of the model.
• …cuts off solutions of the continuous relaxation. This makes the
  relaxation tighter.

[Figure: a cutting plane separating part of the continuous relaxation
from the feasible solutions.]

LSE tutorial, June 2007
Slide 210
Motivation

• Cutting planes (cuts) tighten the continuous relaxation of an MILP
  model.
• Knapsack cuts:
  - Generated for individual knapsack constraints.
  - We saw general integer knapsack cuts earlier.
  - 0-1 knapsack cuts and lifting techniques are well studied and
    widely used.
• Rounding cuts:
  - Generated for the entire MILP; they are widely used.
  - Gomory cuts, for integer variables only.
  - Mixed integer rounding cuts, for any MILP.

LSE tutorial, June 2007
Slide 211
0-1 Knapsack Cuts

0-1 knapsack cuts are designed for knapsack constraints with 0-1
variables. The analysis is different from that of general knapsack
constraints, to exploit the special structure of 0-1 inequalities.

LSE tutorial, June 2007
Slide 212
0-1 Knapsack Cuts

Consider a 0-1 knapsack packing constraint ax ≤ a0. (Knapsack covering
constraints are analyzed similarly.)

Index set J is a cover if  ∑_{j∈J} aj > a0

The cover inequality  ∑_{j∈J} xj ≤ |J| − 1  is a 0-1 knapsack cut for
ax ≤ a0.

Only minimal covers need be considered.

LSE tutorial, June 2007
Slide 213
Example

J = {1,2,3,4} is a cover for

6x1 + 5x2 + 5x3 + 5x4 + 8x5 + 3x6 ≤ 17

This gives rise to the cover inequality

x1 + x2 + x3 + x4 ≤ 3

LSE tutorial, June 2007
Slide 214
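A heuristic separation routine for cover cuts might look like the following sketch; the greedy ordering by the fractional LP values x*j is one common choice, not the only one.

```python
# Sketch of heuristic cover-cut separation for a 0-1 knapsack ax <= a0,
# given a fractional LP solution x_star. Greedy, not exact.
def cover_cut(a, a0, x_star):
    # Prefer items with x*_j close to 1: they make the cut most violated.
    order = sorted(range(len(a)), key=lambda j: 1 - x_star[j])
    J, weight = [], 0
    for j in order:
        J.append(j); weight += a[j]
        if weight > a0:
            break
    else:
        return None                       # no cover exists
    # Shrink to a minimal cover: drop items while the weight stays > a0.
    for j in sorted(J, key=lambda j: a[j]):
        if weight - a[j] > a0:
            J.remove(j); weight -= a[j]
    # Cut: sum_{j in J} x_j <= |J| - 1; report its violation at x*.
    violation = sum(x_star[j] for j in J) - (len(J) - 1)
    return J, violation

a = [6, 5, 5, 5, 8, 3]
print(cover_cut(a, 17, [1, 1, 0.8, 0.8, 0.2, 0]))   # ([0, 1, 2, 3], 0.6)
```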
Sequential lifting

• A cover inequality can often be strengthened by lifting it into a
  higher-dimensional space, that is, by adding variables.
• Sequential lifting adds one variable at a time.
• Sequence-independent lifting adds several variables at once.

LSE tutorial, June 2007
Slide 215
Sequential lifting

To lift a cover inequality  ∑_{j∈J} xj ≤ |J| − 1,
add a term to the left-hand side:

∑_{j∈J} xj + πk xk ≤ |J| − 1

where πk is the largest coefficient for which the inequality is still
valid. So

πk = |J| − 1 − max { ∑_{j∈J} xj | ∑_{j∈J} aj xj ≤ a0 − ak, xj ∈ {0,1} }

This can be done repeatedly (by dynamic programming).

LSE tutorial, June 2007
Slide 216
Example

Given 6x1 + 5x2 + 5x3 + 5x4 + 8x5 + 3x6 ≤ 17, to lift

x1 + x2 + x3 + x4 ≤ 3

add a term to the left-hand side:

x1 + x2 + x3 + x4 + π5 x5 ≤ 3

where

π5 = 3 − max { x1 + x2 + x3 + x4 | 6x1 + 5x2 + 5x3 + 5x4 ≤ 17 − 8,
               xj ∈ {0,1} }

This yields x1 + x2 + x3 + x4 + 2x5 ≤ 3. Further lifting leaves the cut
unchanged. But if the variables are added in the order x6, x5, the
result is different:

x1 + x2 + x3 + x4 + x5 + x6 ≤ 3

LSE tutorial, June 2007
Slide 217
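The inner maximization is a small 0-1 knapsack, so πk can be computed by the dynamic program sketched below (the helper names are ours):

```python
# Sketch: compute the lifting coefficient pi_k for a cover inequality by
# solving max{ sum_j x_j : sum_j a_j x_j <= budget, x_j in {0,1} }.
def max_items(weights, budget):
    best = {0: 0}                       # reachable weight -> max item count
    for w in weights:
        for s, cnt in list(best.items()):
            if s + w <= budget:
                best[s + w] = max(best.get(s + w, -1), cnt + 1)
    return max(best.values())

def lift_coefficient(cover_weights, a0, ak, rhs):
    return rhs - max_items(cover_weights, a0 - ak)

print(lift_coefficient([6, 5, 5, 5], 17, 8, 3))   # pi_5 = 2
```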
Sequence-independent lifting

• Sequence-independent lifting usually yields a weaker cut than
  sequential lifting.
• But it adds all the variables at once and is much faster.
• It is commonly used in commercial MILP solvers.

LSE tutorial, June 2007
Slide 218
Sequence-independent lifting

To lift a cover inequality  ∑_{j∈J} xj ≤ |J| − 1,
add terms to the left-hand side:

∑_{j∈J} xj + ∑_{j∉J} ρ(aj) xj ≤ |J| − 1

where

ρ(u) = j                if Aj ≤ u ≤ Aj+1 − ∆ and j ∈ {0,…, p−1}
     = j + (u − Aj)/∆   if Aj − ∆ ≤ u < Aj and j ∈ {1,…, p−1}
     = p + (u − Ap)/∆   if Ap − ∆ ≤ u

with  ∆ = ∑_{j∈J} aj − a0,   Aj = a1 + ⋯ + aj,   A0 = 0,   J = {1,…, p}.

LSE tutorial, June 2007
Slide 219
Example

Given 6x1 + 5x2 + 5x3 + 5x4 + 8x5 + 3x6 ≤ 17, to lift

x1 + x2 + x3 + x4 ≤ 3

add the terms ρ(8)x5 and ρ(3)x6, where ρ(u) is the lifting function
above: ρ(8) = 5/4 and ρ(3) = 1/4.

This yields the lifted cut

x1 + x2 + x3 + x4 + (5/4)x5 + (1/4)x6 ≤ 3

LSE tutorial, June 2007
Slide 220
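A direct transcription of ρ(u) as a sketch, checked against the slide's values ρ(8) = 5/4 and ρ(3) = 1/4:

```python
# Sketch of the sequence-independent lifting function rho(u) for cover
# J = {1,...,p}; a_cover holds the cover weights, Delta = sum(a_J) - a0.
def make_rho(a_cover, a0):
    p = len(a_cover)
    delta = sum(a_cover) - a0
    A = [0]
    for w in a_cover:
        A.append(A[-1] + w)             # partial sums A_0 .. A_p
    def rho(u):
        for j in range(p):              # flat pieces: A_j <= u <= A_{j+1}-D
            if A[j] <= u <= A[j + 1] - delta:
                return j
        for j in range(1, p):           # ramps: A_j - D <= u < A_j
            if A[j] - delta <= u < A[j]:
                return j + (u - A[j]) / delta
        return p + (u - A[p]) / delta   # final ramp: u >= A_p - D
    return rho

rho = make_rho([6, 5, 5, 5], 17)
print(rho(8), rho(3))                   # 1.25 and 0.25
```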
Gomory Cuts

• When an integer programming problem has a nonintegral solution, we
  can generate at least one Gomory cut to cut off that solution. This
  is a special case of a separating cut, because it separates the
  current solution of the relaxation from the feasible set.
• Gomory cuts are widely used and very effective in MILP solvers.

[Figure: a separating cut between the solution of the continuous
relaxation and the feasible solutions.]

LSE tutorial, June 2007
Slide 221
Gomory cuts

Given an integer programming problem

min cx
Ax = b
x ≥ 0 and integral

let (xB, 0) be an optimal solution of the continuous relaxation, where

xB = b̂ − N̂ xN,   b̂ = B⁻¹b,   N̂ = B⁻¹N

Then if xi is nonintegral in this solution, the following Gomory cut
is violated by (xB, 0):

xi + ⌊N̂i⌋ xN ≤ ⌊b̂i⌋

LSE tutorial, June 2007
Slide 222
Example

min 2x1 + 3x2            or, with slacks:    min 2x1 + 3x2
x1 + 3x2 ≥ 3                                 x1 + 3x2 − x3 = 3
4x1 + 3x2 ≥ 6                                4x1 + 3x2 − x4 = 6
x1, x2 ≥ 0 and integral                      xj ≥ 0 and integral

The optimal solution of the continuous relaxation has

xB = (x1, x2) = (1, 2/3)

N̂ = [  1/3  −1/3 ]        b̂ = (  1  )
    [ −4/9   1/9 ]            ( 2/3 )

LSE tutorial, June 2007
Slide 223
Example

The Gomory cut  xi + ⌊N̂i⌋ xN ≤ ⌊b̂i⌋  for the fractional basic
variable x2 is

x2 + ⌊−4/9⌋ x3 + ⌊1/9⌋ x4 ≤ ⌊2/3⌋

or  x2 − x3 ≤ 0.  In (x1, x2)-space this is  x1 + 2x2 ≥ 3.

LSE tutorial, June 2007
Slide 224
Example

[Figure: the feasible region with the Gomory cut x1 + 2x2 ≥ 3 cutting
off the fractional solution (1, 2/3), and a second Gomory cut obtained
after re-solving the LP with the previous cut added.]

LSE tutorial, June 2007
Slide 225
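The cut can be reproduced numerically from the example's basis; the sketch below assumes NumPy and the basis/nonbasis split shown on the previous slides.

```python
# Sketch: derive the Gomory cut x2 - x3 <= 0 from the example's basis.
import numpy as np

B = np.array([[1.0, 3.0], [4.0, 3.0]])    # columns of x1, x2 (basic)
N = np.array([[-1.0, 0.0], [0.0, -1.0]])  # columns of slacks x3, x4
b = np.array([3.0, 6.0])

N_hat = np.linalg.solve(B, N)             # B^{-1} N
b_hat = np.linalg.solve(B, b)             # B^{-1} b

i = 1                                     # row of x2, fractional (2/3)
cut_coeffs = np.floor(N_hat[i])           # [-1, 0]
cut_rhs = np.floor(b_hat[i])              # 0
print(f"x2 + {cut_coeffs[0]:.0f}*x3 + {cut_coeffs[1]:.0f}*x4 <= {cut_rhs:.0f}")
```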
Mixed Integer Rounding Cuts

• Mixed integer rounding (MIR) cuts can be generated for solutions of
  any relaxed MILP in which one or more integer variables has a
  fractional value.
• Like Gomory cuts, they are separating cuts.
• MIR cuts are widely used in commercial solvers.

LSE tutorial, June 2007
Slide 226
MIR cuts

Given an MILP problem

min cx + dy
Ax + Dy = b
x, y ≥ 0 and y integral

let, in an optimal solution of the continuous relaxation,

J = { j | yj is nonbasic },   K = { j | xj is nonbasic },
N = nonbasic columns of [A D],   N̂ = B⁻¹N,   b̂ = B⁻¹b.

Then if yi is nonintegral in this solution, the following MIR cut is
violated by the solution of the relaxation:

yi + ∑_{j∈J1} ⌈N̂ij⌉ yj + ∑_{j∈J2} ( ⌊N̂ij⌋ + frac(N̂ij)/frac(b̂i) ) yj
   + (1/frac(b̂i)) ∑_{j∈K} N̂ij⁺ xj ≥ ⌈b̂i⌉

where  J1 = { j ∈ J | frac(N̂ij) ≥ frac(b̂i) },   J2 = J \ J1.

LSE tutorial, June 2007
Slide 227
Example

3x1 + 4x2 − 6y1 − 4y2 = 1
x1 + 2x2 − y1 − y2 = 3
xj, yj ≥ 0, yj integer

Take the basic solution (y1, x1) = (8/3, 17/3). Then

N̂ = [  1/3  2/3 ]        b̂ = (  8/3 )
    [ −2/3  8/3 ]            ( 17/3 )

J = {2}, K = {2}, J1 = ∅, J2 = {2}

The MIR cut is

y1 + ( ⌊1/3⌋ + frac(1/3)/frac(8/3) ) y2 + (1/frac(8/3)) (2/3)⁺ x2 ≥ ⌈8/3⌉

or  y1 + (1/2) y2 + x2 ≥ 3

LSE tutorial, June 2007
Slide 228
Example: Product Configuration

This example illustrates:

• combination of propagation and relaxation;
• processing of variable indices;
• continuous relaxation of the element constraint.

LSE tutorial, June 2007
Slide 229
The problem

Choose what type of each component, and how many.

[Figure: a personal computer assembled from memory units, disk drives,
and power supplies.]

LSE tutorial, June 2007
Slide 230
Model of the problem

min ∑j cj vj
vj = ∑i qi Aij,ti , all j      (ti is a variable index)
Lj ≤ vj ≤ Uj , all j

where
  vj      = amount of attribute j produced (< 0 if consumed):
            memory, heat, power, weight, etc.
  cj      = unit cost of producing attribute j
  qi      = quantity of component i installed
  Aij,ti  = amount of attribute j produced by type ti of component i

LSE tutorial, June 2007
Slide 231
To solve it:

• Branch on the domains of ti and qi.
• Propagate element constraints and bounds on vj.
  – Each variable index is converted to a specially structured element
    constraint.
  – Valid knapsack cuts are derived and propagated.
• Use linear continuous relaxations.
  – A special-purpose MILP relaxation is used for element.

LSE tutorial, June 2007
Slide 232
Propagation

min ∑j cj vj
vj = ∑i qi Aij,ti , all j
Lj ≤ vj ≤ Uj , all j       ← this is propagated in the usual way

LSE tutorial, June 2007
Slide 233
Propagation

The constraint  vj = ∑i qi Aij,ti  is rewritten as

vj = ∑i zi , all j
element( ti, (qi Aij1, …, qi Aijn), zi ), all i, j

The bounds  Lj ≤ vj ≤ Uj  are propagated in the usual way.

LSE tutorial, June 2007
Slide 234
Propagation

vj = ∑i zi , all j
element( ti, (qi Aij1, …, qi Aijn), zi ), all i, j

This can be propagated by
(a) using specialized filters for element constraints of this form…

LSE tutorial, June 2007
Slide 235
Propagation

vj = ∑i zi , all j
element( ti, (qi Aij1, …, qi Aijn), zi ), all i, j

This is propagated by
(a) using specialized filters for element constraints of this form,
(b) adding knapsack cuts for the valid inequalities

∑i max{ Aijk | k ∈ Dti } qi ≥ vj , all j
∑i min{ Aijk | k ∈ Dti } qi ≤ vj , all j

and (c) propagating the knapsack cuts.

([vjL, vjU] denotes the current domain of vj.)

LSE tutorial, June 2007
Slide 236
Relaxation

min ∑j cj vj
vj = ∑i qi Aij,ti , all j
Lj ≤ vj ≤ Uj , all j       ← this is relaxed as  vjL ≤ vj ≤ vjU

LSE tutorial, June 2007
Slide 237
Relaxation

vj = ∑i zi , all j
element( ti, (qi Aij1, …, qi Aijn), zi ), all i, j
                           ← this is relaxed by relaxing the element
                             constraints and adding the knapsack cuts
min ∑j cj vj
vj = ∑i qi Aij,ti , all j
Lj ≤ vj ≤ Uj , all j       ← this is relaxed as  vjL ≤ vj ≤ vjU

LSE tutorial, June 2007
Slide 238
Relaxation

vj = ∑i zi , all j
element( ti, (qi Aij1, …, qi Aijn), zi ), all i, j

This is relaxed by replacing each element constraint with a disjunctive
convex hull relaxation:

zi = ∑_{k∈Dti} Aijk qik ,    qi = ∑_{k∈Dti} qik

LSE tutorial, June 2007
Slide 239
Relaxation

So the following LP relaxation is solved at each node of the search
tree to obtain a lower bound:

min ∑j cj vj
vj = ∑i ∑_{k∈Dti} Aijk qik , all j
qi = ∑_{k∈Dti} qik , all i
vjL ≤ vj ≤ vjU , all j
qiL ≤ qi ≤ qiU , all i
knapsack cuts for  ∑i max{ Aijk | k ∈ Dti } qi ≥ vj , all j
knapsack cuts for  ∑i min{ Aijk | k ∈ Dti } qi ≤ vj , all j
qik ≥ 0, all i, k

LSE tutorial, June 2007
Slide 240
Computational Results

[Chart: solution time in seconds (log scale, 0.01 to 1000) for CPLEX,
CLP, and the hybrid method on problem sizes 8x10, 16x20, 20x24, 20x30.]

LSE tutorial, June 2007
Slide 241
Lagrangean Relaxation

Lagrangean Duality
Properties of the Lagrangean Dual
Example: Fast Linear Programming
Domain Filtering
Example: Continuous Global Optimization

LSE tutorial, June 2007
Slide 242
Motivation

• Lagrangean relaxation can provide better bounds than LP relaxation.
• The Lagrangean dual generalizes LP duality.
• It provides domain filtering analogous to that based on LP duality.
  This is a key technique in continuous global optimization.
• Lagrangean relaxation gets rid of troublesome constraints by
  dualizing them, that is, moving them into the objective function.
  The Lagrangean relaxation may decouple.

LSE tutorial, June 2007
Slide 243
Lagrangean Duality

Consider an inequality-constrained problem

min f(x)
g(x) ≥ 0     (hard constraints)
x ∈ S        (easy constraints)

The object is to get rid of (dualize) the hard constraints by moving
them into the objective function.

LSE tutorial, June 2007
Slide 244
Lagrangean Duality

The problem is related to an inference problem:

max v
such that  g(x) ≥ 0 ⇒ f(x) ≥ v  for all x ∈ S

Lagrangean dual problem: find the tightest lower bound on the
objective function that is implied by the constraints.

LSE tutorial, June 2007
Slide 245
Primal                          Dual
min f(x)                        max v
g(x) ≥ 0                        such that g(x) ≥ 0 ⇒ f(x) ≥ v
x ∈ S                           for all x ∈ S

Let us say that the surrogate λg(x) ≥ 0 (for some λ ≥ 0) dominates
f(x) − v ≥ 0 iff

λg(x) ≤ f(x) − v   for all x ∈ S

That is,  v ≤ f(x) − λg(x)  for all x ∈ S,

or  v ≤ min_{x∈S} { f(x) − λg(x) }.

So the dual becomes

max v
v ≤ min_{x∈S} { f(x) − λg(x) }   for some λ ≥ 0

LSE tutorial, June 2007
Slides 246–248
Now we have…

Primal                 Dual
min f(x)               max v
g(x) ≥ 0               v ≤ min_{x∈S} { f(x) − λg(x) } for some λ ≥ 0
x ∈ S
                       or, equivalently,

                       max_{λ≥0} θ(λ)   where   θ(λ) = min_{x∈S} { f(x) − λg(x) }

Here the constraints g(x) ≥ 0 are dualized, λ is a vector of Lagrange
multipliers, and min_{x∈S} { f(x) − λg(x) } is the Lagrangean
relaxation.

The Lagrangean dual can be viewed as the problem of finding the
Lagrangean relaxation that gives the tightest bound.

LSE tutorial, June 2007
Slide 249
Example

min 3x1 + 4x2
−x1 + 3x2 ≥ 0
2x1 + x2 − 5 ≥ 0
x1, x2 ∈ {0,1,2,3}

The Lagrangean relaxation is

θ(λ1, λ2) = min_{xj∈{0,…,3}} { 3x1 + 4x2 − λ1(−x1 + 3x2) − λ2(2x1 + x2 − 5) }
          = min_{xj∈{0,…,3}} { (3 + λ1 − 2λ2)x1 + (4 − 3λ1 − λ2)x2 + 5λ2 }

The Lagrangean relaxation is easy to solve for any given λ1, λ2:

x1 = 0 if 3 + λ1 − 2λ2 ≥ 0, and 3 otherwise
x2 = 0 if 4 − 3λ1 − λ2 ≥ 0, and 3 otherwise

The optimal solution of the original problem is (2,1).

LSE tutorial, June 2007
Slide 250
Example

min 3x1 + 4x2
−x1 + 3x2 ≥ 0
2x1 + x2 − 5 ≥ 0
x1, x2 ∈ {0,1,2,3}

θ(λ1, λ2) is piecewise linear and concave.

[Contour plot of θ over (λ1, λ2) omitted; contours shown at
θ(λ) = 0, 5, 7.5, and 9 2/7.]

Solution of the Lagrangean dual: (λ1, λ2) = (5/7, 13/7), θ(λ) = 9 2/7.
The optimal solution of the original problem is (2,1), with value 10.
Note the duality gap between 10 and 9 2/7 (no strong duality).

LSE tutorial, June 2007
Slide 251
Example

Note: in this example, the Lagrangean dual provides the same bound
(9 2/7) as the continuous relaxation of the IP. This is because the
Lagrangean relaxation can be solved as an LP:

θ(λ1, λ2) = min_{xj∈{0,…,3}} { (3 + λ1 − 2λ2)x1 + (4 − 3λ1 − λ2)x2 + 5λ2 }
          = min_{0 ≤ xj ≤ 3} { (3 + λ1 − 2λ2)x1 + (4 − 3λ1 − λ2)x2 + 5λ2 }

Lagrangean duality is useful when the Lagrangean relaxation is tighter
than an LP but nonetheless easy to solve.

LSE tutorial, June 2007
Slide 252
Properties of the Lagrangean dual

Weak duality: for any feasible x* and any λ* ≥ 0, f(x*) ≥ θ(λ*).
In particular,

min { f(x) | g(x) ≥ 0, x ∈ S } ≥ max_{λ≥0} θ(λ)

Concavity: θ(λ) is concave. It can therefore be maximized by local
search methods.

Complementary slackness: if x* and λ* are optimal, and there is no
duality gap, then λ*g(x*) = 0.

LSE tutorial, June 2007
Slide 253
Solving the Lagrangean dual

Let λ^k be the kth iterate, and let

λ^{k+1} = λ^k + αk ξ^k

where ξ^k is a subgradient of θ(λ) at λ = λ^k.

If x^k solves the Lagrangean relaxation for λ = λ^k, then ξ^k = −g(x^k).
This is because θ(λ) ≤ f(x^k) − λg(x^k) for all λ, with equality at
λ = λ^k.

The stepsize αk must be adjusted so that the sequence converges, but
not before reaching a maximum.

LSE tutorial, June 2007
Slide 254
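A sketch of subgradient ascent on the example of the earlier slide; the 1/k step sizes are one standard (illustrative) choice, and the iteration count is arbitrary.

```python
# Sketch: subgradient ascent on the Lagrangean dual of
#   min 3x1 + 4x2  s.t. -x1 + 3x2 >= 0, 2x1 + x2 - 5 >= 0, xj in {0,...,3}
def solve_relaxation(lam1, lam2):
    # Coefficients of x1 and x2 in the Lagrangean; each xj in {0,...,3}
    c1 = 3 + lam1 - 2 * lam2
    c2 = 4 - 3 * lam1 - lam2
    x1 = 0 if c1 >= 0 else 3
    x2 = 0 if c2 >= 0 else 3
    return c1 * x1 + c2 * x2 + 5 * lam2, (x1, x2)

lam, best = [0.0, 0.0], float("-inf")
for k in range(1, 2001):
    theta, (x1, x2) = solve_relaxation(*lam)
    best = max(best, theta)
    g = (-x1 + 3 * x2, 2 * x1 + x2 - 5)      # constraint values g(x^k)
    step = 1.0 / k
    # Ascend along -g(x^k), keeping the multipliers nonnegative
    lam = [max(0.0, lam[i] - step * g[i]) for i in range(2)]
print(best)   # approaches the dual optimum 9 2/7 = 9.2857...
```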
Example: Fast Linear Programming

• In CP contexts, it is best to process each node of the search tree
  very rapidly.
• Lagrangean relaxation may allow very fast calculation of a lower
  bound on the optimal value of the LP relaxation at each node.
• The idea is to solve the Lagrangean dual at the root node (which is
  an LP) and use the same Lagrange multipliers to get an LP bound at
  other nodes.

LSE tutorial, June 2007
Slide 255
At the root node, solve

min cx
Ax ≥ b    (λ)        ← dualize
Dx ≥ d               ← special structure, e.g. variable bounds
x ≥ 0

The (partial) LP dual solution λ* solves the Lagrangean dual in which

θ(λ) = min_{Dx ≥ d, x ≥ 0} { cx − λ(Ax − b) }

LSE tutorial, June 2007
Slide 256
At the root node, solve

min cx
Ax ≥ b    (λ)        ← dualize
Dx ≥ d               ← special structure, e.g. variable bounds
x ≥ 0

The (partial) LP dual solution λ* solves the Lagrangean dual in which

θ(λ) = min_{Dx ≥ d, x ≥ 0} { cx − λ(Ax − b) }

At another node, the LP is

min cx
Ax ≥ b    (λ)
Dx ≥ d
Hx ≥ h               ← branching constraints, etc.
x ≥ 0

Here θ(λ*) is still a lower bound on the optimal value of the LP and
can be quickly calculated by solving a specially structured LP.

LSE tutorial, June 2007
Slide 257
Domain Filtering

Suppose:

min f(x)
g(x) ≥ 0
x ∈ S

has optimal solution x*, optimal value v*, and optimal Lagrangean dual
solution λ*;

…and λi* > 0, which means the ith constraint is tight (complementary
slackness);
…and the problem is a relaxation of a CP problem;
…and we have a feasible solution of the CP problem with value U, so
that U is an upper bound on the optimal value.

LSE tutorial, June 2007
Slide 258
Supposing  min f(x), g(x) ≥ 0, x ∈ S  has optimal solution x*, optimal
value v*, and optimal Lagrangean dual solution λ*:

If x were to change to a value other than x*, the LHS of the ith
constraint gi(x) ≥ 0 would change by some amount ∆i. Since the
constraint is tight, this would increase the optimal value as much as
changing the constraint to gi(x) − ∆i ≥ 0. So it would increase the
optimal value by at least λi*∆i.

(It is easily shown that Lagrange multipliers are marginal costs. Dual
multipliers for LP are a special case of Lagrange multipliers.)

LSE tutorial, June 2007
Slide 259
Supposing  min f(x), g(x) ≥ 0, x ∈ S  has optimal solution x*, optimal
value v*, and optimal Lagrangean dual solution λ*:

We have found: a change in x that changes gi(x) by ∆i increases the
optimal value by at least λi*∆i.

Since optimal value of this problem ≤ optimal value of the CP ≤ U,
we have λi*∆i ≤ U − v*, or

∆i ≤ (U − v*) / λi*

LSE tutorial, June 2007
Slide 260
Supposing  min f(x), g(x) ≥ 0, x ∈ S  has optimal solution x*, optimal
value v*, and optimal Lagrangean dual solution λ*:

We have found: a change in x that changes gi(x) by ∆i increases the
optimal value by at least λi*∆i.

Since optimal value of this problem ≤ optimal value of the CP ≤ U,
we have λi*∆i ≤ U − v*, or

∆i ≤ (U − v*) / λi*

Since ∆i = gi(x) − gi(x*) = gi(x), this implies the inequality

gi(x) ≤ (U − v*) / λi*

…which can be propagated.

LSE tutorial, June 2007
Slide 261
Example: Continuous Global Optimization

• Some of the best continuous global solvers (e.g., BARON) combine
  OR-style relaxation with CP-style interval arithmetic and domain
  filtering.
• The use of Lagrange multipliers for domain filtering is a key
  technique in these solvers.

LSE tutorial, June 2007
Slide 262
Continuous Global Optimization

max x1 + x2
4x1x2 = 1
2x1 + x2 ≤ 2
x1 ∈ [0,1], x2 ∈ [0,2]

[Figure: the feasible set, the global optimum, and a local optimum in
(x1, x2) space.]

LSE tutorial, June 2007
Slide 263
To solve it:

• Search: split the interval domains of x1, x2. Each node of the
  search tree is a problem restriction.
• Propagation: interval propagation, domain filtering. Use Lagrange
  multipliers to infer a valid inequality for propagation;
  reduced-cost variable fixing is a special case.
• Relaxation: use function factorization to obtain a linear continuous
  relaxation.

LSE tutorial, June 2007
Slide 264
Interval propagation

Propagate the intervals [0,1], [0,2] through the constraints to obtain

x1 ∈ [1/8, 7/8],   x2 ∈ [1/4, 7/4]

LSE tutorial, June 2007
Slide 265
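One round of this propagation can be written directly; the update order below is an illustrative choice.

```python
# Sketch: one round of interval propagation for
#   4*x1*x2 = 1  and  2*x1 + x2 <= 2,  x1 in [0,1], x2 in [0,2].
x1lo, x1hi = 0.0, 1.0
x2lo, x2hi = 0.0, 2.0

# From 4*x1*x2 = 1: x1 = 1/(4*x2), so x1 >= 1/(4*x2hi); likewise for x2.
x1lo = max(x1lo, 1.0 / (4.0 * x2hi))      # 1/8
x2lo = max(x2lo, 1.0 / (4.0 * x1hi))      # 1/4
# From 2*x1 + x2 <= 2: x1 <= (2 - x2lo)/2 and x2 <= 2 - 2*x1lo.
x1hi = min(x1hi, (2.0 - x2lo) / 2.0)      # 7/8
x2hi = min(x2hi, 2.0 - 2.0 * x1lo)        # 7/4
print([x1lo, x1hi], [x2lo, x2hi])         # [0.125, 0.875] [0.25, 1.75]
```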
Relaxation (function factorization)

Factor complex functions into elementary functions that have known
linear relaxations.

Write 4x1x2 = 1 as 4y = 1 where y = x1x2. This factors 4x1x2 into the
linear function 4y and the bilinear function x1x2. The linear function
4y is its own linear relaxation.

LSE tutorial, June 2007
Slide 266
Relaxation (function factorization)

Factor complex functions into elementary functions that have known
linear relaxations.

Write 4x1x2 = 1 as 4y = 1 where y = x1x2. This factors 4x1x2 into the
linear function 4y and the bilinear function x1x2. The linear function
4y is its own linear relaxation.

The bilinear function y = x1x2 has the (McCormick) relaxation

x2L x1 + x1L x2 − x1L x2L ≤ y ≤ x2L x1 + x1U x2 − x1U x2L
x2U x1 + x1U x2 − x1U x2U ≤ y ≤ x2U x1 + x1L x2 − x1L x2U

where the domain of xj is [xjL, xjU].

LSE tutorial, June 2007
Slide 267
Relaxation (function factorization)

The linear relaxation becomes:

max x1 + x2
4y = 1
2x1 + x2 ≤ 2
x2L x1 + x1L x2 − x1L x2L ≤ y ≤ x2L x1 + x1U x2 − x1U x2L
x2U x1 + x1U x2 − x1U x2U ≤ y ≤ x2U x1 + x1L x2 − x1L x2U
xjL ≤ xj ≤ xjU , j = 1,2

LSE tutorial, June 2007
Slide 268
Relaxation (function factorization)

Solve the linear relaxation.

[Figure: the relaxation solution in (x1, x2) space.]

LSE tutorial, June 2007
Slide 269
Relaxation (function factorization)

Solve the linear relaxation. Since its solution is infeasible, split
an interval and branch:

x2 ∈ [1, 1.75]   and   x2 ∈ [0.25, 1]

LSE tutorial, June 2007
Slide 270
[Figure: the two subproblems, x2 ∈ [1, 1.75] and x2 ∈ [0.25, 1].]

LSE tutorial, June 2007
Slide 271
In the branch x2 ∈ [0.25, 1], the solution of the relaxation is
feasible, with value 1.25. This becomes the incumbent solution.

LSE tutorial, June 2007
Slide 272
In the branch x2 ∈ [0.25, 1], the solution of the relaxation is
feasible, with value 1.25. This becomes the incumbent solution.

In the branch x2 ∈ [1, 1.75], the solution of the relaxation is not
quite feasible, with value 1.854. Also use Lagrange multipliers for
domain filtering…

LSE tutorial, June 2007
Slide 273
Relaxation (function factorization)

max x1 + x2
4y = 1
2x1 + x2 ≤ 2         ← the associated Lagrange multiplier in the
                       solution of the relaxation is λ2 = 1.1
x2L x1 + x1L x2 − x1L x2L ≤ y ≤ x2L x1 + x1U x2 − x1U x2L
x2U x1 + x1U x2 − x1U x2U ≤ y ≤ x2U x1 + x1L x2 − x1L x2U
xjL ≤ xj ≤ xjU , j = 1,2

LSE tutorial, June 2007
Slide 274
Relaxation (function factorization)

This yields a valid inequality for propagation:

2x1 + x2 ≥ 2 − (1.854 − 1.25)/1.1 = 1.451

where 1.854 is the value of the relaxation, 1.25 the value of the
incumbent solution, and 1.1 the Lagrange multiplier.

LSE tutorial, June 2007
Slide 275
Dynamic Programming in CP

Example: Capital Budgeting
Domain Filtering
Recursive Optimization

LSE tutorial, June 2007
Slide 276
Motivation

• Dynamic programming (DP) is a highly versatile technique that can
  exploit recursive structure in a problem.
• Domain filtering is straightforward for problems modeled as a DP.
• DP is also important in designing filters for some global
  constraints, such as the stretch constraint (employee scheduling).
• Nonserial DP is related to bucket elimination in CP and exploits the
  structure of the primal graph.
• DP modeling is the art of keeping the state space small while
  maintaining a Markovian property.
• We will examine only one simple example of serial DP.

LSE tutorial, June 2007
Slide 277
Example: Capital Budgeting

We wish to build power plants with a total cost of at most 12 million
Euros. There are three types of plants, costing 4, 2, or 3 million
Euros each. We must build one or two of each type.

The problem has a simple knapsack packing model:

4x1 + 2x2 + 3x3 ≤ 12
xj ∈ {1,2}          (xj = number of plants of type j)

LSE tutorial, June 2007
Slide 278
Example: Capital Budgeting

4x1 + 2x2 + 3x3 ≤ 12,  xj ∈ {1,2}

In general the recursion for ax ≤ b is

fk(sk) = max_{xk ∈ Dxk} { fk+1(sk + ak xk) }

where the state sk is the sum of the first k terms of ax, and
fk(sk) = 1 if there is a path from state sk to a feasible solution,
0 otherwise.

For example,

f3(8) = max{ f4(8 + 3·1), f4(8 + 3·2) } = max{ f4(11), f4(14) }
      = max{1, 0} = 1

LSE tutorial, June 2007
Slide 279
Example: Capital Budgeting

4x1 + 2x2 + 3x3 ≤ 12,  xj ∈ {1,2}

In general the recursion for ax ≤ b is

fk(sk) = max_{xk ∈ Dxk} { fk+1(sk + ak xk) }

with boundary condition

fn+1(sn+1) = 1 if sn+1 ≤ b, and 0 otherwise

[State-transition diagram showing fk(sk) for each state sk omitted.]

LSE tutorial, June 2007
Slide 280
Example: Capital Budgeting

4x1 + 2x2 + 3x3 ≤ 12,  xj ∈ {1,2}

The problem is feasible. Each path ending in a feasible final state is
a feasible solution:

Path 1: x = (1,2,1)
Path 2: x = (1,1,2)
Path 3: x = (1,1,1)

Possible costs are 9, 11, 12.

LSE tutorial, June 2007
Slide 281
Domain Filtering

4x1 + 2x2 + 3x3 ≤ 12,  xj ∈ {1,2}

To filter domains: observe which values of xk occur on feasible paths.

Dx1 = {1},   Dx2 = {1,2},   Dx3 = {1,2}

LSE tutorial, June 2007
Slide 282
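The recursion, boundary condition, and filtering for this instance fit in a short sketch (the state representation is an assumption of ours):

```python
# Sketch of the serial DP for 4x1 + 2x2 + 3x3 <= 12, xj in {1,2}.
# f(k, s) = 1 if state s at stage k can reach a feasible solution.
from functools import lru_cache

a, b, domain = [4, 2, 3], 12, [1, 2]
n = len(a)

@lru_cache(maxsize=None)
def f(k, s):
    if k == n:
        return 1 if s <= b else 0
    return max(f(k + 1, s + a[k] * x) for x in domain)

print(f(0, 0))                       # 1: the problem is feasible

# Domain filtering: keep xk only if it lies on some feasible path.
reachable = [{0}]
for k in range(n):
    reachable.append({s + a[k] * x for s in reachable[k] for x in domain})

filtered = [
    sorted({x for s in reachable[k] for x in domain if f(k + 1, s + a[k] * x)})
    for k in range(n)
]
print(filtered)                      # [[1], [1, 2], [1, 2]] as on the slide
```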
Recursive Optimization

max 15x1 + 10x2 + 12x3        (maximize revenue)
4x1 + 2x2 + 3x3 ≤ 12
xj ∈ {1,2}

The recursion includes arc values:

fk(sk) = max_{xk ∈ Dxk} { ck xk + fk+1(sk + ak xk) }

where fk(sk) is the value of a maximum-value path from state sk to the
final stage (the value to go), and ck xk is the arc value.

For example,

f3(8) = max{ 12·1 + f4(8 + 3·1), 12·2 + f4(8 + 3·2) }
      = max{12, −∞} = 12

LSE tutorial, June 2007
Slide 283
Recursive optimization

max 15x1 + 10x2 + 12x3
4x1 + 2x2 + 3x3 ≤ 12,  xj ∈ {1,2}

The recursion includes arc values:

fk(sk) = max_{xk ∈ Dxk} { ck xk + fk+1(sk + ak xk) }

Boundary condition:

fn+1(sn+1) = 0 if sn+1 ≤ b, and −∞ otherwise

[Diagram of fk(sk) for each state sk omitted; e.g. f1(0) = 49,
f2(4) = 34, f3(6) = 24, f3(8) = 12.]

LSE tutorial, June 2007
Slide 284
Recursive optimization

max 15x1 + 10x2 + 12x3
4x1 + 2x2 + 3x3 ≤ 12,  xj ∈ {1,2}

The maximum revenue is 49. The optimal path is easy to retrace:

(x1, x2, x3) = (1,1,2)

LSE tutorial, June 2007
Slide 285
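Extending the same sketch with arc values reproduces the maximum revenue and the optimal path:

```python
# Sketch of the DP with arc values for max 15x1 + 10x2 + 12x3
# subject to 4x1 + 2x2 + 3x3 <= 12, xj in {1,2}.
from functools import lru_cache

a, c, b = [4, 2, 3], [15, 10, 12], 12
domain = [1, 2]
n = len(a)

@lru_cache(maxsize=None)
def f(k, s):                    # value-to-go from state s at stage k
    if k == n:
        return 0 if s <= b else float("-inf")
    return max(c[k] * x + f(k + 1, s + a[k] * x) for x in domain)

# Retrace the optimal path using the value function.
s, path = 0, []
for k in range(n):
    x = max(domain, key=lambda v: c[k] * v + f(k + 1, s + a[k] * v))
    path.append(x); s += a[k] * x
print(f(0, 0), path)            # 49 and [1, 1, 2]
```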
CP-based Branch and Price

Basic Idea
Example: Airline Crew Scheduling

LSE tutorial, June 2007
Slide 286
Motivation

• Branch and price allows solution of integer programming problems
  with a huge number of variables.
• The problem is solved by a branch-and-relax method. The difference
  lies in how the LP relaxation is solved.
• Variables are added to the LP relaxation only as needed.
• Variables are priced to find which ones should be added.
• CP is useful for solving the pricing problem, particularly when
  constraints are complex.
• CP-based branch and price has been successfully applied to airline
  crew scheduling, transit scheduling, and other transportation-related
  problems.

LSE tutorial, June 2007
Slide 287
Basic Idea

Suppose the LP relaxation of an integer programming problem has a huge
number of variables:

min cx
Ax = b
x ≥ 0

We will solve a restricted master problem, which has a small subset J
of the variables:

min ∑_{j∈J} cj xj
∑_{j∈J} Aj xj = b   (λ)       (Aj = column j of A)
xj ≥ 0

Adding xk to the problem would improve the solution if xk has negative
reduced cost:

rk = ck − λAk < 0

LSE tutorial, June 2007
Slide 288
Basic Idea

Adding xk to the problem would improve the solution if xk has negative
reduced cost:

rk = ck − λAk < 0

Computing the reduced cost of xk is known as pricing xk.

So we solve the pricing problem

min { cy − λy | y is a column of A }

where cy is the cost of column y. If the solution y* satisfies
cy* − λy* < 0, then we can add the column y* to the restricted master
problem.

LSE tutorial, June 2007
Slide 289
Basic Idea

The pricing problem need not be solved to optimality, so long as we
find a column with negative reduced cost. However, when we can no
longer find an improving column, we solve the pricing problem to
optimality to make sure we have the optimal solution of the LP.

If we can state constraints that the columns of A must satisfy, CP may
be a good way to solve the pricing problem.

LSE tutorial, June 2007
Slide 290
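Abstractly, the column-generation loop looks like the sketch below; solve_restricted_lp and price_out are hypothetical application-specific helpers (not a library API).

```python
# Sketch of the column-generation loop. solve_restricted_lp returns the
# duals of the restricted master LP; price_out returns a (column,
# reduced_cost) pair minimizing reduced cost, e.g. found by CP search.
def column_generation(columns, solve_restricted_lp, price_out, tol=1e-8):
    while True:
        duals = solve_restricted_lp(columns)
        column, reduced_cost = price_out(duals)
        if reduced_cost >= -tol:
            return columns          # LP optimal: no improving column exists
        columns.append(column)      # add the priced-out column, re-solve
```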
Example: Airline Crew Scheduling

We want to assign crew members to flights to minimize cost while
covering the flights and observing complex work rules.

[Flight data table with start and finish times omitted.]

A roster is the sequence of flights assigned to a single crew member.
The gap between two consecutive flights in a roster must be from 2 to
3 hours. Total flight time for a roster must be between 6 and 10
hours. For example, flight 1 cannot immediately precede 6, and flight
4 cannot immediately precede 5.

The possible rosters are: (1,3,5), (1,4,6), (2,3,5), (2,4,6)

LSE tutorial, June 2007
Slide 291
Airline Crew Scheduling

There are 2 crew members, and the possible rosters are:

(1,3,5), (1,4,6), (2,3,5), (2,4,6)     (numbered 1–4)

The LP relaxation of the problem is built from:

• variables xik = 1 if we assign crew member i to roster k, and
  0 otherwise;
• the cost cik of assigning crew member i to roster k;
• a constraint that each crew member is assigned to exactly 1 roster;
• a covering constraint for each flight that at least 1 crew member is
  assigned to it: flight 1 is covered by rosters 1 and 2; flight 2 by
  rosters 3 and 4; flight 3 by rosters 1 and 3; flight 4 by rosters 2
  and 4; flight 5 by rosters 1 and 3; flight 6 by rosters 2 and 4.

In a real problem, there can be millions of rosters.

LSE tutorial, June 2007
Slides 292–299
Airline Crew Scheduling

We start by solving the problem with a subset of the columns. The dual
variables are u1, u2 (one per crew member) and v1, …, v6 (one per
flight); let (u, v) be the optimal dual solution.

The reduced cost of an excluded roster k for crew member i is

cik − ui − ∑_{j in roster k} vj

We will formulate the pricing problem as a shortest path problem.

LSE tutorial, June 2007
Slides 300–302
Pricing problem

[Network diagrams for crew members 1 and 2 omitted.] Each crew member
has an s–t network whose internal nodes are flights; each s–t path
corresponds to a roster, provided the total flight time is within
bounds.

Arc lengths: the cost of flight j if it immediately follows flight i,
offset by the dual multiplier for flight i; the cost of transferring
from home to the first flight, offset by the dual multiplier for the
crew member (the multiplier is omitted for one crew member to break
symmetry).

The length of a path is the reduced cost of the corresponding roster.

LSE tutorial, June 2007
Slides 303–307
Pricing problem

Arc lengths using the dual solution of the LP relaxation.

[Network diagrams with numeric arc lengths for crew members 1 and 2
omitted.]

LSE tutorial, June 2007
Slide 308
Pricing problem

Solution of the shortest path problems:

Crew member 1: reduced cost = −1. Add x12 to the problem.
Crew member 2: reduced cost = −2. Add x23 to the problem.

After x12 and x23 are added to the problem, no remaining variable has
negative reduced cost.

LSE tutorial, June 2007
Slide 309
Pricing problem

The shortest path problem cannot be solved by traditional shortest
path algorithms, due to the bounds on total path length. It can be
solved by CP:

Path( Xi, zi, G ), all crew members i        (path global constraint)
Tmin ≤ ∑_{j∈Xi} (fj − sj) ≤ Tmax             (setsum global constraint)
Xi ⊂ {flights},  zi < 0,  all i

where Xi is the set of flights assigned to crew member i, zi the path
length, G the graph, and fj − sj the duration of flight j.

LSE tutorial, June 2007
Slide 310
CP-based Benders Decomposition

Benders Decomposition in the Abstract
Classical Benders Decomposition
Example: Machine Scheduling

LSE tutorial, June 2007
Slide 312
Motivation

• Benders decomposition allows us to apply CP and OR to different
  parts of the problem.
• It searches over values of certain variables that, when fixed,
  result in a much simpler subproblem.
• The search learns from past experience by accumulating Benders cuts
  (a form of nogood).
• The technique can be generalized far beyond the original OR
  conception.
• Generalized Benders methods have resulted in the greatest speedups
  achieved by combining CP and OR.

LSE tutorial, June 2007
Slide 313
Benders Decomposition in the Abstract

Benders decomposition can be applied to problems of the form

min f(x, y)
S(x, y)
x ∈ Dx, y ∈ Dy

When x is fixed to some value, the resulting subproblem is much easier:

min f(x, y)
S(x, y)
y ∈ Dy

…perhaps because it decouples into smaller problems.

For example, suppose x assigns jobs to machines, and y schedules the
jobs on the machines. When x is fixed, the problem decouples into a
separate scheduling subproblem for each machine.

LSE tutorial, June 2007
Slide 314
Benders Decomposition

We will search over assignments to x. This is the master problem.

In iteration k we assume x = x^k and solve the subproblem

min f(x^k, y)
S(x^k, y)
y ∈ Dy

obtaining optimal value vk.

We generate a Benders cut (a type of nogood)

v ≥ Bk+1(x)

that satisfies Bk+1(x^k) = vk, where v is the cost in the original
problem.

The Benders cut says that if we set x = x^k again, the resulting cost
v will be at least vk. To do better than vk, we must try something
else. It also says that any other x will result in a cost of at least
Bk+1(x), perhaps due to some similarity between x and x^k.

We add the Benders cut to the master problem, which becomes

min v
v ≥ Bi(x), i = 1, …, k+1      (Benders cuts generated so far)
x ∈ Dx

LSE tutorial, June 2007
Slides 315–316
Benders Decomposition

We now solve the master problem

min v
v ≥ Bi(x), i = 1, …, k+1
x ∈ Dx

to get the next trial value x^{k+1}.

The master problem is a relaxation of the original problem, and its
optimal value is a lower bound on the optimal value of the original
problem. The subproblem is a restriction, and its optimal value is an
upper bound. The process continues until the bounds meet.

The Benders cuts partially define the projection of the feasible set
onto x. We hope not too many cuts are needed to find the optimum.

LSE tutorial, June 2007
Slide 317
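The abstract loop can be sketched as follows; solve_master and solve_subproblem are hypothetical problem-specific callbacks, not a library API.

```python
# Sketch of the abstract Benders loop. solve_master returns a trial x
# and the master's optimal value (a lower bound); solve_subproblem
# returns the subproblem value (an upper bound) and a new Benders cut.
def benders(solve_master, solve_subproblem, tol=1e-6):
    cuts, upper = [], float("inf")
    while True:
        x, lower = solve_master(cuts)        # relaxation: lower bound
        value, cut = solve_subproblem(x)     # restriction: upper bound
        upper = min(upper, value)
        if upper - lower <= tol:
            return x, upper                  # bounds have met
        cuts.append(cut)                     # add v >= B_{k+1}(x)
```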
Classical Benders Decomposition

The classical method applies to problems of the form

min f(x) + cy
g(x) + Ay ≥ b
x ∈ Dx, y ≥ 0

The subproblem is an LP,

min f(x^k) + cy
Ay ≥ b − g(x^k)   (λ)
y ≥ 0

whose dual is

max f(x^k) + λ(b − g(x^k))
λA ≤ c
λ ≥ 0

Let λ^k solve the dual. By strong duality, Bk+1(x) = f(x) + λ^k(b − g(x))
is the tightest lower bound on the optimal value v of the original
problem when x = x^k. Even for other values of x, λ^k remains feasible
in the dual, so by weak duality, Bk+1(x) remains a lower bound on v.

LSE tutorial, June 2007
Slide 318
Classical Benders

So the master problem becomes

min v
v ≥ f(x) + λ^i(b − g(x)), i = 1, …, k+1
x ∈ Dx

In most applications the master problem is
• an MILP,
• a nonlinear programming problem (NLP), or
• a mixed integer/nonlinear programming problem (MINLP).

LSE tutorial, June 2007
Slide 319
Example: Machine Scheduling

• Assign 5 jobs to 2 machines (A and B), and schedule the jobs
  assigned to each machine within time windows.
• The objective is to minimize makespan (the time lapse between the
  start of the first job and the end of the last job).

• Assign the jobs in the master problem, to be solved by MILP.
• Schedule the jobs in the subproblem, to be solved by CP.

LSE tutorial, June 2007
Slide 320
Machine Scheduling

[Job data table omitted.]

Once jobs are assigned, we can minimize overall makespan by minimizing
makespan on each machine individually. So the subproblem decouples.

LSE tutorial, June 2007
Slide 321
Machine Scheduling

[Figure: minimum makespan schedule for jobs 1, 2, 3, 5 on machine A.]

Once jobs are assigned, we can minimize overall makespan by minimizing
makespan on each machine individually. So the subproblem decouples.

LSE tutorial, June 2007
Slide 322
Machine Scheduling

The problem is

min M
M ≥ sj + p_{xj,j} , all j                    (sj = start time of job j)
rj ≤ sj ≤ dj − p_{xj,j} , all j              (time windows)
disjunctive( (sj | xj = i), (pij | xj = i) ), all i
                                             (jobs on a machine cannot
                                              overlap)

LSE tutorial, June 2007
Slide 323
Machine Scheduling

The problem is

min M
M ≥ sj + p_{xj,j} , all j
rj ≤ sj ≤ dj − p_{xj,j} , all j
disjunctive( (sj | xj = i), (pij | xj = i) ), all i

For a fixed assignment x, the subproblem on each machine i is

min M
M ≥ sj + pij , all j with xj = i
rj ≤ sj ≤ dj − pij , all j with xj = i
disjunctive( (sj | xj = i), (pij | xj = i) )

LSE tutorial, June 2007
Slide 324
Benders cuts

Suppose we assign jobs 1, 2, 3, 5 to machine A in iteration k. We can
prove that 10 is the optimal makespan by proving that the schedule is
infeasible with makespan 9.

Edge finding derives infeasibility by reasoning only with jobs 2, 3,
5. So these jobs alone create a minimum makespan of 10. So we have a
Benders cut

v ≥ Bk+1(x) = 10 if x2 = x3 = x5 = A, and 0 otherwise

LSE tutorial, June 2007
Slide 325
Benders cuts

We want the master problem to be an MILP, which is good for assignment
problems. So we write the Benders cut

v ≥ Bk+1(x) = 10 if x2 = x3 = x5 = A, and 0 otherwise

using 0-1 variables:

v ≥ 10 (xA2 + xA3 + xA5 − 2)
v ≥ 0

(xAj = 1 if job j is assigned to machine A.)

LSE tutorial, June 2007
Slide 326
Master problem

The master problem is an MILP:

min v
∑_{j=1}^{5} pAj xAj ≤ 10, etc.        ← constraints derived from time windows
∑_{j=1}^{5} pBj xBj ≤ 10, etc.        ← constraints derived from release times
v ≥ ∑_{j=1}^{5} pij xij ,  v ≥ 2 + ∑_{j=3}^{5} pij xij , etc.,  i = A, B
v ≥ 10 (xA2 + xA3 + xA5 − 2)          ← Benders cut from machine A
v ≥ 8 xB4                             ← Benders cut from machine B
xij ∈ {0,1}

LSE tutorial, June 2007
Slide 327
Stronger Benders cuts

If all release times are the same, we can strengthen the Benders cuts.
We are now using the cut

v ≥ Mik ( ∑_{j∈Jik} xij − |Jik| + 1 )

where Mik is the minimum makespan on machine i in iteration k, and Jik
the set of jobs assigned to machine i in iteration k.

A stronger cut provides a useful bound even if only some of the jobs
in Jik are assigned to machine i:

v ≥ Mik − ∑_{j∈Jik} (1 − xij) pij

These results can be generalized to cumulative scheduling.

LSE tutorial, June 2007
Slide 328
LSE tutorial, June 2007
Slide 329