
Digital Systems Design Methodologies

High level synthesis

Scheduling

20/03/20
The Scheduling

❑ The scheduling problem


❑ Scheduling without constraints
❑ Scheduling under resource constraints
– The ILP model (Integer Linear Programming)
– Heuristic methods (list based, graph coloring, etc)
❑ Scheduling under timing constraints

What is Scheduling

❑ Definition:
– Scheduling is the assignment of operations to time (control
steps), possibly within given limits on hardware resources and
timing
❑ Exploits potential parallelism (i.e. data dependencies)
❑ Exploits mutual exclusion
❑ Exploits loops

What is necessary to solve the
scheduling problem?
❑ Circuit model:
– Intermediate representation (DFG, CDFG, etc.).
– Cycle-time is given.
– Operation delays expressed in cycles.

❑ Scheduling:
– Determine the start times for the operations.
– Satisfying all the sequencing (timing and resource) constraints.
❑ Goal:
– Determine area/latency trade-off.
(Do you remember what latency is?)

Scheduling affects

Area: maximum number of concurrent operations of the same type is a lower bound on required hardware resources.

Performance: concurrency of resulting implementation.
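
To make the area bound concrete, here is a minimal sketch that counts, for a given schedule, how many operations of each type share a control step. The schedule, the operation names and the function name `concurrency_lower_bound` are hypothetical, and unit delays are assumed.

```python
from collections import Counter
from typing import Dict

def concurrency_lower_bound(start: Dict[str, int], kind: Dict[str, str]) -> Dict[str, int]:
    """Max number of same-type operations sharing one control step (unit delays)."""
    per_step = Counter((kind[v], step) for v, step in start.items())
    bound: Dict[str, int] = {}
    for (k, _step), count in per_step.items():
        bound[k] = max(bound.get(k, 0), count)
    return bound

# Hypothetical schedule: three multiplications and two additions.
start = {"m1": 1, "m2": 1, "m3": 2, "a1": 1, "a2": 3}
kind = {"m1": "mul", "m2": "mul", "m3": "mul", "a1": "add", "a2": "add"}
bound = concurrency_lower_bound(start, kind)
print(bound)  # {'mul': 2, 'add': 1}
```

Two multiplications start in step 1, so this schedule cannot be implemented with fewer than two multipliers.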

What are the types of Scheduling Algorithms?
Scheduling problems are NP-hard, so all kinds of heuristics are used

❑ ASAP – As soon as possible
❑ ALAP – As late as possible
❑ List scheduling – Resource Constrained algorithms
❑ Force directed algorithms
❑ Path based
❑ Percolation algorithms
❑ Simulated annealing
❑ Tabu search and other heuristics
❑ Simulated evolution
❑ Linear Programming
❑ Integer Linear Programming

Time & Resource Tradeoff
Scheduling is temporal binding

Simplest model of scheduling

❑ All operations have fixed delays.

❑ All delays are expressed in numbers of cycles of a single one-phase clock.
– Cycle-time is given.

❑ No constraints - no bounds on area.

❑ Goal:
– Minimize latency (Minimum Latency Unconstrained Scheduling Problem).

ASAP algorithm

for each node vi ∈ V do
  if Pred_vi = Ø then
    Ei = 1;
    V = V - {vi};
  else
    Ei = 0;
  endif;
endfor;

while V ≠ Ø do
  for each node vi ∈ V do
    if ALL_NODES_SCHED (Pred_vi, E) then
      Ei = MAX (Pred_vi, E) + 1;
      V = V - {vi};
    endif;
  endfor;
endwhile;
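
The pseudocode above can be sketched in Python as follows. This is only an illustration under stated assumptions: unit delays, a graph given as predecessor lists, and a hypothetical five-node example that is not taken from the slides.

```python
from typing import Dict, List

def asap(preds: Dict[str, List[str]]) -> Dict[str, int]:
    """Earliest control step (starting at 1) of every node, unit delays assumed."""
    start: Dict[str, int] = {}
    pending = set(preds)
    while pending:
        # Nodes whose predecessors are all scheduled can be placed now.
        ready = [v for v in pending if all(p in start for p in preds[v])]
        if not ready:
            raise ValueError("dependency graph contains a cycle")
        for v in ready:
            # Nodes without predecessors get step 1 (default=0, plus 1).
            start[v] = max((start[p] for p in preds[v]), default=0) + 1
            pending.discard(v)
    return start

# Hypothetical graph: c needs a and b, d needs c, e needs b.
example = {"a": [], "b": [], "c": ["a", "b"], "d": ["c"], "e": ["b"]}
schedule = asap(example)
print(schedule)
```

The latency of the unconstrained schedule is the largest step assigned, here `max(schedule.values())`.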

Example of ASAP: differential equation

Numeric integration of the differential equation
y’’ + 3 y’ x + 3 y = 0
in the range [0, a], with integration step dx and initial values x(0), y(0), y’(0).

Input data: a, dx, x(0) = x (< a), y(0) = y, y’(0) = u.
Output result: y.

“Forward Euler Method”: y’ = u; u’ = y’’ = - 3 u x – 3 y

HDL description:
read (x,y,u,dx,a);
repeat {
  x1=x+dx;
  u1=u-(3*x*u*dx)-(3*y*dx);
  y1=y+u*dx;
  c=x1<a;
  x=x1;u=u1;y=y1;
} until (c)

[Figure: DFG of the loop body with operations v1 to v11; inputs 3, x, u, dx, y, a; outputs x1, u1, y1, c]

Example of ASAP: scheduling

[Figure: DFG (left) and its ASAP schedule (right)]

ASAP result:
E=1: * v1, * v2, * v3, * v4, + v10
E=2: * v5, * v6, + v9, < v11
E=3: - v7
E=4: - v8

ALAP algorithm

❑ ALAP (as late as possible) is the dual algorithm to ASAP
❑ ALAP solves a latency-constrained problem.
❑ Latency bound can be set to latency computed by ASAP algorithm.

❑ Mobility
– Defined for each operation.
– Difference between ALAP and ASAP schedule.
– Zero mobility implies that an operation can be started only at one given time step.
– Mobility greater than 0 measures span of time interval in which an operation may start.
❑ Slack on the start time.


Example of ALAP: scheduling

ALAP result:
L=1: * v1, * v2
L=2: * v5, * v3
L=3: - v7, * v6, * v4, + v10
L=4: + v9, < v11, - v8

Example of using mobility

❑ Operations with zero mobility:
– {v1, v2, v3, v4, v5}
– They are on the critical path.
❑ Operations with mobility one:
– {v6, v7}
❑ Operations with mobility two:
– {v8, v9, v10, v11}

1. Start from ALAP
2. Use mobility to improve
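
The mobility sets listed above can be reproduced with a short sketch that computes ASAP and ALAP start times and subtracts them. The predecessor lists in `PREDS` are an assumption: they follow the usual textbook numbering of the differential-equation example, which agrees with the ASAP/ALAP windows used in the ILP example, because the figure itself is not recoverable here. Unit delays and a latency bound of 4 are assumed.

```python
from typing import Dict, List

# Assumed dependencies (operation -> predecessors), textbook numbering.
PREDS: Dict[int, List[int]] = {
    1: [], 2: [], 3: [1, 2], 4: [3], 5: [4, 7],
    6: [], 7: [6], 8: [], 9: [8], 10: [], 11: [10],
}

def asap(preds):
    """Earliest start step of each operation (steps start at 1)."""
    start = {}
    def visit(v):
        if v not in start:
            start[v] = max((visit(p) for p in preds[v]), default=0) + 1
        return start[v]
    for v in preds:
        visit(v)
    return start

def alap(preds, latency):
    """Latest start step of each operation for the given latency bound."""
    succs = {v: [] for v in preds}
    for v, ps in preds.items():
        for p in ps:
            succs[p].append(v)
    start = {}
    def visit(v):
        if v not in start:
            # Operations without successors start at the last step.
            start[v] = min((visit(s) for s in succs[v]), default=latency + 1) - 1
        return start[v]
    for v in preds:
        visit(v)
    return start

early = asap(PREDS)
late = alap(PREDS, latency=max(early.values()))
groups: Dict[int, List[int]] = {}
for v in sorted(PREDS):
    groups.setdefault(late[v] - early[v], []).append(v)
print(groups)
```

Setting the ALAP latency bound to the ASAP latency, as done here, is what makes the critical-path operations come out with zero mobility.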
Various Operator types in Scheduling

❑ Operator variety
– Single cycle
– Single cycle-multifunction
– Multi-cycle
– Multi-speed for the same operation
– Pipelined operator

Resource sharing

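❑ More than one operation bound to same resource
❑ Operations must be serialized
❑ Can be represented using hyperedges (define vertex partition)

[Figure: sequencing graph scheduled over steps 1 to 4 between source and sink NOPs, with hyperedges grouping the operations that share a resource]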
[©Gupta]
Portion of the Datapath

[Figure: datapath fragment with registers R1 and R2, multiplexers mux1 and mux2 feeding a multiplier, and a controller FSM driving the write enables we_r1 and we_r2]

Resource Constrained Scheduling

❑ Constrained scheduling
– General case NP-complete
– Minimize latency given constraints on area or the resources (ML-RCS)
– Minimize resources subject to bound on latency (MR-LCS)
❑ Exact solution methods
– ILP: Integer Linear Programming
❑ Heuristics
– List scheduling
– Force-directed scheduling
ILP Formulation of ML-RCS

❑ Use binary decision variables
– i = 0, 1, ..., n
– l = 1, 2, ..., $\bar{\lambda}+1$, where $\bar{\lambda}$ is a given upper bound on latency
– $x_{il} = 1$ if operation i starts at step l, 0 otherwise.
❑ Set of linear inequalities (constraints), and an objective function (min latency)
❑ Observations
– $x_{il} = 0$ for $l < t_i^S$ and $l > t_i^L$, where $t_i^S = \mathrm{ASAP}(v_i)$ and $t_i^L = \mathrm{ALAP}(v_i)$
– $t_i = \sum_l l \cdot x_{il}$ is the start time of operation i.

[Mic94] p.198
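
As a small illustration of the binary encoding, the sketch below builds the start indicators of one operation from its start time and recovers the start time as the weighted sum over steps. The helper names `encode_start` and `decode_start` are hypothetical.

```python
from typing import Dict

def encode_start(t_i: int, max_step: int) -> Dict[int, int]:
    """Binary start indicators of one operation: 1 at step t_i, 0 elsewhere."""
    return {l: int(l == t_i) for l in range(1, max_step + 1)}

def decode_start(x_i: Dict[int, int]) -> int:
    """Start time recovered as the sum over l of l * x_il."""
    return sum(l * x for l, x in x_i.items())

x = encode_start(3, max_step=5)
print(x, decode_start(x))
```

Exactly one indicator is set, so the weighted sum picks out the step at which the operation starts.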
Start Time vs. Execution Time

❑ For each operation vi , only one start time


❑ If di=1, then the following questions are the same:
– Does operation vi start at step l?
– Is operation vi running at step l?
❑ But if di>1, then the two questions should be formulated
as:
– Does operation vi start at step l?
• Does xil = 1 hold?
– Is operation vi running at step l?
• Does the following hold?
$\sum_{m=l-d_i+1}^{l} x_{im} = 1$
Operation vi Still Running at Step l ?

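❑ Is v9 running at step 6?
– Is $x_{9,6} + x_{9,5} + x_{9,4} = 1$ ?
– Three cases (d9 = 3): v9 starts at step 6 ($x_{9,6} = 1$), at step 5 ($x_{9,5} = 1$), or at step 4 ($x_{9,4} = 1$); in each case it occupies step 6.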


❑ Note:
– Only one (if any) of the above three cases can happen
– To meet resource constraints, we have to ask the same question for ALL
steps, and ALL operations of that type
Operation vi Still Running at Step l ?

❑ Is vi running at step l ?
– Is $x_{i,l} + x_{i,l-1} + \dots + x_{i,l-d_i+1} = 1$ ?
– vi occupies step l if it started at step l ($x_{i,l} = 1$), at step l-1 ($x_{i,l-1} = 1$), ..., or at step l-di+1 ($x_{i,l-d_i+1} = 1$).
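
The window sum can be checked directly. The following sketch evaluates it for one operation; the function name `running_at` is hypothetical, and the example takes the case where v9 has delay 3 and starts at step 4.

```python
from typing import Dict

def running_at(x_i: Dict[int, int], d_i: int, l: int) -> bool:
    """True if the start indicators x_i sum to 1 over steps l-d_i+1 .. l."""
    window = range(l - d_i + 1, l + 1)
    return sum(x_i.get(m, 0) for m in window) == 1

x9 = {4: 1}  # v9 starts at step 4; every other indicator is 0
busy_steps = [l for l in range(1, 9) if running_at(x9, 3, l)]
print(busy_steps)  # [4, 5, 6]
```

The operation occupies exactly d_i consecutive steps, which is what the resource constraints need to count.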


ILP Formulation of ML-RCS (cont.)

❑ Constraints:
– Unique start times:
$\sum_l x_{il} = 1, \quad i = 0, 1, \dots, n$
– Sequencing (dependency) relations must be satisfied:
$t_i \ge t_j + d_j \;\; \forall (v_j, v_i) \in E \;\Rightarrow\; \sum_l l \cdot x_{il} \ge \sum_l l \cdot x_{jl} + d_j$
– Resource constraints:
$\sum_{i:T(v_i)=k} \; \sum_{m=l-d_i+1}^{l} x_{im} \le a_k, \quad k = 1, \dots, n_{res}, \quad l = 1, \dots, \bar{\lambda}+1$
❑ Objective: min $c^T t$
– t = start times vector, c = cost weight (e.g., [0 0 ... 1])
– When c = [0 0 ... 1], $c^T t = \sum_l l \cdot x_{nl}$
ILP Example

❑ Assume $\bar{\lambda} = 4$
❑ First, perform ASAP and ALAP
– (we can write the ILP without ASAP and ALAP, but using ASAP and ALAP will simplify the inequalities)

ASAP schedule (source NOP to sink NOP vn):
1: * v1, * v2, * v6, * v8, + v10
2: * v3, * v7, + v9, < v11
3: - v4
4: - v5

ALAP schedule (source NOP to sink NOP vn):
1: * v1, * v2
2: * v3, * v6
3: - v4, * v7, * v8, + v10
4: - v5, + v9, < v11
ILP Example: Unique Start Times Constraint

❑ Without using ASAP and ALAP values:
$x_{1,1} + x_{1,2} + x_{1,3} + x_{1,4} = 1$
$x_{2,1} + x_{2,2} + x_{2,3} + x_{2,4} = 1$
...
$x_{11,1} + x_{11,2} + x_{11,3} + x_{11,4} = 1$

❑ Using ASAP and ALAP:
$x_{1,1} = 1$
$x_{2,1} = 1$
$x_{3,2} = 1$
$x_{4,3} = 1$
$x_{5,4} = 1$
$x_{6,1} + x_{6,2} = 1$
$x_{7,2} + x_{7,3} = 1$
$x_{8,1} + x_{8,2} + x_{8,3} = 1$
$x_{9,2} + x_{9,3} + x_{9,4} = 1$
....
ILP Example: Dependency Constraints

❑ Using ASAP and ALAP, the non-trivial inequalities are (assuming unit delay for + and *):

$2x_{7,2} + 3x_{7,3} - x_{6,1} - 2x_{6,2} - 1 \ge 0$
$2x_{9,2} + 3x_{9,3} + 4x_{9,4} - x_{8,1} - 2x_{8,2} - 3x_{8,3} - 1 \ge 0$
$2x_{11,2} + 3x_{11,3} + 4x_{11,4} - x_{10,1} - 2x_{10,2} - 3x_{10,3} - 1 \ge 0$
$4x_{5,4} - 2x_{7,2} - 3x_{7,3} - 1 \ge 0$
$5x_{n,5} - 2x_{9,2} - 3x_{9,3} - 4x_{9,4} - 1 \ge 0$
$5x_{n,5} - 2x_{11,2} - 3x_{11,3} - 4x_{11,4} - 1 \ge 0$
ILP Example: Resource Constraints

❑ Resource constraints (assuming 2 adders and 2 multipliers)
– Multipliers:
$x_{1,1} + x_{2,1} + x_{6,1} + x_{8,1} \le 2$
$x_{3,2} + x_{6,2} + x_{7,2} + x_{8,2} \le 2$
$x_{7,3} + x_{8,3} \le 2$
– Adders (ALUs):
$x_{10,1} \le 2$
$x_{9,2} + x_{10,2} + x_{11,2} \le 2$
$x_{4,3} + x_{9,3} + x_{10,3} + x_{11,3} \le 2$
$x_{5,4} + x_{9,4} + x_{11,4} \le 2$
❑ Objective: Min Xn,4
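
The example is small enough to be checked by exhaustive enumeration. The sketch below is not an ILP solver: it simply tries every start time inside each operation's ASAP/ALAP window and keeps an assignment that satisfies the sequencing and resource constraints. The dependencies among v1 to v5 and the type labels in `KIND` are assumptions taken from the usual textbook version of this example; unit delays, 2 multipliers and 2 ALUs are assumed as on the slide.

```python
from itertools import product

# Assumed dependencies (operation -> predecessors) and operation types.
PREDS = {1: [], 2: [], 3: [1, 2], 4: [3], 5: [4, 7],
         6: [], 7: [6], 8: [], 9: [8], 10: [], 11: [10]}
KIND = {1: "mul", 2: "mul", 3: "mul", 6: "mul", 7: "mul", 8: "mul",
        4: "alu", 5: "alu", 9: "alu", 10: "alu", 11: "alu"}
# (ASAP, ALAP) window of each operation for a latency bound of 4.
WINDOW = {1: (1, 1), 2: (1, 1), 3: (2, 2), 4: (3, 3), 5: (4, 4),
          6: (1, 2), 7: (2, 3), 8: (1, 3), 9: (2, 4), 10: (1, 3), 11: (2, 4)}
LIMIT = {"mul": 2, "alu": 2}

def feasible(t):
    # Sequencing: t_i >= t_j + 1 for every edge (v_j, v_i), unit delays.
    if any(t[i] < t[j] + 1 for i in PREDS for j in PREDS[i]):
        return False
    # Resources: at most LIMIT[k] operations of type k in any step.
    for step in range(1, 5):
        for k, limit in LIMIT.items():
            if sum(1 for v in t if t[v] == step and KIND[v] == k) > limit:
                return False
    return True

def solve():
    ops = sorted(WINDOW)
    ranges = [range(lo, hi + 1) for lo, hi in (WINDOW[v] for v in ops)]
    best = None
    for choice in product(*ranges):
        t = dict(zip(ops, choice))
        if feasible(t):
            sink = max(t.values()) + 1  # start time of the sink NOP v_n
            if best is None or sink < best[0]:
                best = (sink, t)
    return best

sink_start, schedule = solve()
print(sink_start, schedule)
```

Because v5 is fixed at step 4, every feasible assignment places the sink at step 5; the enumeration mainly confirms that two multipliers and two ALUs are enough to meet the latency bound.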
ILP Formulation of MR-LCS

❑ Dual problem to ML-RCS
❑ Objective:
– Goal is to optimize total resource usage, a.
– Objective function is $c^T a$, where entries in c are respective area costs of resources
❑ Constraints:
– Same as ML-RCS constraints, plus:
– Latency constraint added:
$\sum_l l \cdot x_{nl} \le \bar{\lambda} + 1$
– Note: unknown $a_k$ appears in constraints.

[©Gupta]
List Scheduling

❑ Heuristic methods for ML-RCS and MR-LCS


– Does NOT guarantee optimum solution
– Greedy strategy
– Operation selection decided by criticality
– O(n) time complexity
❑ More general input
– Works on general graphs
– Resource constraints on different resource types
List Scheduling Algorithm: ML-RCS

LIST_L (G(V,E), a) {
  l = 1
  repeat {
    for each resource type k {
      Ul,k = available vertices in V
      Tl,k = operations in progress
      Select Sk ⊆ Ul,k such that |Sk| + |Tl,k| ≤ ak
      Schedule the Sk operations at step l
    }
    l = l + 1
  } until vn is scheduled.
}
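
A minimal Python sketch of LIST_L follows, assuming an acyclic graph given as predecessor lists. The example data are assumptions: the dependencies follow the usual textbook numbering of the differential-equation example, the priority is taken as the longest path to the sink, and 2 multipliers and 2 ALUs with unit delay are available.

```python
from typing import Dict, List

def list_schedule(preds: Dict[int, List[int]], kind: Dict[int, str],
                  delay: Dict[int, int], limit: Dict[str, int],
                  priority: Dict[int, int]) -> Dict[int, int]:
    """Greedy ML-RCS list scheduling; returns the start step of each operation."""
    if any(limit.get(kind[v], 0) < 1 for v in preds):
        raise ValueError("every operation type needs at least one resource")
    start: Dict[int, int] = {}
    step = 1
    while len(start) < len(preds):
        for k, a_k in limit.items():
            # T: operations of type k still executing in this step.
            busy = sum(1 for v, s in start.items()
                       if kind[v] == k and s <= step < s + delay[v])
            # U: unscheduled operations of type k whose predecessors are done.
            ready = [v for v in preds if v not in start and kind[v] == k
                     and all(p in start and start[p] + delay[p] <= step
                             for p in preds[v])]
            ready.sort(key=lambda v: priority[v], reverse=True)
            # S: the most critical ready operations that still fit.
            for v in ready[:max(0, a_k - busy)]:
                start[v] = step
        step += 1
    return start

PREDS = {1: [], 2: [], 3: [1, 2], 4: [3], 5: [4, 7],
         6: [], 7: [6], 8: [], 9: [8], 10: [], 11: [10]}
KIND = {1: "mul", 2: "mul", 3: "mul", 6: "mul", 7: "mul", 8: "mul",
        4: "alu", 5: "alu", 9: "alu", 10: "alu", 11: "alu"}
DELAY = {v: 1 for v in PREDS}
# Assumed priority: number of steps on the longest path to the sink.
PRIORITY = {1: 4, 2: 4, 3: 3, 4: 2, 5: 1, 6: 3, 7: 2, 8: 2, 9: 1, 10: 2, 11: 1}

start = list_schedule(PREDS, KIND, DELAY, {"mul": 2, "alu": 2}, PRIORITY)
print(start, "latency:", max(start.values()))
```

With these inputs the greedy schedule reaches latency 4, the same as the unconstrained ASAP latency; in general the heuristic gives no such guarantee.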
RCS: List Scheduling

❑ A simple scheduling algorithm based on greedy strategies
❑ List scheduling algorithm:
1. Construct a priority list based on some metrics (operation mobility, numbers of successors, etc)
2. While not all operations scheduled
   1. For each available resource, select an operation in the ready list following the descending priority.
   2. Assign these operations to the current clock cycle
   3. Update the ready list
   4. Clock cycle ++
❑ Solution quality depends on the benchmark and on the particular metric
List Scheduling Example
Assumptions: three multipliers with latency 2; 1 ALU with latency 1

Source: Gupta
List Scheduling Algorithm: MR-LCS
LIST_R (G(V,E), λ̄) {
  a = 1, l = 1
  Compute the ALAP times tL.
  if t0L < 0
    return (not feasible)
  repeat {
    for each resource type k {
      Ul,k = available vertices in V.
      Compute the slacks { si = tiL - l, ∀ vi ∈ Ul,k }.
      Schedule operations with zero slack, update a
      Schedule additional Sk ⊆ Ul,k under a constraints
    }
    l = l + 1
  } until vn is scheduled.
}
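
LIST_R can be sketched in the same style. This version assumes unit delays, numbers the steps from 1 (so infeasibility shows up as an ALAP time below 1), and reuses the assumed textbook dependencies of the differential-equation example; the function names are hypothetical.

```python
from typing import Dict, List

def alap(preds: Dict[int, List[int]], latency: int) -> Dict[int, int]:
    """Latest start step of each operation for the given latency bound."""
    succs: Dict[int, List[int]] = {v: [] for v in preds}
    for v, ps in preds.items():
        for p in ps:
            succs[p].append(v)
    late: Dict[int, int] = {}
    def visit(v):
        if v not in late:
            late[v] = min((visit(s) for s in succs[v]), default=latency + 1) - 1
        return late[v]
    for v in preds:
        visit(v)
    return late

def list_schedule_min_resources(preds, kind, latency):
    """Greedy MR-LCS list scheduling; returns (start steps, resources) or None."""
    late = alap(preds, latency)
    if min(late.values()) < 1:
        return None  # latency bound cannot be met
    a = dict.fromkeys(kind.values(), 1)  # one resource of each type to begin
    start: Dict[int, int] = {}
    step = 1
    while len(start) < len(preds):
        for k in a:
            ready = [v for v in preds if v not in start and kind[v] == k
                     and all(p in start and start[p] < step for p in preds[v])]
            urgent = [v for v in ready if late[v] == step]  # zero slack
            others = sorted((v for v in ready if late[v] > step),
                            key=lambda v: late[v])
            a[k] = max(a[k], len(urgent))  # add resources only when forced
            for v in urgent + others[:a[k] - len(urgent)]:
                start[v] = step
        step += 1
    return start, a

PREDS = {1: [], 2: [], 3: [1, 2], 4: [3], 5: [4, 7],
         6: [], 7: [6], 8: [], 9: [8], 10: [], 11: [10]}
KIND = {1: "mul", 2: "mul", 3: "mul", 6: "mul", 7: "mul", 8: "mul",
        4: "alu", 5: "alu", 9: "alu", 10: "alu", 11: "alu"}

start, resources = list_schedule_min_resources(PREDS, KIND, latency=4)
print(resources, start)
```

Operations with zero slack are always scheduled, which is what forces the resource count up; operations with positive slack only fill resources that are already allocated.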
