
Review: Design Objectives

Cost: improving cost is desired
Quality: improving quality beyond threshold is desired
Performance: improving performance beyond threshold is a waste
[Figure: Cost, Performance, and Quality axes with Thresholds marked; arrows labeled "Better" show the direction of improvement]

Co-design Flow
[Flow diagram. Stages: Informal Specification -> System Model (System Simulation, with Refine feedback) -> Algorithmic Design and Hardware/Software Partitioning -> Partitioned Model -> Schedule -> Partitioned Model & Sch. (HW/SW Co-simulation)]

Co-design Flow
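[Flow diagram. Stages: Partitioned Model + Sch. -> Communication Synthesis -> Software Model and Hardware Model (HW/SW Co-simulation, with Refine feedback) -> Compilation of the Software Model into a Binary Exec. Model; Synthesis of the Hardware Model into a Gate-level Model (HW/SW Co-simulation)]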

Co-design Flow
[Flow diagram. Stages: Binary Exec. Model and Gate-level Model -> Emulate or Prototype (with Refine feedback) -> Fabrication]

Winter 2010- CS 244

Informal Specification & System Level Model

Informal Specification loosely defines high level behavior, constraints, and optimization objectives of the system
  Algorithmic and implementation details absent
  Performance estimates not present
System level model formally captures behavior, constraints, and optimization objectives
  Can be simulated to obtain early performance estimates
    Feedback to refine the system specification
  Can serve as a golden model for validation of intermediate or final stages

Algorithmic Design & Hardware/Software Partitioning

Decompose (i.e., partition) the function F of the system into N sub-functions F1, F2, F3, ..., FN
Decompose the constraints and design objectives of the system into sub-constraints and design sub-objectives
Cluster F1, F2, F3, ..., FN into M partitions to run on M processors
[Figure: F decomposed into {F1, F2, F3, ..., FN}, which are clustered onto processors P1, P2, P3, ..., PM]
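
As a rough illustration of the clustering step, a partition can be represented as a mapping from sub-functions to processors, and its communication cost as the total weight of the edges that cross processors. The edge weights and the mapping below are hypothetical values chosen for the sketch; they do not come from the slides.

```python
from collections import defaultdict

# Hypothetical data volumes exchanged between sub-functions (illustrative only).
edges = {("F1", "F2"): 4, ("F2", "F3"): 1, ("F3", "F4"): 5, ("F1", "F4"): 2}

# One candidate clustering of N = 4 sub-functions onto M = 2 processors.
mapping = {"F1": "P1", "F2": "P1", "F3": "P2", "F4": "P2"}


def clusters(mapping):
    """Group sub-functions by the processor they are mapped to."""
    groups = defaultdict(list)
    for func, proc in sorted(mapping.items()):
        groups[proc].append(func)
    return dict(groups)


def cut_cost(edges, mapping):
    """Sum the weights of edges whose endpoints sit on different processors."""
    return sum(w for (a, b), w in edges.items() if mapping[a] != mapping[b])


print(clusters(mapping))
print(cut_cost(edges, mapping))
```

A different mapping can be scored the same way, which is what lets the partitioning algorithms on later slides compare candidate partitions.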

Scheduling

Scheduling obtains an execution sequence such that dependencies are obeyed
Static
  During design time the schedule is fixed (the common case)
Dynamic
  During execution time, the schedule is determined (reconfigurable computing)
[Figure: dependency graph of functions F1 through F8]
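
A static schedule that obeys dependencies can be sketched as a topological ordering of the dependency graph. The version below uses Kahn's algorithm, which is one common choice and not necessarily the method intended by the slides; the four-function graph is a hypothetical example, not the graph from the figure.

```python
from collections import deque


def static_schedule(deps):
    """Return one execution order in which every function follows its predecessors.

    deps maps each function to the list of functions it depends on.
    """
    indegree = {f: len(preds) for f, preds in deps.items()}
    successors = {f: [] for f in deps}
    for f, preds in deps.items():
        for p in preds:
            successors[p].append(f)

    # Functions with no unmet dependencies are ready to run.
    ready = deque(sorted(f for f, d in indegree.items() if d == 0))
    order = []
    while ready:
        f = ready.popleft()
        order.append(f)
        for s in successors[f]:
            indegree[s] -= 1
            if indegree[s] == 0:
                ready.append(s)

    if len(order) != len(deps):
        raise ValueError("dependency cycle: no valid schedule exists")
    return order


# Hypothetical dependency graph: F2 and F3 need F1; F4 needs F2 and F3.
deps = {"F1": [], "F2": ["F1"], "F3": ["F1"], "F4": ["F2", "F3"]}
print(static_schedule(deps))
```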

Scheduling

A deadline D for the entire schedule
An execution time Ti for each Fi
ASAP (as soon as possible)
ALAP (as late as possible)
[Figure: dependency graph of F1 through F8 annotated with execution times; legible values: F1 = 3, F2 = 3, F3 = 1, F4 = 6, F8 = 3]
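
ASAP and ALAP start times can be sketched with one forward and one backward sweep over the dependency graph. The graph below is hypothetical (the figure's edges are not recoverable from the text), the execution times borrow the values legible in the figure for F1 through F4, and the deadline D = 14 is an assumption made for the example.

```python
def asap_alap(deps, exec_time, deadline):
    """Compute ASAP and ALAP start times.

    deps maps each function to its predecessors and is assumed to be listed
    in a valid dependency order (predecessors before successors).
    exec_time gives the execution time Ti of each Fi; deadline is D.
    """
    order = list(deps)

    asap = {}
    for f in order:
        # Earliest start: the moment the slowest predecessor has finished.
        asap[f] = max((asap[p] + exec_time[p] for p in deps[f]), default=0)

    successors = {f: [s for s in order if f in deps[s]] for f in order}
    alap = {}
    for f in reversed(order):
        # Latest finish: the deadline, or the earliest ALAP start of a successor.
        latest_finish = min((alap[s] for s in successors[f]), default=deadline)
        alap[f] = latest_finish - exec_time[f]

    return asap, alap


deps = {"F1": [], "F2": ["F1"], "F3": ["F1"], "F4": ["F2", "F3"]}
exec_time = {"F1": 3, "F2": 3, "F3": 1, "F4": 6}
asap, alap = asap_alap(deps, exec_time, deadline=14)
slack = {f: alap[f] - asap[f] for f in deps}
print(asap, alap, slack)
```

The difference between the two start times is the slack of each function; a negative slack would mean the deadline cannot be met.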


Partitioning (Clustering)

Iterative partitioning
  Start with a partition and improve
  Gradient search
  Controlled random search
  Modified Kernighan/Lin and FM algorithm
    Partitions a set of nodes (functions) into two bins (processors)
    Minimize edges between bins (communication cost, wires, etc.)
    Cost function for moving a node from one partition to another
  ILP (integer linear programming)
  Genetic evolution
  Simulated annealing

Iterative Partitioning Algorithms

The computation time in an iterative algorithm is spent evaluating large numbers of partitions
Iterative algorithms differ from one another primarily in the ways in which they modify the partition and in which they accept or reject bad modifications
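
The "modify, then accept or reject" loop can be sketched with simulated annealing, one of the techniques listed earlier. This is a minimal sketch under stated assumptions: the modification is a swap of one node from each bin (so the bins stay balanced), bad moves are accepted with probability exp(-delta/T), and the graph, starting partition, and cooling parameters are all hypothetical.

```python
import math
import random


def cut_cost(edges, side):
    """Weight of the edges that cross between the two bins."""
    return sum(w for (a, b), w in edges.items() if side[a] != side[b])


def accept(delta, temperature, rng):
    """Always accept improvements; accept a bad move with probability exp(-delta/T)."""
    return delta <= 0 or rng.random() < math.exp(-delta / temperature)


def anneal(edges, side, steps=500, t0=5.0, cooling=0.99, seed=0):
    """Swap one node from each bin per step and keep the best partition seen."""
    rng = random.Random(seed)
    side = dict(side)
    cost = cut_cost(edges, side)
    best, best_cost = dict(side), cost
    temperature = t0
    for _ in range(steps):
        a = rng.choice([n for n in side if side[n] == 0])
        b = rng.choice([n for n in side if side[n] == 1])
        side[a], side[b] = 1, 0              # modify the partition
        delta = cut_cost(edges, side) - cost
        if accept(delta, temperature, rng):
            cost += delta
            if cost < best_cost:
                best, best_cost = dict(side), cost
        else:
            side[a], side[b] = 0, 1          # reject: undo the modification
        temperature *= cooling
    return best, best_cost


# Hypothetical graph: two tightly connected triangles joined by one light edge.
edges = {("A", "B"): 3, ("B", "C"): 3, ("A", "C"): 3,
         ("D", "E"): 3, ("E", "F"): 3, ("D", "F"): 3, ("C", "D"): 1}
start = {"A": 0, "B": 1, "C": 0, "D": 1, "E": 0, "F": 1}
best, best_cost = anneal(edges, start)
print(cut_cost(edges, start), "->", best_cost)
```

Every step re-evaluates the cost of a full partition, which is where the computation time goes; practical implementations usually update the cost incrementally instead.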

Kernighan-Lin (Min-Cut) Algorithms

Two-way partitioning example
Start with 2 equal subgraphs
Exchange k pairs in each iteration
Continue until no further improvement
Gain function: a function f of the internal and external costs
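
One pass of the pair-exchange idea can be sketched as follows. The sketch follows the textbook Kernighan-Lin formulation, which is an assumption about what the slide intends: each node has a value D = external cost - internal cost, swapping a and b gains D(a) + D(b) - 2*w(a, b), and only the best prefix of k swaps is kept. The graph is a hypothetical example.

```python
def weight(edges, a, b):
    """Symmetric edge-weight lookup; 0 when a and b are not connected."""
    return edges.get((a, b), edges.get((b, a), 0))


def d_value(edges, node, left, right):
    """External minus internal cost of a node with respect to its own bin."""
    own, other = (left, right) if node in left else (right, left)
    external = sum(weight(edges, node, n) for n in other)
    internal = sum(weight(edges, node, n) for n in own if n != node)
    return external - internal


def kl_pass(edges, left, right):
    """One pass: tentatively swap unlocked pairs, then keep the best k swaps."""
    left, right = set(left), set(right)
    work_l, work_r = set(left), set(right)
    locked, swaps, gains = set(), [], []
    for _ in range(min(len(left), len(right))):
        gain, a, b = max(
            (d_value(edges, a, work_l, work_r) + d_value(edges, b, work_l, work_r)
             - 2 * weight(edges, a, b), a, b)
            for a in sorted(work_l - locked) for b in sorted(work_r - locked)
        )
        work_l.remove(a)
        work_r.remove(b)
        work_l.add(b)
        work_r.add(a)
        locked |= {a, b}
        swaps.append((a, b))
        gains.append(gain)

    # Keep the prefix of swaps with the largest cumulative gain.
    best_k, best_total, total = 0, 0, 0
    for k, g in enumerate(gains, start=1):
        total += g
        if total > best_total:
            best_k, best_total = k, total
    for a, b in swaps[:best_k]:
        left.remove(a)
        right.remove(b)
        left.add(b)
        right.add(a)
    return left, right, best_total


def kernighan_lin(edges, left, right):
    """Repeat passes until a pass yields no further improvement."""
    while True:
        left, right, gain = kl_pass(edges, left, right)
        if gain <= 0:
            return left, right


# Hypothetical graph: two triangles (A, B, C) and (D, E, F) joined by edge C-D.
edges = {("A", "B"): 3, ("B", "C"): 3, ("A", "C"): 3,
         ("D", "E"): 3, ("E", "F"): 3, ("D", "F"): 3, ("C", "D"): 1}
print(kernighan_lin(edges, {"A", "B", "D"}, {"C", "E", "F"}))
```

In this example the first pass swaps D and C for a gain of 12, reducing the cut from 13 to 1, and the second pass finds no improving prefix, so the loop stops.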

Hierarchical Clustering Example


Alternate Partitioning Techniques

Start with all functionality in software and move portions into hardware which are time-critical and cannot be allocated to software (software-oriented partitioning)
Start with all functionality in hardware and move portions into software implementation (hardware-oriented partitioning)
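
Software-oriented partitioning can be sketched as a greedy loop: everything starts in software, and functions move to hardware until a deadline is met. The selection rule (largest time saved per unit of hardware cost), the assumption that functions execute sequentially, and all numbers are hypothetical choices made for this sketch.

```python
def software_oriented_partition(funcs, deadline):
    """Move functions from software to hardware until the deadline is met.

    funcs maps a function name to (sw_time, hw_time, hw_cost), with hw_cost > 0.
    Total time is assumed to be the sum of the individual execution times.
    """
    hardware = set()
    total = sum(sw_time for sw_time, _, _ in funcs.values())

    # Rank candidates by time saved per unit of hardware cost, best first.
    ranked = sorted(
        funcs,
        key=lambda f: (funcs[f][0] - funcs[f][1]) / funcs[f][2],
        reverse=True,
    )
    for f in ranked:
        if total <= deadline:
            break
        sw_time, hw_time, _ = funcs[f]
        hardware.add(f)
        total -= sw_time - hw_time

    if total > deadline:
        raise ValueError("deadline cannot be met even with everything in hardware")
    return hardware, total


# Hypothetical (sw_time, hw_time, hw_cost) triples.
funcs = {"F1": (10, 2, 4), "F2": (6, 5, 1), "F3": (8, 2, 2)}
print(software_oriented_partition(funcs, deadline=15))
```

Hardware-oriented partitioning would run the same loop in the opposite direction, starting with every function in hardware and moving functions to software while the deadline still holds.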


More Partitioning Issues

Partitioning into hardware and software affects overall system cost and performance
Hardware implementation
  Provides higher performance via hardware speeds and parallel execution of operations
  Incurs additional design expense
Software implementation
  Lower performance
  Incurs high cost of developing and maintaining (complex) software


Functional Co-simulation

Some of the M processors are single-purpose (e.g., those with a single function mapped onto them), others are general-purpose
Functions mapped onto the general-purpose processors are implemented in software and simulated on virtual machines with performance models
Functions mapped onto the single-purpose processors are simulated at the behavioral level with performance models
Communication is done via abstract channels
Feedback is used to refine the partitioning and scheduling tasks


Communication Synthesis & Bus-accurate Co-simulation

Abstract channels A1, A2, ..., An are mapped onto a set of communication channels C1, C2, ..., Cm
  Similar to functional partitioning
  Similar to hardware/software scheduling
Channels correspond to physical artifacts of the architecture
Hardware and software models are annotated with detailed communication constructs
A hardware model and a software model are obtained and co-simulated
Communication synthesis (or possibly higher levels of design) is refined
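
Mapping abstract channels onto physical channels resembles the functional clustering problem, so a simple greedy assignment can illustrate it. The bandwidth demands, bus capacities, and the "largest demand first, least-loaded bus" rule below are assumptions made for this sketch, not details from the slides.

```python
def map_channels(demands, capacities):
    """Assign each abstract channel to the least-loaded physical channel that fits.

    demands maps abstract channels to required bandwidth; capacities maps
    physical channels to available bandwidth (same hypothetical units).
    """
    load = {c: 0 for c in capacities}
    assignment = {}
    # Place the most demanding abstract channels first.
    for a in sorted(demands, key=demands.get, reverse=True):
        fits = [c for c in capacities if load[c] + demands[a] <= capacities[c]]
        if not fits:
            raise ValueError(f"no physical channel can carry {a}")
        target = min(fits, key=lambda c: load[c])
        assignment[a] = target
        load[target] += demands[a]
    return assignment, load


# Hypothetical demands for abstract channels A1..A4 and capacities for C1, C2.
demands = {"A1": 40, "A2": 30, "A3": 20, "A4": 10}
capacities = {"C1": 60, "C2": 60}
assignment, load = map_channels(demands, capacities)
print(assignment, load)
```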


Compilation & Synthesis & Cycle-accurate Co-simulation

Compiler used to generate binary executables for general-purpose processors
Synthesis used to generate gate-level models of single-purpose processors
Synthesis used to generate gate-level models of general-purpose processors
Cycle-accurate co-simulation of the entire system

Note: mixed-level co-simulation is common


Emulate/Prototype and Fabrication

Use hardware (e.g., FPGAs) to emulate a system as fast as possible (relative to real-time)
Fabrication
  Place & route
  Mask design
  Chip testing
    Manufacturing fault models
    Test vector generation
  Packaging


Conclusion

Satisfying performance, cost, and quality metrics of a system entails hardware and software co-design
Partitioning is at the heart of co-design
Partitioning techniques
  Functional
  Communication
  Scheduling
  Constructive
  Iterative
Heuristics often used to bound the running time


