

SPECIFICATION AND DESIGN

OF
EMBEDDED SYSTEMS
by

Daniel D. Gajski
Frank Vahid
Sanjiv Narayan
Jie Gong

University of California at Irvine


Department of Computer Science
Irvine, CA 92715-3425

1 of 214 Copyright (c) 1994 Daniel D. Gajski, Frank Vahid, Sanjiv Narayan, and Jie Gong UC Irvine
Design representations

Behavioral
Represents functionality but not implementation

Structural
Represents connectivity but not dimensionality

Physical
Represents dimensionality but not functionality

Levels of abstraction

Level        Behavioral forms              Structural components        Physical objects

Transistor   Differential eq.,             Transistors, resistors,      Analog and digital cells
             current-voltage diagrams      capacitors

Gate         Boolean equations,            Gates, flip-flops            Modules, units
             finite-state machines

Register     Algorithms, flowcharts,       Adders, comparators,         Microchips, ASICs
             instruction sets,             registers, counters,
             generalized FSM               register files, queues

Processor    Executable spec., programs    Processors, controllers,     PCBs, MCMs
                                           memories, ASICs

Design methodologies

Capture-and-simulate
Schematic capture
Simulation

Describe-and-synthesize
Hardware description language
Behavioral synthesis
Logic synthesis

Specify-explore-refine
Executable specification
Software and hardware partitioning
Estimation and exploration
Specification refinement

Motivation

[Figure: an executable specification (example fragment: "if (x = 0) then y = a*b/2") is mapped to a system implementation consisting of a processor, memory, video accelerator, ASIC, and I/O.]

Tasks and topics labeled in the figure:
Models, Languages
Partitioning, Estimation, Refinement
Software compilation, Behavioral synthesis, Logic synthesis
Physical design, Test generation, Manufacturing

Outline

Introduction

Design models and architectures

System-design languages

An example

Translation

Partitioning

Estimation

Refinement

Methodology and environments

Models and architectures

[Figure: the design process takes a specification plus constraints, expressed with models, to an implementation, expressed with architectures.]

Models are conceptual views of the system’s functionality


Architectures are abstract views of the system’s implementation

Models and architectures

Model: a set of functional objects and rules for composing these objects

Architecture: a set of implementation components and their connections

Models of an elevator controller

(a) English description

"If the elevator is stationary and the floor requested is equal to the current floor, then the elevator remains idle.
If the elevator is stationary and the floor requested is less than the current floor, then lower the elevator to the requested floor.
If the elevator is stationary and the floor requested is greater than the current floor, then raise the elevator to the requested floor."

(b) Algorithmic model

loop
    if (req_floor = curr_floor) then
        direction := idle;
    elsif (req_floor < curr_floor) then
        direction := down;
    elsif (req_floor > curr_floor) then
        direction := up;
    end if;
end loop;

(c) State-machine model

States: Down, Idle, Up. Each state has the same three outgoing arcs (self-loops included):
    (req_floor < curr_floor) / direction := down    -> Down
    (req_floor = curr_floor) / direction := idle    -> Idle
    (req_floor > curr_floor) / direction := up      -> Up
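The state-machine model of the elevator controller can be sketched in Python as follows. This is a minimal illustration, assuming integer floor numbers; the function name `step` is hypothetical and does not come from the slides. In this model the next state depends only on the comparison of the two floors, not on the current state.

```python
# Sketch of the elevator controller state machine (names are illustrative).
def step(req_floor, curr_floor):
    """Return (next_state, direction) for one evaluation of the controller."""
    if req_floor < curr_floor:
        return "Down", "down"
    if req_floor > curr_floor:
        return "Up", "up"
    return "Idle", "idle"


if __name__ == "__main__":
    for req, curr in [(3, 1), (1, 3), (2, 2)]:
        print(req, curr, step(req, curr))
```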

Architectures for implementing the elevator controller

(a) Register level: combinational logic with inputs req_floor and curr_floor and output direction, together with a state register.

(b) System level: a processor and a memory connected by a bus, with in/out ports for req_floor, curr_floor, and direction.

Models

State-oriented models
Finite-state machine (FSM), Petri net, Hierarchical concurrent FSM

Activity-oriented models
Dataflow graph, Flowchart

Structure-oriented models
Block diagram, RT netlist, Gate netlist

Data-oriented models
Entity-relationship diagram, Jackson’s diagram

Heterogeneous models
Control/dataflow graph, Structure chart, Programming language paradigm,
Object-oriented paradigm, Program-state machine, Queueing model

State oriented: Finite-state machine (Mealy model)

[Figure: Mealy state diagram with states S1 (start), S2, and S3. The nine arcs carry input/output labels r1/n, r2/n, r3/n, r2/u1, r3/u1, r3/u2, r1/d1, r2/d1, and r1/d2.]

S = {s1, s2, s3}
I = {r1, r2, r3}
O = {d2, d1, n, u1, u2}
f: S x I -> S
h: S x I -> O
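The next-state function f and output function h can be sketched in Python. The tables below encode one reading of the diagram, which is an assumption: state s_i stands for floor i, input r_j requests floor j, and the outputs d2, d1, n, u1, u2 mean "down two", "down one", "none", "up one", "up two". The helper `run` is hypothetical.

```python
# Mealy machine sketch: the output depends on the current state AND the input.
MOVES = {-2: "d2", -1: "d1", 0: "n", 1: "u1", 2: "u2"}


def f(state, request):
    """Next-state function f: S x I -> S (move to the requested floor)."""
    return "s" + request[1]


def h(state, request):
    """Output function h: S x I -> O (direction and distance to move)."""
    return MOVES[int(request[1]) - int(state[1])]


def run(requests, state="s1"):
    """Feed a sequence of requests and collect the outputs produced."""
    outputs = []
    for r in requests:
        outputs.append(h(state, r))  # output is computed on the transition
        state = f(state, r)
    return state, outputs


if __name__ == "__main__":
    print(run(["r2", "r3", "r1"]))
```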

State oriented: Finite-state machine (Moore model)

[Figure: Moore state diagram with a start arrow and nine states, each labeled state/output: S11/d2, S21/d1, S31/n, S12/d1, S22/n, S32/u1, S13/n, S23/u1, S33/u2. Arcs are labeled with the inputs r1, r2, and r3.]

State oriented: Finite-state machine with datapath

(curr_floor != req_floor) / output := req_floor − curr_floor; curr_floor := req_floor

start S1

(curr_floor = req_floor) / output := 0
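A minimal Python sketch of this single-state FSMD follows. The class name `ElevatorFSMD` and the initial floor of 1 are assumptions made for illustration; the two branches mirror the two arcs on state S1.

```python
# FSMD sketch: one control state (S1) plus a datapath variable curr_floor.
class ElevatorFSMD:
    def __init__(self, curr_floor=1):
        self.state = "S1"
        self.curr_floor = curr_floor  # datapath storage

    def step(self, req_floor):
        """Take one transition and return the value assigned to `output`."""
        if self.curr_floor != req_floor:
            output = req_floor - self.curr_floor  # signed number of floors
            self.curr_floor = req_floor
        else:
            output = 0
        return output


if __name__ == "__main__":
    m = ElevatorFSMD()
    print([m.step(r) for r in (3, 3, 2)])
```

Compared with the pure FSM, the floor position lives in a variable rather than in the state set, so the diagram needs only one state.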

Finite-state machines

Merits:
represent system’s temporal behavior explicitly
suitable for control-dominated systems

Demerits:
lack of hierarchy and concurrency, resulting in
state or arc explosion when representing complex systems

State oriented: Petri nets

[Figure: Petri net with places p1 to p5 and transitions t1 to t4.]

Net = (P, T, I, O, u)
P = {p1, p2, p3, p4, p5}
T = {t1, t2, t3, t4}

I: I(t1) = {p1}          O: O(t1) = {p5}        u: u(p1) = 1
   I(t2) = {p2,p3,p5}       O(t2) = {p3,p5}        u(p2) = 1
   I(t3) = {p3}             O(t3) = {p4}           u(p3) = 2
   I(t4) = {p4}             O(t4) = {p2,p3}        u(p4) = 0
                                                   u(p5) = 1
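The net defined above can be exercised with a short Python sketch. It assumes the usual firing rule with arc weight 1 (a transition is enabled when every input place holds at least one token; firing removes one token from each input place and adds one to each output place). The function names `enabled` and `fire` are illustrative.

```python
# Petri net sketch using the sets I, O and the initial marking u from the slide.
I = {"t1": {"p1"}, "t2": {"p2", "p3", "p5"}, "t3": {"p3"}, "t4": {"p4"}}
O = {"t1": {"p5"}, "t2": {"p3", "p5"}, "t3": {"p4"}, "t4": {"p2", "p3"}}
u0 = {"p1": 1, "p2": 1, "p3": 2, "p4": 0, "p5": 1}


def enabled(marking, t):
    """A transition is enabled when each of its input places has a token."""
    return all(marking[p] >= 1 for p in I[t])


def fire(marking, t):
    """Return the marking reached by firing transition t."""
    if not enabled(marking, t):
        raise ValueError(t + " is not enabled")
    new = dict(marking)
    for p in I[t]:
        new[p] -= 1
    for p in O[t]:
        new[p] += 1
    return new


if __name__ == "__main__":
    print(sorted(t for t in I if enabled(u0, t)))
    print(fire(u0, "t2"))
```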

Petri nets

[Figure: Petri net fragments showing (a) sequence, (b) branch, (c) synchronization, (d) resource contention, and (e) concurrency.]

Petri nets

Merits:
good at modeling and analyzing concurrent systems

Demerits:
"flat" model that becomes
incomprehensible when system complexity increases

State oriented: Hierarchical concurrent FSM

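[Figure: hierarchical concurrent FSM. State Y is decomposed into two concurrent states A and D; A contains substates B and C, and D contains substates E, F, and G. Arcs are labeled u, a(P)/c, b, r, s, and a.]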

Hierarchical concurrent FSMs

Merits:
support both hierarchy and concurrency
good for representing complex systems

Demerits:
concentrate only on modeling control aspects
and not data and activities

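Activity oriented: Dataflow graphs (DFG)

[Figure: dataflow graphs. (a) Activity level: activities A1 and A2, with A2 decomposed into A2.1, A2.2, and A2.3; an input, an output, and a file; data flows labeled V, V', W, X, Y, and Z. (b) Operation level: + and * operation nodes.]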

Dataflow graphs

Merits:
support hierarchy
suitable for specifying complex transformational systems
represent problem-inherent data dependencies

Demerits:
do not express temporal behaviors or control sequencing
weak for modeling embedded systems

Activity oriented: Flowchart (CFG)

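[Figure: flowchart. From start, set J = 1 and MAX = 0. Test J > N: if yes, end. If no, test MEM(J) > MAX: if yes, set MAX = MEM(J). Then set J = J + 1 and return to the test J > N.]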
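The flowchart's control flow can be written as a Python loop. This is a sketch, assuming MEM is a 1-indexed array of N values held here in a Python list; the function name `find_max` is hypothetical. Because MAX starts at 0, the result is 0 when no element is positive.

```python
def find_max(mem):
    """Scan MEM(1..N) as the flowchart does and return MAX."""
    n = len(mem)
    j = 1
    max_val = 0
    while not j > n:               # decision box "J > N"
        if mem[j - 1] > max_val:   # decision box "MEM(J) > MAX"
            max_val = mem[j - 1]
        j = j + 1
    return max_val


if __name__ == "__main__":
    print(find_max([3, 9, 4]))
```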

Flowcharts

Merits:
useful to represent tasks governed by control flow
can impose an order that supersedes natural data dependencies

Characteristics:
used only when the system’s computation is well known

Structure oriented: Component-connectivity diagrams

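[Figure: component-connectivity diagrams. (a) Block diagram: processor, program memory, data memory, I/O coprocessor, and application-specific hardware on a system bus. (b) RT netlist: register file, registers LIR and RIR, and an ALU connected by a left bus and a right bus. (c) Gate netlist with inputs A and B.]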

Component-connectivity diagrams

Merits:
good at representing system’s structure

Characteristics:
often used in the later phases of design process

Data oriented: Entity-relationship diagram

[Figure: entity-relationship diagram with the labels Availability, Supplier, P.O. instance, Product, Customer, Request, and Order.]

Entity-relationship diagrams

Merits:
provide a good view of the data in the system, also
suitable for expressing complex relations among various kinds of data

Demerits:
do not describe any functional or temporal behavior of the system.

Data oriented: Jackson’s diagram

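[Figure: Jackson's diagram. Drawing is an AND composition of Color, Shape, and Users* (the * marks iteration); Shape is an OR of Circle and Rectangle; each user has a Name; Circle has a Radius; Rectangle is an AND composition of Width and Height.]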

Jackson’s diagrams

Merits:
suitable for representing data having a complex composite structure.

Demerits:
do not describe any functional or temporal behavior of the system.

Heterogeneous: Control/dataflow graph

[Figure: control/dataflow graphs. (a) Activity level: a control flow graph with states S0, S1, and S2 enables and disables the dataflow activities A1, A2, and A3. Arc labels: start / enable A1, enable A2; W = 10 / disable A1, enable A3; stop / disable A2, disable A3. (b) Operation level: a control flow graph over the statements X := X + 2, A := X + 5, A := X + 3, and A := X + W, each linked to a dataflow graph built from Read, Const, +, and Write nodes.]

Control/dataflow graphs

Merits:
correct the inability of DFG to represent the control of a system
correct the inability of CFG to represent data dependencies

Heterogeneous: Structure chart

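[Figure: structure chart. Main calls Get, Transform, Compute, and Out_C; Get calls Get_A and Get_B; Transform calls Change_A and Change_B; Compute calls Do_Loop1 and Do_Loop2. Arcs carry data (A, B, A', B', C, D) and control; the chart also marks a branch and an iteration.]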

Structure charts

Merits:
represent both data and control

Characteristics:
used in the preliminary stages of program design

Heterogeneous: Programming languages

Imperative vs declarative programming languages:
Imperative: C, Pascal, Ada, C++, etc.
Declarative: LISP, PROLOG, etc.

Sequential vs concurrent programming languages:
Sequential: Pascal, C, etc.
Concurrent: CSP, Ada, VHDL, etc.

Programming languages

Merits:
model data, activity, and control

Demerits:
do not explicitly model the system’s states

Heterogeneous: Object-oriented paradigm

[Figure: three objects, each encapsulating data and operations; a transformation function is shown between objects.]

Object-oriented paradigms

Merits:
support information hiding, inheritance, natural concurrency

Demerits:
not suitable for systems with complicated transformation functions

Heterogeneous: Program-state machine

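[Figure: program-state machine Y with concurrent substates A and D; A contains the leaf states B and C, and arcs are labeled e1, e2, and e3. The figure carries these declarations and the code of leaf state B:]

variable A: array[1..20] of integer

variable i, max: integer ;

max = 0;
for i = 1 to 20 do
    if ( A[i] > max ) then
        max = A[i] ;
    end if;
end for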

Program-state machines

Merits:
represent system’s states, data, control and activities in a single model
overcome the limitations of programming languages and HCFSM models

Heterogeneous: Queueing model

[Figure: arriving requests enter a queue that feeds (a) one server or (b) multiple servers.]

Queueing model

Characteristics:
used for analyzing system’s performance, and
can find utilization, queue length, throughput
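As one concrete illustration of these three quantities, the sketch below evaluates the standard closed-form results for a single-server M/M/1 queue. The slides do not name a specific queueing discipline, so the M/M/1 assumption (Poisson arrivals, exponential service times, one server) and the function name `mm1_metrics` are choices made here for the example only.

```python
def mm1_metrics(arrival_rate, service_rate):
    """Steady-state metrics of an M/M/1 queue (assumed model, see lead-in)."""
    if not 0 < arrival_rate < service_rate:
        raise ValueError("need 0 < arrival_rate < service_rate for stability")
    rho = arrival_rate / service_rate              # server utilization
    return {
        "utilization": rho,
        "mean_queue_length": rho * rho / (1 - rho),  # mean number waiting
        "throughput": arrival_rate,                  # equals arrivals when stable
    }


if __name__ == "__main__":
    print(mm1_metrics(2.0, 4.0))
```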

Architectures

Application-specific architectures
Controller architecture,
Datapath architecture,
Finite-state machine with datapath (FSMD).

General-purpose processors
Complex instruction set computer (CISC)
Reduced instruction set computer (RISC)
Vector machine
Very long instruction word computer (VLIW)

Parallel processors

Controller architecture

[Figure: controller architecture. The inputs and the state register feed a next-state function and an output function; the output function drives the outputs.]

Datapath architecture
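[Figure: datapath computing y(i) from the samples x(i), x(i-1), x(i-2), x(i-3) and the coefficients b(0), b(1), b(2), b(3) with four multipliers and three adders.]

(a) Three stage pipeline: four multiplications, then two pairwise additions, then a final addition.

(b) Four stage pipeline: four multiplications, then a chain of three additions.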
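Reading the figure as the four-tap sum y(i) = b(0)x(i) + b(1)x(i-1) + b(2)x(i-2) + b(3)x(i-3), the computation can be sketched in Python. The function name `fir` and the choice to treat samples before the start of the input as 0 are assumptions for illustration; the sketch models the arithmetic only, not the pipeline timing.

```python
def fir(x, b):
    """Return y with y[i] = sum over k of b[k] * x[i-k], using 0 for i-k < 0."""
    y = []
    for i in range(len(x)):
        acc = 0
        for k in range(len(b)):
            if i - k >= 0:
                acc += b[k] * x[i - k]   # one multiplier per tap
        y.append(acc)                    # the adders combine the products
    return y


if __name__ == "__main__":
    print(fir([1, 0, 0, 0, 0], [1, 2, 3, 4]))
```

Both pipeline variants in the figure compute the same sum; they differ in how the additions are grouped into stages.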

FSMD

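[Figure: FSMD. A control unit (state register, next-state function, output function) sends control signals to a datapath and receives status signals from it; the datapath has its own datapath inputs and datapath outputs.]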

CISC architecture

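[Figure: CISC architecture. The control unit contains a microprogram memory, a MicroPC with a +1 incrementer, address selection logic, and an instruction register; it sends control signals to the datapath and receives status signals. The figure also shows the PC and the memory.]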

RISC architecture

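[Figure: RISC architecture. The control unit contains hardwired output and next-state logic, a state register, and an instruction register; the datapath contains a register file and an ALU. Control and status signals connect the two, and an instruction cache and a data cache connect them to memory.]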

Vector machines

[Figure: vector machine. Interleaved memory connects through memory pipes to vector registers and scalar registers, which feed a vector functional unit and a scalar functional unit.]

VLIW architecture

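[Figure: VLIW architecture. Memory and a register file feed several parallel functional units (two adders and two multipliers).]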

Parallel processors: SIMD/MIMD

[Figure: parallel processors. (a) Message passing: a control unit and processing elements PE 0 to PE N-1, each with a processor and a local memory, connected by an interconnection network. (b) Shared memory: processors Proc. 0 to Proc. N-1 reach memories Mem. 0 to Mem. N-1 through an interconnection network.]

Conclusion

Different models focus on different aspects

Proper model needs to represent system’s features

Models are implemented in architectures

Smooth transformation of models to architectures increases productivity

System specification

For every design, there exists a conceptual view

Conceptual view depends on application


Computation : conceptualized as a program
Controller : conceptualized as a state-machine

Goal of specification language


Capture conceptual view with minimum designer effort

Ideal language
1-to-1 mapping between conceptual model & language constructs

Outline

Characteristics of commonly used conceptual models:


Concurrency, hierarchy, synchronization

Requirements for embedded system specification

Evaluate HDLs with respect to embedded systems


VHDL, Verilog, Esterel, CSP, Statecharts, SDL, SpecCharts

Concurrency

Behavior: a chunk of system functionality


e.g. process, procedure, state-machine

System often conceptualized as set of concurrent behaviors

Concurrency can exist at different abstraction levels:


Job-level
Task-level
Statement-level
Operation-level
Bit-level

Two types of concurrency within a behavior


Data-driven, Control-driven

Data-driven concurrency

Operations execute when input data is available

Execution order determined by data dependencies

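1: Q = A + B
2: Y = X + P
3: P = (C - D) * Q

[Figure: dataflow graph. A and B feed an add that produces Q; C and D feed a subtract; the subtract result and Q feed a multiply that produces P; X and P feed an add that produces Y.]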
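The three statements can be executed in data-driven order with a small Python sketch: an operation fires as soon as all of its operands have values, regardless of the order in which the statements are written. The function name `run_dataflow` and the intermediate name `T` (standing for C - D) are hypothetical.

```python
import operator


def run_dataflow(values):
    """Fire each operation once all of its inputs are available."""
    # (result, function, operand names), deliberately listed out of order.
    ops = [
        ("Y", operator.add, ("X", "P")),
        ("P", operator.mul, ("T", "Q")),
        ("Q", operator.add, ("A", "B")),
        ("T", operator.sub, ("C", "D")),
    ]
    values = dict(values)
    order = []
    pending = list(ops)
    while pending:
        ready = [op for op in pending if all(n in values for n in op[2])]
        if not ready:
            raise ValueError("unsatisfied data dependency")
        for name, fn, args in ready:
            values[name] = fn(*(values[a] for a in args))
            order.append(name)
        pending = [op for op in pending if op not in ready]
    return values, order


if __name__ == "__main__":
    print(run_dataflow({"A": 1, "B": 2, "C": 5, "D": 3, "X": 10}))
```

Statement 2 is written before statement 3, yet it fires last because it depends on P.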

Control-driven concurrency

Control thread : set of operations executed sequentially

Concurrency represented by multiple control threads

Fork-join statement:

sequential behavior X
begin
    Q();
    fork A(); B(); C(); join;
    R();
end behavior X;

[Figure: Q, then A, B, and C in parallel, then R.]

Process statement:

concurrent behavior X
begin
    process A();
    process B();
    process C();
end behavior X;

[Figure: A, B, and C in parallel.]
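The fork-join form can be sketched with Python threads, where starting the threads plays the role of fork and joining them plays the role of join. This is only an analogy under assumptions: the function name `behavior_x` and the use of a shared log to record execution order are illustrative.

```python
import threading


def behavior_x(log):
    """Run Q, then A, B, C as concurrent control threads, then R."""
    lock = threading.Lock()

    def make(name):
        def body():
            with lock:
                log.append(name)   # record that this behavior executed
        return body

    make("Q")()                                        # Q();
    threads = [threading.Thread(target=make(n)) for n in "ABC"]
    for t in threads:                                  # fork
        t.start()
    for t in threads:                                  # join
        t.join()
    make("R")()                                        # R();
    return log


if __name__ == "__main__":
    print(behavior_x([]))
```

Q is always first and R always last, while the relative order of A, B, and C is not fixed.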

State-transitions

Systems often are state-based, e.g. controllers

State may represent


mode or stage of being
computation

Difficult to capture using programming constructs

[Figure: state diagram from start to finish through states P, Q, R, S, and T, with arcs labeled u, v, w, x, y, and z.]

Hierarchy

Required for managing system complexity


Allows system modeler to focus on one subsystem at a time
Enhances comprehension of system functionality
Scoping mechanism for objects like types and variables

Two types of hierarchy


Structural hierarchy
Behavioral hierarchy

Structural hierarchy

System represented as set of interconnected components

Interconnections between components represent wires

Several levels: systems, chips, RT-components, gates

[Figure: a system containing a processor (control logic and datapath) and a memory, connected by a data bus and control lines.]

Behavioral hierarchy

Ability to successively decompose behavior into sub-behaviors

Concurrent decomposition
    Fork-join
    Process

Sequential decomposition
    Procedure
    State-machine

behavior P
variable x, y;
begin
    Q(x) ;
    R(y) ;
end behavior P;

[Figure: behavior P decomposed into subbehaviors Q and R; Q contains Q1, Q2, and Q3 and R contains R1 and R2, with arcs labeled e1 to e8.]

Programming constructs

Some behaviors easily conceptualized as sequential algorithms

Wide variety of constructs available


Assignment, branching, iteration, subprograms,
recursion, complex data types (records, lists)

type buffer_type is array (1 to 10) of integer;
variable buf : buffer_type;
variable i, j : integer;
for i = 1 to 10
    for j = i to 10
        if (buf(i) > buf(j)) then
            SWAP(buf(i), buf(j));
        end if;
    end for;
end for;
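A runnable Python counterpart of the nested loops is sketched below. It assumes the inner loop is meant to range from position i to the end of the buffer, which turns the fragment into an exchange sort in ascending order; the function name `exchange_sort` is hypothetical.

```python
def exchange_sort(buf):
    """Sort buf in place by comparing buf[i] with every later element."""
    n = len(buf)
    for i in range(n):
        for j in range(i, n):
            if buf[i] > buf[j]:
                buf[i], buf[j] = buf[j], buf[i]   # SWAP(buf(i), buf(j))
    return buf


if __name__ == "__main__":
    print(exchange_sort([5, 2, 9, 1]))
```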

Behavioral completion

Behavior completes when all computations performed

Advantages
Behavior can be viewed without inter-level transitions
Allows natural decomposition into sequential subbehaviors

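[Figure: left, a behavior B drawn as a state machine with a start state q0, states q1 and q2, and a final state q3; right, a behavior with sequential subbehaviors X (containing X1, X2, X3) and Y (containing Y1, Y2), with arcs labeled e1 to e5.]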

Communication

Concurrent behaviors exchange data

Shared-memory model
    Sender updates common medium
    Persistent, Non-persistent

[Figure: process P and process Q connected through shared memory.]

Message-passing model
    Data sent over abstract channels
    Unidirectional / bidirectional
    Point-to-point / multiway
    Blocking / non-blocking

[Figure: process P sends to process Q over channel C.]

process P                 process Q
begin                     begin
    variable x                variable y
    ....                      ....
    send (x);                 receive (y);
    ....                      ....
end                       end
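The message-passing exchange between P and Q can be sketched in Python with a bounded queue standing in for the abstract channel C. The names `demo`, `process_p`, and `process_q` and the value 42 are illustrative assumptions; `queue.Queue.put` and `get` block by default, which corresponds to the blocking variant.

```python
import queue
import threading


def demo():
    """Send one value from process P to process Q over channel C."""
    channel_c = queue.Queue(maxsize=1)   # unidirectional, point-to-point
    received = []

    def process_p():
        x = 42                 # illustrative value
        channel_c.put(x)       # send (x); blocks while the channel is full

    def process_q():
        y = channel_c.get()    # receive (y); blocks until data arrives
        received.append(y)

    q = threading.Thread(target=process_q)
    p = threading.Thread(target=process_p)
    q.start()
    p.start()
    p.join()
    q.join()
    return received


if __name__ == "__main__":
    print(demo())
```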

Synchronization

Concurrent behaviors execute at different speeds

Synchronization required when


Data exchanged between behaviors
Different activities must be performed simultaneously

Two types of synchronization mechanisms


Control-dependent
Data-dependent

Control-dependent synchronization

Synchronization based on control structure of behavior

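Fork-join:

behavior X
begin
    Q();
    fork A(); B(); C(); join;
    R();
end behavior X;

[Figure: Q, then A, B, and C in parallel, then a synchronization point before R.]

[Figure: state-based examples. A Reset arc applied to the concurrent behavior ABC (subbehaviors A, B, C); and a concurrent behavior AB with subbehaviors A (A1, A2) and B (B1, B2), with arcs labeled e.]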

Data-dependent synchronization

Synchronization based on communication of data between behaviors

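[Figure: three versions of a concurrent behavior AB with subbehaviors A (A1, A2) and B (B1, B2):
Synchronization by common event: both subbehaviors take arcs labeled e.
Synchronization by status detection: one arc is conditioned on "entered A2".
Synchronization by common variable: x := 0 and x := 1 are assigned in one subbehavior, and an arc in the other is conditioned on (x=1).]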

Exception handling

Occurrence of event terminates current computation

Control transferred to appropriate next mode

Example of exceptions: interrupts, resets

[Figure: behavior P with subbehaviors P1 and P2; an arc labeled e leads from P to Q.]

Timing

Required to represent real world implementations

Functional timing: affects simulation of system specification

    wait for 200 ns;
    A <= A + 1 after 100 ns;

Timing constraints: guide synthesis and verification tools

[Figure: timing diagram for behavior B with IN and OUT events constrained by "min 50 ns" and "max 10 ms"; behaviors P and Q communicate over channel C, annotated "max 10 Mb/s".]

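Embedded system specification

Embedded system: behavior defined by interaction with environment

Essential characteristics
State-transitions            Exceptions
Behavioral hierarchy         Concurrency
Programming constructs       Behavioral completion

[Figure: example diagrams over behaviors P, Q, R, and S illustrating state transitions (arcs u, v, w, x from start), hierarchy (P containing P1 and P2), an exception arc e, and fork-join concurrency of Q and R.]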

VHDL

IEEE standard, intended for documentation and exchange of designs [IEE88]

Characteristics supported
Behavioral hierarchy : single level of processes
Structural hierarchy : nested blocks and component instantiations
Concurrency : task-level (process), statement-level (signal assignment)
Programming constructs
Communication : shared-memory using global signals
Synchronization : wait on and wait until statements
Timing : wait for statement, after clause in assignments

Characteristics not supported


Exceptions : partially supported by guarded signal assignments
State transitions

Verilog and Esterel

Verilog [TM91] developed as proprietary language for specification, simulation

Esterel [Hal93] developed for specification of reactive systems

Characteristics supported:
Behavioral hierarchy : fork-join
Structural hierarchy : hierarchy of interconnected modules
Programming constructs
Communication : shared registers (Verilog) and broadcasting (Esterel)
Synchronization : wait for an event on a signal
Timing : modeling of gate, net, assignment delays in Verilog
Exceptions : disable (Verilog), watching, do-upto, trap statements (Esterel)

Characteristics not supported: State transitions

SDL (Specification and Description Language)

CCITT standard in telecommunication
for protocol specification [BHS91]

Characteristics supported
Behavioral hierarchy : nested dataflow
Structural hierarchy : nested blocks
State transitions : state machine in processes
Communication : message passing
Timing : timeouts generated by timer object

Characteristics not supported
Exceptions
Programming constructs

[Figure: an SDL system containing blocks connected by channels; blocks contain processes connected by signal routes.]

CSP (Communicating Sequential Processes)

Intended to specify programs running on multiprocessor machines [Hoa78]

Characteristics supported
Behavioral hierarchy : fork-join using parallel command
Programming constructs
Communication : message passing using input, output commands
Synchronization : blocking message passing

Characteristics not supported


Exceptions
State transitions
Structural hierarchy
Timing

SpecCharts

Developed for embedded system specification [NVG92]

PSM (program-state machine) model + VHDL

Characteristics supported
Behavioral hierarchy : sequential/concurrent behaviors
State transitions : TOC (transition on completion) arcs
Communication : shared memory, message passing
Exceptions : TI (transition immediately) arcs

Characteristics similar to VHDL
Programming constructs
Structural hierarchy
Synchronization and Timing

[Figure: behavior E with subbehavior B, which contains X (with X1 and X2) and Y; arcs are labeled e1, e2, and e3. The figure carries these declarations and code:]

port P, Q : in integer;
type INTARRAY is array (natural range <>) of integer;
signal A : INTARRAY (15 downto 0);

variable MAX : integer ;

MAX := 0;
for J in 0 to 15 loop
    if ( A(J) > MAX ) then
        MAX := A(J) ;
    end if;
end loop;

SpecCharts : state transitions

State transitions represented by TOC and TI arcs between behaviors

behavior MAIN type sequential subbehaviors is
begin
    P : (TOC, u, Q) ;
    Q : (TOC, v, P), (TOC, w, R);
    R : (TOC, x, Q);

    behavior P .....
    behavior Q .....
    behavior R .....
end MAIN;

[Figure: from start, P goes to Q on u; Q returns to P on v or goes to R on w; R returns to Q on x.]
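The TOC arcs of behavior MAIN can be modeled as a transition table in Python. This is a sketch under assumptions, not SpecCharts semantics in full: the names `ARCS`, `next_behavior`, and `trace` are hypothetical, and choosing the first listed arc whose condition holds is an assumed priority rule.

```python
# Transition table for MAIN: source -> [(arc type, condition, target)].
ARCS = {
    "P": [("TOC", "u", "Q")],
    "Q": [("TOC", "v", "P"), ("TOC", "w", "R")],
    "R": [("TOC", "x", "Q")],
}


def next_behavior(current, true_conditions):
    """After `current` completes, follow the first TOC arc whose condition holds."""
    for kind, cond, target in ARCS[current]:
        if kind == "TOC" and cond in true_conditions:
            return target
    return None  # no arc can be taken


def trace(start, steps):
    """Return the sequence of behaviors visited for a list of condition sets."""
    path = [start]
    for conds in steps:
        nxt = next_behavior(path[-1], conds)
        if nxt is None:
            break
        path.append(nxt)
    return path


if __name__ == "__main__":
    print(trace("P", [{"u"}, {"w"}, {"x"}, {"v"}]))
```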

SpecCharts : behavioral hierarchy

Hierarchy represented by nested behaviors

Behavior decomposed into sequential or concurrent subbehaviors

behavior MAIN type sequential subbehaviors is
begin
    P : (TOC, true, Q_R);
    Q_R : (TOC, true, S);
    S : ;
    .....
    behavior P .....
    behavior Q_R type concurrent subbehavior is
    begin
        Q : (TOC, true, halt);
        R : (TOC, true, halt);
        behavior Q .....
        behavior R .....
    end Q_R;

    behavior S .....
end MAIN;

[Figure: P, then a fork into concurrent Q and R, then a join into S.]

SpecCharts : exceptions

Exceptions represented by TI (transition immediately) arcs

behavior MAIN type sequential subbehaviors is
begin
    P : (TI, e, Q);
    Q : ;
    behavior P
        behavior P1
        .......
        behavior P2
        .......
    behavior Q
    ......
end MAIN;

[Figure: behavior P (containing P1 and P2) with a TI arc labeled e to Q.]

Summary

[Table: embedded system features by language.
Rows: VHDL, Verilog, Esterel, SDL, CSP, Statecharts, SpecCharts.
Columns: State Transitions, Behavioral Hierarchy, Concurrency, Program Constructs, Exceptions, Behavioral Completion.
Each cell marks the feature as fully supported, partially supported, or not supported; the cell markers were graphical and are not reproduced here.]

Specification example

An executable specification language enables:

Early verification
Precision
Automation
Documentation

A good language/model match reduces:


Capture time
Comprehension time
Functional errors

Outline

Capture an example’s model in a particular language


PSM model in the SpecCharts language

Point out the benefits of a good language/model match

Highlight experiments that demonstrate those benefits

Answering machine controller’s environment

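[Figure: the controller and its environment. The controller connects to an announcement unit, a tape unit, and line circuitry attached to the phone line.
Signal labels: ann_play, ann_rec, ann_done, tape_play, tape_fwd, tape_rew, tape_rec, tape_cnt, hangup, offhook, beep, tone, ring.
Panel labels: power, on/off, tollsaver, messages light, mic, rec ann, hear ann, memo, play msgs, stop, rew, play, fwd.]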

Highest-level view of the controller

[Figure: the Controller behavior contains two subbehaviors, SystemOff and SystemOn; the arcs between them are labeled power='1' and power='0'. The controller's environment is shown as on the previous slide.]

The SystemOn behavior

System usually responds to the line

Pressing any machine button gets immediate response

[Figure: SystemOn contains RespondToLine and RespondToMachineButton, with an arc labeled rising(any_button_pushed).]


The RespondToMachineButton behavior

behavior RespondToMachineButton type code is
begin
   if (play='1') then
      HandlePlay;
   elsif (fwd='1') then
      HandleFwd;
   elsif (rew='1') then
      HandleRew;
   elsif (memo='1') then
      HandleMemo;
   elsif (stop='1') then
      HandleStop;
   elsif (hear_ann='1') then
      HandleHearAnn;
   elsif (rec_ann='1') then
      HandleRecAnn;
   elsif (play_msgs='1') then
      HandlePlayMsgs;
   end if;
end;

(a)

[Figure (b): the same behavior drawn as state transitions. RespondToMachineButton contains HandlePlay, HandleFwd, HandleRew, HandleMemo, HandleStop, HandleHearAnn, HandleRecAnn, and HandlePlayMsgs, with arcs labeled play='1', fwd='1', rew='1', memo='1', stop='1', hear_ann='1', rec_ann='1', and play_msgs='1' respectively.]

The RespondToLine behavior

Monitors line for rings

Answers line

Responds to exceptions
   Hangup
   Machine turned off

[Figure: RespondToLine contains Monitor and Answer, with exception arcs labeled rising(hangup) and falling(machine_on).]

The Monitor behavior

Counts for required rings

Requirements may change

Monitor

signal rings_to_wait : integer range 1 to 20 := 4;
function DetermineRingsToWait return integer is begin
   if ((num_msgs > 0) and (tollsaver='1') and (machine_on='1')) then
      return(2);
   elsif (machine_on='1') then
      return(4);
   else
      return(15);
   end if;
end;

MaintainRingsToWait

loop
   rings_to_wait <= DetermineRingsToWait;
   wait on tollsaver, machine_on;
end loop;

CountRings

variable i : integer range 0 to 20;

i := 0;
while (i < rings_to_wait) loop
   wait on rings_to_wait, ring;
   if (rising(ring)) then
      i := i + 1;
   end if;
end loop;

The Answer behavior

[Figure (a): Answer contains PlayAnnouncement, RecordMsg, Hangup, and RemoteOperation, with arcs labeled rising(hangup) and button="0001".]

behavior PlayAnnouncement type code is
begin
   ann_play <= '1';
   wait until ann_done = '1';
   ann_play <= '0';
end;

(b)

behavior RecordMsg type code is
begin
   ProduceBeep(1 s);
   if (hangup = '0') then
      tape_rec <= '1';
      wait until hangup='1' for 100 s;
      ProduceBeep(1 s);
      num_msgs <= num_msgs + 1;
      tape_rec <= '0';
   end if;
end;

(c)

The RemoteOperation behavior

Owner can operate machine remotely by phone


Owner identifies himself by a four-button ID

[Figure (a): RemoteOperation contains CheckCode and RespondToCmds, with arcs labeled code_ok='1', code_ok='0', and hangup='1'.]

behavior CheckUserCode type code is
begin
   code_ok <= true;
   for (i in 1 to 4) loop
      wait until tone /= "1111" and tone'event;
      if (tone /= user_code(i)) then
         code_ok <= false;
      end if;
   end loop;
end;

(b)

The answering machine controller specification

[Figure: the complete behavior hierarchy, with arc labels in parentheses]

Controller: SystemOff, SystemOn (power='1', power='0')
   SystemOn: InitializeSystem, RespondToLine, RespondToMachineButton (rising(any_button_pushed))
      RespondToLine: Monitor, Answer (rising(hangup), falling(machine_on))
         Answer: PlayAnnouncement, RecordMsg, Hangup, RemoteOperation (rising(hangup), tone="0001")
            RemoteOperation: CheckUserCode, RespondToCmds (hangup='1', code_ok, not code_ok)
               RespondToCmds: HearMsgsCmds, MiscCmds, ResetTape (tone="0010", hangup='1', other)

Executable specification use

Precision
Readability/precision compete in a natural language
Executable specification encourages precision
Designer asks questions, specification answers them

Language/model match (SpecCharts/PSM):


Hierarchy
State-transitions
Programming constructs
Concurrency
Exceptions
Completion
Equivalence of states and programs

Specification capture experiment

                                                    VHDL   SpecCharts
Average specification time (minutes)                40     16
Number of modelers                                  3      3
Number of incorrect specifications, first time      2      0
Number of incorrect specifications, second time     1      0

VHDL modelers required 2.5 times longer

Two VHDL specifications possessed control errors

SpecCharts were effective for state-transitions and exceptions

Comparison of SpecCharts, VHDL and Statecharts

Answering machine example


Specification      Conceptual                VHDL          VHDL
attributes         model       SpecCharts    (hierarch.)   (flat)   Statecharts
Program-states     42          42            42            32       80
Arcs               40          40            40            152      135
Control signals    --          0             84            1        0
Lines/leaf         --          7             27            29       --
Lines              --          446           1592          963      --
Words              --          1733          6740          8088     --

No sequential
program constructs X

No hierarchy X X
Shortcomings

No exception
constructs X X

No hierarchical
events X

No state−transition
constructs X X

Design quality experiment

Design attribute        Designed from English   Designed from SpecCharts
Control transistors     3130                    2630
Datapath transistors    2277                    2251
Total transistors       5407                    4881
Total pins              38                      38

No loss in design quality with an executable language

Summary

Executable languages encourage precision and automation

The language should support an appropriate model


Makes specification easy

Strongly parallels programming languages


Structured vs. assembly languages
Object-oriented model and C++

Translation

Model often unsupported by a standard language

(1) Use a standard language anyway


Many tools available
But, captures model unnaturally

(2) Use an application-specific language


Captures model naturally
But, not many tools available

(3) Use a front-end language


Captures model naturally
Many tools available after translating to a standard

Outline

Front-end language in VHDL environment

State machine translation

Fork-join translation

Exception translation

A front-end language in a VHDL environment

[Figure: VHDL and SpecCharts inputs. SpecCharts passes through a translator that produces VHDL. The VHDL then enters the VHDL environment (synthesis tool, simulator, debugger, test-generator), which produces the tool output.]

State machine translation

[Figure (a): state machine with start state P. P moves to Q when u holds and to R when not u; Q returns to P; R moves to Q.]

type state_type is (P, Q, R);
variable state : state_type := P;
loop
   case (state) is
      when P =>
         <actions for P>
         if (u) then
            state := Q;
         elsif (not u) then
            state := R;
         end if;
      when Q =>
         <actions for Q>
         state := P;
      when R =>
         <actions for R>
         state := Q;
   end case;
end loop;

(b)

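The loop-and-case translation of the P/Q/R state machine can be mirrored in an ordinary programming language. The following is a minimal Python sketch, not the translator's actual output: the function name `run_state_machine`, the trace it returns, and the idea of feeding the condition u as a finite list are all assumptions made for illustration.

```python
from enum import Enum


class State(Enum):
    P = "P"
    Q = "Q"
    R = "R"


def run_state_machine(u_values):
    """Run the P/Q/R machine; one value of u is consumed per visit to P.

    Returns the list of visited state names. Stops when P needs a value
    of u and none is left (an assumption made so the sketch terminates).
    """
    state = State.P                      # variable state : state_type := P
    trace = [state.name]
    inputs = iter(u_values)
    while True:                          # loop
        if state is State.P:             # when P =>
            try:
                u = next(inputs)
            except StopIteration:
                break
            state = State.Q if u else State.R
        elif state is State.Q:           # when Q =>
            state = State.P
        else:                            # when R =>
            state = State.Q
        trace.append(state.name)
    return trace


trace = run_state_machine([True, False])
print(trace)
```

The state variable plays the role of the VHDL `state` variable, and each branch of the `if` chain corresponds to one `when` clause.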
Fork-join translation

Main: process
begin
   statement1;
   parallel
   {
      P1;
      P2;
   }
   statement2;
   ...

(a)

signal fork, P1_done, P2_done : boolean;

Main : process
begin
   statement1;
   fork <= true;
   wait until P1_done and P2_done;
   statement2;
   ...

P1_process : process
begin
   wait until fork;
   P1;
   P1_done <= true;
   wait until not fork;
   P1_done <= false;
end;

(b)

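The handshake used in the fork-join translation (a fork signal that releases the parallel processes, and one done signal per process that the main process waits on) can be imitated with threads. This is a hedged Python sketch using `threading.Event` in place of VHDL signals; the function name `fork_join` and its callable arguments are hypothetical, and the reset of the signals for repeated use is left out.

```python
import threading


def fork_join(statement1, p1, p2, statement2):
    """Run statement1, then p1 and p2 in parallel, then statement2."""
    fork = threading.Event()        # plays the role of signal fork
    p1_done = threading.Event()     # plays the role of signal P1_done
    p2_done = threading.Event()     # plays the role of signal P2_done

    def child(body, done):
        fork.wait()                 # wait until fork;
        body()                      # P1; (or P2;)
        done.set()                  # P1_done <= true;

    threads = [
        threading.Thread(target=child, args=(p1, p1_done)),
        threading.Thread(target=child, args=(p2, p2_done)),
    ]
    for t in threads:
        t.start()

    statement1()
    fork.set()                      # fork <= true;
    p1_done.wait()                  # wait until P1_done and P2_done;
    p2_done.wait()
    statement2()
    for t in threads:
        t.join()


log = []
fork_join(lambda: log.append("s1"), lambda: log.append("p1"),
          lambda: log.append("p2"), lambda: log.append("s2"))
print(log)
```

The order of "p1" and "p2" in the log may vary between runs; only their position between "s1" and "s2" is fixed by the handshake.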
Exception translation

event e : T --> S;
T:
   statement1;
   statement2;
   statement3;
S:
   statement4;
   statement5;

(a)

-- T
   statement1;
   if (e)
      goto S_start;
   statement2;
   if (e)
      goto S_start;
   statement3;
S_start: -- S
   statement4;
   statement5;

(b)

-- T
T_loop : loop
   statement1;
   if (e)
      exit T_loop;
   statement2;
   if (e)
      exit T_loop;
   statement3;
   exit T_loop;
end loop;
-- S
   statement4;
   statement5;

(c)

Summary

The perfect standard language may never exist

No standard language supports all models

Using a front-end language solves the problem


Natural capture
Large base of tools and expertise

Translators are simple


Maps characteristics to existing constructs
Generates well-structured and consistent output

System partitioning

System functionality is implemented on system components


ASICs, processors, memories, buses

Two design tasks:


Allocate system components or ASIC constraints
Partition functionality among components

Constraints
Cost, performance, size, power

Partitioning is a central system design task

Outline

Structural vs. functional partitioning

Natural vs. executable language specifications

Basic partitioning issues and algorithms

Functional partitioning techniques for hardware

Hardware/software partitioning

Functional partitioning techniques for software

Exploring tradeoffs with functional partitioning

Structural vs. functional partitioning

Structural: Implement structure, then partition

Functional: Partition function, then implement


Enables better size/performance tradeoffs
Uses fewer objects, better for algorithms/humans
Permits hardware/software solutions
But, it’s harder than graph partitioning

Natural vs. executable language specifications

Alternative methods for specifying functionality

Natural languages common in practice

Executable languages becoming popular


Automated estimation/partitioning explores solutions
Early verification reduces costly late changes
Precision eases integration

Basic partitioning issues

Specification abstraction−level

Granularity

Metrics and estimations


Partitioning algorithms
Objective and closeness functions

System−component allocation

Output

Basic partitioning issues (cont.)

Specification-abstraction level: input definition

Just indicating the language is insufficient
Abstraction-level indicates amount of design already done
e.g. task DFG, tasks, CDFG, FSMD

Granularity: specification size in each object


Fine granularity yields more possible designs
Coarse granularity better for computation, designer interaction
e.g. tasks, procedures, statement blocks, statements

Component allocation: types and numbers


e.g. ASICs, processors, memories, buses

Output: format and uses


e.g. new specification, hints to synthesis tool

Basic partitioning issues (cont.)

Metrics and estimations: "good" partition attributes


e.g. cost, speed, power, size, pins, testability, reliability
Estimates derived from quick, rough implementation
Speed and accuracy are competing goals of estimation

Objective and closeness functions


Combines multiple metric values
Closeness used for grouping before complete partition
Weighted sum common
e.g. $Objfct = k_1 \cdot area + k_2 \cdot delay + k_3 \cdot power$

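A weighted-sum objective function combines several metric values into one number, so that two candidate partitions can be compared directly. The sketch below is illustrative only: the function name `objfct`, the metric names, the weights, and the two candidate partitions are hypothetical values, not data from the slides.

```python
def objfct(metrics, weights):
    """Weighted sum of metric values; lower is better in this sketch."""
    return sum(weights[name] * metrics[name] for name in weights)


# Hypothetical weights k1, k2, k3 and hypothetical metric estimates.
weights = {"area": 1.0, "delay": 2.0, "power": 10.0}
partition_a = {"area": 100, "delay": 40, "power": 5}
partition_b = {"area": 120, "delay": 25, "power": 5}

cost_a = objfct(partition_a, weights)
cost_b = objfct(partition_b, weights)
better = "A" if cost_a < cost_b else "B"
print(cost_a, cost_b, better)
```

Changing the weights changes which partition wins, which is why the choice of weights is part of the design problem rather than a detail.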
Basic partitioning issues (cont.)

Algorithms: control strategies seeking best partition
Constructive creates partition
Iterative improves partition
Key is to escape local minimum

[Figure: cost versus number of moves, with points A and B marked on the curve]

Typical partitioning-system configuration

[Figure: a user interface on top of a partitioning system. Input is read into a model; algorithms, estimators, and the objective function operate on the model; design feedback returns to the user; output is produced from the model.]

Basic partitioning algorithms

Clustering and multi-stage clustering [Joh67, LT91]

Group migration (a.k.a. min-cut or Kernighan/Lin) [KL70, FM82]

Ratio cut [KC91]

Simulated annealing [KGV83]

Genetic evolution

Integer linear programming

Hierarchical clustering

Constructive algorithm using closeness metrics

Overview
Groups closest objects
Recomputes closenesses
Repeats until termination condition met

Cluster tree maintains history of merges


Cutline across the tree defines a partition

Hierarchical clustering algorithm

/* Initialize each object as a group */
for each o_i loop
   p_i = o_i
   P = P ∪ p_i
end loop

/* Compute closenesses between objects */
for each p_i loop
   for each p_j loop
      c_i,j = ComputeCloseness(p_i, p_j)
   end loop
end loop

/* Merge closest objects and recompute closenesses */
while not Terminate(P) loop
   p_i, p_j = FindClosestObjects(P, C)
   P = P − p_i − p_j ∪ p_ij
   for each p_k loop
      c_ij,k = ComputeCloseness(p_ij, p_k)
   end loop
end loop

return P

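The clustering loop (group the closest pair, recompute closenesses, repeat until a termination condition holds) can be sketched in a few lines. The Python below is a minimal illustration under stated assumptions: termination is "a target number of groups is reached", the closeness of a merged group to another group is the average of the two old closenesses, and the closeness values for o1 to o4 are hypothetical (the o1/o4, o2/o4, and o3/o4 values in particular are assumed).

```python
def pair_key(a, b):
    """Unordered key for the closeness between two groups."""
    return frozenset([a, b])


def hierarchical_clustering(objects, closeness, num_groups):
    """Merge the closest groups until only num_groups remain."""
    groups = [frozenset([o]) for o in objects]
    close = {}
    for i, a in enumerate(objects):
        for b in objects[i + 1:]:
            close[pair_key(frozenset([a]), frozenset([b]))] = closeness[pair_key(a, b)]

    merges = []                                   # history, i.e. the cluster tree
    while len(groups) > num_groups:               # Terminate(P)
        pair = max(close, key=close.get)          # FindClosestObjects
        gi, gj = sorted(pair, key=sorted)
        merged = gi | gj
        merges.append((sorted(gi), sorted(gj)))
        groups = [g for g in groups if g not in (gi, gj)]
        for gk in groups:                         # recompute closenesses
            avg = (close.pop(pair_key(gi, gk)) + close.pop(pair_key(gj, gk))) / 2
            close[pair_key(merged, gk)] = avg
        del close[pair]
        groups.append(merged)
    return groups, merges


closeness = {
    pair_key("o1", "o2"): 30, pair_key("o1", "o3"): 25, pair_key("o2", "o3"): 15,
    pair_key("o1", "o4"): 10, pair_key("o2", "o4"): 10, pair_key("o3", "o4"): 10,
}
groups, merges = hierarchical_clustering(["o1", "o2", "o3", "o4"], closeness, 2)
print(sorted(sorted(g) for g in groups), merges)
```

Stopping at a different number of groups corresponds to drawing the cutline at a different height of the cluster tree recorded in `merges`.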
Hierarchical clustering example

[Figure: hierarchical clustering of four objects o1 to o4 in four steps (a) to (d), each showing the closeness graph above the cluster tree. The closest pair is merged at every step, and the closeness between a merged group and another object is the average of the member closenesses, e.g. Avg(15,25) = 20 and Avg(10,10) = 10.]

Simulated annealing

Iterative algorithm modeled after physical annealing process

Overview
Starts with initial partition and temperature
Slowly decreases temperature
For each temperature, generates random moves
Accepts any move that improves cost
Accepts some bad moves, less likely at low temperatures

Results and complexity depend on temperature decrease rate

Simulated annealing algorithm

temp = initial temperature
cost = Objfct(P)
while not Frozen loop
   while not Equilibrium loop
      P_tentative = Move(P)
      cost_tentative = Objfct(P_tentative)
      Δcost = cost_tentative − cost
      if (Accept(Δcost, temp) > Random(0, 1)) then
         P = P_tentative
         cost = cost_tentative
      end if
   end loop
   temp = DecreaseTemp(temp)
end loop

where: $Accept(\Delta cost, temp) = \min(1, e^{-\Delta cost / temp})$

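The annealing loop can be tried on a toy two-way partitioning problem. The following Python sketch makes several assumptions that are not part of the slides: "Frozen" means the temperature dropped below a threshold, "Equilibrium" means a fixed number of moves per temperature, the temperature is decreased geometrically, the move flips one random object, and the best partition seen is remembered. The edge list and the balance penalty are hypothetical.

```python
import math
import random


def accept(delta_cost, temp):
    """Acceptance probability min(1, e^(-delta_cost / temp))."""
    if delta_cost <= 0:
        return 1.0
    return math.exp(-delta_cost / temp)


def flip_one(partition, rng):
    """Hypothetical move: reassign one random object to the other group."""
    i = rng.randrange(len(partition))
    return partition[:i] + (1 - partition[i],) + partition[i + 1:]


def simulated_annealing(partition, objfct, move, temp=10.0, cooling=0.9,
                        frozen=0.01, moves_per_temp=30, seed=0):
    rng = random.Random(seed)
    cost = objfct(partition)
    best, best_cost = partition, cost
    while temp > frozen:                       # not Frozen
        for _ in range(moves_per_temp):        # not Equilibrium
            tentative = move(partition, rng)
            tentative_cost = objfct(tentative)
            if accept(tentative_cost - cost, temp) > rng.random():
                partition, cost = tentative, tentative_cost
                if cost < best_cost:
                    best, best_cost = partition, cost
        temp *= cooling                        # DecreaseTemp
    return best, best_cost


# Hypothetical graph: (object, object, weight); cost = cut weight + imbalance penalty.
EDGES = [(0, 1, 5), (1, 2, 5), (3, 4, 5), (4, 5, 5), (2, 3, 1)]


def example_cost(p):
    cut = sum(w for a, b, w in EDGES if p[a] != p[b])
    return cut + 4 * abs(2 * sum(p) - len(p))


start = (0, 1, 0, 1, 0, 1)
best, best_cost = simulated_annealing(start, example_cost, flip_one)
print(best, best_cost)
```

A slower cooling rate or more moves per temperature explores more partitions at the price of longer run time, which is the dependence on the temperature decrease rate noted above.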
Functional partitioning for hardware: BUD

Goal: incorporate area/time into synthesis [MK90]

Clusters CDFG operations into datapath modules

Closeness metrics:
Interconnecting wires
Concurrency
Shared hardware

Each clustering corresponds to an allocation/scheduling

Selects clustering with best area/time

BUD example

(bit-widths = 4)

x := a + b;
if (a = b)
   c := ((x − y) < z);

[Figure: (a) the behavior above, (b) its CDFG from start to finish with the operations +, =, −, and <, (c) the closeness values computed between those operations]

BUD example (cont.)

[Figure (a): cluster trees over the operations +, −, <, and =, built by repeatedly merging the closest operations and averaging closeness values]

Clusters        Chip area A   Expected cycle time T   Objfct = A x T
+ − = <         17.5          36                      630
+ −, = <        15.8          26                      411
+ −, =, <       13.8          26                      359 (best)
+, −, =, <      16.4          26                      426

(b)

[Figure (c): the chip for the best clustering, with a controller and 3 clusters: + −, <, and =]

Functional partitioning for hardware: Aparty

Extends BUD clustering to multiple stages [LT91]


Different closeness metrics for each stage

Closeness metrics:
Control transfer reduction
Data transfer reduction
Hardware sharing

Aparty example

[Figure: multi-stage clustering of objects o1 to o4. (a) closeness values between the four objects and the initial cluster tree, (b) o1 and o2 merged into o12, (c) closeness values between o12, o3, and o4 for the next stage]

Hardware/software partitioning

Combined hardware/software systems are common

Software is cheap, modifiable, and quick to design

Hardware is fast

Special algorithms are needed to favor software

Proposed algorithms
Greedy [GD92]
Hill climbing [EHB94]
Binary-constraint search with hill climbing [VGG93]

Functional partitioning for systems: Vulcan, Cosyma

Vulcan I [GD90]
Partitions CDFG operations among hardware only
Group migration and simulated annealing algorithms

Vulcan II [GD93]
Partitions operations among hardware/software
Architecture: processor, hardware, memory, bus
All communication through memory
Uses greedy algorithm, extracts behaviors from hardware

Cosyma [EHB94]
Partitions statement blocks among hardware/software
Architecture: processor, hardware, memory, bus
Simulated annealing, extracts behaviors from software

Functional partitioning for systems: SpecSyn

Solves three partitioning problems


Behaviors to processors/ASICs
Variables to memories
Communication channels to buses

Uses fast incremental-update estimators

Covers both hardware and hardware/software partitioning [GVN94, VG92]

Exploring tradeoffs with functional partitioning

Each line represents a different vendor's chip set

Each point represents an allocation and partition

Many designs quickly examined

[Figure: performance (microseconds, 200 to 1200) versus cost (dollars, 0 to 140) for chipset1, chipset2, and chipset3, with design points A, B, C, and D marked]

Summary

Partitioning heavily influences design quality

Functional partitioning is necessary

Executable specification enables:


Automation
Exploration
Documentation

Variety of algorithms exist

Variety of techniques exist for different applications

Future directions

Metrics from real design to guide partitioning

Comparison of functional partitioning algorithms

Impact of metric selections and orderings

Impact of granularity on partition quality

Exploitation of regularity in partitioning

Estimation

Estimates allow
Evaluation of design quality
Design space exploration

Design model
Represents degree of design detail computed
Simple vs. complex models

Issues for estimation


Accuracy
Speed
Fidelity

Outline

Accuracy versus speed

Fidelity

Quality metrics
Performance metrics
Hardware and software cost metrics

Accuracy vs. Speed

Accuracy: difference between estimated and actual value

$A = 1 - \frac{|E(D) - M(D)|}{M(D)}$, where $E(D)$ is the estimated and $M(D)$ the measured value of the metric for design $D$

Speed: computation time for obtaining estimate

[Figure: estimation error and computation time plotted against the level of detail in the design model, from a simple model to the actual design]

Fidelity

Estimates must predict quality metrics for different design alternatives

Fidelity: % of correct predictions for pairs of design implementations

Higher fidelity => correct decisions based on estimates

[Figure: estimated (E) and measured (M) metric values for three design points A, B, and C]

(A, B): E(A) > E(B), M(A) < M(B)
(B, C): E(B) < E(C), M(B) > M(C)
(A, C): E(A) < E(C), M(A) < M(C)

Only the pair (A, C) is ordered the same way by the estimates and the measurements, so Fidelity = 33 %

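Fidelity counts how many pairs of designs are ranked the same way by the estimates as by the measured values. The Python sketch below shows that count; the function name `fidelity` and the numeric values for A, B, and C are hypothetical, chosen only so that they reproduce the three orderings listed above.

```python
from itertools import combinations


def sign(x):
    return (x > 0) - (x < 0)


def fidelity(estimated, measured):
    """Percentage of design pairs whose order is predicted correctly."""
    pairs = list(combinations(sorted(estimated), 2))
    correct = sum(
        1 for a, b in pairs
        if sign(estimated[a] - estimated[b]) == sign(measured[a] - measured[b])
    )
    return 100.0 * correct / len(pairs)


# Hypothetical values: E(A) > E(B), E(B) < E(C), E(A) < E(C)
#                      M(A) < M(B), M(B) > M(C), M(A) < M(C)
E = {"A": 3, "B": 2, "C": 4}
M = {"A": 1, "B": 3, "C": 2}
print(round(fidelity(E, M)))
```

Note that an estimator can have poor accuracy and still reach 100% fidelity, as long as it orders every pair of designs correctly.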
Quality metrics

Performance Metrics
Clock cycle, control steps, execution time, communication rates

Cost Metrics
Hardware: manufacturing cost (area), packaging cost (pins)
Software: program size, data memory size

Other metrics
Power, testability, design time, time to market

Hardware design model

[Figure: hardware design model. The control unit contains the state register, control logic, next-state logic, and control register. The datapath contains registers and register files (R1, R2, RF), muxes, functional units (FU), a status register, and the data and address registers (DR, AR) connected to memory. Control lines run from the control unit to the datapath and status bits return to the control unit; paths p1 to p3 and nets n1 to n6 are marked.]

Clock cycle estimation

Clock cycle determines:


Resources, execution time

Determining clock cycle


Designer specified [PK89, MK90]
Maximum delay of any functional unit [PPM86, JMP88]
Clock utilization [NG92]

[Figure: the same dataflow graph (inputs i1 to i6, outputs o1 and o2, multiplier delay 150 ns, adder delay 80 ns) scheduled with three different clock cycles]

                 Schedule 1   Schedule 2   Schedule 3
Clock cycle      380 ns       150 ns       80 ns
Exec. time       380 ns       600 ns       400 ns
Resources        2 x, 4 +     1 x, 1 +     1 x, 1 +

Clock slack and utilization

Slack: portion of clock cycle for which FU is idle

$slack(clk, t_i) = \left( \lceil delay(t_i) / clk \rceil \times clk \right) - delay(t_i)$

Average slack: FU slack averaged over all operations

$ave\_slack(clk) = \frac{\sum_{i=1}^{T} occur(t_i) \times slack(clk, t_i)}{\sum_{i=1}^{T} occur(t_i)}$

Clock utilization: % of clock cycle utilized for computations

$utilization(clk) = 1 - \frac{ave\_slack(clk)}{clk}$

Clock utilization

[Figure: functional unit delay and slack for each operation type against multiples of the clock (1 x CLK, 2 x CLK, 3 x CLK) on a time axis in ns, for Clock = 65 ns; occur(x) = 6, occur(−) = 2, occur(+) = 2]

ave_slack(65 ns) = (6 x 32 + 2 x 9 + 2 x 17) / (6 + 2 + 2) = 24.4 ns

utilization(65 ns) = 1 − (24.4 / 65.0) = 62%

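The 65 ns example can be checked with a few lines of code. In this Python sketch the functional-unit delays (163 ns for the multiplier, 56 ns for the subtractor, 48 ns for the adder) are assumptions inferred from the slack values 32, 9, and 17 used in the example; the function names are chosen for illustration.

```python
import math


def slack(clk, delay):
    """Idle part of the last clock cycle used by a functional unit."""
    return math.ceil(delay / clk) * clk - delay


def ave_slack(clk, delays, occur):
    """Slack averaged over all operation occurrences."""
    total = sum(occur[t] * slack(clk, delays[t]) for t in occur)
    return total / sum(occur.values())


def utilization(clk, delays, occur):
    """Fraction of the clock cycle used for computation."""
    return 1 - ave_slack(clk, delays, occur) / clk


# Assumed delays in ns (inferred, not stated on the slide) and occurrence counts.
delays = {"mul": 163, "sub": 56, "add": 48}
occur = {"mul": 6, "sub": 2, "add": 2}

print(ave_slack(65, delays, occur), round(100 * utilization(65, delays, occur)))
```

With these inputs the multiplier needs three 65 ns cycles (195 ns), leaving 32 ns of slack, which matches the 6 x 32 term in the example.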
Slack minimization algorithm

Clock Slack Minimization [NG92]


Compute range: clkmax, clkmin
Compute occurrences: occur(t_i)
max_utilization = 0

/* Examine each clock cycle in range */
for clkmin <= clk <= clkmax loop
   for all operation types t_i loop
      Compute slack(clk, t_i)
   end loop
   Compute average slack: ave_slack(clk)
   Compute utilization: utilization(clk)

   /* If highest utilization */
   if utilization(clk) > max_utilization then
      max_utilization = utilization(clk)
      max_utilization_clk = clk
   end if
end loop

return max_utilization_clk

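The search over candidate clock cycles is a plain maximization of utilization. The sketch below is a hedged Python rendering: candidate clocks are whole nanoseconds, the range 20 to 163 ns is chosen arbitrarily for the example, and the delays (163, 56, 48 ns) and occurrence counts (6, 2, 2) are assumed values for a multiplier, subtractor, and adder.

```python
import math


def slack(clk, delay):
    return math.ceil(delay / clk) * clk - delay


def utilization(clk, delays, occur):
    total_slack = sum(occur[t] * slack(clk, delays[t]) for t in occur)
    ave_slack = total_slack / sum(occur.values())
    return 1 - ave_slack / clk


def best_clock(delays, occur, clk_min, clk_max):
    """Return the clock cycle in [clk_min, clk_max] with highest utilization."""
    max_utilization = 0.0
    max_utilization_clk = None
    for clk in range(clk_min, clk_max + 1):     # examine each clock in range
        u = utilization(clk, delays, occur)
        if u > max_utilization:                 # highest utilization so far
            max_utilization = u
            max_utilization_clk = clk
    return max_utilization_clk, max_utilization


delays = {"mul": 163, "sub": 56, "add": 48}     # assumed delays in ns
occur = {"mul": 6, "sub": 2, "add": 2}          # assumed occurrence counts
clk, util = best_clock(delays, occur, 20, 163)
print(clk, round(100 * util))
```

The lower bound of the range matters: very small clocks waste little slack per cycle but would need many cycles per operation, so the range has to reflect what the design can tolerate.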
Execution time vs. clock utilization
Second order differential equation example

Clock with highest utilization results in better execution times

[Figure: two plots. Left: clock cycle (ns) versus utilization (%). Right: execution time (ns) versus utilization (%). The highest utilization, 92%, is reached at a 56 ns clock cycle and corresponds to an execution time of 560 ns.]

Control steps estimation

Operations in the specification are assigned to control steps

Number of control steps determines:


Execution time of design
Complexity of control unit

Scheduling
Granularity is operations in a dataflow graph
Computationally expensive

Operator-use method

Granularity is statements in specification

Faster than scheduling, average error 13%

Behavior:
   u1 := u x dx;
   u2 := 5 x w;
   u3 := 3 x y;
   y1 := i x dx;
   w := w + dx;
   u4 := u1 x u2;
   u5 := dx x u3;
   y := y + y1;
   u6 := u − u4;
   u := u6 − u5;

Resources:
   t_i     num(t_i)   clocks(t_i)
   add     1          1
   mult    2          4
   sub     1          1

Macro-nodes and their maximum control steps:
   n1: u1 := u x dx, u2 := 5 x w, u3 := 3 x y, y1 := i x dx, w := w + dx
       add: (1/1)*1 = 1, mult: (4/2)*4 = 8, max(1, 8) = 8
   n2: u4 := u1 x u2, u5 := dx x u3, y := y + y1
       add: (1/1)*1 = 1, mult: (2/2)*4 = 4, max(1, 4) = 4
   n3: u6 := u − u4
       sub: (1/1)*1 = 1, max(1) = 1
   n4: u := u6 − u5
       sub: (1/1)*1 = 1, max(1) = 1

Estimated total = 14 control steps

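The operator-use estimate for the example can be reproduced by taking, for each macro-node, the largest value of (occurrences / available units) x clocks over the operation types it contains, and summing over the macro-nodes. The Python sketch below uses the numbers from the example; rounding the division up with `ceil` is an assumption for cases where the occurrences do not divide evenly, and the function name is chosen for illustration.

```python
import math


def macro_node_csteps(occurrences, num, clocks):
    """Control steps for one macro-node: the slowest operation type dominates."""
    return max(
        math.ceil(occurrences[t] / num[t]) * clocks[t] for t in occurrences
    )


num = {"add": 1, "mult": 2, "sub": 1}       # available units per type
clocks = {"add": 1, "mult": 4, "sub": 1}    # clocks per operation

# Operation counts per macro-node n1..n4, taken from the example.
macro_nodes = [
    {"add": 1, "mult": 4},   # n1
    {"add": 1, "mult": 2},   # n2
    {"sub": 1},              # n3
    {"sub": 1},              # n4
]

per_node = [macro_node_csteps(n, num, clocks) for n in macro_nodes]
total = sum(per_node)
print(per_node, total)
```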
Branching in behaviors

Control steps may be shared across exclusive branches

sharing schedule: fewer states, status register
non-sharing schedule: more states, no status registers

[Figure: (a) control flow with basic blocks B1 to B4 and operations o1 to o8, where B2 and B3 are exclusive branches, (b) sharing schedule using states s1 to s6, with o3/o6 and o4/o7 sharing states, (c) non-sharing schedule using states s1 to s8]

Execution time estimation

Average start to finish time of behavior

Straight-line code behaviors

$exectime(B) = csteps(B) \times clk$

Behavior with branching
Estimate execution time for each basic block
Create control flow graph from basic blocks
Determine branching probabilities
Formulate equations for node frequencies
Solve set of equations

$exectime(B) = \sum_{b_i \in B} exectime(b_i) \times freq(b_i)$

Probability-based flow analysis

A := A + 1;
B := B + 1;
for I in 1 to 10 loop
   C := C − A;
   if (D > A) then
      D := D + 2;
   else
      D := D + 3;
   end if
   E := D * 2;
end loop;
B := B * A;
C := 3;

[Figure: the basic blocks and the control flow graph of the behavior above]

Basic blocks (nodes v1 to v6):
   B1: A := A + 1; B := B + 1;
   B2: C := C − A;
   B3: D := D + 2;        (D > A)
   B4: D := D + 3;        (D <= A)
   B5: E := D * 2;
   B6: B := B * A; C := 3;

Edges and branching probabilities:
   e12: 1.0
   e23: 0.5, e24: 0.5
   e35: 1.0, e45: 1.0
   e52: 0.9 (I <= 10)
   e56: 0.1 (I > 10)

Probability-based flow analysis

Flow equations:

   freq(v1) = 1.0
   freq(v2) = 1.0 x freq(v1) + 0.9 x freq(v5)
   freq(v3) = 0.5 x freq(v2)
   freq(v4) = 0.5 x freq(v2)
   freq(v5) = 1.0 x freq(v3) + 1.0 x freq(v4)
   freq(v6) = 0.1 x freq(v5)

Node execution frequencies:

   freq(v1) = 1.0     freq(v2) = 10.0
   freq(v3) = 5.0     freq(v4) = 5.0
   freq(v5) = 10.0    freq(v6) = 1.0

Can be used to estimate number of accesses to variables, channels or procedures

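The node frequencies follow from solving the flow equations as a linear system: each node's frequency equals the sum of its predecessors' frequencies weighted by the branching probabilities, with the start node fixed at 1. The Python sketch below solves the six-node example by Gauss-Jordan elimination over exact fractions; the function name and the edge-list representation are assumptions made for illustration.

```python
from fractions import Fraction


def node_frequencies(num_nodes, edges, start=1):
    """Solve freq(v) = [v == start] + sum(p * freq(u)) over edges (u, v, p)."""
    n = num_nodes
    # Augmented matrix for (I - P^T) f = e_start.
    a = [[Fraction(0)] * (n + 1) for _ in range(n)]
    for v in range(n):
        a[v][v] = Fraction(1)
    a[start - 1][n] = Fraction(1)
    for u, v, p in edges:
        a[v - 1][u - 1] -= Fraction(str(p))

    for col in range(n):                       # Gauss-Jordan elimination
        pivot = next(r for r in range(col, n) if a[r][col] != 0)
        a[col], a[pivot] = a[pivot], a[col]
        pv = a[col][col]
        a[col] = [x / pv for x in a[col]]
        for r in range(n):
            if r != col and a[r][col] != 0:
                factor = a[r][col]
                a[r] = [x - factor * y for x, y in zip(a[r], a[col])]
    return [a[r][n] for r in range(n)]


# Edges (from, to, probability) of the example control flow graph.
edges = [(1, 2, 1.0), (2, 3, 0.5), (2, 4, 0.5), (3, 5, 1.0),
         (4, 5, 1.0), (5, 2, 0.9), (5, 6, 0.1)]
freq = node_frequencies(6, edges)
print([float(f) for f in freq])
```

Multiplying each frequency by the execution time of its basic block and summing gives an execution-time estimate for the whole behavior.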
Communication rates
[Figure: bits sent over channel C versus time (ns); seven 8-bit messages are sent between 0 and 1000 ns]

Average channel rate
rate of data transfer over lifetime of behavior
$avgrate(C) = \frac{56\ bits}{1000\ ns} = 56\ Mb/s$

Peak channel rate
rate of data transfer of single message
$peakrate(C) = \frac{8\ bits}{100\ ns} = 80\ Mb/s$

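The two channel rates differ only in the time window they divide by: the whole lifetime of the behavior for the average rate, a single message transfer for the peak rate. The Python sketch below recomputes the example (seven 8-bit messages in 1000 ns, 100 ns per message); the function names are chosen for illustration, and bits per nanosecond are converted to Mb/s by multiplying by 1000.

```python
def average_rate_mbps(message_bits, lifetime_ns):
    """Total bits sent over the channel divided by the behavior's lifetime."""
    return sum(message_bits) * 1000 / lifetime_ns


def peak_rate_mbps(bits, transfer_ns):
    """Bits of one message divided by the time that one transfer takes."""
    return bits * 1000 / transfer_ns


messages = [8] * 7                      # seven 8-bit messages
print(average_rate_mbps(messages, 1000), peak_rate_mbps(8, 100))
```

The peak rate bounds the bus bandwidth a channel needs while a message is in flight, whereas the average rate indicates how much of the bus the channel occupies over time.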
Communication rate estimation

Total behavior execution time consists of



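Computation time, $comptime(B)$, obtained from flow-analysis

Communication time, $commtime(B, C) = access(B, C) \times portdelay(C)$

Total bits transferred by the channel, $total\_bits(B, C) = access(B, C) \times bits(C)$

Channel average rate

$avgrate(C) = \frac{total\_bits(B, C)}{comptime(B) + commtime(B, C)}$

Channel peak rate

$peakrate(C) = \frac{bits(C)}{protocol\_delay(C)}$
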
Area estimation

Two tasks:
Determining number and type of components required
Estimating component size for a specific technology (FSMD, gate arrays etc.)

Behavior implemented as a FSMD (finite state machine with datapath)


Datapath components: registers, functional units, multiplexers/buses
Control unit: state register, control logic, next-state logic

We will discuss
Datapath component estimation
Control unit estimation
Layout area for a custom implementation

Clique-partitioning

Commonly used for determining datapath components





Let G = {V, E} be a graph, where V and E are the sets of vertices and edges

Clique is a complete subgraph of G

Clique-partitioning
divides the vertices into a minimal number of cliques
each vertex in exactly one clique

One heuristic: maximum number of common neighbors [CS86]


Two nodes with maximum number of common neighbors are merged
Edges to two nodes replaced by edges to merged node
Process repeated till no more nodes can be merged
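
The common-neighbor heuristic can be sketched directly from the three steps above. In this Python illustration the tie-breaking rule (the first edge in sorted order wins) and the data representation (supernodes as frozensets of original vertices) are assumptions; the five-vertex graph is a small example with edges (v1,v3), (v1,v4), (v2,v3), (v2,v5), (v3,v4), and (v4,v5).

```python
def clique_partition(vertices, edges):
    """Greedy clique partitioning by maximum number of common neighbors."""
    adj = {frozenset([v]): set() for v in vertices}
    for a, b in edges:
        adj[frozenset([a])].add(frozenset([b]))
        adj[frozenset([b])].add(frozenset([a]))

    def key(node):
        return sorted(node)

    while True:
        best = None
        for u in sorted(adj, key=key):
            for w in sorted(adj[u], key=key):
                if key(u) >= key(w):
                    continue                     # visit each edge once
                common = len(adj[u] & adj[w])
                if best is None or common > best[0]:
                    best = (common, u, w)
        if best is None:                         # no edges left to merge
            break
        _, u, w = best
        shared = adj[u] & adj[w]                 # neighbors of the merged node
        for n in adj[u]:
            adj[n].discard(u)
        for n in adj[w]:
            adj[n].discard(w)
        del adj[u], adj[w]
        merged = u | w
        adj[merged] = set(shared)
        for n in shared:
            adj[n].add(merged)
    return sorted(sorted(node) for node in adj)


cliques = clique_partition(
    [1, 2, 3, 4, 5],
    [(1, 3), (1, 4), (2, 3), (2, 5), (3, 4), (4, 5)],
)
print(cliques)
```

Keeping only the shared neighbors after a merge is what guarantees that every supernode remains a clique of the original graph. The heuristic does not guarantee a minimum number of cliques.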

Clique-partitioning

[Figure: clique-partitioning of a five-vertex graph (v1 to v5) with the common-neighbor heuristic, shown in three merge steps]

Step 1, common neighbors per edge: e'1,3 = 1, e'1,4 = 1, e'2,3 = 0, e'2,5 = 0, e'3,4 = 1, e'4,5 = 0; v1 and v3 are merged into s13
Step 2, common neighbors per edge: e'13,4 = 0, e'2,5 = 0, e'4,5 = 0; s13 and v4 are merged into s134
Step 3, common neighbors per edge: e'2,5 = 0; v2 and v5 are merged into s25

Cliques: s134 = {v1, v3, v4}, s25 = {v2, v5}

Storage-unit estimation

Variables not used concurrently maybe mapped same storage-unit

To use clique-partitioning, construct a graph where


Each variable represented by a vertex
Variables with non-overlapping lifetimes have an edge between] their vertices

[Figure: lifetimes of variables v1 to v11 over states s0 to s5, and the resulting graph]

Cliques            Storage unit
{v2, v3}           R1
{v6, v7, v9}       R2
{v4, v5, v8}       R3
{v10, v11}         R4
{v1}               R5
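
The graph construction for storage-unit estimation can be sketched as follows. The lifetimes used here are hypothetical (the intervals in the figure are not recoverable from the text), and representing a lifetime as an inclusive `(first_state, last_state)` pair is an assumption of this sketch.

```python
def compatibility_edges(lifetimes):
    """Return an edge for every pair of variables whose lifetimes do not overlap.

    lifetimes maps a variable name to an inclusive (first_state, last_state) pair.
    """
    names = sorted(lifetimes)
    edges = []
    for i, a in enumerate(names):
        for b in names[i + 1:]:
            start_a, end_a = lifetimes[a]
            start_b, end_b = lifetimes[b]
            # Non-overlapping: one lifetime ends before the other begins.
            if end_a < start_b or end_b < start_a:
                edges.append((a, b))
    return edges


# Hypothetical lifetimes, for illustration only.
edges = compatibility_edges({"a": (1, 2), "b": (3, 4), "c": (2, 3)})
```

Here only `a` and `b` can share a storage unit, because `c` is alive at the same time as each of them. The resulting edge list is the input a clique-partitioning step would consume.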

Functional-unit and interconnect-unit estimation

Clique-partitioning can be applied

For determining the number of FU’s required, construct a graph where


Each operation in behavior represented by a vertex
Edge connects two vertices if
Corresponding operations assigned different control steps
There exists an FU that can implement both operations

For determining the number of interconnect units, construct a graph where


Each connection between two units is represented by a vertex
Edge connects two vertices if corresponding connections not used
in same control step

Computing datapath area

[Figure: bit-sliced datapath. Bit slices run from LSB to MSB, with a routing channel and control lines across the datapath components. Labeled dimensions: L_bit, H_bit, H_cell, H_rt.]

Pin estimation

Number of wires at behavior’s boundary depends on


Global data
Port accessed
Communication channels used
Procedure calls

Global declarations (ports portF and portG are at the boundary):

variable N : integer;
variable X : bit_vector(15 downto 0);
procedure SUM(A, B, OUT) is
begin
    ....
end SUM;

process Main (ch1, ch2)
    out channel ch1;
    in channel ch2;
{
    send (ch1, N);
    portF <= portG + 4;
    ............
    receive (ch2, Result);
}

process Factorial (ch1, ch2)
    in channel ch1;
    out channel ch2;
{
    receive (ch1, M);
    /* compute factorial */
    ................
    send (ch2, result);
}

Software estimation models

[Figure: two software estimation models.
Processor-specific model: the specification is compiled separately to 8086, 68000, and MIPS instructions; a dedicated estimator for each processor uses that processor's instruction timing and size information to produce software metrics.
Generic model: the specification is compiled once to generic instructions; a single estimator uses technology files for the target processors (8086, 68000, and MIPS instruction timing and size information) to produce software metrics.]

Deriving processor technology files

Generic instruction: dmem3 = dmem1 + dmem2

8086 instructions

instruction                       clocks       bytes
mov ax, word ptr[bp+offset1]      (10)         3
add ax, word ptr[bp+offset2]      (9 + EA1)    4
mov word ptr[bp+offset3], ax      (10)         3

68020 instructions

instruction                       clocks       bytes
mov a6@(offset1), d0              (7)          2
add a6@(offset2), d0              (2 + EA2)    2
mov d0, a6@(offset3)              (5)          2

Technology file for 8086

generic instruction               execution time    size
...
dmem3 = dmem1 + dmem2             35 clocks         10 bytes
...

Technology file for 68020

generic instruction               execution time    size
...
dmem3 = dmem1 + dmem2             22 clocks         6 bytes
...
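
The derivation of one technology-file entry can be sketched as a sum over the target instructions that implement the generic instruction. The function name `derive_entry` is hypothetical, and the effective-address values `EA1 = 6` and `EA2 = 8` are not stated on the slide: they are inferred here from the totals of 35 and 22 clocks.

```python
def derive_entry(instructions):
    """Sum clocks and bytes over the target instructions of one generic instruction."""
    clocks = sum(c for _, c, _ in instructions)
    size = sum(b for _, _, b in instructions)
    return clocks, size


# Assumed effective-address times, inferred from the slide's totals.
EA1 = 6
EA2 = 8

seq_8086 = [
    ("mov ax, word ptr[bp+offset1]", 10, 3),
    ("add ax, word ptr[bp+offset2]", 9 + EA1, 4),
    ("mov word ptr[bp+offset3], ax", 10, 3),
]
seq_68020 = [
    ("mov a6@(offset1), d0", 7, 2),
    ("add a6@(offset2), d0", 2 + EA2, 2),
    ("mov d0, a6@(offset3)", 5, 2),
]

entry_8086 = derive_entry(seq_8086)    # (clocks, bytes) for dmem3 = dmem1 + dmem2
entry_68020 = derive_entry(seq_68020)
```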

Software estimation

Program execution time

Create basic blocks and compile into generic instructions
Estimate execution time of basic blocks
Perform probability-based flow analysis
Compute execution time of the entire behavior from the basic-block times and execution frequencies
A factor in the formula accounts for compiler optimizations
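
The steps above can be sketched numerically. The formula on the slide is not recoverable from the text, so the form used here (a weighted sum of basic-block times, scaled by an optimization factor called `delta`) is an assumption consistent with the listed steps, and the block data are hypothetical.

```python
def execution_time(blocks, delta=1.0):
    """Estimate behavior execution time in clocks.

    blocks is a list of (block_clocks, expected_frequency) pairs, where the
    frequency comes from probability-based flow analysis.
    delta is an assumed scaling factor for compiler optimizations.
    """
    return delta * sum(clocks * freq for clocks, freq in blocks)


# Hypothetical basic blocks: (clocks per execution, expected executions).
blocks = [(35, 1.0), (22, 10.0), (10, 0.5)]
unoptimized = execution_time(blocks)
optimized = execution_time(blocks, delta=0.5)
```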

Program memory size

Data memory size

Summary and future directions

We described methods for estimating:


Performance metrics: clock, control steps, execution time, communication rates
Cost metrics: design area, pins, program and data memory size

Future directions:
Incorporating synthesis/compilation optimizations
New metrics for testability, power, integration cost, etc.
New architectural features for the estimation model

Refinement

Functional objects are grouped and mapped to system components


Functional objects: variables, behaviors, and channels
System components: memories, chips or processors, and buses

Refinement is the update of the specification to reflect the mapping

Need for refinement

Makes specification consistent
Enables simulation of specification
Generates input for synthesis, compilation, and verification tools

Outline

Refining variable groups

Channel refinement

Resolving access conflicts

Refining incompatible interfaces

Refining variable groups

Group of variables mapped to a memory

Variable folding:
Implementing each variable in a memory with a fixed word size

Memory address translation

Assignment of addresses to each variable in group
Update references to variable by accesses to memory
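
Variable folding can be sketched as splitting a variable's bit range into word-sized slices. The function name `fold_variable` and the policy of packing low-order bits first are assumptions of this sketch; other splits of a variable across words are equally possible.

```python
def fold_variable(name, width, word_size):
    """Split a width-bit variable into slices that each fit one memory word."""
    slices = []
    low = 0
    while low < width:
        high = min(low + word_size, width) - 1
        slices.append(f"{name}({high} downto {low})")
        low = high + 1
    return slices


# A 16-bit and a 12-bit variable folded into an 8-bit memory.
b_words = fold_variable("B", 16, 8)
c_words = fold_variable("C", 12, 8)
```

Each slice then receives its own memory address during address translation.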

Variable folding

variable A: bit_vector( 3 downto 0) ;
variable B: bit_vector(15 downto 0) ;
variable C: bit_vector(11 downto 0) ;
variable D: bit_vector(11 downto 0) ;

8-bit memory (one word per line):

A( 3 downto 0)
B( 7 downto 0)
B(15 downto 8)
C( 7 downto 0)
C(11 downto 8)
D( 5 downto 0)
D(11 downto 6)
...

[Figure: an access to variable C in memory combines bits 11..8 and 7..0 from two words; an access to variable D in memory combines bits 11..6 and 5..0 from two words]

Memory address translation

Original specification

variable J, K : integer := 0;
variable V : IntArray (63 downto 0);
....
V(K) := 3;
X := V(36);
V(J) := X;
....
for J in 0 to 63 loop
    SUM := SUM + V(J);
end loop;
....

Assigning addresses to V

V (63 downto 0) is mapped to MEM(163 downto 100)

Refined specification

variable J, K : integer := 0;
variable MEM : IntArray (255 downto 0);
....
MEM(K + 100) := 3;
X := MEM(136);
MEM(J + 100) := X;
....
for J in 0 to 63 loop
    SUM := SUM + MEM(J + 100);
end loop;
....

Refined specification without offsets for index J

variable J : integer := 100;
variable K : integer := 0;
variable MEM : IntArray (255 downto 0);
....
MEM(K + 100) := 3;
X := MEM(136);
MEM(J) := X;
....
for J in 100 to 163 loop
    SUM := SUM + MEM(J);
end loop;
....
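
The two refined forms compute the same result. The following Python sketch mirrors the slide's mapping of V(63 downto 0) to MEM(163 downto 100); the array contents and the helper name `translate` are hypothetical.

```python
BASE = 100  # V(0) is placed at MEM(100), as in the slide


def translate(index, base=BASE):
    """Translate an index into V to an address in MEM."""
    return base + index


v = list(range(64))          # hypothetical contents of V
mem = [0] * 256
for j, value in enumerate(v):
    mem[translate(j)] = value

# Original loop over V versus the refined loop without offsets for J.
original_sum = sum(v[j] for j in range(0, 64))
refined_sum = sum(mem[j] for j in range(100, 164))
```

Folding the offset into the loop bounds removes one addition per access, which is the point of the second refined form.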

Refining channel groups

Channels are virtual entities over which messages are transferred

Bus is a physical medium that implements groups of channels

Bus consists of:

wires representing data and control lines
protocol defining sequence of assignments to data and control lines

Two refinement tasks

Bus generation: determining buswidth, i.e., number of data lines
Protocol generation: specifying mechanism of transfer over bus

Characterizing communication channels


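For a given behavior B that sends data over channel C,

Message size, bits(C): number of bits in each message

Accesses, accesses(B, C): number of times B transfers data over C

Average rate, averate(C): rate of data transfer of C over lifetime of behavior

Peak rate, peakrate(C): rate of transfer of single message

Example: channel X carries three 8-bit messages X1, X2, X3 between t = 0 and t = 400 ns; a single message takes 100 ns

bits(X) = 8 bits
averate(X) = 24 bits / 400 ns = 60 Mb/s
peakrate(X) = 8 bits / 100 ns = 80 Mb/s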
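
The channel rates in the example can be checked with a short calculation. The function names are hypothetical; the inputs (8-bit messages, three accesses, a 400 ns lifetime, 100 ns per message) are read from the example figure.

```python
def average_rate_mbps(message_bits, accesses, lifetime_ns):
    """Average rate: all bits transferred divided by the behavior's lifetime."""
    return message_bits * accesses * 1000 / lifetime_ns


def peak_rate_mbps(message_bits, transfer_ns):
    """Peak rate: bits of one message divided by the time of that transfer."""
    return message_bits * 1000 / transfer_ns


ave = average_rate_mbps(message_bits=8, accesses=3, lifetime_ns=400)
peak = peak_rate_mbps(message_bits=8, transfer_ns=100)
```

The factor 1000 converts bits per nanosecond to megabits per second.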


Characterizing buses

For a given bus B,

Buswidth, buswidth(B): number of data lines in B

Protocol delay, protdelay(B): delay for single message transfer over bus

Average rate, averate(B): rate of data transfer over B over lifetime of system

Peak rate, peakrate(B): maximum rate of transfer of data on bus

$peakrate(B) = buswidth(B) / protdelay(B)$

Determining bus rates

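Idle slots of a channel used for messages of other channels

To ensure that channel average rates are unaffected by bus:

$averate(B) \geq \sum_{C \in B} averate(C)$

Goal: to synthesize a bus that constantly transfers data, i.e.,

$averate(B) = peakrate(B)$

Example (time axis from t = 0 to 4 s):

channel X: messages X1, X2 of 8 bits each; average rate = (2 x 8 bits) / 4 s = 4 bits/s
channel Y: messages Y1, Y2, Y3 of 16 bits each; average rate = (3 x 16 bits) / 4 s = 12 bits/s
bus B: carries X1, Y1, Y2, X2, Y3 (8, 16, 16, 8, 16 bits); average rate = (4 + 12) bits/s = 16 bits/s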
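
The required bus rate in the example is the sum of the channel average rates. A minimal sketch, with a hypothetical helper name and the message sizes taken from the example:

```python
def channel_average_rate(message_bits, lifetime_s):
    """Average rate of a channel: total bits sent divided by the lifetime."""
    return sum(message_bits) / lifetime_s


rate_x = channel_average_rate([8, 8], 4)         # channel X
rate_y = channel_average_rate([16, 16, 16], 4)   # channel Y
required_bus_rate = rate_x + rate_y              # bus B must sustain this rate
```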

Renement 167 of 214 Copyright (c) 1994 Daniel D. Gajski, Frank Vahid, Sanjiv Narayan, and Jie Gong UC Irvine
Constraints for bus generation

Buswidth: affects number of pins on chip boundaries

Channel average rates: affect execution time of behaviors

Channel peak rates: affect time required for single message transfer

Example (time axis from t = 0 to 4 s):

channel X: messages X1, X2 of 16 bits each; averate(X) = 8 bits/s
bus B with 8 data lines: X1 and X2 each sent as two 8-bit transfers; averate(B) = 8 bits/s, peakrate(B) = 8 bits/s
bus B with 16 data lines: X1 and X2 each sent as one 16-bit transfer; averate(B) = 8 bits/s, peakrate(B) = 16 bits/s

Bus generation algorithm [NG94]

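/* Determine range of buswidths */
minwidth = 1; maxwidth = Max(bits(C));
mincost = infinity; mincostwidth = infinity;

for currwidth in minwidth to maxwidth loop
    /* compute bus peak rate */
    peakrate(B) = currwidth / protdelay(B);

    /* compute sum of channel average rates */
    sumaverate = 0;
    for all channels C mapped to B loop
        averate(C) = average rate of C at buswidth currwidth;
        sumaverate = sumaverate + averate(C);
    end loop

    if (peakrate(B) >= sumaverate) then
        /* feasible solution, determine minimal cost */
        currcost = ComputeCost(currwidth);
        if (currcost < mincost) then
            mincost = currcost;
            mincostwidth = currwidth;
        end if
    end if
end loop
return(mincostwidth)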
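
The loop structure of the bus generation algorithm can be sketched in Python. This is a simplified illustration, not the algorithm of [NG94]: channel average rates are treated as fixed inputs rather than recomputed for each buswidth, and the function name, argument names, and cost function are assumptions.

```python
import math


def generate_bus(channels, prot_delay, compute_cost):
    """Return the feasible buswidth of minimum cost, or None if none is feasible.

    channels is a list of (message_bits, average_rate) pairs.
    prot_delay is the delay of one transfer over the bus.
    compute_cost maps a buswidth to a cost value (hypothetical).
    """
    max_width = max(bits for bits, _ in channels)
    best_width, best_cost = None, math.inf
    for width in range(1, max_width + 1):
        # Peak rate of a bus with this many data lines.
        peak_rate = width / prot_delay
        # Sum of channel average rates (fixed in this sketch).
        sum_ave_rate = sum(rate for _, rate in channels)
        if peak_rate >= sum_ave_rate:
            # Feasible: keep the cheapest width seen so far.
            cost = compute_cost(width)
            if cost < best_cost:
                best_width, best_cost = width, cost
    return best_width


# Two channels with 8-bit and 16-bit messages at 4 and 12 bits/s.
width = generate_bus([(8, 4.0), (16, 12.0)], prot_delay=0.5,
                     compute_cost=lambda w: w)
```

Using the buswidth itself as the cost favors the narrowest feasible bus, which corresponds to minimizing pins.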








Bus generation algorithm

[Figure: flowchart of the bus generation algorithm. Steps shown: compute buswidth range (1 to Max); loop over the buswidths in the range; compute bus peak rate; compute channel average rates; two conditional tests.]

Bus generation example

2 behaviors accessing 16-bit data over two channels

Constraints specified for channel peak rates

[Plot: cost function value (0 to 9000) versus buswidth (0 to 24), showing infeasible implementations, feasible implementations, and the selected buswidth]

Performance vs. buswidth tradeoffs

Allows a buswidth to be selected, given performance constraints

e.g., behavior 1 has a performance constraint of 2500 clocks, so a buswidth of 4 or greater must be selected

[Plot: behavior execution time (clocks, 0 to 7000) versus buswidth (pins, 0 to 24)]

Protocol generation

Bus consists of several sets of wires:


Data lines, used for transferring message bits
Control lines, used for synchronization between behaviors
ID lines, used for identifying the channel active on the bus

All channels mapped to bus share these lines

Number of data lines determined by bus generation algorithm

Protocol generation consists of six steps

Protocol generation

1. Protocol selection: full handshake, half-handshake, etc.

2. ID assignment: N channels require log2(N) ID lines (here, 4 channels require 2 ID lines)

behavior P
    variable AD;
begin
    .....
    X <= 32 ;
    .....
    MEM(AD) := X + 7;
    .....
end ;

behavior Q
    variable COUNT;
begin
    .....
    MEM(60) := COUNT ;
    .....
end ;

variable X : bit_vector(15 downto 0) ;
variable MEM : bit_vector(63 downto 0, 15 downto 0);

Channels on bus B and their IDs: CH0 "00", CH1 "01", CH2 "10", CH3 "11"
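
ID assignment can be sketched as numbering the channels in binary. The function name `assign_ids` and the assignment order are assumptions; any one-to-one assignment of codes to channels would serve.

```python
def assign_ids(channels):
    """Assign each channel a unique binary ID using the fewest ID lines."""
    # Number of lines needed to distinguish len(channels) channels.
    n_lines = max(1, (len(channels) - 1).bit_length())
    return {ch: format(i, f"0{n_lines}b") for i, ch in enumerate(channels)}


ids = assign_ids(["CH0", "CH1", "CH2", "CH3"])
```

Four channels need two ID lines; a fifth channel would require a third line.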

Protocol generation

3. Bus structure definition

type HandShakeBus is record
    START, DONE : bit ;
    ID : bit_vector(1 downto 0) ;
    DATA : bit_vector(7 downto 0) ;
end record ;

signal B : HandShakeBus ;

4. Bus protocol definition

procedure ReceiveCH0( rxdata : out bit_vector) is
begin
    for J in 1 to 2 loop
        wait until (B.START = '1') and (B.ID = "00") ;
        rxdata (8*J-1 downto 8*(J-1)) <= B.DATA ;
        B.DONE <= '1' ;
        wait until (B.START = '0') ;
        B.DONE <= '0' ;
    end loop;
end ReceiveCH0;

procedure SendCH0( txdata : in bit_vector) is
begin
    B.ID <= "00" ;
    for J in 1 to 2 loop
        B.DATA <= txdata(8*J-1 downto 8*(J-1)) ;
        B.START <= '1' ;
        wait until (B.DONE = '1') ;
        B.START <= '0' ;
        wait until (B.DONE = '0') ;
    end loop;
end SendCH0;
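
The data slicing performed by SendCH0 and ReceiveCH0 (a 16-bit message sent as two 8-bit transfers, low byte first) can be sketched in Python. This sketch models only the slicing and reassembly, not the START/DONE handshake; the function names are hypothetical.

```python
def send_words(value, message_bits=16, bus_width=8):
    """Split a message into bus-width words, lowest-order word first."""
    mask = (1 << bus_width) - 1
    return [(value >> (bus_width * j)) & mask
            for j in range(message_bits // bus_width)]


def receive_words(words, bus_width=8):
    """Reassemble a message from words received lowest-order first."""
    value = 0
    for j, word in enumerate(words):
        value |= word << (bus_width * j)
    return value


words = send_words(0x1234)
message = receive_words(words)
```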

Protocol generation

5. Update variable references

process P
    variable AD, Xtemp;
begin
    .....
    SendCH0(32) ;
    .....
    ReceiveCH1(Xtemp);
    SendCH2(AD, Xtemp+7);
    .....
end ;

process Q
    variable COUNT;
begin
    .....
    SendCH3(60, COUNT);
    .....
end ;

6. Generate behaviors for variables

process Xproc
    variable X ;
begin
    wait on B.ID;
    if (B.ID="00") then
        receiveCH0(X);
    elsif (B.ID="01" ) then
        sendCH1(X);
    end if;
end;

process MEMproc
    variable MEM: array(0 to 63);
begin
    wait on B.ID;
    if (B.ID="10") then
        receiveCH2(MEM);
    elsif (B.ID="11" ) then
        receiveCH3(MEM);
    end if;
end;

All four processes are connected by bus B (8 data lines)

Resolving access conflicts

System partitioning may result in concurrent accesses to a resource

Channels mapped to a bus may attempt data transfer simultaneously
Variables mapped to a memory may be accessed by behaviors simultaneously

Arbiter needs to be generated to resolve such access conflicts

Three tasks
Arbitration model selection
Arbitration scheme selection
Arbiter generation

Arbitration models

[Figure: two arbitration models for behaviors P, Q, and R accessing memory MEM through port1 and port2 (addr/data lines) with a MemArbiter.
Static: req/grant signal pairs are shown for two behaviors.
Dynamic: req/grant signal pairs are shown for all three behaviors.]

Arbiter generation

Example of bus arbitration

Two behaviors, P and Q, accessing a single resource, bus B

Behavior P assigned higher priority than Q
Fixed priority implemented with two handshake signals, Req and Grant

process B_arbiter
begin
    wait until (Req_P = '1') or (Req_Q = '1');
    if (Req_P = '1') then
        Grant_P <= '1';
        wait until (Req_P = '0');
        Grant_P <= '0';
    elsif (Req_Q = '1') then
        Grant_Q <= '1';
        wait until (Req_Q = '0');
        Grant_Q <= '0';
    end if;
end process;

process P
    variable AD, Xtemp;
begin
    .....
    Req_P <= '1';
    wait until (Grant_P = '1');
    SendCH0(32) ;
    Req_P <= '0';
    .....
end process ;

process Q
    variable COUNT;
begin
    .....
    Req_Q <= '1';
    wait until (Grant_Q = '1');
    SendCH3(60, COUNT);
    Req_Q <= '0';
    .....
end process;

process Xproc
    variable X ;
begin
    wait on B.ID;
    if (B.ID="00") then
        receiveCH0(X);
    elsif (B.ID="01" ) then
        sendCH1(X);
    end if;
end process;

process MEMproc
    variable MEM: array(0 to 63);
begin
    wait on B.ID;
    if (B.ID="10") then
        receiveCH2(MEM);
    elsif (B.ID="11" ) then
        receiveCH3(MEM);
    end if;
end process;
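
The decision made by B_arbiter on each pass can be sketched as a fixed-priority selection. The function name `arbitrate` and the dictionary representation of the request signals are assumptions of this sketch; it does not model the waiting on signal changes.

```python
def arbitrate(requests, priority):
    """Return the highest-priority requester, or None if nobody is requesting.

    requests maps a behavior name to True when its Req signal is '1'.
    priority lists behavior names from highest to lowest priority.
    """
    for name in priority:
        if requests.get(name):
            return name
    return None


# P has priority over Q, as in the example.
granted = arbitrate({"P": True, "Q": True}, priority=["P", "Q"])
```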

Renement 179 of 214 Copyright (c) 1994 Daniel D. Gajski, Frank Vahid, Sanjiv Narayan, and Jie Gong UC Irvine
Effect of binding on interfaces

[Figure: three bindings of behaviors A and B, which communicate over channel X using protocols Pa and Pb.
Custom / Custom: the two behaviors are connected directly.
Custom / Standard: the two behaviors are connected directly.
Standard / Standard: an interface process is placed between protocol Pa and protocol Pb.]

Protocol operations

Protocols usually consist of five atomic operations

waiting for an event on input control line
assigning value to output control line
reading value from input data port
assigning value to output data port
waiting for fixed time interval

Protocol operations may be specified in one of three ways


Finite state machines (FSMs)
Timing diagrams
Hardware description languages (HDLs)

Protocol specification: FSMs

Protocol operations ordered by sequencing between states

Constraints between events may be specified using timing arcs

Conditional & repetitive event sequences require extra states, transitions

Protocol Pa

start
a1: ADDRp <= AddrVar(7 downto 0); ARDYp <= '1';
    transition on (ARCVp = '1')
a2: ADDRp <= AddrVar(15 downto 8); DREQp <= '1';
    transition on (DRDYp = '1')
a3: DataVar <= DATAp

Protocol Pb

start
b1: (no action)
    transition on (RDp = '1')
b2: MAddrVar := MADDRp
    transition after (100 ns)
b3: MDATAp <= MemVar (MAddrVar)

Protocol specification: Timing diagrams

Advantages:
Ease of comprehension, representation of timing constraints

Disadvantages:
Lack of action language, not simulatable
Difficult to specify conditional and repetitive event sequences

[Figure: timing diagrams. Protocol Pa: ARDYp, ADDRp (bits 7..0, then 15..8), ARCVp, DREQp, DRDYp, DATAp (15..0). Protocol Pb: MADDRp (15..0), RDp, MDATAp (15..0), with a 100 ns delay.]

Protocol specification: HDLs

Advantages:
Functionality can be verified by simulation
Easy to specify conditional and repetitive event sequences

Disadvantages:
Cumbersome to represent timing constraints between events

Protocol Pa

port ADDRp : out bit_vector(7 downto 0);
port DATAp : in bit_vector(15 downto 0);
port ARDYp : out bit;
port ARCVp : in bit;
port DREQp : out bit;
port DRDYp : in bit;

ADDRp <= AddrVar(7 downto 0);
ARDYp <= '1';
wait until (ARCVp = '1' );
ADDRp <= AddrVar(15 downto 8);
DREQp <= '1';
wait until (DRDYp = '1');
DataVar <= DATAp;

Protocol Pb

port MADDRp : in bit_vector(15 downto 0);
port MDATAp : out bit_vector(15 downto 0);
port RDp : in bit;

wait until (RDp = '1');
MAddrVar := MADDRp ;
wait for 100 ns;
MDATAp <= MemVar (MAddrVar);

Interface process generation

Input: HDL description of two fixed, but incompatible protocols

Output: HDL process that translates one protocol to the other

i.e., responds to their control signals and sequences their data transfers

Four steps required for generating interface process (IP):

Creating relations
Partitioning relations into groups
Generating interface process statements
Interconnect optimization

IP generation: creating relations

Protocol represented as an ordered set of relations

Relations are sequences of events/actions

Protocol Pa

ADDRp <= AddrVar(7 downto 0);
ARDYp <= '1';
wait until (ARCVp = '1' );
ADDRp <= AddrVar(15 downto 8);
DREQp <= '1';
wait until (DRDYp = '1');
DataVar <= DATAp;

Relations

A1 [ (true) :
     ADDRp <= AddrVar(7 downto 0)
     ARDYp <= '1' ]
A2 [ (ARCVp = '1') :
     ADDRp <= AddrVar(15 downto 8)
     DREQp <= '1' ]
A3 [ (DRDYp = '1') :
     DataVar <= DATAp ]
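
Relation creation can be sketched as cutting the protocol's statement list at each wait statement, so that every relation pairs a condition with the actions that follow it. The function name and the string-based parsing are assumptions of this sketch and only handle statements written as on the slides.

```python
def create_relations(statements):
    """Group protocol statements into (condition, actions) relations."""
    relations = []
    condition, actions = "true", []
    for raw in statements:
        stmt = raw.strip().rstrip(";").strip()
        if stmt.startswith("wait "):
            # A wait closes the current relation and starts a new one.
            if actions or condition != "true":
                relations.append((condition, actions))
            condition = stmt[len("wait "):].strip()
            if condition.startswith("until"):
                condition = condition[len("until"):].strip().strip("()").strip()
            actions = []
        else:
            actions.append(stmt)
    relations.append((condition, actions))
    return relations


protocol_pa = [
    "ADDRp <= AddrVar(7 downto 0);",
    "ARDYp <= '1';",
    "wait until (ARCVp = '1' );",
    "ADDRp <= AddrVar(15 downto 8);",
    "DREQp <= '1';",
    "wait until (DRDYp = '1');",
    "DataVar <= DATAp;",
]
relations_pa = create_relations(protocol_pa)
```

For protocol Pa this yields three relations, corresponding to A1, A2, and A3.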

IP generation: partitioning relations

Partition the set of relations from both protocols into groups.

Group represents a unit of data transfer

Protocol Pa          Protocol Pb          Group

A1 (8 bits out)
A2 (8 bits out)      B1 (16 bits in)      G1

A3 (16 bits in)      B2 (16 bits out)     G2

IP generation: inverting protocol operations

For each operation in a group, add its dual to interface process

Dual of an operation represents the complementary operation

Temporary variable may be required to hold data values

Atomic operation            Dual operation

wait until (Cp = '1')       Cp <= '1'
Cp <= '1'                   wait until (Cp = '1')
var <= Dp                   Dp <= TempVar
Dp <= var                   TempVar := Dp
wait for 100 ns             wait for 100 ns

Interface Process

/* (group G1)' */
wait until (ARDYp = '1');
TempVar1(7 downto 0) := ADDRp ;
ARCVp <= '1' ;
wait until (DREQp = '1');
TempVar1(15 downto 8) := ADDRp ;
RDp <= '1' ;
MADDRp <= TempVar1;
/* (group G2)' */
wait for 100 ns;
TempVar2 := MDATAp ;
DRDYp <= '1' ;
DATAp <= TempVar2 ;

Ports on the Pa side: ADDRp (8 bits), DATAp (16 bits), ARDYp, ARCVp, DREQp, DRDYp
Ports on the Pb side: MADDRp (16 bits), MDATAp (16 bits), RDp
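
The table of dual operations can be sketched as a small rewriting function. The function name `dual`, the `data_ports` argument, and the single temporary name `TempVar` are assumptions; unlike the interface process on the slide, this sketch does not track bit slices of the temporary.

```python
import re


def dual(op, data_ports, temp="TempVar"):
    """Return the complementary operation for one atomic protocol operation."""
    op = op.strip().rstrip(";").strip()
    if op.startswith("wait for"):
        return op                                   # a delay is its own dual
    m = re.fullmatch(r"wait until \((\w+) = '1'\s*\)", op)
    if m:
        return f"{m.group(1)} <= '1'"               # wait on control -> drive it
    target, _, source = (part.strip() for part in op.partition("<="))
    if target in data_ports:
        return f"{temp} := {target}"                # port written -> read it
    if source in data_ports:
        return f"{source} <= {temp}"                # port read -> drive it
    return f"wait until ({target} = {source})"      # control driven -> wait on it


ports = {"ADDRp", "DATAp"}
duals = [dual(s, ports) for s in ["ARDYp <= '1';", "DataVar <= DATAp;"]]
```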

IP generation: interconnect optimization

Certain ports of both protocols may be directly connected

Advantages:
Bypassing interface process reduces interconnect cost
Operations related to these ports can be eliminated from interface process

Interface Process

wait until (ARDYp = '1');
TempVar1(7 downto 0) := ADDRp ;
ARCVp <= '1' ;
wait until (DREQp = '1');
TempVar1(15 downto 8) := ADDRp ;
RDp <= '1' ;
MADDRp <= TempVar1;
wait for 100 ns;
DRDYp <= '1' ;

[Figure: the interface process sits between A (ports ADDRp, ARDYp, ARCVp, DREQp, DRDYp) and B (ports MADDRp, RDp); the 16-bit data ports DATAp and MDATAp are connected directly]

Transducer synthesis [BK87]

Input: Timing diagram description of two fixed protocols

Output: Logic circuit description of transducer

Steps for generating logic circuit from timing diagrams:

Create event graphs for both protocols
Connect graphs based on data dependencies or explicitly specified ordering
Add templates for each output node in combined graph
Merge and connect templates
Satisfy min/max timing constraints
Optimize skeletal circuit

Generating event graphs from timing diagrams

e.g., FIFO stack control cell

[Figure: control cell with signals Ri, Ro, Ai, Ao, and L; its timing diagram; and the derived event graph running from start node S to end node E]

Deriving skeletal circuit from event graph

[Figure: skeletal circuit derived from the event graph, built from set/reset elements (inputs S and R, output Q) driven by signals Ri, Ro, Ai, Ao, and L]

Advantages:
Synthesizes logic for transducer circuit directly
Accounts for min/max timing constraints between events

Disadvantages:
Cannot interface protocols with different data port sizes
Transducer not simulatable with timing diagram description of protocols

Hardware/Software interface refinement

[Figure: (a) Partitioned specification: a software partition and a hardware partition containing variables v1 to v4, behaviors B1 to B4, and ports p1 to p3, with data accesses between them. (b) Mapping to architecture: processor, memory, and ASIC with a buffer; labels s1 and s2 appear in both panels.]

Tasks of hardware/software interfacing

Data access refinement (e.g., a behavior accessing a variable)

Control access refinement (e.g., a behavior starting another behavior)

Select bus to satisfy data transfer rate and reduce interfacing cost

Interface software/hardware components to standard buses

Schedule software behaviors to satisfy data input/output rate

Distribute variables to reduce ASIC cost and satisfy performance

Summary and future directions

In this section, we described:


Refinement of variable groups: variable folding, address translation
Refinement of channel groups: bus and protocol generation
Resolution of access conflicts: arbiter generation
Refinement of incompatible interfaces: IP generation, transducer synthesis

Future work should address the following issues:

Effects of bus arbitration delays on performance of a behavior
Developing metrics to guide selection of protocols and arbitration schemes
Efficient synthesis of arbiter and interface processes

Methodology

Past design effort focused on lower levels

Higher levels lack well-defined methodology and tools

Paradigm shift to higher levels can increase productivity

Need methodology and tools for system level

Outline

Basic concepts in design methodology

Example

A design methodology

A generic synthesis system

Conceptualization environment

Items a design methodology must specify

Syntax and semantics of input and output

Algorithms for transforming input to output

Components to be used in the design implementation

Definition and ranges of constraints

Mechanism for selection of architectural styles

Control strategies (scenarios or scripts)

Example: Interactive TV processor

[Figure: InteractiveTvProcessor. An analog subsystem receives video and audio and drives audio_in and video_in of the digital subsystem; the digital subsystem drives audio_out and video_out to a second analog subsystem, which outputs audio and video. Other inputs to the digital subsystem: av_cmd, button commands from a keypad receiver IC, and commands from the main computer.]

Example's dataflow behavior

[Figure: dataflow of the digital subsystem.
Behaviors: StoreAudio, GenerateAudio, StoreGenerateVideo, OverlayCharacters, StoreAVCmd, ProcessAVCmd, ProcessMainCmds, ProcessRemoteButtons
Variables: audio1[100k][8], audio2[100k][8], video[500k][8], av_cmd[8], fonts[128][16][16], screen_chars[30][30][8]
External signals: audio_in, audio_out, video_in, video_out, av_cmd, main_cmds, button]

Example’s implementation after system design

Digital subsystem

Memory1: audio1[100k][8], audio2[100k][8]
Memory2: video[500k][8]
Memory3: fonts[128][16][16], screen_chars[30][30][8]
ASIC1: StoreAudio, GenerateAudio
ASIC2: StoreGenerateVideo, StoreAVCmd, av_cmd[8]
Processor: ProcessAVCmd, ProcessRemoteButtons, ProcessMainCmds, OverlayCharacters

External signals: audio_in, audio_out, video_in, video_out, av_cmd, main_cmds, button

An example design methodology

Step                          Current practice     Proposed methodology

Functionality specification   Natural language     Executable language
System design                 Manual               Allocation, partitioning, refinement
Component implementation

Result of functionality specification: functional specification

Result of system design: processor, two ASICs, and memory connected by a bus; each processor and ASIC has a functional specification, and the memory holds variables

Result of component implementation: processor with C code, two ASICs with RTL structure, memory with mapped address space, and a detailed bus protocol

System-design tasks

Functional objects    Allocation    Partitioning               Refinement

Variables             Memories      Variables to memories      Address assignment
Behaviors             Processors    Behaviors to processors    Interfacing
Channels              Buses         Channels to buses          Arbitration/protocols

One possible ordering of tasks

1. Functionality specification
Specification

2. System design
Memory allocation
Variable-to-memory partitioning
Bus allocation
Channel-to-bus partitioning
ASIC/processor allocation
Behavior-to-ASIC/processor partitioning
Interface synthesis
Arbiter synthesis

3. Component implementation
Implement software
Implement hardware

Generic synthesis system requirements

Completeness
All levels of design, all implementation styles

Extensibility
Allow addition of new algorithms and tools

Controllability
User control of tools, design-quality feedback

Interactivity
Partial design, design modification

Upgradability
Evolve to describe-and-synthesize method

A generic synthesis system

[Figure: generic synthesis system. The designer interacts through a conceptualization environment and a verification/simulation suite. The system specification enters system synthesis, followed by software synthesis and ASIC synthesis, then compilation and logic/sequential synthesis, then physical design synthesis. Outputs: assembly code, and an ASIC description to manufacturing. Supporting elements: SDB, CDB, intermediate forms, description generators.]

A generic system-synthesis tool

[Figure: generic system-synthesis tool. Input: system behavioral specification. A compiler builds the SR, which is used by the allocator, transformer, partitioner, estimators, and interface & arbitration synthesis. Output: system-module behavioral specifications, passed to software synthesis and to chip synthesis.]

A generic chip-synthesis tool

[Figure: generic chip-synthesis tool. Input: behavioral description. A compiler builds the CDFG, which is used by the scheduler, component selector, storage binder, functional unit binder, interconnection binder, module selector, technology mapper, and microarchitecture optimizer, with access to the CDB. These are followed by logic/sequential synthesis. Output: to physical design.]

A generic logic-synthesis tool

[Figure: generic logic-synthesis tool.
Inputs: state tables, Boolean expressions, timing diagrams, memory specifications
State tables: state minimization, then state encoding
Timing diagrams: timing graph compiler, then interface synthesis
Memory specifications: memory synthesis
All paths: logic minimization, then technology mapping, then physical design]

Conceptualization environment

Tool is only effective if the designer can use it


Understandable display of data
Highlight design parts that need attention

Must support many design avenues

A system-synthesis tool interface

Module                      type     $          Execution time    Area           Pins     Instr

System                               105/100*
ASIC1                       X100     30                           16000/20000    46/60
    CaptureAudio                                100/110
    GenerateAudio                               100/110
ASIC2                       X100     30                           18000/20000    48/60
    CaptureGenerateVideo                        100/110
    CaptureAVCmd                                100/110
Memory1                     V1000    10
    audio_array1
    audio_array2
Memory2                     V1000    10
    video_array
Processor1                  Y900     25                                                   6000/5000*
    ProcessRemoteButtons
    ProcessMiscCmds

Side panel: Mappings, Allocation, Partition, Estimates, Constraints
Bottom bar: Cost: 5.43, View options, Partition/Allocate, Refine

An optional design view

Quality metric                            Estimate/Constraint    Violation?

$(System)                                 105/100                yes
Execution-time(CaptureAudio)              100/110
Execution-time(GenerateAudio)             100/110
Execution-time(CaptureGenerateVideo)      100/110
Execution-time(CaptureAVCmd)              100/110
Area(ASIC1)                               16000/20000
Area(ASIC2)                               18000/20000
Pins(ASIC1)                               56/60
Pins(ASIC2)                               58/60
Instr(Processor1)                         6000/5000              yes
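
The violation check behind such a view can be sketched as a comparison of each estimate with its constraint. The function name and the rule that a violation means the estimate exceeds the constraint are assumptions of this sketch; the values are taken from the table.

```python
def find_violations(metrics):
    """Return the names of metrics whose estimate exceeds the constraint."""
    return [name for name, (estimate, constraint) in metrics.items()
            if estimate > constraint]


metrics = {
    "$(System)": (105, 100),
    "Execution-time(CaptureAudio)": (100, 110),
    "Area(ASIC1)": (16000, 20000),
    "Pins(ASIC1)": (56, 60),
    "Instr(Processor1)": (6000, 5000),
}
violations = find_violations(metrics)
```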

Summary

Three-step design methodology


Functionality specification
System design
Component implementation

Major tasks in system design


Allocation
Partitioning
Refinement

Generic synthesis tool

Conceptualization environment
Crucial to practical use

Future directions

Advanced estimation methods

Formal verification

Testability

Frameworks and databases

Regularity exploiting

System-level transformations

Feedback incorporation

References
[BHS91] F. Belina, D. Hogrefe, and A. Sarma. SDL with Applications from Protocol Specications. Prentice Hall, 1991.

[BK87] G. Borriello and R.H. Katz. \Synthesis and optimization of interface transducer logic,". In Proceedings of the International
Conference on Computer-Aided Design, 1987.

[CS86] C.Tseng and D.P. Siewiorek. \Automated synthesis of datapaths in digital systems,". IEEE Transactions on Computer-Aided
Design, pages 379{395, July 1986.

[EHB94] R. Ernst, J. Henkel, and T. Benner. \Hardware-software cosynthesis for microcontrollers,". In IEEE Design & Test of Com-
puters, pages 64{75, December 1994.

[FM82] C.M. Fiduccia and R.M. Mattheyses. \A linear-time heuristic for improving network partitions,". In Proceedings of the Design
Automation Conference, 1982.

[GD90] R. Gupta and G. DeMicheli. \Partitioning of functional models of synchronous digital systems,". In Proceedings of the Inter-
national Conference on Computer-Aided Design, pages 216{219, 1990.

[GD92] R. Gupta and G. DeMicheli. \System-level synthesis using re-programmable components,". In Proceedings of the European
Conference on Design Automation (EDAC), pages 2{7, 1992.

[GD93] R. Gupta and G. DeMicheli. \Hardware-software cosynthesis for digital systems,". In IEEE Design & Test of Computers, pages
29{41, October 1993.

[GVN94] D.D. Gajski, F. Vahid, and S. Narayan. \A system-design methodology: Executable-specication renement,". In Proceedings
of the European Conference on Design Automation (EDAC), 1994.

[Hal93] Nicolas Halbwachs. Synchronous Programming of Reactive Systems. Kluwer Academic Publishers, 1993.

[Hoa78] C.A.R. Hoare. \Communicating sequential processes,". Communications of the ACM, 21(8): 666{677, 1978.

[IEE88] IEEE Inc., N.Y. IEEE Standard VHDL Language Reference Manual, 1988.

[JMP88] R. Jain, M. Mlinar, and A. Parker. \Area-time model for synthesis of non-pipelined designs,". In Proceedings of the Interna-
tional Conference on Computer-Aided Design, 1988.

[Joh67] S.C. Johnson. \Hierarchical clustering schemes,". Psychometrika, pages 241{254, September 1967.
[KC91] Y.C. Kirkpatrick and C.K. Cheng. \Ratio cut partitioning for hierarchical designs,". IEEE Transactions on Computer-Aided
Design, 10(7): 911{921, 1991.

[KGV83] S. Kirkpatrick, C.D. Gelatt, and M. P. Vecchi. \Optimization by simulated annealing,". Science, 220(4598): 671{680, 1983.

[KL70] B.W. Kernighan and S. Lin. \An efcient heuristic procedure for partitioning graphs,". Bell System Technical Journal, February
1970.

[LT91] E.D. Lagnese and D.E. Thomas. \Architectural partitioning for system level synthesis of integrated circuits,". IEEE Transactions
on Computer-Aided Design, July 1991.

[MK90] M.C. McFarland and T.J. Kowalski. \Incorporating bottom-up design into hardware synthesis,". IEEE Transactions on
Computer-Aided Design, September 1990.

[NG92] S. Narayan and D.D. Gajski. \System clock estimation based on clock slack minimization,". In Proceedings of the European
Design Automation Conference (EuroDAC), 1992.

[NG94] S. Narayan and D.D. Gajski. \Synthesis of system-level bus interfaces,". In Proceedings of the European Conference on
Design Automation (EDAC), 1994.

[NVG92] S. Narayan, F. Vahid, and D.D. Gajski. \System specication with the SpecCharts language,". In IEEE Design & Test of
Computers, Dec. 1992.

[PK89] P.G. Paulin and J.P. Knight. \Algorithms for high-level synthesis,". In IEEE Design & Test of Computers, Dec. 1989.

[PPM86] A.C. Parker, T. Pizzaro, and M. Mlinar. \MAHA: A program for datapath synthesis,". In Proceedings of the Design Automation
Conference, 1986.

[TM91] D.E. Thomas and P. Moorby. The Verilog Hardware Description Language. Kluwer Academic Publishers, 1991.

[VG92] F. Vahid and D.D. Gajski. \Specication partitioning for system design,". In Proceedings of the Design Automation Conference,
1992.

[VGG93] F. Vahid, J. Gong, and D.D. Gajski. \A hardware-software partitioning algorithm for minimizing hardware,". UC Irvine, Dept.
of ICS, Technical Report 93-38,1993.
