Unit 2 ES - CLG
1. State machine
2. Circular buffer
3. Queue
Among these three components, state machines are well suited to reactive systems such as user interfaces. Circular buffers and queues are useful in digital signal processing.
2.8.1. STATE MACHINE
When inputs are given to a system, its reaction can usually be characterized in terms of the input received and the current state of the system. This leads to a technique known as the finite state machine style.
The finite state machine style describes the behavior of reactive systems. The state machine style of programming is also an efficient implementation technique, and it is widely used in hardware design.
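The idea that a reaction depends only on the current state and the input can be sketched in C as a table-driven state machine. This is a minimal illustration, not code from the text: the states (IDLE, RUNNING, PAUSED), the inputs (IN_START, IN_PAUSE, IN_STOP) and the function name fsm_step are all hypothetical.

```c
#include <assert.h>

/* Hypothetical states and inputs for a small reactive system. */
typedef enum { IDLE, RUNNING, PAUSED, NUM_STATES } state_t;
typedef enum { IN_START, IN_PAUSE, IN_STOP, NUM_INPUTS } input_t;

/* Next-state table, indexed as next_state[current state][input]. */
static const state_t next_state[NUM_STATES][NUM_INPUTS] = {
    /*               START    PAUSE   STOP */
    /* IDLE    */ { RUNNING, IDLE,   IDLE },
    /* RUNNING */ { RUNNING, PAUSED, IDLE },
    /* PAUSED  */ { RUNNING, PAUSED, IDLE },
};

/* One reaction: the new state depends only on the current state and the input. */
state_t fsm_step(state_t current, input_t in)
{
    assert(current < NUM_STATES && in < NUM_INPUTS);
    return next_state[current][in];
}
```

Keeping the transitions in a table makes the behavior easy to check against a state diagram. A switch statement on the current state is an equally common way to write the same machine.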
2.8.2. CIRCULAR BUFFERS AND STREAM ORIENTED PROGRAMMING
The circular buffer is a data structure that handles streaming data efficiently. The figure below shows how a circular buffer stores a subset of the data stream. At each point in time, the algorithm needs a subset of the data stream that forms a window into the stream.
To avoid copying data within the buffer, we move the head of the buffer in time. The head pointer points to the location at which the next sample will be placed. Every time we add a sample, we automatically overwrite the oldest sample, which is the one that needs to be thrown out. When the pointer gets to the end of the buffer, it wraps around to the top.
Many digital signal processors provide addressing modes to support circular buffers. For example, the C55x provides five circular buffer start address registers. These registers allow circular buffers to be placed without alignment constraints.
Embedded and Real Time Systems
In the absence of specialized instructions, we can write our own C code for a circular buffer. This code also helps us to understand the operation of the buffer.
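A circular buffer of this kind can be sketched in plain C as follows. This is only one possible layout under stated assumptions: the names CMAX, circ, pos, circ_update and circ_get and the window size of 4 are chosen for illustration and do not come from the text.

```c
#include <assert.h>

#define CMAX 4                 /* window size; hypothetical value */

static int circ[CMAX];         /* storage for the window into the stream */
static int pos = 0;            /* location where the next sample will be placed */

/* Add a sample, overwriting the oldest one, and wrap at the end of the buffer. */
void circ_update(int sample)
{
    circ[pos] = sample;
    pos = (pos + 1) % CMAX;
}

/* Return the sample that is `age` steps old; age 0 is the newest sample. */
int circ_get(int age)
{
    assert(age >= 0 && age < CMAX);
    return circ[(pos - 1 - age + 2 * CMAX) % CMAX];
}
```

No data is ever copied inside the buffer: only the index pos moves, and the modulo operation provides the wrap-around that a DSP's circular addressing mode would otherwise do in hardware.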
[Figure: a circular buffer holding a window of the data stream at time t and at time t + 1]
[Figure: signal flow graph with input x(n), output y(n), a + node, a z^-1 delay box and an edge labeled b1]
... the sample rate. The input x(n) and the output y(n) are sequences indexed by n, which corresponds to the sequence of samples.
In this graph, nodes can be either arithmetic operators or delay operators. The + node adds its two inputs and produces the output y(n). The box labeled z^-1 is a delay operator. The z notation represents the z-transform, and the -1 superscript means that the operation performs a time delay of one sample period. The edge from the delay operator to the addition operator is labeled with b1, which means that the output of the delay operator is multiplied by b1.
2.8.3. QUEUES AND PRODUCER/CONSUMER SYSTEMS
Queues are also used in signal processing and event processing. Queues are used whenever data may arrive and depart at somewhat unpredictable times or when variable amounts of data may arrive. A queue is also called an elastic buffer.
Queues can be built in two ways: one uses a linked list and the other uses an array. Building a queue with a linked list allows the queue to grow to an arbitrary size. When designing the queue with an array, the array must be able to hold all the data.
A circular buffer always has a fixed number of data elements, while a queue may have a varying number of elements in it.
Digital filters always take in the same amount of data in each time period. But signal processing systems may take in varying amounts of data over time and produce varying amounts. When these systems operate in a chain, the variable-rate output of one stage becomes the variable-rate input of another stage.
Fig. 2.25. A producer/consumer system
The figure above shows a simple producer/consumer system. P1 and P2 are the two blocks that perform algorithmic processing. The data is fed to them by queues that act as elastic buffers. The queues modify the flow of control in the system as well as store data.
For example, if P2 runs ahead of P1, it will eventually run out of data in its q12 input queue. At that point, the queue will return an empty signal to P2, and P2 should stop working until more data is available.
This method is easier to implement in a multitasking environment, but it is also possible to make effective use of queues in programs structured as nested procedures.
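The empty signal described above can be sketched with a small array-based queue in C. The names queue_t, q_init, q_put and q_get and the capacity Q_SIZE are assumptions made for this illustration. The return value of q_get plays the role of the empty signal.

```c
#include <assert.h>

#define Q_SIZE 4   /* capacity of the array-based queue; hypothetical value */

typedef struct {
    int data[Q_SIZE];
    int head;      /* index of the oldest element */
    int count;     /* number of elements currently stored */
} queue_t;

void q_init(queue_t *q)
{
    assert(q != 0);
    q->head = 0;
    q->count = 0;
}

/* Producer side: returns 1 on success, 0 if the queue is full. */
int q_put(queue_t *q, int value)
{
    if (q->count == Q_SIZE) return 0;
    q->data[(q->head + q->count) % Q_SIZE] = value;
    q->count++;
    return 1;
}

/* Consumer side: returns 1 and stores the value, or 0 as the "empty" signal. */
int q_get(queue_t *q, int *value)
{
    if (q->count == 0) return 0;
    *value = q->data[q->head];
    q->head = (q->head + 1) % Q_SIZE;
    q->count--;
    return 1;
}
```

A consumer such as P2 would call q_get on its input queue and stop working whenever it returns 0. A linked-list queue would remove the fixed capacity at the cost of dynamic allocation.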
Data Structures in Queues
The queues in a producer/consumer system may hold either uniform-sized data elements or variable-sized data elements.
For example:
w = a + b;
x = a - c;
y = x + d;
x = a + c;
z = y + e;
Fig. 2.26. A basic block in C
Renaming each assignment to x gives the single-assignment form:
x1 = a - c;
y = x1 + d;
x2 = a + c;
z = y + e;
if (cond1)
    basic_block_1();
else
    basic_block_2();
basic_block_3();
switch (test1) {
    case c1: basic_block_4();
        break;
    case c2: basic_block_5();
        break;
    case c3: basic_block_6();
        break;
}
The CDFG for the above C code is shown in Fig. 2.29.
A CDFG has two kinds of nodes: decision nodes and data flow nodes.
A for loop such as
for (i = 0; i < N; i++)
    loop_body();
is equivalent to
i = 0;
while (i < N) {
    loop_body();
    i++;
}
[Figure: CDFG with nodes basic_block_2(), basic_block_3(), a decision node on test1 with branch c1, and basic_block_4(), basic_block_5(), basic_block_6()]
We can also build a CDFG for an assembly language program. ARM and many VLIW processors support predicated execution of instructions; for that we need ...
[Figure: assembly code is scanned to build the symbol table]
Fig. 2.31. Symbol table processing during assembly
Embedded Computing Platform Design
In the first pass, the name of each symbol and its address are stored in a symbol table. The symbol table is built by scanning from the first instruction to the last instruction.
During the scan, a program location counter (PLC) is used to assign memory locations to labels. The PLC and the program counter (PC) are similar, but the PLC always makes exactly one pass through the program, whereas the PC makes many passes over code in a loop.
After examining a line, the assembler updates the PLC to the next location and looks at the next instruction. If the instruction begins with a label, a new entry is made in the symbol table that includes the label name and its value. The value of the label is equal to the current value of the PLC.
At the end of the first pass, the assembler rewinds to the beginning of the assembly language file to make the second pass.
During the second pass, when a label name is found, the label is looked up in the symbol table and its value is substituted into the appropriate place in the instruction.
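The two passes described above can be sketched in C. This is a heavily simplified model and not a real assembler: the line_t record, the fixed instruction size INSTR_SIZE and the function names first_pass and second_pass are assumptions made for illustration only.

```c
#include <assert.h>
#include <string.h>

#define MAX_SYMS 16
#define INSTR_SIZE 4   /* assumed: every instruction occupies 4 bytes */

/* Hypothetical source line: an optional label and an optional referenced label. */
typedef struct { const char *label; const char *target; } line_t;
typedef struct { const char *name; unsigned value; } symbol_t;

static symbol_t symtab[MAX_SYMS];
static int nsyms = 0;

/* Look a label up in the symbol table; returns 1 if it is found. */
static int lookup(const char *name, unsigned *value)
{
    for (int i = 0; i < nsyms; i++)
        if (strcmp(symtab[i].name, name) == 0) {
            *value = symtab[i].value;
            return 1;
        }
    return 0;
}

/* First pass: scan once, giving each label the current value of the PLC. */
void first_pass(const line_t *prog, int n, unsigned origin)
{
    unsigned plc = origin;
    nsyms = 0;
    for (int i = 0; i < n; i++) {
        if (prog[i].label) {
            assert(nsyms < MAX_SYMS);
            symtab[nsyms].name = prog[i].label;
            symtab[nsyms].value = plc;
            nsyms++;
        }
        plc += INSTR_SIZE;   /* move the PLC to the next location */
    }
}

/* Second pass: substitute each referenced label with its value.
   Returns 1 on success, 0 if a label is undefined. */
int second_pass(const line_t *prog, int n, unsigned *resolved)
{
    for (int i = 0; i < n; i++) {
        resolved[i] = 0;
        if (prog[i].target && !lookup(prog[i].target, &resolved[i]))
            return 0;
    }
    return 1;
}
```

Two passes are needed because a line may refer to a label that is defined later in the file. Such a forward reference cannot be resolved until the whole symbol table has been built.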
The assembler also allows a label to be added to the symbol table without occupying space in the program memory. A typical name for this pseudo-op is EQU, for equate.
The ARM assembler supports one pseudo-op that is particular to the ARM instruction set. In other architectures, an address would be loaded into a register by reading it from a memory location.
The assembler produces an object file that describes the instructions and data in binary format. A commonly used object file format is known as COFF (common object file format).
The object file must describe the instructions, data and any addressing information, and it usually also carries along the symbol table for later use in debugging. To understand the details of turning relocatable code into executable code, we must understand the linking process.
2.10.2. LINKING
Many assembly language programs are written as several smaller pieces rather than as a single large file. A linker allows a program to be stitched together out of several smaller pieces. The linker operates on the object files created by the assembler and modifies the assembled code to make the necessary links between files.
Some labels will be defined and used in the same file, and other labels will be defined in a single file but used elsewhere.
The place where a label is defined is known as an entry point, and a place where a label is used in another file is known as an external reference. The main job of the loader is to resolve external references based on the available entry points.
Even if the entire symbol table is not kept for later debugging purposes, the object file must at least pass the entry points.
Phases of Linker
1. First phase
2. Second phase
In the first phase, it determines the address of the start of each object file.
In the second phase, the loader merges all symbol tables from the object files into a single large table.
Workstations and PCs provide dynamically linked libraries, and some embedded computing environments also provide them. Dynamically linked libraries are linked in at the start of program execution.
2.10.3. OBJECT CODE DESIGN
When designing an embedded system, we may need to control the placement of several types of data, such as:
1. Interrupt vectors and other information for I/O devices must be placed in specific locations.
2. Memory management tables must be set up.
3. Global variables used for communication between processes must be put in locations that are accessible to all the users of that data.
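Reentrancy
Many programs should be designed to be reentrant. A program is reentrant if it can be interrupted by another call to the same function without changing the results of either call.
Example (not reentrant, because every call modifies the global variable foo):
int foo = 1;
int task1()
{
    foo = foo + 1;
    return foo;
}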
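One common way to make a function like task1 reentrant is to pass the value in as a parameter instead of keeping it in a global variable. The sketch below assumes that approach. The name task1_reentrant is hypothetical, and the overflow check is only there to keep the example well defined.

```c
#include <assert.h>
#include <limits.h>

/* Reentrant variant: the state is passed in by the caller instead of
   being read from and written to a global variable. */
int task1_reentrant(int foo)
{
    assert(foo < INT_MAX);   /* avoid signed overflow in this sketch */
    return foo + 1;
}
```

Because the function touches only its own parameter, two interleaved calls cannot disturb each other. Each caller gets a result that depends only on the argument it passed.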
Relocatability
A program is relocatable if it can be executed when loaded into different parts of memory. Relocatability requires some support from hardware that provides address calculation. But it is possible to write nonrelocatable code for nominally relocatable architectures. Any addresses that are not fixed by the architecture or system configuration should be accessed using relocatable code.
[Figure: high-level language code is translated into assembly code]
The nodes are numbered in the order in which code is generated. Since every node in the data flow graph corresponds to an operation that is directly supported by the instruction set, each node can be translated directly into instructions. The ARM code for the above expression is shown in the figure below.
One optimization is to reuse a register whose value is no longer needed. In the case of the intermediate values w, x and y, we know that they cannot be used after the end of the expression. The final result z may in fact be used in a C assignment and the value reused later in the program.
; operator 1 (+)
ADR r4, a          ; get address for a
LDR r1, [r4]       ; load a
ADR r4, b          ; get address for b
LDR r2, [r4]       ; load b
ADD r3, r1, r2     ; put w into r3
; operator 2 (-)
ADR r4, c          ; get address for c
LDR r4, [r4]       ; load c
...
; operator 4 (+)
ADD r8, r7, r3     ; operator 4, puts x into r8
; assign to x
ADR r1, x
STR r8, [r1]       ; assigns to x location
Fig. 2.18.
2.11.3. PROCEDURES
The creation of procedures is another major problem in code generation. Generating code for a procedure is relatively straightforward. The code generated at the procedure definition must handle the procedure call and return.
In modern programming languages, the CPU's subroutine call mechanism is usually not sufficient to directly support procedures. The procedure stack and the procedure linkage are the mechanisms used to support them.
Procedure Linkage
The procedure linkage mechanism provides a way for the program to pass parameters into the procedure and for the procedure to return a value. It also helps in restoring the values of registers that the procedure has modified.
All procedures in a given programming language use the same linkage mechanism. The mechanism can also be used to call handwritten assembly language routines from compiled code.
Procedure Stack
The procedure stack has two pointers: the stack pointer defines the end of the current frame, and the frame pointer defines the end of the last frame.
The procedure can refer to an element in the frame by addressing relative to the stack pointer. When a new procedure is called, the stack pointer and frame pointer are modified to push another frame onto the stack.
Example
Linked list
Array
Queue
Structure
Union etc.
Array
An array is an interesting data structure because the address of an array element is generally computed at run time. Arrays come in three kinds:
1. One-dimensional arrays
2. Two-dimensional arrays
3. Multi-dimensional arrays
(i) One-Dimensional Array
A one-dimensional array uses only one subscript. Consider a one-dimensional array whose elements are written a[i], where i is the index of the element. The memory layout of a one-dimensional array is shown below.
[Figure: memory layout of a one-dimensional array, with a pointing to a[0], followed by a[1] and the later elements]
The zeroth element is stored as the first element of the array, the first element directly below it, and so on. We can create a pointer for the array; an array pointer is a variable that contains the address of another variable.
The array pointer points to the array head, namely a[0]. If we call that pointer aptr for convenience, then we can rewrite the reading of a[i] as
*(aptr + i)
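The pointer form of the array access can be checked with a few lines of C. The function name read_element and the array length N are hypothetical and used only for this sketch.

```c
#include <assert.h>

#define N 5   /* array length; hypothetical value */

/* Read a[i] through a pointer to the array head: *(aptr + i) is the same as a[i]. */
int read_element(const int *aptr, int i)
{
    assert(aptr != 0 && i >= 0 && i < N);
    return *(aptr + i);
}
```

The addition aptr + i is scaled by the element size, so the address computed at run time is the address of a[0] plus i times the size of one element.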
(ii) Two-Dimensional Arrays
A two-dimensional array uses two subscripts. Two-dimensional arrays are more challenging because there are multiple possible ways to lay out a two-dimensional array in memory.
One form of memory layout for two-dimensional arrays is row major. In row-major layout, the inner variable of the array (j in a[i, j]) varies most quickly.
a[0,0]
a[0,1]
...
a[1,0]
a[1,1]
Two-dimensional arrays also require more sophisticated addressing. First we must know the size of the array. In row-major form, if the a[] array is of size N x M, then we can turn the two-dimensional array access into a one-dimensional array access.
a[i, j]
becomes
a[i * M + j]
where the maximum value for j is M - 1.
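The row-major address calculation can be sketched in C as a helper that reads element (i, j) from a flat one-dimensional array. The dimensions N and M and the function name get2d are assumptions chosen for illustration.

```c
#include <assert.h>

#define N 3   /* number of rows; hypothetical value */
#define M 4   /* number of columns; hypothetical value */

/* Read a[i, j] from an N x M array stored in row-major order in flat[]. */
int get2d(const int *flat, int i, int j)
{
    assert(i >= 0 && i < N && j >= 0 && j < M);
    return flat[i * M + j];
}
```

With M = 4, element (1, 0) lands at flat index 4, directly after the last element of row 0. This is what it means for j to vary most quickly.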
Loop Transformations
Loops are important program structures because they are compactly described in the source code and yet they use a large fraction of the computation time. Many techniques have been designed to optimize loops.
Loop Unrolling
A simple and useful transformation is known as loop unrolling. It is important
because it helps to expose parallelism that can be used by later stages of the
compiler.
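Loop unrolling can be illustrated with a small C sketch that shows a loop next to a version unrolled by a factor of two. The function names, the array names and the loop count N are hypothetical. The sketch assumes N is even, so no cleanup loop is needed.

```c
#include <assert.h>

#define N 8   /* loop count; hypothetical value, assumed even */

/* Original loop: one multiplication per iteration. */
void mul_rolled(int *a, const int *b, const int *c)
{
    for (int i = 0; i < N; i++)
        a[i] = b[i] * c[i];
}

/* Unrolled by two: each iteration now holds two independent operations. */
void mul_unrolled(int *a, const int *b, const int *c)
{
    assert(N % 2 == 0);   /* no cleanup loop in this sketch */
    for (int i = 0; i < N; i += 2) {
        a[i]     = b[i]     * c[i];
        a[i + 1] = b[i + 1] * c[i + 1];
    }
}
```

The two statements in the unrolled body do not depend on each other, so a later compiler stage is free to schedule them in parallel. The price is larger code.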
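Loop Fusion
Loop fusion combines two or more loops into a single loop. For this transformation to be legal, two conditions must be satisfied: the loops must iterate over the same values, and the loop bodies must not have dependencies that would be violated when they are executed together.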
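A small C sketch of loop fusion follows. The function names, the arrays and the loop count N are hypothetical. The two loops iterate over the same range and neither body reads what the other writes, so fusing them is legal here.

```c
#include <assert.h>

#define N 6   /* loop count; hypothetical value */

/* Before fusion: two separate loops over the same index range. */
void separate_loops(const int *a, int *b, int *c)
{
    for (int i = 0; i < N; i++)
        b[i] = a[i] + 1;
    for (int i = 0; i < N; i++)
        c[i] = a[i] * 2;
}

/* After fusion: one loop does the work of both bodies. */
void fused_loop(const int *a, int *b, int *c)
{
    assert(b != c);   /* the two outputs must be distinct arrays */
    for (int i = 0; i < N; i++) {
        b[i] = a[i] + 1;
        c[i] = a[i] * 2;
    }
}
```

The fused version pays the loop overhead once and reads each a[i] while it is still close at hand. Read in the opposite direction, from the single loop to the two loops, the same pair is an example of splitting one loop into several.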
Loop Distribution
Loop distribution is the opposite of loop fusion; that is, it decomposes a single loop into multiple loops.
Loop Tiling
Loop tiling breaks up a loop into a set of nested loops, with each inner loop performing the operation on a subset of the data.
[Figure: the loop nest before tiling, beginning for (i = 0; i < N; i++), and after tiling, beginning for (i = 0; i < N; i += 2)]
Advantage
Loop tiling changes the order in which array elements are accessed, which allows us to better control the behavior of the cache during loop execution.
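The change in access order can be seen in a small C sketch that sums a matrix with and without tiling. The matrix size N, the tile size TILE and the function names are hypothetical. The sketch assumes TILE divides N, so no edge handling is needed.

```c
#include <assert.h>

#define N 4      /* matrix dimension; hypothetical value */
#define TILE 2   /* tile size; hypothetical value, must divide N in this sketch */

/* Untiled: visit the matrix row by row. */
long sum_plain(int a[N][N])
{
    long sum = 0;
    for (int i = 0; i < N; i++)
        for (int j = 0; j < N; j++)
            sum += a[i][j];
    return sum;
}

/* Tiled: the outer loops step from tile to tile, and the inner loops
   visit the TILE x TILE elements inside one tile. */
long sum_tiled(int a[N][N])
{
    long sum = 0;
    assert(N % TILE == 0);
    for (int i = 0; i < N; i += TILE)
        for (int j = 0; j < N; j += TILE)
            for (int ii = i; ii < i + TILE; ii++)
                for (int jj = j; jj < j + TILE; jj++)
                    sum += a[ii][jj];
    return sum;
}
```

Both functions compute the same sum and differ only in the order of the memory accesses. In a real program the tile size would be chosen so that the data of one tile fits in the cache.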
Dead Code Elimination
Dead code is code that can never be executed. Dead code can be generated by programmers or by compilers. Dead code can be identified by reachability analysis.
Reachability analysis is the process of finding the statements or instructions from which a given piece of code can be reached. If a given piece of code cannot be reached, or it can be reached only by a piece of code that is itself unreachable from the main program, then it can be eliminated. Dead code elimination analyzes the code for reachability and trims away dead code.
Register Allocation
Register allocation is a very important compilation phase. For a block of code, we want to choose assignments of variables (both declared and temporary) to registers so as to minimize the total number of required registers.
If a section of code requires more registers than are available, we must spill some of the values out to memory temporarily. After computing some values, we write them to temporary memory locations, reuse those registers in other computations and then reread the old values from the temporary locations to resume work.
Scheduling
In this context, scheduling means choosing the order in which instructions are issued so that the CPU's resources are used well. CPU manufacturers generally disclose enough information about the microarchitecture to allow us to schedule instructions even when they do not provide a detailed description of the CPU's internals.
Reservation Table
A reservation table is used to keep track of CPU resources during instruction scheduling, as shown in the figure below.
[Figure: reservation table with rows for times t + 1, t + 2 and t + 3 and X marks in the resource columns]
The rows of the table represent time slots and the columns represent resources. Before scheduling an instruction to be executed at a particular time, we check the reservation table to determine whether all resources needed by the instruction are available at that time.
Upon scheduling the instruction, we update the table to note all resources used by that instruction. The reservation table provides a good summary of the state of an instruction scheduling problem in progress.
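The check-then-update use of a reservation table can be sketched in C. This is a simplified model under stated assumptions: an instruction is assumed to occupy its resources for a single time slot, and the names SLOTS, RESOURCES, can_schedule and schedule are hypothetical.

```c
#include <assert.h>

#define SLOTS 8       /* number of time slots tracked; hypothetical value */
#define RESOURCES 2   /* number of CPU resources (columns); hypothetical value */

static unsigned char table[SLOTS][RESOURCES];   /* 1 = resource reserved */

/* Check whether every resource the instruction needs is free at time t.
   needs[r] is nonzero if the instruction uses resource r. */
int can_schedule(int t, const int *needs)
{
    assert(t >= 0 && t < SLOTS);
    for (int r = 0; r < RESOURCES; r++)
        if (needs[r] && table[t][r])
            return 0;
    return 1;
}

/* Reserve the resources at time t if they are all free; returns 1 on success. */
int schedule(int t, const int *needs)
{
    if (!can_schedule(t, needs))
        return 0;
    for (int r = 0; r < RESOURCES; r++)
        if (needs[r])
            table[t][r] = 1;
    return 1;
}
```

A scheduler built on this would try successive time slots for each instruction until schedule succeeds. An instruction that holds a resource for several cycles would need one reserved row per cycle.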
Software Pipelining
Software pipelining is a technique for reordering instructions across several loop iterations to reduce pipeline bubbles.
Instruction Selection
varying complexity.
The CPU pipeline and cache act as windows into our program. The cache has a major effect on program performance, and the cache's behavior depends in part on the data values input to the program.