Download as pdf or txt
Download as pdf or txt
You are on page 1of 13

Chapter 5: Unfolding

Keshab K. Parhi
• Unfolding ≡ Parallel Processing
2-unfolded
(1) (1)
A B (1) (1) 0,2,4,….
A0 B0
2D
T’∞ = 2ut D
A0àB0=> A2àB2=> A4àB4=>…..
A1àB1=> A3àB3=> A5àB5=>….. (1) (1) 1,3,5,….
2 nodes & 2 edges A1 B1
T∞ = (1+1)/2 = 1ut
T’∞ = 2ut D
4 nodes & 4 edges
T∞ = 2/2 = 1ut

• In a ‘J’ unfolded system each delay is J-slow => if input to a delay element
is the signal x(kJ + m), the output is x((k-1)J + m) = x(kJ + m – J).

Chap. 5 2
• Algorithm for unfolding:
Ø For each node U in the original DFG, draw J node U0 , U1 ,
U2 ,…, UJ-1 .
Ø For each edge U → V with w delays in the original DFG,
draw the J edges Ui → V(i + w)%J with (i+w)/J delays for i
= 0, 1, …, J-1.
37D U0 9D V0
U V
U1 9D V1
w = 37
⇒(i+w)/4 = 9, i = 0,1,2 U2
9D
V2
=10, i = 3
U3 10D V3

ØUnfolding of an edge with w delays in the original DFG


produces J-w edges with no delays and w edges with 1delay in
J unfolded DFG for w < J.
ØUnfolding preserves precedence constraints of a DSP
program.
Chap. 5 3
2D

U
D
V U0 V0 2D T0
3-unfolded
5D 6D
U1 V1 2D T1
T DFG
2D
U2 D V2 2D T2
D
Properties of unfolding :
Ø Unfolding preserves the number of delays in a DFG.
This can be stated as follows:
w/J + (w+1)/J + … + (w + J - 1)/J = w
Ø J-unfolding of a loop l with wl delays in the original DFG
leads to gcd(wl , J) loops in the unfolded DFG, and each
of these gcd(wl , J) loops contains wl/ gcd(wl , J) delays
and J/ gcd(wl , J) copies of each node that appears in l.
Ø Unfolding a DFG with iteration bound T∞ results in a J-
unfolded DFG with iteration bound JT∞ .

Chap. 5 4
• Applications of Unfolding
Ø Sample Period Reduction
Ø Parallel Processing
• Sample Period Reduction
Ø Case 1 : A node in the DFG having computation
time greater than T∞ .
Ø Case 2 : Iteration bound is not an integer.
Ø Case 3 : Longest node computation is larger
than the iteration bound T∞, and T∞ is not an
integer.

Chap. 5 5
Case 1 :

ØThe original DFG cannot


have sample period equal to
the iteration bound
because a node computation
time is more than iteration
bound

Ø If the computation time of a node ‘U’, tu, is greater than the


iteration bound T∞, then tu/T ∞  - unfolding should be used.
Ø In the example, tu = 4, and T∞ = 3, so 4/3 - unfolding i.e., 2-
unfolding is used.

Chap. 5 6
• Case 2 :

ØThe original DFG cannot


have sample period equal
to the iteration bound
because the iteration
bound is not an integer.

ØIf a critical loop bound is of the form tl/wl where tl and wl


are mutually co-prime, then wl-unfolding should be used.
ØIn the example tl = 60 and wl = 45, then tl/wl should be
written as 4/3 and 3-unfolding should be used.
•Case 3 : In this case the minimum unfolding factor that allows
the iteration period to equal the iteration bound is the min
value of J such that JT∞ is an integer and is greater than the
longest node computation time.
Chap. 5 7
• Parallel Processing :
Ø Word- Level Parallel Processing
Ø Bit Level Parallel processing
vBit-serial processing
vBit-parallel processing
vDigit-serial processing

Chap. 5 8
• Bit-Level Parallel Processing

a0 b0
a1 b1
a2 Bit-parallel a3 a2 a1 a0 Bit-serial b3 b2 b1 b0
b2
a3 b3

a2 a0 b2 b0
Digit-Serial
(Digit-size = 2)
a3 a1 b3 b1

a3 a2 a1 a0 s3 s2 s1 s0
Bit-serial
b3 b2 b1 b0
adder
D
4l+0 4l+1,2,3
0
Chap. 5 9
• The following assumptions are made when
unfolding an edge U→V :
Ø The wordlength W is a multiple of the unfolding factor J,
i.e. W = W’J.
Ø All edges into and out of the switch have no delays.
• With the above two assumptions an edge U→V can
be unfolded as follows :
Ø Write the switching instance as
Wl + u = J( W’l + u/J ) + (u%J)
Ø Draw an edge with no delays in the unfolded graph from
the node Uu%J to the node Vu%J , which is switched at time
instance ( W’l + u/J ) .

Chap. 5 10
Example : 4l + 3

U0 V0
12l + 1, 7, 9, 11 Unfolding by 3 4l + 0,2
U V U1 V1
4l + 3

U2 V2

To unfold the DFG by J=3, the switching instances are as follows


12l + 1 = 3(4l + 0) + 1
12l + 7 = 3(4l + 2) + 1
12l + 9 = 3(4l + 3) + 0
12l + 11 = 3(4l + 3) + 2

Chap. 5 11
• Unfolding a DFG containing an edge having a switch and a
positive number of delays is done by introducing a dummy
node.
2D 2D 6l + 1, 5
A 6l + 1, 5 A D
Inserting
C C
Dummy node
B 6l + 0, 2, 3, 4 B 6l + 0, 2, 3, 4

A0 D D0
2l + 0

2l + 1 C0 B0 C0
A1 D1
D A2 D 2l + 0
A2 D2 2l + 0

C1 C1
B0 2l + 1
B1 2l + 1

B1 2l + 1 B2 2l + 0

C2 C2
Chap. 5
B2 2l + 0 A0 2l + 1 12
• If the word-length, W, is not a multiple of the
unfolding factor, J, then expand the switching
instances with periodicity lcm(W,J)
• Example: Consider W=4, J=3. Then lcm(4,3) = 12.
For this case, 4l = 12l + {0,4,8), 4l+1 = 12l + {1,5,9},
4l+2 = 12l + {2,6,10}, 4l+3 = 12l + {3,7,11}. All new
switching instances are now multiples of J=3.

Chap. 5 13

You might also like