Introduction To Distributed Arithmetic K. Sridharan, IIT Madras
Distributed Arithmetic
$$y = A_1 x_1 + A_2 x_2 + \cdots + A_K x_K$$
i.e.
$$y = \sum_{k=1}^{K} A_k x_k$$
A numerical example
A = [32, 45, 78, 23], x = [42, 20, −22, 67] (K = 4)
y = 32 × 42 + 45 × 20 + 78 × (−22) + 23 × 67
y = 1344 + 900 − 1716 + 1541 = 2069
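The arithmetic above can be checked in a few lines of Python. This is only a sanity check of the sum of products; the names `A`, `x` and `products` are illustrative, and `A` takes the coefficient values that appear in the expanded product line.

```python
# Coefficients as they appear in the expanded products, and the input vector (K = 4).
A = [32, 45, 78, 23]
x = [42, 20, -22, 67]

# One explicit multiplication per element, then a single sum.
products = [a_k * x_k for a_k, x_k in zip(A, x)]
y = sum(products)

print(products)  # [1344, 900, -1716, 1541]
print(y)         # 2069
```

These K explicit multiplications are what DA replaces with table look-ups.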
How does distributed arithmetic work?
Distributed Arithmetic (DA) is a bit-serial technique, so it can appear to be slow.
It turns out that when the number of elements in a vector is nearly the same as the word size, DA is quite fast.
DA replaces the explicit multiplications with ROM look-ups, which makes it an efficient technique to implement on Field Programmable Gate Arrays (FPGAs).
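What does DA achieve?
With each x_k written as an N-bit two's-complement fraction, where b_{k0} is the sign bit,
$$x_k = -b_{k0} + \sum_{n=1}^{N-1} b_{kn}\, 2^{-n},$$
the inner product becomes
$$y = -\sum_{k=1}^{K} (b_{k0} \cdot A_k) + \sum_{k=1}^{K} \left[ (A_k \cdot b_{k1})\,2^{-1} + (A_k \cdot b_{k2})\,2^{-2} + \cdots + (A_k \cdot b_{k(N-1)})\,2^{-(N-1)} \right]$$
Writing out the sum over k:
$$\begin{aligned}
y = {} & -\left[ b_{10} \cdot A_1 + b_{20} \cdot A_2 + \cdots + b_{K0} \cdot A_K \right] \\
& + \left[ (b_{11} \cdot A_1)\,2^{-1} + (b_{12} \cdot A_1)\,2^{-2} + \cdots + (b_{1(N-1)} \cdot A_1)\,2^{-(N-1)} \right] \\
& + \left[ (b_{21} \cdot A_2)\,2^{-1} + (b_{22} \cdot A_2)\,2^{-2} + \cdots + (b_{2(N-1)} \cdot A_2)\,2^{-(N-1)} \right] \\
& \;\;\vdots \\
& + \left[ (b_{K1} \cdot A_K)\,2^{-1} + (b_{K2} \cdot A_K)\,2^{-2} + \cdots + (b_{K(N-1)} \cdot A_K)\,2^{-(N-1)} \right]
\end{aligned}$$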
Further simplification (collecting the terms that share the same power of 2) leads to
$$\begin{aligned}
y = {} & -\left[ b_{10} \cdot A_1 + b_{20} \cdot A_2 + \cdots + b_{K0} \cdot A_K \right] \\
& + \left[ (b_{11} \cdot A_1) + (b_{21} \cdot A_2) + \cdots + (b_{K1} \cdot A_K) \right] 2^{-1} \\
& + \left[ (b_{12} \cdot A_1) + (b_{22} \cdot A_2) + \cdots + (b_{K2} \cdot A_K) \right] 2^{-2} \\
& \;\;\vdots \\
& + \left[ (b_{1(N-1)} \cdot A_1) + (b_{2(N-1)} \cdot A_2) + \cdots + (b_{K(N-1)} \cdot A_K) \right] 2^{-(N-1)}
\end{aligned}$$
The rewrite showing the interchange of the summations:
$$y = -\sum_{k=1}^{K} b_{k0} \cdot A_k + \sum_{n=1}^{N-1} \left[ b_{1n} \cdot A_1 + b_{2n} \cdot A_2 + \cdots + b_{Kn} \cdot A_K \right] 2^{-n}$$
$$y = -\sum_{k=1}^{K} A_k \cdot b_{k0} + \sum_{n=1}^{N-1} \left[ \sum_{k=1}^{K} A_k \cdot b_{kn} \right] 2^{-n} \tag{4}$$
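The interchange of summations in equation (4) can be checked numerically. The sketch below assumes N-bit two's-complement fractions in [-1, 1) with b_{k0} as the sign bit; the helper names `to_bits` and `inner_product_eq4` and the test vectors are illustrative, not taken from the slides.

```python
def to_bits(x, n_bits):
    """Return [b0, b1, ..., b_{N-1}] for x in [-1, 1); b0 is the sign bit."""
    scaled = round(x * 2 ** (n_bits - 1))        # integer in [-2^(N-1), 2^(N-1) - 1]
    assert -2 ** (n_bits - 1) <= scaled < 2 ** (n_bits - 1), "x out of range"
    unsigned = scaled % (2 ** n_bits)            # two's-complement bit pattern
    return [(unsigned >> (n_bits - 1 - n)) & 1 for n in range(n_bits)]


def inner_product_eq4(A, bits):
    """Evaluate equation (4): sign-bit term plus bit-plane sums weighted by 2^-n."""
    K, N = len(A), len(bits[0])
    sign_term = -sum(A[k] * bits[k][0] for k in range(K))
    plane_terms = sum(
        sum(A[k] * bits[k][n] for k in range(K)) * 2.0 ** -n
        for n in range(1, N)
    )
    return sign_term + plane_terms


A = [0.72, -0.30, 0.95, 0.11]     # illustrative coefficients
x = [0.5, -0.25, 0.125, -0.75]    # illustrative inputs, exactly representable in 8 bits
bits = [to_bits(x_k, 8) for x_k in x]

direct = sum(a_k * x_k for a_k, x_k in zip(A, x))
via_eq4 = inner_product_eq4(A, bits)
print(direct, via_eq4)            # the two values agree up to floating-point rounding
```

The inner sum over k depends only on the K bits b_{1n} ... b_{Kn} at one bit position, which is what makes a look-up table possible.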
How is DA realized in hardware?
Consider equation (4) rewritten as:
$$y = \sum_{n=1}^{N-1} \left[ \sum_{k=1}^{K} A_k b_{kn} \right] 2^{-n} + \sum_{k=1}^{K} A_k (-b_{k0})$$
$\left[ \sum_{k=1}^{K} A_k b_{kn} \right]$ has only $2^K$ possible values (one per combination of the K bits)
$\sum_{k=1}^{K} A_k (-b_{k0})$ has only $2^K$ possible values
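Because each bracketed sum takes one of only 2^K values, the values can be precomputed and addressed by the K bits of one bit position. The following is a minimal software sketch of that idea, assuming an LSB-first shift-and-add accumulator and a subtraction at the sign-bit step; the function names, the address ordering (b_{1n} as the most significant address bit) and the test vectors are assumptions made for illustration.

```python
def to_bits(x, n_bits):
    """Return [b0, b1, ..., b_{N-1}] for x in [-1, 1); b0 is the sign bit."""
    scaled = round(x * 2 ** (n_bits - 1))
    assert -2 ** (n_bits - 1) <= scaled < 2 ** (n_bits - 1), "x out of range"
    unsigned = scaled % (2 ** n_bits)
    return [(unsigned >> (n_bits - 1 - n)) & 1 for n in range(n_bits)]


def build_rom(A):
    """ROM word at address (b1 b2 ... bK) is the sum of A_k over the set bits."""
    K = len(A)
    return [
        sum(A[k] for k in range(K) if (addr >> (K - 1 - k)) & 1)
        for addr in range(2 ** K)
    ]


def da_inner_product(A, x, n_bits):
    """Bit-serial inner product: one ROM look-up per bit position, no multiplier."""
    K = len(A)
    rom = build_rom(A)
    bits = [to_bits(x_k, n_bits) for x_k in x]
    acc = 0.0
    for n in range(n_bits - 1, -1, -1):          # LSB first, sign bit last
        addr = 0
        for k in range(K):
            addr = (addr << 1) | bits[k][n]      # b1n becomes the address MSB
        if n == 0:
            acc -= rom[addr]                     # sign-bit slice is subtracted
        else:
            acc = (acc + rom[addr]) / 2          # add, then shift right by one bit
    return acc


A = [0.72, -0.30, 0.95, 0.11]     # illustrative coefficients
x = [0.5, -0.25, 0.125, -0.75]    # illustrative inputs, exactly representable in 8 bits
rom = build_rom(A)
y_da = da_inner_product(A, x, 8)
direct = sum(a_k * x_k for a_k, x_k in zip(A, x))
print(len(rom), y_da, direct)
```

The loop runs N times regardless of K, which is why DA compares well with a single multiply-accumulate unit when K is close to the word size.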
[Table residue: only one row survives, input code b1n b2n b3n b4n = 0 1 0 0 with contents A2 = −0.30]
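In offset code, x_k is written using each bit and its complement ($\bar{b}$ denotes the complement of bit b):
$$x_k = \frac{1}{2} \left[ -\left( b_{k0} - \bar{b}_{k0} \right) + \sum_{n=1}^{N-1} \left( b_{kn} - \bar{b}_{kn} \right) 2^{-n} - 2^{-(N-1)} \right]$$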
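Rewriting x_k differently, we have
$$x_k = \frac{1}{2} \left[ \sum_{n=0}^{N-1} c_{kn}\, 2^{-n} - 2^{-(N-1)} \right]$$
where $c_{kn} = b_{kn} - \bar{b}_{kn}$ for n ≠ 0 and $c_{k0} = -(b_{k0} - \bar{b}_{k0})$, so that each $c_{kn}$ is either −1 or +1.
Substituting into the inner product:
$$y = \frac{1}{2} \sum_{k=1}^{K} A_k \left[ \sum_{n=0}^{N-1} c_{kn}\, 2^{-n} - 2^{-(N-1)} \right]$$
$$y = \frac{1}{2} \sum_{k=1}^{K} \sum_{n=0}^{N-1} A_k c_{kn}\, 2^{-n} - \frac{1}{2} \sum_{k=1}^{K} A_k\, 2^{-(N-1)}$$
$$y = \sum_{n=0}^{N-1} \left[ \frac{1}{2} \sum_{k=1}^{K} A_k c_{kn} \right] 2^{-n} - \frac{1}{2} \sum_{k=1}^{K} A_k\, 2^{-(N-1)} \tag{9}$$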
The New Formulation in Offset Code
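$$y = \sum_{n=0}^{N-1} \left[ \frac{1}{2} \sum_{k=1}^{K} A_k c_{kn} \right] 2^{-n} - \frac{1}{2} \sum_{k=1}^{K} A_k\, 2^{-(N-1)}$$
If we let
$$Q(c_{1n} c_{2n} \cdots c_{Kn}) = \frac{1}{2} \sum_{k=1}^{K} A_k c_{kn} \quad \text{and} \quad Q(0) = -\frac{1}{2} \sum_{k=1}^{K} A_k \quad \text{(a constant)}$$
then
$$y = \sum_{n=0}^{N-1} Q(c_{1n} c_{2n} \cdots c_{Kn})\, 2^{-n} + 2^{-(N-1)}\, Q(0)$$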
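The offset-code formula can be checked against a direct inner product. This sketch assumes the usual digit mapping c_{kn} = b_{kn} − b̄_{kn} = 2·b_{kn} − 1 for n ≥ 1, with the sign reversed for the sign bit (c_{k0} = 1 − 2·b_{k0}); the helper names and test vectors are illustrative.

```python
def to_bits(x, n_bits):
    """Return [b0, b1, ..., b_{N-1}] for x in [-1, 1); b0 is the sign bit."""
    scaled = round(x * 2 ** (n_bits - 1))
    assert -2 ** (n_bits - 1) <= scaled < 2 ** (n_bits - 1), "x out of range"
    unsigned = scaled % (2 ** n_bits)
    return [(unsigned >> (n_bits - 1 - n)) & 1 for n in range(n_bits)]


def offset_digits(bits):
    """Map bits to digits in {-1, +1}; the sign-bit digit has its sign reversed."""
    return [1 - 2 * b if n == 0 else 2 * b - 1 for n, b in enumerate(bits)]


def Q(A, c_column):
    """Q(c1n c2n ... cKn) = 1/2 * sum_k A_k * c_kn for one bit position n."""
    return 0.5 * sum(a_k * c_k for a_k, c_k in zip(A, c_column))


def inner_product_obc(A, x, n_bits):
    """y = sum_n Q(c_n) * 2^-n + 2^-(N-1) * Q(0), with Q(0) = -1/2 * sum_k A_k."""
    c = [offset_digits(to_bits(x_k, n_bits)) for x_k in x]
    q0 = -0.5 * sum(A)
    total = sum(Q(A, [c_k[n] for c_k in c]) * 2.0 ** -n for n in range(n_bits))
    return total + 2.0 ** -(n_bits - 1) * q0


A = [0.72, -0.30, 0.95, 0.11]     # illustrative coefficients
x = [0.5, -0.25, 0.125, -0.75]    # illustrative inputs, exactly representable in 8 bits
y_obc = inner_product_obc(A, x, 8)
direct = sum(a_k * x_k for a_k, x_k in zip(A, x))
print(y_obc, direct)              # the two values agree up to floating-point rounding
```

Since every digit is ±1, negating all K digits negates Q, which is the property the reduced table relies on.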
The gain: storage is reduced to 8 rows

b1n b2n b3n b4n   c1n c2n c3n c4n   Contents
 0   0   0   0    -1  -1  -1  -1    -1/2 (A1 + A2 + A3 + A4) = -0.74
 0   0   0   1    -1  -1  -1   1    -1/2 (A1 + A2 + A3 - A4) = -0.63
 0   0   1   0    -1  -1   1  -1    -1/2 (A1 + A2 - A3 + A4) = 0.21
 0   0   1   1    -1  -1   1   1    -1/2 (A1 + A2 - A3 - A4) = 0.32
 0   1   0   0    -1   1  -1  -1    -1/2 (A1 - A2 + A3 + A4) = -1.04
 0   1   0   1    -1   1  -1   1    -1/2 (A1 - A2 + A3 - A4) = -0.93
 0   1   1   0    -1   1   1  -1    -1/2 (A1 - A2 - A3 + A4) = -0.09
 0   1   1   1    -1   1   1   1    -1/2 (A1 - A2 - A3 - A4) = 0.02
 1   0   0   0     1  -1  -1  -1    -1/2 (-A1 + A2 + A3 + A4) = -0.02
 1   0   0   1     1  -1  -1   1    -1/2 (-A1 + A2 + A3 - A4) = 0.09
 1   0   1   0     1  -1   1  -1    -1/2 (-A1 + A2 - A3 + A4) = 0.93
 1   0   1   1     1  -1   1   1    -1/2 (-A1 + A2 - A3 - A4) = 1.04
 1   1   0   0     1   1  -1  -1    -1/2 (-A1 - A2 + A3 + A4) = -0.32
 1   1   0   1     1   1  -1   1    -1/2 (-A1 - A2 + A3 - A4) = -0.21
 1   1   1   0     1   1   1  -1    -1/2 (-A1 - A2 - A3 + A4) = 0.63
 1   1   1   1     1   1   1   1    -1/2 (-A1 - A2 - A3 - A4) = 0.74

Inverse symmetry: each row in the lower half (b1n = 1) is the negative of the upper-half row whose input bits are all complemented (for example 0111 holds 0.02 and 1000 holds -0.02), so only the 8 rows of one half need to be stored.
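The table and its inverse symmetry can be regenerated programmatically. The coefficient values below are not stated explicitly in the surviving text; they are inferred by solving the table entries (for example, rows 0000 and 0100 give A2 = -0.30), so treat `A` as an assumption. The names `contents`, `table` and `half_rom` are illustrative.

```python
# Coefficients inferred from the table entries (an assumption, see the note above).
A = [0.72, -0.30, 0.95, 0.11]
K = len(A)


def contents(addr):
    """Table word for input code (b1n b2n b3n b4n), using c_kn = 2*b_kn - 1."""
    bits = [(addr >> (K - 1 - k)) & 1 for k in range(K)]
    c = [2 * b - 1 for b in bits]
    return 0.5 * sum(a_k * c_k for a_k, c_k in zip(A, c))


table = [contents(addr) for addr in range(2 ** K)]
for addr, word in enumerate(table):
    print(format(addr, "04b"), f"{word:+.2f}")

# Inverse symmetry: the word at an address is minus the word at the complemented address.
symmetric = all(
    abs(table[addr] + table[2 ** K - 1 - addr]) < 1e-12 for addr in range(2 ** K)
)

# Only the half with b1n = 0 has to be stored.
half_rom = table[: 2 ** (K - 1)]
print(symmetric, len(half_rom))
```

Storing `half_rom` alone halves the memory from 2^K to 2^(K-1) words, at the cost of a conditional negation when the table is read.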
DA Hardware with Offset Binary Coding
[Figure: DA hardware with offset binary coding. Annotations: x1 selects between the two symmetric halves; Ts indicates when the sign bit arrives.]
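The behaviour described by the two annotations can be imitated in software. The figure itself is not reproduced here, so the wiring below is an assumption based on a common arrangement: the x1 bit is XORed with the remaining input bits to form the address into the 8-word half table, the fetched word is negated when (x1 bit XOR Ts) is 1, and the accumulator starts at Q(0). All function names and test vectors are illustrative.

```python
def to_bits(x, n_bits):
    """Return [b0, b1, ..., b_{N-1}] for x in [-1, 1); b0 is the sign bit."""
    scaled = round(x * 2 ** (n_bits - 1))
    assert -2 ** (n_bits - 1) <= scaled < 2 ** (n_bits - 1), "x out of range"
    unsigned = scaled % (2 ** n_bits)
    return [(unsigned >> (n_bits - 1 - n)) & 1 for n in range(n_bits)]


def build_half_rom(A):
    """2^(K-1) words: the table rows with b1n = 0 (c1n = -1), addressed by b2n..bKn."""
    K = len(A)
    rom = []
    for addr in range(2 ** (K - 1)):
        c = [-1] + [2 * ((addr >> (K - 2 - j)) & 1) - 1 for j in range(K - 1)]
        rom.append(0.5 * sum(a_k * c_k for a_k, c_k in zip(A, c)))
    return rom


def da_obc_hardware(A, x, n_bits):
    """Bit-serial evaluation using only the half table (assumed wiring, see lead-in)."""
    K = len(A)
    half_rom = build_half_rom(A)
    bits = [to_bits(x_k, n_bits) for x_k in x]
    acc = -0.5 * sum(A)                           # accumulator preset to Q(0)
    for n in range(n_bits - 1, -1, -1):           # LSB first, sign bit last
        ts = 1 if n == 0 else 0                   # Ts flags the sign-bit time
        x1_bit = bits[0][n]
        addr = 0
        for k in range(1, K):
            addr = (addr << 1) | (bits[k][n] ^ x1_bit)   # x1 selects the half
        word = half_rom[addr]
        if x1_bit ^ ts:
            word = -word                          # subtract instead of add
        acc += word
        if n != 0:
            acc /= 2                              # shift right between bit times
    return acc


A = [0.72, -0.30, 0.95, 0.11]     # coefficients inferred from the table (assumption)
x = [0.5, -0.25, 0.125, -0.75]    # illustrative inputs, exactly representable in 8 bits
half_rom = build_half_rom(A)
y_hw = da_obc_hardware(A, x, 8)
direct = sum(a_k * x_k for a_k, x_k in zip(A, x))
print(len(half_rom), y_hw, direct)
```

The XOR on the address and the add/subtract control together recover the full 16-row table from 8 stored words.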
How to reduce ROM size further?