Download as pdf or txt
Download as pdf or txt
You are on page 1of 3

Lecture 7 EE 486

Multiplication Add-and-Shift Algorithm


EE 486 lecture 7: Integer Multiplication
Multiplicand “X” 110 6
Multiplier “Y” x 101 x5
110
Partial Products “PP” 000
110
M. J. Flynn Result “S” 11110 30

slides prepared by Albert Liddicoat and Hossam Fahmy X Shift Y


0

ALU

Shift P

Computer Architecture & Arithmetic Group 1 Stanford University Computer Architecture & Arithmetic Group 2 Stanford University

Parallel Multiplication Generation of PP’s Using ROMs


1 Simultaneous generation of partial products
4b x 4b Multiply Using
2 Parallel reduction of partial products
256x8-bit ROM
3 Carry Propagate Addition (CPA) 8b x 8b Multiply Using
256x8-bit ROMs
Address

X
Data

R
Parallel generation of partial products Y
X7 X6 X5 X4 X3 X2 X1 X0

Y7 Y6 Y5 Y4 Y3 Y2 Y1 Y0
Using AND gate as 1x1-bit multiplier DOT representation x

.. ... .. .
X3 X2 X1 X0 * Y3 Y2 Y1 Y0
X2 X1 X0 X2 X1 X0
x Y2 Y1 Y0 x Y2 Y1 Y0 X7 X6 X5 X4* Y3 Y2 Y1 Y0

R5 R4 R3
X2Y0 X1Y0 X0Y0
X2Y1 X1Y1 X0Y1
X2Y2 X1Y2 X0Y2
R2 R1 R0
.
.. . . ..
X3 X2 X1 X0 * Y7 Y6 Y5 Y4

X7 X6 X5 X4 * Y7 Y6 Y5 Y4

Computer Architecture & Arithmetic Group 3 Stanford University Computer Architecture & Arithmetic Group 4 Stanford University

Generation of PP’s Using ROMs Booth’s Algorithm


Observation: We can replace a string of 1’s in the multiplier by +1 and -1.
4, 8, and 16 bit Multiply
8 x 8 bit Multiply using using 256x8bit ROMs ...0111110...
Example:
256x8b ROMs 16x16 bit ...0111110...
+1
8x8 bit -1
a
b 4x4 bit ...1000000...
c -1
d
. . . 1 0 0 0 0-1 0 . . .
n h CSA Delay X*(0 1 1 1 1 1) X*(1 0 0 0 0-1)
Y Y
4 1 0 X 1 -X -1
b 8 3 1 X 1 0 0
d a 16 7 4
X 1 0 0
c X 1 0 0
32 15 6 X 1 0 0
64 31 8 0 0 X 1

Computer Architecture & Arithmetic Group 5 Stanford University Computer Architecture & Arithmetic Group 6 Stanford University

M.J. Flynn 1
Lecture 7 EE 486

Booth’s Algorithm Modified Booth 2


+ Reduces the number of partial products which in turn reduces the
hardware and delay required to sum the partial products. • Booth 2 modified to produce at most n/2+1
- Adds Delay into the formation of the Partial Products. partial products.
• Works well for serial multiplication that can tolerate variable
latency operations by reducing the number of serial additions Algorithm: (for unsigned numbers)
required for the multiplication.
1) Pad the LSB with one zero.
• The number of serial additions depends on the data (multiplicand). 2) Pad the MSB with 2 zeros if n is even and 1 zero if n is odd.
• Worst case 8-bit multiplicand requires 8 additions 3) Divide the multiplier into overlapping groups of 3-bits.
01010101 <=> 1 -1 1 -1 1 -1 1 -1 4) Determine partial product scale factor from modified
• Parallel systems generally are designed for worst case hardware booth 2 encoding table.
and latency requirements. Standard booth 2 does not significantly 5) Compute the Multiplicand Multiples
reduce the worst case number of partial products. 6) Sum Partial Products

Computer Architecture & Arithmetic Group 7 Stanford University Computer Architecture & Arithmetic Group 8 Stanford University

Modified Booth 2 Modified Booth 2


4) Determine action from
Example: (n=8-bits unsigned) table for each 3-bit group 5) Compute Multiplicand Scale Factor (~4 gate delays)
Modified Booth 2 Mux Control
Scale Multiplier
1) Pad LSB with 1 zero Yi+1 Yi Yi-1 Factor Direct Multiplier Logic
Xi Xi-1 0
0 0 0 +0 Xi Yj Scale
2) n is even so Pad MSB with 2 zeros Factor Action
0 0 1 +X
+0 Mux 0
0 1 0 +X
0 0 Y7 Y6 Y5 Y4 Y3 Y2 Y1 Y0 0 +X Mux Xi
0 1 1 +2X
Yj+1 +2X Mux Xi-1
1 0 0 -2X Yj
1 0 1 -X -X Mux Xi’
Yj-1
3) Form 3-bit overlapping groups -2X Mux Xi-1’
1 1 0 -X PP ij
for n=8 we have 5 groups 1 1 1 -0 PP ij

Computer Architecture & Arithmetic Group 9 Stanford University Computer Architecture & Arithmetic Group 10 Stanford University

Modified Booth 2 Modified Booth 2


6) Sum Partial Products Algorithm Extension: (for signed multiplier)
• Sign extend partial products to the full width of the final result
• Logic may replace the A9, B9, C9, D9, and E9 sign extention bits. 1) Pad the LSB with one zero.
• Yi+1 bit determines if the multiple needs to be complemented 2) If n is even don’t pad the MSB ( n/2 PP’s) and
Multiplicand if n is odd sign extend the MSB by 1 bit ( n+1/2 PP’s).
Partial Products 3) Divide the multiplier into overlapping groups of 3-bits.
Yi+1 Yi YI-1
4) Determine partial product factor from table.
A9 . . . A9 A9 A8 A7 A6 A5 A4 A3 A2 A1 A0 (Y1 Y0 0) 5) Compute the Multiplicand Multiples
B9 . . . B8 B7 B6 B5 B4 B3 B2 B1 B0 Y1 (Y3 Y2 Y1) 6) Sum Partial Products n Even: n Odd:
C9 . . . C6 C5 C4 C3 C2 C1 C0 Y3 (Y5 Y4 Y3)
D9 . . . D4 D3 D2 D1 D0 Y5 (Y7 Y6 Y5) Xn-1 Xn-2 Xn-2 Ø Xn-1 Xn-2
E9 . . . E2 E1 E0 Y7 (0 0 Y7)
Positive: 0 x x 0 0 x
S15. . . S 10 S9 S8 S7 S6 S5 S4 S3 S2 S1 S0 Negative: 1 x x 1 1 x

Computer Architecture & Arithmetic Group 11 Stanford University Computer Architecture & Arithmetic Group 12 Stanford University

M.J. Flynn 2
Lecture 7 EE 486

Modified Booth 2 Booth 3


• Reduces the partial products to ~n/3
Algorithm Extension: (for signed multiplicand) • Form overlapping groups of 4 bits.

Nothing the algorithm works fine ! 0 Y7 Y6 Y5 Y4 Y3 Y2 Y1 Y0 0

• The multiplicand may be represented in 2’s complement code. • Booth 3 Encoding Scale Scale
Yi+2 Yi+1 Yi Yi-1 Factor Yi+2 Yi+1 Yi Yi-1 Factor
• The scale factors (0, +X, +2X, -X, and -2X) are handled correctly. 0 0 0 0 +0 1 0 0 0 -4X
• Shift left for 2 times weighting. 0 0 0 1 +X 1 0 0 1 -3X
Note: 3x is a hard 0 0 1 0 +X 1 0 1 0 -3X
•2’s complement the multiplicand for subtraction. multiple that must 0 0 1 1 +2X 1 0 1 1 -2X
0 1 0 0 +2X 1 1 0 0 -2X
be precomputed 0 1 0 1 +3X 1 1 0 1 -X
0 1 1 0 +3X 1 1 1 0 -X
0 1 1 1 +4X 1 1 1 1 -0

Computer Architecture & Arithmetic Group 13 Stanford University Computer Architecture & Arithmetic Group 14 Stanford University

Partially Redundant Booth 3 Partially Redundant Booth 3

Computer Architecture & Arithmetic Group 15 Stanford University Computer Architecture & Arithmetic Group 16 Stanford University

M.J. Flynn 3

You might also like