Professional Documents
Culture Documents
Arithmetic Circuits & Multipliers
Arithmetic Circuits & Multipliers
Arithmetic Circuits & Multipliers
• Addition, subtraction
• Performance issues
-- ripple carry
-- carry bypass
-- carry skip
-- carry lookahead
• Multipliers
-2N-1 2N-2
… … … 23 22 21 20
42 = 00101010 -5 = ~00000101 + 1
= 11111010 + 1
= 11111011
What is their 16-bit 2’s complement representation?
42 = 0000000000101010
________00101010
-5 = ________11111011
1111111111111011
Extend the MSB (aka the “sign bit”)
into the higher-order bit positions
“Ripple-
carry
adder”
S ABC
But what
about the
“+1”?
Signed comparison:
N (negative): result is < 0 SN-1
LT NV
C (carry): indicates an add in the most LE Z+(NV)
significant position produced a carry, e.g.,
EQ Z
1111 + 0001
NE ~Z
from last FA
GE ~(NV)
V (overflow): indicates that the answer GT ~(Z+(NV))
has too many bits to be represented
correctly by the result width, e.g., Unsigned comparison:
0111 + 0111 LTU C
LEU C+Z
V A B S A B S GEU ~C
N 1 N 1 N 1 N 1 N 1N 1 GTU ~(C+Z)
V COUT CIN
N 1 N 1
6.111 Fall 2016 Lecture 8 7
Condition Codes in Verilog
Z (zero): result is = 0
Using a fixed word size can lead to overflow, e.g., when the operation
produces a result that’s too large to fit in the word size. One can
Cn-1 Cn-2 C2 C1 C0
Cin Full Co
Adder
S
Generate (G) = AB
Propagate (P) = A B
Actually, P is usually
module fa(input a,b,cin, output s,cout);
wire g = a & b;
defined as P = A^B
wire p = a ^ b; which won’t change
assign s = p ^ cin; COUT but will allow us
assign cout = g | (p & cin); to express S as a
endmodule simple function :
S = P^CIN
B
A Y = A B Cin
P
Cin
A[63:0]
+ Y[63:0]
B[63:0]
Y[64]
A[63:60]
CLB15 Y[63:60]
B[63:60]
A[7:4]
CLB1 Y[7:4]
B[7:4]
A[3:0]
CLB0 Y[3:0]
B[3:0]
CLBs must be in same column
6.111 Fall 2016 Lecture 8 15
Carry Bypass Adder
A0 B0 A1 B1 A2 B2 A3 B3
BP= P0P1P2P3
P,G P,G P,G P,G
P0 G0 P1 G1 P2 G2 P3 G3
Ci,0
C/S Co,0
C/S Co,1
C/S Co,2
C/S 0 Co,3
Co,15
Co,15
BP2 = 0 or BP2 = 1 ?
tadder
be
This
display ed.
image
cannot
currentl
y be
M = bypass
word size
ripple adder
N = number
of bits being
added
bypass adder
N
Thi i t tl b di l d 4..8 This image
cannot
image
currently be
cannot
display ed.
currentl
y be …
CN = GN-1 + PN-1CN-1
P/G generation
1st level of
lookahead
Log2(N)
Log2(N)
tPD = Θ(log(N))
A3 A2 A1 A0
x B3 B2 B1 B0
HA FA FA HA
FA FA FA HA
x3 x2 x1 y3
x0
z2
FA FA FA HA
z7 z6 z5 z4 z3
6.111 Fall 2016 Lecture 8 28
Combinational Multiplier (signed!)
X3 X2 X1 X0
* Y3 Y2 Y1 Y0
--------------------
X3Y0 X3Y0 X3Y0 X3Y0 X3Y0 X2Y0 X1Y0 X0Y0
+ X3Y1 X3Y1 X3Y1 X3Y1 X2Y1 X1Y1 X0Y1
+ X3Y2 X3Y2 X3Y2 X2Y2 X1Y2 X0Y2
- X3Y3 X3Y3 X2Y3 X1Y3 X0Y3
----------------------------------------- y0
x3 x2 x1 x0
Z7 Z6 Z5 Z4 Z3 Z2 Z1 Z0
y1
x3 x2 x1 x0
z0
FA FA FA FA FA FA HA
x3 x2 y2
x1 x0
z1
FA FA FA FA FA HA
x3 x2 x1 y3
x0
z2
Step 2: don’t want all those extra additions, so Step 4: finish computing the constants…
add a carefully chosen constant, remembering
to subtract it at the end. Convert subtraction
into add of (complement + 1).
X3Y0 X2Y0 X1Y0 X0Y0
X3Y0 X3Y0 X3Y0 X3Y0 X3Y0 X2Y0 X1Y0 X0Y0 + X3Y1 X2Y1 X1Y1 X0Y1
+ 1 + X2Y2 X1Y2 X0Y2 X0Y2
+ X3Y1 X3Y1 X3Y1 X3Y1 X2Y1 X1Y1 X0Y1 + X3Y3 X2Y3 X1Y3 X0Y3
+ 1 + 1 1
+ X3Y2 X3Y2 X3Y2 X2Y2 X1Y2 X0Y2
+ 1
+ X3Y3 X3Y3 X2Y3 X1Y3 X0Y3 –B = ~B + 1 Result: multiplying 2’s complement operands
takes just about same amount of hardware as
+ 1
+ 1
- 1 1 1 1 multiplying unsigned operands!
6.111 Fall 2016 Lecture 8 30
Baugh Wooley Formulation –The Math
no insight required
Assuming X and Y are 4-bit twos complement numbers:
2 2
X= -23x 3 +Σ xi2i Y= -23y 3 + Σ yi2i
i=0 i=0
FA FA FA HA
x3 x2 y2
x1 x0
z1
FA FA FA HA
x3 x2 x1 y3
x0
1 z2
HA FA FA FA HA
z7 z6 z5 z4 z3
tPD ≈ 10ns
SN SN-1…S0
Init: P0, load A and B
LSB
P B NC A Repeat M times {
M bits
P P + (BLSB==1 ? A : 0)
N 1 N
shift P/B right one bit
+ xN
}
N+1
Repeat M times {
P A Repeat N times {
B shift A,P:
Amsb = Alsb
Pmsb = Plsb + Alsb*Blsb + C/0
0 }
shift P,B: Pmsb = C, Bmsb = Plsb
C FA }
HA FA FA HA
FA FA FA HA
x3 x2 x1 y3
x0
z2
FA FA FA HA
z7 z6 z5 z4 z3
6.111 Fall 2016 Lecture 8 37
Useful building block: Carry-Save Adder
Good for pipelining: delay
through each partial product
(except the last) is just
tPD,AND + tPD,FA.
No carry propagation time!
CSA
...
CSA CSA CSA
hackers: counts
number of 1’s on the
3 inputs, outputs 2-
bit result.
CSA CSA
M/2 2
...
BK+1,K*A = 0*A 0
Booth’s insight: rewrite = 1*A A
2*A and 3*A cases, leave = 2*A 4A – 2A
4A for next partial = 3*A 4A – A
product to do!
6.111 Fall 2016 Lecture 8 42
Booth recoding
On-the-fly canonical signed digit encoding!
current bit pair from previous bit pair
Inputs:
• passenger door switch
• driver door switch
• ignition switch
• hidden switch
• brake pedal switch
Outputs:
• fuel pump power
• status indicator
• siren
• Design Specs
– Operating voltage 8-18VDC
– Operating temp: -40C +65C
– Attitude: sea level
Fuel pump
Cloaking – Shock/Vibration
device relay
• Notes
– Protected against 24V power
surges
– CMOS implementation
– CMOS inputs protected against
200V noise spikes
– On state DC current <10ma
– Include T_PASSENGER_DELAY
and Fuel Pump Disable
– System disabled (cloaked) when
being serviced.
To create a one hz tick, use the following in the Verilog test fixture:
always #5 clk=!clk;
always begin
#5 tick = 1;
#10 tick = 0;
#15;
end
initial begin
// Initialize Inputs
clk = 0;
tick = 0; . . .
reg signal_delayed;