Low Power Vlsi Design 2
[Figure: CMOS gate with an nMOS network (inputs VA, VB) driving output Vout; Cload = Cdrain + Cinterconnect + Cinput]
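$$P_{avg} = \frac{1}{T}\left[\int_0^{T/2} V_{out}\left(-C_{load}\frac{dV_{out}}{dt}\right)dt + \int_{T/2}^{T} (V_{DD} - V_{out})\left(C_{load}\frac{dV_{out}}{dt}\right)dt\right] = C_{load} V_{DD}^2 f_{CLK}$$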
• The node transition rate can be slower than the
clock rate. To better represent this behavior, a
node transition factor ($\alpha_T$) should be
introduced:
$$P_{avg} = \alpha_T C_{load} V_{DD}^2 f_{CLK}$$
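As a rough illustration of how the node transition factor scales the average power, here is a minimal Python sketch. The function name and the numeric values (50 fF load, 1.2 V supply, 100 MHz clock) are hypothetical and chosen only for the example.

```python
def average_dynamic_power(alpha_t, c_load, v_dd, f_clk):
    """Average switching power in watts: alpha_T * C_load * V_DD^2 * f_CLK."""
    return alpha_t * c_load * v_dd ** 2 * f_clk


# Hypothetical operating point (not from the slides).
C_LOAD = 50e-15   # farads
V_DD = 1.2        # volts
F_CLK = 100e6     # hertz

# A node that switches every clock cycle versus one that switches
# in only a quarter of the cycles.
full_rate = average_dynamic_power(1.0, C_LOAD, V_DD, F_CLK)
quarter_rate = average_dynamic_power(0.25, C_LOAD, V_DD, F_CLK)

print(f"alpha_T = 1.00: {full_rate * 1e6:.2f} uW")
print(f"alpha_T = 0.25: {quarter_rate * 1e6:.2f} uW")
```

Because the power is linear in the transition factor, the second node dissipates one quarter of the first under these assumed values.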
Transition Probability of 2-input NOR Gate
[Figure: 2-input NOR gate with inputs VA and VB, internal node Vinternal (Cinternal), and output Vout (Cload)]
CEFF = 3/16 * CL
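The 3/16 factor can be reproduced with a short Python sketch. It assumes independent inputs that are each 1 with probability 1/2, and independent consecutive cycles, so that the 0-to-1 transition probability is P(out = 0) * P(out = 1). The function name is illustrative.

```python
from fractions import Fraction
from itertools import product


def nor_transition_probability(p_a=Fraction(1, 2), p_b=Fraction(1, 2)):
    """Probability of a 0 -> 1 output transition for a 2-input NOR gate.

    Assumes independent inputs and independent consecutive cycles.
    """
    p_one = Fraction(0)
    for a, b in product((0, 1), repeat=2):
        weight = (p_a if a else 1 - p_a) * (p_b if b else 1 - p_b)
        if not (a or b):          # NOR output is 1 only for a = b = 0
            p_one += weight
    p_zero = 1 - p_one
    return p_zero * p_one


alpha = nor_transition_probability()
print(alpha)
```

With equiprobable inputs the output is 1 for one of the four input combinations, so the result is (3/4) * (1/4).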
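Switching activity measure: $\sum_i P_i (1 - P_i) C_i$

Input signal probabilities: A = 0.2, B = 0.2, C = 0.5, D = 0.5

[Figure: two gate-level implementations of the same four-input function, annotated with P_i (1 - P_i) at each gate output]
First implementation:   0.0384 (inputs A, B)   0.0196   0.0099
Second implementation:  0.0384 (inputs A, B)   0.1875 (inputs C, D)   0.0099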
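The annotated values are consistent with 2-input AND gates and independent inputs, so they can be recomputed with the sketch below. The AND-gate assumption, the chain and tree groupings, and all names are inferred from the numbers, not stated on the slide.

```python
def and_probability(p_x, p_y):
    """Probability that a 2-input AND output is 1 (independent inputs assumed)."""
    return p_x * p_y


def activity(p_one):
    """Switching activity P(1 - P) of a node with signal probability P."""
    return p_one * (1 - p_one)


P = {"A": 0.2, "B": 0.2, "C": 0.5, "D": 0.5}

# Chain: ((A.B).C).D
p_ab = and_probability(P["A"], P["B"])
p_abc = and_probability(p_ab, P["C"])
p_abcd = and_probability(p_abc, P["D"])
chain = [activity(p) for p in (p_ab, p_abc, p_abcd)]

# Tree: (A.B).(C.D)
p_cd = and_probability(P["C"], P["D"])
tree = [activity(p_ab), activity(p_cd), activity(and_probability(p_ab, p_cd))]

print("chain:", [round(x, 4) for x in chain])
print("tree: ", [round(x, 4) for x in tree])
```

Both groupings give the same activity at the final output; they differ only at the internal nodes, which is where the choice of structure affects the weighted sum.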
National Central University EE4012VLSI Design 35
Gate-Level Design – Phase Assignment
[Figure: two phase assignments of a logic network with inputs a, b, c, d and outputs A, B, C, compared by switching activity]
[Figure: gate restructuring with inputs A, B, C, D, E; panel labels: Chain structure, Tree structure]
Gate-Level Design – Precomputation
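[Figure: original circuit: REG R1, combinational logic, REG R2]
[Figure: circuit with precomputation logic added alongside REG R1, combinational logic, and REG R2]
[Figure: comparator example: A<n-2:0> and B<n-2:0> are held in (n-1)-bit registers R2 and R3, which load only when Enable is asserted by the precomputation logic; the comparator output F is registered in R4]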
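The comparator figure can be illustrated with a small behavioral sketch in Python. It assumes the comparator computes A > B on n-bit unsigned operands and that the precomputation logic examines only the most significant bits, which is the usual textbook form of this example; the slide itself shows only the block labels, so treat the function and its details as assumptions.

```python
def compare_with_precomputation(a, b, n):
    """Return (a > b, low_registers_loaded) for n-bit unsigned a and b.

    Assumed behavior: when the MSBs differ they decide the result alone,
    so the (n-1)-bit low-order registers are not enabled in that cycle.
    """
    msb_a = (a >> (n - 1)) & 1
    msb_b = (b >> (n - 1)) & 1
    enable = msb_a == msb_b               # load low-order bits only on an MSB tie
    if not enable:
        return msb_a > msb_b, False
    mask = (1 << (n - 1)) - 1
    return (a & mask) > (b & mask), True


# Exhaustive check for a hypothetical 4-bit comparator.
N = 4
results = [
    (a, b, *compare_with_precomputation(a, b, N))
    for a in range(1 << N)
    for b in range(1 << N)
]
enabled_cycles = sum(1 for _, _, _, loaded in results if loaded)
print(f"low-order registers loaded in {enabled_cycles} of {len(results)} cases")
```

Under uniformly distributed operands the MSBs differ in half of the cases, so in this model the low-order registers and the logic they feed stay idle about half the time.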
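[Figure: flip-flop chain with a gated clock; annotation: fails DFT rule checking]
[Figure: clock clk selected between f1 and f2 by a select signal]
[Figure: flip-flops clocked by CK(f/2) feeding a multiplexer that drives Output]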
Flip-flops are operated at full voltage and half the clock frequency.
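[Table: three rows of values (1, 33.0, 1535), (2, 16.5, 887), (4, 8.25, 738); column headings lost. Accompanying plot: normalized power, axis ticks 0.25 and 0.5]
[Figure: parallel architecture: two 16x16 multipliers, each clocked at fref/2, with bus widths 16 and 32; a MUX running at fref selects between their outputs]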
Architecture-Level Design – Pipelining
When the hardware between pipeline stages is reduced, the
reference voltage Vref can be lowered to Vnew while maintaining
the same worst-case delay. For example, suppose a 50 MHz
multiplier is split into two equal parts as shown below. Each
pipeline stage can still meet the 50 MHz timing when the
voltage Vnew is equal to Vref/1.83.
[Figure: pipelined multiplier: 32-bit input (A, B), REG, half multiplier, REG, half multiplier, 32-bit output, clocked at fref]
$$P_{pipeline} = (1.2\,C_{ref})\left(\frac{V_{ref}}{1.83}\right)^2 f_{ref} \approx 0.36\,P_{ref}$$
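The 0.36 factor follows directly from the capacitance overhead (1.2) and the voltage reduction (1.83), as this short Python check shows. The function name is illustrative.

```python
def pipelined_power_ratio(cap_factor, voltage_divisor):
    """P_pipeline / P_ref when C scales by cap_factor, V by 1/voltage_divisor,
    and the clock frequency is unchanged."""
    return cap_factor / voltage_divisor ** 2


ratio = pipelined_power_ratio(cap_factor=1.2, voltage_divisor=1.83)
print(f"P_pipeline / P_ref = {ratio:.3f}")
```

The quadratic dependence on voltage outweighs the 20% capacitance increase from the added pipeline registers.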
Architecture-Level Design – Retiming
Retiming is a transformation technique used to change the
locations of delay elements in a circuit without affecting the
input/output characteristics of the circuit.
[Figure: data-flow graph before and after retiming, with nodes w(n), w1(n), w2(n), multipliers a and b, computation times (1) and (2), and delay elements D and 2D]
[Figure, before retiming: REG, C1 (6ns), C2 (2ns), REG, C3 (4ns), clocked at fref]
[Figure, after retiming: REG, C1 (6ns), REG, C2 (2ns), C3 (4ns), clocked at fref]
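The effect of moving the register can be checked with a small Python sketch that computes the longest register-to-register delay in each arrangement. It assumes the blocks sit in a simple linear pipeline as read from the figure labels; the function name is illustrative.

```python
def critical_path_ns(stages):
    """Longest combinational delay between registers.

    Each stage is a list of block delays (ns) between two registers.
    """
    return max(sum(stage) for stage in stages)


# Before retiming: REG -> C1 (6) -> C2 (2) -> REG -> C3 (4)
before = critical_path_ns([[6, 2], [4]])

# After retiming: REG -> C1 (6) -> REG -> C2 (2) -> C3 (4)
after = critical_path_ns([[6], [2, 4]])

print(f"critical path before: {before} ns, after: {after} ns")
```

The total logic delay is unchanged; only its distribution between the two stages differs.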
[Figure: blocks C1 and C2 with control signals C1_FREEZE and C2_FREEZE]
Architecture-Level Design – Bus Segmentation
• Avoid the sharing of resources
  - Reduces the switched capacitance
• For example: a global system bus
  - A single shared bus is connected to all modules. This
    structure results in a large bus capacitance due to:
    - The large number of drivers and receivers sharing the same
      bus
    - The parasitic capacitance of the long bus line
• A segmented bus structure
  - Switched capacitance during each bus access is
    significantly reduced
  - Overall routing area may be increased
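[Figure: single shared bus with capacitance Cbus, versus a segmented bus with two segments of capacitance Cbus1 joined by a bus interface]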
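A rough Python model of the trade-off is sketched below. It assumes two segments of capacitance Cbus1 joined by an interface, where a local access switches one segment and a cross-segment access switches both segments plus the interface. All capacitance values and the fraction of local accesses are hypothetical.

```python
def average_switched_capacitance(c_segment, c_interface, local_fraction):
    """Expected capacitance switched per access on a two-segment bus.

    Assumed model: local accesses see one segment; cross-segment accesses
    see both segments and the interface between them.
    """
    local = c_segment
    remote = 2 * c_segment + c_interface
    return local_fraction * local + (1 - local_fraction) * remote


# Hypothetical values, for illustration only.
C_BUS = 10e-12     # single shared bus
C_BUS1 = 5e-12     # each segment
C_IF = 0.5e-12     # bus interface
LOCAL = 0.8        # fraction of accesses that stay within one segment

segmented = average_switched_capacitance(C_BUS1, C_IF, LOCAL)
print(f"shared bus: {C_BUS * 1e12:.1f} pF per access")
print(f"segmented:  {segmented * 1e12:.1f} pF per access (average)")
```

In this model the saving depends on how local the traffic is: if most accesses cross the interface, the segmented bus switches more capacitance than the shared one.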
Present state    Next state
  a   b            A   B
  0   0            0   1
  0   1            1   0
  1   0            1   1
  1   1            0   0

A = a’b + ab’
B = a’b’ + ab’

[Figure: two flip-flops holding a and b, with next-state inputs A and B and common CK and CLR]
Source: Prof. V. D. Agrawal
Binary Counter: Gray Encoding
Present state    Next state
  a   b            A   B
  0   0            0   1
  0   1            1   1
  1   0            0   0
  1   1            1   0

A = a’b + ab
B = a’b’ + a’b

[Figure: two flip-flops holding a and b, with next-state inputs A and B and common CK and CLR]
Source: Prof. V. D. Agrawal
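The next-state equations for the binary and Gray encodings can be checked by stepping each counter from state 00. The Python sketch below transcribes the equations directly; function names are illustrative and not from the slides.

```python
def binary_next(a, b):
    """Binary encoding: A = a'b + ab', B = a'b' + ab'."""
    A = (not a and b) or (a and not b)
    B = (not a and not b) or (a and not b)
    return int(A), int(B)


def gray_next(a, b):
    """Gray encoding: A = a'b + ab, B = a'b' + a'b."""
    A = (not a and b) or (a and b)
    B = (not a and not b) or (not a and b)
    return int(A), int(B)


def sequence(next_state, steps=4, start=(0, 0)):
    """List of states visited over `steps` clock cycles, including the start."""
    state, visited = start, [start]
    for _ in range(steps):
        state = next_state(*state)
        visited.append(state)
    return visited


binary_states = sequence(binary_next)
gray_states = sequence(gray_next)
print("binary:", binary_states)
print("gray:  ", gray_states)
```

Each sequence returns to 00 after four clocks; in the Gray sequence every step changes exactly one state bit.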
Three-Bit Counters
Binary                        Gray-code
State   No. of toggles        State   No. of toggles
000     -                     000     -
001     1                     001     1
010     2                     011     1
011     1                     010     1
100     3                     110     1
101     1                     111     1
110     2                     101     1
111     1                     100     1
000     3                     000     1

Av. transitions/clock = 1.75  Av. transitions/clock = 1
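The averages in the table can be reproduced by counting the bits that differ between consecutive states, including the wrap-around from the last state to 000. A minimal Python sketch with illustrative names:

```python
def toggles(x, y):
    """Number of bit positions in which two state codes differ."""
    return bin(x ^ y).count("1")


def average_toggles(states):
    """Mean toggles per clock over one full cycle of the counter."""
    pairs = zip(states, states[1:] + states[:1])   # includes wrap-around
    return sum(toggles(x, y) for x, y in pairs) / len(states)


binary_states = list(range(8))                     # 000, 001, ..., 111
gray_states = [i ^ (i >> 1) for i in range(8)]     # 000, 001, 011, 010, ...

binary_avg = average_toggles(binary_states)
gray_avg = average_toggles(gray_states)
print(f"binary: {binary_avg} transitions/clock")
print(f"gray:   {gray_avg} transitions/clock")
```

The expression `i ^ (i >> 1)` is the standard binary-to-Gray conversion and yields the same state order as the table.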
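[Figure: state-transition fragment with states Si and Sk and edges labeled Xi/Zk and Xk/Zk]
[Figure: clock-gated FSM: PI, combinational logic, PO, flip-flops, clock activation logic, latch, CK]
L. Benini and G. De Micheli, Dynamic Power Management, Boston: Springer, 1998.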
[Figure: plot with both axes labeled "Number of bit transitions"; horizontal ticks 0, N/2, N; vertical ticks 0, N/2]
Source: Prof. V. D. Agrawal
Bus-Inversion Encoding Logic
[Figure: bus-inversion encoding logic: sent data, polarity decision logic, polarity bit, bus register, received data]
M. Stan and W. Burleson, “Bus-Invert Coding for Low Power I/O,” IEEE Trans. VLSI Systems, vol. 3, no. 1, pp. 49-58, March 1995.
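The encoding logic can be sketched in Python as follows. This is a simplified variant that compares only the data lines with the current bus value when deciding the polarity (the polarity line itself is not counted) and assumes the bus starts at all zeros. Function names are illustrative.

```python
def bus_invert_encode(words, width):
    """Encode data words as (bus_value, polarity_bit) pairs.

    Simplified rule: invert the word when it would toggle more than
    width/2 data lines relative to the current bus value.
    """
    mask = (1 << width) - 1
    bus, encoded = 0, []                      # bus assumed to start at 0
    for word in words:
        if bin((bus ^ word) & mask).count("1") > width / 2:
            bus, polarity = ~word & mask, 1   # send the complement
        else:
            bus, polarity = word & mask, 0    # send the word unchanged
        encoded.append((bus, polarity))
    return encoded


def bus_invert_decode(encoded, width):
    """Recover the original words from (bus_value, polarity_bit) pairs."""
    mask = (1 << width) - 1
    return [(~bus & mask) if polarity else bus for bus, polarity in encoded]


WIDTH = 8
words = [0x00, 0xFF, 0x0F, 0xF1]              # hypothetical data stream
encoded = bus_invert_encode(words, WIDTH)
decoded = bus_invert_decode(encoded, WIDTH)

# Data-line transitions per transfer, starting from an all-zero bus.
previous = [0] + [bus for bus, _ in encoded[:-1]]
transitions = [bin(p ^ bus).count("1") for p, (bus, _) in zip(previous, encoded)]
print(encoded, transitions)
```

With this rule no transfer toggles more than N/2 of the N data lines, at the cost of one extra polarity wire.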
[Figure: two arrangements of an A<B comparator and multiplexers, with signals marked as stable or glitchy]
[Figure: memory built from two 128x32 banks: addr[7:1] indexes the banks, addr0 selects between them through a MUX, with signals din, dout (32 bits), write, noe, clk, and a registered pre_addr]
[Figure: ARM core connected to a single 64K-byte memory through Addr, Data, and R/W]
[Figure: ARM core connected to 28K, 4K, and 32K memories, each with Addr, Data, R/W, and a CS input driven by an address range decoder]