CH 08

VLSI Design

Chapter 8
Low-Power VLSI Design Methodology

Jin-Fu Li

Chapter 8 Low-Power VLSI Design Methodology
• Introduction
• Low-Power Gate-Level Design
• Low-Power Architecture-Level Design
• Algorithmic-Level Power Reduction
• RTL Techniques for Optimizing Power
• Adiabatic Logic Circuits

National Central University EE613 VLSI Design 2


Introduction
• Most SOC design teams now regard power as one
of their top design concerns
• Why low-power design?
− Battery lifetime (especially for portable devices)
− Reliability
• Power consumption
− Peak power
− Average power



Overview of Power Consumption
• Average power consumption
− Dynamic power consumption
− Short-circuit power consumption
− Leakage power consumption
− Static power consumption

• Dynamic power dissipation during switching

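[Figure: a CMOS gate driving fan-out gates through interconnect. The switched load consists of the driver's drain capacitance (Cdrain), the interconnect capacitance, and the input capacitances (Cinput) of the driven gates.]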


Overview of Power Consumption
• Generic representation of a CMOS logic gate for
switching power calculation
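[Figure: generic CMOS gate. Inputs VA and VB drive a pMOS pull-up network and an nMOS pull-down network; the output node Vout is loaded by Cload = Cdrain + ∑Cinterconnect + ∑Cinput.]

$$P_{avg} = \frac{1}{T}\left[\int_{0}^{T/2} V_{out}\left(-C_{load}\frac{dV_{out}}{dt}\right)dt + \int_{T/2}^{T}\left(V_{DD}-V_{out}\right)\left(C_{load}\frac{dV_{out}}{dt}\right)dt\right]$$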
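As a worked step, the two integrals in the switching-power expression can be evaluated by changing the integration variable from t to Vout. This sketch assumes an ideal rail-to-rail swing in which the output falls from VDD to 0 during the first half period and rises back to VDD during the second; that assumption is not stated explicitly on the slide.

```latex
\begin{align*}
\int_{0}^{T/2} V_{out}\left(-C_{load}\frac{dV_{out}}{dt}\right)dt
  &= -C_{load}\int_{V_{DD}}^{0} V_{out}\,dV_{out}
   = \tfrac{1}{2}\,C_{load}V_{DD}^{2} \\
\int_{T/2}^{T}\left(V_{DD}-V_{out}\right)C_{load}\frac{dV_{out}}{dt}\,dt
  &= C_{load}\int_{0}^{V_{DD}}\left(V_{DD}-V_{out}\right)dV_{out}
   = \tfrac{1}{2}\,C_{load}V_{DD}^{2} \\
P_{avg} &= \frac{1}{T}\left[\tfrac{1}{2}C_{load}V_{DD}^{2} + \tfrac{1}{2}C_{load}V_{DD}^{2}\right]
   = \frac{1}{T}\,C_{load}V_{DD}^{2}
\end{align*}
```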


Overview of Power Consumption
• The average power consumption can be expressed as

$$P_{avg} = \frac{1}{T}\,C_{load}V_{DD}^{2} = C_{load}V_{DD}^{2}\,f_{CLK}$$

• The node transition rate can be slower than the clock rate. To represent this behavior, a node transition factor $\alpha_T$ is introduced:

$$P_{avg} = \alpha_T\,C_{load}V_{DD}^{2}\,f_{CLK}$$

• The switching power expressions above are derived by taking into account only the output node load capacitance


Overview of Power Consumption

[Figure: a two-input CMOS gate (inputs VA, VB) with an internal node Vinternal (capacitance Cinternal) in addition to the output node Vout (capacitance Cload).]

The generalized expression for the average power dissipation can be rewritten as

$$P_{avg} = \left(\sum_{i=1}^{\#\,\text{of nodes}} \alpha_{Ti}\,C_i\,V_i\right) V_{DD}\,f_{CLK}$$
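A minimal numeric sketch of the generalized expression, summing the contribution of every switching node. The supply voltage, clock frequency, capacitances, swings, and transition factors below are hypothetical values chosen for illustration; none of them come from the slides.

```python
def average_power(nodes, vdd, f_clk):
    """P_avg = (sum_i alpha_Ti * C_i * V_i) * V_DD * f_CLK.

    nodes: iterable of (alpha_t, capacitance_farads, swing_volts) tuples.
    """
    return sum(alpha * c * v for alpha, c, v in nodes) * vdd * f_clk


VDD = 1.2      # volts (assumed)
F_CLK = 100e6  # hertz (assumed)

# Hypothetical gate: an output node with full swing and an internal node
# with a reduced swing.
nodes = [
    (0.25, 20e-15, VDD),  # output node: alpha_T, C_load, V_i = V_DD
    (0.10, 5e-15, 0.8),   # internal node: alpha_T, C_internal, V_i
]

p_total = average_power(nodes, VDD, F_CLK)
# With a single full-swing node the sum reduces to alpha_T * C_load * V_DD^2 * f_CLK.
p_output_only = average_power(nodes[:1], VDD, F_CLK)
print(f"total: {p_total * 1e6:.3f} uW, output node only: {p_output_only * 1e6:.3f} uW")
```

Counting only the output node underestimates the switching power whenever internal nodes also toggle, which is the reason for the generalized sum.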
Gate-Level Design – Technology Mapping
• The objective of logic minimization is to reduce the Boolean function
• For low-power design, the signal switching activity is minimized by restructuring the logic circuit
• Power minimization is constrained by the delay; the area, however, may increase
• During this phase of logic minimization, the function to be minimized is

$$\sum_{i} P_i\,(1 - P_i)\,C_i$$

where P_i is the signal probability of node i and C_i is its capacitance


Gate-Level Design – Technology Mapping
• The first step in technology mapping is to decompose each logic function into two-input gates
• The objective of this decomposition is to minimize the total power dissipation by reducing the total switching activity

[Figure: a four-input gate with input signal probabilities P(A) = P(B) = 0.2 and P(C) = P(D) = 0.5, decomposed into two-input gates in two ways, with the switching activity α annotated at each gate output.
Chain, ((A, B), C), D: α = 0.0384, 0.0196, 0.0099
Tree, (A, B) combined with (C, D): α = 0.0384, 0.1875, 0.0099]
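The α values in the decomposition figure can be reproduced with a short calculation. This sketch assumes the gates are two-input ANDs with statistically independent inputs and that α = P(1 − P) at each gate output; the gate type is an assumption, chosen because it matches the annotated values.

```python
def and_gate(p_a, p_b):
    """Signal probability of a 2-input AND output, assuming independent inputs."""
    return p_a * p_b


def activity(p):
    """Switching activity P(1 - P) of a node with signal probability P."""
    return p * (1.0 - p)


P = {"A": 0.2, "B": 0.2, "C": 0.5, "D": 0.5}

# Chain decomposition: ((A.B).C).D
ab = and_gate(P["A"], P["B"])
abc = and_gate(ab, P["C"])
abcd = and_gate(abc, P["D"])
chain = [activity(ab), activity(abc), activity(abcd)]

# Tree decomposition: (A.B).(C.D)
cd = and_gate(P["C"], P["D"])
tree = [activity(ab), activity(cd), activity(and_gate(ab, cd))]

print("chain:", [round(a, 4) for a in chain], "total", round(sum(chain), 4))
print("tree: ", [round(a, 4) for a in tree], "total", round(sum(tree), 4))
```

With equal node capacitances, the chain totals 0.0679 and the tree 0.2358, so for these input probabilities the chain decomposition has the lower total switching activity.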
Gate-Level Design – Phase Assignment

[Figure: two phase assignments of the same logic with inputs A, B, C; the high-activity node is marked in each version.]


Gate-Level Design – Pin Swapping

[Figure: pin swapping on a four-input gate with inputs a, b, c, d. The two versions connect the inputs to the gate pins in opposite orders (d, c, b, a versus a, b, c, d) according to their switching activity.]


Gate-Level Design – Glitching Power
• Glitches
− Spurious transitions due to imbalanced path delays
• A design with more balanced delay paths has fewer glitches and thus dissipates less power
• Note that there are no glitches in dynamic CMOS logic

[Figure: two circuits with signals A, B, C, D, E, contrasting imbalanced and balanced path delays.]


Gate-Level Design – Glitching Power
• A chain structure has more glitches
• A tree structure has fewer glitches
[Figure: the same function of inputs A, B, C, D implemented as a chain structure and as a tree structure.]
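A minimal unit-delay simulation illustrating the chain-versus-tree claim. Several things here are assumptions that are not in the slides: the gates are two-input XORs, every gate has exactly one unit of delay, all inputs switch simultaneously, and the node names n1 to n3 are hypothetical. A glitch is counted as any output toggle beyond the single toggle needed to reach the final value.

```python
from itertools import product

# Netlists in topological order: (output, input1, input2), all gates are XORs.
CHAIN = [("n1", "A", "B"), ("n2", "n1", "C"), ("n3", "n2", "D")]
TREE = [("n1", "A", "B"), ("n2", "C", "D"), ("n3", "n1", "n2")]
INPUTS = ("A", "B", "C", "D")


def settle(netlist, values):
    """Steady-state node values for the given input values."""
    values = dict(values)
    for out, a, b in netlist:
        values[out] = values[a] ^ values[b]
    return values


def count_glitches(netlist, inputs):
    """Total spurious toggles over every pair of (old, new) input vectors."""
    total = 0
    for old in product((0, 1), repeat=len(inputs)):
        for new in product((0, 1), repeat=len(inputs)):
            state = settle(netlist, dict(zip(inputs, old)))
            initial = dict(state)
            state.update(zip(inputs, new))  # all inputs switch at t = 0
            toggles = {out: 0 for out, _, _ in netlist}
            for _ in range(len(netlist)):  # enough steps for the longest path
                nxt = dict(state)
                for out, a, b in netlist:
                    nxt[out] = state[a] ^ state[b]  # uses previous-step values
                    if nxt[out] != state[out]:
                        toggles[out] += 1
                state = nxt
            for out, count in toggles.items():
                needed = 1 if state[out] != initial[out] else 0
                total += count - needed
    return total


print("chain glitches:", count_glitches(CHAIN, INPUTS))
print("tree glitches: ", count_glitches(TREE, INPUTS))
```

Under these assumptions the tree shows no glitches because both inputs of every gate arrive at the same time, while in the chain the late-arriving intermediate signals cause extra toggles.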


Gate-Level Design – Precomputation

[Figure: the original circuit, with combinational logic between registers R1 and R2, and the precomputation architecture, in which precomputation logic is added alongside the combinational logic between R1 and R2.]


Gate-Level Design – Precomputation

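[Figure: precomputation applied to an n-bit comparator. The MSBs A<n-1> and B<n-1> are held in register R1 and feed a 1-bit (MSB) comparator. The precomputation logic generates the Enable signal for registers R2 and R3, which hold A<n-2:0> and B<n-2:0> for the (n-1)-bit comparator. Register R4 and the output F complete the circuit.]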
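A behavioral sketch of the comparator precomputation idea. It assumes the comparator computes A > B on unsigned values and that the enable for the lower-bit registers is the XNOR of the two MSBs; both are assumptions for illustration, since the slide shows only the block diagram. The function name and return format are hypothetical.

```python
def compare_with_precomputation(a, b, n):
    """Return (a > b, lower_bits_enabled) for n-bit unsigned a and b."""
    msb_a = (a >> (n - 1)) & 1
    msb_b = (b >> (n - 1)) & 1
    enable = msb_a == msb_b  # XNOR of the MSBs
    if not enable:
        # MSBs differ: the result is decided by the MSBs alone, so the
        # registers holding the lower n-1 bits need not be loaded.
        return msb_a > msb_b, False
    mask = (1 << (n - 1)) - 1
    return (a & mask) > (b & mask), True


N = 4
pairs = [(a, b) for a in range(1 << N) for b in range(1 << N)]
results = [compare_with_precomputation(a, b, N) for a, b in pairs]

# The precomputed result must always agree with a plain comparison.
assert all(res == (a > b) for (a, b), (res, _) in zip(pairs, results))

enabled_fraction = sum(en for _, en in results) / len(pairs)
print("fraction of cycles with lower-bit registers enabled:", enabled_fraction)
```

For uniformly distributed inputs the lower-bit registers are enabled in half of the cycles, which is where the switching-activity saving in the (n-1)-bit comparator comes from.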


Gate-Level Design – Gating Clock

[Figure: a chain of D flip-flops whose clock clk is gated, which fails DFT rule checking; adding a control pin T to the gating logic solves the DFT violation problem.]


Gate-Level Design – Input Gating

[Figure: input gating of two function blocks f1 and f2 whose results feed an adder (+), controlled by clk and a select signal.]


Architecture-Level Design – Parallelism
[Figure: a 16x16 multiplier with registers (R) on the 16-bit inputs A and B and a 32-bit output, clocked at fref, next to its parallel version: two 16x16 multipliers with input registers clocked at fref/2, whose 32-bit outputs are selected by a MUX into an output register clocked at fref.]

Assume that, with the same 16x16 multiplier, the power supply can be reduced from Vref to Vref/1.83.

$$P_{parallel} = 2.2\,C_{ref}\left(\frac{V_{ref}}{1.83}\right)^{2}\frac{f_{ref}}{2} = 0.33\,P_{ref}$$
Architecture-Level Design – Pipelining
If the hardware between pipeline stages is reduced, the reference voltage Vref can be lowered to Vnew while maintaining the same worst-case delay. For example, let a 50 MHz multiplier be broken into two equal parts as shown below. The pipeline can still be clocked at 50 MHz when Vnew is equal to Vref/1.83.

[Figure: a multiplier split into two half-multiplier stages separated by pipeline registers (REG), with a 32-bit input (A, B) and a 32-bit output, clocked at fref.]

$$P_{pipeline} = 1.2\,C_{ref}\left(\frac{V_{ref}}{1.83}\right)^{2} f_{ref} = 0.36\,P_{ref}$$
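The parallel and pipelined power figures both follow from scaling P = C V^2 f. The sketch below recomputes them from the capacitance factors (2.2 and 1.2), the voltage divisor (1.83), and the frequency factors given on the slides; the function name is hypothetical.

```python
def relative_power(cap_factor, voltage_divisor, freq_factor):
    """Power relative to P_ref = C_ref * V_ref^2 * f_ref."""
    return cap_factor * (1.0 / voltage_divisor) ** 2 * freq_factor


# Parallelism: 2.2x capacitance, Vref/1.83, each datapath at fref/2.
p_parallel = relative_power(2.2, 1.83, 0.5)
# Pipelining: 1.2x capacitance, Vref/1.83, still clocked at fref.
p_pipeline = relative_power(1.2, 1.83, 1.0)

print(f"P_parallel = {p_parallel:.2f} P_ref")
print(f"P_pipeline = {p_pipeline:.2f} P_ref")
```

In both cases most of the saving comes from the quadratic dependence on the supply voltage, which outweighs the capacitance overhead of the extra hardware.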
Architecture-Level Design – Retiming
Retiming is a transformation technique used to change the
locations of delay elements in a circuit without affecting the
input/output characteristics of the circuit.

[Figure: two versions of an IIR filter, before and after retiming, with input x(n), output y(n), internal signals w(n), w1(n), w2(n), coefficient multipliers a and b, and delay elements D and 2D. The parenthesized numbers (1) and (2) annotate the nodes.]


Architecture-Level Design – Retiming
Retiming for pipeline design

[Figure: before retiming: REG, C1 (6 ns), C2 (2 ns), REG, C3 (4 ns); after retiming: REG, C1 (6 ns), REG, C2 (2 ns), C3 (4 ns). The registers are clocked at fref.]
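Reading the block delays from the figure (C1 = 6 ns, C2 = 2 ns, C3 = 4 ns), the effect of moving the register can be checked with a short calculation. This sketch assumes the delays simply add along each register-to-register path and ignores register overhead; the function name is hypothetical.

```python
def critical_path_ns(stages):
    """Longest register-to-register combinational delay, in ns."""
    return max(sum(stage) for stage in stages)


C1, C2, C3 = 6, 2, 4  # block delays in ns, from the figure

before = [[C1, C2], [C3]]  # REG, C1, C2, REG, C3
after = [[C1], [C2, C3]]   # REG, C1, REG, C2, C3

print("before retiming:", critical_path_ns(before), "ns")
print("after retiming: ", critical_path_ns(after), "ns")
```

The critical path drops from 8 ns to 6 ns. At an unchanged clock frequency, that slack can be traded for a lower supply voltage, as in the pipelining example.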


Architecture-Level Design – Retiming

[Figure: a circuit before retiming, where the clock cycle is 4 gate delays, and after retiming, where the clock cycle is 2 gate delays.]


Architecture-Level Design – Power Management

[Figure: power management of blocks C1 and C2, controlled by the signals C1_FREEZE and C2_FREEZE.]


Architecture-Level Design – Bus Segmentation
• Avoid the sharing of resources
− Reduce the switched capacitance
• For example: a global system bus
− A single shared bus is connected to all modules; this structure results in a large bus capacitance due to
∗ The large number of drivers and receivers sharing the same bus
∗ The parasitic capacitance of the long bus line
• A segmented bus structure
− Switched capacitance during each bus access is significantly reduced
− Overall routing area may be increased


Architecture-Level Design – Bus Segmentation

[Figure: a single shared bus with capacitance Cbus, compared with a segmented bus whose segments (labeled Cbus1) are joined by a bus interface.]


Algorithmic-Level Design – factivity Reduction
Minimizing the switching activity at a high level is one way to reduce the power dissipation of digital processors.
One method to minimize signal switching at the algorithmic level is to use an appropriate coding for the signals rather than straight binary code.
The table below compares the 3-bit binary and Gray code representations.

Binary Code    Gray Code    Decimal Equivalent
000            000          0
001            001          1
010            011          2
011            010          3
100            110          4
101            111          5
110            101          6
111            100          7
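The benefit of Gray coding can be quantified by counting bit flips as a counter steps through the table above. The sketch assumes the sequence is traversed in order and wraps from 7 back to 0; that access pattern is an assumption, and the saving is smaller for non-sequential data.

```python
def gray(n):
    """Binary-reflected Gray code of n."""
    return n ^ (n >> 1)


def transitions(codes):
    """Total bit flips over the cyclic sequence codes[0] -> ... -> codes[-1] -> codes[0]."""
    return sum(bin(a ^ b).count("1") for a, b in zip(codes, codes[1:] + codes[:1]))


binary_codes = list(range(8))
gray_codes = [gray(n) for n in binary_codes]

print("gray codes:", [format(g, "03b") for g in gray_codes])
print("binary bit flips per full cycle:", transitions(binary_codes))
print("gray bit flips per full cycle:  ", transitions(gray_codes))
```

Over one full cycle the binary counter flips 14 bits while the Gray-coded counter flips 8, one per step.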
RTL-Level Design – Signal Gating
Simple Decoder

module decoder (a, sel);
  input  [1:0] a;
  output [3:0] sel;
  reg    [3:0] sel;
  // One-hot decode: sel changes on every change of a
  always @(a) begin
    case (a)
      2'b00: sel = 4'b0001;
      2'b01: sel = 4'b0010;
      2'b10: sel = 4'b0100;
      2'b11: sel = 4'b1000;
    endcase
  end
endmodule

Decoder with enable

module decoder (en, a, sel);
  input        en;
  input  [1:0] a;
  output [3:0] sel;
  reg    [3:0] sel;
  // When en is low, sel is held at 0, so changes on a do not propagate
  always @({en, a}) begin
    case ({en, a})
      3'b100: sel = 4'b0001;
      3'b101: sel = 4'b0010;
      3'b110: sel = 4'b0100;
      3'b111: sel = 4'b1000;
      default: sel = 4'b0000;
    endcase
  end
endmodule


RTL-Level Design – Datapath Reordering
[Figure: initial versus reordered datapath, each built from two muxes and an A<B comparator; the positions of the stable and glitchy signals are exchanged in the reordered version.]


RTL-Level Design – Memory Partition

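[Figure: a memory partitioned into two 128x32 banks, each with din, addr, write, noe, and a 32-bit dout. The 8-bit address is registered (pre_addr, clocked by clk) to give addr[7:0]; addr[7:1] addresses the banks, addr0 selects between them, and a MUX selects the 32-bit dout.]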


RTL-Level Design – Memory Partition

• Application-driven memory partition

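[Figure: an ARM core connected (Data, Addr, R/W) to a single 64K-byte memory, together with a profile of the number of reads versus address range, divided into 28K, 4K, and 32K regions over the 64K address space.]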


RTL-Level Design – Memory Partition

• A power-optimal partitioned memory organization

[Figure: the ARM core connected through an address decoder to three memory banks; each bank has its own chip select (CS) along with the Data, Addr, and R/W signals.]


Adiabatic Logic Circuits
• The energy drawn from the power supply during a 0-to-V transition is calculated as follows

[Figure: a supply voltage V charging a load capacitance C.]

$$Q = CV, \qquad E = QV = CV^{2}$$

• The amount of energy stored in the output node is expressed as follows

$$E = \int_{0}^{V} C\,v_o\,dv_o = \frac{1}{2}\,CV^{2}$$
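The two results above imply that the difference, (1/2)CV^2, is dissipated in the charging path. The sketch below checks this numerically for a capacitor charged through a resistor from a constant-voltage supply, and shows that the result does not depend on R. The first-order RC model and the component values are assumptions for illustration.

```python
import math


def resistor_energy(r, c, v, steps=20000, span=20.0):
    """Numerically integrate i(t)^2 * R over 0..span*RC for a step of v volts.

    Uses i(t) = (v / r) * exp(-t / (r * c)) and the midpoint rule.
    """
    tau = r * c
    dt = span * tau / steps
    energy = 0.0
    for k in range(steps):
        t = (k + 0.5) * dt
        i = (v / r) * math.exp(-t / tau)
        energy += i * i * r * dt
    return energy


C, V = 1e-12, 1.0  # hypothetical load capacitance (F) and supply voltage (V)
for r in (100.0, 1e3, 1e4):  # hypothetical on-resistances (ohms)
    e = resistor_energy(r, C, V)
    print(f"R = {r:>7.0f} ohm: dissipated / (0.5 C V^2) = {e / (0.5 * C * V ** 2):.4f}")
```

Changing R only changes how fast the charging happens, not how much energy is lost, which is why conventional switching cannot reduce the dissipation by resizing transistors alone.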


Adiabatic Logic Circuits
• To reduce the dissipation, the designer can minimize the
switching events, decrease the capacitance, reduce the
voltage swing, or apply a combination of these methods
− The energy drawn from the power supply is used only
once
• To increase the energy efficiency of logic circuits, other
methods can be introduced for recycling the energy drawn
from the power supply
• A novel class of logic circuits called adiabatic logic offers the possibility of further reducing the energy dissipated during switching events, and the possibility of recycling some of the energy drawn from the power supply


Adiabatic Logic Circuits – Adiabatic Switching
• The equivalent circuit used to model the charge-up event
in conventional CMOS circuits
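[Figure: a constant current source Isource charging the capacitance C through the resistance R, starting at t = 0; VC is the capacitor voltage.]

$$V_C(t) = \frac{1}{C}\,I_{source}\cdot t$$

$$I_{source} = C\,\frac{V_C(t)}{t}$$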


Adiabatic Logic Circuits – Adiabatic Switching
• The amount of energy dissipated in the resistor R from t=0
to t=T can be expressed as
$$E_{diss} = R\int_{0}^{T} I_{source}^{2}\,dt = R\,I_{source}^{2}\cdot T$$

$$E_{diss} = \frac{RC}{T}\,C\,V_C^{2}(T)$$
• A number of simple observations can be obtained
− The energy is smaller than the conventional case if the
charging time T is larger than 2RC
− The dissipated energy is proportional to the resistance R
• Can a portion of the energy stored in the capacitance be
reclaimed by reversing the current direction?
− The possibility is unique to the adiabatic operation
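A short numeric check of the first observation, comparing the constant-current dissipation (RC/T) C V^2 with the conventional (1/2) C V^2. The component values and function names are hypothetical and chosen only for illustration.

```python
def adiabatic_dissipation(r, c, v_final, t_charge):
    """E_diss = (RC / T) * C * V_C(T)^2 for constant-current charging."""
    return (r * c / t_charge) * c * v_final ** 2


def conventional_dissipation(c, v_final):
    """(1/2) C V^2 dissipated when charging from a constant-voltage supply."""
    return 0.5 * c * v_final ** 2


R, C, V = 1e3, 1e-12, 1.0  # hypothetical resistance, capacitance, final voltage

for multiple in (1, 2, 20):
    t = multiple * R * C
    ratio = adiabatic_dissipation(R, C, V, t) / conventional_dissipation(C, V)
    print(f"T = {multiple:>2} RC: E_diss / E_conventional = {ratio:.2f}")
```

The ratio is 2.00 at T = RC, 1.00 at the break-even point T = 2RC, and 0.10 at T = 20RC, so the saving grows in proportion to the charging time.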



Adiabatic Logic Circuits – Adiabatic Switching
• It is possible to reduce the dissipation to an arbitrary
degree by increasing the switching time to ever-larger
values
− This is referred to as the principle of adiabatic charging
• The term “adiabatic” is used to indicate that all charge
transfer is to occur without generating heat
• Switching circuits that charge and discharge their load
capacitance adiabatically are said to use adiabatic
switching
• The circuits rely on special power supplies that provide
accurate pulsed-power delivery
• The circuits are useful only if the supplies can deliver
power efficiently and recycle the power fed back to them
Adiabatic Logic Circuits – Adiabatic Logic Gates
• The adiabatic amplifier circuit

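[Figure: the adiabatic amplifier circuit, with complementary inputs X and X', pulsed supply VA, charging current iC, and complementary outputs Y and Y' loaded by CY and CY'; a block symbol with the same terminals is also shown.]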


Adiabatic Logic Circuits – Adiabatic Logic Gates

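[Figure: a conventional CMOS gate, in which networks F and F' between VDD and ground form the charge-up and charge-down paths of Vout (load Cload), and its adiabatic counterpart, in which the F and F' networks provide charge-up and charge-down paths for the complementary outputs Vout and Vout' (loads Cload1 and Cload2).]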


Adiabatic Logic Circuits – Adiabatic Logic Gates

[Figure: an adiabatic two-input logic gate with complementary inputs A, A', B, B', pulsed supply VA, and complementary outputs Vout and Vout'.]


Adiabatic Logic Circuits – Adiabatic Logic Gates

[Figure: a load Cload charged through a resistance R (current iC) from a supply VA that rises toward VDD in steps of VDD/n to approximate constant-current charging; waveforms of VA and Vout versus t.]
Adiabatic Logic Circuits – Adiabatic Logic Gates

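[Figure: a stepwise charging circuit in which the load C at Vout is connected in turn to the supplies V1, V2, up to VN, with the resulting staircase waveform of Vout versus t.]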
