Download as pdf or txt
Download as pdf or txt
You are on page 1of 35

Low-Power High-Speed Links

Gu-Yeon Wei Division of Engineering and Applied Sciences Harvard University 6.976 Guest Lecture, Spring 2003

Outline
Motivation Brief Overview of High-Speed Links Design Considerations for Low Power DVS Link Design Example Summary

Wei

Low-Power High-Speed Links

Motivation
Demand for high bandwidth communications Advancements in IC fabrication technology
Higher performance More complex functionality Chip I/O becomes performance bottleneck Increasing power consumption

Network router example

network card

switch card

digital crosspoint transceivers

backplane PCB

Wei

10s to 100s of links on a single crosspoint switch


Low-Power High-Speed Links

50- traces

High-Speed Links Overview


High-speed data communication between chips across an impedance controlled channel
Shared communication bus (memories) Point-to-point links

Types of link architectures and implementations


Parallel vs. serial Differential vs. single-ended Low-impedance vs. high-impedance driver Transmitter-only vs. receiver-only vs. double termination

We will focus on point-to-point serial links using differential highimpedance drivers with double termination for network routers
Techniques to reduce power applicable to other link types

Wei

Low-Power High-Speed Links

Link Components
High-speed links consist of 4 main components
data out

10010

TX

RX

channel timing recovery

Serializing transmitter driver Communication channel De-serializing receiver samplers Timing recovery

Wei

Low-Power High-Speed Links

data in
5

10010

Performance Limitations
Important to look at performance because higher performance can lead to lower power by trading off performance for energy reduction Several factors limit the performance of high-speed links
Non-ideal channel characteristics Bandwidth limits of transceiver circuitry Noise from power supply, cross talk, clock jitter, device mismatches, etc.

Eye diagrams a qualitative measure of link performance

Tbit

random bit stream

ideal data eye


Wei

realistic data eye


6

Low-Power High-Speed Links

Channel Impairments and ISI


One of the dominant causes of eye closure is inter-symbol interference (ISI) due to channel bandwidth limitations
Two ways to view channel impairments

Wei

Low-Power High-Speed Links

Equalization
Placing a high-pass filter in the signal path can counter the rolloff effects of the channel
Preemphasis or transmit-side equalization is commonly used

Wei

Low-Power High-Speed Links

Critical Path in Links


The critical path in links can be as short as 1~2 gate delays
Transmit path

Rterm

Zo=50

Receive path

Rterm
Zo=50 Deven
D Q

Vref Dodd
D Q

Wei Low-Power High-Speed Links 9

Clock Frequency Limit


While a symbol time can be short, there is a limit to the maximum on-chip clock frequency
Must distribute a clock driven by a buffer chain

CLKIN

CLKOUT

Wei

Overcome this limitation with parallelism


Low-Power High-Speed Links 10

Parallelism
Parallelism can increase bit rate even with limited clock frequency
Time-interleaved multiplexing Multi-level signaling

clk[n] clk[n+1] data

10 11 01 00

Some performance issues to be wary of


Static timing offsets in multi-phase clock generator (DLL or PLL) Requires higher voltage dynamic range in transmitter and receiver

Wei

Parallelism can also be low power


Low-Power High-Speed Links 11

Sources of Noise
Power supply noise
Translates into voltage and timing uncertainty

Cross talk
Near- and Far-End Cross Talk (NEXT and FEXT) High-frequency coupling

Clock Jitter
Timing uncertainty in transmitted and sampled symbol Probabilistic distribution of timing edges (bounded and unbounded components)

Device mismatches and systematic offsets


Deterministic or systematic variation in timing edges from multi-phase clock generators

Wei

Low-Power High-Speed Links

12

Considerations for Low Power


Low noise low power
Target some signal to noise ratio (SNR) Reducing noise allows for lower signal power

Trade speed for lower power


Reducing bit rate improves SNR Many noise sources are fixed ratio of timing uncertainty to bit time improves (have longer bit times)

Lets look at a few design choices for low power


Circuit level Architecture level System level
Wei Low-Power High-Speed Links 13

Offset Calibration
Two sets of offsets that can manifest itself as voltage and timing uncertainty to close the eye and may require higher power to overcome them:
Multi-phase clock generator timing offsets Receiver input voltage offsets
ideal Tx timing
higher margins

zero Rx offset non-zero Rx offset

w/ offsets
lower margins

sampling edge

Static offsets due to systematic (layout) and random (device) mismatches


Calibration enables more timing and voltage margins (i.e., lower noise)

Wei

Low-Power High-Speed Links

14

Differential Signaling
Differential communication can lead to a lower power solution
Immunity to common mode noise Injects less noise into the supplies Signal amplitudes can be smaller on both channels Alternative is pseudo-differential signaling but needs a reference voltage which can be noisy and require larger Vswing Rterm
Zo=50

But arent there now are two channels that switch? Yes, but

Rterm D

Vswingdiff pk2pk = 2(I*R)

Rterm

Vswing = I*R

Zo=50
Vref

Zo=50

What does it cost?


Requires two pins per link

Wei

Low-Power High-Speed Links

15

Signal Multiplexing
There are different options for choosing where to combine pulses to create sub-clock period symbols
Combine at the final transmitter stage vs. farther up stream
Rterm
Zo=50 high-speed path

Rterm
Zo=50

D0

D1

D2

D3

Cload for the clocks higher when combined at the final transmitter vs. Need faster signal path after the multiplexer

Wei

Best choice depends on implementation (see Zerbe, ISSCC2003)


Low-Power High-Speed Links 16

Multiplexing
Parallelized Transmitter TX TX TX channel bitrate = M.fclk Parallelized Receiver RX RX RX

Multiphase Clocks

Wei

Low-Power High-Speed Links

17

Multiplexing = Low Power?


With M:1 multiplexing, fCLK = bit rate/M Power = MCV2f = MCV2 (BR/M) = CV2BR With fixed supply, power does not vary with M

But wait, at lower frequencies, I can lower voltage! With lower supply voltage (V fCLK = BR/M), Power decreases as 1/M2 !

Wei

Low-Power High-Speed Links

18

Power vs. M
Larger M Fixed Supply B Can reduce voltage B Lower power B Less accurate timing
static phase offsets jitter

Lower Supply 1/M or fCLK

Cannot make M arbitrarily large b/c there is a lower limit to Vdd Choice: M= 4~6

choice

This begs the questions What if we make Vdd adaptive w/ fCLK?


Wei Low-Power High-Speed Links 19

DVS Links
Dynamic Voltage Scaling (DVS)
Technique first introduced for digital systems (e.g., uP, DSP chips)
Lots of work done in both academia and industry (e.g., Intel, Transmeta)

Allows trade off between speed and power

Lets investigate DVS for high-speed links


Motivation and potential benefits Design example from Dr. Jaeha Kim (ISSCC2002, JSSC2002, PhD thesis 2002)

Wei

Low-Power High-Speed Links

20

DVS Links
Dynamic Voltage Scaling (DVS) can reduce power consumption in two ways
1) Digital circuits operate at their most energy-efficient point in the presence of PVT variations by eliminating extra performance margins
1.2

Normalized Frequency

1 0.8 0.6 0.4 0.2 0

Pdynamic = Cswitched V 2F
88% 0 50 130

0.5 V 0.8 V
1 1.5 2 2.5 3 3.5

Supply Voltage (V)


Wei Low-Power High-Speed Links 21

Trade Performance for Energy Savings


2) DVS enables trade off between performance and energy
Reducing frequency alone reduces power but not energy per bit

1 Normalized energy / bit 0.8 0.6 0.4

Fixed Vdd = 3.3V

E C V 2

Energy Savings

Dynamically scaled Vdd 0.2 0

0.2

0.4 0.6 Normalized bit rate


Low-Power High-Speed Links

0.8

Wei

22

DVS Link Components


DVS links require two additional components
Mechanism to measure circuit critical path to appropriately adjust voltage with respect to frequency
Use an on-chip performance monitor circuit (inverter delay elements of core DLL)

Efficient supply-voltage regulator (buck converter)

Overall Block diagram (Wei et al, ISSCC2000)


core DLL

FCLK

VCTRL

Digital Controller

Buck Converter I/O Transceiver


DTX DRX

RVdd

Wei

Low-Power High-Speed Links

23

Performance Monitoring DLL


CP 0 A clk
Reduces design complexity by enabling one to replace precision analog components with simple digital gates. How?
Inverters of the delay line model the critical path (clock distribution) Delay of gates in I/O circuitry are fixed relative to clock period
Wei Low-Power High-Speed Links 24
O

VCP 180
O

UP

DN

PD VCTRL

Adaptive Receiver Filter


Filter signal frequencies beyond the Nyquist rate at the receiver (helps for dealing with cross talk) Receiver example
IN IN

Vbias
0

AdB

Preamplifier

fsymbol
1

Regenerative Latch

Filters corner frequency tracks fsymbol


Wei Low-Power High-Speed Links 25

Example: Adaptive Supply Serial Links


Jaeha Kim (Ph.D. defense 2002)
Multiphase Clock Recovery Multiphase Clock Generation

1:5 Demultiplexing Receiver

Adaptive Supply, V

5:1 Multiplexing Transmitter

fref

Adaptive Power Supply Regulator

Wei

Low-Power High-Speed Links

26

Multi-Phase Clock Generation


Must minimize static offsets between phases Generate multiphase clocks locally at each pin, but watch out for power and area overhead
Static Offsets Jitter

TX TX TX

Wei

Low-Power High-Speed Links

27

Dual-Loop Clock Generation


Adaptive Power Supply Regulator

fref
f

Coarse Control Reference VCO

Adaptive Supply, V
Local Multiphase Clock Generation Fine Control Local VCO

Local Multiphase Clock Recovery Fine Control Local VCO

1:5 Demultiplexing Receiver

5:1 Multiplexing Transmitter

Wei

Low-Power High-Speed Links

28

Dual-Loop Clock Generation (2)


Global loop brings the local VCO frequency close to lock Narrow local tuning range (+/-15%) is sufficient to compensate for on-chip mismatches Narrow tuning range leads to low VCO gain
Small loop capacitor area (2.5pF) Low sensitivity on Vctrl noise

Wei

Low-Power High-Speed Links

29

Clock Recovery
Optimal receiver timing is recovered from the incoming data stream
Data RX

PD Clock Recovery Filter VCO

Wei

Low-Power High-Speed Links

30

Phase Detection
Phase detector made of an identical set of receivers minimizes timing error

RX PD

PD RX PD RX PD RX PD RX PD

[4:0]

Edge Detection / Majority Voting

Early/Late /None

5
Data RX

Received Data

Wei

Low-Power High-Speed Links

31

Chip Prototype
TX
Digital Sliding Controller

TX
data gen

TX

TX
data gen

TX/RX Feedback Biasing

Testing Interface

RX- PRBS PLL RX

RX-DLL RX
PRBS

0.25m CMOS 2.5V / 0.55Vth 3.12.9mm2 0.4~5.0Gb/s 0.9~2.5V 5.6~375mW BER < 10-15 Reg. Efficiency: 83-94%

Power Transistors

Wei

Low-Power High-Speed Links

TX-DLL

TX-PLL

32

Power and Performance

3.1Gb/s, 113mW

Wei

Low-Power High-Speed Links

33

Power Breakdown

Wei

Low-Power High-Speed Links

34

Summary
Higher performance links require low noise low noise solutions lead to lower power
Trade performance (speed) for power reduction

DVS links enable energy-efficient link operation and also have some nice properties Outstanding issues with using DVS links
Communication during frequency and voltage transitions Supply voltage regulator slew rate limits Overhead of multiple regulators

Wei

Low-Power High-Speed Links

35

You might also like