A Compact Low-Power Decimation Filter for Sigma Delta Modulators


Suet-Fei Li and John Wetherrell (suetfei, wetherel@eecs.berkeley.edu)
University of California at Berkeley

ABSTRACT

Sigma delta modulators can provide the lowest power and area solution for high resolution A/D converters. Unfortunately, the required decimation filter for the modulator tends to consume much more power and area than the modulator. This paper addresses both problems by introducing a new FIR-Sinc architecture and taking advantage of the low number of bits at the input. Careful choice of pipelining and logic style and the use of multiple-VDD logic lead to a low power implementation without compromising on performance and area. The power consumed is 93% less and the area used is 14% less than the best reported designs. The 64-tap programmable decimation filter with 10-bit coefficients consumes 1.5 mW of power and 0.71 mm² of area running on a 40 MHz 1.5-bit input with a 1.25 MHz 15-bit output on a 1.65 V supply.

1. BACKGROUND

Recent power optimization of sigma delta modulators is creating an ever increasing need for power and area reduction in digital decimation and filtering. Die area and power are dominated by the digital filter implementations. The future need to make these filters programmable makes this situation worse. Our proposed architecture can improve the situation by significantly reducing the size of the adders and multipliers required, and hence dramatically reducing the area and power consumption.

1.1. Sinc-FIR Configuration

Fig 1: Typical decimation filter configuration.

Figure 1 and Figure 2 show a typical digital decimation filter configuration and the FIR filter section of the filter, respectively. Notice that the FIR filter requires, for example, 64 6x10 bit multipliers and 16x10 bit adders, all running at twice the output sampling frequency. This amount of computation demands a large consumption of power and area. In practice, a smaller number of multipliers is used and time multiplexed for area savings at the cost of increased power consumption. This time multiplexing becomes increasingly prohibitive at the higher sampling frequencies demanded by such applications as CDMA receivers and ADSL modems running at low MHz output sampling rates with 15 bit resolutions.

Fig 2: Typical FIR filter configuration (runs @ …)

$Y(z) = X(z)\,(1 + a_1 z^{-1} + a_2 z^{-2} + \dots + a_N z^{-N})$

1.2. FIR-Sinc Configuration

Figure 3 shows the proposed decimation filter configuration with the FIR filter and sinc filter sections swapped. The overall filter has the same input and output word size, and the same frequency response. However, the low input bit width (1 bit versus 10 bit) implies that this new architecture requires much smaller adders and multipliers. As a result, this implementation has a smaller area and power consumption. In addition, the simplicity of its structure allows full pipelining and hence a very fast implementation.

Fig 3: New decimation filter configuration (n-tap FIR followed by sinc; rates labeled fs*OSR, fs*OSR, fs*2)

Typically a FIR filter with the minimum number of taps can be achieved by operating the sample frequency at 4 times the desired signal bandwidth. This is called a half-band filter, because the signal band is at half of the Nyquist frequency, which is half of the sampling rate. Half-band filters have the advantage that every other coefficient is zero, except for the middle coefficient, which is 0.5. Multiplying by 1/2 is simply a shift, and thus half of the multiplications and additions can be eliminated with a half-band filter.

To operate a FIR filter at N times higher frequencies requires N times more coefficients for the same adjacent channel attenuation. By aliasing the original coefficients, most of the coefficients at the high frequency can be set to zero, reducing the hardware requirements to the same as the typical implementation.

1.3. Preliminary analysis: FIR-sinc vs. sinc-FIR

For parallel implementations, the FIR-sinc architecture will consume less power than its sinc-FIR counterpart for oversampling ratios less than 48. An equation to determine the Over Sampling Ratio (OSR) at which to switch from one implementation to the other was derived and is shown below.

[Equation: OSR crossover between FIR-sinc and sinc-FIR]

For practical implementations the Sinc-FIR is implemented in a serial architecture.

Fig 4: FIR-Sinc implementation for a cascaded SD modulator

Reference [3] incorrectly states that decimation filters using a 1-bit input cannot be used for cascaded SD modulators. Cascaded SD modulators can take advantage of a 1-bit input by using the architecture shown in Figure 4 above. The recombination logic is moved to the output of two decimation filters to preserve the low number of bits in the decimation filter.

0-7803-6293-4/00/$10.00 ©2000 IEEE
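The claim in Section 1.2 that swapping the two stages leaves the overall response unchanged follows from the fact that linear time-invariant filters commute. The sketch below is an illustration only and assumes a single-rate model: it ignores the decimation between stages, and the coefficient values, the 4th-order length-8 sinc, and the random input stream are hypothetical stand-ins, not the paper's design. It also shows why the FIR stage sees a much narrower input word when it comes first.

```python
import numpy as np

def sinc_filter(order: int, length: int) -> np.ndarray:
    """Impulse response of `order` cascaded moving sums of `length` samples."""
    h = np.ones(1)
    for _ in range(order):
        h = np.convolve(h, np.ones(length))
    return h

rng = np.random.default_rng(0)
bits = rng.choice([-1.0, 1.0], size=256)               # stand-in for a 1-bit modulator stream
fir = rng.integers(-512, 512, size=16).astype(float)   # hypothetical 10-bit coefficients
sinc = sinc_filter(order=4, length=8)                   # hypothetical sinc stage

# Same overall output regardless of which stage runs first (single-rate model).
sinc_then_fir = np.convolve(np.convolve(bits, sinc), fir)
fir_then_sinc = np.convolve(np.convolve(bits, fir), sinc)
same_response = np.allclose(sinc_then_fir, fir_then_sinc)

# Largest magnitude presented to the FIR stage in each ordering.
fir_input_peak_sinc_first = np.max(np.abs(np.convolve(bits, sinc)))
fir_input_peak_fir_first = np.max(np.abs(bits))

print(same_response, fir_input_peak_sinc_first, fir_input_peak_fir_first)
```

With the FIR stage first, its input never exceeds magnitude 1, whereas after the sinc stage the same FIR would have to accept multi-bit words. This is the property the FIR-Sinc ordering exploits.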

2. IMPLEMENTATION

In this section, circuit implementations of the basic blocks in the proposed architecture are described first. We have studied various logic styles for implementing these basic blocks, and our focus is to find the implementation that consumes minimum area and power while still meeting our moderate performance requirement. System level techniques are discussed at the end of the section.

2.1. FIR Filter

A minimum area approach with high speed through pipelining was used to implement the FIR filter.

2.1.1. Pipelining

One has to be cautious when pipelining this architecture. Typically the operating speed of the filter is limited by the performance of the analog sigma delta modulator. Here maximum speeds rarely exceed 100 MHz, and are more typically around 20 MHz. This filter was designed for a 40 MHz input. Because of this speed limitation, pipelining may not be necessary between every stage, even with minimum size devices. We decided to use pipelining at every tap to allow operation at very low supply voltages.

2.1.2. Tapped Delay Line

The tapped delay line for the FIR filter consists of hundreds of delay elements all driven by the same clock. This places a significant load on the clock drivers. The delay elements also consume significant amounts of area. For these reasons a delay element with minimal clock loading is chosen, such as a true single phase clock D-type flip flop (TSPC DFF).

Figure 5 below shows an attractive way to implement a small flip flop.

The flip flop consists of only 10 transistors, 8 of which are cascaded transistors that require no intermediate contact. This helps reduce the size of the gate layout. To reduce area, all gates and transistors outside of the critical path use minimum size transistors. The maximum operating frequency is determined by the adder cells and not the delay lines, so most transistors in the tapped delay line are minimum size, except around the taps.

Fig 5: Flip flops a) TSPC b) Standard

2.1.3. Multipliers

The main advantage of the FIR-sinc implementation over the sinc-FIR implementation is the elimination of the multiplier blocks, which are the blocks that consume the most power and area in the filter. Figures 6 and 7 show implementations of the 1 and 1.5 bit multipliers.

Fig 6: 1xN-bit multiplier implementation

Fig 7: 1.5xN-bit multiplier implementation

2.1.4. Adders

[1] is an excellent reference comparing a variety of different adder architectures. This application requires approximately 492 1-bit full adders. Although absolute maximum speed is not necessary, the adder must be able to run at 40 MHz on low supply voltages, around 1.5 V, to minimize power dissipation. Therefore an attractive architecture is one with a small power-delay product and area. The ripple carry adder is chosen because of its simplicity, which leads to very small implementations. Ripple carry adders typically have larger propagation delays, which make them unattractive for large adders running at high speeds. For our application the adder sizes are small (around 10 bits), and ripple carry adders can easily meet the timing requirements.

Fig 8: Small area adder implementation

The area of the full adder shown above is 2964λ², which is the smallest layout known to the authors. The second best, which is 50% larger, is a current mode adder with an area of 4450λ².

2.2. Sinc Filter

The large number of bits in the sinc filter results in a speed bottleneck in the decimation filter. The carry signal has to ripple through approximately 20 full adders in this filter before the clock can change. Pipelining is difficult in this closed loop circuit, but can be achieved using carry save arithmetic as described in [2]. Use of a pipelined accumulator, however, is attractive in terms of speed but unattractive in terms of power and area. While the number of adders, N, remains constant, the number of flip flops increases from $2N$ to $2N + N^2$. A 4th order sinc filter consisting of four 24-bit pipelined accumulators will require 2304 additional flip flops over an unpipelined version. Thus we have chosen a more attractive adder architecture for this filter: carry lookahead adders, which are only slightly slower than the pipelined accumulator.
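The multiplier elimination described in Section 2.1.3 can be illustrated in software. The sketch below is a behavioral model only, not the circuits of Figures 6 and 7. It assumes that a 1-bit input encodes the levels +1 and -1 and that a 1.5-bit input encodes +1, 0 and -1; the function names are hypothetical. Under those assumptions each "multiplication" reduces to passing, negating, or zeroing the coefficient.

```python
def one_bit_product(bit: int, coeff: int) -> int:
    """1-bit input (assumed encoding: 1 -> +1, 0 -> -1): select +coeff or -coeff."""
    return coeff if bit else -coeff

def ternary_product(level: int, coeff: int) -> int:
    """1.5-bit input (assumed levels -1, 0, +1): select -coeff, 0, or +coeff."""
    if level == 0:
        return 0
    return coeff if level > 0 else -coeff

def one_bit_tap_sum(bits, coeffs) -> int:
    """One FIR output sample for a 1-bit input, using no multiplications."""
    return sum(one_bit_product(b, c) for b, c in zip(bits, coeffs))

def ternary_tap_sum(levels, coeffs) -> int:
    """One FIR output sample for a 3-level input, using no multiplications."""
    return sum(ternary_product(x, c) for x, c in zip(levels, coeffs))

coeffs = [3, 5, 7, -2]                           # hypothetical coefficients
print(one_bit_tap_sum([1, 0, 1, 1], coeffs))     # 3 - 5 + 7 - 2
print(ternary_tap_sum([1, 0, -1, 1], coeffs))    # 3 + 0 - 7 - 2
```

Only sign selection and addition remain, which is why the adders, rather than multipliers, set the area and speed of the FIR stage in this architecture.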

2.3. Multiple Power Supply Implementation

In the portable wireless applications that the decimation filter is designed for, it is often desirable to design all circuits to operate from a single supply. This eliminates the need for switching regulators. Switching regulators are particularly problematic for very low power circuits. The efficiency of common switching regulators drops for currents less than 1 mA, or power consumption less than 3 mW. The expected power consumption of our design with a 0.5 µm process will be around 1 mW, so switching regulators will result in very low efficiencies in this case.

A solution to this problem is to use an internal multiple power supply implementation for the logic. The architecture is illustrated in Figure 9 below.

Fig 9: Simple low-power multiple VDD logic (upper logic block and lower logic block stacked between VDD and GND)

The operation of the logic allows power savings in two ways. First, the power is halved by effectively cutting the swing of the logic in half. Second, the power is halved again because charge from the supply used by logic in the upper half is reused by the logic in the lower half before it is dumped to ground. In real operation, the power dissipated will be somewhere between the best case and the worst case shown below and will depend on the activity factors of the top and bottom circuits.

[Equations: best-case and worst-case power dissipation]

An important point to note is that, to reuse the energy, the middle node must be allowed to swing up by (VDD/2) * (CL/C). If an ideal amplifier is used as a VDD/2 buffer, energy from the upper logic will be lost to the buffer before the lower logic can use it. Energy reuse can be maintained by lowering the speed of the amplifier. If the amplifier bandwidth is low enough, it will not have time to reset the VDD/2 line before the lower logic can reuse the charge. A large holding capacitor, probably external, must also be used on the VDD/2 line to minimize ripple on the line.

The multiple VDD implementation is ideally suited for parallel implementation of symmetric FIR filters. Here, the two VDDs may be partitioned at the middle of the filter. The symmetry of the filter ensures equal power dissipation in the upper and lower logic blocks, so that little power is dissipated in regulating the circuit.

2.4. Filter power savings from higher level system design

Power savings can be achieved in the decimation filter by increasing the number of levels from 2 (1 bit) to 3 (1.5 bits). For radio applications, signals are typically very small (-20 dB to -40 dB, or 10 to 100 times smaller) relative to the maximum possible value. This small signal is centered around a DC bias point. For some cellular phone standards a power control loop is applied to the transmitter. This varies the transmit power to save energy and minimize interference to other cells, while keeping the received signal at an acceptably small level, which is well below the maximum level. A plot of the distribution of the received signal strength with power control is shown in the following figure [4].

Fig 10: Normalized probability vs. received signal strength

When small signals are applied at the input of a two-level modulator, the modulator output consists of 1s and 0s alternating near fs/2, as shown in the following figure.

Fig 11: 2 level (1 bit) sigma delta modulator

Fig 12: Typical a) 2-level SD modulator output (small inputs)

This forces the filter to add coefficients and immediately subtract them a cycle later, which is a wasted computation. For small inputs, the three level modulator output (the decimation filter input) stays relatively constant at zero, with occasional pulses representing the signal. Thus in normal operation, where a typical signal is 20 times lower than its maximum value, 20 times the power could potentially be saved by replacing a 2-level modulator with a 3-level modulator.
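The difference in input activity between 2-level and 3-level modulators can be seen in a small simulation. This is a rough sketch under stated assumptions, not the modulator used in the paper: it models a first-order loop, uses a constant input at 1/20 of full scale, and picks quantizer thresholds of ±0.5 for the 3-level case. The fraction of nonzero outputs is used as a crude proxy for how often the filter must add or subtract a coefficient.

```python
def first_order_modulator(samples, levels):
    """First-order sigma-delta loop with a 2-level or 3-level quantizer (assumed model)."""
    integrator = 0.0
    output = []
    for x in samples:
        if levels == 2:
            y = 1 if integrator >= 0 else -1          # outputs are always +1 or -1
        else:
            if integrator > 0.5:                       # assumed thresholds at +/-0.5
                y = 1
            elif integrator < -0.5:
                y = -1
            else:
                y = 0
        integrator += x - y                            # accumulate the quantization error
        output.append(y)
    return output

def nonzero_fraction(stream):
    """Share of samples that require a coefficient add or subtract."""
    return sum(1 for y in stream if y != 0) / len(stream)

small_input = [0.05] * 2000                            # 1/20 of full scale
two_level = first_order_modulator(small_input, levels=2)
three_level = first_order_modulator(small_input, levels=3)

print(nonzero_fraction(two_level), nonzero_fraction(three_level))
print(sum(two_level) / len(two_level), sum(three_level) / len(three_level))
```

Both streams average to the input value, but the 2-level stream is nonzero on every sample while the 3-level stream is mostly zero, which is consistent with the roughly 20 times activity reduction argued above, before overheads are counted.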
Fig 13: 3 level (1.5 bit) sigma delta modulator

However, let us not ignore the area, power, and speed overhead required by the extra bit. This amounts to an extra inverter and multiplexer for every coefficient bit. The adder sizes stay the same. Most significantly, the delay chain doubles with the addition of the extra bit, which roughly increases the area by 25% and doubles the delay chain power. With the overhead considered, the 3-level modulator saves 33% more power than its 2-level counterpart (see simulation result).

Further power savings can be achieved by varying the coefficient and adder lengths. The decimation filter is designed to filter signals 60 dB (10 bits) bigger than the desired signal, but the desired signal only needs 18 dB (3 bits) of resolution. After the signal is filtered, the 3 MSBs are removed for further processing. When the signal is large, this means the accuracy of the adders can be reduced by stopping the activity of the lower bits. On average, this can reduce the power dissipation by about 30%.

3. SIMULATION & LAYOUT RESULTS

Fig 15: Decimation Filter Layout

Table 1

                   2-level   3-level   [3]
Area (mm²)         0.55      0.71      0.85
Max Speed (MHz)    235       235       36
Power (mW)         2.02      1.54      20

From Table 1, we can see that our 3-level decimation filter consumes 13 times less power than the best reported decimation filter architecture [3] without compromising on area.

Table 2

Block    Area (mm²)   Area (%)
Logic
Pipe     0.11
Delay    0.32         45
Route    0.12         16
Total    0.71         100

From Table 2, we can see that the delay line implementation dominates the chip area.

4. CONCLUSIONS

This paper shows that significant power savings can be achieved by using a FIR-Sinc architecture over a sinc-FIR filter architecture. Careful choice of pipelining and logic style can reduce the power dissipation of the filter to levels below the power of the analog portion of a sigma-delta A/D converter. The FIR-Sinc architecture is very suitable for digital filtering of modulator outputs up to GHz range operation through massive pipelining. The simple multiplier-less, small-adder implementation lends itself well to pipelining to the bit level without significant area penalty. The primary speed limitation is the comb filter, which has recently been pipelined to the bit level [2]. When implemented with multiple-VDD logic, a simple high-efficiency solution to low-voltage, low-power logic can be achieved.

The FIR-Sinc architecture has 13 times lower power dissipation than the best reported decimation filter architecture [3]. The FIR-Sinc architecture has approximately the same area as a comparable Sinc-FIR architecture. The use of 3-level SD modulators saves 33% more power in the decimation filter over 2-level SD modulators by taking advantage of low activity factors.

REFERENCES

[1] C. Nagendra, M. Irwin, R. Owens, "Area-Time-Power Tradeoffs in Parallel Adders," IEEE J. Solid State Circuits, vol. 43, no. 10, pp. 689-702, October 1997.
[2] F. Lu, H. Samueli, "A 700MHz 24-b Pipelined Accumulator in 1.2µm CMOS for Application as a Numerically Controlled Oscillator," IEEE J. Solid State Circuits, vol. 28, no. 8, pp. 878-886, August 1993.
[3] Brandt, "A Low-Power, Area-Efficient Digital Filter for Decimation and Interpolation," IEEE J. Solid State Circuits, vol. 29, no. 6, p. 679, June 94.
[4] CDMA Network Engineering Handbook, 1992.
