Download as pdf or txt
Download as pdf or txt
You are on page 1of 5

Int. ASIC conference, Sept. 1999, Washington, D.C.

A Low-Power Viterbi Decoder Design


for Wireless Communications Applications

Samirkumar Ranpara Dong Sam Ha


Portland Technology Division, Bradley Dept. of Electrical and Computer Eng.
Intel Corporation. Virginia Tech, Blacksburg VA 24061
Hillsboro OR 97124 E-mail: ha@vt.edu Web: www.ee.vt.edu/ha

Abstract ---Viterbi decoders employed in digital The hardware complexity of a Viterbi decoder
wireless communications are complex and dissipate is proportional to the number of states in the trellis,
large amount of power. In this paper, we investigate which is equal to the number of states of the
power dissipation for three different corresponding convolutional encoder. For instance, the
implementations of Viterbi algorithm for wireless CDMA standard IS-95 employs a convolutional
communications applications, including our encoder with 8 states. Hence, a Viterbi decoder for the
proposed low-power Viterbi decoder. The schemes CDMA system has 256 states in the trellis, which
employed in our low-power design are clock-gating results in complex hardware and high power dissipation
and toggle filtering. We described the behavior of unless designed properly for low power dissipation.
three Viterbi decoders in VHDL and synthesized The design of high performance Viterbi
using a synthesis tool. The synthesized circuits were decoders has been investigated intensively in the past
placed and routed in the standard cell design three decades [1]. Recently, the low-power design of
environment. Power estimation obtained through Viterbi decoders has been an important issue for mobile
gate level simulations indicates that the proposed and portable applications. Numerous techniques to
design reduces the power dissipation of an original reduce power dissipation have been proposed, and we
Viterbi decoder design by 55%. review three recent works briefly. Chan, Lee, Lin and
Chen implemented an adaptive Viterbi decoder based
I. INTRODUCTION on an earlier work [2]. The adaptive decoder discards
some states (in the trellis) with high path metrics
Viterbi decoders are used to decode dynamically during the decoding process. Seki et al.
convolutional coding, which has been used in deep and Lang et al. suggested the use of a scarce state
space communications as well as wireless transition (SST) scheme [3],[4]. The scheme employs a
communications. A wireless cellular standard for simple pre-decoder followed by a pre-encoder to
CDMA (code division multiple access), IS-95 employs minimize signal transitions at the input of a
convolutional coding. A third generation wireless conventional Viterbi decoder, which leads to lower
cellular standard, to be introduced in early 2000s, dynamic power dissipation. Kang and Wilson studied
adopts turbo coding, which stems from convolutional various issues in designing a low-power Viterbi
coding. decoder for the IS-95 CDMA system [5]. Their decoder
The Viterbi algorithm is to find a maximum- employs various low-power design schemes such as
likelihood sequence of state transitions, equivalently a state partition, gated control, and gray coding.
path, in a trellis by assigning a transition metric to In this paper, we investigate two different
possible state transitions. A transition metric is called a implementations of the Viterbi algorithm in the IS-95
branch metric, and the cumulative branch metrics along CDMA environment. We compare the area and power
the path from the initial state to a given state is called dissipation of the two implementations and suggest a
the path metric of the state. When two or more paths low-power Viterbi decoder design to save power.
end at the same state, the path with the smallest (or
largest) path metric is selected as the most likely path. II. VITERBI DECODER
The survivor path obtained by backtracing in time
corresponds to the decoded output. In this section, we describe key parameters and
the architecture of the Viterbi decoders considered in

1
Int. ASIC conference, Sept. 1999, Washington, D.C.

the paper. We also explain two different methods for approach results in complex hardware due to the need
management of survivor path information and a block to copy the contents of all the registers in a stage to the
diagram of a Viterbi decoder. next stage.
In traceback, the survivor branch of each state
2.1 Parameters of the proposed Viterbi decoder is recorded. One bit is assigned to each state to indicate
The IS-95 CDMA system employs a 256-state if the survivor branch is the upper or the lower path.
convolutional encoder with a rate of r=1/2 in the Using the one-bit information of each state, it is
forward link and a convolutional encoder with the same possible to traceback the survivor path starting from the
number of states but with a reduced rate of r=1/3 in the final state (which is identical to the initial state for the
reverse link. In this paper, we consider a small-sized Viterbi decoders considered in this paper). The decoded
Viterbi decoder, which corresponds to a scaled-down output sequence can be obtained from the identified
version of the IS-95 convolutional encoder in the survivor path through the trellis.
reverse link, a 16-state encoder with a rate of r=1/3. The Figure 2 illustrates the traceback approach. For
number of symbols in a frame for IS-95 is as high as simplicity, only the significant bits of registers are
192, but we limit the number of symbols to 20 for our shown in the figure. As in the register-exchange
Viterbi decoder. Like the IS-95 system, the tail approach, a register is assigned to each state. Each
sequence in each frame is fixed to ‘0’ and resets the register contains the past history of survivor branches,
convolutional encoder to its initial state. Therefore, the ‘1’ for the upper branch and ‘0’ for the lower branch in
traceback operation can start from a known state for our the figure. As the decoding process proceeds, new bits
Viterbi decoders as well as for an IS-95 decoder. for survivor path information are appended to the
existing one.
2.2 Management of survivor path information
Two basic approaches used to record survivor
paths are register-exchange and traceback [1]. Since S3 0 01 010
one of our objectives is to investigate the power
efficiency of the two approaches, we describe the two
S2
approaches in detail. 0 00 000

In the register exchange, a register assigned to


each state contains information bits for the survivor S1 0 00 001 0011
path from the initial state to the current state. In fact,
the register keeps the partially decoded output sequence 0 00 000 0001
S0
along the path, as illustrated in Figure 1. The register of
state S1 at t=3 contains ‘101’, which is the decoded
output sequence along the bold path from the initial
state. t=0 t=1 t=2 t=3 t=4

Figure 2: An example of traceback approach


S3 11 111 1011

The survivor path information is applied to the


S2 10 110 1010 least significant bit of each register, and all the registers
perform a shift left operation at each stage to make
S1 1 01 101 1101 room for the next bits [12]. Hence, each register fills in
the survivor path information from the least significant
S0 0 00 000 1100 bit toward the most significant bit. The scheme is called
shift update in this paper to distinguish it from the
proposed scheme. The shift update method is simple in
t=0 t=1 t=2 t=3 t=4 implementation but causes high switching activity due
to the shift operation and, hence, results in high power
Figure 1: An example register-exchange approach dissipation. We propose a low-power Viterbi decoder
design to address the problem.
The register-exchange approach eliminates the
need to traceback since the register of the final state
contains the decoded output sequence. However, this

ii
Int. ASIC conference, Sept. 1999, Washington, D.C.

2.3 System architecture and generates the decoded output corresponding to the
The major building blocks of a Viterbi decoder path.
are shown in Figure 3. The role of each block is
described briefly below. III. PROPOSED LOW-POWER DESIGN

Viterbi Decoding Block We explained two basic approaches used to


record survivor paths: register-exchange and traceback.
Both methods cause substantial switching activity and
Branch Path
hence are inefficient in power dissipation. In this

Output interface
metric ACS metric
Input interface

storage storage
section, we present two schemes to reduce power
dissipation in the traceback approach.
Survivor path storage
3.1 Survivor path storage block and clock gating
In the traceback approach, one flip-flop is
Output generator
necessary to record the survivor branch for each state
per stage. The shift update scheme forms a shift register
for each state by collecting the flip-flops in the
Control block
horizontal direction as shown in Figure 4 (a). The
survivor branch information is filled into the least
significant bits of the registers as explained earlier. We
Figure 3: Block diagram of a Viterbi decoder propose the formation of registers vertically as shown
in Figure 4 (b). The survivor branch information is
Input and output interface: Input and output interface filled into registers from left to right as time progresses.
blocks provide the interface, including a serial-to- This scheme is called selective update in this paper.
parallel conversion and vice versa.
Branch metric storage: The block calculates the New survivor
branch metrics of each stage in the trellis, and a branch path information

metric is calculated as the hamming distances between


the received symbol and expected symbol in this paper. S15 R15
New
survivor
Path metric storage: The block stores the path metric path
of each state at the current stage. information R0 R1 R19

ACS: The ACS (Add-Compare-Select) block is a S1 R1

collection of ACS units. An ACS unit receives two R0


S0
branch metrics and two path metrics. It adds each t=0 t=1 t=19
incoming branch metric to the corresponding path
metric and compares the two results to select a smaller (a) Shift update (b) Selective update
one. The path metric of the state is updated with the Figure 4: Shift update versus selective update
selected one. An ACS unit can be time-shared between
multiple modules, but it incurs more power dissipation The key difference between the two schemes
due to the control circuitry. One ACS unit is assigned to is that the content of a register in the selective update
each state in our designs. method does not change once it is updated. Hence, the
Survivor path storage: The block is a bank of register incurs less switching activity, thus reducing
registers, which record the survivor path of each state power dissipation. Moreover, the fact makes it possible
selected by the ACS module. A register is assigned to to apply a clock-gating scheme to further reduce power
each state, and the length of a register is equal to the dissipation. The clock of each register is enabled only
frame length (which is 20 for our decoders). when the register updates its survivor path information.
Output generator: This block generates the decoded Figure 5 shows the proposed survivor path
output sequence. It is trivial for the register-exchange storage block for the selective update method with
approach since the decoded output is the content of the
employment of clock-gating. In the figure, register Ri
register of the final state. In the traceback approach, the
holds the survivor path information of all the states at
block incorporates combinational logic, which traces
stage i, where i is 1 to 20 for our decoders. The five-bit
back along the survivor path starting from the final state
counter keeps track of the current stage, which is equal
to the number of code symbols received so far for the

iii
Int. ASIC conference, Sept. 1999, Washington, D.C.

frame. When the ith code symbol is received, the clock path at the end of the frame and generates the decoded
of register Ri is enabled, and the survivor path output sequence. The decoded output sequence is
information of all the states at stage i is recorded in the loaded into a register at the first clock.
register. Note that all the other registers hold their state Since the registers update the survivor path
since the clock of the other registers is disabled. information progressively throughout the entire frame,
Therefore the proposed survivor path storage reduces the output generator receives spurious inputs, which
switching activity to a minimal level to save power. cause unnecessary switching activity to dissipate
Finally, it is possible to replace the five-bit counter and power. The proposed design blocks spurious inputs
the demultiplexer with a ring-counter. applied to the block. The array of AND gates and the
enable signal shown in Figure 7 are introduced for this
purpose. The enable signal is activated during the one
Survivor path
information clock period at the end of the frame as shown in Figure
16
7.

R1 R2 R 19 R 20

R1 R2 R20

Clock Enable 16 16 16
0 0 1 0

Demux 16 16 16
0
0
1 Trace back block
1
0
20
Parallel load
Clock
5-bit Register
counter

Figure 7: A block diagram of the output generator


Figure 5: Proposed survivor path storage block for the
selective update with clock gating
IV. EXPERIMENTAL RESULTS
3.2 Toggle filtering of the output generator block
We measured the area and power dissipation
The output generator block in the traceback
of the three different implementations of Viterbi
approach traces back the survivor path after all the
decoders: the register-exchange; the traceback approach
symbols have been received and generates the decoded
with the shift update scheme; and the proposed low-
output sequence. The block is a combinational circuit,
power design.
which can be active during only one clock cycle as
We implemented the three different Viterbi
shown in Figure 6.
decoders in the standard cell environment. First, we
Begin a new frame
described the three Viterbi decoders at the register
transfer level in VHDL and synthesized them using a
inactive active inactive Synopsys tool. Then the synthesized gate-level circuits
were migrated into Cadence environment and placed
and routed using Cadence tools. The processing
18 19 20 1 2 time unit
technology used in our experiments was CMOS six-
metal layer 0.25 µm with the supply voltage of 1.8V.
Figure 6: Activation of the output generator block Power dissipation was estimated for the
synthesized gate-level circuits using Synopsys tools.
A block diagram of the output generator block The process for power estimation is as follows. About
is shown in Figure 7. Ignoring the AND gates with an 10,000 random patterns were simulated at the clock
enable input for the time being, the block receives frequency of 10 MHz for each Viterbi decoder, and the
inputs from the survivor storage block containing the switching activity of each node was recorded. Then, the
survivor path information. The block traces the survivor power dissipation of the circuit was estimated using a

iv
Int. ASIC conference, Sept. 1999, Washington, D.C.

formula, P= αCV2f, where α is the switching activity, C design are clock-gating of the survivor path storage
is the parasitic capacitance, V is the supply voltage block and toggle filtering of output generation block.
(=1.8V), and f is the clock frequency. The clock We implemented the three designs in the
frequency was set to 10 MHz [6]. The static power standard cell design environment and measured the
dissipation of cells was not considered due to the performance in terms of area and power dissipation.
limitation of the library cells used in our experiments. Among the three implementations, it is interesting to
Experimental results for three Viterbi decoders note that the proposed low-power design takes the
are shown in Table 1. Among the three designs, the smallest area and dissipates the least power. The
register-exchange approach has the largest area and the proposed design reduces the power dissipation of the
proposed low-power design the least area. The area of register-exchange approach by 55%. However, our
the proposed design is 30% less than that for the design and the traceback approach with the shift update
register-exchange approach and 22% less than that of scheme are slower than the register-exchange approach
the traceback approach. We observed that the large area due to the need for the traceback operation.
for the register-exchange method is due to the Finally, it is difficult to make a head-to-head
complexity of the survivor path storage block. comparison of power efficiency between the proposed
method and other existing methods due to different
Table 1: Performance of the three Viterbi decoders environments (such as hard decision versus soft
decision) and constraints imposed. Nonetheless some of
Type Register- Traceback Proposed our techniques can be applied to other low-power
exchange design designs to further save power.
Total area
176085 157284 123309 REFERENCES
(µm2)
power
dissipation [1] S. B. Wicker, Error Control Systems for Digital
1167 1529 683
Communication and Storage, Prentice Hall, 1995.
(µw)
[2] M-H Chan, W-T Lee, M-C Lin and L-G Chen, “IC
Design of an Adaptive Viterbi Decoder,” IEEE
The power dissipation of the three decoders is
Trans. on Consumer Electronics, Vol. 42, pp. 52-
small due to the small size of the decoders and is in the
61, Feb. 1996.
range of 0.7 mW to 1.5 mW. The traceback approach
[3] K. Seki; S. Kubota; M. Mizoguchi and S. Kato,
dissipates the largest amount of power, while the
“Very Low Power Consumption Viterbi Decoder
proposed low-power design dissipates the least power.
LSIC Employing the SST (Scarce State Transition)
The proposed method reduces the power dissipation by
Scheme for Multimedia Mobile Communications,”
41% compared with the register exchange approach and
Electronics-Letters, IEE, Vol.30, no.8, pp.637-639,
by 55% compared with the traceback approach. We
April 1994.
observed that the selective update scheme and toggle
[4] L. Lang; C.Y. Tsui and R.S. Cheng, “Low Power
filtering are most effective in saving power and the
Soft Output Viterbi Decoder Scheme for Turbo
clock gating the least effective. Details on the
Code Decoding,” Proc. Int. Symp. on Circuits and
experiments, including the area of each block and the
Systems, pp. 1369-1372, vol.2, 1997.
power dissipation for several derivatives of the
[5] Kang and A.N. Willson Jr,” Low-power Viterbi
decoders, are available in [7].
decoder for CDMA Mobile Terminals, “
Conference-Paper; Journal-article, IEEE-Journal-
V. SUMMARY
of-Solid-State-Circuits. Vol.33, no.3, p.473-82,
March 1998.
Viterbi decoders employed in digital wireless
[6] G. K. Yeap, Practical Low Power Digital VLSI
communications are complex and dissipate high power.
Design, Kluwer Academic Publishers, 1998.
In this paper, we investigate power dissipations of two
[7] S. Ranpara, On a Viterbi Decoder Design for Low
different implementations of Viterbi decoders: the
Power Dissipation, M.S. Thesis, Dept. of Electrical
register-exchange approach and the traceback approach
and Computer Eng., Virginia Polytechnic Institute
with shift update scheme. We propose a low-power
and State University, April 1999.
Viterbi decoder design based on the traceback
approach. The schemes employed for our low-power

You might also like