
IEEE COMMUNICATIONS LETTERS, VOL. 15, NO. 12, DECEMBER 2011, pp. 1378–1380

A Simplified Successive-Cancellation Decoder for Polar Codes


Amin Alamdar-Yazdi and Frank R. Kschischang, Fellow, IEEE

Abstract—A modification of the successive-cancellation decoder for polar codes is introduced, in which the local decoders for rate-one constituent codes are simplified. This modification reduces the decoding latency and algorithmic complexity of the conventional decoder while preserving the bit and block error rates. Significant latency and complexity reductions are achieved over a wide range of code rates.

Index Terms—Polar codes, successive-cancellation decoding.

Manuscript received July 8, 2011. The associate editor coordinating the review of this letter and approving it for publication was V. Stankovic. The authors are with the Dept. of Electrical and Computer Engineering, University of Toronto, 10 King’s College Road, Toronto, Ontario M5S 3G4, Canada (e-mail: {ayazdi, frank}@comm.utoronto.ca). Digital Object Identifier 10.1109/LCOMM.2011.101811.111480

I. INTRODUCTION

POLAR CODES, introduced in [1], are the first provably capacity-achieving family of codes with low encoding and decoding complexity, and they have recently attracted much attention [2]–[6]. Although theoretically very interesting, polar codes have certain drawbacks in practice. To achieve error rates suitable for practical applications, the block-length has to be very large. The traditional successive-cancellation (SC) decoder suffers from very high latency, particularly at these long block-lengths, and the corresponding decoding algorithm has high complexity.

The SC decoder operates according to a recursive divide-and-conquer approach, reducing the problem of decoding a length-𝑁 polar code into two (coupled) decodings of length-𝑁/2 constituent codes. These decodings are not independent, as the input to the second decoder relies on the output of the first one, thus introducing significant latency. Furthermore, providing the input to the first decoder requires many computationally intensive ‘⊞’ function evaluations.

It is well known that the recursion can be simplified when the constituent code has rate zero (corresponding to the case when all involved bits are so-called “frozen bits”), since in this case the recursive decoder can be replaced with a trivial one, avoiding latency and computation. The main observation in this paper is that the recursion can also be simplified when the constituent code has rate one (corresponding to the case when all involved bits are so-called “information bits”). Latency and computation are reduced with no sacrifice in error performance. For a polar code of length 2^18 and rate 0.7, the new algorithm reduces the number of ‘⊞’ function evaluations by more than 33% and reduces the decoding latency by more than 88%.

II. BACKGROUND

A polar coding scheme is uniquely defined by three parameters: the block-length 𝑁 = 2^𝑛, the rate 𝑅 = 𝐾/𝑁, and an information set 𝐴 ⊂ [𝑁] of cardinality 𝐾. Given these three parameters, a polar codeword x is obtained by mapping a binary 𝑁-tuple b, whose 𝑖th component is set (“frozen”) to zero for all 𝑖 ∈ [𝑁] ∖ 𝐴, to x = b𝐺𝑛, where 𝐺𝑛 denotes the generator matrix for polar codes of block-length 2^𝑛 (as defined in [1]). We refer to bit 𝑏𝑖 as an information bit (resp., a frozen bit) with respect to 𝐴 if 𝑖 ∈ 𝐴 (resp., if 𝑖 ∉ 𝐴). The codeword x is then passed through the channel, and the received vector y = (𝑦0, …, 𝑦𝑁−1) is processed by the decoder.
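As a minimal illustration of this map, the following sketch builds 𝐺𝑛 as the bit-reversed Kronecker power 𝐵𝑁 𝐹^⊗𝑛 of the kernel 𝐹 = [[1, 0], [1, 1]] (the construction of [1]) and evaluates x = b𝐺𝑛 over GF(2); the function names are illustrative only and not taken from this letter.

```python
import numpy as np

def polar_generator(n):
    """G_n = B_n * F^(x)n over GF(2): n-fold Kronecker power of F = [[1,0],[1,1]],
    with rows permuted by the bit-reversal permutation B_n (as in [1])."""
    F = np.array([[1, 0], [1, 1]], dtype=int)
    G = np.array([[1]], dtype=int)
    for _ in range(n):
        G = np.kron(G, F)
    bitrev = [int(bin(i)[2:].zfill(n)[::-1], 2) for i in range(2 ** n)]
    return G[bitrev, :]

def polar_encode(info_bits, A, n):
    """Place the K information bits at the indices in A, freeze the rest to 0,
    and form x = b * G_n (mod 2)."""
    N = 2 ** n
    b = np.zeros(N, dtype=int)
    b[sorted(A)] = info_bits
    return b.dot(polar_generator(n)) % 2

# Toy example with n = 3 and the information set A = {5, 6, 7} used later in Fig. 1(b).
print(polar_encode([1, 0, 1], {5, 6, 7}, 3))
```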
We let ℝ̄ := ℝ ∪ {+∞, −∞} denote the extended reals. We denote by ℎ(𝑥) the binary quantizer that takes on the value 0 if 𝑥 > 0, takes on the value 1 if 𝑥 < 0, and takes on 1 and 0 each with probability 1/2 if 𝑥 = 0. If x is a vector or a tuple, then x[𝑖] represents the 𝑖th component of x, and y = ℎ(x) means y[𝑖] = ℎ(x[𝑖]) for all 𝑖. We define the binary operator ⊞, which combines two extended real numbers 𝑥 and 𝑦 to form 𝑥 ⊞ 𝑦 := 2 atanh(tanh(𝑥/2) tanh(𝑦/2)). This operator has the useful property that

ℎ(𝑥 ⊞ 𝑦) = ℎ(𝑥) ⊕ ℎ(𝑦) if 𝑥𝑦 ≠ 0,  (1)

where ‘⊕’ denotes the modulo-two sum.
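The quantizer ℎ and the ‘⊞’ operator, together with a randomized spot check of property (1), can be sketched as follows (illustrative code; the names are ours):

```python
import math, random

def boxplus(x, y):
    """x ⊞ y = 2*atanh(tanh(x/2)*tanh(y/2)) for finite reals."""
    return 2.0 * math.atanh(math.tanh(x / 2.0) * math.tanh(y / 2.0))

def h(x):
    """Binary quantizer: 0 if x > 0, 1 if x < 0, a fair coin flip if x == 0."""
    if x > 0:
        return 0
    if x < 0:
        return 1
    return random.randint(0, 1)

# Spot check of property (1): h(x ⊞ y) = h(x) XOR h(y) whenever x*y != 0.
for _ in range(10000):
    x, y = random.uniform(-6, 6), random.uniform(-6, 6)
    assert h(boxplus(x, y)) == h(x) ^ h(y)
```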
III. SC DECODING AS MESSAGE-PASSING

In this section, we set up a general framework for decoding polar codes recursively. Conventional SC decoding can be thought of as a special case of this general set-up.

For 𝑛 > 0, let 𝑇𝑛 denote the full binary tree of depth 𝑛, i.e., the binary tree with 2^𝑛 leaves, each having depth 𝑛. Given a node 𝑣, we refer to its depth, parent node, and left and right child nodes by 𝑑𝑣, 𝑝𝑣, 𝑣𝑙 and 𝑣𝑟, respectively; see Fig. 1(a). We use 𝑉𝑣 to denote the set of nodes of the subtree rooted at node 𝑣. The leaves of the tree are indexed by the set [2^𝑛] in the usual way, as illustrated in Fig. 1(b) for 𝑛 = 3. We let ℓ(𝑣) denote the index of a leaf node 𝑣. Furthermore, to each node 𝑣 we associate the set

ℐ𝑣 = {ℓ(𝑢) : 𝑢 ∈ 𝑉𝑣 and 𝑢 is a leaf node},

containing the indices of all leaf nodes that are descendants of node 𝑣.

Let 𝐴 ⊂ [2^𝑛] be the information set of a given polar code. We say that a node 𝑣 in 𝑇𝑛 is a rate one node with respect to 𝐴 if ℐ𝑣 ⊆ 𝐴, i.e., if the leaf nodes that are descendants of node 𝑣 are all information bits. Similarly, we say that 𝑣 is a rate zero node with respect to 𝐴 if ℐ𝑣 ⊆ [2^𝑛] ∖ 𝐴, i.e., if the leaf nodes that are descendants of node 𝑣 are all frozen bits. For example, in Fig. 1(b), where 𝐴 = {5, 6, 7}, rate zero nodes are shown as white circles and rate one nodes are shown as black circles.
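Since the node at depth 𝑑 whose leftmost leaf has index 𝑓 satisfies ℐ𝑣 = {𝑓, …, 𝑓 + 2^(𝑛−𝑑) − 1}, the rate one / rate zero classification can be computed directly from 𝐴, as in the following illustrative sketch (a hypothetical helper, not part of the letter):

```python
def classify_nodes(n, A):
    """Label every node of T_n as 'rate-1', 'rate-0', or 'mixed' with respect to A.

    A node is identified by (depth d, index f of its leftmost leaf); its leaf set is
    I_v = {f, ..., f + 2**(n - d) - 1}."""
    A = set(A)
    labels = {}
    for d in range(n + 1):
        span = 2 ** (n - d)                      # number of leaves under a depth-d node
        for f in range(0, 2 ** n, span):
            I_v = set(range(f, f + span))
            if I_v <= A:
                labels[(d, f)] = "rate-1"
            elif I_v.isdisjoint(A):
                labels[(d, f)] = "rate-0"
            else:
                labels[(d, f)] = "mixed"
    return labels

# Fig. 1(b) example: n = 3, A = {5, 6, 7}.
labels = classify_nodes(3, {5, 6, 7})
print(labels[(1, 0)], labels[(2, 6)])   # -> rate-0 rate-1
```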

Fig. 1. (a) Local decoder; (b) labelling the leaf nodes. Rate zero nodes are shown as white circles, rate one nodes are shown as black circles. [In (a), node 𝑣 exchanges messages with its parent 𝑝𝑣 (receiving 𝛼𝑣, returning 𝛽𝑣) and with its children 𝑣𝑙 and 𝑣𝑟 (sending 𝛼𝑣𝑙, 𝛼𝑣𝑟 and receiving 𝛽𝑣𝑙, 𝛽𝑣𝑟); in (b), the leaves are labelled 0–7.]

We assign to each node 𝑣 a constituent code with generator matrix 𝐺𝑛−𝑑𝑣 (ℐ𝑣 ∩ 𝐴), which is the matrix composed of all rows 𝑖 of 𝐺𝑛−𝑑𝑣 with 𝑖 ∈ ℐ𝑣 ∩ 𝐴. Each node 𝑣 acts as a decoder for its constituent code. As illustrated in Fig. 1, the decoder at node 𝑣 receives a soft information vector 𝛼𝑣 from its parent 𝑝𝑣 and is responsible for producing a codeword 𝛽𝑣 from its codebook. The decoding algorithm is initialized by feeding the decoder at the root with (𝜆0, …, 𝜆𝑁−1), where 𝜆𝑖 := log(Pr(𝑦𝑖 ∣ 𝑥𝑖 = 0)/Pr(𝑦𝑖 ∣ 𝑥𝑖 = 1)) (the channel log-likelihood ratios).

Each local decoder may use any suitable decoding algorithm. The conventional SC decoder is obtained when the local decoders use the following recursive algorithm.

In conventional SC decoding, when an internal (non-leaf) decoder 𝑣 is activated, it calculates 𝛼𝑣𝑙 via

𝛼𝑣𝑙 [𝑖] = 𝛼𝑣 [2𝑖] ⊞ 𝛼𝑣 [2𝑖 + 1]  for 𝑖 = 0, 1, …, 2^(𝑛−𝑑𝑣−1) − 1,

and passes 𝛼𝑣𝑙 to 𝑣𝑙. Local decoder 𝑣 then waits until it receives the codeword 𝛽𝑣𝑙 from 𝑣𝑙. It then calculates 𝛼𝑣𝑟 via

𝛼𝑣𝑟 [𝑖] = 𝛼𝑣 [2𝑖](1 − 2𝛽𝑣𝑙 [𝑖]) + 𝛼𝑣 [2𝑖 + 1]  for 𝑖 = 0, 1, …, 2^(𝑛−𝑑𝑣−1) − 1,

and passes 𝛼𝑣𝑟 to 𝑣𝑟. Local decoder 𝑣 then waits until it receives the codeword 𝛽𝑣𝑟 from 𝑣𝑟. It then calculates 𝛽𝑣 via

𝛽𝑣 [2𝑖 + 1] = 𝛽𝑣𝑟 [𝑖]  and  𝛽𝑣 [2𝑖] = 𝛽𝑣𝑙 [𝑖] ⊕ 𝛽𝑣𝑟 [𝑖],  (2)

and passes 𝛽𝑣 to 𝑝𝑣. The operations of the local decoders in 𝑉𝑣 terminate at this point, as these local decoders will never be activated again during the rest of the decoding.

When a leaf local decoder 𝑣 is activated, it immediately sets 𝛽𝑣 to ℎ(𝛼𝑣) if ℐ𝑣 ⊂ 𝐴 (otherwise it sets 𝛽𝑣 to zero) and passes 𝛽𝑣 to 𝑝𝑣. Once 𝛽𝑣 is known for a leaf 𝑣, information bit 𝑖 is decoded as b̂𝑖 = 𝛽𝑣, where 𝑖 is the index of 𝑣.

These message-passing rules can be slightly modified for rate zero nodes. Since all descendants of a rate zero node 𝑣 are themselves rate zero, node 𝑣 can immediately set 𝛽𝑣 to the zero codeword without activating its children. Children of rate zero nodes are never activated.
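For concreteness, the conventional recursion just described (including the rate zero shortcut) might be sketched as follows; this is our own illustrative Python with ℎ and ‘⊞’ redefined componentwise, not an optimized or hardware-oriented implementation.

```python
import numpy as np

def h_vec(alpha):
    """Componentwise quantizer h: 0 for positive entries, 1 for negative, coin flip at 0."""
    beta = (alpha < 0).astype(np.uint8)
    ties = (alpha == 0)
    beta[ties] = np.random.randint(0, 2, size=int(ties.sum()))
    return beta

def boxplus_vec(x, y):
    """Componentwise x ⊞ y."""
    return 2.0 * np.arctanh(np.tanh(x / 2.0) * np.tanh(y / 2.0))

def sc_decode(alpha, first, A, u_hat):
    """Conventional SC decoding of the subtree whose leaves are first, ..., first + len(alpha) - 1.
    Writes the hard bit decisions into u_hat and returns the local codeword beta_v."""
    size = len(alpha)
    if all(i not in A for i in range(first, first + size)):   # rate zero node: zero codeword
        u_hat[first:first + size] = 0
        return np.zeros(size, dtype=np.uint8)
    if size == 1:                                              # information leaf
        beta = h_vec(alpha)
        u_hat[first] = beta[0]
        return beta
    half = size // 2
    alpha_l = boxplus_vec(alpha[0::2], alpha[1::2])            # soft input for the left child
    beta_l = sc_decode(alpha_l, first, A, u_hat)
    alpha_r = alpha[0::2] * (1.0 - 2.0 * beta_l) + alpha[1::2] # left decisions fed back
    beta_r = sc_decode(alpha_r, first + half, A, u_hat)
    beta = np.empty(size, dtype=np.uint8)                      # combine the children via (2)
    beta[0::2] = beta_l ^ beta_r
    beta[1::2] = beta_r
    return beta

# usage sketch: N = 8; u_hat = np.zeros(N, dtype=np.uint8); sc_decode(channel_llrs, 0, {5, 6, 7}, u_hat)
```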
Note that the operation of a local decoder 𝑣 does not affect the decoding of bits with index 𝑖 such that 𝑖 < min ℐ𝑣. This is because local decoder 𝑣 gets activated only after all such bits are decoded. Note also that 𝛽𝑣 acts as a summary of all codewords computed in the subtree rooted at 𝑣: given 𝛽𝑣, the decoding of bits with index 𝑖 such that 𝑖 > max ℐ𝑣 does not depend on any of the codewords 𝛽𝑢 or soft information 𝛼𝑢, 𝑢 ∈ 𝑉𝑣, computed in this subtree (except for 𝛽𝑣 itself).

Note also that the rules described in this section for computation of the 𝛽 messages give rise to an encoder, defined by initializing the 𝛽 messages at the leaf nodes and applying (2) until the root node is reached.
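The remark that rule (2) by itself defines an encoder can be illustrated with a small sketch that initializes the leaf 𝛽 messages with the components of b and combines them upward; the cross-check against polar_generator from the earlier encoding sketch is illustrative only.

```python
import numpy as np

def encode_by_message_passing(b):
    """Apply (2) bottom-up: beta_v[2i] = beta_vl[i] XOR beta_vr[i], beta_v[2i+1] = beta_vr[i]."""
    beta = np.array(b, dtype=np.uint8)
    if len(beta) == 1:
        return beta
    half = len(beta) // 2
    beta_l = encode_by_message_passing(beta[:half])
    beta_r = encode_by_message_passing(beta[half:])
    out = np.empty(len(beta), dtype=np.uint8)
    out[0::2] = beta_l ^ beta_r
    out[1::2] = beta_r
    return out

# Cross-check against x = b * G_3 from the earlier encoding sketch (frozen bits zero, A = {5, 6, 7}).
b = np.array([0, 0, 0, 0, 0, 1, 0, 1], dtype=np.uint8)
assert np.array_equal(encode_by_message_passing(b), b.dot(polar_generator(3)) % 2)
```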
IV. PROPOSED METHOD

In this section, we modify conventional SC decoding by changing the local decoding algorithm at rate one nodes; we refer to this technique as “modified SC decoding.” We prove that the bit and block error rates of modified SC decoding are the same as those of conventional SC decoding.

In the modified algorithm, when a rate one node 𝑣 is activated, it immediately calculates 𝛽𝑣 via

𝛽𝑣 = ℎ(𝛼𝑣),  (3)

and all bits with indices in ℐ𝑣 are immediately decoded using the formula

(û[min ℐ𝑣], …, û[max ℐ𝑣]) = 𝛽𝑣 𝐺𝑛−𝑑𝑣 .  (4)

Effectively, the decoder for a rate one constituent code simply hard-quantizes its soft-input values and computes the inverse transform of the resulting {0, 1}-valued vector (note that 𝐺𝑛−𝑑𝑣 is its own inverse), thus recovering the local codeword.

Note that none of the local decoders 𝑢 ∈ 𝑉𝑣 ∖ {𝑣} become activated. In precisely the same manner that computation is saved at rate zero nodes in the conventional SC decoder, the modified algorithm saves computation at rate one nodes. The modified algorithm effectively replaces (laboriously computed) “soft information” with (trivially computed) “hard information” at the nodes of 𝑉𝑣.
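A sketch of how the recursive decoder above might incorporate the proposed rule is given below: rate one subtrees terminate immediately via (3) and (4). It reuses h_vec and boxplus_vec from the conventional-decoder sketch and polar_generator from the encoding sketch; as before, this is illustrative code rather than the authors' implementation.

```python
import numpy as np

def sc_decode_modified(alpha, first, A, u_hat):
    """Modified SC decoding: rate one subtrees stop immediately via (3) and (4)."""
    size = len(alpha)
    leaves = range(first, first + size)
    if all(i not in A for i in leaves):            # rate zero node: zero codeword, as before
        u_hat[first:first + size] = 0
        return np.zeros(size, dtype=np.uint8)
    if all(i in A for i in leaves):                # rate one node
        beta = h_vec(alpha)                        # eq. (3): hard-quantize the soft input
        m = int(np.log2(size))                     # local code length is 2^(n - d_v)
        u_hat[first:first + size] = beta.dot(polar_generator(m)) % 2   # eq. (4): G is its own inverse
        return beta
    half = size // 2                               # mixed node: proceed exactly as before
    alpha_l = boxplus_vec(alpha[0::2], alpha[1::2])
    beta_l = sc_decode_modified(alpha_l, first, A, u_hat)
    alpha_r = alpha[0::2] * (1.0 - 2.0 * beta_l) + alpha[1::2]
    beta_r = sc_decode_modified(alpha_r, first + half, A, u_hat)
    beta = np.empty(size, dtype=np.uint8)
    beta[0::2] = beta_l ^ beta_r
    beta[1::2] = beta_r
    return beta
```

In this sketch no node strictly below a rate one node is activated, so no ‘⊞’ evaluations are performed inside such subtrees; this is where the latency and complexity savings arise.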
We now show that the modified algorithm results in the same bit and block error rate as the conventional algorithm. To prove this, it is enough to show that for any rate one node 𝑣, the calculated 𝛽𝑣 of both methods agree, and that the decisions made by both methods at leaves with index 𝑖 ∈ ℐ𝑣 also agree. These results are proved in Lemmas 1 and 2. We note that information bits that are decoded prior to the activation of node 𝑣 are, of course, unaffected by the operation of the algorithm at node 𝑣. Furthermore, as discussed at the end of Section III, information bits with index 𝑖 > max ℐ𝑣 depend only on 𝛽𝑣 (which is the same in both algorithms). Information bits with index 𝑖 ∈ ℐ𝑣 are decoded according to (4), which results in the same decoding decisions as the conventional algorithm (as shown in Lemma 2).

Lemma 1: In the conventional SC decoder, for a rate one node 𝑣, we have

𝛽𝑣 = ℎ(𝛼𝑣),  (5)

i.e., the modified and conventional decoders agree at rate one nodes.

Proof: We prove the claim for the case that the channel log-likelihood ratios have no probability mass at zero. First, we prove the claim for a rate one node 𝑣 that has the property that

𝛽𝑣𝑙 = ℎ(𝛼𝑣𝑙),  𝛽𝑣𝑟 = ℎ(𝛼𝑣𝑟).  (6)

Later we show by induction that all rate one nodes have this property. For compactness, for any fixed 𝑖, let 𝑒 = 2𝑖 and 𝑜 = 2𝑖 + 1.

With probability one, 𝛼𝑣 [𝑒] ≠ 0 and 𝛼𝑣 [𝑜] ≠ 0; therefore property (1) implies ℎ(𝛼𝑣𝑙 [𝑖]) = ℎ(𝛼𝑣 [𝑒]) ⊕ ℎ(𝛼𝑣 [𝑜]). Thus

ℎ(𝛼𝑣𝑟 [𝑖]) =(a) ℎ(𝛼𝑣 [𝑜] + (1 − 2ℎ(𝛼𝑣𝑙 [𝑖]))𝛼𝑣 [𝑒])
  = ℎ(𝛼𝑣 [𝑜] + (1 − 2(ℎ(𝛼𝑣 [𝑒]) ⊕ ℎ(𝛼𝑣 [𝑜])))𝛼𝑣 [𝑒])
  = ℎ(𝛼𝑣 [𝑜]),

where (a) uses (6). Hence 𝛽𝑣 [𝑜] = 𝛽𝑣𝑟 [𝑖] = ℎ(𝛼𝑣𝑟 [𝑖]) = ℎ(𝛼𝑣 [𝑜]), and thus

𝛽𝑣 [𝑒] = 𝛽𝑣𝑙 [𝑖] ⊕ 𝛽𝑣𝑟 [𝑖] =(b) ℎ(𝛼𝑣𝑙 [𝑖]) ⊕ ℎ(𝛼𝑣𝑟 [𝑖]) = ℎ(𝛼𝑣 [𝑒]) ⊕ ℎ(𝛼𝑣 [𝑜]) ⊕ ℎ(𝛼𝑣 [𝑜]) = ℎ(𝛼𝑣 [𝑒]),

where again (b) follows from (6).

So far, we have proved that the claim of the lemma holds for a rate one node 𝑣 that satisfies (6). We use induction to complete the proof. Note that any rate one node of depth 𝑛 − 1 satisfies (6), and therefore the claim holds for rate one nodes of depth 𝑛 − 1 (base case of the induction). Now assume the claim holds for rate one nodes of depth 𝑖 (induction hypothesis). For a rate one node 𝑣 of depth 𝑖 − 1, its children are rate one nodes of depth 𝑖 which, by the induction hypothesis, satisfy (6). Therefore, the argument above proves the claim for 𝑣 as well.

One can easily show that the claim of the lemma also holds in the case (as for the binary erasure channel) where the channel log-likelihood ratios have probability mass at zero. ∎

Lemma 2: In the conventional SC decoder, for a rate one node 𝑣, we have (û[min ℐ𝑣], …, û[max ℐ𝑣]) = 𝛽𝑣 𝐺𝑛−𝑑𝑣 .

Proof: The claim holds trivially for rate one nodes at depth 𝑛. That the claim also holds for rate one nodes at smaller depths can be shown by induction. ∎
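The agreement established by Lemmas 1 and 2 can be spot-checked numerically by running the two decoder sketches given earlier on identical nonzero channel LLRs and comparing their decisions; the information set used here is arbitrary and purely illustrative.

```python
import numpy as np

# assumes sc_decode and sc_decode_modified from the earlier sketches are in scope
rng = np.random.default_rng(1)
n = 4
N = 2 ** n
A = {3, 5, 6, 7, 9, 10, 11, 12, 13, 14, 15}          # an arbitrary information set for the test

for _ in range(500):
    llr = 2.0 * rng.standard_normal(N)                # arbitrary nonzero channel LLRs
    u_conv = np.zeros(N, dtype=np.uint8)
    u_mod = np.zeros(N, dtype=np.uint8)
    beta_conv = sc_decode(llr, 0, A, u_conv)
    beta_mod = sc_decode_modified(llr, 0, A, u_mod)
    assert np.array_equal(u_conv, u_mod) and np.array_equal(beta_conv, beta_mod)

print("conventional and modified SC decisions agree on all trials")
```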
V. RESULTS

Due to the sequential nature of SC decoding, the decoding latency of polar codes is very high. In the modified method, however, it follows from (4) that all bits with indices in ℐ𝑣 for a rate one node 𝑣 can be decoded simultaneously.

More precisely, one can compare the decoding latency of the two decoding methods as follows. In the conventional method, once a non-leaf rate one node 𝑣 is activated, it uses one clock cycle to calculate 𝛼𝑣𝑙 using 2^(𝑛−𝑑𝑣−1) processors in parallel. Node 𝑣 then waits a certain time 𝜏𝑙 until it receives 𝛽𝑣𝑙, after which it uses one clock cycle to calculate 𝛼𝑣𝑟. Node 𝑣 then has to wait another period of time 𝜏𝑟 until it receives 𝛽𝑣𝑟, after which it calculates 𝛽𝑣 instantly. Rate one leaves complete their operation instantaneously. In our proposed method, rate one nodes complete their operation instantaneously, and therefore we save 𝜏𝑙 + 𝜏𝑟 + 2 clock cycles.

There are different possible ways to measure 𝜏𝑙 and 𝜏𝑟. One possible way is as follows. We assume that calculating a vector of soft information 𝛼 requires one clock cycle, whereas calculating codewords 𝛽 as well as making hard decisions can be done instantaneously.

SC decoding requires evaluation of the ‘⊞’ operation (corresponding to a check node operation in an LDPC decoder), which can be complex to implement in hardware. This problem is addressed in [7], whose authors use an approximation to replace this operation. In our proposed method, we instead reduce the number of such operations that are required, replacing them with XOR operations without any approximation.

We have calculated the decoding latency and complexity of the two methods for polar codes with various block-lengths and various rates 𝑅, each designed for a binary erasure channel with capacity 10𝑅/9. We denote by 𝐿(𝑛) the reduction in decoding latency of the modified algorithm relative to the conventional one for a code of block-length 𝑁 = 2^𝑛, i.e., 𝐿(𝑛) = (𝐿SC(𝑛) − 𝐿ModifSC(𝑛))/𝐿SC(𝑛). We also denote by 𝐻(𝑛) the reduction in the number of ‘⊞’ operations of the modified algorithm relative to the conventional one for a code of block-length 𝑁 = 2^𝑛, i.e., 𝐻(𝑛) = (𝐻SC(𝑛) − 𝐻ModifSC(𝑛))/𝐻SC(𝑛). The results are shown in Fig. 2. As the figure shows, the reduction is significant over a wide range of rates, and it increases with the block-length.

Fig. 2. Relative latency reduction 𝐿(𝑛) and algorithmic-complexity reduction 𝐻(𝑛) for polar codes of various rates and block-length 2^𝑛. [Plot of relative gain (%) versus code rate, 0.3–0.8, for the curves 𝐿(18), 𝐿(16), 𝐿(14), 𝐻(18), 𝐻(16), 𝐻(14).]

It is important to note that the underlying channel influences which bits are frozen and which bits are information bits; thus the channel will also have an impact on the gain in latency and complexity. However, our simulation results (not shown here) indicate that the latency and complexity reductions follow the same trends as for the binary erasure channel.
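To make the complexity count concrete: a node with 2^(𝑛−𝑑) descendant leaves performs 2^(𝑛−𝑑−1) ‘⊞’ evaluations when it computes 𝛼𝑣𝑙, and the modified decoder activates no node strictly below a rate one node. The following illustrative counter assumes the conventional decoder already applies the rate zero shortcut; the information set is supplied by the caller (a BEC-based design as described above would be used for a faithful comparison), and the code is not the authors' evaluation tool.

```python
def count_boxplus(n, A, modified):
    """Count the '⊞' evaluations performed when decoding a code of length 2**n
    with information set A; `modified` selects the proposed rate one shortcut."""
    A = set(A)

    def visit(d, first):
        size = 2 ** (n - d)
        I_v = set(range(first, first + size))
        if I_v.isdisjoint(A):          # rate zero: pruned by both decoders
            return 0
        if modified and I_v <= A:      # rate one: pruned only by the modified decoder
            return 0
        if size == 1:                  # leaf: no '⊞' evaluation
            return 0
        return size // 2 + visit(d + 1, first) + visit(d + 1, first + size // 2)

    return visit(0, 0)

# Toy example with the information set of Fig. 1(b).
A = {5, 6, 7}
h_sc = count_boxplus(3, A, modified=False)
h_mod = count_boxplus(3, A, modified=True)
print(h_sc, h_mod, (h_sc - h_mod) / h_sc)   # relative reduction H(3) for this A
```

Latency under the one-clock-cycle-per-𝛼 model described above could be tallied with the same traversal by accumulating cycles instead of ‘⊞’ counts.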
REFERENCES

[1] E. Arıkan, “Channel polarization: a method for constructing capacity-achieving codes for symmetric binary-input memoryless channels,” IEEE Trans. Inf. Theory, vol. 55, no. 7, pp. 3051–3073, July 2009.
[2] E. Şaşoğlu, E. Telatar, and E. Arıkan, “Polarization for arbitrary discrete memoryless channels,” in Proc. 2009 IEEE Inf. Theory Workshop, pp. 144–148.
[3] S. B. Korada and R. L. Urbanke, “Polar codes are optimal for lossy source coding,” IEEE Trans. Inf. Theory, vol. 56, pp. 1751–1768, 2010.
[4] N. Hussami, S. B. Korada, and R. Urbanke, “Performance of polar codes for channel and source coding,” in Proc. 2009 IEEE Int. Symp. Inf. Theory, pp. 1488–1492.
[5] E. Şaşoğlu, E. Telatar, and E. Yeh, “Polar codes for the two-user binary-input multiple-access channel,” in Proc. 2010 IEEE Inf. Theory Workshop.
[6] E. Abbe and E. Telatar, “Polar codes for the 𝑚-user MAC and matroids,” in Proc. 2010 Int. Zurich Seminar, pp. 29–32.
[7] C. Leroux, I. Tal, A. Vardy, and W. J. Gross, “Hardware architectures for successive cancellation decoding of polar codes,” in Proc. 2011 IEEE Int. Conf. Acoust., Speech & Sig. Proc., pp. 1665–1668.
