Paper - 2011 - Binary Polar Code Kernels from Code Decompositions - Noam Presman
Abstract—Code decompositions (a.k.a. code nestings) are used to design good binary polar code kernels. The proposed kernels are in general non-linear and show a better rate of polarization under successive cancellation decoding than the ones suggested by Korada et al. for the same kernel dimensions. In particular, we construct kernels of sizes 14, 15 and 16 providing polarization rates better than any binary kernel of such sizes.

I. INTRODUCTION

Polar codes were introduced by Arikan [1], providing a scheme for achieving the symmetric capacity of binary memoryless channels (B-MC) with polynomial encoding and decoding complexity. Arikan used a simple construction based on the following linear kernel

G_2 = [1 0; 1 1].

In this scheme, a 2^n × 2^n matrix, G_2^{⊗n}, is generated by taking the n-th Kronecker power of G_2. An input vector u of length N = 2^n is transformed into a length-N vector x by multiplying a certain permutation of u by G_2^{⊗n}. The vector x is transmitted through N independent copies of the memoryless channel W. This results in N new (dependent) channels between the individual components of u and the outputs of the channels. Arikan showed that these channels exhibit the phenomenon of polarization under successive cancellation decoding: as n grows, a proportion I(W) (the symmetric channel capacity) of the channels become clean (i.e. their capacity approaches 1), while the rest of the channels become completely noisy (i.e. their capacity approaches 0). An important question is how fast the polarization occurs in terms of the code length N. In [2], the rate of polarization was analyzed for the 2 × 2 kernel, and it was proven that the rate is 2^{−N^{0.5}}. More precisely, it was proven that

lim inf_{n→∞} Pr( Z_n ≤ 2^{−N^β} ) = I(W)  for β < 0.5,  (1)

lim inf_{n→∞} Pr( Z_n ≥ 2^{−N^β} ) = 1  for β > 0.5,  (2)

where {Z_n}_{n≥0} is the Bhattacharyya random sequence corresponding to Arikan's random tree process [1].

In [3], Korada et al. studied the use of alternatives to G_2 for the symmetric B-MC. They gave sufficient conditions for polarization when linear binary kernels are used over symmetric B-MC channels. Furthermore, the notion of the rate of polarization was generalized to polar codes based on linear codes having an ℓ × ℓ generating matrix G. The rate of polarization was quantified by the exponent of the kernel, E(G), which plays the general role of the threshold (equal to 0.5) appearing in (1) and (2) (note that here N = ℓ^n). Korada et al. showed that E(G) ≤ 0.5 for all binary linear kernels of dimension ℓ ≤ 15, where 0.5 is the exponent found for Arikan's 2 × 2 kernel, and that for ℓ = 16 there exists a code generator matrix G for which E(G) = 0.51828, which is the maximum exponent achievable by a binary linear kernel up to this dimension. Furthermore, for optimal linear kernels, the exponent E(G) approaches 1 as ℓ → ∞.

In [4], Mori and Tanaka considered the general case of a mapping g(·), not necessarily linear or binary, as a basis for channel polarization constructions. They gave sufficient conditions for polarization and generalized the exponent to these cases. In [5] they considered non-binary (but linear) kernels based on Reed-Solomon codes and Algebraic Geometry codes and showed that their exponents are far better than the exponents of the known binary kernels. This is true even for a kernel dimension as small as ℓ = 4 with alphabet size q = 4, for which E(G) = 0.573120.

In this paper, we propose designing good binary kernels (in the sense of a large exponent) by using code decompositions (a.k.a. code nestings). The kernels we suggest have better exponents than the ones considered in [3]. Moreover, we describe binary non-linear kernels of sizes 14, 15 and 16 providing a polarization exponent better than that of any binary kernel of these sizes.

The paper is organized as follows. In Section II, we describe building kernels that are related to decompositions of codes into sub-codes. Furthermore, by using
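The polarization described above is easy to reproduce numerically for the binary erasure channel, where the two channel transforms of the G_2 kernel are exact. The following sketch is our own illustration, not part of the paper: starting from Z_0 = ε, each channel splits into a degraded one (Z → 2Z − Z²) and an upgraded one (Z → Z²), and the values drift toward {0, 1} while their mean stays at ε.

```python
def polarize_bec(z0, n):
    """Exact Bhattacharyya parameters of the 2**n synthetic channels
    after n levels of Arikan's G2 kernel over a BEC with erasure
    probability z0 (for the BEC, Z(W) equals the erasure rate)."""
    zs = [z0]
    for _ in range(n):
        # each channel splits into a degraded and an upgraded copy
        zs = [t for z in zs for t in (2 * z - z * z, z * z)]
    return zs

zs = polarize_bec(0.5, 10)            # 1024 synthetic channels
mean = sum(zs) / len(zs)              # preserved: equals z0
extreme = sum(z < 1e-6 or z > 1 - 1e-6 for z in zs) / len(zs)
```

As n grows, `extreme` tends to 1 and the fraction of nearly noiseless channels tends to I(W) = 1 − ε, in line with (1) and (2).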
1]). We denote the set of sub-codes of level number i by

T_i = { T_i^{(b_1^{i−1})} | b_j ∈ {0, 1, 2, ..., m_j − 1}, j ∈ [i − 1] }.

The partition is usually described by the following chain of code parameters

(n_1, k_1, d_1) − (n_2, k_2, d_2) − ... − (n_m, k_m, d_m),

Sequential decision making on the bits of the input to the transformation (u_1^ℓ), given a noisy observation of the output, is actually a decision on the sub-code to which the transmitted vector belongs. As such, deciding on the first bit u_1 is actually deciding whether the transmitted vector belongs to T_2^{(0)} or to T_2^{(1)}. Once we have decided on u_1, we assume that we transmitted a codeword of T_2^{(u_1)}, and by deciding on u_2 we choose the appropriate
(u )
refinement or sub-code of T_2^{(u_1)}, i.e. we should decide between the candidates T_3^{(u_1,0)} and T_3^{(u_1,1)}. Due to this fact, it comes as no surprise that the Hamming distances between two candidate sub-codes play an important role when considering the rate of polarization.

Definition 3: For a binary code decomposition as in Definition 2, the Hamming distances between sub-codes in the decomposition are defined as follows:

D_min^{(i)}(u_1^{i−1}) = min{ d_H(c_1, c_2) | c_1 ∈ T_{i+1}^{(u_1^{i−1}·0)}, c_2 ∈ T_{i+1}^{(u_1^{i−1}·1)} },

D_min^{(i)} = min{ D_min^{(i)}(u_1^{i−1}) | u_1^{i−1} ∈ {0,1}^{i−1} }.

A transformation g(·) can be used as a building block for a recursive construction of a transformation of greater length, in a similar manner to [1]. We specify this construction explicitly in the next definition.

Definition 4: Given a transformation g(·) of dimension ℓ, we construct a mapping g^{(m)}(·) of dimension ℓ^m (i.e. g^{(m)}(·) : {0,1}^{ℓ^m} → {0,1}^{ℓ^m}) in the following recursive fashion:

g^{(1)}(u_1^ℓ) = g(u_1^ℓ);

g^{(m)} = [ g^{(m−1)}(γ_{1,1}, γ_{2,1}, γ_{3,1}, ..., γ_{ℓ^{m−1},1}),
            g^{(m−1)}(γ_{1,2}, γ_{2,2}, γ_{3,2}, ..., γ_{ℓ^{m−1},2}), ...,
            g^{(m−1)}(γ_{1,ℓ}, γ_{2,ℓ}, γ_{3,ℓ}, ..., γ_{ℓ^{m−1},ℓ}) ],

where

γ_{i,j} = g_j(u_{(i−1)·ℓ+1}^{i·ℓ}),  1 ≤ i ≤ ℓ^{m−1},  1 ≤ j ≤ ℓ.

The transformation g^{(m)}(·) can be used to transmit data over the B-MC channel. The method of successive cancellation can now be used to decode, with decoding complexity of O(2^ℓ · N · log_ℓ(N)) as in [1].

We use the same channel definition, the corresponding symmetric capacity and the Bhattacharyya parameter as in [1], [3], [4]. Note that for uniform binary random vectors U_1^ℓ and X_1^ℓ = g(U_1^ℓ) we have that I(Y_1^ℓ; U_1^ℓ) = I(Y_1^ℓ; X_1^ℓ), because the transformation g(·) is invertible. Furthermore, since we consider memoryless channels, we have I(Y_1^ℓ; X_1^ℓ) = ℓ · I(Y_1; X_1) = ℓ · I(W), and on the other hand

I(Y_1^ℓ; U_1^ℓ) = Σ_{i=1}^ℓ I(Y_1^ℓ; U_i | U_1^{i−1}) = Σ_{i=1}^ℓ I(W^{(i)}).

Define the tree process of the channels generated by the kernels, in the same way as it was done in [1] and generalized in [3]. A random sequence {W_n}_{n≥0} is defined such that W_n ∈ {W^{(i)}}_{i=1}^{ℓ^n} with

W_0 = W,
W_{n+1} = W_n^{(B_{n+1})},

where {B_n}_{n≥1} is a sequence of i.i.d. random variables uniformly distributed over the set {0, 1, 2, ..., ℓ − 1}. In a similar manner, the symmetric capacities {I_n}_{n≥0} = {I(W_n)}_{n≥0} and the Bhattacharyya parameter random variables {Z_n}_{n≥0} = {Z(W_n)}_{n≥0} are defined. Just as in [1, Proposition 8], we can prove that the random sequence {I_n}_{n≥0} is a bounded martingale and is uniformly integrable, which means it converges almost surely to I_∞ and that E{I_∞} = I(W). Now, if we can show that Z_n → Z_∞ w.h.p. such that Z_∞ ∈ {0, 1}, then by the relations between the channel's information and the Bhattacharyya parameter [1, Proposition 1], we have that I_∞ ∈ {0, 1}. But this means that Pr(I_∞ = 1) = E{I_∞} = I(W), which is the channel polarization phenomenon.

Proposition 1: Let g(·) be a binary transformation of dimension ℓ, induced by a binary code decomposition {T_1, T_2, ..., T_{ℓ+1}}. If there exists u_1^{ℓ−1} ∈ {0,1}^{ℓ−1} such that D_min^{(ℓ)}(u_1^{ℓ−1}) ≥ 2, then Pr(I_∞ = 1) = I(W).

Proof: In [4, Corollary 11], sufficient conditions are given for

lim_{n→∞} Pr( Z_n ∈ (δ, 1 − δ) ) = 0  ∀δ ∈ (0, 0.5).  (4)

The first condition is that there exist a vector u_1^{ℓ−1}, indices i, j ∈ [ℓ] and permutations σ(·) and τ(·) on {0, 1} such that

g_i^{(u_1^{ℓ−1})}(u_ℓ) = σ(u_ℓ)  and  g_j^{(u_1^{ℓ−1})}(u_ℓ) = τ(u_ℓ).

This requirement applies here because if there exists u_1^{ℓ−1} ∈ {0,1}^{ℓ−1} such that D_min^{(ℓ)}(u_1^{ℓ−1}) ≥ 2, then the two codewords of the code T_ℓ^{(u_1^{ℓ−1})}, c_1 and c_2, are at Hamming distance at least 2. This means that there exist at least two indices i, j such that c_{1,i} ≠ c_{2,i} and c_{1,j} ≠ c_{2,j}; therefore g_i^{(u_1^{ℓ−1})}(u_ℓ) and g_j^{(u_1^{ℓ−1})}(u_ℓ) are both permutations. The second condition is that for any v_1^{ℓ−1} ∈ {0,1}^{ℓ−1} there exist an index m ∈ [ℓ] and a permutation µ(·) on {0, 1} such that

g_m^{(v_1^{ℓ−1})}(v_ℓ) = µ(v_ℓ).

This requirement also applies here, by noting that for each v_1^{ℓ−1} ∈ {0,1}^{ℓ−1} the two codewords of the set T_ℓ^{(v_1^{ℓ−1})} are at Hamming distance at least 1. This means that (4) holds, which implies that I_∞ ∈ {0, 1} almost surely, and therefore Pr(I_∞ = 1) = I(W). ♦
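Definition 4 translates directly into a short recursive routine. The sketch below is our own code (function names are ours): it splits the input into ℓ-blocks, applies g to every block, and recurses on the j-th outputs of all blocks. Instantiating it with Arikan's kernel shows the construction in action.

```python
def g2(u):
    """Arikan's 2x2 kernel: (u1 + u2, u2) over GF(2)."""
    return [(u[0] + u[1]) % 2, u[1]]

def extend(g, ell, m, u):
    """Sketch of g^(m) from Definition 4: split u into ell^(m-1)
    blocks of length ell, apply g to every block, then feed the
    j-th outputs of all blocks into g^(m-1), for j = 1, ..., ell."""
    if m == 1:
        return g(u)
    blocks = [g(u[i * ell:(i + 1) * ell]) for i in range(len(u) // ell)]
    out = []
    for j in range(ell):
        out.extend(extend(g, ell, m - 1, [b[j] for b in blocks]))
    return out
```

With g = g2 and m = 2 this yields (u1+u2+u3+u4, u3+u4, u2+u4, u4), i.e. Arikan's length-4 transform, and the construction stays bijective whenever g is.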
The next proposition on the rate of polarization is an easy consequence of [4, Theorem 19] and Proposition 1.

Proposition 2: Let g(·) be a bijective transformation of dimension ℓ, induced by a code partitioning {T_1, T_2, ..., T_{ℓ+1}}. If there exists u_1^{ℓ−1} ∈ {0,1}^{ℓ−1} such that D_min^{(ℓ)}(u_1^{ℓ−1}) ≥ 2, then

(i) for any β < E(g),

lim_{n→∞} Pr( Z_n ≤ 2^{−ℓ^{nβ}} ) = I(W),

#   ℓ    chain description                                   lower bound on E(g)
1   16   (16,1) − (15,2) − (11,4) − (8,6) − (5,8) − (1,16)   0.52742
2   16   (16,1) − (15,2) − (11,4) − (7,6) − (5,8) − (1,16)   0.51828
3   15   (15,1) − (14,2) − (10,4) − (7,6) − (4,8)            0.50773
4   14   (14,1) − (13,2) − (9,4) − (6,6) − (3,8)             0.50193

TABLE I
Code decompositions from [7, Table 5] with their corresponding lower bounds on kernel exponents for the kernels induced by them.
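The lower bounds in Table I can be reproduced from the chains alone. The sketch below reflects our reading of how these bounds arise (names are ours): every input bit between consecutive chain levels is credited with the minimum distance of its level, and the bound is the normalized sum of log_ℓ of these partial distances.

```python
import math

def exponent_lower_bound(ell, chain):
    """Lower bound on E(g) for a kernel of dimension ell induced by
    the decomposition chain [(k1,d1), ..., (km,dm)]: the k_j - k_{j+1}
    input bits of level j are credited with partial distance d_j, and
    the bound is (1/ell) * sum over bits of log_ell(D_i)."""
    total = 0.0
    for j, (k, d) in enumerate(chain):
        k_next = chain[j + 1][0] if j + 1 < len(chain) else 0
        total += (k - k_next) * math.log(d, ell)   # bits at this level
    return total / ell

chains = [
    (16, [(16, 1), (15, 2), (11, 4), (8, 6), (5, 8), (1, 16)]),  # 0.52742
    (16, [(16, 1), (15, 2), (11, 4), (7, 6), (5, 8), (1, 16)]),  # 0.51828
    (15, [(15, 1), (14, 2), (10, 4), (7, 6), (4, 8)]),           # 0.50773
    (14, [(14, 1), (13, 2), (9, 4), (6, 6), (3, 8)]),            # 0.50193
]
for ell, chain in chains:
    print(ell, round(exponent_lower_bound(ell, chain), 5))
```

The four printed values match the last column of Table I.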
are coset decompositions, so we only need to specify the sub-code representatives.

#1) (16,16,1) − (16,15,2) − (16,11,4) − (16,8,6) − (16,5,8) − (16,1,16): The sub-code representatives are the (16,15,2) single parity check code, the (16,11,4) extended Hamming code, the (16,8,6) Nordstrom-Robinson code, the (16,5,8) first-order Reed-Muller code, and the (16,1,16) repetition code.

#2) (16,16,1) − (16,15,2) − (16,11,4) − (16,7,6) − (16,5,8) − (16,1,16): The sub-code representatives are the (16,15,2) single parity check code, the (16,11,4) extended Hamming code, the (16,7,6) extended 2-error-correcting BCH code, the (16,5,8) first-order Reed-Muller code, and the (16,1,16) repetition code.

#3) (15,15,1) − (15,14,2) − (15,10,4) − (15,7,6) − (15,4,8): The sub-code representatives are the (15,14,2) single parity check code, the (15,10,4) shortened extended Hamming code, the (15,7,6) shortened Nordstrom-Robinson code, and the (15,4,8) shortened first-order Reed-Muller code.

#4) (14,14,1) − (14,13,2) − (14,9,4) − (14,6,6) − (14,3,8): The sub-code representatives are the (14,13,2) single parity check code, the (14,9,4) twice-shortened extended Hamming code, the (14,6,6) twice-shortened Nordstrom-Robinson code, and the (14,3,8) twice-shortened first-order Reed-Muller code.

Explicit Encoding of Decomposition #1: For decomposition #1 we elaborate on the kernel mapping function g(·) : {0,1}^16 → {0,1}^16. To do so, we use Table II. The third column from the left determines whether the vectors in the second column are all the coset vectors (if they do not form a linear space) or just a basis for the space of coset vectors (if they do form a linear space). The fourth and fifth columns determine the stage of the code decomposition these vectors belong to; the "main code" is decomposed into cosets of the "sub-code" (each coset is generated by adding a different coset vector from the set specified by column 2 to the sub-code). The entry corresponding to indices 9 − 11 is taken from [9].

We now describe the encoding process. Let u_1^16 be a binary vector. The indices of the vector are partitioned into subsets according to the first column of the table. For each subset, the corresponding sub-vector of u is mapped to a coset vector. The mapping can be arbitrary, but when the coset vectors form a linear space, we usually prefer to multiply the corresponding sub-vector by a generating matrix whose rows are the vectors in the "coset vectors" column. To get the value of g(u), we add up the six coset vectors obtained in the last step. Note that using this mapping definition, it is easy to derive the mapping functions corresponding to decompositions #3 and #4 as well.

input vector   coset vectors        coset vectors form   main code   sub-code
indices                             a linear space?
1              [0000000000000001]   yes                  (16,16,1)   (16,15,2)
2 − 5          [0000000100000001]   yes                  (16,15,2)   (16,11,4)
               [0000000000010001]
               [0000000000000101]
               [0000000000000011]
6 − 8          [0001000100010001]   yes                  (16,11,4)   (16,8,6)
               [0000010100000101]
               [0000000001010101]
9 − 11         [0000000000000000]   no                   (16,8,6)    (16,5,8)
               [0000001101010110]
               [0001000101001011]
               [0001001000101110]
               [0001011100011000]
               [0000011000110101]
               [0001010001110010]
               [0000010101101100]
12 − 15        [0101010101010101]   yes                  (16,5,8)    (16,1,16)
               [0011001100110011]
               [0000111100001111]
               [0000000011111111]
16             [1111111111111111]   yes                  (16,1,16)   -

TABLE II
Coset vectors for code decomposition #1.

REFERENCES

[1] E. Arikan, "Channel polarization: A method for constructing capacity-achieving codes for symmetric binary-input memoryless channels," IEEE Trans. Inf. Theory, vol. 55, no. 7, pp. 3051–3073, 2009.
[2] E. Arikan and E. Telatar, "On the rate of channel polarization," Jul. 2008. [Online]. Available: http://arxiv.org/abs/0807.3806
[3] S. B. Korada, E. Sasoglu, and R. Urbanke, "Polar codes: Characterization of exponent, bounds, and constructions," Jan. 2009. [Online]. Available: http://arxiv.org/abs/0901.0536
[4] R. Mori and T. Tanaka, "Performance and construction of polar codes on symmetric binary-input memoryless channels," in Proc. IEEE Int. Symp. Information Theory (ISIT 2009), 2009, pp. 1496–1500.
[5] R. Mori and T. Tanaka, "Non-binary polar codes using Reed-Solomon codes and algebraic geometry codes," Jul. 2010. [Online]. Available: http://arxiv.org/abs/1007.3661
[6] N. Presman, O. Shapira, and S. Litsyn, "Binary polar code kernels from code decompositions," Jan. 2011. [Online]. Available: http://arxiv.org/abs/1101.0764
[7] S. Litsyn, "An updated table of the best binary codes known," in Handbook of Coding Theory. Elsevier, The Netherlands, 1998.
[8] N. Presman, O. Shapira, and S. Litsyn, "Polar codes with mixed kernels," 2011, accepted for 2011 IEEE International Symposium on Information Theory.
[9] A. E. Ashikhmin and S. N. Litsyn, "Fast decoding algorithms for first order Reed-Muller and related codes," Designs, Codes and Cryptography, vol. 7, pp. 187–214, 1996. [Online]. Available: http://dx.doi.org/10.1007/BF00124511
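The encoding process for decomposition #1 can be turned into code directly from Table II. The sketch below is ours: linear stages multiply the sub-vector by the listed basis, and for the non-linear stage the assignment of 3-bit sub-vectors to the eight listed cosets is an arbitrary choice, which the paper explicitly allows.

```python
def vec(s):
    """A 16-bit coset vector from Table II, given as a binary string."""
    return int(s, 2)

# (input bit indices, vectors form a linear space?, vectors)
STAGES = [
    ((1,), True, [vec('0000000000000001')]),
    ((2, 3, 4, 5), True, [vec('0000000100000001'), vec('0000000000010001'),
                          vec('0000000000000101'), vec('0000000000000011')]),
    ((6, 7, 8), True, [vec('0001000100010001'), vec('0000010100000101'),
                       vec('0000000001010101')]),
    ((9, 10, 11), False, [vec('0000000000000000'), vec('0000001101010110'),
                          vec('0001000101001011'), vec('0001001000101110'),
                          vec('0001011100011000'), vec('0000011000110101'),
                          vec('0001010001110010'), vec('0000010101101100')]),
    ((12, 13, 14, 15), True, [vec('0101010101010101'), vec('0011001100110011'),
                              vec('0000111100001111'), vec('0000000011111111')]),
    ((16,), True, [vec('1111111111111111')]),
]

def g(u):
    """Kernel mapping of decomposition #1: XOR one coset vector per
    stage.  Linear stages multiply the sub-vector by the listed basis;
    the non-linear stage indexes its listed cosets directly (the index
    order is our arbitrary choice)."""
    x = 0
    for idx, linear, vecs in STAGES:
        bits = [u[i - 1] for i in idx]          # indices are 1-based
        if linear:
            for b, v in zip(bits, vecs):
                x ^= v * b
        else:
            x ^= vecs[int(''.join(map(str, bits)), 2)]
    return x
```

Iterating over all 2^16 inputs confirms that the resulting mapping is a bijection on {0,1}^16, as required of a kernel.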