Slides 3
Adrish Banerjee
Department of Electrical Engineering, Indian Institute of Technology Kanpur, Kanpur, Uttar Pradesh, India
EE321: Communication Systems
Coding a single random variable
Prefix-free code
[Figure: binary tree for a prefix-free code. Each codeword corresponds to a leaf (labeled $Z_1$, $Z_2$, $Z_3$), so no codeword is the prefix of another.]
Kraft’s inequality
There exists a D-ary prefix-free code whose codeword lengths are the positive integers $w_1, w_2, \cdots, w_K$ if and only if
$$\sum_{i=1}^{K} D^{-w_i} \le 1$$
Sketch of Proof:
In the full D-ary tree of depth N, $D^{N-w}$ leaves stem from each node at depth w, where w < N.
Suppose there exists a D-ary prefix-free code. Construct a tree for the code by pruning the full D-ary tree of depth $N = \max_i w_i$ at all vertices corresponding to codewords.
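The pruning argument also suggests a constructive converse: sort the lengths and give each codeword the first $w_i$ D-ary digits of a running counter that skips the pruned subtrees. A small Python sketch (the function names are my own, not from the slides):

```python
def kraft_sum(lengths, D=2):
    """Left-hand side of Kraft's inequality for the given codeword lengths."""
    return sum(D ** -w for w in lengths)

def prefix_free_code(lengths, D=2):
    """Greedy construction of a D-ary prefix-free code with the given
    lengths; it succeeds whenever Kraft's inequality holds."""
    assert kraft_sum(lengths, D) <= 1 + 1e-12, "Kraft's inequality violated"
    code, value, prev = [], 0, 0
    for w in sorted(lengths):
        # 'value' is the index of the first depth-w node not already
        # covered by (i.e. pruned under) an earlier codeword
        value *= D ** (w - prev)
        digits, v = [], value
        for _ in range(w):
            digits.append(str(v % D))
            v //= D
        code.append("".join(reversed(digits)))
        value += 1
        prev = w
    return code

print(prefix_free_code([1, 2, 3, 3]))          # ['0', '10', '110', '111']
print(prefix_free_code([1, 1, 2, 2, 2], D=3))  # ['0', '1', '20', '21', '22']
```

Each assigned codeword prunes a subtree, and the Kraft sum guarantees the counter never runs past the available depth-w nodes.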
Rooted tree with probabilities
Path Length Lemma: The average depth of the leaves equals the sum of the probabilities of the non-leaf nodes, counting the root.
Sketch of proof:
The probability of each node is the sum of the probabilities of the leaves in the subtree stemming from that node.
A leaf at depth d lies in the subtrees of the d nodes on the path from the root to that leaf.
Hence the sum of the probabilities of the nodes equals the sum of the products of each leaf's probability and its depth, which is the average depth of the leaves.
[Figure: rooted tree with leaves of probability 0.1 and 0.2 at depth 1 and, stemming from an intermediate node of probability 0.7, leaves of probability 0.3 and 0.4 at depth 2.]
By the Path Length Lemma, the average depth of the leaves is 1 + 0.7 = 1.7.
As a check, the average depth of the leaves is
1(0.1) + 1(0.2) + 2(0.3) + 2(0.4) = 1.7.
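The lemma is easy to check mechanically. In the sketch below (my own representation, not from the slides) a tree is a nested Python list whose leaves are probabilities:

```python
def node_probs(tree):
    """Return (probability of this node, probabilities of all non-leaf
    nodes in its subtree, including itself)."""
    if isinstance(tree, float):              # a leaf
        return tree, []
    p, nodes = 0.0, []
    for child in tree:
        cp, cn = node_probs(child)
        p += cp
        nodes += cn
    return p, [p] + nodes

def average_leaf_depth(tree, depth=0):
    """Sum of (leaf probability x leaf depth) over all leaves."""
    if isinstance(tree, float):
        return depth * tree
    return sum(average_leaf_depth(c, depth + 1) for c in tree)

# The example tree: leaves 0.1 and 0.2 at depth 1, and an intermediate
# node of probability 0.7 whose leaves are 0.3 and 0.4.
tree = [0.1, 0.2, [0.3, 0.4]]
root_p, nodes = node_probs(tree)
node_sum = sum(nodes)              # 1 + 0.7 = 1.7, by the lemma
direct = average_leaf_depth(tree)  # 1(0.1) + 1(0.2) + 2(0.3) + 2(0.4) = 1.7
```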
Coding a single random variable
Leaf entropy: For a rooted tree with T leaves whose probabilities are $p_1, p_2, \cdots, p_T$,
$$H_{\mathrm{leaf}} = -\sum_{i:\, p_i \neq 0} p_i \log p_i$$
Branching entropy: If node i has probability $P_i$ and its $L_i$ branches have probabilities $q_{i1}, q_{i2}, \cdots, q_{iL_i}$, then
$$H_i = -\sum_{j:\, q_{ij} \neq 0} \frac{q_{ij}}{P_i} \log \frac{q_{ij}}{P_i}$$
where $q_{ij}/P_i$ is the conditional probability of choosing the j-th of these branches given that we are at the given node.
Branching entropy
For the tree of the previous example:
$$H_1 = -0.1 \log 0.1 - 0.2 \log 0.2 - 0.7 \log 0.7 = 1.157 \text{ bits}$$
$$H_2 = -\frac{3}{7} \log \frac{3}{7} - \frac{4}{7} \log \frac{4}{7} = 0.985 \text{ bits}$$
Coding a single random variable
Leaf entropy theorem: The leaf entropy of a rooted tree with probabilities equals the sum over all the nodes (including the root) of the branching entropy of that node weighted by the node probability, i.e.,
$$H_{\mathrm{leaf}} = \sum_{i=1}^{N} P_i H_i$$
Sketch of Proof:
By the definition of a rooted tree with probabilities,
$$P_i = \sum_{j=1}^{L_i} q_{ij}$$
Using the result $\log(q_{ij}/P_i) = \log q_{ij} - \log P_i$ in the definition of the branching entropy $H_i$, we get
$$P_i H_i = -\sum_{j:\, q_{ij} \neq 0} q_{ij} \log q_{ij} + P_i \log P_i$$
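As a numerical sanity check of the theorem on the earlier example tree (a sketch; the helper name `entropy` is mine):

```python
from math import log2

def entropy(ps):
    """Entropy in bits; zero-probability terms are skipped."""
    return -sum(p * log2(p) for p in ps if p > 0)

# Root (P1 = 1) branches with probabilities 0.1, 0.2, 0.7; the node with
# P2 = 0.7 branches with conditional probabilities 3/7 and 4/7.
H1 = entropy([0.1, 0.2, 0.7])      # ~ 1.157 bits
H2 = entropy([3/7, 4/7])           # ~ 0.985 bits
H_leaf = entropy([0.1, 0.2, 0.3, 0.4])
weighted = 1.0 * H1 + 0.7 * H2     # leaf entropy theorem: equals H_leaf
```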
Coding a single random variable
Lower bound on E[W]
For a code tree, the leaf entropy is $H_{\mathrm{leaf}} = H(U)$, and each branching entropy satisfies $H_i \le \log D$.
Using the leaf entropy theorem and the above results, we get
$$H(U) \le \log D \sum_{i=1}^{N} P_i$$
By the Path Length Lemma, $\sum_{i=1}^{N} P_i = E[W]$. Hence,
$$E[W] \ge \frac{H(U)}{\log D}$$
Shannon-Fano coding: choose the codeword lengths
$$w_i = \left\lceil -\log_D P_U(u_i) \right\rceil$$
Then $D^{-w_i} \le P_U(u_i)$, so the Kraft inequality is satisfied. This ensures that a D-ary prefix-free code exists for this choice of $w_i$.
Shannon-Fano prefix-free code
$$\frac{H(U)}{\log D} \le E[W] < \frac{H(U)}{\log D} + 1$$
with equality on the left if and only if the probability of each value of U is some negative integer power of D.
E[W] for Shannon-Fano coding also satisfies the above inequality.
Shannon-Fano prefix-free code
Consider binary Shannon-Fano prefix-free coding for a 4-ary random variable U for which $P_U(u_i)$ equals 0.4, 0.3, 0.2 and 0.1 for i equal to 1, 2, 3, 4 respectively.
$$w_1 = \left\lceil \log_2 \tfrac{1}{0.4} \right\rceil = 2, \quad w_2 = \left\lceil \log_2 \tfrac{1}{0.3} \right\rceil = 2, \quad w_3 = \left\lceil \log_2 \tfrac{1}{0.2} \right\rceil = 3, \quad w_4 = \left\lceil \log_2 \tfrac{1}{0.1} \right\rceil = 4$$
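The example can be reproduced in a few lines (a sketch; `shannon_fano_lengths` is my name for the helper):

```python
from math import ceil, log2

def shannon_fano_lengths(probs):
    """w_i = ceil(-log2 p_i); then 2**-w_i <= p_i, so Kraft's inequality holds."""
    return [ceil(-log2(p)) for p in probs]

probs = [0.4, 0.3, 0.2, 0.1]
w = shannon_fano_lengths(probs)               # [2, 2, 3, 4]
EW = sum(p * wi for p, wi in zip(probs, w))   # 0.8 + 0.6 + 0.6 + 0.4 = 2.4
H = -sum(p * log2(p) for p in probs)          # ~ 1.846 bits
# The bounds above: H <= E[W] < H + 1   (log D = 1 for binary)
```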
Huffman code
[Figure: two cases in which the leaf for a codeword $u_i$ is the only leaf stemming from its parent node.]
In either case, we can delete the last digit of the codeword and still have a prefix-free code. The new code has smaller E[W], and thus the original code must not be optimal.
Huffman code
There is an optimal binary prefix-free code for U such that the two
least likely codewords, say those for uK −1 and uK , differ only in their
last digit.
Sketch of Proof:
Assume PU (uK −1 ) ≥ PU (uK ). We have the following situation.
[Figure: an optimal code tree in which two of the longest codewords, those for $u_i$ and $u_j$, differ only in their last digit (both orderings shown).]
If $j \neq K$, we switch the leaves for $u_j$ and $u_K$ without increasing E[W]. Similarly, if $i \neq K-1$, we switch the leaves for $u_i$ and $u_{K-1}$ without increasing E[W].
The new optimal code has its two least likely codewords differing only in their last digit.
Huffman code
[Figure: construction of a binary Huffman code. Repeatedly merging the two least likely nodes (0.05 + 0.1 = 0.15, 0.15 + 0.15 = 0.3, 0.2 + 0.23 = 0.43, 0.27 + 0.3 = 0.57, 0.43 + 0.57 = 1.0) gives the codewords $u_1 = 0000$, $u_2 = 0001$, $u_3 = 001$, $u_4 = 10$, $u_5 = 11$, $u_6 = 01$.]
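The construction can be sketched with a heap (my own implementation; the 0/1 branch labels, and hence the exact codewords, may differ from the figure, but the lengths agree):

```python
import heapq
from itertools import count

def huffman_code(probs):
    """Binary Huffman code: repeatedly merge the two least likely nodes,
    prepending 0/1 to the codewords inside each merged subtree."""
    tie = count()   # unique tiebreaker so the dicts are never compared
    heap = [(p, next(tie), {i: ""}) for i, p in enumerate(probs)]
    heapq.heapify(heap)
    while len(heap) > 1:
        p0, _, c0 = heapq.heappop(heap)
        p1, _, c1 = heapq.heappop(heap)
        merged = {u: "0" + w for u, w in c0.items()}
        merged.update({u: "1" + w for u, w in c1.items()})
        heapq.heappush(heap, (p0 + p1, next(tie), merged))
    return heap[0][2]

probs = [0.05, 0.1, 0.15, 0.2, 0.23, 0.27]     # the slides' example
code = huffman_code(probs)
lengths = [len(code[i]) for i in range(6)]     # [4, 4, 3, 2, 2, 2]
EW = sum(p * len(code[i]) for i, p in enumerate(probs))   # 2.45
```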
Huffman code
The number of unused leaves in the tree of an optimal D-ary
prefix-free code for a random variable U with K possible values,
K ≥ D, is the remainder when (K − D)(D − 2) is divided by D − 1.
Proof:
Let r be the number of unused leaves. The tree of an optimal code is obtained from the root by some number q of extensions, each of which replaces a leaf by D new leaves, so the tree has $D + q(D-1)$ leaves. Then if U has K values,
$$K + r = D + q(D-1)$$
This implies
$$r = [D + q(D-1)] - K$$
or
$$D - K = -q(D-1) + r \quad \text{where } 0 \le r < D-1$$
Adding $(K-D)(D-1)$ to both sides of the above equation, we get
$$(K-D)(D-2) = (K-D-q)(D-1) + r$$
so r is the remainder when $(K-D)(D-2)$ is divided by $D-1$.
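In code form (a one-liner; the function name is mine):

```python
def unused_leaves(K, D):
    """Unused leaves in the tree of an optimal D-ary code for K values (K >= D)."""
    return ((K - D) * (D - 2)) % (D - 1)

# Binary code trees never waste leaves; a ternary code for K = 6 wastes
# one, so the first Huffman step below merges only two symbols.
print(unused_leaves(6, 2), unused_leaves(6, 3))   # 0 1
```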
Huffman code
Construct a ternary Huffman code for the following example:
$P(u_1) = 0.05$, $P(u_2) = 0.1$, $P(u_3) = 0.15$, $P(u_4) = 0.2$, $P(u_5) = 0.23$, $P(u_6) = 0.27$
[Figure: construction of the ternary Huffman code. Since one leaf is unused, the first step merges only the two least likely symbols (0.05 + 0.1 = 0.15); later steps merge three nodes at a time (0.15 + 0.15 + 0.2 = 0.5, then 0.5 + 0.23 + 0.27 = 1.0), giving codeword lengths 3, 3, 2, 2, 1, 1, e.g. $u_1 = 200$, $u_2 = 201$.]
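A D-ary version of the length computation only needs to pad the alphabet with zero-probability dummies for the unused leaves, so that every merge takes exactly D nodes (a sketch, my own code):

```python
import heapq
from itertools import count

def huffman_lengths(probs, D=2):
    """Codeword lengths of a D-ary Huffman code; zero-probability
    dummies occupy the unused leaves."""
    K = len(probs)
    r = ((K - D) * (D - 2)) % (D - 1)         # unused leaves (see above)
    tie = count()
    # each heap entry: (probability, tiebreaker, real symbols underneath)
    heap = [(p, next(tie), [i]) for i, p in enumerate(probs)]
    heap += [(0.0, next(tie), []) for _ in range(r)]
    heapq.heapify(heap)
    lengths = [0] * K
    while len(heap) > 1:
        total, members = 0.0, []
        for _ in range(D):
            p, _, ms = heapq.heappop(heap)
            total += p
            members += ms
        for u in members:
            lengths[u] += 1                   # these symbols go one level deeper
        heapq.heappush(heap, (total, next(tie), members))
    return lengths

probs = [0.05, 0.1, 0.15, 0.2, 0.23, 0.27]
print(huffman_lengths(probs, D=3))   # [3, 3, 2, 2, 1, 1]
print(huffman_lengths(probs, D=2))   # [4, 4, 3, 2, 2, 2]
```

The ternary run reproduces the lengths of the tree above, and the binary run reproduces the earlier binary example.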
Coding an Information source
The source parser divides the output sequences from the information source into messages that are encoded by the message encoder.
We consider an L-block source parser, i.e.,
$$V_1 = [U_1, U_2, \cdots, U_L]$$
$$V_2 = [U_{L+1}, U_{L+2}, \cdots, U_{2L}]$$
$$\vdots$$
Coding an Information source
$$\frac{E[W]}{L} < \frac{H(U)}{\log D} + \frac{1}{L}$$
$$\frac{E[W]}{L} \ge \frac{H(U)}{\log D}$$
$$\frac{H(V)}{\log D} \le E[W] < \frac{H(V)}{\log D} + 1$$
Since the source is memoryless, $H(V) = L\,H(U)$, so
$$\frac{L H(U)}{\log D} \le E[W] < \frac{L H(U)}{\log D} + 1$$
$$\implies \frac{H(U)}{\log D} \le \frac{E[W]}{L} < \frac{H(U)}{\log D} + \frac{1}{L}$$
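The convergence of E[W]/L toward H(U)/log D can be checked numerically with an optimal (Huffman) code on L-blocks. A sketch for a hypothetical binary memoryless source with P(0) = 0.9; E[W] is computed via the Path Length Lemma as the sum of the merged-node probabilities:

```python
import heapq
from itertools import product
from math import log2, prod

def huffman_EW(probs):
    """E[W] of a binary Huffman code: by the Path Length Lemma it equals
    the sum of the probabilities of the merged (internal) nodes."""
    heap = list(probs)
    heapq.heapify(heap)
    EW = 0.0
    while len(heap) > 1:
        p = heapq.heappop(heap) + heapq.heappop(heap)
        EW += p
        heapq.heappush(heap, p)
    return EW

p = [0.9, 0.1]                       # hypothetical DMS
H = -sum(x * log2(x) for x in p)     # H(U) ~ 0.469 bits
for L in (1, 2, 4):
    block = [prod(c) for c in product(p, repeat=L)]
    per_letter = huffman_EW(block) / L
    # H(U) <= E[W]/L < H(U) + 1/L   (D = 2, so log D = 1)
    print(L, round(per_letter, 3))
```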