
EE321: Communication Systems

Adrish Banerjee

Department of Electrical Engineering


Indian Institute of Technology Kanpur
Kanpur, Uttar Pradesh
India

Jan. 13, 2024

Lecture #3: Block to variable length coding

Coding a single random variable

Message U → Source Encoder → Z = X_1, X_2, ..., X_W

Variable length coding scheme

U is a K-ary random variable.

Each X_i takes on values in the D-ary alphabet.
W is a random variable, i.e. Z has variable length.
A list (z_1, z_2, ..., z_K) of D-ary sequences is a code for U, where z_i is the codeword for the value u_i of U.


Coding a single random variable

If z_i = [x_{i1}, x_{i2}, ..., x_{i w_i}] is the codeword for u_i, and w_i is the length of this codeword, then the average codeword length is defined as

E[W] = \sum_{i=1}^{K} w_i P_U(u_i)

A small average codeword length is a measure of the goodness of the code.
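As a quick numerical illustration of this definition, here is a small sketch; the lengths and probabilities below are hypothetical values chosen only to illustrate, not from the lecture.

```python
# E[W] = sum_i w_i * P_U(u_i); hypothetical lengths and probabilities.

def average_codeword_length(lengths, probs):
    """Average codeword length for lengths w_i and probabilities P_U(u_i)."""
    assert abs(sum(probs) - 1.0) < 1e-9, "probabilities must sum to 1"
    return sum(w * p for w, p in zip(lengths, probs))

print(average_codeword_length([1, 2, 3, 3], [0.5, 0.25, 0.125, 0.125]))  # 1.75
```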

Prefix-free code

A D-ary tree is a finite rooted tree such that D branches stem outward from each node.
The full D-ary tree of length N is the D-ary tree with D^N leaves, each at depth N from the root.
A D-ary prefix-free code can be identified with a set of leaves in a D-ary tree.
Ex. Draw the binary tree for the following code:
z_1 = [011], z_2 = [10], z_3 = [11], and z_4 = [00]


Prefix-free code

[Figure: binary tree for the prefix-free code, with codewords as leaves: z_1 = 011, z_2 = 10, z_3 = 11, z_4 = 00.]

Kraft’s inequality

There exists a D-ary prefix-free code whose codeword lengths are the
positive integers w1 , w2 , · · · , wK if and only if
\sum_{i=1}^{K} D^{-w_i} \le 1

Sketch of Proof:
In the full D-ary tree of length N, D^{N-w} leaves stem from each node at depth w, where w < N.
Suppose there exists a D-ary prefix-free code; construct a tree for the code by pruning the full D-ary tree of length N = max_i w_i at all vertices corresponding to codewords.


Kraft’s inequality

Due to the prefix-free condition, if at depth w_i we delete the D^{N-w_i} leaves stemming from the codeword z_i, none of these leaves could have been previously deleted.
Since there are only D^N leaves that can be deleted, we have

D^{N-w_1} + D^{N-w_2} + \cdots + D^{N-w_K} \le D^N

Dividing the above equation by D^N, we get the necessary condition for a prefix-free code.

Kraft’s inequality

Sketch of converse proof: Suppose w_1, ..., w_K are positive integers such that Kraft's inequality is satisfied. Order the lengths so that w_1 ≤ w_2 ≤ ... ≤ w_K, and start from the full D-ary tree of length N = w_K. Consider the following algorithm:
(1) i ← 1.
(2) Choose z_i as any surviving node or leaf at depth w_i, and prune the tree at z_i. Stop if there is no such surviving node or leaf.
(3) If i = K, stop; otherwise i ← i + 1 and go to step (2).
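The pruning procedure above can be sketched in code. The version below is a greedy "canonical" construction, equivalent in effect: it assigns codewords in order of increasing length, and it succeeds exactly when Kraft's inequality holds. The function name is our own, not from the lecture.

```python
def construct_prefix_free_code(lengths, D):
    """Build a D-ary prefix-free code with the given codeword lengths.
    Greedy sketch of the converse of Kraft's inequality: codewords are
    assigned in order of increasing length, as integer values in base D."""
    assert sum(D ** -w for w in lengths) <= 1, "Kraft's inequality violated"
    order = sorted(range(len(lengths)), key=lambda i: lengths[i])
    code = [""] * len(lengths)
    v, prev = 0, 0                       # v = integer value of next codeword
    for i in order:
        w = lengths[i]
        v *= D ** (w - prev)             # extend the value to depth w
        x, digits = v, []
        for _ in range(w):               # base-D digits of v, width w
            digits.append(str(x % D))
            x //= D
        code[i] = "".join(reversed(digits))
        v, prev = v + 1, w
    return code

print(construct_prefix_free_code([1, 2, 2, 3, 4], 3))
# ['0', '10', '11', '120', '1210'] -- no codeword is a prefix of another
```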


Kraft’s inequality

If we are able to choose z_K in step (2), we have constructed a prefix-free code. Suppose z_1, z_2, ..., z_{i-1} have been chosen; the number of surviving leaves at depth N not stemming from any codeword is

D^N - (D^{N-w_1} + D^{N-w_2} + \cdots + D^{N-w_{i-1}}) = D^N \left(1 - \sum_{j=1}^{i-1} D^{-w_j}\right)

By Kraft's inequality, 1 - \sum_{j=1}^{i-1} D^{-w_j} \ge D^{-w_i} > 0, so the number of surviving leaves at depth N is greater than zero. If there is a surviving leaf at depth N, there must be some surviving vertex at depth w_i ≤ N (an ancestor of this leaf, or the leaf itself), and no already chosen codeword can stem outward from such a surviving vertex. Thus this surviving vertex can be chosen as z_i.

Prefix-free code

Construct a ternary prefix-free code with lengths w_1 = 1, w_2 = 2, w_3 = 2, w_4 = 3, w_5 = 4.
Since \sum_{i=1}^{5} 3^{-w_i} = 1/3 + 1/9 + 1/9 + 1/27 + 1/81 = 49/81 < 1, a prefix-free code exists.
[Figure: a ternary tree realizing the lengths 1, 2, 2, 3, 4, with u_1, ..., u_5 as leaves.]

Rooted tree with probabilities

By a rooted tree with probabilities, we mean a finite rooted tree with probabilities assigned to each vertex such that
(1) the root is assigned probability 1, and
(2) the probability of every node is the sum of the probabilities of the nodes and leaves at depth 1 in the subtree stemming from that node.
Path length lemma: In a rooted tree with probabilities, the average
depth of the leaves is equal to the sum of the probabilities of the
nodes (including the root).

Rooted tree with probabilities

Sketch of proof:
The probability of each node is the sum of the probabilities of the leaves in the subtree stemming from that node.
A leaf at depth d therefore contributes its probability to each of the d nodes (including the root) on the path from the root to that leaf.
Hence the sum of the probabilities of the nodes equals the sum, over leaves, of each leaf's probability times its depth, which is the average depth of the leaves.


Rooted tree with probabilities

[Figure: rooted tree with probabilities; the root (probability 1) has leaves 0.1 and 0.2 at depth 1 and an internal node of probability 0.7 whose leaves 0.3 and 0.4 are at depth 2.]
By the path length lemma, the average depth of the leaves is 1 + 0.7 = 1.7.
As a check, the average depth of the leaves is 1(0.1) + 1(0.2) + 2(0.3) + 2(0.4) = 1.7.
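This check is easy to script; a minimal sketch of the example above, with the leaf depths read off the tree:

```python
# Path length lemma check: leaves 0.1 and 0.2 at depth 1, and leaves
# 0.3 and 0.4 at depth 2 under the internal node of probability 0.7.
leaves = [(0.1, 1), (0.2, 1), (0.3, 2), (0.4, 2)]   # (probability, depth)
node_probs = [1.0, 0.7]                             # root and internal node

avg_depth_direct = sum(p * d for p, d in leaves)    # weighted leaf depths
avg_depth_lemma = sum(node_probs)                   # sum of node probabilities
assert abs(avg_depth_direct - avg_depth_lemma) < 1e-9   # both are 1.7
```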

Coding a single random variable
Leaf entropy: For a rooted tree with T leaves whose probabilities are p_1, p_2, ..., p_T, the leaf entropy is

H_leaf = - \sum_{i: p_i \ne 0} p_i \log p_i

Branching entropy: Suppose that q_{i1}, q_{i2}, ..., q_{iL_i} are the probabilities of the nodes and leaves at the ends of the L_i branches stemming outward from the node whose probability is P_i. Then the branching entropy H_i at this node is given by

H_i = - \sum_{j: q_{ij} \ne 0} \frac{q_{ij}}{P_i} \log \frac{q_{ij}}{P_i}

where q_{ij}/P_i is the conditional probability of choosing the j-th of these branches given that we are at this node.

Coding a single random variable


[Figure: the rooted tree with probabilities from the previous example; the root (probability 1.0) branches to 0.1, 0.2, and a node of probability 0.7, which branches to 0.3 and 0.4.]
Leaf entropy:

H_leaf = - \sum_{i=1}^{4} p_i \log p_i = 1.846 bits

Branching entropies:

H_1 = -0.1 \log 0.1 - 0.2 \log 0.2 - 0.7 \log 0.7 = 1.157 bits

H_2 = - \frac{3}{7} \log \frac{3}{7} - \frac{4}{7} \log \frac{4}{7} = 0.985 bits
Coding a single random variable
Leaf entropy theorem: The leaf entropy of a rooted tree with probabilities equals the sum, over all the nodes (including the root), of the branching entropy of that node weighted by the node probability, i.e. for a tree with N nodes,

H_leaf = \sum_{i=1}^{N} P_i H_i

Sketch of Proof:
By the definition of a rooted tree with probabilities,

P_i = \sum_{j=1}^{L_i} q_{ij}

Using \log(q_{ij}/P_i) = \log q_{ij} - \log P_i in the definition of the branching entropy H_i, we get

P_i H_i = - \sum_{j: q_{ij} \ne 0} q_{ij} \log q_{ij} + P_i \log P_i

Coding a single random variable

Sketch of Proof (contd.):

A non-root k-th node contributes +P_k \log P_k to the sum (in the term with i = k) and contributes -P_k \log P_k (in the term for the i such that node k is at the end of a branch leaving node i). Hence its net contribution to the sum is zero.
The root contributes the term P_1 \log P_1, which is zero since P_1 = 1.
The k-th leaf contributes -p_k \log p_k to the sum (in the term for the i with q_{ij} = p_k).
Hence,

\sum_{i=1}^{N} P_i H_i = - \sum_{k=1}^{T} p_k \log p_k
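The theorem can be verified numerically on the earlier example (leaves 0.1, 0.2, 0.3, 0.4, with an internal node of probability 0.7); a small sketch:

```python
from math import log2

def entropy(probs):
    """H = -sum p*log2(p) over nonzero probabilities, in bits."""
    return -sum(p * log2(p) for p in probs if p > 0)

# Tree from the earlier example: the root branches to 0.1, 0.2 and a node
# of probability 0.7, which branches to leaves 0.3 and 0.4.
H_leaf = entropy([0.1, 0.2, 0.3, 0.4])
H1 = entropy([0.1, 0.2, 0.7])          # branching entropy at the root
H2 = entropy([3 / 7, 4 / 7])           # branching entropy at the 0.7 node

# Leaf entropy theorem: H_leaf = P_1*H_1 + P_2*H_2 with P_1 = 1, P_2 = 0.7
assert abs(H_leaf - (1.0 * H1 + 0.7 * H2)) < 1e-9
print(round(H_leaf, 3), round(H1, 3), round(H2, 3))   # 1.846 1.157 0.985
```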

Coding a single random variable
Lower bound on E[W]
For the tree of a prefix-free code for U, the leaf entropy is H_leaf = H(U) (the leaves carry the probabilities of the values of U), and each branching entropy satisfies H_i ≤ \log D (at most D branches stem from each node).
Using the leaf entropy theorem and the above results, we get

H(U) \le \log D \sum_{i=1}^{N} P_i

By the path length lemma we know

E[W] = \sum_{i=1}^{N} P_i

Hence,

E[W] \ge \frac{H(U)}{\log D}


Shannon-Fano prefix-free code

The length w_i of the codeword for u_i is chosen as

w_i = \lceil - \log_D P_U(u_i) \rceil

Kraft's inequality is satisfied for this choice of w_i:

\sum_{i=1}^{K} D^{-w_i} \le \sum_{i=1}^{K} D^{\log_D P_U(u_i)} = \sum_{i=1}^{K} P_U(u_i) = 1

This ensures that a D-ary prefix-free code exists for this choice of w_i.
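The length choice and the Kraft check can be sketched as follows; the distribution used is the one from the example on a later slide.

```python
from math import ceil, log

def shannon_fano_lengths(probs, D=2):
    """Shannon-Fano length choice w_i = ceil(-log_D P_U(u_i)).
    Caveat: probabilities that are exact negative powers of D can hit
    floating-point edge cases; none occur in this example."""
    return [ceil(-log(p, D)) for p in probs]

probs = [0.4, 0.3, 0.2, 0.1]
w = shannon_fano_lengths(probs)             # binary code, D = 2
print(w)                                    # [2, 2, 3, 4]
assert sum(2.0 ** -wi for wi in w) <= 1     # Kraft's inequality holds
```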

Shannon-Fano prefix-free code

Using the relation

x \le \lceil x \rceil < x + 1

we get

w_i < \frac{- \log P_U(u_i)}{\log D} + 1

Multiplying the above inequality by P_U(u_i) and summing over i, we get

E[W] < \frac{H(U)}{\log D} + 1


Coding theorem for a K-ary random variable

The average codeword length of an optimum D-ary prefix-free code for a K-ary random variable U satisfies

\frac{H(U)}{\log D} \le E[W] < \frac{H(U)}{\log D} + 1

with equality on the left if and only if the probability of each value of U is some negative integer power of D.
E[W] for Shannon-Fano coding also satisfies the above inequality.

Shannon-Fano prefix-free code
Consider binary Shannon-Fano prefix-free coding for a 4-ary random variable U for which P_U(u_i) equals 0.4, 0.3, 0.2 and 0.1 for i equal to 1, 2, 3, 4 respectively.

w_1 = \lceil \log_2 (1/0.4) \rceil = 2, w_2 = \lceil \log_2 (1/0.3) \rceil = 2, w_3 = \lceil \log_2 (1/0.2) \rceil = 3, w_4 = \lceil \log_2 (1/0.1) \rceil = 4

H(U) = 1.846 bits, and by the path length lemma,

E[W] = 1 + 0.7 + 0.3 + 0.3 + 0.1 = 2.4
[Figure: binary tree of the Shannon-Fano code, with node probabilities 1.0, 0.7, 0.3, 0.3, 0.1 and leaves 0.4, 0.3, 0.2, 0.1 at depths 2, 2, 3, 4.]

Huffman code

The binary tree of an optimum binary prefix-free code for U has no unused leaves.
Sketch of Proof:
If the tree has unused leaves, they must be at maximum depth, as the code is optimal.
For at least one value u_i of U, we then have the following situation:
[Figure: a node at maximum depth with the codeword for u_i on one branch and an unused leaf on the other; the codeword may sit on either branch.]
In either case, we can delete the last digit of the codeword and still have a prefix-free code. The new code has smaller E[W], so the original code could not have been optimal.

Huffman code

There is an optimal binary prefix-free code for U such that the two least likely codewords, say those for u_{K-1} and u_K, differ only in their last digit.
Sketch of Proof:
Assume P_U(u_{K-1}) ≥ P_U(u_K). We have the following situation:
[Figure: two leaves at maximum depth stemming from a common node, carrying the codewords for some u_i and u_j; the two possible arrangements are shown.]
If j ≠ K, we switch the leaves for u_j and u_K without increasing E[W]. Similarly, if i ≠ K − 1, we switch the leaves for u_i and u_{K−1} without increasing E[W].
The new optimum code has its two least likely codewords differing
only in their last digit.


Huffman code

Huffman coding: algorithm for constructing binary prefix-free codes for a K-ary random variable U.
Step 0: Designate K vertices as u_1, u_2, ..., u_K and assign probability P_U(u_i) to vertex u_i. These vertices are designated as "active" vertices.
Step 1: Create a node that ties together the two least likely active vertices with binary branches, and assign to this new node a probability equal to the sum of the probabilities of these two vertices. Activate this new node and deactivate the two vertices just joined.
Step 2: If there is only one active vertex left, make it the root and stop; otherwise go back to Step 1.
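The algorithm above is commonly implemented with a min-heap; the sketch below builds codewords directly by prepending one digit to each merged group. This is a standard implementation, not the lecture's own code.

```python
import heapq
from itertools import count

def huffman_binary(probs):
    """Binary Huffman code: repeatedly tie together the two least likely
    active vertices, prepending one code digit to each merged group."""
    tiebreak = count()                    # avoids comparing lists on ties
    codes = [""] * len(probs)
    heap = [(p, next(tiebreak), [i]) for i, p in enumerate(probs)]
    heapq.heapify(heap)
    while len(heap) > 1:
        p0, _, s0 = heapq.heappop(heap)   # two least likely active vertices
        p1, _, s1 = heapq.heappop(heap)
        for i in s0:
            codes[i] = "0" + codes[i]
        for i in s1:
            codes[i] = "1" + codes[i]
        heapq.heappush(heap, (p0 + p1, next(tiebreak), s0 + s1))
    return codes

codes = huffman_binary([0.05, 0.1, 0.15, 0.2, 0.23, 0.27])  # later example
print([len(c) for c in codes])            # [4, 4, 3, 2, 2, 2]
```

The exact codewords depend on how ties are broken, but the lengths (and hence E[W]) match the tree built by hand on the next slide.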

Huffman code

Construct a binary Huffman code for the following example:
P(u_1) = 0.05, P(u_2) = 0.1, P(u_3) = 0.15, P(u_4) = 0.2, P(u_5) = 0.23, P(u_6) = 0.27
[Figure: binary Huffman tree; successive merges are 0.05 + 0.1 = 0.15, 0.15 + 0.15 = 0.3, 0.2 + 0.23 = 0.43, 0.27 + 0.3 = 0.57, and 0.43 + 0.57 = 1.0. Resulting code: u_1 → 0000, u_2 → 0001, u_3 → 001, u_4 → 10, u_5 → 11, u_6 → 01.]

Huffman code

The number of leaves in a finite D-ary tree is always D + q(D − 1), where q is the number of nodes, not counting the root.
Sketch of proof:
In constructing a tree from the root, we get D leaves initially.
At each subsequent step, extending a leaf gains D new leaves and loses one old leaf, a net gain of D − 1 leaves per added node.

Huffman code

There are at most D − 2 unused leaves in the tree of an optimal prefix-free D-ary code for U, and all of them are at maximum depth.
Sketch of proof:
If an unused leaf were not at maximum depth, we could decrease E[W] by transferring one of the codewords at maximum depth to this leaf. Hence, unused leaves of an optimal prefix-free D-ary code can only be at maximum depth.
If there were D − 1 or more unused leaves, then D − 1 of them could be gathered as siblings of a single codeword at maximum depth, and we could shorten that codeword by removing its last digit.


Huffman code
The number of unused leaves in the tree of an optimal D-ary prefix-free code for a random variable U with K possible values, K ≥ D, is the remainder when (K − D)(D − 2) is divided by D − 1.
Proof:
Let r be the number of unused leaves. Then if U has K values,

r = [number of leaves in the D-ary tree of the code] − K

This implies

r = [D + q(D − 1)] − K

or

D − K = −q(D − 1) + r, where 0 ≤ r < D − 1

Adding (K − D)(D − 1) to both sides of the above equation, we get

(K − D)(D − 2) = (K − D − q)(D − 1) + r, where 0 ≤ r < D − 1

Huffman code

There is an optimal D-ary prefix-free code for a random variable U with K possible values such that the D − r least likely codewords differ only in their last digit, where r is the remainder when (K − D)(D − 2) is divided by D − 1.
Sketch of proof:
Arguments similar to the binary case can be used to prove this.


Huffman code

Huffman coding: algorithm for constructing D-ary (D ≥ 3) prefix-free codes for a K-ary random variable U.
Step 0: Designate K vertices as u_1, u_2, ..., u_K and assign probability P_U(u_i) to vertex u_i. These vertices are designated as "active" vertices. Compute r as the remainder when (K − D)(D − 2) is divided by D − 1.
Step 1: Create a node that ties together the D − r least likely active vertices with D − r branches of a D-ary branching, and assign to this new node a probability equal to the sum of the probabilities of these D − r vertices. Activate this new node and deactivate the D − r vertices just joined.
Step 2: If there is only one active vertex left, make it the root and stop; otherwise set r = 0 and go back to Step 1.
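A sketch of the D-ary version, using the remainder r for the size of the first merge. This is a standard implementation under the stated rule, not the lecture's own code.

```python
import heapq
from itertools import count

def huffman_dary(probs, D):
    """D-ary Huffman code (D >= 3): the first step ties together D - r
    least likely vertices, where r = (K - D)(D - 2) mod (D - 1);
    every later step ties together D vertices."""
    K = len(probs)
    r = ((K - D) * (D - 2)) % (D - 1)
    codes = [""] * K
    tiebreak = count()                    # avoids comparing lists on ties
    heap = [(p, next(tiebreak), [i]) for i, p in enumerate(probs)]
    heapq.heapify(heap)
    group = D - r                         # size of the first merge
    while len(heap) > 1:
        total, merged = 0.0, []
        for digit in range(min(group, len(heap))):
            p, _, syms = heapq.heappop(heap)
            total += p
            for i in syms:
                codes[i] = str(digit) + codes[i]
            merged += syms
        heapq.heappush(heap, (total, next(tiebreak), merged))
        group = D                         # subsequent merges tie D vertices
    return codes

codes = huffman_dary([0.05, 0.1, 0.15, 0.2, 0.23, 0.27], D=3)
print([len(c) for c in codes])            # [3, 3, 2, 2, 1, 1]
```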

Huffman code
Construct a ternary Huffman code for the following example:
P(u_1) = 0.05, P(u_2) = 0.1, P(u_3) = 0.15, P(u_4) = 0.2, P(u_5) = 0.23, P(u_6) = 0.27
[Figure: ternary Huffman tree. Here r = (6 − 3)(3 − 2) mod (3 − 1) = 1, so the first step ties together D − r = 2 least likely vertices: 0.05 + 0.1 = 0.15. Subsequent steps tie three: 0.15 + 0.15 + 0.2 = 0.5 and 0.23 + 0.27 + 0.5 = 1.0.]

Coding an Information source

Parsing an information source:

Information Source → U_1 U_2 ... → Source Parser → V_1 V_2 ... → Message Encoder → Z_1 Z_2 ...

The source parser divides the output sequence of the information source into messages that are encoded by the message encoder.
We consider an L-block source parser, i.e.

V_1 = [U_1, U_2, ..., U_L]
V_2 = [U_{L+1}, U_{L+2}, ..., U_{2L}]
...

Coding an Information source

Block to variable length coding theorem for a DMS: There exists a D-ary prefix-free coding of an L-block message from a DMS such that the average number of D-ary code digits per source letter satisfies

\frac{E[W]}{L} < \frac{H(U)}{\log D} + \frac{1}{L}

where H(U) is the uncertainty of a single source letter. Conversely, for every D-ary prefix-free coding of an L-block message,

\frac{E[W]}{L} \ge \frac{H(U)}{\log D}


Coding an Information source

Proof: Since the message V = [U_1, U_2, ..., U_L] has L i.i.d. components, we have

H(V) = H(U_1) + H(U_2) + \cdots + H(U_L) = L H(U)

For any D-ary prefix-free code for V, we have

\frac{H(V)}{\log D} \le E[W] < \frac{H(V)}{\log D} + 1

\Longrightarrow \frac{L H(U)}{\log D} \le E[W] < \frac{L H(U)}{\log D} + 1

\Longrightarrow \frac{H(U)}{\log D} \le \frac{E[W]}{L} < \frac{H(U)}{\log D} + \frac{1}{L}
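The convergence of E[W]/L toward H(U)/log D can be illustrated numerically with Shannon-Fano lengths on L-blocks; a sketch of our own, using the 4-ary distribution from the earlier example:

```python
from math import ceil, log2
from itertools import product

def entropy(probs):
    """H(U) in bits for a discrete distribution."""
    return -sum(p * log2(p) for p in probs if p > 0)

def shannon_fano_rate(probs, L):
    """E[W]/L for binary Shannon-Fano coding of L-blocks from a DMS."""
    e_w = 0.0
    for block in product(probs, repeat=L):   # all K^L blocks
        p = 1.0
        for q in block:
            p *= q                           # i.i.d. block probability
        e_w += p * ceil(-log2(p))            # Shannon-Fano length for the block
    return e_w / L

probs = [0.4, 0.3, 0.2, 0.1]
H = entropy(probs)                           # about 1.846 bits
for L in (1, 2, 3):
    rate = shannon_fano_rate(probs, L)
    assert H <= rate < H + 1 / L             # the theorem's bounds, D = 2
```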
