
Mustaqbal University

College of Engineering & Computer Sciences


Electronics and Communication Engineering Department

Course: EE301: Probability Theory and Applications


Prerequisite: Stat 219

Text Book: B.P. Lathi, “Modern Digital and Analog Communication Systems”, 3rd edition, Oxford University Press, Inc., 1998
Reference: A. Papoulis, Probability, Random Variables, and Stochastic Processes, Mc-Graw Hill, 2005

Dr. Aref Hassan Kurdali


Shannon – Fano Coding
The basic idea behind Shannon-Fano coding is to assign to each symbol of a source
alphabet, a sequence of bits roughly equal in length to the amount of information
conveyed by the symbol in question. The end result is a source code whose average
code-word length approaches the fundamental limit set by the entropy of a discrete
memoryless source, namely, H(S).
If we can make L = H(S), a 100% efficient code would be obtained. The code-word length li must be an integer, while the amount of information Ii need not be an integer. Therefore, choose the symbol code-word length li such that
I(si) ≤ li < I(si) + 1, i.e. log(1/pi) ≤ li < 1 + log(1/pi), i = 1, 2, ..., q
where the radix of the code is equal to the base of the logarithm used for the unit of information (i.e. for a binary code, bits of information are used).
The code-word lengths obtained satisfy the Kraft inequality, i.e. a prefix code can be constructed from these code-word lengths using the decision tree. If and only if li = I(si) for all i (i.e. when pi = r^(-li)), the obtained code is the most efficient code, with η = 1.
How efficient is the Shannon-Fano code? It can be proved that
H(S) ≤ LSF < H(S)+1
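A minimal Python sketch of this bound (illustrative function names; binary code, r = 2; the example distribution is not from the slides), computing li = ⌈log2(1/pi)⌉ and checking the Kraft inequality:

import math

def shannon_lengths(probs):
    # code-word lengths l_i = ceil(log2(1/p_i)) for a binary Shannon code
    return [math.ceil(math.log2(1.0 / p)) for p in probs]

def satisfies_kraft(lengths, r=2):
    # Kraft inequality: sum of r^(-l_i) <= 1 for radix r
    return sum(r ** -l for l in lengths) <= 1

probs = [0.4, 0.3, 0.2, 0.1]                      # example distribution
lengths = shannon_lengths(probs)                  # -> [2, 2, 3, 4]
H = -sum(p * math.log2(p) for p in probs)         # H(S) ~ 1.85 bit/symbol
L = sum(p * l for p, l in zip(probs, lengths))    # 2.4 binit/symbol
print(lengths, satisfies_kraft(lengths), H, L)    # H(S) <= L < H(S) + 1 holds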
Shannon – Fano Coding
The Shannon-Fano coding algorithm
1. List the source symbols in order of decreasing probability.
2. Partition the set into two sets that are as close to
equiprobable as possible, and assign 0 to the upper set and 1
to the lower set.
3. Continue this process until further partitioning is not
possible.

Note that an ambiguity may arise in the choice of approximately
equiprobable sets.
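The following Python sketch (illustrative names, not part of the course material) implements this recursive partitioning for a binary code; running it on the source of Problem 1 below reproduces the code words 00, 01, 10, 110, 1110, 1111:

def shannon_fano(symbols):
    # symbols: list of (symbol, probability) pairs, sorted by decreasing probability
    codes = {s: "" for s, _ in symbols}

    def split(group):
        if len(group) <= 1:
            return
        total = sum(p for _, p in group)
        best_cut, best_diff = 1, float("inf")
        for cut in range(1, len(group)):
            upper_sum = sum(p for _, p in group[:cut])
            diff = abs(2 * upper_sum - total)     # |upper half - lower half|
            if diff < best_diff:
                best_cut, best_diff = cut, diff
        upper, lower = group[:best_cut], group[best_cut:]
        for s, _ in upper:
            codes[s] += "0"                       # 0 to the upper (more probable) set
        for s, _ in lower:
            codes[s] += "1"                       # 1 to the lower set
        split(upper)
        split(lower)

    split(symbols)
    return codes

src = [("a", 0.30), ("b", 0.25), ("c", 0.20), ("d", 0.12), ("e", 0.08), ("f", 0.05)]
print(shannon_fano(src))   # {'a': '00', 'b': '01', 'c': '10', 'd': '110', 'e': '1110', 'f': '1111'}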
In other words, the number of bits required to describe si by a Shannon code is
li = ⌈log(1/pi)⌉, or equivalently li = ⌈I(si)⌉,
which is the code-word length used in Shannon coding.
Coding of source extensions
How do we increase a prefix code's efficiency toward unity?
The answer lies in coding extensions of the discrete memoryless source.
Let LSFn denote the average prefix code-word length of the extended source T = S^n.
Then it can be deduced that
H(T) = H(S^n) ≤ LSFn < H(S^n) + 1
nH(S) ≤ LSFn < nH(S) + 1
Dividing by n:
H(S) ≤ LSFn/n < H(S) + 1/n
which is called the Noiseless Coding Theorem.

In the limit, as n approaches infinity, the lower and upper bounds in the
above equation converge to H(S), i.e. unity efficiency,
where lim n→∞ (LSFn/n) = H(S)
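A small numerical illustration of the theorem (an assumed two-symbol source, not from the slides): encode the n-th extension with the lengths l = ⌈log2(1/p)⌉ and watch the per-symbol average approach H(S):

import math
from itertools import product

def per_symbol_length(probs, n):
    # average code length per original symbol when the n-th extension is
    # encoded with Shannon lengths l = ceil(log2(1/p))
    ext = [math.prod(c) for c in product(probs, repeat=n)]   # extension probabilities
    Ln = sum(p * math.ceil(math.log2(1.0 / p)) for p in ext)
    return Ln / n

probs = [0.7, 0.3]                         # H(S) ~ 0.881 bit/symbol
for n in (1, 2, 4, 8, 12):
    print(n, per_symbol_length(probs, n))  # stays within 1/n of H(S), so it tends to H(S)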
Noiseless Coding Theorem
Again, if the efficiency of the Huffman code is less than 100%, the solution,
according to the source-coding theorem, is to encode source extensions.
The Huffman code by definition has the highest code efficiency, i.e.
H(S) ≤ LH ≤ LSF < H(S)+1
Therefore, H(S^n) ≤ LHn ≤ LSFn < H(S^n)+1
Then, H(S) ≤ LHn/n ≤ LSFn/n < H(S)+1/n
It is noteworthy that the Huffman encoding process (i.e., the Huffman
tree) is not unique. In particular, the probability of a combined symbol
(obtained by adding the last two probabilities pertinent to a particular
step) may be found to equal another probability in the list. We may
proceed by placing the probability of the new symbol as high as
possible, or alternatively as low as possible. (It is presumed that
whichever way the placement is made, high or low, it is consistently
adhered to throughout the encoding process.) Noticeable differences then
arise in that the code words in the resulting source code can have
different code-word lengths. Nevertheless, the average code-word length
remains the same, i.e. there is, in this particular case, more than one
Huffman code.
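A minimal heap-based Huffman construction sketch (illustrative names; the high/low tie-breaking rule mentioned above is left to the heap order), checked against the 6-symbol source used in Problem 1 below:

import heapq
import itertools

def huffman(probs):
    # probs: dict symbol -> probability; returns dict symbol -> binary code word
    counter = itertools.count()              # tie-breaker so the heap never compares dicts
    heap = [(p, next(counter), {s: ""}) for s, p in probs.items()]
    heapq.heapify(heap)
    while len(heap) > 1:
        p1, _, c1 = heapq.heappop(heap)      # the two least probable entries
        p2, _, c2 = heapq.heappop(heap)
        for s in c1:
            c1[s] = "0" + c1[s]              # extend every code word in the first branch
        for s in c2:
            c2[s] = "1" + c2[s]              # and in the second branch
        heapq.heappush(heap, (p1 + p2, next(counter), {**c1, **c2}))
    return heap[0][2]

p = {"a": 0.30, "b": 0.25, "c": 0.20, "d": 0.12, "e": 0.08, "f": 0.05}
codes = huffman(p)
print(codes, sum(p[s] * len(w) for s, w in codes.items()))   # average length 2.38 binit/symbol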
Problem 1

A 6-symbol memoryless source S = {a, b, c, d, e, f} has the
probabilities {0.3, 0.25, 0.2, 0.12, 0.08, 0.05}.
1. Encode the source S symbols using the Shannon-Fano code.
2. Calculate the efficiency and the redundancy of the previous
code.
Problem 1 - Solution
Symbol  pi    Step 1  Step 2  Step 3  Step 4  Code
a       0.30  0       0                       00
b       0.25  0       1                       01
c       0.20  1       0                       10
d       0.12  1       1       0               110
e       0.08  1       1       1       0       1110
f       0.05  1       1       1       1       1111

H(S) ≈ 2.36 bit/symbol, LSF = 2.38 binit/symbol
𝜂 = H(S)/LSF ≈ 99.2%
𝜸 = 1 - 𝜂 ≈ 1%
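These figures can be checked with a few lines of Python (values rounded):

import math

p = [0.30, 0.25, 0.20, 0.12, 0.08, 0.05]
l = [2, 2, 2, 3, 4, 4]                        # Shannon-Fano lengths from the table above
H = -sum(pi * math.log2(pi) for pi in p)      # ~2.360 bit/symbol
L = sum(pi * li for pi, li in zip(p, l))      # 2.38 binit/symbol
print(H, L, H / L, 1 - H / L)                 # efficiency ~99.2%, redundancy ~1%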
Problem 2

A 4-symbol memoryless source S = {a, b, c, d} has the
probabilities {1/2, 1/4, 1/8, 1/8}.
1. Encode the source S symbols using the Shannon-Fano code.
2. Calculate the efficiency of the previous code.
Problem 2 - Solution
Symbol  pi     Step 1  Step 2  Step 3  SF Code
a       0.5    0                       0
b       0.25   1       0               10
c       0.125  1       1       0       110
d       0.125  1       1       1       111

Since li = I(si) for every symbol, LSF = H(S) = 1.75 bit/symbol, so 𝜂 = 100%.
Problem 3

A discrete memoryless source S has five equally likely symbols.


1. Construct a Shannon-Fano code for S, and calculate the
efficiency.
2. Construct another Shannon-Fano code for S and compare the
results.
3. Repeat for the Huffman code and compare the results.
Problem 3 - Solution
Symbol  pi    Step 1  Step 2  Step 3  SF Code
s1      0.20  0       0               00
s2      0.20  0       1               01
s3      0.20  1       0               10
s4      0.20  1       1       0       110
s5      0.20  1       1       1       111

H(S) = log2(5) ≈ 2.32 bit/symbol, LSF = 2.4 binit/symbol, so 𝜂 ≈ 96.7%.
Problem 4
A 4-symbol memoryless source S = {a, b, c, d}, which emits
one symbol every 125 μsec, is encoded using the following binary
Shannon-Fano encoder:

a 1100
b 10
c 0110
d 00

1. Find a possible source probability distribution.


2. Calculate the average code length.
3. Calculate the encoder output average binit rate.
4. Decode the following received binary stream:
101001101100001000
Problem 4 - Solution
1) The possible source probability distribution:

Symbol  SF Code  li  pi    Ii (bit)
a       1100     4   1/16  4
b       10       2   7/16  1.2
c       0110     4   1/16  4
d       00       2   7/16  1.2

2) The average code length:
L = 2 x (1/16) x 4 + 2 x (7/16) x 2 = 2.25 binit/symbol

3) The encoder output average binit rate = 2.25 binit/symbol ÷ 125 μsec/symbol = 18 kbinit/sec


4) Decoding the received binary stream 101001101100001000:
10 | 10 | 0110 | 1100 | 00 | 10 | 00
bbcadbd
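The decoding step can be automated with a generic prefix-code decoder (a sketch with illustrative names):

def prefix_decode(bits, codebook):
    # codebook: dict symbol -> code word of a prefix code
    inverse = {w: s for s, w in codebook.items()}
    decoded, buffer = [], ""
    for b in bits:
        buffer += b
        if buffer in inverse:          # a prefix code makes this match unambiguous
            decoded.append(inverse[buffer])
            buffer = ""
    return "".join(decoded)

code = {"a": "1100", "b": "10", "c": "0110", "d": "00"}
print(prefix_decode("101001101100001000", code))   # -> bbcadbd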
Problem 5
The second extension (T) of a binary memoryless source S = {a, b}
is encoded using the following binary Shannon-Fano encoder:

T = S^2  SF Code
aa       1100
bb       0
ab       101
ba       111

1. Find a possible source T probability distribution.


2. Find the source S probability distribution.
3. Calculate the average code length of the second extension source T.
4. Find the encoder output average binit rate if the source S emits randomly
one symbol every 20 μsec.
5. Decode the following received binary stream: 1010110010111100
Problem 5 - Solution
1) A possible probability distribution of the extension source T (chosen so that I(t) ≤ l(t) < I(t) + 1 and consistent with a memoryless S):
P(aa) = 1/16, P(ab) = P(ba) = 3/16, P(bb) = 9/16

2) The source S probability distribution:
P(a)^2 = P(aa) = 1/16, so P(a) = 1/4 and P(b) = 3/4

3) The average code length of the second extension source T:
LT = (1/16)(4) + (9/16)(1) + (3/16)(3) + (3/16)(3) = 31/16 ≈ 1.94 binit/extended symbol

4) The encoder output average binit rate = LT ÷ (2 x 20 μsec) = (31/16)/(40 x 10^-6) ≈ 48.4 kbinit/sec

5) 101 | 0 | 1100 | 101 | 111 | 0 | 0 → ab bb aa ab ba bb bb; (abbbaaabbabbbb)
Problem 6

The codeword lengths of a 6-symbol source that is
encoded using a Shannon-Fano binary code are as
follows: {1, 3, 3, 3, 4, 4}
1. Find a possible source probability distribution
2. Find the above Shannon-Fano binary code and
calculate its efficiency
Problem 6 - Solution

1) A possible source probability distribution and 2) the Shannon-Fano code:

li  pi    Step 1  Step 2  Step 3  Step 4  SF Code
1   1/2   0                               0
3   1/8   1       0       0               100
3   1/8   1       0       1               101
3   1/8   1       1       0               110
4   1/16  1       1       1       0       1110
4   1/16  1       1       1       1       1111

Since pi = 2^(-li) for every symbol, LSF = H(S) = 2.125 bit/symbol and 𝜂 = 100%.
Memory (Markov) Information Sources
Most practical information sources have highly correlated symbols, i.e.
the probability of the current symbol statistically depends on previously
emitted symbols.
A j-memory source of q symbols has q^(j+1) conditional probabilities,
which are written in a state transition matrix of size (q^j x q).
For example, for a first-order memory source with three symbols (j = 1 and
q = 3; a, b and c), the transition matrix is the following:
p(a/a) p(b/a) p(c/a)
p(a/b) p(b/b) p(c/b)
p(a/c) p(b/c) p(c/c)
Each row must add to one.
A transition diagram can be drawn by representing each state by a circle
and each conditional probability by an arrow originating from the given
state towards the new state (i.e., p(a/b) from state b to state a).
The entropy of a memory source:
H(S) = sum of [conditional entropy of each state (row) * state probability]
The average code length of a memory source:
L = sum of [conditional average code length of each state (row) * state probability]
Therefore, the code efficiency of the memory source is
η = H(S)/L
Note that the memory source has a lower entropy than its adjoint source
(the same source treated, for simplicity, as a zero-memory source);
therefore, a lower average code length (lower binit rate)
can be achieved by coding the states [a different code for each state].
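A sketch of this entropy computation in Python (illustrative names; P is the state transition matrix, state_probs the state probability distribution):

import math

def markov_entropy(P, state_probs):
    # P[i][j] = p(symbol j / state i); state_probs[i] = stationary probability of state i
    H = 0.0
    for pi, row in zip(state_probs, P):
        Hi = -sum(p * math.log2(p) for p in row if p > 0)   # conditional entropy of the state
        H += pi * Hi
    return H                                                # bit per source symbol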
Problem 1
Consider a 3-symbol, 1st order Markov (memory) source S with the following state
transition matrix (each row lists the transition probabilities out of a state):

0.5   0.25  0.25
0.25  0.5   0.25
0.25  0.25  0.5

1. Draw the state transition diagram.


2. Calculate the state probability distribution. (i.e., calculate P(1), P(2), & P(3)).
3. Calculate the source entropy.
Problem 1 - Solution
1) [State transition diagram: three states (1, 2, 3), each with a self-loop of probability 0.5 and transitions of probability 0.25 to each of the other two states.]
Problem 1 - Solution
2) The stationary state probabilities satisfy:

P(1) = 0.5P(1) + 0.25P(2) + 0.25P(3) ..........(1)
P(2) = 0.25P(1) + 0.5P(2) + 0.25P(3) ..........(2)
P(3) = 0.25P(1) + 0.25P(2) + 0.5P(3) ..........(3)
P(1) + P(2) + P(3) = 1 ..........(4)

So;
(1)-(2): P(1) = P(2) ..........(5)
(2)-(3): P(2) = P(3) ..........(6)

From (4), (5), and (6): P(1) = P(2) = P(3) = 1/3

Problem 1 - Solution
3) Entropy of the source:
H(S) = P(1)H_1 + P(2)H_2 + P(3)H_3

Entropy of state (i):
H_i = H(0.5, 0.25, 0.25) = 0.5(1) + 0.25(2) + 0.25(2) = 1.5 bit/SS for every state

Therefore H(S) = (1/3 + 1/3 + 1/3)(1.5) = 1.5 bit/SS
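A numerical check of this result (the transition matrix as reconstructed above; the stationary distribution is found by power iteration):

import math

P = [[0.5, 0.25, 0.25],        # transition matrix of Problem 1
     [0.25, 0.5, 0.25],
     [0.25, 0.25, 0.5]]

state_probs = [1/3, 1/3, 1/3]  # start from a guess and iterate p <- p * P
for _ in range(200):
    state_probs = [sum(state_probs[i] * P[i][j] for i in range(3)) for j in range(3)]

row_entropy = [-sum(p * math.log2(p) for p in row if p > 0) for row in P]
H = sum(state_probs[i] * row_entropy[i] for i in range(3))
print(state_probs, H)          # -> [0.333..., 0.333..., 0.333...] and 1.5 bit/SS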
Problem 2
Consider a 2-symbol, 1st order Markov (memory) source with the following state
diagram:

[State diagram: two states, 1 and 2; the transition probabilities marked on the arrows are 2/3 and 1/3.]

1. Construct the state transition matrix.


2. Calculate the state probability distribution. (i.e., calculate P(1) and P(2)).
3. Calculate the source entropy.
Problem 2 - Solution

1) Each row of the 2x2 state transition matrix contains the probabilities read off the diagram: one transition of probability 2/3 and one of probability 1/3 out of each state.

2) The stationary state probabilities satisfy:
P(1) = p(1/1)P(1) + p(1/2)P(2) ..........(1)
P(2) = p(2/1)P(1) + p(2/2)P(2) ..........(2)
P(1) + P(2) = 1 ..........(3)

From (1): P(1) = P(2) ..........(4)
Substituting (4) in (3): P(1) = P(2) = 1/2
Problem 2 - Solution
3) Entropy of state (i):
H_i = H(2/3, 1/3) = (2/3)log2(3/2) + (1/3)log2(3) = 0.918 bit/SS

Entropy of the source:
H(S) = P(1)H_1 + P(2)H_2 = (1/2)(0.918) + (1/2)(0.918) = 0.918 bit/SS
Consider a 4-symbol, 1st order Markov (memory) source S = {a, b, c, d} with the
following state transition matrix (rows: current state; columns: next symbol a, b, c, d):

     a    b    c    d
a   0.7  0.2  0.1  0
b   0.5  0.3  0.2  0
c   0    0    0    1
d   0.4  0.4  0.2  0

1. Draw the state transition diagram.


2. Calculate the state probability distribution. (i.e., calculate P(a), P(b), P(c) &
P(d)).
3. Calculate the source entropy.
Problem 3 - Solution
1) [State transition diagram: four states a, b, c and d, with arrows labelled by the transition probabilities of the matrix above.]
Problem 3 - Solution
2)

P(a) = 0.7P(a) + 0.5P(b) + 0.4P(d) → 0.3P(a) = 0.5P(b) + 0.4P(c)
P(c) = 0.1P(a) + 0.2P(b) + 0.2P(d) → 0.8P(c) = 0.1P(a) + 0.2P(b)
P(d) = P(c)
P(a) + P(b) + 2P(c) = 1

Solving these equations:

P(a) = 8/15
P(b) = 2/9
P(c) = 11/90
P(d) = 11/90
Problem 3 - Solution
3) Entropy of state (i):
H_a = H(0.7, 0.2, 0.1) ≈ 1.157 bit/SS
H_b = H(0.5, 0.3, 0.2) ≈ 1.485 bit/SS
H_c = H(1) = 0 bit/SS
H_d = H(0.4, 0.4, 0.2) ≈ 1.522 bit/SS

Entropy of the source:
H(S) = P(a)H_a + P(b)H_b + P(c)H_c + P(d)H_d ≈ (8/15)(1.157) + (2/9)(1.485) + (11/90)(0) + (11/90)(1.522) ≈ 1.13 bit/SS

Vector Quantization (Lossy Data Compression)
In accordance with the source-coding theorem, the average code-word
length of an extended source can be made as small as the entropy of the
source, provided the extended source has a high enough order.
However, the price that has to be paid for decreasing the average code-
word length is increased decoding complexity, which is brought about by
the high order of the extended source (vector quantization, VQ).
VQ consists of a designed codebook, containing the most representative
vectors (called codevectors) for a large pool of training vectors, and a
search algorithm that finds the nearest codevector to an input vector.
The transmitter and the receiver must hold the same copy of the designed codebook.
The transmitter needs only to send the address of the chosen codevector,
and the receiver uses the addressed codevector to reconstruct the
original vector with acceptable distortion (lossy compression).
Entropy coding can be used for the address transmission for further
reduction of the binit rate.
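A full-search VQ encoder sketch (hypothetical 2-dimensional codebook, for illustration only):

def nearest_codevector_index(x, codebook):
    # full-search VQ encoder: index (address) of the codevector closest to x
    best_index, best_dist = 0, float("inf")
    for i, c in enumerate(codebook):
        dist = sum((xi - ci) ** 2 for xi, ci in zip(x, c))   # squared Euclidean distance
        if dist < best_dist:
            best_index, best_dist = i, dist
    return best_index

codebook = [(0.0, 0.0), (0.0, 1.0), (1.0, 0.0), (1.0, 1.0)]
print(nearest_codevector_index((0.2, 0.9), codebook))   # -> 1, so only log2(4) = 2 binits are sent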
Example: A black-and-white television picture may be viewed as consisting of approximately (784 x 440) pixels. The value
of each pixel may equal one of 256 distinct gray levels with equal probability. The rate of transmission is 30 picture
frames per second.
(a) Calculate the binit rate if scalar uniform quantizer is used for each pixel.
Each pixel can take one of 256 gray levels with equal probability (uniform distribution). To send one of these gray levels,
log2(256) = 8 binits are needed. Therefore, the binit rate = 784 x 440 x 30 x 8 = 82790400 = 82.7904 x 10^6 binit/sec
(b) Calculate the binit rate if a vector quantizer with a codebook of size (256 x 16) is used, where the picture is divided
into blocks of size (4 x 4) pixels each.
 The codebook is designed to have the best representative blocks called codevectors for image coding. The codebook size
(256 x 16) has 256 different codevectors of size 16 pixels (block size of 4x4 pixels) each. To send one of these
codevectors, log2 (256) = 8 binits are needed.
Therefore, the binit rate = [total # of blocks/sec] * 8 binit/block
= [((784 x 440)/16) x 30] x 8 = 5174400 = 5.1744 x 10^6 binit/sec
(c) Calculate the binit rate if a vector quantizer with a codebook of size (512 x 64) is used, where the picture is divided
into blocks of size (8 x 8) pixels each.
The binit rate = [total # of blocks/sec] * log2(512) binit/block
= [((784 x 440)/64) x 30] x 9 = 1455300 = 1.4553 x 10^6 binit/sec
(d) Compare and comment.
The binit rate has been reduced in (b) by [(82.7904 - 5.1744)/82.7904]*100% = 93.75%
The binit rate has been reduced in (c) by [(82.7904 -1.4553)/82.7904]*100% = 98.24%
As the codevector (block) size increases, the binit rate decreases but the image quality drops, so the codebook size (# of
codevectors) must be increased in order to maintain the image quality. A larger codebook needs more efficient
design and search algorithms.
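The three binit rates of this example can be reproduced with a short helper (illustrative name):

import math

def vq_binit_rate(width, height, frames_per_sec, pixels_per_block, codebook_entries):
    # each block of pixels is replaced by the address of one codevector
    blocks_per_sec = (width * height // pixels_per_block) * frames_per_sec
    return blocks_per_sec * math.ceil(math.log2(codebook_entries))

print(vq_binit_rate(784, 440, 30, 1, 256))    # (a) scalar quantizer: 82790400 binit/sec
print(vq_binit_rate(784, 440, 30, 16, 256))   # (b) 4x4 blocks:        5174400 binit/sec
print(vq_binit_rate(784, 440, 30, 64, 512))   # (c) 8x8 blocks:        1455300 binit/sec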
