
COE 343 – Information theory and coding

Lecture 4: Coding of Information

Dr. E. T. Tchao
Quotes about Shannon

• "What is information? Sidestepping questions about meaning, Shannon showed that it is a measurable commodity."

• "Today, Shannon's insights help shape virtually all systems that store, process, or transmit information in digital form, from compact discs to computers, from facsimile machines to deep-space probes."

• "Information theory has also infiltrated fields outside communications, including linguistics, psychology, economics, biology, even the arts."
Information Sources

[Block diagram: an information source emits symbols s1, …, sq; a source/channel encoder maps them to a signal; the signal passes through a channel where noise introduces errors; a channel/source decoder recovers symbols s1, …, sq for the destination.]
Example: Morse Code

[Diagram: letters A, …, Z are encoded by a keyer into dots, dashes, and spaces, sent by a transmitter over a telegraph wire or shortwave radio, and decoded by a recognizer at the receiver back into A, …, Z.]
Example: ASCII Code

[Diagram: characters typed at a keyboard are encoded as seven-bit blocks, sent by a modem over a telephone wire to another modem, and displayed as characters on a terminal screen.]
Why Code Information?

• The general reasons for coding information are:

 Coding for compressing data

 Coding for ensuring the quality of the transmission in noisy conditions

 Coding for secrecy

Stochastic Sources

• A source outputs symbols X1, X2, …

• Each symbol takes its value from an alphabet A = (a1, a2, …).

• Model: P(X1, …, XN) assumed to be known for all combinations.

Example 1: A text is a sequence of symbols, each taking its value from the alphabet A = (a, …, z, A, …, Z, 1, 2, …, 9, !, ?, …).

Example 2: A (digitized) grayscale image is a sequence of symbols, each taking its value from the alphabet A = (0, 1) or A = (0, …, 255).

Source → X1, X2, …

Two Special Cases
1. The Memoryless Source
 Each symbol is independent of the previous ones.
 P(X1, X2, …, Xn) = P(X1) · P(X2) · … · P(Xn)
2. The Markov Source
 Each symbol depends only on the previous one.
 P(X1, X2, …, Xn) = P(X1) · P(X2|X1) · P(X3|X2) · … · P(Xn|Xn-1)
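The two factorizations can be sketched directly in code. The alphabet and the probability values below are illustrative assumptions, not values from the lecture:

```python
def p_memoryless(seq, p):
    """P(X1,...,Xn) = P(X1) * P(X2) * ... * P(Xn)."""
    prob = 1.0
    for s in seq:
        prob *= p[s]
    return prob

def p_markov(seq, p_init, p_trans):
    """P(X1,...,Xn) = P(X1) * P(X2|X1) * ... * P(Xn|Xn-1)."""
    prob = p_init[seq[0]]
    for prev, cur in zip(seq, seq[1:]):
        prob *= p_trans[(prev, cur)]
    return prob

# Hypothetical binary source with a strong tendency to repeat symbols.
p = {"a": 0.5, "b": 0.5}
p_trans = {("a", "a"): 0.9, ("a", "b"): 0.1,
           ("b", "a"): 0.1, ("b", "b"): 0.9}

print(p_memoryless("aab", p))       # 0.5 * 0.5 * 0.5 = 0.125
print(p_markov("aab", p, p_trans))  # 0.5 * 0.9 * 0.1 ≈ 0.045
```

Note how the same sequence gets very different probabilities under the two models: the Markov model "knows" that a→b transitions are rare.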
The Markov Source

• A symbol depends only on the previous symbol, so the source can be modelled by a state diagram.

[State diagram: a ternary source with alphabet A = (a, b, c); the arcs between the states a, b, and c are labelled with transition probabilities (0.7, 0.5, 1.0, 0.3, 0.2, 0.3).]
The Markov Source

• Assume we are in state a, i.e., Xk = a.

• The probabilities for the next symbol are:

  P(Xk+1 = a | Xk = a) = 0.3
  P(Xk+1 = b | Xk = a) = 0.7
  P(Xk+1 = c | Xk = a) = 0
The Markov Source

• So, if Xk+1 = b, we know that Xk+2 will equal c:

  P(Xk+2 = a | Xk+1 = b) = 0
  P(Xk+2 = b | Xk+1 = b) = 0
  P(Xk+2 = c | Xk+1 = b) = 1
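A small simulator for this source. The transition rows for states a and b match the probabilities stated on the slides; the row for state c is an assumed reading of the remaining diagram labels (0.5, 0.3, 0.2) and should be treated as illustrative:

```python
import random

# Transition table for the ternary Markov source with alphabet (a, b, c).
# Rows "a" and "b" are from the slides; row "c" is an assumption.
P = {
    "a": {"a": 0.3, "b": 0.7, "c": 0.0},
    "b": {"a": 0.0, "b": 0.0, "c": 1.0},
    "c": {"a": 0.5, "b": 0.3, "c": 0.2},
}

def step(state, rng=random):
    """Draw the next symbol given the current state."""
    r = rng.random()
    acc = 0.0
    for sym, p in P[state].items():
        acc += p
        if r < acc:
            return sym
    return sym  # guard against floating-point round-off

# From state b, the next symbol is c with certainty, as on the slide.
print(step("b"))  # always "c"
```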
Definition of Terms
Non-singular code: A code of a discrete information source is said to be non-singular when different source symbols map to different codewords.

Non-ambiguous code: A code of a discrete source is said to be non-ambiguous if and only if each sequence of codewords uniquely corresponds to a single message.
Prefix-free Set

• Let T be a subset of {0,1}*.

• Definition:
• T is prefix-free if for any distinct x, y ∈ T,
• if |x| < |y|, then x is not a prefix of y.

• Example:
• {000, 001, 1, 01} is prefix-free
• {0, 01, 10, 11, 101} is not (0 is a prefix of 01).
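The definition translates directly into a check; a minimal sketch:

```python
def is_prefix_free(codewords):
    """A set is prefix-free iff no codeword is a prefix of a different one."""
    for x in codewords:
        for y in codewords:
            if x != y and y.startswith(x):
                return False
    return True

print(is_prefix_free({"000", "001", "1", "01"}))       # True
print(is_prefix_free({"0", "01", "10", "11", "101"}))  # False: 0 prefixes 01
```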
Prefix-free Code for S
• Let S be any set.

• Definition: A prefix-free code for S is a prefix-free set T together with a 1-1 "encoding" function f: S → T.

• The inverse function f⁻¹ is called the "decoding function".

• Example: S = {apple, orange, mango}.
  T = {0, 110, 1111}.
  f(apple) = 0, f(orange) = 1111, f(mango) = 110.
What is so cool about prefix-free codes?

Sending sequences of elements of S over a communications channel:

Let T be prefix-free and f be an encoding function. We wish to send <x1, x2, x3, …>

Sender: sends f(x1) f(x2) f(x3) …

Receiver: breaks the bit stream into elements of T and decodes using f⁻¹.
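The receiver's procedure can be sketched as a greedy left-to-right scan: because the code is prefix-free, the first time the accumulated bits match a codeword, that match is the correct one.

```python
def decode(bits, inverse):
    """Break a bit stream into codewords by scanning left to right.
    Correct for prefix-free codes: the first match ends a codeword."""
    out, buf = [], ""
    for b in bits:
        buf += b
        if buf in inverse:        # end of a codeword reached
            out.append(inverse[buf])
            buf = ""
    if buf:
        raise ValueError("trailing bits do not form a codeword")
    return out

f = {"apple": "0", "orange": "1111", "mango": "110"}
inv = {v: k for k, v in f.items()}
print(decode("00011011111100", inv))
# ['apple', 'apple', 'apple', 'mango', 'orange', 'mango', 'apple']
```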
Sending information on a channel
• Example: S = {apple, orange, mango}.
• T = {0, 110, 1111}.
f(apple) = 0, f(orange) = 1111, f(mango) = 110.

• If we see
• 00011011111100…
• we know it must be
• 0 0 0 110 1111 110 0 …
• and hence
• apple apple apple mango orange mango
apple …
Morse Code is not Prefix-free!

• SOS encodes as ...---...

  A .-    F ..-.   K -.-   P .--.   U ..-    Z --..
  B -...  G --.    L .-..  Q --.-   V ...-
  C -.-.  H ....   M --    R .-.    W .--
  D -..   I ..     N -.    S ...    X -..-
  E .     J .---   O ---   T -      Y -.--
Morse Code is not Prefix-free!

• SOS encodes as ...---...

• Could decode as: IAMIE (.. .- -- .. .)
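The ambiguity is easy to verify: encoding both words with the standard Morse table, with no pauses between letters, yields exactly the same string. A minimal sketch:

```python
# Just the letters needed for this example, from the standard Morse table.
MORSE = {"A": ".-", "E": ".", "I": "..", "M": "--", "O": "---", "S": "..."}

def morse_encode(word):
    """Concatenate letter codes with no separating pauses."""
    return "".join(MORSE[c] for c in word)

print(morse_encode("SOS"))    # ...---...
print(morse_encode("IAMIE"))  # ...---...  (same string, so decoding is ambiguous)
```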
Unless you use pauses

• SOS encodes as ... --- ...

• The pause after each letter acts as an extra symbol that marks codeword boundaries, restoring unique decodability.
Properties
 Any prefix-free code is non-ambiguous.

 There exist non-ambiguous codes which are not prefix-free (for example, {1, 10}).

 A codeword is said to be instantaneously decodable if and only if each codeword in any string of codewords can be decoded as soon as its end is reached.

 A code is instantaneously decodable if and only if it is prefix-free.
Coding Tree

• A coding tree is an n-ary tree, the arcs of which are labelled with letters of a given alphabet of size n, in such a way that each letter appears at most once at a given node.
N-ARY TREES FOR CODING

[Figure: a binary coding tree with the arcs out of each node labelled 0 and 1.]

An n-ary tree is a tree in which each interior node has arity n.
Representing Prefix-free Codes

A = 100
B = 010
C = 101
D = 011
É = 00
F = 11

[Figure: the corresponding binary coding tree; the leaves É and F sit at depth 2, and B, D, A, C at depth 3, matching their codeword lengths.]

"CAFÉ" would encode as 1011001100
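The encoding can be checked mechanically; a minimal sketch using the slide's code table:

```python
code = {"A": "100", "B": "010", "C": "101", "D": "011", "É": "00", "F": "11"}

def encode(word, code):
    """Concatenate the codeword for each symbol of the word."""
    return "".join(code[c] for c in word)

print(encode("CAFÉ", code))  # 1011001100
```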


Example

• If you see: 1000101000111011001100

• read the bits left to right, emitting a symbol each time a codeword is completed:

  100 | 010 | 100 | 011 | 101 | 100 | 11 | 00
   A     B     A     D     C     A     F    É

• can decode as: ABADCAFÉ
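The step-by-step decoding above amounts to walking the coding tree bit by bit and emitting a symbol each time a leaf is reached; a minimal sketch:

```python
code = {"A": "100", "B": "010", "C": "101", "D": "011", "É": "00", "F": "11"}
inverse = {v: k for k, v in code.items()}

def tree_decode(bits):
    """Walk the coding tree: each bit follows one arc; a leaf emits a symbol."""
    out, buf = "", ""
    for b in bits:
        buf += b                  # follow one arc down the tree
        if buf in inverse:        # reached a leaf
            out += inverse[buf]
            buf = ""
    return out

print(tree_decode("1000101000111011001100"))  # ABADCAFÉ
```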


Prefix-free codes are yet another representation of a decision tree.

Theorem:

S has a decision tree of depth d
if and only if
S has a prefix-free code with all codewords bounded by length d.

[Figure: two coding trees illustrating the correspondence; in each, the leaves (É, F, B, D, A, C, and additionally G, H in the second tree) appear at depths equal to their codeword lengths.]
Let S be any D-ary prefix-free code with codeword lengths l1, l2, …, lN.

Kraft Inequality:
∑i D^(-li) ≤ 1

There exists a D-ary prefix-free code of N codewords whose codeword lengths are the positive integers l1, l2, …, lN if and only if ∑i D^(-li) ≤ 1.
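The inequality is easy to evaluate for a given set of codeword lengths; a minimal sketch. For the six-codeword binary code used earlier ({00, 010, 011, 100, 101, 11}) the sum is exactly 1, reflecting a full coding tree:

```python
def kraft_sum(lengths, D=2):
    """Sum of D^(-l_i) over the codeword lengths; ≤ 1 for a prefix-free code."""
    return sum(D ** (-l) for l in lengths)

# Lengths of the code {00, 010, 011, 100, 101, 11}:
print(kraft_sum([2, 3, 3, 3, 3, 2]))  # 1.0 -- equality: every leaf is used
```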
