CH-2: Information Theory and Coding

By Mr. Yosef B. (MSc.)


Lesson

• What is "information" and how do we measure it?
• What are the fundamental limits on the storage and transmission of information?
• Given an information source, is it possible to reduce its data rate? If so, by how much can the data rate be reduced?
• How can we achieve data compression without loss of information?
What is Information Theory?

• Information theory is the mathematical treatment of the concepts, parameters and rules governing the transmission of messages through communication systems.
• It quantifies the amount of information carried by a random variable or a collection of random variables.
• It was founded by Claude Shannon.
• Shannon stated the inverse relationship between information and probability: the less probable an event, the more information its occurrence carries.
What is information?

• The first attempt to quantify information was by Hartley (1928).

◼ Every symbol of the message has a choice of s possibilities.
◼ A message of length l can therefore take s^l distinguishable values.
◼ The information measure is then the logarithm of s^l:

I = log(s^l) = l·log(s)

Intuitively, this definition makes sense: if one symbol (letter) carries information log(s), then a message of length l should carry l times more information, i.e. l·log(s).
Example

• Consider a pack of 32 playing cards, one of which is drawn at random. Calculate the amount of information conveyed by the identity of the drawn card.
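A brief worked answer (assuming, as is standard for this example, that each of the 32 cards is equally likely to be drawn): every card has probability 1/32, so

I = log2(32) = 5 bits.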
Shannon's Information Theory

Claude Shannon: "A Mathematical Theory of Communication", Bell System Technical Journal, 1948.

• Shannon's measure of information is the number of bits needed to represent the amount of uncertainty (randomness) in a data source, and is defined as the entropy

H = − Σ_{i=1}^{n} p_i log(p_i)

where there are n symbols 1, 2, …, n, each with probability of occurrence p_i.
Shannon's Entropy

• Consider the following string consisting of symbols a and b:

abaabaababbbaabbabab…

◼ On average, there are equal numbers of a and b.
◼ The string can be considered as the output of a memoryless source that emits symbol a with probability 0.5 and symbol b with probability 0.5.

We want to characterize the average information generated by the source. The average number of bits needed to represent a symbol is therefore

H = − Σ_{i=0}^{1} p_i log(p_i)
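A minimal sketch in Python (not from the slides) that evaluates this sum for the a/b source above; the function name `entropy` is illustrative only:

```python
import math

def entropy(probs, base=2):
    """Shannon entropy H = -sum(p_i * log(p_i)), skipping zero-probability symbols."""
    return -sum(p * math.log(p, base) for p in probs if p > 0)

# Source emitting 'a' and 'b' with equal probability 0.5
print(entropy([0.5, 0.5]))  # 1.0 bit per symbol
```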
Self Information

• So, let's look at it the way Shannon did.
• Assume a memoryless source with
◼ alphabet A = (a1, …, an)
◼ symbol probabilities (p1, …, pn).
• How much information do we get when finding out that the next symbol is ai?
• According to Shannon, the self-information of ai is

i(ai) = − log(pi)

Why the logarithm?
Assume two independent events A and B, with probabilities P(A) = pA and P(B) = pB. For both events to happen, the probability is pA · pB. However, the amounts of information should be added, not multiplied. Logarithms satisfy this!

Why the negative sign?
We want the information to increase with decreasing probability, so we use the negative logarithm.
Self Information
Example 1:

Example 2:

Which logarithm? Pick the one you like! If you pick the natural logarithm you will measure in nats, if you pick the base-10 logarithm you will get hartleys, and if you pick the base-2 logarithm (like everyone else) you will get bits.
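A small illustrative sketch (not from the slides) showing the same self-information expressed in the three units; `self_information` is a hypothetical helper name:

```python
import math

def self_information(p, base=2):
    """Self-information i = -log(p) in the chosen logarithm base."""
    return -math.log(p, base)

p = 0.25
print(self_information(p, 2))        # 2.0 bits
print(self_information(p, math.e))   # ~1.386 nats
print(self_information(p, 10))       # ~0.602 hartleys
```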
Entropy

Example: Binary Memoryless Source (BMS), emitting a stream such as 01101000…

Let P(1) = p and P(0) = 1 − p.

Then the entropy is

H(p) = − p·log2(p) − (1 − p)·log2(1 − p)

[Figure: H(p) versus p, rising from 0 at p = 0 to a maximum of 1 bit at p = 0.5 and falling back to 0 at p = 1]

The uncertainty (information) is greatest when p = 0.5.
Example

Three symbols a, b, c with corresponding probabilities:

P = {0.5, 0.25, 0.25}

What is H(P)?

Three weather conditions in Corvallis (rain, sunny, cloudy) with corresponding probabilities:

Q = {0.48, 0.32, 0.20}

What is H(Q)?
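A quick numerical check in Python (not part of the slides); the small entropy helper is repeated so the snippet is self-contained:

```python
import math

def entropy(probs, base=2):
    return -sum(p * math.log(p, base) for p in probs if p > 0)

print(entropy([0.5, 0.25, 0.25]))    # 1.5 bits
print(entropy([0.48, 0.32, 0.20]))   # ~1.5 bits (about 1.499)
```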
Important parameters:

Average length: L = Σ p_i·l_i

Efficiency of coding: E = H / L

where l_i is the length of the code word for symbol i (in binary digits).
Source Coding Method

▪ Fano-Shannon method
▪ Huffman’s Method
Fano-Shannon Coding

• Write the symbols in a table in descending order of probability;
• Insert dividing lines to successively split the probabilities into halves, quarters, etc. (or as near as possible);
• Add a '0' or '1' to the code at each division;
• The final code for each symbol is obtained by reading the assigned digits from the first division towards the symbol.
Example

Symbol  Probability  Code
s1      0.5          0
s2      0.2          100
s3      0.1          101
s4      0.1          110
s5      0.1          111

L = 0.5×1 + 0.2×3 + 3×0.1×3 = 2.0 binary digits/symbol
H = 1.96 bits/symbol
E = H/L = 0.98
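A small sketch (not from the slides) that checks these figures for the code table above:

```python
import math

# (probability, code word) pairs from the table above
code = {"s1": (0.5, "0"), "s2": (0.2, "100"), "s3": (0.1, "101"),
        "s4": (0.1, "110"), "s5": (0.1, "111")}

L = sum(p * len(cw) for p, cw in code.values())
H = -sum(p * math.log2(p) for p, _ in code.values())

print(round(L, 2))      # 2.0 binary digits/symbol
print(round(H, 2))      # 1.96 bits/symbol
print(round(H / L, 2))  # efficiency 0.98
```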
Huffman's Method

1. Write the symbols in a table in descending order of probability;
2. Add the two lowest probabilities at the bottom of the table and reorder;
3. Place a '0' or '1' at each branch;
4. The final code for each symbol is obtained by reading the branch digits from the final (combined) node back towards the symbol.
Example

L = 0.5×1 + 0.2×2 + 0.1×3 + 2×0.1×4 = 2.0 binary digits/symbol
H = 1.96 bits/symbol
E = H/L = 0.98
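A sketch of the standard Huffman construction in Python (not from the slides); ties can be broken differently, so the code words it produces need not match the lecture table exactly, although the average length is still 2.0:

```python
import heapq

def huffman_code(probs):
    """Build a Huffman code for a {symbol: probability} dictionary."""
    # Each heap entry: (probability, tie-breaker, {symbol: partial code word})
    heap = [(p, i, {sym: ""}) for i, (sym, p) in enumerate(probs.items())]
    heapq.heapify(heap)
    count = len(heap)
    while len(heap) > 1:
        p1, _, codes1 = heapq.heappop(heap)   # two least probable groups
        p2, _, codes2 = heapq.heappop(heap)
        merged = {s: "0" + c for s, c in codes1.items()}
        merged.update({s: "1" + c for s, c in codes2.items()})
        heapq.heappush(heap, (p1 + p2, count, merged))
        count += 1
    return heap[0][2]

probs = {"s1": 0.5, "s2": 0.2, "s3": 0.1, "s4": 0.1, "s5": 0.1}
code = huffman_code(probs)
print(code)
print(round(sum(p * len(code[s]) for s, p in probs.items()), 2))  # average length 2.0
```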
Example
An information source produces a long sequence of three independent symbols A, B, C with probabilities 16/20, 3/20 and 1/20 respectively; 100 such symbols are produced per second. The information is to be transmitted via a noiseless binary channel which can transmit up to 100 binary digits per second. Design a suitable compact instantaneous code and find the probabilities of the binary digits produced.
Example (continued)

[Figure: block diagram — source (100 symbols/s) → coder → binary channel (0, 1) → decoder]

P(A) = 16/20, P(B) = 3/20, P(C) = 1/20

H = − Σ p_i log(p_i) = 0.884 bits/symbol, so the source information rate = 88.4 bits/s.

Coding singly, using the Fano-Shannon method:

Symbol  Probability  Code
A       16/20        0
B       3/20         10
C       1/20         11

L = Σ p_i·l_i = 1.2 binary digits/symbol, giving a coded bit rate of 120 bits/s.
P(0)=0.73, p(1)=0.27
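A short Python check (not from the slides) of the entropy and average code length for this example:

```python
import math

probs = {"A": 16/20, "B": 3/20, "C": 1/20}
code = {"A": "0", "B": "10", "C": "11"}   # the singly-coded Fano-Shannon table above

H = -sum(p * math.log2(p) for p in probs.values())
L = sum(probs[s] * len(code[s]) for s in probs)

print(round(H, 3))   # ~0.884 bits/symbol -> 88.4 bits/s at 100 symbols/s
print(round(L, 2))   # 1.2 binary digits/symbol -> 120 bits/s at 100 symbols/s
```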
Channel Coding

Why?
To increase the resistance of digital communication systems to channel noise via error-control coding.

How?
By mapping the incoming data sequence into a channel input sequence, and inverse-mapping the channel output sequence into an output data sequence, in such a way that the overall effect of channel noise on the system is minimized.

Redundancy is introduced by the channel encoder so that the original source sequence can be reconstructed as accurately as possible.
Channel Coding

Digital communication over physical channels is prone to errors.

Channel coding means introducing redundancy (i.e., adding extra bits) to information messages to protect against channel errors.

Coding theory develops methods to protect information against noise.
Error Correction and Detection

REDUNDANCY

• For example, 0 is encoded as 00000 and 1 is encoded as 11111 (a repetition code).
• The key idea is that, in order to protect a message against noise, we should encode the message by adding some redundant information to it.
• In such a case, even if the message is corrupted by noise, there will be enough redundancy in the encoded message to recover, or decode, the message completely.
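As a concrete illustration (not from the slides), a minimal repetition-code sketch with majority-vote decoding:

```python
def encode(bit, n=5):
    """Repeat each bit n times: 0 -> 00000, 1 -> 11111."""
    return [bit] * n

def decode(block):
    """Majority vote: corrects up to (n-1)//2 bit errors per block."""
    return 1 if sum(block) > len(block) // 2 else 0

codeword = encode(1)          # [1, 1, 1, 1, 1]
received = [1, 0, 1, 1, 0]    # two bits flipped by channel noise
print(decode(received))       # 1 -- still decoded correctly
```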
Parity Check

• Detects errors using either even or odd parity by appending a parity bit accordingly.
• The data plus the parity bit are sent.
• The receiver counts the number of "1"s and decides whether an error has occurred.
• Not an efficient way of detecting errors (any even number of bit errors goes undetected).
• EXAMPLE: u = 1010100, even parity (see the sketch below).
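A brief even-parity sketch (not from the slides), applied to the example word u = 1010100:

```python
def add_even_parity(bits):
    """Append a parity bit so the total number of 1s is even."""
    parity = sum(bits) % 2
    return bits + [parity]

def check_even_parity(word):
    """Return True if the received word has an even number of 1s."""
    return sum(word) % 2 == 0

u = [1, 0, 1, 0, 1, 0, 0]          # three 1s
tx = add_even_parity(u)            # [1, 0, 1, 0, 1, 0, 0, 1]
print(tx, check_even_parity(tx))   # parity bit 1, check passes

tx[2] ^= 1                         # single bit error in the channel
print(check_even_parity(tx))       # False -- error detected
```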
Bi-directional Parity Check

• Split the data into n-bit words.
• Arrange the data in two dimensions (rows and columns).
• Count the number of "1"s in each row and each column.
• Add parity bits accordingly, for even or odd parity.
• EXAMPLE: u = 1000101101110100 using even parity and n = 4 bits (see the sketch below).
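A minimal two-dimensional (row/column) even-parity sketch, not from the slides, applied to the 16-bit example above with 4-bit rows:

```python
def two_d_even_parity(bits, n):
    """Arrange bits into rows of n, then compute a parity bit per row and per column."""
    rows = [bits[i:i + n] for i in range(0, len(bits), n)]
    row_parities = [sum(r) % 2 for r in rows]
    col_parities = [sum(r[j] for r in rows) % 2 for j in range(n)]
    return rows, row_parities, col_parities

u = [1,0,0,0, 1,0,1,1, 0,1,1,1, 0,1,0,0]
rows, row_par, col_par = two_d_even_parity(u, 4)
for r, p in zip(rows, row_par):
    print(r, "| row parity:", p)       # row parities: 1, 1, 1, 1
print("column parities:", col_par)     # column parities: 0, 0, 0, 0
```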
Checksum

• Divide the data into k sections (segments) of n bits each.
• Add all the sections (folding any carry back in, i.e. 1's-complement addition).
• Take the 1's complement of the result: this is the checksum.
• Transmit the data plus the checksum.
• Example: u = 1100111000110001, n = 4 and k = 4 (see the sketch below).
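A small sketch (not from the slides) of the n-bit 1's-complement checksum for the example above:

```python
def ones_complement_checksum(bits, n):
    """Split into n-bit segments, add with end-around carry, then take the 1's complement."""
    total = 0
    for i in range(0, len(bits), n):
        segment = int("".join(map(str, bits[i:i + n])), 2)
        total += segment
        while total >> n:                      # fold any carry back in
            total = (total & ((1 << n) - 1)) + (total >> n)
    return (~total) & ((1 << n) - 1)           # 1's complement, kept to n bits

u = [1,1,0,0, 1,1,1,0, 0,0,1,1, 0,0,0,1]
cs = ones_complement_checksum(u, 4)
print(format(cs, "04b"))   # 0000 for this example; appended to the data before transmission
```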
❑ Addition of redundancy implies the need for increased transmission bandwidth.

❑ It also adds complexity to the decoding operation.

❑ Therefore, there is a design trade-off in the use of error-control coding: achieving acceptable error performance while considering bandwidth and system complexity.

Types of Error Control Coding

• Block codes
• Convolutional codes
Linear Block Codes

Usually in the form of an (n, k) block code, where n is the number of bits of the codeword and k is the number of bits of the binary message.

To generate an (n, k) block code, the channel encoder accepts information in successive k-bit blocks. For each block it adds (n − k) redundant bits to produce an encoded block of n bits, called a code word. The (n − k) redundant bits are algebraically related to the k message bits.

The channel encoder produces bits at a rate called the channel data rate, R0:

R0 = (n / k) · Rs

where Rs is the bit rate of the information source and r = k/n is the code rate.
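For instance (an illustrative calculation, not from the slides), a (7, 4) block code driven by an assumed 4 kbit/s source:

```python
n, k = 7, 4          # codeword length and message length
Rs = 4000            # information source bit rate in bits/s (assumed value)

code_rate = k / n    # ~0.571
R0 = (n / k) * Rs    # channel data rate = 7000 bits/s
print(code_rate, R0)
```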
Forward Error Control (FEC)

The channel encoder accepts information in successive k-bit blocks and for each block it adds (n − k) redundant bits to produce an encoded block of n bits called a code word.

The channel decoder uses the redundancy to decide which message bits were actually transmitted.

In this case, whether or not the decoding of the received code word is successful, the receiver does not perform further processing. In other words, if an error is detected in a transmitted code word, the receiver does not request retransmission of the corrupted code word.
Automatic Repeat Request (ARQ) Scheme

Upon detection of an error, the receiver requests a repeat transmission of the corrupted code word.

There are 3 types of ARQ scheme:
• Stop-and-wait
• Continuous ARQ with pullback
• Continuous ARQ with selective repeat
Types of ARQ Scheme

Stop-and-wait
• A block of the message is encoded into a code word and transmitted.
• The transmitter stops and waits for feedback from the receiver: either an acknowledgement of correct receipt of the code word, or a retransmission request due to an error in decoding.
• If retransmission is requested, the transmitter resends the code word before moving on to the next block of the message.

What is the implication of this?
Types of ARQ Scheme…

Continuous ARQ with pullback (or go-back-N)
• Allows the receiver to send a feedback signal while the transmitter is sending another code word.
• The transmitter continues to send a succession of code words until it receives a retransmission request.
• It then stops, pulls back to the particular code word that was not correctly decoded, and retransmits the complete sequence of code words starting with the corrupted one.

What is the implication of this?
Types of ARQ Scheme…

Continuous ARQ with selective repeat
• Retransmits only the code word that was incorrectly decoded.
• Thus, it eliminates the need to retransmit the successfully decoded code words.
Linear Block Codes

An (n, k) block code indicates that the code word has n bits, of which k bits carry the original binary message.

A code is said to be linear if any two code words in the code can be added in modulo-2 arithmetic to produce a third code word in the code.

Code Vectors
Any n-bit code word can be visualized in an n-dimensional space as a vector whose coordinates equal the bits in the code word. For example, the code word 101 can be written in row-vector notation as (1 0 1).

Matrix representation of block codes
The code vector can be written in matrix form: a block of k message bits can be written as a 1-by-k matrix (row vector).
Linear Block Codes…

Modulo-2 operations
The encoding and decoding functions involve the binary arithmetic operation of modulo-2. Rules for modulo-2 operations are:

Modulo-2 addition
0 + 0 = 0
1 + 0 = 1
0 + 1 = 1
1 + 1 = 0

Modulo-2 multiplication
0 × 0 = 0
1 × 0 = 0
0 × 1 = 0
1 × 1 = 1
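In code (an illustrative note, not from the slides), modulo-2 addition and multiplication correspond to the bitwise XOR and AND operators:

```python
for a in (0, 1):
    for b in (0, 1):
        # modulo-2 addition is XOR, modulo-2 multiplication is AND
        print(f"{a} + {b} = {a ^ b}   {a} x {b} = {a & b}")
```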
