Lect3 - 2021 IT


Measure of Information



Course Instructor Dr. Abdel-Fatao Hamidu

Computer Science and Engineering Department

March 15, 2022

Prepared by Dr. Abdel-Fatao Information Theory 1/29

Recap of MoI

Some Points to Remember

• The number of bits used to represent a message is completely different from the amount of information it conveys
• Intuitive Feel
• The occurrence of a less probable event conveys more information
• Since a lower probability implies a higher degree of uncertainty (and vice versa), a random variable with a higher degree of uncertainty contains more information
• This correlation between uncertainty and information forms the basis of all physical interpretations


Measure of Information (MoI)

Uncertainty and Information

• Consider a discrete random variable X with possible outcomes x_i, where i = 1, 2, ..., n
• The self-information of the event X = x_i is defined as
$$I(x_i) = \log\left(\frac{1}{P(x_i)}\right) = -\log P(x_i) \qquad (1)$$

• When the base of the logarithm is 2, the units of I(x_i) are bits
• When the base is e, the units of I(x_i) are nats (natural units)


Example 1
• Consider a binary source A which tosses a fair coin
• It produces an output equal to 1 if a head appears and a 0 if a
tail appears
• What is the information content of each output?


• For the source, P(1) = P(0) = 0.5
• The information content of each output from the source is

$$I(x_i) = -\log_2 P(x_i) = -\log_2(0.5) = 1 \text{ bit}$$

• This is consistent with intuition, since the output of the binary source, being a fair coin, can be represented with one bit (1 for a head and 0 for a tail)
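
The calculation above can be reproduced with a few lines of Python. This is only a minimal sketch: the function name self_information and its base parameter are illustrative choices, not part of the lecture.

```python
import math

def self_information(p: float, base: float = 2.0) -> float:
    """Self-information -log_base(p) of an event with probability p.

    base=2 gives bits, base=e gives nats.
    """
    if not 0.0 < p <= 1.0:
        raise ValueError("p must lie in (0, 1]")
    return -math.log(p) / math.log(base)

# One output of the fair-coin source, P(1) = P(0) = 0.5
print(self_information(0.5))          # in bits
print(self_information(0.5, math.e))  # the same event, in nats
```

Changing the base only rescales the result, so 1 bit corresponds to ln 2 nats.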


Example 1 contd
• Suppose the successive outputs from this binary source are statistically independent, i.e. the source is memoryless
• Consider a block of m binary digits
• There are 2^m possible m-bit blocks, each of which is equally probable with probability 2^{-m}
• The self-information of an m-bit block is

$$I(x_i) = -\log_2 P(x_i) = -\log_2 2^{-m} = m \text{ bits}$$

• Again, we observe that we indeed need m bits to represent the possible m-bit blocks
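
As a quick check of the block argument, the sketch below enumerates all m-bit blocks and evaluates the self-information of one block. The value m = 3 and the helper name block_self_information are arbitrary choices made for illustration.

```python
import math
from itertools import product

def block_self_information(m: int) -> float:
    """Self-information in bits of one specific m-bit block
    from a fair, memoryless binary source."""
    p_block = 0.5 ** m  # independence: m factors of 0.5
    return -math.log2(p_block)

m = 3  # block length, chosen arbitrarily for illustration
blocks = list(product((0, 1), repeat=m))  # every possible m-bit block

print(len(blocks))                # number of blocks, 2**m
print(block_self_information(m))  # information per block, in bits
```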


Example 2
• Consider a discrete memoryless source C that generates two bits at a time
• This source comprises two binary sources (A and B), each contributing one bit
• The two binary sources within the source C are independent
• What is the information content of the aggregate source C?


• Intuitively, the information content of the aggregate source C should be the sum of the information contained in the outputs of the two independent sources that constitute it
• Since A and B are independent, each two-bit output of C has probability
$$P(C) = P(A)P(B) = 0.5 \times 0.5 = 0.25$$
$$I(C) = -\log_2 P(C) = -\log_2(0.25) = 2 \text{ bits}$$
• The answer is again consistent with intuition
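
The additivity can be verified numerically. The sketch below assumes, as in the example, that each of A and B emits a given bit with probability 0.5; the variable names are illustrative.

```python
import math

p_a = 0.5  # probability of a given output bit from source A
p_b = 0.5  # probability of a given output bit from source B

# Independence: the probability of a two-bit output of C is the product
p_c = p_a * p_b

i_c = -math.log2(p_c)                       # information of the aggregate output
i_sum = -math.log2(p_a) - math.log2(p_b)    # sum of the individual informations

print(p_c, i_c, i_sum)
```

Both routes give the same number of bits, which is the behaviour expected of independent sources.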


• Given a bag containing 3 green, 4 red and 2 yellow balls, what is the average surprise associated with choosing a ball at random from the bag?
⊣ What is the information gained by choosing a green ball?
⊣ Differentiate between the two types of information
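
One possible way to approach this exercise is sketched below. It assumes that "average surprise" means the Shannon entropy, i.e. the probability-weighted average of the self-information, H(X) = −Σ P(x_i) log2 P(x_i); this formula is not stated on the slides so far and is an assumption here, as are the variable names.

```python
import math

counts = {"green": 3, "red": 4, "yellow": 2}
total = sum(counts.values())
probs = {colour: n / total for colour, n in counts.items()}

def self_information(p: float) -> float:
    """Self-information in bits of an event with probability p."""
    return -math.log2(p)

# Information gained from one particular outcome (drawing a green ball)
i_green = self_information(probs["green"])

# Average surprise over all outcomes (assumed to be the Shannon entropy)
entropy = sum(p * self_information(p) for p in probs.values())

print(f"I(green) = {i_green:.4f} bits")
print(f"H(bag)   = {entropy:.4f} bits")
```

The first quantity belongs to a single event, while the second is an average over the whole distribution.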


• Calculate the entropy of a fair coin.
⊣ What is the entropy of a biased coin that comes up heads 75% of the time?
⊣ What is the entropy of the coin if it is somehow incapable of landing tails?
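
The three cases can be explored with a small helper. This is a sketch only: it assumes the usual definition of the entropy of a two-outcome source, H(p) = −p log2 p − (1 − p) log2(1 − p), with the convention that 0 · log2 0 = 0; the function name binary_entropy is illustrative.

```python
import math

def binary_entropy(p_heads: float) -> float:
    """Entropy in bits of a coin that lands heads with probability p_heads."""
    total = 0.0
    for p in (p_heads, 1.0 - p_heads):
        if p > 0.0:  # convention: a zero-probability outcome contributes 0
            total -= p * math.log2(p)
    return total

for p_heads in (0.5, 0.75, 1.0):
    print(f"P(heads) = {p_heads:.2f}  H = {binary_entropy(p_heads):.4f} bits")
```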


Why the Logarithm?

Considering two independent sources
• Independent events ⟹ probabilities multiply
• Independent sources ⟹ information must add up
• The logarithm does the job, since it turns products into sums
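
Written out for two independent events, the argument above reads as follows. The symbols a_i and b_j, denoting outcomes of the two sources, are introduced here only for illustration.

```latex
\begin{align*}
P(a_i, b_j) &= P(a_i)\,P(b_j) \\
I(a_i, b_j) &= -\log\bigl(P(a_i)\,P(b_j)\bigr) \\
            &= -\log P(a_i) - \log P(b_j) \\
            &= I(a_i) + I(b_j)
\end{align*}
```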


[Figure: Visual representation of the independent sources]


• Note that coin A may be a fair coin while coin B is biased
• In that case, the rate of information generation by coin A is not the same as that of coin B
• The rate of information generation is the amount of information generated (number of bits) per second


Mutual Information

• Consider two discrete random variables X and Y with possible outcomes x_i, where i = 1, 2, ..., n, and y_j, where j = 1, 2, ..., m
• Suppose we observe some outcome Y = y_j and we want to determine the amount of information this event provides about the event X = x_i, for all i = 1, 2, ..., n
• We want to represent this mutual information mathematically
• This arises in applications such as DNA sequencing, where Y could be the profile of a disease and X the genetic code
Note the following
• If X and Y are independent, the occurrence of Y = y_j provides no information about X = x_i

• If X and Y are fully dependent, the occurrence of Y = y_j determines the occurrence of X = x_i


Mutual Information – Transinformation

• Definition: The mutual information I(x_i; y_j) between x_i and y_j is defined as
$$I(x_i; y_j) = \log\left(\frac{P(x_i \mid y_j)}{P(x_i)}\right) \qquad (2)$$

• Mutual information is a measure of how much information can be obtained about one random variable by observing another
• As before, the units of I(x_i; y_j) are determined by the base of the logarithm, which is usually selected as 2 or e
• When the base is 2, the units are bits

• Representation
$$I(x_i; y_j) = \log\left(\frac{P(x_i \mid y_j)}{P(x_i)}\right)$$

• Observe that
$$\frac{P(x_i \mid y_j)}{P(x_i)} = \frac{P(x_i \mid y_j)\,P(y_j)}{P(x_i)\,P(y_j)} = \frac{P(x_i, y_j)}{P(x_i)\,P(y_j)} = \frac{P(y_j \mid x_i)}{P(y_j)}$$

• Therefore there exists a two-way relationship
$$I(x_i; y_j) = \log\left(\frac{P(x_i \mid y_j)}{P(x_i)}\right) = \log\left(\frac{P(y_j \mid x_i)}{P(y_j)}\right) = I(y_j; x_i)$$

• That is, mutual information is symmetric
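
The symmetry can be checked numerically on any joint distribution. The sketch below uses a made-up joint distribution over two binary variables; the numbers and function names are hypothetical and serve only to illustrate the identity.

```python
import math

# Hypothetical joint distribution P(x, y) over two binary variables
joint = {(0, 0): 0.4, (0, 1): 0.1, (1, 0): 0.2, (1, 1): 0.3}

# Marginals P(x) and P(y) obtained by summing the joint distribution
p_x = {x: sum(p for (xi, _), p in joint.items() if xi == x) for x in (0, 1)}
p_y = {y: sum(p for (_, yj), p in joint.items() if yj == y) for y in (0, 1)}

def mi_via_x_given_y(x: int, y: int) -> float:
    """log2( P(x|y) / P(x) ), in bits."""
    return math.log2((joint[(x, y)] / p_y[y]) / p_x[x])

def mi_via_y_given_x(x: int, y: int) -> float:
    """log2( P(y|x) / P(y) ), in bits."""
    return math.log2((joint[(x, y)] / p_x[x]) / p_y[y])

for (x, y) in joint:
    print(x, y, round(mi_via_x_given_y(x, y), 6), round(mi_via_y_given_x(x, y), 6))
```

For every pair (x, y) the two expressions agree up to floating-point rounding.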

Physical Interpretation of Mutual Information

The case of two extremes

• When the random variables X and Y are statistically independent, P(x_i | y_j) = P(x_i), which leads to I(x_i; y_j) = 0
• When the occurrence of Y = y_j uniquely determines the occurrence of the event X = x_i, P(x_i | y_j) = 1, and the mutual information becomes
$$I(x_i; y_j) = \log\left(\frac{1}{P(x_i)}\right) = -\log P(x_i)$$

• This is the self-information of the event X = x_i


Mutual Information - A binary symmetric channel



Mutual Information - BSC

$$P(Y=0) = P(X=0)P(Y=0 \mid X=0) + P(X=1)P(Y=0 \mid X=1) = 0.5(1-p) + 0.5p = 0.5$$

$$P(Y=1) = P(X=0)P(Y=1 \mid X=0) + P(X=1)P(Y=1 \mid X=1) = 0.5p + 0.5(1-p) = 0.5$$

$$I(x_0; y_0) = I(0; 0) = \log_2\left(\frac{P(Y=0 \mid X=0)}{P(Y=0)}\right) = \log_2\left(\frac{1-p}{0.5}\right) = \log_2 2(1-p)$$

$$I(x_1; y_0) = I(1; 0) = \log_2\left(\frac{P(Y=0 \mid X=1)}{P(Y=0)}\right) = \log_2\left(\frac{p}{0.5}\right) = \log_2 2p$$
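
The two expressions above can be evaluated for any crossover probability p. The sketch below follows the same steps, assuming equiprobable inputs P(X = 0) = P(X = 1) = 0.5 as in the derivation; the function name bsc_pointwise_mi is illustrative, and minus infinity is returned where the argument of the logarithm is zero.

```python
import math

def bsc_pointwise_mi(p: float):
    """Return (I(0;0), I(1;0)) in bits for a binary symmetric channel
    with crossover probability p and equiprobable inputs."""
    p_y0 = 0.5 * (1.0 - p) + 0.5 * p  # total probability of Y = 0
    i_00 = math.log2((1.0 - p) / p_y0) if p < 1.0 else float("-inf")
    i_10 = math.log2(p / p_y0) if p > 0.0 else float("-inf")
    return i_00, i_10

for p in (0.0, 0.1, 0.5):
    i_00, i_10 = bsc_pointwise_mi(p)
    print(f"p = {p:.1f}  I(0;0) = {i_00:.4f}  I(1;0) = {i_10:.4f}")
```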


• Suppose p = 0, i.e. the channel is ideal (noiseless)

• In that case
$$I(x_0; y_0) = I(0; 0) = \log_2 2(1-p) = \log_2 2 = 1 \text{ bit}$$
• Hence, having observed the output, we can determine with certainty what was transmitted
• Recall that the self-information of the event X = x_0 was 1 bit

• If p = 0.5, we obtain
$$I(x_0; y_0) = I(0; 0) = \log_2 2(1-p) = \log_2 2(0.5) = 0 \text{ bits}$$
• This implies that, having observed the output, we have no information about what was transmitted
• Thus, it is a useless channel
⊣ For such a channel, there is no point in observing the received symbol and trying to guess what was sent
⊣ Instead, we might as well toss a fair coin at the receiver to estimate what was sent


Variation of Mutual Information with Probability in BSC


• The lower the error probability p of the channel, the higher the mutual information I(0; 0)
• At probability p = 0.5, the mutual information is 0
• At probability p > 0.5, the mutual information becomes negative
• The negative mutual information implies that the channel might have been in error
⊣ The channel might have sent a 0 instead of a 1 and vice versa
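
The trend described above can be tabulated directly from I(0; 0) = log2 2(1 − p). The sketch below is only illustrative: the sample values of p and the function name i_00 are arbitrary choices.

```python
import math

def i_00(p: float) -> float:
    """I(0;0) = log2(2(1 - p)) in bits for a BSC with equiprobable inputs, p < 1."""
    return math.log2(2.0 * (1.0 - p))

# Sample error probabilities on both sides of p = 0.5
for p in (0.0, 0.1, 0.25, 0.5, 0.75, 0.9):
    print(f"p = {p:4.2f}  I(0;0) = {i_00(p):+.3f} bits")
```

The sign of I(0; 0) changes at p = 0.5: for larger p, observing Y = 0 makes X = 0 less likely than it was before the observation.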


Mutual Information - Binary Channel

From the channel transition probabilities, with p_0 = P(Y = 1 | X = 0) and p_1 = P(Y = 0 | X = 1), we have

$$P(Y=0) = P(X=0)P(Y=0 \mid X=0) + P(X=1)P(Y=0 \mid X=1) = 0.5(1-p_0) + 0.5p_1 = 0.5(1 - p_0 + p_1)$$
$$P(Y=1) = P(X=0)P(Y=1 \mid X=0) + P(X=1)P(Y=1 \mid X=1) = 0.5p_0 + 0.5(1-p_1) = 0.5(1 + p_0 - p_1)$$


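The mutual information about the occurrence of the event X = 0 given that Y = 0 is
$$I(x_0; y_0) = I(0; 0) = \log_2\left(\frac{P(Y=0 \mid X=0)}{P(Y=0)}\right) = \log_2\left(\frac{1-p_0}{0.5(1-p_0+p_1)}\right) = \log_2\left(\frac{2(1-p_0)}{1-p_0+p_1}\right)$$

Similarly, for the event X = 1 given that Y = 0,
$$I(x_1; y_0) = I(1; 0) = \log_2\left(\frac{P(Y=0 \mid X=1)}{P(Y=0)}\right) = \log_2\left(\frac{p_1}{0.5(1-p_0+p_1)}\right) = \log_2\left(\frac{2p_1}{1-p_0+p_1}\right)$$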
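
A short numerical sketch of these two expressions follows. It assumes equiprobable inputs and takes p0 = P(Y = 1 | X = 0) and p1 = P(Y = 0 | X = 1), both strictly between 0 and 1; the function name and the sample values are illustrative only.

```python
import math

def binary_channel_mi(p0: float, p1: float):
    """Return (I(0;0), I(1;0)) in bits for a binary channel with
    equiprobable inputs, p0 = P(Y=1|X=0) and p1 = P(Y=0|X=1)."""
    p_y0 = 0.5 * (1.0 - p0) + 0.5 * p1  # total probability of Y = 0
    i_00 = math.log2((1.0 - p0) / p_y0)
    i_10 = math.log2(p1 / p_y0)
    return i_00, i_10

# Asymmetric example with arbitrarily chosen error probabilities
print(binary_channel_mi(0.1, 0.3))

# With p0 = p1 = p the channel reduces to the binary symmetric channel
print(binary_channel_mi(0.1, 0.1))
```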
