Lect3 - 2021 IT

Measure of Information

UNIVERSITY OF MINES AND TECHNOLOGY

INFORMATION THEORY

Course Instructor: Dr. Abdel-Fatao Hamidu

Computer Science and Engineering Department

March 15, 2022

Recap of MoI

Some Points to Remember

• The number of bits used to represent a message is completely different from the amount of information it conveys
• Intuitive feel:
• The occurrence of a less probable event conveys more information
• Since a lower probability implies a higher degree of uncertainty (and vice versa), a random variable with a higher degree of uncertainty contains more information
• This correlation between uncertainty and information forms the basis of all the physical interpretations that follow

Measure of Information (MoI)

Uncertainty and Information


• Consider a discrete random variable X with possible outcomes xi, where i = 1, 2, ..., n
• The self-information of the event X = xi is defined as

      I(xi) = log( 1 / P(xi) ) = −log P(xi)        (1)

• When the base of the logarithm is 2, the units of I(xi) are bits
• When the base is e, the units of I(xi) are nats (natural units)
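A quick numeric illustration of Eq. (1) may help; the sketch below is a minimal example, assuming Python with only the standard library, and the function name self_information and the sample probabilities are illustrative choices rather than anything from the slides.

```python
import math

def self_information(p, base=2):
    """Self-information I(x) = -log_base P(x) of an outcome with probability p."""
    if not 0 < p <= 1:
        raise ValueError("p must be in (0, 1]")
    return -math.log(p, base)

# A less probable outcome carries more information
print(self_information(0.5))               # 1.0 bit (fair coin outcome)
print(self_information(0.125))             # 3.0 bits (a 1-in-8 outcome)
print(self_information(0.5, base=math.e))  # ~0.693 nats
```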

Example 1
• Consider a binary source A which tosses a fair coin
• It produces an output equal to 1 if a head appears and a 0 if a
tail appears
• What is the information content of each output?

Solution
• For the source, P(1) = P(0) = 0.5
• The information content of each output from the source is

      I(xi) = −log2 P(xi) = −log2(0.5) = 1 bit

• This is consistent with intuition, since the output of the fair coin can be represented with one bit (1 for a head, 0 for a tail)

Example 1 contd
• Suppose the successive outputs from this binary source are statistically independent, i.e. the source is memoryless
• Consider a block of m binary digits
• There are 2^m possible m-bit blocks, each of which is equally probable with probability 2^(−m)
• The self-information of an m-bit block is

      I(xi) = −log2 P(xi) = −log2 2^(−m) = m bits

• Again, we observe that we indeed need m bits to represent all the possible m-bit blocks
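As a quick check of this result, a minimal sketch (assuming Python; the block length m = 8 is an arbitrary illustrative choice):

```python
import math

m = 8                       # block length in bits
p_block = 0.5 ** m          # each of the 2^m blocks is equally probable
info = -math.log2(p_block)  # self-information of one block
print(info)                 # 8.0, i.e. exactly m bits
```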

Example 2
• Consider a discrete memoryless source C that generates two bits at a time
• This source comprises two binary sources (A and B), each contributing one bit
• The two binary sources within source C are independent
• What is the information content of the aggregate source C?

Solution
• Intuitively, the information content of the aggregate source C should be the sum of the information contained in the outputs of the two independent sources that constitute source C
• Since A and B are independent,
      P(C) = P(A)P(B) = 0.5 × 0.5 = 0.25 (the probability of any particular two-bit output)
      I(C) = −log2 P(C) = −log2(0.25) = 2 bits
• The answer is again consistent with intuition
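A minimal numeric check of this additivity, assuming Python and the slide's setup of two fair, independent binary sources; the variable names are illustrative.

```python
import math

p_a = p_b = 0.5          # probability of any given output of A and of B
i_a = -math.log2(p_a)    # 1 bit from source A
i_b = -math.log2(p_b)    # 1 bit from source B

p_c = p_a * p_b          # independence: probabilities multiply
i_c = -math.log2(p_c)    # 2 bits from the aggregate source C

assert i_c == i_a + i_b  # information adds up
print(i_a, i_b, i_c)
```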

Example
• Given a bag containing 3 green, 4 red and 2 yellow balls, what is the average surprise associated with choosing a ball at random from the bag?
⊣ What is the information gained by choosing a green ball?
⊣ Differentiate between the two types of information obtained.
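One way to work this exercise is sketched below, treating the average surprise as the expected self-information over the ball colours; this assumes Python, and the helper names are illustrative.

```python
import math

counts = {"green": 3, "red": 4, "yellow": 2}
total = sum(counts.values())
probs = {colour: n / total for colour, n in counts.items()}

# Average surprise: the expected value of -log2 P(colour) over all colours
avg_surprise = -sum(p * math.log2(p) for p in probs.values())

# Information gained by drawing a green ball: the self-information of that outcome
info_green = -math.log2(probs["green"])

print(round(avg_surprise, 3))  # ~1.530 bits, averaged over the whole bag
print(round(info_green, 3))    # ~1.585 bits, for this particular outcome
```

The first number is a property of the whole distribution, while the second is the surprise of one specific outcome, which is the distinction the last part of the question is after.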

Example
• Calculate the entropy of a fair coin.
⊣ Calculate the entropy of a biased coin that comes up heads 75% of the time.
⊣ What is the entropy of the coin if somehow it is incapable of landing tails?
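A sketch for this exercise using the binary entropy function is given below; it assumes Python, takes 0·log 0 as 0 by convention, and the function name is illustrative.

```python
import math

def binary_entropy(p):
    """Entropy in bits of a coin that lands heads with probability p."""
    return -sum(q * math.log2(q) for q in (p, 1 - p) if q > 0)  # 0*log(0) -> 0

for p in (0.5, 0.75, 1.0):
    print(p, round(binary_entropy(p), 4))
# fair coin -> 1.0 bit, 75% heads -> ~0.8113 bits, always heads -> 0.0 bits
```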

Why the Logarithm?


Consider two independent sources:
• Independent events ⇒ probabilities multiply
• Independent sources ⇒ information must add up
• The logarithm turns products into sums, so it does exactly this job

[Figure: visual representation of the independent sources]


• Note that coin A may be a fair coin while coin B is a biased coin
• In that case, the rate of information generation by coin A is not the same as that of coin B
• The rate of information generation is the amount of information generated (number of bits) per second

Mutual Information

• Consider two discrete random variables X and Y with possible outcomes xi, where i = 1, 2, ..., n, and yj, where j = 1, 2, ..., m, respectively
• Suppose we observe some outcome Y = yj and we want to determine the amount of information this event provides about the event X = xi, ∀i = 1, 2, ..., n
• That is, we want to mathematically represent this mutual information
• Mutual information is found in applications such as DNA sequencing, where Y could be the profile of a disease and X the genetic code

Note the following
• If X and Y are independent events, the occurrence of Y = yj provides no information about X = xi
• If X and Y are completely dependent events, the occurrence of Y = yj determines the occurrence of X = xi

Mutual Information – Transinformation


• Definition: The mutual information I(xi; yj) between xi and yj is defined as

      I(xi; yj) = log( P(xi|yj) / P(xi) )        (2)

• Mutual information is a measure of how much information can be obtained about one random variable by observing another
• As before, the units of I(xi; yj) are determined by the base of the logarithm, which is usually selected as 2 or e
• When the base is 2, the units are in bits
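A minimal sketch of Eq. (2) is shown below for a pair of binary random variables; it assumes Python, and the joint distribution used is an arbitrary illustrative example rather than anything from the slides.

```python
import math

# Illustrative joint distribution P(X = x, Y = y) for binary X and Y
joint = {(0, 0): 0.4, (0, 1): 0.1,
         (1, 0): 0.1, (1, 1): 0.4}

def p_x(x): return sum(p for (xi, _), p in joint.items() if xi == x)
def p_y(y): return sum(p for (_, yj), p in joint.items() if yj == y)

def mutual_information(x, y):
    """I(x; y) = log2( P(x|y) / P(x) ) for a single pair of outcomes."""
    p_x_given_y = joint[(x, y)] / p_y(y)
    return math.log2(p_x_given_y / p_x(x))

print(round(mutual_information(0, 0), 4))  # positive: observing Y = 0 favours X = 0
print(round(mutual_information(1, 0), 4))  # negative: observing Y = 0 disfavours X = 1
```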

• Representation:

      I(xi; yj) = log( P(xi|yj) / P(xi) )

• Observe that

      P(xi|yj) / P(xi) = P(xi|yj)P(yj) / [P(xi)P(yj)] = P(xi, yj) / [P(xi)P(yj)] = P(yj|xi) / P(yj)

• Therefore there exists a two-way relationship

      I(xi; yj) = log( P(xi|yj) / P(xi) ) = log( P(yj|xi) / P(yj) ) = I(yj; xi)

• That is, mutual information is symmetric
Physical Interpretation of Mutual Information

The case of two extremes


• When the random variables X and Y are statistically independent, P(xi|yj) = P(xi), which leads to I(xi; yj) = 0
• When the occurrence of Y = yj uniquely determines the occurrence of the event X = xi, P(xi|yj) = 1, and the mutual information becomes

      I(xi; yj) = log( 1 / P(xi) ) = −log P(xi)

• This is the self-information of the event X = xi
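These two extremes can be checked numerically with a couple of lines, assuming Python; the marginal probability P(xi) = 0.25 is an arbitrary illustrative value.

```python
import math

p_x = 0.25                   # illustrative marginal probability P(xi)

# Extreme 1: X and Y independent, so P(xi|yj) = P(xi)
print(math.log2(p_x / p_x))  # 0.0 bits of mutual information

# Extreme 2: Y = yj uniquely determines X = xi, so P(xi|yj) = 1
print(math.log2(1.0 / p_x))  # 2.0 bits = -log2 P(xi), the self-information
```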

Mutual Information - A Binary Symmetric Channel (BSC)

[Figure: a BSC with equally likely inputs, P(X=0) = P(X=1) = 0.5, and crossover probability p, i.e. P(Y=1|X=0) = P(Y=0|X=1) = p]

P(Y = 0) = P(X = 0)P(Y = 0|X = 0) + P(X = 1)P(Y = 0|X = 1) = 0.5(1 − p) + 0.5(p) = 0.5

P(Y = 1) = P(X = 0)P(Y = 1|X = 0) + P(X = 1)P(Y = 1|X = 1) = 0.5(p) + 0.5(1 − p) = 0.5

I(x0; y0) = I(0; 0) = log2( P(Y=0|X=0) / P(Y=0) ) = log2( (1 − p) / 0.5 ) = log2 2(1 − p)

I(x1; y0) = I(1; 0) = log2( P(Y=0|X=1) / P(Y=0) ) = log2( p / 0.5 ) = log2 2p
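A short sketch of this calculation from first principles, assuming Python; the crossover probability p = 0.1 is an arbitrary illustrative value.

```python
import math

p = 0.1                          # illustrative crossover probability
px0 = px1 = 0.5                  # equally likely inputs

py0 = px0 * (1 - p) + px1 * p    # P(Y = 0), comes out to 0.5
i_00 = math.log2((1 - p) / py0)  # I(x0; y0) = log2 2(1 - p)
i_10 = math.log2(p / py0)        # I(x1; y0) = log2 2p

print(round(py0, 3), round(i_00, 4), round(i_10, 4))
# 0.5  0.848  -2.3219  for p = 0.1
```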

CASE 1
• Suppose p = 0; the channel is ideal (noiseless)
• In that case,
      I(x0; y0) = I(0; 0) = log2 2(1 − p) = 1 bit
• Hence, having observed the output, we can determine with certainty what was transmitted
• Recall that the self-information of the event X = x0 was 1 bit

CASE 2
• If p = 0.5, we obtain
      I(x0; y0) = I(0; 0) = log2 2(1 − p) = log2 2(0.5) = 0 bits
• This implies that having observed the output, we have no information about what was transmitted
• Thus, it is a useless channel
⊣ For such a channel, there is no point in observing the received symbol and trying to guess what was sent
⊣ Instead, we might as well toss a fair coin at the receiver to estimate what was sent

Variation of Mutual Information with Probability in BSC

[Figure: plot of the mutual information I(0; 0) = log2 2(1 − p) against the crossover probability p]

• The lower the crossover probability p of the channel, the higher the mutual information
• At p = 0.5, the mutual information is 0
• At p > 0.5, the mutual information becomes negative
• Negative mutual information implies that the channel is likely to have been in error
⊣ The channel may have delivered a 0 when a 1 was sent, and vice versa
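The sign change described above can be tabulated directly from I(0; 0) = log2 2(1 − p), as in the sketch below (Python assumed; the chosen p values are illustrative).

```python
import math

for p in (0.0, 0.1, 0.25, 0.5, 0.75, 0.9):
    i_00 = math.log2(2 * (1 - p))  # I(x0; y0) for a BSC with crossover probability p
    print(f"p = {p:4.2f}   I(0;0) = {i_00:+.3f} bits")
# Positive below p = 0.5, exactly 0 at p = 0.5, negative above p = 0.5
```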

Mutual Information - Binary Channel

[Figure: a binary (asymmetric) channel with equally likely inputs and crossover probabilities P(Y=1|X=0) = p0 and P(Y=0|X=1) = p1]

From the channel transition probabilities we have


P(Y = 0) = P(X = 0)P(Y = 0|X = 0) + P(X = 1)P(Y = 0|X = 1) = 0.5(1 − p0) + 0.5(p1) = 0.5(1 − p0 + p1)

P(Y = 1) = P(X = 0)P(Y = 1|X = 0) + P(X = 1)P(Y = 1|X = 1) = 0.5(p0) + 0.5(1 − p1) = 0.5(1 + p0 − p1)

The mutual information about the occurrence of the event X = 0, given that Y = 0, is

I(x0; y0) = I(0; 0) = log2( P(Y=0|X=0) / P(Y=0) ) = log2( (1 − p0) / [0.5(1 − p0 + p1)] ) = log2( 2(1 − p0) / (1 − p0 + p1) )

I(x1; y0) = I(1; 0) = log2( P(Y=0|X=1) / P(Y=0) ) = log2( p1 / [0.5(1 − p0 + p1)] ) = log2( 2p1 / (1 − p0 + p1) )
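A sketch of these two expressions, assuming Python; the crossover probabilities p0 = 0.1 and p1 = 0.3 are arbitrary illustrative values, and the closed forms from the slide are cross-checked against a first-principles calculation.

```python
import math

p0, p1 = 0.1, 0.3                 # illustrative: P(Y=1|X=0) and P(Y=0|X=1)
py0 = 0.5 * (1 - p0) + 0.5 * p1   # P(Y = 0) = 0.5(1 - p0 + p1)

i_00 = math.log2((1 - p0) / py0)  # mutual information I(x0; y0)
i_10 = math.log2(p1 / py0)        # mutual information I(x1; y0)

# Cross-check against the closed-form expressions
assert math.isclose(i_00, math.log2(2 * (1 - p0) / (1 - p0 + p1)))
assert math.isclose(i_10, math.log2(2 * p1 / (1 - p0 + p1)))
print(round(i_00, 4), round(i_10, 4))  # ~0.585 and -1.0 bits for these values
```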

ADIOS

THANK YOU, QUESTIONS!!!
