
Probability & Information

Prof. J Bapat
Information
• What do we mean by information?
– “A numerical measure of the uncertainty of an
experimental outcome” – Webster Dictionary
• How to quantitatively measure and represent
information?
– Shannon proposed a statistical-mechanics inspired
approach
• Let us first look at how we assess the amount
of information in our daily lives using common
sense
Information=Uncertainty
• Zero information
– Sachin Tendulkar retired from Professional Cricket. (celebrity, known
fact)
– Narendra Modi is Prime Minister of India. (Known fact)
• Little information
– It will rain in Bangalore tomorrow (not much uncertainty since this is
monsoon time)
• Large information
– An earthquake is going to hit California in December 2016 (are you
sure? an unlikely event)
– Someone solved the world hunger problem. (Seriously?)

Uncertainty, Information and Entropy

• Let the source alphabet generated by a discrete memory-less source (DMS) be

  S = {s_0, s_1, ..., s_{K-1}}

  with probabilities of occurrence

  P(S = s_k) = p_k,  k = 0, 1, ..., K-1,  and  \sum_{k=0}^{K-1} p_k = 1
Uncertainty, Information, and Entropy

Interrelations between information, uncertainty or surprise


No surprise ⇒ no information
  (Info. \propto 1/Prob.)

The amount of information may be related to the inverse of the probability of
occurrence.

Information
• Using such intuition, Hartley proposed the following definition of the
information associated with an event whose probability of occurrence is p:

• I = \log(1/p) = -\log(p)

• This definition satisfies the basic requirement that it is a decreasing
function of p. But so do an infinite number of other functions, so
– what is the intuition behind using the logarithm to define information?
Information
• One flip of a fair coin:
– Before the flip, there are two equally probable
choices: heads or tails. P(H)=P(T)=1/2
– After the flip, we’ve narrowed it down to one
choice. Amount of information = log2(2/1) = 1 bit.
• Simple roll of two dice:
– Each die has six faces, so in the roll of two dice
there are 36 possible combinations for the
outcome. Amount of information = log2(36/1) =
5.2 bits.
• Learning the value of a randomly chosen decimal digit:
– There are ten equally probable digits. Amount of information =
log2(10/1) ≈ 3.32 bits.
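• As an illustrative aside (not from the original slides), the values above can
be checked with a few lines of Python; the helper name information_bits is just
a placeholder choice:

```python
import math

def information_bits(p: float) -> float:
    """Hartley/Shannon information, in bits, of an event with probability p."""
    return math.log2(1 / p)

print(information_bits(1 / 2))    # fair coin flip        -> 1.0 bit
print(information_bits(1 / 36))   # one roll of two dice  -> ~5.17 bits
print(information_bits(1 / 10))   # one decimal digit     -> ~3.32 bits
```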
Entropy
• The expected information may be quantified over a set of possible outcomes,
or mutually exclusive events.
• Specifically, if an event i occurs with probability p_i, 1 ≤ i ≤ N, out of a
set of N mutually exclusive events, then the average or expected information
is given by

  H(p_1, p_2, ..., p_N) = \sum_{i=1}^{N} p_i \log(1/p_i)
Entropy of Binary Memory-less Source
• For a binary source with symbol 0 occurring with probability p_0 and symbol 1
with probability p_1 (= 1 - p_0):

  H(S) = -p_0 \log_2 p_0 - p_1 \log_2 p_1
       = -p_0 \log_2 p_0 - (1 - p_0) \log_2 (1 - p_0)   (bits)

[Figure: the binary entropy function H(S) versus p_0, rising from 0 at p_0 = 0
to a maximum of 1.0 bit at p_0 = 1/2 and falling back to 0 at p_0 = 1.]
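• The curve sketched above can be reproduced numerically; the following short
Python sketch (illustrative only, binary_entropy is a name chosen here)
tabulates H(S) and shows the maximum of 1 bit at p_0 = 1/2:

```python
import math

def binary_entropy(p0: float) -> float:
    """H(S) = -p0*log2(p0) - (1-p0)*log2(1-p0); taken as 0 at p0 = 0 or 1."""
    if p0 in (0.0, 1.0):
        return 0.0
    p1 = 1 - p0
    return -p0 * math.log2(p0) - p1 * math.log2(p1)

for p0 in (0.0, 0.1, 0.25, 0.5, 0.75, 0.9, 1.0):
    print(f"p0 = {p0:4.2f}   H(S) = {binary_entropy(p0):.3f} bits")
# H(S) rises from 0, peaks at 1.000 bit when p0 = 0.5, and returns to 0 at p0 = 1
```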
Entropy
• Definition: a measure of the average information content per source symbol;
the mean value of I(s_k) over S:

  H(S) = E[I(s_k)] = \sum_{k=0}^{K-1} p_k I(s_k) = \sum_{k=0}^{K-1} p_k \log_2(1/p_k)

• Properties of H:

  0 ≤ H(S) ≤ \log_2 K, where K is the radix (number of symbols)

  1) H(S) = 0 iff p_k = 1 for some k, and all other p_i's = 0
  2) H(S) = \log_2 K iff p_k = 1/K for all k
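• Both properties are easy to verify numerically. A minimal Python sketch
(illustrative; entropy_bits is a name chosen here) for K = 4 symbols:

```python
import math

def entropy_bits(probs):
    """H(S) = sum_k p_k * log2(1/p_k), skipping zero-probability symbols."""
    return sum(p * math.log2(1 / p) for p in probs if p > 0)

K = 4
print(entropy_bits([1.0, 0.0, 0.0, 0.0]))       # deterministic source -> 0.0
print(entropy_bits([0.25] * K), math.log2(K))   # uniform source -> 2.0 = log2(K)
print(entropy_bits([0.5, 0.25, 0.125, 0.125]))  # in between -> 1.75 bits
```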
Examples
• 1. Several people at a party are trying to guess a 5-bit binary number. Alice
is told that the number is odd; Bob is told that it is not a multiple of 3;
Charlie is told that the number contains exactly two 1's; and Deb is given all
three of these clues. How much information (in bits) did each player get about
the number?

• 2. After careful data collection, it was observed that the probability of
"HIGH" or "LOW" traffic on Main Street is given by the following table:

                       P(HT)   P(LT)
  If there is a game   0.999   0.001
  No game              0.25    0.75


Examples
• 1. Several people at a party are trying to guess a 5-bit binary number. Alice
is told that the number is odd; Bob is told that it is not a multiple of 3;
Charlie is told that the number contains exactly two 1’s; and Deb is given
all three of these clues. How much information (in bits) did each player get
about the number?

• Original uncertainty (without any information) = 5 bits.


• Information given to Alice: the number is odd.
• Information gained by Alice = original uncertainty - uncertainty after the
clue = log2(32) - log2(16) = 1 bit
• Bob: the number cannot be any of (3, 6, 9, 12, 15, 18, 21, 24, 27, 30)
• Information gained by Bob = log2(32) - log2(22) = 5 - 4.459 = 0.541 bits

Examples
• What does Charlie know?
• The number must be one of the following: (3, 5, 6, 9, 10, 12, 17, 18, 20, 24)
• Information gained = log2(32) - log2(10) ≈ 1.68 bits

• Deb knows all three clues, so her choices are narrowed down to: (5, 17)
• Information = log2(32) - log2(2) = 4 bits (not the sum of all the
individually received information)

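• These counts can be verified by brute force. A small Python sketch (not part
of the slides; it simply enumerates the 32 possible numbers and applies each
clue, using the slide's list of multiples of 3):

```python
import math

numbers = range(32)                                        # all 5-bit numbers
mult3   = (3, 6, 9, 12, 15, 18, 21, 24, 27, 30)            # list used on the slide

alice   = [n for n in numbers if n % 2 == 1]               # odd
bob     = [n for n in numbers if n not in mult3]           # not a multiple of 3
charlie = [n for n in numbers if bin(n).count("1") == 2]   # exactly two 1's
deb     = [n for n in charlie if n in alice and n in bob]  # all three clues

for name, left in (("Alice", alice), ("Bob", bob),
                   ("Charlie", charlie), ("Deb", deb)):
    info = math.log2(32) - math.log2(len(left))
    print(f"{name:8s} {len(left):2d} candidates left, {info:.3f} bits gained")
```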
Examples
• 2. After careful data collection, it was observed that the probability of
"HIGH" or "LOW" traffic on Main Street is given by the following table:

                       P(HT)   P(LT)
  If there is a game   0.999   0.001
  No game              0.25    0.75

Example
• If it is known that a game is being played, then observing low traffic
conveys a large amount of information:

• I(S = Low Traffic) = log2(1/0.001) ≈ 9.97 bits

• If no game is being played, there is some uncertainty about the traffic
conditions:

• H = p1 log2(1/p1) + p2 log2(1/p2)
    = 0.25 log2(1/0.25) + 0.75 log2(1/0.75) ≈ 0.81 bits
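• Both numbers on this slide follow directly from the table. A quick Python
check (illustrative only):

```python
import math

def info_bits(p: float) -> float:
    return math.log2(1 / p)

# Game day: low traffic is very unlikely, so observing it is highly informative.
print(info_bits(0.001))                    # I(S = Low Traffic) ~ 9.97 bits

# No game: average uncertainty (entropy) about the traffic condition.
p_high, p_low = 0.25, 0.75
H = p_high * info_bits(p_high) + p_low * info_bits(p_low)
print(H)                                   # ~0.81 bits
```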
Example
• X is an unknown 8-bit binary number. You are
given another 8-bit binary number, Y, and told
that Y differs from X in exactly one bit
position. How many bits of information about
X have you been given?

Examples
• With 8 bits, the total number of combinations possible = 256. We know that
only one bit is different, so the number of combinations available = 8.
• Information given = log2(256) - log2(8) = 8 - 3 = 5 bits

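• The same "uncertainty before minus uncertainty after" bookkeeping can be
checked in a few lines of Python (illustrative; the value of Y is arbitrary):

```python
import math

Y = 0b10110010                                   # any 8-bit hint value works
candidates = [Y ^ (1 << i) for i in range(8)]    # flip exactly one bit of Y

info = math.log2(256) - math.log2(len(candidates))
print(len(candidates), info)                     # 8 candidates, 5.0 bits
```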
Example: Fool’s Gold
• Consider the following weighing problem. You have 27 apparently identical
gold coins. One of them is false and lighter, but is otherwise
indistinguishable from the others.
• You also have a balance with two pans, but without standard, known weights
for comparison weighings. Thus you have to compare the gold coins to each
other, and any measurement will tell you whether the loaded pans weigh the
same or, if not, which weighs more.
Example
• First, the amount of information we need to select one of 27 items is
log 27 bits (about 4.75 bits), for we want to select one coin from among 27,
and "indistinguishable" here translates to "equi-probable."
• A weighing (putting some coins on one pan of the balance and other coins on
the other pan) produces one of three outcomes:
• (1) The pans balance
• (2) The left pan is heavier
• (3) The right pan is heavier
• Each weighing therefore yields at most log 3 of information, so the number
of weighings n must satisfy

  n \log 3 \geq \log 27 = 3 \log 3,  hence  n \geq 3
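• The same counting argument, done numerically (an illustrative sketch; the
loop avoids floating-point rounding when dividing the two logarithms):

```python
import math

coins = 27
needed_bits       = math.log2(coins)   # information required   ~ 4.75 bits
bits_per_weighing = math.log2(3)       # three outcomes/weighing ~ 1.58 bits

# Smallest n with 3**n >= 27, i.e. n*log(3) >= log(27).
n = 1
while 3 ** n < coins:
    n += 1
print(f"{needed_bits:.2f} bits needed, "
      f"{bits_per_weighing:.2f} bits per weighing, n = {n}")
```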
Example
• Suppose instead that you have 12 coins,
indistinguishable in appearance but one is
either heavier or lighter than the others. What
weighing scheme will allow you to find the
false coin?

