2020 - 2 - Info Theoretic Models - Notes
MIE523H1F
Anthony Soung Yee
anthony.soungyee@utoronto.ca
Topics covered
• Human as an information-processing channel
• Quantification of information
(Photo: Claude Shannon and Warren Weaver)
Communication theory and human information processing
Basic principle
The Human Information Processor can be modelled as a noisy, limited-capacity communication system.
Basic principle
• For any task, there is an absolute limit to the capacity to “transmit” information:
• a maximum absolute quantity of information
• a maximum rate of information transmission
• When working at capacity, one can increase speed only at the expense of accuracy (and vice versa), a.k.a. the “speed-accuracy tradeoff”
in·for·ma·tion
/ˌinfərˈmāSH(ə)n/: the reduction of uncertainty
“The weather will be sunny tomorrow”
Factors that influence amount of information in source
1. Number of possible events
2. Probability of events occurring
3. Sequential constraints
1. Number of possible events
3. Sequential constraints, or context
• The statement “It snowed in Calgary on September 9, 2014” might contain a lot of information.
Definition of “information”
• Shannon defined it as the reduction of uncertainty
• Factors that influence amount of information in source:
1. Number of possible events
2. Probability of events occurring
3. Sequential constraints
More formally…
Factors that influence amount of information in source:
1. Number of possible events (N)
2. Probability of events occurring (p_i) (a.k.a. distributional constraints)
3. Sequential constraints (a.k.a. redundancy)
1. Number of possible events
• e.g. for N = 8 equally likely events: H = log2(N) = log2(8) = 3 bits
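This formula is easy to check numerically; a minimal sketch using Python's standard `math` module (function name is illustrative):

```python
import math

def max_info_bits(n_events: int) -> float:
    """Information (in bits) when all n_events outcomes are equally likely: H = log2(N)."""
    return math.log2(n_events)

print(max_info_bits(8))  # 3.0 bits for N = 8
```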
Pop quiz
Which information source, A or B, contains a greater H_ave?
2. Probabilities of those events
• Define the “surprisal” function as H_i = log2(1 / p_i)
In other words:
• High-probability events → low information content
• Low-probability events → high information content
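The surprisal function can be sketched directly from this definition (the probability values below are illustrative, not from the slides):

```python
import math

def surprisal(p: float) -> float:
    """Surprisal H_i = log2(1 / p_i), in bits: rarer events carry more information."""
    return math.log2(1.0 / p)

# A high-probability event carries little information...
print(surprisal(0.9))   # ~0.15 bits
# ...while a low-probability event carries a lot.
print(surprisal(0.01))  # ~6.64 bits
```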
Pop quiz
• For N = 4, with equally likely outcomes, H_ave = log2(N) = log2(4) = 2 bits
Weighted average
• Average information H_ave conveyed by a group of events:
H_ave = Σ_i p_i · log2(1 / p_i)
• Example: 4 events with equal probabilities (N = 4)
• H_ave = log2(N) = log2(4) = 2 bits
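The weighted average above can be sketched as a short Python function (the name `h_ave` is just a convenient label for the slides' H_ave):

```python
import math

def h_ave(probs):
    """Average information H_ave = sum_i p_i * log2(1 / p_i), in bits.

    Zero-probability events contribute nothing, so they are skipped."""
    return sum(p * math.log2(1.0 / p) for p in probs if p > 0)

# Four equally likely events: H_ave = log2(4) = 2 bits, matching the example.
print(h_ave([0.25, 0.25, 0.25, 0.25]))  # 2.0
```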
For two possible outcomes (N = 2)
• Maximum entropy occurs when both outcomes are equally likely (p = 0.5), giving H_max = 1 bit
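The two-outcome entropy curve can be traced numerically to confirm the peak at p = 0.5; a minimal sketch:

```python
import math

def binary_entropy(p: float) -> float:
    """H for two outcomes with probabilities p and 1 - p, in bits."""
    if p in (0.0, 1.0):
        return 0.0  # a certain outcome carries no information
    return p * math.log2(1.0 / p) + (1 - p) * math.log2(1.0 / (1 - p))

# Entropy peaks at p = 0.5 (1 bit) and falls toward 0 as the outcome becomes certain.
for p in (0.1, 0.3, 0.5, 0.7, 0.9):
    print(f"p = {p}: H = {binary_entropy(p):.3f} bits")
```

Note the symmetry: H(p) = H(1 - p), since relabelling the two outcomes changes nothing.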
Information redundancy
Redundancy: percentage reduction of information relative to the maximum
• H_max is the theoretical maximum, based on equal probabilities
• H_ave is reduced compared to H_max
• % redundancy = (1 – H_ave / H_max) x 100
Information redundancy
• From the earlier example with N = 4:
• H_max = 2 bits
• H_ave = 1.75 bits
• % redundancy = (1 – 1.75/2) x 100 = 12.5%
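The slides do not give the distribution behind H_ave = 1.75 bits, but one distribution consistent with it is (0.5, 0.25, 0.125, 0.125); a sketch that reproduces the 12.5% figure under that assumption:

```python
import math

def h_ave(probs):
    """H_ave = sum_i p_i * log2(1 / p_i), in bits."""
    return sum(p * math.log2(1.0 / p) for p in probs if p > 0)

def redundancy_pct(probs):
    """% redundancy = (1 - H_ave / H_max) * 100, with H_max = log2(N)."""
    h_max = math.log2(len(probs))
    return (1.0 - h_ave(probs) / h_max) * 100.0

# Hypothetical distribution chosen so that H_ave = 1.75 bits (not from the slides):
probs = [0.5, 0.25, 0.125, 0.125]
print(h_ave(probs))           # 1.75 bits
print(redundancy_pct(probs))  # 12.5 %
```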
Duh??
3. Sequential constraints (context)
• Depending on the context, a stimulus may be more or less informative
• e.g. “In June, it snowed in Calgary for the third day in a row”
• Modelled using conditional probabilities p_(i|x)
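The effect of context can be illustrated with surprisal on conditional probabilities; the probabilities below are entirely hypothetical (not from the slides), chosen only to show the direction of the effect:

```python
import math

def surprisal(p: float) -> float:
    """H = log2(1 / p), in bits."""
    return math.log2(1.0 / p)

# Hypothetical probabilities for illustration only:
# unconditional chance of snow in Calgary on a given day, vs.
# chance of snow given it has already snowed two days in a row.
p_snow = 0.05
p_snow_given_streak = 0.60

print(surprisal(p_snow))                # ~4.32 bits: surprising on its own
print(surprisal(p_snow_given_streak))   # ~0.74 bits: much less surprising in context
```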
Information redundancy for the English language
Psaele raed tihs out luod
Aoccdrnig to a rscheearch at Cmabrigde
Uinervtisy, it dseno't mtaetr in waht oerdr
the ltteres in a wrod are, the olny
iproamtnt tihng is taht the frsit and lsat
ltteer be in the rghit pclae.
The rset can be a taotl mses and you can
sitll raed it whotuit a pboerlm.
Tihs is bcuseae the huamn mnid deos not raed ervey lteter by istlef, but the wrod as a wlohe. Azanmig huh?
Example: H_ave for the English language
• Based on actual frequencies of letter occurrences
Using letter N-grams
Redundancy of English-language letters
• The English language is highly redundant
• H_max = log2(26) ≈ 4.7 bits
• Taking into account distributional and sequential constraints, for letters of the alphabet, H_ave ≈ 1.5 bits
• By the redundancy formula, that is (1 – 1.5/4.7) x 100 ≈ 68% redundancy
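Plugging the slides' two figures into the earlier redundancy formula gives the headline number; a minimal sketch:

```python
import math

# Figures from the slides: H_max = log2(26) ~= 4.7 bits per letter,
# and H_ave ~= 1.5 bits once distributional and sequential constraints are included.
h_max = math.log2(26)
h_ave = 1.5

redundancy = (1.0 - h_ave / h_max) * 100.0
print(f"English letters are roughly {redundancy:.0f}% redundant")  # roughly 68%
```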