Unit 07: CS Information Theory
[Diagram: Source of Message -> Encoder -> Channel -> Decoder -> Destination of Message, with Noise entering the Channel]
Example: transmitting the identity of the winning horse in a race with eight horses.

Horse:             #1    #2    #3    #4    #5    #6    #7    #8
Prob. of winning:  1/2   1/4   1/8   1/16  1/64  1/64  1/64  1/64
Bits assigned:     000   001   010   011   100   101   110   111
So the average number of bits required is:
1/2 x 3 + 1/4 x 3 + 1/8 x 3 + 1/16 x 3 + 1/64 x 3 + 1/64 x 3 + 1/64 x 3 + 1/64 x 3 = 3 bits
But can we convey the same amount of information using fewer than three bits on average? The answer is yes!
Horse:             #1   #2   #3    #4     #5      #6      #7      #8
Prob. of winning:  1/2  1/4  1/8   1/16   1/64    1/64    1/64    1/64
Bits assigned:     0    10   110   1110   111100  111101  111110  111111
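Now the average number of bits required is:
1/2 x 1 + 1/4 x 2 + 1/8 x 3 + 1/16 x 4 + 1/64 x 6 + 1/64 x 6 + 1/64 x 6 + 1/64 x 6 = 2 bits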
Conclusions
We should use fewer bits for frequent events!
Frequent events carry less information.
Rare events carry more information.
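To make the comparison concrete, here is a minimal Python sketch (my own illustration, not from the slides) that computes the average code length of both schemes above:

```python
# Average code length = sum of P(outcome) * len(codeword) over all outcomes
probs    = [1/2, 1/4, 1/8, 1/16, 1/64, 1/64, 1/64, 1/64]
fixed    = ["000", "001", "010", "011", "100", "101", "110", "111"]
variable = ["0", "10", "110", "1110", "111100", "111101", "111110", "111111"]

def avg_length(probs, code):
    return sum(p * len(cw) for p, cw in zip(probs, code))

print(avg_length(probs, fixed))     # 3.0 bits
print(avg_length(probs, variable))  # 2.0 bits
```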
Information theory
In our case, the messages will be sequences of binary digits.
One detail that makes communication difficult is noise: noise introduces uncertainty.
Suppose I wish to transmit one bit of information. What are all of the possibilities?
tx 0, rx 0 - good
tx 0, rx 1 - error
tx 1, rx 0 - error
tx 1, rx 1 - good
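The slides include no code, but as a minimal sketch this situation can be modeled as a binary symmetric channel; the flip probability p_error = 0.1 and the function name bsc are my own illustration choices:

```python
import random

def bsc(bit, p_error=0.1):
    """Binary symmetric channel: flip the transmitted bit with probability p_error."""
    return bit ^ 1 if random.random() < p_error else bit

tx = [random.randint(0, 1) for _ in range(10_000)]  # transmitted bits
rx = [bsc(b) for b in tx]                           # received bits
errors = sum(t != r for t, r in zip(tx, rx))
print(f"observed error rate ~ {errors / len(tx):.3f}")  # close to p_error
```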
Measure of “Information”
Suppose we have an event X, where xi represents a particular outcome of the event.
Consider flipping a fair coin; there are two equiprobable outcomes, say:
x0 = heads, P0 = 1/2,
x1 = tails, P1 = 1/2
The amount of information for any single result is:
I(xi) = -log2(P(xi))
Definition of “Information”
When outcomes are equally likely, there is a lot of information in the result.
The higher the likelihood of a particular outcome, the less information that outcome conveys.
However, if the coin is biased so that it lands heads up 99% of the time, little information is conveyed when we flip the coin and it lands on heads.
Example
An event X randomly generates 1 or 0 with equal probability: P(X=0) = P(X=1) = 0.5.
Then I(X) = -log2(0.5) = 1,
or 1 bit of info each time X occurs.
If X is always 1, then P(X=0) = 0 and P(X=1) = 1.
Then I(X=0) = -log2(0) = ∞
and I(X=1) = -log2(1) = 0.
Discussion
I(X=1) = -log2(1) = 0
means no information is delivered by X, which is consistent with X being 1 all the time.
I(X=0) = -log2(0) = ∞
means that if X = 0 occurred, a huge (infinite) amount of information would arrive; however, since P(X=0) = 0, this never happens.
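As a small illustrative sketch (the helper name self_information is assumed, not from the slides), these three cases in Python:

```python
import math

def self_information(p):
    """I(x) = log2(1/P(x)) in bits; P(x) = 0 gives infinity."""
    return math.inf if p == 0 else math.log2(1 / p)

print(self_information(0.5))  # 1.0  -> a fair coin flip carries 1 bit
print(self_information(1.0))  # 0.0  -> a certain outcome carries no information
print(self_information(0.0))  # inf  -> but an impossible outcome never occurs
```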
Entropy
The average information arriving with each outcome, weighted by probability, is the entropy:
H(X) = Σ_{i=1..L} P(xi) I(xi) = -Σ_{i=1..L} P(xi) log2(P(xi))
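A brief Python sketch of this formula (the function name entropy is my own), applied to the horse-race distribution from the tables above:

```python
import math

def entropy(probs):
    """H(X) = -sum of P(xi) * log2(P(xi)); terms with P(xi) = 0 contribute 0."""
    return -sum(p * math.log2(p) for p in probs if p > 0)

# Horse-race distribution from the earlier tables
probs = [1/2, 1/4, 1/8, 1/16, 1/64, 1/64, 1/64, 1/64]
print(entropy(probs))  # 2.0 bits -- exactly the variable-length code's average length
```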
Example
Consider an event with two outcomes, 1 and 0, occurring with probabilities p and 1 - p respectively.
Then H(X) = p·log2(1/p) + (1 - p)·log2(1/(1 - p))
For p = 0 or p = 1:
H(X) = 0 (taking p·log2(1/p) = 0 when p = 0)
For p = 0.5:
H(X) = 1
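A minimal sketch of this two-outcome (binary) entropy in Python, evaluated at a few values of p (the function name binary_entropy is my own):

```python
import math

def binary_entropy(p):
    """H(p) = p*log2(1/p) + (1-p)*log2(1/(1-p)), with the convention 0*log2(1/0) = 0."""
    if p in (0.0, 1.0):
        return 0.0
    return p * math.log2(1 / p) + (1 - p) * math.log2(1 / (1 - p))

for p in (0.0, 0.1, 0.5, 0.9, 1.0):
    print(p, round(binary_entropy(p), 3))
# Peaks at H = 1 bit for p = 0.5; falls to 0 as the outcome becomes certain.
```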
Channel capacity
Definition:
Channel capacity is the maximum rate at which information can be reliably transmitted over a communications channel.
Shannon channel capacity
In the early 1940s, it was thought that increasing the
transmission rate of information over a communication
channel increased the probability of error.
Shannon surprised the communication theory community
by proving that this was not true as long as the
communication rate was below channel capacity.
C = BW · log2(1 + SNR) bits per second,
where BW is the channel bandwidth in Hz and SNR is the signal-to-noise ratio as a linear power ratio (not dB).
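As a hedged illustration of the formula, a short Python sketch; the 3 kHz bandwidth and 30 dB SNR are assumed example values, not from the slides:

```python
import math

def shannon_capacity(bw_hz, snr_linear):
    """C = BW * log2(1 + SNR); SNR must be a linear power ratio, not dB."""
    return bw_hz * math.log2(1 + snr_linear)

snr = 10 ** (30 / 10)               # 30 dB -> linear ratio of 1000
print(shannon_capacity(3000, snr))  # ~29,900 bits per second
```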
Fundamental Constraints
Shannon's capacity is an upper bound: the achievable data rate is fundamentally limited by bandwidth and signal-to-noise ratio.
These are the fundamental constraints for high-data-rate communications.