Information and Entropy (summary of the handwritten notes)

Information : If a message (event) occurs with probability P, the information conveyed by it is
I = log2(1/P) bits
A sure event (P = 1) conveys no information, and the smaller the probability of an event, the larger the information it carries. For example, if P = 1/2, then I = log2 2 = 1 bit.

Entropy : Entropy is a measure of the degree of randomness (disorder) of the source; it is the average information per message. Consider a source emitting M messages x1, x2, ..., xM with probabilities p1, p2, ..., pM, where probability = (number of favourable outcomes) / (total number of outcomes). In a long sequence of n messages, message xk occurs n pk times, and each occurrence carries log2(1/pk) bits. Hence the total information is
I_total = n p1 log2(1/p1) + n p2 log2(1/p2) + ... + n pM log2(1/pM)
and the average information per message (entropy) is
H = I_total / n = Σ_{k=1}^{M} pk log2(1/pk) bits/message
Properties (see Section 4.3.1 below) : H = 0 for a sure or an impossible event, and H = log2 M when all the M symbols are equally likely.

Information rate : R = r H, where r is the rate at which the source emits messages (messages/sec) and H is the entropy. For example, a binary source transmitting 0 and 1 with P(0) = P(1) = 1/2 has H = (1/2) log2 2 + (1/2) log2 2 = 1 bit/symbol; if the source generates 2000 symbols/sec, then R = rH = 2000 x 1 = 2000 bits/sec.

Binary channel : A binary channel is described by the conditional probabilities P(yj | xi) between the transmitted symbols X = {x1, x2} and the received symbols Y = {y1, y2}.
Source entropy : H(X) = Σ_i P(xi) log2(1/P(xi))
Destination entropy : H(Y) = Σ_j P(yj) log2(1/P(yj))

Joint entropy : The joint entropy is the average information (uncertainty) of the complete communication system,
H(X, Y) = Σ_i Σ_j P(xi, yj) log2(1/P(xi, yj))

4.3.1 Properties of Entropy
1. Entropy is zero if the event is sure or it is impossible, i.e., H = 0 if p = 0 or 1.
2. When pk = 1/M for all the M symbols, the symbols are equally likely. For such a source the entropy is given as H = log2 M.
3. The upper bound on entropy is given as Hmax = log2 M.
Proofs of these properties are given in the next examples.

Examples for Understanding
Ex. 4.3.1 : Calculate the entropy when pk = 0 and when pk = 1.
Sol. : Consider equation (4.3.6),
H = Σ_{k=1}^{M} pk log2(1/pk)
With pk = 1 the above equation becomes,
H = log2(1/1) = log10(1)/log10(2) = 0, since log10 1 = 0
Now consider the second case, when pk = 0. Instead of putting pk = 0 directly, let us consider the limiting case, i.e., with pk tending to 0 the above equation becomes,
H = lim_{pk -> 0} pk log2(1/pk)
The RHS of the above equation tends to zero as pk -> 0. Hence the entropy is zero, i.e., H = 0.
Thus the entropy is zero both for the certain message and for the most rare message.
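The entropy formula above is easy to check numerically. The short Python sketch below (hypothetical helper name, not from the book) computes H = Σ pk log2(1/pk) for a given distribution, using the limiting convention from Ex. 4.3.1 that a zero-probability term contributes nothing, so a sure event (p = 1) and an impossible event (p = 0) both give H = 0.

```python
from math import log2

def entropy(probs):
    """Entropy H = sum(p * log2(1/p)) in bits/message.
    Terms with p == 0 contribute 0, matching the limit in Ex. 4.3.1."""
    return sum(p * log2(1.0 / p) for p in probs if p > 0)

# A sure event: one message has probability 1, the rest 0.
print(entropy([1.0, 0.0, 0.0]))            # 0.0 -> H = 0 for a certain source

# A fair binary source.
print(entropy([0.5, 0.5]))                 # 1.0 bit/message

# Four messages with unequal probabilities.
print(entropy([0.5, 0.25, 0.125, 0.125]))  # 1.75 bits/message
```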
Ex. 4.3.2 : A source transmits two independent messages with probabilities p and (1 - p) respectively. Prove that the entropy is maximum when both the messages are equally likely. Plot the variation of entropy (H) as a function of the probability 'p' of the messages.
Sol. : We know that the entropy is given as,
H = Σ_{k=1}^{M} pk log2(1/pk)
Here we have two messages with probabilities p1 = p and p2 = 1 - p. Then the above equation becomes,
H = p log2(1/p) + (1 - p) log2(1/(1 - p))    ... (1)
A plot of H as a function of p is shown in Fig. 4.3.1.
Fig. 4.3.1 Plot of entropy 'H' with probability 'p' for two messages. The maximum of H occurs at p = 1/2, i.e. when the messages are equally likely.
As shown in the figure, the entropy is maximum at p = 1/2. Putting p = 1/2 in equation (1) we get,
Hmax = (1/2) log2 2 + (1/2) log2 2 = log2 2 = 1 bit/message
This shows that Hmax occurs when both the messages have the same probability, i.e. when they are equally likely.

Ex. 4.3.3 : Show that if there are 'M' number of equally likely messages, then the entropy of the source is log2 M.
Sol. : We know that for 'M' number of equally likely messages the probability is,
pk = 1/M
This probability is the same for all 'M' messages, i.e.,
p1 = p2 = p3 = ... = pM = 1/M    ... (1)
Entropy is given by equation (4.3.6),
H = Σ_{k=1}^{M} pk log2(1/pk)
 = p1 log2(1/p1) + p2 log2(1/p2) + ... + pM log2(1/pM)
Putting the probabilities from equation (1) in the above equation we get,
H = (1/M) log2 M + (1/M) log2 M + ... + (1/M) log2 M    (add 'M' number of terms)
In the above equation there are 'M' terms in the summation. Hence after adding these terms the above equation becomes,
H = log2 M

Ex. 4.3.4 : Prove that the upper bound on entropy is given as Hmax ≤ log2 M. Here 'M' is the number of messages emitted by the source.
Sol. : To prove the above property, we will use the following property of the natural logarithm :
ln x ≤ x - 1    for x ≥ 0    ... (1)
Let us consider any two probability distributions {p1, p2, ..., pM} and {q1, q2, ..., qM} on the alphabet X = {x1, x2, ..., xM} of the discrete memoryless source. Then let us consider the term
Σ_{k=1}^{M} pk log2(qk/pk)
This term can be written as,
Σ_{k=1}^{M} pk log2(qk/pk) = Σ_{k=1}^{M} pk [log10(qk/pk) / log10 2]
Multiply and divide the RHS by log10 e and rearrange the terms as follows :
 = Σ_{k=1}^{M} pk (log10 e / log10 2) [log10(qk/pk) / log10 e]
Here log10(qk/pk) / log10 e = ln(qk/pk). Hence the above equation becomes,
Σ_{k=1}^{M} pk log2(qk/pk) = log2 e Σ_{k=1}^{M} pk ln(qk/pk)
From equation (1) we can write ln(qk/pk) ≤ (qk/pk) - 1. Hence the above equation becomes,
Σ_{k=1}^{M} pk log2(qk/pk) ≤ log2 e Σ_{k=1}^{M} pk [(qk/pk) - 1]
 ≤ log2 e [Σ_{k=1}^{M} qk - Σ_{k=1}^{M} pk]
Here note that Σ qk = 1 as well as Σ pk = 1. Hence the above equation becomes,
Σ_{k=1}^{M} pk log2(qk/pk) ≤ 0    ... (2)
Now let us consider qk = 1/M for all k, that is, all symbols in the alphabet are equally likely. Expanding the logarithm, equation (2) becomes,
Σ_{k=1}^{M} pk log2 qk + Σ_{k=1}^{M} pk log2(1/pk) ≤ 0
Σ_{k=1}^{M} pk log2(1/pk) ≤ Σ_{k=1}^{M} pk log2(1/qk)
Putting qk = 1/M in the above equation,
Σ_{k=1}^{M} pk log2(1/pk) ≤ Σ_{k=1}^{M} pk log2 M = log2 M Σ_{k=1}^{M} pk
Since Σ pk = 1, the above equation becomes,
Σ_{k=1}^{M} pk log2(1/pk) ≤ log2 M
The LHS of the above equation is the entropy H(X) with an arbitrary probability distribution. This is the proof of the upper bound on entropy, and the maximum value of entropy is,
Hmax(X) = log2 M
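A quick numerical check of Ex. 4.3.2 to Ex. 4.3.4 is sketched below, assuming the entropy() helper from the earlier sketch (redefined here so the snippet runs on its own): it scans the binary entropy H(p) = p log2(1/p) + (1 - p) log2(1/(1 - p)) to confirm the maximum sits at p = 1/2, and verifies H = log2 M for equally likely messages while a skewed distribution stays below that bound.

```python
from math import log2

def entropy(probs):
    # Same helper as before: H = sum(p * log2(1/p)), skipping p == 0 terms.
    return sum(p * log2(1.0 / p) for p in probs if p > 0)

# Ex. 4.3.2: binary entropy H(p) is largest at p = 1/2, where it equals 1 bit.
best_p = max((p / 1000 for p in range(1, 1000)), key=lambda p: entropy([p, 1 - p]))
print(best_p, entropy([best_p, 1 - best_p]))   # 0.5 and 1.0 bit/message

# Ex. 4.3.3: M equally likely messages give H = log2 M exactly.
M = 8
print(entropy([1 / M] * M), log2(M))           # both 3.0

# Ex. 4.3.4: any other distribution on M symbols stays at or below log2 M.
skewed = [0.5, 0.2, 0.1, 0.1, 0.05, 0.03, 0.01, 0.01]
print(entropy(skewed) <= log2(M))              # True
```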
Example for Practice
1. A source generates four messages m0, m1, m2 and m3 with given probabilities; the successive messages emitted by the source are statistically independent. Calculate the entropy of the source.

Review Questions
1. What is entropy ?
2. Define entropy and discuss its properties.    AU : May-08, Marks 5
3. Explain the properties of entropy with suitable example.

4.4 Information Rate    AU : Dec.-10
The information rate is represented by R and it is given as,
Information rate : R = r H    ... (4.4.1)
Here R is the information rate, H is the entropy or average information, and r is the rate at which messages are generated. The information rate R is expressed as the average number of bits of information per second. It is calculated as follows :
R = r (messages/second) x H (information bits/message) = information bits/second

Examples for Understanding
Ex. 4.4.1 : An analog signal is bandlimited to B Hz and sampled at the Nyquist rate. The samples are quantized into 4 levels. Each level represents one message. Thus there are 4 messages. The probabilities of occurrence of these 4 levels (messages) are p1 = p4 = 1/8 and p2 = p3 = 3/8. Find out the information rate of the source.
Sol. :
i) To calculate entropy (H) :
We have four messages with probabilities p1 = p4 = 1/8 and p2 = p3 = 3/8. H (or entropy) is given by equation (4.3.5) as,
H = p1 log2(1/p1) + p2 log2(1/p2) + p3 log2(1/p3) + p4 log2(1/p4)
 = (1/8) log2 8 + (3/8) log2(8/3) + (3/8) log2(8/3) + (1/8) log2 8
Average information H = 1.8 bits/message    ... (1)
ii) To calculate message rate (r) :
We know that the signal is sampled at the Nyquist rate. The Nyquist rate for a B Hz bandlimited signal is,
Nyquist rate = 2B samples/sec.
Since every sample generates one message,
Messages per second, r = 2B messages/sec.
iii) To calculate information rate (R) :
The information rate is given by equation (4.4.1) as, R = rH. Putting the values of r and H in the above equation,
R = 2B messages/sec. x 1.8 bits/message = 3.6 B bits/sec.
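A minimal sketch of Ex. 4.4.1 follows, reusing the entropy() helper from the earlier sketches; the bandwidth value B below is just an assumed figure for illustration and is not taken from the book.

```python
from math import log2

def entropy(probs):
    # H = sum(p * log2(1/p)) in bits/message, skipping zero-probability terms.
    return sum(p * log2(1.0 / p) for p in probs if p > 0)

B = 4000                       # assumed bandwidth in Hz (illustrative, not from the text)
probs = [1/8, 3/8, 3/8, 1/8]   # probabilities of the four quantization levels

H = entropy(probs)             # about 1.81 bits/message (the text rounds this to 1.8)
r = 2 * B                      # Nyquist sampling -> 2B messages per second
R = r * H                      # information rate, equation (4.4.1)

print(f"H = {H:.2f} bits/message, r = {r} messages/s, R = {R:.0f} bits/s")
# For B = 4000 Hz this gives about 14490 bits/s; with H rounded to 1.8 it is 3.6*B.
```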
4.5 Source Coding Theorem (Shannon's First Theorem)
4.5.1 Need of Efficient Source Encoding Process
PCM, DM, ADM and DPCM are source coding algorithms. We know that PCM allots the same number of binary digits to all the sample values. If all the messages are equally likely, then PCM and similar other source encoders provide the maximum information rate. But if the messages are not equally likely, then the actual information rate is less than the maximum achievable rate. Efficient source encoders use the statistical properties of the source. Frequently occurring messages are encoded with less number of binary digits and rarely occurring messages are encoded with more number of binary digits. Thus the efficient source encoding methods are also called variable length encoding algorithms. They increase the information rate.
4.5.2 Requirements of an Efficient Source Encoder
i) The codewords generated by the encoder should be binary in nature.
ii) The source code should be unique in nature. That is, every codeword should represent a unique message.
4.5.3 Average Number of Bits (N)
Let there be 'L' number of messages emitted by the source. The probability of the k-th message is pk and the number of bits assigned to this message is nk. Then the average number of bits (N) in the codeword of a message is given as,
N = Σ_{k=0}^{L-1} pk nk    ... (4.5.1)
4.5.4 Coding Efficiency
Let Nmin be the minimum value of N. Then the coding efficiency of the source encoder is defined as,
η = Nmin / N    ... (4.5.2)
The source encoder is called efficient if the coding efficiency (η) approaches unity. In other words Nmin ≤ N, and the coding efficiency is maximum when Nmin = N.
4.5.5 Shannon's Theorem on Source Coding
The value of Nmin can be determined with the help of Shannon's first theorem, called the source coding theorem. This theorem is also called Shannon's theorem on source coding. It is stated as follows :
Given a discrete memoryless source of entropy H, the average codeword length N is bounded as,
N ≥ H    ... (4.5.3)
Here the entropy H represents the fundamental limit on the average number of bits per symbol, i.e. N. This limit says that the average number of bits per symbol cannot be made smaller than the entropy H. Hence Nmin = H, and we can write the efficiency of the source encoder from equation (4.5.2) as,
η = H / N    ... (4.5.4)
4.5.6 Code Redundancy
It is the measure of redundancy of bits in the encoded message sequence. It is given as,
Redundancy (γ) = 1 - code efficiency = 1 - η    ... (4.5.5)
Redundancy should be as low as possible.
4.5.7 Code Variance
The variance of the code is given as,
σ² = Σ_{k=0}^{L-1} pk (nk - N)²    ... (4.5.6)
Here σ² is the variance of the code, L is the number of symbols, pk is the probability of the k-th symbol, nk is the number of bits assigned to the k-th symbol and N is the average codeword length. Variance is the measure of variability in the codeword lengths. Variance should be as small as possible.
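The quantities in equations (4.5.1) to (4.5.6) are straightforward to compute once the codeword lengths are known. The sketch below (hypothetical helper names, with the entropy() function from the earlier sketches) evaluates the average length N, the efficiency η = H/N, the redundancy γ = 1 - η and the variance σ² for an assumed set of probabilities and codeword lengths; the numbers used happen to be the five-message example worked out later in Section 4.6.2.

```python
from math import log2

def entropy(probs):
    return sum(p * log2(1.0 / p) for p in probs if p > 0)

def code_metrics(probs, lengths):
    """Average length N (4.5.1), efficiency (4.5.4), redundancy (4.5.5), variance (4.5.6)."""
    N = sum(p * n for p, n in zip(probs, lengths))
    H = entropy(probs)
    efficiency = H / N
    redundancy = 1 - efficiency
    variance = sum(p * (n - N) ** 2 for p, n in zip(probs, lengths))
    return N, efficiency, redundancy, variance

# Assumed example: five messages and the codeword lengths some encoder produced.
probs = [0.4, 0.2, 0.2, 0.1, 0.1]
lengths = [1, 2, 3, 4, 4]

N, eta, gamma, var = code_metrics(probs, lengths)
print(f"N = {N}, efficiency = {eta:.4f}, redundancy = {gamma:.4f}, variance = {var:.2f}")
# N = 2.2, efficiency ~ 0.9645, redundancy ~ 0.0355, variance = 1.36
```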
Review Questions
1. Discuss the source coding theorem.
2. State the source coding theorem.    AU : May-07, Marks 5; May-10, Marks 6

4.6 Data Compaction (Entropy Coding)
These variable length coding algorithms are also called data compaction or entropy coding algorithms.
4.6.1 Shannon-Fano Algorithm
In Section 4.4 we have seen that if the probabilities of occurrence of all the messages are not equal, then the average information (i.e. entropy) is reduced. This in turn reduces the information rate. This problem is solved by coding the messages with different numbers of bits : as the probability of a message increases, less number of bits are assigned to it. The Shannon-Fano algorithm is used to encode the messages depending upon their probabilities. It allots less number of bits to frequently occurring messages and more bits to rarely occurring messages. The Shannon-Fano algorithm can be best explained with the help of an example.
A complete encoding process of the Shannon-Fano algorithm is shown in Table 4.6.1. As shown in Table 4.6.1 there are eight messages m1 to m8. The probabilities of occurrence of these messages are shown in the 2nd column. For example, the probability of m1 is 1/2, the probability of m2 and m3 is 1/8 each, and so on. The algorithm proceeds as shown in Table 4.6.1.
As shown in column-I, a dotted line is drawn between m1 and m2. This line makes two partitions. In the upper partition there is only one message and its probability is 1/2. The lower partition contains m2 to m8 and the sum of their probabilities is also 1/2. Thus the partition is made such that the sums of probabilities in both the partitions are almost equal. The messages in the upper partition are assigned bit '0' and those in the lower partition are assigned bit '1'. These partitions are further subdivided into new partitions following the same rule.
The partitioning is stopped when there is only one message in a partition. Thus in column-I the upper partition has only one message, hence no further partition is possible. But the lower partition of column-I is further subdivided in column-II.
In column-II, the dotted line is drawn between m3 and m4. In the upper partition we have two messages m2 and m3, and the sum of their probabilities is 1/8 + 1/8 = 1/4. The lower partition in column-II has messages m4 to m8; their sum of probabilities is 1/16 + 1/16 + 1/16 + 1/32 + 1/32 = 1/4. Thus the sum of probabilities in both the partitions is equal. The messages in the upper partition are assigned '0' and those in the lower partition are assigned '1'. Since both the partitions in column-II contain more than one message, they are further subdivided. This subdivision is shown in column-III. The partitioning is continued till there is only one message in the partition. The partitioning process is self explanatory in columns III, IV and V of Table 4.6.1.
In the last column of the table the codeword for the message and the number of bits per message are shown. The codeword is obtained by reading the bits of a particular message rowwise through all the columns. For example, message m1 has only one bit i.e. 0, message m2 has three bits i.e. 1 0 0, message m3 has also three bits i.e. 1 0 1, and message m8 has five bits i.e. 1 1 1 1 1.
Table 4.6.1 Shannon-Fano algorithm (columns : message, probability of message, sums of probabilities in the partitions of columns I to V, codeword and number of bits per message)
This shows that the message m1 has the highest probability, hence it is coded using a single bit i.e. '0'. As the probabilities of the messages go on decreasing, the number of bits in the codeword increases.
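The partitioning just described is easy to express recursively. The sketch below is a generic Shannon-Fano implementation (not the book's code) run on the eight-message example of Table 4.6.1; it splits each group where the cumulative probability is closest to half of the group total, appending '0' to the upper part and '1' to the lower part.

```python
def shannon_fano(symbols):
    """symbols: list of (name, probability), sorted by decreasing probability.
    Returns a dict mapping name -> codeword (a sketch, not the book's code)."""
    codes = {name: "" for name, _ in symbols}

    def split(group):
        if len(group) <= 1:
            return
        total = sum(p for _, p in group)
        # Choose the cut that makes the two partial sums as equal as possible.
        best_cut, best_diff, running = 1, float("inf"), 0.0
        for i, (_, p) in enumerate(group[:-1], start=1):
            running += p
            diff = abs(2 * running - total)
            if diff < best_diff:
                best_cut, best_diff = i, diff
        for name, _ in group[:best_cut]:   # upper partition gets '0'
            codes[name] += "0"
        for name, _ in group[best_cut:]:   # lower partition gets '1'
            codes[name] += "1"
        split(group[:best_cut])
        split(group[best_cut:])

    split(symbols)
    return codes

msgs = [("m1", 1/2), ("m2", 1/8), ("m3", 1/8), ("m4", 1/16),
        ("m5", 1/16), ("m6", 1/16), ("m7", 1/32), ("m8", 1/32)]
codes = shannon_fano(msgs)
avg_bits = sum(p * len(codes[name]) for name, p in msgs)
print(codes)       # m1 -> 0, m2 -> 100, m3 -> 101, ..., m8 -> 11111
print(avg_bits)    # 2.3125 binary digits/message for this dyadic example
```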
Thus with Shannon-Fano coding algorithm we ‘quired 27 binary digits per message. Thus with special type of coding like Stannon-Fano algorithm, average number of binary digits per message are reduced and ‘aximum information is conveyed by every binary digit (binit). (482 | Huffman Coding 5 the last section we have seen Shannon-Fano algorithm. This algorithm assigns different " Tber of binary digits to the messages according to their probabilities of occurrence. tees coding also uses the same principle. This type of coding makes average number a brary digits per message nearly equal to entropy (Average bits of information per ““S3ge). Huffman coding can be best explained with the help of an example. TECHNICAL PUBLICATIONS”- An up thst for knowledge, Communication Engineering 4-20 Information Theor and Cong Consider that the source generates five messages my, m,...m4. The probabilities of thee, messages are as shown in 2™ column of Table 4.6.2. ese hence put last in the column ‘Stage-IV 0 4d Lowest probability 2, aie § 3 3 288 852 2 gee 3 be 3 é fe H 3 3 3 3 $ zg zg 2 2 Fy bd a 5 82 By «7 ; i i Eb a2 i Ba 33 Be 68 | Ee ES 2 a 8 8 é \ : 4 3 " a é 3 3 3 3 - Stage-I z0=1 ‘M@U 0} PAUIqUIOD B12: ssoniqeqaud ysemo} jo soBessow om) SOUL Probabilities of message Table 4.6.2 Huffman coding TECHNICAL PUBLICATIONS”. An up thrust for knowledge on nest e 2 a go younages are arrange xl according to thelr decreasing, probabilities. For example rhe My have lowest probabilities and hence they are put at the bottom in column bane af rage aye messages of lowest probabilities are assigned binary ‘0’ and ‘1’ tne : 7 a a F ine 60° jowest probabilities in stage-I are added. Observe that the sum of two otities #8 OL + OL = 0.2. r of probabilities in stage-I is placed in stage-I such that the probabilities are phe sum ding order. Observe that 0.2 is placed last in stage-II. in deseo the last two probabilities are assigned ‘0’ to ‘I’ and they are added. Thus the f last two probabilities in stage-Il is 0.2 + 0.2 = 0.4. Now gam of ‘The sum of last two probabilities (ie. 0.4) is placed in stage-IIl such th srobabilities are in descending order. Again ‘0’ and ‘1’ is assigned to the last two probabilities. similarly the values in stage-IV are obtained. Since there are only two values in gagelV, these two values are assigned digits 0 and 1 and no further repetition is atthe 2 required. ow let us see how the codewords for messages are obtained. qo obtain codeword for message m, : the Table 4.6.2 is reproduced in Table 4.63 for explanation. the sequence of 0 and 1 are traced as shown in Table 463. The tracing is started from sage. The dotted line shows the path of tracing. Observe that the tracing is in the Gaction of arrows. In stage-I digit ‘1’ is traced. In stage-II digit ‘I’ is traced. In stage-IIT digit ‘0 is traced. In stage-IV digit ‘0’ is traced. Thus the traced sequence is 1100. We get the codeword for the message by reading this sequence from LSB to MSB. ie. 0011. Thus Se codeword for m, is 0011. (See Table 4.6.2 on next page and Table 4.6.3 on page 4 - 23). To obtain codeword for mg : The center line shows ( ~) the tracing path for message my. The tracing is started fom stage. No binary digits are occurred in stage-l, Il and III. Only digit ‘1’ is occurred S sagelV in the path of tracing. Hence codeword for message my is 1. Thus single digit Sassigned to my since its probability is highest. 
Consider that the source generates five messages m0, m1, ..., m4. The probabilities of these messages are as shown in the 2nd column of Table 4.6.2.
Table 4.6.2 Huffman coding (the messages and their probabilities are listed in decreasing order and combined stage by stage; the two messages of lowest probability are combined to form a new entry, which is placed last in the next stage)
The messages are arranged according to their decreasing probabilities. For example, m3 and m4 have the lowest probabilities and hence they are put at the bottom of the column. In stage-I, the messages of lowest probabilities are assigned binary '0' and '1'. Then the two lowest probabilities in stage-I are added. Observe that the sum of the two probabilities is 0.1 + 0.1 = 0.2.
The sum of probabilities obtained in stage-I is placed in stage-II such that the probabilities are in descending order. Observe that 0.2 is placed last in stage-II. In stage-II the last two probabilities are assigned '0' and '1' and they are added. Thus the sum of the last two probabilities in stage-II is 0.2 + 0.2 = 0.4.
Now the sum of the last two probabilities (i.e. 0.4) is placed in stage-III such that the probabilities are in descending order. Again '0' and '1' are assigned to the last two probabilities. Similarly the values in stage-IV are obtained. Since there are only two values in stage-IV, these two values are assigned digits 0 and 1 and no further repetition is required.
Now let us see how the codewords for the messages are obtained.
To obtain the codeword for message m4 :
The Table 4.6.2 is reproduced in Table 4.6.3 for explanation. The sequence of 0s and 1s is traced as shown in Table 4.6.3. The tracing is started from stage-I. The dotted line shows the path of tracing. Observe that the tracing is in the direction of the arrows. In stage-I digit '1' is traced. In stage-II digit '1' is traced. In stage-III digit '0' is traced. In stage-IV digit '0' is traced. Thus the traced sequence is 1100. We get the codeword for the message by reading this sequence from LSB to MSB, i.e. 0011. Thus the codeword for m4 is 0011. (See Table 4.6.2 and Table 4.6.3.)
To obtain the codeword for m0 :
The centre line shows the tracing path for message m0. The tracing is started from stage-I. No binary digits occur in stages I, II and III; only digit '1' occurs in stage-IV in the path of tracing. Hence the codeword for message m0 is 1. Thus a single digit is assigned to m0, since its probability is highest.
To obtain the codeword for m2 :
Similarly, if we trace in the direction of the arrows for message m2, we obtain the sequence 000 (this tracing path is not shown in Table 4.6.4). Reading from the LSB side we get the codeword for m2 as 000 again.
Table 4.6.3 Huffman coding (tracing paths for the messages; the traced digits are read in reversed order to obtain the codeword, except for m0 whose codeword is the single digit 1)
Table 4.6.4 shows the messages, their probabilities, the sequence obtained by tracing and the codeword obtained by reading that sequence from LSB to MSB.
Table 4.6.4 Huffman coding
Message    Probability    Digits obtained by tracing    Codeword (read from LSB side)
m0         0.4            1                             1
m1         0.2            10                            01
m2         0.2            000                           000
m3         0.1            0100                          0010
m4         0.1            1100                          0011
We know from equation (4.3.6) that the average information per message (entropy) is given as,
H = Σ_{k=0}^{4} pk log2(1/pk)
For the five messages the above equation can be expanded as,
H = p0 log2(1/p0) + p1 log2(1/p1) + p2 log2(1/p2) + p3 log2(1/p3) + p4 log2(1/p4)
Here we started from k = 0. Putting the values of the probabilities from Table 4.6.4 in the above equation we get,
H = 0.4 log2(1/0.4) + 0.2 log2(1/0.2) + 0.2 log2(1/0.2) + 0.1 log2(1/0.1) + 0.1 log2(1/0.1)
 = 0.52877 + 0.46439 + 0.46439 + 0.33219 + 0.33219
 = 2.12193 bits of information / message    ... (4.6.8)
Now let us calculate the average number of binary digits (binits) per message. Since each message is coded with a different number of binits, we should use their probabilities to calculate the average number of binary digits (binits) per message. It is calculated as follows :
Average number of binary digits per message = Σ (probability of the message) x (number of binits in its codeword)
 = (0.4 x 1) + (0.2 x 2) + (0.2 x 3) + (0.1 x 4) + (0.1 x 4)
 = 2.2 binary digits / message    ... (4.6.9)
Thus it is clear from equation (4.6.8) and equation (4.6.9) that Huffman coding assigns binary digits to each message such that the average number of binary digits per message is nearly equal to the average bits of information per message (i.e. H). This means that because of Huffman coding one binary digit carries almost one bit of information, which is the maximum information that can be conveyed by one digit.
Disadvantages of Huffman coding :
1) It requires the probabilities of the source symbols.
2) It assigns variable length codes to the symbols, hence it is not suitable for synchronous transmission.
3) Inter-character dependencies are not considered. Hence the compression is less.
Examples for Understanding
Ex. 4.6.1 : A discrete memoryless source has five symbols x1, x2, x3, x4 and x5 with probabilities 0.4, 0.19, 0.16, 0.15 and 0.15 respectively attached to every symbol.
i) Construct a Shannon-Fano code for the source and calculate the code efficiency η.
ii) Repeat (i) for the Huffman code and compare the two techniques of source coding.
Sol. :
i) To obtain the Shannon-Fano code :
The Shannon-Fano algorithm is explained in Table 4.6.1. Table 4.6.5 shows the procedure and calculations for obtaining the Shannon-Fano code for this example.
Table 4.6.5 To obtain the Shannon-Fano code (columns : message, probability of message, codeword for the message and number of bits per message)
Entropy (H) is given by equation (4.3.6),
H = Σ_{k=1}^{5} pk log2(1/pk)
Putting the values of the probabilities in the above equation,
H = 0.4 log2(1/0.4) + 0.19 log2(1/0.19) + 0.16 log2(1/0.16) + 0.15 log2(1/0.15) + 0.15 log2(1/0.15)
 = 2.2281 bits/message
The average number of bits per message N is given by equation (4.5.1) as,
N = Σ_{k=1}^{5} pk nk
Here pk is the probability of the k-th message and nk is the number of bits assigned to it. Putting the values in the above equation,
N = 0.4(1) + 0.19(3) + 0.16(3) + 0.15(3) + 0.15(3) = 2.35
The code efficiency is given by equation (4.5.4), i.e.,
η = H / N = 2.2281 / 2.35 = 0.948 or 94.8 %
ii) To obtain the Huffman code :
Table 4.6.6 illustrates the Huffman coding. Huffman coding is explained in Table 4.6.2 with the help of an example. The coding shown in Table 4.6.6 below is based on this explanation.
Table 4.6.6 To obtain the Huffman code
Table 4.6.3 shows how the codewords are obtained by tracing along the path. These codewords are given in Table 4.6.7.
Table 4.6.7 Huffman coding
Now let us determine the average number of bits per message (N). It is given as,
N = Σ_{k=1}^{5} pk nk
Putting the values in the above equation,
N = 0.4(1) + 0.19(3) + 0.16(3) + 0.15(3) + 0.15(3) = 2.35
Hence the code efficiency is,
η = H / N = 2.2281 / 2.35 = 0.948 or 94.8 %
Thus the code efficiency of the Shannon-Fano code and the Huffman code is the same in this example.

Ex. 4.6.2 : Five source messages are probable to appear as m1 = 0.4, m2 = 0.15, m3 = 0.15, m4 = 0.15 and m5 = 0.15. Find the coding efficiency for (i) Shannon-Fano coding (ii) Huffman coding.
Sol. :
I) Entropy of the source
Entropy, H = Σ_{k=1}^{M} pk log2(1/pk)
For M = 5,
H = 0.4 log2(1/0.4) + 0.15 log2(1/0.15) + 0.15 log2(1/0.15) + 0.15 log2(1/0.15) + 0.15 log2(1/0.15)
 = 2.171 bits/message
II) Shannon-Fano coding
i) To obtain the codewords
Table 4.6.8 lists the procedure for Shannon-Fano coding. The partitioning is made as per the procedure discussed earlier.
Table 4.6.8 Shannon-Fano coding
ii) To obtain the average number of bits per message (N)
N = Σ pk nk = p1 n1 + p2 n2 + p3 n3 + p4 n4 + p5 n5
 = 0.4 x 1 + 0.15 x 3 + 0.15 x 3 + 0.15 x 3 + 0.15 x 3 = 2.2 bits/message
iii) To obtain the code efficiency
η = H / N = 2.171 / 2.2 = 0.9868 or 98.68 %
iv) To obtain the code redundancy
γ = 1 - η = 1 - 0.9868 = 0.0132
III) Huffman coding
i) To obtain the codewords
Table 4.6.9 lists the Huffman coding algorithm along with the codewords obtained by tracing.
Table 4.6.9 Huffman coding (columns : probability, digits obtained by tracing, codeword and number of digits nk)
ii) To obtain the average number of bits per message (N)
N = Σ pk nk = p1 n1 + p2 n2 + p3 n3 + p4 n4 + p5 n5
 = 0.4 x 1 + 0.15 x 3 + 0.15 x 3 + 0.15 x 3 + 0.15 x 3 = 2.2 bits/message
iii) To obtain the code efficiency
η = H / N = 2.171 / 2.2 = 0.9868 or 98.68 %
iv) To obtain the code redundancy
γ = 1 - η = 1 - 0.9868 = 0.0132

4.6.3 Comparison between Huffman, Shannon-Fano and Prefix Coding
1) Huffman coding : Codewords are assigned as per the probability of the symbol.
   Shannon-Fano coding : Codewords are assigned as per the probability of the symbol.
   Prefix coding : Codewords are assigned as per the probability of the symbol.
2) Huffman coding : The two lower probability symbols are combined for the next stage.
   Shannon-Fano coding : Symbols are partitioned into two groups and combined.
   Prefix coding : No codeword is a prefix of another codeword.
3) Huffman coding : The average number of bits per message is nearly equal to the entropy.
   Shannon-Fano coding : The average number of bits per message is a little higher than that of Huffman coding.
   Prefix coding : The average number of bits per message is a little higher than with the other two methods.
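As a cross-check of Ex. 4.6.1 and Ex. 4.6.2 (and of point 3 in the comparison above), the short sketch below reuses the earlier helpers to recompute H, the average length N, the efficiency η = H/N and the redundancy γ = 1 - η from the probabilities and the codeword lengths worked out in those examples.

```python
from math import log2

def entropy(probs):
    return sum(p * log2(1.0 / p) for p in probs if p > 0)

def efficiency(probs, lengths):
    N = sum(p * n for p, n in zip(probs, lengths))
    eta = entropy(probs) / N
    return N, eta, 1 - eta       # average length, efficiency, redundancy

# Ex. 4.6.1: both Shannon-Fano and Huffman end up with lengths 1, 3, 3, 3, 3.
print(efficiency([0.4, 0.19, 0.16, 0.15, 0.15], [1, 3, 3, 3, 3]))
# -> N = 2.35, efficiency ~ 0.948, redundancy ~ 0.052

# Ex. 4.6.2: again both codes give lengths 1, 3, 3, 3, 3.
print(efficiency([0.4, 0.15, 0.15, 0.15, 0.15], [1, 3, 3, 3, 3]))
# -> N = 2.2, efficiency ~ 0.9868, redundancy ~ 0.0132
```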
