[…] frequency information). Then some of the […] compression to pack the end result into the […] explore the process step-by-step, using real images and following them through the entire encoding (saving as JPEG) and decoding […]

Encoding

Start by choosing an input image to […] dragging and dropping the image file on this […]

Step 1: Isolate the colour information

[…] green, and blue light. Typical computer images are made up from a […] colour. For this reason, it is called an RGB image. On the left side of the left-hand illustration, you can see the red, green, and blue […] spread evenly through the R, G, and B […] important than colour, so we'll want to isolate the brightness […] also has three channels, but it stores all of the brightness […] the other two (Cb and Cr) […] (middle), and Cr (bottom) channels. Notice […] because all of the definition given by the brightness […] the channels. In reality, each one is the […]

Before doing anything else, JPEG throws away some of the colour information by scaling down just the Cb and Cr (colour) channels while keeping the important Y (brightness) channel full size.
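If it helps to see these two operations as code, here is a minimal sketch in plain Python. The conversion constants are the ones commonly used for JPEG/JFIF files, which is an assumption on my part since they are not given above. The function names, the made-up pixel values, and the choice to average each 2x2 group of samples are likewise only illustrative.

```python
def rgb_to_ycbcr(r, g, b):
    """Convert one 8-bit RGB pixel to Y (brightness), Cb and Cr (colour).

    Assumption: full-range JFIF-style constants, not taken from the article.
    """
    y = 0.299 * r + 0.587 * g + 0.114 * b
    cb = 128 - 0.168736 * r - 0.331264 * g + 0.5 * b
    cr = 128 + 0.5 * r - 0.418688 * g - 0.081312 * b
    return y, cb, cr


def downsample_2x2(channel):
    """Halve width and height by averaging each 2x2 group of samples.

    Assumes even dimensions; a real encoder would pad the edges.
    """
    return [
        [
            (channel[i][j] + channel[i][j + 1]
             + channel[i + 1][j] + channel[i + 1][j + 1]) / 4
            for j in range(0, len(channel[0]), 2)
        ]
        for i in range(0, len(channel), 2)
    ]


# A tiny made-up 2x2 "image": each pixel is (R, G, B).
pixels = [[(200, 30, 30), (200, 30, 30)],
          [(128, 128, 128), (128, 128, 128)]]

ycbcr = [[rgb_to_ycbcr(*p) for p in row] for row in pixels]
y_channel = [[p[0] for p in row] for row in ycbcr]                 # stays 2x2
cb_small = downsample_2x2([[p[1] for p in row] for row in ycbcr])  # becomes 1x1
cr_small = downsample_2x2([[p[2] for p in row] for row in ycbcr])  # becomes 1x1
```

A neutral grey pixel comes out with Cb and Cr both at 128, the "no colour" midpoint, which is a handy sanity check that the brightness really has been separated from the colour.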
Strictly speaking, this step is optional. The standard says you can keep all of the colour information, half of it, or a quarter of it. For images, most apps will keep half of the colour information; for video it is usually a quarter. For this demo I’m keeping a quarter, both to exaggerate the effect and because it makes […]

Notice that we started with 3 full channels and now we have 1 full channel and 2 × ¼ channels, for a total of 1½. We're just getting started and we are already down to half of the information we started with!

Step 3: Convert to the frequency domain

[…]

Whoa there, horsie! What? OK, let's consider just one of these 8x8 blocks from the Y […] lower-right corner of that block. Hence the term spatial: position in the block represents […] in that block of the image. The value in the […] value in the lower-right corner of the block will […]

This domain transformation is accomplished using a bit of mathematical legerdemain called the 2D Discrete Cosine Transform (DCT). (If you have heard of Fourier […] representation.) The essential idea is to represent the values in the 8x8 block as a sum of cosine functions, where each cosine function has a specific unique frequency.

You don’t need to understand the math to get a sense of how it works. Look at the frequency illustration for the Tower image. You can clearly see each 8x8 block’s upper-left corner thanks to a dark dot of low-frequency information.
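For the curious, the transform fits in a few lines. This is a deliberately slow, direct sketch of an 8x8 2D DCT in plain Python, not how a real encoder does it (those use fast algorithms, and also shift the values to be centred on zero first, which I skip here). The names `dct_2d`, `flat`, and `busy` are my own, and the two test blocks are made up.

```python
import math

N = 8


def dct_2d(block):
    """Orthonormal 2D DCT-II of an N x N block, straight from the definition."""
    def scale(k):
        return math.sqrt(1 / N) if k == 0 else math.sqrt(2 / N)

    out = [[0.0] * N for _ in range(N)]
    for u in range(N):          # vertical frequency
        for v in range(N):      # horizontal frequency
            total = 0.0
            for x in range(N):
                for y in range(N):
                    total += (block[x][y]
                              * math.cos((2 * x + 1) * u * math.pi / (2 * N))
                              * math.cos((2 * y + 1) * v * math.pi / (2 * N)))
            out[u][v] = scale(u) * scale(v) * total
    return out


# A perfectly flat block: every sample has the same value.
flat = dct_2d([[100] * N for _ in range(N)])

# A busy block: a checkerboard that flips at every pixel.
busy = dct_2d([[255 if (x + y) % 2 == 0 else 0 for y in range(N)]
               for x in range(N)])
```

For the flat block, everything except the upper-left value `flat[0][0]` comes out as (practically) zero. For the checkerboard, a large value shows up in the lower-right corner at `busy[7][7]`, which is where the fastest pixel-to-pixel changes are recorded.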
Now if you look at blocks from the sky parts of the image, you will see that the rest of each block is mostly empty. The sky doesn’t have lots of dramatic changes from pixel to pixel: no high-frequency information. Compare that to blocks from the tower parts of the image: the busy texture of the bricks means lots of higher-frequency change, and this shows up as grey throughout the block.

Step 4: The quality slider (quantization)

The next step is to selectively throw away some of the frequency information. If you have ever saved a JPEG image and chosen a quality value, this is where that choice comes into play.

It works like this: start with two 8x8 tables of whole numbers, called the quantization tables. One table is for brightness information, and one is for colour information. You will use these numbers on each of the 8x8 blocks in the image data by dividing the frequency value in the image data by the corresponding number in its quantization table. So the upper-left corner of each 8x8 block in the Y frequency channel will be divided by the number in the upper-left corner of the brightness quantization table, and so on. The result of each division is rounded to the nearest whole number, so the fractional part is thrown away.

[Figure: quantization tables (including the chrominance (colour) table) and the output image; the […] image is shown for reference.]

The larger a number in one of the quantization tables, the more information gets thrown away from that part of the frequency range. Since we care less about high-frequency information, the numbers in that area of the quantization tables will be larger. And since we care less about colour than about brightness, the numbers in the colour table will be larger overall than the numbers in the brightness table.

The quantization tables are saved along with the image data in the JPEG file. They'll be needed to decode the image correctly.
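Here is a small sketch of that divide-and-round step in plain Python. To be clear about what is made up: the table below is a hypothetical one that simply grows towards the lower-right corner, not one of the tables a real JPEG encoder would use, and the block of frequency values is invented too. The `step` parameter is my stand-in for the quality slider.

```python
def make_table(step):
    """Hypothetical 8x8 quantization table (not a standard JPEG table).

    Divisors grow towards the lower-right corner, where the
    high-frequency values live. A bigger step means lower quality.
    """
    return [[1 + step * (u + v) for v in range(8)] for u in range(8)]


def quantize(coeffs, table):
    """Divide each frequency value by its table entry, then round.

    Note: Python's round() sends exact .5 ties to the nearest even number.
    """
    return [[round(coeffs[u][v] / table[u][v]) for v in range(8)]
            for u in range(8)]


def count_zeros(block):
    """How many entries of the block were rounded all the way down to 0."""
    return sum(value == 0 for row in block for value in row)


# Made-up frequency block: large values upper-left, fading to the lower-right.
coeffs = [[200 / (1 + u + v) for v in range(8)] for u in range(8)]

gentle = quantize(coeffs, make_table(2))    # small divisors: "high quality"
harsh = quantize(coeffs, make_table(10))    # large divisors: "low quality"

print(count_zeros(gentle), count_zeros(harsh))
```

With the harsher table, far more of the block collapses to zero, and it is the high-frequency corner that goes first. The upper-left value is divided by 1 in both tables here, so it survives untouched.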
[…] the quality down towards the low end. […]

[…] by tossing the decimal parts after division, we […] actually buy us anything. However, this data is now going to be compressed using traditional […] whole reason we used lossy compression in […] doesn't work well for images? Yes, but that quantization we just did is going to make the […] number sequences:

[…]

The first row lists values for, say, some pixel in the Y frequency channel. The second row is the same values divided by 2 and rounded; the third row is divided by 16 and rounded. You […] quantized parts (with the largest divisors) are next to each other to make nice, repetitive […]

Decoding

[…] decompress the quantized (divided and […] dividing by the numbers in the tables, we […] lost, and the less accurate our reconstruction […]

Now that we have reconstructed the frequency information, we […] by tossing the decimal parts after division, we […]
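The trade-off between dividing when encoding and multiplying back when decoding is easy to try for yourself. This sketch in plain Python takes a made-up row of values (not the ones from the illustration), divides and rounds them with a small and a large divisor, and then multiplies back the way a decoder would. The function names are my own.

```python
def quantize_row(values, divisor):
    """Encoder side: divide and round to whole numbers."""
    return [round(v / divisor) for v in values]


def dequantize_row(levels, divisor):
    """Decoder side: multiply back. The rounded-off part cannot be recovered."""
    return [level * divisor for level in levels]


def max_error(original, restored):
    """Largest difference between an original value and its reconstruction."""
    return max(abs(a - b) for a, b in zip(original, restored))


values = [52, 55, 61, 66, 70, 61, 64, 73]  # made-up example values

for divisor in (2, 16):
    levels = quantize_row(values, divisor)
    restored = dequantize_row(levels, divisor)
    print(divisor, levels, restored, max_error(values, restored))
```

With the larger divisor, the stored numbers are much more repetitive (good news for the lossless compressor), but the reconstructed values drift further from the originals. The error can never be more than half the divisor, which is why the size of the table entries controls the quality.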
Step 9: Fill in missing colour information

[…] channels Cb and Cr back up to their original […] that are still there. There are different ways to […] scaled up image will tend to be either blocky or blurry, depending on the method used.

[…] original input image and the decoded output […]
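To make "blocky or blurry" concrete, here is a sketch of two common ways to double the size of a colour channel, in plain Python. Which method a given decoder actually uses is not something I can tell you from the text above, so treat both functions (and their names) as illustrations only.

```python
def upsample_nearest(channel):
    """Double width and height by repeating each sample (tends to look blocky)."""
    out = []
    for row in channel:
        wide = [value for value in row for _ in range(2)]
        out.append(wide)
        out.append(list(wide))
    return out


def upsample_bilinear(channel):
    """Double width and height by blending neighbouring samples (tends to look blurry)."""
    height, width = len(channel), len(channel[0])

    def locate(pos, size):
        # Map an output index to a (clamped) position between two source samples.
        src = min(max((pos + 0.5) / 2 - 0.5, 0.0), size - 1.0)
        low = int(src)
        high = min(low + 1, size - 1)
        return low, high, src - low

    out = []
    for i in range(2 * height):
        y0, y1, fy = locate(i, height)
        row = []
        for j in range(2 * width):
            x0, x1, fx = locate(j, width)
            top = channel[y0][x0] * (1 - fx) + channel[y0][x1] * fx
            bottom = channel[y1][x0] * (1 - fx) + channel[y1][x1] * fx
            row.append(top * (1 - fy) + bottom * fy)
        out.append(row)
    return out


small = [[0, 100]]                  # a made-up 1x2 colour channel
print(upsample_nearest(small))      # hard jump from 0 to 100
print(upsample_bilinear(small))     # gradual ramp from 0 to 100
```

Repeating samples keeps edges hard but turns each original sample into a visible square; blending smooths the squares away at the cost of softening every edge. Either way, the missing values are estimates, not the colour information that was thrown away during encoding.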
