Rem Ue soem cy
Coen oe ee Cre CS
Ree Cahn cam eee n ence
CMe MCI a ur oneg
frequency information). Then some of the
eee PCa ncn Ne ete a)
STIS MM MLL Ca Sa
eae MRT CC Co BS
compression to pack the end result into the
RU See ees Me ee en
explore the process step-by-step, using real
images and following them through the entire
encoding (saving as JPEG) and decoding
(ecm CONN aso Me com emg
PTR CMCC Te mente CMEC MECN TOTS
Pion eet cm mere
Encoding
Start by choosing an input image to
eer
MM) aT)
Ce enn a
cone
Flowers
Trey
MS]
cums euro
Cee eM NutOre eT CMI rae CT)
De mn. ice ea
eu Rae M IS unt biol
Pee UE ca meta Rr
CeCe meee
Cm nee oR REA ok Coa
Sot Sone Rm melody
Peete Ci ae ae ECC
dragging and dropping the image file on this
Po
Step 1: Isolate the colour information
“Ene
75% 31% 5%
ea a ee
Cee on Tai
green, and blue light.
Typical computer images are made up from a
CCMA ets RSC MeL Merl eC ry
Feet CR lye] mse
aS RUN Maas Meet
PNM ee nae ee OMS ICCC gran Th CPS
colour. For this reason, it is called an RGB
image. On the left side of the left-hand
illustration, you can see the red, green, and
NCR CM aN ata CeCe Casal’
TRE MI
Sse u nen Cee Sen ec a
eae Te USSR COL
spread evenly through the R, G, and B
TM RC n murat SSeS
important than colour, so we'll want to isolate
Pre Oem MRR Cl on RC
EIRCOM NAMM RUM 3
Tee MCR Ron eu at MITT] Tea
Peed CMCC COM (eer (elem ToT
also has three channels, but it stores all of the
Pre Oem monn MoU O))
PRS nha kee Ma oe en ce
the other two (Cb and Cr)
SMa MMU ei enc LC Mm tro
RMR UREN Ue Ts Sea Aco) ae
(middle), and Cr (bottom) channels. Notice
ieee MU Meme eM Ng
because all of the definition given by the
Pre UES MUU ecu kon
Serr)
Sco ee MUM ste
ese m MACRO Ig
the channels. In reality, each one is the
Reena usBS) Ee elt Lg
ere)
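To make the switch from RGB to brightness-plus-colour concrete, here is a minimal Python sketch that converts a single pixel to Y, Cb, and Cr. The coefficients are the ones commonly used in JPEG/JFIF files; they are an assumption here, since this article does not list them, and `rgb_to_ycbcr` is just an illustrative name.

```python
def rgb_to_ycbcr(r, g, b):
    """Convert one 8-bit RGB pixel to YCbCr (assumed JFIF full-range formula)."""
    y = 0.299 * r + 0.587 * g + 0.114 * b              # brightness
    cb = 128 - 0.168736 * r - 0.331264 * g + 0.5 * b   # blue-difference colour
    cr = 128 + 0.5 * r - 0.418688 * g - 0.081312 * b   # red-difference colour
    # Round and clamp each result to the 0..255 range of an 8-bit channel.
    return tuple(min(255, max(0, round(v))) for v in (y, cb, cr))


print(rgb_to_ycbcr(128, 128, 128))  # (128, 128, 128)
print(rgb_to_ycbcr(255, 0, 0))      # (76, 85, 255)
```

A pure grey pixel ends up with both colour channels at the neutral midpoint of 128, so all of its information lives in Y. A pure red pixel, by contrast, pushes Cr to its maximum.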
Before doing anything else, JPEG throws
away some of the colour information by
scaling down just the Cb and Cr (colour)
channels while keeping the important Y
(brightness) channel full size. Strictly
speaking, this step is optional. The standard
says you can keep all of the colour
information, half of it, or a quarter of it. For
images, most apps will keep half of the colour
information; for video it is usually a quarter.
For this demo I’m keeping a quarter, both to
exaggerate the effect and because it makes
AColmnlecctm LV ESiar1 Col nt
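As a rough sketch of what "keeping a quarter" means, the Python below shrinks a colour channel by averaging each 2x2 group of samples into one. Averaging is only one possible way to scale down and is an assumption here; the function name and the sample values are made up for illustration.

```python
def subsample_quarter(channel):
    """Keep a quarter of the samples by averaging each 2x2 group.

    `channel` is a list of rows; width and height are assumed to be even.
    """
    height, width = len(channel), len(channel[0])
    return [
        [
            (channel[y][x] + channel[y][x + 1]
             + channel[y + 1][x] + channel[y + 1][x + 1]) / 4
            for x in range(0, width, 2)
        ]
        for y in range(0, height, 2)
    ]


# A made-up 4x4 Cb channel.
cb = [
    [100, 102, 200, 202],
    [104, 106, 204, 206],
    [50, 50, 60, 60],
    [50, 50, 60, 60],
]
small = subsample_quarter(cb)
print(small)  # [[103.0, 203.0], [50.0, 60.0]]

# One full Y channel plus two quarter-size colour channels,
# compared with the three full channels we started with.
full = len(cb) * len(cb[0])
kept = len(small) * len(small[0])
print((full + 2 * kept) / (3 * full))  # 0.5
```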
Notice that we started with 3 full channels
and now we have 1 full channel and 2 × ¼
channels, for a total of 1½. We're just getting
started and we are already down to half of the
information we started with!
Step 3: Convert to the frequency domain
To
DR RCO ANCOR oT
RMR ace te oR A ML
To oe Ue A EVR me Wee mom: hss)
Deca Cem CR RCS (Hueco Rad
eee te ON ug Caen
ete om urn
Whoa there, horsie! What? OK, let's consider
just one of these 8x8 blocks from the Y
SOC MS CR eRe NEN)
TOUR MMU te (aes
OCS ICM UM nat Sem NAVI Mean I
PC MIMUCMT ie i Melt mm ciallctd
CRMC CMM ACM ON Cag attics
OSCR UN ae USeM MUN CMU GIs
lower-right corner of that block. Hence the
term spatial: position in the block represents
PS CUMAR UTE MN URICR UST Cnn LT
De aOR UR Ce MMe eM eran Res
ee MME Ac cecal e mice cate wet
in that block of the image. The value in the
upper-left corner of the block will represent
the lowest-frequency information, and the
value in the lower-right corner of the block will
represent the highest-frequency information.
This domain transformation is
accomplished using a bit of
mathematical legerdemain called the
2D Discrete Cosine Transform (DCT).
(If you have heard of Fourier
Belated La
TORS Sie) SAX] AO 91-1 H aS SOD
olen nle for orl vie
representation.) The essential idea is
to represent the values in the 8x8 block
as a sum of cosine functions, where
each cosine function has a specific
unique frequency.
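If you would like to see the idea in code, here is a deliberately naive Python sketch of the 2D DCT for one 8x8 block. It uses the standard DCT-II formula with the usual JPEG scaling, which this article does not spell out, so treat the details as assumptions. It also skips the step where JPEG subtracts 128 from each sample first, and real codecs use much faster algorithms.

```python
import math

N = 8  # JPEG works on 8x8 blocks


def dct_2d(block):
    """Naive 2D DCT-II of an 8x8 block (a list of 8 rows of 8 numbers)."""
    def scale(k):
        # The zero-frequency term gets an extra 1/sqrt(2) factor.
        return 1 / math.sqrt(2) if k == 0 else 1.0

    out = [[0.0] * N for _ in range(N)]
    for v in range(N):        # vertical frequency (row of the output)
        for u in range(N):    # horizontal frequency (column of the output)
            total = 0.0
            for y in range(N):
                for x in range(N):
                    total += (
                        block[y][x]
                        * math.cos((2 * x + 1) * u * math.pi / (2 * N))
                        * math.cos((2 * y + 1) * v * math.pi / (2 * N))
                    )
            out[v][u] = 0.25 * scale(u) * scale(v) * total
    return out


# A perfectly flat block: every sample has the same value.
flat = [[10] * N for _ in range(N)]
coeffs = dct_2d(flat)
print(round(coeffs[0][0], 6))  # 80.0
```

For the flat block, everything lands in the upper-left value and the other 63 values come out as zero (give or take floating-point noise). That is the code version of the "mostly empty" sky blocks.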
You don’t need to understand the math to get
a sense of how it works. Look at the
frequency illustration for the Tower image.
You can clearly see each 8x8 block’s
upper-left corner thanks to a dark dot of
low-frequency information. Now if you look at
blocks from the sky parts of the image, you
will see that the rest of each block is mostly
empty. The sky doesn’t have lots of dramatic
changes from pixel to pixel: no high-frequency
information. Compare that to blocks from the
tower parts of the image: the busy texture of
the bricks means lots of higher-frequency
change, and this shows up as grey throughout
the block.
Step 4: The quality slider (quantization)
The next step is to selectively throw away
some of the frequency information. If you
have ever saved a JPEG image and chosen a
quality value, this is where that choice comes
into play. It works like this: start with two 8x8
tables of whole numbers, called the
quantization tables. One table is for
brightness information, and one is for colour
information. You will use these numbers on
each of the 8x8 blocks in the image data by
dividing the frequency value in the image data
by the corresponding number in its
quantization table. So the upper-left corner of
each 8x8 block in the Y frequency channel will
be divided by the number in the upper-left
corner of the brightness quantization table,
and so on. The result of each division is
rounded to the nearest whole number and the
fractional parts are thrown away.
[Figure: the quantization tables, including the chrominance (colour) table, with the image shown for reference.]
The larger a number in one of the quantization
tables, the more information gets thrown
away from that part of that frequency range.
Since we care less about high-frequency
information, the numbers in that area of the
quantization tables will be larger. And since
we care less about colour than about
brightness, the numbers in the colour table
will be larger overall than the numbers in the
brightness table.
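The divide-and-round step can be sketched in a few lines of Python. The 4x4 block and table below are toy numbers invented for illustration (real blocks and tables are 8x8, and real tables have different values); notice only that the divisors grow towards the lower-right, high-frequency corner.

```python
def quantize_block(coeffs, table):
    """Divide each frequency value by its table entry and round the result."""
    return [
        [round(c / q) for c, q in zip(coeff_row, table_row)]
        for coeff_row, table_row in zip(coeffs, table)
    ]


# Made-up frequency values for one (shrunken) block.
coeffs = [
    [623.0, -47.0, 12.0, -3.0],
    [-38.0, 20.0, -7.0, 2.0],
    [10.0, -6.0, 3.0, -1.0],
    [-4.0, 2.0, -1.0, 0.5],
]
# Made-up quantization table: larger numbers throw away more.
table = [
    [8, 10, 16, 24],
    [10, 14, 22, 40],
    [16, 22, 40, 60],
    [24, 40, 60, 99],
]

for row in quantize_block(coeffs, table):
    print(row)
# [78, -5, 1, 0]
# [-4, 1, 0, 0]
# [1, 0, 0, 0]
# [0, 0, 0, 0]
```

Most of the high-frequency values collapse to zero, while the low-frequency values in the upper-left survive in a coarser form.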
The quantization tables are saved
along with the image data in the JPEG
file. They'll be needed to decode the
image correctly.
COC Cee ae
Ea Ce Ce mn
eee MMe Sen LT
the quality down towards the low end.
Reese Reon
Tae ee ee eae
oS eu
TIN Re ua
by tossing the decimal parts after division, we
Se a
See Rcd
Ce CC
actually buy us anything. However, this data is
now going to be compressed using traditional
lossless compression. Wait, wasn't the
whole reason we used lossy compression in
the first place that lossless compression
doesn't work well for images? Yes, but that
quantization we just did is going to make the
data much easier to compress. Consider these
number sequences:
The first row lists values for, say, some pixel
in the Y frequency channel. The second row is
the same values divided by 2 and rounded; the
third row is divided by 16 and rounded. You
Cn Ue me ed
eR MCU RG cee nr
De Cun ORO
Pe eC ae
Toe eeNee OR Omg am RCC)
eos aOR Re
ce ei Reece nC um a om
De ae Sn
Tee CU
quantized parts (with the largest divisors) are
next to each other to make nice, repetitive
Pete ey
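Here is a small Python sketch of why heavier quantization makes the data easier to pack. It divides a made-up row of frequency values by 2 and by 16, then counts how many runs of repeated values are left. Simple run-length counting is only a stand-in for the real entropy coding that JPEG uses, and the values and function names are hypothetical.

```python
from itertools import groupby


def quantize_row(values, divisor):
    """Divide every value by the divisor and round to a whole number."""
    return [round(v / divisor) for v in values]


def run_lengths(values):
    """Collapse consecutive repeats into (value, count) pairs."""
    return [(v, len(list(group))) for v, group in groupby(values)]


# Hypothetical frequency values, made up for illustration.
row = [96, 44, 22, 14, 10, 6, 6, 4, 4, 2, 2, 2]

for divisor in (1, 2, 16):
    quantized = quantize_row(row, divisor)
    print(divisor, quantized, len(run_lengths(quantized)))
# 1 [96, 44, 22, 14, 10, 6, 6, 4, 4, 2, 2, 2] 8
# 2 [48, 22, 11, 7, 5, 3, 3, 2, 2, 1, 1, 1] 8
# 16 [6, 3, 1, 1, 1, 0, 0, 0, 0, 0, 0, 0] 4
```

Dividing by 16 leaves a long run of zeros at the end, which a lossless compressor can store as "seven zeros" instead of seven separate numbers.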
Decoding
Gena i ed
PT Rg =Car eae usa ec
Nec oe (em RR RUM aur
TC e Re en ren
rena Cu aaa
Prem Mes Me One
TT RUE Ue Cd
PI Cum UNC UML aL cad
ea een aU a ROM
COMME UM Ulcer eMC
ten
BS ered)
Cree he
Ce
decompress the quantized (divided and
oe Rie U om CMe RU CRT
Ce ee Me oa
Oem eS coySCA use ane uc)
Next
rm
eu CM ecco TCR
Re RO oe ea
dividing by the numbers in the tables, we
RASC RURta CRSeS
Re eC ee to
Ree UN Ure nur nls
Ce ect eM Ce CR)
PR MRM U TL cu nc acy
Se ae a
COE eu Ce te Ons
lost, and the less accurate our reconstruction
Raa
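To see how the size of the table number affects the reconstruction, here is a minimal Python sketch that quantizes a single made-up frequency value and then multiplies it back. The value 183 and the divisors are arbitrary choices for illustration.

```python
def quantize(value, divisor):
    """Encoder side: divide and round to a whole number."""
    return round(value / divisor)


def dequantize(quantized, divisor):
    """Decoder side: multiply by the same table number."""
    return quantized * divisor


original = 183  # a made-up frequency value
for divisor in (4, 16, 64):
    restored = dequantize(quantize(original, divisor), divisor)
    print(divisor, restored, abs(original - restored))
# 4 184 1
# 16 176 7
# 64 192 9
```

The restored value is always a multiple of the divisor, so the bigger the divisor, the further it can land from the original.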
SC ae eee ER uC
Now that we have
reconstructed the frequency information, we
OCIS Ce ee une ay
CMU a CC uD
tes RS ER eR
Cie Ce oy oyreed
CORR URC Tg
CM oem CRC Mca
et ae Cu ne ce CN CT)
INR RO nea cr
Ae ea ees Cet Ces
Tae URI i ee ema TS
Deen RUC
PRU mem
by tossing the decimal parts after division, we
ROR U Rea)
Tego SCR Cun my
Cr eae CUS
Eee Ne DTU amu eT
RRR Reese Mente er
eee ee ees Umea
UR MSs me eon
Ue SMU ae So
Cee a RMS COM a
CMU eu a RN ROR CR
Cem ey ME ae
OR MTNA Cn ea
Deuce eat
n
ve
LyAt
runt
Se Maer
eran ee gether.
Wl OoPenne ne maesccurscs
Co Re URC eM ee
PML olan e}
Step 9: Fill in missing colour information
cb
cree sae)
eek Ren)
Re Me RO CRC MR Ure
channels Cb and Cr back up to their original
Fyrom Man MI COmuCU Coan CM UIC mT AS
aOR RRM re Tur Cae To
inter PaCS hc
Uo RUC ae nc
that are still there. There are different ways to
CCR UTA Meo) a LL TT
scaled up image will tend to be either blocky
or blurry, depending on the method used.
SO mC estan)
Be Re cue a
Tou 5 ay
TC na
emt eae OR Ue Mal ge
MIC
oe
Dat a
eR MC a
Cement Pe
original input image and the decoded output
Creo
fice f cma
Gee neo ceca Re LCC)
eM om
Rae