Next Generation Video Coding (By APSIPA)


APSIPA

Asia-Pacific Signal and Information Processing Association

Next Generation Video Coding


H.265/HEVC and its extensions
Oscar C. Au (PhD, Princeton Univ.)
Dept. of Electronic and Computer Engineering
Hong Kong University of Science and Technology
Clear Water Bay, Hong Kong
Email: eeau@ust.hk

APSIPA Distinguished Lecture Series
www.apsipa.org

Oscar C. Au

BS, Toronto. MA/PhD, Princeton. Postdoc, Princeton.


Professor, HKUST. Director, Multimedia Tech Center.
Steering Committee, ICME/TMM.
IEEE/HKIE Fellow. BoG, APSIPA.
Best Paper Awards: SiPS/PCM/MMSP/ICIP
AE of journals: TCSVT, TIP, TCAS1, JVCIR, JSPS, TSIP, JMM, JFI.
Chair of 3 TC: CAS MSATC, SPS MMSP TC, APSIPA IVM TC.
Member of TC: CAS VSPS/DSP, SPS IVMSP/IFS, ComSoc MMC.
400+ papers. h-index = 29. 100+ patents filed, 20 granted.
80+ standards contributions (MPEG/VCEG/JCT-VC/AVS).

Outline

Introduction
Lossless Compression
Image Compression
Video Compression
Simulation Results
Conclusion

What is Image?
An image is worth a thousand words.
Camera: an optical instrument that records images that can be stored directly (Wikipedia)
Black-and-white photo, color photo
Analog camera (film), digital camera (memory card), DSLR, smartphone
Storage card: SD, xD, SDHC, USB, ...
Autofocus (AF), auto exposure (AE), auto white balance (AWB)
ISO, shutter speed, aperture, metering
Lens: wide angle, zoom, fisheye, vibration reduction / image stabilizer, ...

What is Video?

Video is a sequence of images
Transmission: TV, CCTV, broadcasting TV, movie, camcorder, ...
Storage: VHS, Betamax, S-VHS, Video8, LD, VCD, DVD, Blu-ray
PAL/SECAM (625 lines, 25 fps), NTSC (525 lines, 30 fps), movie
SDTV: 720x576 (PAL/SECAM), 720x480 (NTSC)
HDTV: 1920x1080 (1080i/p), 1280x720 (720i/p)
UHDTV: 7680x4320
Digital cinema: 2048x1080 (2K), 4096x2160 (4K)
CCD/CMOS sensor, Bayer color filter array, demosaicking
AF/AE/AWB

Why video coding? Why possible?


Why?
  Raw video data is huge.
  Channel capacity is limited.
  Video coding reduces the data rate.
Why possible?
  Signal representation is naturally redundant.
  Some signal details are irrelevant to the observer.
  Video coding works by removing redundancy and irrelevancy.
  Redundancy removal is reversible (lossless).
  Irrelevancy removal is irreversible (lossy).

Why Compression? Storage.


a) A 3000-line text file
   3000 lines x 80 char/line x 1 byte/char = 240 KB

b) A 3-minute song on CD
   3 x 60 sec x 44100 Hz x 2 byte/sample x 2 channels = 32 MB

c) A 150-minute NTSC movie (3.4 TB for HDTV, 60 fps)
   150 x 60 sec x 30 fps x 480 x 720 pixels x 3 colors = 280 GB

Storage capacity:

SD card / Flash drive: 1/2/4 GB, 8/64/128 GB
iPod / hard disk: 8 GB/16 GB, 100 GB/500 GB/1 TB
CD-R: 650 MB/700 MB
DVD+R: ~4.7 GB

Why Compression? Communication.


a) Speech over telephone
   8000 Hz x 8 bit/sample = 64 kbit/s

b) CD-quality music/audio over network
   44.1 x 1000 Hz x 16 bit/sample x 2 channels = 1.4 Mbit/s

c) NTSC digital video over network (3.0 Gb/s for HDTV)
   30 fps x 480 x 720 pixels x 8 bit/pixel x 3 colors = 250 Mbit/s

Channel capacity

GPRS: 120/20 kb/s; 3G (SS): 384/160 kb/s
HSPA: 28/42 Mb/s; 4G (OFDMA/MIMO): 1000/100 Mb/s
Bluetooth: 1-3 Mb/s
WiFi: 12/54 Mb/s; WiMAX: 1 Gb/s
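
The storage and bit-rate figures on the two "Why Compression?" slides are plain products of duration, sampling rate and sample size. The sketch below recomputes them so the orders of magnitude can be checked; the function and variable names are illustrative and not part of the lecture, and decimal units (1 GB = 10^9 bytes) are assumed.

```python
# Raw (uncompressed) sizes and bit rates. All names here are illustrative.

def raw_video_bytes(minutes, fps, width, height, bytes_per_pixel=3):
    """Bytes needed for uncompressed video with one byte per colour sample."""
    return minutes * 60 * fps * width * height * bytes_per_pixel

def raw_video_bitrate(fps, width, height, bits_per_sample=8, components=3):
    """Bits per second of uncompressed video."""
    return fps * width * height * bits_per_sample * components

# Storage examples
text_file_bytes = 3000 * 80 * 1                  # 3000 lines, 80 chars, 1 byte each
cd_song_bytes = 3 * 60 * 44100 * 2 * 2           # 3 min, 44.1 kHz, 16 bit, stereo
ntsc_movie_bytes = raw_video_bytes(150, 30, 720, 480)
hdtv_movie_bytes = raw_video_bytes(150, 60, 1920, 1080)

# Communication examples
speech_bps = 8000 * 8
cd_audio_bps = 44100 * 16 * 2
ntsc_bps = raw_video_bitrate(30, 720, 480)
hdtv_bps = raw_video_bitrate(60, 1920, 1080)

if __name__ == "__main__":
    print(f"text file : {text_file_bytes / 1e3:.0f} KB")
    print(f"CD song   : {cd_song_bytes / 1e6:.1f} MB")
    print(f"NTSC movie: {ntsc_movie_bytes / 1e9:.1f} GB")
    print(f"HDTV movie: {hdtv_movie_bytes / 1e12:.2f} TB")
    print(f"speech    : {speech_bps / 1e3:.0f} kbit/s")
    print(f"CD audio  : {cd_audio_bps / 1e6:.2f} Mbit/s")
    print(f"NTSC video: {ntsc_bps / 1e6:.0f} Mbit/s")
    print(f"HDTV video: {hdtv_bps / 1e9:.2f} Gbit/s")
```

Comparing these numbers with the channel capacities listed above shows why raw video cannot be sent over any of the listed links without compression.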

Why Possible?
Compression is performed to remove the redundancy inherent in the image/video:
spatial, temporal, statistical and psychovisual redundancies.

Compression methods
  Lossless: redundancy removal is reversible;
  Lossy: irrelevancy removal is irreversible.

Lossy coding: Distortion measure


Distortion measures
  Sum of absolute differences (SAD), sum of squared errors (SSE)
  Peak signal-to-noise ratio (PSNR)
Rate-distortion tradeoff
  Lower bit rate => higher distortion
  Higher bit rate => lower distortion
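
The three distortion measures named above are straightforward to compute. The following is a minimal sketch for 8-bit samples stored as flat lists; the function names and the peak value of 255 are assumptions made for illustration.

```python
import math

def sad(orig, recon):
    """Sum of absolute differences between two equally long sample lists."""
    return sum(abs(a - b) for a, b in zip(orig, recon))

def sse(orig, recon):
    """Sum of squared errors."""
    return sum((a - b) ** 2 for a, b in zip(orig, recon))

def psnr(orig, recon, peak=255):
    """Peak signal-to-noise ratio in dB: 10 * log10(peak^2 / MSE)."""
    mse = sse(orig, recon) / len(orig)
    if mse == 0:
        return float("inf")  # identical signals
    return 10 * math.log10(peak ** 2 / mse)

if __name__ == "__main__":
    original = [139, 144, 149, 153]
    decoded = [142, 144, 147, 150]
    print(sad(original, decoded), sse(original, decoded), psnr(original, decoded))
```

A higher PSNR means lower distortion, so along a rate-distortion curve PSNR normally rises as the bit rate increases.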

Lossy coding issues

Bit rate (R)
Distortion (D)
Coding delay (important for real-time systems)
Complexity (important for software/hardware cost)
Sensitivity to model deviation (robustness)
Sensitivity to channel errors (robustness, error concealment)

Two coding approaches


Waveform coding
  Represent a signal such that, after source coding, a good approximation of the original waveform remains.
  E.g. DPCM, JPEG, MPEG-1/2/4, H.261/3/4, JPEG 2000, GIF, MP3
  Typically distortion is due to quantization.

Parametric coding
  Assume the signal is generated from a model with parameters.
  The encoder extracts and encodes the model parameters from the signal.
  A good approximation of the waveform may not be obtained, but the result could be a sound/image that resembles the original perceptually.
  E.g. fractals, model-based coding, LPC, CELP, ...

Video coding standards?


What is a standard?

International standards
National standards
Industry standards
De facto standards

Mandatory? Optional? MP3? JPEG?
Why standard?
The standards war...

Standard Organizations
Three main international organizations:
  ITU-T: International Telecommunication Union, Telecommunication Standardization Sector;
  ISO: International Organization for Standardization;
  IEC: International Electrotechnical Commission.

These organizations form different groups to develop various standards:
  ITU-T forms the VCEG (Video Coding Experts Group);
  ISO and IEC form the MPEG (Moving Picture Experts Group);
  ITU-T and ISO/IEC form the JPEG (Joint Photographic Experts Group).

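ISO/IEC with subgroups for video

[Figure: ISO/IEC organization chart with the subgroups for video]

Standardization History

[Figure: standardization timelines for video and for image coding]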

Outline

Introduction
Lossless Compression
Image Compression
Video Compression
Simulation Results
Conclusion

Entropy coding
Entropy is a measure of randomness.
Entropy H is non-negative.

$$H(q_i) = -\sum_{j=1}^{N} p_j \log_2 p_j$$

Theorem: The average codeword length S of a uniquely decodable binary code satisfies S >= H.
Theorem: A prefix-condition code for groups of M symbols exists with an average number of bits per symbol S < H(q_i) + 1/M.

Lossless Compression Techniques

Huffman Coding
Arithmetic Coding
LZW Coding
Run-length Coding
DPCM Coding

Huffman Coding
Properties

  Fixed-length input, variable-length output;
  Sensitive to channel errors;
  Encode/decode using table look-up;
  Sensitive to changes in signal statistics.

Applications
  Modified Huffman coding in JPEG and H.261;
  2D and 3D Huffman coding in MPEG and H.263.

Examples (one symbol)


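P(A) = 3/4, P(B) = 3/16, P(C) = 1/16

Huffman code:

Symbol  Probability  Codeword
A       3/4          0
B       3/16         10
C       1/16         11

Average length per symbol (S) is
S = 1 x 3/4 + 2 x 3/16 + 2 x 1/16 = 1.25 bit/symbol

The entropy (H) is defined as

$$H = -\sum_i p_i \log_2 p_i$$

which here gives

$$H = -\left(\tfrac{3}{4}\log_2\tfrac{3}{4} + \tfrac{3}{16}\log_2\tfrac{3}{16} + \tfrac{1}{16}\log_2\tfrac{1}{16}\right) \approx 1.014$$

Theoretically, S >= H. S can be made arbitrarily close to H by coding N symbols at a time (a meaningful improvement of about 20% is possible).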

Huffman Coding Example

Symbol  Probability  Codeword length
A       0.6          1
B       0.2          2
C       0.1          3
D       0.05         4
E       0.025        5
F       0.0125       6
G       0.00625      7
H       0.00625      7

Ave. codeword length =
1x0.6 + 2x0.2 + 3x0.1 + 4x0.05 + 5x0.025 + 6x0.0125 + 7x0.00625x2 = 1.7875 bit/symbol
Entropy = H = 1.7585

Example (two symbols)

Pair  Probability
AA    9/16
AB    9/64
AC    3/64
BA    9/64
BB    9/256
BC    3/256
CA    3/64
CB    3/256
CC    1/256

Ave. length = 2.09 bit/codeword = 1.045 bit/symbol
Much closer to H = 1.014!
Coding multiple symbols at a time => lower average length per symbol
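
The one-symbol and two-symbol examples can be checked numerically. The sketch below builds Huffman codeword lengths with a heap and compares the average length with the entropy; the helper names are illustrative. Under these assumptions an optimal Huffman code for the nine pairs averages 531/256, about 2.07 bits per pair (about 1.04 bit/symbol), a shade under the 2.09 quoted above, which may reflect a different code assignment or rounding on the slide.

```python
import heapq
import math
from fractions import Fraction
from itertools import product

def huffman_lengths(probs):
    """Return {symbol: codeword length} for a Huffman code over probs."""
    # Heap entries: (probability, tie-breaker, symbols in this subtree)
    heap = [(p, i, [s]) for i, (s, p) in enumerate(probs.items())]
    heapq.heapify(heap)
    lengths = dict.fromkeys(probs, 0)
    counter = len(heap)
    while len(heap) > 1:
        p1, _, s1 = heapq.heappop(heap)
        p2, _, s2 = heapq.heappop(heap)
        for s in s1 + s2:          # every merge adds one bit to these symbols
            lengths[s] += 1
        heapq.heappush(heap, (p1 + p2, counter, s1 + s2))
        counter += 1
    return lengths

def average_length(probs):
    lengths = huffman_lengths(probs)
    return sum(p * lengths[s] for s, p in probs.items())

def entropy(probs):
    return -sum(float(p) * math.log2(float(p)) for p in probs.values())

SINGLE = {"A": Fraction(3, 4), "B": Fraction(3, 16), "C": Fraction(1, 16)}
PAIRS = {x + y: SINGLE[x] * SINGLE[y] for x, y in product(SINGLE, repeat=2)}

if __name__ == "__main__":
    print("entropy          :", round(entropy(SINGLE), 4), "bit/symbol")
    print("one symbol       :", float(average_length(SINGLE)), "bit/symbol")
    print("two symbols, pair:", float(average_length(PAIRS)), "bit/pair")
    print("two symbols      :", float(average_length(PAIRS)) / 2, "bit/symbol")
```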

Arithmetic Coding
Properties

  Fixed-length input, variable-length output
  Optimal: minimum average codeword length
  Sensitive to channel errors
  Adapts to changing signal statistics

Application
  Image coding standards like JBIG, JPEG, JPEG 2000;
  Video coding standards like H.263 and H.264/MPEG-4 AVC

Arithmetic Coding

  Flexible: can use an adaptive model of signal statistics
  Optimal: minimum average codeword length
  Slower than Huffman and Ziv-Lempel
  No random access
  Potentially unbounded output delay
  Need to indicate end of file
  Poor error resistance; errors propagate
  Compresses better than Huffman in JPEG, but is more complicated

Basic algorithm of AC

1. Initialize the current interval [L, H) to [0, 1).
2. For each symbol of the file, do:
   a) Subdivide the current interval into subintervals, one for each possible alphabet symbol. The size of a symbol's subinterval is proportional to the estimated probability that the symbol will be the next symbol in the file, according to the model of the input.
   b) Select the subinterval corresponding to the symbol that actually occurs next in the file, and make it the current interval.
3. Output enough bits to distinguish the final current interval from all other possible final intervals: about $\lceil -\log_2 P \rceil + 1$ bits, where P is the width of the final interval.

Example (Arithmetic coding)


P(a) = 0.5, P(b) = 0.4, P(c = EOF) = 0.1

Current interval  Action       Subinterval a    Subinterval b    Subinterval c    Input
[0.0, 1.0)        Subdivide    [0.0, 0.5)       [0.5, 0.9)       [0.9, 1.0)       a
[0.0, 0.5)        Subdivide    [0.0, 0.25)      [0.25, 0.45)     [0.45, 0.5)      a
[0.0, 0.25)       Subdivide    [0.0, 0.125)     [0.125, 0.225)   [0.225, 0.25)    b
[0.125, 0.225)    Subdivide    [0.125, 0.175)   [0.175, 0.215)   [0.215, 0.225)   c
[0.215, 0.225)    End of file

Example (Arithmetic coding), continued

To encode the interval [0.215, 0.225), we need a number with enough precision to specify the interval.
Width of interval = 0.01.
Half width = 0.005.
Need 8 bits of precision to specify the half width:
2^-8 = 0.0039 < 0.005, 2^-7 = 0.0078 > 0.005

Target interval: [0.215, 0.225)

Bit  Step size    Lower half / upper half
0    0.5          [0.0, 0.5) / [0.5, 1.0)
0    0.25         [0.0, 0.25) / [0.25, 0.5)
1    0.125        [0.0, 0.125) / [0.125, 0.25)
1    0.0625       [0.125, 0.1875) / [0.1875, 0.25)
1    0.03125      [0.1875, 0.21875) / [0.21875, 0.25)
0    0.015625     [0.21875, 0.234375) / [0.234375, 0.25)
0    0.0078125    [0.21875, 0.2265625) / [0.2265625, 0.234375)
0    0.00390625   [0.21875, 0.22265625) / [0.22265625, 0.2265625)  (lower half completely within target => stop)

Codeword = 00111000 = 0.21875
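
The interval narrowing and the bit-by-bit search for a codeword can be sketched as follows. This is a minimal illustration with exact fractions, not a production coder. The names `narrow` and `dyadic_codeword` are hypothetical, and the rule used when the target straddles the midpoint (take the half with the larger overlap) is an assumption that happens to reproduce the choices made in the worked example.

```python
from fractions import Fraction as F

# Model of the example: P(a) = 0.5, P(b) = 0.4, P(c = EOF) = 0.1
MODEL = [("a", F(1, 2)), ("b", F(2, 5)), ("c", F(1, 10))]

def narrow(symbols, model=MODEL):
    """Steps 1-2 of the basic algorithm: return the final interval [low, high)."""
    low, width = F(0), F(1)
    for s in symbols:
        cum = F(0)
        for sym, p in model:
            if sym == s:
                low += width * cum   # move to the start of the symbol's subinterval
                width *= p           # shrink in proportion to its probability
                break
            cum += p
        else:
            raise ValueError(f"unknown symbol {s!r}")
    return low, low + width

def dyadic_codeword(low, high):
    """Halve [0, 1) repeatedly until the chosen half lies inside [low, high)."""
    bits, c, size = "", F(0), F(1)
    while not (low <= c and c + size <= high):
        size /= 2
        mid = c + size
        lower_overlap = min(high, mid) - max(low, c)
        upper_overlap = min(high, mid + size) - max(low, mid)
        if upper_overlap > lower_overlap:   # assumed tie-break rule, see lead-in
            bits += "1"
            c = mid
        else:
            bits += "0"
    return bits, c

if __name__ == "__main__":
    low, high = narrow("aabc")
    print(float(low), float(high))     # final interval of the example
    print(dyadic_codeword(low, high))  # codeword bits and their value
```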

Modified Arithmetic Coding

a) If the new subinterval is not entirely within one of these intervals: [0, 1/2), [1/4, 3/4), or [1/2, 1), then stop iterating and return.
b) If the new subinterval lies entirely within [0, 1/2), then output 0 and any 1s left over from previous follow symbols (one 1 for each follow symbol), and double the size of the interval by expanding from the left boundary (0) towards the right.
c) If the new subinterval lies entirely within [1/2, 1), then output 1 and any 0s left over from previous follow symbols (one 0 for each follow), and double the size of the current interval by expanding it from the right boundary (1) towards the left. For example: interval [0.72, 0.8) will become [0.44, 0.6), because 2*dist(0.72, 1) = 0.56, 1 - 0.56 = 0.44 => left point; new width = 2*(0.8 - 0.72) = 0.16; 0.44 + 0.16 = 0.6 => right point.
d) If the new subinterval lies entirely within [1/4, 3/4), apply a follow symbol and double the size of the current interval by expanding it in both directions away from the midpoint (1/2). For example: interval [0.48, 0.588) will become [0.46, 0.676), because 2*dist(0.48, 0.5) = 0.04, 0.5 - 0.04 = 0.46 => left point; 2*dist(0.588, 0.5) = 0.176, 0.5 + 0.176 = 0.676 => right point.
e) Go to (a).
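
Steps (a) to (e) amount to a renormalisation loop run after every symbol. A minimal sketch with exact fractions is given below; `encode_with_follow` is a hypothetical name, the model is the one used in the modified AC example (P(a) = 0.4, P(b) = 0.2, P(c) = 0.4), and the final termination bits are deliberately left out because the slides only state them for that example.

```python
from fractions import Fraction as F

MODEL = [("a", F(2, 5)), ("b", F(1, 5)), ("c", F(2, 5))]
HALF, QUARTER, THREE_QUARTERS = F(1, 2), F(1, 4), F(3, 4)

def encode_with_follow(symbols, model=MODEL):
    """Return (bits emitted so far, final interval, pending follow count)."""
    low, high = F(0), F(1)
    follow = 0
    out = []
    for s in symbols:
        # Subdivide the current interval and select the symbol's subinterval.
        width, cum = high - low, F(0)
        for sym, p in model:
            if sym == s:
                low, high = low + width * cum, low + width * (cum + p)
                break
            cum += p
        else:
            raise ValueError(f"unknown symbol {s!r}")
        # Steps (a)-(e): expand while the interval fits one of the three ranges.
        while True:
            if high <= HALF:                       # (b) inside [0, 1/2)
                out.append("0" + "1" * follow)
                follow = 0
                low, high = 2 * low, 2 * high
            elif low >= HALF:                      # (c) inside [1/2, 1)
                out.append("1" + "0" * follow)
                follow = 0
                low, high = 2 * low - 1, 2 * high - 1
            elif low >= QUARTER and high <= THREE_QUARTERS:   # (d) inside [1/4, 3/4)
                follow += 1
                low, high = 2 * low - HALF, 2 * high - HALF
            else:                                  # (a) nothing applies: stop
                break
    return "".join(out), (low, high), follow

if __name__ == "__main__":
    print(encode_with_follow("ba"))
```

For the input b, a this emits 011 and leaves the interval [0.2, 0.84), which matches the "Expand (Output 0, 1, 1)" row of the example.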

Example (Modified AC)

P(a) = 0.4, P(b) = 0.2, P(c) = 0.4. Assume a = EOF.

Current interval  Action                    Subinterval a  Subinterval b  Subinterval c  Input
[0.0, 1.0)        Subdivide                 [0.0, 0.4)     [0.4, 0.6)     [0.6, 1.0)     b
[0.4, 0.6)        Expand (Follow)
[0.3, 0.7)        Expand (Follow)
[0.1, 0.9)        Subdivide                 [0.1, 0.42)    [0.42, 0.58)   [0.58, 0.9)    a
[0.1, 0.42)       Expand (Output 0, 1, 1)
[0.2, 0.84)       End (Output 01 or 10)*

Therefore, codeword = 01110 or 01101.
cw(01110) = 0.25 + 0.125 + 0.0625 = 0.4375
cw(01101) = 0.25 + 0.125 + 0.03125 = 0.40625
* Need to output 10 or 01 for input x, where P(x1) = 0.2, P(x2) = 0.64, P(x3) = 0.16

Example (old method AC)

P(a) = 0.4, P(b) = 0.2, P(c) = 0.4

Current interval  Action       Subinterval a  Subinterval b  Subinterval c  Input
[0.0, 1.0)        Subdivide    [0.0, 0.4)     [0.4, 0.6)     [0.6, 1.0)     b
[0.4, 0.6)        Subdivide    [0.4, 0.48)    [0.48, 0.52)   [0.52, 0.60)   a
[0.4, 0.48)       End of file

Target interval: [0.4, 0.48)

Bit  Step size  Lower half / upper half
0    0.5        [0.0, 0.5) / [0.5, 1.0)
1    0.25       [0.0, 0.25) / [0.25, 0.5)
1    0.125      [0.25, 0.375) / [0.375, 0.5)
1    0.0625     [0.375, 0.4375) / [0.4375, 0.5)
0    0.03125    [0.4375, 0.46875) / [0.46875, 0.5)  (lower half entirely within [0.4, 0.48))

Codeword = 01110 (can also be 01101)
Same as modified AC.

Example (Modified AC Decode)

P(a) = 0.4, P(b) = 0.2, P(c) = 0.4. Assume a = EOF. cw(01110) = 0.4375

Current interval  Action           Subinterval a  Subinterval b  Subinterval c  Codeword value (cw)            Output
[0.0, 1.0)        Subdivide        [0.0, 0.4)     [0.4, 0.6)     [0.6, 1.0)     0.4375                         b
[0.4, 0.6)        Expand (Follow)                                               0.5 - 2*(0.5 - 0.4375) = 0.375
[0.3, 0.7)        Expand (Follow)                                               0.5 - 2*(0.5 - 0.375) = 0.25
[0.1, 0.9)        Subdivide        [0.1, 0.42)    [0.42, 0.58)   [0.58, 0.9)    0.25                           a

Stop decoding because a = end-of-file. (cw = codeword)

Example (Modified AC Decode)

P(a) = 0.4, P(b) = 0.2, P(c) = 0.4. Assume a = EOF. cw(01101) = 0.40625

Current interval  Action           Subinterval a  Subinterval b  Subinterval c  Codeword value (cw)              Output
[0.0, 1.0)        Subdivide        [0.0, 0.4)     [0.4, 0.6)     [0.6, 1.0)     0.40625                          b
[0.4, 0.6)        Expand (Follow)                                               0.5 - 2*(0.5 - 0.40625) = 0.3125
[0.3, 0.7)        Expand (Follow)                                               0.5 - 2*(0.5 - 0.3125) = 0.125
[0.1, 0.9)        Subdivide        [0.1, 0.42)    [0.42, 0.58)   [0.58, 0.9)    0.125                            a

Stop decoding because a = end-of-file. (cw = codeword)
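
The two decoding traces follow the same pattern: pick the subinterval that contains the codeword value, then apply to both the interval and the codeword value the same expansion the encoder applied. A minimal sketch is shown below; `decode` is a hypothetical name, and treating the codeword as one exact fraction (rather than reading bits incrementally) is a simplification made for illustration.

```python
from fractions import Fraction as F

MODEL = [("a", F(2, 5)), ("b", F(1, 5)), ("c", F(2, 5))]
HALF, QUARTER, THREE_QUARTERS = F(1, 2), F(1, 4), F(3, 4)

def decode(cw, model=MODEL, eof="a"):
    """Decode symbols from the codeword value cw until the EOF symbol appears."""
    low, high = F(0), F(1)
    out = []
    while True:
        # Find the subinterval of [low, high) that contains cw.
        width, cum = high - low, F(0)
        for sym, p in model:
            lo_s = low + width * cum
            hi_s = lo_s + width * p
            if lo_s <= cw < hi_s:
                break
            cum += p
        else:
            raise ValueError("codeword value outside the current interval")
        out.append(sym)
        low, high = lo_s, hi_s
        if sym == eof:
            return "".join(out)
        # Mirror the encoder's expansions on the interval and on cw.
        while True:
            if high <= HALF:
                low, high, cw = 2 * low, 2 * high, 2 * cw
            elif low >= HALF:
                low, high, cw = 2 * low - 1, 2 * high - 1, 2 * cw - 1
            elif low >= QUARTER and high <= THREE_QUARTERS:
                low, high, cw = 2 * low - HALF, 2 * high - HALF, 2 * cw - HALF
            else:
                break

if __name__ == "__main__":
    print(decode(F(7, 16)))    # cw(01110) = 0.4375
    print(decode(F(13, 32)))   # cw(01101) = 0.40625
```

Both codewords decode to the same two symbols, b followed by the end-of-file symbol a, which is why either may be transmitted.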

Outline

Introduction
Lossless Compression
Image Compression
Video Compression
Simulation Results
Conclusion

Image Compression
Applications: Internet, digital photography, medical imaging, remote sensing, surveillance, facsimile, etc.
Standardization of image compression
  JPEG became an International Standard in 1992.
  JPEG 2000 was finalized in 2002.
General structure of image coding standards:

[Figure: general structure of image coding standards]

Image Standards

JPEG (lossy and lossless): ITU-T T.81, ISO/IEC 10918-1
JPEG extensions: ITU-T T.84
JPEG-LS (lossless, improved): ITU-T T.87, ISO/IEC 14495-1
JBIG (lossless, bi-level pictures, fax): ITU-T T.82, ISO/IEC 11544
JBIG2 (bi-level pictures): ITU-T T.88, ISO/IEC 14492
JPEG 2000: ITU-T T.800, ISO/IEC 15444-1
JPEG 2000 extensions: ITU-T T.801
JPEG XR (formerly called HD Photo prior to standardization): ITU-T T.832, ISO/IEC 29199-2

JPEG
History
  Work started in the mid-1980s;
  Designed for compressing grayscale and color still images;
  Became an international standard in 1992.

Application
  The JPEG coding standard still serves as the most widely used compression algorithm today.
  Its applications can be found in diverse storage and transmission domains, such as the Internet, digital professional and consumer photography, and video.

Goals of JPEG

Four modes of operation:
a) Sequential encoding: image encoded in a single left-to-right, top-to-bottom scan (includes the baseline sequential codec)
b) Progressive encoding: image encoded in multiple scans with progressively refined details, for image reconstruction in multiple coarse-to-clear passes
c) Lossless encoding: image encoded to guarantee exact recovery of every source image sample value (though the compression ratio is low compared to the lossy modes)
d) Hierarchical encoding: image encoded at multiple resolutions such that a lower-resolution image is accessible without decompressing the higher or even full resolution

Sequential Mode of JPEG

3 major steps: DCT, quantization, entropy coding

Encoder: Image source -> FDCT -> Quantizer (table) -> Entropy encoder -> Compressed image
Decoder: Compressed image -> Entropy decoder -> Dequantizer (table) -> IDCT -> Reconstructed image

Discrete Cosine Transform (DCT)


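$$F(u,v) = \frac{1}{4}\,C(u)\,C(v)\sum_{x=0}^{7}\sum_{y=0}^{7} f(x,y)\,\cos\frac{(2x+1)u\pi}{16}\,\cos\frac{(2y+1)v\pi}{16}$$

$$f(x,y) = \frac{1}{4}\sum_{u=0}^{7}\sum_{v=0}^{7} C(u)\,C(v)\,F(u,v)\,\cos\frac{(2x+1)u\pi}{16}\,\cos\frac{(2y+1)v\pi}{16}$$

where $C(u) = 1/\sqrt{2}$ for $u = 0$ and $C(v) = 1/\sqrt{2}$ for $v = 0$; $C(u) = 1$, $C(v) = 1$ otherwise.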

Quantization
Quantization is a many-to-one mapping and thus lossy.
It is the principal source of distortion in a DCT-based encoder.
Quantization is defined as division of each DCT coefficient by its corresponding quantizer step size, followed by rounding to the nearest integer (normalized by the quantizer step size):

$$F^{Q}(u,v) = \mathrm{round}\left(\frac{F(u,v)}{Q(u,v)}\right)$$

Entropy Coding
Each quantized DC coefficient is encoded as the difference from the DC term of the previous block in encoding order.
This special treatment is worthwhile as DC coefficients frequently contain a significant fraction of the total image energy.

DCT/Quantization Example
a) Input image

139 144 149 153 155 155 155 155
144 151 153 156 159 156 156 156
150 155 160 163 158 156 156 156
159 161 162 160 160 159 159 159
159 160 161 162 162 155 155 155
161 161 161 161 160 157 157 157
162 162 161 163 162 157 157 157
162 162 161 161 163 158 158 158

b) Forward DCT coefficients

1259.6   -1.0  -12.1   -5.2    2.1   -1.7   -2.7    1.3
 -22.6  -17.5   -6.2   -3.2   -2.9   -0.1    0.4   -1.2
 -10.9   -9.3   -1.6    1.5    0.2   -0.9   -0.6   -0.1
  -7.1   -1.9    0.2    1.5    0.9   -0.1    0.0    0.3
  -0.6   -0.8    1.5    1.6   -0.1   -0.7    0.6    1.3
   1.8   -0.2    1.6   -0.3   -0.8    1.5    1.0   -1.0
  -1.3   -0.4   -0.3   -1.5   -0.5    1.7    1.1   -0.8
  -2.6    1.6   -3.8   -1.8    1.9    1.2   -0.6   -0.4

c) Quantization table

16  11  10  16  24  40  51  61
12  12  14  19  26  58  60  55
14  13  16  24  40  57  69  56
14  17  22  29  51  87  80  62
18  22  37  56  68 109 103  77
24  35  55  64  81 104 113  92
49  64  78  87 103 121 120 101
72  92  95  98 112 100 103  99

d) Quantized DCT coefficients

79   0  -1   0   0   0   0   0
-2  -1   0   0   0   0   0   0
-1  -1   0   0   0   0   0   0
-1   0   0   0   0   0   0   0
 0   0   0   0   0   0   0   0
 0   0   0   0   0   0   0   0
 0   0   0   0   0   0   0   0
 0   0   0   0   0   0   0   0

e) Dequantized DCT coefficients

1264    0  -10    0    0    0    0    0
 -24  -12    0    0    0    0    0    0
 -14  -13    0    0    0    0    0    0
 -14    0    0    0    0    0    0    0
   0    0    0    0    0    0    0    0
   0    0    0    0    0    0    0    0
   0    0    0    0    0    0    0    0
   0    0    0    0    0    0    0    0

f) Reconstructed image

142 144 147 150 152 153 154 154
149 150 153 155 156 157 156 156
157 158 159 161 161 160 159 158
162 162 163 163 162 160 158 157
162 162 162 162 161 158 156 155
160 161 161 161 160 158 156 154
160 160 161 162 161 160 158 157
160 161 163 164 164 163 161 160
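
The DCT, quantization and dequantization steps of this example can be reproduced with a direct (slow) implementation of the 8x8 DCT formula. The sketch below is illustrative only: the function names are hypothetical, the input block uses 159 as the fifth sample of the second row because that is the value consistent with the DC coefficient 1259.6 (sum of all samples divided by 8), and Python's built-in `round` stands in for "round to nearest integer".

```python
import math

BLOCK = [
    [139, 144, 149, 153, 155, 155, 155, 155],
    [144, 151, 153, 156, 159, 156, 156, 156],
    [150, 155, 160, 163, 158, 156, 156, 156],
    [159, 161, 162, 160, 160, 159, 159, 159],
    [159, 160, 161, 162, 162, 155, 155, 155],
    [161, 161, 161, 161, 160, 157, 157, 157],
    [162, 162, 161, 163, 162, 157, 157, 157],
    [162, 162, 161, 161, 163, 158, 158, 158],
]

QTABLE = [
    [16, 11, 10, 16, 24, 40, 51, 61],
    [12, 12, 14, 19, 26, 58, 60, 55],
    [14, 13, 16, 24, 40, 57, 69, 56],
    [14, 17, 22, 29, 51, 87, 80, 62],
    [18, 22, 37, 56, 68, 109, 103, 77],
    [24, 35, 55, 64, 81, 104, 113, 92],
    [49, 64, 78, 87, 103, 121, 120, 101],
    [72, 92, 95, 98, 112, 100, 103, 99],
]

def _c(k):
    return 1 / math.sqrt(2) if k == 0 else 1.0

def _basis(i, u):
    return math.cos((2 * i + 1) * u * math.pi / 16)

def fdct(block):
    """Forward 8x8 DCT; the first index of the result is the vertical frequency."""
    return [[0.25 * _c(u) * _c(v) * sum(block[i][j] * _basis(i, u) * _basis(j, v)
                                        for i in range(8) for j in range(8))
             for v in range(8)] for u in range(8)]

def idct(coef):
    """Inverse 8x8 DCT."""
    return [[0.25 * sum(_c(u) * _c(v) * coef[u][v] * _basis(i, u) * _basis(j, v)
                        for u in range(8) for v in range(8))
             for j in range(8)] for i in range(8)]

def quantize(coef, table):
    return [[round(coef[u][v] / table[u][v]) for v in range(8)] for u in range(8)]

def dequantize(levels, table):
    return [[levels[u][v] * table[u][v] for v in range(8)] for u in range(8)]

if __name__ == "__main__":
    coef = fdct(BLOCK)
    levels = quantize(coef, QTABLE)
    recon = idct(dequantize(levels, QTABLE))
    for row in levels:
        print(row)
    for row in recon:
        print([round(x) for x in row])
```

Only a handful of low-frequency levels survive quantization, which is what makes the subsequent entropy coding effective.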

JPEG Lossless Mode

1. JPEG discovered that a DCT-based lossless mode was difficult to define as a practical standard against which encoders and decoders could be independently implemented, without placing severe constraints on both encoder and decoder implementations.
   Instead, JPEG has chosen a simple predictive method which is wholly independent of DCT processing.

2. The predictive method produces results which, in light of its simplicity, are surprisingly close to the state of the art for lossless continuous-tone compression.
   Lossless codecs typically produce around 2:1 compression for color images with moderately complex scenes.

3. A predictor combines the values of up to 3 neighbouring samples (A, B, C) to form a prediction of the sample indicated by X, and the difference is encoded losslessly by either Huffman or arithmetic coding.
   The encoder can use any source image precision from 2 to 16 bits per sample, and can use any of the predictors except selection value 0.
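
The prediction step of the lossless mode can be sketched as follows. The seven predictor formulas are taken from the JPEG standard (ITU-T T.81) rather than from the slides, with A the sample to the left of X, B the sample above and C the sample above-left. The function names are hypothetical, and the entropy coding of the differences and the modulo-2^16 arithmetic of the standard are omitted.

```python
def predict(a, b, c, selection):
    """Predictors for selection values 1..7 (a = left, b = above, c = above-left)."""
    if selection == 1:
        return a
    if selection == 2:
        return b
    if selection == 3:
        return c
    if selection == 4:
        return a + b - c
    if selection == 5:
        return a + ((b - c) >> 1)
    if selection == 6:
        return b + ((a - c) >> 1)
    if selection == 7:
        return (a + b) >> 1
    raise ValueError("selection value must be 1..7")

def _prediction(img, y, x, selection, precision):
    if y == 0 and x == 0:
        return 1 << (precision - 1)       # first sample: mid-range value
    if y == 0:
        return img[y][x - 1]              # first row: left neighbour
    if x == 0:
        return img[y - 1][x]              # first column: neighbour above
    return predict(img[y][x - 1], img[y - 1][x], img[y - 1][x - 1], selection)

def residuals(img, selection, precision=8):
    """Differences between each sample and its prediction."""
    return [[img[y][x] - _prediction(img, y, x, selection, precision)
             for x in range(len(img[0]))] for y in range(len(img))]

def reconstruct(res, selection, precision=8):
    """Invert residuals(); predictions use already reconstructed samples."""
    img = [[0] * len(res[0]) for _ in res]
    for y in range(len(res)):
        for x in range(len(res[0])):
            img[y][x] = res[y][x] + _prediction(img, y, x, selection, precision)
    return img

IMG = [
    [139, 144, 149, 153],
    [144, 151, 153, 156],
    [150, 155, 160, 163],
    [159, 161, 162, 160],
]

if __name__ == "__main__":
    for sel in range(1, 8):
        print(sel, residuals(IMG, sel))
```

The residuals are small compared with the sample values, which is what the Huffman or arithmetic coder then exploits.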

JPEG Progressive Mode

Each image component is encoded in multiple scans rather than in a single scan.
The first scan encodes a rough but recognizable version of the image, which can be transmitted quickly in comparison to the total transmission time.
Subsequent scans refine the image progressively to finally reach the level of picture quality that was established by the quantization tables.

To achieve this requires the addition of an image-sized buffer memory at the output of the quantizer, before the input to the entropy encoder.
The buffer memory must be sufficiently large to store the image as quantized DCT coefficients, each of which is 3 bits larger in size than the source image samples.

There are two complementary methods by which a block of quantized DCT coefficients may be partially encoded.
In the first method, only the N low-frequency DCT coefficients need to be encoded. This is called spectral selection. The other, high-frequency coefficients are sent in succeeding scans.

In the second method, the coefficients need not be encoded to their full (quantized) accuracy in a given scan.
Initially the N most significant bits are encoded.
In subsequent scans, the less significant bits can be encoded.
This is called successive approximation.

JPEG Hierarchical Mode (pyramidal coding)

a) The image is filtered and downsampled (decimated) by the desired number of multiples of 2 in each dimension.
b) Encode the reduced-size image using one of the sequential DCT, progressive DCT or lossless encoders.
c) Decode this reduced-size image, then interpolate and upsample it by a factor of 2 horizontally and vertically, using an identical interpolation filter which the receiver must also use.
d) Use this upsampled image as a prediction of the original at this resolution, and encode the difference image using one of the sequential DCT, progressive DCT or lossless encoders.
e) Repeat until the full resolution of the image is encoded.

Encoding in steps (b) and (d) may be done using only DCT-based processes, only lossless processes, or DCT-based processes with a final lossless process for each component.
Useful in applications in which a very high resolution image must be accessed by a lower-resolution device, which does not have the buffer capacity to reconstruct the image at its full resolution and then scale it down for lower-resolution display.

JPEG
Key technology
  Lossless coding scheme: predictive coding method using neighboring pixel values
  Lossy mode: the well-known block-based DCT serves as the baseline of JPEG
  Block artifacts

[Figure: JPEG, Cameraman at 8 bits/pixel (original) and coded at 0.15, 0.5 and 0.8 bits/pixel]

[Figure: JPEG, Lena at 8 bits/pixel (original) and coded at 0.15, 0.5 and 0.8 bits/pixel]

JPEG 2000
History
  Started in 1998;
  Aimed at improving the quality and capability of JPEG;
  Approved as an international standard in 2002.

Application
  Remote sensing, color fax, printing, scanning, digital photography, medical imagery, digital libraries/archives, Internet, e-commerce, etc.

JPEG 2000 Tiling


Each color component is divided into rectangular tiles (e.g. 64x64 non-overlapping blocks).
Each tile is encoded independently. For each tile, apply the wavelet transform, quantization, form precincts and code blocks, EBCOT, AC.
Bad: slightly lower compression efficiency with tiling than without.
Good: lower memory requirement, random access.
Arithmetic coding used: MQ-coder (also used in JBIG2, similar to the QM-coder in JPEG)

Block Diagram of JPEG 2000 (Encoder)

Preprocessing
  Subtract 128 from each (unsigned) input value
Forward Intercomponent Transform
  RGB to YUV (reversible) or YCbCr (irreversible)
  Decorrelation, possible 2:1 subsampling in UV or CbCr
  Subsampling is a useful data reduction method in JPEG, but not so much in JPEG 2000 (can discard HL, LH, HH)

Forward Intracomponent Transform
  Discrete Wavelet Transform (DWT) of non-overlapping tiles
  Daubechies 9-tap/7-tap wavelet filter for the irreversible/lossy transform
  5-tap/3-tap filter for the reversible/lossless transform (allows repetitive encoding/decoding without additional loss)
  Implemented by convolution or lifting, using periodic symmetric extension to handle the boundary effect
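
The reversible intercomponent transform mentioned above can be illustrated with integer arithmetic. The formulas below are those of the reversible colour transform in JPEG 2000 Part 1, not something stated on the slides, and the function names are hypothetical; the preprocessing level shift is left out.

```python
def rct_forward(r, g, b):
    """Reversible colour transform: integer RGB -> (Y, U, V)."""
    y = (r + 2 * g + b) >> 2     # floor((R + 2G + B) / 4)
    u = b - g
    v = r - g
    return y, u, v

def rct_inverse(y, u, v):
    """Exact inverse of rct_forward."""
    g = y - ((u + v) >> 2)       # floor((U + V) / 4), floor also for negatives
    r = v + g
    b = u + g
    return r, g, b

if __name__ == "__main__":
    for rgb in [(0, 0, 0), (255, 255, 255), (200, 30, 90), (12, 250, 7)]:
        yuv = rct_forward(*rgb)
        print(rgb, "->", yuv, "->", rct_inverse(*yuv))
```

Because only integer additions and shifts are involved, the round trip is exact, which is the property that lossless coding needs.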

[Figure: image wavelet transform]

Quantization
  Step size calculated from rate control
Precinct partitioning
  3 spatially consistent rectangles are grouped to form a packet partition location, or precinct
  Each precinct is divided into non-overlapping code blocks, scanned in a particular order

Tier-1 Encoder
  Bit planes of coefficients within a code block are entropy encoded using embedded block coding with optimal truncation (EBCOT)
  EBCOT: Arithmetic Coding (AC) followed by post-compression rate-distortion (PCRD) optimized truncation
  Certain ROIs can be coded at higher quality
Tier-2 Encoder
  Referred to as packetization.

EBCOT (Embedded Block Coding with Optimal Truncation)

EBC (Embedded Block Coding)

Each bit plane of a code block is encoded by 3 coding passes:
  Significance pass (AC or direct)
  Refinement pass (AC or direct)
  Cleanup pass (AC)
Exception: the highest bit plane is encoded with the cleanup pass only

JPEG 2000

[Figure: JPEG 2000, Cameraman at 8 bits/pixel (original) and coded at 0.15, 0.5 and 0.8 bits/pixel]

[Figure: JPEG 2000, Lena at 8 bits/pixel (original) and coded at 0.15, 0.5 and 0.8 bits/pixel]

Comparison of JPEG and JPEG 2000


JPEG 2000 compared with JPEG
Advantages
  Better compression performance at low bit rates
  Spatial and quality scalability
  No blocking artifacts

Disadvantages
  No substantial improvement at medium and high bit rates
  More complex than JPEG

JPEG Derived Industry Standards

JFIF (JPEG File Interchange Format, XXX.jpg);
JTIP (JPEG Tiled, Pyramid Format);
TIFF (Tagged Image File Format);
SPIFF (Still Picture Interchange File Format, JPEG Part 3);
FlashPix
  Developed by Kodak, Hewlett-Packard, Microsoft (1996);
  Widely used in digital still cameras.

Outline

Introduction
Lossless Compression
Image Compression
Video Compression
Simulation Results
Conclusion

Video Compression
Applications
  DVD, digital TV, HDTV, video telephony, and teleconferencing

Standardization process

[Figure: standardization timeline. VCEG: H.261, H.263. MPEG: MPEG-1, MPEG-4. Joint VCEG/MPEG: H.262/MPEG-2, H.264/MPEG-4 AVC, HEVC.]

Two types of applications


1. Asymmetric applications: infrequent use of the compressor (complex) but frequent use of the decompressor (simple).

   E.g. electronic publishing, education and training, travel guidance, videotext, point of sale, video games, entertainment (movies), video on demand (VOD), etc.

2. Symmetric applications: essentially equal use of compressor and decompressor.

   E.g. video mail, videophone, video conferencing, generation of material for playback-only applications, etc.

* MPEG is for asymmetric applications. H.261/263 are for symmetric applications.

Requirements for compressed video on digital storage media

1. Random access: any frame decodable in less than 0.5 second. Needs access points, i.e. segments of information coded only with reference to themselves.

2. Fast forward/reverse searches: possible to scan a compressed bit stream and display selected frames to obtain a fast-forward or fast-reverse effect (a more demanding form of random access).

3. Reverse playback: impossible without an extreme additional cost in memory.

4. Audio-video synchronization: need a mechanism to resynchronize audio and video should they be derived from slightly different clocks.

5. Robustness to errors: avoid catastrophic behavior in the presence of errors in storage media or transmission channels.

6. Coding/decoding delay: video conferencing applications need to keep the total system delay under 150 ms in order to maintain conversation. Publishing applications can allow a long encoding delay, but need a short decoding delay within an interactive threshold of 1 sec.

7. Editability: editing units of a short time duration, coded only with reference to themselves, are needed for editability in compressed form.

8. Format flexibility: allow for a large flexibility of format in terms of raster size (width, height) and frame rate.

9. Cost tradeoff: decoder implementable in a small number of chips.

Video Compression
General structure of the video codec
  Intra-frame: exploits the spatial correlation to predict the signal
  Inter-frame: exploits the temporal correlation to further reduce the redundancies

Temporal Redundancy Reduction

Three types of pictures:

1) Intra pictures (I): provide access points for random access, but only moderate compression (~10:1)
2) Predictive pictures (P): coded with reference to a past picture (I or P); used as a reference for future P pictures; higher compression (~20:1)
3) Bidirectionally predicted pictures (B): provide the highest amount of compression (~40:1) but require both a past and a future reference for prediction; not used as a reference.

MPEG1/2/4 Video Compression


Raw video is huge in size for transmission/storage (e.g. 150 GB for a 120-minute movie). Need compression.
Compression is achieved by exploiting 4 characteristics:
  statistical redundancy, perceptual irrelevancy, spatial redundancy: used in image compression
  temporal redundancy (motion estimation): unique to and most important for video compression

Temporal Redundancy Reduction

[Figure: frames 20 and 21 of the tennis sequence, with the frame difference without motion estimation and with motion estimation]

Advantage of DPCM over PCM

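Block-based Motion Estimation

Goal: To establish block-wise correspondence between two frames

$$512\,\frac{\text{ops}}{\text{search}} \times 961\,\frac{\text{search}}{\text{block}} \times 396\,\frac{\text{block}}{\text{frame}} \times 30\,\frac{\text{frame}}{\text{sec}} \approx 5.8\times10^{9}\ \text{ops/sec}$$

ME too computation intensive!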
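
The operation count above corresponds to an exhaustive (full) search. The sketch below implements full-search block matching with the SAD criterion and recomputes the count. The parameter values behind the factors (16x16 blocks with two operations per pixel, a search range of +/-15 pixels, CIF resolution) are assumptions inferred from 512, 961 and 396, and all names are illustrative.

```python
def block_sad(cur, ref, bx, by, dx, dy, n):
    """SAD between the n x n block of cur at (bx, by) and ref displaced by (dx, dy)."""
    return sum(abs(cur[by + i][bx + j] - ref[by + dy + i][bx + dx + j])
               for i in range(n) for j in range(n))

def full_search(cur, ref, bx, by, n=16, search=15):
    """Try every displacement in [-search, search]^2; return (best SAD, dx, dy)."""
    h, w = len(ref), len(ref[0])
    best = None
    for dy in range(-search, search + 1):
        for dx in range(-search, search + 1):
            # Skip candidates that fall outside the reference frame.
            if by + dy < 0 or by + dy + n > h or bx + dx < 0 or bx + dx + n > w:
                continue
            cost = block_sad(cur, ref, bx, by, dx, dy, n)
            if best is None or cost < best[0]:
                best = (cost, dx, dy)
    return best

def full_search_ops_per_second(width=352, height=288, fps=30, n=16, search=15):
    ops_per_search = 2 * n * n                 # one |difference| and one add per pixel
    searches_per_block = (2 * search + 1) ** 2
    blocks_per_frame = (width // n) * (height // n)
    return ops_per_search * searches_per_block * blocks_per_frame * fps

# Tiny synthetic test: the current frame is the reference shifted by (2, 1).
REF = [[(7 * x + 13 * y + x * y) % 251 for x in range(12)] for y in range(12)]
CUR = [[REF[min(y + 1, 11)][min(x + 2, 11)] for x in range(12)] for y in range(12)]

if __name__ == "__main__":
    print(full_search(CUR, REF, bx=4, by=4, n=4, search=2))
    print(full_search_ops_per_second())
```

Fast algorithms reduce the number of candidate displacements tested per block, at the risk of stopping in a local minimum of the error surface.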

[Figure: computational distribution in an MPEG-4 encoder]

Error Surface

Multimodal
Easily trapped in a local minimum

[Figure: error surfaces]

Fast Motion Estimation

Full Search (41.79 dB, slow)
PMVFAST (41.85 dB, speedup = 1125)

Foreman test sequence, medium bit rate 512 kbit/s, medium resolution CIF, 15 fps, SA32

Adaptive Quantization/Rate Control


$$c^{q}_{ij} = \mathrm{round}\left(\frac{8\,c_{ij}}{Q_{ij}\,Q_p}\right)$$

c_ij = ij-th DCT coefficient of an 8x8 block,
c^q_ij = quantized c_ij,
Q_ij = ij-th entry in the quantization table,
Q_p = extra quantization step.
For a given frame, Q_ij (the quantization table) is fixed, but Q_p can be changed on a block-by-block basis.
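
A small sketch of this quantizer shows how Q_p acts as a per-block gain on the step size. The function names and the reconstruction rule (multiply back by Q_ij * Q_p / 8) are assumptions made for illustration; a real rate controller would choose Q_p from the buffer state.

```python
def quantize(c, q_ij, q_p):
    """c_q = round(8 * c / (Q_ij * Q_p))."""
    return round(8 * c / (q_ij * q_p))

def dequantize(c_q, q_ij, q_p):
    """Assumed inverse: scale the level back by Q_ij * Q_p / 8."""
    return c_q * q_ij * q_p / 8

if __name__ == "__main__":
    coefficient, table_entry = -22.6, 12
    for q_p in (4, 8, 16, 31):
        level = quantize(coefficient, table_entry, q_p)
        print(q_p, level, dequantize(level, table_entry, q_p))
```

With Q_p = 8 the effective step equals the table entry; a larger Q_p gives coarser levels and fewer bits, a smaller Q_p gives finer levels and more bits.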

Error concealment
a) Two types of I-frame errors:

   Loss of HP cells, resulting in loss of headers, DC and low-order AC coefficients, which results in serious degradation of the video signal.
   Loss of SP cells, resulting in loss of higher-order AC coefficients, which results in loss of detail in the blocks being reconstructed.

b) Three types of P- and B-frame errors:

   Loss of an HP cell, resulting in total loss of the information pertaining to the image region represented by the coded bits
   Loss of an SP cell containing motion information
   Loss of an SP cell containing no motion information (ignore)

c) DC synthesis (I frames)

   DC values are synthesized by bilinear interpolation from the nearest blocks in the top and bottom macroblocks. Since a cell loss generally causes loss of data in a series of macroblocks, the horizontal neighbors are not used for synthesis.

d) AC synthesis (I frames)

   In order to reduce the effect of distinct block boundaries, some of the low-order AC coefficients have to be synthesized.
   The five lowest-order AC coefficients (in zigzag order) are synthesized using some method.

e) Prediction mode (P and B frames)

   For a P frame, if the top or bottom macroblock is coded as forward predicted, the damaged macroblock is assigned forward mode.
   If both neighbors are coded in intra mode, the damaged macroblock is assigned intra mode. Similar for B frames.

f) Motion vectors (P and B frames)

   If both top and bottom vectors are defined, then the average of the motion vectors is used for the synthesized macroblock.
   If only one of the vertical neighbors has valid motion vector(s) defined, then this vector (or these vectors) is used.
   If no motion is available, the macroblock is synthesized by the intra-frame technique as in I frames.

Video Standards

CCIR 601 (ITU-R)
H.261 (ITU-T)
H.263 (ITU-T)
H.264/MPEG-4 AVC (ITU-T + ISO)
H.265
MJPEG (ISO)
MPEG-1 (ISO)
MPEG-2 (ITU-T + ISO)
MPEG-4 (ISO)
VC-1 (SMPTE)
AVS

H.261
Target
  International standard for ISDN picture phones and for video conferencing systems (1990)
  Image format: CIF (352x288) or QCIF (176x144), frame rate 7.5...30 fps
  Bit rate: multiple of 64 kbps, typically 128 kbps including audio
  Picture quality: for 128 kbps, acceptable with limited motion in the scene

Basic Properties

  The very first of the H.26x standards in the domain of VCEG
  I-frame + P-frame
  I-frame: basically the same as JPEG
  P-frame: motion estimation/compensation + JPEG
  Motion estimation/compensation
  Loop filter

  Ratified in November 1988
  The first widespread practical success
  Still in use, mostly as a backward-compatibility feature; overtaken by H.263
  Designed to operate at video bit rates between 40 kbit/s and 2 Mbit/s
  Still used as a backward-compatibility mode in some video conferencing systems and for some types of Internet video

MPEG-1
Target

  Target bit rate about 1.5 Mbit/s
  Typical image format CIF, no interlace
  Frame rate 24...30 fps
  Main application: video storage for multimedia (e.g., on CD-ROM)

Basic Properties

  Designed for CD-ROM applications by MPEG
  I-frame + P-frame + B-frame
  P-frame: unidirectional motion compensation
  B-frame: bidirectional motion compensation
  Half-pixel ME

  Final standard was approved in November 1992
  Use was fairly widespread, but mostly overtaken by MPEG-2
  Can provide approximately VHS quality at 1-2 Mbps
  Application:

  MP3
  VCD
  DVD
  CD-ROM
  DVB (Digital Video Broadcasting)
  DAB (Digital Audio Broadcasting)

H.262/MPEG-2
Target
  Extension for interlace, optimized for TV resolution (NTSC: 704 x 480 pixels)
  Image quality similar to NTSC, PAL, SECAM at 4-8 Mbit/s
  HDTV at 20 Mbit/s

Basic Information
  Meets the needs of entertainment TV for transmission media.
  Frame/field pictures;
  Scalability tools were first defined here as functionality tools.

SNR scalability
  EI and EP have the same resolution as I and P
  EI predicted from I
  EP predicted from P or previous EI or EP

Spatial scalability
  Similar to SNR scalability, except that I and P are half the size of EI and EP
  Prediction from I and P involves enlarging them by a factor of 2

Multilayer scalability

[Figure: multilayer scalability]

H.262/MPEG-2
  First video compression codec, released in 1995
  Now in wide use for the DVD standard and DTV
  The most commonly used video coding standard

  Range of use normally 2-20 Mbps
  Application

  DVD (NTSC & PAL)
  HDV (high-definition video on DV cassette tape)
  MOD and TOD (digital tapeless camcorders)
  VOD (Video On Demand)
  ATSC
  XDCAM
  ISDB-T
  DVB

H.263
Target
  International standard for picture phones over analog subscriber lines (1995)
  Image format usually CIF, QCIF or Sub-QCIF, frame rate usually below 10 fps
  Bit rate: arbitrary, typically 20 kbps for PSTN
  Picture quality: with the new options, as good as H.261 (at half the rate)

Basic Properties
  Designed for video conferencing at a low bit rate in the mobile wireless communication scenario.
  8x8 motion compensation;
  8x8 block DCT

  Ratified in March 1996
  Application
  Widely used as a compression engine for Internet video streaming (YouTube, Google Video, Myspace)
  Also found use in H.323 (RTP/IP-based video conferencing), RTSP and SIP (IP-based video conferencing) solutions.
  Low bit rate compressed format
  MMS (mobile multimedia message)
  Video telephony

MPEG-4
Target
  Object-based coding;
  Wide range of applications, with choices of interactivity, scalability, error resilience, etc.

Basic Properties
  A lot of new coding tools:
    Interactive graphics;
    Object and shape coding;
    Scalable video coding.

  Robust transmission;
  Can encode mixed media data.

  Introduced in late 1998, still a developing standard
  Efficient across a variety of bit rates, ranging from a few kbps to tens of Mbps
  Application

  Internet video streaming
  Wireless video
  Studio editing
  Video database
  Interactive video
  Video conferencing/email
  Games
  Education

H.264/MPEG-4 AVC
Target
  Halve the bit rate compared to MPEG-2, H.263 or MPEG-4 Part 2

Basic Properties

  Finalized in 2003
  Variable block-size motion compensation
  Quarter-pixel precision for motion compensation
  Powerful entropy coding techniques:
    Context-Adaptive Binary Arithmetic Coding (CABAC)
    Context-Adaptive Variable-Length Coding (CAVLC)

  Scalable Video Coding (SVC)
  Multiview Video Coding (MVC)
  Integer-based transform

  First drafting work was completed in May 2003
  Broad Application

  Blu-ray Discs
  Streaming Internet sources (Vimeo, YouTube, iTunes Store)
  Web software (Adobe Flash Player, Microsoft Silverlight)
  HDTV broadcasts (ATSC, DVB-T, DVB-C, DVB-S)
  CCTV (Closed-Circuit TV) and video surveillance

Advanced Techniques of H.264

Multiple Reference Frames
Multiple Block Size ME (e.g. 16x8, 8x8, 4x4)
Better ME accuracy (1/4 pixel)

4x4 integer transform
In-Loop Deblocking Filter
Better entropy encoding (CABAC)
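
The 1/4-pixel motion accuracy relies on sub-pixel interpolation of the reference frame. A minimal 1-D sketch in Python follows, assuming the H.264 luma scheme: a 6-tap filter (1, -5, 20, 20, -5, 1)/32 for half-pel positions, then a rounded average for quarter-pel positions. The function names are illustrative, and clamping at the row borders is a simplifying assumption.

```python
def clip255(v):
    """Clamp a value to the 8-bit sample range."""
    return max(0, min(255, v))

def half_pel(row, i):
    """Half-pel sample between row[i] and row[i + 1] using the 6-tap filter."""
    taps = (1, -5, 20, 20, -5, 1)
    n = len(row)
    acc = 0
    for k, t in enumerate(taps):
        j = min(max(i - 2 + k, 0), n - 1)  # clamp at the borders (assumption)
        acc += t * row[j]
    return clip255((acc + 16) >> 5)  # divide by 32 with rounding

def quarter_pel(row, i):
    """Quarter-pel sample between row[i] and the half-pel sample to its right."""
    return (row[i] + half_pel(row, i) + 1) >> 1
```

On a flat row the interpolated samples equal the input value; across a sharp edge the half-pel sample lands midway between the two levels.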


New Intra Prediction Method
Advanced Inter Prediction Method

4x4 Intra prediction (luma)

[Figure: 4x4 Intra Prediction Modes, with prediction directions labeled by mode number]

16x16 Intra Prediction (luma)

Mode 0 (Vertical)
Mode 1 (Horizontal)
Mode 2 (DC)
Mode 3 (Plane)
Complicated!
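
The vertical, horizontal and DC modes can be sketched in a few lines of Python. This is only an illustration under stated assumptions: `top` and `left` are the already reconstructed neighbouring pixels, the plane mode is left out, and choosing the mode by SAD is an encoder-side shortcut assumed here, not something the standard prescribes.

```python
def predict_vertical(top, n=16):
    """Copy the row of pixels above the block downwards."""
    return [list(top[:n]) for _ in range(n)]

def predict_horizontal(left, n=16):
    """Copy the column of pixels left of the block rightwards."""
    return [[left[r]] * n for r in range(n)]

def predict_dc(top, left, n=16):
    """Fill the block with the rounded mean of the top and left neighbours."""
    dc = (sum(top[:n]) + sum(left[:n]) + n) // (2 * n)
    return [[dc] * n for _ in range(n)]

def sad(block, pred):
    """Sum of absolute differences between a block and its prediction."""
    return sum(abs(b - p) for rb, rp in zip(block, pred) for b, p in zip(rb, rp))

def best_mode(block, top, left):
    """Pick the mode with the smallest SAD (hypothetical encoder decision)."""
    n = len(block)
    candidates = {
        "vertical": predict_vertical(top, n),
        "horizontal": predict_horizontal(left, n),
        "dc": predict_dc(top, left, n),
    }
    return min(candidates, key=lambda m: sad(block, candidates[m]))
```

Only the residual between the block and the chosen prediction is then transformed and coded, which is why a good prediction saves bits.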

8x8 Intra prediction (chroma)

Coding intra prediction mode

ME: Multiple reference frames

[Figure: reference frames N-5, N-4, N-3, N-2 and N-1 shown alongside the current Frame N]

ME: Multiple block size

7 block modes in total
Motion estimation
4x4 integer transform
Advantage
Saves bits (~15%, 7 modes)

Disadvantage
Increased computation
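
The trade-off above comes from repeating the motion search for each partition size. A minimal sketch of a full-search block matcher in Python is given below; the SAD cost, the small search range and the function names are assumptions for illustration, and the seven sizes listed are the H.264 partition sizes from 16x16 down to 4x4.

```python
# The seven H.264 partition sizes as (width, height)
PARTITIONS = [(16, 16), (16, 8), (8, 16), (8, 8), (8, 4), (4, 8), (4, 4)]

def sad(cur, ref, cx, cy, rx, ry, w, h):
    """Sum of absolute differences between a current and a reference block."""
    total = 0
    for y in range(h):
        for x in range(w):
            total += abs(cur[cy + y][cx + x] - ref[ry + y][rx + x])
    return total

def full_search(cur, ref, cx, cy, w, h, search_range=4):
    """Return ((dx, dy), cost) of the best match within +/- search_range."""
    height, width = len(ref), len(ref[0])
    best_mv = (0, 0)
    best_cost = sad(cur, ref, cx, cy, cx, cy, w, h)
    for dy in range(-search_range, search_range + 1):
        for dx in range(-search_range, search_range + 1):
            rx, ry = cx + dx, cy + dy
            if rx < 0 or ry < 0 or rx + w > width or ry + h > height:
                continue  # candidate block falls outside the reference frame
            cost = sad(cur, ref, cx, cy, rx, ry, w, h)
            if cost < best_cost:
                best_mv, best_cost = (dx, dy), cost
    return best_mv, best_cost
```

An encoder would run such a search per partition and keep the mode with the best rate-distortion trade-off, which is where the extra computation comes from.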


ME: Multiple block size


Transform

4x4 array for luma DC coefficients


2x2 array for chroma DC coefficients


4x4 DCT
(with Hadamard transform for DC)
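
A minimal Python sketch of the Hadamard step, assuming the commonly documented H.264 scheme in which the DC coefficients of the 4x4 blocks are gathered into a 4x4 array (luma, Intra 16x16) or a 2x2 array (chroma) and transformed a second time. The helper names are illustrative, and the final scaling and rounding applied by the standard are omitted.

```python
H4 = [[1, 1, 1, 1],
      [1, 1, -1, -1],
      [1, -1, -1, 1],
      [1, -1, 1, -1]]

H2 = [[1, 1],
      [1, -1]]

def matmul(a, b):
    """Plain matrix product of two lists of lists."""
    return [[sum(a[i][k] * b[k][j] for k in range(len(b)))
             for j in range(len(b[0]))] for i in range(len(a))]

def hadamard_dc(dc, h):
    """Unscaled 2-D Hadamard transform H * DC * H (H is symmetric)."""
    return matmul(matmul(h, dc), h)
```

For example, `hadamard_dc(dc, H4)` would be applied to the 4x4 array of luma DC values and `hadamard_dc(dc, H2)` to the 2x2 array of chroma DC values.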


4x4 Integer DCT
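
The slide's figure is not recoverable, so the following Python sketch uses the widely documented H.264 forward core transform Y = Cf X Cf^T as an assumption. It works entirely in integers; the scaling that turns it into an approximate DCT is folded into quantization and is not shown. Function names are illustrative.

```python
# Forward core transform matrix of H.264 (integer approximation of the DCT)
CF = [[1, 1, 1, 1],
      [2, 1, -1, -2],
      [1, -1, -1, 1],
      [1, -2, 2, -1]]

def matmul(a, b):
    """Plain matrix product of two 4x4 lists of lists."""
    return [[sum(a[i][k] * b[k][j] for k in range(4)) for j in range(4)]
            for i in range(4)]

def transpose(m):
    """Transpose of a 4x4 matrix."""
    return [[m[j][i] for j in range(4)] for i in range(4)]

def forward_4x4(x):
    """Unscaled 4x4 integer transform of a residual block x."""
    return matmul(matmul(CF, x), transpose(CF))
```

A constant residual block ends up with all of its energy in the top-left (DC) coefficient, which is the behaviour expected of a DCT-like transform.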


Fast 4x4 DCT


Quantization of DCT coeff.

QP and QStep
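
As a hedged illustration of the QP to QStep relation, the Python sketch below assumes the commonly cited H.264 mapping: six base step sizes for QP 0 to 5, with the step size doubling for every increase of 6 in QP (QP ranges from 0 to 51). The plain rounding in `quantize` is a simplification; practical encoders use integer arithmetic and a smaller rounding offset.

```python
# Step sizes for QP = 0..5; every +6 in QP doubles the step size
QSTEP_BASE = (0.625, 0.6875, 0.8125, 0.875, 1.0, 1.125)

def qstep(qp):
    """Quantizer step size for a given QP (0..51)."""
    if not 0 <= qp <= 51:
        raise ValueError("QP must be in 0..51")
    return QSTEP_BASE[qp % 6] * (1 << (qp // 6))

def quantize(coeff, qp):
    """Uniform quantization with round-to-nearest (simplified)."""
    sign = -1 if coeff < 0 else 1
    return sign * int(abs(coeff) / qstep(qp) + 0.5)

def dequantize(level, qp):
    """Reconstruct a coefficient from its quantized level."""
    return level * qstep(qp)
```

With this mapping, raising QP by 6 halves the precision of every coefficient, which is why QP acts as a roughly logarithmic quality control.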


Quantization in H.264 reference software

HEVC video compression

Target
Targets HDTV or ultra-HDTV compression, with substantially improved coding efficiency compared to H.264/AVC, i.e. 50% bit rate reduction
Focus on the increasing need for parallel processing

Basic Properties

Finalized in 2013
Large block structure
Quadtree-based block partition
Asymmetric mode partition
Sample adaptive offset
Tile
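
The quadtree-based block partition can be sketched recursively in Python. This is a toy illustration: a real HEVC encoder decides splits by rate-distortion cost, whereas the variance threshold used here is a stand-in assumption, as are the function names and the default sizes (64x64 root block, 8x8 minimum).

```python
def variance(frame, x, y, size):
    """Pixel variance of the size x size block with top-left corner (x, y)."""
    n = size * size
    total = 0
    total_sq = 0
    for row in frame[y:y + size]:
        for v in row[x:x + size]:
            total += v
            total_sq += v * v
    mean = total / n
    return total_sq / n - mean * mean

def quadtree_split(frame, x, y, size, min_size=8, threshold=100.0):
    """Return the leaf blocks (x, y, size) of a variance-driven quadtree."""
    if size <= min_size or variance(frame, x, y, size) <= threshold:
        return [(x, y, size)]  # smooth enough, or smallest allowed block
    half = size // 2
    leaves = []
    for dy in (0, half):
        for dx in (0, half):
            leaves += quadtree_split(frame, x + dx, y + dy, half,
                                     min_size, threshold)
    return leaves
```

Flat regions stay as large blocks while detailed regions are divided into small ones, which is the motivation for combining large blocks with a quadtree.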


HEVC Timeline
2010.01: Formal joint CfP from VCEG and MPEG
2010.04: JCT-VC team, HEVC joint project, full proposals
2010.07: TMuC SW ready, tool experiments (TE)
2010.10: HM SW ready, core experiments (CE)
2011.02: WD
2012.02: CD
2012.07: DIS
2012.10: SoDIS (Study of Draft International Standard)
2013.01: FDIS
Mid-2013 to mid-2014: Extensions/amendments, such as scalable, 3D, 4:x:x, bit depth > 10, color

HEVC Involved Companies


Major Applications Summary

Field                                  Bandwidth                           Video Standards
Digital Television Broadcasting        2...6 Mbps (10...20 Mbps for HD)    H.262/MPEG-2, H.264/MPEG-4 AVC
Blu-ray, DVD video                     6...8 Mbps                          H.262/MPEG-2, H.264/MPEG-4 AVC
Internet video streaming               20...200 kbps                       H.263, H.264/MPEG-4 AVC
Video conferencing, Video telephony    20...320 kbps                       H.261, H.263, H.264/MPEG-4 AVC
Video over 3G wireless                 20...200 kbps                       H.263, H.264/MPEG-4 AVC

AVS video compression

AVS: Audio Video Standard
Founded by the China Audio Video Coding Standard Working Group in June 2002.
Aimed at reducing dependence on foreign technology.

Two AVS standards are finalized or to be finalized
AVS1
Finalized in 2008
Provides coding efficiency two times higher than MPEG-2, comparable to H.264/MPEG-4 AVC
The complexity is only 30% and 70% of that of the H.264/MPEG-4 AVC encoder and decoder, respectively.

AVS2
To be finalized in Dec 2013
Expected to double coding efficiency compared to AVS1, under high-definition or higher-resolution conditions

Outline

Introduction
Lossless Compression
Image Compression
Video Compression
Simulation Results
Conclusion

Image Standards Comparison

JPEG and JPEG2000 simulations are conducted;
Several images are tested under various compression rates;

Tools: 1520x1200
Bike: 2048x2560
Cafe: 2048x2560
Woman: 2048x2560
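
The comparison that follows is reported as PSNR (dB) against bit rate (bits/pixel). A minimal Python sketch of both measures is given here, assuming 8-bit grayscale images stored as lists of rows; the function names are illustrative and not taken from any particular tool.

```python
import math

def mse(a, b):
    """Mean squared error between two equally sized images."""
    total = 0
    count = 0
    for row_a, row_b in zip(a, b):
        for pa, pb in zip(row_a, row_b):
            total += (pa - pb) ** 2
            count += 1
    return total / count

def psnr(a, b, peak=255):
    """Peak signal-to-noise ratio in dB (infinite for identical images)."""
    err = mse(a, b)
    if err == 0:
        return float("inf")
    return 10.0 * math.log10(peak * peak / err)

def bits_per_pixel(num_bytes, width, height):
    """Bit rate of a compressed file expressed in bits per pixel."""
    return 8.0 * num_bytes / (width * height)
```

For instance, a 2048x2560 image compressed to 655360 bytes corresponds to exactly 1 bit/pixel.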


Image Standards Comparison

[Figure: four rate-distortion plots (Tools, Bike, Cafe, Woman), each showing PSNR (dB) versus bit rate (bits/pixel) for JPEG and JPEG2000]

Video Standards Comparison

Simulations of various video standards are conducted at various quality levels:
H.261, H.263, MPEG-1, MPEG-2 and MPEG-4 are run using the FFmpeg software;
H.264 uses the latest JM 18.4;
HEVC uses the latest HM 11.0;

Two sequences are tested:
Foreman: 176x144
RaceHorses: 832x480

The All Intra condition is tested:
All frames are encoded as I-frames;
JPEG and JPEG2000 are also included in the comparison;

The Low Delay condition is tested:
An IPPP structure is used;
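
The two test conditions differ only in the frame types assigned to the sequence. A small Python sketch of that difference follows, together with the conversion from total coded bits to kbps used on rate-distortion plots. The configuration names, the function names and the frame rate parameter are assumptions for illustration; the slides do not state the frame rate used.

```python
def frame_types(num_frames, config):
    """Frame-type string for a sequence under a given test condition."""
    if num_frames <= 0:
        return ""
    if config == "all_intra":
        return "I" * num_frames               # every frame coded independently
    if config == "low_delay":
        return "I" + "P" * (num_frames - 1)   # IPPP: one I-frame, then P-frames
    raise ValueError("unknown configuration: " + config)

def bitrate_kbps(total_bits, num_frames, fps):
    """Average bit rate in kbps of a coded sequence."""
    return total_bits * fps / num_frames / 1000.0
```

For example, `frame_types(5, "low_delay")` gives `"IPPPP"`, and 3,000,000 bits spent on 300 frames at an assumed 30 fps corresponds to 300 kbps.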


Video Standards Comparison

All Intra Comparison
[Figure: two rate-distortion plots of PSNR (dB) versus bit rate (kbps). Foreman All Intra: HEVC, H264, MPEG4, MPEG1, MPEG2, H263, H261, MJPEG2000, MJPEG. RaceHorses All Intra: HEVC, H264, MJPEG2000, MPEG4, MPEG1, MPEG2, MJPEG]

Video Standards Comparison

Low Delay Comparison
[Figure: two rate-distortion plots of PSNR (dB) versus bit rate (kbps). Foreman Low Delay: HEVC, H264, MPEG4, H263, MPEG1, MPEG2, H261. RaceHorses (panel titled "RaceHorses All Intra" on the slide): HEVC, H264, MPEG4, MPEG1, MPEG2]

Outline

Introduction
Lossless Compression
Image Compression
Video Compression
Simulation Results
Conclusion

Conclusion
Image coding standards: JPEG and JPEG2000; both standards have their own advantages and enjoy popularity under certain circumstances.
Video coding standards: MPEG-1, MPEG-2, MPEG-4, H.261, H.263, H.264/MPEG-4 AVC and HEVC; key features and applications introduced
Experimental results: every evolution of the coding algorithms contributes greatly to compression performance
This field is developing rapidly and its applications can be found in various situations; continuous effort on improving the coding algorithms will bring about a promising future for image and video compression.

What Next?

H.266?
MPEG-5?
Wavelet transform?
Big Block?
SSIM vs. PSNR?
