MPEG Video Coding and Beyond: Spring '09 Instructor: Min Wu

M. Wu: ENEE631 Digital Image Processing (Spring'09)


MPEG Video Coding and Beyond
Spring '09, Instructor: Min Wu

Electrical and Computer Engineering Department,
University of Maryland, College Park
bb.eng.umd.edu (select ENEE631 S09)
minwu@eng.umd.edu
ENEE631 Spring '09
Lecture 17 (4/6/2009)
M. Wu: ENEE631 Digital Image Processing (Spring'09) Lec17 MPEG and more [2]
Overview and Logistics
Last Time:
Block-matching and application to hybrid video coding
Exploit spatial redundancy via transform coding: e.g. block DCT coding
Exploit temporal redundancy via predictive coding: ME/MC

MPEG-1 video coding standard

Today:
Finish MPEG-1 Discussion
Other coding considerations/standards: H.26x, MPEG-2, MPEG-4, etc.
Geometric transform of images

Assignment #4 on video and motion estimation is posted online
UMCP ENEE631 Slides (created by M. Wu © 2004)

Review: DCT + ME/MC for Hybrid Video Coding
Hybrid ~ combined transform coding & predictive coding
Spatial redundancy removal
Use DCT-based transform coding for reference frame
Temporal redundancy removal
Use motion-based predictive coding for next frames
estimate motion and use reference frame to predict
only encode MV & prediction residue (motion compensation residue)
(From Princeton EE330 S01 by B.Liu)
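The motion-based prediction step above can be sketched as a full-search block matcher. This is a minimal illustration, not the method of any particular codec: the sum-of-absolute-differences (SAD) cost, the 4x4 block size, the +/-2 pel search range, and the toy frames are all assumptions chosen to keep the example small.

```python
def sad(cur, ref, by, bx, dy, dx, n):
    """SAD between the n x n block of `cur` at (by, bx) and the block of
    `ref` displaced by (dy, dx)."""
    return sum(
        abs(cur[by + i][bx + j] - ref[by + dy + i][bx + dx + j])
        for i in range(n)
        for j in range(n)
    )


def best_motion_vector(cur, ref, by, bx, n=4, search=2):
    """Full search over all displacements within +/-search pels."""
    h, w = len(ref), len(ref[0])
    best = None
    for dy in range(-search, search + 1):
        for dx in range(-search, search + 1):
            y, x = by + dy, bx + dx
            if 0 <= y and y + n <= h and 0 <= x and x + n <= w:
                cost = sad(cur, ref, by, bx, dy, dx, n)
                if best is None or cost < best[0]:
                    best = (cost, dy, dx)
    return best  # (cost, dy, dx)


def residue(cur, ref, by, bx, dy, dx, n=4):
    """Prediction residue: current block minus motion-compensated block."""
    return [
        [cur[by + i][bx + j] - ref[by + dy + i][bx + dx + j] for j in range(n)]
        for i in range(n)
    ]


# Toy frames (hypothetical data): the current frame is the reference frame
# shifted right by one pel, so the best match lies one pel to the left.
ref = [[(7 * r + 3 * c * c) % 31 for c in range(12)] for r in range(12)]
cur = [[ref[r][max(c - 1, 0)] for c in range(12)] for r in range(12)]

cost, dy, dx = best_motion_vector(cur, ref, by=4, bx=4)
res = residue(cur, ref, 4, 4, dy, dx)
```

Only the motion vector `(dy, dx)` and the residue block `res` would then need to be encoded; here the residue is all zeros because the toy motion is a pure translation.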
UMCP ENEE408G Slides (created by M. Wu & R. Liu © 2002)

Review: Hybrid MC-DCT Video Encoder & Decoder

(From R. Liu's Handbook, Fig. 2.18)
Intra-frame: encoded without prediction
Inter-frame: predictively encoded => use quantized frames as reference for residue

Review: Additional Issues in Hybrid Video Coding
Not all regions are easily inferable from previous frame
Occlusion ~ solvable by backward prediction using future frames as ref.
Adaptively decide using prediction or not

Drifting and error propagation
Solution: Encode reference regions or frames from time to time (intra coding)
Random access: e.g. want to get 95th frame
Solution: Encode frame without prediction from time to time
How to allocate bits?
Based on visual model and statistics: JPEG-like quantization steps; entropy coding
Consider constant or variable bit-rate requirement
Constant-bit-rate (CBR) vs. Variable-bit-rate (VBR)
Wrap up all solutions ~ MPEG-like codec

Review: MPEG-1 Video Coding Standard
Standard only specifies decoder's capabilities
Prefer simple decoding and do not limit encoder's complexity
Leave flexibility and competition in implementing encoder

Block-based hybrid coding (DCT + M.C.)
8x8 block size as basic coding unit
16x16 macroblock size for motion estimation/compensation

Group-of-Picture (GOP) structure with 3 types of frames
Intra coded
Forward-predictively coded
Bidirectional-predictively coded

MPEG-1 Picture Types and Group-of-Pictures
A Group-of-Picture (GOP) contains 3 types of frames (I/P/B)
Frame order: I_1 BBB P_1 BBB P_2 BBB I_2
Coding order: I_1 P_1 BBB P_2 BBB I_2 BBB
(From R.Liu Handbook Fig.3.13)
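The reordering from frame (display) order to coding order can be sketched as follows. This is a minimal illustration under the assumption that every B-frame depends on the nearest I/P frame on each side, so each anchor frame must be coded before the B-frames that precede it in display order; the frame labels and the function name `coding_order` are hypothetical.

```python
def coding_order(display):
    """Move each I/P anchor frame ahead of the B-frames that precede it."""
    out, pending_b = [], []
    for frame in display:
        if frame.startswith("B"):
            pending_b.append(frame)   # wait for the future anchor
        else:
            out.append(frame)         # anchor goes first
            out.extend(pending_b)     # then the B-frames that needed it
            pending_b = []
    out.extend(pending_b)             # trailing B-frames, if any
    return out


display = ["I1", "B", "B", "B", "P1", "B", "B", "B", "P2", "B", "B", "B", "I2"]
coded = coding_order(display)
```

For the frame order shown on the slide, `coded` comes out as I1 P1 BBB P2 BBB I2 BBB.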

Adaptive Predictive Coding in MPEG-1
Half-pel M.V. search within +/-64 pel range
Use spatial differential coding on M.V. to remove M.V. spatial redundancy
Coding each block in P-frame
Predictive block using previous I/P frame as reference
Intra-block ~ encode without prediction
use this if prediction costs more bits than non-prediction
good for occluded area
can also avoid error propagation

Coding each block in B-frame
Intra-block ~ encode without prediction
Predictive block
Use previous I/P frame as reference (forward prediction),
Or use future I/P frame as reference (backward prediction),
Or use both for prediction and take average

Coding of B-frame (cont'd)

Previous frame A, current frame B, future frame C
B = A: forward prediction (one motion vector)
B = C: backward prediction (one motion vector)
or B = (A+C)/2: interpolation (two motion vectors)
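The choice among forward, backward, and interpolated prediction for a B-frame block can be sketched as a minimum-cost selection. This is a simplified illustration: it assumes the two motion-compensated predictions are already available, uses SAD as the cost, and ignores the intra option and the bits spent on motion vectors, which a practical encoder would also weigh. All names and sample values are hypothetical.

```python
def sad(block, pred):
    """Sum of absolute differences between a block and its prediction."""
    return sum(abs(b - p) for rb, rp in zip(block, pred) for b, p in zip(rb, rp))


def average(a, c):
    """Interpolated prediction (A + C) / 2, element by element."""
    return [[(x + y) / 2 for x, y in zip(ra, rc)] for ra, rc in zip(a, c)]


def choose_b_mode(block, fwd_pred, bwd_pred):
    """Return the prediction mode with the smallest SAD and all the costs."""
    candidates = {
        "forward": fwd_pred,
        "backward": bwd_pred,
        "interpolated": average(fwd_pred, bwd_pred),
    }
    costs = {mode: sad(block, pred) for mode, pred in candidates.items()}
    return min(costs, key=costs.get), costs


block = [[10, 12], [14, 16]]          # current block in frame B
from_prev = [[8, 10], [12, 14]]       # motion-compensated block from frame A
from_future = [[12, 14], [16, 18]]    # motion-compensated block from frame C

mode, costs = choose_b_mode(block, from_prev, from_future)
```

In this toy case the block brightens steadily from A to C, so the average of the two references matches the current block exactly and interpolation wins.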

(Fig. from Ken Lam, HK Poly Univ. short course, summer 2001)
Quantization for I-frame (I-block) & M.C. Residues
Quantizer for I-frame (I-block)
Different step size for different freq. band (similar to JPEG)
Default quantization table
Scale the table for different compression-quality
Quantizer for residues in predictive block
Noise-like residue
Similar variance in different frequency band
=> Assign same quantization step size for each frequency band
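The two quantizer styles can be contrasted with a small sketch. The step-size tables below are made up for illustration (a 4x4 ramp that grows with frequency for intra blocks, and a flat table for residues); they are not the MPEG-1 default tables, and the `scale` parameter only mimics the idea of scaling a table for different compression-quality trade-offs.

```python
def quantize(coeffs, steps, scale=1):
    """Divide each coefficient by its (scaled) step size and round."""
    return [
        [round(c / (q * scale)) for c, q in zip(row_c, row_q)]
        for row_c, row_q in zip(coeffs, steps)
    ]


def dequantize(levels, steps, scale=1):
    """Reconstruct coefficients from quantization levels."""
    return [
        [lv * q * scale for lv, q in zip(row_l, row_q)]
        for row_l, row_q in zip(levels, steps)
    ]


# Hypothetical tables: step size grows with frequency index (u + v) for
# intra blocks, and stays constant for prediction residues.
intra_steps = [[8 + 4 * (u + v) for v in range(4)] for u in range(4)]
flat_steps = [[16] * 4 for _ in range(4)]

coeffs = [[45] * 4 for _ in range(4)]      # same magnitude in every band
intra_levels = quantize(coeffs, intra_steps)
residue_levels = quantize(coeffs, flat_steps)
coarser_intra = quantize(coeffs, intra_steps, scale=2)
```

With equal-magnitude input, the intra table keeps more precision at low frequencies than at high ones, while the flat table treats every band alike.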

Revised from R. Liu Seminar Course '00 @ UMD
Adjusting Quantizer
For smoothing out bit rate
Some applications prefer approx. constant bit rate video stream (CBR)
e.g., prescribe # bits per second
very-short-term bit-rate variations can be smoothed by a buffer
variations can't be too large over the longer term
~ otherwise: buffer overflow, delay and jitter in playback

Need to assign large step size for complex and high-motion frames

For reducing bit rate by exploiting HVS temporal properties
Noise/distortion in a video frame would not be very much visible when
there is a sharp temporal transition (scene change)
can compress a few frames right after scene change with fewer bits

Alternative bit-rate adjustment tool ~ frame type
I I I I I I lowest compression ratio (like motion-JPEG)
I P P P I P P moderate compression ratio
I B B P B B P B B I highest compression ratio
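The buffer-based smoothing idea can be sketched with a toy feedback loop. Everything here is an assumption made for illustration: the model "bits = complexity / scale", the 1 to 31 range for the step-size scale, and the rule that ties the scale linearly to buffer fullness. Real rate-control schemes are considerably more elaborate.

```python
def simulate_rate_control(frame_complexities, target_bits, buffer_size):
    """Raise the quantizer scale as the output buffer fills up."""
    fullness = buffer_size / 2            # start with a half-full buffer
    history = []
    for complexity in frame_complexities:
        # Fuller buffer => larger step size (coarser quantization).
        scale = max(1, min(31, int(31 * fullness / buffer_size)))
        bits = complexity / scale         # toy model of bits produced
        # The channel drains target_bits per frame; clip to buffer limits.
        fullness = max(0.0, min(buffer_size, fullness + bits - target_bits))
        history.append((scale, bits, fullness))
    return history


# Two easy frames followed by four complex (e.g., high-motion) frames.
history = simulate_rate_control(
    [1500, 1500, 6000, 6000, 6000, 6000], target_bits=100, buffer_size=1000
)
scales = [scale for scale, _, _ in history]
```

The scale stays put while the frames are easy and climbs once the complex frames start filling the buffer, which is the behaviour the slide describes.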
UMCP ENEE631 Slides (created by M. Wu © 2001)

Color Transformation
RGB => YUV color coordinates
$$\begin{bmatrix} Y \\ U \\ V \end{bmatrix} = \begin{bmatrix} 0.2990 & 0.5870 & 0.1140 \\ -0.1687 & -0.3313 & 0.5000 \\ 0.5000 & -0.4187 & -0.0813 \end{bmatrix} \begin{bmatrix} R \\ G \\ B \end{bmatrix}$$
U/V chrominance components are downsampled in coding
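A small sketch of the color transform and chroma downsampling follows. It assumes the usual sign pattern for this matrix (negative R and G weights in the U row, negative G and B weights in the V row) and, as one possible choice, downsamples a chroma plane by averaging each 2x2 neighborhood; the function names are hypothetical.

```python
RGB_TO_YUV = [
    [0.2990, 0.5870, 0.1140],     # Y: luminance
    [-0.1687, -0.3313, 0.5000],   # U: blue-difference chrominance
    [0.5000, -0.4187, -0.0813],   # V: red-difference chrominance
]


def rgb_to_yuv(r, g, b):
    """Apply the 3x3 color matrix to one RGB pixel."""
    return tuple(m[0] * r + m[1] * g + m[2] * b for m in RGB_TO_YUV)


def downsample_2x2(plane):
    """Halve both dimensions by averaging 2x2 blocks (even sizes assumed)."""
    return [
        [
            (plane[i][j] + plane[i][j + 1] + plane[i + 1][j] + plane[i + 1][j + 1]) / 4
            for j in range(0, len(plane[0]), 2)
        ]
        for i in range(0, len(plane), 2)
    ]


y, u, v = rgb_to_yuv(128, 128, 128)          # a gray pixel has no chrominance
small = downsample_2x2([[1, 3], [5, 7]])     # one 2x2 block -> one sample
```

Keeping Y at full resolution and U/V at a quarter of the samples gives 8 + 16/4 = 12 bits per pixel instead of 24.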

Video Coding Summary: Performance Tradeoff

From R. Liu's Handbook, Fig. 1.2:
MOS ~ 5-point mean opinion scale of bad, poor, fair, good, excellent

About Compression Ratio
Raw video
24 bits/pixel x (720 x 480 pixels) x 30 fps = 249 Mbps
Potential cheating points => contributing ~ 4:1 inflation
Color components are actually downsampled
30 fps may refer to field rate in MPEG-2 ~ equiv. to 15 fps
( 8 x 720 x 480 + 16 x 720 x 480 / 4 ) x 15 fps = 62 Mbps
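The arithmetic above can be checked with a few lines. The helper name `raw_bitrate_mbps` is hypothetical; the figures are the ones on the slide (720 x 480 pixels, 24 vs. 8 + 16/4 bits per pixel, 30 vs. 15 frames per second).

```python
def raw_bitrate_mbps(width, height, fps, bits_per_pixel):
    """Uncompressed bit rate in megabits per second (1 Mbps = 1e6 bits/s)."""
    return width * height * bits_per_pixel * fps / 1e6


# Nominal figure: full-resolution color at 30 frames per second.
nominal = raw_bitrate_mbps(720, 480, 30, 24)

# More realistic figure: chroma downsampled by 4 (8 + 16/4 = 12 bits per
# pixel) and 30 fields per second counted as 15 frames per second.
actual = raw_bitrate_mbps(720, 480, 15, 8 + 16 / 4)

inflation = nominal / actual
```

The two values round to 249 Mbps and 62 Mbps, and their ratio is the roughly 4:1 inflation mentioned on the slide.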

Other Standards and Considerations for
Digital Video Coding


H.26x for Video Telephony
Remote face-to-face communication: A dream for years
H.26x series video coding targeted low bit rate
through ISDN or regular analog telephone line ~ on the order of 64kbps
need roughly symmetric complexity on encoder and decoder
H.261 (early 1990s)
Similar to simplified MPEG-1 ~ block-based DCT/MC hybrid coder
Integer-pel motion compensation with I/P frame only ~ no B frames
Restricted picture size/fps format and M.V. range
H.263 (mid 1990s) and H.263+/H.263++ (late 1990s)
Support half-pel motion compensation & many options for improvement
H.264 (latest, 2001-): also known as H.26L / JVT / MPEG-4 Part 10
Hybrid coding framework with many advanced techniques
Focusing on greatly improving compression ratio at a cost of complexity
allow smaller block size; more choices on ref; advanced entropy coding, etc.


From Gonzalez-Woods 3/e, Table 8.11
MPEG-2
Extends MPEG-1
Targets high-resolution, high-bit-rate applications
Digital video broadcasting, HDTV, etc.; also used for DVD
Support interlaced video
Frame pictures vs. Field pictures
New prediction modes for motion compensation for interlaced video
Use previously encoded fields to do M.E.-M.C.
Support scalability



From Wang's book preprint, Fig. 13.17
Scalability in Video Codecs
Scalability: provide different quality in a single stream
Stack up more bits on base layer to provide improved quality
Possible ways for achieving scalabilities
SNR Scalability ~ Multiple-quality video services
Basic vs. premium quality

Spatial Scalability ~ Multiple-dimension displays
Display on PDA vs. PC vs. Super-resolution display

Temporal Scalability ~ Multiple frame rates
Frequency Scalability ~ Blurred version to sharp, detailed version

Layered coding concept facilitates:
Unequal error protection => Efficient use of resources
Different needs from customers => Multiple services

SNR Scalability
Two layers with same spatio-temporal resolution but
different qualities
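[Block diagram: Video in => base-layer encoder => base-layer bitstream; the base-layer decoder output is subtracted (+/-) from the input video and the difference goes to the enhancement-layer encoder => enhancement-layer bitstream; a multiplexer combines both into the output bitstream]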
From R.Liu Seminar Course @ UMCP
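The two-layer SNR-scalable structure can be sketched on a handful of sample values. This is a toy illustration, with a coarse uniform quantizer standing in for the base-layer codec and a finer quantizer for the enhancement layer; the step sizes and sample values are arbitrary assumptions.

```python
def quantize(values, step):
    """Uniform quantizer: snap each value to the nearest multiple of step."""
    return [step * round(v / step) for v in values]


def encode_two_layers(samples, base_step, enh_step):
    """Base layer: coarse version. Enhancement layer: quantized residual."""
    base = quantize(samples, base_step)
    residual = [s - b for s, b in zip(samples, base)]   # the "+/-" node
    enhancement = quantize(residual, enh_step)
    return base, enhancement


def decode(base, enhancement=None):
    """Decode the base layer alone, or base plus enhancement if available."""
    if enhancement is None:
        return list(base)
    return [b + e for b, e in zip(base, enhancement)]


samples = [13, 27, 41, 59]
base, enhancement = encode_two_layers(samples, base_step=16, enh_step=4)
basic = decode(base)
premium = decode(base, enhancement)
basic_err = max(abs(s - r) for s, r in zip(samples, basic))
premium_err = max(abs(s - r) for s, r in zip(samples, premium))
```

Both layers have the same number of samples (same spatio-temporal resolution); a receiver that also gets the enhancement layer simply sees a smaller reconstruction error.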
Spatial Scalability
Two layers with different spatial resolution
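[Block diagram: Video in => down-sampler => base-layer encoder => base-layer bitstream; the base-layer decoder output is up-sampled and subtracted (+/-) from the input video, and the difference goes to the enhancement-layer encoder => enhancement-layer bitstream; a multiplexer combines both into the output bitstream]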
From R.Liu Seminar Course @ UMCP
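The spatially scalable structure can be sketched in one dimension. This is a simplified illustration: it assumes pair-averaging as the down-sampler, sample repetition as the up-sampler, and it leaves out quantization of both layers so that the layering itself is easy to follow.

```python
def downsample(signal):
    """Halve the resolution by averaging sample pairs (even length assumed)."""
    return [(signal[i] + signal[i + 1]) / 2 for i in range(0, len(signal), 2)]


def upsample(signal):
    """Double the resolution by repeating each sample."""
    return [v for v in signal for _ in range(2)]


def encode(signal):
    """Base layer at half resolution plus a full-resolution residual."""
    base = downsample(signal)
    enhancement = [s - u for s, u in zip(signal, upsample(base))]
    return base, enhancement


def decode(base, enhancement=None):
    """Low-resolution display uses `base`; full resolution adds the residual."""
    up = upsample(base)
    if enhancement is None:
        return up
    return [u + e for u, e in zip(up, enhancement)]


signal = [10, 12, 20, 24]
base, enhancement = encode(signal)
coarse = decode(base)
full = decode(base, enhancement)
```

A small display could show `base` directly, while a larger display combines the up-sampled base layer with the enhancement layer.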
MPEG-4
Many functionalities targeting a variety of applications
Introduced object-based coding strategy
For better support of interactive applications & graphics/animation video
Require encoder to perform object segmentation
difficult for general applications

Introduced error resilient coding techniques
Streaming video profile for wireless multimedia applications

Part-10 is converged into H.264/AVC (Advanced Video Coding)
Focused on improving compression ratio and error resilience
Stick with Hybrid Coding framework

Object-based Coding in MPEG-4
Interactive functionalities
Higher compression efficiency by separately handling
Moving objects
Unchanged background
New regions
M.C.-failure regions
=> Sprite encoding
Object segmentation needed (not easy)
Based on color, motion, edge, texture, etc.
Possible for targeted applications
Revised from R.Liu Seminar Course @ UMCP


From Wang's Preprint, Table 1.3

MPEG-7
Multimedia Content Description Interface
Not a video coding/compression standard like the previous MPEG standards
Emphasizes how to describe the video content for efficient indexing, search, and retrieval






Standardize the description mechanism of content
Descriptor, Description Scheme, Description Definition Languages
Employ XML type of description language

Example of MPEG-7 visual descriptors: Color, Texture, Shape, ...
Figure from MPEG-7 Document N4031 (March 2001)

Summary of Today's Lecture
MPEG-1 video coding standard
Other coding considerations and standards
H.26x, MPEG-2, MPEG-4, MPEG-7, etc.

Geometric transform of images ~ more in next lecture

Readings:
Gonzalez's 3/e book: Sections 8.2.9, 8.1.7; 2.6.5 (geometric transform)

Liu's book on video coding (see course website)
Chapter 2 Motion-Compensated DCT Video Coding
Chapter 3 Video Coding Standards

Other reference: Wang's textbook
Chapter 13 (video standards); Chapter 1 (video basics)


Geometric Relations and Manipulations of Images

Useful to characterize:
- global camera motion in video;
- the relation between two images of similar scenes taken at different times or from slightly different viewpoints
=> image registration

Rotation, Translation, and Scaling
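R.S.T. of an image object
Original pixel location (x, y) => New location (x', y')
Translation by (t_x, t_y):
$$\begin{bmatrix} x' \\ y' \end{bmatrix} = \begin{bmatrix} x \\ y \end{bmatrix} + \begin{bmatrix} t_x \\ t_y \end{bmatrix}$$
Rotation around the origin by \theta counter-clockwise:
$$\begin{bmatrix} x' \\ y' \end{bmatrix} = \begin{bmatrix} \cos\theta & -\sin\theta \\ \sin\theta & \cos\theta \end{bmatrix} \begin{bmatrix} x \\ y \end{bmatrix}$$
(Translation and rotation preserve length & angle)
Scaling by s_x and s_y:
$$\begin{bmatrix} x' \\ y' \end{bmatrix} = \begin{bmatrix} s_x & 0 \\ 0 & s_y \end{bmatrix} \begin{bmatrix} x \\ y \end{bmatrix}$$
Uniform scaling: s_x = s_y (preserves angle and shape)
Differential scaling: s_x \neq s_y
[Figure: point (x, y) rotated by angle \theta to (x', y')]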

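Rotation, Translation, and Scaling (cont'd)
Rotation and translation of image coordinates
Note the relations with rotation and translation of image objects
Translate the origin to (t_x, t_y):
$$\begin{bmatrix} x \\ y \end{bmatrix} = \begin{bmatrix} x' \\ y' \end{bmatrix} + \begin{bmatrix} t_x \\ t_y \end{bmatrix}$$
Rotate the axes by \theta counter-clockwise:
$$\begin{bmatrix} x' \\ y' \end{bmatrix} = \begin{bmatrix} \cos\theta & \sin\theta \\ -\sin\theta & \cos\theta \end{bmatrix} \begin{bmatrix} x \\ y \end{bmatrix}$$
[Figure: original axes (x, y) and axes (x', y') rotated by angle \theta]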

Implementation Issues of Geometric Transform
Forward transform
Index mapping from input to output image
What if most values obtained for an output image are at fractional
coordinate indices?
Reverse transform
Map integer indices of output image to input image
Get values of input image at fractional indices through interpolation
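[Figure: a fractional location (p', q') lies among the four integer neighbors (p, q), (p, q+1), (p+1, q), (p+1, q+1), at fractional offsets a and b]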
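Reverse mapping with interpolation can be sketched as follows. This is a minimal illustration that assumes bilinear interpolation among the four integer neighbors, with fractional offsets a and b along the row and column directions respectively; the function names, the fill value for locations outside the input, and the tiny test image are all hypothetical.

```python
import math


def bilinear(img, y, x):
    """Interpolate img at fractional (row y, column x) from 4 neighbors."""
    h, w = len(img), len(img[0])
    p, q = math.floor(y), math.floor(x)
    a, b = y - p, x - q                       # fractional offsets in [0, 1)
    p1, q1 = min(p + 1, h - 1), min(q + 1, w - 1)
    return ((1 - a) * (1 - b) * img[p][q] + (1 - a) * b * img[p][q1]
            + a * (1 - b) * img[p1][q] + a * b * img[p1][q1])


def reverse_warp(img, out_h, out_w, inverse_map, fill=0):
    """For every integer output index, look up the input at the mapped spot."""
    h, w = len(img), len(img[0])
    out = []
    for i in range(out_h):
        row = []
        for j in range(out_w):
            y, x = inverse_map(i, j)
            inside = 0 <= y <= h - 1 and 0 <= x <= w - 1
            row.append(bilinear(img, y, x) if inside else fill)
        out.append(row)
    return out


img = [[0, 10], [20, 30]]
# Zoom by 2: output index (i, j) maps back to input location (i/2, j/2).
zoomed = reverse_warp(img, 3, 3, lambda i, j: (i / 2, j / 2))
```

Because the loop runs over output pixels, every output location receives exactly one value, which avoids the holes that forward mapping can leave.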


2-D Homogeneous Coordinate
Describe R.S.T. transform by P' = M P + T
Requires calculating intermediate coordinate values for successive transforms
Homogeneous coordinate
Allows R.S.T. to be represented by matrix multiplication operations
successive transforms can be calculated by combining transform matrices

Cartesian point (x, y) => Homogeneous representation (s x, s y, s)
represents the same pixel location for all nonzero parameter s; often use s=1
$$\begin{bmatrix} x' \\ y' \\ 1 \end{bmatrix} \sim \begin{bmatrix} s x' \\ s y' \\ s \end{bmatrix} = \begin{bmatrix} a_{11} & a_{12} & a_{13} \\ a_{21} & a_{22} & a_{23} \\ a_{31} & a_{32} & 1 \end{bmatrix} \begin{bmatrix} x \\ y \\ 1 \end{bmatrix}$$
The name: an equation f(x, y) = 0 becomes a homogeneous equation in (s x, s y, s), in the sense that a common factor in the 3 parameters can be factored out of the equation.
UMCP ENEE631 Slides (created by M. Wu © 2001/2004)

Exercise: express RST and reflection in homogeneous coordinates
R.S.T. in Homogeneous Coordinates
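Successive R.S.T.
Left multiply the basic transform matrices
Translation by (t_x, t_y):
$$\begin{bmatrix} x' \\ y' \\ 1 \end{bmatrix} = \begin{bmatrix} 1 & 0 & t_x \\ 0 & 1 & t_y \\ 0 & 0 & 1 \end{bmatrix} \begin{bmatrix} x \\ y \\ 1 \end{bmatrix}$$
Scaling by s_x and s_y:
$$\begin{bmatrix} x' \\ y' \\ 1 \end{bmatrix} = \begin{bmatrix} s_x & 0 & 0 \\ 0 & s_y & 0 \\ 0 & 0 & 1 \end{bmatrix} \begin{bmatrix} x \\ y \\ 1 \end{bmatrix}$$
Rotation by \theta counter-clockwise:
$$\begin{bmatrix} x' \\ y' \\ 1 \end{bmatrix} = \begin{bmatrix} \cos\theta & -\sin\theta & 0 \\ \sin\theta & \cos\theta & 0 \\ 0 & 0 & 1 \end{bmatrix} \begin{bmatrix} x \\ y \\ 1 \end{bmatrix}$$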
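Combining successive transforms by left-multiplying 3x3 matrices can be sketched as below. The example composes "rotate about an arbitrary point" from a translation to the origin, a rotation, and a translation back; the helper names and the chosen point and angle are assumptions for illustration.

```python
import math


def matmul(A, B):
    """3x3 matrix product A B."""
    return [[sum(A[i][k] * B[k][j] for k in range(3)) for j in range(3)]
            for i in range(3)]


def translation(tx, ty):
    return [[1, 0, tx], [0, 1, ty], [0, 0, 1]]


def scaling(sx, sy):
    return [[sx, 0, 0], [0, sy, 0], [0, 0, 1]]


def rotation(theta):
    """Counter-clockwise rotation about the origin."""
    c, s = math.cos(theta), math.sin(theta)
    return [[c, -s, 0], [s, c, 0], [0, 0, 1]]


def apply(M, x, y):
    """Transform the point (x, y, 1) and return Cartesian (x', y')."""
    return (M[0][0] * x + M[0][1] * y + M[0][2],
            M[1][0] * x + M[1][1] * y + M[1][2])


def rotate_about(cx, cy, theta):
    """T(c) R(theta) T(-c): the first transform applied sits rightmost."""
    return matmul(translation(cx, cy),
                  matmul(rotation(theta), translation(-cx, -cy)))


M = rotate_about(1, 1, math.pi / 2)
x_new, y_new = apply(M, 2, 1)      # rotate (2, 1) by 90 degrees about (1, 1)
```

The whole chain collapses into the single matrix `M`, so no intermediate coordinates need to be computed per pixel.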

Reflection
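Reflect about x-axis, y-axis, and origin
$$\begin{bmatrix} 1 & 0 & 0 \\ 0 & -1 & 0 \\ 0 & 0 & 1 \end{bmatrix}, \quad \begin{bmatrix} -1 & 0 & 0 \\ 0 & 1 & 0 \\ 0 & 0 & 1 \end{bmatrix}, \quad \begin{bmatrix} -1 & 0 & 0 \\ 0 & -1 & 0 \\ 0 & 0 & 1 \end{bmatrix}$$
Reflect about y=x and y=-x
$$\begin{bmatrix} 0 & 1 & 0 \\ 1 & 0 & 0 \\ 0 & 0 & 1 \end{bmatrix}, \quad \begin{bmatrix} 0 & -1 & 0 \\ -1 & 0 & 0 \\ 0 & 0 & 1 \end{bmatrix}$$
Reflect about a general line y=ax+b
Combination of translate-rotate => reflect => inverse rotate-translate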
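Reflection about a general line y = ax + b can be sketched as the composition described above. The sketch assumes one particular decomposition: shift the line down by b so it passes through the origin, rotate by -atan(a) so it lies on the x-axis, reflect about the x-axis, then undo the rotation and the shift. Helper names are hypothetical.

```python
import math


def matmul(A, B):
    """3x3 matrix product A B."""
    return [[sum(A[i][k] * B[k][j] for k in range(3)) for j in range(3)]
            for i in range(3)]


def translation(tx, ty):
    return [[1, 0, tx], [0, 1, ty], [0, 0, 1]]


def rotation(theta):
    c, s = math.cos(theta), math.sin(theta)
    return [[c, -s, 0], [s, c, 0], [0, 0, 1]]


REFLECT_X = [[1, 0, 0], [0, -1, 0], [0, 0, 1]]   # reflect about the x-axis


def reflect_about_line(a, b):
    """Homogeneous matrix for reflection about the line y = a x + b."""
    phi = math.atan(a)                            # angle of the line
    steps = [translation(0, -b), rotation(-phi), REFLECT_X,
             rotation(phi), translation(0, b)]    # in order of application
    M = steps[0]
    for step in steps[1:]:
        M = matmul(step, M)                       # left-multiply each step
    return M


def apply(M, x, y):
    return (M[0][0] * x + M[0][1] * y + M[0][2],
            M[1][0] * x + M[1][1] * y + M[1][2])


M = reflect_about_line(1, 1)          # the line y = x + 1
mirrored = apply(M, 0, 0)             # mirror image of the origin
fixed = apply(M, 0, 1)                # a point on the line stays put
```

For the line y = x + 1, the origin maps to (-1, 1), and points on the line are left unchanged.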



Shear
Shear ~ a transformation that distorts the shape
Cause opposite layers of the object slide over each other

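Shear relative to x-axis
$$\begin{bmatrix} 1 & sh_x & 0 \\ 0 & 1 & 0 \\ 0 & 0 & 1 \end{bmatrix}$$
Extend to shears relative to other reference lines
[Figure: unit square with corners (1, 0) and (1, 1); after a shear with sh_x = 2, the top corners move to (2, 1) and (3, 1)]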
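As a worked check of the figure, assuming the x-axis shear matrix has sh_x in its first-row, second-column entry (so that x' = x + sh_x y and y' = y), the corner (1, 1) with sh_x = 2 maps to (3, 1):

$$\begin{bmatrix} 1 & 2 & 0 \\ 0 & 1 & 0 \\ 0 & 0 & 1 \end{bmatrix} \begin{bmatrix} 1 \\ 1 \\ 1 \end{bmatrix} = \begin{bmatrix} 3 \\ 1 \\ 1 \end{bmatrix}$$

Likewise (0, 1) maps to (2, 1), while (1, 0) stays at (1, 0) because points on the x-axis have y = 0 and are not displaced.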

General Composite Transforms
Combined R.S.T.
$$\begin{bmatrix} x' \\ y' \\ 1 \end{bmatrix} = \begin{bmatrix} a_{11} & a_{12} & a_{13} \\ a_{21} & a_{22} & a_{23} \\ 0 & 0 & 1 \end{bmatrix} \begin{bmatrix} x \\ y \\ 1 \end{bmatrix}$$
{a_{ij}} is determined by R.S.T. parameters

Rigid-body transform
Only involve translations and rotations
2x2 rotation submatrix is orthogonal
row vectors are orthonormal

Extension to 3-D homogeneous coordinate
(sX, sY, sZ, s) with 4x4 transformation matrices

General Composite Transforms (cont'd)
Affine transforms ~ 6 parameters
$$\begin{bmatrix} x' \\ y' \\ 1 \end{bmatrix} = \begin{bmatrix} a_{11} & a_{12} & a_{13} \\ a_{21} & a_{22} & a_{23} \\ 0 & 0 & 1 \end{bmatrix} \begin{bmatrix} x \\ y \\ 1 \end{bmatrix}$$
Can be expressed as composition of RST, reflection and shear
Parallel lines are transformed as parallel lines
Projective transforms ~ 8 parameters
$$\begin{bmatrix} s x' \\ s y' \\ s \end{bmatrix} = \begin{bmatrix} a_{11} & a_{12} & b_1 \\ a_{21} & a_{22} & b_2 \\ c_1 & c_2 & 1 \end{bmatrix} \begin{bmatrix} x \\ y \\ 1 \end{bmatrix}, \qquad w_{new} = [x', y']^T = \frac{A w + b}{c^T w + 1}, \quad w = [x, y]^T$$
Cover more general geometric transformations between 2 planes
Widely used in computer vision (e.g. image mosaicing, synthesized views)
Two unique phenomena:
Chirping: increase in perceived spatial freq as distance to camera increases
Converging/Keystone effects: parallel lines appear closer & merging in distance
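The projective mapping and its keystone effect can be sketched numerically. The matrix `H` below is a made-up example (identity except for one nonzero entry in the bottom row); the point is only that dividing by the third homogeneous coordinate makes parallel lines converge.

```python
def project(H, x, y):
    """Apply a 3x3 projective transform and divide by the scale s."""
    sx = H[0][0] * x + H[0][1] * y + H[0][2]
    sy = H[1][0] * x + H[1][1] * y + H[1][2]
    s = H[2][0] * x + H[2][1] * y + H[2][2]
    return sx / s, sy / s


# Hypothetical transform: affine part is the identity, c = [0, 0.5].
H = [[1, 0, 0],
     [0, 1, 0],
     [0, 0.5, 1]]

# Two parallel vertical lines x = -1 and x = +1, sampled at y = 0 and y = 2.
near_left, near_right = project(H, -1, 0), project(H, 1, 0)
far_left, far_right = project(H, -1, 2), project(H, 1, 2)

near_gap = near_right[0] - near_left[0]
far_gap = far_right[0] - far_left[0]
```

The gap between the two lines shrinks from 2 to 1 as y grows, which an affine transform (bottom row 0 0 1) could never produce.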

Effects of Various Geometric Mappings

From Wang's Book Preprint, Fig. 5.18
Higher-order Nonlinear Spatial Warping
Analogous to rubber sheet stretching
Forward and reverse mapping of pixels' coordinate indices
(x, y) <=> (x', y')
Polynomial warping
Extend affine transform to higher-order polynomial mapping
2nd-order warping
$$x' = a_0 + a_1 x + a_2 y + a_3 x^2 + a_4 xy + a_5 y^2$$
$$y' = b_0 + b_1 x + b_2 y + b_3 x^2 + b_4 xy + b_5 y^2$$

Spatial distortion in imaging system (lens)
Pincushion and Barrel distortion

Example of 2nd-order Polynomial Spatial Warping


From P. Ramadge's PU EE488 F'00
Illustration of Geometric Distortion


From P. Ramadge's PU EE488 F'00
Compensating Spatial Distortion in Imaging
Control points establishing correspondence
Coordinates before and after distortion are known
Fit into polynomial warping model: (x, y) => (x', y')
$$x' = a_0 + a_1 x + a_2 y + a_3 x^2 + a_4 xy + a_5 y^2$$
$$y' = b_0 + b_1 x + b_2 y + b_3 x^2 + b_4 xy + b_5 y^2$$

Minimize the sum of squared error between a set of warped control points and the polynomial estimates
$$\mathbf{x}' = [x'_1, x'_2, \ldots, x'_M]^T, \qquad Z = \begin{bmatrix} 1 & x_1 & y_1 & x_1^2 & x_1 y_1 & y_1^2 \\ 1 & x_2 & y_2 & x_2^2 & x_2 y_2 & y_2^2 \\ \vdots & & & & & \vdots \end{bmatrix}$$
$$E = (\mathbf{x}' - Z a)^T (\mathbf{x}' - Z a) + (\mathbf{y}' - Z b)^T (\mathbf{y}' - Z b)$$
$$\partial E / \partial a = 0 \;\Rightarrow\; Z^T (\mathbf{x}' - Z a) = 0$$

Least square estimates: solution expressed by generalized inverse of Z
$$a = Z^{+} \mathbf{x}' = (Z^T Z)^{-1} Z^T \mathbf{x}'; \qquad b = Z^{+} \mathbf{y}'$$
Higher-order approximation
2nd-order polynomial usually suffices for many applications
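The least-squares fit of the warping coefficients can be sketched by solving the normal equations (Z^T Z) a = Z^T x' directly. This is a minimal illustration using Gaussian elimination with partial pivoting in plain Python; the control-point grid and the "true" coefficients used to synthesize the warped coordinates are made-up test data, and at least six control points in general position are assumed.

```python
def design_row(x, y):
    """One row of Z for a 2nd-order polynomial: 1, x, y, x^2, xy, y^2."""
    return [1.0, x, y, x * x, x * y, y * y]


def solve(A, rhs):
    """Solve A v = rhs by Gaussian elimination with partial pivoting."""
    n = len(A)
    M = [row[:] + [r] for row, r in zip(A, rhs)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(M[r][col]))
        M[col], M[pivot] = M[pivot], M[col]
        for r in range(col + 1, n):
            f = M[r][col] / M[col][col]
            for c in range(col, n + 1):
                M[r][c] -= f * M[col][c]
    sol = [0.0] * n
    for r in reversed(range(n)):
        s = sum(M[r][c] * sol[c] for c in range(r + 1, n))
        sol[r] = (M[r][n] - s) / M[r][r]
    return sol


def fit_warp(points, warped):
    """Least-squares coefficients for one warped coordinate (x' or y')."""
    Z = [design_row(x, y) for x, y in points]
    ZtZ = [[sum(z[i] * z[j] for z in Z) for j in range(6)] for i in range(6)]
    Ztx = [sum(z[i] * t for z, t in zip(Z, warped)) for i in range(6)]
    return solve(ZtZ, Ztx)


# Synthetic control points on a 3x3 grid, warped with known coefficients.
true_a = [1.0, 0.9, 0.1, 0.02, -0.01, 0.03]
points = [(float(x), float(y)) for x in range(3) for y in range(3)]
x_warped = [sum(c * z for c, z in zip(true_a, design_row(x, y)))
            for x, y in points]

a_hat = fit_warp(points, x_warped)
```

The same `fit_warp` call with the warped y-coordinates would give the b coefficients; with noisy control points the fit no longer reproduces the data exactly but still minimizes the squared error E.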

Example of Image Registration

Figure from Gonzalez-Woods 3/e online book resource
Generations of Video Coding


From R. Liu Seminar Course '00 @ UMCP
