
UNIT V

IMAGE COMPRESSION
Dr.S.Deepa
Mrs.B.Sathyabhama
Mrs.V.Subashree
IMAGE COMPRESSION

• The goal of image compression is to reduce the amount of data required to represent a digital image.
IMAGE COMPRESSION (CONT’D)

• Lossless
• Information preserving
• Low compression ratios

• Lossy
• Information loss
• High compression ratios

Trade-off: information loss vs compression ratio


COMPRESSION RATIO

• Compression ratio: C = n1 / n2, where n1 and n2 are the number of information-carrying units (e.g., bits) in the original and compressed representations, respectively.
RELATIVE DATA REDUNDANCY

• Relative data redundancy: R = 1 − 1/C

• Example: if C = 10, then R = 0.9, i.e., 90% of the data in the original image is redundant.
TYPES OF DATA REDUNDANCY

(1) Coding Redundancy


(2) Interpixel Redundancy
(3) Psychovisual Redundancy

• Data compression attempts to reduce one or more of these redundancy types.
CODING REDUNDANCY
• Coding redundancy results from employing inefficient
coding schemes!

• Code: a list of symbols (letters, numbers, bits, etc.).
• Code word: a sequence of symbols used to represent information (e.g., gray levels).
• Code word length: the number of symbols in a code word; it can be fixed or variable.
CODING REDUNDANCY (CONT’D)
• To compare the efficiency of different coding schemes, we can compute the average number of bits per code word, Lavg.

For an N x M image, let rk be the k-th gray level, l(rk) the number of bits used to represent rk, and P(rk) the probability of rk. Then, as an expected value E(X) = Σx x P(X = x):

Lavg = Σk l(rk) P(rk)

Average image size: N M Lavg bits
CODING REDUNDANCY -
EXAMPLE
• Case 1: l(rk) = fixed length

Average image size: 3NM


CODING REDUNDANCY –
EXAMPLE (CONT’D)
• Case 2: l(rk) = variable length

Average image size: 2.7NM
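The Lavg comparison above can be sketched in Python. The probabilities and code lengths below are illustrative values, not the ones from the slide's (unreproduced) table:

```python
# Average code word length: L_avg = sum over k of l(r_k) * P(r_k).
# The probabilities and code lengths are made-up illustrative values.
probs    = [0.4, 0.3, 0.2, 0.1]  # P(r_k) for four gray levels
fixed    = [2, 2, 2, 2]          # l(r_k) for a 2-bit fixed-length code
variable = [1, 2, 3, 3]          # l(r_k) for a variable-length code

def l_avg(lengths, probs):
    """Average number of bits per pixel for the given code."""
    return sum(l * p for l, p in zip(lengths, probs))

print(l_avg(fixed, probs))     # 2.0 bits/pixel
print(l_avg(variable, probs))  # 1.9 bits/pixel -> smaller image
```

Assigning shorter code words to more probable gray levels lowers Lavg, which is exactly the effect seen in the 3NM vs 2.7NM example.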


INTERPIXEL REDUNDANCY
• Interpixel redundancy results from pixel
correlations (i.e., a pixel value can be reasonably
predicted by its neighbors).
• Two images can have identical histograms yet very different spatial structure; the auto-correlation function captures this pixel dependence.

Correlation: (f ∘ g)(x) = ∫ f(a) g(x + a) da, with the integral taken from −∞ to ∞

Auto-correlation: the special case f(x) = g(x)
INTERPIXEL REDUNDANCY -
EXAMPLE
• To reduce interpixel redundancy, some transformation
is typically applied on the data.

Thresholding the original image produces a binary image whose rows contain long runs of identical symbols:

11 ……………0000……………………..11…..000…..
PSYCHOVISUAL REDUNDANCY
• The human eye is more sensitive to the lower frequencies than to
the higher frequencies in the visual spectrum.
• Idea: discard data that is perceptually insignificant!

256 gray levels → 16 gray levels → 16 gray levels + random noise

Example: improved quantization, i.e., add a small pseudo-random number to each pixel prior to quantization.

C = 8/4 = 2:1
GENERAL IMAGE COMPRESSION
MODEL

We will focus on the Source Encoder/Decoder only.


ENCODER

• Mapper: transforms the data to account for interpixel redundancies.
ENCODER (CONT’D)

• Quantizer: quantizes the data to account for psychovisual redundancies.
ENCODER (CONT’D)

• Symbol encoder: encodes the data to account for coding redundancies.
DECODER

• The decoder applies the inverse steps.

• Note that quantization is irreversible in general!


LOSSLESS COMPRESSION
HUFFMAN CODING
(ADDRESSES CODING REDUNDANCY)
• A variable-length coding technique.

• Source symbols (i.e., pixel values) are encoded one at a time.
• There is a one-to-one correspondence between source symbols and code words.
• Optimal code: minimizes the code word length per source symbol.
HUFFMAN CODING (CONT’D)
• Forward Pass
1. Sort the symbol probabilities.
2. Combine the two lowest probabilities.
3. Repeat Step 2 until only two probabilities remain.
HUFFMAN CODING (CONT’D)

• Backward Pass
Assign code symbols going backwards
HUFFMAN CODING (CONT’D)
• Lavg assuming binary coding:

• Lavg assuming Huffman coding:


MORE EXAMPLES – HUFFMAN CODING

Construct the Huffman code for the word 'committee' and use it to encode the word.

Solution

Total number of letters in the word = 9. Determine the probability of each letter in the word: for example, 'c' occurs once, so P(c) = 1/9; 'e' occurs twice, so P(e) = 2/9; and so on.

Symbol Code
C 001
E 01

I 0000
M 10
O 0001
T 11
CONSTRUCTING THE CODE

C O M M I T T E E

001 0001 10 10 0000 11 11 01 01
Symbol Code
C 001
E 01
I 0000
M 10
O 0001
T 11
HUFFMAN DECODING-PROBLEM
• Decode the message 00010110011011000 for the given
symbols and probability

Symbol Codeword
a 1
L 01
M 000
Y 001

00010110011011000 → 000 | 1 | 01 | 1 | 001 | 1 | 01 | 1 | 000 → M a L a Y a L a M
Decoded Word is : MALAYALAM
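The greedy prefix-matching walk used above can be sketched as follows, using the code table and message from the slide:

```python
# Decode a Huffman bit string by matching prefix-free code words greedily:
# because no code word is a prefix of another, the first match is the symbol.
CODE = {"1": "a", "01": "L", "000": "M", "001": "Y"}

def huffman_decode(bits, code):
    out, buf = [], ""
    for b in bits:
        buf += b
        if buf in code:       # complete code word found
            out.append(code[buf])
            buf = ""
    return "".join(out)

print(huffman_decode("00010110011011000", CODE).upper())  # MALAYALAM
```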


ENCODE THE GIVEN IMAGE –
USING HUFFMAN CODE

10 10 11 12 13

10 12 12 14 15

10 12 12 15 15

10 10 10 11 11

10 11 11 11 12
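A sketch of building a Huffman code for this 5 x 5 image with Python's `heapq`. Tie-breaking (and hence the exact code words) may differ from a hand-built tree, but the total coded length is the same for any Huffman tree:

```python
import heapq
from collections import Counter

pixels = [10,10,11,12,13, 10,12,12,14,15, 10,12,12,15,15,
          10,10,10,11,11, 10,11,11,11,12]

def huffman_code(freqs):
    """Return {symbol: bitstring}: repeatedly merge the two least-frequent
    nodes (forward pass), prepending 0/1 to their code words (backward pass)."""
    heap = [[f, i, {s: ""}] for i, (s, f) in enumerate(freqs.items())]
    heapq.heapify(heap)
    while len(heap) > 1:
        f1, _, c1 = heapq.heappop(heap)
        f2, i2, c2 = heapq.heappop(heap)
        merged = {s: "0" + w for s, w in c1.items()}
        merged.update({s: "1" + w for s, w in c2.items()})
        heapq.heappush(heap, [f1 + f2, i2, merged])
    return heap[0][2]

code = huffman_code(Counter(pixels))
total_bits = sum(len(code[p]) for p in pixels)
print(total_bits)  # 57 bits, vs 25 * 3 = 75 bits for a fixed-length code
```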
ARITHMETIC (OR RANGE) CODING
(ADDRESSES CODING REDUNDANCY)

• Huffman coding encodes source symbols one at a time, which might not be efficient.
• Arithmetic coding encodes sequences of source symbols into variable-length code words.
• There is no one-to-one correspondence between source symbols and code words.
• Slower than Huffman coding, but can achieve higher compression.
EXAMPLE

Encode the sequence α1 α2 α3 α3 α4, with the model P(α1) = 0.2, P(α2) = 0.2, P(α3) = 0.4, P(α4) = 0.2 (cumulative boundaries 0.2, 0.4, 0.8 in the figure).

Final sub-interval: [0.06752, 0.0688)

Code: 0.068 (any number within the sub-interval)

Warning: finite-precision arithmetic might cause problems due to truncation!


Arithmetic Decoding
Decode 0.572 with the same model: repeatedly locate the value within the nested sub-intervals. The decoded sequence is α3 α3 α1 α2 α4.

A special EOF symbol can be used to terminate the iterations.
RUN-LENGTH CODING (RLC)
(ADDRESSES INTERPIXEL REDUNDANCY)

• Reduce the size of a repeating string of symbols (i.e., a "run") by encoding each run as (symbol, count):

1 1 1 1 1 0 0 0 0 0 0 1 → (1, 5) (0, 6) (1, 1)

a a a b b b b b b c c → (a, 3) (b, 6) (c, 2)

• The (symbol, count) pairs can then be encoded:

• using two bytes (assuming 0 <= symbol, count <= 255), or
• using Huffman or arithmetic coding, treating the pairs as new symbols.
Image Compression Standards
LOSSY COMPRESSION
• Transform the image into some other domain to reduce interpixel
redundancy.
JPEG ENCODER
DCT (DISCRETE COSINE
TRANSFORM)
Forward:

C(u,v) = α(u) α(v) Σ_{x=0}^{N−1} Σ_{y=0}^{N−1} f(x,y) cos[(2x+1)uπ / 2N] cos[(2y+1)vπ / 2N]

Inverse:

f(x,y) = Σ_{u=0}^{N−1} Σ_{v=0}^{N−1} α(u) α(v) C(u,v) cos[(2x+1)uπ / 2N] cos[(2y+1)vπ / 2N]

where α(u) = √(1/N) if u = 0 and √(2/N) if u > 0 (and similarly for α(v)).
DCT (CONT’D)

• Basis functions for a 4x4 image (i.e., cosines of different frequencies).
DCT (CONT’D)
Comparison of the DFT, WHT, and DCT: using 8 x 8 sub-images yields 64 coefficients per sub-image; the images are reconstructed after truncating 50% of the coefficients.

RMS error: DFT 2.32, WHT 1.78, DCT 1.13

The DCT is the most compact transformation!

DCT (CONT’D)

• Sub-image size selection: the plot of RMS error versus sub-image size guides the choice.

Reconstructions (75% truncation of coefficients): original, 2 x 2, 4 x 4, and 8 x 8 sub-images.


DCT (CONT’D)
• DCT minimizes "blocking artifacts" (i.e., boundaries between
subimages do not become very visible).
JPEG COMPRESSION

The pipeline ends in an entropy encoder. JPEG became an international image compression standard in 1992.
JPEG - STEPS
1. Divide image into 8x8 subimages.

For each subimage do:


2. Shift the gray levels to the range [-128, 127]
(this reduces the dynamic range requirements of the DCT)

3. Apply the DCT; this yields 64 coefficients:

1 DC coefficient: F(0,0)
63 AC coefficients: F(u,v)
JPEG STEPS

4. Quantize the coefficients (i.e., reduce the amplitude of coefficients that do not contribute a lot):

Cq(u,v) = round[ C(u,v) / Q(u,v) ]

where Q(u,v) is the quantization table.

EXAMPLE (CONT’D)

Quantization maps C(u,v) to Cq(u,v) using the table Q(u,v). Small-magnitude coefficients are truncated to zero; the "quality" parameter controls how many of them are truncated.
JPEG STEPS (CONT’D)
5. Order the coefficients using zig-zag ordering

- Creates long runs of zeros (i.e., ideal for run-length encoding)
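The zig-zag traversal of the anti-diagonals can be generated as follows (a sketch; the first few positions match the standard JPEG scan order):

```python
# Zig-zag scan order for an n x n block: walk the anti-diagonals u + v = d,
# alternating direction so low-frequency coefficients come first and the
# trailing zeros produced by quantization cluster at the end.
def zigzag(n=8):
    order = []
    for d in range(2 * n - 1):
        diag = [(u, d - u) for u in range(n) if 0 <= d - u < n]
        order.extend(diag if d % 2 else diag[::-1])
    return order

zz = zigzag(8)
print(zz[:6])  # [(0, 0), (0, 1), (1, 0), (2, 0), (1, 1), (0, 2)]
```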
JPEG STEPS (CONT’D)

6. Encode the coefficients:

6.1 Form an "intermediate" symbol sequence.

6.2 Encode the "intermediate" symbol sequence into a binary sequence.

Note: the DC coefficient is encoded differently from the AC coefficients.

FINAL SYMBOL SEQUENCE
WHAT IS THE EFFECT OF THE
“QUALITY” PARAMETER?

(58k bytes) (21k bytes) (8k bytes)

lower compression higher compression


WHAT IS THE EFFECT OF THE
“QUALITY” PARAMETER? (CONT’D)
JPEG MODES

• JPEG supports several different modes


• Sequential Mode
• Progressive Mode
• Hierarchical Mode
• Lossless Mode

• The default mode is "sequential": the image is encoded in a single scan (left-to-right, top-to-bottom).

(see “Survey” paper)


PROGRESSIVE JPEG
• The image is encoded in multiple scans.
• Produces a quick, roughly decoded image when the transmission time is long.

Sequential

Progressive
JPEG 2000
• Based on the DWT (Discrete Wavelet Transform)
• No need to subdivide images into 8 x 8 blocks
• Hence no blocking artifacts
• Produces better compression than JPEG
• Faster than DCT-based JPEG
WAVELET DECOMPOSITION
MPEG
• Used for video compression
• Moving (Motion) Picture Experts Group
• Based on DCT
MPEG STANDARDS

• MPEG-1: for the storage and retrieval of moving pictures and audio on storage media.

• MPEG-2: for digital television; the timely response for the satellite broadcasting and cable television industries in their transition from analog to digital formats.

• MPEG-4: codes content as objects and enables those objects to be manipulated individually or collectively in an audiovisual scene.

• MPEG-7: used by search engines (a multimedia content description standard).


MPEG ENCODER
FRAME TYPES IN MPEG
• I-frame → Independent (intra-coded) frame
• P-frame → Predicted frame
• B-frame → Bidirectionally predicted frame
Representation & Description
REPRESENTATION

• Image regions (including segments) can be represented by either the border or the pixels of the region. These can be viewed as external or internal characteristics, respectively.
• Chain codes: represent the boundary of a connected region.
REPRESENTATION
CHAIN CODES
REPRESENTATION-CHAIN CODES

• Chain codes can be based on either 4-connectedness or 8-connectedness.
• The first difference of the chain code:
– This difference is obtained by counting the number of direction changes (in a counterclockwise direction).
– For example, the first difference of the 4-direction chain code 10103322 is 3133030.
• Assuming the first difference code represents a closed path, rotation normalization can be achieved by circularly shifting the code so that the list of numbers forms the smallest possible integer.
• Size normalization can be achieved by adjusting the size of the resampling grid.
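The first-difference computation can be sketched directly (the slide's example uses the non-circular difference, which has one fewer digit than the code):

```python
# First difference of a 4-direction chain code: the number of counterclockwise
# direction changes between consecutive digits, taken modulo 4.
def first_difference(code, connectivity=4):
    return [(code[i + 1] - code[i]) % connectivity
            for i in range(len(code) - 1)]

chain = [1, 0, 1, 0, 3, 3, 2, 2]
print(first_difference(chain))  # [3, 1, 3, 3, 0, 3, 0] i.e. 3133030
```

For rotation normalization of a closed boundary one would use the circular variant, appending the difference from the last digit back to the first.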
REPRESENTATION
POLYGONAL APPROXIMATION
• Polygonal approximations: to represent a boundary by straight line
segments, and a closed path becomes a polygon.
• The number of straight line segments used determines the accuracy
of the approximation.
• Only the minimum number of sides necessary to preserve the needed shape information should be used (minimum perimeter polygons).
• A larger number of sides will only add noise to the model.
REPRESENTATION
POLYGONAL APPROXIMATION

• Minimum perimeter polygons (merging and splitting):
– Merging and splitting are often used together to ensure that vertices appear where they would naturally occur in the boundary.
– A least-squares straight-line criterion is used to stop the processing.
PROCEDURE TO PERFORM MPP ALGORITHM
• Step 1: Enclose the given boundary in a sampling grid formed by a set of concatenated cells.
• Step 2: Consider the boundary as a rubber band and allow the rubber band to shrink.
• Step 3: As it shrinks, the rubber band is constrained by the inner and outer walls of the boundary region defined by the cells.
• Step 4: The shape of the polygon produced in the previous step is the minimum perimeter polygon, as shown in the figure.
REPRESENTATION-SIGNATURE

• The idea behind a signature is to convert a two-dimensional boundary into a representative one-dimensional function.
REPRESENTATION-SIGNATURE

• Signatures are invariant to location, but depend on rotation and scaling.
– Starting at the point farthest from the reference
point or using the major axis of the region can be
used to decrease dependence on rotation.
– Scale invariance can be achieved by either scaling
the signature function to fixed amplitude or by
dividing the function values by the standard
deviation of the function.
REPRESENTATION
BOUNDARY SEGMENTS

• Boundary segments: decompose a boundary into segments.
• The convex hull of the region enclosed by the boundary is a powerful tool for robust decomposition of the boundary.
REPRESENTATION
SKELETONS

• Skeletons: produce a one pixel wide graph that has the same
basic shape of the region, like a stick figure of a human. It can
be used to analyze the geometric structure of a region which
has bumps and “arms”.
REPRESENTATION
SKELETONS

• Before applying a thinning algorithm:
– A contour point is any pixel with value 1 having at least one 8-neighbor valued 0.
– Let N(p1) = p2 + p3 + … + p8 + p9
– Let T(p1) be the number of 0→1 transitions in the ordered sequence p2, p3, …, p8, p9, p2.
REPRESENTATION
SKELETONS

• Step 1: Flag a contour point p1 for deletion if the following conditions are satisfied:

(a) 2 ≤ N(p1) ≤ 6
(b) T(p1) = 1
(c) p2 · p4 · p6 = 0
(d) p4 · p6 · p8 = 0
REPRESENTATION
SKELETONS

• Step 2: Flag a contour point p1 for deletion again. Conditions (a) and (b) remain the same, but conditions (c) and (d) are changed to:

(c') p2 · p4 · p8 = 0
(d') p2 · p6 · p8 = 0
REPRESENTATION
SKELETONS

• A thinning algorithm:
– (1) applying step 1 to flag border points for
deletion
– (2) deleting the flagged points
– (3) applying step 2 to flag the remaining border
points for deletion
– (4) deleting the flagged points
– This procedure is applied iteratively until no
further points are deleted.
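The two-step iteration above is the Zhang-Suen thinning algorithm; a direct (unoptimized) sketch, with the p2..p9 neighborhood ordered clockwise from the pixel above p1:

```python
import numpy as np

def thin(img):
    """Two-step thinning: alternately flag and delete contour points
    until a full pass makes no change."""
    img = (img > 0).astype(np.uint8)
    changed = True
    while changed:
        changed = False
        for step in (1, 2):
            flagged = []
            for r in range(1, img.shape[0] - 1):
                for c in range(1, img.shape[1] - 1):
                    if img[r, c] != 1:
                        continue
                    # p2..p9: 8-neighbors, clockwise from the pixel above p1
                    p = [img[r-1, c], img[r-1, c+1], img[r, c+1],
                         img[r+1, c+1], img[r+1, c], img[r+1, c-1],
                         img[r, c-1], img[r-1, c-1]]
                    N = sum(p)                                   # N(p1)
                    seq = p + [p[0]]
                    T = sum(seq[i] == 0 and seq[i+1] == 1 for i in range(8))
                    p2, _, p4, _, p6, _, p8, _ = p
                    if step == 1:
                        cond = p2*p4*p6 == 0 and p4*p6*p8 == 0   # (c), (d)
                    else:
                        cond = p2*p4*p8 == 0 and p2*p6*p8 == 0   # (c'), (d')
                    if 2 <= N <= 6 and T == 1 and cond:
                        flagged.append((r, c))
            for r, c in flagged:                                  # delete pass
                img[r, c] = 0
            changed = changed or bool(flagged)
    return img

blob = np.zeros((9, 14), np.uint8)
blob[2:7, 2:12] = 1          # a thick solid rectangle
skel = thin(blob)            # reduced to a roughly one-pixel-wide stroke
```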
REPRESENTATION
SKELETONS-EXAMPLE

• One application of
skeletonization is for
character recognition.
• A letter or character is
determined by the
center-line of its strokes,
and is unrelated to the
width of the stroke lines.
BOUNDARY DESCRIPTORS

• There are several simple geometric measures that can be useful for describing a boundary.
– The length of a boundary: the number of pixels along the boundary gives a rough approximation of its length.
– Curvature: the rate of change of slope.
• Measuring curvature accurately at a point of a digital boundary is difficult.
• The difference between the slopes of adjacent boundary segments is used as a descriptor of the curvature at the point where the segments intersect.
BOUNDARY DESCRIPTORS
SHAPE NUMBERS

• The shape number of a boundary is defined as the first difference (of the chain code) of smallest magnitude.
• The order n of a shape number is defined as the number of digits in its representation.
BOUNDARY DESCRIPTORS
SHAPE NUMBERS
BOUNDARY DESCRIPTORS
SHAPE NUMBERS
BOUNDARY DESCRIPTORS
FOURIER DESCRIPTORS

• This is a way of using the Fourier transform to analyze the shape of a boundary.
– The x-y coordinates of the boundary are treated as the real
and imaginary parts of a complex number.
– Then the list of coordinates is Fourier transformed using
the DFT (chapter 4).
– The Fourier coefficients are called the Fourier descriptors.
– The basic shape of the region is determined by the first
several coefficients, which represent lower frequencies.
– Higher frequency terms provide information on the fine
detail of the boundary.
BOUNDARY DESCRIPTORS
FOURIER DESCRIPTORS
BOUNDARY DESCRIPTORS
FOURIER DESCRIPTORS

• A boundary can be represented as a sequence of complex numbers

s(k) = x(k) + j y(k),  k = 0, 1, 2, …, K−1

where K is the number of boundary points.

• Fourier descriptors reduce the 2-D boundary to a 1-D problem. The discrete Fourier transform of s(k) is

a(u) = Σ_{k=0}^{K−1} s(k) e^{−j2πuk/K},  u = 0, 1, 2, …, K−1

The complex coefficients a(u) are called the Fourier descriptors of the boundary.

• The inverse Fourier transform restores the boundary:

s(k) = (1/K) Σ_{u=0}^{K−1} a(u) e^{j2πuk/K},  k = 0, 1, 2, …, K−1
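A sketch of these descriptors with `numpy.fft`, keeping only a few low-frequency coefficients (the count kept and the circle test data are illustrative choices):

```python
import numpy as np

# Fourier descriptors: treat boundary points as complex numbers s(k) = x + j*y.
def descriptors(x, y):
    return np.fft.fft(x + 1j * y)

def reconstruct(a, keep):
    """Rebuild the boundary from the `keep` lowest-frequency descriptors
    (split between positive and negative frequencies)."""
    K = len(a)
    kept = np.zeros(K, complex)
    half = keep // 2
    kept[:half + 1] = a[:half + 1]    # DC + low positive frequencies
    if half > 0:
        kept[-half:] = a[-half:]      # low negative frequencies
    s = np.fft.ifft(kept)
    return s.real, s.imag

t = np.linspace(0, 2 * np.pi, 64, endpoint=False)
x, y = np.cos(t), np.sin(t)           # a unit-circle "boundary" of 64 points
a = descriptors(x, y)
xr, yr = reconstruct(a, keep=4)       # a circle needs very few descriptors
```

Keeping more coefficients restores progressively finer boundary detail, matching the slide's point that the first several coefficients determine the basic shape.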
BOUNDARY DESCRIPTORS
STATISTICAL MOMENTS

• Moments are statistical measures of data.


– They come in integer orders.
– Order 0 is just the number of points in the data.
– Order 1 is the sum and is used to find the average.
– Order 2 is related to the variance, and order 3 to the skew
of the data.
– Higher orders can also be used, but don’t have simple
meanings.
BOUNDARY DESCRIPTORS
STATISTICAL MOMENTS

• Let r be a random variable and let g(ri) be normalized (the probability of value ri occurring). Then the moments are

μn(r) = Σ_{i=0}^{K−1} (ri − m)^n g(ri)

where m = Σ_{i=0}^{K−1} ri g(ri)
REGIONAL DESCRIPTORS

• Some simple descriptors:
– The area of a region: the number of pixels in the region
– The perimeter of a region: the length of its
boundary
– The compactness of a region: (perimeter)2/area
– The mean and median of the gray levels
– The minimum and maximum gray-level values
– The number of pixels with values above and below
the mean
Regional Descriptors
Example
REGIONAL DESCRIPTORS
TOPOLOGICAL DESCRIPTORS

Topological property 1:
the number of holes (H)

Topological property 2:
the number of connected
components (C)
REGIONAL DESCRIPTORS
TOPOLOGICAL DESCRIPTORS

Topological property 3:
Euler number: E = C − H, the number of connected components minus the number of holes.

For A: E = C − H = 1 − 1 = 0
For B: E = C − H = 1 − 2 = −1
REGIONAL DESCRIPTORS
TOPOLOGICAL DESCRIPTORS

Topological
property 4:
the largest
connected
component.
REGIONAL DESCRIPTORS
TEXTURE
REGIONAL DESCRIPTORS
TEXTURE

• Texture is usually defined as the smoothness or roughness of a surface.
• In computer vision, it is the visual appearance of the
uniformity or lack of uniformity of brightness and color.
• There are two types of texture: random and regular.
– Random texture cannot be exactly described by words or
equations; it must be described statistically. The surface of
a pile of dirt or rocks of many sizes would be random.
– Regular texture can be described by words or equations or
repeating pattern primitives. Clothes are frequently made
with regularly repeating patterns.
– Random texture is analyzed by statistical methods.
– Regular texture is analyzed by structural or spectral
(Fourier) methods.
REGIONAL DESCRIPTORS
STATISTICAL APPROACHES

• Let z be a random variable denoting gray levels and let p(zi), i = 0, 1, …, L−1, be the corresponding histogram, where L is the number of distinct gray levels.

– The n-th moment of z:

μn(z) = Σ_{i=0}^{L−1} (zi − m)^n p(zi),  where m = Σ_{i=0}^{L−1} zi p(zi)

– The measure R (relative smoothness):

R = 1 − 1 / (1 + σ²(z)),  with σ²(z) = μ2(z)

– The uniformity:

U = Σ_{i=0}^{L−1} p²(zi)

– The average entropy:

e = −Σ_{i=0}^{L−1} p(zi) log2 p(zi)
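The four measures above can be computed directly from a normalized histogram; a minimal sketch (the histograms below are illustrative):

```python
import numpy as np

# Statistical texture measures from a normalized gray-level histogram p(z).
def texture_stats(p, z=None):
    p = np.asarray(p, float)
    z = np.arange(len(p)) if z is None else np.asarray(z, float)
    m = np.sum(z * p)                  # mean gray level
    var = np.sum((z - m) ** 2 * p)     # mu_2(z) = sigma^2(z)
    R = 1 - 1 / (1 + var)              # relative smoothness, 0 for constant
    U = np.sum(p ** 2)                 # uniformity, max 1 for a single level
    nz = p[p > 0]
    e = -np.sum(nz * np.log2(nz))      # average entropy in bits
    return R, U, e

# A constant region: all probability mass on one gray level.
R, U, e = texture_stats([0, 0, 1, 0])  # smooth: R = 0, U = 1, entropy e = 0
```

Rough textures give larger R and e and smaller U, which is how the smooth/coarse/regular samples in the next slide are separated numerically.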
REGIONAL DESCRIPTORS
STATISTICAL APPROACHES

Smooth Coarse Regular


REGIONAL DESCRIPTORS
STRUCTURAL APPROACHES

• Structural concepts:
– Suppose that we have a
rule of the form S→aS,
which indicates that the
symbol S may be
rewritten as aS.
– If a represents a circle [Fig. 11.23(a)] and "circle to the right" is the meaning assigned to a, then the rule S→aS generates strings of the form aaaa… [Fig. 11.23(b)], describing a row of adjacent circles.
REGIONAL DESCRIPTORS
SPECTRAL APPROACHES

• For non-random primitive spatial patterns, the 2-D Fourier transform allows the patterns to be analyzed in terms of spatial frequency components and direction.
• It may be more useful to express the spectrum in polar coordinates, which directly give direction as well as frequency.
• Let S(r, θ) be the spectrum function, where r and θ are the variables in this coordinate system.
– For each direction θ, S(r, θ) may be considered a 1-D function Sθ(r).
– For each frequency r, Sr(θ) is a 1-D function.
– Global descriptions:

S(r) = Σ_{θ=0}^{π} Sθ(r)

S(θ) = Σ_{r=1}^{R} Sr(θ)
REGIONAL DESCRIPTORS
SPECTRAL APPROACHES
REGIONAL DESCRIPTORS
SPECTRAL APPROACHES
REGIONAL DESCRIPTORS
MOMENTS OF TWO DIMENSIONAL FUNCTIONS

• For a 2-D continuous function f(x,y), the moment of order (p + q) is defined as

m_pq = ∫∫ x^p y^q f(x,y) dx dy,  for p, q = 0, 1, 2, …

with both integrals taken from −∞ to ∞.

• The central moments are defined as

μ_pq = ∫∫ (x − x̄)^p (y − ȳ)^q f(x,y) dx dy

where x̄ = m10 / m00 and ȳ = m01 / m00.
REGIONAL DESCRIPTORS
MOMENTS OF TWO DIMENSIONAL FUNCTIONS

• If f(x,y) is a digital image, then

μ_pq = Σx Σy (x − x̄)^p (y − ȳ)^q f(x,y)

• The central moments of order up to 3 are

μ00 = Σx Σy (x − x̄)^0 (y − ȳ)^0 f(x,y) = Σx Σy f(x,y) = m00

μ10 = Σx Σy (x − x̄)^1 (y − ȳ)^0 f(x,y) = m10 − (m10/m00) m00 = 0

μ01 = Σx Σy (x − x̄)^0 (y − ȳ)^1 f(x,y) = m01 − (m01/m00) m00 = 0

μ11 = Σx Σy (x − x̄)^1 (y − ȳ)^1 f(x,y) = m11 − x̄ m01 = m11 − ȳ m10
REGIONAL DESCRIPTORS
MOMENTS OF TWO DIMENSIONAL FUNCTIONS

• The central moments of order up to 3 (continued):

μ20 = Σx Σy (x − x̄)^2 (y − ȳ)^0 f(x,y) = m20 − x̄ m10

μ02 = Σx Σy (x − x̄)^0 (y − ȳ)^2 f(x,y) = m02 − ȳ m01

μ21 = Σx Σy (x − x̄)^2 (y − ȳ)^1 f(x,y) = m21 − 2x̄ m11 − ȳ m20 + 2x̄² m01

μ12 = Σx Σy (x − x̄)^1 (y − ȳ)^2 f(x,y) = m12 − 2ȳ m11 − x̄ m02 + 2ȳ² m10

μ30 = Σx Σy (x − x̄)^3 (y − ȳ)^0 f(x,y) = m30 − 3x̄ m20 + 2x̄² m10

μ03 = Σx Σy (x − x̄)^0 (y − ȳ)^3 f(x,y) = m03 − 3ȳ m02 + 2ȳ² m01
REGIONAL DESCRIPTORS
MOMENTS OF TWO DIMENSIONAL FUNCTIONS

• The normalized central moments are defined as

η_pq = μ_pq / μ00^γ

where γ = (p + q)/2 + 1, for p + q = 2, 3, …
REGIONAL DESCRIPTORS
MOMENTS OF TWO DIMENSIONAL FUNCTIONS

• A set of seven invariant moments can be derived from the second and third normalized central moments:

φ1 = η20 + η02

φ2 = (η20 − η02)² + 4η11²

φ3 = (η30 − 3η12)² + (3η21 − η03)²

φ4 = (η30 + η12)² + (η21 + η03)²

φ5 = (η30 − 3η12)(η30 + η12)[(η30 + η12)² − 3(η21 + η03)²]
   + (3η21 − η03)(η21 + η03)[3(η30 + η12)² − (η21 + η03)²]
REGIONAL DESCRIPTORS
MOMENTS OF TWO DIMENSIONAL FUNCTIONS

6 = ( 20 −  02 )(30 + 12 ) 2 − ( 21 +  03 ) 2 

+ 411 (30 + 12 )( 21 +  03 )


7 = (3 21 −  03 )(30 + 12 )(30 + 12 ) 2 − 3( 21 +  03 ) 2 

+ (312 − 30 )( 21 +  03 ) 3(30 + 12 ) 2 − ( 21 +  03 ) 2 
• This set of moments is invariant to translation,
rotation, and scale change.
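A sketch of the normalized central moments and the first invariant φ1, checked for translation invariance on a made-up test image:

```python
import numpy as np

# Normalized central moment eta_pq of a 2-D image, per the formulas above.
def eta(img, p, q):
    img = np.asarray(img, float)
    y, x = np.indices(img.shape)             # rows ~ y, columns ~ x
    m00 = img.sum()
    xb, yb = (x * img).sum() / m00, (y * img).sum() / m00
    mu = ((x - xb) ** p * (y - yb) ** q * img).sum()
    return mu / m00 ** ((p + q) / 2 + 1)     # gamma = (p + q)/2 + 1

def phi1(img):
    return eta(img, 2, 0) + eta(img, 0, 2)   # first Hu invariant

f = np.zeros((10, 10))
f[2:5, 3:8] = 1                              # a small rectangular object
g = np.zeros((10, 10))
g[5:8, 1:6] = 1                              # the same object, translated

print(abs(phi1(f) - phi1(g)) < 1e-12)        # True: translation-invariant
```

Rotation and scale invariance also hold in the continuous formulation, but on a discrete grid they are only approximate because rotation and resampling change pixel values.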
REGIONAL DESCRIPTORS
MOMENTS OF TWO DIMENSIONAL FUNCTIONS
Object Recognition
Introduction
• The approaches to pattern recognition are divided into two principal areas: decision-theoretic and structural.

• The first category deals with patterns described using quantitative descriptors, such as length, area, and texture.

• The second category deals with patterns best described by qualitative descriptors, such as relational descriptors.
Pattern and Pattern classes

• Patterns and features


• Pattern classes: a pattern class is a family of patterns
that share some common properties
• Pattern recognition: to assign patterns to their
respective classes
• Three common pattern arrangements used in practices
are
– Vectors
– Strings
– Trees
Pattern and Pattern classes
Vector examples
Pattern and Pattern classes
Another Vector examples

• Here is another example of pattern vector generation.
• In this case, we are interested in different types of noisy shapes.
Pattern and Pattern classes
String examples

• String descriptions adequately generate patterns of objects and other entities whose structure is based on relatively simple connectivity of primitives, usually associated with boundary shape.
Pattern and Pattern classes
Tree examples

• Tree descriptions are more powerful than string descriptions.
• Most hierarchical ordering schemes lead to tree structures.
Pattern and Pattern classes
Tree examples
Recognition Based on Matching

Two Methods
Minimum Distance Classifier
Matching based on Correlation

Minimum Distance Classifier
• Suppose that we define the prototype of each pattern class to be the mean vector of the patterns of that class:

m_j = (1/N_j) Σ_{x ∈ ω_j} x,  j = 1, 2, …, W   (1)

• Using the Euclidean distance to determine closeness reduces the problem to computing the distance measures

D_j(x) = ||x − m_j||,  j = 1, 2, …, W   (2)
Minimum Distance Classifier

• Selecting the smallest distance is equivalent to evaluating the functions

d_j(x) = xᵀ m_j − (1/2) m_jᵀ m_j,  j = 1, 2, …, W   (3)

and assigning x to the class whose d_j(x) is largest.

• The decision boundary between classes ω_i and ω_j for a minimum distance classifier is

d_ij(x) = d_i(x) − d_j(x) = xᵀ(m_i − m_j) − (1/2)(m_i − m_j)ᵀ(m_i + m_j) = 0   (4)
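A minimal sketch of the classifier: compute the class means (1) and assign a sample to the nearest mean (2). The 2-D feature vectors below are illustrative toy data:

```python
import numpy as np

# Minimum distance classifier: prototypes are the class mean vectors m_j;
# assign x to the class minimizing the Euclidean distance ||x - m_j||.
def fit_means(X, labels):
    return {c: X[labels == c].mean(axis=0) for c in np.unique(labels)}

def classify(x, means):
    return min(means, key=lambda c: np.linalg.norm(x - means[c]))

X = np.array([[0.0, 0.0], [0.2, 0.1], [0.1, 0.3],      # class 1 samples
              [1.8, 2.0], [2.1, 1.9], [2.0, 2.2]])     # class 2 samples
labels = np.array([1, 1, 1, 2, 2, 2])

means = fit_means(X, labels)
print(classify(np.array([0.1, 0.2]), means))  # 1
print(classify(np.array([1.9, 2.1]), means))  # 2
```

With two classes, the decision boundary (4) is the perpendicular bisector of the segment joining the two means.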
Minimum Distance Classifier
• The figure shows the decision boundary of a minimum distance classifier in a 2-D feature space (x1, x2), separating Class C1 from Class C2.
Minimum Distance Classifier
• Advantages:
1. Intuitive and easy to visualize.
2. Can handle rotation.
3. Can handle intensity variation.
4. Choosing suitable features can solve the mirror problem.
5. Color can be chosen as a feature, solving the color-recognition problem.
Minimum Distance Classifier
• Disadvantages:
1. Collecting samples costs time, and many samples are needed for high accuracy (more samples, more accuracy).
2. Sensitive to displacement.
3. With only two features, the accuracy is lower than that of other methods.
4. Sensitive to scaling.
Matching by Correlation

• We consider correlation as the basis for finding matches of a sub-image w of size J × K within an image f(x,y) of size M × N, where we assume that J ≤ M and K ≤ N:

c(x,y) = Σs Σt f(s,t) w(x + s, y + t)

for x = 0, 1, 2, …, M−1 and y = 0, 1, 2, …, N−1.
Matching by Correlation
• Arrangement for obtaining the correlation of f and w at point (x0, y0): the J × K window w(x0 + s, y0 + t) is positioned inside f(x,y) at offset (x0, y0) from the origin.
Matching by Correlation
• The correlation function has the disadvantage of being sensitive to changes in the amplitude of f and w.
• For example, doubling all values of f doubles the value of c(x,y).
• An approach frequently used to overcome this difficulty is to perform matching via the correlation coefficient:

γ(x,y) = Σs Σt [f(s,t) − f̄(s,t)] [w(x+s, y+t) − w̄] / { Σs Σt [f(s,t) − f̄(s,t)]² · Σs Σt [w(x+s, y+t) − w̄]² }^(1/2)

where w̄ is the average value of w (computed once) and f̄(s,t) is the average of f in the region coincident with the current location of w.

• The correlation coefficient is scaled to the range −1 to 1, independent of scale changes in the amplitude of f and w.
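A direct (unoptimized) sketch of template matching with the correlation coefficient; the template is planted inside a random test image so the peak location is known:

```python
import numpy as np

# Slide template w over image f, computing gamma(x, y) in [-1, 1] at each
# valid offset; the maximum marks the best match.
def corr_coef_map(f, w):
    J, K = w.shape
    M, N = f.shape
    wbar = w - w.mean()                       # template, zero-mean (fixed)
    out = np.full((M - J + 1, N - K + 1), -1.0)
    for x in range(M - J + 1):
        for y in range(N - K + 1):
            patch = f[x:x+J, y:y+K]
            fbar = patch - patch.mean()       # local zero-mean patch
            denom = np.sqrt((fbar**2).sum() * (wbar**2).sum())
            if denom > 0:                     # skip constant patches
                out[x, y] = (fbar * wbar).sum() / denom
    return out

rng = np.random.default_rng(1)
f = rng.random((12, 12))
w = f[4:7, 5:9].copy()                        # plant the template inside f
gamma = corr_coef_map(f, w)
x, y = np.unravel_index(gamma.argmax(), gamma.shape)
print(x, y)                                   # 4 5, with gamma ~ 1 there
```

Because each patch is normalized by its own mean and energy, the peak survives global brightness or contrast changes in f, which plain correlation c(x,y) does not.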
Matching by Correlation
• Advantages:
1. Fast.
2. Convenient.
3. Handles displacement.
• Disadvantages:
1. Sensitive to scaling.
2. Sensitive to rotation.
3. Confused by similar shapes.
4. Sensitive to intensity changes.
5. The mirror problem.
6. Cannot recognize color.
THANK YOU
