
By:

Shubhra Goyal
Definition
Why Compression
Types of compression
Binary image compression scheme
Video compression
Audio compression
Fractal image compression

Compression: the process of coding that will
effectively reduce the total number of bits
needed to represent certain information.


Input data -> Encoder (compression) -> Storage -> Decoder (decompression) -> Output data
Compression is possible due to:
- Redundancy in digital audio, image, and video data (silence removal, spatial redundancy, temporal redundancy).
- Properties of human perception:
  - A compressed version of digital audio, image, or video need not represent the original information exactly.
  - Perceptual sensitivity differs for different signal patterns.
  - The human eye is less sensitive to higher spatial frequency components than to lower frequencies (exploited in transform coding).

Video and audio have much higher storage
requirements than text.
Data transmission rates (in terms of bandwidth
requirements) for sending continuous media
are considerably higher than text.
Efficient compression of audio and video data,
including some compression standards, will be
considered in this lesson.

Data compression is about storing and sending a smaller number of bits.
There are two major categories of methods to compress data: lossless and lossy methods.

Data compression methods:
- Lossless methods: run-length, LZW, CCITT Group 3 1D, CCITT Group 3 2D, CCITT Group 4
- Lossy methods: JPEG, CCITT H.261, fractals, Intel DVI, MPEG
In lossless methods, the original data and the data after compression and decompression are exactly the same.

Redundant data is removed during compression and added back during decompression.

Lossless methods achieve a reduction in size in the range of 1/10 to 1/50 of the original uncompressed size.

Lossless methods are used when we cannot afford to lose any data, e.g. text compression for legal and medical documents and computer programs.

There are five common lossless methods:
- Run-length encoding
- CCITT Group 3 1D
- CCITT Group 3 2D
- CCITT Group 4
- Lempel-Ziv-Welch (LZW) algorithm


Run-length encoding is the simplest method of
compression.
It can be used to compress data made of any
combination of symbols.
The general idea behind this method is to replace
consecutive repeating occurrences of a symbol by one
occurrence of the symbol followed by the number of
occurrences.
The method can be even more efficient if the data uses
only two symbols (for example 0 and 1) in its bit
pattern and one symbol is more frequent than the other.
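As a sketch, the idea in Python (illustrative symbol/count pairs; an actual binary-image coder would pack the run lengths into fixed codewords):

```python
# Minimal run-length encoding sketch: replace each run of a symbol by
# (symbol, number of occurrences), as described above.
def rle_encode(data: str) -> list[tuple[str, int]]:
    runs: list[tuple[str, int]] = []
    for symbol in data:
        if runs and runs[-1][0] == symbol:
            runs[-1] = (symbol, runs[-1][1] + 1)   # extend the current run
        else:
            runs.append((symbol, 1))               # start a new run
    return runs

def rle_decode(runs: list[tuple[str, int]]) -> str:
    return "".join(symbol * count for symbol, count in runs)

print(rle_encode("WWWWBBBWWWWWW"))   # [('W', 4), ('B', 3), ('W', 6)]
```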

This is disadvantageous for a busy image. In a busy image, adjacent pixels or groups of adjacent pixels change rapidly, which leads to short runs of black or white pixels.
In that case it can take more bits to code the run lengths than the original number of bytes in the image.
This effect is called reverse compression or negative compression.
Run-length encoding was designed for black-and-white images only, not for grayscale or color images.



Many facsimile and document imaging file formats
support a form of lossless data compression often
described as CCITT encoding.
The CCITT (International Telegraph and Telephone
Consultative Committee) is a standards organization
that has developed a series of communications
protocols for the facsimile transmission of black-and-white images over telephone lines and data networks.
The CCITT actually defines three algorithms for the
encoding of image data:
Group 3 One-Dimensional (G31D)
Group 3 Two-Dimensional (G32D)
Group 4 (G4)

Huffman encoding is used for encoding the pixel run lengths in CCITT Group 3 and Group 4.
It is a variable-length encoding scheme that generates the shortest codes for frequently occurring run lengths and longer codes for less frequently occurring run lengths.
Algorithm:
1. Make a leaf node for each code symbol.
2. Assign each leaf node the probability of its symbol.
3. Take the two nodes with the smallest probabilities and connect them under a new node.
4. Label the two branches 0 and 1.
5. The probability of the new node is the sum of the probabilities of the two connected nodes.
6. If only one node is left, the code construction is complete; otherwise go back to step 3.
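A minimal sketch of this construction in Python using a priority queue; the symbol probabilities in the example are hypothetical:

```python
import heapq
import itertools

def huffman_codes(probabilities: dict[str, float]) -> dict[str, str]:
    """Build Huffman codes by repeatedly merging the two least probable nodes."""
    counter = itertools.count()   # tie-breaker so heap entries always compare
    # Each heap entry: (probability, tie_breaker, {symbol: code_so_far})
    heap = [(p, next(counter), {sym: ""}) for sym, p in probabilities.items()]
    heapq.heapify(heap)
    while len(heap) > 1:
        p1, _, codes1 = heapq.heappop(heap)   # smallest probability
        p2, _, codes2 = heapq.heappop(heap)   # second smallest
        # Label one branch 0 and the other 1 by prefixing the codes
        merged = {s: "0" + c for s, c in codes1.items()}
        merged.update({s: "1" + c for s, c in codes2.items()})
        heapq.heappush(heap, (p1 + p2, next(counter), merged))
    return heap[0][2]

# Hypothetical run-length probabilities
print(huffman_codes({"A": 0.4, "B": 0.3, "C": 0.2, "D": 0.1}))
```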

Advantages
- it is simple to implement in both hardware
and software.
Disadvantages
- it is one-dimensional as it encodes each row or
line separately.
- it assumes a reliable communication link and does
not provide any protection mechanism.

The CCITT Group 3 2D compression scheme is also known as modified run-length encoding. This scheme is more commonly used for software-based document imaging systems.
While the CCITT Group 3 2D scheme provides fairly good compression, it is easier to implement in software than the CCITT Group 4 standard. The compression ratio averages somewhere between 10 and 25, between Group 3 1D and Group 4.
The compression scheme is based on the statistical nature of images. For example, the image data across adjacent scan lines is normally redundant, since black-to-white transitions tend to occur within plus or minus 3 pixels in the next line as well. Depending upon the scan resolution, one line of text may consist of 20 to 30 scan lines.

Two-dimensional coding:
- Images are divided into several groups of K lines.
- The first line of each group is encoded using the CCITT Group 3 1D method.
- The rest of the lines are encoded using "differential" schemes.
- Typical compression ratio: 10 to 20.
- The K-factor allows more error-free transmission.
- It is the worldwide facsimile standard.
- The 2D scheme uses a combination of additional codes called the vertical code, pass code, and horizontal code.
- There is only one pass code (0001) and one horizontal code (001).
- If neither the vertical code nor the pass code can be applied, then the horizontal code is applied.
- Horizontal code + Group 3 1D code = 001 + makeup code + terminating code.

Parse the coding line and look for a change in pixel value. The change is found at location a1 (a1 indicates where the pixel changes, e.g. from binary 0 to binary 1).
Parse the reference line and look for a change in pixel value. The change is found at location b1.
Find the difference between the locations b1 and a1: delta = b1 - a1.
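An illustrative sketch (not the actual bit-level coder) of locating the changing elements and computing the difference; in the real scheme, vertical mode is usable when the difference is within +/-3:

```python
# Illustrative sketch: find the first changing element in a line of pixels
# (0 = white, 1 = black), starting the search after a given position.
def first_change(line: list[int], start: int) -> int:
    """Index of the first pixel after `start` whose value differs from the
    pixel at `start`; len(line) if there is none."""
    for i in range(start + 1, len(line)):
        if line[i] != line[start]:
            return i
    return len(line)

reference = [0, 0, 0, 1, 1, 1, 0, 0]   # previous (reference) scan line
coding    = [0, 0, 1, 1, 1, 0, 0, 0]   # current (coding) scan line

a1 = first_change(coding, 0)       # first change on the coding line
b1 = first_change(reference, 0)    # first change on the reference line
delta = b1 - a1
print(a1, b1, delta)               # vertical mode applies when abs(delta) <= 3
```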
Advantages
- the implementation of the K factor allows
error-free transmission.
- it is a worldwide facsimile standard.
- due to its 2-dimensional nature, the compression
ratios achieved with this scheme are better than
CCITT Group 3 1D.
Disadvantages
- it does not provide a dense compression
- it is complex and relatively difficult to implement
in software.

The compression ratio of Group 3 was not sufficient for serious, high-resolution document imaging.
Group 4 is a 2D coding scheme without the K-factor; in effect, the K-factor is the entire page of lines.
Here, the first reference line is an imaginary all-white line above the top of the image.
The first group of pixels is encoded using the imaginary white line as the reference line.
The newly coded line becomes the reference line for the next scan line.
Each successive line is coded relative to the previous line.
This provides a very high level of compression.

There are no EOL markers before the start of
the compressed data.

Fillers are not used for the scan lines.
There is an EOP (End-Of-Page) mark consisting of:
- two concatenated EOLs
- padding bits added immediately after the end of the compressed data

Advantages:
- Better resolution
Disadvantages:
- Slow
- Complex
- As each line depends on the previously coded line as its reference, a single
error can result in the rest of the page
being skewed.

LZW is a dictionary-based encoding algorithm.
It creates a dictionary (a table) of strings used during
the communication session.
If both the sender and the receiver have a copy of the
dictionary, then previously-encountered strings can be
substituted by their index in the dictionary to reduce
the amount of information transmitted.
In this phase there are two concurrent events:
- building an indexed dictionary and
- compressing a string of symbols.
The algorithm extracts the smallest substring that
cannot be found in the dictionary from the remaining
uncompressed string.
It then stores a copy of this substring in the dictionary
as a new entry and assigns it an index value.
Compression occurs when the substring, except for the
last character, is replaced with the index found in the
dictionary.
The process then inserts the index and the last
character of the substring into the compressed string.
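A sketch that follows the description above, emitting (index, last character) pairs; strictly this is the LZ78 flavor of the Lempel-Ziv family, while classic LZW pre-loads the dictionary with all single characters and emits indices only:

```python
def lz78_compress(text: str) -> list[tuple[int, str]]:
    """Emit (index of known prefix, new last character) pairs; index 0 = empty prefix."""
    dictionary: dict[str, int] = {}          # phrase -> index (starting at 1)
    result: list[tuple[int, str]] = []
    phrase = ""
    for ch in text:
        if phrase + ch in dictionary:
            phrase += ch                     # keep growing the current substring
        else:
            result.append((dictionary.get(phrase, 0), ch))
            dictionary[phrase + ch] = len(dictionary) + 1   # new dictionary entry
            phrase = ""
    if phrase:                               # flush a trailing phrase, if any
        result.append((dictionary.get(phrase, 0), ""))
    return result

print(lz78_compress("ABABABA"))   # [(0, 'A'), (0, 'B'), (1, 'B'), (3, 'A')]
```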

Decompression is the inverse of the compression
process.
The process extracts the substrings from the
compressed string and tries to replace the indexes with
the corresponding entry in the dictionary, which is
empty at first and built up gradually.
The idea is that when an index is received, there is
already an entry in the dictionary corresponding to that
index.
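A matching sketch of the inverse process, rebuilding the dictionary while decoding:

```python
def lz78_decompress(pairs: list[tuple[int, str]]) -> str:
    """Rebuild the dictionary gradually: each pair is a known prefix plus one character."""
    dictionary: dict[int, str] = {0: ""}     # index -> phrase
    output: list[str] = []
    for index, ch in pairs:
        phrase = dictionary[index] + ch      # known prefix + last character
        output.append(phrase)
        dictionary[len(dictionary)] = phrase # add the phrase as a new entry
    return "".join(output)

print(lz78_decompress([(0, 'A'), (0, 'B'), (1, 'B'), (3, 'A')]))   # "ABABABA"
```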

Lossy methods are used for compressing images and video files (our eyes cannot distinguish subtle changes, so some loss of data is acceptable).
These methods are cheaper, requiring less time and space.
Several methods:
- JPEG: compresses pictures and graphics
- MPEG: compresses video
- MP3: compresses audio

Color characteristics:
- Luminance: the measure of the light emitted or reflected by an object.
- Hue: the color sensation produced in an observer due to the presence of certain wavelengths of light.
- Saturation: the depth of a color (e.g. the difference between red and pink).

A color model is an orderly system for creating a whole range of colors from a small set of primary colors.
There are two types of color models:
- Subtractive
- Additive
Additive color models use light to display color, while subtractive models use printing inks.
Colors perceived in additive models are the result of transmitted light, the typical technique on color displays.
Colors perceived in subtractive models are the result of reflected light, the typical technique in printers and plotters.
CMYK model:
- The Cyan, Magenta, Yellow and Black (CMYK) model is used in color printing devices.
- It is a subtractive color model.
HSI model (HSB model):
- The Hue, Saturation and Intensity model represents tint, shade and tone.
- This model is used in image processing for filtering and smoothing images.
- It requires a high level of computation.


YUV model:
- A 3-D model that separates luminance from chrominance.
- Y is the luminance component (the black-and-white information).
- U and V are the chrominance (color) components: U corresponds to a red-cyan axis and V to a magenta-green axis.
- Used in full-motion video.
RGB model:
- This model is additive in nature: intensities of Red, Green and Blue are added to generate various colors.
- Used in the design of image capture devices, television, and color monitors.

No color model is better than another; the choice depends on the application.

Color component conversion (RGB to YUV):
Y = 0.299R + 0.587G + 0.114B
U = 0.596R - 0.274G - 0.322B
V = 0.211R - 0.523G + 0.312B

Color component conversion (YUV to RGB):
R = 1.0Y + 0.956U + 0.621V
G = 1.0Y - 0.272U - 0.647V
B = 1.0Y - 1.106U + 1.703V
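A per-pixel sketch of the forward conversion using the coefficients above:

```python
def rgb_to_yuv(r: float, g: float, b: float) -> tuple[float, float, float]:
    """Convert one RGB pixel to Y, U, V using the coefficients listed above."""
    y = 0.299 * r + 0.587 * g + 0.114 * b
    u = 0.596 * r - 0.274 * g - 0.322 * b
    v = 0.211 * r - 0.523 * g + 0.312 * b
    return y, u, v

print(rgb_to_yuv(255, 0, 0))   # pure red: luminance ~76, strong chrominance
```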
JPEG standard is a collaboration among :
International Telecommunication Union (ITU)
International Organization for Standardization
(ISO)
International Electrotechnical Commission
(IEC)
The official names of JPEG :
Joint Photographic Experts Group
ISO/IEC 10918-1 Digital compression and coding
of continuous-tone still images
ITU-T Recommendation T.81

It should address image quality where visual
fidelity is very high and an encoder can be
parameterized to allow the user to set the
compression or the quality level.
Should compress any kind of continuous-tone
digital source image and is not restricted by
dimensions, color, aspect ratios etc.
Should be scalable from completely lossless
to lossy.
Four operation modes:
- Sequential encoding - each component is encoded in a left-to-right, top-to-bottom scan.
- Progressive encoding - the image is decompressed so that a coarser image is displayed first and filled in with more components as it is decompressed to a finer version of the image.
- Hierarchical encoding - the image is compressed to multiple resolution levels.
- Lossless encoding - the image is guaranteed to provide full detail at the selected resolution when decompressed.
A codec is a device or computer program
capable of encoding and decoding a digital
stream of data or signal.
Codecs differ within an operation mode according to the precision of the source image they can handle or the entropy coding method they use.

The JPEG standard has three levels of definition:
- Baseline system - it decompresses color images, maintains a high compression ratio, and handles from 4 bits/pixel to 16 bits/pixel. It ensures software implementations are cost-effective.
- Extended system - it covers various encoding aspects such as variable-length encoding, progressive encoding and the hierarchical mode of encoding.
- Special lossless function - it ensures there is no loss of detail in the compression and decompression process, although there may be some loss in the scanning process.
Baseline sequential codec - consists of three steps: formation of DCT coefficients, quantization, and entropy encoding. It is a rich compression scheme.
DCT progressive mode - the key steps of DCT coefficient formation and quantization are the same as above, with the difference that each component is coded in multiple scans instead of a single scan.
Predictive lossless encoding - defines a means of approaching lossless continuous-tone compression. A predictor combines sample areas and predicts neighboring areas.
Hierarchical mode - provides a means of carrying multiple resolutions. Each successive encoding is reduced by a factor of two, in either the horizontal or the vertical dimension.
The main steps in JPEG encoding are the following:
Transform RGB to YUV or YIQ and subsample color
DCT on 8x8 image blocks
Quantization
Zig-zag ordering and run-length encoding
Entropy coding

The image is divided up into 8x8 blocks
2D DCT is performed on each block
The DCT is performed independently for each
block
This is why, when a high degree of compression is
requested, JPEG gives a blocky image result

Forward DCT:
F(u, v) = (1/4) C(u) C(v) Σ_{x=0..7} Σ_{y=0..7} f(x, y) cos[(2x+1)uπ/16] cos[(2y+1)vπ/16],
for u = 0, ..., 7 and v = 0, ..., 7

Inverse DCT:
f(x, y) = (1/4) Σ_{u=0..7} Σ_{v=0..7} C(u) C(v) F(u, v) cos[(2x+1)uπ/16] cos[(2y+1)vπ/16],
for x = 0, ..., 7 and y = 0, ..., 7

where C(k) = 1/√2 for k = 0 and C(k) = 1 otherwise.
[Figure: the 8x8 grid of DCT coefficients, indexed by u (0-7, horizontal) and v (0-7, vertical); the luminance component Y of a W x H image is divided into 8x8 blocks of luminance values.]

An example 8x8 block of luminance values:
48 39 40 68 60 38 50 121
149 82 79 101 113 106 27 62
58 63 77 69 124 107 74 125
80 97 74 54 59 71 91 66
18 34 33 46 64 61 32 37
149 108 80 106 116 61 73 92
211 233 159 88 107 158 161 109
212 104 40 44 71 136 113 66
After the forward DCT, the coefficients F(u, v) are:
699.25 43.18 55.25 72.11 24.00 -25.51 11.21 -4.14
-129.78 -71.50 -70.26 -73.35 59.43 -24.02 22.61 -2.05
85.71 30.32 61.78 44.87 14.84 17.35 15.51 -13.19
-40.81 10.17 -17.53 -55.81 30.50 -2.28 -21.00 -1.26
-157.50 -49.39 13.27 -1.78 -8.75 22.47 -8.47 -9.23
92.49 -9.03 45.72 -48.13 -58.51 -9.01 -28.54 10.38
-53.09 -62.97 -3.49 -19.62 56.09 -2.25 -3.28 11.91
-20.54 -55.90 -20.59 -18.19 -26.58 -27.07 8.47 0.31
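A direct sketch of the 8x8 forward DCT from the formula above (O(n^4) and for clarity only; real codecs use fast transforms or library routines):

```python
import math

def dct_8x8(block: list[list[float]]) -> list[list[float]]:
    """Forward 8x8 DCT, computed directly from the formula above."""
    def C(k: int) -> float:
        return 1 / math.sqrt(2) if k == 0 else 1.0

    F = [[0.0] * 8 for _ in range(8)]
    for u in range(8):
        for v in range(8):
            s = 0.0
            for x in range(8):          # x indexes rows, y indexes columns
                for y in range(8):
                    s += (block[x][y]
                          * math.cos((2 * x + 1) * u * math.pi / 16)
                          * math.cos((2 * y + 1) * v * math.pi / 16))
            F[u][v] = 0.25 * C(u) * C(v) * s
    return F
```

Applied to the luminance block above, F[0][0] evaluates to 699.25, matching the first entry of the coefficient table.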
Quantization in JPEG aims at reducing the
total number of bits in the compressed image
Divide each entry in the frequency space block
by an integer, then round
Use a quantization matrix Q(u, v)

Use larger entries in Q for the higher spatial
frequencies
These are entries to the lower right part of the
matrix
The following slide shows the default Q(u, v)
values for luminance and chrominance
Based on psychophysical studies intended to maximize
compression ratios while minimizing perceptual
distortion
Since after division the entries are smaller, we can use
fewer bits to encode them

The 8x8 DCT coefficients F(u, v):

699.25 43.18 55.25 72.11 24.00 -25.51 11.21 -4.14
-129.78 -71.50 -70.26 -73.35 59.43 -24.02 22.61 -2.05
85.71 30.32 61.78 44.87 14.84 17.35 15.51 -13.19
-40.81 10.17 -17.53 -55.81 30.50 -2.28 -21.00 -1.26
-157.50 -49.39 13.27 -1.78 -8.75 22.47 -8.47 -9.23
92.49 -9.03 45.72 -48.13 -58.51 -9.01 -28.54 10.38
-53.09 -62.97 -3.49 -19.62 56.09 -2.25 -3.28 11.91
-20.54 -55.90 -20.59 -18.19 -26.58 -27.07 8.47 0.31

The quantization matrix Q(u, v) (default luminance table):

16 11 10 16 24 40 51 61
12 12 14 19 26 58 60 55
14 13 16 24 40 57 69 56
14 17 22 29 51 87 80 62
18 22 37 56 68 109 103 77
24 35 55 64 81 104 113 92
49 64 78 87 103 121 120 101
72 92 95 98 112 100 103 99

The element-wise ratio F(u, v) / Q(u, v):

43.70 3.93 5.52 4.51 1.00 -0.64 0.22 -0.07
-10.82 -5.96 -5.02 -3.86 2.29 -0.41 0.38 -0.04
6.12 2.33 3.86 1.87 0.37 0.30 0.22 -0.24
-2.91 0.60 -0.80 -1.92 0.60 -0.03 -0.26 -0.02
-8.75 -2.25 0.36 -0.03 -0.13 0.21 -0.08 -0.12
3.85 -0.26 0.83 -0.75 -0.72 -0.09 -0.25 0.11
-1.08 -0.98 -0.04 -0.23 0.54 -0.02 -0.03 0.12
-0.29 -0.61 -0.22 -0.19 -0.24 -0.27 0.08 0.00

Rounding gives the quantized coefficients F_q(u, v) = Round(F(u, v) / Q(u, v)):

44 4 6 5 1 -1 0 0
-11 -6 -5 -4 2 0 0 0
6 2 4 2 0 0 0 0
-3 1 -1 -2 1 0 0 0
-9 -2 0 0 0 0 0 0
4 0 1 -1 -1 0 0 0
-1 -1 0 0 1 0 0 0
0 -1 0 0 0 0 0 0
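A minimal sketch of the quantization step (and the decoder's dequantization) as element-wise operations:

```python
def quantize(F: list[list[float]], Q: list[list[int]]) -> list[list[int]]:
    """Element-wise Fq(u, v) = round(F(u, v) / Q(u, v)).
    Note: Python's round() rounds halves to even; codecs typically round
    half away from zero, but the difference rarely matters here."""
    return [[round(F[u][v] / Q[u][v]) for v in range(8)] for u in range(8)]

def dequantize(Fq: list[list[int]], Q: list[list[int]]) -> list[list[int]]:
    """Decoder-side approximation: F'(u, v) = Fq(u, v) * Q(u, v)."""
    return [[Fq[u][v] * Q[u][v] for v in range(8)] for u in range(8)]
```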
Zig-zag scan order over the 8x8 block (index 0 is the DC coefficient):

0 1 5 6 14 15 27 28
2 4 7 13 16 26 29 42
3 8 12 17 25 30 41 43
9 11 18 24 31 40 44 53
10 19 23 32 39 45 52 54
20 22 33 38 46 51 55 60
21 34 37 47 50 56 59 61
35 36 48 49 57 58 62 63

The zig-zag sequence starts at the DC coefficient. It is designed to facilitate entropy coding by placing the low-frequency coefficients (which are more likely to be non-zero) before the high-frequency coefficients.


The quantized coefficients F_q(u, v):

44 4 6 5 1 -1 0 0
-11 -6 -5 -4 2 0 0 0
6 2 4 2 0 0 0 0
-3 1 -1 -2 1 0 0 0
-9 -2 0 0 0 0 0 0
4 0 1 -1 -1 0 0 0
-1 -1 0 0 1 0 0 0
0 -1 0 0 0 0 0 0

Zig-zag reordering (one diagonal per line):
44,
4,-11,
6,-6,6,
5,-5,2,-3,
-9,1,4,-4,1,
-1,2,2,-1,-2,4,
-1,0,0,-2,0,0,0,
0,0,0,1,0,1,-1,0,
-1,0,-1,0,0,0,0,
0,0,0,-1,0,0,
0,1,0,0,0,
0,0,0,0,
0,0,0,
0,0,
0
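A sketch that generates the zig-zag scan order by walking the anti-diagonals of the block:

```python
def zigzag_order(n: int = 8) -> list[tuple[int, int]]:
    """(row, col) pairs in zig-zag order: walk the anti-diagonals, alternating
    direction, starting at the DC position (0, 0)."""
    order: list[tuple[int, int]] = []
    for s in range(2 * n - 1):                       # s = row + col
        diagonal = [(r, s - r) for r in range(n) if 0 <= s - r < n]
        order.extend(diagonal if s % 2 else reversed(diagonal))
    return order

def zigzag_scan(block: list[list[int]]) -> list[int]:
    return [block[r][c] for r, c in zigzag_order(len(block))]
```

Applied to the quantized block above, this yields 44, 4, -11, 6, -6, 6, 5, -5, 2, -3, ... as in the listing.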

Entropy is a term used in thermodynamics for the study of heat and work. In data compression, it is a measure of the information content of a message in bits.

Entropy in number of bits = -log2(probability of the object)

The object can be a character. For example, if the probability of the character T appearing in a string is 1/8, its entropy is 3 bits; if there are 7 Ts in the string, those occurrences can be represented by 21 bits.
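A quick check of that arithmetic in Python:

```python
import math

def information_bits(probability: float) -> float:
    """Information content of one occurrence, in bits: -log2(p)."""
    return -math.log2(probability)

bits_per_T = information_bits(1 / 8)   # 3.0 bits
print(bits_per_T, 7 * bits_per_T)      # 3.0 21.0
```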
JPEG uses two entropy coding schemes: Huffman coding and arithmetic coding.
Huffman coding requires one or more sets of Huffman code tables for coding as well as decoding.
Both schemes operate on the DC and AC coefficients: the coefficient at position (0, 0) in the matrix is called the DC coefficient, and the other 63 are called AC coefficients.
