
Image Compression

Digital images are very large and hence occupy considerable
storage space. Because of their size, they also consume more
bandwidth and take longer to upload or download over the
Internet.
This makes them inconvenient to store and share.
To combat this problem, images are compressed using special
techniques.
Compression not only saves storage space but also makes file
sharing easier. Image compression applications reduce the size of
an image file without causing major degradation to the quality of
the image.

Image Compression

Digital images require huge amounts of space for storage and
large bandwidths for transmission.

A 640 x 480 color image requires close to 1 MB of space.

The goal of image compression is to reduce the amount of data
required to represent a digital image.

Reduce storage requirements and increase transmission rates.

Data vs. Information

Data and information are not synonymous terms!

Data is the means by which information is conveyed.

Data compression aims to reduce the amount of data required to
represent a given quantity of information while preserving as
much information as possible.

Image Compression

Aims to reduce the amount of data required to represent images
while preserving as much information as possible.

Saves storage space.

Increases access speed (transmission rates).

Applications

Medical imaging

Satellite imagery (photographs of the Earth from orbit, weather
forecasting, and earth-resource monitoring)

Digital radiography

Legal aspects

Video conferencing (as a person is talking, we want to send the
full image: the rate at which information is sent must be quite
fast)

Controlling remote vehicles

Fax (fewer bits are transmitted, so transmission is more
efficient)

Entertainment

Document imaging (scanning printed documents for storage in a
computer)

Gray and Color Image Compression

Gray Image Compression


A digital grayscale image is represented by 8 bits per pixel (bpp) in its
uncompressed form. Each pixel has a value ranging from 0 (black) to 255
(white).

Color Image Compression


A digital color image is represented by 24 bits per pixel in its
uncompressed form. Each pixel contains a value representing a red (R),
green (G), and blue (B) component scaled between 0 and 255. This format
is known as the RGB format.

Image Compression Characteristics

Compression Ratio

Image Quality

Compression Speed

Peak Signal to Noise Ratio

Compression Ratio

$$C.R. = \frac{\text{size of original image}}{\text{size of compressed image}}$$

Image Quality

$$bpp = \frac{\text{number of bits in compressed image}}{\text{number of pixels}}$$

Compression Speed
The amount of time required to compress and decompress an image.

PSNR

$$PSNR = 10 \log_{10} \frac{MAX^2}{MSE}, \qquad
MSE = \frac{1}{MN} \sum_{m=1}^{M} \sum_{n=1}^{N} \big(X(m,n) - Y(m,n)\big)^2$$

Where

X = original image data
Y = compressed image data
MAX = maximum value that a pixel can have
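These two metrics are easy to compute directly. Below is a
minimal Python/NumPy sketch (the function names are mine, for
illustration, not from any particular library):

```python
import numpy as np

def compression_ratio(original_bits, compressed_bits):
    """C.R. = size of original image / size of compressed image."""
    return original_bits / compressed_bits

def psnr(x, y, max_val=255.0):
    """PSNR = 10*log10(MAX^2 / MSE), with MSE averaged over all M*N pixels."""
    x = np.asarray(x, dtype=np.float64)
    y = np.asarray(y, dtype=np.float64)
    mse = np.mean((x - y) ** 2)
    if mse == 0.0:
        return float("inf")  # images are identical
    return 10.0 * np.log10(max_val**2 / mse)
```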

Principles Behind Compression

The number of bits actually required to represent an image may be
less because of redundancy.
The principle behind compression is to remove this redundancy.
Three types of redundancy in digital images:

Spatial redundancy: due to correlation between neighbouring pixel values.

Spectral redundancy: due to correlation between different color planes.

Temporal redundancy: due to correlation between adjacent frames
in a sequence of images.

Data Redundancy

The relative data redundancy is defined as:

$$R_D = 1 - \frac{1}{C_R}$$

where $C_R = n_1 / n_2$ is the compression ratio,

n1 denotes the number of bits in the original data set and
n2 denotes the number of bits in the compressed data set.

A compression ratio C = 10 [10:1] means that, on average, every
10 bits of the original data are represented by 1 bit using some
coding technique.
Then the redundancy is R = 1 - 1/10 = 9/10 = 0.9.
That means 90% of the data was redundant.

Types of Data Redundancy

Coding redundancy
Interpixel (spatial, and temporal for video) redundancy
Psychovisual redundancy

Compression attempts to reduce one or more of these redundancies.

Coding Redundancy

Code: a list of symbols (letters, numbers, bits, etc.)

Code word: a sequence of symbols used to represent a piece of
information or an event (e.g., gray levels).

Code word length: the number of symbols in each code word.

Example: l(rk) = constant length (a fixed-length code).

Example: l(rk) = variable length, based on the probabilities of
the gray levels (worked out in the Huffman example below).

Interpixel redundancy

Interpixel redundancy implies that any pixel value can be
reasonably predicted by its neighbors (i.e., pixels are
correlated). The correlation between samples can be measured with
the correlation integral

$$f(x) \circ g(x) = \int f(a)\, g(x+a)\, da$$

Interpixel redundancy

Example: a run of neighbouring pixel gray levels (1 byte each to
store):

  123  120  121  124  126  128  127

Storing the first pixel and then the differences of adjacent
pixels requires far fewer bits:

  123,  -3  1  3  2  2  -1

Run-length coding (for binary images):

  1111110000111110000000111111000
  (1,6) (0,4) (1,5) (0,7) (1,6) (0,3)

Since the runs alternate, there is no need to store the symbols
separately: storing just the run lengths 6 4 5 7 6 3 (starting
with 1, then 0, alternating) suffices. This works well for
document images, e.g., a letter on paper with sparse data.
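A run-length coder for binary images takes only a few lines. The
sketch below (my own minimal version, assuming the input is a
Python string of '0'/'1' characters) reproduces the pairs shown
above:

```python
def rle_encode(bits):
    """Encode a binary string as (symbol, run length) pairs."""
    runs, i = [], 0
    while i < len(bits):
        j = i
        while j < len(bits) and bits[j] == bits[i]:
            j += 1                      # extend the current run
        runs.append((bits[i], j - i))
        i = j
    return runs

def rle_decode(runs):
    return "".join(sym * length for sym, length in runs)

bits = "1111110000111110000000111111000"
print(rle_encode(bits))  # [('1', 6), ('0', 4), ('1', 5), ('0', 7), ('1', 6), ('0', 3)]
assert rle_decode(rle_encode(bits)) == bits
```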

Psycho-visual Redundancy

In this case compression is lossy: the compressed image is
numerically degraded relative to the original image.

Uses quantization.

The human visual system is more sensitive to edges.

DCT and predictive transforms are used with a quantizer.

Uniform quantization from 256 to 16 gray levels (8 bits down to
4 bits per pixel) gives C.R. = 2.
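As a sketch, the uniform quantization mentioned above (256 down
to 16 gray levels) can be written as follows; this is a minimal
illustration of the idea, not a full codec:

```python
import numpy as np

def uniform_quantize(img, levels=16):
    """Uniformly quantize an 8-bit image (256 levels) down to `levels` gray levels.

    Each pixel is mapped to the midpoint of its bin; only log2(levels) bits of
    information remain per pixel, e.g. 4 bits for 16 levels -> C.R. = 8/4 = 2.
    """
    step = 256 // levels
    img = np.asarray(img, dtype=np.uint16)
    return ((img // step) * step + step // 2).astype(np.uint8)
```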

Compression Model

ENCODER
DECODER

The source encoder is responsible for removing redundancy (coding,


inter-pixel, psycho-visual)

The channel encoder ensures robustness against channel noise.

Compression Techniques

Lossless Compression

The reconstructed image after compression is numerically
identical to the original image.

Achieves only a modest amount of compression.

Lossy Compression

The reconstructed image contains degradation relative to the
original image because the compression scheme discards some of
the information.

Achieves much higher compression.

Can be visually lossless.

Lossless (Error-Free) Compression

Some applications require no error in compression (medical
images, business documents, etc.).

CR = 2 to 10 can be expected.

Makes use of coding redundancy and inter-pixel redundancy.

Examples: Huffman codes, LZW, arithmetic coding, run-length
coding, lossless predictive coding and bit-plane coding.

Huffman Coding

The most popular technique for removing coding redundancy is due
to Huffman (1952).

Huffman coding yields the smallest possible number of code
symbols per source symbol.

The resulting code is optimal when symbols are coded one at a
time; this is also known as block coding.

The basic aim is to bring the average code length per symbol as
close as possible to the source entropy.

Huffman Coding: Example

l(rk) = variable length. Consider the probabilities of the gray
levels:

Symbol   Probability   Code word
S0       0.4           00
S1       0.2           10
S2       0.2           11
S3       0.1           010
S4       0.1           011

$$L_{avg} = \sum_{k=0}^{L-1} l(r_k)\, p_r(r_k)
= 0.4 \times 2 + 0.2 \times 2 + 0.2 \times 2 + 0.1 \times 3 + 0.1 \times 3
= 2.2 \text{ bits}$$

$$H(s) = -\sum_{k=0}^{L-1} p_k \log_2(p_k)$$

Lavg = average length per symbol
pr(rk) = probability of occurrence of symbol rk
l(rk) = length of the code word for rk
H(s) = entropy

Average information generated per pixel = 2.2 bits/symbol.
C.R. = 3/2.2 = 1.36 (a fixed-length code for 5 symbols needs 3 bits).
RD = 1 - 1/C.R. = 26.67%, i.e., 26.67% of the data is redundant.

Efficiency:

$$\eta = \frac{H(s)}{L_{avg}}$$
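The Huffman construction itself is a greedy merge of the two
least probable symbols. A minimal sketch using Python's heapq;
the exact 0/1 labelling (and hence the codewords) may differ from
the table above, but the average length is still 2.2 bits:

```python
import heapq

def huffman_code(probs):
    """Build a Huffman code for a {symbol: probability} map."""
    # Heap entries: (probability, tie-breaker, {symbol: partial codeword}).
    heap = [(p, i, {s: ""}) for i, (s, p) in enumerate(probs.items())]
    heapq.heapify(heap)
    tie = len(heap)
    while len(heap) > 1:
        p1, _, c1 = heapq.heappop(heap)     # two least probable groups
        p2, _, c2 = heapq.heappop(heap)
        merged = {s: "0" + w for s, w in c1.items()}
        merged.update({s: "1" + w for s, w in c2.items()})
        heapq.heappush(heap, (p1 + p2, tie, merged))
        tie += 1
    return heap[0][2]

probs = {"S0": 0.4, "S1": 0.2, "S2": 0.2, "S3": 0.1, "S4": 0.1}
code = huffman_code(probs)
print(code)
print(sum(len(code[s]) * p for s, p in probs.items()))  # 2.2 bits/symbol
```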

EXAMPLE: CODING REDUNDANCY

rk     pr(rk)   Code 1     l1(rk)   Code 2   l2(rk)
r87    0.25     01010111   8        01       2
r128   0.47     10000000   8        1        1
r186   0.25     11000100   8        000      3
r255   0.03     11111111   8        001      3

rk     - given intensity values
pr(rk) - probability of each intensity value

$$L_{avg} = \sum_{k=0}^{L-1} l(r_k)\, p_r(r_k)$$

For Code 2: 0.25(2) + 0.47(1) + 0.25(3) + 0.03(3) = 1.81 bits.
For Code 1: 0.25(8) + 0.47(8) + 0.25(8) + 0.03(8) = 8 bits.

The total number of bits needed to represent the entire 256 x 256
image is M x N x Lavg = 256 x 256 x 1.81 = 118,621.

Compression ratio: C = (256 x 256 x 8) / (256 x 256 x 1.81) = 4.42,
so R = 1 - 1/C = 0.774.

Thus 77.4% of the data in the original 8-bit 2-D intensity array
is redundant.


Arithmetic Coding

It does not generate an individual code for each character.

There is no one-to-one correspondence between source symbols and
code words.

A single code word is used for an entire sequence of symbols.

The code defines an interval of real numbers between 0 and 1.

As the number of symbols in the message increases, the interval
used to represent it becomes smaller, and the number of
information units required to represent the interval becomes
larger.

Unlike Huffman coding, it does not encode source symbols one at a
time.

Slower than Huffman coding but typically achieves better
compression.

Performs well for sequences with low entropy, where Huffman codes
lose their efficiency.

It is complex but optimal.

Arithmetic Coding

Source symbol   Probability   Initial sub-range
a1              0.2           [0.0, 0.2)
a2              0.2           [0.2, 0.4)
a3              0.4           [0.4, 0.8)
a4              0.2           [0.8, 1.0)

Arithmetic encoding
If the first symbol is a1, the tag will lie in [0.0, 0.2) and the
rest of the unit interval is discarded; this subinterval is then
divided in the same proportions as the original one. If the
second symbol in the sequence is a2, the tag value is restricted
to lie in the interval [0.04, 0.08) (tag intervals for different
sequences are always disjoint from each other). We then partition
this interval again in the same proportions as the original one,
and so on.

Arithmetic decoding
1) 0.068 ∈ [0.0, 0.2) => a1;  (0.068 - 0.0)/(0.2 - 0.0) = 0.34
2) 0.34  ∈ [0.2, 0.4) => a2;  (0.34 - 0.2)/(0.4 - 0.2) = 0.7
3) 0.7   ∈ [0.4, 0.8) => a3;  (0.7 - 0.4)/(0.8 - 0.4) = 0.75
4) 0.75  ∈ [0.4, 0.8) => a3;  (0.75 - 0.4)/(0.8 - 0.4) = 0.875
5) 0.875 ∈ [0.8, 1.0) => a4

Arithmetic decoding
The decoded sequence: a1 a2 a3 a3 a4.
The final interval for the string a1 a2 a3 a3 a4 is
[0.06752, 0.0688), so any tag inside it (e.g., 0.068) encodes the
string.
Drawbacks of arithmetic coding:
- finite arithmetic precision is a big issue
- an end-of-message flag is needed
Alternative solutions: re-normalization and rounding.
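The interval-narrowing arithmetic above is mechanical enough to
sketch directly. The toy encoder/decoder below (floating point
only, so it inherits the precision drawback just mentioned)
reproduces the worked example:

```python
def arithmetic_encode(seq, ranges):
    """Narrow [low, high) once per symbol; any tag in the final interval works."""
    low, high = 0.0, 1.0
    for s in seq:
        span = high - low
        lo_s, hi_s = ranges[s]
        low, high = low + span * lo_s, low + span * hi_s
    return low, high

def arithmetic_decode(tag, ranges, n):
    """Undo the narrowing: locate the sub-range, emit its symbol, rescale the tag."""
    out = []
    for _ in range(n):
        for s, (lo, hi) in ranges.items():
            if lo <= tag < hi:
                out.append(s)
                tag = (tag - lo) / (hi - lo)
                break
    return out

ranges = {"a1": (0.0, 0.2), "a2": (0.2, 0.4), "a3": (0.4, 0.8), "a4": (0.8, 1.0)}
print(arithmetic_encode(["a1", "a2", "a3", "a3", "a4"], ranges))  # ~(0.06752, 0.0688)
print(arithmetic_decode(0.068, ranges, 5))  # ['a1', 'a2', 'a3', 'a3', 'a4']
```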

Lempel-Ziv-Welch (LZW) Coding

Removes inter-pixel redundancy.

Fixed-length codes; used in GIF, TIFF, and PDF.

The decoder builds an identical dictionary as it decodes the data
stream.

Advantage: no need to know the probability of occurrence of the
symbols in advance.

Example: given the sequence 000101110010100101, the Lempel-Ziv
parsing proceeds as follows:

Numerical positions      : 1   2   3    4    5    6    7    8    9
Subsequences             : 0   1   00   01   011  10   010  100  101
Numerical representation :         11   12   42   21   41   61   62
Binary encoded blocks    :         0010 0011 1001 0100 1000 1100 1101

(Each binary block is the 3-bit position of the prefix
subsequence followed by the innovation bit.)
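For contrast with the parsing table above (which follows the
Lempel-Ziv scheme of prefix positions plus an innovation symbol),
here is a minimal sketch of classic LZW on the same binary
string, where the dictionary grows as repeated patterns appear:

```python
def lzw_compress(data):
    """Classic LZW: start from the single-symbol alphabet, grow the dictionary."""
    dictionary = {"0": 0, "1": 1}           # initial alphabet for a binary source
    w, out = "", []
    for c in data:
        if w + c in dictionary:
            w += c                          # keep extending the current match
        else:
            out.append(dictionary[w])       # emit code for the longest match
            dictionary[w + c] = len(dictionary)  # register the new phrase
            w = c
    if w:
        out.append(dictionary[w])
    return out

print(lzw_compress("000101110010100101"))   # [0, 2, 1, 0, 1, 6, 3, 5, 8, 1]
```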

Bit-Plane Encoding

8 planes of 1 bit each, independently encoded (MSB to LSB).

The higher-order bits contain the majority of the visually
significant data.

Useful for image compression.
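Extracting the planes is a one-liner per bit with NumPy; a
minimal sketch:

```python
import numpy as np

def bit_planes(img):
    """Split an 8-bit image into 8 binary planes, ordered MSB (bit 7) to LSB (bit 0)."""
    img = np.asarray(img, dtype=np.uint8)
    return [(img >> b) & 1 for b in range(7, -1, -1)]
```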

Image data compression methods

Predictive coding

Transform coding

Predictive Coding

Lossless and lossy variants.

Eliminates interpixel redundancy in the spatial/time domain.

Information already sent or available is used to predict future
values, and only the difference is coded:

$$\hat{f}(n) = \text{round}\left[\sum_{i=1}^{m} \alpha_i\, f(n-i)\right]$$

Fig.: Predictor

Lossless predictive coding

Consists of an encoder and a decoder, both containing an
identical predictor.

The predictor generates the anticipated value of each sample
based on a specified number of past samples.

Fig.: Lossless Predictive Coding model (a) Encoder (b) Decoder

$$e(n) = f(n) - \hat{f}(n)$$

where e(n) is the prediction error, f(n) is the input signal, and
f̂(n) is the output of the predictor.
e(n) is encoded using a variable-length code. At the receiver
end, the decoder reconstructs e(n) from the received
variable-length code words and performs the inverse operation to
recreate the original input sequence:

$$f(n) = e(n) + \hat{f}(n)$$

In many cases the prediction is formed as a linear combination of
m previous samples, where m is the order of the linear predictor,
round() denotes rounding to the nearest integer, and the α_i for
i = 1, 2, ..., m are the prediction coefficients. It typically
achieves around 50% compression.
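A minimal sketch of the simplest case, a first-order predictor
with m = 1 and α1 = 1 (i.e., "predict the previous sample"),
which reproduces the difference example from the
interpixel-redundancy slide:

```python
import numpy as np

def predictive_encode(f):
    """First-order lossless predictor: transmit f[0], then e[n] = f[n] - f[n-1]."""
    f = np.asarray(f, dtype=np.int32)
    return f[0], np.diff(f)

def predictive_decode(f0, e):
    """Inverse operation: a cumulative sum rebuilds the signal exactly."""
    return np.concatenate(([f0], f0 + np.cumsum(e)))

row = [123, 120, 121, 124, 126, 128, 127]
f0, e = predictive_encode(row)
print(e)                                   # [-3  1  3  2  2 -1]: small, cheap to code
assert (predictive_decode(f0, e) == row).all()
```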

Lossy Predictive Coding

Fig.: Lossy Predictive Coding model (a) Encoder (b) Decoder

PREDICTIVE CODING
DPCM coding principles: maximize image compression efficiency by
exploiting the spatial redundancy present in an image!
Much of the data is close to zero => spatial redundancy: the
brightness almost repeats from one point to the next! => no need
to encode all the brightness information, only what is new.

Fig.: Line-by-line difference of the luminance

From the images, one can estimate (predict) the brightness of the
subsequent spatial point based on the brightness of the previous
(one or more) spatial points = PREDICTIVE CODING.

DPCM coding principles (continued): BASIC FORMULATION OF DPCM

Let {u(m)} be the image pixels, represented line by line as a
vector. Suppose we have already encoded u(0), u(1), ..., u(n-1);
at the decoder, only their decoded versions (original + coding
error) are available, denoted u'(0), u'(1), ..., u'(n-1).

To predictively encode the current sample u(n):

(1) estimate (predict) the gray level of the n-th sample based on
the previously decoded neighbour pixels, using some prediction
function ψ:

$$\hat{u}(n) = \psi\big(u'(n-1), u'(n-2), \ldots\big)$$

(2) compute the prediction error:

$$e(n) = u(n) - \hat{u}(n)$$

(3) quantize the prediction error e(n) into e'(n); encode e'(n)
with PCM and transmit it.

At the decoder: 1) decode to get e'(n); 2) build the same
prediction û(n); 3) reconstruct u'(n):

$$u'(n) = \hat{u}(n) + e'(n)$$

The overall encoding & decoding error is just the quantization
error:

$$u(n) - u'(n) = e(n) - e'(n) = q(n)$$

PREDICTIVE CODING (continued)

Basic DPCM codec

Fig.: DPCM codec: (a) with distortions; (b) without distortions
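A toy DPCM loop along one line of pixels, written to mirror the
formulation above: the encoder predicts from the previously
*decoded* sample (not the original), so quantization error does
not accumulate. The uniform quantizer step size is an arbitrary
choice for illustration:

```python
import numpy as np

def dpcm_line(u, step=4):
    """Encode/decode one line with first-order DPCM and a uniform quantizer.

    Returns the decoder's reconstruction u'; per-sample error stays within the
    quantizer error q(n) because encoder and decoder share the same prediction.
    """
    u = np.asarray(u, dtype=np.float64)
    u_rec = np.empty_like(u)
    u_rec[0] = u[0]                          # first sample transmitted as-is
    for n in range(1, len(u)):
        pred = u_rec[n - 1]                  # prediction from the decoded past
        e = u[n] - pred                      # prediction error e(n)
        e_q = step * np.round(e / step)      # quantized error e'(n)
        u_rec[n] = pred + e_q                # u'(n) = u_hat(n) + e'(n)
    return u_rec
```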

Lossy Transform Compression

Provides greater compression than predictive methods, although at
the expense of greater computation.

A reversible linear transform (such as the Fourier transform) is
used to map the image into a set of transform coefficients.

The goal is to pack as much information as possible into the
smallest number of coefficients.

The quantizer stage eliminates the coefficients that carry the
least information.

Fig.: A transform coding system (a) Encoder (b) Decoder

JPEG Compression

JPEG is an image compression standard which was accepted as an
international standard in 1992.

Developed by the Joint Photographic Experts Group of the ISO/IEC
for coding and compression of color/gray-scale images.

Yields acceptable compression in the 10:1 range.

A scheme for video compression based on JPEG, called Motion JPEG
(MJPEG), also exists.

Different transform techniques for image compression

Discrete Fourier Transform (DFT)

Discrete Sine Transform (DST)

Karhunen-Loève Transform (KLT)

Discrete Cosine Transform (DCT)

Discrete Wavelet Transform (DWT)

Walsh-Hadamard Transform (WHT)

The choice of a particular transform in a given application
depends on the amount of reconstruction error that can be
tolerated.

Forward Transform

$$T(u,v) = \sum_{x=0}^{N-1} \sum_{y=0}^{N-1} f(x,y)\, g(x,y,u,v)$$

At the receiving end, the image is reconstructed by taking the
inverse transform:

$$f(x,y) = \sum_{u=0}^{N-1} \sum_{v=0}^{N-1} T(u,v)\, h(x,y,u,v)$$

g(x,y,u,v): forward transformation kernel (basis image)
h(x,y,u,v): inverse transformation kernel

The kernel is separable if

$$g(x,y,u,v) = g_1(x,u)\, g_2(y,v)$$

(the horizontal and vertical axes are independent, so the number
of computations is reduced).
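Separability is what makes this practical: the double sum
collapses into two matrix products, T = A f Aᵀ, where the rows of
A are the 1-D basis vectors. A sketch with a hand-built
orthonormal DCT-II basis (assumed here just to have a concrete
A):

```python
import numpy as np

N = 8
k, n = np.meshgrid(np.arange(N), np.arange(N), indexing="ij")
A = np.sqrt(2.0 / N) * np.cos(np.pi * (2 * n + 1) * k / (2 * N))
A[0, :] = np.sqrt(1.0 / N)        # DC row rescaled so that A is orthonormal

f = np.random.rand(N, N)          # toy image block
T = A @ f @ A.T                   # separable forward transform (two matrix products)
f_rec = A.T @ T @ A               # inverse: A orthonormal => A^-1 = A^T
assert np.allclose(f, f_rec)      # perfect reconstruction
```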

Image Transformations

Unitary transformations

Orthogonal and orthonormal basis vectors

How an arbitrary 1-D signal can be represented by a series
summation of orthogonal basis vectors

How an arbitrary image can be represented by a series summation
of orthogonal basis images

What is Image Transformation?

Image (N x N) --[Transform]--> Coefficient matrix (another N x N image)
Coefficient matrix --[Inverse Transform]--> Image (N x N)

Image Transformation Applications

Preprocessing
Filtering
Enhancement etc.

Data Compression

Feature Extraction
Edge Detection
Corner Detection etc.

What does Image Transformation do?

It represents a given image as a series summation of a set of
unitary matrices (orthogonal basis functions).

Example: for a 1-D signal x(t), this representation can be
written as

$$x(t) = \sum_{n=0}^{\infty} C_n\, a_n(t)$$

Here {a_n(t)} is a set of orthogonal functions.

Unitary Matrix

A matrix A is a unitary matrix if

$$A^{-1} = A^{*T}$$

Basis Images

Orthogonal/Orthonormal Functions

A set {a_n(t)} = {a_0(t), a_1(t), ...} of real-valued continuous
functions is called orthogonal over the interval t to t+T if

$$\int_{t}^{t+T} a_m(t)\, a_n(t)\, dt =
\begin{cases} k, & m = n \\ 0, & m \neq n \end{cases}$$

If k = 1 then we say the above set is orthonormal.

Now consider an example where a_n(t) = sin(n w t), a set of
orthogonal functions.

Plotting sin(wt) and sin(2wt) over the time period 0 to T, taking
their product, and integrating over the interval 0 to T gives

$$\int_{0}^{T} \sin(wt)\, \sin(2wt)\, dt = 0$$

Similarly, if we multiply sin(2wt) and sin(3wt) and then
integrate, we also get zero.

Hence this particular set, i.e., {sin(wt), sin(2wt), sin(3wt),
...}, is called orthogonal.

Now, to calculate the value of Cn, multiply both sides of
x(t) = Σ_n C_n a_n(t) by a_m(t) and integrate over one period.
By the definition of orthogonality, every term of the expansion
vanishes except the one with n = m, which equals C_m k, so

$$C_m = \frac{1}{k} \int_{t}^{t+T} x(t)\, a_m(t)\, dt$$

This is how we get the m-th coefficient of any arbitrary function
x(t).
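This orthogonality is easy to confirm numerically; a small check
(choosing w = 1, so one period is T = 2π):

```python
import numpy as np

T = 2 * np.pi                       # one period for w = 1
t = np.linspace(0.0, T, 100001)
s1, s2 = np.sin(t), np.sin(2 * t)

print(np.trapz(s1 * s2, t))         # ~0:  sin(wt) and sin(2wt) are orthogonal
print(np.trapz(s1 * s1, t))         # ~pi: here k = pi, so orthogonal but not orthonormal
```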

Compression Techniques

Karhunen-Loève Transform (KLT)

Discrete Cosine Transform (DCT)

Discrete Wavelet Transform (DWT)

DFT

The 2-dimensional DFT is defined by

$$F(u,v) = \frac{1}{N} \sum_{x=0}^{N-1} \sum_{y=0}^{N-1}
f(x,y)\, e^{-j 2\pi (ux + vy)/N} \qquad (1)$$

The inverse DFT is

$$f(x,y) = \frac{1}{N} \sum_{u=0}^{N-1} \sum_{v=0}^{N-1}
F(u,v)\, e^{\,j 2\pi (ux + vy)/N} \qquad (2)$$

Properties of DFT
Fast transform.
Good energy compaction; however, it requires complex-valued
computations.
Very useful in digital signal processing: convolution, filtering,
image analysis.

Why DCT and not FFT?

DCT is like FFT, but it can approximate smoothly varying (linear)
signals well with few coefficients.
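A crude transform-coding sketch along these lines, assuming SciPy
is available: take a 2-D DCT, keep only the largest fraction of
coefficients (standing in for the quantizer stage), and invert.
This is an illustration only, not the JPEG pipeline (JPEG works
on 8x8 blocks with perceptual quantization tables):

```python
import numpy as np
from scipy.fft import dctn, idctn

def dct_compress(img, keep=0.10):
    """Keep only the largest `keep` fraction of DCT coefficients, then invert."""
    c = dctn(np.asarray(img, dtype=np.float64), norm="ortho")
    cutoff = np.quantile(np.abs(c), 1.0 - keep)   # magnitude threshold
    c[np.abs(c) < cutoff] = 0.0                   # discard low-information coefficients
    return idctn(c, norm="ortho")
```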

Output File of Gray Image Using DCT

Original image size (bytes) = 65536, Compressed image size
(bytes) = 6382, C.R. = 90% (percentage size reduction),
SNR (dB) = 24.49, Simulation Time (Secs) = 1.21

SNR Performance for Gray Image

Transform   CR = 75%   CR = 85%   CR = 90%
DWT         31.24      30.29      29.06
KLT         30.92      29.23      28.47
DCT         29.26      26.32      24.49

Table: SNR values for gray image (all values in dB)

Simulation time performance for gray image

Transform   CR = 75%   CR = 85%   CR = 90%
KLT         3.68       3.79       3.95
DWT         1.31       1.33       1.34
DCT         1.1        1.15       1.21

Table: Simulation time for gray image (all values in seconds)

Output File of Color Image Using DCT

Original image size (bytes) = 196608, Compressed image size
(bytes) = 58896, C.R. = 70%, SNR (dB) = 23.05, Simulation Time
(Secs) = 3.57

Output File of Color Image Using DCT

Original image size (bytes) = 196608, Compressed image size
(bytes) = 18978, C.R. = 90%, SNR (dB) = 17.51, Simulation Time
(Secs) = 3.68
