Data Compression Techniques

DATA COMPRESSION TECHNIQUES

Presented By : Archit Gupta & Gaurav Gupta


CONTENT
- Distinguish between lossless and lossy compression.
- Describe run-length encoding and how it achieves compression.
- Describe Lempel Ziv encoding and the role of the dictionary in encoding and decoding.
- Describe Huffman coding and how it achieves compression.

LOSSLESS AND LOSSY COMPRESSION
Data compression means sending or storing a smaller number of bits. In general, compression methods can be divided into two broad categories: lossless and lossy.
- In lossless data compression, the integrity of the data is preserved. The original data and the data after compression and decompression are exactly the same.
- In lossy data compression, the integrity of the data is not exactly preserved. Our eyes and ears cannot distinguish subtle changes, so when the data is meant to be seen or heard, we can use a lossy data compression method.
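
The difference between the two categories can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the standard-library `zlib` module stands in for a lossless method, and the 4-bit quantization of the hypothetical `samples` list is a toy stand-in for a lossy method, not a real audio or image codec.

```python
import zlib

data = b"AAAAABBBBBCCCCC" * 20

# Lossless: decompressing returns exactly the original bytes.
packed = zlib.compress(data)
restored = zlib.decompress(packed)
print(len(data), len(packed), restored == data)

# Lossy (toy example): keep only the top 4 bits of each 8-bit sample.
samples = [12, 13, 200, 203, 90, 91]
quantized = [s >> 4 for s in samples]          # 4 bits per sample instead of 8
approx = [(q << 4) + 8 for q in quantized]     # rebuild at the middle of each bucket
print(approx)
```

The lossless round trip reproduces the input exactly, while the lossy round trip returns values that are only close to the originals.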
DATA COMPRESSION METHODS
RUN-LENGTH ENCODING
- Run-length encoding is probably the simplest method of compression.
- It can be used to compress data made of any combination of symbols.
- It can be very efficient when the data contains long runs of repeated symbols, as in icons, bitmaps, etc.
EXAMPLE OF RUN-LENGTH ENCODING
Original Data:

BBBBBBBAAAAAAAAAAAANMMMMM

Characters:  B  A  N  M
Run length:  07 12 01 05

Compressed Data:
B07A12N01M05
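
The encoding above can be sketched in Python as follows. The function names `rle_encode` and `rle_decode` are hypothetical, and the fixed two-digit count (with longer runs split into pieces of at most 99) is an assumption read off the `B07A12N01M05` format rather than something the slides specify.

```python
from itertools import groupby

def rle_encode(data):
    """Encode each run as the symbol followed by a two-digit run length."""
    out = []
    for symbol, run in groupby(data):
        n = len(list(run))
        # Split runs longer than 99 so every count fits in two digits.
        while n > 99:
            out.append(f"{symbol}99")
            n -= 99
        out.append(f"{symbol}{n:02d}")
    return "".join(out)

def rle_decode(encoded):
    """Invert rle_encode by reading fixed-width (symbol, two-digit count) groups."""
    return "".join(
        encoded[i] * int(encoded[i + 1:i + 3])
        for i in range(0, len(encoded), 3)
    )

original = "B" * 7 + "A" * 12 + "N" + "M" * 5
packed = rle_encode(original)
print(packed)                          # B07A12N01M05
print(rle_decode(packed) == original)  # True
```

Here 25 symbols shrink to 12 characters. A single isolated symbol such as N grows from one character to three, which is why the method pays off only when runs are long.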
A 17 x 17 image (289 pixels)
Compressed Data:
2W4R5W4R3W6R3W6R2W6R3W6R1W8R1W16R1W59R1W15R2W
15R3W13R5W11R7W9R9W7R11W5R13W3R15W1R8W
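
As a rough check, the run string can be expanded back into a pixel grid. The sketch below assumes each run is written count first and symbol second (the reverse of the previous example) and that pixels are stored row by row; `expand_runs` is a hypothetical helper name.

```python
import re

runs = (
    "2W4R5W4R3W6R3W6R2W6R3W"
    "6R1W8R1W16R1W59R1W15R2W"
    "15R3W13R5W11R7W9R9W7R11"
    "W5R13W3R15W1R8W"
)

def expand_runs(encoded):
    """Expand count-then-symbol runs such as '2W4R' into 'WWRRRR'."""
    return "".join(
        symbol * int(count)
        for count, symbol in re.findall(r"(\d+)([A-Z])", encoded)
    )

pixels = expand_runs(runs)
width = 17
rows = [pixels[i:i + width] for i in range(0, len(pixels), width)]
print(len(pixels))  # 289
for row in rows:
    # Draw W as '.' and R as '#' to make the shape visible.
    print(row.replace("W", ".").replace("R", "#"))
```

The run lengths add up to 289, which matches 17 x 17, so the compressed string describes the whole image.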
LEMPEL ZIV ENCODING
Lempel Ziv (LZ) encoding is an example of a category of algorithms called dictionary-based encoding.

The idea is to create a dictionary (a table) of strings used during the communication session.
COMPRESSION
In this phase there are two concurrent
events: building an indexed dictionary and
compressing a string of symbols. The
algorithm extracts the smallest substring that
cannot be found in the dictionary from the
remaining uncompressed string. It then stores
a copy of this substring in the dictionary as a
new entry and assigns it an index value.
Compression occurs when the
substring, except for the last character, is
replaced with the index found in the
dictionary. The process then inserts the
index and the last character of the
substring into the compressed string.
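
The compression phase described above corresponds to the variant usually called LZ78. A minimal Python sketch, assuming dictionary indices start at 1 and that index 0 stands for "no prefix" (the name `lz78_compress` and the pair representation are illustrative choices, not taken from the slides):

```python
def lz78_compress(text):
    """Return a list of (index, char) pairs; index 0 means 'no prefix'."""
    dictionary = {}   # substring -> index, starting at 1
    output = []
    current = ""
    for ch in text:
        candidate = current + ch
        if candidate in dictionary:
            # Keep extending until the substring is no longer in the dictionary.
            current = candidate
        else:
            # Emit the index of the known prefix plus the new last character.
            output.append((dictionary.get(current, 0), ch))
            dictionary[candidate] = len(dictionary) + 1
            current = ""
    if current:
        # Input ended inside a known substring: emit its index alone.
        output.append((dictionary[current], ""))
    return output

pairs = lz78_compress("BAABABBBA")
print(",".join(f"{index or ''}{ch}" for index, ch in pairs))  # B,A,2B,3B,1A
```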
EXAMPLE OF LEMPEL ZIV ENCODING

Original Data: BAABABBBA

Step  Parsed string  Compressed output  Dictionary index  Dictionary entry
1     B              B                  1                 B
2     A              A                  2                 A
3     AB             2B                 3                 AB
4     ABB            3B                 4                 ABB
5     BA             1A                 5                 BA

Compressed Data: B,A,2B,3B,1A
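
Decoding needs no transmitted dictionary, because the decoder can rebuild the same table while it reads the compressed data. A minimal sketch, assuming the compressed data is given as (index, character) pairs with index 0 meaning "no prefix"; `lz78_decompress` is a hypothetical name:

```python
def lz78_decompress(pairs):
    """Rebuild the text from (index, char) pairs; index 0 means 'no prefix'."""
    dictionary = {0: ""}   # index -> substring, filled in the encoder's order
    pieces = []
    for index, ch in pairs:
        entry = dictionary[index] + ch
        dictionary[len(dictionary)] = entry   # next free index: 1, 2, 3, ...
        pieces.append(entry)
    return "".join(pieces)

# B,A,2B,3B,1A written as (index, char) pairs
pairs = [(0, "B"), (0, "A"), (2, "B"), (3, "B"), (1, "A")]
print(lz78_decompress(pairs))  # BAABABBBA
```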
HUFFMAN CODING
The main idea behind Huffman coding is that it assigns shorter codes to symbols that occur more frequently and longer codes to those that occur less frequently.
CONSTRUCTION ALGORITHM
Given the frequencies of the characters, we wish to compute a trie such that the length of the encoding is the minimum possible.
Each character is a leaf of the trie.

The number of bits used to encode a character is the level number of its leaf (the root is at level 0).
Thus, if $f_i$ is the frequency of the $i$-th character and $l_i$ is the level of the leaf corresponding to it, we want to find a tree that minimizes $\sum_i f_i l_i$.
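HUFFMAN ENCODING TRIE
Let our text be ABRACADABRA.

Characters  A  B  R  C  D
Frequency   5  2  2  1  1

The trie is built by repeatedly merging the two nodes of lowest weight:
1. Merge C (1) and D (1) into a node of weight 2.
2. Merge B (2) and R (2) into a node of weight 4.
3. Merge the nodes of weight 4 and 2 into a node of weight 6.
4. Merge A (5) and the node of weight 6 into the root of weight 11.

Labelling each left branch 0 and each right branch 1 gives:

        11
      0/  \1
      A    6
         0/ \1
         4   2
       0/\1 0/\1
       B  R C  D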

A B R A C A D A B R A
0 100 101 0 110 0 111 0 100 101 0

= 23 Bits
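
The construction can be sketched in Python with a priority queue. This is an illustrative version, not the presenters' own code: `huffman_codes` is a hypothetical name, and ties between equal weights are broken by insertion order, so the individual codes may differ from the ones above even though the total length is the same.

```python
import heapq
from collections import Counter

def huffman_codes(text):
    """Return a dict mapping each symbol to its bit string."""
    freq = Counter(text)
    # Heap entries are (weight, tie_breaker, tree); a tree is either
    # a single symbol or a (left, right) pair of subtrees.
    heap = [(w, i, sym) for i, (sym, w) in enumerate(freq.items())]
    heapq.heapify(heap)
    if not heap:
        return {}
    if len(heap) == 1:
        return {heap[0][2]: "0"}
    next_id = len(heap)
    while len(heap) > 1:
        # Merge the two lowest-weight nodes into one parent node.
        w1, _, t1 = heapq.heappop(heap)
        w2, _, t2 = heapq.heappop(heap)
        heapq.heappush(heap, (w1 + w2, next_id, (t1, t2)))
        next_id += 1
    codes = {}
    def walk(tree, prefix):
        if isinstance(tree, tuple):
            walk(tree[0], prefix + "0")   # left branch
            walk(tree[1], prefix + "1")   # right branch
        else:
            codes[tree] = prefix
    walk(heap[0][2], "")
    return codes

text = "ABRACADABRA"
codes = huffman_codes(text)
encoded = "".join(codes[ch] for ch in text)
print(codes)
print(len(encoded))  # 23
```

For comparison, a fixed-length code for five distinct characters needs 3 bits per character, or 33 bits for the same 11-character text.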
Thank You
Question and Answer Session
