
COMPRESSION

TECHNIQUES
TY-CS-A GROUP-11
Archit Kothawade-15
Maitrey Chitale-58
Sharvari Deshmukh-73
Atharva Deshpande-75
Guide- Prof. Pushkar Joglekar
TABLE OF CONTENTS

01 INTRODUCTION
02 TYPES OF COMPRESSION TECHNIQUES
03 LOSS-LESS TECHNIQUES
04 LOSSY TECHNIQUES
WHAT-
The process of reducing the size of data files without significantly
affecting their quality or integrity.

WHY-
1. Reduced Storage requirements
2. Faster data transmission
3. Cost efficiency
4. Reduced network traffic
5. Faster processing
TYPES-

Data integrity
  Lossless: Original data can be completely reconstructed from the compressed
  data without any loss of information.
  Lossy: Some data is permanently removed from the original data to achieve
  higher compression ratios.

Suitability
  Lossless: For text, program files, and databases, where preserving every bit
  of the original information is crucial.
  Lossy: For multimedia data (images, audio, video), where a certain degree of
  imperceptible loss is acceptable in exchange for significant file size reduction.

Advantages
  Lossless: Relatively quick; maintains the original data (at the cost of lower
  compression ratios).
  Lossy: Reduces file size dramatically; the user can select the compression level.

Algorithms
  Lossless: Run Length Encoding, Huffman Encoding, Lempel–Ziv–Welch
  Lossy: Discrete Cosine Transform, Wavelet Transform, Quantization
LOSS-LESS ALGORITHMS
1. Run Length Encoding
Working-
RLE works by examining the input data and identifying consecutive occurrences
of the same symbol.
It then replaces each such sequence with a single symbol followed by its count.
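
A minimal Python sketch of this idea (illustrative only; the function names are
ours, not part of any standard library):

def rle_encode(data):
    """Replace each run of identical symbols with a (symbol, count) pair."""
    encoded = []
    i = 0
    while i < len(data):
        start = i
        while i < len(data) and data[i] == data[start]:
            i += 1
        encoded.append((data[start], i - start))
    return encoded

def rle_decode(encoded):
    """Expand each (symbol, count) pair back into its original run."""
    return "".join(symbol * count for symbol, count in encoded)

# A highly redundant string compresses well:
# rle_encode("WWWWWWWWWWWWBWWWWWWWWWWWWBBB")
# -> [('W', 12), ('B', 1), ('W', 12), ('B', 3)]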
Suitability-
Best suited for simple images & animations with many redundant pixels.
Useful for black and white images in particular.
May not be as effective for data with minimal repetition, as the encoding could
potentially result in a longer string than the original data.

Advantages-
Lossless compression
Easy to implement, minimal computational resources
Effective for large redundant data

Drawbacks-
Ineffective for complex data
2. Huffman Encoding
Working-

Assigns variable-length codes to input characters, with more frequent characters
receiving shorter codes and less frequent characters receiving longer codes.
These codes are prefix codes (bit sequences in which no code-word is a prefix of
another), which ensures no ambiguity while decoding the generated bitstream.
Step 1- Frequency Analysis

Character    Frequency
a            5
b            9
c            12
d            13
e            16
f            45

Step 2- Building the Huffman Tree
A Huffman tree is constructed using a min-heap: at each step the two nodes with
the lowest frequencies are removed and merged into a new internal node whose
frequency is their sum.

After merging a (5) and b (9):       c 12, d 13, Internal Node 14, e 16, f 45
After merging c (12) and d (13):     Internal Node 14, e 16, Internal Node 25, f 45
After merging Node 14 and e (16):    Internal Node 25, Internal Node 30, f 45
After merging Node 25 and Node 30:   f 45, Internal Node 55
After merging f (45) and Node 55:    Internal Node 100 (the root)

Step 3- Assigning Codes
Starting from the root, a 0 is appended for every left branch and a 1 for every
right branch, giving each character its code-word (a code sketch follows below).

Character    Code-word
f            0
c            100
d            101
a            1100
b            1101
e            111
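
A compact Python sketch of the same construction, using the standard-library
heapq module as the min-heap (illustrative; the identifiers are ours):

import heapq
from collections import Counter

def huffman_codes(text):
    """Build a Huffman tree with a min-heap and return each character's prefix code."""
    # Heap entries are (frequency, tie-breaker, node); a node is either a
    # character (leaf) or a (left, right) pair (internal node).
    heap = [(freq, i, ch) for i, (ch, freq) in enumerate(Counter(text).items())]
    heapq.heapify(heap)
    tie = len(heap)
    while len(heap) > 1:
        f1, _, left = heapq.heappop(heap)      # the two lowest frequencies
        f2, _, right = heapq.heappop(heap)
        heapq.heappush(heap, (f1 + f2, tie, (left, right)))   # new internal node
        tie += 1
    codes = {}
    def walk(node, code):
        if isinstance(node, str):              # leaf: record its code-word
            codes[node] = code or "0"
        else:                                  # internal node: 0 = left, 1 = right
            walk(node[0], code + "0")
            walk(node[1], code + "1")
    walk(heap[0][2], "")
    return codes

# With the frequencies above (a:5, b:9, c:12, d:13, e:16, f:45) the code lengths
# come out as 4, 4, 3, 3, 3 and 1 bits respectively; the exact bit patterns depend
# on tie-breaking and left/right orientation.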
Time Complexity-
O(n log n), where n is the number of unique characters.

Space Complexity-
O(n)

Advantages-
Lossless compression
No ambiguity in decoding
Efficient for data with varying frequencies of characters

Disadvantages-
Encoding overhead
3. Lempel–Ziv–Welch

Working-
As the input data is being processed, a dictionary maintains a correspondence
between the longest words encountered so far and a list of code values.

The words are replaced by their corresponding codes and so the input
file is compressed.

Efficiency of the algorithm increases as the number of long, repetitive words
in the input data increases.
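
A short Python sketch of the compression side, assuming the dictionary starts
with the 256 single-byte strings (illustrative; identifiers are ours):

def lzw_compress(data):
    """Emit a code for the longest dictionary match and grow the dictionary."""
    dictionary = {chr(i): i for i in range(256)}   # all single-byte strings
    next_code = 256
    current = ""
    output = []
    for ch in data:
        candidate = current + ch
        if candidate in dictionary:
            current = candidate                    # keep extending the match
        else:
            output.append(dictionary[current])     # emit the longest match found
            dictionary[candidate] = next_code      # add the new word
            next_code += 1
            current = ch
    if current:
        output.append(dictionary[current])
    return output

# lzw_compress("TOBEORNOTTOBEORTOBEORNOT") emits fewer and fewer new codes
# as the repeated words are found in the dictionary.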
Time Complexity-
O(n) where n is the length of the input data.
Space Complexity-
O(n)
Advantages-
Lossless compression
LZW requires no prior information about the input data
stream.
LZW can compress the input stream in one single pass.
LZW can achieve high compression ratios

Drawbacks-
Slower compression
Dictionary size matters because of memory constraints
LOSSY ALGORITHMS
1. Discrete Cosine Transform
Breaking the Signal into Blocks-
The input signal (e.g. an image) is divided into small, square blocks of pixels. Each
block is treated as a 2D matrix of pixel values.

Transforming the Blocks:
DCT calculates a weighted sum of cosine functions of varying frequencies that
oscillate across the block.

These functions capture the changes in intensity across the block and represent
the image data in terms of its frequency components.
Separating High and Low Frequencies
The resulting DCT coefficients represent the contributions of different
frequencies to the original signal.

Lower-frequency components tend to capture the overall structure of the image,
while the higher-frequency components capture the details and edges.
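
A small NumPy sketch of a 2D DCT on one block, written out from the standard
DCT-II formula rather than a library routine (illustrative only; the function
name and the example block are ours):

import numpy as np

def dct2(block):
    """2D DCT-II of an N x N block: the weights of the cosine basis functions."""
    n = block.shape[0]
    x = np.arange(n)
    k = x.reshape(-1, 1)
    # 1D orthonormal DCT-II matrix: C[k, x] = alpha(k) * cos(pi * (2x + 1) * k / (2N))
    c = np.cos(np.pi * (2 * x + 1) * k / (2 * n))
    alpha = np.full(n, np.sqrt(2.0 / n))
    alpha[0] = np.sqrt(1.0 / n)
    c = alpha.reshape(-1, 1) * c
    # Transform rows and columns: coefficients = C @ block @ C^T
    return c @ block @ c.T

# A smooth 8x8 block (a linear gradient) concentrates nearly all of its energy
# in a handful of low-frequency coefficients near the top-left corner.
block = np.outer(np.linspace(0, 255, 8), np.ones(8))
coeffs = dct2(block)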
2. Quantization
Breaking Values into Intervals:
Quantization involves dividing the range of continuous values into distinct intervals.

Rounding Values:
Each original value is rounded to the nearest value in the reduced set of levels. This
rounding reduces the precision of the data, leading to a loss of detail or accuracy.

Loss of Information:
Since the original data is approximated or simplified during quantization, there is typically a
loss of some information or fine details. This loss can affect the quality of the reconstructed
data, especially in the case of highly detailed or complex signals.
Quantization of a grayscale image :
Pixel intensity range : 0 to 255.

8 levels : [0-31], [32-63], [64-95], [96-127], [128-159], [160-191], [192-223], [224-255].

Rounding pixel intensity: each pixel value is mapped to the single representative
value of the interval it falls into (for example, an intensity of 75 lies in [64-95]).
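
A minimal NumPy sketch of this uniform quantization, assuming each interval is
represented by a value near its midpoint (one common convention; other
representative values are possible; the function name is ours):

import numpy as np

def quantize(pixels, num_levels=8):
    """Map 0-255 intensities onto num_levels equal-width intervals."""
    step = 256 // num_levels            # 8 levels -> intervals of width 32
    indices = pixels // step            # which interval each pixel falls into
    return indices * step + step // 2   # representative value of each interval

pixels = np.array([0, 20, 75, 130, 255])
print(quantize(pixels))                 # [ 16  16  80 144 240]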


THANK
YOU
