Compression Techniques
TY-CS-A GROUP-11
Archit Kothawade-15
Maitrey Chitale-58
Sharvari Deshmukh-73
Atharva Deshpande-75
Guide- Prof. Pushkar Joglekar
TABLE OF CONTENTS
01 INTRODUCTION
02 TYPES OF COMPRESSION TECHNIQUES
03 LOSSLESS TECHNIQUES
04 LOSSY TECHNIQUES
WHAT-
The process of reducing the size of data files
without significantly affecting their
quality or integrity.
WHY-
1. Reduced Storage requirements
2. Faster data transmission
3. Cost efficiency
4. Reduced network traffic
5. Faster processing
TYPES-

Lossless
Data integrity- Original data can be completely reconstructed from the compressed data, without any loss of information
Advantages- Maintains original data; relatively quick
Limitation- Offers lower compression ratios

Lossy
Data integrity- Some data is permanently removed from the original data to achieve higher compression ratios
Advantages- Reduces file size dramatically; user can select the compression level
Advantages-
Lossless compression
Easy to implement, minimal computational resources
Effective for large redundant data
Drawbacks-
Ineffective for complex data
2. Huffman Encoding
Working-
Huffman coding assigns shorter bit sequences to characters that occur more
often. These bit sequences are called Prefix Codes: no code word is a prefix
of another, which ensures no ambiguity while decoding the generated bitstream.
Step 1- Frequency Analysis

Character Frequency
a         5
b         9
c         12
d         13
e         16
f         45

Step 2- Building the Huffman Tree

A Huffman tree is constructed using a min-heap: the two lowest-frequency
nodes are repeatedly removed and merged into an internal node whose
frequency is their sum. Here the merges are a+b -> 14, c+d -> 25,
14+e -> 30, 25+30 -> 55, and finally 55+f -> 100, the root.
Step 3- Assigning Codes

Character Code word
f         0
c         100
d         101
a         1100
b         1101
e         111
Time Complexity-
O(n log n), where n is the number of unique characters.
Space Complexity-
O(n)
Advantages-
Lossless compression
No ambiguity in decoding
Efficient for data with varying frequencies of characters
Disadvantages-
Encoding overhead
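The three steps above can be sketched in Python; this is a minimal sketch, not a production encoder, and the function name `huffman_codes` is ours. The `heapq` module supplies the min-heap mentioned in Step 2.

```python
import heapq

def huffman_codes(freqs):
    """Build a prefix-code table from a {character: frequency} dict."""
    # Heap entries are (frequency, tiebreak, tree); a tree is either a
    # character (leaf) or a (left, right) pair (internal node).
    heap = [(f, i, ch) for i, (ch, f) in enumerate(freqs.items())]
    heapq.heapify(heap)
    count = len(heap)
    while len(heap) > 1:
        f1, _, t1 = heapq.heappop(heap)   # two lowest-frequency nodes
        f2, _, t2 = heapq.heappop(heap)
        heapq.heappush(heap, (f1 + f2, count, (t1, t2)))  # internal node
        count += 1
    codes = {}
    def walk(tree, prefix):
        if isinstance(tree, tuple):       # internal node: 0 left, 1 right
            walk(tree[0], prefix + "0")
            walk(tree[1], prefix + "1")
        else:                             # leaf: record the code word
            codes[tree] = prefix or "0"
    _, _, root = heap[0]
    walk(root, "")
    return codes

codes = huffman_codes({"a": 5, "b": 9, "c": 12, "d": 13, "e": 16, "f": 45})
# → {'f': '0', 'c': '100', 'd': '101', 'a': '1100', 'b': '1101', 'e': '111'}
```

With the frequencies from the example there are no ties, so this reproduces the code-word table from Step 3 exactly.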
3. Lempel–Ziv–Welch
Working-
As the input data is processed, a dictionary maps the longest strings
encountered so far to code values. Each string is replaced by its
corresponding code, and so the input file is compressed.
Drawbacks-
Slower compression
Dictionary Size Matters because of Memory Constraints
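The dictionary-building loop described above can be sketched as follows; this is a minimal compressor only (no decompressor, no bit packing), assuming a 256-entry initial dictionary of single bytes.

```python
def lzw_compress(data: bytes) -> list[int]:
    """Replace the longest previously seen strings with dictionary codes."""
    # Start with one entry per possible byte value (codes 0-255).
    dictionary = {bytes([i]): i for i in range(256)}
    next_code = 256
    w = b""
    out = []
    for byte in data:
        wc = w + bytes([byte])
        if wc in dictionary:
            w = wc                        # keep extending the current match
        else:
            out.append(dictionary[w])     # emit code for the longest match
            dictionary[wc] = next_code    # learn the new string
            next_code += 1
            w = bytes([byte])
    if w:
        out.append(dictionary[w])         # flush the final match
    return out

codes = lzw_compress(b"TOBEORNOTTOBEORTOBEORNOT")
# 24 input bytes become 16 codes as repeated substrings are learned:
# [84, 79, 66, 69, 79, 82, 78, 79, 84, 256, 258, 260, 265, 259, 261, 263]
```

Note how the growing dictionary is exactly why "Dictionary Size Matters": every new substring costs memory, so practical implementations cap or reset the dictionary.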
LOSSY ALGORITHMS
1. Discrete Cosine Transform
Breaking the Signal into Blocks-
The input signal (e.g. an image) is divided into small, square blocks of
pixels. Each block is treated as a 2D matrix of pixel values.
Applying the Transform-
The DCT expresses each block as a weighted sum of cosine basis functions.
These functions capture the changes in intensity across the block and
represent the image data in terms of its frequency components.
Separating High and Low Frequencies
The resulting DCT coefficients represent the contributions of different
frequencies to the original signal.
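A direct (and deliberately slow) sketch of the 2D DCT-II on one square block, using only the standard library; real codecs such as JPEG use fast, vectorised transforms (e.g. `scipy.fft.dctn`) on 8x8 blocks, and the function name `dct_2d` is ours.

```python
import math

def dct_2d(block):
    """Return orthonormal 2D DCT-II coefficients for a square pixel block."""
    n = len(block)
    def alpha(k):                         # orthonormal scaling factor
        return math.sqrt(1 / n) if k == 0 else math.sqrt(2 / n)
    coeffs = [[0.0] * n for _ in range(n)]
    for u in range(n):                    # (u, v) index the 2D frequencies
        for v in range(n):
            s = 0.0
            for x in range(n):
                for y in range(n):
                    s += (block[x][y]
                          * math.cos((2 * x + 1) * u * math.pi / (2 * n))
                          * math.cos((2 * y + 1) * v * math.pi / (2 * n)))
            coeffs[u][v] = alpha(u) * alpha(v) * s
    return coeffs

# A constant block has all its energy in the DC coefficient at (0, 0):
flat = [[128] * 8 for _ in range(8)]
c = dct_2d(flat)                          # c[0][0] = 1024, all others ≈ 0
```

The constant-block example illustrates the "separating frequencies" idea: with no intensity changes across the block, every non-DC (higher-frequency) coefficient vanishes.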
Rounding Values:
Each original value is rounded to the nearest value in a reduced set of levels. This
rounding reduces the precision of the data, leading to a loss of detail or accuracy.
Loss of Information:
Since the original data is approximated or simplified during quantization, there is typically a
loss of some information or fine details. This loss can affect the quality of the reconstructed
data, especially in the case of highly detailed or complex signals.
Quantization of a grayscale image:
Pixel intensity range: 0 to 255.