Chapter 7 Mmedia
Chapter 7:
Lossless Compression Algorithms
Abraham Abayneh
abraham12abay@gmail.com
Contents of the chapter
• Introduction
• Basics of Information Theory
• Run-Length Coding
• Variable-Length Coding (VLC)
• Dictionary Based Coding
• Huffman Coding
• Arithmetic Coding
• Lossless Image Compression
Introduction to Lossless Compression Algorithms
Lossless Compression
What is lossless compression algorithm?
Information theory
• Information theory is the study of efficient coding and its
consequences.
• It is the field of study concerned with the storage and transmission of
data.
• It is concerned with source coding and channel coding.
– Source coding: involves compression
– Channel coding: how to transmit data, how to overcome noise, etc.
• Entropy is the measure of information content in a message.
• Data compression may be viewed as a branch of information theory in
which the primary objective is to minimize the amount of data to be
transmitted.
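The slides do not give a formula for entropy. As a minimal sketch, assuming the standard Shannon definition H = -Σ p_i log2(p_i) and assuming the probabilities p_i are estimated from symbol counts in the message itself, entropy can be computed as follows (the function name `entropy` is illustrative):

```python
import math
from collections import Counter

def entropy(message):
    """Estimate Shannon entropy in bits per symbol from symbol frequencies."""
    counts = Counter(message)
    total = len(message)
    # H = -sum(p * log2(p)) over all distinct symbols in the message
    return -sum((c / total) * math.log2(c / total) for c in counts.values())

print(entropy("AABB"))  # two equally likely symbols: 1 bit per symbol
print(entropy("ABCD"))  # four equally likely symbols: 2 bits per symbol
```

Under this estimate, a message made of a single repeated symbol has zero entropy, which matches the intuition that it carries no information per symbol.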
7.3. The algorithms used in lossless
compression are:
7.3.1. Run-Length Coding
Memoryless Source: an information source whose symbols are
independently distributed.
Namely, the value of the current symbol does not depend
on the values of the previously appeared symbols.
Instead of assuming a memoryless source, Run-Length
Coding (RLC) exploits memory present in the information
source.
Rationale for RLC: if the information source has the
property that symbols tend to form continuous groups,
then each such symbol and the length of its group can be coded.
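The rationale above can be sketched in a few lines of Python. This is an illustrative sketch, not a specific file format: the names `rle_encode` and `rle_decode` and the choice to represent each run as a (symbol, length) pair are assumptions.

```python
from itertools import groupby

def rle_encode(data):
    """Replace each run of identical symbols with a (symbol, run length) pair."""
    return [(symbol, len(list(run))) for symbol, run in groupby(data)]

def rle_decode(pairs):
    """Expand (symbol, run length) pairs back into the original string."""
    return "".join(symbol * length for symbol, length in pairs)

pairs = rle_encode("AAAABBBCCD")
print(pairs)              # [('A', 4), ('B', 3), ('C', 2), ('D', 1)]
print(rle_decode(pairs))  # AAAABBBCCD
```

Note that when runs are short (for example "ABCD"), the pair representation is longer than the input, so RLC only pays off when the source really does produce long runs.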
Itec3121 - Multimedia Systems
7.3.2. Variable-Length Coding (VLC)
Huffman Coding
A bottom-up approach
Initialization: Put all symbols on a list sorted according to
their frequency counts.
Repeat until the list has only one symbol left:
– From the list, pick the two symbols with the lowest frequency counts.
– Form a Huffman sub-tree that has these two symbols as child nodes
and create a parent node.
– Assign the sum of the children’s frequency counts to the parent and
insert it into the list such that the order is maintained.
– Delete the children from the list.
Assign a code word to each leaf based on the path from the
root.
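The bottom-up procedure can be sketched in Python as follows. This is one possible implementation under stated assumptions: it uses a binary heap (`heapq`) in place of the sorted list, labels left branches "0" and right branches "1", and breaks frequency ties by insertion order. The function name `huffman_codes` is illustrative. Different tie-breaking rules can produce different but equally short codes.

```python
import heapq
from collections import Counter
from itertools import count

def huffman_codes(message):
    """Build a Huffman code table {symbol: bit string} for the given message."""
    freq = Counter(message)
    if not freq:
        return {}
    order = count()  # tie-breaker so the heap never compares tree nodes
    # Heap entries: (frequency, tie-breaker, node); a node is either a
    # symbol (leaf) or a (left, right) tuple (internal node).
    heap = [(f, next(order), sym) for sym, f in freq.items()]
    heapq.heapify(heap)
    if len(heap) == 1:
        return {heap[0][2]: "0"}  # single-symbol message: assume a 1-bit code
    while len(heap) > 1:
        # Pick the two nodes with the lowest frequency counts ...
        f1, _, left = heapq.heappop(heap)
        f2, _, right = heapq.heappop(heap)
        # ... and insert a parent whose count is the sum of its children.
        heapq.heappush(heap, (f1 + f2, next(order), (left, right)))
    codes = {}
    def walk(node, prefix):
        if isinstance(node, tuple):
            walk(node[0], prefix + "0")
            walk(node[1], prefix + "1")
        else:
            codes[node] = prefix  # code word = path from the root
    walk(heap[0][2], "")
    return codes

codes = huffman_codes("HELLO")
print(sum(len(codes[ch]) for ch in "HELLO"))  # 10 bits in total
```

For "HELLO" the exact code words depend on how ties are broken, but every valid Huffman code for these frequencies encodes the message in 10 bits.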
Properties of Huffman Coding
Unique Prefix Property: No Huffman code is a prefix of
any other Huffman code.
Optimality: minimum-redundancy code, proved optimal among codes
that assign a whole code word to each symbol, for a given data
model (i.e., a given probability distribution).
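The unique prefix property is what allows a decoder to read the bit stream left to right without separators. The sketch below illustrates this with a hand-picked code table; the table, and the names `is_prefix_free` and `decode`, are assumptions made for illustration and are not taken from the slides.

```python
def is_prefix_free(codes):
    """Return True if no code word is a prefix of another code word."""
    words = sorted(codes.values())
    # After sorting, a prefix is always immediately followed by a word it prefixes.
    return not any(b.startswith(a) for a, b in zip(words, words[1:]))

def decode(bits, codes):
    """Decode a bit string by emitting a symbol as soon as a code word matches."""
    inverse = {code: sym for sym, code in codes.items()}
    out, current = [], ""
    for bit in bits:
        current += bit
        if current in inverse:  # unambiguous because the code is prefix-free
            out.append(inverse[current])
            current = ""
    return "".join(out)

table = {"L": "0", "H": "10", "E": "110", "O": "111"}  # hypothetical code table
print(is_prefix_free(table))        # True
print(decode("1011000111", table))  # HELLO
```

A table such as {"A": "0", "B": "01"} fails the check, because after reading "0" the decoder cannot tell whether it has seen A or the start of B.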
7.3.3 Dictionary-based Coding
The Lempel-Ziv-Welch (LZW) algorithm uses fixed-length code words to
represent variable-length strings of symbols/characters that
commonly occur together,
e.g., words in English text.
The LZW encoder and decoder build up the same dictionary
dynamically while receiving the data.
LZW places longer and longer repeated entries into a dictionary,
and then emits the code for an element, rather than the string
itself, if the element has already been placed in the dictionary.
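A minimal sketch of the scheme, assuming the dictionary is initialized with the 256 single-character strings (so input characters must have code points below 256) and that codes are plain integers with no limit on dictionary size. The names `lzw_encode` and `lzw_decode` are illustrative.

```python
def lzw_encode(text):
    """Encode text as a list of integer codes, growing the dictionary on the fly."""
    dictionary = {chr(i): i for i in range(256)}
    next_code = 256
    current, output = "", []
    for ch in text:
        candidate = current + ch
        if candidate in dictionary:
            current = candidate  # keep extending the match
        else:
            output.append(dictionary[current])  # emit code for the longest match
            dictionary[candidate] = next_code   # add the new, longer string
            next_code += 1
            current = ch
    if current:
        output.append(dictionary[current])
    return output

def lzw_decode(codes):
    """Rebuild the same dictionary from the code stream and recover the text."""
    if not codes:
        return ""
    dictionary = {i: chr(i) for i in range(256)}
    next_code = 256
    previous = dictionary[codes[0]]
    output = [previous]
    for code in codes[1:]:
        if code in dictionary:
            entry = dictionary[code]
        elif code == next_code:
            # Code not yet defined: it must be previous + its own first character.
            entry = previous + previous[0]
        else:
            raise ValueError("invalid LZW code")
        output.append(entry)
        dictionary[next_code] = previous + entry[0]
        next_code += 1
        previous = entry
    return "".join(output)

codes = lzw_encode("ABABBABCABABBA")
print(codes)              # [65, 66, 256, 257, 66, 67, 256, 258, 65]
print(lzw_decode(codes))  # ABABBABCABABBA
```

The 14-character input is represented by 9 codes, and the decoder never receives the dictionary: it reconstructs each entry one step behind the encoder.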
7.3.4. Arithmetic Coding
Arithmetic coding is a more modern coding method that
usually outperforms Huffman coding in practice.
Huffman coding assigns each symbol a code word which
has an integral bit length. Arithmetic coding can treat the
whole message as one unit.
A message is represented by a half-open interval [a, b)
where a and b are real numbers between 0 and 1.
Initially, the interval is [0, 1). As the message becomes
longer, the length of the interval shortens and the number of
bits needed to represent the interval increases.
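The interval narrowing can be sketched as follows. The three-symbol distribution used here is made up for illustration and is not the distribution from the slides; the function names are also assumptions. Exact fractions are used to avoid floating-point rounding, whereas practical coders use integer arithmetic with renormalization.

```python
from fractions import Fraction

def cumulative_ranges(probs):
    """Map each symbol to its sub-range [low, high) of the unit interval."""
    ranges, low = {}, Fraction(0)
    for sym, p in probs.items():
        ranges[sym] = (low, low + p)
        low += p
    return ranges

def encode_interval(message, probs):
    """Narrow [0, 1) once per symbol and return the final interval [low, high)."""
    ranges = cumulative_ranges(probs)
    low, high = Fraction(0), Fraction(1)
    for sym in message:
        width = high - low
        sym_low, sym_high = ranges[sym]
        high = low + width * sym_high
        low = low + width * sym_low
    return low, high

def decode_value(value, probs, terminator="$"):
    """Recover the message from any number inside the final interval."""
    ranges = cumulative_ranges(probs)
    out = []
    while True:
        for sym, (lo, hi) in ranges.items():
            if lo <= value < hi:
                out.append(sym)
                value = (value - lo) / (hi - lo)  # rescale back to [0, 1)
                break
        else:
            raise ValueError("value is outside [0, 1)")
        if out[-1] == terminator:
            return "".join(out)

# Hypothetical distribution; "$" terminates the message.
probs = {"A": Fraction(1, 2), "B": Fraction(1, 4), "$": Fraction(1, 4)}
low, high = encode_interval("AB$", probs)
print(low, high)                 # 11/32 3/8
print(decode_value(low, probs))  # AB$
```

The final interval has width 1/32, the product of the three symbol probabilities, so about 5 bits suffice to identify a number inside it.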
Suppose the alphabet is [A, B, C, D, E, F, $], in which $ is a special symbol used to
terminate the message, and the known probability distribution is as shown in the
following figure.
7.4. Lossless Image Compression
Lossless JPEG
Thank you!!!!