
2) Intro

The Moving Picture Experts Group (MPEG) is an alliance of working groups established jointly by ISO and
IEC in 1988 that sets standards for media coding, including compression coding of audio, video, graphics
and genomic data, and transmission and file formats for various applications.

Video needs compression (even more than audio or still images) in practice because uncompressed video data is huge: in HDTV, the bit rate easily exceeds 1 Gbps, which creates serious problems for storage and network communication.

In addition, individuals increasingly produce their own content. Of all information produced in the world, 93% is stored in digital form, and hard disks in stand-alone PCs account for 55% of total storage shipped each year.

MPEG formats are used in various multimedia systems. The most well-known older MPEG media
formats typically use MPEG-1, MPEG-2, and MPEG-4 AVC media coding and MPEG-2 systems transport
streams and program streams. Newer systems typically use the MPEG base media file format and
dynamic streaming.

MPEG standards also gave rise to two household names: MP3 (MPEG-1 Audio Layer III) for audio and MP4 (the MPEG-4 Part 14 container) for video.

Compression is obtained by exploiting spatial and temporal redundancy, i.e., consecutive frames in a video are similar and do not need to be coded as whole images.

It also exploits the fact that our eyes are more sensitive to luminance than to chrominance. The luminance component is therefore preserved at full resolution, while the chrominance components can be encoded at a lower resolution.

3) Motion compensation

Motion compensation consists of three main steps: motion estimation (the motion vector search), motion-based prediction, and derivation of the prediction error.

On the encoder side we obtain the difference (prediction error) by subtracting the previous frame from the present frame, and send this difference over the channel to the decoder. On the decoder side, the received difference is added to the decoder's copy of the previous frame to reconstruct the present frame. Since the difference is much smaller than a full frame, this greatly reduces the amount of data transmitted.
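This encoder/decoder difference loop can be sketched in a few lines of Python (a toy illustration using one-dimensional integer "frames"; real codecs operate on 2-D blocks with motion compensation):

```python
def encode_diff(current, previous):
    """Encoder: transmit only the per-pixel difference from the previous frame."""
    return [c - p for c, p in zip(current, previous)]

def decode_diff(diff, previous):
    """Decoder: add the received difference to its copy of the previous frame."""
    return [d + p for d, p in zip(diff, previous)]

previous = [10, 10, 12, 200, 201]   # toy 1-D "frame" of pixel values
current  = [10, 11, 12, 205, 201]

residual = encode_diff(current, previous)
print(residual)                     # [0, 1, 0, 5, 0] -- mostly zeros, cheap to send
assert decode_diff(residual, previous) == current
```

The residual is mostly zeros whenever consecutive frames are similar, which is exactly the temporal redundancy the text describes.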

The image is divided into macroblocks of N x N pixels, where N = 16 for the luminance samples and N = 8 for each chrominance component, assuming 4:2:0 chroma subsampling.

Motion compensation is done at the macroblock level. The current frame is referred to as the target frame or target image; previous and/or future frames are referred to as reference frames.

In motion compensation we find, for each macroblock of the target frame, the most similar block in a reference frame. The motion vector is the displacement from the position of that best match in the reference frame to the macroblock's position in the target frame.


In forward prediction the reference frame is a previous frame; in backward prediction it is a future frame.

The motion vector search is usually limited to the immediate neighborhood of the macroblock. The best match can be found by sequential (exhaustive) search, 2D logarithmic search, or hierarchical search. Sequential search is time-consuming, so the 2D logarithmic search was introduced as a cheaper, suboptimal yet effective alternative; hierarchical search is a multiresolution approach.

5) Frame type

MPEG encoding produces 3 types of frames

I-frames ("intra-coded" frames) are coded independently; no temporal prediction is involved. Decoding can start as soon as an I-frame is received, since no other frames are needed to view it. They contain the most data of any frame type.

P-frames ("forward-predicted" frames) encode the changes from a previous I- or P-frame. A P-frame is typically about 30% the size of an I-frame.

B-frames ("bidirectionally predicted" frames) encode changes relative to a previous and/or a future frame. They contain the least data: typically about 25% the size of an I-frame and 50% the size of a P-frame.

An additional frame type, the D-frame (DC-coded frame), was used in earlier technology for fast forwarding; it uses block averages for prediction.

The frame structure between two I-frames is referred to as a group of pictures (GOP). For example, the sequence B B P P B P between two I-frames forms a GOP.

6) Components

The four major components of MPEG coding are data preparation, data preprocessing, quantization and entropy coding.

For data preparation the chrominance signals are subsampled. MPEG-1 mostly uses 4:2:0 chroma subsampling (the sampling ratio of the Y, Cb and Cr components), for both the NTSC standard used in America and the PAL standard used in India; the two systems use different resolutions.

Each frame is further divided into macroblocks: 16x16 for the luminance component and 8x8 for each chrominance component. The combination of all these blocks forms the image.

For data preprocessing, motion prediction is applied at the macroblock level: data is updated only for those parts of the image that are in motion, not for the entire image. We determine whether the current macroblock reappears in another frame by forward, backward or bidirectional prediction.

We then apply the discrete cosine transform (DCT) to 8x8 blocks to separate low-frequency from high-frequency content. There are two kinds of coefficients: the DC coefficient, which represents the constant (zero-frequency) part of the block, and the AC coefficients, which represent the higher frequencies. Quantization is then applied to the frequency-domain coefficients in a way that favours the low-frequency components, allowing the video to be stored with minimal data.
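The DCT-and-quantization step can be illustrated with a small sketch (a textbook 2-D DCT-II on an 8x8 block followed by a plain uniform quantizer; the step size of 16 is an arbitrary choice for illustration, not the actual MPEG quantization matrix):

```python
import math

N = 8

def dct2(block):
    """Naive 2-D DCT-II of an NxN block (the transform applied to MPEG blocks)."""
    def c(k):
        return 1 / math.sqrt(2) if k == 0 else 1.0
    out = [[0.0] * N for _ in range(N)]
    for u in range(N):
        for v in range(N):
            s = sum(block[x][y]
                    * math.cos((2 * x + 1) * u * math.pi / (2 * N))
                    * math.cos((2 * y + 1) * v * math.pi / (2 * N))
                    for x in range(N) for y in range(N))
            out[u][v] = 0.25 * c(u) * c(v) * s
    return out

def quantize(coeffs, step=16):
    """Uniform quantization: divide by the step and round; small (mostly
    high-frequency) coefficients collapse to zero and cost almost no bits."""
    return [[round(f / step) for f in row] for row in coeffs]

flat = [[100] * N for _ in range(N)]   # a flat block: all energy at zero frequency
q = quantize(dct2(flat))
print(q[0][0])   # 50: the DC coefficient carries everything
print(any(q[u][v] for u in range(N) for v in range(N) if (u, v) != (0, 0)))  # False
```

For a flat block every AC coefficient quantizes to zero, which is why smooth image regions compress so well.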

Entropy coding is then performed. The 2D array of coefficients is converted into a 1D series, since memory stores data linearly; here a zigzag or vertical scan is used. Run-length coding is applied to the resulting 1D series, because after DCT and quantization most high-frequency coefficients become zero. Finally, Huffman coding is applied to further reduce the file size.
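The zigzag scan and run-length step can be sketched as follows (a small Python illustration on a 4x4 quantized block; real MPEG blocks are 8x8, and the (run, value) pairs would subsequently be Huffman-coded):

```python
def zigzag(block):
    """Scan an NxN block along its anti-diagonals, lowest frequencies first,
    alternating direction as in the standard JPEG/MPEG zigzag order."""
    n = len(block)
    coords = [(y, x) for y in range(n) for x in range(n)]
    coords.sort(key=lambda p: (p[0] + p[1], p[0] if (p[0] + p[1]) % 2 else p[1]))
    return [block[y][x] for y, x in coords]

def run_length(seq):
    """Encode a 1-D series as (zero_run, value) pairs; the trailing zeros
    collapse into a single end-of-block (EOB) marker."""
    pairs, zeros = [], 0
    for v in seq:
        if v == 0:
            zeros += 1
        else:
            pairs.append((zeros, v))
            zeros = 0
    pairs.append("EOB")
    return pairs

# A quantized 4x4 block: only a few low-frequency coefficients survive.
block = [[9, 2, 0, 0],
         [1, 0, 0, 0],
         [0, 0, 0, 0],
         [0, 0, 0, 0]]
print(run_length(zigzag(block)))   # [(0, 9), (0, 2), (0, 1), 'EOB']
```

Thirteen trailing zeros shrink to one EOB symbol, which is the payoff of scanning low frequencies first.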

7) MPEG-1

MPEG-1 was the first standard to be finalized, targeting video compression for interactive video on CDs and for digital audio broadcasting. With MPEG-1, VCR-quality video at a resolution of 640x480 pixels can be compressed from an uncompressed rate of about 368 Mbps down to roughly 1.5 Mbps, a tremendous compression ratio that makes it practical to send over many networks. MPEG-1 used to dominate the encoding of CD-ROM-based movies, as it gives good quality and can be transmitted over twisted-pair cable for modest distances; for example, over an ADSL network it can be sent a distance of 18,000 feet (about 5.5 km). MPEG-1 can code progressive video only.

The three major components of MPEG-1 are audio, video and systems. The audio signal is given to an audio encoder and compressed independently of the video encoder, while the video signal is fed to a video encoder. A clock, commonly operating at 90 kHz, provides timestamps to both encoder and decoder; timestamps are necessary for streamed data so that in-order display is maintained. The encoder may take a long time to perform encoding, but decoding has to be fast.

MPEG-1 compresses video by removing spatial redundancy within each frame separately. Additional compression is achieved by exploiting temporal redundancy, i.e., the fact that consecutive frames are often almost identical. The MPEG-1 output consists of four types of frames: I-, P-, B- and D (DC) frames.
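The quoted figures can be checked with quick arithmetic. Assuming 24 bits per pixel and a frame rate of 50 frames/s (assumptions; the text states neither), the uncompressed rate works out to roughly the 368 Mbps quoted above:

```python
width, height = 640, 480
bits_per_pixel = 24        # assumption: 8 bits each for R, G and B
fps = 50                   # assumption: the frame rate is not stated in the text

uncompressed_bps = width * height * bits_per_pixel * fps
print(uncompressed_bps / 1e6)    # 368.64 -> roughly the 368 Mbps quoted above
print(uncompressed_bps / 1.5e6)  # 245.76 -> about 245:1 compression at 1.5 Mbps
```

A compression ratio in the hundreds is what makes 1.5 Mbps delivery over ordinary networks feasible.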

8) MP3

Audio coding can be broadly divided into two approaches: predictive encoding and perceptual encoding. Predictive encoding encodes the differences between successive samples instead of the sample values themselves; the differences are small, so fewer bits suffice, although the achievable compression is lower than with perceptual coding. It is used in DPCM and ADPCM. Perceptual encoding exploits the flaws of the human auditory system, based on the study of how humans perceive sound, and is the approach used in the MP3 format.

In MP3, the signal is first sampled, commonly at 32, 44.1 or 48 kHz; 44.1 kHz is the sampling frequency most widely used in practice. The signal is then converted from the time domain to the frequency domain using the fast Fourier transform (FFT), and the resulting spectrum is divided into 32 frequency bands, each of which is processed separately.

MP3 makes use of masking effects, namely frequency masking and temporal masking. Frequency masking occurs when a strong sound in one frequency band partially or completely masks a weaker sound in nearby bands, rendering it inaudible. Temporal masking occurs when a loud sound numbs our hearing for a short duration even after the sound has stopped, so a weaker sound following it closely in time is not heard.
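The predictive (DPCM) coding mentioned at the start of this section can be sketched as follows (a toy Python illustration on integer samples; real DPCM/ADPCM codecs also quantize the differences and use adaptive predictors):

```python
def dpcm_encode(samples):
    """Transmit the first sample, then only differences from the previous one."""
    diffs = [samples[0]]
    for prev, cur in zip(samples, samples[1:]):
        diffs.append(cur - prev)
    return diffs

def dpcm_decode(diffs):
    """Rebuild the samples by accumulating the differences."""
    out = [diffs[0]]
    for d in diffs[1:]:
        out.append(out[-1] + d)
    return out

samples = [100, 102, 104, 103, 101]   # slowly varying audio samples
diffs = dpcm_encode(samples)
print(diffs)                          # [100, 2, 2, -1, -2]: small values, fewer bits
assert dpcm_decode(diffs) == samples
```

Because audio changes slowly between adjacent samples, the differences span a much smaller range than the raw values and so need fewer bits each.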

After obtaining the frequency bands, bit allocation is performed. Frequency ranges that are completely masked are allotted zero bits; those that are partially masked are allotted fewer bits; and ranges that are not masked are allotted a high number of bits.
MP3 also tries to exploit the redundancy between stereo channels, as they overlap heavily.
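The bit-allocation idea can be sketched as follows (a hypothetical rule of thumb allotting roughly one bit per 6 dB that a band's level rises above its masking threshold; the band levels, thresholds and the 6 dB/bit rule are illustrative assumptions, not the actual MP3 psychoacoustic model):

```python
def allocate_bits(levels_db, mask_db, max_bits=8):
    """Give each band bits in proportion to how far it rises above its
    masking threshold; fully masked bands get zero bits."""
    bits = []
    for level, mask in zip(levels_db, mask_db):
        headroom = level - mask
        if headroom <= 0:
            bits.append(0)          # completely masked: inaudible, drop it
        else:
            bits.append(min(max_bits, 1 + headroom // 6))  # ~6 dB per bit
    return bits

levels = [60, 40, 55, 20]            # signal level per band (dB), made-up values
masks  = [30, 45, 50, 20]            # masking threshold per band (dB), made-up
print(allocate_bits(levels, masks))  # [6, 0, 1, 0]
```

The 6 dB/bit heuristic reflects the fact that each extra bit of quantization buys roughly 6 dB of signal-to-noise ratio.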

The audio stream rate is adjustable from 32 kbps to 448 kbps, depending on the bandwidth of the source.

9) MPEG-2

The MPEG-2 format is similar to MPEG-1. It was developed for digital television and is used for higher-quality video at bit rates above 4 Mbps. MPEG-2 supports seven profiles for different applications, namely Simple, Main, SNR scalable, Spatially scalable, High, 4:2:2 and Multiview. Within each profile, up to four levels are defined.

MPEG-2 can code interlaced video, in which each frame contains two fields: a top field and a bottom field. The scan lines from both fields are interleaved to form a single frame, which is then divided into 16x16 macroblocks and coded using motion compensation.

MPEG-2 has a more general way of multiplexing. Each elementary stream is packetized with timestamps, and the output of each packetizer is a Packetized Elementary Stream (PES) with some 30 header fields. Two multiplexers are then used: the PS (program stream) multiplexer, which produces variable-length packets sharing a common time base, and the TS (transport stream) multiplexer, which produces fixed-length packets with no common time base.
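The fixed-length transport-stream packets can be illustrated with a minimal sketch (building the 4-byte TS header with its 0x47 sync byte and 13-bit PID, then padding the payload to the fixed 188-byte packet size; adaptation fields, PCR and scrambling are ignored here for simplicity):

```python
def ts_packet(pid, payload, continuity):
    """Build one fixed-length 188-byte MPEG-2 transport stream packet:
    a 4-byte header (sync byte 0x47, 13-bit PID, 4-bit continuity counter)
    followed by the payload, padded out to the fixed packet size."""
    header = bytes([
        0x47,                           # sync byte marking the packet start
        0x40 | ((pid >> 8) & 0x1F),     # payload_unit_start=1 + high 5 bits of PID
        pid & 0xFF,                     # low 8 bits of PID
        0x10 | (continuity & 0x0F),     # "payload only" + continuity counter
    ])
    body = payload[:184].ljust(184, b"\xff")   # pad to 184 payload bytes
    return header + body

pkt = ts_packet(pid=0x100, payload=b"PES data...", continuity=3)
print(len(pkt), hex(pkt[0]))   # 188 0x47 -- every TS packet is the same size
```

The fixed 188-byte size is what makes transport streams resilient for broadcast: a receiver can resynchronize simply by looking for 0x47 every 188 bytes.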

MPEG-2 is also used in HDTV applications.

10) MPEG2 VS 1

Although MPEG-2 is similar to MPEG-1, there are some notable differences:

D-frames (DC-coded frames) are not supported in MPEG-2. DCT is still performed on 8x8 blocks, but the DC coefficients can be coded with higher precision (up to 10 or 11 bits instead of 8) for better quality. MPEG-2 also defines four levels with different resolutions, ranging from standard TV up to HDTV.

In addition, MPEG-2 provides better resilience to bit errors than MPEG-1, thanks to the addition of a transport stream alongside the program stream. It also supports 4:2:2 and 4:4:4 chroma subsampling, offers a more flexible video format, and uses nonlinear quantization.

11) MPEG-4

The MPEG-4 video compression project started as a standard for very low bit rates, aimed at portable applications such as videophones, but the standard includes much more than just data compression. Its other functionalities include content-based multimedia access tools, manipulation and bitstream editing. Using MPEG-4 we can code natural, synthetic and hybrid data. Coding efficiency is improved over the earlier standards, MPEG-4 is robust in error-prone environments, and multiple concurrent data streams can be coded in parallel.

Unlike MPEG-1 and MPEG-2, which use frame-based coding, MPEG-4 uses object-based coding. MPEG-4 therefore gives great attention to user interactivity. It provides a high compression ratio and is well suited to digital video composition, manipulation, indexing and retrieval.

Audio and video objects are composited into the scene, and the user can interact with these objects at the receiver's end.

12) Video object planes (VOPs)

Composition and manipulation of MPEG-4 video is based on the video object plane (VOP). Each object in the video is coded on its own plane, which makes encoding, decoding and content-based manipulation easier: we can select a single object and perform operations on it, such as rotate, scale or delete.

13) MPEG-7

MPEG-4 codes content as objects, but an object can be described in many different ways, just as the object 'apple' can be described in French, English, Russian and so on.

MPEG-7 defines the 'universal language' in which these objects are described and the 'grammar' by which 'sentences' about these objects can be formed.

MPEG-7 is also known as the 'Multimedia Content Description Interface' and is an ISO/IEC standard. Strictly speaking, MPEG-7 is not a data compression scheme but a content description standard: it specifies the rules for describing audiovisual content, whereas MPEG-1, -2 and -4 make the content available.

Audiovisual information that used to be consumed directly by human beings is increasingly created, exchanged, retrieved and re-used by computational systems. Representations that allow some degree of interpretation of the information's meaning can be accessed and processed by computers; hence MPEG-7 was introduced.

MPEG-7 is not targeted at any specific application; it aims to be as generic as possible to allow for future extension.

14) MPEG-21

The MPEG-21 standard, from the Moving Picture Experts Group, aims at defining an open framework for
multimedia applications.

MPEG-21 is based on two essential concepts: the definition of a Digital Item (a fundamental unit of distribution and transaction) and the notion of users interacting with Digital Items.

MPEG-21 also defines a "Rights Expression Language" (REL) standard as a means of managing restrictions on digital content usage. As an XML-based standard, MPEG-21 is designed to communicate machine-readable license information and to do so in a "ubiquitous, unambiguous and secure" manner.

MPEG-21 thus provides a comprehensive and flexible framework for 21st-century quality-of-service, rights-management and e-commerce applications.
15) Conclusion

As we have seen throughout this presentation, MPEG is a widely used family of standards for both audio and visual data.

Many different MPEG standards have been introduced over the years, and further advancements are still being made, making it one of the most efficient compression techniques.

Even though other methods such as H.261 through H.264 and beyond have been introduced, MPEG still dominates owing to its wide range of applications and observed advantages.

MPEG compression makes it easy to distribute and transmit audiovisual files over various media, such as the internet, as a result of the substantially smaller file sizes achieved.

MPEG has thus promoted online purchases and digital downloads of data, with reduced storage and bandwidth requirements.
