Chapter 4 Multimedia

11/25/2022
Multimedia Information Representation

 Text:
1. Unformatted text: plain text
e.g. ASCII character set
2. Formatted text: richtext
3. Hypertext
e.g. Pages and hyperlinks, HTML, SGML
 Image:
• 1-bit image, 8-bit gray scale image, 24-bit colour image, GIF,
JPEG, BMP, TIFF, PNG etc
 Audio:
• Speech signals: 50Hz ~ 10kHz
• Music-quality audio: 15Hz ~ 20kHz
• Mono , stereo, MP3, WAV, MIDI etc
 Video:
• Sequence of images, MP4, AVI, MPG etc
Streaming stored audio/video refers to on-demand requests for

compressed audio/video files.
Streaming live audio/video refers to the broadcasting of radio and
TV programs through the Internet.
Interactive audio/video refers to the use of the Internet for

interactive audio/video applications.
1
11/25/2022
Compression
Compression is generally applied in multimedia communication to
reduce the volume of information to be transmitted or to reduce the
bandwidth that is required for its transmission.
Lossless compression:
The aim is to reduce the number of bits used to represent the
source information in such a way that there is no loss of information.
Lossy compression:
The aim is to reduce the number of bits used to represent the
source information by introducing acceptable loss of information.
Text compression:
Any compression algorithm associated with text must be lossless
since the loss of just a single character could modify the meaning
of a complete string.
Huffman coding
Characters a e i o u s t
Frequencies 10 15 12 3 4 13 1
Step-01:
Step-02:
2
11/25/2022
Step-03:
Step-04:
Step-05:
Step-06:
3
11/25/2022
Step-07:
Step-08:
a = 111
e = 10
i = 00
o = 11001
u = 1101
s = 01
t = 11000
4
11/25/2022
Audio Compression
Audio compression can be used for speech or music. For speech, we
need to compress a 64-kHz digitized signal; for music, we need to
compress a 1.411-MHz signal.
Two categories of techniques are used for audio compression:
1. predictive encoding
2. perceptual encoding
Predictive Encoding
In predictive encoding, the differences between the samples are
encoded instead of encoding all the sampled values.
This type of compression is normally used for speech.
Several standards have been defined such as GSM (13 kbps), G.729
(8 kbps), and G.723.3 (6.4 or 5.3 kbps).
Perceptual Encoding:
Perceptual encoding is based on the science of psychoacoustics,
which is the study of how people perceive sound.
The idea is based on some flaws in our auditory system:
5
11/25/2022
Some sounds can mask other sounds. Masking can happen in

frequency and time.
In frequency masking, a loud sound in a frequency range can
partially or totally mask a softer sound in another frequency range.
For example, we cannot hear what our dance partner says in a room
where a loud heavy metal band is performing.
In temporal masking, a loud sound can numb our ears for a short
time even after the sound has stopped.
6
11/25/2022
 MP3 uses these two phenomena, frequency and temporal

masking, to compress audio signals.
 The technique analyzes and divides the spectrum into several
groups.
 Zero bits are allocated to the frequency ranges that are totally
masked.
 A small number of bits are allocated to the frequency ranges that
are partially masked.
 A larger number of bits are allocated to the frequency ranges
that are not masked.
 MP3 produces three data rates: 96 kbps, 128 kbps, and 160 kbps.
The rate is based on the range of the frequencies in the original
analog audio.
Image Compression: JPEG

 If the picture is in gray scale, each pixel can be represented by an
8-bit integer (256 levels).
 If the picture is in color, each pixel can be represented by 24 bits
with each 8 bits representing red, blue, or green (RBG).
 To simplify the discussion, we concentrate on a gray scale
picture.
 In JPEG, a gray scale picture is divided into blocks of 8 × 8 pixels
7
11/25/2022
 The purpose of dividing the picture into blocks is to decrease the

number of calculations because, the number of mathematical
operations for each picture is the square of the number of units.
 The whole idea of JPEG is to change the picture into a linear
(vector) set of numbers that reveals the redundancies.
 The redundancies (lack of changes) can then be removed by using
one of the text compression methods.
 Discrete Cosine Transform (DCT): each block of 64 pixels goes

through a transformation called the discrete cosine transform.
 The transformation changes the 64 values so that the relative
relationships between pixels are kept but the redundancies are
revealed.
Case 1: In this case, we have a block of uniform gray, and the value of
each pixel is 20.
When we do the transformations, we get a nonzero value for the first
element (upper left corner); the rest of the pixels have a value of 0.
The value of T (0,0) is the average (multiplied by a constant) of the
P(x,y) values and is called the dc value.
The rest of the values, called ac values, in T(m,n) represent changes
in the pixel values. But because there are no changes, the rest of the
values are 0s.
8
11/25/2022
Case 2: In the second case, we have a block with two different

uniform gray scale sections. There is a sharp change in the values of
the pixels (from 20 to 50). When we do the transformations, we get
a dc value as well as nonzero ac values. However, there are only a
few nonzero values clustered around the dc value.
Case 3: In the third case, we have a block that changes gradually.

That is, there is no sharp change between the values of neighboring
pixels. When we do the transformations, we get a dc value, with
many nonzero ac values also.
We can say the following:
❑ The transformation creates table T from table P.
❑ The dc value is the average value (multiplied by a constant) of the
pixels.
❑ The ac values are the changes.
❑ Lack of changes in neighboring pixels creates 0s.
9
11/25/2022
Quantization:
 After the T table is created, the values are quantized to reduce the
number of bits needed for encoding.
 Here, we divide the number by a constant and then drop the
fraction. This reduces the required number of bits even more.
 In most implementations, a quantizing table (8 by 8) defines how
to quantize each value.
 This is done to optimize the number of bits and the number of 0s
for each particular application.
 Note that the only phase in the process that is not completely
reversible is the quantizing phase.
 We lose some information here that is not recoverable.
 As a matter of fact, the only reason that JPEG is called lossy
compression, is because of this quantization phase.
1 1 2 4 8 16 32 64 160 95 60 23 6 4 0 0
1 1 2 4 8 16 32 64 90 85 40 15 5 3 0 0
2 2 2 4 8 16 32 64 56 48 32 12 5 2 0 0
4 4 4 4 8 16 32 64 15 11 10 7 4 3 0 0
8 8 8 8 8 16 32 64 5 3 2 1 1 0 0 0
16 16 16 16 16 16 32 64 3 2 2 1 1 0 0 0
32 32 32 32 32 32 32 32 0 0 0 0 0 0 0 0
64 64 64 64 64 64 64 64 0 0 0 0 0 0 0 0
Quantization table DCT coefficients
10
11/25/2022
Quantized coefficients
160 95 30 6 1 0 0 0
90 85 20 15 1 0 0 0
28 24 16 3 1 0 0 0
4 3 3 2 1 0 0 0
1 0 0 0 0 0 0 0
0 0 0 0 0 0 0 0
0 0 0 0 0 0 0 0
0 0 0 0 0 0 0 0
Compression: After quantization, the values are read from the table,
and redundant 0s are removed. However, to cluster the 0s together,
the table is read diagonally in a zigzag fashion.
Result after Compression: 20 15 15 12 17 15 0
11
11/25/2022
Video Compression: Moving Picture Experts Group (MPEG)

In principle, a motion picture is a rapid flow of a set of frames,
where each frame is an image.
In other words, a frame is a spatial combination of pixels, and a
video is a temporal combination of frames that are sent one after
another.
Compressing video, means spatially compressing each frame and
temporally compressing a set of frames.
Spatial Compression: The spatial compression of each frame is done
with JPEG. Each frame is a picture that can be independently
compressed.
Temporal Compression: In this, redundant frames are removed.
When we watch television, we receive 50 frames per second.
However, most of the consecutive frames are almost the same.
For example, when someone is talking, most of the frame is the
same as the previous one except for the segment of the frame
around the lips, which changes from one frame to another.
Moving Picture Experts Group-1 (MPEG-1)

It has 3 components: audio, video and system.
A 90kHz clock outputs the current time value to both the encoders
and propagated all the BPP
way to FPS
the receiver.
Quality Resolution Without MPEG-1
compression
VCR 640 480 24 25 368.64 Mbps 1.5Mbps
Audio signal Audio

Encoder Output
Clock Audio
Encoder
Video signal Video
Encoder
12
11/25/2022
To temporally compress data, the MPEG-1 method first divides

frames into three categories: I-frames, P-frames, and B-frames.
I-frames: An intracoded frame (I-frame) is an independent frame that

is not related to any other frame. They are present at regular
intervals (e.g., every ninth frame is an I-frame). An I-frame must
appear periodically to handle some sudden change in the frame that
the previous and following frames cannot show. Also, when a video is
broadcast, a viewer may tune at any time. If there is only one I-frame
at the beginning of the broadcast, the viewer who tunes in late will
not receive a complete picture. I-frames are independent of other
frames and cannot be constructed from other frames.
P-frames: A predicted frame (P-frame) is related to the preceding I-

frame or P-frame. In other words, each P-frame contains only the
changes from the preceding frame. The changes, however, cannot
cover a big segment. For example, for a fast-moving object, the new
changes may not be recorded in a P-frame. P-frames can be
constructed only from previous I- or P-frames. P-frames carry much
less information than other frame types and carry even fewer bits
after compression.
B-frames: A bidirectional frame (B-frame) is related to the preceding
and following I-frame or P-frame. In other words, each B-frame is
relative to the past and the future. Note that a B-frame is never
related to another B-frame.
MPEG has gone through two versions. MPEG1 was designed for a
CD-ROM with a data rate of 1.5 Mbps. MPEG2 was designed for high-
quality DVD with a data rate of 3 to 6 Mbps.
13
11/25/2022
MPEG-2 is intended for high data rates. It is multichannel coding

standard. It defines an audio coding standard at lower sampling
frequencies (16kHz, 22.05kHz, 24kHz).
Quality Resolution BPP FPS Without MPEG-2

compression
HDTV 1920 1080 24 60 2986 Mbps 25-30Mbps
TV 720 576 24 25 498Mbps 3-6Mbps
MPEG-1 and MPEG-2

Parameter MPEG-1 MPEG-2
Standardized 1992 1994
Main Digital video on CD-
Digital TV (and HDTV)
application ROM
Spatial SIF format (1/4 TV) TV (4xTV)
resolution 360x288 pixels 720x576 (1440x1152)
Temporal
25/30 fps 50/60 fps
resolution
Bit rate 1.5 Mbps 4 Mbps (20 Mbps)
Quality VHS NTSC/PAL for TV
Compression
20-30 30-40
ratio over PCM
28
14
11/25/2022
H.261
H.261 is a two-way communication over ISDN lines and supports data
rate in multiples of 64 KBPS.
H.261 defines a video encoder that is intended to be used to
compress video data that will be sent over Integrated Services Digital
Network (ISDN) lines.
The H.261 codec is intended primarily for use in video telephony and
video conferencing applications.
Image format: the images must be YCbCr images.
In terms of size, the images must adhere to either the Common
Interchange Format (CIF) or the Quarter-ClF (QCIF) format.
Table below indicates the widths and heights defined by these
formats.
Width Height
CIF 352 288
QCIF 176 144
Two types of frames are defined:

Intra-frame( I-frame):
• Treated as independent images
• Transform coding method similar to JPEG
Inter-frames (P-frames):
• Not independent
• Forward predictive coding method
Intra-frame coding:
15
11/25/2022
Inter-frame coding:
H.261 encoder
H.261 decoder
16
11/25/2022
• Macroblocks are of size 16 x 16 pixels for the Y frame, and 8 x 8

for Cb and Cr frames.
• A macroblock consists of four Y, one Cb, and one Cr 8 x 8 blocks.
• For each 8 x 8 block a DCT transform is applied, the DCT
coefficients then go through quantization zigzag scan and entropy
coding.
• Inter-frame (P-frame) Predictive Coding: P-frame coding scheme
is based on motion compensation.
• In motion compensation a search area is constructed/predicted
in previous frame to determine the best reference macroblock.
• After the prediction, a difference macroblock is derived to
measure the prediction error.
• Each of these 8x8 blocks go through DCT, quantization, zigzag
scan and entropy coding procedures.
• The P-frame coding encodes the difference macroblock.
17
11/25/2022
H.263
• Low-bit rate standard for teleconferencing applications
– Optimize H.261 so as to operate on below 64Kbps
– 2.5 times more compressed than H.261
– 5 image formats
– Motion-compensated prediction has been refined
– supports B frame(has only P frame as a reference)
• Used in IETF RTSP(Real Time Streaming Protocol)
18
11/25/2022
Multimedia transport across IP networks

Transmission of compressed multimedia over (IP) networks in real-
time – streaming.
Real-time Interactive Audio/Video characteristics:
Time Relationship
Jitter is introduced in real-time data by the delay between packets.
19
11/25/2022
Timestamp
One solution to jitter is the use of a timestamp.
If each packet has a timestamp that shows the time it was produced
relative to the first (or previous) packet, then the receiver can add
this time to the time at which it starts the playback.
Playback Buffer
To be able to separate the arrival time from the playback time, we
need a buffer to store the data until they are played back. The buffer
is referred to as a playback buffer.
20
11/25/2022
Ordering
In addition to time relationship information and timestamps for real-
time traffic, one more feature is needed. We need a sequence
number for each packet.
Translation
Sometimes real-time traffic needs translation. A translator is a
computer that can change the format of a high-bandwidth video
signal to a lower-quality narrow-bandwidth signal.
Mixing
If there is more than one source that can send data at the same time
(as in a video or audio conference), the traffic is made of multiple
streams. To converge the traffic to one stream, data from different
sources can be mixed. A mixer mathematically adds signals coming
from different sources to create one single signal.
Real-time Transport Protocol (RTP)

• RTP is designed to handle real-time traffic on the Internet.
• The main contributions of RTP are timestamping, sequencing,
and mixing facilities.
21
11/25/2022
RTP Packet Format
❑Ver: This 2-bit field defines the version number. The current
version is 2.
❑P: This 1-bit field, if set to 1, indicates the presence of padding at
the end of the packet. In this case, the value of the last byte in the
padding defines the length of the padding. There is no padding if the
value of the P field is 0.
❑X: This 1-bit field, if set to 1, indicates an extra extension header

between the basic header and the data. There is no extra extension
header if the value of this field is 0.
❑Contributor count: This 4-bit field indicates the number of
contributors. We can have a maximum of 15 contributors because a
4-bit field only allows a number between 0 and 15.
❑M: This 1-bit field is a marker used by the application to indicate
the end of its data.
❑Payload type: This 7-bit field indicates the type of the payload.
Several payload types have been defined so far.
22
11/25/2022
❑Sequence number: This field is 16 bits in length. It is used to

number the RTP packets. The sequence number of the first packet is
chosen randomly; it is incremented by 1 for each subsequent packet.
The sequence number is used by the receiver to detect lost or out of
order packets.
❑Timestamp: This is a 32-bit field that indicates the time relationship
between packets. The timestamp for the first packet is a random
number. For each succeeding packet, the value is the sum of the
preceding timestamp plus the time the first byte is produced. The
value of the clock tick depends on the application.
❑Synchronization source identifier: If there is only one source, this
32-bit field defines the source. However, if there are several sources,
the mixer is the synchronization source and the other sources are
contributors.
❑Contributor identifier: Each of these 32-bit identifiers defines a
source. When there is more than one source in a session, the mixer is
the synchronization source and the remaining sources are the
contributors.
Real-Time Transport Control Protocol(RTCP)

• RTP allows only one type of message, one that carries data from
the source to the destination.
• In many cases, there is a need for other messages which control
the flow and quality of data and allow the recipient to send
feedback to the source or sources.
• RTCP is designed for this purpose.
• RTCP has five types of messages. The number next to each box
defines the type of the message.
23
11/25/2022
Sender Report
• The sender report is sent periodically by the active senders in a
conference to report transmission and reception statistics for all
RTP packets sent during the interval.
• The sender report includes an absolute timestamp, which is the
number of seconds elapsed since midnight January 1, 1970.
• The absolute timestamp allows the receiver to synchronize different
RTP messages.
Receiver Report
• The receiver report is for passive participants, those that do not
send RTP packets.
• The report informs the sender and other receivers about the quality
of service.
Source Description Message
• The source periodically sends a source description message to give
additional information about itself.
• This information can be the name, e-mail address, telephone
number, and address of the owner or controller of the source.
Bye Message
• A source sends a bye message to shut down a stream.
• It allows the source to announce that it is leaving the conference.
Although other sources can detect the absence of a source, this
message is a direct announcement. It is also very useful to a
mixer.
Application-Specific Message
• The application-specific message is a packet for an application
that wants to use new applications (not defined in the standard).
• It allows the definition of a new message type.
24
11/25/2022
Distance Vector Multicast Routing Protocol(DVMRP)

 Unicast distance vector routing is very simple; extending it to
support multicast routing is complicated.
 Multicast routing does not allow a router to send its routing table
to its neighbors.
 The protocol constructs a routing tree for a network. Whenever a
router receives a packet, it forward it to some of its ports based on
the source address of the packet.
 The rest of the routing tree is made by downstream routers.
 It must prevent formation of loops in the network.
 It must prevent formation of duplicate packets.
 It must ensure that the path followed by the packet is the shortest
from source to destination.
 It should provide dynamic membership.
Flooding
• A router receives a packet and without even looking at the
destination group address, sends it out from every interface
except the one from which it was received.
• Flooding accomplishes the first goal of multicasting: every
network with active members receives the packet. However, so
will networks without active members. This is a broadcast, not a
multicast.
• There is another problem: it creates loops.
• A packet that has left the router may come back again from
another interface or the same interface and be forwarded again.
• Some flooding protocols keep a copy of the packet for a while
and discard any duplicates to avoid loops.
• The next strategy, reverse path forwarding, corrects this defect.
25
11/25/2022
Reverse Path Forwarding (RPF)

• To prevent loops, only one copy is forwarded; the other copies are
dropped.
• In RPF, a router forwards only the copy that has traveled the
shortest path from the source to the router.
• To find this copy, RPF uses the unicast routing table. The router
receives a packet and extracts the source.
• If the multicast packet has just come from the hop defined in the
table, the packet has traveled the shortest path from the source to
the router because the shortest path is reciprocal in unicast
distance vector routing protocols.
• If the path from A to B is the shortest, then it is also the shortest
from B to A.
• This strategy prevents loops because there is always one shortest
path from the source to the router. If a packet leaves the router
and comes back again, it has not traveled the shortest path..
26
11/25/2022
The shortest path tree as calculated by routers R1, R2, and R3 is

shown by a thick line. When R1 receives a packet from the source
through the interface m1, it consults its routing table and finds that
the shortest path from R1 to the source is through interface m1. The
packet is forwarded.
However, if a copy of the packet has arrived through interface m2, it
is dis-carded because m2 does not define the shortest path from R1
to the source. The story is the same with R2 and R3.
You may wonder what happens if a copy of a packet that arrives at
the m1 interface of R3 travels through R6, R5, R2, and then enters R3
through interface m1. This interface is the correct interface for R3.
Is the copy of the packet forwarded? The answer is that this scenario
never happens because when the packet goes from R5 to R2, it will
be discarded by R2 and never reaches R3.
The upstream routers toward the source always discard a packet that
has not gone through the shortest path, thus preventing confusion
for the downstream routers
Reverse Path Broadcasting (RPB)

RPF guarantees that each network receives a copy of the multicast
packet without formation of loops. However, RPF does not guarantee
that each network receives only one copy; a network may receive
two or more copies. The reason is that RPF is not based on the
destination address (a group address); forwarding is based on the
source address.
27
11/25/2022
Net3 in this figure receives two copies of the packet even though
each router just sends out one copy from each interface. There is
duplication because a tree has not been made; instead of a tree we
have a graph. Net3 has two parents: routers R2 and R4.
To eliminate duplication, we must define only one parent router for
each network.
A network can receive a multicast packet from a particular source
only through a designated parent router.
For each source, the router sends the packet only out of those
interfaces for which it is the designated parent. This policy is called
reverse path broad-casting (RPB).
RPB guarantees that the packet reaches every network and that
every network receives only one copy.
Reverse Path Multicasting (RPM)

RPB does not multicast the packet, it broadcasts it. This is not
efficient. To increase efficiency, the multicast packet must reach only
those networks that have active members for that particular group.
This is called reverse path multicasting(RPM). To convert
broadcasting to multicasting, the protocol uses two procedures,
pruning and grafting.
28
11/25/2022
Pruning: The designated parent router of each network is

responsible for holding the membership information. This is done
through the IGMP protocol. The process starts when a router
connected to a network finds that there is no interest in a multicast
packet. The router sends a prune message to the upstream router
so that it can prune the corresponding interface. That is, the
upstream router can stop sending multicast messages for this group
through that interface. Now if this router receives prune messages
from all downstream routers, it, in turn, sends a prune message to
its upstream router.
Grafting: What if a router at the bottom of the tree has sent a prune
message but suddenly realizes, through IGMP, that one of its
networks is again interested in receiving the multicast packet? It can
send a graft message. The graft message forces the upstream router
to resume sending the multicast messages.
29
11/25/2022
Real-Time Streaming Protocol(RTSP):

It is a control protocol designed to add more functionalities to the
streaming process. Using RTSP, we can control the playing of
audio/video.
RTSP example
30
11/25/2022
1. The HTTP client accesses the Web server using a GET message.
2. The information about the metafile comes in the response.
3. The metafile is passed to the media player.
4. The media player sends a SETUP message to create a connection
with the media server.
5. The media server responds.
6. The media player sends a PLAY message to start playing
(downloading).
7. The audio/video file is downloaded using another protocol that
runs over UDP.
8. The connection is broken using the TEARDOWN message.
9. The media server responds.
Major methods:
Setup: Setup requests specify how a media stream must be
transported before a play request is sent.
Play: A play request starts the media transmission by telling the
server to start sending the data.
Pause: Pause requests temporarily halt the stream delivery.
Teardown: This request terminates the session entirely and stops
all media streams.
Additional methods:
Options: get available methods
Describe: A describe request identifies the URL and type of data.
Announce: change description of media object
Record: A record request initiates a media recording.
Redirect: Redirect requests inform the client that it must connect
to another server by providing a new URL for the client to issue
requests to.
31
11/25/2022
Voice over IP(VoIP):

The idea is to use the Internet as a telephone network with some
additional capabilities. Instead of communicating over a circuit-
switched network, this application allows communication between
two parties over the packet-switched Internet.
Two protocols have been designed to handle this type of
communication: SIP and H.323.
Session Initiation Protocol (SIP):
It was designed by IETF. It is an application layer protocol that
establishes, manages, and terminates a multimedia session (call).
It can be used to create two-party, multiparty, or multicast sessions.
SIP is designed to be independent of the underlying transport layer.
1. The caller initializes a session with the INVITE message.

2. After the receiver answers the call, the caller sends an ACK
message for confirmation.
3. The BYE message terminates a session.
4. The OPTIONS message queries a machine about its capabilities.
5. The CANCEL message cancels an already started initialization
process.
6. The REGISTER message makes a connection when the receiver is
not available.
Addresses
In a regular telephone communication, a telephone number
identifies the sender, and another telephone number identifies the
receiver.
SIP is very flexible. In SIP, an e-mail address, an IP address, a
telephone number, and other types of addresses can be used to
identify the sender and receiver.
32
11/25/2022
SIP formats
Simple Session: A simple session using SIP consists of 3 modules:

1. Establishing
2. Communicating
3. Terminating
INVITE: address, options
Establishing OK: address
ACK
Communicating Exchanging audio
Terminating BYE
Tracking the Callee

SIP has a mechanism that finds the IP address of the terminal at
which the callee is sitting.
To perform this tracking, SIP uses the concept of registration. SIP
defines some servers as registrars.
At any moment an user is registered with at least one registrar
server; this server knows the IP address of the callee.
When a caller needs to communicate with the callee, the caller can
use the e-mail address instead of the IP address in the INVITE
message.
The message goes to a proxy server. The proxy server sends a lookup
message (not part of SIP) to some registrar server that has registered
the callee.
When the proxy server receives a reply message from the registrar
server, the proxy server takes the caller’s INVITE message and inserts
the newly discovered IP address of the callee. This message is then
sent to the callee.
33
11/25/2022
Tracking the callee
INVITE
Lookup
Reply
INVITE
OK
OK
ACK
ACK
Exchanging audio
BYE
H.323
H.323 is a standard designed by ITU to allow telephones on the
public telephone network to talk to computers (called terminals in
H.323) connected to the Internet.
A gateway connects the Internet to the telephone network.

A gateway is a five-layer device that can translate a message from
one protocol stack to another. It transforms a telephone network
message into an Internet message.
The gatekeeper server on the local area network plays the role of
the registrar server.
34
11/25/2022
H.323 protocols
35
11/25/2022
Operation:
1. The terminal sends a broadcast message to the gatekeeper. The
gatekeeper responds with its IP address.
2. The terminal and gatekeeper communicate, using H.225 to
negotiate bandwidth.
3. The terminal, the gatekeeper, gateway, and the telephone
communicate using Q.931 to set up a connection.
communicate using H.245 to negotiate the compression method.
5. The terminal, gateway, and the telephone exchange audio using
RTP under the management of RTCP.
communicate using Q.931 to terminate the communication
IPTV:
The term IPTV is the short form of Internet Protocol Television.
The technology is used to deliver digital television service using
existing IP network.
IPTV services include video on demand, live TV and time shifted
programming.
36
11/25/2022
The IPTV can be viewed on display either PC monitor or TV set

using Set top box (STB) device.
It requires broadband connection to access IPTV services.
The STB helps to access desired channels as per subscription
services as per demand.
It helps to obtain interactive multimedia services over end to end
secured connection with desired QoS.
IPTV Advantages
Following are the features advantages of IPTV technology:
• IPTV system works on the existing internet connection and it
needs just IPTV set top box to avail the services.
• Latest versions of IPTV STB supports wifi and hence it works
wirelessly without the need of cumbersome wiring.
• As IPTV is 100% digital, it is providing good quality of picture. It is
slowly replacing the older analog TV sets.
• Multiple programs can be viewed simultaneously on the same
display.
• It supports other applications in addition to video on demand such
as data, voice and mobile telephony.
• The programs to be viewed can be stored on the storage servers
and can be viewed as per schedule decided by the user with just
clock of a IPTV remote.
• It helps to generate revenue for the service providers by way of
displayed advertisements.
37

Chapter 4 Multimedia

Uploaded by

Copyright:

Available Formats

You might also like

Chapter 4 Multimedia

Uploaded by

Document Information

Copyright

Available Formats

Share this document

Share or Embed Document

Sharing Options

Did you find this document useful?

Is this content inappropriate?

Copyright:

Available Formats

Chapter 4 Multimedia

Uploaded by

Copyright:

Available Formats

11/25/2022

Multimedia Information Representation

Streaming stored audio/video refers to on-demand requests for

Interactive audio/video refers to the use of the Internet for

Some sounds can mask other sounds. Masking can happen in

 MP3 uses these two phenomena, frequency and temporal

Image Compression: JPEG

 The purpose of dividing the picture into blocks is to decrease the

 Discrete Cosine Transform (DCT): each block of 64 pixels goes

Case 2: In the second case, we have a block with two different

Case 3: In the third case, we have a block that changes gradually.

Quantization table DCT coefficients

Result after Compression: 20 15 15 12 17 15 0

Video Compression: Moving Picture Experts Group (MPEG)

Moving Picture Experts Group-1 (MPEG-1)

Audio signal Audio

To temporally compress data, the MPEG-1 method first divides

I-frames: An intracoded frame (I-frame) is an independent frame that

P-frames: A predicted frame (P-frame) is related to the preceding I-

MPEG-2 is intended for high data rates. It is multichannel coding

Quality Resolution BPP FPS Without MPEG-2

MPEG-1 and MPEG-2

Two types of frames are defined:

• Macroblocks are of size 16 x 16 pixels for the Y frame, and 8 x 8

Multimedia transport across IP networks

Jitter is introduced in real-time data by the delay between packets.

Real-time Transport Protocol (RTP)

RTP Packet Format

❑X: This 1-bit field, if set to 1, indicates an extra extension header

❑Sequence number: This field is 16 bits in length. It is used to

Real-Time Transport Control Protocol(RTCP)

Distance Vector Multicast Routing Protocol(DVMRP)

Reverse Path Forwarding (RPF)

The shortest path tree as calculated by routers R1, R2, and R3 is

Reverse Path Broadcasting (RPB)

Reverse Path Multicasting (RPM)

Pruning: The designated parent router of each network is

Real-Time Streaming Protocol(RTSP):

Voice over IP(VoIP):

1. The caller initializes a session with the INVITE message.

Simple Session: A simple session using SIP consists of 3 modules:

Communicating Exchanging audio

Tracking the Callee

Tracking the callee

A gateway connects the Internet to the telephone network.

The IPTV can be viewed on display either PC monitor or TV set

You might also like