Internet TV Architecture Based on Scalable Video Coding

Conference Paper · January 2011

Pedro G. Moscoso
Instituto Superior Técnico
Lisboa, Portugal
pedro.moscoso@ist.utl.pt

Rui S. Cruz
Instituto Superior Técnico
INESC-ID/INOV
Lisboa, Portugal
rui.s.cruz@ist.utl.pt

Mário S. Nunes
Instituto Superior Técnico
INESC-ID/INOV
Lisboa, Portugal
mario.nunes@inov.pt

ABSTRACT
The heterogeneity of the Internet raises several problems for the distribution of multimedia content. This paper starts by introducing those problems and briefly reviewing the main approaches used to mitigate them, and then presents a novel web-based Adaptive Video Streaming solution prototype supporting Scalable Video Coding (SVC) techniques. The proposed solution, which focuses on the client side, incorporates quality control mechanisms to maximize the quality experienced by the end user, is suitable for Interactive Internet TV Architectures, and cooperates with Content Distribution Networks (CDNs) and Peer-to-Peer (P2P) web-streaming environments.

Categories and Subject Descriptors
C.2.1 [Computer-Communication Networks]: Network Architecture and Design - Network communications

General Terms
Algorithms, Performance, Measurement

Keywords
Internet video streaming, adaptive streaming, H.264/AVC, scalable video coding, peer-to-peer

1. INTRODUCTION
The growth and expansion of the broadband Internet and the increasing number of connected devices with multimedia content play-out capabilities, notably the recent handheld devices (smartphones, tablets) supporting high definition video [14], raise several problems for the distribution of multimedia content. From the end user device side, a wide range of networked devices with different characteristics and capabilities, such as screen size and resolution, Central Processing Unit (CPU) power, operating system and media player applications, poses big challenges to the distribution of content, which can hardly be scaled for all of those devices. From the access network provider side, the heterogeneous nature of connectivity to the Internet also does not guarantee uniform and stable reception conditions at the end user devices at any moment in time. From the content producer and/or broadcaster side, it is important that multimedia content reaches the end users with the best possible quality while not wasting precious resources. These situations have been the main drivers for recently developed techniques that aim to reduce the mentioned problems in order to optimize multimedia distribution and resource usage. Companies like Apple, Microsoft and Adobe already provide mechanisms, more or less transparent to the end user, able to support dynamic variations in video streaming quality while ensuring support for a plethora of end user devices and network connections. For these techniques to work, the original content is re-encoded with different qualities and bitrates [1, 10, 15].

With the SVC extension to the H.264/AVC standard for video coding [13], however, new approaches are possible for video distribution and consumption, as videos can now be encoded in nested dependent layers corresponding to several hierarchical levels of quality (i.e., higher layers refine the quality of the video of lower layers), without requiring multiple encoded versions (bitrates/quality levels) of the same content. The base layer (i.e., the first layer) is required for an SVC video to be decoded and played out, but this layer corresponds to a low definition, H.264/AVC compatible video with acceptable quality and low bitrate, to which higher layers, if successfully received, can be added to produce a higher quality, higher definition video in the spatial (i.e., image resolution), temporal (i.e., frame rate) or Signal-to-Noise Ratio (SNR) dimensions. Almost all techniques involving SVC are based on packetized bitstream transmission methods. The authors in [5] propose a Raptor coding scheme to protect the transmission of SVC in order to improve the distribution of video over lossy networks. A similar approach is proposed in [16], but with a network-assisted adaptive Forward Error Correction (FEC) scheme. In [11] the authors present a survey of several P2P systems supporting bitstream mode SVC. In [4] SVC is combined with Multiple Description Coding (MDC) to alleviate packet loss in P2P overlay multicast. A packet scheduling mechanism, also for P2P streaming, is proposed in [7]. A prioritized scheduling mechanism, but for a chunk-based P2P transmission approach, is proposed in [8].

This paper presents and evaluates the architecture of a web-based Adaptive Video Streaming solution prototype, supporting scalable video coding techniques, suitable for Interactive Internet TV Architectures, that aims to maximize the quality experienced by the end user. The proposed architecture, being developed by the authors under the scope of the European Project SARACEN [12], can be seamlessly integrated with web oriented CDNs, allowing the media to be easily cached along the network, and used in a P2P web-streaming environment.

Section 2 gives an overview of the proposed architecture, Section 3 analyses the results from the evaluation of the prototype and Section 4 concludes the paper.

2. ARCHITECTURE OVERVIEW
The Internet TV distribution network architecture considers end user nodes and serving platforms. The end user nodes are distributed peers (with P2P capabilities) that can produce, consume and share contents, offering their resources (bandwidth, processing power, storage capacity) to other end user nodes. The serving platforms are centralized service nodes providing control (tracker for P2P), content treatment and distribution (transcoders and media servers), as well as interaction tools and facilities. The architecture is a multi-source Hypertext Transfer Protocol (HTTP) client and server solution providing an advanced form of Web-Streaming and WebSeeding (HTTP based P2P Streaming Protocols) [3]. The process used for streaming distribution relies on a chunk transfer scheme (instead of a bitstream) whereby the original video data is chopped into small video chunks with a short duration (typically two seconds). The chunk-based streaming protocols allow the deployment of a distribution network compatible with the Internet infrastructure, such as Web caches and CDNs, as well as P2P distribution. The description of the serving platforms of the Internet TV distribution network is not covered in this paper (except for a brief description of the partition system), as its focus is on the client side, for a solution able to consume Video On Demand (VoD) and Live video services, supporting multiple device types and resolutions with adaptive streaming mechanisms based on the SVC extension of H.264/AVC. The overview of the architecture focuses on the following components:

• The Partition system, which is responsible for splitting the SVC video files into chunks and then into several layer files.
• The Adaptation System, which requests the video with the maximum possible quality.
• The Reassembler System, which rebuilds the video file to a given level of quality.
• The Media Player, which plays the SVC video.

The first component, the Partition system, is typically deployed in a centralized heavy-duty SVC encoder appliance. The other components are for the end user client.

2.1 Partition system
The Chunk Partitioner (Figure 1) encodes the SVC video file in a set of independent chunks that can be played independently, each one with 2 s duration. The chunks are delimited by Supplemental Enhancement Information (SEI) Network Abstraction Layer (NAL) units, starting when a SEI NAL unit appears in the encoded file and ending in the NAL unit that precedes the next SEI NAL unit. This feature is the basis for the support of Live video streaming, as it makes the moment when a user joins the stream independent of the media timeline (a user can join at the current timeline and start watching the stream, in the worst case, after 2 seconds plus the time required to fill the play-out buffer). The Intra-Chunk Layer Partitioner splits the SVC chunk files into several transmission layers, as illustrated in Figure 1.

Figure 1: Layer creation process

This process begins with the demultiplexing of the NALs and with the assignment of an ID to each of them. This is very important for the reconstitution of the original bitstream on the client side. The identification is done by inserting a 2-byte Sequence Numbering field between the start code (0001) and the beginning of the NAL unit (Figure 2).

Figure 2: Identification of NAL order

Layer separation is done by using the three identifiers that exist on SVC NALs: the Dependency (DID), Quality (QID) and Temporal (TID) IDs. After the partition of the video file into layer files, an index file (Manifest) of the content is created. The manifest file holds information about the content, i.e., it describes the structure of the media, namely, the codecs used, the chunks, the number of layers, the audio component, etc., and is a Well-Formed XML Document encoded as double-byte Unicode.

2.2 Adaptation System
The Adaptation System is responsible for the adjustments in video quality by determining the number of layers to request from the serving nodes, based on a set of heuristics related to network and host system conditions. Network conditions, such as bandwidth and Round-Trip Time (RTT), are continuously measured, with their averages used as smoothing factors to prevent abrupt changes in the quality of the video. This ensures that the variations between layers are smooth and cause an almost imperceptible impact on the user viewing experience. For the host system condition, the heuristics are related to the Screen Resolution and the CPU Load, and the system always uses the lowest values returned by the metrics. Additionally, the download time of the several layers of each chunk is limited to 2 s to prevent pauses and re-buffering.
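The heuristics of the Adaptation System (Section 2.2) can be illustrated with a minimal Python sketch. It is not the prototype's implementation: the exponentially weighted moving average, the one-round-trip-per-layer cost, the CPU load threshold, and all names (`Ewma`, `layers_for_network`, `layers_for_screen`, `layers_for_cpu`, `choose_layers`) are assumptions made for illustration. Only the 2 s download limit and the rule of taking the lowest value among the metrics come from the text.

```python
from dataclasses import dataclass
from typing import Optional, Sequence

CHUNK_BUDGET_S = 2.0  # download-time limit per chunk (from Section 2.2)


@dataclass
class Ewma:
    """Exponentially weighted moving average (assumed smoothing method)."""
    alpha: float = 0.25           # hypothetical smoothing weight
    value: Optional[float] = None

    def update(self, sample: float) -> float:
        if self.value is None:
            self.value = sample
        else:
            self.value = self.alpha * sample + (1 - self.alpha) * self.value
        return self.value


def layers_for_network(layer_bytes: Sequence[int], bandwidth_bps: float,
                       rtt_s: float, budget_s: float = CHUNK_BUDGET_S) -> int:
    """Largest n such that layers 1..n are estimated to download within the
    budget. The base layer is always requested, even if it exceeds the budget."""
    n, total_bits = 0, 0
    for size in layer_bytes:
        total_bits += size * 8
        # Assumption: one HTTP request round trip per layer file.
        estimate = total_bits / bandwidth_bps + (n + 1) * rtt_s
        if estimate > budget_s and n >= 1:
            break
        n += 1
    return n


def layers_for_screen(screen_height: int, layer_heights: Sequence[int]) -> int:
    """Highest layer whose frame height fits the screen (at least layer 1)."""
    fitting = [i + 1 for i, h in enumerate(layer_heights) if h <= screen_height]
    return max(fitting, default=1)


def layers_for_cpu(cpu_load: float, total_layers: int,
                   high_load: float = 0.85) -> int:
    """Hypothetical rule: under heavy CPU load, request only half the layers."""
    return total_layers if cpu_load < high_load else max(1, total_layers // 2)


def choose_layers(layer_bytes: Sequence[int], layer_heights: Sequence[int],
                  bandwidth_bps: float, rtt_s: float,
                  screen_height: int, cpu_load: float) -> int:
    """Use the lowest value returned by the individual metrics."""
    return min(
        layers_for_network(layer_bytes, bandwidth_bps, rtt_s),
        layers_for_screen(screen_height, layer_heights),
        layers_for_cpu(cpu_load, len(layer_bytes)),
    )


if __name__ == "__main__":
    sizes = [40_000] * 10              # illustrative bytes per layer file
    heights = [288] * 5 + [384] * 5    # illustrative frame heights per layer
    bandwidth = Ewma()
    for sample in (10e6, 256e3):       # measured throughput samples in bit/s
        smoothed = bandwidth.update(sample)
        print(choose_layers(sizes, heights, smoothed, 0.01, 1080, 0.2))
```

In this sketch the smoothed bandwidth, rather than the latest sample, feeds the decision, which is one way to obtain the gradual layer changes described above. A real client would take the layer sizes and resolutions from the Manifest instead of the illustrative constants used here.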
2.3 Reassembler System
The Reassembler re-creates the independent video chunk file from the received layer files (containing several NALs identified by unique IDs that provide the order of the NAL in the final video chunk). This video chunk is then sent to the SVC Media Player. Figure 3 illustrates the reassembly chain used for P2P or client-server streaming methods.

Figure 3: The SVC reassembly chain

2.4 Client Video Player
The end user client media player can be either a platform-specific software client to deliver audio-visual content to the user in a variety of formats, a Web browser plug-in embedded into an HTML5 document, or a WebApp targeted to mobile smartphones, providing the user interface and content playback functionalities. The architecture of the client provides not only a client side but also a peer serving side.

The client side includes a local HTTP process that also supports standard client-server downloading and streaming via the HTTP protocol. The local HTTP process listens at a local port to redirect HTTP GET or POST methods, initiated from either the local web browser or the application Graphical User Interface (GUI), to either the P2P engine or the appropriate external Web server, basing its decision on information taken from the Manifest of the content. The client media player supports several codecs, including SVC decoding [9]. As illustrated in Figure 4, the video playback can be made directly in the browser video canvas (for the browser plug-in version), making it easier to integrate P2P based video delivery into Web based distribution mechanisms. The client serving side back-end component is a software daemon that embeds the P2P core engine and an HTTP server to exchange data across the network. Acting as the underlying transport layer for all video and audio streams, as well as accompanying metadata related to the stream content and Quality of Service (QoS) metrics, the daemon communicates directly with other P2P nodes, appropriate external Web servers, local video codecs, and the browser plug-in via a standard Javascript API (JSAPI). The back-end component lets the P2P core engine and the HTTP server run in the background regardless of whether the front-end interface is running.

Figure 4: Web browser plugin prototype interface

3. EVALUATION RESULTS
In order to evaluate the prototype solution, a network scenario was prepared, using only a client-server mode, in which the available bandwidth for the client systems could be artificially adjusted. The HTTP web streaming server contained the SVC encoded videos, either as stored contents or as real-time encoded media chunk streams (simulating a Live TV program), together with the corresponding manifest files. The web streaming server had a public address accessible from the Internet. The SVC video used in the evaluation was encoded with ten layers for two spatial scalability levels, with the first five layers at Common Intermediate Format (CIF) resolution and the other layers at Double Common Intermediate Format (DCIF) resolution. The Peak Signal-to-Noise Ratio (PSNR) of the video was analyzed for the first 200 frames. The results are plotted in Figure 5, where each numbered Layer PSNR line corresponds to the number of layers combined in the video.

Figure 5: PSNR of a test video (y-axis: PSNR in dB, 30 to 55; x-axis: Frame, 0 to 200; one curve per layer combination, Layer 1 to Layer 9)

The metrics used in each test measured the Bandwidth, Network Load, RTT, Cache size and PSNR. For each relevant test the layer variation during streaming was also collected. As a reference score for the perceived quality of the received media after compression and/or transmission, the following relationship between PSNR and Mean Opinion Score (MOS) was used during the analysis of the results (Table 1).

Table 1: Possible PSNR to MOS conversion [2]

PSNR (dB)   MOS
> 37        5 (Excellent)
31-37       4 (Good)
25-31       3 (Fair)
20-25       2 (Poor)
< 20        1 (Bad)

The bandwidth variation, from 10 Mbit/s to 256 kbit/s, allowed testing the adaptability of the solution to fluctuations in network capacity. The lower limit of 256 kbit/s corresponded to the minimum throughput required to play out a video without pauses or re-buffering. The tests started with the maximum
bandwidth until t = 120 s, when the bandwidth was suddenly dropped to the minimum (Figure 6). The results show, for t < 120 s, that the system does not occupy a constant bandwidth during the video stream, but has a spiky nature due to the small size of the video chunks (of 2 s), which are downloaded faster than their play-out duration. The RTT value (Figure 7) was also fairly small during that first period (around 10 ms) and the video could be watched with maximum quality. At the instant t = 120 s, the system detects the variation in network conditions and automatically adapts the number of layers to be requested to the available bandwidth. From that moment onwards, the system uses the full available bandwidth continuously, and still manages, at a few moments, to download up to layer level 4, due to the variable size of the layer files (above the base layer) of each chunk.

Figure 6: Network load and number of layers

Figure 7: Round-Trip Time (RTT)

4. CONCLUSION
This paper describes the architecture of an SVC adaptive streaming solution, using HTTP with P2P capabilities for chunk-based media transport, suitable for heterogeneous network environments. The solution incorporates quality control mechanisms to allow video play-out without pauses, stuttering or image artifacts, by smoothly minimizing variations from network and system conditions. This system is compatible with H.264/AVC and supports real-time streaming.

4.1 Acknowledgments
The research leading to these results has received funding from the European Union's Seventh Framework Programme ([FP7/2007-2013]) under grant agreement no. ICT-248474. The authors would like to thank Jânio Monteiro for all his support and expertise in the Scalable Video Coding area [6].

5. REFERENCES
[1] Adobe. Live dynamic streaming.
[2] C.-O. Chow and H. Ishii. Enhancing real-time video streaming over mobile ad hoc networks using multipoint-to-point communication. Computer Communications, 30:1754–1764, Jun. 2007.
[3] R. S. Cruz, M. S. Nunes, C. Patrikakis, and N. Papaoulakis. SARACEN: A platform for adaptive, socially aware multimedia distribution over P2P networks. In Proceedings of the 4th IEEE Workshop on Enabling the Future Service-Oriented Internet: Towards Socially-Aware Networks, GLOBECOM 2010, Dec. 2010.
[4] F. de Asís López-Fuentes. Adaptive Mechanism for P2P Video Streaming Using SVC and MDC. In Proceedings of the 2010 International Conference on Complex, Intelligent and Software Intensive Systems, CISIS'10, pages 457–462, Feb. 2010.
[5] J. Monteiro, C. Calafate, and M. Nunes. Robust multipoint and multi-layered transmission of H.264/SVC with Raptor codes. Telecommunication Systems, pages 1–16, 2010.
[6] J. F. Monteiro. Quality Assurance Solutions for Multipoint Scalable Video Distribution over Wireless IP Networks. PhD thesis, Instituto Superior Técnico - Universidade Técnica de Lisboa, Dec. 2009.
[7] M. Mushtaq and T. Ahmed. Smooth Video Delivery for SVC Based Media Streaming Over P2P Networks. In Proceedings of the 5th IEEE Consumer Communications and Networking Conference, CCNC '08, pages 447–451, Jan. 2008.
[8] R. P. Nunes, R. S. Cruz, and M. S. Nunes. Scalable Video Distribution in Peer-to-Peer Architecture. In Proceedings of the 10a Conferência sobre Redes de Computadores, CRC'10, Nov. 2010.
[9] OpenSVC. OpenSVC Decoder, 2011.
[10] R. Pantos. HTTP Live Streaming. Internet-Draft draft-pantos-http-live-streaming-05, Internet Engineering Task Force, Nov. 2010. Work in progress.
[11] N. Ramzan, E. Quacchio, T. Zgaljic, S. Asioli, L. Celetto, E. Izquierdo, and F. Rovati. Peer-to-Peer Streaming of Scalable Video in Future Internet applications. IEEE Communications Magazine, 49(3):128–135, Mar. 2011.
[12] SARACEN Consortium. SARACEN: Socially Aware, collaboRative, scAlable Coding mEdia distributioN project Home Page, 2011.
[13] H. Schwarz, D. Marpe, and T. Wiegand. Overview of the Scalable Video Coding Extension of the H.264/AVC Standard. IEEE Transactions on Circuits and Systems for Video Technology, 17(9):1103–1120, Sep. 2007.
[14] Wikipedia. Apple's iPad, 2010.
[15] A. Zambelli. IIS Smooth Streaming Technical Overview. Microsoft Corporation, March 2009.
[16] X. Zhu, R. Pan, N. Dukkipati, V. Subramanian, and F. Bonomi. Layered Internet Video Engineering: Network-Assisted Bandwidth Sharing and Transient Loss Protection for Scalable Video Streaming. In Proceedings of the IEEE INFOCOM'10, pages 1–5, 2010.
