
BASIC CONCEPTS

MULTIMEDIA APPLICATION
This is an application which uses a collection of multiple media sources e.g. text, graphics,
images, sound/audio, animation and video.

HYPERTEXT
This is a text which contains links to other media.

HYPERMEDIA
This is media that is linked to other media.

MULTIMEDIA SYSTEM
Its a system that is capable of processing multimedia data and applications. Its characterized by
the processing, storage, generation, manipulation and rendition of multimedia information.

CHARACTERISTICS OF A MULTIMEDIA SYSTEM


 Must be computer controlled

 They are integrated

 The information they handle must be represented digitally

 The interface to the final presentation of the media is usually interactive

CHALLENGES FOR MULTIMEDIA SYSTEMS


1. Supporting multimedia applications over a computer network renders the application
distributed. This involves many special computing techniques which can be a challenge.

2. A multimedia system may have to render a variety of media at the same instant. This means
a temporal relationship between many forms of media must exist. This introduces 2 challenges:

 Sequencing – within the media, i.e. playing frames in the correct order/time frame
in the video

 Synchronization – lip synchronization is clearly important for humans watching
playback of video and audio

3. Data has to be represented digitally, so the initial source of data needs to be digitized. This
may involve scanning for still images and sampling for audio and video. NB: Digital cameras
now exist that are able to capture images and video in digital form direct from the source.

4. The data is large, i.e. for audio and video, and therefore storage, transfer (bandwidth) and
processing overheads are high. Data compression techniques are commonly used.

DESIRABLE FEATURES OF A MULTIMEDIA SYSTEM


1. Very high processing power - This is needed to deal with large data processing and real
time delivery of media

2. Multimedia capable file system – This is needed to deliver real time media e.g. video/audio
streaming

3. Efficient and fast input/output – the input/output to the file subsystem needs to be
efficient and fast. It should allow for real time recording as well as playback of data.

4. Special operating system – this is to allow access to the file system and to process data
efficiently and quickly. It may additionally need to support direct transfers to disk, real
time scheduling, fast interrupt processing, input/output streaming etc.

5. Storage and memory – large storage units, usually in the order of gigabytes, are required

6. Network support – client server systems and distributed systems are commonly used

7. Software tools – user friendly tools are needed to handle the media, design and develop
applications as well as to deliver the media.

COMPONENTS OF MULTIMEDIA SYSTEM


1. Capture devices – these include video cameras, video recorders, microphones, keyboards,
mice, graphics tablets and VR devices.

2. Storage devices – these include hard disks, CD-ROMs, DVDs, memory cards and flash disks

3. Communication networks – these include intranets, the internet, Ethernet and token ring

4. Computer systems – workstations, multimedia desktop machines

5. Display devices – CD-quality speakers, high resolution monitors, color printers.

APPLICATIONS OF MULTIMEDIA SYSTEMS


1. World wide web

2. Video conferencing

3. Shopping

4. Farming

5. Security systems

6. Education and training

7. Games

8. Transport systems e.g. Uber

MULTIMEDIA AUTHORING SYSTEMS


An authoring system is a program which has pre-programmed elements for the development of
interactive multimedia software titles.

Authoring systems vary widely in orientation, capabilities and learning curve.

There is no such thing as a completely point-and-click automated authoring system; some
knowledge/thinking and algorithm design is necessary.

Authoring is actually just a speedy form of programming

WHY USE AN AUTHORING SYSTEM?
1. It generally takes very little time to develop an interactive multimedia project, such as a
computer-based training program, in an authoring system as opposed to programming it in
compiled code.

2. Reduced development time means reduced programmer cost.

3. Using an authoring system allows increased re-use of code.

MULTIMEDIA AUTHORING PARADIGMS


An authoring paradigm/metaphor is the methodology by which the authoring system
accomplishes its tasks. These paradigms include:

1. Scripting language

 This is the authoring method that is closest in form to traditional programming.

 The paradigm is that of a programming language which specifies (by file name) the
multimedia elements, sequences, hot spots and synchronization.

 Scripting languages vary a lot in their development time.

 Generally they tend to take longer to develop (it takes longer to code an individual
interaction) but they generally allow more powerful interactivity.

2. Iconic/Flow control

 This tends to be the fastest authoring style in development time

 It’s best suited for rapid prototyping and short development time projects

 The core of this paradigm is an icon palette, containing the possible
functions/interactions of a program, and a flow line which shows the actual links
between the icons

 These programs tend to have the slowest run times because each interaction carries
with it all of its possible permutations. They are extremely powerful but suffer speed
problems (slow)

3. Frame

 This is similar to iconic/flow control in that it usually incorporates an icon
palette. However, the links shown between icons are conceptual and don't always
represent the actual flow of the program.

 It’s a very fast development system but usually requires a good auto debugging
function

4. Card scripting

 The paradigm provides a great deal of power via an incorporated scripting language.

 It is excellently suited for hypertext applications and navigation-intensive
applications.

 Many entertainment applications are prototyped in this paradigm prior to
compiled-language coding

5. Cast/score scripting

 This paradigm uses a music score as its primary authoring metaphor

 The true power of this metaphor lies in its ability to script the behavior of each of the
cast members

 These programs are best suited for animation intensive or synchronized media
applications

 They are easily extensible to handle other functions.

6. Hierarchical object

 The paradigm uses an object metaphor, like OOP, which is usually represented by
embedded objects and iconic properties

 The visual representation of objects can make very complicated constructions
possible

7. Hypermedia linkage

 The paradigm is similar to the frame paradigm as it shows conceptual links between
elements; however it lacks the frame paradigm’s visual linkage metaphor

8. Tagging

 Uses tags in text files e.g. Standard Generalised Mark-up Language (SGML), HTML, SMIL
(Synchronised Multimedia Integration Language) etc.

 Linked pages provide interactivity and integrate multimedia elements

MULTIMEDIA AUTHORING STAGES


When a multimedia application is produced via an authoring system, the author goes through
several stages:

1. Concept

 This involves identifying the application audience, application type (presentation,
interaction), the application purpose (to inform, entertain, teach) and the general
subject matter.

 At this stage the authoring system cannot help.

2. Design

 The style and content of the application must be specified. The design should include
enough detail so that the stages that follow can be carried out by the authoring
system without interruption

 Design parameters are entered into the authoring system.

 The authoring system can take over the task of documenting the design and keeping
information for the next steps.

 The other task is to decide which data files will be needed in the application e.g.
audio, video and image files.

 A list of the materials required should be generated.

3. Content Collection

 Content material is collected and entered into the authoring system.

 This may include taking pictures, making video clips and producing audio
sequences

4. Assembly

 The entire application is put together. The screens are defined and placed together in
an orderly manner and the presentation is ready to run.

5. Testing

 The created application must be tested. Sophisticated authoring systems may
provide advanced features such as tracing the program flow.

DATA COMPRESSION
REASONS FOR COMPRESSION
1. Storage capacity
Uncompressed graphics, audio and video data require considerable storage capacity
which in the case of uncompressed video is often not feasible

2. Bandwidth
Data transfer of uncompressed video data over digital networks requires very high
bandwidth to be provided for a single point to point communication.

3. Cost
To provide feasible and cost-effective solutions, data compression is important.

CODING REQUIREMENTS
Images have considerably higher storage requirements than text. Audio and video have even
more demanding properties for data storage. To compare the data storage and bandwidth
requirements of different visual media, the following specifications are used, based on a
typical window of 640 by 480 pixels on a screen:

i. For the representation of text, 2 bytes are used for each character, thus allowing for the
representation of different language variants. Each character is displayed using an 8 x 8
pixel block, which is sufficient for the display of ASCII characters.

ii. For the representation of vector graphics, a typical still image is composed of 500 lines.
Each line is defined by its horizontal position, its vertical position and an 8-bit attribute field.
The horizontal axis is represented using 10 bits and the vertical axis using 9 bits.

iii. In a very simple color display mode, a single pixel of a bitmap can be represented by one
of 256 different colors
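As a quick worked comparison, these figures translate into the following storage requirements. The per-line bit budget used for vector graphics (start and end coordinates plus the attribute field) is our assumption about how each of the 500 lines is encoded:

```python
# Storage for one 640 x 480 window under the three representations above.

# (i) Text: one character per 8 x 8 pixel block, 2 bytes per character.
chars = (640 // 8) * (480 // 8)          # 80 * 60 = 4800 characters
text_bytes = chars * 2

# (ii) Vector graphics: 500 lines; assumed layout per line is start and
# end coordinates (10 bits horizontal, 9 bits vertical) + 8-bit attribute.
bits_per_line = 2 * (10 + 9) + 8
vector_bytes = 500 * bits_per_line // 8

# (iii) Simple bitmap: 256 colors -> 1 byte per pixel.
bitmap_bytes = 640 * 480

print(text_bytes, vector_bytes, bitmap_bytes)   # -> 9600 2875 307200
```

Even this small window shows why bitmap data dominates: the bitmap needs exactly 32 times the storage of the text representation.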

Compression in multimedia systems is subject to certain constraints:

i. The quality of the coded and later on decoded data should be as good as possible. To make
a cost-effective implementation possible, the complexity of the technique should be
minimal. The processing of the algorithms must not exceed certain time spans.

ii. For each compression technique there are requirements that differ from those of other
techniques. One can distinguish between the requirements of an application running in
dialogue mode and one running in retrieval mode.

DIALOGUE MODE
Dialogue mode means an interaction among human users via multimedia information. In a
dialogue mode application, the following requirements are necessary:

i. The end-to-end delay should not exceed 150 ms for compression and decompression. A
delay in the range of 50 ms should be achieved to support face-to-face dialogue
applications. The number 50 ms relates to the delay introduced by compression and
decompression only.

ii. The overall end-to-end delay additionally comprises any delay in the network, in the
communication protocol processing at the end systems, and in the data transfer from and
to the respective input and output devices.

RETRIEVAL MODE
It is a retrieval of multimedia information by a human user from a multimedia database.

The requirements for retrieval mode are:

i. Fast forward and backward data retrieval with simultaneous display should be possible.
This implies a fast search for information in the multimedia database.

ii. Random access to single images and audio frames of data streams should be possible.

iii. Decompression of images, video and audio should be possible without a link to other data
units

REQUIREMENTS FOR DIALOGUE AND RETRIEVAL MODES


i. To support scalable video in different systems, it is necessary to define a format that is
independent of frame size and video frame rate.

ii. Various audio and video data rates should be supported. Usually this leads to different
qualities. Depending on specific system conditions, the data rates can be adjusted.

iii. It must be possible to synchronize audio with video data as well as with other media.

iv. To make an economical solution possible, coding should be realized in software (for a
cheap, lower quality solution) or using VLSI chips (for a high quality solution).

v. It should be possible to generate data on one multimedia system and reproduce this media
on another system. The compression technique should be compatible. This compatibility
is relevant e.g. in the case of tutoring programs available on a CD, which allows different
users to read the data on different systems, thus being independent of the manufacturer.

CODING TECHNIQUES
DPCM – Differential Pulse Code Modulation

DM – Delta Modulation

FFT – Fast Fourier Transform

DCT – Discrete Cosine Transform

JPEG – Joint Photographic Experts Group

MPEG – Moving Picture Experts Group

DVI – Digital Video Interactive

PLV – Presentation Level Video

RTV – Real Time Video

The coding techniques can be categorized as follows:

Entropy Coding
 Run-length coding
 Huffman coding
 Arithmetic coding

Source Coding
 Prediction: DPCM, DM
 Transformation: FFT, DCT
 Layered coding: bit position, sub-sampling, sub-band coding
 Vector quantization

Hybrid Coding
 JPEG
 MPEG
 H.261, DVI (RTV, PLV)

Entropy coding is used regardless of the medium's specific characteristics. Entropy coding is an
example of lossless coding, where data characteristics are ignored.

MAJOR STEPS OF DATA COMPRESSION

Preparation includes analogue-to-digital conversion and generating an appropriate digital
representation of the information.

Processing is the first step of the compression process which makes use of sophisticated
algorithms. Transformations may be done from the time to the frequency domain using the DCT.

Quantization processes the results of picture processing. It specifies the granularity of the
mapping of real numbers into integers. This process results in a reduction of precision. Entropy
coding compresses a sequential digital data stream without loss.

TYPES OF COMPRESSION
LOSSLESS DATA COMPRESSION
This is a class of data compression algorithms that allows the exact original data to be
reconstructed from the compressed data without data loss, e.g. text.

LOSSY DATA COMPRESSION


This allows an approximation of the original data to be reconstructed in exchange for better
compression rates i.e. accept some loss of data in order to achieve higher compression.

SYMMETRIC VS ASYMMETRIC COMPRESSION


In symmetric compression, compression and decompression use roughly the same technique
and take the same amount of time.

Asymmetric compression is where compression takes a lot more time than decompression, e.g. in
an image database each image will be compressed once and decompressed many times.

Sometimes decompression may take a lot more time than compression (not common), for
example when creating many backup files which will hardly ever be read.

NON-ADAPTIVE VS ADAPTIVE
Non-adaptive compression uses a static dictionary of predefined substrings which are known
to occur with high frequency.

In adaptive compression, the dictionary is built from scratch as the data is processed.
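The notes do not name a specific adaptive algorithm. As an illustrative sketch, LZW is a classic adaptive scheme whose dictionary starts with only the single-byte strings and grows while the input is scanned:

```python
def lzw_encode(data: str) -> list[int]:
    # Dictionary starts with the 256 single-character strings and is
    # extended adaptively: every time a phrase is emitted, that phrase
    # plus its next character becomes a new dictionary entry.
    table = {chr(i): i for i in range(256)}
    next_code = 256
    w = ""
    out = []
    for ch in data:
        if w + ch in table:
            w += ch                      # keep extending the current phrase
        else:
            out.append(table[w])         # emit code for the longest match
            table[w + ch] = next_code    # learn a new phrase
            next_code += 1
            w = ch
    if w:
        out.append(table[w])
    return out

print(lzw_encode("ABABAB"))              # -> [65, 66, 256, 256]
```

Once the dictionary has learned "AB" (code 256), the repeated substring is replaced by that single code.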

RUN-LENGTH CODING
Sampled images, audio and video data streams often contain sequences of the same bytes. By
replacing these repeated byte sequences with the number of occurrences, a substantial reduction
of data can be achieved. This is called run-length coding, which is indicated by a special flag that
doesn't occur as part of the data stream itself.

This flag byte can also be realized using any other of the 255 different byte values in the
compressed data stream. To illustrate this, we define the exclamation mark (!) as the special flag.

A single occurrence of this exclamation mark is interpreted as the special flag during
decompression. The run-length coding procedure can be described as follows:

If a byte occurs at least 4 consecutive times, the number of occurrences is counted. The
compressed data contains this byte followed by the special flag and the number of its occurrences.
This allows for the compression of between 4 and 259 bytes into 3 bytes only.

Given the following sequence, uncompressed data → ABCCCCCCCCEFGGG, compressing with
run-length coding gives: ABC!8EFGGG.

Run-length coding is a very simple form of data compression in which runs of data (i.e. sequences
in which the same data value occurs in many consecutive data elements) are stored as a single
value and count rather than as the original run. This is most useful on data that contains many
runs, for example simple graphics such as icons, line drawings and animations.

Let us consider a screen containing plain black text on a solid white background. There will be
many long runs of white pixels in the blank space and many short runs of black pixels within the
text. Let us take a hypothetical single scan line, with 'B' representing a black pixel and 'W'
representing a white pixel, as follows:

WWWWWWWWWWWWBBBWWWWWWWWWWBBWWWWWWWWWWWWBBBBWWWWWWWWWW

If we apply the run-length coding data compression algorithm to this hypothetical scan line we
get:

W!12BBBW!10BBW!12B!4W!10

Run-length coding performs lossless data compression and is used a lot in fax machines. It is
relatively efficient there because most faxed documents are mostly white space with occasional
interruptions of black.
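The procedure described above can be sketched as follows. This simplified version works on characters and writes the run length in decimal digits rather than as a single count byte:

```python
def rle_compress(data: str, flag: str = "!") -> str:
    # Runs of 4 or more identical characters become: char + flag + count.
    # Shorter runs are copied through unchanged.
    out = []
    i = 0
    while i < len(data):
        j = i
        while j < len(data) and data[j] == data[i]:
            j += 1                       # scan to the end of the run
        run = j - i
        if run >= 4:
            out.append(data[i] + flag + str(run))
        else:
            out.append(data[i] * run)
        i = j
    return "".join(out)

print(rle_compress("ABCCCCCCCCEFGGG"))   # -> ABC!8EFGGG
```

Applied to the hypothetical scan line it reproduces W!12BBBW!10BBW!12B!4W!10 as well.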

Q. Discuss 4 areas where run-length compression is used.

HUFFMAN CODING
This provides a simple way of producing a set of replacement bits. The algorithm is easy to
describe, simple to code and comes with a proof that it is optimal.

The Huffman coding algorithm determines the optimal code using the minimum number of bits.
The lengths of the coded bits will differ. The shortest code is assigned to those characters that
occur most frequently.

To determine a Huffman code, it is useful to construct a binary tree. The leaves (nodes) of this
tree represent the characters that are to be coded. Every node contains the occurrence probability
of one of the characters belonging to its subtree. A zero (0) and a one (1) are assigned to the
branches (edges) of the tree.

The algorithm finds the string of bits for each letter by using the binary tree to read off the codes.
The common letters end up near the top of the tree while the least common letters end up near the
bottom.

Suppose you are provided with the following characters and their probabilities of occurrence as
follows:

{diagram}

Constructing a binary tree for these characters we have:

{diagram}

The addresses for the paths are:

A → 111

B → 110

C → 10

D→0

The most common characters end up with the shortest path.

Coding is straightforward, for example:

DAD → 01110

CAB → 10111110

The path addresses for words with the common letter D end up shorter.
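Since the probability table above is a missing diagram in these notes, the sketch below assumes D is the most frequent character (probability 0.5), C next (0.25), and A and B least frequent (0.125 each), a distribution chosen to reproduce the code lengths shown. The individual 0/1 labels depend on tie-breaking, so the codes may come out as the bit-complement of those above:

```python
import heapq

def huffman_codes(freqs):
    # freqs: {symbol: probability}. Repeatedly merge the two least
    # probable groups, prefixing '0' to one side and '1' to the other.
    heap = [(p, [s]) for s, p in sorted(freqs.items())]
    heapq.heapify(heap)
    codes = {s: "" for s in freqs}
    while len(heap) > 1:
        p1, group1 = heapq.heappop(heap)
        p2, group2 = heapq.heappop(heap)
        for s in group1:
            codes[s] = "0" + codes[s]
        for s in group2:
            codes[s] = "1" + codes[s]
        heapq.heappush(heap, (p1 + p2, group1 + group2))
    return codes

# Assumed probabilities: D 0.5, C 0.25, B 0.125, A 0.125.
codes = huffman_codes({"A": 0.125, "B": 0.125, "C": 0.25, "D": 0.5})
encode = lambda word: "".join(codes[c] for c in word)
```

With these probabilities, D gets a 1-bit code, C a 2-bit code and A, B 3-bit codes, so encode("DAD") is 5 bits and encode("CAB") is 8 bits, matching the examples above.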

The Huffman code has the following characteristics:

 Length. The most common characters are closer to the top and have shorter paths from the
root to the leaf. This means their replacement codes are shorter, and this forms the basis of
the compression.

 Prefix. No code for one character is a prefix of another. This is essential to decompressing a
file by matching up the strings of zeros and ones with a character. It means that no
ambiguous decisions must be made when decoding a string; for instance, if A were coded as
111 and E were coded as 11, then it would be impossible to tell whether 11111 stood for AE or EA

 Optimal. The Huffman codes constructed from such a tree are considered optimal in the
sense that no other set of variable length codes will produce better compression.

 Fragile. If one bit disappears from the file or is changed, then the entire file after the
disappearing bit could be corrupted. The only solution is to break the file up into smaller
sub-files and keep track of the start of each sub-file.

Challenges of Huffman coding:

i. Changing ensemble – if the ensemble changes, the frequencies and probabilities change,
and the optimal coding changes, e.g. in text compression symbol frequencies vary with
context.

ii. Does not consider 'blocks of symbols' – after the string 'ch', the following symbols
'aracters' are largely predictable, but bits are still used to code them without conveying
much new information.

ARITHMETIC CODING
The idea behind arithmetic coding is to have a probability line and to assign to every symbol a
range on this line based on its probability.

The higher the probability, the larger the range assigned to the symbol. Arithmetic coding is
widely used and tends to achieve better compression than Huffman coding but has a problem
with speed.

Disadvantages of arithmetic coding are:

 The whole code word must be received to start decoding the symbols

 If there’s a corrupt bit in the code word, the entire message could become corrupt

 There’s a limit to the precision of the number which can be coded, thus limiting the
number of symbols that can be coded within a code word.

 There exist many patents on arithmetic coding, so the use of some of the algorithms
may attract royalty fees
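A minimal sketch of the idea (encoder only): each symbol owns a sub-range of the probability line, and coding a message repeatedly narrows the current interval to the sub-range of the next symbol. Floats are used here, which directly illustrates the precision limit noted above; assigning ranges in sorted symbol order is our arbitrary choice:

```python
def arithmetic_interval(message, probs):
    # Assign each symbol a sub-range of [0, 1) proportional to its
    # probability (in sorted symbol order), then narrow [low, high)
    # once per coded symbol. Any number inside the final interval
    # identifies the whole message.
    ranges, start = {}, 0.0
    for s in sorted(probs):
        ranges[s] = (start, start + probs[s])
        start += probs[s]
    low, high = 0.0, 1.0
    for s in message:
        span = high - low
        lo_s, hi_s = ranges[s]
        low, high = low + span * lo_s, low + span * hi_s
    return low, high
```

For probs {"a": 0.5, "b": 0.5}, the message "ab" narrows the interval to [0.25, 0.5); longer messages narrow it further, which is why limited numeric precision caps the number of symbols per code word.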

SOURCE CODING
There are 4 different types of source coding:

i. Sub-band coding
This coding gives different resolutions to different bands. For example, since the human eye
is more sensitive to intensity changes than to color changes, it gives the Y component of a
YUV video sequence more resolution than the U and V components.

ii. Sub-sampling
This groups pixels together into a meta region and encodes a single value for the entire
region

iii. Predictive coding

This uses one sample to guess the next. It assumes a model and sends only the differences
from the model.

iv. Transform coding

This transforms one set of reference planes to another.
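Of the four, sub-sampling (ii) is the simplest to sketch. The helper below is our own illustration: it groups each 2 x 2 pixel region of a plane (assumed to have even dimensions) into one averaged value, as is commonly done to the U and V chroma planes:

```python
def subsample_2x2(plane):
    # Replace each 2x2 meta region with its integer average,
    # encoding a single value for the entire region.
    h, w = len(plane), len(plane[0])
    return [[(plane[y][x] + plane[y][x + 1] +
              plane[y + 1][x] + plane[y + 1][x + 1]) // 4
             for x in range(0, w, 2)]
            for y in range(0, h, 2)]
```

A 2 x 2 plane [[1, 3], [5, 7]] collapses to the single value 4, a 4:1 reduction before any entropy coding is applied.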

HYBRID CODING
It's where 2 or more coding techniques are combined so as to gain a greater advantage from
compression without sacrificing much in the way of quality.
It can be applied in levels as a way to enhance the quality of a service that is tailored to a
particular requirement.
Hybrid transformation techniques outperform conventional transform coding techniques in
terms of coding time and prediction errors.
Other advantages of this technique include the potential for improved efficiency and ease of
transmission.

i. JPEG
It applies to color and gray-scale still images. JPEG fulfills the following requirements to
guarantee its distribution and application:

a) The JPEG implementation should be independent of the image size

b) The JPEG implementation should be applicable to any image and pixel aspect ratio

c) Color representation itself should be independent of the special implementation

d) The image content may be of any complexity, with any statistical characteristics

e) The JPEG standard specifications should be state of the art (or near it) regarding the
compression factor and achieved image quality

f) The processing complexity must permit a software solution to run on as many
available standard processors as possible. Additionally, the use of specialized
hardware should substantially enhance image quality

g) Sequential decoding (line by line) and progressive decoding (refinement of the whole
image) should be possible.

With these requirements/characteristics, the user can select the quality of the reproduced
image, the compression processing time and the size of the compressed image by choosing
appropriate individual parameters.

STEPS OF THE JPEG COMPRESSION PROCESS

1. Image Preparation
In the first step, JPEG specifies a very general image model. With this model, it is
possible to describe most of the well-known 2-dimensional image representations.
A source image consists of at least one and at most 255 components as shown below.
Each component (Ci) may have a different number of pixels in the horizontal (Xi) and a
different number in the vertical (Yi) direction.
The resolution of the individual components may be different.

All pixels of all components within the same image are coded with the same number of
bits. The lossy modes of JPEG use a precision of either 8 or 12 bits per pixel.

The lossless mode uses a precision of 2 up to 12 bits per pixel. If a JPEG application makes
use of any other number of bits, the application itself must perform a suitable image
transformation to the well-defined number of bits in the JPEG standard.

In most cases the data units are processed component by component. For one component,
the processing order of the data units is from left to right and from top to bottom, one
component after the other. This is known as non-interleaved data ordering.

{diagram}

Interleaved data units of different components are combined into minimum coded units
(MCUs). If all components have the same resolution (Xi * Yi), an MCU consists of exactly
one data unit for each component. The decoder displays the image MCU by MCU. This
allows for correct color presentation even for partly decoded images.

In the case of different resolutions for single components, the construction of MCUs
becomes more complex. For each component, regions of the data units are determined
and each MCU consists of one region from each component.

2. Image processing
After image preparation, the uncompressed image samples are grouped into data units of 8
* 8 pixels and passed to the encoder. The order of these data units is defined by the
MCUs. The pixel values are shifted into the range of -128 to 127, with 0 as the center.
These data units of 8 * 8 shifted pixel values are denoted by Syx, where x and y are in the
range of 0 to 7. Each of these values is then transformed using the forward discrete cosine
transform (FDCT).
Altogether, this transformation must be carried out 64 times per data unit, as shown below
{diagram}
The coefficient S00 corresponds to the lowest frequency in both dimensions. It is known
as the DC coefficient, which determines the fundamental color of the 64 pixels of the
data unit.
The DC coefficient is the DCT coefficient for which the frequency is 0 (zero) in both
dimensions. All the other coefficients are referred to as AC coefficients; an AC coefficient
is a DCT coefficient for which the frequency in one or both dimensions is non-zero. The
coefficient S77 indicates the highest frequency appearing equally in both dimensions.
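A direct, unoptimized sketch of the FDCT as defined in the JPEG standard, Svu = 1/4 · C(u) · C(v) · ΣΣ Syx · cos((2x+1)uπ/16) · cos((2y+1)vπ/16), where C(k) is 1/√2 for k = 0 and 1 otherwise:

```python
import math

def fdct_8x8(block):
    # block: 8x8 list of level-shifted samples (range -128..127).
    # Returns the 8x8 DCT coefficients; out[0][0] is the DC coefficient.
    def c(k):
        return 1 / math.sqrt(2) if k == 0 else 1.0
    out = [[0.0] * 8 for _ in range(8)]
    for v in range(8):
        for u in range(8):
            total = 0.0
            for y in range(8):
                for x in range(8):
                    total += (block[y][x]
                              * math.cos((2 * x + 1) * u * math.pi / 16)
                              * math.cos((2 * y + 1) * v * math.pi / 16))
            out[v][u] = 0.25 * c(u) * c(v) * total
    return out
```

For a flat data unit (all 64 samples equal to 100), the DC coefficient comes out as 800 and every AC coefficient is zero, matching the interpretation of S00 as the fundamental color of the block.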

3. Quantization
Each of the 64 table entries is used for the quantization of one of the 64 DCT coefficients.
Each of the 64 coefficients can thus be adjusted separately. The application has the
possibility to affect the relative significance of the different coefficients, and specific
frequencies can be given more importance than others. These entries should be determined
according to the characteristics of the source image. The possible compression is
influenced at the expense of the achievable image quality.

The quantization process becomes less accurate as the size of the table entries increases.
Quantization and de-quantization must use the same tables. No default values for
quantization tables are specified in JPEG; applications may specify values which
customize the desired picture quality according to the particular image characteristics.
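Quantization and de-quantization with a shared table can be sketched as follows; the single-entry table with step size 16 in the example is arbitrary, since JPEG specifies no defaults:

```python
def quantize(coeffs, qtable):
    # Divide each DCT coefficient by its table entry and round.
    # Larger table entries -> coarser steps -> less accuracy.
    return [[round(c / q) for c, q in zip(crow, qrow)]
            for crow, qrow in zip(coeffs, qtable)]

def dequantize(levels, qtable):
    # Multiply back with the SAME table; the rounding error is what
    # the quantization step irreversibly discards.
    return [[lv * q for lv, q in zip(lrow, qrow)]
            for lrow, qrow in zip(levels, qtable)]

# A coefficient of 805 with step size 16 comes back as 800.
print(quantize([[805.0]], [[16]]), dequantize([[50]], [[16]]))
```

The round trip maps 805 to the level 50 and back to 800, losing 5: this reduction of precision is exactly where JPEG's lossiness lives.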

4. Entropy coding

The quantized DC coefficients are treated differently from the quantized AC coefficients.
The processing order of the whole set of coefficients is specified by a zig-zag sequence. The
DC coefficients determine the basic color of the data units. Between adjacent data units
the variation of this color is fairly small; therefore, a DC coefficient is coded as the
difference between the current DC coefficient and the previous one. Only the differences are
subsequently processed, as shown below:
{diagram}
Since the processing of the AC coefficients is done using the zig-zag sequence, coefficients
with lower frequencies are processed first, followed by those with higher frequencies. This
results in a very efficient coding scheme.
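Both ideas, the differential coding of DC coefficients and the zig-zag scan, can be sketched as follows (positions are (row, column) pairs of the 8 x 8 coefficient block):

```python
def dc_differences(dc_values):
    # Code each DC coefficient as the difference from its predecessor;
    # the first one is taken relative to 0.
    diffs, prev = [], 0
    for dc in dc_values:
        diffs.append(dc - prev)
        prev = dc
    return diffs

def zigzag_indices(n=8):
    # Zig-zag order: walk the anti-diagonals of the block in order of
    # increasing total frequency, alternating direction per diagonal.
    return sorted(((v, u) for v in range(n) for u in range(n)),
                  key=lambda p: (p[0] + p[1],
                                 p[0] if (p[0] + p[1]) % 2 else p[1]))

print(dc_differences([50, 52, 51]))   # -> [50, 2, -1]
print(zigzag_indices()[:6])           # -> [(0, 0), (0, 1), (1, 0), (2, 0), (1, 1), (0, 2)]
```

Because adjacent DC values vary little, the differences stay small, and because the zig-zag scan visits low frequencies first, the many near-zero high-frequency coefficients cluster at the end of the sequence, where they compress very well.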

ii. MPEG

