Final Assignment
and clips in addition to the regular text. It is even capable of sending messages and
formatted multimedia documents.
ii) What is noise in relation to multimedia data? For each modality (audio,
photographic image, and digital video), give one example of a possible source of
noise and how this noise might manifest itself in the data.
Noisy data is meaningless data, often resulting from data corruption in the recording
or sampling process, or from extraneous sources during the recording.
Audio (one example of each):
Source: electrical interference or background sound picked up during recording.
Manifestation: hiss, hum, clicks and pops in the audio signal.
Image:
Source: sensor noise in low light, dust on the lens, or transmission errors.
Manifestation: salt-and-pepper noise, graininess, blocky pictures, low image contrast.
Video:
Source: the same per-frame sources as images, plus timing and transmission errors across frames.
Manifestation: per frame: salt-and-pepper noise, graininess, blocky pictures, low
image contrast; across frames: video jitter, missing frames.
Discrete Media: refers to media involving the space dimension only (e.g. still
images, text and graphics). Discrete media is also referred to as static media,
space-based media or non-temporal media.
Typographic Etiquette
1. Select a font.
2. Modify the font size.
3. Scale your headings.
4. Set the line spacing.
5. Adjust tracking and kerning to give the text more room.
6. Add white space between headings and the body text.
7. Use a line length of 45-50 characters.
Typographic goals
I. To remain invisible to the reader.
II. To increase clarity and readability.
III. To subtly indicate the voice and tone of the speaker.
a) Multimedia System
A multimedia system is a system capable of processing, storing, generating,
manipulating and rendering multimedia information such as text, graphics, images,
audio and video.
b) Typography
Typography is the art and technique of arranging type to make text legible and
readable. A font is a collection of letters, numbers, punctuation, and other symbols
used to set text (or related) matter. Although font and typeface are often used
interchangeably, font refers to the physical embodiment (whether it is a case of metal
pieces or a computer file) while typeface refers to the design (the way it looks).
c) Typefaces
Typeface refers to a group of characters, letters and numbers that share the same
design. Garamond, Times, and Arial, for example, are typefaces. Arial is a typeface;
16pt Arial Bold is a font. So the typeface is the creative part and the font is the
structure.
i) Different color models are often used in different applications; discuss these
different color models.
There are two basic kinds of color models: additive and subtractive. The most
common additive model is Red, Green, Blue (RGB); the most common subtractive
model is Cyan, Magenta, Yellow, Black (CMYK).
The RGB Color Model
This color model uses light to create color, and it is used for digital media. When you
play a game on your smartphone or watch a movie on your TV, you are seeing color
in an RGB color space.
RGB is called an additive color model because when the three colors of light are shown in the
same intensity at the same time, they produce white. If all the lights are out, they create black.
When printing color images, you cannot use colored light, which means images
cannot be printed in RGB. That is where the other color model comes in. A
subtractive color model applies pigment in the form of ink or dye that absorbs light,
subtracting color from white. The most common subtractive color model is
Cyan/Magenta/Yellow/Black, usually referred to as CMYK.
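To make the relationship concrete, here is a minimal sketch of the textbook RGB-to-CMYK formula in Python (assuming components normalised to 0..1; real print workflows use colour-managed conversion via ICC profiles, not this bare formula):

```python
# A minimal sketch of RGB -> CMYK conversion, assuming RGB components
# are normalised to the range 0..1.
def rgb_to_cmyk(r, g, b):
    k = 1 - max(r, g, b)          # black: how far the colour is from white
    if k == 1.0:                  # pure black: avoid division by zero
        return 0.0, 0.0, 0.0, 1.0
    c = (1 - r - k) / (1 - k)
    m = (1 - g - k) / (1 - k)
    y = (1 - b - k) / (1 - k)
    return c, m, y, k

print(rgb_to_cmyk(1.0, 0.0, 0.0))  # pure red -> (0.0, 1.0, 1.0, 0.0)
```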
ii) What is the YIQ color model? Give one application in which this color model
is most commonly used and explain the reason.
The YIQ color space model is used in U.S. commercial color television broadcasting
(NTSC). It is a rotation of the RGB color space such that the Y axis contains the
luminance information, allowing backwards compatibility with black-and-white
TVs, which display only this axis of the color space.
Application
The YIQ color model is used for US TV broadcast.
The model was designed to separate chrominance (I and Q) from luminance (Y).
This was a requirement in the early days of color television, when black-and-white
sets were expected to pick up and display what were originally color pictures.
The Y channel contains the luminance information (sufficient for black-and-white
television sets) while the I and Q channels carry the color information.
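The rotation itself can be sketched as a single matrix multiplication; the coefficients below are the commonly quoted NTSC values (exact figures vary slightly between references):

```python
import numpy as np

# The classic NTSC RGB -> YIQ rotation. Y alone reproduces the
# black-and-white picture; I and Q carry the chrominance.
RGB_TO_YIQ = np.array([
    [0.299,  0.587,  0.114],   # Y: luminance
    [0.596, -0.274, -0.322],   # I: orange-blue chrominance
    [0.211, -0.523,  0.312],   # Q: purple-green chrominance
])

def rgb_to_yiq(rgb):
    return RGB_TO_YIQ @ np.asarray(rgb)

print(rgb_to_yiq([1.0, 1.0, 1.0]))  # white -> Y = 1, I = 0, Q = 0 (approximately)
```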
iii) Explain why JPEG compression is not always suitable for compression of
images that contain sharp edges or abrupt changes of intensity (such as black
text on a white background).
• Low-pass filtering leads to blurring of edges: the high-frequency components will
not be small, as JPEG's quantization assumes.
• Ringing artefacts occur due to the Gibbs phenomenon: Fourier sums overshoot at a
jump discontinuity, and this overshoot does not die out as the frequency increases.
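A small sketch illustrating the first point: the 1-D DCT-II of an 8-sample black-to-white edge, computed directly from the definition, shows high-frequency coefficients that stay large instead of dying out:

```python
import numpy as np

# 1-D DCT-II of an 8-sample black-to-white edge, computed directly from
# the definition. For smooth image content the coefficients die off
# quickly, but for this step the odd-frequency coefficients stay large
# all the way up to k = 7, so coarse quantisation of high frequencies
# destroys edge information and causes visible ringing.
N = 8
x = np.array([0, 0, 0, 0, 255, 255, 255, 255], dtype=float)   # sharp edge
n = np.arange(N)
dct = np.array([np.sum(x * np.cos(np.pi / N * (n + 0.5) * k)) for k in range(N)])
print(np.round(dct))   # note the large coefficients at k = 1, 3, 5, 7
```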
iv) Shown below is a JPEG quantization table. Explain why the values in the top
left corner are smaller than the values in the bottom-right corner. Why are some
values not symmetrical with respect to the
main diagonal? What would happen to the quality of the picture if all values in
the table were halved?
16  11  10  16  24  40  51  61
12  12  14  19  26  58  60  55
14  13  16  24  40  57  69  56
14  17  22  29  51  87  80  62
18  22  37  56  68 109 103  77
24  35  55  64  81 104 113  92
49  64  78  87 103 121 120 101
72  92  95  98 112 100 103  99
• The eye is most sensitive to low frequencies (upper-left corner) and less sensitive to
high frequencies (bottom-right corner), hence we quantise more aggressively in the
high-frequency range.
• The table is not symmetrical because the eye's sensitivity to horizontal and vertical
spatial frequencies is not identical; the values come from psycho-visual experiments,
not from a symmetric formula.
• If all values were halved, the quality would improve: halving the values makes the
quantization steps finer, so less information is discarded (at the cost of a larger
compressed file).
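As a sketch of how the table is used (assuming dct_block holds the 8x8 DCT coefficients of one image block; Q is the luminance table from the question):

```python
import numpy as np

# JPEG-style quantisation sketch. Dividing by Q and rounding discards
# precision; halving Q halves the step sizes, so more detail survives
# the round() and quality improves.
Q = np.array([
    [16, 11, 10, 16,  24,  40,  51,  61],
    [12, 12, 14, 19,  26,  58,  60,  55],
    [14, 13, 16, 24,  40,  57,  69,  56],
    [14, 17, 22, 29,  51,  87,  80,  62],
    [18, 22, 37, 56,  68, 109, 103,  77],
    [24, 35, 55, 64,  81, 104, 113,  92],
    [49, 64, 78, 87, 103, 121, 120, 101],
    [72, 92, 95, 98, 112, 100, 103,  99],
])

def quantise(dct_block, table):
    return np.round(dct_block / table).astype(int)

def dequantise(levels, table):
    return levels * table   # reconstruction: the loss happened in round()
```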
v) Explain different coding techniques, such as entropy coding/Huffman coding,
with an example.
Entropy Coding
In information theory, an entropy coding (or entropy encoding) is a lossless data
compression scheme that is independent of the specific characteristics of the
medium. Two of the most common entropy coding techniques are Huffman coding
and arithmetic coding.
Entropy example
Entropy calculation for a two-symbol alphabet.
Example 1: p_A = 0.5, p_B = 0.5
H(A,B) = -p_A log2(p_A) - p_B log2(p_B) = -0.5 log2(0.5) - 0.5 log2(0.5) = 1 bit
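The same calculation as a small Python helper (the second set of probabilities is illustrative):

```python
import math

# A minimal sketch of the entropy formula used above:
# H = -sum(p * log2(p)) over all symbol probabilities.
def entropy(probs):
    return -sum(p * math.log2(p) for p in probs if p > 0)

print(entropy([0.5, 0.5]))   # 1.0 bit, matching Example 1
print(entropy([0.9, 0.1]))   # ~0.47 bits: a skewed source is more compressible
```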
Huffman coding
Huffman coding (also known as Huffman encoding) is a well-known greedy algorithm.
The code length of a character depends on how frequently it occurs in the given text:
the character which occurs most frequently gets the shortest code, and the character
which occurs least frequently gets the longest code.
Huffman code
[Figure: a Huffman code tree built from symbol frequencies (total count 306; symbols
include e, d, u, c, m, z and k). The most frequent symbol, e, sits near the root and
receives the shortest code; rare symbols such as z and k sit deepest in the tree and
receive the longest codes.]
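A minimal Huffman coder along these lines, using a heap of partial trees (the frequencies below are illustrative, loosely echoing the figure):

```python
import heapq

# A minimal Huffman coder sketch: repeatedly merge the two least
# frequent subtrees; frequent symbols end up with short codes.
def huffman_codes(freqs):
    # Heap entries: (frequency, tie_breaker, tree) where tree is either
    # a symbol (leaf) or a (left, right) pair (internal node).
    heap = [(f, i, sym) for i, (sym, f) in enumerate(freqs.items())]
    heapq.heapify(heap)
    count = len(heap)
    while len(heap) > 1:
        f1, _, left = heapq.heappop(heap)
        f2, _, right = heapq.heappop(heap)
        heapq.heappush(heap, (f1 + f2, count, (left, right)))
        count += 1
    codes = {}
    def walk(tree, prefix):
        if isinstance(tree, tuple):
            walk(tree[0], prefix + "0")
            walk(tree[1], prefix + "1")
        else:
            codes[tree] = prefix or "0"   # single-symbol edge case
    walk(heap[0][2], "")
    return codes

print(huffman_codes({"e": 120, "d": 42, "u": 37, "c": 32, "m": 24, "k": 7, "z": 2}))
```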
vi) Define
a) Dithering
b) GIF File
c) Human visual acuity
d) Color Harmony Schemes
a) Dithering
Dithering is an image processing operation used to create the illusion of color depth in
images with a limited color palette. Colors not available in the palette are
approximated by a diffusion of colored pixels from within the available palette.
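One widely used diffusion method is Floyd-Steinberg error diffusion; the sketch below (assuming img is a 2-D array of 0..255 grey values) reduces an image to a 1-bit black/white palette:

```python
import numpy as np

# Floyd-Steinberg dithering sketch: snap each pixel to the nearest
# palette colour (here black or white) and diffuse the rounding error
# to not-yet-visited neighbours, creating the illusion of grey levels.
def floyd_steinberg(img):
    img = img.astype(float).copy()
    h, w = img.shape
    for y in range(h):
        for x in range(w):
            old = img[y, x]
            new = 255.0 if old >= 128 else 0.0
            img[y, x] = new
            err = old - new
            if x + 1 < w:               img[y, x + 1]     += err * 7 / 16
            if y + 1 < h and x > 0:     img[y + 1, x - 1] += err * 3 / 16
            if y + 1 < h:               img[y + 1, x]     += err * 5 / 16
            if y + 1 < h and x + 1 < w: img[y + 1, x + 1] += err * 1 / 16
    return img
```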
b) GIF File
GIF stands for "Graphics Interchange Format". It is a bitmap image format created by
CompuServe in 1987. GIF images are compressed with lossless compression, yet the
file sizes are significantly small. It is one of the most widely used image formats on
the World Wide Web.
c) Human visual acuity
Visual acuity is the sharpness of vision: the ability of the eye to resolve fine detail.
Normal (20/20) vision resolves detail of about one minute of arc, which sets practical
limits on useful display resolution and viewing distance.
d) Color Harmony Schemes
Color harmony schemes are pleasing combinations of colors chosen by their positions
on the color wheel, such as complementary (opposite colors), analogous (adjacent
colors), triadic (three evenly spaced colors) and monochromatic (tints and shades of
one hue) schemes.
3. Digital Audio
i) Describe the difference between reverb and echo.
Reverberation
. Reverberation is the persistence of sound after the source has stopped, produced by
many closely spaced reflections (from walls, ceiling and floor) that blend into a single
decaying tail. It can be heard when the sound gets reflected by a nearby wall.
Echo
. An echo is a distinct reflection of the sound that arrives long enough after the direct
sound (typically from a distant surface) to be heard as a separate repetition. The key
difference: an echo is perceived as a separate repeat of the sound, while in
reverberation the reflections are too closely spaced to distinguish individually.
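A single echo is easy to sketch as a delay line (assuming x is a mono signal array sampled at fs Hz; the delay and gain values are illustrative):

```python
import numpy as np

# A minimal delay-line sketch of a single echo:
#     y[n] = x[n] + gain * x[n - delay]
# A reverb would instead sum many shorter, denser delayed copies with
# decreasing gains, so the repeats blur into a smooth tail.
def add_echo(x, fs, delay_s=0.3, gain=0.5):
    d = int(delay_s * fs)              # delay in samples (assumed > 0)
    y = x.astype(float).copy()
    y[d:] += gain * x[:-d]             # add the delayed, attenuated copy
    return y
```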
ii) Audio signals are often sampled at different rates. CD quality audio is
sampled at a 44.1kHz rate while telephone quality audio is sampled at 8kHz.
What are the maximum frequencies in the input signal that can be fully recovered
for these two sampling rates? Briefly describe the theory you use to obtain
the results.
• CD quality audio, the maximum frequency: 44,100Hz / 2 = 22,050Hz.
• Telephone quality audio, the maximum frequency: 8kHz / 2 = 4kHz.
• This is based on Nyquist theorem: the sampling frequency for a signal must be at
least twice the highest frequency component in the signal.
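A quick sketch of the theorem in action: sampled at 8 kHz, a 5 kHz tone (above the 4 kHz limit) produces exactly the same samples as a 3 kHz tone, up to sign, so it cannot be recovered:

```python
import numpy as np

# Aliasing demonstration: a 5 kHz tone sampled at 8 kHz is
# indistinguishable from a 3 kHz tone (it aliases to 8 - 5 = 3 kHz).
fs = 8000
t = np.arange(0, 0.01, 1 / fs)            # 10 ms of sample instants
tone_5k = np.sin(2 * np.pi * 5000 * t)
tone_3k = np.sin(2 * np.pi * 3000 * t)
print(np.allclose(tone_5k, -tone_3k))     # True: identical samples up to sign
```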
iv) Define:
a) decibel
b) Critical Band
c) Masking
d) Midi
a) Decibel
A decibel is a unit of measurement which is used to indicate how loud a sound is.
Continuous exposure to sound above 80 decibels could be harmful.
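As a worked sketch of the underlying formula (sound pressure level in decibels relative to the 20 micropascal threshold of hearing):

```python
import math

# Sound pressure level: dB = 20 * log10(p / p_ref).
# Doubling the pressure amplitude adds about 6 dB.
def spl_db(p, p_ref=20e-6):   # reference: 20 micropascals
    return 20 * math.log10(p / p_ref)

print(spl_db(2 * 20e-6) - spl_db(20e-6))   # ~6.02 dB for a doubling
```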
b) Critical Band
A critical band is the band of audio frequencies within which a second tone interferes
with the perception of a first tone through auditory masking; the human ear behaves
like a bank of roughly 24 such band-pass filters covering the audible range.
c) Masking
In audio, masking occurs when a louder sound makes a quieter sound inaudible: a tone
masks quieter tones that are close to it in frequency (frequency masking, which
operates within a critical band) or close to it in time (temporal masking). Perceptual
audio coders such as MP3 exploit masking by discarding sound the ear cannot hear.
(In image editing, by contrast, 'masking' refers to using a mask to protect a specific
area of an image, just as you would use masking tape when painting your house, so
that area is not altered by changes made to the rest of the image.)
d) Midi
MIDI (Musical Instrument Digital Interface) is a protocol designed for recording and
playing back music on digital synthesizers that is supported by many makes of
personal computer sound cards. Originally intended to control one keyboard from
another, it was quickly adopted for the personal computer.
4. Digital Video
ii) Explain the key differences between I-frames, P-frames and B-frames in
MPEG-2 video compression. Describe the advantages and disadvantages of using
B-frames.
I-frame compression removes the spatial redundancy of the image, while P- and
B-frames remove temporal redundancy.
An I-frame (also called a key-frame or intra-frame) consists only of macroblocks that
use intra-prediction. It can exploit only "spatial redundancies" within the frame for
compression; spatial redundancy refers to similarities between the pixels of a single
frame.
A P-frame (Predicted frame), also known as an inter-frame, allows macroblocks to be
compressed using temporal prediction in addition to spatial prediction.
A B-frame is a frame that can refer to frames that occur both before and after it.
The B stands for Bi-Directional for this reason.
Advantages of B-frames:
• Coding efficiency.
• Most B frames use fewer bits.
• Quality can also be improved in the case of moving objects that reveal hidden areas
within a video sequence.
Disadvantages of B-frames:
• Frame reconstruction memory buffers within the encoder and decoder must be
doubled in size to accommodate the two anchor frames.
• More delay in real-time applications.
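A small sketch of the ordering problem behind both disadvantages (the GOP layout and frame names are illustrative):

```python
# Why B-frames add buffering and delay: each B-frame needs its *future*
# anchor (I or P) decoded first, so decode order differs from display
# order and frames must be held in memory until they can be shown.
display_order = ["I0", "B1", "B2", "P3", "B4", "B5", "P6"]
decode_order  = ["I0", "P3", "B1", "B2", "P6", "B4", "B5"]
# B1/B2 cannot be decoded until P3 (their future anchor) has arrived.
```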
Spatial Compression
Spatial compression techniques are based on still image compression. The most
popular technique, which is adopted by many standards, is the transform technique. In
this technique, the image is split into blocks and the transform is applied to each
block. The result of the transform is scaled and quantized. The quantized data is
compressed by a lossless entropy encoder and the output bitstream is formed from the
result. The most popular transform algorithm is the Discrete Cosine Transform (DCT)
or its modifications. There are many other algorithms for spatial compression such as
wavelet transform, vector coding, fractal compression, etc.
Temporal Compression
Temporal compression can be a very powerful method. It works by comparing
different frames in the video to each other. If the video contains areas without motion,
the system can issue a short command that copies that part of the previous frame, bit-
for-bit, into the next one. If some of the pixels have changed (moved, rotated, changed
in brightness, etc.) with respect to the reference frame or frames, then a prediction
technique can be applied. For each area in the current frame, the algorithm searches
for a similar area in the previous frame or frames. If a similar area is found, it’s
subtracted from the current area and the difference is encoded by the transform coder.
The reference for the current frame area may also be obtained as a weighted sum of
corresponding areas from previous and consecutive frames. If consecutive frames are
used, then the current frame must be delayed by some number of frame periods.
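A minimal sketch of the block-matching search described above (assuming cur and prev are 2-D grayscale NumPy frames; the block size and search range are illustrative):

```python
import numpy as np

# Block-matching motion search: for the block at (y, x) in the current
# frame, scan a +/- r pixel window in the previous frame and return the
# motion vector with the smallest SAD (sum of absolute differences).
def motion_search(cur, prev, y, x, size=16, r=8):
    block = cur[y:y + size, x:x + size].astype(float)
    best_sad, best_mv = float("inf"), (0, 0)
    for dy in range(-r, r + 1):
        for dx in range(-r, r + 1):
            py, px = y + dy, x + dx
            if py < 0 or px < 0 or py + size > prev.shape[0] or px + size > prev.shape[1]:
                continue  # candidate block falls outside the frame
            cand = prev[py:py + size, px:px + size].astype(float)
            sad = np.abs(block - cand).sum()
            if sad < best_sad:
                best_sad, best_mv = sad, (dy, dx)
    return best_mv  # the residual (block - best match) then goes to the DCT coder
```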
v) Define
a) PAL
b) Progressive Scan
c) Animation Techniques
d) Motion Compensation
a) PAL
PAL is an abbreviation for Phase Alternating Line. This is the video format standard
used in many European countries. A PAL picture is made up of 625 interlaced lines
and is displayed at a rate of 25 frames per second. SECAM, a related standard, is an
abbreviation for Sequential Color with Memory.
b) Progressive Scan
Progressive scan is a display and transmission format in which all the lines of each
frame are drawn in sequence from top to bottom, rather than splitting each frame into
two interlaced fields of odd and even lines.
c) Animation Techniques
This can refer to two things: traditional animation made from paper or other similar
media, and vector-based animation made on a computer. In the latter, you make all
the drawings digitally on a computer and play those images back to give an animation
effect, so it is comparatively easier and quicker than the traditional technique.
d) Motion Compensation
Motion compensation is an algorithmic technique used to predict a frame in a video,
given the previous and/or future frames by accounting for motion of the camera
and/or objects in the video. It is employed in the encoding of video data for video
compression, for example in the generation of MPEG-2 files.
5. Miscellaneous
i) Discuss Immersive Reality and differentiate between Virtual Reality and
Augmented Reality.
Immersive Reality
It represents the next step of augmented reality and builds up itself through the
immersion in a specific room: the “cave” (virtual automatic ambient), to develop
an immersive virtual reality in which some projectors are toward three or more
displays. In this case an high definition projection system with 3D special glasses has
been made, searching for multi-sensorial effects (3D sound).
Virtual Reality
. VR replaces the user's real surroundings entirely with a computer-generated
environment.
. VR might work better for video games and social networking in a virtual
environment, such as Second Life, or even PlayStation Home.
Augmented Reality
. Augmented reality enhances real life with artificial images, adding graphics, sound
and smell to the natural world as it exists.
. The user can interact with the real world and, at the same time, see both the real
and the virtual world.
a) Project Manager
. Center of the action.
. Responsible for the overall development and implementation of a project, as well as
day-to-day operations:
. Budgets
. Schedule
. Creative session
. Time Sheets
. Illness
. Invoices
. Team dynamics
b) Multimedia Programmer
. Software engineer.
. Integrates all the multimedia elements of a project into a seamless whole using an
authoring system or a programming language.
c) Video Specialist
. Works with a video production crew that can include:
. lighting designers
. set designers
. script supervisors
. gaffers
. grips
. production assistants
. actors
. However, for many modest projects, a video specialist may shoot and edit all of
the footage without outside help.
d) Audio Specialist
. The wizard who makes a multimedia program come alive, designing and producing
music, voice-over narrations, and sound effects.
. Selects suitable music and talent, schedules recording sessions, and digitizes and
edits recorded material into computer files.
QoS Parameters
To provide and sustain QoS, resource management must be QoS-driven. To allocate
resources, the resource management system must consider different parameters:
. resource availability;
. resource control policies, including Service Level Agreements (SLA);
. QoS requirements of applications, which are quantified by QoS parameters
(e.g. jitter, delay, packet loss).
Jitter: jitter is the delay variation, introduced by the variable transmission delay of
packets over the network. It can occur because of the behavior of routers' internal
queues in certain circumstances (e.g. flow congestion), routing changes, etc.
This parameter can seriously affect the quality of streaming audio and/or video.
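One common way to quantify jitter is the interarrival jitter estimator from RTP (RFC 3550): a running average of how much the packet spacing at the receiver deviates from the spacing at the sender. A sketch with illustrative millisecond timestamps:

```python
# RFC 3550 interarrival jitter estimator: J += (|D| - J) / 16, where D
# is the change in transit time between consecutive packets. `send` and
# `recv` are per-packet timestamps in the same time unit.
def interarrival_jitter(send, recv):
    j = 0.0
    for i in range(1, len(send)):
        d = (recv[i] - recv[i - 1]) - (send[i] - send[i - 1])
        j += (abs(d) - j) / 16
    return j

print(interarrival_jitter([0, 20, 40, 60], [5, 26, 44, 66]))  # ms
```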
Delay: this parameter is intrinsic to communications, since the end points are distant
and the information takes some time to reach the other side. Delay is also referred to
as latency. Delay increases if packets face long queues in the network (congestion),
or cross a less direct route to avoid congestion.
Packet Loss: packet loss happens when one or more packets of data being transported
across the internet or a computer network fail to reach their destination. Wireless and
IP networks cannot guarantee that packets will be delivered at all, and will fail to
deliver (drop) some packets if they arrive when buffers are already full. Packet loss
can also be caused by other factors such as signal degradation, high loads on network
links, corrupted packets being discarded, or defects in network elements.
v) Discuss Multimedia Hardware (Processor/GPUs, Input Devices, Output
Devices)
Processor/GPUs
A graphics card (also called a video card, display card, graphics adapter, or display
adapter) is an expansion card which generates a feed of output images to a display
device (such as a computer monitor). Its GPU (graphics processing unit) is the
processor that performs the highly parallel computations needed to render images.
Input Devices
Keyboard - The keyboard is the most common and most popular input device; it
helps in inputting data to the computer.
Joystick - The joystick is also a pointing device, used to move the cursor position on
a monitor screen.
Track Ball - The trackball is an input device mostly used in notebook or laptop
computers instead of a mouse.
Scanner - The scanner is an input device which works much like a photocopy machine.
Magnetic Ink Character Reader (MICR) - The MICR input device is generally used
in banks because of the large number of cheques to be processed every day.
Optical Character Reader (OCR) - OCR is an input device used to read printed
text.
Bar Code Readers - A bar code reader is a device used for reading bar-coded data
(data in the form of light and dark lines).
Optical Mark Reader (OMR) - OMR is a special type of optical scanner used to
recognize the type of mark made by pen or pencil.
Output Devices
Monitors - The monitor, commonly called a Visual Display Unit (VDU), is the main
output device of a computer.
Printers - The printer is the most important output device, used to print information
on paper.
Speakers and Sound Card - Computers need both a sound card and speakers to hear
audio, such as music, speech and sound effects.
v) Define:
a) Mixed Reality
b) Net neutrality
c) Priority Scheduling
d) RSVP
a) Mixed Reality
Mixed Reality is a blend of physical and digital worlds, unlocking natural and
intuitive 3D human, computer, and environment interactions. This new reality is
based on advancements in computer vision, graphical processing, display
technologies, input systems, and cloud computing.
b) Net neutrality
Net neutrality is the principle that an internet service provider (ISP) has to provide
access to all sites, content and applications at the same speed, under the same
conditions without blocking or giving preference to any content.
c) Priority Scheduling
Priority scheduling is a scheduling discipline in which each process (or packet) is
assigned a priority and the scheduler always serves the highest-priority item first. In
QoS networking, it lets delay-sensitive multimedia traffic be forwarded ahead of
best-effort traffic.
d) RSVP
RSVP (Resource Reservation Protocol) is a signaling protocol that allows
applications to reserve bandwidth and other resources along the network path between
sender and receiver, so that QoS guarantees can be provided for multimedia streams.
The End