
CSO243:

FUNDAMENTALS OF MULTIMEDIA

Lecture 5: Fundamental Concepts in Video


What is video?
• Video is a combination of images and audio.
• It consists of a set of static (still) images called frames, displayed to the user one after another at a specific speed known as the frame rate.
• Frame rate is the number of frames per second (fps); a quick sketch follows below.
• If the frames are displayed fast enough, the eye cannot distinguish the individual frames; persistence of vision merges them, creating an illusion of motion.
• Audio is added and synchronized with the apparent movement of the images.
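As a quick illustration (a minimal sketch, not part of the original slides), the display interval per frame is simply the reciprocal of the frame rate:

    # A minimal sketch: the time each frame stays on screen is 1/fps.
    for fps in (24, 25, 30, 60):
        interval_ms = 1000 / fps  # milliseconds per frame
        print(f"{fps} fps -> {interval_ms:.2f} ms per frame")
    # 24 fps -> 41.67 ms, 25 fps -> 40.00 ms,
    # 30 fps -> 33.33 ms, 60 fps -> 16.67 ms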
Types of video
• Analog video is represented as a continuous (time-varying) signal.
• Digital video is represented as a sequence of digital images.
Analog versus Digital Video
• A major weakness of analog recordings is that every time analog video is copied from tape to tape, some of the data is lost and the image is degraded; this is referred to as generation loss.
• Digital video is less susceptible to deterioration when copied.
• You can convert analog video to digital video with the proper hardware and software configuration.
Video frames scanning
• Each frame is a picture consisting of a set of pixels.
• Instead of counting each pixel individually, we count line by line: in a 1080p or 1080i video frame, there are 1080 lines of pixels.
• But what do the “i” and “p” stand for? They stand for interlaced and progressive video, our next topic.
Analog Video Scanning
Analog video scanning is divided into two types:
1. Progressive scanning: traces through a complete frame row-wise for each time interval.
2. Interlaced scanning: the odd-numbered lines are traced first, then the even-numbered lines. This results in “odd” and “even” fields; two fields make up one frame. (A sketch of the idea follows below.)
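A minimal Python sketch of the difference, using a hypothetical NumPy array as the frame (numbering conventions vary; here the first field takes rows 0, 2, 4, ...):

    import numpy as np

    # Hypothetical 8-line x 4-pixel grayscale frame, for illustration only.
    frame = np.arange(32).reshape(8, 4)

    # Progressive scanning transmits the whole frame row by row.
    progressive = frame

    # Interlaced scanning splits the rows into two fields.
    odd_field = frame[0::2]   # "odd" field: 1st, 3rd, 5th, ... lines
    even_field = frame[1::2]  # "even" field: 2nd, 4th, 6th, ... lines
    print(odd_field.shape, even_field.shape)  # (4, 4) (4, 4)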
What is Interlaced Video?
• An interlaced video scan shows the even and odd scan lines as separate sets of lines: the odd lines are displayed on the screen first, and the even ones fill in after.
• By presenting every other video line, you effectively double the perceived frame rate without providing more information.
• In addition, quickly showing each half-frame can trick viewers into thinking there are more frames than there actually are.
History of Interlaced Video
• Back when analog television was still in use, and most people watched 480i or 576i on CRT screens, interlaced transmissions were typical for TV channels and, even later, some DVDs.
• Bandwidth was a major contributing factor: only a limited amount of data could be transmitted at once over coaxial cable or the airwaves.
Pros and Cons of Interlaced Videos
• Pros
  ◦ Due to persistence of vision, interlacing helps reduce flicker between frames.
  ◦ Interlacing saves bandwidth by only sending half of a complete frame at once.
• Cons
  ◦ Interlacing can cause problems like combing, where the even lines of the next frame B have loaded onto the display while the odd lines of the previous frame A are still being shown, giving you an image that is half of frame A and half of frame B (illustrated below).
Broadcasters in charge of sports would choose to display a lower-quality progressive video rather than a higher-quality interlaced video to save on bandwidth and avoid combing problems.
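A toy sketch (not any broadcaster's actual pipeline) of how combing arises when the two fields on screen come from different frames:

    import numpy as np

    # A vertical bar sits at the left in frame A and has moved right in frame B.
    frame_a = np.zeros((6, 6), dtype=int); frame_a[:, 0:2] = 1
    frame_b = np.zeros((6, 6), dtype=int); frame_b[:, 3:5] = 1

    # Weave the odd lines of frame A with the even lines of frame B.
    woven = np.empty_like(frame_a)
    woven[0::2] = frame_a[0::2]  # odd field still showing frame A
    woven[1::2] = frame_b[1::2]  # even field already updated to frame B
    print(woven)  # alternating rows disagree: the "comb" artifact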
11
What is Progressive video?
• Progressive scan displays the video frame line by line from top to bottom. It shows the even and odd scan lines sequentially in a single pass, so the entire video frame is shown at once.
• Also called “non-interlaced,” progressive video is supported by all modern monitors and TVs. The highest progressive format in everyday use is 2160p (commonly referred to as 4K).
Pros and Cons of Progressive video
• Pros
  ◦ Progressive scan offers a more vibrant and realistic display. In addition, because it transmits full frames, it reduces flicker and artifacts.
  ◦ Less flicker also means less strain on the eyes after extended use.
  ◦ The development of modern displays such as LCDs and LEDs has made progressive technology the lasting standard.
Pros and Cons of Progressive video
• Cons
  ◦ The system becomes more expensive and demanding due to the increased bandwidth requirements of progressive scanning.
  ◦ Older interlaced videos must be de-interlaced in order to be shown on progressive screens (a simple approach is sketched below).
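A minimal sketch of the simplest de-interlacing approach, line doubling (often called "bob"); real de-interlacers are far more sophisticated:

    import numpy as np

    def bob_deinterlace(field: np.ndarray) -> np.ndarray:
        """Stretch a half-height field back to full frame height
        by repeating each of its lines."""
        return np.repeat(field, 2, axis=0)

    field = np.arange(12).reshape(3, 4)  # hypothetical 3-line field
    frame = bob_deinterlace(field)
    print(frame.shape)  # (6, 4): full-height progressive frame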
NTSC Video
• The NTSC TV standard was mostly used in North America and Japan.
• It uses the familiar 4:3 aspect ratio (i.e., the ratio of picture width to height) and 525 scan lines per frame at 30 frames per second.
PAL Video
• PAL (Phase Alternating Line) is a TV standard originally invented by German scientists.
• It uses 625 scan lines per frame at 25 frames per second (i.e., 40 ms/frame), with a 4:3 aspect ratio and interlaced fields. Its broadcast TV signals are also used in composite video.
• This important standard is widely used in Western Europe, China, India, and many other parts of the world.
• Because it has a higher resolution than NTSC (625 versus 525 scan lines), the visual quality of its pictures is generally better.
PAL Video cont.
• PAL uses the YUV color model, allocating a bandwidth of 5.5 MHz to Y and 1.8 MHz each to U and V.
SECAM Video
• SECAM, which was invented by the French, is the third major broadcast TV standard.
• SECAM stands for Système Électronique Couleur Avec Mémoire. It also uses 625 scan lines per frame at 25 frames per second, with a 4:3 aspect ratio and interlaced fields.
• The original design called for a higher number of scan lines (over 800), but the final version settled on 625.
• SECAM and PAL are similar, differing slightly in their color coding scheme. In SECAM, the U and V signals are modulated using separate color subcarriers at 4.25 MHz and 4.41 MHz, respectively.
Color Spaces in Video
• A color space is a mathematical representation of a range of colors.
• When referring to video, many people use the term “color space” when they actually mean the “color model.”
• Some common color models include:
  ◦ RGB,
  ◦ YUV 4:4:4,
  ◦ YUV 4:2:2,
  ◦ and YUV 4:2:0.
How are colors represented digitally?
• Virtually all displays, whether TV, smartphone, monitor, or otherwise, start by displaying colors at the same level: the pixel.
• A pixel is a small component capable of displaying any single color at a time.
• Pixels are like tiles in a mosaic, with each pixel representing a single sample of a larger image. When properly aligned and illuminated, they can collectively be presented as a complex image to a viewer.
Pixel representation of a sample of a larger image
• While the human eye perceives each pixel as a single color, every pixel is actually made up of a combination of three subpixels colored red, green, and blue.
RGB color space (model)
• By mixing red, green, and blue, it is possible to obtain a wide spectrum of colors.
• This is referred to as RGB additive mixing.
• The color space itself is a mathematical representation of that range of colors.
RGB color space (model)
• 8-bit vs 10-bit color
  ◦ 8-bit and 10-bit refer to the number of bits per color component, or color depth.
  ◦ RGB 8-bit (sometimes written as RGB 8:8:8) refers to a pixel with 8 bits for the red component, 8 bits for the green component, and 8 bits for the blue component.
  ◦ This means that each color component can be represented in 2^8 = 256 hues.
  ◦ Since there are three color components per pixel, this gives a total of 256^3, or about 16.77 million, possible colors per pixel.

RGB color space (model)
• Similarly, RGB 10-bit refers to a pixel with 10 bits each for the red, green, and blue components.
• Each color component can therefore be represented in 2^10 = 1024 hues, giving a total of 1024^3, or about 1.074 billion, possible pixel colors. (A quick numeric check follows below.)
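A quick numeric check of the arithmetic above:

    # Verifying the color-depth figures on the previous slides.
    for bits in (8, 10):
        hues = 2 ** bits    # hues per color component
        colors = hues ** 3  # three components: R, G, B
        print(f"{bits}-bit: {hues} hues, {colors:,} colors per pixel")
    # 8-bit: 256 hues, 16,777,216 colors per pixel
    # 10-bit: 1024 hues, 1,073,741,824 colors per pixel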
YUV or YCbCr color space (model)
• The YUV color model was invented as a broadcast solution to send color information through channels built for monochrome signals.
• Color is incorporated into a monochrome signal by combining the monochrome signal (also called brightness, luminance, or luma, and represented by the symbol Y) with two chrominance signals (also called chroma and represented by the symbols U/V or Cb/Cr).
• This allows for full color definition and image quality on the receiving end of the transmission.
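As an illustration, a minimal sketch of a luma/chroma conversion using the widely used BT.601 weights (one of several standards; analog PAL YUV uses related but differently scaled coefficients):

    def rgb_to_ycbcr(r: float, g: float, b: float):
        """r, g, b in [0, 1]; returns (Y, Cb, Cr) with chroma centered at 0."""
        y  = 0.299 * r + 0.587 * g + 0.114 * b  # luma: weighted brightness
        cb = 0.564 * (b - y)                    # blue-difference chroma
        cr = 0.713 * (r - y)                    # red-difference chroma
        return y, cb, cr

    print(rgb_to_ycbcr(1.0, 1.0, 1.0))  # white: Y = 1, both chroma = 0
    print(rgb_to_ycbcr(1.0, 0.0, 0.0))  # pure red: low luma, large Cr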
Chroma subsampling
• Storing or transferring video over IP can be taxing on network infrastructure.
• Chroma subsampling is a way to represent this video at a fraction of the original bandwidth, thereby reducing the strain on the network.
• It takes advantage of the human eye’s greater sensitivity to brightness than to color.
• By reducing the detail required in the color information, video can be transferred at a lower bitrate in a way that is barely noticeable to viewers.
YUV 4:4:4
• Full color depth is usually referred to as 4:4:4.
  ◦ The first number is the width in pixels of the sampling region (four pixels across),
  ◦ the second is the number of chroma samples in the first row of that region,
  ◦ and the third is the number of chroma samples in the second row.
[Figure: 4:4:4 full color depth]
4:4:4 chroma subsampling
• Each pixel then receives three signals: one luma (brightness) component represented by Y, and two color-difference components, known as chroma, represented by Cb (U) and Cr (V).
YUV subsampling
• Subsampling is a way of sharing color across multiple pixels, making use of the eye and brain’s natural tendency to blend neighboring pixels.
• Subsampling reduces the color resolution by sampling the chroma information at a lower rate than the luma information.
YUV 4:2:2 chroma subsampling
• 4:2:2 subsampling means that the chroma components are sampled at only half the frequency of the luma:
  ◦ The chroma components from pixels one, three, five, and seven are shared with pixels two, four, six, and eight, respectively. This reduces the overall image bandwidth by 33% (see the sketch below).
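A minimal sketch of this, assuming NumPy arrays for the chroma planes: keep every luma sample, but only every second chroma sample in each row:

    import numpy as np

    cb = np.arange(16).reshape(2, 8)  # hypothetical 2x8 chroma plane

    cb_422 = cb[:, 0::2]  # drop every other chroma sample horizontally
    print(cb_422.shape)   # (2, 4): half the chroma samples of 4:4:4
    # On display, each kept sample is shared with its right-hand neighbor.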
YUV 4:2:0 chroma subsampling
• In 4:2:0 subsampling, the chroma components are sampled at a quarter of the frequency of the luma.
• The components are shared by four pixels in a square pattern, which reduces the overall image bandwidth by 50% (see the sketch below).
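A minimal sketch, averaging each 2x2 block into one chroma sample (averaging is a common choice, though encoders differ in the exact filter):

    import numpy as np

    cb = np.arange(16, dtype=float).reshape(4, 4)  # hypothetical 4x4 chroma plane

    # One chroma sample per 2x2 pixel block: a quarter of the 4:4:4 samples.
    cb_420 = cb.reshape(2, 2, 2, 2).mean(axis=(1, 3))
    print(cb_420.shape)  # (2, 2)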
YUV 4:4:4 vs 4:2:2 vs 4:2:0
[Figure: how a 4x2 pixel region is represented in 4:2:0 and 4:2:2 subsampling]
Subsampling size saving
With 8 bits per component (a numeric check follows below):
• In 4:4:4, each pixel requires three bytes of data (since all three components are sent per pixel).
• In 4:2:2, every two pixels carry four bytes of data. This gives an average of two bytes per pixel (a 33% bandwidth reduction).
• In 4:2:0, every four pixels carry six bytes of data. This gives an average of 1.5 bytes per pixel (a 50% bandwidth reduction).
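A quick numeric check of these figures (8 bits = 1 byte per sample; counts are per group of four pixels):

    # Bytes per 4-pixel group: 4 luma samples plus the Cb/Cr pairs kept.
    schemes = {
        "4:4:4": 4 + 4 * 2,  # 4 chroma pairs kept -> 12 bytes
        "4:2:2": 4 + 2 * 2,  # 2 chroma pairs kept ->  8 bytes
        "4:2:0": 4 + 1 * 2,  # 1 chroma pair kept  ->  6 bytes
    }
    for name, total in schemes.items():
        per_pixel = total / 4
        saving = 1 - per_pixel / 3  # relative to 3 bytes/pixel in 4:4:4
        print(f"{name}: {per_pixel} bytes/pixel, {saving:.0%} saving")
    # 4:4:4: 3.0 bytes/pixel, 0% saving
    # 4:2:2: 2.0 bytes/pixel, 33% saving
    # 4:2:0: 1.5 bytes/pixel, 50% saving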
When to use chroma subsampling and when to avoid it?
• Chroma subsampling is a useful method for natural content, where the lower chroma resolution isn't noticeable.
• On the other hand, for complex and precise synthetic content (for example, computer-generated imagery), full color depth is needed to prevent visible artifacts such as edge blurring, since pixel-precise content may exacerbate them.
[Figure: how CGI data can be impacted by subsampling]
