MMC Chap2

MMCUnit-2 MM Info Representation

UNIT 2: MULTIMEDIA INFORMATION REPRESENTATION


Contents:
1) Introduction
2) Digitization principles
3) Text
4) Images
5) Audio
6) Video

Instructional Objectives: At the end of the unit, the students should be able to:
1. Explain the various steps involved in the conversion of an analog signal into a digital format.
2. List the various types of text representations used, and understand how each is represented.
3. Enumerate the representation of graphics, digitized documents and digitized pictures.
4. Elaborate the representation of speech and synthesized audio signals.
5. Understand the scanning principles and representation of video signals in broadcast television.
1 BSS ECE, REVA

1. INTRODUCTION:
Multimedia information is stored and processed within a computer in digital form. In the case of textual information, each character is represented by a combination of a fixed number of bits called a codeword. Most of the signals input from, and output back to, the natural environment are analog signals: signals whose amplitude (the magnitude of the sound or image intensity) varies continuously with time. To process such analog signals they must first be converted into digital form, and on output they must be converted back into analog form. The electrical circuit that converts an analog signal into digital form is called the signal encoder; the reverse operation is performed by the signal decoder.

2. DIGITIZATION PRINCIPLES:
2.1 Analog Signals: Fourier analysis can be used to show that any time-varying analog signal is made up of a possibly infinite number of single-frequency sinusoidal signals whose amplitude and phase vary continuously with time relative to each other. This is shown in fig 2.1(a). Fig 2.1(b) shows the highest and lowest frequency components of the signal in fig 2.1(a).


Fig 2.1(a): Analog Signal

Fig 2.1(b): Lowest & Highest frequency components

The range of frequencies of the sinusoidal components that make up a signal is called the signal bandwidth, as shown in fig 2.1(c). The bandwidth of the transmission channel should be equal to or greater than the bandwidth of the signal; a channel that passes only a limited band of frequencies is called a bandlimiting channel, as shown in fig 2.1(d).

Fig 2.1 (c): Signal bandwidth

Fig 2.1 (d): Bandlimiting Channel

2.2 Encoder Design:



The principles of an encoder are shown in fig 2.2. It consists of a bandlimiting filter and an ADC (comprising a sample-and-hold circuit and a quantizer). The waveforms are shown in fig 2.3 and are explained as follows:
(A): Input signal to the circuit.
(B): The bandlimiting filter removes selected higher-frequency components from the source signal (A).
(C): Signal (B) is then fed to the sample-and-hold circuit.
(D): The circuit samples the amplitude of the filtered signal at regular time intervals and holds each sample amplitude constant between samples.

Fig 2.2: Circuit of signal encoder

Fig 2.3: Waveforms signal encoder

2.2.1 Sampling Rate:
The quantizer circuit converts each sample amplitude into a binary value known as a codeword (E). The signal is sampled at a rate higher than the maximum rate of change of the signal amplitude, and the number of different quantization levels is made as large as possible. The Nyquist sampling theorem states that, in order to obtain an accurate representation of a time-varying analog signal, its amplitude must be sampled at a minimum rate equal to or greater than twice the highest sinusoidal frequency component present in the signal. Sampling a signal at a rate lower than the Nyquist rate causes distortion.
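The Nyquist relationship, and the folding of an undersampled tone onto a lower apparent frequency, can be sketched in a few lines of Python (an illustration added here, not part of the original notes; the function names are my own):

```python
def nyquist_rate(f_max_hz):
    """Minimum sampling rate (samples/s) for a signal whose highest
    sinusoidal component is f_max_hz, per the Nyquist sampling theorem."""
    return 2 * f_max_hz

def alias_frequency(f_hz, fs_hz):
    """Apparent (folded) frequency of a sinusoid of f_hz when sampled
    at fs_hz.  If fs_hz >= 2*f_hz the tone is returned unchanged."""
    f = f_hz % fs_hz
    return min(f, fs_hz - f)

print(nyquist_rate(3400))           # 6800: rate needed for 3.4 kHz speech
print(alias_frequency(3000, 4000))  # 1000: a 3 kHz tone sampled at 4 kHz aliases to 1 kHz
```

Sampling the 3 kHz tone below its 6 kHz Nyquist rate produces exactly the alias signal described above.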

The spurious signals produced in this way are called alias signals, as they replace the corresponding original signals. This is represented in fig 2.4. The Nyquist rate is expressed in samples per second.

Fig 2.4: Alias signal generation due to sampling at a rate lower than the Nyquist rate

2.2.3 Quantization intervals: Since a finite number of digits is used, each sample can only be represented by a corresponding number of discrete levels. If Vmax is the maximum positive and negative signal amplitude and n is the number of binary bits used, then the magnitude of each quantization interval, q, is given by:

q = 2Vmax / 2^n
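The quantization interval formula q = 2Vmax / 2^n is simple enough to check numerically (a small added sketch; the function name is my own):

```python
def quantization_interval(v_max, n_bits):
    """Size of one quantization step: the full range 2*Vmax divided
    into 2**n equal intervals."""
    return 2 * v_max / 2 ** n_bits

print(quantization_interval(1.0, 8))  # 0.0078125 V: 8 bits over a +/-1 V range
```

Each extra bit halves the interval, and hence halves the worst-case quantization error of q/2.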


Fig 2.5: Quantization procedure

Each codeword corresponds to a nominal amplitude level which is at the center of the corresponding quantization interval. The difference between the actual signal amplitude and the corresponding nominal amplitude is called the quantization error (quantization noise). The ratio of the peak amplitude of a signal to its minimum amplitude is known as the dynamic range of the signal, D (in decibels):

D = 20 log10 (Vmax / Vmin) dB

When determining the quantization interval, it is necessary to ensure that the level of quantization noise relative to the smallest signal amplitude is acceptable.

2.3 Decoder Design


The decoder reproduces the original signal (digital-to-analog conversion). The decoder circuit comprises a DAC and a low-pass filter. The output of the DAC is passed through the low-pass filter, which passes only those frequency components that made up the original filtered signal. The low-pass filter is also called the recovery or reconstruction filter. This is illustrated in fig 2.6 and the waveforms in fig 2.7.

Fig 2.6: Decoder

Fig 2.7: Waveforms of the decoder

3. TEXT
There are 3 types of text:
3.1 Unformatted text
3.2 Formatted text
3.3 Hypertext

3.1 Unformatted text
The American Standard Code for Information Interchange (ASCII) character set is widely used to create unformatted text strings, and mosaic characters are used to create relatively simple graphical images, as shown in figs 2.8 and 2.9. ASCII codes consist of 7 bits, so a total of 128 characters can be represented. The set comprises printable and control characters.
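The mapping from characters to 7-bit codewords can be demonstrated directly (a small added example):

```python
# Each character of an unformatted text string maps to one 7-bit ASCII codeword.
text = "Hi!"
codewords = [ord(c) for c in text]
print(codewords)                              # [72, 105, 33]
print([format(c, '07b') for c in codewords])  # ['1001000', '1101001', '0100001']
```

Three characters therefore need 21 bits of codeword data, before any framing or parity bits are added.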

Fig 2.8: ASCII Set

Fig 2.9: Mosaic Characters


3.2 Formatted text
Formatted text is produced by most word processing packages. It is used for the preparation of papers, books, magazines, etc., each with different headings and with tables, graphics, and pictures inserted at appropriate points. Print preview displays the document on the computer screen in a similar way to how it will be printed; this is also called WYSIWYG, an acronym for what-you-see-is-what-you-get. The program and formatted text are shown in fig 2.10.

Figure 2.10: Formatted text: (a) an example formatted text string; (b) printed version of the string.

3.3 Hypertext
Hypertext enables a related set of documents, normally referred to as pages, to be created with defined linkage points, referred to as hyperlinks, between pages. This is illustrated in fig 2.11.


Fig 2.11 Electronic document edited using hypertext

4 IMAGES
Images are displayed in the form of a two-dimensional matrix of individual picture elements, known as pixels or pels.
4.1 Graphics
There are two forms of representation of a computer graphic: a high-level version (similar to the source code of a high-level program) and the actual pixel image of the graphic (similar to the byte string corresponding to the low-level machine-code bit-map format). Standardized forms of representation include GIF (graphics interchange format) and TIFF (tagged image file format). This is illustrated in fig 2.12.


Fig 2.12: Graphics Principles

4.2 Digitized documents:
An example of a digitized document is the output of a fax machine. The scanner in the fax machine scans each page from left to right, producing a sequence of scan lines. The output of the scanner is digitized so that each pel is represented by a single binary digit: a 0 for a white pel and a 1 for a black pel. This operation is illustrated in fig 2.13.
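A scan line of binary pels can be packed eight to a byte for storage or transmission; a minimal sketch of that packing (added here for illustration, with a hypothetical function name; real fax machines go further and run-length encode the result):

```python
def pack_scan_line(pels):
    """Pack a list of 0/1 pels (0 = white, 1 = black) into bytes,
    8 pels per byte, most significant bit first."""
    out = bytearray()
    for i in range(0, len(pels), 8):
        group = pels[i:i + 8]
        byte = 0
        for bit in group:
            byte = (byte << 1) | bit
        byte <<= 8 - len(group)   # pad a short final group with white pels
        out.append(byte)
    return bytes(out)

line = [0, 0, 1, 1, 1, 1, 0, 0, 1, 1]
print(pack_scan_line(line).hex())   # '3cc0'
```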

Fig 2.13: Fax machine schematic and digitized format. 4.3 Digitized pictures:

Colour principles: A whole spectrum of colours, known as a colour gamut, can be produced by using different proportions of red (R), green (G), and blue (B), as shown in fig 2.14. Additive colour mixing produces a colour image on a black surface; subtractive colour mixing is used for producing a colour image on a white surface.
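The link between the two mixing schemes is that each subtractive primary (cyan, magenta, yellow) is the complement of an additive one, which can be sketched in one line (an added illustration with normalised 0..1 values; the function name is my own):

```python
def rgb_to_cmy(r, g, b):
    """Subtractive (CMY) primaries are the complements of the
    additive (RGB) ones, for components normalised to 0..1."""
    return (1 - r, 1 - g, 1 - b)

print(rgb_to_cmy(1, 1, 0))   # (0, 0, 1): yellow, i.e. a pigment absorbing only blue
```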

Fig 2.14: Colour derivation: additive and subtractive principles

Most televisions use raster-scan principles. The picture tubes used in most television sets operate using what is known as a raster scan: a finely focussed electron beam is scanned over the complete screen.


Fig 2.15: Television principles

Progressive scanning repeats the scanning operation, starting at the top left corner of the screen and ending at the bottom right corner, after which the beam is deflected back to the top left corner. Each complete set of horizontal scan lines is called a frame. The number of bits per pixel is known as the pixel depth and determines the range of different colours that can be produced. The set of three related colour-sensitive phosphors associated with each pixel is called a phosphor triad.
Concepts of digitized images:
Frame: Each complete set of horizontal scan lines (525 for North & South America and most of Asia, or 625 for Europe and other countries).
Flicker: Caused by the previous image fading from the eye's retina before the next image is displayed, owing to a low refresh rate (to avoid it, a refresh rate of at least 50 times per second is required).
Pixel depth: Number of bits per pixel, which determines the range of different colours that can be produced.
Colour Look-up Table (CLUT): Table that stores the selected subset of colours, so each pixel holds only an address into the table, reducing the amount of memory required to store an image.

Aspect Ratio: This is the ratio of the screen width to the screen height (television tubes and PC monitors have an aspect ratio of 4:3; widescreen television is 16:9). Some examples of display resolutions and memory requirements are shown in fig 2.16.

Fig 2.16: Examples of display resolution and memory requirements

Various standards use their own frame formats. Some of them are as follows:
NTSC = 525 lines per frame (480 visible)
PAL, CCIR, SECAM = 625 lines per frame (576 visible)
Example display resolutions: VGA (640x480x8), XGA (1024x768x8) and SVGA (1024x768x24).

Fig 2.17 shows the screen resolution.
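The memory requirements in fig 2.16 follow directly from width x height x pixel depth; a quick added check for the resolutions listed above (the function name is my own):

```python
def frame_memory_bytes(width, height, pixel_depth_bits):
    """Memory needed to store one frame: width * height * depth / 8 bytes."""
    return width * height * pixel_depth_bits // 8

for name, (w, h, d) in {"VGA":  (640, 480, 8),
                        "XGA":  (1024, 768, 8),
                        "SVGA": (1024, 768, 24)}.items():
    print(name, frame_memory_bytes(w, h, d), "bytes")
# VGA  307200  bytes (300 KB)
# XGA  786432  bytes (768 KB)
# SVGA 2359296 bytes (2.25 MB)
```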

Fig 2.17: Screen Resolution

A typical arrangement used to capture and store a digital image produced by a scanner or a digital camera (either a still camera or a video camera) is shown in fig 2.18.


Fig 2.18: Schematic of colour image capture

Photosites: A silicon chip consisting of a two-dimensional grid of light-sensitive cells, each of which stores the level of intensity of the light that falls on it.
Charge-coupled device (CCD): An image sensor that converts the level of light intensity at each photosite into an equivalent electrical charge.

Fig 2.19: RGB signal generation techniques.

5 AUDIO:
There are two types of audio signal:
- Speech signals, as used in a variety of interpersonal applications including telephony and video telephony
- Music-quality audio, as used in applications such as CD-on-demand and broadcast television

Audio can be produced either naturally by means of a microphone or electronically using some form of synthesizer. The bandwidth of a typical speech signal is about 50 Hz to 10 kHz and that of a music signal about 15 Hz to 20 kHz. Listening tests have recommended the use of a minimum of 12 bits per sample for speech and 16 bits for music.
5.1 PCM Speech
Initially the PSTN operated with analogue signals throughout, the source speech signal being transmitted and switched in analogue form. Today these circuits have largely been replaced with digital ones. To support interworking of the analogue and digital circuits, the design of the digital equipment is based on the analogue network operating parameters. The bandwidth of a speech circuit was limited to 200 Hz to 3.4 kHz. The digitization procedure is known as pulse code modulation (PCM).
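The resulting bit rates follow directly from the sampling rate and sample size; a small added check using the standard PSTN figures (3.4 kHz speech sampled at 8 kHz with 8 bits per sample) and CD audio (44.1 kHz, 16 bits, stereo):

```python
def pcm_bit_rate(sample_rate_hz, bits_per_sample):
    """Bit rate of a single PCM channel in bits per second."""
    return sample_rate_hz * bits_per_sample

print(pcm_bit_rate(8000, 8))        # 64000 bps: standard PSTN speech channel
print(pcm_bit_rate(44100, 16) * 2)  # 1411200 bps: stereo CD-quality audio
```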

Fig 2.20: Signal encoding and decoding schematic

With linear quantization, the same level of quantization noise is produced irrespective of the signal amplitude (the noise level is the same for quiet signals and loud signals). PCM therefore uses two additional circuits, a compressor (in the encoder) and an expander (in the decoder), to reduce the effect of quantization noise with just 8 bits per sample: the quantization intervals are made non-linear, with narrower intervals for small-amplitude signals than for large-amplitude signals. This is achieved by means of the compressor circuit. The analog output from the DAC is passed to the expander circuit, which performs the reverse operation of the compressor. The overall operation is known as companding. The compression and expansion characteristic used in Europe is known as A-law.
5.2 CD-Quality Audio:
The computer takes input commands from the keyboard and outputs these to the sound generators, which produce the corresponding sound waveform to drive the speakers.
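The compressor side of A-law companding can be sketched from its standard definition (per ITU-T G.711, with A = 87.6 -- added here as an illustration, not taken from the original notes):

```python
import math

A = 87.6  # A-law parameter used in Europe

def a_law_compress(x):
    """A-law compressor for a normalised sample x in [-1, 1]:
    linear near zero, logarithmic for larger amplitudes."""
    ax = abs(x)
    if ax < 1 / A:
        y = A * ax / (1 + math.log(A))
    else:
        y = (1 + math.log(A * ax)) / (1 + math.log(A))
    return math.copysign(y, x)

# Small amplitudes are boosted relative to large ones before quantization:
print(round(a_law_compress(0.01), 3))  # ~0.16
print(round(a_law_compress(0.5), 3))   # ~0.873
```

Because quiet samples occupy a larger share of the codeword range, the quantization noise relative to quiet signals is reduced, which is exactly the point of companding.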


Fig 2.21: Synthesized Audio

Synthesized audio is often used since the amount of memory required can be two or three orders of magnitude less than that required to store the equivalent digitised waveform. The three main components of an audio synthesizer are the computer (with various application programs), the keyboard (based on that of a piano) and the set of sound generators. The computer takes the commands and outputs these to the sound generators, which in turn produce the corresponding sound waveform via DACs to drive the speakers. Pressing a key has a similar effect to pressing a key on a computer keyboard: for each key press a different codeword (a message indicating the key pressed and the pressure applied) is generated. The control panel contains a range of switches and sliders that collectively allow the user to indicate to the program information such as the volume of the generated output and the sound effects to be associated with each key. To discriminate between the inputs from different possible sources, a standard set of messages (which also defines the type of connectors, cables, electrical signals, etc.) has been defined: the Musical Instrument Digital Interface (MIDI).
Event: An occurrence such as a key being pressed.
Status byte: Defines the particular event that has caused the message to be generated.
Data bytes: Collectively define a set of parameters (pressure applied, identity of the key) associated with the event.

It is important to identify the different instruments that generated the events. Each instrument has a MIDI code associated with it; e.g. a piano has a code of 0 and a violin 40. Since the music is in the form of MIDI messages, a sound card is needed in the client computer to interpret the sequence.

6. VIDEO
6.1 Broadcast Television: The three main properties of a colour source that the eye makes use of are:
Brightness: Represents the amount of energy that stimulates the eye (from black, the lowest, to white, the highest).
Hue: Represents the actual colour of the source (each colour has a different frequency/wavelength).
Saturation: Represents the strength of the colour.
Luminance refers to the brightness of a source; hue and saturation, which are concerned with its colour, are referred to as its chrominance characteristics. The combination of the three signals Y (the luminance signal), Cb (blue chrominance), and Cr (red chrominance) contains all the necessary information to describe a colour signal. The principles of colour television can be explained as follows. Colour transmission is based on two facts:

- The first is that all colours may be produced by the addition of appropriate quantities of the three primary colours R, G and B, e.g.:
Yellow = R + G
Magenta = R + B
White = R + G + B
Colours such as yellow and magenta are known as complementary colours.

- The second fact is that the human eye reacts predominantly to the luminance (black-and-white) component of a colour picture, much more than to its chrominance (colour) components. Colour TV transmission therefore involves the simultaneous transmission of the luminance and chrominance components of a colour picture, with luminance predominant over chrominance. The chrominance component is first purified by removing the luminance component from each primary colour, resulting in what are known as the colour difference signals: R-Y, G-Y and B-Y. Since the luminance signal Y is a weighted sum of R, G and B, only two colour difference signals need to be transmitted, namely R-Y and B-Y; the third, G-Y, may be recovered at the receiver from the three transmitted components Y, R-Y and B-Y. In analog TV broadcasting, the colour difference signals B-Y and R-Y are known as U and V respectively. In digital television they are referred to as Cb and Cr.
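The recovery of G-Y at the receiver can be verified numerically. The sketch below (added for illustration) uses the standard ITU-R BT.601 luminance weights, which the original text does not spell out:

```python
def color_difference(r, g, b):
    """Form luminance Y (ITU-R BT.601 weights) and the two
    transmitted colour difference signals R-Y and B-Y."""
    y = 0.299 * r + 0.587 * g + 0.114 * b
    return y, r - y, b - y

def recover_g(y, r_minus_y, b_minus_y):
    """Rebuild G at the receiver from Y, R-Y and B-Y alone."""
    r = y + r_minus_y
    b = y + b_minus_y
    return (y - 0.299 * r - 0.114 * b) / 0.587

y, ry, by = color_difference(0.8, 0.4, 0.2)
print(round(recover_g(y, ry, by), 6))   # ~0.4: green recovered without being transmitted
```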


Fig 2.22: Interlaced Scanning Principles

In NTSC the eye is more responsive to the I signal than to the Q signal, so the I signal is allocated more bandwidth; this maximizes the use of the available bandwidth while minimizing the level of interference with the luminance signal.


Fig 2.23: Signal Bandwidth Baseband spectrum of colour TV in NTSC System

In PAL, the larger luminance bandwidth allows both the U and V chrominance signals to have the same modulated bandwidth of 3 MHz. The addition of the sound signal to the video signal forms what is called the composite baseband signal.

Fig 2.24: Signal Bandwidth - Baseband spectrum of colour TV in the PAL system

There are three main systems of analogue colour encoding: NTSC (used in the USA), PAL (used in the UK) and SECAM (used in France).
- All three systems split the colour picture into luminance and chrominance.
- All three use the colour difference signals to transmit the chrominance.
- SECAM transmits the colour difference signals on alternate lines.
- The other two systems, NTSC and PAL, transmit both chrominance components simultaneously using a technique known as quadrature amplitude modulation (QAM).


6.2 Digital Video
With digital television it is usual to digitize the three component signals separately prior to transmission, to enable editing and other operations to be readily performed. Since the eye is less sensitive to colour than to luminance, a significant saving in the resulting bit rate can be achieved by using the luminance and two colour difference signals instead of RGB directly. Digitization formats exploit the fact that the two chrominance signals can tolerate a reduced resolution relative to that used for the luminance signal.

Sampling Structure:

There are several structures for subsampling the chrominance components. One is to sample the chrominance components at every other pixel, known as the 4:2:2 sampling structure. This reduces the chrominance resolution in the horizontal dimension only, leaving the vertical resolution unaffected. The ratio 4:2:2 (Y:Cr:Cb) indicates that both Cr and Cb are sampled at half the rate of the luminance signal. It is used in television studios, with a bandwidth of up to 6 MHz for the luminance signal and less than half this for the chrominance signals. 4:2:0 is a derivative of the 4:2:2 format and is used in digital video broadcast applications, achieving good picture quality.
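The horizontal-only nature of 4:2:2 subsampling reduces to keeping every other chroma sample along each scan line (a small added sketch; the function name is my own):

```python
def subsample_422(chroma_row):
    """4:2:2 keeps every other chroma sample along a line, halving
    horizontal chrominance resolution; vertical resolution is untouched."""
    return chroma_row[::2]

cr_row = [10, 12, 14, 16, 18, 20, 22, 24]
print(subsample_422(cr_row))   # [10, 14, 18, 22]
```

Applied to both Cr and Cb, this cuts the chrominance data, and hence a third of the total bit rate, in half compared with full 4:4:4 sampling.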

Figure 2.25: Sample positions with 4:2:2 digitization format.



Fig 2.26: 4:2:0 Format
