MPEG Audio Compression-1

You might also like

Download as pdf or txt
Download as pdf or txt
You are on page 1of 19

MPEG AUDIO COMPRESSION

1. PSYCHOACOUSTICS
|

Psychoacoustics is the research where you aim to understand how the ear and brain interact as various sounds enter the ear. A perceptual audio codec is a codec that takes advantage of this human characteristic. The range of human hearing is about 20 Hz to about 20 kHz The easiest frequency range to perceive by humans is typically from about 2 kHz to 4 kHz The dynamic range, the ratio of the maximum sound amplitude to the quietest sound that humans can hear, is on the order of about 120 dB

EQUAL-LOUDNESS RELATIONS
|

Fletcher-Munson Curves
y

Equal loudness curves that display the relationship between perceived loudness ( (Phons Phons , in dB) for a given stimulus sound volume (Sound Pressure Level, also in dB), as a function of frequency

Fig. 1 shows the ears perception of equal loudness:


The bottom curve shows what level of pure tone stimulus is required to produce the perception of a 10 dB sound y All the curves are arranged so that the perceived loudness level gives the same loudness as for that loudness level of a pure tone at 1 kHz
y
3

EQUAL-LOUDNESS RELATIONS

Fig 1: Flaetcher-Munson Fig.1: Flaetcher Munson Curves (re-measured (re measured by Robinson and Dadson)

THRESHOLD OF HEARING
|

A plot of the threshold of human hearing for a pure tone

Fig. 2: Threshold of human hearing, for pure tones

THRESHOLD OF HEARING (CONTD)


The threshold of hearing curve: if a sound is above the dB level shown then the sound is audible | Turning T i up a t tone so that th t it equals l or surpasses the curve means that we can then distinguish the sound | An approximate formula exists for this curve:
|

Threshold( es o d( f ) = 3.64( 3.6 ( f /1000) / 000)


y

0.8

6.5 e

0.6( f /1000 3.3)2

+ 10 0 3 ( f / /1000) 000) 4

The threshold units are dB; the frequency for the origin (0,0) in formula (14.1) is 2,000 Hz: Threshold(f) = 0 at f =2 kHz
6

FREQUENCY MASKING
|

Experiments have shown that the human ear has 24 frequency bands (Basilar membrane). Frequencies in these so called critical bands are harder to distinguish by the human ear.

FREQUENCY MASKING
|

Suppose there is a dominant tonal component present in an audio signal. The dominant noise will introduce a masking threshold that will mask out frequencies in the same critical band (see Figure below). This frequency-domain phenomenon is known as simultaneous masking, which has been observed within critical bands.

FREQUENCY MASKING CURVES


|

Frequency masking is studied by playing a particular pure tone, say 1 kHz again, at a loud volume and determining how this tone affects our volume, ability to hear tones nearby in frequency
y

one would g generate a 1 kHz masking g tone, , at a fixed sound level of 60 dB, and then raise the level of a nearby tone, e.g., 1.1 kHz, until it is just audible

The threshold in Fig. 3 plots the audible level for f a single masking tone (1 kHz) Fig. 4 shows Fi h how h the h plot l changes h if other h masking ki tones are used
9

FREQUENCY MASKING CURVES

Fig. 3: Effect on threshold for 1 kHz masking tone

10

FREQUENCY MASKING CURVES

Fig. 4: Effect of masking tone at three different frequencies


11

CRITICAL BANDS
|

Critical bandwidth represents the ears resolving power for simultaneous tones or partials
y

At the th low-frequency l f end, d a critical iti l band b d is i less l than th 100 Hz wide, while for high frequencies the width can be greater than 4 kHz

Experiments indicate that the critical bandwidth:


for masking frequencies < 500 Hz: remains approximately constant in width ( about 100 Hz) y for masking frequencies > 500 Hz: increases approximately i t l li linearly l with ith f frequency
y

12

TABLE 1. 1 25-C 25 CRITICAL BANDS AND BANDWIDTH

13

14

BARK UNIT
Bark unit is defined as the width of one critical band, for any masking frequency | The Th idea id of f the th Bark B k unit: it every critical iti l band b d width idth is roughly equal in terms of Barks (refer to Fig. 5)
|

Fig 5 Fig. 5: Effect Eff t of f masking ki g tones, t expressed d in i Bark B k units it

15

CONVERSION: FREQUENCY & CRITICAL BAND NUMBER


|

Conversion expressed in the Bark unit:

f /100, for f < 500 , Critical band number (Bark) = og 2 ( f / /1000), 000), for o f 500.% 9 + 4 log
|

(2) ( )

Another formula used for the Bark scale: b = 13.0 13 0 arctan(0.76 arctan(0 76 f)+3.5 f)+3 5 arctan(f2/56.25) arctan(f2/56 25) where f is in kHz and b is in Barks (the same applies to all below) (3)

Th i The inverse equation: ti f = [(exp(0.219*b)/352)+0.1]*b0.032*exp[0.15*(b5)2] (4)

The critical bandwidth (df) for a given center frequency f can also be approximated by: df = 25 + 75 [1 + 1.4(f2)]0.69 (5)
16

TEMPORAL MASKING
|

Phenomenon: any loud tone will cause the hearing receptors in the inner ear to become saturated and require time to recover The following figures show the results of Masking experiments:

17

Fig. 6: The louder is the test tone, the shorter it takes for our hearing to get over hearing the masking.

TEMPORAL MASKING
|

Temporal masking occurs in the time-domain. A stronger tonal component (masker) will mask a weaker one (maskee) if they appear within a small interval of time. The masking threshold will mask weaker signals pre and post to the masker. Premasking usually lasts about 50 ms while postmasking will last from 50 to 300 ms, depending on the strength and duration of the masker as shown in Figure below.

18

TEMPORAL MASKING

Fig. 14.8: For a masking tone that is played for a longer time, it t k longer takes l before b f a test t t tone t can be b heard. h d Solid curve: masking tone played for 200 ms; dashed curve: masking tone played for 100 ms.
19

You might also like