1. The document discusses MPEG-1 Audio, the first high quality audio compression standard that could provide CD quality two-channel audio at 256 kbits/s.
2. It describes the key aspects of MPEG-1 Audio including psychoacoustics, subband coding, and its three layers (Layer I, II, and III) that provide increasing quality and compression ratios.
3. The document outlines the encoder and decoder block diagrams, how filterbanks and quantization are used, and new features of Layer III including MDCT, nonuniform quantization, and entropy coding.
18-899 Special Topics in Signal Processing
Multimedia Communications: Coding, Systems, and Networking
Lecture 8: MPEG-1 Audio
tsuhan@ece.cmu.edu
18-899/Spring 1998/Chen

Outline
- Background
- Psychoacoustics
- Subband coding
- Layer I and II
- Layer III
- Frame structure and packetization

MPEG-1 Audio
- ISO/IEC 11172-3 (1988~1991)
- First high quality audio compression standard
- CD quality two-channel audio at 256 kbits/s
- CD: 44.1 kHz × 16 bits × 2 = 1.411 Mbits/s

Signal             Frequency Band (Hz)   Sampling Rate (kHz)   Bits per Sample   Raw Bitrate (kbits/s)
Telephone Speech   300~3400              8                     8                 64
Wideband Speech    50~7000               16                    8                 128
Mediumband Audio   10~11000              24                    16                384
Wideband Audio     10~22000              48                    16                768

Quality Demonstration
- MPEG-1 Audio (Layer II)
- Stereo 44.1 kHz at 64 kbits/s
- Stereo 44.1 kHz at 128 kbits/s
- Stereo 44.1 kHz at 192 kbits/s
- Stereo 44.1 kHz at 256 kbits/s

Psychoacoustics
- Threshold in quiet

Frequency Masking

Temporal Masking
- Post-masking: 50~200 ms
- Also pre-masking (much shorter)

Encoder Block Diagram
[Figure: 11172-3 Encoder. Block labels: mapping, quantizer and coding, frame packing, psychoacoustic model. Signal labels: PCM audio samples (32, 44.1, 48 kHz), ancillary data, encoded bitstream.]

Decoder Block Diagram
[Figure: 11172-3 Decoder. Block labels: frame unpacking, reconstruction, inverse mapping. Signal labels: encoded bitstream, ancillary data, PCM audio samples (32, 44.1, 48 kHz).]

Mapping: Subband Coding
[Figure: M-channel filterbank. Analysis filterbank H_1(z), H_2(z), ..., H_M(z), each followed by downsampling by M and a quantizer Q; synthesis filterbank of upsampling by M followed by F_1(z), F_2(z), ..., F_M(z).]
- Critical downsampling
- Q should be based on signal-to-masking ratio (SMR)
- The ear's critical bands are not uniform, but logarithmic

Alias cancellation and perfect reconstruction
[Figure: polyphase structure. Delay chain z^-1, downsampling by M, matrix E(z), matrix R(z), upsampling by M, chain of z elements.]
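The critical downsampling and perfect reconstruction properties named in the subband coding slides can be illustrated with the smallest possible case. The sketch below is a two-band Haar filterbank (M = 2), chosen only because it is easy to verify by hand; it is not the 32-band filterbank of MPEG-1 Audio. The function names `haar_analysis` and `haar_synthesis` and the test signal are assumptions made for illustration. Splitting the input into even and odd samples plays the role of the delay chain and downsamplers, the 2x2 sum/difference step stands in for E(z), and its inverse stands in for R(z).

```python
import math

def haar_analysis(x):
    """Split an even-length signal into two critically sampled subbands."""
    s = math.sqrt(2.0)
    half = len(x) // 2
    # Even/odd split = polyphase decomposition for M = 2.
    low = [(x[2 * i] + x[2 * i + 1]) / s for i in range(half)]
    high = [(x[2 * i] - x[2 * i + 1]) / s for i in range(half)]
    return low, high

def haar_synthesis(low, high):
    """Invert haar_analysis by applying the inverse 2x2 matrix and interleaving."""
    s = math.sqrt(2.0)
    x = []
    for l, h in zip(low, high):
        x.append((l + h) / s)  # even-indexed sample
        x.append((l - h) / s)  # odd-indexed sample
    return x

# Hypothetical test signal: a mix of a slow and a fast sinusoid.
x = [math.sin(0.3 * n) + 0.5 * math.cos(2.0 * n) for n in range(64)]

low, high = haar_analysis(x)
x_hat = haar_synthesis(low, high)

# Critical downsampling: total number of subband samples equals input length.
total_subband_samples = len(low) + len(high)
# Perfect reconstruction: without quantization, output matches input.
max_err = max(abs(a - b) for a, b in zip(x, x_hat))
```

With no quantizer between analysis and synthesis, `max_err` is at the level of floating-point rounding. In a codec, the quantizers Q sit between the two halves, so the reconstruction error is then set by the bit allocation rather than by the filterbank.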
[Figure title: Polyphase Filterbank]

Layers
- Increasing complexity, delay, and quality
- Layer I: ~384 kbits/s for perceptually lossless quality (4:1)
- Layer II: ~192 kbits/s for perceptually lossless quality (8:1)
- Layer III: ~128 kbits/s for perceptually lossless quality (12:1)

Layer I and II Encoder
[Figure: block labels: Analysis Filterbank, Scaler & Quantizer, Coder, Mux, FFT, Masking Threshold Generator, Dynamic Bit Allocator; numeric label: 32.]

Block-Based Coding
[Figure: Analysis Filterbank output grouped into blocks of 12 samples, with groupings labeled Layer I and Layer II.]
- 12 samples for Layer I, 36 samples for Layer II
- Block companding: each block normalized by scalefactor
- For Layer II, up to 3 scalefactors, with 2-bit scalefactor select
- Each block receives one bit allocation

Layer III Encoder
[Figure: block labels: Analysis Filterbank, MDCT, Scaler & Quantizer, Huffman Coding, Coding, Mux, FFT, Masking Threshold Generator.]
- MDCT: 6 or 18 with overlap

New Features in Layer III
- Modified DCT (MDCT)
  - DCT with overlap
  - Long/short window switching
    - Short for better temporal resolution (to prevent pre-echoes)
    - Long for better frequency resolution
- Nonuniform quantization
- Entropy coding
  - Run-length and Huffman coding
- Bit reservoir (buffer)

Frame Structure
[Figure: frame fields: Header Info, Side Info, Subband Samples, Aux Data.]
- Header info: sync bits, system info, CRC (cyclic redundancy code)
- Side info: bit allocation, scalefactor (and scalefactor select for Layer II and III)
- Subband samples: 32 × 12 for Layer I, 32 × 36 for Layer II and III
- Packetization: 4-byte header, 184-byte payload

Stereo Redundancy Coding
- Four modes: mono, stereo, dual with two separate channels, joint stereo
- In joint stereo mode
  - Human stereo perception > 2 kHz is based on envelope
  - Intensity stereo coding > 2 kHz
    - Encode (L + R)
    - Assign independent left and right scalefactors
  - Layer III supports (L+R) and (L-R) coding
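The (L+R) and (L-R) coding mentioned under Stereo Redundancy Coding can be sketched as a simple sum/difference transform. This is a minimal illustration, not the Layer III procedure: the function names `ms_encode` and `ms_decode`, the 1/2 scaling, and the sample values are all assumptions chosen for readability. It shows why the transform helps when the two channels are similar: most of the energy moves into the sum signal, and the difference signal becomes small.

```python
def ms_encode(left, right):
    """Sum/difference transform: mid carries (L+R)/2, side carries (L-R)/2."""
    mid = [(l + r) / 2.0 for l, r in zip(left, right)]
    side = [(l - r) / 2.0 for l, r in zip(left, right)]
    return mid, side

def ms_decode(mid, side):
    """Invert ms_encode: L = mid + side, R = mid - side."""
    left = [m + s for m, s in zip(mid, side)]
    right = [m - s for m, s in zip(mid, side)]
    return left, right

# Hypothetical, highly correlated stereo samples.
left = [0.50, 0.25, -0.75, 1.00]
right = [0.50, 0.20, -0.70, 0.90]

mid, side = ms_encode(left, right)
left_hat, right_hat = ms_decode(mid, side)

# Energy comparison: for similar channels the side signal is much weaker.
mid_energy = sum(m * m for m in mid)
side_energy = sum(s * s for s in side)
```

Without quantization the transform is invertible, so any coding gain comes from spending fewer bits on the low-energy difference signal.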
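The raw bitrate figures and the compression ratios quoted for the three layers can be checked with a few lines of arithmetic. The slides do not say which sampling rate the 4:1, 8:1, and 12:1 ratios refer to; the sketch below assumes 16-bit stereo and compares both 44.1 kHz and 48 kHz. The helper name `raw_bitrate_kbps` is hypothetical.

```python
def raw_bitrate_kbps(sample_rate_hz, bits_per_sample, channels):
    """Uncompressed PCM bitrate in kbits/s."""
    return sample_rate_hz * bits_per_sample * channels / 1000.0

# CD audio: 44.1 kHz x 16 bits x 2 channels.
cd_kbps = raw_bitrate_kbps(44100, 16, 2)
# Assumed alternative reference: 48 kHz x 16 bits x 2 channels.
ref48_kbps = raw_bitrate_kbps(48000, 16, 2)

# Approximate bitrates for perceptually lossless quality, from the Layers slide.
layer_kbps = {"Layer I": 384, "Layer II": 192, "Layer III": 128}

ratios_cd = {name: cd_kbps / rate for name, rate in layer_kbps.items()}
ratios_48 = {name: ref48_kbps / rate for name, rate in layer_kbps.items()}

for name in layer_kbps:
    print(f"{name}: {ratios_cd[name]:.2f}:1 vs CD, {ratios_48[name]:.2f}:1 vs 48 kHz")
```

Under these assumptions the quoted ratios come out exactly against the 48 kHz reference and slightly lower against the CD rate.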