
Audio Processing for DTV:

Solving LOUD Commercials and Other Problems in 5.1


Tim Carroll
Linear Acoustic Inc.
Tim@LinearAcoustic.com

What We Will Cover


Basics of Dolby Digital (AC-3)
Dolby Digital (AC-3) Decoders
Metadata: Essentials & Problems
Loudness Problems (Real World)
One possible solution
Q&A

Dolby Digital Quick Review


One to 5.1 channels of audio
Highly efficient: 56-640 kbps (5.1 is 448 kbps)
Part of ATSC, DVB, and OpenCable specifications
Includes basic provisions for loudness matching:
  Wideband dynamic range control (dynrng, compr)
  Data-controlled attenuator in consumer decoders (dialnorm)
Advanced features require accurate metadata

Metadata
Data about the audio
MUST be in time with audio
Describes number of channels, loudness, how to downmix, etc.
20-30 parameters per program

Essential Metadata
Never Leave Home Without These:

Dialogue Level
  Either make the value match the program, or make the program match a preset value
Dynamic Range
  Pick a preset (Film Standard is good)
  Make sure RF Overmod is OFF!
Audio Coding Mode (2/0, 3/2L)
  Beware of the switching effects!

Dialogue Level (dialnorm)


A measure of the dialogue loudness of a program.
Measured using an LAeq meter (long-term A-weighted equivalent level); a Dolby DP570 can be used, or better, an LM100.
The value is unique to each program. However:
  A single value can be chosen and the audio can be adjusted to match this value.

[Figure: two level scales from 0 dBFS (digital full scale) to -50 dBFS, each showing signal peaks and average dialogue. Original signal: the dialnorm value marks the dialogue loudness. Signal with dialogue normalization: program level shifted by -11 dB, dialogue loudness at -31 dBFS.]
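The level shift in the figure follows from simple arithmetic: the decoder attenuates by the difference between the indicated dialogue level and -31 dBFS. A minimal sketch, assuming dialnorm is expressed as a negative dBFS value in the range -1 to -31; the function names are illustrative and not part of any Dolby API.

```python
REFERENCE_DIALOG_LEVEL_DBFS = -31  # level that dialogue is normalized to


def dialnorm_attenuation_db(dialnorm_dbfs: int) -> int:
    """Attenuation (dB) a decoder applies for an indicated dialogue level."""
    if not -31 <= dialnorm_dbfs <= -1:
        raise ValueError("dialnorm must be between -31 and -1 dBFS")
    return dialnorm_dbfs - REFERENCE_DIALOG_LEVEL_DBFS


def normalized_level_dbfs(level_dbfs: float, dialnorm_dbfs: int) -> float:
    """Level of any part of the program after dialogue normalization."""
    return level_dbfs - dialnorm_attenuation_db(dialnorm_dbfs)


# Dialogue measured at -20 dBFS and indicated as such:
print(dialnorm_attenuation_db(-20))     # 11 dB of attenuation
print(normalized_level_dbfs(-20, -20))  # dialogue lands at -31 dBFS
```

A program whose dialogue already sits at -31 dBFS gets no attenuation, which is why "make the program match a preset value" also works.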

[Figure: two charts on a 0 to -40 dBFS scale comparing average dialogue and signal peaks for six program types: action movie, drama, sports, symphony, rock, and news. The first chart carries the values 10, 27, 24, 21, 20, 20.]

Dynamic Range
Control words that allow the decoder to modify the dynamic range of the audio:
  Line Mode (dynrng) = 24 dB range
  RF Mode (compr) = 48 dB range
The control word is actually generated in the Dolby Digital encoder.
Values are also part of metadata, but are set via Profiles.
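To make the 24 dB and 48 dB figures concrete, here is a hedged sketch of how an 8-bit gain word could be turned into decibels. The bit layout (a signed shift field in 6.02 dB steps followed by a fractional mantissa) is my reading of ATSC A/52 and is not stated on the slides, so verify it against the standard before relying on it. The function names are illustrative.

```python
import math

STEP_DB = 20 * math.log10(2)  # one shift step, about 6.02 dB


def _gain_word_db(word: int, total_bits: int, shift_bits: int) -> float:
    """Decode a gain word: signed shift X plus mantissa Y = 0.1yyy (binary).

    Gain in dB = (X + 1) * 6.02 + 20 * log10(Y), with 0.5 <= Y < 1.0.
    """
    if not 0 <= word < (1 << total_bits):
        raise ValueError("word out of range")
    frac_bits = total_bits - shift_bits
    x = word >> frac_bits
    if x >= 1 << (shift_bits - 1):  # sign-extend the two's-complement field
        x -= 1 << shift_bits
    y_field = word & ((1 << frac_bits) - 1)
    y = ((1 << frac_bits) + y_field) / (1 << (frac_bits + 1))
    return (x + 1) * STEP_DB + 20 * math.log10(y)


def dynrng_db(word: int) -> float:
    """Line Mode word: 3-bit shift, 5-bit mantissa (assumed layout)."""
    return _gain_word_db(word, 8, 3)


def compr_db(word: int) -> float:
    """RF Mode word: 4-bit shift, 4-bit mantissa (assumed layout)."""
    return _gain_word_db(word, 8, 4)


print(round(dynrng_db(0x80), 2), round(dynrng_db(0x7F), 2))
print(round(compr_db(0x80), 2), round(compr_db(0x7F), 2))
```

Under these assumptions the Line Mode word spans roughly plus or minus 24 dB and the RF Mode word roughly plus or minus 48 dB, with the all-zero word meaning no gain change.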

Dynamic Range Control


[Figure: level scale from 0 to -80 dBFS comparing the original signal, default decoding, and optional decoding.]

Audio Coding Mode


Notation is simple: Front/Rear + L (LFE)
5.1 = 3/2L
Stereo or surround = 2/0
Switching modes causes changes in consumer decoders
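The Front/Rear + L notation is easy to parse mechanically. The sketch below is illustrative only; the `CodingMode` type and `parse_coding_mode` function are hypothetical names, not part of any encoder interface.

```python
from typing import NamedTuple


class CodingMode(NamedTuple):
    front: int  # number of front channels
    rear: int   # number of rear (surround) channels
    lfe: bool   # True when the trailing "L" (LFE channel) is present

    @property
    def full_range_channels(self) -> int:
        return self.front + self.rear

    def label(self) -> str:
        """Familiar 'x.y' name, e.g. 3/2L becomes 5.1."""
        return f"{self.full_range_channels}.{1 if self.lfe else 0}"


def parse_coding_mode(text: str) -> CodingMode:
    """Parse strings such as '3/2L' or '2/0'."""
    body = text.strip()
    lfe = body.upper().endswith("L")
    if lfe:
        body = body[:-1]
    front, rear = body.split("/")
    return CodingMode(int(front), int(rear), lfe)


print(parse_coding_mode("3/2L").label())  # 5.1
print(parse_coding_mode("2/0").label())   # 2.0
```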

Guts of a Decoder
All two-channel products must make the AC-3 bitstream available (S/PDIF or TOSLINK)
Bass management implementation varies, but is usually present in some form:
  Front speakers large/small
  Surrounds large/small
  Subwoofer present?

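[Figure: decoder block diagram. Labeled elements: encoded data, set-up info, and metadata inputs; a Dolby Digital decoder with downmix, dialnorm, and dynrng control; L, R, C, Ls, Rs, and LFE channels; a Pro Logic decoder with L, R, C, and S channels; a 120 Hz element; and an S/PDIF connection.]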

Consumer Decoders
All 5.1 channel decoders contain a Pro Logic (and usually a PLII) decoder:
  Decodes legacy 2-channel material to fill all the speakers
  May cause a quick code re-boot (mute) during switch; varies depending on method of changing metadata:
    Single value change (acmod) is best
    Switching entire metadata stream on VI works
    Switching Dolby Digital encoder preset is a disaster (i.e. how the audio guys finally got back at the video guys)

Metadata Problems
Creation
  Ideally generated at the time the audio program is created, but matches that audio only:
    If audio levels are changed (e.g. in master control), metadata MUST be re-created
Distribution
  How do you carry this data?
    If you are distributing baseband, plan for a 115 kbps synchronous RS485 layer
    Dolby E includes metadata path

Metadata Problems
Interruption
  What happens if metadata fails?
Tricking the system
  What happens if metadata is wrong? (a.k.a. "Trust me, my dialogue level really is -31 even though my audio is one LSB from full scale")
2004 Grammy Awards broadcast was a good example of both situations
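The "tricking the system" case can be expressed as a simple plausibility check: compare the dialogue level the metadata claims with a level actually measured on the audio. This is a sketch only; the function names and the 3 dB tolerance are assumptions chosen for illustration, not values from the presentation.

```python
def playback_error_db(declared_dialnorm_dbfs: float,
                      measured_dialog_dbfs: float) -> float:
    """How much louder (+) or quieter (-) than intended dialogue will play.

    The decoder attenuates according to the declared value, so any gap
    between declared and measured level passes straight to the listener.
    """
    return measured_dialog_dbfs - declared_dialnorm_dbfs


def dialnorm_looks_wrong(declared_dialnorm_dbfs: float,
                         measured_dialog_dbfs: float,
                         tolerance_db: float = 3.0) -> bool:
    """Flag metadata whose claim is far from the measured dialogue level."""
    error = playback_error_db(declared_dialnorm_dbfs, measured_dialog_dbfs)
    return abs(error) > tolerance_db


# Metadata claims -31 dBFS, but dialogue actually measures -10 dBFS:
print(playback_error_db(-31, -10))     # 21 dB too loud at the decoder output
print(dialnorm_looks_wrong(-31, -10))  # True
```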

Loudness Issues
Measurement
  Dolby's LM100 makes it easy and accurate, but some of it is still subjective: music vs. dialogue is tough
Adjusting dialnorm doesn't fix dynamics
  Matching the dialogue from program to program doesn't take into account the loud explosions
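For orientation, an equivalent-level (Leq) measurement averages signal power over time and expresses the result in decibels. The sketch below is deliberately simplified: it omits the A-weighting filter that an LAeq meter applies, it does not isolate dialogue, and it is not how the LM100 works internally. Samples are assumed to be floats normalized to plus or minus 1.0 full scale.

```python
import math
from typing import Sequence


def leq_dbfs(samples: Sequence[float]) -> float:
    """Unweighted equivalent level, in dB relative to a mean square of 1.0."""
    if not samples:
        raise ValueError("need at least one sample")
    mean_square = sum(s * s for s in samples) / len(samples)
    if mean_square == 0:
        return float("-inf")  # digital silence
    return 10 * math.log10(mean_square)


# One second of a full-scale 1 kHz sine at 48 kHz sampling:
rate = 48000
sine = [math.sin(2 * math.pi * 1000 * n / rate) for n in range(rate)]
print(round(leq_dbfs(sine), 2))  # about -3.01 with this reference
```

Because power is averaged rather than decibel values, a short loud passage raises the result more than a long quiet passage lowers it, which is one reason a single long-term number cannot describe a program's dynamics.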

2004 Grammy Awards (NY station)
[Figure: chart of level against time of day, from about 21:06 to 0:27, on a scale from 0 to -45, with the value -31 marked.]

2004 Academy Awards (different NY Station)
[Figure: chart of level against time of day, from about 20:57 to 0:12, on a scale from 0 to -35, with the value -27 marked.]

Dynamics Processing is Necessary


(but not over processing)
Dialnorm can center the action, processing can keep it all in check
As the systems mature, processing can be relaxed (but still present as a safety net)
With access to metadata, processors can learn to relax automatically

Goals of a DTV Audio Processor


Handle programs from mono to 5.1 channels
  Three 2-channel processors do not equal a 5.1 channel processor (as we found out)
NO clipping: old processors will not work
  AC-3 is data reduction; clipping ruins the efficiency of these systems

Goals of a DTV Audio Processor


Must deal with metadata
  Directly or indirectly, systems must work together
Local processes:
  Voiceover of network 5.1 programs
  Process local audio
Help integrate legacy 2-channel programs
  Upmixing turns 2-channel into 5.1 channel (with some caveats)

Upmixing
Not a substitute for a good 5.1 or 6.1 channel mix: Jerry Springer in pseudo-5.1 is pointless
MUST be downmix compatible
Very helpful when trying to maintain dialogue from center channel with no switching artifacts
Fixes the "2-channel source during 5.1 channel broadcast" problem:
  If 3/2L mode cannot be changed, all stereo sources come out of Left and Right speakers, no PL or PLII

DTV Processor Signal Flow


[Figure: signal flow block diagram of the Linear Acoustic OCTiMAX 5.1. Labeled elements: inputs 1/2 (L/R), 3/4 (C/LFE), 5/6 (Ls/Rs), Local (7/8), and V/O (9/10); an OCTiMAX 5.1 core; two OCTiMAX 2.0 cores, each with an UpMix stage; a voiceover mix; and a matrix encoder producing LtRt (Prog 1) on 9/10.]

Matrix encode:
  Lt = L + 0.707 C - 0.707 S
  Rt = R + 0.707 C + 0.707 S
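The matrix encode equations in the signal flow diagram translate directly into code. The sketch below implements exactly those two equations for a single sample and nothing more: it assumes a single mono surround signal S (the slides do not say how Ls and Rs are combined), applies no phase shift, and does no level protection. The function name is illustrative.

```python
from typing import Tuple

MIX_COEFF = 0.707  # coefficient used on the slide, roughly -3 dB


def matrix_encode(left: float, right: float,
                  center: float, surround: float) -> Tuple[float, float]:
    """Fold L, R, C, and mono S into an Lt/Rt pair for one sample."""
    lt = left + MIX_COEFF * center - MIX_COEFF * surround
    rt = right + MIX_COEFF * center + MIX_COEFF * surround
    return lt, rt


print(matrix_encode(0.0, 0.0, 1.0, 0.0))  # center lands equally in Lt and Rt
print(matrix_encode(0.0, 0.0, 0.0, 1.0))  # surround lands out of phase
```

Center content appears in phase and at equal level in both outputs, while surround content appears with opposite polarity, which is what lets a matrix decoder steer them apart again. Note that the sums can exceed full scale when several inputs are loud at once, so a real implementation needs headroom or limiting ahead of the encoder.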

NY DTV Station #1 Revisited
[Figure: three charts of level against time, from 8:00 to 8:26, on a scale from 0 to -35. The first is labeled "No Audio Processing". The second is labeled "Post OCTiMAX 5.1", with the value -28 marked. The third overlays "Pre OCTiMAX 5.1" and "Post OCTiMAX 5.1", with the value -28 marked.]

Network Distribution
ABC - High rate (640 kbps) AC-3
CBS - Dolby E
FOX - Dolby E now, transport stream soon*
NBC - 8 channels via MPEG (built-in)
PBS - Transport stream (19.39 Mbps)*
WB, UPN - ? Possibly Dolby E
*Metadata only during network pass-through

Network Distribution
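[Figure: station signal flow block diagram. Labeled elements: network feed and HD video into an IRD (Main In), with PCM outputs on pairs 1/2, 3/4, 5/6, and 7/8 and a metadata path marked "?"; an OCTiMAX 5.1 with inputs 1/2, 3/4, 5/6, 7/8, Local LtRt, REF, and MD (analog/digital); outputs L/R, C/LFE, Ls/Rs, SAP/DVS, and metadata (marked "?"); and a DP569 Dolby Digital (AC-3) encoder feeding Main Out.]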

Summary
DTV audio is complicated: let no one fool you!
5.1 channels of audio PLUS metadata
Dolby Digital (AC-3) solves only the emission part of the puzzle, but how do you feed it?
Local operations must be accommodated (voiceover, commercial insertion, etc.)
Audio processing is required for NTSC and is turning out to be a really good idea for DTV

More Information:
www.LinearAcoustic.com
www.Dolby.com
www.ATSC.org

Q&A
Tim Carroll Linear Acoustic Inc. Tim@LinearAcoustic.com www.LinearAcoustic.com

Thanks!
