Coursework 2: Digital Audio Principles - Dither

Coursework 2

Digital audio principles - Dither


Audio Engineering and Recording Techniques B (TON1017)

Mark Thompson

5th May, 2018


First introduced in the 1930s by Bell Labs and originally utilised in telecommunications [Fine, 2008:
3], digital pulse-code modulation (PCM) has evolved into the primary means for the transmission and
recording of audio. The development of PCM led to immediate application in World War II, where the
robust encryption and transmission of speech was essential, followed by the conception of the first
monophonic PCM audio recorder in 1967 (by the research branch of Japan’s NHK broadcast network)
[Fine, 2008: 3]. Modern standards of high-resolution digital audio have advanced on the back of the
constant refinement of the analogue-to-digital conversion (ADC) process, resulting in the widespread
adoption of PCM-encoded signals in telecommunications, broadcast, and professional and consumer
audio and video [Lipshitz, Vanderkooy, 2004: 200].

The two prime operations which facilitate the analogue-to-digital conversion of a signal are
sampling and quantisation, generally occurring in that order [Lipshitz, Vanderkooy, 2004: 205]. The
fundamental theory behind the sampling operation (first outlined by Harry Nyquist, 1928) holds that
the original analogue signal can be reconstructed without error, provided it is band-limited to less
than half the sampling frequency. This can be explained mathematically: sampling is a linear process,
and an inverse Fourier transform of the sampled spectrum in the frequency domain will output the
original input signal [Widrow, et al, 1996]. The implementation of the Nyquist-Shannon sampling
theorem is, however, dependent on a number of idealised conditions [Jerri, 1977: 1565]. Conversely,
the quantisation of the analogue signal is a non-linear operation, and results in an inherent
quantisation error, as the finite bit-per-sample resolution cannot precisely represent the infinite
resolution of the analogue signal voltage [Lipshitz, Vanderkooy, 2004: 205]. Examination of how this
quantisation error presents during PCM-based ADC is necessary to assess the implementation of
subtractive and non-subtractive dither, and the relative efficacy of the types of dither noise
utilised.

Figure 1: Uniform quantisation "staircase" functions. a) Mid-tread function, and b) mid-riser function - ∆ represents the
quantisation interval [Wannamaker, et al, 2000: 500]

In 1956, Widrow developed a statistical theory of quantisation, relating the operation of quantisers
as a discretization of amplitude through probability density functions, much as sampling is a
discretization of time [Widrow, et al, 1996]. The operation of quantising involves assigning binary
values closest to the input signal’s amplitude at the positions specified during sampling [Lipshitz,
Vanderkooy, 1984: 108]. The underlying problem associated with quantisation is that values at the
sampling instants can only be allocated to the precision allowed by the finite resolution of the digital
system [Lipshitz, Vanderkooy, 2004: 205]. The size of the quantisation error is governed by the
quantisation interval “∆”, with a maximum possible error of ± ∆/2 [Lipshitz and Vanderkooy, 1984:
108] (figure 1a, 1b). Deriving the signal-to-noise ratio (SNR) of the digital system below shows that
an increased encoding resolution results in an increased SNR, as the noise floor is reduced with
respect to the signal voltage [Lipshitz, Vanderkooy, 1984: 108].
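The two staircase functions of figure 1 can be sketched in a few lines of Python. This is only an illustrative sketch: the function names and the default interval of 1.0 are assumptions made here, not taken from the cited papers. It checks numerically that the error of either quantiser never exceeds ∆/2.

```python
import math


def quantise_mid_tread(x, delta=1.0):
    """Mid-tread quantiser: outputs are integer multiples of delta (zero is a level)."""
    return delta * math.floor(x / delta + 0.5)


def quantise_mid_riser(x, delta=1.0):
    """Mid-riser quantiser: outputs are odd multiples of delta/2 (zero is a threshold)."""
    return delta * (math.floor(x / delta) + 0.5)


# Sweep a range of input values and record the largest error of each quantiser.
inputs = [i / 100 for i in range(-300, 301)]
worst_mid_tread = max(abs(quantise_mid_tread(v) - v) for v in inputs)
worst_mid_riser = max(abs(quantise_mid_riser(v) - v) for v in inputs)
```

Both worst-case values should come out no larger than half the quantisation interval, in line with the ± ∆/2 bound quoted above.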

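For an N-bit quantiser with quantisation interval ∆, a full-scale sinusoid has peak amplitude A and
RMS value:

\[
A = \frac{2^{N}\Delta}{2} = 2^{N-1}\Delta, \qquad
V_{\mathrm{signal(RMS)}} = \frac{A}{\sqrt{2}} = \frac{2^{N-1}\Delta}{\sqrt{2}}
\]

The maximum possible error is ± ∆/2; assuming the error is uniformly distributed over this range, the
RMS value of the quantisation error can be derived by:

\[
e_{\mathrm{RMS}} = \sqrt{\frac{1}{\Delta}\int_{-\Delta/2}^{\Delta/2} e^{2}\,de}
= \sqrt{\frac{\Delta^{2}}{12}} = \frac{\Delta}{2\sqrt{3}}
\]

\[
\mathrm{SNR} = 20\log_{10}\!\left(\frac{V_{\mathrm{signal(RMS)}}}{e_{\mathrm{RMS}}}\right)
= 20\log_{10}\!\left(\frac{2^{N-1}\Delta}{\sqrt{2}}\cdot\frac{2\sqrt{3}}{\Delta}\right)
= 20\log_{10}\!\left(2^{N}\sqrt{\tfrac{3}{2}}\right)
\]

\[
\mathrm{SNR} = 20\log_{10}\!\left(2^{N}\right) + 20\log_{10}\!\left(\sqrt{1.5}\right)
\approx 6.02N + 1.76\ \mathrm{dB}
\]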

For example, ideal 24-bit quantisation would have an SNR of approximately 146 dB.
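The figure quoted above can be checked with a short calculation. The sketch below assumes the standard textbook model (a full-scale sinusoid and a quantisation error uniformly distributed over ± ∆/2), for which the ideal SNR is 20·log10(2^N·√1.5); the function name is chosen here purely for illustration.

```python
import math


def ideal_snr_db(bits):
    """Ideal SNR in dB of an N-bit quantiser for a full-scale sine (uniform-error model)."""
    return 20 * math.log10(2 ** bits * math.sqrt(1.5))


snr_16_bit = ideal_snr_db(16)  # roughly 98 dB
snr_24_bit = ideal_snr_db(24)  # roughly 146 dB
```

Each additional bit adds about 6 dB, which is why the 8 extra bits of a 24-bit system give roughly 48 dB more than a 16-bit system.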

The effect of the quantisation error is largely determined by the analogue input signal. Where audio
signals are loud and complex, quantisation error manifests at the noise floor as low-level white
noise, independent of the signal [Lipshitz, Vanderkooy, 1984: 108]. For low-level, relatively simple
signals, quantisation errors also occur as low-level noise (subject to modulation); however, the
artefacts introduced are correlated with the input signal [Lipshitz, Vanderkooy, 1984: 108]. This
correlation with the input signal results in artefacts appearing as harmonic and intermodulation
distortion [Lipshitz, Vanderkooy, 2004: 205] (see figure 3d), to which the human ear is more
sensitive than white noise. This can be explained by the fact that the operation of the cochlea and
neural pathways in the analysis of complex sounds can be functionally described by a series of
Fourier transforms [Altes, 1978: 178].
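An extreme case of signal-correlated error can be illustrated with a sine wave whose peak amplitude is below half a quantisation interval. The sketch below is a hypothetical example (the amplitude of 0.4∆, the period of 32 samples and the names are all assumptions made for illustration): an undithered mid-tread quantiser outputs silence, so the error is simply the inverted input signal.

```python
import math

DELTA = 1.0  # quantisation interval, assumed to be 1 for illustration


def quantise_mid_tread(x, delta=DELTA):
    """Mid-tread quantiser: round to the nearest integer multiple of delta."""
    return delta * math.floor(x / delta + 0.5)


# Low-level sine: peak amplitude 0.4 * DELTA, 32 samples per period, two periods.
signal = [0.4 * DELTA * math.sin(2 * math.pi * n / 32) for n in range(64)]
quantised = [quantise_mid_tread(s) for s in signal]
error = [q - s for q, s in zip(quantised, signal)]

# Every output sample is zero, so the error is exactly the negative of the input.
signal_lost = all(q == 0.0 for q in quantised)
```

Here the error is not noise-like at all: it is a deterministic function of the input, which is the kind of correlation the paragraph above describes.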

Figure 3: Undithered quantisation: a) original analogue signal, b) mid-tread quantised representation of a), c)
resulting quantisation error, d) power spectrum of the quantised system's output signal [Lipshitz, et al, 1992: 358]

Figure 2: Dithered quantisation: a) original analogue signal, b) mid-tread quantised representation of a), c)
resulting total error, d) power spectrum of the quantised system's output signal [Lipshitz, et al, 1992: 365]

It is also worth noting that in the ADC, particularly of low-level signals, the signal-correlated noise
and distortion due to the quantisation error is not constant, resulting in a variable SNR over time
[Lipshitz, Vanderkooy, 2004: 206]. Furthermore, elements of non-uniformity arise in the ADC and DAC
(digital-to-analogue conversion) systems themselves, as perfect linearity of the ADC/DAC transfer
function is not possible in real digital systems [Lipshitz, Vanderkooy, 1984: 106]. The signal
degradation effects of quantisation error are also significant in re-quantisation, where the bit
resolution of the digital signal is reduced, for example to improve data efficiency [Lipshitz, et al,
1992: 355]. Given these serious noise repercussions during the conversion of audio-visual signals, a
dither signal is added to the ADC operation to linearise the system, removing the signal-correlated
quantisation noise and distortion [Wagdy, 1989: 850].

The usefulness of dither was first realised in the removal of contouring effects (produced by
quantisation error) from PCM video [Lipshitz, Vanderkooy, 1984: 107]. In the early 1960s, Roberts
applied pseudo-random noise to the transmission of digital television pictures. He noted that by
adding noise prior to quantisation and subtracting the same dither signal from the output, the
resultant quantisation error of the system was independent of the input signal [Lipshitz, et al,
1992: 363]. Roberts derived an expression for the mean-square error (E) of PCM systems, which is
composed of variance (V) and deviation (D) [Roberts, 1962: 148]. Variance refers to the apparent
white noise of a signal, defined by the variance of the output about its mean value. Deviation refers
to the deviation of the mean output (ȳ) from the input (x), and is a measure of the signal-correlation
of the quantising noise [Schuchman, 1964: 164].

\[
E = V + D = \overline{(y - \bar{y})^{2}} + \overline{(\bar{y} - x)^{2}}
\]
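The split of the mean-square error into a noise-like part and a signal-correlated part can be illustrated numerically. The sketch below assumes that V and D are both taken as squared quantities for a single fixed input x, so that they sum to E; the function name and the sample values are hypothetical and chosen only to make the arithmetic easy to follow.

```python
def error_components(outputs, x):
    """Return (E, V, D) for quantiser outputs produced from a fixed input x.

    E: mean-square error of the outputs about the input
    V: variance of the outputs about their own mean (noise-like part)
    D: squared deviation of the mean output from the input (correlated part)
    """
    n = len(outputs)
    mean_out = sum(outputs) / n
    variance = sum((y - mean_out) ** 2 for y in outputs) / n
    deviation = (mean_out - x) ** 2
    mse = sum((y - x) ** 2 for y in outputs) / n
    return mse, variance, deviation


x = 0.3
undithered_outputs = [0, 0, 0, 0, 0, 0, 0, 0, 0, 0]  # always rounds down to 0
dithered_outputs = [0, 0, 0, 0, 0, 0, 0, 1, 1, 1]    # dither flips 30% of samples to 1

e_plain, v_plain, d_plain = error_components(undithered_outputs, x)
e_dith, v_dith, d_dith = error_components(dithered_outputs, x)
```

In this toy case the undithered error is all deviation, whereas the dithered error is all variance, and the total E is larger with dither even though the signal-correlated part has vanished.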

Introduction of the dither signal reduced the deviation during Roberts’ testing, producing better
perceived picture quality despite an increase in the total noise (E) [Schuchman, 1964: 164]. A
doubling in efficiency of each PCM channel was also observed, where the resolution could be
reduced from 6 to 3 bits without perceptible degradation of the picture [Roberts, 1962: 154]. The
issue with this subtractive dither (SD) operation is that it requires the same pseudo-random dither
signal to be available both prior to quantisation, where it is added, and prior to DAC, where it is
subtracted. Synchronisation of the noise generators or transmission of the original dither signal is
not always practical, and any additional digital processing or re-quantising of the signal would
violate the conditions required for a complete subtraction of the dither [Wannamaker, et al, 2000:
501].
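The subtractive scheme can be sketched as follows. This is a minimal simulation under stated assumptions, not a description of Roberts' hardware: the dither is taken to be uniformly distributed over one quantisation interval, the quantiser is mid-tread, and the names, seed and input value are hypothetical.

```python
import math
import random


def quantise_mid_tread(x, delta=1.0):
    """Mid-tread quantiser: round to the nearest integer multiple of delta."""
    return delta * math.floor(x / delta + 0.5)


def subtractive_dither(x, rng, delta=1.0):
    """Add uniform dither before quantising, then subtract the same dither afterwards."""
    d = rng.uniform(-delta / 2, delta / 2)
    return quantise_mid_tread(x + d, delta) - d


rng = random.Random(0)  # fixed seed so the run is repeatable
x = 0.3                 # constant input, in units of the quantisation interval
errors = [subtractive_dither(x, rng) - x for _ in range(20000)]

mean_error = sum(errors) / len(errors)
error_power = sum(e * e for e in errors) / len(errors)
largest_error = max(abs(e) for e in errors)
```

Under these assumptions the total error stays within ± ∆/2, averages to approximately zero and has a power close to ∆²/12 whatever the input value, which is the input-independence attributed to SD above. The sketch also makes the practical difficulty visible: the decoder must know every value of `d` exactly.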

Non-subtractive dither (NSD) (figure 4) does not subtract the dither signal after quantisation, so
the practical limitations of SD do not apply. The issue with NSD quantisation systems arises in the
additional error (aside from the quantisation error) which also comprises a portion of the total error
of the system [Wannamaker, et al, 2000: 502]. Wannamaker [et al, 2000: 502-504] proved a number of
theorems concerning NSD systems, most significantly that the total error cannot, in general, be made
uniformly distributed or statistically independent of the input signal. Where the dither does render
the statistical moments of the total error independent of the input, the resultant error power always
exceeds the total error power associated with SD quantisation systems [Wannamaker, et al, 2000: 504].
Despite these comparative drawbacks, NSD remains far preferable to an undithered system.

Particularly with high resolution systems, the increased noise due to NSD is unimportant in
comparison to the dither effect [Wannamaker, et al, 2000: 503]. The additive nature of NSD is a
further practical benefit, as other input signals can be added at any point prior to quantisation
without negatively impacting the applied dither – provided the signals added are independent of the
NSD signal [Wannamaker, et al, 2000: 514].
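The linearising effect of NSD can be illustrated by averaging the output for an input that sits between two quantisation levels. The sketch below assumes rectangular dither one quantisation interval wide and a mid-tread quantiser; the names, seed and input value are hypothetical.

```python
import math
import random


def quantise_mid_tread(x, delta=1.0):
    """Mid-tread quantiser: round to the nearest integer multiple of delta."""
    return delta * math.floor(x / delta + 0.5)


def non_subtractive_dither(x, rng, delta=1.0):
    """Add uniform dither before quantising; nothing is subtracted afterwards."""
    d = rng.uniform(-delta / 2, delta / 2)
    return quantise_mid_tread(x + d, delta)


rng = random.Random(1)  # fixed seed so the run is repeatable
x = 0.3                 # input lying between the levels 0 and 1

undithered_output = quantise_mid_tread(x)  # always 0: the 0.3 is lost
outputs = [non_subtractive_dither(x, rng) for _ in range(20000)]
mean_output = sum(outputs) / len(outputs)  # close to 0.3
```

Each dithered output is still only 0 or 1, but the proportion of ones tracks the input, so the average recovers a value below the least significant bit at the cost of added noise.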

Figure 4: System block diagrams and relative differential equations for a) subtractive dither system, and b) non-
subtractive dither system [Wannamaker, et al, 2000: 501]

The overall application of dither in digital PCM systems cannot be addressed thoroughly without
consideration of the types of dither noise used. The influence of the specific type of dither noise
is most significant at quantisation resolutions of less than 6 bits [Rabiner and Johnson, 1972:
1487]; at these bit depths, signal-dependent non-linearities and noise modulation are most audible.
In listening tests published by Wannamaker, et al, [2000: 514], rectangular or uniform probability
density function (RPDF) dither, with a peak-to-peak width equal to the quantisation interval,
succeeded in eliminating the quantisation-related distortion, but resulted in noticeable noise
modulation. Triangular-PDF (TPDF) dither with a peak-to-peak width of twice the quantisation
interval is recommended by the authors on the basis of the same listening tests, as both distortion
and noise modulation are removed [Wannamaker, et al, 2000: 514]. The required amplitude does, however,
reduce the SNR of the system (by approximately 4.8 dB) as the total noise is increased [Lipshitz,
Vanderkooy, 2004: 208]. In addition to standard white-spectrum TPDF dither, a high-pass spectrum TPDF
dither can be implemented such that the applied noise is less audible, as its power is concentrated
in the upper range of the audible spectrum [Wannamaker, et al, 2000: 510]. RPDF and TPDF dithers are,
however, sensitive to their amplitudes in relation to the quantisation interval: amplitudes that are
not integer multiples of the given values will fail to prevent distortion or noise modulation
[Lipshitz, Vanderkooy, 2004: 211].
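Noise modulation can be demonstrated by measuring the total error power at two different constant inputs. The sketch below is a simple Monte Carlo comparison under assumed conditions (a mid-tread quantiser, non-subtractive dither, TPDF dither generated as the sum of two independent RPDF values); all names and the seed are hypothetical.

```python
import math
import random


def quantise_mid_tread(x, delta=1.0):
    """Mid-tread quantiser: round to the nearest integer multiple of delta."""
    return delta * math.floor(x / delta + 0.5)


def rpdf_dither(rng, delta=1.0):
    """Rectangular-PDF dither, peak-to-peak width of one quantisation interval."""
    return rng.uniform(-delta / 2, delta / 2)


def tpdf_dither(rng, delta=1.0):
    """Triangular-PDF dither (width 2*delta) as the sum of two independent RPDF values."""
    return rpdf_dither(rng, delta) + rpdf_dither(rng, delta)


def total_error_power(x, dither, rng, trials=20000, delta=1.0):
    """Mean-square total error of a non-subtractively dithered quantiser at input x."""
    total = 0.0
    for _ in range(trials):
        e = quantise_mid_tread(x + dither(rng, delta), delta) - x
        total += e * e
    return total / trials


rng = random.Random(3)  # fixed seed so the run is repeatable
rpdf_at_level = total_error_power(0.0, rpdf_dither, rng)      # close to 0
rpdf_at_threshold = total_error_power(0.5, rpdf_dither, rng)  # close to 0.25
tpdf_at_level = total_error_power(0.0, tpdf_dither, rng)      # close to 0.25
tpdf_at_threshold = total_error_power(0.5, tpdf_dither, rng)  # close to 0.25
```

With RPDF dither the error power swings between almost nothing and ∆²/4 as the input moves, which is heard as noise modulation; with TPDF dither it stays near ∆²/4 at both inputs.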

Gaussian-PDF dither is also worth consideration in quantisation systems, despite not generally
correcting distortion and noise modulation to the same degree as TPDF dither, and producing a total
error level approximately 1.25 dB greater than that of TPDF dither [Lipshitz, Vanderkooy, 2004: 210].
Input signals or ADC systems may already contain a source of Gaussian-PDF noise, for example the
thermal noise associated with electronic components [Lipshitz, Vanderkooy, 2004: 210]. As the
amplitude requirements are low in comparison to the other types of dither, the system may
(conveniently) not require an additional dither signal [Wannamaker, et al, 2000: 514].
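The two decibel figures quoted for TPDF and Gaussian dither can be reproduced with a short calculation. The working below is a sketch under stated assumptions: the total error power is taken as the sum of the dither power and the ∆²/12 quantisation error power, and the Gaussian dither is assumed to have an RMS level of ∆/2 (a value supplied here for illustration, not stated in the text above).

```latex
\begin{align*}
\sigma^{2}_{\mathrm{SD}} &= \frac{\Delta^{2}}{12} \\
\sigma^{2}_{\mathrm{TPDF}} &= \frac{\Delta^{2}}{12} + \frac{\Delta^{2}}{6} = \frac{\Delta^{2}}{4},
  \qquad 10\log_{10}\!\left(\frac{\Delta^{2}/4}{\Delta^{2}/12}\right) = 10\log_{10}(3) \approx 4.77\ \mathrm{dB} \\
\sigma^{2}_{\mathrm{Gaussian}} &\approx \frac{\Delta^{2}}{12} + \frac{\Delta^{2}}{4} = \frac{\Delta^{2}}{3},
  \qquad 10\log_{10}\!\left(\frac{\Delta^{2}/3}{\Delta^{2}/4}\right) = 10\log_{10}\!\left(\tfrac{4}{3}\right) \approx 1.25\ \mathrm{dB}
\end{align*}
```

The first ratio matches the SNR penalty of approximately 4.8 dB for TPDF dither, and the second matches the 1.25 dB excess of Gaussian over TPDF dither.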

Figure 5: TPDF dither noise of ideal amplitude 2∆, relative to a mid-riser quantisation step function [Lipshitz,
Vanderkooy, 2004: 208]

Noise shaping and oversampling in digital systems are more recent developments worth mentioning, as
the ability to reduce the digital data rate requirements whilst maintaining a desirable dynamic range
and SNR is ideal [Lipshitz, Vanderkooy, 2004: 211]. Noise shaping, or sigma-delta modulation, is
achieved by feeding the error signal back into the system, which permits the noise power to vary with
frequency [Wannamaker, et al, 2000: 514]. Noise shaping allows a portion of the quantisation noise to
be shifted to less audible frequencies, at the cost of a reduced SNR in those regions. In coordination
with noise shaping, oversampling spreads the quantisation noise over a greater bandwidth,
significantly reducing the noise power within the audio band [Lipshitz, Vanderkooy, 2004: 212].
Without adequate TPDF dither to linearise the noise-shaping system, the feedback loop can result in
modulation of the error signal.
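The error-feedback idea can be sketched as a first-order noise shaper. This is a minimal illustration under assumptions made here, not the specific design discussed in the cited papers: the previous sample's total error is subtracted from the next input, TPDF dither is added inside the loop, and all names, the seed and the test signal are hypothetical.

```python
import math
import random


def quantise_mid_tread(x, delta=1.0):
    """Mid-tread quantiser: round to the nearest integer multiple of delta."""
    return delta * math.floor(x / delta + 0.5)


def tpdf_dither(rng, delta=1.0):
    """Triangular-PDF dither (width 2*delta) as the sum of two uniform values."""
    return rng.uniform(-delta / 2, delta / 2) + rng.uniform(-delta / 2, delta / 2)


def noise_shaped_requantise(samples, rng, delta=1.0):
    """First-order error feedback: output = input + e[n] - e[n-1]."""
    output = []
    previous_error = 0.0
    for x in samples:
        shaped_input = x - previous_error  # feed back the last total error
        y = quantise_mid_tread(shaped_input + tpdf_dither(rng, delta), delta)
        previous_error = y - shaped_input  # total error, dither included
        output.append(y)
    return output


rng = random.Random(2)  # fixed seed so the run is repeatable
samples = [0.3 * math.sin(2 * math.pi * n / 50) + 0.25 for n in range(1000)]
shaped = noise_shaped_requantise(samples, rng)

# The per-sample errors telescope, so the accumulated error stays bounded.
accumulated_error = sum(shaped) - sum(samples)
```

Because each output error is the difference of two successive loop errors, the error has little energy at low frequencies and more at high frequencies; the bounded running sum is a simple time-domain symptom of that high-pass shaping.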

Broadly speaking, the application of dither in modern, high-resolution PCM conversion is standard.
In experimental evaluations, however, the case for applying dither to quantisation in extremely low
bit depth systems is not always confirmed. For example, Rabiner and Johnson’s [1972: 1487]
experimental evaluation found that the addition of dither to PCM speech signals at quantisation
resolutions of 2 and 3 bits resulted in reduced speech intelligibility due to the noise of the
dither. Mathematically, for a 1-bit resolution system, a TPDF dither signal would be undesirable
[Lipshitz, Vanderkooy, 2004: 208], as the width of the dither, with an amplitude of 2 LSBs, would
account for the entire input range (figure 5). Aside from specific exceptions, the use of
non-subtractive TPDF dither as a solution to non-linear quantisation noise is generally recommended
[Wannamaker, et al, 2000: 514] and should be applied for any quantisation or re-quantisation
operation in PCM systems [Lipshitz, Vanderkooy, 2004: 210].

References
• Altes, Richard A., 1978: The Fourier-Mellin transform and mammalian hearing, Journal of
the Acoustical Society of America, Vol. 63, Iss. 1, (January), p. 178
• Fine, Thomas, 2008: The dawn of commercial digital recording, ARSC Journal, Vol. 39, No. 1,
(Spring), pp. 2–3
• Jerri, Abdul J., 1977: The Shannon sampling theorem – its various extensions and
applications: a tutorial review, Proceedings of the IEEE, Vol. 65, Iss. 11, (November), p.
1565
• Lipshitz, Stanley P., and Vanderkooy, John, 1984: Resolution below the least significant bit in
digital systems with dither, Journal of the Audio Engineering Society, Vol. 32, Iss. 3, (March),
pp. 106–112
• Lipshitz, Stanley P., and Vanderkooy, John, 1987: Dither in digital audio, Journal of the Audio
Engineering Society, Vol. 35, Iss. 12, (December), pp. 966–975
• Lipshitz, Stanley P., and Vanderkooy, John, 2004: Pulse-code modulation – an overview,
Journal of the Audio Engineering Society, Vol. 52, Iss. 3, (March), pp. 200–214
• Lipshitz, Stanley P., Vanderkooy, John, and Wannamaker, Robert A., 1992: Quantization and
dither: a theoretical survey, Journal of the Audio Engineering Society, Vol. 40, Iss. 5, (May),
pp. 355–364
• Nyquist, Harry, 1928: Certain topics in telegraph transmission theory, Transactions of the
American Institute of Electrical Engineers, Vol. 47, Iss. 2, (April), pp. 617–644
• Rabiner, L. R., and Johnson, J. A., 1972: Perceptual evaluation of the effects of dither on low bit
rate PCM systems, The Bell System Technical Journal, Vol. 51, Iss. 7, (September), pp.
1487–1494
• Roberts, L. G., 1962: Picture coding using pseudo-random noise, IRE Transactions on
Information Theory, Vol. 8, No. 2, (February), pp. 145–154
• Schuchman, Leonard, 1964: Dither signals and their effect on quantization noise, IEEE
Transactions on Communication Technology, Vol. 12, Iss. 4, (December), pp. 162–165
• Wagdy, Mahmoud F., 1989: Effect of various dither forms on quantisation errors of ideal A/D
converters, IEEE Transactions on Instrumentation and Measurement, Vol. 38, Iss. 4, (August),
p. 850
• Wannamaker, Robert A., Lipshitz, Stanley P., Vanderkooy, John, and Wright, Nelson J., 2000: A
theory of non-subtractive dither, IEEE Transactions on Signal Processing, Vol. 48, Iss. 2,
(February), pp. 499–516
• Widrow, B., Kollar, I., and Liu, Ming-Chang, 1996: Statistical theory of quantization, IEEE
Transactions on Instrumentation and Measurement, Vol. 45, Iss. 2, (April), pp. 353–361
