Capture D'écran . 2023-11-16 À 09.33.08

Spectra
§1 Back to the superposition of harmonics
□ In the last class, we worked hard to show that

• for any integer $N$,
• for any positive number $\omega > 0$,
• for any weights $w_0, w_1, \dots, w_N$,
• for any non-negative phases $\varphi_0, \varphi_1, \dots, \varphi_N$,
the following signal $S$

$$S(t) = \sum_{k=0}^{N} w_k \, S_{\text{sinusoidal}}(\omega k t + \varphi_k) \tag{1}$$

is periodic with period $1/\omega$.
□ This signal (1) is called the sum (or superposition) of $N$ harmonics. Each harmonic is a sinusoidal
signal modified in three ways:
• its argument is shifted by a phase $\varphi_k$;
• its argument is rescaled by a coefficient $\omega k$;
• its value is re-weighted by a weight $w_k$.
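
The periodicity claim for (1) can be checked numerically. The sketch below assumes that the basic sinusoid is $S_{\text{sinusoidal}}(x) = \sin(2\pi x)$, which is consistent with the stated period $1/\omega$ but is not spelled out in this section; the values of `omega`, `weights`, and `phases` are made up for illustration.

```python
import math

def s_sinusoidal(x):
    # Assumed basic sinusoid with period 1 in its argument: sin(2*pi*x).
    return math.sin(2 * math.pi * x)

def superposition(t, omega, weights, phases):
    # Equation (1): sum over k = 0..N of w_k * S_sinusoidal(omega*k*t + phi_k).
    return sum(w * s_sinusoidal(omega * k * t + phi)
               for k, (w, phi) in enumerate(zip(weights, phases)))

omega = 2.0                      # hypothetical fundamental frequency
weights = [0.5, 1.0, 0.3, 0.7]   # hypothetical weights w_0..w_3
phases = [0.25, 0.0, 0.1, 0.4]   # hypothetical non-negative phases
period = 1 / omega

# Largest change in S when time is shifted by one period, over a few sample times.
max_shift_error = max(
    abs(superposition(t + period, omega, weights, phases)
        - superposition(t, omega, weights, phases))
    for t in [0.0, 0.13, 0.4, 0.77, 1.9]
)
print(max_shift_error)
```

Up to floating-point rounding, the printed value should be zero: shifting time by $1/\omega$ adds the integer $k$ to the argument of the $k$th harmonic.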
□ In our discussion of sums of signals, we have started to appreciate the following fundamental fact: the
sum of even just two harmonics is a new signal that can have a completely different shape from the two
harmonics we started from. Let us elaborate on this point with some more examples.
□ First example. We consider the following sum

$$S(t) = \sum_{k=1}^{N} \frac{1}{k} \, S_{\text{sinusoidal}}(k \omega t) \tag{2}$$

of harmonics with the following properties:
• the phases are all equal to zero;
• the weight of the $k$th harmonic is equal to $1/k$.
Figure 1 illustrates the superposition of these harmonics with $\omega = 1$:
• the black curves on the left-hand side plot the harmonics in (2) for $k = 1, 2, 3, 4$;
• the red curves on the right-hand side plot the sum of the harmonics in (2) for $N = 1, 2, 3, 4$.
Figure 1. (a) The harmonics in (2); (b) their sums. Horizontal axes run from 0 to 3.
□ Second example. We consider the sum of harmonics in (3). These are the same harmonics as in the
preceding example, except that we only keep the $k$th harmonics with odd $k$.

$$S(t) = \sum_{k=0}^{N} \frac{1}{2k+1} \, S_{\text{sinusoidal}}\big((2k+1)\,\omega t\big) \tag{3}$$

Figure 2 illustrates the superposition of these harmonics with $\omega = 1$.
Figure 2. Panels (a) and (b). Horizontal axes run from 0 to 3.
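
The partial sums in (2) and (3) are easy to evaluate directly. The sketch below assumes $S_{\text{sinusoidal}}(x) = \sin(2\pi x)$ and $\omega = 1$; the function names and the evaluation time $t = 0.25$ are chosen only for illustration.

```python
import math

def s_sinusoidal(x):
    # Assumed basic sinusoid: sin(2*pi*x).
    return math.sin(2 * math.pi * x)

def all_harmonics_sum(t, n_max, omega=1.0):
    # Equation (2): harmonics k = 1..N with weights 1/k and zero phases.
    return sum(s_sinusoidal(k * omega * t) / k for k in range(1, n_max + 1))

def odd_harmonics_sum(t, n_max, omega=1.0):
    # Equation (3): harmonics 2k+1 for k = 0..N with weights 1/(2k+1).
    return sum(s_sinusoidal((2 * k + 1) * omega * t) / (2 * k + 1)
               for k in range(n_max + 1))

# Values of the two partial sums at t = 0.25 for a growing number of harmonics.
for n in (1, 2, 3, 4):
    print(n, round(all_harmonics_sum(0.25, n), 4), round(odd_harmonics_sum(0.25, n), 4))
```

Evaluating these functions on a fine grid of times and plotting the result is one way to reproduce curves like those in figures 1 and 2.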
□ Third example. We consider the sum

$$S(t) = \sum_{k=1}^{N} \frac{1}{2k} \, S_{\text{sinusoidal}}\Big(k \omega t + \frac{1}{4}\Big) \tag{4}$$

of harmonics with the following properties:
• the phases are all equal to $1/4$;
• the weight of the $k$th harmonic is proportional to $1/k$, as in the first example.
Figure 3 illustrates the superposition of these harmonics with $\omega = 1$.
Figure 3. Panels (a) and (b). Horizontal axes run from 0 to 3.
§2 Dirichlet’s theorem for Fourier series
□ We have now seen that superpositions of harmonics describe a surprisingly large variety of different periodic
signals. Can any periodic signal actually be described as a superposition of harmonics?
□ Dirichlet’s theorem, stated below without proof, gives a qualified positive answer to this question:
Consider an arbitrary sound signal $S$ that is periodic with fundamental period $T(S) = 1/\omega$. Under
mild assumptions on $S$ (that we do not make explicit), there exist
• infinitely many weights $w_0, w_1, \dots, w_k, \dots$,
• infinitely many non-negative phases $\varphi_0, \varphi_1, \dots, \varphi_k, \dots \geq 0$
such that the two following quantities
• the value $S(t)$ returned by the signal $S$ at time $t$;
• the value $\sum_{k=0}^{N} w_k \, S_{\text{sinusoidal}}(\omega k t + \varphi_k)$ returned at time $t$ by a superposition of $N$ harmonics
can be made arbitrarily close to each other by increasing the number $N$ of harmonics. Furthermore,
the weights $w_k$ and the phases $\varphi_k$ that satisfy this condition are uniquely determined.
□ We summarize this boxed statement by saying that our signal $S$ coincides at every time $t$ with the
superposition of an infinite number of harmonics:

$$S(t) = \sum_{k=0}^{\infty} w_k \, S_{\text{sinusoidal}}(\omega k t + \varphi_k) \tag{5}$$

The right-hand side of the identity (5) is a sum or a superposition of an infinite number of harmonics.
This infinite sum is called a Fourier series. Dirichlet’s theorem thus says that any periodic signal $S$ can
be construed as a superposition of infinitely many harmonics, namely as a Fourier series.
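
To see "arbitrarily close" at work, here is a small numerical sketch. It uses a triangle wave of period 1 (so $\omega = 1$) as the target signal, together with its standard Fourier weights $\frac{8}{\pi^2}\frac{(-1)^n}{(2n+1)^2}$ on the odd harmonics and zero phases. The choice of target signal, the function names, and the convention $S_{\text{sinusoidal}}(x) = \sin(2\pi x)$ are assumptions made for illustration, not part of the theorem.

```python
import math

def s_sinusoidal(x):
    # Assumed basic sinusoid: sin(2*pi*x).
    return math.sin(2 * math.pi * x)

def triangle_wave(t):
    # Target periodic signal with period 1: a triangle wave with values in [-1, 1].
    return (2 / math.pi) * math.asin(math.sin(2 * math.pi * t))

def partial_sum(t, n_terms):
    # Superposition of the first n_terms odd harmonics (1, 3, 5, ...) with weights
    # (8 / pi^2) * (-1)^n / (2n+1)^2 and zero phases.
    return sum((8 / math.pi ** 2) * (-1) ** n / (2 * n + 1) ** 2
               * s_sinusoidal((2 * n + 1) * t)
               for n in range(n_terms))

def max_error(n_terms, samples=200):
    # Largest gap between the signal and the superposition over one period.
    times = [i / samples for i in range(samples)]
    return max(abs(triangle_wave(t) - partial_sum(t, n_terms)) for t in times)

errors = {n: max_error(n) for n in (1, 5, 50)}
print(errors)
```

The largest gap between the signal and the superposition shrinks as more harmonics are added, which is the behavior the theorem describes.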
□ We can make intuitive sense of this theorem as follows:

• any signal is the superposition of unitary pulses;
• a unitary pulse is a superposition of harmonics, as shown by the third example above;
• hence, any signal is the superposition of harmonics.
§3 Superpositions that differ only in their phases
□ We consider two superpositions of harmonics

$$S'(t) = \sum_{k=1}^{N} w_k \, S_{\text{sinusoidal}}(k \omega t + \varphi'_k) \tag{6}$$

$$S''(t) = \sum_{k=1}^{N} w_k \, S_{\text{sinusoidal}}(k \omega t + \varphi''_k) \tag{7}$$

that differ only in their phases $\varphi'_1, \dots, \varphi'_N$ versus $\varphi''_1, \dots, \varphi''_N$, because they share the same:
• number $N$ of harmonics;
• fundamental frequency $\omega$;
• weights $w_1, \dots, w_N$.
We now make two observations on superpositions that differ only in their phases.
□ First observation: although the two superpositions (6) and (7) differ only in their phases, the two
resulting signals $S'$ and $S''$ can have very different shapes.
□ The following example establishes this fact. We consider the following superpositions of harmonics

$$S'(t) = 0.3\, S_{\text{sinusoidal}}(\omega t + 0) + 1.2\, S_{\text{sinusoidal}}(2\omega t + 0) + 0.6\, S_{\text{sinusoidal}}(3\omega t + 0) \tag{8}$$

$$S''(t) = 0.3\, S_{\text{sinusoidal}}(\omega t + 0) + 1.2\, S_{\text{sinusoidal}}(2\omega t + 0.3) + 0.6\, S_{\text{sinusoidal}}(3\omega t + 0.1) \tag{9}$$

that have the same:
• number $N = 3$ of harmonics;
• fundamental frequency $\omega$;
• weights $w_1 = 0.3$, $w_2 = 1.2$, and $w_3 = 0.6$.
They differ only in their phases:
• the phases in the superposition $S'$ are all equal to zero;
• the phases in the superposition $S''$ are equal to $\varphi_1 = 0$, $\varphi_2 = 0.3$, and $\varphi_3 = 0.1$.
□ Figure 4a illustrates the superposition (8) for $\omega = 1$:

• the $N = 3$ harmonics are plotted as the three black curves;
• the resulting superposition $S'$ is plotted as the red curve at the bottom.
Figure 4b describes the superposition (9) analogously.
□ We make the two following observations:

• the $k$th harmonic of $S'$ and the $k$th harmonic of $S''$ are similar,
because the two black curves on each row have similar shapes;
• the resulting superpositions $S'$ and $S''$ are nonetheless quite different,
because the two red curves in the bottom row have quite different shapes.
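
The difference between (8) and (9) can also be measured numerically. The sketch below samples both superpositions over one period with $\omega = 1$ and reports the largest pointwise gap; it assumes $S_{\text{sinusoidal}}(x) = \sin(2\pi x)$, and the sampling grid of 100 points is an arbitrary choice.

```python
import math

def s_sinusoidal(x):
    # Assumed basic sinusoid: sin(2*pi*x).
    return math.sin(2 * math.pi * x)

WEIGHTS = [0.3, 1.2, 0.6]     # w_1, w_2, w_3 shared by (8) and (9)
PHASES_1 = [0.0, 0.0, 0.0]    # phases of S' in (8)
PHASES_2 = [0.0, 0.3, 0.1]    # phases of S'' in (9)

def superposition(t, phases, omega=1.0):
    # Sum over k = 1..3 of w_k * S_sinusoidal(k*omega*t + phi_k).
    return sum(w * s_sinusoidal(k * omega * t + phi)
               for k, (w, phi) in enumerate(zip(WEIGHTS, phases), start=1))

times = [i / 100 for i in range(100)]   # one period for omega = 1
s1 = [superposition(t, PHASES_1) for t in times]
s2 = [superposition(t, PHASES_2) for t in times]
largest_gap = max(abs(a - b) for a, b in zip(s1, s2))
print(largest_gap)
```

The gap is of the same order as the largest weight, so the two signals are far from being small perturbations of each other.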
□ Second observation: although the two superpositions (6) and (7) differ only in their phases and thus
yield different signals $S'$ and $S''$, the human ear perceives the two corresponding sounds as identical
(unless the fundamental frequency $\omega$ is very large, which is never the case in phonetics).
□ Briefly put: phase differences do not matter; only weight differences matter.
Representing sounds as sound signals, as we have done so far, is therefore not quite adequate, because
• different sound signals . . .
• . . . can correspond to the same sound.
We now develop a better representation of sounds.
Figure 4. (a) The superposition (8); (b) the superposition (9). Horizontal axes run from 0 to 3.
§4 Discrete amplitude spectra
□ Let us take stock:

• in §2, we have seen that “any” periodic sound signal $S$ with fundamental period $T(S) = 1/\omega$ can be
described as a Fourier series;
• in §3, we have seen that two Fourier series whose harmonics share the same weights and differ only
in their phases (yield different signals but) correspond to the same sound.
□ It follows that a better representation of a periodic sound signal $S$ with fundamental period $1/\omega$ is simply
the list of the weights

$$w_1, w_2, w_3, \dots, w_k, \dots \tag{10}$$

of the harmonics of its Fourier series

$$\sum_{k=0}^{\infty} w_k \, S_{\text{sinusoidal}}(\omega k t + \varphi_k) \tag{11}$$

As we have seen, the weight of a harmonic controls the amplitude of its oscillations: a large weight (in
absolute value) means large oscillations (namely large departures from zero). This list (10) is therefore
called the discrete amplitude spectrum of the sound signal $S$ considered.
□ We usually represent the discrete amplitude spectrum (10) graphically as follows:

• on the horizontal axis we plot the frequencies $\omega, 2\omega, 3\omega, \dots$ of the harmonics;
• for each frequency $k\omega$, we draw a vertical line whose height is proportional to the weight $w_k$ of the $k$th harmonic.
Let us look at a few examples.
□ First example.
Suppose that the Fourier series (11) actually consists of a single harmonic, namely the $k$th harmonic with
weight $w_k$. Equivalently, all other harmonics have zero weight. The discrete amplitude spectrum therefore
consists of a single line, as in figure 5.
Figure 5. A single vertical line of height $w_k$ at the frequency $k\omega$ (horizontal axis: frequencies; vertical axis: amplitudes).
This is arguably the simplest possible discrete spectrum. This observation captures the intuition that a
single harmonic is the simplest periodic signal.
□ Second example.
The two Fourier series in (8)/(9) that correspond to the different signals in figure 4 actually share the
spectrum in figure 6. It has only three vertical lines because only three harmonics have non-zero weights.
Figure 6. Three vertical lines at the frequencies $\omega$, $2\omega$, $3\omega$ (horizontal axis: frequencies; vertical axis: amplitudes).
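
The shared spectrum can be recovered from samples of the two signals. The sketch below samples one period of (8) and of (9) with $\omega = 1$ and reads off the amplitude at each multiple of $\omega$ with a plain discrete Fourier transform. The helper names, the 64-sample grid, and the convention $S_{\text{sinusoidal}}(x) = \sin(2\pi x)$ are assumptions for illustration.

```python
import cmath
import math

def s_sinusoidal(x):
    # Assumed basic sinusoid: sin(2*pi*x).
    return math.sin(2 * math.pi * x)

def sample_period(weights, phases, n_samples=64, omega=1.0):
    # Samples of sum_k w_k * S_sinusoidal(k*omega*t + phi_k) over one period 1/omega.
    times = [i / (n_samples * omega) for i in range(n_samples)]
    return [sum(w * s_sinusoidal(k * omega * t + phi)
                for k, (w, phi) in enumerate(zip(weights, phases), start=1))
            for t in times]

def harmonic_amplitudes(samples, n_harmonics):
    # Plain discrete Fourier transform; bin k corresponds to the frequency k*omega.
    # The factor 2/n rescales the magnitude of bin k to the weight of harmonic k.
    n = len(samples)
    return [2 / n * abs(sum(x * cmath.exp(-2j * math.pi * k * i / n)
                            for i, x in enumerate(samples)))
            for k in range(1, n_harmonics + 1)]

weights = [0.3, 1.2, 0.6]
spectrum_1 = harmonic_amplitudes(sample_period(weights, [0.0, 0.0, 0.0]), 5)
spectrum_2 = harmonic_amplitudes(sample_period(weights, [0.0, 0.3, 0.1]), 5)
print([round(a, 6) for a in spectrum_1])
print([round(a, 6) for a in spectrum_2])
```

Both lists contain the three weights followed by zeros for the fourth and fifth harmonics: the phases do not show up in the amplitudes.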
□ Third example.
To illustrate, figure 7 provides the amplitude spectra for the four vowels in the English words herd, hot,
head, and heed, pronounced at a fundamental frequency of 100 Hz.
Figure 7. From Hoberg Arehart et al. (1997)
§5 Formants
□ A frequency $k\omega$ on the horizontal axis of a discrete amplitude spectrum is called a formant provided it is
a point of local maximum: the spectrum decreases when we move away from that frequency on either side.
□ To illustrate, we consider again the spectrum plotted in the bottom right panel of figure 7, repeated below:
We see four formants, highlighted by the four red arrows:

• the first formant is 200 Hz;
• the second formant is 2300 Hz;
• the third formant is 3000 Hz;
• the fourth formant is 3800 Hz.
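
The definition of a formant as a local maximum translates into a few lines of code. In the sketch below, the function name `formants` and the ten-harmonic spectrum are hypothetical, and the first and last harmonics are never reported because they have only one neighbour; that boundary treatment is an assumption.

```python
def formants(frequencies, amplitudes):
    # A frequency is reported when its amplitude is strictly larger than
    # the amplitudes of both neighbouring harmonics.
    return [frequencies[i]
            for i in range(1, len(amplitudes) - 1)
            if amplitudes[i - 1] < amplitudes[i] > amplitudes[i + 1]]

# Hypothetical discrete spectrum: harmonics of a 100 Hz fundamental.
frequencies = [100 * k for k in range(1, 11)]
amplitudes = [30, 42, 35, 20, 12, 18, 27, 21, 10, 6]
print(formants(frequencies, amplitudes))
```

On this made-up spectrum the function reports the harmonics at 200 Hz and 700 Hz.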

§6 From periodic to non-periodic signals
□ So far, we have looked at periodic sound signals. What about non-periodic signals? We reason intuitively:
• non-periodic signals can be construed as periodic with a very large, infinite period $T$;
• equivalently, non-periodic signals can be construed as periodic with a very small, infinitesimal frequency $\omega$
(the equivalence holds because the frequency is the inverse of the period: $\omega = 1/T$).
We thus intuitively expect Dirichlet’s theorem to extend from periodic to non-periodic signals: any
non-periodic signal can be expressed as a Fourier series (5) with the following changes:
• the multiples $k\omega$ of the fundamental frequency
are replaced with all frequencies $\omega$ between 0 and infinity;
• the sum $\sum_{k=1}^{\infty}$ over multiples $k\omega$ of the fundamental frequency
is replaced with the integral $\int_0^{\infty} \dots \, d\omega$ over all frequencies $\omega$ between 0 and infinity;
• the sequence of weights $w_1, w_2, \dots, w_k, \dots$
is replaced with a function from frequencies $\omega$ to weights $w(\omega)$;
• the sequence of phases $\varphi_1, \varphi_2, \dots, \varphi_k, \dots$
is replaced with a function from frequencies $\omega$ to (non-negative) phases $\varphi(\omega)$.
Figure 8 summarizes these changes.
Figure 8. The Fourier series $S(t) = \sum_{k=0}^{\infty} w_k \, S_{\text{sinusoidal}}(k \omega t + \varphi_k)$, annotated with the replacements: $\int_0^{\infty}$ for the sum, $w(\omega)$ for $w_k$, and $\varphi(\omega)$ for $\varphi_k$.
□ Fourier’s theorem, stated below without proof, says that our intuitive reasoning is correct:
Consider an arbitrary sound signal $S$, not necessarily periodic. Under mild assumptions on $S$ (that
we do not make explicit), there exist
• a function $w(\cdot)$ that takes a non-negative number $\omega$ (interpreted as a frequency)
and returns a number $w(\omega)$ (interpreted as a weight);
• a function $\varphi(\cdot)$ that takes a non-negative number $\omega$ (interpreted as a frequency)
and returns a non-negative number $\varphi(\omega)$ (interpreted as a phase);
such that the identity (12) holds at every time $t \geq 0$.

$$S(t) = \int_0^{\infty} w(\omega) \, S_{\text{sinusoidal}}\big(\omega t + \varphi(\omega)\big) \, d\omega \tag{12}$$

The weight and phase functions $w(\cdot)$ and $\varphi(\cdot)$ are uniquely determined by the signal $S$.
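
The step from a sum over harmonics to an integral over frequencies can be made concrete by approximating the integral in (12) with a fine sum. The sketch below uses a made-up weight function (1 up to frequency 1, then 0) and zero phases, and assumes $S_{\text{sinusoidal}}(x) = \sin(2\pi x)$; for this particular choice the integral has the closed form $(1 - \cos 2\pi t)/(2\pi t)$, which serves as a check.

```python
import math

def s_sinusoidal(x):
    # Assumed basic sinusoid: sin(2*pi*x).
    return math.sin(2 * math.pi * x)

def weight(omega):
    # Hypothetical weight function: 1 for frequencies up to 1, and 0 above.
    return 1.0 if omega <= 1.0 else 0.0

def phase(omega):
    # Hypothetical phase function: zero at every frequency.
    return 0.0

def fourier_integral(t, omega_max=2.0, steps=2000):
    # Midpoint Riemann sum for the integral in (12), truncated at omega_max
    # (harmless here because the weight vanishes above frequency 1).
    d_omega = omega_max / steps
    total = 0.0
    for j in range(steps):
        omega = (j + 0.5) * d_omega
        total += weight(omega) * s_sinusoidal(omega * t + phase(omega)) * d_omega
    return total

approx = fourier_integral(0.5)
exact = (1 - math.cos(2 * math.pi * 0.5)) / (2 * math.pi * 0.5)
print(approx, exact)
```

Each term of the Riemann sum is a harmonic with frequency $\omega_j$ and weight $w(\omega_j)\,d\omega$, so the integral is the limit of superpositions with more and more closely spaced frequencies.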
§7 Continuous amplitude spectra
□ Given a periodic or non-periodic sound signal $S$, the uniquely determined function $w(\omega)$ in the expression
(12) is called the (continuous) amplitude spectrum of the sound signal $S$.
□ We usually represent the continuous amplitude spectrum graphically as follows:

• on the horizontal axis, we plot the frequency $\omega$;
• on the vertical axis, we plot the corresponding amplitudes $w(\omega)$.
We now look at a couple of examples.
□ First example.
Let us start with an example of a periodic signal, as illustrated in figure 9:
• the top panel plots the periodic signal $S(t) = 9\, S_{\text{sinusoidal}}(0.3t)$;
• the bottom panel plots its continuous amplitude spectrum: it has a single peak at the frequency 0.3,
as expected.
Figure 9.
□ In general, the continuous spectrum of a periodic signal is a continuous line that interpolates the vertical
lines of its discrete spectrum.
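
The single peak at 0.3 can be reproduced from samples of the signal. The sketch below estimates the amplitude spectrum of $9\, S_{\text{sinusoidal}}(0.3t)$ from a finite stretch of samples with a plain discrete Fourier transform. The sampling rate, the duration, and the convention $S_{\text{sinusoidal}}(x) = \sin(2\pi x)$ are assumptions; a finite-window estimate like this one is only a stand-in for the continuous spectrum $w(\omega)$.

```python
import cmath
import math

def s_sinusoidal(x):
    # Assumed basic sinusoid: sin(2*pi*x).
    return math.sin(2 * math.pi * x)

def amplitude_spectrum(samples, sample_rate):
    # Magnitudes of a plain DFT, scaled so that a sinusoid of weight w that fits
    # the window an integer number of times yields a peak of height w.
    n = len(samples)
    freqs = [k * sample_rate / n for k in range(n // 2 + 1)]
    amps = [2 / n * abs(sum(x * cmath.exp(-2j * math.pi * k * i / n)
                            for i, x in enumerate(samples)))
            for k in range(n // 2 + 1)]
    return freqs, amps

sample_rate = 2.0      # hypothetical: samples per unit of time
duration = 100.0       # hypothetical: length of the observed stretch of signal
n = int(sample_rate * duration)
signal = [9 * s_sinusoidal(0.3 * i / sample_rate) for i in range(n)]

freqs, amps = amplitude_spectrum(signal, sample_rate)
peak_index = max(range(len(amps)), key=amps.__getitem__)
print(freqs[peak_index], round(amps[peak_index], 6))
```

With these settings the frequency grid has a spacing of 0.01, so the frequency 0.3 falls exactly on a grid point and the peak height matches the weight 9.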
□ Second example.
At the other extreme, the continuous amplitude spectrum of white noise is completely flat because no
frequency is more prominent than the others.
§8 Non-periodic signals have time dynamics
□ We consider the two non-periodic signals plotted in figure 10:

• The top signal (in red):
  ∗ starts with a sinusoid with small frequency $\omega_{\text{small}}$;
  ∗ followed by a later sinusoid with larger frequency $\omega_{\text{large}}$.
• The bottom signal (in blue) reverses the order of the two sinusoids:
  ∗ starts with a sinusoid with large frequency $\omega_{\text{large}}$;
  ∗ followed by a later sinusoid with smaller frequency $\omega_{\text{small}}$.
These two signals correspond to sounds that would indeed be perceived as different: two pure tones in
reverse order.
Figure 10.
□ Despite representing two different sounds, these two signals share the same continuous amplitude spectrum,
shown in figure 11. As expected, the continuous spectrum for both signals consists of two peaks centered
at the two frequencies $\omega_{\text{small}}$ and $\omega_{\text{large}}$.
Figure 11.
□ This example shows that, while (discrete) amplitude spectra are an adequate representation of periodic
signals, (continuous) amplitude spectra are an inadequate representation of non-periodic signals:
• Two non-periodic sound signals can differ in their time dynamics.
For instance, the two non-periodic signals in figure 10 differ only in their time dynamics, namely
in whether the large-frequency sinusoid precedes or follows the small-frequency one.
• Yet, the continuous spectrum only captures information about the frequency components.
For instance, the spectra in figure 11 only capture the information that the signals feature two
sinusoids with frequencies $\omega_{\text{small}}$ and $\omega_{\text{large}}$ but do not capture their temporal order.
• Hence, non-periodic signals with different time dynamics but the same frequency components
unfortunately end up sharing the same continuous spectrum.
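
This loss of temporal order is easy to reproduce on sampled signals. The sketch below builds "low tone then high tone" and "high tone then low tone" from the same two segments and compares the magnitudes of their discrete Fourier transforms. The frequencies 4 and 12, the sampling rate, and the helper names are made up for illustration. The match is exact up to rounding here because the two segments have equal length, so one signal is a circular shift of the other, and a circular shift changes only the phases of a DFT.

```python
import cmath
import math

def tone(freq, n_samples, sample_rate):
    # Samples of sin(2*pi*freq*t) at t = 0, 1/sample_rate, 2/sample_rate, ...
    return [math.sin(2 * math.pi * freq * i / sample_rate) for i in range(n_samples)]

def dft_magnitudes(samples):
    # Magnitudes of a plain discrete Fourier transform (non-negative frequencies).
    n = len(samples)
    return [abs(sum(x * cmath.exp(-2j * math.pi * k * i / n)
                    for i, x in enumerate(samples)))
            for k in range(n // 2 + 1)]

sample_rate, half = 64, 64          # hypothetical sampling choices
low = tone(4, half, sample_rate)    # hypothetical omega_small = 4
high = tone(12, half, sample_rate)  # hypothetical omega_large = 12

low_then_high = low + high
high_then_low = high + low

spectrum_a = dft_magnitudes(low_then_high)
spectrum_b = dft_magnitudes(high_then_low)
largest_spectrum_gap = max(abs(a - b) for a, b in zip(spectrum_a, spectrum_b))
largest_signal_gap = max(abs(a - b) for a, b in zip(low_then_high, high_then_low))
print(largest_spectrum_gap, largest_signal_gap)
```

The signals differ by a large amount at some sample times, while their amplitude spectra agree to within rounding error.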
□ Here is one natural solution to this problem:

• chop time into many little windows;
• compute the continuous amplitude spectrum for the signal within each time window;
• plot all the resulting spectra together.
We now implement this intuition.
§9 Windowing
□ We consider an arbitrary non-periodic sound signal $S$, such as the one in figure 12 for concreteness. We
want to chunk it up into small pieces centered at different times. Here is a way to do that.
Figure 12.
□ The triangular signal $\triangle_{\tau}$ centered at a time $\tau$ with width $\lambda$ is defined as follows:
• it is equal to zero up to time $\tau - \lambda/2$;
• it has a triangular shape for times in between $\tau - \lambda/2$ and $\tau + \lambda/2$;
• it is equal to zero after time $\tau + \lambda/2$.
Figure 13a plots the triangular signals $\triangle_{\tau}$ centered at $\tau = 0.5, 1, 1.5, 2, 2.5$ with width $\lambda = 1$.
□ The product

$$S_{\tau}(t) = \triangle_{\tau}(t) \cdot S(t) \tag{13}$$

between the triangular signal $\triangle_{\tau,\lambda}$ and our signal $S$ is a new signal $S_{\tau,\lambda}$ with the following properties:
• it is equal to zero up to time $\tau - \lambda/2$;
• it is a smoothed-out copy of the original signal $S$ for times between $\tau - \lambda/2$ and $\tau + \lambda/2$;
• it is equal to zero after time $\tau + \lambda/2$.
Figure 13b plots the product $S_{\tau}$ between the original signal $S$ in figure 12 and the triangular signal $\triangle_{\tau}$
centered at $\tau = 0.5, 1, 1.5, 2, 2.5$ with width $\lambda = 1$.
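
The windowing construction can be sketched as follows. The text fixes where the triangular signal is zero but not its height, so the peak value of 1 at time $\tau$ is an assumption; `example_signal` is a hypothetical stand-in for the signal of figure 12.

```python
import math

def triangular(t, tau, lam):
    # Zero outside [tau - lam/2, tau + lam/2]; rises linearly to a peak at tau.
    # The peak height of 1 is an assumption, not fixed by the text.
    return max(0.0, 1.0 - abs(t - tau) / (lam / 2))

def windowed(signal, tau, lam):
    # Equation (13): pointwise product of the triangular signal and S.
    return lambda t: triangular(t, tau, lam) * signal(t)

def example_signal(t):
    # Hypothetical stand-in for the signal S of figure 12.
    return math.sin(2 * math.pi * 3 * t) + 0.5 * math.sin(2 * math.pi * 7 * t)

piece = windowed(example_signal, tau=1.0, lam=1.0)
print(piece(0.4), piece(1.0), piece(1.2), piece(1.6))
```

Outside the window the product vanishes, and at the center of the window it coincides with the original signal.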
Figure 13. Panels (a) and (b). Horizontal axes run from 0 to 3.
§10 Spectrograms
□ The spectrogram of an arbitrary (possibly non-periodic) sound signal $S$ is the plot schematized in figure
14, obtained as follows:
• the horizontal axis plots discrete times $\tau_1, \tau_2, \tau_3, \dots$;
• the vertical axis plots discrete frequencies $\omega_1, \omega_2, \omega_3, \dots$;
• for every discrete time $\tau$ and every discrete frequency $\omega$, we draw a little gray square;
• the intensity of the gray is proportional to the size of the weight $w(\omega)$ in the continuous amplitude
spectrum of the product signal $S_{\tau} = \triangle_{\tau} \cdot S$.
Figure 14. Schematic spectrogram (horizontal axis: time; vertical axis: frequencies).
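
The recipe above can be sketched in code for a sampled signal. Each column of the result holds the DFT magnitudes of the signal multiplied by a triangular window centered at one of the times $\tau$. The sampling rate, the test signal (frequency 4 followed by frequency 12), and the helper names are hypothetical, and the DFT magnitudes of a windowed stretch are only a stand-in for the continuous amplitude spectrum of the product signal.

```python
import cmath
import math

def triangular(t, tau, lam):
    # Triangular window of width lam centered at tau (peak height 1 assumed).
    return max(0.0, 1.0 - abs(t - tau) / (lam / 2))

def dft_magnitudes(samples):
    # Magnitudes of a plain discrete Fourier transform (non-negative frequencies).
    n = len(samples)
    return [abs(sum(x * cmath.exp(-2j * math.pi * k * i / n)
                    for i, x in enumerate(samples)))
            for k in range(n // 2 + 1)]

def spectrogram(samples, sample_rate, centers, lam):
    # One column of magnitudes per window center tau.
    columns = []
    for tau in centers:
        start = int(round((tau - lam / 2) * sample_rate))
        stop = int(round((tau + lam / 2) * sample_rate))
        piece = [triangular(i / sample_rate, tau, lam) * samples[i]
                 for i in range(start, stop)]
        columns.append(dft_magnitudes(piece))
    return columns

sample_rate = 64   # hypothetical sampling rate
# Hypothetical signal: frequency 4 during the first time unit, 12 during the second.
samples = [math.sin(2 * math.pi * (4 if i < 64 else 12) * i / sample_rate)
           for i in range(128)]

columns = spectrogram(samples, sample_rate, centers=[0.5, 1.5], lam=1.0)
peaks = [max(range(len(col)), key=col.__getitem__) for col in columns]
print(peaks)
```

With these settings the index of a row equals its frequency, so the position of the darkest square in each column recovers the temporal order of the two tones that the plain amplitude spectrum loses.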
□ To illustrate, figure 15 provides the spectrogram of a Cantonese speaker’s pronunciation of the word [kA]
‘chicken’. The two bottom dark bands singled out by the two blue arrows are the first and second formant.
Figure 15. From Johnson (2012, page 79). Time is shown on the horizontal axis and frequency (from 0 to 5 kHz) on the vertical axis; the 1st and 2nd formants are labeled.
□ The column of gray shades singled out by the red arrow in figure 15 is the continuous amplitude spectrum of
the short windowed signal, plotted in the usual format in figure 16. The first and second formants are highlighted
by the two blue arrows.
Figure 16. From Johnson (2012, page 79)
Bibliography
Hoberg Arehart, Kathryn, Catherine Arriaga King, and Kelly S. McLean-Mudgett. 1997. Role of fundamental
frequency differences in the perceptual separation of competing vowel sounds by listeners
with normal hearing and listeners with hearing loss. Journal of Speech, Language, and Hearing Research
40:1434–1444.
Johnson, Keith. 2012. Acoustic and auditory phonetics. Third edition. Wiley-Blackwell.