
Mathematics: Analysis and Approaches IA

Session: May 2023


Title: Decomposing piano chords into individual notes
Candidate Number:
Mathematics Internal Assessment Candidate Number:

1 Introduction
Although I have played the piano for seven years, one skill has always eluded me: so-called ‘perfect pitch’, the ability to identify the pitch of any note by ear. At first the skill seems to be simply a natural ability. However, mathematics gives us a concrete and quantifiable way to understand perfect pitch and its implications. A mathematical model of it lets us analyse pieces of music efficiently and detect precisely which pitches are present where they may not be apparent, as in chords. This investigation thus aims to explore how musical chords can be mathematically decomposed into their constituent notes.

To begin, let us establish a few axioms that will be adhered to for this investigation. Firstly, all sound energy travels through waves created by the vibration of the source particles. Sound waves are simply variations of pressure $\rho$ in air, as the vibrations spread energy to adjacent particles and onwards. A little elementary physics tells us that the patterns of pressure form alternating ‘compressions’ (regions of high pressure) and ‘rarefactions’ (regions of low pressure). Secondly, let us establish the relationship between music and sound waves. All music has two fundamental measures: volume and pitch. Volume is a measure of the power of the sound (Fahy and Thompson), expressed through the amplitude of the wave. The pitch of any sound is determined by its frequency. Each pitch has a corresponding frequency; hence, each note has a respective sound wave. When two or more of these pitches sound simultaneously, we refer to them together as a ‘chord’ (Benward and Saker 67). These chords make up the base layer of all the music we hear.

This investigation will use Fourier coefficients and transforms to find the notes present in
randomized audio recordings of myself playing the piano.


2 Background
2.1 Sound waves
Let us begin by understanding how chords are formed in a mathematical sense. As we have
covered in section 1, sound waves are transported through alternating regions of high and
low pressure. This pattern creates a displacement of the air particles at a particular time
t. The sound wave travels in two domains: time and displacement. As such, we can establish
a general equation for a sound wave that travels through both of these domains.

Theorem 2.1. Any sound wave that travels through time and space can be represented through a function of the form

$$f(x, t) = \sin(x - \omega t) \tag{2.2}$$

As we see in (2.2), the rate at which the wave oscillates in time is given by the angular frequency $\omega$. This general form allows us to form representations of the sound waves for individual piano notes.

2.2 Superposition
When multiple notes are played simultaneously, we refer to the joint sound as a chord. In
physics and mathematics, the addition of simultaneously occurring waves is governed by the
principle of superposition, due to the phenomenon of interference (Naidu).

The principle of superposition dictates that when two or more waves pass through the same point in a domain at the same time, they form a resultant wave. This resultant wave has a different amplitude and pattern compared to the original waves. The principle of superposition is valid for sound waves because all sound waves form a solution to the wave equation. The wave equation itself is outside the scope of this investigation; instead, we shall understand superposition using the mathematical definition of ‘linear’.

Definition 2.3 (Linear Functions). A function is linear if it follows the properties of additivity and homogeneity (Bronshtein et al.)

$$f(\alpha + \beta) = f(\alpha) + f(\beta) \tag{2.4}$$

$$f(cu) = c f(u) \tag{2.5}$$

As sound waves are linear, the superposition theorem can be given as below


Theorem 2.6. When two sound waves $\sigma(x, t)$ and $\rho(x, t)$ occur simultaneously, the resulting wave is given by

$$\mu(x, t) = \sigma(x, t) + \rho(x, t) \tag{2.7}$$

Example 2.8. Let us define two functions,

$$f(x, t) = \sin(x - at), \qquad g(x, t) = \sin(2(x - at)) \tag{2.9}$$

Using simple addition, define the resultant wave as $h(x, t)$:

$$h(x, t) = f(x, t) + g(x, t) \tag{2.10}$$

Using the sine-sum formula as given below,

$$\sin(c) + \sin(d) = 2 \sin\left(\frac{c + d}{2}\right) \cos\left(\frac{c - d}{2}\right) \tag{2.11}$$

we obtain

$$h(x, t) = \sin(x - at) + \sin(2(x - at)) \tag{2.12}$$

$$= 2 \sin\left(\frac{3x - 3at}{2}\right) \cos\left(\frac{-x + at}{2}\right) \tag{2.13}$$

Equation (2.13) is a wave that travels along the x-axis in time. Since a function of both $x$ and $t$ is naturally shown as an animation rather than a static plot, it is not possible to display it directly in this investigation¹. Let us plot this wave at a fixed time ($t = 0$), simply to demonstrate how a superposed wave is formed.

Figure 1: Plot of $f(x, t)$ and $g(x, t)$ at $t = 0$


¹ Refer to https://imgur.com/O55Skci for a graphical visualisation of the animated wave.

The plot of the sum of the two functions is as follows.

Figure 2: Plot of the sum of $f(x, t)$ and $g(x, t)$ at $t = 0$

By Theorem 2.6, we know that the function plotted in Figure 2 is also a solution to the wave equation. In the context of music, since each note has a corresponding sound wave, the sound wave of a chord is formed by the superposition of the waves of the notes involved. Undoing this superposition is what the rest of this investigation will focus on.
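The product form in (2.13) can be checked numerically against the plain sum in (2.12). The following is a minimal sketch, not part of the original working: the value `a = 1.0` and the grid of `x` values are assumptions chosen only for illustration, since the text leaves `a` unspecified.

```python
import numpy as np

a = 1.0  # assumed wave speed; the text does not fix a value for a
x = np.linspace(-10, 10, 401)  # same x-range as the plots


def f(x, t):
    return np.sin(x - a * t)


def g(x, t):
    return np.sin(2 * (x - a * t))


def h_product(x, t):
    # Right-hand side of (2.13): 2 sin((3x - 3at)/2) cos((-x + at)/2)
    return 2 * np.sin((3 * x - 3 * a * t) / 2) * np.cos((-x + a * t) / 2)


for t in (0.0, 0.5, 2.0):
    difference = np.max(np.abs(f(x, t) + g(x, t) - h_product(x, t)))
    print(f"t = {t}: largest difference = {difference:.2e}")
```

The differences printed are at the level of floating-point rounding, which is what we expect if the two forms describe the same wave.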


3 Fourier Series
Following Definition 2.3, we can deduce that a sum of any n sinusoidal functions will form a trigonometric function sum as well. Let us assume a single-variable² infinite sum of sine waves, each with a different frequency defined by $\omega_n$.

$$u_{n+1} = \sum_{n=1}^{\infty} \sin(\omega_n t) \tag{3.1}$$

Suppose we add another sum, this time of cosine functions with frequencies $\psi_n$, to this sum

$$u_{n+1} = \sum_{n=1}^{\infty} \sin(\omega_n t) + \sum_{n=1}^{\infty} \cos(\psi_n t) \tag{3.2}$$

For a practical example of this, let us truncate each sum at 50 terms. We can redefine this instead as a function that sums these sinusoids

$$u(t) = \sum_{a=1}^{50} \sin(\omega_a t) + \sum_{b=1}^{50} \cos(\psi_b t) \tag{3.3}$$

(3.3) tells us that there are fifty different sines and fifty different cosines that make up this function. Notice how the coefficient of each term (i.e. the amplitude of the wave) remains constant at 1 throughout all of these trigonometric functions. Let us modify this by adding a variable to change the amplitudes of the waves.

$$u(t) = \sum_{a=1}^{50} x_a \sin(\omega_a t) + \sum_{b=1}^{50} y_b \cos(\psi_b t) \tag{3.4}$$

In (3.4), the coefficients $x_a$ and $y_b$ tell us the relative ‘weightage’ of each of the fifty composing sines and cosines. As a coefficient grows larger, the relative weightage of its term increases. These four steps motivate a theorem fundamental to performing a decomposition operation on chords.

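Theorem 3.5 (Fourier Series). Any sufficiently well-behaved periodic function $f(t)$ with period $T$ can be represented as an infinite sum of sinusoidal functions expressed as follows (Weisstein),

$$f(t) = a_0 + \sum_{m=1}^{\infty} a_m \cos\left(\frac{2\pi m t}{T}\right) + \sum_{n=1}^{\infty} b_n \sin\left(\frac{2\pi n t}{T}\right) \tag{3.6}$$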
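To make the weighted sum in (3.4) concrete, the sketch below evaluates a small sum of sines and cosines at a few instants. It is only an illustration: the function name `weighted_sum`, the weights and the frequencies are hypothetical choices, and each angular frequency is written as $2\pi$ times a frequency in hertz.

```python
import numpy as np


def weighted_sum(t, sine_terms, cosine_terms):
    """Evaluate u(t) = sum x_a sin(2*pi*f_a*t) + sum y_b cos(2*pi*f_b*t).

    Each term is a (weight, frequency_in_hz) pair.
    """
    u = np.zeros_like(t, dtype=float)
    for weight, freq in sine_terms:
        u += weight * np.sin(2 * np.pi * freq * t)
    for weight, freq in cosine_terms:
        u += weight * np.cos(2 * np.pi * freq * t)
    return u


t = np.linspace(0.0, 0.01, 5)  # five instants in the first 10 ms
# Hypothetical weights and frequencies, chosen only for illustration.
sines = [(1.0, 100.0), (0.5, 200.0)]
cosines = [(0.25, 300.0)]
print(weighted_sum(t, sines, cosines))
```

Changing a weight changes how strongly that sinusoid contributes to the total, which is the sense in which the coefficients measure the ‘weightage’ of each term.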

² The reason for assuming only a single-variable sinusoid will be explained in the next section.

4 Discrete Fourier Transform


Expressing a wave as a sum of other waves, as presented in section 3, makes it possible to decompose chords. The method of reversing this process, i.e. finding the constituent waves, is known as the Fourier Transform (Fourier). There are two primary methods of computing the Fourier Transform of a function: the continuous Fourier Transform (CFT), which, as the name suggests, takes continuous inputs, and the Discrete Fourier Transform (DFT), which takes discrete inputs. At their core, both transforms are designed to convert a function of time or space into a function of frequency. As each note has a different frequency, applying either of these transforms to the wave results in a frequency-domain representation of the wave.

4.1 Why we use the DFT


For this investigation, we will be decomposing an audio wave that we have recorded on a computer. This means that the input is digital and not acoustic. As a result, we are unable to use the continuous Fourier transform, due to the fundamental nature of the recording. All digital data is stored in discrete form using bytes (Ceruzzi 10). An analogue signal such as sound is continuous, and must be recorded and converted to a discrete form using an analogue-to-digital converter (ADC) (Self 200). This process of conversion is known as sampling. Essentially, when we record a sound on a computer, we are actually inputting ‘samples’. Each sample stores the value of the sound wave at one instant; pitch and loudness are not stored directly but emerge from how these values vary over time. Any digitally recorded sound is just a string of such values stored in rapid succession. This is also why we assume the sine waves in section 3 to be single-variable: recorded audio contains only the variable of time and the sampled value of the sound at that specific time.

The CFT is not useful in this scenario, as it requires a function that is defined continuously in time, whereas the recording is only a finite list of samples. So, we look to the discrete Fourier transform instead.
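The idea of sampling can be sketched in a few lines of Python. This is an illustration only: the sampling rate of 44100 samples per second is an assumption (a common audio rate, not a value stated in this investigation), and the 440 Hz tone stands in for a real recording.

```python
import numpy as np

sample_rate = 44100   # samples per second; assumed, not taken from the recording
frequency = 440.0     # an illustrative pure tone in Hz
duration = 0.001      # one millisecond of sound

n = np.arange(int(sample_rate * duration))  # sample indices 0, 1, 2, ...
times = n / sample_rate                     # instants at which a value is stored
samples = np.sin(2 * np.pi * frequency * times)

for index, (t, value) in enumerate(zip(times[:5], samples[:5])):
    print(f"x_{index}: t = {t:.6f} s, value = {value:+.4f}")
```

The continuous sine wave is reduced to a finite list of numbers, one per instant, which is exactly the kind of input the DFT is designed for.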

4.2 Understanding the DFT


The DFT, similar to the Fourier series, is built on the proposition that any collection of data can be broken down into and represented via many different sinusoidal functions. It follows a similar definition to that seen in (3.6).


Theorem 4.1 (Discrete Fourier Transform). The Discrete Fourier Transform uses a complex-valued sum to analyse the periodically recorded samples.

$$X_k = \sum_{n=0}^{N-1} x_n \cdot e^{-\frac{i 2\pi k n}{N}} \quad \text{where } N, n, k \in \mathbb{Z} \tag{4.2}$$

where $X_k$ is the k-th frequency component, $N$ is the number of samples, and $n$ is the index of the current sample. We can rewrite (4.2) by substituting $b_n$ for the $\frac{2\pi k n}{N}$ component of the exponent.

$$X_k = \sum_{n=0}^{N-1} x_n \cdot e^{-i b_n} \tag{4.3}$$

Now expanding the summation in (4.3), we get

$$X_k = x_0 e^{-b_0 i} + x_1 e^{-b_1 i} + x_2 e^{-b_2 i} + \dots + x_{N-1} e^{-b_{N-1} i} \tag{4.4}$$

Recall Euler’s Formula as

$$e^{i\theta} = \cos\theta + i \sin\theta \tag{4.5}$$

We can use Euler’s formula (4.5) to rewrite the expanded DFT in (4.4) using the polar form

$$X_k = x_0 \left[\cos(-b_0) + i \sin(-b_0)\right] + \dots + x_{N-1} \left[\cos(-b_{N-1}) + i \sin(-b_{N-1})\right] \tag{4.6}$$

From here, we can simplify the expansion by collecting the real parts and the imaginary parts of all the terms, and assigning them to placeholder variables $A_k$ and $B_k$

$$X_k = A_k + B_k i \quad \text{where } X_k \in \mathbb{C} \tag{4.7}$$
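For real-valued samples $x_n$, the two placeholder variables can be written out explicitly. The short worked step below is an added illustration rather than part of the original derivation; it only uses Euler's formula (4.5) with $\theta = -b_n$, together with the fact that cosine is even and sine is odd.

```latex
\begin{align*}
X_k &= \sum_{n=0}^{N-1} x_n \left[\cos(b_n) - i \sin(b_n)\right] \\
A_k &= \sum_{n=0}^{N-1} x_n \cos(b_n), \qquad
B_k = -\sum_{n=0}^{N-1} x_n \sin(b_n)
\end{align*}
```

In words, $A_k$ measures how strongly the samples correlate with a cosine of frequency index $k$, and $B_k$ does the same for a sine, up to sign.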

This tells us that the solution for the expanded DFT is a complex number, which can be
plotted on an Argand Diagram as follows


Figure 3: Argand Diagram plot of an arbitrary complex-valued number $A_k + B_k i$, shown as the point $(A_k, B_k)$ with axes Re and Im

In Figure 3, the magnitude of the solution $X_k \in \mathbb{C}$ is given by

$$r = \sqrt{A_k^2 + B_k^2} \tag{4.8}$$

The magnitude determines the amplitude of the k-th sinusoid, and its phase shift is given by the argument of the complex number $X_k$:

$$\theta = \tan^{-1}\left(\frac{B_k}{A_k}\right) \tag{4.9}$$
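The definition in (4.2), together with the magnitude (4.8) and phase (4.9), can be sketched directly in Python. This is a minimal illustration under stated assumptions: the function name `dft` and the eight-sample test signal are hypothetical, and NumPy's built-in `np.fft.fft` is used only as a cross-check.

```python
import numpy as np


def dft(x):
    """Direct evaluation of (4.2): X_k = sum_n x_n * exp(-i*2*pi*k*n/N)."""
    x = np.asarray(x, dtype=complex)
    N = len(x)
    n = np.arange(N)
    k = n.reshape(-1, 1)  # one row per output index k
    return (x * np.exp(-2j * np.pi * k * n / N)).sum(axis=1)


# Hypothetical test signal: two full cycles of a cosine over eight samples.
x = np.cos(2 * np.pi * 2 * np.arange(8) / 8)
X = dft(x)

A, B = X.real, X.imag              # real and imaginary parts, as in (4.7)
magnitude = np.sqrt(A**2 + B**2)   # (4.8)
phase = np.arctan2(B, A)           # (4.9), with the quadrant handled for us

print(np.round(magnitude, 6))
print(np.allclose(X, np.fft.fft(x)))
```

The sketch uses `np.arctan2` rather than a plain inverse tangent because $\tan^{-1}(B_k / A_k)$ alone cannot distinguish opposite quadrants and is undefined when $A_k = 0$.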


5 Applying the DFT


For this investigation, I recorded myself playing unknown chords blindfolded. The chords were recorded in a piece of computer software called Audacity³. The resulting waveforms are discrete, made up of many individual samples.

Definition 5.1. A triad is a collection of three notes played together, usually to create some
harmonic chord (Pen 81).

For this part of the investigation I recorded myself playing a random triad. The recorded
waveform is given below.

Figure 4: Recorded waveform in Audacity

While this may look complicated, when zoomed in we can observe that it is a smooth, roughly sinusoidal wave.

Figure 5: Recorded waveform in Audacity, zoomed in 100x

When we zoom in further, we can see the individual samples.

Figure 6: Selection of samples in Audacity


³ Audacity® software is copyright © 1999-2021 Audacity Team. The name Audacity® is a registered trademark.

We can then take a screenshot of these samples and plot them in a mathematical modelling tool, GeoGebra (Hohenwarter et al.). The image is resized so that the maximum height equals one, which makes all the values of the wave relative and easy to compute.

Figure 7: Plotted selections of samples

Let us take a set of these 29 samples, as computing more would quickly become impractical.
We can now place these values in a table.

x_n    value
x_0    0
x_1    -0.0267
x_2    0.0294
x_3    0.0688
x_4    0.1250
x_5    0.2037
x_6    0.2487
x_7    0.2993

Table 1: Values of individual samples (first eight of the 29 shown)

Let us perform some steps to simplify the expansion. Recall the DFT definition in (4.2). For $X_1$,

$$X_1 = \sum_{n=0}^{28} x_n \cdot e^{-\frac{i 2\pi (1)(n)}{29}} \tag{5.2}$$

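Expanding this gives

$$X_1 = 0 + (-0.0267)\left[e^{-\frac{i 2\pi (1)}{29}}\right] + (0.0294)\left[e^{-\frac{i 2\pi (2)}{29}}\right] + (0.0688)\left[e^{-\frac{i 2\pi (3)}{29}}\right] + (0.1250)\left[e^{-\frac{i 2\pi (4)}{29}}\right] + \dots + (-0.019)\left[e^{-\frac{i 2\pi (28)}{29}}\right] \tag{5.3}$$

$$X_1 = (-0.0267)\left[e^{-\frac{i 2\pi}{29}}\right] + (0.0294)\left[e^{-\frac{i 4\pi}{29}}\right] + (0.0688)\left[e^{-\frac{i 6\pi}{29}}\right] + (0.1250)\left[e^{-\frac{i 8\pi}{29}}\right] + \dots + (-0.019)\left[e^{-\frac{i 56\pi}{29}}\right] \tag{5.4}$$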

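We can now convert these into the polar form.

$$X_1 = (-0.0267)\left[\cos\left(-\frac{2\pi}{29}\right) + i \sin\left(-\frac{2\pi}{29}\right)\right] + (0.0294)\left[\cos\left(-\frac{4\pi}{29}\right) + i \sin\left(-\frac{4\pi}{29}\right)\right]$$

$$+ (0.0688)\left[\cos\left(-\frac{6\pi}{29}\right) + i \sin\left(-\frac{6\pi}{29}\right)\right] + (0.1250)\left[\cos\left(-\frac{8\pi}{29}\right) + i \sin\left(-\frac{8\pi}{29}\right)\right]$$

$$+ \dots + (-0.019)\left[\cos\left(-\frac{56\pi}{29}\right) + i \sin\left(-\frac{56\pi}{29}\right)\right] \tag{5.5}$$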

Evaluating these gives us:

$$X_1 = (-0.0267)(0.976621 - 0.214970i) + (0.0294)(0.907575 - 0.419889i)$$

$$+ (0.0688)(0.796093 - 0.605174i) + (0.1250)(0.647386 - 0.762162i)$$

$$+ \dots + (-0.019)(0.976621 + 0.214970i) \tag{5.6}$$
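The complex factors quoted in (5.6) can be reproduced with a few lines of Python. This is a sketch for checking the arithmetic only: the helper name `factor` is hypothetical, and because the text lists just the first eight of the 29 samples, the sum computed at the end is a partial contribution and not the full value of $X_1$.

```python
import numpy as np

N = 29
# First eight samples from Table 1; the other 21 are not listed in the text.
x = np.array([0, -0.0267, 0.0294, 0.0688, 0.1250, 0.2037, 0.2487, 0.2993])


def factor(k, n):
    """The complex factor e^{-i 2 pi k n / N} that multiplies x_n in X_k."""
    return np.exp(-2j * np.pi * k * n / N)


for n in range(1, 5):
    w = factor(1, n)
    print(f"n = {n}: {w.real:.6f} {w.imag:+.6f}i")

# Partial sum over the eight listed samples only (NOT the complete X_1).
partial = sum(x[n] * factor(1, n) for n in range(len(x)))
print(partial)
```

The factor for $n = 28$ is the complex conjugate of the factor for $n = 1$, which is why the last bracket in (5.6) has a positive imaginary part.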

We can repeat these steps for $X_2$ also

$$X_2 = \sum_{n=0}^{28} x_n \cdot e^{-\frac{i 2\pi (2)(n)}{29}} \tag{5.7}$$

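Expanding this gives

$$X_2 = 0 + (-0.0267)\left[e^{-\frac{i 2\pi (2)}{29}}\right] + (0.0294)\left[e^{-\frac{i 2\pi (4)}{29}}\right] + (0.0688)\left[e^{-\frac{i 2\pi (6)}{29}}\right] + (0.1250)\left[e^{-\frac{i 2\pi (8)}{29}}\right] + \dots + (-0.019)\left[e^{-\frac{i 2\pi (56)}{29}}\right] \tag{5.8}$$

$$X_2 = (-0.0267)\left[e^{-\frac{i 4\pi}{29}}\right] + (0.0294)\left[e^{-\frac{i 8\pi}{29}}\right] + (0.0688)\left[e^{-\frac{i 12\pi}{29}}\right] + (0.1250)\left[e^{-\frac{i 16\pi}{29}}\right] + \dots + (-0.019)\left[e^{-\frac{i 112\pi}{29}}\right] \tag{5.9}$$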

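Converting to polar form gives

$$X_2 = (-0.0267)\left[\cos\left(-\frac{4\pi}{29}\right) + i \sin\left(-\frac{4\pi}{29}\right)\right] + (0.0294)\left[\cos\left(-\frac{8\pi}{29}\right) + i \sin\left(-\frac{8\pi}{29}\right)\right]$$

$$+ (0.0688)\left[\cos\left(-\frac{12\pi}{29}\right) + i \sin\left(-\frac{12\pi}{29}\right)\right] + (0.1250)\left[\cos\left(-\frac{16\pi}{29}\right) + i \sin\left(-\frac{16\pi}{29}\right)\right]$$

$$+ \dots + (-0.019)\left[\cos\left(-\frac{112\pi}{29}\right) + i \sin\left(-\frac{112\pi}{29}\right)\right] \tag{5.10}$$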

We can see how evaluating the DFT by hand quickly becomes very inefficient for any input with a sample size N larger than around 10. Additionally, we must carry out all of the calculations ourselves, which is impractical if we are to accurately analyse an entire sound wave. The selected samples in Figure 7 span only 0.012 seconds. For context, the entire sustained audio is about 12 seconds long. A selection this small is inadequate, as it does not provide a large enough range of values for the DFT to accurately estimate the frequencies present. Hence, we must look to another method.

5.1 Fast Fourier Transform


The Fast Fourier Transform (FFT) is an algorithm that computes the DFT of all of the selected track’s samples, giving the same output as the DFT would if we manually calculated every term, but with far fewer operations (Heideman et al.). Fortunately, we can write code to compute this for us. The program in Appendix A is written in Python and uses the FFT routines of the SciPy library, with plotting functions modelled on those of MATLAB (MATLAB), a numerical computing environment.
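To get a feel for why the FFT matters here, the sketch below compares rough operation counts for the 12-second recording. The numbers are order-of-magnitude estimates only: the sampling rate of 44100 Hz is an assumption (the rate of the recording is not stated in this investigation), and the counts $N^2$ and $N \log_2 N$ are the usual textbook estimates rather than exact figures.

```python
import math

sample_rate = 44100  # assumed sampling rate in Hz, not stated in the text
duration = 12        # seconds, the approximate length of the recording
N = sample_rate * duration

direct_dft = N ** 2               # each of the N outputs needs N multiplications
fft_estimate = N * math.log2(N)   # usual order-of-magnitude estimate for the FFT

print(f"N = {N}")
print(f"direct DFT: about {direct_dft:.2e} multiplications")
print(f"FFT:        about {fft_estimate:.2e} operations")
print(f"ratio:      about {direct_dft / fft_estimate:.0f}")
```

Under these assumptions the direct sum needs tens of thousands of times more work than the FFT, which is why the whole recording can be analysed on a computer but not by hand.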

The input (X,n) is a matrix that contains the time and value of the individual samples.
Using Audacity, we can export these values into a .txt file. A snippet of this file can be
accessed on this Pastebin link: https://pastebin.com/CzfLg5Ap.

Inputting this data as a matrix into our code provides us with the frequency-domain output of the sound wave. Finally, we are able to plot this output as a spectrogram.


Figure 8: Spectrum of recorded audio file

In Figure 8, the frequencies with higher intensities are coloured in a lighter orange/red, while frequencies with lower intensities are coloured purple. From this image, we can also notice that the y-axis, denoting the frequencies, is arranged on a logarithmic scale. This makes the plot easier to read, as the frequencies of successive piano keys increase exponentially, not linearly.
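The strongest frequencies can also be located numerically rather than read off a plot. The sketch below is an illustration under stated assumptions, not the procedure used for the figures: it builds a synthetic chord from three pure sine waves (the frequencies are illustrative, and a real piano recording also contains overtones), and the function name `strongest_frequencies` and the sampling rate of 8000 Hz are hypothetical choices.

```python
import numpy as np


def strongest_frequencies(signal, sample_rate, count):
    """Return the `count` strongest frequencies (in Hz) of a real signal."""
    magnitude = np.abs(np.fft.rfft(signal))
    freqs = np.fft.rfftfreq(len(signal), d=1 / sample_rate)
    # Keep only local maxima so that one peak is not counted twice.
    is_peak = (magnitude[1:-1] > magnitude[:-2]) & (magnitude[1:-1] > magnitude[2:])
    peak_idx = np.flatnonzero(is_peak) + 1
    top = peak_idx[np.argsort(magnitude[peak_idx])[-count:]]
    return sorted(freqs[top].tolist())


sample_rate = 8000                        # assumed rate for the synthetic signal
t = np.arange(sample_rate) / sample_rate  # one second of samples
chord = sum(np.sin(2 * np.pi * f * t) for f in (263.0, 330.0, 392.0))
print(strongest_frequencies(chord, sample_rate, 3))
```

With one second of signal the frequency bins are 1 Hz apart, so each of the three sine waves falls on its own bin and appears as a sharp peak.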

The spectrogram tells us that the first visible frequency is 263 Hz, followed by 330 Hz and 392 Hz. We can use a frequency-to-note chart to find which keys these frequencies denote. The results we get are the C-note, the E-note and the G-note respectively. These notes, when played together, form the C-chord. In doing so, we have successfully decomposed the audio wave using the Fourier transform, and have identified the constituent notes.
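The lookup in a frequency-to-note chart can be sketched as a short function. This is an added illustration, assuming twelve-tone equal temperament with the A above middle C tuned to 440 Hz; the function name `nearest_note` and the octave numbering (middle C written as C4) are conventions chosen here, not taken from the chart used in this investigation.

```python
import math

NOTE_NAMES = ["C", "C#", "D", "D#", "E", "F", "F#", "G", "G#", "A", "A#", "B"]


def nearest_note(frequency_hz, a4_hz=440.0):
    """Name of the equal-tempered note closest to the given frequency."""
    # Number of semitones above (or below) A4, rounded to the nearest key.
    semitones = round(12 * math.log2(frequency_hz / a4_hz))
    midi = 69 + semitones  # A4 is MIDI note number 69
    return f"{NOTE_NAMES[midi % 12]}{midi // 12 - 1}"


for f in (263, 330, 392):
    print(f, "Hz ->", nearest_note(f))
```

The logarithm appears because each key is a fixed ratio, not a fixed number of hertz, above the previous one, which is also why the frequency axis of the spectrum is logarithmic.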

There is, however, a question that arises when viewing Figure 8. If the only notes played were the three stated above, why are there so many patches of light purple? The answer is found through understanding the basic behaviour of a piano. Since, by most regards, the piano is a percussive instrument, it exhibits a phenomenon known as ‘sympathetic reverberation’.

Definition 5.11. Sympathetic reverberation refers to a sonic phenomenon wherein an otherwise inactive body responds to external vibrations with which it is in harmony (von Helmholtz and Ellis).

Essentially, Definition 5.11 states that when one piano note with the pitch C is played, the strings of the other C notes on the piano also vibrate, by association, to a slight extent. This phenomenon explains why the frequency spectrum plot displays so many active frequencies. We can repeat this process for another recording, this time with four notes instead of the three we analysed earlier.

The process remains the same. Below is the recorded waveform. Let us name it waveform 2.

Figure 9: Recorded waveform 2 in Audacity

We can export the sample data to a .txt file once again⁴. Running our Python code as given in Appendix A on the sample data file, we can plot the spectrum.

Figure 10: Spectrum of recorded audio file 2

From this spectrum, we can observe that the most prominent notes have the frequencies 247 Hz, 293 Hz, 370 Hz and 440 Hz. The respective keys for these frequencies are B, D, F-sharp and A. These notes together form the B minor seventh (Bm7) chord. Hence, we have once again identified the notes present in this chord.

⁴ The link for the Pastebin file is: https://pastebin.com/cRD8V195

6 Conclusion
This investigation has demonstrated how piano chords can be decomposed into their con-
stituent notes using the Fourier transform and a basic understanding of complex numbers.

During the research for this investigation, I was faced with the issue of not being able to use the continuous Fourier Transform, for the reasons mentioned in section 4.1. I did, eventually, manage to use the Discrete Fourier Transform to move on to the next step. I came across another hurdle here as well, as processing individual samples by hand would have been very time-inefficient. Realising that brute-force mathematics is not always the answer, I optimised my investigation through the use of the Fast Fourier Transform routines available in Python's scientific libraries. As I had some prior knowledge of the programming language Python, I was able to put together a satisfactory program that performed the decomposition for me. This investigation could have been performed better had I had access to the entire MATLAB library, with which I could have created frequency-amplitude diagrams that display the frequency-amplitude relationship in a more straightforward and elegant manner.

Another challenge was learning to use LaTeX to typeset the document. Creating the diagrams and typesetting the equations so that they would be neat and legible proved to be a harder task than expected. However, the knowledge I have gained from this investigation in terms of displaying mathematics efficiently and concisely will assist me in my further studies. Despite these hurdles, the investigation allowed me to analyse the decomposition of piano chords through mathematics, bringing together music and mathematics, two fields that I had not thought of as interlinked prior to undertaking this investigation.


A Fast Fourier Transform Code

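import numpy as np
import matplotlib.pyplot as plt
from scipy import fft  # scipy.fft supersedes the legacy scipy.fftpack module

# Sample data exported from Audacity as plain text.
data = np.loadtxt('export-data.txt')

# If the export has several columns (e.g. time and value), keep the values.
wave = data[:, -1] if data.ndim > 1 else data

# Time between samples in seconds. ASSUMPTION: a 44.1 kHz recording;
# change this to match the project rate shown in Audacity.
dt = 1 / 44100

# rfft keeps only the non-negative frequencies of a real-valued signal.
magnitude = np.abs(fft.rfft(wave))
freqs = fft.rfftfreq(len(wave), dt)

plt.figure()
plt.grid()
plt.title('Fast Fourier Transform of Wave')
plt.xlabel('Frequency (Hz)')
plt.ylabel('10 log10 of magnitude')
# The small offset avoids taking log10(0) for empty frequency bins.
plt.plot(freqs, 10 * np.log10(magnitude + 1e-12))
plt.show()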


Works Cited
Benward, B., and M.N. Saker. Music in Theory and Practice. McGraw-Hill. Music in Theory and Practice v. 1.
Bronshtein, I.N., et al. Handbook of Mathematics. Springer, 2003. Google Books, books.google.co.in/books?id=Ao8rAAAAYAAJ.
Ceruzzi, P.E. Computing: A Concise History. MIT P, 2012. Google Books, books.google.co.in/books?id=BFT5DwAAQBAJ. The MIT Press Essential Knowledge series.
Fahy, F., and D. Thompson. Fundamentals of Sound and Vibration. CRC P, 2015. Google Books, books.google.co.in/books?id=znd3CAAAQBAJ.
Fourier, J.B.J. Théorie analytique de la chaleur. Chez Firmin Didot, père et fils, 1822. Google Books, books.google.co.in/books?id=TDQJAAAAIAAJ. Manuscripta; History of science, 18th and 19th century.
Heideman, M., et al. “Gauss and the history of the fast Fourier transform”. IEEE ASSP Magazine, vol. 1, no. 4, Oct. 1984, pp. 14–21. https://doi.org/10.1109/MASSP.1984.1162257.
Hohenwarter, M., et al. “GeoGebra 5.0.507.0”, Nov. 2022, http://www.geogebra.org.
MATLAB. version 7.10.0 (R2010a). The MathWorks, 2010.
Naidu, M. Engineering Physics. Pearson Education India, 2013. Google Books, books.google.co.in/books?id=Tzs8BAAAQBAJ. Always learning.
Pen, R. Schaum’s Outline of Introduction To Music. McGraw-Hill Education, 1992. Google Books, books.google.co.in/books?id=9i84IQaIBWIC. A Schaum’s Book.
Self, D. Audio Engineering Explained. Focal, 2010. Google Books, books.google.co.in/books?id=WzYm1hGnCn4C.
Von Helmholtz, H., and A.J. Ellis. On the Sensations of Tone as a Physiological Basis for the Theory of Music. Longmans, Green, 1885. Google Books, books.google.co.in/books?id=GwE6AAAAIAAJ.
Weisstein, Eric W. “Fourier Series”, MathWorld–A Wolfram Web Resource, mathworld.wolfram.com/FourierSeries.html.
