Visualizing Sound Victor Cheng
Victor Cheng
accompanied by sound, whether we notice it or not. Have you ever thought about what makes a
sound, a sound? Why do different things have different sounds? In this article, I will be using a
Tibetan Singing Bowl and the human voice to help demonstrate properties of sound and timbre.
The App:
I used a free program called Spectralissime, made available by VB-Audio Software, a French
software company specializing in real-time digital audio processing. I captured my audio with a
Microsoft LifeCam VX-5000. The program has a variety of modes for displaying sound, and I
adjusted the settings to get the clearest images for this project.
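How Spectralissime computes its display is not described here, but the basic idea behind any spectrum view can be sketched in a few lines. The example below is an illustration under stated assumptions, not the program's method: it builds a synthetic tone on the C# reference pitch with made-up harmonic amplitudes, then measures how much of each frequency is present with a single-frequency Fourier sum. The function names, sample rate, duration, and amplitudes are all hypothetical choices.

```python
import math

SAMPLE_RATE = 4000   # samples per second (assumed; high enough for 554.4 Hz)
DURATION_S = 5.0     # chosen so every test frequency completes whole cycles

def synth_tone(fundamental_hz, harmonic_amps):
    """Sum of sine waves; harmonic_amps[k] is the amplitude of harmonic k + 1."""
    n_samples = int(SAMPLE_RATE * DURATION_S)
    return [
        sum(amp * math.sin(2 * math.pi * fundamental_hz * (k + 1) * n / SAMPLE_RATE)
            for k, amp in enumerate(harmonic_amps))
        for n in range(n_samples)
    ]

def magnitude_at(samples, freq_hz):
    """Amplitude of the component at freq_hz, from one Fourier sum."""
    re = sum(s * math.cos(2 * math.pi * freq_hz * n / SAMPLE_RATE)
             for n, s in enumerate(samples))
    im = sum(s * math.sin(2 * math.pi * freq_hz * n / SAMPLE_RATE)
             for n, s in enumerate(samples))
    return 2 * math.hypot(re, im) / len(samples)

# Made-up amplitudes: overtones louder than the fundamental.
tone = synth_tone(138.6, [0.5, 1.0, 0.8, 0.3])

for freq in (138.6, 277.2, 415.8, 554.4, 200.0):
    print(f"{freq:6.1f} Hz -> amplitude {magnitude_at(tone, freq):.2f}")
```

A spectrum analyzer effectively repeats this measurement across many frequencies at once and plots the results, which is why each harmonic of a sung note appears as its own peak while frequencies between harmonics (200 Hz here) stay near zero.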
Sound Analysis:
The first of the three sounds I will be looking at is one that is fundamental to our lives: the
human voice. Using an online tone generator for a reference note of C# (138.6 Hz), I expected
the fundamental frequency to appear at that value, followed by the first, second, and third
overtones at 277.2 Hz, 415.8 Hz, and 554.4 Hz (two, three, and four times the fundamental).
Singing C# on a U (oo) vowel, I was surprised to find that the overtones appeared louder than
the fundamental. Looking at the distribution of the higher frequencies, I noticed that on the U
(oo) vowel there is a region between approximately 1 kHz and 2 kHz where the frequencies are
much quieter than those above and below it. To analyze this further, I changed the sound of my
voice by modifying the vowel being sung. While differences in timbre between individual voices
are quite apparent, the online resource HyperPhysics notes that “different vowels sounds have
distinctly different timbres”. To expand on this, a Fourier spectrum analysis lets us identify
differences between vowels sung on the same pitch. For the next test, using C# (138.6 Hz) as a
reference pitch, I sang through the five main Italian vowels of singing: Eh (as in bet), I (as
in ski), Ah (as in tall), O (as in slow), and U (as in you). In the same breath, I maintained
the same pitch while using my tongue to change the space in my mouth, and with it the vowel. In
the Fourier spectrum analysis, note that the fundamental frequency never changes, though the
heights of the peaks at the various overtones do. For the relatively muted vowel U, I had
already observed a noticeable dip in the 1 kHz to 2 kHz range. A vowel such as Ah, by
comparison, shows more consistent activity throughout the spectrum, with a more uniform
distribution in the 500 Hz to 4 kHz range. This may explain why U is perceived as a relatively
closed and round vowel, whilst Ah is perceived as brighter and larger. It is also interesting to
note the similarities and differences between vowels that sit on the same horizontal axis of
the IPA vowel chart. Chart taken from the International Phonetic Association
website.
Looking at /i/, /u/, /e/ and /o/, there are similarities in their distributions on a Fourier
spectrum. When alternating between horizontally paired vowels, I notice that the low and high
ends of the frequency range stay relatively similar and hardly change in intensity as I swap.
However, the mid-range frequencies, from about 700 Hz to 1900 Hz, seem integral to changing how
the sound is perceived. On vowels with a more forward tongue placement (/i/ and /e/), energy is
concentrated toward the upper end of this mid-range. When I swap to the paired back vowel (/u/
and /o/ respectively), the upper end of the mid-range drops in intensity and the energy shifts
toward its lower end.
Progression of the Human Voice on different Vowels, U, Ah, Ay, Ee, Oh.
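The arithmetic behind the expected peaks is easy to reproduce. As a minimal sketch (the function names are illustrative, and the 700 Hz to 1900 Hz band is the approximate range read off the spectra above), the snippet below lists the first harmonics of the C# reference pitch and then shows which harmonics of that pitch fall inside the mid-range that changes between front and back vowels.

```python
def harmonic_series(fundamental_hz, count):
    """Return the first `count` harmonics (1x, 2x, 3x, ...) rounded to 0.1 Hz."""
    return [round(fundamental_hz * n, 1) for n in range(1, count + 1)]

def harmonics_in_band(fundamental_hz, low_hz, high_hz):
    """Return (harmonic number, frequency) pairs inside [low_hz, high_hz]."""
    result = []
    n = 1
    while fundamental_hz * n <= high_hz:
        if fundamental_hz * n >= low_hz:
            result.append((n, round(fundamental_hz * n, 1)))
        n += 1
    return result

C_SHARP_HZ = 138.6  # reference pitch used in the article

# Fundamental plus the first three overtones.
print(harmonic_series(C_SHARP_HZ, 4))
# [138.6, 277.2, 415.8, 554.4]

# Harmonics that land in the approximate mid-range discussed above.
for n, freq in harmonics_in_band(C_SHARP_HZ, 700, 1900):
    print(f"harmonic {n}: {freq} Hz")
```

On this pitch the band contains harmonics 6 through 13, so a vowel change shows up as a shift in which of those eight peaks are strongest, not as a change in where the peaks sit.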
The second sound analyzed is a small Tibetan singing bowl. Comparing the sound of the Tibetan
singing bowl to a tuning fork and the human voice, I anticipated that this sound would have
noticeable peaks at each overtone. In addition, I hypothesized that the 4th overtone (the 5th
harmonic, five times the fundamental), which is the 3rd of the major scale, would have a
noticeable peak, as that pitch is very present when playing the singing bowl. Upon analysis, I
find that the peaks of the Tibetan singing bowl are very distinct, with the sound containing
noticeably fewer inharmonic components than the human voice. I believe this ‘calmness’ in the
Fourier spectrum analysis correlates to the ‘pure’ quality of sound that presents itself when the
singing bowl is played. The timbre of the Tibetan singing bowl has a certain focused quality,
which I observe as the fundamental being highlighted by the harmonics, with far fewer competing
inharmonic components to fill out the sound and ‘dirty’ the wave.
However, my hypothesis about which overtones would be present was wrong. I expected the
Fourier spectrum of the singing bowl to show the harmonic series of the instrument, with a
noticeable peak at the overtone I perceived as more present. However, the images show no
evidence of this. In fact, the data indicates exactly what I hear, a
5.5Hz.
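Which member of the harmonic series sounds the third of the major scale can be checked with a little equal-temperament arithmetic. The sketch below is illustrative only: it counts the fundamental as harmonic 1 (so the nth overtone is harmonic n + 1), converts each frequency ratio to semitones with 12 · log2(n), and names the nearest equal-tempered interval. The function and table names are hypothetical.

```python
import math

INTERVAL_NAMES = [
    "unison", "minor 2nd", "major 2nd", "minor 3rd", "major 3rd", "perfect 4th",
    "tritone", "perfect 5th", "minor 6th", "major 6th", "minor 7th", "major 7th",
]

def harmonic_interval(n):
    """Return (octaves, nearest interval name, deviation in cents) for harmonic n."""
    semitones = 12 * math.log2(n)           # distance above the fundamental
    nearest = round(semitones)              # nearest equal-tempered step
    cents_off = (semitones - nearest) * 100
    octaves, step = divmod(nearest, 12)
    return octaves, INTERVAL_NAMES[step], round(cents_off, 1)

for n in range(1, 7):
    octaves, name, cents = harmonic_interval(n)
    print(f"harmonic {n}: {octaves} octave(s) + {name} ({cents:+.1f} cents)")
```

Under this numbering, harmonic 4 is exactly two octaves above the fundamental, while harmonic 5 (the 4th overtone) lands two octaves plus a major third above it, about 14 cents flat of the equal-tempered third.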
Elsing, John. Handbook of the International Phonetic Association. Cambridge University Press,
1999.
Nave, Rod. “Harmonic Content Differences in Vowel Sounds” Hyperphysics Sound. Georgia
2020.