Audio Data Analysis Using Python

13/02/22, 13:58 Audio Data Analysis using Python
October 5, 2021
AUDIO DATA ANALYSIS USING PYTHON
Shubham Kumar Shukla

Shubham9455
Duration: 15 min
Categories: Data Science, Data Analysis & Pattern Matching
Tags: Python, Physics
TOPCODER THRIVE
https://www.topcoder.com/thrive/articles/audio-data-analysis-using-python
Before we discuss audio data analysis, it is important to learn some physics-based concepts of audio and sound, such as its definition and parameters like amplitude, wavelength, frequency, time period, phase, and intensity. Here are some concepts and mathematical equations.
Definition of audio (sound):
Sound is a form of energy produced by the vibration of an object. The vibration changes the pressure of the surrounding air, and this change in pressure causes air molecules to oscillate.
Mechanical wave:
An oscillation that travels through space;
Energy is transferred from one point to another point;
A medium is required.
When we hear sound produced by any source, our brain processes it and gathers information from it. Sound data can have a well-defined structure: our brain recognizes the pattern corresponding to each word, and, in the other direction, understandable text can be encoded as a waveform. From that waveform, numerical data is gathered in the form of frequencies.
Sound data comes in various formats:
1. WAV (Waveform Audio File Format)
2. MP3 (MPEG-1 Audio Layer 3) format
3. WMA (Windows Media Audio) format
Below is the corresponding waveform we get from a sound data plot.
The above waveform carries:
1. Frequency
2. Intensity
Now let us see how a sound wave is represented mathematically.
Y(t) = A sin(2πft + φ)
where A is the amplitude, f is the frequency, t is the time, and φ is the phase.
Amplitude:
Amplitude is the maximum displacement of the wave from its equilibrium position, that is, half the distance between its maximum and minimum values.
In the above equation amplitude is represented as A.
Wavelength:
Wavelength is the distance the wave travels in one time period.
Phase:
Phase is the position of the wave relative to its equilibrium point at time t = 0.
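The definitions above can be tried out numerically. The following is a minimal sketch, assuming illustrative values for the amplitude, frequency, phase, and sample rate (none of them come from the article) and an assumed speed of sound in air of 343 m/s. It samples the sine equation and derives the time period and wavelength.

```python
import math

def sine_wave(amplitude, frequency, phase, sample_rate, duration):
    """Sample Y(t) = A sin(2*pi*f*t + phase) at a fixed sample rate."""
    n_samples = round(sample_rate * duration)
    return [
        amplitude * math.sin(2 * math.pi * frequency * n / sample_rate + phase)
        for n in range(n_samples)
    ]

# Hypothetical values chosen only for illustration.
A, f, phase = 0.5, 5.0, math.pi / 2
sr = 1000                        # samples per second
wave = sine_wave(A, f, phase, sr, duration=1.0)

period = 1.0 / f                 # time period T = 1 / f
speed_of_sound = 343.0           # m/s in air (assumed)
wavelength = speed_of_sound * period

print(len(wave))                 # number of samples in one second
print(max(wave), min(wave))      # peaks sit near +A and -A
print(wave[0])                   # value at t = 0 is A * sin(phase)
print(period, wavelength)
```

With a phase of π/2 the wave starts at its maximum instead of at the equilibrium point, which is what the phase term controls.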
Now we will look at some important terms like intensity, loudness, and timbre.
Power:
power = 2π·F/T
1. The rate at which energy is transferred
2. The energy emitted by a sound source in all directions per unit time
3. Per unit area, it is measured in watt/m^2
Power per unit area is called sound intensity, which we perceive as loudness.
Intensity is measured as follows:
1. On a logarithmic scale
2. Expressed in decibels (dB)
3. Based on the ratio between two intensity values
4. Uses a reference value
dB = 10 · log10(I1 / I2)
where I1 and I2 are the two intensity values being compared.
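As a worked example of the decibel formula, here is a small sketch. The function name and the reference intensity of 1e-12 W/m^2 (a commonly used threshold of hearing) are assumptions made for illustration.

```python
import math

def intensity_level_db(intensity, reference):
    """Return 10 * log10(intensity / reference) in decibels."""
    if intensity <= 0 or reference <= 0:
        raise ValueError("intensities must be positive")
    return 10.0 * math.log10(intensity / reference)

I_REF = 1e-12  # W/m^2, assumed reference intensity

print(intensity_level_db(1e-6, I_REF))       # a million times the reference
print(intensity_level_db(2 * I_REF, I_REF))  # doubling the intensity adds about 3 dB
```

Because the scale is logarithmic, multiplying the intensity by 10 always adds 10 dB, whatever the starting level.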
Timbre:
Timbre describes the quality of a sound: it is what lets us tell apart two sounds that have the same pitch and loudness. Just as a heatmap uses different colors for different magnitudes, timbre distinguishes the different sounds in one file, and a library can show those differences on a graphical plot.
1. Timbre is multidimensional
2. Sound envelope
3. Harmonic content
4. Amplitude and frequency modulation
The sound envelope is commonly described by the attack-decay-sustain-release (ADSR) model; below is a graphical analysis.
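The attack-decay-sustain-release model can be sketched as a piecewise-linear envelope. This is only an illustration: the function name, the linear segments, and the stage durations are assumptions, and real instruments usually have curved envelopes.

```python
def adsr_envelope(n_samples, sample_rate, attack, decay, sustain_level, release):
    """Piecewise-linear ADSR envelope with values in [0, 1].

    attack, decay and release are durations in seconds; sustain_level is
    the amplitude held between the decay and release stages.
    """
    n_attack = round(attack * sample_rate)
    n_decay = round(decay * sample_rate)
    n_release = round(release * sample_rate)
    n_sustain = n_samples - n_attack - n_decay - n_release
    if n_sustain < 0:
        raise ValueError("attack + decay + release exceed the total length")

    envelope = []
    # Attack: rise from 0 toward 1.
    envelope += [i / n_attack for i in range(n_attack)]
    # Decay: fall from 1 toward the sustain level.
    envelope += [1 - (1 - sustain_level) * i / n_decay for i in range(n_decay)]
    # Sustain: hold a constant level.
    envelope += [sustain_level] * n_sustain
    # Release: fall from the sustain level toward 0.
    envelope += [sustain_level * (1 - i / n_release) for i in range(n_release)]
    return envelope

# Hypothetical one-second envelope at 1000 samples per second.
env = adsr_envelope(1000, 1000, attack=0.1, decay=0.2, sustain_level=0.6, release=0.3)
print(len(env), env[0], max(env), env[500])
```

Multiplying a tone sample by sample with such an envelope changes its timbre without changing its pitch.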
The above data is in the form of analog signals. Because sound is a mechanical wave, we have to convert these analog signals into digital signals, which is done, as in image processing, by sampling and quantization.
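Sampling and quantization can be illustrated with a short sketch. The function name, the 3 Hz test tone, the 50 Hz sample rate, and the signed 8-bit depth are all assumptions chosen to keep the example small.

```python
import math

def sample_and_quantize(signal, duration, sample_rate, n_bits):
    """Sample a continuous-time function and round each sample to n_bits.

    signal is a function of time (in seconds) returning values in [-1, 1].
    """
    levels = 2 ** (n_bits - 1) - 1   # 127 for signed 8-bit
    n_samples = round(duration * sample_rate)
    # Sampling: read the signal at evenly spaced instants.
    samples = [signal(n / sample_rate) for n in range(n_samples)]
    # Quantization: round each sample to the nearest integer code.
    codes = [round(s * levels) for s in samples]
    return samples, codes

def analog(t):
    """Stand-in for an analog signal: a 3 Hz tone."""
    return math.sin(2 * math.pi * 3 * t)

samples, codes = sample_and_quantize(analog, duration=1.0, sample_rate=50, n_bits=8)
reconstructed = [c / 127 for c in codes]
max_error = max(abs(s - r) for s, r in zip(samples, reconstructed))

print(len(codes), min(codes), max(codes))
print(max_error)   # stays within half a quantization step
```

A higher sample rate captures higher frequencies, while more bits per sample shrink the rounding error.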
There are many techniques for data analysis, such as statistical and graphical ones. Here we look at the graphical way of performing data analysis.
SPECTROGRAM
Using a spectrogram we represent the sound intensity of audio data with respect to frequency and time. Each time-frequency cell is assigned a color according to its magnitude, so that low and high values are drawn at opposite ends of the color range. A spectrogram is therefore a kind of heatmap.
Below is code for a spectrogram.
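import librosa
import librosa.display
import matplotlib.pyplot as plt

# Load the audio file as a waveform x with sample rate sr
audio = 'training\\00003.wav'
x, sr = librosa.load(audio)

# Short-time Fourier transform, converted to decibels
X = librosa.stft(x)
Xdb = librosa.amplitude_to_db(abs(X))

# Draw the spectrogram as a heatmap of time against frequency
plt.figure(figsize=(10, 5))
librosa.display.specshow(Xdb, sr=sr, x_axis='time', y_axis='hz')
plt.colorbar()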
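To show what a spectrogram contains, here is a rough standard-library sketch that builds one by hand. It is not how librosa is implemented: the naive DFT, the Hann window, the frame size, and the two-tone test signal are assumptions chosen for readability, not speed.

```python
import cmath
import math

def dft_magnitudes(frame):
    """Magnitudes of DFT bins 0 .. N/2 of one frame (naive O(N^2) DFT)."""
    n = len(frame)
    return [
        abs(sum(frame[k] * cmath.exp(-2j * math.pi * b * k / n) for k in range(n)))
        for b in range(n // 2 + 1)
    ]

def spectrogram_db(signal, frame_size, hop):
    """Return one list of dB magnitudes per frame (one row per time step)."""
    rows = []
    for start in range(0, len(signal) - frame_size + 1, hop):
        # A Hann window reduces leakage at the frame edges.
        frame = [
            signal[start + k] * (0.5 - 0.5 * math.cos(2 * math.pi * k / frame_size))
            for k in range(frame_size)
        ]
        mags = dft_magnitudes(frame)
        rows.append([20 * math.log10(max(m, 1e-10)) for m in mags])
    return rows

# Hypothetical test signal: 100 Hz for the first half, 300 Hz for the second.
sr = 1600
signal = [math.sin(2 * math.pi * (100 if n < 800 else 300) * n / sr) for n in range(1600)]
spec = spectrogram_db(signal, frame_size=64, hop=64)

bin_hz = sr / 64   # frequency resolution of each bin
loudest = [row.index(max(row)) * bin_hz for row in spec]
print(len(spec), len(spec[0]))
print(loudest[0], loudest[-1])   # dominant frequency in the first and last frame
```

Each row is one column of the heatmap: the plotting call simply maps these dB values to colors.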
FEATURE EXTRACTION:
All sound data has features like loudness, intensity, amplitude, phase, and angular velocity. However, we extract only the useful or relevant information. Feature extraction means extracting features so that we can use them for analysis.
There are many libraries in Python for working on audio data analysis, such as:
1. Librosa
2. IPython.display.Audio
3. spaCy, etc.
CENTROID OF WAVE:
The spectral centroid indicates the frequency around which the energy of a sound is concentrated: it is the magnitude-weighted mean of the frequencies present in the spectrum. In other words, it is the center of mass of the audio spectrum.
Below is the code of the program.
import sklearn.preprocessing

# Spectral centroid for each frame of the signal
spectral_centroids = librosa.feature.spectral_centroid(y=x, sr=sr)[0]
print(spectral_centroids.shape)  # (775,) for the author's file

# Computing the time variable for visualization
frames = range(len(spectral_centroids))
t = librosa.frames_to_time(frames, sr=sr)

# Normalising the spectral centroid for visualisation
def normalize(x, axis=0):
    return sklearn.preprocessing.minmax_scale(x, axis=axis)

# Plotting the spectral centroid along the waveform
# (waveshow replaces waveplot, which newer librosa releases removed)
librosa.display.waveshow(x, sr=sr, alpha=0.4)
plt.plot(t, normalize(spectral_centroids), color='b')
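To make the center-of-mass idea concrete, the sketch below computes a centroid for a single frame with the standard library only. The helper names, the naive DFT, and the two test tones are assumptions for illustration; librosa computes the same kind of weighted mean for every frame.

```python
import cmath
import math

def magnitude_spectrum(frame, sample_rate):
    """Return (frequencies, magnitudes) for DFT bins 0 .. N/2 of one frame."""
    n = len(frame)
    freqs = [b * sample_rate / n for b in range(n // 2 + 1)]
    mags = [
        abs(sum(frame[k] * cmath.exp(-2j * math.pi * b * k / n) for k in range(n)))
        for b in range(n // 2 + 1)
    ]
    return freqs, mags

def spectral_centroid(freqs, mags):
    """Magnitude-weighted mean frequency: sum(f * |X(f)|) / sum(|X(f)|)."""
    total = sum(mags)
    if total == 0:
        return 0.0
    return sum(f * m for f, m in zip(freqs, mags)) / total

# Hypothetical frames: a 100 Hz tone, and the same tone plus a 300 Hz tone.
sr, n = 800, 80
low = [math.sin(2 * math.pi * 100 * k / sr) for k in range(n)]
mixed = [low[k] + math.sin(2 * math.pi * 300 * k / sr) for k in range(n)]

centroid_low = spectral_centroid(*magnitude_spectrum(low, sr))
centroid_mixed = spectral_centroid(*magnitude_spectrum(mixed, sr))
print(centroid_low)     # close to 100 Hz
print(centroid_mixed)   # close to 200 Hz, midway between the two equal tones
```

Adding high-frequency content pulls the centroid upward, which is why it is often read as a measure of how bright a sound is.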
SPECTRAL ROLLOFF:
Spectral rolloff is the frequency below which a given fraction of the total spectral energy lies (librosa uses 85% by default). It marks the point where the high-frequency content of the signal falls away toward 0.
Below is the code of the function.
# Rolloff frequency for each frame (a small constant offset is added to the signal)
spectral_rolloff = librosa.feature.spectral_rolloff(y=x + 0.01, sr=sr)[0]

plt.plot(t, normalize(spectral_rolloff), color='r')
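The rolloff of a single frame can be sketched as a cumulative sum over its spectrum. The function below and the small magnitude spectrum are hypothetical and only illustrate the idea; the 0.85 default mirrors the librosa default.

```python
def spectral_rolloff(freqs, mags, roll_percent=0.85):
    """Lowest bin frequency at which the cumulative magnitude reaches
    roll_percent of the total magnitude."""
    threshold = roll_percent * sum(mags)
    cumulative = 0.0
    for f, m in zip(freqs, mags):
        cumulative += m
        if cumulative >= threshold:
            return f
    return freqs[-1]

# Hypothetical magnitude spectrum: most of the energy sits below 300 Hz.
freqs = [0, 100, 200, 300, 400, 500]
mags = [1.0, 4.0, 3.0, 1.0, 0.5, 0.5]

print(spectral_rolloff(freqs, mags))                    # 85% point
print(spectral_rolloff(freqs, mags, roll_percent=0.5))  # 50% point
```

A lower rolloff frequency means the energy is packed into the low end of the spectrum.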
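SPECTRAL BANDWIDTH:
Bandwidth is the difference between two frequencies, such as the highest and lowest frequencies of a band. The spectral bandwidth computed below measures how widely the spectrum spreads around its centroid; the parameter p sets the order of the spread and defaults to 2.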
# Spectral bandwidth of order p = 2 (the default), 3 and 4
spectral_bandwidth_2 = librosa.feature.spectral_bandwidth(y=x + 0.01, sr=sr)[0]
spectral_bandwidth_3 = librosa.feature.spectral_bandwidth(y=x + 0.01, sr=sr, p=3)[0]
spectral_bandwidth_4 = librosa.feature.spectral_bandwidth(y=x + 0.01, sr=sr, p=4)[0]

plt.plot(t, normalize(spectral_bandwidth_2), color='r')
plt.plot(t, normalize(spectral_bandwidth_3), color='g')
plt.plot(t, normalize(spectral_bandwidth_4), color='y')
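The order-p bandwidth of one frame can be sketched as the weighted spread of the spectrum around its centroid. The function and the two small spectra below are assumptions for illustration, with magnitudes normalized to sum to 1 before weighting.

```python
def spectral_bandwidth(freqs, mags, p=2):
    """Order-p spread around the centroid:
    (sum(w * |f - centroid| ** p)) ** (1 / p), with w the normalized magnitudes."""
    total = sum(mags)
    if total == 0:
        return 0.0
    weights = [m / total for m in mags]
    centroid = sum(f * w for f, w in zip(freqs, weights))
    spread = sum(w * abs(f - centroid) ** p for f, w in zip(freqs, weights))
    return spread ** (1.0 / p)

freqs = [0, 100, 200, 300, 400]
narrow = [0.0, 0.0, 1.0, 0.0, 0.0]   # all energy at 200 Hz
wide = [0.0, 1.0, 0.0, 1.0, 0.0]     # energy split between 100 Hz and 300 Hz

print(spectral_bandwidth(freqs, narrow))     # no spread around the centroid
print(spectral_bandwidth(freqs, wide))       # approximately 100 Hz
print(spectral_bandwidth(freqs, wide, p=3))
```

Both spectra share a centroid of 200 Hz, so the bandwidth is what separates a pure tone from a spread-out sound.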
