ULNN
Transcription
BY:
• MOHD ADNAN 2K22CSUN01149
• YASH POONIA 2K22CSUN01159
• ARYAN BHANOT 2K22CSUN01130
• KRISHNAV 2K21CSUN0
ABSTRACT
• This paper proposes an innovative approach to automating the transcription of guitar audio into tablature.
• It addresses the labor-intensive and error-prone nature of manual methods by utilizing convolutional autoencoder neural networks trained on a dataset of guitar audio recordings and corresponding tablature annotations.
• Through this training, the model learns to encode the unique features of
guitar music into a concise representation, enabling accurate tablature
generation.
• Leveraging machine learning and neural networks, this approach offers
increased speed, scalability, and accessibility.
• Actual guitar performances are used to establish a ground truth, enabling the model to learn the exact fingerings from which tablature can be created.
• Humphrey & Bello (2014) developed a novel approach for generating guitar tablature from music audio; the authors also introduced a set of metrics to measure the performance of tablature estimation systems.
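The network described below takes spectrogram "images" as input. A minimal NumPy-only sketch of that kind of preprocessing is shown here; the frame size, hop length, and normalization are illustrative assumptions, not the paper's exact pipeline:

```python
import numpy as np

def audio_to_spectrogram(audio, n_fft=254, n_frames=128):
    """Turn a mono audio clip into a 128x128 magnitude-spectrogram 'image'."""
    # Hop length chosen so exactly n_frames windows fit inside the clip
    hop = max(1, (len(audio) - n_fft) // (n_frames - 1))
    frames = np.stack([audio[i * hop : i * hop + n_fft] for i in range(n_frames)])
    # rfft of a 254-sample window yields 128 frequency bins -> (128, 128)
    spec = np.abs(np.fft.rfft(frames * np.hanning(n_fft), axis=1))
    spec /= spec.max() + 1e-9          # scale to [0, 1] for the network
    return spec[..., np.newaxis]       # add channel axis -> (128, 128, 1)

audio = np.sin(2 * np.pi * 440 * np.arange(16000) / 16000)  # 1 s of A4 at 16 kHz
print(audio_to_spectrogram(audio).shape)  # (128, 128, 1)
```

Each clip thus becomes a single-channel 128x128 array matching the input shape quoted in the architecture description.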
• The first layer is a 2D convolutional layer with 32 filters, a kernel size of (3, 3), and ReLU activation.
• It takes input images of size (128, 128, 1).
• This is followed by a max pooling layer with a pool size of (2, 2).
• Then comes another convolutional layer with 64 filters and ReLU activation, followed by another max pooling layer.
• The model is compiled using the Adam optimizer with mean squared error as the loss function, and the autoencoder is then trained using the spectrogram images as both input and target output, for 10 epochs with a batch size of 32, shuffling the data during training.
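The layers and training setup described above can be sketched in Keras as follows. This is a hedged reconstruction: `padding="same"` is assumed (it is what makes the 64x64x64 encoded size reported later work out), and the decoder is not detailed in the text, so a mirror of the encoder is assumed here.

```python
from tensorflow import keras
from tensorflow.keras import layers

def build_autoencoder():
    # Encoder as described: Conv2D(32, (3,3), ReLU) -> MaxPool(2,2)
    #                       -> Conv2D(64, ReLU) -> MaxPool(2,2)
    inp = keras.Input(shape=(128, 128, 1))
    x = layers.Conv2D(32, (3, 3), activation="relu", padding="same")(inp)
    x = layers.MaxPooling2D((2, 2))(x)
    x = layers.Conv2D(64, (3, 3), activation="relu", padding="same")(x)
    x = layers.MaxPooling2D((2, 2))(x)
    # Decoder (assumed: the text does not detail it; a mirror of the encoder)
    x = layers.Conv2DTranspose(64, (3, 3), strides=2, activation="relu", padding="same")(x)
    x = layers.Conv2DTranspose(32, (3, 3), strides=2, activation="relu", padding="same")(x)
    out = layers.Conv2D(1, (3, 3), activation="sigmoid", padding="same")(x)
    model = keras.Model(inp, out)
    model.compile(optimizer="adam", loss="mse")
    return model

autoencoder = build_autoencoder()
# Training as described: spectrograms serve as both input and target.
# autoencoder.fit(spectrograms, spectrograms, epochs=10, batch_size=32, shuffle=True)
```

Because the target equals the input, the mean-squared-error loss directly measures how faithfully the network reconstructs each spectrogram from its compressed representation.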
FEATURE EXTRACTION
• In our study, we simplified our complex 'autoencoder' model by creating a
new 'encoder' model using its first three layers.
• This 'encoder' focuses on extracting essential features from input images.
When given an image, it produces a condensed summary akin to
compressing a large painting into a smaller version, emphasizing key
elements.
• The encoded features comprise 8 samples (spectrogram images); each sample has a feature map of dimensions 64x64, with 64 channels (one per filter).
• Each channel captures different aspects or patterns present in the input
images.
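A minimal sketch of this feature-extraction step. For self-containment the encoder front (the autoencoder's first three layers: Conv 32 → MaxPool → Conv 64) is rebuilt here rather than sliced from a trained model; `padding="same"` and the dummy batch of 8 random spectrograms are assumptions.

```python
import numpy as np
from tensorflow import keras
from tensorflow.keras import layers

# Encoder = the autoencoder's first three layers:
# Conv2D(32) -> MaxPool(2,2) -> Conv2D(64), giving 64x64 maps with 64 channels.
inp = keras.Input(shape=(128, 128, 1))
x = layers.Conv2D(32, (3, 3), activation="relu", padding="same")(inp)
x = layers.MaxPooling2D((2, 2))(x)
encoded = layers.Conv2D(64, (3, 3), activation="relu", padding="same")(x)
encoder = keras.Model(inp, encoded)

spectrograms = np.random.rand(8, 128, 128, 1).astype("float32")  # dummy batch of 8
features = encoder.predict(spectrograms)
print(features.shape)  # (8, 64, 64, 64)
```

With a trained autoencoder in hand, the equivalent model can be sliced out directly, e.g. `keras.Model(autoencoder.input, autoencoder.layers[3].output)` (in a functional model, `layers[0]` is the input layer, so `layers[1..3]` are the first three described layers).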
CONCLUSION
• Our research presents an innovative method for automating guitar audio
transcription through the application of convolutional autoencoder neural
networks.
• By leveraging the synergy between machine learning and signal processing,
we have showcased the capacity to seamlessly convert guitar audio
recordings into tablature with exceptional precision and efficiency.
• This breakthrough not only streamlines the transcription process but also broadens access to musical transcription for musicians across proficiency levels.
• As automated transcription systems progress, they stand to redefine music
education, composition, and analysis, paving the way for novel avenues of
creativity and expression in the musical domain.
FUTURE WORK
• Utilizing the annotations accompanying the dataset can enhance transcription
accuracy by leveraging additional information about the audio recordings.
• Expanding the current model's prediction capability beyond the six standard
tuning classes of the guitar could lead to more precise audio transcriptions
for individual notes.
• Converting the audio into MIDI files offers a potential solution to simplify the
transcription process for individual notes, further improving accuracy and
efficiency.
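Whether notes arrive as MIDI or as per-note predictions, producing tablature ultimately requires mapping each pitch onto a string/fret position. A small illustrative helper for that step, using the standard-tuning open-string pitches (E2, A2, D3, G3, B3, E4); the function name and 20-fret limit are assumptions for illustration:

```python
# Open-string MIDI pitches in standard tuning; string 1 is the high E string.
STANDARD_TUNING = {
    1: 64,  # E4
    2: 59,  # B3
    3: 55,  # G3
    4: 50,  # D3
    5: 45,  # A2
    6: 40,  # E2
}

def candidate_positions(midi_pitch, max_fret=20):
    """Return all (string, fret) pairs that can sound the given MIDI pitch."""
    positions = []
    for string, open_pitch in STANDARD_TUNING.items():
        fret = midi_pitch - open_pitch  # frets raise pitch by one semitone each
        if 0 <= fret <= max_fret:
            positions.append((string, fret))
    return positions

print(candidate_positions(64))  # [(1, 0), (2, 5), (3, 9), (4, 14), (5, 19)]
```

The same pitch is playable at several positions, which is exactly why learning fingerings from real performances (as the ground-truth bullet above describes) matters: a transcriber must pick one position per note, not just the pitch.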