
Work:

YAMNet neural network

Carlos Saldana

2022
CONTENTS

INTRODUCTION
PROBLEM
SOLUTION
RESULTS
REFERENCES
INTRODUCTION

YAMNet is a pretrained acoustic event detection model trained by Dan Ellis on the AudioSet dataset, which contains labelled data from more than 2 million YouTube videos. It employs the MobileNet_v1 depthwise-separable convolution architecture. The pretrained model is readily available on TensorFlow Hub, which also provides TFLite (a lite model for mobile) and TF.js (for running on the web) versions. [1]
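To see why the MobileNet_v1 architecture keeps YAMNet small, it helps to compare parameter counts for one layer. The following Python sketch (the report's own scripts are MATLAB .m files) uses an illustrative 3x3 layer with assumed channel counts, not YAMNet's actual layer dimensions:

```python
# Rough parameter-count comparison for one convolutional layer, to show why
# MobileNet_v1's depthwise-separable convolutions make YAMNet lightweight.
# Kernel size and channel counts below are illustrative assumptions.

def standard_conv_params(k, c_in, c_out):
    """Weights in a standard k x k convolution (bias ignored)."""
    return k * k * c_in * c_out

def depthwise_separable_params(k, c_in, c_out):
    """Depthwise k x k conv (one filter per input channel)
    followed by a 1x1 pointwise convolution."""
    return k * k * c_in + c_in * c_out

k, c_in, c_out = 3, 64, 128
std = standard_conv_params(k, c_in, c_out)        # 73728 weights
sep = depthwise_separable_params(k, c_in, c_out)  # 8768 weights
print(std, sep, round(std / sep, 1))              # roughly an 8x reduction
```

For this assumed layer shape the separable version needs about 8x fewer weights, which is the main reason the architecture suits mobile (TFLite) deployment.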

PROBLEM

1. Read in an audio signal

2. Call classifySound to return the detected sounds

3. Identify ONLY sounds from the folders (like '\kick', '\snare', '\hihat')

“Ideally, just add lines to my transfer learning example, and delete YAMNet's Drum category and
all of Drum's subcategories (like Bass drum).”
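The filtering in step 3 can be sketched as follows. classifySound is a MATLAB Audio Toolbox function, so this Python version only illustrates the logic; the label strings are assumed to match the three folder names:

```python
# Sketch of step 3: keep only detections whose labels match the three
# training folders. The detected_labels list stands in for the output of
# MATLAB's classifySound, assumed here to be a list of label strings.

TARGET_LABELS = {"kick", "snare", "hihat"}

def keep_target_sounds(detected_labels):
    """Filter detected sound labels down to the three drum classes."""
    return [label for label in detected_labels if label.lower() in TARGET_LABELS]

detections = ["Snare", "Speech", "hihat", "Music", "kick"]
print(keep_target_sounds(detections))  # ['Snare', 'hihat', 'kick']
```

Everything else YAMNet reports (speech, music, and so on) is simply discarded, which matches the request to ignore all categories outside the three folders.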

SOLUTION

1. The main problem I found was with the samples taken. I created a .m file to build a new sound
file with a longer duration for each one; this was done only for the 3 folders, but you could
make more folders to save more sounds. The files are called snare, kick, and hihat respectively.
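A minimal sketch of what those .m files do: extend each short drum hit to a fixed 1-second duration by appending silence. The 16 kHz sample rate (which YAMNet expects) and the plain-list audio representation are assumptions; the real scripts operate on audio files in MATLAB:

```python
# Pad (or truncate) each drum sample to exactly 1 second, mirroring the
# role of the snare/kick/hihat .m scripts described above.

FS = 16000            # assumed sample rate (YAMNet expects 16 kHz mono audio)
TARGET_SAMPLES = FS   # 1 second of audio

def pad_to_one_second(samples):
    """Zero-pad (or truncate) a mono sample list to exactly 1 second."""
    if len(samples) >= TARGET_SAMPLES:
        return samples[:TARGET_SAMPLES]
    return samples + [0.0] * (TARGET_SAMPLES - len(samples))

hit = [0.5] * 4000            # a 0.25 s drum hit at 16 kHz
padded = pad_to_one_second(hit)
print(len(padded) / FS)       # 1.0
```

Running the same padding over every file in each folder produces the uniform-length training set used later in the transfer learning example.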

2. With the new files, YAMNet works the way we intended. The labels are right: drum kit is 157, snare drum 160,
and hi-hat 167 in a spreadsheet included when you download the yamnet folder. It could be rewritten, but that is
not recommended, because it is a MIDI classification: MIDI is a type of file for writing sound, and the
classification used is a General MIDI sound set.
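Looking up those class indices programmatically can be sketched like this. The CSV fragment below is made up but follows the (index, mid, display_name) format of the class-map file that ships with the YAMNet download; the mid values here are placeholders:

```python
# Hedged sketch: find the class indices for the drum labels in a
# YAMNet-style class map. CLASS_MAP_CSV is an invented fragment in the
# format of the class-map spreadsheet mentioned above.
import csv
import io

CLASS_MAP_CSV = """index,mid,display_name
157,/m/xxx,Drum kit
160,/m/yyy,Snare drum
167,/m/zzz,Hi-hat
"""

def class_indices(csv_text, wanted_names):
    """Map display names to their class indices."""
    reader = csv.DictReader(io.StringIO(csv_text))
    wanted = {name.lower() for name in wanted_names}
    return {row["display_name"]: int(row["index"])
            for row in reader if row["display_name"].lower() in wanted}

print(class_indices(CLASS_MAP_CSV, ["Snare drum", "Hi-hat"]))
# {'Snare drum': 160, 'Hi-hat': 167}
```

Reading the indices from the file like this avoids hard-coding them, so the code survives even if the class map is rewritten.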

3. YAMNet identifies only sounds from the kick, snare, and hihat folders.

4. The results are shown in the following screen captures.

RESULTS

1. Read in an audio signal (up to 1 second in duration)


I had to erase some samples (hihat 7, 49, 54, 55, 57, 67, 68, 81, 84 and 88) because they need to be recorded again
with a little silent time after the sound; they can be replaced in the oldsample folder.
If you decide to do it, you have to follow these steps:
1. Record these sounds again, with at least 0.5 seconds of silence
2. Replace the files in the hithat folder of the oldsample folder
3. Copy the hithat folder into drumtrain
4. Run the hihat.m file
5. Replace the files in the folders inside newsample_1second
6. And begin the training again.

2. Identify ONLY sounds from the folders ('\kick', '\snare', '\hihat'), train and validation

3. Contents of the .rar

The exampletransferlearningYAMNEt_cas file is the modified one, and hithat.m, kick.m, snare.m are
the files used to create the new sample folders (up to 1 second in duration).
I give you:
The .m files to create the new sample sounds
The network, Net = DrumNet.mat
The new files with up to 1 second of duration
The exampletransferlearningYAMNEt_cas file. Before running this file, change the file path.
Mel spectrograms generated from audioIn are returned as a 96-by-64-by-1-by-K array, where:

• 96 –– the number of 25 ms frames in each mel spectrogram
• 64 –– the number of mel bands spanning 125 Hz to 7.5 kHz
• K –– the number of mel spectrograms; it depends on the length of audioIn, the number of
channels in audioIn, as well as OverlapPercentage
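The value of K can be sanity-checked numerically. This sketch assumes YAMNet's standard framing (25 ms windows, 10 ms hop, 96 frames per patch) and simple floor-based edge handling, which may differ in detail from the MATLAB implementation:

```python
# Estimate K, the number of 96-by-64 mel spectrograms produced from a
# mono clip, under assumed YAMNet framing constants.

FS = 16000              # assumed sample rate
WIN = int(0.025 * FS)   # 400-sample (25 ms) analysis window
HOP = int(0.010 * FS)   # 160-sample (10 ms) hop
FRAMES_PER_PATCH = 96   # frames in each mel spectrogram

def num_spectrograms(num_samples, overlap_percentage=50.0):
    """Estimate K for a mono signal of num_samples samples."""
    frames = (num_samples - WIN) // HOP + 1
    patch_hop = max(1, round(FRAMES_PER_PATCH * (1 - overlap_percentage / 100)))
    if frames < FRAMES_PER_PATCH:
        return 0
    return (frames - FRAMES_PER_PATCH) // patch_hop + 1

print(num_spectrograms(FS))      # 1: a 1-second clip yields a single patch
print(num_spectrograms(3 * FS))  # 5: a 3-second clip at 50% patch overlap
```

This also explains why the drum samples were padded to 1 second: that length yields exactly one 96-by-64 patch per sample, so each training file maps to one spectrogram.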

REFERENCES
[1] M. Rustagi, “Guide to YAMNet : Sound Event Classifier,” Analytics India Magazine, Jun. 08, 2021.
https://analyticsindiamag.com/guide-to-yamnet-sound-event-classifier/ (accessed Nov. 14, 2022).

