JSS Campus, Dr. Vishnuvardhan Road, Bangalore - 560060
TECHNICAL SEMINAR
2021-2022
Presented By
Yashu S K
1JS19EC164
• In [6], the authors use transfer learning to adapt a DNN pre-trained on a large-scale
audio dataset to the speech emotion recognition (SER) task. They fine-tune the DNN on a
smaller labeled SER dataset and use data augmentation to improve the model's robustness
to noise and variability. The proposed model is evaluated on benchmark datasets,
including the Berlin Database of Emotional Speech (Emo-DB).
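One common form of data augmentation for robustness to noise is mixing white noise into the training audio at a chosen signal-to-noise ratio (SNR). The sketch below is only an illustration of that idea and is not taken from [6]; the function name `add_noise`, the 10 dB SNR, and the synthetic tone standing in for speech are all assumptions.

```python
import math
import random


def add_noise(signal, snr_db, seed=0):
    """Return a copy of `signal` with white Gaussian noise at the given SNR (dB)."""
    rng = random.Random(seed)
    # Average power of the clean signal.
    signal_power = sum(x * x for x in signal) / len(signal)
    # SNR(dB) = 10 * log10(P_signal / P_noise), solved here for P_noise.
    noise_power = signal_power / (10 ** (snr_db / 10))
    noise_std = math.sqrt(noise_power)
    return [x + rng.gauss(0.0, noise_std) for x in signal]


# Hypothetical input: a 200 Hz tone sampled at 8 kHz, standing in for speech.
clean = [math.sin(2 * math.pi * 200 * n / 8000) for n in range(8000)]
noisy = add_noise(clean, snr_db=10)
```

Training on both `clean` and `noisy` versions of each utterance is one way to expose a model to the kind of variability it may meet at test time.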
1. Preprocessing
• Silence removal
• Background noise removal
• Windowing
• Normalization
2. Feature Extraction
• Pitch
• Loudness
• Rhythm
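The preprocessing and feature extraction steps listed above can be sketched in a few lines. This is a minimal illustration under stated assumptions, not the pipeline used in the seminar: silence removal is done with a simple energy threshold, loudness is approximated by RMS energy, and pitch by the peak of the autocorrelation. Background noise removal and rhythm features are omitted. All function names, the frame sizes, the threshold, and the synthetic input are hypothetical, and the order of the steps is one reasonable choice.

```python
import math


def frame_signal(signal, frame_len, hop):
    """Split a signal into overlapping frames."""
    return [signal[i:i + frame_len]
            for i in range(0, len(signal) - frame_len + 1, hop)]


def rms(frame):
    """Root-mean-square energy, a simple proxy for loudness."""
    return math.sqrt(sum(x * x for x in frame) / len(frame))


def normalize(signal):
    """Peak-normalize the signal to the range [-1, 1]."""
    peak = max(abs(x) for x in signal)
    return [x / peak for x in signal] if peak > 0 else list(signal)


def remove_silence(frames, threshold):
    """Keep only frames whose RMS energy exceeds the threshold."""
    return [f for f in frames if rms(f) > threshold]


def hamming(frame):
    """Apply a Hamming window to reduce spectral leakage at frame edges."""
    n = len(frame)
    return [x * (0.54 - 0.46 * math.cos(2 * math.pi * i / (n - 1)))
            for i, x in enumerate(frame)]


def pitch_autocorr(frame, sample_rate, fmin=80, fmax=400):
    """Estimate pitch (Hz) as the lag with the highest autocorrelation."""
    lo, hi = sample_rate // fmax, sample_rate // fmin
    best_lag = max(range(lo, hi + 1),
                   key=lambda lag: sum(frame[i] * frame[i + lag]
                                       for i in range(len(frame) - lag)))
    return sample_rate / best_lag


# Hypothetical input: 0.2 s of silence followed by 0.3 s of a 200 Hz tone.
sample_rate = 8000
signal = [0.0] * 1600 + [0.5 * math.sin(2 * math.pi * 200 * n / sample_rate)
                         for n in range(2400)]

frames = frame_signal(normalize(signal), frame_len=400, hop=200)
voiced = remove_silence(frames, threshold=0.1)   # drops the silent frames
loudness = rms(voiced[-1])                       # loudness of the last frame
pitch = pitch_autocorr(hamming(voiced[-1]), sample_rate)
```

In practice an audio library would usually handle these steps; the plain-Python version is kept here only so that each operation is visible.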
DATASET
Multilayer Perceptron
Multilayer Perceptron Classifier
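To make the classifier concrete, the sketch below trains a multilayer perceptron with one hidden layer (tanh hidden units, softmax output, cross-entropy loss, plain stochastic gradient descent). It is an illustration under assumptions, not the model used in the seminar: the class name `TinyMLP`, the layer sizes, the learning rate, the two emotion labels, and the synthetic two-feature data are all hypothetical. A library implementation such as scikit-learn's `MLPClassifier` would normally be used instead.

```python
import math
import random


class TinyMLP:
    """One-hidden-layer perceptron: input -> tanh hidden layer -> softmax output."""

    def __init__(self, n_in, n_hidden, n_out, seed=0):
        rng = random.Random(seed)
        self.w1 = [[rng.uniform(-0.5, 0.5) for _ in range(n_in)]
                   for _ in range(n_hidden)]
        self.b1 = [0.0] * n_hidden
        self.w2 = [[rng.uniform(-0.5, 0.5) for _ in range(n_hidden)]
                   for _ in range(n_out)]
        self.b2 = [0.0] * n_out

    def forward(self, x):
        """Return (hidden activations, class probabilities) for one input."""
        h = [math.tanh(sum(w * xi for w, xi in zip(row, x)) + b)
             for row, b in zip(self.w1, self.b1)]
        z = [sum(w * hi for w, hi in zip(row, h)) + b
             for row, b in zip(self.w2, self.b2)]
        m = max(z)  # subtract the max for a numerically stable softmax
        e = [math.exp(v - m) for v in z]
        total = sum(e)
        return h, [v / total for v in e]

    def train_step(self, x, label, lr=0.1):
        """One stochastic gradient descent update on a single example."""
        h, p = self.forward(x)
        # Gradient of cross-entropy w.r.t. the output logits: p - one_hot(label).
        dz = [pk - (1.0 if k == label else 0.0) for k, pk in enumerate(p)]
        # Backpropagate through tanh (derivative 1 - h^2) before updating w2.
        dh = [(1 - hj * hj) * sum(dz[k] * self.w2[k][j] for k in range(len(dz)))
              for j, hj in enumerate(h)]
        for k, g in enumerate(dz):
            self.w2[k] = [w - lr * g * hj for w, hj in zip(self.w2[k], h)]
            self.b2[k] -= lr * g
        for j, g in enumerate(dh):
            self.w1[j] = [w - lr * g * xi for w, xi in zip(self.w1[j], x)]
            self.b1[j] -= lr * g

    def predict(self, x):
        """Return the index of the most probable class."""
        _, p = self.forward(x)
        return p.index(max(p))


# Hypothetical, synthetic data: two standardized features per utterance
# (for example pitch and loudness) for two made-up classes.
labels = ["calm", "angry"]
rng = random.Random(1)
data = []
for _ in range(20):
    data.append(([rng.gauss(-0.5, 0.1), rng.gauss(-0.5, 0.1)], 0))  # calm
    data.append(([rng.gauss(0.5, 0.1), rng.gauss(0.5, 0.1)], 1))    # angry

model = TinyMLP(n_in=2, n_hidden=4, n_out=2)
for epoch in range(100):
    rng.shuffle(data)
    for x, y in data:
        model.train_step(x, y)

accuracy = sum(model.predict(x) == y for x, y in data) / len(data)
```

Here `accuracy` is measured on the training data only; a real evaluation would hold out a separate test set, ideally with speakers not seen during training.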
• Speaker variability
• Channel variability
• Data Availability
• Lack of Diversity in Datasets
• Processing Time
Applications
• Customer Service Chatbots
• Human-Computer Interaction
• Sentiment Analysis
• Mental Health Diagnosis
• Entertainment Industry
• Voice-Based Personal Assistants
References
• Jerry Joy, Aparna Kannan, Shreya Ram, S. Rama. Speech Emotion Recognition
using Neural Network and MLP Classifier. IJESC, April 2020.
• Navya Damodar, Vani H Y, Anusuya M A. Voice Emotion Recognition using CNN
and Decision Tree. International Journal of Innovative Technology and Exploring
Engineering (IJITEE), October 2019.