Music Application (Mufecs)
(KCA-353)
Session 2023-2024
The satisfaction that accompanies the successful completion of any task would be incomplete
without mentioning the people whose ceaseless cooperation made it possible, and whose constant
guidance and encouragement crown all efforts with success. Our project, titled "Mufecs
- Listen Your Mood", was developed in partial fulfillment of the MCA degree course as a Mini Project.
We are grateful to our project mentor, Mr. Gaurav Bhatia, for his guidance, inspiration and support.
(2201920140018)
CERTIFICATE OF ORIGINALITY
I hereby declare that my project titled "Mufecs - Listen Your Mood", submitted to Dr. APJ ABDUL
KALAM TECHNICAL UNIVERSITY, Lucknow in partial fulfillment of the degree of Master
of Computer Applications, Session 2023-2024, from GL Bajaj College of Technology and
Management, Greater Noida, has not previously formed the basis for the award of any other degree,
diploma or other title.
This is to certify that the project entitled "Mufecs - Listen Your Mood", submitted by Amit Kumar
Rai, a bonafide student of GL Bajaj College of Technology and Management, Greater Noida, in
partial fulfillment for the award of Master of Computer Applications, affiliated to Dr. APJ ABDUL
KALAM TECHNICAL UNIVERSITY, LUCKNOW, during the year 2023-24. It is certified that all
corrections and suggestions indicated in the Internal Assessment have been incorporated in the project.
To the best of our knowledge, the work embodied in this report is original and has not been submitted
for any other degree or discipline. The project report has been approved as it satisfies the academic requirements.
(MCA Department)
HOD
1. Certificate
2. Executive Summary
3. Keywords
9. List of Tables
10. ER Diagram
12. References
Executive Summary
We suggest a novel method for employing facial expressions to trigger automatic music
playing. The majority of the current methods entail manually playing music or utilizing
additional hardware such as wearable sensors. Our suggested approach tends to decrease
the total cost of the planned system as well as the computational time required to collect
the results, improving the system's overall correctness. The FER2013 dataset is used to
test the system. Facial expressions are recorded by means of an integrated camera. Using
input face photos, feature extraction is used to identify emotions including happiness,
anger, sadness, surprise, and a neutral state. Compared to the approaches in the current
literature, it produces better results in terms of cost and computational time.
Keywords: TensorFlow, Flask, Music Player, Camera, Convolutional Neural Network
Chapter 1: Introduction
The project Mufecs has been developed to address the problem of loneliness by recommending
music to users based on their moods. The application supports facial emotions such as happy,
sad, angry, neutral, and disgust. The project's recommendation feature works on the detected
emotion. The aim of the project is to increase the productivity of human beings, which helps
the growth of both the country and the individual. It also analyses the moods of users.
Mufecs includes the registration of users and the storage of their music collections; users
can also create their own music playlists, and a search facility lets them listen to music
online. A user can log in using a username and password, and a new user can create an
account.
This web-based application works smoothly without any bugs. It is developed using Flask and TensorFlow.
Recent research has shown that people respond and react to music, and that music has a
significant impact on brain activity. Researchers looking into the reasons individuals
listen to music found that music was important in connecting arousal and mood. Music
serves two primary purposes for participants: firstly, it can help them feel happy and,
secondly, it can improve their mood and increase self-awareness. It has been shown
that emotions and personality factors are strongly correlated with musical choices [1].
The brain regions that influence emotions and mood are also responsible for
controlling the meter, timbre, rhythm, and pitch of music [2]. Humans make a
wealth of data visible, including emotions, body language, voice, and facial
expressions [3]. Facial analysis is used in many applications these days, including
smart card applications, and relies on efficient feature extraction algorithms. This
system has a lot of potential uses, including a
recommender system for emotion recognition based on facial expressions that can
identify the user's moods and provide a list of suitable music [13-24]. With the suggested
system, in the event of negative emotions, a playlist including the most upbeat music genres
will be displayed, and in the event that the feeling is pleasant, a particular playlist
featuring various kinds of music will be shown.
We utilized the Kaggle Facial Expression Recognition dataset [5] to detect emotions.
Bollywood Hindi music was used to construct the music player's dataset. Convolutional
neural networks are used to implement facial emotion detection, and their accuracy is
compared with that of other methods below.
Chapter 2: Literature Review
The purpose of the review is to gain insight into the processes and identify any gaps. A
literature review, or survey, is a text of an academic work that summarizes the state of
the field's knowledge and contributions to a certain subject. Humans have innate abilities
that can contribute to systems in a variety of ways, which has drawn the attention of
students, scientists, engineers, and other professionals from around the globe.
Facial expressions convey a person's present mental state. Most interpersonal
communication takes place through body movements, facial expressions, and voice
tonality. Preema et al. [6] claimed that
making and maintaining a big playlist takes a lot of effort and time. According to the
publication, the music player chooses a tune based on the user's present mood. The
program creates playlists based on mood by scanning and categorizing audio files based
on audio attributes. The Viola-Jones method is used for face detection and facial
feature extraction. Anger, joy, surprise, sadness, and disgust are the five main universal
emotions that were considered. Another publication presented a music recommendation
system that determines the user's mood based on signals such as
photoplethysmography (PPG) and galvanic skin response (GSR) [3]. Humans are
fundamentally emotional beings, and emotions are essential to everything in life. This work
considers the problem of emotion recognition as the prediction of arousal and valence
from multi-channel physiological information. Ayush Guidel et al. claimed in [7] that
facial expressions are a simple way to read a person's mental state and present emotional
mood. Basic emotions (content, upset, furious, ecstatic, shocked, disgusted, afraid, and
indifferent) were taken into account to construct this system. In this research, face
detection was implemented by a convolutional neural network.
Emotion identification was used by Ramya Ramanathan et al. [1] to demonstrate their
intelligent music player.
Emotion is a fundamental aspect of human nature, with the most significant
impact on life. Human emotions are meant to be shared and understood by one
another. Initially, the user's local music selection is categorized according to the mood
the album expresses. This is frequently computed with the lyrics of the song in mind. In
particular, the paper highlights the specialization of existing methods for identifying
human emotions in order to produce emotion-based music players, the approach a music
player follows to identify human emotions, and how best to apply the offered
system for emotion detection. It also provides a quick overview of how our systems
operate, how playlists are made, and how emotions are categorized. Manually sorting
through a playlist and annotating music based on a user's emotional state is a labor-intensive task.
Unfortunately, the current algorithms are quite inaccurate, slow, and require more
hardware (such as sensors and EEG structures) which raises the system's total cost. In
this research, an algorithm is presented that performs the task of automatically creating
an audio playlist using the facial expressions of an individual, in order to save time and
labor, when carrying out this task by hand. The algorithm presented in the research aims
to lower the cost of the planned system as well as the total computational time. It also
seeks to increase the correctness of the system design. The facial expression recognition
module of the system is verified by comparison with a dataset that is both user-dependent
and user-independent.
Create a system that uses a webcam and machine learning algorithms to deliver a cross-
platform music player that makes recommendations for music based on the user's
current mood.
Chapter 3: Proposed System
The suggested system benefits from the interaction between the user and the music player.
The system's objective is to adequately capture the face using the camera. The
Convolutional Neural Network receives input from captured photos to forecast the
emotion. A playlist of music is then generated by utilizing the emotion contained in the
image, with a list of songs to alter the user's mood, which might range from shocked to
joyful to sad to neutral. The suggested system is able to identify emotions; if the subject matter
involves a negative emotion, a playlist featuring the best possible music genres that
would uplift the listener's spirits will be played. Four modules make up the facial
emotion-based music recommendation system:
• Real-Time Recording: In this module, the system accurately records the user's face.
• Face Recognition: Here, the user's face is taken as input and the facial region of the user image is located.
• Emotion Detection: In this stage, features of the user image are extracted in order to identify the emotion. The system then generates captions based on the user's emotions.
• Music Recommendation: Based on the user's emotion and the mood type of the music, a suitable playlist is suggested.
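The flow through the four modules can be sketched as follows; the function names here are illustrative placeholders, not the project's actual API:

```python
def run_pipeline(capture_frame, detect_face, predict_emotion, recommend_playlist):
    # Chain the four modules: record -> recognize -> detect -> recommend.
    frame = capture_frame()              # Real-Time Recording
    face = detect_face(frame)            # Face Recognition
    emotion = predict_emotion(face)      # Emotion Detection
    return recommend_playlist(emotion)   # Music Recommendation

# Stub wiring that shows the data flow; a real system would pass an
# OpenCV capture function and the trained CNN here.
playlist = run_pipeline(
    capture_frame=lambda: "frame",
    detect_face=lambda frame: "face",
    predict_emotion=lambda face: "Happy",
    recommend_playlist=lambda emotion: [emotion + "/track1.mp3"],
)
```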
3.2 Methodology
We used the Kaggle datasets to construct the Convolutional Neural Network model.
The dataset, called FER2013, is divided into training and testing sets: 24,176
images make up the training set, whereas 6,043 images make up the testing set.
The collection contains 48x48 pixel grayscale pictures of faces. Five emotions are
assigned to the images we use from FER-2013: joyful, sad, furious, surprised, and
neutral. The faces are automatically registered so that they occupy roughly the same
amount of area and are roughly centered in each shot.
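On Kaggle, FER-2013 is distributed as a CSV whose "pixels" column stores each 48x48 face as a space-separated string of grayscale values; a minimal parsing sketch (the normalization to [0, 1] is our own choice):

```python
import numpy as np

def parse_fer_pixels(pixel_string, size=48):
    # Each FER-2013 row stores size*size grayscale values as one
    # space-separated string; reshape them and scale to [0, 1].
    pixels = np.array(pixel_string.split(), dtype="float32")
    return pixels.reshape(size, size) / 255.0

# Tiny 2x2 demonstration in place of a real 48x48 row.
demo = parse_fer_pixels("0 255 128 64", size=2)
```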
The FER-2013 dataset was produced by compiling the outcomes of a Google image
search for each emotion together with its synonyms. When trained on an unbalanced
dataset, FER systems may exhibit strong performance on dominant emotions such as
happiness and sadness, but poor performance on underrepresented ones such as disgust
and terror, as well as on anger, indifference, and surprise.
This issue is typically solved using the weighted-Softmax loss strategy, which weights
the loss term for each emotion class based on how much of each class it represents in
the training set. The Softmax loss function, on the other hand, is the foundation of this
weighted-loss method and is said to readily drive features of different classes to remain
separable. Using such a weighted loss to train the neural network is one practical method
for handling the Softmax loss issue. We have applied the categorical cross-entropy loss
function and handled missing and outlier values. The chosen loss function is used at
every iteration in order to estimate the model error.
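The per-class weighting can be sketched in a few lines; this follows the common convention of weighting each class inversely to its frequency, as a sketch rather than the project's exact code:

```python
from collections import Counter

def class_weights(labels):
    # Weight each emotion class inversely to its frequency so that the
    # loss does not ignore rare classes such as disgust.
    counts = Counter(labels)
    n_classes = len(counts)
    return {cls: len(labels) / (n_classes * n) for cls, n in counts.items()}

# Imbalanced toy labels: "happy" dominates, "disgust" is rare.
weights = class_weights(["happy"] * 6 + ["sad"] * 3 + ["disgust"] * 1)
```

A dictionary like this can be passed to Keras via `model.fit(..., class_weight=weights)` when training with categorical cross-entropy.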
A face detection application falls within the category of computer vision technology. To
detect faces in photos, algorithms must first be created and trained. Real-time detection
from an image or video frame can be used for this. Classifiers are used in face
identification; they are algorithms that decide whether an image contains a face (1) or
not (0). To increase accuracy, classifiers are trained to recognize faces in a large number
of photos.
Two types of classifiers are used by OpenCV: LBP (Local Binary Pattern) and the Haar
cascade. The classifier is trained using pre-defined, variable face data, allowing it to
recognize various faces with accuracy. Face detection's primary goal is to identify the
face inside the frame by minimizing background noise and other distractions. Using a
collection of input files, the cascade function is trained in this machine learning-based
method. The Haar wavelet approach supports function-based pixel investigation by
dividing the image into squares [9]. This makes use of "training data" and machine
learning techniques to achieve a high degree of accuracy.
We use the convolutional network as a feature extractor while carrying out feature
extraction: the input image is propagated until it reaches the pre-designated layer, at
which point we take that layer's outputs and use them as our features. Only a few filters
are used in the initial layers of a convolutional network, which extract low-level
characteristics from the captured image. We raise the number of filters to twice or three
times the dimension of the previous layer's filters as we create deeper layers. Although
they require more computation, filters in the deeper layers obtain additional, more
abstract features.
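The filter-growth pattern described above can be sketched with Keras; the layer counts and filter sizes here are illustrative, not the project's exact 28-layer network:

```python
from tensorflow.keras import layers, models

# Few filters early, doubled in each deeper block, ending in a
# 7-way softmax over the emotion classes.
model = models.Sequential([
    layers.Input(shape=(48, 48, 1)),                 # 48x48 grayscale face
    layers.Conv2D(32, (3, 3), activation="relu"),
    layers.MaxPooling2D((2, 2)),
    layers.Conv2D(64, (3, 3), activation="relu"),    # 2x previous filters
    layers.MaxPooling2D((2, 2)),
    layers.Conv2D(128, (3, 3), activation="relu"),   # 2x again
    layers.MaxPooling2D((2, 2)),
    layers.Flatten(),
    layers.Dense(7, activation="softmax"),
])
```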
Feature maps allow us to see what the convolutional neural network has learnt [10]. The
model provides feature maps as its outputs, which are a kind of intermediate
representation for every layer after the first. To see which features were significant
enough to classify the image, load the input image that you wish to examine and inspect
its feature maps. Applying filters or feature detectors to the input image, or to the
feature-map output of the previous layers, yields feature maps. Visualization of feature
maps sheds light on the internal representations learned for the input image.
Using the ReLU activation function, the convolutional neural network applies filters or
feature detectors to the input picture to obtain feature maps or activation maps [11].
Bends, vertical and horizontal lines, edges, and other characteristics present in the
image can all be identified with the use of feature detectors or filters. Subsequently,
the feature maps undergo pooling to ensure translation invariance. Pooling is predicated
on the idea that the combined outputs remain constant when we make slight changes to
the input. Min, average, or max pooling can be used; however, max-pooling generally
outperforms min and average pooling. All of the inputs should be flattened before being
fed into a deep neural network, which will produce the outputs. The image's class will
either be binary or multi-class, for tasks such as digit recognition.
The internal computations cannot easily be interpreted, making neural networks akin to
a black box: the CNN model receives an input image and outputs the results [10]. The
model trained with CNN weights is loaded in order to detect emotions when a user
interacts with the system.
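Feature maps of a chosen layer can be read out by building a sub-model that ends at that layer; a small sketch, where the one-layer model stands in for the trained emotion CNN:

```python
import numpy as np
from tensorflow.keras import layers, models

# Stand-in for the trained CNN; only one conv layer for brevity.
cnn = models.Sequential([
    layers.Input(shape=(48, 48, 1)),
    layers.Conv2D(8, (3, 3), activation="relu", name="conv1"),
])

# Sub-model whose output is conv1's activation maps.
feature_model = models.Model(inputs=cnn.inputs,
                             outputs=cnn.get_layer("conv1").output)

# One dummy grayscale frame -> 8 feature maps of size 46x46.
maps = feature_model.predict(np.zeros((1, 48, 48, 1), dtype="float32"), verbose=0)
```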
We assembled a collection of Bollywood Hindi songs, with between 100 and 150 tracks
for each emotion. As everyone knows, music definitely contributes to elevating our
mood: if a user is depressed, for example, the system will automatically suggest a
playlist of music that will cheer them up and make them feel better.
Real-time user emotion is recognized by the emotion module, resulting in labels such as
Happy, Sad, Angry, Surprise, and Neutral. We linked these labels to the folders in our
music database using Python's os.listdir() function, which retrieves a list of every file
in a specified directory. Table 1 displays the song list.
Emotion   Song
Happy     Track 2: "Ilahi"; Track 3: "Neeche Pholo ki dukaan"
Neutral   Track 2: "Matargashti"; Track 3: "Dildara"
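The label-to-folder mapping with os.listdir() described above can be sketched as follows; the folder layout is an assumption (one directory per emotion under a music root):

```python
import os

EMOTIONS = ("Happy", "Sad", "Angry", "Surprise", "Neutral")

def build_song_map(music_root):
    # For each emotion that has a folder of the same name under
    # music_root, list the tracks stored in that folder.
    song_map = {}
    for emotion in EMOTIONS:
        folder = os.path.join(music_root, emotion)
        if os.path.isdir(folder):
            song_map[emotion] = sorted(os.listdir(folder))
    return song_map
```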
As a result, the GUI of the music player will suggest a playlist for the user, displaying
captions based on the emotions that have been identified. To play the audio, we utilized
the Pygame library, which can play a variety of multimedia formats, including audio and
video. The play, pause, resume, and stop functions of this library are used to interact
with the music player. The names of all the songs, the status of the song that is presently
playing, and the main GUI window are stored in variables called playlist, song status,
and root, respectively. HTML and CSS were utilized in the GUI development process.
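The play/pause/resume/stop handling and its song-status variable can be sketched as a small controller; in the project, `mixer` would be `pygame.mixer.music`, but any object with the same methods works (the class name here is our own):

```python
class MusicController:
    # Wraps a mixer (e.g. pygame.mixer.music) and tracks playback
    # state in song_status, mirroring the variables described above.
    def __init__(self, mixer, playlist):
        self.mixer = mixer
        self.playlist = playlist
        self.song_status = "stopped"

    def play(self, index=0):
        self.mixer.load(self.playlist[index])
        self.mixer.play()
        self.song_status = "playing"

    def pause(self):
        self.mixer.pause()
        self.song_status = "paused"

    def resume(self):
        self.mixer.unpause()
        self.song_status = "playing"

    def stop(self):
        self.mixer.stop()
        self.song_status = "stopped"
```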
We assessed several research works that make use of convolutional neural networks,
extreme learning machines (ELM), and support vector machines (SVM) [12]. A
comparison of the algorithms and their accuracy values is provided; convolutional
neural networks are used in our system. Table 2 shows the accuracy of the three
methods' validation and testing using the FER2013 dataset.

Algorithm   SVM   ELM   CNN
The trained CNN network's hyperparameters are displayed in Table 3. The learning rate
controls the weight update at the conclusion of each batch. During training, the network
iterates over the training dataset for many epochs. The number of patterns shown to the
network before the weights are updated is known as the batch size. Activation functions
enable the model to learn nonlinear prediction boundaries. When training deep learning
models, Adam may be used in place of stochastic gradient descent as the optimization
algorithm. Deep learning model error is usually quantified using the categorical
cross-entropy loss function in single-label, multi-class classification.
Hyperparameter    Value
Batch size        64
No. of classes    7
Optimizer         Adam
Epochs            21
No. of layers     28
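The Table 3 settings translate into a Keras training call roughly as follows; the tiny model body is a placeholder for the full 28-layer network, and `x_train`/`y_train` are assumed arrays:

```python
from tensorflow.keras import layers, models

# Placeholder model; the real network has 28 layers (Table 3).
model = models.Sequential([
    layers.Input(shape=(48, 48, 1)),
    layers.Flatten(),
    layers.Dense(7, activation="softmax"),   # 7 classes
])

# Adam optimizer and categorical cross-entropy loss, as in Table 3.
model.compile(optimizer="adam",
              loss="categorical_crossentropy",
              metrics=["accuracy"])

# Training with batch size 64 for 21 epochs (uncomment with real data):
# model.fit(x_train, y_train, batch_size=64, epochs=21,
#           validation_data=(x_val, y_val))
```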
Chapter 4: Final Thoughts
A comprehensive analysis of the literature reveals that there are numerous ways to put
the music recommender system into practice. A review of approaches put forth by
earlier researchers and developers was conducted. The results led to the finalization of our
system's aims. Since AI-powered applications are becoming more and more popular,
our project will make use of cutting-edge, in-demand technology. We give a summary
of the system's features, including how music can lift users' spirits and how to select the
appropriate songs. Emotions of the user can be detected by the developed system. The
emotions that the machine was able to identify were neutral, shocked, joyful, sad, and
furious. The suggested system identified the user's emotion and then showed them a
playlist with music selections that matched their mood. Processing a large dataset
requires a lot of memory and processing power, which makes development more
appealing and challenging. The goal is to develop this application using the least
possible resources for the recommendation system.
Chapter 5: Future Scope
Even though this system is fully operational, there is always room for improvement. The
program can be adjusted in a number of ways to improve the overall user experience and
yield superior results. One option is an alternate approach based on other feelings, such
as fear and disgust, that are not included in our system, including endorsing the
automated playing of music for them. Potential future applications of the system include
a mechanism that could aid music therapists in the treatment of patients with mental
stress, anxiety, acute depression, and trauma. There is also a chance to add functionality
in the future, because the existing system performs poorly in extremely low-light
conditions and with low camera resolution.
Building a mini-project requires the assistance and direction of multiple individuals. It
is, therefore, our first priority to express our gratitude to everyone who supported us.
We express our gratitude to Dr. Madhu Gaur, our Head of Department, for his
inspiration and provision of the necessary materials for us to work on this project.
Additionally, we are grateful for the gracious cooperation of Mrs. Manju Verma, Head
of the Mini-Project.
We take great pleasure in expressing our gratitude to Mr. Gaurav Bhatia, our project
guide, for his ongoing guidance and support during the project planning process.
Finally, but just as importantly, we would like to express our gratitude to our friends and
the teaching and non-teaching staff members whose support and recommendations
enabled us to improve our mini project. We are also appreciative of our parents' support.
List of Tables
2. Table 2: Accuracy of the three methods' validation and testing using the FER2013 dataset.
ER Diagram
Level 0
Level 1
Level 2
References
[2] MySQL Documentation: official documentation for MySQL database setup, data modeling, and querying (https://mysql.com).
[5] Ramya Ramanathan, Radha Kumaran, Ram Rohan R, Rajat Gupta, and Vishalakshi Prabhu, "An intelligent music player based on emotion identification."
[6] Preema J. S., Husain Zafar, Shlok, et al., "Smart music player integrating…," Department of Computer Engineering, Pune Institute of Computer Technology, Pune, India, 2017.
[7] Deger Ayata, Yusuf Yaslan, and Mustafa E. Kamasak, "Wearable physiological…," IEEE Transactions on Consumer Electronics, May 2018, doi: 10.1109/TCE.2018.2844736.
[8] Alaa Alsaedi, Kholood Albalawi, Ahlam Alrihail, and Liyakathunisa Syed, "Users' music…," College of Computer Science, Taibah University (DeSE), 2019, doi: 10.1109/DeSE.2019.00188.
[10] Sahana M, Savitri H, Rajashree, and Preema J. S., "Review of a music player that plays…"
[11] Ayush Guidel, Krishna Sapkota, and Birat Sapkota, "Facial analysis-based music…," Journal of Emerging Technologies and Innovative Research (JETIR), Volume 7, Issue 4, April 2020; CH. Sadhvika, …
[13] Vincent Tabora, "Face identification with Haar Cascade Classifiers in OpenCV," Becominghuman.ai, 2019.
[14] Xiang Chen, Chenchen Liu, Fuxun Yu, and Zhuwei Qin, "An overview of…"
[15] Ahmed Hamdy AlDeeb, "Emotion-Based Music Player," ResearchGate, June 2019.
[17] Singhal, Singh, and Vidyarthi (2020), "Thorax illness interpretation and localization…"
[19] A. Singh and P. Singh (2021), "Object Detection," Journal of …, vol. 3, pp. 1–20.
[21] A. Singh and P. Singh (2021), "Identification of License Plates," …, vol. 1, pp. 1–14.