Music Recommender 212 - 215 - 237 (1) - 2
B.E. PROJECT ON
SUBMITTED BY:
Akash Kumar 212/CO/10
Akshat Gupta 215/CO/10
Devender Issar 237/CO/10
GUIDED BY:
Dr. Satbir Jain
CERTIFICATE
CANDIDATES' DECLARATION
This is to certify that the work presented by us in this project, titled
Hybrid Music Recommender System, in partial fulfilment of the requirements
for the award of the degree of Bachelor of Engineering, submitted at the
Department of Computer Engineering, Netaji Subhas Institute of Technology,
Delhi, is a genuine account of our work carried out during the period from
January 2014 to May 2014 under the guidance of Dr. Satbir Jain, Department of
Computer Engineering, Netaji Subhas Institute of Technology, Delhi. To the
best of our knowledge, the matter embodied in the project report has not been
submitted for the award of any other degree elsewhere.
Dated:
Akash Kumar
Akshat Gupta
Devender Issar
This is to certify that the above declaration by the students is true to the best
of my knowledge.
ACKNOWLEDGEMENT
Dr. Satbir Jain has once again been very supportive of and involved in a
student project. It was his support that helped the project get started in its
earliest and most vulnerable stages. His name opened many doors for us and
persuaded many people. He always had the energy and enthusiasm to make sure
that we were provided with everything we needed. No amount of words can
express our thanks to him. He backed us with whatever assistance we needed
during the project work.
We are also thankful to our friends, who motivated us at each and every
step of this project. Without their interest in our project we could not have
come so far.
Most of all, we would like to thank our wonderful parents, who motivated us
from day one of the project. You were the lights that led us.
It was a great pleasure and honour to spend our time with all of them, and
there could be no better reward for the effort put into completing this B.E.
Project than their valuable presence. They are all very special to us.
ABSTRACT
Table of Contents
2. Literature Survey
2.1. Recommender Systems
2.1.1. Content Based
2.1.2. Collaborative Based
2.2. User Based Collaborative Filtering
2.3. Item Based Collaborative Filtering
2.4. Tag Similarity
2.5. Track Similarity
2.6. Pearson Correlation Coefficient
2.7. Euclidean Metric
6. Future Scope
7. References
List of Figures
1.1. Introduction
1.2. Motivation
1.3. Problem Statement
1.1. INTRODUCTION
In the past twelve years, there has been interest in the development of
techniques that provide personalised content to users. The types of
application have included filtering news and messages, presenting lists of
stories or artwork that a user may be interested in, and so on.
Most of these applications have applied a technique known as Collaborative
Filtering.
This involves collecting other users' opinions of how good or useful an item
is, and then ranking items based on this information for presentation to the
user.
1.2. MOTIVATION
While it may be argued that there has been some success with this technique,
there is much room for improvement. Content-based filtering has developed in
parallel with collaborative filtering. This approach tries to extract, from
the items in the collection, information that is a good indicator of their
usefulness to the user. It aims to develop better techniques for locating
documents that satisfy a user's information need.
Understanding:
The project aims to build a system that recommends music using a database of
the songs heard by users and the feedback or ratings those users provide. The
system shall make useful, personalised recommendations based upon predictive
models, and it should produce them quickly, with low time complexity.
This is an effort to implement mathematical predictive models that improve the
user experience through personalised, relevant recommendations. We believe
that the user experience is enhanced by taking feedback and ratings from the
user.
A set or list of songs that would interest the user, based upon predictive
models.
Recommender systems analyse a user's profile and the relationship between the
user and a target item to help the user purchase or rent items that match the
user's interests[2]. With the help of computers, recommender systems can
analyse huge collections of data on users' preferences to produce good
recommendations. Some online companies, such as Netflix and Amazon, use
recommender systems to help users easily find the items they want on their
websites[9]. Every time a user logs in to the website, a new list of
recommended items is shown, based on past user reviews or purchases[3].
Instead of spending time navigating the website and searching for items, the
user saves time because the recommender system displays a list of items the
user is likely to enjoy, based on the user's profile.
Recommender systems can also help online companies sell their products better.
A recommender system can give the user a personalised experience because it is
based on real input from the user and is always up to date. Whenever the user
buys or reviews a new item, a new recommendation list is created for that
particular user.
Collaborative Filtering
Collaborative filtering (CF) systems build a database of user opinions of
available items. They use the database to find users whose opinions are
similar (i.e., highly correlated) and predict a user's opinion of an item by
combining the opinions of other like-minded individuals.
They do not need explicit profiles of each user or item[1]. For example, for a
user X who has given a rating of five to each of five movies, a CF system will
analyse the data, find all users who gave those same five movies a rating of
five, and then recommend to user X the movies that those users are interested
in.
1. Look for users who share the same rating patterns as the active user (the
user for whom the prediction is made).
2. Use the ratings from the like-minded users found in step 1 to calculate a
prediction for the active user.
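The two steps above can be sketched in a few lines. This is a minimal illustrative sketch in Python, not the project's MATLAB code: the function name `predict_rating`, the dictionary layout of `ratings`, and the choice of a similarity-weighted average as the way to combine opinions are all assumptions made for the example.

```python
def predict_rating(active_user, song, ratings, similarity):
    """Predict the active user's rating of `song` from like-minded users.

    ratings:    dict mapping user -> {song: rating}   (assumed layout)
    similarity: function (user_a, user_b) -> float, higher means more alike
    """
    weighted_sum = 0.0
    weight_total = 0.0
    for other, other_ratings in ratings.items():
        if other == active_user or song not in other_ratings:
            continue
        sim = similarity(active_user, other)
        if sim <= 0:
            continue  # step 1: keep only like-minded users
        # step 2: combine their ratings, weighted by similarity
        weighted_sum += sim * other_ratings[song]
        weight_total += sim
    if weight_total == 0:
        return None  # no like-minded user has rated this song
    return weighted_sum / weight_total
```

Returning `None` when no neighbour has rated the song leaves the fallback policy (for example, an average rating) to the caller.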
For user-based algorithms, the Pearson correlation computes the similarity
between two users only over the items that both of them have rated[3]. For
example, let S be the set of items that both user x and user y have rated.
Then the Pearson correlation computes the similarity between user x and user y
as:
Here R̄i is the average rating for item i, and Ru,i is the rating that user u
gives to item i.
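As an illustration, the computation over the co-rated set S can be sketched as follows. This is a hedged Python sketch, not the report's MATLAB code. It assumes the standard Pearson formulation, in which each user's ratings are centred on that user's own mean over S; the centring term used in the report may differ. The function name and the dictionary inputs are hypothetical.

```python
from math import sqrt

def pearson_similarity(ratings_x, ratings_y):
    """Pearson correlation of two users over their co-rated items.

    ratings_x, ratings_y: dicts mapping item -> rating (assumed layout).
    Returns None when the users share no rated item.
    """
    common = set(ratings_x) & set(ratings_y)  # the set S
    if not common:
        return None
    mean_x = sum(ratings_x[i] for i in common) / len(common)
    mean_y = sum(ratings_y[i] for i in common) / len(common)
    numerator = sum((ratings_x[i] - mean_x) * (ratings_y[i] - mean_y)
                    for i in common)
    spread_x = sqrt(sum((ratings_x[i] - mean_x) ** 2 for i in common))
    spread_y = sqrt(sum((ratings_y[i] - mean_y) ** 2 for i in common))
    if spread_x == 0 or spread_y == 0:
        return 0.0  # assumption: constant ratings carry no correlation signal
    return numerator / (spread_x * spread_y)
```

The result lies between -1 and 1: values near 1 mean the two users rate the shared items in the same pattern, and values near -1 mean they rate them in opposite patterns.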
Here d(p,q) is the Euclidean distance between two users p and q, and p1 and
q1 are the ratings of the songs provided by p and q respectively.
To normalise the distance into a similarity score, use
similarity = 1 / (1 + d(p,q)).
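The distance and its normalisation can be sketched as below. This is an illustrative Python sketch under stated assumptions: the function name is hypothetical, ratings are held in dictionaries, the distance is taken only over songs that both users have rated, and users with no song in common are given a score of 0.

```python
from math import sqrt

def euclidean_similarity(ratings_p, ratings_q):
    """Similarity 1 / (1 + d(p, q)) over the songs rated by both users."""
    common = set(ratings_p) & set(ratings_q)
    if not common:
        return 0.0  # assumption: no shared songs means no evidence of similarity
    distance = sqrt(sum((ratings_p[s] - ratings_q[s]) ** 2 for s in common))
    return 1.0 / (1.0 + distance)
```

Identical ratings give a distance of 0 and therefore a score of 1, and the score falls towards 0 as the ratings diverge.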
3.1. Introduction
3.1.1. Purpose
3.1.2. Scope
3.1.3. Definitions and Acronyms
3.1.4. Technologies Used
3.1.5. References
3.1.6. Overview
3.1. INTRODUCTION
3.1.1. Purpose
3.1.2. Scope
Initial functional requirements will be:
This list is by no means a final one. The final list will be dictated by
implementation constraints, market forces and, most importantly, by end user
Programming language:
3.1.6. Overview
The user should know how to give feedback to the system/Rate Songs.
The user need not have any knowledge about the working of the system.
The system runs on its own and learns on its own.
Initially the system starts with the song and user datasets; the user-song
rating dataset is built up over time.
Many parameters, such as personality traits, social networking groups and
mood, have not been taken into consideration.
Initially the system starts with a smaller dataset, so it may not give good
results at first but will learn over time.
3.2.5. Assumptions
The system is based on a learning algorithm. The system will start with a
comparatively small dataset. With time, the system will learn and give more
accurate and appropriate predictions.
1) Development Tools
The system shall be built using MATLAB.
2) Offline Product
No internet connection is needed.
3) Database
A database is required to store the dataset that is built and used while the
system is running.
2) Security
Data security will be maintained by the database management system used.
Network security is not required, as no information is transferred from one
place to another. User feedback will be stored and will not be displayed once
it has been stored.
3) Usability
The system shall be easy to use and understand. The user will only need to
give feedback. All other functions will be automated.
4) Maintainability
A commercial database is used for maintaining the data, and the application
server takes care of the site. In case of a failure, the program is
re-initialised. The software is also being designed with modularity in mind so
that maintenance can be carried out efficiently.
5) Portability
The application is MATLAB-based and should be compatible with all other
systems. The end-user part is fully portable, and any system should be able to
use the features of the application, including any hardware platform that is
available now or will be available in the future.
1) Dataset
The dataset for this project is quite simple. Below is a summary of the
information needed, with a short description of each part.
i. User Details: contains fields such as user name, user id, age and
nationality.
ii. Song Details: contains fields such as song name, song id and genre.
iii. User Song Ratings: consists of the user id, the song id and the rating
specified by the user.
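The three parts of the dataset can be pictured as simple records. The following Python sketch is only an illustration of the fields listed above: the class names, field types and the helper `build_rating_table` are assumptions and are not taken from the project's MATLAB code or its database schema.

```python
from dataclasses import dataclass

@dataclass
class User:
    user_id: int
    name: str
    age: int
    nationality: str

@dataclass
class Song:
    song_id: int
    name: str
    genre: str

@dataclass
class Rating:
    user_id: int
    song_id: int
    rating: int

def build_rating_table(ratings):
    """Group rating records into {user_id: {song_id: rating}} (hypothetical helper)."""
    table = {}
    for record in ratings:
        table.setdefault(record.user_id, {})[record.song_id] = record.rating
    return table
```

Grouping the rating records by user gives direct access to each user's rated songs, which is the lookup that the similarity calculations need.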
The SRS has been written so that the design team can introduce any desired
changes in the near future, as required.
For this program, ratings from 1000 selected users on 1000 selected songs were
extracted. Each user rated only a fraction of the songs, which makes the
rating information sparse. Users 1 to 200 are used for testing, and the rest
of the users are used as training data. The goal is to predict the song
ratings of the first 200 users, make recommendations to these users, and
compare the predicted ratings with the actual ratings that the users gave to
the songs.
This section explains the whole implementation step by step. Below is the
workflow diagram for predicting song ratings and recommending songs to the
user.
This block uses the functions below to extract and process data, each with
its specified inputs and outputs.
Here we read in the training and test data sets, updating various data
elements as we read. We create a matrix for the training data set. For the
test data set, we also set the rating of the song for that user to zero.
1. loadtrainingset: This reads the rating data set of users and songs,
updates the ratings of songs for users with user_id between 201 and 1000, and
saves them to the training data set.
2. loadtestset: This reads the rating data set of users and songs, updates
the ratings of songs for users with user_id between 1 and 200, and saves them
to the test data set.
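The split performed by these two functions can be sketched as follows. This is a minimal Python illustration, not the MATLAB implementation: the single function `split_ratings`, its `(user_id, song_id, rating)` input tuples and its dictionary outputs are assumptions, and only the user_id boundary of 200 comes from the text.

```python
def split_ratings(records, test_user_max=200):
    """Split (user_id, song_id, rating) records into training and test sets.

    Users with user_id <= test_user_max go to the test set,
    all other users go to the training set.
    """
    training, test = {}, {}
    for user_id, song_id, rating in records:
        target = test if user_id <= test_user_max else training
        target.setdefault(user_id, {})[song_id] = rating
    return training, test
```

Keeping the held-out ratings in a separate structure makes it straightforward to compare them with the predictions later.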
In this block, we update the average rating data member for all users and
songs. For any user or song, we compute the sum of all its ratings by
iterating through the values of the user_song_rating matrix and then divide by
the number of ratings counted (the songs rated by a user, or the users who
rated a song).
Below is the function utilised by this block.
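The averaging step can be illustrated with the sketch below. It is a Python approximation of the idea rather than the project's MATLAB function: the name `average_ratings` is hypothetical, the matrix is a list of rows (users) by columns (songs), and a value of 0 is assumed to mean "not rated".

```python
def average_ratings(user_song_rating):
    """Return (per-user averages, per-song averages), ignoring 0 entries."""
    def mean_of_rated(values):
        rated = [v for v in values if v > 0]  # 0 is treated as "not rated"
        return sum(rated) / len(rated) if rated else 0.0

    user_average = [mean_of_rated(row) for row in user_song_rating]
    # zip(*matrix) walks the matrix column by column, i.e. song by song
    song_average = [mean_of_rated(column) for column in zip(*user_song_rating)]
    return user_average, song_average
```

Unrated entries are excluded from both the sum and the count, so sparse rows and columns do not drag the averages towards zero.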
Here an O(n^2) brute-force algorithm is run, and similarity is calculated for
every pair of users and for every pair of songs.
setuserSimilarity: Next, an O(n^2) brute-force algorithm calculates Pearson's
coefficient for every pair of users in the populated list. This similarity is
stored in the corresponding entry of user_similarity_matrix.
Similarity is calculated if and only if the two users have rated at least one
song in common. user_similarity_matrix(i,j) contains the degree of similarity
between users i and j.
Input:- The input to the function is the user_song_rating matrix.
Output:- After every pair of users has been processed, the corresponding
user_similarity_matrix element is updated.
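The pairwise loop can be sketched as below. This Python sketch is illustrative only and is not the MATLAB setuserSimilarity function: the function names, the convention that 0 means "not rated", and the choice to leave the diagonal and pairs without common songs at 0 are assumptions.

```python
from math import sqrt

def pearson(row_a, row_b):
    """Pearson coefficient over the songs that both users have rated."""
    pairs = [(a, b) for a, b in zip(row_a, row_b) if a > 0 and b > 0]
    if not pairs:
        return 0.0  # no common song: similarity is left at 0
    n = len(pairs)
    mean_a = sum(a for a, _ in pairs) / n
    mean_b = sum(b for _, b in pairs) / n
    numerator = sum((a - mean_a) * (b - mean_b) for a, b in pairs)
    denominator = (sqrt(sum((a - mean_a) ** 2 for a, _ in pairs))
                   * sqrt(sum((b - mean_b) ** 2 for _, b in pairs)))
    return numerator / denominator if denominator else 0.0

def set_user_similarity(user_song_rating):
    """Brute-force O(n^2) similarity over every pair of users."""
    n = len(user_song_rating)
    similarity = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(i):  # compute the lower triangle once, then mirror it
            score = pearson(user_song_rating[i], user_song_rating[j])
            similarity[i][j] = similarity[j][i] = score
    return similarity
```

Because the similarity of users i and j equals that of j and i, only half of the pairs need to be evaluated.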
predictscore:
The prediction score p(u; tv) of a track tv from the set of tracks Tnr(u),
which user u has not yet rated, is computed as a linear combination of the
similarity scores of the tag and track recommenders.
Input:- The inputs to the function are track_similarity_matrix,
tag_similarity_matrix, the user_song_rating matrix, the average_song_rating
array and the user_id for which predictions are to be made.
Output:- An array predicted_rating2 is created, which contains the predicted
ratings of songs for the user given in the input.
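One possible shape of such a linear combination is sketched below. This is a hypothetical Python illustration, not the project's predictscore function. The weight `alpha`, the similarity-weighted average used for each recommender, the fallback to average_song_rating, the zero-based user index and the convention that 0 means "not rated" are all assumptions made for the example.

```python
def similarity_weighted_score(track, user_ratings, similarity_matrix):
    """Average of the user's ratings, weighted by similarity to `track`."""
    numerator = denominator = 0.0
    for other, rating in enumerate(user_ratings):
        if rating > 0 and other != track:
            weight = similarity_matrix[track][other]
            numerator += weight * rating
            denominator += abs(weight)
    return numerator / denominator if denominator else None

def predict_score(track_similarity_matrix, tag_similarity_matrix,
                  user_song_rating, average_song_rating, user_id, alpha=0.5):
    """Blend track-based and tag-based scores for every unrated track."""
    user_ratings = user_song_rating[user_id]
    predicted = []
    for track, rating in enumerate(user_ratings):
        if rating > 0:
            predicted.append(rating)  # already rated: keep the known value
            continue
        track_score = similarity_weighted_score(
            track, user_ratings, track_similarity_matrix)
        tag_score = similarity_weighted_score(
            track, user_ratings, tag_similarity_matrix)
        # assumed fallback when no rated track is similar: the song's average
        if track_score is None:
            track_score = average_song_rating[track]
        if tag_score is None:
            tag_score = average_song_rating[track]
        predicted.append(alpha * track_score + (1 - alpha) * tag_score)
    return predicted
```

With `alpha = 0.5` both recommenders contribute equally; moving it towards 1 favours the track (collaborative) score and towards 0 the tag (content) score.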
After the ratings have been predicted on the basis of the users, tracks and
tags of songs, we filter out the top five matches from each prediction. Then,
for every unique user in the test set, the best predicted songs are
recommended.
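Selecting the top five matches can be sketched as below. This is a small Python illustration with a hypothetical function name. It assumes predictions are held in a list indexed by song, that songs the user has already rated are excluded, and that ties are broken by the lower song index.

```python
def top_recommendations(predicted, already_rated, n=5):
    """Return the indices of the n unrated songs with the highest predictions."""
    candidates = [(score, song) for song, score in enumerate(predicted)
                  if song not in already_rated]
    # highest score first; equal scores fall back to the lower song index
    candidates.sort(key=lambda pair: (-pair[0], pair[1]))
    return [song for _, song in candidates[:n]]
```

For example, `top_recommendations([3.0, 4.5, 1.0, 4.5, 2.0, 5.0, 0.5], {5})` returns `[1, 3, 0, 4, 2]`, because song 5 is already rated and is skipped.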
This section explains the accuracy of our predictions. Below is the workflow
diagram of accuracy testing.
1. Degree of closeness.
2. Root mean square.
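The two measures can be illustrated as follows. This Python sketch is an assumption-laden illustration rather than the project's test code: it takes "degree of closeness" to be the absolute difference between actual and predicted ratings (the quantity plotted in the error graph), takes "root mean square" to be the root mean square error over the same pairs, and uses hypothetical function names.

```python
from math import sqrt

def absolute_errors(actual, predicted):
    """Per-rating closeness: abs(actual rating - predicted rating)."""
    return [abs(a - p) for a, p in zip(actual, predicted)]

def root_mean_square_error(actual, predicted):
    """Square root of the mean squared difference between the two lists."""
    if not actual or len(actual) != len(predicted):
        raise ValueError("need two non-empty lists of equal length")
    squared = [(a - p) ** 2 for a, p in zip(actual, predicted)]
    return sqrt(sum(squared) / len(squared))
```

The absolute errors show how far off each individual prediction is, while the root mean square error summarises them in one number that penalises large misses more heavily.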
5.1. Results
5.2. Conclusion
5.1. RESULTS
The prediction model was tested using MATLAB on a Core i3 laptop with 4 GB of
RAM running Windows 7 Ultimate.
Step 1. Following are some snapshots of the tables imported into MySQL from
an Excel file.
2.1 Track similarity was calculated. Following is a screenshot of the
similarity of 8 songs. Similarity varies between -1 and 1. A similarity
greater than 0 means that the songs have some positive ratings by users in
common. Since the similarity of song i and song j is the same as the
similarity of song j and song i, only the lower part of the matrix is
evaluated. The upper part is symmetric to the lower part.
2.2 Tag similarity was calculated. Following is a screenshot of the
similarity of 8 songs.
Similarity varies between 0 and 1. A similarity of 0 means that the songs have
no matching content.
ERROR GRAPH
Fig 11: Error Graph i.e. abs(Actual Rating-Predicted Rating) v/s Users
5.2. CONCLUSION
The results were much to our satisfaction, but we still feel that much can be
added to them. We also hope that fellow researchers will take an active
interest in this new dimension. The whole process was enriching for us.
6. Future Scope
With the advent of technology, people have become more and more selective.
They seek personalised suggestions and recommendations for everything.
Therefore the future scope of recommender systems is very high.
REFERENCES
[3] D. Shen, Z. Lu, "Computation of Correlation Coefficient and Its
Confidence Interval in SAS", SUGI 31 Proceedings, 2006.