Collaborative Filtering


Anisha Koul

22020845005

Recommendation system for Spotify.

As we move ahead into the 2020s, an ever-increasing share of music consumption and discovery is
going to be mediated by AI-driven recommendation systems. Back in 2020, as much as 62% of
consumers rated platforms like Spotify and YouTube among their top sources of music discovery,
and a healthy chunk of that discovery is mediated by recommender systems.

Yet, as algorithmic recommendations take centre stage in the music discovery landscape, the
professional community at large still perceives these recommender algorithms as black boxes.

About Dataset:

Dataset used : https://www.kaggle.com/datasets/mrmorj/dataset-of-songs-in-spotify


Features that could be used :

1) Danceability
2) Energy
3) Key
4) Loudness
5) Mode
6) Speechiness
7) Acousticness
8) Instrumentalness
9) Liveness
10) Valence
11) Tempo
12) Duration_ms
13) Time_signature
14) Liked

Most of these features are objective sonic descriptions. For example, the metric of
"instrumentalness" reflects the algorithm's confidence that the track has no vocals, scored on a
scale from 0 to 1. However, on top of these "objective" audio attributes, Spotify generates at least
three perceptual, high-level features designed to reflect how the track sounds in a more holistic
way:

Danceability, describing how suitable a track is for dancing based on a combination of musical
elements, including tempo, rhythm stability, beat strength, and overall regularity.

Energy, representing "a perceptual measure of intensity and activity", based on the track's dynamic
range, perceived loudness, timbre, onset rate, and general entropy.

Valence, describing "the musical positiveness of the track". Generally speaking, tracks with high
valence sound more positive (e.g., happy, cheerful, euphoric), while songs with low valence sound
more negative (e.g., sad, depressed, angry).

Types of Recommendation systems that could be used for Spotify:

• Content based filtering


• Collaborative filtering

Both approaches recommend songs that are similar to other songs in the dataset. The main difference
between them is the sort of data that they use: content-based filtering uses the data available
about the songs themselves, while collaborative filtering uses user-item data.

In content-based filtering, the bag-of-words (BoW) representation plays a vital role. Suppose the
playlists of user ‘A’ are described by the bag of words: ‘motivation’, ‘soft’, ‘A.R. Rahman’, ‘latest’.
‘A’ may then be very interested in the latest Sufi song composed by A.R. Rahman. This is how we can
predict the next recommendation for ‘A’.
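
The bag-of-words idea above can be sketched in a few lines of Python. This is only an illustration, not how Spotify does it: the candidate songs and their tags are made up, and the helper name bow_overlap is hypothetical. It scores each candidate by how many words it shares with the words describing A's playlists.

```python
from collections import Counter


def bow_overlap(profile_words, song_words):
    """Count the words (with multiplicity) shared by two bags of words."""
    profile = Counter(w.lower() for w in profile_words)
    song = Counter(w.lower() for w in song_words)
    # Counter intersection keeps the minimum count of each shared word.
    return sum((profile & song).values())


# Words describing user A's playlists (from the example in the text).
profile_a = ["motivation", "soft", "A.R. Rahman", "latest"]

# Hypothetical candidate songs with made-up tags (assumption).
candidates = {
    "old_rock_song": ["rock", "loud", "classic"],
    "new_sufi_song": ["sufi", "soft", "A.R. Rahman", "latest"],
}

# Rank candidates so the song sharing the most words with A's profile is first.
ranked = sorted(
    candidates,
    key=lambda name: bow_overlap(profile_a, candidates[name]),
    reverse=True,
)
print(ranked)
```

A raw overlap count is the simplest possible score; weighting rare words more heavily (for example with TF-IDF) would be a natural refinement.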

For collaborative filtering, we can compare two playlists. For example, ‘A’ has the songs
1, 2, 3, 4, 5 in his playlist, while ‘B’, who likes the same genre, has the songs 2, 3, 4, 5, 6 in his
playlist. Since the two playlists overlap heavily, we can recommend song 1 to ‘B’ and song 6 to ‘A’.

Source : https://www.analyticsvidhya.com/blog/2022/02/introduction-to-collaborative-filtering/
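
The playlist comparison between ‘A’ and ‘B’ can be written as a small Python sketch using sets. The function name recommend_from_neighbor and the Jaccard overlap score are assumptions added for illustration; a real collaborative filter would work over many users rather than a single pair.

```python
def recommend_from_neighbor(own, neighbor):
    """Return songs the neighbor has that the user does not, sorted."""
    return sorted(set(neighbor) - set(own))


playlist_a = {1, 2, 3, 4, 5}
playlist_b = {2, 3, 4, 5, 6}

# Jaccard overlap: shared songs divided by all distinct songs (assumed metric).
shared = playlist_a & playlist_b
jaccard = len(shared) / len(playlist_a | playlist_b)

for_a = recommend_from_neighbor(playlist_a, playlist_b)  # songs to suggest to A
for_b = recommend_from_neighbor(playlist_b, playlist_a)  # songs to suggest to B
print(jaccard, for_a, for_b)
```

The higher the overlap between two users, the more reasonable it is to trust one user's remaining songs as recommendations for the other.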

Cosine Similarity: Cosine similarity is used as a metric in several machine learning algorithms. In
KNN it can measure the closeness of neighbours, in recommendation systems it is used to recommend
items (such as movies) that are similar to each other, and for textual data it is used to measure
the similarity of texts in documents.

In the context of Spotify, cosine similarity could be used to determine the similarity between any
two vectors. Here, the vectors could represent two songs, singers, playlists etc. Let’s suppose that
I want to create a new playlist for Dr. Anuradha. I would compute the cosine similarity between
Dr. Anuradha’s existing Spotify playlist and each playlist that she might be interested in. This
gives me a score for each playlist which, for non-negative feature vectors, lies between 0 and 1. A
score close to 1 indicates that the suggested playlist is very similar to the existing one, and thus
the user would be more interested in listening to it. Similarly, a score close to zero indicates
that the user would be least interested in it.
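
As a minimal sketch of this scoring step, cosine similarity is the dot product of two vectors divided by the product of their lengths. In the Python below, each playlist is summarised by a made-up vector of average [danceability, energy, valence]; the playlist names and all numbers are assumptions for illustration only.

```python
import math


def cosine_similarity(u, v):
    """Cosine of the angle between two equal-length numeric vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    if norm_u == 0 or norm_v == 0:
        return 0.0  # similarity is undefined for a zero vector; treat as 0
    return dot / (norm_u * norm_v)


# Hypothetical profile of the existing playlist (assumed values).
existing = [0.80, 0.70, 0.60]

# Hypothetical candidate playlists (assumed values).
candidates = {
    "upbeat_mix": [0.75, 0.72, 0.65],
    "ambient_mix": [0.20, 0.10, 0.90],
}

scores = {name: cosine_similarity(existing, vec) for name, vec in candidates.items()}
best = max(scores, key=scores.get)
print(scores, best)
```

Because cosine similarity only compares direction, features on very different scales (such as tempo or loudness) would usually be rescaled before being placed in the same vector.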

Cons that I see in this model:


> In our case, we have implemented a content-based filtering mechanism, so the model can only make
recommendations based on that specific user's existing interests. This limits the user's ability to
expand beyond those interests.

> To use this recommender, a user needs at least one playlist on their Spotify account. This is a
disadvantage for an entirely new user of the Spotify application and leads to the cold-start
problem.

My Ideation towards how the existing models could be used to make the Spotify
recommendations better:

Currently, Spotify uses Discover Weekly, which is an extremely powerful recommendation tool, to
ascertain user tastes. To make our personalized Discover Weekly playlist, Spotify builds a taste
profile based on our listening habits, then uses collaborative filtering to find songs that we haven’t
listened to from people with similar taste profiles. It’s the same approach that helps us buy more
stuff on Amazon!

However, one feature that I see lacking is recommendation on the basis of mood. Instead of songs
being offered based only on my overall taste profile, I would want a recommendation system that
also lets me get recommendations on the basis of mood. I would like to call it Mood Playlist,
which would be a kind of biased random mode. Suppose I am on a dinner date and want to listen to
some soft songs that I have saved. All I would have to do is select “The way you look tonight” and
switch Mood Playlist on, and it would play the songs that are similar to the one I selected.

In order to achieve this type of recommendation system, understanding the correlation between the
features listed above is a must. Therefore, I have computed the correlations to understand which
features are most closely related.

From the correlations, I can easily see which features are positively correlated. For example, I can
see that ‘Loudness’ and ‘Energy’ are correlated. Therefore, when one plays an energetic song and then
clicks on the Mood Playlist, the recommender system can fetch more songs based on the correlated
features like ‘Loudness’.
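
A correlation check of this kind can be sketched with the Pearson coefficient. The Python below uses a handful of made-up values for three features, not figures from the Kaggle dataset, and the helper name pearson is hypothetical; on real data the same calculation would be applied to every pair of feature columns.

```python
import math


def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for a constant sequence")
    return cov / math.sqrt(var_x * var_y)


# Made-up illustrative values for five songs (assumption, not dataset values).
loudness = [-12.0, -9.5, -7.0, -5.5, -4.0]    # in dB, higher means louder
energy = [0.30, 0.45, 0.60, 0.75, 0.90]
acousticness = [0.90, 0.70, 0.40, 0.20, 0.10]

r_loud_energy = pearson(loudness, energy)          # expected to be positive
r_energy_acoustic = pearson(energy, acousticness)  # expected to be negative
print(r_loud_energy, r_energy_acoustic)
```

A coefficient near +1 means the two features rise together, a coefficient near -1 means one falls as the other rises, and values near 0 suggest no linear relationship.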

Here I can use cosine similarity to find the songs most similar to the selected one, and thus
create a playlist of, say, the top 10 or 20 songs based on the first song choice.
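
A minimal sketch of the Mood Playlist idea, under stated assumptions: each song is represented by a made-up vector of [energy, valence, acousticness], all already on a 0 to 1 scale, and the function name mood_playlist and the song library are hypothetical. The seed song is compared with every other song and the k most similar ones are returned.

```python
import math


def cosine_similarity(u, v):
    """Cosine of the angle between two equal-length numeric vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    if norm_u == 0 or norm_v == 0:
        return 0.0
    return dot / (norm_u * norm_v)


def mood_playlist(seed_title, library, k=10):
    """Return up to k titles most similar to the seed song, best first."""
    seed = library[seed_title]
    scored = [
        (cosine_similarity(seed, vec), title)
        for title, vec in library.items()
        if title != seed_title  # never recommend the seed itself
    ]
    scored.sort(reverse=True)
    return [title for _, title in scored[:k]]


# Hypothetical library; feature values are invented for illustration.
library = {
    "The Way You Look Tonight": [0.25, 0.60, 0.85],
    "soft_ballad": [0.30, 0.55, 0.80],
    "acoustic_duet": [0.20, 0.50, 0.90],
    "club_anthem": [0.95, 0.80, 0.05],
    "metal_track": [0.98, 0.30, 0.01],
}

top_two = mood_playlist("The Way You Look Tonight", library, k=2)
print(top_two)
```

With these invented values the two soft, acoustic songs rank ahead of the loud ones, which is the behaviour the Mood Playlist is meant to have.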

Spotify uses an extremely well-built recommendation system. They’ve been working on it for years,
thoroughly developing their algorithms and testing countless hypotheses with teams of top-class
data scientists from all over the world. It would be naive of me to claim that my project can be
as good as what they do. But that’s not the goal.

The goal is to come up with a concept that could be polished into a feature in a recommendation
environment, then further tested and improved, with the aim of delivering a more complete
experience for Spotify’s users.
