Music Recommendation

Listen to the music you like

Information Retrieval, 1st semester 2011/2012. Ricardo Dias, student no. 55444

Bibliography
Music Recommendation and Discovery: The Long Tail, Long Fail, and Long Play
Òscar Celma, Springer 2010, Ch. 1-3, 5

Recommender Systems
Prem Melville, Vikas Sindhwani Encyclopedia of Machine Learning, 2010

Handbook of Multimedia for Digital Entertainment and Arts


Borko Furht (Ed.), Springer 2009

What is Recommendation? Do we need it? Why?

MOTIVATION & CONTEXT

Recommendation in our lives

Restaurants

Discos and Bars

Food

Books

And Music!

Do we need Music recommendation?

Music Consumption Change


[Timeline: music consumption shifted from physical stores to online stores]

Music Recommendation
Digital Era - Portability

[Timeline: portable-player capacity grew from ~20 tracks to ~40,000 tracks]

Music Recommendation
Digital Era: Online Services

Amazon

17 Million Songs

Music Recommendation Problem


Overwhelming number of choices of which music to listen to
Users feel:
Paralyzed, Doubtful

Need to provide personalized filters and recommendations to ease users' decisions

Music Recommendation
Before Digital Era

Cannot only rely on recommendations from:


Radios
Friends
Local Record Dealers
DJs and Music Experts
Etc.

Music Recommendation
Digital Era

Music Characteristics
Different from other types of media
Tracking users preferences can be implicit Items can be consumed several times (even repeatedly and continuously) Instant feedback Music consumption depends on context (morning, work, afternoon, etc.)

Music Recommendation Specificities


Current music recommendation algorithms try to accurately predict what people will want to listen to
They focus on making accurate predictions about what a user could listen to, or buy next, independently of how useful the provided recommendations are to the user

Formalization, Use Cases, Profile Generation, Recommendation Methods

THE RECOMMENDATION PROBLEM

Formalization
Recommendation Problem
Prediction problem: estimation of an item's likeliness for a given user
Recommendation problem: recommend a list of N items, assuming that the system can predict likeliness for yet unrated items

Prediction Problem
U = {u_1, ..., u_m}, the set of users
I = {i_1, ..., i_n}, the set of items that can be recommended
I_uj, the list of items for which user u_j has expressed his interests
Function p(u_a, i_j): predicted likeness of item i_j for the active user u_a, where i_j is not in I_ua
Usually represented by a rating triple <user, item, rating>

Recommendation Problem
Find a list of N items that the user will like the most
The ones with the highest p(u_a, i_j)
The resulting list should not contain items from the user's interests: I_N ∩ I_ua = ∅
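The two tasks can be sketched in a few lines of Python. This is a toy illustration, not an actual predictor: the ratings, names, and the placeholder `predict` function (a global mean) are all made up.

```python
# Toy sketch of the prediction and recommendation tasks.
# ratings[u][i] holds the known rating of user u for item i (made-up data).
ratings = {
    "alice": {"song_a": 5, "song_b": 3},
    "bob":   {"song_a": 4, "song_c": 2},
}

def predict(user, item):
    # Placeholder for p(u, i): here simply the global mean rating.
    all_r = [r for u in ratings.values() for r in u.values()]
    return sum(all_r) / len(all_r)

def recommend_top_n(user, catalog, n):
    # Rank only items the user has not yet rated (the list must not
    # contain items from the user's interests), highest prediction first.
    unseen = [i for i in catalog if i not in ratings[user]]
    unseen.sort(key=lambda i: predict(user, i), reverse=True)
    return unseen[:n]

print(recommend_top_n("alice", ["song_a", "song_b", "song_c", "song_d"], 2))
```

Any real recommender replaces `predict` with one of the methods discussed next (collaborative, content-based, etc.); the top-N step stays the same.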

Use Cases
Common usages of a recommender system:
1. Find good items
2. Find all good items
3. Recommend sequence (e.g. playlist generation)
4. Just browsing
5. Find credible recommender
6. Express self
7. Influence others

General Model
Users and Items
Two types of recommendations:
Top-N predicted items
Top-N predicted neighbors

User Profile Generation


Two key elements:
Generation and maintenance of the profile
Exploitation of the profile using a recommendation system

User Profile Creation


Empty Profile: the simplest, but...
Manual: direct feedback to the system, but...
Data import: create the profile from an external representation
Training Set: provide feedback on concrete items, marking them relevant or irrelevant to the user's interests, but...
Stereotyping: assign a user to a cluster of similar users that are represented by their stereotype

User Profile Maintenance


Explicit Feedback
Ratings (problems?)
Comments and Opinions

Implicit Feedback
Monitoring users' actions (e.g., tracking play, pause, skip and stop buttons in the media player)
Problems? Advantages over explicit feedback approaches?

User Profile Adaptation


Adapt the system to users profile changes:
Manually
Adding new information while keeping the old
Gradually forgetting old interests and promoting the new ones

Recommendation Methods
Standard classification of recommender systems:
1. Demographic Filtering
2. Collaborative Filtering
3. Content-based Filtering
4. Context-based Filtering
5. Hybrid Approaches

Demographic Filtering
Used to identify the kind of users that like a certain item Classifies user profiles in clusters based on:
Personal data (age, gender, marital status, etc.)
Geographic data (city, country)
Psychographic data (interests, lifestyle, etc.)

Advantages/Limitations
The simplest recommendation method But
Recommendations are too general
Requires effort from the user to generate the profile

Collaborative Filtering
Predict user preferences for items by learning past user-item relationships
CF methods work by building a matrix M with n items and m users that contains the interactions (e.g. ratings, plays, etc.) of the users with the items

Collaborative Filtering
The value M(u, i) represents the rating of user u for item i

Collaborative Filtering Approaches


Item-Based Neighborhood
User-Based Neighborhood
Matrix Factorization

Item-Based Neighborhood
Only users that rated both items i and j are taken into account in the process

Item-Based Neighborhood
1. Compute the similarity between two items, i and j
Example: Adjusted cosine similarity

2. Predict, for the target user u, a value for the active item i

N(i; u) - the set of the k neighbors of item i that the user u has rated
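The two steps can be sketched directly in Python. The ratings below are made up, and the neighbor selection is deliberately naive (it recomputes similarities rather than caching them); it is a sketch of the method, not an efficient implementation.

```python
import math

# Item-based CF sketch: adjusted cosine similarity + weighted prediction.
# All users and ratings are hypothetical.
ratings = {
    "u1": {"i1": 5, "i2": 1},
    "u2": {"i1": 4, "i2": 2, "i3": 5},
    "u3": {"i1": 2, "i2": 4, "i3": 1},
}
user_mean = {u: sum(r.values()) / len(r) for u, r in ratings.items()}

def adjusted_cosine(i, j):
    # Only users that rated both i and j are taken into account;
    # each rating is centred on that user's mean.
    common = [u for u in ratings if i in ratings[u] and j in ratings[u]]
    num = sum((ratings[u][i] - user_mean[u]) * (ratings[u][j] - user_mean[u])
              for u in common)
    den_i = math.sqrt(sum((ratings[u][i] - user_mean[u]) ** 2 for u in common))
    den_j = math.sqrt(sum((ratings[u][j] - user_mean[u]) ** 2 for u in common))
    return num / (den_i * den_j) if den_i and den_j else 0.0

def predict(u, i, k=2):
    # Weighted average over the k items most similar to i that u rated.
    neighbors = sorted((j for j in ratings[u] if j != i),
                       key=lambda j: adjusted_cosine(i, j), reverse=True)[:k]
    num = sum(adjusted_cosine(i, j) * ratings[u][j] for j in neighbors)
    den = sum(abs(adjusted_cosine(i, j)) for j in neighbors)
    return num / den if den else user_mean[u]

print(round(predict("u1", "i3"), 2))  # → 2.0
```

With this data, i3 correlates perfectly with i1 (similarity 1) and anti-correlates with i2 (similarity -1), so u1's ratings for i1 and i2 pull the prediction in opposite directions.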

User-Based Neighborhood
Compute the predicted rating value of item i, for the active user u, taking into account those users that are similar to u

r̄_u - average rating for user u
N(u) - the set of the k neighbors of user u (the most similar ones)
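A minimal sketch of the user-based prediction, using Pearson correlation over co-rated items as the user-user similarity (the data is made up):

```python
import math

# User-based CF sketch: predict item "i3" for user "u1" from the
# users most similar to u1. Names and ratings are hypothetical.
ratings = {
    "u1": {"i1": 4, "i2": 2},
    "u2": {"i1": 5, "i2": 1, "i3": 5},
    "u3": {"i1": 1, "i2": 5, "i3": 1},
}

def mean(vals):
    return sum(vals) / len(vals)

def pearson(u, v):
    # Similarity computed over the items both users rated.
    common = [i for i in ratings[u] if i in ratings[v]]
    mu = mean([ratings[u][i] for i in common])
    mv = mean([ratings[v][i] for i in common])
    num = sum((ratings[u][i] - mu) * (ratings[v][i] - mv) for i in common)
    den = math.sqrt(sum((ratings[u][i] - mu) ** 2 for i in common)) * \
          math.sqrt(sum((ratings[v][i] - mv) ** 2 for i in common))
    return num / den if den else 0.0

def predict(u, i, k=2):
    # User's mean rating plus the similarity-weighted, mean-centred
    # ratings of the k nearest neighbors that rated item i.
    neigh = sorted((v for v in ratings if v != u and i in ratings[v]),
                   key=lambda v: pearson(u, v), reverse=True)[:k]
    num = sum(pearson(u, v) * (ratings[v][i] - mean(ratings[v].values()))
              for v in neigh)
    den = sum(abs(pearson(u, v)) for v in neigh)
    base = mean(ratings[u].values())
    return base + num / den if den else base

print(round(predict("u1", "i3"), 2))  # → 4.33
```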

Matrix Factorization
Useful when the user-item matrix M is sparse
Reduce the dimensionality of the original matrix, generating matrices U and V that approximate the original one
Example: SVD (Singular Value Decomposition)
Computes matrices U, S and V for a given number k, such that M ≈ U·S·V^T, where S is a diagonal matrix containing the singular values of M

Matrix Factorization
After the matrix reduction we can calculate the predicted rating value of item i for user u
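The prediction step can be illustrated with a toy factorization. The slides use SVD; since a proper SVD needs a linear-algebra library, this sketch learns the two low-rank factor matrices by plain stochastic gradient descent instead (the "Funk SVD" variant popularized during the Netflix Prize). The predicted rating is just the dot product of a user factor and an item factor; all data and learning parameters are made up.

```python
import random

# Low-rank factorization of a sparse rating matrix by gradient descent
# (a stand-in for SVD; data and hyperparameters are hypothetical).
random.seed(0)
R = {("u1", "i1"): 5, ("u1", "i2"): 4, ("u2", "i1"): 4,
     ("u2", "i3"): 2, ("u3", "i2"): 1, ("u3", "i3"): 5}
k = 2  # number of latent factors
U = {u: [random.uniform(-0.1, 0.1) for _ in range(k)] for u in ("u1", "u2", "u3")}
V = {i: [random.uniform(-0.1, 0.1) for _ in range(k)] for i in ("i1", "i2", "i3")}

def pred(u, i):
    # Predicted rating = dot product of user and item factor vectors.
    return sum(a * b for a, b in zip(U[u], V[i]))

for _ in range(2000):  # minimize squared error on the known ratings
    for (u, i), r in R.items():
        e = r - pred(u, i)
        for f in range(k):
            U[u][f] += 0.01 * e * V[i][f]
            V[i][f] += 0.01 * e * U[u][f]

print(round(pred("u1", "i1"), 1))  # close to the observed rating 5
```

After training, `pred` also yields values for the empty cells of M, e.g. `pred("u1", "i3")`, which is exactly the prediction the slide refers to.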

Limitations
Data sparsity and high dimensionality
Gray sheep problem
Cold-start problem (early-rater problem)
Does not take into account item descriptions
Popularity bias
Feedback loop

Content-based Filtering
Uses information describing the items
The process of characterizing the item data set can be:
Manual (annotations by domain experts)
Automatic (extracting features by analyzing the content)

Key component: Similarity Function

Content-based Filtering
Similarity Functions
1. Euclidean
2. Manhattan
3. Chebyshev
4. Mahalanobis
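The first three distances need nothing beyond the feature vectors themselves and can be written in a few lines (Mahalanobis additionally requires the inverse covariance matrix of the data, so it is omitted here). The two feature vectors are made up:

```python
import math

# Three of the listed distances over two hypothetical feature vectors.
x, y = [1.0, 2.0, 3.0], [4.0, 0.0, 3.0]

euclidean = math.sqrt(sum((a - b) ** 2 for a, b in zip(x, y)))  # L2
manhattan = sum(abs(a - b) for a, b in zip(x, y))               # L1
chebyshev = max(abs(a - b) for a, b in zip(x, y))               # L-infinity

print(round(euclidean, 3), manhattan, chebyshev)  # → 3.606 5.0 3.0
```

Smaller distance means more similar items, so a content-based recommender ranks candidate songs by ascending distance to the songs in the user's profile.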

Limitations
Cold-start problem (only for the user side)
Gray-sheep problem
Novelty (?)
Limitations of automatically extracted features
Subjective aspects (personal opinions) not taken into account

Context-based Filtering
Uses context information to describe and characterize the items
Context information: any information that can be used to characterize a situation or an entity
Context != Content

Two main techniques:


Web mining Social Tagging

Web Mining
3 different web mining categories:
Web content mining
text, hypertext, markup, and multimedia mining

Web structure mining


focuses on link analysis (in- and out- links)

Web usage mining


uses the information available on session logs. This information can be used to derive user habits and preferences, link prediction, or item similarity based on co-occurrences in the session log

Social Tagging
Aims at annotating web content using tags
Tags are freely chosen keywords, not constrained to a predefined vocabulary
Recommender systems can mine social tagging data to derive item (or user) similarity

Social Tagging
When users tag items, we get tuples of:
<user, item, tag>

These triples form a 3-order matrix (a tensor)

Social Tagging
Two approaches to compute item (and user) similarity:
1. Unfold the 3-order tensor into three bidimensional matrices (user-tag, item-tag and user-item)
2. Directly use the 3-order tensor

Unfolding the 3-order tensor


User-Tag (U matrix): U(i, j) contains the number of times user i applied the tag j
Item-Tag (I matrix): I(i, j) contains the number of times item i has been tagged with tag j
User-Item (R binary matrix): R(i, j) denotes whether user i has tagged item j

Unfolding the 3-order tensor


Item similarity (using I) or user similarity (using U or R) can be computed using:
Cosine-based distance
Dimensionality reduction techniques (SVD, NMF)

Then recommendations can be made by using:


The R user-item matrix, or
A user profile obtained from U or I

Using the 3-order tensor


The available techniques are (high-order) extensions of SVD and NMF
HOSVD is a higher-order generalization of matrix SVD to tensors
Non-negative Tensor Factorization (NTF) is a generalization of NMF

Limitations
Coverage
Problems with tags:
Polysemy
Synonymy
Usefulness of personal tags
Sparsity

Attacks / Vandalism

Hybrid Approaches
Goal
Achieve better recommendations by combining some of the previous approaches

Methods:
Weighted
Switching
Mixed
Cascade

Factors Affecting Recommendation


Novelty and Serendipity
Explainability (transparency)
Cold Start Problem
Data Sparsity and High Dimensionality
Coverage

Factors Affecting Recommendation


Trust
Attacks
Temporal Effects
Understanding the Users

Use cases, User and Item Profiles Representation, Recommendation Examples

MUSIC RECOMMENDATION

Use Cases
Main task of a music recommendation system:
Propose interesting music, consisting of a mix of known and unknown artists and their available tracks, given a user profile

Use Cases
Artist Recommendation
Playlist Generation
Shuffle, Random Playlists
Personalized Playlists

Neighbor Recommendation

How about other use cases?

User Profile Representation


Extend the user profile with music-related information
Has not been widely investigated

Useful to:
Improve music recommendation
Share your preferences with others

Type of Listeners
Each type of listener needs a different type of recommendations

User Profile Representation Proposals


Most relevant proposals are:
User Modeling for Information Retrieval (UMIRL)
The MPEG-7 standard
The Friend of a Friend (FOAF) initiative

User Modeling for Information Retrieval


Allows one to describe perceptual and qualitative features of the music

MPEG-7 User Preferences


User preferences in MPEG-7 include:
Content filtering
Searching and browsing preferences
Usage history

FOAF: User Profiling in the Semantic Web


Provides conventions and a language to tell a machine the type of things a user says about herself in her homepage

Item Profile Representation


Music items:
Artists
Songs

Music Information Plane

Music Information Plane


Music knowledge management categories:
Editorial Metadata
Cultural Metadata
Acoustic Metadata

Editorial Metadata

Cultural Metadata

Acoustic Metadata

Music Description Facets


Low-level Timbre Descriptors
Spectral Centroid/Flatness/Skewness, MFCCs, etc.

Instrumentation
Rhythm
Harmony
Structure
Intensity
Genre
Mood

Recommendation Methods (examples and specificities)


Collaborative Filtering (CF)
Explicit/Implicit Feedback

Content-Based Filtering Context-Based Filtering Hybrid Methods

Collaborative Filtering
CF makes use of the editorial and cultural information
Explicit feedback: based on ratings of songs / artists
Implicit feedback: tracking users' listening habits

CF with Explicit Feedback


Examples:
Ringo: 1st music recommender based on CF and explicit feedback
Racofi: based on CF and a set of logic rules based on Horn clauses
Indiscover: Slope One CF method

CF with Implicit Feedback


Main Drawbacks:
The value that a user assigns to an item is not always in a predefined range (e.g. from 1..5, or like it / hate it)
Cannot gather negative feedback

Recommendations are usually performed at the artist level, but listening habits are recorded at the song level, so aggregation is needed
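The song-to-artist aggregation step can be sketched as a simple count over the play log; the log below is made up, and raw play counts stand in for the implicit preference score:

```python
from collections import Counter

# Aggregating song-level listening events into artist-level implicit
# scores (hypothetical play log).
plays = [("Radiohead", "Creep"), ("Radiohead", "Karma Police"),
         ("Portishead", "Glory Box"), ("Radiohead", "Creep")]

artist_score = Counter(artist for artist, _ in plays)
print(artist_score.most_common(1))  # → [('Radiohead', 3)]
```

Real systems typically normalize or log-scale these counts before feeding them to CF, precisely because the values have no predefined range.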

Content-Based Filtering
Uses content extracted from music to provide recommendations
Compute similarity among songs, in order to recommend music to the user

Two ways to describe audio content:


Manually Automatically

Manually Audio Content Description


Very time consuming
Scalability problems

But
Annotations can be more accurate than automatic

Example: Pandora
Analysts annotate 400 parameters per song, using a ten-point scale per attribute
~15,000 songs analyzed per month

Automatic Audio Content Description


Early work on audio similarity is based on low-level descriptors, such as Mel Frequency Cepstral Coefficients (MFCC)
Foote proposed a music indexing system based on MFCC histograms
Audio features are usually aggregated using mean and variance, or modeled as a Gaussian Mixture Model (GMM)

Automatic Audio Content Description


Analyse the audio signal and automatically extract a set of features:
Tzanetakis extracted a set of features representing the spectrum, rhythm and harmony (chord structure); these were merged into a single vector and used to determine song similarity
Cataltepe et al. presented a music recommendation system based on audio similarity, where the user's listening history is taken into account

Context-Based Filtering Techniques


Uses cultural information to compute artist or song similarity
Mainly based on web mining techniques, or on mining data from collaborative tagging

Context-Based Filtering Techniques


Example:
M3 (Music for My Mood) uses context (season, month, day of the week, weather, temperature) and Case-Based Reasoning to recommend music

Hybrid Methods
Allow a system to minimize the issues that a single method can have
How the cascade approach works:
One technique is applied first, obtaining a ranked list of items. Then, a second technique refines or re-ranks the results obtained in the first step
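The two-step cascade can be sketched with stub scorers; everything here is hypothetical (the scores are hard-coded, standing in for a collaborative first stage and a content-based second stage):

```python
# Cascade hybrid sketch: scorer 1 shortlists candidates, scorer 2
# re-ranks only that shortlist. Both scorers are hard-coded stubs.
def cf_score(item):          # stand-in for a collaborative score
    return {"a": 0.9, "b": 0.8, "c": 0.7, "d": 0.1}[item]

def content_score(item):     # stand-in for a content-based score
    return {"a": 0.2, "b": 0.9, "c": 0.5, "d": 0.99}[item]

def cascade(items, shortlist=3, n=2):
    candidates = sorted(items, key=cf_score, reverse=True)[:shortlist]
    return sorted(candidates, key=content_score, reverse=True)[:n]

print(cascade(["a", "b", "c", "d"]))  # → ['b', 'c']
```

Note that item "d", despite its high content score, is never reconsidered once the first stage filters it out: that pruning is exactly what distinguishes a cascade from a weighted combination.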

Hybrid Method
Example:
Tiemann et al. investigated ensemble learning methods for hybrid music recommender algorithms. Their approach combines social and content-based methods, each producing a weak learner; a combination rule then unifies the outputs of the weak learners.

System-centric, Network-centric, User-centric

EVALUATION

Evaluation
Three different strategies
System-centric Network-centric User-centric

System-centric Evaluation
Evaluation measures how accurately the system can predict the actual values that users have previously assigned

System-centric Evaluation
Most approaches are based on the leave-n-out method
Similar to classic n-fold cross-validation

Dataset divided into two (usually disjoint) sets:


Training and Test

Accuracy evaluation is based only on a user's dataset


The rest of the items of the catalog are ignored

System-centric Evaluation
Metrics:
Predictive accuracy
Mean Absolute Error, Root Mean Square Error

Decision based
Mean Average Precision, Recall, F-measure, Accuracy, ROC

Rank based
Spearman's ρ, Kendall's τ, Half-life Utility, Discounted Cumulative Gain
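The two predictive-accuracy metrics are one-liners over held-out (actual, predicted) rating pairs; the numbers below are made up:

```python
import math

# MAE and RMSE over held-out (actual, predicted) rating pairs
# from a hypothetical test set.
pairs = [(4, 3.5), (2, 2.5), (5, 4.0), (1, 1.0)]

mae = sum(abs(a - p) for a, p in pairs) / len(pairs)
rmse = math.sqrt(sum((a - p) ** 2 for a, p in pairs) / len(pairs))

print(mae, round(rmse, 3))  # → 0.5 0.612
```

RMSE penalizes large errors more heavily than MAE (here the single 1.0-point miss dominates it), which is why the two metrics can rank algorithms differently.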

System-centric Evaluation Limitations


Cannot evaluate recommendations concerning:
1. Coverage
2. Novelty
3. Transparency (explainability)
4. Trustworthiness (confidence)
5. Perceived Quality

Network-centric Evaluation
Evaluation aims at measuring the topology of the item (or user) similarity network

Network-centric Evaluation
The similarity network is the basis to provide the recommendations
Important to analyze and understand the underlying topology of the similarity network

Measures:
Coverage
Diversity of recommendations

Network-centric Evaluation
In terms of:
Navigation
Average Shortest Path, Giant Component

Connectivity
Degree Distribution, Degree-Degree Correlation, Mixing Patterns

Clustering
Local/Global Clustering Coefficient

Centrality
Degree, Closeness, Betweenness

Network-centric Evaluation
Limitations:
Accuracy of the recommendations cannot be measured
Transparency (explainability) and trustworthiness (confidence) of the recommendations cannot be measured
The perceived quality (i.e. usefulness and effectiveness) of the recommendations cannot be measured

User-centric Evaluation
Evaluation focuses on the users' perceived quality and the usefulness of the recommendations

User-centric Evaluation
Copes with the limitations of both:
System- and Network-centric approaches

Evaluates:
Novelty Perceived Quality

User-centric Evaluation
Gathering Feedback (Explicit, Implicit)
Perceived Quality
Novelty
A/B Testing

Perceived Quality
Easiest way to measure? Explicitly ask the users
Users need information about:
The item (e.g. metadata, preview, etc.)
The reasons why the item was recommended

Users can then rate the quality of each recommended item (or the whole list)

Novelty
Ask users whether they recognize the recommended items or not
Combining novelty and perceived quality we can infer whether:
The user likes to receive and discover unknown items
Or prefers more conservative and familiar recommendations

A/B Testing
Present two different versions of an algorithm (or two different algorithms)
Evaluate which one performs best

Performance is measured by the impact the new algorithm has on the visitors' behavior, compared to the baseline algorithm

User-centric Evaluation
Limitations:
Needs user intervention in the evaluation process
Gathering feedback can be tedious for some users
Time-consuming

Evaluation summary
Combining the three methods we can cover all the facets of a recommender algorithm

Evaluation summary
System-centric
Evaluates performance accuracy of the algorithm

Network-centric
Analyses the structure of the similarity network

User-centric
Measures users' satisfaction with the recommendations they receive

Which datasets can we use to evaluate Music Recommendation Approaches?

DATASETS FOR EVALUATION

Last.fm Dataset 1K users


Contains <user, timestamp, artist, song> tuples
Represents the listening habits of ~1,000 users

Collected from the Last.fm API


User.getRecentTracks()

Statistics:
~108,000 artists with a MusicBrainz ID
~70,000 artists without a MusicBrainz ID

Last.fm Dataset 360K users


Contains <user, artist, plays> tuples from 360,000 users
Collected from the Last.fm API
User.getTopArtists()

Statistics:
~190,000 artists with a MusicBrainz ID
~107,000 artists without a MusicBrainz ID

The Million Song Dataset


One Million Songs!!!
280 GB of data
~45,000 unique artists
~8,000 unique terms
> 2 million asymmetric similarity relationships
Acoustic features
Pitch, Timbre, Loudness, etc.

Links to other sources to obtain more information


Musicbrainz, 7digital, playme

NEXTONE PLAYER
NEXTONE PLAYER: A Music Recommendation System Based on User Behavior
Yajie Hu and Mitsunori Ogihara
ISMIR 2011

Session-based CF for Music Recommendation


Session-based Collaborative Filtering for Predicting the Next Song
Sung Eun Park, Sangkeun Lee, Sanggoo Lee
CNSI 2011

The End
Thank you!
