
Diversity in news recommendations using

contextual bandits

Saurav V Sonawane
22101060
Contents

• Introduction
• Types of Recommendation Systems
• Functions
• Applications
• Abstract Summary of Research Paper
• Background
• Objectives
• Multi-armed Bandits
• Setup
• LinUCB and DivLinUCB
• Evaluation Method and Metrics
• Results
• Conclusion and Future Work
• References
Introduction

• Recommendation systems are algorithms designed to suggest items to users based on various factors,
which could include user preferences, behaviour, item characteristics, and contextual information.

• The primary goal of these systems is to present the most relevant, personalized suggestions to a user,
ideally enhancing the user's experience and satisfaction with the service.
• Recommendation systems are critical in helping users navigate the overwhelming number of choices
available in various domains, enhancing user experience, driving user engagement, and increasing the
likelihood of item consumption or purchase.
• They are a cornerstone of many modern digital services and play a significant role in the success of
online platforms in e-commerce, media and entertainment, and retail.
Types of Recommendation Systems

• Collaborative Filtering: This method makes automatic predictions about a user's interests
by collecting preferences from many users (collaborating). It assumes that if users agreed
in the past, they will agree in the future about certain items. It can be user-based or item-
based.
• Content-Based Filtering: This approach uses item features to recommend additional items
similar to what the user likes, based on their previous actions or explicit feedback.
• Hybrid Methods: These systems combine collaborative and content-based filtering to
overcome limitations found in either approach. They can provide more accurate
recommendations by integrating multiple data sources and prediction models.
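The collaborative idea can be sketched in a few lines. This is a minimal, hypothetical illustration (toy ratings invented for the example, not from the paper), predicting an unseen rating as a similarity-weighted average over other users who rated the item:

```python
import numpy as np

# Toy user-item rating matrix (rows: users, cols: items); 0 = unrated.
R = np.array([
    [5.0, 3.0, 0.0, 1.0],
    [4.0, 0.0, 0.0, 1.0],
    [1.0, 1.0, 0.0, 5.0],
    [0.0, 1.0, 5.0, 4.0],
])

def cosine_sim(u, v):
    """Cosine similarity between two rating vectors."""
    return np.dot(u, v) / (np.linalg.norm(u) * np.linalg.norm(v))

def predict(R, user, item):
    """Predict a rating as a similarity-weighted average over
    the other users who have rated the item."""
    raters = [u for u in range(R.shape[0]) if u != user and R[u, item] > 0]
    sims = np.array([cosine_sim(R[user], R[u]) for u in raters])
    ratings = np.array([R[u, item] for u in raters])
    return float(np.dot(sims, ratings) / sims.sum())

print(round(predict(R, user=0, item=2), 2))  # -> 5.0 (only user 3 rated item 2)
```

Item-based collaborative filtering works the same way with the roles of users and items swapped.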
Functions

• Predictive Rating: Predict how much a user would rate an item they have not interacted with.
• Item Ranking: Rank items in order of relevance or likelihood of user interest.
• Personalized Content: Customize the content delivered to a user's unique taste.
• Diverse Suggestions: Ensure that users are exposed to a wide variety of items, preventing the "filter
bubble" effect.
Applications

• E-commerce: Suggest products to customers.

• Media and Entertainment: Recommend movies, TV shows, music, and books.

• News Aggregation: Propose news articles and other written content.

• Social Media: Personalize feeds and suggest connections or content.


Abstract Summary of Research Paper

• Personalization with Contextual Bandits: Contextual bandit techniques enhance user recommendation
systems, especially where data and preferences are constantly evolving.
• Beyond Collaborative Filtering: Collaborative filtering may fall short in dynamic environments where users
and items change frequently; contextual bandits present a solution.
• Sequential Article Selection: The method involves selecting articles sequentially for user recommendations
and adapting based on maximizing user clicks.
• Balancing Clicks and Exposure: Focusing solely on click maximization can lead to an imbalance:
overexposure of some articles while others are overlooked.
• Social Responsibility in News Delivery: The paper introduces a socially responsible recommendation
technique that considers not just click maximization but also the historical frequency of article
recommendations as a cost.
• Diverse Recommendations: This approach results in a more balanced distribution and diversity in article
recommendations.
Background

The most common class of recommender algorithms are matrix factorization (MF)-based algorithms,
which exploit the idea that latent factors motivate users to like or dislike an item (e.g., a news article).
The underlying logic is that if a user prefers certain factors, and an item contains them, then he or she
is likely to enjoy consuming the item.
Although latent factors are similar to item features (e.g., styles in the music domain or genres in motion
pictures), they are not interpretable. Matrix factorization considers a user–item rating matrix 𝐑 of size
𝑚×𝑛, where 𝑚 is the number of users and 𝑛 is the number of items. Each cell of matrix 𝐑 corresponds to
a user’s rating of a particular item.
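The factorization idea can be sketched as follows, assuming a toy rating matrix and plain stochastic gradient descent on the observed cells only (all data and hyperparameters here are illustrative, not from the paper):

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy rating matrix R; 0 marks missing entries.
R = np.array([
    [5.0, 3.0, 0.0],
    [4.0, 0.0, 1.0],
    [1.0, 1.0, 5.0],
])
m, n, k = R.shape[0], R.shape[1], 2  # users, items, latent factors

P = rng.normal(scale=0.1, size=(m, k))  # user latent factors
Q = rng.normal(scale=0.1, size=(n, k))  # item latent factors

lr, reg = 0.01, 0.02  # learning rate, L2 regularization
for _ in range(2000):
    for u in range(m):
        for i in range(n):
            if R[u, i] > 0:  # update on observed ratings only
                err = R[u, i] - P[u] @ Q[i]
                P[u] += lr * (err * Q[i] - reg * P[u])
                Q[i] += lr * (err * P[u] - reg * Q[i])

# Reconstruction error on the observed entries should now be small.
mask = R > 0
rmse = np.sqrt(np.mean((R[mask] - (P @ Q.T)[mask]) ** 2))
print(f"training RMSE on observed cells: {rmse:.3f}")
```

The fitted product P Q^T also fills in the unobserved cells, which is what makes MF usable for recommendation.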
The described MF methods are conducive to situations when the dimensions of matrix 𝐑 are known. In
many cases, however, the dimensions may change over time as new users and content items appear, with
no available historical data associated with the new elements. To address such issues, recommendation
problems can be modeled by using multi-armed bandit models.
Objectives

Although many fairness metrics have been proposed for recommender systems, they are mainly suitable
for collaborative-filtering-based recommender systems. Recommendation via multi-armed bandits,
however, is conceptually different from the collaborative-filtering-based approach.

Three main objectives are:

• The best approach for developing a MAB recommender algorithm that, in addition to maximizing
clickthrough rate, would also account for the diversity of recommended articles.

• The impact of increases in diversity of recommended articles relative to changes in clickthrough rates.

• Descriptive study of recommendation frequencies and a corresponding comparison with empirical
observations provided in prior articles (Haim et al., 2018; Nechushtai & Lewis, 2019).
Multi-armed Bandits

• Multi-armed bandit (MAB) problems model sequential decision making. A 𝐾-armed bandit problem
considers cases where 𝐾 "slot machines", each with a different payoff probability distribution, are
present.

• The objective of the "gambler" is then to identify a strategy that would maximize his or her payoff
over a given time horizon.
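The gambler's exploration–exploitation trade-off can be sketched with a simple epsilon-greedy strategy on hypothetical Bernoulli "slot machines" (the paper's algorithms are UCB-based; this toy is only to illustrate the bandit problem itself):

```python
import random

random.seed(42)

# K Bernoulli "slot machines" with hidden payoff probabilities.
payoff_probs = [0.2, 0.5, 0.8]
K = len(payoff_probs)

counts = [0] * K    # pulls per arm
values = [0.0] * K  # running mean reward per arm
epsilon = 0.1       # exploration rate
total_reward = 0.0

for t in range(5000):
    if random.random() < epsilon:
        arm = random.randrange(K)                     # explore a random arm
    else:
        arm = max(range(K), key=lambda a: values[a])  # exploit the best estimate
    reward = 1.0 if random.random() < payoff_probs[arm] else 0.0
    counts[arm] += 1
    values[arm] += (reward - values[arm]) / counts[arm]  # incremental mean
    total_reward += reward

print(f"pulls per arm: {counts}, mean reward: {total_reward / 5000:.2f}")
```

With enough pulls, the highest-payoff arm dominates while the occasional exploration keeps the estimates of the other arms honest.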
• Here, cost(𝑡, 𝑎) represents the cost of selecting arm 𝑎 in the time
interval 𝑡, and 𝛽 ≥ 0 is a tuning parameter that determines how
much weight is given to the cost. If 𝛽 = 0, then cost(𝑡, 𝑎) = 1 and
therefore has no effect in the expression.

• The term 𝑟𝑐(𝑎) represents the recommendation count for 𝑎 at time
instance 𝑡, i.e., the number of times article 𝑎 has been previously
recommended by the bandit algorithm.
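The slide does not reproduce the exact cost expression. One form consistent with its description (cost equals 1 when 𝛽 = 0 and grows with rc(𝑎)) is cost(t, a) = (1 + rc(a))^𝛽; the sketch below is a hypothetical illustration of how such a cost could demote over-recommended arms, not the paper's exact formula:

```python
def cost(rc_a: int, beta: float) -> float:
    """Hypothetical cost term: equals 1 when beta == 0, and grows
    with the recommendation count rc(a) otherwise."""
    return (1 + rc_a) ** beta

def penalized_score(ucb_score: float, rc_a: int, beta: float) -> float:
    """Divide an arm's UCB score by the cost so that frequently
    recommended arms become less attractive."""
    return ucb_score / cost(rc_a, beta)

# With beta = 0 the cost is 1 and the ranking is unchanged.
print(penalized_score(0.9, rc_a=100, beta=0.0))  # -> 0.9

# With beta > 0, a heavily recommended arm is demoted below a
# slightly worse but rarely recommended one.
fresh = penalized_score(0.8, rc_a=2, beta=0.5)
stale = penalized_score(0.9, rc_a=100, beta=0.5)
print(fresh > stale)  # -> True
```

This is the mechanism by which diversity is traded off against raw clickthrough rate: larger 𝛽 spreads recommendations across more arms.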
Setup

• As an experimental test bed, the authors utilized the R6A dataset obtained from the Yahoo! Webscope
program (Yahoo!, 2019), which is a standard dataset for evaluating bandit algorithms.

• It comprises 271 articles with over 45 million rows that contain ten days of user click logs for articles
displayed on the Today Module on Yahoo! Front Page. Each row consists of a time stamp, displayed
article (with click/no-click information), and other articles in the selection pool.

• The selection pool for each row consists of 20 articles. User features are also provided in each row
along with the features for all the articles.

• The dataset does not distinguish between user and article features; rather, all features are
merged into context vectors. In bandit algorithms, the articles are considered to be the arms that are
chosen from at a particular time. DivLinUCB (Algorithm 1 in the paper) is implemented in C++ using the
Armadillo library.
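Bandit policies are typically scored on such logs with the replay method of Li et al. (2010): only the events where the policy's choice matches the logged article are counted. A minimal sketch on a synthetic log (the field layout is illustrative, not the actual R6A schema):

```python
import random

random.seed(0)

# Synthetic click log: (displayed_article, clicked, candidate_pool).
pool = ["a", "b", "c"]
log = [(random.choice(pool), random.random() < 0.3, pool) for _ in range(10000)]

def replay_ctr(policy, log):
    """Replay evaluation: an event contributes only when the policy
    picks the article that was actually displayed in the log."""
    matches, clicks = 0, 0
    for displayed, clicked, candidates in log:
        if policy(candidates) == displayed:
            matches += 1
            clicks += int(clicked)
    return clicks / matches if matches else 0.0

random_policy = lambda candidates: random.choice(candidates)
ctr = replay_ctr(random_policy, log)
print(f"replayed CTR of the random policy: {ctr:.3f}")
```

Because the logged displays were themselves randomized, the matched subset gives an unbiased estimate of the policy's online clickthrough rate.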
LinUCB and DivLinUCB

• The core idea behind LinUCB (Linear Upper Confidence Bound) is to balance exploration (trying out
less-known options) and exploitation (choosing the best-known option) while taking into account
additional information (context) about the arms and the user. This context might include user
demographics, past behavior, or any other feature that could affect the reward of pulling an arm.

• DivLinUCB, or Diversified Linear Upper Confidence Bound, is a variant of the LinUCB algorithm
designed to handle the diversity aspect in recommendation systems.

• It modifies the selection process to ensure that the set of items recommended as a whole covers a
broader range of the user's interests.

• It tries to avoid the scenario where the user is repeatedly shown very similar items, which might
happen with LinUCB if those items have the highest expected reward.
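A minimal sketch of the LinUCB scoring rule (the disjoint per-arm model), run on synthetic contexts; the arm parameters, dimensions, and reward model below are invented for illustration:

```python
import numpy as np

rng = np.random.default_rng(1)
d, alpha = 4, 2.62  # context dimension; alpha as on the slides

class LinUCBArm:
    def __init__(self, d):
        self.A = np.eye(d)    # ridge-regression Gram matrix
        self.b = np.zeros(d)  # reward-weighted context sum

    def score(self, x, alpha):
        A_inv = np.linalg.inv(self.A)
        theta = A_inv @ self.b  # estimated coefficients
        # Estimated mean reward plus an upper-confidence exploration bonus.
        return theta @ x + alpha * np.sqrt(x @ A_inv @ x)

    def update(self, x, reward):
        self.A += np.outer(x, x)
        self.b += reward * x

# Two arms; arm 1's true expected reward is much higher.
true_theta = [np.array([0.1, 0.0, 0.1, 0.0]), np.array([0.5, 0.5, 0.5, 0.5])]
arms = [LinUCBArm(d) for _ in true_theta]
pulls = [0, 0]

for t in range(2000):
    x = rng.random(d)  # shared context vector, for simplicity
    chosen = int(np.argmax([arm.score(x, alpha) for arm in arms]))
    reward = float(rng.random() < true_theta[chosen] @ x)  # Bernoulli reward
    arms[chosen].update(x, reward)
    pulls[chosen] += 1

print(f"pulls per arm: {pulls}")
```

As the confidence bonus shrinks with observations, the better arm absorbs most of the pulls; DivLinUCB would additionally penalize it once its recommendation count grew too large.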
Evaluation method and metrics

• In their numerical simulations, the authors consider the following baselines: LinUCB and Random
(random selection of arms). The developed algorithm DivLinUCB is tested for different values of 𝛽
ranging from 0.001 to 10000.

• In LinUCB and DivLinUCB, 𝛼 = 2.62 is set, corresponding to 𝑃 = 0.99 and 𝛿 = 0.01 (Walsh et al.,
2009). Moreover, for a given arm, the feature vector is considered as the concatenation of its features
and the features of the user.
• Because a single article recommendation is chosen at each time point, clickthrough rate (CTR) is
utilized as the metric for evaluating recommendation performance.
• The Gini Index is typically used as a measure for quantifying
recommendation diversity where 𝑟𝑐(𝑎) is the recommendation count
of 𝑎, i.e., the number of times 𝑎 was recommended.
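Computed over the recommendation counts rc(𝑎), the Gini index can be sketched as follows (standard formula; the counts below are hypothetical):

```python
def gini(counts):
    """Gini index of a list of recommendation counts:
    0 = perfectly even exposure, values near 1 = highly concentrated."""
    xs = sorted(counts)
    n = len(xs)
    total = sum(xs)
    # Standard form: sum of (2i - n - 1) * x_i over sorted values, i = 1..n.
    weighted = sum((2 * (i + 1) - n - 1) * x for i, x in enumerate(xs))
    return weighted / (n * total)

print(gini([100, 100, 100, 100]))       # even exposure -> 0.0
print(round(gini([0, 0, 0, 400]), 2))   # one article gets everything -> 0.75
```

A diversity-aware policy such as DivLinUCB aims to push this index down relative to plain LinUCB.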
Results
• As one of the baselines, the authors implemented the LinUCB algorithm (Li et al., 2010) and
analyzed the frequencies of the resulting recommendations. Fig. 1 depicts the empirical cumulative
distribution function of the recommendation counts, and Table 1 summarizes the resulting
descriptive statistics.
• Observe that 75% of the articles were recommended no more than 9269 times, yet the maximum
recommendation count of 42,643 was significantly higher than the 3rd quartile.
Challenges in Recommendation Systems:

• Scalability: Handling large datasets of users and items efficiently.

• Sparsity: Dealing with datasets that have a large number of items but relatively few ratings.

• Cold Start: Recommending items to new users who have no history or to new items that have no
ratings.

• Dynamic Environments: Keeping up with changing user interests and new items being added.
Conclusion

• An algorithm, DivLinUCB, that is a modification of LinUCB was introduced; it treats repetitive
recommendation of the same articles as a cost and therefore naturally increases the diversity of
recommendations.

• Experimental results on a benchmark dataset confirm that the developed algorithm is effective in
trading off clickthrough rate for diversity.
References

• Li, L., Chu, W., Langford, J., & Schapire, R. E. (2010). A contextual-bandit approach to personalized
news article recommendation. In Proceedings of the 19th International Conference on World Wide Web
(pp. 661–670).

• Balakrishnan, A., Bouneffouf, D., Mattei, N., & Rossi, F. (2018). Using contextual bandits with
behavioral constraints for constrained online movie recommendation.

• Zhu, Z., Wang, J., & Caverlee, J. (2020). Measuring and mitigating item under-recommendation bias
in personalized ranking systems. In Proceedings of the 43rd International ACM SIGIR Conference on
Research and Development in Information Retrieval (pp. 449–458). New York, NY, USA: Association
for Computing Machinery.
Thank you
