ML A3
Filtering
submitted to
by
Nov 2017
1 Introduction
Neighborhood-based collaborative filtering algorithms, also referred to as memory-based
algorithms, were among the earliest algorithms developed for collaborative filtering.
These algorithms are based on the fact that similar users display similar patterns of
rating behavior and similar items receive similar ratings. There are two primary types of
neighborhood-based algorithms:
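1. User-based collaborative filtering: In this case, the ratings provided by users similar
to a target user A are used to make recommendations for A. The predicted ratings
of A are computed as the weighted average values of these "peer group" ratings for
each item.
2. Item-based collaborative filtering: In this case, the ratings that the target user A has
given to items similar to a target item are used to predict A's rating for that item.
This is the approach used in this report.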
2 Methodology
Computing the similarity between items is the fundamental step of our recommendation
system, since we want to recommend similar items to customers based on what they
have bought before. The basic idea of similarity computation between two items i and
j is to first isolate the users who have rated both of these items and then to apply
a similarity computation technique to determine the similarity $s_{i,j}$. We used the
adjusted cosine similarity technique for this.
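The first step, isolating the users who have rated both items, can be sketched as follows. This is a minimal illustration, not the code used for this report: the nested-dictionary layout, the toy ratings, and the function name `co_rating_users` are all assumptions made for the example.

```python
# Hypothetical toy data: ratings[user][item] = rating on a 1-5 scale.
ratings = {
    "u1": {"A": 5, "B": 3, "C": 4},
    "u2": {"A": 4, "B": 2},
    "u3": {"B": 5, "C": 1},
    "u4": {"A": 2, "C": 5},
}


def co_rating_users(ratings, item_i, item_j):
    """Return, in sorted order, the users who have rated both item_i and item_j."""
    return sorted(
        user
        for user, user_ratings in ratings.items()
        if item_i in user_ratings and item_j in user_ratings
    )


print(co_rating_users(ratings, "A", "B"))  # ['u1', 'u2']
```

Only the ratings of these users enter the similarity computation for the pair (i, j).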
Adjusted Cosine Similarity
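In item-based CF, the similarity is computed along the columns, i.e., each
pair in the co-rated set corresponds to a different user. Computing similarity using the basic
cosine measure in the item-based case has one important drawback: the differences in rating
scale between different users are not taken into account. The adjusted cosine similarity
offsets this drawback by subtracting the corresponding user average from each co-rated
pair. Formally, the similarity between items i and j using this scheme is given
by:

$$s_{i,j} = \frac{\sum_{u \in U} (R_{u,i} - \bar{R}_u)(R_{u,j} - \bar{R}_u)}{\sqrt{\sum_{u \in U} (R_{u,i} - \bar{R}_u)^2}\,\sqrt{\sum_{u \in U} (R_{u,j} - \bar{R}_u)^2}}$$

where U is the set of users who have rated both i and j, $R_{u,i}$ is the rating of user u on
item i, and $\bar{R}_u$ is the average of the ratings given by user u.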
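The adjusted cosine similarity described above can be sketched in plain Python. This is an illustrative sketch under stated assumptions, not the implementation used for this report: the nested-dictionary data layout, the toy ratings, and the function names are hypothetical, each user's average is taken over all items that user has rated, and returning 0.0 when a denominator vanishes is a convention chosen for the example.

```python
import math

# Hypothetical toy data: ratings[user][item] = rating on a 1-5 scale.
ratings = {
    "u1": {"A": 5, "B": 3, "C": 4},
    "u2": {"A": 4, "B": 2},
    "u3": {"B": 5, "C": 1},
    "u4": {"A": 2, "C": 5},
}


def user_means(ratings):
    """Average rating of each user over all items that user has rated."""
    return {u: sum(r.values()) / len(r) for u, r in ratings.items() if r}


def adjusted_cosine(ratings, item_i, item_j):
    """Adjusted cosine similarity between two items over their co-rating users."""
    means = user_means(ratings)
    num = den_i = den_j = 0.0
    for user, r in ratings.items():
        if item_i in r and item_j in r:  # only users who rated both items
            di = r[item_i] - means[user]  # subtract the user's average
            dj = r[item_j] - means[user]
            num += di * dj
            den_i += di * di
            den_j += dj * dj
    if den_i == 0.0 or den_j == 0.0:
        return 0.0  # assumption: undefined similarity is reported as 0
    return num / (math.sqrt(den_i) * math.sqrt(den_j))


print(round(adjusted_cosine(ratings, "A", "B"), 3))  # -1.0
```

In the toy data, both users who rated A and B scored A above and B below their own average, so the two items come out as perfectly dissimilar.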
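Predictions are computed as a weighted sum: the predicted rating of user u on item i is
the sum of the ratings given by u to the items similar to i, each weighted by the
corresponding similarity $s_{i,j}$:

$$P_{u,i} = \frac{\sum_{j \in N} s_{i,j} R_{u,j}}{\sum_{j \in N} |s_{i,j}|}$$

where N is the set of items similar to i that user u has rated and $R_{u,j}$ is the rating of
user u on item j. This approach tries to capture how the active user rates the similar items. The
weighted sum is then scaled by the sum of the similarity terms to make sure the prediction
stays within the rating range.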
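The scaled weighted sum described above can be sketched as follows. This is a minimal sketch, not the code used for this report: the function name `predict_weighted_sum`, the input layout, and the similarity values are hypothetical, and returning `None` when no similar item has been rated is a convention chosen for the example.

```python
def predict_weighted_sum(user_ratings, similarities):
    """Predict the active user's rating for a target item.

    user_ratings: {item: rating} given by the active user.
    similarities: {item: similarity between that item and the target item}.
    """
    num = 0.0
    den = 0.0
    for item, sim in similarities.items():
        if item in user_ratings:  # only similar items the user has rated
            num += sim * user_ratings[item]
            den += abs(sim)
    if den == 0.0:
        return None  # assumption: no prediction without rated neighbours
    return num / den


# Hypothetical inputs: the user rated B and C; D is similar but unrated.
prediction = predict_weighted_sum({"B": 4, "C": 2}, {"B": 0.8, "C": 0.2, "D": 0.5})
print(round(prediction, 2))  # 3.6
```

When all similarities are non-negative, the result is a weighted average of the user's own ratings, so it cannot fall outside the range of those ratings.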
3 Results
We evaluated the accuracy of the system by comparing the numerical recommendation
scores against the actual user ratings for the user-item pairs in the test dataset. For this,
Root Mean Squared Error (RMSE) was used. Assuming a total of n tuples in the test
dataset, with $e_i$ denoting the error between the predicted and the actual value of the i-th
user-item pair, RMSE is calculated as follows:

$$\mathrm{RMSE} = \sqrt{\frac{1}{n} \sum_{i=1}^{n} e_i^2}$$
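The RMSE computation can be sketched in a few lines of Python. The function name `rmse` and the sample values are hypothetical and serve only as an illustration; they are not results from this report.

```python
import math


def rmse(predicted, actual):
    """Root mean squared error between two equal-length sequences of ratings."""
    if len(predicted) != len(actual) or len(predicted) == 0:
        raise ValueError("inputs must be non-empty and of equal length")
    squared_errors = [(p - a) ** 2 for p, a in zip(predicted, actual)]
    return math.sqrt(sum(squared_errors) / len(squared_errors))


# Hypothetical example: every prediction is off by exactly one rating point.
print(rmse([1, 2, 3], [2, 3, 4]))  # 1.0
```

Because the errors are squared before averaging, a few large errors raise RMSE more than many small ones.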
4 Conclusion
Recommender systems are a powerful new technology for extracting additional value for
a business from its user databases. These systems help users find items they want to
buy from a business. Conversely, they help the business by generating more sales.
Recommender systems are rapidly becoming a crucial tool in e-commerce on the Web. Our
results show that item-based techniques hold the promise of allowing CF-based algorithms
to scale to large data sets and at the same time produce high-quality recommendations.
5 Division of Work
The division of work among team members is as follows:
Accuracy computation.
This report.
Prediction computation.
Accuracy computation.
Prediction computation.
Prediction computation.