2-Recommender Systems - Section B - Annotated PDF
Personalized Recommendation
Prof. Rajib L. Saha
Personalized Retail
Rhythmic syncopation
Dreamy Drawbacks?
lyrics
These entries could be numbers (ratings) as well
Cosine-based similarity
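\[ \cos(A, B) = \frac{A \cdot B}{|A| \, |B|} \]
• $A = (a_1, a_2, \dots, a_N)$
• $B = (b_1, b_2, \dots, b_N)$
• $A \cdot B = a_1 b_1 + a_2 b_2 + \dots + a_N b_N$
• $|A| = (a_1^2 + a_2^2 + \dots + a_N^2)^{1/2}$; $|B| = (b_1^2 + b_2^2 + \dots + b_N^2)^{1/2}$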
Example:
      1  2  3  4
A     3  5  0  1
B     1  4  0  0
\[ \cos(A, B) = \frac{3 \cdot 1 + 5 \cdot 4 + 0 \cdot 0 + 1 \cdot 0}{(3^2 + 5^2 + 0^2 + 1^2)^{1/2} \, (1^2 + 4^2 + 0^2 + 0^2)^{1/2}} = 0.94 \]
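The worked example can be checked with a few lines of code. This is a minimal sketch using only the Python standard library; the function name `cosine_similarity` is an illustrative choice, and the sketch assumes neither vector is all zeros (otherwise the denominator is 0).

```python
import math

def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length rating vectors."""
    dot = sum(x * y for x, y in zip(a, b))        # A.B
    norm_a = math.sqrt(sum(x * x for x in a))     # |A|
    norm_b = math.sqrt(sum(y * y for y in b))     # |B|
    return dot / (norm_a * norm_b)

# Vectors A and B from the example.
A = [3, 5, 0, 1]
B = [1, 4, 0, 0]

similarity = cosine_similarity(A, B)
print(round(similarity, 2))  # 0.94
```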
Correlation-based Similarity
      1  2  3  4
A     3  5  0  1
B     1  4  0  0
\[ \mathrm{Corr}_{AB} = \frac{\mathrm{Covariance}(A, B)}{\mathrm{Stdev}(A) \, \mathrm{Stdev}(B)} \]
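For the same two vectors, the correlation can be computed as sketched below. The function name `pearson_correlation` is an illustrative choice. Whether covariance and standard deviation divide by n or by n - 1 does not matter here, because the factor cancels in the ratio; the sketch assumes neither vector is constant.

```python
import math

def pearson_correlation(a, b):
    """Covariance(A, B) / (Stdev(A) * Stdev(B)) for equal-length vectors."""
    n = len(a)
    mean_a = sum(a) / n
    mean_b = sum(b) / n
    # Sums of products of deviations; the 1/n (or 1/(n-1)) factors cancel.
    cov = sum((x - mean_a) * (y - mean_b) for x, y in zip(a, b))
    var_a = sum((x - mean_a) ** 2 for x in a)
    var_b = sum((y - mean_b) ** 2 for y in b)
    return cov / math.sqrt(var_a * var_b)

A = [3, 5, 0, 1]
B = [1, 4, 0, 0]

corr = pearson_correlation(A, B)
print(round(corr, 2))  # 0.93
```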
n customers × p items
While computing similarity between Persons 1 & 2, Item 2's rating cannot be included, since Person 2 hasn't bought Item 2.
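One way to apply this rule is to compute similarity only over items that both persons have rated. The sketch below marks a missing rating with `None`; the ratings themselves are hypothetical and are not taken from the slide.

```python
import math

def corated_cosine(ratings_u, ratings_v):
    """Cosine similarity over items rated by both users (None = not rated)."""
    pairs = [(x, y) for x, y in zip(ratings_u, ratings_v)
             if x is not None and y is not None]
    if not pairs:
        return None  # no co-rated items, similarity is undefined
    dot = sum(x * y for x, y in pairs)
    norm_u = math.sqrt(sum(x * x for x, _ in pairs))
    norm_v = math.sqrt(sum(y * y for _, y in pairs))
    if norm_u == 0 or norm_v == 0:
        return None
    return dot / (norm_u * norm_v)

# Hypothetical ratings for Items 1..3; Person 2 has not bought Item 2.
person_1 = [4, 5, 1]
person_2 = [5, None, 2]

sim = corated_cosine(person_1, person_2)  # uses Items 1 and 3 only
```

Item 2 is dropped from both vectors before the dot product and the norms are computed, so Person 1's rating of 5 has no effect on the result.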
Source: Longtail.com
Supply-side drivers:
• centralized warehousing with more offerings
• lower inventory cost of electronic products
Demand-side drivers:
• search engines
• recommender systems
What happens to the diversity?
• Total sales of niche items go up in absolute terms
• So do the sales of popular items
• However, the market share of niche products goes down (Why?)
– While niche products benefit, popular products benefit even more
– ‘Rich-get-richer’ phenomenon
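A small numeric illustration of the "Why?": the sales figures below are purely hypothetical, chosen only to show that niche sales can rise in absolute terms while their share of the market falls.

```python
def niche_share(niche_sales, popular_sales):
    """Fraction of total sales that comes from niche items."""
    return niche_sales / (niche_sales + popular_sales)

# Hypothetical unit sales before and after recommendations are introduced.
share_before = niche_share(niche_sales=20, popular_sales=80)    # 20 / 100
share_after = niche_share(niche_sales=25, popular_sales=120)    # 25 / 145

print(f"{share_before:.1%} -> {share_after:.1%}")  # 20.0% -> 17.2%
```

Niche sales grow from 20 to 25 units, but popular sales grow faster, so the niche share shrinks.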
– The ratings of AM1 and AM2 are not even included while computing similarity!
$U_{N \times r}$: User-feature matrix
$V_{n \times r}$: Item-feature matrix
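\[ X = U \, \Sigma \, V^T \]
\[
\begin{bmatrix} 1 & 0 & 0 & 0 & 2 \\ 0 & 0 & 3 & 0 & 0 \\ 0 & 0 & 0 & 0 & 0 \\ 0 & 4 & 0 & 0 & 0 \end{bmatrix}
=
\begin{bmatrix} 0 & 0 & 1 & 0 \\ 0 & 1 & 0 & 0 \\ 0 & 0 & 0 & -1 \\ 1 & 0 & 0 & 0 \end{bmatrix}
\times
\begin{bmatrix} 4 & 0 & 0 & 0 \\ 0 & 3 & 0 & 0 \\ 0 & 0 & 2.24 & 0 \\ 0 & 0 & 0 & 0 \end{bmatrix}
\times
\begin{bmatrix} 0 & 1 & 0 & 0 & 0 \\ 0 & 0 & 1 & 0 & 0 \\ 0.45 & 0 & 0 & 0 & 0.89 \\ 0 & 0 & 0 & 1 & 0 \end{bmatrix}
\]
k = 3
\[
\begin{bmatrix} 0 & 0 & 1 \\ 0 & 1 & 0 \\ 0 & 0 & 0 \\ 1 & 0 & 0 \end{bmatrix}
\times
\begin{bmatrix} 4 & 0 & 0 \\ 0 & 3 & 0 \\ 0 & 0 & 2.24 \end{bmatrix}
\times
\begin{bmatrix} 0 & 1 & 0 & 0 & 0 \\ 0 & 0 & 1 & 0 & 0 \\ 0.45 & 0 & 0 & 0 & 0.89 \end{bmatrix}
=
\begin{bmatrix} 1 & 0 & 0 & 0 & 2 \\ 0 & 0 & 3 & 0 & 0 \\ 0 & 0 & 0 & 0 & 0 \\ 0 & 4 & 0 & 0 & 0 \end{bmatrix}
\]
Compressed size: 4*3 + 3 + 3*5 = 30 bytes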
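The k = 3 factorization can be verified by multiplying the three factors back together. The sketch below uses plain Python lists and a small helper, `matmul`, written for this illustration. It assumes the rounded entries 2.24, 0.45 and 0.89 stand for √5, 1/√5 and 2/√5, and it reads the slide's size count as the number of stored values.

```python
import math

def matmul(a, b):
    """Multiply two matrices given as lists of rows."""
    return [[sum(a[i][k] * b[k][j] for k in range(len(b)))
             for j in range(len(b[0]))]
            for i in range(len(a))]

s5 = math.sqrt(5)  # assumed exact value behind the rounded 2.24

U_k = [[0, 0, 1], [0, 1, 0], [0, 0, 0], [1, 0, 0]]           # 4 x 3
Sigma_k = [[4, 0, 0], [0, 3, 0], [0, 0, s5]]                 # 3 x 3
Vt_k = [[0, 1, 0, 0, 0], [0, 0, 1, 0, 0],
        [1 / s5, 0, 0, 0, 2 / s5]]                           # 3 x 5

X = [[1, 0, 0, 0, 2], [0, 0, 3, 0, 0], [0, 0, 0, 0, 0], [0, 4, 0, 0, 0]]

# Reconstruct X from the truncated factors.
X_hat = matmul(matmul(U_k, Sigma_k), Vt_k)

# Stored values: U_k entries + diagonal of Sigma_k + Vt_k entries.
stored_values = 4 * 3 + 3 + 3 * 5
print(stored_values)  # 30
```

Because the fourth singular value is 0, dropping it loses nothing: `X_hat` matches `X` up to floating-point rounding.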
Dealing with New Users
$R = U \Sigma V^T$
• $r_i$ = ith row of rating matrix = item ratings of user i
• $u_i$ = ith row of user-feature matrix = feature ratings of user i
• $u_{new} = r_{new} V \Sigma^{-1}$
– Let the new users rate a few items and use those partial ratings to compute feature ratings
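The fold-in formula can be sketched as follows, reusing the k = 3 factors from the SVD example. The function name `fold_in` and the new user's ratings are hypothetical; unrated items are entered as 0, which is an assumption of this sketch, and only nonzero singular values are kept so that $\Sigma^{-1}$ exists.

```python
import math

s5 = math.sqrt(5)

# Item-feature matrix V (5 items x 3 features) and the singular values.
V = [[0, 0, 1 / s5],
     [1, 0, 0],
     [0, 1, 0],
     [0, 0, 0],
     [0, 0, 2 / s5]]
sigma = [4, 3, s5]

def fold_in(r_new, V, sigma):
    """u_new = r_new V Sigma^{-1}: project item ratings into feature space."""
    return [sum(r * V[j][f] for j, r in enumerate(r_new)) / sigma[f]
            for f in range(len(sigma))]

# Sanity check: an existing user's ratings map back to that user's row of U.
u_existing = fold_in([1, 0, 0, 0, 2], V, sigma)   # close to [0, 0, 1]

# Hypothetical new user who rated only items 2 and 3.
u_new = fold_in([0, 2, 3, 0, 0], V, sigma)        # [0.5, 1.0, 0.0]
```

The sanity check works because $R V = U \Sigma$, so multiplying a row of $R$ by $V \Sigma^{-1}$ returns the corresponding row of $U$.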
Dealing with missing values before applying SVD
• Impute the missing values in the rating matrix with the user mean or the item mean
• If the rating matrix is already normalized (mean-subtracted), missing values can simply be set to zero
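Both options can be sketched on a tiny hypothetical matrix (rows are users, `None` marks a missing rating). The function names are illustrative; item-mean imputation would work the same way on columns.

```python
def user_mean(row):
    """Mean of the observed ratings in one user's row (0.0 if none)."""
    observed = [r for r in row if r is not None]
    return sum(observed) / len(observed) if observed else 0.0

def impute_with_user_mean(ratings):
    """Replace each missing rating with that user's mean rating."""
    return [[user_mean(row) if r is None else r for r in row]
            for row in ratings]

def mean_center(ratings):
    """Subtract each user's mean; missing ratings become 0."""
    return [[0.0 if r is None else r - user_mean(row) for r in row]
            for row in ratings]

# Hypothetical 2 users x 3 items rating matrix.
ratings = [[5, None, 3],
           [None, 2, 4]]

filled = impute_with_user_mean(ratings)   # [[5, 4.0, 3], [3.0, 2, 4]]
centered = mean_center(ratings)           # [[1.0, 0.0, -1.0], [0.0, -1.0, 1.0]]
```

In the mean-centered version, a zero means "no deviation from this user's average", which is why zeros are a reasonable fill for missing cells there.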
Some guidelines on evaluating a recommender system
• Randomly divide the user-product rating matrix into training and test sets
• Predict Rating[user, product] for each cell in the test user-product rating matrix using one of the methods (as shown on the next page)
• Calculate RMSE/MAE/MAPE between the predicted and actual test ratings
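A minimal sketch of these steps, using only the standard library. The cell list, the 20% test fraction, the seed, and the predicted values are all hypothetical; in practice the predictions would come from the chosen recommendation method. MAPE is undefined when an actual rating is 0, so the sketch assumes nonzero ratings.

```python
import math
import random

def split_cells(cells, test_fraction=0.2, seed=0):
    """Randomly split observed (user, product, rating) cells into train/test."""
    shuffled = list(cells)
    random.Random(seed).shuffle(shuffled)
    n_test = int(len(shuffled) * test_fraction)
    return shuffled[n_test:], shuffled[:n_test]

def rmse(actual, predicted):
    """Root mean squared error."""
    return math.sqrt(sum((a - p) ** 2 for a, p in zip(actual, predicted))
                     / len(actual))

def mae(actual, predicted):
    """Mean absolute error."""
    return sum(abs(a - p) for a, p in zip(actual, predicted)) / len(actual)

def mape(actual, predicted):
    """Mean absolute percentage error (assumes no actual rating is 0)."""
    return 100 * sum(abs(a - p) / abs(a)
                     for a, p in zip(actual, predicted)) / len(actual)

# Hypothetical observed cells of the user-product rating matrix.
cells = [("u1", "p1", 4), ("u1", "p2", 3), ("u2", "p1", 5),
         ("u2", "p3", 2), ("u3", "p2", 4)]
train, test = split_cells(cells)

# Hypothetical actual vs. predicted ratings for four test cells.
actual = [4, 3, 5, 2]
predicted = [3.5, 3, 4, 3]

print(rmse(actual, predicted))   # 0.75
print(mae(actual, predicted))    # 0.625
```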
Calculate R[user, product] = ?
– User-user collaborative filtering
• Predict the rating of User u for Product p in the test data
• Say p has been bought by $u_i$, $u_j$, and $u_k$ and rated as $r_i$, $r_j$, and $r_k$ in the training data
• Say sim(u, $u_i$) = $s_i$, sim(u, $u_j$) = $s_j$, and sim(u, $u_k$) = $s_k$
• $R[u, p] = \dfrac{r_i s_i + r_j s_j + r_k s_k}{s_i + s_j + s_k}$
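The similarity-weighted average can be sketched as below. The function name `predict_rating` and the three ratings and similarities are hypothetical; the sketch returns `None` when the similarities sum to 0, since the formula is undefined in that case.

```python
def predict_rating(neighbor_ratings, neighbor_sims):
    """Similarity-weighted average of the neighbors' ratings for product p."""
    total_sim = sum(neighbor_sims)
    if total_sim == 0:
        return None  # no usable neighbors
    weighted = sum(r * s for r, s in zip(neighbor_ratings, neighbor_sims))
    return weighted / total_sim

# Hypothetical training data: ratings r_i, r_j, r_k of product p and
# similarities s_i, s_j, s_k between user u and those three users.
ratings_for_p = [4, 5, 2]
sims_to_u = [0.9, 0.5, 0.1]

predicted = predict_rating(ratings_for_p, sims_to_u)
print(round(predicted, 2))  # 4.2
```

The most similar user (similarity 0.9, rating 4) dominates the prediction, while the least similar user's low rating barely moves it.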