Matrix Factorisation - Part 7 | by Rakesh4real | Fnplus Club | Medium
Table of Contents:
1. Introduction and Recommendation Framework
6. KNN Recommendations
7. Matrix Factorisation
10. AutoRecs
Source: https://medium.com/fnplus/matrix-factorisation-d3cd9c4d820a (captured 10/7/21, 9:31 PM)
Collaborative filtering is a good method. But if it works that well, why are we seeking an
alternative approach?
Good results are obtained only if the conditions below are satisfied.
MATRIX FACTORISATION
It has many subcategories of techniques.
CREEPY! It manages to find broad features of users/items on its own (e.g. action or
romantic). The math doesn't know what to call the newly found features, which are simply
described by matrices.
MAIN IDEA:
For example, we may say Bob likes action movies as well as comedy movies. The same
can be expressed as
PCA is a statistical procedure which reduces the dimensionality of our user-item matrix
while losing very little of the important information!
It is used as,
Note: Often, the dimensions it finds correspond to features humans have learnt to
associate with items (e.g. how 'action-ey' a movie is).
Whatever it is about movies that causes individuals to rate them differently, PCA extracts
those 'latent features'.
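As a concrete sketch of the idea (the ratings matrix below is made up for illustration): column-centre the user-item matrix and take an SVD; the strongest directions are the latent features PCA finds, and here two of them capture almost all the variance because the users split cleanly into two taste groups.

```python
import numpy as np

# Rows = users, columns = items; ratings on a 1-5 scale (fully observed here).
R = np.array([
    [5.0, 4.0, 1.0, 1.0],
    [4.0, 5.0, 2.0, 1.0],
    [1.0, 1.0, 5.0, 4.0],
    [1.0, 2.0, 4.0, 5.0],
])

# Centre each column, then use SVD to find the principal directions.
R_centred = R - R.mean(axis=0)
U, s, Vt = np.linalg.svd(R_centred, full_matrices=False)

k = 2                               # keep the 2 strongest latent features
scores = U[:, :k] * s[:k]           # users expressed in latent-feature space
explained = (s[:k] ** 2).sum() / (s ** 2).sum()

print(scores.shape)                 # (4, 2): 4 users, 2 latent features
print(round(float(explained), 3))   # fraction of variance the 2 features keep
```

Users with similar tastes end up close together in the 2-dimensional latent space, even though nothing told PCA what "action" or "romance" means.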
If R (the original matrix) has some missing values, we can reconstruct R by filling in the blank
values!!
SIGMA MATRIX
It is a simple diagonal matrix used only to bring the values we get onto the proper scale.
We can multiply this sigma matrix into M or U, and R will still be just the product of two
matrices.
We can even predict ratings using dot products from the reconstruction formula
described above.
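A minimal sketch of that reconstruction on a made-up, fully observed matrix: NumPy's SVD gives R as M times the diagonal sigma matrix times U-transpose, sigma can be folded into M so R is a product of just two matrices, and a single rating comes back as a dot product.

```python
import numpy as np

# Toy rating matrix (hypothetical, fully observed for illustration).
R = np.array([
    [5.0, 4.0, 1.0],
    [4.0, 5.0, 2.0],
    [1.0, 2.0, 5.0],
])

M, sigma, Ut = np.linalg.svd(R)          # R = M @ diag(sigma) @ Ut
R_rebuilt = M @ np.diag(sigma) @ Ut

# Fold the diagonal sigma matrix into M: R is now a product of two matrices.
M_scaled = M @ np.diag(sigma)
R_two_factor = M_scaled @ Ut

# A single predicted rating is a dot product of a user row and an item column.
r_01 = M_scaled[0] @ Ut[:, 1]

print(np.allclose(R, R_rebuilt))      # True
print(np.allclose(R, R_two_factor))   # True
print(round(float(r_01), 2))          # recovers the original rating, 4.0
```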
But wait! Before applying PCA to the original matrix R, how do we deal with missing
values?
You may fill missing values with averages of some sort… but there is a better way!
Assume we have some ratings for any given row/column in M and U-Transpose
Find the values of those complete rows and columns that best minimise the error on the
known ratings in R
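A sketch of that learning step, with made-up ratings: simple stochastic gradient descent nudges the factor matrices M and U so their dot products match the known ratings only. The update rule and every constant here are illustrative choices, not the article's exact algorithm.

```python
import numpy as np

rng = np.random.default_rng(0)

n_users, n_items, k = 4, 5, 2
# (user, item, rating) triples: the only entries of R we actually know.
known = [(0, 0, 5.0), (0, 1, 4.0), (1, 0, 4.0), (1, 3, 1.0),
         (2, 2, 5.0), (2, 4, 4.0), (3, 2, 4.0), (3, 3, 5.0)]

M = rng.normal(scale=0.1, size=(n_users, k))   # user factors
U = rng.normal(scale=0.1, size=(n_items, k))   # item factors

lr, reg = 0.05, 0.02
for epoch in range(200):
    for u, i, r in known:
        err = r - M[u] @ U[i]                   # error on a known rating
        M[u] += lr * (err * U[i] - reg * M[u])  # gradient steps
        U[i] += lr * (err * M[u] - reg * U[i])

# Training error on the known ratings should now be small...
rmse = np.sqrt(np.mean([(r - M[u] @ U[i]) ** 2 for u, i, r in known]))
# ...and every missing cell of R gets a prediction for free.
pred_0_2 = M[0] @ U[2]
print(round(float(rmse), 3))
print(round(float(pred_0_2), 2))
```

Notice that nothing was ever filled in: the missing cells are simply never part of the loss.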
Apache Spark uses a different technique called ALS (Alternating Least Squares)
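The article only name-drops ALS, so for contrast here is a toy sketch of the idea on a small dense matrix: fix the item factors and solve a ridge least-squares problem exactly for the user factors, then swap, and repeat. Spark's real implementation is built for large sparse data; everything below is illustrative.

```python
import numpy as np

rng = np.random.default_rng(1)
R = np.array([[5.0, 4.0, 1.0],
              [4.0, 5.0, 2.0],
              [1.0, 2.0, 5.0],
              [2.0, 1.0, 4.0]])

k, reg = 2, 0.1
M = rng.normal(size=(4, k))     # user factors
U = rng.normal(size=(3, k))     # item factors
I = np.eye(k)

for _ in range(20):
    # Each half-step is an ordinary (ridge) least-squares solve, hence the name.
    M = R @ U @ np.linalg.inv(U.T @ U + reg * I)
    U = R.T @ M @ np.linalg.inv(M.T @ M + reg * I)

rmse = np.sqrt(np.mean((R - M @ U.T) ** 2))
print(round(float(rmse), 3))    # small: the rank-2 product approximates R well
```

Because each half-step has a closed-form solution, ALS parallelises well, which is one reason Spark favours it over SGD.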
Note: You might be confused here because we are talking about learning the values in the
matrices M and U.
Actually, we are predicting ratings rather than computing them directly, which is what SVD
does. We are not doing real SVD recommendations because SVD can't operate on missing
data. Hence, it is an 'SVD-inspired algorithm', not SVD itself. The winner of the Netflix Prize
was a variation of SVD called SVD++
TIPS:
You can see the source code for the SVD algorithm used in surpriselib on GitHub, but it
is too complex to digest easily.
Never write your own algorithm. The odds are too high that you will mess up
somewhere. Use a third-party library that has been used and validated by others.
In SVD++, merely rating an item at all is an indication of some sort of interest in the
item, no matter what the rating was!
I. Factorisation Machines
Well suited for predicting ratings/clicks in a recommendation system
Handles sparse data with ease, unlike SVD, which has to be shoehorned into the problem
II. TimeSVD++
Used for predicting the next item in a series of events.
Use PLSA on movie titles and descriptions and match them up with users
As these are content-based, they don't do well on their own; combine them with user
behaviour data.
HYPERPARAMETER TUNING (SVD)
Hyperparameters are parameters chosen by the ML engineer (rather than learnt from the
data) in order to make effective predictions. Finding the optimal hyperparameters is called
hyperparameter tuning.
Example Code
The code below shows how we can tune hyperparameters in surpriselib using the
GridSearchCV package.
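The article's original code listing did not survive the page capture. As a stand-in, this self-contained sketch shows what a grid search does under the hood: try every combination of hyperparameter values, score each on held-out ratings, and keep the best. All data, names, and constants here are illustrative.

```python
import itertools
import numpy as np

# Made-up (user, item, rating) triples, split into train and validation sets.
train = [(0, 0, 5.0), (0, 1, 4.0), (1, 0, 4.0), (1, 2, 1.0),
         (2, 1, 2.0), (2, 2, 5.0), (3, 0, 1.0), (3, 2, 4.0)]
valid = [(0, 2, 1.0), (1, 1, 5.0), (3, 1, 1.0)]
n_users, n_items = 4, 3

def fit_and_score(k, lr, reg, epochs=150):
    """Train a tiny SGD factoriser and return RMSE on the validation ratings."""
    rng = np.random.default_rng(2)                # fixed seed: reproducible runs
    M = rng.normal(scale=0.1, size=(n_users, k))  # user factors
    U = rng.normal(scale=0.1, size=(n_items, k))  # item factors
    for _ in range(epochs):
        for u, i, r in train:
            err = r - M[u] @ U[i]
            M[u] += lr * (err * U[i] - reg * M[u])
            U[i] += lr * (err * M[u] - reg * U[i])
    return float(np.sqrt(np.mean([(r - M[u] @ U[i]) ** 2 for u, i, r in valid])))

# The grid: every possible combination of these values gets evaluated.
param_grid = {"k": [1, 2], "lr": [0.01, 0.05], "reg": [0.02, 0.1]}
best = min(
    (dict(zip(param_grid, combo))
     for combo in itertools.product(*param_grid.values())),
    key=lambda p: fit_and_score(**p),
)
print(best)   # the winning hyperparameter combination
```

In surpriselib itself this is wrapped up for you: as far as I recall, you pass an algorithm class such as SVD and a param_grid dict to surprise.model_selection.GridSearchCV, call fit on the data, and read off best_score and best_params.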
Hyperparameter Tuning
The GridSearchCV package:
Defines a grid of the different parameter values you want to try (it evaluates every
possible combination)
Exciting results!
Not only applicable to movies, but to books, music, credit cards and many more as well!
Because the user has rated only some items, weights exist (and can be calculated) only for
those items: a 'sparse' aggregation
Extend this to the entire user-item rating matrix with the formula below
(L1 Norm Regularisation)
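The formula image itself did not survive the capture. As a hedged placeholder only, a representative factorisation objective with an L1 penalty (an assumption about its general shape, not the article's exact formula) looks like:

```latex
\min_{M,\,U} \;\; \sum_{(u,i)\,\in\,\text{known}} \bigl( r_{ui} - M_u \cdot U_i \bigr)^2
\;+\; \lambda \bigl( \lVert M \rVert_1 + \lVert U \rVert_1 \bigr)
```

The L1 norm drives many factor weights exactly to zero, which is consistent with the 'sparse' aggregation point above.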
NEXT>>>