Sahu, Dwivedi - 2019 - User Profile As A Bridge in Cross-Domain Recommender Systems For Sparsity Reduction


Applied Intelligence

https://doi.org/10.1007/s10489-018-01402-3

User profile as a bridge in cross-domain recommender systems for sparsity reduction
Ashish Kumar Sahu1 · Pragya Dwivedi1

© Springer Science+Business Media, LLC, part of Springer Nature 2019

Abstract
In the past two decades, recommender systems have been successfully applied in many e-commerce companies. One of the most promising techniques for generating personalized recommendations is collaborative filtering. However, it suffers from the sparsity problem. To alleviate this problem, cross-domain recommender systems came into existence, in which a transfer learning mechanism is applied to exploit knowledge from other related domains. When applying transfer learning, some information should overlap between the source and target domains. Several attempts have been made to enhance the performance of collaborative filtering with the help of other related domains within the cross-domain recommender systems framework. Nevertheless, exploiting the knowledge of other domains remains a challenging and open problem in recommender systems. In this paper, we propose a method, namely User Profile as a Bridge in Cross-domain Recommender Systems (UP-CDRSs), for transferring knowledge between domains through user profiles. Firstly, we build a user profile using the demographic information of a user, explicit ratings, and the content information of user-rated items. Thereafter, a probabilistic graphical model is employed to learn the latent factors of users and items in both domains by maximizing the posterior probability. Finally, the prediction for an unrated item is estimated by the inner product of the corresponding user and item latent factors. To validate our proposed UP-CDRSs method, we conduct a series of experiments at various sparsity levels using a cross-domain dataset. The results demonstrate that our proposed method substantially outperforms other methods, both with and without transfer learning, in terms of accuracy.

Keywords Cross-domain recommender systems · Recommender systems · Transfer learning · User profile ·
Matrix factorization

Ashish Kumar Sahu: sahuashishcs@gmail.com · Pragya Dwivedi: pragyadwi86@mnnit.ac.in
1 Motilal Nehru National Institute of Technology Allahabad, Prayagraj, 211004, India

1 Introduction

With the development and explosion of Internet technologies and the continuous growth of accessibility of the World Wide Web, the amount of digital information generated by humans increases exponentially. We become easily overwhelmed by this huge amount of information and are unable to find what we really desire. Recommender Systems (RSs) [1, 2] are software tools that help us to find the most relevant items out of the millions of items in a database. Several techniques [3–5] are used for generating personalized recommendations; among them, Collaborative Filtering (CF) is one of the most promising techniques of recent years. CF focuses on user preference data which are provided by the user explicitly, such as numerical ratings, likes/dislikes, etc. It can be classified into two categories: memory-based and model-based. The former category of methods focuses on a similarity strategy between co-rated users or items, followed by top-K neighbor (kNN) selection, and then a weighted average strategy is used for prediction. There exist multiple variations [5–8] of memory-based CF, based on modifications of the similarity measure and of the top-K neighbor selection.

In the case of model-based CF, or the Latent Factor Model (LFM), the model is first constructed and then predictions can be made. One of the leading methods in this category is Matrix Factorization (MF) [9], in which both items and users are characterized by a small number of factors inferred from the user-item rating matrix. Multiple variations [10–14] of MF have been proposed by several researchers, each with its limitations.

Although CF has gained great success in recent years, one of its major problems is data sparsity, because users provide their feedback (in terms of numerical ratings) on only a limited number of items out of millions of items.
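The latent-factor idea sketched above can be made concrete in a few lines of code. This is a toy illustration with random factors, not the authors' implementation; all names and sizes here are invented for the example:

```python
import numpy as np

rng = np.random.default_rng(0)
n_users, n_items, f = 5, 6, 3    # toy sizes; f is the number of latent factors

U = rng.random((n_users, f))     # one row of latent factors per user
V = rng.random((n_items, f))     # one row of latent factors per item

# A predicted rating is the inner product of the corresponding user and
# item latent-factor vectors; the full prediction matrix is therefore
# a single matrix product.
R_hat = U @ V.T
print(R_hat.shape)               # (5, 6)
```

In model-based CF, U and V would of course be learned from the observed entries of the rating matrix rather than drawn at random.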

Using sparse data for generating predictions, followed by recommendations, may degrade the performance of the system. To address this problem of CF, several authors have provided solutions that use additional in-domain feedback such as likes/dislikes [15, 16], user reviews [17], history records [18], etc., and have tried to mitigate the sparsity problem.

Rather than focusing on heterogeneous in-domain feedback, another solution to this problem is Cross-domain Recommender Systems (CDRSs) [19–22], in which knowledge is leveraged from other related additional domains (source domains) to improve the performance of the target domain. This strategy in CDRSs can be realized through a transfer learning mechanism [23]. Transfer learning is a paradigm of machine learning which aims to transfer useful knowledge from one or more source domains to the target domain in order to increase the performance of the target domain, under some assumptions. When applying transfer learning in CDRSs, the two assumptions are: 1) the source domain must be denser than the target one; 2) some information must overlap between the source and target domains. According to [24], CDRSs models can be categorized into three types based on the overlap of users/items between the source and target domains: fully overlapping users/items, partially overlapping users/items, and non-overlapping users and items.

This paper focuses on the third category of CDRSs models, i.e., non-overlapping users and items. In this category, two approaches have been proposed: tag information transfer [25–28] and rating pattern knowledge transfer [19]. In the first approach, tags (lightweight user reviews in text form) should be common to both domains. A limitation of tag-based transfer learning methods in the CDRSs model is that it is too expensive to find overlapping tags between domains.

The other approach in the non-overlapping users and items category is rating pattern knowledge transfer. Li et al. [19] have proposed the Codebook Transfer (CBT) method, wherein a compact rating pattern is extracted from a source domain and then transferred to the target domain. The authors focused on numerical rating feedback only and assumed that both domains share the same rating distribution. But in a practical scenario, every domain has its own rating distribution, which we cannot assume to be the same.

In this paper, we propose a novel method, namely User Profile as a Bridge in Cross-domain Recommender Systems (UP-CDRSs), for exploiting the knowledge of a source domain in order to enhance the prediction accuracy of the target domain. This paper focuses on an additional domain in which neither users nor items overlap between domains, and tag information is also not available in either domain. According to [23], some information should overlap between domains; therefore, to build the bridge, we focus on implicit information of users as well as items. On the user side, this internal information may be demographic data such as the user's age, gender, occupation, etc.; on the item side, it may be the content information of an item (e.g., the set of genres if the item is a movie). For instance, if we focus on two different movie domains, these types of information may be common to both domains. The novelty of the proposed method over existing methods is that we are able to exploit the knowledge of an additional domain even when no explicit information overlaps between the domains, i.e., no overlapping users and items, no overlapping tags, etc.

In our proposed method, we combine the information of both domains through graphical model theory [29], in which user latent factors, item latent factors, ratings, and the similarity of user profiles between distinct domains act as random variables. We then solve the graphical model and maximize the posterior probability of the user latent factors and item latent factors of both domains.

The proposed method is four-fold: firstly, we build a user profile in both domains using the demographic information of a user, explicit ratings (provided by the user), and the content information of user-rated items. We then calculate the similarity between user profiles of distinct domains. After that, we build the probabilistic graphical model, where each node represents a random variable and links express probabilistic relationships between these variables. The random variables in our proposed UP-CDRSs method are: a vector of user latent factors, a vector of item latent factors, rating information, and the similarity of user profiles. We solve the probabilistic graphical model to learn the latent factors of users and items in both domains. In the last fold, the prediction for unrated items is estimated through the inner product of the corresponding user and item latent factors in the target domain. The major contributions of our work can be summarized as follows:

– Presenting a new CDRSs method for exploiting the knowledge of a source domain in order to enhance the accuracy of the target domain. The knowledge is transferred in terms of user profiles, which are built from explicit ratings and the internal information of users and items.
– To apply transfer learning in CDRSs, we need some overlapping information between the domains. The novelty of our proposed method is that we focus on the non-overlapping users and items category. Hence, we extract implicit information of users and items for mapping the user profiles in distinct domains.
– To the best of our knowledge, this is the first attempt at a CDRSs model in which knowledge is transferred in terms of the internal information of users and items by a probabilistic graphical model where each node acts as a random variable. After solving the graphical model by maximizing the posterior probability, we are able to extract the latent factors of users and items more precisely.

– Experiments are done on a CDRSs dataset with non-overlapping users and items. The proposed method is compared with methods both with and without transfer learning. For the methods without transfer learning, we focused only on the target-domain rating matrix and applied their approaches. For the transfer learning methods, we used the rating matrices of both domains (source and target) as the training set and applied their approaches. We used two evaluation metrics and compared all existing related works with our proposed UP-CDRSs method.

The rest of this paper is organized as follows. In Section 2, we introduce some basic definitions, mathematical notations and the problem formulation. The literature review is presented in Section 3, where we briefly describe RSs and CDRSs in the transfer learning framework, followed by related works dealing with the problem of sparsity. In Section 4, we describe our proposed UP-CDRSs method; after that, we show the results of experiments conducted on a cross-domain dataset to verify the effectiveness of the proposed method in Section 5. Finally, we conclude our work and provide future directions in Section 6.

2 Basic definitions, notations and the problem formulation

In this section, we describe some basic definitions in Section 2.1, followed by the mathematical notations used in this paper in Section 2.2; the problem formulation is described in Section 2.3.

2.1 Basic definitions

– Matrix factorization (MF): MF is one of the model-based CF methods. The basic assumption of MF is that the interaction between users and items is governed by a small number of hidden factors called latent factors. Therefore, the user and the item are each described in vector form, and the size of a vector is equal to the number of hidden factors. For example, in a movie recommender system, each movie measures a distribution over latent factors (e.g., science fiction, comedy, action, romance, etc.), and each user vector represents the user's taste over those latent factors. The overall user taste for an item is then estimated by the inner product of the corresponding user and item latent factors.
– Transfer learning: Transfer learning [23], or knowledge transfer, is a part of the machine learning framework that reapplies knowledge acquired from one or more source domains to the target domain in order to improve the performance of the target domain, even though we have a small amount of recorded data for learning the model. When applying the transfer learning framework, information should overlap between domains.
  In mathematical notation, the definitions of domain and task are as follows. A domain D = {𝒳, P(X)} consists of two terms: a feature space 𝒳 and a marginal probability distribution P(X), where X = {x_1, ..., x_n} ∈ 𝒳. We can differentiate domains through both terms, i.e., if two domains are different, they may have a different feature space or a different marginal probability distribution. In terms of the machine learning framework, a dataset is a domain, and from it we have two components: a label space Y and an objective function f(·). For the second term, a task T = {Y, f(·)}, we have to learn an objective function using training data consisting of pairs {x_i, y_i}, where x_i ∈ X and y_i ∈ Y. After that, the learned function f(·) is used for predicting the label of a new instance x.
– Cross-domain recommender systems (CDRSs): CDRSs [20] are a new framework for recommender systems that mitigates the problems of traditional RSs techniques. It uses additional information from one or more source domains to improve the recommendation quality of the target domain.
– User profile: A user profile is a representation of the knowledge and personal characteristics of a user. The profiling information can be elicited from demographic information (e.g., the user's age, gender, occupation, etc.), user-rated items, and the content information of those rated items, such as movie title, genre, director, etc. (in the case of the movie domain). A user profile is represented in the form of a vector (refer to Fig. 7).

2.2 Notations

In this subsection, we describe the notations used throughout the paper. The list of notations is as follows:

R̂_{i,j}  Prediction on item j for user i in the target domain
D^k  Domain k
S_{P_i^s, P_{i'}^t}  Similarity between user profiles i and i' in domains s and t, respectively
P_i^k  User profile of user i in domain k
R^k ∈ ℝ^{N^k × M^k}  User-item rating matrix of domain k
R^k_{i,j}  Rating provided by user i on item j in domain k
M^k  Number of users in domain k
N^k  Number of items in domain k
U_i^k ∈ ℝ^{1×f}  A row vector of latent factors of user i in domain k
f  Number of latent factors
I^k ∈ ℝ^{N^k × M^k}  Binary mask of the rating matrix R^k
V_j^k ∈ ℝ^{1×f}  A row vector of latent factors of item j in domain k
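As a concrete illustration of the user profile defined in Section 2.1, the sketch below builds a profile vector from the seven age bins, a gender bit, and rating-weighted genre preferences, and then compares profiles across domains with cosine similarity. The exact feature layout and the choice of cosine similarity are our illustrative assumptions, not necessarily the authors' construction:

```python
import numpy as np

AGE_BINS = [(1, 17), (18, 24), (25, 34), (35, 44), (45, 50), (51, 55), (56, 200)]

def profile(age, gender, ratings, item_genres):
    """Build a user-profile vector: 7 age bins + 1 gender bit + genre preferences.

    ratings: {item_id: rating}; item_genres: {item_id: 0/1 genre vector}.
    Genre preferences are rating-weighted averages over the user's rated items.
    """
    age_vec = [1.0 if lo <= age <= hi else 0.0 for lo, hi in AGE_BINS]
    gender_bit = [1.0 if gender == "M" else 0.0]
    items = list(ratings)
    G = np.array([item_genres[i] for i in items], dtype=float)  # rated items x genres
    w = np.array([ratings[i] for i in items], dtype=float)
    genre_pref = (w @ G) / w.sum()              # rating-weighted genre profile
    return np.concatenate([age_vec, gender_bit, genre_pref])

def cosine(p, q):
    return float(p @ q / (np.linalg.norm(p) * np.linalg.norm(q)))

# A source-domain user (movies) and a target-domain user (books) with matching
# demographics and similar genre tastes; their profile similarity tends to 1.
genres_s = {1: [1, 0, 1, 0, 0], 3: [1, 0, 1, 0, 0]}
genres_t = {10: [1, 0, 1, 0, 0], 12: [1, 0, 1, 0, 0]}
p_s = profile(29, "M", {1: 5, 3: 4}, genres_s)
p_t = profile(31, "M", {10: 5, 12: 5}, genres_t)
print(round(cosine(p_s, p_t), 3))               # -> 1.0 for this pair
```

Because the profile is a fixed-length vector in both domains, it can be compared across domains even when the users and items themselves are disjoint, which is exactly the bridging role it plays in UP-CDRSs.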

2.3 Problem formulation

This subsection describes the motivation behind our proposed method in the CDRSs framework. How transfer learning is carried out, and what knowledge of the source domain is exploited when users and items do not overlap: the answers to both questions are described using an example.

Our solution for establishing a bridge between domains is the user profile. As we mentioned earlier, a user profile is a representation of the knowledge and personal characteristics of a user. We can correlate two users of distinct domains through their user profiles. If two users of distinct domains have similar user profiles, we can say that the behavior of both users should also be similar. For instance, let user u_i^s and user u_{i'}^t have user profiles P_i^s and P_{i'}^t. If both are similar in terms of user profiles, then the similarity value of the user profiles S_{P_i^s, P_{i'}^t} tends to 1.

Another intuition is based on the LFM model, in which a user's characteristics are represented in vector form. A vector of latent factors represents how much he/she likes or dislikes each latent factor. If two users belong to two distinct domains and the latent vectors of both users have similar likes/dislikes on all corresponding latent factors, then the inner product of both vectors tends to 1. For instance, if user u_i^s and user u_{i'}^t have latent factor vectors U_i^s and U_{i'}^t, respectively, then U_i^s (U_{i'}^t)^T ≈ 1.

If two users u_i^s, u_{i'}^t have similar user profiles and also have similar likes/dislikes in the form of latent factors, then the difference S_{P_i^s, P_{i'}^t} − U_i^s (U_{i'}^t)^T ≈ 0. If the difference does not tend to zero, an error may be present, and it can be minimized by updating only the latent factor values of the users, because the user profile values are static.

According to [23], to apply the transfer learning mechanism and use the knowledge of the source domain, some information must overlap. In our case, the users and items are both disjoint, but the content information of items and the demographic data of users overlap. So, if both of these types of information were also disjoint, we could not exploit the knowledge of the source domain: according to our formulation, the similarities between all user profiles of distinct domains would be zero, and our cross-domain solution would reduce to a single-domain RSs.

We take an example for better understanding. Figure 1 shows the user-item matrices of both domains. The source and target domains are movies and books, respectively. There are five users and six items in both domains. Ratings are in numerical form {1, 2, 3, 4, 5, ?}, where '?' represents the rating of unrated items. In both domains, we initialize the latent factor values of all users. Figure 2 shows the users' latent factors, where the size of f is 5. The additional information about users and items is demographic information and content information, respectively. In this example, we use genre information as the content information. Five types of genre are present as content information on each item, i.e., each item is described by five genres. If a particular genre is present on an item, we mark 1, otherwise 0. Similarly, for user demographic information, two types are present: the user's age and gender. The user's age is represented by seven age ranges, i.e., 1−17, 18−24, 25−34, 35−44, 45−50, 51−55, 56+, and the user's gender in the form of 'M'/'F' or 1/0. Figure 3 describes the content information of items and the demographic information of users. We can see that both the content information and the demographic information overlap between domains.

Fig. 1 Illustration of the source- and target-domain user-item rating matrices
Fig. 2 Representation of users' latent factors in a 2-D matrix (f = 5)

From Figs. 1, 2 and 3, some observations can be made as follows. In the source domain, u_1^s provided high ratings on items i_1 and i_3 (refer to Fig. 1), and items i_1 and i_3 contain the g_1 and g_3 genres (refer to Fig. 3, wherein the values of (i_1, g_1), (i_1, g_3), (i_3, g_1), (i_3, g_3) are 1). Therefore, we can say that the user u_1^s comparatively prefers the g_1 and g_3 genres. For a detailed explanation, we have taken the following three scenarios from the target domain:

Scenario 1: In the target domain, u_7^t provided high ratings on items i_10 and i_12. Both items contain the g_1 and g_3 genres.

That is, we can also say that the user u_7^t comparatively prefers the g_1 and g_3 genres. In terms of demographic information, u_1^s.age and u_7^t.age belong to the same age group, and the genders are also the same. Although both users belong to distinct domains, the users (u_1^s, u_7^t) are similar in terms of characteristics and behavior, so the similarity between both user profiles S_{(1s,7t)} ≈ 1. Moreover, their latent factor vectors (shown in Fig. 2) should also be equal, i.e., U_1^s (U_7^t)^T ≈ 1. So, the difference S_{(1s,7t)} − U_1^s (U_7^t)^T ≈ 0. If the difference does not tend to zero, then an error may be present.

Scenario 2: In scenario 2, u_9^t provided high ratings on items i_11 and i_12, and both items contain the g_1 and g_3 genres. In this scenario, we can also say that the user u_9^t comparatively prefers the g_1 and g_3 genres. In terms of demographic information, u_1^s.age and u_9^t.age belong to the same age group, and the genders are also the same. In this scenario, both users (u_1^s, u_9^t) are also similar, so S_{(1s,9t)} ≈ 1. In terms of latent vectors, however, U_1^s (U_9^t)^T ≈ 1 does not hold. In this situation, we have to adjust the latent factor values so that the error can be minimized.

Scenario 3: Similarly, u_10^t provided high ratings on items i_8 and i_9. Both items contain the g_2 and g_5 genres. u_1^s.age and u_10^t.age are dissimilar, and the genders are also opposite. In this scenario, we can say that the user profile similarity S_{(1s,10t)} tends to zero, and their latent factors should also not be equal. But the condition U_1^s (U_10^t)^T ≈ 1 is satisfied. If we calculate the error between the user profiles and their latent factor vectors, S_{(1s,10t)} − U_1^s (U_10^t)^T, it must be high. Therefore, we again have to adjust the latent factor values so that the error can be minimized.

These scenarios motivate the objective of our proposed work. How to build a user profile, how transfer learning is carried out in the CDRSs framework, and how to minimize the error are all described in Section 4.

Fig. 3 Representation of demographic data and content information in a 2-D matrix

3 Literature review

In this section, we briefly describe recommender systems and their techniques and the limitations of these techniques, followed by existing solutions that leverage knowledge from additional domains. After that, we describe related works in detail.

RSs have been successfully applied in many areas such as movies [3, 28, 30], social networks [31–33], music [34], books [35, 36], medical science [37, 38], e-learning [39], etc. Several RSs techniques have been proposed for making recommendations to users. According to the literature review paper [5], the three filtering techniques are: content-based, collaborative and hybrid.

Content-based filtering approaches analyze a set of descriptions of items previously rated by the user and then create a profile of that user's interests on the basis of the features of the rated items. Collaborative filtering (CF) focuses on the explicit ratings which are given by users as their opinions on seen items. A main hypothesis of CF is that two users who were similar in the past will be similar in the future. The hybrid RSs technique combines both filtering approaches and makes use of the advantages of each technique. CF has become one of the most popular filtering techniques in recent years because it is more versatile.

CF [1, 3, 4] can be classified into two categories: memory-based and model-based. The former category of methods focuses on a similarity strategy between co-rated users or items, followed by top-K selection, and then a weighted average of similarities with other co-rated users or items for prediction. Various modified similarity formulae [6, 7, 40] have been proposed to calculate an enriched similarity between users or items. Although memory-based CF provides good performance, it is not suitable for large databases. In the case of model-based methods, firstly we build a model from historical records, and then predictions can be made. Various models [8, 10–13] based on the latent factor model have been proposed. Matrix factorization (MF) [9] is one of the most popular models in model-based CF. MF tries to characterize users and items by a small number of latent factors inferred from ratings. This method became very popular in 2009, when it played a central role in the Netflix competition.¹ Although CF has gained great success in recent years, it still suffers from the sparsity problem and the cold-start user/item problem [41]. The reason for the sparsity problem is that users provide ratings on a limited number of items out of millions of items. The item cold-start problem occurs when a new item has just been transferred into the system, so no ratings are available for it. Similarly, in the user cold-start problem, a user has just entered the system and has not provided any ratings on items.

To handle the cold-start user problem, one approach [42] is to recommend a few items to a cold-start user and use the feedback to learn a profile. The learned profile can then be used to make good recommendations to the cold user. A key question is how to select the items to recommend to a cold-start user. In the literature review paper [43], the authors have classified the relevant studies on the cold-start user problem into three groups: 1) making use of additional data sources; 2) selecting the most prominent groups of analogous users; 3) enhancing the prediction using hybrid methods. A limitation of the first group is that we need some additional data sources (tags, demographic information, etc.); if these are not available, it fails to address the problem. In the case of the second group, we have to use a brute-force algorithm to find the optimal number of groups, so it is not convenient. In the third group, i.e., hybrid, the computation of similarities is too expensive.

For handling the sparsity problem of CF, several researchers have provided solutions that use additional in-domain feedback such as likes/dislikes [15, 16, 18], user reviews [17], history records [18], etc., and have tried to mitigate the sparsity problem. In two survey papers [20, 44], several authors have focused on one or more domains and proposed methods to leverage the knowledge from multiple source domains in order to increase the performance of the target prediction by transfer learning [23]. This type of strategy is called Cross-domain Recommender Systems (CDRSs).

Cremonesi et al. [24] have classified CDRSs into three categories based on the overlap of users/items: fully overlapping users/items, partially overlapping users/items, and non-overlapping users and items. In the case of fully or partially overlapping users/items, a first paper was presented by [45] in 2008. Similarly, [46] proposed a method for CDRSs that aggregates user rating vectors from different domains and applies traditional memory-based CF. Hu et al. [47] proposed a cross-domain version of matrix factorization in which an augmented user-item rating matrix is constructed by horizontally concatenating all matrices. These types of methods use multi-task transfer learning, where both domains are used simultaneously. Cremonesi et al. [24] considered modeling the classical similarity relationships (with partially overlapping users and/or items) as a directed graph and exploring all possible paths connecting users or items in order to find new cross-domain relationships. However, all the above-mentioned methods consider overlapping users or items across domains for knowledge transfer, which is not a realistic setting, because finding the same user or item in two distinct domains is too difficult.

In the third category, i.e., non-overlapping users and items, [19] proposed the method named Codebook Transfer (CBT), based on cluster-level rating patterns, for knowledge transfer between domains. CBT is fully based on rating patterns: it extracts the rating pattern from the source domain by using a two-way k-means algorithm and then transfers it to the target domain. The authors assumed that the two different domains have similar rating patterns. But in a practical scenario, every domain has its own rating distribution, which we cannot assume to be the same. Rather than focusing on rating pattern transfer, several researchers [25–28] have also addressed the non-overlapping users/items category by using additional tag information, assuming that the tags overlap between the two domains. The limitation of tag-based transfer learning methods in CDRSs is that finding overlapping tags between domains is too expensive.

¹ https://www.netflixprize.com
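The cluster-level rating-pattern idea behind CBT can be sketched as follows. This is a schematic toy version only: CBT [19] learns the co-clustering from the data (via two-way k-means), whereas here the cluster assignments are fixed by hand to keep the sketch short, and all names and numbers are invented:

```python
import numpy as np

# Toy source-domain rating matrix (users x items), already dense.
R_src = np.array([[5, 5, 1, 1],
                  [5, 4, 1, 2],
                  [1, 1, 5, 5],
                  [2, 1, 4, 5]], dtype=float)

# Hand-fixed co-clustering: users {0,1} vs {2,3}, items {0,1} vs {2,3}.
user_cluster = np.array([0, 0, 1, 1])
item_cluster = np.array([0, 0, 1, 1])
k = 2

# Codebook B: mean rating of each (user-cluster, item-cluster) block.
B = np.zeros((k, k))
for p in range(k):
    for q in range(k):
        B[p, q] = R_src[np.ix_(user_cluster == p, item_cluster == q)].mean()

# Expansion: each target user/item is assigned to a source cluster, and a
# missing entry inherits the codebook rating of its cluster pair.
tgt_user_cluster = np.array([0, 1, 1])
tgt_item_cluster = np.array([1, 0])
R_tgt_filled = B[np.ix_(tgt_user_cluster, tgt_item_cluster)]
print(B)
```

The sketch also makes the criticized assumption visible: the filled target entries come entirely from the source-domain block means, so they are only sensible if both domains really share the same rating distribution.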

between domains, which is too expensive. Table 1 shows the classification of methods based on the type of knowledge exploited, either from additional feedback (in-domain) or from additional related domains.

3.1 Related work

Various recommendation techniques have been developed in both single-domain and cross-domain recommender systems. This paper focuses on efficient cross-domain recommendations using the user profile, which acts as a bridge for knowledge transfer from a source domain. According to the literature studied, no one has used the user profile in the CDRSs framework. In this section, we summarize the works that are most representative of and relevant to this study.

The existing work can be classified into two categories: without transfer and with transfer. In the former category, researchers [5, 8, 12] have focused on only a single specific domain, i.e., no transfer learning is applied; here, collaborative filtering is one of the best state-of-the-art techniques. As mentioned earlier, the two types of CF are memory-based and model-based. One of the traditional methods in the memory-based category is kNN-CF, where the top-K neighbors are found based on the similarities between an active user and the other users. The similarity can be calculated by the Pearson correlation formula [40] as:

sim_{a,u} = Σ_{i∈I_{a,u}} (r_{a,i} − r̄_a)(r_{u,i} − r̄_u) / ( √(Σ_{i∈I_{a,u}} (r_{a,i} − r̄_a)²) · √(Σ_{i∈I_{a,u}} (r_{u,i} − r̄_u)²) )    (1)

After calculating the similarity, the prediction is made by a weighted average strategy as follows:

r̂_{a,i} = r̄_a + Σ_{u=1}^{topK} sim_{a,u} (r_{u,i} − r̄_u) / Σ_{u=1}^{topK} |sim_{a,u}|    (2)

where a, u ∈ M, i ∈ N; sim_{a,u} represents the similarity between users a and u; I_{a,u} is the set of items co-rated by users u and a; topK is the set of the k users most similar to user a; and r̄_a and r̄_u are the mean ratings of users a and u over all their rated items, respectively.

Because the available datasets are sparse, calculating the similarity with every user is not a good idea. Another solution, User Preference Clustering (UPC-CF), has been proposed by [7]. UPC-CF is based on user preference clustering to reduce the impact of data sparsity. The authors used three clusters, Co, Cp and Cn, representing the optimistic user group, the pessimistic user group, and the neutral user group, respectively. An intuition of the UPC-CF method is that users can have starkly different views on an item. The clustering is done based on the mean rating of a user with threshold values; for instance, the threshold values are (based on the numerical rating range 1−5): {{1, 2, 3}, {4, 5}, {3}} for the pessimistic user group, optimistic user group and neutral user group, respectively. If user u's mean rating is 4.12, then user u belongs to the optimistic user group. After categorizing all users based on their mean rating, the cluster center of each of the three clusters is calculated as:

C∗ = u ⇐ max_u |I_u|    (3)

i.e., the cluster center is the user who has provided the maximum number of ratings in the specific cluster of users.
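Equations (1) and (2) translate directly into code. The sketch below uses a small dense toy matrix with 0 marking an unrated item; the variable names and the toy data are ours:

```python
import numpy as np

def pearson_sim(R, a, u):
    """Eq. (1): Pearson correlation over the items co-rated by users a and u."""
    co = (R[a] > 0) & (R[u] > 0)              # 0 marks an unrated item
    r_bar_a = R[a][R[a] > 0].mean()           # mean over user a's rated items
    r_bar_u = R[u][R[u] > 0].mean()
    da, du = R[a][co] - r_bar_a, R[u][co] - r_bar_u
    denom = np.sqrt((da ** 2).sum()) * np.sqrt((du ** 2).sum())
    return float((da * du).sum() / denom) if denom else 0.0

def predict(R, a, i, top_k=2):
    """Eq. (2): mean-centred weighted average over the top-K most similar raters of i."""
    raters = [u for u in range(R.shape[0]) if u != a and R[u, i] > 0]
    sims = sorted(((pearson_sim(R, a, u), u) for u in raters), reverse=True)[:top_k]
    r_bar_a = R[a][R[a] > 0].mean()
    num = sum(s * (R[u, i] - R[u][R[u] > 0].mean()) for s, u in sims)
    den = sum(abs(s) for s, _ in sims)
    return r_bar_a + num / den if den else r_bar_a

R = np.array([[5, 3, 0, 1],
              [4, 3, 4, 1],
              [1, 1, 5, 5],
              [5, 3, 4, 1]], dtype=float)
print(round(predict(R, a=0, i=2), 2))         # predicted rating of user 0 on item 2
```

With this toy matrix, user 0's prediction for item 2 is pulled close to 4 by the two highly similar neighbors (users 1 and 3), while the dissimilar user 2 falls outside the top-K selection.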

Table 1 Classification of methods based on the type of knowledge exploited, either from additional feedback (in-domain) or from additional related domains

Research paper | Overlap category between domains (additional data used)
Winoto and Tang [45], 2008 | users/items overlap
Li et al. [19], 2009 | non-overlapping users and items (rating pattern)
Pan et al. [48], 2010 | users/items overlap
Cremonesi et al. [24], 2011 | partially overlapping users/items
Shi et al. [25], 2011 | non-overlapping users and items (tags)
Pan and Yang [15], 2013 | users/items overlap
Enrich et al. [26], 2013 | non-overlapping users and items (tags)
Hu et al. [47], 2013 | users/items overlap
Fernández-Tobías et al. [27], 2014 | non-overlapping users and items (tags)
Xin et al. [17], 2015 | users/items overlap
Fang et al. [49], 2016 | non-overlapping users and items (tags)
Zhao et al. [50], 2017 | users/items overlap
Zhu et al. [21], 2017 | users/items overlap
Sahu et al. [28], 2018 | non-overlapping users and items (tags)
Yu et al. [32], 2018 | users/items overlap

The cluster center is the user who has provided the maximum number of ratings in a specific cluster of users. After calculating the cluster centers of all three clusters, the preference of an active user is identified by using a modified similarity equation:

sim^{UPC}_{a,u} = exp( − ( Σ_{i∈I_{a,u}} (r_{a,i} − r_{u,i}) · |r̄_a − r̄_u| ) / |I_{a,u}| ) · ( |I_a ∩ I_u| / |I_a ∪ I_u| )   (4)

The prediction can then be calculated from (2) by replacing the similarity formula. An advantage of UPC-CF is that it does not require calculating similarities to all users.

Another paper in the same direction has been proposed by [12], with a different strategy to find the topK neighbors. The authors proposed a modified version of memory-based CF named Neighbor Users by Subspace Clustering (NUSC-CF), which tries to find users in the corresponding subspaces of items. The intuition is that users grouped under the same cluster share similar interests. These subspaces are then used to build a tree of neighbor users; the similarity of every user to the target user determines his position in the tree. A disadvantage of NUSC-CF is that the number of subspaces may grow exponentially, depending on the total number of items in the dataset.

In the case of the model-based approach, MF, as mentioned earlier, tries to describe users' tastes and items' information with a small number of latent factors. These latent factors can be estimated through a learning mechanism [29] using previous historical records. After estimating both latent factor matrices, the prediction can be made using the inner product of the corresponding user and item latent factors.

In traditional MF, rather than directly applying a learning algorithm to estimate the latent factors of users and items, we first remove the user and item biases from the ratings (the bias of a critical user, who rates more critically than others, and the bias of a highly popular item) so that the values are estimated more precisely. This is the baseline, or Average Filling with bias (AF with bias), method [51] of MF. The user and item biases are estimated as:

min_{b_u, b_i} Σ_{u=1}^{m} Σ_{i=1}^{n} I_{u,i} [ (1/2)(r_{u,i} − μ − b_u − b_i)² + (λ_{b_u}/2) b_u² + (λ_{b_i}/2) b_i² ]   (5)

where b_u, b_i and μ are the user bias, the item bias and the average of all available ratings in the dataset, respectively, and λ_{b_u} and λ_{b_i} are regularization parameters to control over-fitting.

After removing the biases, the latent factor matrices U and V can be estimated using the user-item rating matrix. The loss function of MF is as follows:

min_{U,V} (1/2) ‖ I ⊙ (Y − U V^T) ‖²_f + (λ_u/2) ‖U‖²_f + (λ_v/2) ‖V‖²_f   (6)

where ‖x‖_f is the Frobenius norm and y_{u,i} ∈ Y = r_{u,i} − b_u − b_i − μ.

In the latter category, i.e., with transfer, CBT is one of the state-of-the-art techniques in which transfer learning is used. An advantage of CBT is that it does not require overlapping users or items with a source domain. Most memory-based CF methods are based on similarities estimated through the co-rated items between users. If the co-rated items are not numerous enough, the similarity value may not be accurate; therefore, some researchers [5] have provided solutions that fill the unknown ratings with the corresponding mean of the provided known ratings. But this is not a good idea. So [19] proposed the codebook transfer approach, wherein ratings are filled through a compact codebook that is extracted from another related source domain. After filling the unknown ratings, traditional kNN-CF is applied. CBT has two phases: extraction of the codebook and expansion of the codebook.

Figure 4 shows an example of the CBT approach, wherein two rating matrices are used for the source and target domains. In the extraction phase, the rating matrix of the source domain is permuted through 2-way k-means clustering [52], and then a compact user-item rating pattern called the codebook is extracted. In the second phase, the extracted codebook is expanded into the target domain. The filled missing rating entries are shown with the '*' symbol in the target domain in Fig. 4.

The objective function of CBT is:

min_{U^s, B, V^s} ‖ I^s ⊙ (R^s − U^s B V^{sT}) ‖_f   (7)

s.t. U^{sT} U^s = I, V^{sT} V^s = I

After minimizing (7), the codebook is constructed as:

B = [ U^{sT} R^s V^s ] ⊘ [ U^{sT} 1 1^T V^s ]   (8)

where ⊘ denotes entry-wise division. Equation (8) averages all the ratings in each user-item co-cluster into an entry of the codebook, i.e., the cluster-level rating pattern; U^s, U^t, V^s, V^t ∈ {0, 1} represent cluster indicator matrices. After the extraction phase of CBT from a source domain, the same procedure is applied in reverse order to transfer the codebook into the target domain, and then the traditional kNN-CF method is applied for prediction. The limitation of CBT is that it shares the same cluster-level rating pattern, which is not realistic in practical scenarios, because the distributions of ratings in distinct domains may not follow the same cluster-level rating pattern.
Fig. 4 An example of the CBT method using transfer learning
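The co-cluster averaging behind Eq. (8), which Fig. 4 illustrates, can be sketched as follows. The rating matrix and cluster assignments are hypothetical; a real CBT run would obtain the assignments by co-clustering the source matrix:

```python
import numpy as np

# Codebook construction in the spirit of Eq. (8): average the source-domain
# ratings inside each (user-cluster, item-cluster) co-cluster.
Rs = np.array([[5, 5, 1, 1],
               [4, 5, 1, 2],
               [1, 1, 5, 4],
               [2, 1, 4, 5]], dtype=float)   # toy source rating matrix
user_cluster = np.array([0, 0, 1, 1])        # hypothetical user clusters
item_cluster = np.array([0, 0, 1, 1])        # hypothetical item clusters
U = np.eye(2)[user_cluster]                  # one-hot cluster indicator U^s
V = np.eye(2)[item_cluster]                  # one-hot cluster indicator V^s
# entry-wise division of Eq. (8): sum of ratings per co-cluster / cell count
B = (U.T @ Rs @ V) / (U.T @ np.ones_like(Rs) @ V)
```

Each entry of `B` is then the cluster-level rating pattern transferred to the target domain.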

Table 2 shows a summarization of the related works. In the previous related works, the modifications and enhancements of collaborative filtering are mainly embodied in three scenarios: the similarity measure followed by topK neighbor selection [3, 6, 7], the latent factor model [9, 14] and knowledge transfer [19] from another domain. All the related methods have tried to mitigate the sparsity problem of CF under some assumptions. Although researchers have tried to address the sparsity problem, it remains a challenging and open problem of RSs.

This paper falls in two of these scenarios: knowledge transfer and the latent factor model. In this paper, we propose a novel method for CF to mitigate the sparsity problem and enhance the performance of rating prediction using another related domain through a transfer learning strategy in the CDRSs framework. As mentioned earlier, to leverage knowledge from other related domains, we need some overlapping information between the domains; here, we use the user profile as a bridge, even when no users/items overlap between the domains. The other aspect is the latent factor model, i.e., how to learn the hidden factors of users and items; we use a probabilistic matrix factorization model, which helps to learn the hidden factors efficiently. Several authors have focused on the first two scenarios, i.e., neighbor selection and the latent factor model; our method can be differentiated from the related methods by its combination of the last two scenarios (latent factor model and transfer learning).

4 Proposed UP-CDRSs method

In this section, we describe our proposed UP-CDRSs method, which consists of four folds: 1) build a user profile in both domains; 2) calculate the similarity between user profiles of the distinct domains; 3) merge both domains into a single set, build a probabilistic model to find the relationships between the variables, and then solve it by probabilistic theory to get the objective function, which is solved by an alternating least squares approach; 4) in the last fold, make predictions for the unrated items' ratings in the target domain. Figure 5 shows the architecture of the UP-CDRSs method.

4.1 Build user profile

Firstly, we build a user profile using the demographical information of a user, the explicit ratings (provided by the user) and the content information of the user-rated items. A block diagram of building a user profile is shown in Fig. 6. We capture the user's preferences (in the form of a vector) using the explicit ratings and the content information of the user's rated items. After that, we augment the demographical information with the user's preference vector. Figure 7 shows a user profile vector.

User's preferences: The benefit of capturing a user's preferences is that we can know the content preferences of the user. For capturing user's preferences,
Table 2 Summarization of previous approaches

Candillier et al. [3], 2007. Transfer learning: No. Approach: kNN-CF; find topK neighbors using Pearson correlation coefficients and apply a weighted average of the corresponding similarities and ratings of other co-rated users. Remarks: similarity calculation is too expensive; problem of data scalability.

Salakhutdinov and Mnih [14], 2007. Transfer learning: No. Approach: cPMF; use the concept of hidden latent factors and represent a user and an item by a small number of hidden factors. Remarks: handles the problem of data scalability; handles the semi cold-start user problem, i.e., users who have not provided enough ratings.

Koren et al. [9], 2009. Transfer learning: No. Approach: MF; use the concept of hidden latent factors and represent a user and an item by a small number of hidden factors. Remarks: handles the problem of data scalability; handles the problem of critical users and items by removing the bias from the ratings.

Li et al. [19], 2009. Transfer learning: Yes. Approach: CBT; find the cluster-level rating pattern from a source domain and then transfer it to the target domain, followed by the kNN-CF method. Remarks: handles the problem of data sparsity; too expensive to find the compact rating pattern.

Zhang et al. [7], 2016. Transfer learning: No. Approach: UPC-CF; clustering is used to find similar user groups (three clusters represent the optimistic, pessimistic and neutral user groups), followed by a modified kNN method. Remarks: mitigates the problem of calculating similarity to all other users in the dataset; may not work well on huge datasets.

Koohi et al. [6], 2017. Transfer learning: No. Approach: NUSC-CF; find the users in the corresponding subspaces of items; these subspaces are then used to build a tree of neighbor users, and the similarity of every user to the target user determines his position in the tree. Remarks: finding subspaces is too expensive.
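The weighted-average kNN-CF prediction summarized for [3] in Table 2 can be sketched as follows; the mean-centered form and the toy neighbor list are illustrative assumptions, not the exact formula of the paper:

```python
# Weighted-average kNN-CF prediction over the topK most similar users.
# neighbors: list of (similarity, neighbor_rating, neighbor_mean_rating);
# the numbers below are hypothetical.
def predict(active_mean, neighbors):
    num = sum(s * (r - m) for s, r, m in neighbors)   # similarity-weighted deviations
    den = sum(abs(s) for s, r, m in neighbors)
    return active_mean if den == 0 else active_mean + num / den

print(predict(3.5, [(0.9, 4, 3.0), (0.5, 5, 4.0)]))  # 4.5
```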

we use two formulae: Relative Genre Ratings (RGR) and Modified Relative Genre Frequency (MRGF) [53]. In this paper, we use genre information as the content information because we focus on movie recommendations. After calculating both values, a particular user's preference can be calculated as the harmonic mean of the two values. The formulae of RGR and MRGF are as follows:

RGR: the ratio of user u's ratings for high-rated items in each genre to the total ratings on items.

RGR(u, g) = GR(u, g) / TR(u)   (9)

TR(u) = Σ_{s∈S_u} r_{u,s}, and GR(u, g) = Σ_{s∈G_g⊂S_u, r≥3} r_{u,s}

Here, TR is the total of the ratings of user u, S_u is the set of items rated by user u, and the genre rating GR is computed over the high-rated (r ≥ 3) items of genre G_g corresponding to user u.

MRGF: rather than focusing on the items' ratings, the frequency of the genres preferred by a user is also an important concern: the ratio of user u's ratings (with respect to frequency) for high-rated items in each genre to the total ratings (with respect to frequency).

MRGF(u, g) = ( Σ_{s∈G_g⊂S_u} δ_3(r_{u,s}) + 2·δ_4(r_{u,s}) + 3·δ_5(r_{u,s}) ) / ( 3 · TF(u) )   (10)

GF(u, g) = Σ_{s∈G_g⊂S_u} δ_k(r_{u,s}), s.t. k ∈ {3, 4, 5}, TF(u) = |S_u|, and δ_k(r_{u,s}) = 1 if k = r_{u,s}, 0 otherwise

where S_u is the set of items rated by user u.

user's preference(u, g) = 2 · n_f · RGR(u, g) · MRGF(u, g) / ( RGR(u, g) + MRGF(u, g) )   (11)
Fig. 5 Architecture of the UP-CDRSs method
where n_f is a normalization factor. User's preference(u, g) expresses how much user u is likely to prefer genre g. Similarly, we calculate all the preferences of user u for the corresponding genres in the given dataset.

Demographical information: When a user registers on the system, he/she provides some personal information, such as age, gender, country name, occupation, etc. These types of information are called demographic or demographical information.

In this paper, we use only age and gender as the demographical information of a user. Each user's age belongs to one of the group ranges <1−17, 18−24, 25−34, 35−44, 45−49, 50−55, 56+>. This information can be encoded as a binary value; for instance, if user u's age is 43, the binary encoded form is <0, 0, 0, 1, 0, 0, 0>. The other demographical information is gender, where only two options are provided, i.e., 'M' and 'F' for male and female, respectively.
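The preference computation of Eqs. (9)-(11) can be sketched on user u1 of the worked example in Table 3. Note that this sketch normalizes MRGF by TF(u) alone, which is the normalization that reproduces the values reported in Tables 5 and 6:

```python
# Ratings of user u1 from Table 3 (item -> rating) and genre vectors <g1..g5>.
ratings = {"i1": 5, "i3": 4, "i5": 3, "i6": 2, "i7": 4, "i10": 3}
genres = {
    "i1": [1, 0, 1, 0, 0], "i3": [1, 0, 1, 0, 0], "i5": [0, 0, 1, 1, 0],
    "i6": [1, 0, 0, 0, 1], "i7": [0, 0, 1, 1, 0], "i10": [0, 0, 1, 1, 0],
}
TR = sum(ratings.values())   # total ratings of u1 (= 21, matching Table 4)
TF = len(ratings)            # number of items rated by u1 (= 6, matching Table 5)

def rgr(g):
    # Eq. (9): genre rating over high-rated (r >= 3) items / total ratings
    return sum(r for s, r in ratings.items() if r >= 3 and genres[s][g]) / TR

def mrgf(g):
    # weighted genre frequency; weights 1/2/3 for ratings 3/4/5,
    # normalized by TF(u) (reproduces Table 5)
    w = {3: 1, 4: 2, 5: 3}
    return sum(w.get(r, 0) for s, r in ratings.items() if genres[s][g]) / TF

def preference(g, nf=0.8):
    # Eq. (11): normalized harmonic mean of RGR and MRGF
    a, b = rgr(g), mrgf(g)
    return 0.0 if a + b == 0 else 2 * nf * a * b / (a + b)

prefs = [round(preference(g), 4) for g in range(5)]
print(prefs)  # [0.4528, 0.0, 0.903, 0.4444, 0.0]
```

The printed vector matches the preference vector of u1 in Table 6.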

After capturing the user's preferences and extracting the demographical information, we concatenate them into a single vector called the user profile (refer to Fig. 7).
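The binary encoding of the demographic part described above can be sketched as follows (the helper name is ours):

```python
# Age bins follow the paper's seven ranges; gender 'M' -> 1, 'F' -> 0.
AGE_BINS = [(1, 17), (18, 24), (25, 34), (35, 44), (45, 49), (50, 55), (56, 200)]

def encode_demographics(age, gender):
    age_vec = [1 if lo <= age <= hi else 0 for lo, hi in AGE_BINS]
    return age_vec + [1 if gender == "M" else 0]

print(encode_demographics(43, "M"))  # [0, 0, 0, 1, 0, 0, 0, 1]
```

This eight-dimensional vector is the demographic part that gets concatenated with the genre-preference vector.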
For better understanding, we consider an example of a movie recommender system (shown in Table 3). There are three users u1, u2 and u3 and ten movies i1, i2, · · ·, i10. Explicit ratings are expressed with numerical values from 1 to 5; unrated entries are shown as '?'. Here, the genre description is considered as the content information. Five genres are

Fig. 6 Block diagram of building the user profile
Fig. 7 Illustration of a user profile vector
given in the form of <g1, g2, g3, g4, g5>. If genre g_i is present in the movie, it is denoted as 1, otherwise 0.

Table 3 An example of a movie-domain rating matrix

Item (genres)     u1   u2   u3
i1 <1,0,1,0,0>    5    ?    ?
i2 <0,0,1,1,0>    ?    3    2
i3 <1,0,1,0,0>    4    ?    ?
i4 <0,1,0,0,1>    ?    4    5
i5 <0,0,1,1,0>    3    ?    ?
i6 <1,0,0,0,1>    2    5    4
i7 <0,0,1,1,0>    4    ?    ?
i8 <0,1,1,0,1>    ?    5    4
i9 <0,1,0,0,1>    ?    4    ?
i10 <0,0,1,1,0>   3    2    ?

Firstly, we calculate the RGR and MRGF values of all users for each genre using (9) and (10), respectively; Tables 4 and 5 show the RGR and MRGF values.

Table 4 RGR values of an example

User #   TR   GR_{r≥3}            RGR
u1       21   <9, 0, 19, 10, 0>   <0.4286, 0, 0.9048, 0.4762, 0>
u2       23   <5, 13, 8, 3, 18>   <0.2174, 0.5652, 0.3478, 0.1304, 0.7826>
u3       15   <4, 9, 4, 0, 13>    <0.2667, 0.6000, 0.2667, 0, 0.8667>

After calculating both values, the users' preferences can be calculated using (11); Table 6 shows the users' preferences. The value n_f = 0.8 is used. We can observe from Table 6 that user u1 comparatively prefers genre g3 (0.9030); similarly, user u3 comparatively prefers genre g5 (0.9274).

After calculating the users' preferences, we extract the demographical information. In Fig. 3, the age and gender of the users are described. We consider seven age groups <a1, a2, a3, a4, a5, a6, a7> with ranges <1−17, 18−24, 25−34, 35−44, 45−49, 50−55, 56+>. Each user's age belongs to a particular group; for instance, if a user u's age is 39, then the encoded vector is <0, 0, 0, 1, 0, 0, 0>. The other demographical information is gender, with only two options, i.e., 'M' and 'F' for male and female, respectively; here, we use only a scalar value in binary form (1 and 0). Combining both pieces of information into a single vector, the size of the vector is eight. For instance, u1.age = 18 and u1.gender = 'M', so the demographic information in vector form is <0, 1, 0, 0, 0, 0, 0, 1>.

After calculating both pieces of information, we again concatenate them into a single vector called the user profile vector (P_i^k). For instance, the preference vector of user u1 is <0.4528, 0, 0.9030, 0.4444, 0> and the demographic information in vector form is <0, 1, 0, 0, 0, 0, 0, 1>; thereafter, we concatenate both vectors into a single one, i.e., <0.4528, 0, 0.9030, 0.4444, 0, 0, 1, 0, 0, 0, 0, 0, 1>.

4.2 Similarity calculation between user profiles

In the second fold, we calculate the similarity S_{P_i^s, P_{i'}^t} between all user profiles P_i^s and P_{i'}^t from domains s and t, respectively. The similarity is calculated using the cosine formula as follows:

S_{P_i^s, P_{i'}^t} = P_i^s · P_{i'}^t / ( ‖P_i^s‖ ‖P_{i'}^t‖ )   (12)

4.3 Probabilistic graphical model of the UP-CDRSs method

Figure 8 shows the probabilistic graphical model [29, 54], where each node represents a random variable and the links express probabilistic relationships between these variables. The random variables in our proposed method are U_i^k, V_j^k, R_{i,j} and S_{P_i^s, P_{i'}^t}, for the user latent factors, item latent factors, ratings and similarities between user profiles, respectively. How the variables are related is shown in Fig. 8. According to graphical model theory, the joint distribution of all the variables is expressed as:

p(U^s, V^s, U^t, V^t, R^s, R^t, S, σ_{U^s}, σ_{V^s}, σ_{U^t}, σ_{V^t}, σ_s, σ_t, σ_p) = p(R^s | U^s, V^s, σ_s) p(R^t | U^t, V^t, σ_t) p(S | U^s, U^t, σ_p) p(U^s | σ_{U^s}) p(V^s | σ_{V^s}) p(U^t | σ_{U^t}) p(V^t | σ_{V^t}) p(S | σ_p) P(σ_{U^s}) P(σ_{V^s}) P(σ_{U^t}) P(σ_{V^t}) P(σ_s) P(σ_t) P(σ_p)   (13)

Neglecting the constant prior probabilities in (13), the log-posterior probability over the latent variables is:

log p(U^s, V^s, U^t, V^t | R^s, R^t, S, σ_{U^s}, σ_{V^s}, σ_{U^t}, σ_{V^t}, σ_s, σ_t, σ_p) ∝ log[ p(R^s | U^s, V^s, σ_s) p(R^t | U^t, V^t, σ_t) p(S | U^s, U^t, σ_p) p(U^s | σ_{U^s}) p(V^s | σ_{V^s}) p(U^t | σ_{U^t}) p(V^t | σ_{V^t}) p(S | σ_p) ]   (14)

The conditional distribution over the observed ratings is:

p(R^k | U^k, V^k, σ_k) = Π_{i=1}^{M^k} Π_{j=1}^{N^k} [ N(R^k_{i,j} | U_i^k V_j^{kT}, σ_k) ]^{I^k_{i,j}}   (15)

where N(x | μ, σ_k) denotes the probability density function of a Gaussian distribution with mean μ and variance σ_k, and I^k_{i,j} is a binary mask that is equal to 1 if user i rated movie j in domain k, and 0 otherwise. The conditional distribution over the user profile similarities is:

p(S | U^s, U^t, σ_p) = Π_{i=1}^{M^s} Π_{i'=1}^{M^t} N( S_{P_i^s, P_{i'}^t} | U_i^s U_{i'}^{tT}, σ_p )   (16)

The zero-mean spherical Gaussian priors on the user and item vectors are:

p(U^k | σ_{U^k}) = Π_{i=1}^{M^k} N(U_i^k | 0, σ_{U^k}),   p(V^k | σ_{V^k}) = Π_{j=1}^{N^k} N(V_j^k | 0, σ_{V^k})   (17)

Substituting (15), (16) and (17) into (14) for k ∈ {s, t}, we get:

log p(U^s, V^s, U^t, V^t | R^s, R^t, S, σ_{U^s}, σ_{V^s}, σ_{U^t}, σ_{V^t}, σ_s, σ_t, σ_p) = log[ Π_{i=1}^{M^s} Π_{j=1}^{N^s} N(R^s_{i,j} | U_i^s V_j^{sT}, σ_s)^{I^s_{i,j}} · Π_{i=1}^{M^t} Π_{j=1}^{N^t} N(R^t_{i,j} | U_i^t V_j^{tT}, σ_t)^{I^t_{i,j}} · Π_{i=1}^{M^s} Π_{i'=1}^{M^t} N(S_{P_i^s, P_{i'}^t} | U_i^s U_{i'}^{tT}, σ_p) · Π_{i=1}^{M^s} N(U_i^s | 0, σ_{U^s}) · Π_{j=1}^{N^s} N(V_j^s | 0, σ_{V^s}) · Π_{i=1}^{M^t} N(U_i^t | 0, σ_{U^t}) · Π_{j=1}^{N^t} N(V_j^t | 0, σ_{V^t}) ] + C   (18)

where C is a term containing the user profile variance, the rating variances and the prior variances; the constant term C does not depend on the parameters. Expanding the products in (18), we get:

− (1/2σ_s²) Σ_{i=1}^{M^s} Σ_{j=1}^{N^s} I^s_{i,j} (R^s_{i,j} − U_i^s V_j^{sT})² − (1/2σ_t²) Σ_{i=1}^{M^t} Σ_{j=1}^{N^t} I^t_{i,j} (R^t_{i,j} − U_i^t V_j^{tT})²
− (1/2σ_p²) Σ_{i=1}^{M^s} Σ_{i'=1}^{M^t} ( S_{P_i^s, P_{i'}^t} − U_i^s U_{i'}^{tT} )²
− (1/2σ²_{u^s}) Σ_{i=1}^{M^s} U_i^s U_i^{sT} − (1/2σ²_{v^s}) Σ_{j=1}^{N^s} V_j^s V_j^{sT}
− (1/2σ²_{u^t}) Σ_{i=1}^{M^t} U_i^t U_i^{tT} − (1/2σ²_{v^t}) Σ_{j=1}^{N^t} V_j^t V_j^{tT} + C   (19)
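Each term of the log-posterior above is the log of a Gaussian density; a single rating term of Eq. (15) can be evaluated as follows (the numbers are illustrative):

```python
import math

# Log of the Gaussian density N(x | mean, var) used in the rating likelihood.
def log_gauss(x, mean, var):
    return -0.5 * math.log(2 * math.pi * var) - (x - mean) ** 2 / (2 * var)

# log-likelihood of one observed rating r = 4 whose predicted inner product is 3.7
ll = log_gauss(4.0, 3.7, 1.0)
print(round(ll, 4))  # -0.9639
```

The closer the inner product U_i V_j^T is to the observed rating, the larger this term, which is exactly what maximizing the posterior rewards.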

Table 5 MRGF values of an example

User #   TF   GF δ3   GF δ4   GF δ5   MRGF

u1 6 < 0, 0, 2, 2, 0 > < 1, 0, 2, 1, 0 > < 1, 0, 1, 0, 0 > < 0.8333, 0, 1.5000, 0.6667, 0 >
u2 6 < 0, 0, 1, 1, 0 > < 0, 2, 0, 0, 2 > < 1, 1, 1, 0, 2 > < 0.5, 1.1667, 0.6667, 0.1667, 1.6667 >
u3 4 < 0, 0, 0, 0, 0 > < 1, 1, 1, 0, 2 > < 0, 1, 0, 0, 1 > < 0.5000, 1.2500, 0.5000, 0, 1.7500 >
Table 6 An example of users' preferences

User #   Users' preferences
u1       <0.4528, 0, 0.9030, 0.4444, 0>
u2       <0.2424, 0.6092, 0.3657, 0.1171, 0.8521>
u3       <0.2783, 0.6486, 0.2783, 0, 0.9274>

We assume that all the variances are equal, i.e., σ_s² = σ_t² = σ_p² = σ²_{u^s} = σ²_{v^s} = σ²_{u^t} = σ²_{v^t}. Therefore, the objective function E is:

E(U^s, V^s, U^t, V^t) = (1/2) Σ_{i=1}^{M^s} Σ_{j=1}^{N^s} I^s_{i,j} (R^s_{i,j} − U_i^s V_j^{sT})² + (λ_s/2) ( Σ_{i=1}^{M^s} U_i^s U_i^{sT} + Σ_{j=1}^{N^s} V_j^s V_j^{sT} )
+ (1/2) Σ_{i=1}^{M^t} Σ_{j=1}^{N^t} I^t_{i,j} (R^t_{i,j} − U_i^t V_j^{tT})² + (λ_t/2) ( Σ_{i=1}^{M^t} U_i^t U_i^{tT} + Σ_{j=1}^{N^t} V_j^t V_j^{tT} )
+ (α/2) Σ_{i=1}^{M^s} Σ_{i'=1}^{M^t} ( S_{P_i^s, P_{i'}^t} − U_i^s U_{i'}^{tT} )²   (20)

where λ_s and λ_t are constant trade-off parameters to avoid over-fitting by penalizing the magnitudes of the parameters, and α is also a constant trade-off parameter that controls the influence of the user profile similarity.

To minimize the error (E) of the objective function (20), we have used the alternating least squares approach [51] to learn all the parameters. Let

E(U_i^s) = (1/2) Σ_{j=1}^{N^s} I^s_{i,j} (R^s_{i,j} − U_i^s V_j^{sT})² + (λ_s/2) ( U_i^s U_i^{sT} + Σ_{j=1}^{N^s} V_j^s V_j^{sT} ) + (α/2) Σ_{i'=1}^{M^t} ( S_{P_i^s, P_{i'}^t} − U_i^s U_{i'}^{tT} )²

Its partial derivative is:

∂E/∂U_i^s = − Σ_{j=1}^{N^s} I^s_{i,j} (R^s_{i,j} − U_i^s V_j^{sT}) V_j^s − α Σ_{i'=1}^{M^t} ( S_{P_i^s, P_{i'}^t} − U_i^s U_{i'}^{tT} ) U_{i'}^t + λ_s U_i^s

Similarly, for E(U_i^t) of user i in the target domain:

∂E/∂U_i^t = − Σ_{j=1}^{N^t} I^t_{i,j} (R^t_{i,j} − U_i^t V_j^{tT}) V_j^t − α Σ_{i=1}^{M^s} ( S_{P_i^s, P_i^t} − U_i^s U_i^{tT} ) U_i^s + λ_t U_i^t

The item latent factors in the source and target domains can be optimized as follows:

∂E/∂V_j^s = − Σ_{i=1}^{M^s} I^s_{i,j} (R^s_{i,j} − U_i^s V_j^{sT}) U_i^s + λ_s V_j^s

∂E/∂V_j^t = − Σ_{i=1}^{M^t} I^t_{i,j} (R^t_{i,j} − U_i^t V_j^{tT}) U_i^t + λ_t V_j^t

We set ∂E/∂U_i^s = ∂E/∂U_i^t = ∂E/∂V_j^s = ∂E/∂V_j^t = 0. After calculating the gradients of all four parameters, the values are updated as follows:

θ ← θ − γ · ∂E/∂θ   (21)

where γ is the learning rate of the objective function E, and θ ∈ {U^s, V^s, U^t, V^t}. This process is repeated iteratively until convergence to a locally optimal state.

Fig. 8 Probabilistic graphical model of UP-CDRSs method
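A gradient-descent sketch of the objective (20) and the update rule (21), on tiny synthetic data; all dimensions, data and hyper-parameter values here are illustrative, not the paper's experimental settings:

```python
import numpy as np

rng = np.random.default_rng(0)
Ms, Ns, Mt, Nt, f = 6, 8, 5, 7, 3          # toy domain sizes, latent dimension
Rs = rng.integers(1, 6, (Ms, Ns)).astype(float)
Is = (rng.random((Ms, Ns)) < 0.5) * 1.0     # binary mask of observed source ratings
Rt = rng.integers(1, 6, (Mt, Nt)).astype(float)
It = (rng.random((Mt, Nt)) < 0.5) * 1.0     # binary mask of observed target ratings
S = rng.random((Ms, Mt))                    # user-profile similarities (Eq. 12)
Us, Vs = rng.normal(0, .1, (Ms, f)), rng.normal(0, .1, (Ns, f))
Ut, Vt = rng.normal(0, .1, (Mt, f)), rng.normal(0, .1, (Nt, f))
lam_s = lam_t = 0.002; alpha = 0.001; gamma = 0.01

def objective():
    es = 0.5 * (Is * (Rs - Us @ Vs.T) ** 2).sum() + lam_s / 2 * ((Us**2).sum() + (Vs**2).sum())
    et = 0.5 * (It * (Rt - Ut @ Vt.T) ** 2).sum() + lam_t / 2 * ((Ut**2).sum() + (Vt**2).sum())
    ep = alpha / 2 * ((S - Us @ Ut.T) ** 2).sum()
    return es + et + ep

before = objective()
for _ in range(300):                        # gradient steps of Eq. (21)
    Es = Is * (Rs - Us @ Vs.T); Et = It * (Rt - Ut @ Vt.T); Ep = S - Us @ Ut.T
    gUs = -Es @ Vs - alpha * Ep @ Ut + lam_s * Us
    gVs = -Es.T @ Us + lam_s * Vs
    gUt = -Et @ Vt - alpha * Ep.T @ Us + lam_t * Ut
    gVt = -Et.T @ Ut + lam_t * Vt
    Us -= gamma * gUs; Vs -= gamma * gVs; Ut -= gamma * gUt; Vt -= gamma * gVt
after = objective()
```

After training, a target-domain prediction is just `Ut[i] @ Vt[j]`, which is Eq. (22).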
4.4 Prediction

After learning all four parameters (θ) of the objective function E, the prediction of an unknown rating in the target domain can be computed as:

R̂_{i,j} = U_i^t V_j^{tT}   (22)

5 Experiments and results

In this section, we first describe the datasets used in this paper, followed by the data preprocessing. After that, we describe the experiment protocols and the evaluation metrics. At last, we discuss the compared methods and a summary of the experimental results.

5.1 Datasets

We have used two publicly available benchmark datasets for RSs: MovieLens² and EachMovie³. Both datasets are similar in terms of recommendations because both are used for movie recommender systems. A brief overview of the datasets follows:

– The MovieLens (ML) rating dataset contains 1,000,209 ratings of 3,952 movies rated by 6,040 users. Ratings are made on a 5-star scale, i.e., 1−5. The other information about users and items is demographical information (gender, age, occupation and zip code) and genre information, respectively. A user's age is given as one of seven group ranges, i.e., 1−17, 18−24, 25−34, 35−44, 45−50, 51−55, 56+, and a user's gender in the form 'M'/'F'. Each movie is measured as a binary combination of eighteen genres.
– The EachMovie (EM) dataset contains 2,811,983 ratings of 1,628 movies given by 72,916 users. Rating scores are given as six values <0, 0.2, 0.4, 0.6, 0.8, 1>. The other details of users and items are demographical information (gender, age and zip code) and genre information, respectively. A user's age is given as a numerical value, and a user's gender in the form 'M'/'F'. Each movie is measured as a binary combination of ten genres.

5.2 Data preprocessing

This paper uses the transfer learning mechanism to exploit the knowledge of a source domain under the constraint of non-overlapping users and items between the domains. Both mentioned datasets belong to movie recommender systems, so there is a case wherein the same movies may overlap. Figure 9 shows the cross-domain structure of both datasets. To consider non-overlap scenarios, we have to remove the overlapping movies from one of the datasets. The preprocessing steps are as follows:

– Firstly, we find the overlapping movies in both datasets through the movies' names. In this process, 1,361 movies are found which belong to both datasets. These overlapped movies are discarded from one of the datasets; in our scenario, we chose ML and discarded the 1,361 overlapped movies. The remaining 2,591 movies are considered in ML (domain 1). After that, the rating matrix, in triplet form (userID, itemID, rating), is refined, and we only considered 589,395 out of 1,000,209 ratings.
– In the EM dataset (domain 2), we consider an equal number of users as in domain 1. In this case, 6,040 users are chosen randomly, with the constraint that the users must provide demographical information. After that, the rating matrix is refined, and only 284,886 out of 2,811,983 ratings are considered.
– For building a user profile, genre information and demographical information are used. In the case of genre information, only 7 genres overlap between the domains, namely 'Action' as 'Act', 'Animation' as 'Ani', 'Comedy' as 'Com', 'Drama' as 'Dra', 'Horror' as 'Hor', 'Romance' as 'Rom' and 'Thriller' as 'Thr'.
– An interesting thing is that one more hidden overlapping genre is present between the domains. To find this hidden overlapping genre, we use similarity between genres as follows:
  ∗ As mentioned earlier, 1,361 movies are the same in both domains. The intuition is that if two movies have the same name in distinct domains, then the presence of genres in the movie should also be the same. Here, we have calculated the Jaccard

² https://grouplens.org/datasets/movielens/10m/
³ http://www.cs.cmu.edu/?lebanon/IR-lab.htm
Fig. 9 Cross-domain structure of MovieLens and EachMovie datasets
similarity between the genres based on the 1,361 overlapped items. The similarity is calculated as:

jaccard_sim(g1, g2) = |A ∩ B| / |A ∪ B|   (23)

  ∗ Seven genres have the same name in both domains, and we have also calculated the highest similarity for genres with the same name. In Table 7, the boldface values (e.g., 0.6130) show the highest similarities between the genres of the two distinct domains. 'Chi' (children) and 'Fam' (family) are the same type of genre, because we calculated a high similarity (italic-boldface 0.6884); although the genre names 'Chi' and 'Fam' are different, the meanings of both genres are the same, so we consider 'Chi' and 'Fam' the same genre. In total, therefore, eight genres overlap between the domains.

– In the ML dataset, 7 bins (hard encoded) are used for the age information, and the gender information is provided in the form of 'M' and 'F', so it only needs a scalar value (0/1). In the EM dataset, the age is provided as a numerical value; for consistency between the datasets, we have applied a numeric range filter and transformed it into the same seven bins as in ML.
– Finally, the user profile (P^k_∗) vector size is 8+7+1=16 (refer to Fig. 7), i.e., the number of genres is 8, the age group ranges are 7, and the gender is in scalar form.
– For rating-scale consistency, we have replaced the rating values from {0, 0.2, 0.4, 0.6, 0.8, 1} with {1, 2, 3, 4, 5, 5}, the same as [19] have done in the CBT method.

Table 8 shows the overall description of the datasets after the preprocessing steps. We observe that the sparsity level of ML is lower than the sparsity level of EM. To validate the effectiveness of the transfer learning mechanism, we consider ML and EM as the source domain and the target domain, respectively.

5.3 Experiment protocols

Furthermore, to validate the effectiveness of the proposed UP-CDRSs method in the CDRSs model, experiments are done on various sparsity levels of the target domain. The target domain's ratings are divided based on the number of ratings rated by each user {1%, 1.5%, 2%, 2.5%, 2.9%}; for instance, with 6,040 users and 1,628 items in a domain, taking 1% of the ratings of each user gives a total number of ratings of (0.01 ∗ 6,040 ∗ 1,628 ≈ 98,331). So, a total of five experiments are done on the basis of the number of known ratings.

We have adopted a 10-fold cross-validation process, where the ratings of the target domain are divided into 10 equal parts: 9 parts are used for training and the remaining part for testing. After that, the training part of the target domain is concatenated with the source domain. The same process is applied ten times, once for each test part. After evaluating each part, the mean value over all the parts shows the overall performance of the method. We have also used a 95% confidence interval

Table 7 Similarities between genres

Act Ani Art Cla Com Dra Fam Hor Rom Thr

Act 0.6130 0.0130 0.0222 0.0295 0.0506 0.0579 0.0256 0.0420 0.0063 0.1551
Adv 0.2385 0.0464 0.0138 0.0767 0.0210 0.0291 0.1942 0.0154 0.0083 0.0701
Ani 0.0000 0.8333 0.0095 0.0364 0.0075 0.0000 0.2222 0.0000 0.0000 0.0000
Chi 0.0143 0.2832 0.0108 0.0764 0.0465 0.0034 0.6884 0.0000 0.0000 0.0000
Com 0.0274 0.0221 0.0641 0.0946 0.6255 0.0647 0.1000 0.0141 0.1232 0.0171
Cri 0.1034 0.0000 0.0079 0.0508 0.0300 0.0747 0.0049 0.0533 0.0000 0.1013
Doc 0.0044 0.0000 0.0419 0.0000 0.0122 0.0684 0.0057 0.0079 0.0000 0.0000
Dra 0.0440 0.0016 0.1828 0.0950 0.0921 0.5157 0.0350 0.0169 0.0946 0.0681
Fan 0.0103 0.0545 0.0157 0.0048 0.0158 0.0040 0.0896 0.0103 0.0000 0.0000
FNo 0.0211 0.0000 0.0000 0.0615 0.0000 0.0020 0.0000 0.0105 0.0000 0.0444
Hor 0.0122 0.0091 0.0041 0.0397 0.0186 0.0036 0.0000 0.6237 0.0000 0.1198
Mus 0.0044 0.2222 0.0182 0.1014 0.0272 0.0095 0.1282 0.0000 0.0298 0.0000
Mys 0.0326 0.0000 0.0185 0.0780 0.0073 0.0253 0.0000 0.0081 0.0120 0.1134
Rom 0.0462 0.0082 0.0850 0.0815 0.1364 0.1201 0.0182 0.0070 0.3891 0.0216
Sci-fi 0.1713 0.0268 0.0203 0.0391 0.0255 0.0239 0.0100 0.0690 0.0050 0.1027
Thr 0.2078 0.0086 0.0278 0.0724 0.0181 0.0611 0.0031 0.0916 0.0063 0.4818
War 0.0396 0.0000 0.0446 0.0779 0.0240 0.0650 0.0000 0.0000 0.0339 0.0177
Wes 0.0361 0.0000 0.0000 0.0288 0.0156 0.0264 0.0067 0.0000 0.0137 0.0104
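The genre-overlap computation of Eq. (23) behind Table 7 can be sketched as follows (the movie-ID sets are hypothetical):

```python
# Jaccard similarity between two genres, where each genre is represented by
# the set of overlapping movies tagged with it.
action_ml = {"m1", "m2", "m3", "m5"}   # movies tagged 'Act' in ML (hypothetical)
action_em = {"m1", "m2", "m4", "m5"}   # movies tagged 'Act' in EM (hypothetical)
jaccard = len(action_ml & action_em) / len(action_ml | action_em)
print(jaccard)  # 0.6
```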
Table 8 Description of the datasets after preprocessing

                  ML        EM
# of users        6,040     6,040
# of items        2,591     1,628
# of ratings      589,395   284,886
Sparsity level    96.27%    97.10%
# genres          8         8

while calculating the average value over all the test sets. To find the best trade-off parameters, 20% of the training data is used as a validation set.

5.4 Evaluation metrics

To evaluate our proposed work, we have used Mean Absolute Error (MAE) and Root Mean Square Error (RMSE) as the evaluation metrics, which are defined as:

MAE = Σ_{(i,j)∈R_T} | R_{i,j} − R̂_{i,j} | / |R_T|   (24)

RMSE = sqrt( Σ_{(i,j)∈R_T} ( R_{i,j} − R̂_{i,j} )² / |R_T| )   (25)

where R_T denotes the set of predicted ratings in the target domain, and R_{i,j} and R̂_{i,j} denote the actual and predicted ratings of user i on item j, respectively.

5.5 Compared methods and parameters setting

In this subsection, we describe the compared methods that are used for comparison with our proposed UP-CDRSs method. This subsection also covers the parameter settings, because applying any method for validation needs some threshold values, learning parameter values, etc.

The Average Filling (AF) method takes very little time to predict the ratings: in a user-item rating matrix, we simply fill in the mean value of all the observed item ratings provided by a user. The prediction is as follows:

R̂_{i,∗} = Σ_{j=1}^{M^t} I^t_{i,j} ⊙ R^t_{i,j} / Σ_{j=1}^{M^t} I^t_{i,j}   (26)

where ⊙ is the element-wise product. Another method in the same manner is Average Filling with bias (AF with bias): rather than blindly filling in the mean value of the ratings, it learns the user bias and item bias from the observed ratings for critical users and items, and predicts the ratings as follows [9]:

R̂_{i,j} = μ + b_i + b_j   (27)

where μ is the global mean value of the user-item rating matrix, and b_i and b_j are the user bias and item bias, respectively. Here, we have used λ_{b_i} = λ_{b_j} = .001 as trade-off parameters to avoid the over-fitting problem.

Collaborative filtering with topK neighbors (kNN-CF) is one of the most traditional techniques of RSs. We have fixed the value topK = 50. Two variants of kNN-CF are UPC-CF and NUSC-CF. The first method uses clustering to find similar user groups, followed by the selection of the topK neighbors; we have fixed topK = 50, the same as for kNN-CF. The second method uses a sub-spacing concept, so we have fixed d = 5, where d denotes the dimensionality [6]. Another compared method is MF [9], which provides a lower-rank approximation of the user-item matrix. The prediction is done using the inner product of the corresponding user and item vectors, estimated as follows:

R̂_{i,j} = U_i^t V_j^{tT} + b_{i,j}   (28)

where U_i, V_j ∈ R^{1×f} and b_{i,j} = μ + b_i + b_j. We have used λ_u = λ_v = .001 as trade-off parameters, and the size of the latent factors is fixed at f = 10. Similarly, constrained Probabilistic Matrix Factorization (cPMF) is an extended version of probabilistic matrix factorization (PMF) [14]: rather than learning two latent vectors, one for the user and one for the item, it adds an additional constraint on the user-specific feature vectors for infrequent users. The prediction is estimated as:

R̂_{i,j} = ( U_i^t + Σ_{j=1}^{M^t} I^t_{i,j} W_j / Σ_{j=1}^{M^t} I^t_{i,j} ) V_j^{tT}   (29)

where W_j ∈ R^{1×f} is an additional vector of latent factors. In this method, we first scale the ratings to the interval [0, 1] using the function f(x) = (x − 1)/(k − 1), where k is the maximum rating value of the given matrix. The trade-off parameter values are λ_u = λ_v = λ_w = .002, and the size of the latent factors is fixed at f = 30.

To compare with the transfer learning mechanism in RSs, the Codebook Transfer (CBT) method [19] has been proposed for knowledge transfer between domains through a codebook. The cluster size of CBT is set to 50, and topK nearest neighbors K = 30 is fixed for the Pearson correlation coefficients. In the case of our proposed UP-CDRSs method, the trade-off parameter values are λ_s = λ_t = .002 and α = .001, and the size of the latent factors is fixed at f = 30.

5.6 Summary of the experimental results

All the state-of-the-art methods described in the aforementioned subsection and our proposed work are experimented on the publicly available datasets. In addition, for validating and analysing our UP-CDRSs method with the transfer learning mechanism, the target domain's ratings are divided based on the
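The metrics of Eqs. (24) and (25) can be sketched as follows (the rating pairs are hypothetical):

```python
import math

# MAE (Eq. 24) and RMSE (Eq. 25) over (actual, predicted) rating pairs.
pairs = [(4, 3.6), (2, 2.5), (5, 4.9), (3, 3.4)]
mae = sum(abs(r - p) for r, p in pairs) / len(pairs)
rmse = math.sqrt(sum((r - p) ** 2 for r, p in pairs) / len(pairs))
print(round(mae, 3), round(rmse, 3))  # 0.35 0.381
```

RMSE penalizes large errors more heavily than MAE, which is why both are reported.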
Table 9 Comparison results in terms of MAE

Methods        % of observed ratings
               1%               1.5%             2%               2.5%             2.9%
AF             1.1747 ± 0.0058  1.0921 ± 0.0040  1.0792 ± 0.0036  1.0735 ± 0.0024  1.0639 ± 0.0031
kNN-CF         1.0542 ± 0.0054  1.0374 ± 0.0082  1.0329 ± 0.0091  1.0281 ± 0.0048  1.0151 ± 0.0078
AF with bias   0.8535 ± 0.0058  0.8485 ± 0.0038  0.8416 ± 0.0036  0.8408 ± 0.0026  0.8371 ± 0.0028
MF             0.8006 ± 0.0039  0.7907 ± 0.0025  0.7830 ± 0.0029  0.7765 ± 0.0040  0.7392 ± 0.0039
cPMF           0.7921 ± 0.0049  0.7816 ± 0.0064  0.7790 ± 0.0038  0.7459 ± 0.0028  0.7245 ± 0.0045
CBT            0.7839 ± 0.0058  0.7845 ± 0.0049  0.7746 ± 0.0035  0.7576 ± 0.0035  0.7249 ± 0.0038
UPC-CF         0.8542 ± 0.0074  0.8452 ± 0.0041  0.8256 ± 0.0052  0.8011 ± 0.0075  0.7888 ± 0.0091
NUSC-CF        0.7912 ± 0.0082  0.7875 ± 0.0042  0.7653 ± 0.0046  0.7622 ± 0.0074  0.7503 ± 0.0081
UP-CDRSs       0.7746 ± 0.0025  0.7617 ± 0.0035  0.7538 ± 0.0040  0.7215 ± 0.0043  0.7042 ± 0.0026

mance of our UP-CDRSs method, we have used MAE and methods, respectively. Table 11 shows overall performance
RMSE as evaluation metrics. Table 9 and 10 show MAE based on average MAE and average RMSE of different
and RMSE performance of proposed method over existing sparsity levels. Bold symbols in Tables 9, 10 and 11 show
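The two average-filling baselines described by Eqs. (26) and (27) can be sketched in a few lines of NumPy. The toy rating matrix and the decoupled-means estimate of the biases below are illustrative assumptions only, not the exact implementation used in the experiments.

```python
import numpy as np

# Toy user-item rating matrix; 0 marks an unobserved rating.
R = np.array([[5., 3., 0., 1.],
              [4., 0., 0., 1.],
              [0., 1., 5., 4.]])
I = (R > 0).astype(float)          # indicator matrix I^t

# Average filling (Eq. 26): each user's prediction is the mean
# of that user's observed ratings.
user_mean = (I * R).sum(axis=1) / I.sum(axis=1)

# Average filling with bias (Eq. 27): global mean plus user and
# item biases, each estimated here from simple observed means.
mu = R[I > 0].mean()                               # global mean
b_user = user_mean - mu                            # user bias b_i
b_item = (I * R).sum(axis=0) / I.sum(axis=0) - mu  # item bias b_j
R_hat = mu + b_user[:, None] + b_item[None, :]
```

For the first user in the toy matrix, the AF prediction is the mean of the observed ratings (5, 3, 1), and the biased prediction for each item shifts the global mean by that user's and item's deviation.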
Table 10 Comparison results in terms of RMSE

Methods        % of observed ratings
               1%                1.5%              2%                2.5%              2.9%
AF             1.4442 ± 0.0093   1.3878 ± 0.0048   1.3771 ± 0.0051   1.3745 ± 0.0026   1.3659 ± 0.0039
kNN-CF         1.1549 ± 0.0079   1.1254 ± 0.0041   1.1081 ± 0.0045   1.0973 ± 0.0028   1.0882 ± 0.0053
AF with bias   1.1139 ± 0.0073   1.1106 ± 0.0050   1.1015 ± 0.0034   1.0869 ± 0.0029   1.0741 ± 0.0040
MF             1.0560 ± 0.0058   1.0483 ± 0.0054   1.0384 ± 0.0042   1.0289 ± 0.0044   1.0188 ± 0.0026
cPMF           1.0526 ± 0.0039   1.0485 ± 0.0036   1.0257 ± 0.0032   1.0146 ± 0.0044   1.0085 ± 0.0038
CBT            1.0498 ± 0.0048   1.0347 ± 0.0066   1.0212 ± 0.0032   1.0047 ± 0.0044   0.9914 ± 0.0057
UPC-CF         1.1058 ± 0.0028   1.0951 ± 0.0091   1.0746 ± 0.0079   1.0731 ± 0.0076   1.0701 ± 0.0025
NUSC-CF        1.1019 ± 0.0042   1.0945 ± 0.0046   1.0891 ± 0.0018   1.0745 ± 0.0078   1.0684 ± 0.0086
UP-CDRSs       1.0315 ± 0.0034   1.0234 ± 0.0045   1.0179 ± 0.0065   0.9987 ± 0.0050   0.9841 ± 0.0061
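The MAE and RMSE metrics used in Tables 9-11 can be computed as below, together with the relative-improvement figure used when comparing methods on the average values of Table 11 (e.g. UP-CDRSs, 0.7432, against CBT, 0.7651, on MAE). The helper names are ours, not from the paper.

```python
import numpy as np

def mae(y_true, y_pred):
    # Mean absolute error over the predicted ratings
    return float(np.mean(np.abs(y_true - y_pred)))

def rmse(y_true, y_pred):
    # Root mean squared error over the predicted ratings
    return float(np.sqrt(np.mean((y_true - y_pred) ** 2)))

# Relative MAE improvement of UP-CDRSs over CBT, assuming the
# percentages quoted in the observations are relative reductions:
gain = (0.7651 - 0.7432) / 0.7651 * 100   # roughly 2.9%, i.e. about 3%
```

Lower values of both metrics are better; the relative reduction recovers, to rounding, the ~3% MAE improvement over CBT quoted in the observations.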
User profile as a bridge in cross-domain...
Table 11 The overall performance based on average MAE and average RMSE of different sparsity levels

Methods        MAE               RMSE
AF             1.0967 ± 0.0038   1.3899 ± 0.0051
kNN-CF         1.0335 ± 0.0071   1.1148 ± 0.0049
AF with bias   0.8443 ± 0.0037   1.0974 ± 0.0045
MF             0.7780 ± 0.0034   1.0381 ± 0.0045
cPMF           0.7646 ± 0.0045   1.0300 ± 0.0038
CBT            0.7651 ± 0.0043   1.0204 ± 0.0050
UPC-CF         0.8230 ± 0.0067   1.0837 ± 0.0060
NUSC-CF        0.7713 ± 0.0065   1.0850 ± 0.0054
UP-CDRSs       0.7432 ± 0.0034   1.0111 ± 0.0051

the higher performance (lower MAE and RMSE) of our proposed UP-CDRSs method over existing methods. The following observations can be made from the results:

(i) The first method, AF, gave the worst results in terms of accuracy. Another observation is that increasing the size of the training set brought little improvement in accuracy.
(ii) kNN-CF is one of the traditional CF methods. It gave better results than the AF method, and as the training data increased, the performance of kNN-CF also increased.
(iii) AF with bias and MF both gave impressive results because both methods belong to the learning-based mechanism. Compared with kNN-CF, MF achieved great performance, i.e., improvements of 28% and 18% over kNN-CF in MAE and RMSE, respectively.
(iv) cPMF is a variant of the MF method wherein one extra latent factor is used for critical users, so it gave better results than MF.
(v) As a transfer learning method, CBT provided the best results compared with the non-transfer learning methods. We can say that transfer learning performs well by using the knowledge of another related domain.
(vi) UPC-CF and NUSC-CF, the variants of kNN-CF, provided better results than kNN-CF.
(vii) Our UP-CDRSs method gave superior results compared with all state-of-the-art methods, both with and without transfer learning. Compared with CBT, the MAE and RMSE performance improved by 3% and 1%, respectively; compared with NUSC-CF, MAE improved by 3.6%.

6 Conclusion and future direction

In this paper, we have proposed a novel method named User Profile as a Bridge in Cross-domain Recommender Systems (UP-CDRSs) for knowledge transfer from a source to a target domain. To establish a bridge between domains, we use a user profile built from the user's preferences and demographic information. After building the user profiles in both domains, we calculate similarities between the user profiles of the distinct domains, and then apply an alternating least squares approach to an objective function formulated through a probabilistic graphical model. After learning all latent factors of users and items, predictions for unrated items in the target domain are made using the corresponding latent factors of the user and item. We have conducted five experiments to validate the UP-CDRSs method and analysed how effective transfer learning is in the CDRSs framework. The experimental results show that our proposed UP-CDRSs method performs significantly better than several methods with and without transfer learning.

In future, we will extend our method to slightly different types of domains, for instance movies versus books, to check how effective the transfer learning framework is.

Publisher's note Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

References

1. Adomavicius G, Tuzhilin A (2005) Toward the next generation of recommender systems: a survey of the state-of-the-art and possible extensions. IEEE Trans Knowl Data Eng 17(6):734–749
2. Li Z, Zhao H, Liu Q, Huang Z, Mei T, Chen E (2018) Learning from history and present: next-item recommendation via discriminatively exploiting user behaviors. In: KDD, pp 1734–1743. ACM
3. Candillier L, Meyer F, Boullé M (2007) Comparing state-of-the-art collaborative filtering systems. Lect Notes Comput Sci 4571:548
4. Jiang L, Cheng Y, Li Y, Li J, Yan H, Wang X (2018) A trust-based collaborative filtering algorithm for e-commerce recommendation system. J Ambient Intell Humaniz Comput
5. Bobadilla J, Ortega F, Hernando A, Gutiérrez A (2013) Recommender systems survey. Knowl-Based Syst 46:109–132
6. Koohi H, Kiani K, Hwangbo H, Kim Y (2017) A new method to find neighbor users that improves the performance of collaborative filtering. Expert Syst Appl 89:254–265
7. Zhang J, Lin Y, Lin M, Liu J (2016) An effective collaborative filtering algorithm based on user preference clustering. Appl Intell 45(2):230–240
8. Dakhel AM, Malazi HT, Mahdavi M (2018) A social recommender system using item asymmetric correlation. Appl Intell 48(3):527–540
9. Koren Y, Bell R, Volinsky C (2009) Matrix factorization techniques for recommender systems. Computer 42(8):30–37
10. Li Y, Wang D, He H, Jiao L, Xue Y (2017) Mining intrinsic information by matrix factorization-based approaches for collaborative filtering in recommender systems. Neurocomputing 249:48–63
11. Zhang F, Lu Y, Chen J, Liu S, Ling Z (2017) Robust collaborative filtering based on non-negative matrix factorization and R1-norm. Knowl-Based Syst 118:177–190
12. Hernando A, Bobadilla J, Ortega F (2016) A non negative matrix factorization for collaborative filtering recommender systems based on a Bayesian probabilistic model. Knowl-Based Syst 97:188–202
13. Himabindu TVR, Padmanabhan V, Pujari AK (2018) Conformal matrix factorization based recommender system. Information Sciences
14. Salakhutdinov R, Mnih A (2007) Probabilistic matrix factorization. In: Proceedings of the 20th International Conference on Neural Information Processing Systems, NIPS'07, pp 1257–1264, USA. Curran Associates Inc
15. Pan W, Yang Q (2013) Transfer learning in heterogeneous collaborative filtering domains. Artif Intell 197:39–55
16. Pan W (2016) A survey of transfer learning for collaborative recommendation with auxiliary data. Neurocomputing 177:447–453
17. Xin X, Liu Z, Lin C-Y, Huang H, Wei X, Guo P (2015) Cross-domain collaborative filtering with review text. In: Proceedings of the 24th international conference on artificial intelligence, IJCAI'15, pp 1827–1833. AAAI Press
18. Guo G, Qiu H, Tan Z, Liu Y, Ma J, Wang X (2017) Resolving data sparsity by multi-type auxiliary implicit feedback for recommender systems. Knowl-Based Syst 138:202–207
19. Li B, Yang Q, Xue X (2009) Can movies and books collaborate? Cross-domain collaborative filtering for sparsity reduction. In: Proceedings of the 21st International Joint Conference on Artificial Intelligence, IJCAI'09, pp 2052–2057, San Francisco, CA, USA. Morgan Kaufmann Publishers Inc
20. Khan MM, Ibrahim R, Ghani I (2017) Cross domain recommender systems: a systematic literature review. ACM Comput Surv 50(3):1–34
21. Zhu F, Wang Y, Chen C, Liu G, Orgun M, Wu J (2017) A deep framework for cross-domain and cross-system recommendations, pp 3711–3717
22. He M, Zhang J, Yang P, Yao K (2018) Robust transfer learning for cross-domain collaborative filtering using multiple rating patterns approximation. In: Proceedings of the 11th ACM international conference on web search and data mining - WSDM '18, pp 225–233
23. Pan SJ, Yang Q (2010) A survey on transfer learning. IEEE Trans Knowl Data Eng 22(10):1345–1359
24. Cremonesi P, Tripodi A, Turrin R (2011) Cross-domain recommender systems. In: ICDMW 2011: IEEE 11th international conference on data mining workshops, pp 496–503
25. Shi Y, Larson M, Hanjalic A (2011) Tags as bridges between domains: improving recommendation with tag-induced cross-domain collaborative filtering. In: Proceedings of the 19th international conference on user modeling, adaption, and personalization, UMAP'11. Springer-Verlag, Berlin, pp 305–316
26. Enrich M, Braunhofer M, Ricci F (2013) Cold-start management with cross-domain collaborative filtering and tags. Springer, Berlin, pp 101–112
27. Fernández-Tobías I (2014) Exploiting social tags in matrix factorization models for cross-domain collaborative filtering. In: CBRecSys@RecSys, pp 34–41
28. Sahu AK, Dwivedi P, Kant V (2018) Tags and item features as a bridge for cross-domain recommender systems. Procedia Comput Sci 125:624–631
29. Bishop CM (2006) Pattern recognition and machine learning (information science and statistics). Springer-Verlag New York, Inc., Secaucus
30. Al-Shamri MYH (2016) User profiling approaches for demographic recommender systems. Knowl-Based Syst 100:175–187
31. Ma H, Yang H, Lyu MR, King I (2008) SoRec: social recommendation using probabilistic matrix factorization. In: Proceedings of the 17th ACM conference on information and knowledge management, CIKM '08, pp 931–940, New York, NY, USA. ACM
32. Yu X, Chu Y, Jiang F, Guo Y, Gong D (2018) SVMs classification based two-side cross domain collaborative filtering by inferring intrinsic user and item features. Knowl-Based Syst 141:80–91
33. Zheng X, Luo Y, Sun L, Ding X, Ji Z (2018) A novel social network hybrid recommender system based on hypergraph topologic structure. World Wide Web 21(4):985–1013
34. Chou S-Y, Yang Y-H, Jang J-SR, Lin Y-C (2016) Addressing cold start for next-song recommendation. In: Proceedings of the 10th ACM conference on recommender systems - RecSys '16, pp 115–118
35. Valdéz ERN, Lovelle JMC, Martínez SO, García-Díaz V, Ordoñez de Pablos P, Marín CEM (2012) Implicit feedback techniques on recommender systems applied to electronic books. Comput Hum Behav 28(4):1186–1193
36. Crespo RG, Martínez OS, Lovelle JMC, García-Bustelo CPB, Gayo JEL, Ordoñez de Pablos P (2011) Recommendation system based on user interaction data applied to intelligent electronic books. Comput Hum Behav 27(4):1445–1449
37. Dang Thanh N, Son LH, Ali M (2017) Neutrosophic recommender system for medical diagnosis based on algebraic similarity measure and clustering. In: 2017 IEEE International Conference on Fuzzy Systems (FUZZ-IEEE), pp 1–6. https://doi.org/10.1109/FUZZ-IEEE.2017.8015387
38. Le HS, Thong NT (2015) Intuitionistic fuzzy recommender systems: an effective tool for medical diagnosis. Knowl-Based Syst 74:133–150
39. Dwivedi P, Bharadwaj KK (2015) E-learning recommender system for a group of learners based on the unified learner profile approach. Expert Syst 32(2):264–276
40. Liu H, Hu Z, Mian A, Tian H, Zhu X (2014) A new user similarity model to improve the accuracy of collaborative filtering. Knowl-Based Syst 56(Supplement C):156–166
41. Son LH (2015) HU-FCF++: a novel hybrid method for the new user cold-start problem in recommender systems. Eng Appl Artif Intell 41:207–222
42. Biswas S, Lakshmanan LVS, Roy SB (2017) Combating the cold start user problem in model based collaborative filtering. CoRR, arXiv:1703.00397
43. Son LH (2016) Dealing with the new user cold-start problem in recommender systems: a comparative review. Inf Syst 58:87–104
44. Fernández-Tobías I, Cantador I, Kaminskas M, Ricci F (2012) Cross-domain recommender systems: a survey of the state of the art. In: Spanish conference on information retrieval
45. Winoto P, Tang T (2008) If you like the devil wears prada the book, will you also enjoy the devil wears prada the movie? A study of cross-domain recommendations. New Gener Comput 26(3):209–225
46. Berkovsky S, Kuflik T, Ricci F (2007) Cross-domain mediation in collaborative filtering. User Model 4511:355–359
47. Hu L, Cao J, Xu G, Cao L, Gu Z, Zhu C (2013) Personalized recommendation via cross-domain triadic factorization. In: Proceedings of the 22nd international conference on World Wide Web - WWW '13, pp 595–606
48. Pan W, Xiang EW, Liu NN, Yang Q (2010) Transfer learning in collaborative filtering for sparsity reduction. In: Proceedings of the 24th AAAI conference on artificial intelligence, AAAI'10, pp 230–235. AAAI Press
49. Fang Z, Gao S, Li B, Li J, Liao J (2016) Cross-domain recommendation via tag matrix transfer. In: Proceedings - 15th IEEE international conference on data mining workshop, ICDMW 2015, pp 1235–1240
50. Zhao L, Pan SJ, Yang Q (2017) A unified framework of active transfer learning for cross-system recommendation. Artif Intell 245:38–55
51. Koren Y, Bell R (2015) Advances in collaborative filtering. In: Recommender systems handbook, 2nd edn, pp 77–118
52. Li T, Ding C (2006) The relationships among various nonnegative matrix factorization methods for clustering. In: Proceedings - IEEE international conference on data mining, ICDM, pp 362–371
53. Al-Shamri MYH, Bharadwaj KK (2008) Fuzzy-genetic approach to recommender systems based on a novel hybrid user model. Expert Syst Appl 35(3):1386–1399
54. Huang J, Zhu K, Zhong N (2016) A probabilistic inference model for recommender systems. Appl Intell 45(3):686–694