A CrossPlatform Personalized Recommender System For Connecting ECommerce and Social Network - 2023 - MDPI
Article
A Cross‑Platform Personalized Recommender System for
Connecting E‑Commerce and Social Network
Jiaxu Zhao 1 , Binting Su 2, *, Xuli Rao 1 and Zhide Chen 3, *
Abstract: In this paper, we build a recommender system for a new study area: social commerce, which combines rich information about social network users and products on an e-commerce platform. The idea behind this recommender system is that a social network contains abundant information about its users which could be exploited to create profiles of the users. For social commerce, the quality of the profiles of potential consumers determines whether the recommender system is a success or a failure. In our work, not only the user’s textual information but also the tags and the relationships between users are considered in the process of building the user profiling model. A topic model is adopted in our system, and a feedback mechanism is also designed. Then, we apply a collaborative filtering method and a clustering algorithm in order to obtain high recommendation accuracy. We conduct an empirical analysis based on real data collected from a social network and an e-commerce platform. We find that the social network has an impact on e-commerce, so social commerce could be realized. Simulations show that our topic model performs better than the standard LDA model in topic finding, meaning that our profile-building model is suitable for a social commerce recommender system.
Recommender systems for e-commerce companies have been well studied [5,6]. However, most existing recommender systems use only information from e-commerce to make recommendations, such as consumers’ purchase history logs or rating scores of purchased commodities. With the integration of e-commerce and social platforms, new recommender systems should be designed that make full use of social network user information and product information. Some studies have begun to use social network information to improve the accuracy of recommendations on e-commerce platforms. Damian et al. proposed a web recommender system for e-commerce that traces clients and analyzes their activities on Facebook [7]. However, the user profiles are based only on keywords from the users’ activity and their friends’ activities. Hao Ma et al. improved the recommender system by adding social contextual information, i.e., social tags and latent information obtained by heterogeneous data mining [8]. Zhao et al. extracted the demographic information of both products and social network users’ activities, and then leveraged the demographic information to improve the recommendation performance on e-commerce websites [9]. They also proposed a cross-platform recommender system that learns both users’ and products’ features from data collected from e-commerce websites using recurrent neural networks and then applies a modified gradient boosting trees method to transform users’ social networking features into user embeddings [10]. Using these features, their work realized cross-site product recommendation and solved the cold-start problem. With the rise of AI technology, Pan et al. proposed a unified framework of active transfer learning for cross-system recommendation, which used an active learning principle to construct entity correspondences across systems [11]. Xiang et al. integrated the fuzzy association rule and complex preference into a recommendation model to improve the efficiency of the traditional collaborative filtering recommendation algorithm [12]. However, these cross-platform recommendation methods rely solely on sparse social network data and e-commerce data, and they do not fully integrate textual information, tagging information and behavioral information from the social network. In fact, the social network contains abundant, detailed, time-resolved, real-world user data, e.g., tweets (microblogs), tags and relations with other users, which motivates us to extract users’ information and capture users’ interest profiles for cross-platform recommendations.
In this paper, we propose a novel cross-platform recommender system (CPRec) that makes full use of the abundant information on social networks to improve recommendation accuracy. In CPRec, we build both a user profile and a commodity profile from data on social networks and e-commerce. We design an improved topic model, based on Latent Dirichlet Allocation (LDA) [13], for detecting users’ interest profiles from the information they have released in the past. The users’ tags and their followees’ profiles are also used when building the users’ profiles. After obtaining a user profile, we make recommendations based on a recommendation score calculated from the user profile and the commodity profiles. Considering that each user will take different actions after he/she receives the recommended products, a feedback mechanism is designed for CPRec. Since a user-commodity-score matrix becomes available once CPRec is in operation, we develop an improved collaborative filtering algorithm that incorporates user profiles to make further recommendations. Finally, in order to show the performance of the proposed system, we evaluate and analyze CPRec based on two platforms, Sina Weibo and Alibaba. The contributions of this paper are summarized as follows:
• We propose a novel cross‑platform personalized recommender system, CPRec, for
recommending e‑commerce commodities to users on social network platforms.
• An interest mining process is proposed to build the user interest profiles, which makes
full use of users’ information on social networks.
• We propose three subdivisions for CPRec, i.e., recommendations for individuals, a
feedback mechanism and an improved collaborative filtering algorithm.
Future Internet 2023, 15, x FOR PEER REVIEW 3 of 20
• The experimental results validate the feasibility of the CPRec, the veracity of user profiling and the superior performance of our improved collaborative filtering algorithm compared with some existing algorithms.
The remainder of the paper is organized as follows. In Section 2, an overview of our proposed cross-platform recommender system is given. System building and user profiling are presented in Section 3, and commodity profiling and recommendation subdivisions are described in detail. Section 4 discusses the experimental results. Section 5 concludes this paper and outlines future work.
2. System Model
2.1. Preliminary
2.1.1. Social Networks
Social networks can be illustrated as a graph of the relationships and interactions within a group of individuals, and they play a fundamental role as a medium for the spread of information, ideas and influence among their members. In this paper, we consider a general model of social networks, which is abstracted as a set of nodes and a set of edges between the nodes. Each node can be considered as an individual or as a collective unit such as a department, organization or family. There exists an edge between two nodes if they have a relation. Figure 1 shows a brief instance of a social network, which contains four nodes (users in the social network) and their relations (following a person). In Figure 1, user a follows user c while a is being followed by b. In the social network, a user followed by other users is defined as a followee, and those who follow this user are called followers. In practice, users are most likely to follow a user whose interests align with their own. In Sina Weibo, users typically write a short message (limited to 140 characters) and upload some pictures to show moments in their lives or interesting things.
Figure 1. An abstract graph of social network.
The short message is an important part of our model. While a single short microblog may be unable to depict the full scope of a user’s interests or thoughts, we collect the user’s historical microblogs and divide them into groups for analysis. The user’s microblogs from a uniform time period $\Delta t$ are represented as the microblog group $M_{\Delta t}$. For a given user, his/her entire microblog history $M$ can be denoted as $M = (M_{\Delta t_0}, M_{\Delta t_1}, M_{\Delta t_2}, \ldots, M_{\Delta t_n})$, where $t_0$ represents the current time slot. Users’ tags are another useful source of information. Selecting tags is an important, though optional, part of the registration process after users create their accounts with the social network. Tags give obvious information about the interests that users want to present to others, such as singing, eating, shopping, traveling, or IT. Let $T_u = (T_1, T_2, T_3, \ldots, T_m)$ denote the tags that user $u$ selected. Another data source is followees. We denote user $u_i$’s followees as $F_{u_i} = (F_1, F_2, F_3, \ldots, F_l)$.
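The division of a microblog history into groups $M_{\Delta t_0}, M_{\Delta t_1}, \ldots$ can be sketched as follows. This is only a minimal illustration and not the implementation used in this work: the function name `group_microblogs`, the representation of a post as a `(timestamp, text)` pair and the choice of counting windows backwards from a reference time `now` are all assumptions made for the example.

```python
from collections import defaultdict
from datetime import datetime, timedelta


def group_microblogs(posts, now, window):
    """Group (timestamp, text) posts into uniform windows counted back from `now`.

    Index 0 plays the role of the current slot (Delta t_0), index 1 is the
    slot before it, and so on. Empty slots are kept as empty lists.
    """
    groups = defaultdict(list)
    for ts, text in posts:
        if ts > now:
            continue  # ignore posts dated after the reference time
        index = (now - ts) // window  # timedelta // timedelta yields an int
        groups[index].append(text)
    last = max(groups) if groups else -1
    return [groups.get(i, []) for i in range(last + 1)]


if __name__ == "__main__":
    now = datetime(2023, 1, 31)
    posts = [
        (datetime(2023, 1, 30), "red wine discount"),
        (datetime(2023, 1, 20), "weekend trip"),
    ]
    # Hypothetical window length of one week.
    print(group_microblogs(posts, now, timedelta(days=7)))
```

Each returned list can then be processed separately to obtain the interest profile for the corresponding time period.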
2.1.2. E-Commerce
In this paper, the e-commerce data source that we adopted is Taobao, which is a subsidiary of Alibaba and the most successful e-business platform in China. As of 2014, it had generated a total volume of 1.172 trillion RMB.
Commodities on e-commerce platforms are the key factor that we concentrate on, and they can be described by the commodity’s name, its description and the buyer’s information. These three kinds of data will be used to define the commodity profiles.
Figure 2. System model of the proposed cross-platform recommender system (CPRec).
3. The Proposed CPRec
3.1. User Profiling
User profiling is the core of the proposed recommender system. It is hardly possible to recommend suitable goods to a consumer without knowing his/her profiles. In this section, we aim to identify each user’s profiles, or what we may call their interests. Considering that users’ interests may vary, we model the users’ interest profiles as consisting of two components: stable interests and temporal interests. The stable interests have the characteristic of being time-immune, which means that they only change slightly as time passes. However, some interests may be generated due to reasons like the influence of a hot social trend, and we define these interests as temporal interests, meaning that they are short-term interests. In our model, we employ a time-weighting scheme to capture both stable interests and temporal interests from a user’s historical microblogs. Then, considering that tags are part of the interest criteria that users set for themselves at the initial time, which could be powerful evidence for defining a user’s stable interests, we propose an algorithm to combine the interest profiles drawn from these two sources. Last, we integrate the profiles of a user’s followees.
3.1.1. Latent Interest Profiles Obtained by Microblogs
Many studies related to users’ posted messages have been conducted [14–16], including research focusing on the problem of identifying influential users in a social network by taking into account the similarity of the topics that users post about, which is decided by a user’s posted messages [17]. However, we wish to detect a user’s latent interest profiles through the messages posted naturally by that user. The method we adopt is based on Latent Dirichlet Allocation (LDA), an unsupervised machine learning technique which has been widely used to detect latent topics in documents. LDA treats a single document as “a bag of words”, which means that it views a document as a vector of word counts. Each document is represented as a probability distribution over various topics, while each topic is represented as a probability distribution over a number of words, as shown in Figure 3.
Figure 3. An abstract representation of LDA principle.
Standard LDA may not fit the writing of microblog users, the reason being that a single microblog is usually short and contains only one topic, so the method we adopt is the Microblogs Topic Discovery Model (MTDM), which is based on the twitter-LDA in [18]. In Section 2, we have introduced the method for dividing a user’s microblogs into groups such that $M = (M_{\Delta t_0}, M_{\Delta t_1}, M_{\Delta t_2}, \ldots, M_{\Delta t_n})$. $M_{\Delta t_i}$ denotes all of the microblogs that the user released during time $\Delta t_i$, and it will be processed to obtain the user’s interest profiles during time $\Delta t_i$.
Suppose that there are $T$ hidden topics in microblogs $M_{\Delta t_i}$, that each topic $t$ has a word distribution $\phi^t$, and that there is a background words distribution $\phi^B$. $\pi$ denotes a Bernoulli distribution that manages the choice between background words and topic words. $\theta^u$ is the topic distribution of user $u$. Each multinomial distribution is governed by some symmetric Dirichlet distribution. Gibbs sampling is used to perform model inference. We leave out the derivation details and the sampling formulas here. Figure 4 describes the generation process of microblogs, and we illustrate the plate notation of the model in Figure 5.
Figure 4. Generation process of Microblogs.
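To make the generative story described above more concrete, the following sketch samples synthetic microblogs for a single user: one topic is drawn per microblog from $\theta^u$, and each word is drawn either from the topic’s word distribution $\phi^t$ or from the background distribution $\phi^B$, according to a Bernoulli switch. It is a simplified illustration under stated assumptions rather than the model implementation: the function names, the default hyperparameter values and the fixed switch probability `p_topic_word` (standing in for $\pi$) are hypothetical, and inference by Gibbs sampling is not shown.

```python
import random


def sample_dirichlet(rng, alpha, size):
    """Draw from a symmetric Dirichlet(alpha) by normalizing Gamma(alpha, 1) variates."""
    draws = [rng.gammavariate(alpha, 1.0) for _ in range(size)]
    total = sum(draws)
    return [d / total for d in draws]


def generate_microblogs(rng, vocab, n_topics, n_posts, n_words,
                        alpha=0.5, beta=0.1, p_topic_word=0.8):
    """Sample (topic, words) pairs for one user following the generative story."""
    phi_b = sample_dirichlet(rng, beta, len(vocab))  # background word distribution
    phi = [sample_dirichlet(rng, beta, len(vocab))   # one word distribution per topic
           for _ in range(n_topics)]
    theta_u = sample_dirichlet(rng, alpha, n_topics)  # topic distribution of the user
    posts = []
    for _ in range(n_posts):
        # A short microblog is assumed to carry exactly one topic.
        z = rng.choices(range(n_topics), weights=theta_u)[0]
        words = []
        for _ in range(n_words):
            # Bernoulli switch: topic word with probability p_topic_word,
            # background word otherwise.
            is_topic_word = rng.random() < p_topic_word
            dist = phi[z] if is_topic_word else phi_b
            words.append(rng.choices(vocab, weights=dist)[0])
        posts.append((z, words))
    return posts


if __name__ == "__main__":
    rng = random.Random(0)
    vocab = ["wine", "discount", "travel", "music", "food", "today"]
    for topic, words in generate_microblogs(rng, vocab, n_topics=3, n_posts=4, n_words=5):
        print(topic, words)
```

Fixing the switch probability instead of drawing it from its own prior keeps the sketch short; a fuller treatment would sample it as well.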
$$P_{u_M} = \{(I_1, \lambda_{u,1}), (I_2, \lambda_{u,2}), (I_3, \lambda_{u,3}), \ldots, (I_n, \lambda_{u,n})\} \qquad (4)$$
3.1.2. Interest Profiles Obtained from Tags
A user u’s tags are denoted as $T_u = (T_1, T_2, T_3, \ldots, T_m)$, which correspond to the user’s interests $I' = (I'_1, I'_2, I'_3, \ldots, I'_m)$. Considering that tags are dominant and stable indicators of a user’s interests, we set $P_{u_T}$ (the interest vector in user u’s tags) as in the following equation:
$$P_{u_T} = \{(I'_1, c_1), (I'_2, c_1), (I'_3, c_1), \ldots, (I'_m, c_1)\} \qquad (5)$$
where $c_1$ is a constant and the value of $c_1$ corresponds to the interest degree of $I'$.
3.1.3. Interest Profiles Obtained by Followees’ Profiles
In the real world, we have more connections with people who have tastes and habits similar to ours. The motivation for a user to follow another user in a social network is determined by whether he/she has nearly the same interests as this user. Namely, the followees’ profiles can mirror the user’s profiles to some extent. Therefore, we expand a user’s profiles via the people whom the user is following. Suppose user u follows l people $F_u = (F_1, F_2, F_3, \ldots, F_l)$ and their profiles $P_{u_{F_i}}$ have already been created. Hence, $P_{u_F}$, the interest profiles reflected by the followees, is calculated by
$$P_{u_F} = \sum_{i=1}^{l} \xi P_{u_{F_i}} \qquad (6)$$
where $\xi$ is a reduction factor.
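Equation (6) can be illustrated with a short sketch. The representation of a profile as a mapping from interest name to weight, the function name `followee_profile` and the example value of the reduction factor are assumptions made for illustration only.

```python
from collections import defaultdict


def followee_profile(followee_profiles, xi):
    """Sum the followees' interest profiles, each scaled by the reduction factor xi.

    Each profile is assumed to be a dict mapping an interest to its weight.
    """
    combined = defaultdict(float)
    for profile in followee_profiles:
        for interest, weight in profile.items():
            combined[interest] += xi * weight
    return dict(combined)


if __name__ == "__main__":
    profiles = [
        {"travel": 0.5, "food": 0.5},
        {"travel": 1.0},
    ]
    # Hypothetical reduction factor; the interest shared by both followees
    # ("travel") ends up with the larger weight.
    print(followee_profile(profiles, xi=0.5))
```

Interests shared by several followees accumulate weight, which matches the intuition that the followees’ profiles mirror the user’s own interests to some extent.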
define $P_u$ as the user’s stable interest profile, while the temporal interest profile refers to the current interest vector $P_{\Delta t_0}$, which is determined by the recent user data.
$$\mathrm{RScore}(P_u \mid P_c) = \frac{P_u^{\mathsf{T}} P_c}{\sqrt{P_u^{\mathsf{T}} P_u}\,\sqrt{P_c^{\mathsf{T}} P_c}} \qquad (7)$$
which means that, if the relevance of u and c is big enough, we recommend c to u. Other‑
wise, c is not a suitable recommended item for u.
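Equation (7) is the cosine of the angle between the user profile vector and the commodity profile vector. A minimal sketch is given below, assuming that both profiles are expressed as equal-length lists of weights over the same interests. The threshold value used to decide whether the relevance is “big enough” is a hypothetical placeholder, since no value is stated here.

```python
import math


def rscore(p_u, p_c):
    """Cosine similarity between a user profile and a commodity profile."""
    if len(p_u) != len(p_c):
        raise ValueError("profiles must have the same length")
    dot = sum(a * b for a, b in zip(p_u, p_c))
    norm_u = math.sqrt(sum(a * a for a in p_u))
    norm_c = math.sqrt(sum(b * b for b in p_c))
    if norm_u == 0 or norm_c == 0:
        return 0.0  # an empty profile carries no evidence of relevance
    return dot / (norm_u * norm_c)


def should_recommend(p_u, p_c, threshold=0.5):
    """Recommend c to u when the score reaches a (hypothetical) threshold."""
    return rscore(p_u, p_c) >= threshold


if __name__ == "__main__":
    user = [0.7, 0.2, 0.1]
    commodity = [0.6, 0.3, 0.1]
    print(rscore(user, commodity), should_recommend(user, commodity))
```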
where $R_{i,j}$ ($1 \le i \le m$, $1 \le j \le n$) represents the feedback score that user $i$ gives to commodity $j$. $R_{m \times n}$ will always be a sparse matrix, while the recommendation method aims to predict the unknown scores in $R_{m \times n}$.
Measuring the similarity between users is quite important in CF. There are three popu‑
lar methods of measurement used in CF, which are Cosine Similarity, Pearson Correlation
Coefficient Similarity and Modified Cosine Similarity. In our paper, Pearson Correlation
Coefficient Similarity has been used. The formula is shown below.
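$$\mathrm{sim}(u_a, u_b) = \frac{\sum_{c_k \in C_a \cap C_b} \left(R_{a,k} - \bar{R}_a\right)\left(R_{b,k} - \bar{R}_b\right)}{\sqrt{\sum_{c_k \in C_a \cap C_b} \left(R_{a,k} - \bar{R}_a\right)^2}\,\sqrt{\sum_{c_k \in C_a \cap C_b} \left(R_{b,k} - \bar{R}_b\right)^2}} \qquad (11)$$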
where $\mathrm{sim}(u_a, u_b) \in (0, 1)$ denotes the similarity of user $a$ and user $b$. $R_{a,k}$ and $R_{b,k}$ denote the scores that $u_a$ and $u_b$ give to commodity $c_k$. $\bar{R}_a$ and $\bar{R}_b$ are the average scores that $u_a$ and $u_b$ give to commodities. $C_a \cap C_b$ denotes the set of commodities to which both $u_a$ and $u_b$ have given a score.
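The Pearson Correlation Coefficient Similarity of Equation (11) can be sketched as follows. The representation of each user’s scores as a dict from commodity identifier to score is an assumption, as is the choice to return 0.0 when the two users share no scored commodity or when one of them shows no variation over the shared commodities. Following the definition above, the average is taken over all commodities that a user has scored.

```python
import math


def pearson_similarity(ratings_a, ratings_b):
    """Pearson similarity over the commodities that both users have scored."""
    common = set(ratings_a) & set(ratings_b)
    if not common:
        return 0.0  # no co-scored commodity: similarity is left undefined, use 0.0
    mean_a = sum(ratings_a.values()) / len(ratings_a)
    mean_b = sum(ratings_b.values()) / len(ratings_b)
    numerator = sum((ratings_a[c] - mean_a) * (ratings_b[c] - mean_b) for c in common)
    dev_a = math.sqrt(sum((ratings_a[c] - mean_a) ** 2 for c in common))
    dev_b = math.sqrt(sum((ratings_b[c] - mean_b) ** 2 for c in common))
    if dev_a == 0 or dev_b == 0:
        return 0.0  # a flat scoring pattern gives a zero denominator
    return numerator / (dev_a * dev_b)


if __name__ == "__main__":
    user_a = {"c1": 1.0, "c2": 2.0, "c3": 3.0}
    user_b = {"c1": 2.0, "c2": 4.0, "c3": 6.0}
    print(pearson_similarity(user_a, user_b))
```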
$P_{a,j}$ indicates the predicted score that $u_a$ gives to $c_j$, which means that $u_a$ has never given a score to $c_j$. Firstly, we need the set $S$ of users who have given scores to $c_j$. Then we calculate the similarity between $u_a$ and every user in $S$, choose the $k$ users with the greatest similarity as the set of neighbor users $S(u_a)$ and calculate $P_{a,j}$ by
Considering that each user has his or her own profiles, we make some improvements to the similarity formula and propose a collaborative filtering algorithm based on the user profiles (CFUP). $u_a$ and $u_b$ have $n$ profiles $A_{u_a} = (a_{u_a,1}, a_{u_a,2}, \ldots, a_{u_a,n})$ and $A_{u_b} = (a_{u_b,1}, a_{u_b,2}, \ldots, a_{u_b,n})$. We use the Euclidean metric to measure the profile difference of $u_a$ and $u_b$. $d(u_a, u_b)$ denotes the Euclidean distance of $u_a$ and $u_b$, and the calculation formula is shown below.
$$d(u_a, u_b) = \sqrt{\sum_{i=1}^{n} \left(a_{u_a,i} - a_{u_b,i}\right)^2} \qquad (13)$$
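As a small worked example of Equation (13), the profile distance can be computed as below. The function name `profile_distance` and the list representation of the profile vectors are assumptions; how this distance is subsequently combined with the similarity measure is not covered by the sketch.

```python
import math


def profile_distance(a_ua, a_ub):
    """Euclidean distance between two users' n-dimensional profile vectors."""
    if len(a_ua) != len(a_ub):
        raise ValueError("both users must have the same number of profile entries")
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a_ua, a_ub)))


if __name__ == "__main__":
    # Two hypothetical users with three profile entries each.
    print(profile_distance([0.6, 0.3, 0.1], [0.2, 0.5, 0.3]))
```

A smaller distance indicates that the two users have more similar profiles.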
Figure 7. The detailed process of collecting data.
For the purpose of obtaining credible and complete user profiles, the data collected from the social network platform contain four main types of information, which are shown in Table 2 in detail.
Table 2. Four types of information collected from social network platform.
| No. | Information Type |
|---|---|
| 1 | User’s whole microblogs since he/she registered on Weibo |
| 2 | User’s basic registered information |
| 3 | User’s tags which were set by the user |
| 4 | User’s followees (the people that the user follows) |
The same collection method is used in the commodity information collection process. The six types of commodity information are shown in Table 3.
Table 3. Six types of information collected from e-business platform.
| No. | Information Type |
|---|---|
| 1 | Commodity’s title |
| 2 | Commodity’s category |
| 3 | Detailed description of commodity |
| 4 | Sales record of commodity |
| 5 | Commodity’s reviews |
| 6 | Sales |
Figure 8. An instance of the special users we study.
In this experiment, five different promotional microblogs are selected and analyzed, and detailed information about them is shown in Table 4.
Table 4. Detailed information on five different promotional microblogs.
| Shop Owner | Followers Number | Forward Number | Like Number | Comment Number | A Brief Overview of Microblog Contents | Commodity Types that the Shop Owner Sells on E-Business Platform |
|---|---|---|---|---|---|---|
| Shop Owner 1 (Wu Yue San Ren) | 1,917,983 | 980 | 434 | 369 | 1. Discount on beauty products, women’s products and red wine. 2. New commodities. 3. Receive gifts for forwarding the microblog. | Food; Women’s products; Cosmetics |
| Shop Owner 2 (Emergency Female Superman) | 3,043,249 | 181 | 76 | 79 | 1. Festival discount on some commodities. 2. Chance to get a cash gift for forwarding this microblog. | Milk powder; Women’s products; Cosmetics |
| Shop Owner 3 (Wang Xiaoshan) | 1,968,905 | 69 | 104 | 57 | 1. Festival discount on red wine. | Red wine |
| Shop Owner 4 (Barbecue) | 62,809 | 68,933 | 9660 | 15,402 | 1. Chance to win a gift if you forward this microblog. | Barbecue |
| Shop Owner 5 (Zhou Xiaoxiong) | 590,873 | 1018 | 285 | 1458 | 1. New products. 2. Chances to get cash or clothing gift for users who forward this microblog. | Clothes |
After these microblogs were released, we kept track of all of the commodity sales in the e-business stores and collected their sales information. Figure 9 shows the metabolic curves of the different shop owners’ commodity sales.
In these figures, the abscissa denotes the time and each unit of abscissa represents one day, while the zero abscissa indicates the day that the shop owner released the promotional microblog, and the ordinate denotes the sales volume of the shop on the e-commerce platform. To display the change clearly, we use a red line to indicate places where the commodity sales are larger than before.
Future Internet 2023, 15, x FOR PEER REVIEW 14 of 20
Future Internet 2023, 15, 13 14 of 20
Figure 9. The
Figure 9. Thecurves
curvesofof different
different shop
shop owner’s
owner’s commodities
commodities sales. sales.
From thesefigures,
In these curves, the
we can observedenotes
abscissa that, after these
the timeshop owners
and each released their promo‑
unit of abscissa represents
tional microblogs, there was an obvious upward trend of their shops’ sales volumes. For
one day, while the zero abscissa indicates the day that the shop owner released the pro-
shop owners 1, 4 and 5, their commodities sales appeared to peak after they released the
motional microblog, and the ordinate denotes the sale volume of shop on e-commerce. To
promotional microblogs. For shop owner 2 and 3, there exist a continuously higher sales
display
volume thanthe change
the days clearly, we use athese
before. Although red sales
line to indicate
curves moveplaces where
differently, thegeneral
their commodities
sales are larger than before.
tendency is to go up, which means that promotional microblogs certainly have a facilitat‑
From these
ing function curves, we
for promoting can
sales. Inobserve that,a after
other words, socialthese shop
network hasowners
ability toreleased their pro-
play a role
motional microblogs, there was an obvious upward trend of their shops’ sales volumes.
For shop owners 1, 4 and 5, their commodities' sales appeared to peak after they released the
promotional microblogs. For shop owners 2 and 3, sales volumes were continuously higher than on
the days before. Although these sales curves move differently, their general tendency is to go up,
which means that promotional microblogs certainly help to promote sales. In other words, a social
network can play a role in creating economic value for e-commerce, which gives meaning to our
research field. In turn, if we know a user's profile in a social network, meaning that we know
what this user prefers, could we take a suitable commodity and recommend it to the corresponding
user? This is the process of cross-platform recommendation, which discovers a user's interests with
the help of social media and chooses products that fit those interests. In the next section, we
evaluate our method of obtaining user profiles.
4.3. Microblogs Topic Discovery Model and Standard LDA Model
In order to test the efficiency of our model, we quantitatively evaluate the MTDM against the
standard LDA model, i.e., treating all tweets as a single document. Both models have four
parameters, and different choices of parameters affect the inference results. In our experiment,
drawing on other research and on our own experience, the number of topics T is set to 5, α is 50/T,
β is 0.1 and the number of Gibbs sampling iterations is 1000. In addition, some preparatory work
must be completed before using these two models, such as deleting stop words, removing punctuation
and segmenting words. However, we omit the description of this work. Table 5 shows samples of the
results obtained using MTDM and the standard LDA model (we only list six words in each topic, and
we translate them into English in brackets).
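The parameter settings above can be made concrete with a small collapsed Gibbs sampler for standard LDA. This is a minimal sketch under stated assumptions, not the authors' implementation: the function name `lda_gibbs`, the integer word-id input format and the fixed random seed are hypothetical, while the defaults T = 5, α = 50/T, β = 0.1 and 1000 iterations follow the text. It does not implement MTDM.

```python
import random


def lda_gibbs(docs, vocab_size, T=5, alpha=None, beta=0.1, iters=1000, seed=0):
    """Collapsed Gibbs sampling for standard LDA (illustrative sketch).

    docs: list of documents, each a list of word ids in [0, vocab_size).
    Returns (doc-topic counts, topic-word counts).
    """
    alpha = 50.0 / T if alpha is None else alpha      # alpha = 50/T as in the text
    rng = random.Random(seed)                         # assumption: fixed seed for repeatability
    n_dt = [[0] * T for _ in docs]                    # tokens of doc d assigned to topic t
    n_tw = [[0] * vocab_size for _ in range(T)]       # times word w is assigned to topic t
    n_t = [0] * T                                     # total tokens assigned to topic t
    z = []                                            # current topic of every token

    # Random initial topic assignment.
    for d, doc in enumerate(docs):
        z_d = []
        for w in doc:
            t = rng.randrange(T)
            z_d.append(t)
            n_dt[d][t] += 1
            n_tw[t][w] += 1
            n_t[t] += 1
        z.append(z_d)

    for _ in range(iters):
        for d, doc in enumerate(docs):
            for i, w in enumerate(doc):
                # Remove the token's current assignment from the counts.
                t = z[d][i]
                n_dt[d][t] -= 1
                n_tw[t][w] -= 1
                n_t[t] -= 1
                # Full conditional p(z = k | everything else), up to a constant.
                weights = [
                    (n_dt[d][k] + alpha) * (n_tw[k][w] + beta) / (n_t[k] + vocab_size * beta)
                    for k in range(T)
                ]
                t = rng.choices(range(T), weights=weights)[0]
                # Record the newly sampled assignment.
                z[d][i] = t
                n_dt[d][t] += 1
                n_tw[t][w] += 1
                n_t[t] += 1
    return n_dt, n_tw
```

The top words of a topic can then be read off by sorting one row of the topic-word counts, which is how a six-word sample per topic might be produced.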
However, we repeat the process 100 times with different data sets so that we have
100 pairs of results. We select three human judges to make judgements regarding these
results. The results are first mixed randomly and then sent to the judges. They assign a
grade for each topic according to the original data. The grading rules are given below.
Grading rules:
• 1: meaningful and coherent
• 0.5: not very good; contains other topics or meaningless words
• 0: makes no sense
Then, we calculate the average grade for the two models and list them in Table 6.
The average grade of MTDM is larger than that of Standard LDA, which indicates
that MTDM performs better in topic detection. In the next section, we use MTDM
to detect user interest profiles.
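The grading procedure above reduces to averaging grades drawn from {0, 0.5, 1} over all judges and topics. The sketch below illustrates that computation; the function name `average_grade` and the nested-list input layout are assumptions, and the grades in the usage note are made-up placeholders, not the values behind Table 6.

```python
ALLOWED_GRADES = {0, 0.5, 1}   # the three grades defined by the grading rules


def average_grade(grades_by_judge):
    """Average all grades given by all judges to the topics of one model.

    grades_by_judge: one list per judge, each holding one grade per topic.
    """
    flat = [g for judge in grades_by_judge for g in judge]
    if not flat:
        raise ValueError("no grades supplied")
    if any(g not in ALLOWED_GRADES for g in flat):
        raise ValueError("grades must be 0, 0.5 or 1")
    return sum(flat) / len(flat)
```

For example, `average_grade([[1, 0.5], [1, 1], [0.5, 0]])` averages six placeholder grades from three judges.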
We can observe from Table 7 that the $P_{\Delta t_i}$ differ from each other. $P_{\Delta t_0}$
shows that, in the last six months, the user mainly focused on Car, Mobile and Soccer, while
$P_{\Delta t_1}$ shows that the user was interested in Shopping, Mobile, Technology, Soccer and
News. This phenomenon demonstrates that user interests as detected using microblogs, which reflect
actual user preferences, change over different time periods. This indirectly shows the necessity
of distinguishing stable interest profiles and temporal interest profiles.
The interest profiles Pu M in a user u’s microblogs can be obtained by
4.4.2. Efficiency of the Profiles Used
Since we have been able to build user profiles using the process mentioned above, we now proceed
to test the effectiveness of the profiles that we obtain. The method we adopt is an indirect one
which relies on common sense to an extent. It is easy to understand that users in a social network
prefer to comment on microblogs when they have an interest in the information that the microblog
spreads. Therefore, we chose one promotional microblog that advertised a mobile phone and analyzed
its 51 reviewers' interest profiles, obtained through the process above. If the user profiles we
get are effective, the reviewers' interest profiles will be very likely to include an interest in
'Mobile'.
We generate statistics according to keywords in the reviewers' interest profiles, and the
statistical results are shown in Figure 10.
Figure 10. Statistical results of keywords in users' interest profiles. (Bar chart; vertical axis: number.)
Figure 10 indicates that most of the reviewers are interested in the Mobile area, which supports
our assumption. From the figure above, we can see that the reviewers of this promotional microblog
mostly have interests in mobile, technology, digital and some other aspects related to mobile.
This phenomenon indicates that users in social networks are selective about the information that
they focus on and that our profile model can describe user interest profiles.
4.5. Analysis of the Collaborative Filtering Algorithm Based on the User Profiles (CFUP)
As there is no actual data set for a cross-platform recommender system, we choose the SUSHI data
set, which is similar to the actual cross-platform data (containing users' profiles) that we want
to use. It contains 5000 users, 100 different kinds of sushi and the scores that different users
give to different types of sushi. Each user has ten attributes. The scores range from zero to
four. In this experiment, 80% of the SUSHI data set is used as training data, while the rest is
test data. Mean absolute error (MAE) is adopted as the evaluation criterion and is calculated by
the formula below, where $p_{u,i}$ stands for the predicted score, $r_{u,i}$ stands for the real
score and $T$ is the number of training data.
$$\mathrm{MAE} = \frac{\sum_{i=1}^{T} \left| p_{u,i} - r_{u,i} \right|}{T} \tag{15}$$
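The MAE definition translates directly into a few lines of code. In this sketch the function name `mean_absolute_error` and the parallel-list inputs are assumptions; the two lists hold the predicted and real scores for the same user-item pairs.

```python
def mean_absolute_error(predicted, actual):
    """Mean absolute error between predicted and real scores.

    predicted, actual: equal-length sequences of scores for the same (user, item) pairs.
    """
    if len(predicted) != len(actual):
        raise ValueError("predicted and actual must have the same length")
    if not predicted:
        raise ValueError("at least one score pair is required")
    return sum(abs(p - r) for p, r in zip(predicted, actual)) / len(predicted)
```

A lower value means the predicted scores lie closer to the real ones.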
4.5.1. Impacts of the Parameter μ
μ is the parameter we use in the CFUP similarity formula, and different values of μ lead to
different predicted results. This experiment aims at finding the best μ for CFUP. Here the number
of neighbor users is 30.
As we can observe from Figure 11, the MAE decreases as μ increases over [0, 0.7), while the MAE
increases as μ increases over (0.7, 1]. The MAE reaches its lowest point when μ = 0.7, so μ = 0.7
is selected as the best parameter value.
Figure 11. The impact of parameter μ on MAE. (Line chart; horizontal axis: μ from 0 to 1; vertical axis: MAE.)
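The search for μ in Section 4.5.1 amounts to a one-dimensional grid search that keeps the value with the lowest MAE. The sketch below assumes a hypothetical callback `evaluate_mae(mu)` that runs the recommender with a given μ and returns its MAE; the names `best_mu` and `toy_mae` are likewise illustrative, and the toy curve is a made-up stand-in, not the measurements plotted in Figure 11.

```python
def best_mu(evaluate_mae, step=0.1):
    """Evaluate mu = 0, step, ..., 1 and return (best mu, all scores)."""
    n_steps = int(round(1.0 / step))
    # Round to avoid floating-point noise such as 0.7000000000000001.
    candidates = [round(i * step, 10) for i in range(n_steps + 1)]
    scores = {mu: evaluate_mae(mu) for mu in candidates}
    best = min(scores, key=scores.get)   # mu with the smallest MAE
    return best, scores


def toy_mae(mu):
    # Toy stand-in for a real evaluation run (assumption, not the paper's data):
    # a convex curve whose minimum is placed at mu = 0.7.
    return 1.16 + 0.01 * (mu - 0.7) ** 2


best, scores = best_mu(toy_mae)
print(best, scores[best])
```

In a real experiment, `evaluate_mae` would train on the 80% split with 30 neighbor users and score the held-out 20%.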
4.5.2. Comparison Results with Collaborative Filtering Algorithm
This experiment attempts to show whether the collaborative filtering algorithm based on user
profiles (CFUP) has better performance in terms of prediction precision. In order to answer this
question, we make a comparison with the traditional collaborative filtering algorithm (CF), and
the result is shown in Figure 12. The abscissa is the number of neighbors.
Figure 12. The comparison results of CF and CFUP. (Line chart; horizontal axis: neighbor number from 0 to 100; vertical axis: MAE; series: CF and CFUP.)
It can be observed from Figure 12 that MAE decreases as the number of neighbor users increases
for both algorithms. It changes sharply at the beginning and then more slowly, before finally
appearing to stabilize. Overall, however, the recommendation result of CFUP is better than that
of the traditional CF.
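To illustrate how a neighbor-based prediction with a μ-weighted similarity might look, the sketch below blends a rating similarity with an attribute similarity and predicts a score from the k most similar users. It is an assumption-laden illustration, not the CFUP formula: the linear blend, the use of cosine similarity for both parts, the weighting direction of μ and all function names are hypothetical.

```python
import math


def cosine(a, b):
    """Cosine similarity of two equal-length vectors (0.0 if either is all zeros)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def blended_similarity(ratings_u, ratings_v, attrs_u, attrs_v, mu=0.7):
    # ASSUMPTION: a linear blend in which mu weights rating similarity and
    # (1 - mu) weights attribute similarity; the paper's formula may differ.
    return mu * cosine(ratings_u, ratings_v) + (1.0 - mu) * cosine(attrs_u, attrs_v)


def predict_score(target, item, users, k=30, mu=0.7):
    """Predict the target user's score for one item from the k nearest neighbors.

    users: dict mapping user id -> (dense list of scores, list of attribute values).
    Returns None when no neighbor has a non-zero similarity.
    """
    t_ratings, t_attrs = users[target]

    def drop(vec):
        # Hide the item being predicted when comparing rating vectors.
        return vec[:item] + vec[item + 1:]

    neighbors = []
    for uid, (ratings, attrs) in users.items():
        if uid == target:
            continue
        sim = blended_similarity(drop(t_ratings), drop(ratings), t_attrs, attrs, mu)
        neighbors.append((sim, ratings[item]))

    top_k = sorted(neighbors, key=lambda pair: pair[0], reverse=True)[:k]
    weight = sum(abs(sim) for sim, _ in top_k)
    if weight == 0:
        return None
    # Similarity-weighted average of the neighbors' scores for the item.
    return sum(sim * score for sim, score in top_k) / weight
```

Under this assumed blend, setting `mu=1.0` ignores user attributes and leaves a rating-only neighbor model, which gives a rough analogue of the CF baseline for comparison.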
5. Conclusions
This paper proposed a cross-platform recommender system, CPRec. By constructing user profiles
and commodity profiles, commodity recommendations are realized based on the similarity of user
and commodity. The experiments and analysis demonstrated that social networks affect e-commerce
and can play an important role in creating economic value for it. The simulation results showed
that the Microblogs Topic Discovery Model performs better than the LDA model, and we built users'
profiles more precisely with the help of the proposed model. Moreover, we also improved the
traditional collaborative filtering algorithm and proposed a collaborative filtering algorithm
based on the user profiles (CFUP) by considering the similarity of users' attributes. The
experiments with CFUP show that μ = 0.7 is the best parameter for CFUP and that CFUP obtains more
accurate recommendation results than the traditional collaborative filtering algorithm.
In future work, we will focus on studying the information spread path. We aim to
find the fastest and broadest path for information spreading to enhance the influence of
promotional information. The reason for this is that the greater the influence of
cross-platform information, the more economic value it may obtain.
Author Contributions: J.Z. and B.S. designed the proposed method and wrote the paper; J.Z. and
X.R. wrote the code and performed the experiments; J.Z. and B.S. analyzed the data; Z.C. modified
the paper and offered support. All authors have read and agreed to the published version of
the manuscript.
Funding: This research was funded by the 2020 Youth Fund Project of Fuzhou Polytechnic, grant
number FZYKJJQN202001. Additionally, the APC was funded by the 2020 Youth Fund Project
of Fuzhou Polytechnic (FZYKJJQN202001). This work is also supported by the National Natural
Science Foundation of China (62277010), the Fujian Natural Science Foundation (2021J011013 and
2020J01132452) and the Medical Innovation Project (2021CXA001).
Data Availability Statement: Not applicable.
Acknowledgments: The authors thank the 2020 Youth Fund Project of Fuzhou Polytechnic
(FZYKJJQN202001) for covering the costs to publish in open access and the costs incurred when
writing this study. In addition, the authors thank the anonymous reviewers for their insightful
comments that helped improve the quality of this study.
Conflicts of Interest: The authors declare no conflict of interest.
References
1. Gao, C.; Huang, C.; Yu, D.; Fu, H.; Lin, T.; Jin, D.; Li, Y. Item Recommendation for Word‑of‑Mouth Scenario in Social E‑Commerce.
IEEE Trans. Knowl. Data Eng. 2022, 34, 2789–2809. [CrossRef]
2. Wigand, R.T.; Benjamin, R.I.; Birkland, J.L.H. Web 2.0 and beyond: Implications for Electronic Commerce. In Proceedings of the
10th International Conference on Electronic Commerce, Innsbruck, Austria, 19–22 August 2008; pp. 1–5.
3. Tajvidi, M.; Wang, Y.; Hajli, N.; Love, P.E. Brand Value Co‑creation in Social Commerce: The Role of Interactivity, Social Support,
and Relationship Quality. Comput. Hum. Behav. 2021, 115, 105238. [CrossRef]
4. Adam, I.O.; Alhassan, M.D. The Role of Social Media on the Diffusion of E‑Government and E‑Commerce. Inf. Resour. Manag. J.
2021, 34, 63–79. [CrossRef]
5. Kang, S.; Lee, D.; Kweon, W.; Yu, H. Personalized Knowledge Distillation for Recommender System. Knowl.-Based Syst. 2022,
239, 107958. [CrossRef]
6. Forestiero, A. Heuristic recommendation technique in Internet of Things featuring swarm intelligence approach. Expert Syst.
Appl. 2022, 187, 115904. [CrossRef]
7. Fijalkowski, D.; Zatoka, R. An Architecture of a Web Recommender System Using Social Network User Profiles for E‑Commerce.
In Proceedings of the 2011 Federated Conference on Computer Science and Information Systems (FedCSIS), Szczecin, Poland,
19–21 September 2011; pp. 287–290.
8. Ma, H.; Zhou, T.C.; Lyu, M.R.; King, I. Improving Recommender Systems by Incorporating Social Contextual Information. ACM
Trans. Inf. Syst. 2017, 29, 1–23. [CrossRef]
9. Zhao, W.X.; Li, S.; He, Y.; Wang, L.; Wen, J.R.; Li, X. Exploring demographic information in social media for product
recommendation. Knowl. Inf. Syst. 2016, 49, 61–89. [CrossRef]
10. Zhao, W.X.; Li, S.; He, Y.; Chang, E.Y.; Wen, J.R.; Li, X. Connecting Social Media to E-Commerce: Cold-Start Product
Recommendation Using Microblogging Information. IEEE Trans. Knowl. Data Eng. 2016, 28, 1147–1159. [CrossRef]
11. Pan, S.J.; Zhao, L.; Yang, Q. A unified framework of active transfer learning for cross‑system recommendation. Artif. Intell. 2017,
245, 38–55.
12. Xiang, D.; Zhang, Z. Cross-border e-commerce personalized recommendation based on fuzzy association specifications
combined with complex preference model. Math. Probl. Eng. 2020, 2020, 8871126. [CrossRef]
13. Blei, D.M.; Ng, A.Y.; Jordan, M.I. Latent dirichlet allocation. J. Mach. Learn. Res. 2003, 3, 993–1022.
14. Zhou, X.; Chen, L. Migrating social event recommendation over microblogs. Proc. VLDB Endow. 2022, 15, 3213–3225. [CrossRef]
15. Djenouri, Y.; Belhadi, A.; Srivastava, G.; Lin, C.W. Toward a Cognitive‑Inspired Hashtag Recommendation for Twitter Data
Analysis. IEEE Trans. Comput. Soc. Syst. 2022, 9, 1748–1757. [CrossRef]
16. Tahmasebi, H.; Ravanmehr, R.; Mohamadrezaei, R. Social movie recommender system based on deep autoencoder network
using Twitter data. Neural Comput. Appl. 2021, 33, 1607–1623. [CrossRef]
17. Weng, J.; Lim, E.P.; Jiang, J.; He, Q. Twitterrank: Finding topic‑sensitive influential twitterers. In Proceedings of the 3rd ACM
International Conference on Web Search and Data Mining (WSDM 2010), New York, NY, USA, 4–6 February 2010; pp. 261–270.
18. Zhao, W.X.; Jiang, J.; Weng, J.; He, J.; Lim, E.P.; Yan, H.; Li, X. Comparing twitter and traditional media using topic models. In
Proceedings of the European Conference on Information Retrieval, Heidelberg, Berlin, 18–21 April 2011; pp. 338–349.
19. Li, L.; Zheng, L.; Yang, F.; Li, T. Modeling and broadening temporal user interest in personalized news recommendation. Expert
Syst. Appl. 2014, 41, 3168–3177. [CrossRef]
Disclaimer/Publisher's Note: The statements, opinions and data contained in all publications are solely those of the individual
author(s) and contributor(s) and not of MDPI and/or the editor(s). MDPI and/or the editor(s) disclaim responsibility for any injury to
people or property resulting from any ideas, methods, instructions or products referred to in the content.