Paper 1 (Instagram)


Neurocomputing 239 (2017) 9–18

Contents lists available at ScienceDirect

Neurocomputing
journal homepage: www.elsevier.com/locate/neucom

User relationship strength modeling for friend recommendation on Instagram
Dongyan Guo a,∗, Jingsong Xu b, Jian Zhang b, Min Xu b, Ying Cui a, Xiangjian He b
a College of Computer Science and Technology, Zhejiang University of Technology, No. 288, Liuhe Road, Hangzhou, Zhejiang 310023, China
b Global Big Data Technologies Centre, University of Technology Sydney, 15 Broadway, Ultimo NSW 2007, Australia

Article info

Article history:
Received 12 May 2016
Revised 22 January 2017
Accepted 28 January 2017
Available online 14 February 2017
Communicated by Prof. Zidong Wang

Keywords:
Friends recommendation
Social networks
Multi-view learning

Abstract

Social strength modeling in the social media community has attracted increasing research interest. Different from Flickr, which has been explored by many researchers, Instagram is more popular for mobile users and is conducive to likes and comments but seldom investigated. On Instagram, a user can post photos/videos, follow other users, comment and like other users’ posts. These actions generate diverse forms of data that result in multiple user relationship views. In this paper, we propose a new framework to discover the underlying social relationship strength. User relationship learning under multiple views and the relationship strength modeling are coupled into one process framework. In addition, given the learned relationship strength, a coarse-to-fine method is proposed for friend recommendation. Experiments on friend recommendations for Instagram are presented to show the effectiveness and efficiency of the proposed framework. As exhibited by our experimental results, it can obtain better performance over other related methods. Although our method has been proposed for Instagram, it can be easily extended to any other social media communities.
© 2017 Elsevier B.V. All rights reserved.

∗ Corresponding author.
E-mail addresses: guodongyan@zjut.edu.cn, dongyan.guo@gmail.com (D. Guo).

1. Introduction

Social strength modeling in the social media community has recently attracted increased research effort in the multimedia field, with most of the research [1,2] applying Flickr data. However, Instagram, enabling photo sharing, video sharing and social networking, has become very popular recently. Flickr focuses on displaying collections of photographs in photo streams, sets and galleries, which are organized by tags and maps. In comparison, Instagram is a community conducive to likes and comments and is far more of a cell phone snapshot site.

With the rapid development of mobile technology and the national broadband network, Instagram has become increasingly popular, especially for mobile users. There is an urgent need to explore social relationship strength on Instagram. Different from Flickr with interest group information, only binary ties exist on Instagram. This makes social relationship modeling extremely meaningful but difficult. Without group information, the relationships existing among a group of people might be sparse. Moreover, Instagram is a platform which focuses more on supporting mobile users who post casual photos/videos rather than on professional photo/video sharing. It allows users to follow other users, comment on and like other users’ posts. These actions result in user relationships being shown in multiple views and in diverse forms. For example, a user relationship can be described as: one follows the other one; one is followed by the other one; one comments on the other one’s post; two users have similar tags; or one likes the other’s post. The multiple views and diverse forms of user relationships make relationship strength modeling very difficult, since optimal modeling needs to take heterogeneous data and various relationship views into account but treat them differently, depending on how strongly they affect global relationship strength.

Modeling and analyzing multiple relationship views have recently been explored for generic data mining tasks such as classification [3], clustering [4], link prediction [5] and influence analysis [6], but remain an open area, especially for social relationship strength analysis. Most recently, researchers have started exploring the social media community with multiple relationship views [1,2,7,8]. Cai et al. [7] propose a regression-based technique to find the optimal linear combination of a number of different weighted networks, relying on a set of input examples that have been assigned community labels which indicate the friendship between users. Based on the combined network, the authors then apply a spectral clustering algorithm to produce disjoint communities. Zhuang et al. [2] propose a kernel-based learning framework for

http://dx.doi.org/10.1016/j.neucom.2017.01.068
0925-2312/© 2017 Elsevier B.V. All rights reserved.
10 D. Guo et al. / Neurocomputing 239 (2017) 9–18

social strength modeling. They adopt the Kernel-Based Learning (KBL) scheme to integrate the multiple modalities (relationships). These two algorithms can be seen as supervised learning methods. They construct a target social network based on the existing users’ relationships. The optimal weights are derived by maximizing the similarity between the linearly combined social network and the target social network. However, due to noise and other impact factors in real social networks, the existing relationships may not represent the real relationships between users. Greene and Cunningham [9] propose an unsupervised method for integrating multiple data views to produce a single unified network representation, based on the combination of the k-nearest neighbor sets for users derived from each view.

In this paper, a user relationship strength model is built based on multiple relationship views and applied to friend recommendation. Compared to the aforementioned existing work, three unique features of this work are summarized as follows.

(1) It is vital to analyze how the individual relationship view can improve the user relationship strength model. In this paper, we extract four different types of user relationships and construct six relationship views.
(2) An unsupervised relationship strength learning method is proposed to fuse multiple relationship views by determining the weights of each relationship view. Extracting features from sparse similarity matrices leads to fast processing.
(3) At the application level, the existing research on social media relationship strength modeling focuses on Flickr data. To the best of our knowledge, this is the first attempt to deal with Instagram data. As discussed previously, the existence of only binary ties among users and the diverse data forms make this task very challenging. Our proposed method outperforms the existing methods on friend recommendation, especially when the number of recommended friends is more than 20.

The paper is structured as follows. In Section 2, we discuss the related work about user relationship mining, recommendation systems and the motivation for the proposed framework. Section 3 gives a detailed description of the proposed framework. Multiple views of data are extracted, and a novel unsupervised multiple relationship view learning framework is derived. In Section 4, we propose a coarse-to-fine method for friend recommendation by applying the learned relationship strength. In Section 5, the performance of the proposed method is demonstrated on a dataset from Instagram for friend recommendation. In Section 6, we conclude the paper with future work.

2. Related work

The goal of this paper is modeling user relationship strength for friend recommendation on photo sharing websites. Naturally, it falls into the category of works on social community discovery and recommendation systems. Thus the related work in these two areas is reviewed.

The first task of this work is mining user relationships in the social network, which has attracted more and more attention in recent years [7,10–16]. Agrawal et al. [10] analyze the social behavior for newsgroups via link-based methods. Li et al. [11] develop a scalable algorithm to mine community relationships on large-scale text document corpora. Cai et al. [7] propose a new method for learning an optimal linear combination of heterogeneous social networks which can achieve better performance for community mining. Lin et al. [12] aim at discovering community structure through analysis of time-varying, multi-relational data in rich media social networks. Bhattacharyya et al. [17] analyze the semantic similarity of user profile entries and the social network topology to compare the similarity between two users. Roth et al. [18] estimate a user’s friends through an interaction-based metric. Wu et al. [19] state that most online users are interested in other users whose appearances are attractive according to their own preferences. They propose a friend recommendation system based on the appearances in photos. In [20], the authors take people’s location history into account to measure the similarity among users and then recommend potential friends.

Conventional social relationship modeling methods consider only a single form of information in connections between users. The user relationships, which are actually multiple, cannot be fully represented by these models. Zhuang et al. [2] collect multiple kinds of interaction information between users and adopt the existing follow relationships between users to guide supervised training. However, the existing relationships in social networks are very sparse and therefore not effective for constraining the model. To address these problems, we propose an unsupervised multiple relationship view learning method to model the user relationship in social networks.

The other task of this work is to recommend latent friends for users [21–30]. In general, there are two approaches for building a recommendation system: content-based (CB) and collaborative filtering (CF). The CB approach is based on a description of the item and a profile of the user’s preference. The CF approach, on the other hand, recommends items to the user based on other users who have similar behaviors. In this paper, we use the combination of related people, tags, interactions and image features to recommend friends. Our system can be viewed as a CB-based recommendation system. Li et al. [28] propose a tag-based social interest discovery approach for recommendation. Firan et al. [26] make use of the keywords (tags) which can characterize the user to achieve personalized recommendations. Sen et al. [31] introduce tag preference inference algorithms based on users’ interactions with tags. The experimental result shows that the tag-based approach generates better recommendation results than state-of-the-art CF-based algorithms on the MovieLens system.

On Instagram, the recommended pictures for a user are from the updates of his followed users. Users usually find friends of interest by internet searching or friend recommendation. It is difficult for users to find those users they are really interested in. In addition, most current friend recommendation methods are designed for Flickr. Therefore, it is necessary to design a friend recommendation method specifically for Instagram.

Different recommendation systems should be designed for different social media. For the typical photo sharing website, Instagram, the multiple and diverse user relationships are discovered and modeled in this paper. Furthermore, given the learned relationship strength, a coarse-to-fine method is proposed for friend recommendation.

3. User relationship modeling and learning

In this section, we propose an unsupervised multiple view learning algorithm for user relationship modeling through the multi-activities between users. The design of the algorithm is based on the idea that different user activities can represent different user relationships from different views. Each view may reflect partial information between users. By combining the kernel matrix learning and the user relationship modeling into one process framework, the algorithm can determine the kernel weights and the user relationship model simultaneously. The detailed algorithm description is given as follows.

3.1. The overview of the proposed algorithm

Fig. 1 shows the overview of our proposed algorithm. In the first part, relationship links between different users are extracted

Fig. 1. An unsupervised relationship strength learning method based on multiple relationship views.

based on their existing activities on Instagram. These multiple types of relation links are adopted to describe the user relationship. Similarity matrixes are used to represent the relation links. In our case, as shown in Fig. 1, they are generated from the corresponding activities: follow, like, comment and tagging, respectively.

In the second part, an unsupervised multiple relationship view learning and feature extraction method is proposed to automatically allocate weights to these views and fuse these views in an iterative process to achieve an optimal combination.

In the third part, the outcome of the fusion represents the user relationship strength. It is more accurate than an individual view or average fusion. The learned relationship can be used for efficient friend recommendation as shown below. Moreover, a coarse-to-fine method is proposed in this paper.

3.2. Multiple relationship representation

Compared with other photo sharing websites, Instagram has a stronger social attribute. Based on this, we firstly compute the “user strength” between users and then use this relationship to do friend recommendation. The “user strength” between users can be regarded as a kind of undirected relation. Instagram provides different functions for users sharing photos/videos on the website and communicating with other users. Different types of raw data are generated, which require different pre-processing methods for each type/view to build user relationships. Thus similarity matrixes are generated to represent each relationship view. Note that these similarity matrixes are not necessarily positive semi-definite (p.s.d.). As in [2] we can force these measures to be p.s.d. to construct kernels by adding a properly scaled identity matrix to the corresponding similarity matrix.

3.2.1. Follow similarity matrixes

On Instagram, a user friendship is defined as one user, A, following another one, B, which directly reflects the user relationship. Note, if A follows B, this does not mean that B has to follow A. From these follows, a similarity matrix $G_1$ can be generated as follows:

$$G_1(i,j) = \begin{cases} 0 & \text{if } u_i \text{ has no relationship with } u_j \\ 1 & \text{if } u_i \text{ has unidirectional relationship with } u_j \\ 2 & \text{if } u_i \text{ has bidirectional relationship with } u_j \end{cases}$$

where $u_i$ and $u_j$ indicate user $i$ and user $j$ respectively, and $G_1(i,j)$ is the element at the $i$th row and $j$th column. The following similarity matrixes are defined similarly.

In general, if two users who are not friends follow the same user, then it is likely for them to be friends in the future. The more friends two users have in common, the higher the possibility that these two users have the potential to become friends. Thus, another similarity matrix $G_2$ generated from the follow relationship is based on the number of common friends.

$$G_2(i,j) = \#\text{Common friend between } u_i \text{ and } u_j$$

3.2.2. Like similarity matrixes

A user can “like” any other users’ posts by clicking on the “like” icon below the posted photo/video. As an important attribute, like can be used to infer user relationship. The more of user $j$’s photos/videos are liked by user $i$, the higher the possibility of user $j$ being user $i$’s friend.

Moreover, if two users have the same interests or common friends, it is more likely they will like the same photos and videos. The more photos/videos two users both like, the more likely they are to be friends.

We generate the following two similarity matrixes $G_3$ and $G_4$ to represent the like relationship.

$$G_3(i,j) = \#\text{Like between } u_i \text{ and } u_j.$$

$$G_4(i,j) = \#\text{Image both } u_i \text{ and } u_j \text{ likes}.$$

3.2.3. Comment similarity matrix

Commenting is another way to show user interest in a particular post. The more comments a person makes on a user’s posts, the more likely it is that this person is a friend. A similarity matrix $G_5$ based on comments is then generated:

$$G_5(i,j) = \#\text{Comment between } u_i \text{ and } u_j.$$

3.2.4. Tag similarity matrix

We firstly build a tag dictionary from all existing tags. Then, the tf-idf weighting method [32] is used to represent a user’s tags. In this way, each user’s tags are represented as a vector $t \in \mathbb{R}^d$, and the value of tag similarity $V_{tag}$ can be computed as:

$$V_{tag}(i,j) = \frac{t_i^T t_j}{\sqrt{(t_i^T t_i)(t_j^T t_j)}}$$

A threshold $\tau$ (in this paper, $\tau$ is set to be around 0.01) is then applied to generate the sparse tag similarity matrix $G_6$:

$$G_6(i,j) = \begin{cases} 0 & \text{if } V_{tag}(i,j) < \tau \\ V_{tag}(i,j) & \text{otherwise} \end{cases}$$

3.3. Multiple relationship view learning

Different similarity matrixes represent the user relationship from different views. We linearly add each similarity matrix to obtain the fused similarity matrix $S$ between users, that is:

$$S = \sum_{p=1}^{m} \mu_p G_p \qquad (1)$$
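The follow-based views of Section 3.2.1 and the linear fusion of Eq. (1) can be illustrated with a small sketch. It uses plain Python lists, and the helper names (`follow_matrix`, `cofollow_matrix`, `fuse`) are hypothetical rather than taken from the paper. It also assumes that a "common friend" of two users means a third user whom both of them follow, which is one reading of the text and not a definition the paper states.

```python
def follow_matrix(n, follows):
    """G1(i, j): 0 = no follow link, 1 = one direction, 2 = both directions.

    `follows` is a set of directed pairs (a, b) meaning "a follows b".
    """
    G = [[0] * n for _ in range(n)]
    for a, b in follows:
        if a != b:
            G[a][b] += 1
            G[b][a] += 1
    return G


def cofollow_matrix(n, follows):
    """G2(i, j): number of users followed by both i and j (assumed reading)."""
    followed = [set() for _ in range(n)]
    for a, b in follows:
        followed[a].add(b)
    return [[len(followed[i] & followed[j]) if i != j else 0
             for j in range(n)] for i in range(n)]


def fuse(views, mu):
    """Eq. (1): S = sum_p mu_p * G_p with mu_p >= 0 and sum_p mu_p = 1."""
    if any(w < 0 for w in mu) or abs(sum(mu) - 1.0) > 1e-9:
        raise ValueError("weights must be non-negative and sum to 1")
    n = len(views[0])
    return [[sum(w * G[i][j] for w, G in zip(mu, views))
             for j in range(n)] for i in range(n)]


# Toy data: users 0 and 1 follow each other; users 0 and 3 both follow user 2.
follows = {(0, 1), (1, 0), (0, 2), (3, 2)}
G1 = follow_matrix(4, follows)
G2 = cofollow_matrix(4, follows)
S = fuse([G1, G2], [0.5, 0.5])
```

In this toy example users 0 and 3 have no direct follow link, yet the fused matrix gives them a non-zero strength through the co-follow view, which is the effect the multi-view fusion is meant to capture.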


In Eq. (1), the weights satisfy $\mu_p \ge 0$ and $\sum_{p=1}^{m} \mu_p = 1$, and $\mu = \{\mu_p, p = 1, \ldots, m\}$ is the weight vector.

$S$ is a sparse matrix that shows the explicit relationships of users. However, two users may have an indirect relationship through a third user or more. In order to discover such underlying social relationships, we introduce a hidden feature $y_j$ for each user to mine the pairwise measurements of all users. $y_j$ is a low-dimensional feature that preserves user similarity based on manipulations of $S$:

$$\min_{y_j} \sum_{i=1}^{n} \|y_j - y_i\|^2 S(j,i) \qquad (2)$$

We can obtain all the users’ features by preserving the global similarity between users:

$$\min_{Y} \sum_{j=1}^{n} \sum_{i=1}^{n} \|y_j - y_i\|^2 S(j,i) \qquad (3)$$

Setting $H_j = \mathrm{diag}([S(j,1), \ldots, S(j,n)])$, Eq. (3) can be written as:

$$\begin{aligned}
&\min_{Y} \sum_{j=1}^{n} \mathrm{tr}\left( \begin{bmatrix} (y_j - y_1)^T \\ \vdots \\ (y_j - y_n)^T \end{bmatrix} \big[ (y_j - y_1), \ldots, (y_j - y_n) \big] \times \mathrm{diag}\begin{pmatrix} S(j,1) \\ \vdots \\ S(j,n) \end{pmatrix} \right) \\
&= \min_{Y} \sum_{j=1}^{n} \mathrm{tr}\left( \big[ (y_j - y_1), \ldots, (y_j - y_n) \big] \times H_j \begin{bmatrix} (y_j - y_1)^T \\ \vdots \\ (y_j - y_n)^T \end{bmatrix} \right) \\
&= \min_{Y} \sum_{j=1}^{n} \mathrm{tr}\big( Y L_j \times H_j \times L_j^T Y^T \big) \\
&= \min_{Y} \mathrm{tr}\Big( Y \Big( \sum_{j=1}^{n} L_j \times H_j \times L_j^T \Big) Y^T \Big) \qquad (4)
\end{aligned}$$

In the above equation

$$L_j = E_j - I \qquad (5)$$

where $E_j$ is an $n \times n$ matrix whose elements in the $j$th row are all 1 and whose other elements are 0, and $I$ is an $n \times n$ identity matrix. So that

$$\begin{aligned}
\sum_{j=1}^{n} (L_j \times H_j \times L_j^T)
&= \sum_{j=1}^{n} \big( (E_j - I) \times H_j \times (E_j^T - I) \big) \\
&= \sum_{j=1}^{n} \big( E_j \times H_j \times E_j^T - E_j \times H_j - H_j \times E_j^T + H_j \big) \\
&= D - S - S + D \\
&= 2(D - S) \qquad (6)
\end{aligned}$$

where $D$ is diagonal, and $D_{ii} = \sum_{j=1}^{n} S_{ij}$. Therefore Eq. (3) can be written in this form:

$$\min_{Y} \mathrm{tr}\big( Y (D - S) Y^T \big) \qquad (7)$$

Note that $D - S$ is actually an unnormalized graph Laplacian matrix. We adopt a normalized graph Laplacian matrix by performing normalization on $D - S$:

$$D^{-1/2} (D - S) D^{-1/2} = I - D^{-1/2} S D^{-1/2} \qquad (8)$$

To uniquely determine the low-dimensional feature matrix $Y$, we constrain $Y$ by $Y Y^T = I$. Eq. (8) is then transformed into the following form:

$$\min_{Y} \big( \mathrm{tr}(I) - \mathrm{tr}(Y D^{-1/2} S D^{-1/2} Y^T) \big) \quad \text{s.t. } Y Y^T = I \qquad (9)$$

which is equivalent to:

$$\max_{Y} \mathrm{tr}(Y D^{-1/2} S D^{-1/2} Y^T) \quad \text{s.t. } Y Y^T = I \qquad (10)$$

Note that in the above derivation, the weight vector $\mu$ defined in Eq. (1) is assumed to be known. The inner product $\langle y_i, y_j \rangle$ can be considered as the similarity between user $i$ and $j$. Therefore $Y^T Y$ is the similarity matrix of users based on the features $y_1, \ldots, y_n$. Since $Y^T Y$ and $S$ are positive semi-definite, the multiple kernel alignment method [33] can be applied. In this paper, the parameter $\mu$ defined in Eq. (1) can be derived by maximizing the alignment between $Y^T Y$ and $D^{-1/2} S D^{-1/2}$:

$$\max_{\mu} \frac{\langle Y^T Y, D^{-1/2} S D^{-1/2} \rangle_F}{\|Y^T Y\|_F \, \|D^{-1/2} S D^{-1/2}\|_F} \qquad (11)$$

where the operator $\langle A, B \rangle_F = \mathrm{tr}(A^T B)$ and $\|A\|_F = \sqrt{\langle A, A \rangle_F}$, $A, B \in \mathbb{R}^{n \times n}$.

In order to simplify the calculation, instead of normalizing $S$ by $D^{-1/2} S D^{-1/2}$, we do the normalization on each sub-view similarity matrix $G_p$. Another advantage of this normalization is that it partially eliminates the imbalance between different types of users. For example, one “power user” may have a large number of followers and, by normalization, the weight of follow between this “power user” and other users will be reduced. For a “normal user”, it works in the same way. The normalization also eliminates the scale effects of different similarity matrices. We then combine Eqs. (10) and (11) to deduce the final optimization problem:

$$\max_{Y, \mu} \frac{\mathrm{tr}(Y \tilde{S} Y^T)}{\|\tilde{S}\|_F} \quad \text{s.t. } Y Y^T = I. \qquad (12)$$

where $\tilde{S} = \sum_{p=1}^{m} \mu_p \tilde{G}_p$, $\tilde{G}_p = D_p^{-1/2} G_p D_p^{-1/2}$ and $D_p$ is a diagonal matrix with $D_p(i,i) = \sum_{j=1}^{n} G_p(i,j)$.

Problem (12) is a non-convex problem, which is difficult to solve. We adopt an alternating optimization method to reach a local maximum.

For a given $\mu$, problem (12) is equivalent to:

$$\max_{Y} \mathrm{tr}(Y \tilde{S} Y^T) \quad \text{s.t. } Y Y^T = I. \qquad (13)$$

which can be solved by computing eigenvectors.

For a given $Y$, problem (12) turns out to be a kernel alignment problem. There are two kinds of methods to solve this problem, namely the independent alignment algorithm and the alignment maximization algorithm. Based on our experiments, we find that using the alignment maximization algorithm might cause over-fitting. For example, during the iterative optimization process, one weight $\mu_p$ might reach 1 and the others 0 (the total summation of the weights is equal to 1), whereas using an independent alignment-based algorithm can avoid over-fitting effectively during the alternating optimization.

An alternating optimization procedure is summarized in Algorithm 1 to obtain a locally optimal solution for problem (12).

4. A Coarse-to-fine method for friend recommendation

Since photos are important elements of the photo sharing network, how to use the photo information to improve the accuracy

of friend recommendation is a key problem. Through observation we believe that Instagram has a strong social attribute between users. The “follow”, “like”, “comment” and other actions can directly reflect the user’s social relationship. However, it is difficult to analyze this social relationship through image similarity between users, due to the complexity of images and image contexts. Therefore, we choose to use the image similarity to adjust the recommendation of the user’s potential friends. In this section, we propose a two-step friend recommendation method by a coarse-to-fine process. In the first step, a user relationship strength model is used for sorting the friend relationship and getting a coarse friend recommendation result. In the second step, the photo similarity between users is used to re-sort the coarse recommendation result and get a much more accurate friend recommendation result.

4.1. Photo similarity estimation

For each image, we use the AlexNet model trained on ImageNet to implement the user image classification with the deep learning framework Caffe. Then the 15 maximally accurate semantic classes and the corresponding accuracies are selected as the new image feature for similarity computation. Fig. 2 shows one original image shared by an Instagram user and its corresponding top-5 maximally accurate classes and accuracies.

Fig. 2. An example of photo semantic description.

Suppose user A has $m$ photos and user B has $n$ photos; the photo similarity $Sim(A, B)$ between user A and B can be computed as follows:

$$Sim(A, B) = \exp\left( \frac{\sum_{i=1}^{m} -\min_{j \in [0,n]} \big( \|A(i) - B(j)\|^2 \big)}{m \sigma^2} \right) \qquad (14)$$

where $A(i)$ denotes the image feature extracted from the $i$th photo of user A, and $B(j)$ denotes the image feature extracted from the $j$th photo of user B. Notice that $Sim(A, B)$ is not equivalent to $Sim(B, A)$.

4.2. Coarse friend recommendation based on user interaction information

As described in Algorithm 1, each column of the feature matrix $Y$ can be seen as a user feature vector. The user feature vectors represent the relationships between users in a low-dimensional subspace. In this way, the user features extracted from the user interaction information are actually obtained from the similarity between different users. Therefore, the user feature similarity can be used directly for friend recommendation. For user $\tilde{u}_0$, suppose the users with the most similar user features are, in order, $(\tilde{u}_1, \ldots, \tilde{u}_p)$, and the corresponding user similarities are denoted as $(\pi_1, \ldots, \pi_p)$; obviously we have:

$$\pi_1 \ge \pi_2 \ge \cdots \ge \pi_p$$

The above user sorting method only considers the user interaction information. However, in photo sharing media, the photos are the main data information. Up to now, the study of photo feature extraction and semantic perception is far from attaining the ideal requirements. With huge amounts of image data, it is difficult to obtain the ideal result using traditional image analysis approaches. In this paper, we use the image feature to re-sort the user similarity.

4.3. Fine friend recommendation based on random walk

If there is a user $\tilde{u}_t$ whose shared photos are very similar to the photos shared by user $\tilde{u}_0$ or $\tilde{u}_0$’s best friends, then $\tilde{u}_t$ is very likely to be a friend of $\tilde{u}_0$.

Suppose the users $(\tilde{u}_1, \ldots, \tilde{u}_p)$ are the friend recommendation candidates for user $\tilde{u}_0$. Then for a user $\tilde{u}_t$ ($1 \le t \le p$), the

Algorithm 1 User relationship strength modeling.
Require:
  Similarity matrices that present different relationship views: $G_1, G_2, \ldots, G_m$; user feature dimension $k$.
Ensure:
  User feature matrix $Y$.
1: Initialization: $\hat{G}_p = D_p^{-1/2} G_p D_p^{-1/2}$, $p = 1, 2, \ldots, m$; $\mu = [1/m, \ldots, 1/m]^T$; $D_p(i,i) = \sum_{j=1}^{n} G_p(i,j)$.
2: Repeat
3:   $\hat{S}_\mu = \sum_{p=1}^{m} \mu_p \hat{G}_p$.
4:   $Y = \arg\max_{Y \in \mathbb{R}^{n \times k}} \mathrm{tr}(Y^T \hat{S}_\mu Y)$ s.t. $Y^T Y = I$.
5:   $\mu_p = \mathrm{tr}(Y^T \hat{G}_p Y) / \|\hat{G}_p\|_F$.
6:   $\mu = \mu / \|\mu\|_2$
7: Until converged
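The alternating procedure of Algorithm 1 can be sketched in plain Python as follows. This is an illustration under several assumptions and not the authors' implementation: the eigenvector step uses a simple subspace iteration on a shifted matrix (the shift leaves the eigenvectors unchanged), a fixed number of outer iterations stands in for the convergence test, and step 6 is read as a rescaling of $\mu$ to unit 2-norm. All function names are hypothetical.

```python
import math


def normalize_view(G):
    """Step 1: G_hat = D^{-1/2} G D^{-1/2}, with D(i, i) = sum_j G(i, j)."""
    n = len(G)
    inv = [1.0 / math.sqrt(s) if s > 0 else 0.0 for s in (sum(r) for r in G)]
    return [[inv[i] * G[i][j] * inv[j] for j in range(n)] for i in range(n)]


def mat_vec(S, v):
    return [sum(a * b for a, b in zip(row, v)) for row in S]


def gram_schmidt(vectors):
    """Orthonormalize a list of vectors (modified Gram-Schmidt)."""
    basis = []
    for v in vectors:
        w = list(v)
        for b in basis:
            dot = sum(x * y for x, y in zip(w, b))
            w = [x - dot * y for x, y in zip(w, b)]
        norm = math.sqrt(sum(x * x for x in w))
        basis.append([x / norm for x in w])
    return basis


def top_k_eigenvectors(S, k, shift, iters=300):
    """Step 4 via subspace iteration on S + shift * I (same eigenvectors as S)."""
    n = len(S)
    Y = gram_schmidt([[float((i + 1) ** c) for i in range(n)] for c in range(k)])
    for _ in range(iters):
        Y = gram_schmidt([[a + shift * b for a, b in zip(mat_vec(S, y), y)]
                          for y in Y])
    return Y  # k orthonormal vectors of length n


def learn_relationship_strength(views, k, outer_iters=20):
    m, n = len(views), len(views[0])
    G_hat = [normalize_view(G) for G in views]
    fro = [math.sqrt(sum(x * x for row in G for x in row)) or 1.0 for G in G_hat]
    mu = [1.0 / m] * m
    for _ in range(outer_iters):  # fixed count replaces "until converged"
        # Step 3: fused, normalized similarity matrix.
        S = [[sum(mu[p] * G_hat[p][i][j] for p in range(m)) for j in range(n)]
             for i in range(n)]
        # Step 4: the shift makes S + shift * I positive definite.
        Y = top_k_eigenvectors(S, k, shift=sum(abs(w) for w in mu) + 1.0)
        # Step 5: independent alignment, tr(Y^T G_hat_p Y) / ||G_hat_p||_F.
        mu = [sum(sum(a * b for a, b in zip(mat_vec(G_hat[p], y), y)) for y in Y)
              / fro[p] for p in range(m)]
        # Step 6 (assumed 2-norm rescaling).
        norm = math.sqrt(sum(w * w for w in mu))
        mu = [w / norm for w in mu]
    return Y, mu


A = [[0, 2, 1, 0], [2, 0, 1, 0], [1, 1, 0, 1], [0, 0, 1, 0]]
B = [[0, 1, 0, 0], [1, 0, 0, 0], [0, 0, 0, 3], [0, 0, 3, 0]]
Y, mu = learn_relationship_strength([A, B], k=2)
```

Here `Y` is returned as `k` orthonormal vectors of length `n`, so the feature of user `i` is `[y[i] for y in Y]`. For the 5000-user dataset described in the paper a sparse eigensolver would be the natural replacement for the dense loops used in this sketch.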



relationship to $\tilde{u}_0$ can be modified as:

$$\tilde{\pi}_t = \sum_{i=0}^{p} Sim(\tilde{u}_t, \tilde{u}_i)\, \pi_i. \qquad (15)$$

Let $S(i,j) = Sim(\tilde{u}_i, \tilde{u}_j)$ and $\pi^0 = (\pi_0, \ldots, \pi_p)^T$; the user relationship can be re-sorted as:

$$\tilde{\pi} = S \pi^0. \qquad (16)$$

To reduce the influence of different sizes, we need to apply a normalization on matrix $S$:

$$P = S W^{-1}. \qquad (17)$$

where $W$ is a diagonal matrix, and $W(i,i) = \sum_{j=1}^{p+1} S(j,i)$. At this time, Eq. (16) can be rewritten as:

$$\tilde{\pi} = P \pi^0. \qquad (18)$$

A steady state of the user similarity can be reached through re-sorting several times by using the above method:

$$\tilde{\pi} = P \tilde{\pi}. \qquad (19)$$

To prevent the re-sorting result from depending too heavily on the image similarity, Eq. (19) can be modified to get the final solution:

$$\tilde{\pi} = (1 - \epsilon) \pi^0 + \epsilon P \tilde{\pi}. \qquad (20)$$

The re-sorting result $\tilde{\pi}$ can be obtained iteratively. A detailed description of the proposed friend recommendation method is given in Algorithm 2.

5. Experiments and discussions

In this section, we conduct extensive experiments on Instagram datasets and the proposed method is compared with different friend recommendation methods.

5.1. Data collection and experiment setups

To build an experimental dataset, at first 10 users are randomly selected from Instagram, then other users followed by them are collected by a breadth-first algorithm. All the collected users are then sorted by the number of their followers and the number of users they followed. The top-5000 users are chosen as experimental data, and the photos they shared are gathered. Finally these photos are filtered by the number of “like” and “comment” they obtained, and about 10 thousand photos are reserved as image data. Each photo has the attributes of owner, textual tags, comments and a list of users who “like” this photo. These attributes are used for user similarity measurement, as described in Section 3.

In all the six similarity matrixes, some of the elements in $G_2, G_3, G_4, G_5$ may be very large. Take $G_5$ (comment similarity matrix) for example: some users like to make comments everywhere while others may not. In order to eliminate these effects, we apply a log operation to each element in the matrixes. It keeps the similarity between users with small elements while reducing the similarity between users with very large elements.

The statistical information of the final collected dataset is shown in Table 1. There are 61,752 follow relationships among the 5000 users. Half of the downloaded user follow relationships are removed for the learning process and used as the benchmark to evaluate the accuracy of friend recommendation. Note that if A follows B, only the recommendation of B to be a friend of A will be counted as a correct recommendation. If A is recommended to B, this will be counted as a wrong recommendation.

Table 1
The number of activities between users in our database.
Name    User  Follow  Photo   Like  Comment  Tag
Number  5000  61,752  10,693  9471  19,833   95,872

5.2. Experimental analysis

Following the Top-N recommendation criterion, the precision and recall are used to evaluate the proposed method, as follows:

$$Precision = \frac{\sum_{u \in U} |R(u) \cap T(u)|}{\sum_{u \in U} |R(u)|}$$

$$Recall = \frac{\sum_{u \in U} |R(u) \cap T(u)|}{\sum_{u \in U} |T(u)|}$$

where $R(u)$ is the recommended friend list for user $u$, $T(u)$ is the real friend list and $U$ is the testing user set.

5.2.1. User relationship strength analysis

The proposed unsupervised multiple learning method fuses the individual view relationships into a multi-view relationship. The effectiveness of each individual view (tag, cofollow, comment, colike, like and follow) affecting the global relationship strength for friend recommendation is shown in Fig. 3, compared with the user relationship modeling (coarse method).

From Fig. 3 we can see that using the individual view relationship “follow” for friend recommendation can obtain higher accuracy than any other individual view relationship. Besides the follow relationship, the remaining views that affect the friend relationship are, in decreasing order: “like”, “colike”, “comment”, “cofollow” and “tag”. Table 2 shows the weights of each similarity matrix allocated by our proposed multiple relationship view learning method. The order of the weights is roughly consistent with the order of the friend recommendation accuracy in Fig. 3. This demonstrates the rationality of the proposed method.

Moreover, the experimental results shown in Fig. 3 indicate that an individual relationship view is not enough to infer global relationship strength. The fused multi-view relationship obtained by the proposed method is much more effective than all the individual views.

The user relationship strength model is a kind of subspace learning method based on multi-kernel learning. To illustrate the effectiveness of the proposed method, some comparisons are done with baseline methods, such as kernel product, Ave (average sum

Algorithm 2 The coarse-to-fine method for friend recommendation.
Require:
  The feature matrix of all users $Y$;
  The number of coarse recommended friends $p$.
Ensure:
  The sorted friend recommendation result for user $u_0$: $\tilde{\pi}$
1: Initialization:
   Compute the $p$ most similar users of user $u_0$ based on the user feature similarity: $(\tilde{u}_1, \ldots, \tilde{u}_p)$.
   Let $\pi^0 = (\pi_0, \ldots, \pi_p)^T$, $\pi^0 = \pi^0 / \|\pi^0\|$.
   Use Eq. (17) to compute the matrix $P$.
2: Repeat
3:   $\pi^{k+1} = \pi^0 + (1 - \epsilon) P \pi^k$
4:   $\pi^{k+1} = \pi^{k+1} / \|\pi^{k+1}\|$
5: Until convergence
6: $\tilde{\pi} = \pi^{k+1}$
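The fine re-sorting step can be sketched as follows. The block implements the photo similarity of Eq. (14) and iterates the update of Eq. (20) with the per-iteration normalization of Algorithm 2. Several choices are assumptions made for illustration only: the normalization uses the 1-norm (the sum of the entries), `eps` and `sigma` take arbitrary example values, a fixed iteration count replaces the convergence test, and the function names are hypothetical.

```python
import math


def photo_similarity(A, B, sigma=1.0):
    """Eq. (14): exp(-sum_i min_j ||A(i) - B(j)||^2 / (m * sigma^2)).

    A and B are lists of per-photo feature vectors; m = len(A).
    """
    total = sum(min(sum((a - b) ** 2 for a, b in zip(x, y)) for y in B)
                for x in A)
    return math.exp(-total / (len(A) * sigma ** 2))


def rerank(pi0, S, eps=0.5, iters=100):
    """Iterate pi <- (1 - eps) * pi0 + eps * P * pi, with P = S W^{-1}.

    pi0[0] belongs to the target user, pi0[1:] to the p coarse candidates.
    S[i][j] is the photo similarity Sim(u_i, u_j).
    """
    n = len(pi0)
    col_sums = [sum(S[j][i] for j in range(n)) for i in range(n)]   # W(i, i)
    P = [[S[r][c] / col_sums[c] for c in range(n)] for r in range(n)]
    total = sum(pi0)
    pi0 = [x / total for x in pi0]          # assumed 1-norm normalization
    pi = list(pi0)
    for _ in range(iters):
        pi = [(1 - eps) * pi0[r] + eps * sum(P[r][c] * pi[c] for c in range(n))
              for r in range(n)]
        s = sum(pi)
        pi = [x / s for x in pi]
    return pi


# Eq. (14) is not symmetric: every photo of A has an exact match in B,
# but one photo of B has no close match in A.
A = [[0.0, 0.0]]
B = [[0.0, 0.0], [3.0, 4.0]]
sim_ab = photo_similarity(A, B)
sim_ba = photo_similarity(B, A)

# Toy photo-similarity matrix for a target user and two candidates.
S = [[1.0, 0.5, 0.2], [0.5, 1.0, 0.9], [0.2, 0.9, 1.0]]
scores = rerank([1.0, 0.8, 0.6], S, eps=0.5)
```

With `eps = 0` the photo information is ignored and the coarse ordering is returned unchanged; larger values let the photo similarity move candidates up or down the list.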

[Fig. 3 plots: Precision (left panel) and Recall (right panel) against Top-m Friend Recommendation, m = 10 to 50, with curves for tag, cofollow, comment, colike, like, follow and coarse.]
Fig. 3. Comparison of the effectiveness of individual relationship views (the names are shown in Table 1) and the fused multi-view affecting global relationship strength.

all the kernel matrix), co-train [34] and MSE [35]. The experimental results (Fig. 4) demonstrate the effectiveness of our proposed method.

Table 2
The learned similarity matrix weights.

Name     Follow   Cofollow   Like    Colike   Comment   Tag
Weight   0.303    0.152      0.223   0.145    0.123     0.055

5.2.2. Quantitative comparisons with other methods

The proposed coarse-to-fine method based on user relationship strength modeling for friend recommendation is compared with the kernel-based learning (KBL) method in [2] and the PageRank method in [36]. Note that the method in [2] was tested on Flickr data, and its user features and experiment setups are different from ours. To compare the two methods, we implement the method in [2] according to our experiment setups, i.e., we use half of the user follow relationships together with other relationship views for learn-

Fig. 4. Comparison of the effectiveness with some baseline methods: kernel product, co-trained [34], Ave, MSE [35] and the proposed coarse method. [Two panels: precision and recall versus top-m friend recommendation, m = 10 to 50.]
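
As an illustration of the two simplest baselines named in Fig. 4, the sketch below fuses per-view similarity (kernel) matrices by averaging ("Ave") and by element-wise product ("kernel product"), and also forms a weighted sum with the weights reported in Table 2. It is a minimal sketch and not the authors' implementation: the function names, the toy matrices, the reading of "kernel product" as an element-wise product, and the use of the Table 2 weights in a plain weighted sum are all assumptions made for illustration.

```python
from math import prod

# Weights reported in Table 2, in the order
# Follow, Cofollow, Like, Colike, Comment, Tag.
TABLE2_WEIGHTS = [0.303, 0.152, 0.223, 0.145, 0.123, 0.055]


def fuse(kernels, combine):
    """Apply `combine` to the list of per-view entries at each (i, j) position."""
    n = len(kernels[0])
    return [
        [combine([k[i][j] for k in kernels]) for j in range(n)]
        for i in range(n)
    ]


def average_fusion(kernels):
    """'Ave' baseline: unweighted mean of the view matrices (assumed reading)."""
    return fuse(kernels, lambda vals: sum(vals) / len(vals))


def product_fusion(kernels):
    """'Kernel product' baseline, assumed here to be an element-wise product."""
    return fuse(kernels, prod)


def weighted_fusion(kernels, weights):
    """Weighted sum of view matrices; weights are rescaled to sum to one."""
    total = sum(weights)
    return fuse(
        kernels,
        lambda vals: sum(w * v for w, v in zip(weights, vals)) / total,
    )


# Toy 2-user similarity matrices for two views (hypothetical values).
follow_kernel = [[1.0, 0.5], [0.5, 1.0]]
like_kernel = [[1.0, 0.2], [0.2, 1.0]]
views = [follow_kernel, like_kernel]

# Use the Table 2 weights of the "follow" and "like" views only.
weights = [TABLE2_WEIGHTS[0], TABLE2_WEIGHTS[2]]

print("average :", average_fusion(views))
print("product :", product_fusion(views))
print("weighted:", weighted_fusion(views, weights))
```

With equal weights the weighted sum reduces to the average baseline, so any difference between the two comes only from the learned weights.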
Fig. 5. Comparison of our method with other methods: MKL [2], PageRank [36], proposed coarse method, proposed fine method. [Two panels: precision and recall versus top-m friend recommendation, m = 10 to 50; curves: MKL, PageRank, coarse, coarse-to-fine.]

Table 3
The precision and recall results of different algorithms for top-10 recommendation.

            Kernel product   Co-trained   Ave     MSE     MKL      PageRank   Coarse   Coarse-to-fine
Precision   0.04             0.071        0.085   0.091   0.0932   0.0804     0.093    0.0993
Recall      0.055            0.11         0.129   0.131   0.1406   0.1242     0.1429   0.1799
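
The top-10 values in Table 3 can be read with the following sketch in mind. It assumes the common definitions (precision = hits / m, recall = hits / number of held-out follow relationships, both averaged over users); the exact evaluation code is not given in the paper, and the function names and toy user IDs below are hypothetical.

```python
def precision_recall_at_m(ranked_candidates, held_out_friends, m):
    """Return (precision, recall) for one user's top-m recommendation list."""
    top_m = ranked_candidates[:m]
    hits = sum(1 for user in top_m if user in held_out_friends)
    precision = hits / m
    recall = hits / len(held_out_friends) if held_out_friends else 0.0
    return precision, recall


def mean_precision_recall(recommendations, ground_truth, m):
    """Average precision and recall at m over all users with a recommendation list."""
    pairs = [
        precision_recall_at_m(ranked, ground_truth.get(user, set()), m)
        for user, ranked in recommendations.items()
    ]
    n = len(pairs)
    return sum(p for p, _ in pairs) / n, sum(r for _, r in pairs) / n


# Toy example: ranked candidate lists and held-out follow relationships.
recommendations = {
    "alice": ["u1", "u2", "u3", "u4", "u5", "u6", "u7", "u8", "u9", "u10"],
    "bob": ["u3", "u9", "u1", "u5", "u2", "u8", "u4", "u6", "u7", "u10"],
}
ground_truth = {
    "alice": {"u4", "u42"},
    "bob": {"u11", "u12", "u13", "u14"},
}

precision, recall = mean_precision_recall(recommendations, ground_truth, m=10)
print(f"precision@10 = {precision:.3f}, recall@10 = {recall:.3f}")
```

Under these definitions, a single hit in a list of 10 gives that user a precision of 0.1, which is the sense in which a value such as 0.0993 corresponds to roughly one accepted friend per 10 recommendations.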

ing and use the other half of the follow relationships as the benchmark to evaluate the accuracy of friend recommendation.

From Fig. 5 we can see that the KBL method in [2] outperforms the proposed coarse method when the number of recommended friends is 20. As the number of recommended friends increases, the coarse method achieves higher accuracy than the KBL method. Since the user relationship in Instagram is relatively sparse and unbalanced, it is difficult to find accurate information for supervision. Therefore, we design an unsupervised learning method to extract user features. This kind of model is more suitable for modeling user relationships in Instagram, which is the advantage of our proposed model.

A comparison of the precision and recall results of the different methods for top-10 friend recommendation is shown in Table 3. From Table 3 we can see that, compared with the other methods, the recall of the proposed coarse-to-fine method shows a clear absolute improvement (0.1799, against 0.1429 for the next best method). The precision of the proposed method is the highest, though it still seems a little low. The precision value 0.0993 means that when we recommend 10 friends to each user, only about 1 of them will be accepted by the user. This low precision is caused by the sparsity of the relationships between social network users. Through observation, we found that the follow relationship between users is very sparse in the downloaded database. Moreover, the relation distribution is imbalanced. For example, some users are followed by hundreds of users and some are followed by only one or two. In this situation, even the best recommendation method cannot achieve high precision.

6. Conclusion and future work

In this paper, we have proposed an unsupervised multiple view learning framework to learn the diverse types of user relationship data on Instagram. Different from Flickr, user relationship modeling is seldom researched for Instagram. The proposed framework can model user relationships and learn user relationship strength simultaneously, which can better infer the underlying relationship between users. Furthermore, a coarse-to-fine friend recommendation method has been proposed to encode the learned relationship strength. Experiments on friend recommendation show the effectiveness of the proposed methods.

In Eq. (12), we have chosen to normalize each sub-view similarity matrix instead of normalizing the fused similarity matrix. This partially eliminates the effects of different behaviors of different users on the final results. For example, if a user very often “likes” others' photos, that user's “like” behavior is not so valuable for recommendation analysis. By normalizing the similarity matrix, the “like” similarity between this user and others can be significantly reduced. In addition, how to utilize the behavior differences between users for personalized friend recommendation is a key issue for future work.

Acknowledgment

This work was supported in part by the National Natural Science Foundation of China (U1509207, 61325019) and in part by the Natural Science Foundation of Zhejiang Province (LY15F020024).

References

[1] T. Yao, C.-W. Ngo, T. Mei, Context-based friend suggestion in online photo-sharing community, in: Proceedings of the Association for Computing Machinery (ACM)’s Annual Conference on Multimedia, ACM, 2011, pp. 945–948.
[2] J. Zhuang, T. Mei, S.C. Hoi, X.-S. Hua, S. Li, Modeling social strength in social media community via kernel-based learning, in: Proceedings of the Association for Computing Machinery (ACM)’s Annual Conference on Multimedia, ACM, 2011, pp. 113–122.
[3] R. Angelova, G. Kasneci, G. Weikum, Graffiti: graph-based classification in heterogeneous networks, WWW 15 (2) (2012) 139–170.
[4] Y. Sun, Y. Yu, J. Han, Ranking-based clustering of heterogeneous information networks with star network schema, in: Proceedings of the Association for Computing Machinery (ACM)’s Annual Conference on Knowledge Discovery and Data Mining (KDD), ACM, 2009, pp. 797–806.
[5] D. Davis, R. Lichtenwalter, N.V. Chawla, Multi-relational link prediction in heterogeneous information networks, in: Proceedings of the International Conference on Advances in Social Network Analysis and Mining, ASONAM, IEEE, 2011, pp. 281–288.
[6] L. Liu, J. Tang, J. Han, M. Jiang, S. Yang, Mining topic-level influence in heterogeneous networks, in: Proceedings of the Conference on Information and Knowledge Management, CIKM, ACM, 2010, pp. 199–208.
[7] D. Cai, Z. Shao, X. He, X. Yan, J. Han, Mining hidden community in heterogeneous social networks, in: Proceedings of the Third International Workshop on Link Discovery, ACM, 2005, pp. 58–65.
[8] M. Jiang, P. Cui, F. Wang, Q. Yang, W. Zhu, S. Yang, Social recommendation across multiple relational domains, in: Proceedings of the Conference on Information and Knowledge Management, CIKM, ACM, 2012, pp. 1422–1431.
[9] D. Greene, P. Cunningham, Producing a unified graph representation from multiple social network views, in: Proceedings of the International Web Science Conference, ACM, 2013, pp. 118–121.
[10] R. Agrawal, S. Rajagopalan, R. Srikant, Y. Xu, Mining newsgroups using networks arising from social behavior, in: Proceedings of the Twelfth International Conference on World Wide Web, ACM, 2003, pp. 529–535.
[11] H. Li, Z. Nie, W.-C. Lee, L. Giles, J.-R. Wen, Scalable community discovery on textual data with relations, in: Proceedings of the Seventeenth ACM Conference on Information and Knowledge Management, ACM, 2008, pp. 1203–1212.
[12] Y.-R. Lin, J. Sun, P. Castro, R. Konuru, H. Sundaram, A. Kelliher, Metafac: community discovery via relational hypergraph factorization, in: Proceedings of the Fifteenth ACM SIGKDD International Conference on Knowledge Discovery and Data Mining, ACM, 2009, pp. 527–536.
[13] T. Mei, B. Yang, X.-S. Hua, S. Li, Contextual video recommendation by multimodal relevance and user feedback, ACM Trans. Inf. Syst. 29 (2) (2011) 10.
[14] J. Sang, C. Xu, Right buddy makes the difference: an early exploration of social relation analysis in multimedia applications, in: Proceedings of the Twentieth ACM International Conference on Multimedia, ACM, 2012, pp. 19–28.
[15] M. Yan, J. Sang, T. Mei, C. Xu, Friend transfer: cold-start friend recommendation with cross-platform transfer learning of social knowledge, in: Proceedings of the 2013 IEEE International Conference on Multimedia and Expo (ICME), IEEE, 2013, pp. 1–6.
[16] X. Qian, H. Feng, G. Zhao, T. Mei, Personalized recommendation combining user interest and social circle, IEEE Trans. Knowl. Data Eng. 26 (7) (2014) 1763–1777.
[17] P. Bhattacharyya, A. Garg, S.F. Wu, Analysis of user keyword similarity in online social networks, Soc. Netw. Anal. Min. 1 (3) (2011) 143–158.
[18] M. Roth, A. Ben-David, D. Deutscher, G. Flysher, I. Horn, A. Leichtberg, N. Leiser, Y. Matias, R. Merom, Suggesting friends using the implicit social graph, in: Proceedings of the Sixteenth ACM SIGKDD International Conference on Knowledge Discovery and Data Mining, ACM, 2010, pp. 233–242.
[19] Z. Wu, S. Jiang, Q. Huang, Friend recommendation according to appearances on photos, in: Proceedings of the Seventeenth ACM International Conference on Multimedia, ACM, 2009, pp. 987–988.
[20] Q. Li, Y. Zheng, X. Xie, Y. Chen, W. Liu, W.-Y. Ma, Mining user similarity based on location history, in: Proceedings of the Sixteenth ACM SIGSPATIAL International Conference on Advances in Geographic Information Systems, ACM, 2008, p. 34.
[21] M. Balabanović, Y. Shoham, Fab: content-based, collaborative recommendation, Commun. ACM 40 (3) (1997) 66–72.
[22] P. Bonhard, M. Sasse, ’knowing me, knowing you’-using profiles and social networking to improve recommender systems, BT Technol. J. 24 (3) (2006) 84–98.
[23] D. Carmel, N. Zwerdling, I. Guy, S. Ofek-Koifman, N. Har’El, I. Ronen, E. Uziel, S. Yogev, S. Chernov, Personalized social search based on the user’s social network, in: Proceedings of the Eighteenth ACM conference on Information and knowledge management, ACM, 2009, pp. 1227–1236.
[24] M. Claypool, A. Gokhale, T. Miranda, P. Murnikov, D. Netes, M. Sartin, Combining content-based and collaborative filters in an online newspaper, in: Proceedings of ACM SIGIR Workshop on Recommender Systems, vol. 60, Citeseer, 1999.
[25] S. Farrell, T. Lau, Fringe contacts: people-tagging for the enterprise, in: Proceedings of the WWW’06 Collaborative Web Tagging Workshop, 2006.
[26] C.S. Firan, W. Nejdl, R. Paiu, The benefit of using tag-based profiles, in: Proceedings of the Fifth Latin American Web Congress, LA-WEB 2007, IEEE, 2007, pp. 32–41.
[27] D. Goldberg, D. Nichols, B.M. Oki, D. Terry, Using collaborative filtering to weave an information tapestry, Commun. ACM 35 (12) (1992) 61–70.
[28] X. Li, L. Guo, Y.E. Zhao, Tag-based social interest discovery, in: Proceedings of the Seventeenth International Conference on World Wide Web, ACM, 2008, pp. 675–684.
[29] M.J. Pazzani, D. Billsus, Content-based recommendation systems, The Adaptive Web, Springer, 2007, pp. 325–341.
[30] R.R. Sinha, K. Swearingen, Comparing recommendations made by online systems and friends, in: Proceedings of the DELOS Workshop on Personalisation and Recommender Systems in Digital Libraries, vol. 1, 2001.
[31] S. Sen, J. Vig, J. Riedl, Tagommenders: connecting users to items through tags, in: Proceedings of the Eighteenth International Conference on World Wide Web, ACM, 2009, pp. 671–680.
[32] H.C. Wu, R.W.P. Luk, K.F. Wong, K.L. Kwok, Interpreting tf-idf term weights as making relevance decisions, ACM Trans. Inf. Syst. 26 (3) (2008) 13.
[33] C. Cortes, M. Mohri, A. Rostamizadeh, Algorithms for learning kernels based on centered alignment, JMLR 13 (2012) 795–828.
[34] A. Kumar, H. Daumé, A co-training approach for multi-view spectral clustering, in: Proceedings of the Twenty Eighth International Conference on Machine Learning (ICML-11), 2011, pp. 393–400.
[35] T. Xia, D. Tao, T. Mei, Y. Zhang, Multiview spectral embedding, IEEE Trans. Syst. Man Cybern. Part B 40 (6) (2010) 1438–1446.
[36] M. Gupta, A. Pathak, S. Chakrabarti, Fast algorithms for Topk personalized pagerank queries, in: Proceedings of the Seventeenth International Conference on World Wide Web, ACM, 2008, pp. 1225–1226.

Dongyan Guo received the B.S. degree in application mathematics and the Ph.D. degree in pattern recognition and intelligent system from Nanjing University of Science and Technology, China, in 2008 and 2015, respectively. Since 2015, he has been a faculty member in the College of Computer Science and Technology, Zhejiang University of Technology, Hangzhou, China. His research interests include computer vision and machine learning.

Jingsong Xu received the B.S. degree from Nanjing University of Science and Technology, China, in 2007, and the Ph.D. degree from the same university, in 2014. He is currently a Research Fellow with the Global Big Data Technologies Centre, University of Technology, Sydney. His research interests include computer vision, pattern recognition and machine learning and the applications in object detection and action recognition.

Jian Zhang (SM’04) received the B.Sc. degree from East China Normal University, Shanghai, China, in 1982; the M.Sc. degree in computer science from Flinders University, Adelaide, Australia, in 1994; and the Ph.D. degree in electrical engineering from the University of New South Wales (UNSW), Sydney, Australia, in 1999. He is currently an Associate Professor with the Global Big Data Technologies Centre and School of Computing and Communication, Faculty of Engineering and Information Technology, University of Technology Sydney, Sydney. He is the author or co-author of more than 100 paper publications, book chapters, and six issued patents filed in the U.S. and China. His current research interests include multimedia processing and communications, image and video processing, machine learning, pattern recognition, media and social media visual information retrieval and mining, human–computer interaction and intelligent video surveillance systems. He is a senior member of the IEEE.

Min Xu received the B.E. degree from University of Science and Technology of China, in 2000, M.S. degree from National University of Singapore in 2004 and Ph.D. degree from University of Newcastle, Australia in 2010. She is currently a Senior Lecturer at University of Technology, Sydney. Her research interests include multimedia data analytics, social multimedia, pattern recognition and computer vision.

Ying Cui received the B.S. degree in computer science and technology and the Ph.D. degree in pattern recognition and intelligent system from Nanjing University of Science and Technology, China, in 2008 and 2015, respectively. From 2013 to 2014, she was a visiting student with University of Technology, Sydney, Australia. Since 2015, she has been a faculty member in the College of Computer Science and Technology, Zhejiang University of Technology, Hangzhou, China. Her research interests include computer vision, pattern recognition and image processing.

Xiangjian He is a professor of computer science. He is the Director of Computer Vision and Pattern Recognition Laboratory and a co-leader of the Network Security Research group, University of Technology Sydney. His research interests include network security, image processing, pattern recognition and computer vision. He is a senior member of the IEEE.
