Towards Segregation Aware Social
Recommendations
INTRODUCTION
• A community in a social network is a group of people who are more tightly interconnected with one another than with the rest of the network.
• Users in a community tend to interact more frequently with each other and share common interests.
• Detecting such tight communities is one of the essential tools in social network analysis.
• Adding circles manually is laborious and requires constant updates as new connections are made.
Does Similarity-Based Friendship Recommendation Increase or Decrease Social Segregation?
Segregation is the social division of human beings based on any number of factors, including race or nationality. For example: BJP and Congress communities.
Less segregation is good for society in the long term.
So, our task is to see how a social recommendation system segregates different types of users when it recommends friends.
Literature Study
• The Stanford Network Analysis Project (SNAP) introduced the concept of viewing each user as an individual "ego".
• Circle detection is formulated there as a clustering problem on a user's ego-network, the network of friendships among that user's friends.
• The current project is built on a personal friendship network, which draws on the idea of the ego-network.
• The SNAP work studied the problem of automatically discovering users' social circles.
Related work
• Barbieri et al. (2014) gave a method to predict new connections between people with a stochastic topic model. The method also represents whether a link is "topical" or "social" and produces an explanation of the type of recommendation produced.
• Gupta et al. (2013) presented Twitter's user recommendation service, which is based on shared interests, common connections, and other related factors.
• Liben-Nowell and Kleinberg (2003) studied the user recommendation problem as a link prediction problem. They developed several approaches, based on metrics that analyze the proximity of the nodes in a social network, to infer the probability of new connections among users.
Dataset Description
• We used the Facebook dataset provided by Stanford University, which consists of networks with predefined communities.
• The dataset consists of 'circles' (or 'friends lists') from Facebook. The data has been anonymized by replacing the Facebook-internal ids for each user with a new value.
• It is possible to determine whether two users have the same political affiliation, but not what their individual political affiliations represent.
Dataset Description
Fig 2. User profile tree representation

User 1
  first name: Alan
  last name: Turing
  work
    position: Cryptanalyst, company: GC&CS
  education
    name: Cambridge, type: College
    name: Princeton, type: Graduate School

User 2
  first name: Dilly
  last name: Knox
  work
    position: Cryptanalyst, company: GC&CS
    position: Cryptanalyst, company: Royal Navy
  education
    name: Cambridge, type: College

Profile information in all of our datasets can be represented as a tree where each level encodes increasingly specific information. From Facebook we collect data from 26 categories, including hometowns, birthdays, colleagues, political affiliations, etc.
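One way to work with such a profile tree is to flatten every root-to-leaf path into a feature string, so that two users can be compared by the features they share. The sketch below is illustrative only: the nested-dict layout and the helper name `flatten_profile` are assumptions, not the dataset's actual file format.

```python
def flatten_profile(node, prefix=()):
    """Return the set of 'path;to;leaf' feature strings of a profile tree."""
    features = set()
    if isinstance(node, dict):
        for key, child in node.items():
            features |= flatten_profile(child, prefix + (key,))
    elif isinstance(node, list):
        # Several entries under one category (e.g. two schools) share a prefix.
        for child in node:
            features |= flatten_profile(child, prefix)
    else:
        features.add(";".join(prefix + (str(node),)))
    return features


# The two example profiles from Fig 2, written as nested dicts (assumed layout).
turing = {
    "first name": "Alan",
    "last name": "Turing",
    "work": [{"position": "Cryptanalyst", "company": "GC&CS"}],
    "education": [
        {"name": "Cambridge", "type": "College"},
        {"name": "Princeton", "type": "Graduate School"},
    ],
}
knox = {
    "first name": "Dilly",
    "last name": "Knox",
    "work": [
        {"position": "Cryptanalyst", "company": "GC&CS"},
        {"position": "Cryptanalyst", "company": "Royal Navy"},
    ],
    "education": [{"name": "Cambridge", "type": "College"}],
}

# Features present in both profiles.
shared = flatten_profile(turing) & flatten_profile(knox)
print(sorted(shared))
```

The shared set here contains the common employer, position, and college, which is the kind of overlap a profile-based similarity could build on.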
Main objectives
• To design and implement a friendship recommendation simulation using the Facebook dataset.
• To run the simulation with two different friendship recommendation algorithms.
• To see how the different friendship recommendation algorithms cause social segregation in the social network.
• To find an intervention mechanism for downstream applications so that they do not further reinforce societal segregation.
Implementation of the friendship recommendation simulation
Creating Network:
Form the network as given in the dataset. Let us represent the network as G = (V, E), where V is the set of nodes and E is the set of edges. For all e ∈ E, e = (v_i, v_j) where v_i, v_j ∈ V.
Fig 3. Representing a social network as a graph
The dataset also gives users a political label, either 0 or 1. We place all users with political label 0 in group 0 and all users with political label 1 in group 1. Some users do not belong to either group.
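The network construction described above can be sketched with plain dictionaries. The function name `build_network`, the edge list, and the labels are hypothetical toy values chosen for illustration, not data from the Facebook dataset.

```python
from collections import defaultdict


def build_network(edges, labels):
    """Build an undirected adjacency map and split users by political label."""
    adj = defaultdict(set)
    for u, v in edges:
        if u != v:  # ignore self-loops
            adj[u].add(v)
            adj[v].add(u)
    groups = {0: set(), 1: set()}
    for user, label in labels.items():
        adj[user]  # touching the key registers isolated users as nodes
        if label in groups:  # users without a 0/1 label stay ungrouped
            groups[label].add(user)
    return adj, groups


# Toy input (hypothetical values).
edges = [(1, 2), (2, 3), (3, 4), (4, 5)]
labels = {1: 0, 2: 0, 3: None, 4: 1, 5: 1}
adj, groups = build_network(edges, labels)
print(groups)
```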
User Arrival Process:
Some users in social networks are more active than others, so we model this with a user-specific activity rate.
Let r_i represent the activity rate of v_i. At the start of the simulation, we sample the activity rate of every user from the following normal distribution, clipped to [0, 1]:
r_i ∼ Normal(mean = 0.5, std = 0.2)
The unit of r_i is the number of activities per unit time. We then create a Poisson point process for each user from that user's activity rate.
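A minimal sketch of the activity-rate sampling, using only the standard library. The function name `sample_activity_rates` and the fixed seed are assumptions made for reproducibility of the example.

```python
import random


def sample_activity_rates(users, mean=0.5, std=0.2, seed=0):
    """Draw one activity rate per user from Normal(mean, std), clipped to [0, 1]."""
    rng = random.Random(seed)
    rates = {}
    for user in users:
        r = rng.gauss(mean, std)
        rates[user] = min(1.0, max(0.0, r))  # clip to [0, 1]
    return rates


rates = sample_activity_rates(range(1000))
print(min(rates.values()), max(rates.values()))
```

Note that a rate clipped to exactly 0 describes a user who never arrives, so a simulation built on these rates needs to handle that case.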
The Poisson process can be used to model the number of occurrences of events, such as user arrivals, during a certain period of time.
Modeling inter-arrival times and arrival times in a Poisson process:
If the number of events (user activities) in an interval is modeled with the discrete Poisson distribution, then the time between consecutive events can be modeled with the Exponential distribution, which is continuous.
Simulating inter-arrival times in a Poisson process:
We do this with the inverse-CDF technique: we construct the inverse function of the CDF and feed it probability values drawn from a Uniform(0, 1) distribution. This gives the inter-arrival times corresponding to those probabilities. The CDF of the inter-arrival times is F(t) = 1 − e^(−λt), so its inverse function is
F⁻¹(p) = −ln(1 − p) / λ
We feed into this function probability values p from the continuous uniform distribution Uniform(0, 1).
The following is the table of patient inter-arrival times in hours at the ER for the first 10 patients. We generated this data using the above formula, with λ set to 5 patients per hour.
Simulating a Poisson process:
1. For the given average incidence rate λ, use the inverse-CDF technique to generate inter-arrival times.
2. Generate the actual arrival times as a running sum of the inter-arrival times.
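The two steps above can be sketched as follows. The inverse CDF of an Exponential(λ) inter-arrival time is −ln(1 − p) / λ for p drawn from Uniform(0, 1). The function names and the seed are illustrative assumptions.

```python
import math
import random


def inter_arrival_time(lam, rng):
    """Step 1: one inter-arrival time via the inverse CDF of Exponential(lam)."""
    p = rng.random()  # uniform on [0, 1)
    return -math.log(1.0 - p) / lam


def arrival_times(lam, n_events, seed=0):
    """Step 2: arrival times as the running sum of inter-arrival times."""
    rng = random.Random(seed)
    t = 0.0
    times = []
    for _ in range(n_events):
        t += inter_arrival_time(lam, rng)
        times.append(t)
    return times


# Ten arrivals at an average rate of 5 events per unit time.
times = arrival_times(lam=5.0, n_events=10)
print([round(t, 3) for t in times])
```

With λ = 5 the mean gap between arrivals is 1/λ = 0.2 time units, which a long run of the simulation should approach.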
User's Affinity Score:
• Social network platforms use network structure, user histories, likes/dislikes, etc. to compute an affinity score between every pair of users who do not have a direct link, and then use those affinity scores for friend suggestions.
• Here we use only the network structure to compute the affinity score. As described for Algorithm 2, it is based on the number of paths of length 2 and length 3 between the two users, followed by score normalisation.
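The exact affinity formula is not reproduced in this text. The Algorithm 2 description bases the score on paths of length 2 and length 3 followed by normalisation, so the sketch below follows that reading. The weights `w2` and `w3` and the division by the maximum score are assumptions, not the authors' definition.

```python
def affinity_scores(adj, user, w2=1.0, w3=0.5):
    """Score every non-friend of `user` by weighted counts of simple paths
    of length 2 and length 3, then normalise by the largest score."""
    friends = adj[user]
    raw = {}
    for a in friends:
        for b in adj[a]:
            if b == user:
                continue
            if b not in friends:
                raw[b] = raw.get(b, 0.0) + w2  # path user-a-b
            for c in adj[b]:
                if c != user and c != a and c not in friends:
                    raw[c] = raw.get(c, 0.0) + w3  # path user-a-b-c
    top = max(raw.values(), default=0.0)
    if top == 0.0:
        return {}
    return {v: score / top for v, score in raw.items()}


# Toy graph (hypothetical): 1 is friends with 2 and 3, both know 4, 4 knows 5.
adj = {1: {2, 3}, 2: {1, 4}, 3: {1, 4}, 4: {2, 3, 5}, 5: {4}}
scores = affinity_scores(adj, 1)
print(scores)
```

In the toy graph, user 4 is reached by two length-2 paths and user 5 by two length-3 paths, so user 4 ranks first.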
User Dynamics:
• We model the friendship dynamics with user-specific send probabilities and user-specific accept probabilities in our simulation.
• Based on the send and accept probabilities, the user chooses whether to send friend requests to those listed in the recommendation, and whether to accept the friend requests received.
The user v_i sends a request to the j-th ranked friend recommendation R_i[j] according to a Bernoulli draw with the user's send probability.
The user then accepts or rejects any pending friend requests. If v_j sent a friend request earlier, v_i accepts or rejects it according to a Bernoulli draw with the user's accept probability.
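A minimal sketch of one user session under the simplest reading of these dynamics. It assumes that the send probability does not depend on the rank j and that rejected requests are discarded. The names `user_session`, `pending`, `send_prob`, and `accept_prob` are hypothetical.

```python
import random


def bernoulli(p, rng):
    """Return True with probability p."""
    return rng.random() < p


def user_session(user, recommendations, adj, pending, send_prob, accept_prob, rng):
    """One login: send requests to recommended users, then answer received ones."""
    # 1. Send a request to each recommended user with probability send_prob[user].
    for candidate in recommendations:
        if candidate == user or candidate in adj[user]:
            continue
        if bernoulli(send_prob[user], rng):
            pending.setdefault(candidate, set()).add(user)
    # 2. Accept each received request with probability accept_prob[user].
    for sender in sorted(pending.pop(user, set())):
        if bernoulli(accept_prob[user], rng):
            adj[user].add(sender)
            adj[sender].add(user)


rng = random.Random(0)
adj = {1: set(), 2: set()}
pending = {}
send_prob = {1: 1.0, 2: 1.0}
accept_prob = {1: 1.0, 2: 1.0}
user_session(1, [2], adj, pending, send_prob, accept_prob, rng)  # 1 sends to 2
user_session(2, [], adj, pending, send_prob, accept_prob, rng)   # 2 accepts
print(adj)
```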
The following is a screenshot from our friendship recommendation simulation for user 3195.
Recommendation based on Network structure (Algorithm 1)
Two important factors when recommending new friends to a user are the number of mutual (common) friends and how close the candidate nodes are to the user in the network.
The algorithm takes as input the graph of friends with their group labels and outputs a recommendation list of new friends ordered by distance.
The algorithm can be easily understood using our practice graph. Its nodes are A, B, C, D, E, and F.
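A distance-based recommender of this kind can be sketched with breadth-first search. The edges of the practice graph are not given in this text, so the six-node graph below is a hypothetical stand-in, and breaking ties by the number of mutual friends is an assumption.

```python
from collections import deque


def bfs_distances(adj, source):
    """Shortest-path length from `source` to every reachable node."""
    dist = {source: 0}
    queue = deque([source])
    while queue:
        u = queue.popleft()
        for v in adj[u]:
            if v not in dist:
                dist[v] = dist[u] + 1
                queue.append(v)
    return dist


def recommend_by_structure(adj, user, k=10):
    """Rank non-friends by distance, then by number of mutual friends."""
    dist = bfs_distances(adj, user)
    candidates = [v for v in dist if v != user and v not in adj[user]]

    def sort_key(v):
        mutual = len(adj[user] & adj[v])
        return (dist[v], -mutual, str(v))

    return sorted(candidates, key=sort_key)[:k]


# Hypothetical practice graph on the nodes A to F.
edges = [("A", "B"), ("A", "C"), ("B", "D"), ("C", "D"), ("D", "E"), ("E", "F")]
adj = {node: set() for node in "ABCDEF"}
for u, v in edges:
    adj[u].add(v)
    adj[v].add(u)

print(recommend_by_structure(adj, "A"))
```

Because the ranking rewards closeness only, the top suggestions come from the user's immediate neighbourhood.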
Where does the algorithm fail?
Figure: users' network (nodes u1 to u20) with their group labels.
Group 0 users: u2, u3, u10, u11, u12
Group 1 users: u6, u7, u8, u16, u17, u18

Recommendations based on Reservation (Algorithm 2)
• Generate the graph from the given dataset and store the group of each node: nodes with political label 0 belong to group 0 and nodes with political label 1 belong to group 1.
• Simulate the order in which users arrive in the network using a Poisson point process.
• For an arriving user, calculate affinity scores to all other nodes. The score is based on the number of paths of length 2 and length 3 and is then normalised.
• Take the top 10 recommendations by affinity score: the top 8 (80%) are chosen normally and the bottom 2 (20%) are reserved for users from the other group.
• If a group 0 user arrives, 20% of the recommendation list is reserved for group 1 users, and vice versa.
• The algorithm then follows the user dynamics process.
Here we reserve 20% of the list for users from the other group. In the same way, the reservation can be increased to 30%.
Evaluation Metric
This metric measures segregation between groups in the network.
Metric = A / B
where A is the average inter-group distance and B is the average of the intra-group average distances.
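A sketch of the metric A / B using shortest-path distances from breadth-first search. Reading "distance" as shortest-path length and skipping disconnected pairs are assumptions, and the function names are illustrative.

```python
from collections import deque
from itertools import combinations


def bfs_distances(adj, source):
    """Shortest-path length from `source` to every reachable node."""
    dist = {source: 0}
    queue = deque([source])
    while queue:
        u = queue.popleft()
        for v in adj[u]:
            if v not in dist:
                dist[v] = dist[u] + 1
                queue.append(v)
    return dist


def average_distance(adj, pairs):
    """Mean shortest-path length over the connected pairs in `pairs`."""
    cache = {}
    total, count = 0, 0
    for u, v in pairs:
        if u not in cache:
            cache[u] = bfs_distances(adj, u)
        d = cache[u].get(v)
        if d is not None:  # assumption: disconnected pairs are skipped
            total += d
            count += 1
    return total / count if count else float("nan")


def segregation_metric(adj, group0, group1):
    """A / B: inter-group average distance over mean intra-group average distance."""
    inter = average_distance(adj, [(u, v) for u in group0 for v in group1])
    intra0 = average_distance(adj, combinations(sorted(group0), 2))
    intra1 = average_distance(adj, combinations(sorted(group1), 2))
    return inter / ((intra0 + intra1) / 2)


# Toy path graph 1-2-3-4 with groups {1, 2} and {3, 4}.
adj = {1: {2}, 2: {1, 3}, 3: {2, 4}, 4: {3}}
print(segregation_metric(adj, {1, 2}, {3, 4}))
```

A larger value means users are, on average, farther from the other group than from their own, which indicates more segregation.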
Global cluster coefficient:
The clustering coefficient measures the degree to which nodes in a graph tend to cluster together. We used it to measure how the different algorithms form clusters within a specific group.
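The experiment tracks the average clustering coefficient of each group's subnetwork and of the whole network, so the sketch below computes the average of the local clustering coefficients. Assigning 0 to nodes with fewer than two neighbours and restricting neighbours to the chosen node set (an induced subgraph) are assumptions.

```python
def local_clustering(adj, node, allowed=None):
    """Fraction of pairs of neighbours of `node` that are themselves linked."""
    nbrs = adj[node] if allowed is None else adj[node] & allowed
    nbrs = sorted(nbrs)
    k = len(nbrs)
    if k < 2:
        return 0.0  # assumption: nodes with fewer than 2 neighbours count as 0
    links = sum(
        1
        for i in range(k)
        for j in range(i + 1, k)
        if nbrs[j] in adj[nbrs[i]]
    )
    return 2.0 * links / (k * (k - 1))


def average_clustering(adj, nodes=None):
    """Average local clustering over `nodes`, using the subgraph they induce."""
    nodes = set(adj) if nodes is None else set(nodes)
    if not nodes:
        return 0.0
    return sum(local_clustering(adj, n, nodes) for n in nodes) / len(nodes)


# Toy graph: triangle 1-2-3 with a pendant node 4 attached to 3.
adj = {1: {2, 3}, 2: {1, 3}, 3: {1, 2, 4}, 4: {3}}
print(average_clustering(adj))             # whole network
print(average_clustering(adj, {1, 2, 3}))  # subnetwork of one group
```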
EXPERIMENT
We tested the two recommendation systems in the following way:
• Create the network along with the user-specific groups, i.e. group 0 and group 1.
• Randomly generate 50000 user logins through a Poisson point process.
• Whenever a user accepts a friend request, update the graph so that the next friendship recommendation is based on the updated graph.
• After every 1000 user logins, do the following:
  • Calculate the group-specific average cluster coefficient of the group 0 subnetwork and store it in a list.
  • Calculate the group-specific average cluster coefficient of the group 1 subnetwork and store it in a list.
  • Calculate the evaluation metric, the inter-group average distance divided by the average of the intra-group average distances, and store it in a list.
  • Calculate the average cluster coefficient of the whole network and store it in a list.
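The login loop with periodic measurements can be sketched as below. It merges one Poisson process per user into a single login stream with a heap and calls a measurement hook at a fixed interval. The callbacks `on_login` and `on_checkpoint` are hypothetical placeholders for the recommendation step and the metric calculations, and the rates are toy values.

```python
import heapq
import math
import random


def login_stream(rates, n_logins, seed=0):
    """Yield user ids in arrival order by merging one Poisson process per user."""
    rng = random.Random(seed)

    def gap(rate):
        # Exponential inter-arrival time via the inverse CDF.
        return -math.log(1.0 - rng.random()) / rate

    # Users with a zero rate never log in, so they are left out of the heap.
    heap = [(gap(rate), user) for user, rate in rates.items() if rate > 0]
    heapq.heapify(heap)
    for _ in range(n_logins):
        if not heap:
            return
        t, user = heapq.heappop(heap)
        yield user
        heapq.heappush(heap, (t + gap(rates[user]), user))


def run_experiment(rates, n_logins, checkpoint_every, on_login, on_checkpoint):
    """Process logins one by one and take measurements at regular intervals."""
    for count, user in enumerate(login_stream(rates, n_logins), start=1):
        on_login(user)             # e.g. recommend friends, apply user dynamics
        if count % checkpoint_every == 0:
            on_checkpoint(count)   # e.g. compute clustering and segregation metric


# Toy run: three users, 5000 logins, a checkpoint every 1000 logins.
rates = {1: 0.9, 2: 0.1, 3: 0.0}
logins, checkpoints = [], []
run_experiment(rates, 5000, 1000, logins.append, checkpoints.append)
print(checkpoints)
```

More active users appear more often in the stream, which matches the intent of the user-specific activity rates.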
RESULT
Recommendation based on Network structure (Algorithm 1)
• Group segregation increases with the number of user logins.
• From this we can say that similarity-based friendship recommendation increases social segregation.
• We computed the average cluster coefficient for the group-specific subnetworks and for the entire network.
• The cluster coefficient of the nodes increases as the number of user logins increases.
• Users cluster together more tightly because the algorithm recommends only users who are close to the target user.
Recommendations based on Reservation (Algorithm 2)
• Group segregation decreases significantly with the number of user logins compared to Algorithm 1.
• As the reservation percentage increases, more new connections are made between users from different groups. The figure shows that 30% reservation makes the network less segregated than 20% reservation.
• Algorithm 2 lowers the average cluster coefficient of the nodes as the number of user logins increases, compared to Algorithm 1, so nodes do not cluster as tightly as under Algorithm 1.
• As the reservation percentage increases, more new connections are made between users from different groups. This creates more open triangles and results in less clustering of nodes.
Conclusion and Future works
In summary, we performed graph analysis and studied various properties of the social network. After understanding the entire network, we ran the friendship recommendation simulation and observed how it recommends friends to a specific user under two different types of algorithm. In addition, we analysed the user dynamics of the network by applying the evaluation metric and the average cluster coefficient at different login intervals. In brief, we found that algorithms whose suggestions are based on the closest or nearest distance generally form clusters. The current dataset does not have specific features listed for each user. If they were provided, we could use them to detect users with similar features and recommend friends.

References
[1] Representing degree distributions, clustering, in social networks with latent cluster
random effects models. Social Networks, 2009 P. Krivitsky, M. Handcock.
[2] Modularity and community structure in social networks. PNAS, 2006 M. Newman.
[3] Uncovering the overlapping community structure of complex networks in nature and society.
Nature, 2005 I. Farkas, and T. Vicsek.
[4] Discovering social circles in ego networks. J. McAuley and J. Leskovec. 2012.
[5] The anatomy of the Facebook social graph. J. Ugander, B. Karre preprint, 2011.
[6] Towards discovering hidden communities based on user profiles. In ICDM , 2010 T.
Yoshida
THANK YOU
