
Lectures 3-4

Social Networks Analysis

Dr. Nesma Ebrahim

Social Network Analysis (SNA)
including a tutorial on concepts and methods

Social Media – Dr. Giorgos Cheliotis


Communications and New Media, National University of Singapore
What is a Social Network?
• Network
– a set of nodes, points or locations connected
What is a Social Network?
• Social Network
- a social structure made up of individuals (or
organizations) called "nodes", which are tied
(connected) by one or more specific types
of interdependency, such as friendship, common
interest
What is a Social Network?
• Social Network Analysis (SNA)
- views social relationships in terms of network
theory consisting of nodes and ties (also
called edges, links or connections).
Some concepts

• A node or vertex is an individual unit in the graph or system.

• A graph or system or network is a set of units that may be (but are not necessarily) connected to each other.
Some concepts
• An “edge” is a connection or tie between two nodes.

• A neighborhood N for a vertex or node is the set of its immediately connected nodes.

• Degree: The degree ki of a vertex or node is the number of other nodes in its neighborhood.
Some concepts

• In an undirected graph or network, the edges are reciprocal—so if A is connected to B, B is by definition connected to A.

• In a directed graph or network, the edges are not necessarily reciprocal—A may be connected to B, but B may not be connected to A (think of a graph with arrows indicating the direction of the edges).
Directed vs undirected graphs
What is a Network?
• Network = graph
• Informally a graph is a set of nodes joined by a
set of lines or arrows.

[Figure: two small example graphs on nodes 1–6.]
Example 1: Friendship Network
Example 2: Scientific collaboration network
A collaboration network (CN) is a partnership of autonomous
people and organizations, supported by a computer network,
that collaborate to share resources, such as data and
connectivity.
Example 3: Business ties in US
biotech-industry Example
Definition: Graph

• G is an ordered pair G := (V, E)
• V is a set of nodes, points, or vertices.
• E is a set whose elements are known as edges or lines.
Example

• V:={1,2,3,4,5,6}
• E:={{1,2},{1,5},{2,3},{2,5},{3,4},{4,5},{4,6}}
Weighted graphs

• A weighted graph is a graph for which each edge has an associated weight, usually given by a weight function w: E → R.

[Figure: two example graphs on nodes 1–6 with a weight attached to each edge.]
Storing data on a directed graph

Graph (directed), nodes 1–4.

Edge list:

Vertex  Vertex
1       2
1       3
2       3
2       4
3       4

Adjacency matrix:

Vertex  1   2   3   4
1       -   1   1   0
2       0   -   1   1
3       0   0   -   1
4       0   0   0   -
Representing an undirected graph

Directed (who contacts whom) vs. undirected (who knows whom).

The edge list remains the same, but its interpretation is different now:

Vertex  Vertex
1       2
1       3
2       3
2       4
3       4

The adjacency matrix becomes symmetric:

Vertex  1   2   3   4
1       -   1   1   0
2       1   -   1   1
3       1   1   -   1
4       0   1   1   -
Adding weights to edges (directed or undirected)

Edge list: add a column of weights

Vertex  Vertex  Weight
1       2       30
1       3       5
2       3       22
2       4       2
3       4       37

Adjacency matrix: add weights instead of 1

Vertex  1    2    3    4
1       -    30   5    0
2       30   -    22   2
3       5    22   -    37
4       0    2    37   -

Weights could be:
• Frequency of interaction in the period of observation
• Number of items exchanged in the period
• Individual perceptions of strength of relationship
• Costs in communication or exchange, e.g. distance
• Combinations of these
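The weighted edge list and adjacency matrix above can be connected with a short sketch (illustrative code written for these notes, not from the slides; the assumption that nodes are numbered 1..n is the helper's, not the lecture's):

```python
def to_adjacency(weighted_edges, n, directed=False):
    """Build an n x n adjacency matrix from a weighted edge list,
    storing the weight instead of 1 (0 = no edge). Nodes are 1..n."""
    m = [[0] * n for _ in range(n)]
    for u, v, w in weighted_edges:
        m[u - 1][v - 1] = w
        if not directed:
            m[v - 1][u - 1] = w  # undirected: matrix is symmetric
    return m

# The edge list from the slide above.
edges = [(1, 2, 30), (1, 3, 5), (2, 3, 22), (2, 4, 2), (3, 4, 37)]
matrix = to_adjacency(edges, 4)
```

For an unweighted graph the same helper works with weight 1 on every edge.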
Social links

• explicit and declared relations (e.g. being a colleague)
• interactions between actors (e.g. communicating)
• affiliation between actors (e.g. liking the same web page)
Example: Extracting relations from data

Four people: Anne, Jim, John, Mary. Can we study their interactions as a network?

Communication:
Anne: Jim, tell the Murrays they’re invited
Jim: Mary, you and your dad should come for dinner!
Jim: Mr. Murray, you should both come for dinner
Anne: Mary, did Jim tell you about the dinner? You must come.
John: Mary, are you hungry?
…

[Figure: the resulting graph, with a vertex (node) 1–4 for each person and an edge (link) for each communication.]
Graph Neural Networks (GNN)
• Machine learning methods are based on data, most often the audio, visual, or textual data we encounter every day, such as images, video, text, and speech.
• Connection-based data can be represented as graphs. Such structures are much more complex than images and text, due to multiple levels of connectivity in a structure that is irregular and unpredictable.
• All graph convolutional network methods are based on message propagation. Messages carry information through a network composed of the nodes and edges of the graph, while each node carries its own computational unit. The task of each node is to process the information and pass it on to its neighbors.
Graph Neural Networks (GNNs)
• emerged as a general framework addressing both node-related and graph-related tasks.
• GNNs employ a message passing procedure, where each node updates its feature vector by aggregating the feature vectors of its neighbors.
• After k iterations of the message passing procedure, each node obtains a feature vector which captures the structural information within its k-hop neighborhood.
• For graph-related tasks, GNNs compute a feature vector for the entire graph, e.g. by summing the feature vectors of all the nodes of the graph.
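A minimal sketch of one message-passing iteration, written for these notes: sum aggregation and element-wise addition stand in for the aggregation and combine steps (real GNNs use learned, parameterized functions, so both choices here are illustrative assumptions):

```python
def message_passing_step(adj, features):
    """One message-passing iteration: each node sums its neighbors'
    feature vectors (the "messages") and combines the result with its
    own features by element-wise addition."""
    new_features = {}
    for node, feats in features.items():
        agg = [0.0] * len(feats)
        for nb in adj[node]:            # aggregate neighbor messages
            for i, x in enumerate(features[nb]):
                agg[i] += x
        new_features[node] = [own + msg for own, msg in zip(feats, agg)]
    return new_features

def message_passing(adj, features, k):
    """After k iterations, each node's vector reflects its k-hop
    neighborhood, as described above."""
    for _ in range(k):
        features = message_passing_step(adj, features)
    return features
```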
Limitations of the Standard GNN Model

• Three fundamental graph properties:

(1) Connectivity: A graph is connected if there is a path from any vertex to any other vertex in the graph.
(2) Bipartiteness
(3) Triangle-freeness
Limitations of the Standard GNN Model
(2) Bipartiteness: A graph G = (V, E) is bipartite if its set of vertices V can be decomposed into two disjoint sets V1 and V2, i.e., V = V1 ∪ V2, such that every edge e ∈ E connects a vertex in V1 to a vertex in V2.
• A bipartite graph is a graph in which we can divide the vertices into two independent sets, such that every edge connects vertices between these sets.
• No edge can be established within either set.
• A matching in a bipartite graph (bipartite matching) is a set of edges picked so as not to share an endpoint. Furthermore, a maximum matching is a matching of maximum cardinality of the chosen edge set.
• The standard augmenting-path algorithm for maximum matching runs in O(|V|·|E|) time, where V represents the set of nodes and E represents the set of edges.
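The bipartiteness test described above can be sketched with BFS 2-coloring (illustrative code written for these notes): a graph is bipartite iff its vertices can be colored with two colors so that no edge joins two same-colored vertices.

```python
from collections import deque

def is_bipartite(adj):
    """BFS 2-coloring bipartiteness check. adj: {vertex: [neighbors]}.
    Handles disconnected graphs by starting a BFS in each component."""
    color = {}
    for start in adj:
        if start in color:
            continue
        color[start] = 0
        queue = deque([start])
        while queue:
            u = queue.popleft()
            for v in adj[u]:
                if v not in color:
                    color[v] = 1 - color[u]   # give neighbor the opposite color
                    queue.append(v)
                elif color[v] == color[u]:
                    return False              # odd cycle found: not bipartite
    return True
```

A 4-cycle is bipartite; a triangle (odd cycle) is not.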
Bipartiteness
Maximum Bipartite Matching
• There are many real-world problems that can be formulated as bipartite matching. For example, consider the following problem:
• There are M job applicants and N jobs. Each applicant has a subset of jobs that he/she is interested in. Each job opening can only accept one applicant, and a job applicant can be appointed to only one job. Find an assignment of jobs to applicants such that as many applicants as possible get jobs.
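The jobs/applicants problem above can be sketched with the augmenting-path method (Kuhn's algorithm), which gives the O(|V|·|E|) bound mentioned earlier. The code and data names are hypothetical, written for these notes:

```python
def max_bipartite_matching(graph, applicants):
    """Maximum bipartite matching via augmenting paths.
    graph[a] = list of jobs applicant a is interested in.
    Returns (size of matching, {job: applicant})."""
    match = {}  # job -> applicant currently holding it

    def try_assign(a, visited):
        for j in graph.get(a, []):
            if j in visited:
                continue
            visited.add(j)
            # Assign if the job is free, or if its current holder
            # can be moved to some other job (an augmenting path).
            if j not in match or try_assign(match[j], visited):
                match[j] = a
                return True
        return False

    count = 0
    for a in applicants:
        if try_assign(a, set()):
            count += 1
    return count, match

# Hypothetical instance: applicant A wants job 1, B wants 1 or 2, C wants 2.
interests = {'A': [1], 'B': [1, 2], 'C': [2]}
size, assignment = max_bipartite_matching(interests, ['A', 'B', 'C'])
```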
Maximum Bipartite Matching

Maximum Bipartite Matching and Max Flow Problem

The Maximum Bipartite Matching (MBP) problem can be solved by converting it into a flow network.

(3) Triangle-freeness: a graph is triangle-free if it does not contain a triangle (a cycle of three vertices).
Path length & Neighbourhoods

• Path length: number of edges in the shortest path between two nodes
• k-hop neighbourhood of a node: the set of nodes that can be reached through paths of length k (friends… and friends of friends… etc.)

[Figure: the 1-hop, 2-hop, 3-hop and 4-hop neighbourhoods of a node.]
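The k-hop neighbourhood definition can be sketched with a depth-limited breadth-first search (an illustrative helper, not from the slides):

```python
from collections import deque

def k_hop_neighbourhood(adj, node, k):
    """Set of nodes reachable from `node` through shortest paths of
    length at most k, excluding the node itself."""
    dist = {node: 0}
    queue = deque([node])
    while queue:
        u = queue.popleft()
        if dist[u] == k:
            continue                    # do not expand beyond k hops
        for v in adj[u]:
            if v not in dist:           # first visit = shortest distance
                dist[v] = dist[u] + 1
                queue.append(v)
    return {v for v in dist if v != node}
```

On a path 1–2–3–4, the 1-hop neighbourhood of node 1 is {2} and the 2-hop neighbourhood is {2, 3}.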
Graph Example: Counting Network Hops

• Graphs play an important part in solving many networking problems.
• One problem, for example, is determining the best way to get from one node to another in an internet, a network of gateways into other networks.
• One way to model an internet is using an undirected graph in which vertices represent nodes, and edges represent connections between the nodes.
• With this model, we can use breadth-first search to help determine the smallest number of traversals, or hops, between various nodes.
Graph Example: Counting Network Hops Cont.
Consider a graph which represents an internet of six nodes. Starting at node1, there is more than one way we can reach node4. The paths ⟨node1, node2, node4⟩, ⟨node1, node3, node2, node4⟩, and ⟨node1, node3, node5, node4⟩ are all acceptable. Breadth-first search determines the shortest path, ⟨node1, node2, node4⟩, which requires two hops.
Solution

• Implement breadth-first search to determine the smallest number of hops between nodes in an internet.
• The function has three arguments: graph is a graph, which in this problem represents the internet; start is the vertex representing the starting point; and hops is the list of hop counts found.
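A sketch of the BFS hop counter described above. The adjacency below is an assumption consistent with the example paths in the text (the text does not list every edge, and node6's links in particular are not given, so they are guessed here):

```python
from collections import deque

def count_hops(graph, start):
    """Breadth-first search from `start`, returning the smallest number
    of hops (edge traversals) to every reachable node.
    graph: {node: [neighbors]} modeling an undirected internet."""
    hops = {start: 0}
    queue = deque([start])
    while queue:
        u = queue.popleft()
        for v in graph[u]:
            if v not in hops:           # first visit = fewest hops
                hops[v] = hops[u] + 1
                queue.append(v)
    return hops

# Assumed six-node internet consistent with the paths in the text.
internet = {1: [2, 3], 2: [1, 3, 4], 3: [1, 2, 5],
            4: [2, 5], 5: [3, 4, 6], 6: [5]}
```

With this model, `count_hops(internet, 1)` reports two hops from node1 to node4, matching the shortest path found in the example.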
k-hop GNNs
Graph neural networks (GNNs) have emerged recently as a powerful architecture for learning node and graph representations. The k-hop GNN is a more expressive architecture, which updates a node's representation by aggregating information not only from its direct neighbors, but from its whole k-hop neighborhood.
k-hop GNNs algorithm
By updating node features this way, we can capture structural information that is not visible when aggregating only the 1-hop neighborhood.

1. The model consists of neighborhood aggregation layers that do not take into account only the direct neighbors of the nodes, but their entire k-hop neighborhood.

2. Instead of the standard neighborhood aggregation layer, the proposed model updates the hidden state h_v^(t) of a node v by aggregating over its entire k-hop neighborhood.
k-hop GNNs algorithm
Social Network Analysis
Main measures for social networks are:

1. Degree Centrality:
The number of direct connections a node has. What really matters, though, is where those connections lead and how they connect the otherwise unconnected.

2. Betweenness Centrality:
A node with high betweenness has great influence over what flows in the network, indicating important links and single points of failure.

3. Closeness Centrality:
A measure of how close a node is to everyone else. The pattern of its direct and indirect ties allows the node to reach any other node in the network more quickly than anyone else; such nodes have the shortest paths to all others.
Degree Centrality
• Definition: Degree centrality assigns an importance score based simply on the number of links held by each node.
• What it tells us: How many direct, ‘one hop’ connections each node has to other nodes in the network.
• When to use it: For finding very connected individuals, popular individuals, individuals who are likely to hold most information or individuals who can quickly connect with the wider network.
• A bit more detail: Degree centrality is the simplest measure of node connectivity. Sometimes it’s useful to look at in-degree (number of inbound links) and out-degree (number of outbound links) as distinct measures, for example when looking at transactional data or account activity.

[Figure: a network of terrorists, repeatedly filtered by degree (also known as a k-degenerate graph), revealing clusters of tightly-connected nodes.]
Degree centrality
• A node’s (in-) or (out-)degree is the number of links that lead into or out of the node
• In an undirected graph they are of course identical
• Often used as a measure of a node’s degree of connectedness and hence also influence and/or popularity
• Useful in assessing which nodes are central with respect to spreading information and influencing others in their immediate ‘neighborhood’

[Figure: a hypothetical graph of seven nodes, each labeled with its degree. Nodes 3 and 5 have the highest degree (4).]
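In- and out-degree can be read straight off an edge list; a sketch (written for these notes) using the five-edge example graph from earlier in the lecture:

```python
def degrees(edges, directed=False):
    """In- and out-degree of every node in an edge list. For an
    undirected graph each edge counts toward both endpoints' degrees,
    so in-degree and out-degree coincide."""
    indeg, outdeg = {}, {}
    for u, v in edges:
        outdeg[u] = outdeg.get(u, 0) + 1
        indeg[v] = indeg.get(v, 0) + 1
        if not directed:
            outdeg[v] = outdeg.get(v, 0) + 1
            indeg[u] = indeg.get(u, 0) + 1
    return indeg, outdeg

# The undirected example graph from the adjacency-matrix slides.
edges = [(1, 2), (1, 3), (2, 3), (2, 4), (3, 4)]
indeg, outdeg = degrees(edges)
```

Here nodes 2 and 3 come out with the highest degree (3), i.e. they are the most connected in that example.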
Problem of Degree centrality:
Degree centrality, however, can be deceiving, because it is a purely local measure.

For example: point 3 has only 3 direct connections, but it has indirect short-path access to many other nodes.
Centralization
Centralization is calculated as the ratio between the sum, over all nodes, of the differences between the most central node’s score and each node’s score, and the maximum possible sum of such differences. Centralization provides a measure of the extent to which a whole network has a centralized structure.
Betweenness Centrality
• Definition: Betweenness centrality measures the number of times a node lies on the shortest path between other nodes.
• What it tells us: This measure shows which nodes are ‘bridges’ between nodes in a network. It does this by identifying all the shortest paths and then counting how many times each node falls on one.
• When to use it: For finding the individuals who influence the flow around a system.
• A bit more detail: Betweenness is useful for analyzing communication dynamics, but should be used with care. A high betweenness count could indicate someone holds authority over disparate clusters in a network, or just that they are on the periphery of both clusters.

[Figure: visualizing an email network, with nodes resized by betweenness score.]
Betweenness centrality
 For a given node v and each pair of nodes i and j:
  T = the number of shortest paths between nodes i and j
  Y = the number of shortest paths between nodes i and j that pass through v
  Let Cent = Y / T
  Sum the above Cent values over all node pairs i, j
 Shows which nodes are more likely to be in communication paths between other nodes
 Also useful in determining points where the network would break apart (think who would be cut off if nodes 3 or 5 would disappear)

[Figure: the hypothetical graph, each node labeled with its betweenness score. Node 5 has higher betweenness centrality than node 3.]
Betweenness Centrality:
Model based on communication flow: A person who lies on communication paths can control communication flow, and is thus important. Betweenness centrality counts the number of shortest paths between i and k that actor j resides on.

[Figure: an example graph on nodes a–h.]
Betweenness Centrality:

C_B(n_i) = Σ_{j<k} g_jk(n_i) / g_jk

where g_jk = the number of geodesics connecting j and k, and g_jk(n_i) = the number that actor i is on.

A geodesic is the shortest path between any particular pair of nodes in a network.
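Summing g_jk(n_i)/g_jk over all pairs can be computed efficiently with Brandes' algorithm; this sketch (written for these notes, for unweighted undirected graphs) uses BFS to count shortest paths and then back-propagates pair dependencies:

```python
from collections import deque

def betweenness(adj):
    """Betweenness centrality: for every pair j, k, accumulate the
    fraction g_jk(n_i)/g_jk of shortest paths passing through n_i.
    Undirected scores are halved because each pair is seen from both
    endpoints."""
    bc = {v: 0.0 for v in adj}
    for s in adj:
        # BFS from s: sigma counts shortest paths, preds records them.
        sigma = {v: 0 for v in adj}; sigma[s] = 1
        dist = {s: 0}
        preds = {v: [] for v in adj}
        order = []
        queue = deque([s])
        while queue:
            u = queue.popleft()
            order.append(u)
            for w in adj[u]:
                if w not in dist:
                    dist[w] = dist[u] + 1
                    queue.append(w)
                if dist[w] == dist[u] + 1:
                    sigma[w] += sigma[u]
                    preds[w].append(u)
        # Accumulate dependencies in reverse BFS order.
        delta = {v: 0.0 for v in adj}
        for w in reversed(order):
            for u in preds[w]:
                delta[u] += sigma[u] / sigma[w] * (1 + delta[w])
            if w != s:
                bc[w] += delta[w]
    return {v: bc[v] / 2 for v in adj}
```

On a star graph, all shortest paths between leaves pass through the center, so the center gets a score equal to the number of leaf pairs.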
Betweenness Centrality Examples:
Closeness centrality
• Calculate the average length of all shortest paths from a node to all other nodes in the network (i.e. how many hops on average it takes to reach every other node)
• Take the inverse of the above value so that higher values are ‘better’ (indicate higher closeness), like in other measures of centrality
• It is a measure of reach, i.e. the speed with which information can reach other nodes from a given starting node

Note: Sometimes closeness is calculated without taking the reciprocal of the mean shortest path length. Then lower values are ‘better’.

[Figure: the hypothetical graph, each node labeled with its closeness score. Nodes 3 and 5 have the highest (i.e. best) closeness, while node 2 fares almost as well.]
Closeness Centrality
• Definition: Closeness centrality scores each node based on its ‘closeness’ to all other nodes in the network.
• What it tells us: This measure calculates the shortest paths between all nodes, then assigns each node a score based on its sum of shortest paths.
• When to use it: For finding the individuals who are best placed to influence the entire network most quickly.
• A bit more detail: Closeness centrality can help find good ‘broadcasters’, but in a highly-connected network, you will often find all nodes have a similar score. What may be more useful is using closeness to find influencers within a single cluster.

[Figure: a corporate email network; nodes with a high closeness score are enlarged.]
Measuring Networks: Flow
Closeness centrality

A third measure of centrality is closeness centrality. An actor is considered important if he/she is relatively close to all other actors.

Closeness is based on the inverse of the distance of each actor to every other actor in the network.

Closeness Centrality:

C_c(n_i) = [ Σ_{j=1}^{g} d(n_i, n_j) ]^(−1)

where C_c(n_i) defines the standardized closeness centrality of node i and d(n_i, n_j) denotes the geodesic distance between i and j.
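The formula above (inverse of the sum of geodesic distances) can be sketched with one BFS per node; illustrative code written for these notes, assuming a connected undirected graph:

```python
from collections import deque

def closeness(adj):
    """Closeness centrality C_c(n_i) = 1 / sum_j d(n_i, n_j), where
    d is the geodesic (shortest-path) distance, computed by BFS."""
    scores = {}
    for s in adj:
        dist = {s: 0}
        queue = deque([s])
        while queue:
            u = queue.popleft()
            for v in adj[u]:
                if v not in dist:       # first visit = geodesic distance
                    dist[v] = dist[u] + 1
                    queue.append(v)
        scores[s] = 1.0 / sum(dist.values())
    return scores
```

On a path 1–2–3, the middle node reaches both others in one hop (sum of distances 2), so it gets the highest score, 0.5.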
Interpretation of measures (1)

Centrality measure   Interpretation in social networks
Degree               How many people can this person reach directly?
Betweenness          How likely is this person to be the most direct route between two people in the network?
Closeness            How fast can this person reach everyone in the network?

CNM Social Media Module – Giorgos Cheliotis


Recommender systems
Lecture 5

Recommender systems

A recommender system is an algorithmic tool that recommends items to users.


Difficulties of Decision Making

• Which digital camera should I buy?
• Where should I spend my holiday?
• Which movie should I rent?
• Whom should I follow?
• Where should I find interesting news articles?
• Which movie is the best for our family?
When Does This Problem Occur?

• There are many choices
• There are no obvious advantages among them
• We do not have enough resources to check all options (information overload)
• We do not have enough knowledge and experience to choose, or
– I’m lazy, but don’t want to miss out on good stuff
– Defensive decision making

Goal of Recommendation: To come up with a short list of items that fits the user’s interests
Common Solutions to the Problem
Recommender Systems - Examples
Main Idea behind Recommender Systems

Use historical data such as the user’s past preferences or similar users’ past preferences to predict future likes.
Recommendation vs. Search

• One way to get answers is using search engines
• Search engines find results that match the query provided by the user
• The results are generally provided as a list ordered with respect to the relevance of the item to the given query
• Consider the query “best 2013 movie to watch”
– The same results for an 8 year old and an adult

Search engines’ results are not customized.
Challenges of Recommender Systems

• The Cold Start Problem
– Recommender systems use historical data or information provided by the user to recommend items, products, etc.
– When users join a site, they have not yet bought any product and have no history.
– It is hard to infer what they are going to like when they start on a site.

• Data Sparsity
– When historical or prior information is insufficient.
– Unlike the cold start problem, this concerns the system as a whole and is not specific to an individual.
Challenges of Recommender Systems
• Attacks
– Push attack: pushing ratings up by creating fake users
– Nuke attack: pushing ratings down, or denial-of-service (DoS) and distributed denial-of-service (DDoS) attacks that stop the whole recommendation system

• Privacy
– Using one’s private info to recommend to others

• Explanation
– Recommender systems often recommend items with no explanation of why these items are recommended

DoS and DDoS are two commonly used terms for attacks in which the target server or application is made unresponsive.
Main Recommender System methods

1. Content-based recommendation

2. Collaborative filtering
1- Content-based Recommender Systems
Content-based Recommendations

Main idea: Recommend items to customer x similar to previous items rated highly by x

Example:
▪ Movie recommendations: recommend movies with the same actor(s), director, genre, …
▪ Websites, blogs, news: recommend other sites with “similar” content
Content-based Recommendations
Content-based recommendation

▪ Collaborative filtering does NOT require any information about the items,
▪ However, it might be reasonable to exploit such information
▪ E.g. recommend fantasy novels to people who liked fantasy novels in the past

▪ What do we need:
▪ Some information about the available items such as the genre ("content")
▪ Some sort of user profile describing what the user likes (the preferences)

▪ The task:
▪ Learn user preferences
▪ Locate/recommend items that are "similar" to the user preferences
Paradigms of recommender systems

Content-based: "Show me more of the same what I've liked"
What is the "content"?

▪ The genre is actually not part of the content of a book
▪ Most CB-recommendation methods originate from the Information Retrieval (IR) field:
– The item descriptions are usually automatically extracted (important words)
– Goal is to find and rank interesting text documents (news articles, web pages)

▪ Here:
– Classical IR-based methods based on keywords
– No expert recommendation knowledge involved
– User profiles (preferences) are rather learned than explicitly elicited
User and item profiling
• Item profile is a set (vector) of features
› Movies: author, title, actor, director, …
› Text: set of “important” words in a document; the usual heuristic from text mining is TF-IDF

• User profile is also a vector of features
› average of rated item profiles (possibly weighted by difference from average)
Content representation and item similarities
Term-Frequency - Inverse Document Frequency (TF-IDF)

▪ Simple keyword representation has its problems
– In particular when automatically extracted, because
▪ Not every word has similar importance
▪ Longer documents have a higher chance to have an overlap with the user profile

▪ Standard measure: TF-IDF
– Encodes text documents as weighted term vectors
– TF: Measures how often a term appears (density in a document)
▪ Assuming that important terms appear more often
▪ Normalization has to be done in order to take document length into account
– IDF: Aims to reduce the weight of terms that appear in all documents
❖ TF-IDF is the product of TF and IDF. It gives more weight to a word that is frequent in a document (high TF) but rare across the corpus of all documents (high IDF).
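A minimal TF-IDF sketch over tokenized documents (written for these notes; TF is normalized by document length and IDF is taken as log(N/df), one common variant among several):

```python
import math

def tf_idf(docs):
    """TF-IDF weights for a list of tokenized documents.
    TF = term count / document length (length normalization);
    IDF = log(N / df) down-weights terms found in many documents."""
    n = len(docs)
    # Document frequency: in how many documents does each term occur?
    df = {}
    for doc in docs:
        for term in set(doc):
            df[term] = df.get(term, 0) + 1
    weights = []
    for doc in docs:
        w = {}
        for term in doc:
            tf = doc.count(term) / len(doc)
            idf = math.log(n / df[term])
            w[term] = tf * idf
        weights.append(w)
    return weights
```

A term present in every document gets IDF = log(1) = 0 and therefore zero weight, regardless of how often it appears.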
TF-IDF

Content-Based Recommendation Algorithm
Pros & Cons of Content-based Approach

+: No need for data on other users


+: Able to recommend to users with unique tastes
+: Able to recommend new items
+: Able to provide explanations

–: Finding the appropriate features is hard


Think images, movies, music
–: No recommendations for new users
–: Overspecialization: never recommends items outside
user’s content profile
– : People might have multiple interests
2- Collaborative Filtering Recommender System
Collaborative Filtering

• Consider user x
• Find set N of other users whose ratings are “similar” to x’s ratings
• Estimate x’s ratings based on the ratings of users in N
Collaborative Filtering

Collaborative filtering: the process of selecting information or patterns using techniques involving collaboration among multiple agents, viewpoints, data sources, etc.

Advantage: we don’t need to have additional information about the users or content of the items
– Users’ rating or purchase history is the only information that is needed to work
Rating Matrix: An Example

Rating Matrix

Users rate (rank) items (purchased, watched)

Explicit ratings:
– entered by a user directly
– i.e., “Please rate this on a scale of 1-5”

Implicit ratings:
– Inferred from other user behavior
– E.g., play lists or music listened to, for a music recommender system
– The amount of time users spent on a webpage
Memory-Based Collaborative Filtering

Two memory-based methods:

User-based CF: Users with similar previous ratings for items are likely to rate future items similarly.

Item-based CF: Items that have received similar ratings previously from users are likely to receive similar ratings from future users.
Collaborative Filtering: Algorithm

1. Weigh all users/items with respect to their similarity with the current user/item

2. Select a subset of the users/items (neighbors) as recommenders

3. Predict the rating of the user for specific items using neighbors’ ratings for the same (or similar) items

4. Recommend items with the highest predicted ratings
Finding “Similar” Users

Representation as sets:
Rx = {1, 4, 5}
Ry = {1, 3, 4}
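The slide does not name a measure for comparing such set representations; one standard choice (an assumption here, not prescribed by the lecture) is Jaccard similarity, the ratio of shared items to all items either user has rated:

```python
def jaccard(rx, ry):
    """Jaccard similarity of two sets of rated items:
    |intersection| / |union|, ranging from 0 (disjoint) to 1 (equal)."""
    return len(rx & ry) / len(rx | ry)
```

For the sets above, the users share items {1, 4} out of {1, 3, 4, 5}, giving a similarity of 0.5.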
User-based nearest-neighbor collaborative filtering (1)
 The basic technique:
 Given an "active user" (Alice) and an item I not yet seen by Alice
 The goal is to estimate Alice's rating for this item, e.g., by
  - finding a set of users (peers) who liked the same items as Alice in the past and who have rated item I
  - using, e.g., the average of their ratings to predict if Alice will like item I
  - doing this for all items Alice has not seen and recommending the best-rated

        Item1  Item2  Item3  Item4  Item5
Alice     5      3      4      4      ?
User1     3      1      2      3      3
User2     4      3      4      3      5
User3     3      3      1      5      4
User4     1      5      5      2      1
User-based nearest-neighbor collaborative filtering (2)
 Some first questions
  How do we measure similarity?
  How many neighbors should we consider?
  How do we generate a prediction from the neighbors' ratings?

        Item1  Item2  Item3  Item4  Item5
Alice     5      3      4      4      ?
User1     3      1      2      3      3
User2     4      3      4      3      5
User3     3      3      1      5      4
User4     1      5      5      2      1
Measuring user similarity
 A popular similarity measure in user-based CF: Pearson correlation

sim(a, b) = Σ_{p∈P} (r_{a,p} − r̄_a)(r_{b,p} − r̄_b) / [ √(Σ_{p∈P} (r_{a,p} − r̄_a)²) · √(Σ_{p∈P} (r_{b,p} − r̄_b)²) ]

a, b : users
r_{a,p} : rating of user a for item p
P : set of items rated both by a and b
r̄_a, r̄_b : the users' average ratings
Possible similarity values between -1 and 1

        Item1  Item2  Item3  Item4  Item5
Alice     5      3      4      4      ?
User1     3      1      2      3      3     sim = 0.85
User2     4      3      4      3      5     sim = 0.70
User3     3      3      1      5      4
User4     1      5      5      2      1     sim = -0.79
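The Pearson similarity can be checked with a short sketch (written for these notes) over the ratings table, where means are taken over the items P that both users have rated:

```python
import math

def pearson(ra, rb):
    """Pearson correlation between two users; ra, rb map item -> rating.
    Computed over the set P of items rated by both users."""
    shared = set(ra) & set(rb)
    ma = sum(ra[p] for p in shared) / len(shared)  # mean over shared items
    mb = sum(rb[p] for p in shared) / len(shared)
    num = sum((ra[p] - ma) * (rb[p] - mb) for p in shared)
    den = (math.sqrt(sum((ra[p] - ma) ** 2 for p in shared))
           * math.sqrt(sum((rb[p] - mb) ** 2 for p in shared)))
    return num / den

# Ratings from the table above (item -> rating).
alice = {1: 5, 2: 3, 3: 4, 4: 4}
user1 = {1: 3, 2: 1, 3: 2, 4: 3, 5: 3}
user2 = {1: 4, 2: 3, 3: 4, 4: 3, 5: 5}
```

This reproduces sim(Alice, User1) ≈ 0.85 and sim(Alice, User2) ≈ 0.70.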
Rating Predictions

Pros/Cons of Collaborative Filtering
+ Works for any kind of item: no feature selection needed
- Cold Start: need enough users in the system to find a match
- Sparsity: the user/ratings matrix is sparse; hard to find users that have rated the same items
- First rater: cannot recommend an item that has not been previously rated (new items, esoteric items)
- Popularity bias: cannot recommend items to someone with unique taste; tends to recommend popular items
Making predictions
 A common prediction function:

pred(a, p) = r̄_a + Σ_{b∈N} sim(a, b) · (r_{b,p} − r̄_b) / Σ_{b∈N} |sim(a, b)|

 Calculate whether the neighbors' ratings for the unseen item p are higher or lower than their average
 Combine the rating differences – use the similarity as a weight
 Add/subtract the neighbors' bias from the active user's average and use this as a prediction
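The three steps above can be sketched directly (illustrative code written for these notes). Using Alice's average of 4 and her two most similar users from the earlier table as neighbors, with each neighbor's mean taken over all five of their rated items (an assumption of this example):

```python
def predict(ra_mean, neighbors, item):
    """pred(a, p) = r_a_mean + sum_b sim(a,b) * (r_bp - r_b_mean)
                               / sum_b |sim(a,b)|
    neighbors: list of (similarity, ratings_dict, neighbor_mean)."""
    num = sum(sim * (ratings[item] - mean)       # weighted rating deviations
              for sim, ratings, mean in neighbors)
    den = sum(abs(sim) for sim, _, _ in neighbors)
    return ra_mean + num / den

# User1 (sim 0.85, mean 2.4) rated Item5 as 3;
# User2 (sim 0.70, mean 3.8) rated Item5 as 5.
neighbors = [(0.85, {5: 3}, 2.4), (0.70, {5: 5}, 3.8)]
p = predict(4.0, neighbors, 5)   # predicted rating of Alice for Item5
```

Both neighbors rated Item5 above their own average, so the prediction lands above Alice's average of 4, at roughly 4.87.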
Item-based collaborative filtering
 Basic idea:
  Use the similarity between items (and not users) to make predictions
 Example:
  Look for items that are similar to Item5
  Take Alice's ratings for these items to predict the rating for Item5

        Item1  Item2  Item3  Item4  Item5
Alice     5      3      4      4      ?
User1     3      1      2      3      3
User2     4      3      4      3      5
User3     3      3      1      5      4
User4     1      5      5      2      1
