Week 3-4: SNA + Recommender Systems
Social Network Analysis (SNA)
including a tutorial on concepts and methods
• In an undirected graph or network, the edges are
reciprocal (mutual): if A is connected to B, then B is by
definition connected to A.
[Figure: two example graphs with nodes 1-6]
Example 1: Friendship Network
Example 2: Scientific collaboration network
A collaboration network (CN) is a partnership of autonomous
people and organizations, supported by a computer network,
that collaborate to share resources such as data and
connectivity.
Example 3: Business ties in the US biotech industry
Definition: Graph
• V:={1,2,3,4,5,6}
• E:={{1,2},{1,5},{2,3},{2,5},{3,4},{4,5},{4,6}}
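As a minimal sketch of this definition (the adjacency-list representation is one common choice, not prescribed by the slide):

```python
# The graph defined above: V = {1,...,6} and the edge set E.
V = {1, 2, 3, 4, 5, 6}
E = {(1, 2), (1, 5), (2, 3), (2, 5), (3, 4), (4, 5), (4, 6)}

# Build an adjacency list for the undirected graph: each edge {u, v}
# connects u to v and v to u.
adj = {v: set() for v in V}
for u, v in E:
    adj[u].add(v)
    adj[v].add(u)

print(adj[2])  # neighbors of vertex 2
```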
Weighted graphs
[Figure: the same example graphs with edge weights, e.g. 0.2, 0.3, 0.5, 1, 1.2, 1.5, 2, 3]
Storing data on a directed graph

Edge list:

Vertex  Vertex
1       2
1       3
2       3
2       4
3       4

Adjacency matrix:

Vertex  1  2  3  4
1       -  1  1  0
2       0  -  1  1
3       0  0  -  0
4       0  0  1  -
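The conversion from an edge list to an adjacency matrix can be sketched as follows, using the edge list above (the 0-based indexing is an implementation choice):

```python
# Sketch: convert the directed edge list into an adjacency matrix.
# Row = source vertex, column = target vertex; 1 marks an edge.
edges = [(1, 2), (1, 3), (2, 3), (2, 4), (3, 4)]
n = 4

matrix = [[0] * n for _ in range(n)]
for u, v in edges:
    matrix[u - 1][v - 1] = 1  # directed: only source -> target is set

for row in matrix:
    print(row)
```

For a directed graph the matrix is generally not symmetric: the edge (1, 2) sets row 1, column 2 but not row 2, column 1.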
Representing an undirected graph

The directed edge list (who contacts whom) remains the same, but its interpretation is different now:

Vertex  Vertex  Weight
1       2       30
1       3       5
2       3       22
2       4       2
3       4       37

Weights could be:
• Frequency of interaction in the period of observation
• Number of items exchanged in the period
• Individual perceptions of strength of relationship
• Costs in communication or exchange, e.g. distance
• Combinations of these

Adjacency matrix: add weights instead of 1

Vertex  1   2   3   4
1       -   30  5   0
2       30  -   22  2
3       5   22  -   37
4       0   2   37  -
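A weighted undirected graph is commonly stored as a symmetric matrix, as in the table above; this sketch builds it from the weighted edge list:

```python
# Sketch: a weighted undirected graph stored as a symmetric adjacency
# matrix, using the weighted edge list above (weight = e.g. interaction
# frequency during the observation period).
weighted_edges = [(1, 2, 30), (1, 3, 5), (2, 3, 22), (2, 4, 2), (3, 4, 37)]
n = 4

W = [[0] * n for _ in range(n)]
for u, v, w in weighted_edges:
    W[u - 1][v - 1] = w
    W[v - 1][u - 1] = w  # undirected: store the weight in both directions

# The resulting matrix is symmetric, matching the table above.
assert all(W[i][j] == W[j][i] for i in range(n) for j in range(n))
```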
Social links can take many forms, e.g.:
• "likes"
• affiliation between actors
• web links
Example: Extracting relations from data

People: Anne, Jim, John, Mary. Can we study their interactions as a network?

Communication:
Anne: Jim, tell the Murrays they’re invited.
Jim: Mary, you and your dad should come for dinner!
Jim: Mr. Murray, you should both come for dinner.
Anne: Mary, did Jim tell you about the dinner? You must come.
John: Mary, are you hungry?

Graph: each person becomes a vertex (node) and each communication becomes an edge (link).
Graph Neural Networks (GNN)
• Machine learning methods are based on data, and most everyday
data is audio, visual, or textual: images, video, text, and speech.
• Connection-based data can be represented as graphs. Such
structures are much more complex than images and text because of
the multiple levels of connectivity in the structure itself, which is
completely irregular and unpredictable.
• All graph convolutional network methods are based
on message propagation. Such messages carry information
through a network composed of the nodes and edges of the graph,
and each node carries its own computational unit. The task
of each node is to process the information and pass it on to its
neighbors.
Graph Neural Networks (GNNs)
• GNNs emerged as a general framework addressing
both node-related and graph-related tasks.
• GNNs employ a message-passing procedure, where each node
updates its feature vector by aggregating the feature vectors of
its neighbors.
• After k iterations of the message-passing procedure, each
node obtains a feature vector which captures the structural
information within its k-hop neighborhood.
• For graph-related tasks, GNNs compute a feature vector for
the entire graph, e.g. by summing the feature vectors of
all the nodes of the graph.
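The message-passing procedure described above can be sketched in a few lines. The adjacency matrix, feature sizes, and the mean-aggregate-then-ReLU update are illustrative choices, not a specific published architecture:

```python
import numpy as np

# Toy sketch of one message-passing step: each node averages its
# neighbors' feature vectors (including its own, via self-loops),
# then applies a linear map and a ReLU nonlinearity.
A = np.array([[0, 1, 1, 0],
              [1, 0, 1, 1],
              [1, 1, 0, 0],
              [0, 1, 0, 0]], dtype=float)   # undirected 4-node graph
A_hat = A + np.eye(4)                        # add self-loops
H = np.random.rand(4, 3)                     # node features (4 nodes, 3 dims)
W = np.random.rand(3, 2)                     # weight matrix (illustrative shape)

deg = A_hat.sum(axis=1, keepdims=True)
H_next = np.maximum((A_hat @ H / deg) @ W, 0.0)  # aggregate -> transform -> ReLU

# After k such steps, each node's vector reflects its k-hop neighborhood.
print(H_next.shape)
```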
Limitations of the Standard GNN Model
[Figure: 1-hop, 2-hop, 3-hop, and 4-hop neighborhoods of a node]
Graph Example: Counting Network Hops
1. Degree Centrality:
The number of direct connections a node has. What really
matters, though, is where those connections lead and how they
connect the otherwise unconnected.
2. Betweenness Centrality:
A node with high betweenness has great influence over what
flows in the network, indicating important links and single points of
failure.
3. Closeness Centrality:
A node with high closeness is close to everyone else. The
pattern of its direct and indirect ties allows it to reach any other
node in the network more quickly than anyone else: it has the
shortest paths to all others.
Degree Centrality
• Definition: Degree centrality assigns an
importance score based simply on the
number of links held by each node.
[Figure: example graph with nodes a-h]
Betweenness Centrality:
C_B(n_i) = Σ_{j<k} g_jk(n_i) / g_jk
where g_jk is the number of geodesics (shortest paths) between nodes j and k,
and g_jk(n_i) is the number of those geodesics that pass through n_i.
Closeness Centrality:
Closeness is based on the inverse of the distance of each actor to every other actor in the
network.
C_C(n_i) = [ Σ_{j=1}^{g} d(n_i, n_j) ]^(-1)
where C_C(n_i) defines the standardized closeness centrality of node i and
d(n_i, n_j) denotes the geodesic distance between i and j.
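The measures can be illustrated with a small sketch. The five-node path graph is made up for illustration, and only degree and closeness are computed here (betweenness requires shortest-path counting, e.g. Brandes' algorithm):

```python
from collections import deque

# Sketch: degree and closeness centrality on a small undirected path
# graph a - b - c - d - e, following the definitions above
# (closeness = inverse of the sum of geodesic distances).
adj = {
    'a': {'b'}, 'b': {'a', 'c'}, 'c': {'b', 'd'},
    'd': {'c', 'e'}, 'e': {'d'},
}

def distances(start):
    """Geodesic distances from start to all nodes via BFS."""
    dist = {start: 0}
    q = deque([start])
    while q:
        u = q.popleft()
        for v in adj[u]:
            if v not in dist:
                dist[v] = dist[u] + 1
                q.append(v)
    return dist

degree = {v: len(adj[v]) for v in adj}
closeness = {v: 1.0 / sum(distances(v).values()) for v in adj}

# On this path graph, the middle node 'c' has the highest closeness.
print(max(closeness, key=closeness.get))  # 'c'
```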
Interpretation of measures (1)
Recommender systems
When Does This Problem Occur?
Goal of recommendation: to come up with a short list of
items that fits the user’s interests.
56
Common Solutions to the Problem
Recommender Systems - Examples
Main Idea behind Recommender Systems
Main Idea behind Recommender Systems (cont.)
Recommendation vs. Search
Challenges of Recommender Systems
Challenges of Recommender Systems
• Attacks
– Push attack: pushing ratings up by creating fake users
– Nuke attack: e.g. denial-of-service (DoS) or distributed
denial-of-service (DDoS) attacks that stop the whole recommender system
• Privacy
– Using one’s private info to recommend to others
• Explanation
– Recommender systems often recommend items without
explaining why those items were recommended
DoS and DDoS are two commonly used terms for attacks in which
the target server or application is made unresponsive.
Main Recommender System methods
1. Content-based recommendation
2. Collaborative filtering
1- Content-based Recommender Systems
Content-based Recommendations
Example:
▪ Movie recommendations: recommend movies with the
same actor(s), director, genre, …
▪ Websites, blogs, news: recommend other sites with
“similar” content
Content-based Recommendations
Content-based recommendation
▪ Collaborative filtering does NOT require any information about the items,
▪ However, it might be reasonable to exploit such information
▪ E.g. recommend fantasy novels to people who liked fantasy novels in the past
▪ What do we need:
▪ Some information about the available items such as the genre ("content")
▪ Some sort of user profile describing what the user likes (the preferences)
▪ The task:
▪ Learn user preferences
▪ Locate/recommend items that are "similar" to the user preferences
- 69 -
Paradigms of recommender systems
What is the "content"?
▪ Here:
– Classical IR-based methods based on keywords
– No expert recommendation knowledge involved
– The user profile (preferences) is learned rather than explicitly elicited
User and item profiling
• Item profile is a set (vector) of features
› Movies: author, title, actor, director,…
› Text: Set of “important” words in document
The usual heuristic from text mining is TF-IDF
Term-Frequency - Inverse Document Frequency (TF-IDF)
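As a sketch of the TF-IDF heuristic named above, using the common form tfidf(t, d) = tf(t, d) · log(N / df(t)), where N is the number of documents and df(t) the number of documents containing term t; the tiny corpus is made up:

```python
import math

# Sketch: weight terms by how frequent they are in a document (TF)
# and how rare they are across the collection (IDF).
docs = [
    "fantasy novel about dragons".split(),
    "science fiction novel".split(),
    "cooking recipes".split(),
]
N = len(docs)

def tf(term, doc):
    """Term frequency: share of the document's words that are `term`."""
    return doc.count(term) / len(doc)

def df(term):
    """Document frequency: number of documents containing `term`."""
    return sum(1 for d in docs if term in d)

def tfidf(term, doc):
    return tf(term, doc) * math.log(N / df(term))

# "novel" appears in 2 of 3 documents, so it gets a lower IDF
# (and thus a lower TF-IDF weight) than the rarer "dragons".
print(tfidf("dragons", docs[0]) > tfidf("novel", docs[0]))  # True
```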
Content-Based Recommendation Algorithm
Pros & Cons of Content-based Approach
• Consider user x
Rating Matrix: An Example
Rating Matrix
Implicit ratings:
– Inferred from other user behavior
– E.g., playlists or music listened to, for a music recommender system
– The amount of time users spent on a webpage
Memory-Based Collaborative Filtering
User-based CF
Users with similar previous
ratings for items are likely to rate
future items similarly
Item-based CF
Items that have received similar
ratings previously from users are
likely to receive similar ratings
from future users
Collaborative Filtering: Algorithm
1. Weigh all users/items with respect to their
similarity with the current user/item
Finding “Similar” Users
• Users can be represented as sets of the items they have rated:
Rx = {1, 4, 5}
Ry = {1, 3, 4}
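One simple similarity for this set representation is the Jaccard coefficient (size of the intersection over size of the union); a minimal sketch on the two sets above:

```python
# Jaccard similarity between two users' sets of rated items.
Rx = {1, 4, 5}
Ry = {1, 3, 4}

# 2 shared items (1 and 4) out of 4 distinct items -> 0.5
jaccard = len(Rx & Ry) / len(Rx | Ry)
print(jaccard)  # 0.5
```

Note that Jaccard ignores the rating values themselves; rating-aware measures such as Pearson correlation are covered below.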
User-based nearest-neighbor
collaborative filtering (1)
The basic technique:
Given an "active user" (Alice) and an item I not yet seen by
Alice, the goal is to estimate Alice's rating for this item, e.g. by:
– finding a set of users (peers) who liked the same items as Alice in the
past and who have rated item I
– using, e.g., the average of their ratings to predict whether Alice will like item I
– doing this for all items Alice has not seen and recommending the best-rated ones
       Item1  Item2  Item3  Item4  Item5
Alice  5      3      4      4      ?
User1  3      1      2      3      3
User2  4      3      4      3      5
User3  3      3      1      5      4
User4  1      5      5      2      1
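The technique can be sketched directly on this rating table. Pearson correlation over co-rated items is used as the similarity measure; the choice of two neighbors and of mean-centered aggregation are illustrative:

```python
# Sketch of user-based nearest-neighbor CF on the rating table above:
# compute Pearson similarity between Alice and each user over co-rated
# items, then predict Alice's rating for Item5 from the two nearest peers.
ratings = {
    "Alice": [5, 3, 4, 4, None],
    "User1": [3, 1, 2, 3, 3],
    "User2": [4, 3, 4, 3, 5],
    "User3": [3, 3, 1, 5, 4],
    "User4": [1, 5, 5, 2, 1],
}

def pearson(a, b):
    """Pearson correlation over items rated by both users."""
    common = [i for i in range(5) if a[i] is not None and b[i] is not None]
    ma = sum(a[i] for i in common) / len(common)
    mb = sum(b[i] for i in common) / len(common)
    num = sum((a[i] - ma) * (b[i] - mb) for i in common)
    da = sum((a[i] - ma) ** 2 for i in common) ** 0.5
    db = sum((b[i] - mb) ** 2 for i in common) ** 0.5
    return num / (da * db) if da and db else 0.0

alice = ratings["Alice"]
sims = {u: pearson(alice, r) for u, r in ratings.items() if u != "Alice"}

# Predict Item5 from the two most similar users, combining their
# mean-centered ratings weighted by similarity.
peers = sorted(sims, key=sims.get, reverse=True)[:2]
alice_mean = sum(r for r in alice if r is not None) / 4
pred = alice_mean + sum(
    sims[u] * (ratings[u][4] - sum(ratings[u]) / 5) for u in peers
) / sum(sims[u] for u in peers)
print(round(pred, 2))  # ~4.87: Item5 looks like a good recommendation
```

Here User1 (similarity ≈ 0.85) and User2 (≈ 0.71) are Alice's nearest peers, while User4's ratings are negatively correlated with hers.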
User-based nearest-neighbor
collaborative filtering (2)
Some first questions
How do we measure similarity?
How many neighbors should we consider?
How do we generate a prediction from the neighbors' ratings?
Similarity is typically measured with the Pearson correlation:
sim(a,b) = Σ_{p∈P} (r_{a,p} − r̄_a)(r_{b,p} − r̄_b) / ( sqrt(Σ_{p∈P} (r_{a,p} − r̄_a)²) · sqrt(Σ_{p∈P} (r_{b,p} − r̄_b)²) )
a, b : users
r_{a,p} : rating of user a for item p
P : set of items rated by both a and b
r̄_a, r̄_b : the users' average ratings
Possible similarity values are between −1 and 1.