Lec_18-Graph_Analytics
[Figure: example attributed graph; nodes carry AGE, MAJOR, ETHNICITY and gender (F/M) attributes]
Local-Level Features
Large-Scale Features
Global Features
expandedramblings.com
source: V. Podobnik @ ResearchGate
February 2014
Suppose G = (V , E ), |V | = n, |E | = m
Density: $d(G) = \frac{m}{\binom{n}{2}}$
[Figure: example graph with a table of degree $k$ against $\Pr(k)$; average degree $d_{av}$ and maximum degree $d_{max}$ marked]
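The density $d(G) = m / \binom{n}{2}$ and the two degree summaries $d_{av}$ and $d_{max}$ can be computed directly from an edge list. The sketch below is illustrative only: the graph is hypothetical (not the one in the figure), vertices are assumed to be numbered 1..n, and the helper names `density` and `degrees` are made up for this example.

```python
# Hypothetical undirected simple graph (not the one pictured in the lecture).
n = 5                                   # |V|, vertices assumed labelled 1..n
edges = [(1, 2), (1, 3), (1, 4), (2, 3), (4, 5)]
m = len(edges)                          # |E|

def density(n, m):
    """d(G) = m / C(n, 2): fraction of vertex pairs that are joined by an edge."""
    return m / (n * (n - 1) / 2)

def degrees(n, edges):
    """Degree of every vertex 1..n (isolated vertices get degree 0)."""
    deg = {v: 0 for v in range(1, n + 1)}
    for u, v in edges:
        deg[u] += 1
        deg[v] += 1
    return deg

deg = degrees(n, edges)
d_av = sum(deg.values()) / n            # equals 2m / n
d_max = max(deg.values())
print(density(n, m), d_av, d_max)
```

Passing n explicitly (rather than inferring it from the edge list) keeps isolated vertices in the count, which matters for both density and average degree.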
Log plots are good for variables with high variance (in one or both dimensions)
Graph Analytics 36 / 100
Exploring Degree Distribution
Degree Distribution
$\Pr(k)$: probability that a randomly chosen vertex has degree $k$
Complementary cdf: $\Pr(\deg \ge k) = \sum_{j=k}^{n} \Pr(\deg = j)$, plotted vs. $k$
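A minimal sketch of how $\Pr(k)$ and the complementary cdf could be tabulated from an edge list. The graph is hypothetical, the function names are illustrative, and the code assumes every vertex appears in at least one edge (isolated vertices would need to be added separately). Exact fractions are used so the probabilities sum to exactly 1.

```python
from collections import Counter
from fractions import Fraction

# Hypothetical undirected simple graph; every vertex appears in some edge.
edges = [(1, 2), (1, 3), (1, 4), (2, 3), (4, 5)]

def degree_distribution(edges):
    """Return {k: Pr(deg = k)} over the vertices that appear in the edge list."""
    deg = Counter()
    for u, v in edges:
        deg[u] += 1
        deg[v] += 1
    n = len(deg)
    counts = Counter(deg.values())      # how many vertices have each degree
    return {k: Fraction(counts[k], n) for k in sorted(counts)}

def ccdf(pr):
    """Return {k: Pr(deg >= k)} by summing Pr(deg = j) over all j >= k."""
    return {k: sum(p for j, p in pr.items() if j >= k) for k in pr}

pr = degree_distribution(edges)
tail = ccdf(pr)
for k in pr:
    print(k, pr[k], tail[k])
```

For a heavy-tailed network, plotting `tail` on log-log axes is usually less noisy than plotting `pr`, because the sum smooths out sparsely populated high degrees.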
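Clustering coefficient of $v$: $C(v) = \frac{|E(G[N(v)])|}{\binom{d(v)}{2}}$
Clustering coefficient of $G$: $C(G) = \frac{3 \times t(G)}{\sum_{v} \binom{d(v)}{2}}$, where $t(G)$ is the number of triangles in $G$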
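The two clustering coefficients can be sketched in a few lines. This is an illustration on a hypothetical graph (a triangle with one pendant vertex), with made-up helper names; it assumes an undirected simple graph and sets $C(v) = 0$ when $d(v) < 2$, which is a common convention rather than something stated in the slides.

```python
from itertools import combinations

# Hypothetical graph: triangle 1-2-3 plus a pendant vertex 4 attached to 3.
edges = [(1, 2), (1, 3), (2, 3), (3, 4)]

def adjacency(edges):
    """Build {vertex: set of neighbours} for an undirected simple graph."""
    adj = {}
    for u, v in edges:
        adj.setdefault(u, set()).add(v)
        adj.setdefault(v, set()).add(u)
    return adj

def local_clustering(adj, v):
    """C(v) = |E(G[N(v)])| / C(d(v), 2); taken as 0 when d(v) < 2."""
    nbrs = adj[v]
    d = len(nbrs)
    if d < 2:
        return 0.0
    links = sum(1 for a, b in combinations(nbrs, 2) if b in adj[a])
    return links / (d * (d - 1) / 2)

def global_clustering(adj):
    """C(G) = 3 * t(G) / sum_v C(d(v), 2)."""
    # Every triangle is seen once from each of its three corners,
    # so this count already equals 3 * t(G).
    closed = sum(1 for v in adj
                 for a, b in combinations(adj[v], 2) if b in adj[a])
    triples = sum(len(adj[v]) * (len(adj[v]) - 1) // 2 for v in adj)
    return closed / triples if triples else 0.0

adj = adjacency(edges)
print(local_clustering(adj, 1), local_clustering(adj, 3), global_clustering(adj))
```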
Average distance is the average shortest-path length over all pairs of vertices:
$l_G = \frac{\sum_{u,v \in V} d(u, v)}{\binom{n}{2}}$
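For an unweighted graph, $d(u, v)$ can be obtained by breadth-first search from every vertex. The sketch below assumes a connected, undirected graph and uses a hypothetical 4-vertex path; the function names are illustrative.

```python
from collections import deque

# Hypothetical connected graph: the path 1 - 2 - 3 - 4.
edges = [(1, 2), (2, 3), (3, 4)]

def adjacency(edges):
    """Build {vertex: set of neighbours} for an undirected simple graph."""
    adj = {}
    for u, v in edges:
        adj.setdefault(u, set()).add(v)
        adj.setdefault(v, set()).add(u)
    return adj

def bfs_distances(adj, source):
    """Hop distances from source to every vertex reachable from it."""
    dist = {source: 0}
    queue = deque([source])
    while queue:
        u = queue.popleft()
        for w in adj[u]:
            if w not in dist:
                dist[w] = dist[u] + 1
                queue.append(w)
    return dist

def average_distance(adj):
    """Sum of d(u, v) over unordered pairs, divided by C(n, 2)."""
    n = len(adj)
    total = 0
    for u in adj:
        dist = bfs_distances(adj, u)
        if len(dist) != n:
            raise ValueError("graph is disconnected; some d(u, v) is undefined")
        total += sum(dist.values())
    # total counts each unordered pair twice, and n(n-1) = 2 * C(n, 2).
    return total / (n * (n - 1))

adj = adjacency(edges)
print(average_distance(adj))
```

Running one BFS per vertex costs O(n(n + m)), which is fine for small graphs; large networks typically estimate the average from a sample of source vertices.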
Some vertices (and edges) act as bridges between network segments; they are important for communication and help explain the small-world phenomenon in many networks
Bridges
Some edges act as bridges between network segments; they are important for communication and help explain the small-world phenomenon in many networks
An edge (i, j) is a local bridge if i and j have no friends in common
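The local-bridge test reduces to checking that the two endpoints have disjoint neighbourhoods. A minimal sketch on a hypothetical graph (two triangles joined by one edge); `is_local_bridge` and `adjacency` are illustrative names, and the graph is assumed undirected with no self-loops.

```python
# Hypothetical graph: triangles {1, 2, 3} and {4, 5, 6} joined by the edge (3, 4).
edges = [(1, 2), (1, 3), (2, 3), (3, 4), (4, 5), (4, 6), (5, 6)]

def adjacency(edges):
    """Build {vertex: set of neighbours} for an undirected simple graph."""
    adj = {}
    for u, v in edges:
        adj.setdefault(u, set()).add(v)
        adj.setdefault(v, set()).add(u)
    return adj

def is_local_bridge(adj, i, j):
    """True if (i, j) is an edge and i and j have no common neighbour."""
    return j in adj[i] and adj[i].isdisjoint(adj[j])

adj = adjacency(edges)
local_bridges = [(u, v) for u, v in edges if is_local_bridge(adj, u, v)]
print(local_bridges)
```

Here only the connecting edge qualifies: every edge inside a triangle has the third corner as a common friend.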
Vulnerability
Node x belongs to a minimum set of nodes whose removal disconnects the graph
Example: removing x causes a high disruption in the network
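The simplest instance of this idea is when the minimum set has size one, i.e. x alone disconnects the graph (a cut vertex). The sketch below covers only that special case, by brute force: remove each node in turn and test connectivity with BFS. The graph and function names are hypothetical; larger minimum sets would need a proper vertex-connectivity algorithm.

```python
from collections import deque

# Hypothetical graph: triangles {1, 2, 3} and {4, 5, 6} joined by the edge (3, 4).
edges = [(1, 2), (1, 3), (2, 3), (3, 4), (4, 5), (4, 6), (5, 6)]

def adjacency(edges):
    """Build {vertex: set of neighbours} for an undirected simple graph."""
    adj = {}
    for u, v in edges:
        adj.setdefault(u, set()).add(v)
        adj.setdefault(v, set()).add(u)
    return adj

def is_connected(adj, removed=frozenset()):
    """BFS connectivity test on the graph with the `removed` nodes deleted."""
    nodes = [v for v in adj if v not in removed]
    if not nodes:
        return True
    seen = {nodes[0]}
    queue = deque([nodes[0]])
    while queue:
        u = queue.popleft()
        for w in adj[u]:
            if w not in removed and w not in seen:
                seen.add(w)
                queue.append(w)
    return len(seen) == len(nodes)

def cut_vertices(adj):
    """Nodes whose individual removal disconnects the graph (size-one case only)."""
    return sorted(x for x in adj if not is_connected(adj, {x}))

adj = adjacency(edges)
print(cut_vertices(adj))
```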
Network Centralization (graph-level measure)
Measure of variation of centrality score amongst network nodes
Degree centrality: $C_d(v) := \deg(v)$
Harmonic Centrality
[Figure: example graph on nodes A to H annotated with per-node centrality values]
[Figure panels: degree centrality, closeness centrality, betweenness centrality, eigenvector centrality]
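Three of the centralities named in this part of the lecture can be sketched with plain BFS. The definitions used here are common textbook ones and are assumptions as far as these slides go: degree $C_d(v) = \deg(v)$, closeness $(n-1) / \sum_u d(v, u)$, and harmonic $\sum_{u \ne v} 1 / d(v, u)$. The star graph and function names are hypothetical, and the graph is assumed connected. Betweenness and eigenvector centrality need more machinery and are left out.

```python
from collections import deque

# Hypothetical connected graph: a star with centre 0 and leaves 1, 2, 3.
edges = [(0, 1), (0, 2), (0, 3)]

def adjacency(edges):
    """Build {vertex: set of neighbours} for an undirected simple graph."""
    adj = {}
    for u, v in edges:
        adj.setdefault(u, set()).add(v)
        adj.setdefault(v, set()).add(u)
    return adj

def bfs_distances(adj, source):
    """Hop distances from source to every vertex reachable from it."""
    dist = {source: 0}
    queue = deque([source])
    while queue:
        u = queue.popleft()
        for w in adj[u]:
            if w not in dist:
                dist[w] = dist[u] + 1
                queue.append(w)
    return dist

def degree_centrality(adj, v):
    """C_d(v) = deg(v)."""
    return len(adj[v])

def closeness_centrality(adj, v):
    """(n - 1) / sum of distances from v; assumes a connected graph."""
    dist = bfs_distances(adj, v)
    return (len(adj) - 1) / sum(dist.values())

def harmonic_centrality(adj, v):
    """Sum of 1 / d(v, u) over all other reachable vertices u."""
    dist = bfs_distances(adj, v)
    return sum(1 / d for u, d in dist.items() if u != v)

adj = adjacency(edges)
for v in sorted(adj):
    print(v, degree_centrality(adj, v),
          closeness_centrality(adj, v), harmonic_centrality(adj, v))
```

Harmonic centrality stays well defined on disconnected graphs (unreachable vertices simply contribute nothing), which is one reason it is often preferred over closeness.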
Nearly all of the clustering coefficient's value comes from the clique
[Figure: example attributed graph; nodes carry AGE, MAJOR, ETHNICITY and gender (F/M) attributes]
The 'MAJOR' attribute is homophilic; the 'GENDER' attribute is heterophilic
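One simple way to check whether an attribute looks homophilic or heterophilic is the fraction of edges whose endpoints share the attribute value. This is only a sketch: the four-node graph and its attribute values are invented for illustration (they are not the data in the figure), and this fraction is not necessarily the measure used in the lecture.

```python
# Hypothetical attributed graph (values invented for illustration).
edges = [("a", "b"), ("c", "d"), ("a", "c")]
major = {"a": "PHY", "b": "PHY", "c": "ECO", "d": "ECO"}
gender = {"a": "F", "b": "M", "c": "F", "d": "M"}

def same_attribute_fraction(edges, attr):
    """Fraction of edges whose two endpoints have the same attribute value."""
    same = sum(1 for u, v in edges if attr[u] == attr[v])
    return same / len(edges)

print("MAJOR :", same_attribute_fraction(edges, major))
print("GENDER:", same_attribute_fraction(edges, gender))
```

A high fraction suggests homophily and a low one heterophily, but a careful comparison would set the fraction against what random mixing of the same attribute values would produce.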
Clusters in Graphs
Modular or community structure in graphs: homogeneous building
blocks of the larger heterogeneous structure
Cluster: A dense subgraph within a graph
Cohesion: Nodes are more similar to other nodes in the same cluster
Separation: Nodes in a cluster are dissimilar to nodes in other
clusters
Analytical questions:
Discover communities
Describe interaction with a community
Describe interaction between communities
How a community emerged/dissolved
Which communities are stable
Predict whether a node will migrate to another community
Communities in Graphs
Community structure alone is a single level of organization and reveals little about the structure within and between communities
Hierarchical community structure
e.g. CS department inside SSE inside LUMS inside Lahore ...
source: Prem, Cook, & Jit (2017), Projecting social contact matrices in 152 countries using surveys and demographic data
$C_m \leftarrow \mathrm{MERGE}(C_i, C_j)$
Quantifying Cluster Quality
Broadly, we want dense (more intra-cluster edges) and well-separated (few
inter-cluster edges) clusters
Let $G = (V, E)$, $|V| = n$, and let $C$ be a subset of nodes with $|C| = n_c$
$G[C]$: the subgraph induced by $C$
$G[C, \bar{C}]$: the bipartite graph between $C$ and $\bar{C}$
$m_c = |E(G[C])|$: number of edges inside $C$
$f_c = |\{(u, v) \in E : u \in C, v \notin C\}|$
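[Figure: example graph with three clusters $C_1$, $C_2$, $C_3$]
• $n_{c_1} = 5$, $m_{c_1} = 7$, $f_{c_1} = 3$
• $n_{c_2} = 4$, $m_{c_2} = 5$, $f_{c_2} = 2$
• $n_{c_3} = 4$, $m_{c_3} = 5$, $f_{c_3} = 3$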
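Intra-cluster density of $C$: $\delta_{int}(C) = \frac{m_c}{\binom{n_c}{2}}$
Inter-cluster density of $C$, aka cut ratio: $\delta_{ext}(C) = \frac{f_c}{n_c (n - n_c)}$
Conductance of $C$: $\mathrm{conductance}(C) = \frac{f_c}{2 m_c + f_c}$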
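The three per-cluster measures follow directly from the counts $n_c$, $m_c$ and $f_c$. The sketch below derives the counts from an edge list and then evaluates each formula with exact fractions. The graph, the cluster and the function names are hypothetical; the code assumes $2 \le n_c < n$ and that the cluster touches at least one edge, so no denominator is zero.

```python
from fractions import Fraction

# Hypothetical graph: triangles {1, 2, 3} and {4, 5, 6} joined by the edge (3, 4).
edges = [(1, 2), (1, 3), (2, 3), (3, 4), (4, 5), (4, 6), (5, 6)]
n = 6
C = {1, 2, 3}

def cluster_counts(edges, C):
    """Return (n_c, m_c, f_c) for the node set C."""
    m_c = sum(1 for u, v in edges if u in C and v in C)      # edges inside C
    f_c = sum(1 for u, v in edges if (u in C) != (v in C))   # edges leaving C
    return len(C), m_c, f_c

def intra_density(n_c, m_c):
    """m_c / C(n_c, 2)."""
    return Fraction(m_c, n_c * (n_c - 1) // 2)

def cut_ratio(n, n_c, f_c):
    """f_c / (n_c * (n - n_c))."""
    return Fraction(f_c, n_c * (n - n_c))

def conductance(m_c, f_c):
    """f_c / (2 * m_c + f_c)."""
    return Fraction(f_c, 2 * m_c + f_c)

n_c, m_c, f_c = cluster_counts(edges, C)
print(intra_density(n_c, m_c), cut_ratio(n, n_c, f_c), conductance(m_c, f_c))
```

A good cluster scores high on intra-cluster density and low on both cut ratio and conductance, which is the case for the triangle used here.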
$\mathrm{metric}(G) = \sum_{C \in \mathrm{communities}(G)} \frac{n_C}{n} \times \mathrm{metric}(C)$
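As a worked instance of the size-weighted average, the sketch below plugs in the three clusters from the lecture example ($n_c$, $m_c$, $f_c$ = 5, 7, 3; 4, 5, 2; 4, 5, 3) with conductance as the per-cluster metric. It assumes the three clusters partition the graph, so that $n = 5 + 4 + 4 = 13$; that assumption is not stated in the slides. Function names are illustrative.

```python
from fractions import Fraction

# (n_c, m_c, f_c) for the three clusters in the lecture example.
clusters = [(5, 7, 3), (4, 5, 2), (4, 5, 3)]

# ASSUMPTION: the clusters partition V, so n is the sum of the cluster sizes.
n = sum(n_c for n_c, _, _ in clusters)

def conductance(m_c, f_c):
    """f_c / (2 * m_c + f_c)."""
    return Fraction(f_c, 2 * m_c + f_c)

def graph_metric(clusters, n):
    """Sum over communities of (n_C / n) * metric(C), with metric = conductance."""
    return sum(Fraction(n_c, n) * conductance(m_c, f_c)
               for n_c, m_c, f_c in clusters)

per_cluster = [conductance(m_c, f_c) for _, m_c, f_c in clusters]
overall = graph_metric(clusters, n)
print(per_cluster, overall, float(overall))
```

Weighting by $n_C / n$ keeps a tiny, poorly separated cluster from dominating the graph-level score.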
[Figure: pipeline from Graph to Embedding (Node2Vec, Graph2Vec, Convolution NN) to Clustering and Classification]