2 Grouping

Social Network Analysis (SNA)
Node grouping
Grau de Ciència de Dades | Escola Tècnica Superior d’Informàtica | Universitat Politècnica de València
Sources
● Albert-László Barabási: Network Science. Cambridge University Press, 2016
– These slides follow chapter 2 almost section by section
● Newman, Mark E. J.: Networks: an introduction. Oxford University Press, 2010
– Chapter 7 - Measures and metrics
Contents
1. Connectedness
2. Clustering coefficient
3. Central sets
4. Bipartite graphs
5. Assortativity
6. Other structures and measures
7. A case study
Social Network Analysis (SNA): Node grouping

Connectedness

CONNECTIVITY OF UNDIRECTED GRAPHS
Connected (undirected) graph: any two vertices can be joined by a path
Figure: example graphs on nodes A-G.
Largest component: giant component
The rest: isolates
A disconnected graph is made up of two or more connected components
Bridge: an edge whose removal disconnects the graph
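The definitions above can be sketched in a few lines of Python. This is a minimal illustration, not the course notebook: the toy graph and the function names `connected_components` and `bridges` are assumptions made for the example, and the bridge search is a brute-force check (remove each edge and recount the components) rather than an efficient algorithm.

```python
from collections import deque

def connected_components(nodes, edges):
    """Return the connected components of an undirected graph, largest first."""
    adj = {v: set() for v in nodes}
    for u, v in edges:
        adj[u].add(v)
        adj[v].add(u)
    seen, components = set(), []
    for start in nodes:
        if start in seen:
            continue
        # Breadth-first search collects every node joined to `start` by a path
        seen.add(start)
        comp, queue = {start}, deque([start])
        while queue:
            u = queue.popleft()
            for w in adj[u]:
                if w not in seen:
                    seen.add(w)
                    comp.add(w)
                    queue.append(w)
        components.append(comp)
    return sorted(components, key=len, reverse=True)

def bridges(nodes, edges):
    """Edges whose removal increases the number of components (brute force)."""
    base = len(connected_components(nodes, edges))
    return [e for e in edges
            if len(connected_components(nodes, [f for f in edges if f != e])) > base]

# Hypothetical toy graph (not the one drawn on the slide)
nodes = ["A", "B", "C", "D", "E", "F", "G"]
edges = [("A", "B"), ("B", "C"), ("A", "C"), ("C", "D"), ("E", "F")]

components = connected_components(nodes, edges)
print("giant component:", components[0])
print("number of components:", len(components))
print("bridges:", bridges(nodes, edges))
```

In this toy graph the triangle A-B-C has no bridges, while the edges C-D and E-F are bridges; node G is an isolate.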

CONNECTIVITY OF UNDIRECTED GRAPHS: Adjacency Matrix
With a suitable ordering of the nodes, the adjacency matrix of a network with several components can be written in block-diagonal form, so that nonzero elements are confined to square blocks along the diagonal, with all other elements being zero.
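A small sketch of the block-diagonal form, assuming a hypothetical five-node graph with two components. The helper names `component_labels` and `adjacency_matrix` are invented for the example. Ordering the nodes component by component is what moves all the nonzero entries into diagonal blocks.

```python
def component_labels(nodes, edges):
    """Label each node with a representative of its component (union-find)."""
    parent = {v: v for v in nodes}

    def find(v):
        while parent[v] != v:
            parent[v] = parent[parent[v]]  # path halving
            v = parent[v]
        return v

    for u, v in edges:
        parent[find(u)] = find(v)
    return {v: find(v) for v in nodes}

def adjacency_matrix(order, edges):
    """Adjacency matrix of an undirected graph with rows/columns in `order`."""
    index = {v: i for i, v in enumerate(order)}
    matrix = [[0] * len(order) for _ in order]
    for u, v in edges:
        matrix[index[u]][index[v]] = 1
        matrix[index[v]][index[u]] = 1
    return matrix

# Hypothetical graph with two components, listed in a scrambled node order
nodes = ["A", "D", "B", "E", "C"]
edges = [("A", "B"), ("B", "C"), ("D", "E")]

labels = component_labels(nodes, edges)
# Group the nodes of each component together before building the matrix
order = sorted(nodes, key=lambda v: (labels[v], v))
block_matrix = adjacency_matrix(order, edges)

print(order)
for row in block_matrix:
    print(row)
```

With the scrambled order the ones are scattered across the matrix; after grouping, the 3x3 block of {A, B, C} and the 2x2 block of {D, E} are the only places where nonzero entries can appear.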
CONNECTIVITY OF DIRECTED GRAPHS
Strongly connected (directed) graph: there is a directed path from each node to every other node and vice versa (i.e. both an A→B path and a B→A path).
Weakly connected (directed) graph: it is connected if we disregard the edge directions.
Strongly connected components (SCC) can be identified (e.g. in blue in the figure).
Figure: example directed graphs on nodes A-G, with the SCC highlighted.
In-component: nodes that can reach the SCC; e.g. E, G (right)

Out-component: nodes that can be reached from the SCC; e.g. D, F (right)
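One way to make these three sets concrete is to compute reachability in both directions from a seed node. The sketch below assumes a hypothetical edge list, chosen so that E, G end up in the in-component and D, F in the out-component as in the slide's example; the actual edges of the slide figure are not known. The names `reachable` and `scc_in_out` are illustrative.

```python
def reachable(start, adj):
    """All nodes reachable from `start` by following `adj` (start included)."""
    seen, stack = {start}, [start]
    while stack:
        u = stack.pop()
        for w in adj.get(u, ()):
            if w not in seen:
                seen.add(w)
                stack.append(w)
    return seen

def scc_in_out(edges, seed):
    """SCC containing `seed`, plus its in-component and out-component."""
    succ, pred = {}, {}
    for u, v in edges:
        succ.setdefault(u, set()).add(v)
        pred.setdefault(v, set()).add(u)
    forward = reachable(seed, succ)    # nodes the seed can reach
    backward = reachable(seed, pred)   # nodes that can reach the seed
    scc = forward & backward           # reachable in both directions
    return scc, backward - scc, forward - scc

# Hypothetical directed graph: cycle A -> B -> C -> A is the SCC
edges = [("A", "B"), ("B", "C"), ("C", "A"),
         ("G", "E"), ("E", "A"),       # G and E can reach the SCC
         ("C", "D"), ("D", "F")]       # D and F are reached from the SCC

scc, in_component, out_component = scc_in_out(edges, "A")
print("SCC:", scc)
print("in-component:", in_component)
print("out-component:", out_component)
```

A node is in the SCC of the seed exactly when it is reachable from the seed and can also reach it, which is why the intersection of the two searches gives the component.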
Section 2.9
Clustering coefficient

CLUSTERING COEFFICIENT
(Local) clustering coefficient (Ci)
What fraction of your neighbors are connected to each other?
● The clustering coefficient Ci is a property of a node i, with Ci in [0,1]
● For a node i with degree ki, let Li represent the number of links among the neighbors of node i:

$$C_i = \frac{2 L_i}{k_i (k_i - 1)}$$

Average clustering coefficient (<C>)
● The average clustering coefficient <C> is a property of a graph:

$$\langle C \rangle = \frac{1}{N} \sum_{i=1}^{N} C_i$$
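A minimal sketch of both quantities, using the standard definition Ci = 2 Li / (ki (ki - 1)) and <C> as the mean of Ci over all nodes. The graph below is a hypothetical example, not the one from the exercises, and setting Ci = 0 for nodes with fewer than two neighbors is a common convention assumed here.

```python
from itertools import combinations

def local_clustering(adj, i):
    """C_i = 2 L_i / (k_i (k_i - 1)); taken as 0 when k_i < 2 (assumed convention)."""
    neighbors = adj[i]
    k = len(neighbors)
    if k < 2:
        return 0.0
    # L_i: number of links among the neighbors of i
    links = sum(1 for u, v in combinations(neighbors, 2) if v in adj[u])
    return 2 * links / (k * (k - 1))

def average_clustering(adj):
    """<C>: mean of the local clustering coefficients over all nodes."""
    return sum(local_clustering(adj, i) for i in adj) / len(adj)

# Hypothetical undirected graph
edges = [(1, 2), (1, 3), (1, 4), (2, 3), (4, 5)]
adj = {}
for u, v in edges:
    adj.setdefault(u, set()).add(v)
    adj.setdefault(v, set()).add(u)

for node in sorted(adj):
    print(node, local_clustering(adj, node))
print("average:", average_clustering(adj))
```

Node 1 has three neighbors (2, 3, 4) and only one link among them (2-3), so C1 = 2·1 / (3·2) = 1/3.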
Exercise
(Programming notebook-2)
Calculate the clustering coefficient of node 1 and the average coefficient of the graph
Exercise
Calculate the clustering coefficient of each node and the average coefficient of the graph
Tendency to form triangles
● Many natural processes of link formation encourage the closing of “V”s into triangles
● Example 1: you’re more likely to meet new friends through common friends
● Example 2: you’re more likely to follow an account u if you see content posted by u and re-posted by an account v that you already follow
● This tendency motivates other measures related to clusters
Central sets

CENTRAL SETS
• Network-level centrality (as opposed to node centrality)
• Subsets of “important nodes”
• Interconnected central nodes
Q: What are the (related) central fields of science?
Nodes: fields of science
Edges: fields are similar
A. Calderone, "A Wikipedia Based Map of Science." Figshare (2020), https://doi.org/10.6084/m9.figshare.11638932.v5

K-CORE
Maximal subnetwork in which each node has degree at least k (counting only links inside the subnetwork)
Main core, or simply the core: the k’-core such that there is no non-empty k-core with k > k’
M.E.J. Newman. (2010). Networks: An Introduction. Oxford University Press.
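The k-core definition suggests a simple pruning procedure: repeatedly delete nodes that have fewer than k neighbors left. The sketch below is one straightforward way to do it, not necessarily the fastest; the graph and the function names `k_core` and `main_core` are assumptions made for the example.

```python
def k_core(adj, k):
    """Nodes of the k-core: prune nodes with fewer than k surviving neighbors."""
    alive = set(adj)
    changed = True
    while changed:
        changed = False
        for v in list(alive):
            if len(adj[v] & alive) < k:
                alive.remove(v)   # removing v may push its neighbors below k
                changed = True
    return alive

def main_core(adj):
    """Return (k', nodes) for the largest k' whose k'-core is non-empty."""
    k, core = 0, set(adj)
    while True:
        next_core = k_core(adj, k + 1)
        if not next_core:
            return k, core
        k, core = k + 1, next_core

# Hypothetical graph: a 4-clique {1, 2, 3, 4} with a tail 4 - 5 - 6
edges = [(1, 2), (1, 3), (1, 4), (2, 3), (2, 4), (3, 4), (4, 5), (5, 6)]
adj = {}
for u, v in edges:
    adj.setdefault(u, set()).add(v)
    adj.setdefault(v, set()).add(u)

print("2-core:", k_core(adj, 2))
print("main core:", main_core(adj))
```

Node 5 starts with degree 2, but it drops out of the 2-core once node 6 is pruned. This is why degrees must be counted inside the remaining subnetwork rather than in the original graph.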

K-CORE
Cores have been used to find structural patterns that are present in healthy cells and lost in cancer
“Nodes with high inter-connectedness as opposed to high connectedness are conserved in the healthy Gene co-Expression Network”
K-CORE
Large cores in procurement markets are associated with corruption risk
“We study the structure of these networks in each member state, identify their cores, and find that highly centralized markets tend to have higher corruption risk”
K-CORE
Back to the science map…
Figure legend: core nodes, other nodes
K-CORE
The core of science consists of computational/mathematical fields!
K-CORE
Words of caution
• No formal reason to suppose that k-cores are linked with node roles or behaviors
• Being strongly interconnected does not necessarily mean being central (important) to the overall network
Bipartite graphs

BIPARTITE GRAPHS
Bipartite graph (or bigraph)
A graph G = (V,E) whose nodes V can be divided into two disjoint sets VL and VR such that every link connects a node in VL to one in VR; VL and VR are independent sets (nodes within an independent set are not linked to each other).
Figure: the two node sets VL and VR
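Whether a graph admits such a split can be checked by trying to 2-color it with a breadth-first search: neighbors must always land on opposite sides. The sketch below assumes hypothetical actor and movie labels, and the function names `build_adj` and `bipartition` are illustrative.

```python
from collections import deque

def build_adj(edges):
    """Adjacency sets of an undirected graph given as an edge list."""
    adj = {}
    for u, v in edges:
        adj.setdefault(u, set()).add(v)
        adj.setdefault(v, set()).add(u)
    return adj

def bipartition(adj):
    """Return (V_L, V_R) if the graph is bipartite, otherwise None."""
    side = {}
    for start in adj:
        if start in side:
            continue
        side[start] = 0
        queue = deque([start])
        while queue:
            u = queue.popleft()
            for w in adj[u]:
                if w not in side:
                    side[w] = 1 - side[u]   # neighbors go to the opposite set
                    queue.append(w)
                elif side[w] == side[u]:
                    return None             # a link inside one set: not bipartite
    left = {v for v, s in side.items() if s == 0}
    right = {v for v, s in side.items() if s == 1}
    return left, right

# Hypothetical actor-movie network (labels invented for illustration)
actor_movie = build_adj([("actor1", "movieA"), ("actor2", "movieA"),
                         ("actor2", "movieB"), ("actor3", "movieB")])
triangle = build_adj([(1, 2), (2, 3), (1, 3)])

print(bipartition(actor_movie))
print(bipartition(triangle))
```

The triangle fails the check because any odd cycle forces two linked nodes onto the same side.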
Examples:
Hollywood actor network

Collaboration networks
Disease network (diseasome)
Ingredient-Flavor Bipartite Network
Y.-Y. Ahn, S. E. Ahnert, J. P. Bagrow, A.-L. Barabási. Flavor network and the principles of food pairing. Scientific Reports 1, 196 (2011).
TRIPARTITE NETWORK
Assortativity
Newman, Mark E. J.; Networks: an introduction; Oxford University Press (2010)

ASSORTATIVE MIXING (or HOMOPHILY)
Tendency of nodes to connect to other nodes that are like them in some way
“Some way” could mean any node attribute. In social networks: age, income, race, social interests, ...
Sexual relationships are mostly disassortative (by gender)
Assortativity has substantial effects on network structure
Image: Friendship network at a US high school
ASSORTATIVE MIXING BY DEGREE
Of particular interest is assortative (disassortative) mixing by degree: nodes connect to nodes with similar (different) degree
(a) Assortative by degree: dense core of high-degree nodes and a periphery of lower-degree nodes
(b) Disassortative by degree: star-like structures
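The covariance of the degrees ki and kj at the two ends of an edge is (Newman 2010, Chapter 7; A is the adjacency matrix and m the number of edges):

$$\operatorname{cov}(k_i, k_j) = \frac{1}{2m} \sum_{ij} \left( A_{ij} - \frac{k_i k_j}{2m} \right) k_i k_j$$

We can normalize by the maximum value of the covariance to get the assortativity coefficient (or correlation coefficient):

$$r = \frac{\sum_{ij} \left( A_{ij} - k_i k_j / 2m \right) k_i k_j}{\sum_{ij} \left( k_i \delta_{ij} - k_i k_j / 2m \right) k_i k_j}$$

where δij = 1 if i = j and δij = 0 otherwise.
r corresponds to the Pearson correlation coefficient, with -1 <= r <= 1:
-1: perfectly disassortative network
0: no correlation between the degree of a node and the degrees of its neighbors
1: perfectly assortative network
• Computation of r as expressed above is O(N^2), since the sums run over all pairs of nodes
• Optimization for sparse networks, O(L) with L = m links:

$$r = \frac{S_1 S_e - S_2^2}{S_1 S_3 - S_2^2}$$

where

$$S_e = \sum_{ij} A_{ij} k_i k_j = 2 \sum_{\text{edges } (i,j)} k_i k_j, \qquad S_1 = \sum_i k_i, \qquad S_2 = \sum_i k_i^2, \qquad S_3 = \sum_i k_i^3$$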
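A minimal sketch comparing the two ways of computing r, assuming Newman's (2010) expressions. The first is the double sum over all node pairs. The second is the edge-based form r = (S1 Se - S2^2) / (S1 S3 - S2^2), where S1, S2, S3 are the sums of the first, second and third powers of the degrees and Se is twice the sum of ki kj over the edges. The function names and the star graph are assumptions for the example, and the code assumes a simple graph (no self-loops, no repeated edges) whose degrees are not all equal, since otherwise the denominator is zero.

```python
def assortativity_pairs(nodes, edges):
    """r from the double sum over all ordered node pairs: O(N^2)."""
    adj = {v: set() for v in nodes}
    for u, v in edges:
        adj[u].add(v)
        adj[v].add(u)
    k = {v: len(adj[v]) for v in nodes}
    two_m = sum(k.values())                  # sum of degrees = 2m
    num = den = 0.0
    for i in nodes:
        for j in nodes:
            a_ij = 1 if j in adj[i] else 0
            delta_ij = 1 if i == j else 0
            expected = k[i] * k[j] / two_m
            num += (a_ij - expected) * k[i] * k[j]
            den += (k[i] * delta_ij - expected) * k[i] * k[j]
    return num / den

def assortativity_edges(nodes, edges):
    """Same r from degree sums and one pass over the edges: O(N + L)."""
    k = {v: 0 for v in nodes}
    for u, v in edges:
        k[u] += 1
        k[v] += 1
    s1 = sum(k.values())
    s2 = sum(d ** 2 for d in k.values())
    s3 = sum(d ** 3 for d in k.values())
    se = 2 * sum(k[u] * k[v] for u, v in edges)
    return (s1 * se - s2 ** 2) / (s1 * s3 - s2 ** 2)

# Hypothetical star graph: hub 0 linked to four leaves
nodes = [0, 1, 2, 3, 4]
edges = [(0, 1), (0, 2), (0, 3), (0, 4)]

r_pairs = assortativity_pairs(nodes, edges)
r_edges = assortativity_edges(nodes, edges)
print(r_pairs, r_edges)
```

For the star, S1 = 8, S2 = 20, S3 = 68 and Se = 32, so r = (256 - 400) / (544 - 400) = -1. This matches the slide's remark that star-like structures are disassortative by degree.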
Basic statistics for a number of networks

n: Total number of vertices

m: Total number of edges
c: Mean degree
S: Fraction of vertices in the largest component S (or the largest weakly
connected component in the case of a directed network)
l: Mean geodesic distance between connected vertex pairs
α: Exponent α of the degree distribution if the distribution follows a power
law (or “–” if not; in/out-degree exponents are given for directed graphs)
C: Clustering coefficient C from Eq. (7.41)
CWS: Clustering coefficient from the alternative definition of Eq. (7.44)
r: Degree correlation (or assortativity) coefficient from Eq. (7.82)

• None of the values of r is of very large magnitude, i.e. mixing by degree is not especially strong
• Clear tendency for the social networks to have positive r
– Nodes tend to group in small groups of low-degree or high-degree nodes
– Multiedge featured
• Technological, information and biological networks tend to have negative r
– The number of edges that fall between high-degree nodes is small
– Single-edge featured
A case study:
Protein-protein interaction network

THREE CENTRAL QUANTITIES IN NETWORK SCIENCE
A. Degree distribution: pk
B. Average path length: <d>
C. Clustering coefficient:
GENOME: protein-gene interactions
PROTEOME: protein-protein interactions
METABOLISM: biochemical reactions
Figure label: Citrate Cycle
A CASE STUDY: PROTEIN-PROTEIN INTERACTION NETWORK
Figure panels: metabolic network, protein interactions
● Undirected network
● N = 2,018 proteins as nodes, L = 2,930 binding interactions as links
● Average degree <k> = 2.90
● Not connected: 185 components
● The largest (giant) component has 1,647 nodes
pk is the probability that a node has degree k
Nk = number of nodes with degree k
pk = Nk / N
dmax = 14
<d> = 5.61
<C> = 0.12
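The average degree quoted above can be checked from N and L, since every link adds one to the degree of two nodes: <k> = 2L / N. The sketch below does that arithmetic and also computes pk = Nk / N for a short, made-up degree sequence (the real degree sequence of the protein network is not given in the slides).

```python
from collections import Counter

def degree_distribution(degrees):
    """p_k = N_k / N for every degree k that occurs in the sequence."""
    n = len(degrees)
    counts = Counter(degrees)          # N_k: number of nodes with degree k
    return {k: counts[k] / n for k in sorted(counts)}

# Values reported on the slide for the protein-protein interaction network
N, L = 2018, 2930
avg_degree = 2 * L / N                 # each link is counted at both endpoints
print(round(avg_degree, 2))

# Hypothetical degree sequence, only to illustrate p_k
degrees = [1, 1, 2, 3, 3, 3, 1, 2]
p = degree_distribution(degrees)
print(p)
```

The values pk always sum to 1, because every node has exactly one degree.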
