3 CentralityMeasures Lastupdate2324

Complejidad y Redes
Centrality measures
Universidad Politécnica de Madrid
Designed by starline / Freepik

Slides based on:
- Network centrality: an introduction, by Francisco A. Rodrigues. https://arxiv.org/pdf/1901.07901.pdf
- MA5Q3 Topics in Complexity Science, slides by Francisco A. Rodrigues, Lecture 3. http://conteudo.icmc.usp.br/pessoas/francisco/networks/lecture3.pdf
- Nuevas Tecnologías y Empresa, slides by J.I. Santos, Chapters 2 and 3. https://sites.google.com/site/meetnachosantos/
Overview
Centrality measures
Degree centrality
k-core centrality
Closeness centrality
Betweenness centrality
Katz centrality
PageRank centrality
How to describe node characteristics?
Picture from https://www.inverse.com/article/27435-stranger-things-season-2-super-bowl-trailer-demogorgon-villain
Until now we have seen:

- degree
- distances to other nodes
- clustering
Demogorgon: a very well-connected node; it should have high centrality values.
How can we measure importance, influence, power?
--> This is what centrality measures are made for.
Which kinds of centrality measures can we calculate? Ideas?
Who is most important in a network?
- Who is the most connected in the network? Degree centrality, k-core centrality
- Who is closest to everyone else? Closeness centrality
- Who links groups that are far apart in the network? Betweenness centrality
- Who is connected to the best connected? Eigenvector centrality, Katz centrality, PageRank centrality
Translated slide from NNTT y Empresa, by J.I. Santos
Degree Centrality
Picture from https://www.telltalesonline.com/26925/popular-celebs/
Hypothesis:
Individuals who have more links have more influence, more prestige, more access to
information, are more popular, ... than those who have less
Ex: "Celebrities"
Degree Centrality: measure of connectedness
Remember: in the adjacency matrix $A$, the element $A_{ij}$ represents a link from j to i.
Directed network
How popular is an individual (in-degree): $k_i^{in} = \sum_j A_{ij}$
How many people an individual knows (out-degree): $k_i^{out} = \sum_j A_{ji}$
Undirected network
adjacency matrix $A_{ij} = A_{ji}$: $k_i = \sum_j A_{ij}$
Undirected weighted network
adjacency matrix $W_{ij}$: $s_i = \sum_j W_{ij}$
Sometimes the degree is normalized by dividing by (N-1).
Examples: #followers (Twitter); #friends (Facebook); #citations (scientific papers)

Example
Degree centrality (Gephi)
Gephi reports the plain degree rather than the normalised degree centrality.
In Python, the NetworkX function degree_centrality computes the normalized degree centrality.
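As a minimal sketch of the normalization, the degree of each node can be divided by (N-1) in plain Python. The edge list below is an assumption: it is inferred from the list of shortest paths in the betweenness example of these slides, not given explicitly by the source.

```python
# Example network. The edge list is inferred from the shortest-path list in the
# betweenness example of these slides (an assumption, not stated in the source).
EDGES = [(1, 2), (1, 3), (1, 4), (1, 5), (2, 3), (3, 4),
         (5, 6), (6, 7), (6, 8), (7, 8)]

def degree_centrality(edges):
    """Return {node: degree / (N - 1)} for an undirected edge list."""
    degree = {}
    for u, v in edges:
        degree[u] = degree.get(u, 0) + 1
        degree[v] = degree.get(v, 0) + 1
    n = len(degree)
    return {node: k / (n - 1) for node, k in degree.items()}

centrality = degree_centrality(EDGES)
for node in sorted(centrality):
    print(node, round(centrality[node], 3))
```

Node 1 has four of the seven possible links, so its normalized degree centrality is 4/7.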
Degree Centrality
Picture from Network centrality: an introduction, Francisco Aparecido Rodrigues
https://arxiv.org/pdf/1901.07901.pdf
Limitations
The degree centrality of a node depends exclusively on the number of links it has,
regardless of the importance of the nodes at the other end of those links.
E.g.: would my Twitter account be equally important if it were followed by an "unknown" Twitter
account as if it were followed by Mark Newman’s Twitter account?
It is a local measure; it does not depend on the rest of the network.
It can happen that the nodes with the highest degree are at the periphery of the network and are not central, and we may also be interested in capturing centrality in terms of position.
K-Core Centrality
It makes it possible to identify peripheral hubs. The most central nodes have the highest values of k-core centrality.
A k-core of a graph G is a maximal connected subgraph of G in which all vertices have degree at least k.
A node i has coreness kc(i) = k if it belongs to the k-core but not to the (k+1)-core.
Slide from MA5Q3 Topics in Complexity Science by F. A. Rodrigues
K-Core Centrality
“This centrality measure is obtained by the k-shell decomposition, which partitions the network by iteratively
removing all nodes whose degree is smaller than k. After removing these nodes, the network is re-analyzed to verify
whether there are nodes with less than k connections. If such nodes are present, then they are also removed.”
[Figure: two k-shell decomposition examples, each starting from the original network. Panel captions: "Remove nodes whose degree < 3", "Again remove nodes whose degree < 3"; "Remove nodes whose degree < 2", "Again remove nodes whose degree < 2", "The remaining nodes have k-core=1".]
Text from Network centrality: an introduction by Francisco A. Rodrigues
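The iterative removal described in the quote can be sketched in a few lines of plain Python. This is an illustrative implementation, not the one used by Gephi or by the cited paper; the function name core_numbers and the edge list (inferred from the shortest-path list in the betweenness example of these slides) are assumptions.

```python
# Example network; edge list inferred from the slides' shortest-path list (assumption).
EDGES = [(1, 2), (1, 3), (1, 4), (1, 5), (2, 3), (3, 4),
         (5, 6), (6, 7), (6, 8), (7, 8)]

def core_numbers(edges):
    """k-shell decomposition: return {node: coreness} for an undirected edge list."""
    remaining = {}
    for u, v in edges:
        remaining.setdefault(u, set()).add(v)
        remaining.setdefault(v, set()).add(u)
    coreness = {}
    k = 0
    while remaining:
        k += 1
        # Repeatedly strip nodes whose current degree is smaller than k.
        while True:
            low = [n for n, nbrs in remaining.items() if len(nbrs) < k]
            if not low:
                break
            for n in low:
                coreness[n] = k - 1  # n survived the (k-1)-core but not the k-core
                for m in remaining.pop(n):
                    if m in remaining:
                        remaining[m].discard(n)
    return coreness

coreness = core_numbers(EDGES)
print(coreness)
```

In this network every node has degree at least 2, and removing the nodes of degree below 3 ends up removing everything, so all eight nodes get coreness 2.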
Example
k-core centrality
Label   kc(i)
1       2
2       2
3       2
4       2
5       2
6       2
7       2
8       2
In Gephi you cannot compute the k-core centrality, but you can filter the graph according to k-cores.
K-Core Centrality
Limitations
The limitation of this measure lies in the fact that many nodes may be assigned to the
same k-core number.
Right picture from Network centrality: an introduction, Francisco Aparecido Rodrigues

https://arxiv.org/pdf/1901.07901.pdf
Closeness Centrality: measure of proximity
The centrality of a node can also be seen from the perspective of proximity to the other
nodes.
Hypothesis:
The nodes closest to all the others have better access to information from other nodes
and / or can transmit their opinion more quickly to others.
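Given the mean geodesic distance (shortest-path length) from node i to all others, $d_i = \frac{1}{N-1} \sum_{j \neq i} d_{ij}$,
its "closeness" is defined as $C_i = \frac{1}{d_i} = \frac{N-1}{\sum_{j \neq i} d_{ij}}$; higher values of closeness indicate higher centrality.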
Example
Closeness centrality (Gephi)
Label   Ci
1       0.583333
5       0.583333
6       0.5
3       0.4375
2       0.411765
4       0.411765
7       0.368421
8       0.368421
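The values in this table can be reproduced with a breadth-first search from each node, taking closeness as (N-1) divided by the sum of shortest-path lengths. A minimal sketch, assuming the network is connected and using an edge list inferred from the shortest-path list in the betweenness example of these slides:

```python
from collections import deque

# Example network; edge list inferred from the slides' shortest-path list (assumption).
EDGES = [(1, 2), (1, 3), (1, 4), (1, 5), (2, 3), (3, 4),
         (5, 6), (6, 7), (6, 8), (7, 8)]

def adjacency(edges):
    adj = {}
    for u, v in edges:
        adj.setdefault(u, set()).add(v)
        adj.setdefault(v, set()).add(u)
    return adj

def bfs_distances(adj, source):
    """Shortest-path length from source to every reachable node."""
    dist = {source: 0}
    queue = deque([source])
    while queue:
        u = queue.popleft()
        for v in adj[u]:
            if v not in dist:
                dist[v] = dist[u] + 1
                queue.append(v)
    return dist

def closeness(adj):
    """C_i = (N - 1) / sum_j d_ij, valid for a connected graph."""
    n = len(adj)
    return {i: (n - 1) / sum(bfs_distances(adj, i).values()) for i in adj}

closeness_values = closeness(adjacency(EDGES))
for node, c in sorted(closeness_values.items(), key=lambda kv: -kv[1]):
    print(node, round(c, 6))
```

For node 1 the distances are 1, 1, 1, 1, 2, 3, 3, which sum to 12, giving 7/12 = 0.583333 as in the Gephi column.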
Closeness Centrality
Limitations:
It is based only on the shortest distances and, therefore, in small-diameter networks the
range of variation is too narrow. The ratio between the largest and smallest distances is of
order log(N), since the smallest distance is equal to one.
It is very sensitive to changes in the network.
It is undefined when the network has several components, because the distance between nodes in different components is infinite.
In disconnected networks, Gephi computes the closeness centrality within each node's component. NetworkX also computes it within the component and then scales it by the relative size of the component:
$C_i = \frac{n-1}{N-1} \cdot \frac{n-1}{\sum_{j} d_{ij}}$ (sum over the nodes j in the component of i)
n: number of nodes in the component
N: number of nodes in the network
https://networkx.org/documentation/stable/reference/algorithms/generated/networkx.algorithms.centrality.closeness_centrality.html
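The scaling by component size can be illustrated with a small sketch: the within-component closeness (n-1)/sum of distances is multiplied by (n-1)/(N-1). The disconnected graph below is hypothetical, and giving isolated nodes a value of 0 is an assumption of this sketch.

```python
from collections import deque

def bfs_distances(adj, source):
    """Shortest-path length from source to every reachable node."""
    dist = {source: 0}
    queue = deque([source])
    while queue:
        u = queue.popleft()
        for v in adj[u]:
            if v not in dist:
                dist[v] = dist[u] + 1
                queue.append(v)
    return dist

def scaled_closeness(adj):
    """Closeness inside the component, scaled by (n - 1) / (N - 1)."""
    total = len(adj)                      # N: nodes in the whole network
    result = {}
    for i in adj:
        dist = bfs_distances(adj, i)
        reachable = len(dist)             # n: nodes in the component of i
        farness = sum(dist.values())
        if farness == 0:                  # isolated node (assumed value: 0)
            result[i] = 0.0
        else:
            in_component = (reachable - 1) / farness
            result[i] = in_component * (reachable - 1) / (total - 1)
    return result

# Hypothetical disconnected graph: a triangle {1, 2, 3} and a separate edge {4, 5}.
adj = {1: {2, 3}, 2: {1, 3}, 3: {1, 2}, 4: {5}, 5: {4}}
scaled = scaled_closeness(adj)
print(scaled)
```

Inside their own components the nodes of the triangle and of the single edge all have closeness 1; after scaling, the triangle nodes get 2/4 = 0.5 and the nodes of the small component get 1/4 = 0.25.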
Betweenness Centrality: measure of load
“If we consider the flow of particles on a network, then we can define centrality in terms of the load.
It is natural to think that the most central node receives the largest number of particles in a defined
time interval. Assuming that these particles move following the shortest distances, the load in a node
i is given by the total number of shortest paths passing through i. However, since we can have more
than one shortest path between a pair of nodes a and b, it is more suitable to define the load in node
i as the fraction of shortest paths connecting each pair of nodes (a,b) that includes i.”
So the betweenness centrality is the total load of i:
$B_i = \sum_{(a,b)} \frac{\eta(a,i,b)}{\eta(a,b)}$
where η(a,i,b) is the number of shortest paths connecting vertices a and b that pass through node i
and η(a,b) is the total number of shortest paths between a and b.
Put another way: it serves to identify the "bridge" nodes between separate groups, that is, the "bottlenecks".
Based on Network centrality: an introduction by F.A. Rodrigues
Betweenness Centrality
Nodes with high betweenness have great power of intermediation, insofar as they can
influence a greater number of messages.
They are "bridges" between remote groups. If they are removed, the network can fragment
into isolated groups (some algorithms for identifying communities are built on
this property).
Example
Betweenness centrality (Gephi)
List of shortest paths (from each node to every higher-numbered node)
Node 1: {1,2}  {1,3}  {1,4}  {1,5}  {1,5,6}  {1,5,6,7}  {1,5,6,8}
Node 2: {2,3}  {2,3,4}, {2,1,4}  {2,1,5}  {2,1,5,6}  {2,1,5,6,7}  {2,1,5,6,8}
Node 3: {3,4}  {3,1,5}  {3,1,5,6}  {3,1,5,6,7}  {3,1,5,6,8}
Node 4: {4,1,5}  {4,1,5,6}  {4,1,5,6,7}  {4,1,5,6,8}
Node 5: {5,6}  {5,6,7}  {5,6,8}
Node 6: {6,7}  {6,8}
Node 7: {7,8}
Node 8: (none)
$B_3 = \frac{\#(\{2,3,4\})}{\#(\{2,3,4\}, \{2,1,4\})} = 0.5$
Example
Betweenness centrality (Gephi)
From the same list of shortest paths: node 6 lies on the single shortest path from each of the nodes 1 to 5 to each of the nodes 7 and 8, so
$B_6 = \frac{\#(\{1,5,6,7\})}{\#(\{1,5,6,7\})} + \frac{\#(\{1,5,6,8\})}{\#(\{1,5,6,8\})} + \dots = 10$
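The counting in this example can be checked with a direct, deliberately unoptimized sketch of the definition: for each pair (a, b), the number of shortest paths through i is the number of shortest paths from a to i times the number from i to b, whenever i lies at the right distances. The edge list is an assumption inferred from the list of shortest paths above, and this is not Brandes' algorithm.

```python
from collections import deque
from itertools import combinations

# Example network; edge list inferred from the list of shortest paths (assumption).
EDGES = [(1, 2), (1, 3), (1, 4), (1, 5), (2, 3), (3, 4),
         (5, 6), (6, 7), (6, 8), (7, 8)]

def adjacency(edges):
    adj = {}
    for u, v in edges:
        adj.setdefault(u, set()).add(v)
        adj.setdefault(v, set()).add(u)
    return adj

def bfs_counts(adj, source):
    """Return (dist, sigma): distances and numbers of shortest paths from source."""
    dist = {source: 0}
    sigma = {source: 1}
    queue = deque([source])
    while queue:
        u = queue.popleft()
        for v in adj[u]:
            if v not in dist:
                dist[v] = dist[u] + 1
                sigma[v] = 0
                queue.append(v)
            if dist[v] == dist[u] + 1:
                sigma[v] += sigma[u]
    return dist, sigma

def betweenness(adj):
    """B_i = sum over unordered pairs (a, b) of eta(a, i, b) / eta(a, b)."""
    info = {s: bfs_counts(adj, s) for s in adj}
    load = {i: 0.0 for i in adj}
    for a, b in combinations(sorted(adj), 2):
        dist_a, sigma_a = info[a]
        if b not in dist_a:          # a and b lie in different components
            continue
        for i in adj:
            if i in (a, b) or i not in dist_a:
                continue
            dist_i, sigma_i = info[i]
            if b in dist_i and dist_a[i] + dist_i[b] == dist_a[b]:
                load[i] += sigma_a[i] * sigma_i[b] / sigma_a[b]
    return load

B = betweenness(adjacency(EDGES))
print(B)
```

The sketch gives 0.5 for node 3 and 10 for node 6, the two values worked out on the slides.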
Betweenness Centrality
Limitation:
Its computation is expensive: the straightforward algorithm requires O(N^3) time and O(N^2) space, where N is the number of nodes in the network.
Even the algorithm proposed by Brandes to calculate exact betweenness centrality, which
runs in O(NM), where M is the number of edges in the network, is computationally
expensive for large graphs.
It considers only the shortest paths and is therefore not general, since information can also travel along longer paths in a network.
To overcome these limitations, a calculation based on random walks can be considered.

In disconnected networks, both Gephi and NetworkX assign a zero value to the
betweenness centrality of isolated nodes (replacing the indeterminate 0/0 with B = 0).
Both also compute the betweenness centrality inside each component separately.
Slide from MA5Q3 Topics in Complexity Science by F. A. Rodrigues
Random-walk Betweenness
The previous metric assumes that "messages" use only the shortest paths, discarding the
contribution of any alternative route.
A variant is "random-walk betweenness" (Newman, 2005): the traffic between two nodes
(s, t) is measured (repeatedly) by a random walker that leaves s and eventually arrives at t. Now
Bi is the number of times the walker passes through node i.
With this algorithm any path between s and t contributes, although longer paths
contribute less (they are less likely).
So the betweenness centrality based on random walks is given by the expected number
of visits to each node i during a random walk.
Newman, M. (2005). A measure of betweenness centrality based on random walks. Social Networks. http://arxiv.org/pdf/cond-mat/0309045.pdf
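The idea of counting visits can be illustrated with a small Monte Carlo simulation. This is only a sketch of the description given on this slide (average number of visits to intermediate nodes), not Newman's exact measure; the number of walks, the seed, the choice not to count visits to the endpoints s and t, and the edge list (inferred from the shortest-path list in the betweenness example) are all assumptions.

```python
import random

# Example network; edge list inferred from the slides' shortest-path list (assumption).
EDGES = [(1, 2), (1, 3), (1, 4), (1, 5), (2, 3), (3, 4),
         (5, 6), (6, 7), (6, 8), (7, 8)]

def adjacency_lists(edges):
    adj = {}
    for u, v in edges:
        adj.setdefault(u, []).append(v)
        adj.setdefault(v, []).append(u)
    return adj

def random_walk_visits(adj, walks_per_pair=50, seed=0):
    """Average visits per walk to each node, over all ordered pairs (s, t)."""
    rng = random.Random(seed)
    visits = {i: 0 for i in adj}
    walks = 0
    for s in adj:
        for t in adj:
            if s == t:
                continue
            for _ in range(walks_per_pair):
                walks += 1
                current = rng.choice(adj[s])      # first step away from s
                while current != t:
                    if current != s:              # endpoints are not counted
                        visits[current] += 1
                    current = rng.choice(adj[current])
    return {i: count / walks for i, count in visits.items()}

visits = random_walk_visits(adjacency_lists(EDGES))
for node in sorted(visits, key=visits.get, reverse=True):
    print(node, round(visits[node], 2))
```

Unlike shortest-path betweenness, nodes 2, 4, 7 and 8 now receive a non-zero score, because walkers can wander through them on longer routes.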
Eigenvector Centrality: Influence / Prestige measure
“Not what you know, but who you know...” M.O. Jackson, Social and Economic Networks: Models and Analysis
Hypothesis:
The importance of a node in the network grows if it is linked to nodes that are also important.
Ex.: "Kim Kardashian"
Source: https://www.dailymail.co.uk/tvshowbiz/article-12580769/Fans-convinced-Anna-Wintour-snubbed-Kim-Kardashian-Victoria-Beckhams-Paris-fashion-show.html
Eigenvector Centrality
Let us make some initial guess about the centrality $x_i$ of each node i. For instance, we could
start off by setting $x_i = 1$ for all i. Obviously this is not a useful measure of centrality, but we
can use it to calculate a better one $x_i'$, which we define to be the sum of the centralities of
i's neighbors thus:
$x_i' = \sum_j A_{ij} x_j$,
where $A_{ij}$ is an element of the adjacency matrix. We can also write this expression in matrix
notation as $\mathbf{x}' = A\mathbf{x}$, where $\mathbf{x}$ is the vector with elements $x_i$. Repeating this process to make
better estimates, we have after t steps a vector of centralities $\mathbf{x}(t)$ given by:
$\mathbf{x}(t) = A^t \mathbf{x}(0)$
Text from Networks: An Introduction, by M.E.J. Newman
Now let us write $\mathbf{x}(0)$ as a linear combination of the eigenvectors $\mathbf{v}_i$ of the adjacency matrix A, thus:
$\mathbf{x}(0) = \sum_i c_i \mathbf{v}_i$
for some appropriate choice of constants $c_i$. Then
$\mathbf{x}(t) = A^t \mathbf{x}(0) = A^t \sum_i c_i \mathbf{v}_i = \sum_i c_i \kappa_i^t \mathbf{v}_i = \kappa_1^t \sum_i c_i \left[ \frac{\kappa_i}{\kappa_1} \right]^t \mathbf{v}_i$
where the $\kappa_i$ are the eigenvalues of A, and $\kappa_1$ is the largest of them.

Since $\kappa_i / \kappa_1 < 1$ for all i ≠ 1, all terms in the sum other than the first decay exponentially as t becomes
large, and hence in the limit t → ∞ we get $\mathbf{x}(t) \to c_1 \kappa_1^t \mathbf{v}_1$ (note that $\mathbf{v}_1$ is the leading eigenvector).
In other words, the limiting vector of centralities is simply proportional to the leading eigenvector of
the adjacency matrix.
Equivalently we could say that the centrality x satisfies $A \mathbf{x} = \kappa_1 \mathbf{x}$.
This then is the eigenvector centrality, first proposed by Bonacich in 1987.

The centrality $x_i$ of node i is proportional to the sum of the centralities of i's neighbours:
$x_i = \kappa_1^{-1} \sum_j A_{ij} x_j$
which gives the eigenvector centrality the nice property that it can be large either because a node has
many neighbors or because it has important neighbors (or both).
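The repeated multiplication by A described above is the power iteration, which can be sketched in plain Python. Rescaling so that the largest entry equals 1 after every step, the tolerance, and the edge list (inferred from the shortest-path list in the betweenness example of these slides) are assumptions of this sketch.

```python
# Example network; edge list inferred from the slides' shortest-path list (assumption).
EDGES = [(1, 2), (1, 3), (1, 4), (1, 5), (2, 3), (3, 4),
         (5, 6), (6, 7), (6, 8), (7, 8)]

def adjacency(edges):
    adj = {}
    for u, v in edges:
        adj.setdefault(u, set()).add(v)
        adj.setdefault(v, set()).add(u)
    return adj

def eigenvector_centrality(adj, iterations=1000, tol=1e-12):
    """Power iteration x <- A x, rescaled so that max(x) = 1 after each step.

    Returns (x, kappa), where kappa estimates the leading eigenvalue.
    """
    x = {i: 1.0 for i in adj}                 # initial guess x_i = 1
    kappa = 0.0
    for _ in range(iterations):
        new = {i: sum(x[j] for j in adj[i]) for i in adj}   # x' = A x
        kappa = max(new.values())
        new = {i: value / kappa for i, value in new.items()}
        converged = max(abs(new[i] - x[i]) for i in adj) < tol
        x = new
        if converged:
            break
    return x, kappa

adj = adjacency(EDGES)
vec, kappa = eigenvector_centrality(adj)
for node in sorted(vec, key=vec.get, reverse=True):
    print(node, round(vec[node], 4))
print("leading eigenvalue estimate:", round(kappa, 4))
```

The converged vector satisfies A x = κ1 x up to the tolerance. Its entries need not agree digit for digit with the output of a particular tool, since tools differ in the number of iterations and in the normalization they apply.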
In theory eigenvector centrality can be calculated for either undirected or directed networks. It works best however
for the undirected case. In the directed case other complications arise. First of all, a directed network has an adjacency
matrix that is, in general, asymmetric. This means that it has two sets of eigenvectors, the left eigenvectors and the
right eigenvectors, and hence two leading eigenvectors. So which of the two should we use to define the centrality? In
most cases the correct answer is to use the right eigenvector. The reason is that centrality in directed networks is
usually bestowed by other vertices pointing towards you, rather than by you pointing to others.
Example
Eigenvector centrality (Gephi)
Label   xi
1       1
3       0.846328
2       0.668745
4       0.668745
5       0.560831
6       0.490898
7       0.337226
8       0.337226
Gephi obtains the leading eigenvector and normalizes it so that the largest value is 1.
In NetworkX the centralities are not normalized in this way.
https://networkx.org/documentation/stable/reference/algorithms/generated/networkx.algorithms.centrality.eigenvector_centrality.html
Limitations
There are still problems with eigenvector centrality on directed networks. Only vertices that
are in a strongly connected component of two or more vertices, or the out-component of
such a component, can have non-zero eigenvector centrality. Recall also that acyclic
networks, such as citation networks, have no strongly connected components of more than
one node, so all vertices will have centrality zero. Clearly this makes the standard
eigenvector centrality completely useless for acyclic networks.
A variation on eigenvector centrality that addresses these problems is the Katz centrality,
which is the subject of the next section.
Katz Centrality
Katz (1953): the definition of "eigenvector centrality" is modified by adding two
parameters α > 0 and β > 0:
$x_i = \alpha \sum_j A_{ij} x_j + \beta$, that is, $\mathbf{x} = \alpha A \mathbf{x} + \beta \mathbf{1}$,
where 1 is the vector (1, 1, 1, ...)
β is a positive constant that guarantees that every node has a non-zero, positive centrality.
“We simply give each node a small amount of centrality “for free,” regardless of its
position in the network or the centrality of its neighbours”. Networks: An Introduction, by M.E.J. Newman
α modulates the weight of the "eigenvector" term with respect to the constant β in the centrality of
a node.
Katz Centrality
In matrix form: $\mathbf{x} = \beta (I - \alpha A)^{-1} \mathbf{1}$
The range of variation of α is limited by the principal eigenvalue $\kappa_1$ of A
(for $\alpha = 1/\kappa_1$ the Katz centrality diverges; thus we should choose a value of α smaller
than $1/\kappa_1$ if we want the expression for the centrality to converge to meaningful values).
Computing the inverse of $(I - \alpha A)$ is computationally expensive: O(n^3). There
are recursive (iterative) algorithms that compute the centrality in O(t·m), for t iterations over m edges.
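One such iterative scheme simply repeats the update x_i = α Σ_j A_ij x_j + β until the values stop changing; each pass touches every edge once, which is where a cost of the form O(t·m) comes from. A minimal sketch, in which the values α = 0.1 and β = 1, the tolerance, and the edge list (inferred from the shortest-path list in the betweenness example of these slides) are assumptions:

```python
# Example network; edge list inferred from the slides' shortest-path list (assumption).
EDGES = [(1, 2), (1, 3), (1, 4), (1, 5), (2, 3), (3, 4),
         (5, 6), (6, 7), (6, 8), (7, 8)]

def adjacency(edges):
    adj = {}
    for u, v in edges:
        adj.setdefault(u, set()).add(v)
        adj.setdefault(v, set()).add(u)
    return adj

def katz_centrality(adj, alpha=0.1, beta=1.0, iterations=1000, tol=1e-12):
    """Iterate x_i <- alpha * sum_j A_ij x_j + beta until convergence.

    Converges only if alpha is smaller than 1 / (leading eigenvalue of A).
    The default alpha = 0.1 is safe here because the largest degree is 4,
    and the leading eigenvalue never exceeds the largest degree.
    """
    x = {i: 0.0 for i in adj}
    for _ in range(iterations):
        new = {i: alpha * sum(x[j] for j in adj[i]) + beta for i in adj}
        converged = max(abs(new[i] - x[i]) for i in adj) < tol
        x = new
        if converged:
            break
    return x

adj = adjacency(EDGES)
katz = katz_centrality(adj)
for node in sorted(katz, key=katz.get, reverse=True):
    print(node, round(katz[node], 4))
```

Every node ends with a centrality of at least β, which is the "free" amount of centrality mentioned in the quote above.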
Katz Centrality
Considering a constant term $\beta_i$ that is different for each node:
$x_i = \alpha \sum_j A_{ij} x_j + \beta_i$
Gephi plugin:
One could incorporate relevant information about each node that is not contained in the network,
for example in a social network: age, wealth, etc.
PageRank
Eigenvector centrality and Katz centrality have a drawback: a prestigious node also makes
prestigious all the nodes it points to.
Intuition: my association with a prestigious node should be "downgraded" if its prestige is
shared among many.
Hypothesis: The importance that a node receives from a neighbour is proportional to the neighbour's
centrality divided by the neighbour's out-degree:
$x_i = \alpha \sum_j A_{ij} \frac{x_j}{k_j^{out}} + \beta$
Algorithm: PageRank (named after Larry Page)
PageRank
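In matrix form, with D the diagonal matrix of out-degrees: $\mathbf{x} = \alpha A D^{-1} \mathbf{x} + \beta \mathbf{1}$, so $\mathbf{x} = \beta (I - \alpha A D^{-1})^{-1} \mathbf{1}$
The range of variation of α is limited by the main eigenvalue of $A D^{-1}$.
This eigenvalue in undirected networks is 1.
PageRank arbitrarily fixed α = 0.85.
As before, calculating x by matrix inversion is computationally expensive, so
different heuristic approximations are used.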
Example: http://ccl.northwestern.edu/netlogo/models/PageRank
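Instead of inverting a matrix, the PageRank values can be approximated by iterating the update until it stabilizes. The sketch below treats the undirected example network as a directed one with a link in each direction, so every node has out-links; choosing β = (1 - α)/N so that the scores add up to 1 is an assumption of this sketch, as is the edge list (inferred from the shortest-path list in the betweenness example of these slides).

```python
# Example network; edge list inferred from the slides' shortest-path list (assumption).
EDGES = [(1, 2), (1, 3), (1, 4), (1, 5), (2, 3), (3, 4),
         (5, 6), (6, 7), (6, 8), (7, 8)]

def adjacency(edges):
    adj = {}
    for u, v in edges:
        adj.setdefault(u, set()).add(v)
        adj.setdefault(v, set()).add(u)
    return adj

def pagerank(adj, alpha=0.85, iterations=1000, tol=1e-12):
    """Iterate x_i <- alpha * sum_j A_ij x_j / k_j + (1 - alpha) / N.

    Assumes every node has at least one out-link (no dangling nodes).
    """
    n = len(adj)
    x = {i: 1.0 / n for i in adj}
    for _ in range(iterations):
        new = {i: (1.0 - alpha) / n for i in adj}
        for j in adj:
            share = alpha * x[j] / len(adj[j])   # j splits its score among its links
            for i in adj[j]:
                new[i] += share
        converged = max(abs(new[i] - x[i]) for i in adj) < tol
        x = new
        if converged:
            break
    return x

adj = adjacency(EDGES)
ranks = pagerank(adj)
for node in sorted(ranks, key=ranks.get, reverse=True):
    print(node, round(ranks[node], 4))
```

With this choice of β the update preserves the total score, so the values can be read as the shares of a fixed amount of "prestige".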
Example
PageRank centrality (Gephi)
Hubs-Authorities (For directed networks)
In directed networks, the previous measures compute the centrality of a node insofar as
central nodes point to it.
Sometimes it can also be interesting to identify the nodes that point to important
nodes.
Ex: Twitter: a Twitter account (hub) that follows others that are a reference (authority) in
a subject.
[Figure: example network of Twitter accounts: twitter@gassol, twitter@falonso, twitter@cronaldo, twitter@marca, twitter@mesi, twitter@nadal]
Hubs-Authorities
Hypothesis:
Authority: nodes that are relevant because they contain important information.
Hubs: nodes that tell us where the best authorities are.
Hub: points to "authorities".
Authority: is pointed to by "hubs".
It is a circular definition. A node can play both roles.
Hubs-Authorities
Hyperlink-Induced Topic Search (HITS) algorithm: each node is given an initial value (t = 0)
of "authority centrality" $x_i(0)$ and of "hub centrality" $y_i(0)$, and the values are then updated:
$x_i(1) = \sum_j A_{ij} y_j(0)$, $y_i(1) = \sum_j A_{ji} x_j(0)$
"Authority centrality" corresponds to the main eigenvector of the cocitation matrix $A A^T$ (element (i, j): number of nodes that point simultaneously to i and j).
"Hub centrality" corresponds to the main eigenvector of the bibliographic coupling matrix $A^T A$ (element (i, j): number of nodes to which i and j point simultaneously).
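The two coupled updates can be sketched as follows: a node's authority score is the sum of the hub scores of the nodes pointing to it, and its hub score is the sum of the authority scores of the nodes it points to. The small directed network, the node names and the rescaling to a maximum of 1 after each step are all hypothetical choices made for this illustration.

```python
def hits(points_to, iterations=100):
    """HITS on a directed graph given as {node: [nodes it points to]}.

    Returns (authority, hub), each rescaled so that its largest value is 1.
    Every node must appear as a key, even if it points to nothing.
    """
    nodes = sorted(points_to)
    authority = {i: 1.0 for i in nodes}
    hub = {i: 1.0 for i in nodes}
    for _ in range(iterations):
        # Authority update: sum of the hub scores of the nodes pointing to i.
        new_authority = {i: 0.0 for i in nodes}
        for j in nodes:
            for i in points_to[j]:
                new_authority[i] += hub[j]
        # Hub update: sum of the authority scores of the nodes i points to.
        new_hub = {i: sum(authority[j] for j in points_to[i]) for i in nodes}
        top_authority = max(new_authority.values()) or 1.0
        top_hub = max(new_hub.values()) or 1.0
        authority = {i: v / top_authority for i, v in new_authority.items()}
        hub = {i: v / top_hub for i, v in new_hub.items()}
    return authority, hub

# Hypothetical network: three accounts (h1, h2, h3) follow two reference accounts.
points_to = {"h1": ["a1", "a2"], "h2": ["a1", "a2"], "h3": ["a2"],
             "a1": [], "a2": []}
authority, hub = hits(points_to)
print("authority:", authority)
print("hub:", hub)
```

In this network the accounts that are followed but follow nobody get a hub score of 0, while the accounts that only follow others get an authority score of 0 and a positive hub score.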
Example
Cocitation matrix (number of nodes that point simultaneously to (i, j)) and bibliographic coupling matrix (number of nodes to which (i, j) point simultaneously)
Hubs-Authorities
The HITS algorithm does not suffer from the problems that eigenvector centrality has in
directed networks, which made it necessary to introduce a constant term.
Ex: an "article" may not be cited by anyone (xi = 0) and yet cite relevant articles in a
subject (yi ≠ 0).
It is more suitable for directed networks than the eigenvector algorithms.
It was the basis of the ask.com search engine.
Comparison
None is better than the others; which one to apply depends on the nature of the problem and the relationships.
[Figure: panels for Degree centrality, Closeness, Betweenness, Eigenvector centrality and PageRank]
Periodic Table of network centralities
http://schochastics.net/sna/periodic.html (Interactive version)
Thank you!