Professional Documents
Culture Documents
Graph Centrality Measures
Graph Centrality Measures
Centrality is a crucial concept in graph analytics that deals with distinguishing important
nodes in a graph. Simply put, it recognizes nodes that are important or central among
the whole list of other nodes in a graph.
With so many angles that can define the importance of every node, various metrics are
taken into consideration to study each node from a different perspective. Centrality
offers all the relevant analytical information regarding that node and the graph from
which the conclusion can be derived.
The different perspectives of a particular node are studied under different indices, which
are collectively known as centrality measures. They help in gaining a better grasp of the
network and cutting through the chaos while extracting information from a network.
Since importance is defined differently by each measure, it's vital to understand them all
to decide on the best one for graph visualization applications.
Degree centrality
Before diving into degree centrality, here’s a little refresher on the degree of a node in a
graph.
There are two kinds of graphs – directed and non-directed. The degree of a node in a
non-directed graph is defined as the number of links a node has with other nodes. In a
directed graph, the degree is subdivided into in and out degrees. In-degree refers to the
links incident on the node while out-degree is the number of nodes directed at other
nodes from a particular node.
Degree centrality defines the importance of a node based on the degree of that node.
The higher the degree, the more crucial it becomes in the graph. It’s used to find
popular individuals, the most connected individuals, individuals who connect quickly in a
wider network, or the ones that hold the most information.
It’s one of the simplest of all the centrality measures of node connectivity, and is used in
transactional data, account activity, etc.
Closeness centrality
Closeness centrality identifies a node's importance based on how close it is to all the
other nodes in the graph. The closeness is also known as geodesic distance (GD),
which is the number of links included in the shortest path between two nodes.
To calculate that closeness or GD for a node, sum up all the GD amidst that and all the
other nodes in the network on the graph.
Harmonic centrality
Betweenness centrality
Betweenness centrality defines the importance of any node based on the number of
times it occurs in the shortest paths between other nodes. It measures the percent of
the shortest path in a network and where a particular node lies in it.
A node with high betweenness centrality is considered the most influential one over
other nodes in the network. This is because this measure can provide insights into the
most critical path as disrupting them will disrupt the network.
Relationships with high scoring nodes have more contribution to the score of a node
than connections to nodes with low eigenvector centrality scores.
This centrality identifies nodes that influence the whole network, not only those that are
linked. A high score of a node will automatically mean that its neighboring connected
nodes have high scores too. It has a wide range of applications, one of which is the
calculation of PageRank used by Google. It also finds use in understanding human
social networks, malware propagation, etc.
PageRank
Since eigenvector centrality is suitable for undirected graphs, there wasn’t one for
directed graphs before PageRank entered the picture. It has applications in online
platforms like Twitter, which uses this centrality to offer recommendations of other
accounts that a user can follow.
PageRank is also used to detect flaws in the fraud detection system utilized in the
insurance and healthcare industries. The algorithm even predicts traffic workflows in
public spaces and streets by running over a graph that includes road intersections.
ArticleRank
ArticleRank is a subvariant of PageRank and gives a measure of the transitive influence
of nodes. However, it contradicts the assumption of the nodes’ relationship which
narrates that relationships with low out degree nodes are important compared to
relationships from higher out degree nodes. Now that all the centrality measures have
been explained, you can compare nodes in your network based on various
perspectives. With massive nodes in millions of networks worldwide, the centrality
measures are tools that offer strong conclusions.
References
https://www.turing.com/