Md. Kamrul Hassan: Two-Hundred Years Long Journey From Graph To Complex Network Theory

Two-hundred years long journey from graph to complex network theory

• Md. Kamrul Hassan


• Professor, Department of Physics, Dhaka University

Power of power-law

Tale of the power-law
Is mean always meaningful?
Communism Capitalism

Power-law
A power-law distribution is regarded as scale-free since it looks the same no matter
what scale we look at it.

In general, a function f(x) is called scale-free if it satisfies $f(\lambda x) = g(\lambda)\, f(x)$.

• It can be rigorously proved that such a function admits only a power-law solution. For instance,

$f(x) \sim x^{-\alpha} \;\Rightarrow\; f(\lambda x) = \lambda^{-\alpha} f(x)$

• It can likewise be proved that $g(\lambda)$ too can only be a power law.

It implies that if we know the function at a given value of x, we can find the value of f at any other value $x' = \lambda x$.
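The claim that scale invariance forces a power law can be sketched in a few lines. The derivation below assumes that f and g are differentiable and that f is not identically zero; these assumptions, and the name $\alpha$ for the exponent, are not stated on the slide.

```latex
% Start from the scale-free condition and set \lambda = 1:
f(\lambda x) = g(\lambda)\, f(x) \quad\Rightarrow\quad g(1) = 1 .
% Differentiate with respect to \lambda, then set \lambda = 1:
x\, f'(\lambda x) = g'(\lambda)\, f(x)
\quad\Rightarrow\quad
x\, f'(x) = g'(1)\, f(x) .
% Writing g'(1) \equiv -\alpha and integrating:
\frac{df}{f} = -\alpha\, \frac{dx}{x}
\quad\Rightarrow\quad
f(x) = f(1)\, x^{-\alpha} .
% Substituting back gives the form of g:
g(\lambda) = \frac{f(\lambda x)}{f(x)} = \lambda^{-\alpha} .
```

Both f and g therefore come out as power laws with the same exponent.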
Power laws are seemingly everywhere
note: these are cumulative distributions, more about this in a bit…

First return probability of a random walker

We ask a random walker to walk along a line and we measure the time it takes for the walker to return to the place where the journey started. Let the walker repeat the walk N times. A scan through the raw data shows only a list of return times, which by itself does not reveal any simple and clean emergent behavior. Yet we can see a pattern if we plot a histogram on a log-log scale.

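[Figure: log-log histogram of first-return times. Vertical axis: number of times n the walker returns at time t, divided by the total number of walks. Horizontal axis: return time t. Annotations: tens of thousands of observations when x < 10; noise in the tail, where there are only 0, 1 or 2 observations at values of t for which the walker takes long to return.]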
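The experiment described above can be tried with a short simulation. This is a minimal sketch, assuming a symmetric walker on the integers with unit steps; the function names, the cap `max_steps`, and the number of walks are illustrative choices, not values from the slides.

```python
import random
from collections import Counter


def first_return_time(rng, max_steps=2000):
    """Steps until a symmetric 1D walker first returns to the origin.

    Returns None if the walker has not returned within max_steps
    (an arbitrary cut-off needed because the tail is very long).
    """
    position = 0
    for step in range(1, max_steps + 1):
        position += rng.choice((-1, 1))
        if position == 0:
            return step
    return None


def return_time_histogram(num_walks=5000, seed=0):
    """Fraction of walks that first return at each time t."""
    rng = random.Random(seed)
    counts = Counter()
    for _ in range(num_walks):
        t = first_return_time(rng)
        if t is not None:
            counts[t] += 1
    return {t: n / num_walks for t, n in sorted(counts.items())}


hist = return_time_histogram()
for t in (2, 4, 8, 16, 32, 64):
    print(t, hist.get(t, 0.0))
```

Plotting `hist` on log-log axes should show a roughly straight line at small t and scattered points in the tail, where only a handful of walks contribute. For this symmetric walk the known asymptotic behavior is a power law proportional to $t^{-3/2}$.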

The Seven Bridges of Königsberg is a famous mathematics problem inspired by an actual city, then in Prussia (now Kaliningrad, Russia).

By 1735 the city had seven bridges, so that its people could get from one part to another.
• Many people claimed that they could walk a route crossing each bridge exactly once, yet nobody could prove it.

Having trouble?

That’s okay, so did Euler. It doesn’t seem possible to cross every bridge exactly once. In fact, it isn’t.

This one is solvable. Here's one possible solution:


Solution:
An Euler path is possible in a connected graph if it has two or fewer odd vertices, that is, vertices with an odd number of edges. If there are exactly two, the walk must begin at one of them and end at the other. Since there are only 4 vertices, the solution is simple. The walk is desired to begin at the blue vertex and end at the orange vertex. Therefore a new bridge is drawn between the other two vertices. Since each of them formerly had an odd number of bridges, they now have an even number of bridges, fulfilling all conditions.
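The odd-vertex rule is easy to check by counting degrees. The sketch below assumes the usual encoding of Königsberg as a multigraph on four land masses; the labels A to D and the choice of which two vertices receive the extra bridge are illustrative, since the slide's colored figure is not available here.

```python
from collections import Counter


def odd_vertices(edges):
    """Return the vertices that touch an odd number of edges."""
    degree = Counter()
    for a, b in edges:
        degree[a] += 1
        degree[b] += 1
    return sorted(v for v, d in degree.items() if d % 2 == 1)


def has_euler_path(edges):
    """Degree test only; assumes the multigraph is connected."""
    return len(odd_vertices(edges)) in (0, 2)


# Seven bridges: A is the central island, B, C, D the other land masses
# (hypothetical labels).
konigsberg = [
    ("A", "B"), ("A", "B"),
    ("A", "C"), ("A", "C"),
    ("A", "D"), ("B", "D"), ("C", "D"),
]

# One extra bridge between two of the odd vertices, as in the slide's fix.
with_extra_bridge = konigsberg + [("B", "C")]

print(odd_vertices(konigsberg), has_euler_path(konigsberg))
print(odd_vertices(with_extra_bridge), has_euler_path(with_extra_bridge))
```

In the original layout all four vertices are odd, so no such walk exists. After the extra bridge only two odd vertices remain, and a walk must start at one of them and end at the other.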

In 1735 Leonhard Euler solved it by mapping it onto an abstract graph, which in general comprises a set of nodes or vertices connected by a set of links or edges, where the spatial distance between nodes is totally disregarded.

Graph theory eventually developed into an active subject of discrete mathematics, with contributions from many famous mathematicians such as Cauchy, Hamilton, Cayley, Kirchhoff and others. They dealt with graphs that were regular and deterministic.

However, the graph theory that began with the work of Euler witnessed a paradigm shift in 1959 with the seminal work of two mathematicians, Paul Erdős and Alfréd Rényi, who proposed a simple model to generate random graphs.
What is a network?
• Basic constituents of networks are:
• (a) Nodes or vertices and (b) Links or edges

• Many complex systems, natural or man-made, can be described as an interwoven web forming a large network if the constituents are regarded as nodes or vertices and the interactions between constituents as links or edges.

Observables

Degree: Nodes in network or graph theory are regarded as topologically identical, though they may differ in their importance in the network. The parameter that quantifies this importance is known as the degree k, defined for a given node as the number of edges attached to it. A node with degree k is connected to k other existing nodes regardless of their spatial positions.

Degree distribution P(k): When nodes have a great many different numbers of links connected to them, it is worthwhile to characterize the network by the degree distribution P(k). Here P(k) is defined as the fraction of nodes that have degree k.
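As a small illustration of these two definitions, the following sketch counts degrees from an edge list and turns them into P(k). The five-node example graph and the function name are made up for the illustration.

```python
from collections import Counter


def degree_distribution(num_nodes, edges):
    """Return {k: fraction of nodes with degree k} for an undirected graph."""
    degree = [0] * num_nodes
    for a, b in edges:
        degree[a] += 1
        degree[b] += 1
    counts = Counter(degree)
    return {k: counts[k] / num_nodes for k in sorted(counts)}


# Hypothetical example: node 4 is isolated, node 0 is the best connected.
edges = [(0, 1), (0, 2), (0, 3), (1, 2)]
P = degree_distribution(5, edges)
print(P)
```

Here the degrees are 3, 2, 2, 1 and 0, so two of the five nodes share degree 2 and every other degree occurs once.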
Degree distribution
How many different kinds of P(k) do we usually find?

Typically, there are three kinds of P(k):

Clustering coefficient C
Clustering coefficient: Many networks have strong local recurrent connections leading to loops and clusters. Assume that you have only four good friends, which in network language means a node is linked to four other nodes by four edges. If all four friends are also friends of each other, this requires six additional edges. Instead, it may happen that some of your friends are not friends of each other; in that case the real count will be less than six.

Clustering Coefficient, C

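$C_i = \dfrac{M_i}{k_i (k_i - 1)/2}$

$C = \dfrac{1}{N} \sum_{i=1}^{N} C_i$

Here $M_i$ is the number of edges that actually exist among the $k_i$ neighbors of node i, and $k_i (k_i - 1)/2$ is the maximum possible number of such edges.

Tree network: $C = 0$

A network whose clustering coefficient is significantly high and remains independent of the network size bears the signature of a small-world network.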
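The friends example can be worked through numerically. The sketch below is one straightforward way to evaluate the local and average clustering coefficients; the convention of assigning zero to nodes with fewer than two neighbors is an assumption, as are the example graphs.

```python
from itertools import combinations


def build_adjacency(edges):
    """Undirected adjacency sets from an edge list."""
    adj = {}
    for a, b in edges:
        adj.setdefault(a, set()).add(b)
        adj.setdefault(b, set()).add(a)
    return adj


def local_clustering(adj, i):
    """Existing links among the neighbors of i, divided by k(k-1)/2."""
    k = len(adj[i])
    if k < 2:
        return 0.0  # assumed convention for nodes with fewer than 2 neighbors
    links = sum(1 for u, v in combinations(adj[i], 2) if v in adj[u])
    return links / (k * (k - 1) / 2)


def average_clustering(adj):
    return sum(local_clustering(adj, i) for i in adj) / len(adj)


# "You" are node 0 with four friends; only 3 of the 6 possible
# friendships among them exist (hypothetical example).
friends = build_adjacency(
    [(0, 1), (0, 2), (0, 3), (0, 4), (1, 2), (2, 3), (3, 4)]
)
# A star is a tree: no two neighbors of any node are linked.
star = build_adjacency([(0, 1), (0, 2), (0, 3), (0, 4)])

print(local_clustering(friends, 0))
print(average_clustering(star))
```

For node 0 of the first graph three of the six possible links exist, giving one half; for the tree every local coefficient vanishes.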
Typical nature of clustering co-efficient C

• C decays following a power law, $C(N) \sim N^{-\beta}$ with $\beta > 0$

• C(k) decays following a power law, $C(k) \sim k^{-1}$

• C is high and independent of N

• C is low and independent of N

Geodesic path length
The average shortest path length, or mean geodesic distance l, is the average of the shortest distances between all pairs of nodes of the network. Say there are 5 nodes in the network; then the average geodesic path length is:

For a square lattice network: if the number of nodes increases from 100 to 10000, the geodesic path length increases from 10 to 100, while for a network in which l grows as log N it grows only from 2 to 4.
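For a small graph the mean geodesic distance can be computed directly with breadth-first search. The sketch below assumes a connected, unweighted graph; the five-node chain is an illustrative example and not the one from the slide's missing formula.

```python
from collections import deque


def bfs_distances(adj, source):
    """Shortest hop counts from source to every reachable node."""
    dist = {source: 0}
    queue = deque([source])
    while queue:
        u = queue.popleft()
        for v in adj[u]:
            if v not in dist:
                dist[v] = dist[u] + 1
                queue.append(v)
    return dist


def mean_geodesic(adj):
    """Average shortest distance over all unordered pairs of nodes.

    Assumes the graph is connected, so every pair has a finite distance.
    """
    nodes = list(adj)
    total, pairs = 0, 0
    for index, s in enumerate(nodes):
        dist = bfs_distances(adj, s)
        for t in nodes[index + 1:]:
            total += dist[t]
            pairs += 1
    return total / pairs


# Five nodes in a chain: 0 - 1 - 2 - 3 - 4 (hypothetical example).
path_graph = {0: [1], 1: [0, 2], 2: [1, 3], 3: [2, 4], 4: [3]}
print(mean_geodesic(path_graph))
```

With 5 nodes there are 10 pairs; in the chain their distances sum to 20, so the mean is 2.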
Classifications of network
Regular graph or network:

Degree distribution: $P(k) = \delta(k - n)$
Mean geodesic path length: $l \sim N^{1/d}$
Clustering coefficient: constant

Random network:

Degree distribution: $P(k) = e^{-pN} \dfrac{(pN)^k}{k!} = e^{-\langle k \rangle} \dfrac{\langle k \rangle^k}{k!}$
Mean geodesic path length: $l(N) \sim \log N$
Clustering coefficient: $C(N) \sim N^{-1}$
Classifications of network
Small-world network:

Degree distribution: $P(k) = \delta(k - n)$
Mean geodesic path length: $l(N) \sim \log N$
Clustering coefficient: high and independent of N

Scale-free network:

Degree distribution: $P(k) \sim k^{-\gamma}$
Mean geodesic path length: $l(N) \sim \log N$
Clustering coefficient: $C(N) \sim N^{-1}$
Pál Erdős
(1913-1996)

He wrote 1475 articles with 493 coauthors.

Erdős and Rényi in 1959 first realized that a real-life graph must have some degree of disorder.
The question they addressed is the following:

How does nature decide to establish links between pairs of nodes?

They were not physicists and hence they assumed the simplest case:

At each step a pair of labeled nodes is picked at random with uniform probability, and the link between them is occupied if it is empty; otherwise the pick is discarded.

Erdős-Rényi (ER) Model: G(N,p) version
The first framework to describe complex networks was proposed by Paul Erdős and Alfréd Rényi in 1959:
– Start with N nodes
– Edges are added with probability p between pairs of nodes picked at random

ER Model: G(N,p) version

Each possible link is picked at random, and we then add the link with probability p and leave it unoccupied with probability 1-p. Thus p is a prefixed value that we have to choose before we start the game!

What is the expected number of links added after we exhaust all the distinct pairs?

$m = \dfrac{N(N-1)\,p}{2}$

Shall we get this number each time we play the game? No!

$\langle k \rangle = \dfrac{2m}{N} = p(N-1) \approx pN$

However, if we repeat this experiment a large number of times, the mean value will get closer and closer to this value as the number of experiments increases.
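The statement that single realizations fluctuate while the average converges can be checked numerically. This is a minimal sketch of the G(N,p) construction; the values N = 100, p = 0.05 and the number of repetitions are arbitrary illustrative choices.

```python
import random
from itertools import combinations


def gnp_edges(n, p, rng):
    """Visit every distinct pair once and occupy it with probability p."""
    return [(i, j) for i, j in combinations(range(n), 2) if rng.random() < p]


def mean_edge_count(n, p, trials, seed=0):
    """Average number of links over repeated, independent realizations."""
    rng = random.Random(seed)
    return sum(len(gnp_edges(n, p, rng)) for _ in range(trials)) / trials


n, p = 100, 0.05
expected_m = n * (n - 1) * p / 2          # N(N-1)p/2
single_run = len(gnp_edges(n, p, random.Random(1)))
averaged = mean_edge_count(n, p, trials=200)

print("expected:", expected_m)
print("one realization:", single_run)
print("mean over 200 realizations:", averaged)
print("mean degree 2m/N:", 2 * averaged / n, "vs p(N-1):", p * (n - 1))
```

A single run typically misses the expected value by several links, while the average over many runs lands much closer to it.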
ER Model
What is the probability that there are exactly m links at probability p out of N(N-1)/2 possible links?

The answer is:

The expected number of links at probability p:

Hints to prove this:

Average degree: $\langle k \rangle = \dfrac{2m}{N} = p(N-1) \approx pN$

What is the probability P(k) that a node picked at random is connected to k other nodes?
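The questions on this slide have standard textbook answers, sketched below. The notation $M = N(N-1)/2$ for the number of distinct pairs is introduced here for convenience, and the route to the mean via the binomial identity is one possible proof, not necessarily the hint the slide had in mind.

```latex
% Each of the M = N(N-1)/2 pairs is occupied independently with probability p,
% so the number of links m is binomially distributed:
P(m) = \binom{M}{m}\, p^{m} (1-p)^{M-m}, \qquad M = \frac{N(N-1)}{2} .
% Mean number of links, using m \binom{M}{m} = M \binom{M-1}{m-1}:
\langle m \rangle = \sum_{m=0}^{M} m\, P(m)
  = M p \sum_{m=1}^{M} \binom{M-1}{m-1} p^{m-1} (1-p)^{M-m}
  = M p = \frac{N(N-1)\, p}{2} .
% Each link contributes to the degree of two nodes:
\langle k \rangle = \frac{2 \langle m \rangle}{N} = p\,(N-1) \approx pN .
```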
The degree distribution: $P(k) = \binom{N-1}{k}\, p^{k} (1-p)^{N-1-k}$

In the limit of large N with pN held fixed:

$P(k) = e^{-pN} \dfrac{(pN)^k}{k!} = e^{-\langle k \rangle} \dfrac{\langle k \rangle^k}{k!}$

The hallmark of the ER model: the degree distribution is Poissonian.
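How close the binomial form is to its Poisson limit can be seen by evaluating both. The sketch below uses N = 10000 and a mean degree of 4 as arbitrary illustrative values.

```python
from math import comb, exp, factorial


def binomial_pk(k, n, p):
    """Exact ER degree distribution: k links out of the n-1 possible ones."""
    return comb(n - 1, k) * p**k * (1 - p) ** (n - 1 - k)


def poisson_pk(k, mean_degree):
    """Poisson limit with the same mean degree."""
    return exp(-mean_degree) * mean_degree**k / factorial(k)


n, mean_degree = 10_000, 4.0
p = mean_degree / (n - 1)

for k in range(0, 11, 2):
    print(k, binomial_pk(k, n, p), poisson_pk(k, mean_degree))

max_diff = max(
    abs(binomial_pk(k, n, p) - poisson_pk(k, mean_degree)) for k in range(21)
)
print("largest pointwise difference:", max_diff)
```

For a network this large and sparse the two columns agree to several decimal places, which is why the Poisson form is normally quoted.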


ER model

• In a 1960 paper, Erdős and Rényi described the behavior of G(N, p) very precisely for various values of p. Their results included that:

• If Np < 1, then a graph in G(N, p) will almost surely have no connected components of size larger than O(log(N)).

• If Np = 1, then a graph in G(N, p) will almost surely have a largest component whose size is of order $N^{2/3}$.

• If Np → c > 1, where c is a constant, then a graph in G(N, p) will almost surely have a unique giant cluster of size O(N).
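The three regimes can be observed in a small simulation. The sketch below generates one G(N,p) realization for each of three values of c = Np and measures the largest component with a union-find structure; N = 1000, the seeds and the helper names are illustrative assumptions, and a single finite realization only hints at the asymptotic statements.

```python
import random
from itertools import combinations


def largest_component_size(n, edges):
    """Size of the largest connected component, found with union-find."""
    parent = list(range(n))
    size = [1] * n

    def find(x):
        while parent[x] != x:
            parent[x] = parent[parent[x]]  # path halving
            x = parent[x]
        return x

    for a, b in edges:
        ra, rb = find(a), find(b)
        if ra != rb:
            if size[ra] < size[rb]:
                ra, rb = rb, ra
            parent[rb] = ra
            size[ra] += size[rb]
    return max(size[find(x)] for x in range(n))


def largest_fraction(n, c, seed):
    """Largest component of one G(n, p = c/n) realization, as a fraction of n."""
    rng = random.Random(seed)
    p = c / n
    edges = ((i, j) for i, j in combinations(range(n), 2) if rng.random() < p)
    return largest_component_size(n, edges) / n


n = 1000
for c in (0.5, 1.0, 2.0):
    print("Np =", c, "largest component fraction:", largest_fraction(n, c, seed=1))
```

Below the threshold the largest component is a tiny fraction of the network, while for Np = 2 it contains most of the nodes.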
ER Model: G(N,m) version

In the G(N, m) model, a graph is chosen uniformly at random from the collection of all graphs which have N nodes and m edges. We can add the m edges from the N(N-1)/2 possible edges. These are added one by one at random with uniform probability.

Now, say, X is a quantity which we are interested in. The question is then how do we connect it with that of the G(N,p) model?

We measure X as we keep adding links one by one, so that we have data for X as a function of m. We can then get X as a function of p using the following relation:

$p = \dfrac{2t}{N-1}$

Here t = m/N is the number of links added per node, which follows from $m = N(N-1)p/2$. In the G(N,p) case the total number of edges at probability p is not fixed; rather, it fluctuates around $N(N-1)p/2$.
ER model: G(N,m) version

We thus see that below t=0.5 the largest cluster grows logarithmically with network size N. However, above t=0.5 it grows linearly with N.

Thus there is a transition from a minuscule cluster to a giant cluster across a threshold or critical point, which has been regarded as a percolation transition. Therefore, since 1960, it has been argued that the model undergoes a phase transition like percolation on a lattice!
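The growth of the largest cluster with t can be followed by adding links one at a time, as in the G(N,m) version. The sketch below assumes t = m/N (links per node), which is inferred from the relation between p and t rather than stated on the slides; N = 2000 and the function names are illustrative.

```python
import random


def er_growth(n, t_max, seed=0):
    """Add distinct random links one by one; record (t, largest cluster / n)."""
    rng = random.Random(seed)
    parent = list(range(n))
    size = [1] * n
    occupied = set()
    largest = 1
    curve = []

    def find(x):
        while parent[x] != x:
            parent[x] = parent[parent[x]]
            x = parent[x]
        return x

    target = int(t_max * n)
    while len(occupied) < target:
        a, b = rng.sample(range(n), 2)
        link = (min(a, b), max(a, b))
        if link in occupied:
            continue  # the link is already occupied, so this pick is discarded
        occupied.add(link)
        ra, rb = find(a), find(b)
        if ra != rb:
            if size[ra] < size[rb]:
                ra, rb = rb, ra
            parent[rb] = ra
            size[ra] += size[rb]
            largest = max(largest, size[ra])
        curve.append((len(occupied) / n, largest / n))
    return curve


def fraction_at(curve, t):
    """Largest-cluster fraction at the last recorded point with t' <= t."""
    return max(frac for t_prime, frac in curve if t_prime <= t)


curve = er_growth(2000, t_max=1.0)
for t in (0.25, 0.5, 0.75, 1.0):
    print(t, fraction_at(curve, t))
```

The recorded fraction stays close to zero well below t = 0.5 and becomes a finite fraction of the network above it, which is the percolation-like transition described above.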
What is the order parameter? In the case of a lattice, the relative size of the spanning cluster is defined as the order parameter. In a network there is no border or surface, and hence one cannot really define a spanning cluster. However, we find that the largest cluster behaves in the same way as the spanning cluster, both in the network and in the lattice.

If we regard P as the order parameter, then there must exist another quantity H that quantifies the degree of disorder, such that where P=0, H must be maximum and vice versa. That quantity is none other than entropy!
The question you may ask is: What is order? The idea of order can be best understood if we know what disorder, or the degree of disorder, is. The latter can be quantified by entropy. Let us try to understand entropy in the context of percolation.

Initially, at t=0, all the nodes are isolated. Say they are all distinct, colored or labeled nodes. What is the probability that a node picked at random is the ith node? The answer is 1/N, which is the same for all nodes. It means that we are in a state of maximum uncertainty. Entropy measures the degree of uncertainty or the extent of confusion.

What happens at the other extreme, that is, at t=1? It is almost certain that there is a giant cluster of size almost the same as the system size N. What is the probability that a node picked at random belongs to the giant cluster? In this case the probability is, if not one, almost equal to one. It means that we are almost sure that it will belong to the giant cluster, and the degree of uncertainty or the extent of confusion is almost zero.
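The two extremes can be put into numbers with the Shannon formula. The sketch below assumes that the relevant probability for each cluster is its size divided by N, that is, the chance that a randomly picked node belongs to it; the function name and the example sizes are illustrative.

```python
from math import log


def shannon_entropy(cluster_sizes):
    """H = -sum_i mu_i log(mu_i), with mu_i = s_i / N (assumed definition)."""
    n = sum(cluster_sizes)
    return -sum((s / n) * log(s / n) for s in cluster_sizes if s > 0)


N = 1000
all_isolated = [1] * N            # t = 0: every node is its own cluster
one_giant = [N]                   # idealized late stage: a single cluster
giant_plus_dust = [990] + [1] * 10

print("isolated nodes:", shannon_entropy(all_isolated), "log N =", log(N))
print("single giant cluster:", shannon_entropy(one_giant))
print("giant cluster plus 10 isolated nodes:", shannon_entropy(giant_plus_dust))
```

With all nodes isolated the entropy takes its maximum value log N; with a single cluster it vanishes; and with a giant cluster plus a few stragglers it is small but not zero.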
In how many distinct ways can N labeled nodes be picked?

Taking the logarithm of both sides and using Stirling's approximation:

Shannon Entropy

Entropy

What if we pick two links instead of one, but finally add only one link while the other is recycled for future picking?

We then measure the following:

The two possible options to choose as litmus test are:

What does it mean?
Both models, the product rule and the sum rule, encourage smaller clusters to grow faster than larger clusters.

The big question is: what is the impact of such a minor change in the way a link is added? Do we expect a significant change in the behavior of the observable quantities? Let us see!
Achlioptas, D'Souza, and Spencer
Explosive Percolation Model

The question is: Why?

D. Achlioptas, R. M. D'Souza, and J. Spencer, Science 323, 1453 (2009)


Explosive Percolation Model

The Shannon entropy: $H(t) = -\sum_i \mu_i(t) \log \mu_i(t)$, where $\mu_i(t)$ is the probability that a node picked at random belongs to cluster i.

C (t )  (t  tc )    1.0   (t  tc )  ;   0.893

It clearly suggests order-disorder transition!

The transition is so sudden and abrupt that initially it was thought that the order parameter undergoes a sudden finite jump, and hence it was claimed to be a first-order transition. However, it was later proved that it is actually a continuous, or second-order, phase transition, although it still shows some first-order-like behavior due to finite-size effects.
What if we do the opposite!

The question is then: why does encouraging the smaller clusters to grow faster make the transition so sudden and abrupt? What is the physics behind it?

From thermodynamics, we know that the equilibrium state is the one where the free energy is minimum. At high temperature the disordered phase is achieved by maximizing entropy, and at low temperature the ordered phase is achieved by minimizing the energy of the system.
Basically, what happens is the following: when we pick two links at random and attempt to choose one of them to add and the other to discard, we essentially measure the size of the cluster that the addition of each potential link would form. The link that forms the smaller cluster is kept and the other is discarded.
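The selection step described above can be compared with plain random link addition in a short simulation. The sketch below uses the product rule as defined by Achlioptas, D'Souza and Spencer (keep the candidate link whose two end clusters have the smaller product of sizes). As simplifications that are assumptions of this sketch, it does not track which links are already occupied, and the discarded candidate simply stays available for later picks. The class and function names and N = 2000 are illustrative.

```python
import random


class Clusters:
    """Union-find that also tracks the size of the largest cluster."""

    def __init__(self, n):
        self.parent = list(range(n))
        self.size = [1] * n
        self.largest = 1

    def find(self, x):
        while self.parent[x] != x:
            self.parent[x] = self.parent[self.parent[x]]
            x = self.parent[x]
        return x

    def cluster_size(self, x):
        return self.size[self.find(x)]

    def union(self, a, b):
        ra, rb = self.find(a), self.find(b)
        if ra == rb:
            return
        if self.size[ra] < self.size[rb]:
            ra, rb = rb, ra
        self.parent[rb] = ra
        self.size[ra] += self.size[rb]
        self.largest = max(self.largest, self.size[ra])


def grow(n, t, rule, seed=0):
    """Add int(t * n) links; return the largest cluster as a fraction of n.

    rule = "random": every picked link is added (ER-like growth).
    rule = "product": two candidate links are picked and the one with the
    smaller product of end-cluster sizes is added.
    """
    rng = random.Random(seed)
    clusters = Clusters(n)
    for _ in range(int(t * n)):
        a, b = rng.sample(range(n), 2)
        if rule == "product":
            c, d = rng.sample(range(n), 2)
            product_ab = clusters.cluster_size(a) * clusters.cluster_size(b)
            product_cd = clusters.cluster_size(c) * clusters.cluster_size(d)
            if product_cd < product_ab:
                a, b = c, d
        clusters.union(a, b)
    return clusters.largest / n


n = 2000
for t in (0.5, 0.7, 0.9, 1.0):
    print(t, "random:", grow(n, t, "random"), "product:", grow(n, t, "product"))
```

Under random addition the largest cluster is already a sizeable fraction of the network at t = 0.7, whereas under the product rule it is still small there and only takes off later, over a much narrower range of t.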

Note that initially, at t=0, the system is in its most disordered state. We know that a reduction of entropy costs energy. Thus choosing the bigger cluster would mean reducing the entropy of the system by a greater extent than choosing the smaller cluster. Moreover, as we are in the disordered phase, the minimum of the free energy is achieved by maximizing entropy as far as permitted by the situation at that instant.