Introduction to Graph Theory
Dragomir R. Radev
Thursdays, 6-8 PM
233 Mudd
Spring 2008
(3) Random graphs
Statistical analysis of networks
• We want to be able to describe the behavior of
networks under certain assumptions.
• The behavior is described by the diameter,
clustering coefficient, degree distribution, size of
the largest connected component, the presence
and count of complete subgraphs, etc.
• For statistical analysis, we need to introduce the
concept of a random graph.
Erdos-Renyi model
• A very simple model with several variants.
• We fix n and include each candidate edge independently with
probability p. This defines an ensemble $G_{n,p}$.
• The two examples below are specific instances of $G_{10,0.2}$.
• In other variants, the number of edges m is fixed instead. There are also
versions in which some graphs are more likely than others, etc.
Try Pajek
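The construction above can be sketched in a few lines of Python. This is a minimal illustration, not part of the original slides: the function name `sample_gnp` and the `seed` parameter are assumptions made for the example.

```python
import itertools
import random

def sample_gnp(n, p, seed=None):
    """Return the edge list of one graph drawn from G(n, p)."""
    rng = random.Random(seed)
    # Visit every unordered pair {i, j} once and keep it with probability p.
    return [(i, j) for i, j in itertools.combinations(range(n), 2)
            if rng.random() < p]

# One instance of G(10, 0.2); a different seed gives a different instance.
edges = sample_gnp(10, 0.2, seed=1)
print(len(edges), "edges out of", 10 * 9 // 2, "candidates")
```

Each call returns one member of the ensemble, so properties of the model are estimated by averaging over many calls.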
Erdos-Renyi model
• We are interested in the computation of
specific properties of E-R random graphs.
• The number of candidate edges is: $\binom{n}{2} = \frac{n(n-1)}{2}$
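As a quick check of the count of candidate edges, the sketch below compares the closed form n(n-1)/2 with a direct enumeration of vertex pairs. The expected edge count p·n(n-1)/2 is not stated on the slide; it follows from linearity of expectation and is added here as an assumption-free side remark. The helper name `candidate_edges` is hypothetical.

```python
import itertools
import math

def candidate_edges(n):
    """Number of unordered vertex pairs, n choose 2."""
    return n * (n - 1) // 2

n, p = 10, 0.2
pairs = list(itertools.combinations(range(n), 2))
print(candidate_edges(n), math.comb(n, 2), len(pairs))

# Each candidate edge is present with probability p, so the expected
# number of edges is p times the number of candidates.
expected_edges = p * candidate_edges(n)
print(expected_edges)
```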
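• Let S = 1 - u, where u is the fraction of vertices outside the giant
component and z is the mean degree. We now have $S = 1 - e^{-zS}$
For z < 1, the only non-negative solution is S = 0
For z > 1 (after the phase transition), S = 0 is still a solution, but there is
also a positive solution, which is the size of the giant component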
• At the phase transition, the component sizes are
distributed according to a power law with exponent 5/2.
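The giant component size has no closed form, but it can be found numerically. The sketch below assumes the self-consistency relation S = 1 - e^{-zS}, with z the mean degree (notation as in Newman 2003), and solves it by fixed-point iteration. The function name, tolerance, and iteration cap are choices made for this example.

```python
import math

def giant_component_fraction(z, tol=1e-12, max_iter=10_000):
    """Largest solution S in [0, 1] of S = 1 - exp(-z * S)."""
    s = 1.0  # start from the top so the iteration approaches the largest root
    for _ in range(max_iter):
        s_next = 1.0 - math.exp(-z * s)
        if abs(s_next - s) < tol:
            return s_next
        s = s_next
    return s  # not converged within max_iter (happens very close to z = 1)

for z in (0.5, 1.5, 2.0, 4.0):
    print(z, round(giant_component_fraction(z), 4))
```

Below the transition the iterate shrinks toward 0; above it the iterate settles on the positive root. Very close to z = 1 convergence is slow, so the returned value there is only approximate.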
Giant component size
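• Similarly one can prove that the mean size of the (non-giant) component
containing a randomly chosen vertex is $\langle s \rangle = \frac{1}{1 - z + zS}$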
[Newman 2003]
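To see how the mean component size behaves near the phase transition, the sketch below evaluates ⟨s⟩ = 1/(1 - z + zS), assuming z is the mean degree and S the giant component fraction (notation as in Newman 2003). Below the transition S = 0, so the formula reduces to 1/(1 - z). The function name is hypothetical.

```python
def mean_component_size(z, S):
    """Mean size <s> of the non-giant component of a random vertex."""
    return 1.0 / (1.0 - z + z * S)

# Below the transition there is no giant component, so S = 0.
for z in (0.5, 0.9, 0.99):
    print(z, mean_component_size(z, S=0.0))
```

The values grow without bound as z approaches 1 from below, which is the divergence expected at the phase transition.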
Diameter
• A given vertex i has $N_i^1$ first neighbors. The
expected value of this number is z, the mean degree.
• But we also know that $z = p(n-1) \approx pn$.
• Now move to $N_i^2$. This is the number of second
neighbors of i. Let's make the assumption that
these are the neighbors of the first neighbors.
So $N_i^2 \approx (N_i^1)^2 \approx z^2$, and in general $N_i^D \approx z^D$
At all distances:
$n \approx N_i^1 + N_i^2 + \dots + N_i^D \approx z^D$
In other words (after taking a logarithm):
$D \approx \frac{\log n}{\log z}$
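The logarithmic estimate can be evaluated directly. The sketch below assumes z denotes the mean degree and computes D ≈ log n / log z; it is the heuristic from this slide, not an exact diameter computation, and the function name is hypothetical.

```python
import math

def estimated_diameter(n, z):
    """Heuristic diameter D ~ log(n) / log(z) for mean degree z > 1."""
    if n < 2 or z <= 1:
        raise ValueError("the estimate needs n >= 2 and mean degree z > 1")
    return math.log(n) / math.log(z)

# Growing n by a factor of a thousand adds only a few steps to D.
for n in (10**3, 10**6, 10**9):
    print(n, round(estimated_diameter(n, z=10), 2))
```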
Are E-R graphs realistic?
• They have small world properties
(diameter is logarithmic in the size of the
graph)
• But a low clustering coefficient. Example: for the
Internet at the autonomous-system level, compare the
measured 0.30 with 0.0004 for a comparable random
graph [Pastor-Satorras and Vespignani]
• And unrealistic degree distributions
• Not to mention skinny tails
Clustering coefficient
• Given a vertex i and two of its neighbors
j and k, what is the probability that the
graph contains an edge between j and k?
• $C_i$ = #triangles at i / #triples at i
• C = average over all $C_i$
• Typical value in real graphs can be as high
as 50% [Newman 2002].
• In random graphs, C = p (ignoring the fact
that j and k share a neighbor, i).
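The definitions above translate into a short computation. The sketch below represents a graph as a dictionary of adjacency sets, computes the local and average clustering coefficients, and compares the result with p on one sampled E-R graph. All function names are hypothetical, and treating vertices of degree below 2 as having clustering 0 is an assumed convention that the slides do not specify.

```python
import itertools
import random

def sample_gnp_adjacency(n, p, seed=None):
    """Adjacency sets of one graph drawn from G(n, p)."""
    rng = random.Random(seed)
    adj = {v: set() for v in range(n)}
    for i, j in itertools.combinations(range(n), 2):
        if rng.random() < p:
            adj[i].add(j)
            adj[j].add(i)
    return adj

def local_clustering(adj, i):
    """C_i = (#connected neighbor pairs of i) / (#neighbor pairs of i)."""
    neighbors = adj[i]
    triples = len(neighbors) * (len(neighbors) - 1) // 2
    if triples == 0:
        return 0.0  # assumed convention for vertices of degree < 2
    triangles = sum(1 for j, k in itertools.combinations(neighbors, 2)
                    if k in adj[j])
    return triangles / triples

def average_clustering(adj):
    """C = average of C_i over all vertices."""
    return sum(local_clustering(adj, i) for i in adj) / len(adj)

p = 0.05
adj = sample_gnp_adjacency(300, p, seed=0)
print("C =", round(average_clustering(adj), 4), "vs p =", p)
```

On a single sample the measured C fluctuates around p; averaging over several seeds tightens the agreement.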
Some real networks
• From Newman 2002:
Network | n | Mean degree z | Cc | Cc for random graph