Small World
10/4/2011 3
What is the typical shortest path length between any two people?
Experiment on the global social network
Can’t measure, need to probe explicitly
Small-world experiment [Milgram ’67]
Picked 300 people in Omaha, Nebraska and Wichita, Kansas
Task: Get a letter to a Boston stockbroker by passing it through friends
How many steps did it take?
Median of 6 steps, thus “six degrees of separation”
Assume each human is connected to 100 other people.
Then:
Step 1: reach 100 people
Step 2: reach 100*100 = 10,000 people
Step 3: reach 100*100*100 = 1,000,000 people
Step 4: reach 100*100*100*100 = 100M people
In 5 steps we can reach 10 billion people
What’s wrong here?
92% of new FB friendships are to a friend-of-a-friend
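The back-of-the-envelope count above can be reproduced in a few lines. This is a minimal sketch that assumes, as the slide does, that everyone has exactly 100 friends and that no two circles of friends overlap; the function name `naive_reach` is hypothetical. The 92% figure hints at why the assumption fails: most friends of friends are already in one's own neighborhood, so real growth per step is far below a factor of 100.

```python
def naive_reach(degree: int, steps: int) -> int:
    """People reached at exactly `steps` hops if no friendships overlap."""
    return degree ** steps


for step in range(1, 6):
    print(f"Step {step}: reach {naive_reach(100, step):,} people")
```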
Data set                                         Avg. shortest path length     Clustering coefficient
                                                 (measured)   (random)         (measured)   (random)
Film actors (225,226 nodes, avg. degree k=61)    3.65         2.99             0.79         0.00027
Electrical power grid (4,941 nodes, k=2.67)      18.7         12.4             0.080        0.005
Network of neurons (282 nodes, k=14)             2.65         2.25             0.28         0.05
MSN (180 million edges, k=7)                     6.6          ...              0.114        0.00000008
Facebook (721                                    4.7          ...              0.14         ...
Consequence of expansion:
Short paths: O(log n)
This is the “best” we can do if the graph has constant degree and n nodes
Pure exponential growth
But networks have local structure:
Triadic closure: Friend of a friend is my friend
How can we have both?
Triadic closure reduces growth rate
Where should we place social networks?
Clustered? Random?
Could a network with high clustering be at the same time a small world?
How can we at the same time have high clustering and small diameter?
Clustering coefficient: Ci = 2*ei/(ki*(ki-1)) ≥ 2*12/(8*7) ≈ 0.43
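The numbers in this bound can be checked by counting edges among the neighbors of one grid node. The sketch below assumes that the 8 in the formula is the grid neighborhood including diagonals, and that only grid edges are counted; the helper names are hypothetical.

```python
from itertools import combinations


def grid_neighbors(node):
    """The 8 surrounding cells of `node` on an unbounded 2D grid."""
    x, y = node
    return [(x + dx, y + dy)
            for dx in (-1, 0, 1) for dy in (-1, 0, 1)
            if (dx, dy) != (0, 0)]


def adjacent(a, b):
    """Distinct cells are linked if they differ by at most 1 in each coordinate."""
    return a != b and abs(a[0] - b[0]) <= 1 and abs(a[1] - b[1]) <= 1


def local_clustering(node):
    """Return (edges among neighbors, degree, clustering coefficient)."""
    nbrs = grid_neighbors(node)
    k = len(nbrs)
    e = sum(1 for a, b in combinations(nbrs, 2) if adjacent(a, b))
    return e, k, 2 * e / (k * (k - 1))


e, k, c = local_clustering((0, 0))
print(e, k, round(c, 4))
```

Under these assumptions the count gives 12 edges among 8 neighbors, so 24/56, which is slightly below 0.43.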
What’s the diameter?
Logarithmic in n !
Proof:
Consider a graph where we contract 2x2 subgraphs into supernodes
Now we have 4 edges sticking out of each supernode
4-regular random graph!
Use theorem we magically know:
In a graph on n nodes with expansion α, for all pairs of nodes s and t there is a path of O((log n) / α) edges connecting them.
We can turn this into a path in the real graph by adding at most 2 steps per hop
Diameter of the model is O(2 log n), i.e. short paths exist!
Can a network with high clustering also be a small world?
Yes! Only need a few random links.
Searchable: search time $O((\log n)^{\beta})$
Not searchable: search time $O(n^{\alpha})$
Model: Use the 2D grid where each node has
one random edge
This is a small-world
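Let $I$ = interval of width $2x$ nodes around t (someone close to t)
Let $E_i$ = event that the long link out of the $i$-th visited node points into $I$
Then: $P(E_i) = 2x/n$ (haven’t seen node i yet, but can assume random edge generation)
Let: $E$ = event that any of the first $k$ visited nodes has a link into $I$
Then: $P(E) = P\left(\bigcup_{i=1}^{k} E_i\right) \le \sum_{i=1}^{k} P(E_i) = \frac{2kx}{n}$
Prob. of link to I: $P(E) \le \frac{2kx}{n}$
Need k, x s.t. $P(E) \le \frac{1}{2}$
Choose: $k = x = \frac{1}{2}\sqrt{n}$
So, $P(E) \le \frac{2\left(\frac{1}{2}\sqrt{n}\right)^2}{n} = \frac{1}{2}$
Suppose initial s is outside I and E does not happen.
Then the search algorithm must take $T \ge \min(k, x)$ steps to reach t:
Case when $T \ge k$: the search visits at least k nodes
Case when $T \ge x$: the search enters I by a grid edge and walks the x grid steps to t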
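The choice of k and x can be checked numerically. This is a small sketch of the arithmetic only, not a simulation; `union_bound` is a hypothetical helper that evaluates 2kx/n, and the final line assumes the search needs at least min(k, x) steps whenever E does not happen.

```python
import math


def union_bound(n, k, x):
    """Upper bound 2kx/n on P(E): some of the first k visited nodes links into I."""
    return 2 * k * x / n


n = 10_000
k = x = 0.5 * math.sqrt(n)          # the choice k = x = sqrt(n)/2
p_e = union_bound(n, k, x)
print("P(E) is at most", p_e)
# With probability at least 1 - P(E) the search needs min(k, x) steps.
print("expected steps are at least", (1 - p_e) * min(k, x))
```

For these values the bound on P(E) is 1/2, so the expected search time is at least a quarter of the square root of n.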
Kleinberg’s intuition:
Our long range links
are not random
They follow geography!
Saul Steinberg, “View of the World from 9th Avenue”
[Kleinberg, Nature ‘01]
Model
Nodes still on a grid
Node has one long range link
Prob. of long link to node v:
$P(u \to v) \sim d(u,v)^{-\alpha}$, i.e. $P(u \to v) = \frac{d(u,v)^{-\alpha}}{\sum_{w \ne u} d(u,w)^{-\alpha}}$
d(u,v): grid distance between u and v
α: parameter ≥ 0
[Figure: P(u→v) as a function of distance d, for α=0, α=1, and α >> 1]
Small α: too many long links
Big α: too many short links
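To see how α shifts probability mass between short and long links, the distribution can be tabulated directly. The sketch below is an illustration under assumptions: it uses a one-dimensional ring instead of the 2D grid to keep distances simple, and the function names are hypothetical.

```python
import random


def ring_distance(u, v, n):
    """Shortest distance between positions u and v on an n-node ring."""
    diff = abs(u - v)
    return min(diff, n - diff)


def long_link_distribution(u, n, alpha):
    """Return {v: P(u -> v)} with P proportional to d(u, v) ** -alpha."""
    weights = {v: ring_distance(u, v, n) ** -alpha for v in range(n) if v != u}
    total = sum(weights.values())          # normalizing sum over all w != u
    return {v: w / total for v, w in weights.items()}


def sample_long_link(u, n, alpha, rng):
    """Draw one long-range contact for u."""
    dist = long_link_distribution(u, n, alpha)
    nodes = list(dist)
    return rng.choices(nodes, weights=[dist[v] for v in nodes], k=1)[0]


for alpha in (0, 1, 2):
    dist = long_link_distribution(0, 101, alpha)
    print(f"alpha={alpha}: P(d=1)={dist[1]:.4f}  P(d=50)={dist[50]:.4f}")
```

With α=0 every node is equally likely; as α grows, nearly all of the mass moves to the nearest nodes.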
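One-dimensional case
Claim: For α=1, we can get from s to t in $O(\log^2 n)$ steps
Setting: the current node v is at distance d from t, i.e. d(v,t)=d
Set: $I$ = the $d$ nodes closest to t (d/2 on each side of t)
Normalizing sum over all possible distances j from 1 to n/2: $\sum_{w \ne v} d(v,w)^{-1} = 2\sum_{j=1}^{n/2} \frac{1}{j} \le 2 \ln n$ for large n
Note: $\sum_{j=1}^{n/2} \frac{1}{j} \le 1 + \int_{1}^{n/2} \frac{dx}{x} = 1 + \ln\frac{n}{2}$, so the sum grows like $\ln n$
So: $P(v \to w) = \frac{d(v,w)^{-1}}{\sum_{u \ne v} d(v,u)^{-1}} \ge \frac{d(v,w)^{-1}}{2 \ln n}$
We need P(v points to I):
$P(v \text{ points to } I) = \sum_{w \in I} P(v \to w) \ge \sum_{w \in I} \frac{d(v,w)^{-1}}{2 \ln n} = \frac{1}{2 \ln n} \sum_{w \in I} \frac{1}{d(v,w)} \ge \frac{1}{2 \ln n} \cdot d \cdot \frac{2}{3d} = \frac{1}{3 \ln n}$
All terms are ≥ 2/(3d). Note: the node x in I farthest from v has d(v,x)=3d/2
We have: $P(\text{long range link of } v \to I) \ge \frac{1}{3 \ln n} = \Omega\left(\frac{1}{\ln n}\right)$
Expect to need at most $3 \ln n = O(\ln n)$ steps to enter I
Getting to I halves the distance to the target (every node in I is within d/2 of t)
Distance can be halved at most $\log_2(n)$ times
So the expected number of steps is $O(\ln n) \cdot \log_2 n = O(\log^2 n)$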
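The effect of α on search time can also be explored with a small simulation. This is a sketch under assumptions, not the analysis itself: nodes sit on a ring, each has its two ring neighbors plus one long-range contact, and the contact is drawn only when a node is visited (greedy search never revisits a node, so this matches generating all links up front). The names `ring_distance` and `greedy_search` are hypothetical.

```python
import random


def ring_distance(u, v, n):
    """Shortest distance between positions u and v on an n-node ring."""
    diff = abs(u - v)
    return min(diff, n - diff)


def greedy_search(s, t, n, alpha, rng):
    """Number of greedy steps needed to get from s to t."""
    current, steps = s, 0
    while current != t:
        candidates = [w for w in range(n) if w != current]
        # Long-range contact: P(v -> w) proportional to d(v, w) ** -alpha.
        weights = [ring_distance(current, w, n) ** -alpha for w in candidates]
        long_contact = rng.choices(candidates, weights=weights, k=1)[0]
        options = [(current - 1) % n, (current + 1) % n, long_contact]
        # Greedy rule: move to the known contact that is closest to t.
        current = min(options, key=lambda w: ring_distance(w, t, n))
        steps += 1
    return steps


rng = random.Random(0)  # fixed seed so runs are repeatable
n, trials = 400, 20
for alpha in (0, 1, 2):
    total = sum(greedy_search(rng.randrange(n), rng.randrange(n), n, alpha, rng)
                for _ in range(trials))
    print(f"alpha={alpha}: average steps = {total / trials:.1f}")
```

A ring of 400 nodes is too small to show the asymptotic gap clearly; larger n and more trials give a cleaner picture at the cost of run time.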
We know:
α=0 (i.e., Watts-Strogatz): we need on the order of $\sqrt{n}$ steps
α=1: we need $T = O(\log^2 n)$ steps
[Figure: exponent β in $T = O(\log^{\beta} n)$ plotted against exponent α (axis ticks 0, 1, 2); label: Erdős–Rényi]
Does Kleinberg’s small-world model reflect reality?
Problem:
Non-uniform population density
Solution: Rank based friendship
[Liben-Nowell et al. ‘05]
$P(u \to v) = \mathrm{rank}_u(v)^{-\alpha}$
What is the best α?
For equally spaced pairs: α = dimension of the space
In this special case α=1 is best for search
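Rank-based friendship can be sketched as follows. The example assumes that rank_u(v) is the position of v when everyone else is sorted by distance from u, starting at 1 for the closest person, with ties broken by sort order; the function name, the tie rule, and the sample coordinates are all hypothetical.

```python
def rank_link_probs(u, positions, alpha=1.0):
    """Return {v: P(u -> v)} with P proportional to rank_u(v) ** -alpha."""
    ux, uy = positions[u]
    by_distance = sorted(
        (v for v in positions if v != u),
        key=lambda v: (positions[v][0] - ux) ** 2 + (positions[v][1] - uy) ** 2,
    )
    # Rank 1 is the closest person, rank 2 the next closest, and so on.
    weights = {v: rank ** -alpha for rank, v in enumerate(by_distance, start=1)}
    total = sum(weights.values())
    return {v: w / total for v, w in weights.items()}


# A dense cluster near u and one far-away person: only the ordering matters.
people = {"u": (0, 0), "a": (1, 0), "b": (0, 2), "c": (3, 3), "far": (500, 500)}
print(rank_link_probs("u", people))
```

Because only ranks enter the formula, moving the far-away person ten times farther out leaves every probability unchanged, which is how the rule adapts to non-uniform population density.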
[Liben-Nowell et al. ‘05]
Close to the theoretical optimum exponent of -1 (i.e., α=1)!
[Figure: prob. of link vs. distance in the hierarchy]
11/9/2019 Jure Leskovec, Stanford CS224W: Social and Information
Algorithmic consequence of small-world:
[Figure: identifier ring with nodes N21, N32, N38, N42 and keys K24, K30, K34]
Search algorithm:
Take the longest link that does not overshoot
This way with each step we halve the distance to the target
[Figure: identifier ring with nodes N1, N8, N14, N21, N32, N38, N42, N48, N51, N56]
Long links of N8:
N8+1 = N14
N8+2 = N14
N8+4 = N14
N8+8 = N21
N8+16 = N32
N8+32 = N42
Long links of N42:
N42+1 = N48
N42+2 = N48
N42+4 = N48
N42+8 = N51
N42+16 = N1
N42+32 = N14
Search for a key in the network of N nodes visits O(log N) nodes
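The rule "take the longest link that does not overshoot" can be sketched on the ring from the figures. The code below assumes a 64-position identifier ring (inferred from the largest offset, +32) and follows the lookup style of Chord-like distributed hash tables; the source does not name the protocol at this point, and all function names are hypothetical.

```python
from bisect import bisect_left

M = 6                      # identifier bits (assumed): 2**6 = 64 ring positions
RING = 2 ** M
NODES = sorted([1, 8, 14, 21, 32, 38, 42, 48, 51, 56])


def successor(ident):
    """First node at or clockwise after `ident` on the ring."""
    i = bisect_left(NODES, ident % RING)
    return NODES[i % len(NODES)]


def fingers(node):
    """Long links of `node`: successor(node + 2**i) for i = 0..M-1."""
    return [successor(node + 2 ** i) for i in range(M)]


def clockwise(a, b):
    """Clockwise distance from a to b."""
    return (b - a) % RING


def lookup(start, key):
    """Nodes visited while routing from `start` to the node responsible for `key`."""
    path, current, target = [start], start, successor(key)
    while current != target:
        # Links that move toward the key without passing it.
        ahead = [f for f in fingers(current)
                 if 0 < clockwise(current, f) <= clockwise(current, key)]
        if ahead:
            current = max(ahead, key=lambda f: clockwise(current, f))
        else:
            current = successor(current + 1)  # key sits just before the next node
        path.append(current)
    return path


print(fingers(8))
print(lookup(8, 30))
```

`fingers(8)` reproduces the links listed for N8 above, and each hop of `lookup` covers at least half of the remaining clockwise distance, which is where the O(log N) bound comes from.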
Hierarchy
h(u,v) = tree distance (height of the least common ancestor)
$P(u \to v) \sim b^{-\alpha\, h(u,v)}$
P(u→v) is approx. uniform at all scales of resolution
How many nodes are at dist. h? $(b-1)\,b^{h-1} \sim b^{h}$
[Figure: nodes/edges of the network]
Extension:
Multiple hierarchies – geography, profession, …
Generate separate random graph in each hierarchy
Superimpose the graphs
Search algorithm:
Choose a link that gets closest in any hierarchy
Q: How to analyze the model?
Simulations:
[Figure: search time vs. α]
Search works for a range of alphas
Biggest range of searchable alphas for 2 or 3 hierarchies
Too many hierarchies hurts