The Long Tail (Social Media Analytics): Deepayan Chakrabarti (deepay@utexas.edu)
[Figure: example graphs with node labels: degree = 1, degree = 3; in-degree = 6, out-degree = 3]
Average degree?
Median degree?
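The two questions above can be answered mechanically once the edges are written down. A minimal sketch, assuming a small made-up graph (the node names and edge lists are hypothetical, not the graph from the slide):

```python
from statistics import mean, median

# Hypothetical undirected friendship graph, stored as an edge list.
edges = [("A", "B"), ("A", "C"), ("A", "D"), ("B", "C")]

# Degree of a node = number of edges touching it.
degree = {}
for u, v in edges:
    degree[u] = degree.get(u, 0) + 1
    degree[v] = degree.get(v, 0) + 1

# Hypothetical directed "follows" graph: (follower, followed).
follows = [("B", "A"), ("C", "A"), ("D", "A"), ("A", "B")]
in_degree, out_degree = {}, {}
for src, dst in follows:
    out_degree[src] = out_degree.get(src, 0) + 1  # edges leaving src
    in_degree[dst] = in_degree.get(dst, 0) + 1    # edges arriving at dst

avg_degree = mean(degree.values())
med_degree = median(degree.values())
print(degree, avg_degree, med_degree)
```

In a graph this small the average and the median agree; the rest of the deck is about cases where they do not.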
If I told you:
The average height of human males is 5’10”
I pick someone at random
What is this person’s height?
A guess of 5’10” is probably not too far from the truth
Degrees and Distributions
Why?
[Figure: how many males have each height, with the average marked]
Most towns have low population (e.g., Duffield, VA, pop. 89),
but there are many atypical cities (e.g., New York City)
The distribution is not symmetric: “right skew”
[Figure: distribution of city population]
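Why the average is a good guess for heights but a poor one for city sizes can be seen by comparing mean and median on synthetic data. A rough sketch, assuming heights are roughly normal around 70 inches and town sizes follow a Pareto distribution; the seed, the standard deviation, the Pareto shape 1.2 and the scale 100 are all illustrative assumptions, not values from the slides:

```python
import random
from statistics import mean, median

random.seed(0)  # fixed seed so the sketch is repeatable

# Symmetric case: heights in inches, assumed normal around 70 (5'10") with sd 3.
heights = [random.gauss(70, 3) for _ in range(10_000)]

# Right-skewed case: town sizes, assumed Pareto-distributed (illustrative only).
populations = [100 * random.paretovariate(1.2) for _ in range(10_000)]

# For symmetric data the mean and the median nearly coincide.
height_gap = abs(mean(heights) - median(heights))

# For right-skewed data a few huge cities pull the mean above the median.
pop_mean, pop_median = mean(populations), median(populations)
share_below_mean = sum(p < pop_mean for p in populations) / len(populations)

print(height_gap, pop_mean, pop_median, share_below_mean)
```

Under these assumptions most towns sit below the average population, so "the average town" describes very few actual towns.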
Most users have very few followers (e.g., me),
but several are extremely popular (e.g., Justin Bieber)
[Figure: distribution of node degree (e.g., number of Twitter followers)]
[Figure: percentage of cities vs. city population, plotted on log-log scale]
[Figure: percentage of users vs. popularity (degree), plotted on log-log scale]
Power Laws
[Table: city population (x axis), percentage of cities (y axis)]
[Figure: percentage of users vs. popularity (degree)]
The log-log plot shows the tail better
Data shows a “line on log-log scales”
[Figure: percentage of users (y) vs. popularity (x) on log-log scales, with the slope of the line marked]
If popularity = x, what is the corresponding % of users y?
$y \sim 1/x^{\text{slope}}$
$y = c \cdot x^{-\text{slope}}$
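Taking logs of y = c · x^(-slope) gives log y = log c − slope · log x, which is why the data falls on a line on log-log scales. A minimal sketch of reading the slope back off such a line with ordinary least squares; the helper `loglog_slope` and the values c = 0.5 and slope = 2 are assumptions made for illustration, and on real, noisy data this gives only a rough estimate:

```python
import math

def loglog_slope(xs, ys):
    """Least-squares slope of log(y) against log(x)."""
    lx = [math.log(x) for x in xs]
    ly = [math.log(y) for y in ys]
    mx, my = sum(lx) / len(lx), sum(ly) / len(ly)
    num = sum((a - mx) * (b - my) for a, b in zip(lx, ly))
    den = sum((a - mx) ** 2 for a in lx)
    return num / den

# Synthetic data (assumed values): y = c * x^(-slope) with c = 0.5, slope = 2.
c, slope = 0.5, 2.0
xs = [1, 2, 4, 8, 16, 32, 64, 128, 256]
ys = [c * x ** (-slope) for x in xs]

fitted = loglog_slope(xs, ys)  # slope of the line on log-log axes (negative)
exponent = -fitted             # the "slope" in y = c * x^(-slope)
print(exponent)
```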
[Figure: fraction of users vs. popularity, for α=4 and α=2]
[Figure: popularity vs. rank, with Justin Bieber and me marked]
[Figure, two panels. Statistical sense: probability vs. popularity. Business sense: popularity vs. rank, with the “Head” and the “Long tail” marked. Me and Justin Bieber are marked in both panels]
[Figure: rank of word (x axis, 20 to 100)]
Chris Anderson, ‘The Long Tail’, Wired, Issue 12.10, October 2004
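The statistical view (probability vs. popularity) and the business view (popularity vs. rank) are two ways of arranging the same numbers. A small sketch, assuming ten made-up follower counts and an arbitrary cut-off of two ranks for the "head":

```python
from collections import Counter

# Hypothetical follower counts for ten users.
followers = [1, 1, 1, 1, 2, 2, 3, 5, 40, 900]

# Statistical view: fraction of users at each popularity value.
counts = Counter(followers)
fraction = {k: n / len(followers) for k, n in sorted(counts.items())}

# Business view: popularity by rank, most popular first.
ranked = sorted(followers, reverse=True)

# Share of all followers held by the "head" (assumed here: the top 2 ranks).
head = ranked[:2]
head_share = sum(head) / sum(ranked)

print(fraction)
print(ranked)
print(head_share)
```

Most users sit at the low-popularity end of the statistical view, and the same users form the long tail at the high-rank end of the business view.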
Business implications
The long tail has always existed
So why is it important now?
Outline
Degrees and distributions
Power Laws
Business implications
Preferential attachment
Start with a small village with m0 villagers
everyone knows everyone
New people arrive one at a time
Each forms exactly m friendships with existing villagers
They prefer to connect to the popular villagers
preferential attachment
The rich get richer
How can we end up with power law degree distributions?
$\text{Prob}(\text{connection to } v) = \dfrac{\text{degree of } v}{\text{total degree of all villagers}}$
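The village model and the connection probability above can be sketched as a short simulation. This is a rough sketch rather than a definitive implementation: the function name `grow_village`, the defaults m0 = 3, m = 2, 200 arrivals and the seed are assumptions, and redrawing until the m friends are distinct is a simplification that the slides do not specify.

```python
import random

def grow_village(m0=3, m=2, n_new=200, seed=0):
    """Grow a village by preferential attachment; assumes m <= m0."""
    rng = random.Random(seed)
    # Start: m0 villagers, everyone knows everyone.
    degree = {v: m0 - 1 for v in range(m0)}
    # Each villager appears in this list once per friendship, so a uniform
    # draw from it picks v with probability degree(v) / total degree.
    endpoints = [v for v in range(m0) for _ in range(m0 - 1)]
    for new in range(m0, m0 + n_new):
        # The newcomer picks m distinct friends, redrawing on repeats.
        friends = set()
        while len(friends) < m:
            friends.add(rng.choice(endpoints))
        degree[new] = m
        for f in friends:
            degree[f] += 1
            endpoints.extend([f, new])  # record both ends of the friendship
    return degree

degree = grow_village()
top = sorted(degree.values(), reverse=True)[:5]
print(top)  # the most popular villagers
```

Early villagers tend to collect the most friendships, since every friendship they gain raises their chance of gaining the next one.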
A new product
(say, a video app for iOS, in the early days of the iPhone)
gets a few users
who rate it, and it goes to the top among all video apps
New users can find it and buy it more easily
Even more popular
The cycle continues