Professional Documents
Culture Documents
2013 SIGMETRICS Heavytails
2013 SIGMETRICS Heavytails
2013 SIGMETRICS Heavytails
BUT
Similar stories in
electricity nets,
citation nets,
2005, ToN
1. Properties
2. Emergence
3. Identification
Pr ( > )
Exponential:
Pr ( > )
Exponential:
>
= ( )=
density:
( )=
for
Pr ( > )
Exponential:
: log
Pr ( > )
Exponential:
Scale invariance
Scale invariance
is scale invariant if there exists an and a such that
=
( ) for all , such that
.
change of scale
Scale invariance
is scale invariant if there exists an and a such that
=
( ) for all , such that
.
Theorem: A distribution is scale invariant if and only if it is Pareto.
Example: Pareto distributions
=
Scale invariance
is scale invariant if there exists an and a such that
=
( ) for all , such that
.
( )
for all .
( )
( )
for all .
( )
A thought experiment
During lecture I polled my 50 students about their heights and
the number of twitter followers they have
The sum of the heights was ~300 feet.
The sum of the number of twitter followers was 1,025,000
A thought experiment
During lecture I polled my 50 students about their heights and
the number of twitter followers they have
The sum of the heights was ~300 feet.
The sum of the number of twitter followers was 1,025,000
Conspiracy principle
Catastrophe principle
Example
Consider + i.i.d Weibull.
Given +
= , what is the marginal density of
Light-tailed Weibull
( )
Conspiracy principle
Exponential
0
Heavy-tailed Weibull
Catastrophe principle
Subexponential distributions
is subexponential if for i.i.d.
, Pr
+ +
>
> )
Heavy-tailed
Weibull
Pareto
Regularly Varying
Subexponential
LogNormal
Subexponential distributions
is subexponential if for i.i.d.
, Pr
+ +
>
> )
A thought experiment
residual life
)
( )
Examples:
Exponential :
Pareto :
=
=
= 1+
memoryless
Increasing in x
)
( )
Hazard rate
>
(0)
Long-tailed distributions
is long-tailed if lim
= lim
)
( )
= 1 for all
Long-tailed distributions
is long-tailed if lim
= lim
Long-tailed distributions
= 1 for all
Heavy-tailed
Regularly Varying
Subexponential
)
( )
Weibull
Pareto
LogNormal
Heavy-tailed
Weibull
Pareto
Regularly Varying
Subexponential
Long-tailed distributions
LogNormal
Difficult to work
with in general
Heavy-tailed
Weibull
Pareto
Regularly Varying
Subexponential
Long-tailed distributions
LogNormal
Difficult to work
with in general
Catastrophe principle
Heavy-tailed
Weibull
Pareto
Regularly Varying
Subexponential
Long-tailed distributions
LogNormal
Difficult to work
with in general
Heavy-tailed
Weibull
Pareto
Regularly Varying
Subexponential
Long-tailed distributions
LogNormal
Difficult to work
with in general
1. Properties
2. Emergence
3. Identification
A quick review
Consider i.i.d.
. How does
grow?
= 15
. . when
+ ( )
= 300
<
A quick review
Consider i.i.d.
. How does
grow?
1
(0,
when Var
[ ]
+ (
< .
A quick review
Consider i.i.d.
. How does
grow?
1
(0,
when Var
[ ]
+ (
< .
What if
A quick review
Consider i.i.d.
. How does
grow?
1
(0,
when Var
[ ]
+ (
< .
What if
A quick review
Consider i.i.d.
. How does
grow?
1
(0,
when Var
+
= 300
[ ]
+ (
< .
What if
A quick review
Consider i.i.d.
. How does
grow?
1
(0,
when Var
< .
(0,
)
(0,2)
+
+ (
What if
A quick review
Consider i.i.d.
. How does
grow?
1
when Var
(0,
< .
(0,
)
(0,2)
. How does
grow?
Either the Normal distribution OR
a power-law distribution can emerge!
. How does
grow?
Either the Normal distribution OR
a power-law distribution can emerge!
but this isnt the only question one can ask about
>
1. Additive Processes
Heavy-tails are more normal than the Normal! 2. Multiplicative Processes
3. Extremal Processes
, where
( )
>
Pr
=
>
is heavy-tailed!
log
= log
+ log
+ + log
, where
when Var
(0,
< .
(0,
where =
and Var log
)
]
< .
(0,
where =
and Var log
)
]
< .
(0,
where =
and Var log
)
]
< .
, where
Heavy-tails
, where
, where
( )
where
such that
= sup ( 0|
1)
1. Additive Processes
Heavy-tails are more normal than the Normal! 2. Multiplicative Processes
3. Extremal Processes
,,
= max(
,,
How does
scale?
A simple example
Pr max
( )
,,
= 1,
>
= log
= 1e
= 1e
: Gumbel distribution
= max(
,,
How does
scale?
Heavy-tailed
Heavy or light-tailed
Light-tailed
= max(
,,
How does
scale?
,,
.
What is the distribution of the
time until a new record is set?
1. Additive Processes
Heavy-tails are more normal than the Normal! 2. Multiplicative Processes
3. Extremal Processes
1. Properties
2. Emergence
3. Identification
BUT
Similar stories in
electricity nets,
citation nets,
2005, ToN
frequency
value
frequency plot
Log(frequency)
log-linear scale
value
log-linear scale
Linear
Exponential tail
log
= log
Heavy-tailed
or light-tailed?
log
= log
( + 1) log
log-log scale
Linear
Power-law tail
log-linear scale
Linear
Exponential tail
log
= log
log
= log
( + 1) log
log-log scale
Linear
Power-law tail
Regression
Estimate of tail index ( )
log
= log
( + 1) log
log-log scale
Linear
Power-law tail
Is it really linear?
Is the estimate of accurate?
Regression
Estimate of tail index ( )
Pr
>
log
log-logscale
scale
log-log
Pr
>
log-log scale
= 1.9
= 1.4
rank plot
frequency plot
True
= log
= 2.0
( + 1) log
Pr
>
log-logscale
scale
log-log
rank plot
frequency plot
Pr
>
log-log scale
rank plot
The data is from an Exponential!
frequency plot
Pr
>
log-linear scale
=3
(from Science)
rank plot
frequency plot
Pr
>
log-log scale
= 1.7
= 1.1
(from Science)
rank plot
frequency plot
Pr
>
Linear
Power-law tail
other distributions can be nearly linear too
rank plot
Regression
Estimate of tail index ( )
log-log scale
Lognormal
log-log scale
Weibull
Pr
>
Linear
Power-law tail
other distributions can be nearly linear too
rank plot
Regression
Estimate of tail index ( )
assumptions of regression are not met
tail is much noisier than the body
log ( ; ) =
log
log
Maximizing gives
log( /
is asymptotically efficient.
not so
A completely different approach: Maximum Likelihood Estimation (MLE)
Weighted Least Squares Regression (WLS)
asymptotically for large data sets, when weights are
chosen as = 1/ (log log ).
log( / )
=
log( / )
log( / )
not so
A completely different approach: Maximum Likelihood Estimation (MLE)
Weighted Least Squares Regression (WLS)
asymptotically for large data sets, when weights are
chosen as = 1/ (log log ).
log-log scale
WLS,
= 2.2
LS,
= 1.5
true plot
= 2.0
rank
rank plot
Impossible to answer
An example
Suppose we have a mixture of power laws:
=
+ 1
<
We want
as .
but, suppose we use
as our cutoff:
1
(1 )
(
MLE/WLS
v.s.
Identifying power-law tails
+ 1
<
We want
as .
but, suppose we use
as our cutoff:
1
(1 )
(
The idea: Improve robustness by throwing away nearly all the data!
, where
as
+ Larger
- Larger
( ) Small bias
( ) Larger variance
The idea: Improve robustness by throwing away nearly all the data!
, where
as
where
( ) is the
log
The idea: Improve robustness by throwing away nearly all the data!
, where
as
where
( ) is the
log
how do we choose ?
as if
/ 0 &
The idea: Improve robustness by throwing away nearly all the data!
, where
as
where
( ) is the
log
how do we choose ?
as if
/ 0 &
Exponential
=2
2.4
1.6
log-log scale
=2
2.5
2.4
1.5
1.6
Mixture,
with Pareto-tail,
=2
= 1.4
= 0.92
1
0.5
= 300
= 12000
MLE/WLS
Hill estimator
1. Properties
2. Emergence
3. Identification
1. Properties
2. Emergence
Heavy-tailed
3. Identification
Weibull
Pareto
Regularly Varying
Subexponential
Long-tailed distributions
LogNormal
1. Properties
2. Emergence
3. Identification
1. Properties
2. Emergence
3. Identification