Internet Traffic: Statistical Multiplexing Gains

Jin Cao, William S. Cleveland, Dong Lin, and Don X. Sun

I. INTRODUCTION

This note describes recent results on the effect of statistical multiplexing on the long-range dependence (LRD) of Internet packet traffic. Details and bibliographies may be found in a series of papers at http://cm.bell-

Let $a_u$, for $u = 1, 2, \ldots$, be the arrival times of packets on an Internet link, let $t_u = a_{u+1} - a_u$ be the inter-arrival times, and let $q_u$ be the packet sizes. Suppose we divide time up into equal-length, consecutive intervals. Let $p_i$ be the packet count in interval $i$. The $t_u$ and $q_u$ are studied as time series in $u$, and the $p_i$ as a time series in $i$.

The packet traffic on a link is the result of the statistical multiplexing of packets from different active connections. Let $c$ be the mean number of active connections over an interval of time during which link usage is stationary. $c$ serves as a measure of the magnitude of statistical multiplexing over the interval. This note addresses the effect of an increasing $c$ on the LRD of the three traffic variables $t_u$, $q_u$, and $p_i$.

Results are based on the following: (1) the mathematical theory of marked point processes; (2) empirical study of live packet traces from 15 interfaces whose 5-min average traffic rates range from about 2 kbps to 250 Mbps; (3) empirical study of synthetic packet traces from network simulation with NS; (4) simple statistical models, FSD models and FSD-MA(1) models, for the traffic variables.

II. THEORY

Theory is easy and convincing for a link on an over-provisioned network where $c$ is never so big that there is more than minor queueing. As $c$ increases, $a_u$ tends toward a Poisson process, so $t_u$ tends toward independence. As $c$ increases, $q_u$ also tends toward independence. We use the word toward because $t_u$ and $q_u$ each always has an LRD component, but the contribution of the component to the variability of the time series gets less and less.

The correlation structure of $p_i$, however, does not change, so LRD is preserved in the counts. (This might seem like a contradiction to the results for $t_u$; later we describe how the two results fit together.) The coefficient of variation (standard deviation divided by the mean) of $p_i$ goes to zero like $1/\sqrt{c}$. This means that the excursions of $p_i$ above or below the mean, which last for long periods of time because of the LRD, get smaller and smaller in magnitude relative to the mean. While the LRD of the $p_i$ is unchanging in the sense that the autocorrelation is unchanging, eventually the LRD ceases to be salient because the variability of the $p_i$ gets small.

The over-provision theory applies to a link for the range of values of $c$ that are small enough that queueing at the input router and nearby upstream devices is not substantial. Upstream queueing of a sufficiently large magnitude invalidates the assumptions that lead to the over-provision results. A more sophisticated theory to handle the queueing does not exist. Rough arguments suggest that $t_u$ and $q_u$ should still tend toward independence, that the LRD of the $p_i$ is eventually altered, and that the coefficient of variation eventually stabilizes to a small positive constant. But we need to appeal to empirical study of live traffic from the Internet, and synthetic link traffic from simulations, to resolve the uncertainty.

III. FSD AND FSD-MA(1) STATISTICAL MODELS

We found that very simple statistical time series models, which we call fractional sum difference (FSD) models and FSD-MA(1) models, provide an excellent fit to the $t_u$, $q_u$, and $p_i$ for the live and synthetic link packet traces.

The first step in the modeling is to transform $t_u$ and $q_u$ to bring their marginal distributions closer to Gaussian. We transformed the $t_u$ by sixth roots and the $q_u$ by logs. The three time series, two transformed and one not, are then normalized by subtracting the sample mean and dividing by the sample standard deviation. Let $t_u^*$, $q_u^*$, and $p_i^*$ denote the normalized time series. (We carried out our analyses without the transformation, and the results were similar, but the transformation puts the statistical modeling on a more rigorous basis.)

Let $z_u$ be an FSD time series, normalized to have mean 0 and variance 1. Then

$$z_u = \sqrt{1-\theta}\, s_u + \sqrt{\theta}\, n_u$$

where $s_u$ and $n_u$ are independent of one another and each has mean 0 and variance 1. $n_u$ is white noise, that is, an uncorrelated time series. $s_u$ is a fractional ARIMA model

$$(I - B)^d s_u = \epsilon_u + \epsilon_{u-1}$$

where $B s_u = s_{u-1}$, $0 < d < 0.5$, and $\epsilon_u$ is white noise with mean 0 and a variance

$$\sigma_\epsilon^2 = \frac{(1-d)\,\Gamma^2(1-d)}{2\,\Gamma(1-2d)}$$

chosen to make the variance of $s_u$ equal to 1.

Let $r_s(k)$ and $r_z(k)$ be the autocorrelation functions of $s_u$ and $z_u$ respectively for lags $k = 0, 1, \ldots$. $s_u$ has LRD, and $r_s(k)$ falls off like $k^{2d-1}$ and increases at all positive lags as $d$ increases. The autocorrelation function of $z_u$ at lags $k > 0$ is

$$r_z(k) = (1-\theta)\, r_s(k)$$

Thus $z_u$, whose variance is 1, is the sum of the correlated component $\sqrt{1-\theta}\, s_u$, whose variance is $1-\theta$, and the uncorrelated component $\sqrt{\theta}\, n_u$, whose variance is $\theta$. The dependence decreases as $\theta$ increases to 1; if $1-\theta$ decreases by a certain factor, then all $r_z(k)$ for $k > 0$ decrease by the factor. Finally, when $\theta = 1$, $z_u$ is white noise.
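
As a numerical check on the FSD definitions above, the following Python sketch evaluates the innovation variance, the autocorrelation $r_s(k)$ of the fractional ARIMA component, and $r_z(k) = (1-\theta)\, r_s(k)$. It is only a sketch under stated assumptions: it relies on the standard autocorrelation recursion for fractionally differenced noise, $\rho(k) = \rho(k-1)(k-1+d)/(k-d)$, which is not stated in this note, and the function names and the value $\theta = 0.5$ are illustrative choices rather than anything taken from the traces.

```python
import math

def innovation_variance(d):
    # Variance of eps_u that gives s_u unit variance (formula from the text).
    return (1 - d) * math.gamma(1 - d) ** 2 / (2 * math.gamma(1 - 2 * d))

def fsd_autocorrelation(d, theta, max_lag):
    """Return (var_s, r_s, r_z) for lags 0..max_lag; names are illustrative."""
    # Assumed standard result: x_u = (I - B)^(-d) eps_u has autocorrelation
    # rho(k) = rho(k-1) * (k - 1 + d) / (k - d) and variance
    # sigma^2 * Gamma(1 - 2d) / Gamma(1 - d)^2.
    rho = [1.0]
    for k in range(1, max_lag + 2):
        rho.append(rho[-1] * (k - 1 + d) / (k - d))
    var_x = innovation_variance(d) * math.gamma(1 - 2 * d) / math.gamma(1 - d) ** 2
    # s_u = x_u + x_{u-1}, so cov_s(k) = var_x * (2 rho(k) + rho(k-1) + rho(k+1)).
    cov_s = [var_x * (2 * rho[k] + rho[abs(k - 1)] + rho[k + 1])
             for k in range(max_lag + 1)]
    r_s = [c / cov_s[0] for c in cov_s]
    # The white-noise component only contributes at lag 0.
    r_z = [1.0] + [(1 - theta) * r for r in r_s[1:]]
    return cov_s[0], r_s, r_z

var_s, r_s, r_z = fsd_autocorrelation(d=0.4, theta=0.5, max_lag=1000)
print(var_s, r_s[1], r_z[1], r_s[1000] / r_s[500])
```

With $d = 0.4$ the unit variance of $s_u$ is recovered and $r_s(1) = 0.875$. Doubling a large lag multiplies $r_s(k)$ by roughly $2^{2d-1} \approx 0.87$, which is consistent with the $k^{2d-1}$ decay.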

The power spectrum of $z_u$ is

$$p_z(f) = (1-\theta)\, \sigma_\epsilon^2\, \frac{4\cos^2(\pi f)}{\left[4\sin^2(\pi f)\right]^{d}} + \theta$$

where the frequency $f$ has units cycles/inter-arrival for $t_u$, cycles/packet for $q_u$, and cycles/interval-length for $p_i$, and where $0 \le f \le 0.5$. $p_z(f)$ decreases monotonically as $f$ increases. $p_z(f)$ goes to infinity at $f = 0$ like $\sin^{-2d}(\pi f)$, so if $\theta < 1$, no matter how close $\theta$ gets to 1, $p_z(f)$ gets arbitrarily large near $f = 0$, but its ascent begins closer and closer to 0 as $\theta$ gets closer to 1.

The FSD-MA(1) model is

$$z_u = \sqrt{1-\theta}\, s_u + \sqrt{\theta}\, n_u$$

similar to the FSD model, but where $n_u$, instead of white noise, is a first-order moving average

$$n_u = \nu_u + \beta\, \nu_{u-1}$$

where $\nu_u$ is Gaussian white noise with mean 0 and variance $(1+\beta^2)^{-1}$, which makes the variance of $n_u$ equal to 1. If $\beta = 0$, the moving-average component is white noise, so the model is simply an FSD.

For the live and synthetic packet traces, the FSD model provides an excellent fit to $q_u^*$ and $p_i^*$, and the FSD-MA(1) to the $t_u^*$. Values of $c$ have a range of 20–8200 connections for the live traces and 1–215 connections for the synthetic traces. The $t_u^*$ and $q_u^*$ for very small $c$ can have effects not captured by the modeling, but even here the models serve as an excellent approximation for studying LRD. We found that a $d$ of about 0.4 provides an excellent fit for all three traffic variables and for all $c$. Estimates of $\theta$ are less than 1 for all three traffic variables. In other words, there are elements of LRD in all variables at all $c$.

As the trace value of $c$ increases, $\theta$ tends to 1 for $t_u^*$ and $q_u^*$, so there is a clear reduction in LRD. This means the $q_u^*$ tend toward independence. For $t_u^*$ from some interfaces, the FSD-MA(1) often has a $\beta$ significantly greater than zero, so the $t_u^*$ tend toward short-range dependence in these cases. This is likely caused by upstream queueing at these interfaces. For other interfaces, $\beta = 0$ provides a good fit, so the model is an FSD, and the $t_u^*$ tend toward independence.

For $p_i^*$, $\theta$ shows no consistent change, so the autocorrelation, and therefore the LRD, of the $p_i^*$ shows no consistent change with $c$. The coefficient of variation at all interfaces declines at quite close to the rate $1/\sqrt{c}$ predicted by the over-provision theory. The unchanging LRD and the decline of the coefficient of variation occur, surprisingly, even at the interfaces with substantial queueing. However, the count interval length is 100 ms, and it is possible that for smaller intervals an alteration would occur.

V. $t_u$ VS. $p_i$

Theory, empirical study, and simulation study show LRD dissipates for the inter-arrival times $t_u$ but does not change for the arrival counts $p_i$ in fixed-length intervals of time. This appears to be a contradiction. This is a case where the formal mathematics yields an unequivocal proof, but where we need a heuristic argument for better understanding. We will do this using the over-provision theory.

First, we fix the interval length $\Delta$ used for the definition of $p_i$. Instead of considering $p_i$, we study $p_i/\Delta$, just a change of units, which are now packets per sec or p/s. Consider the process

$$t_v^{(b)} = b^{-1}\left(a_{vb+1} - a_{(v-1)b+1}\right)$$

for $v = 1, 2, \ldots$ These are the inter-arrival times per packet of blocks of $b$ packets in units of s/p. To relate the $t_v^{(b)}$ to $p_i/\Delta$, we need to make the $b\, t_v^{(b)}$ vary around the interval length $\Delta$. Suppose there are $m$ traffic sources, each with a mean inter-arrival time $\mu$, that are multiplexed to form the link traffic. Then the mean of $t_u$ is $\mu/m$, and the mean of $b\, t_v^{(b)}$ is $b\mu/m$. Thus we take $b = m\Delta/\mu$. This means $b$ increases with the magnitude of the multiplexing. We could take $1/t_v^{(b)}$, with units p/s, to be another measure of packets per second and consider its properties as a surrogate for $p_i/\Delta$ to resolve our contradiction. But the math would be too hard. Instead we take $t_v^{(b)}$ itself as the surrogate.

Our surrogate has a very simple dependence on the $t_u$:

$$t_v^{(b)} = b^{-1} \sum_{u=(v-1)b+1}^{vb} t_u$$

The operation of going from $t_u$ to $t_v^{(b)}$ is a low-pass linear filtering of $t_u$ followed by taking every $b$-th value. As $m$ increases, so does $b$, and the frequency band that is passed gets closer and closer to 0. But as we have seen, $t_u$ has an LRD component that affects the power spectrum over a band closer and closer to zero as $m$ increases. The decreasing pass band of the operation increasingly filters out more of the noise process, $\sqrt{\theta}\, n_u$, than the LRD component $\sqrt{1-\theta}\, s_u$, and effectively boosts the smaller LRD component for $t_v^{(b)}$ for a larger $m$ back to where it was for a smaller value of $m$.

VI. OUTCOMES

Recent and current work is showing, as one would expect, that these results have important implications for traffic engineering. If we fix a buffer size and fix an amount of traffic as measured by the connection load $c$, then the link speed needed to achieve a fixed QoS criterion, such as 0.5% packet loss, results in a utilization that increases dramatically with $c$. In other words, utilizations can increase dramatically from the edges to the core. More broadly, the results show that engineering studies that are meant to apply to the Internet as a whole, and that use synthetic or live packet traffic to assess performance, need to consider packet traces varying across a wide range of magnitudes of statistical multiplexing in order to achieve generality.
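
The low-pass argument of Section V can be illustrated with a small Python sketch. It computes, exactly from a given autocorrelation function, the lag-1 autocorrelation of non-overlapping block means of length $b$ for a generic FSD series $z_u = \sqrt{1-\theta}\, s_u + \sqrt{\theta}\, n_u$. This is a sketch under assumptions, not the authors' analysis: it treats the FSD series directly rather than real inter-arrival times, it uses the standard autocorrelation recursion for fractionally differenced noise (not stated in this note), the function names are hypothetical, and $\theta = 0.9$ is an illustrative value (only $d = 0.4$ comes from the text).

```python
def fsd_acf(d, theta, max_lag):
    """Autocorrelation r_z(k), k = 0..max_lag, of an FSD series (illustrative)."""
    # Assumed standard recursion for fractionally differenced noise.
    rho = [1.0]
    for k in range(1, max_lag + 2):
        rho.append(rho[-1] * (k - 1 + d) / (k - d))
    # s_u = x_u + x_{u-1}: covariance is proportional to 2 rho(k) + rho(k-1) + rho(k+1).
    g = [2 * rho[k] + rho[abs(k - 1)] + rho[k + 1] for k in range(max_lag + 1)]
    # The white-noise component contributes only at lag 0.
    return [1.0] + [(1 - theta) * gk / g[0] for gk in g[1:]]

def block_mean_lag1_autocorr(r, b):
    """Lag-1 autocorrelation of non-overlapping block means of length b.

    r[k] is the autocorrelation at lag k; requires len(r) >= 2 * b.
    """
    # Sum of r(|i - j|) over all pairs inside one block (common 1/b^2 cancels).
    var = b * r[0] + 2 * sum((b - k) * r[k] for k in range(1, b))
    # Sum of r(j - i) over pairs with i in one block and j in the next block:
    # lag k occurs min(k, 2b - k) times.
    cov = sum(min(k, 2 * b - k) * r[k] for k in range(1, 2 * b))
    return cov / var

r_z = fsd_acf(d=0.4, theta=0.9, max_lag=200)
for b in (1, 10, 100):
    print(b, block_mean_lag1_autocorr(r_z, b))
```

For $b = 1$ the function returns $r_z(1)$ itself. As $b$ grows, the white-noise share of the block-mean variance shrinks like $1/b$ while the LRD share shrinks far more slowly, so the block means become more strongly correlated, which is the boosting effect described in Section V.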

