Professional Documents
Culture Documents
AUGURY: A Time-Series Based Application For The Analysis and Forecasting of System and Network Performance Metrics
AUGURY: A Time-Series Based Application For The Analysis and Forecasting of System and Network Performance Metrics
AUGURY: A Time-Series Based Application For The Analysis and Forecasting of System and Network Performance Metrics
Email: manuela.wiesinger@risc-software.at
Abstract—This paper presents AUGURY, an application for input data sets. The output is an aggregated data format, which
the analysis of monitoring data from computers, servers or is illustrated in this paper using snapshots, representing the
cloud infrastructures. The analysis is based on the extraction status of this output at some particular point in time.
of patterns and trends from historical data, using elements of
time-series analysis. The purpose of AUGURY is to aid a server AUGURY uses time series analysis in two ways: to extract
administrator by forecasting the behaviour and resource usage of the parameters for a model intended to represent the memory
specific applications and in presenting a status report in a concise usage of a single application, and to extract the seasonal
manner. AUGURY provides tools for identifying network traffic component from the network traffic data. Building models of
congestion and peak usage times, and for making memory usage
the memory usage of single applications enables projections
projections. The application data processing specialises in two
tasks: the parametrisation of the memory usage of individual of overall memory usage. Forecasting the memory usage for
applications and the extraction of the seasonal component from a specific time window subsequently only requires summing
network traffic data. AUGURY uses a different underlying up the models for all application executions within that time
assumption for each of these two tasks. With respect to the frame. By obtaining a seasonal component, a forecast for
memory usage, a limited number of single-valued parameters
that number of executions is also acquired. Combining both,
are assumed to be sufficient to parameterize any application
being hosted on the server. Regarding the network traffic data, a network traffic congestion and peak usage times can be
long-term patterns, such as hourly or daily exist and are being identified, studied and expressed in terms of memory-usage
induced by work-time schedules and automatised administrative projections.
jobs. In this paper, the implementation of each of the two tasks is A brief list of related work, in the context of monitoring
presented, tested using locally-generated data, and applied to data
from weather forecasting applications hosted on a web server.
and/or forecasting of system and network performance metrics,
This data is used to demonstrate the insight that AUGURY can is presented in the following. Refs. [8], [9], [10], [11], [12],
add to the monitoring of server and cloud infrastructures. [13] also approach this topic using time-series analysis, but
concentrate on the accuracy of the forecast. In contrast, in
I. I NTRODUCTION this work, the focus is on the intuitiveness and usability of
The PIPES-VS-DAMS framework [1] for the monitoring of the time-series analysis outcome, from the perspective of a
cloud infrastructures supervises various system and network system administrator. A variety of network traffic monitoring
performance metrics. Vast amounts of this data are available tools exist1 , yet AUGURY is novel in that it focuses on the
for studying. Analysing it and extracting regular patterns analysis and visualisation of seasonal patterns.
could help to identify hazardous trends. These patterns can This paper is organised as follows. First, in Section II
be used to plan and optimise the usage of resources, and to the features of the data samples are discussed. Then, in
highlight extraordinary events requiring detailed attention. In Section III, a well-known seasonal-pattern extraction method
this paper, AUGURY, an application for the extraction of such is introduced in parallel with the elements from time series
trends and patterns in data from computers, servers or cloud analysis upon which it is based on. In Section IV the memory-
infrastructures is presented. usage model and parameter-extraction algorithm are described.
AUGURY is able to handle various input data formats. Then, the specifications of AUGURY and the packages it
Internally, a unified data structure exists, which is transformed uses are discussed in Section V. The tests performed using
according to the use case, and most importantly, for the appli- locally-generated data are presented in Section VI and the
cation of time-series analysis techniques. AUGURY contains results based on real-world data are shown in Section VII.
queries for accessing subsets of the internal data structure and
for modifying it. It also handles overlaps and gaps between 1 http://www.cs.wustl.edu/∼jain/cse567-06/ftp/net traffic monitors3/
An auxiliary functionality provided by AUGURY for the The MA can be defined in multiple ways, according to the
modelling of residuals is discussed in Section VIII. Finally, weights assigned to each of the lagged values. In general, any
in Section IX, the conclusions are presented. MA is given by:
II. DATA SAMPLES wt ∗ yt + . . . + wt−(N −1) ∗ yt−(N −1)
MA(N, w)t = P , (2)
wi
Three data samples are used in this paper:
• memory-usage from a single application; where N corresponds to the number of lags used and wi
• network-traffic from server requests with a fixed sched- is the weight given to each observation. The standard case
ule; corresponds to wi = 1 for any i. Other versions are also
• monitoring of a web server. frequently used, for instance, the Exponentially Weighted MA
(EWMA) where the largest weight is given to the most recent
The first two are used for testing and validation, while the
observation. Here, the weight decays exponentially as the
latter is used for establishing a use case for AUGURY.
observations move further back in time.
Memory-usage data is collected from a virtual environment
The calculation of the MA requires at least N observations
hosting a GNU/Linux system. This emulates the conditions
and, therefore, no value of the MA exists for the first N − 1
of an isolated application. Regular executions of a task with
points. In the extraction of the seasonal pattern a symmetric
a rectangle-like memory usage pattern are invoked using
MA is used, i.e. the same number of past and future observa-
crontab. A lapse between executions of two minutes is used.
tions is used, and, hence, this procedure yields missing values
The virtual machine performance metrics are monitored using
at both the beginning and the end of the dataset.
GLANCES 2 , an open source software. The status is exported
every two seconds to a Comma-Separated Values (CSV) file. A seasonal pattern is one which repeats periodically, i.e.
This file is shared with the host machine, where the data is there is a period p for which:
analysed. Several performance metrics are exported into the St+p = St , (3)
CSV file, including memory and CPU usage.
Network traffic data is collected from an apache server where St is the seasonal component of a time series. In this
installed on a Raspberry Pi computer. A second Raspberry paper, three types of seasonality are considered, labelled as
Pi sends requests to this server via a local network. A script hourly, daily or weekly depending on whether p is an hour,
submitting these requests is scheduled to run every minute. day or week, respectively.
Thus, the network traffic has an inherent seasonal pattern. In this work, the extraction of seasonal patterns is based on
Web-server monitoring data is collected from a server sample decomposition. That is
hosting weather forecasting applications. A limited number of
Yt = St + Tt + Et (4)
applications dominate this sample. The 5 most frequently used
applications make for approximately 70% of the total traffic. where the series Yt is given by the combination of the
Each of these applications corresponds primarily to a single seasonal (St ), trend (Tt ) and the stochastic error (Et ) terms.
IP-address submitting the requests. Established methods for such a decomposition exist and are
documented elsewhere [2]. These are typically referred to as
III. S EASONAL ADJUSTMENT
seasonal adjustment [6] and are based on the application of
In this section, the procedure used to extract the seasonal a chain of filters. First, a MA is used to estimate the trend
component from a dataset is introduced. AUGURY’s main use- component, since changes on a smooth MA can be associated
case is the study of this component. to a trend. This detrended, filtered, data is then used to extract
A time series [3], [4], [5] is a sequence of time-dependant the seasonal and error terms.
data points, where the lag between them is constant in
time. Time series are studied mainly to understand the time- IV. M EMORY USAGE PARAMETRISATION
dependant behaviour of some quantity, for instance, fluctua- In this section, the model used to characterise the memory
tions in the price of some stock or seasonal patterns in the usage of a single application, and the strategy devised to
demand for a certain product. An application of time series estimate its parameters, is presented. The purpose of this
analysis is the forecasting of the future value of such a quantity model is to be used in memory-usage projections.
based on previous observations. That is A simple three-parameter model is used to characterise the
memory usage of a given application. These are the running
yt+1 = F (yt , yt−1 , ...., yt−(N −1) ), (1)
time (run time), the maximum memory used (max memory)
where t denotes the time of the last (most recent) out of and β, the latter of which is illustrated in Fig. 1 and corre-
N observed values of the variable y. This function is by no sponds to the magnitude of the first significant deviation.
means known, and could depend on derivative variables, such The extraction of the model parameters relies on the identi-
as a Moving Average (MA) or standard deviation (σ), and on fication of signal-like patterns. These structures are recognised
stochastic terms. by transforming the series as
2 https://nicolargo.github.io/glances/ yt0 = yt − yt−1 . (5)
15 For the standard case the MA yields yt0 /N , in which case the
data
backwards-isolated maximum EWMA(14) MA drops as N P increases. In contrast, the EWMA is closer
10 EWMA(14) ± 5σ to yt0 , as wt ≈ wi , and, hence, has a negligible dependence
run time
on the gap length.
5 β
Change on Memory [%]
400 data
parameter. 200
Fig. 2 shows a snapshot of the memory-usage parametri- 0
sation model in action. The model prediction for yt is yt−1 200
unless a trigger is activated, in which case the signal model 400
takes over. For this validation, the trigger is given by a jump
in the CPU usage. The signal model takeover lasts for a time 60
Relative network traffic
150 10 4
100
10 3
50
Number of executions
0
May 29 2016 May 30 2016 May 31 2016 Jun 01 2016 Jun 02 2016 Jun 03 2016 Jun 04 2016 Jun 05 2016 10 2
10 0 10 -3 10 -2 10 -1 10 0 10 1
The size of the daily seasonal component relative to the run time [s]
trend component varies from application to application. Fig. 5
shows this contribution for the five most frequently used Fig. 7. Run time for the 5th most frequently used application.
applications. For the first application in the ranking, the
median fluctuates between +200 and −100, approximately, The parametrisation of the memory usage via the model
around the average value of the trend component at 200. For discussed in Section IV can be applied here, despite the fact
the last application in the ranking, the median of the seasonal that the run time does not take a single value per application.
component peaks at around +150, which is a sizeable con- Fig. 8 shows the cumulative memory usage, for a minute-
tribution for an application with an average trend component sized time window, corresponding to that of the peak usage
of approximately 100. In Fig. 5, the seasonal component is time of the application. This shows the worst-case scenario,
also shown relative to the residual. This is illustrated through where memory is not being released, i.e. the run-time of the
the box and whisker, encapsulating 50% of the data and +1.5 application extends beyond the time window. For this scenario,
(−1.5) interquartile ranges from the top (bottom) edges of the an accumulated memory of 150 MBs is reached. A graphic
box, respectively. A significant change on the seasonal pattern, such as Fig. 8 can be used by a system administrator to prevent
relative to the neighbouring points along the x-axis, is that system overloads by revealing dangerous tendencies on the
which is not covered by the overlapping boxes or whiskers. memory usage at peak operating times.
A graphic such as Fig. 5 presents abundant information in a
concise manner, relevant to both diagnosis and forecasting pur- VIII. M ODELLING OF THE RESIDUAL COMPONENT OF
poses. Issues can be tracked down to the source by searching WEB - SERVER DATA
for sudden increases on the traffic data in a very narrow time In this section, the auxiliary functionality that AUGURY
window. A systematic congestion is illustrated, for instance, provides for the forecasting of the stochastic fluctuations
in the bottom panel of Fig. 5. Extraordinary events can also be around the stable long-term prediction, given by the seasonal
shown with respect to a forecast based on the seasonal pattern. component, is discussed.
800
600 Top 1 application
Share of total data traffic: 21 %
400
200
0
200
400
300 Top 2 application
200
Share of total data traffic: 16 %
100
0
100
1500
Top 3 application
1000 Share of total data traffic: 13 %
500
600
500 Top 4 application
400 Share of total data traffic: 10 %
300
200
100
0
100
200
200
150 Top 5 application
100 Share of total data traffic: 9 %
50
0
50
100 00:00 01:00 02:00 03:00 04:00 05:00 06:00 07:00 08:00 09:00 10:00 11:00 12:00 13:00 14:00 15:00 16:00 17:00 18:00 19:00 20:00 21:00 22:00 23:00
Fig. 5. Daily seasonality of the top 5 applications, ranked according to their share of the total traffic data. The trend component has been subtracted using
seasonal adjustment. For each hour there are as many entries as there are days in the input data. Each entry shows the total number of executions, relative to
the trend component, for a given day at the hour indicated by the x-axis. In each sample, the red line shows the median (second quartile), the bottom and
top of the box are the first and third quartiles, and the ends of the whiskers show the lowest (highest) entry still within 1.5 interquartile ranges of the lower
(upper) quartile. Outliers are represented as star-shaped markers.
60
50 Top 5 application
Share of total data traffic: 9 %
40
30
20
10
0 22:00 22:02 22:04 22:06 22:08 22:10 22:12 22:14 22:16 22:18 22:20 22:22 22:24 22:26 22:28 22:30 22:32 22:34 22:36 22:38 22:40 22:42 22:44 22:46 22:48 22:50 22:52 22:54 22:56 22:58
Fig. 6. Hourly seasonality for the 5th most frequently used application. This corresponds to a zoom of the daily seasonality plot, obtained by selecting entries
occurring between 22 : 00 and 23 : 00. For each minute there are as many entries as there are days in the input data. Each entry shows the total number of
executions, for a given day at chosen hour, as indicated in the x-axis. In each sample, the red line shows the median (second quartile), the bottom and top of
the box are the first and third quartiles, and the ends of the whiskers show the lowest (highest) entry still within 1.5 interquartile ranges of the lower (upper)
quartile. Outliers are represented as star-shaped markers.
160 Here, the γ = 0 hypothesis is tested.
2016-06-10 2016-06-02
2016-06-09 2016-05-27 The residual term of the web-server data is divided into
140 2016-06-08 2016-05-31
2016-06-05 2016-05-30 two sets. The first part is used as in-sample data for fitting the
2016-06-04 2016-06-12
Accumulated memory usage [MBs]
7th Framework Programme of the European Commission [13] C. Katris and S. Daskalaki. Combining Time Series Forecasting Methods
through the Initial Training Network HiggsTools PITN-GA- for Internet Traffic. Volume 122 of the series Springer Proceedings in
Mathematics & Statistics, 309-317. 2015.
2012-316704. [14] R. Sitte and J. Sitte. Analysis of the Predictive Ability of Time Delay
Neural Networks Applied to the S&P 500 Time Series. IEEE Transactions
R EFERENCES on Systems, Man, and Cybernetics-Part C: Applications and Reviews, Vol.
30, No 4. 2000.
[1] B. Ahmad, I. Leitner and M. Krieger. PVD: A Lightweight Distributed [15] SAS Institute. Notation for ARIMA Models. Time Series Forecasting
Monitoring Architecture for Cloud Infrastructures IEEE 4th Symposium System.
on Network Cloud Computing and Applications (NCCA 15), 2015. [16] D. Asteriou, S. G. Hall. ARIMA Models and the BoxJenkins Method-
[2] R. B. Cleveland, W. S. Cleveland, J. E. McRae, and I. Terpenning. STL: ology. Applied Econometrics (Second ed.). Palgrave MacMillan, 265286.
A Seasonal-Trend Decomposition Procedure Based on Loess. Journal of 2011.
Official Statistics, 6, 373, 1990. [17] D. A. Dickey, W. A, Fuller Distribution of the Estimators for Autore-
[3] R. H. Shumway. Applied statistical time series analysis. Englewood Cliffs, gressive Time Series with a Unit Root. Journal of the American Statistical
NJ: Prentice Hall. ISBN 0130415006. 1988. Association 74 (366), 427431. 1979.
[4] P. Bloomfield. Fourier analysis of time series: An introduction. New York: [18] J. D. Hunter Matplotlib: A 2D graphics environment. Computing In
Wiley. ISBN 0471082562. 1976. Science & Engineering, Vol. 9, No 3, 90-95. 2007
[5] R. Adhikari and R. K. Agrawal. An Introductory Study on Time Series
Modeling and Forecasting. e-print arXiv:1302.6613. 2013
[6] J. Shiskin, A. Young, and J. Musgrave. The X-11 Variant of the Census
Method II Seasonal Adjustment Program. Bureau of the census. Technical
Paper 15. 1967.
[7] P. Tchebichef, “Des valeurs moyennes,” Journal de mathmatiques pures
et appliques. 2 12 (1867) 177
[8] R. Wolski, N. Spring, and J. Hayes. The network weather service: A
distributed resource performance forecasting service for metacomputing.
Future Generation Computer Systems, 15(56): 757768. October 1999.
[9] K. Hu, A. Sim, D. Antoniades and C. Dovrolis. Estimating and forecasting
network traffic performance based on statistical patterns observed in
SNMP data. MLDM’13 Proceedings of the 9th international conference
on Machine Learning and Data Mining in Pattern Recognition, 601-615.
2013.
[10] S. Jung, C. Kim, Y. Chung. A Prediction Method of Network Traffic
Using Time Series Models. Volume 3982 of the series Lecture Notes in
Computer Science, 234-243. 2006.
[11] M. Joshi and T. H. Hadi. A Review of Network Traffic Analysis and
Prediction Techniques. e-print arXiv:1507.05722. 2015.
[12] P. Cortez, M. Rio, M. Rocha and P. Sousa. Multi-scale Internet traf-
fic forecasting using neural networks and time series methods. Expert
Systems, Vol. 29, No. 2. 2012.