
Cluster Computing

https://doi.org/10.1007/s10586-020-03099-x

An intelligent approach for predicting resource usage by combining decomposition techniques with NFTS network

Seyedeh Yasaman Rashida^1 · Masoud Sabaei^2 · Mohammad Mehdi Ebadzadeh^2 · Amir Masoud Rahmani^1

Received: 21 September 2019 / Revised: 13 March 2020 / Accepted: 22 March 2020


© Springer Science+Business Media, LLC, part of Springer Nature 2020

Abstract
Time-sensitive virtual machines that run real-time control tasks are constrained by hard timing requirements. Optimal resource management and guaranteeing the hard timing requirements of virtual machines are critical goals, and predicting cloud resource usage and reserving resources play a crucial role in achieving both. We therefore propose a prediction approach that combines a two-phase decomposition method with a hybrid neural network to predict future resource usage. This paper uses a clustering method based on the AnYa algorithm in an online manner to obtain the number of fuzzy rules and the initial values of the premise and consequent parameters. Since cloud resource usage varies widely from time to time and from server to server, extracting the best time series model for predicting cloud resource usage depends not only on time but also on the resource usage trend. For this, we present a recursive hybrid technique based on singular spectrum analysis and adaptively fast ensemble empirical mode decomposition to identify the hidden characteristics of the time series data. This method tries to extract the seasonal and irregular components of the time series. According to the simulation results, the proposed model performs significantly better than the three comparison models on one-step to six-step CPU usage predictions, with average improvements of 33.83% in MAPE, 36.54% in MAE, and 36.70% in RMSE.

Keywords Takagi–Sugeno network · Singular spectrum analysis (SSA) · Adaptive fast EEMD (AFEEMD) · Cloud computing · Future resource usage

Masoud Sabaei (corresponding author)
sabaei@aut.ac.ir

Seyedeh Yasaman Rashida
s.y.rashida@gmail.com

Mohammad Mehdi Ebadzadeh
ebadzadeh@aut.ac.ir

Amir Masoud Rahmani
rahmani@srbiau.ac.ir

1 Department of Computer Engineering, Science and Research Branch, Islamic Azad University, Tehran, Iran
2 Department of Computer Engineering and Information Technology, Amir Kabir University of Technology, Tehran, Iran

1 Introduction

Cloud computing has been widely used in various fields owing to the advantages of plentiful resource provision and the pay-as-you-go model [1]. Resources are distributed geographically across different cloud datacenters, where they are virtualized and allocated to user applications. The applications can consist of hard real-time control tasks, which have strict timing requirements expressed as hard deadlines [2]. They can also be grouped into on–off, high-growth, periodic and aperiodic-burst classes depending on their workload [3]. Different user applications may require different on-demand resources with parameter constraints such as start time, execution time, number of virtual machines (VMs) and deadlines [4, 5]. Hence, VM scheduling within a hard deadline while considering different VM parameters is an interesting and challenging problem. There are several optimized time-sensitive VM scheduling methods [6–11]. Solving the problem with meta-heuristics takes a long time, and they fail to validate whether the provisioned resources were optimally used [3] and whether hard real-time VM constraints are met. On the other hand, simple greedy algorithms cannot guarantee optimal resource provisioning while maintaining a high quality of service (i.e. cost).


So, predicting the future of users' demands and cloud resource usage can be applied to handle this issue [12].

The cloud resource usage over time can be regarded as a time series, since recent investigations indicate that it is highly correlated over time. Time series prediction methods are important tools that help experts with decision making and preventive planning. Thus, applying time series prediction models is useful here [12]. These methods try to identify patterns in historical time series data and develop a model of the future temporal patterns of the series [13, 14]. However, resource demands change constantly and sometimes fluctuate very strongly, making it difficult to produce accurate predictions [1, 15]. In the following, we address what type of method is required to characterize resource usage patterns and predict resource usage, and why this method is suitable for predicting future resource usage.

Resource usage can be predicted through several approaches, most of which can be classified into two categories: statistical techniques (model-based) and machine learning methods (data-based) [16]. The application of statistical approaches depends heavily on a priori statistical statements, such as the specification of relationships between variables (e.g. independence) and model-specific assumptions regarding the process probability distributions (e.g., the outcome variable may be required to be binomial). Auto-Regression (AR) is one of the most commonly used model-based tools [17]. However, statistical methods such as measuring the maximum or average resource usage over specified time intervals are very general and not capable of producing accurate predictions. Prediction models based on such methods are considered poor, as they predict badly and cover only a small number of cases [16]. Furthermore, these approaches rest primarily on the assumptions of stationarity in the time series and linearity among normally distributed variables [18]. However, the stationarity, linearity and normality assumptions are not satisfied by workload demands. The newest and most common predictors are based on machine learning techniques [19] such as Support Vector Machines (SVM) [20, 21], Genetic Algorithms (GA) [22], Neural Networks (NN) [23–29], Fuzzy Neural Networks (FNN) [30–35], Moving Average (MA) and AR [36], and hybrid prediction approaches [37–40]. Machine learning methods are widely adopted for establishing more accurate prediction models [6].

Although the machine learning based models have achieved remarkable results, there are still limitations. Firstly, most models do not have an explicit mechanism to handle the non-stationarity of workload. Secondly, they do not pay attention to the interactions between features at different scales. For example, in NNs or deep network models, which are also widely used in many other applications such as image processing, machine translation and speech recognition, the non-linearities are mainly handled by the activation functions, and there are few techniques addressing the non-linear interactions among the inputs [33]. Among these models, the FNN inherits the learning ability of NNs and the inference technology of fuzzy systems, and can be used as an effective method for dealing with complex nonlinear systems. Therefore, many researchers have used FNN approaches to represent complex plants and construct advanced controllers [41–43].

In this paper, we apply a hybrid prediction model that infers the current situation and predicts the future resource usage of hosts in order to overcome the previous problems. We propose the composition of a data-based approach with a model-based approach so as to obtain the advantages of both: we choose AR from the model-based approaches and an FNN from the data-based approaches. AR is a simple linear model, and such models are frequently found to produce smaller prediction errors than techniques with complicated model forms due to their parsimony [44]. However, two issues should be noted. On one hand, linear approaches may deliver unsatisfactory prediction accuracy when the system of interest exhibits a nonlinear trend. This problem indicates that the model structure is unstable. According to Clements and Hendry [45], this structural instability has become a key issue that dominates prediction performance [44]. To settle this problem, AR is composed with an FNN.

Among FNNs, the Takagi–Sugeno (TS) fuzzy prediction model has attracted extensive attention [46] due to its better generalization and excellent approximation in the dense region [15]. So, we choose the Neuro-Fuzzy Takagi–Sugeno network (NFTS). Since the proposed network combines the NFTS and AR networks, we call it NFTSAR. In the classic TS fuzzy neural network (NFTS), the system output is approximated locally by the rule hyperplanes, each a linear polynomial of the input variables. Nevertheless, the traditional NFTS does not take full advantage of the mapping capabilities that the consequent part might offer [47]. In [37], Sarıca et al. proposed the composition of classical ANFIS with AR, namely AR-ANFIS, which had the advantages of data-based and model-based approaches. However, AR-ANFIS was not capable of identifying seasonal and irregular patterns, and it retained the disadvantages of the classical NFTS. To solve these issues, the original time series needs to be stabilized before constructing the prediction method [48]. So, we present a methodology that works on two objectives: characterizing and predicting.
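To make the rule-based structure concrete, the following is a minimal sketch of first-order Takagi–Sugeno inference in which each rule carries an AR-style linear consequent over a window of lagged usage values, in the spirit of the NFTS-with-AR combination described above. The rule count, Gaussian premises, and all parameter values are illustrative assumptions, not the paper's trained network:

```python
import numpy as np

def gaussian_mf(x, center, sigma):
    """Per-dimension Gaussian membership degrees of input vector x."""
    return np.exp(-((x - center) ** 2) / (2.0 * sigma ** 2))

def ts_predict(x, centers, sigmas, coeffs, biases):
    """First-order Takagi-Sugeno inference: each rule owns a linear
    (AR-style) consequent y_r = a_r . x + b_r, and the output is the
    firing-strength-weighted average of the local outputs y_r."""
    # Rule firing strengths: product of per-dimension memberships
    w = np.array([np.prod(gaussian_mf(x, c, s)) for c, s in zip(centers, sigmas)])
    w = w / w.sum()                 # normalized firing strengths
    y = coeffs @ x + biases        # local linear consequents, one per rule
    return float(w @ y)            # weighted average of the local models

# Two illustrative rules over a window of 3 lagged usage values
centers = np.array([[0.2, 0.2, 0.2], [0.8, 0.8, 0.8]])  # premise centers
sigmas  = np.array([[0.3, 0.3, 0.3], [0.3, 0.3, 0.3]])  # premise widths
coeffs  = np.array([[0.5, 0.3, 0.1], [0.6, 0.2, 0.1]])  # AR-style consequent weights
biases  = np.array([0.05, 0.1])

x = np.array([0.25, 0.22, 0.21])   # recent (normalized) CPU usage window
print(ts_predict(x, centers, sigmas, coeffs, biases))
```

In the full network, the premise parameters (centers, widths) and the consequent coefficients would be learned from data rather than fixed by hand.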


In the characterizing stage, the historical resource usage patterns of hosts in the cloud data centers are identified; in other words, this stage extracts the seasonal and irregular components of the resource usage patterns. In the predicting stage, the level of resource usage of the hosts used by hard real-time VMs is predicted. Our proposed method therefore combines a two-phase decomposition technique with the NFTSAR network to characterize and predict resource usage. The proposed decomposition method is a recursive hybridization of SSA and AFEEMD, namely RSA. Consequently, the modified hybrid prediction method, RSA–NFTSAR, is presented for predicting and identifying the dynamic cloud system. The proposed methodology is an online learning approach whose prediction error decreases with time. It also accounts for the behavioral changes of demands in its predictors.

As stated before, we present a recursive version of the two-phase decomposition strategy, RSA, to characterize seasonal and irregular time series and improve prediction performance. RSA–NFTSAR uses the hidden component data to predict future resource usage accurately. In our approach, we use the hidden component data to obtain the fuzzy rules without any assumption about the structure of the data and independently of the model structure. The structure identification and parameter optimization steps of this approach are performed automatically, with the optimal number of rules, at an acceptable accuracy. In this methodology, prediction is performed for each hidden component separately; the prediction results of the components are then composed to provide the future value of the original time series. The main contributions of the paper are the following:

• Identifying the hidden characteristics of the original data by applying a recursive two-phase decomposition algorithm, namely RSA.
• Presenting a prediction methodology for seasonal and irregular time series, covering not only single-step but also multistep prediction.
• Increasing the accuracy of future resource usage prediction by identifying the hidden components of the time series.
• Evaluating the performance of the proposed prediction approach on the dataset of the CoMon project with respect to prediction accuracy metrics.

The paper is organized as follows. In Sect. 2, the related works are discussed. Section 3 describes the proposed methodology, including the decomposition-ensemble model and the neural network structure. Evaluation results are presented in Sect. 4, and conclusions as well as future research directions are provided in Sect. 5.

2 Related works

Many prediction models have been proposed, since they have become more and more essential for computer systems [12]. In this section, we briefly review previous works related to this work on cloud and distributed computing.

A survey on the challenges and approaches of resource demand prediction in cloud computing is reported in [25]. Nguyen et al. [49] presented VM consolidation with a usage prediction algorithm that was based on multiple resource usage predictions to estimate long-term future resource utilization. Knowing the last n recorded usages, the next usage was generated via multiple linear regression (LR). The resulting algorithm was executed during the VM consolidation process and decided when a host was underused or overloaded based on the current and future states, thus avoiding future SLA violations. This method relied on oversimplified assumptions about the workload (a linear relationship). The method was trained and parameterized based on past observations of the workload behavior, so it cannot capture behavioral changes in workloads and must be retrained as the prediction error increases, which takes a long time. In [50], Tang et al. combined LR and wavelet NN techniques into a prediction method to forecast the short-term workload of cloud data centers and reduce the energy consumption of cloud data center servers, networks and cooling systems.

Gapta et al. [51] proposed a resource utilization approach based on AR for load balancing. While load balancing helped energy consumption and SLA formulation, it did not consider the impact of dynamic workload on energy consumption and SLA. This method provided one-step-ahead prediction and a general view of the future trend of the application behavior. However, its prediction accuracy decreased as the prediction interval length increased. Moreover, although the model was simple to implement, its very simplicity made it ineffective for extrapolation. Tao et al. [52] proposed a multi-strategy collaborative prediction model (MSCPM) for the runtime of online tasks. They introduced a novel concept named Prediction Accuracy Assurance (PAA) as a criterion to evaluate the precision of the runtime prediction provided by a specific prediction strategy. Kecskemeti et al. [53] proposed a technique for predicting generic background workload by means of simulations. The technique was capable of providing additional knowledge of the underlying private cloud systems in order to support activities like cloud orchestration or workflow enactment. The technique also used long-running scientific workflows and their behavior discrepancies, and tried to replicate these in a simulated cloud with known workloads.
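Several of the surveyed predictors ([51], [64], and hybrids built on them) rest on autoregression. As a point of reference, a least-squares AR(p) fit with a one-step-ahead forecast can be sketched as follows; this is a generic illustration on synthetic data, not the implementation of any cited work:

```python
import numpy as np

def fit_ar(series, p):
    """Fit AR(p) coefficients by ordinary least squares:
    x_t ≈ c + a_1 * x_{t-1} + ... + a_p * x_{t-p}."""
    n = len(series)
    # Column k holds the lag-(k+1) values aligned with target x_t
    X = np.column_stack([series[p - k - 1 : n - k - 1] for k in range(p)])
    X = np.column_stack([np.ones(len(X)), X])   # intercept column
    y = series[p:]
    coef, *_ = np.linalg.lstsq(X, y, rcond=None)
    return coef                                  # [c, a_1, ..., a_p]

def ar_forecast(series, coef):
    """One-step-ahead forecast from the last p observations."""
    p = len(coef) - 1
    lags = series[-1 : -p - 1 : -1]              # most recent value first
    return coef[0] + coef[1:] @ lags

# Illustrative: recover a known AR(2) process from a noisy trace
rng = np.random.default_rng(0)
x = np.zeros(200)
x[0], x[1] = 0.5, 0.6
for t in range(2, 200):
    x[t] = 0.1 + 0.6 * x[t - 1] + 0.3 * x[t - 2] + 0.05 * rng.standard_normal()

coef = fit_ar(x, 2)
print(coef)                   # close to the true parameters [0.1, 0.6, 0.3]
print(ar_forecast(x, coef))
```

The simplicity of this fit is exactly the strength the table in this section attributes to AR-style models; its fixed linear form is the corresponding weakness.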


Unfortunately, it was unable to extract hidden features (i.e. trend, seasonal and irregular features) from the original data.

Artificial intelligence and machine learning techniques provide an opportunity for the development of cognitive cloud systems. In recent years, reinforcement learning has proven to be a promising autonomous approach for the optimal allocation of cloud resources [45]. Reinforcement learning based approaches need no domain knowledge and are able to adapt to the behavioral changes of applications by generating new policies [54]. However, the scalability of reinforcement learning is poor in large state spaces, and the initial policies affect the convergence speed to the optimal policy; a poor initial policy might thus lead to poor performance [55]. In [56], Thein et al. proposed a framework for cloud resource allocation with the aim of green cloud resource deployment. The framework was based on a reinforcement learning mechanism and fuzzy logic. The simulation results showed that this framework can achieve effective performance in terms of high data center energy efficiency and the prevention of SLA violations. Although this approach improved the accuracy and interpretability of the model, it was unable to extract hidden features from the original data, and it retained the disadvantages of reinforcement learning based approaches.

Some recent works also introduced machine learning into self-adaptive resource allocation [57, 58]. Zheng et al. [57] used a load predictor that clustered historical resource utilization and selected the cluster set with the highest similarity as a training sample for a NN. Cioara et al. [58] proposed a methodology for energy management in data centers and showed that their model can help to calculate the level of energy efficiency in data centers at runtime. In recent years, learning automata have been used as a technique for energy management in smart grids and cloud computing [59, 60]. Automata models of learning systems, introduced in the 1960s, were popularized as learning automata in a 1974 survey paper [61]. Learning automata were proposed to maximize the reinforcement received from the environment, an approach that provides a lot of flexibility in designing an appropriate learning system for different applications [62]. Rahmanian et al. [12] proposed an ensemble algorithm based on learning automata. This algorithm predicted cloud resource usage by employing two approaches, namely Single Window and Multiple Window. The simulation results showed the high performance of the proposed approach in the cloud environment, but the model was unable to extract hidden features from the original data.

In [48], the authors proposed an integrated forecasting method equipped with noise filtering and data frequency representation, named Savitzky–Golay and Wavelet-supported Stochastic Configuration Networks (SGW-SCN), to predict the future workload. In this method, the workload time series was first smoothed by a Savitzky–Golay filter and then decomposed into multiple components via wavelet decomposition (WD). With stochastic configuration networks (SCN), an integrated model was established to characterize the statistical characteristics of both the trend and detail components. Although the main advantage of WD was the provision of temporal information, it was not suitable as a time-invariant transform [63]. Moreover, the choice of the wavelet function affected the decomposition result; usually this choice was empirical, depending on prior experience or trial and error [63], so the method was not useful where no human intervention was available. In addition, the WD method used here decomposed the original time series into just two hidden components, so the approach is unsuitable for environments with more than two hidden components.

Mason et al. [15] proposed a CPU usage prediction approach using a number of state-of-the-art swarm and evolutionary optimization algorithms, such as particle swarm optimization (PSO), differential evolution (DE) and the covariance matrix adaptation evolutionary strategy. They used a traditional NN for prediction. Results showed that the covariance matrix adaptation evolutionary strategy-trained NN outperformed both DE and PSO. Traditional NNs do not require restrictive assumptions about the form of the data and can model the nonlinear behavior of an application well [55]. They can also be used to model the correlation between input and output. However, there is no specific rule for determining the structure of NNs; the network structure is found through experience and trial and error [53], which reduces trust in the network. Jiang et al. [64] proposed a multi-prediction model, including the ARMA model and a feedback based online AR model. This model predicted current and future resource availability and scheduled hybrid workloads in the cloud data center to reduce task failures and increase resource utilization. The implementation of this model was simple, but the input data had to be continuous and the convergence of the weights depended on the learning rates. Moreover, this model was unable to extract hidden features from the original data.

Long short-term memory (LSTM) models are a recurrent network architecture used in conjunction with an appropriate gradient-based learning algorithm [65]. These models are good for highly noisy one-dimensional time series. LSTM models were used in [23, 24, 66] to model the long-range dependence in cloud workloads. In [66], an approach was presented for predicting future resource usage trends using univariate LSTM models. The model required computing and storing multiple gating neuron responses at each step, and it was unable to extract hidden features (i.e. trend, seasonal and irregular features) from the original data.
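A recurring criticism above is the inability to extract the hidden features of a workload trace: its trend, seasonal, and irregular components. For reference, the classical additive decomposition that these critiques implicitly appeal to can be sketched as follows. This is a simple moving-average variant assumed here for illustration only; it is not the SSA/AFEEMD scheme proposed later in this paper:

```python
import numpy as np

def additive_decompose(x, period):
    """Classical additive decomposition x = trend + seasonal + irregular.
    Trend: centered moving average (edges padded by reflection);
    seasonal: mean of the detrended values at each phase of the period;
    irregular: whatever remains."""
    n = len(x)
    kernel = np.ones(period) / period
    padded = np.pad(x, period // 2, mode="reflect")
    trend = np.convolve(padded, kernel, mode="valid")[:n]
    detrended = x - trend
    seasonal = np.array([detrended[phase::period].mean() for phase in range(period)])
    seasonal = np.tile(seasonal, n // period + 1)[:n]
    irregular = x - trend - seasonal
    return trend, seasonal, irregular

# Illustrative trace: linear trend + cycle of period 12 + noise
t = np.arange(240)
x = (0.01 * t
     + 0.5 * np.sin(2 * np.pi * t / 12)
     + 0.05 * np.random.default_rng(1).standard_normal(240))
trend, seasonal, irregular = additive_decompose(x, period=12)
```

By construction the three components sum back to the original series, which is the property the decomposition-ensemble methods in this survey exploit when they predict each component separately and recombine the results.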


Babu et al. [67] introduced an interference-aware prediction mechanism for VM migration. This mechanism used automatic scaling policies to handle sudden load changes with precise prediction and minimum VM migration. In [68], Kumar and Patel proposed a prediction model based on an ANN and PSO to minimize cost and makespan. The proposed model addressed issues such as the dynamic and heterogeneous cloud computing environment, reliability, availability and overload. Witanto et al. [69] proposed an NN-based method for selecting the best VM consolidation algorithm. This method adaptively chose the appropriate algorithm according to the cloud provider's goal priority and environment parameters. As explained before, there is no specific rule for determining the structure of NNs; the appropriate network structure is found through experience and trial and error.

In [1], Chen and Wang proposed a resource demand prediction method based on ensemble empirical mode decomposition (EEMD) and ARIMA in cloud computing. This method decomposed the non-stationary users' resource demands into a number of intrinsic mode function components (IMFs) and a residual component (RES) through the EEMD method to improve prediction accuracy. The proposed method did not require restrictive assumptions about the form of the data, and it could also model the correlation between input and output. However, the structure of the network was found through experience and trial and error, which reduces trust in the network. In [70], Kaur et al. presented an intelligent regressive ensemble approach for resource usage prediction that integrated feature selection and prediction techniques to achieve high performance. The implementation of the model was simple, but it needed a lot of data for training, so training took a long time. It was also probable that the resource manager could not interpret some prediction results. Moreover, it was unable to extract hidden features from the original data or to adapt to workload changes. Kumar and Singh [16] used NN and self-adaptive differential evolution algorithms for workload prediction. The performance of the model was evaluated using the NASA and Saskatchewan servers' datasets. The proposed model reduced the prediction error rate, and the results were compared with the back-propagation learning algorithm.

Cloud resource management requires complex policies and decisions to ensure the suitable use of computing resources [71] due to fluctuations in the demanded workload. Deciding the right amount of resource usage for performing user requests in cloud environments is not trivial [72]. In [72], the authors applied a couple of smoothing filters to decrease the negative impact of outliers in the observed data points. They also presented a framework for cloud resource management that included a prediction module to estimate resource usage more accurately, using Bayesian theory in their proposed model. Bayesian theory is simple and able to incorporate the initial information of the resource manager as prior probabilities in the predictor [55]. However, due to the non-linear behavior and load dynamics of cloud applications, filtering did not seem appropriate here: it eliminated useful information about the load dynamics, so the resource manager could not allocate appropriate resources to applications. Moreover, if the resource manager could not provide prior information, it had to be estimated from the past behavior of the application, and the prior probabilities had to be recomputed to adapt to behavioral changes of the application. The model was also unable to extract hidden features from the original data.

Manjula and Folurence [73] introduced a hybrid approach combining a genetic algorithm (GA) for feature optimization with a deep neural network (DNN) for classification. They improved the fitness function of the GA as well as the chromosome design and computation. The DNN technique was also enhanced with an adaptive auto-encoder, which provided a better representation of the selected software features. However, DNN-based methods require significantly more computation than traditional methods, which leads to high energy consumption [74]. In [75], Fei et al. presented a rough-set predictive analysis method for the heat insulation performance of composite walls in the hot-summer and cold-winter zone; this method improved the precision of the heat insulation performance analysis algorithm for the composite wall. In [55], a wide-ranging study of prediction models of applications for resource provisioning in the cloud is presented, investigating the main characteristics and challenges of the different models.

The Deep Belief Network (DBN) is one type of DNN, built from densely connected Restricted Boltzmann Machines (RBMs). DBNs, as a feature learning approach in regression models, have been investigated for predictive analytics. To predict resource requests in cloud computing, a DBN was proposed in [76] to optimize the job schedule and balance the computational load. DNNs do not take into account the temporal information contained in data series [77], so this model was unable to predict large temporal datasets accurately. Moreover, the proposed model had one hidden layer; a 2-layer DBN is an RBM, and a stack of RBMs shares its parametrization with the corresponding DBN [78]. So, the model would effectively be an RBM rather than a DBN and would inherit the disadvantages of RBMs. One disadvantage of the RBM is that its approximate inference, based on the mean field approach, is slower than the single bottom-up pass of a DBN, and mean field inference must be performed for every new test input. However, the DBN is more generative than an auto-encoder and needs less memory than an RNN. Muralitharan et al. [79] compared different NNs for energy demand prediction in the smart grid. However, the authors did not consider the effects of the temporal time series on prediction accuracy.


Wang et al. [80] proposed a Sparse DBN with FNN (SDBFNN) for nonlinear system modeling. They considered the sparse DBN as a pre-training technique to realize fast weight initialization and to obtain feature vectors. The results showed that this approach balances the dense representation to improve its robustness. In [81], a growing DBN with transfer learning (TL-GDBN) was proposed to decide its structure size automatically. The approach accelerated the learning process and improved the accuracy of the model. In this method, a basic DBN structure with a single hidden layer is first initialized and pretrained, and the learned weight parameters are frozen. Transfer learning (TL) is then used to transfer the knowledge from the learned weight parameters to newly added neurons and hidden layers, so the structure can keep growing until the stopping criterion for pretraining is satisfied. The weight parameters derived from the pretraining of TL-GDBN are further fine-tuned by layer-by-layer partial least squares regression from top to bottom, which avoids many problems of traditional backpropagation-based fine-tuning.

Table 1 summarizes most of the works related to resource usage prediction approaches and discusses their advantages and disadvantages. As explained above, the main disadvantage of most approaches is that they are unable to extract hidden features from the original data and to adapt to workload changes. For this reason, we present an approach that solves these kinds of issues. In the following section, we explain our proposed solution.

3 The proposed RSA–NFTSAR architecture

This section describes the details of the proposed RSA–NFTSAR for the prediction and characterization of seasonal time series, consisting of (1) a recursive two-phase decomposition method, and (2) an adaptive NFTSAR based prediction scheme.

The structure of the proposed RSA–NFTSAR is shown in Fig. 1. It consists of four steps: preprocessing, training, predicting, and evaluating. Data pre-processing techniques are a significant step in the data mining process. These techniques play an important role in Artificial NNs (ANN) by fostering high precision and minimal computational cost at the training stage, since noisy and unreliable information present in the data records would adversely affect the learning phase and result in a poor model [83]. In the pre-processing step, data is extracted from the raw data and then aggregated at the minute level; we assume that a timestamp is recorded every 5 min. The aggregated data is passed to the decomposition step, which is responsible for extracting and identifying the fluctuation characteristics of the resource usage of VMs. In this paper, we propose a recursive two-phase decomposition strategy that composes SSA with the AFEEMD method. In the first phase, the aggregated data is passed to SSA to reduce the noise of the original data. Following the noise reduction by the SSA method, AFEEMD is used in the second phase to decompose the difficult resource usage prediction task into several simple prediction subtasks. The proposed decomposition approach is detailed in Sect. 3.2.

Then, we normalize the decomposed data into the range (0, 1) and send it to NFTSAR as input data. The NFTSAR network is based on the Neuro-Fuzzy Takagi–Sugeno network (NFTS). NFTS has the benefits of both fuzzy logic and NNs in a single framework: it is a fuzzy inference system with learning capability [32]. In this work, we improve the NFTS network by composing NFTS with a linear AR model to achieve the advantages of data-based and model-based approaches. The new methodology attempts to deal with the non-stationary behavior, uncertainties, and other complexities of resource usage series in a cloud computing environment.

The proposed methodology has three operation stages in addition to the pre-processing stage: the training stage, the predicting stage, and the evaluating stage. Training means the determination of the parameters belonging to the parts of the proposed NFTSAR network. This stage runs when observations from the time series are available; it analyzes the n previous resource usage values in order to predict the upcoming resource usage of the data center at time n + 1. If there is no available data, the predicting step starts. The detailed discussion of the training and predicting stages is given in Sect. 3.4. After training, accuracy is assessed in the evaluating stage, which is critical for decision making. For this, the predicted data is transformed back into the original format, and the MSE, MAPE, and RMSE are calculated to evaluate accuracy. In the following subsections, we explain the data series used in the network and each step of the proposed methodology.

3.1 Data series

A time series is a sequence of numbers taken at successive, equally spaced points in time [84, 97]. Each data point in a time series is a measurement captured at a different timestamp [68]. Generally, time series data can be denoted as an N × M two-dimensional matrix (Eq. 1).

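As a small illustration of this representation, the sketch below (Python, with made-up sample values) builds the N × M observation matrix for the univariate case M = 1 assumed later in this paper:

```python
# Represent a resource-usage time series as an N x M matrix (Eqs. 1 and 2).
# With M = 1 (a single monitored resource, e.g. CPU utilization), each
# observation X(j) is a one-element row [X_1(j)].
cpu_usage = [0.21, 0.35, 0.30, 0.42, 0.38]      # hypothetical CPU samples

N = len(cpu_usage)                               # total number of timestamps
M = 1                                            # number of input variables
X = [[cpu_usage[j]] for j in range(N)]           # the N x M matrix of Eq. 1

assert len(X) == N and all(len(row) == M for row in X)
```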
Table 1 Strengths and weaknesses of most of the works related to resource usage prediction approaches

[15, 16, 69, 79, 82] — NN
  Strengths: simple implementation; models the nonlinear behavior of applications; able to learn the relationship between input and output.
  Weaknesses: picking the correct topology; training takes a long time and needs a lot of data; the resource manager may not be able to interpret some results; unable to adapt to workload changes; not effective for extrapolation; unable to extract hidden features in the original data.

[1] — EEMD-ARIMA
  Strengths: simplicity; interpretability.
  Weaknesses: must be retrained to adapt to workload changes; data should be independent; application behavior should be linear; unable to extract hidden features in the original data.

[45, 56] — Reinforcement learning
  Strengths: adaptability to behavioral changes of the workload; no need for domain knowledge.
  Weaknesses: poor scalability in large state spaces; unable to extract hidden features in the original data.

[51] — AR
  Strengths: simple implementation; models the nonlinear behavior of applications.
  Weaknesses: picking the correct topology; training takes a long time and needs a lot of data; the resource manager may not be able to interpret some results; unable to adapt to workload changes; not effective for extrapolation; unable to extract hidden features in the original data.

[76] — DBN
  Strengths: better than traditional NNs at persisting information from previous events; more generative than an auto-encoder and needs less memory than an RNN.
  Weaknesses: unable to predict large temporal datasets accurately; expensive inference, since it must be performed for every new test input; unable to extract hidden features in the original data.

[64] — ARMA model and the feedback-based online AR model
  Strengths: very simple; relatively easy to implement; temporally stable; independent of the initial conditions.
  Weaknesses: limited accuracy due to poor feature-extraction ability; convergence of the weights depends on the learning rates; input data must be continuous; the network has to be retrained if disturbances are introduced to the system; unable to extract hidden features in the original data.

[70] — Regressive Ensemble Approach
  Strengths: simple implementation; models the nonlinear behavior of applications.
  Weaknesses: picking the correct topology; training takes a long time and needs a lot of data; the resource manager may not be able to interpret some results; unable to adapt to workload changes; not effective for extrapolation; unable to extract hidden features in the original data.

[12, 59] — Learning Automata
  Strengths: no need for knowledge of the environment in which it operates or any analytical knowledge of the function to be optimized; adapts the control action to system limitations.
  Weaknesses: unable to extract hidden features in the original data.

[72] — Bayesian technique
  Strengths: simplicity; simple interpretation.
  Weaknesses: assumes independence of the features describing application behavior; unable to adapt to workload changes; unable to extract hidden features in the original data.

Table 1 (continued)

[66] — LSTM
  Strengths: learns long-term dependencies; able to learn to extract useful information from data; good for highly noisy one-dimensional time series; designed to escape the long-term dependency issue of RNNs.
  Weaknesses: requires computing and storing multiple gating neuron responses at each step; unable to extract hidden features in the original data.

Fig. 1 Resource usage prediction approach workflow

X = [X(1), ..., X(j), ..., X(N)]^T                                   (1)

X(j) = [X_1(j), ..., X_i(j), ..., X_M(j)]                            (2)

Here N is the total number of timestamps, M is the number of input variables at the jth timestamp, X(j) denotes the jth observation of the input variables (1 ≤ j ≤ N), and X_i(j) denotes the jth observation of the ith input variable (1 ≤ i ≤ M). We assume that M is equal to one in this paper; this variable corresponds to the number of host resources considered.

3.2 Structural description of the proposed recursive two-phase decomposition approach

Most resource usage prediction methods perform forecasts by modeling the original time series directly. However, multiple seasonal patterns in the original series can increase the complexity of the prediction model, weakening its generalization capability [85]. Two factors, the prediction accuracy and the computational cost, are important in resource usage prediction. Data decomposition, which can reduce the non-stationary character of the original data, indirectly improves the predicting performance [86]. There are many approaches to performing the decomposition task, and the decomposition method should be chosen based on the time series features and the expert's goals [14, 87]. In this paper, we develop a recursive two-phase decomposition method to achieve a higher quality of resource usage decomposition and enhance the predicting precision. The proposed method is called RSA and is composed of SSA combined with AFEEMD. It extracts the trend, irregular, and seasonal components from the time series data and removes noise at low computational cost. These patterns are simpler than the original time series, and they help experts to understand the series behavior [14, 87].

As explained before, we present a recursive two-phase method in which SSA runs in the first phase and AFEEMD runs in the second phase. Before describing the proposed decomposition method, we give our reasons for using these two techniques.

SSA is employed in the analysis of time series; it combines multivariate statistics and probability theory and is often used for identifying and extracting periodic, quasi-periodic, and oscillatory components from the primal data [86]. We choose this method because SSA does not make any statistical assumptions [83] and is not restricted to predetermined components such as trend, seasonal, and cycle patterns [13]. Therefore, the different types of components are determined from the time series data itself, although experts can determine the number of patterns to be extracted [13].

In the second phase, we use AFEEMD as defined in [88], an extension of the empirical mode decomposition (EMD) [64] and EEMD [89] techniques. The major breakthrough of AFEEMD is that it reduces the time complexity and solves the predicament of parameter selection in EEMD [88]. This technique also optimizes the procedure coding of the original EMD techniques, like the FEEMD technique [90].


For this reason, we use the AFEEMD method in the second phase. The technique can decompose an original signal into a series of IMFs and a residue.

Moreover, hybrid methods outperform individual forecasting models and improve the predicting performance. Consequently, we apply a two-phase decomposition method to achieve the described goals. The proposed RSA method includes eight steps: the first four (Embedding, Singular Value Decomposition (SVD), Grouping, and Diagonal Averaging) belong to the SSA phase, and the next three (Upper frequency prescribing, Ensemble trials, and Final extraction) belong to the AFEEMD phase. The method performs a recursive update (the eighth step) to reduce the computational complexity and speed up the extraction process. The proposed method is summarized in Fig. 2 and described as follows.

In the embedding step, the values of three parameters, N, L, and m, should be determined. Let L be the window length, N the initial number of time series samples, and m the number of hidden components to extract. We assume that the user determines these parameters initially. After initialization, the series data X = [X(1), ..., X(N)] of length N is shifted into a Hankel matrix Y = (Y_1 : ... : Y_K) of dimension L × K, where 2 ≤ L ≤ N and K = N − L + 1.

The values along each ascending diagonal of the Hankel matrix are equal from left to right [13]. Each column of the Hankel matrix Y is defined as Y_i = (X(i), ..., X(i + L − 1)), i = 1, ..., K. The lagged vectors Y_i are the columns of the Hankel matrix Y, which is given as follows:

Y = (Y_1 : ... : Y_K) = (y_{l,k}), 1 ≤ l ≤ L, 1 ≤ k ≤ K:

    [ X(1)   X(2)    ...   X(K)   ]
    [ X(2)   X(3)    ...   X(K+1) ]
    [  :      :              :    ]
    [ X(L)   X(L+1)  ...   X(N)   ]                                  (3)

In the next step (SVD), the SVD of the Hankel matrix Y is performed. Set the covariance matrix C_N = Y Y^T and denote by λ_i (i = 1, ..., L) the eigenvalues of C_N taken in decreasing order of magnitude (λ_1 ≥ ... ≥ λ_L ≥ 0). λ_i refers to the partial alteration of the original time series in U_i, where U_i denotes the ith left eigenvector and V_i = Y^T U_i / √λ_i denotes the ith right eigenvector. Then, matrix Y can be decomposed into d components, where d = rank(Y) = max{i such that λ_i > 0}. In this notation, matrix Y can be rewritten as follows:

Y = Y_1 + Y_2 + ... + Y_d                                            (4)

where Y_i = √λ_i U_i V_i^T. In the eigentriple grouping step, m out of the d components are selected as the trend components [38]. Define I = {I_1, ..., I_m} and Y_I = Y_{I_1} + Y_{I_2} + ... + Y_{I_m}; then Y_I represents the trend component of the resource usage time series data, while the other (d − m) components are considered noise.

In the diagonal averaging step, each matrix Y_{I_j} of the grouped decomposition is transformed into a new series of length N. Let Y be an L × K matrix with elements y_{l,k}, 1 ≤ l ≤ L, 1 ≤ k ≤ K. Set L* = min(L, K), K* = max(L, K), and N = L + K − 1. Let y*_{l,k} = y_{l,k} if L < K and y*_{l,k} = y_{k,l} otherwise. By diagonal averaging, the matrix Y is transferred into the series X̂(1), ..., X̂(N) using Eq. 5 [91].

         (1/k) Σ_{m=1}^{k} y*_{m,k−m+1},                             for 1 ≤ k < L*
X̂(k) =  (1/L*) Σ_{m=1}^{L*} y*_{m,k−m+1},                           for L* ≤ k ≤ K*
         (1/(N − k + 1)) Σ_{m=k−K*+1}^{N−K*+1} y*_{m,k−m+1},         for K* < k ≤ N    (5)

This corresponds to averaging the matrix elements y_{l,m} over the anti-diagonals l + m = k + 1; the choice k = 1 gives X̂(1) = y_{1,1}, for k = 2 we have X̂(2) = (y_{1,2} + y_{2,1})/2, and so on. Diagonal averaging applied to a matrix Y_{I_k} produces a reconstructed series X̂^(k) = (X̂(1)^(k), ..., X̂(N)^(k)) [92]. Therefore, the initial series X(1), ..., X(N) is decomposed into a sum of m reconstructed series:

X_SSA(t) = Σ_{k=1}^{m} X̂(t)^(k),   t = 1, 2, ..., N                  (6)

After the hidden components have been extracted and the noise removed by the SSA method, the second phase starts to refine the time series again and detect the trend components of the extracted time series X_SSA. This phase converts a group of time series into multiple empirical modes, named IMFs, ranging from high to low frequencies. The procedure of the AFEEMD method is briefly described in three steps as follows.

The first step, frequency prescribing, is responsible for determining the appropriate upper frequency limit of the added white noise. At first, the upper frequency limits of the added noise, f_ui = (i + 1) f_s, i = 1, ..., @, are prescribed for the extracted signal X_SSA(t). Let f_s be the sampling frequency of the signal X_SSA(t). In EEMD, the frequency of the added white noise is determined based on the sampling frequency of the original signal [88].


Fig. 2 Cycle of RSA method for time series decomposition
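To make the embedding and diagonal-averaging steps of the SSA phase concrete, the sketch below (a simplified Python illustration; the SVD and grouping steps are omitted) builds the Hankel matrix of Eq. 3 and applies the diagonal averaging of Eq. 5. Applying the averaging to the unmodified trajectory matrix recovers the original series exactly, which is a useful sanity check; in the full method it is applied to each grouped component Y_{I_j} instead:

```python
def embed(x, L):
    """Embedding step: build the L x K Hankel (trajectory) matrix of Eq. 3."""
    K = len(x) - L + 1
    return [[x[l + k] for k in range(K)] for l in range(L)]

def diagonal_average(Y):
    """Diagonal averaging of Eq. 5: average over the anti-diagonals of Y."""
    L, K = len(Y), len(Y[0])
    N = L + K - 1
    sums, counts = [0.0] * N, [0] * N
    for l in range(L):
        for k in range(K):
            sums[l + k] += Y[l][k]   # entries on the same anti-diagonal share l + k
            counts[l + k] += 1
    return [s / c for s, c in zip(sums, counts)]

x = [2.0, 4.0, 6.0, 8.0, 10.0, 12.0]   # hypothetical series
Y = embed(x, L=3)                      # 3 x 4 trajectory matrix
x_rec = diagonal_average(Y)            # averaging the anti-diagonals recovers x
assert x_rec == x
```

In the real SSA pipeline, the matrix fed to `diagonal_average` would be a rank-reduced component obtained from the SVD of Y, not Y itself.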


For the convenience of the interpolation in the next step, the frequency f_ui is taken as integral multiples of the sampling frequency f_s. Generally, the value of @ is suggested to be 10 to 20. Cubic spline interpolation of the signal X_SSA(t) is then performed. Let X_i(t) be the interpolated signal, whose data length equals that of the signal X_SSA(t). Then, the decomposition of the signal X_i(t) is performed using the original EEMD method. We assume that the number of ensemble trials is two, the noise amplitude is 0.01 standard deviation (SD), and f_ui Hz is the upper frequency limit of the added white noise. After determining the different upper frequency limits, the RRMSE values are calculated by the following equation.

RRMSE_i = √( Σ_{t=1}^{N} (X_i(t) − c_max(i, t))² ) / √( Σ_{t=1}^{N} X_i(t)² )    (7)

Let N be the length of the original signal, and let c_max denote the IMF component with the highest correlation to the original signal X_i(t). After the RRMSE values have been calculated, the upper frequency limit at which the RRMSE reaches its maximum is selected as the proper upper frequency limit f_p.

In the ensemble trial step, one pair of white noises (g(t), −g(t)) is added to the signal X_SSA(t) to generate one pair of polluted signals (P(t), N(t)) with positive and negative noise,

P(t) = X_SSA(t) + g(t)
N(t) = X_SSA(t) − g(t)                                               (8)

Then, the EMD method is performed on the new pair of noise-added signals (P(t), N(t)), and a collection of IMFs and residues is obtained (Eq. 9).

P(t) = Σ_{i=1}^{ρ} c_i^+ + r_ρ^+
N(t) = Σ_{i=1}^{ρ} c_i^− + r_ρ^−                                      (9)

where c_i^+ is the ith IMF of the noise-added signal with positive noise, c_i^− represents the ith IMF of the noise-added signal with negative noise, r_ρ^+ and r_ρ^− are the final residues, and ρ is the number of IMFs. In the final step of AFEEMD, the mean of all the corresponding IMFs is calculated by Eq. 10.

c_i = (1/2)(c_i^+ + c_i^−)                                           (10)

where c_i is the ith primary IMF component derived by the AFEEMD method. In the AFEEMD method, the amplitude of the added white noise and the number of ensemble trials are fixed at 0.01 SD and two [88]. After the time series has been decomposed by the AFEEMD method, we assume that the observed additive time series has the form:

v = X_trend + X_season + X_irregular + X_cycle                        (11)
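The ensemble-trial logic of Eqs. 8–10 relies on the fact that a noise pair (g(t), −g(t)) cancels on averaging; a minimal sketch (with made-up, exactly representable noise values, and with the EMD step itself omitted):

```python
import math

x = [1.0, 2.0, 3.0]                # stands in for the extracted signal X_SSA(t)
g = [0.25, -0.5, 0.125]            # one white-noise realization (made-up values)

p = [xi + gi for xi, gi in zip(x, g)]        # P(t) = X_SSA(t) + g(t)   (Eq. 8)
n = [xi - gi for xi, gi in zip(x, g)]        # N(t) = X_SSA(t) - g(t)

# Eq. 10 at the signal level: averaging the +/- noise pair cancels the noise.
avg = [(pi + ni) / 2 for pi, ni in zip(p, n)]
assert avg == x

def rrmse(signal, c_max):
    """RRMSE of Eq. 7 between a signal and a candidate IMF c_max."""
    num = math.sqrt(sum((s - c) ** 2 for s, c in zip(signal, c_max)))
    den = math.sqrt(sum(s ** 2 for s in signal))
    return num / den

assert rrmse(x, x) == 0.0          # identical component: zero relative error
assert rrmse(x, [0.0] * 3) == 1.0  # zero component: RRMSE of one
```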


Here X_trend is the trend component, X_season the seasonal component, and X_cycle and X_irregular the cycle and irregular terms. Trend and cyclical components are both long-term non-stationary components, so we do not distinguish between them and rewrite Eq. 11 as follows:

v = X_trendCycle + X_season + X_irregular                             (12)

The final step of the proposed two-phase decomposition method is the recursive update, which is responsible for updating the parameters of the proposed approach and reducing the time consumption. After the proposed training methodology has run for the first time (explained in Sects. 3.3 and 3.4), the recursive update cycle of the proposed decomposition method begins (Fig. 2). In the recursive step, for each time series sample X(k), k = N + 1, N + 2, ..., the L value is incremented by Eq. 13.

L = L + 1  or  L = k − L + 1                                          (13)

In an online implementation, the SVD of the Hankel covariance matrix would have to be recomputed at every time instant. This is time-consuming, because the Hankel matrix Y has dimension L × K and, whenever a new time series sample arrives, L is updated, so the dimension of this matrix changes at every instant; for this reason, the computation time would not be acceptable. In our approach, we instead compute the SVD of the Hankel covariance matrix C_k recursively, using the current sample vector u_k and the previous covariance estimate C_{k−1} at any time k (Eqs. 14 and 15).

u_k = [y_{k−L+1}, y_{k−L+2}, ..., y_k]^T                               (14)

C_k = ((k − 1)/k) C_{k−1} + (1/k) u_k u_k^T                            (15)

In this way, the time spent performing the SVD of the Hankel covariance matrix can be reduced. After the proposed decomposition method has been applied to the input data, the extracted data is normalized into the range (0, 1) and then sent to the NFTSAR network to predict future resource usage.

3.3 Structural description of the NFTSAR network

This section focuses on the overall workflow of the NFTSAR method. Among the methods for nonlinear system modeling, the NFTS prediction model has attracted extensive attention due to its better generalization and excellent approximation in the dense region [33]. It combines the benefits of fuzzy logic and NNs in a single framework. Hence, the NFTS network is considered a more optimal choice, offering better efficiency than a purely neuro-fuzzy system [32]. For this reason, this paper uses the NFTS network to predict the future resource usage of hosts in cloud data centers. The network structure of the prediction model is shown in Fig. 3. The network has five layers, and all nodes in the same layer have similar functions. After the trend, seasonal, and irregular components have been extracted by the proposed two-phase decomposition approach, the components are sent to the NFTSAR network for training. In the following, we describe each layer of the NFTSAR network.

Layer 1. After the hidden components have been extracted, the first layer starts. This layer generates the antecedent propositions of the fuzzy rules using the knowledge extracted from the hidden components data [12, 13, 93]. At first, a data stream d_t is generated for each instant t. It contains the n components extracted by the proposed decomposition approach, d_t = (v_1(t−1), v_2(t−1), ..., v_n(t−1), X'(t)), d_t ∈ R^{1×(n+1)}. Let X'(t) be the desired output used for NFTSAR training and (v_1(t−1), v_2(t−1), ..., v_n(t−1)) be the inputs to the NFTSAR network.

Since the data of the different components can have significantly different value ranges, it is necessary to normalize or standardize the data stream d_t [31]. In this paper, we choose standardization, because the mean and the SD used by this method accumulate the effects of all data online. For each element of the data d_t = (d_1^t, ..., d_j^t, ..., d_{n+1}^t), the method can be written as [13]

d̂_j^t = (d_j^t − m_j^t) / δ_j^t,   1 ≤ j ≤ n + 1, t > N + 1            (16)

where d̂_j^t is the standardized value of the element d_j^t. The initial values of m_j^t and d̂_j^t are zero, and the initial value of δ_j^t is one (1 ≤ j ≤ n + 1, t ≤ N + 1). The mean m_j^t and the SD δ_j^t can be updated recursively by Eqs. 17 and 18.

m_j^t = ((t − 1)/t) m_j^{t−1} + (1/t) d_j^t,   1 ≤ j ≤ n + 1, t > N + 1       (17)

δ_j^t = √( ((t − 1)/t) (δ_j^{t−1})² + (1/t) (d_j^t − m_j^t)² ),   1 ≤ j ≤ n + 1, t > N + 1    (18)

where N + 1 is the first instant t after the initial conditions of the two-phase decomposition method.

Then, we use an unsupervised learning technique based on the AnYa algorithm to extract knowledge from the data d̂^t. Andonovski et al. [93] proposed this technique for evolving the structure in an on-line manner. This method divides the time series d̂^t into several data clouds.
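The online standardization of Eqs. 16–18 above can be sketched as follows (a single-variable Python illustration under the stated initial conditions; the exact form of the SD update is our reading of Eq. 18):

```python
def update_stats(d, m_prev, sd_prev, t):
    """One step of the recursive mean/SD updates (Eqs. 17 and 18)."""
    m = (t - 1) / t * m_prev + d / t
    sd = ((t - 1) / t * sd_prev ** 2 + (d - m) ** 2 / t) ** 0.5
    return m, sd

def standardize_stream(stream):
    """Standardize each arriving element with Eq. 16 using the running statistics."""
    m, sd = 0.0, 1.0               # initial conditions stated in the text
    out = []
    for t, d in enumerate(stream, start=1):
        m, sd = update_stats(d, m, sd, t)
        out.append((d - m) / sd if sd > 0 else 0.0)
    return out, m, sd

values = [10.0, 12.0, 11.0, 13.0]
_, m, sd = standardize_stream(values)
assert abs(m - sum(values) / len(values)) < 1e-9   # running mean == batch mean
```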


Fig. 3 Structure of the NFTSAR network

The method uses a cloud-based algorithm to determine the data clouds from the standardized data, to simplify the fuzzy rule base, and to eliminate extra rules. The degree of membership of a cloud is obtained from the normalized relative density of a given data sample d̂_j^t (Eq. 19).

μ_j^i(d̂_j^t) = H^t(D_j^{i,t}) / Σ_{w=1}^{R} H^t(D_j^{w,t}),   t > N + 1    (19)

where R is the number of clouds or fuzzy rules, and H^t(D_j^{i,t}) is the local density of the ith cloud for the data sample D_j^{i,t}, obtained from the distance between the current sample and all other samples of that cloud. The local density value of each standardized data sample is calculated according to the following [94],

H^t(d̂_j^{i,t}) = (1/M_i) Σ_{w=1}^{M_i} H^t(D_{j,w}^{i,t})                   (20)

where M_i is the number of data points in the ith cloud and D_{j,w}^{i,t} is the distance between the current sample and the wth sample of the ith cloud. Here, the Euclidean distance is used. The global density Γ^t is calculated for a given sample d̂_j^t as follows.

Γ^t(d̂_j^t) = (1/j) Σ_{w=1}^{j} H^t(D_{j,w}^t)                               (21)

At first, a cloud is created with the arrival of the first data sample. Then, the evolution process starts at time N + 1. As the data stream affects the density of the existing clouds at any time, the local density value H^t(D_{j,w}^t) and the global density Γ^t(d̂_j^t) are updated as follows (Eqs. 22 and 23).

H^t(d̂_j^{i,t}) = M_i / (M_i + Σ_{w=1}^{M_i} (D_{j,w}^{i,t})²)               (22)

Γ^t(d̂_j^t) = (j − 1) / ((j − 1) + Σ_{w=1}^{j−1} (D_{j,w}^t)²)               (23)

The evolution process used in this paper consists of two actions. The following describes each of them in detail.

Action 1 (Data Cloud Generation). A data stream carries information about unexplored regions of the data space [13], or has high generalization capacity, if it fulfills the following criterion.
larger than the global density of all the existing clouds,


If the global density of the new data sample, Γ^t(d̂_j^t), is larger than the global density of all the existing clouds, then a new cloud is created (Eq. 24). Let Γ^{i,t}(d̂_j) be the global density of the existing clouds.

Γ^t(d̂_j^t) > Γ^{i,t}(d̂_j),   ∀i, i = [1, R]                              (24)

Action 2 (Data Cloud Reduction). To remove redundancy and control high overlap between data clouds, we check the following conditions after a new fuzzy rule has been created.

1. If the following condition is satisfied for some existing data clouds, then these clouds are replaced with the new data clouds generated by Action 1.

∃i, i = [1, R] | H^t(d̂_j^{i,t}) > ε_1                                     (25)

2. If low activity is detected in a data cloud, the cloud should be removed [93]. We define activity as the number of data samples associated with the data cloud, counted from the moment of the cloud's creation. This condition is defined as follows.

S_i^t / (t − t_i) < η (1/R)                                                (26)

where t_i is the moment of the ith cloud's creation and η ∈ [0, 1] is a constant parameter. If η = 0, no data cloud will ever be removed, while if η = 1, adding a new data cloud removes the previous one. In practice, the constant η should be within [0, 1), usually 0.1 [93]; we choose 0.1 for η in this paper.

However, if neither action (Data Cloud Generation or Reduction) is triggered, the data points d̂^t are associated with the closest existing cloud.

According to the data d̂^t, the propositions of a fuzzy rule have the following composition,

Rule i: IF d̂^{t−1} ~ cloud i THEN x → class argmax_{i=1,...,R}(μ^i)        (27)

where μ^i is the membership of the ith cloud.

Layer 2. This layer investigates the impact of the fuzzy rules on the time series prediction. The firing strength of the succeeding node is the output of this layer. We assign a degree (Eq. 28) to each rule by multiplying the membership degrees of the input/output variables in the membership function. The activation degree denotes to what extent these rules influence the others.

ω_i^t = Π_{j=1}^{n+1} μ_j^i(d̂_j^t)                                         (28)

Then, the input weights are normalized and the normalized firing strength is calculated by Eq. 29.

λ_i^t = ω_i^t / Σ_{q=1}^{R} ω_q^t                                           (29)

where λ_i^t is the normalized firing level of the ith rule at instant t.

Every rule has an antecedent and a consequent part. The antecedent of each rule uses the information of the input data, and the consequent of each rule calculates the output of the model. One of the most important advantages of the AnYa algorithm is that, unlike traditional fuzzy rule based systems, whose antecedents and consequents have to be specified, the algorithm does not need these parameters to be determined. Thus, the time complexity of this algorithm is lower than that of traditional methods in determining these parameters [94].

Based on the chosen consequent type, the consequent parameters are determined. Since the structure of the model can be modified according to the input data, a recursive strategy is used in this paper to estimate the consequent parameters. The corresponding fuzzy inference system (TS) is as follows.

Rule i: IF d̂^{t−1} ~ cloud i THEN d_i^t = λ_i^t [a_i^t v(t − 1) + b_i^t û(t − s)]    (30)

where a_i^t ∈ R^{n×n}, b_i^t ∈ R^{n×n}, û(t − s) ∈ R^{n×1}, and û(t − s) = [d̂_1^{t−s}, ..., d̂_{n+1}^{t−s}]^T. Let û(t − s) be the information of the resource usage patterns at time t − s, and let s be the seasonal period of the time series; we assume that the value of s is supplied by experts as prior knowledge. The parameter set {a_i^t, b_i^t} is denoted as the consequent propositions of each fuzzy rule and must be identified. The determination of these parameters is detailed in layer 3.

Layer 3. We update the consequent parameters by the recursive least squares algorithm with a variable forgetting factor (WRLS-VFF) proposed by [30]. WRLS-VFF is an improved version of the recursive least squares (RLS) method with better performance in tracking sudden system changes. It has an optimal dynamic forgetting factor, obtained by minimizing the mean-square noise-free posterior error, and usually performs better in both stationary and non-stationary environments [30]. Moreover, WRLS-VFF minimizes the error functions. Therefore, we use WRLS-VFF in this paper. For more information, refer to [30].

Layer 4. In this layer, the outputs of NFTS and AR are produced separately. The output is estimated in two ways, NFTS and AR, which are calculated by Eqs. 31 and 32.



X(t)^NFTS = Σ_{i=1}^{R} λ_i^t d_i^t                                   (31)

X(t)^AR = Σ_{i=1}^{n} φ_i^t v_i^t                                     (32)

The parameter set {φ_i^t} is denoted as the consequent propositions of each fuzzy rule and is identified from a uniform distribution over the interval (0, 1).

Layer 5. The last layer calculates the output of NFTSAR (Eq. 33), which is a composition of the vectors X^NFTS(t) and X^AR(t).

X'(t) = χ_t^NFTS X(t)^NFTS + χ_t^AR X(t)^AR                           (33)

Then, the new time series is rebuilt using the final outputs obtained by Eq. 33 at time t. For simplicity, we use the uniform distribution over the interval (0, 1) to generate the values of χ_t^NFTS and χ_t^AR. Since we standardize the original data in the first layer, the output needs to be converted back into the original data range using Eqs. 17 and 18. Thus, the final outputs in the original range are:

X̃'_j(t) = X'_j(t) δ_j^t + m_j^t,   j = 1, ..., n                      (34)

In the end, the output of NFTSAR at instant t is computed as:

X̃'(t) = (1/n) Σ_{j=1}^{n} X̃'_j(t)                                    (35)

3.4 The training and predicting stages

An ANN is characterized by its structure (how neurons are connected to each other), its training method (how the weights between connected neurons are determined), and its transfer function [95]. Training an NN means determining the parameters of its structure. In this paper, we run these two stages, training and predicting, as in [13]. In this way, if the time series is available, the proposed methodology is trained and adds or removes fuzzy rules to perform the mapping between the hidden components data at time t − 1 and the one-step-ahead time series value at time t.

On the other hand, the predicting stage works when the data X(t) is not available. At that time, the NFTSAR does not evolve and the fuzzy rules are not updated; that is, the structure of NFTSAR is fixed during the predicting stage. The recursive two-phase decomposition phase does not run either, because there is no available data X(t) from which to extract the hidden components. To identify the hidden patterns and build the data stream d_t, the final output X̃'(t − 1) computed at the previous time is fed back to the input, so that:

d_t = (X̃'_1(t − 1), ..., X̃'_j(t − 1), ..., X̃'_n(t − 1), null)        (36)

Afterwards, layer 2 of NFTSAR runs normally, as does layer 3. The time series is computed based on the hidden components decomposition to create the future resource usage data of the host, and this resource usage data is then standardized.

4 Performance evaluation

In the following section, we describe the CPU data sets used to train and test our proposed RSA–NFTSAR, and we also describe the metrics that are used to determine the accuracy of the proposed method for predicting CPU usage.

4.1 Experimental setup

Before proceeding further, let us introduce the experimental setup used in the rest of this paper. We used Matlab r2015 for the simulations. For performance evaluation, we use the real VM workload traces provided by the PlanetLab files. These files include the CPU and memory utilization values of more than a thousand VMs, recorded every 5 min over a period of 24 h. There are 10 folders, and each folder contains between 898 and 1516 files. The PlanetLab data trace files can be obtained from their github [15]. The information on these VMs was gathered from 500 hosts around the world [12]. In this paper, we consider the CPU values from these data sets. We chose 10 days from the workload traces collected during March and April 2011. The characteristics of the VMs and their resource utilization in the PlanetLab traces are presented in Table 2.

We divide the PlanetLab files into three data sets. The first one is the training data, which contains 2296 CPU values. The second one, which contains 288 CPU values, is used to test the NFTSAR after the training phase. These two data sets come from the same host. The third data set is also used to test the NFTSAR; it comes from a different host and contains 2296 CPU values. We use two test data sets from two different hosts in order to assess the generality and adaptability of the NNs.

4.2 Performance metrics

There are many metrics for measuring the performance and validating the prediction accuracy of a model. We use three statistical indicators: the mean absolute percentage error (MAPE), the mean absolute error (MAE), and the root mean square error (RMSE), which are defined in Eqs. 37, 38 and 39.


We also compare the proposed model with six other models previously proposed for cloud workload prediction purposes [15, 48, 64, 66, 80, 81]. We chose these papers because they match our problem most closely and include three of the latest prediction models.

MAPE = (1/h) Σ_{i=1}^{h} |(P_{i,c} − P_{i,m}) / P_{i,m}| × 100        (37)

MAE = (1/h) Σ_{i=1}^{h} |P_{i,c} − P_{i,m}|                            (38)

RMSE = √( (1/h) Σ_{i=1}^{h} (P_{i,c} − P_{i,m})² )                     (39)

where P_{i,c} is the ith predicted value and P_{i,m} is the ith measured value. Let P_{i,c,avg} and P_{i,m,avg} be the means of the predicted and measured values, and let h be the total number of observed data points. The RMSE value reflects the accuracy of a model by comparing the differences between the estimated and the actually observed values. The MAPE refers to the relative accuracy of the prediction deviation of the model, whereas the MAE represents the absolute errors between predicted and measured data. Smaller RMSE, MAPE, and MAE values indicate a better-performing model [32].

4.3 Parameter settings

In this subsection, we evaluate the characteristics of the proposed method. Variation of some of the parameters considered in Sect. 3.2 can affect the results, such as the initial number of time series samples W, the dimension of the lagged vectors in the SSA method L, and the number of hidden components n. In the following, we try to select suitable values for these three parameters.

To set the values of the other parameters, we split the training set into three time series, namely TS1, TS2, and TS3. The interval of the utilization measurements is 5 min, with seasonality 12. This seasonal period indicates that the resource usage pattern repeats every 12 time series samples. We consider one of the three time series mentioned previously to evaluate these parameters. The adopted predicting horizon is h = 12.

The techniques are compared according to the performance metrics described in Sect. 4.2. The comparison results are shown in Table 3. As shown in Table 3, the NFTSAR has the lowest error values in these experiments, so it can be concluded that NFTSAR performs best among the compared approaches. Table 4 shows the values of the effectiveness parameters (W, L, n), the number of data clouds (R), and the running time (in seconds) for the three time series experiments. As time is very important in hourly time series, we investigate the running time of our approach. As shown in Table 4, the running time of our proposed method is below 0.5 s. We also report the running times of the six other approaches [15, 48, 64, 66, 80, 81] in Table 9 of Sect. 4.5.

4.4 Decomposing the CPU utilization time series using the RSA method

The way the workload arrives is non-stationary in nature [96, 97]. This leads to poor generalization and undesirable CPU utilization prediction performance, because the non-stationarities impose a number of pseudo-variation requirements on the model and hinder a correct understanding of the data variations. For this reason, the two-phase decomposition method RSA is proposed in this paper to simplify the predicting task. The results of the RSA method are demonstrated in Figs. 4 and 5. As explained before, RSA runs in two phases, SSA and AFEEMD. In the first phase, SSA is used to decompose the time series into four different components. Figure 4 shows the four components produced by the SSA for CPU utilization.

After the hidden components have been extracted and the series de-noised by the SSA method, the second phase starts to refine the time series again and to detect the trend components of the time series extracted by the SSA method. By employing the AFEEMD technique, the original CPU utilization time series is decomposed into several independent IMFs and one residue. The results are illustrated in Fig. 5: the time series is decomposed into four independent IMF components and one residue component, with the IMFs sorted from the highest frequency to the lowest.

As shown in Fig. 5, the original sequence can be classified into two categories. The low-frequency components represent the trend of the original data, and the high-frequency components reflect its variation; the variations may result from some vague factors. For example, IMF1–IMF4 are the IMF components from high frequency to low frequency, and the Residue is the lowest-frequency component. The Residue component shows the overall trend of CPU utilization, which first increases and finally keeps steady on the whole. IMF3–IMF4 indicate the explicit fluctuation of the original sequence. The high-frequency components IMF1–IMF2 reflect the variation of the original sequence affected by other random factors. Hence, a significant improvement of the computational efficiency can

Table 2 Characteristics of the


Date VMs Res Mean (%) SD (%) Median
considered workload traces
10 days March–April 2011 1473 CPU 19.77 14.55 15

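To make the first (SSA) phase concrete, the embedding, SVD and diagonal-averaging steps can be sketched as below. This is a minimal illustration, not the authors' implementation; the function name and the seasonal toy series are assumptions.

```python
# Minimal SSA sketch: embed the series in a trajectory matrix, take its SVD,
# and rebuild one additive component per singular triple by diagonal averaging.
import numpy as np

def ssa_decompose(x, L, n_components):
    x = np.asarray(x, dtype=float)
    N = len(x)
    K = N - L + 1
    # Trajectory (Hankel) matrix whose columns are the L-lagged vectors.
    X = np.column_stack([x[i:i + L] for i in range(K)])
    U, s, Vt = np.linalg.svd(X, full_matrices=False)
    comps = []
    for i in range(n_components):
        Xi = s[i] * np.outer(U[:, i], Vt[i])          # rank-1 elementary matrix
        # Diagonal averaging (Hankelization) maps Xi back to a length-N series.
        comp = np.array([Xi[::-1].diagonal(k).mean() for k in range(-L + 1, K)])
        comps.append(comp)
    return np.array(comps)

# Seasonal toy series (period 12, like the 5-min CPU traces with seasonality 12):
t = np.arange(120)
series = 10 + 0.05 * t + 2.0 * np.sin(2 * np.pi * t / 12)
components = ssa_decompose(series, L=24, n_components=4)
denoised = components.sum(axis=0)   # leading components carry trend + season
```

Summing all min(L, K) components reconstructs the series exactly; keeping only the leading ones acts as the de-noising step described above.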

Table 3 Quantitative comparison of the techniques by criteria

Approach   Index      TS1     TS2     TS3
ENN        MAPE (%)   1.1458  1.8045  1.2151
           MAE        0.1714  0.2146  0.162
           RMSE       0.8766  0.5162  0.3907
MPHW       MAPE (%)   0.9306  1.0547  0.8795
           MAE        0.1721  0.2085  0.1532
           RMSE       0.2917  0.3246  0.2231
LSTM       MAPE (%)   1.1765  1.6192  1.1656
           MAE        0.2289  0.3014  0.1752
           RMSE       0.3312  0.3528  0.3091
SGW-SCN    MAPE (%)   1.0367  1.3274  1.2094
           MAE        0.1699  0.2118  0.1723
           RMSE       0.3904  0.4129  0.3416
SDBFNN     MAPE (%)   0.6067  0.7288  0.6146
           MAE        0.1606  0.1914  0.1482
           RMSE       0.2108  0.2676  0.2083
TL-GDBN    MAPE (%)   0.6251  0.7547  0.6318
           MAE        0.1672  0.2053  0.1498
           RMSE       0.2401  0.2721  0.2199
NFTSAR     MAPE (%)   0.5842  0.6021  0.5548
           MAE        0.1564  0.1758  0.1248
           RMSE       0.2051  0.2410  0.1843

Table 4 Parameter values for each time series

Time series  W   L  n  r_j^(N+1)  f    R  Time (s)
TS1          13  5  5  0.00232    0.5  6  0.37
TS2          21  3  3  0.00014    0.5  5  0.25
TS3          14  6  6  0.00005    0.5  3  0.22

4.5 Comparison of different decomposition algorithms

In this subsection, we compare the performance of the proposed decomposition method with four other popular decomposition algorithms: AFEEMD, EEMD, SSA and WD. First, we evaluate our proposed method, NFTSAR, without applying the two-phase decomposition method, for one to three steps ahead (as shown in Table 5). The error evaluation results of the different decomposition methods in multi-step prediction are displayed in Table 6 and Fig. 6. Considering the results of Tables 5 and 6, it can be said that all the decomposition-based models (Table 6) perform better than the single model (Table 5) in multi-step prediction. For example, compared with the single NFTSAR, the MAPE indexes of EEMD, WD, AFEEMD, SSA and RSA decline from 1.4115% to 1.0348%, 1.0324%, 1.011%, 0.8354% and 0.5462% in 1-step prediction, respectively. It can be seen that the prediction difficulty decreases greatly, since the time series can be decomposed into steadier modes by the decomposition methods. Moreover, we observe that RSA–NFTSAR has the best accuracy among the decomposition-based models. We can conclude that accurate de-noising and parameter selection improve the prediction performance.

From Table 6 and Fig. 6, it is evident that the SSA-based model obtains better prediction accuracy than the other single decomposition methods. For this reason, we apply the SSA method to decompose the resource usage series in the first step of RSA. On the other hand, comparing RSA–NFTSAR with SSA–NFTSAR, the MAE improvement percentages of RSA–NFTSAR are 36.11%, 60.63% and 27.82% in 1-step, 2-step and 3-step prediction, respectively. These results demonstrate that AFEEMD as a secondary decomposition can improve the prediction precision. Moreover, the increase of the MAPE, MAE and RMSE values of all the models in the multi-step predictions indicates that the prediction error of each step is carried into the next steps.

4.6 Comparison of different prediction approaches

To show the effectiveness of our proposed approach, we compare our prediction approach with six other approaches [15, 48, 64, 66, 80, 81] over multiple time steps ahead. The aim of the multiple-time-steps-ahead prediction experiment is to evaluate how far into the future the model can accurately predict the CPU usage of a host, and to determine by how much the prediction accuracy decreases the further into the future the model attempts to predict [19]. We predict the CPU usage of a host from one to six time steps into the future. Each time step takes 5 min, so with six time steps the CPU usage prediction reaches 30 min into the future in total.

The prediction accuracy on the training and testing data sets at each of the six time steps is presented in Tables 7 and 8, respectively. As shown in the tables, the error metrics increase almost linearly with the prediction horizon.
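The growth of the error with the horizon is inherent to iterated multi-step forecasting: a one-step predictor is applied repeatedly, feeding each prediction back into its own input window, so every step inherits the errors of the previous ones. A sketch of this scheme (the seasonal-naive stand-in model is a hypothetical placeholder, not the NFTS network):

```python
# Iterated multi-step forecasting: apply a one-step model h times, sliding its
# own predictions into the input window, so each step inherits earlier errors.
import numpy as np

def multi_step_forecast(one_step_model, history, horizon, window):
    buf = list(np.asarray(history, dtype=float)[-window:])
    preds = []
    for _ in range(horizon):
        y_hat = float(one_step_model(np.array(buf)))  # next-value prediction
        preds.append(y_hat)
        buf = buf[1:] + [y_hat]                       # slide over own output
    return np.array(preds)

# Stand-in one-step model: seasonal-naive with period 12 (any callable mapping
# a window to the next value, e.g. a trained network, fits this interface).
period = 12
seasonal_naive = lambda w: w[-period]
history = np.sin(2 * np.pi * np.arange(60) / period)
six_ahead = multi_step_forecast(seasonal_naive, history, horizon=6, window=24)
```

On a purely periodic series the stand-in model is exact; with a real predictor, the fed-back errors are what make the 6-step figures worse than the 1-step ones.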


Fig. 4 The four components obtained from the SSA method
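Figure 5 shows the second-phase output; its core is EMD-style sifting, which AFEEMD extends with ensemble noise and adaptive, fast refinements. A deliberately simplified plain-EMD sketch (naive envelope handling, fixed sifting count; an illustration of the idea, not the AFEEMD algorithm itself):

```python
# Simplified EMD sifting: each IMF is the fastest oscillation left after
# repeatedly subtracting the mean of the cubic-spline envelopes; by
# construction the IMFs plus the residue sum back to the signal exactly.
import numpy as np
from scipy.interpolate import CubicSpline

def _envelope_mean(x):
    n = len(x)
    mx = [i for i in range(1, n - 1) if x[i - 1] < x[i] >= x[i + 1]]
    mn = [i for i in range(1, n - 1) if x[i - 1] > x[i] <= x[i + 1]]
    if len(mx) < 2 or len(mn) < 2:
        return None                                  # too smooth to sift further
    t = np.arange(n)
    upper = CubicSpline([0] + mx + [n - 1], x[[0] + mx + [n - 1]])(t)
    lower = CubicSpline([0] + mn + [n - 1], x[[0] + mn + [n - 1]])(t)
    return (upper + lower) / 2.0

def emd(x, max_imfs=4, sift_iters=8):
    residue = np.asarray(x, dtype=float).copy()
    imfs = []
    for _ in range(max_imfs):
        if _envelope_mean(residue) is None:          # residue monotone: stop
            break
        h = residue.copy()
        for _ in range(sift_iters):
            m = _envelope_mean(h)
            if m is None:
                break
            h = h - m
        imfs.append(h)
        residue = residue - h
    return imfs, residue

t = np.arange(300)
signal = np.sin(2 * np.pi * t / 10) + 0.5 * np.sin(2 * np.pi * t / 75) + 0.002 * t
imfs, residue = emd(signal)   # IMFs from highest to lowest frequency + residue
```

The first IMF captures the fastest oscillation, later ones progressively slower components, and the residue the trend, mirroring the IMF1–IMF4/residue ordering discussed for Fig. 5.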

According to the results, it can be concluded that our approach predicts future resource usage accurately compared with the other six approaches. This is because our approach not only uses a rule-based inference system but also decomposes the original time series into simpler time series. It is noted that SDBFNN and TL-GDBN have 1-step prediction results close to those of our proposed approach, but the prediction errors of these approaches increase when the prediction is made several steps ahead. Among the six compared approaches, TL-GDBN and SDBFNN have lower prediction errors than the other approaches. This is because SDBFNN uses a decomposition method to de-noise the original series, and TL-GDBN uses transfer learning to transfer the knowledge from the learned weight parameters to newly added neurons and hidden layers, which allows the structure to grow until the stopping criterion for pretraining is satisfied.

Table 5 The multi-step prediction error evaluation results of the single NFTSAR model without applying the two-phase decomposition method

Index     1-Step  2-Step  3-Step
MAPE (%)  1.4115  1.5462  1.7854
MAE       0.3033  0.3574  0.3761
RMSE      0.3316  0.3649  0.4025

Fig. 5 The decomposed results derived by the AFEEMD method for the CPU utilization time series

In Table 9, we also evaluate the training time and testing time of the proposed CPU utilization prediction model. It is observed that the computational times of the training and testing steps are shorter than 3 s.


This makes the method suitable for resource usage prediction. The models need to be retrained after a certain period; we choose half an hour as the time interval for retraining the prediction models. The results demonstrate that the computational time of the proposed method is reasonable.

Table 6 The error evaluation results of different decomposition algorithms in multi-step prediction

Approach        Index     1-Step  2-Step  3-Step
EEMD-NFTSAR     MAPE (%)  1.0324  1.5008  1.7045
                MAE       0.1218  0.1684  0.2005
                RMSE      0.1486  0.1903  0.2152
AFEEMD-NFTSAR   MAPE (%)  1.011   1.2836  1.4247
                MAE       0.0971  0.1202  0.1427
                RMSE      0.1407  0.1721  0.1929
SSA-NFTSAR      MAPE (%)  0.8354  0.8936  0.9794
                MAE       0.0706  0.0816  0.0981
                RMSE      0.0997  0.1281  0.1532
WD-NFTSAR       MAPE (%)  1.0348  1.5104  1.7435
                MAE       0.1472  0.1693  0.2098
                RMSE      0.1594  0.2061  0.2198
RSA–NFTSAR      MAPE (%)  0.5462  0.6003  0.6702
                MAE       0.0451  0.0566  0.0708
                RMSE      0.0608  0.0694  0.0932

Fig. 6 The multi-step error evaluation of different decomposition algorithms

Figure 7a shows the prediction results of the LSTM, MPHW and RSA–NFTSAR algorithms. The RSA–NFTSAR algorithm closely follows the trend of the actual CPU data set. Figure 7b shows the MAE of the RSA–NFTSAR algorithm on the test data for both 1-step-ahead and 6-steps-ahead prediction. The prediction errors of the RSA–NFTSAR algorithm increase when the prediction is made six steps ahead (as shown in Fig. 7b). When CPU utilization suddenly changes, at time step 14 (Fig. 7a), the MAE accuracy of RSA–NFTSAR decreases at the same time (Fig. 7b). However, the results show that RSA–NFTSAR performs well multiple steps ahead into the future.

For real applications, the convergence result guarantees deterministic convergent behavior from a theoretical point of view [96]. So, we mainly present the convergence behavior of the proposed model in Fig. 8. Given the low RMSE of RSA–NFTSAR compared to the six other approaches, it can easily be seen that RSA–NFTSAR converges at a much faster rate than the six other approaches in training mode.

In Table 10, we show the improvement percentages of the comparison models achieved by RSA–NFTSAR. In accordance with the results, it can be summarized as follows:

1. As shown in the experiments, RSA–NFTSAR has better prediction accuracy and performance than the comparison models. So, we can conclude that it has satisfactory prediction accuracy.
2. The prediction results of the RSA–NFTSAR model are better than those of the ENN model. For example, compared to the ENN model, the MAPE values of the proposed model are reduced by 38.59%, 38.12%, 35.13%, 41.32%, 44.45% and 45.16%, respectively; the MAE values are reduced by 38.02%, 39.06%, 38.10%, 46.20%, 50.95% and 44.57%, respectively; and the RMSE values are reduced by 53.51%, 50.50%, 49.96%, 43.53%, 44.07% and 34.10%, respectively.
3. The prediction results of the RSA–NFTSAR model are better than those of the MPHW model. For instance, compared to the MPHW model, the MAPE values of the proposed model are reduced by


36.88%, 36.43%, 34.42%, 38.32%, 40.27% and 38.86%, respectively; the MAE values are reduced by 36.13%, 37.66%, 36.08%, 42.83%, 44.04% and 41.51%, respectively; and the RMSE values are reduced by 35.66%, 38.79%, 40.32%, 39.72%, 40.23% and 36.03%, respectively.

Table 7 Multiple-steps-ahead training data prediction accuracy

Approach     Index     1-Step  2-Step  3-Step  4-Step  5-Step  6-Step
ENN          MAPE (%)  0.3796  0.4173  0.6989  0.8676  1.0185  1.4229
             MAE       0.0478  0.0589  0.0702  0.1176  0.1392  0.1609
             RMSE      0.0832  0.1178  0.1245  0.1426  0.1502  0.1775
MPHW         MAPE (%)  0.3756  0.4152  0.6791  0.7895  0.9447  1.3525
             MAE       0.0472  0.0597  0.0725  0.1214  0.1364  0.1592
             RMSE      0.0814  0.1107  0.1179  0.1264  0.1307  0.1572
LSTM         MAPE (%)  0.3554  0.3894  0.6574  0.7145  0.8685  1.2354
             MAE       0.0412  0.0518  0.0597  0.0982  0.1109  0.1316
             RMSE      0.0612  0.0846  0.1042  0.1084  0.1124  0.1313
SGW-SCN      MAPE (%)  0.3517  0.4041  0.6747  0.7987  1.0341  1.4674
             MAE       0.0435  0.0523  0.0816  0.1098  0.1646  0.1738
             RMSE      0.0754  0.1132  0.1438  0.1662  0.1554  0.1689
SDBFNN       MAPE (%)  0.3329  0.3669  0.4934  0.6064  0.7167  0.8975
             MAE       0.0411  0.0478  0.0513  0.0852  0.0834  0.1267
             RMSE      0.0481  0.0639  0.0855  0.0929  0.1159  0.1479
TL-GDBN      MAPE (%)  0.3511  0.3741  0.5874  0.7015  0.8236  1.0697
             MAE       0.0367  0.0496  0.0616  0.0951  0.1497  0.1424
             RMSE      0.0521  0.0784  0.0911  0.1024  0.1336  0.1591
RSA–NFTSAR   MAPE (%)  0.279   0.2989  0.4178  0.4901  0.5658  0.7415
             MAE       0.0295  0.0361  0.0395  0.0647  0.0716  0.0817
             RMSE      0.0416  0.0592  0.0672  0.0781  0.0834  0.1008

Table 8 Multiple-steps-ahead test data prediction accuracy

Approach     Index     1-Step  2-Step  3-Step  4-Step  5-Step  6-Step
ENN          MAPE (%)  0.3211  0.3843  0.548   0.7278  0.8926  1.2085
             MAE       0.0405  0.0489  0.0601  0.1067  0.1311  0.1454
             RMSE      0.0854  0.0997  0.1177  0.1314  0.1417  0.156
MPHW         MAPE (%)  0.3124  0.3741  0.5421  0.6925  0.8301  1.0841
             MAE       0.0393  0.0478  0.0582  0.1004  0.1149  0.1378
             RMSE      0.0617  0.0807  0.0987  0.1231  0.1397  0.1607
LSTM         MAPE (%)  0.2447  0.3024  0.4474  0.5675  0.6815  0.9167
             MAE       0.0304  0.0399  0.0503  0.0845  0.0975  0.1114
             RMSE      0.0512  0.0692  0.0849  0.0964  0.119   0.1329
SGW-SCN      MAPE (%)  0.2642  0.3468  0.5649  0.7142  0.8197  1.2134
             MAE       0.0324  0.0443  0.0686  0.1034  0.1135  0.1581
             RMSE      0.0561  0.0765  0.1075  0.1297  0.1296  0.1792
SDBFNN       MAPE (%)  0.2384  0.2903  0.4454  0.5775  0.6687  0.9168
             MAE       0.0296  0.0376  0.0457  0.0737  0.0805  0.1183
             RMSE      0.0481  0.0594  0.0761  0.0925  0.1063  0.1408
TL-GDBN      MAPE (%)  0.2522  0.3189  0.4947  0.6286  0.7241  0.9934
             MAE       0.0316  0.0389  0.0476  0.0798  0.0994  0.1202
             RMSE      0.0504  0.0642  0.0811  0.0929  0.1221  0.1494
RSA–NFTSAR   MAPE (%)  0.1972  0.2378  0.3555  0.4271  0.4958  0.6628
             MAE       0.0251  0.0298  0.0372  0.0574  0.0643  0.0806
             RMSE      0.0397  0.0494  0.0589  0.0742  0.0835  0.1028
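The accuracy figures in Tables 7 and 8 follow the standard definitions of the three error metrics; a small helper makes them concrete (the sample values are illustrative, not taken from the traces):

```python
# Standard definitions of the three error metrics used throughout Sect. 4.
import numpy as np

def mape(measured, predicted):
    m, p = np.asarray(measured, float), np.asarray(predicted, float)
    return 100.0 * np.mean(np.abs((m - p) / m))   # measured values must be non-zero

def mae(measured, predicted):
    m, p = np.asarray(measured, float), np.asarray(predicted, float)
    return float(np.mean(np.abs(m - p)))

def rmse(measured, predicted):
    m, p = np.asarray(measured, float), np.asarray(predicted, float)
    return float(np.sqrt(np.mean((m - p) ** 2)))

measured = [20.0, 22.0, 19.0, 25.0]    # illustrative CPU utilization values (%)
predicted = [21.0, 21.5, 18.0, 24.0]
errors = mape(measured, predicted), mae(measured, predicted), rmse(measured, predicted)
```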


Table 9 Comparison of different models on training and testing time

Approach     Training time (s)  Testing time (s)
ENN          1.9530             0.5645
MPHW         2.0352             0.7164
LSTM         1.6639             0.5940
SGW-SCN      2.0024             0.6484
SDBFNN       0.8347             0.2982
TL-GDBN      1.6028             0.4212
RSA–NFTSAR   0.8124             0.2741

Fig. 7 Predictive performance of LSTM, MPHW and RSA–NFTSAR with MAE for one and six time steps: a actual, LSTM, MPHW and RSA–NFTSAR CPU values; b MAE for one and six time steps

4. The prediction results of the RSA–NFTSAR model are better than those of the LSTM model. For instance, compared to the LSTM model, the MAPE values of the proposed model are reduced by 19.41%, 21.36%, 20.54%, 24.74%, 27.25% and 27.70%, respectively; the MAE values are reduced by 17.43%, 25.31%, 26.04%, 32.07%, 34.05% and 27.65%, respectively; and the RMSE values are reduced by 22.46%, 28.61%, 30.62%, 23.03%, 29.83% and 22.65%, respectively.
5. The prediction results of the RSA–NFTSAR model are better than those of the SGW-SCN model. For instance, compared to the SGW-SCN model, the MAPE values of the proposed model are reduced by 25.36%, 31.43%, 37.07%, 40.20%, 39.51% and 45.38%, respectively; the MAE values are reduced by 22.53%, 32.73%, 45.77%, 44.49%, 43.35% and 49.02%, respectively; and the RMSE values are reduced by 29.23%, 35.42%, 45.21%, 42.79%, 35.57% and 42.63%, respectively.
6. The prediction results of the RSA–NFTSAR model are better than those of the SDBFNN model. For instance, compared to the SDBFNN model, the


MAPE values of the proposed model are reduced by 17.28%, 18.08%, 20.18%, 26.04%, 25.86% and 27.71%, respectively; the MAE values are reduced by 15.20%, 20.74%, 18.60%, 22.12%, 20.12% and 31.87%, respectively; and the RMSE values are reduced by 17.46%, 16.84%, 22.60%, 19.78%, 21.45% and 26.99%, respectively.
7. The prediction results of the RSA–NFTSAR model are better than those of the TL-GDBN model. For instance, compared to the TL-GDBN model, the MAPE values of the proposed model are reduced by 21.81%, 25.43%, 28.14%, 32.06%, 31.53% and 33.28%, respectively; the MAE values are reduced by 20.57%, 23.39%, 21.85%, 28.07%, 35.31% and 32.95%, respectively; and the RMSE values are reduced by 21.23%, 23.05%, 27.37%, 20.13%, 31.61% and 31.19%, respectively.

Fig. 8 Convergence performance of RSA–NFTSAR along with the iteration number

Table 10 Improvement percentages of the comparison models by the RSA–NFTSAR model

Approach   Index      1-Step  2-Step  3-Step  4-Step  5-Step  6-Step
ENN        pMAPE (%)  38.59   38.12   35.13   41.32   44.45   45.16
           pMAE (%)   38.02   39.06   38.10   46.20   50.95   44.57
           pRMSE (%)  53.51   50.45   49.96   43.53   41.07   34.10
MPHW       pMAPE (%)  36.88   36.43   34.42   38.32   40.27   38.86
           pMAE (%)   36.13   37.66   36.08   42.83   44.04   41.51
           pRMSE (%)  35.66   38.79   40.32   39.72   40.23   36.03
LSTM       pMAPE (%)  19.41   21.36   20.54   24.74   27.25   27.70
           pMAE (%)   17.43   25.31   26.04   32.07   34.05   27.65
           pRMSE (%)  22.46   28.61   30.62   23.03   29.83   22.65
SGW-SCN    pMAPE (%)  25.36   31.43   37.07   40.20   39.51   45.38
           pMAE (%)   22.53   32.73   45.77   44.49   43.35   49.02
           pRMSE (%)  29.23   35.42   45.21   42.79   35.57   42.63
SDBFNN     pMAPE (%)  17.28   18.08   20.18   26.04   25.86   27.71
           pMAE (%)   15.20   20.74   18.60   22.12   20.12   31.87
           pRMSE (%)  17.46   16.84   22.60   19.78   21.45   26.99
TL-GDBN    pMAPE (%)  21.81   25.43   28.14   32.06   31.53   33.28
           pMAE (%)   20.57   23.39   21.85   28.07   35.31   32.95
           pRMSE (%)  21.23   23.05   27.37   20.13   31.61   31.19

To show the strength of our proposed method, we compare the error trend lines (RMSE) of RSA–NFTSAR with NFTSAR and the six other approaches [15, 48, 64, 66, 80, 81]. In this experiment, only the number of inputs varies; 100 data samples of the dataset are selected randomly. The results are shown in Fig. 9, which demonstrates that RSA–NFTSAR has a lower RMSE value than the other methods for locations with fewer data samples.
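The pMAPE/pMAE/pRMSE entries in Table 10 are relative error reductions; under that standard definition, the tabulated values can be reproduced from Tables 7 and 8, for example:

```python
# Improvement percentage of the proposed model over a baseline, as used
# (under its standard definition) for the pMAPE/pMAE/pRMSE entries.
def improvement_pct(baseline_error, proposed_error):
    return 100.0 * (baseline_error - proposed_error) / baseline_error

# 1-step test MAPE from Table 8: ENN = 0.3211, RSA-NFTSAR = 0.1972.
p = improvement_pct(0.3211, 0.1972)
# round(p, 2) gives 38.59, matching the ENN pMAPE entry in Table 10.
```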


On the other hand, when the number of samples of a dataset grows, the RMSE of RSA–NFTSAR decreases. This is because RSA–NFTSAR decomposes the original dataset and removes the noise. The figure shows that building a new model needs at least 40 samples to obtain an acceptable prediction accuracy. Moreover, it can be seen that the proposed decomposition method, RSA, leads to a slight improvement of the error trend line compared to the case when no decomposition is performed. It is very clear that RSA–NFTSAR provides better results than the six other approaches. Although SDBFNN also decomposes the original series, it has a higher error than our proposed method. This is because SDBFNN is only capable of splitting the series into two hidden components, so SDBFNN may produce worse predictions in noisy environments with different trends. However, a key difference is that RSA–NFTSAR uses only a small fraction of all available predictors and hence drastically reduces the computational complexity and training time of a neural network model.

Fig. 9 Impact of the number of data samples on RMSE for RSA–NFTSAR, NFTSAR and the other six approaches

5 Conclusion and future works

Driven by factors such as cycles and seasonality, resource usage series present corresponding data characteristics, resulting in inferential and computational complexity for developing an effective prediction model, which cannot generate favorable prediction results. Following the "divide and conquer" idea of identifying simpler patterns, an effective alternative to overcome this problem is to capture these data characteristics and then employ prediction techniques for forecasting purposes [85].

In this paper, we presented an adaptive resource usage prediction model designed on the basis of a two-phase decomposition method and a hybrid NFTS network. According to the characteristics of resource usage data, the proposed RSA-based decomposition strategy can decompose the resource usage series into several meaningful modes. These interpretable modes represent hourly resource usage trend changes. Thus, the decomposition strategy significantly reduces the heavy analyses needed for identifying and extracting meaningful components in the predictions. On the other hand, it helps the NFTS network to identify its structure and to predict future resource usage accurately.

In addition, the proposed method achieved good prediction results for seasonal resource usage series compared with other approaches [15, 48, 64, 66, 80, 81]. We also found that extracting the hidden components of seasonal time series can lead to more accurate predictions. According to the results, we can summarize that: (a) the proposed model has good prediction accuracy and generalization performance in short-term multi-step resource usage prediction; and (b) the proposed model performs significantly better than the comparison models from 1-step to 6-step CPU usage predictions, with average performance improvements of 33.83% in MAPE, 36.54% in MAE and 36.70% in RMSE.

For future work, we aim to present a simple heuristic to place time-sensitive VMs on active hosts based on the prediction results of our methodology. We intend to show that the proposed methodology can be effective for a time-sensitive VM placement policy and can reduce the energy cost of cloud components in cloud computing.

References

1. Chen, J., Wang, Y.: A resource demand prediction method based on EEMD in cloud computing. Int. Congress Inform. Commun. Technol. 131, 116–123 (2018)
2. Raagaard, M.L., Pop, P., Gutierrez, M., Steiner, W.: Runtime reconfiguration of time-sensitive networking schedules for fog computing. IEEE Fog World Congress (2017). https://doi.org/10.1109/FWC.2017.8368523
3. Balaji, M., Kumar, C.A., Rao, G.S.V.R.K.: Predictive cloud resource management framework for enterprise workloads. J. King Saud Univ. Comput. Inform. Sci. 30, 404–415 (2018)
4. Nayak, S.C., Parida, S., Tripathy, C., Pattnaik, P.K.: An enhanced deadline constraint based task scheduling mechanism for cloud environment. J. King Saud Univ. Comput. Inform. (2018). https://doi.org/10.1016/j.jksuci.2018.10.009
5. Nayak, S.C., Tripathy, C.: Deadline sensitive lease scheduling in cloud computing environment using AHP. J. King Saud Univ. Comput. Inform. 30, 152–163 (2018)
6. Khabbaz, M., Assi, C.: Modelling and analysis of a novel deadline-aware scheduling scheme for cloud computing data centers. IEEE Trans. Cloud Comput. 6, 141–155 (2015)
7. Chen, C.H., Lin, G.W., Kuo, S.Y.: MapReduce scheduling for deadline-constrained jobs in heterogeneous cloud computing systems. IEEE Trans. Cloud Comput. (2015). https://doi.org/10.1109/TCC.2015.2474403


8. Li, Z., Ge, J., Hu, H., Song, W., Hu, H., Luo, B.: Cost and energy aware scheduling algorithm for scientific workflows with deadline constraint in clouds. IEEE Trans. Serv. Comput. (2015). https://doi.org/10.1109/TSC.2015.2466545
9. Ji, S., Liu, S., Li, B.: Deadline-aware scheduling and routing for inter-datacenter multicast transfers. IEEE Int. Conf. Cloud Eng. (2018). https://doi.org/10.1109/IC2E.2018.00035
10. Shishido, H.Y., Estrella, J.C., Toledo, C.F.M., Arantes, M.S.: Genetic-based algorithms applied to a workflow scheduling algorithm with security and deadline constraints in clouds. Comput. Electr. Eng. (2017). https://doi.org/10.1016/j.compeleceng.2017.12.004
11. Garg, N., Goraya, M.S.: Task deadline-aware energy-efficient scheduling model for a virtualized cloud. Arab J. Sci. Eng. (2018). https://doi.org/10.1007/s13369-017-2779-5
12. Rahmanian, A.A., Ghobaei-Arani, M., Tofighy, S.: A learning automata-based ensemble resource usage prediction algorithm for cloud computing environment. Future Gener. Comput. Syst. (2017). https://doi.org/10.1016/j.future.2017.09.049
13. Rodrigues Junior, S.E., Serra, G.L.O.: A novel intelligent approach for state space evolving forecasting of seasonal time series. Eng. Appl. Artif. Intell. 64, 272–285 (2017)
14. Abdollahzade, M., Miranian, A., Hassani, H., Iranmanesh, H.: A new hybrid enhanced local linear neuro-fuzzy model based on the optimized singular spectrum analysis and its application for nonlinear and chaotic time series forecasting. Inform. Sci. (2014). https://doi.org/10.1016/j.ins.2014.09.002
15. Mason, K., Duggan, M., Barrett, E., Duggan, J., Howley, E.: Predicting host CPU utilization in the cloud using evolutionary neural networks. Future Gener. Comput. Syst. (2018). https://doi.org/10.1016/j.future.2018.03.040
16. Kumar, J., Singh, A.K.: Workload prediction in cloud using artificial neural network and adaptive differential evolution. Future Gener. Comput. Syst. (2018). https://doi.org/10.1016/j.future.2017.10.047
17. Gao, C., Sun, H., Wang, T., Tang, M., Bohnen, N.I., Müller, M.L., Herman, T., Giladi, N., Kalinin, A., Spino, C.: Model-based and model-free machine learning techniques for diagnostic prediction and classification of clinical outcomes in Parkinson's disease. Sci. Rep. 8(1), 1–21 (2018)
18. Zhou, F., Zhou, H.-M., Yang, Z., Yang, L.: EMD2FNN: A strategy combining empirical mode decomposition and factorization machine based neural network for stock market trend prediction. Exp. Syst. Appl. 115, 136–151 (2019)
19. Amiri, M., Khanli, L.M., Mirandola, R.: A sequential pattern mining model for application workload prediction in cloud environment. J. Netw. Comput. Appl. 105, 21–62 (2018)
20. Kim, K.-J.: Financial time series forecasting using support vector machines. Neurocomputing 55(1–2), 307–319 (2003)
21. Qian, X.-Y.: Financial series prediction: comparison between precision of time series models and machine learning methods. arXiv preprint arXiv:1706.00948 (2017)
22. Kim, K.-J., Han, I.: Genetic algorithms approach to feature discretization in artificial neural networks for the prediction of stock price index. Exp. Syst. Appl. 19(2), 125–132 (2000)
23. Tran, N., Nguyen, T., Nguyen, B.M., Nguyen, G.: A multivariate fuzzy time series resource forecast model for clouds using LSTM and data correlation analysis. Int. Conf. Knowl. Based Intell. Inform. Eng. Syst. 126, 636–645 (2018)
24. Kumar, J., Goomer, R., Singh, A.K.: Long short term memory recurrent neural network (LSTM-RNN) based workload forecasting model for cloud datacenters. Int. Confer. Smart Comput. Commun. 125, 676–682 (2018)
25. Ullrich, M., Lässig, J.: Current challenges and approaches for resource demand estimation in the cloud. Int. Confer. Cloud Comput. Big Dat. (2013). https://doi.org/10.1109/CLOUDCOM-ASIA.2013.52
26. Lang, K., Zhang, M., Yuan, Y., Yue, X.: Short-term load forecasting based on multivariate time series prediction and weighted neural network with random weights and kernels. Clust. Comput. (2018). https://doi.org/10.1007/s10586-017-1685-7
27. Rather, A.M., Agarwal, A., Sastry, V.: Recurrent neural network and a hybrid model for prediction of stock returns. Exp. Syst. Appl. 42(6), 3234–3241 (2015)
28. Mason, K., Duggan, J., Howley, E.: A multi-objective neural network trained with differential evolution for dynamic economic emission dispatch. Electr. Power Energy Syst. 100, 201–221 (2018)
29. Yu, L., Wang, S., Lai, K.K.: Forecasting crude oil price with an EMD-based neural network ensemble learning paradigm. Energy Econ. 30, 2623–2635 (2008)
30. Ge, D., Zeng, X.J.: A self-evolving fuzzy system which learns dynamic threshold parameter by itself. IEEE Trans. Fuzzy Syst. (2018). https://doi.org/10.1109/TFUZZ.2018.2886154
31. Angelov, P.: Evolving Takagi–Sugeno fuzzy systems from streaming data. In: Angelov, P., Filev, D., Kasabov, N. (eds.) Evolving Intelligent Systems: Methodology and Applications, pp. 21–50. IEEE Press Series on Computational Intelligence, Wiley, Hoboken (2010)
32. Halabi, L.M., Mekhilef, S., Hossain, M.: Performance evaluation of hybrid adaptive neuro-fuzzy inference system models for predicting monthly global solar radiation. Appl. Energy 213, 247–261 (2018)
33. Zhao, K., Li, S., Kang, Z.: Takagi–Sugeno fuzzy modeling and control of nonlinear system with adaptive clustering algorithms. Int. Conf. Modell. Identif. Control (ICMIC) (2018). https://doi.org/10.1109/ICMIC.2018.8530000
34. Rezaee, B., Fazel Zarandi, M.H.: Data-driven fuzzy modeling for Takagi–Sugeno–Kang fuzzy system. Inform. Sci. 180, 241–255 (2010)
35. Vieira, R., Gomide, F., Ballini, R.: Kernel evolving participatory fuzzy modeling for time series forecasting. IEEE Int. Conf. Fuzzy Syst. (2018). https://doi.org/10.1109/FUZZ-IEEE.2018.8491484
36. Boroojeni, K.G., Hadi Amini, M., Bahrami, S., Iyengar, S.S., Sarwat, A.I., Karabasoglu, O.: A novel multi-time-scale modeling for electric power demand forecasting: from short-term to medium-term horizon. Electric Power Syst. Res. 142, 58–73 (2017)
37. Sarıca, B., Egrioglu, E., Asıkgil, B.: A new hybrid method for time series forecasting: AR–ANFIS. Neural Comput. Appl. (2016). https://doi.org/10.1007/s00521-016-2475-5
38. Mi, X., Liu, H., Li, Y.: Wind speed prediction model using singular spectrum analysis, empirical mode decomposition and convolutional support vector machine. Energy Conv. Manag. 180, 196–205 (2019)
39. Patel, J., Shah, S., Thakkar, P., Kotecha, K.: Predicting stock market index using fusion of machine learning techniques. Expert Syst. Appl. 42, 2162–2172 (2015)
40. Wang, J.-Z., Wang, J.-J., Zhang, Z.-G., Guo, S.-P.: Forecasting stock indices with back propagation neural network. Expert Syst. Appl. 38(11), 14346–14355 (2011)
41. Er, M.J., Deng, C.: Obstacle avoidance of a mobile robot using hybrid learning approach. IEEE Trans. Indust. Electron. 52(3), 898–905 (2005)
42. Lin, F.-J., Huang, M.-S., Yeh, P.-Y., Tsai, H.-C., Kuan, C.-H.: DSP-based probabilistic fuzzy neural network control for Li-ion battery charger. IEEE Trans. Power Electron. 27(8), 3782–3794 (2012)
43. Lin, F.-J., Hung, Y.-C., Hwang, J.-C., Tsai, M.-T.: Fault-tolerant control of a six-phase motor drive system using a Takagi–Sugeno–Kang type fuzzy neural network with asymmetric


membership function. IEEE Trans. Power Electron. 28(7), 3557–3572 (2012)
44. Wang, Z., Zhang, Y., Fu, H.: Autoregressive prediction with rolling mechanism for time series forecasting with small sample size. Math. Probl. Eng. (2014). https://doi.org/10.1155/2014/572173
45. Duggan, M., Duggan, J., Howley, E., Barrett, E.: A reinforcement learning approach for the scheduling of live migration from under utilised hosts. Memetic Comput. (2016). https://doi.org/10.1007/s12293-016-0218-x
46. Lin, F.-J., Lu, K.-C., Ke, T.-H., Yang, B.-H., Chang, Y.-R.: Reactive power control of three-phase grid-connected PV system during grid faults using Takagi–Sugeno–Kang probabilistic fuzzy neural network control. IEEE Trans. Ind. Electron. 62(9), 5516–5528 (2015)
47. Lin, C.J., Chin, C.C.: Prediction and identification using wavelet-based recurrent fuzzy neural networks. IEEE Trans. Syst. Man Cybernet. 34, 2144–2154 (2004)
48. Bi, J., Yuan, H., Zhang, L., Zhang, J.: SGW-SCN: an integrated machine learning approach for workload forecasting in geo-distributed cloud data centers. Inform. Sci. 481, 57–68 (2019)
49. Nguyen, T.H., Di Francesco, M., Yla-Jaaski, M.: Virtual machine consolidation with multiple usage prediction for energy-efficient cloud data centers. IEEE Trans. Serv. Comput. (2016). https://doi.org/10.1109/TSC.2017.2648791
50. Tang, X., Liao, X., Zheng, J., Yang, X.: Energy efficient job scheduling with workload prediction on cloud data center. Clust. Comput. (2018). https://doi.org/10.1007/s10586-018-2154-7
51. Gupta, R.K., Pateriya, R.K.: Balance resource utilization (BRU) approach for the dynamic load balancing in cloud environment by using AR prediction model. J. Organ. End User Comput. (2017). https://doi.org/10.4018/JOEUC.2017100102
52. Tao, M., Dong, S., Zhang, L.: A multi-strategy collaborative prediction model for the runtime of online tasks in computing cluster/grid. Clust. Comput. (2011). https://doi.org/10.1007/s10586-010-0145-4
53. Kecskemeti, G., Nemeth, Z., Kertesz, A., Ranjan, R.: Cloud workload prediction based on workflow execution time discrepancies. Clust. Comput. (2018). https://doi.org/10.1007/s10586-018-2849-9
54. Xu, C.Z., Rao, J., Bu, X.: URL: A unified reinforcement learning approach for autonomic cloud management. J. Parallel Distrib.
comprehensive analysis of the current challenges for future research. Int. J. Commun. Syst. (2018). https://doi.org/10.1002/dac.3808
61. Narendra, K.S., Thathachar, M.A.: Learning automata - a survey. IEEE Trans. Syst. Man Cybernet. 4, 323–334 (1974)
62. Thathachar, M.A., Sastry, P.S.: Varieties of learning automata: an overview. IEEE Trans. Syst. Man Cybernet. Part B 32(6), 711–722 (2002)
63. Figueiredo, M.B., de Almeida, A., Ribeiro, B.: Wavelet decomposition and singular spectrum analysis for electrical signal denoising. In: 2011 IEEE International Conference on Systems, Man, and Cybernetics, pp. 3329–3334. IEEE (2011)
64. Jiang, H., Haihong, E., Song, M.: Multi-prediction based scheduling for hybrid workloads in the cloud data center. Clust. Comput. (2018). https://doi.org/10.1007/s10586-018-2265-1
65. Hochreiter, S., Schmidhuber, J.: Long short-term memory. Neural Comput. 9(8), 1735–1780 (1997)
66. Song, B., Yu, Y., Zhou, Y., Wang, Z., Du, S.: Host load prediction with long short-term memory in cloud computing. J. Supercomput. (2018). https://doi.org/10.1007/s11227-017-2044-4
67. Babu, K.R.R., Samuel, P.: Interference aware prediction mechanism for auto scaling in cloud. Comput. Electr. Eng. (2017). https://doi.org/10.1016/j.compeleceng.2017.12.021
68. Kumar, N., Patel, P.: Resource management using feed forward ANN-PSO. Cloud Comput. Environ. (2016). https://doi.org/10.1145/2905055.2905115
69. Witanto, J.N., Lim, H., Atiquzzaman, M.: Adaptive selection of dynamic VM consolidation algorithm using neural network for cloud resource management. Future Gener. Comput. Syst. (2018). https://doi.org/10.1016/j.future.2018.04.075
70. Kaur, G., Bala, A., Chana, I.: An intelligent regressive ensemble approach for predicting resource usage in cloud computing. J. Parallel. Distrib. Comput. 123, 1–12 (2019)
71. Gill, S.S., Chana, I., Singh, M., Buyya, R.: CHOPPER: an intelligent QoS-aware autonomic resource management approach for cloud computing. Clust. Comput. (2018). https://doi.org/10.1007/s10586-017-1040-z
72. Tofighy, S., Rahmanian, A.A., Ghobaei-Arani, M.: An ensemble CPU load prediction algorithm using a Bayesian information criterion and smooth filters in a cloud computing environment. Softw. Pract. Exp. (2018). https://doi.org/10.1002/spe.2641
73. Manjula, C., Florence, L.: Deep neural network based hybrid approach for software defect prediction using software metrics.
Comput. 72(2), 95–105 (2012) Clust. Comput. (2018). https://doi.org/10.1007/s10586-018-1696-
55. Amiri, M., Mohammad-Khanli, L.: Survey on prediction models z
of applications for resources provisioning in Cloud. J. Netw. 74. Chen, Y., Yang, T.-J., Emer, J., Sze, V.: Understanding the
Comput. Appl. 82, 93–113 (2017) limitations of existing energy-efficient design approaches for
56. Thein, T., Myo, M.M., Parvin, S., Gawanmeh, A.: Reinforcement deep neural networks. Energy 2, 1–3 (2018)
learning based methodology for energy-efficient resource allo- 75. Fei, X., Youfu, S., Xuejun, R.: A rough set data prediction
cation in cloud data centers. J. King Saud Univ. Comput. Inform. method based on neural network evaluation and least squares
Sci. (2018). https://doi.org/10.1016/j.jksuci.2018.11.005 fusion. Clust. Comput. (2018). https://doi.org/10.1007/s10586-
57. Zheng, S., Zhu, G., Zhang, J., Feng, W.: Towards an adaptive 018-2641-x
human-centric computing resource management framework 76. Zhang, W., Duan, P., Yang, L.T., Xia, F., Li, Z., Lu, Q., Gong,
based on resource prediction and multi-objective genetic algo- W., Yang, S.: Resource requests prediction in the cloud com-
rithm. Multimed Tools Appl. (2015). https://doi.org/10.1007/ puting environment with a deep belief network. Softw. Pract.
s11042-015-3096-1 Exp. (2016). https://doi.org/10.1002/spe.2426
58. Cioara, T., Anghel, I., Salomie, I.: Methodology for energy aware 77. Vohra, R., Goel, K., Sahoo, J.: Modeling temporal dependencies
adaptive management of virtualized data centers. Energy Effic. in data using a DBN-LSTM. In: 2015 IEEE International Con-
(2016). https://doi.org/10.1007/s12053-016-9467-2 ference on Data Science and Advanced Analytics (DSAA) 2015,
59. Ranjbari, M., Akbari Torkestani, J.: A learning automata-based pp. 1–4. IEEE
algorithm for energy and SLA efficient consolidation of virtual 78. Agarwalla, N., Panda, D., Modi, M.K.: Deep learning using
machines in cloud data centers. J. Parallel Distrib. Comput. restricted boltzmann machines. Int. J. Comput. Sci. Inform.
(2017). https://doi.org/10.1016/j.jpdc.2017.10.009 Secur. 7(3), 1552–1556 (2016)
60. Alireza Souri, A., Rahmani, A.M., Jafari Navimipour, N.: Formal 79. Muralitharan, K., Sakthivel, R., Vishnuvarthan, R.: Neural net-
verification approaches in the web service composition: a work based optimization approach for energy demand prediction

123
Cluster Computing

in smart grid. Neurocomputing (2017). https://doi.org/10.1016/j.


neucom.2017.08.017
Publisher's Note Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Seyedeh Yasaman Rashida received the B.Sc. degree in software engineering from Azad University, Iran, in 2006, and the M.Sc. degree in software engineering from Azad University, Arak Branch, Iran, in 2010. She is currently pursuing the Ph.D. degree in computer engineering at the Science and Research Branch, Islamic Azad University, Iran. Her research interests include wireless networks, distributed computing systems, and cloud computing. She is a faculty member of Islamic Azad University of Iran.

Masoud Sabaei received his B.Sc. degree from Esfahan University of Technology, Iran, and his M.Sc. and Ph.D. degrees from the Amirkabir University of Technology (Tehran Polytechnic), Iran, all in computer engineering, in 1992, 1995, and 2000, respectively. Since 2002, he has been a professor in the Computer Engineering Department, Amirkabir University of Technology. His research interests are software-defined networking, the Internet of Things, wireless networks, and telecommunication network management.

Mohammad Mehdi Ebadzadeh received the B.Sc. degree in electrical engineering from Sharif University of Technology, Iran, in 1991, the M.Sc. degree in machine intelligence and robotics from Amirkabir University of Technology, Iran, in 1995, and the Ph.D. degree in machine intelligence and robotics from Télécom ParisTech in 2004. He is currently an associate professor in the Computer Engineering Department of Amirkabir University of Technology (Tehran Polytechnic). His research interests include evolutionary algorithms, fuzzy systems, neural networks, artificial immune systems, robotics, and artificial muscles.
Amir Masoud Rahmani received his B.S. in computer engineering from Amir Kabir University, Tehran, in 1996, the M.S. in computer engineering from Sharif University of Technology, Tehran, in 1998, and the Ph.D. degree in computer engineering from Islamic Azad University (IAU), Science and Research Branch, Tehran, in 2005. He is a professor in the Department of Computer Engineering at the IAU, Science and Research Branch. He is the author/co-author of more than 200 publications in technical journals and conferences. His research interests are in the areas of distributed systems, ad hoc and wireless sensor networks, fault-tolerant computing, and evolutionary computing.