

An Approach for Forecasting Workload in Data Centers for Cloud Computing

ABSTRACT

Maintaining the resilience of data-center resources and their ability to scale has made the prediction of future workloads important and indispensable. Resource requests vary between high and low workloads, and further effects such as noise and redundant requests hinder forecasting, making future workloads difficult to predict. In this work we present a neural network trained with the particle swarm optimization (PSO) algorithm for workload prediction. The method takes past loads over periods of time as input, and the network predicts the future value from the past data. We used a historical NASA dataset, extracted the timestamps, and computed the loads over fixed time intervals to form the past inputs and the next value to be predicted. Training the neural network with the PSO algorithm achieved an accuracy of 97%, a mean squared error (MSE) of 0.001, and a root mean squared error (RMSE) of 0.03 in the training phase; in the test phase the accuracy was 95%, with MSE = 0.003 and RMSE = 0.05. Over the prediction intervals used, these results were better than those of an ANN with back propagation, self-adaptive differential evolution, the average, and the maximum.

1. INTRODUCTION

Cloud computing, also referred to as on-demand computing, has expanded in recent years as businesses move their applications to the cloud, enabling users to process and store their data in third-party data centers. The use of virtualized cloud resources as online services has attracted great interest from both industry and academic researchers. In earlier computing models, a user could obtain only a fixed amount of computing resources [1]; the cloud is therefore of great value in reducing infrastructure cost. A large number of companies, organizations, and departments are shifting to cloud computing because of features such as robustness, scalability, and on-demand services. Data centers have thus become the most important component of cloud computing. To let service providers scale while maintaining good service, data centers need an effective and dynamic policy for scaling and allocating resources. Right-sizing resources has become one of the most important requirements for data centers to operate without error. Resources can be measured from many properties: the number of users, upcoming business and services, system status, and other factors. These methods, however, require information about the future workload. Predicting the resources to be provisioned in the future from historical patterns and the status of the data center yields the expected workload, which determines the amount of resources to be allocated; an efficient and reliable forecasting system must therefore estimate the coming loads accurately. The upcoming workload can be predicted through several simple methods, such as the upper bound or the average of the workload over specific time periods, but these statistics-based approaches cannot produce an accurate prediction. For example, with a forecast based on the upper bound of the load, the resources will be idle most of the time; with the average-workload approach, the system will experience a shortage of resources. Machine learning methods, which use historical data for training, are more accurate in predicting future loads.

In this paper, a workload prediction model is developed on the basis of an artificial neural network trained with the PSO algorithm. The proposed model learns from historical data, which are fed to the system for training and testing. Compared with previous methods, the proposed model achieves a lower mean squared error.

2. RELATED WORK

Many researchers have addressed the workload forecasting problem using different approaches. Two methods are well known: homogeneous prediction and prediction based on historical data. In homogeneous forecasting, the next workload is estimated by adding or subtracting a value, such as the average of past or present workloads, which may be a fixed or a dynamic value. Methods based on historical data are the common predictive models; they identify and analyze previous workload cases and determine patterns to predict future demand over specific time periods [2]. In [3], a prediction model based on a back-propagation neural network was used to predict workload that could drive energy consumption; several experiments were conducted to compare its results with an HMM Naive Bayes classifier model. In [4], a workload prediction model based on a neural network trained with a self-adaptive differential evolution (SaDE) algorithm was used to predict workloads in data centers. Likewise, the researchers in [5] proposed long short-term memory networks to achieve high accuracy and to reduce the mean squared error to the lowest level possible for the cases treated. [6] proposed simple, average, and distributed adaptive prediction methods based on an adaptive prediction model, applied in a cloud data management system for data-center workload prediction; the model is based on the current and next workload of the system, and the implementation showed the proposed adaptive methods to be better than the individual methods. In [7], the authors proposed a prediction model using a neural network together with differential evolution (DE) and a self-adaptive differential evolution algorithm, called the enhanced self-adapting differential evolution algorithm, and compared it with back propagation (BP) training.

3- WORKLOAD FORECASTING APPROACH

The research methodology is based on training a neural network with the Particle Swarm Optimization (PSO) algorithm to predict workloads in data centers. The basic idea is to obtain the lowest squared-error rate for the predictive time periods from the number of requests in the data centers.

3.1 Proposed model


In cloud data centers, resources are provisioned according to users' needs, and this leads to increasing workload. We propose a system for predicting workloads from historical data, which are processed as inputs to a prediction system consisting of a neural network trained with the particle swarm optimization (PSO) algorithm. The algorithm trains the network by optimizing the weights connecting the nodes of the neural network; the predicted results are then compared with the real loads and the accuracy is evaluated. If the mean squared error is reduced, the model is adopted for prediction; otherwise, training continues with the PSO algorithm.

FIGURE 1. Proposed forecast model

3.2 Pre-processing
The proposed forecasting approach includes several steps, as shown in Figure 1. The first pre-processing step is to extract the requests and count them per time unit; these counts represent the real loads. We then normalize the values into the range (0, 1) according to Eq. (2).

Let the vector VL^t represent the workload in forecast period t, defined as in Eq. (1):

VL^t = ( vl_1^t, vl_2^t, vl_3^t, ..., vl_s^t )   ...(1)

VLn^t = ( VL^t - VL_min^t ) / ( VL_max^t - VL_min^t )   ...(2)

where VLn^t = ( vln_1^t, vln_2^t, vln_3^t, ..., vln_s^t ), and VL_min^t and VL_max^t are the minimum and maximum workload in the extracted workload dataset. Also in pre-processing, we divide the data into input data (I^t) and output data (O^t) as in Eqs. (3) and (4):

I^t = [ vln_1^t      vln_2^t        ...  vln_M^t
        vln_2^t      vln_3^t        ...  vln_{M+1}^t
        vln_3^t      vln_4^t        ...  vln_{M+2}^t
        ...          ...            ...  ...
        vln_{s-M}^t  vln_{s-M+1}^t  ...  vln_{s-1}^t ]   ...(3)

O^t = [ vln_{M+1}^t  vln_{M+2}^t  ...  vln_s^t ]^T   ...(4)

where M is the number of normalized workloads at each input of the neural network, as in the proposed model (Figure 1).
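The pre-processing above can be sketched in a few lines. This is a minimal illustration only: the paper implemented its system in Matlab 2015b, so the use of Python/NumPy here, the function name `preprocess`, and the toy load values are all assumptions, not the authors' code:

```python
import numpy as np

def preprocess(loads, M):
    """Normalize a workload series to [0, 1] (Eq. 2) and build
    sliding-window inputs/outputs (Eqs. 3-4). M is the window size."""
    loads = np.asarray(loads, dtype=float)
    vln = (loads - loads.min()) / (loads.max() - loads.min())   # Eq. (2)
    # Each row of I holds M consecutive normalized loads; O is the next value.
    I = np.array([vln[i:i + M] for i in range(len(vln) - M)])   # Eq. (3)
    O = vln[M:]                                                  # Eq. (4)
    return I, O

# Toy request counts per time unit (hypothetical data).
I, O = preprocess([120, 80, 200, 160, 40, 200], M=3)
```

Each row of `I` is one input vector for the network, and the corresponding entry of `O` is the normalized load it should predict.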
3.3 Neural Network
Many neural networks are used in the field of prediction because of their accuracy and effectiveness compared with other methods. In the method proposed in this paper we use a neural network trained under supervision, using the particle swarm optimization (PSO) algorithm. The network consists of three layers: input, hidden, and output. In the input layer we use the 10 previous inputs of the network. The hidden layer has 12 hidden nodes with the sigmoid activation function [17], as in Eq. (5). The output layer uses a linear function. The data are divided into training and testing sets.

FIGURE 2. Neural Network Forecasting

φ(v) = 1 / ( 1 + exp(-a·v) )   ...(5)
where a is a real constant that sets the slope of the function at its inflection point (usually given the value +1). As a approaches infinity, the sigmoid approaches a threshold function. The threshold function takes only the values 0 or 1, while the sigmoid produces a continuous range of values: between 0 and 1 for the logistic function and between -1 and 1 for the hyperbolic tangent. Note also that the sigmoid is differentiable, whereas the threshold function, defined as either 0 or 1, is not.
The data were then divided into 60% training and 40% testing. After training, the proposed model is evaluated using the mean squared error (MSE), Eq. (6), where O_i and Ô_i are the actual and predicted workloads:

MSE = (1/N) Σ_{i=1}^{N} ( O_i - Ô_i )^2   ...(6)
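The 10-12-1 topology, the sigmoid of Eq. (5), and the MSE of Eq. (6) can be sketched as follows. This is a minimal Python/NumPy illustration (the paper used Matlab); the function names and the random weights are hypothetical stand-ins, not the trained model:

```python
import numpy as np

def sigmoid(v, a=1.0):
    """Logistic activation, Eq. (5); a sets the slope at the inflection point."""
    return 1.0 / (1.0 + np.exp(-a * v))

def forward(x, W1, b1, W2, b2):
    """Feed-forward pass of a 10-12-1 network: sigmoid hidden layer, linear output."""
    h = sigmoid(W1 @ x + b1)    # 12 hidden nodes
    return W2 @ h + b2          # linear output layer

def mse(actual, predicted):
    """Mean squared error, Eq. (6)."""
    actual, predicted = np.asarray(actual), np.asarray(predicted)
    return np.mean((actual - predicted) ** 2)

rng = np.random.default_rng(0)
W1, b1 = rng.standard_normal((12, 10)), rng.standard_normal(12)
W2, b2 = rng.standard_normal((1, 12)), rng.standard_normal(1)
y = forward(rng.random(10), W1, b1, W2, b2)   # one predicted (normalized) load
```

During training it is this `mse` value, computed over the whole training set, that PSO minimizes.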

3.4 Particle Swarm Optimization Algorithm (PSO)

Particle swarm optimization (PSO) takes its inspiration from the foraging behavior of birds and was proposed by Kennedy and Eberhart in 1995 [8][9]. In PSO, a candidate solution to the problem is represented as a particle, encoded as a vector or matrix. These particles move through the search space to find optimal solutions, and during this movement each particle remembers its own best experience. The swarm keeps searching for the optimum by updating the position of each particle based on its best experience and those of neighboring particles. Initially, a random velocity and position vector is assigned to each particle in the search space. Each particle keeps in memory its best previous objective-function value and the corresponding best position vector. In addition [10], each particle in the swarm knows the best global objective-function value among all particles, as well as the global best position vector. During optimization, each particle moves stochastically toward its previous best and the global best location, so that the particles move toward better positions in each iteration; the process is repeated until the particles converge to an optimal solution [8].

In PSO, each individual of the swarm, called a particle, is a complete solution to the problem to be solved; each particle is initialized randomly in the search space. A particle carries two vectors: a velocity vector v_i^t = [ v_{i1}^t, v_{i2}^t, ..., v_{iD}^t ] and a position vector x_i^t = [ x_{i1}^t, x_{i2}^t, ..., x_{iD}^t ], where i ∈ {1, 2, ..., K}, K is the population size, D is the dimension of the solution space, and t indicates the iteration. Each component x_{id}^t ∈ [ L_d, U_d ], d ∈ {1, 2, ..., D}, where L_d and U_d are the lower and upper bounds of the d-th dimension of the solution space. The velocity and position of each particle are initialized uniformly at random within these ranges and are updated at every iteration; Figure 3 shows the basic PSO [11][9][12].

Global-best PSO is the variant in which the position of each particle is influenced by the fittest particle of the whole swarm, giving every particle access to the social information of the flock. Here each particle i ∈ {1, 2, ..., K} has its current position x_i, its current velocity v_i, and its personal best position B_best,i in the search space. The personal best position B_best,i corresponds to the position in the search space where particle i attained the smallest value of the objective function f. The position that yields the lowest value among all personal bests B_best,i is called the global best position and is denoted G_best; the equations below show how the personal and global best values are updated (Figure 3) [13][14].
FIGURE 3. Basic PSO

The personal best position B_best,i at the next time step t+1, where t ∈ {1, 2, ..., N}, is calculated as in Eq. (7):

B_best,i^{t+1} = { B_best,i^t   if f( x_i^{t+1} ) ≥ f( B_best,i^t )
                   x_i^{t+1}    if f( x_i^{t+1} ) < f( B_best,i^t )   ...(7)

where f : R^D → R is the fitness function. The global best position G_best at time step t is calculated as in Eq. (8):

G_best^t = min { B_best,i^t }, where i ∈ {1, 2, ..., K} and K > 1   ...(8)


Note that the personal best is the best position visited by a single particle since the very first step, while the global best is the best position discovered by any particle in the entire swarm. In the gbest PSO method, the particle velocity is calculated with Eq. (9):

v_{ij}^{t+1} = v_{ij}^t + C_1 r_{1j}^t [ B_best,ij^t - x_{ij}^t ] + C_2 r_{2j}^t [ G_best,j^t - x_{ij}^t ]   ...(9)

where v_{ij}^t is the velocity, x_{ij}^t is the position of the particle, B_best,i^t is the particle's personal best position, and G_best^t is the global best position found so far. C_1 and C_2 are the acceleration constants with fixed values, and r_{1j}^t and r_{2j}^t are random numbers in (0, 1). The position of each particle is then updated according to Eq. (10):

x_i^{t+1} = x_i^t + v_i^{t+1}   ...(10)
Inertia weight was introduced to PSO to control the influence of the previous velocity in the velocity update equation. When particle velocities are very high during optimization, the particles fail to move back toward the optimum and the swarm diverges; when velocities are too low, the particles concentrate in local regions only (exploitation) and do not cover the search space efficiently. The inertia weight therefore controls the exploration/exploitation balance of the algorithm by scaling the velocities; it can be implemented as a static parameter or as a dynamic variable. The velocity update equation then becomes Eq. (11) [8][12]:

v_{ij}^{t+1} = w·v_{ij}^t + C_1 r_{1j}^t [ B_best,ij^t - x_{ij}^t ] + C_2 r_{2j}^t [ G_best,j^t - x_{ij}^t ]   ...(11)

where w is the inertia weight.
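The update rules of Eqs. (7), (10), and (11) can be sketched as a minimal global-best PSO. This is an illustrative Python implementation, not the authors' code: it uses commonly cited parameter values (w = 0.7, C_1 = C_2 = 1.5) rather than the settings of Table 1, and the sphere function serves only as a stand-in objective:

```python
import numpy as np

def pso(f, dim, n_particles=30, iters=100, w=0.7, c1=1.5, c2=1.5,
        lo=-5.0, hi=5.0, seed=0):
    """Minimal global-best PSO minimizing f over [lo, hi]^dim."""
    rng = np.random.default_rng(seed)
    x = rng.uniform(lo, hi, (n_particles, dim))   # positions
    v = np.zeros((n_particles, dim))              # velocities
    pbest = x.copy()                              # personal bests B_best,i
    pbest_f = np.array([f(p) for p in x])
    g = pbest[pbest_f.argmin()].copy()            # global best G_best
    for _ in range(iters):
        r1 = rng.random((n_particles, dim))
        r2 = rng.random((n_particles, dim))
        v = w * v + c1 * r1 * (pbest - x) + c2 * r2 * (g - x)   # Eq. (11)
        x = np.clip(x + v, lo, hi)                               # Eq. (10)
        fx = np.array([f(p) for p in x])
        improved = fx < pbest_f                                  # Eq. (7)
        pbest[improved] = x[improved]
        pbest_f[improved] = fx[improved]
        g = pbest[pbest_f.argmin()].copy()
    return g, float(pbest_f.min())

best, best_f = pso(lambda p: float(np.sum(p ** 2)), dim=3)   # sphere objective
```

Each iteration applies the velocity update with inertia, moves the particles, and refreshes the personal and global bests, exactly as in the equations above.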

4. PSO ALGORITHM PARAMETERS [10][14].

4.1 Swarm size


The swarm size is the size of the population, i.e. the number of particles in the swarm. The larger the swarm, the larger the area covered in each iteration, which may allow a satisfactory result to be reached in fewer iterations.
4.2 Iteration number
The number of iterations required to reach a good result depends on the difficulty of the problem. Too few iterations may halt the search before the optimal solution is reached, while too many add unnecessary computational complexity and time.
4.3 Acceleration Coefficient
The acceleration coefficients C_1 and C_2, together with the random values r_{1j}^t and r_{2j}^t, maintain the stochastic influence of the cognitive and social components on the particle velocity. The constant C_1 represents the particle's confidence in itself, and C_2 the particle's confidence in its neighbors. The usual choice is C_1 = C_2, with values most often in the range [0, 4]; much of the research uses C_1 = C_2 = 2 [15][16][13][8].

5. TRAIN NEURAL NETWORK USING PSO

Training the neural network with the PSO algorithm means that the weights and biases are obtained from PSO and used in the neural network; the best values found become the weights and biases, as shown in Figure 15, which represents the combined operation of the ANN and PSO. PSO takes all the weights, initializes them to random values, and starts training. For every pass through the dataset, PSO compares the fitness of each weight set; the network with the lowest squared-error rate is the best, and all weights are updated based on the best values obtained. In this study, the network fitness function is the mean squared error, as in Eq. (6). The velocity of each particle is updated with Eq. (11) and its position with Eq. (10); the particle whose fitness, Eq. (7), gives the best result supplies the weights of the neural network.
FIGURE 15. Flow diagram of the ANN and PSO program developed for this work.
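The training scheme of this section can be sketched end to end: flatten all weights and biases of the 10-12-1 network into one particle vector and let PSO minimize the training MSE of Eq. (6). This is an illustrative Python/NumPy sketch (the paper used Matlab); the toy target (the mean of the previous M normalized loads) and all names are hypothetical, chosen only to make the example self-contained:

```python
import numpy as np

M, H = 10, 12              # inputs per sample, hidden nodes (10-12-1 net)
DIM = H * M + H + H + 1    # total number of weights and biases

def unpack(p):
    """Split one flat particle vector into the network's weight matrices."""
    W1 = p[:H * M].reshape(H, M)
    b1 = p[H * M:H * M + H]
    W2 = p[H * M + H:H * M + 2 * H].reshape(1, H)
    b2 = p[H * M + 2 * H:]
    return W1, b1, W2, b2

def fitness(p, X, y):
    """Training MSE (Eq. 6) of the network encoded by particle p."""
    W1, b1, W2, b2 = unpack(p)
    h = 1.0 / (1.0 + np.exp(-(X @ W1.T + b1)))   # sigmoid hidden layer, Eq. (5)
    pred = (h @ W2.T + b2).ravel()                # linear output layer
    return float(np.mean((y - pred) ** 2))

def train(X, y, n=30, iters=200, w=0.7, c1=1.5, c2=1.5, seed=1):
    """Global-best PSO over the flattened weight space."""
    rng = np.random.default_rng(seed)
    x = rng.uniform(-1, 1, (n, DIM))
    v = np.zeros((n, DIM))
    pbest = x.copy()
    pbest_f = np.array([fitness(p, X, y) for p in x])
    g = pbest[pbest_f.argmin()].copy()
    for _ in range(iters):
        r1, r2 = rng.random((n, DIM)), rng.random((n, DIM))
        v = w * v + c1 * r1 * (pbest - x) + c2 * r2 * (g - x)   # Eq. (11)
        x = x + v                                                # Eq. (10)
        fx = np.array([fitness(p, X, y) for p in x])
        imp = fx < pbest_f                                       # Eq. (7)
        pbest[imp], pbest_f[imp] = x[imp], fx[imp]
        g = pbest[pbest_f.argmin()].copy()
    return g, float(pbest_f.min())

rng = np.random.default_rng(1)
X = rng.random((200, M))    # 200 windows of M normalized loads (toy data)
y = X.mean(axis=1)          # toy target standing in for the next load
best_weights, best_mse = train(X, y)
```

The particle with the lowest MSE supplies the final network weights, matching the flow of Figure 15.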

6. DEVICE SPECIFICATIONS AND IMPLEMENTATION TOOL

The experiments implementing neural network training with the PSO algorithm were performed on a computer equipped with an Intel(R) Core(TM) i5-4210 processor running at 2.60 GHz, 4 GB of memory, and a 512 GB hard disk. The tool chosen to implement our research is Matlab (2015b), with which the data, results, and figures required for the research were produced.

7. DATASET

The standard dataset selected for this work is the historical NASA HTTP request trace [18], which provides Internet traffic traces as ASCII files containing one HTTP request per row. The NASA trace records two months of HTTP requests to the WWW server of the NASA Kennedy Space Center in Florida.

8- RESULTS AND DISCUSSION

After collecting the data, we divided it into a training set (Figure 13) and a test set (Figure 12). The algorithm parameters are given in Table 1. After running the experiments, we obtained a training accuracy of 97% with mean squared error MSE = 0.001 and root mean squared error RMSE = 0.03; in the test phase the accuracy was 95%, with MSE = 0.003 and RMSE = 0.05. The predicted and real data can be observed in Figure 10 for the training phase and Figure 11 for the test phase.
FIGURE 4. Prediction error in test
FIGURE 5. Prediction error in training

FIGURE 9. Regression of data during training (ANN)
FIGURE 6. Regression of data during test (ANN)

FIGURE 10. The expected loads versus the real loads in training data
FIGURE 11. The expected loads versus the real loads in testing data
(axes: Number of Requests vs. Time (min))

FIGURE 12. Loads of test data
FIGURE 13. Loads of training data
(axes: Number of Requests vs. Sample)

The results of the proposed method are presented in comparison with the previously used methods: the RMSE of the back propagation, SaDE, Average, and Maximum algorithms against the proposed method of training the neural network with PSO. Figure 14 and Table 2 show that the proposed PSO-trained network gives the best results over all time periods of the data taken from NASA.

FIGURE 14. RMSE vs. prediction intervals over datasets


TABLE 1. PSO algorithm parameters

Parameter                            Value
L_d (lower bound of search space)    -0.4
C_1                                  1
C_2                                  1
Iterations                           250
w                                    1.3

TABLE 2. Accuracy results (RMSE) on the NASA dataset

Prediction
Interval (min)   Maximum   Average   Back propagation   SaDE    Proposed
1                0.802     0.330     0.256              0.013   0.023
5                0.568     0.320     0.311              0.100   0.037
10               0.604     0.339     0.312              0.158   0.045
20               0.492     0.289     0.288              0.158   0.041
30               0.613     0.313     0.292              0.142   0.048
60               0.520     0.319     0.305              0.142   0.105

The average RMSE (ARMSE) of each algorithm can be calculated as the sum of the RMSE values over the prediction intervals divided by their number; Table 3 shows that the proposed method has the lowest ARMSE.

TABLE 3. Accuracy results (ARMSE)

Dataset   Maximum    Average    Back propagation   SaDE      Proposed
NASA      0.599833   0.318333   0.294              0.11883   0.049833

Table 2 lists the RMSE values achieved by the BP, SaDE, Average, and Maximum learning algorithms on the test dataset; the results for the NASA dataset are plotted in Figure 14. Comparing neural network training with the PSO algorithm against BP, SaDE, Average, and Maximum [4], the prediction accuracy of the proposed method is superior across all prediction intervals on the training dataset. The accuracy results in terms of ARMSE are listed in Table 3; the PSO-based training method achieves the best accuracy, with a clear margin over BP, SaDE, Average, and Maximum.

9. CONCLUSIONS

The PSO algorithm explores the solution space in several different directions by changing positions and updating velocities. This enables us to reach the best values of the fitness function; these values represent suitable weights for training the neural network and yield good results: the network accuracy during training was 97% with MSE = 0.001 and RMSE = 0.03, and in testing the accuracy was 95% with MSE = 0.003 and RMSE = 0.05. Applying the particle swarm optimization algorithm to train the neural network gave better results than the previous methods. Continued research in workload forecasting can reduce the loads on data centers and allow them to be distributed dynamically, which speeds up the work and reduces the cost and energy of cloud data centers.

REFERENCES

1-R. Buyya, C.S. Yeo, S. Venugopal, J. Broberg, I. Brandic, “Cloud computing and emerging IT platforms: Vision,
hype, and reality for delivering computing as the 5th utility”, Future Gener. Comput. Syst. 25 (6), 599–616 (2009).

2- KUANG, Shiann-Rong, et al. “Efficient architecture and hardware implementation of hybrid fuzzy- Kalman filter
for workload prediction”. Integration, 47.4, 408-416 (2014).

3- LU, Yao, et al. RVLBPNN: “A workload forecasting model for smart cloud computing”. Scientific Programming,
2016, 2016.

4- KUMAR, Jitendra; SINGH, Ashutosh Kumar. “Workload prediction in cloud using artificial neural network and
adaptive differential evolution”. Future Generation Computer Systems. 81, 41-52 (2018).

5- KUMAR, Jitendra; GOOMER, Rimsha; SINGH, Ashutosh Kumar. “Long short term memory recurrent neural
network (lstm-rnn) based workload forecasting model for cloud datacenters”. Procedia Computer Science. 125,
676-682 (2018).

6- ZHARIKOV, Eduard; TELENYK, Sergii; BIDYUK, Petro. “Adaptive workload forecasting in cloud data
centers”. Journal of Grid Computing. 18.1, 149-168 (2020).

7- ATTIA, M. A., et al. “Application of an enhanced self-adapting differential evolution algorithm to workload
prediction in cloud computing”. Int. J. Inf. Technol. Comput. Sci. 11.8, 33-40 (2019).

8- PANAHLI, Chingiz. “Implementation of Particle Swarm Optimization Algorithm within FieldOpt Optimization Framework - Application of the algorithm to well placement optimization”. Master's Thesis, NTNU, 2017.

9- QIN, Quande, et al. “Particle swarm optimization with interswarm interactive learning strategy”.  IEEE
transactions on cybernetics. 46.10, 2238-2251(2015).

10- Settles, Matt, and Bart Rylander. "Neural network learning using particle swarm optimizers." Advances in
information science and soft computing , 224-226 (2002).
11- PIOTROWSKI, Adam P.; NAPIORKOWSKI, Jaroslaw J.; PIOTROWSKA, Agnieszka E. “Population size in Particle
Swarm Optimization”. Swarm and Evolutionary Computation , 100718 (2020).

12- IMRAN, Muhammad; HASHIM, Rathiah; ABD KHALID, Noor Elaiza. “An overview of particle swarm optimization variants”. Procedia Engineering, 53, 491-496 (2013).

13- TALUKDER, Satyobroto. “Mathematical modelling and applications of particle swarm optimization”. 2011.

14- ENGELBRECHT, Andries P. “Computational intelligence: an introduction”. John Wiley & Sons, 2007

15- HE, Yan; MA, Wei Jin; ZHANG, Ji Ping. “The parameters selection of pso algorithm influencing on performance
of fault diagnosis”. In: MATEC Web of conferences. EDP Sciences, p. 02019 (2019).

16- DU, Yuji; XU, Fanfan. “A Hybrid Multi-Step Probability Selection Particle Swarm Optimization with Dynamic
Chaotic Inertial Weight and Acceleration Coefficients for Numerical Function Optimization”. Symmetry , 12.6: 922
(2020).

17- Islam, Mohaiminul, Guorong Chen, and Shangzhu Jin. "An Overview of Neural Network." American Journal of
Neural Networks and Applications 5.1 , 7-11 (2019).

18- Traces available in the Internet Traffic Archive. http://ita.ee.lbl.gov/html/traces.html
