
Available online at www.sciencedirect.com

ScienceDirect
Procedia Computer Science 116 (2017) 45–53
www.elsevier.com/locate/procedia

2nd International Conference on Computer Science and Computational Intelligence 2017,
ICCSCI 2017, 13-14 October 2017, Bali, Indonesia
WAN Optimization to Speed up Data Transfer

Benfano Soewito (a), Andy (a), Fergyanto E. Gunawan (b), Melki Sadekh Mansuan (c)

(a) Computer Science Department, BINUS Graduate Program, Bina Nusantara University, Jakarta - 11530, Indonesia
(b) Industrial Engineering Department, BINUS Graduate Program, Bina Nusantara University, Jakarta - 11530, Indonesia
(c) School of Computer Science, Bina Nusantara University, Jakarta - 11530, Indonesia

Abstract

Digital technology has advanced very rapidly, as have data storage and data transfer. As information technology develops and the volume of data in use grows, data becomes a main factor affecting a company's business. The amount of data that must be processed both at headquarters and at branch offices makes data exchange the main issue in terms of speed and delay. Because data exchange is a top priority in the company, the WAN bandwidth used by the file system, email system, proxy, and web system is often saturated during peak hours, which slows down access to the file or email systems. One method to reduce bandwidth usage when the link is almost full is to use WAN optimization tools. WAN optimization has several functions, including Data Streamlining, Transport Streamlining, and Application Streamlining. The purpose of this study is to analyze network performance when WAN optimization is applied in the system. Data were taken from simulated file transfers conducted over several days and at different hours, and were analyzed using the Wireshark tool and calculation formulas. The results show a significant increase in performance: delay improved from 287 ms to 0.604 ms for a 93 MB file, and jitter improved by 12.4%. It can be concluded that with WAN optimization, data transfer becomes more efficient in both bandwidth and time during working hours.

© 2017 The Authors. Published by Elsevier B.V.
Peer-review under responsibility of the scientific committee of the 2nd International Conference on Computer Science and Computational Intelligence 2017.

Keywords: network optimization; WAN optimization; data transfer; delay and bandwidth

∗ Corresponding author.
E-mail address: benfano@gmail.com, bsoewito@binus.edu

1877-0509 © 2017 The Authors. Published by Elsevier B.V.
10.1016/j.procs.2017.10.007

1. Introduction

In the last decade, the development and use of IT has increased rapidly, including in data storage. Data that once existed in physical form has now become digital. This affects how data is transmitted and distributed: all data in digital form will be sent, distributed, and transmitted over the internet. A delay problem appears if the data to be sent exceeds the available bandwidth capacity. The WAN becomes congested with high traffic, which causes considerable data loss and jitter [1]. This will certainly affect the performance of staff who use the data, which will ultimately affect the performance of the company as a whole.
Therefore, many large companies invest in IT infrastructure at each branch office to support the company's daily business. As information technology develops and the volume of data in use grows, data becomes a main factor affecting the company's business. The amount of data that must be processed both at headquarters and at branch offices makes data exchange the main issue in terms of speed and delay. To provide the applications and services required by users, the company's IT department must invest up to tens of millions of dollars in infrastructure, such as a File Server, Mail Server, Storage, and Tape Backup Server at each branch office. Once this infrastructure is in place in large companies that have hundreds or even thousands of overseas branches, new problems arise, namely network complexity and high cost. To solve these problems, the WAN optimizer was introduced to increase the performance of the network.
Although almost all areas related to networking and computing have improved rapidly, the Wide Area Network (WAN) has always been an obstacle that IT engineers must address [2, 3]. Many IT engineers have tried to improve throughput by purchasing more bandwidth or a data compression module, but these measures have not improved the performance of applications that send and transmit data over the WAN. To cut costs and simplify infrastructure, several companies have taken steps to centralize systems that were previously distributed across branch offices. We therefore want to implement an appliance that simplifies the WAN so that it behaves like a LAN for applications that use WAN communications in both centralized and distributed environments. To show that this approach performs better, we measured three main parameters [4]: delay, jitter, and throughput.

2. Literature review

2.1. Wide Area Network (WAN)

A WAN is a computer network that spans a large geographical area and is often built from leased telecommunication circuits [5]. Normally, a WAN connects computers between branch offices and headquarters in different cities or countries. Each computer in each office runs applications for users and is called a host. The network that connects the hosts is called the subnet. The task of the subnet is to carry messages from host to host, just as the telephone system carries words (really just sounds) from the speaker to the listener [6]. In most WANs, the subnet consists of two distinct components: transmission lines and switching elements [7]. Transmission lines move bits between machines. They can be made from copper wire, optical fiber, or even radio links. Most companies do not own transmission lines, so they lease lines from telecommunication companies. Switching elements, or switches, are specialized computers that connect two or more transmission lines. When data arrive on an incoming line, the switching element must choose an outgoing line on which to forward them [8]. These switching computers have gone by various names in the past; today they are best known as routers.

2.2. WAN Optimization

The goal of WAN optimization is to increase the performance of data transfer over wide area networks [7]. Several techniques serve this purpose, such as deduplication, compression, web caching, Wide Area File Services (WAFS), and Forward Error Correction (FEC). Our work focuses on compression and web caching.

2.2.1. Compression
The basic principle of compression in a WAN product is shown in Fig. 1: a frame of data is shortened by a compression algorithm before being transmitted over the network. The basic idea is to reduce the data size, which saves space, bandwidth, and transmission time. Compression occurs only on the WAN path, and the data are decompressed automatically once the receiver receives them. This frees capacity on the WAN path for other packets and makes the network more efficient. Compression can be applied not only to the data content (payload) but also to the header, as shown in Fig. 1. Optimization by compression is suitable for networks with a point-to-point leased-line topology. One approach to compressing data is to remove extra characters, to denote a run of repeated characters by inserting a single repeat character, and to substitute frequently occurring words with a single character.
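
The idea of replacing a run of repeated characters with a single character and a repeat count can be illustrated with run-length encoding. The following is a minimal sketch in Python, not the algorithm of any particular WAN optimization appliance; the function names `rle_encode` and `rle_decode` and the sample string are hypothetical.

```python
from itertools import groupby


def rle_encode(text):
    """Collapse each run of repeated characters into a (character, count) pair."""
    return [(char, len(list(run))) for char, run in groupby(text)]


def rle_decode(pairs):
    """Rebuild the original string from (character, count) pairs."""
    return "".join(char * count for char, count in pairs)


sample = "AAAABBBCCD"          # hypothetical payload
encoded = rle_encode(sample)   # one pair per run of identical characters
restored = rle_decode(encoded)
```

The encoding only pays off when the input contains long runs; for data without repetition the list of pairs is larger than the input, which is why practical compressors combine several techniques.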

Fig. 1: WAN optimization - Compression technique.

Fig. 2: WAN optimization - Caching technique.



2.2.2. Caching
The caching mechanism is shown in Fig. 2. Caching is useful when the same data or site is accessed repeatedly over time: because the caching server stores frequently requested information, the same packets need not be delivered again. Web caching therefore helps to save bandwidth and makes data transmission over the WAN more efficient. This technique has been applied in many web servers to reduce time and traffic on the internet. One shortcoming of caching is the risk of serving data that are out of date: a cached page may contain incorrect or stale information. Most browsers also perform their own caching. Many web servers store the time stamp of the last update of each page, and the browser uses its cached copy of a remote page only after checking that time stamp. In our research, we do not compress files served from the web cache, because these files are not transmitted over the WAN.
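
The time stamp check described above can be sketched as follows. This is a simplified illustration under stated assumptions, not the behaviour of a specific browser or caching server: the class `TimestampCache` is hypothetical, and the remote web server is modelled as a plain dictionary named `origin` that maps a URL to a `(last_modified, body)` tuple.

```python
class TimestampCache:
    """Toy cache that revalidates entries against the origin's last-update time."""

    def __init__(self):
        self._entries = {}  # url -> (last_modified, body)

    def get(self, url, origin):
        """Return (body, served_from_cache) for url.

        `origin` is a hypothetical stand-in for the remote web server.
        In a real deployment only the time stamp would cross the WAN on a hit.
        """
        last_modified, body = origin[url]
        cached = self._entries.get(url)
        if cached is not None and cached[0] == last_modified:
            # Time stamp unchanged: reuse the stored copy.
            return cached[1], True
        # Missing or stale entry: take the fresh body and remember its time stamp.
        self._entries[url] = (last_modified, body)
        return body, False


origin = {"/report": (100, "v1")}
cache = TimestampCache()
first = cache.get("/report", origin)    # miss: fetched from the origin
second = cache.get("/report", origin)   # hit: time stamp matches
origin["/report"] = (200, "v2")         # the page is updated at the origin
third = cache.get("/report", origin)    # stale copy detected and replaced
```

Checking the time stamp before every reuse avoids stale pages at the cost of one small round trip per request.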

3. Methodology

We compress the data to speed up transfers and also cache data at the server site. We used files in video and zip formats. The steps of this research are: literature study, implementation of the WAN optimization tool, comparative tests of data transfer using various file sizes, data collection, analysis of the results, and conclusions and suggestions. The early stage of the study determined the background, objectives, and scope of the study; the literature study covers the technology and how WAN optimization works. The second phase is the installation and implementation of the WAN optimization tools, along with trial data transfers of various sizes. In the third stage, data are collected from the test results, namely the percentage of WAN optimization and the bandwidth performance. In the fourth stage, the data are analyzed, and conclusions and suggestions are drawn from the preceding steps.
The measurement tool is the packet capture tool Wireshark [9]. Wireshark captures all packets transmitted to and from a source or destination [10]. Data are collected by comparing transfer time, delay, latency, and throughput without and with WAN optimization while data transmission takes place. Wireshark is installed on the workstation computer from which the data are copied (transmitted) to the server.
Figs. 3 and 4 show the general topologies without and with the WAN optimization appliance, respectively. Fig. 3 shows the general network topology of a company that does not use WAN optimization. There are two Data Center segments connected by a router: data residing on File Server (All Dept), accessed by all departments in HQ, and data residing on File Server (Accounting), accessed only by the Finance department.
Fig. 4 shows the general network topology of a company that has implemented WAN optimization in the Data Center as well as in the HQ office. The appliance is placed on only one Data Center segment, because this segment holds the centralized infrastructure accessed by almost all HQ employees.
Data collection (delay, jitter, and throughput) in this study is done by copying five files of various sizes from workstation computers to the Data Center at various times. Wireshark is installed on the workstation computer and runs while the copy is in progress. We then analyze the captured data, summarize it (delay, jitter, and throughput), and compare the results with and without WAN optimization.

4. Results and discussion

The parameters delay, jitter, and throughput are measured in simulated transfers of five files of various sizes. Throughput is measured with TeraCopy [11], and delay and jitter are measured with Wireshark. The five files are:

1. Test Video.mp4, 93 MB, taken from the YouTube website.
2. Test Video 2.mp4, 766 MB, taken from the YouTube website.
3. Test 3.zip, 120 MB, a zip archive of documents.
4. 2016.zip, 310 MB, a zip archive of a collection of images.
5. Lotus.zip, 1.17 GB, a zipped installer.

Fig. 3: Network topology without WAN optimization.

Fig. 4: Network topology with WAN optimization.

Files of various sizes were used in this simulation because users have complained about copying data from their computers to the File Server, or vice versa, where they must repeat the copy until it succeeds. Likewise, every month the IT headquarters has to share updates of at least 1 GB with IT staff in branch offices.

4.1. Delay

Delay is the time a packet needs to travel from one point to its destination. Here, delay is obtained from the time difference between one TCP packet and the next. Table 1 shows the latency quality categories, based on the amount of delay, that we used as our reference. The delay measurement is illustrated with one example from the non-optimized network, measured using Wireshark. Fig. 5 shows an example of the Wireshark capture and the calculation.

Table 1: Delay Category.

Latency Category    Delay Time
Very Good           < 150 ms
Good                150 to 300 ms
Bad                 300 to 450 ms
Very Bad            > 450 ms

Fig. 5: Measurement and calculation of delay with Wireshark.
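
The delay calculation and the categories of Table 1 can be sketched in Python. This is an illustration only, not the authors' script: the function names and the capture times are hypothetical, and the value 300 ms, which Table 1 lists under both Good and Bad, is assumed here to count as Good.

```python
def inter_packet_delays(timestamps):
    """Differences between consecutive packet capture times (same unit as input)."""
    return [later - earlier for earlier, later in zip(timestamps, timestamps[1:])]


def latency_category(delay_ms):
    """Map a delay in milliseconds to the categories of Table 1.

    Assumption: exactly 300 ms is treated as Good, because Table 1 lists
    that value under both Good and Bad.
    """
    if delay_ms < 150:
        return "Very Good"
    if delay_ms <= 300:
        return "Good"
    if delay_ms <= 450:
        return "Bad"
    return "Very Bad"


capture_times_ms = [0.0, 0.5, 1.25, 2.0]   # hypothetical capture times
delays = inter_packet_delays(capture_times_ms)
categories = [latency_category(d) for d in (0.604, 287, 320, 500)]
```

With the values reported in this paper, 287 ms falls in the Good category and 0.604 ms in the Very Good category.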

We then ran the simulation with the five files; the results for the unoptimized and optimized networks are summarized in Table 2.
Table 2 shows a very significant difference in delay between the unoptimized and optimized networks: the difference is 286 ms for the first file, 258 ms for the second, 264 ms for the third, 186 ms for the fourth, and 253 ms for the fifth. Based on the reference in Table 1, the delay on the unoptimized network falls in the Good category. Even so, when a copy is made during peak hours, the network is sometimes disconnected and the copy must be restarted, which costs considerable time. The delay on the optimized network falls in the Very Good category: copying the largest file, 1.17 GB, incurs a delay of only 1.434 ms.
The difference arises because WAN optimization has a function called Scalable Data Referencing, which reduces bandwidth consumption. Reference data are stored on the hard drives of the two appliances and are used by any application or transfer that needs them, whether a file, web page, email, or other data. When data are exchanged, the appliance checks for the reference data on its hard disk; if the data are not there, the appliance records them on the hard disk. If the data already exist, the appliance only needs to pass the transfer on to the destination. In other words, all common data are already known, and only the small changes are sent over the network. This makes the delay in data transfer small compared with transfers without WAN optimization. The comparison of delay can be seen more clearly in Fig. 6.
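
The principle of sending only data that the appliances have not seen before can be illustrated with a generic chunk-hashing sketch. This is not the vendor's Scalable Data Referencing implementation, whose details are not given here; the class `ReferenceStore`, the fixed 4-byte chunk size, and the use of SHA-256 digests as references are all assumptions made for illustration.

```python
import hashlib


class ReferenceStore:
    """Toy stand-in for an appliance's on-disk store of already-seen data."""

    def __init__(self):
        self._chunks = {}  # SHA-256 digest -> chunk bytes

    def encode(self, data, chunk_size=4):
        """Split data into chunks and send full bytes only for unseen chunks."""
        messages = []
        for start in range(0, len(data), chunk_size):
            chunk = data[start:start + chunk_size]
            digest = hashlib.sha256(chunk).digest()
            if digest in self._chunks:
                messages.append(("ref", digest))   # known chunk: send a reference
            else:
                self._chunks[digest] = chunk       # new chunk: record it
                messages.append(("raw", chunk))    # and send the bytes
        return messages

    def decode(self, messages):
        """Rebuild the data on the receiving side, learning new chunks on the way."""
        parts = []
        for kind, value in messages:
            if kind == "raw":
                self._chunks[hashlib.sha256(value).digest()] = value
                parts.append(value)
            else:
                parts.append(self._chunks[value])
        return b"".join(parts)


sender, receiver = ReferenceStore(), ReferenceStore()
first_file = b"AAAABBBBCCCC"
second_file = b"AAAABBBBDDDD"   # only the last chunk differs

first_out = receiver.decode(sender.encode(first_file))
second_msgs = sender.encode(second_file)
second_out = receiver.decode(second_msgs)
raw_in_second = [m for m in second_msgs if m[0] == "raw"]
```

In this example the second transfer carries only one raw chunk; the other two travel as short references that the receiving side resolves from its own store.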

Fig. 6: Measurement and comparison of delay.

Table 2: Measurement of delay.

Delay           Test Video.mp4   Test Video2.mp4   Test 3.zip      2016.zip   Lotus.zip
                (93 MB)          (766 MB)          (120 MB)        (310 MB)   (1.17 GB)
Not Optimized   287 ms           268 ms            265 ms          189 ms     255 ms
Optimized       <1 ms (0.604)    9.936 ms          <1 ms (0.723)   2.451 ms   1.434 ms
Difference      286 ms           258 ms            264 ms          186 ms     253 ms

4.2. Jitter

Jitter is defined as the variation in delay, caused by queue lengths during data processing and by the reassembly of data packets at the receiving end after earlier failures. Jitter is calculated with Equation 1.

\[
\text{jitter} = \frac{\text{Total delay variation}}{\text{Total packets received} - 1} \tag{1}
\]

Here the total delay variation is the sum $(\text{delay}_2 - \text{delay}_1) + (\text{delay}_3 - \text{delay}_2) + \dots + (\text{delay}_n - \text{delay}_{n-1})$. The results of the simulation are shown in Table 3.
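
Equation 1 can be sketched in Python as follows. Two assumptions are made that the text does not state: each delay variation is taken as an absolute value (with signed differences the sum would collapse to the last delay minus the first), and the number of packets received is taken to be the number of delay samples. The function name and sample values are hypothetical.

```python
def jitter(delays):
    """Equation 1: total delay variation divided by (packets received - 1).

    Assumption: each variation is an absolute difference between
    consecutive delays, and len(delays) is the number of packets received.
    """
    if len(delays) < 2:
        raise ValueError("need at least two delay samples")
    total_variation = sum(abs(b - a) for a, b in zip(delays, delays[1:]))
    return total_variation / (len(delays) - 1)


sample_delays_ms = [2.0, 4.0, 3.0, 3.0, 5.0]   # hypothetical values
result = jitter(sample_delays_ms)              # (2 + 1 + 0 + 2) / 4
```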

Table 3: Measurement and calculation of jitter.

Jitter          Test Video.mp4   Test Video2.mp4   Test 3.zip   2016.zip   Lotus.zip
                (93 MB)          (766 MB)          (120 MB)     (310 MB)   (1.17 GB)
Not Optimized   1.99 ms          2.39 ms           1.283 ms     2.45 ms    1.611 ms
Optimized       0.131 ms         0.108 ms          0.124 ms     0.667 ms   0.128 ms
Percentage      1419%            2113%             934%         267%       1158%

Table 3 shows that the difference in jitter between the unoptimized and optimized networks is small in absolute terms (a few milliseconds at most), although the percentage difference is large: on average, jitter improved by 1178%. The magnitude of jitter is also related to delay: large jitter degrades network performance and goes together with large delay. The measurements in Table 3 are also summarized in Fig. 7.

Fig. 7: Measurement and comparison of jitter.
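
The Percentage row of Table 3 appears to express the reduction in jitter relative to the optimized value. This formula is an inference from the numbers and is not stated in the text. The sketch below recomputes the row under that assumption; the results agree with the table to within one percentage point, and the small deviations suggest that rounding was not applied uniformly.

```python
not_optimized_ms = [1.99, 2.39, 1.283, 2.45, 1.611]   # Table 3, row "Not Optimized"
optimized_ms = [0.131, 0.108, 0.124, 0.667, 0.128]    # Table 3, row "Optimized"
table_percent = [1419, 2113, 934, 267, 1158]          # Table 3, row "Percentage"


def relative_improvement(before, after):
    """Reduction expressed as a percentage of the optimized value (assumed formula)."""
    return (before - after) / after * 100


computed = [relative_improvement(b, a) for b, a in zip(not_optimized_ms, optimized_ms)]
average = sum(computed) / len(computed)   # compare with the reported 1178%
```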

4.3. Throughput

Throughput was also measured in the simulation, both on the network without WAN optimization and on the network with it, using TeraCopy. Table 4 shows that the difference between the unoptimized and optimized networks is very significant: throughput rose from the order of kilobytes per second to the order of megabytes per second.

Table 4: Measurement of Throughput.

Throughput      Test Video.mp4   Test Video2.mp4   Test 3.zip   2016.zip   Lotus.zip
                (93 MB)          (766 MB)          (120 MB)     (310 MB)   (1.17 GB)
Not Optimized   341 KB/s         410 KB/s          171 KB/s     73 KB/s    273 KB/s
Optimized       11 MB/s          11 MB/s           6.8 MB/s     8.8 MB/s   7.7 MB/s
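
To give a sense of scale for the throughput figures in Table 4, the sketch below estimates the copy time of the 93 MB file at the two measured rates. It assumes a constant rate for the whole transfer and binary units (1 MB = 1024 KB); neither assumption comes from the text, and the names used are hypothetical.

```python
KB_PER_MB = 1024  # assumption: binary units

file_size_mb = 93             # Test Video.mp4
unoptimized_kb_per_s = 341    # Table 4, Not Optimized
optimized_mb_per_s = 11       # Table 4, Optimized


def transfer_seconds(size_mb, rate_mb_per_s):
    """Idealized copy time: size divided by a constant rate."""
    return size_mb / rate_mb_per_s


unoptimized_s = transfer_seconds(file_size_mb, unoptimized_kb_per_s / KB_PER_MB)
optimized_s = transfer_seconds(file_size_mb, optimized_mb_per_s)
speedup = unoptimized_s / optimized_s
```

Under these assumptions the copy takes roughly 280 s without optimization and about 8.5 s with it, a ratio of about 33.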

5. Conclusion

The simulation results show that the optimized network performs significantly better than the unoptimized network. The data exchange process becomes shorter, users do not have to restart copies from the beginning, and the bandwidth is not overloaded merely because the CIFS protocol dominates the WAN.
The results show a significant increase in performance: delay improved from 287 ms to 0.604 ms for a 93 MB file, and jitter improved by 12.4%. It can be concluded that with WAN optimization, data transfer becomes more efficient in both bandwidth and time during working hours.

References

1. Tanenbaum, A.S. Computer Networks. 5th ed. Boston: Pearson Education, Inc., 2010.
2. Dordal, P. L. An Introduction to Computer Networks Release 1.8.17, Department of Computer Science, Loyola University Chicago, 2016.
3. Bonaventure, O. Computer Networking : Principles, Protocols and Practice. Release 0.25. The Saylor Foundation, 2011.
4. Kaur, M. A. An Overview of Quality of Service Computer Network, Indian Journal of Computer Science and Engineering (IJCSE), Vol. 2 No. 3, Jun-Jul 2011.
5. White, C. M. Data Communications and Computer Networks: A Business User’s Approach, Seventh Edition. Boston: Course Technology, 2013.
6. Comer, D. E. Computer Networks and Internets, Fifth Edition. New Jersey: Pearson Education, Inc., 2009.
7. Y. Takano, N. Oguchi, H. Tomonaga and S. Abe, Application and evaluation of distributed WAN optimization technique in heterogeneous networks, the 22nd International Conference on Software, Telecommunications and Computer Networks (SoftCOM), Split, 2014, pp. 284-288.
8. Dr. Shu Yinbiao, T. P. Internet of Things : Wireless Sensor Networks. International Electrotechnical Commission, 2014.
9. Chappell, L. Wireshark Network Analysis 2nd Ed. Chappell University, 2012.
10. A. M. Al-Sadi, A. Al-Sherbaz, J. Xue and S. Turner, Routing algorithm optimization for software defined network WAN, Al-Sadeq International Conference on Multidisciplinary in IT and Communication Science and Applications (AIC-MITCSA), Baghdad, 2016, pp. 1-6.
11. Marchese, M. QoS Over Heterogeneous Networks. England: John Wiley & Sons, Ltd., 2007.
