Download as pdf or txt
Download as pdf or txt
You are on page 1of 12

N E TA P P S N A P M I R RO R

Proving the concept of accelerated replication


with PORTrockIT
E X EC U T I V E S U M M A RY For example, even at 1.5% packet loss
(the highest packet loss rate tested),
For one major US healthcare provider, the PORTrockIT solution was able to
the need to protect large volumes of maintain a transfer rate of between 68
clinical image files by replicating them and 98 MB/s, in scenarios where an
to a disaster recovery site was causing unaccelerated connection could only
problems. With significant packet loss deliver speeds between 1 and 7 MB/s.
on the connection between its two
data centres, transfer speeds slowed
to a snail’s pace – making it impossible
“PORTrockIT achieved
to complete the NetApp SnapMirror transfer rates up to
replication job within a timeframe
acceptable to the business. 68 times faster than
The company’s IT partners, Trace3 and
an unaccelerated
Agilesys, knew of a technology that architecture.”
could potentially solve the problem:
PORTrockIT from Bridgeworks. They
offered to run a comprehensive proof- These results have helped to build a
of-concept exercise to demonstrate strong business case for the healthcare
how PORTrockIT could counteract the provider to adopt PORTrockIT – and
effects of packet loss, and transform the provide compelling evidence that a
performance of the replication process. similar solution could benefit almost
any organisation that uses NetApp
This paper explores the results of SnapMirror for data replication across a
the proof-of-concept, showing how wide-area network.
PORTrockIT significantly improved
average performance across more
than 300 individual test scenarios,
including different levels of packet loss,
different types of data, and different
configurations of the NetApp SnapMirror
replication process.
W H Y S P E E D M AT T E R S critical systems. This scenario applies
particularly to hospitals and healthcare
In the healthcare sector, data volumes organisations that operate 24/7, which
are growing rapidly. In particular, the may have only very brief periods of low
drive to digitise medical records, x-ray activity.
and pathology results is generating
terabytes of image files, full of sensitive To exacerbate the problem, if a NetApp
private data that needs to be stored SnapMirror replication job fails before it
safely and securely, potentially for is completed, it must be restarted from
decades. the beginning. As jobs get larger, both
the risk and the consequences of failure
In most cases, the safest way to protect become more severe, and the likelihood
this data is to replicate it off-site, to of overrunning the available window
a geographically remote data centre. increases.
However, image files can present a
challenge for backup and replication
solutions, especially those that use data
“PORTrockIT helps to
compression to optimise performance, avoid backup issues
such as NetApp SnapMirror. Since
most image formats are already highly by accelerating the
compressed, the backup solution
cannot slim them down much further,
replication process.”
and therefore cannot provide the
usual performance gains. As a result, If the data cannot be fully replicated
replication jobs can take longer for within the required window, the
images than they would for a similar organisation may need to make
volume of more highly compressible some tough choices. It could extend
data. the window, but that may impact
the performance of other systems.
As the amount of data that needs to Alternatively, it could either replicate
be protected increases, it can become less data, or replicate the same amount
difficult to complete off-site replications of data less frequently. However, this
within a reasonable time-window. For might leave data unprotected, or make it
example, if an organisation wants to more difficult to recover in the event of a
replicate its data every day, it may disaster.
only have a few hours during the night
when few users are online, when the A better solution is to find a way to
replication job can be executed without accelerate the replication process – such
affecting performance for business- as PORTrockIT.
T H E P RO B L E M S : L AT E N C Y The second issue is packet loss – where
A N D PAC K E T LOS S a packet sent from a system on one side
of the WAN never arrives at the system
In general, there are two main issues that is intended to receive it, or the
that cause the majority of performance acknowledgement from the recipient
problems when replicating data across a goes astray before it reaches the sender.
wide area network (WAN). When this happens, TCP/IP automatically
reduces the number of packets it sends
The first is latency – the time delay in the next group, to compensate for
between a system sending a packet the unreliability of the connection. As
across the WAN, and the target system a result, network utilisation is greatly
receiving that packet. The main causes of reduced, because the sender is sending
latency are the physical distance that the fewer packets in the same amount of
packet has to travel, and the time taken time.
to receive, queue and process packets
at either end of the connection and at “Investing in expensive
any intermediate gateways. The further
the data has to travel, and the more high-bandwidth
gateways it has to pass through, the network infrastructure
greater the latency.
will not fix TCP/IP
For replication jobs that use the TCP/IP
protocol, high latency can cripple transfer
problems caused by
rates, even over a theoretically high- latency or packet loss.”
bandwidth WAN infrastructure. TCP/IP
works by sending a group of packets,
then waiting for an acknowledgement Organisations often try to solve TCP/IP
that the packets have been received performance problems by investing in
before it sends the next group. If the more expensive network infrastructure
latency of the connection is high, then that offers a larger maximum bandwidth.
the sender spends most of its time However, this does not fix the problem.
waiting for acknowledgements, rather As we have seen, latency and packet loss
than actually sending data. During these prevent TCP/IP connections from fully
periods, the network is effectively idle, utilising the available bandwidth – so any
with no new data being transferred. extra investment in bandwidth will simply
be wasted unless the latency and packet
loss issues can be addressed.
THE SOLUTION: Moreover, PORTrockIT is capable of
P O R T RO C K I T optimising the flow of data across
the WAN in real time, even if network
PORTrockIT offers a solution to network conditions change. The solution
latency issues. Instead of sending a incorporates a number of artificial
group of packets down a single physical intelligence engines that continuously
connection and waiting for a response, manage, control and configure multiple
the solution creates a number of parallel aspects of PORTrockIT – enabling the
virtual connections that send a constant appliance to operate optimally at all
stream of data across the physical times, without any need for input from a
connection. network administrator.

As soon as a virtual connection has sent “PORTrockIT uses


its packets and starts waiting for an
acknowledgement from the recipient,
artificial intelligence
PORTrockIT immediately opens another to accelerate data
virtual connection and sends the next
set of packets. Further connections transfer across multiple
are opened until the first connection
receives its acknowledgement; this first
parallel connections.”
connection is then re-used to send
another set of packets, and the whole In practical terms, PORTrockIT is installed
process repeats. as a pair of physical or virtual appliances,
deployed at either end of the WAN. The
This parallelisation combats latency by backup server simply passes data to the
ensuring that the physical connection is PORTrockIT appliance on the near side
constantly transferring new packets from of the WAN, which manages the virtual
the sender to the recipient: there is no connections to the second PORTrockIT
longer any idle time, and the network’s appliance on the far side of the WAN.
bandwidth can be fully utilised. Once the second PORTrockIT appliance
begins receiving packets, it routes them
The solution also significantly reduces seamlessly to the recipient server. The
the impact of packet loss. If one of the effect is simply much faster network
virtual connections loses a packet, TCP/IP transfer performance, without any need
will only reduce the number of packets to make any changes to the rest of the
in the next group sent by that specific network architecture.
virtual connection. All the other virtual
connections continue to operate at full
speed.
T U R N I N G T H EO RY I N TO The proof-of-concept infrastructure
P R AC T I C E mimicked a real-world SnapMirror
architecture, using a WANulator to
A major US healthcare company wanted simulate different levels of latency and
to replicate large quantities of data packet loss between the source (the
– mainly image files – from its main NetApp storage unit that would send the
data centre to a secondary location data), and the target (the NetApp unit
several miles away. The image data was that would receive the replicated data).
already highly compressed, and due to
its sensitive nature, it also needed to All the tests were performed first on
be encrypted – limiting SnapMirror’s an unaccelerated architecture, where
ability to optimise network performance the source and target were connected
through compression. Moreover, due to directly to the WANulator (see figure 1).
the long distance between the sites, the The same tests were then repeated on
WAN connection suffered from relatively an architecture that was accelerated
high levels of both latency and packet by introducing two physical PORTrockIT
loss. appliances, which sat either side of the
WANulator, between the source and the
The company’s IT partners, Trace3 and target (see figure 2).
Agilesys, proposed a solution based on
NetApp SnapMirror and Bridgeworks A battery of more than 300 tests
PORTrockIT appliances. To demonstrate was designed to simulate different
to their client that the solution would SnapMirror replication configurations
work in practice, they offered to run (for example, sync, async and semi-
a comprehensive proof-of-concept sync settings in ONTAP cluster mode),
exercise at Trace3’s testing facility in and different size and types of data (for
Phoenix, Arizona. example, replications jobs containing

NetApp FAS NetApp FAS


Clustered Clustered
ONTAP WANulator ONTAP

10Gb 10Gb

Figure 1: Unaccelerated architecture


multiple 10+ KB files versus multiple
10+ GB files, and jobs containing highly T E ST EQ U I P M E N T
compressible files versus compressed/
encrypted files). S O F T WA R E :

• WANulator ISO image


The performance of each of these • Test Data Generation Tool
configurations was then tested
under different levels of latency and H A R DWA R E :
packet loss, and the results from all
• Windows Server 2008 R2
of the configurations were used to
• 2 x NetApp FAS8060 units
calculate NetApp SnapMirror’s average • 2 x PORTrockIT 210 nodes
performance in the various latency and • WANulator host
packet loss scenarios.
N E T WO R K :
After conducting each performance test,
• Two standard 10 Gb/s LAN
a data integrity check was performed
connection management ports
to confirm that data was not lost or
• One 10 Gb/s 10 GbE interface
corrupted during testing, and that the card (with 1 SFP+ LC 50 micron
files written to and read from the NetApp multimode fiber connector)
storage were identical to the original
source files.

NetApp FAS NetApp FAS


Clustered Clustered
ONTAP PORTrockIT WANulator PORTrockIT ONTAP

10Gb 10Gb 10Gb 10Gb

Figure 2: Accelerated architecture with PORTrockIT


W H AT T H E DATA T E L L S U S At these higher levels of latency, the
PORTrockIT architecture did offer a
L AT E N C Y significant performance improvement
– a 44% increase at 100 ms, and a 45%
The first set of tests simulated a scenario
increase at 150 ms.
with no packet loss, at latencies ranging
from 0 ms to 360 ms round trip time
The results show that even without
(RTT). Data was replicated from the
PORTrockIT acceleration, SnapMirror
source to the target using NetApp
provides an efficient means of replicating
SnapMirror, first via the unaccelerated
data across a WAN when packet loss
architecture, and then again via
is not an issue. High levels of latency
the accelerated architecture with
do impact performance, but transfer
PORTrockIT.
rates of around 100 MB/s can still
be considered acceptable in most
Looking at Figure 3, the results show
replication scenarios.
that there was relatively little difference
between the two architectures – both
However, next, let’s look at what happens
delivered very high performance at low
when packet loss is introduced into the
levels of latency, with a sharp dip in
equation.
transfer rate as latency neared 100 ms.

500
Accelerated
Unaccelerated

400
Transfer Rate (MB/s)

300

200

100

0
0 10 20 30 40 50 60 70 80 90 100 110 120 130 140 150

Latency / Round Trip Time (ms)

Figure 3: Accelerated and unaccelerated performance at various latencies with 0% packet loss
PAC K E T LOS S The results lead to the conclusion that
while NetApp SnapMirror is efficient at
Three other scenarios with different
replicating data across an ideal network,
levels of packet loss (0.5%, 1% and
it can struggle to achieve the same
1.5%) were also tested at various levels
performance in a real-world environment
of latency across multiple different
where significant levels of packet loss
SnapMirror configurations. Figures 4, 5
occur. Even where network latency is
and 6 all show that the unaccelerated
very low, a small amount of packet loss
architecture saw severe performance
can degrade performance to a point
degradation from the introduction of
where replication jobs run so slowly that
packet loss. In all three scenarios, the
they are likely to overrun the available
accelerated architecture performed
window.
considerably better.

In such situations, the adoption of


Even in the most extreme example
PORTrockIT can have a very significant
(150 ms of latency with 1.5% packet loss)
impact, keeping transfer rates at 68
the accelerated architecture achieved
MB/s or above even in the most difficult
a transfer rate of 68 MB/s, compared
scenarios.
to just 1 MB/s on the unaccelerated
architecture.

150
Accelerated
Unaccelerated

100
Transfer Rate (MB/s)

50

0
0 10 20 30 40 50 60 70 80 90 100 110 120 130 140 150

Latency / Round Trip Time (ms)

Figure 4: Accelerated and unaccelerated performance with 0.5% packet loss at various levels of latency
150
Accelerated
Unaccelerated

100
Transfer Rate (MB/s)

50

0
0 10 20 30 40 50 60 70 80 90 100 110 120 130 140 150

Latency / Round Trip Time (ms)

Figure 5: Accelerated and unaccelerated performance with 1% packet loss at various levels of latency

100
Accelerated
90 Unaccelerated

80

70
Transfer Rate (MB/s)

60

50

40

30

20

10

0
0 10 20 30 40 50 60 70 80 90 100 110 120 130 140 150

Latency / Round Trip Time (ms)

Figure 6: Accelerated and unaccelerated performance with 1.5% packet loss at various levels of latency
REALISING THE BUSINESS high-bandwidth connections or more
B E N E F I TS powerful backup servers – enabling
significant cost-avoidance.
The proof-of-concept exercise enabled
Trace3 and Agilesys to help their “PORTrockIT delivers
client make a strong business case
for investing in PORTrockIT. The tests significant business
showed that the Bridgeworks technology
would make it feasible to replicate large
value by transforming
volumes of sensitive, highly compressed WAN replication
and encrypted clinical image data across
a WAN. This would help the healthcare performance.”
provider meet its data protection,
business continuity and disaster recovery A BO U T T H E AU T H O R
goals, as well as facilitating compliance
with laws and regulations around the David Trossell has been part of the
safe and secure storage of patient data. IT industry for over 30 years, working
for infrastructure specialists such as
Looking beyond the specific context of Rediffusion, Norsk Data and Spectra
the proof-of-concept, the results also Logic before joining Bridgeworks in
suggest that PORTrockIT can transform 2000 as CEO/CTO. He is a recognised
performance for any organisation that visionary in the storage technology
wants to use NetApp SnapMirror for industry, and has been instrumental
replication across a WAN. If replication in setting the company’s strategic
jobs are threatening to overrun the direction and developing its innovative
available window, or if it is desirable to range of solutions. David is the primary
reduce the replication window to free up inventor behind Bridgeworks’ intellectual
server and network resources for other property, and has authored or co-
important jobs, PORTrockIT provides an authored 16 international patents.
elegant solution.
TA K E T H E N E X T ST E P S
PORTrockIT offers plug-in-and-go
technology that can be implemented
To learn more about PORTrockIT and
quickly with minimal impact on the
other smart networking solutions
rest of your IT infrastructure – keeping
from Bridgeworks, please visit
deployment cost and risk to a minimum.
www.4bridgeworks.com, or call us on
By maximising the performance of
+44 (0) 1590 615 444.
existing infrastructure, PORTrockIT also
reduces the need to invest in expensive
Copyright Bridgeworks 2018 / www.4bridgeworks.com / +44 (0) 1590 615 444

You might also like