Sherif Akoush, Ripduman Sohan, Andrew Rice, Andrew W. Moore and Andy Hopper
University of Cambridge Computer Laboratory
firstname.lastname@cl.cam.ac.uk
Abstract—With the ability to move virtual machines between physical hosts, live migration is a core feature of virtualisation. However, for migration to be a useful, deployable feature at a large (datacentre) scale, we need to predict migration times with accuracy. In this paper, we characterise the parameters affecting live migration with particular emphasis on the Xen virtualisation platform. We discuss the relationships between the important parameters that affect migration and highlight how migration performance can vary considerably depending on workload. We further provide 2 simulation models that are able to predict migration times to within 90% accuracy for both synthetic and real-world benchmarks.

I. INTRODUCTION

Virtualisation has become a core facility in modern computing installations. It provides opportunities for improved efficiency by increasing hardware utilisation and application isolation [1] as well as simplifying resource allocation and management. One key feature that makes virtualisation attractive is that of live migration.

Live migration platforms (e.g. XenMotion [2] and VMotion [3]) allow administrators to move running virtual machines (VMs) seamlessly between physical hosts. This is of particular benefit for service providers hosting high availability applications. The level of service to which a service provider commits when hosting and running an application is described by a Service Level Agreement (SLA) and is typically couched in terms of application availability. Policies such as 99.999% availability (common in the telecommunication industry) permit only 5 minutes of downtime a year. Routine activities such as restarting a machine for hardware maintenance are extremely difficult under such a regime, and so service providers invest considerable resources in high-availability fault-tolerant systems. Live migration mitigates this problem by allowing administrators to move VMs with little interruption. This permits regular maintenance of the physical hardware; supports dynamic reconfiguration enabling cluster compaction in times of low demand; and shifts computing load to manage cooling hotspots within a datacentre [4].

However, short interruptions of service are still unavoidable during live migration due to the overheads of moving the running VM. Previous studies have demonstrated that this can vary considerably between applications due to different memory usage patterns, ranging from 60 milliseconds when migrating a Quake game server [5] to 3 seconds in the case of High-Performance Computing benchmarks [6]. This variation means that predicting the duration of any interruption is necessary on a per-workload basis in order to effectively plan resource allocation and maintenance schedules.

In this paper we examine migration times in detail and propose models which can be used to make workload-specific predictions of service interruptions. In Section II we discuss live migration approaches with detailed emphasis on the Xen migration architecture. In Section III we investigate migration performance and confirm that the page dirty rate and link speed are the significant determining factors. We quantitatively show how particular combinations of these factors can markedly extend migration time and VM unavailability. Finally, in Section IV we utilise our measurements to develop and validate 2 simulation models which are useful for predicting the total migration and service interruption times when migrating any workload.

II. LIVE MIGRATION OF VMS

Live migration is a technology with which an entire running VM is moved from one physical machine to another. Migration at the level of an entire VM means that active memory and execution state are transferred from the source to the destination. This allows seamless movement of online services without requiring clients to reconnect. As migration designs to date require universally accessible network attached storage (NAS), migrations are reduced to copying in-memory state and CPU registers. On migration completion, virtual I/O devices are disconnected from the source and re-connected on the destination physical host.

There are several techniques for live migration that trade off two important parameters: total migration time and downtime. Total migration time refers to the total time required to move the VM between physical hosts, while downtime is the portion of that time when the VM is not running.

Pure stop-and-copy [7], [8], [9] designs halt the original VM and copy its entire memory to the destination. This technique minimises total migration time but suffers from high downtime as the VM is suspended during the entire transfer. Pure on-demand [10] migration, on the other hand, operates by stopping the VM to copy only essential kernel data to the destination. The remainder of the VM address space is transferred when it is accessed at the destination. While this technique has a very short downtime, it suffers from high total migration time.

It has previously been established that both stop-and-copy and on-demand migrations have poor performance [5]. The former may lead to significant service disruption especially
if the VM is running highly used applications, while the latter incurs a longer total migration time and degraded performance during the synchronisation of on-demand pages between hosts.

A. Pre-copy Migration

Pre-copy migration tries to tackle the problems associated with earlier designs by combining a bounded iterative push step with a final and typically very short stop-and-copy phase [5]. The core idea of this design is that of iterative convergence. The design involves iterating through multiple rounds of copying in which the VM memory pages that have been modified since the previous copy are resent to the destination, on the assumption that at some point the number of modified pages will be small enough to halt the VM temporarily, copy the (small number of) remaining pages across, and restart it on the destination host. Such a design minimises both total migration time and downtime.

Pre-copy migration involves 6 stages [5], namely:
1) Initialisation: a target is pre-selected for future migration.
2) Reservation: resources at the destination host are reserved.
3) Iterative pre-copy: pages modified during the previous iteration are transferred to the destination. The entire RAM is sent in the first iteration.
4) Stop-and-copy: the VM is halted for a final transfer round.
5) Commitment: the destination host indicates that it has successfully received a consistent copy of the VM.
6) Activation: resources are re-attached to the VM on the destination host.

Unless there are stop conditions, the iterative pre-copy stage may continue indefinitely. Thus, the definition of stop conditions is critical in terminating this stage in a timely manner. These conditions are usually highly dependent on the design of both the hypervisor and the live migration sub-system, but are generally defined to minimise link usage and the amount of data copied between physical hosts while minimising VM downtime. However, the existence of these stop conditions has a significant effect on migration performance and may cause non-linear trends in the total migration time and downtime experienced by VMs (as will be shown in Section III).

B. Defining Migration Performance

Migration performance may be evaluated by measuring total migration time and total downtime. The former is the period when state on both machines is synchronised, which may affect reliability, while the latter is the duration in which the VM is suspended and is thus seen by clients as a service outage.

Using the pre-copy migration model, total migration time may be defined as the sum of the time spent on all 6 migration stages (Equation 1), from initialisation at the source host through to activation at the destination. Total downtime, however, is the time required for the final 3 stages to complete (Equation 2).

While it is expected that the iterative pre-copy stage will dominate total migration time, our measurements found that for certain classes of applications (specifically those that do not have a high memory page modification rate) the initialisation, reservation, commitment and activation stages may add a significant overhead to total migration time and downtime. We classify the initialisation and reservation stages together as pre-migration overhead, while the commitment and activation stages compose post-migration overhead.

    TotalMigrationTime = \underbrace{Initialisation + Reservation}_{\text{Pre-migration Overhead}}
                         + \sum_{i} \text{Pre-copy}_i + \text{Stop-and-copy}
                         + \underbrace{Commitment + Activation}_{\text{Post-migration Overhead}}    (1)

    TotalDowntime = \text{Stop-and-copy}
                    + \underbrace{Commitment + Activation}_{\text{Post-migration Overhead}}    (2)
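Equations 1 and 2 are simple sums over per-stage timings; the short sketch below (ours, not from the paper, with hypothetical stage durations) makes the decomposition explicit.

    # Illustrative sketch of Equations (1) and (2); stage timings are
    # hypothetical placeholders, not measurements from the paper.
    def total_migration_time(t):
        # Equation (1): all six stages; "pre_copy" holds the sum of all iterations.
        return (t["initialisation"] + t["reservation"] + t["pre_copy"]
                + t["stop_and_copy"] + t["commitment"] + t["activation"])

    def total_downtime(t):
        # Equation (2): stop-and-copy plus the post-migration overhead.
        return t["stop_and_copy"] + t["commitment"] + t["activation"]

    timings = {"initialisation": 0.8, "reservation": 0.4, "pre_copy": 9.2,
               "stop_and_copy": 0.05, "commitment": 0.1, "activation": 0.2}
    print(total_migration_time(timings), total_downtime(timings))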
C. Migration Bounds

Given the stop conditions, it is possible to work out the upper and lower migration performance bounds for a specific migration algorithm. We will use a real-world case to characterise these boundaries.

While there exist a range of live migration platforms, for the remainder of this paper we will base our analysis on the Xen migration platform. Xen is already being used as the basis for large scale cloud deployments [11] and thus this work would immediately benefit these deployments. Moreover, Xen is open-source, allowing us to quickly and efficiently determine the migration sub-system design and implementation. Note however that our measurement techniques, methodology, and the design basis of our prediction models are applicable to any virtualisation platform that employs the pre-copy migration mechanism.

The stop conditions that are used in the Xen migration algorithm are defined as follows:
1) Less than 50 pages were dirtied during the last pre-copy iteration.
2) 29 pre-copy iterations have been carried out.
3) More than 3 times the total amount of RAM allocated to the VM has been copied to the destination host.

The first condition guarantees a short downtime as few pages remain to be transferred. On the other hand, the other 2 conditions simply force migration into the stop-and-copy stage, which might still have many modified pages to be copied across, resulting in a large downtime.
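The three conditions can be expressed compactly as a single predicate; the sketch below is our paraphrase (the constants mirror the values quoted above, but the function and variable names are ours, not identifiers from the Xen source).

    # Our paraphrase of the three Xen stop conditions listed above.
    MAX_SENT = 50          # condition 1: fewer than 50 pages dirtied last round
    MAX_ITERATIONS = 29    # condition 2: at most 29 pre-copy iterations
    MAX_FACTOR = 3         # condition 3: at most 3x the VM's RAM copied in total

    def should_enter_stop_and_copy(pages_dirtied_last_iter, iterations_done,
                                   total_pages_sent, vm_pages):
        return (pages_dirtied_last_iter < MAX_SENT
                or iterations_done >= MAX_ITERATIONS
                or total_pages_sent > MAX_FACTOR * vm_pages)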
1) Bounding Total Migration Time (Equation 3): Consider the case of an idle VM running no applications. In this case the iterative pre-copy stage will terminate after the first iteration as there is no memory difference. Consequently, the migration sub-system need only send the entire RAM in the first round. The total migration lower bound is thus the time required to send the entire RAM coupled with pre- and post-migration overheads.

On the other hand, consider the case where the entire memory is being modified as fast as the link speed. In this scenario, the iterative pre-copy stage will be forced to terminate after copying more than 3 times the total amount of RAM allocated to the VM. Migration then re-sends the entire modified RAM during the stop-and-copy stage. The total migration upper bound is thus defined as the time required to send 5 times the VM size less 1 page¹ plus pre- and post-migration overheads.

¹ The 5 times the VM size less 1 page figure decomposes as
    \underbrace{\sum_{i=0}^{n-1} \text{Pre-copy}_i}_{3 \cdot VMSize - 1\,\text{page}} + \underbrace{\text{Pre-copy}_n}_{1 \cdot VMSize} + \underbrace{\text{Stop-and-copy}}_{1 \cdot VMSize}

    Overheads + \frac{VMSize}{LinkSpeed} \le TotalMigrationTime \le Overheads + \frac{5 \cdot VMSize - 1\,\text{page}}{LinkSpeed}    (3)

2) Bounding Total Downtime (Equation 4): Similarly, the total downtime lower bound is defined as the time required for the post-migration overhead, assuming that the final stop-and-copy stage does not transfer any pages. This occurs either when the VM is idle or when the link speed is fast enough to copy all dirtied pages in the pre-copy stage. On the other hand, the total downtime upper bound is defined as the time required to copy the entire RAM in the stop-and-copy stage coupled with the post-migration overhead.

    Post\text{-}migration\ Overhead \le TotalDowntime \le Post\text{-}migration\ Overhead + \frac{VMSize}{LinkSpeed}    (4)

3) Difference in Bounds: Modelling bounds is useful as it enables us to reason about migration times provided that we know the link speed and VM memory size. These bounds are the limits within which the total migration time and total downtime are guaranteed to lie. Given a 1,024 MB VM and a 1 Gbps migration link, for example, the total migration time has a lower bound of 13 seconds and an upper bound of 50 seconds. Similarly, the downtime has a lower bound of 0.314 seconds and an upper bound of 9.497 seconds.

Table I illustrates the migration bounds for some common link speeds. While the downtime lower limit is fixed (as it is dependent purely on post-migration overhead), all other bounds vary in accordance with link speed due to their correlation with the VM memory size. As the table indicates, the bounds vary significantly. For bigger VM memory sizes (which are common in current installations [12]) we have even larger differences. Thus, using bounds is at best an imprecise exercise and does not allow for accurate prediction of migration times. Building better predictions requires understanding the relationship between the factors that impact migration performance. We address this in the next section.

TABLE I
MIGRATION BOUNDS. MT: TOTAL MIGRATION TIME (SECONDS). DT: TOTAL DOWNTIME (MILLISECONDS). LB: LOWER BOUND. UB: UPPER BOUND. VM SIZE = 1,024 MB.

    Speed      | MT_LB   | MT_UB    | DT_LB   | DT_UB
    100 Mbps   | 95.3 s  | 459.1 s  | 314 ms  | 91,494.5 ms
    1 Gbps     | 13.3 s  | 49.9 s   | 314 ms  | 9,497.8 ms
    10 Gbps    | 5.3 s   | 10.1 s   | 314 ms  | 1,518.7 ms
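The bounds in Equations 3 and 4 are straightforward to evaluate; the sketch below is ours (not from the paper) and computes them for arbitrary parameters. Reproducing the exact figures in Table I would additionally require the measured pre- and post-migration overheads and the effective (rather than nominal) link throughput, so the example values here are purely illustrative.

    # Sketch of Equations (3) and (4). The example inputs are illustrative
    # assumptions, not the measured values behind Table I.
    PAGE = 4096  # bytes; assumes the 4 KB x86 page size

    def migration_bounds(vm_size_bytes, link_bps, pre_overhead_s, post_overhead_s):
        transfer = lambda nbytes: nbytes * 8 / link_bps   # seconds to send nbytes
        overheads = pre_overhead_s + post_overhead_s
        mt_lb = overheads + transfer(vm_size_bytes)              # Eq. 3, lower
        mt_ub = overheads + transfer(5 * vm_size_bytes - PAGE)   # Eq. 3, upper
        dt_lb = post_overhead_s                                  # Eq. 4, lower
        dt_ub = post_overhead_s + transfer(vm_size_bytes)        # Eq. 4, upper
        return mt_lb, mt_ub, dt_lb, dt_ub

    # Example: a 1,024 MB VM over a nominal 1 Gbps link with hypothetical
    # overheads of 4.0 s (pre) and 0.3 s (post).
    print(migration_bounds(1024 * 2**20, 1e9, 4.0, 0.3))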
III. PARAMETERS AFFECTING MIGRATION

There are several factors that we need to study as a prerequisite for accurate migration modelling. In this section, we explore these factors and their impact on total migration time and downtime. Moreover, the stop conditions that may force migration to reach its final stages are generally what governs migration performance. Obviously, this is implementation specific; it is exemplified here by, but not limited to, Xen's support for live migration.

Migration link bandwidth is perhaps the most influential parameter on migration performance. Link capacity is inversely proportional to total migration time and downtime. Higher speed links allow faster transfers and thus require less time to complete. Figure 1 illustrates migration performance for a 1,024 MB VM running a micro-benchmark that writes to memory pages at rates up to 300,000 pages/second on 100 Mbps, 1 Gbps, and 10 Gbps links. It represents the impact of each link speed on total migration time and downtime. As link bandwidth increases, the point in the curve where migration performance starts to degrade rapidly shifts to the right by roughly the same ratio.

[Figure 1. Panel captions include: (a) 100 Mbps Total Downtime; (b) 100 Mbps Total Migration Time.]

Page dirty rate is the rate at which memory pages in the VM are modified, which, in turn, directly affects the number of pages that are transferred in each pre-copy iteration. Higher page dirty rates result in more data being sent per iteration, which leads to longer total migration time. Furthermore, higher page dirty rates result in longer VM downtime as more pages need to be sent in the final transfer round in which the VM is suspended.

Figure 1 shows the effect of varying the page dirty rate on total migration time and downtime for each link speed. The relationship between the page dirty rate and migration performance is not linear because of the stop conditions defined in the algorithm. If the page dirty rate is below link capacity, the migration sub-system is able to transfer all modified pages in a timely fashion, resulting in a low total migration time and downtime. On the other hand, if the page dirty rate starts approaching link capacity, migration performance degrades significantly.
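To make "approaching link capacity" concrete, the page-carrying capacity of each link can be estimated from the page size. The short calculation below is ours and assumes 4 KB pages and nominal link rates, ignoring protocol overhead and page metadata.

    # Rough capacity check (ours): how many 4 KB pages per second each
    # nominal link rate can carry.
    PAGE_BITS = 4096 * 8

    for name, bps in [("100 Mbps", 100e6), ("1 Gbps", 1e9), ("10 Gbps", 10e9)]:
        print(f"{name}: ~{bps / PAGE_BITS:,.0f} pages/s")

    # Roughly 3,000, 30,500 and 305,000 pages/s respectively; dirty rates near
    # or above these figures push migration towards its upper bounds.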
Total downtime at low page dirty rates is virtually constant and approximately equal to the lower bound (Equation 4). This is because the link has enough capacity to transfer dirty pages in successive iterations, leading to a very short stop-and-copy stage. When the page dirty rate increases to the point that 29 iterations are not sufficient to ensure a short final copy round, or when more than 3x the VM size has been transferred, migration is forced to enter its final stage with a large number of dirty pages yet to be sent. Consequently, total downtime starts to increase in proportion to the increase in the number of modified pages that need to be transferred in the stop-and-copy stage. Total downtime further increases until the defined upper bound, at which it has to send the entire VM memory.

Total migration time also increases with an increasing page dirty rate. This is attributable to the fact that more modified pages have to be sent in each pre-copy round. Moreover, the migration sub-system has to go through more iterations in the hope of achieving a short final stop-and-copy round. For page dirty rates near link speed, total migration time approaches its upper bound (Equation 3) as migration stops when 3x the VM size has been transferred. Then, it starts to fall back towards its lower bound.

For extremely high page dirty rates (compared to link speed), migration is forced to reach its final transfer stage after 29 iterations having sent virtually no pages.² It then has to transfer the entire RAM in the final iteration. This is exemplified clearly for the 100 Mbps link in Figure 1(b), in which the total migration time drops back to its lower bound (almost all dirty pages are skipped in every iteration except the final one) while having a total downtime (Figure 1(a)) at its upper bound (as the entire RAM has to be transferred in the stop-and-copy stage).

² If a memory page is modified more than once in the same iteration, it is skipped.

The first pre-copy iteration tries to copy across the entire VM allocated memory. The duration of this first iteration is thus directly proportional to the VM memory size and
subsequently impacts total migration time. On average, total migration time increases linearly with VM size. On the other hand, the total downtime for low page dirty rates is almost the same regardless of the VM size, as the migration sub-system succeeds in copying all dirtied pages between successive iterations, resulting in a short stop-and-copy stage. When the link is unable to keep up with the page dirty rate, larger VMs suffer longer downtime (linearly proportional to the VM size) as there are more distinct physical pages that require copying in the stop-and-copy stage.

Pre- and post-migration overheads refer to operations that are not part of the actual transfer process. These are operations related to initialising a container on the destination host, mirroring block devices, maintaining free resources, reattaching device drivers to the new VM, and advertising moved IP addresses. As these overheads are static, they are especially significant with higher link speeds. For instance, pre-migration setup constitutes around 77% of total migration time on a 10 Gbps link for a 512 MB idle VM. More importantly, post-migration overhead is an order of magnitude larger than the time required for the stop-and-copy stage.

To conclude this section, there are several parameters affecting migration performance. These parameters may be classified as having either a static or a dynamic effect on migration performance. Parameters having static effects are considered unavoidable migration overheads. On the other hand, parameters having dynamic effects affect only the transfer process. Dynamic parameters are typically related to the VM specification and the applications hosted inside it.

We show that the page dirty rate and link speed are the major factors influencing migration times. We also show how particular combinations of these factors can extend expected total migration time and downtime. Finally, we observe that

Algorithm 1 The AVG Simulation Model

    # These values are given
    LINK_SPEED
    PAGE_DIRTY_RATE
    PRE_MIGRATION_OVERHEAD
    POST_MIGRATION_OVERHEAD
    VM_SIZE
    # Start simulation
    p2m_size = VM_SIZE / PAGE_SIZE
    MAX_SENT = 50
    MAX_ITERATIONS = 29
    MAX_FACTOR = 3
    MAX_BATCH_SIZE = 1024
    to_send ← ALL_PAGES
    total_sent = 0
    iteration = 0
    loop
        iteration = iteration + 1
        N = 0
        while N ≤ p2m_size do
            sent_this_iteration = 0
            skip_this_iteration = 0
            if NOT last_iteration then
                to_skip ← sim_peek()
            end if
            batch = 0
            for (batch ≤ MAX_BATCH_SIZE) AND (N ≤ p2m_size) do
                if (NOT last_iteration) AND (N ∈ to_send) AND (N ∈ to_skip) then
                    skip_this_iteration++
                end if
                if NOT ((N ∈ to_send AND N ∉ to_skip) OR
                        (N ∈ to_send AND last_iteration)) then
                    continue
                end if
                batch_set ← N
                batch++
                N++
            end for
            migration_time = migration_time + batch_set / LINK_SPEED
            if last_iteration then
                downtime_time = downtime_time + batch_set / LINK_SPEED
            end if
            total_sent = total_sent + batch_set
            sent_this_iteration = batch_set
        end while
        if last_iteration then
            exit()
        end if
        if (iteration ≥ MAX_ITERATIONS) OR (total_sent > p2m_size ∗ MAX_FACTOR) OR
           (sent_this_iteration + skip_this_iteration < MAX_SENT) then
            last_iteration = True
        end if
        to_send ← sim_clean()
    end loop
    total_migration_time = migration_time + PRE_MIGRATION_OVERHEAD + POST_MIGRATION_OVERHEAD
    total_downtime_time = downtime_time + POST_MIGRATION_OVERHEAD
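Algorithm 1 is pseudocode; the following Python sketch is our re-implementation of it (not the authors' code), provided so the model can be run directly. It replaces the trace-driven sim_peek()/sim_clean() calls with the averaging assumption that pages are dirtied at a constant rate, folds the page-skipping logic into that assumption, and treats each round's transfer time as the number of pages sent times the per-page transmission time.

    # Our re-implementation of the AVG model sketched in Algorithm 1.
    # All parameter values in the example call are illustrative assumptions.
    PAGE_SIZE = 4096        # bytes; assumes 4 KB pages
    MAX_SENT = 50           # stop condition 1
    MAX_ITERATIONS = 29     # stop condition 2
    MAX_FACTOR = 3          # stop condition 3

    def avg_model(vm_size_bytes, link_bps, dirty_rate_pps,
                  pre_overhead_s, post_overhead_s):
        """Return (total_migration_time, total_downtime) in seconds."""
        p2m_size = vm_size_bytes // PAGE_SIZE
        page_time = PAGE_SIZE * 8 / link_bps    # seconds to transmit one page

        to_send = p2m_size                      # first round pushes the whole RAM
        total_sent = 0
        migration_time = downtime = 0.0
        last_iteration = False
        iteration = 0

        while True:
            iteration += 1
            round_time = to_send * page_time
            migration_time += round_time
            if last_iteration:                  # final stop-and-copy round
                downtime = round_time
                break
            total_sent += to_send
            # Stop conditions from Section II-C, checked after every round.
            if (iteration >= MAX_ITERATIONS
                    or total_sent > p2m_size * MAX_FACTOR
                    or to_send < MAX_SENT):
                last_iteration = True
            # Pages dirtied while this round was on the wire (AVG assumption),
            # capped at the VM size since a re-dirtied page is sent only once.
            to_send = min(p2m_size, int(dirty_rate_pps * round_time))

        return (migration_time + pre_overhead_s + post_overhead_s,
                downtime + post_overhead_s)

    # Example: a 1,024 MB VM on a nominal 1 Gbps link dirtying 10,000 pages/s,
    # with hypothetical overheads of 4.0 s (pre) and 0.3 s (post).
    print(avg_model(1024 * 2**20, 1e9, 10_000, 4.0, 0.3))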
         | MT_A       | MT_P       | DT_A         | DT_P
    Min  | 525.82 s   | 567.80 s   | 629.00 ms    | 499.61 ms
    Max  | 1,109.88 s | 1,219.37 s | 42,534.00 ms | 45,431.12 ms