OSV - DA-Hari Krishna D R
Department SCOPE
Key words: containers, migration, live migration, checkpointing, Docker, OpenVZ, LXC
I. Introduction
The two main virtualization technologies today are virtual machines (VMs) and containers.
Containers are known to boot faster than VMs, and thus, lower service downtime in the
application.
Containers are a new technology relative to VMs. Containerization technologies like OpenVZ
[4][32], LXC and Docker [14] provide a different way of virtualization compared to classic VMs
due to their lightweight structure. There are key differences between the two technologies: while a
VM virtualizes the hardware and runs its own guest operating system kernel, containers share the
hardware and kernel of a single host operating system. Containers encapsulate applications with
their required binaries in order to provide the application as a service. Therefore, containers have
lower virtualization cost and use fewer resources than VMs because of their lightweight nature.
Furthermore, since a container does not need its own operating system, it uses only the resources
required for the application upon container start.
Both virtualization techniques have the capability to provide increased efficiency in the
utilization of the resources in big data centers, which is achieved by the migration of the
encapsulated service. Live migration has become an increasingly popular topic because of its
contribution to the consolidation of services. There are many other reasons to migrate a service
from a source host to a destination.
These include system maintenance (for a software or hardware update), load balancing, efficient
resource utilization, service protection from attacks through moving target defense, etc. In
addition, migration management frameworks such as OpenStack have been developed to support
automated load balancing.
Live service migration is a novel technique that provides fast handovers during the runtime of
the application, ensuring seamless and reliable low-latency communication. Hence, it is
considered an ideal technique to improve network flexibility. The two main live migration
technologies today are VMs and containers. Several works compared VMs and containers, and
have shown that containers perform better because they are lighter and boot faster. This is crucial
for live migration since the main objective is to provide a seamless and reliable communication.
Live migration is about moving application instances around without disconnecting the clients.
Live container migration [1] refers to the process of moving an application between different
physical machines or clouds without disconnecting the client. The memory, file system, and
network connectivity of containers running on top of bare-metal hardware are transferred from the
original host machine to the destination, preserving state without downtime. The live migration
process for containers leans heavily on checkpoint/restore strategies, benefiting from the small
size of containers compared to VMs [19].
It involves moving a service from one host to another with the minimum possible downtime.
Live migration is also required for system maintenance, load balancing [8], and protecting
services from attacks through moving target defense [25]. While migrating a service, the system
should not be vulnerable to attacks.
A live migration can help with server maintenance scenarios or unbalanced load. Popular
implementation strategies include "pre-copy," where state is copied to the alternate host before
traffic is switched, and "post-copy" [3], where a minimal initial state is copied and the
remainder is lazily loaded. Live
migration is a particular case of service migration where the service is transparently relocated to
another physical host seamlessly. This means that the resulting downtime is not detectable by the
end user, and the end user does not realize that the server was relocated (e.g. by detecting a new
IP address). Container technology has been widely adopted on various computing platforms, like
cloud platforms, CI/CD, and DevOps. It employs layered image management to enable agile
deployment of applications and leverages cgroups and namespaces to provide an isolated
environment for each application and mitigate resource contention among concurrently running
applications.
Docker makes it convenient for developers to package the application runtime into an image, and
run the application on any OS with the assistance of the Docker daemon. In a large-scale data
center, millions of Docker containers are typically managed by orchestration tools (e.g.,
Kubernetes, Mesos, and Swarm). None of them can fulfill the live migration requirements for
Docker containers in scenarios such as load balancing, host maintenance, and system upgrades.
Kubernetes [2] leverages its replication controller to relocate Docker containers as an
alternative solution. It can seamlessly move stateless Docker containers to another node, but
incurs significant downtime for stateful Docker containers, which are increasingly popular in
cloud environments. As a result, live migration of Docker containers is a desirable and valuable
technology for resource utilization and QoS guarantees in data centers. Mature migration
mechanisms (e.g., pre-copy, post-copy, and logging/replay) exist for live migration of virtual
machines [7] (VMs).
Unlike a VM, Docker has a layered image, a shared kernel runtime, and a richer functional
management architecture, which make live migration of a Docker container more complicated.
Correspondingly, live migration of Docker containers consists of three-part tasks, i.e., migration
of image, migration of runtime, and migration of management context. During the procedure of
live migration, it is undoubtedly important to guarantee the integrity of three key components,
also known as component-integrity. Besides, the scalability and downtime of the live migration
are the critical metrics in a data center. As a result, ideal live migration of Docker containers
should provide good scalability and negligible downtime in performance, while guaranteeing the
component-integrity of Docker containers in terms of functionality.
Method 1:
The basic live migration algorithm was first proposed by Clark et al. [9]. The hypervisor first
marks all pages as dirty; the algorithm then iteratively transfers dirty pages across the network
until the number of pages remaining to be transferred falls below a certain threshold or a
maximum number of iterations is reached. The hypervisor marks transferred pages as clean, but
since the VM keeps running during live migration, already transferred memory pages may be
dirtied again during an iteration and must then be re-transferred. At some point the VM is
suspended on the source to stop further memory writes, and the remaining pages are transferred.
After all memory contents have been transferred, the VM resumes at the destination. Performance
was measured with a hundred virtual machines migrating concurrently under standard industry
benchmarks, showing that for a variety of workloads, application downtime due to migration is
less than a second.
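The iterative pre-copy loop described above can be sketched as a small simulation (page counts, dirty rate, and thresholds here are illustrative parameters, not values from Clark et al.):

```python
import random

def pre_copy_migrate(num_pages=1000, threshold=10, max_iters=30,
                     dirty_rate=0.005, seed=42):
    """Simulate Clark-style iterative pre-copy migration.

    All pages start dirty; each round transfers the current dirty set
    while the still-running VM re-dirties a random fraction of pages.
    The loop stops when the dirty set falls below `threshold` or
    `max_iters` is reached; the VM is then suspended and the remaining
    pages are sent (the stop-and-copy round)."""
    rng = random.Random(seed)
    dirty = set(range(num_pages))        # round 1: every page is dirty
    pages_sent, iterations = 0, 0
    while len(dirty) > threshold and iterations < max_iters:
        pages_sent += len(dirty)         # transfer current dirty pages
        dirty.clear()                    # mark them clean
        for page in range(num_pages):    # VM keeps running and dirtying
            if rng.random() < dirty_rate:
                dirty.add(page)
        iterations += 1
    final_round = len(dirty)             # sent while the VM is suspended
    pages_sent += final_round
    return {"iterations": iterations, "pages_sent": pages_sent,
            "final_round_pages": final_round}
```

Downtime in this model is proportional to `final_round_pages`; a workload that dirties pages faster than the network can drain them never converges and exits via `max_iters`.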
Method 2:
Luo et al. [11] describe a whole-system live migration scheme, which transfers the whole
system run-time state, including CPU state, memory data, and local disk storage, of the
virtual machine (VM). They propose a three-phase migration (TPM) algorithm as well as
an incremental migration (IM) algorithm, which migrate the virtual machine back to the source
machine in a very short total migration time. During migration, all write accesses to the local
disk storage are tracked using a block-bitmap. The migration downtime is around 100
milliseconds, close to that of shared-storage migration. Using the IM algorithm, total migration
time is reduced, and the performance overhead of recording all writes on the migrated VM is very
low.
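Luo et al.'s block-bitmap tracking can be illustrated with a toy disk model (block granularity and API are invented for illustration):

```python
class BlockBitmapDisk:
    """Toy disk that records writes in a bitmap, in the spirit of Luo
    et al.'s incremental migration: after a full copy, only blocks
    whose bit is set need to be re-sent to the destination."""

    def __init__(self, num_blocks):
        self.num_blocks = num_blocks
        self.dirty = [False] * num_blocks   # the block-bitmap

    def write(self, block_no):
        self.dirty[block_no] = True         # track the write

    def dirty_blocks(self):
        return [i for i, d in enumerate(self.dirty) if d]

    def sync_dirty(self):
        """Re-send only dirty blocks, then clear the bitmap."""
        sent = self.dirty_blocks()
        self.dirty = [False] * self.num_blocks
        return sent
```

Repeating `sync_dirty()` until the dirty set is empty (or small) mirrors the iterative phase of the TPM algorithm for disk state.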
Method 3:
Bradford et al. [12] presented a system for supporting the transparent, live wide-area migration
of virtual machines which use local storage for their persistent state. This approach is
transparent to the migrated VM, and does not interrupt open network connections to and from the
VM during wide area migration, guarantees consistency of the VM’s local persistent state at the
source and the destination after migration, and is able to handle highly write-intensive
workloads.
The main challenge of live migration [15][16] is to reduce the time during which the service is
down in the migration process.
In this survey, we discuss the procedure of live migration of containers, the main challenges
faced during live migration, and the solutions proposed for each challenge encountered while
performing live migration of containers.
Live Migration of Containers vs Virtual Machines
In [20], the authors proposed a multi-layer framework for live migration of applications
encapsulated in either a container or a virtual machine. Experiments were conducted to compare
the results obtained with VMs and containers. The framework aims to provide good performance
for the frequent migrations needed in a mobile edge cloud architecture. A mobile edge cloud is a
network architecture that provides cloud services at the edge of a cellular network.
The study in [20] also analyzed the difference between VM and container migration.
Experimental results show that a container (LXC in the experiments) has significant advantages
over a VM (KVM in the experiments) in terms of total migration time, application downtime, and
the amount of data sent from the source node to the destination during migration. The main
reason is that containers are lighter than VMs: the contents of a container's memory belong
predominantly to the application running in the container. In the case of a VM the situation is
different, i.e., the contents of VM memory also belong to many other processes, such as
background processes that are usually unrelated to the migrated service.
II. Live Migration and its working
a. Brief Description of how live migration is performed
To perform the migration, the platform freezes the container at the source node, blocking
memory, processes, file system, and network connections, and captures the state of this
container. After that, it
is copied to the destination node. The platform restores the state and unfreezes the container at
this node. Then, there is a quick cleanup process at the source node.
It is pretty straightforward: you get the state, you copy the state, and you restore the state.
However, please note, there is a freeze timeframe, and we have to consider this during the
application architecture design, as it can be an issue for some applications.
There are two kinds of live migration solutions. One of them is pre-copy memory migration. To
migrate a container, the platform turns on memory tracking at the source node and copies memory
to the destination node in parallel with execution until the difference becomes minimal. After
that, it freezes the container, gets the rest of the state, migrates it to the destination node,
restores it, and unfreezes it.
Another solution is post-copy memory migration, in other words, lazy migration. The system
freezes the container at the source node at the beginning, gets the state of the fastest-changing
memory pages, moves the state to the destination node, restores it, and unfreezes the container.
The rest of
the state is copied from the source node to the destination one in a background mode.
The following metrics are usually used to measure the performance of live migration:
1. Preparation Time: The time between the start of migration and the transfer of the
VM's processor state to the target node. The VM continues to execute and dirty its memory.
2. Downtime: The time during which the migrating VM is not executing. It
includes the transfer of processor state.
3. Resume Time: This is the time between resuming the VM's execution at the target
and the end of migration, at which point all dependencies on the source are eliminated.
4. Pages Transferred: This is the total amount of memory pages transferred,
including duplicates, across all of the above time periods.
5. Total Migration Time: This is the total time of all the above times from start to
finish. Total time is important because it affects the release of resources on both
participating nodes as well as within the VMs.
6. Application Degradation: This is the extent to which migration slows down the
applications executing within the VM.
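Given timestamps for each phase boundary, the timing metrics above follow directly (the function and field names below are my own, not a standard API):

```python
def migration_metrics(t_start, t_suspend, t_resume, t_end, pages_per_round):
    """Derive the standard live-migration timing metrics from four
    timestamps (seconds) and the page counts sent in each pre-copy
    round (duplicates included, as the metric requires)."""
    return {
        "preparation_time": t_suspend - t_start,   # VM runs, dirties memory
        "downtime": t_resume - t_suspend,          # VM not executing
        "resume_time": t_end - t_resume,           # source deps eliminated
        "total_migration_time": t_end - t_start,
        "pages_transferred": sum(pages_per_round),
    }
```

Application degradation is the one metric that cannot be derived from timestamps alone; it requires running a benchmark inside the VM during migration.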
Before performing a migration, we should make sure all these metrics are within acceptable
bounds. Since a container is an isolated entity (meaning that all inter-process relations, such as
parent-child relationships and inter-process communications, are within the container
boundaries), its complete state can be saved into a disk file; this procedure is known as
checkpointing. A container can then be restarted from that file. The ability to checkpoint and
restart a container has many applications, live migration among them.
Checkpointing and restarting [4][5][29] a system has some prerequisites that must be supplied
by the OS on which it is implemented. First of all, a container infrastructure is required that
provides:
1. PID virtualization – to make sure that during restart the same PID can be assigned to a
process as it had before checkpointing.
2. Process group isolation – to make sure that parent-child process relationships will not
lead outside a container.
3. Network isolation and virtualization – to make sure that all the networking connections
will be isolated from all the other containers and the host OS.
4. Resource virtualization – to be independent of hardware and able to restart the
container on a different server.
For the purpose of checkpointing/restoring, the CRIU tool is used. CRIU (Checkpoint/Restore in
User Space) is a software tool for the Linux OS. Using this tool, a running application can be
frozen and it can be checkpointed as a set of disk files. The files can be utilized to resume the
application and run it starting from the state at the time of the freeze. This feature makes
application live migration possible. CRIU is integrated with Docker, OpenVZ, and LXC/LXD [23].
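As a concrete example, Docker's experimental checkpoint support drives CRIU through two CLI commands; the helper below only builds the argument lists (running them requires Docker's experimental features to be enabled and CRIU installed on the host):

```python
def checkpoint_cmd(container, checkpoint):
    """Build the (experimental) Docker CLI command that asks CRIU to
    checkpoint a running container into a set of image files."""
    return ["docker", "checkpoint", "create", container, checkpoint]

def restore_cmd(container, checkpoint):
    """Build the command that starts a container from a checkpoint,
    resuming it at the frozen state."""
    return ["docker", "start", "--checkpoint", checkpoint, container]
```

In practice these lists would be passed to `subprocess.run` on the source and destination hosts respectively, with the checkpoint directory transferred between the two in the middle.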
The main feature of the CRIU tool is that it is implemented mostly in user space instead of
kernel space. This allows the tool to support live container migration by letting users
checkpoint and restore currently running application instances. The process migration performed
by CRIU consists of three main phases: checkpointing, page server activity, and restoring.
CRIU offers the possibility to save a running process as a set of files, e.g., page maps, file
descriptors, and open sockets. In other words, CRIU walks the process tree and gathers
sufficient information about the associated processes for their later resurrection [24]. More
specifically, at the start of the checkpoint, the dumper process goes through the process
directory under /proc and creates a process tree structure by collecting the necessary
information about the relevant processes. Next, parasite code is injected into the task at the
appropriate point to execute CRIU subroutines in the address space of the related processes. The
parasite code stays connected to CRIU and accepts commands from it.
After the dump process completes, the parasite code is extracted from the task, which reverts to
its original code. CRIU releases the process and gives control fully back to the operating
system. In the end, CRIU evaluates all the gathered data and records this information to dump
files. At the restore stage, CRIU reads the image files and resolves which resources are shared
between processes. Then, by calling the operating system function fork(), CRIU creates the
processes on the destination node. After that, CRIU arranges the necessary settings for files,
namespaces, maps, private memory areas, sockets, and ownership. Finally, memory is allocated at
its exact original location, and timers, credentials, and threads are restored to complete the
reactivation of the process [3]. CRIU is only required for containers with stateful
applications; it is not recommended for stateless applications, since their memory contents and
execution state are not important for container recovery. CRIU briefly stops the running
container's processes and checkpoints the image files needed to restore the container to the
stopped state. In other words, CRIU is basically used to turn container state into a persistent
collection of files, which simplifies transfer and recovery [25]. The next figure displays the
sequence used by CRIU for live migration of containers.
The checkpointing and restart procedure is initiated from the user-level, but it is mostly
implemented at the kernel-level, thus providing full transparency of the checkpointing process.
Also, a kernel-level implementation does not require any special interfaces for resources re-
creation. The checkpointing procedure consists of the following three stages:
1. Freeze processes – move processes to previously known state and disable network.
2. Dump the container – collect and save the complete state of all the container’s processes
and the container itself to a dump file.
3. Stop the container – kill all the processes and unmount container’s file system.
Figure 4: CRIU Principle Diagram
The procedure to perform restarting:
1. Restart the container – create a container with the same state as previously saved in a
dump file.
2. Restart processes – create all the processes inside the container in the frozen state, and
restore all of their resources from the dump file.
3. Resume the container – resume processes’ execution and enable the network. After that,
the container continues its normal execution.
The first step of the checkpointing procedure [4][5][6] and also the last step of restart procedure
before processes can resume their execution is process-freeze. The freeze is required to make
sure that processes will not change their state and saved processes’ data will be consistent. It is
also easier to reconstruct frozen processes.
It is very important to save a consistent state of all the container’s processes. All process
dependencies should be saved and reconstructed during restart. Dependencies include the process
hierarchy (see Figure 1), identifiers (PGID, SID, TGID, and other identifiers), and shared
resources (open files, System V IPC objects, etc.). During the restart, all such resources and
identifiers should be set correctly. Any incorrectly restored parameter can lead to a process
termination, or even to a kernel oops.
As most of the resources must be restored from the process context, a special function (called
“hook”) is added on top of the stack for each process during the restart procedure. Thus, the first
function which will be executed by a process will be that “hook,” and the process itself will
restore its resources. For the container’s unit process, this “hook” also restores the container state
including mount points, networking (interfaces, route tables, iptables rules, and conntracks), and
System V IPC objects; and it initiates process tree reconstruction.
d. Live Migration
Using the checkpointing and restart feature, it is easy to implement live migration. A simple
algorithm is implemented which does not require any special hardware like SAN or iSCSI
storage:
1. Container’s file system synchronization. Transfer the container’s file system to the
destination server. This can be done using the rsync utility.
2. Freeze the container. Freeze all the processes and disable networking.
3. Dump the container. Collect all the resources and save them to a file on disk.
4. Second container’s file system synchronization. During the first synchronization, a
container is still running, so some files on the destination server can become outdated.
That is why, after a container is frozen and its files are not being changed, the second
synchronization is performed.
5. Copy the dump file. Transfer the dump file to the destination server.
6. Restart the container on the destination server. At this stage, we are creating a container
on the destination server and creating processes inside it in the same state as saved in
dump file. After this stage, the processes will be in the frozen state.
7. Resume the container. Resume the container’s execution on the destination server.
8. Stop the container on the source server. Kill the container’s processes and unmount its
file system.
9. Destroy the container on source server. Remove the container’s file system and config
files on the source server.
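The nine steps above can be sketched as a driver function; the transfer and CRIU operations are stubbed out as log entries, so this shows control flow only, not a working implementation:

```python
def live_migrate(container, src, dst, log=None):
    """Drive the checkpoint/restore migration sequence from the text.
    Each step appends to `log` instead of shelling out; a real driver
    would invoke rsync and CRIU at the corresponding points."""
    steps = log if log is not None else []
    steps.append(f"rsync fs {src}->{dst}")        # 1. first fs sync
    steps.append(f"freeze {container}@{src}")     # 2. freeze procs + net
    steps.append(f"dump {container}@{src}")       # 3. checkpoint to file
    steps.append(f"rsync fs {src}->{dst}")        # 4. second fs sync (frozen)
    steps.append(f"copy dump {src}->{dst}")       # 5. transfer dump file
    steps.append(f"restore {container}@{dst}")    # 6. recreate frozen procs
    steps.append(f"resume {container}@{dst}")     # 7. resume execution
    steps.append(f"stop {container}@{src}")       # 8. kill source processes
    steps.append(f"destroy {container}@{src}")    # 9. remove source fs
    return steps
```

Note that steps 2 through 7 bracket the service outage: everything between freeze and resume contributes to downtime, which is why the optimizations discussed next target exactly that window.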
In the above migration scheme, Stages 3–6 are responsible for the most delay in service. Let us
take a look at them again and dig in a little bit deeper:
1. Dump time – the time needed to traverse over all the processes and their resources and
save this data to a file on disk.
2. Second file system sync time – time needed to perform the second file system
synchronization.
3. Dump file copying time – time needed to copy the dump file over the network from the
source server to the destination server.
4. Undump time – time needed to create a container and all its processes from a dump file.
Second file system sync time and dump file copying time are responsible for about 95% of all
the delay in service. That is why optimization of these stages can make sense [17]. The following
options are possible:
1. Second file system sync optimization – decrease the number of files being compared
during the second sync. This could be done with the help of file system changes tracking
mechanism.
2. Decreasing the size of a dump file:
a. Lazy migration – migration of memory after actual migration of container, i.e.,
memory pages are transferred from the source server to the destination on
demand.
i. Request a page from swap.
ii. Resend the request to the source server.
iii. Find the page on the source server.
iv. Transfer the page to the destination server.
v. Load the page to memory.
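Steps i–v of lazy migration amount to a page-fault handler that pulls missing pages from the source on first access; a toy model of that path (not CRIU's actual lazy-pages daemon):

```python
class LazyMemory:
    """Post-copy destination memory: pages start on the source host and
    are fetched over the 'network' (a dict lookup here) on first access."""

    def __init__(self, source_pages):
        self.source = dict(source_pages)  # pages still on the source
        self.local = {}                   # pages already migrated
        self.faults = 0

    def read(self, page_no):
        if page_no not in self.local:     # page fault: fetch on demand
            self.faults += 1
            self.local[page_no] = self.source.pop(page_no)
        return self.local[page_no]
```

Each fault adds a network round trip to that memory access, which is the price post-copy pays for its near-instant switchover.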
Another way to decrease the size of the dump file is to transfer memory pages in advance. In this
case, all the pages are transferred to the destination server before container freeze. But as
processes continue their normal execution, pages can be changed and transferred pages can
become outdated. That is why pages should be transferred iteratively. In the first step, all
pages are marked with a clean flag and transferred to the destination server. Some pages may be
changed during this process, in which case the clean flag is removed. In the second step, only
the changed pages are transferred to the destination server.
IV. Challenges faced during live migration of containers and
solutions proposed
An early challenge in the development of container technology was to find an effective way to
isolate and secure different containers on the same machine.
o Overhead Analysis:
We conduct the overhead analysis of live container migration with the following
configuration: each VM (source and destination) is configured with sufficient
resources (4 virtual CPUs and 4 GB memory) and runs on the same physical host
(12 physical CPUs and 128 GB memory). The network bandwidth between these two
intra-host VMs is set to 10 Gbps. Two main metrics are used to gauge the
performance of live container migration: total migration time — the time between
the start and the end of the whole migration; and frozen time — the time during
which the migrated container is suspended (i.e., in the last iteration).
Figure 12: Breakdown of frozen time at each stage of live container migration
Memory wrapping (m-warp) is a fast and live intra-host container migration approach. In m-
warp, instead of copying a container’s memory, it relocates the ownership of the container’s
physical memory pages from the source VM to the destination VM on the same host via a
highly-efficient memory relocation mechanism. The preliminary evaluation shows that m-warp
leads to sub-second total container migration time regardless of the container sizes and
significant application-level performance improvement for memory intensive applications.
2. Problem of security: Live migration is also required for system maintenance, load
balancing, and protecting services from attacks through moving target defense. While
migrating a service, the system should not be vulnerable to attacks. Live migration of
containers can be vulnerable to many kinds of attacks [26] such as eavesdropping, man-
in-the-middle, denial of service (DoS) etc. The migration system should take precautions
for these types of attacks.
3. Live Migration Attacks: Live migration of containers/VMs is susceptible to active and
passive attacks. Active attacks cause loss of data integrity, whereas passive attacks cause
the loss of sensitive data confidentiality. Some of the most remarkable attacks can be
listed as man-in-the-middle, DoS, overflow and replay attacks.
a. Man-in-the-Middle Attack: Attackers can eavesdrop on the data while migrating
from source host to destination and modify data content, which could result in the
loss of data integrity.
b. Denial of Service (DoS) Attack: By using false resource advertisements, an
attacker can attract more virtual machines towards a specific machine. The
resulting migrations steal bandwidth and prevent the actually required
migrations. This can lead to serious problems in cloud systems where
migrations are started automatically.
c. Overflow Attack: Stack overflow can be caused by attackers by creating
congestion in the communication channel traffic, which can result in the memory
corruption of the running processes.
d. Replay Attack: Attackers can re-transmit previous copies of memory pages to
the destination host where the updated ones are required; this is possible
because of frequent dirty-page occurrences. Attackers can also modify the
order of memory pages sent from source to destination, causing ordering
problems at the destination host.
4. Live Migration Security Factors:
The factors that need to be achieved for making live migration secure are as follows:
a. Access Control: Access control policies should be defined to ensure only users
with granted privileges have control on the system.
b. Authentication: Authentication is required between the source and the destination
hosts for the migration process.
c. Non-Repudiation: All the actions of both the source and destination hosts should
be observed. While live migration is occurring, all activities should be logged.
d. Data Confidentiality: Data encryption is required while migrating data between
source and destination hosts.
e. Communication Security: The data transmission channel should be protected on
the migration path between source and destination hosts.
f. Availability: The system should be protected against DoS attacks to make
resources available for legitimate users.
g. Privacy: The migration traffic is required to be isolated from the other networks
in order to protect the system from man-in-the-middle and sniffing attacks.
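Data confidentiality and communication security are normally provided by an encrypted transport such as TLS; as a sketch of the integrity side alone, each transferred page can carry an HMAC bound to its page number, so that replayed or reordered pages are rejected at the destination (a toy scheme, not a full migration protocol):

```python
import hashlib
import hmac

def seal_page(key, page_no, data):
    """Tag a page with an HMAC bound to its page number, so a replayed
    or reordered page from an attacker fails verification."""
    tag = hmac.new(key, page_no.to_bytes(8, "big") + data,
                   hashlib.sha256).digest()
    return page_no, data, tag

def verify_page(key, page_no, data, tag):
    """Recompute the tag at the destination and compare in constant time."""
    expected = hmac.new(key, page_no.to_bytes(8, "big") + data,
                        hashlib.sha256).digest()
    return hmac.compare_digest(expected, tag)
```

Binding the page number into the tag is what defeats the reordering variant of the replay attack: a valid page presented under the wrong page number no longer verifies.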
Machen et al. [19] proposed a layered framework for live migration of applications
encapsulated either in containers or virtual machines. Experiments were conducted to make
comparisons based on the results obtained when working with VMs and containers. The
framework aims at achieving good performance.
Proposed Model
Model Architecture
This section describes the proposed model architecture. In our proposed model, there are five
main components. Two of these are the source and destination instances. The remaining main
components are an application server, a database (DB) server and the client interface. As
shown in Figure 1, all components have secure connections established between them. The
application server behaves as the controller of our model that initiates the migration. It
connects to the instances by SSH and issues commands over that SSH [21][22] channel. The
application server creates the related SSH channels between the instances and itself in this
model. In order to achieve that, it creates an SSH channel between itself and the host instance
(instance-1) first. Then, it commands instance-1 to run the application on Docker, and start
migration if requested. That is, it commands instance-1 to send related checkpoint files to
instance-2 (destination instance) by using the SSH channel created by the parameters
provided by the application server. Parameters are also provided by using the SSH channel,
which means that an SSH command is sent to instance-1 to connect it to the other instance by
using the SSH command parameters.
In our stateless application example, Clock Application, we retrieve the system time from
cloud instances. We created a table in the database to store the current instance time. The
users logged into the migration system, after navigating to the Clock tab, should observe the
clock timestamp along with the providing cloud instance's IP on the screen. Because our
instances are located in the same geographic location, they have the same system time. If the
user navigates to the Clock tab in a browser after connecting to the application server over the
Internet, the user can see the clock data; this data is not affected by any user input, and no
state is saved during execution on the instance machine, which is what makes this application
stateless.
In our stateful application example, Face Recognition application, we give an image as input
to the application. The application detects the faces in the given image, saves them to the
database, extracts the features of each face, and compares them with the images in the database.
We integrated model-training functionality into the application in order to analyze the
migration with a longer application duration and a dynamically changing checkpoint file size. In
other words, we change the model training set size by passing parameters to the model-training
function in the source code. This modification was integrated to allow taking metrics on system
performance and does not affect the functionality of the application; it only affects the
application runtime duration and the complexity of the checkpoint and resume operations.
The application server also acts as a bridge between the client and the cloud instances, which
means that although the service provider address is changed after the migration process, the
user does not have to navigate to the new address of the service provider. Because the
application depends on the user input and is not required to execute infinitely as in the Clock
application, we need to know the end of the program execution and make the end user wait
until the execution is finished. In order to achieve that, we needed to save a status flag in the
database located in the database server. This flag holds the information on whether the result
set has been updated or not. The application server waits until the flag indicates that
execution has ended. If the flag turns to 1, it means that the result is ready to display. The
application server again retrieves the output from the database. It also parses the result and
renders the page accordingly.
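The status-flag wait described above reduces to a polling loop on the database; a minimal sketch in which `fetch_flag` stands in for the real DB query:

```python
import time

def wait_for_result(fetch_flag, poll_interval=0.01, timeout=5.0):
    """Block until the status flag turns to 1 (result ready), polling
    `fetch_flag()` at a fixed interval; return False on timeout."""
    deadline = time.monotonic() + timeout
    while time.monotonic() < deadline:
        if fetch_flag() == 1:
            return True
        time.sleep(poll_interval)
    return False
```

The timeout guards the end user against waiting forever if the migrated application dies before it can set the flag.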
Container load balancing delivers traffic management services for containerized
applications efficiently. Developers today use containers to quickly test, deploy, and scale
applications through continuous integration and continuous delivery, but the transient and
stateless nature of container-based applications requires different traffic control for optimal
performance. Placing a load balancer in front of the Docker engine results in higher availability
and scalability for client requests and ensures uninterrupted performance of the
microservices-based applications running inside the containers. Load balancing Docker
containers makes it possible to update a single microservice without disruption, and when
containers are deployed across a cluster of servers, load balancers running in Docker
containers make it possible for multiple containers to be accessed on the same host port.
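The simplest policy such a load balancer can apply is round-robin distribution of requests over the container backends. The sketch below is illustrative only; the `RoundRobinBalancer` class and the backend addresses are hypothetical, and a production setup would use a dedicated proxy such as NGINX or HAProxy in front of the Docker engine.

```python
import itertools

class RoundRobinBalancer:
    """Distribute incoming requests across container backends in turn,
    so no single container absorbs all client traffic."""

    def __init__(self, backends):
        self._cycle = itertools.cycle(backends)

    def next_backend(self):
        """Return the address of the container that should serve the
        next request."""
        return next(self._cycle)

# Hypothetical container endpoints behind a single published host port.
backends = ["172.17.0.2:8080", "172.17.0.3:8080", "172.17.0.4:8080"]
lb = RoundRobinBalancer(backends)

picks = [lb.next_backend() for _ in range(4)]
# The fourth request wraps around to the first container.
```

Because a container can be drained from the rotation, checkpointed, migrated, and re-added, this is also what allows a single microservice to be updated or migrated without disrupting clients.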
Figure 18: Network diagram for Policy programmable live migration technique
VIII. Conclusion
Live migration of containers is the most widely used and preferable technique compared to VM
migration and cold migration of containers. In this survey we have briefly discussed the
procedure of live migration, the challenges faced during it, and the solutions that have been
proposed. To ensure that resources migrate successfully between two physical systems using
containers without shutting down the system, it is better to combine the proposed solutions
discussed above rather than implement an individual solution for each problem. Our future work
will be based on proposing a single solution for the three main challenges, i.e., keeping downtime
as low as possible, security, and load balancing, as this decreases the overhead of performing
live migration using containers.
IX. References
[1] N. Pokhrel, "Live Container Migration: Opportunities and Challenges," Aalto University.
[2] B. Burns, B. Grant, D. Oppenheimer, E. Brewer, and J. Wilkes, "Borg, Omega, and
Kubernetes," Communications of the ACM, vol. 59, pp. 50–57, 2016.
[3] M. Rapoport, "Userfaultfd and post-copy migration." Available:
http://www.slideshare.net/kerneltlv/userfaultfd-and-postcopy-migration [Online; accessed on
December 18, 2021].
[5] O.O. Sudakov, Yu.V. Boyko, O.V. Tretyak, T.P. Korotkova, E.S. Meshcheryakov, Process
checkpointing and restart system for Linux, Mathematical Machines and Systems, 2003.
[8] W. Voorsluys, J. Broberg, S. Venugopal, and R. Buyya, "Cost of Virtual Machine Live
Migration in Clouds: A Performance Evaluation," in 1st International Conference on Cloud
Computing, Berlin, Germany, 2009, pp. 254–265.
[9] C. Clark, K. Fraser, S. Hand, J. G. Hansen, E. Jul, C. Limpach, I. Pratt, and A. Warfield,
"Live migration of virtual machines," in 2nd Symposium on Networked Systems Design &
Implementation (NSDI), Volume 2, USENIX Association, 2005.
[10] W. Huang, Q. Gao, J. Liu, and D. K. Panda, "High performance virtual machine
migration with RDMA over modern interconnects," in IEEE International Conference on Cluster
Computing, 2007, pp. 11–20.
[11] Y. Luo, B. Zhang, X. Wang, Z. Wang, Y. Sun, and H. Chen, "Live and
incremental whole-system migration of virtual machines using block-bitmap," in IEEE
International Conference on Cluster Computing, 2008, pp. 99–106.
[12] R. Bradford, E. Kotsovinos, A. Feldmann, and H. Schiöberg, "Live wide-area migration of
virtual machines including local persistent state," in 3rd International Conference on Virtual
Execution Environments (VEE), San Diego, California, USA: ACM, 2007.
[13] M. R. Hines, U. Deshpande, and K. Gopalan, "Post-copy live migration of virtual machines,"
SIGOPS Oper. Syst. Rev., vol. 43, pp. 14–26, 2009.
[14] “Docker”. Available: https://docs.docker.com/get-started/overview [Online; accessed
on December 18, 2021]
[15] D. Kapil, E. S. Pilli, and R. C. Joshi, “Live virtual machine migration techniques: Survey
and research challenges,” in Advance Computing Conference (IACC), 2013 IEEE 3rd International.
IEEE, 2013, pp. 963–969.
[16] P. Kokkinos, D. Kalogeras, A. Levin, and E. Varvarigos, “Survey: Live migration and
disaster recovery over long-distance networks,” ACM Computing Surveys (CSUR), vol. 49, no. 2,
p. 26, 2016.
[17] Z. Mavuş and P. Angın, "A Secure Model for Efficient Live Migration of
Containers," Middle East Technical University, Ankara, Turkey.
[18] P. K. Sinha, S. S. Doddamani, H. Lu, and K. Gopalan, "mWarp: Accelerating Intra-Host Live
Container Migration via Memory Warping," State University of New York (SUNY) at
Binghamton.
[19] W. Li, A. Kanso, and A. Gherbi, "Leveraging Linux containers to achieve High Availability
for cloud services," in Proceedings of the 2015 IEEE International Conference on Cloud
Engineering (IC2E), pp. 76–83, 2015.
[20] Y. Chen, "Checkpoint and Restore of Micro-service in Docker Containers," in Proc. ICMII,
pp. 915–918, 2015.
[25] M. Azab, B. M. Mokhtar, A. S. Abed, and M. Eltoweissy, “Smart Moving Target Defense
for Linux Container Resiliency,” in IEEE 2nd International Conference on Collaboration and
Internet Computing (CIC), pp. 122–130, 2016.
[29] Hua Zhong, Jason Nieh, CRAK: Linux Checkpoint/Restart as a Kernel Module,
Department of Computer Science, Columbia University, Technical Report CUCS-014-01,
November 2001.
[31] G. Soni and M. Kalra, “Comparative study of live virtual machine migration techniques in
cloud,” International Journal of Computer Applications, vol. 84, no. 14, 2013.