
CLOUD

SECURITY

Practical Use of
Microservices 6

Economics of
Microservices 16

SEPTEMBER/OCTOBER 2016
www.computer.org/cloudcomputing

Editor in Chief
Mazin Yousif, T-Systems International, mazin@computer.org

Editorial Board
Pascal Bouvry, University of Luxembourg
Ivona Brandic, Vienna University of Technology
Christopher Cérin, University of Paris 13
Kim-Kwang Raymond Choo, University of Texas at San Antonio
Beniamino Di Martino, Second University of Naples
Mianxiong Dong, Muroran Institute of Technology
Keith G. Jeffery, Keith G. Jeffery Consultants and Cardiff University
David Linthicum, Cloud Technology Partners
Christine Miyachi, Xerox Corporation
Omer Rana, Cardiff University
Rajiv Ranjan, Newcastle University
Lutz Schubert, Ulm University
Alan Sill, Texas Tech University
Zahir Tari, RMIT University
Joe Weinman
Yongwei Wu, Tsinghua University

Steering Committee
Sherman Shen, University of Waterloo (chair, Communications Society liaison)
Kirsten Ferguson-Boucher, Aberystwyth University
Raouf Boutaba, University of Waterloo (Communications Society liaison)
Carl Landwehr, NSF, IARPA (EIC Emeritus IEEE S&P)
Hui Lei, IBM
V.O.K. Li, University of Hong Kong (Communications Society liaison)
Rolf Oppliger, eSecurity Technologies
Manish Parashar, Rutgers, the State University of New Jersey

Editorial Staff
Brian Brannon, Lead Editor, bbrannon@computer.org
Joan Taylor, Content Editor
Annie Lubinsky, Keri Schreiner, Jenny Stout, Contributing Editors
Carmen Garvey, Jennie Zhu-Mai, Production & Design
Robin Baldwin, Senior Manager, Editorial Services
Evan Butterfield, Products and Services Director
Sandy Brown, Senior Business Development Manager
Marian Anderson, Senior Advertising Coordinator

CS Magazine Operations Committee
Forrest Shull (chair), Brian Blake, Maria Ebling, Lieven Eeckhout, Miguel Encarnacao, Nathan Ensmenger, Sumi Helal, San Murugesan, Shari Lawrence Pfleeger, Yong Rui, Diomidis Spinellis, George K. Thiruvathukal, Mazin Yousif, Daniel Zeng

CS Publications Board
Alfredo Benso, Irena Bojanova, Greg Byrd, Min Chen, Robert Dupuis, David S. Ebert, Niklas Elmqvist, Davide Falessi, William Ribarsky, Forrest Shull, Melanie Tory

IEEE Cloud Computing (ISSN 2325-6095) is published bimonthly by the IEEE Computer Society. IEEE headquarters: Three Park Ave., 17th Floor, New York, NY 10016-5997. IEEE Computer Society Publications Office: 10662 Los Vaqueros Cir., Los Alamitos, CA 90720; +1 714 821 8380; fax +1 714 821 4010. IEEE Computer Society headquarters: 2001 L St., Ste. 700, Washington, DC 20036.

Subscription rates: IEEE Computer Society members get the lowest rate of US$39 per year. Go to www.computer.org/subscribe to order and for more information on other subscription prices.
CONTENT
What will the future of cloud computing look like? What are some of the issues
professionals, practitioners, and researchers need to address when utilizing cloud
services? IEEE Cloud Computing magazine serves as a forum for the constantly
shifting cloud landscape, bringing you original research, best practices, in-depth
analysis, and timely columns from luminaries in the field.

Theme Articles

22 Guest Editors' Introduction: Cloud Security
Peter Mueller, Chin-Tser Huang, Shui Yu, Zahir Tari, and Ying-Dar Lin

26 Online Analysis of Security Risks in Elastic Cloud Applications
Athanasios Naskos, Anastasios Gounaris, Haralambos Mouratidis, and Panagiotis Katsaros

34 Privacy-Preserving Access to Big Data in the Cloud
Peng Li, Song Guo, Toshiaki Miyazaki, Miao Xie, Jiankun Hu, and Weihua Zhuang

44 Cryptographic Public Verification of Data Integrity for Cloud Storage Systems
Yuan Zhang, Chunxiang Xu, Hongwei Li, and Xiaohui Liang

54 To Docker or Not to Docker: A Security Perspective
Theo Combe, Antony Martin, and Roberto Di Pietro

64 User-Centric Security and Dependability in the Clouds-of-Clouds
Marc Lacoste, Markus Miettinen, Nuno Neves, Fernando M.V. Ramos, Marko Vukolic, Fabien Charmet, Reda Yaich, Krzysztof Oborzynski, Gitesh Vernekar, and Paulo Sousa
September/October 2016, Volume 3, Issue 5
www.computer.org/cloudcomputing

Columns

4 From the Editor in Chief
Microservices
Mazin Yousif

6 Cloud Tidbits
Practical Use of Microservices in Moving Workloads to the Cloud
David S. Linthicum

10 Cloud and the Law
Challenges in Delivering Software in the Cloud as Microservices
Christian Esposito, Aniello Castiglione, and Kim-Kwang Raymond Choo

16 Cloud Economics
The Economics of Microservices
Andy Singleton

76 Standards Now
The Design and Architecture of Microservices
Alan Sill

81 Blue Skies
Open Issues in Scheduling Microservices in the Cloud
Maria Fazio, Antonio Celesti, Rajiv Ranjan, Chang Liu, Lydia Chen, and Massimo Villari

52 Advertising Index
62 IEEE CS Information
Reuse Rights and Reprint Permissions: Educational or personal use of this material is permitted without fee, provided such use: 1) is not made for profit; 2)
includes this notice and a full citation to the original work on the first page of the copy; and 3) does not imply IEEE endorsement of any third-party products
or services. Authors and their companies are permitted to post the accepted version of their IEEE-copyrighted material on their own Web servers without
permission, provided that the IEEE copyright notice and a full citation to the original work appear on the first screen of the posted copy. An accepted manuscript
is a version which has been revised by the author to incorporate review suggestions, but not the published version with copyediting, proofreading and
formatting added by IEEE. For more information, please go to: http://www.ieee.org/publications_standards/publications/rights/paperversionpolicy.html.
Permission to reprint/republish this material for commercial, advertising, or promotional purposes or for creating new collective works for resale or redistribution
must be obtained from the IEEE by writing to the IEEE Intellectual Property Rights Office, 445 Hoes Lane, Piscataway, NJ 08854-4141 or pubs-permissions@ieee.org.
Copyright © 2016 IEEE. All rights reserved.
Abstracting and Library Use: Abstracting is permitted with credit to the source. Libraries are permitted to photocopy for private use of patrons, provided the
per-copy fee indicated in the code at the bottom of the first page is paid through the Copyright Clearance Center, 222 Rosewood Drive, Danvers, MA 01923.
IEEE prohibits discrimination, harassment, and bullying. For more information, visit www.ieee.org/web/aboutus/whatis/policies/p9-26.html.
From the Editor in Chief

Microservices

Mazin Yousif
T-Systems International
mazin@computer.org

Software architectures are constantly evolving. We initially had, and still have, monolithic software architectures, where the deployed software stack is one long string of code that does all, including functional and nonfunctional application requirements. This is sometimes referred to as tightly coupled code. Such an approach might suffice for small applications developed by small teams as small projects. However, as complexity (in terms of more involved functional and nonfunctional requirements) increases, such an approach becomes inefficient, costly, and time consuming. Making a change is terrifying because you'll need to retest the whole application. Starting time can be slow because you need to upload large amounts of code. The whole thing can become a nightmare. Another interesting twist here is that because the code is so tightly coupled, it's very difficult to incorporate new technology trends.

Modular software architectures evolved mainly to ease the complexities around monolithic architectures. They include loosely coupled modules, à la distributed applications. Functions, services, and microservices are other examples of modular software architectures. They differ in their size and complexity, the mechanisms for interconnecting them, the scope of integration, and whether they deliver single or multiple tasks. For example, service-oriented architectures (SOAs) bring services together through an enterprise service bus (ESB). An ESB isn't as simple as it might seem; it must include the glue for all the connectivity and integration among all the services.

The key architectural advantage of modular architectures is that they tackle the complexity of monolithic architectures. There are other advantages, however, mainly as side benefits to breaking the code into smaller pieces:

• It's easier to make changes, update, and test.
• There are fewer barriers to introducing new technology trends.
• They're likely faster to start.
• It's easier to mix and match modules with different profiles in terms of processor and memory needs, resulting in much better resource utilization.
• It's easier to construct applications by bringing together modules with different functions.

The market has been positioning microservices as the hottest new trend in software development, and it might be. But the confusion around microservices is everywhere. When you ask N people to define microservices or what the typical size of a microservice is, you'll likely get N + M different definitions. Moreover, many mistakenly equate microservices with containers.

So what are microservices? They're programs with a single task (or unit of work) that also include all the connectivity to the outside world as well as the runtime requirements to run the task. (Note that the word "task" is generic and refers to the smallest function possible, but no smaller.) Regardless, microservices inherit all the benefits of a modular architecture. Also increasing the developer community's interest in microservices are containers and DevOps, which evolved around the same time that microservices did. Containers tailor themselves nicely to microservices because they can be deployed with much less overhead than virtual machines. DevOps represents an approach to developing, testing, and running code with tighter collaboration between developers, testers, and operators. Read more about this in David Linthicum's Cloud Tidbits column.
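
As a rough illustration of that definition, the sketch below packages one task together with its own connectivity: a word-counting function (a hypothetical task, not one named in this column) wrapped in an HTTP listener from the Python standard library. The names word_count, WordCountHandler, start_service, and call_service are assumptions made for the example, and a production microservice would add error handling, health checks, and packaging.

```python
import http.client
import json
import threading
from http.server import BaseHTTPRequestHandler, HTTPServer


def word_count(text: str) -> dict:
    """The single task of this service: count the words in a text."""
    return {"words": len(text.split())}


class WordCountHandler(BaseHTTPRequestHandler):
    """Connectivity to the outside world: one POST endpoint, JSON reply."""

    def do_POST(self):
        length = int(self.headers.get("Content-Length", 0))
        body = self.rfile.read(length).decode("utf-8")
        payload = json.dumps(word_count(body)).encode("utf-8")
        self.send_response(200)
        self.send_header("Content-Type", "application/json")
        self.send_header("Content-Length", str(len(payload)))
        self.end_headers()
        self.wfile.write(payload)

    def log_message(self, format, *args):
        pass  # keep the example quiet; a real service would log requests


def start_service(port: int = 0) -> HTTPServer:
    """Start the service on localhost in a background thread.

    Port 0 asks the operating system for any free port.
    """
    server = HTTPServer(("127.0.0.1", port), WordCountHandler)
    threading.Thread(target=server.serve_forever, daemon=True).start()
    return server


def call_service(server: HTTPServer, text: str) -> dict:
    """Hypothetical client helper: POST text to the service, return its reply."""
    host, port = server.server_address
    conn = http.client.HTTPConnection(host, port, timeout=5)
    try:
        conn.request("POST", "/", body=text.encode("utf-8"))
        return json.loads(conn.getresponse().read())
    finally:
        conn.close()
```

Calling start_service() and then call_service(server, "tightly coupled code") returns {"words": 3}; server.shutdown() followed by server.server_close() stops the listener. The task itself stays a plain function, so it can be tested without the network wrapper.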

4 IEEE Cloud Computing, published by the IEEE Computer Society. 2325-6095/16/$33.00 © 2016 IEEE


New Editorial Board Member

Welcome to Christine Miyachi, the newest member of the IEEE Cloud Computing editorial board.

Christine Miyachi has almost 30 years of experience working for startups and large corporations. She's currently a systems engineer and architect at Xerox Corporation and holds several patents. She works on Xerox's Extensible Interface Platform, which is a software platform upon which developers can use standard Web-based tools to create server-based applications that can be configured for the multifunction peripheral's touchscreen user interface. Miyachi graduated from the University of Rochester with a BS in electrical engineering. She holds two degrees from the Massachusetts Institute of Technology: an MS in technology and policy/electrical engineering and computer science and an MS in engineering and management. Miyachi writes a blog about software architecture (http://abstractsoftware.blogspot.com). For more, see www.christinemiyachi.com.

Microservices (or modular architectures in general) are better suited for the many complex applications we're building these days. This includes enterprise applications (that is, confined within the enterprise) as well as Web-scale applications, where companies need to scale to reach consumers worldwide. Microservices, specifically, work well for new types of applications such as the Internet of Things, where single-function sensors and actuators are deployed in the field.

Given the complexities of our business environments, technology's role in the social fabric, and the flat-world view of things, I'm encouraged that technological evolutions have brought us microservices, containers, and DevOps.

The columns in this issue address various topics related to microservices. As mentioned earlier, David Linthicum provides a generic overview of microservices and relates them to containers and DevOps. Cloud Economics guest author Andy Singleton addresses the costs and benefits associated with microservices. In the Cloud and the Law column, Christian Esposito, Aniello Castiglione, and Kim-Kwang Raymond Choo look at the possible security challenges around microservices and related mitigation topics requiring more research. In Blue Skies, Maria Fazio, Antonio Celesti, Rajiv Ranjan, Lydia Chen, Chang Liu, and Massimo Villari look at scheduling and efficient resource management for microservices. Finally, in Standards Now, Alan Sill explores how microservices exploit both modern and historical standards and looks at the future of microservices development.

One last item of news is the addition of Christine Miyachi to IEEE Cloud Computing's editorial board (see the sidebar for a brief biography). She currently chairs the IEEE Special Technical Community of Cloud Computing and has worked diligently to expand its reach to more than 10,000 subscribers, an outstanding accomplishment!

Mazin Yousif is the editor in chief of IEEE Cloud Computing. He's the chief technology officer and vice president of architecture for the Royal Dutch Shell Global account at T-Systems International. Yousif has a PhD in computer engineering from Pennsylvania State University. Contact him at mazin@computer.org.

Cloud Tidbits

Practical Use of Microservices in Moving Workloads to the Cloud

David S. Linthicum
Cloud Technology Partners
david.linthicum@cloudtp.com

What is a service, and when is a service a microservice? Good question. When using a service, we leverage a remote method or behavior versus simply extracting or publishing information to a remote system. Moreover, we typically abstract this remote service into another application known as a composite application, which is usually made up of more than one service.

A good example of a service is a risk analysis process, which runs within an enterprise to calculate the risk of a financial transaction. This remote application service is of little use by itself, but when abstracted into a larger application (for example, a trading system), that remote application service has additional value.

Note that we leverage the behavior of this remote service more than the information it produces or consumes. If you're a programmer, you can view application services as subroutines or methods: something you invoke to make something happen.

The basic notion of service-oriented architecture (SOA), and SOA using cloud computing, is to leverage these remote services using some controlled infrastructure that allows applications to invoke remote application services as if they were local to the application. The result (or goal) is a composite application made up of many local and remote application services. Since they're location and platform independent, they can reside on premises or within one of many cloud computing providers. Furthermore, once services are identified and exposed, or developed from scratch, we might have services that can be placed on and span both on-premises and cloud-enabled platforms. So, those are services in general.1

Enter Microservices

Microservices is an architecture as well as a mechanism, and is often confused with traditional SOA-type services. Indeed, there's a great deal of overlap. It's an architectural pattern in which complex applications are composed of small, independent processes that communicate with each other using language-agnostic APIs.

This is service-oriented computing, at its essence, decomposing the application down to the functional primitive, and building it as sets of services that can be leveraged by other applications, or the application itself. This is also the foundation of reuse, and these services are systemic to the use of containers as well as non-container-based applications. (See https://blog.akana.com/the-venn-of-microservices.)

The benefits of this approach include efficiencies through reuse of microservices. As we rebuild applications for the cloud, we modify them to expose services that are accessible by other applications. More importantly, we can consume services from the rebuilt application so we don't have to build functionality from scratch.
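
The idea of invoking a remote application service as if it were local can be sketched in a few lines. In this minimal example, which assumes a made-up scoring rule and an in-process stand-in for the network, a TradingSystem composite depends only on a RiskService interface, so a local implementation and a remote proxy are interchangeable. All class and function names here are hypothetical, and fake_remote_endpoint merely simulates what would be an HTTP or RPC call in a real deployment.

```python
from dataclasses import dataclass
from typing import Callable, Protocol


class RiskService(Protocol):
    """The behavior the composite application relies on."""

    def risk(self, amount: float, counterparty_rating: float) -> float: ...


class LocalRiskService:
    """In-process implementation with an illustrative (invented) rule."""

    def risk(self, amount: float, counterparty_rating: float) -> float:
        return round(amount * (1.0 - counterparty_rating), 2)


class RemoteRiskService:
    """Proxy that forwards the call over an injected transport function."""

    def __init__(self, send: Callable[[dict], dict]):
        self._send = send

    def risk(self, amount: float, counterparty_rating: float) -> float:
        reply = self._send(
            {"op": "risk", "amount": amount, "rating": counterparty_rating}
        )
        return reply["risk"]


def fake_remote_endpoint(request: dict) -> dict:
    """Stand-in for the remote side; assumed here instead of a network call."""
    service = LocalRiskService()
    return {"risk": service.risk(request["amount"], request["rating"])}


@dataclass
class TradingSystem:
    """Composite application: approves a trade if its risk is within a limit."""

    risk_service: RiskService
    limit: float

    def approve(self, amount: float, counterparty_rating: float) -> bool:
        return self.risk_service.risk(amount, counterparty_rating) <= self.limit


# The composite code is identical whether the service is local or remote.
on_premises = TradingSystem(LocalRiskService(), limit=500.0)
in_cloud = TradingSystem(RemoteRiskService(fake_remote_endpoint), limit=500.0)
```

Both on_premises.approve(1000.0, 0.8) and in_cloud.approve(1000.0, 0.8) return True under this rule, which is the point of the design: the trading system leverages the behavior of the risk service without knowing where it runs.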



For instance, some programs have built-in systems such as credit validations, mapping, and address validation services that must be maintained. This can cost upward of hundreds of thousands of dollars per year. The service-based approach lets us reach out and consume remote services that provide this functionality and more, so you can get out of the business of maintaining services that can be found in other places. It also lets us expose services for use within the enterprise by other applications, or even sell services to other enterprises over the open Internet.

Enter Containers

The use of containers to "wrap" or containerize existing applications comes with a few advantages, including the ability to reduce complexity by leveraging container abstractions. The containers remove dependencies on the underlying infrastructure services, which reduces the complexity of dealing with those platforms. This means we can abstract the access to resources, such as storage, from the application itself. This makes the application portable, but also speeds the refactoring of the applications, since the containers handle much of the access to native cloud resources.

Containers also offer the ability to leverage automation to maximize their portability, and, with portability, their value. Through the use of automation, we're scripting a feature we could also do manually, such as migrating containers from one cloud to another. However, this option's use cases have proved limited. Indeed, most new applications are built to take advantage of containers, but existing applications are often difficult to containerize. The objective of leveraging containers seems to be that of a distributed architectural value versus portability, as originally thought. However, portability is always a byproduct of leveraging containers.

Also consider the ability to provide better security and governance services by placing those services around rather than within containers. In many instances, security and governance services are platform specific, not application specific. For example, traditional on-premises applications tend not to have security and governance functions innate to the application. The ability to place security and governance services outside of the application domain provides better portability and less complexity when refactoring. The ability to leverage microservices in this context provides the same advantages as well, no matter if you're using containers or not.

Containers can provide better distributed computing capabilities as well. A traditional application can be divided into many different domains, all residing within containers. These containers can be run on any number of different cloud platforms, including those that provide the highest cost and performance efficiencies. So, applications can be distributed and optimized according to their utilization of the platform from within the container.

For example, one could place an I/O-intensive portion of the application on a bare-metal cloud that provides the best performance, place a compute-intensive portion of the application on a public cloud that can provide the proper scaling and load balancing, and perhaps even place a portion of the application on traditional hardware and software. All of these elements work together to form the application, and the application is separated into components that can be optimized.

Refactoring Traditional Applications for Containers and Microservices

The process of containerizing an application and service enabling it at the same time is more art than science at this point. However, certain success patterns are beginning to emerge as enterprises begin to migrate traditional applications to the cloud using containers and service orientation as the architecture.

Pattern one decides quickly how the application is to be broken into components that will be run inside of containers in a distributed environment. This means breaking the application down to its functional primitives, and building it back up as component pieces to minimize the amount of code that needs to be changed.

Pattern two builds data access as a service for the application's use, and has all data calls go through those data services. This will decouple the data from the application components (containers) and let you change the data without breaking the application. Moreover, you're putting data complexity into its

own domain, which will be another container that's accessed using data-oriented microservices.

Pattern three splurges on testing. Although many will point to the stability of containers as a way around black-box and white-box testing, the application now exists in a new architecture with new service dependencies. There could be a lot of debugging that has to occur up front, before deployment.

There are other sides to this as well. Lori MacVittie, one of my advisory board members, noted in an email that containers and microservices seem to mix many tangentially related topics together, but microservices have nothing to do with containers other than they happened to appear at the same time.

The focus here should be on the technologies working together, not on each technology as a standalone. The concept is to understand that there are some additive advantages of leveraging both containers and microservices, which are indeed independent. That said, if the notion is that microservices are indeed services in the traditional sense, I have no retort for that.

Operational Considerations

Cloud-enabled traditional applications must be managed differently in production than they were prior to migration. This phase is known as cloud operations, or the operation of the application containers in the cloud.

When applications are put into production, those charged with cloud operations should take advantage of the container architecture. Manage them as distributed components that function together to form the applications, and are also separately scaled. For instance, the container that manages the user interface can be replicated across servers as the demand for that container goes up when users log on in the morning. This provides a handy way for cloud operations to build autoscaling features around the application, to expand and de-expand the use of cloud resources as needs change.

Most enterprises believe the cloud will become the new home for applications. However, not all applications are fit for the cloud, at least not yet. Care must be taken to select the right applications to make the move.

The use of containers and microservices makes things easier. This approach forces the application developer charged with refactoring the application to think about how to best redesign the applications to become containerized and service oriented. In essence, you're taking a monolithic application and turning it into something that's more complex and distributed. However, it should also be more productive, agile, and cost effective. That's the real objective here.

Microservice Coupling

As we create microservices to serve new or existing applications, we need to understand the benefits of being loosely coupled. Loose coupling, as related to microservices, has a few basic patterns.

In the location independence pattern, it doesn't matter where the microservice exists; the other components that need to leverage the service can discover it within a directory and leverage it through the late binding process. This comes in handy when you're leveraging microservices that are consistently changing physical and logical locations, especially services outside of your organization that you might not own, such as cloud-delivered resources. Your risk calculation service might exist on premises on Monday and within the cloud on Tuesday, and it should make no difference to you.

Dynamic discovery is key to this concept, meaning that calling components can locate microservice information as needed, and without



having to bind tightly to the service. Typically these services are private, shared, or public services as they exist within the directory.

In the communications independence pattern, all components can talk to each other, no matter how they communicate at the interface or protocol levels. Thus, we leverage enabling standards, such as microservices, to mediate the protocol and interface difference.

The security independence pattern is based on the concept of mediating the difference between security models in and between components. This is a bit difficult to pull off, but necessary to any service-based architecture. To enable this pattern, you have to leverage a federated security system that can create trust between components, no matter what security model is local to the components. This has been the primary force behind the number of federated security standards that have emerged in support of a loosely coupled model and Web services.

In the instance independence pattern, the architecture should support component-to-component communications using both a synchronous and an asynchronous model, and not require that the other component be in any particular state before receiving the request or message. Thus, if done right, all of the services should be able to service any requesting component asynchronously, and retain and manage state no matter what the sequencing is.

The need for loosely coupled architecture within your cloud computing solution is really not the question. If you leverage cloud computing correctly, you should have a loosely coupled architecture, except in some rare circumstances. Analysis and planning are also part of the mix, as are understanding your requirements and how each component of your architecture should leverage the other components of your architecture. Leverage the coupling model that works best for your requirements.

Although this seems like a lot of work, it's really only a quick survey and explanation of the present services. Moreover, whereas I recommend basic concepts and basic approaches, the needs of your IT environment will be unique and might require slightly different approaches. As long as the objective of having a complete understanding of the microservices within the problem domain is achieved, how you do that project is up to you. Another benefit of understanding the domain at a services level is that you can easily leverage this work in other directions, perhaps to support the core enterprise architecture or build and/or refine your architecture.

Acknowledgments

Portions of this article were adapted from an article I wrote for TechBeacon.2

References

1. D.S. Linthicum, Cloud Computing and SOA Convergence in Your Enterprise: A Step-by-Step Guide, Addison-Wesley, 2009.
2. D. Linthicum, "From Containers to Microservices: How to Modernize Legacy Applications," TechBeacon, 19 Jan. 2016; http://techbeacon.com/containers-microservices-how-modernize-legacy-applications.

David S. Linthicum is senior vice president of Cloud Technology Partners. He's also Gigaom's research analyst and frequently writes for InfoWorld on deep technology subjects. His research interests include complex distributed systems, including cloud computing, data integration, service-oriented architecture, Internet of Things, and big data systems. Contact him at david@davidlinthicum.com.

Cloud and the Law

Challenges in Delivering Software in the Cloud as Microservices

Christian Esposito, University of Naples Federico II
Aniello Castiglione, University of Salerno
Kim-Kwang Raymond Choo, University of Texas at San Antonio

Microservices, one of the latest architectural trends in software engineering,1 can be broadly defined as the design of service-oriented software using a set of small services. These services are small applications that can be deployed independently, with a precise and hardened interface, and easily integrated. Microservices are supported by middleware for communication and a platform for flexible and low-cost deployment.
The small applications have a high degree of internal cohesion around a single task and can cope with a simple responsibility. Here, we consider responsibility as doing (or being responsible for) one activity only. The activity can include serving or representing a particular resource. Practical examples include logging incoming messages within a database or a file, or managing a message queue by taking, processing, and discarding queued messages. Such a concept started within the industrial practice of splitting large monolithic applications into small cooperating pieces to improve their maintainability, scalability, and testability. Microservice architectures are receiving increasing focus, as evidenced by search statistics on Google Trends.2 Academia's interest in microservices is also evidenced by the publication of the first mature academic book in 2015.3
The increasing popularity of microservices, despite the effort required to implement them, is probably due to the benefits associated with their agility, resilience, scalability, and maintainability.4 Specifically, we could enforce separation of concerns by applying the single responsibility principle.

Microservices in the Cloud
Recent advancements in container technologies and the capability to overcome limitations in virtualization have, perhaps, encouraged the use of containers in the cloud for software development and deployment. This might also have paved the way for the adoption of a microservice architectural paradigm in cloud-hosted software by lowering infrastructure and maintenance costs.2,5,6 For example, microservices support the realization of small (sized) applications that are fine-grained and loosely coupled and communicate through REST protocols. These applications are implemented using APIs provided by the infrastructure-as-a-service (IaaS) layer for provisioning data computing, storage, and delivery capabilities.
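
To make the single-responsibility idea concrete, here is a minimal Python sketch of the two practical examples mentioned earlier: a service that only logs incoming messages, and a loop that takes, processes, and discards queued messages. The names MessageLogger and drain are hypothetical, and an in-memory list stands in for the database or file; this is an illustration under those assumptions, not a description of any particular platform.

```python
import queue


class MessageLogger:
    """Hypothetical microservice with one responsibility: logging incoming messages."""

    def __init__(self):
        # An in-memory list stands in for the database or file (assumption).
        self.records = []

    def handle(self, message):
        self.records.append(message)


def drain(inbox, service):
    """Take, process, and discard every queued message; return how many were handled."""
    handled = 0
    while True:
        try:
            message = inbox.get_nowait()  # take
        except queue.Empty:
            return handled
        service.handle(message)           # process
        inbox.task_done()                 # discard (mark the item as finished)
        handled += 1
```

Because the logger knows nothing about where messages come from, it could be deployed, replaced, or replicated without touching the queue-handling code, which is the kind of loose coupling the text describes.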

Published by the IEEE Computer Society. 2325-6095/16/$33.00 © 2016 IEEE


Figure 1. Software deployment in a cloud platform using (a) conventional and (b) microservice-based software. [Diagram labels: Traditional application; Microservice-based application; Communication bus; VM1, VM2, VM3; HM1, HM2, HM3; Cloud service provider.]

The growing adoption of microservices in the cloud is motivated by the ease of deploying and updating the software, as well as the provisioned loose coupling provided by dynamic service discovery and binding. Moreover, structuring the software to be deployed in the cloud as a collection of microservices allows cloud service providers to offer higher scalability guarantees through more efficient consumption of cloud resources, and to dynamically and quickly restructure software to accommodate growing consumer demand. Structuring software in smaller computation units enables optimized allocation of the application components within proper containers in the virtual machines (VMs) running on top of the host machine provided by the cloud infrastructure, as the example in Figure 1 illustrates. This lets us minimize waste of resources and maximize packing of the components within a single VM. This is possible because, despite having to realize the same application tasks, microservices are typically thinner than conventional software components because they use lightweight software technologies and platforms. This model also allows a simpler and faster migration of software component instances from one VM to another to satisfy cloud applications' changing resource demands.
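
The packing argument can be illustrated with a small Python sketch. It assumes, purely for illustration, that each component declares a single numeric resource demand and that all VMs have the same capacity; the first-fit decreasing heuristic used here is one common bin-packing strategy, not a method named by the article, and pack_components is a hypothetical helper.

```python
def pack_components(demands, vm_capacity):
    """First-fit decreasing: put each component in the first VM that has room for it.

    demands maps a component name to its resource demand (hypothetical units).
    Returns a list of VMs, each a list of component names.
    """
    vms = []   # component names placed on each VM
    free = []  # remaining capacity of each VM
    # Consider the largest components first.
    for name, demand in sorted(demands.items(), key=lambda kv: kv[1], reverse=True):
        if demand > vm_capacity:
            raise ValueError(f"{name} does not fit in a single VM")
        for i, room in enumerate(free):
            if demand <= room:
                vms[i].append(name)
                free[i] -= demand
                break
        else:
            # No existing VM has room, so allocate a new one.
            vms.append([name])
            free.append(vm_capacity - demand)
    return vms
```

With small components, the heuristic can fill each VM almost completely; a single component as large as the whole application would leave no such freedom.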
Partitioning monolithic applications into small pieces of computation also allows for the segmentation of application data, and thus avoids having a single monolithic data storage. Such a solution is required by the decentralized data governance that a microservice architecture imposes and is known in the literature as the sharding pattern.7 It consists of dividing a data store into a set of horizontal partitions, or shards, to improve scalability and performance when storing and accessing large volumes of data, as in the case of cloud-hosted applications. The different shards are kept consistent with respect to a centralized data storage, or through a distributed protocol.

Security Threats and Legal Ramifications
The key idea underlying the microservice architecture is to distribute application complexity among narrowly focused and independently deployable units of computation. Such small components are loosely coupled by means of an integration middleware that dynamically and seamlessly resolves the implicit dependencies among them.
Figure 2 depicts the differences between a conventional monolithic architecture and a microservice architecture. In the first case (Figure 2a), users interact with a front-end application, which redirects user requests to multiple instances of the software hosted within a container and implements all application responsibility using the data stored in a database. In the second case (Figure 2b), the software is split into multiple components, each implementing only its given responsibility, hosted in multiple containers, and/or replicated at the user's convenience. The dependencies among the microservices and data shards are complex and having centralized data storage can help maintain consistency. Using existing middleware solutions, such as those offering event notification,8 allows such architectures to cope with the complexity of connecting multiple entities, even with the dynamic discovery of components.


However, such complexity can result in security vulnerabilities affecting one or more vectors in the architecture,9 including single microservices,10 data shards,11 and the cooperation and orchestration among microservices.12 In the first vulnerability vector, having multiple components within an architecture can enlarge the attack surface by presenting more points of vulnerability that can be exploited to compromise the application. In addition, the complexity of gluing together a multitude of microservices complicates the challenge of debugging, monitoring, and auditing the application, especially within a cloud that's not under the application owner's control. Moreover, in microservice architectures, reuse is strongly enforced by allowing the use of off-the-shelf (OTS) components. Deployed OTS components represent a security threat and they should be properly validated, authenticated, and monitored.13

Figure 2. Architectural complexity of (a) monolithic and (b) microservice-based software. [Diagram labels: Users; Front-end application and load balancer; Container; Database; Shard.]

Trustworthiness is also an issue when dealing with microservices. For example, an adversary could compromise or gain control of a component, which isn't uncommon within the public cloud context. However, in a typical microservice architecture, the other components assume a trusted component base; thus, there's a real risk that attacks can be easily propagated due to the microservices' dependencies and (blind) trust of peer components. In data sharding, security threats are related to data provenance, trustworthiness, and protection. Specifically, when data is horizontally segmented into multiple shards, we need to determine if the data source is verifiable (or trustworthy) and how to protect against forging attempts (for example, data modification and tampering) during sharding.
Furthermore, data can be provided by multiple data sources, and each should have a reputation degree associated with it. This would give data consumers a way to determine the trustworthiness degree of the data prior to using it in their computations. Data must also be kept consistent with respect to eventual replicas (or a centralized data store), since such a process is vulnerable to exposure, denial of service, and alteration attacks. Even in the best-case scenario, where we assume a single microservice has been demonstrated to be secure and the data sharding mechanism protected, the architecture presents an additional point of vulnerability: the composition and coordination middleware can also be targeted and exploited. In the absence of a centralized orchestrator, for example, microservices coordinate among themselves through message-based communications. If these communications aren't protected, we can't guarantee secure data exchange, and microservices might receive falsified or modified coordination messages, which are required when undertaking complex and elaborate functionalities.14
Should a breach or cybersecurity incident occur as a result of a successful exploitation in any of the three vectors (single microservices, data shards, and the cooperation and orchestration among microservices), who is responsible for users' pure economic loss (such as direct and indirect financial loss) under the current legal framework: the microservice provider or the cloud service provider? If an attacker exploits the trust in peer components within the architecture, should the compromised



microservice provider be held responsible? What happens if the compromised microservice provider is found to have failed to provide reasonable industry-standard security?
Because microservices are relatively new, we don't know if there are gaps in existing legal frameworks. For example, because there are no known cases on which to comment, we can't effectively assess the adequacy of existing legislation, and challenges will arise as untested provisions are applied in civil litigation or prosecution. Thus, we need regular dialogues between service providers (such as microservices and cloud services), legal practitioners, and policymakers to improve understanding of the nature and extent of the risks and associated legal implications. This would, in turn, guide further responses at the operational and strategic levels for service providers and provide the evidence base to inform policymakers in governments to design national regulatory strategies to address the associated risks more effectively (for example, without resulting in a floodgate of claims against service providers).

Mitigation
Conventional solutions such as secure message-passing middleware, physical security for cloud infrastructure, and privacy-preserving cloud data storage schemes might only be effective for one or more components, but they're unlikely to address the security (and certainly not the legal) challenges due to microservices' unique characteristics.
Potential research to address these technical limitations or security vulnerabilities can focus on several areas. One area is microservice validation, which could be conducted both in isolation and in composition with other microservices, allowing us to identify and mitigate potential vulnerabilities in the software, implementation, or interaction between microservices.
In addition, we need a dynamic, and preferably lightweight, security monitoring and management system, responsible for monitoring and enforcing correct behavior of microservices and other related activities (such as cooperation). Once the behavior or activities deviate from the norm, the system should undertake corrective actions and/or countermeasures to bring the application to the correct state by autonomously resolving possible compromises and violations.
We also need proper cryptographic schemes for data protection at rest and in motion, allowing access and integrity control to the exchanged messages and stored shards while allowing privacy-preserving processing tasks.

We need to strike a balance between security and performance. For example, security mitigation solutions should also provide important nonfunctional requirements, such as minimal overhead and tamper resistance, so they minimally impact the system's performance and users' quality of experience. Ideally, the security solutions should continue to function after an attacker has successfully taken control of the cloud infrastructure or VMs. Since equipping microservices with a proper security mechanism might not always be the best option,10 offering security as a service for microservice-based cloud applications can be an attractive solution.

References
1. M. Fowler, "Microservices Resource Guide," 2016; http://martinfowler.com/microservices.
2. A. Balalaie, A. Heydarnoori, and P. Jamshidi, "Microservices Architecture Enables DevOps," IEEE Software, vol. 33, no. 3, 2016, pp. 42-52.
3. S. Newman, Building Microservices: Designing Fine-Grained Systems, O'Reilly Media, 2015.
4. T. Killalea, "The Hidden Dividends of Microservices," Comm. ACM, vol. 59, no. 8, 2016, pp. 42-45.
5. F. Oliveira et al., "Delivering Software with Agility and Quality in a Cloud Environment," IBM J. Research and Development, vol. 60, nos. 2-3, 2016, pp. 10:1-10:11.
6. M. Villamizar et al., "Infrastructure Cost Comparison of Running Web Applications in the Cloud Using AWS Lambda and Monolithic and Microservice Architectures," Proc. 16th IEEE/ACM Int'l Symp. Cluster, Cloud and Grid Computing (CCGrid), 2016, pp. 179-182.
7. C.H. Costa et al., "Sharding by Hash Partitioning: A Database Scalability Pattern to Achieve Evenly Sharded Database Clusters," Proc. 17th Int'l Conf. Enterprise Information Systems (ICEIS), 2015.
8. C. Esposito, D. Cotroneo, and S. Russo, "On Reliability in Publish/Subscribe Services," Computer Networks, vol. 57, no. 5, 2013, pp. 1318-1343.
9. N. Dragoni et al., "Microservices: Yesterday, Today, and Tomorrow," 2016; https://arxiv.org/abs/1606.04036.
10. Y. Sun, S. Nanda, and T. Jaeger, "Security-as-a-Service for Microservices-Based Cloud Applications," Proc. 7th IEEE Int'l Conf. Cloud Computing Technology and Science (CloudCom), 2015, pp. 50-57.
11. F. Callegati et al., "Data Security Issues in MaaS-Enabling Platforms," Proc. Int'l Forum Research and Technologies for Society and Industry, 2016; https://hal.inria.fr/hal-01336700.
12. S. Kim et al., "High-Assurance Synthesis of Security Services from Basic Microservices," Proc. 14th Int'l Symp. Software Reliability Eng. (ISSRE), 2003, pp. 154-165.
13. D.I. Savchenko, G.I. Radchenko, and O. Taipale, "Microservices Validation: Mjolnirr Platform Case Study," Proc. 38th Int'l Convention Information and Comm. Technology, Electronics and Microelectronics (MIPRO), 2015, pp. 235-240.
14. C. Esposito and M. Ciampi, "On Security in Publish/Subscribe Services: A Survey," IEEE Comm. Surveys and Tutorials, vol. 17, no. 2, 2015, pp. 966-997.

Christian Esposito is adjunct professor at the University of Naples Federico II, Italy, and research fellow and adjunct professor at the University of Salerno, Italy. His research interests include information security and reliability, middleware, and distributed systems. Esposito has a PhD in computer engineering from the University of Naples Federico II, Italy. Contact him at christian.esposito@dia.unisa.it.

Aniello Castiglione is adjunct professor at the University of Salerno, Italy, and at the University of Naples Federico II, Italy. His research interests include security, communication networks, information forensics and security, and applied cryptography. Castiglione has a PhD in computer science from the University of Salerno, Italy. He's a member of several associations, including IEEE and ACM. Contact him at castiglione@ieee.org.

Kim-Kwang Raymond Choo holds the Cloud Technology Endowed Professorship at the University of Texas at San Antonio. His research interests include cyber and information security and digital forensics. Choo has a PhD in information security from Queensland University of Technology, Australia. He's a fellow of the Australian Computer Society and a senior member of IEEE. Contact him at raymond.choo@fulbrightmail.org.

2017 B. Ramakrishna Rau Award
Call for Nominations
Honoring contributions to the computer microarchitecture field
New Deadline: 1 May 2017

Established in memory of Dr. B. (Bob) Ramakrishna Rau, the award recognizes his distinguished career in promoting and expanding the use of innovative computer microarchitecture techniques, including his innovation in compiler technology, his leadership in academic and industrial computer architecture, and his extremely high personal and ethical standards.
WHO IS ELIGIBLE?: The candidate will have made an outstanding innovative contribution or contributions to microarchitecture, use of novel microarchitectural techniques or compiler/architecture interfacing. It is hoped, but not required, that the winner will have also contributed to the computer microarchitecture community through teaching, mentoring, or community service.
AWARD: Certificate and a $2,000 honorarium.
PRESENTATION: Annually presented at the ACM/IEEE International Symposium on Microarchitecture
NOMINATION SUBMISSION: This award requires 3 endorsements. Nominations are being accepted electronically: www.computer.org/web/awards/rau
CONTACT US: Send any award-related questions to awards@computer.org
www.computer.org/awards



IEEE Cloud Computing Call for Papers

Special Issue on Multicloud
Submission deadline: 2 January 2017
Publication date: July/August 2017

As Cloud Computing evolved to a widely used computing as a service model, limitations and intrinsic characteristics of monolithic cloud provider offerings emerged. Moreover, specialized computing power such as clusters, GPUs, solid state storage, and specific applications at different service levels can now be acquired as services from different providers. The use of a combination of cloud services from various providers can be performed to contour limitations of a single provider and enhance application execution by gathering together the necessary specific, on demand resources for a wide range of applications.

This IEEE Cloud Computing Magazine Special Issue on Multicloud aims to cover all aspects of connecting multiple clouds to allow automatic, transparent, and on demand application execution that takes advantage from the synergy among resources of different providers. For this synergy to become effective and efficient, connecting different providers across their boundaries brings new, challenging efforts. Multicloud deployment must solve challenges that include resource management and scheduling, identity management, trust and security issues, business models, and incentive mechanisms in multicloud environments. We invite authors to submit outstanding and original manuscripts on the following topics within the context of multiclouds:

brokering mechanisms,
resource discovery and management,
security and privacy,
authentication and authorization,
applications and case studies,
auditing and accounting,
multicloud APIs,
monitoring,
data management,
performance modeling and evaluation,
cloud federations,
scheduling and load balancing,
hybrid clouds,
autonomic management,
multicloud and the Internet of Things,
QoS and QoE,
economic and business models,
cross-service-level management (IaaS, PaaS, SaaS, and XaaS),
incentive mechanisms, and
multiclouds and green computing.

Guest Editors
Dr. Luiz F. Bittencourt, University of Campinas
Dr. Rodrigo N. Calheiros, University of Melbourne
Dr. Craig A. Lee, Aerospace Corporation

Submission Information
Submissions should be 3,000 to 5,000 words long, with a maximum of 15 references, and should follow the magazine's guidelines on style and presentation (see https://www.computer.org/web/peer-review/magazines for full author guidelines). All submissions will be subject to single-blind, anonymous review in accordance with normal practice for scientific publications. For more information, contact the guest editors at cc4-2017@computer.org.
Authors should not assume that the audience will have specialized experience in a particular subfield. All accepted articles will be edited according to the IEEE Computer Society style guide (www.computer.org/web/publications/styleguide). Submit your papers through Manuscript Central at https://mc.manuscriptcentral.com/ccm-cs.

www.computer.org/cloudcomputing
Cloud Economics

The Economics of Microservices

Andy Singleton, MAXOS.ai

Microservices are a solution, perhaps the only solution, to the problem of efficiently building and managing complex software systems. For medium-sized systems, they can deliver cost reduction, quality improvement, agility, and decreased time to market. For large cloud systems, they fundamentally change the rules of the game. Microservices are the software equivalent of Lego bricks: they are proven to work, fit together nicely, and can be used to rapidly construct complex solutions.
Complexity problems often arise in legacy, monolithic systems, such as large enterprise resource planning or core banking systems, making them inflexible and expensive to maintain. In many cases, such systems are linked together so even an apparently trivial change in a single component can create errors in that component or others. This therefore requires regression testing of all components and the system as a whole. We see the same complexity and reliability requirements in new software and digital products. When many changes, ranging from bug fixes to feature enhancements, occur in a monolithic application, it might never become stable and reliable.
With an older monolithic architecture, it's difficult or impossible to release systems over a certain size. Capers Jones, writing in 1999 after a study of 1,000 applications, observed that MIS applications greater than 10,000 function points are rarely successful.1 These problems have resulted in failed systems costing hundreds of billions of dollars over the last 50 years. Needless to say, in the more than 15 years since, many more failures have occurred.
Fred Brooks' classic book The Mythical Man-Month described the difficulty of building and testing large systems, and the decline in development productivity that inevitably set in with increasing system size.2 Brooks observed that adding personnel to a late project would make it even later, and he declared at the time that there is no silver bullet that will fix the productivity problem.
Now there might well be a silver bullet: microservices architectures and continuous integration enable companies such as Google and Amazon to build and release enormous systems such as Web search and e-commerce engines with far more than 10,000 function points, and evolve them with great speed. For example, Google search and applications are built from an interlinked set of more than 5,000 services. The company tests and releases its online systems from a single codebase with more than 2 billion lines of code, deploying about 75,000 committed changes per week.3
Some sort of service architecture is inevitable for large systems. After all, there is a physical limit to the amount of load one server can handle, even a mainframe, parallel processing supercomputer, or quantum computer. There is a conceptual limit to the number of functions one release team can specify, test, and maintain. Beyond those limits, functions will be placed into separate processes. Application code will be separated from the database tier, front-end proxy servers and load balancers will be used to organize horizontal scaling to multiple application servers, and so on.
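
As a rough illustration of the horizontal scaling just described, the following Python sketch shows a front-end balancer that spreads requests over several application servers in round-robin order. It is a toy model under stated assumptions: the servers are plain in-process callables rather than networked machines, and RoundRobinBalancer and make_server are hypothetical names introduced only for this example.

```python
import itertools


def make_server(name):
    """Return a stand-in application server: a callable that tags each request."""
    def serve(request):
        return f"{name}:{request}"
    return serve


class RoundRobinBalancer:
    """Hypothetical front-end proxy that rotates requests across servers."""

    def __init__(self, servers):
        servers = list(servers)
        if not servers:
            raise ValueError("at least one server is required")
        # itertools.cycle repeats the server list indefinitely, in order.
        self._next_server = itertools.cycle(servers)

    def route(self, request):
        server = next(self._next_server)
        return server(request)
```

Adding capacity in this model means adding another callable to the list, which mirrors the idea that load beyond one server's limit is handled by routing to more instances.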
So, systems that are large, complex, and need to
scale require service architectures. But, why do we
need micro services? The answer lies in economics
and the efficiency of developing, testing, deploying,
and enhancing software. As an organization gets



better at deploying and managing multiple services, it will find reasons to make them smaller, more like microservices. Moreover, microservices approaches fit particularly well with cloud computing, enabling the economic benefits of microservices to complement the economic benefits of cloud computing, such as cost and user experience optimization.4 In addition, the rapid release of microservices works well with cloud-based online systems that don't require further distribution of software updates. What we know as the cloud is actually a mass of microservices, speaking to each other over the Internet with the universal language of APIs. An understanding of microservices opens the door to participating in this huge new market and/or exploiting their benefits to achieve competitive advantage.

Costs of Microservices
Although microservices approaches offer substantial benefits, a microservices architecture requires extra machinery, which can impose substantial costs. It also often requires extra code to communicate between services. Instead of making simple function calls, you'll define API calls or messages, and implement API calls on each end.
In addition, you need systems such as service catalogs and messaging and queuing services to discover and then route calls to the correct service instances. When you make a call to a microservice, you'll go through a proxy or messaging layer that finds a suitable instance. Because microservices can be event-processing scripts, containers, or entire virtual machines, you'll also want a systematic way to package and deploy them.
A microservices architecture must include systems to monitor service performance and behavior as well as special techniques to handle errors. When a microservice isn't responding, there is no simple way for other services to understand the error or even see there is a problem. You'll need extra code and monitoring to make sure problems get handled as errors, rather than just piling up or cascading into a catastrophe.
Finally, you'll need to have new discussions about when to include functions inside one service, and when to break them into separate services. These decisions are currently more of an art than a science.

Benefits and Tradeoffs of Microservices
A microservices architecture has substantial costs, but it also has substantial benefits.

System Size and Complexity
Microservices architectures are good for big systems but not for small systems. They require extra machinery to communicate between services, route to services, deploy services, and monitor services. If you have a small system that won't change much and doesn't need to scale, you can avoid this extra machinery and save time and money by building a more monolithic system. If you have a large system that changes frequently and does need to scale, you benefit from a microservices architecture through several mechanisms. It's much easier to test and release the smaller components. You get greater reliability through redundancy and scalability because of your ability to increase the instances of any service that's a bottleneck. You get greater quality through reuse of field-proven components packaged into microservices.
Although there are, no doubt, exceptions, I currently believe that if you have fewer than about 60 people working on your system, you don't need a microservices architecture. Over this amount of product complexity, you'll probably benefit from a microservices approach.
A high volume or complex system will inevitably move to Web services and then to smaller microservices. There's a limit to the transaction volume and throughput that a single server can handle. Beyond


that point, you'll have multiple servers, and routing and load-balancing machinery typical of a service architecture. There's also a limit to the number of functions that can be tested and maintained on that server. Beyond that point, you'll have multiple services. Most modern business systems are well beyond these two points.

Release Frequency and Agility
Monolithic applications include a lot of functions that need to be tested, so they take a long time to test and release, sometimes a month or more. A monolithic architecture works well for an installed software product that requires several months to be distributed, tested, and installed at customer sites.
In contrast, a microservices-centric development team can test and release changes to smaller components more than once per day. A company like Amazon with thousands of services can make more than a thousand changes per day, fixing problems and adding new features. This type of continuous delivery becomes a powerful tool for developers that update cloud-based software-as-a-service (SaaS) and online systems, which in turn means that continuous delivery is also a powerful tool for business agility and competitive advantage.
Large software teams often have problems with merging code, in addition to testing. We often see small teams of one to seven programmers making changes and then merging these changes with the work of other teams. Any conflicts in the changes from other teams need to be studied and fixed by hand. If many groups are merging, the process becomes difficult and unreliable. The microservices architecture solves this problem by skipping the merges. Each team can run an integration test on its code, and release it directly as a packaged service.
A microservices architecture provides a simple way to change components without spreading problems throughout the system. If you make a significant change to a component, you'll want to know that this doesn't cause problems in the consumers that use the component. You don't want to test and fix every consumer. Consumers will be reliable if they have access to a stable API. So, microservices maintain their old APIs and behavior, even while they provide new features in new API calls. In fact, this is one way to determine how to right-size a service: making it small and simple enough that the API can remain stable.
Microservices approaches typically utilize automated test scripts that run in a continuous integration system. Such scripts help ensure that localized changes that shouldn't impact the larger system in fact don't. Continuous integration typically exploits a layer of automated testing that's more extensive, reliable, and efficient than the tests that run on the output of a larger application. This drives down the time and cost of testing.
All of these techniques add up to the practice of continuous delivery, as practiced by SaaS and online service companies. Systems are chopped up into an array of microservices. Each microservice is assigned to a development team, which monitors it, fixes it, and releases improvements whenever they're ready, whether once a month, once a week, once a day, or even more often. They maintain their APIs and feed their services into a continuous integration system to make sure that the whole system works correctly. This continuous process is more adaptable and easier to manage than the older Scrum-style agile development with its two-week cadence.
You'll benefit from a microservices architecture if you run continuous delivery of online services, if you do a lot of work merging code, or if you have long test cycles or high test expenses.

Component Size
Bigger components (monolithic applications and macroservices) are easier to operate and have less



code per feature because they have less interprocess communication. Smaller microservices are easier to build, test, mix and match, configure, and deploy. You can therefore have more frequent releases, smaller development teams, and faster onboarding of new developers and tech leads.
Deciding how big to make a component is more art than science. Almost any service architecture will show a tremendous variation in service sizes, from a 10-line script to process an event or post a Web message, to a database server built from millions of lines of code. Not all services in a microservices architecture are necessarily micro. A service will become as big as it needs to be to provide a coherent, efficient, reliable function.

Build Versus Buy
Cloud vendors package their products into Web services (for example, storage, database, and message queues). It's easy to use these services if you already have a microservices architecture. SaaS vendors also make their functions available through API calls that are easy to insert into a microservices architecture. So, a microservices architecture will have a higher percentage of buy versus build. This can result in a very large reduction in build and maintenance costs, which may create a tradeoff in operational costs. For example, rather than incurring a capital expenditure to build out compute, storage, and network infrastructure, one can pay for microservices on each invocation, an obvious cost, and for data transferred to or from a cloud, a potentially hidden cost. In a more traditional architecture, a computer system's owner is responsible for maintenance and upgrades. In the API economy used by microservices, it's possible to buy services for any layer in the stack, from compute and storage up to complete SaaS functions. Vendors centrally manage and continuously upgrade these services.

Integration
Do you have special integration needs? Microservices architectures increase adaptability because you can swap out components and include components from different languages and platforms. You can have different languages and platforms on different sides of an API call. This is important if you do things like merger integration, where you're combining unrelated software platforms. It's also important when you're migrating between two platforms. For example, if you're migrating from an old legacy application to a new architecture, you can start with the old architecture by inserting microservice components to do things such as processing events. As you switch to the new platform, you can use the old application as a service.

One Product or Many Products
Microservices are very useful when you're supporting multiple products or developing new products and services. You can reuse services in more than one product. For example, most online service companies use many shared components (for example, login and authentication) for all of their products. You might need to add only a thin layer of new capabilities and product management to launch a new product or invade a new market. Or, you can strip away product capabilities and sell microservices directly. That's what Google and Amazon do when they sell you a cloud database instead of a search or e-commerce app. Think of it this way: if 100 developers work for a year on a monolithic product, they'll end up with a single product. If they divide into 10 teams of 10 people each to develop 10 microservices, those services can theoretically be combined into 2^10 (that is, more than 1,000) different products.

Scale of Production Services and Cost of Load
When you divide your application into microservices, you gain more ways to increase its capacity. You can add more instances of any microservice that comes under heavy load. You can increase capacity more easily and cheaply than with a monolithic system, because you're only adding the components that are heavily loaded, and not a whole complex app.
Your base load also declines, so you might get more advantages from moving to an on-demand cloud vendor. In an extreme case, you might be able to take advantage of serverless architectures, such as what Amazon Web Services calls Lambda functions (https://aws.amazon.com/lambda), Google calls cloud functions (https://cloud.google.com/functions), and Microsoft calls Azure functions (https://azure.microsoft.com/en-us/services/functions), in which the cloud vendor runs small scripts to process your


incoming events and uses compute resources only when such events or API calls arrive. This has advantages of simplicity and efficiency, and it can reduce compute costs by 90 percent or more.

Data Size, Consistency, and Complexity
When your database is small, you can provide it to all parts of a big application, and a monolithic architecture is practical. In the 1980s, data architects learned to normalize databases, so that each piece of data was in one place, and they were sure that the data was correct and not contradicted elsewhere. This seems important for data like a bank account balance. However, this central data store propagates problems when you change the data structure, and you need to test all of the code that uses the data. And, as data volumes increase, it becomes difficult or even impossible to put the changes in one place. So, the central shared database has become outdated.
A monolithic architecture is often based on a reliable transaction approach typically referred to as ACID: atomic (all or nothing), consistent, isolated (transactions lead to identical outcomes whether executed in parallel or serially), and durable (permanent, even in the face of a disaster). It has proven to be impossible to run ACID systems at very large scale. Large systems must live in a BASE world of basic availability, soft state, and eventual consistency.5 In the BASE world, we often get data through API calls to microservices.6
When the database is very big, or data structure changes frequently, or data flows at high volume, you'll want a service architecture in which data is encapsulated into services, and other parts of the app get it with API calls. In addition, the structure of the API calls should be stable and backward compatible, even when the underlying data or schema changes. The architecture should also include an expandable number of data handling nodes. Finally, the service architecture should allow data to be replicated in an eventually consistent way to all nodes.

A microservices architecture imposes costs and complexity. However, it solves some expensive problems faced by the developers of complex systems. You should consider using a cloud-based microservices architecture if you're dealing with any of the following types of complexity:

• large software systems with large numbers of developers or long and expensive test cycles,
• a competitive environment that requires the rapid upgrading and release of online systems or business services,
• multiple software-based products or online services,
• migration from building and maintaining systems to buying more components that will be continuously upgraded by vendors,
• integration with systems on different platforms,
• high volume of usage on cloud-based platforms, or
• large flow of data, or rapidly changing data structures.

Your services will become cells in the API economy of the cloud, the largest and most powerful computer system ever conceived.

References
1. C. Jones, Software Assessments, Benchmarks, and Best Practices, Addison-Wesley, 2000.
2. F.P. Brooks, Jr., The Mythical Man-Month: Essays on Software Engineering, Addison-Wesley, 1975.
3. R. Potvin, "The Motivation for a Monolithic Codebase: Why Google Stores Billions of Lines of Code in a Single Repository," presentation, @Scale conference, 2015; www.youtube.com/watch?v=W71BTkUbdqE.
4. J. Weinman, Cloudonomics: The Business Value of Cloud Computing, John Wiley & Sons, 2012.
5. D. Pritchett, "BASE: An ACID Alternative," ACM Queue, vol. 6, no. 3, 2008, pp. 48-55.
6. E.A. Brewer, "Towards Robust Distributed Systems," Principles of Distributed Computing, 2000; https://people.eecs.berkeley.edu/~brewer/cs262b-2004/PODC-keynote.pdf.

Andy Singleton is the founder of MAXOS.ai, Assembla, and PowerSteering Software. He writes about continuous delivery, microservices, big systems, and big companies. Contact him at andy@singleton.ai.



IEEE Cloud Computing Call for Papers

Special Issue on Cloud-Native Applications
Submission deadline: 1 March 2017
Publication date: September/October 2017

IEEE Cloud Computing magazine seeks accessible, useful papers for a special issue on Cloud-Native Applications and Architecture. Many applications in enterprises are not able to leverage the advantages of cloud computing without a great deal of refactoring, a process that is costly, time consuming and often producing disappointing results. However, over the last five years we have seen cloud software architectures evolve that promote the design of applications that, from conception to deployment, are envisioned, prototyped and built with cloud tools and cloud resources. These cloud-native applications are born and run in the cloud and follow new classes of design and maintenance patterns.

The purpose of the special issue is to urge the research community to better define and document the cloud-native movement. Topics of interest include but are not limited to:

• Frameworks to make it easier for industry to build cloud-native applications;
• Educational approaches and community based organizations that can promote cloud-native design concepts;
• The tooling to develop cloud-native applications;
• The role of open source for building cloud-native applications;
• VM and container orchestration systems for managing cloud-native designs;
• Cloud-native applications running in hybrid cloud or migrated from one cloud to another;
• Efficient mechanisms to make legacy applications cloud-native;
• Comparing applications (one cloud-native and the other not) in terms of performance, security, reliability, maintainability, scalability, etc.;
• Cloud-native applications for various industry sectors (engineering, financial, scientific, health);
• Cloud-native operating systems and databases; and
• New models for capacity planning and pricing inspired by cloud-native architecture paradigms.

Special Issue Guest Editors
Roger Barga, Amazon AWS
Dennis Gannon, Indiana University
Neel Sundaresan, Microsoft Corporation

Submission Information
Submissions should be 3,000 to 5,000 words long, with a maximum of 15 references, and should follow the magazine's guidelines on style and presentation (see https://www.computer.org/web/peer-review/magazines for full author guidelines). All submissions will be subject to single-blind, anonymous review in accordance with normal practice for scientific publications. For more information, contact the guest editors at cc5-2017@computer.org.
Authors should not assume that the audience will have specialized experience in a particular subfield. All accepted articles will be edited according to the IEEE Computer Society style guide (www.computer.org/web/publications/styleguide). Submit your papers through Manuscript Central at https://mc.manuscriptcentral.com/ccm-cs.

Guest Editors' Introduction

Cloud Security
Peter Mueller, IBM Zurich Research Laboratory
Chin-Tser Huang, University of South Carolina
Shui Yu, Deakin University
Zahir Tari, RMIT University
Ying-Dar Lin, National Chiao Tung University

The cloud is becoming a major computing environment. Many
critical applications are being migrated to cloud platforms.
These include medical and finance applications, big data
applications, and applications with real-time constraints.
Gartner predicts that the bulk of future IT infrastructure
spending will be on cloud platforms and applications, and
nearly half of all large enterprises are planning cloud deployments by the end
of 2017.
However, cloud computing systems and services have become major targets
for cyberattackers. Because the cloud infrastructure is to a certain degree
an open and shared platform, it's subject to malicious attacks from both
insiders and outsiders. The high degree of virtualization in cloud systems
also makes them ideal subjects for side-channel attacks. Identity hijacking
and distribution of malicious code have become critical issues in cloud sys-
tems. Thus, organizations must carefully plan, deploy, and maintain cen-
tralized management of security properties and effectively enforce security
policies in cloud environments.
To provide strong protection of cloud platforms, infrastructure, hosted
applications, and data stored in the cloud, we need to address the security
issue from a range of perspectives. These include secure data and applica-
tion outsourcing, information leakage protection, information retrieval on
encrypted data, anonymous communication, vulnerability discovery, attack
handling, homomorphic encryption, and secure multiparty computation.
To achieve the high level of cloud security management required, we need
comprehensive vulnerability analyses and innovative security technologies in
both theory and practice.

IEEE Cloud Computing, published by the IEEE Computer Society. 2325-6095/16/$33.00 © 2016 IEEE


The Articles
By organizing this special issue on cloud security, it was our intention and hope to emphasize and address the importance of protecting and securing cloud platforms, infrastructures, hosted applications, and data storage.
We received 42 submissions in response to our call for papers. After two rounds of rigorous review, we selected five high-quality papers for publication in this special issue.
In "Online Analysis of Security Risks in Elastic Cloud Applications," Athanasios Naskos, Anastasios Gounaris, Haralambos Mouratidis, and Panagiotis Katsaros address security-related concerns in elastic cloud applications stemming from the inherent tradeoffs between security and other nonfunctional requirements, such as performance. To this end, the authors propose a solution that can be efficiently realized by modeling the application behavior as a Markov decision process, on top of which they apply probabilistic model checking. The authors show how their approach can be used to perform online analysis and elastic decision making, and how its runtime analysis can provide evidence for key security-related aspects of the running applications, such as determining the probability of data leakage in the next hour.
In "Privacy-Preserving Access to Big Data in the Cloud," Peng Li, Song Guo, Toshiaki Miyazaki, Miao Xie, Jiankun Hu, and Weihua Zhuang focus on the security and privacy concerns regarding third-party cloud storage service providers, which cause many users and companies to hesitate to move their data to cloud storage. The authors provide a tutorial and survey of oblivious RAM (ORAM), a solution designed to enable privacy-preserving access to data stored in the cloud. Moreover, the authors study the access load balancing problem when applying ORAM for big data in the cloud, and propose heuristic algorithms to achieve access load balancing in both static and dynamic deployments.
In "Cryptographic Public Verification of Data Integrity for Cloud Storage Systems," Yuan Zhang, Chunxiang Xu, Hongwei Li, and Xiaohui Liang also deal with the security of cloud storage services but from a different angle: the verification of data integrity. Many public verification schemes employ a third-party auditor to verify the integrity of data outsourced to cloud storage services, but they're effective only if a strong assumption holds: that the auditors are honest and reliable. Moreover, many such schemes are vulnerable to an active external adversary, who can modify the outsourced data and tamper with the interaction messages between the cloud server and the auditor. The authors propose an efficient and secure public verification scheme that uses a random masking technique to protect against external adversaries, and requires users to audit auditors' behaviors to prevent malicious auditors from forging verification results. The authors also use Bitcoin to construct unbiased challenge messages to thwart collusion between malicious auditors and cloud servers.
In "To Docker or Not to Docker: A Security Perspective," Theo Combe, Antony Martin, and Roberto Di Pietro address the security of cloud-based infrastructures. They first provide a comprehensive overview of the container ecosystem, which was designed for shortening development cycles, providing continuous delivery, and achieving cost savings in cloud-based infrastructures. The remainder of the article is dedicated to the introduction of Docker, which is both a leading container solution and a complete packaging and software delivery tool. Using realistic use cases, the authors discuss the security implications of the Docker environment. Moreover, they define an adversary model, point out several vulnerabilities affecting current Docker uses, and discuss future research directions for Docker.
In "User-Centric Security and Dependability in the Clouds of Clouds," Marc Lacoste, Markus Miettinen, Nuno Neves, Fernando M.V. Ramos, Marko Vukolic, Fabien Charmet, Reda Yaich, Krzysztof Oborzyński, Gitesh Vernekar, and Paulo Sousa consider the issue of lack of interoperability in a distributed environment of multiple clouds, and point out that the complexity of management could raise many security and dependability concerns. The authors introduce secure Supercloud computing as a new paradigm for security and dependability management of distributed clouds. Supercloud follows a user-centric and self-managed approach to avoid technology and vendor lock-ins. In this system, users can define U-clouds, which are isolated sets of computation, data, and networking services run over


both private and public clouds operated by multiple providers, with customized security requirements as well as self-management for reducing administration complexity. The authors present the Supercloud security architecture along with several use cases to illustrate its practical applicability.

We hope you enjoy reading these five articles and expect that the publication of this special issue will both increase public awareness of the significance of cloud security and inspire further investigation on the development and enhancement of state-of-the-art cloud security solutions.

PETER MUELLER is a research staff member at IBM Research. His research interests include datacenter storage security and reliability and high-frequency technology. He's a senior member of IEEE and a member of the Society for Industrial and Applied Mathematics, the Electrochemical Society, and the Swiss Physical Society. Contact him at pmu@zurich.ibm.com.

CHIN-TSER HUANG is an associate professor of computer science and engineering at the University of South Carolina, where he's the director of the Secure Protocol Implementation and Development Laboratory. His research interests include network security, network protocol design and verification, and distributed systems. Huang has a PhD in computer science from the University of Texas at Austin. He's a senior member of IEEE and ACM, and a member of Sigma Xi and Upsilon Pi Epsilon. Contact him at huangct@cse.sc.edu.

SHUI YU is a senior lecturer in the School of Information Technology at Deakin University, Australia. His research interests include networking theory, network security, privacy and forensics, and mathematical modeling. He's a senior member of IEEE. Contact him at shui.yu@deakin.edu.au.

ZAHIR TARI is a full professor of distributed systems at RMIT University, Australia. His research interests include system performance (for example, webservers, peer to peer, and cloud computing) and system security (such as SCADA systems and the cloud). Tari has a PhD in computer science from the University of Grenoble, France. Contact him at zahir.tari@rmit.edu.au.

YING-DAR LIN is a distinguished professor of computer science at National Chiao Tung University, Taiwan. His research interests include design, analysis, implementation, and benchmarking of network protocols and algorithms, quality of services, network security, deep packet inspection, wireless communications, embedded hardware/software co-design, and software-defined networking. Lin has a PhD in computer science from the University of California, Los Angeles. Contact him at ydlin@cs.nctu.edu.tw.



IEEE Cloud Computing Call for Papers

Intelligence in the Cloud
Submission deadline: 1 May 2017
Publication date: November/December 2017

Artificial intelligence (AI), since its birth in the 1950s, has been heralded as the key to our civilization's brightest future. To pursue the vision of AI, various machine learning approaches (for example, deep learning, supervised learning, unsupervised learning, reinforcement learning, and so on) have been proposed and a few have actually been developed and deployed in the market. The recent hype around big data has enthusiastically renewed the call and focus for advanced machine learning technologies to extract knowledge from large data pools. With its rich resource provisioning, cloud computing is widely regarded as an ideal platform to facilitate resource-intensive machine learning so as to enable intelligence in the cloud. Integrating intelligence into the cloud is without doubt a promising development trend for both cloud computing and AI.

We are still at the early stage of integrating intelligence into the cloud. The path toward this exciting future still involves many critical challenges in different aspects.

At the application layer, efficient and powerful cloud-based AI techniques are in high demand for applications such as natural language processing, stock analysis, medical diagnosis, intelligent industry control, intelligent transportation, and scientific discovery.

At the platform layer, while intelligence has been deployed (for example, Spark's scalable machine learning MLlib and Google's cloud machine-learning framework TensorFlow), new machine learning engines are expected for emerging computing frameworks (for example, the dataflow computing model HAMR).

At the infrastructure layer, new cloud computing architecture and resource scheduling strategies are required to support computation-intensive and IO-intensive machine learning algorithms. How to configure cloud computation, storage, and networking resources for fast, efficient, and scalable machine learning must still be addressed.

The goal of this special issue is to seek original articles examining the state of the art, open research challenges, new solutions, and applications for intelligence in the cloud with special focus on, but not limited to, the following topics:

• new distributed architecture for machine learning;
• new machine learning engines in the cloud;
• analytics architectures, frameworks, and models for complex intelligent systems;
• intelligent cloud applications or services such as intelligent traffic, intelligent buildings, intelligent environments, intelligent businesses, and so on;
• cloud resource allocation and optimization through machine-learning algorithms;
• machine learning for cloud resource management;
• combining human and machine intelligence in the cloud; and
• security and privacy issues for intelligent systems in the cloud.

Special Issue Guest Editors
Song Guo, The Hong Kong Polytechnic University, Hong Kong
Victor Leung, University of British Columbia, Canada
Xin Yao, University of Birmingham, UK

Submission Information
Submissions should be 3,000 to 5,000 words long, with a maximum of 15 references, and should follow the magazine's guidelines on style and presentation (see https://www.computer.org/web/peer-review/magazines for full author guidelines). All submissions will be subject to single-blind, anonymous review in accordance with normal practice for scientific publications. For more information, contact the guest editors at ccm6-2017@computer.org.
Authors should not assume that the audience will have specialized experience in a particular subfield. All accepted articles will be edited according to the IEEE Computer Society style guide (www.computer.org/web/publications/styleguide). Submit your papers through Manuscript Central at https://mc.manuscriptcentral.com/ccm-cs.

Cloud Security

Online Analysis of Security Risks in Elastic Cloud Applications
Athanasios Naskos and Anastasios Gounaris, Aristotle University of Thessaloniki
Haralambos Mouratidis, University of Brighton
Panagiotis Katsaros, Aristotle University of Thessaloniki

To address security-related concerns in elastic cloud applications, such as data loss and leakage, the authors propose modeling application behavior as a Markov decision process and applying probabilistic model checking.

Two properties of cloud computing have made it the main option for the deployment of Web applications for a wide range of stakeholders, including large companies, small and medium enterprises, and research institutes. First, cloud computing allows computational resources to be released on demand, giving application providers the option to charge in a pay-as-you-go manner. Cloud computing solutions therefore minimize users' up-front investments in equipment and human resources, which can then be allocated to application deployment and management. Second, providers release resources in response to workload changes. This property, known as autoscaling, impacts the monetary cost of running a cloud application: resources are used only when required.
Elastic applications can exploit these properties to provide computational resources on the fly according to current needs. Computational resources are typically provided in the form of virtual machines (VMs). Elasticity is manifested in three main forms. Horizontal scaling, where either new VMs are added or existing ones are removed, provides the biggest potential for scalability and performance improvements, because of the perceived unlimited number of VMs that can be provided. In vertical scaling, certain properties (for example, the number of cores) of the existing VMs are modified. In the third form,



live migration, a VM is moved to a different physical host while staying operational.
We focus on horizontal scaling and argue that the goals of meeting performance requirements with the help of horizontal scaling contradict those of security, thus calling for a risk-based security solution.

Security Concerns and Horizontal Scaling in Public Clouds
Public and hybrid clouds, unlike private clouds, offer resources to arbitrary customers, or tenants. Tenants control neither the cloud's security policy nor the types of other tenants whose VMs are collocated on the same physical machines. Although this doesn't necessarily imply that public clouds are insecure, many organizations cite it as a reason for hesitating to migrate applications to the cloud.1
A 2013 Cloud Security Alliance report identified data breaches due to malicious cotenants as the top cloud-related security threat.2 These data breaches can lead to both data leakage (the unauthorized disclosure of data from one user to another) and data loss (a condition where data is destroyed and becomes unavailable). In addition, in a multitenant environment, a lack of authorization mechanisms for sharing physical resources increases the risk of threats such as service traffic hijacking, which occurs when attackers hijack cloud accounts by stealing security credentials and eavesdropping on activities and transactions, and side-channel attacks, which use information obtained from bandwidth monitoring or other similar techniques. Moreover, when multiple tenants share an underlying infrastructure, the risk of threats related to misconfiguration and uncoordinated change control increases, allowing a malicious tenant to gain access to another tenant's resources.
Keeping the number of VMs as low as possible is an indirect way to mitigate data leakage- and loss-related security concerns but might entail an unacceptable compromise on performance. Performance is one of the top three most studied service-level agreement (SLA) parameters, since critical applications require responses in a fixed, short time period.3 Therefore, a reliable cloud application must both address security concerns and honor SLAs.

[Figure 1 plot: Latency distribution for different cluster sizes (14,000 requests/second). x-axis: No. of VMs (8 to 18); y-axis: Response latency (msecs), 30 to 90.]
Figure 1. Execution plans based on different cost metrics. Increasing the number of virtual machines (VMs) leads, on average, to lower response times, subject to increased probability of a malicious tenant being collocated.

In the example in Figure 1, an elastic NoSQL database serves user requests according to the Yahoo Cloud Serving Benchmark (YCSB).4 The figure refers to a fixed rate of user requests and shows how the average and standard deviation values of the response latency vary with the number of VMs used. Thus, a strict threshold on latency would force the system to acquire additional VMs, the exact quantity of which needs to be computed at runtime according to the current workload and considering the system's volatility. However, increasing the number of VMs, and assuming that each VM runs on a different physical machine in the generic case, also increases the probability of a malicious tenant being collocated. This probability might vary according to the cloud provider type,5 but the important issue is that it is not negligible. Choosing a single trustworthy provider isn't sufficient either, given that providers might offer VMs from other providers as well in periods of very high demand.6 Overall, as noted earlier, adding VMs poses a threat of data leakage and data loss. The magnitude of this threat depends on the application. In the figure, the threat of data loss is lower than in typical applications, because NoSQL databases are replicated at least two or three times, making it harder for a malicious user to destroy all copies.
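The claim that more VMs raise the chance of a malicious cotenant can be illustrated with a small calculation. The sketch below is not the authors' model: it assumes, purely for illustration, that each VM sits on its own physical host and that each host independently has the same probability `p_per_host` of also running a malicious tenant. The function name and the example probability are hypothetical.

```python
def collocation_risk(num_vms: int, p_per_host: float) -> float:
    """Probability that at least one of num_vms hosts also runs a malicious tenant.

    Assumption (not from the article): hosts are independent and share the
    same per-host probability p_per_host.
    """
    if num_vms < 0:
        raise ValueError("num_vms must be non-negative")
    if not 0.0 <= p_per_host <= 1.0:
        raise ValueError("p_per_host must be in [0, 1]")
    # Complement of "no host is shared with a malicious tenant".
    return 1.0 - (1.0 - p_per_host) ** num_vms


if __name__ == "__main__":
    # Cluster sizes taken from the x-axis range of Figure 1; 1% is an arbitrary example value.
    for n in (8, 12, 18):
        print(n, collocation_risk(n, 0.01))
```

Under these assumptions the risk grows monotonically with the number of VMs, which is the tension with latency that the text describes: the same scaling action that lowers response time raises exposure.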


[Figure 2 diagram: State 1 at time t and State 2 at time t+1, with the transition labeled "+ 5 minutes". Each state is annotated with VM labels (v, w), a cost label (m), and percentage labels (x%, y%, k%, z%). The actions shown between the states are NO_OP, REMOVE, and ADD. Legend entries: latency, VM type, data leakage violation, deployment cost, data loss, normal state, added cost.]

Figure 2. A conceptual model of an elastic application that considers both security and performance service-level agreement (SLA) requirements. Each state represents a specific configuration of the application in terms of number of VMs, along with its security- and performance-related properties, at a given point of time. The transition between states is due to elasticity actions.

Because of the inherent trade-offs among security and performance requirements, any solution to analyze and enforce security-aware horizontal scaling for cloud applications must be risk based. It should account for the probability of data leakage and data loss; dynamic evolution of the external environment and volatility of system behavior; and potential heterogeneity of the cloud infrastructure.

We advocate the use of a formal verification approach as a means to apply mathematical reasoning for providing security-oriented probabilistic guarantees for elastic cloud applications: specifically, probabilistic model checking [7] on top of system models in the form of Markov decision processes (MDPs), which are instantiated on the fly. Our technique can analyze and provide evidence for key security-related aspects of the running applications, for example, to answer questions such as, "What is the probability that there will be a data leakage in the next hour?" Moreover, it can drive elasticity decisions, taking into account security constraints (for example, given the current query load and the prediction for this load in the next half hour) to decide how many VMs to add or remove.

Using probabilistic model checking to analyze and drive elasticity decisions has shown promising initial results [4]. Its potential in addressing security requirements has also been demonstrated [8]. Most of the work in cloud security focuses on identifying risks, vulnerabilities, security mechanisms, digital signatures, access control, and ways to attain security assurance, such as monitoring, certificates, and auditability [9]. However, the detailed investigation of security assurance during horizontal scaling and, even more, the security-aware elasticity decision making that we enable here are novel in the fields of both cloud security [9, 10] and dynamic resource allocation in clouds [11, 12].

Modeling Elastic Applications
We advocate a model-based approach. Figure 2 shows a conceptual model of an elastic application that considers both security and performance SLA requirements and is deployed on a public cloud. Each curved rectangle represents a conceptual state at a specific time instant. We model the elastic application's evolution as a transition to a state at a future point in time t + Δt through elastic actions, such as adding or removing VMs.

For each state, we capture the features of interest for the analysis and decision making:

• the mixture of VM types employed, which we assume are of two different types and/or providers, v and w;
• the total deployment cost, m;
• the probability of data leakage, x;
• the probability of data loss, y;
• the probability of performance-related SLA condition violations, z; and
• the probability of no security threats or performance violations, k.

We consider all these probabilities statistically independent, so $k = (1 - x)(1 - y)(1 - z)$. Further, we can safely regard the probability of data leakage on a single machine of type A, $dl_A$, as independent of the number of VMs of type A employed. So, $x = 1 - (1 - dl_v)^v (1 - dl_w)^w$, that is, one minus the probability that none of the v + w VMs leaks data. Similarly, we can define y as a function of the number and type of VMs employed.

The system's evolution due to elasticity actions refers to discrete time intervals of period Δt. We consider three actions: add, remove, and no_change. In the generic case, the effects of actions at time t might be delayed and not manifested at the next period but after multiple time points. For example, adding a new VM to serve a NoSQL database implies that a new VM needs to be created, booted, and configured, and it needs to receive data, which might take more than Δt time. During this process, the system should be in a transient state.

MDP Implementation of the Conceptual Model
We implement our technique using the MDP modeling approach. We chose MDP because it enables analysis and decision making and can capture nondeterminism and uncertainty in a given system [13]. Both properties are essential in an elastic cloud application. Because of horizontal scaling, at each time point, the number of VMs can increase, remain the same, or decrease. This gives rise to nondeterminism. Also, at any given point, there might or might not be a performance or security-related requirement violation. This necessitates the modeling of uncertainty.

MDPs are specified using states, actions, probabilities, and rewards. The states represent system snapshots at specific time points, which are characterized by a set of system properties. The actions are transitions between the states, which express some change to the state properties. The probabilities refer to each triple (state s)-(action a)-(state s') and represent the probabilities of transition from one state to another due to a specific action, thus quantifying uncertainty. Finally, the rewards are used to perform quantitative analysis (or solution) of MDP models.

The conceptual model needs to be implemented as an MDP according to the analysis requirements. Because we need to explicitly consider different types of application behavior during analysis and elasticity decision making, we map each conceptual system state to multiple MDP model states, one for each behavior type. The behavior type is defined according to the application's nonfunctional requirements. Let an application set three requirements: to avoid data leakage, data loss, and latency above a user-specified threshold. Then, each combination of the binary variables that indicate the satisfaction of each requirement defines a behavior type (see Figure 3), which gives eight MDP states per conceptual state. Furthermore, each MDP state is annotated as to whether or not it refers to a transient state (not shown in the figure).

[Figure 3 diagram: conceptual State 1 (VM types v1 and w1, cost m1, probabilities x1, y1, z1, k1) mapped to MDP states reached with probabilities p0 to p7. Branch labels: data leakage / no data leakage, performance violation / no performance violation, data loss / no data loss.]

Figure 3. Mapping of a conceptual state to a set of Markov decision process (MDP) states. Each MDP state corresponds to a distinct behavior type.

The actions are the same as in the conceptual model. The only difference is that, if the MDP state is transient, only no_change is allowed, because making further resizing decisions during unstable periods is prone to suboptimal decision making. The next step is to define the transition probabilities. Figure 4 gives a complementary view of Figure 3, where each path from s to s' corresponds to an MDP model state. The probabilities p0 to p7 in Figure 3 are the product of the probabilities in Figure 4 along the corresponding path. For example, $p_7 = (1 - x)(1 - y)(1 - z)$. In general, for a given initial state s and action a,

$$\sum_{s'} p(s, a, s') = 1,$$

and multiple actions can be plausible for each state.

MDP Instantiation
To serve online analysis (which we discuss later in detail), the model is instantiated on the fly. To this end, the decision depth and the probabilities take actual values. Decision depth refers to how many periods the model can account for. If the depth is too small, the system becomes too short-sighted; if it's too big, the prediction uncertainty increases. Both situations lead to suboptimal decisions. Clearly, the number of MDP states grows exponentially in the number of Δt periods. If a period represents 5 minutes of real time, a model with depth set to 4 refers to a scenario where elasticity is reassessed every 5 minutes, and the model looks ahead for 20 minutes. If the system evolves less rapidly, the application manager could map each period to a longer time. In general, setting Δt appropriately relies heavily on the application environment's volatility (for example, if during night hours the workload remains roughly stable, Δt can be increased).

In our approach, we derive the probabilities x, y, and z through logs. We analyze past log entries referring to the same mixture of VM types as the state of interest to estimate the probabilities. For performance-related metrics, such as latency, we consider not only the number of VMs but also the external load of incoming requests. This implies the need to add a load prediction component, as Figure 4 shows. In general, for our approach to be applicable, we assume that a security analysis and profiling mechanism is in place, which is capable of deriving attack probabilities as a function of the cluster configuration. Such a mechanism is orthogonal to our approach and can be even more sophisticated. For example, upon instantiation, it could account for whether any VM additions would involve the use of a new physical machine rather than deploying VMs on one that's already in use. Finally, since the models are instantiated on the fly, the attack probabilities can be dynamically refined.

[Figure 4 diagram: logs and a load prediction feed the branch probabilities. From an initial state and an elastic action, a data loss check, a data leakage check, and a performance violation check lead to the next state; each check branches on violation or no violation.]

Figure 4. Setting MDP model probabilities for state transitions based on past log entries referring to the same mixture of VM types and prediction of future external load.

Online Analysis and Decision Making
The analysis is based on verification of models instantiated on demand. To this end, we couple the probabilistic model with Probabilistic Computation Tree Logic (PCTL), a probabilistic property specification language that's fed to the Prism probabilistic model checker [14]. We also show how the analysis can directly support decision making with regard to elasticity.

Prism can efficiently analyze complex models. We have been able to solve MDP models with 9,958 states corresponding to four periods in 0.073 seconds using a machine with a quad-core CPU and 8 Gbytes of RAM, while the program is reported to have processed models of up to $10^{11}$ states on a single machine [14].

Examples of Verified Analysis Using Prism
Figure 5 gives a concise view of the analysis of two PCTL properties. For simplicity, we grouped the eight states of Figure 3 in two groups according to the data leakage property. Further, we assume that

the decision depth is set to 2 and the probability of data leakage in a time interval, x, is 0.02. Also, in this toy example, we allow only one action (for example, no_change); that is, there's no nondeterminism. Obviously, in a more complete example with nondeterminism including all elasticity actions, multiple such paths are eligible.

The first PCTL property (in green) answers the question "What is the probability of having no data leakage incident?" The corresponding PCTL is expressed as P = ? [G !data_leakage], and is satisfied by the green path in the model, while the returned probability is 0.98 × 0.98 = 0.9604.

The second PCTL property (in red) answers the question "What is the probability of eventually (that is, in the final state in the verification) having data leakage?" It's expressed as P = ? [F data_leakage & steps = max_steps] and returns the cumulative probability of the red transitions, which is 0.02 × 0.02 + 0.98 × 0.02 = 0.02.

[Figure 5 diagram: a two-step tree starting at time t, with branch probabilities 0.02 (data leakage) and 0.98 (no data leakage) at each step. Green path: P = ? [G !data_leakage]. Red transitions: P = ? [F data_leakage & steps = max_steps].]

Figure 5. An example of probabilistic model checking analyzing two properties of the application with regard to data leakage.

Table 1 presents more complex examples of risk-based security analysis, where any of the three actions is allowed. Thus, during analysis, we investigate different sequences of actions. Formally, these sequences are called adversaries, policies, or strategies. The results of each analysis can include multiple adversaries. The PCTL statements are mostly self-explanatory. In the statements, U defines which state needs to follow the state on the left, F stands for eventually reaching a state, G stands for a condition that needs to hold throughout the policy, and X for the next state. We refer to the decision depth as max_steps.

The examples in Table 1 fall into three categories. The first two rows refer to analyses where the outcome is a probability of reaching certain model states. The next two examples return a Boolean value indicating whether a specific property holds. The last two PCTL properties are numerical multiobjective ones, as they ask for the maximum possible probability of reaching a state under the condition that the probability of reaching another state is bounded by a given threshold. The objectives can refer to a single property, like data loss in the fifth example, or multiple ones, like data leakage, data loss, and monetary cost in the last example.
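The two Figure 5 values can be reproduced by hand. The following is a minimal Python sketch, not the Prism model itself: it assumes a single action, a leakage probability x that is the same in every period and independent of the previous state, and an initial state without leakage. The function names and the per-VM probabilities in the usage note are hypothetical.

```python
def leak_probability(dl_v, v, dl_w, w):
    """Per-period leakage probability x = 1 - (1 - dl_v)^v (1 - dl_w)^w."""
    return 1.0 - (1.0 - dl_v) ** v * (1.0 - dl_w) ** w


def prob_never_leak(x, depth):
    """Counterpart of P=? [G !data_leakage]: no leakage in any of `depth` periods."""
    return (1.0 - x) ** depth


def prob_leak_in_final_state(x, depth):
    """Counterpart of P=? [F data_leakage & steps=max_steps].

    Propagates the probability mass of the two state groups
    (leakage / no leakage) forward, one period at a time.
    """
    p_leak, p_ok = 0.0, 1.0  # assumption: the initial state has no leakage
    for _ in range(depth):
        total = p_leak + p_ok
        p_leak, p_ok = total * x, total * (1.0 - x)
    return p_leak


if __name__ == "__main__":
    x = 0.02
    print(prob_never_leak(x, 2))           # 0.98 * 0.98
    print(prob_leak_in_final_state(x, 2))  # 0.02 * 0.02 + 0.98 * 0.02
```

As a usage note, `leak_probability(0.01, 1, 0.01, 1)` evaluates the expression for x given earlier for one VM of each type with a hypothetical per-VM leakage probability of 0.01.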

Table 1. Additional examples of analyses enabled.

Analysis goal | Probabilistic computation tree logic (PCTL) property

What is the maximum probability (among all possible adversaries) of experiencing a data loss incident until eventually moving to a state with no data loss? | Pmax = ? [data leakage U !data leakage]

What is the maximum probability (among all possible adversaries) of moving from a state with data leakage to a state with no data leakage? | Pmax = ? [data loss U F !data loss & steps = max_steps]

Starting from any reachable state, is it always possible (that is, is there at least one adversary) to eventually reach a state with no data leakage? | filter(exists, P >= 1 [F !data leakage & steps = max_steps])

Starting from a state with no data loss, do all adversaries eventually reach a state with no data loss? | filter(forall, P >= 1 [F data loss & steps = max_steps], !data loss)

What is the maximum probability of experiencing data loss in a state that immediately follows the initial state, while the probability to end up at a state with no data loss is greater than or equal to 0.9? | multi(Pmax = ? [X data loss], P >= 0.9 [F !data loss & steps = max_steps])

What is the maximum probability of having total cost of deployment less than a specified budget, while the probability of experiencing any security incident does not exceed 0.05? | multi(Pmax = ? [F total cost <= Budget & steps = max_steps], P <= 0.05 [G data leakage & data loss])
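To make "maximum probability among all possible adversaries" concrete, the sketch below computes the maximum and minimum probability of at least one data leakage within a bounded number of periods, using backward induction over a tiny MDP. It is an illustration under stated assumptions rather than the authors' Prism model: the per-period leakage probabilities in `LEAK` are invented, actions take effect immediately (no transient states), and the property is a plain bounded reachability query, not one of the exact Table 1 properties.

```python
from functools import lru_cache

# Hypothetical per-period leakage probability for each cluster size.
LEAK = {1: 0.01, 2: 0.02, 3: 0.03}
# The three elasticity actions and their effect on the number of VMs.
ACTIONS = {"remove": -1, "no_change": 0, "add": 1}


def extreme_leak_probability(vms, depth, pick=max):
    """Probability of at least one leakage within `depth` periods.

    pick=max gives the worst adversary (Pmax); pick=min gives the best (Pmin).
    """

    @lru_cache(maxsize=None)
    def value(n, steps_left):
        if steps_left == 0:
            return 0.0  # no periods left, so no further leakage can occur
        outcomes = []
        for delta in ACTIONS.values():
            nxt = n + delta
            if nxt not in LEAK:
                continue  # action not allowed at the cluster size limits
            x = LEAK[nxt]
            # Either leakage happens now, or it does not and we continue.
            outcomes.append(x + (1.0 - x) * value(nxt, steps_left - 1))
        return pick(outcomes)

    return value(vms, depth)


if __name__ == "__main__":
    print(extreme_leak_probability(2, 2, pick=max))
    print(extreme_leak_probability(2, 2, pick=min))
```

Starting from two VMs with depth 2, the worst adversary adds a VM and then keeps three VMs, while the best one removes a VM and then keeps one VM; the gap between the two values is what makes the choice of adversary matter.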


Decision Making
MDPs are inherently suited for decision making as well. To this end, each model state needs to be associated with a reward value. State rewards are computed using functions that quantify various aspects of the system like performance and security concerns or more concrete assets like the number of active VMs or the actual deployment cost. As an example, consider the following utility function that uses weights to balance three aspects: the normalized probability of data leakage ($\hat{p}_{dleak}$), data loss ($\hat{p}_{dloss}$), and the latency exceeding a threshold ($\hat{p}_{perf}$):

$$u(vmw) = a\,\hat{p}_{dleak} + b\,\hat{p}_{dloss} + c\,\hat{p}_{perf}, \qquad a + b + c = 1. \tag{1}$$

Note that, in general, we can prioritize threats and objectives. We reflect this on the utility function by assigning different values to the weights. In addition, our approach is orthogonal to any user-defined utility function.

Our decision-making proposal is based on the computation of the cumulative reward of every adversary. The model solver examines the possible alternatives (that is, all combinations of state transitions) and computes the optimal cumulative state reward along with the corresponding sequence of actions. For example, using this utility function, the optimal reward is the minimum one. In Prism, we can do this with the help of a different type of PCTL specification that asks for reward minimization rather than probabilities: R{cumulative reward}min = ? [F steps = max_steps].

Interestingly, several decision policies can be built on top of the aforementioned verification. For example, if multiple adversaries are returned with equal reward, we can use a second PCTL property on other aspects, such as the probability of security violation as presented in Table 1, to choose the final strategy.

Moreover, it isn't necessary to perform all actions in that strategy. An elasticity decision-making technique tailored to NoSQL databases that's presented elsewhere follows the steps mentioned earlier [4]. Each time the decision mechanism is activated and an adversary is selected, our proposal enacts only the first elasticity action and then reevaluates the whole adversary from scratch. Such an approach is in line with a wide range of adaptive solutions, such as model predictive control (MPC) [15], which computes a sequence of adaptations but only applies the first step. According to evaluation results published elsewhere [4], the quality of elasticity decisions outperforms other proposals for scaling NoSQL databases in avoiding both violations of latency thresholds and overprovisioning of VMs.

Figure 6 shows how a security-aware elasticity decision maker behaves in a setting similar to the one described in earlier work [4]. Although the load varies (green plot), the decision maker constantly reevaluates the number of VMs. The blue plot shows the number of VMs when considering only performance requirements, and the red plot shows the behavior when the state rewards are computed based on Equation 1. In the latter case, we limit the use of VMs to mitigate the threat of data leakage and loss.

[Figure 6 plot: "Adaptation of the no. of VMs to the incoming load variation." x-axis: time steps (30 seconds), 0 to 2,500; left y-axis: incoming load (requests/second), 6,000 to 14,000; right y-axis: no. of VMs, 8 to 18. Series: incoming load, no. of VMs (performance), no. of VMs (security and performance).]

Figure 6. Example of security-aware elasticity decisions. The number of VMs follows the incoming load. When security requirements are taken into consideration, the elasticity actions tend to be more conservative.

Our approach is of interest to both owners of elastic applications and cloud service providers. The outcomes of our proposal can be used either to analyze (elastic) behavior or to make elasticity decisions. Additionally, the analysis results can be used to fine-tune the utility function, acting as a feedback mechanism, so that decisions are good in practice.

Analysis and decision making in elastic applications is, by its nature, an instance of autonomic computing problems. A key issue for autonomic solutions is to render them dependable and endow them with a solid formal basis. To this end, probabilistic model checking not only allows for the continuous verification of system properties but is also an effective tool for meeting both security- and performance-oriented goals [4, 8].
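The reward-minimizing, first-action-only decision loop described above can be sketched in a few lines of Python. This is a simplified illustration, not the authors' implementation: it replaces the Prism solver with brute-force enumeration of action sequences, treats each state's reward as the weighted sum of Equation 1, and uses an invented `PROFILE` table that maps a VM count to hypothetical normalized probabilities of data leakage, data loss, and latency violation.

```python
from itertools import product

# Hypothetical normalized probabilities per VM count:
# (data leakage, data loss, latency violation).
PROFILE = {
    1: (0.01, 0.01, 0.60),
    2: (0.02, 0.02, 0.20),
    3: (0.03, 0.03, 0.05),
}
ACTIONS = {"remove": -1, "no_change": 0, "add": 1}


def utility(p_dleak, p_dloss, p_perf, weights):
    """Weighted sum in the spirit of Equation 1; the weights must sum to 1."""
    a, b, c = weights
    assert abs(a + b + c - 1.0) < 1e-9
    return a * p_dleak + b * p_dloss + c * p_perf


def plan(vms, depth, weights):
    """Return (cumulative reward, action sequence) with minimum cumulative reward."""
    best = None
    for seq in product(ACTIONS, repeat=depth):
        n, cost, valid = vms, 0.0, True
        for action in seq:
            n += ACTIONS[action]
            if n not in PROFILE:  # sequence leaves the allowed cluster sizes
                valid = False
                break
            cost += utility(*PROFILE[n], weights)
        if valid and (best is None or cost < best[0]):
            best = (cost, seq)
    return best


def next_action(vms, depth, weights):
    """Receding horizon: enact only the first action of the best sequence."""
    return plan(vms, depth, weights)[1][0]


if __name__ == "__main__":
    print(next_action(1, 2, (0.4, 0.3, 0.3)))    # performance matters: scale out
    print(next_action(3, 2, (0.5, 0.45, 0.05)))  # security dominates: scale in
```

With performance weighted at 0.3, the planner scales out from one VM; when the security weights dominate, it scales in from three VMs, which mirrors the more conservative behavior reported for Figure 6. After the first action is enacted, `next_action` would be called again with the updated VM count and refreshed probabilities.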



Although we focused on horizontal scaling with data leakage and loss as the most prominent security threats, our approach can also cover additional security threats affected by scaling. Moreover, it's applicable to both vertical elasticity and live migration. To this end, the envisaged models must be more fine-grained, considering VM configuration types and physical machines rather than only the number of VMs. The analysis of all elasticity types in combination with additional security threats is a challenging avenue for further research.

References
1. C. Kalloniatis et al., "Migrating into the Cloud: Identifying the Major Security and Privacy Concerns," Collaborative, Trusted and Privacy-Aware e/m-Services, Springer, 2013, pp. 73-87.
2. Cloud Security Alliance, The Notorious Nine: Cloud Computing Top Threats in 2013, report, 2013; https://cloudsecurityalliance.org/download/the-notorious-nine-cloud-computing-top-threats-in-2013.
3. F. Faniyi and R. Bahsoon, "A Systematic Review of Service-Level Management in the Cloud," ACM Computing Surveys, vol. 48, no. 3, 2015, article 43.
4. A. Naskos et al., "Dependable Horizontal Scaling Based on Probabilistic Model Checking," Proc. 15th IEEE/ACM Int'l Symp. Cluster, Cloud and Grid Computing (CCGrid), 2015, pp. 31-40.
5. S. Ramgovind, M.M. Eloff, and E. Smith, "The Management of Security in Cloud Computing," Proc. Information Security for South Africa (ISSA), 2010, pp. 1-7.
6. E. Casalicchio, D.A. Menascé, and A. Aldhalaan, "Autonomic Resource Provisioning in Cloud Systems with Availability Goals," Proc. 2013 ACM Cloud and Autonomic Computing Conf., 2013, article 1.
7. V. Forejt et al., "Automated Verification Techniques for Probabilistic Systems," Formal Methods for Eternal Networked Software Systems, LNCS 6659, 2011, pp. 53-113.
8. A. Naskos et al., "Security-Aware Elasticity for NoSQL Databases," Proc. 5th Int'l Conf. Model and Data Eng. (MEDI 15), 2015, pp. 181-197.
9. C.A. Ardagna et al., "From Security to Assurance in the Cloud: A Survey," ACM Computing Surveys, vol. 48, no. 1, 2015, article 2.
10. H. Mouratidis et al., "A Framework to Support Selection of Cloud Providers Based on Security and Privacy Requirements," J. Systems and Software, vol. 86, no. 9, 2013, pp. 2276-2293.
11. Z.A. Mann, "Allocation of Virtual Machines in Cloud Data Centers: A Survey of Problem Models and Optimization Algorithms," ACM Computing Surveys, vol. 48, no. 1, 2015, article 11.
12. S. Singh and I. Chana, "QoS-Aware Autonomic Resource Management in Cloud Computing: A Systematic Review," ACM Computing Surveys, vol. 48, no. 3, 2015, article 42.
13. M.L. Puterman, Markov Decision Processes: Discrete Stochastic Dynamic Programming, John Wiley & Sons, 1994.
14. M. Kwiatkowska, G. Norman, and D. Parker, "PRISM: Probabilistic Model Checking for Performance and Reliability Analysis," ACM SIGMETRICS Performance Evaluation Rev., vol. 36, no. 4, 2009, pp. 40-45.
15. J. Maciejowski, Predictive Control with Constraints, Prentice Hall, 2001.

Athanasios Naskos is a PhD candidate in the Department of Informatics of the Aristotle University of Thessaloniki, Greece. His research interests include cloud elasticity, distributed data management, and model checking. Naskos has an MSc in computer science from the Aristotle University of Thessaloniki. Contact him at anaskos@csd.auth.gr.

Anastasios Gounaris is an assistant professor in the Department of Informatics of the Aristotle University of Thessaloniki, Greece. His research interests include distributed data management, resource scheduling, autonomic computing, and adaptive query processing. Gounaris has a PhD in computer science from the University of Manchester. Contact him at gounaria@csd.auth.gr.

Haralambos Mouratidis is professor of software systems engineering at the School of Computing, Engineering, and Mathematics at the University of Brighton. His research interests include secure software systems engineering, requirements engineering, and information systems development. Mouratidis has a PhD in computer science from the University of Sheffield. Contact him at H.Mouratidis@brighton.ac.uk.

Panagiotis Katsaros is an assistant professor in the Department of Informatics at the Aristotle University of Thessaloniki, Greece. His research interests include formal analysis, model checking, and dependability and security. Katsaros has a PhD in computer science from the Aristotle University of Thessaloniki. Contact him at katsaros@csd.auth.gr.


Privacy-Preserving
Access to Big Data in
the Cloud
Peng Li, Song Guo, and Toshiaki Miyazaki, University of Aizu
Miao Xie and Jiankun Hu, University of New South Wales at the Australian
Defense Force Academy
Weihua Zhuang, University of Waterloo

Oblivious RAM aims to enable privacy-preserving access to data stored in the cloud. This article provides a tutorial about ORAM and proposes heuristic algorithms to achieve access load balancing in both static and dynamic deployments.

Published by the IEEE Computer Society. 2325-6095/16/$33.00 © 2016 IEEE

Big data has emerged in domains such as science, engineering, and commerce. Facebook, for example, currently stores more than 20 petabytes of photos, and this number grows by 60 terabytes each week [1]. In the big data era, clouds become a perfect candidate for data storage by providing virtually unlimited storage that can be accessed over the Internet. By outsourcing large volumes of data to cloud storage, such as Google Drive, Dropbox, and Amazon Simple Storage Service, users can simplify their data management and reduce data maintenance costs through the pay-as-you-use model.

Due to security and privacy concerns, however, some users and companies still hesitate to move their data to the cloud. Although encryption can protect data confidentiality, it's insufficient because access patterns can also leak important information. For instance, more than 80 percent of encrypted email queries can be identified by analyzing user access patterns [2].

In this article, we present the challenges of preserving privacy in cloud storage. To address these challenges, we apply oblivious RAM (ORAM), known to be the most effective solution for hiding user access patterns. We provide a tutorial about ORAM and survey recent efforts to increase the practicability of using it in cloud storage by reducing its overhead.

Consider distributed file systems built on hundreds or thousands of servers in a single or multiple geodistributed cloud sites. Applying an ORAM-based algorithm for privacy-preserving access can lead to serious access load imbalance among the storage servers. Therefore, in this article, we study a data-placement problem to achieve a load-balanced storage system with improved availability and responsiveness. Given this problem's NP-hardness, we propose a low-complexity algorithm that can deal with a large-scale problem with respect to big data. We conduct extensive simulations to show that the proposed algorithm finds results close to the optimal solution and significantly outperforms a random data-placement algorithm.

Preserving Privacy in Cloud Storage
Cloud storage is often managed and maintained by cloud storage providers in the form of services, and it comprises logical storage pools that interact directly with data, physical storage spanning multiple servers, and the physical environment. Users obtain storage capacity from providers and operate on their data through public APIs. Many well-known providers have begun to offer cloud storage services in recent years. As the best infrastructure to accommodate big data, cloud storage has attracted attention from both industry and academia. However, various security and privacy concerns arising from the nature of cloud storage are preventing users from subscribing to this service. These concerns include:

• Untrustworthiness. Users can operate on data remotely only through the APIs in cloud sites that span multiple locations and belong to untrusted third-party organizations.
• Dynamic environment. Users and cloud storage providers can dynamically change the services requested or offered, resulting in sensitive data moving frequently within a single organization or among multiple organizations and increasing the possibility of disclosing private information.
• Uncensored new services. Usually, cloud storage providers allow new services to be added with less control than in traditional multiorganizational scenarios. A new service might introduce security and privacy requirements at a different level and, in some cases, additional risk is imposed on users as this service might need to collect and store sensitive information.

The community is dedicated to improving the security and reliability of cloud storage, and many researchers have proposed techniques. The redundant array of inexpensive disks (RAID) technique, for example, is integrated in the high-availability and integrity layer (HAIL) [3], which manages remote file integrity and availability across a collection of servers. Alysson Bessani and colleagues proposed DepSky, a distributed storage system that integrates encryption, encoding, and replication [4]. IRIS, an authenticated file system for enterprises, stores data in the cloud with resilience against potentially untrusted cloud providers [5]. Several proposals deal with data availability by constructing distributed storage systems across several cloud sites. SPANStore is a key-value storage system that exports a unified view of storage services in geographically distributed datacenters [6]. It minimizes an application provider's cost by exploiting pricing discrepancies across providers, estimating application workload at the right granularity, and minimizing the use of computational resources.

Although security and privacy can be guaranteed to some extent using the aforementioned techniques, protection of access privacy is still a gap to be filled in the context of cloud storage. Mohammad


Islam and his colleagues prove that most of the protocols proposed for privacy-preserving cloud storage will leak access patterns due to efficiency issues [2]. A deliberately designed attack that exploits access pattern leakage can disclose a significant amount of sensitive information, such as the identification of encrypted email queries. The private information retrieval (PIR) technique addresses the access privacy problem by letting users retrieve a block from a database of N items held by a server that learns nothing about this block [7]. Unfortunately, Radu Sion and Bogdan Carbunar showed that existing PIR techniques will never be more efficient than a trivial PIR technique of downloading the entire database [8]. PIR's extremely poor performance makes it inapplicable in cloud storage with big data.

ORAM provides data access privacy by periodically reshuffling data blocks stored in an untrusted server so user access can't be tracked. Michael Goodrich and Michael Mitzenmacher proposed an ORAM algorithm with $O(\sqrt{N})$ client storage to achieve $O(\log N)$ amortized cost, that is, each oblivious read or write leads to $O(\log N)$ data access operations on average [9]. Elaine Shi and her colleagues further reduced the client storage to $O(1)$ [10]. Later, Tarik Moataz and his colleagues proposed replacing homomorphic eviction with a new and much cheaper permute-and-merge eviction, so the block size can be reduced to $\Omega(\log^4 N)$ while maintaining $O(1)$ complexity [11].

Jonathan Dautrich and his colleagues combined PIR techniques with the most bandwidth-efficient existing ORAM to reduce bandwidth cost [12]. Chang Liu and his colleagues developed ObliVM, a programming framework for secure computation, and demonstrated it on various applications (for example, data mining, streaming algorithms, and graph algorithms) [13]. Xiangyao Yu and his colleagues proposed and evaluated PrORAM, a dynamic ORAM prefetching technique [14]. A heuristic compact ORAM design, called SCORAM, is optimized for secure computation protocols. SCORAM is almost 10 times smaller in circuit size and faster than all other designs, so it's feasible to perform secure computations on gigabyte-sized datasets [15]. Christopher Fletcher and his colleagues proposed a new ORAM structure, the PosMap lookaside buffer (PLB) and

Oblivious RAM
As Oded Goldreich and Rafail Ostrovsky originally proposed, ORAM allows a trusted processor to use untrusted RAM [18, 19]. Most existing ORAM solutions use the basic memory structure suggested by Ostrovsky's hierarchical scheme [19]. The ORAM is arranged in a series of progressively larger caches. Each cache consists of a hash table of buckets. When a block is requested, the algorithm checks a bucket at each level of the hierarchy. If it finds the block, it continues searching for a dummy block, thus hiding the desired block's location. Finally, the algorithm reinserts the block into the top-level cache. When a cache is close to overflowing, it's obliviously shuffled into the cache below.

Recent ORAM works include optimizations of the classic hierarchical scheme, such as the use of cuckoo hashing and Bloom filters [20]. Peter Williams and Radu Sion proposed SR-ORAM, the first single-round-trip polylogarithmic time ORAM, which requires only logarithmic client storage [21]. Taking only a single round trip to perform a query, SR-ORAM has an online communication/computation cost of $O(\log n \log \log n)$. Jacob Lorch and his colleagues proposed Shroud, a general storage system that hides data access patterns from the servers [22]. Shroud uses many secure coprocessors acting in parallel as client proxies in the datacenter. Circuit ORAM, a new tree-based ORAM scheme, achieves optimal circuit size both in theory and in practice for realistic choices of block sizes [23].

To define privacy, we denote a data access sequence as $A = (op_1, u_1, data_1), (op_2, u_2, data_2), \ldots$, where $op_i$ is the read or write operation, $u_i$ is the data address, and $data_i$ denotes the data contents. Given two data access sequences $A$ and $A'$, a cloud storage is defined to be privacy-preserved if its access patterns can't be distinguished within polynomial time.

Next, we present the ORAM algorithm that we apply to the cloud storage scenario.

We consider a client that wants to store and retrieve data in a cloud; the cloud is honest but curious, that is, it can't tamper with or modify the data, but it can learn information about the data. We divide the data into blocks, each of which is identified by a unique address. For example, a typical block size value is 64 or 256 Kbytes. Data stored on the
PosMap compression techniques, that empirically cloud is organized as a tree, where each node, or
reduces the performance overhead from recursive bucket, stores several data blocks. Figure 1 shows an
ORAM.16 Finally, researchers developed a novel fork example binary tree structure. Note that any arbi-
path ORAM scheme that supports redundant mem- trary tree structure is applicable in ORAM. Follow-
ory accesses by leveraging three optimization tech- ing previous work,10 we translate each read or write
niques: path merging, ORAM request scheduling, operation into two primitives, ReadAndRemove and
and merging-aware caching.17 Add, which are defined as follows:



ReadAndRemove(u): given an address u specified by the client, the cloud returns the corresponding data block and removes it from storage.
Add(u, d): the client writes block d to address u in the cloud storage.

With the two primitives, each read(u) operation can be replaced by a ReadAndRemove(u) followed by an Add(u, d) that writes the same data block back to address u. Similarly, to implement a write(u, d) operation, we conduct a redundant ReadAndRemove(u) before Add(u, d). Although this doubles the number of access operations, it prevents the untrusted cloud from distinguishing reads from writes.
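The translation of read and write into the two primitives can be sketched as follows. This is a minimal illustration, not the authors' implementation: `LoggingStore` is a hypothetical in-memory stand-in for the cloud, and it ignores the tree layout, encryption, and eviction that a real ORAM needs. It only shows that both operations produce the same sequence of primitives on the server side.

```python
class LoggingStore:
    """Hypothetical stand-in for the untrusted cloud.

    It keeps blocks in a dict and records the name of every primitive
    it is asked to execute, which is what the cloud gets to observe here.
    """

    def __init__(self):
        self.blocks = {}
        self.log = []

    def read_and_remove(self, u):
        # Return the block stored at address u and delete it from storage.
        self.log.append("ReadAndRemove")
        return self.blocks.pop(u, None)

    def add(self, u, d):
        # Store block d under address u.
        self.log.append("Add")
        self.blocks[u] = d


def oram_read(store, u):
    """read(u): fetch the block, then write the same block back."""
    d = store.read_and_remove(u)
    store.add(u, d)
    return d


def oram_write(store, u, d):
    """write(u, d): a redundant fetch followed by the real write."""
    store.read_and_remove(u)
    store.add(u, d)
```

After one `oram_write` and one `oram_read` on the same store, the log holds two identical `ReadAndRemove`, `Add` pairs, so the type of operation cannot be read off the primitive sequence.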
The implementation of ReadAndRemove(u) and Add(u, d) is critical for hiding access patterns in ORAM. When a data block is written into the cloud storage, it's always inserted into the root bucket in level 0, as Figure 1 shows. As more data blocks are added to the root bucket, it will eventually be full, without residual capacity to accommodate new blocks. To avoid overflowing, data blocks in each nonleaf bucket are periodically evicted to its children buckets. We assign a random number, a designator, to each newly added data block to indicate which leaf bucket it's evicted to along the tree. Only the client knows the mapping between the block address and its associated designator. At each level of the tree, the client randomly chooses several buckets to evict. To prevent the cloud from tracking the eviction process, dummy blocks are inserted into other children buckets that don't receive the real data block.

Figure 1. Oblivious RAM (ORAM)-based cloud storage. (A tree of buckets whose nodes are partitioned among servers A, B, and C.)

To read a data block, the client first looks up its corresponding designator in local storage, and then reads all buckets along the path between the root and the leaf bucket indicated by this designator. When it finds the data block, the algorithm removes it from its current bucket and writes it back to the root bucket with a new designator. This way, the cloud can't infer which block is read because repeated reads for the same block will produce different lookup paths through the tree.

It's easy to see that the algorithm guarantees an independent and random access to the cloud storage for each read or write operation. Furthermore, the background eviction process generates a partition access sequence independent of the data access pattern. By following an analysis similar to that of Shi and her colleagues,[10] we can prove that the algorithm can preserve access privacy.

Load Balancing of ORAM Deployment
To deploy an ORAM-based storage in a distributed system, we need to partition the corresponding tree structure into multiple parts, each of which is stored in a server. For example, we consider storing the ORAM tree shown in Figure 1 in three servers, each of which can accommodate five buckets at most. Figure 1 shows a partition scheme. Since the root bucket is accessed in each read and write operation, server A, which holds the root bucket, has the highest access load. On the other hand, each read operation involves only one leaf node, leading to the lowest load on server B, which stores five leaf nodes in level 3. As this example illustrates, ORAM-based storage would lead to a seriously unbalanced data access load among servers without proper bucket placement, which motivates us to develop an algorithm to optimize data placement.

Load Balancing for Static Deployment
We consider storing the data of multiple users on m storage servers. We organize each user's data as an ORAM tree. In total, we have n buckets of size B. The users generate a set of access requests that are translated into a series of ReadAndRemove(u) and Add(u, d) operations. Each bucket is the minimum storage unit, and is associated with an access rate a_i due to the read, write, and eviction operations in ORAM. We can estimate each bucket's access rate according to the characteristics of the ORAM algorithm, such as tree structure and eviction probability. The jth server can accommodate at most C_j buckets. Based on the system model, our load balance problem can be described as a min-max problem.

Definition 1: The problem of load balance for deploying ORAM-based storage in clouds (LBOC). Given a tree-based ORAM structure and a set of storage


servers, the LBOC problem seeks a data placement that minimizes the maximum access load among all servers.

Since a bucket is the minimum access unit in ORAM, we define binary variables x_{ij} to describe bucket placement, given by

  x_{ij} = \begin{cases} 1, & \text{if the $i$th bucket is placed on the $j$th server,} \\ 0, & \text{otherwise.} \end{cases}

Since each bucket has to be placed on exactly one server, we have the following constraint:

  \sum_{j=1}^{m} x_{ij} = 1, \quad 1 \le i \le n. \qquad (1)

The total size of buckets deployed to a server cannot exceed its capacity; we represent this as

  \sum_{i=1}^{n} x_{ij} \le C_j, \quad 1 \le j \le m. \qquad (2)

We define another variable, y_j, to denote the total access rate to the jth server. It can be calculated by

  y_j = \sum_{i=1}^{n} a_i x_{ij}, \quad 1 \le j \le m. \qquad (3)

The maximum access rate of all servers is constrained by Y, giving us

  y_j \le Y, \quad 1 \le j \le m. \qquad (4)

By summarizing these constraints, we formulate the LBOC problem as a mixed integer linear programming (MILP) problem, given by

  LBOC: \min Y
  subject to (1), (2), (3), and (4),
  x_{ij} \in \{0, 1\}, \quad 1 \le i \le n, \ 1 \le j \le m.

Theorem 1: The LBOC problem is NP-hard.
Proof: We can prove the NP-hardness of the LBOC problem by reduction from the well-known 2-partition problem. We complete the proof by following a process similar to that presented elsewhere.[24]

To solve the LBOC problem, we propose a fast heuristic algorithm whose basic idea is to first solve the MILP problem formulated by relaxing all integer variables, and then find a feasible integer solution by rounding the results. However, we're dealing with big data, and the corresponding formulation might contain a large number of variables and constraints because of too many buckets in the ORAM tree. It would be time-consuming to solve such a large-scale linear programming problem. The basic idea of our proposed algorithm is to iteratively place buckets on servers. In each iteration, we only need to deal with a small-scale linear programming problem. Figure 2 shows the flowchart for the algorithm.

In the beginning, we initialize the residual capacity C_j^{res} of the jth server to C_j and its current access load y_j^{curr} to zero. In each iteration of the following while loop, we conduct bucket placement by solving a linear programming problem with respect to the bucket set N, the current access load, and C_j^{res} on each server. The linear programming problem is the same as the LBOC formulation except that x_{ij} is relaxed so it can be a real variable between 0 and 1, as shown in (9) in Figure 2. In addition, we consider the current access load y_j^{curr} in constraint (5), and constrain the capacity of each server with C_j^{res} in (8).

We then sort x_{ij} in descending order according to their values after solving this linear programming problem. For the ith bucket in set N, we place it on the server with the maximum value of x_{ij}, based on the expectation that a larger x_{ij} represents a higher probability of the corresponding optimal data placement. We finally update the values of C_j^{res} and y_j^{curr}, 1 \le j \le m, to finish this iteration.

  Start
  C_j^{res} = C_j, 1 \le j \le m
  y_j^{curr} = 0, 1 \le j \le m
  WHILE there are buckets that haven't been placed
    Put a set of unplaced buckets in set N
    Solve the following linear programming problem:
      \min Y
      y_j + y_j^{curr} \le Y, 1 \le j \le m; (5)
      y_j \ge \sum_{i \in N} a_i x_{ij}, 1 \le j \le m; (6)
      \sum_{j=1}^{m} x_{ij} = 1, \forall i \in N; (7)
      \sum_{i \in N} x_{ij} \le C_j^{res}, 1 \le j \le m; (8)
      0 \le x_{ij} \le 1, \forall i \in N, 1 \le j \le m; (9)
    Sort x_{ij} in a descending order
    FOR each x_{ij} in the order
      IF the ith bucket isn't placed and C_j^{res} > 0
        place this bucket on the jth server
        C_j^{res} = C_j^{res} - 1;
        y_j^{curr} = y_j^{curr} + y_j;
      END IF
  END WHILE
  END

Figure 2. The flowchart of our proposed algorithm.

Load Balancing for Dynamic Deployment
In practice, the data access rate might change with time. For example, some users intensively access their data during the daytime, but rarely connect to the cloud storage at night. Some companies retrieve their business data for statistical computation at night to minimize access conflict with normal business during the day. In addition, irregular accesses exist because of unpredictable user activities. We're motivated to develop an online load-balancing algorithm to deal with dynamic data accesses.

A straightforward approach is to divide the time into discrete time slots and re-execute the algorithm for static deployment in each time slot according to the current access rates. Although this approach can always guarantee load balance, it will incur frequent data movement that consumes a large portion of network bandwidth among storage servers. To address this challenge, we propose an online algorithm that dynamically adjusts data placement for a tradeoff between load balance and data movement among servers. In each time slot, we need to make two decisions. First, we must decide whether data rebalancing is needed. We define a rebalancing threshold and conduct rebalancing if the gap between the loads of the highest- and lowest-loaded servers is greater than this threshold. Otherwise, we keep the current data placement. Second, we must decide how to move data if load rebalancing is needed. Although we can use the algorithm for static deployment, it might incur a large amount of data movement because the optimization process ignores the current data placement. Instead, we propose a simple heuristic algorithm that greedily moves the highest loaded data buckets to the server with the lowest load. The data moving process terminates when the load gap between servers becomes less than the threshold.
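The greedy rebalancing step described above can be sketched as follows. This is one possible reading rather than the authors' code: the article does not say which server the moved bucket is taken from or how ties and full servers are handled, so the sketch assumes buckets are taken from the currently most loaded server, that only moves which strictly narrow the gap are allowed, and that the loop stops if the least loaded server has no free slot. The names `server_loads` and `rebalance` are hypothetical.

```python
def server_loads(placement, rates, num_servers):
    """Sum the access rates of the buckets placed on each server."""
    loads = [0.0] * num_servers
    for bucket, server in placement.items():
        loads[server] += rates[bucket]
    return loads


def rebalance(placement, rates, capacity, threshold):
    """Greedily move buckets until the load gap is within the threshold.

    placement: dict bucket -> server index (updated in place)
    rates:     dict bucket -> current access rate
    capacity:  list with the maximum number of buckets per server
    Returns the number of buckets moved (a proxy for traffic).
    """
    num_servers = len(capacity)
    moves = 0
    while True:
        loads = server_loads(placement, rates, num_servers)
        hi = max(range(num_servers), key=loads.__getitem__)
        lo = min(range(num_servers), key=loads.__getitem__)
        gap = loads[hi] - loads[lo]
        if gap <= threshold:
            break  # balanced enough: keep the current placement
        used = sum(1 for s in placement.values() if s == lo)
        if used >= capacity[lo]:
            break  # assumption: give up when the target server is full
        # Assumption: only consider buckets whose move strictly narrows
        # the gap (0 < rate < gap), which also guarantees termination.
        candidates = [b for b, s in placement.items()
                      if s == hi and 0 < rates[b] < gap]
        if not candidates:
            break
        bucket = max(candidates, key=rates.__getitem__)
        placement[bucket] = lo
        moves += 1
    return moves
```

Because the function starts from the existing placement and counts moves, the returned value gives a rough handle on the data movement that the threshold is meant to trade against load balance.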
Performance Evaluation
We conducted extensive simulations to evaluate our proposed algorithm's performance. For comparison under static deployment, we also show the optimal solution and the performance of an algorithm (RAND) that randomly allocates data buckets to servers. For dynamic deployment, we compare our proposed online algorithm with one that periodically updates the whole deployment using the optimal solution.

We first evaluate the performance of our proposed algorithm, ILB, by comparing its results with the optimal solution. We simulate allocating 200 data buckets to 10 servers, and show the results of 10 random instances in Figure 3. On average, ILB's access load is 1.15 times the optimal solution, whereas RAND's corresponding ratio is 1.91.

We then study the influence of the number of buckets by changing the value from 600 to 1,000.

Figure 3. Comparison with the optimal solution in 10 random instances (maximum access load per instance). Our proposed algorithm, ILB, obtains results close to the optimal solutions.

We average all results over 50 random instances with 10 servers. Each server's capacity is set as a Gaussian random variable with a mean of 100 and a variance of 20. In each ILB iteration, we consider data placement for 100 buckets. As Figure 4a shows, the access rate of both algorithms is an increasing function of the bucket number. Moreover, the performance gap between the two algorithms increases


as the number of buckets grows. For example, when the number of buckets is 600, the maximum access rate of RAND is 17 percent higher. The performance gap increases to 33 percent as the bucket number grows to 1,000. The results indicate that ILB can effectively reduce the maximum access rate.

We fix the mean value of server capacity at 100, and then study the effect of server capacity variance with 1,000 buckets and 10 servers. As Figure 4b shows, the maximum access rate of both algorithms increases with the variance, but their performance gap becomes smaller. We attribute this phenomenon to the fact that servers with small capacity quickly become full during data placement, and lots of buckets have to be accommodated in the servers with large capacity, leading to high access load.

Figure 4. Maximum access rate versus (a) different numbers of buckets and (b) different variance of server capacity. Our proposed algorithm, ILB, always outperforms the random algorithm.

Figure 5. Execution time (seconds) under different instance scales. Our proposed algorithm, ILB, is faster than ILB-S.

We investigate our proposed algorithm's time complexity by comparing its execution time with the one (denoted by ILB-S) that solves the whole problem using a single linear programming problem. As Figure 5 shows, ILB and ILB-S have a runtime of 0.14 seconds and 0.18 seconds, respectively, for 1,000 buckets and 10 servers, denoted by {1,000, 10}. In instances with the largest scale, that is, {4,000, 40}, ILB-S runs for 10.2 seconds, which is 3.4 times the ILB runtime.

To evaluate the performance of our proposed online algorithm for dynamic deployment, referred to as online_heuristic, we consider a network with 100 buckets and 10 servers, and randomly change each bucket's access rate. For comparison, we also show the performance of an alternative online algorithm, online_opt, which adjusts data placement according to the optimization framework. As Figure 6a shows, we increase the rebalancing threshold from 100 to 400, and compare the maximum access load of both algorithms. Although online_opt always outperforms our proposed heuristic algorithm, the performance gap is small. We also compare the traffic of both algorithms by increasing the threshold from 100 to 400. As Figure 6b shows, our proposed online_heuristic incurs significantly less network traffic than online_opt during data replacement, because online_opt conducts global optimization for load rebalancing and ignores the current data placement.

ORAM is promising in hiding access privacy in cloud storage, but there are still many open challenges in applying it in practice. We will continue studying to reduce the complexity of the ORAM



algorithm, and solving practical deployment issues in the near future.

Figure 6. Performance comparison of online_heuristic and online_opt: (a) maximum access load under different values of the threshold, and (b) traffic of data movement under different values of the threshold.

Acknowledgments
This work was supported by the Japan Society for the Promotion of Science KAKENHI grant number 16K16038 and the Council for Science, Technology and Innovation (CSTI), Cross-Ministerial Strategic Innovation Promotion Program (SIP), "Enhancement of Societal Resiliency against Natural Disasters" (Funding agency: JST). A part of this article was published in the proceedings of the 2014 IEEE Conference on Computer Communications Workshops.

References
1. D. Beaver et al., "Finding a Needle in Haystack: Facebook's Photo Storage," Proc. 9th USENIX Conf. Operating Systems Design and Implementation (OSDI), 2010, pp. 47–60.
2. M. Islam, M. Kuzu, and M. Kantarcioglu, "Access Pattern Disclosure on Searchable Encryption: Ramification, Attack, and Mitigation," Proc. Network and Distributed System Security Symp. (NDSS), 2012, pp. 1–15.
3. K.D. Bowers, A. Juels, and A. Oprea, "Hail: A High-Availability and Integrity Layer for Cloud Storage," Proc. 16th ACM Conf. Computer and Comm. Security, 2009, pp. 187–198.
4. A. Bessani et al., "DepSky: Dependable and Secure Storage in a Cloud-of-Clouds," Proc. 6th Conf. Computer Systems, 2011, pp. 31–46.
5. E. Stefanov et al., "Iris: A Scalable Cloud File System with Efficient Integrity Checks," Proc. 28th Ann. Computer Security Applications Conf., 2012, pp. 229–238.
6. Z. Wu et al., "SPANstore: Cost-Effective Geo-replicated Storage Spanning Multiple Cloud Services," Proc. ACM Symp. Operating Systems Principles, 2013, pp. 292–308.
7. B. Chor et al., "Private Information Retrieval," J. ACM, vol. 45, no. 6, 1998, pp. 965–981.
8. R. Sion and B. Carbunar, "On the Computational Practicality of Private Information Retrieval," Proc. Network and Distributed Systems Security Symp., 2007, pp. 1–10.
9. M.T. Goodrich and M. Mitzenmacher, "MapReduce Parallel Cuckoo Hashing and Oblivious RAM Simulations," CoRR, vol. abs/1007.1259, 2010.
10. E. Shi et al., "Oblivious RAM with O((log N)^3) Worst-Case Cost," Advances in Cryptology (ASIACRYPT 11), 2011, pp. 197–214.
11. T. Moataz, T. Mayberry, and E.-O. Blass, "Constant Communication ORAM with Small Blocksize," Proc. ACM SIGSAC Conf. Computer and Comm. Security, 2015, pp. 862–873.
12. J. Dautrich and C. Ravishankar, "Combining ORAM with PIR to Minimize Bandwidth Costs," Proc. 5th ACM Conf. Data and Application Security and Privacy (CODASPY), 2015, pp. 289–296.
13. C. Liu et al., "ObliVM: A Programming Framework for Secure Computation," Proc. IEEE Symp. Security and Privacy, 2015, pp. 359–376.
14. X. Yu et al., "PrORAM: Dynamic Prefetcher for Oblivious RAM," Proc. ACM/IEEE Ann. Int'l Symp. Computer Architecture (ISCA), 2015, pp. 616–628.
15. X.S. Wang et al., "SCORAM: Oblivious RAM for Secure Computation," Proc. ACM SIGSAC


Conf. Computer and Comm. Security, 2014, pp. 191–202.
16. C.W. Fletcher et al., "FreeCursive ORAM: [Nearly] Free Recursion and Integrity Verification for Position-Based Oblivious RAM," Proc. 20th Int'l Conf. Architectural Support for Programming Languages and Operating Systems, 2015, pp. 103–116.
17. X. Zhang et al., "Fork Path: Improving Efficiency of ORAM by Removing Redundant Memory Accesses," Proc. 48th Int'l Symp. Microarchitecture (MICRO), 2015, pp. 102–114.
18. O. Goldreich, "Towards a Theory of Software Protection and Simulation by Oblivious RAMs," Proc. ACM Symp. Theory of Computing, 1987, pp. 182–194.
19. R. Ostrovsky, "Efficient Computation on Oblivious RAMs," Proc. ACM 22nd Ann. ACM Symp. Theory of Computing (STOC), 1990, pp. 514–523.
20. O. Goldreich and R. Ostrovsky, "Software Protection and Simulation on Oblivious RAMs," J. ACM, vol. 43, no. 3, 1996, pp. 431–473.
21. P. Williams and R. Sion, "Single Round Access Privacy on Outsourced Storage," Proc. ACM Conf. Computer and Comm. Security (CCS), 2012, pp. 293–304.
22. J.R. Lorch et al., "Shroud: Ensuring Private Access to Large-Scale Data in the Data Centers," Proc. USENIX Conf. File and Storage Technologies (FAST), 2013, pp. 199–213.
23. X. Wang, H. Chan, and E. Shi, "Circuit ORAM: On Tightness of the Goldreich-Ostrovsky Lower Bound," Proc. ACM SIGSAC Conf. Computer and Comm. Security, 2015, pp. 850–861.
24. P. Li and S. Guo, "Load Balancing for Privacy-Preserving Access to Big Data in Cloud," Proc. IEEE Conf. Computer Comm. Workshops, 2014, pp. 524–528.

Peng Li is an associate professor in the School of Computer Science and Engineering at the University of Aizu, Japan. His research interests include wireless communication and networking, specifically wireless sensor networks, green and energy-efficient mobile networks, cross-layer optimization for wireless networks, cloud computing, big data processing, and smart grid. Li has a PhD in computer science and engineering from the University of Aizu, Japan. He's a member of IEEE. Contact him at pengli@u-aizu.ac.jp.

Song Guo is a full professor in the Department of Computing at the Hong Kong Polytechnic University. His research interests include cloud computing, big data, wireless networks, and cyberphysical systems. Guo has a PhD in computer science from the University of Ottawa. He's a senior member of IEEE, a senior member of ACM, and an IEEE Communications Society Distinguished Lecturer. Contact him at song.guo@polyu.edu.hk.

Toshiaki Miyazaki is a professor in the School of Computer Science and Engineering and the dean of the Undergraduate School of Computer Science and Engineering at the University of Aizu, Fukushima, Japan. His research interests include reconfigurable hardware systems, adaptive networking technologies, and autonomous systems. Miyazaki has a PhD in electronic engineering from the Tokyo Institute of Technology. He's a senior member of IEEE, the Institute of Electronics, Information, and Communication Engineers, and the Information Processing Society of Japan. Contact him at miyazaki@u-aizu.ac.jp.

Miao Xie is a PhD student in the School of Engineering and IT, University of New South Wales at the Australian Defence Force Academy. His research interests include intrusion/anomaly detection in wireless sensor networks, network security, data mining, and forecasting algorithms. Xie has a master's degree in engineering and IT from University of New South Wales. Contact him at m.xie@adfa.edu.au.

Jiankun Hu is a professor and research director at the Cyber Security Lab, School of Engineering and IT, University of New South Wales at the Australian Defence Force Academy. His research interests include cybersecurity, including biometrics security. Hu has a PhD in control engineering from the Harbin Institute of Technology, China. He's a member of the IEEE. Contact him at j.hu@adfa.edu.au.

Weihua Zhuang is a full professor in the Department of Electrical and Computer Engineering at the University of Waterloo, Canada. Her research interests include multimedia wireless communications, wireless networks, and radio positioning. Zhuang has a PhD in electrical engineering from the University of New Brunswick, Canada. Contact her at wzhuang@bbcr.uwaterloo.ca.



Cloud Security

Cryptographic Public Verification of Data Integrity for Cloud Storage Systems

Yuan Zhang, Chunxiang Xu, and Hongwei Li, University of Electronic Science and Technology of China

Xiaohui Liang, University of Massachusetts Boston

A public verification of data integrity scheme uses a random masking technique to protect against external adversaries. A performance analysis demonstrates that the proposed scheme is efficient in terms of user auditing overhead.

IEEE Cloud Computing, published by the IEEE Computer Society. 2325-6095/16/$33.00 © 2016 IEEE
Cloud storage services enable users to outsource their data to cloud servers and access that data remotely over the Internet. These services give users an efficient and flexible way to manage their data without deploying and maintaining local storage devices and services.[1–4] Specifically, users can process their data on their PCs, outsource the processed data to cloud servers, and use the data on other devices (for example, mobile phones). The great convenience provided by such services is leading to a growing number of cloud storage providers.[5,6]

However, despite the benefits brought by cloud storage services, critical security concerns in data outsourcing exist. One of the most important security concerns for users is data integrity, that is, whether their data remains intact on cloud servers.[7] A cloud service provider might hide data loss incidents to maintain its reputation[8] or discard data that's rarely accessed to save storage space, while claiming that no data loss has occurred. Moreover, an external adversary might distort users' data on cloud servers for financial or political reasons. Consequently, users require an efficient and secure verification method to ensure their data's integrity.[9]

Some schemes rely on users to perform the verification. That is, a user must actively and repeatedly engage in a process that has expensive communication and computation overhead, which assumes that the user always has a device with sufficient computation capability and Internet bandwidth to perform integrity verification. To reduce the verification burden on users, we propose a public verification paradigm in which an external and independent auditor periodically verifies data integrity on users' behalf.

Existing public verification schemes assume the auditor is honest and can't be corrupted. But this is a strong assumption, since auditors can be corrupted in practice. A malicious auditor can always claim that the outsourced data is (not) retained well in the cloud, regardless of the verification result, so the malicious auditor needn't even perform the verification. In addition, the vulnerability of existing schemes is further exacerbated by the fact that a malicious auditor can collude with the cloud server and generate a biased challenge message that checks only the data blocks that aren't corrupted, thus deceiving the user. Constructing an efficient public verification of data integrity for cloud storage against malicious auditors is of paramount importance.[10]

Some public verification schemes don't protect against external adversaries, so an active and online adversary can intrude into the cloud server, modify the outsourced data, tamper with the interaction messages between the cloud server and auditor, and pass the auditor's verification. (See the "Literature Review" sidebar for related work in this area.) A common approach to resisting such adversaries is to have the cloud server interact with the auditor using a secure channel. However, constructing a secure channel for each verification task is cumbersome.

We propose a public verification scheme in which we adopt a random masking technique instead of secure channels between cloud servers and auditors to resist external adversaries. In addition, users are able to examine auditors' behaviors to prevent malicious auditors from fabricating verification results. Furthermore, we use Bitcoin to construct an unbiased challenge message, which helps prevent malicious auditors from colluding with cloud servers. Security and performance analyses show that the proposed scheme can achieve security goals with efficiency.

Public Verification of Data Integrity
Traditional cryptographic primitives for protecting data security, such as message authentication code (MAC) and signature, can ensure data integrity. However, public verification schemes can't use these methods directly since they require an external auditor to possess the data to verify its integrity, which creates high communication overhead. Public key-based homomorphic linear authenticator (HLA) offers a practical and affordable solution to enable an auditor to verify the integrity of outsourced data without demanding a local copy of the data.

As Figure 1 shows, the public verification scheme consists of three entities: user, cloud server, and auditor. The user outsources data to the cloud server and later accesses the data as needed. After outsourcing the data, the user delegates the data integrity verification task to the auditor, a trusted entity with the expertise and capabilities to perform the verification. To verify the outsourced data's integrity, the auditor first generates a challenge message and issues it to the cloud server. With the challenge


data and tampers with the interaction messages between the cloud server and auditor to pass the verification.[9]
Resistance against malicious auditors. A secure public verification should resist malicious auditors, where a malicious auditor might fabricate a verification result to deceive the user and/or cloud server.[10]

These influencing factors aren't isolated and can be closely related; thus, public verification techniques should be evaluated from both cryptography and engineering perspectives.

Figure 1. System model. There are three entities in the public verification scheme: user, cloud server (cloud service provider), and auditor. (Labeled flows: dataflow, data verification delegation, and public data verification.)

Warm-up Scheme
message, the cloud server generates corresponding We first review the public verification scheme pro-
proof information and sends it to the auditor. After posed by Shacham and Waters (SWP),11 which in-
receiving this information, the auditor verifies the data volves a user U, cloud server C, and auditor A. SWP
integrity by checking the proof informations validity. If consists of five algorithms.
the verification fails, the auditor informs the user that
the data might be corrupted. Because we use HLA, Setup. With a security parameter , U determines the
the proof information generated by the cloud server bilinear map: e: G GGT.11 Then U chooses secret
doesnt include the data file, thus the communication parameters and generates the public parameters
overhead between the cloud server and auditor is low. (v = g , u1, , us), where g is the generator of the
multiplicative group G.
Basic Public Verification Scheme
Because public verification aims to enable a third- Store. User U transforms its data M into n blocks and
party auditor to efficiently and securely verify the further splits each block into s sectors. User U chooses
integrity of outsourced data, these schemes should a random element name for file naming and computes
be evaluated using both systems and crypto criteria. a file tag on name and the public parameters,
Systems criteria include which enables A to check the validity of the public
parameters used to check the data integrity. In other
Efficiency. A public verification scheme should be words, the validity of ensures that C cant deceive A
as efficient as possible in terms of communication by replacing the public parameters. Then, U gener-
and computation overhead. ates a tag i for each data block. The tag i is based
Boundless verification. A public verification on the BLS signature,12 which is the HLA, and allows
scheme should enable auditors to verify data multiple tags to be aggregated into a single tag, where
integrity without a priori bound on the number the size of the aggregated tag is independent of the
of verification interactions. number of tags to be aggregated, and a verifier can
Stateless auditor. Auditors should be stateless confirm the validity of the aggregated tag instead of
and shouldnt need to maintain and update state checking the tags one by one. Finally, U outsources
during verification. the data file, file tag, and i to C.

Crypto criteria include Audit. For each verification task, A first determines I
and randomly chooses vi, i I, where I is a random
Soundness. Any time a cloud server passes the subset of set {1, , n} to determine which data
auditors verification, it must possess the specified blocks should be verified, and vi is a random element
data intact. This should be provably secure under for each verification. Next, A sends the challenge
the security model proposed by Hovav Shacham message chal = {(i, vi)}(i I) to C.
and Brent Waters.11
Resistance against external adversaries. A Prove. After receiving chal, C verifies , and then sends
secure public verification scheme should resist
common attacks, where an external, active, vi
i ,j = v m ( j [1, s])
iI
i ij
iI
and online adversary modifies the outsourced



Literature Review
Public verification techniques let users outsource data integrity verification to a third-party auditor. Giuseppe Ateniese and his colleagues proposed the first public verification scheme.[1] Hovav Shacham and Brent Waters defined the first formal security model of public verification and proposed a public verification scheme with full proofs of security against arbitrary adversaries.[2] Various public verification schemes were later proposed based on this work.

Solomon Worku and his colleagues proposed a privacy-preserving public verification scheme.[3] However, they didn't consider external attacks in their threat model, where an external adversary that's active and online can modify the outsourced data and tamper with the interaction messages between the cloud server and auditor, thus invalidating the data integrity verification.[4] A common approach to resisting such an adversary is for the cloud server to send proof information to the auditor using a secure channel, which ensures the integrity and correctness of the interaction messages. However, constructing a secure channel between the cloud server and auditor for each verification task is cumbersome. Other work proposed the first privacy-preserving public verification scheme against external adversaries without secure channels.[5] Frederik Armknecht and his colleagues, considering that malicious auditors exist in practice, proposed the first framework to resist malicious auditors.[6] However, their work can't achieve public verification with acceptable communication overhead. The Secure Certificateless Public Verification scheme is the first public verification scheme to protect against malicious auditors with minimal communication overhead.[7]

References
1. G. Ateniese et al., "Provable Data Possession at Untrusted Stores," Proc. 2007 ACM SIGSAC Conf. Computer and Comm. Security (CCS 07), 2007, pp. 598-609.
2. H. Shacham and B. Waters, "Compact Proofs of Retrievability," J. Cryptology, vol. 26, no. 3, 2013, pp. 442-483.
3. S.G. Worku et al., "Secure and Efficient Privacy-Preserving Public Auditing Scheme for Cloud Storage," Computers and Electrical Eng., vol. 40, no. 5, 2014, pp. 1703-1713.
4. Y. Zhang et al., "Cryptanalysis of an Integrity Checking Scheme for Cloud Data Sharing," J. Information Security and Applications, vol. 23, Aug. 2015, pp. 68-73.
5. C. Xu et al., "An Efficient Provable Secure Public Auditing Scheme for Cloud Storage," KSII Trans. Internet and Information Systems, vol. 8, no. 11, 2014, pp. 4226-4241.
6. F. Armknecht et al., "Outsourced Proofs of Retrievability," Proc. 2014 ACM SIGSAC Conf. Computer and Comm. Security (CCS 14), 2014, pp. 831-843.
7. Y. Zhang et al., "SCLPV: Secure Certificateless Public Verification for Cloud-Based Cyber-Physical-Social Systems Against Malicious Auditors," IEEE Trans. Computational Social Systems, vol. 2, no. 4, 2015, pp. 159-170.

to A as the corresponding proof information. Observe that for each verification, $v_i$ is different, ensuring the proof information's freshness. By allowing multiple data blocks and signatures to be aggregated into a short tag (that is, $\mu_j$ and $\sigma = \prod_{i \in I} \sigma_i^{v_i}$), HLA guarantees minimal communication overhead.

Verify. Upon receiving the proof information, A verifies the data integrity by checking

$$e(\sigma, g) = e\Big(\prod_{i \in I} H(i \,\|\, name)^{v_i} \cdot \prod_{j=1}^{s} u_j^{\mu_j},\; v\Big),$$

where $H(\cdot)$ is a BLS hash.[12]

Vulnerability against External Adversaries
In SWP, the proof information generated by C is $\{\sigma, \mu_j\}_{j \in [1, s]}$. Observe that

$$\mu_j = \sum_{i \in I} v_i m_{ij}.$$

Suppose an external and active adversary intrudes into C and modifies each data block $m_{ij}$ to $m'_{ij} = m_{ij} + l_{ij}$ for $i \in [1, n]$, $j \in [1, s]$. The corresponding proof information at this point contains $\mu'_j$ in place of $\mu_j$ $(j \in [1, s])$, where


$$\mu'_j = \sum_{i \in I} v_i m'_{ij},$$

but $\sigma$ still corresponds to $m_{ij}$ rather than $m'_{ij}$ $(i \in I, j \in [1, s])$. To deceive A and pass the verification, the adversary can eavesdrop on the challenge message chal, intercept the proof information, and compute

$$\mu_j = \mu'_j - \sum_{(i, v_i) \in chal} v_i l_{ij} \quad (1 \le j \le s).$$

Finally, the adversary sends the modified proof information to A, and the modified data file passes the verification.

Vulnerability against Malicious Auditors
A corrupted auditor can deceive C and U in several ways.

First and simplest, a malicious auditor can claim that the outsourced data is (or is not) retained intact in the cloud, no matter what the verification result is; the malicious auditor need not even perform the verification. Because C and U trust the auditor, they will accept its claim without doubt.

Second, the malicious auditor can collude with C to deceive U. In this case, the outsourced data has been corrupted, but the auditor generates a biased challenge message that checks only the data blocks that aren't corrupted.

Third, the malicious auditor can collude with U to circumvent C. That is, the outsourced data is retained in good condition, but the auditor claims that it's been corrupted.

SWP can't protect against malicious auditors, so it must bear a strong assumption: auditors are honest and reliable. Resisting malicious auditors is thus a worthwhile area for further study.

Protecting against Malicious Auditors and External Adversaries
As discussed earlier, malicious auditors and external adversaries can invalidate the SWP technique. Most existing public verification schemes follow SWP, and thus have the same framework and threat model. Consequently, these schemes can't protect against external adversaries and malicious auditors.

An external adversary can invalidate SWP, since there's a definite linear relationship between the proof information ($\mu_j$) and the data blocks ($m_{ij}$). To resist external adversaries without secure channels, we adopt a random masking technique when computing the proof information. Specifically, we use random masking as a nonlinear disturbance code to change the definite linear relationship between the proof information and the data blocks to a nonlinear relationship.

To resist malicious auditors, the auditor's behavior (that is, whether the auditor performs the established verification) should be checked. In the enhanced scheme, the auditor is required to generate an entry for each verification task and store it in a log file. The user audits the auditor's behavior by checking the log file's validity, guaranteeing that a malicious auditor can't fabricate a verification result to deceive the user and/or cloud server. Here, we want to further emphasize that the periodicity of the user's audits of the auditor should be much longer than the periodicity of the auditor's verification of the data's integrity.

Table 1. The log file L used to verify data integrity.

Bitcoin hash    Authenticator     Random element    Data
$Bl_t^{(1)}$    $\sigma^{(1)}$    $R_j^{(1)}$       $\mu_j^{(1)}$
$Bl_t^{(2)}$    $\sigma^{(2)}$    $R_j^{(2)}$       $\mu_j^{(2)}$
...             ...               ...               ...
$Bl_t^{(k)}$    $\sigma^{(k)}$    $R_j^{(k)}$       $\mu_j^{(k)}$

However, such a paradigm can't deter malicious auditors perfectly, since a malicious auditor can still deceive the user by generating a biased challenge message, where the corrupted data blocks will never be checked. For security and efficiency reasons, it's also impractical to require the user to generate a new challenge message for each verification task. To address this problem, we use Bitcoin to construct the challenge message. Given a specific time t, if t is a past or current time, we can easily find the Bitcoin block generated nearest to t; however, if t is a future time, the Bitcoin block generated at t is unpredictable. Here, we denote the hash of the Bitcoin block generated at a past time t as $Bl_t$. Since Bitcoin has this property, we can consider Bitcoin as a time-based pseudorandomness source. We can compute this source's output when its input is a past or current time; otherwise, the output is unpredictable.

As Figure 2 shows, the enhanced scheme consists of six algorithms: Setup, Store, Audit, Prove, Verify, and CheckLog. In our enhanced scheme, the first two algorithms are the same as those in SWP.

In Audit, the auditor first acquires $Bl_t$ based on the current time t and initializes the pseudorandom bit generator using GetRand($Bl_t$). Then the auditor generates the challenge message $\{(i, v_i)\}_{i \in I}$ from the resulting pseudorandom values.



Figure 2. Execution steps of our scheme. Different algorithms are performed by different entities and the details are shown in the figure. [Panels: Setup (data owner): generate secret parameters; generate public parameters $\{v, u_1, \ldots, u_s\}$. Store (data owner): transform the data file to $M = \{m_{11}, \ldots, m_{1s}, \ldots, m_{ns}\}$; compute a file tag $\tau$ and n tags $\sigma_i$ for each $m_i$, $1 \le i \le n$; send $\{M, \sigma_i, \tau\}$ to the cloud server. Audit (auditor): based on $Bl_t$, generate a challenging message $\{i, v_i\}$. Prove (cloud server): verify the validity of $\tau$; randomly choose $r$; compute $R_j$, $\sigma$, $\mu_j$. Verify (auditor): verify the validity of $\{\sigma, \tau, \mu_j, R_j\}$; generate an entry $\{Bl_t^{(k)}, \sigma^{(k)}, \mu_j^{(k)}, R_j^{(k)}\}$; store the entry into a log file as shown in Table 1. CheckLog (data owner): check the validity of the log file.]

In Prove, C randomly chooses $r \in G$ as a secret parameter, and computes

$$R_j = u_j^{r}, \quad \sigma = \prod_{i \in I} \sigma_i^{v_i}, \quad \mu_j^{*} = \sum_{i \in I} v_i m_{ij}, \quad \text{and} \quad \mu_j = r^{-1}\big(\mu_j^{*} + h(R_j)\big) \quad (j \in [1, s]),$$

where $h(\cdot)$ is a BLS hash. Then, C sends $\{\sigma, \tau, \mu_j, R_j\}_{j \in [1, s]}$ to A.

In Verify, upon receiving the proof information $\{\sigma, \mu_j, R_j\}_{j \in [1, s]}$, the auditor verifies

$$e(\sigma, g) = e\Big(\prod_{i \in I} H(i \,\|\, name)^{v_i} \cdot \prod_{j=1}^{s} R_j^{\mu_j} \cdot \prod_{j=1}^{s} u_j^{-h(R_j)},\; v\Big).$$

If the verification holds, the auditor creates an entry as $(Bl_t, \sigma, \mu_j, R_j)_{j \in [1, s]}$. Finally, the auditor stores the entry in a log file L, as shown in Table 1 (in L, $j \in [1, s]$), where k denotes the index of the verification that the auditor performs. That is, $\{Bl_t^{(k)}, \sigma^{(k)}, R_j^{(k)}, \mu_j^{(k)}\}$ is the proof information of the kth verification.

In CheckLog, to check the validity of L, the user first picks a random subset B of indices of Bitcoin blocks and generates a set of challenge messages

$$I^{(B)} = \Big\{ \{i^{(1)}, v_i^{(1)}\}_{i \in I^{(1)}}, \ldots, \{i^{(b)}, v_i^{(b)}\}_{i \in I^{(b)}} \Big\},$$

where b is the size of the subset B. Then the user sends B to the auditor and receives $\sigma^{(B)}, R_j^{(B)}, \mu_j^{(B)}$ $(j \in [1, s])$, where

$$\sigma^{(B)} = \prod_{k \in B} \prod_{i \in I^{(k)}} \big(\sigma_i^{(k)}\big)^{v_i^{(k)}}.$$

Finally, the user audits the auditor by checking

$$e\big(\sigma^{(B)}, g\big) = e\Big(\prod_{k \in B} \prod_{i \in I^{(k)}} H\big(i^{(k)} \,\|\, name\big)^{v_i^{(k)}} \cdot \prod_{k \in B} \prod_{j=1}^{s} \big(R_j^{(k)}\big)^{\mu_j^{(k)}} \cdot \prod_{j=1}^{s} u_j^{-\sum_{k \in B} h(R_j^{(k)})},\; v\Big). \qquad (1)$$

If Equation 1 fails, the user can conclude that the cloud-stored data is corrupted, or that the auditor, C, or both are malicious.

Remark
Unlike some previous schemes,[4,7,8,13] we don't consider privacy protection of user data against the auditor. An auditor in our scheme could be compromised or could collude with the cloud server, which might reveal the user's data to the auditor. Exploiting data


encryption before outsourcing is an easy and affordable way to achieve such privacy protection.

Security and Performance Evaluation
We first analyze the security of the proposed scheme by using the crypto criteria proposed earlier.

With the proposed scheme, if the cloud server passes the auditor's verification, it ensures that the verified data is intact. This claim is provably secure under the Shacham and Waters security model,[11] and the formal proof is presented elsewhere.[8] Therefore, our proposed scheme achieves the soundness criterion.

Next, we show that the proposed scheme can resist an external adversary. The adversary first intrudes into the cloud server and modifies $m_{ij}$ to $m'_{ij} = m_{ij} + l_{ij}$. In Prove, the adversary intercepts the proof information $\{\sigma, \tau, \mu'_j, R_j\}_{j \in [1, s]}$, where

$$\mu'_j = r^{-1}\big(\mu_j^{\prime *} + h(R_j)\big) \quad \text{and} \quad \mu_j^{\prime *} = \sum_{i \in I} v_i m'_{ij}.$$

Since $r$ and $r^{-1}$ are unknown to the adversary, and

$$\mu'_j = r^{-1}\big(\mu_j^{\prime *} + h(R_j)\big) = r^{-1}\Big(\sum_{i \in I} v_i m'_{ij} + h(R_j)\Big) = r^{-1}\Big(\sum_{i \in I} v_i m_{ij} + h(R_j)\Big) + r^{-1} \sum_{i \in I} v_i l_{ij},$$

the adversary can't compute

$$r^{-1} \sum_{i \in I} v_i l_{ij}.$$

Thus, such an attack is computationally infeasible.

Finally, we show that the proposed scheme can protect against a malicious auditor. Because the proposed scheme meets the soundness criterion, the cloud server and malicious auditor can't forge a proof that passes the data integrity verification. In addition, the user will audit the auditor's behavior, and the auditor must execute the established verification. Furthermore, because the challenge message is determined by the time-based pseudorandom source (that is, Bitcoin) and can be recovered by the user, the auditor can't deceive the user by generating a biased challenge message. In other words, even if the auditor colludes with the cloud server, it can't deceive the user by only checking the uncorrupted data's integrity. Therefore, malicious auditors can't invalidate the proposed scheme. A formal and detailed proof is presented elsewhere.[10]

All in all, the proposed scheme meets the crypto criteria proposed earlier.

Next, we analyze the proposed scheme's performance in terms of communication and computation overhead. We ran all the experiments on a Windows 7 system with an Intel Core 2 i5 CPU running at 2.53 GHz with 2 Gbytes of DDR3 RAM (1.74 Gbytes available). We implemented all algorithms in C, and our code uses the MIRACL library version 5.6.1. We use an MNT elliptic curve[12] and a security level of 80 bits. The difference for choices on s is discussed elsewhere.[11] For simplicity, we give the atomic operation analysis for the case s = 1 in the following.

We first analyze communication overhead between the cloud server and auditor. In Audit and Prove, the auditor sends the challenge message to the cloud server, and the cloud server responds to the auditor with the proof information. The size of the challenge message is $c\,|i + v_i|$, where $|i|$ denotes the size of $i$. In our scheme, $i$ and $v_i$ are 80-bit random numbers under the 80-bit security parameter. The size of the proof information is $|\sigma| + |\tau| + |\mu_j| + |R_j|$, which is approximately equal to 80 bytes under the 80-bit security level. Therefore, the total communication cost between the auditor and cloud server is approximately equal to 90 bytes for each verification task.

Next, we analyze our scheme's computation overhead. Because the performance analysis on the auditor side of the proposed scheme is presented elsewhere,[8] we only analyze the additional cost on the user side to protect against the malicious auditor. In CheckLog, the user evaluates Equation 1 to audit the auditor's behavior, which is the only additional computational overhead for the user. Figure 3 shows the additional verification overhead on the user side for different numbers of challenge entries. As Figure 3 shows, the user in the proposed scheme can audit the auditor with high efficiency. In other words, compared with SWP, the proposed scheme requires higher verification costs on the user side, but this extra cost is exactly what guarantees resistance to the malicious auditor, so it is a worthwhile sacrifice.

We plan to focus our future research efforts in several areas.

Most existing public verification schemes are based on the public-key cryptosystem. In these schemes, even if the auditor is equipped with a powerful device, verification takes on the order of a second (hundreds of milliseconds) of computation. For these schemes, it would be impractical for the auditor to verify the data integrity using a low-power device. Reducing the computation of the operations in the public-key cryptography to those in the symmetric-key



cryptography while retaining the benefits of public verification is still an open issue and deserves further investigation.

Figure 3. Verification delay on the user side. This figure shows the overhead incurred when the user audits the auditor's behavior. [Plot: verification delay (seconds) versus number of entries to be audited (0 to 50), for challenge block numbers c = 300 and c = 460.]

In practice, an auditor always serves multiple cloud users and multiple cloud servers. If the auditor handles multiple verification tasks from different users and cloud servers one by one, the verification can incur a huge delay and become a bottleneck in applications. Therefore, enabling a single auditor to simultaneously handle multiple verification tasks from different users and different cloud servers with high efficiency is also worth further study.

With the development of cryptography, many new cryptographic primitives have been proposed, such as program obfuscation and structure-preserving cryptography. These new cryptographic primitives bring new appealing features and powerful functionalities not previously available. Combining current public verification techniques with these new and powerful cryptographic primitives could provide users with more feature-rich cloud storage services than ever before. This also remains an open research issue that should be further explored.

Acknowledgments
The National Natural Science Foundation of China under grants 61370203, 61472065, and 61350110238; the Science and Technology on Communication Security Laboratory Foundation under grant 9140C1103 01110C1103; the International Science and Technology Cooperation and Exchange Program of Sichuan Province, China, under grant 2014HH0029; and the China Postdoctoral Science Foundation under grant 2014M552336 supported this work.

References
1. S. Yu et al., "Can We Beat DDoS Attacks in Clouds?" IEEE Trans. Parallel and Distributed Systems, vol. 25, no. 9, 2014, pp. 2245-2254.
2. S. Yu, G. Wang, and W. Zhou, "Modeling Malicious Activities in Cyber Space," IEEE Network, vol. 29, no. 6, 2015, pp. 83-87.
3. S. Yu, S. Guo, and I. Stojmenovic, "Fool Me If You Can: Mimicking Attacks and Anti-Attacks in Cyberspace," IEEE Trans. Computers, vol. 64, no. 1, 2015, pp. 139-151.
4. C. Wang et al., "Privacy-Preserving Public Auditing for Secure Cloud Storage," IEEE Trans. Computers, vol. 62, no. 2, 2013, pp. 362-375.
5. H. Li et al., "Enabling Fine-Grained Multi-Keyword Search Supporting Classified Sub-Dictionaries over Encrypted Cloud Data," IEEE Trans. Dependable and Secure Computing, vol. 13, no. 3, 2016, pp. 312-325.
6. H. Li et al., "Engineering Searchable Encryption of Mobile Cloud Networks: When QoE Meets QoP," IEEE Wireless Comm., vol. 22, no. 4, 2015, pp. 74-80.
7. C. Wang et al., "Privacy-Preserving Public Auditing for Data Storage Security in Cloud Computing," Proc. 2010 IEEE Int'l Conf. Computer Comm. (INFOCOM 10), 2010, pp. 1-9.
8. C. Xu et al., "An Efficient Provable Secure Public Auditing Scheme for Cloud Storage," KSII Trans. Internet and Information Systems, vol. 8, no. 11, 2014, pp. 4226-4241.
9. Y. Zhang et al., "Cryptanalysis of an Integrity Checking Scheme for Cloud Data Sharing," J. Information Security and Applications, vol. 23, Aug. 2015, pp. 68-73.
10. Y. Zhang et al., "SCLPV: Secure Certificateless Public Verification for Cloud-Based Cyber-Physical-Social Systems Against Malicious Auditors," IEEE Trans. Computational Social Systems, vol. 2, no. 4, 2015, pp. 159-170.
11. H. Shacham and B. Waters, "Compact Proofs of Retrievability," J. Cryptology, vol. 26, no. 3, 2013, pp. 442-483.
12. D. Boneh, B. Lynn, and H. Shacham, "Short Signatures from the Weil Pairing," Proc. 7th Int'l Conf. Theory and Application of Cryptology and Information Security: Advances in Cryptology (ASIACRYPT 01), 2001, pp. 514-532.
13. S. Guadie Worku et al., "Secure and Efficient Privacy-Preserving Public Auditing Scheme for Cloud Storage," Computers and Electrical Eng., vol. 40, no. 5, 2014, pp. 1703-1713.


Yuan Zhang is a PhD student in computer science and engineering at the University of Electronic Science and Technology of China (UESTC), Chengdu. His research interests include cryptography, network security, and cloud computing security. Zhang has a BSc from UESTC. He's a student member of IEEE. Contact him at ZY_LoYe@126.com.

Chunxiang Xu is a professor of computer science and technology at the University of Electronic Science and Technology of China. Her research interests include information security, cloud computing security, and cryptography. Xu has a PhD from Xidian University. She's a member of IEEE. Contact her at chxxu@uestc.edu.cn.

Hongwei Li is an associate professor in the School of Computer Science and Engineering at the University of Electronic Science and Technology of China (UESTC). His research interests include cryptography and the secure smart grid. Li has a PhD in computer software and theory from UESTC. He's a member of IEEE, the China Computer Federation, and the China Association for Cryptologic Research. Contact him at hongweili@uestc.edu.cn.

Xiaohui Liang is an assistant professor in the Department of Computer Science at the University of Massachusetts, Boston. His research interests include applied cryptography, and security and privacy issues for e-healthcare systems, cloud computing, mobile social networks, and smart grids. Liang has a PhD in electrical and computer engineering from the University of Waterloo, Canada. He's a member of IEEE. Contact him at Xiaohui.Liang@umb.edu.

Cloud Security

To Docker or Not to Docker: A Security Perspective
Theo Combe, Telecom Paris-Tech
Antony Martin and Roberto Di Pietro, Nokia Bell Labs

Container solutions such as the popular Docker environment provide more flexibility than virtual machines and offer near-native performance in cloud-based infrastructures. However, Docker and its current usage scenarios entail security vulnerabilities that must be addressed.

Cloud computing is inherently rooted in virtualization technologies. Recently, new lightweight virtualization technologies such as containers have become increasingly popular and are nowadays an essential part of cloud offerings. Containers also tightly integrate into the host operating system, reducing the software overhead imposed by virtual machines (VMs).[1] However, this tighter integration also increases the attack surface, raising security concerns.

Existing work on container security focuses mainly on the relationship between the host and the container.[2-5] However, containers are now part of a complex, highly automated ecosystem that includes containers and various repositories and orchestrators. In particular, container solutions embed automated deployment chains[6] that are meant to speed up the code deployment processes. These chains are often composed of third-party elements running on different platforms provided by different providers, raising concerns about code integrity. This can cause multiple vulnerabilities that an adversary could exploit to penetrate the system.

To the best of our knowledge, container ecosystem security has yet to be fully investigated, despite being fundamental to container adoption. Here, we address that gap and focus our investigation on the Docker ecosystem for three reasons. First, Docker has become the reference in both the container market and the associated DevOps ecosystem. In particular, 92 percent of people surveyed by ClusterHQ and DevOps.com are using or planning to use Docker in a container solution.[7] Second, security is the first barrier to container adoption in the production environment,[7] Docker being no exception

IEEE Cloud Computing, published by the IEEE Computer Society. 2325-6095/16/$33.00 © 2016 IEEE


Figure 1. Comparing various application runtime models: (a) a type 1 hypervisor, (b) a type 2 hypervisor, and (c) a container. [Stacks, top to bottom: (a) app, libs, guest OS, VM, hypervisor, hardware; (b) app, libs, guest OS, VM, hypervisor, host OS, hardware; (c) app, libs, container, Docker daemon, host libraries, host OS, hardware.]

in this. Finally, Docker is already running in some environments, making it possible to run experiments and explore the practicality of some attacks.

Containerization and Dockerization in a Growing Ecosystem
Cloud applications have typically leveraged virtualization. However, several factors have triggered the need for a fast, easy-to-use way of pushing code into production: acceleration of the development cycle (such as agile methods and DevOps), an increasingly complex application stack (mostly Web services and their frameworks), and market pressure to densify applications on servers.

Linux Containers
Figure 1 shows how virtualization hypervisors (Figures 1a and 1b) compare to a container (Figure 1c), which provides near-bare-metal performance[1] and offers the possibility of seamlessly running multiple versions of applications on the same machine. New instances of containers can be created quasi-instantly to face a customer demand peak, which is convenient for spawning applications on demand or quickly moving a service, such as when implementing network function virtualization (NFV).

Containers have long existed in various forms that differ by the level of isolation they provide. For example, Berkeley Software Distribution (BSD) jails and chroot can be considered an early form of container technology. Recent Linux-based container solutions rely on kernel support (that is, a userspace library to provide an interface to syscalls and front-end applications). There are two main kernel implementations: Linux container (LXC) implementations using cgroups and namespaces, and the OpenVZ patch. Table 1 shows the most popular implementations and their dependencies.

Containers can be integrated in a multitenant environment, thus profiting from resource sharing to increase average hardware use. This is achieved by sharing the kernel with the host machine. Indeed, unlike VMs, containers don't embed their own kernel, but rather run directly on the host kernel. This shortens the syscall execution path by removing the guest kernel and the virtual hardware layer. Additionally, containers can share software resources (such as libraries) with the host, hence avoiding code duplication. The absence of a kernel and some system libraries makes containers very lightweight (image sizes can shrink to a few megabytes), which enables a quick boot process.

Docker
As Figure 2 shows, the Docker ecosystem includes various components. Docker provides a specification


Table 1. Container solutions.

Base | Container | Library | Kernel dependence | Other dependencies
Linux containers (LXC) | Docker | libcontainer | cgroups + namespaces + capabilities + kernel version 3.10 or above | iptables, perl, Apparmor, sqlite, Go
 | LXC | liblxc | cgroups + namespaces + capabilities | Go
 | LXD | liblxc | cgroups + namespaces + capabilities | LXC, Go
 | Rocket | AppContainer | cgroups + namespaces + capabilities + kernel version 3.8 or above | cpio, Go, squashfs, gpg
 | Warden | custom tools | cgroups + namespaces | debootstrap, rake
OpenVZ | OpenVZ | libCT | Patched kernel | Specific components: checkpoint/restore in userspace (CRIU), ploop, Virtuozzo container memory management daemon (VCMMD)
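
Table 1 ties two of the runtimes to a minimum kernel version. As a rough illustration of how such a requirement could be checked, the Python sketch below compares a kernel release string against those minimums. The names MIN_KERNEL, parse_release, and kernel_supports are hypothetical, and only the major and minor numbers of the release string are considered, which is a simplifying assumption.

```python
import re
from typing import Tuple

# Minimum kernel versions as listed in Table 1. Runtimes for which the
# table states no minimum are deliberately left out.
MIN_KERNEL = {"Docker": (3, 10), "Rocket": (3, 8)}


def parse_release(release: str) -> Tuple[int, int]:
    """Extract (major, minor) from a release string such as '3.10.0-327.el7'."""
    match = re.match(r"(\d+)\.(\d+)", release)
    if match is None:
        raise ValueError(f"unrecognized kernel release: {release!r}")
    return int(match.group(1)), int(match.group(2))


def kernel_supports(runtime: str, release: str) -> bool:
    """Return True if the kernel release meets the runtime's minimum in Table 1."""
    # Tuples compare element by element, so (3, 10) >= (3, 8) holds.
    return parse_release(release) >= MIN_KERNEL[runtime]
```

On a live Linux host, the release string could be obtained from the standard library call platform.release().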

for container images and runtime, including Dockerfiles that allow a reproducible building process (Figure 2a). Docker software implements this specification using the Docker daemon, known as the Docker engine. The repositories include a central repository, the Docker hub, which lets developers upload and share their images, along with a trademark and bindings with third-party applications (Figure 2b). Finally, the build process fetches code from external repositories and holds the packages that will be embedded in the images (Figure 2c). Docker is written in the Go language and was first released in March 2013.

Docker specification. The specification's scope is container images and runtime. Docker disk images are composed of a set of layers, along with metadata in the JavaScript Object Notation (JSON) format. The images are stored at /var/lib/docker/<driver>/, where <driver> stands for the storage driver being used, such as advanced multi-layered unification filesystem (aufs), B-tree file system (Btrfs), virtual filesystem switch (VFS), device mapper, or OverlayFS. Each layer contains the filesystem modifications relative to the previous layer, starting from a base image (typically, a lightweight Linux distribution). This layering organizes the images in trees; each image has a parent, except for the base images, which are the roots of the trees. This structure allows Docker to ship in an image only the modifications specifically related to it.
Docker can build images in two ways. It can launch a container from an existing image (docker run), perform modifications and installations inside the container, and then stop the container and save its state as a new image (docker commit). This process is close to the classical VM installation, but must be performed at each image rebuild (such as for updates); because the base image is standardized, the sequence of commands is exactly the same. To automate this process, Dockerfiles (Figure 2a) let users specify a base image and a sequence of commands to be performed to build the image, along with other options (such as exposed ports) specific to the image. The image is then built with the docker build command.

Docker internals. Docker containers create a wrapped, controlled environment on the host machine in which arbitrary code can be (ideally) run safely. This isolation is achieved through two main kernel features, kernel namespaces [8] and control groups (cgroups), that were merged starting from the Linux kernel version 2.6.24. Namespaces are used to split the view that processes have of the system. Currently, the kernel has six different namespaces (PID, IPC, NET, MNT, UTS, and USER) that isolate various aspects of the system. Each of these namespaces has its own kernel internal objects related to its type, and each gives processes a local instance of some paths in the /proc and /sys filesystems. The Linux namespaces' isolation role is detailed elsewhere [3]. The cgroups are a kernel mechanism to restrict the resource usage of a process or group of processes. Their goal is to prevent a process from taking all available resources and starving other processes and containers on the host. Controlled resources include CPU shares, RAM, network bandwidth, and disk I/O.

The Docker daemon. The Docker software runs as a daemon on the host machine. It can launch containers, control their isolation level (cgroups, namespaces, capabilities restrictions, and SELinux/Apparmor profiles), monitor them to trigger actions (such as restart), and spawn shells into running containers for administration purposes. The software can change iptables rules on the host and create network interfaces. It's also responsible for managing container images, including pulling and pushing images on a remote registry (such as the Docker hub), building images from Dockerfiles, and signing images. The daemon itself runs as root (with full capabilities) on the host and is remotely controlled through a Unix socket. Alternatively, the daemon can listen on a classical TCP socket.
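
The two control channels of the daemon (Unix socket or TCP socket) can be told apart from the endpoint string alone. The Python sketch below is only an illustration: the function classify_endpoint and its risk labels are hypothetical and not part of Docker. It assumes endpoints are written with the unix:// and tcp:// URL schemes, and that the caller already knows whether TLS verification is enabled.

```python
from urllib.parse import urlparse


def classify_endpoint(endpoint: str, tls_verify: bool = False) -> str:
    """Return a coarse, illustrative risk label for a daemon endpoint."""
    scheme = urlparse(endpoint).scheme
    if scheme == "unix":
        # A Unix socket is protected by file ownership and permissions.
        return "local socket: access governed by file permissions"
    if scheme == "tcp":
        if tls_verify:
            return "network socket: encrypted and authenticated"
        # The daemon runs as root, so an open TCP socket exposes root.
        return "network socket: any client that connects controls a root daemon"
    return "unknown endpoint type"
```

For example, classify_endpoint("unix:///var/run/docker.sock") reports a local socket, while a tcp:// endpoint without TLS verification is flagged as exposing the root daemon.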

The Docker hub. The Docker hub online repository lets developers upload their Docker images and lets users download them. Developers can sign up for a free account, in which all repositories are public, or for a paid account, which lets them create private repositories. Developer repositories are namespaced; that is, their name is developer/repository. Official repositories also exist, directly provided by Docker Inc.; these are called repository. The Docker daemon, hub, and repositories are similar to a package manager, with a local daemon installing software on both the host and the remote repositories. Some of these repositories are official, while others are unofficial and provided by third parties.
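
The naming rule described above (developer/repository for developer repositories, a bare repository name for official ones) can be sketched in a few lines of Python. The function classify_repository is a hypothetical helper, and the sketch assumes names carry no registry host and no tag, which real image references may include.

```python
from typing import Tuple


def classify_repository(name: str) -> Tuple[str, str, str]:
    """Return (kind, namespace, repository) for a Docker hub repository name."""
    if not name or name.startswith("/") or name.endswith("/") or name.count("/") > 1:
        raise ValueError(f"unsupported repository name: {name!r}")
    namespace, separator, repository = name.partition("/")
    if not separator:
        # No slash: the name follows the pattern used for official repositories.
        return ("official", "", name)
    # One slash: the part before it is the developer's namespace.
    return ("developer", namespace, repository)
```

With the made-up names "webapp" and "alice/webapp", the first is classified as official and the second as a developer repository in the namespace "alice".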

Figure 2. The Docker ecosystem. (a) Docker specifies container images and runtime, including Dockerfiles that enable a reproducible building process. (b) The Docker repositories. (c) The build process. Arrows show the code path and associated commands (docker <action>).

Docker Security Overview
Docker security relies on three factors: isolation of processes at the userspace level managed by the Docker daemon, enforcement of this isolation by the kernel, and network operations security.

Isolation
Docker containers rely exclusively on Linux kernel features, including namespaces, cgroups, hardening, and capabilities. Namespace isolation and capabilities drop are enabled by default, but cgroups limitations aren't; they must be enabled on a per-container basis through -a -c options on container launch.
The default isolation configuration is relatively strict. The only flaw is that all containers share the same network bridge, enabling Address Resolution Protocol (ARP) poisoning attacks between containers on the same host.
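
Because cgroups limits are opt-in, a launch command can be inspected for them before it is run. The Python sketch below is illustrative only: the helper missing_limits is hypothetical, and it checks just the memory (-m, --memory) and CPU shares (-c, --cpu-shares) flags of docker run, which is an assumption about which limits matter in a given deployment.

```python
import shlex
from typing import List

# Flags of `docker run` that set cgroups limits, keyed by the resource
# they control. Only two resources are covered in this sketch.
LIMIT_FLAGS = {
    "memory": ("-m", "--memory"),
    "cpu shares": ("-c", "--cpu-shares"),
}


def missing_limits(command: str) -> List[str]:
    """List the resources for which the command sets no cgroups limit."""
    tokens = shlex.split(command)
    missing = []
    for resource, flags in LIMIT_FLAGS.items():
        # Accept both "--flag value" and "--flag=value" spellings.
        present = any(
            token == flag or token.startswith(flag + "=")
            for token in tokens
            for flag in flags
        )
        if not present:
            missing.append(resource)
    return missing
```

This simple token scan does not distinguish Docker options from arguments passed to the containerized program (for example, sh -c), so it can report a limit as present when it is not; a stricter check would stop scanning at the image name.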
However, as we describe in more detail later, Docker's global security can be lowered by options, triggered at container launch, that give extended access on some parts of the host to containers. Additionally, security configuration can be set globally through options passed to the Docker daemon. This includes options lowering security, such as the insecure-registry option, which disables the Transport Layer Security (TLS) certificate check on a particular registry. Options that increase security, such as the icc=false parameter, which forbids


network communications between containers and mitigates the ARP poisoning attack, are available, but they prevent multicontainer applications from operating properly, and hence are rarely used.

Host Hardening
Host hardening through Linux kernel security modules enforces security-related constraints on containers (for example, to keep a compromised container from escaping to the host operating system). Currently SELinux, Apparmor, and Seccomp are supported with available default profiles. These profiles are generic, not restrictive. The docker-default Apparmor profile [9] (https://wikitech.wikimedia.org/wiki/Docker/apparmor), for example, allows full access to the filesystem, network, and all capabilities of Docker containers. Similarly, the default SELinux policy puts all Docker objects in the same domain. Therefore, while default hardening protects the host from containers, it doesn't protect containers from other containers. This security aspect can be addressed by writing specific profiles that depend individually on the containers.

Network Security
Docker uses network resources for image distribution and remote control of the Docker daemon.
To distribute images, Docker verifies images downloaded from a remote repository with a hash, and the connection to the registry is made over TLS (unless explicitly specified otherwise). Moreover, the Docker Content Trust architecture now lets developers sign their images before pushing them to a repository [10]. Content Trust relies on The Update Framework (TUF) [11], which was specifically designed to address package manager flaws [12]. TUF can recover from a key compromise, mitigate replay attacks by embedding expiration timestamps in signed images, and so on. The tradeoff is complex key management; TUF actually implements a public-key infrastructure in which each developer owns a root key (offline key) that is used to sign signing keys that are used to sign Docker images.
The Docker daemon is remote-controlled through a socket, making it possible to perform any Docker command from another host. By default, the socket used to control the daemon is a Unix socket, located at /var/run/docker.sock and owned by root:docker, but it can be changed to a TCP socket. Access to this socket lets attackers pull and run any container in privileged mode, thereby giving them root access to the host. In case of a Unix socket, a user member of the docker group can gain root privileges; when a TCP socket is used, any connection to this socket can give root privileges on the host. Therefore, the connection must be secured with TLS (tlsverify), which enables both encryption and authentication of the two sides of the connection (and requires additional certificate management).

Docker Usages: Security Challenges
Most of the security discussions about containers compare them to VMs, thus assuming both technologies are equivalent in terms of design. Although this is the aim of some container technologies (such as OpenVZ, which is used to spawn virtual private servers), recent lightweight container solutions such as Docker were designed to achieve completely different objectives than those of VMs. Therefore, it's important to describe Docker's typical usages to discuss their security implications and how they affect Docker's security.

Docker Usages
We can distinguish three types of Docker usages.
Recommended usages are those that Docker was designed for, as explained in the official documentation. Docker developers recommend a microservices approach [13]; that is, a container must host a single service, in a single process or in a daemon spawning children. Therefore, a Docker container isn't considered a VM: there's no package manager, no init process, no sshd to manage it. All administration tasks (container stop, restart, backups, updates, builds, and so on) must be performed via the host machine, which implies that the legitimate container's admin has root access to the host.
Docker developers also recommend a reproducible and automated deployment of applications. Docker images should be built anywhere through a generic build file (Dockerfile) which specifies the steps to build the image from a base image. This generic way of building images makes the process and the resulting images almost host-agnostic, depending only on the kernel and not on the installed libraries.
Widespread usages include common usages of Docker by application developers and system administrators. Some system administrators or developers use Docker as a way to ship complete virtual environments and update them regularly, turning their containers into VMs. Although this is convenient because it limits system administration tasks to the bare minimum (such as docker pull), as we describe later, it has several security implications. With containers embedding enough software to run a full system (logging daemon, ssh server, and even sometimes an init process), it's tempting to perform administration tasks from within the



container, which is completely opposed to Docker's design. Indeed, some of these administration tasks require root access on the container, while other administration actions (such as mounting a volume in a container) could require extra capabilities that Docker drops by default.
Platform as a service (PaaS) usages are guided by PaaS implementations to cope with security and infrastructure integration issues. As of the end of 2015, the main PaaSs integrated Docker. We focus here on Amazon Web Services and Google Container Engine, the two market leaders that we experimented on. Both solutions provide similar approaches: a VM or cluster of VMs is created, with an orchestrator tool available to manage the containers inside the VMs. In this model, the container's admin has full rights on the orchestrator. The microservices approach promoted by DevOps and Docker currently requires manual configuration to launch appropriate images on appropriate nodes. This task is automated by orchestrators that manage clusters of VMs, which themselves run on multiple physical hosts.

Adversary Model
Given the ecosystem and usages description, we consider two main categories of adversaries: direct and indirect.
Direct adversaries can sniff, block, inject, or modify network and system communications, and they directly target the production machines. Locally or remotely, direct adversaries can compromise several system components:

• In-production containers. With containers from an Internet-facing container service, for example, attackers can gain root privileges on a related container. Then, from the compromised container, they can make a denial-of-service (DoS) attack on containers located on the same host operating system.
• In-production host operating system. From a compromised container, for example, attackers can gain access to critical host operating system files; that is, launch a container escape attack.
• In-production Docker daemons. In this case, for example, attackers might lower the default security parameters to launch Docker containers from a compromised host operating system.
• The production network. From a compromised host operating system, attackers can redirect network traffic and so on.

Indirect adversaries have the same capabilities as direct ones, but they leverage the Docker ecosystem (such as the code and images repositories) to reach the production environment.
Depending on the attack phase, we identified the following targets: containers, host operating system, collocated containers, code repositories, images repositories, and the management network. MITRE's Common Vulnerabilities and Exposures (CVE) records illustrate that these are relevant targets. Vulnerabilities found in Docker and the libcontainer mostly concern filesystem isolation: chroot escapes (CVE-2014-9357, CVE-2015-3627), path traversals (CVE-2014-6407, CVE-2014-9356, and CVE-2014-9358), and access to special file systems on the host (CVE-2015-3630). These specific vulnerabilities are all patched as of Docker 1.6.2. Because container processes often run with user ID 0, they have read and write access on the whole host filesystem when they escape, which lets them overwrite host binaries, leading to a delayed arbitrary code execution with root privileges.
To subvert a Dockerized environment, we consider a subset of all the potential attack vectors (the Docker containers, code repositories, and images repositories), primarily because they're associated with publicly available services and interfaces. Other attack vectors might include the host operating system, management network, or physical access to systems.

Vulnerabilities Affecting Docker Usages
The Docker attack surface encompasses the whole deployment toolchain proposed by Docker, from the build to the execution of the images, including image conception, the image distribution process, automated builds, image signature, host configuration, and third-party components.

Insecure local configuration. Docker's default configuration on local systems following recommended usages is relatively secure as it provides isolation between containers and restricts containers' access to the host. Assuming these isolation mechanisms


are working as expected in both the recommended and PaaS usages (that is, there are no implementation vulnerabilities or CVEs), a privilege boundary between the containers and the host machine has been designed. Technical controls supporting the boundary include the isolation of processes through namespaces, resources management through cgroups, and (by default) limited communication capabilities between the containers and the host.
In contrast, widespread usages take advantage of options, given either to the Docker daemon on startup or to the command launching a container, that give containers extended access to the host. When used with untrusted containers, these options trigger many security concerns, including the following:

• options giving extended access to the host to containers (net=host, uts=host, privileged, and so on);
• the mounting of sensitive host directories in containers;
• TLS configuration of remote image registries;
• permissions on the Docker control socket; and
• cgroups activation (disabled by default).

For instance, when given the option net=host at container launch, Docker doesn't place the container into a separate NET namespace; it therefore gives the container full access to the host's network stack (enabling network sniffing, reconfiguration, and so on). The option uts=host leaves the container in the same UTS namespace as the host, which lets the container see and change the host's name and domain. The option cap-add=<CAP> gives the container the specified capability, thus making it potentially more harmful to the host. With cap-add=SYS_ADMIN, a container can, for example, remount /proc and /sys subdirectories in read/write mode and change the host's kernel parameters, leading to potential vulnerabilities, data leakage, or DoS.
Along with these runtime container options, several settings on the host can influence potential attacks. Even basic properties can at a minimum trigger DoS. For instance, when using some storage drivers (aufs), Docker doesn't limit containers' disk usage. A container with a storage volume can fill up this volume and affect other containers on the same host, or even the host itself, if the Docker storage located at /var/lib/docker isn't mounted on a separate partition.
As mentioned earlier, whatever the usages are, containers are an attack vector and therefore represent a potential threat for the host. This is even more relevant in widespread usages, where containers are used as VMs and thus have a bigger attack surface than microservice containers. They also have more vulnerabilities, leading to attacks such as container escapes.

Weak local access control. Beyond the kernel namespaces, cgroups, Docker dropping capabilities, and mount restrictions, mandatory access control (MAC) enforces constraints if the normal execution flow isn't respected. This approach is visible in the docker-default Apparmor policy. However, the MAC profiles for containers have room for improvement. In particular, Apparmor profiles typically behave as whitelists [14], explicitly identifying which resources any process can access while denying any other access when the profile is in enforce mode. However, the docker-default profile installed with the docker.io package gives containers complete access on network devices and filesystems with a full set of capabilities, and contains a small list of deny directives, which constitute a de facto blacklist.
These vulnerabilities are relevant to all usages and could lead to the attacks mentioned earlier, such as DoS or container escapes.

Image distribution vulnerabilities. The distribution of images through the Docker hub and other registries in the Docker ecosystem is a source of vulnerabilities. Because these vulnerabilities are similar to those of classical package managers [12], we consider only the automated deployment pipeline perspective here.
Automated builds and webhooks proposed by the Docker hub are key elements in the image distribution process. They lead to a pipeline in which each element has full access to the code that will end up in production, and are increasingly hosted in the cloud. For instance, to automate this deployment, Docker proposes automated builds on the Docker hub, triggered by an event from an external code repository (such as GitHub). Docker then proposes to send an HTTP request to a Docker host reachable on the Internet to notify it that a new image is available. This triggers an image pull and a container restart on the new image (through Docker hooks; see https://docs.docker.com/docker-hub/webhooks). In this deployment pipeline, a commit on GitHub will trigger a build of a new image and automatically launch it into production. Optional test steps can be added before production, which might themselves be hosted at yet another provider. In this case, the Docker hub makes a first call to a test machine that will then pull the image, run the tests, and send results to the Docker hub using a callback URL. The build process itself often downloads dependencies from other third-party repositories, sometimes over an insecure channel prone to tampering.
This setup adds several external intermediary steps to the code path, each of which has its own authentication and attack surface, increasing the global attack surface.
For instance, we had the intuition that a compromised GitHub account could lead to the execution of malicious code on a large set of production machines within minutes. We therefore tested a scenario that included a Docker hub account, a GitHub account, a development machine, and a production machine. The assumption was that adversaries would use the Docker ecosystem to put a backdoored Docker container in production. More precisely, we assumed that adversaries had successfully compromised some code on the code repository (for instance, via a successful phishing attack).
Because of network restrictions (our server was behind a corporate proxy that did not allow us to be reached directly from the Internet on a public IP address and a port), our servers couldn't be reached by webhooks, so we wrote a script to monitor both our repository on the Docker hub and downloads of new images. Our initial intuition was confirmed: the adversaries' code was put in production five-and-a-half minutes after their commit on GitHub. It's worth noting that this attack can scale to an arbitrary number of machines watching the same Docker hub repository. Given space limitations, we report more detailed results elsewhere [15].
Although compromising a code repository is independent of Docker, automatically pushing it in production dramatically increases the number of compromised machines, even if the malicious code is removed within minutes. Compromise could also happen at the Docker hub account level with the same consequences. Account hijacking isn't a new problem, but it should be an increasing concern with the multiplication of accounts at different providers.
Moreover, although the code path is usually secured using TLS communications (and always is with Docker), it's not the case with API calls that trigger builds and callbacks. Tampering with these data can lead to erroneous test results, unwanted restarts of containers, and so on. Additionally, such a setup isn't compatible with the Content Trust scheme, because code is processed by external entities between the developer and the production environment. Content Trust provides an environment in which a single entity is trusted (the person or organization that signed the images), while in this case trust is split across external entities, each of which is capable of compromising the images.
These vulnerabilities are especially relevant in the recommended and PaaS usages, which aim at an extensive use of automation at all layers to deliver shorter development cycles and continuous delivery.

An orchestrator could solve some of the security issues raised here, helping limit misuses of Docker through higher levels of abstraction (including tasks, replication controllers, and remote persistent storage) that completely remove host dependence and thus enable better isolation. Orchestrators also bring automation as a key enabler to repeatable, predictable, auditable security and continuously improved security. We are currently investigating orchestrator security issues through experiments.

Acknowledgments
We thank the anonymous reviewers for their useful comments.

References
1. M.G. Xavier et al., "Performance Evaluation of Container-Based Virtualization for High Performance Computing Environments," Proc. 21st Euromicro Int'l Conf. Parallel, Distributed, and Network-Based Processing (PDP 13), 2013, pp. 233-240.
2. T. Bui, "Analysis of Docker Security," 2015, arXiv:1501.02967v1.
3. E. Reshetova et al., "Security of OS-Level Virtualization Technologies," Proc. Nordic Conf. Secure IT Systems, 2014, pp. 77-93.
4. R. Di Pietro and F. Lombardi, Security for Cloud Computing, Artech House, 2015.
5. F. Lombardi and R. Di Pietro, "Virtualization and Cloud Security: Benefits, Caveats, and Future Developments," Cloud Computing, Z. Mahmood, ed., Springer Int'l Publishing, 2014, pp. 237-255.
6. Docker, "Automated Builds on Docker Hub," Docker User Guide, 2016; https://docs.docker.com/docker-hub/builds.
7. ClusterHQ and DevOps.com, The Current State of Container Usage: Identifying and Eliminating Barriers to Adoption, survey, June 2015; https://clusterhq.com/assets/pdfs/state-of-container-usage-june-2015.pdf.
8. E.W. Biederman, "Multiple Instances of the Global Linux Namespaces," Proc. Linux Symp., vol. 1, 2006, pp. 101-112.


10. Docker, "Content Trust in Docker," Docker User Guide, 2016; https://docs.docker.com/engine/security/trust/content_trust.
11. J. Samuel et al., "Survivable Key Compromise in Software Update Systems," Proc. 17th ACM Conf. Computer and Comm. Security (CCS 10), 2010, pp. 61-72.
12. J. Cappos et al., "A Look in the Mirror: Attacks on Package Managers," Proc. 15th ACM Conf. Computer and Comm. Security, P. Ning, P.F. Syverson, and S. Jha, eds., 2008, pp. 565-574.
13. Docker, "Best Practices for Writing Dockerfiles," Docker User Guide, 2016; https://docs.docker.com/engine/userguide/eng-image/dockerfile_best-practices.
14. Novell, Novell AppArmor Administration Guide, Oct. 2007; www.suse.com/documentation/apparmor/pdfdoc/book_apparmor21_admin/book_apparmor21_admin.pdf.
15. T. Combe, A. Martin, and R. Di Pietro, Containers: Vulnerability Analysis, tech. report, Nokia Bell Labs; http://ricerca.mat.uniroma3.it/users/dipietro/containers_security.pdf.

Theo Combe is a graduate student at the Ecole Polytechnique, France, and is pursuing a double degree in networks and cybersecurity at Telecom ParisTech. His research interests include networked systems security, telecommunications, and programming. Contact him at theo-nokia@sutell.fr.

Antony Martin is a security analyst and member of the technical staff in the security department at Nokia Bell Labs, Nozay, France. His research interests include network security, virtualization, cloud computing, and network function virtualization. Martin has an engineering degree in telecommunications from Telecom Lille engineering school and holds a number of technical certifications. Contact him at antonymartin.pro@gmail.com.

Roberto Di Pietro is security research global head at Nokia Bell Labs, Paris-Saclay, France, and a part-time professor of computer science (security) at the University of Padova, Italy. His research interests include security, privacy, distributed systems, computer forensics, and analytics. Di Pietro has a PhD in computer science from the University of Roma La Sapienza. Contact him at roberto.di_pietro@nokia-bell-labs.com.

EXECUTIVE STAFF
Executive Director: Angela R. Burgess; Director, Governance & Associate Executive
PURPOSE: The IEEE Computer Society is the worlds largest association of computing Director: Anne Marie Kelly; Director, Finance & Accounting: Sunny Hwang;
professionals and is the leading provider of technical information in the field. Director, Information Technology & Services: Sumit Kacker; Director, Membership
MEMBERSHIP: Members receive the monthly magazine Computer, discounts, and Development: Eric Berkowitz; Director, Products & Services: Evan M. Butterfield;
opportunities to serve (all activities are led by volunteer members). Membership is open to all IEEE members, affiliate society members, and others interested in the computer field.
OMBUDSMAN: Email ombudsman@computer.org.
COMPUTER SOCIETY WEBSITE: www.computer.org
Next Board Meeting: 13-14 November 2016, New Brunswick, NJ, USA

EXECUTIVE COMMITTEE
President: Roger U. Fujii
President-Elect: Jean-Luc Gaudiot; Past President: Thomas M. Conte; Secretary: Gregory T. Byrd; Treasurer: Forrest Shull; VP, Professional and Educational Activities: Andy T. Chen; VP, Member & Geographic Activities: Nita K. Patel; VP, Publications: David S. Ebert; VP, Standards Activities: Mark Paulk; VP, Technical & Conference Activities: Hausi A. Müller; 2016 IEEE Director & Delegate Division VIII: John W. Walz; 2016 IEEE Director & Delegate Division V: Harold Javid; 2017 IEEE Director-Elect & Delegate Division VIII: Dejan S. Milojičić

BOARD OF GOVERNORS
Term Expiring 2016: David A. Bader, Pierre Bourque, Dennis J. Frailey, Jill I. Gostin, Atsuhiro Goto, Rob Reilly, Christina M. Schober
Term Expiring 2017: David Lomet, Ming C. Lin, Gregory T. Byrd, Alfredo Benso, Forrest Shull, Fabrizio Lombardi, Hausi A. Müller
Term Expiring 2018: Ann DeMarle, Fred Douglis, Vladimir Getov, Bruce M. McMillin, Cecilia Metra, Kunio Uchiyama, Stefano Zanero

Director, Sales & Marketing: Chris Jensen

COMPUTER SOCIETY OFFICES
Washington, D.C.: 2001 L St., Ste. 700, Washington, D.C. 20036-4928
Phone: +1 202 371 0101 Fax: +1 202 728 9614 Email: hq.ofc@computer.org
Los Alamitos: 10662 Los Vaqueros Circle, Los Alamitos, CA 90720
Phone: +1 714 821 8380 Email: help@computer.org

MEMBERSHIP & PUBLICATION ORDERS
Phone: +1 800 272 6657 Fax: +1 714 821 4641 Email: help@computer.org
Asia/Pacific: Watanabe Building, 1-4-2 Minami-Aoyama, Minato-ku, Tokyo 107-0062, Japan
Phone: +81 3 3408 3118 Fax: +81 3 3408 3553 Email: tokyo.ofc@computer.org

IEEE BOARD OF DIRECTORS
President & CEO: Barry L. Shoop; President-Elect: Karen Bartleson; Past President: Howard E. Michel; Secretary: Parviz Famouri; Treasurer: Jerry L. Hudgins; Director & President, IEEE-USA: Peter Alan Eckstein; Director & President, Standards Association: Bruce P. Kraemer; Director & VP, Educational Activities: S.K. Ramesh; Director & VP, Membership and Geographic Activities: Wai-Choong (Lawrence) Wong; Director & VP, Publication Services and Products: Sheila Hemami; Director & VP, Technical Activities: Jose M.F. Moura; Director & Delegate Division V: Harold Javid; Director & Delegate Division VIII: John W. Walz

revised 10 June 2016



IEEE Cloud Computing
Call for Papers

Although cloud technologies have been advanced and adopted at an astonishing
pace, much work remains. IEEE Cloud Computing seeks to foster the evolution of
cloud computing and provide a forum for reporting original research, exchanging
experiences, and developing best practices.

IEEE Cloud Computing magazine seeks accessible, useful papers on the latest peer-reviewed
developments in cloud computing. Topics include, but aren't limited to:

Cloud architectures (delivery models and deployments),


Cloud management (balancing automation and robustness with monitoring and
maintenance),
Cloud security and privacy (issues stemming from technology, process and governance,
international law, and legal frameworks),
Cloud services (cloud services drive and are driven by consumer demand; as markets
change, so do the types of services being offered),
Cloud experiences and adoption (deployment scenarios and consumer expectations),
Cloud and adjacent technology trends (exploring trends in the market and impacts on
and influences of cloud computing),
Cloud economics (direct and indirect costs of cloud computing on the consumer;
sustainable models for providers),
Cloud standardization and compliance (facilitating the standardization of cloud tech and
test suites for compliance), and
Cloud governance (transparency of processes, legal frameworks, and consumer
monitoring and reporting).

Submissions will be subject to IEEE Cloud Computing magazine's peer-review process.


Articles should be at most 6,000 words, with a maximum of 15 references, and should be
understandable to a broad audience of people interested in cloud computing, big data, and
related application areas. The writing style should be down to earth, practical, and original.

All accepted articles will be edited according to the IEEE Computer Society style guide.
Submit your papers through Manuscript Central at https://mc.manuscriptcentral.com/ccm-cs.

If you have any questions, feel free to email lead editor Brian Brannon at bbrannon@computer.org.

www.computer.org/cloudcomputing
Cloud Security

User-Centric Security and Dependability in the Clouds-of-Clouds

Marc Lacoste, Orange Labs
Markus Miettinen, Technische Universität Darmstadt
Nuno Neves and Fernando M.V. Ramos, University of Lisbon
Marko Vukolić, IBM Research
Fabien Charmet and Reda Yaich, Institut Mines-Telecom
Krzysztof Oborzyński and Gitesh Vernekar, Philips Healthcare
Paulo Sousa, Maxdata Software

Secure Supercloud computing aims to provide security and dependability management of
distributed clouds. This approach is both user-centric and self-managed, enabling users
to achieve provider independence for security management.
IEEE Cloud Computing, published by the IEEE Computer Society. 2325-6095/16/$33.00 © 2016 IEEE
The high maintenance costs of private datacenters and disaster-recovery requirements are causing cloud architectures to go distributed. Virtualization is expanding outside a single datacenter for compute, network, storage, and devices. Resource-specialized clouds are becoming federated, evolving from centralized to fully distributed infrastructures across heterogeneous resources (a cloud-of-clouds) and away from the datacenter to the edge.1,2

These new architecture paradigms present key benefits:

better user performance (for example, lower end-to-end latency) due to fine-grained geodistribution,
lower costs by choosing best-of-breed cloud providers in terms of pricing model,3 and
improved resilience to avoid wide-area outages due to single points of failure.

Nevertheless, distributed cloud computing raises several concerns,4 mainly due to these systems' high complexity and the current lack of interoperability between heterogeneous, often proprietary, infrastructure technologies.

In practice, distributed cloud computing has remained highly provider-centric, and multicloud integration remains a challenge. Adoption also suffers from vendor lock-in, with services tightly coupled to providers. Lack of interoperability stems mainly from the heterogeneity of technologies (for example, different hypervisors), and from service-resources mappings that are incompatible across providers, hampering, for instance, uniformity in service-level agreements (SLAs). User control is also limited by monolithic infrastructures, preventing fine-grained cloud customization by the customer (for example, hypervisors hide specific hardware capabilities).

Multicloud infrastructures also raise several security and dependability challenges. First, infrastructure layers, which include customer virtual machines (VMs), provider hypervisors, and services, are extremely vulnerable to attacks, in part due to new virtualization technologies,5 so the infrastructures can't be trusted. Second, interoperability and unified control of security across providers is mostly absent. Policy heterogeneity among providers facilitates the introduction of more vulnerabilities because of mismatching APIs and workflows. Finally, security administration challenges for such complex infrastructures clearly prohibit a manual approach. Automation of security management is required, but still lacking, in the multicloud.

In today's provider-centric clouds, service specification, security, dependability, pricing, and SLAs are beyond users' influence. To tackle the security and dependability challenges in a multicloud, we need new infrastructure management paradigms that are both user-centric and self-managed. The former means enabling self-service of cloud-of-clouds, where customers define their own protection requirements and can avoid technology and vendor lock-ins. The latter means reducing the administration complexity of cloud-of-clouds through automation techniques.

Secure Supercloud Computing
This article introduces the notion of Supercloud, a new architectural concept that follows the vision of user-centric distributed cloud security and dependability management.6 Supercloud can be understood as a security distribution layer, providing an end-to-end interface between user-centric and provider-centric views of multiple clouds.

Supercloud deploys several user-centric clouds (or U-Clouds). A U-Cloud is a set of computation, data storage, and communication services that lets individual Supercloud users run their applications and services over a distributed cloud. U-Clouds can be implemented on top of resources from several providers. However, strict U-Cloud isolation is guaranteed using data encryption and dedicated U-Cloud-specific VMs for computation.

Supercloud addresses the interoperability challenge by providing a resource abstraction layer spanning multiple cloud providers, decoupling resource production by cloud providers from their consumption by users. It also addresses the control challenge by enabling customers to deploy clouds with self-service security, ranging from software as a service (SaaS) to full infrastructure as a service (IaaS), independent of the underlying providers. In addition, it offers unified control for automated management of security and resilience across different clouds.
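To make the decoupling of resource production from resource consumption more concrete, here is a minimal Python sketch. It is only an illustration under stated assumptions: the `Provider` interface, the `FakeProvider` stand-in, and the `UCloud.deploy` placement rule (least-loaded provider unless the user pins one) are hypothetical names and choices, not part of the Supercloud design described in the article.

```python
from __future__ import annotations

from dataclasses import dataclass, field
from typing import Protocol


class Provider(Protocol):
    """Provider-neutral interface (hypothetical) that the abstraction layer targets."""
    name: str

    def start_vm(self, image: str) -> str: ...


@dataclass
class FakeProvider:
    """Stand-in for a real cloud back end; returns made-up VM identifiers."""
    name: str
    _count: int = 0

    def start_vm(self, image: str) -> str:
        self._count += 1
        return f"{self.name}-vm{self._count}"


@dataclass
class UCloud:
    """A user-centric cloud: one owner, VMs drawn from several providers."""
    owner: str
    providers: list[Provider]
    vms: dict[str, str] = field(default_factory=dict)  # VM id -> provider name

    def _load(self, provider: Provider) -> int:
        # Number of this U-Cloud's VMs currently hosted by the given provider.
        return sum(1 for host in self.vms.values() if host == provider.name)

    def deploy(self, image: str, provider_name: str | None = None) -> str:
        # The user may pin a provider; otherwise take the least-loaded one
        # (an assumed placement rule, chosen only to keep the sketch small).
        candidates = [p for p in self.providers if provider_name in (None, p.name)]
        if not candidates:
            raise ValueError(f"unknown provider: {provider_name}")
        chosen = min(candidates, key=self._load)
        vm_id = chosen.start_vm(image)
        self.vms[vm_id] = chosen.name
        return vm_id
```

The user only ever calls `UCloud.deploy`; which provider actually produces the VM is a detail hidden behind the `Provider` interface, which is the point of the abstraction layer.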


This approach has several benefits. First, independence from the provider means lower infrastructure operation overhead and faster service deployment, but also increased homogeneity. Second, increased customizability can also be expected, as the customer can choose which virtualized services (such as for security) to deploy, resulting in fully à la carte clouds. Third, it can create new business opportunities and ecosystems.7

In a nutshell, Supercloud is a provider-agnostic distributed virtualization infrastructure for running U-Clouds, leveraging compute, data, and network resources from both public cloud providers and private cloud infrastructures (see Figure 1). This heterogeneity impacts the level of infrastructure visibility and control that can be achieved for services running in U-Clouds.

Figure 1. The Supercloud concept, which includes users, Supercloud providers, and traditional cloud providers. Cloud resource consumption (by users) is separated from cloud resource production (by cloud providers) thanks to the Supercloud layer, thus enabling it to overcome vendor lock-in.

On one end, public clouds operated by commercial cloud service providers (CSPs) give only limited visibility and control over the hypervisor and the network. Public CSPs are typically big players, offering commercial cloud services to their customers at massive scale, allowing them to take advantage of cost savings and elastic resources. They provide well-defined high-level service APIs with few possibilities for individual customers to customize the deployment details of their cloud service instances.

On the other end, in private clouds, where the datacenter belongs to the user, full infrastructure access can be reached. Private CSPs are typically entities such as a corporation's IT department, supporting tailored cloud services for its own organization. These services aren't limited to a narrow service API, but can also have lower-level control over specific deployment details (of the computational, storage, and networking resources). This enables the U-Cloud to provide user-centric system services (USS), extending user control into the lower layers of the infrastructure to enforce flexible, secure, and dependable computing behaviors (for example, firewalling, introspection, or live migration).

Supercloud is a step away from provider-centric cloud interoperability approaches (such as hybrid/federated clouds) that rely on business, interface, or protocol agreements between providers. The Supercloud approach is closer to customer-centric solutions (such as multicloud and broker-based aggregation), but focuses on security: interoperability is transparent to providers, using an adaptation layer or third-party operation. (See the sidebar for a discussion of other work in this area.)

Requirements
To meet these challenges, the Supercloud architecture should address the following objectives:

Self-service security: Users should be able to specify their own protection requirements and manage the corresponding security and privacy policies autonomously, to control their resources' security in a fine-grained manner.
Self-managed security: The architecture should automatically and seamlessly manage the distributed cloud's security over compute, storage, and network layers, and across provider domains to ensure compliance with user-defined security policies.
End-to-end security: The architecture should guarantee SLAs (for example, for isolation) for multiple compute clouds, data protection in a multiprovider setting, and secure network interconnection.
Resilience: Resource management should provide robust composition of provider-agnostic resources, leveraging primitives from multiple providers.

This leads to the following requirements.

First, the Supercloud architecture must enable provider independence and isolation. It should offer a distributed cloud infrastructure that lets users deploy cloud applications and services in specific cloud instances (that is, U-Clouds) in a transparent and user-configurable manner. Individual U-Clouds must be strictly separated, preventing, for instance, misbehaving U-Clouds from impacting other U-Clouds.

The architecture must also support interoperability at the infrastructure and platform levels. It should support a distributed cloud with flexibility and control levels similar to those in a single-provider scenario, for example, in terms of usage or migration of resources across providers. In particular, it should enable the deployment of legacy applications and management tools in the distributed cloud infrastructure.

Third, it should enable user-controlled security. It should allow users to define fine-grained security settings to control the protection level of their cloud resources. For instance, to meet legal requirements that prohibit transfer of particular data types across jurisdictional boundaries, users might need to control where their U-Cloud data is physically stored and processed. It must also protect user privacy by preventing cloud providers from accessing user data without the user's explicit consent.

Finally, the architecture should guarantee integrity and availability of services and data. It should allow specification and enforcement of measures related to integrity, redundancy, and disaster recovery of data resources as part of a user-provider SLA. Performance guarantees might also be required, namely on response times for critical accesses to some data resources.

Related Work in Multicloud Security
Many distributed virtualization infrastructures (for example, microhypervisors, nested virtualization, container platforms, and library operating systems), isolation and trust management technologies, and protection automation techniques have tackled security challenges related to multiprovider interoperability and vulnerable software layers, but without meeting requirements for user control, low attack surface, interoperability, and legacy compatibility. The Supercloud hybrid virtualization architecture enables flexible but efficient user-centric tradeoffs in terms of both interoperability and security for the multicloud.

Many solutions, such as Google Drive and Dropbox, allow users to store their own data on the cloud. However, most don't permit user-centric data encryption. In current cloud-based data storage solutions, no commercial product uses advanced cryptographic tools for data confidentiality protection, such as those we propose to use in Supercloud. Most major infrastructure-as-a-service providers offer replication solutions across multiple datacenters to support dependability; however, this remains limited to proprietary protocols and single administrative domains.

Software-defined networking-based virtualization solutions allow cloud providers to offer complete network virtualization.1 They give tenants the freedom to specify their network topologies and addressing schemes, while guaranteeing the required level of isolation. These platforms, however, have been targeting the datacenter of a single cloud provider with full control over the infrastructure. In Supercloud, we extend this concept, supporting the creation of virtual networks spanning multiple datacenters that might belong to distinct cloud providers, while including private facilities owned by the tenant. The novelty of our solution arises mainly from tackling the challenges of using multiple clouds, including public clouds on which we have very limited control.

Reference
1. T. Koponen et al., "Network Virtualization in Multitenant Datacenters," Proc. 11th USENIX Symp. Networked Systems Design and Implementation (NSDI), 2014, pp. 203-216.

System Architecture
We now describe the architecture of the Supercloud, both statically (that is, its components) and dynamically (that is, how these components interact to guarantee overall security).

Static Architecture
The Supercloud architecture allows customers to instantiate U-Clouds that run on the underlying


infrastructure. Figure 2 shows the three abstraction planes, each addressing a particular aspect of the Supercloud system. Each plane is realized with resources from the underlying CSPs.

Figure 2. High-level overview of the Supercloud architecture, including compute, data, and network planes, and security management framework.

The compute plane enables users to instantiate computational nodes regardless of physical servers hosting computations. The data plane realizes an abstract cloud data storage service transparent to providers and data resources providing the physical storage space. The network plane provides the connectivity between computational and storage resources regardless of the networking infrastructure realizing physical connectivity between servers hosting computational nodes and data storage. A security management framework provides fine-grained control to users over protection of any computational, data, and networking resources in the abstraction planes.

This layered design minimizes interface complexity between planes, clearly defining interdependences between architectural components. Users can deploy computational nodes and storage resources in the Supercloud system easily and flexibly, regardless of the specific technical requirements of individual CSPs' resource platforms: orchestration of resources for computation, storage, and networking is handled by respective abstraction planes.

Interplay between the three planes allows flexible and efficient attack mitigation. A key risk of federated clouds is that a malicious component may be present in a U-Cloud, considered as a service composition.8 Each plane provides relevant countermeasures: enforcing VM isolation, attesting service trustworthiness, guaranteeing data availability, or sanitizing the network environment. Such mechanisms can be orchestrated with the security management plane to prepare and enforce a relevant security response.

Compute plane and security self-management. Figure 3a shows a simplified computing view of two U-Clouds: one (U-Cloud A) spans different providers, while another (U-Cloud B) is confined to a single provider.

The virtualization infrastructure is a distributed abstraction layer for computing resources across multiple providers. Nested virtualization is a core U-Cloud technology because it offers interoperability and security benefits to guarantee VM protection despite untrusted virtualization layers.9 The provider controls the lower virtualization layer, called L0. Public clouds usually run general-purpose hypervisors (for example, Xen for Amazon). In private clouds, more modular hypervisors enable users to take control of part of L0 in the form of infrastructure services for deep, fine-grained customization of U-Cloud security. The upper virtualization layer, called L1, provides the necessary facilities for users to instantiate execution environments forming layer L2, using VMs or containers that are under users' control. A horizontal orchestration component typically realizes distributed execution or migration of L2 environments connecting multiple L1 instances.

Supercloud mainly addresses security at the infrastructure level, for example, to guarantee isolation among system computation units such as VMs or containers and enforce a U-Cloud boundary. Security partitioning of applications across clouds is also important to users,10 and can be achieved on top of the Supercloud layer as in single clouds. Provider heterogeneity is hidden within the U-Cloud, already a secure, distributed environment for application deployment.

The self-management infrastructure implements autonomic configuration and management of security aspects for the distributed cloud. Such automation means simpler, faster, and more efficient detection and response to threats, minimizing overall human intervention. U-Cloud-specific components also let users control their U-Clouds' security settings. An overall component arbitrates between such settings and provider security requirements. The security response to a threat is elaborated by orchestration of multiple autonomic security loops across infrastructure layers and providers. Other services include flexible isolation, trust management, configuration compliance for auditability, and authorization.

Data plane. Figure 3b shows several types of storage entities in the data plane. Clients represent users of the Supercloud storage infrastructure. Ordinary clients interact transparently with the data plane via storage proxies. This requires minimal



changes to clients without installing additional libraries. In contrast, direct accessor clients run Supercloud-specific logic as a client library and can interact and access storage servers and L1 cloud provider services directly. Direct accessor clients can also have certain features of storage servers built-in. Such clients could thus also be independent of storage servers.

Proxies, typically L2 VMs, facilitate client access to Supercloud storage and data management offerings, such as for encryption and secure deduplication. They're usually stateless and can be easily added dynamically to the system.

Servers, typically stateful L1 or L2 VMs, perform housekeeping of critical portions of metadata vital to the Supercloud data plane's operation, such as metadata for storage, data integrity, or configuration management. Cloud provider services (CPSs) are L1 cloud storage services that direct accessor clients or proxies can directly access. They expose different APIs, notably object storage and block storage. Examples include OpenStack Swift and Amazon's Simple Storage Service (S3) and Elastic Block Store. Cloud provider data nodes are L1 VMs in the distributed provider infrastructure. Complementing CPS, they can perform computation and have locally mounted L1 block storage for Supercloud user data.

Security self-management components allow arbitration between provider and user data security settings.

Figure 3. Detailed view of the Supercloud architecture: (a) compute plane, (b) data plane, and (c) network plane. Each figure shows detailed subcomponents for computation, data management, and networking, and interplay with security self-management. (OvS: Open vSwitch; SDN: software-defined networking controller)

Network plane. Figure 3c illustrates the Supercloud network virtualization architecture. Its main design goals are network controllability; full network virtualization to guarantee isolation between users, while enabling them to use their desired addressing schemes and topologies; and VM snapshotting and migration for availability and flexibility.

To fulfill these objectives, the architecture leverages software-defined networking (SDN),11 which provides logically centralized control over forwarding and configuration state of the software switches running in the Supercloud VMs. OpenFlow and Open vSwitch (OvS) technologies provide fine-grained control of packet forwarding and of switch configurations, respectively. Logical centralization of control facilitates isolation, for example, through flow rule redefinition at the network edge, with translation of physical to virtual events. Availability goals extend well-proven techniques to the multicloud setting.

For each user, a specific set of network applications that control the virtual network will run on top of the Supercloud network hypervisor that maps the virtual and physical resources. These include an address translator (to offer L2 and L3 address virtualization), a topology abstraction module (for topology virtualization), and a resource isolation application (to slice network resources among tenants, such as switch CPU and forwarding tables). The network hypervisor controls and configures the OvS switches that are installed in all VMs. An SDN controller will establish secure connections with each OvS switch to control the forwarding plane.

The network hypervisor is built as an application that runs in the Supercloud SDN controller. Each cloud will host a specific VM, the network proxy, where secure tunnels are set up to all other clouds. In a distributed configuration, each proxy will host an instance of the SDN controller.

Security management is facilitated through the interplay of overall security self-management and network security management components, which enable Supercloud users to specify user-specific settings for network configurations inside their U-Clouds.

Dynamic Architecture
Figure 4 illustrates two typical workflows between some key Supercloud architecture components. User 1 interacts with its VM (u1VMx) through a set of APIs. Providers 1 and 2 host compute (VMx), networking (NVMx), and storage management (DVMx) VMs. Provider 1 also hosts a physical storage service. Supercloud considers a nested architecture, that is, u1VMx runs inside VMx.

Supercloud users interact through four interfaces to deploy their applications in the cloud. The network plane interface, typically the network hypervisor, interacts with the SDN controller and network proxies, hosted in the NVMx machines, to handle communication and establish secure tunnels with other clouds. The data plane interface, typically storage proxies, interacts with the provider's DVMx VMs to ensure access to the user's private data. The compute plane interface, typically the L1 hypervisor, interacts with the provider's VMx machines to provide memory and CPU resources.

We describe the interfaces between Supercloud elements in several scenarios. The first scenario relates to requesting data from the cloud storage; the second relates to establishing communication between two VMs hosted in the Supercloud. The last example shows how the Supercloud security management interfaces make it possible to deploy security services.

Access to cloud storage. In this scenario (steps a-f in Figure 4), during a request to the data layer (step a), the user VM (uVM) sends a request to the



data management VM (DVM) (step b). The DVM is aware of the resource's physical location inside the cloud infrastructure. It provides this information to the hypervisor hosting the uVM (step c), which will ask the network management VM (NVM) (step d) to establish a connection between resource and uVM through the SDN network (steps e and f).

Figure 4. Sample Supercloud workflows. Shown are the Supercloud interfaces (computation, data, networking, self-management) to deploy applications in several scenarios: access to cloud storage, establishing communications between VMs, and self-management of security.

Establishing communication between user VMs. In this scenario (steps 1-6 in Figure 4), after receiving a request from User 1 (step 1), uVM1 sends a communication request through the hypervisor (step 2). The hypervisor forwards the request to the NVM (step 3), which establishes the SDN rules for the path to communicate with NVM2 (step 4). If the destination VM is hosted on a different CSP, the NVM forwards the request to the NVM of the other CSP, hence setting up the connection. Finally, NVM2 shares the request with VM2 (step 5), which is hosting uVM2 (step 6). Each component (VMx, DVMx, and NVMx) is accessed independently of the provider owning the physical resource.

Security management. Users and providers also interact with a security management plane interface (see Figures 2 and 3) to deploy, orchestrate, enforce, and monitor security requirements. Such requirements are specified and negotiated through SLAs during the cloud service discovery and brokering phases. This distributed protection plane is realized through interplay of several security self-management components spread across the Supercloud abstraction planes.

The resource management components are self-management agents (SMAs) responsible for delivering atomic security services such as enforcement, detection, reaction, and monitoring. These components operate on a particular architecture abstraction plane and are dedicated to a specific security service (such as intrusion detection, authorization enforcement, or trust management). Some security services might require multiple SMAs across multiple planes and/or providers. For instance, intrusion detection might require the collaboration of multiple (cross-provider) SMAs to collect, aggregate, and process activity logs.12

Aggregation components provide a unified and uniform view of multiple SMAs to the orchestrator. They abstract the heterogeneity of provider security mechanisms, meeting platform independence and interoperability requirements.
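The division of labor between SMAs and aggregation components can be illustrated with a small Python sketch of the cross-provider intrusion-detection example. Everything concrete here is an assumption made for illustration: the `Event` schema, the two raw log layouts (`src`/`event` and `client_ip`/`type`), and the simple threshold rule are hypothetical and do not come from the article.

```python
from __future__ import annotations

from collections import Counter
from dataclasses import dataclass


@dataclass(frozen=True)
class Event:
    """Uniform event record exposed to the orchestrator (hypothetical schema)."""
    provider: str
    plane: str   # "compute", "data", or "network"
    source: str  # for example, an IP address or VM identifier
    kind: str    # for example, "failed_login"


class SMA:
    """Self-management agent bound to one provider and one abstraction plane."""

    def __init__(self, provider: str, plane: str, raw_logs: list[dict]):
        self.provider = provider
        self.plane = plane
        self.raw_logs = raw_logs

    def collect(self) -> list[Event]:
        # Providers may name log fields differently; normalize the two
        # layouts assumed in this sketch into one Event schema.
        events = []
        for rec in self.raw_logs:
            source = rec.get("src") or rec.get("client_ip", "unknown")
            kind = rec.get("event") or rec.get("type", "unknown")
            events.append(Event(self.provider, self.plane, source, kind))
        return events


class Aggregator:
    """Gives the orchestrator one uniform view over many SMAs."""

    def __init__(self, agents: list[SMA]):
        self.agents = agents

    def view(self) -> list[Event]:
        return [event for agent in self.agents for event in agent.collect()]

    def suspicious_sources(self, kind: str, threshold: int) -> set[str]:
        # Count matching events per source across all providers and planes.
        counts = Counter(e.source for e in self.view() if e.kind == kind)
        return {src for src, n in counts.items() if n >= threshold}
```

A source that stays below the threshold at each provider individually can still be flagged once the logs of several providers are combined, which is why the article's example calls for collaboration between cross-provider SMAs.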

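The inter-VM communication workflow described earlier (steps 1-6 in Figure 4) contains one branch worth spelling out: the local NVM installs the path rules and, when the destination is hosted by a different CSP, forwards the request to that provider's NVM. The Python sketch below models only that branch. The `NVM` class, its `hosted`, `peers`, and `rules` fields, and the returned list of provider hops are hypothetical simplifications, not the actual Supercloud interfaces.

```python
from __future__ import annotations

from dataclasses import dataclass, field


@dataclass
class NVM:
    """Network management VM of one provider (illustrative model only)."""
    provider: str
    hosted: set[str]  # user VMs reachable inside this provider
    peers: dict[str, NVM] = field(default_factory=dict)
    rules: list[tuple[str, str]] = field(default_factory=list)

    def connect(self, src: str, dst: str) -> list[str]:
        """Install a (src, dst) rule locally and, if needed, at the peer NVM.

        Returns the providers traversed, in order.
        """
        self.rules.append((src, dst))
        hops = [self.provider]
        if dst not in self.hosted:
            # Destination lives at another CSP: forward the request to its NVM.
            peer = next((n for n in self.peers.values() if dst in n.hosted), None)
            if peer is None:
                raise LookupError(f"no provider hosts {dst}")
            peer.rules.append((src, dst))
            hops.append(peer.provider)
        return hops
```

In this model the requesting VM never needs to know which provider hosts its peer; it hands the request to its local NVM, mirroring the article's statement that each component is accessed independently of the provider owning the physical resource.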

Orchestration components are decision-making components providing security services. Each component is a manager for a specific security service such as authorization and access control, intrusion detection and prevention, and trust management. In addition, an overall orchestrator coordinates the actions of all security managers; a planner generates plans to reach and/or maintain security objectives; and a storage manager guarantees persistence and delivery of the knowledge needed for self-management of security. Orchestration components are also responsible for retrieving user security requirements from SLAs, converting them into policies and configurations to be enforced, and detecting and managing conflicts between tenants, users, and/or providers.

Use Cases
To illustrate how the Supercloud architecture can be mapped to real-world use cases, we use examples from the healthcare domain.

Hospital Imaging Archive
The amount of diagnostic imaging data is quickly increasing, imposing great challenges on hospital archive infrastructures, which must ensure high data availability, security, and regulatory compliance. A cloud-based solution can help address these challenges.

Such a solution's architecture should minimize the risk of security breaches and privacy violations, including unprivileged access to data (both at rest and during processing) with regard to defined policies. These policies might include hospital-specific policy context, legal country boundaries, and user groups. In terms of performance, robust data processing with low latency is desired, especially across different clouds.

Hospitals can store their clinical data as well as their imaging studies in on-premises private cloud storage. Archiving in the cloud helps simplify the data management and hospital archive infrastructure, especially given high-volume imaging studies that are often as large as 1 Gbyte. Since on-premises storage can be limited, it makes sense to store this data securely in public cloud storage. For example, a hospital might store data from the last six months in the private cloud's on-premises storage, while storing older data (10 years or more) in the public cloud. Figure 5a shows a sample Supercloud implementation of such a solution.

In Figure 5a, three hospitals (A, B, and C) share a private cloud to store and manage their clinical data. A VM on the compute plane is dedicated to each hospital for its operations. Whenever a hospital VM wants to store or retrieve clinical data (for example, MRI imaging data), it communicates with a picture archiving and communication system (PACS) VM interfacing with the data plane. The data plane provides an abstraction to the user VM, making all underlying storage directly accessible (including encrypting stored data). Data older than six months is stored on the public cloud, whereas recent data is kept in the private cloud's on-premises storage, providing instantaneous access to it. Here, the network plane is responsible for handling all communications across different clouds and VMs. Hospitals can also define their security policies, such as how other hospitals can access their data. Components that are dedicated to data security and security management across the L1 hypervisor and compute VMs will prevent any unprivileged access to data based on security policies defined by each hospital.

Healthcare Laboratory Information System
This use case demonstrates the impact of the Supercloud architecture for Maxdata Software, a healthcare software vendor that aims to deploy its software on the cloud as SaaS while enforcing the security requirements of different healthcare institutions.

The CLINIdATALIS healthcare laboratory information system (LIS) is a cross-platform Web application in which server components can run on any common operating system and relational database. The CLINIdATALIS must integrate with dozens of other clinical and nonclinical information systems (such as intensive care units, patient identification, billing, and regional health portals). It includes a set of real-time interfaces with physical electronic equipment (automated analyzers). The solution consists of three components on the server side: a stateless application, a database engine, and database data. The Supercloud approach allows each healthcare institution to define the U-Cloud that best fits its needs. Concrete deployment on physical cloud providers is then automated. The considered setting is a large hospital cluster that employs thousands of professionals, processes tens of millions of transactions per day, and is located in a country where personal data protection must be guaranteed.

In a typical U-Cloud specification,

• the application and database engine are replicated across several VMs on the compute plane (fault tolerance and load balancing);
• data is split among different storage nodes in the data plane (offering confidentiality, even if one storage node is compromised);



[Figure 5 diagram, panels (a) and (b). Both panels show compute VMs and clients running over L1 hypervisors and OvS switches, with data security, computing security, network security, security management, and horizontal orchestration components spanning a private cloud and a public cloud. Panel (a) shows hospital compute VMs with PACS clients; panel (b) shows U-Cloud A with CLINIdATALIS compute VMs connected to automated analyzers.]

Figure 5. Supercloud practical deployments: (a) high-availability storage and disaster recovery, and
(b) healthcare laboratory information system. (USS: user-centric system service)


• a set of networks connects application VMs to automated analyzers running on hospital premises and to database engine VMs, which in turn are connected to storage nodes;
• VMs on the compute plane ensure confidentiality, integrity, and 99.99 percent availability;
• storage nodes on the data plane ensure data integrity and 99.99 percent availability; and
• data may be processed and stored only in a predefined set of countries.

As Figure 5b shows, a Supercloud infrastructure can then deploy the VMs on a trusted private cloud to ensure confidentiality on the compute plane, instantiate the storage nodes on a set of public cloud providers running security mechanisms (such as encryption and secret sharing) to ensure confidentiality, and connect the different components using virtual networks provided by the network plane. Deployments consider the locations or countries specified by the healthcare institution. Replicated instances of the CLINIdATALIS application run on VMs on the compute plane. These instances then connect to the database engine running on a different VM linked with the data plane.

In case of regulatory, economic, or other types of change, healthcare institutions can update U-Cloud requirements and/or features. The Supercloud infrastructure automatically redeploys the solution accordingly, enabling quick adaptation to context changes. It also prevents vendor lock-in.

We're implementing the different components of the Supercloud architecture to gradually achieve integrated proofs of concept. The solution is currently at an advanced stage of implementation. Several results are already available (see https://Supercloud-project.eu/publications-deliverables). However, we're still integrating the various components. Preliminary performance results have shown relatively modest overheads, giving good indications about the potential for the solution (such as for network virtualization¹³). Our next step is to validate the approach through testbed integration. Other foreseen application domains include network function virtualization or smart home security. Results will be disseminated to promote open source cloud technologies and will be contributed to major standardization bodies.

Acknowledgments
This work is supported by the European Union Supercloud Project (Horizon 2020 Research and Innovation Program, grant 644962) and by the Swiss Secretariat for Education, Research and Innovation (contract 15.0091). It is based on contributions from the entire Supercloud consortium.

References
1. F. Bonomi et al., "Fog Computing and Its Role in the Internet of Things," Proc. 1st Workshop Mobile Cloud Computing (MCC), 2012, pp. 13–16.
2. F. Manco et al., "The Case for the Superfluid Cloud," Proc. 7th USENIX Workshop Hot Topics in Cloud Computing (HotCloud), 2015.
3. L. Zheng et al., "How to Bid the Cloud," Proc. ACM Conf. Special Interest Group on Data Comm. (SIGCOMM), 2015, pp. 71–84.
4. R. Los, D. Shackleford, and B. Sullivan, Notorious Nine Cloud Computing Top Threats in 2013, tech. report, Cloud Security Alliance, 2013.
5. D. Sgandurra and E. Lupu, "Evolution of Attacks, Threat Models, and Solutions for Virtualized Systems," ACM Computing Surveys, vol. 48, no. 3, 2016, pp. 1–38.
6. D. Williams, H. Jamjoom, and H. Weatherspoon, "Plug into the Supercloud," IEEE Internet Computing, vol. 17, no. 2, 2013, pp. 28–34.
7. A. Ludwig and S. Schmid, "Distributed Cloud Market: Who Benefits from Specification Flexibilities?" ACM SIGMETRICS Performance Evaluation Rev., vol. 43, no. 3, 2015, pp. 38–41.
8. K. Bernsmed et al., "Thunder in the Clouds: Security Challenges and Solutions for Federated Clouds," Proc. IEEE 4th Int'l Conf. Cloud Computing Technology and Science (CloudCom), 2012, doi:10.1109/CloudCom.2012.6427547.
9. M. Ben-Yehuda et al., "The Turtles Project: Design and Implementation of Nested Virtualization," Proc. 9th USENIX Conf. Operating Systems Design and Implementation (OSDI), vol. 10, 2010, pp. 423–436.
10. P. Watson, "Application Security through Federated Clouds," IEEE Cloud Computing, vol. 1, no. 3, 2014, pp. 76–80.
11. D. Kreutz et al., "Software-Defined Networking: A Comprehensive Survey," Proc. IEEE, vol. 103, no. 1, 2015, pp. 14–76.
12. S.T. Zargar et al., "DCDIDP: A Distributed, Collaborative, and Data-Driven Intrusion Detection and Prevention Framework for Cloud Computing Environments," Proc. 7th Int'l Conf. Collaborative Computing: Networking, Applications and Worksharing (CollaborateCom), 2011, pp. 332–341.
13. M. Alaluna, F. Ramos, and N. Ferreira Neves, "(Literally) above the Clouds: Virtualizing the Network over Multiple Clouds," Proc. IEEE Conf. Network Softwarization (NetSoft), 2016, pp. 112–115.

Marc Lacoste is a senior research scientist in the Security Department of Orange Labs, and technical leader of the H2020 Supercloud Project. His main research interests include security architecture, cloud computing security, self-protecting systems, and open security kernels. Lacoste has a PhD in computer science from the University of Grenoble, France. He's a member of ACM. Contact him at marc.lacoste@orange.com.

Markus Miettinen is a researcher in the System Security Lab at the Department of Computer Science at Technische Universität Darmstadt, Germany. His research interests include contextual security, data analysis-based security enablers, and security management in new computation environments such as the Internet of Things. Miettinen has an MSc in computer science from the University of Helsinki, Finland. Contact him at markus.miettinen@trust.tu-darmstadt.de.

Nuno Neves is an associate professor in and head of the University of Lisbon's Department of Computer Science, where he leads the Navigators research group and is on the executive board of the Large-Scale Informatics Systems Laboratory (LaSIGE) research unit. His research interests include the security and dependability aspects of distributed systems. Neves has a PhD in computer science from the University of Illinois Urbana-Champaign. Contact him at nuno@di.fc.ul.pt.

Fernando M.V. Ramos is an assistant professor in the Department of Computer Science at the University of Lisbon. His research interests include network programmability and network virtualization. Ramos has a PhD in computer science and engineering from the University of Cambridge, UK. Contact him at fvramos@ciencias.ulisboa.pt.

Marko Vukolic is a research staff member at IBM Research, Zurich. His research interests are in distributed algorithms and systems, including fault-tolerance, blockchain and distributed ledgers, cloud computing security, and distributed storage. Vukolic has a PhD in distributed systems from Ecole Polytechnique Fédérale de Lausanne (EPFL), Switzerland. Contact him at mvu@zurich.ibm.com.

Fabien Charmet is a research engineer in the Samovar Lab (Centre National de la Recherche Scientifique) in Télécom SudParis, Institut Mines-Télécom. His research interests include penetration testing, network security, and software-defined networking. Charmet has an MSc in enterprise architecture and an MSc in network and security from the University of Lille 1, France. Contact him at fabien.charmet@telecom-sudparis.

Reda Yaich is a researcher in the LabSTICC Laboratory (Centre National de la Recherche Scientifique) in Télécom Bretagne, Institut Mines-Télécom. His research interests include the specification and enforcement of trust and security policies over open, distributed, and decentralized systems. Yaich has a PhD in computer science from Ecole des Mines of Saint-Etienne, France. Contact him at reda.yaich@telecom-bretagne.eu.

Krzysztof Oborzynski is a software architect at Healthcare Informatics Services and Solutions, Clinical Platforms, Philips Healthcare. His research interests include systems reliability, performance, and serviceability. Oborzynski has a PhD in computer science from the Institute of Computing Science, Poznan University of Technology, Poland. Contact him at Krzysztof.Oborzynski@philips.com.

Gitesh Vernekar is a senior manager at Healthcare Informatics Services and Solutions, HealthSuite Digital Platform, Philips Healthcare. His research interests include delivering innovative and repeatable software solutions in healthcare, banking, and high-tech industries, and technology-enabled operations and services. Vernekar has an international MBA in entrepreneurship and strategy management from the Rotterdam School of Management, Erasmus University, The Netherlands. Contact him at Gitesh.Vernekar@philips.com.

Paulo Sousa is chief executive officer at Maxdata Software. His research interests include real-time systems, intrusion tolerance, and security. Sousa has a PhD in computer science from the University of Lisbon. Contact him at paulo.sousa@maxdata.pt.

Standards Now

The Design and Architecture of Microservices

Alan Sill
Texas Tech University, alan.sill@standards-now.org

IEEE Cloud Computing, published by the IEEE Computer Society, 2325-6095/16/$33.00 © 2016 IEEE

To explore the role of design in software, consider two other fields that also depend on it: art and architecture. Like art, much of software design can be a matter of taste. As in the art world, issues that inspire passionate debate in the field of software design resonate most loudly within its internal boundaries, and don't necessarily have much of an effect outside of those boundaries.

Design is also important in architecture, both for aesthetic and physically important reasons. As in the structure of buildings, architectural design can have serious ramifications on the reliability, robustness, and suitability for use of software. Like architects, developers are generally aware of the importance of internal structural elements in software, and study and debate the performance and business reasons for selecting one approach over another.

One would not wish to use the plans for a personal home to build a larger structure, such as a sports arena, concert hall, or high-rise building. In the same way, the choice of architectural design pattern in software must be tuned to the desired application, workload, and expected level of use.

Aesthetics and user reaction are important in all of these settings. No one would argue at this stage, in which products from all vendors tend to be beautifully designed, about the need to pay significant attention to issues of user experience and ease of use in software design. Just as we enjoy beautifully designed and functional buildings, software designs are most enjoyable when they're both useful and artfully built.

Microservice Architectures
Concepts related to microservices are discussed extensively elsewhere in this issue. They are, to some degree, old wine in new bottles. The basic approach of separating services into functions that can interact via programming interfaces has been with us for some time. Methods to implement such separation in the framework of service-oriented architectures (SOAs) are also not new.

Recent implementations of microservices in cloud settings, however, take the SOA idea to new limits that are driven by the goals of rapid, interchangeable, easily adapted, and easily scaled components. This is obviously not the only way to use clouds, but it draws well on the basic functional features of cloud computing and is a good match to the corresponding delivery framework. A continued emphasis on the use of RESTful APIs as discussed in previous Standards Now columns has also accelerated the pace of change and overall utility of cloud service delivery.

The resulting factorization of workloads and incrementally scalable features of microservices provide a multitude of ways by which SOA can be freed from its previously hidebound, overly formal implementation settings and be implemented in much less forbidding ways. One consequence of this evolution is the development of new architectural patterns and the corresponding emergence and use of standards.

As with art and architecture, much discretion is left to the designer in microservice delivery. You might be tempted to think that standards aren't important, or are less important, in the rapidly changing microservices arena, but this assumption, as I am about to show, wouldn't be correct. To a great degree, the flexibility and ease of implementation of modern approaches to microservice architecture is either compatible with or in fact greatly driven by the emergence of successful design patterns that are already in the process of becoming standards.

Microservice Delivery Using Containers
It would be equally incorrect to treat different trends in cloud design as equivalent. The current tendency to implement different sets of software in the context of software containers, for example, is more of a coincidence than a direct consequence of microservice design. It's true that containers can be made to isolate execution environments from each other, and that they lend themselves to scalability by allowing such containers to be instantiated quickly on demand.

Other features of software containers require much more work to overcome, however, such as the need to provide well-thought-through mechanisms for network communications between them and associated complications of their use on different physical hosts or on hosts located in different datacenters. Similar problems crop up with regard to security, monitoring, and the need to minimize the operational size of containers. These issues require careful thought and attention to details that aren't directly related to the SOA aspects of microservices themselves.

Despite these shortcomings, microservice delivery matches well in many ways to deployment in software containers. This method is, in fact, currently the most popular way to deploy them, but dealing effectively with the complications just described absolutely requires the use of standards.

This column has covered the emergence of such standards in this area many times, starting with the appc application container specification originally developed by CoreOS, and the runC container engine originally developed by Docker. Much community work has gone into integrating the approaches of these two specification sets and extending them to newer, broader use cases.

Two current relevant projects of the Linux Foundation are the Open Container Initiative (www.opencontainers.org) and the Cloud Native Computing Foundation (CNCF, https://cncf.io). Popular image formats include ACI, the container image format defined in the appc specification, and OCI, the Open Containers Image Format specification. Much of the work going on within the CNCF is aimed higher up in the software stack and addresses the large-scale behaviors of distributed systems of microservices.

Although work is still in progress on various aspects of each of these standards within these organizations and on their relationship to each other, it's encouraging to see efforts of this sort emerge naturally from ongoing community interests.

Data Formats and APIs
To make microservice architectures work in practice, one must get information into and out of these services and find ways to make the information exchange and control-passing features take place at component boundaries. Programmers must therefore address design topics dealing with data exchange and messaging, and must implement these services with suitable orchestration and control.

Standards exist that provide the basis for such data exchange. The most popular data formats in cloud computing are JavaScript Object Notation (JSON) and XML. JSON is documented in two standards: Ecma International's ECMA-404¹ and IETF's RFC 7159.² XML is a somewhat older but still popular text-based format for data exchange supported by several W3C standards. Although it isn't as human-readable as JSON, each format has particular strengths and weaknesses and both are still very much in use.
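To make the comparison between the two formats concrete, here is a minimal Python sketch that serializes the same record both ways using only the standard library. The record, its field names, and the `reading` root tag are hypothetical and chosen purely for illustration; real services would normally follow a published schema.

```python
import json
import xml.etree.ElementTree as ET

# Hypothetical sensor reading used only for illustration.
reading = {"sensor": "thermo-1", "unit": "C", "value": 21.5}


def to_json(record: dict) -> str:
    """Serialize the record as a compact JSON object."""
    return json.dumps(record, sort_keys=True, separators=(",", ":"))


def to_xml(record: dict, root_tag: str = "reading") -> str:
    """Serialize the record as XML, one child element per field."""
    root = ET.Element(root_tag)
    for key in sorted(record):
        child = ET.SubElement(root, key)
        child.text = str(record[key])
    return ET.tostring(root, encoding="unicode")


def from_xml(text: str) -> dict:
    """Parse the XML form back; every value comes back as a string."""
    return {child.tag: child.text for child in ET.fromstring(text)}


json_text = to_json(reading)
xml_text = to_xml(reading)

# JSON keeps the number typed; plain XML needs a schema or convention to do so.
assert json.loads(json_text) == reading
assert from_xml(xml_text)["value"] == "21.5"

if __name__ == "__main__":
    print(json_text)
    print(xml_text)
```

One trade-off the sketch makes visible: JSON round-trips basic types directly, whereas element-only XML returns text and relies on an external schema or convention for typing.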


For Internet of Things (IoT) and sensor-oriented settings, as discussed in the previous issue on manufacturing, the Sensor Network Object Notation (SNON, www.snon.org) is a representation based on JSON that includes some predefined fields that are especially useful in dealing with sensor data. In addition, the Data Distribution Service (DDS, http://www.omg.org/spec/DDS) and DDS Data Local Reconstruction Layer (DDS-DLRL) specifications were developed by the Object Management Group specifically to handle data interchange tasks related to IoT systems.

Unlike the other protocols I've mentioned, DDS can handle content-aware network routing, data prioritization by transport priorities, and both unicast and multicast communications within the methods defined by the standard set itself.

Additionally, general data standards are available to deal with the wide variety of formats for datasets without having to be locked into a particular format. For example, working with the US National Center for Supercomputing Applications (NCSA, http://www.ncsa.illinois.edu) and IBM, the Open Grid Forum has developed a language for describing the structure of data formats without needing to rewrite them. The resulting Data Format Description Language (DFDL, www.ogf.org/dfdl) is a flexible, general specification set suited to a wide variety of data input, output, and format transcription problems and is supported by both commercial and open source software implementation tools.

Many approaches currently used in microservices create custom APIs for access to specific data. This approach is compatible with, though typically implemented without, reference to external data format standards. As a result, current cloud microservice designs are burdened with a huge variety and multiplicity of API definitions.

In previous columns, I've referred to the API directory maintained, for example, by the website ProgrammableWeb.com, which at the time of this writing maintains a directory of more than 15,000 APIs (www.programmableweb.com/apis/directory). This situation requires APIs either to be designed to work in small subsets of the application arena in which the API is stable, or to be built to a common self-describing or standardized pattern.

Examples of effective API standards are the RESTful API Modeling Language (RAML, http://raml.org) and Swagger, which has evolved into the Open API Initiative (https://openapis.org), as discussed in previous columns.

Messaging Standards
The next step after understanding data formats and APIs for data exchange is to move in the direction of messaging and application control. HTTP and its secure variant HTTPS are the most familiar messaging standards, and are specified in a range of IETF documents summarized at the working group website (httpwg.org/specs).

The IETF specifications underlying TCP form the basis of a large fraction of Internet traffic. TCP continues to receive detailed attention from the community due to its importance in a wide variety of settings. The most important TCP specifications and their relationships with one another are summarized in RFC 7414.³ A number of other application, transport, Internet, and link layer protocols are also useful.⁴

The User Datagram Protocol (UDP) is useful for Internet communications that can be intermittent or don't have to be completely received at all times.⁵ UDP can be used to carry out IP communications in situations in which handshaking and verification of receipt of the individual message packets aren't necessary. The Stream Control Transmission Protocol (SCTP) provides an alternative to TCP and UDP applicable to streaming use cases.⁶

Another example of a manufacturing-relevant specialized transfer standard is the Constrained Application Protocol (CoAP).⁷ According to its description, CoAP provides a request/response interaction model between application endpoints, supports built-in discovery of services and resources, and includes key concepts of the Web such as URIs and Internet media types. CoAP is designed to easily interface with HTTP for integration with the Web while meeting specialized requirements such as multicast support, very low overhead, and simplicity for constrained environments.

The Extensible Messaging and Presence Protocol (XMPP) is an XML-based communications standard designed for message-oriented middleware communications. The core specifications for XMPP are RFCs 6120,⁸ 6121,⁹ and 7622¹⁰ and include a WebSocket binding defined in RFC 7395.¹¹ Several extensions beyond the base specifications are supported by the dedicated XMPP organization (see http://xmpp.org/extensions). Beyond its applications to human-oriented communications, XMPP is also used in smart electrical grid applications and a variety of industrial settings. Several extensions oriented toward use in IoT settings were published in late 2015.

Methods to handle publish/subscribe messaging can have advantages compared to the protocols above when used for machine-to-machine communications at high speeds. The Message Queuing Telemetry Transport (MQTT, http://docs.oasis-open.org/mqtt/mqtt/v3.1.1/os/mqtt-v3.1.1-os.html), recently standardized by the Organization for the Advancement of Structured Information Standards (OASIS), is one example of such a method.

The Advanced Message Queuing Protocol (AMQP) is another popular middleware messaging standard set. It can be applied using either publish/subscribe or point-to-point communication patterns. OASIS published AMQP as a set of standards in 2014 (www.oasis-open.org/standards#amqpv1.0), and it was adopted as a joint International Organization for Standardization/International Electrotechnical Commission (ISO/IEC) standard later that year.¹² AMQP has a layered architecture, and the specification set is organized into different parts to reflect that architecture.

Networking Considerations
Networking provides the core feature that ties all cloud services to each other. I discussed this topic extensively in the May/June 2016 issue of this magazine,⁴ stating there that the underpinnings of the cloud consist of the ways in which otherwise disconnected, highly scaled, and rapidly changing collections of service components can be instantiated and hooked together swiftly and flexibly to form the basis of a cloud service.

The need for performance is especially important in the implementation of microservice architectures. This consideration obviously provides the practical limit to the degree to which individual service components can be scaled down in terms of information exchange and functionality. Security, connectivity between microservice components, and scalability are also affected by the choice of networking architecture and protocols.

A US National Institute of Standards and Technology draft publication covers this topic, with an emphasis on security considerations.¹³ Although the comment period has closed on this particular draft, the topic's general nature suggests to me that the issues discussed in this document will be revisited several times in the near future as the general area of microservice delivery matures.

Special considerations that relate to networking are also pushing some microservice frameworks in directions that lead away from human readability of the interchanged data and even of the on-the-wire protocols used in the API calls and messaging. Examples that I've discussed in previous issues continue to mature, including the recent release by Google of its gRPC framework at version 1.0 (https://github.com/grpc/grpc/releases/tag/v1.0.0) after an extended period of development, with multiple language bindings now available.

The design approach underpinning gRPC makes extensive use of protocol buffers (https://developers.google.com/protocol-buffers/docs/reference/overview), a design construct intended to serialize structured data in a simpler manner than in XML but without some of JSON's limitations, and is designed to be compatible with HTTP/2. My personal belief is that these developments illustrate the emergence of new design trends for cloud services that favor speed and responsiveness over human readability of the exchanged data and API calls, and that might lead to radical revisions of some of the fundamental assumptions that govern microservices in the future.
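The trade-off between compactness and human readability can be illustrated with a small Python sketch. It does not use gRPC or the protocol buffers wire format; instead it assumes a hypothetical fixed binary layout, packed with the standard `struct` module, and compares it with JSON text carrying the same fields. The field names and layout are illustrative assumptions only.

```python
import json
import struct

# Hypothetical message layout, chosen only for illustration:
#   sensor_id: unsigned 16-bit integer
#   timestamp: unsigned 32-bit integer (seconds)
#   value:     64-bit float
# "!" selects network byte order with no padding.
RECORD = struct.Struct("!HId")


def encode_binary(sensor_id: int, timestamp: int, value: float) -> bytes:
    """Pack the fields into a fixed 14-byte record."""
    return RECORD.pack(sensor_id, timestamp, value)


def decode_binary(payload: bytes) -> tuple:
    """Unpack a record; both sides must already agree on the layout."""
    return RECORD.unpack(payload)


def encode_json(sensor_id: int, timestamp: int, value: float) -> bytes:
    """Encode the same fields as self-describing JSON text."""
    message = {"sensor_id": sensor_id, "timestamp": timestamp, "value": value}
    return json.dumps(message, separators=(",", ":")).encode("utf-8")


binary_payload = encode_binary(7, 1475280000, 21.5)
json_payload = encode_json(7, 1475280000, 21.5)

# The binary form is smaller but unreadable without the agreed schema.
assert decode_binary(binary_payload) == (7, 1475280000, 21.5)
assert len(binary_payload) == RECORD.size == 14
assert len(binary_payload) < len(json_payload)
```

The binary payload carries no field names, so sender and receiver must share the layout in advance; schema-driven formats such as protocol buffers manage that agreement through a separately maintained definition, which is the readability cost the column describes.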


The discussion here has focused on the design and architecture of microservices. I've covered considerations related to packaging and delivery of microservices in containers, data exchange and data formats, messaging, and networking, focusing on some up-to-date topics on standards related to these areas.

My next column will address topics related to microservices orchestration, including relevant standards such as Topology and Orchestration Specification for Cloud Applications (Tosca) and Cloud Application Management for Platforms (CAMP); microservices control, including the Open Cloud Computing Interface (OCCI) and Cloud Infrastructure Management Interface (CIMI) standard sets; and serverless microservices, such as Amazon Lambda and related concepts. I'll also take another look at the SOA basis for microservice architectures to tie both of these columns together.

As always, this discussion only represents my own viewpoint. I'd like to hear your opinions and experience in this area. I'm sure other readers of the magazine would also appreciate additional information on this topic.

Please respond with your input on this or previous columns. Please include news you think the community should know in the general areas of cloud standards, compliance, or related topics. I'm happy to review ideas for potential submissions to the magazine or for proposed guest columns. I can be reached for this purpose at alan.sill@standards-now.org.

References
1. Ecma International, "The JSON Data Interchange Format," Ecma-404, 1st ed., 2013; www.ecma-international.org/publications/standards/Ecma-404.htm.
2. T. Bray, ed., "The JavaScript Object Notation (JSON) Data Interchange Format," IETF RFC 7159, 2014; https://www.rfc-editor.org/info/rfc7159.
3. M. Duke et al., "A Roadmap for Transmission Control Protocol (TCP) Specification Documents," IETF RFC 7414, 2015; www.rfc-editor.org/info/rfc7414.
4. A. Sill, "Standards Underlying Cloud Networking," IEEE Cloud Computing, vol. 3, no. 3, 2016, pp. 76-80.
5. J. Postel, "User Datagram Protocol," IETF RFC 768, 1980; www.rfc-editor.org/info/rfc768.
6. R. Stewart, ed., "Stream Control Transmission Protocol," IETF RFC 4960, 2007; www.rfc-editor.org/info/rfc4960.
7. Z. Shelby, K. Hartke, and C. Bormann, "The Constrained Application Protocol (CoAP)," IETF RFC 7252, 2014; www.rfc-editor.org/info/rfc7252.
8. P. Saint-Andre, "Extensible Messaging and Presence Protocol (XMPP): Core," IETF RFC 6120, 2011; www.rfc-editor.org/info/rfc6120.
9. P. Saint-Andre, "Extensible Messaging and Presence Protocol (XMPP): Instant Messaging and Presence," IETF RFC 6121, 2011; www.rfc-editor.org/info/rfc6121.
10. P. Saint-Andre, "Extensible Messaging and Presence Protocol (XMPP): Address Format," IETF RFC 7622, 2015; www.rfc-editor.org/info/rfc7622.
11. L. Stout, ed., "An Extensible Messaging and Presence Protocol (XMPP) Subprotocol for WebSocket," IETF RFC 7395, 2014; www.rfc-editor.org/info/rfc7395.
12. "Information technology -- Advanced Message Queuing Protocol (AMQP)," Int'l Organization for Standardization/Int'l Electrotechnical Commission, ISO/IEC 19464, v.1.0, 2014; www.iso.org/iso/home/store/catalogue_tc/catalogue_detail.htm?csnumber=64955.
13. A. Karmel, R. Chandramouli, and M. Iorga, "NIST Definition of Microservices, Application Containers and System Virtual Machines," Nat'l Inst. of Standards and Technology (NIST) Special Publication 800-180, 2016; http://csrc.nist.gov/publications/drafts/800-180/sp800-180_draft.pdf.

Alan Sill directs the US National Science Foundation's Cloud and Autonomic Computing industry/university cooperative research center. He's interim senior director of the High Performance Computing Center and adjunct professor of physics at Texas Tech University, and visiting professor of distributed computing at the University of Derby. Sill has a PhD in particle physics from American University. He's an active member of IEEE, the Distributed Management Task Force, and the TeleManagement Forum, and he serves as president for the Open Grid Forum. He's a member of several cloud standards working groups and national and international standards roadmap committees, and he remains active in particle physics and advanced computing research. Contact him at alan.sill@standards-now.org.



Blue Skies

Open Issues in Scheduling Microservices in the Cloud

Maria Fazio and Antonio Celesti, University of Messina
Rajiv Ranjan and Chang Liu, Newcastle University
Lydia Chen, IBM Research
Massimo Villari, University of Messina

The adoption of container-based microservices architectures is revolutionizing application design. By adopting a microservices architecture, developers can engineer applications that are composed of multiple lightweight, self-contained, and portable runtime components deployed across a large number of geodistributed servers.

A microservices-based cloud application involves the interoperation of multiple microservices, each developed separately, that can be deployed, updated, and redeployed independently without compromising the integrity of the application's ecosystem. The ability to independently update and redeploy the code base of one or more microservices increases an application's scalability, portability, updatability, and availability, but at the cost of expensive remote calls (instead of in-process calls) and increased overhead for cross-component synchronization.

The microservices-based approach is in contrast to the traditional monolithic development of applications, where each application is a single, autonomous unit. For example, in a client-server application, the server is a monolithic entity that handles HTTP requests, executes logic, and retrieves or updates its data. The problem with such monolithic architectures is that even a small modification of the application's logic requires the deployment of a new running version of the entire code base. A microservice architecture is lightweight and easily shipped and updated. Hence, it's ideal for engineering applications where we cannot fully anticipate functionalities in advance (for example, the types of devices that might one day access the application). Microservice architectures are a part of a larger shift in IT departments towards a DevOps culture, in which development and operations teams work closely together to support an application over its lifecycle, and go through a rapid or even continuous release cycle.

Microservices act as standalone application subunits or components, implementing specific communication protocols for sending and receiving messages. In microservices, data flows through smart endpoints, which also process incoming information. Using well-defined interfaces and protocols, application developers can deploy different microservices on heterogeneous infrastructures without a specific integration framework. Generally, microservice communication uses a REST approach based on HTTP and TCP protocols, XMPP, or JavaScript Object Notation (JSON). However, currently, there are no widely adopted standardized protocols or data formats for microservice communication.1 Microservices deployment and execution also leads to various networking issues. To this end, application developers currently adopt various software-defined networking (SDN) and network function virtualization (NFV) solutions for networking microservices.

2325-6095/16/$33.00 © 2016 IEEE



Figure 1. Comparison of cloud architectures: (a) hypervisor-based application deployment, (b) hypervisor-free containerized microservice, and (c) containerized microservice within hypervisor-managed physical cloud hardware. [Diagram layers, top to bottom: (a) guest processes, runtime libs, guest OS, VM, hypervisor, physical cloud hardware; (b) guest microservice processes, runtime libs, container, container engine, host operating system, physical cloud hardware; (c) guest microservice processes, runtime libs, container, container engine, host operating system, hypervisor, physical cloud hardware.]

Overview of Virtualization Technologies
Hypervisor-based resource virtualization (such as that used by Citrix and VMware) is a key concept in cloud computing. Hypervisor-based virtualization enables cloud providers to create unique virtual machines (VMs) that share a set of physical hardware resources (CPU, memory, network, and disk). Each VM executes distinct operating system instances (ranging from proprietary to open source), which supports fault-tolerant and isolated security context behavior.

Container-based virtualization can be used to create microservices.2 A container is a collection of operating system kernel utilities configured to manage the physical hardware resources used by a particular application component.3 Containerization allows cloud providers to instantiate, relocate, and optimize hardware resources in a more flexible way while providing near-native performance (if deployed in hypervisor-free mode). Because the containers share a single operating system kernel, they incur lower overhead.3 However, container-based virtualization leads to weaker isolation and introduces greater security vulnerabilities than hypervisor-based virtualization.4

From the user viewpoint, each container looks and executes exactly like a standalone operating system. Additionally, in a cloud computing scenario, developers can deploy a higher density of containers (compared to VM density in hypervisor-managed datacenters) on the same physical hardware. Linux container virtualization (LCV) is the most well-known container-based virtualization technology. Popular LCV solutions include Docker, LXC, lmctfy, and OpenVZ.

Figure 1 shows the key architectural differences between hypervisor-based and container-based virtualization. Figure 1a shows application components deployed within a hypervisor-based VM that provides abstraction for full guest operating systems (one per VM). Figure 1b shows microservice deployment within a hypervisor-free containerized environment. Finally, Figure 1c shows microservice deployment within a containerized environment on physical hardware managed by a hypervisor-based VM. After physical hardware (for example, a server or appliance), a downward-facing hypervisor is more suitable for managing infrastructure-as-a-service (IaaS) clouds, whereas containers are more suited for platform-as-a-service (PaaS) clouds. Having said that, hypervisor-free containerization isn't a replacement for traditional hypervisor technologies; the two technologies complement each other and must be carefully analyzed during the application architecture design phase in terms of performance isolation, overhead, and security requirements.

Container Engines for Microservices Scheduling and Management
Several tools can instantiate and manage containers in clouds. Docker Swarm, for example, provides native clustering for Docker containers. It turns a pool of Docker hosts into a single virtual Docker host. Because Docker Swarm serves the standard Docker API, any tool that already communicates with a Docker daemon can use Swarm to transparently scale to multiple hosts. A Docker container manager represents the basic container-oriented technology.

Kubernetes is an open-source technology for automating deployment, operations, and scaling of containerized applications. It groups the containers making up an application into logical units for easy management and discovery (for example, based on their resource requirements and other constraints). Kubernetes also provides horizontal scaling of applications, which can be performed manually or automatically based on CPU load. Finally, it provides automated rollouts and rollbacks and self-healing features.

Magnum is the OpenStack API service that makes container orchestration engines such as Docker Swarm and Kubernetes available as first-class resources in the OpenStack managed datacenter. Magnum uses the Heat service to schedule an operating system image, which contains Docker and Kubernetes, and runs this image on either VMs or a bare metal cluster.

The Google Container Engine provides a commercial service that relies on Docker and Kubernetes for cluster management and orchestration. Similarly, the Amazon Elastic Compute Cloud (EC2) container service supports Docker containers to be deployed on a managed cluster of Amazon EC2 instances. Rackspace is slightly behind with respect to container-based offerings. Its beta service, Carina, is based on Docker Swarm and doesn't provide any elasticity features.

With regard to networking containerized microservices, OpenStack Neutron supports the management of virtual LANs in cloud datacenters by creating ad hoc NFV. NFV uses virtualization technologies to manage core networking functions via software instead of relying on hardware to handle these functions.

Creating NFVs using Open Virtual Network (OVN) technology guarantees an efficient and secure use of the network. OVN complements existing SDN capabilities, adding native support for virtual network abstractions, such as virtual L2 and L3 overlays and security groups. OVN also supports the security inspection of data transfer inside virtual networks (for example, packet inspection); hence it provides extra features useful for increasing customer security and privacy.

Open Issues in Scheduling and Resource Management
Despite the clear technological advances in container and hypervisor-based virtualization technologies, we are yet to realize a standard large-scale, performance-optimized scheduling platform for managing an ecosystem of microservices networked together to create a specialized application stack, such as a multitier Web application or an Internet of Things (IoT) application. Future efforts will focus on solving the following research challenges.

Configuration Selection and Management
A cloud application (for example, a multitier Web application) must typically combine multiple interdependent microservices that provide diverse functionalities (for example, load balancer, webserver, and database server). Moreover, these microservices have both control and dataflow dependencies.

The challenges exist in dealing with heterogeneous configurations of microservices and cloud datacenter resources driven by heterogeneous performance requirements. With the increase in microservice application functionality types (encryption, compression, SQL/NoSQL server, virtual private network, and so on) and the heterogeneity of container engines (LXC, Docker, Google, and Amazon) and underlying cloud datacenter resources, the mapping of microservices to datacenters demands selecting bespoke configurations from an abundance of possibilities,5 which is impossible to resolve manually.

Branded price calculators, available from public cloud providers (Amazon and Azure, for example) and academic projects (Cloudorado), allow comparison of datacenter resource leasing costs. However, these calculators can't recommend or compare configurations across microservices and datacenter resources.

We therefore need new research that focuses on developing techniques for accurately modeling,
representing, and querying configurations of microservices and datacenter resources. In addition, we need general-purpose decision-making techniques, driven by heterogeneous performance requirements, to automate the selection of microservice configurations and their mapping to heterogeneous datacenter resources.5

Application Topology Specification and Composition
To compose a microservices-based application topology, you need to describe the microservices using a well-known standard. For example, you can base microservice descriptions on the Topology and Orchestration Specification for Cloud Applications (Tosca)/YAML along with the usual image representation. Moreover, workloads pertaining to different microservices depend on each other, and changes in one microservice's execution and dataflow will influence those of others. Overall, the topology specification and composition needs to cover the whole life cycle (that is, deploy, patch, monitor, reconfigure, and shutdown), driven by the performance objectives of each microservice as well as the application as a whole.

The Business Process Execution Language (BPEL) and Web Service Choreography Interface (WSCI) are examples of Web service composition languages (agnostic to microservices) used in SOAs. The Resource and Application Description Language (RADL) is designed for composing and deploying VM images to different cloud providers.6 Some application topology composition and specification tools found in literature (Crane, Fig, and Maestro, for example) can't deploy microservices across distributed datacenter hosts.2 Although Tosca supports topology pattern specification, it lacks support for describing data and control flow dependencies between microservices, with a specific focus on identifying event coordination and dataflow mechanisms; properties of microservices in terms of workload features (such as data format, query rate, and runtime I/O dependency); and performance objectives and measures relevant to microservices.

Hence, an important research direction is to investigate an application-agnostic microservices composition framework, which will facilitate knowledge reuse and make it simpler for application engineers to interact with a complex computing platform.

Performance Characterization and Isolation
In a datacenter, microservices can be deployed inside hypervisor-based VMs or on nonvirtualized physical hardware. A recent study found that deployment within VMs imposes additional performance overhead while giving no extra benefit compared to deploying microservice containers on nonvirtualized physical hardware.7 As noted earlier, single containers, such as Docker, can support multiple and heterogeneous microservices that provide various application-specific features in a containerized environment. In this environment, unexpected interference and contention can occur. For some microservices (such as a compression server) storage requirements dominate, whereas for others (for example, transactional query processing by a database server) computational requirements dominate, and for still others (for example, a VPN server) communication requirements dominate. Hence, container scheduling platforms (Kubernetes, Docker Swarm, and so on) must consider which microservices to combine to minimize workload interference and contention. Balancing resource consumption and performance is critical in deciding where to deploy microservices.

Some recent work has investigated performance isolation and interference detection. New hardware design techniques change processor cache architecture partitioning8 or integrate novel insertion policies to pseudo-partition caches to reduce contention.9

Hardware-based approaches add complexity to the processor architecture and are difficult to manage over time. Sriram Govindan and his colleagues developed a scheme to quantify the effects of cache contention between consolidated workloads.10 However, they limit their discussion to cache contention issues, ignoring other hardware resource types. Ripal Nathuji and Aman Kansal present a control theory-based consolidation approach that mitigates the effects of cache, memory, and hardware prefetching contention of coexisting workloads.11 However, their focus is CPU-bound or compute-intensive applications.

Several new research topics are worthy of investigation: performance isolation and characterization techniques when multiple microservices run in the
same container or on the same physical host; live migration of containers to reduce interference and contention; and tradeoffs between live migration and restarting.

Microservice Monitoring
Guaranteed application performance requires clear and real-time understanding of performance metrics across microservices and datacenter resources. However, variations in performance metrics across different microservices and datacenter resources complicate this problem. For example, key performance metrics for SDN resources are throughput and latency; for CPU resources, they're utilization and throughput; and for SQL and NoSQL database microservices, it's query response time. Therefore, how to define and formulate performance metrics coherently across microservices to give a holistic view of data and control flows remains an open issue.

Monitoring tools that were popular in the grid and cluster computing era (for example, R-GMA and Hawkeye) were concerned only with monitoring performance metrics at the datacenter resource level (such as CPU percentage and TCP/IP performance), but not at the microservice level (such as end-to-end request processing latency and communication overhead). Cluster-wide monitoring frameworks (Nagios, Ganglia, Apache Hadoop, and Apache Spark) provide information about hardware metrics (cluster, CPU, and memory utilization, and so on) of cluster resources that might belong to public or private cloud datacenters.12,13 Monitoring frameworks used by the Amazon EC2 Container Service (Amazon CloudWatch) and Kubernetes (Heapster) typically monitor CPU, memory, filesystem, and network usage statistics, so they can't monitor microservice-level performance metrics.

This leads to several new research topics, including development of holistic techniques13 for collecting and integrating monitoring data from all microservices and datacenter resources so administrators or a scheduler (a computer program) can track and understand the impact of runtime uncertainties (for example, failure, load-balancing efficiency, and overloading) on performance without understanding the whole platform's complexity.

Elastic Scheduling and Runtime Adaptation
The elastic scheduling of microservices is a complex research problem due to several runtime uncertainties.

First, it's difficult to estimate microservice workload behavior in terms of request arrival rate, type, and processing time distributions; I/O system behavior; and number of users connecting to different types and mix of microservices. The real challenge in devising microservice-specific workload models is to accurately learn and fit statistical functions to the monitored distributions such as request arrival pattern, CPU usage patterns, memory usage patterns, I/O system behaviors, request processing time distributions, and network usage patterns.

Without knowing the workload behaviors of microservices, it's difficult to make decisions about the types and scale of datacenter resources to be provisioned to microservices at any given time. Furthermore, the availability, load, and throughput of datacenter resources can vary in unpredictable ways, due to failure or congestion of network links.

Kubernetes offers a microservice container reconfiguration feature, which scales by observing CPU usage (elasticity is agnostic to the workload behavior and performance objectives of microservices). Amazon's autoscaling service employs simple threshold-based rules or scheduled actions based on a timetable to regulate infrastructural resources (for example, if the average CPU usage is above 40 percent, add another microservice container). Other cloud providers have implemented similar simple rule-based reactive runtime scheduling techniques: Google's Cloud Platform autoscaler, Rackspace's Auto Scale, Microsoft Azure's Fabric Controller, and IBM's Softlayer autoscale.

To the best of our knowledge, no prior work has developed workload and resource performance prediction models to enable reconfiguration (scaling, descaling, and migration) of microservices on cloud datacenters while ensuring microservice-specific performance objectives. Hence, important new research is investigating predictive workload and performance models to forecast workload input and performance metrics across multiple, coexisting microservices deployed on cloud datacenter resources.
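
The reactive threshold rule quoted in this section (add a container when average CPU usage exceeds 40 percent) can be sketched in a few lines of Python. Only the 40 percent scale-out threshold comes from the text; the scale-in threshold, the replica bounds, and the function name are illustrative assumptions, and this is not the API of Kubernetes, Amazon, or any other provider.

```python
def desired_replicas(current, cpu_samples, scale_out_above=40.0,
                     scale_in_below=10.0, min_replicas=1, max_replicas=10):
    """Return the next container count under a simple reactive rule.

    cpu_samples holds recent CPU usage percentages, one per container.
    scale_in_below, min_replicas, and max_replicas are assumed values.
    """
    if not cpu_samples:
        raise ValueError("at least one CPU sample is required")
    average = sum(cpu_samples) / len(cpu_samples)
    if average > scale_out_above:
        target = current + 1      # overloaded: add one container
    elif average < scale_in_below:
        target = current - 1      # mostly idle: remove one container
    else:
        target = current          # within the band: leave unchanged
    # Clamp so the rule never scales to zero or past the assumed ceiling.
    return max(min_replicas, min(max_replicas, target))
```

The rule looks only at a CPU average, so it reacts after load has already changed and knows nothing about request latency or other microservice-level objectives, which is the limitation the section attributes to threshold-based autoscalers.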




Evolution of Microservice-Powered Cloud Paradigms
Wide-scale adoption of containerization technologies and microservices architectures will strongly influence other emerging computing paradigms.

Cloud Computing and Internet of Things
The combination of cloud computing and the IoT is presenting new opportunities for delivering new types of application services (see Figure 2). For example, private, public, and hybrid cloud providers are looking to integrate their datacenters' software and hardware stacks with embedded devices (including sensors and actuators) to provide IoT as a service (IoTaaS).

Typically, IoT devices run customized software developed with a particular programming language and/or development framework. Minimal processing and storage tasks can be performed in IoT devices (for example, a sensor gateway or SDN virtualization) by deploying lightweight, containerized microservices.14,15 Meanwhile, the massive data storage and processing tasks (data mining and big data analytics) are performed in cloud datacenters that exploit virtualization (both hypervisor and container-based) to elastically scale up/down storage and processing capabilities.

Figure 2. A microservice as the enabler for the IoT application cloud. IoT applications are decomposed into collections of microservices, which are distributed across physical hardware resources available in the cloud and on the network edge. [Diagram labels: storage and processing services (S1 ... Sn); containers (C1 ... C5); VMs (VM1 ... VM4); IoT cloud provider; sensing and actuating services (SA1 ... SAm).]

Federated Clouds
The cloud services market has been growing in recent years, a trend that's confirmed by the number of cloud providers that have appeared on the market. Currently, small and medium cloud providers can't directly compete with the big players (such as Google, Amazon, and Microsoft), so they must implement new business strategies to penetrate the market.16,17

In particular, small and medium providers can establish stronger partnerships to share resources according to the rules of the cloud federation ecosystem they belong to. Small providers can federate with large providers to gain economies of scale, optimize their assets, scale their capabilities, and share resources to establish new forms of collaboration. If a small provider's cloud runs out of capacity, it can migrate its microservices to federated datacenters to ensure business continuity (see Figure 3).

However, federated clouds need to respond to high heterogeneity across independent cloud systems, efficient and secure data exchange among clouds, and the ability to efficiently deploy resources and services across such federated systems. Indeed, the dynamism of a federation with incoming and outgoing providers and variable resource availability makes microservices and containers the best solution to quickly adapt to changes in the federated system.

Figure 3. Microservice as the basis of federating multiple cloud datacenters as part of a cohesive federation, where datacenter providers can meet the performance requirements of client applications through optimal placement and migration of microservices across datacenters. [Diagram labels: cloud users (user, enterprise, government); home cloud services (IaaS, PaaS, SaaS) on servers 1 to N; cloud federation of home cloud, foreign cloud A, and foreign cloud B, each with its own virtual infrastructure; virtual resources owned by the home cloud; virtual resources used by each foreign cloud; virtual resources placed in foreign clouds A and B and rented to the home cloud (home cloud capabilities enlargement).]

Microservices will simplify orchestration of networked applications across heterogeneous cloud datacenters and emerging microdatacenters (on the network edge). However, the creation of such applications (for example, smart city and smart healthcare IoT clouds) requires new research into scheduling and resource management algorithms and platforms for managing highly distributed and networked microservices.

References
1. A. Sill, "The Design and Architecture of Microservices," IEEE Cloud Computing, vol. 3, no. 5, 2016, pp. 76-80.
2. C. Pahl and B. Lee, "Containers and Clusters for Edge Cloud Architectures: A Technology Review," Proc. 3rd Int'l Conf. Future Internet of Things and Cloud (FiCloud), 2015, pp. 379-386.
3. M. Xavier et al., "Performance Evaluation of Container-Based Virtualization for High Performance Computing Environments," Proc. 21st Euromicro Int'l Conf. Parallel, Distributed, and Network-Based Processing (PDP), 2013, pp. 233-240.
4. C. Esposito, A. Castiglione, and K.-K.R. Choo, "Challenges in Delivering Software in the Cloud as Microservices," IEEE Cloud Computing, vol. 3, no. 5, 2016, pp. 10-14.
5. R. Ranjan et al., "Cross-Layer Cloud Resource Configuration Selection in the Big Data Era," IEEE Cloud Computing, vol. 2, no. 3, 2015, pp. 16-22.
6. M. Caballer et al., "Dynamic Management of Virtual Infrastructures," J. Grid Computing, vol. 13, Mar. 2015, pp. 53-70.
7. W. Felter et al., "An Updated Performance Comparison of Virtual Machines and Linux Containers," Proc. IEEE Int'l Symp. Performance Analysis of Systems and Software (ISPASS), 2015, pp. 171-172.
8. M.K. Qureshi and Y.N. Patt, "Utility-Based Cache Partitioning: A Low-Overhead, High-Performance, Runtime Mechanism to Partition Shared Caches," Proc. 39th Ann. IEEE/ACM Int'l Symp. Microarchitecture (Micro 06), 2006, pp. 423-432.
9. Y. Xie and G.H. Loh, "Pipp: Promotion/Insertion Pseudo-Partitioning of Multi-Core Shared Caches," Proc. 36th Ann. Int'l Symp. Computer Architecture (ISCA 09), 2009, pp. 174-183.
10. S. Govindan et al., "Cuanta: Quantifying Effects of Shared On-Chip Resource Interference for Consolidated Virtual Machines," Proc. 2nd ACM Symp. Cloud Computing (SOCC 11), 2011, article 22.
11. R. Nathuji and A. Kansal, "Q-Clouds: Managing Performance Interference Effects for QoS-Aware Clouds," Proc. 5th European Conf. Computer Systems (EuroSys 10), 2010, pp. 237-250.
12. R. Ranjan, "Streaming Big Data Processing in Datacenter Clouds," IEEE Cloud Computing, vol. 1, no. 1, 2014, pp. 78-83.
13. M. Natu et al., "Holistic Performance Monitoring of Hybrid Clouds: Complexities and Future Directions," IEEE Cloud Computing, vol. 3, no. 1, 2016, pp. 72-81.
14. A. Celesti et al., "Exploring Container Virtualization in IoT Clouds," Proc. 2016 IEEE Int'l Conf. Smart Computing (SmartComp), 2016, pp. 1-6.
15. M. Fazio and A. Puliafito, "Cloud4sens: A Cloud-Based Architecture for Sensor Controlling and Monitoring," IEEE Comm., vol. 53, Mar. 2015, pp. 41-47.
16. M. Assis and L. Bittencourt, "A Survey on Cloud Federation Architectures: Identifying Functional and Non-functional Properties," J. Network and Computer Applications, vol. 72, 2016, pp. 51-71.
17. A. Celesti et al., "Characterizing Cloud Federation in IoT," Proc. 30th Int'l Conf. Advanced Information Networking and Applications Workshops (WAINA), 2016, pp. 93-98.

Maria Fazio is an assistant researcher of computer science at the University of Messina. Her research interests include distributed systems and wireless communications, especially with regard to the design and development of cloud solutions for IoT services and applications. Fazio has a PhD in advanced technologies for information engineering from the University of Messina. Contact her at mfazio@unime.it.

Antonio Celesti is a postdoctoral researcher at the University of Messina. His research interests include distributed systems and cloud computing, with particular regard to federation, storage, security, energy efficiency, and assistive technology. Celesti has a PhD in advanced technology for information engineering from the University of Messina, Italy. Contact him at acelesti@unime.it.

Rajiv Ranjan is a reader in the School of Computing Science at Newcastle University, UK; chair professor in the School of Computer, Chinese University of Geosciences, Wuhan, China; and a visiting scientist at Data61, CSIRO, Australia. His research interests include grid computing, peer-to-peer networks, cloud computing, Internet of Things, and big data analytics. Ranjan has a PhD in computer science and software engineering from the University of Melbourne (2009). Contact him at raj.ranjan@ncl.ac.uk or http://rajivranjan.net.

Chang Liu is a research fellow (assistant professor) at Newcastle University, UK. His research interests include cloud computing, big data, distributed systems, Internet of Things, and information security and privacy. Liu has a PhD in information technology from the University of Technology, Sydney, Australia. Contact him at changliu.it@gmail.com.

Lydia Y. Chen is a research staff member at the IBM Zurich Research Lab, Zurich, Switzerland. Her research interests include modeling and optimizing performance and dependability for big data applications and highly virtualized datacenters. She received a PhD in operations research from the Pennsylvania State University. Contact her at yic@zurich.ibm.com.

Massimo Villari is an associate professor of computer science at the University of Messina. His research interests include cloud computing, Internet of Things, big data analytics, and security systems. Villari has a PhD in computer engineering from the University of Messina. He's a member of IEEE and IARIA boards. Contact him at mvillari@unime.it.
