Unit 5


UNIT-V

CASE STUDIES

Google App Engine (GAE) – GAE Architecture – Functional Modules of GAE –
Amazon Web Services (AWS) – GAE Applications – Cloud Software Environments –
Eucalyptus – OpenNebula – OpenStack.

PUBLIC CLOUD PLATFORMS: GAE, AWS, AND AZURE


Commercial cloud platforms
 In this section, we review the system architectures of commercially
available cloud platforms: GAE, AWS, and Azure.
 Cloud services are demanded by computing and IT administrators, software
vendors, and end users.
 Figure 5.1 introduces five levels of cloud players.

 At the top level, individual users and organizational users demand very
different services.
 The application providers at the SaaS level serve mainly individual
users.
 Most business organizations are serviced by IaaS and PaaS providers.
 The infrastructure services (IaaS) provide compute, storage, and
communication resources to both applications and organizational users.
 The cloud environment is defined by the PaaS or platform providers.
 Note that the platform providers support both infrastructure services and
organizational users directly.
 Cloud services rely on new advances in machine virtualization, SOA,
grid infrastructure management, and power efficiency.
 Consumers purchase such services in the form of IaaS, PaaS, or SaaS
as described earlier. Also, many cloud entrepreneurs are selling value-
added utility services to massive numbers of users.

Panimalar Engineering College 1 S.Hariharan


 Table 5.1 summarizes the profiles of five major cloud providers by 2010
standards.

Amazon
 Amazon pioneered the IaaS business in supporting e-commerce and cloud
applications used by millions of customers simultaneously.
 The elasticity in the Amazon cloud comes from the flexibility provided by
the hardware and software services.
 EC2 provides an environment for running virtual servers on demand. S3
provides unlimited online storage space. Both EC2 and S3 are supported in
the AWS platform.
Microsoft
 Microsoft offers the Azure platform for cloud applications.
 It also supports .NET services, Dynamics CRM, Hotmail, and SQL
applications.
Salesforce.com
 Salesforce.com offers extensive SaaS applications for online CRM
using its Force.com platform.
In General,
 In Table 5.1, all IaaS, PaaS, and SaaS models allow users to access services
over the Internet, relying entirely on the infrastructures of the cloud service
providers.
 These models are offered based on various SLAs between the providers and
the users.
 For cloud computing services, it is difficult to find a reasonable precedent
for negotiating an SLA.
 In a broader sense, the SLAs for cloud computing address service
availability, data integrity, privacy, and security protection.
 Blank spaces in the table refer to unknown or underdeveloped features.



Google App Engine (GAE)



 Google has the world’s largest search engine facilities.
 The company has extensive experience in massive data processing that
has led to new insights into data-center design and novel programming
models that scale to incredible sizes.
 The Google platform is based on its search engine expertise, but as
discussed earlier with MapReduce, this infrastructure is applicable to many
other areas.
 Google has hundreds of data centers and has installed more than
460,000 servers worldwide.
– For example, 200 Google data centers are used at one time for a
number of cloud applications.
 Data items are stored as text, images, and video and are replicated to
tolerate faults or failures.
 Here we discuss Google App Engine (GAE), which offers a PaaS
platform supporting various cloud and web applications.

Google Cloud Infrastructure


 Google has explored cloud development by leveraging the large number of
data centers it operates.
– For example, Google explored cloud services in Gmail, Google
Docs, and Google Earth, among other applications.
– These applications can support a large number of users
simultaneously.
 Notable technology achievements include the Google File System (GFS),
MapReduce, BigTable, and Chubby.
 In 2008, Google announced the GAE web application platform, which is
becoming a common platform for many small cloud service providers.
 This platform specializes in supporting scalable (elastic) web applications.
 GAE enables users to run their applications on a large number of data
centers associated with Google’s search engine operations.



GAE Architecture

 The major building blocks of the Google cloud platform, which are used
to deliver cloud services, are shown in the figure below.

Building blocks of the Google cloud platform


 GFS is used for storing large amounts of data.
 MapReduce is for use in application program development.
 Chubby is used for distributed application lock services.
 BigTable offers a storage service for accessing structured data.
 Users can interact with Google applications via the web interface provided
by each application.
Third-party application providers
 Third-party application providers can use GAE to build cloud
applications for providing services.
– The applications all run in data centers under tight management by
Google engineers.
– Inside each data center, there are thousands of servers forming
different clusters.
 Google is one of the larger cloud application providers, although its
fundamental service program is private and outside people cannot use the
Google infrastructure to build their own service.



Building blocks of Google’s cloud computing application
 The building blocks of Google’s cloud computing application include:
– The Google File System for storing large amounts of data,
– The MapReduce programming framework for application developers,
– Chubby for distributed application lock services, and
– BigTable as a storage service for accessing structured or
semi-structured data.
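
To make the MapReduce building block more concrete, the following is a minimal single-process sketch of the map, shuffle, and reduce steps applied to word counting. It is only an illustration of the programming model: the function names (map_phase, shuffle, reduce_phase, word_count) are hypothetical, and nothing here uses Google's actual MapReduce framework, which distributes these steps across a cluster.

```python
from collections import defaultdict


def map_phase(document):
    """Map step: emit a (word, 1) pair for every word in one document."""
    return [(word.lower(), 1) for word in document.split()]


def shuffle(pairs):
    """Group emitted values by key, as a framework would between phases."""
    groups = defaultdict(list)
    for key, value in pairs:
        groups[key].append(value)
    return groups


def reduce_phase(key, values):
    """Reduce step: combine all values for one key into a single result."""
    return key, sum(values)


def word_count(documents):
    """Run map, shuffle, and reduce in sequence over a list of documents."""
    pairs = [pair for doc in documents for pair in map_phase(doc)]
    return dict(reduce_phase(k, v) for k, v in shuffle(pairs).items())


counts = word_count(["the cloud scales", "the cloud stores data"])
```

In a real deployment the map and reduce calls would run on many machines in parallel; here they run in one process so the data flow is easy to follow.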

 With these building blocks, Google has built many cloud applications.
 The figure above shows the overall architecture of the Google cloud
infrastructure.
 A typical cluster configuration can run the Google File System,
MapReduce jobs, and BigTable servers for structured data.
 Extra services such as Chubby for distributed locks can also run in the
clusters.

 GAE runs the user program on Google’s infrastructure.


– As it is a platform running third-party programs, application
developers now do not need to worry about the maintenance of
servers.
– GAE can be thought of as the combination of several software
components.
– The frontend is an application framework which is similar to
other web application frameworks such as ASP, J2EE, and JSP.
– At the time of this writing, GAE supports Python and Java
programming environments.
– The applications can run similar to web application containers.
– The frontend can be used as the dynamic web serving
infrastructure which can provide the full support of common
technologies.



Functional Modules of GAE



 The GAE platform comprises the following five major components.
 The GAE is not an infrastructure platform, but rather an application
development platform for users.
 We describe the component functionalities separately.
– 1) The data store offers object-oriented, distributed, structured data
storage services based on BigTable techniques.
 The data store secures data management operations.
– 2) The application runtime environment offers a platform for
scalable web programming and execution.
 It supports two development languages: Python and Java.
– 3) The software development kit (SDK) is used for local application
development.
 The SDK allows users to execute test runs of local applications
and upload application code.
– 4) The administration console is used for easy management of user
application development cycles, instead of for physical resource
management.
– 5) The GAE web service infrastructure provides special interfaces to
guarantee flexible use and management of storage and network
resources by GAE.

 Google offers essentially free GAE services to all Gmail account owners.
 You can register for a GAE account or use your Gmail account name to
sign up for the service.
 The service is free within a quota.
– If you exceed the quota, the page instructs you on how to pay for
the service.
– Then you download the SDK and read the Python or Java guide to get
started.
 Note that GAE only accepts Python, Ruby, and Java programming
languages.



 The platform does not provide any IaaS services, unlike Amazon, which
offers IaaS and PaaS.
 This model allows the user to deploy, on top of the cloud infrastructure,
user-built applications that are written using the programming
languages and software tools supported by the provider (e.g., Java,
Python).
 Azure does this similarly for .NET.
– The user does not manage the underlying cloud infrastructure.
– The cloud provider facilitates support of application development,
testing, and operation support on a well-defined service platform.

Amazon Web Services (AWS)


 VMs can be used to share computing resources both flexibly and safely.
 Amazon has been a leader in providing public cloud services
(http://aws.amazon.com/).
 Amazon applies the IaaS model in providing its services. Figure shows the
AWS architecture.

 EC2 provides the virtualized platforms to host the VMs where cloud
applications can run.
 S3 (Simple Storage Service) provides the object-oriented storage service
for users.



 EBS (Elastic Block Store) provides the block storage interface which
can be used to support traditional applications.
 SQS stands for Simple Queue Service, and its job is to ensure a reliable
message service between two processes.
– The message can be kept reliably even when the receiver processes
are not running.
– Users can access their objects through SOAP with either browsers or
other client programs which support the SOAP standard.
 Table 4.6 summarizes the service offerings by AWS in 12 application
tracks.
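
The SQS property noted above, that a message is kept even when the receiving process is not running, can be illustrated with a toy in-memory queue. This is a sketch under stated assumptions: ToyQueue and its send and receive methods are hypothetical names and are not the AWS SQS API, and strict first-in, first-out ordering is a simplification made for the example.

```python
from collections import deque


class ToyQueue:
    """In-memory stand-in for a message queue (not the AWS SQS API)."""

    def __init__(self):
        self._messages = deque()

    def send(self, body):
        # The sender only talks to the queue; no receiver has to be running.
        self._messages.append(body)

    def receive(self):
        # Return the oldest stored message, or None if the queue is empty.
        return self._messages.popleft() if self._messages else None


queue = ToyQueue()

# The sender enqueues two messages while no receiver exists yet.
queue.send("order-1")
queue.send("order-2")

# A receiver that starts later still finds both messages waiting.
received = [queue.receive(), queue.receive(), queue.receive()]
```

The queue decouples the two processes: the sender finishes its work immediately, and the receiver drains the backlog whenever it comes online.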

 Amazon offers queuing and notification services (SQS and SNS), which
are implemented in the AWS cloud.
 Note that brokering systems run very efficiently in clouds and offer a
striking model for controlling sensors and providing office support of
smartphones and tablets.



Different from Google,
 Amazon provides a more flexible cloud computing platform for developers to
build cloud applications.
– Small and medium-size companies can put their business on the
Amazon cloud platform.
– Using the AWS platform, they can service large numbers of Internet
users and make profits through those paid services.
 ELB automatically distributes incoming application traffic across multiple
Amazon EC2 instances and allows users to avoid nonoperating nodes and
to equalize load on functioning images.
 Both autoscaling and ELB are enabled by CloudWatch, which monitors
running instances.
 Amazon DevPay is a simple-to-use online billing and account management
service that makes it easy for businesses to sell applications that are built
into or run on top of AWS.
 FPS (Flexible Payments Service) provides developers of commercial systems
on AWS with a convenient way to charge Amazon’s customers that use such
services built on AWS.
 Customers can pay using the same login credentials, shipping address, and
payment information they already have on file with Amazon.
 FWS (Fulfillment Web Service) allows merchants to access Amazon’s
fulfillment capabilities through a simple web service interface.
 Merchants can send order information to Amazon to fulfill customer orders
on their behalf.
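
The ELB behavior described earlier in this section (spreading requests across EC2 instances while skipping nonoperating nodes) can be sketched as a round-robin balancer with health flags. This is a simplified model, not the AWS implementation or API: the ToyLoadBalancer class, its methods, and the instance names are all hypothetical, and a real balancer would learn instance health from monitoring rather than from explicit calls.

```python
from itertools import cycle


class ToyLoadBalancer:
    """Round-robin routing that skips instances marked unhealthy."""

    def __init__(self, instances):
        self._instances = list(instances)
        self._healthy = set(self._instances)
        self._ring = cycle(self._instances)

    def mark_unhealthy(self, instance):
        self._healthy.discard(instance)

    def mark_healthy(self, instance):
        # Only instances registered at construction time can be re-enabled.
        if instance in self._instances:
            self._healthy.add(instance)

    def route(self):
        """Return the next healthy instance in round-robin order."""
        if not self._healthy:
            raise RuntimeError("no healthy instances available")
        # One full pass over the ring is enough to reach a healthy instance.
        for _ in range(len(self._instances)):
            candidate = next(self._ring)
            if candidate in self._healthy:
                return candidate
        raise RuntimeError("no healthy instances available")


balancer = ToyLoadBalancer(["i-a", "i-b", "i-c"])
balancer.mark_unhealthy("i-b")  # assume monitoring reported i-b as down
routed = [balancer.route() for _ in range(4)]
```

With i-b marked down, requests alternate between the two remaining instances, which is the "equalize load on functioning images" behavior in miniature.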



GAE Applications

 Well-known GAE applications include:
– The Google Search Engine, Google Docs, Google Earth, and Gmail.
 These applications can support large numbers of users simultaneously.
 Users can interact with Google applications via the web interface
provided by each application.
 Third-party application providers can use GAE to build cloud
applications for providing services.
 The applications are all run in the Google data centers.
– Inside each data center, there might be thousands of server
nodes to form different clusters.
– Each cluster can run multipurpose servers.
 GAE supports many web applications.
 Storage service
– One is a storage service to store application-specific data in the
Google infrastructure.
– The data can be persistently stored in the backend storage
server while still providing the facility for queries, sorting, and even
transactions similar to traditional database systems.
 Google-specific services
– GAE also provides Google-specific services, such as the Gmail
account service (which is the login service, that is, applications can
use the Gmail account directly).
– This can eliminate the tedious work of building customized user
management components in web applications.
– Thus, web applications built on top of GAE can use the APIs for
authenticating users and sending e-mail using Google accounts.



CLOUD SOFTWARE ENVIRONMENTS
Cloud Software Environment
 In this section, we will assess popular cloud operating systems and
emerging software environments.
 We cover the open source Eucalyptus and Nimbus, then examine
OpenNebula, Sector/Sphere, and OpenStack.
Open Source Eucalyptus and Nimbus
 Eucalyptus is a product from Eucalyptus Systems (www.eucalyptus.com)
that was developed out of a research project at the University of California,
Santa Barbara.
 Eucalyptus was initially aimed at bringing the cloud computing paradigm
to academic supercomputers and clusters.
 Eucalyptus provides an AWS-compliant EC2-based web service interface
for interacting with the cloud service.
 Additionally, Eucalyptus provides services, such as the AWS-compliant
Walrus, and a user interface for managing users and images.
Eucalyptus Architecture
 The Eucalyptus system is an open software environment.
 Figure shows the architecture based on the need to manage VM images.

Fig: The Eucalyptus architecture for VM image management.

 The system supports cloud programmers in VM image management as
follows.
 Essentially, the system has been extended to support the development of
both the compute cloud and storage cloud.



VM Image Management
 Eucalyptus takes many design cues from Amazon’s EC2, and its image
management system is no different.
 Eucalyptus stores images in Walrus, the block storage system that is
analogous to the Amazon S3 service.
 As such, any user can bundle their own root file system, and upload
and then register this image and link it with a particular kernel and
ram-disk image.
 This image is uploaded into a user-defined bucket within Walrus, and
can be retrieved anytime from any availability zone.
 This allows users to create specialty virtual appliances
(http://en.wikipedia.org/wiki/Virtual_appliance) and deploy them within
Eucalyptus with ease.
 The Eucalyptus system is available in a commercial proprietary version, as
well as the open source version we just described.

Nimbus
 Nimbus is a set of open source tools that together provide an IaaS cloud
computing solution.
 The figure shows the architecture of Nimbus, which allows a client to
lease remote resources by deploying VMs on those resources and
configuring them to represent the environment desired by the user.

Fig: Nimbus cloud infrastructure.



Nimbus Web.
 To this end, Nimbus provides a special web interface known as Nimbus
Web.
 Its aim is to provide administrative and user functions in a friendly
interface.
 Nimbus Web is centered around a Python Django web application that is
intended to be deployable completely separate from the Nimbus service.
 As shown in Figure, a storage cloud implementation called Cumulus has
been tightly integrated with the other central services, although it can also
be used stand-alone.
Cumulus
 Cumulus is compatible with the Amazon S3 REST API, but extends its
capabilities by including features such as quota management.
 Therefore, clients such as boto and s3cmd, which work against the S3 REST
API, work with Cumulus.
 On the other hand, the Nimbus cloud client uses the Java Jets3t library to
interact with Cumulus.
 Nimbus supports two resource management strategies.
– The first is the default “resource pool” mode.
 In this mode, the service has direct control of a pool of VM
manager nodes and it assumes it can start VMs.
– The other supported mode is called “pilot.”
 Here, the service makes requests to a cluster’s Local Resource
Management System (LRMS) to get a VM manager available to
deploy VMs.
 Nimbus also provides an implementation of Amazon’s EC2 interface that
allows users to use clients developed for the real EC2 system against
Nimbus-based clouds.



OpenNebula and OpenStack
OpenNebula

 OpenNebula is an open source toolkit which allows users to transform
existing infrastructure into an IaaS cloud with cloud-like interfaces.
 Figure shows the OpenNebula architecture and its main components.
 The architecture of OpenNebula has been designed to be flexible and
modular to allow integration with different storage and network
infrastructure configurations, and hypervisor technologies.

Fig: OpenNebula architecture and its main components.

 Here, the core is a centralized component that manages the VM full life
cycle, including setting up networks dynamically for groups of VMs and
managing their storage requirements, such as VM disk image deployment
or on-the-fly software environment creation.
 Another important component is the capacity manager or scheduler.
– It governs the functionality provided by the core.
– The default capacity scheduler is a requirement/rank matchmaker.
– However, it is also possible to develop more complex scheduling
policies, through a lease model and advance reservations.
 The last main components are the access drivers.
– They provide an abstraction of the underlying infrastructure to
expose the basic functionality of the monitoring, storage, and
virtualization services available in the cluster.
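
The requirement/rank matchmaking idea behind the default scheduler can be sketched as a two-step selection: first discard hosts that fail the VM's requirement, then pick the remaining host with the best rank. This is a conceptual illustration only. The match_host function, the host dictionaries, and their field names are hypothetical and do not reflect OpenNebula's actual scheduler code or template syntax.

```python
def match_host(hosts, requirement, rank):
    """Pick the highest-ranked host among those meeting the requirement.

    requirement: function(host) -> bool, the hard constraint.
    rank: function(host) -> number, higher means more preferred.
    Returns None when no host satisfies the requirement.
    """
    candidates = [host for host in hosts if requirement(host)]
    if not candidates:
        return None
    return max(candidates, key=rank)


# Hypothetical monitoring data for three physical hosts.
hosts = [
    {"name": "node1", "free_cpu": 2, "free_mem_mb": 4096},
    {"name": "node2", "free_cpu": 8, "free_mem_mb": 2048},
    {"name": "node3", "free_cpu": 4, "free_mem_mb": 8192},
]

# The VM needs at least 4 GB of memory; prefer the host with most free CPU.
chosen = match_host(
    hosts,
    requirement=lambda host: host["free_mem_mb"] >= 4096,
    rank=lambda host: host["free_cpu"],
)
```

Here node2 has the most free CPU but is filtered out by the memory requirement, so the choice falls to node3. Changing the rank function changes the placement policy without touching the filtering step.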



 Therefore, OpenNebula is not tied to any specific environment and can
provide a uniform management layer regardless of the virtualization
platform.
 Additionally, OpenNebula offers management interfaces to integrate the
core’s functionality within other data-center management tools, such as
accounting or monitoring frameworks.
 To this end, OpenNebula implements the libvirt API, an open interface for
VM management, as well as a command-line interface (CLI).
 A subset of this functionality is exposed to external users through a cloud
interface.
 OpenNebula is able to adapt to organizations with changing resource
needs, including addition or failure of physical resources.
 Some essential features to support changing environments are live
migration and VM snapshots.
 Furthermore, when the local resources are insufficient, OpenNebula can
support a hybrid cloud model by using cloud drivers to interface with
external clouds.
 This lets organizations supplement their local infrastructure with
computing capacity from a public cloud to meet peak demands, or
implement HA strategies.
 OpenNebula currently includes an EC2 driver, which can submit requests
to Amazon EC2 and Eucalyptus, as well as an ElasticHosts driver.
 Regarding storage, an Image Repository allows users to easily specify disk
images from a catalog without worrying about low-level disk configuration
attributes or block device mapping.
 Also, image access control is applied to the images registered in the
repository, hence simplifying multiuser environments and image sharing.
 Nevertheless, users can also set up their own images.



OpenStack
 OpenStack was introduced by Rackspace and NASA in July 2010.
 The project is building an open source community spanning
technologists, developers, researchers, and industry to share resources
and technologies with the goal of creating a massively scalable and
secure cloud infrastructure.
 In the tradition of other open source projects, the software is open source
and limited to just open source APIs such as Amazon.
 Currently, OpenStack focuses on two aspects of cloud computing,
compute and storage, with the OpenStack Compute and OpenStack
Storage solutions.
 “OpenStack Compute is the internal fabric of the cloud creating and
managing large groups of virtual private servers” and “OpenStack Object
Storage is software for creating redundant, scalable object storage using
clusters of commodity servers to store terabytes or even petabytes of data.”
 Recently, an image repository was prototyped.
– The image repository contains an image registration and discovery
service and an image delivery service.
– Together they deliver images to the compute service while
obtaining them from the storage service.
 This development gives an indication that the project is striving to
integrate more services into its portfolio.

OpenStack Compute
 As part of its computing support efforts, OpenStack is developing a
cloud computing fabric controller, a component of an IaaS system,
known as Nova.
 The architecture for Nova is built on the concepts of shared-nothing and
messaging-based information exchange.
 Hence, most communication in Nova is facilitated by message queues.
 To prevent blocking components while waiting for a response from
others, deferred objects are introduced.
 Such objects include callbacks that get triggered when a response is
received.
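
The non-blocking pattern described above, where a caller registers a callback that fires once a response arrives, can be sketched with Python's standard concurrent.futures module. This is a stand-in chosen for illustration and is not the deferred mechanism Nova itself uses; the slow_lookup and on_response functions and the instance identifier are hypothetical.

```python
from concurrent.futures import ThreadPoolExecutor


def slow_lookup(instance_id):
    """Stand-in for a request to another component that takes a while."""
    return {"instance": instance_id, "state": "running"}


results = []


def on_response(future):
    """Callback triggered once the response is available."""
    results.append(future.result())


with ThreadPoolExecutor(max_workers=2) as pool:
    # submit() returns immediately with a future; the caller is not blocked.
    future = pool.submit(slow_lookup, "i-123")
    future.add_done_callback(on_response)
    # The caller is free to do other work here while the lookup runs.

# Leaving the with-block waits for outstanding work, so the callback has run.
```

The key point is that the requesting component never waits in place for the reply: it hands over a callback and continues, and the callback runs when the response is received.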



 This is very similar to established concepts from parallel computing, such
as “futures,” which have been used in the grid community by projects such
as the CoG Kit.
 To achieve the shared-nothing paradigm, the overall system state is kept
in a distributed data system.
 State updates are made consistent through atomic transactions.
 Nova is implemented in Python while utilizing a number of externally
supported libraries and components.
 This includes boto, an Amazon API provided in Python, and Tornado, a
fast HTTP server used to implement the S3 capabilities in OpenStack.
 The figure shows the main architecture of OpenStack Compute.
 In this architecture,
– The API Server receives HTTP requests from boto, converts the
commands to and from the API format, and forwards the requests to
the cloud controller.
– The cloud controller maintains the global state of the system,
ensures authorization while interacting with the User Manager via
Lightweight Directory Access Protocol (LDAP), interacts with the
S3 service, and manages nodes, as well as storage workers through
a queue.

Fig: OpenStack Nova system architecture.

 Additionally, Nova integrates networking components to manage private
networks, public IP addressing, virtual private network (VPN)
connectivity, and firewall rules.
 It includes the following types of networking nodes:
– Network Controller manages address and virtual LAN (VLAN)
allocations
– Routing Node governs the NAT (network address translation)
conversion of public IPs to private IPs, and enforces firewall rules
– Addressing Node runs Dynamic Host Configuration Protocol (DHCP)
services for private networks
– Tunneling Node provides VPN connectivity

 The network state (managed in the distributed object store) consists of the
following:
– VLAN assignment to a project
– Private subnet assignment to a security group in a VLAN
– Private IP assignments to running instances
– Public IP allocations to a project
– Public IP associations to a private IP/running instance
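
The five items of network state listed above can be pictured as one record per project. The sketch below models them with a Python dataclass and adds a lookup that mimics the public-to-private translation a routing node performs. All names (ProjectNetworkState, associate, translate) and all addresses are hypothetical, and the layout is an assumption for illustration, not Nova's actual data model.

```python
from dataclasses import dataclass, field


@dataclass
class ProjectNetworkState:
    """Hypothetical per-project view of the network state."""

    project: str
    vlan_id: int                                       # VLAN assigned to the project
    subnets: dict = field(default_factory=dict)        # security group -> private subnet
    private_ips: dict = field(default_factory=dict)    # instance id -> private IP
    public_ips: set = field(default_factory=set)       # public IPs allocated to project
    associations: dict = field(default_factory=dict)   # public IP -> private IP

    def associate(self, public_ip, instance_id):
        """Bind an allocated public IP to a running instance's private IP."""
        if public_ip not in self.public_ips:
            raise ValueError("public IP is not allocated to this project")
        self.associations[public_ip] = self.private_ips[instance_id]

    def translate(self, public_ip):
        """NAT-style lookup: public IP -> private IP, or None if unbound."""
        return self.associations.get(public_ip)


state = ProjectNetworkState(project="demo", vlan_id=100)
state.subnets["web"] = "10.0.0.0/24"
state.private_ips["i-1"] = "10.0.0.5"
state.public_ips.add("203.0.113.10")
state.associate("203.0.113.10", "i-1")
translated = state.translate("203.0.113.10")
```

Keeping allocation (public_ips) separate from association (associations) mirrors the list above, where a public IP is first allocated to a project and only later tied to a specific instance.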

OpenStack Storage
 The OpenStack storage solution is built around a number of interacting
components and concepts, including a proxy server, a ring, an object
server, a container server, an account server, replication, updaters, and
auditors.
 The role of the proxy server is to enable lookups to the accounts,
containers, or objects in OpenStack storage rings and route the requests.
– Thus, any object is streamed to or from an object server directly
through the proxy server to or from the user.
 A ring represents a mapping between the names of entities stored on disk
and their physical locations.
– Separate rings for accounts, containers, and objects exist.
– A ring includes the concept of using zones, devices, partitions, and
replicas.
 Hence, it allows the system to deal with failures, and isolation of zones
representing a drive, a server, a cabinet, a switch, or even a data center.
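
A much-simplified sketch of the ring idea follows: an entity name is hashed to a partition, and each partition maps to a fixed number of replica devices placed in distinct zones. This is not OpenStack's ring implementation. The constants, the device list, the choice of SHA-256, and the placement rule are all assumptions made to keep the example short.

```python
import hashlib

PARTITION_COUNT = 8   # assumed small value for illustration
REPLICAS = 2          # assumed replica count

# Hypothetical devices, each tagged with the zone it lives in.
devices = [
    {"id": 0, "zone": 1},
    {"id": 1, "zone": 1},
    {"id": 2, "zone": 2},
    {"id": 3, "zone": 3},
]


def partition_for(name):
    """Hash an entity name to a partition number."""
    digest = hashlib.sha256(name.encode("utf-8")).digest()
    return int.from_bytes(digest[:4], "big") % PARTITION_COUNT


def devices_for(partition):
    """Choose REPLICAS devices in distinct zones for one partition."""
    chosen, used_zones = [], set()
    for step in range(len(devices)):
        device = devices[(partition + step) % len(devices)]
        if device["zone"] not in used_zones:
            chosen.append(device)
            used_zones.add(device["zone"])
        if len(chosen) == REPLICAS:
            break
    return chosen


def locate(name):
    """Map an entity name to the devices holding its replicas."""
    return devices_for(partition_for(name))


replica_devices = locate("account/container/object")
```

Because the replicas of a partition never share a zone, losing one zone (a drive, a server, or a whole cabinet, depending on how zones are defined) still leaves a copy reachable elsewhere.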



 Weights can be used to balance the distribution of partitions on drives
across the cluster, allowing users to support heterogeneous storage
resources.
 According to the documentation, “the Object Server is a very simple blob
storage server that can store, retrieve and delete objects stored on local
devices.”
 Objects are stored as binary files with metadata stored in the file’s extended
attributes.
 This requires that the underlying file system used by the object
servers supports extended attributes on files, which is often not the case
for standard Linux installations.
 To list objects, a container server can be utilized.
 Listing of containers is handled by the account server.
 The first release of OpenStack “Austin” Compute and Object Storage was
October 22, 2010.

