Escience Poster
Platforms
Amanda Calatrava and Germán Moltó
Instituto de Instrumentación para Imagen Molecular (I3M), Grupo de Grid y Computación de Altas Prestaciones (GRyCAP)
Universitat Politècnica de València (UPV), Valencia (Spain)
amcaar@i3m.upv.es, gmolto@dsic.upv.es
Introduction
Clusters of PCs are among the most widely used computing platforms in science and engineering, and they support different programming models. However, they suffer
from limited customizability, difficult extensibility and complex workload balancing. Major improvements in hypervisor technologies and virtualization have paved
the way for Cloud computing. This paradigm addresses those disadvantages with fully customizable virtual machines (VMs) that decouple application
execution from the underlying hardware and are dynamically provisioned and released on a pay-as-you-go basis. Moreover, a virtual cluster can be automatically enlarged and
shrunk to follow increases and decreases in the workload, so that its size matches the demand. To meet the challenges posed by traditional
clusters, this PhD project aims to develop a tool that manages all aspects of executing scientific applications on virtual hybrid elastic clusters [1],
abstracting the details of cluster deployment, configuration and management (transparency).
Cloud computing is a model for enabling ubiquitous, convenient, on-demand network access to
a shared pool of configurable computing resources (e.g., networks, servers, storage, applications,
and services) that can be rapidly provisioned and released with minimal management effort or
service provider interaction.
Source: The NIST Definition of Cloud Computing (September 2011).
On-demand self-service.
Resource virtualization.
Pay-as-you-go.
Broad network access.
Transparency.
Rapid elasticity.
Elasticity Management:
Ability to adapt the size of the
cluster to the workload,
adding or removing nodes
(horizontal elasticity).
Managed by CLUES [3].
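The horizontal elasticity described above can be illustrated with a minimal sketch. It is not the CLUES algorithm: the function names (`desired_nodes`, `scaling_action`), the parameters, and the simple "one job per slot" sizing rule are assumptions made only for illustration.

```python
import math


def desired_nodes(queued_jobs, running_jobs, slots_per_node,
                  min_nodes=0, max_nodes=10):
    """Return the number of worker nodes needed for the current workload.

    Hypothetical policy: every job occupies one slot, and the result is
    clamped to the range [min_nodes, max_nodes].
    """
    if slots_per_node <= 0:
        raise ValueError("slots_per_node must be positive")
    needed = math.ceil((queued_jobs + running_jobs) / slots_per_node)
    return max(min_nodes, min(max_nodes, needed))


def scaling_action(current_nodes, target_nodes):
    """Translate a target size into a scale-out or scale-in decision."""
    delta = target_nodes - current_nodes
    if delta > 0:
        return ("scale_out", delta)   # add nodes
    if delta < 0:
        return ("scale_in", -delta)   # remove nodes
    return ("none", 0)
```

For example, with 5 queued jobs, 3 running jobs and 2 slots per node, `desired_nodes(5, 3, 2)` returns 4. A cluster that currently has 2 workers would then get `scaling_action(2, 4)`, that is, `("scale_out", 2)`.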
Resource Management:
Uses the Infrastructure Manager [4], which provides access to
multiple infrastructures on which to deploy the resources. Hybrid Clouds
(on-premise Cloud + public Cloud) with resources connected via
VPN. Spot instances. Automatic configuration (via Ansible) and
monitoring.
Data Management:
Fault tolerance provided by data replication. Efficient data access and
transfer.
[Diagram labels: User/Administrator; IM Client; Infrastructure Manager]
Job Management:
Automatically installs and configures a
Local Resource Management System (Torque,
SGE and SLURM are currently supported). Fault
tolerance via checkpointing and migration.
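Checkpointing, mentioned above as a fault-tolerance mechanism, can be sketched at the application level as follows. This is only an illustration under stated assumptions: the poster does not say which checkpointing mechanism is used, and the helper names (`save_checkpoint`, `load_checkpoint`, `run_job`), the JSON state file and the `stop_after` parameter that simulates an interruption are all hypothetical.

```python
import json
import os
import tempfile


def save_checkpoint(path, state):
    """Write the state atomically: temp file first, then rename."""
    directory = os.path.dirname(os.path.abspath(path))
    fd, tmp_path = tempfile.mkstemp(dir=directory)
    with os.fdopen(fd, "w") as handle:
        json.dump(state, handle)
    os.replace(tmp_path, path)  # atomic when both paths are on one filesystem


def load_checkpoint(path, default):
    """Return the saved state, or the default if no checkpoint exists."""
    try:
        with open(path) as handle:
            return json.load(handle)
    except FileNotFoundError:
        return default


def run_job(path, total_steps, stop_after=None):
    """Sum 0..total_steps-1, saving a checkpoint after every step.

    stop_after simulates a node failure after that many steps in this run.
    """
    state = load_checkpoint(path, {"step": 0, "acc": 0})
    executed = 0
    while state["step"] < total_steps:
        if stop_after is not None and executed >= stop_after:
            return state  # simulated interruption
        state["acc"] += state["step"]
        state["step"] += 1
        executed += 1
        save_checkpoint(path, state)
    return state
```

Because the state file can be read from any node that shares the storage, a job interrupted on one node can be resubmitted elsewhere and continue from the last saved step, which is the idea behind combining checkpointing with migration.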
Figure 3 Evolution of the EC3 architecture [2].
[Diagram labels: Create cluster; Deploy & Configure; Submit jobs; Scale in/out; Frontend; Worker Node; Virtual Cluster; VPN; On-premise Cloud; OpenNebula; Physical Cluster]
References:
[1] A. Calatrava, G. Moltó, M. Caballer, and C. de Alfonso. Virtual Hybrid Elastic
Clusters in the Cloud. In: 8th Iberian Grid Infrastructure Conference (IberGrid
2014), 2014.
[2] M. Caballer, C. de Alfonso, F. Alvarruiz, and G. Moltó. EC3: Elastic Cloud
Computing Cluster. In: Journal of Computer and System Sciences, 2013.
[3] Cluster Energy Saving, CLUES: http://www.grycap.upv.es/clues/
[4] Infrastructure Manager: http://www.grycap.upv.es/im/