Cs264 Intro To Cloud Computing
Justin Riley
Software Tools for Academics and Researchers
Office of Educational Innovation and Technology
Massachusetts Institute of Technology
Cloud computing is a very fuzzy term in general
Often includes everything and the kitchen sink
Three broad categories:
  Software as a Service (SaaS)
  Platform as a Service (PaaS)
  Infrastructure as a Service (IaaS)
Hardware On Demand
Pay for what you use
Full root access: you control the OS and software stack
Ability to scale computing resources up and down
No dealing with racks, networks, power, cooling, housing, etc.
Resizable Compute Capacity
  As much as you need, when you need it. Scale up or down in minutes.
Complete Control via API
  Create, scale, & manage instances programmatically.
Variety of Instance Sizes
  CPU Power, Cores, RAM, Disk.
Wide Variety of Pre-built AMIs (Amazon Machine Images)
  Hit the ground running with minimal system building effort. Now: Linux, Windows, and OpenSolaris.
Secure & Flexible Network Security Model
  Full control of access for each running instance. Keypair required for SSH access.
[Table: EC2 instance types. Row alignment across columns was lost; surviving values are listed by column.]

Size, bits, RAM (and disk where it survived):
  Small    32-bit   1.7 GB
  XL       64-bit   17.1 GB
  2XL      64-bit   34.2 GB   850 GB
  4XL      64-bit   68.4 GB   1690 GB
  Medium   32-bit   1.7 GB
  4XL      64-bit   23 GB
  4XL      64-bit   22 GB

Disk: 420 GB, 160 GB, 1690 GB, 420 GB, 350 GB, 1690 GB, 1690 GB, 1690 GB
GPU: 2 NVIDIA Tesla Fermi GPUs
Virtual Cores: 2 (Burst), 6.5, 13, 26, 20, 33.5, 33.5
Yes (all 11 rows)

On-Demand Pricing, Per Hour
  Linux     Windows
  $0.02     $0.03
  $0.085    $0.12
  $0.34     $0.48
  $0.68     $0.96
  $0.50     $0.62
  $1.00     $1.24
  $2.00     $2.48
  $0.17     $0.29
  $0.68     $1.16
  $1.60     N/A
  $2.10     N/A
Spot Instances
Bid for unused AWS capacity
Prices controlled by AWS based on supply and demand
AWS can terminate Spot Instances without notice
Best approach to temporary requests for large numbers of servers
Default maximum = 100 servers (instead of 20 on-demand)
AMI
Persistent storage
  Volume lifetime is independent of any particular EC2 instance.
General purpose
  Raw, unformatted, block device. Use from Linux, Solaris or Windows.
High performance
  Equal to or better than local EC2 drive.
High reliability
  Built-in redundancy within availability zone. AFR (Annual Failure Rate) between 0.1% and 1%.
Scalable
  Volume sizes ranging from 1 GB to 1 TB.
Easy to create, attach, back up, restore, and delete volumes.
EBS Volumes
$0.10 per GB-month of provisioned storage
$0.10 per 1 million I/O requests
No charge for mounting/unmounting volume

$0.14 per GB-month of data stored
$0.01 per 1,000 PUT requests (when saving a snapshot)
$0.01 per 10,000 GET requests (when loading a snapshot)
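As a rough worked example of the rates above, the following Python sketch estimates a monthly EBS bill. The function name and the sample workload (a 100 GB volume, 50 million I/O requests, 20 GB of snapshot data) are illustrative assumptions; only the per-unit rates come from the slides, and the per-request PUT/GET snapshot charges are left out for brevity.

```python
# Rates quoted on the slides (USD).
VOLUME_RATE_PER_GB_MONTH = 0.10      # provisioned storage
IO_RATE_PER_MILLION = 0.10           # I/O requests
SNAPSHOT_RATE_PER_GB_MONTH = 0.14    # snapshot data stored


def estimate_ebs_monthly_cost(volume_gb, io_requests, snapshot_gb=0.0):
    """Return an estimated monthly cost in USD (hypothetical helper)."""
    storage = volume_gb * VOLUME_RATE_PER_GB_MONTH
    io = (io_requests / 1_000_000) * IO_RATE_PER_MILLION
    snapshots = snapshot_gb * SNAPSHOT_RATE_PER_GB_MONTH
    return round(storage + io + snapshots, 2)


if __name__ == "__main__":
    # Sample workload: 100 GB volume, 50 million I/Os, 20 GB of snapshots.
    print(estimate_ebs_monthly_cost(100, 50_000_000, snapshot_gb=20))
```

Note that the storage charge is for provisioned size, so an almost empty 100 GB volume costs the same as a full one.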
[Figure: AWS regions, each containing two or more Availability Zones (A, B, C). Labeled regions include US West Region and Singapore.]
Note: Conceptual drawing only. The number of Availability Zones may vary.
EBS volumes can only be used with instances in the same availability zone they were created in
Elastic MapReduce
Easily launch Map/Reduce jobs on Amazon EC2
Uses Hadoop
Define Map/Reduce workflows either at the command line or from the AWS console
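To make the map/reduce model concrete, here is a minimal pure-Python sketch of a word count, the usual introductory example. It runs locally and does not use Hadoop or the Elastic MapReduce API; the function names are illustrative, and the sort-and-group step stands in for the shuffle that the framework performs between the two phases.

```python
from itertools import groupby
from operator import itemgetter


def map_phase(lines):
    """Emit a (word, 1) pair for every word in the input lines."""
    for line in lines:
        for word in line.lower().split():
            yield word, 1


def reduce_phase(pairs):
    """Sum the counts for each word; sorting stands in for the shuffle."""
    ordered = sorted(pairs, key=itemgetter(0))
    return {word: sum(count for _, count in group)
            for word, group in groupby(ordered, key=itemgetter(0))}


def word_count(lines):
    """Run the map phase, then the reduce phase, on a list of lines."""
    return reduce_phase(map_phase(lines))


if __name__ == "__main__":
    print(word_count(["the cloud", "the cluster in the cloud"]))
```

On a real cluster the map and reduce functions run on different machines over separate chunks of the input, which is why each one only sees key/value pairs and keeps no shared state.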
Introducing StarCluster
Developed at MIT
Under active development
Open source
Web site: http://web.mit.edu/stardev/cluster/
Easy to install and use ($ easy_install starcluster)
Simplifies creation and management of EC2 clusters
Why StarCluster?
EC2 provides raw compute power
There's work to be done to create a usable cluster:
  Software installation
  AMI creation
  AWS / SSH key management and distribution
  Persistent disk storage and file sharing
  Configuration management
  Higher-level management (cluster vs. instance)
StarCluster Features
Prebuilt 32-bit and 64-bit AMIs
Launch a cluster of EC2 instances:
  One command (starcluster) to rule them all
  Passwordless SSH pre-configured
  Security group for SSH access
  Shared disk volume (NFS)
  Preinstalled libraries (OpenMPI, NumPy, SciPy, etc.)
[Figure: StarCluster cluster layout. Labels: Config File, Master (EC2), Disk, Node001 (EC2) through NodeN (EC2).]
Prerequisites
Client computer running Mac/Linux
AWS security credentials:
Steps
1. Install StarCluster on client
2. Configure StarCluster
3. Start cluster(s)
4. Use them
5. Stop cluster(s)
Configure StarCluster
Download your keypair to client
Edit .starcluster/config
AWS credentials
Must match KEYNAME
Name and location of the file downloaded in the previous slide
AMI for nodes
Node instance type
Master instance type
AMI for master
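The annotations above can be illustrated with a small Python sketch that parses a sample config and checks that every cluster's KEYNAME refers to a defined key section. The section and option names are recalled from StarCluster's config template and should be checked against the template your installed version generates; every value (credentials, key path, AMI IDs, instance types, cluster name) is a placeholder, and the helper function is hypothetical rather than part of StarCluster.

```python
import configparser

# Sample config. Option names are recalled from StarCluster's config
# template and should be verified; all values are placeholders.
SAMPLE_CONFIG = """
[aws info]
AWS_ACCESS_KEY_ID = YOUR_ACCESS_KEY_ID
AWS_SECRET_ACCESS_KEY = YOUR_SECRET_ACCESS_KEY
AWS_USER_ID = YOUR_USER_ID

[key mykey]
KEY_LOCATION = ~/.ssh/mykey.rsa

[cluster smallcluster]
KEYNAME = mykey
CLUSTER_SIZE = 2
NODE_IMAGE_ID = ami-00000000
NODE_INSTANCE_TYPE = m1.small
MASTER_IMAGE_ID = ami-00000000
MASTER_INSTANCE_TYPE = m1.small
"""


def undefined_keynames(config_text):
    """Return cluster names whose KEYNAME has no matching [key ...] section."""
    parser = configparser.ConfigParser()
    parser.read_string(config_text)
    # Names declared by sections of the form "[key <name>]".
    keys = {s.split(None, 1)[1]
            for s in parser.sections() if s.startswith("key ")}
    return [s.split(None, 1)[1]
            for s in parser.sections()
            if s.startswith("cluster ")
            and parser.get(s, "KEYNAME", fallback=None) not in keys]


if __name__ == "__main__":
    # An empty list means every cluster refers to a defined key.
    print(undefined_keynames(SAMPLE_CONFIG))
```

A mismatch between KEYNAME and the key section is an easy mistake to make when editing the file by hand, so a check like this is one way to catch it before starting a cluster.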
Start Cluster
<client>: starcluster start mycluster
Access Cluster
SSH to master node as root:
<client>: starcluster sshmaster mycluster
StarCluster AMI
Ubuntu-based (8.10, 9.04, 10.04)
Automatically installs/configures:
Important commands:
  qstat   Examine work queue
  qsub    Submit work
  qhost   List hosts in grid
<master-sge>: qsub -V -cwd exercise.sh
Your job 9 ("exercise.sh") has been submitted
The argument -V is used to pass the current environment to the job once it's executed.
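When jobs are submitted from a script, it can help to capture the job number that qsub reports. The sketch below parses the confirmation line shown above; the regular expression is based only on that one sample line, and the function name is illustrative, so other queueing system versions or job types may word the message differently.

```python
import re

# Pattern based on the single confirmation line shown in the slides.
_SUBMITTED = re.compile(r'Your job (\d+) \("(.+?)"\) has been submitted')


def parse_qsub_output(output):
    """Return (job_id, job_name) from qsub's confirmation, or None."""
    match = _SUBMITTED.search(output)
    if match is None:
        return None
    return int(match.group(1)), match.group(2)


if __name__ == "__main__":
    print(parse_qsub_output('Your job 9 ("exercise.sh") has been submitted'))
```

The returned job number can then be matched against the entries that qstat lists in the work queue.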
Stop Cluster
AWS charges accrue as long as the cluster is running! Easy to start, easy to stop, so be parsimonious. To stop the cluster:
<client>: starcluster stop jb1
StarCluster - (http://web.mit.edu/starcluster)
Software Tools for Academics and Researchers (STAR)
Please submit bug reports to starcluster@mit.edu
Shutdown cluster jb1 (y/n)? y
>>> Shutting down i-3fad6653
>>> Shutting down i-3bad6657
>>> Shutting down i-35ad6659
>>> Shutting down i-37ad665b
>>> Shutting down i-31ad665d
>>> Removing cluster security group @sc-jb1
<client>:
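To see why stopping promptly matters, here is a small Python sketch of what a cluster left running might cost. The helper name is hypothetical, the $0.085 hourly rate is one of the Linux on-demand prices from the pricing slide, and the sketch assumes each partial hour is billed as a full hour; check current AWS billing rules before relying on the numbers.

```python
import math


def estimate_cluster_cost(num_instances, hours_running, hourly_rate):
    """Estimate on-demand cost in USD for a cluster left running.

    Assumes each partial hour is billed as a full hour.
    """
    billed_hours = math.ceil(hours_running)
    return round(num_instances * billed_hours * hourly_rate, 2)


if __name__ == "__main__":
    # Five instances (as in the shutdown log above) left up for 47.5 hours.
    print(estimate_cluster_cost(5, 47.5, 0.085))
```

The cost grows linearly with both cluster size and running time, so a forgotten cluster of larger instance types adds up quickly.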
Launching instance in specified zone
Creating and attaching an EBS volume to the instance
Partitioning/formatting the EBS volume

Launch instance of AMI
Install and configure desired libraries, tools, apps
Discussion / Q&A