ECS and Docker
Posted by J Cole Morrison on May 8th, 2017.
Introduction
In this guide we're going to discuss the major components of AWS EC2 Container Service
(ECS), what they are conceptually and how they work together.
The prime directive - understanding how hosting, scaling and load balancing an application
with Docker and ECS works. What are the primary pieces? How do we put the puzzle
together? Does it interfere with internal development of alien civilizations?
This is a conceptual guide. Not a technical step-by-step. If you're looking for that, I have a
full one of those here:
Guide to Fault Tolerant and Load Balanced AWS Docker Deployment on ECS
There's plenty of piecemeal step guides out there, including mine, so it didn't seem necessary
to create another one.
Instead, we're looking for a mental framework for how to think about it. Without a conceptual
understanding of our tools and systems, our problem-solving ability is limited. And that
limit usually winds up being "what technical step-by-step guides can I find?"
Check out the entire guide, with extras, in a 10 Part Video Series if you prefer watching over
reading:
Table of Contents
1. Overview
2. The Analogy We'll Use
3. Docker Images and Containers Overview
4. Summary of Docker Analogy
5. Challenges with Managing Docker Containers
6. AWS EC2 Container Service "ECS"
7. Clusters and the ECS Container Agent
8. Task Definitions
9. Running a Task
10. Services
11. Application Load Balancers
12. Launch Configurations and AutoScaling Groups
13. Summary
14. Final Thoughts
15. Image Accreditation
Overview
So what's our agenda here?
1) What Docker is, conceptually - we'll discuss some of the problems it solves and build a visual analogy.
I'm including this because there seems to be a severe lack of "Dude, this is just wtf it is and
what it solves - in plain english." Instead, there's just tons of marketing content, technical
speculation and "reasons to use."
Also, we'll build on the analogy we set up here when diving into ECS concepts. I'll put
a tl;dr at the end of the Docker section just in case you do have a good grasp of these
concepts. That way you'll be caught up on the analogy.
2) The core components of ECS (EC2 Container Service) and how they're connected
Cluster
Container Agent
Container Instances
Task Definition
Task
Service
Outside of the core components, there are supporting ones that are practically required to do
anything useful:
Application Load Balancers
Launch Configurations and AutoScaling Groups
CloudWatch Alarms
Also, the more you understand about VPCs and IAM, the better your overall architecture and
design will be.
One more note: even though these are "supporting" they aren't something you can skip.
They're about as much "supporting" as buns are to a sandwich. They're not the meat, but ya
need them.
Now, when looking at the above, the list may seem pretty exhaustive. And I won't lie, there is
a lot to it. However, the "supporting" concepts are going to be necessary for anything serious
you might set up on AWS. Even Kubernetes.
On the other hand, implementing this is easy as pie. A few clicks or a couple of CLI calls. It's
understanding it that's a bit tricky.
The Analogy We'll Use
So businesses left and right begin traveling "beyond the clouds." Cloud providers
like AWS, Google, etc. are leasing out spaceships. Companies can set up their intergalactic
shops on these ships and begin their star-bound commerce.
A fictional company we'll call "Edges Group" begins ideating how they can bring physical
books to the galaxy...
(Warning: I use Star Trek references ahead. It fit the analogy better despite the title - although
the thought of a lone surviving developer wandering with an alien documenting AWS was a
fun idea.)
Docker Images and Containers Overview
The way we'll structure our analogy: first we'll talk about a futuristic physical spaceship shop;
second we'll compare that to a modern day software company and their application.
Engage.
With the blueprint in-hand, they need to set up the bookstore in a spaceship. Let's assume
they're renting / leasing space (vs. brand new construction).
Challenges here:
1) Set up and modify the spaceship to fit the bookstore's requirements
2) Tweak the blueprint and bookstore to fit the uniqueness of the spaceship
3) Optimize square footage price by finding a spaceship with just enough space for their
bookstore.
Let's say Edges Group solves these challenges for setting up one of their Edges bookstores.
Guess what? Now they've created challenges if they want to build another Edges
bookstore:
1) The bookstore has been tweaked to fit the spaceship that it's in.
2) If we want to straight copy the bookstore, we need a spaceship identical to #1.
3) If we can't do #2 we'll have to get a different spaceship. We'll have to tweak our blueprint
to fit this spaceship.
So if we get all the way to #3, which is likely, we'll have 2 different blueprints and won't be
able to "standardize."
For example, let's say our original blueprint for the Edges Bookstore was for a 10000 square
foot spaceship. If our second spaceship is 8000 sq. ft., we won't be able to use the exact same
layout as the 10000 sq. ft. spaceship.
What if we get a 15000 sq. ft. spaceship? What do we do with the extra 5000?
We could try renting it to other tenants or set up a different shop. BUT. If we've specially
modified our spaceship just for our Edges Bookstore, new tenants/shops/centers would have
to be compatible. If that extra 5000 sq. ft. is disco tiling, other tenants would need to be okay
with that OR we'd need to renovate that area.
Let's cross over to the modern day software part now (And therefore talk about Docker).
To deploy the app, we need to provision a server that meets these requirements. Let's assume
that we're "renting" servers. So like EC2 instances.
In comparison to above, the servers are like the spaceships for the shop. Similarly, just
putting our app straight on a server creates some challenges.
Challenges:
1) Setting up and tweaking the server to fit our app's requirements.
2) Tweaking our app to fit the uniqueness of the server.
3) Optimizing for cost to provision a server that provides just enough power and resources
for our app and traffic.
But Edges Group is tenacious. So they get their app hosted on a server. Well, as with above,
there are new challenges involved with scaling out:
1) Our app has been tweaked to fit the server that it's on.
2) If we do want to straight copy our app, we need an identical server to #1.
3) If the additional servers are different, we may need to tweak our app again. And thus
manage 2 versions.
Also what do we do with unused computing power and memory servers might have? It...
just... goes idle?
Yes, we can put other apps and processes on there and expose them. However, it will require
tweaking to both the server and the app. Especially if we have many environment specific
modifications on our server just for our main "Edges" app.
Now of course, the problem we've described above has been solved in a variety of ways with
a variety of different technologies. We're interested in how to do it with Docker.
Then a new concept arrives for our space commerce company: standardized boxed-spaces.
Inside of these boxed-spaces, or space-containers, they can create their shops, retail,
whatever, exactly how they would want it to be.
On the outside of these are utility hookups for electricity, plumbing, internet, robotics, etc.
These utilities are then made available to whatever is being built within the space-container.
From the outside, these boxed-spaces, or space-containers, look more or less the same.
Although some are large and others are small, they all just look like boxes with hookups on
the outside. There can be anything on the inside though!
In our company's case, there's an entire "Edges" bookstore. Every single thing it needs for its
shop is there - registers, shelves, lights, robots, etc. Of course, it's not live until we have it
hooked up to those utilities.
We can also define blueprints for these space-containers. In this blueprint we'll define every
single thing required by one of our "Edges" bookstores. These blueprints are called space-
images, because we wouldn't want to confuse anyone with terminology...
Look at what this does for the company's challenges:
1) Any spaceship will do, as long as it has hookups for those utilities
2) The inside "Edges" shop never needs to be modified based on the spaceship
3) Any unused space in a spaceship can EASILY be rented out or used by other shops that are
in space-containers
4) Because of #3, our company can easily repurpose unused space for other space-containers!
Docker Containers are the software equivalent of our space-containers.
On the outside, all of these Docker Containers just look like another Docker Container.
Although some are larger in size and others smaller, they're all just Docker Containers that
interface with Docker. There could be anything on the inside though!
In our company's case, there's an entire "Edges" application with all the needed
dependencies! Of course this container isn't live until we have it hooked up and running in
context of Docker.
Also, instead of having to define each Container individually, we can create a Docker Image.
This is essentially a blueprint for our Containers. We use a "Dockerfile" to list all of our
needs for our Docker Container. This way, spinning up Containers is a cinch.
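To make that concrete, here's a minimal sketch of the workflow - the app name, base image and ports are all made up for illustration:

```bash
# Hypothetical example: a Dockerfile for an imaginary "edges" Node.js app.
cat > Dockerfile <<'EOF'
FROM node:6
WORKDIR /usr/src/edges
COPY . .
RUN npm install
EXPOSE 3000
CMD ["node", "server.js"]
EOF

# Bake the blueprint into an Image, then spin up a Container from it,
# "hooking it up" to the outside world by publishing the port.
docker build -t edges-app .
docker run -d -p 80:3000 edges-app
```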
The same benefits carry over to our app:
1) Any server will do, as long as it has Docker installed
2) The "Edges" app inside never needs to be modified based on the server
3) Any unused server power/resources can easily be repurposed for other Docker Containers.
4) Because of #3 we can easily take advantage of extra server space for other Docker
Containers!
Servers set up for Docker are like spaceships set up to plug-n-play these boxed-spaces.
Challenges with Managing Docker Containers
The Bookstore
With this concept of boxed-spaces, or space-containers, in hand, the company leases 5
spaceships that can house them. The spaceships all have the correct utility hookups for the
space-containers. These spaceships will travel together in a fleet.
The company could solve all of the coordination problems this creates - keeping tabs on each
ship, placing space-containers on the right ships, scaling shops up, and directing customers -
manually if they wanted. But there's probably a better way.
The Application
The company provisions (leases) 5 EC2 instances from AWS to run Docker Containers. The
servers all have Docker and all the supports for running the Containers. The servers are all in
different AWS availability zones in the same region.
Similar to our space commerce company, our tech company has challenges to face:
1) Monitoring the health of the Containers and the servers they run on
2) Placing Containers on the best suited servers
3) Scaling Containers out and in as demand changes
4) Directing traffic to the most appropriate Container within the most appropriate server
So we could manually solve all of this if we wanted. However, there is a better way.
AWS EC2 Container Service "ECS"
At its core, ECS helps us manage and deploy Docker Containers across EC2 instances.
Sure, I could rattle off the other hundred things it does, but it does all of those things to
manage and deploy Docker Containers.
For our analogy: it's a service where instead of leasing our spaceships and being on our own,
we get a leadership team to manage the fleet. Each of our ships will also get a captain to help
coordinate with each other and manage space-containers on board. All the captains will
coordinate with an admiral who oversees the entire fleet. The admiral reports to us.
We'll still use the analogy when it helps, but some of these concepts seemed to get even more
confusing with it. Therefore for a few of them we'll just look at them 100% in the real world.
Clusters and the ECS Container Agent
The Cluster is like the admiral of all the spaceships Edges Group has leased. It receives
information from each of the spaceship captains and coordinates them. It also reports back
directly to us, versus us having to check in with each captain.
The ECS Container Agent is like the captain of each ship, that reports back on the status of
the ship itself and the space-containers.
An EC2 instance that has a Container Agent and is part of a Cluster is referred to as
a Container Instance. This is like a spaceship with a captain that knows it belongs to
specific admiral's fleet.
If we have a Cluster named "Luster" and 4 EC2 instances with a Container Agent that's
configured to point to "Luster": Then our Cluster is coordinating with 4 instances. In other
words our Cluster has 4 instances.
Sound complex? Nope, simple as pie actually. In fact creating a Cluster is the easiest part.
So, let's step through ALL the things needed to set up a complete Cluster.
1) Create a Cluster.
Creating a Cluster is literally nothing more than navigating to the ECS console and creating
an EMPTY Cluster. With a name. That's it. Nothing else.
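For example, creating the "Luster" Cluster from earlier is one AWS CLI call:

```bash
# Creates an empty Cluster - just a name, nothing else.
aws ecs create-cluster --cluster-name Luster
```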
The console launch wizard makes it look like there's a lot more to creating a Cluster. It asks
for instances, instance types, VPCs, security groups, blah blah. Those are things we use with
a Cluster, but they're not a Cluster. That's actually AWS trying to help us out by using
CloudFormation to create all the supporting components.
What supporting components? Instances, VPCs, EBS volumes, IAM roles, etc. But these
things aren't directly a part of ECS. We still need them, and need to set them up, but they are
not a part of ECS.
With our spaceship analogy - the admiral and captains manage the spaceships. But the
spaceships are not a part of the admiral and captains.
Okay, I could be super literal and stop here. But let's walk through those supporting
components.
2) Create the IAM Role for the EC2 Instances to be used in the Cluster
EC2 instances that hook up with ECS need an IAM role with
the AmazonEC2ContainerServiceforEC2Role policy. This is actually a managed policy,
meaning that it's pre-made. So all that's needed to create this role is to:
a) Create a new IAM role with the type: Amazon EC2 Role for EC2 Container Service
And that's it. If you'd like to learn more about the wizardry surrounding IAM policies, I have
an in-depth write-up about it here:
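If you'd rather do step 2 from the CLI, a sketch looks roughly like this - the managed policy ARN is the real one, while the role name "ecsInstanceRole" is just a common convention:

```bash
# Trust policy letting EC2 instances assume the role.
cat > ecs-trust.json <<'EOF'
{
  "Version": "2012-10-17",
  "Statement": [{
    "Effect": "Allow",
    "Principal": { "Service": "ec2.amazonaws.com" },
    "Action": "sts:AssumeRole"
  }]
}
EOF

aws iam create-role --role-name ecsInstanceRole \
  --assume-role-policy-document file://ecs-trust.json

# Attach the pre-made managed policy.
aws iam attach-role-policy --role-name ecsInstanceRole \
  --policy-arn arn:aws:iam::aws:policy/service-role/AmazonEC2ContainerServiceforEC2Role

# EC2 instances consume roles through an instance profile.
aws iam create-instance-profile --instance-profile-name ecsInstanceRole
aws iam add-role-to-instance-profile \
  --instance-profile-name ecsInstanceRole --role-name ecsInstanceRole
```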
3) Create EC2 instances that can run the ECS Container Agent
The ECS Container Agent is what lets an EC2 instance hook up with ECS. While you can
manually install this on instances yourself, AWS has an ECS Optimized
AMI for us to use as well. Therefore the most straightforward way to create an instance
that's ready to hook up with ECS is to just use the AMI:
There's nothing crazy in it, so we'll still be able to customize it further if need be (and make
more decorated AMIs). The list of what's in it is:
The latest minimal version of the Amazon Linux AMI
The latest version of the Amazon ECS Container Agent
The recommended version of Docker for the latest ECS Container Agent
The latest version of the ecs-init package to run and monitor the Amazon
ECS agent
And we can find a list of all the latest optimized AMIs here:
Remember, instances with the Container Agent that are a part of an ECS Cluster are referred
to as "Container Instances." I keep reiterating this because at first look I thought it was some
play on words for it being an instance of our Containers or Images.
We'll come back and talk about strategies for launching and managing Container Instances in
the Launch Configurations and AutoScaling Groups section.
4) Point the ECS Container Agent on the instances to the Cluster we want them to join
All we have to do is add the line:
ECS_CLUSTER=YOURCLUSTERNAME
to the ECS config file at /etc/ecs/ecs.config. We can do this using a user data script. The
script can either write the ECS_CLUSTER line itself or we can keep our config file
in a secure S3 bucket and pull it in.
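A minimal user data sketch, assuming the instance was launched from the ECS Optimized AMI (so the agent is already installed):

```bash
#!/bin/bash
# Point this instance's ECS Container Agent at our Cluster.
# Replace YOURCLUSTERNAME with the actual Cluster name, e.g. Luster.
echo "ECS_CLUSTER=YOURCLUSTERNAME" >> /etc/ecs/ecs.config
```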
If we're doing it the manual way, we just set an environment variable on the docker run
command that starts the agent.
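A rough sketch of that manual invocation, using the stock amazon/amazon-ecs-agent image:

```bash
# Run the ECS Container Agent by hand and point it at our Cluster.
docker run --name ecs-agent --detach=true \
  --volume=/var/run/docker.sock:/var/run/docker.sock \
  --env=ECS_CLUSTER=YOURCLUSTERNAME \
  amazon/amazon-ecs-agent:latest
```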
You'll need to put in more options to the Docker run command if you're doing it manually.
The instances will automatically "join" the Cluster if you've done all of the above.
Obviously there's more to launching instances beyond this. Instance type, Security Groups,
etc etc. I'd say that's one of the biggest off-puts to ECS for those unfamiliar with AWS - it
assumes that we know a metric ton about AWS.
A Cluster can also manage these instances across availability zones within a region. That's
right. It can work with all of the great fault tolerance tools that AWS provides pretty simply.
If our instances are spread across the different AZs, our Docker Containers will be as well.
To do so, we'd need to launch the instances into differing VPC subnets. If you'd like to use a
non-default VPC, I have a great write-up here on it:
We'll also need to make use of an Application Load Balancer which we'll talk about after
the rest of the primary components.
Task Definitions
In context of our spaceship and space-containers analogy, a Task Definition is a specification
of exactly what's needed to set up our space-container(s) in our spaceships. Note
the (s) there. In a "Task Definition" we can say that the "Task it's defining" consists of
multiple space-containers.
(also note that Task Definition is both the real name of the resource AND what I'm calling it
in the analogy.)
For example, our Edges Bookstore "Task Definition" might consist of:
1) a bookstore space-container
2) a coffee shop space-container
3) a list of utilities, space, power, plumbing, etc levels for each space-container
If we created one "Task" from this "Task Definition" we'd have a bookstore and coffee shop.
They'd be linked in a specified amount of space in one of our spaceships. It'd use up certain
amounts of plumbing, electricity, etc.
In AWS ECS, this comparison carries over. A Task Definition is a specification of exactly
what's needed to set up our Container(s) on our Container Instances. Note the (s) there.
In a "Task Definition" we can say that the "Task it's defining" consists of multiple Docker
Containers.
For example, our Edges App "Task Definition" might consist of:
1) a Docker Image for our Edges web app
2) a Docker Image for our video processing app
3) the cpu, memory, port mappings, entry points, etc for each to-be-made Container
The Docker Images we specify in 1 and 2 are what we make the Containers from. In each, we
specify the levels of cpu, memory, etc. If you've done anything with Docker, these properties
should ring a bell. Most of the options we pass to our "Task Definition" about Containers are
the options we can pass to Docker when creating Containers. Like PortBindings,
MemoryReservation, and the like.
If we created one "Task" from this "Task Definition" we'd have a web app and video
processing app on one server. It'd reserve a certain amount of cpu, memory, bind to certain
ports, etc.
A Task Definition can be defined in the AWS console through the usual console UI OR
through a JSON format. The CLI can also be driven with that same JSON format.
Here's the sample template straight from AWS:
Here's the specific area that covers all the specific options and properties that can be used in a
task definition:
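To make the shape concrete, here's a stripped-down sketch (not the full AWS sample) of a Task Definition with two hypothetical Containers, registered via the CLI:

```bash
# Hypothetical "edges" Task Definition: a web app plus a video processing app.
cat > edges-task.json <<'EOF'
{
  "family": "edges",
  "containerDefinitions": [
    {
      "name": "edges-web",
      "image": "mycompany/edges-app:latest",
      "cpu": 128,
      "memory": 256,
      "essential": true,
      "portMappings": [{ "containerPort": 3000, "hostPort": 0 }]
    },
    {
      "name": "edges-video",
      "image": "mycompany/edges-video:latest",
      "cpu": 256,
      "memory": 512,
      "essential": false
    }
  ]
}
EOF

aws ecs register-task-definition --cli-input-json file://edges-task.json
```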
The real meat of the template is the containerDefinitions property. This is where we define
the Containers and their needs. In a containerDefinition, the image property is where we
point to the Docker Image we'd like used for that specific Container. We can define multiple
Containers all using different Images.
When building your own, I suggest starting from a sample Task Definition rather than a
blank page. Why is that? Because as soon as you see how many options there are for a Task
Definition, you're likely to be overwhelmed. This is especially true if you haven't done much
directly with Docker. So:
1) Start with a sample Task Definition as a base.
2) Begin modifying each of the properties and also cross reference it with the Task Definition
Parameters documentation.
This is where you'll need to read up on what each of the properties does. There aren't any real
shortcuts here folks. You've either profiled your Containers and know the needed memory or
not. Same for a lot of the other properties.
Properties of containerDefinitions like CPU and Memory aren't just ECS
things. Those are Docker concepts. You'll need to figure out what each of your
containers needs and uses.
the essential property on a containerDefinition means that if that
container goes down, they all do.
entrypoint and command just overwrite whatever you have in the image.
You don't have to re-define them here.
links is how you specify that containers can communicate with each other. It's
like using the --link option in docker run .
Remember, Task Definitions are just a set of instructions. They're independent of Clusters.
Meaning that we can use them in any Cluster (assuming they have the resources). We put a
Task Definition to work by either:
a) Running a Task
or
b) Creating a Service
The best visual here is that we have a set of instructions on how to build our space-containers
sets (Task Definitions). Now we're handing it to the admiral (Cluster). The admiral knows we
want it set up in the spaceships, but needs to know how (and how many).
So of course the next step is for the admiral (Cluster) to coordinate with the captains (ECS
Container Agents) and deploy it to the proper ships (Instances).
Running a Task
We've defined what's needed to set up a full-blown "Edges" shopping experience with our
Containers. We did so through the aforementioned "Task Definition." We can hand this to
our spaceships' admiral (Cluster) and ask them to create and run a "Task."
The process of running a task goes along the lines of: we hand the "Task Definition" to the
admiral (Cluster), who coordinates with the captains (Container Agents) to figure out which
spaceship has the space and utility hookups to best fit it.
When it knows what spaceship is the best fit, it will set the space-containers up there - in our
case a bookstore and coffee shop. These shops will have all the required utilities, like
electricity and plumbing, needed to run.
In ECS, in our Cluster, this is like choosing to Run a Task. We specify the Task Definition
and the Cluster will create all the Containers we've specified within it. It will coordinate with
the different Container Instances (EC2 servers with the container agents) and find the ones
that can best fit the entire "Task."
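On the CLI, using our hypothetical names from earlier, that's a single call:

```bash
# Hand the Cluster our Task Definition and ask it to place 1 Task.
aws ecs run-task --cluster Luster --task-definition edges --count 1
```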
When running a Task, we can specify to run more than 1 Task. This is the equivalent of
saying: "Hey Cluster, make 3 copies of everything in this Task Definition."
Therefore, if we had a Task Definition that called for 1 web app Container and 1 video app
Container linked, and we wanted 3 Tasks from it: then we'll wind up with 3 pairs of those
apps.
Spreading Tasks Across Availability Zones (AZs)
Obviously, we want some say in how our Tasks are placed on our Instances. This is
where Task Placement strategies come into play. We tell our Cluster how we want them
spread across our Container Instances.
Assuming that our Instances are being launched into different Availability Zones...
...then the only thing we have to do to spread traffic across zones is select a strategy. There's
5 main strategies we get to select from:
1) AZ Balanced Spread - Spreads Tasks evenly across AZ's. Within an AZ, it spreads Tasks
evenly among Instances.
2) AZ Balanced Binpack - Spreads Tasks evenly across AZ's. Within an AZ, it tries to use
the least amount of Instances. How? By prioritizing Instances with the least amount of
available CPU or memory.
3) Binpack - Places tasks on instances with the least amount of CPU or Memory. Doesn't
care about AZ spread.
4) One Task Per Host - Places at most 1 Task from the Task Definition on each Instance.
5) Custom - Define your own placement strategy from the raw type and field parameters.
The docs cover Task Placement, but unfortunately they aren't the most helpful because they
don't explain the options from the console. Instead they just give the values of the
API's type and field parameters. When it comes down to it, the above 5 options we covered
are just a mix-and-match of 2 things: the strategy type (spread, binpack, random)
and the field it applies to (availability zone, instance ID, CPU, memory).
For my own sanity, I'm not going to dive into those just yet. I've experimented with them
some, but since the docs are so sparse it's hard to know what's happening. The default 5
strategies seem pretty solid.
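For the curious, here's a sketch of what "AZ Balanced Spread" looks like when expressed as raw type/field pairs on the CLI, using our hypothetical names:

```bash
# Spread Tasks evenly over Availability Zones, then evenly over
# Instances within each zone.
aws ecs run-task --cluster Luster --task-definition edges --count 3 \
  --placement-strategy \
    type=spread,field=attribute:ecs.availability-zone \
    type=spread,field=instanceId
```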
When it comes down to it, running a Task boils down to one purpose: giving our Cluster a
Task Definition and telling it to put Tasks on our Container Instances.
Yes, there's a lot of details we can configure, but they still all head towards that same purpose.
However, the deal ends here. If one of our Tasks goes down, that's the end. The Cluster won't
try and put it back up. It also won't give us detailed metrics on them either. Oh, also, how do
we do service discovery? How do we load balance between Tasks??
Given these issues, running a Task (or Tasks) is going to be limited. If we want something
like a web server, we need a different option. That's where Services come into play. Don't
you love these amazingly descriptive names?
Services
As we just covered, Running a Task is a very "one and done" type of deal. To remember it,
think "one and done" because it's "run and done."
Creating a Service tells our Cluster to go beyond just running Tasks - it manages them. We
create a Service and hand it a Task Definition. It takes the Task Definition and does a number
of things for us. Specifically:
1) Sets up our Tasks from our Task Definitions on the best suited Container Instances (same
as running a Task).
2) Reports detailed metrics on our Tasks' CPU and Memory usage.
3) Reports a play-by-play of "events" - Tasks launching, stopping, updating, reaching a
steady state.
4) Keeps the desired number of Tasks running, and rolls out Task Definition revisions
without an outage (via maximum and minimum healthy percents).
5) Scales our Tasks in and out in response to CloudWatch alarms (via Scaling Policies).
6) Distributes our customer demand evenly to all Tasks via Load Balancer (if we hook one
up)
It becomes a little unwieldy for me to continue with the analogy we've set up here. If you're
interested in keeping the visual though just think of it as us asking the admiral (Cluster) to
also...
Let's dive into each of the things a Service does for us in detail now.
1. Set up of our Tasks on the best suited Container
Instances
This is what plain running a Task does. Services also do this. They also give us the same
option of selecting a Task Placement strategy: the options of things like AZ Balanced Spread,
Bin Pack, etc.
So they do everything running a Task does PLUS the other 5 things we're going to step
through.
When we have a Service, we can get specific metrics about it. How much of our allotted CPU
and Memory are our Service's Tasks taking up? How much is reserved? How much is being
utilized?
The docs on the different metrics are here. The two for Services are CPU and Memory
Utilization. Note that there are also metrics available for the Cluster as a whole.
It also reports back to us a detailed list of "events." The events are simply a play-by-play of
what ECS is doing with the Tasks. Is it launching a Task? Is it removing a Task? Is it
updating a Task? At what times are these things happening? Are they in a steady state?
Lots of extra info that goes far and beyond just whether or not a Task is Running, Pending, or
Stopped.
The power of these metrics really comes into play when utilizing CloudWatch alarms. We
can pick a metric from ECS, either Cluster wide or Service wide metrics, and watch them. If
they go above or below thresholds we've set, we can respond to them.
CloudWatch alarms are one of the "supporting components." They're not so complex that
they require their own section here. At the same time, there's so much to them that if we did
make a section, it'd be huge. To create an alarm we just pick a CloudWatch metric and set a
threshold. If that threshold is crossed the alarm sounds.
We can also set a maximum percent of Tasks and a minimum healthy percent of Tasks.
These percents represent the number range of Tasks that our Service can have live at any
given time. They're used in updating Tasks when we revise our Task Definitions. For
example, we might update our Task Definition to use an updated Docker Image for the
Containers.
The maximum and minimum percents are used to deploy updated Tasks without ever having
a service outage.
Let's walk through an example. Say our Service is set with:
1) Number of Tasks: 4
2) Minimum Healthy Percent: 100%
3) Maximum Percent: 200%
What does this have to do with updates? Well, it determines how ECS will update everything.
To update the Tasks in a Service, we first revise the Task Definition.
This means: make a new version of it. We don't "update" an existing one. We create a revision
of an old one.
Which is either done by a couple of simple clicks in the console, or passing up the JSON
formatted Task Definition via CLI.
When we do this, we'll trigger an update to all Tasks being managed by the Service. The
Service won't just remove all of the current ones and then add all of the new ones. Instead it'll
follow some rules based on our maximum percent and minimum healthy percent values
we set:
a) Any old Tasks that are currently receiving traffic will be "drained".
In other words, it will allow existing requests to complete, but will deny new ones. This is
related to Elastic Load Balancers, which we'll get into in a bit.
b) Deployment of the new Tasks is based on our Minimum Healthy and Maximum percents.
Since we have our Maximum Percent at 200%, it will add 4 of the new revisions first and
then remove the 4 old Tasks.
It will briefly run 8 Tasks (that 200% maximum). Once the new Tasks are up and live, it
drains and removes the 4 old ones - bringing us back to Number of Tasks: 4, i.e. 100%.
Yep. And we don't have to deal with any of the headaches of doing this manually!
The general flow for auto scaling a Service is:
1) Pick a metric that ECS feeds into CloudWatch
2) Set a CloudWatch alarm to trigger when the metric goes above or below thresholds
3) Respond to the triggered alarm with a Scaling Policy
It's a very simple process. Any service that feeds data into CloudWatch can be measured,
monitored and responded to. ECS feeds two sets of metrics in: Cluster-wide metrics and
Service-specific metrics.
We would pick a metric in CloudWatch, like "CPU Utilization" of our Service. We'd set a
CloudWatch alarm with a threshold like "When it gets above 50% of the total available
CPU."
With the alarm set up, we'd hand that to our ECS Service. We'd create "Scaling Policies",
which are actions to take when a specified CloudWatch alarm is triggered. Our Scaling
Policy may be to "Add 1 Task."
In terms of setting up these resources - watching a metric and setting up an alarm are all done
through CloudWatch. The Scaling Policy is done through ECS in the console. The CLI and
CloudFormation instead use a separate service, Application Auto Scaling, in combo with
ECS, to set this up.
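A rough sketch of that CLI route, with our hypothetical names (depending on your account setup you may also need to supply an IAM role ARN):

```bash
# Tell Application Auto Scaling it may adjust our Service's DesiredCount.
aws application-autoscaling register-scalable-target \
  --service-namespace ecs \
  --resource-id service/Luster/edges-service \
  --scalable-dimension ecs:service:DesiredCount \
  --min-capacity 2 --max-capacity 10

# The CloudWatch alarm: "CPU above 50% for 3 straight minutes."
aws cloudwatch put-metric-alarm \
  --alarm-name edges-cpu-high \
  --namespace AWS/ECS --metric-name CPUUtilization \
  --dimensions Name=ClusterName,Value=Luster Name=ServiceName,Value=edges-service \
  --statistic Average --period 60 --evaluation-periods 3 \
  --threshold 50 --comparison-operator GreaterThanThreshold
```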
Again, we're focusing on concepts here. There's an entire list of tech steps here for setting up
Service Auto Scaling.
Services are also what we use for service discovery (note the lower case "s"). We can launch
multiple Services into a single Cluster. For example, we might have a Book Service, that
launches Tasks for a book app. We might have a Movie Service that launches Tasks for a
movie app.
Using an Application Load Balancer, we can route to those differing Services using a variety
of rules. The Book Service might be available at /edges . And our Movie Service might be
available at /cube-buster . But they'd all be sharing the same set of Container Instances!
Application Load Balancers
In simple terms, an Application Load Balancer saves us from a major pain:
Having to load balance traffic across servers and then across Tasks on a server.
NOW. Obviously, this assumes that we have an Application Load Balancer created. "ALBs"
are one of the supporting concepts. They're practically required - how else will traffic
reasonably reach our Containers? But the docs and most guides out there assume knowledge
of Elastic Load Balancing. Well that's BS, so let's actually learn about them.
This is where Application Load Balancers come into play. These accept incoming
traffic on a protocol (i.e. HTTP) and port (i.e. 80) we specify. They then route the traffic to
Target Groups, based on the path (i.e. "/").
A Target Group, in context of ECS, is where our Service will register its Tasks. Once these
Tasks are registered, our Load Balancer knows to spread traffic between them.
If you've worked with Elastic Load Balancers, this should point out the differences between
Classic and Application. Classic Load Balancers spread traffic between servers. Application
Load Balancers spread traffic between applications. In this case, the applications are our ECS
Tasks.
It's a pretty neat concept where we just quit caring about balancing between servers and
subsequently the apps within them. We do away with that and say:
"Hey, just tell me where all the live Applications are. I'll spread traffic between those."
An Application Load Balancer setup consists of four parts:
a) The Load Balancer itself - from a CLI and API standpoint this is almost just a central point
to attach options and configurations to.
b) Target Groups - a common group of apps, in our case Tasks, that will receive load
balanced traffic.
c) Listeners - what ports and protocols is our Load Balancer listening on? i.e. Port 80,
Protocol HTTP
d) Listener Rules - when traffic arrives on a particular Listener, what do we do? i.e. Send it to
a particular Target Group when the path is /
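As a sketch, those four parts map to CLI calls like the following (all IDs and ARNs are placeholders you'd substitute with your own):

```bash
# a) The Load Balancer itself, in two subnets (two AZs).
aws elbv2 create-load-balancer --name edges-alb \
  --subnets subnet-aaaa1111 subnet-bbbb2222 \
  --security-groups sg-cccc3333

# b) A Target Group our Service's Tasks will register into.
aws elbv2 create-target-group --name edges-targets \
  --protocol HTTP --port 80 --vpc-id vpc-dddd4444

# c) + d) A Listener on port 80 whose default rule forwards to the group.
aws elbv2 create-listener --load-balancer-arn <alb-arn> \
  --protocol HTTP --port 80 \
  --default-actions Type=forward,TargetGroupArn=<target-group-arn>
```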
Conceptually, a load balanced setup for one of our "Edges" Applications would be:
1) Create an Application Load Balancer
2) Create a Target Group for our Edges Tasks
3) Add a Listener to the Load Balancer on port 80, protocol HTTP
4) Add a Listener Rule that sends traffic for path / to the Target Group
5) Our Service registers ALL of our Edges Tasks in our Service with the Target Group
Our Application Load Balancer can now receive traffic and spread it to all the Tasks being
managed by our Edges Service.
Now suppose we want to host our "CubeBuster" movie app alongside Edges. Conceptually:
1) Create Docker Images for the CubeBuster app
2) Create a Task Definition with the new Images. We'll call it the CubeBuster Task
Definition.
3) Create a new Service out of the CubeBuster Task Definition. Launch it into the same
Cluster we've been using (with the Edges Service).
4) Create a new Target Group for the CubeBuster Tasks
5) Add a Listener Rule that sends traffic for path /cubebuster to the new Target Group
6) Our CubeBuster Service registers ALL of our CubeBuster Tasks managed by our
CubeBuster Service with the Target Group
Priority is just which rule takes precedence - the lower the number, the higher the priority.
Brilliant inverse relationship, AWS.
With our new configuration, if a request comes in for /cubebuster , it will go to our
CubeBuster Targets.
Note here - don't register targets manually if you're working with ECS. ECS will do that for
us.
Hooking all of this up to a Service boils down to:
1) Creating a Target Group for your Service's Tasks to register with.
2) Creating your Application Load Balancer, and making a Listener and Listener Rule
that points to the Target Group.
3) Register your Application Load Balancer and Target Group with the ECS Service.
In the console wizard, it'll have you walk through and create the Target Group, Listener and
Listener Rules. Just remember, don't register any targets manually. ECS does that.
CLI and CloudFormation require all of the above concepts to be created individually. After
creating them, we then have to configure them to point to and reference each other. I plan on
releasing more on this aspect, but it's a pretty heavy topic to dive into.
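As a taste, the final hookup (step 3) from the CLI looks roughly like this - the container name and port must match the Task Definition, and "ecsServiceRole" is just a conventional role name:

```bash
# Create the Service and hand it the Target Group to register Tasks with.
aws ecs create-service --cluster Luster \
  --service-name edges-service \
  --task-definition edges \
  --desired-count 4 \
  --role ecsServiceRole \
  --load-balancers "targetGroupArn=<target-group-arn>,containerName=edges-web,containerPort=3000"
```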
We've discussed scaling and load balancing our Tasks (and thus Containers), but we're still
missing something. We've discussed how to set up an EC2 instance, but we probably don't
want to set them up individually each time. This is where Launch Configurations and
AutoScaling Groups come into play.
Launch Configurations and AutoScaling Groups
When EC2 Instances have the agent on them configured to point to a cluster, they "join" the
cluster. At this point they've fulfilled their entire destiny and are full-fledged Container
Instances.
This obviously isn't the route you'll likely want to take. Too much manual labor. To speed
things up, you could always get an instance exactly how you want it and then create an AMI
from it. This way, every time you make a new instance, there's no extra configuration.
These allow us to automate and manage many Instances at once. Like the Application Load
Balancer, they're a "supporting concept." These aren't directly related to ECS, and are used in
almost anything that leverages EC2. Since they deal with EC2, which is the bread and butter
of everything server related, let's dive in a bit.
A Launch Configuration is just a template for launching instances: the AMI, instance type,
security groups, user data script, and so on. With the Launch Configuration in hand, we now
create an AutoScaling Group. These take a Launch Configuration and automate / manage the
creation and scaling of instances.
They're as simple as: "keep this many instances from this Launch Configuration running,
across these subnets, and add or remove instances according to these rules."
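A sketch of the pair from the CLI - the AMI ID is a placeholder (look up the current ECS Optimized AMI for your region), and the user data file is the one-liner from the Cluster section:

```bash
# The template: ECS Optimized AMI + instance role + user data.
aws autoscaling create-launch-configuration \
  --launch-configuration-name edges-ecs-lc \
  --image-id ami-XXXXXXXX \
  --instance-type t2.medium \
  --iam-instance-profile ecsInstanceRole \
  --user-data file://ecs-cluster-userdata.sh

# The group: keep 5 instances alive, spread across two subnets (two AZs).
aws autoscaling create-auto-scaling-group \
  --auto-scaling-group-name edges-ecs-asg \
  --launch-configuration-name edges-ecs-lc \
  --min-size 5 --max-size 5 --desired-capacity 5 \
  --vpc-zone-identifier "subnet-aaaa1111,subnet-bbbb2222"
```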
Getting these Launch Configurations and AutoScaling Groups working with ECS involves
exactly 2 steps:
1) Setting up the Launch Configuration to use the ECS Optimized AMI; or using a user data
script to manually configure the ECS Container Agent.
And...
2) Pointing the container agent to the Cluster you'd like them to join. Which we covered in
the Cluster section.
If your Cluster is live, and your Launch Configuration is set up to either use the ECS
Optimized AMI or install the ECS Container Agent, and you've configured it to point to
the correct Cluster - then BOOM. That's all you have to do to launch
instances into your Cluster.
If this sounds like the incoherent rants of a madman, check out my guide here on VPCs.
Summary
AWS ECS just helps us manage and deploy our Docker Containers across EC2 instances.
That's really it. We can bog it down with 1000 other supporting things, but it's relatively
sparse.
The most confusing part of grasping ECS is due to the number of assumed supporting
components. Like Application Load Balancers, AutoScaling Groups, Launch Configurations,
VPCs, etc. These things are definitely used with ECS, but they're not ECS.
Task Definition - Everything your Docker Container(s) need(s) to persist on a server. CPU,
Memory, volumes, Docker Images, etc. Think blueprints.
EC2 Instance - the servers. These are the spaceships in our analogy.
ECS Container Agent - software installed on an EC2 instance that helps coordinate with
other Agents, monitor local Docker Containers and communicate with the Cluster. These are
the captains of each spaceship.
Cluster - The parent to which our Tasks and Agents belong. It's a very ambiguous
concept that's mostly there to represent membership.
For our analogy, think of an admiral. It's in charge of the fleet, all the captains report back to
it. It reports to us so that we don't have to check in with each captain individually.
Running a Task - we hand our Cluster a Task Definition; it creates a Task and places it on
the best suited server; the best suited server is found by coordinating with the ECS Container
Agents.
Creating a Service - we hand our Cluster a Task Definition; it does the same thing as
running a Task PLUS:
1) Reports detailed metrics on our Tasks
2) Reports a play-by-play of events
3) Keeps the desired number of Tasks running and deploys Task Definition revisions
without an outage
4) Scales Tasks in and out via CloudWatch alarms and Scaling Policies
5) Distributes our customer demand evenly to all Tasks via Load Balancer (if we hook one
up)
The confusing aspect of ECS is that it practically requires a TON of supporting concepts.
And unfortunately the docs just kind of mention them in passing. Don't mistake this
requirement as an "ECS is just complicated" thing. Anything beyond playing around
will require these supporting components.
Application Load Balancers are needed to direct traffic to Containers. They allow for a
variety of powerful concepts like name-spacing sets of containers as "Target Groups." This
allows us to direct traffic to different sets based on rules like the request Protocol or Path.
While not required, the following help bolster our ECS setup:
CloudWatch alarms and metrics to respond to events in our Clusters and Services. This is
how we achieve auto scaling.
Launch Configurations and Auto Scaling Groups to manage sets of Instances vs. setting
them up piecemeal.
VPCs to create varying subnets to allow for multiple availability zone deploys
IAM to create the proper roles for your Instances, Services and Tasks
Caffeine to make sure that you don't fall asleep while reading through 100s of documentation
pages.
Final Thoughts
The obvious next steps need to be an actual implementation of everything. However, now
that you understand the concepts, you won't be clicking in the dark! If you've done an
implementation before, hopefully this has shed some light on why things worked as they did.
I feel the need to mention this one more time, if you're looking for a step-by-step of
implementation, I have a full one here:
Guide to Fault Tolerant and Load Balanced AWS Docker Deployment on ECS
The reason I opted to make this long conceptual guide is because implementing ECS is the
easy part. Understanding it is the tricky part. It's just a few clicks or CLI calls. So "80%
mental, 20% mechanical" wound up being true here as well.