ECS and Docker
Posted by J Cole Morrison on May 8th, 2017.
Introduction
In this guide we're going to discuss the major components of AWS EC2 Container Service
(ECS), what they are conceptually and how they work together.
The prime directive - understanding how hosting, scaling and load balancing an application
with Docker and ECS works. What are the primary pieces? How do we put the puzzle
together? Does it interfere with internal development of alien civilizations?
This is a conceptual guide. Not a technical step-by-step. If you're looking for that, I have a
full one of those here:
Guide to Fault Tolerant and Load Balanced AWS Docker Deployment on ECS
There's plenty of piecemeal step guides out there, including mine, so it didn't seem necessary
to create another one.
Instead, we're looking for a mental framework for how to think about it. Without a conceptual
understanding of our tools and systems, our problem-solving ability is limited. And that
limit usually winds up being "what technical step-by-step guides can I find?"
Check out the entire guide, with extras, in a 10 Part Video Series if you prefer watching over
reading:
Table of Contents
1. Overview
2. The Analogy We'll Use
3. Docker Images and Containers Overview
4. Summary of Docker Analogy
5. Challenges with Managing Docker Containers
6. AWS EC2 Container Service "ECS"
7. Clusters and the ECS Container Agent
8. Task Definitions
9. Running a Task
10. Services
11. Application Load Balancers
12. Launch Configurations and AutoScaling Groups
13. Summary
14. Final Thoughts
15. Image Accreditation
Overview
So what's our agenda here?
1) What Docker is, conceptually - we'll discuss some of the problems it solves and build a visual analogy.
I'm including this because there seems to be a severe lack of "Dude, this is just wtf it is and
what it solves - in plain english." Instead, there's just tons of marketing content, technical
speculation and "reasons to use."
Also, we'll build on the analogy we set up here when diving into ECS concepts. I'll put
a tl;dr at the end of the Docker section just in case you do have a good grasp of these
concepts. That way you'll be caught up on the analogy.
2) The core components of ECS (EC2 Container Service) and how they're connected
Cluster
Container Agent
Container Instances
Task Definition
Task
Service
Outside of the core components, there are supporting ones that are practically required to do
anything useful:
Application Load Balancers
Launch Configurations and AutoScaling Groups
CloudWatch Alarms
Also, the more you understand about VPCs and IAM, the better your overall architecture and
design will be.
One more note: even though these are "supporting" they aren't something you can skip.
They're about as much "supporting" as buns are to a sandwich. They're not the meat, but ya
need them.
Now, when looking at the above, the list may seem pretty exhaustive. And I won't lie, there is
a lot to it. However, the "supporting" concepts are going to be necessary for anything serious
you might set up on AWS. Even Kubernetes.
On the other hand, implementing this is easy as pie. A few clicks or a couple of CLI calls. It's
understanding it that's a bit tricky.
The Analogy We'll Use
So businesses left and right begin traveling "beyond the clouds." Cloud providers
like AWS, Google, etc. are leasing out spaceships. Companies can set up their intergalactic
shops on these ships and begin their star-bound commerce.
A fictional company we'll call "Edges Group" begins ideating how they can bring physical
books to the galaxy...
(Warning: I use Star Trek references ahead. It fit the analogy better despite the title - although
the thought of a lone surviving developer wandering with an alien documenting AWS was a
fun idea.)
Docker Images and Containers Overview
The way we'll structure our analogy: first we'll talk about a futuristic physical spaceship shop;
second we'll compare that to a modern day software company and their application.
Engage.
With the blueprint in-hand, they need to set up the bookstore in a spaceship. Let's assume
they're renting / leasing space (vs. brand new construction).
Challenges here:
1) Set up and modify the spaceship to fit the bookstore's requirements
2) Tweak the blueprint and bookstore to fit the uniqueness of the spaceship
3) Optimize square footage price by finding a spaceship with just enough space for their
bookstore.
Let's say Edges Group solves these challenges for setting up one of their Edges bookstores.
Guess what? Now they've created challenges if they want to build another Edges
bookstore:
1) The bookstore has been tweaked to fit the spaceship that it's in.
2) If we want to straight copy the bookstore, we need a spaceship identical to #1.
3) If we can't do #2 we'll have to get a different spaceship. We'll have to tweak our blueprint
to fit this spaceship.
So if we get all the way to #3, which is likely, we'll have 2 different blueprints and won't be
able to "standardize."
For example, let's say our original blueprint for the Edges Bookstore was for a 10000 square
foot spaceship. If our second spaceship is 8000 sq. ft., we won't be able to use the exact same
layout as the 10000 sq. ft. spaceship.
What if we get a 15000 sq. ft. spaceship? What do we do with the extra 5000?
We could try renting it to other tenants or set up a different shop. BUT. If we've specially
modified our spaceship just for our Edges Bookstore, new tenants/shops/centers would have
to be compatible. If that extra 5000 sq. ft. is disco tiling, other tenants would need to be okay
with that OR we'd need to renovate that area.
Let's cross over to the modern day software part now (And therefore talk about Docker).
To deploy the app, we need to provision a server that meets these requirements. Let's assume
that we're "renting" servers. So like EC2 instances.
In comparison to above, the servers are like the spaceships for the shop. Similarly, just
putting our app straight on a server creates some challenges.
Challenges:
1) Setting up and tweaking the server to fit our app's requirements.
2) Tweaking our app to fit the uniqueness of the server.
3) Optimizing for cost to provision a server that provides just enough power and resources
for our app and traffic.
But Edges Group is tenacious. So they get their app hosted on a server. Well, as with above,
there are new challenges involved with scaling out:
1) Our app has been tweaked to fit the server that it's on.
2) If we do want to straight copy our app, we need an identical server to #1.
3) If the additional servers are different, we may need to tweak our app again. And thus
manage 2 versions.
Also what do we do with unused computing power and memory servers might have? It...
just... goes idle?
Yes, we can put other apps and processes on there and expose them. However, it will require
tweaking to both the server and the app. Especially if we have many environment specific
modifications on our server just for our main "Edges" app.
Now of course, the problem we've described above has been solved in a variety of ways with
a variety of different technologies. We're interested in how to do it with Docker.
Then a new concept arrives for our space commerce company: standardized boxed-spaces.
Inside of these boxed-spaces, or space-containers, they can create their shops, retail,
whatever, exactly how they would want it to be.
On the outside of these are utility hookups for electricity, plumbing, internet, robotics, etc.
These utilities are then made available to whatever is being built within the space-container.
From the outside, these boxed-spaces, or space-containers, look more or less the same.
Although some are large and others are small, they all just look like boxes with hookups on
the outside. There can be anything on the inside though!
In our company's case, there's an entire "Edges" bookstore. Every single thing it needs for its
shop is there - registers, shelves, lights, robots, etc. Of course, it's not live until we have it
hooked up to those utilities.
We can also define blueprints for these space-containers. In this blueprint we'll define every
single thing required by one of our "Edges" bookstores. These blueprints are called space-
images, because we wouldn't want to confuse anyone with terminology...
Look at what this does for the company's challenges:
1) Any spaceship will do, as long as it has hookups for those utilities
2) The inside "Edges" shop never needs to be modified based on the spaceship
3) Any unused space in a spaceship can EASILY be rented out or used by other shops that are
in space-containers
4) Because of #3, our company can easily repurpose unused space for other space-containers!
Docker Containers are the software equivalent of our space-containers.
On the outside, all of these Docker Containers just look like another Docker Container.
Although some are larger in size and others smaller, they're all just Docker Containers that
interface with Docker. There could be anything on the inside though!
In our company's case, there's an entire "Edges" application with all the needed
dependencies! Of course this container isn't live until we have it hooked up and running in
context of Docker.
Also, instead of having to define each Container individually, we can create a Docker Image.
This is essentially a blueprint for our Containers. We use a "Dockerfile" to list all of our
needs for our Docker Container. This way, spinning up Containers is a cinch.
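To make that concrete, here's a minimal sketch of the workflow - the app name, base image and ports are all made up for illustration:

```bash
# Hypothetical example: a Dockerfile for an imaginary "edges" Node.js app.
cat > Dockerfile <<'EOF'
FROM node:6
WORKDIR /usr/src/edges
COPY . .
RUN npm install
EXPOSE 3000
CMD ["node", "server.js"]
EOF

# Bake the blueprint into an Image, then spin up a Container from it,
# "hooking it up" to the outside world by publishing the port.
docker build -t edges-app .
docker run -d -p 80:3000 edges-app
```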
The same benefits carry over to our app:
1) Any server will do, as long as it has Docker installed
2) The "Edges" app inside never needs to be modified based on the server
3) Any unused server power/resources can easily be repurposed for other Docker Containers.
4) Because of #3 we can easily take advantage of extra server space for other Docker
Containers!
Servers set up for Docker are like spaceships set up to plug-n-play these boxed-spaces.
Challenges with Managing Docker Containers
The Bookstore
With this concept of boxed-spaces, or space-containers, in hand, the company leases 5
spaceships that can house them. The spaceships all have the correct utility hookups for the
space-containers. These spaceships will travel together in a fleet.
The company could solve all of the coordination problems this creates - keeping tabs on each
ship, placing space-containers on the right ships, scaling shops up, and directing customers -
manually if they wanted. But there's probably a better way.
The Application
The company provisions (leases) 5 EC2 instances from AWS to run Docker Containers. The
servers all have Docker and all the supports for running the Containers. The servers are all in
different AWS availability zones in the same region.
Similar to our space commerce company, our tech company has challenges to face:
1) Monitoring the health of the Containers and the servers they run on
2) Placing Containers on the best suited servers
3) Scaling Containers out and in as demand changes
4) Directing traffic to the most appropriate Container within the most appropriate server
So we could manually solve all of this if we wanted. However, there is a better way.
AWS EC2 Container Service "ECS"
At its core, ECS helps us manage and deploy Docker Containers across EC2 instances.
Sure, I could rattle off the other hundred things it does, but it does all of those things to
manage and deploy Docker Containers.
For our analogy: it's a service where instead of leasing our spaceships and being on our own,
we get a leadership team to manage the fleet. Each of our ships will also get a captain to help
coordinate with each other and manage space-containers on board. All the captains will
coordinate with an admiral who oversees the entire fleet. The admiral reports to us.
We'll still use the analogy when it helps, but some of these concepts seemed to get even more
confusing with it. Therefore for a few of them we'll just look at them 100% in the real world.
Clusters and the ECS Container Agent
The Cluster is like the admiral of all the spaceships Edges Group has leased. It receives
information from each of the spaceship captains and coordinates them. It also reports back
directly to us, versus us having to check in with each captain.
The ECS Container Agent is like the captain of each ship, that reports back on the status of
the ship itself and the space-containers.
An EC2 instance that has a Container Agent and is part of a Cluster is referred to as
a Container Instance. This is like a spaceship with a captain that knows it belongs to
specific admiral's fleet.
If we have a Cluster named "Luster" and 4 EC2 instances with a Container Agent that's
configured to point to "Luster": Then our Cluster is coordinating with 4 instances. In other
words our Cluster has 4 instances.
Sound complex? Nope, simple as pie actually. In fact creating a Cluster is the easiest part.
So, let's step through ALL the things needed to set up a complete Cluster.
1) Create a Cluster.
Creating a Cluster is literally nothing more than navigating to the ECS console and creating
an EMPTY Cluster. With a name. That's it. Nothing else.
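For example, creating the "Luster" Cluster from earlier is one AWS CLI call:

```bash
# Creates an empty Cluster - just a name, nothing else.
aws ecs create-cluster --cluster-name Luster
```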
The console launch wizard makes it look like there's a lot more to creating a Cluster. It asks
for instances, instance types, VPCs, security groups, blah blah. Those are things we use with
a Cluster, but they're not a Cluster. That's actually AWS trying to help us out by using
CloudFormation to create all the supporting components.
What supporting components? Instances, VPCs, EBS volumes, IAM roles, etc. But these
things aren't directly a part of ECS. We still need them, and need to set them up, but they are
not a part of ECS.
With our spaceship analogy - the admiral and captains manage the spaceships. But the
spaceships are not a part of the admiral and captains.
Okay, I could be super literal and stop here. But let's walk through those supporting
components.
2) Create the IAM Role for the EC2 Instances to be used in the Cluster
EC2 instances that hook up with ECS need an IAM role with
the AmazonEC2ContainerServiceforEC2Role policy. This is actually a managed policy,
meaning that it's pre-made. So all that's needed to create this role is to:
a) Create a new IAM role with the type: Amazon EC2 Role for EC2 Container Service
And that's it. If you'd like to learn more about the wizardry surrounding IAM policies, I have
an in-depth write-up about it here:
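If you'd rather do step 2 from the CLI, a sketch looks roughly like this - the managed policy ARN is the real one, while the role name "ecsInstanceRole" is just a common convention:

```bash
# Trust policy letting EC2 instances assume the role.
cat > ecs-trust.json <<'EOF'
{
  "Version": "2012-10-17",
  "Statement": [{
    "Effect": "Allow",
    "Principal": { "Service": "ec2.amazonaws.com" },
    "Action": "sts:AssumeRole"
  }]
}
EOF

aws iam create-role --role-name ecsInstanceRole \
  --assume-role-policy-document file://ecs-trust.json

# Attach the pre-made managed policy.
aws iam attach-role-policy --role-name ecsInstanceRole \
  --policy-arn arn:aws:iam::aws:policy/service-role/AmazonEC2ContainerServiceforEC2Role

# EC2 instances consume roles through an instance profile.
aws iam create-instance-profile --instance-profile-name ecsInstanceRole
aws iam add-role-to-instance-profile \
  --instance-profile-name ecsInstanceRole --role-name ecsInstanceRole
```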
3) Create EC2 instances that can run the ECS Container Agent
The ECS Container Agent is what lets an EC2 instance hook up with ECS. While you can
manually install this on instances yourself, AWS has an ECS Optimized
AMI for us to use as well. Therefore the most straightforward way to create an instance
that's ready to hook up with ECS is to just use the AMI:
There's nothing crazy in it, so we'll still be able to customize it further if need be (and make
more decorated AMIs). The list of what's in it is:
The latest minimal version of the Amazon Linux AMI
The latest version of the Amazon ECS Container Agent
The recommended version of Docker for the latest ECS Container Agent
The latest version of the ecs-init package to run and monitor the Amazon
ECS agent
And we can find a list of all the latest optimized AMIs here:
Remember, instances with the Container Agent that are a part of an ECS Cluster are referred
to as "Container Instances." I keep reiterating this because at first look I thought it was some
play on words for it being an instance of our Containers or Images.
We'll come back and talk about strategies for launching and managing Container Instances in
the Launch Configurations and AutoScaling Groups section.
4) Point the ECS Container Agent on the instances to the Cluster we want them to join
All we have to do is add the line:
ECS_CLUSTER=YOURCLUSTERNAME
to the ECS config file at /etc/ecs/ecs.config. We can do this using a user data script. The
script can either write the ECS_CLUSTER line itself or we can keep our config file
in a secure S3 bucket and pull it in.
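A minimal user data sketch, assuming the instance was launched from the ECS Optimized AMI (so the agent is already installed):

```bash
#!/bin/bash
# Point this instance's ECS Container Agent at our Cluster.
# Replace YOURCLUSTERNAME with the actual Cluster name, e.g. Luster.
echo "ECS_CLUSTER=YOURCLUSTERNAME" >> /etc/ecs/ecs.config
```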
If we're doing it the manual way, we just set an environment variable on the docker run
command that starts the agent.
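A rough sketch of that manual invocation, using the stock amazon/amazon-ecs-agent image:

```bash
# Run the ECS Container Agent by hand and point it at our Cluster.
docker run --name ecs-agent --detach=true \
  --volume=/var/run/docker.sock:/var/run/docker.sock \
  --env=ECS_CLUSTER=YOURCLUSTERNAME \
  amazon/amazon-ecs-agent:latest
```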
You'll need to put in more options to the Docker run command if you're doing it manually.
The instances will automatically "join" the Cluster if you've done all of the above.
Obviously there's more to launching instances beyond this. Instance type, Security Groups,
etc etc. I'd say that's one of the biggest off-puts to ECS for those unfamiliar with AWS - it
assumes that we know a metric ton about AWS.
A Cluster can also manage these instances across availability zones within a region. That's
right. It can work with all of the great fault tolerance tools that AWS provides pretty simply.
If our instances are spread across the different AZs, our Docker Containers will be as well.
To do so, we'd need to launch the instances into differing VPC subnets. If you'd like to use a
non-default VPC, I have a great write-up here on it:
We'll also need to make use of an Application Load Balancer which we'll talk about after
the rest of the primary components.
Task Definitions
In context of our spaceship and space-containers analogy, a Task Definition is a specification
of exactly what's needed to set up our space-container(s) in our spaceships. Note
the (s) there. In a "Task Definition" we can say that the "Task it's defining" consists of
multiple space-containers.
(also note that Task Definition is both the real name of the resource AND what I'm calling it
in the analogy.)
For example, our Edges Bookstore "Task Definition" might consist of:
1) a bookstore space-container
2) a coffee shop space-container
3) a list of utilities, space, power, plumbing, etc levels for each space-container
If we created one "Task" from this "Task Definition" we'd have a bookstore and coffee shop.
They'd be linked in a specified amount of space in one of our spaceships. It'd use up certain
amounts of plumbing, electricity, etc.
In AWS ECS, this comparison carries over. A Task Definition is a specification of exactly
what's needed to set up our Container(s) on our Container Instances. Note the (s) there.
In a "Task Definition" we can say that the "Task it's defining" consists of multiple Docker
Containers.
For example, our Edges App "Task Definition" might consist of:
1) a Docker Image for our Edges web app
2) a Docker Image for our video processing app
3) the cpu, memory, port mappings, entry points, etc for each to-be-made Container
The Docker Images we specify in 1 and 2 are what we make the Containers from. In each, we
specify the levels of cpu, memory, etc. If you've done anything with Docker, these properties
should ring a bell. Most of the options we pass to our "Task Definition" about Containers are
the options we can pass to Docker when creating Containers. Like PortBindings,
MemoryReservation, and the like.
If we created one "Task" from this "Task Definition" we'd have a web app and video
processing app on one server. It'd reserve a certain amount of cpu, memory, bind to certain
ports, etc.
A Task Definition can be defined in the AWS console through the usual console UI OR
through a JSON format. The CLI can also be driven with that same JSON format.
Here's the sample template straight from AWS:
Here's the specific area that covers all the specific options and properties that can be used in a
task definition:
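To make the shape concrete, here's a stripped-down sketch (not the full AWS sample) of a Task Definition with two hypothetical Containers, registered via the CLI:

```bash
# Hypothetical "edges" Task Definition: a web app plus a video processing app.
cat > edges-task.json <<'EOF'
{
  "family": "edges",
  "containerDefinitions": [
    {
      "name": "edges-web",
      "image": "mycompany/edges-app:latest",
      "cpu": 128,
      "memory": 256,
      "essential": true,
      "portMappings": [{ "containerPort": 3000, "hostPort": 0 }]
    },
    {
      "name": "edges-video",
      "image": "mycompany/edges-video:latest",
      "cpu": 256,
      "memory": 512,
      "essential": false
    }
  ]
}
EOF

aws ecs register-task-definition --cli-input-json file://edges-task.json
```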
The real meat of the template is the containerDefinitions property. This is where we define
the Containers and their needs. In a containerDefinition, the image property is where we
point to the Docker Image we'd like used for that specific Container. We can define multiple
Containers all using different Images.
When building your own, I suggest starting from a sample Task Definition rather than a
blank page. Why is that? Because as soon as you see how many options there are for a Task
Definition, you're likely to be overwhelmed. This is especially true if you haven't done much
directly with Docker. So:
1) Start with a sample Task Definition as a base.
2) Begin modifying each of the properties and also cross reference it with the Task Definition
Parameters documentation.
This is where you'll need to read up on what each of the properties does. There aren't any real
shortcuts here folks. You've either profiled your Containers and know the needed memory or
not. Same for a lot of the other properties.
Properties of containerDefinitions like CPU and Memory aren't just ECS
things. Those are Docker concepts. You'll need to figure out what each of your
containers needs and uses.
the essential property on a containerDefinition means that if that
container goes down, they all do.
entrypoint and command just overwrite whatever you have in the image.
You don't have to re-define them here.
links is how you specify that containers can communicate with each other. It's
like using the --link option in docker run .
Remember, Task Definitions are just a set of instructions. They're independent of Clusters.
Meaning that we can use them in any Cluster (assuming they have the resources). We put a
Task Definition to work by either:
a) Running a Task
or
b) Creating a Service
The best visual here is that we have a set of instructions on how to build our space-containers
sets (Task Definitions). Now we're handing it to the admiral (Cluster). The admiral knows we
want it set up in the spaceships, but needs to know how (and how many).
So of course the next step is for the admiral (Cluster) to coordinate with the captains (ECS
Container Agents) and deploy it to the proper ships (Instances).
Running a Task
We've defined what's needed to set up a full-blown "Edges" shopping experience with our
Containers. We did so through the aforementioned "Task Definition." We can hand this to
our spaceships' admiral (Cluster) and ask them to create and run a "Task."
The process of running a task goes along the lines of: we hand the "Task Definition" to the
admiral (Cluster), who coordinates with the captains (Container Agents) to figure out which
spaceship has the space and utility hookups to best fit it.
When it knows what spaceship is the best fit, it will set the space-containers up there - in our
case a bookstore and coffee shop. These shops will have all the required utilities, like
electricity and plumbing, needed to run.
In ECS, in our Cluster, this is like choosing to Run a Task. We specify the Task Definition
and the Cluster will create all the Containers we've specified within it. It will coordinate with
the different Container Instances (EC2 servers with the container agents) and find the ones
that can best fit the entire "Task."
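On the CLI, using our hypothetical names from earlier, that's a single call:

```bash
# Hand the Cluster our Task Definition and ask it to place 1 Task.
aws ecs run-task --cluster Luster --task-definition edges --count 1
```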
When running a Task, we can specify to run more than 1 Task. This is the equivalent of
saying: "Hey Cluster, make 3 copies of everything in this Task Definition."
Therefore, if we had a Task Definition that called for 1 web app Container and 1 video app
Container linked, and we wanted 3 Tasks from it: then we'll wind up with 3 pairs of those
apps.
Spreading Tasks Across Availability Zones (AZs)
Obviously, we want some say in how our Tasks are placed on our Instances. This is
where Task Placement strategies come into play. We tell our Cluster how we want them
spread across our Container Instances.
Assuming that our Instances are being launched into different Availability Zones...
...then the only thing we have to do to spread traffic across zones is select a strategy. There's
5 main strategies we get to select from:
1) AZ Balanced Spread - Spreads Tasks evenly across AZ's. Within an AZ, it spreads Tasks
evenly among Instances.
2) AZ Balanced Binpack - Spreads Tasks evenly across AZ's. Within an AZ, it tries to use
the least amount of Instances. How? By prioritizing Instances with the least amount of
available CPU or memory.
3) Binpack - Places tasks on instances with the least amount of CPU or Memory. Doesn't
care about AZ spread.
4) One Task Per Host - Places at most 1 Task from the Task Definition on each Instance.
5) Custom - Define your own placement strategy from the raw type and field parameters.
The docs cover Task Placement, but unfortunately they aren't the most helpful because they
don't explain the options from the console. Instead they just give the values of the
API's type and field parameters. When it comes down to it, the above 5 options we covered
are just a mix-and-match of 2 things: the strategy type (spread, binpack, random)
and the field it applies to (availability zone, instance ID, CPU, memory).
For my own sanity, I'm not going to dive into those just yet. I've experimented with them
some, but since the docs are so sparse it's hard to know what's happening. The default 5
strategies seem pretty solid.
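For the curious, here's a sketch of what "AZ Balanced Spread" looks like when expressed as raw type/field pairs on the CLI, using our hypothetical names:

```bash
# Spread Tasks evenly over Availability Zones, then evenly over
# Instances within each zone.
aws ecs run-task --cluster Luster --task-definition edges --count 3 \
  --placement-strategy \
    type=spread,field=attribute:ecs.availability-zone \
    type=spread,field=instanceId
```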
When it comes down to it, running a Task boils down to one purpose: giving our Cluster a
Task Definition and telling it to put Tasks on our Container Instances.
Yes, there's a lot of details we can configure, but they still all head towards that same purpose.
However, the deal ends here. If one of our Tasks goes down, that's the end. The Cluster won't
try and put it back up. It also won't give us detailed metrics on them either. Oh, also, how do
we do service discovery? How do we load balance between Tasks??
Given these issues, running a Task (or Tasks) is going to be limited. If we want something
like a web server, we need a different option. That's where Services come into play. Don't
you love these amazingly descriptive names?
Services
As we just covered, Running a Task is a very "one and done" type of deal. To remember it,
think "one and done" because it's "run and done."
Creating a Service tells our Cluster to go beyond just running Tasks - it manages them. We
create a Service and hand it a Task Definition. It takes the Task Definition and does a number
of things for us. Specifically:
1) Sets up our Tasks from our Task Definitions on the best suited Container Instances (same
as running a Task).
2) Reports detailed metrics on our Tasks' CPU and Memory usage.
3) Reports a play-by-play of "events" - Tasks launching, stopping, updating, reaching a
steady state.
4) Keeps the desired number of Tasks running, and rolls out Task Definition revisions
without an outage (via maximum and minimum healthy percents).
5) Scales our Tasks in and out in response to CloudWatch alarms (via Scaling Policies).
6) Distributes our customer demand evenly to all Tasks via Load Balancer (if we hook one
up)
It becomes a little unwieldy for me to continue with the analogy we've set up here. If you're
interested in keeping the visual though just think of it as us asking the admiral (Cluster) to
also...
Let's dive into each of the things a Service does for us in detail now.
1. Set up of our Tasks on the best suited Container
Instances
This is what plain running a Task does. Services also do this. They also give us the same
option of selecting a Task Placement strategy: the options of things like AZ Balanced Spread,
Bin Pack, etc.
So they do everything running a Task does PLUS the other 5 things we're going to step
through.
When we have a Service, we can get specific metrics about it. How much of our allotted CPU
and Memory are our Service's Tasks taking up? How much is reserved? How much is being
utilized?
The docs on the different metrics are here. The two for Services are CPU and Memory
Utilization. Note that there are also metrics available for the Cluster as a whole.
It also reports back to us a detailed list of "events." The events are simply a play-by-play of
what ECS is doing with the Tasks. Is it launching a Task? Is it removing a Task? Is it
updating a Task? At what times are these things happening? Are they in a steady state?
Lots of extra info that goes far and beyond just whether or not a Task is Running, Pending, or
Stopped.
The power of these metrics really comes into play when utilizing CloudWatch alarms. We
can pick a metric from ECS, either Cluster wide or Service wide metrics, and watch them. If
they go above or below thresholds we've set, we can respond to them.
CloudWatch alarms are one of the "supporting components." They're not so complex that
they require their own section here. At the same time, there's so much to them that if we did
make a section, it'd be huge. To create an alarm we just pick a CloudWatch metric and set a
threshold. If that threshold is crossed the alarm sounds.
We can also set a maximum percent of Tasks and a minimum healthy percent of Tasks.
These percents represent the number range of Tasks that our Service can have live at any
given time. They're used in updating Tasks when we revise our Task Definitions. For
example, we might update our Task Definition to use an updated Docker Image for the
Containers.
The maximum and minimum percents are used to deploy updated Tasks without ever having
a service outage.
Let's walk through an example. Say our Service is set with:
1) Number of Tasks: 4
2) Minimum Healthy Percent: 100%
3) Maximum Percent: 200%
What does this have to do with updates? Well, it determines how ECS will update everything.
To update the Tasks in a Service, we first revise the Task Definition.
This means: make a new version of it. We don't "update" an existing one. We create a revision
of an old one.
Which is either done by a couple of simple clicks in the console, or passing up the JSON
formatted Task Definition via CLI.
When we do this, we'll trigger an update to all Tasks being managed by the Service. The
Service won't just remove all of the current ones and then add all of the new ones. Instead it'll
follow some rules based on our maximum percent and minimum healthy percent values
we set:
a) Any old Tasks that are currently receiving traffic will be "drained".
In other words, it will allow existing requests to complete, but will deny new ones. This is
related to Elastic Load Balancers, which we'll get into in a bit.
b) Deployment of the new Tasks is based on our Minimum Healthy and Maximum percents.
Since we have our Maximum Percent at 200%, it will add 4 of the new revisions first and
then remove the 4 old Tasks.
It will briefly run 8 Tasks (that 200% maximum). Once the new Tasks are up and live, it
drains and removes the 4 old ones - bringing us back to Number of Tasks: 4, i.e. 100%.
Yep. And we don't have to deal with any of the headaches of doing this manually!
The general flow for auto scaling a Service is:
1) Pick a metric that ECS feeds into CloudWatch
2) Set a CloudWatch alarm to trigger when the metric goes above or below thresholds
3) Respond to the triggered alarm with a Scaling Policy
It's a very simple process. Any service that feeds data into CloudWatch can be measured,
monitored and responded to. ECS feeds two sets of metrics in: Cluster-wide metrics and
Service-specific metrics.
We would pick a metric in CloudWatch, like "CPU Utilization" of our Service. We'd set a
CloudWatch alarm with a threshold like "When it gets above 50% of the total available
CPU."
With the alarm set up, we'd hand that to our ECS Service. We'd create "Scaling Policies",
which are actions to take when a specified CloudWatch alarm is triggered. Our Scaling
Policy may be to "Add 1 Task."
In terms of setting up these resources - watching a metric and setting up an alarm are all done
through CloudWatch. The Scaling Policy is done through ECS in the console. The CLI and
CloudFormation instead use a separate service, Application Auto Scaling, in combo with
ECS, to set this up.
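A rough sketch of that CLI route, with our hypothetical names (depending on your account setup you may also need to supply an IAM role ARN):

```bash
# Tell Application Auto Scaling it may adjust our Service's DesiredCount.
aws application-autoscaling register-scalable-target \
  --service-namespace ecs \
  --resource-id service/Luster/edges-service \
  --scalable-dimension ecs:service:DesiredCount \
  --min-capacity 2 --max-capacity 10

# The CloudWatch alarm: "CPU above 50% for 3 straight minutes."
aws cloudwatch put-metric-alarm \
  --alarm-name edges-cpu-high \
  --namespace AWS/ECS --metric-name CPUUtilization \
  --dimensions Name=ClusterName,Value=Luster Name=ServiceName,Value=edges-service \
  --statistic Average --period 60 --evaluation-periods 3 \
  --threshold 50 --comparison-operator GreaterThanThreshold
```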
Again, we're focusing on concepts here. There's an entire list of tech steps here for setting up
Service Auto Scaling.
Services are also what we use for service discovery (note the lower case "s"). We can launch
multiple Services into a single Cluster. For example, we might have a Book Service, that
launches Tasks for a book app. We might have a Movie Service that launches Tasks for a
movie app.
Using an Application Load Balancer, we can route to those differing Services using a variety
of rules. The Book Service might be available at /edges . And our Movie Service might be
available at /cube-buster . But they'd all be sharing the same set of Container Instances!
Application Load Balancers
In simple terms, an Application Load Balancer saves us from a major pain:
Having to load balance traffic across servers and then across Tasks on a server.
NOW. Obviously, this assumes that we have an Application Load Balancer created. "ALBs"
are one of the supporting concepts. They're practically required - how else will traffic
reasonably reach our Containers? But the docs and most guides out there assume knowledge
of Elastic Load Balancing. Well that's BS, so let's actually learn about them.
This is where Application Load Balancers come into play. These accept incoming
traffic on a protocol (i.e. HTTP) and port (i.e. 80) we specify. They then route the traffic to
Target Groups, based on the path (i.e. "/").
A Target Group, in context of ECS, is where our Service will register its Tasks. Once these
Tasks are registered, our Load Balancer knows to spread traffic between them.
If you've worked with Elastic Load Balancers, this should point out the differences between
Classic and Application. Classic Load Balancers spread traffic between servers. Application
Load Balancers spread traffic between applications. In this case, the applications are our ECS
Tasks.
It's a pretty neat concept where we just quit caring about balancing between servers and
subsequently the apps within them. We do away with that and say:
"Hey, just tell me where all the live Applications are. I'll spread traffic between those."
An Application Load Balancer setup consists of four parts:
a) The Load Balancer itself - from a CLI and API standpoint this is almost just a central point
to attach options and configurations to.
b) Target Groups - a common group of apps, in our case Tasks, that will receive load
balanced traffic.
c) Listeners - what ports and protocols is our Load Balancer listening on? i.e. Port 80,
Protocol HTTP
d) Listener Rules - when traffic arrives on a particular Listener, what do we do? i.e. Send it to
a particular Target Group when the path is /
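As a sketch, those four parts map to CLI calls like the following (all IDs and ARNs are placeholders you'd substitute with your own):

```bash
# a) The Load Balancer itself, in two subnets (two AZs).
aws elbv2 create-load-balancer --name edges-alb \
  --subnets subnet-aaaa1111 subnet-bbbb2222 \
  --security-groups sg-cccc3333

# b) A Target Group our Service's Tasks will register into.
aws elbv2 create-target-group --name edges-targets \
  --protocol HTTP --port 80 --vpc-id vpc-dddd4444

# c) + d) A Listener on port 80 whose default rule forwards to the group.
aws elbv2 create-listener --load-balancer-arn <alb-arn> \
  --protocol HTTP --port 80 \
  --default-actions Type=forward,TargetGroupArn=<target-group-arn>
```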
Conceptually, a load balanced setup for one of our "Edges" Applications would be:
1) Create an Application Load Balancer
2) Create a Target Group for our Edges Tasks
3) Add a Listener to the Load Balancer on port 80, protocol HTTP
4) Add a Listener Rule that sends traffic for path / to the Target Group
5) Our Service registers ALL of our Edges Tasks in our Service with the Target Group
Our Application Load Balancer can now receive traffic and spread it to all the Tasks being
managed by our Edges Service.
Now suppose we want to host our "CubeBuster" movie app alongside Edges. Conceptually:
1) Create Docker Images for the CubeBuster app
2) Create a Task Definition with the new Images. We'll call it the CubeBuster Task
Definition.
3) Create a new Service out of the CubeBuster Task Definition. Launch it into the same
Cluster we've been using (with the Edges Service).
4) Create a new Target Group for the CubeBuster Tasks
5) Add a Listener Rule that sends traffic for path /cubebuster to the new Target Group
6) Our CubeBuster Service registers ALL of our CubeBuster Tasks managed by our
CubeBuster Service with the Target Group
Priority is just which rule takes precedence - the lower the number, the higher the priority.
Brilliant inverse relationship, AWS.
With our new configuration, if a request comes in for /cubebuster , it will go to our
CubeBuster Targets.
Note here - don't register targets manually if you're working with ECS. ECS will do that for
us.
Hooking all of this up to a Service boils down to:
1) Creating a Target Group for your Service's Tasks to register with.
2) Creating your Application Load Balancer, and making a Listener and Listener Rule
that points to the Target Group.
3) Register your Application Load Balancer and Target Group with the ECS Service.
In the console wizard, it'll have you walk through and create the Target Group, Listener and
Listener Rules. Just remember, don't register any targets manually. ECS does that.
CLI and CloudFormation require all of the above concepts to be created individually. After
creating them, we then have to configure them to point to and reference each other. I plan on
releasing more on this aspect, but it's a pretty heavy topic to dive into.
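As a taste, the final hookup (step 3) from the CLI looks roughly like this - the container name and port must match the Task Definition, and "ecsServiceRole" is just a conventional role name:

```bash
# Create the Service and hand it the Target Group to register Tasks with.
aws ecs create-service --cluster Luster \
  --service-name edges-service \
  --task-definition edges \
  --desired-count 4 \
  --role ecsServiceRole \
  --load-balancers "targetGroupArn=<target-group-arn>,containerName=edges-web,containerPort=3000"
```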
We've discussed scaling and load balancing our Tasks (and thus Containers), but we're still
missing something. We've discussed how to set up an EC2 instance, but we probably don't
want to set them up individually each time. This is where Launch Configurations and
AutoScaling Groups come into play.
Launch Configurations and AutoScaling Groups
When EC2 Instances have the agent on them configured to point to a cluster, they "join" the
cluster. At this point they've fulfilled their entire destiny and are full-fledged Container
Instances.
This obviously isn't the route you'll likely want to take. Too much manual labor. To speed
things up, you could always get an instance exactly how you want it and then create an AMI
from it. This way, every time you make a new instance, there's no extra configuration.
These allow us to automate and manage many Instances at once. Like the Application Load
Balancer, they're a "supporting concept." These aren't directly related to ECS, and are used in
almost anything that leverages EC2. Since they deal with EC2, which is the bread and butter
of everything server related, let's dive in a bit.
A Launch Configuration is just a template for launching instances: the AMI, instance type,
security groups, user data script, and so on. With the Launch Configuration in hand, we now
create an AutoScaling Group. These take a Launch Configuration and automate / manage the
creation and scaling of instances.
They're as simple as: "keep this many instances from this Launch Configuration running,
across these subnets, and add or remove instances according to these rules."
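A sketch of the pair from the CLI - the AMI ID is a placeholder (look up the current ECS Optimized AMI for your region), and the user data file is the one-liner from the Cluster section:

```bash
# The template: ECS Optimized AMI + instance role + user data.
aws autoscaling create-launch-configuration \
  --launch-configuration-name edges-ecs-lc \
  --image-id ami-XXXXXXXX \
  --instance-type t2.medium \
  --iam-instance-profile ecsInstanceRole \
  --user-data file://ecs-cluster-userdata.sh

# The group: keep 5 instances alive, spread across two subnets (two AZs).
aws autoscaling create-auto-scaling-group \
  --auto-scaling-group-name edges-ecs-asg \
  --launch-configuration-name edges-ecs-lc \
  --min-size 5 --max-size 5 --desired-capacity 5 \
  --vpc-zone-identifier "subnet-aaaa1111,subnet-bbbb2222"
```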
Getting these Launch Configurations and AutoScaling Groups working with ECS involves
exactly 2 steps:
1) Setting up the Launch Configuration to use the ECS Optimized AMI; or using a user data
script to manually configure the ECS Container Agent.
And...
2) Pointing the container agent to the Cluster you'd like them to join. Which we covered in
the Cluster section.
If your Cluster is live, and your Launch Configuration is set up to either use the ECS
Optimized AMI or install the ECS Container Agent, and you've configured it to point to
the correct Cluster - then BOOM. That's all you have to do to launch
instances into your Cluster.
If this sounds like the incoherent rants of a madman, check out my guide here on VPCs.
Summary
AWS ECS just helps us manage and deploy our Docker Containers across EC2 instances.
That's really it. We can bog it down with 1000 other supporting things, but it's relatively
sparse.
The most confusing part of grasping ECS is due to the number of assumed supporting
components. Like Application Load Balancers, AutoScaling Groups, Launch Configurations,
VPCs, etc. These things are definitely used with ECS, but they're not ECS.
Task Definition - Everything your Docker Container(s) need(s) to persist on a server. CPU,
Memory, volumes, Docker Images, etc. Think blueprints.
EC2 Instance - the servers. These are the spaceships in our analogy.
ECS Container Agent - software installed on an EC2 instance that helps coordinate with
other Agents, monitor local Docker Containers and communicate with the Cluster. These are
the captains of each spaceship.
Cluster - The parent to which our Tasks and Agents belong. It's a very ambiguous
concept that's mostly there to represent membership.
For our analogy, think of an admiral. It's in charge of the fleet, all the captains report back to
it. It reports to us so that we don't have to check in with each captain individually.
Running a Task - we hand our Cluster a Task Definition; it creates a Task and places it on
the best suited server; the best suited server is found by coordinating with the ECS Container
Agents.
Creating a Service - we hand our Cluster a Task Definition; it does the same thing as
running a Task PLUS:
1) Reports detailed metrics on our Tasks
2) Reports a play-by-play of events
3) Keeps the desired number of Tasks running and deploys Task Definition revisions
without an outage
4) Scales Tasks in and out via CloudWatch alarms and Scaling Policies
5) Distributes our customer demand evenly to all Tasks via Load Balancer (if we hook one
up)
The confusing aspect of ECS is that it practically requires a TON of supporting concepts.
And unfortunately the docs just kind of mention them in passing. Don't mistake this
requirement as an "ECS is just complicated" thing. Anything beyond playing around
will require these supporting components.
Application Load Balancers are needed to direct traffic to Containers. They allow for a
variety of powerful concepts like name-spacing sets of containers as "Target Groups." This
allows us to direct traffic to different sets based on rules like the request Protocol or Path.
While not required, the following help bolster our ECS setup:
CloudWatch alarms and metrics to respond to events in our Clusters and Services. This is
how we achieve auto scaling.
Launch Configurations and Auto Scaling Groups to manage sets of Instances vs. setting
them up piecemeal.
VPCs to create varying subnets to allow for multiple availability zone deploys
IAM to create the proper roles for your Instances, Services and Tasks
Caffeine to make sure that you don't fall asleep while reading through 100s of documentation
pages.
Final Thoughts
The obvious next steps need to be an actual implementation of everything. However, now
that you understand the concepts, you won't be clicking in the dark! If you've done an
implementation before, hopefully this has shed some light on why things worked as they did.
I feel the need to mention this one more time, if you're looking for a step-by-step of
implementation, I have a full one here:
Guide to Fault Tolerant and Load Balanced AWS Docker Deployment on ECS
The reason I opted to make this long conceptual guide is because implementing ECS is the
easy part. Understanding it is the tricky part. It's just a few clicks or CLI calls. So "80%
mental, 20% mechanical" wound up being true here as well.