Notes AWS Solutions Architects


EC2 Instances
Amazon EC2 is free to try. There are multiple ways to pay for Amazon EC2 instances:
On-Demand, Savings Plans, Reserved Instances, and Spot Instances. You can also pay
for Dedicated Hosts, which provide EC2 instance capacity on physical servers dedicated
for your use. For more information on how to optimize your Amazon EC2 spend, visit
the Amazon EC2 Cost and Capacity page.

On-Demand
With On-Demand instances, you pay for compute capacity by the hour or by the second,
depending on which instances you run. No longer-term commitments or upfront
payments are needed. You can increase or decrease your compute capacity depending
on the demands of your application and pay only the specified hourly (or per-second)
rate for the instances you use.
On-Demand instances are recommended for:
- Users that prefer the low cost and flexibility of Amazon EC2 without any upfront payment or long-term commitment
- Applications with short-term, spiky, or unpredictable workloads that cannot be interrupted
- Applications being developed or tested on Amazon EC2 for the first time

Spot instances
Amazon EC2 Spot instances allow you to request spare Amazon EC2 computing
capacity for up to 90% off the On-Demand price.
Spot instances are recommended for:
- Applications that have flexible start and end times
- Applications that are feasible only at very low compute prices
- Users with urgent computing needs for large amounts of additional capacity

Savings Plans
Savings Plans are a flexible pricing model that offers low prices on EC2 and Fargate
usage, in exchange for a commitment to a consistent amount of usage (measured in
$/hour) for a one- or three-year term.
Dedicated Hosts
A Dedicated Host is a physical EC2 server dedicated for your use. Dedicated Hosts can
help you reduce costs by allowing you to use your existing server-bound software
licenses, including Windows Server, SQL Server, and SUSE Linux Enterprise Server
(subject to your license terms), and can also help you meet compliance requirements.
- Can be purchased On-Demand (hourly).
- Can be purchased as a Reservation for up to 70% off the On-Demand price.

Per-Second Billing
With per-second billing, you pay only for what you use. EC2 per-second billing removes
the cost of unused minutes and seconds from your bill. Focus on improving your
applications instead of maximizing hourly usage, especially for instances running over
irregular time periods such as dev/testing, data processing, analytics, batch processing,
and gaming applications.
EC2 usage is billed in one-second increments, with a minimum of 60 seconds. Similarly,
provisioned storage for Amazon Elastic Block Store (EBS) volumes is billed in per-second
increments, with a 60-second minimum. Per-second billing is available for instances launched in:
- On-Demand, Savings Plans, Reserved, and Spot instances
- All regions and Availability Zones
- Amazon Linux, Windows, and Ubuntu
For details on related costs like data transfer, Elastic IP addresses, and EBS Optimized
Instances, visit the On-Demand pricing page.

Connecting to your EC2 instance via SSH


You need to ensure that port 22 is allowed on the security group of your EC2 instance.
A security group acts as a virtual firewall that controls the traffic for one or more
instances. When you launch an instance, you associate one or more security groups
with the instance. You add rules to each security group that allow traffic to or from its
associated instances. You can modify the rules for a security group at any time; the
new rules are automatically applied to all instances that are associated with the
security group.
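As a rough illustration (the security group ID and CIDR range below are placeholders), an ingress rule allowing SSH on port 22 could be added with boto3 like this:

import boto3

ec2 = boto3.client("ec2")

# Allow inbound SSH (TCP port 22) from a specific administrative CIDR range.
ec2.authorize_security_group_ingress(
    GroupId="sg-0123456789abcdef0",
    IpPermissions=[
        {
            "IpProtocol": "tcp",
            "FromPort": 22,
            "ToPort": 22,
            "IpRanges": [{"CidrIp": "203.0.113.0/24", "Description": "Admin SSH access"}],
        }
    ],
)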
LOAD BALANCER

Routing types in Elastic Load Balancing (Application Load Balancer)


Path-based Routing: Path-based routing allows you to route traffic to different target
groups based on the path of the incoming URL. For example, you could configure your
load balancer to route requests with the path "/api" to a different target group than
requests with the path "/app".

Host-based Routing: Host-based routing allows you to route traffic to different target
groups based on the hostname of the incoming request. For example, you could
configure your load balancer to route requests with the hostname "api.example.com"
to a different target group than requests with the hostname "www.example.com".
HTTP Header-based Routing: HTTP header-based routing allows you to route traffic to
different target groups based on the value of an HTTP header in the incoming request.
For example, you could configure your load balancer to route requests with the "User-
Agent" header set to "iOS" to a different target group than requests with the "User-
Agent" header set to "Android".
Query String Parameter-based Routing: Query string parameter-based routing allows
you to route traffic to different target groups based on the value of a query string
parameter in the incoming request. For example, you could configure your load
balancer to route requests with the "category" parameter set to "books" to a different
target group than requests with the "category" parameter set to "electronics".
Source IP Address CIDR-based Routing: Source IP address CIDR-based routing allows
you to route traffic to different target groups based on the source IP address of the
incoming request. For example, you could configure your load balancer to route
requests originating from a specific IP address range to a different target group than
requests originating from a different IP address range.
By using these routing methods, you can customize the behavior of your load balancer
and direct traffic to the appropriate target groups based on a variety of factors,
improving the performance and reliability of your applications.
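For example, a path-based rule like the "/api" case above could be created on an Application Load Balancer listener with boto3 roughly as follows (the listener and target group ARNs are placeholders):

import boto3

elbv2 = boto3.client("elbv2")

# Forward requests whose path starts with /api to a dedicated target group.
elbv2.create_rule(
    ListenerArn="arn:aws:elasticloadbalancing:us-east-1:111122223333:listener/app/my-alb/abc123/def456",
    Priority=10,
    Conditions=[{"Field": "path-pattern", "Values": ["/api/*"]}],
    Actions=[
        {
            "Type": "forward",
            "TargetGroupArn": "arn:aws:elasticloadbalancing:us-east-1:111122223333:targetgroup/api-tg/abc123",
        }
    ],
)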
Auto Scaling Group Attributes

Launch Template
is similar to a launch configuration, in that it specifies instance configuration
information such as the ID of the Amazon Machine Image (AMI), the instance type, a
key pair, security groups, and the other parameters that you use to launch EC2
instances. Also, defining a launch template instead of a launch configuration allows
you to have multiple versions of a template.
With launch templates, you can provision capacity across multiple instance types using
both On-Demand Instances and Spot Instances to achieve the desired scale,
performance, and cost.
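A minimal sketch of creating a launch template and a second version of it with boto3 (all IDs and names are placeholders):

import boto3

ec2 = boto3.client("ec2")

# Version 1 of the template: AMI, instance type, key pair, and security group.
ec2.create_launch_template(
    LaunchTemplateName="web-tier",
    LaunchTemplateData={
        "ImageId": "ami-0123456789abcdef0",
        "InstanceType": "t3.micro",
        "KeyName": "my-key-pair",
        "SecurityGroupIds": ["sg-0123456789abcdef0"],
    },
)

# Add a new version that only changes the instance type; an Auto Scaling group
# can then be pointed at a specific version or at $Latest.
ec2.create_launch_template_version(
    LaunchTemplateName="web-tier",
    SourceVersion="1",
    LaunchTemplateData={"InstanceType": "t3.small"},
)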
A launch configuration
is an instance configuration template that an Auto Scaling group uses to launch EC2
instances. When you create a launch configuration, you specify information for the
instances, including the ID of the Amazon Machine Image (AMI), the instance type, a key
pair, one or more security groups, and a block device mapping.

It is not possible to modify a launch configuration once it is created. Instead, create a
new launch configuration that uses the correct instance type, then modify the Auto
Scaling group to use this new launch configuration. Finally, to clean up, delete the old
launch configuration as it is no longer needed.

What are ENIs?


ENI stands for "Elastic Network Interface" and is a key component of the network
infrastructure in Amazon Web Services (AWS). An ENI is a network virtual interface
that can be attached to an Amazon EC2 (Elastic Compute Cloud) instance in the AWS
Cloud.
Each EC2 instance can have one or more ENIs associated with it, allowing you to
connect to specific virtual networks (VPCs) and subnets. ENIs can also be used to
enable communication between EC2 instances on a private network, as well as to
create network redundancy in the event of failures.
Some important features of ENIs include:
- Each ENI has a private IP address and a public IP address (if assigned) that can be used to connect to the EC2 instance.
- ENIs can be configured with network security rules to control incoming and outgoing traffic.
- ENIs can be moved between EC2 instances in the same AWS Availability Zone.
- ENIs can be used to enable high-performance network connections, such as link aggregation.
In summary, ENIs are a fundamental part of the AWS network architecture, allowing
users to control and configure network connectivity for their EC2 instances in the AWS
Cloud.

AWS Lake Formation


is a service that makes it easy to set up a secure data lake in days. A data lake is a
centralized, curated, and secured repository that stores all your data, both in its
original form and prepared for analysis. A data lake enables you to break down data
silos and combine different types of analytics to gain insights and guide better business
decisions.
Lake Formation is a service provided by Amazon Web Services (AWS) that makes it
easier for organizations to build, secure, and manage data lakes on AWS. A data lake is
a central repository that allows organizations to store all their structured and
unstructured data at any scale. It provides a single source of truth for all data, which
can be used for various purposes such as analytics, machine learning, and business
intelligence.
Lake Formation provides a set of tools and features that allow users to create, manage,
and secure their data lakes on AWS. These tools include:
Data Catalog: A centralized metadata catalog that allows users to discover, manage,
and share their data assets across their organization.
Data Ingestion: A set of tools that enables users to easily ingest data from various
sources into their data lake, including databases, streaming data, and files.
Data Security: A comprehensive set of security features that ensure the confidentiality,
integrity, and availability of data in the data lake.
Data Transformation: A set of tools that enable users to transform and cleanse their
data to prepare it for analytics and other use cases.
Data Access: A set of tools that allow users to access and query data in the data lake
using various analytics and visualization tools.
Overall, Lake Formation simplifies the process of creating and managing data lakes on
AWS, allowing organizations to focus on deriving insights from their data rather than
worrying about the underlying infrastructure and security.

S3 Simple Storage Service


By default, all Amazon S3 resources such as buckets, objects, and related subresources
are private, which means that only the AWS account holder (resource owner) that
created them has access. The resource owner can optionally grant access permissions
to others by writing an access policy. In S3, you can also set an object's permissions
during upload to make it public.
Amazon S3 offers access policy options broadly categorized as resource-based policies
and user policies. Access policies you attach to your resources (buckets and objects)
are referred to as resource-based policies.
For example, bucket policies and access control lists (ACLs) are resource-based policies.
You can also attach access policies to users in your account. These are called user
policies. You may choose to use resource-based policies, user policies, or some
combination of these to manage permissions to your Amazon S3 resources.
You can also manage the public permissions of your objects during upload. Under
Manage public permissions, you can grant read access to your objects to the general
public (everyone in the world) for all of the files that you're uploading. Granting public
read access is applicable to a small subset of use cases, such as when buckets are used
for websites.
Amazon S3 is composed of buckets, object keys, object metadata, object tags, and
many other components as shown below:
An Amazon S3 bucket name is globally unique, and the namespace is shared by all AWS
accounts.
An Amazon S3 object key refers to the key name, which uniquely identifies the object
in the bucket.
An Amazon S3 object metadata is a name-value pair that provides information about
the object.
An Amazon S3 object tag is a key-value pair used for object tagging to categorize
storage.
You can perform S3 Select to query only the necessary data inside the CSV files based
on the bucket's name and the object's key.
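A minimal S3 Select sketch with boto3, assuming a CSV object with a header row (the bucket, key, and column names are placeholders):

import boto3

s3 = boto3.client("s3")

resp = s3.select_object_content(
    Bucket="example-bucket",
    Key="data/orders.csv",
    ExpressionType="SQL",
    Expression="SELECT s.order_id, s.total FROM s3object s WHERE s.status = 'SHIPPED'",
    InputSerialization={"CSV": {"FileHeaderInfo": "USE"}},
    OutputSerialization={"CSV": {}},
)

# The response is an event stream; Records events carry the matching rows.
for event in resp["Payload"]:
    if "Records" in event:
        print(event["Records"]["Payload"].decode("utf-8"))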

Amazon S3 access points


simplify data access for any AWS service or customer application that stores data in S3.
Access points are named network endpoints that are attached to buckets that you can
use to perform S3 object operations, such as GetObject and PutObject.
Each access point has distinct permissions and network controls that S3 applies for any
request that is made through that access point. Each access point enforces a
customized access point policy that works in conjunction with the bucket policy that is
attached to the underlying bucket. You can configure any access point to accept
requests only from a virtual private cloud (VPC) to restrict Amazon S3 data access to a
private network. You can also configure custom block public access settings for each
access point.
You can also use Amazon S3 Multi-Region Access Points to provide a global endpoint
that applications can use to fulfill requests from S3 buckets located in multiple AWS
Regions. You can use Multi-Region Access Points to build multi-Region applications
with the same simple architecture used in a single Region, and then run those
applications anywhere in the world. Instead of sending requests over the congested
public internet, Multi-Region Access Points provide built-in network resilience with
acceleration of internet-based requests to Amazon S3. Application requests made to a
Multi-Region Access Point global endpoint use AWS Global Accelerator to
automatically route over the AWS global network to the S3 bucket with the lowest
network latency.
With S3 Object Lock
you can store objects using a write-once-read-many (WORM) model. Object Lock can
help prevent objects from being deleted or overwritten for a fixed amount of time or
indefinitely. You can use Object Lock to help meet regulatory requirements that
require WORM storage, or to simply add another layer of protection against object
changes and deletion.
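A sketch of placing a retention period on an object with boto3; this assumes the bucket was created with Object Lock enabled, and the names and date are placeholders:

import boto3
from datetime import datetime, timezone

s3 = boto3.client("s3")

# Protect the object in WORM fashion until the retain-until date.
s3.put_object_retention(
    Bucket="example-bucket",
    Key="records/audit-2023.log",
    Retention={
        "Mode": "COMPLIANCE",
        "RetainUntilDate": datetime(2026, 1, 1, tzinfo=timezone.utc),
    },
)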

Amazon S3 notification
The Amazon S3 notification feature enables you to receive notifications when certain
events happen in your bucket. To enable notifications, you must first add a notification
configuration that identifies the events you want Amazon S3 to publish and the
destinations where you want Amazon S3 to send the notifications. You store this
configuration in the notification subresource that is associated with a bucket.
Amazon S3 supports the following destinations where it can publish events:
- Amazon Simple Notification Service (Amazon SNS) topic
- Amazon Simple Queue Service (Amazon SQS) queue
- AWS Lambda
Take note that Amazon S3 event notifications are designed to be delivered at least
once, and a given event notification can be routed to only one destination. You cannot
attach two or more SNS topics or SQS queues to the same S3 event notification. If you
need to fan out a single event to multiple consumers, send the event notification to an
Amazon SNS topic and subscribe the other endpoints to that topic.
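A sketch of a notification configuration that publishes object-created events to an SNS topic (the bucket name and topic ARN are placeholders; the topic's access policy must allow S3 to publish to it):

import boto3

s3 = boto3.client("s3")

s3.put_bucket_notification_configuration(
    Bucket="example-bucket",
    NotificationConfiguration={
        "TopicConfigurations": [
            {
                "TopicArn": "arn:aws:sns:us-east-1:111122223333:s3-upload-events",
                "Events": ["s3:ObjectCreated:*"],
            }
        ]
    },
)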

S3 always returns the latest version of the object


Amazon S3 delivers strong read-after-write consistency automatically, without changes
to performance or availability, without sacrificing regional isolation for applications,
and at no additional cost.
After a successful write of a new object or an overwrite of an existing object, any
subsequent read request immediately receives the latest version of the object. S3 also
provides strong consistency for list operations, so after a write, you can immediately
perform a listing of the objects in a bucket with any changes reflected.

S3 Bucket policies
Bucket policies in Amazon S3 can be used to add or deny permissions across some or
all of the objects within a single bucket. Policies can be attached to users, groups, or
Amazon S3 buckets, enabling centralized management of permissions. With bucket
policies, you can grant users within your AWS Account or other AWS Accounts access
to your Amazon S3 resources.
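A sketch of a bucket policy that grants another AWS account read access to the objects in a bucket (the account ID and bucket name are placeholders):

import json
import boto3

s3 = boto3.client("s3")

policy = {
    "Version": "2012-10-17",
    "Statement": [
        {
            "Effect": "Allow",
            "Principal": {"AWS": "arn:aws:iam::111122223333:root"},
            "Action": "s3:GetObject",
            "Resource": "arn:aws:s3:::example-bucket/*",
        }
    ],
}

s3.put_bucket_policy(Bucket="example-bucket", Policy=json.dumps(policy))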
Bucket settings for Block Public Access
The Block Public Access settings let you override ACLs and bucket policies at the bucket
or account level so that objects cannot be made public unintentionally.

S3 multipart upload
allows you to upload a single object as a set of parts. Each part is a contiguous portion
of the object's data. You can upload these object parts independently and in any order.
If transmission of any part fails, you can retransmit that part without affecting other
parts. After all parts of your object are uploaded, Amazon S3 assembles these parts
and creates the object. In general, when your object size reaches 100 MB, you should
consider using multipart uploads instead of uploading the object in a single operation.
Multipart upload provides improved throughput; therefore, it facilitates faster file
uploads.
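With boto3, the high-level transfer manager handles multipart uploads automatically once a file crosses a configurable threshold; a minimal sketch (the file and bucket names are placeholders):

import boto3
from boto3.s3.transfer import TransferConfig

s3 = boto3.client("s3")

# Use multipart upload for files over 100 MB, in 16 MB parts.
config = TransferConfig(
    multipart_threshold=100 * 1024 * 1024,
    multipart_chunksize=16 * 1024 * 1024,
)

s3.upload_file("backup.tar.gz", "example-bucket", "backups/backup.tar.gz", Config=config)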

Partition Key (DynamoDB)
portion of a table's primary key determines the logical partitions in which a table's data
is stored. This in turn affects the underlying physical partitions. Provisioned I/O
capacity for the table is divided evenly among these physical partitions. Therefore, a
partition key design that doesn't distribute I/O requests evenly can create "hot"
partitions that result in throttling and use your provisioned I/O capacity inefficiently.

The optimal usage of a table's provisioned throughput depends not only on the
workload patterns of individual items, but also on the partition-key design. This doesn't
mean that you must access all partition key values to achieve an efficient throughput
level, or even that the percentage of accessed partition key values must be high. It
does mean that the more distinct partition key values that your workload accesses, the
more those requests will be spread across the partitioned space. In general, you will
use your provisioned throughput more efficiently as the ratio of partition key values
accessed to the total number of partition key values increases.
One example for this is the use of partition keys with high-cardinality attributes, which
have a large number of distinct values for each item.
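A sketch of a table whose partition key (CustomerId) is a high-cardinality attribute, with OrderId as the sort key (the table and attribute names are placeholders):

import boto3

dynamodb = boto3.client("dynamodb")

dynamodb.create_table(
    TableName="Orders",
    AttributeDefinitions=[
        {"AttributeName": "CustomerId", "AttributeType": "S"},
        {"AttributeName": "OrderId", "AttributeType": "S"},
    ],
    KeySchema=[
        {"AttributeName": "CustomerId", "KeyType": "HASH"},   # partition key
        {"AttributeName": "OrderId", "KeyType": "RANGE"},     # sort key
    ],
    BillingMode="PAY_PER_REQUEST",
)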
VPC (Virtual Private Cloud)
Virtual Private Cloud (VPC): A virtual private cloud is a logically isolated virtual
network within the AWS cloud. It allows you to define your own IP address space,
create subnets, and configure network gateways, all within a virtual network that you
control.
Subnets: Subnets are logical partitions within a VPC that allow you to segregate
resources based on their security or operational requirements.
Route Tables: Route tables are used to control the flow of traffic within and outside
the VPC. They contain a set of rules that define how traffic is routed between subnets
and to the internet.
Internet Gateway: An internet gateway is a horizontally scaled, redundant, and highly
available VPC component that allows communication between instances in your VPC
and the internet.
Network Access Control Lists (NACLs): NACLs are used to control traffic at the subnet
level. They act as a firewall and can be used to allow or deny traffic based on IP
addresses, ports, and protocols.
Security Groups: Security groups act as a virtual firewall for your instances to control
inbound and outbound traffic. They are stateful, which means that they automatically
allow return traffic.
Elastic IP addresses (EIPs): EIPs are static IP addresses that can be assigned to your
instances, allowing them to be reachable from the internet even if their IP address
changes.
NAT Gateway: A NAT gateway is a highly available, managed network address
translation (NAT) service that allows instances in a private subnet to connect to the
internet or other AWS services, while remaining private.
VPN Connections: VPN connections provide secure connectivity between your on-
premises data center or office and your VPC.
Direct Connect: Direct Connect is a dedicated network connection between your on-
premises infrastructure and AWS. It provides a high-speed, low-latency, and reliable
connection, which can be used to access services in your VPC.
The Amazon VPC console wizard provides the following four configurations:
1. VPC with a single public subnet - The configuration for this scenario
includes a virtual private cloud (VPC) with a single public subnet, and an
internet gateway to enable communication over the internet. We recommend
this configuration if you need to run a single-tier, public-facing web application,
such as a blog or a simple website.
2. VPC with public and private subnets (NAT) - The configuration for this
scenario includes a virtual private cloud (VPC) with a public subnet and a
private subnet. We recommend this scenario if you want to run a public-facing
web application while maintaining back-end servers that aren't publicly
accessible. A common example is a multi-tier website, with the web servers in a
public subnet and the database servers in a private subnet. You can set up
security and routing so that the web servers can communicate with the
database servers.

3. VPC with public and private subnets and AWS Site-to-Site VPN access -
The configuration for this scenario includes a virtual private cloud (VPC) with a
public subnet and a private subnet, and a virtual private gateway to enable
communication with your network over an IPsec VPN tunnel. We recommend
this scenario if you want to extend your network into the cloud and also
directly access the Internet from your VPC. This scenario enables you to run a
multi-tiered application with a scalable web front end in a public subnet and to
house your data in a private subnet that is connected to your network by an
IPsec AWS Site-to-Site VPN connection.
4. VPC with a private subnet only and AWS Site-to-Site VPN access - The
configuration for this scenario includes a virtual private cloud (VPC) with a
single private subnet, and a virtual private gateway to enable communication
with your network over an IPsec VPN tunnel. There is no Internet gateway to
enable communication over the Internet. We recommend this scenario if you
want to extend your network into the cloud using Amazon's infrastructure
without exposing your network to the Internet.

VPC sharing
allows multiple AWS accounts to create their application resources such as EC2
instances, RDS databases, Redshift clusters, and Lambda functions, into shared and
centrally-managed Amazon Virtual Private Clouds (VPCs). To set this up, the account
that owns the VPC (owner) shares one or more subnets with other accounts
(participants) that belong to the same organization from AWS Organizations. After a
subnet is shared, the participants can view, create, modify, and delete their application
resources in the subnets shared with them. Participants cannot view, modify, or delete
resources that belong to other participants or the VPC owner.
You can share Amazon VPCs to leverage the implicit routing within a VPC for
applications that require a high degree of interconnectivity. This reduces the number
of VPCs that you create and manage while using separate accounts for billing and
access control.

bastion host in aws


a bastion host is a special-purpose instance that is designed to provide secure access to
resources in a private subnet of a VPC (Virtual Private Cloud) from the internet. The
bastion host acts as a gateway, providing users with secure access to resources in the
private subnet by forwarding traffic from the user's machine to the private subnet.
A bastion host is typically deployed in a public subnet of a VPC, which is accessible
from the internet. The bastion host is secured with strong authentication and
authorization mechanisms, such as SSH (Secure Shell) or RDP (Remote Desktop
Protocol) access. Users can connect to the bastion host using SSH or RDP and then use
it as a jump server to access resources in the private subnet, such as instances running
in an application tier.
Using a bastion host provides an additional layer of security by enforcing a "single
point of entry" into the private subnet. This helps to reduce the risk of unauthorized
access and data breaches. Additionally, a bastion host can be used to audit and
monitor access to resources in the private subnet.
In summary, a bastion host is a secure and controlled way for users to access resources
in a private subnet of a VPC from the internet.

NAT (Network Address Translation) Gateway


NAT (Network Address Translation) Gateway is a managed service that provides a
highly available and scalable way for instances in a private subnet to access the
internet, while also preventing inbound traffic from the internet to the instances in the
private subnet.
A NAT Gateway is deployed in a public subnet of a VPC (Virtual Private Cloud), and it
allows instances in private subnets to access external resources, such as databases,
software updates, and other services that require internet access. When an instance in
a private subnet needs to access the internet, the traffic is directed to the NAT
Gateway, which then forwards the traffic to the internet. The source IP address of the
traffic is replaced with the IP address of the NAT Gateway, allowing the response traffic
to return to the NAT Gateway, which then forwards the response traffic to the instance
in the private subnet.
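The piece that actually sends a private subnet's internet-bound traffic to the NAT Gateway is a default route in the private subnet's route table; a sketch with boto3 (the IDs are placeholders):

import boto3

ec2 = boto3.client("ec2")

# Route all non-local traffic from the private subnet through the NAT Gateway.
ec2.create_route(
    RouteTableId="rtb-0123456789abcdef0",   # route table of the private subnet
    DestinationCidrBlock="0.0.0.0/0",
    NatGatewayId="nat-0123456789abcdef0",
)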

NACLs (Network Access Control Lists)


are a type of security control that can be used to filter traffic at the subnet level. NACLs
are stateless, meaning that they don't keep track of the state of connections, unlike
security groups which are stateful.
NACLs are used to control inbound and outbound traffic to and from subnets in your
VPC (Virtual Private Cloud) network. You can use NACLs to create rules that allow or
deny traffic based on the source and destination IP addresses, ports, and protocols.
NACLs are a more coarse-grained control compared to security groups, as they are
applied at the subnet level and not at the instance level. This means that if you apply a
NACL rule to a subnet, it will apply to all instances in that subnet.
It's worth noting that NACLs are evaluated in a specific order, and the first matching
rule is applied. Therefore, it's important to ensure that the order of the rules is correct,
otherwise, traffic may not be allowed or denied as expected.
By default, every newly created subnet in AWS has a NACL associated with it that
allows all inbound and outbound traffic. This means that all traffic is allowed to flow in
and out of the subnet.
However, this default NACL configuration may not be appropriate for all use cases, and
you may need to configure custom NACL rules to enforce more restrictive traffic
filtering policies based on your specific requirements.

VPC Endpoint
is a virtual device that enables private connectivity between a VPC and AWS services
without using public IP addresses, NAT (Network Address Translation) devices, VPN
(Virtual Private Network) connections, or internet gateways.
VPC endpoints allow you to connect to AWS services, such as Amazon S3, Amazon
DynamoDB, and Amazon Kinesis, from your VPC without exposing your traffic to the
public internet. This improves security by keeping traffic within the AWS network and
reducing exposure to threats from the public internet.

There are two types of VPC endpoints:


Interface Endpoints: This type of endpoint provides a private IP address in your subnets
that you can use to access supported services over an elastic network interface (ENI).
Interface endpoints are powered by AWS PrivateLink and can also be reached from
on-premises networks over AWS Direct Connect or a VPN, without traversing an
internet gateway.
Gateway Endpoints: This type of endpoint provides a target for a specified route in
your VPC route table. Gateway endpoints support traffic to S3 and DynamoDB only.
VPC endpoints are created in the VPC console, and they are associated with a specific
VPC and a specific service. Once created, endpoints can be used by instances in the
associated VPC to access the specified service over a private connection.
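A sketch of creating a gateway endpoint for S3 and attaching it to a route table (the Region, VPC ID, and route table ID are placeholders):

import boto3

ec2 = boto3.client("ec2")

ec2.create_vpc_endpoint(
    VpcEndpointType="Gateway",
    VpcId="vpc-0123456789abcdef0",
    ServiceName="com.amazonaws.us-east-1.s3",
    RouteTableIds=["rtb-0123456789abcdef0"],
)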

Transit Gateway
is a service that enables you to connect multiple VPCs and on-premises networks
together using a single gateway. Site-to-Site VPN (Virtual Private Network) Enhanced
ECMP (Equal Cost Multipath) is a feature of Transit Gateway that allows you to
distribute VPN traffic across multiple VPN connections between Transit Gateway and
on-premises networks.
ECMP is a routing technique that allows traffic to be distributed across multiple equal-
cost paths, enabling efficient use of available bandwidth and providing redundancy in
case of path failure. Site-to-Site VPN Enhanced ECMP extends this capability to VPN
connections between Transit Gateway and on-premises networks, allowing traffic to
be distributed across multiple VPN connections.
By using Transit Gateway and Site-to-Site VPN Enhanced ECMP, you can create a
scalable and highly available solution for connecting multiple networks. You can
connect multiple VPCs and on-premises networks to Transit Gateway, and use Site-to-
Site VPN Enhanced ECMP to distribute traffic across multiple VPN connections,
providing redundancy and efficient use of available bandwidth.
To use Site-to-Site VPN Enhanced ECMP, you must have multiple VPN connections
between Transit Gateway and on-premises networks with the same routing prefix.
When traffic is sent to the routing prefix, Transit Gateway distributes the traffic across
the multiple VPN connections using ECMP, providing an efficient and redundant
solution for connecting multiple networks.

Here are some ways to migrate data from on-premises to AWS:
AWS Database Migration Service (DMS): AWS DMS is a fully managed service that
allows you to migrate databases to AWS easily and securely. It supports both
homogeneous and heterogeneous migrations and provides continuous data replication
with minimal downtime.
AWS Snowball: AWS Snowball is a petabyte-scale data transfer service that enables
you to move large amounts of data to and from AWS using secure devices. You can use
it to transfer data from your on-premises environment to AWS, or to migrate data
between AWS regions.
AWS Storage Gateway: AWS Storage Gateway is a hybrid storage service that enables
you to seamlessly integrate your on-premises storage infrastructure with AWS. It
supports file, volume, and tape gateways, and allows you to backup and archive your
data to AWS.
AWS Transfer Family: AWS Transfer Family is a fully managed service that allows you
to transfer files over the internet using a variety of protocols such as FTP, SFTP, and
FTPS. It enables you to migrate files from your on-premises environment to AWS easily
and securely.
Amazon S3 Transfer Acceleration: Amazon S3 Transfer Acceleration is a feature of
Amazon S3 that enables you to transfer data to and from Amazon S3 over the internet
at faster speeds than traditional methods. It uses Amazon CloudFront's globally
distributed edge locations to accelerate transfers.
AWS Direct Connect: AWS Direct Connect is a dedicated network connection between
your on-premises infrastructure and AWS. It provides a high-speed, low-latency, and
reliable connection, which can be used to transfer large amounts of data to and from
AWS.
Third-party tools: There are many third-party tools available that can help you migrate
data from on-premises to AWS. These tools offer various features and capabilities,
such as database migration, data replication, and file transfer.
AWS Application Migration Service (MGN): AWS Application Migration Service helps you
migrate your applications to AWS quickly and easily. It automates many of the manual
steps involved in application migration and provides a comprehensive view of your
application portfolio.
AWS Server Migration Service (SMS): AWS SMS is a service that simplifies the migration of on-premises
workloads to AWS by providing automated, live replication of your applications and
data. It enables you to migrate your applications to AWS without incurring downtime
or data loss.
AWS DataSync: AWS DataSync is a service that makes it easy to move large amounts of
data between on-premises storage and AWS storage services. It is a fully managed
service that automates many of the manual steps involved in data transfer, such as
scheduling, monitoring, and error handling.

Network Protection on AWS


To protect networks on AWS, we have seen:
● Network Access Control Lists (NACLs)

● Amazon VPC security groups

● AWS WAF (protect against malicious requests)

● AWS Shield & AWS Shield Advanced

● AWS Firewall Manager (to manage them across accounts)

But what if we want to protect our entire VPC in a more sophisticated way?

AWS Network Firewall


• Protect your entire Amazon VPC
• From Layer 3 to Layer 7 protection
• Any direction, you can inspect
• VPC to VPC traffic
• Outbound to internet
• Inbound from internet
• To / from Direct Connect & Site-to-Site VPN
• Internally, the AWS Network Firewall uses the AWS Gateway Load Balancer
• Rules can be centrally managed cross-account by AWS Firewall Manager to apply to many VPCs
AWS WAF (Web Application Firewall)
• Protects your web applications from common web exploits (Layer 7)
• Layer 7 is HTTP (vs Layer 4 is TCP/UDP)
• Deploy on
• Application Load Balancer
• API Gateway
• CloudFront
• AppSync GraphQL API
• Cognito User Pool
AWS WAF is a security service that protects web applications from common web
exploits and attacks, such as SQL injection, cross-site scripting (XSS), and more.
AWS WAF allows you to create custom rules that block or allow traffic to your web
applications based on characteristics such as IP addresses, HTTP headers, HTTP body
content, and more. You can also use AWS WAF to set rate limits, whitelist or blacklist
IP addresses, and protect your web applications from bad bots and other automated
attacks.
AWS WAF integrates with other AWS services such as Amazon CloudFront, Amazon API
Gateway, and AWS Application Load Balancer, allowing you to deploy and manage web
application security across your entire AWS infrastructure.
Some key features of AWS WAF include:
Customizable Rules: AWS WAF allows you to create your own rules to block or allow
traffic based on your specific needs.
Advanced Filtering: AWS WAF uses advanced filtering to block malicious traffic,
including SQL injection, cross-site scripting (XSS), and more.
Integrations: AWS WAF integrates with other AWS services like CloudFront and API
Gateway to provide comprehensive web application security.
Managed Rules: AWS WAF provides pre-configured managed rules that block known
threats, making it easy to get started with web application security.
Overall, AWS WAF is a powerful and flexible web application firewall that allows you to
protect your web applications from a variety of threats and attacks, helping to ensure
the security and reliability of your applications.

Amazon Cognito User Pools


A user pool is a user directory in Amazon Cognito. You can leverage Amazon Cognito
User Pools to either provide built-in user management or integrate with external
identity providers, such as Facebook, Twitter, Google+, and Amazon. Whether your
users sign-in directly or through a third party, all members of the user pool have a
directory profile that you can access through a Software Development Kit (SDK).
User pools provide:
1. Sign-up and sign-in services.
2. A built-in, customizable web UI to sign in users.
3. Social sign-in with Facebook, Google, Login with Amazon, and Sign in with Apple, as well as sign-in with SAML identity providers from your user pool.
4. User directory management and user profiles.
5. Security features such as multi-factor authentication (MFA), checks for compromised credentials, account takeover protection, and phone and email verification.
6. Customized workflows and user migration through AWS Lambda triggers.
After creating an Amazon Cognito user pool, in API Gateway, you must then create a
COGNITO_USER_POOLS authorizer that uses the user pool.

Amazon Cognito Identity Pools


The two main components of Amazon Cognito are user pools and identity pools.
Identity pools provide AWS credentials to grant your users access to other AWS
services. To enable users in your user pool to access AWS resources, you can configure
an identity pool to exchange user pool tokens for AWS credentials. So, identity pools
aren't an authentication mechanism in themselves.

AWS Certificate Manager


is a service that lets you easily provision, manage, and deploy public and private
Secure Sockets Layer/Transport Layer Security (SSL/TLS) certificates for use with AWS
services and your internal connected resources. SSL/TLS certificates are used to secure
network communications and establish the identity of websites over the Internet as
well as resources on private networks.
ACM distinguishes three validation levels for SSL/TLS certificates (public certificates
issued by ACM itself are domain validated; OV and EV certificates obtained from other
certificate authorities can be imported into ACM):
Domain validated (DV) certificates: These certificates validate only that the requester
controls the domain name associated with the certificate.
Organization validated (OV) certificates: These certificates validate that the requester
controls the domain name associated with the certificate, as well as some additional
information about the organization that the requester represents.
Extended validation (EV) certificates: These certificates offer the highest level of
assurance, as they require extensive validation of the requester's identity and the
organization they represent.
ACM can also automatically renew SSL/TLS certificates before they expire, reducing the
amount of manual work required to maintain security on your website or application.
Additionally, ACM can be integrated with other AWS services like Amazon CloudFront,
Amazon Elastic Load Balancer, and Amazon API Gateway, making it easy to deploy your
SSL/TLS certificates to your AWS resources.

DynamoDB Global Tables and KMS Multi-Region Keys


can be used together with client-side encryption to provide an additional layer of
security to your data.
Client-side encryption involves encrypting data before it is sent to DynamoDB or any
other service, and decrypting it after it is retrieved. With client-side encryption, the
encryption keys are managed by the application, rather than by the cloud service. This
provides an additional layer of security, as the application has complete control over
the keys and can ensure that the data is encrypted and decrypted securely.
When using client-side encryption with DynamoDB Global Tables, you can encrypt the
data in the application before it is sent to DynamoDB, and decrypt it after it is retrieved
from DynamoDB. This ensures that the data is protected while it is in transit and while
it is stored in DynamoDB.
To encrypt and decrypt data using client-side encryption with KMS Multi-Region Keys,
you can use the AWS Encryption SDK. The Encryption SDK provides a simple and secure
way to encrypt and decrypt data using KMS Multi-Region Keys. When encrypting data,
the Encryption SDK retrieves the appropriate encryption keys from KMS, encrypts the
data, and stores the encrypted data in DynamoDB. When decrypting data, the
Encryption SDK retrieves the appropriate decryption keys from KMS, decrypts the data,
and returns the decrypted data to the application.

Overall, using DynamoDB Global Tables and KMS Multi-Region Keys with client-side
encryption can provide a highly secure and resilient way to store and access data in the
cloud.
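A minimal sketch of client-side encryption with the AWS Encryption SDK for Python and a KMS multi-Region key, assuming the aws-encryption-sdk package is installed (the key ARN is a placeholder); the resulting ciphertext could then be written to a DynamoDB global table with an ordinary PutItem call:

import aws_encryption_sdk

client = aws_encryption_sdk.EncryptionSDKClient()

# A KMS multi-Region key ARN (placeholder); a replica of the same key can be
# used for decryption in another Region.
key_provider = aws_encryption_sdk.StrictAwsKmsMasterKeyProvider(
    key_ids=["arn:aws:kms:us-east-1:111122223333:key/mrk-1234abcd12ab34cd56ef1234567890ab"]
)

ciphertext, _header = client.encrypt(source=b"sensitive payload", key_provider=key_provider)
plaintext, _header = client.decrypt(source=ciphertext, key_provider=key_provider)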

AWS Systems Manager Parameter Store


is a managed service provided by AWS that enables you to store and manage your
application configuration and secrets in a central location. SSM Parameter Store
provides a secure and scalable solution for storing and retrieving configuration data
such as database connection strings, API keys, and passwords.
With SSM Parameter Store, you can store your configuration data as plain text,
encrypted text, or binary data. The service provides encryption options that allow you
to encrypt sensitive data at rest and in transit. Additionally, you can easily manage
access to the stored data using AWS Identity and Access Management (IAM) policies.
One of the key benefits of SSM Parameter Store is its integration with other AWS
services. For example, you can use SSM Parameter Store to store the database
connection string for your application, and then retrieve that string from your EC2
instances or Lambda functions using AWS Systems Manager Run Command or AWS
Lambda. You can also use SSM Parameter Store to store secrets for your containers
running on Amazon Elastic Container Service (ECS) or Kubernetes.
SSM Parameter Store also provides features such as versioning, parameter hierarchies,
and automatic parameter expiration. These features make it easier to manage your
application configuration and secrets at scale.
Overall, SSM Parameter Store is a powerful and flexible solution for managing your
application configuration and secrets in AWS, and can help simplify your application
development and deployment processes.
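A sketch of reading a SecureString parameter with boto3 (the parameter name is a placeholder):

import boto3

ssm = boto3.client("ssm")

resp = ssm.get_parameter(Name="/myapp/prod/db-connection-string", WithDecryption=True)
connection_string = resp["Parameter"]["Value"]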

AWS Config
AWS Config provides a detailed view of the configuration of AWS resources in your
AWS account. This includes how the resources are related to one another and how
they were configured in the past so that you can see how the configurations and
relationships change over time.
provides AWS-managed rules, which are predefined, customizable rules that AWS
Config uses to evaluate whether your AWS resources comply with common best
practices. You can leverage an AWS Config managed rule to check if any ACM
certificates in your account are marked for expiration within the specified number of
days. Certificates provided by ACM are automatically renewed. ACM does not
automatically renew the certificates that you import. The rule is NON_COMPLIANT if
your certificates are about to expire.

Cluster Placement Group


packs instances close together inside an Availability Zone. This strategy enables
workloads to achieve the low-latency network performance necessary for tightly-
coupled node-to-node communication that is typical of HPC applications.

Partition Placement Group


spreads your instances across logical partitions such that groups of instances in one
partition do not share the underlying hardware with groups of instances in different
partitions. This strategy is typically used by large distributed and replicated workloads,
such as Hadoop, Cassandra, and Kafka.
Spread Placement Group
strictly places a small group of instances across distinct underlying hardware to reduce
correlated failures.
There is no charge for creating a placement group.
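A sketch of creating a cluster placement group and launching instances into it (the AMI ID and instance type are placeholders):

import boto3

ec2 = boto3.client("ec2")

ec2.create_placement_group(GroupName="hpc-cluster", Strategy="cluster")

ec2.run_instances(
    ImageId="ami-0123456789abcdef0",
    InstanceType="c5n.18xlarge",
    MinCount=2,
    MaxCount=2,
    Placement={"GroupName": "hpc-cluster"},
)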

Amazon GuardDuty
is a threat detection service that continuously monitors for malicious activity and
unauthorized behavior to protect your AWS accounts, workloads, and data stored in
Amazon S3. With the cloud, the collection and aggregation of account and network
activities is simplified, but it can be time-consuming for security teams to continuously
analyze event log data for potential threats. With GuardDuty, you now have an
intelligent and cost-effective option for continuous threat detection in AWS. The
service uses machine learning, anomaly detection, and integrated threat intelligence to
identify and prioritize potential threats.
GuardDuty analyzes tens of billions of events across multiple AWS data sources, such
as AWS CloudTrail events, Amazon VPC Flow Logs, and DNS logs.
With a few clicks in the AWS Management Console, GuardDuty can be enabled with no
software or hardware to deploy or maintain. By integrating with Amazon EventBridge
Events, GuardDuty alerts are actionable, easy to aggregate across multiple accounts,
and straightforward to push into existing event management and workflow systems.

AWS DataSync
is an online data transfer service that simplifies, automates, and accelerates copying
large amounts of data between on-premises storage systems and AWS Storage
services, as well as between AWS Storage services.
You can use AWS DataSync to migrate data located on-premises, at the edge, or in
other clouds to Amazon S3, Amazon EFS, Amazon FSx for Windows File Server, Amazon
FSx for Lustre, Amazon FSx for OpenZFS, and Amazon FSx for NetApp ONTAP.

AWS AppSync
is a serverless GraphQL and Pub/Sub API service that simplifies building modern web
and mobile applications. It provides a robust, scalable GraphQL interface for
application developers to combine data from multiple sources, including Amazon
DynamoDB, AWS Lambda, and HTTP APIs.
GraphQL
is a data language to enable client apps to fetch, change and subscribe to data from
servers. In a GraphQL query, the client specifies how the data is to be structured when
it is returned by the server. This makes it possible for the client to query only for the
data it needs, in the format that it needs it in.
With AWS AppSync, you can use custom domain names to configure a single,
memorable domain that works for both your GraphQL and real-time APIs.
In other words, you can utilize simple and memorable endpoint URLs with domain
names of your choice by creating custom domain names that you associate with the
AWS AppSync APIs in your account.

IAM POLICY FOR S3 BUCKET


The main elements of a policy statement are:
Effect: Specifies whether the statement will Allow or Deny an action (Allow is the effect
defined here).
Action: Describes a specific action or actions that will either be allowed or denied to
run based on the Effect entered. API actions are unique to each service (DeleteObject
is the action defined here).
Resource: Specifies the resources—for example, an S3 bucket or objects—that the
policy applies to in Amazon Resource Name (ARN) format (example-bucket/* is the
resource defined here).
This policy provides the necessary delete permissions on the resources of the S3
bucket to the group.
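A sketch of the statement those elements describe, written as a Python dict so it can be attached to the group with boto3 (example-bucket and the group name are placeholders taken from the notes; the original example policy is not reproduced here):

import json
import boto3

iam = boto3.client("iam")

policy = {
    "Version": "2012-10-17",
    "Statement": [
        {
            "Effect": "Allow",
            "Action": "s3:DeleteObject",
            "Resource": "arn:aws:s3:::example-bucket/*",
        }
    ],
}

# Attach the policy inline to the IAM group mentioned in the notes.
iam.put_group_policy(
    GroupName="example-group",
    PolicyName="AllowDeleteObjectsInExampleBucket",
    PolicyDocument=json.dumps(policy),
)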

IAM database authentication


You can authenticate to your DB instance using AWS Identity and Access Management
(IAM) database authentication. IAM database authentication works with MySQL and
PostgreSQL. With this authentication method, you don't need to use a password when
you connect to a DB instance. Instead, you use an authentication token.

An authentication token is a unique string of characters that Amazon RDS generates on
request. Authentication tokens are generated using AWS Signature Version 4. Each
token has a lifetime of 15 minutes. You don't need to store user credentials in the
database, because authentication is managed externally using IAM. You can also still
use standard database authentication.
IAM database authentication provides the following benefits:
1. Network traffic to and from the database is encrypted using Secure Sockets
Layer (SSL).
2. You can use IAM to centrally manage access to your database resources,
instead of managing access individually on each DB instance.
3. For applications running on Amazon EC2, you can use profile credentials
specific to your EC2 instance to access your database instead of a password, for
greater security
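A sketch of generating an authentication token with boto3; the token is then used as the password when opening an SSL connection with your MySQL or PostgreSQL driver (the hostname and user are placeholders):

import boto3

rds = boto3.client("rds")

token = rds.generate_db_auth_token(
    DBHostname="mydb.abc123xyz789.us-east-1.rds.amazonaws.com",
    Port=3306,
    DBUsername="app_user",
)
# 'token' is valid for 15 minutes and replaces the database password.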

Amazon API Gateway


provides throttling at multiple levels including global and by a service call. Throttling
limits can be set for standard rates and bursts. For example, API owners can set a rate
limit of 1,000 requests per second for a specific method in their REST APIs, and also
configure Amazon API Gateway to handle a burst of 2,000 requests per second for a
few seconds.
Amazon API Gateway is a fully managed service provided by Amazon Web Services
(AWS) that makes it easy for developers to create, deploy, and manage application
programming interfaces (APIs) at any scale. With API Gateway, you can create RESTful
APIs that enable real-time two-way communication between client applications and
backend services running on AWS or on-premises.
API Gateway provides several features and benefits, including:
API creation: You can create RESTful APIs using a simple, user-friendly interface or by
importing an OpenAPI specification file.
Integration with backend services: You can integrate your APIs with AWS services like
AWS Lambda, Amazon DynamoDB, and Amazon Simple Storage Service (S3), as well as
with HTTP/HTTPS endpoints running on-premises or in other cloud environments.
Security: API Gateway provides multiple security options, including identity and access
management (IAM), authorization and authentication, and encryption of data in
transit.
Scalability and availability: API Gateway automatically scales to handle any amount of
traffic, and provides high availability and fault tolerance.
Monitoring and logging: API Gateway provides detailed monitoring and logging
capabilities, including real-time monitoring of API usage, error rates, and latency, as
well as the ability to configure and view logs of API requests and responses.
Overall, API Gateway is a powerful tool for building and managing APIs, and can help
simplify the development and deployment of scalable, secure, and highly available
applications.
API Gateway throttles requests to your API using the token bucket algorithm, where a token counts
for a request. Specifically, API Gateway sets a limit on a steady-state rate and a burst of
request submissions against all APIs in your account. In the token bucket algorithm,
the burst is the maximum bucket size.
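One way to apply the 1,000 requests-per-second / 2,000 burst example above is through a usage plan; a sketch with boto3 (the API ID and stage name are placeholders):

import boto3

apigw = boto3.client("apigateway")

apigw.create_usage_plan(
    name="standard-plan",
    throttle={"rateLimit": 1000.0, "burstLimit": 2000},
    apiStages=[{"apiId": "a1b2c3d4e5", "stage": "prod"}],
)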

CloudWatch has available Amazon EC2 Metrics for you to use for monitoring.
CPU Utilization identifies the processing power required to run an application upon a
selected instance. Network Utilization identifies the volume of incoming and outgoing
network traffic to a single instance. Disk Reads metric is used to determine the volume
of the data the application reads from the hard disk of the instance. This can be used
to determine the speed of the application. However, there are certain metrics that are
not readily available in CloudWatch such as memory utilization, disk space utilization,
and many others which can be collected by setting up a custom metric.
You need to prepare a custom metric using CloudWatch Monitoring Scripts, which are
written in Perl. You can also install the CloudWatch Agent to collect more system-level
metrics from Amazon EC2 instances. Here's the list of custom metrics that you can set
up (a sketch of publishing one follows the list):
- Memory utilization
- Disk swap utilization
- Disk space utilization
- Page file utilization
- Log collection
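A sketch of publishing one of these custom metrics (memory utilization) with boto3; the value and instance ID are placeholders, and in practice the CloudWatch Agent can publish this for you:

import boto3

cloudwatch = boto3.client("cloudwatch")

cloudwatch.put_metric_data(
    Namespace="CustomEC2",
    MetricData=[
        {
            "MetricName": "MemoryUtilization",
            "Dimensions": [{"Name": "InstanceId", "Value": "i-0123456789abcdef0"}],
            "Value": 62.5,
            "Unit": "Percent",
        }
    ],
)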
EC2 Metrics: CPU utilization, Disk read/write operations, Network traffic, Status check
EBS Metrics: VolumeReadOps, VolumeWriteOps, VolumeReadBytes,
VolumeWriteBytes, etc.
ELB Metrics: RequestCount, HTTPCode_Backend_2XX, HTTPCode_Backend_4XX,
HTTPCode_Backend_5XX, etc.
RDS Metrics: CPUUtilization, DatabaseConnections, FreeStorageSpace, ReadIOPS, etc.
Lambda Metrics: Invocations, Errors, Duration, Throttles, etc.
S3 Metrics: BucketSizeBytes, NumberOfObjects, AllRequests, GetRequests,
PutRequests, etc.
CloudFront Metrics: Requests, BytesDownloaded, BytesUploaded, etc.
Auto Scaling Metrics: GroupDesiredCapacity, GroupMinSize, GroupMaxSize, etc.
Route 53 Metrics: HealthCheckStatus, Latency, QueryVolume, etc.
Elastic Beanstalk Metrics: HTTPCode_ELB_4XX_Count, HTTPCode_ELB_5XX_Count,
RequestCount, etc.
Note that this is not an exhaustive list, as AWS CloudWatch provides a wide range of
metrics for various AWS services. Additionally, some of the metrics may not be
available for all services or may require additional configuration to be enabled.

List of some common CloudWatch Logs and their descriptions
API Gateway Access Logs: Logs that record detailed information about requests sent to
an API Gateway REST API, including the source IP address, request time, and response
status code.
CloudTrail Logs: Logs that record AWS API calls made by or on behalf of an AWS
account.
ELB Access Logs: Logs that record detailed information about requests sent to an
Elastic Load Balancer, including the source IP address, request time, and response
status code.
VPC Flow Logs: Logs that capture information about the IP traffic going to and from
network interfaces in an Amazon VPC.
Lambda Logs: Logs generated by AWS Lambda functions, including logs output to
stdout and stderr, as well as logs generated by the Lambda runtime environment.
RDS Logs: Logs that capture events and transactions that occur on an Amazon RDS
database instance, including error messages, slow queries, and connection attempts.
Route 53 Logs: Logs that capture information about DNS queries received by a Route
53 hosted zone.
CloudFront Logs: Logs that capture detailed information about requests and responses
served by an Amazon CloudFront distribution.
ECS Container Logs: Logs generated by containers running in an Amazon ECS cluster.
EKS Kubernetes Logs: Logs generated by containers running in an Amazon EKS cluster.
In general, CloudWatch Logs are used to capture, store, and analyze log data from
various AWS resources and custom applications. They can be used to troubleshoot
issues, monitor system and application performance, and generate insights and
reports.
The CloudWatch Logs Agent
is a lightweight tool that can be installed on an EC2 instance to collect log files from the
instance and send them to CloudWatch Logs. The agent can be configured to monitor
one or more log files and can be customized to parse log data and add metadata or
custom tags. The CloudWatch Logs Agent can also be used to monitor system-level
logs, such as those generated by the operating system or application servers.
For virtual servers (EC2 instances, on-premises servers…)
• CloudWatch Logs Agent
• Old version of the agent
• Can only send to CloudWatch Logs.

The CloudWatch Unified Agent


is a newer tool that can be used to collect logs from a wider variety of sources,
including EC2 instances, containers, and on-premises servers. The unified agent
provides a single tool for collecting and shipping log data to CloudWatch Logs,
simplifying the process of managing logs across multiple sources. The unified agent can
also be used to collect metrics from a variety of sources and send them to CloudWatch
Metrics.
Both the CloudWatch Logs Agent and the CloudWatch Unified Agent are designed to
be easy to install and configure, and provide a cost-effective way to centralize log data
and gain insights into system and application performance.

Amazon EventBridge (formerly called CloudWatch Events)
is a serverless event bus that makes it easy to connect applications together. It uses
data from your own applications, integrated software as a service (SaaS) application,
and AWS services. This simplifies the process of building event-driven architectures by
decoupling event producers from event consumers. This allows producers and
consumers to be scaled, updated, and deployed independently. Loose coupling
improves developer agility in addition to application resiliency.
You can use Amazon EventBridge to run Amazon ECS tasks when certain AWS events
occur. You can set up an EventBridge rule that runs an Amazon ECS task whenever a
file is uploaded to a certain Amazon S3 bucket using the Amazon S3 PUT operation.
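A sketch of the rule side of that pattern with boto3 (the bucket name is a placeholder; this assumes EventBridge notifications are enabled on the bucket, and the ECS task would then be attached as a target with put_targets and EcsParameters):

import json
import boto3

events = boto3.client("events")

events.put_rule(
    Name="s3-upload-runs-ecs-task",
    EventPattern=json.dumps(
        {
            "source": ["aws.s3"],
            "detail-type": ["Object Created"],
            "detail": {"bucket": {"name": ["example-bucket"]}},
        }
    ),
)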

file gateway
supports a file interface into Amazon Simple Storage Service (Amazon S3) and
combines a service and a virtual software appliance. By using this combination, you
can store and retrieve objects in Amazon S3 using industry-standard file protocols such
as Network File System (NFS) and Server Message Block (SMB). The software
appliance, or gateway, is deployed into your on-premises environment as a virtual
machine (VM) running on VMware ESXi, Microsoft Hyper-V, or Linux Kernel-based
Virtual Machine (KVM) hypervisor.
The gateway provides access to objects in S3 as files or file share mount points. With a
file gateway, you can do the following:
- You can store and retrieve files directly using the NFS version 3 or 4.1 protocol.
- You can store and retrieve files directly using the SMB protocol, versions 2 and 3.
- You can access your data directly in Amazon S3 from any AWS Cloud application or
service.
- You can manage your Amazon S3 data using lifecycle policies, cross-region
replication, and versioning. You can think of a file gateway as a file system mount on
S3.

Amazon FSx for Windows File Server


Amazon FSx for Windows File Server provides fully managed, highly reliable file storage
that is accessible over the industry-standard Server Message Block (SMB) protocol. It
is built on Windows Server, delivering a wide range of administrative features such as
user quotas, end-user file restores, and Microsoft Active Directory (AD) integration.
Amazon FSx supports the use of Microsoft’s Distributed File System (DFS) to organize
shares into a single folder structure up to hundreds of PB in size.

Amazon FSx for Lustre


Amazon FSx for Lustre makes it easy and cost-effective to launch and run the world’s
most popular high-performance file system. It is used for workloads such as machine
learning, high-performance computing (HPC), video processing, and financial modeling.
Amazon FSx enables you to use Lustre file systems for any workload where storage
speed matters. FSx for Lustre does not support Microsoft's Distributed File System (DFS).
AWS Storage Gateway
is a hybrid cloud storage service that gives you on-premises access to virtually
unlimited cloud storage. Customers use Storage Gateway to simplify storage
management and reduce costs for key hybrid cloud storage use cases. These include
moving backups to the cloud, using on-premises file shares backed by cloud storage,
and providing low-latency access to data in AWS for on-premises applications.
File Gateway supports the Amazon S3 Standard, Amazon S3 Standard-Infrequent Access, Amazon S3
One Zone-Infrequent Access and Amazon Glacier storage classes. When you create or
update a file share, you have the option to select a storage class for your objects. You
can either choose the Amazon S3 Standard or any of the infrequent access storage
classes such as S3 Standard IA or S3 One Zone IA. Objects stored in any of these
storage classes can be transitioned to Amazon Glacier using a Lifecycle Policy.
Although you can write objects directly from a file share to the S3-Standard-IA or S3-
One Zone-IA storage class, it is recommended that you use a Lifecycle Policy to
transition your objects rather than write directly from the file share, especially if you're
expecting to update or delete the object within 30 days of archiving it.
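A minimal sketch of such a lifecycle policy, applied with boto3 to the bucket backing the file share (bucket name and prefix are placeholders):

import boto3

s3 = boto3.client("s3")

# Transition objects under the share's prefix to Glacier after 30 days
# instead of writing them to an archive storage class directly.
s3.put_bucket_lifecycle_configuration(
    Bucket="my-file-gateway-bucket",                 # placeholder
    LifecycleConfiguration={
        "Rules": [{
            "ID": "archive-after-30-days",
            "Filter": {"Prefix": "share/"},          # placeholder prefix
            "Status": "Enabled",
            "Transitions": [{"Days": 30, "StorageClass": "GLACIER"}],
        }]
    },
)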

File Gateway:
File Gateway is a service that allows users to store and retrieve files from Amazon S3
using traditional file protocols like NFS or SMB. It presents a file interface to
applications and users, and the data is stored in Amazon S3 as objects.
S3 File Gateway is an Amazon Web Services (AWS) service that allows you to create a
file-based storage gateway to store and retrieve data in Amazon S3 (Simple Storage
Service) using the NFS (Network File System) and SMB (Server Message Block) file
protocols.

The S3 File Gateway


service enables you to store files as objects in S3, but still access them as files from
your on-premises applications or virtual machines. This allows you to take advantage
of the scalability and durability of S3 storage while still using your existing applications
or workflows that require file-based storage.
With S3 File Gateway, you can configure your gateway to use the SMB or NFS file
protocols to mount file shares on your on-premises servers, and these file shares are
backed by Amazon S3. The gateway uses caching and local disk storage to optimize
performance, and it automatically synchronizes data between your on-premises file
server and Amazon S3.
S3 File Gateway is useful for a variety of use cases, including backup and archive,
content distribution, and file sharing. It can also be used to replace traditional file
servers or to provide access to shared files across multiple locations or applications.

FSx File Gateway


is an Amazon Web Services (AWS) service that provides low-latency, on-premises access to
fully managed file shares in Amazon FSx for Windows File Server, using the industry-standard
Server Message Block (SMB) protocol. It enables customers to keep file-based workloads
on-premises while their data lives in fully managed Windows file systems in AWS.
The gateway is deployed on-premises as a virtual appliance and caches frequently accessed
data locally, providing low-latency access to active working sets while the authoritative
copy of the data remains in Amazon FSx for Windows File Server. This reduces the amount of
data that must be transferred over the network for repeated reads.
FSx File Gateway is designed to help customers simplify their storage infrastructure by
consolidating their file servers into a single file server accessible from on-premises and
cloud-based environments. It can be used for a wide variety of use cases, including
backup and disaster recovery, media and entertainment, life sciences, financial
services, and more.

Volume Gateway:
Volume Gateway provides a way for users to store data in Amazon S3, using their
existing applications and infrastructure. It presents block storage volumes to
applications over the iSCSI protocol, with the data backed by Amazon S3. Volume
Gateway supports two types of volumes:
Stored Volumes: It stores data locally and asynchronously backs up point-in-time
snapshots of the data to Amazon S3.
Cached Volumes: It stores the primary data in Amazon S3 while retaining a cache of
frequently accessed data on-premises.

Tape Gateway:
Tape Gateway provides a way to backup data to Amazon S3 using virtual tapes, similar
to how traditional tape backups are used. It presents an iSCSI interface to applications,
allowing them to use the virtual tapes like physical tapes. Data written to the virtual
tapes is stored in Amazon S3 as objects, and can be retrieved as needed.
In summary, File Gateway is used for file-level access to data in Amazon S3, Volume
Gateway provides block-level storage volumes for applications, and Tape Gateway is
used for backup and archive storage using virtual tapes.

AWS Resource Access Manager (RAM)


is a service that enables you to easily and securely share AWS resources with any AWS
account or within your AWS Organization. You can share AWS Transit Gateways,
Subnets, AWS License Manager configurations, and Amazon Route 53 Resolver rules
resources with RAM.
Many organizations use multiple accounts to create administrative or billing isolation,
and limit the impact of errors. RAM eliminates the need to create duplicate resources
in multiple accounts, reducing the operational overhead of managing those resources
in every single account you own. You can create resources centrally in a multi-account
environment, and use RAM to share those resources across accounts in three simple
steps: create a Resource Share, specify resources, and specify accounts. RAM is
available to you at no additional charge.

AWS CloudFront
Many companies that distribute content over the internet want to restrict access to
documents, business data, media streams, or content that is intended for selected
users, for example, users who have paid a fee. To securely serve this private content by
using CloudFront, you can do the following:
- Require that your users access your private content by using special CloudFront
signed URLs or signed cookies.
- Require that your users access your content by using CloudFront URLs, not URLs that
access content directly on the origin server (for example, Amazon S3 or a private HTTP
server). Requiring CloudFront URLs isn't necessary, but we recommend it to prevent
users from bypassing the restrictions that you specify in signed URLs or signed cookies.
CloudFront signed URLs and signed cookies provide the same basic functionality: they
allow you to control who can access your content.
Amazon CloudFront is a fast content delivery network (CDN) service that securely
delivers data, videos, applications, and APIs to customers globally with low latency,
high transfer speeds, all within a developer-friendly environment.
CloudFront points of presence (POPs) (edge locations) make sure that popular content
can be served quickly to your viewers. CloudFront also has regional edge caches that
bring more of your content closer to your viewers, even when the content is not
popular enough to stay at a POP, to help improve performance for that content.
Dynamic content, as determined at request time (cache-behavior configured to
forward all headers), does not flow through regional edge caches, but goes directly to
the origin.
If you want to serve private content through CloudFront and you're trying to decide
whether to use signed URLs or signed cookies, consider the following:

Use signed URLs for the following cases:

- You want to use an RTMP distribution. Signed cookies aren't supported for RTMP
distributions.
- You want to restrict access to individual files, for example, an installation download
for your application.
- Your users are using a client (for example, a custom HTTP client) that doesn't support
cookies.
Use signed cookies for the following cases:

- You want to provide access to multiple restricted files, for example, all of the files for
a video in HLS format or all of the files in the subscribers' area of a website.

- You don't want to change your current URLs.
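For the signed-URL case above, a minimal sketch using botocore's CloudFrontSigner is shown here; the distribution domain, object path, key file, and public key ID are placeholders, and it assumes the matching public key is registered with the distribution (for example, through a key group).

from datetime import datetime, timedelta

import rsa                                        # third-party "rsa" package
from botocore.signers import CloudFrontSigner

def rsa_signer(message: bytes) -> bytes:
    # Sign with the private key that matches the public key configured in CloudFront.
    with open("private_key.pem", "rb") as f:      # placeholder key file
        key = rsa.PrivateKey.load_pkcs1(f.read())
    return rsa.sign(message, key, "SHA-1")

signer = CloudFrontSigner("K2JCJMDEHXQW5F", rsa_signer)      # placeholder public key ID

signed_url = signer.generate_presigned_url(
    "https://d1234abcd.cloudfront.net/private/video.mp4",    # placeholder URL
    date_less_than=datetime.utcnow() + timedelta(hours=1),   # expires in one hour
)
print(signed_url)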

AWS Global Accelerator and Amazon CloudFront


are separate services that use the AWS global network and its edge locations around
the world. CloudFront improves performance for both cacheable content (such as
images and videos) and dynamic content (such as API acceleration and dynamic site
delivery). Global Accelerator improves performance for a wide range of applications
over TCP or UDP by proxying packets at the edge to applications running in one or
more AWS Regions.

Global Accelerator is a good fit for non-HTTP use cases, such as gaming (UDP), IoT
(MQTT), or Voice over IP, as well as for HTTP use cases that specifically require static IP
addresses or deterministic, fast regional failover. Both services integrate with AWS
Shield for DDoS protection.

CloudFront Functions & Lambda@Edge


CloudFront Functions and Lambda@Edge are both serverless compute services offered
by Amazon Web Services (AWS) that allow developers to add custom logic to their
content delivery network (CDN) using JavaScript code.
CloudFront Functions is a lightweight CloudFront feature that allows developers to write
short-lived functions in JavaScript that are executed at CloudFront edge locations.
These functions can be used to modify HTTP requests and responses in real-time,
enabling developers to add custom logic such as header manipulation, URL rewrites,
and content transformation.
Lambda@Edge, on the other hand, is a similar service that has been available for
several years. It allows developers to write Lambda functions in Node.js or Python that
can be executed at the edge of the CDN. These functions can be used for a
wide range of use cases, such as authentication, authorization, content
personalization, and security.
Both CloudFront Functions and Lambda@Edge provide developers with the ability to
add custom logic to their CDN, which can help improve performance, security, and
user experience. They also provide a highly scalable and cost-effective way to process
requests and responses at the edge of the network, closer to the end-user.
Overall, CloudFront Functions and Lambda@Edge are powerful tools for developers
looking to customize their CDN behavior using serverless compute services.
Use Cases
• Website Security and Privacy
• Dynamic Web Application at the Edge
• Search Engine Optimization (SEO)
• Intelligently Route Across Origins and Data Centers
• Bot Mitigation at the Edge
• Real-time Image Transformation
• A/B Testing
• User Authentication and Authorization
• User Prioritization
• User Tracking and Analytics
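As a sketch of deploying a CloudFront Function with boto3 (the function name and the tiny JavaScript handler are illustrative only; after creation the function still needs to be published and associated with a cache behavior):

import boto3

cloudfront = boto3.client("cloudfront")

# Illustrative viewer-request function that injects a header.
function_code = b"""
function handler(event) {
    var request = event.request;
    request.headers['x-demo-header'] = { value: 'true' };
    return request;
}
"""

resp = cloudfront.create_function(
    Name="add-demo-header",                                   # placeholder name
    FunctionConfig={"Comment": "Add a header at the edge", "Runtime": "cloudfront-js-1.0"},
    FunctionCode=function_code,
)
# Next steps (not shown): cloudfront.publish_function(...) and attaching the
# function to a distribution's cache behavior.
print(resp["FunctionSummary"]["Name"])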

Limits of AWS Lambda


AWS Lambda has a default quota of 1,000 concurrent executions per AWS account per Region.
If your Amazon SNS message deliveries to AWS Lambda contribute to crossing this
concurrency quota, your Amazon SNS message deliveries will be throttled. This is a soft
limit; you can request a quota increase (through Service Quotas or AWS Support) to raise it.
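One common mitigation is to reserve concurrency for critical functions so that a burst elsewhere cannot consume the whole account quota; a minimal boto3 sketch (function name and values are placeholders):

import boto3

lambda_client = boto3.client("lambda")

# Reserve part of the account's concurrency for one function.
lambda_client.put_function_concurrency(
    FunctionName="orders-processor",              # placeholder
    ReservedConcurrentExecutions=100,
)

# Inspect the account-level concurrency limits and current usage.
print(lambda_client.get_account_settings()["AccountLimit"])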

Features of AWS Lambda


When you create or update Lambda functions that use environment variables, AWS
Lambda encrypts them using the AWS Key Management Service. When your Lambda
function is invoked, those values are decrypted and made available to the Lambda
code.
The first time you create or update Lambda functions that use environment variables
in a region, a default service key is created for you automatically within AWS KMS. This
key is used to encrypt environment variables. However, if you wish to use encryption
helpers and use KMS to encrypt environment variables after your Lambda function is
created, you must create your own AWS KMS key and choose it instead of the default
key. The default key will give errors when chosen. Creating your own key gives you
more flexibility, including the ability to create, rotate, disable, and define access
controls, and to audit the encryption keys used to protect your data.
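A brief sketch of attaching a customer managed KMS key to a function's environment variables with boto3 (the function name, key ARN, and variable values are placeholders):

import boto3

lambda_client = boto3.client("lambda")

lambda_client.update_function_configuration(
    FunctionName="orders-processor",                                              # placeholder
    KMSKeyArn="arn:aws:kms:us-east-1:123456789012:key/11111111-2222-3333-4444-555555555555",
    Environment={"Variables": {"DB_HOST": "mydb.internal.example", "STAGE": "prod"}},
)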

AWS Managed Microsoft AD


AWS Directory Service for Microsoft Active Directory, also known as AWS Managed
Microsoft AD, enables your directory-aware workloads and AWS resources to use
managed Active Directory in the AWS Cloud. AWS Managed Microsoft AD is built on
the actual Microsoft Active Directory and does not require you to synchronize or
replicate data from your existing Active Directory to the cloud. AWS Managed
Microsoft AD does not support Microsoft’s Distributed File System (DFS)

Amazon FSx For Lustre


is a high-performance file system for fast processing of workloads. Lustre is a popular
open-source parallel file system which stores data across multiple network file servers
to maximize performance and reduce bottlenecks.

Amazon FSx for Windows File Server


is a fully managed Microsoft Windows file system with full support for the SMB
protocol, Windows NTFS, Microsoft Active Directory (AD) Integration.

Amazon Elastic File System


is a fully-managed file storage service that makes it easy to set up and scale file storage
in the Amazon Cloud.

RDS Storage Auto Scaling


automatically scales storage capacity in response to growing database workloads, with
zero downtime.
Under-provisioning could result in application downtime, and over-provisioning could
result in underutilized resources and higher costs. With RDS Storage Auto Scaling, you
simply set your desired maximum storage limit, and Auto Scaling takes care of the rest.
RDS Storage Auto Scaling continuously monitors actual storage consumption, and
scales capacity up automatically when actual utilization approaches provisioned
storage capacity. Auto Scaling works with new and existing database instances. You
can enable Auto Scaling with just a few clicks in the AWS Management Console. There
is no additional cost for RDS Storage Auto Scaling. You pay only for the RDS resources
needed to run your applications.
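Setting the maximum storage limit is a single API call; a sketch with boto3 (identifier and ceiling are placeholders):

import boto3

rds = boto3.client("rds")

# Enabling Storage Auto Scaling amounts to setting a storage ceiling (in GiB).
rds.modify_db_instance(
    DBInstanceIdentifier="my-postgres-db",   # placeholder
    MaxAllocatedStorage=1000,                # RDS grows storage automatically up to this limit
    ApplyImmediately=True,
)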
Expedited retrievals
allow you to quickly access your data when occasional urgent requests for a subset of
archives are required. For all but the largest archives (250 MB+), data accessed using
Expedited retrievals is typically made available within 1–5 minutes. Provisioned
Capacity ensures that retrieval capacity for Expedited retrievals is available when you
need it.
To make an Expedited, Standard, or Bulk retrieval, set the Tier parameter in the Initiate
Job (POST jobs) REST API request to the option you want, or the equivalent in the AWS
CLI or AWS SDKs. If you have purchased provisioned capacity, then all expedited
retrievals are automatically served through your provisioned capacity.
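A sketch of initiating an Expedited archive retrieval with boto3 (vault name and archive ID are placeholders):

import boto3

glacier = boto3.client("glacier")

job = glacier.initiate_job(
    accountId="-",                      # "-" means the credentials' own account
    vaultName="my-vault",               # placeholder
    jobParameters={
        "Type": "archive-retrieval",
        "ArchiveId": "EXAMPLE-ARCHIVE-ID",   # placeholder
        "Tier": "Expedited",
    },
)
print(job["jobId"])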

Provisioned capacity
ensures that your retrieval capacity for expedited retrievals is available when you need
it. Each unit of capacity ensures that at least three expedited retrievals can be
performed every five minutes and provides up to 150 MB/s of retrieval throughput.
You should purchase provisioned retrieval capacity if your workload requires highly
reliable and predictable access to a subset of your data in minutes. Without
provisioned capacity Expedited retrievals are accepted, except for rare situations of
unusually high demand. However, if you require access to Expedited retrievals under
all circumstances, you must purchase provisioned retrieval capacity.

Amazon Kinesis
provides several features that enable users to collect, process, and analyze real-time
streaming data.
Here are some key features of Amazon Kinesis:
Data Streams: Users can create and manage data streams to collect and store large
amounts of real-time data.
Automatic Scaling: Kinesis can automatically scale the number of stream shards (data
partitions) based on the volume of incoming data, allowing users to process data at
any scale.
Real-Time Processing: Kinesis enables users to process data in real-time using custom
applications or AWS services such as AWS Lambda or Amazon Kinesis Analytics.
Data Retention: Kinesis allows users to store data for up to 365 days, enabling them to
analyze historical data and detect patterns over time.
Data Encryption: Kinesis provides encryption at rest and in transit to ensure that data
is secure.
Multiple Language Support: Kinesis supports multiple programming languages and
frameworks, including Java, Python, and Node.js.
Integration with AWS Services: Kinesis can integrate with other AWS services such as
Amazon S3, Amazon Redshift, and Amazon Elasticsearch for advanced data processing
and analytics.
Data Analytics: Kinesis provides tools for real-time data analytics, including Amazon
Kinesis Analytics, which allows users to analyze and query data streams in real-time
using SQL.
Overall, Amazon Kinesis is a powerful service that provides a scalable, secure, and
easy-to-use platform for collecting, processing, and analyzing real-time streaming data.
Amazon Kinesis is a real-time data streaming and processing service provided by
Amazon Web Services (AWS). It allows users to collect, process, and analyze large
amounts of streaming data from various sources such as website clickstreams, IoT
devices, social media, and log data.

With Amazon Kinesis, users can ingest streaming data from multiple sources into a
data stream, which can then be processed in real-time using custom applications or
AWS services such as AWS Lambda, Amazon Kinesis Analytics, or Amazon Kinesis Data
Firehose. This enables users to gain insights from the data in real-time and take
immediate actions based on the insights.
Amazon Kinesis also provides features such as automatic scaling, data retention, and
data encryption to ensure reliable and secure data processing. It supports multiple
programming languages and frameworks, such as Java, Python, and Node.js, and can
integrate with various AWS services such as Amazon S3, Amazon Redshift, and Amazon
Elasticsearch.
Overall, Amazon Kinesis is a powerful service that allows users to build real-time
streaming data applications and gain insights from large amounts of data in real-time.
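A producer is typically just a put-record call; a minimal boto3 sketch (stream name and event fields are placeholders):

import json
import boto3

kinesis = boto3.client("kinesis")

event = {"user_id": "u-123", "page": "/checkout", "ts": "2024-01-01T12:00:00Z"}

kinesis.put_record(
    StreamName="clickstream",                        # placeholder
    Data=json.dumps(event).encode("utf-8"),
    PartitionKey=event["user_id"],   # same key -> same shard, preserving per-key order
)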

KINESIS USE CASE


Amazon Kinesis is a versatile service that can be used in a variety of use cases where
there is a need to process and analyze large volumes of streaming data in real-time.
Here are a few examples of how Kinesis can be used in AWS:
Real-time data processing: Kinesis Data Streams can be used to capture and process
data in real-time from various sources such as IoT devices, social media feeds, web
clickstreams, and more. This data can then be processed using Kinesis Data Analytics or
custom applications to derive insights and make informed business decisions.
Log and event data processing: Kinesis Data Firehose can be used to collect and
process log and event data in real-time from various sources such as AWS CloudTrail,
VPC Flow Logs, and application logs. This data can then be loaded into data stores such
as Amazon S3 or Amazon Elasticsearch for further analysis.
Real-time analytics and monitoring: Kinesis Data Analytics can be used to analyze real-
time data streams and derive insights in real-time. This can be used for real-time
monitoring of applications, systems, and IoT devices.
Fraud detection and anomaly detection: Kinesis Data Streams and Kinesis Data
Analytics can be used to detect fraudulent transactions or anomalies in real-time data
streams such as financial transactions, website clickstreams, and social media feeds.
Real-time data-driven applications: Kinesis can be used to build real-time data-driven
applications that can respond to changes in streaming data. For example, real-time
applications that monitor the status of IoT devices, update dashboards in real-time, or
trigger alerts based on specific events.
Overall, Kinesis is a powerful service that can be used to process and analyze streaming
data in real-time, enabling businesses to gain insights and take informed actions based
on the data.

Amazon Kinesis Data Streams


is a managed service provided by Amazon Web Services (AWS) that allows you to
collect, process, and analyze real-time streaming data at scale. It enables you to ingest
and process large amounts of data in real time from various sources such as website
clickstreams, IoT telemetry data, financial transactions, social media feeds, and more.
Kinesis Data Streams is designed to handle high-volume, high-velocity data streams
and allows you to build custom applications that can process and analyze this data in
real time. With Kinesis Data Streams, you can set up and configure data streams,
collect and process streaming data, and store the data in durable data stores for
further analysis.
Using Kinesis Data Streams, you can easily build applications that can respond to
changes in real-time data, allowing you to quickly identify and respond to events,
trends, and anomalies. The service provides features such as data encryption, real-time
data processing, automatic scaling, and data retention capabilities that make it easier
to manage and analyze streaming data.
Amazon Kinesis Data Analytics is a fully-managed service provided by Amazon Web
Services (AWS) that allows you to analyze real-time streaming data using SQL or
Apache Flink without requiring you to manage any infrastructure. It enables you to
easily query and process streaming data from different sources such as Kinesis Data
Streams, Kinesis Data Firehose, and AWS IoT Core.
Kinesis Data Firehose
acts as a data delivery system between your streaming data sources and your data
stores or analytics tools. You can use Kinesis Data Firehose to transform the incoming
data in real time and deliver it to various destinations such as Amazon S3 for data
archival, Amazon Redshift for data warehousing, Amazon Elasticsearch for real-time
search and analytics, and Splunk for log analysis.
With Kinesis Data Firehose, you can process, buffer, and aggregate your data in real-
time, and deliver it to your destination services without having to worry about scaling
or managing infrastructure. The service automatically scales to match the data
ingestion rate, ensuring that the data delivery pipeline is highly available and fault-
tolerant.
Kinesis Data Firehose provides features such as data compression, data
transformation, data buffering, and data encryption that make it easy to manage and
process streaming data. Additionally, you can easily configure Kinesis Data Firehose to
integrate with your existing AWS services or third-party applications.
Amazon Kinesis Data Firehose is another managed service provided by Amazon Web
Services (AWS) that allows you to capture, transform, and load streaming data into
data stores or analytics tools. It enables you to simplify the data pipeline and stream
data into different destinations such as Amazon S3, Amazon Redshift, Amazon
Elasticsearch Service, and Splunk without writing any custom code.
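Sending data into a delivery stream is similarly a single call; a boto3 sketch assuming a pre-existing delivery stream (name and record contents are placeholders):

import json
import boto3

firehose = boto3.client("firehose")

record = {"level": "INFO", "message": "user signed in", "ts": "2024-01-01T12:00:00Z"}

firehose.put_record(
    DeliveryStreamName="app-logs-to-s3",                          # placeholder
    Record={"Data": (json.dumps(record) + "\n").encode("utf-8")},
)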

Kinesis Data Analytics


provides a real-time processing engine that enables you to run continuous queries
against streaming data sources and derive insights from that data. You can use
standard SQL or Apache Flink to query and transform the data, and Kinesis Data
Analytics takes care of scaling and managing the underlying infrastructure required to
run the queries.
With Kinesis Data Analytics, you can gain insights from real-time data and make timely
business decisions based on that information. The service provides features such as
real-time analytics, fault tolerance, data resiliency, and automatic scaling that make it
easy to manage and analyze streaming data.
Kinesis Data Analytics also integrates with other AWS services such as Amazon S3,
Amazon Redshift, and Amazon Elasticsearch to enable you to store and visualize the
results of your analytics queries. Additionally, you can use Kinesis Data Analytics to
build custom real-time applications that can respond to changes in streaming data.
Amazon EMR
is the industry-leading cloud big data platform for processing vast amounts of data
using open-source tools such as Apache Spark, Apache Hive, Apache HBase, Apache
Flink, Apache Hudi, and Presto. Amazon EMR uses Hadoop, an open-source
framework, to distribute your data and processing across a resizable cluster of Amazon
EC2 instances.

DATABASE
Here are the unique features and benefits of each of the databases available on AWS:

Amazon RDS:
Easy setup and management of popular relational databases like MySQL, PostgreSQL,
Oracle, SQL Server, MariaDB, and Amazon Aurora.
Automated backups and software patching.
Multi-AZ deployments for high availability and fault tolerance.
Scalability and flexibility to adjust instance sizes and storage on the fly.
Amazon Aurora:
High performance, availability, and scalability with up to 5 times better performance
than standard MySQL and up to 3 times better performance than standard PostgreSQL.
Compatibility with MySQL and PostgreSQL, allowing for easy migration and integration
with existing applications.
Built-in fault-tolerance and self-healing capabilities with Aurora Multi-Master and
Global Database.
Amazon DynamoDB:
Fully managed NoSQL database service with automatic scaling and high availability.
Predictable, single-digit millisecond latency for reads and writes.
Flexible data model with support for key-value and document data structures.
Encryption at rest and in transit, and integration with AWS Identity and Access
Management (IAM) for security and access control.
Amazon DocumentDB:
Compatibility with MongoDB, allowing for easy migration and integration with existing
MongoDB applications.
Fully managed service with automatic scaling and high availability.
Consistent, single-digit millisecond latency for reads and writes.
Encryption at rest and in transit, and integration with AWS IAM for security and access
control.
Amazon Neptune:
Fully managed graph database service that provides highly connected data storage and
querying.
Optimized for storing and querying highly connected data sets such as social
networking, recommendation engines, and fraud detection.
Compatibility with Apache TinkerPop and SPARQL, allowing for flexible querying and
integration with existing graph applications.
Amazon ElastiCache:
In-memory caching service that provides high-performance caching for popular open-
source in-memory data stores like Redis and Memcached.
Fully managed service with automatic scaling and high availability.
Enables faster response times and reduces database load by caching frequently
accessed data in-memory.
Amazon Keyspaces:
Fully managed, scalable, and highly available Apache Cassandra-compatible database
service.
Flexible and scalable data model to store and retrieve data from multiple data centers
and regions.
Encryption at rest and in transit, and integration with AWS IAM for security and access
control.
Amazon Timestream:
Fully managed time series database service that makes it easy to store and analyze
trillions of time series data points.
Serverless and automatically scales up and down to meet the needs of your
application.
Supports fast and flexible querying of time series data for real-time analytics.
Amazon Managed Apache Cassandra Service:
Fully managed, scalable, and highly available Apache Cassandra-compatible database
service.
Provides the power of Apache Cassandra with the ease of a managed service,
removing the need for self-managed infrastructure.
Encryption at rest and in transit, and integration with AWS IAM for security and access
control.
Amazon Quantum Ledger Database (QLDB):
Fully managed ledger database that provides a transparent, immutable, and
cryptographically verifiable transaction log.
Designed for systems that require an immutable and tamper-proof ledger, such as
supply chain management and financial services.
Fully managed service with automatic scaling and high availability.
Amazon Redshift:
Fully managed, petabyte-scale data warehouse service that provides fast query
performance using a columnar storage format.
Enables fast analysis and business intelligence using SQL-based queries and tools.
Supports integrations with other AWS services like AWS Glue and Amazon EMR.

Database Types
• RDBMS (= SQL / OLTP): RDS, Aurora – great for joins
• NoSQL database – no joins, no SQL: DynamoDB (~JSON), ElastiCache (key /
value pairs), Neptune (graphs), DocumentDB (for MongoDB), Keyspaces (for
Apache Cassandra)
• Object Store: S3 (for big objects) / Glacier (for backups / archives)
• Data Warehouse (= SQL Analytics / BI): Redshift (OLAP), Athena, EMR
• Search: OpenSearch (JSON) – free text, unstructured searches
• Graphs: Amazon Neptune – displays relationships between data
• Ledger: Amazon Quantum Ledger Database
• Time series: Amazon Timestream
• Note: some databases are being discussed in the Data & Analytics section

Amazon RDS – Summary


• Managed PostgreSQL / MySQL / Oracle / SQL Server / MariaDB / Custom
• Provisioned RDS Instance Size and EBS Volume Type & Size
• Auto-scaling capability for Storage
• Support for Read Replicas and Multi AZ
• Security through IAM, Security Groups, KMS, SSL in transit
• Automated Backup with Point in time restore feature (up to 35 days)
• Manual DB Snapshot for longer-term recovery
• Managed and Scheduled maintenance (with downtime)
• Support for IAM Authentication, integration with Secrets Manager
• RDS Custom for access to and customize the underlying instance (Oracle & SQL
Server)
• Use case: Store relational datasets (RDBMS / OLTP), perform SQL queries,
transactions

Amazon Relational Database Service (RDS)


is a fully-managed service provided by Amazon Web Services (AWS) that enables you
to set up, operate, and scale a relational database in the cloud easily. It supports six
different database engines: Amazon Aurora, MySQL, MariaDB, PostgreSQL, Oracle, and
Microsoft SQL Server.
Here are some key features of RDS:
Multi-Engine Support: Amazon RDS supports popular relational database engines
including MySQL, PostgreSQL, MariaDB, Oracle, and Microsoft SQL Server, making it
easy to migrate your existing applications to the cloud.
Automated Backups: Amazon RDS automatically backs up your database and
transaction logs, and allows you to restore your data to any point in time within the
retention period.
High Availability and Replication: Amazon RDS provides options for high availability and
read scaling through its multi-AZ deployment and Read Replica features respectively.
Scalability: Amazon RDS allows you to scale your database instance up or down
depending on your needs, without any downtime.
Security: Amazon RDS provides several security features such as network isolation,
encryption at rest and in transit, and support for VPC (Virtual Private Cloud).
Monitoring and Metrics: Amazon RDS provides monitoring and metrics through
Amazon CloudWatch, which allows you to monitor the performance of your database
instances, and set alarms on various metrics such as CPU utilization, free storage
space, and more.
Patching and Upgrades: Amazon RDS automatically patches and upgrades your
database instance to the latest version of the database engine, ensuring that your
database is always up-to-date with the latest security and performance improvements.
Cost-effective: Amazon RDS offers a pay-as-you-go model, allowing you to only pay for
what you use. Additionally, you can leverage Reserved Instances to reduce your costs
further.
Fully managed: AWS takes care of the provisioning, patching, backup, and
maintenance of the database infrastructure, so you can focus on your application
development and data management tasks.
Scalable: You can easily scale up or down your database instance based on your
application demands without any downtime. You can also add read replicas to improve
read performance and scale-out your database workload.
High availability: RDS provides automatic failover capability with Multi-AZ deployment,
which automatically replicates your data across multiple Availability Zones to ensure
that your database is highly available and durable.
Secure: RDS provides built-in security features such as network isolation, encryption at
rest and in transit, database activity monitoring, and IAM-based access control to
protect your data.
Monitoring and Metrics: RDS provides various monitoring tools to help you track the
performance and health of your database, including CloudWatch metrics, enhanced
monitoring, and automated notifications.
Automated backups and point-in-time recovery: RDS automatically takes backups of
your database at a defined interval and allows you to recover your database to any
point in time within a retention period.
Overall, RDS is a popular choice for businesses that want to run their relational
databases in the cloud and want a fully-managed database service that is scalable,
highly available, and secure.
Amazon RDS Multi-AZ deployments provide enhanced availability and durability for
RDS database (DB) instances, making them a natural fit for production database
workloads. When you provision a Multi-AZ DB Instance, Amazon RDS automatically
creates a primary DB Instance and synchronously replicates the data to a standby
instance in a different Availability Zone (AZ). Multi-AZ spans at least two Availability
Zones within a single region.
Amazon RDS Read Replicas provide enhanced performance and durability for RDS
database (DB) instances. They make it easy to elastically scale out beyond the capacity
constraints of a single DB instance for read-heavy database workloads. For the MySQL,
MariaDB, PostgreSQL, Oracle, and SQL Server database engines, Amazon RDS creates a
second DB instance using a snapshot of the source DB instance. It then uses the
engines' native asynchronous replication to update the read replica whenever there is
a change to the source DB instance.
Amazon RDS replicates all databases in the source DB instance. Read replicas can be
within an Availability Zone, Cross-AZ, or Cross-Region.
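A sketch of both options with boto3, using placeholder identifiers: enabling Multi-AZ on an existing instance and creating a read replica.

import boto3

rds = boto3.client("rds")

# Convert an existing instance to a Multi-AZ deployment (synchronous standby).
rds.modify_db_instance(
    DBInstanceIdentifier="my-postgres-db",     # placeholder
    MultiAZ=True,
    ApplyImmediately=True,
)

# Create an asynchronous read replica for read scaling.
rds.create_db_instance_read_replica(
    DBInstanceIdentifier="my-postgres-db-replica",
    SourceDBInstanceIdentifier="my-postgres-db",
    AvailabilityZone="us-east-1b",             # placeholder AZ
)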

use cases for Amazon RDS:


E-commerce Websites: Amazon RDS is a popular choice for e-commerce websites that
need to manage large amounts of data and handle high levels of traffic. By using
Amazon RDS, these websites can scale their database instances up or down depending
on their needs, and take advantage of automatic backups and high availability features
to ensure that their data is always available and secure.
SaaS Applications: Software-as-a-Service (SaaS) applications often require a scalable
and reliable database infrastructure to handle multiple tenants and their data. Amazon
RDS provides a fully managed database service that allows SaaS providers to focus on
their core business, without worrying about the underlying infrastructure.
Mobile Apps: Mobile apps often require a backend database to store user data and
application state. By using Amazon RDS, app developers can easily provision and
manage a scalable and secure database infrastructure that can handle the demands of
their mobile application.
Healthcare Applications: Healthcare applications often require high levels of security
and compliance with regulatory standards such as HIPAA. Amazon RDS provides
several security features such as network isolation, encryption at rest and in transit,
and support for VPC (Virtual Private Cloud), making it a popular choice for healthcare
applications.
Gaming Applications: Gaming applications often require a high-performance database
infrastructure to store player data, handle in-game transactions, and support real-time
multiplayer interactions. Amazon RDS provides a scalable and reliable database service
that can handle the demands of gaming applications, while also providing automatic
backups and high availability features to ensure that player data is always available
and secure.

Amazon Aurora
is a cloud-based relational database service developed and managed by Amazon Web
Services (AWS). It is designed to provide a highly available, scalable, and performant
database solution that is compatible with MySQL and PostgreSQL.
Some key features of Amazon Aurora include:
High availability: Aurora is designed to automatically detect and recover from failures,
providing high availability for your databases.
Scalability: Aurora can scale up or down based on your needs, automatically adding or
removing resources as necessary.
Performance: Aurora is optimized for performance, with low latency and high
throughput.
Compatibility: Aurora is compatible with MySQL and PostgreSQL, allowing you to use
familiar tools and drivers.
Security: Aurora supports encryption at rest and in transit, and provides fine-grained
access control through AWS Identity and Access Management (IAM).
Overall, Amazon Aurora is a robust and flexible database service that can meet the
needs of a wide range of applications and workloads.
High Performance: Amazon Aurora is designed to provide high performance and low
latency. It is optimized for running on AWS infrastructure, and is built using a
distributed, fault-tolerant architecture that provides high availability and automatic
failover.
MySQL and PostgreSQL Compatibility: Amazon Aurora is compatible with both MySQL
and PostgreSQL, allowing you to use your existing skills and tools.
Scalability: Amazon Aurora allows you to scale your database instance up or down
depending on your needs, without any downtime. Additionally, it supports up to 15
read replicas for read scaling.
Automated Backups: Amazon Aurora automatically backs up your database and
transaction logs, and allows you to restore your data to any point in time within the
retention period.
Multi-AZ Deployment: Amazon Aurora provides Multi-AZ deployment for high
availability, which automatically replicates your data to a standby instance in a
different Availability Zone (AZ) to ensure that your database is highly available in the
event of a failure.
Security: Amazon Aurora provides several security features such as network isolation,
encryption at rest and in transit, and support for VPC (Virtual Private Cloud).
Global Database: Amazon Aurora Global Database allows you to create a single
database that spans multiple AWS regions, providing low-latency access to your data
from anywhere in the world.
Performance Insights: Amazon Aurora Performance Insights provides a dashboard that
allows you to monitor the performance of your database instance, and identify
performance issues quickly.
Cost-effective: Amazon Aurora offers a pay-as-you-go model, allowing you to only pay
for what you use. Additionally, you can leverage Reserved Instances to reduce your
costs further.
Compatible API for PostgreSQL / MySQL, separation of storage and compute
• Storage: data is stored in 6 replicas, across 3 AZ – highly available, self-healing, auto-
scaling
• Compute: Cluster of DB Instance across multiple AZ, auto-scaling of Read Replicas
• Cluster: Custom endpoints for writer and reader DB instances
• Same security / monitoring / maintenance features as RDS
• Know the backup & restore options for Aurora
• Aurora Serverless – for unpredictable / intermittent workloads, no capacity planning
• Aurora Multi-Master – for continuous writes failover (high write availability)
• Aurora Global: up to 16 DB Read Instances in each region, < 1 second storage
replication
• Aurora Machine Learning: perform ML using SageMaker & Comprehend on Aurora
• Aurora Database Cloning: new cluster from existing one, faster than restoring a
snapshot
• Use case: same as RDS, but with less maintenance / more flexibility / more
performance / more features
Amazon Aurora features a distributed, fault-tolerant, self-healing storage system that
auto-scales up to 128TB per database instance. It delivers high performance and
availability with up to 15 low-latency read replicas, point-in-time recovery, continuous
backup to Amazon S3, and replication across three Availability Zones (AZs).
For Amazon Aurora, each Read Replica is associated with a priority tier (0-15). In the
event of a failover, Amazon Aurora will promote the Read Replica that has the highest
priority (the lowest numbered tier). If two or more Aurora Replicas share the same
priority, then Amazon RDS promotes the replica that is largest in size. If two or more
Aurora Replicas share the same priority and size, then Amazon Aurora promotes an
arbitrary replica in the same promotion tier.

Use case Amazon Aurora


E-commerce Applications: E-commerce applications need a database that can handle
high traffic, provide fast response times, and support complex queries. Aurora's
compatibility with MySQL and PostgreSQL makes it easy for developers to migrate
from these open-source databases and use Aurora's high-performance architecture to
handle their data needs.
Gaming Applications: Gaming applications require a scalable and performant database
that can handle real-time data processing, in-game transactions, and complex
analytics. Aurora's distributed architecture, automatic failover, and up to 15 read
replicas make it a great option for gaming companies to support large numbers of
players, high traffic, and high concurrency.
Analytics Applications: Analytics applications require a database that can store large
amounts of data and provide fast query processing to generate insights. Aurora's
advanced compression and distributed query processing make it a great option for
analytics applications that need to store and analyze large datasets.
IoT Applications: IoT applications generate large volumes of data that need to be
stored and processed in real-time. Aurora's compatibility with open-source databases
like MySQL and PostgreSQL, and its support for multiple read replicas and high
availability, make it a good choice for IoT applications that need to store, process, and
analyze large volumes of data.
Healthcare Applications: Healthcare applications require a database that is secure,
reliable, and compliant with regulatory requirements. Aurora's encryption at rest and
in transit, VPC support, and compliance with HIPAA and other regulatory standards
make it a good choice for healthcare applications that need to store sensitive patient
data.
Overall, Amazon Aurora is a great option for applications that require a high-
performance, scalable, and reliable database that can handle a range of use cases,
from e-commerce and gaming to analytics and IoT. Its compatibility with open-source
databases like MySQL and PostgreSQL, along with its advanced architecture and
features, make it a compelling option for a wide variety of applications.

Amazon Aurora Global Database


is designed for globally distributed applications, allowing a single Amazon Aurora
database to span multiple AWS regions. It replicates your data with no impact on
database performance, enables fast local reads with low latency in each region, and
provides disaster recovery from region-wide outages.
Amazon Aurora Global Database is a feature of Amazon Aurora, a fully managed
relational database service that is compatible with MySQL and PostgreSQL. Aurora
Global Database allows you to create a single Aurora database that spans multiple
AWS regions, enabling you to build a globally distributed, highly available, and low-
latency application.
With Aurora Global Database, you can replicate your Aurora database to read-only
secondary clusters in up to five additional AWS Regions, which allows you to reduce read
latency for globally distributed applications. A secondary cluster can also be promoted to become the
primary database in the event of a regional outage.
Aurora Global Database provides a number of benefits, including:
Global access: You can access your Aurora database from any region where you have
read replicas, reducing the need to manage multiple databases.
Low latency: Read replicas in different regions can reduce read latency for users in
those regions.
High availability: In the event of a regional outage, you can promote a read replica to
become the primary database, minimizing downtime.
Managed solution: Amazon Aurora is a fully managed database service, meaning that
Amazon takes care of backups, patching, and other routine maintenance tasks.
To set up an Aurora Global Database, you need to create an Aurora cluster in your
primary region, then create read replicas in one or more secondary regions. You can
then set up global replication and manage failovers using the AWS Management
Console, AWS CLI, or AWS SDKs.
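A rough sketch of those steps with boto3 (cluster identifiers, Region names, and engine are placeholders; in practice, engine versions and other settings must line up between Regions):

import boto3

rds = boto3.client("rds", region_name="us-east-1")

# Turn an existing Aurora cluster into the primary of a global database.
rds.create_global_cluster(
    GlobalClusterIdentifier="my-global-db",
    SourceDBClusterIdentifier="arn:aws:rds:us-east-1:123456789012:cluster:my-aurora-cluster",
)

# In a secondary Region, create a read-only cluster attached to the global database.
rds_secondary = boto3.client("rds", region_name="eu-west-1")
rds_secondary.create_db_cluster(
    DBClusterIdentifier="my-aurora-cluster-eu",
    Engine="aurora-mysql",                       # placeholder engine
    GlobalClusterIdentifier="my-global-db",
)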

Amazon Redshift
Redshift is based on PostgreSQL, but it’s not used for OLTP
• It’s OLAP – online analytical processing (analytics and data warehousing)
• 10x better performance than other data warehouses, scale to PBs of data
• Columnar storage of data (instead of row based) & parallel query engine
• Pay as you go based on the instances provisioned
• Has a SQL interface for performing the queries
• BI tools such as Amazon Quicksight or Tableau integrate with it
• vs Athena: faster queries / joins / aggregations thanks to indexes
High Performance: Amazon Redshift is designed for fast querying and analysis of large
datasets. It is optimized for running complex queries across multiple nodes in a cluster.
Fully Managed: Amazon Redshift is a fully managed service that handles all the heavy
lifting of provisioning, configuring, monitoring, and scaling your data warehouse.
Petabyte-scale Data Warehousing: Amazon Redshift can store and query petabyte-
scale data in a single cluster. It uses columnar storage to improve query performance
and reduce I/O.
Massively Parallel Processing (MPP): Amazon Redshift uses MPP to parallelize queries
across multiple nodes in a cluster, which allows for fast query processing and
scalability.
Advanced Compression: Amazon Redshift uses advanced compression techniques to
reduce the amount of storage required for your data, which can help lower your
storage costs.
Automated Backups: Amazon Redshift automatically backs up your data warehouse
and allows you to restore your data to any point in time within the retention period.
Encryption: Amazon Redshift provides encryption of data in transit and at rest using
industry-standard AES-256 encryption.
Integration with BI Tools: Amazon Redshift integrates with popular business
intelligence (BI) tools such as Tableau, Looker, and Power BI, making it easy to visualize
and analyze your data.
Pricing: Amazon Redshift offers a pay-as-you-go model, allowing you to only pay for
what you use. Additionally, you can leverage Reserved Instances to reduce your costs
further.
Overall, Amazon Redshift provides a scalable, high-performance, and cost-effective
data warehousing solution that is easy to use and integrates with popular BI tools.

use cases for Amazon Redshift:


Business Intelligence and Data Warehousing: Amazon Redshift is a popular choice for
businesses that need to store and analyze large amounts of data for business
intelligence and data warehousing purposes. Redshift's columnar storage, parallel
processing, and ability to scale up to petabytes of data make it a great option for
processing and analyzing large data sets quickly.
AdTech and Digital Marketing: AdTech and digital marketing platforms need to process
large amounts of data quickly to serve relevant ads and optimize ad performance.
Redshift's ability to scale up quickly and handle large amounts of data make it a great
option for these platforms.
Financial Services: Financial services companies need to store and analyze large
amounts of transactional data in real-time to identify fraudulent transactions and
support compliance requirements. Redshift's scalability, fast query performance, and
security features make it a good choice for these applications.
Healthcare: Healthcare providers need to store and analyze large amounts of patient
data while maintaining regulatory compliance. Redshift's HIPAA compliance,
encryption, and access controls make it a good choice for healthcare applications.
IoT Analytics: IoT devices generate a large volume of data that needs to be stored,
processed, and analyzed in real-time. Redshift's ability to handle large volumes of data,
fast query performance, and integration with other AWS services like AWS IoT
Analytics make it a good choice for IoT applications.
Overall, Amazon Redshift is a powerful data warehousing and analytics service that can
handle large volumes of data and complex queries. Its ability to scale quickly, security
features, and integration with other AWS services make it a compelling option for
businesses in a variety of industries, from financial services and healthcare to AdTech
and IoT analytics.

ENCRYPTION
Server-Side Encryption with Amazon S3-Managed Keys (SSE-S3)

Key Management: Amazon S3 manages the encryption keys used to encrypt and decrypt your
data. SSE-S3 handles encryption keys internally and automatically.
Key Control: you have limited control over the encryption keys because they are
managed by Amazon S3.
Encryption Options: SSE-S3 uses AES-256 encryption for data at rest.
Key Features: With SSE-KMS, you can take advantage of additional features provided
by AWS KMS such as key rotation, audit logs, and key policies; SSE-S3 does not offer
these additional features.
Pricing: SSE-S3 is included in the standard Amazon S3 pricing, making it a simpler and
more cost-effective option for encrypting your data at rest in Amazon S3.
Compliance: If you have specific compliance requirements for encryption key
management, such as HIPAA or PCI DSS, then you may need to use SSE-KMS to meet
those requirements.

Server-Side Encryption with AWS KMS-Managed Keys (SSE-KMS)

Key Management: You manage the encryption keys used to encrypt and decrypt your
data through AWS KMS.
Encryption Options: SSE-KMS offers both AES-256 and AWS KMS envelope encryption.
The latter provides additional security by encrypting data with a unique data key for
each object, which is itself encrypted with a master key stored in KMS.
Key Control: You have more control over the encryption keys because you can manage
them using AWS KMS, including key rotation, auditing, and revocation.
Key Features: With SSE-KMS, you can take advantage of additional features provided
by AWS KMS such as key rotation, audit logs, and key policies.
Pricing: SSE-KMS incurs AWS KMS charges on top of standard Amazon S3 pricing; it provides
additional features and greater control over encryption key management, but at an
additional cost.
Access Control: SSE-KMS and SSE-C allow for granular access control to encryption
keys, so you can control who has access to specific keys and what actions they can
perform.
Compliance: SSE-KMS and SSE-C can help you meet compliance requirements for data
encryption and key management, as they provide greater control and transparency
over key management and usage.

Server-Side Encryption with Customer-Provided Keys (SSE-C)

Key Management: SSE-C allows you to bring your own encryption keys. With SSE-KMS
and SSE-C, you can have greater control over key management, including key rotation,
auditing, and revocation.
Encryption Options: With SSE-C, Amazon S3 encrypts your data using AES-256 with the
encryption keys that you provide on each request; S3 does not store your keys.
Access Control: SSE-KMS and SSE-C allow for granular access control to encryption
keys, so you can control who has access to specific keys and what actions they can
perform.
Compliance: SSE-KMS and SSE-C can help you meet compliance requirements for data
encryption and key management, as they provide greater control and transparency
over key management and usage.
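The three options map to different parameters on the same S3 calls; a boto3 sketch with placeholder bucket and key names (for SSE-C, boto3 handles base64-encoding the key and computing its MD5):

import os
import boto3

s3 = boto3.client("s3")
bucket = "my-encrypted-bucket"        # placeholder

# SSE-S3: Amazon S3 manages the key (AES-256).
s3.put_object(Bucket=bucket, Key="sse-s3.txt", Body=b"hello",
              ServerSideEncryption="AES256")

# SSE-KMS: encrypt with a customer managed KMS key (ARN is a placeholder).
s3.put_object(Bucket=bucket, Key="sse-kms.txt", Body=b"hello",
              ServerSideEncryption="aws:kms",
              SSEKMSKeyId="arn:aws:kms:us-east-1:123456789012:key/11111111-2222-3333-4444-555555555555")

# SSE-C: you supply the 256-bit key on every request; S3 encrypts with it but never stores it.
customer_key = os.urandom(32)
s3.put_object(Bucket=bucket, Key="sse-c.txt", Body=b"hello",
              SSECustomerAlgorithm="AES256",
              SSECustomerKey=customer_key)

# Reads of the SSE-C object must present the same key again.
s3.get_object(Bucket=bucket, Key="sse-c.txt",
              SSECustomerAlgorithm="AES256",
              SSECustomerKey=customer_key)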

Amazon Simple Workflow Service (SWF)


provides useful guarantees around task assignments. It ensures that a task is never
duplicated and is assigned only once. Thus, even though you may have multiple
workers for a particular activity type (or a number of instances of a decider), Amazon
SWF will give a specific task to only one worker (or one decider instance). Additionally,
Amazon SWF keeps at most one decision task outstanding at a time for workflow
execution. Thus, you can run multiple decider instances without worrying about two
instances operating on the same execution simultaneously. These facilities enable you
to coordinate your workflow without worrying about duplicate, lost, or conflicting
tasks.

Elastic Fabric Adapter (EFA):


EFA is a high-performance network interface for HPC (High-Performance Computing)
workloads that require low-latency, high-bandwidth communication between EC2
instances. It uses a high-speed interconnect to provide a dedicated, low-latency, high-
bandwidth communication channel between EC2 instances. EFA is designed to work
with MPI (Message Passing Interface) applications, and it is recommended for use in
HPC clusters.

Elastic Network Adapter (ENA):


ENA is a network interface that provides high throughput and low-latency network
performance for EC2 instances. It uses a custom-designed network interface controller
(NIC) that is optimized for use with AWS. ENA is designed to work with a wide range of
workloads, including databases, analytics, and machine learning.

Elastic Network Interface (ENI):


ENI is a virtual network interface that can be attached to an EC2 instance to provide
additional network interfaces. ENIs can be used to create a highly available network
architecture, to attach instances to multiple subnets, or to provide additional network
interfaces for specialized workloads.

Private Virtual Interface (VIF):


A private VIF is a logical interface on an AWS Direct Connect connection that is used to
establish a private connection between an on-premises data center and an AWS VPC
(Virtual Private Cloud). A private VIF can be used to extend your data center network to
AWS, to provide private access to VPC resources, or to provide high-bandwidth, consistent
connectivity to AWS for specialized workloads.

In summary, EFA is optimized for HPC workloads that require low-latency, high-
bandwidth communication, ENA is designed for a wide range of workloads that require
high throughput and low latency, ENI provides additional network interfaces for EC2
instances, and a private VIF is used over AWS Direct Connect to establish a private
connection between an on-premises data center and an AWS VPC.

AWS CloudFormation
provides a common language for you to describe and provision all the infrastructure
resources in your cloud environment. CloudFormation allows you to use a simple text
file to model and provision, in an automated and secure manner, all the resources
needed for your applications across all regions and accounts. This file serves as the
single source of truth for your cloud environment. AWS CloudFormation is available at
no additional charge, and you pay only for the AWS resources needed to run your
applications.
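As a small sketch of the "text file as single source of truth" idea, the snippet below defines a one-resource template inline and creates a stack from it with boto3. The stack name and bucket name are placeholders.

import json
import boto3

cfn = boto3.client("cloudformation")

# A minimal template declaring a single S3 bucket.
template = {
    "AWSTemplateFormatVersion": "2010-09-09",
    "Resources": {
        "AppBucket": {
            "Type": "AWS::S3::Bucket",
            "Properties": {"BucketName": "my-app-artifacts-bucket"},  # placeholder
        }
    },
}

cfn.create_stack(
    StackName="my-app-stack",                  # placeholder
    TemplateBody=json.dumps(template),
)

# Block until CloudFormation has provisioned every resource in the template.
cfn.get_waiter("stack_create_complete").wait(StackName="my-app-stack")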

Amazon Simple Queue Service (SQS)


is a fully managed message queuing service that enables you to decouple and scale
microservices, distributed systems, and serverless applications. SQS offers two types of
message queues. Standard queues offer maximum throughput, best-effort ordering,
and at-least-once delivery. SQS FIFO queues are designed to guarantee that messages
are processed exactly once, in the exact order that they are sent.
AWS manages all ongoing operations and underlying infrastructure needed to provide
a highly available and scalable message queuing service. With SQS, there is no upfront
cost, no need to acquire, install, and configure messaging software, and no time-
consuming build-out and maintenance of supporting infrastructure. SQS queues are
dynamically created and scale automatically so you can build and grow applications
quickly and efficiently.
Amazon SQS offers buffer capabilities to smooth out temporary volume spikes without
losing messages or increasing latency.
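A short boto3 sketch contrasts sending to a standard queue with sending to a FIFO queue; the queue URLs and message bodies are placeholders.

import boto3

sqs = boto3.client("sqs")

standard_url = "https://sqs.us-east-1.amazonaws.com/123456789012/orders"       # placeholder
fifo_url = "https://sqs.us-east-1.amazonaws.com/123456789012/orders.fifo"      # placeholder

# Standard queue: at-least-once delivery, best-effort ordering.
sqs.send_message(QueueUrl=standard_url, MessageBody='{"order_id": 42}')

# FIFO queue: strict ordering per message group and exactly-once processing.
sqs.send_message(
    QueueUrl=fifo_url,
    MessageBody='{"order_id": 42}',
    MessageGroupId="customer-7",           # messages in a group are delivered in order
    MessageDeduplicationId="order-42-v1",  # or enable content-based deduplication
)
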
Amazon Simple Notification Service (SNS) and Amazon Simple Queue Service (SQS) are
two messaging services provided by AWS that can help you build decoupled and
scalable applications.
Here's an explanation of each service:

Amazon Simple Notification Service (SNS):

Amazon SNS is a fully managed messaging service that enables you to send messages
to a large number of subscribers, or endpoints, simultaneously. You can use SNS to
send push notifications, SMS text messages, and email messages to your subscribers.
SNS works on a publish-subscribe model. A publisher sends a message to a topic, and
SNS delivers the message to all subscribers of that topic. Topics can be created
dynamically, and subscribers can be added or removed dynamically as well. SNS is a
highly scalable service, and it can handle millions of messages per second.
Here are some key features of Amazon SNS:
Multi-protocol support: SNS supports multiple messaging protocols, including HTTP,
HTTPS, email, SMS, and mobile push notifications.
Fanout: SNS allows you to send a message to multiple subscribers simultaneously. This
feature is particularly useful when you need to broadcast a message to a large number
of subscribers.
Filtering: SNS allows you to filter messages based on their attributes. This can help you
reduce costs and increase efficiency by sending only relevant messages to your
subscribers.
Mobile push notifications: SNS provides an easy way to send push notifications to
mobile devices, including iOS, Android, and Amazon Fire OS devices.
Message attributes: SNS allows you to add custom attributes to messages, which can
be used for filtering and routing.
Message encryption: SNS supports encryption of messages in transit and at rest,
ensuring that your data is always secure.
Cross-Region delivery: SNS can deliver messages to subscribed endpoints, such as SQS
queues or Lambda functions, in other AWS Regions, which supports geographically
distributed and highly available architectures.
Dead-letter queues: SNS provides a dead-letter queue for messages that can't be
delivered to subscribers. This feature helps you debug issues and ensures that no
messages are lost.
CloudTrail integration: SNS integrates with AWS CloudTrail to provide audit logs of all
SNS API calls.
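The following boto3 sketch ties several of these features together: it creates a topic, subscribes an SQS queue with a filter policy, and publishes a message that fans out to matching subscribers. The topic name, queue ARN, and attribute values are placeholders (the queue would also need an access policy allowing SNS to send to it).

import boto3

sns = boto3.client("sns")

topic_arn = sns.create_topic(Name="order-events")["TopicArn"]  # placeholder name

# Subscribe an SQS queue and keep only "refund" events using a filter policy.
sns.subscribe(
    TopicArn=topic_arn,
    Protocol="sqs",
    Endpoint="arn:aws:sqs:us-east-1:123456789012:refund-queue",  # placeholder ARN
    Attributes={"FilterPolicy": '{"event_type": ["refund"]}'},
)

# Publish once; SNS fans the message out to every subscriber whose filter matches.
sns.publish(
    TopicArn=topic_arn,
    Message='{"order_id": 42, "amount": 19.99}',
    MessageAttributes={
        "event_type": {"DataType": "String", "StringValue": "refund"}
    },
)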

Amazon Simple Queue Service (SQS):


Amazon SQS is a fully managed message queuing service that enables you to decouple
and scale microservices, distributed systems, and serverless applications. SQS allows
you to send, store, and receive messages between software components without
losing messages or requiring other services to be available.
SQS works on a message queue model. Producers send messages to a queue, and
consumers retrieve them and delete them after processing. SQS supports two types of
queues: standard queues, which offer best-effort ordering, and FIFO (first-in, first-out)
queues, which guarantee strict ordering.

Here are some key features of Amazon SQS:


Fully managed: SQS is a fully managed service, meaning that Amazon takes care of the
underlying infrastructure, such as scaling, patching, and monitoring.
Message durability: SQS stores messages redundantly across multiple availability
zones, ensuring that messages are not lost in the event of a failure.
Scale: SQS can handle any volume of messages, from a few messages per day to
millions of messages per second.
Decoupling: SQS helps to decouple the components of your application, allowing you
to build more loosely coupled and scalable systems.
FIFO queues: SQS supports FIFO queues, which provide ordered message delivery and
exactly-once processing. FIFO queues are particularly useful for use cases that require
strict message ordering and no duplicates.
Standard queues: SQS also supports standard queues, which provide best-effort
ordering and at-least-once delivery.
Visibility timeout: SQS allows you to specify a visibility timeout for messages. During
this time, the message is invisible to other consumers, preventing multiple consumers
from processing the same message.
Dead-letter queues: SQS provides a dead-letter queue for messages that can't be
processed successfully. This feature helps you debug issues and ensures that no
messages are lost.
Delay queues: SQS allows you to set a delay period for messages, which can be useful
for scenarios where you need to wait before processing a message.
Long polling: SQS supports long polling, which allows consumers to receive messages
as soon as they become available, rather than continuously polling for new messages.
CloudWatch integration: SQS integrates with Amazon CloudWatch to provide monitoring
and alarms for your queues.

In summary, Amazon SNS and Amazon SQS are two powerful messaging services that can
help you build scalable and decoupled applications. SNS is ideal for broadcasting
messages to many subscribers, while SQS is designed for decoupling and scaling the
components of your application.
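Finally, a hedged consumer-side sketch shows long polling, the visibility timeout, and deleting a message only after successful processing; the queue URL and process function are placeholders.

import boto3

sqs = boto3.client("sqs")
queue_url = "https://sqs.us-east-1.amazonaws.com/123456789012/orders"  # placeholder

while True:
    # Long polling: wait up to 20 seconds for messages instead of tight-looping.
    resp = sqs.receive_message(
        QueueUrl=queue_url,
        MaxNumberOfMessages=10,
        WaitTimeSeconds=20,
        VisibilityTimeout=60,  # hide each received message from other consumers for 60s
    )
    for msg in resp.get("Messages", []):
        process(msg["Body"])  # your business logic (placeholder)
        # Delete only after successful processing; otherwise the message becomes
        # visible again and is retried (or eventually routed to a dead-letter queue).
        sqs.delete_message(QueueUrl=queue_url, ReceiptHandle=msg["ReceiptHandle"])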
