Lesson 05 Database On AWS

AWS Developer Associate
Databases on AWS
Learning Objectives
By the end of this lesson, you will be able to:
Identify the different types of databases offered by AWS
Present an overview of the features and benefits of Amazon RDS
Create a table using DynamoDB console
Delineate the concepts in DynamoDB
List the aspects of Amazon ElastiCache

Introduction to Databases
What Is a Database?
A database is a collection of individual data items stored in a highly structured manner.
Provides the ability to store a large amount of information
Facilitates quick access to information
Allows users to share information at different locations
Ensures data security
AWS databases are both relational and non-relational.

Relational Databases
A Relational Database is a group of data items having pre-defined relationships with each other.
These items are arranged into a set of tables with rows and columns.
Figure: four related tables.
Table 1: ratings (id int, rating int, user_id int, movie_id int)
Table 2: users (id int, first_name varchar, last_name varchar, email varchar)
Table 3: movies (id int, name varchar, description text)
Table 4: tags (id int, tag varchar, user_id int, movie_id int)
The ratings and tags tables reference users and movies through user_id and movie_id.
Features of Relational Databases
SQL: the primary interface
Data integrity: enforced by a set of constraints
Transactions: end in either a COMMIT or a ROLLBACK
ACID compliance: ensures data integrity
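The features above can be sketched with Python's built-in sqlite3 module standing in for a managed relational engine. This is a minimal illustration, not an RDS-specific recipe: the table layout loosely follows the users and ratings tables in the figure, and the sample rows and the helper name add_user_with_rating are hypothetical.

```python
import sqlite3

# In-memory SQLite database; a stand-in for any relational engine.
conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # SQLite enforces foreign keys only when asked
conn.execute("CREATE TABLE users (id INTEGER PRIMARY KEY, email TEXT NOT NULL UNIQUE)")
conn.execute(
    "CREATE TABLE ratings ("
    " id INTEGER PRIMARY KEY,"
    " rating INTEGER NOT NULL CHECK (rating BETWEEN 1 AND 5),"
    " user_id INTEGER NOT NULL REFERENCES users(id))"
)

def add_user_with_rating(email, rating):
    """Insert a user and a first rating atomically; return True on COMMIT."""
    try:
        with conn:  # commits on success, rolls back if an exception escapes
            cur = conn.execute("INSERT INTO users (email) VALUES (?)", (email,))
            conn.execute(
                "INSERT INTO ratings (rating, user_id) VALUES (?, ?)",
                (rating, cur.lastrowid),
            )
        return True
    except sqlite3.IntegrityError:
        return False

ok = add_user_with_rating("ana@example.com", 4)
rejected = add_user_with_rating("bob@example.com", 9)  # violates the CHECK constraint
user_count = conn.execute("SELECT COUNT(*) FROM users").fetchone()[0]
```

The second call fails on the rating constraint, so the whole transaction is rolled back and the user row inserted just before it does not persist.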

AWS Relational Databases
Here are some Relational Database Engines that Amazon RDS offers:
Amazon Aurora
Oracle
Microsoft SQL Server
MariaDB
Key-Value Databases
A Key-Value Database is a type of Non-relational Database. To store data, it uses a collection of key-value pairs in which the key acts as a unique identifier.
Table: Products
Primary key: Partition key (Product ID) and Sort key (Type)
Attributes: schema is defined per item
Items:
Product ID 1 | Type: Book ID | Odyssey, Homer, 1871
Product ID 2 | Type: Album ID | 6 Partitas, Bach
Product ID 2 | Type: Album ID:Track ID | Partita No. 1
Product ID 3 | Type: Movie ID | The Kid, Chaplin, Drama, Comedy
Example of data stored as key-value pairs in DynamoDB
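The Products example above can be modeled, very roughly, as a Python dictionary keyed by a (partition key, sort key) pair. This is only an in-memory sketch of the idea. The attribute names (Title, Author, and so on) are assumptions, since the figure shows only the values.

```python
# Hypothetical in-memory model of a table with a composite primary key.
# Keys are (partition key, sort key) tuples; values are schemaless attribute dicts.
products = {
    (1, "Book ID"): {"Title": "Odyssey", "Author": "Homer", "Year": 1871},
    (2, "Album ID"): {"Title": "6 Partitas", "Composer": "Bach"},
    (2, "Album ID:Track ID"): {"Title": "Partita No. 1"},
    (3, "Movie ID"): {"Title": "The Kid", "Director": "Chaplin", "Genre": ["Drama", "Comedy"]},
}

def get_item(product_id, type_key):
    """Direct lookup by the full primary key; returns None if absent."""
    return products.get((product_id, type_key))

def items_in_partition(product_id):
    """All items sharing a partition key, ordered by sort key."""
    return [products[k] for k in sorted(products) if k[0] == product_id]

book = get_item(1, "Book ID")
album_items = items_in_partition(2)
```

Each item carries its own set of attributes, which is what "schema is defined per item" means in practice.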

Use Cases of Key-Value Databases
Session Store
Session data is always queried by a primary key. Hence, a fast key-value store is an ideal fit for session data.
Shopping Cart
Key-Value Databases are capable of scaling large amounts of data and high volumes of state changes.
AWS In-Memory Databases
An In-Memory Database is a type of purpose-built database that primarily depends on memory for data storage.
In-Memory Databases are ideal for applications that need microsecond response times.
Figure: an application connects to a master server, with data held in RAM across Data Partition 1, Data Partition 2, Data Partition 3, and Data Partition 4.
Use Cases of In-Memory Databases
Real-Time Bidding
In-Memory Databases can ingest, process, and analyze real-time data with sub-millisecond latency.
Gaming Leaderboards
In-Memory Databases can quickly deliver sorting results and update the leaderboard in real-time.
Caching
The primary purpose of a cache is to facilitate increased data retrieval performance.
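The caching use case is often implemented with a cache-aside read path. The sketch below shows that pattern with a small in-process cache standing in for an in-memory store; the class TTLCache, the function slow_database_lookup, and the 60-second expiry are all hypothetical choices for illustration.

```python
import time

class TTLCache:
    """Tiny in-process cache with per-entry expiry (a stand-in for Redis or Memcached)."""

    def __init__(self, ttl_seconds, clock=time.monotonic):
        self.ttl = ttl_seconds
        self.clock = clock
        self._store = {}  # key -> (value, expires_at)

    def get(self, key):
        entry = self._store.get(key)
        if entry is None:
            return None
        value, expires_at = entry
        if self.clock() >= expires_at:
            del self._store[key]  # expired: treat as a miss
            return None
        return value

    def set(self, key, value):
        self._store[key] = (value, self.clock() + self.ttl)

db_calls = 0

def slow_database_lookup(user_id):
    """Hypothetical backing-store query; counts calls so the cache effect is visible."""
    global db_calls
    db_calls += 1
    return {"id": user_id, "name": f"user-{user_id}"}

cache = TTLCache(ttl_seconds=60)

def get_user(user_id):
    """Cache-aside read: try the cache, fall back to the database, then populate the cache."""
    cached = cache.get(user_id)
    if cached is not None:
        return cached
    user = slow_database_lookup(user_id)
    cache.set(user_id, user)
    return user

first = get_user(42)
second = get_user(42)  # served from the cache; no second database call
```

Only the first read reaches the backing store; repeated reads within the expiry window are answered from memory.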
AWS In-Memory Databases
Amazon ElastiCache for Redis
A fast in-memory data store that provides sub-millisecond latency to power internet-scale, real-time applications
Amazon ElastiCache for Memcached
A Memcached-compatible in-memory key-value store service that can be used as a cache or a data store
Amazon RDS
Amazon RDS
Amazon RDS is a managed relational database service.
• Provides CPU, memory, IOPS, and storage that can be scaled independently
• Looks after software patching, updates, backups, recovery, and automatic failure detection
• Facilitates creating backups automatically or manually via snapshots
• Has a primary instance and a synchronized secondary instance to provide high availability and avoid failure
It is mainly used to manage the data of e-commerce platforms, gaming software, apps, and websites.
Benefits of Amazon RDS
Availability of MySQL, PostgreSQL, Oracle, and SQL Server engines
Need for payment only during use
Ease in handling of patching, backups, and replication
Simple and fast scaling
Simple and fast deployment
Fast and predictable performance

Amazon RDS Database Engines
Amazon Aurora
PostgreSQL
MySQL
MariaDB
Microsoft SQL Server
Oracle
Amazon Aurora
Amazon Aurora is a relational database engine fully managed by Amazon RDS.
Compatibility with MySQL and PostgreSQL
High speed: up to 5X faster performance than MySQL and 3X faster performance than PostgreSQL
Support for cross-region Read Replicas
High availability, durability, and security
Cost-effective
An Amazon Aurora storage volume is made up of 10 GB logical blocks. It can scale up to 64 TB when required.
Crash Recovery
Traditional Databases
• Replay logs since the last checkpoint
• Generally, take five minutes between checkpoints
• MySQL works with a single thread; the number of disk accesses is very high
AWS Aurora
• Performs redo of records on demand, as part of disk read
• Performs parallel, distributed, and asynchronous operations
• Does not replay on startup of server
Use Cases of Amazon RDS
Web and Mobile Applications
• Amazon RDS is the perfect fit for highly demanding applications as it provides a high throughput, massive storage scalability, and high availability.
• The absence of licensing constraints best suits the variable usage pattern of these applications.
E-Commerce Applications
• Amazon RDS is a flexible, secured, highly scalable, and low-cost database solution that is well-qualified for small and large e-commerce businesses.
• It helps satisfy PCI compliance and builds a superior customer experience, without the hassle of managing the underlying database.
Mobile and Online Games
• Amazon RDS efficiently manages the database by taking care of the provisioning, scaling, and monitoring of database servers.
• It can rapidly increase its capacity by providing familiar database engines to meet user demand.
Database Instances
A Database Instance is a set of memory structures that manage the database.
It is a basic building block of RDS.
The computation and memory capacity of a DB Instance is determined by its DB Instance class, which is selected as per need.
Every DB Instance can host multiple user-created databases or a single Oracle database with multiple schemas.
Every DB Instance runs on a DB engine.
By default, a customer can have 40 DB Instances.

Backup and Restore in Amazon RDS
Figure: Data Flow Diagram during Backup and Restore. VPC A contains RDS Instance R1 and EC2 Instance A; VPC B contains RDS Instance R2 and EC2 Instance B; an S3 Bucket sits between them.

Backup and Restore in Amazon RDS
Amazon RDS offers automated backups, point-in-time restores, and database snapshots.
Amazon RDS performs automated backups of DB Instances, based on the specified backup retention period.
The backup retention period can be set between one and 35 days.
Backups can also be created manually via snapshots.
When a DB Instance is deleted, the automated backups also get deleted. The manual snapshots, however, are retained.
Multi-AZ Deployments in Amazon RDS
When a Multi-AZ DB Instance is provisioned, Amazon RDS automatically creates a primary DB Instance and synchronously replicates the data to a standby instance in a different Availability Zone (AZ).
Benefits
Enhanced durability
Increased availability
Protected database performance
Automatic failover
Failover Conditions
AWS RDS automatically switches from a primary DB Instance to a standby replica in another Availability Zone whenever one or more of the following conditions occur:
Failure of the primary DB Instance
Outage of an Availability Zone
Software patching of the operating system of the DB Instance
Change in the DB Instance server type
The normal failover time is 60–120 seconds. This may be exceeded in case of a heavy recovery process.
Failover Conditions
Figure: failover between Availability Zone A and Availability Zone B. Labels: Application servers, Database failure, Primary, Standby, New standby.
Read Replicas in Amazon RDS
Read Replicas are one or more copies of a particular Relational Database Instance that handle high-volume read traffic.
• Any Amazon RDS activity initiated runs only in the current default region.
• Amazon RDS provides high availability and failover support for DB Instances by maintaining a synchronous standby replica in Multi-AZ deployments.
• Amazon RDS synchronizes standby replicas in different Availability Zones.
Figure: application servers send read/write traffic to the primary database server, which feeds a read replica through asynchronous replication; a BI/reporting application server sends read-only traffic to the read replica.
Costs of Amazon RDS
Amazon RDS follows a pay-for-what-you-use model. The table below lists the billing procedure for various parameters:
Parameters | Billing procedure
DB Instance hours | Based on the class; a full hour is billed even if the DB Instance is used for a partial hour
Storage (per GB per month) | Scaling the provisioned storage capacity within the month is billed pro-rated
I/O requests per month | Total number of storage I/O requests
Data transfer | Data transfer in and out of your DB Instance over the Internet
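The first two billing rules in the table can be worked through as simple arithmetic. The sketch below follows the rounding described in this lesson; the rates, the 30-day month, and the function names are hypothetical and are not actual AWS prices.

```python
import math

def instance_hours_billed(hours_used):
    """Partial hours are rounded up to a full hour, as the table above states."""
    return math.ceil(hours_used)

def prorated_storage_cost(gb, price_per_gb_month, days_provisioned, days_in_month=30):
    """Storage is billed per GB-month, pro-rated for the part of the month it was provisioned."""
    return gb * price_per_gb_month * days_provisioned / days_in_month

# Hypothetical rates for illustration only; real prices vary by region, engine, and class.
HOURLY_RATE = 0.10
STORAGE_RATE = 0.20

hours = instance_hours_billed(10.25)  # 10 hours 15 minutes of use
compute_cost = hours * HOURLY_RATE
storage_cost = prorated_storage_cost(100, STORAGE_RATE, days_provisioned=15)
```

With these assumed rates, 10.25 hours of use is billed as 11 instance hours, and 100 GB kept for half of a 30-day month costs half of the full monthly storage charge.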

Assisted Practice
RDS Database Instance
Problem Statement: Create an RDS database instance. Duration: 15 mins

Assisted Practice: Guidelines
Steps to create an RDS Database Instance:
1. Go to the AWS Management Console and click on “RDS”.
2. Select the database engine.
3. Fill in the required details.
4. Click on “Launch DB Instance”.
5. Install WAMP 64 and give the path of its location in the command prompt.
6. Enter the endpoint, username, port, and password to connect the WAMP server to AWS RDS.
7. Once the connection is established, perform CRUD operations on it.
Amazon DynamoDB
Difference Between SQL and NoSQL Databases
Characteristics | SQL | NoSQL
Workloads | Ad hoc queries, data warehousing, OLAP | Web scale applications
Data model | Well-defined schema where data is normalized into tables, rows, and columns | Schema-less with a primary key; manages structured or semi-structured data
Data Access | SQL | AWS management console or AWS CLI; performs ad hoc tasks
Performance | Optimized for storage | Optimized for compute
Scaling | Vertical scaling | Horizontal scaling

Amazon DynamoDB
DynamoDB is a fully managed NoSQL database that supports key-value and document data.
It is used by systems that require millisecond read latency.
Each record (row) in a table is known as an item. A TTL (Time to Live) can be set to automatically delete items from the table once they expire.
Operations such as create, insert, update, query, scan, and delete are performed on the table via the appropriate API.
For faster performance and data durability, the table data is stored on SSDs and spread across many servers in different Availability Zones.
Use Cases of Amazon DynamoDB
Ad tech
Gaming
Retail
Banking and Finance
Media and Entertainment
Software and Internet

Read Consistency in DynamoDB
DynamoDB supports both Eventually Consistent Reads and Strongly Consistent Reads.
Eventually Consistent Read
The response might not reflect a recently completed write, so stale data may be returned.
If the read request is repeated after a short time, the response returns the latest data.
Strongly Consistent Read
The response is returned with the most up-to-date data, reflecting the updates from all prior successful write operations.
A Strongly Consistent Read might not be available if there is a network delay or outage.
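In the DynamoDB GetItem API, the choice between the two read modes is made per request through the ConsistentRead parameter. The sketch below only builds the request parameters as a plain dictionary, so it runs without any AWS SDK; the helper name build_get_item_request and the Products table with its ProductID and Type key attributes are hypothetical.

```python
def build_get_item_request(table_name, key, strongly_consistent=False):
    """Return GetItem request parameters as a plain dict.

    ConsistentRead=False (the default) asks for an eventually consistent read;
    ConsistentRead=True asks for a strongly consistent read.
    """
    return {
        "TableName": table_name,
        "Key": key,
        "ConsistentRead": strongly_consistent,
    }

# Hypothetical table and key, written in DynamoDB's typed attribute-value format.
key = {"ProductID": {"N": "1"}, "Type": {"S": "Book ID"}}
eventual = build_get_item_request("Products", key)
strong = build_get_item_request("Products", key, strongly_consistent=True)
# With boto3 installed and credentials configured, this could be sent as:
#   boto3.client("dynamodb").get_item(**strong)
```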
Amazon DynamoDB Global Tables
Amazon DynamoDB global tables act as a complete solution to deploy a multi-region, multi-active database, without the need to build and maintain your own replication solution.
The AWS Regions where the table is to be available can be specified.
DynamoDB executes all the tasks needed to create identical tables in the
specified regions and distributes ongoing data changes to all of them.
How DynamoDB works
1. Create table
2. Add and query items
3. Monitor and manage table
Benefits of Amazon DynamoDB Global Tables
Is a perfect fit for massively scaled applications with globally dispersed users
Promotes fast application performance
Provides automatic multi-active replication to AWS Regions globally
Delivers low-latency data access to users, irrespective of their location
Amazon DynamoDB Pricing
The cost for using DynamoDB depends on the charges for reading, writing, and storing data
in DynamoDB tables, and for optional features, if any.
DynamoDB has two capacity modes that have specific billing options.
On-demand capacity mode: Charges for the data reads and writes the application performs on the tables
Provisioned capacity mode: Charges according to the number of reads and writes specified per second by the user
DynamoDB Use Case: Duolingo
Duolingo is a popular language-learning website and mobile app that delivers lessons for
80 languages. Duolingo uses DynamoDB to store around 31 billion items.
DynamoDB fits the requirements for Duolingo owing to its scalability and performance.
Assisted Practice
DynamoDB
Problem Statement: Create a table using the DynamoDB Console. Duration: 15 mins
Assisted Practice: Guidelines
Steps to create a table using the DynamoDB Console:
1. Go to the AWS Management Console and select the DynamoDB service.
2. Click on Create table and enter the table name and primary keys.
3. Now select Items and click on Create item to insert data in the table.
4. If the data is inserted successfully, you can read it from the dashboard.
5. If you want to remove an item from the table, click on remove.
6. If you want to delete the table, click on Delete table.
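The same table can also be described programmatically. The sketch below builds the parameters of a DynamoDB CreateTable request as a plain dictionary, so it runs without an AWS SDK. The helper name build_create_table_request, the Products table, its key attribute names, and the starting capacity of 5 units are hypothetical.

```python
def build_create_table_request(table_name, partition_key, sort_key=None, on_demand=True):
    """Return CreateTable request parameters as a plain dict.

    partition_key and sort_key are (attribute_name, attribute_type) pairs,
    where the type is "S" (string), "N" (number), or "B" (binary).
    """
    key_schema = [{"AttributeName": partition_key[0], "KeyType": "HASH"}]
    definitions = [{"AttributeName": partition_key[0], "AttributeType": partition_key[1]}]
    if sort_key is not None:
        key_schema.append({"AttributeName": sort_key[0], "KeyType": "RANGE"})
        definitions.append({"AttributeName": sort_key[0], "AttributeType": sort_key[1]})

    request = {
        "TableName": table_name,
        "KeySchema": key_schema,
        "AttributeDefinitions": definitions,
    }
    if on_demand:
        request["BillingMode"] = "PAY_PER_REQUEST"
    else:
        request["BillingMode"] = "PROVISIONED"
        # Hypothetical starting capacity; size it for the real workload.
        request["ProvisionedThroughput"] = {"ReadCapacityUnits": 5, "WriteCapacityUnits": 5}
    return request

request = build_create_table_request("Products", ("ProductID", "N"), ("Type", "S"))
provisioned = build_create_table_request("Products", ("ProductID", "N"), on_demand=False)
```

With boto3 installed and credentials configured, such a dictionary could be passed as keyword arguments to the client's create_table call.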
DynamoDB Concepts
Indexes
An index is a data structure that allows the user to perform fast queries on
specific columns in a table.
DynamoDB supports two types of indexes.
01 Local Secondary Index
02 Global Secondary Index
Scan vs Query API Call
Scan API scans the table to look for elements that match the criteria.
Query API performs a direct lookup to a selected partition. The lookup will be based on partition or hash key.
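The difference in work done by the two calls can be seen in a toy in-memory table. This is a conceptual sketch only, not how DynamoDB is implemented; the ToyTable class and its items_examined counter are hypothetical.

```python
from collections import defaultdict

class ToyTable:
    """Hypothetical in-memory table that groups items by partition key."""

    def __init__(self):
        self.partitions = defaultdict(list)
        self.items_examined = 0  # counts items touched by the last operation

    def put(self, partition_key, item):
        self.partitions[partition_key].append(item)

    def query(self, partition_key):
        """Go straight to one partition and read only its items."""
        items = self.partitions.get(partition_key, [])
        self.items_examined = len(items)
        return list(items)

    def scan(self, predicate):
        """Walk every item in every partition, keeping those that match."""
        self.items_examined = 0
        matches = []
        for items in self.partitions.values():
            for item in items:
                self.items_examined += 1
                if predicate(item):
                    matches.append(item)
        return matches

table = ToyTable()
for user in range(100):
    table.put(f"user-{user}", {"user": f"user-{user}", "score": user})

queried = table.query("user-7")
query_cost = table.items_examined
scanned = table.scan(lambda item: item["user"] == "user-7")
scan_cost = table.items_examined
```

Both calls return the same item, but the query examines one item while the scan examines all 100.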
DynamoDB APIs
There are three planes in DynamoDB API.
Control Plane
Data Plane
DynamoDB Streams
Control Plane
The Control Plane lets you create and manage DynamoDB tables.
Operations:
CreateTable
DescribeTable
UpdateTable
DeleteTable
ListTables
DescribeLimits
Data Plane
The Data Plane lets you perform CRUD actions on data in a table.
Creating data
Reading data
Updating data
Deleting data
Throughput Capacity
Throughput capacity is the number of reads and writes per second that a table can support, expressed in Read and Write Capacity units.
Read and Write capacities
A Read Capacity unit represents one Strongly Consistent Read per second, or two Eventually Consistent Reads per second, for an item up to 4 KB in size.
A Write Capacity unit represents one write per second for an item up to 1 KB in size.
Note
Specify the capacity requirement for Read and Write activity while creating the table.
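The capacity unit definitions above translate into a small calculation. The sketch below assumes that larger items consume additional units in whole 4 KB (read) or 1 KB (write) steps; the function names and the sample workload are hypothetical.

```python
import math

def read_capacity_units(item_size_kb, reads_per_second, strongly_consistent=True):
    """RCUs needed: each 4 KB block (rounded up) costs 1 unit per strongly
    consistent read, and half a unit per eventually consistent read."""
    units_per_read = math.ceil(item_size_kb / 4)
    total = units_per_read * reads_per_second
    return total if strongly_consistent else math.ceil(total / 2)

def write_capacity_units(item_size_kb, writes_per_second):
    """WCUs needed: each 1 KB block (rounded up) costs 1 unit per write."""
    return math.ceil(item_size_kb / 1) * writes_per_second

strong_rcu = read_capacity_units(6, 10)            # 2 blocks x 10 reads = 20
eventual_rcu = read_capacity_units(6, 10, False)   # half of 20 = 10
wcu = write_capacity_units(2.5, 10)                # 3 blocks x 10 writes = 30
```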
DynamoDB On-Demand Capacity
DynamoDB On-Demand Capacity is a flexible billing option that requires no capacity planning. The user need not specify the Read and Write Capacity.
On-demand is preferable when:
New tables with unknown workloads must be created.
The application traffic is unpredictable.
Paying only for what is used is preferred.
Note
On-demand mode can be chosen either while creating the table, or later, using the Capacity tab.
DynamoDB Accelerator
DynamoDB Accelerator (DAX) is a caching service, which is:
Fully managed
Highly available
10-times faster
In-memory cache
DynamoDB Transactions
DynamoDB transactions help developers operate on multiple items in a single request.
Help the developer implement business logic that requires multiple, all-or-nothing operations across one or more tables
Provide atomicity, consistency, isolation, and durability (ACID) across tables
Extend scale and performance to a broader set of workloads
Offer multiple read and write options to meet different application requirements
Working of DynamoDB Transactions
TransactWriteItems API
Is a batch operation that contains a write set, with one or more PutItem, UpdateItem, and DeleteItem operations. It can optionally check prerequisite conditions that must be satisfied before an update is made.
Idempotency
It is an optional feature that prevents application errors if the same operation is submitted multiple times due to connection time-outs or network errors.
Working of DynamoDB Transactions
Error Handling for Writing
A write transaction fails if a condition expression is not met or if more than one action in the same TransactWriteItems request targets the same item.
TransactGetItems API
Is a batch operation that contains a read set with one or more GetItem operations. If it is issued on an item that is part of an active write transaction, the read transaction is cancelled. It can include up to 25 unique items or 4 MB of data.
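The all-or-nothing behaviour and the two write failure rules described above can be imitated with an in-memory dictionary. This is a conceptual sketch, not the DynamoDB API: the transact_write function, the TransactionCanceled exception, and the accounts data are hypothetical.

```python
import copy

class TransactionCanceled(Exception):
    """Raised when any action fails, so that no action is applied."""

def transact_write(table, actions):
    """Apply every action or none.

    table: dict mapping key -> item dict.
    actions: list of (op, key, payload, condition) tuples, where op is
    "put", "update", or "delete" and condition is an optional
    callable(existing_item_or_None) -> bool.
    """
    keys = [key for _, key, _, _ in actions]
    if len(keys) != len(set(keys)):
        raise TransactionCanceled("two actions target the same item")

    staged = copy.deepcopy(table)  # work on a copy; the original is untouched on failure
    for op, key, payload, condition in actions:
        if condition is not None and not condition(staged.get(key)):
            raise TransactionCanceled(f"condition failed for {key!r}")
        if op == "put":
            staged[key] = dict(payload)
        elif op == "update":
            staged.setdefault(key, {}).update(payload)
        elif op == "delete":
            staged.pop(key, None)
        else:
            raise ValueError(f"unknown operation {op!r}")

    table.clear()
    table.update(staged)  # commit: all changes become visible together

accounts = {"alice": {"balance": 100}, "bob": {"balance": 20}}

# Succeeds: the condition on alice holds, so both updates are applied.
transact_write(accounts, [
    ("update", "alice", {"balance": 70}, lambda item: item["balance"] >= 30),
    ("update", "bob", {"balance": 50}, None),
])

# Fails: the condition on alice is not met, so bob is not changed either.
try:
    transact_write(accounts, [
        ("update", "bob", {"balance": 150}, None),
        ("update", "alice", {"balance": -30}, lambda item: item["balance"] >= 100),
    ])
    failed = False
except TransactionCanceled:
    failed = True
```

After the failed second call, both balances are exactly as the first transaction left them.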
DynamoDB Transactions
Within a transaction, a conflict can occur during concurrent item-level requests on the same item.
The scenarios when transactional conflicts could occur are:
A request (put, update, delete) for an item conflicts with an ongoing TransactWriteItems request
A TransactWriteItems request conflicts with an ongoing TransactWriteItems request for the same item
A TransactGetItems request conflicts with an ongoing TransactWriteItems request for the same item
DynamoDB Time To Live
Amazon DynamoDB Time to Live (TTL) supports defining a per-item timestamp. This helps to
determine when an item is no longer needed.
TTL Features
Removes user or sensor data after one year of inactivity in an application
Archives expired items to an Amazon S3 data lake via Amazon DynamoDB Streams and AWS Lambda
Retains sensitive data for a certain amount of time, based on contractual or regulatory obligations
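A per-item TTL timestamp is stored as a number holding Unix epoch time in seconds. The sketch below computes such a timestamp for the one-year inactivity case and checks whether an item has passed it; the attribute name expires_at, the helper names, and the session item are hypothetical.

```python
import time

SECONDS_PER_DAY = 24 * 60 * 60

def expiry_timestamp(now_epoch, days_to_live):
    """Return the Unix epoch time (in seconds) at which an item should expire."""
    return int(now_epoch) + days_to_live * SECONDS_PER_DAY

def is_expired(item, now_epoch, ttl_attribute="expires_at"):
    """An item is eligible for deletion once the current time passes its TTL value.
    Items without the TTL attribute never expire."""
    ttl = item.get(ttl_attribute)
    return ttl is not None and now_epoch >= ttl

now = int(time.time())
session = {"session_id": "abc123", "expires_at": expiry_timestamp(now, days_to_live=365)}

fresh = is_expired(session, now)
stale = is_expired(session, now + 366 * SECONDS_PER_DAY)
```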
DynamoDB Streams
DynamoDB Streams are used to replicate the data from one table to another in a different region.
APIs used for data transfer are:
ListStreams: Retrieves a list of stream descriptors for the current account and endpoint
DescribeStream: Retrieves detailed information about a given stream
GetShardIterator: Retrieves a shard iterator
GetRecords: Retrieves the stream records within a given shard

Routing Policies
Routing Policies are used to route the traffic based on the geographic location
from where the DNS query has originated.
Fast and consistent performance
Fully manageable
Fine-grained access control
Highly scalable
Event-driven programming
Flexible in nature
Amazon ElastiCache
Amazon ElastiCache
ElastiCache is an AWS in-memory data store and cache environment. It is used to cache results and reduce overhead and latency on the database.
It is a web service that improves the performance of web applications.
It helps to set up, manage, and scale a distributed in-memory cache environment in the cloud.
It supports two open-source in-memory engines: Redis and Memcached.

Popular Use Cases of ElastiCache: Adtech
Ad serving
Real-time bidding
ID lookup
Session tracking
User profile management

Popular Use Cases of ElastiCache: IoT
Tracking state
Real-time notification
Metadata and readings from millions of devices
Popular Use Cases of ElastiCache: Gaming
Recording game details
Leaderboards
Session information
Usage history
Logs
Popular Use Cases of ElastiCache: Mobile and Web
Storing user profile
Session details
Personalization setting
Entity-specific metadata
Amazon ElastiCache: Redis
Redis is an in-memory data structure store used as a database, cache, and message broker.
It is single-threaded, and its Read Replicas are synced asynchronously.
A collection of one to six Redis nodes is called a shard.
It uses one to 15 shards when cluster mode is enabled and only one shard when it is disabled.
It stores the backups in Amazon S3, with a retention period of 0 to 35 days.

Amazon ElastiCache for Redis: Benefits
Monitoring and management: Simplified administrative tasks
Enhanced Redis Engine: Reliable and efficient open source Redis
Security and compliance: Compliant data protection and help
Scalability: Adjustable usage, based on the needs
Amazon ElastiCache: Memcached
Memcached is used to speed up dynamic, data-driven websites. Hence, it is called a distributed memory caching system.
Memcached is simple to use and is multi-threaded.
A Memcached cluster can have a maximum of 100 nodes in a region.
Memcached supports both horizontal and vertical scaling.
Memcached is fast and is well established.

Benefits of Amazon ElastiCache for Memcached
Extreme Performance
By utilizing an end-to-end optimized stack running on customer nodes, it provides blazing fast performance.
Secure and Hardened
It continuously monitors your nodes and applies the necessary patches to keep your environment safe.
Memcached Compatible
It is compliant with Memcached, so popular tools in use today will work seamlessly with the service.
Easily Scalable
It includes sharding to scale in-memory cache up to 20 nodes and 12.7 TB per cluster.
Fully Managed
You no longer need to perform management tasks, as it monitors your cluster to keep your workloads up and running.
Auto-Discovery
It saves users' time by simplifying the way an application connects to a Memcached cluster.
Amazon ElastiCache Costs
ElastiCache offers a usage-based subscription following a free trial. It provides storage space for one snapshot free of charge for each active ElastiCache for Redis cluster.
Shown below are the node purchasing options supported by ElastiCache:
On-demand nodes: A user pays for memory capacity by the hour that a node runs.
Reserved nodes: A user can choose to make a one-time upfront payment, no upfront payment, or a partial upfront payment with low hourly charges for each reserved node.
Note
Additional backup storage for snapshots is charged at $0.085 per GB every month.
Memcached versus Redis
Characteristics | Memcached | Redis
Description | Is an in-memory key-value store, originally intended for caching | Is an in-memory data structure store, used as database, cache, and message broker
Replication | Does not support replication | Supports master-slave replication
Storage type | Stores variables in memory and retrieves information directly from server instead of DB | Is like a database that resides in memory
Read and Write speed | Good to handle high traffic websites | Can handle both high read traffic and heavy writes
Key length | Has a maximum of 250 bytes | Has a maximum of 2GB
Ideal for | Caching relatively small and static data such as HTML code fragments | Session cache, full page cache (FPC), queues, counting, and more
Key Takeaways
There are three types of databases offered by AWS: Relational, Key-Value, and In-Memory Databases.
Amazon RDS is a web service that helps to set up, operate, and scale a relational database in the AWS Cloud.
Amazon DynamoDB is a fully-managed NoSQL database service that provides high speed and seamless scalability.
There are three planes in DynamoDB API: Control Plane, Data Plane, and DynamoDB Streams.
Amazon ElastiCache is used to cache results and reduce the overhead and latency on the database.
Storing Application Data in MySQL DB using Amazon RDS
Problem Statement:
You are asked to demonstrate joining multiple VPCs together using a Peering Connection and PrivateLink
Tools required:
WAMP Server, AWS RDS, Visual Studio Code
Expected Deliverables:
Screenshots for every step
