Lesson 05 Database On AWS
Databases on AWS
Learning Objectives
A Relational Database is a group of data items having pre-defined relationships with each other.
These items are arranged into a set of tables with rows and columns.
[Figure: four related tables, linked by 1-to-1 relationship markers]
Table 1, ratings: id int, rating int
Table 2, users: id int, first_name varchar, last_name varchar, email varchar
Table 4, tags: id int, tag varchar
Unlabeled table: id int, name varchar, description text
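The table layout above can be tried out locally with Python's built-in sqlite3 module. This is only a minimal sketch: the `user_id` column linking ratings to users and the sample rows are assumptions made for illustration, since the slide does not show which columns carry the relationships.

```python
import sqlite3

# In-memory database; nothing is written to disk.
conn = sqlite3.connect(":memory:")

conn.executescript("""
CREATE TABLE users (
    id         INTEGER PRIMARY KEY,
    first_name VARCHAR,
    last_name  VARCHAR,
    email      VARCHAR
);
CREATE TABLE ratings (
    id      INTEGER PRIMARY KEY,
    rating  INTEGER,
    user_id INTEGER REFERENCES users(id)  -- assumed relationship, not in the slide
);
""")

# Hypothetical sample rows.
conn.execute("INSERT INTO users VALUES (1, 'Ada', 'Lovelace', 'ada@example.com')")
conn.execute("INSERT INTO ratings VALUES (1, 5, 1)")

# The pre-defined relationship lets rows from both tables be combined.
rows = conn.execute("""
    SELECT u.first_name, r.rating
    FROM ratings AS r
    JOIN users AS u ON u.id = r.user_id
""").fetchall()
print(rows)
```

The join is the point of the example: rows and columns live in separate tables, and the relationship is what ties them together at query time.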
Features of Relational Databases
Here are some Relational Database Engines that Amazon RDS offers:
Amazon Aurora
Oracle
MariaDB
Key-Value Databases
A Key-Value Database is a type of Non-relational Database. To store data, it uses a collection of
key-value pairs in which the key acts as a unique identifier.
[Figure: a Products table. The primary key consists of a partition key and a sort key; the remaining columns are attributes.]
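The idea of a key acting as a unique identifier can be sketched with a plain Python dictionary. The class below is a hypothetical in-memory stand-in, not a real database client; the product data is made up for illustration.

```python
from typing import Any, Dict, Optional, Tuple

# The primary key is the (partition key, sort key) pair.
Key = Tuple[str, str]


class KeyValueTable:
    """Toy key-value table: one item per unique primary key."""

    def __init__(self) -> None:
        self._items: Dict[Key, Dict[str, Any]] = {}

    def put(self, partition_key: str, sort_key: str, attributes: Dict[str, Any]) -> None:
        # Writing to an existing key replaces the stored attributes.
        self._items[(partition_key, sort_key)] = dict(attributes)

    def get(self, partition_key: str, sort_key: str) -> Optional[Dict[str, Any]]:
        # Lookup goes straight to the key; no other item is examined.
        return self._items.get((partition_key, sort_key))


products = KeyValueTable()
products.put("book", "isbn-001", {"title": "Example Book", "price": 12})
products.put("book", "isbn-001", {"title": "Example Book", "price": 15})  # same key: overwrite
print(products.get("book", "isbn-001"))
```

Because the key is unique, the second `put` replaces the first item instead of creating a duplicate.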
Application
Master Server
Caching
• Provides CPU, memory, IOPS, and storage that can each be scaled independently
• Looks after software patching, updates, backups, recovery, and automatic failure detection
• Facilitates creating backups automatically, or manually via snapshots
• Has a primary instance and a synchronous secondary instance to provide high availability and avoid failure
MySQL
MariaDB
Microsoft SQL Server
Oracle
Amazon Aurora
Cost-effective
An Amazon Aurora storage volume consists of 10 GB logical blocks. It can scale
up to 64 TB when required.
Crash Recovery
E-Commerce Applications
[Figure: VPC A with RDS instance R1 and EC2 instance A, VPC B with RDS instance R2 and EC2 instance B, and an S3 bucket]
Amazon RDS offers automated backups, point-in-time restores, and database snapshots.
Benefits
Enhanced durability
Increased availability
Automatic failover
Failover Conditions
Amazon RDS automatically switches from a primary DB instance to a standby replica in
another Availability Zone whenever one or more of the following conditions occur:
The normal failover time is 60–120 seconds. This may be exceeded in case of a heavy
recovery process.
[Figure: failover across Availability Zone A and Availability Zone B. When the primary database fails, the application servers switch to the standby, which becomes the new primary, and a new standby is created.]
Read Replicas in Amazon RDS
Read Replicas are one or more read-only copies of a DB instance that serve
high-volume read traffic, taking that load off the primary instance.
• Any Amazon RDS activity that is initiated runs only in the current default region.
[Figure: the application server sends read/write traffic to the primary, while the BI/reporting application reads from the read replica.]
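The split between a read/write primary and read replicas can be sketched as a small router in application code. This is an illustrative sketch only: the endpoint names are hypothetical, and treating every statement that starts with SELECT as a read is a simplifying assumption.

```python
import itertools
from typing import List


class ReadWriteRouter:
    """Send writes to the primary and spread reads across the replicas."""

    def __init__(self, primary: str, replicas: List[str]) -> None:
        self.primary = primary
        # Fall back to the primary when no replica exists.
        self._readers = itertools.cycle(replicas or [primary])

    def endpoint_for(self, sql: str) -> str:
        is_read = sql.lstrip().upper().startswith("SELECT")
        return next(self._readers) if is_read else self.primary


router = ReadWriteRouter(
    "primary.example.internal",
    ["replica-1.example.internal", "replica-2.example.internal"],
)
first_read = router.endpoint_for("SELECT * FROM orders")
second_read = router.endpoint_for("select count(*) from orders")
write = router.endpoint_for("UPDATE orders SET status = 'shipped' WHERE id = 1")
print(first_read, second_read, write)
```

Round-robin is just one possible way to spread reads; the important part is that writes never reach a replica.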
Costs of Amazon RDS
Amazon RDS uses pay-as-you-go pricing: you pay only for what you use. The table below lists
the billing procedure for various parameters:
DynamoDB is a fully managed NoSQL database that supports key-value and document data.
The record in each row is known as an item. A TTL (Time to Live) can be set to
automatically delete items from the table once they expire.
Operations such as create, insert, update, query, scan, and delete are
performed in the table via appropriate API.
For faster performance and data durability, the table data is stored on SSDs
and spread across many servers in different Availability Zones.
Use Cases of Amazon DynamoDB
Ad tech
Gaming
DynamoDB supports both Eventually Consistent Reads and Strongly Consistent Reads.
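In the DynamoDB read APIs the consistency model is chosen per request with the boolean `ConsistentRead` parameter, and reads are eventually consistent when it is left out. The sketch below only builds the request parameters as a dictionary and makes no AWS call; the table name and key are hypothetical.

```python
from typing import Any, Dict


def get_item_params(
    table: str,
    key: Dict[str, Dict[str, str]],
    strongly_consistent: bool = False,
) -> Dict[str, Any]:
    """Build GetItem request parameters for the chosen consistency model."""
    return {
        "TableName": table,
        "Key": key,  # attribute values in DynamoDB JSON, e.g. {"S": "text"}
        "ConsistentRead": strongly_consistent,
    }


eventual = get_item_params("Products", {"id": {"S": "p-100"}})
strong = get_item_params("Products", {"id": {"S": "p-100"}}, strongly_consistent=True)
print(eventual["ConsistentRead"], strong["ConsistentRead"])
```

With an AWS SDK installed and credentials configured, a dictionary like this could be passed to the SDK's GetItem call, for example `client.get_item(**strong)` in boto3.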
DynamoDB executes all the tasks needed to create identical tables in the
specified regions and distributes ongoing data changes to all of them.
How DynamoDB works
The cost for using DynamoDB depends on the charges for reading, writing, and storing data
in DynamoDB tables, and for optional features, if any.
DynamoDB has two capacity modes that have specific billing options:
On-demand: charges for the data reads and writes the application performs on the tables
Provisioned: charges according to the number of reads and writes per second specified by the user
DynamoDB Use Case: Duolingo
Duolingo is a popular language-learning website and mobile app that delivers lessons for
80 languages. Duolingo uses DynamoDB to store around 31 billion items.
Problem Statement: Create a table using the DynamoDB Console. Duration: 15 mins
Assisted Practice: Guidelines
An index is a data structure that allows the user to perform fast queries on
specific columns in a table.
01 Local Secondary Index
02 Global Secondary Index
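Why an index makes queries fast can be shown with a toy example: without an index every item has to be examined (a scan), while with an index the matching keys are found by direct lookup (a query). This is a conceptual sketch in plain Python, not DynamoDB's implementation; the items and the `city` attribute are made up.

```python
from collections import defaultdict

# Primary key -> item.
items = {
    1: {"name": "Asha", "city": "Pune"},
    2: {"name": "Ben", "city": "Oslo"},
    3: {"name": "Chen", "city": "Pune"},
}


def scan_by_city(city):
    # Without an index, every item in the table is examined.
    return sorted(pk for pk, item in items.items() if item["city"] == city)


# Build the secondary index once: attribute value -> primary keys.
city_index = defaultdict(list)
for pk, item in items.items():
    city_index[item["city"]].append(pk)


def query_by_city(city):
    # With the index, the matching keys are found by direct lookup.
    return sorted(city_index.get(city, []))


print(scan_by_city("Pune"), query_by_city("Pune"))
```

Both functions return the same keys; the difference is how much data has to be read to find them, which grows with table size for the scan but not for the indexed lookup.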
Scan vs Query API Call
Control Plane
Data Plane
DynamoDB Streams
Control Plane Operations
CreateTable
DescribeTable
UpdateTable
DeleteTable
ListTables
DescribeLimits
Data Plane
Creating data
Reading data
Updating data
Deleting data
Throughput Capacity
Throughput capacity is the number of reads and writes per second that a table can serve.
It is specified in read capacity units and write capacity units.
A Read Capacity unit represents one Strongly Consistent Read per second,
or two Eventually Consistent Reads per second, for an item up to 4 KB in size.
A Write Capacity unit represents one write per second for an item up to
1 KB in size.
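The two definitions above translate directly into arithmetic. The sketch below estimates the capacity units a workload needs, assuming 1 KB = 1024 bytes and that larger items are rounded up to the next 4 KB (reads) or 1 KB (writes) step; the function names are hypothetical.

```python
import math


def read_capacity_units(item_size_bytes, reads_per_second, strongly_consistent=True):
    """Estimate read capacity units for a steady read workload."""
    units_per_read = math.ceil(item_size_bytes / 4096)  # round up to 4 KB steps
    total = units_per_read * reads_per_second
    if not strongly_consistent:
        total = total / 2  # one unit covers two eventually consistent reads
    return math.ceil(total)


def write_capacity_units(item_size_bytes, writes_per_second):
    """Estimate write capacity units for a steady write workload."""
    return math.ceil(item_size_bytes / 1024) * writes_per_second  # 1 KB steps


# A 6 KB item read 10 times per second, and a 1.5 KB item written 10 times per second.
strong_rcu = read_capacity_units(6 * 1024, 10)
eventual_rcu = read_capacity_units(6 * 1024, 10, strongly_consistent=False)
wcu = write_capacity_units(1536, 10)
print(strong_rcu, eventual_rcu, wcu)
```

The 6 KB item needs two 4 KB steps per read, so 10 strongly consistent reads per second cost 20 units and the eventually consistent equivalent costs half of that.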
Fully managed
Highly available
10 times faster
In-memory cache
DynamoDB Transactions
TransactWriteItems API
TransactWriteItems is a batch operation that contains a write set with one or more PutItem,
UpdateItem, and DeleteItem operations. It can optionally check prerequisite conditions that
must be satisfied before the updates are made.
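The all-or-nothing behaviour of such a write set can be sketched in memory: stage every operation on a copy, and commit only if every prerequisite check passes. This is a conceptual illustration, not the DynamoDB API; the operation format, the `TransactionCanceled` exception, and the account data are hypothetical, and the sketch assumes each key appears at most once per transaction.

```python
import copy


class TransactionCanceled(Exception):
    """Raised when a prerequisite check fails; no write is applied."""


def transact_write(table, operations):
    """Apply every operation in the write set, or none of them."""
    staged = copy.deepcopy(table)  # work on a copy until all checks pass
    for op in operations:
        kind, key = op["type"], op["key"]
        condition = op.get("condition")
        if condition is not None and not condition(staged.get(key)):
            raise TransactionCanceled(f"condition failed for {key!r}")
        if kind == "put":
            staged[key] = op["item"]
        elif kind == "update":
            staged.setdefault(key, {}).update(op["changes"])
        elif kind == "delete":
            staged.pop(key, None)
        elif kind != "check":  # "check" only evaluates its condition
            raise ValueError(f"unknown operation type: {kind}")
    # Commit: replace the table contents in one step.
    table.clear()
    table.update(staged)


accounts = {"alice": {"balance": 100}, "bob": {"balance": 20}}
transact_write(accounts, [
    {"type": "update", "key": "alice", "changes": {"balance": 70},
     "condition": lambda item: item is not None and item["balance"] >= 30},
    {"type": "update", "key": "bob", "changes": {"balance": 50}},
])
print(accounts)
```

If any condition fails, the exception is raised before the commit step, so the original table is left untouched.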
Idempotency
It is an optional feature that prevents application errors when the same request is
submitted multiple times because of connection time-outs or network errors.
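One common way to get this behaviour is to attach a client-generated token to each request, so that a retry carrying the same token is recognised and not applied twice. The class below is a hypothetical in-memory sketch of that idea, not DynamoDB's implementation; all names are made up for illustration.

```python
import uuid


class IdempotentWriter:
    """Apply each request token at most once; replays return the first result."""

    def __init__(self):
        self.table = {}
        self.applied = 0
        self._seen = {}  # token -> result of the first call

    def write(self, token, key, item):
        if token in self._seen:
            return self._seen[token]  # retry: no second write
        self.table[key] = item
        self.applied += 1
        result = {"status": "ok", "key": key}
        self._seen[token] = result
        return result


writer = IdempotentWriter()
token = str(uuid.uuid4())  # generated once per logical request
first = writer.write(token, "order-1", {"total": 40})
retry = writer.write(token, "order-1", {"total": 40})  # e.g. after a time-out
print(writer.applied, first == retry)
```

The client must reuse the same token for every retry of one logical request and generate a fresh token for each new request.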
Working of DynamoDB Transactions
TransactGetItems API
TransactGetItems is a batch operation that contains a read set with one or more GetItem
operations. If it is issued on an item that is part of an active write transaction, the read
transaction is cancelled. It can include up to 25 unique items or 4 MB of data.
DynamoDB Transactions
Within a transaction, a conflict can occur during concurrent item-level requests on the same
item.
Amazon DynamoDB Time to Live (TTL) supports defining a per-item timestamp. This helps to
determine when an item is no longer needed.
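A per-item timestamp of this kind is stored as a number holding Unix epoch time in seconds. The sketch below shows how an application might compute such a timestamp and test it; the attribute name `expires_at` and the helper names are hypothetical, and in DynamoDB itself expired items are removed by a background process rather than at the exact moment of expiry.

```python
import time


def expires_at(ttl_seconds, now=None):
    """Return the epoch-seconds timestamp at which an item should expire."""
    now = time.time() if now is None else now
    return int(now) + ttl_seconds


def is_expired(item, now=None, ttl_attribute="expires_at"):
    """True when the item carries a TTL timestamp that has passed."""
    now = time.time() if now is None else now
    timestamp = item.get(ttl_attribute)
    return timestamp is not None and timestamp <= now


# A session that should live for one hour from a fixed reference time.
session = {"id": "s-1", "expires_at": expires_at(3600, now=1_700_000_000)}
print(session["expires_at"], is_expired(session, now=1_700_000_000))
```

Items without the TTL attribute are treated as never expiring, which is why `is_expired` returns False for them.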
TTL Features
DynamoDB Streams capture item-level changes made to a table. One use is replicating
the data from one table to another in a different region.
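Replication from a stream amounts to replaying change records, in order, against the target table. The sketch below uses a simplified record shape with the stream event names INSERT, MODIFY, and REMOVE; real stream records nest the keys and images under a `dynamodb` field and use DynamoDB JSON, so the flat layout and the `id` key here are assumptions.

```python
def apply_stream_record(replica, record):
    """Apply one change record to the replica table (a dict keyed by id)."""
    key = record["Keys"]["id"]
    if record["eventName"] in ("INSERT", "MODIFY"):
        replica[key] = record["NewImage"]  # item as it looks after the change
    elif record["eventName"] == "REMOVE":
        replica.pop(key, None)


# Hypothetical change records, in the order they occurred on the source table.
records = [
    {"eventName": "INSERT", "Keys": {"id": "1"}, "NewImage": {"id": "1", "name": "old"}},
    {"eventName": "MODIFY", "Keys": {"id": "1"}, "NewImage": {"id": "1", "name": "new"}},
    {"eventName": "INSERT", "Keys": {"id": "2"}, "NewImage": {"id": "2", "name": "temp"}},
    {"eventName": "REMOVE", "Keys": {"id": "2"}},
]

replica = {}
for record in records:
    apply_stream_record(replica, record)
print(replica)
```

Order matters: applying the same records in a different sequence could leave the replica in a different state than the source.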
ListStreams: Retrieves a list of stream descriptors for the current account and endpoint
Geolocation routing policies route traffic based on the geographic location
from which the DNS query originated.
Fast and consistent performance
Fully manageable
Fine-grained access control
ElastiCache is an AWS in-memory data store and cache environment. It is used to cache results
and reduce overhead and latency on the database.
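A common way to use such a cache is the cache-aside pattern: look in the cache first, and go to the database only on a miss. The sketch below uses a local dictionary as a stand-in for the cache and a hypothetical `slow_query` function as a stand-in for the database, so it illustrates the pattern rather than an ElastiCache client.

```python
import time


class CacheAside:
    """Serve repeated reads from memory; load from the database on a miss."""

    def __init__(self, load_from_db, ttl_seconds=60.0):
        self._load = load_from_db
        self._ttl = ttl_seconds
        self._cache = {}  # key -> (expiry time, value)

    def get(self, key):
        entry = self._cache.get(key)
        if entry is not None and entry[0] > time.monotonic():
            return entry[1]  # cache hit: no database call
        value = self._load(key)  # cache miss: query the database
        self._cache[key] = (time.monotonic() + self._ttl, value)
        return value


db_calls = []


def slow_query(key):
    # Hypothetical stand-in for an expensive database query.
    db_calls.append(key)
    return f"row-for-{key}"


cache = CacheAside(slow_query)
first = cache.get("user:1")
second = cache.get("user:1")
print(first, second, len(db_calls))
```

The second read is answered from memory, which is where the reduction in database overhead and latency comes from.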
Ad serving
Real-time bidding
ID lookup
Session tracking
Tracking state
Real-time notification
Leaderboards
Session information
Usage history
Logs
Popular Use Cases of ElastiCache: Mobile and Web
Session details
Personalization setting
Entity-specific metadata
Amazon ElastiCache: Redis
A shard is a collection of one to six Redis nodes.
A cluster uses one to 15 shards when cluster mode is enabled and only one
shard when it is disabled.
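How keys end up on different shards can be sketched with the hash-slot scheme Redis Cluster uses: the CRC16 of the key modulo 16384 gives a slot, and each shard owns a range of slots. In this sketch the equal, contiguous slot ranges per shard are an assumption (real clusters can assign slots freely), hash tags in keys are ignored, and the function names are hypothetical.

```python
import binascii

HASH_SLOTS = 16384  # number of hash slots in Redis Cluster


def hash_slot(key: str) -> int:
    """CRC16 (XMODEM variant) of the key, modulo the number of slots."""
    return binascii.crc_hqx(key.encode("utf-8"), 0) % HASH_SLOTS


def shard_for(key: str, shard_count: int) -> int:
    """Map a key to a shard, assuming equal contiguous slot ranges per shard."""
    return hash_slot(key) * shard_count // HASH_SLOTS


for key in ("user:1", "user:2", "session:abc"):
    print(key, hash_slot(key), shard_for(key, 3))
```

With cluster mode disabled there is only one shard, so `shard_for(key, 1)` is always 0.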
Memcached is a distributed memory caching system used to speed up dynamic,
data-driven websites.
Memcached compatible
Auto-Discovery
ElastiCache offers usage-based pricing following a free trial. It provides storage space
for one snapshot free of charge for each active ElastiCache for Redis cluster.
Shown below is a list of node types supported by ElastiCache:
Read and write speed
• Good for handling high-traffic websites
• Can handle neither high read traffic nor heavy writes
Ideal for
• Caching relatively small and static data such as HTML code fragments
• Session cache, full page cache (FPC), queues, counting, and more
Key Takeaways
Problem Statement:
You are asked to demonstrate joining multiple VPCs together using a peering
connection and PrivateLink
Tools required:
WAMP Server, AWS RDS, Visual Studio Code
Expected Deliverables:
Screenshots for every step