
Safe Harbor

The preceding is intended to outline our general product direction. It is intended for information purposes
only, and may not be incorporated into any contract. It is not a commitment to deliver any material, code,
or functionality, and should not be relied upon in making purchasing decisions. The development,
release, timing, and pricing of any features or functionality described for Oracle’s products may change
and remains at the sole discretion of Oracle Corporation.

Statements in this presentation relating to Oracle’s future plans, expectations, beliefs, intentions and
prospects are “forward-looking statements” and are subject to material risks and uncertainties. A detailed
discussion of these factors and other risks that affect our business is contained in Oracle’s Securities and
Exchange Commission (SEC) filings, including our most recent reports on Form 10-K and Form 10-Q
under the heading “Risk Factors.” These filings are available on the SEC’s website or on Oracle’s website
at http://www.oracle.com/investor. All information in this presentation is current as of September
2019 and Oracle undertakes no duty to update any statement in light of new information or future events.

Copyright © 2019 Oracle and/or its affiliates.


Exadata Database Machine:
Maximum Availability Architecture (MAA)

Technical Presentation

April, 2020

Program Agenda

1. Exadata & Maximum Availability Architecture
2. MAA Reference Architectures
3. MAA Features in Exadata
4. MAA Exadata Lifecycle Operations
5. Summary

Copyright © 2020 Oracle and/or its affiliates.


Exadata & Maximum Availability Architecture


MAA Solutions: On-Premises to Cloud

• Autonomous Database
• DBCS/ExaCS/ExaCC
• On-Premises Exadata and Recovery Appliance
• On-Premises

Adding MAA Config and Life Cycle Operations, Shifting Admin Ownership to Oracle with MAA SLAs
MAA integrated Engineered Systems (config practices, exachk, lowest brownouts, HA QoS, data protection)
MAA Reference Architectures and Best Practices

Impact of Database Downtime

• $350K: average cost of downtime per hour
• $10M: average cost of unplanned data center outage or disaster
• 87 hours: average amount of downtime per year
• 91%: percentage of companies that have experienced an unplanned data center outage in the last 24 months

Source: Gartner, Data Center Knowledge, IT Process Institute, Forrester Research


High Availability (HA) Business Challenges

• Eliminate risk of downtime and data loss
• Improve service while increasing return on investment


Exadata Addressing High Availability Challenges
Protection from Planned & Unplanned Outages

Type of Outage: Planned Outages
• Challenge: Disruptive schema changes due to application changes to meet ever-changing business requirements
  Protection using Exadata: Schema change impacts are greatly reduced with faster changes, index and object rebuilds, and reorganizations
• Challenge: Downtime required for lifecycle management like periodic upgrades of firmware & software, data migration
  Protection using Exadata: Downtime required for lifecycle management is mitigated using fast online upgrades, patching automation with service migration, standby first patching, zero downtime migration

Type of Outage: Unplanned Outages
• Challenge: Data corruptions due to hardware/software faults, media issues
  Protection using Exadata: Data corruptions are prevented or the potential downtime is reduced dramatically with additional corruption prevention, detection and auto-repair
• Challenge: Application brownouts due to server, instance, or storage failures, or due to planned maintenance
  Protection using Exadata: Application brownout reduced to sub-second with fastest instance recovery
• Challenge: Disaster Recovery (DR) challenges where the DR site is not keeping up with Production
  Protection using Exadata: DR challenges are mitigated with fastest redo apply, resulting in low Recovery Time Objective


Exadata: Hardware + Software + Database + Availability
Decades of Database Innovation Proven at Millions of Mission-Critical Deployments

Oracle Database Innovations
• Multitenant
• In-Memory DB
• Real Application Clusters
• Active Data Guard
• Partitioning
• Advanced Compression
• Advanced Security, Label Security, DB Vault
• Real Application Testing
• Advanced Analytics, Spatial and Graph
• Management Packs for Oracle Database

Exadata DB Machine Innovations
• Offload SQL to Storage
• InfiniBand Fabric
• PCI Flash
• Smart Flash Cache, Log
• Storage Indexes
• Columnar Flash Cache
• Hybrid Columnar Compression (HCC), Compression 10:1
• I/O Resource Management
• Network Resource Management
• In-Memory Fault Tolerance
• Exafusion Direct-to-Wire Protocol

Redundant Optimized Hardware

Oracle Exadata Advantage

• Ideal Database Hardware: Leading edge enterprise-grade components for maximum performance and value
• Smart System Software: Database-aware algorithms vastly improve the effectiveness of ALL workloads
• Automation: Automated infrastructure integrated with Oracle Autonomous Database

Identical On-Premises and Cloud

Oracle Exadata Cloud Offerings

• Exadata Cloud at Customer: In Data Center of Customer’s Choice
• Exadata Public Cloud Service: In Oracle Public Cloud Data Centers

Diagram layers: Flexible Subscription Model; Database PaaS Services; Secure Virtual Networks; Cloud Security and Hardening; Oracle-Managed Exadata Infrastructure; Core Exadata Platform


Gen 2 Exadata Cloud @ Customer
What’s New

• Gen 2 public cloud manages Gen 2 Exadata Cloud at Customer
– Eliminates additional control plane rack in customer data center
– Simpler, lower cost, faster time to value
• New Exadata Cloud at Customer X8 hardware
– Faster CPUs, more cores, more storage than ExaCC X7
• Simpler connectivity to customer network
– Adapts to customer networking standards and requirements
• Now supports Oracle Database 19c
– Long-term support for the 12.2 family
• Ready for Autonomous Database at Customer

Diagram: Public Cloud UI and Management, connected through a Secure Tunnel to the Customer Data Center

Runs the best database on the best platform in the best Cloud in your data center


Exadata: Built-in High Availability
• Redundant Database Servers
– Active-Active highly available clustered servers
– Hot-swappable power supplies and fans
– Redundant power distribution units
– Integrated HA software/firmware stack

• Redundant Network
– Redundant 40Gb/s IB connections and switches
– Client access using HA bonded networks
– Integrated HA software/firmware stack

• Redundant Storage Grid


– Data mirrored across storage servers
– Redundant, non-blocking I/O paths
– Integrated HA software/firmware stack



Oracle Maximum Availability Architecture (MAA)
High Availability, Disaster Recovery and Data Protection

• Applying 30+ years of lessons learned in solving the toughest HA problems around the world
• Solutions to reduce downtime for planned & unplanned outages for Enterprise customers with the most demanding workloads and requirements
• Service level oriented MAA reference architectures
• Books, white papers, blueprints
• MAA integrated Engineered Systems
• Continuous feedback into products

Diagram: Production database and Copy database, linked by Replication

https://oracle.com/goto/maa


Oracle Maximum Availability Architecture (MAA)

Customer Insights & Expert Recommendations

Reference Architectures
• Bronze
• Silver
• Gold
• Platinum

HA Features, Configurations & Operational Practices
• Continuous Availability: Application Continuity, Global Data Services
• Data Protection: RMAN + ZDLRA, Flashback
• Active Replication (Production Site to Replicated Site): Active Data Guard, GoldenGate
• Scale Out: RAC, ASM, Sharding

Deployment Choices
• Generic Systems
• Engineered Systems
• DBCS, ExaCS/ExaCC
• Autonomous DB

Exadata Maximum Availability Architecture
Designed and Tested to Handle All Failure Scenarios

Within Exadata
• Redundant Hardware: Servers, Disks, Flash, Network, Power
• Redundant Software: Active clusters, Disk/flash mirroring
• Online patching, reconfiguration, expansion

Within a Site (LAN)
• Local standby for HA Failover
• Redundant Systems
• Redundant Databases

Across Sites (WAN)
• Remote standby for Disaster Recovery
• Redundant Systems
• Redundant Databases

Redo-based change replication with data consistency checking

Best MAA Database Platform | Fastest RAC Instance and Node Failure Recovery | Fastest Backup - RMAN Offload to Storage
Deep ASM Mirroring Integration | Fastest Data Guard Redo Apply | Complete Failure Testing with Lowest Brownouts
Frequently Updated Health Checks

Exadata MAA Evolution

On-Premises
• Customer: Infrastructure Management; Architecture; Configuration, Tuning; Database Management; Lifecycle Operations; Application Performance
• Oracle: Blueprints; Feedback to products & features

On-Premises Exadata
• Customer: Infrastructure Management; Architecture; Database Management; Configuration, Tuning; Lifecycle operations; Application Performance
• Oracle: Exadata is the best integrated MAA DB platform; Blueprints

Exadata Cloud
• Customer: Architecture; Database Management (Tooling); Configuration, Tuning; Lifecycle Operations (Tooling); Application Performance
• Oracle: Oracle owns and manages the best integrated MAA DB platform; Cloud automation for provisioning and life cycle operations

Autonomous Database
• Customer: Choosing the SLA policy; Application performance
• Oracle: Oracle owns and manages Infrastructure; Policy driven deployments; MAA Integrated cloud; Fully automated Self-Driving, Self-Securing, Self-Repairing Database


Oracle Enterprise Manager Cloud Control
(OEM)
Configuration, Monitoring, Alerting and Management

• Exadata Database Machine


• Data Guard / Active Data Guard
• Multitenant
• Zero Data Loss Recovery Appliance (ZDLRA)
• Recovery Manager (RMAN)
• Real Application Clusters (RAC)
• Edition Based Redefinition (EBR)
• Oracle Sharding
• Oracle GoldenGate (OGG) – Monitoring and Alerting Only



MAA Reference Architectures


Reference Architectures – Level Set

Blueprints developed and certified by Oracle


Validated by 10,000s of Oracle Customers
Capabilities carry forward as you progress from one tier to the next
Achieving stated service levels requires:
• Utilization of prescribed features and capabilities
• Utilization of prescribed configuration and operational best practices
• Due diligence during pre-production testing
• Due diligence on all life cycle operations
• Maintaining recommended patch levels and versions



Oracle Maximum Availability Architecture (MAA) Solution Options


BRONZE: Single Instance Database
Dev, Test, Prod - Single Instance or Multitenant Database with Backups

Diagram: Primary Availability Domain (Local Backup), Secondary Availability Domain (Replicated Backups)

• Single Instance with Clusterware Restart
• Advanced backup/restore with RMAN
• Optional ZDLRA with incremental forever and near zero RPO
• Storage redundancy and validation with ASM
• Multitenant Database/Resource Management with PDB features
• Online Maintenance
• Some corruption protection
• Flashback technologies

Outage Matrix (RTO / RPO*)
Unplanned Outage
• Recoverable node or instance failure: Minutes **
• Disasters (corruptions and site failures): Hours to days. RPO since last backup or near zero with ZDLRA
Planned Maintenance
• Software/hardware updates: Minutes **
• Major database upgrade: Minutes to hour

* RPO=0 unless explicitly specified
** Exadata systems have RAC but Bronze Exadata configuration with Single Instance database running with Oracle Clusterware has highest consolidation density to reduce costs

Zero Data Loss Recovery Appliance in Your Data Center

Diagram: Protected Databases send a Delta Push to the Recovery Appliance, which offloads Tape Backup and replicates to a Remote Recovery Appliance, under Unified Management

Delta Push
• Send only incremental changes and no more full backups
• Real-time transactions copied over for continuous data protection

Protects all DBs in Data Center
• Petabytes of data
• Oracle 10.2-18c, any platform
• No expensive DB backup agents
• Enterprise Manager end-to-end control

Delta Store
• Stores validated, compressed data on disk
• Fast restores to any point-in-time
• Built on Exadata scaling and resilience

Backing up Exadata

Storage Expansion Rack and X8-2 Extended (XT), over the InfiniBand Network
• Fastest Backup and Restore
• ILM Historical Archive
• Second Disk Group
• Expansion of DATA

Database Backup Cloud Service, over the Public Network
• Offsite Storage
• Low Cost

Recovery Appliance, over IB, 10GigE, or 25GigE
• Delta Push & Backup Validation
• Incremental Forever
• Zero Data Loss Recoverability

Tape library, through a Media Server (10GigE or 25GigE) and Fiber Channel SAN
• Offsite Backups
• Vaulting


SILVER: RAC Database
Prod/Departmental

Diagram: Primary Availability Domain (Local Backup), Secondary Availability Domain (Replicated Backups)

Bronze +
• Real Application Clusters (RAC)
• Application Continuity

Checklist found in MAA OTN:
https://www.oracle.com/technetwork/database/options/clustering/applicationcontinuity/adb-continuousavailability-5169724.pdf

Outage Matrix (RTO/RPO*)
Unplanned Outage
• Recoverable node or instance failure: Zero**
• Disasters (corruptions and site failures): Hours to days. RPO since last backup or near zero with ZDLRA
Planned Maintenance
• Software/hardware updates: Zero**
• Major database upgrade: Minutes to hour

* RPO=0 unless explicitly specified
** To achieve zero, requires applying application checklist

Oracle Real Application Clusters (Oracle RAC)
Node Failure, Instance Failure, Rolling Maintenance

• Utilizes two or more instances of an Oracle Database concurrently
• Very Scalable
– All instances active; Add capacity online; Ideal for database consolidation
• Highly Available
– Auto-failover of services to an already running instance; Outage is transparent to user, in-flight transactions succeed; Zero downtime rolling maintenance

Diagram: Application Tier, Database Services, Database Tier, Primary Database


Transparent Application Continuity (TAC)
Application does not see errors during outages

• Uses Application Continuity and Oracle Real Application Clusters
• Transparently tracks and records session information in case there is a failure
• Built inside of the database, so it works without any application changes
• Rebuilds session state and replays in-flight transactions upon unplanned failure
• Planned maintenance can be handled by TAC to drain sessions from one or more nodes
• Adapts as applications change: protected for the future

Diagram: Request passes through Transparent Application Continuity; Errors/Timeouts hidden


Transparent Application Continuity Explained

Normal Operation
• Client marks requests: explicit and discovered.
• Server tracks session state, decides which calls to replay, disables side effects.
• Directed, client holds original calls, their inputs, and validation data.

Failover Phase 1: Reconnect
• Checks replay is enabled
• Verifies timeliness
• Creates a new connection
• Checks target database is legal for replay
• Uses Transaction Guard to guarantee commit outcome

Failover Phase 2: Replay
• Restores and verifies the session state
• Replays held calls, restores mutables automatically
• Ensures results, states, messages match original.
• On success, returns control to the application

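The three phases of Transparent Application Continuity can be pictured with a toy model. The sketch below is only an illustration of the hold, reconnect-check, and replay-verify idea; the class and function names (`HeldCall`, `Request`, `failover`) are hypothetical and are not part of any Oracle driver or API, and the `committed` flag merely stands in for the commit outcome that Transaction Guard would report.

```python
from dataclasses import dataclass, field
from typing import Callable, List


@dataclass
class HeldCall:
    """A call the client holds during normal operation (hypothetical model)."""
    func: Callable[[dict], object]   # the original call, applied to session state
    original_result: object          # validation data captured on first execution


@dataclass
class Request:
    replay_enabled: bool = True
    held_calls: List[HeldCall] = field(default_factory=list)

    def run(self, session: dict, func: Callable[[dict], object]) -> object:
        """Normal operation: execute the call and hold it for possible replay."""
        result = func(session)
        self.held_calls.append(HeldCall(func, result))
        return result


def failover(request: Request, new_session: dict, committed: bool) -> bool:
    """Return True if replay succeeded and control can return to the application."""
    # Phase 1 (reconnect): replay must be enabled and the in-flight work
    # must not already be committed, otherwise replay would duplicate it.
    if not request.replay_enabled or committed:
        return False
    # Phase 2 (replay): re-run held calls on the new session and check that
    # each result matches what the application originally saw.
    for call in request.held_calls:
        if call.func(new_session) != call.original_result:
            return False
    return True
```

In this model a replay is rejected as soon as one replayed result differs from the original, which mirrors the slide's point that results, states, and messages must match before control returns to the application.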
Checklist for Achieving Zero Application Downtime

1. Use Oracle Clusterware Service (never use default service)
2. Use Recommended Connection String
3. Configure FAN for Connection Pool
4. Drain your service
5. Use Application Continuity or Transparent Application Continuity

References:
1) MAA Whitepaper: Application Checklist for Continuous Service for MAA Solutions
2) Using RHPhelper to Minimize Downtime During Planned Maintenance on Exadata (MOS 2385790.1)
3) Fleet Patch and Provisioning incorporates MAA practices

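Step 2 of the checklist refers to a connection string with retry and timeout settings. As a rough sketch, the helper below assembles an Oracle Net connect descriptor from those settings. `CONNECT_TIMEOUT`, `RETRY_COUNT`, `RETRY_DELAY`, `TRANSPORT_CONNECT_TIMEOUT`, and `LOAD_BALANCE` are Oracle Net descriptor parameters, but the default values, the function name, and the host and service names used here are illustrative assumptions; take the recommended values from the MAA whitepaper listed above.

```python
def build_connect_descriptor(service_name, scan_hosts, port=1521,
                             connect_timeout=90, retry_count=50,
                             retry_delay=3, transport_connect_timeout=3):
    """Build a connect descriptor with one ADDRESS_LIST per SCAN host.

    All default values are assumptions for illustration only.
    """
    address_lists = "".join(
        "(ADDRESS_LIST=(LOAD_BALANCE=on)"
        f"(ADDRESS=(PROTOCOL=TCP)(HOST={host})(PORT={port})))"
        for host in scan_hosts
    )
    return (
        "(DESCRIPTION="
        f"(CONNECT_TIMEOUT={connect_timeout})(RETRY_COUNT={retry_count})"
        f"(RETRY_DELAY={retry_delay})"
        f"(TRANSPORT_CONNECT_TIMEOUT={transport_connect_timeout})"
        f"{address_lists}"
        # Connect to a Clusterware-managed service, never the default service.
        f"(CONNECT_DATA=(SERVICE_NAME={service_name})))"
    )


# Hypothetical service and SCAN host names.
descriptor = build_connect_descriptor(
    "sales_svc",
    ["primary-scan.example.com", "standby-scan.example.com"],
)
```

Listing the primary and standby SCAN addresses in separate address lists lets the same string keep working after a Data Guard role change, provided the service is only started on the current primary.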
GOLD: Mission Critical

Diagram: Primary Region with AD1 (Primary, Local backup) and AD2 (Local Standby, Local backup), Data Guard FSFO between them; Secondary Region (Remote Standby, Local backup)

Silver +
• Active Data Guard
• Comprehensive Data Protection

MAA Architecture:
• At least one standby required across AD or region.
• Primary in one data center (or AD) replicated to a Standby in another data center
• Active Data Guard Fast-Start Failover (FSFO)
• Local backups on both primary and standby

Outage Matrix (RTO/RPO*)
Unplanned Outage
• Recoverable node or instance failure: Seconds
• Disasters (corruptions and site failures): Seconds. RPO zero or seconds
Planned Maintenance
• Software/hardware updates: Zero
• Major database upgrade: Seconds

* RPO=0 unless explicitly specified
** To achieve zero, requires applying application checklist

Storage Remote Mirroring Architecture
Generic - Must Transmit Writes to All Files
…. INCLUDING CORRUPTED BLOCKS OR BAD DATA

Diagram: Primary Database with Oracle Instance (in memory), SYNC or ASYNC block replication to Mirrored Volumes

• Zero Oracle validation
• 7x network volume
• 27x network I/O


Data Guard Addresses Shortcomings of Storage Replication
Inadequate isolation, zero application-level validation

“…when something happens in the I/O stack and a database write is malformed Symmetrix A happily replicates the faulty data to site B and the corruption goes undetected”
EMC BLOG with Integrity


Oracle Data Protection
Gold – Comprehensive Data Protection

Manual checks
• Dbverify, Analyze
– Physical block corruption: Physical block checks
– Logical block corruption: Logical checks for intra-block and inter-object consistency
• RMAN, ASM
– Physical block corruption: Physical block checks
– Logical block corruption: Intra-block logical checks

Runtime checks
• Active Data Guard
– Physical block corruption: Continuous physical block checking at standby; Strong isolation to prevent single point of failure; Automatic repair of physical corruptions; Automatic database failover (option for lost writes)
– Logical block corruption: Detect lost write corruption, auto shutdown and failover; Intra-block logical checks at standby
• Database
– Physical block corruption: In-memory block and redo checksum
– Logical block corruption: In-memory intra-block checks, shadow lost write protection
• ASM
– Physical block corruption: Automatic corruption detection and repair using extent pairs
• Exadata
– Physical block corruption: HARD checks on write, automatic disk scrub and repair
– Logical block corruption: HARD checks on write


Active Data Guard Overview

Diagram: Primary (Open Read-Write) replicating to Standby (Open Read-Only)

• Zero Data Loss at any Distance
• Automatic Block Repair
• Offload read only or read mostly workloads to the standby database
• DML Redirection
• Multi-instance Redo Apply for RAC (In Memory supported)
• Synchronous zero data loss replication
• Database rolling upgrade to reduce downtime for planned maintenance
• Automatic failover for High Availability


PLATINUM: Extreme Critical

Diagram: Primary Region (AD1 Primary, AD2 Standby, Local backup) and Secondary Region (AD1 Primary, AD2 Standby, Local backup), linked by GG Replication

Gold +
• GoldenGate Active/Active Replication
• Optional Sharding & Edition-Based Redefinition

MAA Architecture:
• Each GoldenGate “primary” replica protected by Exadata, RAC and Active Data Guard
• Primary in one data center (or AD) replicated to another Primary in remote data center (or AD)
• Oracle GG & Edition-Based Redefinition for zero downtime application upgrade
• Sharding for scalability and fault isolation
• Local backups on both sites
• Achieve zero downtime through custom failover to GG replica

Outage Matrix (RTO/RPO*)
Unplanned Outage
• Recoverable node or instance failure: Zero**
• Disasters including corruptions and site failures: Zero***
Planned Maintenance
• Most common software/hardware updates: Zero**
• Major database upgrade, application upgrade: Zero***

* RPO=0 unless explicitly specified
** To achieve zero, requires applying application checklist
*** Application failover is custom to failover to GG replica

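The outage matrices of the four reference architectures lend themselves to a simple lookup when choosing a tier. The following sketch encodes the RTO values stated on the Bronze, Silver, Gold, and Platinum slides and filters tiers against a target; the key names, the ranking of the RTO labels, and the function name are assumptions made for this illustration, and the footnote conditions (for example, applying the application checklist to reach zero) are not modeled.

```python
# Ordering of the RTO labels used on the slides, from best to worst (assumed ranking).
RTO_RANK = {"Zero": 0, "Seconds": 1, "Minutes": 2, "Minutes to hour": 3, "Hours to days": 4}

# RTO values as stated in each tier's outage matrix.
OUTAGE_MATRIX = {
    "Bronze": {"node_failure": "Minutes", "disaster": "Hours to days",
               "updates": "Minutes", "major_upgrade": "Minutes to hour"},
    "Silver": {"node_failure": "Zero", "disaster": "Hours to days",
               "updates": "Zero", "major_upgrade": "Minutes to hour"},
    "Gold": {"node_failure": "Seconds", "disaster": "Seconds",
             "updates": "Zero", "major_upgrade": "Seconds"},
    "Platinum": {"node_failure": "Zero", "disaster": "Zero",
                 "updates": "Zero", "major_upgrade": "Zero"},
}


def tiers_meeting(outage: str, max_rto: str) -> list:
    """Return the tiers whose stated RTO for `outage` is no worse than `max_rto`."""
    limit = RTO_RANK[max_rto]
    return [tier for tier, row in OUTAGE_MATRIX.items()
            if RTO_RANK[row[outage]] <= limit]


# Which tiers recover from a site disaster within seconds?
print(tiers_meeting("disaster", "Seconds"))
```

A lookup like this only narrows the choice; the stated service levels still depend on the prescribed features, practices, and testing described in the level-set slide.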
GoldenGate plus 2 Optional Approaches to Further Protect Your Applications

• Use Oracle GoldenGate: Required
• Use Edition-Based Redefinition: Optional
• Use Oracle Sharding: Alternative


Oracle GoldenGate Microservices Architecture

Capture: committed transactions are captured (and can be filtered) as they occur by reading the transaction logs.
Trail: stages and queues data for routing.
Distribution Server/Receiver: distributes data for routing to target(s).
Route: data is compressed, encrypted for routing to target(s).
Delivery: applies data with transaction integrity.

Diagram: Source database(s), Capture, Trail Files, Distribution Service, LAN / WAN / Internet over TCP/IP, Receiver Service, Trail Files, Delivery, Target database(s). The same path runs in the reverse direction (bi-directional). Source and target can be Oracle & Non-Oracle Database(s).


Edition-Based Redefinition
Online Application Upgrade

• Enables application upgrades to be performed online
• Code changes installed in the privacy of a new edition
• Data changes are made safely by writing only to new columns or new tables not seen by the old edition
• An editioning view exposes a different projection of a table into each edition to allow each to see just its own columns
• A cross-edition trigger propagates data changes made by the old edition into the new edition’s columns, or (in hot-rollover) vice versa


Alternate Platinum Option: Sharding
Highly scalable, fault tolerant architecture for Internet Applications

A single logical DB sharded into N physical Databases
• Custom Built Application optimized to use shard keys
• Horizontal partitioning of data across independent databases (shards)
– Each shard holds a subset of the data
– Can be single-node or RAC or PDB
– Replicated for high availability
• Shared-nothing architecture:
– Shards don’t share any hardware (CPU, memory, disk), or software (Clusterware)

Diagram: one logical Database with Table1 partitioned across Shard1 on Server1, Shard2 on Server2, and Shard3 on Server3

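Horizontal partitioning by shard key can be sketched in a few lines. The example below routes a key to one of the three shards named in the diagram using a plain hash-modulo scheme. This is a generic illustration of key-based routing, not Oracle Sharding's actual chunk placement, and the function name is hypothetical.

```python
import hashlib


def shard_for(shard_key: str, shards: list) -> str:
    """Map a shard key to one shard deterministically (simple hash-modulo sketch)."""
    digest = hashlib.sha256(shard_key.encode("utf-8")).digest()
    index = int.from_bytes(digest[:8], "big") % len(shards)
    return shards[index]


shards = ["Shard1", "Shard2", "Shard3"]
# Every request for the same customer key is routed to the same shard.
target = shard_for("customer-1001", shards)
```

Because each key maps to exactly one independent database, a failure of one shard affects only the keys stored there, which is the fault-isolation property the slide describes.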
Sharding Configuration Options
Use Sharding with Active Data Guard, RAC or Oracle GoldenGate

• GoldenGate ‘chunk-level’ active-active replication with automatic conflict detection/resolution
• Active Data Guard with Fast-Start Failover
• Optionally – complement replication with Oracle RAC for server HA

https://www.oracle.com/database/technologies/high-availability/sharding.html

MAA Features in Exadata


Exadata: Maximum Availability Architecture Features

• Code & Configuration
• Data Protection
• Brownout Reduction
• Quality of Service
• Performance
• Management




Exadata: Data Protection
Corruption Detection, Prevention & Repair

If an application update in the database encounters corruption
• Database reads from the ASM mirror
• Repairs the corruption using the good copy
• This repair happens without impacting other database processes and application

When a network packet in the I/O path between DB server and storage node is corrupted
• Storage cell prevents the write
• ASM retries by re-sending the packet
• Application never encounters corruptions

When a drive is reported as failed, but not physically failed
• Automatically power cycles the drive to avoid false positive drive failure
• Works on both High Capacity & Extreme Flash Cells

Exadata: Data Protection
Storage Failures

• When a drive is reported as failed, but not physically failed
– Automatically power cycles the drive to avoid false positive drive failure
– Works on both High Capacity Disks & Extreme Flash Cells
• When a storage failure occurs
– Performs database-aware priority restores
– Control files, log files, SP files, TDE key stores, OCR, Wallets and then database files (MOS 1968607.1)
• With 12.2 and higher:
– Redundancy restore after storage loss takes much less time
– New REBUILD phase is done first, which restores REDUNDANCY, followed by restoring BALANCE
– Exadata flash cache is leveraged for rebalance reads, improving redundancy restoration performance by up to 30%
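The database-aware priority restore amounts to ordering files by type before restoring redundancy. The sketch below applies the order listed on the slide to a set of files; the file names, type labels, and function name are hypothetical, and the real ordering logic is internal to ASM and Exadata.

```python
# Restore order as listed on the slide: critical metadata first, data files last.
RESTORE_PRIORITY = [
    "control file", "log file", "SP file", "TDE key store",
    "OCR", "wallet", "database file",
]


def restore_order(files):
    """Sort (name, kind) pairs by restore priority, keeping input order within a kind."""
    rank = {kind: position for position, kind in enumerate(RESTORE_PRIORITY)}
    return sorted(files, key=lambda item: rank[item[1]])


# Hypothetical file names for illustration.
files = [
    ("users01.dbf", "database file"),
    ("redo01.log", "log file"),
    ("control01.ctl", "control file"),
    ("spfile.ora", "SP file"),
]
ordered = [name for name, _ in restore_order(files)]
```

Restoring the small, critical files first shortens the window in which a second failure could make the database unrecoverable.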


Exadata: Data Protection
Efficient Rebalance with Service Level Protection

• Intelligent and flexible rebalance power setting
– Testing in MAA labs to find best balance between redundancy restoration timing and service level protection
– MAA best practice default of 4 (total across clusters) set at deployment time
– MAA best practice max of 64 (total across clusters) available as needed
– MOS note 757552.1 available with more information and guidance
• Performs database-aware priority restores
– Control files, log files, SP files, TDE key stores, OCR, Wallets and then database files (MOS 1968607.1)
• 12.2+ ASM rebalance restores redundancy first, drastically reducing secondary failure exposure window
• 12.2+ Exadata leverages flash cache for rebalance reads, improving redundancy restoration performance by up to 30%

Exadata: Data Protection
Exadata ASM configuration best practices

Eighth and Quarter Rack ASM High Redundancy Voting Device / Quorum Disk Best Practice Check
• We already store vote devices on each of the three storage cells
• But now we also check for an iSCSI device based quorum disk that can house an additional vote device on each database node

ASM Power Limit Best Practice Check To Ensure a Rebalance Will Run


Do-Not-Service LED (X7 and higher)

• Data Center tech gets an easy-to-use visual to prevent servicing that might cause an outage.
• Leverages ASMDeactivationOutcome cell attribute, which is storage partner aware


M.2 Fast Failure Protection and Online Replacement (X7 and higher)

• Two M.2 drives house operating system and cell software. Storage server hard disks and flash drives contain application data only.
• M.2 drives protected with Intel RSTe RAID
• Chassis can be opened and M.2 drive replaced online while the storage server continues to service the application

Figure labels: USB Drive & DBFS_DG by default

Online Flash Replacement (X7 and higher)
“Hot-Plug”

• Chassis can be opened and the flash drives can be replaced online while the storage server continues to service the application.
• For a failed drive, replace when ready
• For an online drive, just tell us so we can properly prepare it.

Example:
CellCLI> alter physicaldisk FLASH_2_2 drop for replacement;



Exadata: Quality of Service
For Optimal Performance

• Cell Side IO Latency Capping (Hard Disk & Flash)
– When excessive IO is performed to a cell over PCI
– The read IO is redirected to the partner cell
– The write IO is canceled and temporarily written to healthy flash on the same cell
• Cell Side Disk Confinement
– When a disk goes bad and is taken offline
– Diagnostic is automatically run on the disk to determine health
– If healthy, disk is returned to ONLINE status and re-synched
– If unhealthy, health factor drop is performed, rebalance is performed and blue LED is lit after completion

Chart: LGWR Delay after Hung IO (seconds), Exadata vs. Traditional Storage
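The disk confinement steps form a small decision flow. The sketch below models it with a plain dictionary and a caller-supplied diagnostic; the status strings, action names, and function name are hypothetical labels for the steps on the slide, not Exadata cell software interfaces.

```python
def confine_disk(disk: dict, run_diagnostic) -> list:
    """Take a suspect disk offline, test it, and return the follow-up actions."""
    disk["status"] = "OFFLINE"            # the sick disk is taken offline first
    if run_diagnostic(disk):              # diagnostic decides whether it is healthy
        disk["status"] = "ONLINE"
        return ["resync"]
    disk["status"] = "DROPPED"
    # Unhealthy: drop the disk, restore redundancy, then signal it is safe to replace.
    return ["health factor drop", "rebalance", "blue LED on"]


# A transient problem: the diagnostic passes and the disk returns to service.
disk = {"name": "disk_07", "status": "ONLINE"}
actions = confine_disk(disk, run_diagnostic=lambda d: True)
```

Keeping the diagnostic as a callback makes the two outcomes easy to exercise: a passing check leads to a resync only, while a failing check leads to the drop, rebalance, and LED sequence.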


Exadata: Quality of Service
Smart Storage with IO Resource Manager

• Each IO is tagged with who issued the IO, purpose & priority
• Enables mixed workloads, consolidation of many databases with multiple tiers of performance

Example IO Tasks and Action Taken
• Table scan from Critical Data Warehouse: High-priority query. IORM prioritizes against other scans on both flash and disk!
• Table scan read from an Ad-Hoc Query: Low-priority, resource-intensive query. Stage to flash, only if there’s room. De-prioritize disk or flash I/O.
• DBWR write - no threat of “free buffer wait”: Not urgent – plenty of free buffers. IORM de-prioritizes this I/O
• DBWR write to resolve “free buffer wait”: Urgent – users are blocked. IORM prioritizes this I/O
• LGWR redo write: High-priority I/O. Accelerated via Exadata Flash Log!
• Buffer Cache read for OLTP transaction: Medium-priority I/O. Stage to flash. Prioritize against other user I/Os, based on PDB resource plan.
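Tagging each IO with issuer, purpose, and priority is what lets storage serve urgent requests first. The sketch below models that idea with a priority queue; the three priority levels, the class name, and the method names are assumptions for illustration and do not reflect how IORM is actually implemented or configured.

```python
import heapq
import itertools

# Lower number is served first (assumed three-level scheme).
PRIORITY = {"high": 0, "medium": 1, "low": 2}


class IoQueue:
    """Serve tagged IO requests by priority, first-in first-out within a priority."""

    def __init__(self):
        self._heap = []
        self._sequence = itertools.count()   # tie-breaker that preserves arrival order

    def submit(self, issuer: str, purpose: str, priority: str) -> None:
        heapq.heappush(
            self._heap, (PRIORITY[priority], next(self._sequence), issuer, purpose)
        )

    def next_io(self):
        _, _, issuer, purpose = heapq.heappop(self._heap)
        return issuer, purpose


queue = IoQueue()
queue.submit("ad-hoc query", "table scan", "low")
queue.submit("LGWR", "redo write", "high")
queue.submit("OLTP transaction", "buffer cache read", "medium")
first = queue.next_io()   # the redo write is served before the earlier table scan
```

Even though the ad-hoc scan arrived first, the redo write is dequeued ahead of it, which matches the slide's examples of prioritizing latency-sensitive IO over resource-intensive scans.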


Smart Flash Replacement

• After flash failure, a “health factor” status is set on the set of hard drives backing the failed flash.
• Reads are satisfied from healthy partner cell instead of the cell with a reduced amount of flash
• Health factor status clears after flash replacement *and* cache warmup
• This feature reduces the application service level impact after flash failure


Exadata: Quality of Service
SLAs Maintained During Planned Maintenance or When Storage is Compromised

• Exadata flash cache state preserved during ASM rebalance operations. One practical example is the resync that occurs during cell software rolling updates.
• Intelligent routing of IO requests to cell providing the best service after flash failure and repair
• Applicable to both unplanned outages and planned maintenance

Performance is Time, Time is Money


Database Tier IO Cancel
Oracle Grid Infrastructure & Database 19c

Storage Tier
• Slow IO? Cell IO Latency Capping
• Hung IO? IO Hang detection / repair
• Sick disk? Disk confinement

Database Tier
• Undiscovered hardware / software issue? Database Tier IO Latency Capping

Diagram label: IOs are Pumping


Rebalance For High Redundancy Diskgroups
Oracle Grid Infrastructure 19c

Problem: Rebalance runs out of space after disk failure (ORA-15041)

Solution for 18c and lower
Run exachk, which reports on compliance to our MAA best practice:
• 15% free with a normal or high redundancy diskgroup having < 5 Exadata cells and GI versions 12.2 and 18c
• 9% free with a normal or high redundancy diskgroup having 5 or more Exadata cells and GI versions 12.2 and 18c

Solution for 19c with high redundancy diskgroups
Smart rebalance - no need for free space!
• 0% free with 19c high redundancy diskgroup
• If there is not enough space to rebalance at the time of failure, offline the disk
• Upon replacement, efficiently repopulate it from partner disks automatically
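
The free-space thresholds on this slide can be written as a small lookup. This is a sketch of the slide's numbers only; the function name and the version strings are assumptions, and combinations the slide does not mention (for example 19c with normal redundancy) are rejected rather than guessed.

```python
def required_free_pct(gi_version: str, redundancy: str, cell_count: int) -> int:
    """Return the free-space percentage quoted on the slide.

    gi_version: "12.2", "18c" or "19c" (assumed spelling of the versions).
    redundancy: "normal" or "high".
    """
    if redundancy not in ("normal", "high"):
        raise ValueError("slide covers only normal and high redundancy")
    if gi_version == "19c" and redundancy == "high":
        return 0  # smart rebalance: no free space needed
    if gi_version in ("12.2", "18c"):
        return 15 if cell_count < 5 else 9
    raise ValueError("combination not covered by the slide")


print(required_free_pct("18c", "high", 3))   # 15
print(required_free_pct("18c", "normal", 7))  # 9
print(required_free_pct("19c", "high", 3))   # 0
```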
Summary: Database Tier IO Cancel
• Feature Oracle Has Provided: Protection from uncommon storage tier stalls/hangs
• Best Practices You Can Implement: Nothing! Completely transparent.
• Service Level Impact Expectations: Stable service level achieved through IO redirection on stalls/hangs



Exadata: Maximum Availability Architecture Features
Code & Configuration, Data Protection, Brownout Reduction, Quality of Service, Performance, Management


Exadata: Management
Enterprise Manager 13c – Improved Plug-in

• Database consolidation workbench


• Planning, Migration, Validation
• Estimate IO bandwidth Savings with
consolidation
• Automated Patching
• Lifecycle management of Exadata in
virtualized environment
• Create/Delete RAC databases & VMs
• Scale up/down cluster and VMs
• Exachk integration



Exadata: Management
Notification & Replacement Process for any Faults

Fault Management
• Components break: Fully automated notification and replacement process through ASR (Auto Service Request)
• Components get sick: Exadata is uniquely qualified to handle sick components with full stack integration. Exadata provides system/service level high availability.
• Intelligent hardware/software integration helps prevent human error: Blue light indicating disk replacement can be performed. Cell shutdown prevention and notification when redundancy would be compromised. X7 Do Not Service LED
• Cell shutdown causing application outage: Smart handshake with database tier and proactive redundancy checks during cell (or cellsrv) shutdown to prevent application outage.


Exadata: Management
ExaWatcher Graphing Support

• Shameless plug for the What’s New section of the “Exadata Database Machine System Overview” documentation
• In Oracle Exadata Storage Software release 12.2.1.1.0, GetExaWatcherResults.sh generates
HTML pages that contain charts for IO, CPU utilization, cell server statistics, and alert
history. The IO and CPU utilization charts use data from iostat, while the cell server statistics
use data from cellsrvstat. Alert history is retrieved for the specified timeframe.
• Example on the next slide…

Exadata: Management
Health Check using EXAchk Utility

• EXAchk provides configuration specific, up-to-date health check across the entire stack
• Covers Exadata, Database, Grid Infrastructure, ASM critical issues
• Provides MAA scorecard with MAA configuration gaps and guidance to mitigate
• Automated periodic scheduled runs with email notifications
• Continuous evolution of configuration checks
• EXAchk saves time and money through proactive health verification, which
reduces downtime
• Currently has over 1000 checks per target
• Development recommends that the latest EXAchk be executed with the following frequency:
• Monthly
• Week before any planned maintenance activity
• Day before any planned maintenance activity
• Immediately after completion of planned maintenance activity or an outage or incident

Note: Automated EXAchk Healthcheck, MOS note 1070954.1
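
The maintenance-related part of the recommended frequency can be turned into concrete dates. This is a minimal sketch: the function name is hypothetical, and treating "immediately after completion" as the maintenance day itself is an assumption. The monthly run is left to whatever scheduler is already in use.

```python
from datetime import date, timedelta
from typing import List


def exachk_runs_around(maintenance_day: date) -> List[date]:
    """Dates for the week-before, day-before, and post-maintenance runs.

    Assumes maintenance completes on maintenance_day, so the
    "immediately after" run falls on the same date.
    """
    return [
        maintenance_day - timedelta(days=7),
        maintenance_day - timedelta(days=1),
        maintenance_day,
    ]


for run in exachk_runs_around(date(2020, 4, 20)):
    print(run.isoformat())
```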



EXAchk: Sample Reports
• Assessment Report: Health Score, Summary, Findings
• Findings & Recommendations: How to solve the problem?
• MAA Score Card: Critical issues, incompatible features usage


Diagnostic Pack

• A compressed archive containing logs, traces, and relevant diagnostic information about the storage server, plus an index.html
• Contents customized to the particular incident
• Outgoing email for the alert contains the diag pack, as well as a link to it on the server.


Sick Component Handling
Diagnostic Pack Example



Exadata AWR Support
One Stop Shopping for Performance Problems



Exadata AWR Support
Unique Configuration and Outlier Detection

• Configuration differences detected across storage servers


• Exadata Storage Server Model
• Exadata Storage Version (group by package type/package version)
• Exadata Storage Information (group by all columns - flash cache size, flash log size,
# hard disks, # flash, # griddisks)
• Exadata Griddisks (group by # griddisks, griddisk size and disk type)
• Exadata Celldisks (group by disk type, celldisk size, # celldisks)
• Statistical differences detected compared to data sheet limits
• Max IOPS/throughput for OS statistics are colored dark red
• Outliers for OS and Cell Server statistics are colored (pinkish-red for high, yellow for
low)



Exadata AWR Support
Outlier Detection Example from a Real (Big) Customer



Exadata: Maximum Availability Architecture Features
Code & Configuration, Data Protection, Brownout Reduction, Quality of Service, Management, Performance


Server Centric Enables Leading-Edge Architecture
Exadata Gets New Technology at Best Cost Due to High Volume Server Economics

• Scale-Out with Fastest CPUs: Get New and Fastest Processors First
• Unified Ultra-fast InfiniBand: Use Modern Ultra-Fast Network
• Scale-out Storage Servers: Enable Intelligence in Storage
• Ultra-fast NVMe PCIe Flash: Get Fastest NVMe PCIe Flash First
• Tier PCIe Flash & Huge Disks: Get Bigger Disk Drives First


Exadata: Performance
Exadata Specific H/W & S/W Features
• Exadata Smart Scan and Reverse Offload
• Exadata Smart Logging
• Exadata Smart Persistent Write Back Flash Cache
• Exadata Persistent Memory
• Exadata Active/Active IB network

MAA Features
• Fastest Object Reorganization
• Fastest Instance Recovery
• Fastest Flashback Capability
• Fastest Backups to Exadata, Recovery Appliance
• Fastest Active Data Guard & Standby Redo Apply
• Fastest GoldenGate Performance

Exadata Hardware + Exadata Software + Oracle Database provides the ultimate performance !!



Exadata Smart System Software
Fastest Analytics
Unique Smart Scan automatically offloads data intensive SQL operations to storage
Unique Smart Flash Cache and Storage Index automatically accelerate database I/O
Unique automatic conversion of data to fast In-Memory Columnar format in flash

Fastest OLTP
Fastest OLTP I/O with scale-out storage, RDMA, and NVMe flash
Fastest scale-out with unique RDMA algorithms for inter-node cluster coordination
Fully redundant and fastest recovery from failed or sick components

Best Consolidation
Uniquely prioritizes latency sensitive or important workloads through full stack
Uniquely isolates workloads from multiple tenants through full stack

Exadata X8M (changes from X8 in red)
Scale-Out 2 or 8 Socket Database Servers
Latest 24 core Intel Cascade Lake

100Gb RDMA over Converged Ethernet (RoCE) Internal Fabric

Scale-Out Intelligent 2-Socket Storage Servers


1.5 TB Persistent Memory per storage server
Three tiers of storage: PMEM, NVMe, HDD

Enhanced consolidation using Linux KVM



KVM
MAA Characteristics

Full set of Exadata KVM best practices will be available here:
https://www.oracle.com/database/technologies/high-availability/exadata-maa-best-practices.html
Some MAA notables:
• The number of guests supported on KVM is 12 (8 on Xen)
• Prior generations of Exadata can be connected via Data Guard or GoldenGate
• Standard backup procedures apply
• The KVM host can optionally snapshot VM disk images and store them externally
• Update core Exadata infrastructure with patchmgr
• Update Grid Infrastructure and Database ORACLE_HOMEs with oedacli
• Perform lifecycle operations with vm_maker and oedacli



PMEM Exadata Data Access Tiers

MAA Characteristics
• Primary copy of data placed in PMEM cache on a read miss
• Secondary copy of data placed in flash cache on buffer eviction
• If a pmem fails in Writethrough mode, no redundancy restoration is required
• If a pmem fails in Writeback mode, a resilver operation is run to restore redundancy
• Low latency flash reads will repopulate super low latency pmem

[Diagram, not drawn to scale: Database Node (Buffer Cache) and Storage Cell (PMEM, FLASH, DISK), with Database Read and Buffer Evicted paths and temperature labels Sizzling, Hot, Warm, Cold]


RDMA Network Fabric RDMA over Converged Ethernet (RoCE)
MAA Characteristics

RDMA Network Fabric Adapter
• 2 Active-Active ports in every RDMA Network Fabric Adapter

RDMA Network Fabric Switch
• 2 RDMA Network Fabric Switches in every Exadata single rack
• 22 ports per switch used for the internal cluster network, cabled to ensure no single point of failure exists


Wait, in the past you have told me about how Exadata Fast Node Death Detection (FNDD) uses the InfiniBand Subnet Manager, but Exadata X8M does not have InfiniBand switches. How does FNDD work?

• 4 RDMA paths exist between database nodes and cells to monitor cell liveness. If all four are unavailable after a short timeout expires, the cell is evicted
• <1 second to complete cell eviction, maintaining SLA

[Diagram: RDMA reads from a Database Node to a Cell over four paths, all marked as failed]
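
The eviction rule stated on this slide (all four RDMA paths unavailable once a short timeout has expired) can be expressed as a predicate. This is a sketch only: the function and parameter names are hypothetical, and the timeout is passed in as a parameter because the slide does not give its value.

```python
from typing import Sequence

EXPECTED_PATHS = 4  # number of RDMA paths per the slide


def should_evict_cell(paths_alive: Sequence[bool],
                      unreachable_s: float,
                      timeout_s: float) -> bool:
    """Evict only when every path is down and the timeout has expired."""
    if len(paths_alive) != EXPECTED_PATHS:
        raise ValueError("expected exactly four RDMA path states")
    return not any(paths_alive) and unreachable_s >= timeout_s


# One surviving path is enough to keep the cell in the cluster.
print(should_evict_cell([False, False, False, True], 5.0, 0.5))   # False
print(should_evict_cell([False, False, False, False], 0.6, 0.5))  # True
```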



RDMA Network Fabric MAA Characteristics
Network Fabric Switch Software Updates

• Same tool, patchmgr
• Separate software update package
• Optimized, built-in handling of port down/up events
• -verify-config and --roceswitch-precheck options available to check state ahead of time


Multi-Instance Redo Apply Performance
Lower Latency Active Data Guard Standby Databases

• Utilizes all RAC nodes on the Standby database to parallelize recovery
• OLTP workloads on Exadata show great scalability

[Chart: Standby Apply Rate (MB/sec) by number of apply instances]
Instances: 1, 2, 4, 8
OLTP: 190, 380, 740, 1480
Batch: 700, 1400, 2752, 5000
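
To quantify "great scalability", the chart's data labels can be compared against perfect linear scaling. This sketch assumes the labels map to the two series as OLTP 190/380/740/1480 and Batch 700/1400/2752/5000 MB/sec for 1, 2, 4, and 8 instances; the dictionary and function names are illustrative.

```python
# Apply rates in MB/sec keyed by number of apply instances (assumed mapping
# of the chart's data labels to the OLTP and Batch series).
APPLY_RATE_MB_S = {
    "OLTP": {1: 190, 2: 380, 4: 740, 8: 1480},
    "Batch": {1: 700, 2: 1400, 4: 2752, 8: 5000},
}


def scaling_efficiency(rates):
    """Ratio of the measured rate to perfect linear scaling from one instance."""
    base = rates[1]
    return {n: rate / (base * n) for n, rate in rates.items()}


for workload, rates in APPLY_RATE_MB_S.items():
    for n, eff in scaling_efficiency(rates).items():
        print(f"{workload} x{n}: {eff:.0%} of linear")
```

Under that mapping, both workloads scale linearly to two instances, and at eight instances OLTP retains roughly 97% and Batch roughly 89% of linear scaling.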



Two Production Customer Examples
Data Guard Redo Apply Performance

• Thomson-Reuters
• Data Warehouse on Exadata, prior to write-back flash cache
• While resolving a gap, observed an average apply rate of 580MB/second
• Allstate Insurance
• Data Warehouse ETL processing resulted in average apply rate over a 3 hour
period of 668MB/second, with peaks hitting 900MB/second



Exadata: Maximum Availability Architecture Features
Code & Configuration, Data Protection, Brownout Reduction, Quality of Service, Performance, Management


Brownouts and Blackouts
It's All About Service Levels

A brownout is a significant service level degradation. A blackout is a complete service level interruption.
Brownouts and blackouts translate to lost productivity and revenue.
Systems are complicated with many components, and an issue at one layer can easily cascade to another layer and exacerbate the impact.
Engineered systems are uniquely qualified to solve this very tough problem.


Exadata Marquee/New HA Features
Reduced HA Brownout – Fast Node Death Detection on Database Nodes and Cells

Example of Database node power failure with an OLTP workload and CSS
misscount=60



App Brownout in Typical Configuration
• Each layer of the application stack has its own failure detection method
• Vendors try to obfuscate these details by quoting client side failure numbers
• In most cases the fault detection times are additive
• For example, if a storage controller crashes, it will take 2 SCSI timeouts for the database server to detect such a failure

[Diagram: Clusterware Timeout; SAN/LAN SCSI Timeout; two Storage Controllers with Proprietary Protocol Timeouts]
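
Because the detection times are additive, the worst-case brownout is the sum of every timeout that must expire in sequence. The sketch below illustrates that arithmetic; the function name is hypothetical, the 30-second SCSI timeout is a made-up value, and the 60-second clusterware value simply reuses the CSS misscount quoted earlier in the deck.

```python
from typing import Iterable, Tuple


def additive_detection_time_s(layers: Iterable[Tuple[str, float, int]]) -> float:
    """Sum timeout * count over (layer name, timeout in seconds, timeouts needed)."""
    return sum(timeout_s * needed for _name, timeout_s, needed in layers)


layers = [
    ("SCSI", 30.0, 2),          # hypothetical timeout; two expiries per the slide
    ("clusterware", 60.0, 1),   # CSS misscount=60 as quoted earlier in the deck
]
print(additive_detection_time_s(layers))  # 120.0
```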



Exadata: Unique Brownout Reduction Features
Instant Failure Detection, Maximum Application Uptime

• If a server disappears from both InfiniBand switches, declare it dead in less than two seconds
• No waiting for long heartbeat timeouts
• Reduces application brownouts from 30+ seconds to < 2 seconds

Active/Active IB configuration provides:
• Extreme throughput - 40 Gb/s QDR
• Extreme availability - RDS failover in seconds with minimum application impact

[Chart: Application Brownout in seconds. Exadata: 0.8; 3rd Party Storage: 300]


Exadata: Brownouts and Blackouts
Maintaining Service Levels Since 2008

Let’s watch a 1 minute video featuring our Fast Node Death Detection (FNDD) feature. If you watch carefully you will still be rewarded with one new feature referenced at about the 35 second mark.

For more information on older features and best practices, see http://www.oracle.com/technetwork/database/availability/exadata-maa-best-practices-155385.html, MOS note 757552.1, exachk reports, prior OOW MAA Exadata presentations, and the Exadata documentation.


Brownouts and Blackouts
Flex ASM

Flex ASM enables continuous RDBMS<->ASM communication after an ASM instance crash without the need for a service failover
Completely transparent to the application with no service level impact

Flex ASM configured with cardinality=ALL on Exadata


Exadata: Brownouts and Blackouts
Recent Improvement in Brownout Associated With Network Port Failure

The brownout associated with active/passive client access network port failure is
now 60% lower after we thoroughly verified a reduction to the network downdelay
parameter. It also prevents false positive VIP failover when using OVM.
This configuration change is now in the default Exadata deployment and the best
practice check is in exachk.



Smart Handshake For Storage Server Shutdown
Grid Infrastructure 12c or higher / Exadata 12.1 or higher

• Clear communication to the diskmon process on the database servers when storage is shut down prevents errors and application blackouts.

Your service level will smile!

[Diagram: Database Tier and Storage Tier]


Summary: Smart Handshake For Storage Server Shutdown
• Feature Oracle Has Provided: Graceful database tier handling during storage server shutdown
• Best Practices You Can Implement: Use graceful shutdown procedures. Related: Use patchmgr for storage server software updates as it ensures grid disks are handled properly.
• Service Level Impact Expectations: No blackouts when storage tier is shut down for maintenance. No false positive errors/alerts.


Smart OLTP Caching
19c Grid Infrastructure. Maintaining SLAs During Storage Failures
• SaaS application reading data from the
primary mirror
• Storage failure on cell containing
primary mirror
• No problem, just retrieve data from the
secondary mirror on flash with low
latency

• The tertiary mirror continues to provide
protection just in case it's one of those
days
• After the storage failure is repaired and
the cell caching state is deemed healthy
again, return to the primary mirror


Exadata: Maximum Availability Architecture Features
Code & Configuration, Data Protection, Brownout Reduction, Quality of Service, Performance, Management


Exadata has Many HA Features Supporting the Most Stringent SLAs
• Fast node and cell death detection
• Fast network failure detection
• Fastest Redo Apply and Instance Recovery
• Redundancy protection on cellsrv shutdown
• Efficient resilver rebalance after flash failure
• Reduced brownout for instance recovery
• I/O latency capping for reads and writes
• ILOM hang detection and repair
• Cell IO timeout threshold
• Redundancy protection on cell shutdown
• Smart Write Back Flash Cache persistence
• Automatic ASM mirror read on IO error corruption
• IO error prevention with Exadata disk scrubbing / ASM corruption repair
• Disk confinement
• Exadata HARD
• IO hang detection and repair
• Corruption prevention with HARD support
• Cell to Cell offload for Disk Repair
• Elimination of false positive drive failures
• Cell-to-Cell Rebalance Preserves Flash Cache
• Redundancy Check during power down
• Exadata Elastic Configuration
• Blue OK-to-remove LED light notification
• Active Active IB Network
• Exadata Smart Write Back, Smart Flash Logging, Smart Scan and Reverse Offload
• Cell Alert Summary
• Flash and Disk Life Cycle Management Alerts
• Automatic LED support for disk removal
• Auto online
• Auto disk management
• I/O and Network Resource Management
• Health factor on predictively failed disks
• Failure Monitoring on database servers
• Updating database nodes with patchmgr
• Optimized and Faster Exadata Patching
• Custom Diagnostic Package for Cell Alerts
• VLAN support and automation
• Drop hard disk for replacement
• Drop BBU for Replacement
• Appliance mode support
• Priority rebalance support
• EM failure reporting
• Exachk – full stack health check with critical issues alerts


Exadata MAA Benefits
• Pre-Packaged MAA: Faster deployment*, reduced guess-work & tuning requirements
• Reduced HA Downtime: Few seconds of Blackout / Brownout
• Corruption Prevention & Repair: Zero downtime using corruption prevention
• Quality of Service: Meeting HA SLAs at any scale
• Reliable & Scalable Performance: Reliable network & storage performance at any scale
• End-to-End Management: Integrated tools/reports with end-to-end visibility

* Pre-Deployed in ExaCS / ExaCC



Exadata MAA Solution Integration
On-Premises Exadata
• All Exadata MAA configuration best practices baked in
• Exadata MAA operational best practices implemented by customer

Gen 2 Exadata Cloud at Customer
• All Exadata MAA configuration best practices baked in
• All Exadata MAA operational best practices baked in

Gen 2 Exadata Cloud / Autonomous Database
• All Exadata MAA configuration best practices baked in
• Some Exadata MAA operational best practices baked in


Exadata: Maximum Availability Architecture Features
Code & Configuration, Data Protection, Brownout Reduction, Quality of Service, Performance, Management


High Availability for Maximum Application Uptime

“Exadata and SuperCluster both achieve AL4 fault tolerance in a Maximum Availability Architecture configuration”

FIVE NINES (5X9): 99.999%
A New Gold Standard

Only other AL4 Systems
• IBM - z Systems
• HPE - Integrity NonStop & Superdome
• Fujitsu – GS & BS2000
• NEC – FT Server/320 Series
• Stratus ftServer & V Series
• Unisys – Dorado
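
For context, an availability percentage translates directly into a yearly downtime budget. The sketch below does that arithmetic for 99.999%, assuming a 365-day year; the function name is illustrative.

```python
MINUTES_PER_YEAR = 365 * 24 * 60  # 525,600, ignoring leap years


def downtime_budget_minutes(availability_pct: float) -> float:
    """Minutes of downtime per year allowed at the given availability."""
    return (100.0 - availability_pct) / 100.0 * MINUTES_PER_YEAR


print(round(downtime_budget_minutes(99.999), 2))  # 5.26
```

Five nines therefore leaves a little over five minutes of total downtime per year.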



IDC Report*
Exadata Delivers Real Business Value
Average results across eight Global 2000 companies
• Five-Year ROI: 429%
• 11 month average payback
• 94% less unplanned downtime

“…far less complicated, we don’t have distinct boxes to maintain, and we now have a single technology...” - Oracle Customer
IDC: Business Value of Oracle Exadata Database Machine, September 2016

* IDC White Paper, Sponsored by Oracle, September 2016

Risk Mitigation – Downtime
Oracle Exadata Database Machine

Metric | Before Oracle Exadata | With Oracle Exadata | Difference | % Benefit
Unplanned Downtime
Number of instances per year | 7.1 | 0.7 | 6.5 | 90%
MTTR (hours) | 2.9 | 0.4 | 2.5 | 86%
Productive hours lost per 100 users per year | 1,021 | 66 | 955 | 94%
Unplanned Downtime – Revenue Impact
Total revenue impact per year | $423,700 | $5,800 | $417,900 | 99%
Planned Downtime
Number of instances per year | 10.9 | 6.0 | 4.9 | 45%
MTTR (hours) | 4.6 | 1.9 | 2.7 | 59%
Productive hours lost per 100 users per year | 68 | 60 | 8 | 12%

Source: IDC
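
The "% Benefit" column can be recomputed from the before and after values as (before - after) / before. The sketch below does this for every row; the row labels and function name are illustrative. One caveat: the first row's published difference of 6.5 does not equal 7.1 - 0.7 = 6.4, presumably because IDC computed it from unrounded inputs.

```python
# (label, before Exadata, with Exadata, % benefit as published on the slide)
ROWS = [
    ("Unplanned: instances per year", 7.1, 0.7, 90),
    ("Unplanned: MTTR (hours)", 2.9, 0.4, 86),
    ("Unplanned: productive hours lost per 100 users", 1021, 66, 94),
    ("Unplanned: revenue impact per year ($)", 423700, 5800, 99),
    ("Planned: instances per year", 10.9, 6.0, 45),
    ("Planned: MTTR (hours)", 4.6, 1.9, 59),
    ("Planned: productive hours lost per 100 users", 68, 60, 12),
]


def pct_benefit(before: float, after: float) -> int:
    """Percentage reduction relative to the 'before' value, rounded."""
    return round(100 * (before - after) / before)


for label, before, after, published in ROWS:
    print(f"{label}: computed {pct_benefit(before, after)}%, slide {published}%")
```

All seven recomputed percentages match the published column after rounding.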
Maximum Availability Architecture (MAA)

Summary



Exadata is Highly Engineered and Standardized
Less Risk, High Uptime = Better Results

• Less Deployment Risk and Faster to Market


• Delivered assembled, debugged, and ready-to-run
• Less Performance and Availability Risks
• Optimized database-to-disk including firmware, OS, network
• Industry experts at every layer of the stack help design, build and
support Exadata. Includes MAA input, bug fixes, and configuration
practices.
• Less Operating Risk
• All failure modes tested end-to-end. All systems identical.
• Reduces issue resolution times, reduces vendor management overhead
and improves SLAs
• Operational Play Book (including online elasticity)

Exadata for Consolidation and Database as a Service
Best Mixed Workload Performance, Performance Isolation, Availability

• Any bottleneck on consolidated system can stall all workloads. Exadata eliminates bottlenecks
– Highest network bandwidth, storage offload
– Millions of I/Os per second, unique log optimizations
• Exadata uniquely prioritizes I/O by pluggable database, job, user, service, etc.
• Exadata uniquely prioritizes critical DB network messages through entire fabric
• Exadata uniquely unifies CPU prioritization with I/O prioritization for end-to-end assurance

[Diagram: consolidated workloads from Manufacturing, Engineering, Marketing, Sales, Human Resources, Service, IT/Operations, Finance and Accounting]


Exadata + MAA: Thousands of Critical Deployments
Half OLTP - Half Analytics - Many Mixed

4 OF THE TOP 5
BANKS, TELCOS, RETAILERS RUN EXADATA
• Petabyte Warehouses
• Online Financial Trading
• Business Applications
• SAP, Oracle, Siebel, PSFT, …
• Massive DB Consolidation
• Public SaaS Clouds
• Oracle Fusion Apps,
Salesforce, SAS, …



Exadata Advantages Increase Every Year
Dramatically Better Performance and Cost

• Exadata Cloud at Customer
• In-Memory OLTP Acceleration
• In-Memory Columnar in Flash
• Exadata Cloud Service
• Smart Fusion Block Transfer
• In-Memory Fault Tolerance
• Direct-to-wire Protocol
• JSON and XML offload
• Instant failure detection
• Network Resource Management
• Multitenant Aware Resource Mgmt
• IO Priorities
• Prioritized File Recovery
• Data Mining Offload
• Offload Decrypt on Scans
• Database Aware Flash Cache
• Storage Indexes
• Columnar Compression
• Smart Scan
• Hot Swappable Flash
• 25 GigE Client Network
• 3D V-NAND Flash
• Software-in-Silicon
• Tiered Disk/Flash
• PCIe NVMe Flash
• Unified InfiniBand
• DB Processors in Storage
• InfiniBand Scale-Out
• Scale-Out Storage
• Scale-Out Servers


Exadata Combinations with Other Engineered Systems

Systems shown: Exadata Database Machine, SuperCluster, Zero Data Loss Recovery Appliance, ZFS Backup Appliance, Big Data Appliance, Exalogic Elastic Cloud, Private Cloud Appliance, Oracle MiniCluster, Oracle Database Appliance
Groupings shown: Data Protection, Big Data SQL, Middleware / Apps, Dept


Summary: High Availability Decisions Made Easier
Protection From: Use Exadata + MAA
• Unintuitive double storage failures: High redundancy or Normal redundancy with Data Guard
• Data loss and downtime: For disasters: Data Guard, GoldenGate -> See http://www.oracle.com/goto/maa. For local failures requiring recovery: Test your restore/recovery strategy to ensure it works
• Unexpected issues during planned maintenance: If you can negotiate the downtime within service levels, take it. If not, leverage rolling patch capabilities available at every tier
• Unexpected production workload profile: Test environment similar to production, DBMS_WORKLOAD_REPLAY
• Known critical issues: Run EXAchk monthly and when a new release is published
• Resource depletion that affects service levels: Capacity Planning, Resource Management, Enterprise Manager, RAS for new customers
• Over-customization: Walk the Oracle line as much as you can, and you will gain the most bang for your buck from your engineered system