
DEM135

Introduction to data lakes and analytics on AWS
Nikki Rouda
Principal PMM
Amazon Web Services

© 2019, Amazon Web Services, Inc. or its affiliates. All rights reserved.
Fortnite
250+ million players
Data provides a constant feedback loop for game designers

Up-to-the-minute analysis of gamer satisfaction to drive gamer engagement

Resulting in the most popular game played in the world
Customers want more value from their data

Growing exponentially
From new sources
Increasingly diverse
Used by many people
Analyzed by many applications
Companies want more value from their data

Complications:
Siloed approaches don’t work anymore
It’s too expensive and limiting to store data on-premises

Implication:
A new approach is needed to extract insights and value
Cloud data lakes are the future
Customers want:
A single data store that is scalable & cost effective
To store data securely in standard formats
To analyze their data in a variety of ways

[Diagram labels: Cloud data lake infrastructure; Decoupled storage & compute resources; Security & governance; Catalog; ETL & data management; Data migration; Streaming services; Data warehouse; Big data processing; Serverless data processing; Operational analytics; Real-time analytics; Predictive analytics]
Why choose AWS for data lakes and analytics?

Most comprehensive
Most secure
Most cost-effective
Easiest to build
Widely used
Most comprehensive and open

Data, visualization, engagement, & machine learning:
Data, Dashboards, Digital user engagement, Predictive analytics

Analytics:
Data warehousing, Big data processing, Serverless data processing, Interactive query, Operational analytics, Real time analytics

Data lake infrastructure & management:
Infrastructure, Security & management, Data catalog & ETL

Data movement:
Migration & streaming services

Most comprehensive and open

Data, visualization, engagement, & machine learning:
Data Exchange, Amazon QuickSight, Amazon Pinpoint, Amazon SageMaker, Amazon Comprehend, Lex, Polly, Amazon Rekognition, Amazon Translate + many more

Analytics:
Amazon Redshift, Amazon EMR (Spark & Hadoop), AWS Glue (Spark & Python), Amazon Athena, Amazon Elasticsearch Service, Amazon Kinesis Data Analytics

Data lake infrastructure & management:
Amazon S3/Glacier, AWS Lake Formation, AWS Glue

Data movement:
AWS Database Migration Service | AWS Snowball | AWS Snowmobile | Amazon Kinesis Data Firehose | Amazon Kinesis Data Streams | Managed Streaming for Kafka
Most secure
Services for security and governance

Customers need to have multiple levels of security, identity and access management,
encryption, and compliance to secure their data lake

Security:
Amazon GuardDuty, AWS Shield, AWS WAF, Amazon Macie, Amazon VPC

Identity:
AWS IAM, AWS SSO, Amazon Cloud Directory, AWS Directory Service, AWS Organizations

Encryption:
AWS Certificate Manager, AWS Key Management Service, Encryption at rest, Encryption in transit, Bring your own keys, HSM support

Compliance:
AWS Artifact, Amazon Inspector, AWS CloudHSM, Amazon Cognito, AWS CloudTrail
Most secure - infrastructure certifications
Global
CSA: Cloud Security Alliance Controls
ISO 9001: Global Quality Standard
ISO 27001: Security Management Controls
ISO 27017: Cloud Specific Controls
ISO 27018: Personal Data Protection
PCI DSS Level 1: Payment Card Standards
SOC 1: Audit Controls Report
SOC 2: Security, Availability, & Confidentiality Report
SOC 3: General Controls Report

United States
CJIS: Criminal Justice Information Services
DoD SRG: DoD Data Processing
FedRAMP: Government Data Standards
FERPA: Educational Privacy Act
FFIEC: Financial Institutions Regulation
FIPS: Government Security Standards
FISMA: Federal Information Security Management
GxP: Quality Guidelines and Regulations
HIPAA: Protected Health Information
ITAR: International Arms Regulations
MPAA: Protected Media Content
NIST: National Institute of Standards and Technology
SEC Rule 17a-4(f): Financial Data Standards
VPAT/Section 508: Accountability Standards

Asia Pacific
FISC [Japan]: Financial Industry Information Systems
IRAP [Australia]: Australian Security Standards
K-ISMS [Korea]: Korean Information Security
MTCS Tier 3 [Singapore]: Multi-Tier Cloud Security Standard
My Number Act [Japan]: Personal Information Protection

Europe
C5 [Germany]: Operational Security Attestation
Cyber Essentials Plus [UK]: Cyber Threat Protection
G-Cloud [UK]: UK Government Standards
IT-Grundschutz [Germany]: Baseline Protection Methodology
Most cost effective
Decouple compute and storage, choice of pay-as-you-go (PAYG) analytics services

Storage:
Amazon S3 tiers & intelligent tiering
From $0.023 per GB/mo to as low as $0.004 per GB/mo

Compute:
Spot & Reserved Instances
Save up to 90% off on-demand prices

EMR:
Auto scaling
57% less than on-premises per IDC report

Redshift:
Less than 1/10th of the cost of traditional, on-premises solutions

Athena & QuickSight:
Serverless, pay only for what is used
Pricing per session for visualization
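
To make the storage figures concrete, here is a minimal sketch of the arithmetic behind the quoted range of $0.023 to $0.004 per GB per month. The two prices come from the slide (2019 figures, which may no longer be current). The function names, the 100,000 GB example size, and the idea of a single "cold fraction" are illustrative assumptions, and the sketch ignores request, retrieval, and transfer charges.

```python
# Per-GB monthly prices quoted on the slide (2019 figures; may be outdated).
STANDARD_PER_GB_MONTH = 0.023
LOWEST_TIER_PER_GB_MONTH = 0.004


def monthly_storage_cost(size_gb: float, price_per_gb_month: float) -> float:
    """Return the monthly storage cost in USD for size_gb gigabytes."""
    return size_gb * price_per_gb_month


def blended_monthly_cost(size_gb: float, cold_fraction: float) -> float:
    """Return the monthly cost when cold_fraction of the data sits in the
    cheapest tier and the remainder stays at the standard price.

    Hypothetical simplification: only two tiers, storage charges only.
    """
    if not 0.0 <= cold_fraction <= 1.0:
        raise ValueError("cold_fraction must be between 0 and 1")
    hot_gb = size_gb * (1.0 - cold_fraction)
    cold_gb = size_gb * cold_fraction
    return (monthly_storage_cost(hot_gb, STANDARD_PER_GB_MONTH)
            + monthly_storage_cost(cold_gb, LOWEST_TIER_PER_GB_MONTH))


if __name__ == "__main__":
    size_gb = 100_000  # hypothetical data lake of 100,000 GB
    for fraction in (0.0, 0.5, 1.0):
        cost = blended_monthly_cost(size_gb, fraction)
        print(f"{fraction:.0%} in lowest tier: ${cost:,.2f} per month")
```

Under these assumptions, 100,000 GB costs about $2,300 per month entirely at the standard price, about $1,350 with half the data in the lowest tier, and about $400 entirely in the lowest tier.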
Widely used
Tens of thousands of data lakes run on AWS across all industries
Thank you!
Nikki Rouda
nrouda@
