Data For Breakfast Stockholm Final Slides

SNOWFLAKE

THE CLOUD DATA PLATFORM

© 2020 Snowflake Inc. All Rights Reserved.


DATA…THE NEW FRONTIER



NEW TECHNOLOGY CHANGES
HOW WE USE DATA

Diversification of Analytics
Analytics is growing in importance, everywhere, and for everyone

Explosion of Data
IoT, mobile, and social open up new opportunities for insight

Rise of the Cloud
Cloud gives us the ability to scale and centralize data



TAKING A PLATFORM APPROACH LEADS
TO BIG BUSINESS IMPACT

Make Better, Quicker Business Decisions

Reduce the Cost of Scaling Data Management and Analytics

Create a Great Customer Experience with Data



JOURNEY TO A CLOUD DATA PLATFORM

[Chart: Value of Data over Time, rising through four stages: On Premises EDW; Data Lake, Hadoop; 1st Gen Cloud EDW; Cloud Data Platform. Capabilities shown: All Data, All Users, Fast Answers, SQL Database]



TRADITIONAL DATA ARCHITECTURE
Complex and Costly with Multiple Copies of Data

[Diagram: Data Sources reach Data Consumers through stages labeled Integration, Transformation, Normalization & Aggregation, and Analytics. Components shown: OLTP Databases, Enterprise Applications, Third-Party, Web/Log, IoT; ETL, ELT, CDC, Streaming; Data Warehouses, Data Marts, Data Cubes, Data Lake, Backups, File Sharing; Operational Reporting, Ad Hoc Analysis, Data Science, Real-time Analytics]



MODERN DATA ARCHITECTURE WITH
SNOWFLAKE

[Diagram: Data Sources (OLTP Databases, Enterprise Applications, Third-Party, Web/Log Data, IoT) arrive via ETL and Streaming into one platform covering Data Warehouse, Data Lake, Data Engineering, Data Exchange, Data Applications, Data Science, and Data Monetization, which serves Data Consumers (Operational Reporting, Ad Hoc Analysis, Real-time Analytics)]



THE VALUE OF A CLOUD DATA PLATFORM

One Platform: One Copy of Data, Many Workloads

Unlimited Performance and Scale

Secure & Governed Access to All Data

Near-zero Maintenance, as a Service



ONE PLATFORM, ONE COPY OF DATA,
MANY WORKLOADS

Data Warehouse
Modernize data warehousing to deliver faster analytics at scale

Data Engineering
Rethink transformation with robust and integrated data pipelines

Data Lake
Simplify and accelerate your data lake with one platform for all your data

Data Applications
Develop apps with fast and scalable analytics that delight customers

Data Exchanges
Empower your ecosystem with secure, governed access to all data

Data Science
Simplify and accelerate machine learning and artificial intelligence



THE IMPACT OF A CLOUD DATA PLATFORM

Productivity

Economics Scale

Concurrency Performance

Elasticity



SITUATION
● Shift to the cloud with focus on speed and value
● Managing infrastructure
● Significant governance, regulatory requirements

VALUE
● 60+ analytics teams served by single cloud data platform
● Migrated in < 90 days
● Load speeds improving 86%
● 5x faster complex queries
● Improved governance and democratized insights



SITUATION
● Multiple data warehouses, some end-of-life
● Data silos caused ambiguity and reporting disparities
● Poor consumer insight

VALUE
● Diverse data ingested for analytics, data science
● Improved agility, scalability and analytics performance
● In depth customer knowledge and improved services



SITUATION / PAIN
● Painfully slow analytics cycles
● Limited ability to answer complex questions
● Inability to provide business continuity and ensure security

SOLUTION / VALUE
● Hundreds of newly empowered analysts
● Improved scalability and faster query times
● Guaranteed security and data availability, 24/7/365



PROVEN BY OVER 3000 CUSTOMERS



EVER EXPANDING ECOSYSTEM

Platform, BI/Analytics, ETL, Data Science, Services



TECHNICAL
DEEP DIVE



HOW IS SNOWFLAKE UNIQUE?

ARCHITECTURE



A CLOSER LOOK
Traditional Architectures

Shared-disk
• Additional capacity requires forklift upgrade
• Reads/Writes at the same time cripples the system
• Replication requires additional hardware

Shared-nothing
• Resizing cluster requires redistributing data. Shut down requires unloading.
• Each cluster requires its own copy of data (ex: test/dev, HA)
• Vacuuming processes needed to maintain sort and distribution for performance

Snowflake

Multi-cluster, shared data
• Centralized, scale-out storage that expands and contracts automatically
• Independent compute clusters can read/write at the same time and resize instantly
• Automated backup across multiple availability zones/regions
• AWS, Azure, GCP
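
The shared-nothing point that resizing a cluster requires redistributing data can be illustrated with a small sketch. It assumes rows are placed by hashing a distribution key modulo the node count, which is one common placement scheme and not a description of any specific product. The function names and the `order-N` keys are hypothetical.

```python
import hashlib


def node_for(key: str, node_count: int) -> int:
    """Place a row on a node by hashing its distribution key (modulo placement)."""
    digest = hashlib.sha256(key.encode("utf-8")).digest()
    return int.from_bytes(digest[:8], "big") % node_count


def fraction_moved(keys, old_nodes: int, new_nodes: int) -> float:
    """Share of rows whose owning node changes when the cluster is resized."""
    moved = sum(1 for k in keys if node_for(k, old_nodes) != node_for(k, new_nodes))
    return moved / len(keys)


# Hypothetical distribution keys, for illustration only.
keys = [f"order-{i}" for i in range(10_000)]
print(f"4 -> 5 nodes: {fraction_moved(keys, 4, 5):.0%} of rows move")
print(f"4 -> 8 nodes: {fraction_moved(keys, 4, 8):.0%} of rows move")
```

Under this placement, adding a single node to a four-node cluster relocates about four in five rows in expectation, and doubling the cluster relocates about half. When storage is shared and separate from compute, resizing compute does not involve this kind of data movement.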



SNOWFLAKE ARCHITECTURE

Scale Out Services

Multi-Cluster Compute

Centralized Storage

Cloud Agnostic Layer




[Diagram, shown in three builds: a Cloud Services layer (Metadata Management, Security, Query Planning & Optimization, Transactional Control) above independent compute clusters sized XS, S, M, L, and XL that serve ETL/ELT, Sales, Data Science, Dev Ops, and Finance/DBAs. Later builds add Snowpipe, multi-cluster compute, external Data Sharing, structured & semi-structured data, Clone, and data protection & time travel]
SECURE BY DESIGN, DATA AVAILABILITY
Authentication
• Embedded multi-factor authentication
• Key Pair authentication
• Federated authentication / SSO supported

Access Control
• IP whitelisting
• Role-based access control model
• Granular privileges on all objects & actions

Data Encryption
• All data encrypted, always, end-to-end
• Encryption keys managed automatically

External Validation
• Certified against enterprise-class requirements
• PCI and HIPAA available

Data Availability
• User error: Time Travel, Failsafe, Cloning
• Zone failure: Data replicated to multiple zones in a region
• Region/Provider failure: Data replication & failover
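
The "User error: Time Travel, Failsafe, Cloning" item is easier to picture with a toy model. The sketch below assumes a table stored as a series of immutable versions, so an earlier state stays readable and a clone can reference existing data instead of copying it. It is an illustration of that idea only, not Snowflake's implementation, and `VersionedTable` and its methods are hypothetical names.

```python
class VersionedTable:
    """Toy table that never mutates old data: each write appends a new immutable snapshot."""

    def __init__(self, versions=None):
        # Each version is an immutable tuple of rows; version 0 is the empty table.
        self._versions = list(versions) if versions is not None else [()]

    def insert(self, row):
        self._versions.append(self._versions[-1] + (row,))

    def delete_all(self):
        self._versions.append(())

    def rows(self, version=None):
        """Read the current rows, or the rows of an earlier version ("time travel")."""
        index = len(self._versions) - 1 if version is None else version
        return list(self._versions[index])

    def clone(self, version=None):
        """Create a new table from a chosen version; snapshots are shared, not copied."""
        end = len(self._versions) if version is None else version + 1
        return VersionedTable(self._versions[:end])


orders = VersionedTable()
orders.insert({"id": 1})
orders.insert({"id": 2})   # version 2 holds both rows
orders.delete_all()        # user error: version 3 is empty

restored = orders.clone(version=2)
print(orders.rows())       # current state after the mistake
print(restored.rows())     # state recovered from before the mistake
```

Because old versions are never overwritten, recovering from the accidental delete is a metadata operation: the clone points at the version that existed before the mistake.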



MORE ON WORKLOADS

Data Lake, Data Exchange, Data Engineering, Data Science



HOW DOES SCALABLE CLOUD DATA
PLATFORM ENABLE DATA LAKES?
Data Lake Attributes
No silos – all data
Multiple workloads
Open formats
Cheap storage
Raw Representation
Schema on read

Snowflake Additional Benefits
Governance
Global
Transactional
Performance
Data sharing
Managed service



AUGMENTING EXISTING DATA LAKES

[Diagram: an existing Data Lake (S3, Azure Storage, GCS, Hive Metastore) connected through New Files Notifications and Hive Events to External Tables and Materialized Views. Query paths shown: SQL over materialized data, SQL over Snowflake tables, SQL over external table]



LOW LATENCY INGEST

[Diagram: Data Sources (Web, IoT, Mobile, Enterprise Apps) → External Stage (S3, Azure Blobs, GCS) → Snowpipe → Staging Tables → Unload → External Stage (S3, Azure Blobs, GCS)]



SCALABLE TRANSACTIONAL
TRANSFORMATIONS

Table Streams & Tasks

[Diagram: Staging Tables → Transformations → Target Table 1, Target Table 2 → Unload → External Stage (S3, Azure Blobs, GCS)]
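
The streams-and-tasks flow from staging tables to target tables can be sketched as a toy model. It assumes a stream is an offset into a table's change log that advances when the changes are consumed, and a task is a scheduled step that runs only when the stream has new data. The class names, the `city` field, and the uppercase transformation are hypothetical and only illustrate the pattern.

```python
class Table:
    """Append-only toy table that keeps an ordered change log."""

    def __init__(self):
        self.changes = []

    def insert(self, row):
        self.changes.append(row)


class Stream:
    """Tracks an offset into a table's change log; consuming advances the offset."""

    def __init__(self, table):
        self.table = table
        self.offset = len(table.changes)

    def has_data(self):
        return self.offset < len(self.table.changes)

    def consume(self):
        pending = self.table.changes[self.offset:]
        self.offset = len(self.table.changes)
        return pending


def run_task(stream, target):
    """One scheduled run: skip when nothing changed, otherwise transform and load."""
    if not stream.has_data():
        return 0
    batch = stream.consume()
    for row in batch:
        # Hypothetical transformation: normalize the city name.
        target.insert({**row, "city": row["city"].upper()})
    return len(batch)


staging, target = Table(), Table()
stream = Stream(staging)
staging.insert({"ride": 1, "city": "stockholm"})
staging.insert({"ride": 2, "city": "oslo"})
print(run_task(stream, target))  # processes the two new rows
print(run_task(stream, target))  # nothing new, so nothing is processed
```

Each run picks up only the rows added since the previous run, which is what keeps incremental transformations cheap compared with rescanning the staging table.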



SECURE DATA SHARING

Secure Live Frictionless Personalized Global



ENABLING AI, ML, AND DATA SCIENCE

• Improving data science speed and efficiency with centralized source of high performance data
• Accelerating data exploration and preparation by 10-100x
• Connectors to leading and emerging technologies
• First class ecosystem of partners



SPEAKER Q&A



JOIN US AFTER THE BREAK



THANK YOU



OUR JOURNEY TO SNOWFLAKE

Paul Flynn & Jakob Matto


HOOK US UP – WE'RE IN YOUR COMMUNITY NOW

pjflynn
jakobmatto

2020-03-06

WHY DID WE NEED TO CHANGE ANYTHING?

AWS

• No legacy on-prem and born in the cloud
• Redshift
– Performance: scales as a set of servers, not as a service
– Concurrency and table locking
– Distribution, vacuuming, indexing, patching, uptime, environments
– Ultimately Postgres is a conventional server database that wasn’t built for the cloud, and we are not interested in infrastructure or maintenance

AWS DATA WAREHOUSE ARCHITECTURE

HOW DID WE DO IT?

SNOWFLAKE MVP

• Test Snowflake, because curiosity is often free


• Build your relationships in the community
• Sell the vision
• Utilize a business case and get an MVP built

COMMUNITY – MEET CUSTOMERS

WHAT DID WE DO?

MIGRATE - PHASE 1

• Get across to Snowflake ASAP


• Re-use what you can – Snowflake Import
• Just go live – we are there right now!
• An amazing team helps and some of them are here today

DW DATA WAREHOUSE – A FOCUS ON SAAS

PHASE 2: WHERE THE FOCUS SHOULD ALWAYS HAVE BEEN

1. Integration has been commoditised so adapt!


2. Stronger focus on business logic and processes

PHASE 3: TACKLE ANALYTICS ADOPTION

• DW Data Culture
• Moving through the BI maturity curve

THANK YOU

• In case of demo ghosts → use the following slides instead of live demo

• If the demo ghosts are still haunting this presentation → use the following slides

Snowflake & Data @ Voi

Björn Idrén
Head of Business Intelligence & Analytics
Voi Technology
Before Voi

‣ Ericsson Sales, Tech, Product, Business Intelligence.


‣ IPX, Business Intelligence.
‣ Klarna, Business Intelligence & Data Warehousing.
‣ Soundtrack Your Brand / Spotify Business, Business Intelligence & Analytics
This is VOI
VOI is a fun, safe and green transport option that changes how people move in the city.

‣ Startup
‣ Founded September 2018
‣ 550+ Employees
‣ 35+ Cities
‣ 10 Countries

‣ Operational Heavy
‣ Intense Need for Data & Insights
‣ Fast Changing, Fast Pace
‣ {any other words describing a positive chaos with exponential growth}
Being a Startup
‣ Done is better than perfect
‣ Iteration is innovation
‣ Solving problems with money is not an option
‣ MVP - Minimum Viable Product
‣ Keep it Simple
‣ Keep keeping it simple
Data / BI setup
‣ All Data reside in the Cloud
‣ Everyone on Mac (all but finance..)
‣ Minimalistic Lake -> DW
‣ Snowflake Partner -
+ 10+ SaaS products for growth
Traditional Business Intelligence
‣ 2-3 teams
‣ Delivery organisations
‣ Requirements given from the business
‣ Mature(r) understanding of BI in the business
‣ Speed < 100% accuracy
‣ Often little to no effort on utilising existing data
Startup Business Intelligence
‣ One person at best
‣ Speed > 100% accuracy
‣ Prototyping
‣ BI not an established function
‣ Evangelist mode
Speed of Iteration

Data Driven?
Utilize existing Data
Share the data with everyone

An individual without information can't take responsibility.


An individual with information can't help but take responsibility.
- Jan Carlzon
Data Driven
Prototyping
Voi
Share data

First attempt mapping rides on a map
Examples
Data Prototyping
Voi
Example - Voi
Live demo - Voi

Click Me
