
Cloud Data Warehousing with Azure Synapse Analytics
(formerly Azure SQL Data Warehouse)

Igor Stanko
Principal Group PM Manager
igorstan@microsoft.com

Phinean Woodward
Data Architect, Unilever
Phinean.Woodward@unilever.com

BRK3051
Agenda
- Azure Synapse Analytics overview
- What’s new
- Demo
- Unilever story
Azure Synapse Analytics is Azure SQL Data Warehouse evolved, blending big data, data warehousing, and data integration into a single service for end-to-end analytics at cloud scale
Azure Synapse Analytics
limitless analytics service with unmatched time to insight

Designed for analytics workloads at any scale
Artificial Intelligence / Machine Learning / Internet of Things • Intelligent Apps / Business Intelligence

Synapse Analytics
- Experience: Synapse Analytics Studio – SaaS developer experiences for code-free and code-first
- Languages: SQL, Python, .NET, Java, Scala, R – multiple languages suited to different analytics workloads
- Form factors: provisioned and on-demand – integrated analytics runtimes available provisioned and serverless on-demand
- Analytics runtimes: SQL Analytics offering T-SQL for batch, streaming and interactive processing; Spark for big data processing with Python, Scala, R and .NET
- Platform services: management, security, monitoring, metastore, and data integration built into the service
- Storage: Azure Data Lake Storage – data lake integrated and Common Data Model aware, with enterprise security and optimized for analytics
SQL Analytics
new features available

GA features:
- Performance: Resultset caching
- Performance: Materialized views
- Performance: Ordered columnstore
- Heterogeneous data: JSON support
- Trustworthy: Dynamic Data Masking
- Continuous integration & deployment: SSDT support
- Language: Read committed snapshot isolation

Public preview features:
- Workload management: Workload isolation
- Data ingestion: Simple ingestion with COPY
- Data sharing: Share DW data with Azure Data Share
- Trustworthy: Private Link support

Private preview features:


- Data ingestion: Streaming ingestion & analytics in DW
- Built-in ML: Native Prediction/Scoring
- Data lake enabled: Fast query over Parquet files
- Language: Updateable distribution column 
- Language: FROM clause with joins
- Language: Multi-column distribution support
Note: private preview features require whitelisting
Scope: Generally Available

Best in class price performance
Interactive dashboarding with Resultset Caching
- Millisecond responses with resultset caching
- Cache survives pause/resume/scale operations
- Fully managed cache (1 TB in size)

Enable caching: ALTER DATABASE <DBNAME> SET RESULT_SET_CACHING ON
Purge cache: DBCC DropResultSetCache
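
As a hedged illustration of how the caching behavior can be observed, the sketch below enables the cache and checks a repeated dashboard query for a cache hit; the database name, table, and query label are placeholders, and the result_cache_hit column of sys.dm_pdw_exec_requests is assumed per the dedicated SQL pool DMVs.

-- Enable result set caching for the warehouse (run while connected to master)
ALTER DATABASE [MyDW] SET RESULT_SET_CACHING ON;

-- Run the same dashboard query twice; the second run can be answered from the cache
SELECT Region, SUM(SalesAmount) AS TotalSales
FROM dbo.FactSales                                 -- hypothetical fact table
GROUP BY Region
OPTION (LABEL = 'dashboard-sales-by-region');

-- result_cache_hit = 1 indicates the request was served from the result set cache
SELECT request_id, command, result_cache_hit
FROM sys.dm_pdw_exec_requests
WHERE [label] = 'dashboard-sales-by-region';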
Scope: Generally Available

Best in class price performance
Interactive dashboarding with Materialized Views
- Automatic data refresh and maintenance
- Automatic query rewrites to improve performance
- Built-in advisor
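
For illustration, a minimal sketch of a materialized view behind such a dashboard, assuming a hypothetical dbo.FactSales table; the explicit distribution option and the COUNT_BIG(*)/ISNULL pattern are included to satisfy common restrictions on materialized view definitions.

-- Pre-aggregate sales by region; matching GROUP BY queries can be rewritten to use the view
CREATE MATERIALIZED VIEW dbo.mvSalesByRegion
WITH (DISTRIBUTION = HASH(Region))
AS
SELECT Region,
       COUNT_BIG(*) AS RowCnt,
       SUM(ISNULL(SalesAmount, 0)) AS TotalSales
FROM dbo.FactSales
GROUP BY Region;

-- The built-in advisor surfaces candidate views via EXPLAIN WITH_RECOMMENDATIONS <query>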
Scope: Public Preview

Workload aware query execution
Intra-cluster workload isolation (scale in): e.g. Sales reserved 60% of compute and Marketing 40% on a 1000c DWU warehouse, with local in-memory + SSD cache

Workload Isolation
- Multiple workloads share deployed resources
- Reservation or shared resource configuration
- Online changes to workload policies

CREATE WORKLOAD GROUP Sales
WITH
(
    [ MIN_PERCENTAGE_RESOURCE = 60 ]
    [ CAP_PERCENTAGE_RESOURCE = 100 ]
    [ MAX_CONCURRENCY = 6 ]
)
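
To make the Sales reservation apply to real requests, a classifier maps a login to the group; a minimal sketch follows, using the CREATE WORKLOAD CLASSIFIER syntax from the backup slides, with SalesETL as a hypothetical login.

-- Route requests from the (hypothetical) SalesETL login into the Sales workload group
CREATE WORKLOAD CLASSIFIER SalesClassifier
WITH ( WORKLOAD_GROUP = 'Sales',
       MEMBERNAME = 'SalesETL',
       IMPORTANCE = ABOVE_NORMAL );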
Scope: Private Preview (whitelisting needed)

T-SQL Language – Heterogeneous data preparation & ingestion
Native SQL Streaming: streaming ingestion from Event Hubs and IoT Hub directly into the data warehouse

Built-in streaming ingestion & analytics
- High throughput ingestion (up to 200 MB/sec)
- Delivery latencies in seconds
- Ingestion throughput scales with compute scale
- Analytics capabilities (SQL-based queries for joins, aggregations, filters)
Scope: Public Preview

T-SQL Language – Heterogeneous data preparation & ingestion
Streaming, batch & trickle loading from Event Hubs, IoT Hub and Azure Data Lake into the data warehouse

COPY statement
- Simplified permissions (no CONTROL required)
- No need for external tables
- Standard CSV support (e.g. custom row terminators, escape delimiters, SQL dates)
- User-driven file selection (wildcard support)

--Copy files in parallel directly into data warehouse table
COPY INTO [dbo].[weatherTable]
FROM 'abfss://<storageaccount>.blob.core.windows.net/<filepath>'
WITH (
    FILE_FORMAT = 'DELIMITEDTEXT',
    SECRET = CredentialObject);
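
For the wildcard-selection bullet, a hedged sketch follows; the storage account, folder, and table are placeholders, and the option names (FILE_TYPE, CREDENTIAL, FIRSTROW) reflect the form the statement later shipped with, which differs slightly from the preview syntax above.

-- Load every matching CSV under the folder in one parallel COPY; the wildcard selects the files
COPY INTO dbo.weatherTable
FROM 'https://<storageaccount>.blob.core.windows.net/weather/2019/*.csv'
WITH (
    FILE_TYPE = 'CSV',
    CREDENTIAL = (IDENTITY = 'Shared Access Signature', SECRET = '<sas-token>'),
    FIELDTERMINATOR = ',',
    ROWTERMINATOR = '0x0A',
    FIRSTROW = 2    -- skip the header row
);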
Scope: Private Preview (whitelisting needed)

Machine Learning enabled DW: create models, upload models, score models (Model + Data = Predictions)

Native PREDICT
- T-SQL based experience (interactive/batch scoring)
- Interoperability with models built elsewhere
- Execute scoring where the data lives

--T-SQL syntax for scoring data in SQL DW
SELECT d.*, p.Score
FROM PREDICT(MODEL = @onnx_model, DATA = dbo.mytable AS d)
WITH (Score float) AS p;
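
A minimal sketch of the surrounding workflow, assuming a hypothetical dbo.Models table that stores uploaded ONNX models as varbinary: load the model into a variable, then score where the data lives.

-- Load a previously uploaded ONNX model (dbo.Models is a hypothetical model store)
DECLARE @onnx_model varbinary(max) =
    (SELECT Model FROM dbo.Models WHERE ModelName = 'taxi-fare-v1');

-- Score every row of dbo.mytable in place; no data leaves the warehouse
SELECT d.*, p.Score
FROM PREDICT(MODEL = @onnx_model, DATA = dbo.mytable AS d)
WITH (Score float) AS p;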
Scope: Private Preview (whitelisting needed)

SQL Analytics – Data lake integration
ParquetDirect for interactive data lake exploration (up to 13X faster)
- >10X performance improvement
- Full columnar optimizations (optimizer, batch)
- Built-in transparent caching (SSD, in-memory, resultset)
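
As an illustration, a sketch of the external-table route over Parquet files that this feature accelerates; the data source, path, and columns are placeholders and assume an existing external data source pointing at the lake.

-- External file format and table over Parquet files in the data lake (names are placeholders)
CREATE EXTERNAL FILE FORMAT ParquetFormat
WITH (FORMAT_TYPE = PARQUET);

CREATE EXTERNAL TABLE dbo.TaxiRidesExternal
(
    RideId     bigint,
    PickupTime datetime2,
    FareAmount decimal(10, 2)
)
WITH (
    LOCATION = '/taxi/rides/',
    DATA_SOURCE = MyDataLake,     -- existing EXTERNAL DATA SOURCE for the ADLS account
    FILE_FORMAT = ParquetFormat
);

-- Interactive exploration directly over the lake
SELECT TOP 100 * FROM dbo.TaxiRidesExternal ORDER BY PickupTime DESC;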
Scope: Generally Available

Azure Data Share

Enterprise data sharing


- Built-in flexibility
• Share from DW to DW/DB/other systems
• Choose data format to receive data in (CSV, Parquet)
- One to many data sharing
- Share a single or multiple datasets
Demo scenario: explore & predict
Taxi ride pricing data: Event Hub streaming, Parquet files in the Data Lake, and the data warehouse (Arcadia workspace)
Unilever Story
Phinean Woodward, Data Architect, Unilever
WE MAKE MANY OF THE WORLD’S
FAVOURITE BRANDS

On any day, 2.5 billion people use Unilever products to look good, feel good and get more out of life – giving us a
unique opportunity to build a brighter future
Modern Data Warehouse Logical Representation

• Products: each product will have its own resource group for cataloguing and cross-charging purposes; generally implemented using SQL DW, AAS and PBI, but with flexibility based on requirements
• Business Data Lakes (BDL): implemented using ADLS
• Data processing between the lake layers: implemented using Azure Databricks
• Universal Data Lake (UDL): implemented using ADLS
• Orchestration: ADF
Azure Synapse Analytics (formerly SQL DW)

 Used in 50+ projects


 Model – facts & dimensions:
 data available to our user & data science community in a familiar format
 Scale up/down capability – power for processing & costs managed

 New features Unilever is leveraging


 Reserved instance pricing
 Workload management & workload isolation
 Result set cache
 Materialised views
 Fast queries over parquet files
Unilever and Azure Synapse Analytics (Preview)

 In preview for ~3 months


 Brings together data integration, data warehousing and big data processing capabilities
at scale
 Accelerates the delivery of BI, AI and Intelligent Applications
 Build analytics solutions from ingestion to BI reporting in a single workspace
 Easy Management
 Integrated Dev Ops environment
 Easy deployment
 Secure Environment
 Centralized Monitoring & Alerting
 Additional features
Q/A
Backup
Want to learn more about Azure Synapse Analytics?
Check out these sessions for more information

BRK2187 NEW! Introducing Azure Synapse Analytics: the Next Evolution of SQL Data Warehouse for Every Data Professional Monday, 3:15PM

BRK3044 Migrating Your Mission-Critical Data Warehouse to Azure Synapse Analytics Monday, 4:30PM

BRK3330 Unifying AI-to-BI with Azure Synapse Analytics Tuesday, 9:15AM

BRK3224 Modernizing your Data Warehouse with Data Ingestion, Preparation, and Serving using Azure Synapse Analytics Tuesday, 10:30AM

BRK3229 Securing Your Data Warehouse with Azure Synapse Analytics Tuesday, 11:45AM

BRK3050 Democratizing the Data Lake with On-Demand Capabilities in Azure Synapse Analytics Tuesday, 3:30PM

BRK3051 Cloud Data Warehousing with Azure Synapse Analytics Wednesday, 10:30AM
Want to learn more about analytics on Azure?
Check out these sessions for more information

BRK3045 Code-free ETL using Azure Data Factory & Data Share Wednesday, 2:15PM

BRK3094 Modern Data Integration Scenarios & B2B data sharing using Azure Data Share Wednesday, 3:30PM

BRK3046 Achieving Petabyte-Scale Data Ingestion with Azure Data Factory Thursday, 9:15AM

BRK3043 Maximizing your Azure Databricks Deployment Thursday, 10:30AM

BRK3042 Gaining Business Insights with Open Source Analytics on Azure HDInsight: Patterns and Best Practices Friday, 9:15AM

BRK3047 Prepare for the Next Era of Insights Using Azure Data Lake Storage Friday, 11:45AM

BRK2066 Enabling Real-Time Analytics Patterns from the Cloud to the Intelligent Edge with Azure Stream Analytics Tuesday, 2:15PM

BRK3048 Build High Performance Time Series and Log Data Analysis Solutions with Azure Data Explorer Friday, 10:30AM
Want to get hands-on?
Check out these labs to get hands-on with the latest in Azure Analytics.

WRK4002 Data Integration using Azure Data Factory and Azure Data Share Tuesday, 4:00PM

WRK4000 Build Solutions Powered by Real-time Analytics using Azure Stream Analytics and Azure Data Explorer Wednesday, 12:30PM

WRK4001 Building an End-to-end Analytics Pipeline with Azure Synapse Analytics Thursday, 4:00PM

Stopping by our booth?


Check out these theater sessions in the Hub for even more information

THR3110 Big Data Processing with Spark and .NET Tuesday, 12:40PM

THR3113 Maximizing ROI in SQL Data Warehouse with Enhanced Workload Management Wednesday, 12:40PM

THR3116 Streaming Data with Azure EventHubs and Kafka Wednesday, 3:05PM

THR3128 Time Series and Machine Learning with Azure Data Explorer Thursday, 1:15PM

THR3133 Staying Productive with Azure Data's Developer Tooling Thursday, 10:20AM

THR3119 What's New with Azure Data Lake Storage? Thursday, 2:30PM
Please evaluate this session
Your feedback is important to us!

Please evaluate this session through MyEvaluations on the mobile app or website.
Download the app:
https://aka.ms/ignite.mobileapp
Go to the website:
https://myignite.techcommunity.microsoft.com/evaluations
Find this session in Microsoft Tech Community
Visit aka.ms/MicrosoftIgnite2019/BRK3051
 Download slides and resources
 Access session recordings in 48 hours
 Ask questions & continue the conversation
© Copyright Microsoft Corporation. All rights reserved.
CREATE WORKLOAD CLASSIFIER classifier_name
WITH
(
    WORKLOAD_GROUP = 'name',
    MEMBERNAME = 'security_account'
    [ [ , ] IMPORTANCE = { LOW | BELOW_NORMAL | NORMAL (default) | ABOVE_NORMAL | HIGH } ]
)

Workload aware query execution
Scheduler with Importance turned on

Workload Importance
With importance enabled, the scheduler runs high-importance requests (e.g. the CEO's queries) ahead of queued normal- and low-importance requests, rather than strictly in arrival order.
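
A small monitoring sketch to pair with the diagram: once classifiers assign importance, the request DMV shows which requests are running versus queued and at what importance; the importance column is assumed per the dedicated SQL pool DMVs.

-- Inspect running vs. queued requests together with their assigned importance
SELECT request_id,
       [status],        -- e.g. Running, Suspended (queued)
       importance,
       submit_time,
       command
FROM sys.dm_pdw_exec_requests
WHERE [status] NOT IN ('Completed', 'Failed', 'Cancelled')
ORDER BY submit_time;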
