Aniruddha BigDataandAnalytics
anid@microsoft.com
The world is changing
DATA: Data will grow to 44 ZB in 2020
CLOUD: Today, 80% of organizations adopt cloud-first strategies
AI: AI investment increased by 300% in 2017
Organizations that harness data, cloud, and AI outperform
Companies surveyed include well-known
enterprises across key industries
Unstructured data limits ability to analyze and take action
Data security, privacy and regulatory requirements are of paramount importance
Unique government opportunities
HYBRID
On-premises, Private cloud, Cloud
Reason over any data, anywhere
Flexibility of choice
Security and Performance
The Azure Big Data Landscape
AZURE DATA FACTORY, AZURE IMPORT/EXPORT SERVICE, AZURE SQL DB, AZURE COSMOS DB, AZURE SQL DATA WAREHOUSE, AZURE ANALYSIS SERVICES, POWER BI
AZURE EXPRESSROUTE, AZURE ACTIVE DIRECTORY, AZURE NETWORK SECURITY GROUPS, AZURE KEY MANAGEMENT SERVICE, OPERATIONS MANAGEMENT SUITE, AZURE FUNCTIONS, VISUAL STUDIO
SQL Server 2019
Azure Data Lake
Azure Databricks
Industry-leading performance and security, with intelligence over all your data
[Chart: Vulnerabilities (2010-2017)]
#1 OLTP performance
#1 DW performance on 1 TB, 10 TB, and 30 TB
Intelligent Query Processing
AI and Machine Learning over all data with the power of SQL and Apache Spark
T-SQL, PHP, Python, Java, Node.js, Ruby, C/C++, C#/VB.NET
The best of Power BI and SQL Server Reporting Services with Power BI Report Server
Combine data from many sources without moving or replicating it
Scale out compute and caching to boost performance
Store high volume data in a data lake and access it easily using either SQL or Spark
Management services, admin portal, and integrated security make it all easy to manage
Easily feed integrated data from many sources to your model training
Ingest and prep data and then train, store, and operationalize your models all in one system
Custom apps, BI, Analytics
SQL Server master instance
Security
Always Encrypted with secure enclaves
Data Classification and auditing built-in
Manage certificates more easily with SQL Server Configuration Manager
Availability
Always On availability group enhancements
Resumable online index creation
Online Clustered Columnstore index creation and rebuild
Availability groups on Kubernetes
SQL Server 2019
Azure Data Lake
Azure Databricks
[Chart: analytics types plotted by VALUE against DIFFICULTY]
Descriptive Analytics: What happened?
Diagnostic Analytics: Why did it happen?
Predictive Analytics: What will happen?
Prescriptive Analytics: How can we make it happen?
Theory, Hypothesis, Observation, Confirmation
Observation, Pattern, Hypothesis, Theory
Understand Corporate Strategy
Gather Requirements: Business Requirements, Technical Requirements
Implement Data Warehouse:
Reporting & Analytics Design, Reporting & Analytics Development
Dimension Modelling, Physical Design
ETL Design, ETL Development
Setup Infrastructure, Install and Tune
Layers: BI and analytics, Data warehouse, ETL, Data sources
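The warehouse flow above fixes the design before any data is loaded. A minimal sketch of that order (model, then ETL, then reporting), assuming a hypothetical sales fact table with a single product dimension. The table names, column names, and rows are invented, and Python's built-in sqlite3 stands in for a real warehouse:

```python
import sqlite3

conn = sqlite3.connect(":memory:")

# Dimension modelling / physical design: the schema is defined up front.
conn.executescript("""
    CREATE TABLE dim_product (
        product_id INTEGER PRIMARY KEY,
        name       TEXT NOT NULL,
        category   TEXT NOT NULL
    );
    CREATE TABLE fact_sales (
        sale_id    INTEGER PRIMARY KEY,
        product_id INTEGER NOT NULL REFERENCES dim_product(product_id),
        amount     REAL NOT NULL
    );
""")

# ETL: source rows are shaped to fit the schema as they are loaded.
source_rows = [
    ("Widget", "Hardware", 10.0),
    ("Gadget", "Hardware", 15.0),
    ("Widget", "Hardware", 5.0),
    ("Manual", "Books", 8.0),
]
product_ids = {}
for name, category, amount in source_rows:
    if name not in product_ids:
        cur = conn.execute(
            "INSERT INTO dim_product (name, category) VALUES (?, ?)",
            (name, category),
        )
        product_ids[name] = cur.lastrowid
    conn.execute(
        "INSERT INTO fact_sales (product_id, amount) VALUES (?, ?)",
        (product_ids[name], amount),
    )

# Reporting: aggregate the facts by a dimension attribute.
report = conn.execute("""
    SELECT d.category, SUM(f.amount)
    FROM fact_sales AS f
    JOIN dim_product AS d ON d.product_id = f.product_id
    GROUP BY d.category
    ORDER BY d.category
""").fetchall()
```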
Data Lake Uses A Bottom-Up Approach
Sources: DEVICES; LOGS, FILES AND MEDIA (UNSTRUCTURED); BUSINESS / CUSTOM APPS (STRUCTURED)
Uses: Batch queries, Interactive queries, Real-time analytics, Machine Learning, Data warehouse
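One way to read the bottom-up approach: raw data lands first, and structure is applied only when a question is asked. A minimal sketch of that idea, assuming newline-delimited JSON events with invented field names. An in-memory list stands in for files in the lake, and `read_with_schema` is a hypothetical helper, not part of any Azure SDK:

```python
import json

# Raw events land as-is; records from different sources need not share fields.
raw_lines = [
    '{"source": "device", "device_id": "d1", "temp_c": 21.5}',
    '{"source": "app", "user": "alice", "action": "login"}',
    '{"source": "device", "device_id": "d2", "temp_c": 23.5}',
    'not valid json',
]

def read_with_schema(lines, required_fields):
    """Yield only the records that parse and carry every required field."""
    for line in lines:
        try:
            record = json.loads(line)
        except json.JSONDecodeError:
            continue  # the raw line stays in the lake; it is skipped for this query
        if isinstance(record, dict) and all(f in record for f in required_fields):
            yield {f: record[f] for f in required_fields}

# The schema is chosen at query time, per question.
readings = list(read_with_schema(raw_lines, ["device_id", "temp_c"]))
average_temp = sum(r["temp_c"] for r in readings) / len(readings)
```

A different question (for example, login activity) would pass a different field list over the same raw lines, with no reload.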
WASB, ADLS, Azure Data Lake Storage Gen2
Azure Data Lake Storage Gen2: Single Data Lake Store that combines the performance and innovation of ADLS with the scale and rich feature set of Blob Storage
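The three storage options named here are addressed through different URI schemes from Hadoop- and Spark-based tools. A small sketch of the path formats, assuming hypothetical account and container names. The helper functions are invented for illustration and only build strings; they do not contact Azure:

```python
def wasb_uri(container, account, path, secure=True):
    """Blob Storage through the WASB driver."""
    scheme = "wasbs" if secure else "wasb"
    return f"{scheme}://{container}@{account}.blob.core.windows.net/{path.lstrip('/')}"

def adls_gen1_uri(account, path):
    """Azure Data Lake Store (Gen1)."""
    return f"adl://{account}.azuredatalakestore.net/{path.lstrip('/')}"

def abfs_uri(filesystem, account, path, secure=True):
    """Azure Data Lake Storage Gen2 through the ABFS driver."""
    scheme = "abfss" if secure else "abfs"
    return f"{scheme}://{filesystem}@{account}.dfs.core.windows.net/{path.lstrip('/')}"

# Hypothetical names: file system "raw" in storage account "mylake".
gen2_path = abfs_uri("raw", "mylake", "/events/2020")
```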
SQL Server 2019
Azure Data Lake
Azure Databricks
What is Azure Databricks?
A fast, easy and collaborative Apache® Spark™ based analytics platform optimized for Azure
Interactive workspace that enables collaboration between data scientists, data engineers, and business analysts.
Native integration with Azure services (Power BI, SQL DW, Cosmos DB, Blob Storage)
Enterprise-grade Azure security (Active Directory integration, compliance, enterprise-grade SLAs)
Azure Databricks
Collaborative Workspace
Optimized Databricks Runtime Engine: DATABRICKS I/O, APACHE SPARK, SERVERLESS
REST APIs
Data warehouses, Data exports, Hadoop storage
Enhance Productivity
Build on secure & trusted cloud
Scale without limits
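The REST APIs listed on this slide can be called from any HTTP client. A minimal sketch, assuming the Clusters API 2.0 list endpoint and bearer-token authentication. The workspace URL and token below are placeholders, and `build_clusters_list_request` is a hypothetical helper that builds the request without sending it:

```python
import urllib.request

def build_clusters_list_request(workspace_url, token):
    """Build (but do not send) a GET request for the Clusters API list endpoint."""
    url = workspace_url.rstrip("/") + "/api/2.0/clusters/list"
    return urllib.request.Request(
        url,
        headers={"Authorization": f"Bearer {token}"},
        method="GET",
    )

# Placeholder values; substitute a real workspace URL and access token.
request = build_clusters_list_request(
    "https://adb-1234567890123456.7.azuredatabricks.net",
    "<personal-access-token>",
)
# To send it against a real workspace: urllib.request.urlopen(request)
```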
Sources: Sensors and devices
Ingest: Event Hubs, IoT Hub
Serve: Cosmos DB, SQL Data Warehouse, Analysis Services, Machine Learning
Outputs: Predictive apps, Operational reports, Analytical dashboards
Empower today’s innovators to unleash the power of data
and reimagine possibilities that will improve our world