
Good morning. Is the webinar being recorded?

Will the recording or slides be available later?

No, unfortunately the recording is not available for this virtual session. The
slides are not available either, but all the topics are covered at these links:
https://docs.microsoft.com/en-us/learn/
https://docs.microsoft.com/en-us/learn/paths/azure-fundamentals/

Most of my customers get confused between Azure Databricks and Azure Synapse,
as there is some overlap between them.
Could you please explain the difference in simple terms?

Well, yes. There is indeed an overlap between the two.


The first difference is that Azure Synapse Analytics has several major
components, including two compute engines: a T-SQL engine (SQL Pool) and a
Spark engine (Spark Pool). They are independent, but can be used in parallel on
the same data stored in Azure Data Lake Storage Gen2.
Azure Databricks has no separate SQL (MPP / DW) engine; it is "pure" Spark.
I would recommend Azure Databricks over Azure Synapse Spark Pool if:
1. You need to go into production immediately (Synapse Spark Pool is still in
Public Preview).
2. You are planning very complex aggregations, mathematical computations, or
AI / ML combined with complex pre-processing.
3. You have very specific requirements for auto-scaling, scaling up and down, or
performance tuning.
On the other hand, I would recommend Azure Synapse Spark Pool for:
1. medium-complexity aggregations,
2. well-defined routine processing,
3. using Spark in conjunction with DW technologies.
Please be aware that these recommendations hold only until Azure Synapse
Analytics Spark Pools are released as a Generally Available service. We will
review them at that point.
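As a rough illustration only, the rules of thumb above could be written down as
a small helper. The class, field, and function names here are hypothetical, and
the logic simply mirrors the three criteria listed while Synapse Spark Pool is
in Public Preview; it is not an official decision tool.

```python
from dataclasses import dataclass


@dataclass
class Workload:
    # Hypothetical flags, one per criterion from the answer above.
    needs_production_now: bool = False        # criterion 1: Spark Pool is in preview
    complex_compute_or_ml: bool = False       # criterion 2: heavy math, AI / ML
    specific_scaling_or_tuning: bool = False  # criterion 3: scaling / tuning needs


def recommend_spark_engine(workload: Workload) -> str:
    """Return the engine suggested by the rules of thumb in this answer."""
    if (
        workload.needs_production_now
        or workload.complex_compute_or_ml
        or workload.specific_scaling_or_tuning
    ):
        return "Azure Databricks"
    # Medium-complexity aggregations, routine processing, Spark next to DW.
    return "Azure Synapse Spark Pool"


if __name__ == "__main__":
    print(recommend_spark_engine(Workload(complex_compute_or_ml=True)))
    print(recommend_spark_engine(Workload()))
```

Any single criterion is enough to tip the suggestion toward Azure Databricks,
which matches how the list is phrased; a real assessment would weigh more
factors than these three flags.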

Please could you point us to sample data and practice exercises to develop our
expertise?

Well, I would recommend attending the Modern Data Warehouse Hackathon (it is
free of charge and very hands-on):
https://openhack.microsoft.com/
There are also several Quick Start Guides available. They are useful for
acquiring hands-on knowledge quickly, and they already refer to public datasets.
1. Creating an Azure Synapse SQL Pool:
https://docs.microsoft.com/en-us/azure/synapse-analytics/quickstart-create-sql-pool-portal
NOTE: I would recommend using the smallest possible size (DW100) and pausing /
terminating it after you are done experimenting.
2. Loading data into Azure Synapse SQL Pools (using the COPY command, which is
PolyBase-backed):
https://docs.microsoft.com/en-us/azure/synapse-analytics/sql-data-warehouse/quickstart-bulk-load-copy-tsql
3. Using Azure Data Factory to copy data:
https://docs.microsoft.com/en-us/azure/data-factory/quickstart-create-data-factory-copy-data-tool
4. Using SQL On-demand to query data directly from a storage account:
https://docs.microsoft.com/en-us/azure/synapse-analytics/quickstart-sql-on-demand
5. Working with Spark Pool:
https://docs.microsoft.com/en-us/azure/synapse-analytics/quickstart-apache-spark-notebook
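
To give a feel for the COPY command mentioned in item 2 before you open the
quick start, here is a minimal sketch that assembles a T-SQL COPY INTO statement
for a CSV file. The helper name, the table name, and the storage URL are
placeholders assumed for illustration; the quick start linked above has the real
dataset locations and the full set of options.

```python
def build_copy_statement(table: str, source_url: str,
                         first_row: int = 2,
                         field_terminator: str = ",") -> str:
    """Assemble a basic COPY INTO statement for loading CSV files.

    first_row=2 skips a single header line in the source files.
    """
    return (
        f"COPY INTO {table}\n"
        f"FROM '{source_url}'\n"
        "WITH (\n"
        "    FILE_TYPE = 'CSV',\n"
        f"    FIELDTERMINATOR = '{field_terminator}',\n"
        f"    FIRSTROW = {first_row}\n"
        ")"
    )


if __name__ == "__main__":
    # Placeholder table and URL: substitute your own storage account and path.
    statement = build_copy_statement(
        "dbo.SampleTable",
        "https://<account>.blob.core.windows.net/<container>/<folder>/",
    )
    print(statement)
```

The printed statement can be pasted into a query window connected to the SQL
Pool. Private storage additionally needs a CREDENTIAL option, which is left out
of this sketch.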
