
0001

Databricks is a user interface wrapped around Apache Spark.


Write your code in a notebook.
Create a cluster to execute the code.
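
The "create a cluster" step can also be written down as a specification instead of clicked through in the UI. The sketch below builds a request body using field names from the Databricks Clusters API (cluster_name, spark_version, node_type_id, autoscale, autotermination_minutes). The helper name build_cluster_spec and all the values passed to it (cluster name, runtime version, VM size, worker counts) are placeholders assumed for illustration, not taken from these notes. It only builds and prints the JSON; sending it to a workspace is left out.

```python
import json


def build_cluster_spec(name, spark_version, node_type_id, min_workers, max_workers):
    """Return a request body for a Databricks Clusters API 'create' call.

    Hypothetical helper for illustration; only the dictionary keys
    correspond to Clusters API fields.
    """
    if min_workers > max_workers:
        raise ValueError("min_workers must not exceed max_workers")
    return {
        "cluster_name": name,
        "spark_version": spark_version,   # Databricks runtime version string
        "node_type_id": node_type_id,     # VM size used for each node
        "autoscale": {                    # worker count grows and shrinks in this range
            "min_workers": min_workers,
            "max_workers": max_workers,
        },
        "autotermination_minutes": 30,    # shut down after 30 idle minutes
    }


# Placeholder values; check what your own workspace offers.
spec = build_cluster_spec(
    name="notes-demo-cluster",
    spark_version="13.3.x-scala2.12",
    node_type_id="Standard_DS3_v2",
    min_workers=1,
    max_workers=4,
)
print(json.dumps(spec, indent=2))
```

Choosing a larger node_type_id is one way to scale up, while raising max_workers is one way to scale out.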

0002
Three technologies are converging. This is the most important technology shift since the internet.
It is not possible to process this much data on-premises.
It is possible only in the cloud.
Spark is an open-source big data platform for data science.
Spark on its own does not provide an IDE or collaboration features, and it is not optimized for the cloud.
Databricks is a commercial product created by the people behind Spark. It adds team collaboration and many tools of its own, and it scales up, scales out, and does the heavy lifting.

Hadoop Ecosystem


0004 Spark Cluster Architecture

006 Apache Spark


What is a Databricks workspace?
It is an environment containing all the resources needed to support the workspace:
File storage
VMs
Databricks software
Networking
Integration with the cloud provider
Everything else required to run the environment

Create a Databricks workspace in Azure
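
Besides the Azure portal, a workspace can be created from the command line. The sketch below assembles an `az databricks workspace create` command, assuming the Azure CLI and its Databricks extension are installed and you are logged in. The helper name build_workspace_create_command and the resource group, workspace name, and region are placeholders assumed for illustration. The code only builds and prints the command; it does not run it.

```python
import shlex

# SKU values accepted by the CLI's --sku option.
ALLOWED_SKUS = {"standard", "premium", "trial"}


def build_workspace_create_command(resource_group, name, location, sku="standard"):
    """Return the argument list for 'az databricks workspace create'.

    Hypothetical helper for illustration; it does not call Azure.
    """
    if sku not in ALLOWED_SKUS:
        raise ValueError(f"unknown sku: {sku!r}")
    return [
        "az", "databricks", "workspace", "create",
        "--resource-group", resource_group,
        "--name", name,
        "--location", location,
        "--sku", sku,
    ]


# Placeholder names; substitute your own resource group, name, and region.
cmd = build_workspace_create_command(
    resource_group="my-resource-group",
    name="my-databricks-ws",
    location="westeurope",
)
print(shlex.join(cmd))
```

To actually create the workspace, paste the printed command into a shell, or pass cmd to subprocess.run(cmd, check=True).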
