
Telecom Bell

Cloud Migration Kickoff


Yashodhan Kale
Delivery Solutions Architect | Databricks
05/30/2023
Contents

1 Summary

2 Platform & Architecture

3 Approach

4 Operating Model

5 Additional Details
1 Summary | Business challenges

1 TELECOM BELL must improve network QoS to align with consumers' changing emphasis on mobile connectivity and data usage.

2 As IoT and 5G advance, customers easily switch providers, prompting TELECOM BELL to prioritize personalized engagement using customer data for customized messaging and services.

3 TELECOM BELL is subject to many regulations, including data privacy and security regulations, and needs effective ways to adhere to these.

4 Power of data: there is a data-volume explosion, requiring both focus and new capabilities.

5 Increasing pressure to show growth and profits is constant, and data and AI will be a critical enabler.
2 Summary | Technical Challenges
Today there are increased expectations and pressure on the Telecom organization to have a strong data & analytics strategy

Data platform is not scalable for analytics, AI/ML

Upfront capacity planning and cost

Governance of the data on HDFS is a challenge

Data sits in silos and is not easy to integrate or connect

Lack of discoverability of data (catalog)

Housekeeping - Maintenance of the in-house cluster is difficult across different portals and installations

Advanced disaster recovery, durability and availability are hard to achieve

Bigger IT infra staff required


3 Summary | Executive Plan

1 Telecom Bell wants to improve the Quality of Service (QoS) of their network and to get there, start
migrating the core applications to cloud.

2 Databricks will bring industry leading expertise and Databricks platform expertise to drive the
transformation at speed.

3 Confluent will bring event streaming platform built on Kafka and the necessary platform support

4 Telecom Bell has a team of 10 engineers with expertise in Kafka and Spark

5 Desired timeline: May 2024
1 Platform & Architecture | Current Architecture

Limitations
• Data platform is not scalable for analytics, AI/ML
• Upfront capacity planning and cost
• Governance of the data on HDFS is a challenge
• Data sits in silos and is not easy to integrate or connect
• Lack of discoverability of data (catalog)
• Housekeeping - Maintenance of the in-house cluster is difficult across different portals and installations
• Advanced disaster recovery, durability and availability are hard to achieve
• Bigger IT infra staff required
2 Platform & Architecture | End state Architecture
Design target state architecture for a scalable, secure and well governed data platform
(AI /ML self-serve, advanced engineering capabilities including necessary governance on lake capability)

Designing and activating a World Class Data Platform:


Highlights
• Warehouse + Data Lake capabilities at scale with Governance
• Data product mindset – Marketplace, Self service capabilities
• MLOps – Full ML Lifecycle
• Domain data tiers - Advanced data management capabilities, curated democratized data layers

Fundamental Principles
• Scalability
• Performance
• Industrialized processes governing the pipeline
• Distributed, fault tolerant architecture
• Open file format for better interoperability between systems
• Security and reliability
• Data provenance and lineage
• ACID compliant
3 Platform & Architecture | Current vs New

1 Governance under the same roof
2 More performant and optimized Spark engine

New
4 Platform & Architecture | Artifacts
A World Class Data Platform! Key components of the data platform:
1 Approach | Our Tenets

A Security is job zero
B Multiple velocity joint delivery approach
C Leverage customer asset first
D Continuous delivery of results
E Zero down time
F Log the journey at every step to look back & learn
G Principle of least access privilege (PoLAP)
H Agile Methodology

Because - "Approach is the first step towards achieving goals"


2 Approach | Objectives
HORIZON 1 Strategic roadmap: Build the data strategy roadmap that empowers Telecom Bell to overcome its business challenges

HORIZON 2 Platform: Build strong foundations with data platform development and implementation

HORIZON 3 Mindset: Co-create an operating model that would take TELECOM BELL where it wants, in a sustainable way

HORIZON 4 Industrialization: Migrate core applications to cloud in a secure and reliable way
1 Operating Model | Joint Delivery Approach
Executive Leadership
• Databricks Leadership: 1
• Telecom Bell Leadership: 1

Program Management
• Databricks Lead: 1
• Telecom Bell Lead: 1

Meeting Cadence
• Bi-weekly Steering Committee meetings
• Weekly PMO meetings
• Daily delivery team meetings

Delivery Teams
A Application Team: Databricks (Professional services) 5, Telecom Bell resources 3
B Platform Team: Databricks (Professional services) 5, Telecom Bell resources 3
C Data Quality & Governance: Databricks (Professional services) 3, Telecom Bell resources 4
D Bringing it Together: Databricks (Professional services) 3, Telecom Bell resources 1
2 Operating Model | Pod Structure
[Figure: pod structure diagram. The assignment of roles to pods is not recoverable from the extracted text.]

Pods: Leadership, Platform Team, Application Team, Data Quality & Governance, Bring it Together

Roles shown in the diagram:
• Scrum Master (shared resource)
• Resident Solutions Architect
• Specialist Solutions Architect (Security)
• Delivery Solutions Architect
• Cloud Architect
• Cloud DevOps Engineer
• Platform Engineer
• Azure Platform Engineer
• Data Engineer
• Data Visualization Engineer
• Functional Domain Expert
• Customer Success Leader
• Enterprise Support
• Product Owner
• Roadmap Officer
• Change Management Specialist
• Data Governance Lead
• Data Quality Engineer
• Data Lineage and Profiling Engineer
• Test / Quality Lead
• PMO Lead
• Delivery Lead

Legend: 16 Databricks resources, 12 Telecom Bell resources
3 Operating Model | Road Map
[Figure: road map diagram. Recoverable content listed below.]

Milestones
• PROGRAM KICKOFF
1 DIAGNOSTIC OF THE CURRENT ENVIRONMENT
2 END STATE ARCHITECTURE
3 MIGRATION PLAYBOOK: A repeatable guideline to migrate applications to new architecture
4 MIGRATION: 10%
5 MIGRATION: 60%
6 MIGRATION: 100%
• CELEBRATION: Celebrate completion

Progress tracks
1 HUMAN-CENTERED CHANGE: Focus on each individual team member's technical skills and capacity for change. Reskill team members whose roles are changing
2 MINDSET CHANGE: Adopt 'Data as a Product', self service platform, federated governance, domain specific ownership
3 PLATFORM

ALONG THE WAY
• Consistently communicate, remove roadblocks & eliminate friction
• Celebrate completion of quick wins to strengthen morale

Legend: DELIVERABLES, PROCESS, MEASURE PROGRESS, GOALS
3 Operating Model | Timeline

[Figure: timeline chart spanning Q2 2023, Q3 2023, Q4 2023, Q1 2024 and Q2 2024, with four Steerco meetings. The exact start and end dates of each activity are not recoverable from the extracted text.]

Platform
• Current State Diagnostics
• Security and compliance | phase 1
• Confluent workspace setup
• Databricks workspace setup
• Security and compliance | phase 2
• Cost optimization
• Cost management reports
• Best practices and tagging
• Move towards Infra as code
• Handover

Application
• Define Elements/Sources/Data
• Talk to business team
• Refactor the code
• Incorporate changes
• Test & Modify
• Deploy
• Document & KT
• Handover

Data Quality + Governance
• Assess Current State & Catalog Critical Data Elements
• Assess Current State Data Governance
• Prepare Governance Strategy (Identify roles, define interaction model)
• Design & Deliver Governance Structure
• Design Target State DQ Monitoring
• Implement Target State DQ Monitoring
• Handover

Bring it together
• Assess skill and capability gaps within the organization
• Define Pods and teams
• Create Upskilling Curriculum and set up training sessions
• Establish ways of working (Agile): Update Roadmap and plan per evolving priorities
• Continuously monitor, foresee risk, mitigate risks, fetch leadership guidance
• Arrange handover of all areas, documentation, win celebrations
• Project management
1 Additional Details | Future Scope

HORIZON 5 Industrialization: Competitive Differentiation
• High throughput of innovation analytics (AI/ML)
• Predictive analytics at scale
• Data driven (real time what-if analysis)
• Harmonized MDM; ML & AI based DQ
• Fast, repeatable time-to-market from idea to product
1 Additional Details | Risk & Mitigation - Technical

Risks and Mitigating Actions

Data Loss Risk
• Reconciliation, checkpointing, audit, monitoring
• Use of fault tolerant ingestion/migration tools like Azure Data Factory (Az Copy Activity)

Data Corruption and Data Integrity Risk
• Data Validation - Records are compared in a bidirectional manner: each record in the old system is compared against the target system, and each record in the target system against the old system

Interference Risks (simultaneous use of source application)
• Align with the stakeholders of each source on how the bandwidth can be shared. The "Bring it together" team comes into play to address this

Schema Evolution (Changing Dimensions)
• Delta file format - Schema evolution feature. Depends on schema on read
• Further, to make sure there are no incompatible schemas coming in, a catalog and governance would be leveraged - Databricks Unity Catalog

Authorization Risk
• MFA and Identity Federation, access controls at row and column level by Delta Lake

Data Security Risk
• Apply encryption where possible and appropriate
• All tokens and keys will be securely stored and rotated in Azure Key Vault
• Rotate keys at regular intervals

Down time due to migration
• Replicate and activate approach

2 Additional Details | Risk & Mitigation - Other

Risks and Mitigating Actions

Resource Availability & Competing Priorities
• Make sure employees are fully advised about participation in workshops and/or interviews
• Get the right people at the right time

Senior Leadership Buy-In and Delays in Decision Making
• Strong support from the leadership group, including areas that are not fully involved in the initial changes. One Team, One Direction
• Establish governance to provide clarity on accountabilities for decision making

Potential Impacts to Other Projects
• Strong support from Senior Leadership if there is a need to put a hold on existing projects
• Review current state of ongoing projects to see how it impacts the Finance model
• Prioritize major changes and focus on the big obstacles upfront

Lack of People Adoption - Major Change
• Agile and inspirational change management and communication structure
• Leverage the Bring it together team, and roles like change management experts, to steward people readiness and prepare for change

Design in Isolation (Enterprise Integration)
• Work with scalable and flexible design principles in mind to ensure proper integration and alignment with the business. It is a partnership approach
• Gather key inputs to support cross-function process design decisions where applicable

Availability of Key Data Inputs and Information
• Simplify data requests to collect data and information at the appropriate level of detail
• Assign designated Databricks and Telecom Bell contacts to ensure smooth and timely transition of data
• Discovery Phase to identify hidden environmental risks, in order to foresee and mitigate them
3 Additional Details | Assumptions

No. Area: Assumption

1 Platform: The Telecom Bell on-premise platform is owned and managed by Telecom Bell, and Databricks will get the necessary support to extend the setup to provision the solution per the scope of this effort.

2 Data Security: Telecom Bell is responsible for the design, integration and operation of all Client Identity and Access Management, Security Incident and Event Management, Vulnerability Scanning and Security Testing tooling and processes as appropriate.

5 Access & Setup: Telecom Bell will provide system access to all source systems or applications required by scope. Telecom Bell will provide access to systems and environments (including DEV, SIT) within 5 business days of receipt of request.

6 Access & Setup: Databricks personnel will not have access to unencrypted PII data. Telecom Bell will be responsible for encrypting any PII data prior to extraction into the Databricks platform.

7 Access & Setup: PII and GDPR data handling will be done by Telecom Bell as per the existing practices in delivery; any additional arrangement is out of scope.

9 Project Management: Telecom Bell will provide relevant functional, technical and process documentation for data platforms and systems required by the scope.

10 Project Management: Telecom Bell will nominate full-time business and technical SMEs aligned to this project as per the agreed pod structure.

11 Project Management: Telecom Bell data owners or nominees will make every attempt to attend the Scrum meetings and ceremonies to present their progress on the issues assigned.

12 Project Management: Telecom Bell will make sure we get the required time and support from all the stakeholders for complete success of the project.

14 Data Build: The Databricks team will reuse and extend the existing data ingestion tooling and framework to support the ingestion activities into the platform. The project will carry out a data discovery exercise in which it will assess the local market data quality and readiness.

15 Data Build: The source system inventory has already been identified and is already in place.

16 License: The Cloudera CDH on-premise license expired in March 2022. However, extended support is required and has been obtained.
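
The access commitment in assumption 5 (within 5 business days of receipt of request) can be tracked with a small date helper. The sketch below is illustrative only: the function name `add_business_days` is hypothetical, and it assumes that business days are Monday to Friday, that the day of receipt itself is not counted, and that public holidays are ignored. The real service-level definition would need to be agreed between the teams.

```python
from datetime import date, timedelta


def add_business_days(start: date, days: int) -> date:
    """Return the date `days` business days after `start`.

    Assumptions (not from the deck): weekends are Saturday and Sunday,
    the start day is not counted, and public holidays are ignored.
    """
    current = start
    remaining = days
    while remaining > 0:
        current += timedelta(days=1)
        if current.weekday() < 5:  # Monday=0 ... Friday=4
            remaining -= 1
    return current


# Example: an access request received on the kickoff date, 05/30/2023.
due = add_business_days(date(2023, 5, 30), 5)
print(due.isoformat())
```

A request log that stores the receipt date alongside this computed due date would let the PMO flag overdue access requests in the weekly meeting.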
4 Additional Details | Questions

Is there an onboarding guide for the consultants to get started on your environment?

Is there a source system inventory already identified that can be shared?

What are the roles and skills of the existing 10 engineers on the team?

What is the current data governance mechanism?

Other than Cloudera, what other paid subscriptions and packages are installed on the concerned architecture?

Is there any major business contingency on this project plan? If so, what is the impact of a delayed delivery?

What compliance requirements and regulations does Telecom Bell need to follow for the concerned data?

Does Telecom Bell already have an Azure account? If so, what level of enterprise support plan is subscribed?

Does Telecom Bell already have a Confluent account? If so, what level of enterprise support plan is subscribed?

Are any licenses due to expire?

What is the expiry date of Cloudera's extended support?


Thank you

Thank you so much for your time today.


Yashodhan Kale
Modern Technologist | Data and ML at scale

BACKGROUND

Design and drive clients' Data and AI journeys powered by cloud analytics expertise! Offering data product mindset-driven solutions to deliver platforms and beyond: Self-service framework, rapid experimentation lab, democratized data, data products marketplace, multi-cloud solutions, data lake, data fabric, data mesh patterns with federated governance, domain-specific ownership, and more

SELECTED EXPERIENCES

• Fortune 5 American healthcare company
Establish and manage DevOps, Data Engineering, and ML engineering teams in close collaboration with Data Scientists. Set up a self-service Data and ML platform on Azure cloud for a Retail enterprise, incorporating an experimentation framework, Model Training pipelines, and real-time inference using Azure AKS, Kubeflow, and Snowflake. Implement an Rx enterprise Data and ML platform on Azure cloud, enabling ETL pipelines with Databricks and Apache Airflow. Lead the development of large-scale projects, including legacy modernization, Rx personalization, and Retail personalization programs that impact millions of lives daily. Collaborate with technology partners, MSFT and NVIDIA, to present objectives, findings, and incorporate feedback for ML solutions with specialized NVIDIA GPUs. Architect and oversee the implementation of the Refrigerator IoT project on Azure, leveraging IOT hub, Azure Analytics, and Databricks. Lead the development of SAP HANA to Spark integration. Manage the enhancement team in Data Engineering for pharmacy-related projects, ensuring critical business deliveries. Design data-driven solutions, including self-service analytics platforms, rapid experimentation labs, democratized data, multi-cloud solutions, data fabric, data mesh patterns with federated governance, and domain-specific ownership. Develop an ingestion framework for seamless data migration across projects and cloud storage services.

• Multinational American information, data & market measurement company
Build a retail store data aggregation engine (Retail Intelligence system) for 24 countries, initially using Hadoop MapReduce, later upgraded to Spark. Migrate on-premise batch processes to the cloud using Docker, Azure Batch Services, and Azure Shipyard for cost efficiency. Perform performance tuning on Apache Spark, cloud Hadoop clusters (HDI), and Databricks on Azure and Hadoop platforms.

RELEVANT FUNCTIONAL AND INDUSTRY EXPERIENCE

Industry Focus:
• HealthCare
• Retail
• Market Research
• Finance

Functional Expertise:
• Digital Transformation
• Analytics and CDO Strategy
• Open Source
• Machine Learning, IOT
• Data Drive Re-invention

CERTIFICATIONS

Amazon Web Services Certified Data Analytics - Specialty
Amazon Web Services Solutions Architect - Associate
Cloudera Certified Developer for Apache Hadoop (CCDH)

PREVIOUSLY

Sr Cloud Solution Architect @ Amazon Web Services, Level 6
Sr ML Engineering Manager @ Databricks, Level 6

WHAT HAS BROUGHT ME HERE

• Customer Obsession
• Earn trust
• Deliver Results
• Learn and Be Curious
Scale & Pay as you go
Self service experimentation
ACID Compliant
Time Travel
Event Streaming
Data as product
Exactly once semantics
Inter Operability
Data Migration
Lake House Governance
Identity Management, SSO

Upfront cost
End of support
Not easy to integrate/connect
Lack of discoverability
Efforts to make data HA & durable
Maintenance
1 Platform & Architecture | Artifacts
A World Class Data Platform! Key components of the data platform:
• Lake House
• MLOps
• Governance
• Databricks Marketplace
• Databricks Notebooks

Databricks Notebooks

1 Share insights
Quickly discover new insights with built-in interactive visualizations, or leverage libraries such as Matplotlib and ggplot. Export results and Notebooks in HTML or IPYNB format, or build and share dashboards that always stay up to date.

2 Work together
Share Notebooks and work with peers across teams in multiple languages (R, Python, SQL and Scala) and libraries of your choice. Real-time coauthoring, commenting and automated versioning simplify collaboration while providing control.

3 Production at scale
Schedule Notebooks to automatically run machine learning and data pipelines at scale. Create multistage pipelines using Databricks Workflows. Set up alerts and quickly access audit logs for easy monitoring and troubleshooting.
