Acceldata Pulse Slides

ACCELDATA DATA OBSERVABILITY SOLUTIONS

Prashant Tewari
Regional Sales Director
Acceldata Middle East
prashant.tewari@acceldata.io
AGENDA
• DATA OBSERVABILITY

• THE ACCELDATA PLATFORM


• ACCELDATA PULSE
• ACCELDATA TORCH
• ACCELDATA FLOW

• SUMMARY

• QUESTIONS

Acceldata — © 2021 — Confidential & 2


Proprietary.
DATA OBSERVABILITY

Growing Complexity of Enterprise Data Pipelines

Operating enterprise data systems is painful. It doesn’t have to be.

PAIN POINTS

• New Business Requirements
• IT Operations shouldn’t feel like solving murder mysteries
• Siloed technologies don’t understand adjacent systems
• Skillset shortage
• Legacy APM and log tools lack context, provide rudimentary metrics
• Primitive open source tools are insufficient
• Fundamental tension: SLOs/SLAs vs cloud migration and costs

BEFORE VS TODAY
Traditional monitoring tools weren’t designed for modern data and analytics systems.

BEFORE
• Stack: Apps, Data Warehouse, Analytics for Managers
• Application Performance Monitoring
• Purpose: Support periodic management decisions (e.g., monthly, quarterly)

TODAY
• Stack: Apps and M2M Apps, Data Fabric, Analytics
• Application Performance Monitoring works here (Apps)...
• But doesn’t work here (Data Fabric, Analytics). Enterprises need Data Performance Monitoring.
• Purpose: Support operational decisions for employees, customers, suppliers, and partners in real-time

IMAGINE IF...

MULTI-DIMENSIONAL DATA OBSERVABILITY

• You had comprehensive visibility across the grid
• You could quickly identify and resolve incidents
• You could automate self-healing
• You could achieve near 100% of SLOs, and do it with a less tenured team
• You are the 1st to know there’s a problem
• You could predict & prevent incidents
• You could do more with less

PLATFORM OWNERS, DATAOPS ENGINEERS, SITE RELIABILITY ENGINEERS
THE ACCELDATA PLATFORM

ACCELDATA OVERVIEW

Multidimensional Data Observability Platform for Demanding Enterprises

FOUNDED 2018
• Delaware C-Corp
• Palo Alto (HQ), Bangalore, Singapore
• 162 Employees

$45+ MM RAISED
• Insight, Lightspeed, Sorenson and Emergent Ventures
• Some select angels

3X GROWTH
• Managing 100 PB data
• Highest CSAT Ratings

KEY CUSTOMERS

ONLY COMPLETE STACK FOR DATA OBSERVABILITY
Acceldata Observability Platform simplifies and scales data operations across all levels of compute, pipelines, and data.

LEADERSHIP TEAM



Acceldata Delivers Data Observability for Modern Data Infrastructure

Acceldata developed its multidimensional Data Observability Cloud to support the DataOps revolution that’s gaining momentum as a result of exploding data volumes, accelerating technological complexity, and the increasing criticality of data to the business. Data is eating the world, but data teams lack the visibility to effectively operate, scale, and optimize the modern data systems that support business objectives and, ultimately, competitive advantage.

What is Data Observability?
Data observability is a rapidly emerging category that complements data operations by giving data engineers and executives the tools and visibility to understand the state of data, and of the data systems that cleanse, transform, and transport it, so that data supply chains are optimized, reliable, and aligned with growing business demands.

Acceldata provides multidimensional data observability for all connected data, regardless of data source, technology choice, cloud provider, or location. Acceldata specifically focuses on mission-critical operational analytics and AI/ML workloads.

Acceldata Overview
Founded: 2018
Offices: San Jose (HQ), Bangalore, Singapore; 140+ people
Funding: $46M from Lightspeed, Insight, and Sorenson
Growth: 3X revenue growth for three consecutive years, managing 350+PB of data; industry-leading CSAT
Flagship product: Acceldata Data Observability Cloud
Key customers: PhonePe (Walmart), Oracle, Blue Cross/Blue Shield, Verisk, DBS Bank, among many others; customers in ten countries across the globe.
Target personas: CDO, CDAO, VP Cloud, VP/Director of Platform Engineering (economic buyers); data engineer, data architect, cloud architect, and SRE (users)
Target customers: Telcos, financial services, health care, and digital hyper-scalers; 100TBs data, business-critical data pipelines, large-scale hybrid environments with 30+ analytical processing nodes (e.g., Snowflake, Databricks, Spark, EMR, DataProc, HD Insights, Hadoop)
Key Differentiator: Acceldata’s multidimensional data observability correlates events across infrastructure (analytical engines), data, and pipeline layers to predict and prevent issues, automate resolution, and increase data engineer productivity and data ROI.

Multidimensional Data Observability
● Data Pipeline Observability
● Compute Performance Monitoring
● Data Reliability

Customer Success Story:
Acceldata helps PhonePe (Walmart), India’s largest Digital Payment Processor, handle 2B+ transactions/month and $650B in transaction volume annually, while scaling mission-critical data infrastructure by 20x, delivering 100% reliability, and reducing costs by $7 million per year due to higher data engineering productivity and reduced software licensing costs.

Customer Benefits
● 25-75% decrease in total data system costs
● 300% increase in data engineering productivity
● 90% reduction in MTTR/RCA and alerts
● 35% improvement in data processing capacity
● 10x data quality coverage with automation
● End-to-end visibility of modern data pipelines
● Automate 80% of data quality rules without code
● Identify duplicate, unused, and stale data; reduce data storage by up to 20%

“[Our] research highlights Acceldata grew three times since its establishment and is well-positioned to emerge as a leader in the multi-billion-dollar data observability industry.”
Frost & Sullivan
Frost & Sullivan Technology Innovation Leadership 2021

ACCELDATA PULSE

MULTI-DIMENSIONAL OBSERVABILITY for HADOOP ECOSYSTEM DEPLOYMENTS

END-TO-END VISIBILITY for IMPROVED RELIABILITY, USABILITY & COST

INFRASTRUCTURE
✓ System Health (CPU, Memory, Network, OS, …)
✓ Hotspotting
✓ Overprovisioning
✓ Resource Contention
✓ Capacity Planning
...

PLATFORM
✓ Services Health (HDFS, YARN, Zookeeper, …)
✓ Hybrid, Multi-cluster & Multi-Distro Support (CHP, CDH, HDP, Apache, Databricks, AWS EMR, ...)
✓ Chargeback Reports
✓ Scheduling Optimization
✓ Auto-actions, Self-tuning/healing
✓ Security (LDAP, RBAC, ...)
...

PROCESSING
✓ Job & Query Fingerprinting
✓ Historical Analysis & Trending
✓ Event Correlation
✓ Configuration Recommendations
✓ Simulation
✓ Bottleneck Analysis
✓ Query Optimization
✓ Log Analysis
...

DATA
✓ File Metadata (Size, Temperature, …)
✓ Schema Drift Monitoring
✓ Data Drift Monitoring
✓ Data Quality Monitoring
✓ Data Reconciliation
✓ Throughput Monitoring for Streaming Data
✓ Data Redundancy
✓ Data Discovery
✓ Pipeline & Lineage
...
PULSE SOLUTION ARCHITECTURE

Diagram labels:
• Acceldata Agents
• Collected data: Yarn Metrics, App Data, Infra Metrics, Log Data, Container Data, JMX Metrics, Time Series Data
• Stores: Database, Time series data, Log indices
• AWS ECR
• Acceldata Services

PULSE DEPLOYMENT ARCHITECTURE

Acceldata Pulse Integrations

• Supported Hadoop Distributions - Amazon EMR, Cloudera, Databricks, Hortonworks, Google Cloud Dataproc
• Data Storage - HDFS, Amazon S3
• Streaming Support - Kafka, Spark Streaming, Pulsar, Apache NiFi, Flink
• Data Processing - Apache Tez (Presto), Hive, Apache Spark, MapReduce, ClickHouse
• Data Warehouse - Druid, Hive LLAP, Impala, Databricks Delta
• Machine Learning - Apache Spark, Zeppelin, PySpark, SparkSQL, Databricks
• NoSQL - HBase
• Orchestration Engine - Apache Airflow

ACCELDATA PULSE SUBSCRIPTION DETAILS

Feature | ENTERPRISE | GOLD
License Type | Annual - Subscription based | Annual - Subscription based
Product Integrations | HDFS, YARN, HIVE, MAPREDUCE, HIVE LLAP, HBASE, KAFKA, SPARK, ZOOKEEPER | HDFS, YARN, HIVE, MAPREDUCE, HIVE LLAP, HBASE, KAFKA, SPARK, ZOOKEEPER
Distributions Supported | HDP, CDP, CDH, Apache Open Source | HDP, CDP, CDH, Apache Open Source
Hosting | On-Premise, On-Cloud | On-Premise, On-Cloud
Technical Support for Acceldata Pulse | True | True
Dedicated SRE/Hadoop Admin Support for Hadoop/Big Data Platforms* | No | Yes
SRE Engagement Model | SLA Driven Pool Based Support | Dedicated Named SRE (1 SRE per 200 Nodes)
Technical Support Type | Remote | Remote
Technical Support Model | Severity based. | Severity based.
Technical Support Working Hours | 9am - 5pm India Standard Time & 24x7x365 Pool Based Support | 24x7x365.
Technical Support Working Days | Monday to Friday (excluding Indian Public Holidays) | Work Days as per customer
Pager/On-call Coverage | No | Yes

PULSE IMPLEMENTATION

Getting started with Acceldata Pulse

• Day 1: Acceldata Installation (sit back and let it do its thing)
• Days 2-4: Benchmark platform; observe system
• MAGIC DAY!
• Day 5: Recommend performance improvements
• Phase 2: Optimize system performance
• Phase 3: Predict issues, and preempt problems
• Continuous: Optimize SLOs, infrastructure and licensing costs

Value Tenets: Improve cycle times, system performance, data ROI, and competitive advantage

BENEFITS

Acceldata Pulse accelerates enterprise data success
• Improve data system reliability, scalability, and resilience
• Predict and preempt problems
• Expedite cloud migration and data validation
• Increase new tech adoption by 50%
• Exceed SLAs; MTTR is often replaced by a new metric, MTBF (mean time between failures), measured in weeks and months

200% Decrease legacy maintenance fees
90% DECREASE in overall MTTR/RCA
300% INCREASE in engineering productivity
35% INCREASE in data processing capacity
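
The MTTR and MTBF metrics named on this slide can be made concrete with a small calculation. The sketch below is illustrative only and is not part of Acceldata Pulse: the function name `mttr_and_mtbf`, the shape of the incident list, and the 30-day window are all assumptions made for the example. It uses the common definitions MTTR = total repair time / number of incidents and MTBF = total uptime / number of incidents.

```python
from datetime import datetime, timedelta


def mttr_and_mtbf(incidents, window_start, window_end):
    """Return (MTTR, MTBF) as timedeltas for one observation window.

    incidents: list of (start, end) datetime pairs, assumed to be
    non-overlapping and to lie entirely inside the window.
    (Hypothetical input shape, not a Pulse data format.)
    """
    if not incidents:
        raise ValueError("need at least one incident to compute MTTR/MTBF")
    # Total time spent repairing across all incidents.
    downtime = sum((end - start for start, end in incidents), timedelta())
    # Everything else in the window counts as healthy operating time.
    uptime = (window_end - window_start) - downtime
    count = len(incidents)
    return downtime / count, uptime / count


# Example: two outages (2 h and 4 h) in a 30-day window.
window_start = datetime(2021, 1, 1)
window_end = window_start + timedelta(days=30)
incidents = [
    (datetime(2021, 1, 5, 10), datetime(2021, 1, 5, 12)),
    (datetime(2021, 1, 20, 8), datetime(2021, 1, 20, 12)),
]
mttr, mtbf = mttr_and_mtbf(incidents, window_start, window_end)
print("MTTR:", mttr)  # 3:00:00
print("MTBF:", mtbf)  # 14 days, 21:00:00
```

With these made-up incidents, repairs average three hours while failures are roughly two weeks apart, which is the sense in which MTBF ends up "measured in weeks and months" once outages become rare.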

Customer Success

COST OPTIMIZATION
40% LOWER data infrastructure costs
Situation: Expensive tools, poor performance and system instability
Impact: Poor price/performance ratio
Acceldata Resolution:
• Replaced multiple tools with centralized monitoring
• Reduced Hive LLAP resource costs by ~40%
• Saved $45M annually on audit costs

PERFORMANCE IMPROVEMENT
2X IMPROVEMENT on data infrastructure performance
Situation: Could only handle processing 50% of ingested data
Impact: Poor pervasive data system performance and scalability issues
Acceldata Resolution:
• Eliminated unplanned outages and Sev 1 issues for five-plus consecutive months
• Optimized HDFS storage cost by ~2PB
• Reduced annual software licensing costs of other apps by $2+M
• Improved existing system capacity and saved additional $1M+ in projected capex

RELIABILITY AT SCALE
0 SEV 1 ISSUES
Situation: Performance issues preventing scalability; 60% of experienced engineers’ time spent firefighting operational issues
Impact: Limited company growth
Acceldata Resolution:
• 0 unplanned outages, Sev 1 issues for over twelve months (and running)
• Scaled data infrastructure by 13x while meeting SLAs
• Saved $5M+ per year in software licensing costs
CASE STUDY: PUBMATIC

PubMatic optimizes performance and cost at massive scale

Company
● One of the United States’ largest AdTech companies
● Hyper scale setup with large number of nodes handling petabytes of data
● Open Source - HDP, HDFS, Kafka, Spark, HBase, Yarn
● 171 Billion daily ad impressions, 1 trillion advertiser bids per day, over 2PB of new data processed daily*

Problem
● High MTTR, performance bottlenecks because of massive scale: a larger number of nodes in a single cluster.
● High infrastructure and OEM support costs.

Solution
Acceldata Pulse isolated bottlenecks, automated performance improvements, and distinguished between mandatory and unnecessary data to ensure scaled growth of the big data environment that reliably supports all critical enterprise and customer facing analytics requirements.

Results
● Storage optimization reduced block footprint by 30%
● Kafka cluster consolidation saved infrastructure costs
● Reduced OEM support costs significantly
● Eliminated day-to-day engineering involvement and firefighting on outages and performance degradation issues to stay focused on growing the business.

“Acceldata provided the data observability tools and expertise to make our data pipelines more reliable. They helped us optimize HDFS performance, consolidate Kafka clusters, and reduce cost per ad impression, which is one of our most critical performance metrics. Acceldata's data observability saved us millions of dollars for software licenses that we no longer need. Now we can focus on scaling to meet the needs of rapidly growing business.”
Ashwin Prakash
Engineering Leader

What Makes Acceldata Different?

Improves visibility, optimizes data operations
Single pane of glass provides unified view into:
• Engine-level observability
• Data-level observability
• Pipeline-level observability

Reduces costs
• Optimizes costs by eliminating overprovisioned hardware and software
• Increases current stack output by 35%
• Reduces storage costs by up to 20%
• Provides open source Insurance policy

Increases data team productivity
• Empower all data team members to contribute at high level
• Identify and solve problems fast
• Increase productivity - 500%
• Free experienced engineers to focus on high-value innovation projects

QUESTIONS?

Prashant Tewari
Regional Sales Director
Acceldata Middle East
prashant.tewari@acceldata.io
Mobile: +971562663630
