
Pinot

Kishore Gopalakrishna

Tuesday, August 18, 15


Agenda
• Pinot @ LinkedIn - Current
• Pinot - Architecture
• Pinot Operations
• Pinot @ LinkedIn - Future

WVMP (Who Viewed My Profile)



Slice and Dice Metrics



Pinot @ LinkedIn
• Customers
• Members
• Internal tools



Pinot @ LinkedIn
• 100B documents
• 1B documents ingested per day
• 100M queries per day
• 10s of ms latency
• 30 tables in prod, 250 * 3 std app nodes

Key features

• Columnar storage and indexing
• SQL-like interface
• Real-time data load



(S)QL: Filters and Aggs
SELECT count(*)
FROM companyFollowHistoricalEvents
WHERE entityId = 121011 AND
  'day' >= 15949 AND 'day' <= 15963 AND
  paid = 'y' AND
  action = 'stop'

(S)QL: Group By
SELECT count(*)
FROM companyFollowHistoricalEvents
WHERE entityId = 121011 AND
  'day' >= 15949 AND 'day' <= 15963 AND
  paid = 'y'
GROUP BY action

(S)QL: ORDER BY and LIMIT
SELECT *
FROM companyFollowHistoricalEvents
WHERE entityId = 121011 AND
  entityId = 1000 AND
  action = 'start'
ORDER BY creationTime DESC LIMIT 1

What's not supported
• JOIN: unpredictable performance

• NOT A SOURCE OF TRUTH

• Mutation



Pinot
• Data flow
• Query Execution
• How to use/operate
• Pinot @ LinkedIn - Future



Pinot Architecture
[Architecture diagram. Labels: Queries, Broker, Helix, Real time, Historical, Raw Data, Kafka, Hadoop]

Pinot
• Pinot segments



Pinot Segment layout: Columnar storage



Pinot Segment layout: Sorted Forward Index
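One way to picture a sorted forward index: if the documents in a segment are stored in ascending order of a column, every distinct value occupies one contiguous run of document ids, so an equality filter on that column reduces to a binary search. The sketch below illustrates only that idea and is not Pinot's on-disk format; the helper `doc_range_for` and the sample data are hypothetical.

```python
from bisect import bisect_left, bisect_right

def doc_range_for(sorted_column, value):
    """Return the half-open doc-id range [start, end) whose rows equal value.

    Assumes sorted_column holds one value per document, in ascending order.
    """
    start = bisect_left(sorted_column, value)
    end = bisect_right(sorted_column, value)
    return start, end

# Hypothetical segment whose documents are sorted by entityId.
entity_ids = [1000, 1000, 121011, 121011, 121011, 500000]

start, end = doc_range_for(entity_ids, 121011)
print(start, end)  # 2 5
```

A value that does not occur yields an empty range (start equals end), so no separate existence check is needed.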



Pinot Segment layout: Other techniques
• Indexes: Inverted index, Bitmap, RoaringBitmap
• Compression: Dictionary Encoding, P4Delta
• Multi Valued columns, skip lists
• HyperLogLog for unique counts
• T-digest for Percentile, Quantile
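To make two of the techniques above concrete, here is a minimal sketch of dictionary encoding combined with a bitmap inverted index. Plain Python integers stand in for compressed bitmaps such as RoaringBitmap, and all function names and sample data are assumptions made for illustration rather than Pinot's implementation.

```python
def build_dictionary(values):
    """Map each distinct value to a small integer id (dictionary encoding)."""
    dictionary = sorted(set(values))
    value_to_id = {v: i for i, v in enumerate(dictionary)}
    encoded = [value_to_id[v] for v in values]
    return dictionary, encoded

def build_inverted_index(encoded, cardinality):
    """One bitmap per dictionary id; bit d is set when document d has that id."""
    bitmaps = [0] * cardinality
    for doc_id, dict_id in enumerate(encoded):
        bitmaps[dict_id] |= 1 << doc_id
    return bitmaps

def matching_docs(bitmap):
    """Expand a bitmap into the list of doc ids whose bit is set."""
    return [d for d in range(bitmap.bit_length()) if (bitmap >> d) & 1]

# Hypothetical column data for five documents.
actions = ["start", "stop", "start", "stop", "stop"]
paid = ["y", "n", "y", "y", "n"]

action_dict, action_enc = build_dictionary(actions)
paid_dict, paid_enc = build_dictionary(paid)
action_idx = build_inverted_index(action_enc, len(action_dict))
paid_idx = build_inverted_index(paid_enc, len(paid_dict))

# action = 'stop' AND paid = 'y' becomes a bitwise AND of two bitmaps.
hits = action_idx[action_dict.index("stop")] & paid_idx[paid_dict.index("y")]
print(matching_docs(hits))  # [3]
```

The filter never touches the raw strings: it looks up two dictionary ids and intersects two bitmaps.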



Star tree Index
Data aware pre-computation
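The sketch below shows the general idea of pre-computation that a star-tree builds on: aggregates are stored ahead of time under keys where `*` stands for "all values" of a dimension, so a query becomes a lookup instead of a scan. It is not the star-tree structure itself. It materializes every combination, whereas a data-aware index would presumably be selective about what it materializes. The names `precompute` and `STAR` and the sample rows are hypothetical.

```python
from collections import defaultdict
from itertools import product

STAR = "*"  # stands for "any value" of a dimension

def precompute(rows, dimensions, metric):
    """Pre-aggregate metric sums for every mix of concrete value or STAR."""
    totals = defaultdict(int)
    for row in rows:
        # For each dimension, the row contributes to its own value and to STAR.
        choices = [(row[d], STAR) for d in dimensions]
        for key in product(*choices):
            totals[key] += row[metric]
    return dict(totals)

rows = [
    {"country": "US", "browser": "chrome", "clicks": 3},
    {"country": "US", "browser": "safari", "clicks": 2},
    {"country": "IN", "browser": "chrome", "clicks": 5},
]

cube = precompute(rows, ["country", "browser"], "clicks")

# sum(clicks) WHERE country = 'US' is a single lookup instead of a scan.
print(cube[("US", STAR)])      # 5
print(cube[(STAR, "chrome")])  # 8
print(cube[(STAR, STAR)])      # 10
```

Storing more pre-aggregated keys lowers query latency at the cost of space, which is the trade-off a pre-materialization scheme has to manage.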



Pinot
• Query Execution



Pinot Query Execution: Distributed
1. Query
2. Fetch routing table from Helix
3. Scatter Request
4. Process Request & send response
5. Gather Response
6. Return Response
[Diagram: Brokers, Helix, and Servers; segments S1, S2, S3 are each hosted on more than one Server]

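The six steps above follow a scatter-gather pattern, sketched below from the broker's point of view. The routing table, segment contents, and function names are hypothetical stand-ins: in Pinot the routing information comes from Helix and the servers are remote processes, whereas here they are plain function calls on a thread pool.

```python
from concurrent.futures import ThreadPoolExecutor

# Hypothetical routing table: which server answers for which segments.
ROUTING_TABLE = {
    "server-1": ["S1", "S2"],
    "server-2": ["S3"],
}

# Hypothetical segment contents: each segment holds a list of actions.
SEGMENTS = {
    "S1": ["start", "stop", "stop"],
    "S2": ["stop"],
    "S3": ["start", "stop"],
}

def server_count(segments, action):
    """Step 4: a server processes its segments and returns a partial count."""
    return sum(SEGMENTS[name].count(action) for name in segments)

def broker_count(action):
    """Steps 2, 3, 5 and 6 as seen from the broker."""
    routing = ROUTING_TABLE  # step 2: fetch the routing table
    with ThreadPoolExecutor() as pool:
        # step 3: scatter one request per server
        futures = [pool.submit(server_count, segs, action)
                   for segs in routing.values()]
        # step 5: gather the partial results
        partials = [f.result() for f in futures]
    return sum(partials)  # step 6: return the merged response

print(broker_count("stop"))  # 4
```

Because each segment appears exactly once in the routing table, the partial counts can simply be added together.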
Pinot Query Execution: Single Node Architecture
• Planner
• Execution engine
• Inverted index, bitmap index
• Column format



Pinot Query Execution: Single Node Architecture

SELECT campaignId, sum(clicks)
FROM Table A
WHERE accountId = 121011 AND 'day' >= 15949
GROUP BY campaignId

Operators, top to bottom:
• Combine Operator
• Aggregation Group by Operator: consumes (campaignId, click) tuples
• Projection Operator: consumes matching doc ids
• Filter Operator
• Pinot Segments
• Data sources: campaignId, click, accountId, day
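The operator stack on this slide can be sketched as a chain of small functions, one per operator, run once per segment and merged at the end. This is an illustration under simplifying assumptions (an in-memory columnar dict per segment, hard-coded query, hypothetical function names and data), not Pinot's execution engine.

```python
from collections import defaultdict

# Hypothetical segment stored column by column.
segment = {
    "campaignId": [1, 2, 1, 3, 2],
    "clicks":     [10, 5, 7, 1, 4],
    "accountId":  [121011, 121011, 121011, 42, 121011],
    "day":        [15950, 15940, 15960, 15955, 15949],
}

def filter_operator(seg, account_id, min_day):
    """Return the doc ids matching accountId = account_id AND day >= min_day."""
    n = len(seg["day"])
    return [d for d in range(n)
            if seg["accountId"][d] == account_id and seg["day"][d] >= min_day]

def projection_operator(seg, doc_ids, columns):
    """Read only the requested columns for the matching docs."""
    return [tuple(seg[c][d] for c in columns) for d in doc_ids]

def aggregation_group_by_operator(tuples):
    """sum(clicks) grouped by campaignId for one segment."""
    sums = defaultdict(int)
    for campaign_id, clicks in tuples:
        sums[campaign_id] += clicks
    return dict(sums)

def combine_operator(per_segment_results):
    """Merge the per-segment group-by results into one result."""
    merged = defaultdict(int)
    for result in per_segment_results:
        for key, value in result.items():
            merged[key] += value
    return dict(merged)

def run(segments):
    results = []
    for seg in segments:
        doc_ids = filter_operator(seg, 121011, 15949)
        tuples = projection_operator(seg, doc_ids, ["campaignId", "clicks"])
        results.append(aggregation_group_by_operator(tuples))
    return combine_operator(results)

print(run([segment]))           # {1: 17, 2: 4}
print(run([segment, segment]))  # {1: 34, 2: 8}
```

Only the columns named in the query are ever read, which is where the columnar layout pays off.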



Pinot
• Operations



Cluster Management: Deployment
[Diagram: Controller, Brokers, Helix, Servers]
• Brokers and Servers register themselves in Helix
• All servers start with no use case specific configuration

Onboarding a new use case
Create Table command:
• Tag: XLNT
• TableName: XLNT_T1
• Servers: 3
• Brokers: 1
[Diagram: Controller, Brokers, Helix, Servers; one Broker and three Servers carry the XLNT tag]



Segment Assignment
Upload Segment:
• TableName: XLNT_T1
• Copies: 2
[Diagram: Controller, Brokers, Helix, Servers; segments S1, S2, S3 are uploaded and each ends up on two Servers]
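A minimal sketch of how segments could be spread across servers with two copies each, as in the diagram. The least-loaded placement rule, the function name `assign_segment`, and the server names are assumptions for illustration; the slide does not say which strategy the Controller and Helix actually use.

```python
def assign_segment(assignment, segment, copies):
    """Place segment on the `copies` servers that currently hold the fewest segments."""
    if copies > len(assignment):
        raise ValueError("not enough servers for the requested number of copies")
    # sorted() is stable, so ties are broken by server insertion order.
    targets = sorted(assignment, key=lambda server: len(assignment[server]))[:copies]
    for server in targets:
        assignment[server].append(segment)
    return targets

# Three hypothetical servers, initially empty.
assignment = {"server-1": [], "server-2": [], "server-3": []}
for seg in ["S1", "S2", "S3"]:
    assign_segment(assignment, seg, copies=2)

print(assignment)
# {'server-1': ['S1', 'S2'], 'server-2': ['S1', 'S3'], 'server-3': ['S2', 'S3']}
```

Every server ends up with two segments and every segment has two copies, so losing any single server still leaves one copy of each segment available.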



Pinot - Fault tolerance/Elasticity
• AUTO recovery mode: Automatically redistribute segments on failure/addition of new nodes
• Custom mode: Run in degraded mode until node is restarted/replaced.
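The difference between the two modes can be sketched as follows for the failure case. The mode strings, the function `handle_server_failure`, and the least-loaded re-placement rule are hypothetical; they illustrate the behaviour described in the bullets rather than Helix's actual rebalancing logic.

```python
def handle_server_failure(assignment, failed, mode):
    """Return a new segment assignment after `failed` goes down.

    mode "AUTO": move the lost replicas to surviving servers.
    mode "CUSTOM": leave them unassigned (degraded) until the node is back.
    """
    survivors = {s: list(segs) for s, segs in assignment.items() if s != failed}
    if mode == "AUTO":
        for segment in assignment[failed]:
            # Pick the least-loaded survivor that does not already hold a copy.
            candidates = [s for s in survivors if segment not in survivors[s]]
            if candidates:
                target = min(candidates, key=lambda s: len(survivors[s]))
                survivors[target].append(segment)
    return survivors

assignment = {
    "server-1": ["S1", "S2"],
    "server-2": ["S1", "S3"],
    "server-3": ["S2", "S3"],
}

print(handle_server_failure(assignment, "server-1", "AUTO"))
# {'server-2': ['S1', 'S3', 'S2'], 'server-3': ['S2', 'S3', 'S1']}
print(handle_server_failure(assignment, "server-1", "CUSTOM"))
# {'server-2': ['S1', 'S3'], 'server-3': ['S2', 'S3']}
```

In the AUTO case every segment is back to two copies at the cost of more load per server; in the CUSTOM case S1 and S2 are served from a single copy until the node returns.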



Pinot vs Druid

Feature | Druid | Pinot | Notes
Architecture | Realtime only | Realtime + Offline | With realtime only, consistency is hard and schema evolution/bootstrap is hard
Inverted Index | Always on all columns, fixed | Configurable on per column basis | Allows trade off between scanning v/s inverted index + scanning. More data can be fit in given memory size
Data organization | N/A | Sorts data | Organizing data provides speed/better compression and removes the need for inverted index
Smart pre-materialization | N/A | star-tree | Allows trade off between latency and space
Query Execution Layer | Fixed Plan | Split into Planning and execution | Smart choices can be made at runtime based on metadata/query



Pinot - Future
• Documentation & tooling
• In progress - consistency among real time replicas.
• Improve cost to serve - leverage SSD, partial pre materialization
• ThirdEye - Business Metrics Monitoring



Thank You
