
Kafka

Introduction
History
Kafka was originally developed at LinkedIn and was open-sourced in early 2011.
Jay Kreps, Neha Narkhede, and Jun Rao co-created Kafka.

Companies using Kafka

And thousands more…

What is Kafka?
If you visit the Kafka website, you'll find its definition right on the
first page:

A distributed event streaming platform

Distributed
Kafka runs as a cluster of one or more nodes that can live in different
datacenters. Data and load are distributed across the nodes in the cluster,
which makes Kafka inherently scalable, available, and fault-tolerant.

Event Streaming

Kafka consists of multiple components that work together to handle and
process real-time data streams across a network of nodes.

Kafka Architecture

● Producers: Producers are applications that create and send data (called
messages) to topics in Kafka.
● Cluster: A Kafka cluster consists of one or more Kafka servers, called brokers.
● Consumers: Consumers are applications that read messages from topics in
Kafka.
Topic, partition

● A topic is similar to a folder or a table in a database.


● Topics tie producers and consumers together while creating a clear-cut
boundary between them.

● Topic Creation
● Data Distribution
● Data Ordering
● Scalability
● Consumer Group
● Rebalancing
● Parallel Processing
● Fault Tolerance

● Every record stored in a partition is assigned an offset, which denotes how
far the record is from the start of the partition.
● Adding new records increases the offset, so more recent records always have
a higher offset than older records.
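
The offset behaviour described above can be modelled with an append-only list. This is a minimal in-memory sketch, not Kafka's storage code; the `Partition` class and its method names are hypothetical and exist only for illustration.

```python
class Partition:
    """Toy append-only log: a record's list index doubles as its offset."""

    def __init__(self):
        self._records = []

    def append(self, value):
        # The new record's offset is its distance from the start of the log.
        offset = len(self._records)
        self._records.append(value)
        return offset

    def read(self, offset):
        # Return every record stored at or after the given offset.
        return self._records[offset:]


partition = Partition()
first = partition.append("order-created")   # offset 0
second = partition.append("order-paid")     # offset 1
third = partition.append("order-shipped")   # offset 2
```

Because records are only ever appended, a later record can never receive a lower offset than an earlier one.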

Brokers

● Storing data and handling requests
● Distribute messages
● Connect Producer & Consumer
● Replicas
● Balancing
● Scale
● Keep track of offsets
● Clean up data
● Security
Replication

● Replica
● Leader and Follower
● Replication Factor
● In-sync replicas
● Consistency
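
The replication terms listed above can be tied together in a small model. This is a hedged sketch under simplifying assumptions, not how brokers are implemented: `Replica`, `ReplicatedPartition`, and all method names are hypothetical, and "lagging" is set by hand rather than detected by a timeout.

```python
from dataclasses import dataclass, field


@dataclass
class Replica:
    broker_id: int
    log: list = field(default_factory=list)


class ReplicatedPartition:
    """Toy model of one partition: one leader, the other replicas follow."""

    def __init__(self, broker_ids):
        self.replicas = {b: Replica(b) for b in broker_ids}
        self.leader_id = broker_ids[0]
        self.in_sync = set(broker_ids)  # every replica starts in sync

    @property
    def replication_factor(self):
        # Number of copies of the partition, leader included.
        return len(self.replicas)

    def append(self, record):
        # Writes go to the leader first.
        self.replicas[self.leader_id].log.append(record)

    def replicate(self, follower_id):
        # A follower catches up by copying the leader's log.
        leader_log = self.replicas[self.leader_id].log
        self.replicas[follower_id].log = list(leader_log)

    def mark_lagging(self, follower_id):
        # A follower that falls behind is dropped from the in-sync set.
        self.in_sync.discard(follower_id)

    def committed_count(self):
        # A record counts as committed once every in-sync replica has it.
        return min(len(self.replicas[b].log) for b in self.in_sync)


partition = ReplicatedPartition([1, 2, 3])
partition.append("a")
partition.append("b")
partition.replicate(2)
committed_before = partition.committed_count()  # broker 3 is in sync but empty
partition.mark_lagging(3)
committed_after = partition.committed_count()   # only brokers 1 and 2 count now
```

The sketch shows why the in-sync set matters for consistency: a slow follower holds back what counts as committed until it either catches up or leaves the set.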
Producers

● Publishing Data
● Message Keys
● Serialization
● Partitioning
● Acknowledgment
● Reliability
● Compression
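
To make message keys, serialization, and partitioning more concrete, here is a minimal sketch of a producer that picks a partition from the key. It is an illustration only: `ToyProducer` is a hypothetical class, `zlib.crc32` stands in for the hash used by Kafka's own partitioner, and spreading keyless records round-robin is a simplification.

```python
import itertools
import json
import zlib


class ToyProducer:
    """In-memory stand-in for a producer writing to one topic."""

    def __init__(self, num_partitions):
        self.partitions = [[] for _ in range(num_partitions)]
        self._round_robin = itertools.cycle(range(num_partitions))

    def send(self, value, key=None):
        # Serialization: turn the value into bytes before it is stored.
        payload = json.dumps(value).encode("utf-8")
        if key is None:
            # No key: spread records evenly over the partitions.
            partition = next(self._round_robin)
        else:
            # Same key -> same hash -> same partition, which preserves
            # the relative order of records that share a key.
            partition = zlib.crc32(key.encode("utf-8")) % len(self.partitions)
        self.partitions[partition].append((key, payload))
        return partition


producer = ToyProducer(num_partitions=3)
p1 = producer.send({"event": "login"}, key="user-42")
p2 = producer.send({"event": "logout"}, key="user-42")
keyless = [producer.send({"n": i}) for i in range(3)]
```

Both keyed records land in the same partition, so a consumer of that partition sees the login before the logout.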
Consumers

● Subscription to Topics
● Partition Assignment
● Message Consumption
● Offset Tracking
● Acknowledgment
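
Offset tracking and acknowledgment on the consumer side can be sketched as follows. This is a simplified in-memory model, not a client library: `ToyConsumer` and its methods are hypothetical names, and a plain list plays the role of a partition.

```python
class ToyConsumer:
    """Reads one partition and remembers how far it has acknowledged."""

    def __init__(self, log):
        self.log = log        # the partition's records, oldest first
        self.position = 0     # offset of the next record to fetch
        self.committed = 0    # offset to resume from after a restart

    def poll(self, max_records=2):
        # Fetch the next batch and advance the in-memory position.
        batch = self.log[self.position:self.position + max_records]
        self.position += len(batch)
        return batch

    def commit(self):
        # Acknowledge everything fetched so far.
        self.committed = self.position

    def restart(self):
        # After a crash, consumption resumes from the last committed offset.
        self.position = self.committed


consumer = ToyConsumer(["a", "b", "c", "d", "e"])
first_batch = consumer.poll()
consumer.commit()
second_batch = consumer.poll()   # fetched but never committed
consumer.restart()
replayed = consumer.poll()       # the uncommitted batch is delivered again
```

Records that were fetched but not committed are read again after the restart, which is why consumers that commit after processing should tolerate duplicates.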
Consumer Groups

● Shared Workload
● Load Balancing
● Failover and High Availability
● Scaling Out

● When a new consumer joins a group, Kafka rebalances: the group's partitions
are redistributed among its members.
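
A rebalance can be pictured as recomputing the partition assignment for the group's current members. The sketch below uses a plain round-robin rule as an assumption; Kafka offers several assignment strategies, and the `assign` function and consumer names here are hypothetical.

```python
def assign(partitions, consumers):
    """Give each partition to exactly one consumer, round-robin."""
    assignment = {consumer: [] for consumer in consumers}
    for index, partition in enumerate(partitions):
        owner = consumers[index % len(consumers)]
        assignment[owner].append(partition)
    return assignment


partitions = [0, 1, 2, 3]

# Two consumers share the four partitions.
before = assign(partitions, ["c1", "c2"])

# A third consumer joins the group, so the assignment is recomputed.
after = assign(partitions, ["c1", "c2", "c3"])
```

With two members each consumer owns two partitions; after the third joins, the load is spread over three members, and every partition still has exactly one owner within the group.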

References

● https://kafka.apache.org/documentation
● https://www.confluent.io/blog/apache-kafka-intro-how-kafka-works
● https://en.wikipedia.org/wiki/Apache_Kafka
● https://medium.com/swlh/apache-kafka-what-is-and-how-it-works-e176ab31fcd5
● https://lankydan.dev/intro-to-kafka-topics-and-partitions

Thank you for your time and attention 🙂
