Kafka Monitoring: Ivan Turčinović

You might also like

Download as pdf or txt
Download as pdf or txt
You are on page 1of 13

Kafka Monitoring

Ivan Turčinović
ivan.turcinovic@inovatrend.com
Kafka ecosystem
KafkaProducer KafkaProducer ... KafkaProducer

Kafka Cluster Streaming app

Kafka Kafka Kafka


broker 1 broker 2 Streaming app
broker3
.
.
.
Zookeeper 1 Zookeeper 2 Zookeeper 3

Streaming app

2
KafkaConsumer KafkaConsumer ... KafkaConsumer
Kafka Monitoring

• Why
• Troubleshooting
• Identifying potential problems
• Optimize performance
• What
• Logs
• Kafka logs
• System logs
• System resource utilization
• Kafka JMX metrics

3
Demo monitoring setup

Kafka JMX Metrics

jmxtrans

Graphite

Grafana

4
Broker Metrics

• Broker State • BytesInPerSec


• ActiveControllerCount • MessagesInPerSec
• OfflinePartitionsCount • BytesOutPerSec
• UnderReplicatedPartitions • CPULoad
• PartitionCount • Memory Usage
• LeaderCount

5
Broker Performance Metrics
Produce/Fetch Requests

Request Queue

Network Threads
IO Threads

Response Queue Purgatory

Other brokers

6
Broker Performance Metrics

Request Queue Time

Request Local Time


Request Queue

Network Threads
IO Threads

Response Send Time


Response Queue Purgatory

Response Queue Time

Other brokers

7
Request Remote Time
Broker Performance Metrics
Total Time:
• kafka.network:type=RequestMetrics,name=TotalTimeMs,request=Produce
• kafka.network:type=RequestMetrics,name=TotalTimeMs,request=FetchConsumer
• kafka.network:type=RequestMetrics,name=TotalTimeMs,request=FetchFollower
Break Down Times:
• RequestQueueTimeMs
• ResponseSendTimeMs
• ResponseQueueTimeMs
• LocalTimeMs
• RemoteTimeMs
Thread Pools:
• kafka.server:type=KafkaRequestHandlerPool,name=RequestHandlerAvgIdle

8
Percent
• kafka.network:type=SocketServer,name=NetworkProcessorAvgIdlePercent
Producer Metrics

• Producer metrics are picked up from producer JVM, not broker


• kafka.producer:type=producer-metrics,client-id=client_id
• outgoing-byte-rate
• record-error-rate
• record-retry-rate
• record-send-rate
• record-size-avg

9
Consumer Metrics

• Consumer metrics are picked up from consumer JVM, not broker


• kafka.consumer:type=consumer-fetch-manager-metric,client-
id=client_id
• bytes-consumed-rate
• records-consumed-rate
• records-lag
• fetch-latency-avg
• fetch-rate
• records-per-request-avg

10
Consumer Lag Monitoring

• kafka-consumer-groups command

• LinkedIn Burrow
• https://github.com/linkedin/Burrow
• Remora
• https://github.com/zalando-incubator/remora

11
Confluent Control Center

• Commercial product from Confluent


• Included in Confluent Platform Enterprise
• Monitoring
• End-to-end stream monitoring
• Confluent Monitoring Interceptors
• Confluent Metrics Reporter
• System health
• Alerting

12
Thank You!

13

You might also like