ADP 2.1 - Event Broker 2.0 Support Training
Support & Troubleshooting
Application Components: Event Broker Stand Alone

[Diagram: ArcSight SmartConnectors produce CEF events into the stand-alone Event Broker; ArcSight Management Center handles topic setup and route management. Consumers (ArcSight ESM, Hadoop HDFS, user-defined consumers) must be able to process the same CEF version (0.1/1.0) as specified by the connector.]
Application Components: Event Broker + Investigate

[Diagram: ArcSight SmartConnectors send CEF into Event Broker, which applies event transform and routing. Avro events flow to the ArcSight Investigate Event Database; binary events flow to ArcSight ESM (optional); other data consumers read CEF.]
Deployment Architecture: Event Broker Stand Alone

[Diagram: Connectors send CEF to Event Broker instances running on K8 worker nodes; an ArcSight K8 master node coordinates the cluster; ArcMC manages the deployment.]
Deployment Architecture: Event Broker + Investigate

[Diagram: Connectors send CEF to Event Broker instances on K8 worker nodes; Avro events flow to the ArcSight Investigate Database on a worker node. The ArcSight K8 master node hosts the ArcSight Installer; ArcMC and SMTP are attached to the deployment.]
System Topology
[Diagram: ADP system topology (10/15/2016, github.hpe.com/hercules). ADP ArcMC provides manage/monitor/admin functions; Investigate provides search; other applications consume as well. The Event Broker (EB) Streaming Platform runs Kafka brokers and a Schema Registry; EB stream processing runs event transform and event routing as Kafka stream processes. Event sources (ArcSight Connectors and other CEF/Avro sources) produce into Kafka; event consumers include ArcSight Logger, ESM, and others. FIPS, IPv6, and TLS are supported.]
EB K8S Pods
NAME READY STATUS RESTARTS AGE IP NODE
default-http-backend-w5uv6 1/1 Running 0 14d 172.77.38.6 15.214.129.100
eb-c2av-processor-927505239-xc1ol 1/1 Running 0 14d 172.77.79.4 15.214.129.102
eb-kafka-0 1/1 Running 0 14d 172.77.79.3 15.214.129.102
eb-kafka-1 1/1 Running 0 14d 172.77.66.4 15.214.129.103
eb-kafka-2 1/1 Running 0 14d 172.77.86.6 15.214.129.101
eb-kafka-manager-1775413351-4u4o3 1/1 Running 0 14d 172.77.28.3 15.214.129.103
eb-routing-processor-546396016-6g1vc 1/1 Running 0 14d 172.77.86.4 15.214.129.101
eb-schemaregistry-2895860841-hc9vo 1/1 Running 0 14d 172.77.86.3 15.214.129.101
eb-web-service-2621833535-6hji3 2/2 Running 0 14d 172.77.38.10 15.214.129.100
eb-zookeeper-0 1/1 Running 0 14d 172.77.66.3 15.214.129.103
eb-zookeeper-1 1/1 Running 0 14d 172.77.86.5 15.214.129.101
eb-zookeeper-2 1/1 Running 0 14d 172.77.79.5 15.214.129.102
hercules-management-1187739270-qfo13 2/2 Running 0 14d 172.77.38.11 15.214.129.100
hercules-rethinkdb-0 1/1 Running 0 14d 172.77.38.9 15.214.129.100
hercules-search-4175371486-rjgx6 3/3 Running 0 14d 172.77.38.14 15.214.129.100
nginx-ingress-controller-rk7o2 1/1 Running 0 14d 172.77.38.8 15.214.129.100
• EB pods for Kafka and ZooKeeper are bound to worker nodes using labels
• Can either share the same worker node or can use separate nodes
• Other EB pods are not bound to a specific worker node
• K8S will schedule them on one of the available worker nodes
# ssh 15.214.129.102
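The node bindings called out above can be read directly from the `kubectl get pods -o wide` listing. A minimal sketch (a few sample lines from the listing above are inlined so it runs standalone; in practice, pipe the live kubectl output through the same awk filter):

```shell
# Print "pod -> node" for the label-bound Kafka and ZooKeeper pods.
# Sample lines inlined; in practice: kubectl get pods -o wide | awk ...
pods='eb-kafka-0 1/1 Running 0 14d 172.77.79.3 15.214.129.102
eb-kafka-1 1/1 Running 0 14d 172.77.66.4 15.214.129.103
eb-zookeeper-0 1/1 Running 0 14d 172.77.66.3 15.214.129.103'

echo "$pods" | awk '/^eb-(kafka|zookeeper)-[0-9]/ {print $1 " -> " $7}'
```

The last column of the wide output is the node, so the filter works unchanged on a live cluster.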
EB Deployment Topology

[Diagram: nodes, pods, and container processes laid out across the K8S master node and three K8S worker nodes.]
EB + Investigate Deployment Topology

[Diagram: the K8S master node runs the arcsight-installer; the K8S worker nodes run the EB pods (eb-kafka-processor, eb-routing-processor and eb-c2av-processor with atlas_sp containers, eb-schema-registry with atlas_schema-registry, atlas_kafka_manager) plus the Investigate pods (hercules-management, hercules-search with search and search-engine containers, hercules-rethinkdb-0 with rethinkdb), with kubernetes-vault-renew containers alongside.]
EB Deployment Dependency 1/5 through 5/5

[Diagram series: five slides stepping through the deployment dependency order across the K8S master node and worker nodes, starting from the arcsight-installer on the master node.]
[Diagram: Event Broker connection and port topology. Inside the Kubernetes cluster, the master node hosts the arcsight-installer (port 8888) and atlas_kafka_manager; the worker nodes host eb-kafka-0/1/2 (atlas_kafka, ports 9092/9093), atlas_schema-registry (port 8081), the stream processors (atlas_sp), eb-kafka-processor, and the ZooKeeper cluster. Kafka Manager and cAdvisor are reached via localhost. External components include Logger (port 39000), ESM, Vertica, Investigate, and other producers and consumers. Legend: Event Broker components use TLS with FIPS enabled; Kafka Manager access is over an SSH tunnel with FIPS; some localhost connections use no TLS or FIPS.]
EB Data Storage Topology

[Diagram: node/pod layout showing host-path volumes: Kafka data under /opt/arcsight/k8s-hostpath-volume/eb/kafka, ZooKeeper data under /opt/arcsight/k8s-hostpath-volume/eb/zookeeper, and installer data under /opt/arcsight/installer/db.]

• EB Web Service and Schema Registry persist data on a Kafka topic
• Stream processors (c2av, routing) do not persist any data
EB Event Data Flow Topology

[Diagram: event data producers (Connectors) send events into the eb-kafka pods (atlas_kafka) across the K8S worker nodes. The eb-c2av-processor and eb-routing-processor (atlas_sp) apply event transforms and event routing; eb-web-service (atlas_web-service with kubernetes-vault-renew), eb-schema-registry (atlas_schema-registry), and atlas_kafka_manager support the flow. Event data consumers include Logger, ESM, and Vertica.]
Event Broker Installation
Before Setting Up Event Broker Systems
Event Broker Installation
There are different files to download, depending on the deployment environment.

Deployment environment: EB Stand Alone (ADP)
Files to download:
• arcsight-installer-1.0.0-14.rc_eb.x86_64.rpm
Purpose: Installs the ArcSight Installer for EB stand alone. The ArcSight Installer is a web application used to configure and deploy Event Broker to the environment. Images are retrieved from a remote DockerHub; customers must have credentials to successfully retrieve images during deployment.

Deployment environment: EB + Investigate (with Internet access)
Files to download:
• arcsight-installer-1.0.0-14.rc.x86_64.rpm: Installs the ArcSight Installer for EB and Investigate. The ArcSight Installer is a web application used to configure and deploy Event Broker and Investigate to the environment. Images are retrieved from a remote DockerHub; customers must have credentials to successfully retrieve images during deployment.
• arcsight-investigate-vertica-scripts.<key>.tar.gz: Installs Vertica, the Kafka Scheduler, and configures the environment.
• Vertica License (obtained independently)

Deployment environment: Offline installation for EB and Investigate (without Internet access)
Files to download:
• arcsight-installer-1.0.0-14.rc.x86_64.rpm: Installs the ArcSight Installer for EB and Investigate. The ArcSight Installer is a web application used to configure and deploy Event Broker and Investigate to the environment. For environments with NO internet access, images are retrieved locally after downloading.
• arcsight_eb_images_<key>.tar: Contains the Event Broker images.
• arcsight_investigate_images_<key>.tar: Contains the Investigate images.
• arcsight-investigate-vertica-scripts.<key>.tar.gz: Installs Vertica, the Kafka Scheduler, and configures the environment.
• Vertica License (obtained independently)
Pre-requisites – Event Broker + Investigate Systems
Installer Properties File
Master Node Location: /opt/arcsight/installer.properties
## Event Broker Kafka will use TLS Client Authentication to verify client connections
predeploy.eb.init.client-auth=false
## Kafka log retention size for the Vertica Avro topic. This topic is uncompressed and requires more space to hold events for the same duration.
predeploy.eb.init.kafkaRetentionBytesForVertica=10737418240
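For reference, the default value above is 10 GiB expressed in bytes:

```shell
# 10 GiB in bytes matches the default kafkaRetentionBytesForVertica value
echo $((10 * 1024 * 1024 * 1024))   # 10737418240
```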
Installer Properties File (continued)
Master Node Location: /opt/arcsight/installer.properties
## The message format version the broker will use to append messages to the logs.
predeploy.log.message.format.version=0.10.1.0
Installer Properties File (continued)
Master Node Location: /opt/arcsight/installer.properties
## ArcMC hostname
predeploy.eb.arcmc.hosts=localhost:443
## The endpoint identification algorithm to validate the server hostname using the server certificate.
predeploy.ssl.endpoint.identification.algorithm=https
Installer Properties File (continued)
Master Node Location: /opt/arcsight/installer.properties
#ArcSight Investigate
investigateTag=1.00.0
search.image.tag=${investigateTag}
search.engine.image.tag=${investigateTag}
management.image.tag=${investigateTag}
rethinkdb.image.tag=${investigateTag}
Adding Producers and Consumers
– Connectors
– Import Kafka Certificate into Keystore
– Add Kafka as a destination:
– “eb-cef” topic for sending CEF data (Logger, Investigate)
– “eb-esm” topic for sending event data (ESM)
– Loggers
– Sign Kafka Consumer Certificate on Event Broker
– Connect to default “eb-cef” topic on Event Broker
– Investigate
– Connect Vertica scheduler to Kafka topic
Verify that the Event Broker Cluster is Healthy
Container Dependency Order
After deploying Event Broker, pods are configured to start in the following order. Downstream pods will not
start until the dependencies are met.
– A quorum of zookeeper pods in the cluster must be up (2 of 3, or 3 of 5). The total number of zookeepers must be odd.
– All Kafka pods must be up
– Schema Registry pod must be up
– Bootstrap Web Service, Kafka Manager
– Transformation Stream Processor (C2AV), Routing Stream Processor
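The quorum rule above is floor(N/2) + 1 for an ensemble of N ZooKeeper pods, which is also why the total must be odd: an even-sized ensemble tolerates no more failures than the next smaller odd one. A quick shell sketch:

```shell
# Minimum number of ZooKeeper pods that must be up for an ensemble of size N
quorum() { echo $(( $1 / 2 + 1 )); }

quorum 3   # 2 of 3
quorum 5   # 3 of 5
quorum 4   # 3 of 4 -- same failure tolerance as a 3-node ensemble
```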
Pod Status: A Healthy Cluster
# kubectl get pods -o wide
NAME READY STATUS RESTARTS AGE IP NODE
default-http-backend-yjcwc 1/1 Running 0 17h 172.77.40.6 15.214.137.102
eb-c2av-processor-967417906-vzu2k 1/1 Running 0 16h 172.77.16.5 15.214.137.112
eb-kafka-0 1/1 Running 1 16h 172.77.16.3 15.214.137.112
eb-kafka-1 1/1 Running 1 16h 172.77.59.6 15.214.137.113
eb-kafka-2 1/1 Running 0 16h 172.77.40.11 15.214.137.102
eb-kafka-manager-3416293552-otw4w 1/1 Running 0 16h 172.77.16.4 15.214.137.112
eb-routing-processor-965434368-m5bme 1/1 Running 0 16h 172.77.59.5 15.214.137.113
eb-schemaregistry-2463124937-0r27o 1/1 Running 1 16h 172.77.59.4 15.214.137.113
eb-web-service-3440844888-mdwnb 2/2 Running 0 4h 172.77.40.12 15.214.137.102
eb-zookeeper-0 1/1 Running 0 16h 172.77.59.3 15.214.137.113
eb-zookeeper-1 1/1 Running 0 16h 172.77.16.6 15.214.137.112
eb-zookeeper-2 1/1 Running 0 16h 172.77.40.10 15.214.137.102
nginx-ingress-controller-we1fi 1/1 Running 0 17h 172.77.40.8 15.214.137.102
Pod Status: An Unhealthy Cluster
# kubectl get pods
NAME READY STATUS RESTARTS AGE
Check Kafka Scheduler on Vertica
# ./install-vertica/kafka_scheduler status
Verify the topic partition count and replication count
Why it is important:
– Check that the configured partition count matches what you expect it to be.
– Check the replication count and partition count for the topic using Event Broker Manager (Kafka Manager) or using the kafka-topics command line:
# kubectl exec eb-zookeeper-0 -- kafka-topics --zookeeper localhost:2181
--describe --topic eb-cef
Topic:eb-cef PartitionCount:5 ReplicationFactor:2 Configs:
Topic: eb-cef Partition: 0 Leader: 1002 Replicas: 1002,1003 Isr: 1002,1003
Topic: eb-cef Partition: 1 Leader: 1003 Replicas: 1003,1001 Isr: 1003,1001
Topic: eb-cef Partition: 2 Leader: 1001 Replicas: 1001,1002 Isr: 1001,1002
Topic: eb-cef Partition: 3 Leader: 1002 Replicas: 1002,1001 Isr: 1002,1001
Topic: eb-cef Partition: 4 Leader: 1003 Replicas: 1003,1002 Isr: 1003,1002
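One way to read this output for trouble is to compare the Replicas and Isr columns: a partition whose in-sync replica list is shorter than its replica list is under-replicated. A small sketch that scans `kafka-topics --describe` output (sample lines inlined; the field positions assume the format shown above):

```shell
# Flag partitions where the Isr list is shorter than the Replicas list.
describe='Topic: eb-cef Partition: 0 Leader: 1002 Replicas: 1002,1003 Isr: 1002,1003
Topic: eb-cef Partition: 1 Leader: 1003 Replicas: 1003,1001 Isr: 1003'

echo "$describe" | awk '/Partition:/ {
    n_rep = split($8, r, ","); n_isr = split($10, i, ",")
    if (n_isr < n_rep) print "under-replicated: partition " $4
}'
```

In the sample, partition 1 has lost replica 1001 from its in-sync set and is flagged.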
Software Logs and Data
Licensing
Troubleshooting – Installation
Event Broker pods show multiple restarts
This is normal.
Pods will restart as they attempt to synchronize with other pods.
Restarts should cease shortly after all pods in the EB cluster have deployed on all servers.
The number of restarts should be low: fewer than 10 in most cases.
The main factor affecting the number of restarts is the speed at which servers can connect and download containers.
Some pods are not starting, with status ErrImagePull
Problem: This indicates that the image cannot be downloaded from DockerHub. This can be confirmed by
running kubectl get pods, and then executing kubectl describe pod podname.
You will see a message similar to the following:
Failed to pull image "hub.docker.io/hercules/search-engine:master" net/http: request
canceled
Solution: The pod will need to be deleted to re-trigger a download of the image.
– Execute the command: kubectl delete pod failing-podname
– This will terminate the failing pod and create a new one with a different name.
– Make sure the image pull is successful by running kubectl get pods to see the status of the
newly recreated pod.
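The delete-and-recheck step can be scripted for every pod stuck in ErrImagePull. A sketch, with sample listing data inlined and the actual delete left commented out:

```shell
# Find pods whose STATUS column is ErrImagePull and (optionally) delete them.
# Sample data; in practice: pods=$(kubectl get pods --no-headers)
pods='eb-web-service-2621833535-6hji3 2/2 Running 0 14d
hercules-search-4175371486-rjgx6 3/3 ErrImagePull 0 14d'

echo "$pods" | awk '$3 == "ErrImagePull" {print $1}' |
while read -r pod; do
    echo "would delete: $pod"
    # kubectl delete pod "$pod"   # replacement pod re-triggers the image pull
done
```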
Multiple Kafka Crashes/Restarts
If data is not removed from a machine prior to re-installation, and the Kafka cluster has been reconfigured,
then Kafka brokers may launch with duplicate IDs, causing one of the Kafka nodes to fail to start.
To identify the issue: Look for the following log in one of the Kafka nodes.
[2017-04-09 14:56:06,772] FATAL [Kafka Server 1001], Fatal error during KafkaServer startup. Prepare to shutdown
(kafka.server.KafkaServer)
To verify the issue: Connect to each system that is running a Kafka broker and check the assigned broker.id
value of each. The broker.id value defined on each Kafka node must be unique.
# ssh worker_node_1 cat /opt/arcsight/k8s-hostpath-volume/eb/kafka/meta.properties | grep id
broker.id=1001
broker.id=1001
broker.id=1002
To recover: If you are reinstalling the cluster, delete the existing data directory /opt/arcsight/ as part of
uninstalling the original install. If you are re-labeling or updating an existing cluster, make sure the cluster
labels match the original worker node for each Kafka node, without conflicts.
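The per-node uniqueness check can be automated: collect the broker.id values (for example with the ssh/cat command shown above) and look for duplicates. A sketch using the sample values shown:

```shell
# Detect duplicate broker.id values collected from each Kafka node's meta.properties.
# Sample data; in practice gather one line per node over ssh.
ids='broker.id=1001
broker.id=1001
broker.id=1002'

dupes=$(echo "$ids" | cut -d= -f2 | sort | uniq -d)
if [ -n "$dupes" ]; then
    echo "duplicate broker.id(s): $dupes"
fi
```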
Troubleshooting – Other
Event Broker and Vertica Diagnosis scripts
Diagnostic tools are packaged in the Event Broker “Web Service” container; they extract logs and other cluster information
that can be used to investigate issues.
# find web service container
$ docker ps | grep -i atlas_web-service
c226ee041c48 hub.docker.hpecorp.net/hercules/atlas_web-service:latest
Cannot Query Zookeeper
Symptom:
When you run the kubectl get pods command to get the status of the pods, you see that downstream
pods (see the pod dependency order) do not stay up and that their status is a 'CrashLoop'-type error.
Conditions to look for:
Check that zookeeper pods are running.
– If the zookeeper pod status is Pending, you may not have labeled the nodes (zk=yes). Verify that the
nodes are labeled using the kubectl get nodes -L=zk command.
– Verify that you configured an odd number of zookeepers in installer.properties
predeploy.eb.zookeeper.count attribute.
– Check the zookeeper pod logs for errors using the kubectl logs <pod name> command.
Common Errors/Warnings in Zookeeper Logs
– Quorum Exceptions: Cannot elect a leader. If you see this type of error, check the conditions described in
‘Cannot query zookeeper’
– Socket errors: This can occur if there are too many connections.
– The solution is to restart the pod using kubectl delete pod <pod_name>.
– The pod will be recreated automatically.
Communication Errors
– SSL Connection Errors: These warnings occur if there is a connection issue between Kafka and a
consumer or producer. Check the steps that you used to import certificates to both EB and consumers.
– Communication between brokers: If you see this type of error, host names may not be configured properly. It
is possible that the node cannot perform reverse lookup or that DNS is not set up properly.
A consumer cannot read events from EB
– If this is a new setup of the Kafka scheduler, check that the Kafka scheduler is configured to communicate on
port 39092.
– If this was working at first, but stopped working, it is possible that the offset value is not recognized:
– In this scenario, the Kafka scheduler fails to recognize the offset IDs of messages that are in the topic. It can happen if the
Kafka scheduler unexpectedly stops reading from the topic, and then is restarted.
– Solution: Execute the kafka_scheduler delete command to delete the metadata. After doing this, immediately run the
kafka_scheduler create command to set up the scheduler.
– Other items to check:
– Check the network connection.
– Check whether the Kafka pods are down.
– Check that you configured the consumer to communicate with all nodes running Kafka. If you specified a connection to
only one node in the cluster and that node is down, events will not flow.
– If you are encountering SSL connection errors as well, check the steps that you used to import certificates to both EB
and consumers.
An EB component crashes: web service, stream processors, etc.
– If this happens at start-up, check the container dependency order. Have any of the dependency pods failed to
start, or crashed?
– Check memory: Does the system have enough memory and disk space? It is possible that the component
requires more memory than the system has available.
– Check whether there are too many open sockets.
Pods will not start after node is shut down for more than 6 hours
This issue occurs after a system has been down for more than 6 hours and is related to an expired certificate: if nodes
are down for more than 6 hours, certificates are not renewed. The workaround:
1. Connect to the master node.
2. Run the update_kubevaulttoken script.
# /opt/arcsight/kubernetes/bin/update_kubevaulttoken
3. Check the status of the Event Broker pods. They should restart automatically.
# kubectl get pod -o wide
4. If they do not come up, then undeploy and then redeploy EB using the ArcSight Installer.
Event Broker EPS is lower than expected
– Check whether there are resource constraints on the brokers: CPU, memory, or a full disk. Check usage at the
system level or with ArcMC.
– Check for a network bottleneck.
– Check whether the Stream Processor is able to keep up with the CEF-to-Avro transformation.
In ArcMC, if the Stream Processor cannot keep up, its metric will be lower than the Connector EPS. The Stream
Processor may be constrained in some way, such as by limited system resources.
Performance
Deployment resource sizing is an important factor in Event Broker performance.
The following slides are from the Event Broker Sizing guidelines on iRock:
https://irock.jiveon.com/docs/DOC-141395
Performance
– Try to size so that production matches consumption for your SLOWEST Consumer. It's much better to
have an idle Consumer than one that cannot keep up and forces the Broker to constantly read from
physical disk.
– Throughput is limited by broker network and disk bandwidth. Keep as much as possible in memory and
consider production AND consumption bandwidth – it can get very big very quickly!
– Brokers can converge hundreds of producers into a single topic – allows for great SmartConnector scaling
(eg WINC/WUC)
– Latency is a key factor – do not attempt to Produce or Consume over a WAN link such as between data
centers. Consider separate clusters in each data center and use SmartConnectors to perform dual
destination feeds if required.
– Acknowledgement mode will cause a performance hit – consider this when determining required
throughput. Refer to the Sizing Guide on iRock for detailed examples:
https://irock.jiveon.com/docs/DOC-141395
Performance
Notes on potential bottlenecks
– Think about production + consumption EPS: you may have 10K EPS inbound, with Hadoop
AND Logger both consuming (20K EPS of consumption), so you need to size for 30K EPS.
– Hardware is PER NODE for a minimum 3 node cluster.
– This assumes NO ACK and NO TLS.
– Leader ACK introduces roughly a 66% performance impact. FULL ACK is even worse!
– Assumed 1765 byte CEF events. This is obviously a fluid value.
– Keep in mind that compression in KAFKA is performed on the Producer (e.g. the SmartConnector)
using GZIP. KAFKA itself plays no role in compression of data.
– But this becomes far more complicated when dealing with BINARY and AVRO for ESM and Vertica!
– Always recommend 10Gbit network connections INSIDE of the cluster!
https://irock.jiveon.com/docs/DOC-141395
Back Up
Pre-requisites – Vertica Systems
More Information
Thank you