
Event Broker 2.0
Support & Troubleshooting

Thursday May 4, 2017


Event Broker 2.0 Agenda

Topic # | Topic | Presenter | Duration (mins) including Q&A
1 Application Component & Deployment Architecture
2 System Topology
3 Event Broker Installation
4 Verify that Event Broker Cluster is Healthy
5 Troubleshooting
6 Performance
7 Q&A

2
Application Components: Event Broker Stand Alone
[Diagram] ArcSight Management Center performs topic setup and route management for the ArcSight Event Broker. Consumers must be able to process the same CEF version as specified by the connector (0.1/1.0).
– eb-cef topic (CEF 0.1/1.0, routing OK): SmartConnectors produce CEF 0.1/1.0; consumers include ArcSight Logger, Hadoop HDFS, and user-defined consumers.
– eb-esm topic (binary, no routing for binary): ArcSight SmartConnector produces binary events consumed by ArcSight ESM.
3
Application Components: Event Broker + Investigate
[Diagram] Application layer: ArcSight Management Center (manage, monitor, administer), ArcSight Installer (install, deploy, elastic scale), and ArcSight Investigate (large-scale English-language search).
Data layer: data producers (ArcSight SmartConnectors) feed Event Broker, which provides data streaming, coordination and management, plus event transform and routing. CEF events are transformed to Avro for the ArcSight Investigate Event Database; binary events go to ArcSight ESM (optional). Data consumers read from Event Broker.
4
Deployment Architecture: Event Broker Stand Alone
[Diagram] One ArcSight K8 master node plus three K8 worker nodes, each worker running Event Broker. CEF connectors send events into the cluster; ArcMC manages it.
5
Deployment Architecture: Event Broker + Investigate
[Diagram] Three K8 worker nodes each run Event Broker; the ArcSight K8 master node runs ArcSight Investigate and the ArcSight Installer. Connectors send CEF into the cluster, and Avro events flow to the ArcSight Investigate Database. ArcMC and an SMTP server are also connected.
6
System Topology

7
ADP
[Diagram] ArcMC (manage, monitor, admin), Investigate (search), and other applications sit on top of Event Broker. Event producers (ArcSight Connectors and other event sources) send CEF and AVRO into the EB Web Services. The EB Streaming Platform consists of the Kafka cluster plus the Schema Registry; EB Stream Processing runs the Event Transform and Event Routing stream processes (FIPS, IPv6, TLS). Event consumers include ArcSight Logger, ESM, other Kafka consumers, and long-term storage.
EB K8S Pods
NAME READY STATUS RESTARTS AGE IP NODE
default-http-backend-w5uv6 1/1 Running 0 14d 172.77.38.6 15.214.129.100
eb-c2av-processor-927505239-xc1ol 1/1 Running 0 14d 172.77.79.4 15.214.129.102
eb-kafka-0 1/1 Running 0 14d 172.77.79.3 15.214.129.102
eb-kafka-1 1/1 Running 0 14d 172.77.66.4 15.214.129.103
eb-kafka-2 1/1 Running 0 14d 172.77.86.6 15.214.129.101
eb-kafka-manager-1775413351-4u4o3 1/1 Running 0 14d 172.77.28.3 15.214.129.103
eb-routing-processor-546396016-6g1vc 1/1 Running 0 14d 172.77.86.4 15.214.129.101
eb-schemaregistry-2895860841-hc9vo 1/1 Running 0 14d 172.77.86.3 15.214.129.101
eb-web-service-2621833535-6hji3 2/2 Running 0 14d 172.77.38.10 15.214.129.100
eb-zookeeper-0 1/1 Running 0 14d 172.77.66.3 15.214.129.103
eb-zookeeper-1 1/1 Running 0 14d 172.77.86.5 15.214.129.101
eb-zookeeper-2 1/1 Running 0 14d 172.77.79.5 15.214.129.102
hercules-management-1187739270-qfo13 2/2 Running 0 14d 172.77.38.11 15.214.129.100
hercules-rethinkdb-0 1/1 Running 0 14d 172.77.38.9 15.214.129.100
hercules-search-4175371486-rjgx6 3/3 Running 0 14d 172.77.38.14 15.214.129.100
nginx-ingress-controller-rk7o2 1/1 Running 0 14d 172.77.38.8 15.214.129.100

EB and Investigate running on a 4-node (1 master + 3 worker) K8S cluster.

There are additional pods used by K8S, in the core namespace (next slide).

# kubectl get pods -o wide


9
NAMESPACE NAME READY STATUS RESTARTS AGE IP NODE
core apiserver-15.214.129.100 1/1 Running 0 14d 15.214.129.100 15.214.129.100
core controller-15.214.129.100 1/1 Running 0 14d 15.214.129.100 15.214.129.100
core heapster-apiserver-2267066391-wkmrq 1/1 Running 0 14d 172.77.38.7 15.214.129.100
core kube-dns-v19-h837r 3/3 Running 0 14d 172.77.38.5 15.214.129.100
core kube-proxy-15.214.129.102 1/1 Running 0 14d 15.214.129.102 15.214.129.102
core kube-proxy-15.214.129.103 1/1 Running 0 14d 15.214.129.103 15.214.129.103
core kube-proxy-15.214.129.100 1/1 Running 0 14d 15.214.129.100 15.214.129.100
core kube-proxy-15.214.129.101 1/1 Running 0 14d 15.214.129.101 15.214.129.101
core kube-registry-proxy-15.214.129.102 1/1 Running 0 14d 172.77.79.2 15.214.129.102
core kube-registry-proxy-15.214.129.103 1/1 Running 0 14d 172.77.66.2 15.214.129.103
core kube-registry-proxy-15.214.129.100 1/1 Running 0 14d 172.77.38.2 15.214.129.100
core kube-registry-proxy-15.214.129.101 1/1 Running 0 14d 172.77.86.2 15.214.129.101
core kube-registry-v0-mjd8b 1/1 Running 0 14d 172.77.38.4 15.214.129.100
core kubernetes-vault-690566690-db7au 1/1 Running 0 14d 172.77.38.3 15.214.129.100
core scheduler-15.214.129.100 1/1 Running 0 14d 15.214.129.100 15.214.129.100
default default-http-backend-w5uv6 1/1 Running 0 14d 172.77.38.6 15.214.129.100
default eb-c2av-processor-927505239-xc1ol 1/1 Running 0 14d 172.77.79.4 15.214.129.102
default eb-kafka-0 1/1 Running 3 14d 172.77.79.3 15.214.129.102
default eb-kafka-1 1/1 Running 2 14d 172.77.66.4 15.214.129.103
default eb-kafka-2 1/1 Running 1 14d 172.77.28.4 15.214.129.231
default eb-kafka-manager-1775413351-4u4o3 1/1 Running 0 14d 172.77.28.3 15.214.129.231
default eb-routing-processor-546396016-6g1vc 1/1 Running 0 14d 172.77.86.4 15.214.129.101
default eb-schemaregistry-2895860841-hc9vo 1/1 Running 2 14d 172.77.86.3 15.214.129.101
default eb-web-service-2621833535-6hji3 2/2 Running 0 14d 172.77.38.10 15.214.129.100
default eb-zookeeper-0 1/1 Running 0 14d 172.77.66.3 15.214.129.103
default eb-zookeeper-1 1/1 Running 0 14d 172.77.86.5 15.214.129.101
default eb-zookeeper-2 1/1 Running 0 14d 172.77.79.5 15.214.129.102
default hercules-management-1187739270-qfo13 2/2 Running 0 14d 172.77.38.11 15.214.129.100
default hercules-rethinkdb-0 1/1 Running 0 14d 172.77.38.9 15.214.129.100
default hercules-search-4175371486-rjgx6 3/3 Running 0 14d 172.77.38.14 15.214.129.100
default nginx-ingress-controller-rk7o2 1/1 Running 0 14d 172.77.38.8 15.214.129.100

# kubectl get pods --all-namespaces -o wide


10
EB Pods and Worker Nodes

NAME STATUS AGE KAFKA ZK INVESTIGATE


15.214.129.100 Ready 14d <none> <none> yes
15.214.129.101 Ready 14d yes yes <none>
15.214.129.102 Ready 14d yes yes <none>
15.214.129.103 Ready 14d yes yes <none>

• EB pods for Kafka and ZooKeeper are bound to worker nodes using labels
• Can either share the same worker node or can use separate nodes
• Other EB pods are not bound to a specific worker node
• K8S will schedule them on one of the available worker nodes

# kubectl get nodes -L=kafka,zk,investigate
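For reference, these bindings are created by labeling nodes; a minimal sketch (node IP illustrative, label keys as shown in the -L query above):

# kubectl label node 15.214.129.101 kafka=yes zk=yes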


11
Docker containers for the pods
# kubectl get pods -o wide | grep c2av
eb-c2av-processor-927505239-xc1ol 1/1 Running 0 14d 172.77.79.4 15.214.129.102

# ssh 15.214.129.102 docker ps | grep eb-c2av-processor-927505239-xc1ol | grep arcsightsecurity


d076be059ca0 index.docker.io/arcsightsecurity/atlas_sp:2.00.0 "/bin/bash -c 'source" 2 weeks
ago Up 2 weeks
k8s_atlas-c2av-stream.d7a21b95_eb-c2av-processor-927505239-xc1ol_default_4d1e208c-2461-11e7-9cdc-
40a8f02f94fc_42653ee5

# ssh 15.214.129.102

# docker exec -ti d076be059ca0 bash


[root@eb-c2av-processor-927505239-xc1ol /]# ps -ef | grep c2av | grep java
root 82 80 18 Apr18 ? 2-13:14:17 java -Xms2g -Xmx2g -DLOG_LEVEL=info
-DROLLOVER_POLICY=org.apache.log4j.DailyRollingFileAppender -cp ./bin/../lib/stream-processing.jar
-Djavax.net.ssl.trustStore=/run/secrets/eb-c2av-stream-processor.truststore
-Djavax.net.ssl.trustStorePassword=qwerqawer com.hpe.arcsight.eb.sp.Main -p ./config/stream.properties

• Find the node on which an EB pod is running
• Then, on that node, locate the EB Docker container
• Use any Docker commands on that container for debugging
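For example, to tail the container's own logs on that node (container ID from the docker ps output above):

# docker logs --tail 100 d076be059ca0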

12
EB Deployment Topology
[Diagram]
K8S Master Node: arcsight-installer; pod eb-web-service (containers atlas_web-service, kubernetes-vault-renew).
K8S Worker Node 1: pods eb-kafka-2 (atlas_kafka), eb-zookeeper-1 (atlas_zookeeper), eb-schema-registry (atlas_schema-registry), eb-routing-processor (atlas_sp).
K8S Worker Node 2: pods eb-kafka-0 (atlas_kafka), eb-zookeeper-2 (atlas_zookeeper), eb-c2av-processor (atlas_sp).
K8S Worker Node 3: pods eb-kafka-1 (atlas_kafka), eb-zookeeper-0 (atlas_zookeeper), eb-kafka-manager (atlas_kafka_manager).
Legend: Node > Pod > Container process. The KAFKA and ZK labels bind the Kafka and ZooKeeper pods to their worker nodes.
EB + Investigate Deployment Topology
[Diagram] Same layout as the EB deployment topology, with the Investigate pods added on the K8S master node:
K8S Master Node: arcsight-installer; eb-web-service (atlas_web-service, kubernetes-vault-renew); hercules-management (mgmt, kubernetes-vault-renew); hercules-search (search, search-engine, kubernetes-vault-renew); hercules-rethinkdb-0 (rethinkdb).
K8S Worker Nodes: eb-kafka-0/1/2 (atlas_kafka), eb-zookeeper-0/1/2 (atlas_zookeeper), eb-schema-registry (atlas_schema-registry), eb-routing-processor and eb-c2av-processor (atlas_sp), eb-kafka-manager (atlas_kafka_manager), distributed as on the previous slide.
Legend: Node > Pod > Container process.
EB Deployment Dependency 1/5
[Diagram] K8S master node and three worker nodes, with only the arcsight-installer present on the master. (Legend: Node > Pod > Container process.)
EB Deployment Dependency 2/5
[Diagram] The ZooKeeper pods start first: eb-zookeeper-1, eb-zookeeper-2, and eb-zookeeper-0 (atlas_zookeeper), one per worker node.
EB Deployment Dependency 3/5
[Diagram] Once ZooKeeper is up, the Kafka pods start: eb-kafka-2, eb-kafka-0, and eb-kafka-1 (atlas_kafka), one per worker node.
EB Deployment Dependency 4/5
[Diagram] Once Kafka is up, the Schema Registry pod starts: eb-schema-registry (atlas_schema-registry).
EB Deployment Dependency 5/5
[Diagram] Finally, the remaining pods start on top of ZooKeeper, Kafka, and Schema Registry: the Web Service pod (eb-web-service: atlas_web-service, kubernetes-vault-renew), the Kafka Manager pod (eb-kafka-manager: atlas_kafka_manager), and the stream processor pods (eb-routing-processor and eb-c2av-processor: atlas_sp).
Investigate High Performance Configuration
[Diagram] Kubernetes cluster: the Kubernetes master runs Investigate; three Kubernetes workers each run Event Broker. A separate three-node Vertica cluster holds the event database.


Investigate High Availability Configuration Option 1
[Diagram] Kubernetes cluster: Investigate runs on the master and all three workers; Event Broker runs on the three workers. A separate three-node Vertica cluster.


Investigate High Availability Configuration Option 2
[Diagram] Kubernetes cluster of three workers, each running Investigate and Event Broker (no dedicated master shown). A separate three-node Vertica cluster.


Investigate Demo Configuration
[Diagram] A single Kubernetes master runs both Investigate and Event Broker, with a single-node Vertica cluster.


EB Network Topology
[Diagram] Ports exposed across the K8S master node and worker nodes:
– 38080: atlas_web-service (eb-web-service) on the master node; used by ArcMC, Logger, and ESM
– 8888: arcsight-installer on the master node
– 39000: atlas_kafka_manager (eb-kafka-manager)
– 32181: atlas_zookeeper (eb-zookeeper-0/1/2), one per worker node
– 9093: atlas_kafka (eb-kafka-0/1/2), one per worker node; used by Connectors
– 9092: atlas_kafka; used by Vertica (see note)
– 8081: atlas_schema-registry (eb-schema-registry)
A ZooKeeper or Kafka cluster connection line implies a connection to all cluster nodes.
NOTE: Kafka cluster port 9092 is used by Vertica if Investigate is installed.
Event Broker 2.0 TLS/FIPS Communication Paths
[Diagram] Producers (Connectors, other producers) and consumers (ESM, Vertica, other consumers), along with ArcMC, the Web Services, Schema Registry, and Stream/Routing Process components, connect to the Kafka cluster, which in turn connects to the ZooKeeper cluster. Kafka Manager reaches Kafka and ZooKeeper over JMX, and cAdvisor is accessed on localhost.
Legend: Event Broker component; TLS + FIPS enabled; external component; SSH tunnel & FIPS; no TLS or FIPS.
EB Data Storage Topology
[Diagram]
– On each K8S worker node, Kafka and ZooKeeper store data on the node's local disk:
/opt/arcsight/k8s-hostpath-volume/eb/kafka
/opt/arcsight/k8s-hostpath-volume/eb/zookeeper
– On the K8S master node, the arcsight-installer persists configuration data to local disk:
/opt/arcsight/installer/db
– EB Web Service and Schema Registry persist data on a Kafka topic.
– Stream processors (c2av, routing) do not persist any data.
(Legend: Node > Pod > Container process.)
EB Event Data Flow Topology
[Diagram] Event data producers (Connectors) publish into the Kafka cluster (eb-kafka-0/1/2) on the worker nodes. The stream processors consume and republish: eb-c2av-processor applies event transforms and eb-routing-processor applies event routing. Event data consumers (Logger, ESM, Vertica) read from the Kafka topics. The web service, schema registry, Kafka manager, and ZooKeeper pods support the flow.
(Legend: Node > Pod > Container process.)
Event Broker
Installation

28
Before Setting Up Event Broker Systems

Make sure to ask customers about these when starting an investigation.


– Identify Resource Sizing:
Resource sizing decisions are determined by deployment topology, event size, and event rate. These
impact Event Broker performance. See ArcSight ADP 2.1 - Event Broker 2.0 Sizing.pptx
– Identify the Encryption Mode: Connector, EB, ArcMC, and ESM must be configured to the same mode.
Options are:
– TLS
– TLS + Client Authentication
– TLS + FIPS
– TLS + FIPS + Client Authentication
– A note about TLS:
– Investigate supports TLS on its connections, including to Vertica.
– Vertica does not support TLS between Vertica and EB (Kafka)

29
Event Broker Installation
There are different files to download, depending on the deployment environment.

EB Stand Alone (ADP):
• arcsight-installer-1.0.0-14.rc_eb.x86_64.rpm: Installs the ArcSight Installer for EB stand alone. The ArcSight Installer is a web application used to configure and deploy Event Broker to the environment. Images are retrieved from a remote DockerHub; customers must have credentials to successfully retrieve images during deployment.

EB + Investigate (with Internet access):
• arcsight-installer-1.0.0-14.rc.x86_64.rpm: Installs the ArcSight Installer for EB and Investigate. The ArcSight Installer is a web application used to configure and deploy Event Broker and Investigate to the environment. Images are retrieved from a remote DockerHub; customers must have credentials to successfully retrieve images during deployment.
• arcsight-investigate-vertica-scripts.<key>.tar.gz: Installs Vertica and the Kafka Scheduler, and configures the environment.
• Vertica License (obtained independently)

Offline installation for EB and Investigate (without Internet access):
• arcsight-installer-1.0.0-14.rc.x86_64.rpm: Installs the ArcSight Installer for EB and Investigate, for environments with no Internet access. Images are retrieved locally after downloading.
• arcsight_eb_images_<key>.tar: Contains the Event Broker images.
• arcsight_investigate_images_<key>.tar: Contains the Investigate images.
• arcsight-investigate-vertica-scripts.<key>.tar.gz: Installs Vertica and the Kafka Scheduler, and configures the environment.
• Vertica License (obtained independently)
Pre-requisites – Event Broker + Investigate Systems

Step | Master Node | Worker Nodes
Set hostname of systems | Yes | Yes
Disable SE Linux | Yes | Yes
Enable firewalld | Yes | Yes
Install and configure yum | Yes | Yes
Install Java (OpenJDK) 1.8.0_121 or higher | Yes | No
Check that chrony is installed and running | Yes | Yes
Configure system to use network proxy (if required by your network policy) | Yes | Yes
Increase the default user process limit | Yes | Yes
Generate key-pair; configure ssh from master to workers | Yes | No

31
Installer Properties File
Master Node Location: /opt/arcsight/installer.properties

## All Event Broker components will use FIPS-certified encryption algorithms


predeploy.eb.init.fips=false

## Event Broker Kafka will use TLS Client Authentication to verify client connections
predeploy.eb.init.client-auth=false

## Number of partitions for Event Broker topics in Kafka


predeploy.eb.init.noOfTopicPartitions=5

## Replication factor for Event Broker topics in Kafka


predeploy.eb.init.topicReplicationFactor=2

## Kafka log retention size


predeploy.eb.init.kafkaRetentionBytes=10737418240

## Kafka log retention size for the Vertica avro topic. This is uncompressed and requires more space
to hold events for the same duration.
predeploy.eb.init.kafkaRetentionBytesForVertica=10737418240

32
Installer Properties File (continued)
Master Node Location: /opt/arcsight/installer.properties

## Kafka log retention duration


predeploy.eb.init.kafkaRetentionHours=672

## Kafka inter-broker protocol version


predeploy.inter.broker.protocol.version=0.10.1.0

## The message format version the broker will use to append messages to the logs.
predeploy.log.message.format.version=0.10.1.0

## Size of kafka and zookeeper pet-sets


predeploy.eb.kafka.count=3
predeploy.eb.zookeeper.count=3

## Host path to store data persistently


predeploy.eb.kafka.path=/opt/arcsight/k8s-hostpath-volume/eb/kafka
predeploy.eb.zookeeper.path=/opt/arcsight/k8s-hostpath-volume/eb/zookeeper

33
Installer Properties File (continued)
Master Node Location: /opt/arcsight/installer.properties

## ArcMC hostname
predeploy.eb.arcmc.hosts=localhost:443

## The endpoint identification algorithm to validate the server hostname using the server certificate.
predeploy.ssl.endpoint.identification.algorithm=https

## The number of stream threads


predeploy.stream.num.threads=6

## truncate fields in C2av


predeploy.c2av.field.truncate=false

## Log level for each EB container


predeploy.level=info
predeploy.kafka.log.level=${predeploy.level}
predeploy.zookeeper.log.level=${predeploy.level}
predeploy.schema.log.level=${predeploy.level}
predeploy.web.service.log.level=${predeploy.level}
predeploy.c2av.stream.processor.log.level=${predeploy.level}
predeploy.eventbroker.routing.processor.log.level=${predeploy.level}

## Host path directory for ArcMC certificates


predeploy.arcmc.certs.path=/opt/arcsight/k8s-hostpath-volume/eb/arcmccerts

34
Installer Properties File (continued)
Master Node Location: /opt/arcsight/installer.properties

# ArcSight Event Broker


ebTag=2.00.0
orchestration.bootstrap.image.tag=${ebTag}
kafka.manager.image.tag=${ebTag}
web.service.image.tag=${ebTag}
c2av.stream.processor.image.tag=${ebTag}
eventbroker.routing.processor.image.tag=${ebTag}
kafka.sr.image.tag=${ebTag}
kafka.image.tag=${ebTag}

#ArcSight Investigate
investigateTag=1.00.0
search.image.tag=${investigateTag}
search.engine.image.tag=${investigateTag}
management.image.tag=${investigateTag}
rethinkdb.image.tag=${investigateTag}

35
Adding Producers and Consumers

– Connectors
– Import Kafka Certificate into Keystore
– Add Kafka as a destination:
– “eb-cef” topic for sending CEF data (Logger, Investigate)
– “eb-esm” topic for sending event data (ESM)

– Loggers
– Sign Kafka Consumer Certificate on Event Broker
– Connect to default “eb-cef” topic on Event Broker

– Investigate
– Connect Vertica scheduler to Kafka topic
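As a quick sanity check after adding a producer, a console consumer can be run inside a Kafka pod to confirm events arrive on the topic. A sketch, assuming the kafka-console-consumer tool is on the PATH in the atlas_kafka container and the plaintext listener on 9092 is reachable from inside the pod:

# kubectl exec eb-kafka-0 -- kafka-console-consumer --bootstrap-server localhost:9092 --topic eb-cef --max-messages 5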

36
Verify that Event Broker
Cluster is Healthy

37
Container Dependency Order

After deploying Event Broker, pods are configured to start in the following order. Downstream pods will not
start until the dependencies are met.
– A quorum of zookeeper pods in the cluster must be up (2 of 3, or 3 of 5). Total number of zookeepers must be odd.
– All Kafka pods must be up
– Schema Registry pod must be up
– Bootstrap Web Service, Kafka Manager
– Transformation Stream Processor (C2AV), Routing Stream Processor
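To observe this startup order during a deployment, watch the pods as they come up:

# kubectl get pods -w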

38
Pod Status: A Healthy Cluster
# kubectl get pods -o wide
NAME READY STATUS RESTARTS AGE IP NODE
default-http-backend-yjcwc 1/1 Running 0 17h 172.77.40.6 15.214.137.102
eb-c2av-processor-967417906-vzu2k 1/1 Running 0 16h 172.77.16.5 15.214.137.112
eb-kafka-0 1/1 Running 1 16h 172.77.16.3 15.214.137.112
eb-kafka-1 1/1 Running 1 16h 172.77.59.6 15.214.137.113
eb-kafka-2 1/1 Running 0 16h 172.77.40.11 15.214.137.102
eb-kafka-manager-3416293552-otw4w 1/1 Running 0 16h 172.77.16.4 15.214.137.112
eb-routing-processor-965434368-m5bme 1/1 Running 0 16h 172.77.59.5 15.214.137.113
eb-schemaregistry-2463124937-0r27o 1/1 Running 1 16h 172.77.59.4 15.214.137.113
eb-web-service-3440844888-mdwnb 2/2 Running 0 4h 172.77.40.12 15.214.137.102
eb-zookeeper-0 1/1 Running 0 16h 172.77.59.3 15.214.137.113
eb-zookeeper-1 1/1 Running 0 16h 172.77.16.6 15.214.137.112
eb-zookeeper-2 1/1 Running 0 16h 172.77.40.10 15.214.137.102
nginx-ingress-controller-we1fi 1/1 Running 0 17h 172.77.40.8 15.214.137.102

39
Pod Status: An Unhealthy Cluster
# kubectl get pods
NAME READY STATUS RESTARTS AGE

default-http-backend-esvej 1/1 Running 2 22d


eb-c2av-processor-2916524249-ckx1s 0/1 Init:CrashLoopBackOff 1811 8d
eb-kafka-0 0/1 Init:0/1 1811 10d
eb-kafka-2 1/1 Running 4 22d

eb-kafka-manager-4000886174-agwiq 1/1 Running 101 8d


eb-routing-processor-2058902695-bnd4o 0/1 Init:0/1 1808 8d
eb-schemaregistry-3052097136-ieto1 0/1 Init:0/1 1813 8d
eb-web-service-395903104-z6kn1 1/2 CrashLoopBackOff 2207 22d
eb-zookeeper-0 0/1 Pending 0 8d

eb-zookeeper-1 1/1 Running 0 10d


eb-zookeeper-2 1/1 Running 1 22d
hercules-management-2195617592-2e56a 1/2 CrashLoopBackOff 2105 22d
hercules-rethinkdb-0 1/1 Running 1 22d
hercules-search-1406583890-4vln7 3/3 Running 2108 22d
nginx-ingress-controller-uum11 1/1 Running 2 22d
40
Verify that data flows through the system

– Check the EPS monitoring metrics in ArcMC.


– This tells you whether events are flowing through the Event Broker stream processors (both routing and C2AV transformation)
– Check the offset for each topic in Event Broker Manager (Kafka Manager). You should see the value
increasing over time in all topics.
– The offset value for topics with Binary events (ESM) will be smaller than topics with CEF events. Binary
events are grouped into batches. Each batch is one message in the topic.
– For CEF Events and Investigate check the following :
– CEF topic offset: The offset should increase over time.
– AVRO topic offset: The offset should increase over time and in unison with the CEF topic offset.
– The event count in Vertica table. The row count should increase over time.
– Check the Kafka Scheduler status to see event count and reject count.
– You should be able to see the event count increasing over time.
– The rejected_events count is the number of events that the Kafka scheduler had issues loading.
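Topic offsets can also be read from the command line; a sketch, assuming the Kafka tools are on the PATH inside the atlas_kafka container (--time -1 returns the latest offset per partition):

# kubectl exec eb-kafka-0 -- kafka-run-class kafka.tools.GetOffsetShell --broker-list localhost:9092 --topic eb-cef --time -1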

41
Check Kafka Scheduler on Vertica

# ./install-vertica/kafka_scheduler status

Status of Kafka scheduler: [127.0.0.1:9092] on topic: [eb-internal-avro]


events count | rejected_events count
--------------+-----------------------
60900034 | 0
(1 row)
'investigation_scheduler' scheduler last 5 events:
event_time | log_level | message | exception
-------------------------+-----------+---------+-----------
2017-04-04 22:23:55.269 | INFO | Received configuration details; planned concurrency: 2, max concurrency: 0, max execution parallelism: 40. Setting lane count: 2 |
2017-04-04 22:23:55.25 | INFO | Received configuration details; frame duration: 10000, refresh interval: 300000, eof timeout: 1000, resource pool: kafka_default_pool, new topic policy: FAIR, pushback policy: LINEAR, pushback max count: 5 |
2017-04-04 22:23:53.837 | INFO | Refreshing Scheduler (refresh interval reached). |
2017-04-04 22:21:57.48 | WARN | OVERSHOT DEADLINE FOR FRAME -- remaining time (ms): -82 |
2017-04-04 22:21:44.369 | WARN | OVERSHOT DEADLINE FOR FRAME -- remaining time (ms): -330 |
(5 rows)
'investigation_scheduler' scheduler last 10 microbatch status:
frame_start | source_name | start_offset | end_offset | end_reason | partition_bytes | partition_messages
-------------------------+------------------+--------------+------------+------------+-----------------+--------------------
2017-04-04 22:25:59.004 | eb-internal-avro | 4413347 | 4415580 | DEADLINE | 3070209 | 2233
2017-04-04 22:25:59.004 | eb-internal-avro | 4412982 | 4415215 | DEADLINE | 3070011 | 2233
2017-04-04 22:25:59.004 | eb-internal-avro | 4968808 | 4979966 | DEADLINE | 15350438 | 11158
2017-04-04 22:25:59.004 | eb-internal-avro | 3795956 | 3803395 | DEADLINE | 10232976 | 7439
2017-04-04 22:25:59.004 | eb-internal-avro | 3165366 | 3165366 | DEADLINE | 0 | 0
2017-04-04 22:25:59.004 | eb-internal-avro | 4971202 | 4980127 | DEADLINE | 12278830 | 8925
2017-04-04 22:25:59.004 | eb-internal-avro | 3830876 | 3836081 | DEADLINE | 7160944 | 5205
2017-04-04 22:25:59.004 | eb-internal-avro | 3794693 | 3802134 | DEADLINE | 10233608 | 7441
2017-04-04 22:25:59.004 | eb-internal-avro | 3831300 | 3836508 | DEADLINE | 7163372 | 5208
2017-04-04 22:25:59.004 | eb-internal-avro | 3163672 | 3163672 | DEADLINE | 0 | 0
(10 rows)
Kafka scheduler runs on pid(s): [30419 30423]

42
Verify Web Service APIs are healthy

– Check logs of the web service container


– # kubectl logs [POD ID/NAME]
– Check the Web Service port with netstat to make sure that the port is bound
– # netstat -lntp | grep 38080
– Verify that you can fetch data from the API using admin credentials
curl -u "admin:atlas" -k https://n15-214-137-h170.arst.usa.hp.com:38080/cluster/broker
[ "10.12.98.180:9092", "10.12.98.181:9092" ]

43
Verify the topic partition count and replication count

Why it is important:
– Check that the configured partition count matches what you expect it to be.
– Check the replication count and partition count for the topic using Event Broker Manager (Kafka Manager)
or the kafka-topics command line:
# kubectl exec eb-zookeeper-0 -- kafka-topics --zookeeper localhost:2181
--describe --topic eb-cef
Topic:eb-cef PartitionCount:5 ReplicationFactor:2 Configs:
Topic: eb-cef Partition: 0 Leader: 1002 Replicas: 1002,1003 Isr: 1002,1003
Topic: eb-cef Partition: 1 Leader: 1003 Replicas: 1003,1001 Isr: 1003,1001
Topic: eb-cef Partition: 2 Leader: 1001 Replicas: 1001,1002 Isr: 1001,1002
Topic: eb-cef Partition: 3 Leader: 1002 Replicas: 1002,1001 Isr: 1002,1001
Topic: eb-cef Partition: 4 Leader: 1003 Replicas: 1003,1002 Isr: 1003,1002

44
Software Logs and Data

– ArcSight Installer Logs


– /opt/arcsight/installer/logs
– Kubernetes Logs
– /opt/arcsight/kubernetes/log
– Zookeeper Logs
– /opt/arcsight/k8s-hostpath-volume/eb/zookeeper/log
– View Kubernetes logs for each container
– # kubectl logs [POD ID/NAME]
– # kubectl logs [WEB SERVICE POD ID/NAME] atlas-web-service
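For a container that has restarted, the previous instance's log is often the useful one:

– # kubectl logs --previous [POD ID/NAME]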

Kafka Topic Data


– /opt/arcsight/k8s-hostpath-volume/eb/kafka

45
Licensing

– There is no license check on either Event Broker or Investigate.


– Users must use their Docker Hub login to deploy images.
– Docker Hub privileges for the images are set by the Licensing Fulfillment team.

46
Troubleshooting
Installation

47
Event Broker pods show multiple restarts

This is normal.
Pods will restart as they attempt to synchronize with other pods.
Restarts should cease shortly after all pods in the EB cluster have deployed on all servers.
The number of restarts should be small, less than 10 in most cases.
The number one factor affecting the number of restarts is the speed at which servers can
connect and download container images.

48
Some pods are not starting, with status ErrImagePull

Problem: This indicates that the image cannot be downloaded from DockerHub. This can be confirmed by
running kubectl get pods, and then kubectl describe pod <podname>.
You will see a message similar to the following:
Failed to pull image "hub.docker.io/hercules/search-engine:master" net/http: request
canceled
Solution: The pod must be deleted to re-trigger a download of the image.
– Execute the command: kubectl delete pod <failing-podname>
– This will terminate the failing pod and create a new one with a different name.
– Make sure the image pull is successful by running kubectl get pods to see the status of the
newly recreated pod.
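The pull failure details appear in the pod's event list; for example (pod name illustrative):

# kubectl describe pod eb-c2av-processor-2916524249-ckx1s | grep -A 10 Events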

49
Multiple Kafka Crashes/Restarts
If data is not removed from a machine prior to re-installation, and the Kafka cluster has been reconfigured,
then Kafka brokers may launch with duplicate IDs, causing one of the Kafka nodes to fail to start.
To identify the issue: Look for the following log in one of the Kafka nodes.
2017-04-09 14:56:06,772] FATAL [Kafka Server 1001], Fatal error during KafkaServer startup. Prepare to shutdown
(kafka.server.KafkaServer)

java.lang.RuntimeException: A broker is already registered on the path /brokers/ids/1001.

To verify the issue: Connect to each system that is running a Kafka broker and check the assigned broker.id
value of each. The broker.id value defined on each Kafka node must be unique.
# ssh worker_node_1 cat /opt/arcsight/k8s-hostpath-volume/eb/kafka/meta.properties | grep id

broker.id=1001

# ssh worker_node_2 cat /opt/arcsight/k8s-hostpath-volume/eb/kafka/meta.properties | grep id

broker.id=1001

# ssh worker_node_3 cat /opt/arcsight/k8s-hostpath-volume/eb/kafka/meta.properties | grep id

broker.id=1002

To recover: If you are reinstalling the cluster, delete the existing data directory /opt/arcsight/ as part of
uninstalling the original install. If you are re-labeling or updating an existing cluster, make sure the cluster
labels keep each Kafka node on its original worker node, without conflicts.
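A quick way to compare broker IDs across all worker nodes in one pass (node names illustrative):

# for n in worker_node_1 worker_node_2 worker_node_3; do echo -n "$n: "; ssh $n grep broker.id /opt/arcsight/k8s-hostpath-volume/eb/kafka/meta.properties; done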

50
Troubleshooting
Other

51
Event Broker and Vertica Diagnosis scripts

Diagnostic tools are packaged in the Event Broker “Web Service” container; they extract logs and other cluster information
that can be used to investigate issues.
# find web service container
$ docker ps | grep -i atlas_web-service
c226ee041c48 hub.docker.hpecorp.net/hercules/atlas_web-service:latest

# Copy diagnostic script to host directory


$ docker cp c226ee041c48:/eb/ws/eb_diag/eb_diag.tgz .

# unpack diagnostic script


$ tar -xzvf eb_diag.tgz
eb-diag-beta.sh
eb-diag.sh
vertica-diag.sh

Fix the newlines to Unix format


# mv eb-diag.sh eb-diag.sh.orig
# tr -d '\r' < eb-diag.sh.orig > eb-diag.sh

Run the updated script


# sh eb-diag.sh

52
Cannot Query Zookeeper

Symptom:
When you run the kubectl get pods command to get the status of the pods, downstream
pods (see the pod dependency order) do not stay up, and their status shows a 'CrashLoop'-type error.
Conditions to look for:
Check that zookeeper pods are running.
– If the zookeeper pod status is Pending, you may not have labeled the nodes (zk=yes). Verify that the
nodes are labeled using the kubectl get nodes -L=zk command.
– Verify that you configured an odd number of zookeepers in installer.properties
predeploy.eb.zookeeper.count attribute.
– Check the zookeeper pod logs for errors using kubectl logs <pod name>.
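ZooKeeper health can also be probed directly with its four-letter ruok command; a sketch, assuming nc is available inside the atlas_zookeeper container (a healthy server answers imok):

# kubectl exec eb-zookeeper-0 -- bash -c "echo ruok | nc localhost 2181"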

53
Common Errors/Warnings in Zookeeper Logs

– Quorum Exceptions: Cannot elect a leader. If you see this type of error, check the conditions described in
‘Cannot query zookeeper’
– Socket errors: This can occur if there are too many connections.
– The solution is to restart the pod using kubectl delete pod <pod_name>.
– The pod will be recreated automatically.

54
Communication Errors

– SSL Connection Errors: These warnings occur if there is a connection issue between Kafka and a
consumer or producer. Check the steps that you used to import certificates to both EB and consumers.
– Communication errors between brokers: If you see this type of error, host names may not be configured properly. It
is possible that the node cannot perform reverse lookup or that DNS is not set up properly.

55
A consumer cannot read events from EB

– If this is a new set up of Kafka scheduler, check that Kafka scheduler is configured to communicate to port
39092.
– If this was working at first but stopped, it is possible that the offset value is not recognized:
– In this scenario, the Kafka scheduler fails to recognize the offset IDs of messages in the topic. This can happen if the
Kafka scheduler unexpectedly stops reading from the topic and is then restarted.
– Solution: execute the kafka_scheduler delete command to delete the metadata. After doing this, immediately run the
kafka_scheduler create command to set up the scheduler again (see the commands after this list).
– Other items to check:
– Check the network connection.
– Check whether the Kafka pods are down.
– Check that you configured the consumer to communicate with all nodes running Kafka. If you specified a connection to
only one node in the cluster and that node is down, events will not flow.
– If you are encountering SSL connection errors as well, check the steps that you used to import certificates to both EB
and consumers.
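The offset reset described above looks like this on the Vertica node (script path as shown on the earlier scheduler status slide):

# ./install-vertica/kafka_scheduler delete
# ./install-vertica/kafka_scheduler create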

56
An EB component crashes: web service, stream processors, etc.

– If this happens at start up, check the container dependency order. Have any of the dependency pods not
started or have crashed?
– Check memory: Does the system have enough memory and disk space? It is possible that the component
requires more memory than the system has available.
– Check whether there are too many open sockets (see the checks below).
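Standard OS checks cover these conditions on the node hosting the pod: free -h (memory), df -h (disk space), ss -s (socket summary):

# free -h
# df -h
# ss -s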

57
Pods will not start after a node is shut down for more than 6 hours

The issue is related to a timed-out certificate: if nodes are down for more than 6 hours, the certificates
are not renewed. The workaround:
1. Connect to the master node.
2. Run the update_kubevaulttoken script.
# /opt/arcsight/kubernetes/bin/update_kubevaulttoken

3. Find the kubernetes-vault pod name.


# kubectl get pod -o wide --namespace=core

NAME READY STATUS RESTARTS AGE IP NODE


kubernetes-vault-3865589275-erin5 1/1 CrashLoopBackOff 0 2m 172.77.93.4 15.214.137.51

4. Delete the kubernetes-vault pod. It will be re-created automatically.


# kubectl --namespace=core delete pod kubernetes-vault-3865589275-erin5

5. Check the status of event broker pods. They should restart automatically.
# kubectl get pod -o wide

6. If they do not come up, then undeploy and then redeploy EB using the ArcSight Installer.

58
Event Broker EPS is lower than expected

– Check whether there are resource constraints on brokers: CPU, memory, or a full disk. Check usage at the
system level or with ArcMC.
– Check for a network bottleneck.
– Check whether the Stream Processor is able to keep up with CEF-to-AVRO transformation.
If it cannot, the Stream Processor metric in ArcMC will be lower than the Connector EPS; the Stream Processor may be
constrained in some way, such as limited system resources (see the checks below).
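A quick system-level check on each Kafka worker node, for example:

# top -bn1 | head -15
# df -h /opt/arcsight/k8s-hostpath-volume/eb/kafka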

59
60
Performance
Deployment resource sizing is an important factor in Event Broker performance.

Under-resourced systems may impact event throughput and performance.

The following slides are from the Event Broker Sizing guidelines on iRock:
https://irock.jiveon.com/docs/DOC-141395

61
Performance

– Try to size so that production matches consumption for your SLOWEST Consumer. It's much better to
have an idle Consumer than one that cannot keep up and must constantly keep the Broker reading from
physical disk.
– Throughput is limited by broker network and disk bandwidth. Keep as much as possible in memory, and
consider production AND consumption bandwidth – it can get very big very quickly!
– Brokers can converge hundreds of producers into a single topic – this allows for great SmartConnector scaling
(e.g. WINC/WUC)
– Latency is a key factor – do not attempt to Produce or Consume over a WAN link, such as between data
centers. Consider separate clusters in each data center and use SmartConnectors to perform dual-destination
feeds if required.
– Acknowledgement mode will cause a performance hit – consider this when determining required
throughput. Refer to the Sizing Guide on iRock for detailed examples

https://irock.jiveon.com/docs/DOC-141395

62
Performance
Notes on potential bottlenecks
– Think about production + consumption EPS. You may have 10K EPS inbound, yet Hadoop
AND Logger are both consuming (20K EPS consumption), so you need to size for 30K EPS.
– Hardware is PER NODE for a minimum 3-node cluster.
– This assumes NO ACK and NO TLS.
– Leader ACK incurs a 66% performance impact; FULL ACK is even worse!
– Assumes 1765-byte CEF events. This is obviously a fluid value.
– Keep in mind that compression in KAFKA is performed on the Producer (e.g. the Smart
Connector) using GZIP. KAFKA itself plays no role in compression of data.
– But this becomes far more complicated when dealing with BINARY and AVRO for ESM and Vertica!
– Always recommend 10Gbit network connections INSIDE of the cluster! (See the rough estimate below.)
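A rough bandwidth estimate using the assumed 1765-byte events and the 10K EPS example above (production plus two consumers = 30K EPS total):

inbound: 1765 bytes x 10,000 EPS ≈ 17.7 MB/s ≈ 141 Mbit/s
total: 1765 bytes x 30,000 EPS ≈ 53 MB/s ≈ 424 Mbit/s

Uncompressed AVRO for Vertica adds to this, which is part of why 10Gbit links inside the cluster are recommended.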

https://irock.jiveon.com/docs/DOC-141395

63
Back Up

64
Pre-requisites – Vertica Systems

Step | Primary Nodes | Secondary Nodes
Set up systems with no LVM partitioning and with ext3 or ext4 disk format | Yes | Yes
Set hostname of systems | Yes | Yes
Disable firewalld during installation (OK to enable later) | Yes | Yes
Install and configure yum | Yes | Yes
Configure system to use network proxy (if required by your network policy) | Yes | Yes
Generate key-pair; configure ssh from primary to secondary | Yes | No

More detail in the Vertica documentation:
https://my.vertica.com/docs/8.1.x/HTML/index.htm#Authoring/InstallationGuide/BeforeYouInstall/BeforeYouInstallVertica.htm

65
More Information

• ArcSight Data Platform on iRock


– https://irock.jiveon.com/groups/arcsight-data-platform-adp

• Event Broker Sizing Guide


– https://irock.jiveon.com/docs/DOC-137944
• Event Broker FAQ
– https://irock.jiveon.com/message/218634

• Event Broker 2.0 documentation on Protect724


– https://www.protect724.hpe.com/community/arcsight/productdocs/event-broker/content?filterID=contentstatus%5Bpublished%5D~category%5Bevent-broker-20%5D

• Docker Self paced training


– https://training.docker.com/category/self-paced-online

• Kubernetes Self-paced tutorials


– https://kubernetes.io/docs/tutorials/kubernetes-basics/

66
Thank you
