
IBM QRADAR TROUBLESHOOTING GUIDE
Security Operations Center (SOC)

IBM Security | QRadar

Contents
1 Basics
1.1 QRadar Directory Structure
1.2 Basic Commands
2 Resilient
2.1 Backup
2.2 Health Checkup
2.3 Resilient Button goes missing from QRadar UI
3 Basic Troubleshooting
3.1 Clearing Browser Cache
3.2 Enabling and Disabling Debug Logs
3.3 Collecting Logs via CLI and GUI
3.3.1 Using GUI
3.3.2 Using CLI
3.4 Restarting of services
3.5 Basic Troubleshooting
4 QRadar Agent (Wincollect)
4.1 Reinstalling Wincollect without Rebooting
4.2 Verification of Agents on Console and Client
5 Rules and Offenses
5.1 Offenses Overview
5.2 Repopulating Offenses on Console
5.3 AQL for Offenses
6 Backups
6.1 Backups Not Generated
7 High Availability
7.1 Commands
7.2 HA host in Failed state due to hidden token
8 RegEx, DSM and Parsing
8.1 Some Common Regular Expressions
8.2 Installing RPMs
8.3 Checking the Expensive DSMs
9 User Interface (UI) and Applications/Extensions
9.1 Shrinking of User Interface to the left
9.2 Applications/Extensions Troubleshooting
10 Log Sources
10.1 Linux Log Sources with same hostname
10.2 Special Character in Linux Log Source Identifier
10.3 Troubleshooting JDBC Log Sources Issues
10.4 Cisco FMC Not Forwarding Logs
10.5 Checking Faulty Log Sources
11 Performance Related Issues
11.1 Console/Magistrate
11.1.1 Services Verification
11.1.2 Time Synchronization Failed on Console
11.1.3 Deployment Timeout Error
11.2 Event Processor and Event Collector
11.2.1 EPS Measuring and License Throttling
11.2.2 Timeout Error on Event Processor
11.2.3 Persistent Queue Issue (Due to Corrupted ECS-EC Service)
11.2.4 Backlog in persistent queue
11.2.5 Accumulator has fallen behind
11.3 Other Performance Related Commands
12 Notes and APARs
12.1 Short Notes
12.2 Tech Notes
12.2.1 Storage
12.2.2 Log Sources
12.2.3 EPS IBM Tech notes
12.2.4 Retention Buckets
12.2.5 APARs


1 Basics
1.1 QRadar Directory Structure
+-- cv = contains accumulated data
+-- events = Events top-level directory
¦ +-- md = created when encryption is enabled and contains hash values.
¦ +-- payloads = contains event payloads
¦ ¦ +--<YEAR-xxx1>
¦ ¦ ¦ +--<MONTH-1>
¦ ¦ ¦ ¦ +--<DAY-1>
¦ ¦ ¦ ¦ ¦ +--<HOUR-1>
¦ ¦ ¦ ¦ ¦ +--<HOUR-2>
¦ ¦ ¦ ¦ ¦ +-- .
¦ ¦ ¦ ¦ ¦ +--<HOUR-24>
¦ ¦ ¦ ¦ +--<DAY-2>
¦ ¦ ¦ ¦ +-- .
¦ ¦ ¦ ¦ +--<DAY-31>
¦ ¦ ¦ +--<MONTH-2>
¦ ¦ ¦ +-- .
¦ ¦ ¦ +--<MONTH-12>
¦ ¦ +-- .
¦ ¦ +--<YEAR-xxxN>
¦ +-- records = contains event records
¦ ¦ +--<YEAR-xxx1>
¦ ¦ ¦ +--<MONTH-1>
¦ ¦ ¦ ¦ +--<DAY-1>
¦ ¦ ¦ ¦ ¦ +--<HOUR-1>
¦ ¦ +-- .
¦ ¦ +-- .
¦ ¦ +--<YEAR-xxxN>
¦ +-- uncompressedCache = pointers to compressed files
+-- flows = flows top-level directory
¦ +-- payloads = contains flow payloads
¦ ¦ +--<YEAR-xxx1>
¦ ¦ ¦ +--<MONTH-1>
¦ ¦ ¦ ¦ +--<DAY-1>
¦ ¦ ¦ ¦ ¦ +--<HOUR-1>
¦ ¦ +-- .
¦ ¦ +-- .
¦ ¦ +--<YEAR-xxxN>
¦ +-- records = contains flow records
¦ ¦ +--<YEAR-xxx1>
¦ ¦ ¦ +--<MONTH-1>
¦ ¦ ¦ ¦ +--<DAY-1>
¦ ¦ ¦ ¦ ¦ +--<HOUR-1>
¦ ¦ +-- .
¦ ¦ +-- .
¦ ¦ +--<YEAR-xxxN>
¦ +-- uncompressedCache = pointers to compressed files
+-- gv = global views top-level directory
¦ +-- definitions = global view definitions
¦ +-- records = global view records
+-- hprof = host profiles top-level directory
¦ +-- uncompressedCache = cursors for searches
+-- persistent_data = pointer to compressed files
¦ +-- ariel.ariel_proxy_server = saved search results and searches done in the last 24 hours
+-- simarc = QRadar Risk Manager connection data
+-- simevent = QRadar Risk Manager event data
+-- statistics = statistics

Description                            Command
To check the tenant info on console    psql -U qradar -c "select * from tenant;"
Normalized Events                      /store/ariel/events/records/aux/tenant-ID/year/month
Raw Events                             /store/ariel/events/payloads/aux/tenant-ID/year/month

Reference:
https://www-01.ibm.com/support/docview.wss?uid=swg22010279
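
The tenant paths listed above follow one pattern, which can be sketched as a small helper. This is only an illustration: the function name `tenant_dir` is hypothetical, and the exact formatting of the tenant ID, year and month components on disk (for example zero padding) is an assumption that should be checked against a real /store/ariel tree.

```python
from pathlib import PurePosixPath

# Root taken from the paths in the table above.
ARIEL_ROOT = PurePosixPath("/store/ariel")


def tenant_dir(database, kind, tenant_id, year, month):
    """Build <root>/<database>/<kind>/aux/<tenant>/<year>/<month>.

    database: "events" or "flows"
    kind:     "records" (normalized) or "payloads" (raw)
    The components are used as given; no padding is applied (assumption).
    """
    return ARIEL_ROOT / database / kind / "aux" / str(tenant_id) / str(year) / str(month)


# Hypothetical tenant 3, May 2020.
print(tenant_dir("events", "records", 3, 2020, 5))
print(tenant_dir("events", "payloads", 3, 2020, 5))
```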

1.2 Basic Commands


Description Used to capture packets received from a host
Command tcpdump -s 0 -A host {IPv4-address | hostname} and port port-number
tcpdump -s 0 -A host {IPv4-address | hostname} and udp port port-number
Example tcpdump -s 0 -A host 192.168.10.36 and udp port 514
Description Displays active connections and open sockets
Command netstat {-a | -r | -i | -g}
-a: all, -r: routing information, -i: interface, -g: multicast group memberships
Example netstat -r
Description Used to query internet domain servers
Command nslookup domain server-to-be-queried
Example nslookup cisco.com 8.8.8.8
Description Domain Information Groper
Command dig domain-to-be-queried
Example dig geekflare.com
Description Used to enable/disable an interface
Command ifup interface-id
ifdown interface-id
Example ifup eth0
ifdown eth0
Description Adds/Deletes a route
Command route {add|del} -net destination-NW/Subnet-Mask gw gateway-ip-address
Example route add -net 10.10.10.0/24 gw 192.168.0.1
Description Adds a default route
Command route add default gw gateway-ip-address
Example route add default gw 192.168.0.1

Description Finds name to IP or IP to name in IPv4/IPv6 and also queries DNS records
Command host name
Example host www.google.com
Description Displays the ARP table
Command arp -e
Description Displays speed/duplex of NIC.
To set speed/duplex settings -> /etc/sysconfig/network-scripts/ifcfg-eth0
Command ethtool interface-id
Example ethtool eth0
Description Used to look up registration (whois) information for an IP address
Command whois ip-to-be-resolved
Description Used to access the node on port 22
Command ssh username@destination-ip-address
Example ssh root@10.254.158.203
Description Allows you to copy a file from one host to another in a network
Command scp $filename user@destination:/$path
Storage df -h
CPU cat /proc/cpuinfo
lscpu
top
iotop
Memory free -h
IP tables iptables -L -n -v
Information about physical device dmidecode | grep -i product
TCP Syslog tcpdump -s 0 -A host Device_Address and port 514
UDP Syslog tcpdump -s 0 -A host Device_Address and udp port 514
tcpdump -nnAs0 -i any host <IP address of end device> and port 514

2 Resilient
2.1 Backup
i. To back up the platform, you must ssh to the virtual appliance and run this
command: sudo resSystemBackup
This creates a backup in the /crypt/backups/ folder in the form of a gz file; for
example, resilient-backup-20170426201138.tar.gz. The time stamp is appended to
the file name for uniqueness. You can rename this file for clarity, and move it to a
secure location.
ii. The backup file remembers the KeyVault password scheme (cleartext or gpg
encrypted as described in KeyVaults). When running a restore on that file, it
restores that scheme.
iii. You can encrypt the backup by using the --encrypt option as follows:
sudo resSystemBackup --encrypt
It is recommended that you store the backup and its corresponding
backup_passphrase file in a secure location for future use.
Use the --help option to view all the options on the resSystemBackup and
resSystemRestore commands.
iv. To restore a backup, use the resSystemRestore command and the name of the
backup file. For example:
sudo resSystemRestore -f /crypt/backups/resilient-backup-20170426201138.tar.gz

2.2 Health Checkup


Once upgraded, you should be able to access the web app UI straight away. If not, check the
status of the services via
systemctl status postgresql-9.6.service
systemctl status elasticsearch.service
systemctl status resilient-scripting.service
systemctl status resilient.service

2.3 Resilient Button goes missing from QRadar UI


i. Go to the API documentation (https://Console-IP-Address/api_doc) >> Expand the latest
version >> gui_app_framework >> application >> application_id
ii. Select the POST section.
iii. Set the application id to 1302 and the status to STOPPED. Click 'Try it out'.
iv. After that is done, change the status to RUNNING and click 'Try it out' again.


3 Basic Troubleshooting
3.1 Clearing Browser Cache
Perform the following steps in order:
service hostcontext stop
service tomcat stop
service hostservices stop
rm -rf /opt/tomcat/Catalina/work/localhost/*
rm -rf /opt/tomcat-85/Catalina/work/localhost/*
service hostservices start
service tomcat start
service hostcontext start

3.2 Enabling and Disabling Debug Logs


i. Log into console CLI and run the following command:
touch /var/log/qradar.java.debug
ii. Run the command /opt/qradar/support/mod_log4j.pl
iii. Enter your name
iv. At the main menu select 0) Toggle Debugging
v. Select 'A) Add a new logger'
vi. Enter classpath - com.q1labs.semsources.sources.jdbc.JdbcEventConnector335
vii. q to return to main menu
viii. Finally CQ) for Commit changes and Quit
ix. Replicate issue/perform troubleshooting
x. Log into console CLI to disable debugging
xi. Run the following command /opt/qradar/support/mod_log4j.pl
xii. At main menu select 3) Advanced Menu
xiii. Select 4) Restore defaults and type 'y' to agree restore logging
xiv. Finally CQ) for Commit changes and Quit
xv. After disabling debugging, go to the web UI -> Admin Tab -> System and License
Management -> Actions -> Collect Log Data, expand 'Advanced Options' and select
'Include Debug Logs' so that the debug output above is included in the tar file.

3.3 Collecting Logs via CLI and GUI


3.3.1 Using GUI
Admin > System & License Mgmt> Actions > Collect Log Files.
3.3.2 Using CLI
Setup files – /opt/qradar/support/get_logs.sh -s
Application Logs – /opt/qradar/support/get_logs.sh -a
Debug Logs – /opt/qradar/support/get_logs.sh -d


3.4 Restarting of services


i. Stop the following services in orderly fashion.
service tomcat stop
service hostcontext stop
service hostservices stop
ii. Restart the services in orderly fashion.
service hostservices start
service hostcontext start
service tomcat start
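
The start sequence above is the stop sequence reversed. A minimal sketch that generates both lists from a single ordering is shown below; the helper name `restart_commands` is hypothetical and the script only prints the commands, it does not run them.

```python
# Stop order as listed in this section; the start order is its reverse.
STOP_ORDER = ["tomcat", "hostcontext", "hostservices"]


def restart_commands(stop_order):
    """Return the stop commands followed by the start commands in reverse order."""
    stops = [f"service {name} stop" for name in stop_order]
    starts = [f"service {name} start" for name in reversed(stop_order)]
    return stops + starts


for command in restart_commands(STOP_ORDER):
    print(command)
```

Keeping one list avoids the two sequences drifting apart when a service is added.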

3.5 Basic Troubleshooting


Description Checking if all services are up and running
Command /opt/qradar/upgrade/util/setup/upgrades/wait_for_start.sh
Description Checking the versions
Command /opt/qradar/bin/myver -v
/opt/qradar/bin/myver -a
Description Deploying from CLI
Command /opt/qradar/upgrade/util/setup/upgrades/do_deploy.pl
Description Validating Deployment
Command /opt/qradar/support/validate_deployment.sh

4 QRadar Agent (Wincollect)


4.1 Reinstalling Wincollect without Rebooting
i. Stop the wincollect services
ii. Delete the IBM folder. C:/Program Files/IBM
iii. Uninstall Wincollect from control panel.
iv. Now, install the wincollect.

4.2 Verification of Agents on Console and Client.


Description Verify that WinCollect is installed on QRadar
Command psql -U qradar -c "select * from ale_component_type;"
Description Verify the versions of the WinCollect agents installed on the Windows clients
Command psql -U qradar -c "select * from ale_client;"


5 Rules and Offenses


5.1 Offenses Overview
Offenses in QRadar can be retained indefinitely as long as they are not closed or inactive. After
the initial offense rule has fired, the offense is marked as active in QRadar. In this state, the
offense is waiting for new events or flows to hit the Offense Rule test. QRadar checks every 10
minutes to see whether new events have been added to the offense. If new events have been
detected, the offense clock is reset to keep the offense active for another 30 minutes. QRadar
marks an offense as dormant if no new events or flows occur within 30 minutes. An offense is
also marked as dormant if QRadar has not processed any events after 4 hours.

The QRadar dormancy period lasts 5 days. After these 5 days, an offense is marked as inactive.
New events triggering the Offense rule test will not contribute to the inactive offense. The
Offense Model checks each day within these 5 days to determine which offenses are still
dormant and which are inactive. If an event is received during the dormant time, the dormant
time is reset back to zero, and another 5 days with no events or flows triggering the rule test
must pass before the offense becomes inactive.
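
The timing rules above can be summarized as a small classification sketch. It is a simplification under stated assumptions: it only looks at the time since the last contributing event or flow, and it ignores the 10-minute polling interval, the 4-hour rule and manual closure. The function name `offense_state` is hypothetical.

```python
from datetime import timedelta

ACTIVE_WINDOW = timedelta(minutes=30)  # offense stays active this long after an event
DORMANT_WINDOW = timedelta(days=5)     # dormancy period before the offense goes inactive


def offense_state(idle):
    """Classify an offense by how long it has gone without a new event or flow."""
    if idle < ACTIVE_WINDOW:
        return "active"
    if idle < ACTIVE_WINDOW + DORMANT_WINDOW:
        return "dormant"
    return "inactive"


for idle in (timedelta(minutes=10), timedelta(hours=2), timedelta(days=6)):
    print(idle, offense_state(idle))
```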

Note: By default, the system allows 2,500 open (active) offenses and 100,000 inactive
offenses. If these values are reached, a System Notification is generated to alert the
administrator that they might need to review offenses that can be closed or tune rules to
reduce the overall number of offenses that are being generated in QRadar. By default, the
system begins to remove 0.05 percent of all inactive offenses every 2 hours.
When an offense is closed, either manually or by the magistrate, it becomes inactive and the
Offense Retention Period setting is applied. The Offense Retention Period determines how
long inactive offenses are kept before being purged from the Console.
For better management:
The administrator can manage offenses from Admin tab > Advanced > Clean SIM Model. The
options include:
Soft Clean - this option closes all offenses, but does not remove them from QRadar.
Hard Clean - this option closes and removes all offenses from the system. Do not Hard Clean
your SIM Model unless advised by QRadar Support.

5.2 Repopulating Offenses on Console


Description View the Active Offenses
Command psql -Uqradar -c "select active_code, count (id) as offenses_number from offense_view group by active_code;"
Description View Active and Dormant Offenses
Command psql -Uqradar -c "select id from offense_view where active_code = 3 and dismissed_code = 2;" > offenses


5.3 AQL for Offenses


Description Active Offenses in Last 24 Hours
AQL select ("SUM_Active Offense Count" / 2) from GLOBALVIEW('Offenses Over Time','NORMAL') order by "Time" desc last 24 HOURS
Description Dormant Offenses in Last 24 Hours
AQL select ("SUM_Dormant Offense Count" / 2) from GLOBALVIEW('Offenses Over Time','NORMAL') order by "Time" desc last 24 HOURS
Description Active and Dormant Offenses in Last 24 Hours
AQL select ("SUM_Active Offense Count" / 2) as 'Active Offense Sum', ("SUM_Dormant Offense Count" / 2) as 'Dormant Offense Sum', "Time" * 1000 as 'sTime' from GLOBALVIEW('Offenses Over Time','NORMAL') order by "Time" desc last 24 HOURS
Description EPS by Log Source
AQL select logsourcename(logsourceid) as LogSource, sum(eventcount) / (24*60*60) as EPS from events group by logsourceid order by EPS desc last 24 hours
Description EPS by Domain
AQL select DOMAINNAME(domainid) as LogSource, sum(eventcount) / (24*60*60) as EPS from events group by domainid order by EPS desc last 24 hours
Description Offenses created and closed in last 2 days
AQL SELECT DATEFORMAT(starttime, 'YYYY-MM-dd') as Date, QIDNAME(qid) as 'Event Name', LONG(COUNT(*)) FROM events WHERE qid = 28250021 or QID=28250369 GROUP BY qid, Date Order by Date Desc LAST 2 Days
Description Offenses closed by user in last 2 days
AQL SELECT DATEFORMAT(starttime, 'YYYY-MM-dd') as Date, QIDNAME(qid) as 'Event Name',username, LONG(COUNT(*)) FROM events WHERE qid = 28250021 GROUP BY Date, username Order by Date Desc LAST 2 Days
Description Offenses assigned to user in last 2 days
AQL SELECT DATEFORMAT(starttime, 'YYYY-MM-dd') as Date, QIDNAME(qid) as 'Event Name',username, LONG(COUNT(*)) FROM events WHERE qid = 28250180 GROUP BY Date, username Order by Date Desc LAST 2 Days
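
The two per-log-source and per-domain queries divide the 24-hour event count by the number of seconds in a day to get events per second. The divisor needs parentheses: division and multiplication share precedence and are evaluated left to right, so an unparenthesized `/ 24*60*60` divides by 24 and then multiplies by 3600. The sketch below shows the difference in Python; it assumes AQL follows the same left-to-right rule as standard SQL, and the event count is a made-up figure.

```python
def eps_unparenthesized(total_events):
    # Evaluated as ((total_events / 24) * 60) * 60, which is not a rate per second.
    return total_events / 24 * 60 * 60


def eps(total_events):
    # Events in 24 hours divided by the 86,400 seconds in that window.
    return total_events / (24 * 60 * 60)


total = 864_000  # hypothetical 24-hour event count for one log source
print(eps(total))                  # 10.0
print(eps_unparenthesized(total))  # 129600000.0
```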

6 Backups
6.1 Backups Not Generated
Pre Checks:
i. Last Generated Backup
ii. Check Old qradar.log file and see if backups are completed.
iii. Check Old qradar.error file and see if there are any backup errors.
iv. Check space and /store size.
v. Check all services


Example:
Found this error in the qradar.error file:
May 28 07:50:10 ::ffff:10.254.158.209 [hostcontext.hostcontext] [Scheduled Backup] com.q1labs.hostcontext.backup.core.BackupUtils: [ERROR] [NOT:0000003000][10.254.158.209/- -] [-/- -]The Apache certificates on the managed host do not match the certificates on the Console. Tomcat connection test failed.
Executed the test_tomcat_connection script, which fixed the connection. The script now shows
connected, and the same is visible in qradar.log:
May 28 10:30:06 ::ffff:10.254.158.209 [test_tomcat_connection] [main] com.q1labs.hostcontext.backup.core.BackupUtils: [INFO] [NOT:0000006000][10.254.158.209/- -] [-/- -]Connected to tomcat
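
When scanning qradar.log or qradar.error for backup problems, it can help to split each line into its fields first. The sketch below is an assumption-laden illustration: the regular expression was written only against the two sample lines in this section (each joined onto a single line), and the field names are hypothetical labels, not official QRadar terminology.

```python
import re

# Layout inferred from the two sample lines above; other log lines may differ.
LOG_LINE = re.compile(
    r"^(?P<timestamp>\w{3}\s+\d{1,2} \d{2}:\d{2}:\d{2}) "
    r"(?P<host>\S+) "
    r"\[(?P<component>[^\]]+)\] "
    r"\[(?P<thread>[^\]]+)\] "
    r"(?P<logger>[\w.$]+): "
    r"\[(?P<level>[A-Z]+)\] "
    r"(?P<message>.*)$"
)

ERROR_LINE = (
    "May 28 07:50:10 ::ffff:10.254.158.209 [hostcontext.hostcontext] [Scheduled Backup] "
    "com.q1labs.hostcontext.backup.core.BackupUtils: [ERROR] "
    "[NOT:0000003000][10.254.158.209/- -] [-/- -]The Apache certificates on the managed "
    "host do not match the certificates on the Console. Tomcat connection test failed."
)
INFO_LINE = (
    "May 28 10:30:06 ::ffff:10.254.158.209 [test_tomcat_connection] [main] "
    "com.q1labs.hostcontext.backup.core.BackupUtils: [INFO] "
    "[NOT:0000006000][10.254.158.209/- -] [-/- -]Connected to tomcat"
)


def parse_line(line):
    """Return the fields of one log line as a dict, or None if it does not match."""
    match = LOG_LINE.match(line)
    return match.groupdict() if match else None


def backup_errors(lines):
    """Keep only ERROR entries whose logger name mentions 'backup'."""
    parsed = (parse_line(line) for line in lines)
    return [p for p in parsed
            if p and p["level"] == "ERROR" and "backup" in p["logger"].lower()]


for entry in backup_errors([ERROR_LINE, INFO_LINE]):
    print(entry["timestamp"], entry["thread"], entry["message"])
```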

7 High Availability
7.1 Commands
Description                                Command
Current state of the node                  /opt/qradar/ha/bin/ha cstate
To make the current node primary           /opt/qradar/ha/bin/ha takeover
Validates the deployment configurations    /opt/qradar/support/validate_deployment.sh
Verifying the DRBD Services                cat /proc/drbd
HA diagnostics                             /opt/qradar/support/ha_diagnosis.sh
HA help                                    /opt/qradar/ha2/bin/ha help
Giveback the assigned role                 /opt/qradar/ha2/bin/ha giveback

7.2 HA host in Failed state due to hidden token


[root@ISB-Console-PRI-secondary ~]# /opt/qradar/ha/bin/ha cstate
Local: R:SECONDARY S:ACTIVE/ONLINE CS:NONE P:1.0 HBC:DOWN RTT:1 I:0 SI:15266417
Remote: R:PRIMARY S:UNKNOWN/INIT CS:NONE P:1.0 HBC:DOWN RTT:-1 I:36798 SI:0
Solution:
i. Remove the hidden token file
ii. Restart the ha manager services (systemctl restart ha_manager)
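
The cstate output above is a list of KEY:VALUE tokens per node, which makes it easy to split for scripted checks. The sketch below only parses the two sample lines from this section; it does not interpret the fields beyond comparing strings, and the helper name `parse_cstate_line` is hypothetical.

```python
def parse_cstate_line(line):
    """Split 'Local: R:SECONDARY S:ACTIVE/ONLINE ...' into (side, fields)."""
    side, _, rest = line.partition(":")
    fields = {}
    for token in rest.split():
        key, _, value = token.partition(":")
        fields[key] = value
    return side.strip(), fields


SAMPLE = [
    "Local: R:SECONDARY S:ACTIVE/ONLINE CS:NONE P:1.0 HBC:DOWN RTT:1 I:0 SI:15266417",
    "Remote: R:PRIMARY S:UNKNOWN/INIT CS:NONE P:1.0 HBC:DOWN RTT:-1 I:36798 SI:0",
]

for line in SAMPLE:
    side, fields = parse_cstate_line(line)
    # Flag the pattern seen in this case: HBC reported DOWN or an UNKNOWN state.
    suspect = fields["HBC"] == "DOWN" or fields["S"].startswith("UNKNOWN")
    print(side, fields["R"], fields["S"], "suspect" if suspect else "ok")
```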


8 RegEx , DSM and Parsing


8.1 Some Common Regular Expressions
Field            RegEx
Log Source Time  Regex: \>(\w{1,3}\s\d{1,3}\s\d{1,4})\s+([\d:]+)
                 Capture group: $1 $2
                 Date format: MMM dd yyyy HH:mm:ss
MAC Address      ^([0-9a-zA-Z]{2}[:-]){5}([0-9a-zA-Z]{2})$
                 ([0-9a-zA-Z]{2}[:-]){5}
                 ([0-9a-fA-F][0-9a-fA-F]:){5}([0-9a-fA-F][0-9a-fA-F])
IP Address       \d{1,3}\.\d{1,3}\.\d{1,3}\.\d{1,3}
                 (\d(.+?).\d(.+?).\d(.+?).)
                 (\d+.\d+.\d+.\d+)
                 (\b\d{1,3}\.\d{1,3}\.\d{1,3}\.\d{1,3}\b)
DOMAIN           ([a-zA-Z0-9]([a-zA-Z0-9\-]{0,61}[a-zA-Z0-9])?\.)+[a-zA-Z]{2,6}
                 (\w+://\w{3}.\w+.\w{3})
                 (\w+://\w{3}.\w+.\w+.\w{3})
                 (http[s]?://(.+?)["/?:])
DATE             (19|20)\d\d([- /.])(0[1-9]|1[012])\2(0[1-9]|[12][0-9]|3[01])
                 (\d{4}/\d{2}/\d{2})
TIME             (\d{2}:\d{2}:\d{2})
EMAIL            ^([a-z0-9_\.-]+)@([\da-z\.-]+)\.([a-z\.]{2,6})$
                 (\w+.\w+@\w+.\w{1,3})
                 (.+@[^\.].*\.[a-z]{2,}$)
PORT             src_port=\d{1,5}
FLOATINGPOINT    Floating Point Number: ([-+]?\d*\.?\d*$)
INTEGER          Integer: ([-+]?\d*$)
URL              (http\://[a-zA-Z0-9\-\.]+\.[a-zA-Z]{2,3}(/\S*)?$)
For example, to match a log that resembles SEVERITY=43, construct the following Regular
Expression: SEVERITY=([-+]?\d*$)
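
A few of the patterns in the table can be tried out before pasting them into a DSM or custom property. The sketch below uses Python's `re` module, which is close to, but not identical to, the Java regex engine QRadar uses, so treat the results as a sanity check only. `MAC_HEX` is a tightened variant added here for comparison, and the sample payloads are made up. Note that a quantifier such as `{1,5}` limits the number of digits, not the numeric value, so the port range still has to be checked after matching.

```python
import re

IPV4 = re.compile(r"\b\d{1,3}\.\d{1,3}\.\d{1,3}\.\d{1,3}\b")
MAC_LOOSE = re.compile(r"^([0-9a-zA-Z]{2}[:-]){5}([0-9a-zA-Z]{2})$")  # from the table
MAC_HEX = re.compile(r"^([0-9a-fA-F]{2}[:-]){5}([0-9a-fA-F]{2})$")    # hex digits only
SEVERITY = re.compile(r"SEVERITY=([-+]?\d*$)")
SRC_PORT = re.compile(r"src_port=(\d{1,5})")


def extract_port(payload):
    """Return the source port as an int, or None if absent or above 65535."""
    match = SRC_PORT.search(payload)
    if match is None:
        return None
    port = int(match.group(1))
    return port if port <= 65535 else None


print(IPV4.findall("src=192.168.10.36 dst=10.0.0.1"))
print(bool(MAC_LOOSE.match("ZZ:ZZ:ZZ:ZZ:ZZ:ZZ")), bool(MAC_HEX.match("ZZ:ZZ:ZZ:ZZ:ZZ:ZZ")))
print(SEVERITY.search("SEVERITY=43").group(1))
print(extract_port("src_port=514 dst_port=80"), extract_port("src_port=99999"))
```

The loose MAC pattern accepts any letters, so a non-hex string such as ZZ:ZZ:ZZ:ZZ:ZZ:ZZ passes; the hex-only variant rejects it.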

8.2 Installing RPMs


i. Move RPM file to tmp or other directory
#rpm -Uvh <filename>
ii. check os-release
#cat /etc/os-release
Example:
/opt/qradar/support/all_servers.sh -C -k "rpm -qa | grep -i Office365"
/opt/qradar/support/all_servers.sh -C -k " /opt/qradar/bin/myver "


8.3 Checking the Expensive DSMs


watch -d "/opt/qradar/support/jmx.sh -p 7777 -b 'com.q1labs.sem:application=ecs-ec.ecs-ec,type=filters,name=DSM'"
watch -d "/opt/qradar/support/jmx.sh -p 7799 -b 'com.q1labs.sem:application=ecs-ep.ecs-ep,type=filters,name=CRE'"

9 User Interface(UI) and Applications/Extensions


9.1 Shrinking of User Interface to the left
i. Stop the tomcat service
ii. Change the ownership of the /opt/tomcat/webapps/ folder to nobody:nobody using
chown
iii. Start the tomcat service

9.2 Applications/Extensions Troubleshooting


Description                   Command
Docker                        docker ps
                              docker images
                              systemctl status docker
Checking Application Status   psql -U qradar -c "select id,name,status,memory from installed_application"
                              psql -U qradar -c "select SUM(memory) from installed_application"
                              /opt/qradar/support/recon ps
Upgrading Application         /opt/qradar/bin/upgrade_applications.py

10 Log Sources
10.1 Linux Log Sources with same hostname
Because the hostname is read from the syslog payload, QRadar parses the logs under a single
log source as soon as it finds that hostname in the payload, even when several hosts share it.
You also cannot add a second log source with the same hostname and protocol: QRadar returns
an error stating that a log source already exists with the same hostname and protocol.

10.2 Special Character in Linux Log Source Identifier


I reviewed the issue and found that the hostname in the payload contains a special character
(slot1/KHI-PKCP-TMS-F5).
Auto-discovery is not able to capture the hostname from the payload because of the special
character "/". We changed the log source identifier to slot1/KHI-PKCP-TMS-F5 but did not see
any events going under the log source. To confirm whether QRadar allows "/" in the log source
identifier, I re-created the issue in my test lab and found the same behavior.
The log source identifier slot1/KHI-PKCP-TMS-F5 does not work. When I change the log source
identifier to "slot1", we are able to see the events going under the log source.

10.3 Troubleshooting JDBC Log Sources Issues


i. First we checked the JDBC rpm on the QRadar console and it was at the latest version:
rpm -qa | grep -i jdbc
7.3.0-QRADAR-PROTOCOL-JDBC-7.3-20181108140614.noarch.rpm
ii. We checked the parsing order and there was no issue with it.
iii. We searched qradar.log and qradar.error for errors using that IP and did not find
anything.
iv. We also tried moving the properties file. Initially it was 0 bytes. Even after moving
the file and disabling/enabling the log source, the file was re-created with 0 bytes.
v. We searched qradar.error and qradar.log for the jdbc keyword and found this error:
May 16 11:18:37 ::ffff:10.254.158.201 [ecs-ec-ingress.ecs-ec-ingress] [Thread-623284]
java.lang.NoSuchMethodError:
oracle/i18n/text/converter/CharacterConverterOGS.getInstance(I)Loracle/i18n/text/converter/CharacterConverter;
(loaded from file:/opt/ibm/si/services/ecs-ec-ingress/0.0.814/bin/orai18n-10.2.0.jar by
sun.misc.Launcher$AppClassLoader@898d0352) called from class
oracle.sql.converter.CharacterConverterFactoryOGS (loaded from
file:/opt/ibm/si/services/ecs-ec-ingress/eventgnosis/lib/q1labs/ojdbc7.jar by
sun.misc.Launcher$AppClassLoader@898d0352).
vi. We found that orai18n-10.2.0.jar (the bad file) was present in the following locations:
/opt/ibm/si/services/ecs-ec/0.0.814/bin/orai18n-10.2.0.jar
/opt/ibm/si/services/ecs-ec-ingress/0.0.814/bin/orai18n-10.2.0.jar
vii. We renamed these files with an .old suffix:
/opt/ibm/si/services/ecs-ec/0.0.814/bin/orai18n-10.2.0.jar.old
/opt/ibm/si/services/ecs-ec-ingress/0.0.814/bin/orai18n-10.2.0.jar.old
viii. We copied the latest orai file from /opt/qradar/jars/orai18n.jar to the locations below,
then restarted the service:
/opt/ibm/si/services/ecs-ec/0.0.814/bin/
/opt/ibm/si/services/ecs-ec-ingress/0.0.814/bin/

10.4 Cisco FMC Not Forwarding Logs


i. Checked the protocol test; everything looks fine and there are no errors.
ii. Then I checked the protocol versions of common and eStreamer.
The eStreamer protocol was not up to date.
iii. I provided you with a link to download the protocol and we installed the
latest protocol on the console.
iv. After that we performed a full deploy from the Admin tab.
v. You asked me a couple of things related to license and EPS; for those I asked
you to raise a new case, as a separate team looks into such issues and into
issues related to AQL queries.
vi. After some time the last event time of the log source was updated and events
started flowing under the log source.
vii. Then I checked the Log Activity tab for confirmation and the events are flowing.
viii. You also raised a query related to stored events coming under the log
source, so I checked the DSM versions of Cisco Firepower Management Center and
common and found that both are up to date. I then suggested that you raise another
case, or allow me to raise a new case, for this parsing issue, but you
told me that you will discuss this within your team and let me know.

10.5 Checking Faulty Log Sources


select logsourcename(logSourceId) as 'Log Source',
DATEFORMAT("startTime",'YYYY-MM-dd HH:mm:ss') as 'Start Time',
"endTime" - "startTime" as 'StorageDelay (s)',
DATEFORMAT("endTime",'YYYY-MM-dd HH:mm:ss') as 'Storage Time',
DATEFORMAT("deviceTime",'YYYY-MM-dd HH:mm:ss') as 'Log Source Time',
QIDNAME(qid) as 'Event Name',
"processorId" as 'Event Processor'
from events order by "startTime" desc LIMIT 1000 last 5 minutes

11 Performance Related Issues


11.1 Console/Magistrate
11.1.1 Services Verification
Location Console
Description Tomcat: frontend web server responsible for all UI interactions.
Host Context: a service suite that monitors all QRadar components to ensure that each
component is operating as expected. Different services (such as host-services, tomcat and
ec-ep) fall under this suite.
Command service tomcat {start | stop | status}
service hostcontext {start | stop | status}
systemctl {start | stop | status} ariel_proxy_server
Note:
• To stop services, follow the order: tomcat, hostcontext and hostservices.
• To start services, follow the order: hostservices, hostcontext and tomcat.
Location Managed Host
Description Query server; queries the console
Command systemctl {start | stop | status} ariel_query_server
Location Console/Managed Host
Description ECS aka Event Correlation Service
Command systemctl {start | stop | status} ecs-ep
systemctl {start | stop | status} ecs-ec
Location Console/Managed Host
Description Docker
Command systemctl {start | stop | status} docker
Location Console/Managed Host
Description Failed Services
Command systemctl --failed

11.1.2 Time Synchronization Failed on Console


/opt/qradar/support/all_servers.sh -C "date"
This prints the server type, IP and date-time for every managed host, so you can compare them and confirm that they match.
FEATURE_TOGGLE_CHRONY_TUNNEL was missing from /store/configservices/staging/globalconfig/nva.conf on the Console.
Solution:
i. Added the setting to the file, disabling chrony:
# echo "FEATURE_TOGGLE_CHRONY_TUNNEL=disabled" >> /store/configservices/staging/globalconfig/nva.conf
ii. This triggered a Deploy prompt in the UI. Deployed the changes and verified the file:
# grep FEATURE_TOGGLE_CHRONY_TUNNEL /opt/qradar/conf/nva.conf
iii. After this, time_sync should work by using tlsdate.
iv. Executed the following command and confirmed that there were no more errors:
# /opt/qradar/support/all_servers.sh "bash -x /opt/qradar/bin/time_sync.sh"
v. Go to the Admin tab in QRadar, click Actions and then "Full Deploy".
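Comparing the date output of every host by eye gets tedious on larger deployments. As a rough sketch, if the same check is run with date +%s (epoch seconds) and the values are collected per host, the largest clock difference can be computed as below. The host names, the collection step and the function name are assumptions; the exact output layout of all_servers.sh is not parsed here.

```python
def max_clock_skew(epoch_by_host):
    """Return (skew_seconds, earliest_host, latest_host) for a host -> epoch mapping."""
    if not epoch_by_host:
        raise ValueError("no hosts supplied")
    earliest = min(epoch_by_host, key=epoch_by_host.get)
    latest = max(epoch_by_host, key=epoch_by_host.get)
    return epoch_by_host[latest] - epoch_by_host[earliest], earliest, latest

# Hypothetical values gathered with `date +%s` on each host.
readings = {"console": 1_700_000_000, "ep-01": 1_700_000_002, "ec-01": 1_700_000_095}
skew, earliest, latest = max_clock_skew(readings)
print(f"max skew {skew}s between {earliest} and {latest}")
```

The readings are taken a moment apart, so a difference of a second or two is expected; anything much larger points at a time synchronization problem.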

11.1.3 Deployment Timeout Error


i. Applied a full deployment and got a time-out error for the Console VIP.
ii. Checked the token and the .NODOWNLOAD file: all good.
iii. Validate_deployment is good.
iv. Changed the configuration file for the time-out error.
v. Restarted the following services in order:
systemctl stop tomcat
systemctl stop hostcontext
systemctl stop hostservices
systemctl start hostservices
systemctl start tomcat
systemctl start hostcontext
vi. Ran /opt/qradar/upgrade/util/setup/upgrades/wait_for_start.sh; once it has completed, start tomcat.
vii. Check the UI and the full deployment.


11.2 Event Processor and Event Collector


11.2.1 EPS Measuring and License Throttling
How do you find the current EPS of a QRadar box so that the license can be renewed accordingly?
When viewed from the UI the value seems to be under control, but the CLI shows events being throttled a lot of the time.
i. https://www.ibm.com/support/pages/qradar-about-eps-fpm-limits
ii. https://www.ibm.com/support/pages/qradar-event-rate-eps-graph-may-not-reflect-entire-event-load-system
iii. https://www.ibm.com/support/pages/qradar-determining-events-second-rate-each-log-source-qradar

Q1: As per the technote, EPS is measured at two points, SourceMonitor and StatFilter. Which value is checked against the license?
A1: In each pair of numbers, the first is coalesced and the second is raw. The peak is the highest value since the service was last restarted; again the first number is coalesced and the second is raw.
Q2: The EPS values shown on the CLI (StatFilter/SourceMonitor) and in the UI are very different.
There are 4 values of EPS shown in the UI:
a. Events per second coalesced - Peak 1 sec
b. Events per second coalesced- Average 1 min
c. Events per second Raw - Peak 1 sec
d. Events per second Raw - Average 1 min
A2:
Oct 25 13:25:09 ::ffff:10.254.158.209 [ecs-ec.ecs-ec] [[type=com.ibm.si.ec.filters.stat.StatFilter][parent=ISB-EP-PRI-primary.ptcl.net.pk:ecs-ec/EC/Processor2]] com.ibm.si.ec.filters.stat.StatFilter: [INFO] [NOT:0000006000][10.254.158.209/- -] [-/- -] Events per second: 1s:6829,14121 (peak 12713,34691) (compression: 52%) 5s:4456,10781 (peak 7483,22373) (compression: 59%) 10s:4747,13657 (peak 6910,15464) (compression: 65%) 30s:4332,13261 (peak 6297,14190) (compression: 67%) 60s:4327,13216 (peak 6018,13855) (compression: 67%)
Values are:
EPS Coalesced - 1 Sec = 6829
EPS Raw - 1 Sec = 14121
EPS Coalesced - Peak 1 Sec = 12713
EPS Raw - Peak 1 Sec = 34691
EPS Coalesced - 1 Min Avg = 4327
EPS Raw - 1 Min Avg = 13216
EPS Coalesced - Peak 1 Min Avg = 6018
EPS Raw - Peak 1 Min Avg = 13855
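Reading these values off a StatFilter line by hand is error-prone, so a small parser can help. The sketch below assumes the "Events per second" layout shown in the sample line (window, coalesced,raw, peak coalesced,raw, compression); the function and key names are made up for illustration.

```python
import re

# One match per window, e.g. "1s:6829,14121 (peak 12713,34691) (compression: 52%)".
EPS_WINDOW = re.compile(
    r"(\d+)s:(\d+),(\d+) \(peak (\d+),(\d+)\) \(compression: (\d+)%\)"
)

def parse_statfilter(line):
    """Return {window_seconds: {...}} for every EPS window found in a StatFilter line."""
    result = {}
    for match in EPS_WINDOW.finditer(line):
        window, coalesced, raw, peak_c, peak_r, compression = map(int, match.groups())
        result[window] = {
            "coalesced": coalesced,
            "raw": raw,
            "peak_coalesced": peak_c,
            "peak_raw": peak_r,
            "compression_pct": compression,
        }
    return result

# Shortened copy of the sample log line (1s and 60s windows only).
sample = (
    "Events per second: 1s:6829,14121 (peak 12713,34691) (compression: 52%) "
    "60s:4327,13216 (peak 6018,13855) (compression: 67%)"
)
stats = parse_statfilter(sample)
print(stats[1]["raw"], stats[60]["coalesced"])
```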


EPS Related Commands


Description: Checking the number of dropped events when the EPS license limit has been reached
Command:
tail -n 15 /var/log/qradar.log | grep "peak of"
less -iS /var/log/qradar.log | grep -i "license restrictions"

Description: EPS per log source by interval, last 30 days
Command:
SELECT LOGSOURCENAME(logsourceid) AS "Log Source",
SUM(eventcount) AS "Number of Events in Interval",
SUM(eventcount) / 2592000 AS "EPS in Interval"
FROM events GROUP BY "Log Source"
ORDER BY "EPS in Interval" DESC
LAST 30 DAYS

Description: EPS per log source by interval, last 7 days
Command:
SELECT LOGSOURCENAME(logsourceid) AS "Log Source",
SUM(eventcount) AS "Number of Events in Interval",
SUM(eventcount) / 604800 AS "EPS in Interval"
FROM events GROUP BY "Log Source"
ORDER BY "EPS in Interval" DESC
LAST 7 DAYS

Description: Event counts and event types per day
Command:
SELECT DATEFORMAT(devicetime, 'dd-MM-yyyy') AS 'Date of log source',
QIDDESCRIPTION(qid) AS 'Description of event',
COUNT(*)
FROM events
WHERE devicetime > (now() - (7*24*3600*1000))
GROUP BY "Date of log source", qid
LAST 1 DAYS

Description: Last 24 hours by log sources
Command:
SELECT LOGSOURCENAME(logsourceid) AS "Log Source",
SUM(eventcount) / 86400 AS "Average EPS",
MAX(eventcount) AS "Peak EPS"
FROM events GROUP BY "Log Source"
ORDER BY "Average EPS" DESC
LAST 24 HOURS

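Each EPS-per-log-source query divides the event count by the number of seconds in its search window, so the divisor and the LAST clause have to agree (30 days is 2,592,000 seconds, 7 days is 604,800). A minimal sketch that derives both from a single number of days is shown below; the function name is an assumption and the generated text simply mirrors the queries in this section.

```python
SECONDS_PER_DAY = 24 * 60 * 60  # 86400

def eps_per_log_source_query(days):
    """Build the EPS-per-log-source AQL for a window of `days` days."""
    if days < 1:
        raise ValueError("days must be at least 1")
    seconds = days * SECONDS_PER_DAY
    return (
        'SELECT LOGSOURCENAME(logsourceid) AS "Log Source", '
        'SUM(eventcount) AS "Number of Events in Interval", '
        f'SUM(eventcount) / {seconds} AS "EPS in Interval" '
        'FROM events GROUP BY "Log Source" '
        'ORDER BY "EPS in Interval" DESC '
        f"LAST {days} DAYS"
    )

print(eps_per_log_source_query(30))
```

Generating the text this way keeps the window and the divisor from drifting apart when the query is copied for a different period.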
11.2.2 Timeout Error on Event Processor


The last deploy was performed around 17:15 system time and timed out for one host, x.x.158.199. Long-running database transactions on that host, related to previous replication scripts, delayed the deploy process. For example:
Aug 11 15:11:00 ::ffff:x.x.158.209 [hostcontext.hostcontext] [9dd97b25-55e1-4717-8c38-e9bffa0081f8/SequentialEventDispatcher] com.q1labs.hostcontext.tx.TxSentry: [WARN] [NOT:0150134100][x.x.158.209/- -] [-/- -]Found unmanaged process on host x.x.158.199: postgres, pid=128607, TX age=38755 secs, command=[128607 25260 postgres: qradar qradar [local] SELECT]
Aug 11 15:11:00 ::ffff:x.x.158.209 [hostcontext.hostcontext] [9dd97b25-55e1-4717-8c38-e9bffa0081f8/SequentialEventDispatcher] com.q1labs.hostcontext.tx.TxSentry: [WARN] [NOT:0000004000][x.x.158.209/- -] [-/- -] TX on host x.x.158.199: pid=128607 age=38755 IP=null port=-1 locks=373 query='SELECT replicate_restore_dump('exttxt', 'public');'
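To find such long-running transactions without reading every TxSentry warning, the "Found unmanaged process" lines can be filtered by transaction age. The sketch below assumes the "pid=..., TX age=... secs" wording shown in the sample; the function name and the one-hour threshold are illustrative assumptions.

```python
import re

# Matches e.g. "pid=128607, TX age=38755 secs" in TxSentry warnings.
TX_AGE = re.compile(r"pid=(\d+), TX age=(\d+) secs")

def long_transactions(log_lines, min_age_s=3600):
    """Return (pid, age_seconds, readable_age) for transactions at least min_age_s old."""
    found = []
    for line in log_lines:
        match = TX_AGE.search(line)
        if not match:
            continue
        pid, age = int(match.group(1)), int(match.group(2))
        if age >= min_age_s:
            hours, rest = divmod(age, 3600)
            minutes, seconds = divmod(rest, 60)
            found.append((pid, age, f"{hours}h{minutes:02d}m{seconds:02d}s"))
    return found

# Shortened copy of the sample warning.
lines = [
    "Found unmanaged process on host x.x.158.199: postgres, pid=128607, "
    "TX age=38755 secs, command=[128607 25260 postgres: qradar qradar [local] SELECT]"
]
print(long_transactions(lines))
```

In the sample, the transaction had been open for 38755 seconds, which is more than ten hours before the deploy was attempted.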
Solution1:
i. Cloned the SSH session to the Console and executed
/opt/qradar/bin/replication.pl -full -clean
ii. On the EP, executed /opt/qradar/bin/replication.pl -rebuild
iii. Copied the file /store/configservices/configurationsets from the Console to the managed host (EP), then executed /opt/qradar/bin/local_transformation.sh -l -f to deploy locally.
Solution2:
A potential workaround for a deploy in progress is to attempt to repair a host. To do
this, the support representative can attempt to move the files and restart hostcontext
on the managed host.
Procedure
i. Using SSH, log in to the Console appliance as the root user.
ii. Open an SSH session from the Console to the Managed host.
iii. Move the /store/tmp/status/deployment."IP" file(s) to a temporary directory.
For example, mv /store/tmp/status/deployment."172.16.77.35" /tmp
Warning: Restarting hostcontext on a managed host stops a number of services and impacts the event collection capability of the managed host until the services are restarted. This process should only be done during a maintenance window or at the request of a QRadar Support Representative.
iv. To restart hostcontext on the Managed host(s), type service hostcontext restart (7.2 and lower) or systemctl restart hostcontext (7.3 and above).
v. Log in to the QRadar Console.

11.2.3 Persistent Queue Issue (Due to Corrupted ECS-EC Service)


The EC cluster was not collecting events. The primary was active and all services were running; the issue was with the /store/persistent_queue partition, which had a corrupted file in ecs-ec.ecs-ec/:
ls: cannot access ecs-ec.ecs-ec/ecs-ec_EC_TCP_TO_EP_0.dat: Input/output error
total 18G
-????????? ? ? ? ? ? ecs-ec_EC_TCP_TO_EP_0.dat


Partition info:
10.126.217.81:/persistent_queue 873G 30G 844G 4% /store/persistent_queue
Mount:
10.126.217.81:/persistent_queue /store/persistent_queue glusterfs defaults,_netdev 0 0

The first .dat file after ecs-ec_EC_TCP_TO_EP_0.dat was from Apr 3rd and the last from Apr 8th:
First: -rw-r--r-- 1 root root 100M Apr 3 18:59 ecs-ec_EC_TCP_TO_EP_1.dat
Last: -rw-r--r-- 1 root root 28M Apr 8 13:19 ecs-ec_EC_TCP_TO_EP_180.dat

The configuration file of ecs-ec was also corrupted:


-rw-r--r-- 1 root root 4.0K Apr 15 22:53 ecs-ec_EC_TCP_TO_EP.cfg
# cat ecs-ec.ecs-ec/ecs-ec_EC_TCP_TO_EP.cfg
cat: ecs-ec.ecs-ec/ecs-ec_EC_TCP_TO_EP.cfg: Input/output error

I tried removing the file in question but it did not work:


# rm -rf ecs-ec_EC_TCP_TO_EP_0.dat
rm: cannot remove ‘ecs-ec_EC_TCP_TO_EP_0.dat’: Input/output error

Solution:
i. Stopped ecs-ec and ecs-ec-ingress
systemctl stop ecs-ec-ingress && systemctl stop ecs-ec
ii. Moved all .dat files to /store/ibm_support/persistent_queue_backup/
iii. Removed the ecs-ec_EC_TCP_TO_EP.cfg
iv. Recreated the ecs-ec_EC_TCP_TO_EP.cfg (644 root:root) and set its contents as:
3
1
1
0
0
0
false
0393985
true
false
v. Moved the ecs-ec_EC_TCP_TO_EP_1.dat back to /store/persistent_queue/ecs-ec.ecs-ec
and started ecs-ec and ecs-ec-ingress


11.2.4 Backlog in persistent queue


If you get a ticket where log collection has stopped and there is a large backlog for the ecs-ec service in the persistent_queue, follow these steps:
i. Set ecs-ec service to manual
/opt/qradar/systemd/bin/manual.sh ecs-ec
ii. Stop ecs-ec
systemctl stop ecs-ec
iii. Move the persistent queue to a backup location
cd /store/persistent_queue/ecs-ec.ecs-ec/
mkdir -p /store/ibm-support/TS003876452
mv * /store/ibm-support/TS003876452
iv. Remove the manual flag for ecs-ec and start the service again
/opt/qradar/systemd/bin/manual.sh ecs-ec
systemctl start ecs-ec

11.2.5 Accumulator has fallen behind


https://www.ibm.com/support/pages/qradar-how-troubleshoot-accumulator-issues-using-collectgvstatssh
/opt/qradar/support/collectGvStats.sh -s | less
Result:
view[10094] time: 83748ms. Refs: {unknown}
view[10098] time: 82968ms. Refs: {unknown}
Look at the amount of time it takes for the Global Views (GVs) to load. In this example, EACH of the two listed Global Views took over 80 seconds to load. Since there are only 60 seconds to load ALL Global Views, 80 seconds for a single view is excessive.

In this case, we need to find out which searches are associated with those views. To do that, run the following commands, this time on the Console.
/opt/qradar/support/collectGvStats.sh -m 10094
/opt/qradar/support/collectGvStats.sh -m 10098
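When collectGvStats.sh -s lists many views, the slow ones can be filtered out automatically. The sketch below assumes the "view[ID] time: NNNms" layout shown in the result above and uses the 60-second budget mentioned in the text; the function name and the third sample line are made up for illustration.

```python
import re

# Matches e.g. "view[10094] time: 83748ms."
VIEW_TIME = re.compile(r"view\[(\d+)\] time: (\d+)ms")

def slow_views(output, budget_ms=60_000):
    """Return (view_id, load_time_ms) for every view slower than the budget."""
    return [
        (int(view_id), int(time_ms))
        for view_id, time_ms in VIEW_TIME.findall(output)
        if int(time_ms) > budget_ms
    ]

# First two lines are from the example; the third is a hypothetical fast view.
output = """view[10094] time: 83748ms. Refs: {unknown}
view[10098] time: 82968ms. Refs: {unknown}
view[10001] time: 1200ms. Refs: {unknown}
"""
for view_id, time_ms in slow_views(output):
    print(f"/opt/qradar/support/collectGvStats.sh -m {view_id}  # {time_ms} ms")
```

The printed lines are the follow-up commands to run for each slow view.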

11.3 Other Performance Related Commands


Description: ECS-EC-INGRESS and EPS Stats
Command: watch -d -n 1 "/opt/qradar/support/jmx.sh -p 7787 -b 'com.q1labs.sem:application=ecs-ec-ingress.ecs-ec-ingress,type=sources,name=Source Monitor'"

Description: Event Throttling
Command: grep -i QueuedEventThrottleFilter qradar.error

Description: ECS-EC Service Stats
Command: watch -n1 "/opt/qradar/support/jmx.sh -p 7777 -b 'com.q1labs.sem:application=ecs-ec.ecs-ec,type=sources,name=Source Monitor'"

Description: Hardware, VM related Issues
Command:
cat /var/log/messages | grep "interrupts"
cat /var/log/qradar.log | grep "I/O"
cat /var/log/messages | grep "lockup"

Description: SpillOver Queue
Command: ls /store/transient/spillover/queue/ecs-ec-ingress.ecs-ec-ingress/ | wc -l

Description: Routing Rules (LicenseGiveback)
Command: cat /var/log/qradar.log | grep LicenseGivebackFilter

12 Notes and APARs


12.1 Short Notes
The multiline protocol is used to merge different payloads (events) into a single payload (event) after matching a common regex in each payload.
QRadar cannot reformat a single payload into a single-line event; this needs to be fixed on the remote device.

12.2 Tech Notes


12.2.1 Storage:
i. https://www.ibm.com/support/knowledgecenter/SS42VS_7.3.1/com.ibm.qradar.doc/b_offboard_storage.pdf
ii. https://www-01.ibm.com/support/docview.wss?uid=swg21693083
iii. https://www.ibm.com/support/knowledgecenter/SS42VS_7.3.1/com.ibm.qradar.doc/c_qradar_adm_bkup_arch_restor.html

12.2.2 Log Sources


i. https://www.ibm.com/support/pages/qradar-how-can-you-find-out-what-log-sources-are-generating-most-events
ii. https://www.ibm.com/support/pages/qradar-determining-events-second-rate-each-log-source-qradar

12.2.3 EPS IBM Tech notes


i. https://www.ibm.com/support/pages/qradar-how-can-you-find-out-what-log-sources-are-generating-most-events
ii. https://www.ibm.com/support/pages/qradar-determining-events-second-rate-each-log-source-qradar

12.2.4 Retention Buckets


https://www.ibm.com/support/knowledgecenter/SSKMKU/com.ibm.qradar.doc_cloud/t_qradar_adm_conf_retention_bucket.html


12.2.5 APARs
APAR IJ15472: EVENT COUNT NUMBERS DOESN'T MATCH IN THE OFFENSE DETAILS SCREEN ON CLICKING THE EVENT/FLOW COUNT
