NetGuardian Screener Admin Guide 7.3.1
Version 7.3.1
NetGuardians SA <info@netguardians.ch>
Table of Contents
1. NG|Screener Overview . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 1
1.1. Overview . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 1
2. NG|Screener Setting . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 8
2.1. Introduction . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 8
2.2.1. NG|Screener . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 8
2.2.2. NG|ScreenerUI . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 8
2.2.3. NG|Messaging . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 8
2.2.4. NG|Storage . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 8
2.2.5. NG|Discover . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 8
2.2.6. Global . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 9
2.4. NG|CaseManager. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 9
2.5. Syslog-NG. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 10
3. NG|Connectors Management . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 11
3.1. Introduction . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 11
4. Licensing . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 17
5.1. Introduction . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 20
Edit Mode . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 25
LDAP Mappers. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 27
Password Hashing . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 29
SP Descriptor. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 40
6. User Management . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 41
6.1. Introduction . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 41
Role creation . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 42
User creation . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 42
7.1. Introduction . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 44
7.2. Multi-tenancy . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 45
7.5.1. Syntax . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 47
7.6.1. Syntax . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 48
7.7.1. Syntax . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 56
7.8.1. Syntax . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 59
Syntax . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 72
Syntax . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 73
Syntax . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 74
Syntax . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 76
Syntax . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 76
Syntax . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 77
Syntax . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 77
Syntax . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 79
Syntax . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 80
Syntax . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 81
Syntax . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 81
Syntax . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 82
Syntax . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 82
8. NG|Integrity . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 84
8.1. Introduction . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 84
8.2. Overview . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 84
8.3. Configuration . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 85
8.3.1. Sample Configuration File . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 85
8.3.2. Parameters. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 85
9. Reference Data . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 88
9.1. Introduction . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 88
9.2. Overview . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 88
9.3. Configuration . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 89
event_tracking_handling_sample.py . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 128
event_tracking_monitoring_sample.py . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 129
Selecting . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 132
Sorting . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 133
Joining . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 134
19.3.9. UI . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 182
19.9.1. Problem #1: My cluster status is red or yellow. What should I do? . . . . . . . . 186
19.9.2. Problem #2: Help! Data nodes are running out of disk space. . . . . . . . . . . . . 187
19.10.2. Controls runs very slow with huge amount of data . . . . . . . . . . . . . . . . . . . . 190
/etc/ng-screener/daemon/modules/executor.conf . . . . . . . . . . . . . . . 199
/etc/ng-screener/common/ng-screener.conf . . . . . . . . . . . . . . . . . . . . . 199
/etc/ng-screener/common/referenceData.conf . . . . . . . . . . . . . . . . . . . 200
/etc/ng-screener/daemon/modules/{forensic,feeding}.conf . . . . 200
/etc/ng-screener/common/controlCommon.conf . . . . . . . . . . . . . . . . . . . 201
/etc/ng-screener/daemon/modules/control.conf. . . . . . . . . . . . . . . . . 202
/etc/ng-screener/common/security.conf . . . . . . . . . . . . . . . . . . . . . . . . . 203
/etc/syslog-ng-rules/syslog-ng.conf . . . . . . . . . . . . . . . . . . . . . . . . . . . 205
D.1.4. Migrate LDAP configuration for users/roles (in each tenant) . . . . . . . . . . . . . . 212
D.1.7. Install new versions of NG|Screener and NG|CaseManager (and other required packages) . . . . . . . . . . . . 213
D.1.8. Configure the Apache reverse proxy . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 213
D.1.9. Make sure the certificates are included in the keystore . . . . . . . . . . . . . . . . . . 218
D.1.10. Restart the Apache reverse proxy and the applications. . . . . . . . . . . . . . . . . . 218
D.2. LDAP user/role migration . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 218
1.1. Overview
NG|Screener uses big data technology and predictive analytics to combine and standardize
data from across the entire banking system. Within the framework of your selected controls,
probes and connectors capture and analyze large volumes of data related to user activity
and transactions. By associating user behavior with core banking transactions, NG|Screener
flags activity that may indicate fraud. A user-friendly graphical interface provides a
consolidated “control tower” dashboard view that helps turn the data captured through
controls into actionable information. You have all the information you need to effectively
identify and investigate fraud, conduct forensic analysis and implement measures to prevent
future incidents. All captured audit trails and transactions are copied and stored
permanently. They cannot be corrupted or erased, which is critical for a successful
prosecution case.
NG|Screener supports the following audit trail collection mechanisms:
• Syslog
• SNMP Traps
• WMI Polling
• JDBC Polling
• Checkpoint OPSEC
• Flat files
• SAP Polling
• LDAP Polling
• Syslog agent
Once collected, audit trails are stored in their original format for long term conservation on
the file system.
The normalization process, which translates the original audit trail format into a unified
data model, is then executed on demand when a forensic analysis (a precise, operator-led
investigation) is needed.
Audit trail analysis is achieved using NG|ScreenerUI to drill down into the vast amount of
heterogeneous audit trails. Operational issues (intrusions, performance issues, internal
security threats, etc.) are detected via an intelligent collation and correlation of gathered
audit trails. Furthermore, the solution includes a report-generation tool that automates
regular controls and indicator generation.
NG|Console Tracking System is an SSH proxy that tracks the activities of all Unix server
administrators and sends its audit trails to the NG|Analytic Server for processing.
This chapter presents NG|Screener global architecture and its main components.
This layer is responsible for collecting audit trails. Audit trail collection is the essence of
the solution, thus NG|Screener allows many audit trail collection mechanisms to
integrate with as many systems as possible. Most collection mechanisms offered by
NG|Screener are non-intrusive and do not require installing additional agents on the
audit trail source system.
1. Passive collection is the easiest way to collect audit trails as it relies on standard
protocols. NG|Screener directly receives audit trails from their sources and does not
need to access the sources of the audit trails.
2. Active collection (polling) is an alternative way to retrieve audit trails when the
source system does not support any of the standard protocols presented above. In
such a situation, NG|Screener will regularly poll the source device to obtain the
latest audit trails available. Generally, read access needs to be configured on the
audit trail source to allow NG|Screener to gather audit trails. NG|Screener supports
the following active collection mechanisms: WMI Polling (Microsoft), Database
Polling, T24 Polling, Checkpoint OPSEC, SAP Polling, LDAP Polling, and flat file reading.
3. Agent-based collection is used in cases where neither passive collection nor active
collection is applicable; an agent is installed on the source device to gather local
audit trails and forward them to NG|Screener.
• Raw Data Storage and Dispatching Layer
Once collected, audit trails are passed to syslog-ng to be processed and indexed in
NG|Storage; simultaneously, they are stored on the file system for archive and recovery
purposes. Note that, for compliance reasons, NG|Screener stores audit trails in their
original format.
Chapter 1. NG|Screener Overview | 3
NG|Screener Administration Guide
Audit trails are stored on the file system under the /log-collector directory and are
organized in a tree as described in Figure [log_storage_organization].
Once audit trail collection and storage are complete, the remaining task is the
interpretation of heterogeneous audit trails for analysis. Standardization of collected
audit trails is achieved using a normalization process that translates each proprietary
audit trail format into a unified data model.
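As an illustration, the normalization step can be pictured as a field mapping from a proprietary record onto a common model. The following sketch is purely hypothetical: the field names do not reflect the actual unified data model.

```python
# Hypothetical raw audit record from a source system
raw = {"usr": "jdoe", "act": "LOGIN", "ts": "2023-04-01 13:45:00"}

# Mapping onto a (purely illustrative) unified data model
unified = {
    "user": raw["usr"],        # who performed the action
    "action": raw["act"],      # what was done
    "timestamp": raw["ts"],    # when it happened
}

print(unified["user"])  # jdoe
```

In the real product this translation is driven by the connector's parsing rules, not by hand-written mappings like the one above.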
As shown in the previous diagram, there are four data flows inside NG|Screener.
This flow normalizes and enriches raw events from NG|Messaging and stores them in
NG|Storage. Those normalized events are ready for forensic investigation or control
execution.
This flow is followed when a control is executed. It first fetches data from NG|Storage
and stores the result in the Thrift server. The results are then used to generate reports,
which are stored in the database. In addition, the report may be published to an external
channel if needed.
The reference data flow reads information from external data sources (SQL, CSV, etc.) and
stores it in NG|Storage, where it can then be used to enrich normalized events during the
normalization process.
The realtime analysis flow computes the number of events fetched for each host/service
and raises an alert if it is not in the pre-configured range.
There is a notion of different data windows in NG|Daemon. The following section gives a
detailed description of all of them.
This window defines the period of data that is normalized and kept in NG|Storage. The
Daemon uses a dedicated thread to clean data outside of this window at midnight. This
window is present to ensure that a controlled amount of data is stored in NG|Storage to
preserve disk space.
To load data outside of the storage window for investigation, the Custom processing job
can be used. This functionality is reachable from the Admin / Processing menu in the UI.
The job can load data for specific services/hosts and the desired period.
If the job is requested to load data which already exists in NG|Storage, the existing data
is overwritten; this ensures there is no duplicated data in NG|Storage. All data outside of
the storage window is automatically removed every night at midnight.
This window defines the maximum period of data that a user can analyze in the UI. It is
configurable per forensic view (e.g. violations, transaction) to limit the maximum time
period a user can choose on each view.
Its purpose is to achieve reasonable response times and user experience of the forensic
views.
When creating such a file for a new tenant, one has to make sure it is
readable by the ng-screener user.
To save space in storage, log files under /log-collector are compressed periodically.
The logs archiving window defines the period in which logs are left uncompressed.
Normally, this window is set so that all logs arrive in the system in this period. This
prevents multiple archived files from the same day. By default, this window is set to 2 days,
which means that today’s and yesterday’s logs are not archived.
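Purely as an illustration of the default 2-day window, the set of days whose logs remain uncompressed can be computed like this (the function is hypothetical, not part of the product):

```python
from datetime import date, timedelta

def uncompressed_days(today: date, window: int) -> list:
    """Days whose logs are left uncompressed under /log-collector,
    given an archiving window of `window` days."""
    return [today - timedelta(days=i) for i in range(window)]

# With the default window of 2 days, today and yesterday stay uncompressed
print(uncompressed_days(date(2023, 4, 10), 2))
# [datetime.date(2023, 4, 10), datetime.date(2023, 4, 9)]
```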
When the daemon is installed, it sets up a cron job to maintain this window every day. The cron job
is located in /etc/cron.d/logrotate. The maintenance script itself is located in
/usr/local/ng-screener/daemon/script/logrotate.sh.
This window defines the period for which logs are kept in /log-collector. All logs
outside of this period are removed automatically to preserve storage space.
2.1. Introduction
2.2.1. NG|Screener
By default, the memory allocated for NG|Screener is equal to 15% of the available machine
memory minus 400MB.
The minimum memory necessary for NG|Screener is 512MB. That setting is located in
/usr/local/ng-screener/tools/packaging/generate-daemon-systemd-env.
After changing its value, restart the ng-screener service to apply the change.
2.2.2. NG|ScreenerUI
By default, the memory allocated for NG|ScreenerUI is 10% of the available machine
memory minus 400MB.
The minimum memory necessary for NG|ScreenerUI is 400MB. That setting is located in
/usr/local/ng-screener/ui/tools/generate-ui-systemd-env. After changing
its value, restart the ng-screener-ui service to apply the change.
2.2.3. NG|Messaging
By default, the memory allocated for NG|Messaging is 1GB. To change this value, edit the
KAFKA_HEAP_OPTS parameter in
/usr/local/ng-screener/ngmessaging/bin/kafka-server-start.sh. Restart the ng-messaging
service to apply the change.
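For example, to raise the NG|Messaging heap to 2 GB, the relevant line in kafka-server-start.sh would look roughly as follows (the value is illustrative; keep it within your machine's available memory):

```sh
# /usr/local/ng-screener/ngmessaging/bin/kafka-server-start.sh
export KAFKA_HEAP_OPTS="-Xmx2G -Xms2G"
```

Then restart the ng-messaging service for the new heap size to take effect.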
2.2.4. NG|Storage
By default, the memory allocated for NG|Storage is 50% of the available machine memory
minus 400MB.
The minimum memory allocation for NG|Storage is 1GB and the maximum value is 30GB.
Those settings are located in
/usr/local/ng-screener/ngstorage/bin/generate-ngstorage-systemd-env. After changing
their values, restart the ng-storage service to apply the changes.
2.2.5. NG|Discover
2.2.6. Global
Most of the above settings (and others) can be configured globally in
/etc/ng-screener/global.env. All settings placed in that file will take precedence over
any default values. Refer to that file for available settings.
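A sketch of such an override is shown below. The variable name is hypothetical; consult global.env itself for the settings your installation actually supports.

```sh
# /etc/ng-screener/global.env
# Hypothetical example: pin the daemon heap instead of the computed default.
NG_DAEMON_MEMORY=1024m
```

After editing the file, restart the affected services for the overrides to take effect.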
In the ng-screener.conf configuration file, one can adjust a few display switches that
influence the links shown in the UI (all default to true):
2.4. NG|CaseManager
NG|CaseManager is usually installed on the same machine as NG|Screener, but this is not
required. If it is installed elsewhere, the custom description link 'investigate hit' (the
main relation from NG|CaseManager to NG|Screener) will not work out of the box, and the
Apache configuration needs to be modified.
This will redirect all requests from the CM to the right server. If you have only
NG|CaseManager on this server, the configuration will look like the following:
<VirtualHost *:443>
SSLCipherSuite EECDH+AESGCM:EDH+AESGCM:AES256+EECDH:AES256+EDH
SSLCertificateFile /etc/httpd/conf.d/netguardians.crt
#SSLCertificateKeyFile /etc/httpd/conf.d/server.key
RewriteEngine On
ProxyPreserveHost On
# Proxy to caseManager
ProxyPass /cm/ http://127.0.0.1:3000/cm/
ProxyPassReverse /cm/ http://127.0.0.1:3000/cm/
</VirtualHost>
2.5. Syslog-NG
Syslog-NG has a maximum connection limit for TCP, with the default set to 10. If many
sources post events to Syslog-NG, this limit might need to be increased. This can be done
in /etc/syslog-ng-rules/syslog-ng.conf as follows:
# Standard syslog
source s_collector {
tcp(ip(0.0.0.0) port(514) encoding("iso-8859-1") flags(no-multi-line)
max_connections(50));
udp(ip(0.0.0.0) port(514) encoding("iso-8859-1") flags(no-multi-line));
tcp(ip(0.0.0.0) port(63514) encoding("utf-8") flags(no-multi-line)
max_connections(50));
udp(ip(0.0.0.0) port(63514) encoding("utf-8") flags(no-multi-line));
};
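After raising the limit and restarting syslog-ng, reception can be checked by sending a test event over TCP from a client machine, assuming the util-linux logger tool is available there (replace the placeholder with your collector's address):

```sh
logger --server <collector-host> --port 514 --tcp "NG|Screener connectivity test"
```

The test message should then appear under /log-collector on the NG|Screener server.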
3.1. Introduction
A connector is a component that allows NG|Screener to collect and analyze audit trails
from various sources (e.g. core banking applications, firewall devices, Windows AD, custom
applications, etc.).
A connector includes:
• Audit trail collection mechanisms and configurations adapted for the source (syslog,
agent, polling, etc.)
• Translation dictionary to transform audit trails to the Business Data Model
• Pre-configured controls, packaged in the corresponding Solutions
A connector has to be installed for every type of audit trail that is expected to be
ingested by the NG|Analytics Server.
• username & password: credentials to log in to the Update Center. Each client has unique
credentials for authentication. These credentials are provided by the vendor.
After modifying the configuration file, ng-screener and ng-screener-ui services need
to be restarted for the changes to take effect.
The connectors currently installed on the system can be listed in one of the following
ways:
Do the following steps to see all the connectors installed on the server:
The selected packages will be installed in the background. The result of the operation will
be presented to the user as a notification.
A new/updated connector can be provided by the vendor in the form of an RPM file. Run the
following steps to apply it manually:
A new/updated solution can be provided by the vendor in the form of a zip file. Run the
following steps to apply it manually:
• serviceName
This is the name of the service used in the business data model.
• serviceDescription
• indexPattern
Defines the indexPattern used to store the event in ngStorage. It can be one of ngt, ngc,
ngi or ngv.
• indexGranularity
This setting indicates when to create a new ngStorage index for this kind of event. It can
be one of day, month or year. The default value is month.
• dateFormat
This setting indicates the format to use for date parsing during the normalization phase.
It follows the Java patterns as defined in the documentation of the
DateTimeFormatter class. Its default value is yyyy-MM-dd HH:mm:ss.
• numberFormatThousandsSeparator
This setting indicates the character to consider for thousands separators when parsing
numbers. Its default value is a comma (,). Possible values are:
• numberFormatDecimalSeparator
This setting indicates the character to consider for decimal separator when parsing
numbers (i.e. the character that separates the integer part from the non-integer part in
a number). Its default value is a dot (.). Possible values are:
Two possible values:
• syslogNgUseReceivedTime
• syslogService_xx
For example, a service configuration might look as follows:
serviceName = temenosT24Transaction
serviceDescription = Temenos T24 Audit Trails
parsingRule = temenosT24Transaction.rules
indexPattern = ngt
indexGranularity = month
syslogNgSource = s_collector
syslogNgUseReceivedTime = false
syslogService_1 = temenosT24Transaction
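To illustrate two of the parsing defaults above, here is a small sketch in Python; the strptime format %Y-%m-%d %H:%M:%S is the equivalent of the Java pattern yyyy-MM-dd HH:mm:ss used as the dateFormat default, and the number-parsing helper is hypothetical, written only to show the default separators in action:

```python
from datetime import datetime

# Default dateFormat: Java "yyyy-MM-dd HH:mm:ss" == strptime "%Y-%m-%d %H:%M:%S"
ts = datetime.strptime("2023-04-01 13:45:00", "%Y-%m-%d %H:%M:%S")
print(ts.isoformat())  # 2023-04-01T13:45:00

# Default separators: comma for thousands, dot for decimals
def parse_number(raw: str, thousands: str = ",") -> float:
    """Strip thousands separators, keep the decimal dot (illustrative only)."""
    return float(raw.replace(thousands, ""))

print(parse_number("1,234.56"))  # 1234.56
```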
This section explains how to generate, activate, update or obtain information about your
license.
A C2V (client-to-vendor) file is a fingerprint of the server on which NG|Screener is
installed. This file allows NetGuardians or a NetGuardians reseller to create a license. A
V2C (vendor-to-client) file is the license created specifically for your server system.
To show information about NG|Screener license, click the icon at the top right of the
screen, then select the second tab. The side panel shows system information (Figure
License Information).
◦ PROVISIONAL: a trial license, it is valid for a time-limited period. After the trial
period, its state changes to EXPIRED. You need to activate the license before its trial
period expires.
◦ ACTIVATED: the license is activated, it is valid for a time-limited period. After that
period, its state changes to EXPIRED. You need to update the license before it
expires.
◦ EXPIRED: the license is expired, you can only use a limited functionality of
NG|Screener. You need to contact the vendor to activate/update the license.
• License Features: lists which connectors may be used. The license may specify the
number of connectors allowed, or it may contain a list of specific connectors. In the
former case, the application can install any connectors as long as their number does not
exceed the number specified in the license feature. In the latter case, you can only
install connectors specified in the list. In addition, the dialog also shows the state of
each license feature and its expiration date (in UTC). The license feature state may be:
◦ AVAILABLE: the feature is available
◦ EXPIRED: the feature is expired, you need to contact the vendor to update the license
Note that if the server detects that something is wrong with the license (e.g. system
clock tampering), it may lock the license, and the license features may be reported as
expired even though the expiration date has not passed. In that case, please contact the
vendor for support.
A C2V file is used by the vendor to generate the license. To generate a C2V file, follow
these steps:
Ask the vendor for an activation license (V2C file) to activate the product. The vendor may
ask you for a C2V file (Section Generate C2V file). After getting the activation file,
follow these steps:
Ask the vendor for an update license (V2C file) to update the license, then follow these
steps:
1. Go to Admin / Licensing menu, a new page appears to manage license (Figure Update
License)
2. Click on the Choose File button. A dialog box appears asking you to choose the
location of the update license file (V2C file).
3. Specify the update license file, then click the Open button. The file name is displayed
in the text box beside the Choose File button.
4. Click the Update button to update the license. It may take some minutes to finish the
process.
5. Refresh the web browser to apply the changes.
Chapter 5. Authentication with ngAuth
5.1. Introduction
All authentication in ngScreener is managed by the ngAuth service. This service is a
repackaging of the open source project Keycloak (https://www.keycloak.org). This chapter
explains the basic operations that an ngScreener admin should know. For more advanced
configuration settings, please refer directly to Keycloak's documentation at
https://www.keycloak.org/docs/.
To access the admin console, connect to https://myhost/auth/admin and use the superadmin
user that is created at installation. The default password is netguardians; it should be
changed.
In the Keycloak nomenclature, a Realm corresponds to a Tenant on the ngScreener side. The
easiest way to add a new Realm is to use the script
/usr/local/ng-screener/tools/multi-tenancy/createTenant.py.
A role in ngAuth is only a name. Functionalities are defined in each business application
(ngBrowser or ngCaseManager).
Realm-level roles are a global namespace to define your roles. You can see the list of
built-in and created roles by clicking the Roles left menu item.
To create a role, click Add Role on this page, enter in the name and description of the role,
and click Save.
To create a user, after selecting the right tenant, click on Users in the left menu bar.
Figure 3. Users
This menu option brings you to the user list page. On the right side of the empty user list,
you should see an Add User button. Click that to start creating your new user.
The only required field is Username. Click Save. This will bring you to the management
page for your new user.
User role mappings can be assigned individually to each user through the Role Mappings
tab for that single user.
In the above example, we are about to assign the role NG_Admin that was created in the
Create new Roles chapter.
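Because ngAuth is repackaged Keycloak, the role and user operations described above can usually also be scripted with Keycloak's kcadm.sh admin CLI, if it is present in your installation. The server URL, realm name, user names and role names below are all illustrative:

```sh
# Authenticate the admin CLI (adjust URL and credentials to your setup)
kcadm.sh config credentials --server https://myhost/auth --realm master --user superadmin

# Create a realm-level role in the tenant's realm
kcadm.sh create roles -r mytenant -s name=NG_Admin -s 'description=ngScreener admin role'

# Create a user and assign the role to it
kcadm.sh create users -r mytenant -s username=jdoe -s enabled=true
kcadm.sh add-roles -r mytenant --uusername jdoe --rolename NG_Admin
```

This is convenient for provisioning several tenants consistently; the admin console steps above remain the supported path described in this guide.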
Many companies have existing user databases that hold information about users and their
credentials.
The way it works is that when a user logs in, ngAuth will look into its own internal user
store to find the user. If it can’t find it there it will iterate over every User Storage provider
you have configured for the realm until it finds a match. Data from the external store is
mapped into a common user model that is consumed by the ngAuth runtime. This common
user model can then be mapped to OIDC token claims and SAML assertion attributes.
External user databases rarely have every piece of data needed to support all the features
that ngAuth has. In this case, the User Storage Provider can opt to store some things locally
in the ngAuth user store. Some providers even import the user locally and sync periodically
with the external store. All this depends on the capabilities of the provider and how it’s
configured. For example, your external user store may not support OTP. Depending on the
provider, this OTP can be handled and stored by ngAuth.
To add a storage provider go to the User Federation left menu item in the Admin
Console.
On the center, there is an Add Provider list box. Choose the provider type you want to
add and you will be brought to the configuration page of that provider.
If a User Storage Provider fails, that is, if your LDAP server is down, you may have trouble
logging in and may not be able to view users in the admin console. ngAuth does not catch
failures when using a Storage Provider to look up a user; it will abort the invocation. So, if
you have a Storage Provider with a higher priority that fails during user lookup, the login or
user query will fail entirely with an exception and abort. It will not fail over to the next
configured provider.
The local ngAuth user database is always searched first to resolve users before any LDAP
or custom User Storage Provider. You may want to consider creating an admin account that
is stored in the local ngAuth user database just in case any problems come up in
connecting to your LDAP and custom back ends.
Each LDAP and custom User Storage Provider has an enable switch on its admin console
page. Disabling the User Storage Provider will skip the provider when doing user queries so
that you can view and login with users that might be stored in a different provider with
lower priority. If your provider is using an import strategy and you disable it, imported
users are still available for lookup, but only in read only mode. You will not be able to
modify these users until you re-enable the provider.
ngAuth comes with a built-in LDAP/AD provider. It is possible to federate multiple different
LDAP servers in the same ngAuth realm. You can map LDAP user attributes into the ngAuth
common user model. By default, it maps username, email, first name, and last name, but
you are free to configure additional mappings. The LDAP provider also supports password
validation via LDAP/AD protocols and different storage, edit, and synchronization modes.
To configure a federated LDAP store go to the Admin Console. Click on the User
Federation left menu option. When you get to this page there is an Add Provider
select box. You should see ldap within this list. Selecting ldap will bring you to the LDAP
configuration page.
Storage Mode
By default, ngAuth will import users from LDAP into the local ngAuth user database. This
copy of the user is either synchronized on demand, or through a periodic background task.
The one exception to this is passwords. Passwords are not imported and password
validation is delegated to the LDAP server. The benefit of this approach is that all ngAuth
features will work, as any extra per-user data that is needed can be stored locally. This
approach also reduces load on the LDAP server as uncached users are loaded from the
ngAuth database the 2nd time they are accessed. The only load your LDAP server will have
is password validation. The downside to this approach is that when a user is first queried,
this will require a ngAuth database insert. The import will also have to be synchronized with
your LDAP server as needed.
Alternatively, you can choose not to import users into the ngAuth user database. In this
case, the common user model that the ngAuth runtime uses is backed only by the LDAP
server. This means that if LDAP does not support a piece of data that an ngAuth feature
needs, that feature will not work. The benefit of this approach is that you avoid the
overhead of importing and synchronizing a copy of each LDAP user into the ngAuth user
database.
This storage mode is controlled by the Import Users switch. Set it to On to import users.
Edit Mode
Users, through the User Account Service, and admins through the Admin Console have the
ability to modify user metadata. Depending on your setup you may or may not have LDAP
update privileges. The Edit Mode configuration option defines the edit policy you have
with your LDAP store.
WRITABLE
Username, email, first name, last name, and other mapped attributes and passwords
can all be updated and will be synchronized automatically with your LDAP store.
UNSYNCED
Any changes to username, email, first name, last name, and passwords will be stored
in ngAuth local storage. It is up to you to figure out how to synchronize back to LDAP.
This allows ngAuth deployments to support updates of user metadata on a read-only
LDAP server. This option only applies when you are importing users from LDAP into
the local ngAuth user database.
Priority
The priority of this provider when looking up users or adding a user.
Sync Registrations
Enable this switch if your LDAP store supports adding new users and you want users
created by ngAuth (in the Admin Console or on the registration page) to be added to LDAP.
Other options
The rest of the configuration options should be self-explanatory. You can hover over
the tooltips in the Admin Console to see more details about them.
When you configure a secured connection URL to your LDAP store (for example
ldaps://myhost.com:636), ngAuth will use SSL for the communication with the LDAP
server. It is important to properly configure a truststore on the ngAuth server side;
otherwise ngAuth cannot trust the SSL connection to LDAP.
The global truststore for ngAuth can be configured with the Truststore SPI. Please
check out the {installguide_name} for more detail. If you do not configure the Truststore
SPI, the truststore will fall back to the default mechanism provided by Java (either the file
provided by the system property javax.net.ssl.trustStore, or the cacerts file from
the JDK if the system property is not set).
There is a configuration property Use Truststore SPI in the LDAP federation provider
configuration, where you can choose whether the Truststore SPI is used. By default, the
value is Only for ldaps, which is fine for most deployments. The Truststore SPI will
only be used if the connection to LDAP starts with ldaps.
If you have import enabled, the LDAP Provider will automatically take care of
synchronization (import) of needed LDAP users into the ngAuth local database. As users log
in, the LDAP provider will import the LDAP user into the ngAuth database and then
authenticate against the LDAP password. This is the only time users will be imported. If you
go to the Users left menu item in the Admin Console and click the View all users
button, you will only see those LDAP users that have been authenticated at least once by
ngAuth. It is implemented this way so that admins don’t accidentally try to import a huge
LDAP DB of users.
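The import-on-first-login behavior described above can be sketched as a toy model. This is not ngAuth's implementation: the function name, the dictionary shapes, and the plain-text password comparison are illustrative assumptions standing in for real LDAP bind semantics.

```python
# Illustrative sketch (not ngAuth internals): users are copied into the
# local database only the first time they authenticate, which is why
# "View all users" shows only LDAP users that logged in at least once.

def authenticate(username, password, ldap, local_db):
    """ldap: dict username -> {'attrs': ..., 'password': ...} (toy stand-in).
    local_db: dict acting as the ngAuth user database."""
    entry = ldap.get(username)
    if entry is None:
        return False
    if username not in local_db:
        # First login: import the user's attributes, but never the password.
        local_db[username] = dict(entry["attrs"])
    # Password validation is always delegated to the LDAP server.
    return password == entry["password"]
```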
If you want to sync all LDAP users into the ngAuth database, you may configure and enable
the Sync Settings of the LDAP provider you configured. There are two types of
synchronization:
The best way to handle syncing is to click the Synchronize all users button when you
first create the LDAP provider, then set up a periodic sync of changed users. The
configuration page for your LDAP Provider has several options to support you.
LDAP Mappers
LDAP mappers are listeners that are triggered by the LDAP Provider at various
points and provide another extension point to LDAP integration. They are triggered when a
user logs in via LDAP and needs to be imported, during ngAuth-initiated registration, or
when a user is queried from the Admin Console. When you create an LDAP Federation
provider, ngAuth will automatically provide a set of built-in mappers for this provider. You
are free to change this set, create new mappers, or update/delete existing ones.
FullName Mapper
This allows you to specify that the full name of the user, which is saved in some LDAP
attribute (usually cn), will be mapped to the firstName and lastName attributes in the
ngAuth database. Having cn contain the full name of the user is a common case for some
LDAP deployments.
Role Mapper
This allows you to configure role mappings from LDAP into ngAuth role mappings.
One Role mapper can be used to map LDAP roles (usually groups from a particular
branch of LDAP tree) into roles corresponding to either realm roles or client roles of
a specified client. You can configure multiple Role mappers for the same
LDAP provider. For example, you can specify that role mappings from groups under
ou=main,dc=example,dc=org will be mapped to realm role mappings and role
mappings from groups under ou=finance,dc=example,dc=org will be mapped to
client role mappings of client finance .
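The two-branch example above can be sketched as follows. This is a toy illustration, not ngAuth's mapper logic: the DN parsing is deliberately simplified and the function name is an assumption for the example.

```python
# Illustrative sketch of the example above: two Role mappers on the same
# LDAP provider, one turning groups under ou=main,dc=example,dc=org into
# realm roles and one turning groups under ou=finance,dc=example,dc=org
# into client roles of the client "finance".

def map_group_dn(group_dn):
    """Return (scope, role_name) for a toy group DN like
    'cn=admins,ou=main,dc=example,dc=org'. Parsing is simplified."""
    parts = group_dn.split(",")
    role = parts[0].split("=", 1)[1]   # cn=<role name>
    branch = ",".join(parts[1:])       # the remainder of the DN
    if branch == "ou=main,dc=example,dc=org":
        return ("realm", role)
    if branch == "ou=finance,dc=example,dc=org":
        return ("client:finance", role)
    return (None, role)                # no mapper matches this branch
```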
Group Mapper
This allows you to configure group mappings from LDAP into ngAuth group mappings.
Group mapper can be used to map LDAP groups from a particular branch of an LDAP
tree into groups in ngAuth. It will also propagate user-group mappings from LDAP
into user-group mappings in ngAuth.
By default, there are User Attribute mappers that map basic ngAuth user attributes like
username, firstname, lastname, and email to corresponding LDAP attributes. You are free
to extend these and provide additional attribute mappings. The Admin Console provides
tooltips with more details.
Password Hashing
When the password of a user is updated from ngAuth and sent to LDAP, it is always sent in
plain text. This is different from updating the password in the built-in ngAuth database,
where hashing and salting are applied to the password before it is stored. In the case of
LDAP, ngAuth relies on the LDAP server to provide hashing and salting of passwords.
Most LDAP servers (Microsoft Active Directory, RHDS, FreeIPA) provide this by default.
Some others (OpenLDAP, ApacheDS) may store passwords in plain text by default, and
you may need to explicitly enable password hashing for them. See the documentation of
your LDAP server for more details.
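To illustrate what "hashing and salting" means for locally stored passwords, here is a minimal sketch. PBKDF2-SHA256 is chosen only for the illustration; it is an assumption, not necessarily the algorithm ngAuth's local database actually uses.

```python
import hashlib
import os

# Illustration of salted password hashing as applied before a password is
# stored in a local database. The algorithm and iteration count here are
# assumptions for the sketch, not ngAuth's documented settings.

def hash_password(password, salt=None, iterations=27500):
    salt = salt if salt is not None else os.urandom(16)  # random per-user salt
    digest = hashlib.pbkdf2_hmac("sha256", password.encode(), salt, iterations)
    return salt, digest

def verify_password(password, salt, digest, iterations=27500):
    return hashlib.pbkdf2_hmac("sha256", password.encode(), salt, iterations) == digest
```

Because each password gets its own random salt, two users with the same password still end up with different digests in the database.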
Click on the "User Federation" left menu to access the Federation part, then choose ldap to
create a new connection.
After clicking on Save, you reach the LDAP administration page. Now it is time to configure
some mappers to map data from LDAP to the ngAuth data model. The most important one
for us is the Role mapper, which makes it possible to map roles from the LDAP server to
LDAP users.
An Identity Broker is an intermediary service that connects multiple service providers with
different identity providers. As an intermediary service, the identity broker is responsible
for creating a trust relationship with an external identity provider in order to use its
identities to access internal services exposed by service providers.
From a user perspective, an identity broker provides a user-centric and centralized way to
manage identities across different security domains or realms. An existing account can be
linked with one or more identities from different identity providers or even created based
on the identity information obtained from them.
An identity provider is usually based on a specific protocol that is used to authenticate and
communicate authentication and authorization information to their users. It can be a social
provider such as Facebook, Google or Twitter. It can be a business partner whose users
need to access your services. Or it can be a cloud-based identity service that you want to
integrate with.
• SAML v2.0
• OpenID Connect v1.0
In the next sections we’ll see how to configure and use ngAuth as an identity broker,
covering some important aspects such as:
When using ngAuth as an identity broker, users are not forced to provide their credentials
in order to authenticate in a specific realm. Instead, they are presented with a list of identity
providers from which they can authenticate.
You can also configure a default broker. In this case the user will not be given a choice, but
instead be redirected directly to the parent broker.
The following diagram demonstrates the steps involved when using ngAuth to broker an
external identity provider:
There are some variations of this flow that we will talk about later. For instance, instead of
presenting a list of identity providers, the client application can request a specific one. Or
you can tell ngAuth to force the user to provide additional information before federating his
identity.
As you may notice, at the end of the authentication process ngAuth will always issue its own
token to client applications. What this means is that client applications are completely
decoupled from external identity providers. They don’t need to know which protocol (e.g.
SAML, OpenID Connect, OAuth) was used or how the user’s identity was validated. They
only need to know about ngAuth.
It is possible to automatically redirect to an identity provider instead of displaying the login
form. To enable this, go to Authentication and select the Browser flow. Then click on Config
for the Identity Provider Redirector authenticator. Set Default Identity
Provider to the alias of the identity provider you want to automatically redirect users to.
If the configured default identity provider is not found the login form will be displayed
instead.
ngAuth can broker identity providers based on the OpenID Connect protocol. These IDPs
must support the Authorization Code Flow as defined by the specification in order to
authenticate the user and authorize access.
To begin configuring an OIDC provider, go to the Identity Providers left menu item
and select OpenID Connect v1.0 from the Add provider drop down list. This will
bring you to the Add identity provider page.
You must define the OpenID Connect configuration options as well. They basically describe
the OIDC IDP you are communicating with.
You can also import all this configuration data by providing a URL or file that points to
OpenID Provider Metadata (see OIDC Discovery specification). If you are connecting to a
ngAuth external IDP, you can import the IDP settings from the url
<root>/auth/realms/{realm-name}/.well-known/openid-configuration.
This link is a JSON document describing metadata about the IDP.
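Constructing the discovery URL from a root URL and realm name can be sketched as follows; the helper name is an assumption, and the URL pattern is the one given above.

```python
# Build the OIDC discovery (well-known) URL for an external ngAuth IDP
# from its root URL and realm name, following the pattern shown above.

def discovery_url(root, realm):
    return "{}/auth/realms/{}/.well-known/openid-configuration".format(
        root.rstrip("/"), realm)
```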
ngAuth can broker identity providers based on the SAML v2.0 protocol.
To begin configuring a SAML v2.0 provider, go to the Identity Providers left menu
item and select SAML v2.0 from the Add provider drop down list. This will bring you to
the Add identity provider page.
You must define the SAML configuration options as well. They basically describe the SAML
IDP you are communicating with.
You can also import all this configuration data by providing a URL or XML file that points to
the entity descriptor of the external SAML IDP you want to connect to.
SP Descriptor
Once you create a SAML provider, there is an EXPORT button that appears when viewing
that provider. Clicking this button will export a SAML SP entity descriptor which you can use
to import into the external IDP.
http[s]://{host:port}/auth/realms/{realm-name}/broker/{broker-alias}/endpoint/descriptor
6.1. Introduction
The User Management module is used to handle user authorization in the system (user
authentication is managed by the NG|Auth module) through user roles.
NG|Screener is always installed in multi-tenant mode, which enables each and every login
to be contextual to one tenant (i.e. one of the hosted banks or a specific internal bank unit)
and, as such, isolated from the other tenants.
#--------------------------------------------------------------------
# Multi-Tenancy
# List of tenants, must be in Upper case. Must not be empty.
# Example: multiTenancy.tenants = TENANT1,TENANT2,TENANT3
multiTenancy.tenants = DEFAULT
Each of the tenants defined through the multiTenancy.tenants property should have a
corresponding realm in NG|Auth.
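The constraints stated in the configuration comment (upper case, non-empty, comma-separated) can be sketched as a small validation helper; the function name is an assumption for the example.

```python
# Sketch of how the multiTenancy.tenants property value can be parsed and
# validated: a non-empty, comma-separated list of upper-case tenant names.

def parse_tenants(value):
    tenants = [t.strip() for t in value.split(",") if t.strip()]
    if not tenants:
        raise ValueError("multiTenancy.tenants must not be empty")
    for t in tenants:
        if t != t.upper():
            raise ValueError("tenant names must be upper case: " + t)
    return tenants
```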
In addition to pure authentication management (i.e. checking that a user is who she claims
to be), NG|Auth also associates so-called roles with each user. Roles are only plain names
at this level. The default installation creates only one role, NG_Admin.
In NG|Screener, this mapping may be customized through the UI: new applicative roles may
be defined there (their names must correspond to role names configured in NG|Auth), and
the corresponding functionalities associated with them.
The script createRoleKeycloak.py must be used to add roles inside
NG|Auth. The parameters are the following:
User creation
The script createUserKeycloak.py must be used to add users with roles inside
NG|Auth. The parameters are the following:
7.1. Introduction
NG|Admin is the preferred tool to perform operations through the command line instead of
using NG|ScreenerUI, where only a small subset of administration operations is available.
NG|Admin can be used locally or remotely. For example you can add, delete, extract and list
the controls or channels/targets to/from the server.
It can be reached using the ngadmin command line tool directly with the command name
and parameters; this allows you to issue commands quickly without the need to specify
credentials, since those are found in two files located in the current user’s home directory:
Its first parameter must be the NG|Screener user to use for login, whereas
the password can be supplied by one of the following means:
$ generate_ngadmin_credentials.sh MyUser
Please enter MyUser's password:
Once the script has gathered both user name and password, it will generate
the corresponding files for the current user (i.e. the user running the
script).
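The two credential files the wrapper relies on (named in the multi-tenancy section below as ~/.ngadmin/ngadminUser and ~/.ngadmin/ngadminPassword) can be sketched as follows. This is illustrative only: whether the real script encodes the password before writing it is not covered here, and the helper names are assumptions.

```python
import os

# Sketch of the credential files used by ngadmin: the user name is read
# from ~/.ngadmin/ngadminUser and the password from
# ~/.ngadmin/ngadminPassword. Real files may store an encoded password.

def write_credentials(home, user, password):
    cred_dir = os.path.join(home, ".ngadmin")
    os.makedirs(cred_dir, exist_ok=True)
    with open(os.path.join(cred_dir, "ngadminUser"), "w") as f:
        f.write(user + "\n")
    with open(os.path.join(cred_dir, "ngadminPassword"), "w") as f:
        f.write(password + "\n")

def read_credentials(home):
    cred_dir = os.path.join(home, ".ngadmin")
    with open(os.path.join(cred_dir, "ngadminUser")) as f:
        user = f.read().strip()
    with open(os.path.join(cred_dir, "ngadminPassword")) as f:
        password = f.read().strip()
    return user, password
```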
7.2. Multi-tenancy
There is a tenant parameter on the wrapper script’s command line (the user is, as usual,
taken from the ~/.ngadmin/ngadminUser, password from
~/.ngadmin/ngadminPassword file)
NAME DESCRIPTION
aggregator_exportProfilingAggregations Export profiling aggregations
aggregator_importProfilingAggregations Import profiling aggregations
aggregator_renameProfilingAggregation Rename profiling aggregations
aggregator_recomputeAggregations Recompute profiling aggregations
aggregator_recomputePeerGroups Recompute profiling peer groups
control_addClassification Add classification
control_addControls Add the controls
control_addOrUpdateControls Add or update controls
control_addTargets Add the report targets
control_delControls Delete the controls
control_delTargets Delete the report targets
control_exportSolutionsDoc Export Solutions document
control_exportReports Run controls and export their output (report)
as JasperPrint, PDF, and a PNG thumbnail of
the first page
control_extractControls Extract controls
control_extractTargets Extract the report targets
control_listControls List the controls
control_listSolutions List Solutions
control_listProfilingVariableWeights List profiling controls' variable weights
Escape any spaces in command arguments or options with a backslash character (\).
For example, the following command exports all targets named My Target:
7.5.1. Syntax
ngadmin showDaemonVersion
7.6.1. Syntax
• Add control(s):
Command: control_addControls
Usage examples:
Command: control_addOrUpdateControls
Usage example:
• Delete controls:
Command: control_delControls
• SOL_1/control1
• SOL_1/contr*
• SOL/*
• …
Usage examples:
• Extract controls:
Command: control_extractControls
• SOL_1/control1
• SOL_1/contr*
• SOL/*
• …
Usage example:
# export all controls following given pattern into given ZIP file
ngadmin control_extractControls -f /home/MyReports.zip '*/*/My*'
Command: control_exportReports
-s, --save Format(s) to save output in, as a list among pdf, png,
csv and jrprint (use the option several times to ask for
several formats)
• SOL_1/control1
• SOL_1/contr*
• SOL/*
• …
Usage example:
# run all controls following the given pattern, for the given time
# frame, without exporting any actual report (no --save option
# used), although PDF reports are of course generated and accessible
# from the UI later
ngadmin control_exportReports --from 2014-01-25T00:00:00 \
--to 2014-01-30T23:59:59 'MySolution/*'
Command: control_exportSolutionsDoc
• SOL*
• *
• …
Usage example:
• List controls:
Command: control_listControls
<arg>… List of IDs or patterns to search for the controls (if empty,
all controls will be listed)
• SOL1/control1
• SOL/contr*
• SOL/*
• …
Usage examples:
# list all controls with specific naming convention, wherever they are
ngadmin control_listControls "*/My*"
# list all controls using several filters (only one match is necessary
# for a control to be listed)
ngadmin control_listControls '*/*open*' '*/*net*'
• List Solutions:
Command: control_listSolutions
• SOL*
• *
• …
Usage examples:
Command: control_listProfilingVariableWeights
<arg>… List of IDs or patterns to search for the controls (if empty,
all simple profiling controls will be listed)
• SOL1/control1
• SOL/contr*
• SOL/*
• …
Output is a JSON file like the following one (pretty-printed here for readability):
[
{
"name":"Pr03 - Unusual Applications",
"id":99,
"variables": [
{
"id":1184,
"type":"AGGREGATION",
"name":"application_day",
Usage examples:
Command: control_setProfilingVariableWeights
-f, --file JSON file from which to extract the new variable weights
As input file, a JSON file like the following one (pretty-printed here for readability) is
expected:
{
"application_day": 3.14,
"user_application_day": 2.71,
"user_source_ip": 1.414,
"user_source_terminal": 1.732
}
If the variables' names do not match those in the control, the operation fails. This behavior
prevents accidentally resetting one control’s variable weights with data actually intended
for another control.
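The safety check described above can be sketched as follows, assuming an exact-match rule between the weight file's variable names and the control's variables; the helper name is an assumption.

```python
# Sketch of the safety check: new weights are applied only if the variable
# names in the JSON file exactly match the control's variables; otherwise
# the operation fails, protecting against weights meant for another control.

def apply_weights(control_variables, new_weights):
    """control_variables: dict name -> current weight.
    new_weights: dict name -> new weight, e.g. loaded from the JSON file."""
    if set(new_weights) != set(control_variables):
        raise ValueError("variable names do not match the control")
    control_variables.update(new_weights)
    return control_variables
```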
Usage examples:
The following commands are used to import, export and delete report targets.
7.7.1. Syntax
Command: control_addTargets
Usage examples:
Command: control_delTargets
• channelType1/channelName1/target1
• channelType*/*/target*
• */*/*
• …
Usage examples:
Command: control_extractTargets
• channelType1/channelName1/target1
• channelType*/*/target*
• */*/*
• …
Usage examples:
Command: control_listTargets
<arg>… List of IDs or patterns to search (if empty, all targets will
be listed)
• channelType1/channelName1/target1
• channelType*/*/target*
• */*/*
• …
Usage examples:
# list all targets with specific name prefix, wherever they are
ngadmin control_listTargets '*/*/my*'
The following are some useful utilities to interact with the control database.
7.8.1. Syntax
• Add classification:
Command: control_addClassification
Usage examples:
Command: control_removeOldExecutions
Usage example:
Profiling aggregations are used by profiling controls. A profiling control may utilize one or
more profiling aggregations. When importing or exporting a profiling control with ngadmin,
its corresponding aggregations are imported or exported too, so there is no need to handle
the aggregations separately.
These import/export commands are useful when you want to import/export a profiling
aggregation which is not associated with a profiling control, or in case you want to
import/export an aggregation without importing/exporting the associated profiling control.
Command: aggregator_exportProfilingAggregations
This command is used to export all profiling aggregations into an XML file on the
filesystem. The -f option is used to indicate the destination file path (location should be
writeable by the ng-screener user).
Usage example:
Command: aggregator_importProfilingAggregations
This command is used to import profiling aggregations from an XML file. The -f option is
used to indicate the source file path, which should be readable by the ng-screener user.
Usage example:
Command: aggregator_recomputeAggregations
This command is used to recompute some or all of the profiling aggregations. The
recalculation process itself is performed asynchronously.
Usage example:
Command: aggregator_renameProfilingAggregation
Usage example:
Profiling peer groups are used by profiling controls. A profiling control may utilize at most
one profiling peer group.
Command: aggregator_recomputePeerGroups
<arg>… Peer group names separated by spaces (if empty, all peer
groups will be recomputed)
This command is used to recompute some or all of the profiling peer groups. The
recalculation process itself is performed asynchronously.
Usage example:
Command: control_importSolution
-ot, --owner-type Type of forced owner ('user' vs. 'role', defaults to 'user')
--exclude-targets If true then don’t import targets and channels. Default: false
This command is used to import a solution ZIP containing the definition of:
• controls
• aggregations
• targets
• channels
Usage example:
Command: dashboard_listDashboards
Accepted values:
• control
• forensic
• all
• dashboard1
• dashboard*
• …
-h This help
Usage example:
Command: dashboard_importDashboards
-h This help
Usage examples:
Command: dashboard_exportDashboards
Accepted values:
• control
• forensic
• all
• dashboard1
• dashboard*
• …
-h This help
Usage example:
Command: referencedata_reloadCaches
This command is used to reload caches in the system by executing the corresponding
queries. Notice that it does not reload the configuration files; if the cache configuration
files need to be reloaded, a restart of NG|Screener is required.
Reloading a cache only adds or updates values and does not remove values that no longer
exist, which may leave stale entries in the cache. To reload a cache with up-to-date values,
use the --clear option to clear the cache before reloading.
Option -g is used to reload specific cache groups. Its parameters are the names of the
cache groups, separated by commas or provided through several -g options. If this option
is not specified, all cache groups in the system are reloaded.
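The -g handling described above can be sketched as a small helper (hypothetical, not part of ngadmin): group names may be comma-separated within one -g value or spread over several -g options, and an empty list means all groups.

```python
# Sketch: normalize the -g option values into one flat list of cache
# group names. An empty result means "reload all cache groups".

def collect_groups(g_options):
    """g_options: list of raw -g values, e.g. ["users,accounts", "terminals"]."""
    groups = []
    for value in g_options:
        groups.extend(name.strip() for name in value.split(",") if name.strip())
    return groups
```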
Usage example:
Command: referencedata_listCacheEntries
-t, --keyFormats Formats of input keys, currently only used for date formats
-v, --values Value column names, used to restrict the listed values
associated with the cache keys
Usage examples:
Command: referencedata_listCaches
Usage example:
Field mapping is a module that maps technical names to business names in ngScreenerUI.
Command: fieldMapping_importFieldMapping
Users can remove all field mappings by importing an empty JSON array ( [] ) file with the
-force option.
Remember to restart ngScreenerUI after executing this command for the changes to
become visible!
Command: fieldMapping_exportFieldMapping
Command: datacapturealerting_listAlertingPolicies
• policy1
• policy*
• …
-h This help
Usage example:
Command: datacapturealerting_importAlertingPolicies
-h This help
Usage examples:
Command: datacapturealerting_exportAlertingPolicies
<args> Policies' name patterns to export (if empty, all policies will
be exported)
• policy1
• policy*
• …
-h This help
Usage example:
When a poll finishes, the polling system saves its last status to the status file. At the next
poll, it polls only the logs that are new since the last status, which avoids polling duplicate
logs from the server. For more information on the polling system, please refer to its own
documentation, ngPollingSystem_Admin_Guide.
This command presents the contents of all .pollstatus and .nextpoll files located in
a specific folder.
Syntax
Command: polling_listStatus
-o, --outputFormat Next poll time output format (for instance dd-MM-yy
HH:mm:ss z, which is the default)
Usage examples:
Syntax
Command: polling_readPollStatusFile
Syntax
Command: polling_readNextPollFile
-o, --outputFormat Next poll time output format (for instance dd-MM-yy
HH:mm:ss z, which is the default)
# read and display the next poll time from given file
ngadmin polling_readNextPollFile \
-f /home/user/pollingStatsBackup/test@test2.nextpoll
# read and display the next poll time from given file with given format
ngadmin polling_readNextPollFile \
-f /home/user/pollingStatsBackup/test@test2.nextpoll \
-o "dd-MM-yy HH:mm:ss z"
This command updates the content of a .pollstatus file. If the given file doesn’t exist, it
will be created automatically. This does not necessarily mean that the value will be picked
up by a running connector: if you want to update the .pollstatus file of a running
connector, you need to stop the daemon, update the status file and restart the daemon to
be sure that the connector picks up the new status.
The polling status file keeps the last polling status. At the next poll, it polls only new logs
since the last status. For that reason, the polling status must be comparable to determine
the new logs to poll.
Syntax
Command: polling_updatePollStatusFile
-t, --type Polling status comparable object type (see above, STRING
being the default)
Usage examples:
This command updates the content of a .nextpoll file. If the given file doesn’t exist, it will be
created automatically.
For that new value to be picked up by the polling system, you need to restart it by running
service polling-system restart.
Command: polling_updateNextPollFile
-i, --inputFormat Time value input format (e.g. dd-MM-yy HH:mm:ss z; if not
set, the time value is expected to be milliseconds since
January 1st 1970, 00:00 GMT)
Usage examples:
# set the next poll time to Tuesday, September 18, 2012 6:00:00 PM (GMT)
ngadmin polling_updateNextPollFile \
-f /home/user/pollingStatsBackup/test@test2.nextpoll 1347991200000
# set the next poll time to Tuesday, September 18, 2012 8:00:00 PM (CEST)
ngadmin polling_updateNextPollFile \
-f /home/user/pollingStatsBackup/test@test2.nextpoll \
-i "dd-MM-yy HH:mm:ss z" "18-09-2012 20:00:00 CEST"
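The two examples above denote the same instant: 20:00 CEST (UTC+2) is 18:00 GMT, which is 1347991200000 milliseconds since the epoch. This can be checked with a short sketch (the helper name is an assumption):

```python
from datetime import datetime, timezone, timedelta

# The next-poll value is milliseconds since January 1st 1970, 00:00 GMT.
# Convert a timezone-aware datetime to that representation.

def to_epoch_millis(dt):
    return int(dt.timestamp() * 1000)

cest = timezone(timedelta(hours=2))  # CEST is UTC+2
next_poll = to_epoch_millis(datetime(2012, 9, 18, 20, 0, 0, tzinfo=cest))
# next_poll == 1347991200000, the value used in the first example
```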
The next sections present licensing related commands - showing a license’s information,
extracting a C2V file and updating a license.
Syntax
Command: licensing_showLicenseInformation
This command extracts a C2V file from the appliance. This file is needed by NetGuardians to
activate your license.
Syntax
Command: licensing_extractC2V
Usage example:
Syntax
Command: licensing_updateLicense
-s, --skip Skip checking if the file is valid for update (default is false)
To skip checking the C2V file before updating, the -s or --skip option can be used. This is
useful if you want to install multiple licenses on the same machine.
Usage examples:
# update the license with the given file and without checking
# first if the existing license is the same, potentially resulting
# in multiple licenses being installed
ngadmin licensing_updateLicense -s -f /home/user/license.v2c
7.18.1. Syntax
Command: util_encodePassword
When using the util_encodePassword command directly, it asks for the password in
clear text as an argument.
When using the ngadmin wrapper, it does not accept the password in clear text as a
parameter; it asks for the password at a safe prompt instead. This prevents the clear-text
password from being captured in the command history.
This command recreates the search index used by NG|ScreenerUI from scratch. It can be
used to solve problems with objects that cannot be found through the search functionality
in the user interface. Objects (SOL’s, Controls) have to be indexed in order to be searchable.
If one object is modified outside of NG|ScreenerUI or NG|Admin, this command might be
helpful to make those changes visible to the search component.
Command: search_reindexAll
Usage example:
This section presents commands that are useful when working with NG|Storage.
It might happen that new log files are added to log-collector with a past date (when, for
example, a new service has been configured) or data was pruned using
data_removeEntries. data_launchInitialProcessing forces a re-analysis and
reloading of the missing entries in NG|Storage.
Syntax
Command: data_launchInitialProcessing
Usage example:
There are various cases where partially removing some data might be helpful:
Without a host/service and time frame provided, this command will remove
all the data from NG|Storage.
Syntax
Command: data_removeEntries
-o, --host Host pattern to select data to clean (* is the default, stands
for all)
-f, --from Date pattern to select data to clean (lower bound), expected
format is dd-MM-yyyy
Usage examples:
Command used to remove violations for a specified control from the log-collector and
NG|Storage.
There are various cases where partially removing some violations might be helpful:
• During a POC
• When a control has been deleted and we won’t keep its violations
Syntax
Usage examples:
In case log files are deleted from log-collector and we want data in NG|Storage to be kept
in sync, data_sanitize should be used. The command will remove all the data in
NG|Storage for the specified period and files not present in the log-collector.
Syntax
Command: data_sanitize
Usage examples:
This section presents commands that can be used to import/export forensic filters and
transformations.
Command used to extract all defined forensic transformations and filters to a zip file.
Syntax
Command: forensic_extractFilters
Usage example:
Command used to import forensic transformations and filters from a zip file.
Syntax
Command: forensic_importFilters
Usage example:
8.1. Introduction
Logs are collected and written into /log-collector; they are then used for forensics and to
generate reports. Logs are written in plain text and could be modified. The NG|Integrity
module is designed to detect the modification of logs under /log-collector.
Depending on the running mode, the module can detect that a log file under /log-
collector was modified, or it can detect which log line has been modified, including the
old and the new line before and after the modification.
8.2. Overview
• Daemon: Generate the integrity database periodically with a recurrence interval defined
in the configuration file.
• Single: Generate the integrity database once from the .log.gz and .log audit trail
files, excluding the current day.
• file : Ensure that files were not modified. This mode is lightweight but does not allow
you to know exactly which line was changed.
• line : Ensure the integrity line by line. This mode is heavier but allows you to know
exactly what was modified.
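The difference between the two modes can be illustrated with a small sketch. This is not NG|Integrity's actual implementation; SHA-1 is used here only because it mirrors the signatureType default shown in the configuration, and the helper names are assumptions.

```python
import hashlib

# Illustration of the two integrity modes: "file" keeps one digest per
# file (cheap: detects *that* something changed), while "line" keeps one
# digest per line (heavier: pinpoints *which* line changed).

def file_signature(lines):
    return hashlib.sha1("\n".join(lines).encode()).hexdigest()

def line_signatures(lines):
    return [hashlib.sha1(line.encode()).hexdigest() for line in lines]

def changed_lines(old_sigs, new_lines):
    new_sigs = line_signatures(new_lines)
    return [i for i, (a, b) in enumerate(zip(old_sigs, new_sigs)) if a != b]
```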
defined in the NG|Integrity configuration.
3. NG|Integrity updates the integrity database with the new audit trails.
4. NG|Integrity deletes the output folder after updating the integrity database.
5. NG|Integrity periodically reads the state of the log-collector files and compares
their signatures with the integrity database to ensure integrity is guaranteed.
6. NG|Integrity sends an alert if files were tampered with.
8.3. Configuration
# The signature type is used to create the digester. It can be SHA-1 or MD5, etc.
signatureType=SHA-1
# Destination alert
alert_destination=localhost
8.3.2. Parameters
• service_to_check_n: Defines a service and a host that must be controlled and its
integrity mode (line or file).
• signatureType: The hash algorithm to be used for verification. It can be SHA-1 or MD5.
• integrityDirectory: Directory path where the integrity database structure is created.
• intervalChecking: Number of seconds between two validations.
• intervalDiscovery: Number of seconds between two database updates.
• logFolder: Folder that NG|Integrity ensures integrity of.
The option "-v" or "--verbose" can be used to get verbose output. You can find NG|Integrity’s
log file in the following location:
/var/log/ng-screener/ngIntegrity.log
A user wishes to ensure the integrity of the core banking system audit trails (temenosT24).
The user would like to be notified when the stored core banking audit trails files are
modified. For this case, the NG|Integrity’s configuration would be the following:
# The signature type is used to create the digester. It can be SHA-1, MD5, etc.
signatureType=SHA-1
# Destination alert
alert_destination=localhost
After modifying the configuration file, NG|Integrity should be restarted using the command:
service ngintegrity.ngc start
Now, the integrity of the core banking audit trail files is being monitored. If an integrity violation occurs, an audit trail is written under the NG-Integrity service of the NG-SCREENER host. You can define a control to receive an alert when this happens.
Chapter 9. Reference Data
9.1. Introduction
The data is stored in a key-value format. When a module requires information about a key, it
returns all the corresponding values. For example, Reference Data may store user
information with the key being 'user id'. A request on a specific 'user id' would then return
the corresponding user information (e.g. name, account, department, etc.).
9.2. Overview
When the Reference Data module is initialized, it consults the configuration files to locate
the data sources (e.g. SQL Database, CSV directory, LDAP Directory), then executes the
query on those data sources and stores the results in a specific format. When it finishes
executing those queries and all the data is cached, it is ready to serve other modules.
Data in Reference Data module is constructed as key-value pairs, with a set of key-value
pairs called a 'cache' and denoted as 'keys → values'. 'Keys' and 'values' have multiple
attributes. An example association of user information would be 'user_id → user_name,
branch_id', or 'currency, date → rate'.
A cache is represented as a graph, where 'keys' are the source nodes and 'values' are the
target nodes. From a node, a module can access any attribute in another node if it has a
path from the source node to the target node. For example, with the 2 caches:
• Standard cache: the cache contains entries as key-value pairs and fetches an entry by
exact match. Imagine an account cache which maps account IDs to account names. A
search for an account ID will search for a cache entry with the exact match of that
account ID and if no such account ID is found in the cache, it will return a cache missed
reply.
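The exact-match behaviour of a standard cache can be illustrated with a minimal Python sketch (a hypothetical dict-backed stand-in, not the actual NG|Screener implementation):

```python
# Sentinel object representing a "cache missed" reply
MISS = object()

class StandardCache:
    """Minimal sketch of a standard (exact-match) cache."""

    def __init__(self, entries):
        self._entries = dict(entries)

    def get(self, key):
        # Exact match only: any key not present yields a cache miss
        return self._entries.get(key, MISS)

# Example: an account cache mapping account IDs to account names
accounts = StandardCache({"ACC-1": "Alice", "ACC-2": "Bob"})
```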
9.3. Configuration
The module has two types of configuration: module configuration and cache configuration.
Module configuration defines default values for cache parameters, data source parameters,
etc. for cases where they are not defined in the cache/data source configuration. In
addition, it defines global parameters related to the module. Cache configuration defines
the cache structure and how to populate data in the cache. It also defines additional
parameters to manage cache efficiently.
# CACHE CONFIGURATION
defaultCacheInMemorySize = 50000
defaultCacheEviction = LRU
defaultCacheCaseSensitive = true
defaultCacheKeyMatch = exact
defaultCacheKeyClass = String
defaultCacheKeyFormat = ''
# CACHE MANAGEMENT
defaultCacheRefresh = 0
maxReturnedKeys = 1000
cacheSyncIntervalSeconds = 5
The configuration file has 3 main sections: data source configuration, cache configuration,
and cache management. The data source configuration section declares default
parameters of data sources. Each type of data source (i.e. JDBC, CSV, LDAP) has its
specific parameters. The cache configuration section declares default parameters of a
cache. The cache management section declares other parameters.
• defaultFetchSize: Gives a hint to the underlying DBMS about the maximum number of rows to fetch in a single round trip.
• csvDriver: JDBC driver used to connect to a CSV database. By default, it uses the
csvjdbc driver to establish a connection to a CSV file.
• csvUrlFormat: The URL used to connect to a CSV database. It is built using Java String formatting (https://docs.oracle.com/javase/8/docs/api/java/util/Formatter.html#syntax), with the 'path' provided in the CSV data source configuration as its parameter. In brief, '%s' in this parameter is replaced by the whole 'path' value from the CSV data source configuration. By default, the csvjdbc driver is used to establish a connection to a CSV file, and the default URL would be jdbc:relique:csv:/path/to/csv/folder.
This parameter is optional. The default value is the static part of the URL with the dynamic part as parameter: 'jdbc:relique:csv:%s'.
• csvDefaultHeaderLine: Defines the header columns if they are not present in the first line of the CSV file. This parameter is only considered if the 'suppressHeaders' parameter is set to 'true'.
• defaultCacheEviction: Set the algorithm used to evict existing cache entries from
memory. Possible values:
◦ 'LFU': removes the least frequently used entries from memory
◦ 'LRU': removes the least recently used entries from memory
• defaultCacheKeyMatch: Sets the key matching mode. If the value is 'exact', it will create a simple cache. If the value is 'upper' or 'lower', it will create a comparable cache.
• defaultCacheKeyClass: Java type of the cache key. The key value extracted from the data source is converted to the key class before being stored in the cache. For a normal cache, the key would be 'String'. For comparable caches (i.e. 'keyMatch' is 'upper' or 'lower'), the key class must be comparable. Possible values: String, Long, Int, Integer, Short, Byte, Float, Double, Date, IP. The IP class denotes an IPv4 address in the format XXX.XXX.XXX.XXX, where XXX is in the range 0-255. The input value is case-sensitive. Currently, all supported key classes are comparable.
• maxReturnedKeys: The maximum number of keys returned when getting all cache keys. It is used to prevent OutOfMemoryErrors when a cache has a large number of keys. If this value is set to zero or a negative number, the parameter is ignored and all keys in the cache are returned.
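The 'LRU' eviction policy listed under defaultCacheEviction can be sketched as follows (a hypothetical helper built on OrderedDict, not NG|Screener code):

```python
from collections import OrderedDict

class LRUCache:
    """Minimal LRU eviction sketch: when the cache is full,
    the least recently used entry is removed from memory."""

    def __init__(self, capacity):
        self.capacity = capacity
        self.entries = OrderedDict()

    def get(self, key):
        if key not in self.entries:
            return None
        self.entries.move_to_end(key)  # mark as most recently used
        return self.entries[key]

    def put(self, key, value):
        if key in self.entries:
            self.entries.move_to_end(key)
        self.entries[key] = value
        if len(self.entries) > self.capacity:
            self.entries.popitem(last=False)  # evict least recently used
```

An 'LFU' cache would instead track access counts and evict the least frequently used entry.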
<cachegroup name="group1"
cacheInMemorySize="50000"
cacheRefresh="86400">
<query>
select col1, col2, col3 from table
</query>
<domains>
<domain forValue="col2">
<allowed value="Whitelisted"/>
<allowed value="Blacklisted"/>
<allowed value="Neutral"/>
</domain>
</domains>
<cache name="col1_col2col3" inMemorySize="30000">
col1 -> col2, col3
</cache>
<cache name="col1col2_col3"
keyClass="string, date" keyFormat=", 'dd.MM.yyyy'">
col1, col2 -> col3
</cache>
</cachegroup>
</cacheconfig>
The 'datasource' tag defines a data source and there can only be one such tag in a cache
configuration file. The 'cachegroup' tag defines a cache group, which may contain multiple
caches. There may be multiple cache groups in a cache configuration file.
The data source configuration contains information on how to connect to a data source to
fetch data from.
The data source is defined by the 'type' property. It may take one of the following values:
Different data source types require different parameters. Following is the list of
parameters of a data source and the scope they apply to.
• jdbcDriver: Set the driver used to connect to the data source. Different DBMS
correspond to different drivers. Some examples of those drivers are listed in Table
[tab:jdbc_parameters].
• url: Set the URL for the data source connection. Different data sources expect different
URL formats. Some examples of JDBC URL formats are listed in Table
[tab:jdbc_parameters]. The URLs themselves take the following parameters:
◦ HOSTNAME: host name of the database server, it could be an IP address or a DNS
name
◦ DATABASE_NAME: the database name on the server
• username: The user name used to connect to the database. It binds with the password
to form valid credentials.
• password: The password used to connect to the database. It binds with the username
to form valid credentials. This entry can be either in plain text or an encoded form
provided by the NgAdmin encodePassword command.
• initialQuery: The initial query executed right after opening a new connection to a
database and before executing the query defined in the cache group. This query should
be idempotent, since it could be executed multiple times before executing the queries
defined in cache groups.
• path: The path to the CSV folder. This folder is used as a datasource and all CSV files in
this folder are served as tables.
Cache group configuration contains information on how to create a group of caches with a
common query. It always has a query specified, one or more caches, and may have default
properties for caches in the same group.
a. Properties
Cache groups inherit all the cache properties defined in the module configuration file. If a
property is defined in the cache group, this value will overwrite the corresponding value
from the module configuration file. Following are all valid properties of a cache group:
• name: The name of the cache group. This property is used to distinguish different cache
groups in the same cache configuration file. Therefore different cache groups must
have different names in the same cache configuration file.
• cacheInMemorySize: Same meaning as the defaultCacheInMemorySize
parameter in the module configuration file. If this parameter is defined in the cache
group, it overwrites the value from the module configuration file and becomes the
default value for all caches in the group.
• cacheEviction: Same meaning as the defaultCacheEviction parameter in
module configuration file. If this parameter is defined in the cache group, it overwrites
the value from the module configuration file and becomes the default value for all
caches in the group.
• cacheCaseSensitive: Same meaning as the defaultCacheCaseSensitive
parameter in module configuration file. If this parameter is defined in the cache group,
it overwrites the value from the module configuration file and becomes the default value
for all caches in the group.
b. Query
It specifies the query to execute on the data source to get the results in multiple columns.
Those columns are then used to build the caches in the same group.
<domains>
<domain forValue="col2">
<allowed value="Whitelisted"/>
<allowed value="Blacklisted"/>
<allowed value="Neutral"/>
</domain>
<domain forValue=...>
...
</domain>
</domains>
This defines that variable "col2" may only take one of the three enumerated values:
"Whitelisted", "Blacklisted" or "Neutral".
d. Caches
Cache configuration defines the cache structure as well as its properties. Cache is built
from the result of the query execution. A cache inherits all properties of its cache group
except for 'cacheRefresh'. Following are all valid properties accepted by a cache:
Also, note that the defaultValues and addMissing attributes are only applicable to
simple caches, and that the number of comma-separated values should be the same as the
number of values defined in the cache (see below).
For comparable caches with multiple keys, only one comparable key is currently supported. All other keys must have an 'EXACT' type. The comparable key can have either an 'UPPER' or 'LOWER' type and must be put at the end of the key set. For example, with a cache of key1, key2 → val1, val2 and attribute keyMatch=exact,lower, key1 has match type 'EXACT' and key2 has match type 'LOWER'.
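A comparable cache lookup can be sketched with Python's bisect module. The exact 'LOWER' semantics is an assumption here: we take it to return the entry with the greatest key not exceeding the searched key (e.g. the latest exchange rate on or before a given date); the actual NG|Screener behaviour may differ.

```python
import bisect

class ComparableCache:
    """Illustrative sketch of a comparable cache with a 'LOWER' match."""

    def __init__(self, entries):
        # entries: {comparable_key: value}; keys kept sorted for bisect
        self.keys = sorted(entries)
        self.values = [entries[k] for k in self.keys]

    def get_lower(self, key):
        # Greatest key <= searched key; cache miss below the smallest key
        i = bisect.bisect_right(self.keys, key)
        return self.values[i - 1] if i else None

# Example: 'currency, date -> rate' style lookup reduced to the date key
rates = ComparableCache({"2016-07-01": 1.08, "2016-07-08": 1.10})
```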
Notice that all similar properties in a cache, a cache group and a module configuration have
the same name but different prefixes. The properties in cache have no prefix, the properties
in cache group have 'cache' as prefix and those in module configuration file have
'defaultCache' as prefix.
An example cache structure is col1, col2 → col3, col4, with col1 and col2 as the keys and col3 and col4 as their respective values. The value for col3 or col4 can be fetched for the combination of the col1 and col2 keys.
As mentioned in the previous section, a cache defines a key and multiple values which form a cache structure. A cache structure can be represented as a simple graph where each node is a key or value, and arrows connect key nodes to value nodes. For example, with a cache structure of node1 → node2, node3, the respective graph would be:
Multiple simple graphs like this together compose a larger graph and form a complete cache system. For example, the three following caches:
Note that caches are not reloaded on ng-screener start if the cache was already loaded
before and the cache metadata has not been changed.
If a new cache configuration file is added, ng-screener needs to be restarted to load it.
10.1. Introduction
The feeding module provides the ability to normalize events coming from the log-collector
to the business model used for both control execution and forensic investigation.
The feeding module is responsible for making sure that the data needed for a specific task
is available in NG|Storage.
In day-to-day operation, the content held in NG|Storage for each index pattern is managed by a time-window (meaning that events that do not lie in this time-window are candidates for purging). Time-windows are expressed as a number of days in the past from today (the default is 365 days for all of the following index patterns: ngv-*, ngi-*, ngc-* and ngt-*).
Incoming events from the source systems (mostly through Syslog-NG) are normalized
on-the-fly and continuously contribute to NG|Storage content.
• through the new violations/hits they generate, obviously (this part is clearly outside the
scope of the feeding module, except for the purging-when-out-of-time-window part)
• if the data required by a control’s execution is currently not present in NG|Storage,
◦ either because it lies partially or completely outside of the configured time-window,
◦ or because one service is totally excluded from the feeding module’s stream-like
actions
then the feeding module is also responsible for fetching the required data from the
log-collector, normalizing and feeding it to NG|Storage before the control can be
run. It is therefore possible that the time-window is temporarily overlooked for the
concerned services. In any case the nightly purging operation will erase superfluous
data.
The feeding module relies on an interpretation dictionary provided with each connector,
which contains rules to normalize raw audit trails into the unified data model described
below. For more information on the business model refer to the corresponding chapter.
Each and every event present in Channels (ngc-*), IT Layers (ngi-*), Transactions (ngt-*)
or Violations (ngv-*) indices has the following technical attributes:
For ngv-* indices @timestamp refers to time of violation processing. For all other indices
@timestamp field stores event timestamp.
Each and every event present in Channels (ngc-*), IT Layers (ngi-*), Transactions (ngt-*)
or Violations (ngv-*) indices has the following business attributes:
Events in the Transactions (ngt-*) and Violations (ngv-*) indices have the following specific
attributes:
Events in the Channels (ngc-*) and Violations (ngv-*) indices have the following specific
attributes:
Events in the IT Layers (ngi-*) and Violations (ngv-*) indices have the following specific
attributes:
Events in the Violations (ngv-*) indices have the following specific attributes:
The raw log file may contain technical information such as a user id or account number, whereas we want to display rich information to the end user, such as the user name or account holder. To do that, the feeding module can enrich the normalized event with additional information, by adding to or replacing the content of some fields with other information. All those field mappings are kept in the Reference Data module (Chapter Reference Data). We need to configure the feeding module to use those mappings and translate the events.
Some global JavaScript scriptlets are already supplied by either the daemon itself or the connectors that need them. Such supplied scripts are deposited in the /etc/ng-screener/daemon/modules/feeding/translators/scripts folder.
In multi-tenant mode, the wildcard (*) character is not allowed and the tenant is mandatory in the source definition. The folder contains a sample file in the samples sub-folder. A configuration file looks like the following example:
<translator>
<key>Initiator_User_Name=user_id</key>
<value>Initiator_User_UserId=username</value>
<value action="replace">Initiator_Process_Pid=branch_id</value>
<value action="append">
Initiator_User_Domain=branch_id.branch_name
</value>
</translator>
<scriptedField overwrite="true">
<field>day_of_week</field>
<script>
["Sunday", "Monday", "Tuesday", "Wednesday", "Thursday", "Friday",
"Saturday"][new Date(event['@timestamp']).getDay()]
</script>
</scriptedField>
<scriptedField>
<field>part_of_day</field>
<script>
['0h-6h', '6h-12h', '12h-18h', '18h-24h'][~~(new
Date(event['@timestamp']).getHours() / 6)]
</script>
</scriptedField>
<scriptedField>
<field>transaction_ord_cre_hour</field>
<script>new Date(event['@timestamp']).getHours()</script>
<type>INTEGER</type>
</scriptedField>
</translatorconfig>
• The sources tag specifies the sources of the events for which translators and scripted
fields defined in the file apply; this tag is mandatory and may only appear once
Scripted fields and translators are executed in the order they are defined in the XML file.
10.4.1. Sources
The sources tag may have one or multiple source tags. Each source defines the source
of the event (i.e. host, service) on which the file’s translators and scripted fields will apply.
A source has the format service@host where service is the service name (e.g.
ngaudit, nglicensing) and host the host name (e.g. NG-SCREENER) of the event. Wildcards
(*) can be used to match all the service / host names. For example the source *@* matches
all hosts and services.
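The service@host matching described above can be sketched as follows (a hypothetical helper using Python's fnmatch for the '*' wildcard):

```python
from fnmatch import fnmatch

def source_matches(pattern, service, host):
    # A source pattern 'service@host' matches an event when both the
    # service part and the host part match; '*' matches any name.
    pat_service, pat_host = pattern.split("@")
    return fnmatch(service, pat_service) and fnmatch(host, pat_host)
```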
10.4.2. Translators
Each translator has one or multiple key and value tags. The keys are used as a reference to find the corresponding values in the Reference Data module. Keys and values have the same format, namely Event Field = cache node. The Event Field is a field of the event; it is case-sensitive. The cache node is a node of the cache system as described in Section Cache structure. Chained nodes are specified using a dot separator to give the path from the key node to the target node in the value cache node. Those nodes are used to make requests to the Reference Data module to get the corresponding values of the specified keys.
The keys are extracted from the key event fields of the current event. From these keys,
corresponding values for value tags are retrieved, which are assigned (using the specified
valueSeparator in case of append, otherwise overwriting any existing value) to the value
event fields of the current event.
We can also configure a global valueSeparator in the feeding module configuration file -
feeding.conf. This value is overwritten by any valueSeparator defined in the
translator definition. Default value for valueSeparator is ", ".
The key tag has three properties regex, class and format:
• regex: the regex applied on the key string. It aggregates all matching groups to form a
new key string if the regex matches, otherwise it skips translating the event. This
property is processed before class and format.
The value tag has one property: action. It accepts the following values (case-insensitive):
• Replace: replaces the requested value in the event field. This is the default value if the
property is not specified.
• Append: appends the requested value to current value of the event field (using the
specified valueSeparator).
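The two actions can be sketched as follows (an illustrative helper; the field handling and separator default are based on the description above, not on the actual feeding module code):

```python
def apply_value(event, field, new_value, action="replace", value_separator=", "):
    # 'replace' (the default) overwrites the event field; 'append' joins
    # the new value to the existing one using the valueSeparator.
    if action == "append" and event.get(field):
        event[field] = event[field] + value_separator + new_value
    else:
        event[field] = new_value
    return event
```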
• overwrite (attribute): indicates whether the field value should be replaced if the target field already exists. The default value is false.
• import: optional, multi-valued element indicating that more global script(s) should be imported prior to defining the specific script (from the script field); they are imported in the order they are defined in the scriptedField element, from the base location defined by the importBaseDir global translator config element.
• field: indicates which field the result of the script is put into.
• script: a JavaScript expression used to compute the value for the field. All existing fields are available in a JavaScript array called event (e.g. event['field']).
• type: allows casting value to numeric types. By default, a field’s value is stored as a
string, even though the actual type might be numeric. To force a field to use another
type, use one of:
◦ INTEGER
◦ LONG
◦ DOUBLE
Make sure to set a proper index mapping type for that field in NG|Storage as well.
Example scripted field with type set and import functionality activated:
<scriptedField>
<field>transaction_ord_cre_hour</field>
<import>common.js</import>
<script>new Date(event['@timestamp']).getHours()</script>
<type>INTEGER</type>
</scriptedField>
When the NG|Screener platform is started for the first time, NG|Storage is empty.
NG|Screener will use a background process called initial loading to populate it with data.
Depending on the configured time-windows, the process loads archive events from the log
collector, normalizes and enriches them before pushing them to NG|Storage. During the
initial loading events are loaded from newest to oldest.
If the NG|Screener daemon is stopped for some reason and then re-started, it will resume its loading process from the point where it stopped. The time it actually takes to finish the loading task depends on the performance of the provided infrastructure and the volume of data to be loaded.
NG|Storage creates the following Lucene indices for storing event data:
Example:
• directory: /log-collector/2016/T24Server1/temenosT24Transaction/09-07-2016.log
• created index: ngt-t24server1-temenost24transaction-20160709
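The derivation of the index name from the log-collector path can be sketched as follows (a hypothetical helper; the path layout and the choice of the ngt- prefix per event type are assumptions based on the example above):

```python
import re

def index_name(path, prefix="ngt"):
    # Assumed path layout: /log-collector/<year>/<host>/<service>/<DD-MM-YYYY>.log
    parts = path.strip("/").split("/")
    host, service, filename = parts[2], parts[3], parts[4]
    # The DD-MM-YYYY date in the file name becomes YYYYMMDD in the index
    day, month, year = re.match(r"(\d{2})-(\d{2})-(\d{4})\.log", filename).groups()
    return "%s-%s-%s-%s%s%s" % (prefix, host.lower(), service.lower(), year, month, day)
```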
For further information about clearing the NG|Storage database, please refer to ngAdmin
commands: data_launchInitialProcessing and data_removeEntries.
11.1. Introduction
NG|Screener’s control module gives the ability to extract valuable audit information in a
customized PDF report. This module provides automatic and use-case oriented report
generation where users are able to fully customize their reports (add logo, define charts,
add text around reported information, etc.). NG|Screener control module’s objectives are
the following:
• Provide a clear and comprehensive overview of a specific situation (use case driven)
• Group different information in a single document (charts, listings, textual description,
etc.)
• Schedule controls for periodic delivery of reports or for alerting purpose
Three types of controls are currently available in NG|Screener, ranging from simple PBI ones to the most sophisticated Machine Learning ones. They differ vastly in implementation complexity and, as a result, in the capabilities offered:
11.2. Overview
You can set up a control from the ground up on the control edition page (picture Control View, Ad. 1) or duplicate the current control (picture Control View, Ad. 2). The duplication
You can pick and configure report templates on Template Selection and Template
Configuration tabs (picture Control Edit, Ad. 2 and 3).
This section provides a brief overview of report templates available within NG|Screener.
This tab is used to configure the layout of the generated PDF report. The following options
are available and a live preview is also provided on the right side of the page:
Simple mode
The template selected on the previous screen can be configured here. The exact layout of the page depends on the template selected.
Some elements (titles, descriptions, etc.) may be edited directly in the preview pane on the right hand side of the page (see Template Configuration (simple), Ad. 1 to 4).
On the left hand side of the page, the timeline, chart and table elements relevant to the
current template can be configured. Mandatory fields are highlighted in red, as shown on
Template Configuration (simple), Ad. 6.
The syntax for the Filter field in the tab’s General section (see Template Configuration
(simple), Ad. 5) conforms to ElasticSearch’s query string syntax (https://www.elastic.co/
guide/en/elasticsearch/reference/current/query-dsl-query-string-query.html#query-
string-syntax). This filter is the place to act on when implementing whitelisting (or, on the
contrary, blacklisting) for PBI controls (for profiling controls, such concepts are only defined
at each profiling variable’s level, not globally), as presented in chapter Managing
whitelisting and blacklisting.
Advanced mode
In advanced mode, the tab looks similar to the following screenshot. Again, the exact layout
of the page depends on the template selected.
The attributes used by the control have to be chosen in Template Configuration (advanced),
Ad. 1. Those fields will be provided to the code as data frames to operate on in Template
Configuration (advanced), Ad. 2. The administrator is then free to implement the control’s
functionality in Python 2.7.
The table’s order clause (see picture Template Configuration (advanced), Ad. 3) has a
syntax resembling a standard SQL order-by clause, where the usable column names are
the following:
• section for the section column (i.e. the first one, whatever its actual name in the data
frame)
Note: the section does not exist if the report type is set to Export (as opposed to
Status).
• column1 to columnN (N being the total number of columns in the table) for the table data.
Other code input fields are also present on the tab, as shown on picture Timeline, chart and
table code in advanced configuration, to describe how the timeline, charts and tables
should be populated.
• Input parameter dfMap is a map (indexed by service name) of all input data frames (as
already brought up above, so-called data frames are matrix-like distributable
representations of data), with columns corresponding to the fields selected in the
Selected Fields area (see picture Template Configuration (advanced), Ad. 1).
• Output result can be either a data frame itself or a map (Python’s dict) of data frames.
Note: Our data frames are implemented using the PySpark library, whose documentation can be found at https://spark.apache.org/docs/0.9.0/python-programming-guide.html
Note: The value returned by this function will be provided as an input parameter to all the
other custom functions described below.
aggDF = spark.read.format("es").options(**es_base_read_conf) \
.load(resource="na-my_aggregation/document")
Timeline code
This function receives df as input parameter, which corresponds to what was returned by
the common code's function described above.
The expected output of this function is a data frame with at least one column (any other
column is actually ignored), named @timestamp, containing the data timestamp
(truncating it to the requested interval is taken care of by the framework). The number of
rows associated with a given timestamp (once truncated) will be reflected in the height of
the timeline curve.
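The truncation-and-count step performed by the framework can be pictured with a plain-Python sketch (illustrative only, operating on epoch-millisecond timestamps rather than a data frame):

```python
from collections import Counter

def timeline_buckets(timestamps_ms, interval_ms):
    # Truncate each @timestamp to the start of its interval, then count
    # the rows per bucket; the count drives the height of the timeline curve.
    return Counter(ts - ts % interval_ms for ts in timestamps_ms)
```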
This function receives df as input parameter, which corresponds to what was returned by
the common code's function described above.
The function’s output is expected to be a data frame with at least two columns, the first one
being the distinct key, and the second being the number of elements to associate with the
said key. Ordering in decreasing element number and limiting to a maximum number of
different keys is taken care of by the framework.
This function receives df as input parameter, which corresponds to what was returned by
the common code's function described above.
• its first column used for the section headers (only applicable if the report type is set to
Status, not if it is set to Export, as shown on picture Template Selection, Ad. 2).
• all other columns (up to the number of columns selected in the table section of the
Template Selection tab) are table data, in the order they should be present in the
report.
This tab allows you to specify aggregation variables and scoring variables, give them weights and set a score threshold that marks an event as a hit (i.e. an anomaly).
For discrete variables (i.e. aggregation variables which can only take discrete values, like day-of-week, currency, country…), the following scoring methods are available:
• STATISTICAL
• LOGARITHMIC
The Ignore missing values toggle is only relevant when the aggregation’s dimensions
values, as seen from the to-be-scored data, do not match with any of the aggregated data.
Missing value would describe the case where the aggregated data does
not know of the found customer (yet), or of this (known) customer using the
current transaction’s type before (in the aggregation period).
On the other hand, if this very customer is already known to have used the
same transaction type, just never in that currency (which is not part of the
aggregation’s dimensions, only being the aggregated variable), then the case
cannot be described as a missing value case (see warning below).
In the case of an identified missing value, the variable’s partial score would normally be set highest (i.e. to 1), based on the safe-side reaction to a behavior never seen before. After some business analysis, it can very well be decided that such cases should on the contrary be scored lowest (i.e. with a 0).
This is exactly what activating the Ignore missing values toggle does: ignore the
behavioral anomaly represented by cases identified as new in the axes of the aggregation’s
dimensions.
In the case where only the variable's value is seen as new, the partial score will always be set highest (i.e. to 1), regardless of the value of the Ignore missing values toggle.
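The interplay between missing dimension values, never-seen variable values and the toggle can be summarised in a small sketch (a hypothetical scoring helper with a made-up frequency-based normal score, not the actual scoring engine):

```python
def partial_score(aggregated, dims, value, ignore_missing=False):
    # aggregated: {dimension_tuple: {seen_value: count}}
    if dims not in aggregated:
        # Missing value case: the dimensions were never seen in the
        # aggregated data; scored highest unless the toggle is active.
        return 0.0 if ignore_missing else 1.0
    seen_values = aggregated[dims]
    if value not in seen_values:
        # Known dimensions but a never-seen value: always scored highest,
        # regardless of the toggle.
        return 1.0
    # Otherwise score normally (here: a made-up frequency-based score)
    total = sum(seen_values.values())
    return 1.0 - seen_values[value] / total
```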
When NG|Screener is down, it may miss some scheduled control executions. This can happen if a control is scheduled to run more frequently than the downtime lasted. If you wish to recover some or all of the missed runs, two parameters need to be set in the Control Module’s configuration file (located in /etc/ng-screener/daemon/modules/control.conf).
A restart of NG|Screener is required for any changes to take effect. In the above case, a maximum of 30 of the newest missed runs will be recovered for each scheduled control.
Most of this chapter’s content applies only to simple controls, meaning that advanced ones will have to deal with the functionality explicitly in their specific Python code sections, as with most other things. The part about reference data usage, though, remains relevant for all kinds of controls.
For PBI controls, blacklisting (resp. whitelisting) an event only consists of having it forcefully
appear (resp. not appear) in the control’s results. Doing so can be triggered through the
Filter attribute on the control editor’s Template Configuration tab.
As an example, the following filter snippets may be added to an already existing filter to
make sure
1. an event with an attribute (aptly named dubious) with a specific value (yes here) always
appears in the control’s output report (= is blacklisted)
(...) OR dubious:yes
2. an event with an attribute (named clear hereafter) with a specific value (sure chosen here) never appears in the control’s output report (= is whitelisted)
(...) AND NOT clear:sure
In case of profiling controls, the Filter attributes previously used to have an event be
included in the control’s output report has a slightly different meaning, as it governs which
events get scored, i.e. which events a profiling score is computed for.
Therefore, whitelisting can still be achieved this way, since not scoring an event is a sure guarantee that it will not end up with a high score and appear among the control’s violations. Blacklisting is another matter, though, since it not only requires the event to be scored, but also to be scored high.
White- and blacklisting are rather dealt with at each variable’s level, through the
corresponding settings attached to the variables in the control editor’s Profiling tab.
When activated, each option compares the value of the given attribute (whose name is in the first field) to the given value (entered in the second field). If a difference is found, nothing specific happens (i.e. variable scoring occurs normally). If the values are found to be the same, though, then the variable’s partial score is forced to 0 (whitelisting) or 1 (blacklisting), regardless of the other settings. Whatever happens, this partial score will then contribute to the global score in the same way.
If both options are activated, and both expressions match, then blacklisting
wins.
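The override logic just described can be sketched in plain Python (the attribute names and the helper function are hypothetical; the real computation happens inside the control template):

```python
def partial_score(event, normal_score, whitelist=None, blacklist=None):
    """Apply the white/blacklisting overrides to one variable's partial score.

    whitelist / blacklist are optional (attribute_name, trigger_value) pairs,
    mirroring the two fields of each option in the Profiling tab.
    """
    # Blacklisting wins when both expressions match.
    if blacklist and event.get(blacklist[0]) == blacklist[1]:
        return 1.0          # forced maximum partial score
    if whitelist and event.get(whitelist[0]) == whitelist[1]:
        return 0.0          # forced minimum partial score
    return normal_score     # no match: normal scoring applies

event = {'dubious': 'yes', 'clear': 'sure'}
# Both options match here, so blacklisting takes precedence.
print(partial_score(event, 0.37,
                    whitelist=('clear', 'sure'),
                    blacklist=('dubious', 'yes')))  # → 1.0
```

Whatever the outcome, the resulting partial score then contributes to the global score exactly as a normally computed one would.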
In the general case, the attribute(s) on which the black- or whitelisting decision is based do
not come straight out of the scored event’s original data (i.e. the event first has to be
enriched with them).
In case the trigger value(s) can be deduced from a combination of the event’s other
attributes, then this combination can be either
There are cases also where the information has to come from other, external sources. One
can think of an explicit list of highly risky, or even embargoed destinations for financial
transactions, for instance (think blacklisting). Or, on the contrary, an explicit list of known
trusted counterparties (state institutions and the like…), which could be used to whitelist
financial transactions.
That’s where reference data comes in handy. The data source can be either completely
automatic (extracted from a provided CSV file, a database…) or manually managed.
• let’s assume that, in each financial transaction entered in the system, the destination
account is present - using the business model - in the
transaction_receiver_account_id attribute;
• one could build a reference data model, taking an account_identifier as a key, and
an account_status value (status being something like trusted, resulting in
whitelisting, or forbidden, rather resulting in blacklisting, or also unknown, resulting in
normal processing…); this model could be either filled manually or directly fed from an
external data source;
• the content for this reference data model can then be edited in NG|Screener UI
(Admin menu, Reference data section)
• the event data source’s translator ought to be configured so that it adds a new
transaction_receiver_account_type attribute based on the reference data
content:
<translator>
<key>transaction_receiver_account_id=account_identifier</key>
<value>transaction_receiver_account_type=account_status</value>
</translator>
11.7.1. Context
For simple profiling controls, there is the possibility to activate the so-called event tracking
functionality by setting the appropriate switch in the control’s general configuration panel.
This functionality enables tracking of all events that pass through the control. The goal here
is to gather all the information required to be able to notify external systems about hits or
groups of events without any hits.
11.7.2. Configuration
The following parametrisation options are available for the event tracking functionality:
11.7.3. Usage
When activated and properly configured, a simple profiling control will dump data
consisting of
To understand what a group is, let’s take an analogy with several payments (= events) being
recorded through a single order (= the group). The group is the entity that may be cleaned
from any suspicion as a whole, while suspicious events are still notified individually.
This table is cleaned up by control executions (all data loaded before the beginning of the
current day is systematically removed). For reference, this cleanup process is performed in
the upload_event_tracking_data() function of the simple_profiling.template file.
There are also two other scripts that have to be tuned to complete the analysis, both
located in the /usr/local/ng-screener/python/packages directory:
event_tracking_handling_sample.py
This script - or a spin-off of it - is intended to be run on a regular basis (every few minutes
or so), for instance via the crond daemon. Its parameters are the identifiers of the
controls that must all have declared an event as genuine for the event to be considered
ultimately genuine.
Inside the script, the following variables have to be set properly. They are all gathered at
the beginning of the file for accessibility.
• DB_* (DB_USER and DB_PWD): user name and associated password to access the
ngscreener database where the CTRL_EVENT_TRACKING table is located.
• SP_* (SP_USER, SP_PWD, SP_DBTYPE, SP_DBHOST, SP_DBPORT, SP_DBNAME):
connection information for the external DB where the stored procedures (hence SP as
name prefix) are to be called. SP_DBTYPE currently accepts the following possible
values: mysql, oracle, postgres and sqlserver.
• CM_* (CM_USER, CM_PWD, CM_TARGET): user name, associated password for the
connection to the Case-Manager application, and the name of the "Case Manager"
target to use for custom fields resolution.
Additionally, the following methods, both located in the Handling class close to the
beginning of the file, most probably have to be adapted to each local situation:
All errors encountered during DB access (be it the ngscreener database itself or the one
concerned by the stored procedure calls) are logged in
/var/log/ng-screener/event-tracking-db.log.
This script is another template, this time to check that there are no long-running
unhandled events. It is also intended to be run at regular intervals, though with a lower
frequency than the first one. The following variables have to be tuned in the script:
• DB_* (DB_USER and DB_PWD): user name and password to access the ngscreener
database where the CTRL_EVENT_TRACKING table is located.
• THRESHOLD_MINUTES: the time (in minutes) above which an un-handled event should
be highlighted (which should be high enough to take into account any treatment action
taking place, and low enough to allow for quick detection if something failed to work).
11.8.1. Context
To be able to tune the profiling parameters, it may be necessary to analyze the statistical
spread of generated scores periodically. This is why all profiled elements are dumped in a
raw CSV file. This is done automatically for simple controls, and has to be done manually
for advanced controls (in the advanced profiling control template, a variable named
PROFILING_AUDIT_PATH contains the name of the directory into which those CSV files
are expected to be dumped).
Special care must be taken to use the same columns in the same sort
order for all runs of a given control, so that the daily logs can be
concatenated properly.
Due to the way profiling controls are run (i.e. extending the relevant control’s time frame
towards the past slightly so that we are sure to treat all incoming events, even if they only
arrive after a slight delay), some input data may actually be analyzed more than once,
resulting in duplicates in the generated audit.
A cron job has therefore been added, run every day at 1 AM, which removes duplicate lines
in the previous days' profiling audit logs and compacts the result into one .gz file per
control (identified by its id) per day. It is defined in the /etc/cron.d/profilingAudit
file and refers to the script located at
/usr/local/ng-screener/tools/profiling/profiling_audit_daily_aggreg.sh.
The folder hierarchy where those audit logs are dumped is defined by the
profilingAuditBasePath variable in /etc/ng-screener/daemon/modules/control.conf.
Its default value is /data/control/profiling.
For a given control run, the folder in which the CSV log files are dumped (the
PROFILING_AUDIT_PATH variable) is composed of the following components:
1. the control’s id
2. the run date (in form yyyyMMdd)
3. the run time (in form yyyyMMdd_hhmmssSSSSSS)
/data/control/profiling/<id>/<yyyyMMdd>/<yyyyMMdd_hhmmssSSSSSS>
If, for some reason, the de-duplication and compacting actions should be performed earlier
(for a specific run, for example, to be able to examine quickly what the partial scores for a
specific transaction were), the following commands can be used to generate the compacted
file:
(1) changing to the directory where the run audit logs were deposited
(if the aim is to gather all existing runs of the same day into one
archive, one can also use one level higher, i.e. at day level)
(2) finding all *.csv files below current directory, removing duplicate
lines, sorting them, and compacting the resulting CSV file into
/tmp/run.csv.gz.
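The commands themselves are not reproduced above; based on descriptions (1) and (2), the equivalent operation can be sketched in Python (the function name and paths are placeholders):

```python
import glob
import gzip
import os

def compact_audit_logs(run_dir, output='/tmp/run.csv.gz'):
    """Gather all *.csv files below run_dir, drop duplicate lines,
    sort them, and compact the result into a single gzip archive."""
    lines = set()
    for path in glob.glob(os.path.join(run_dir, '**', '*.csv'), recursive=True):
        with open(path) as f:
            lines.update(f.read().splitlines())
    with gzip.open(output, 'wt') as out:
        out.write('\n'.join(sorted(lines)) + '\n')
    return output
```

Sorting all lines globally is only safe because, as noted above, every run of a given control uses the same columns in the same order.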
Controls now all use the Business Data Model in place of the obsolete NGE model.
Moreover, all data available to the controls is stored in ElasticSearch.
As described in section Advanced mode of the Template configuration chapter, the main
data holder is now a data frame, which is a kind of matrix holding rows (the events) and
columns (the attributes).
• filtering: the data frame’s content can be filtered by providing a predicate (on the row
level)
• selecting: only a subset of columns can be kept for each row, reducing the amount of
attributes analyzed
• sorting: changing the data frame’s sorting order
• joining: two data frames may be merged together, generating a third one in the process;
rows from the first two data frames are associated with each other using a join
expression.
One important point about Spark data frames is that they are lazily evaluated (computed
only when their content is actually needed).
Moreover, data frames are immutable: every operation performed will actually create
another data frame, leaving the original one(s) intact.
Filtering / Limiting
• attribute matching
◦ business_reference attribute value starting with CUST
df = df.filter(col('business_reference').like('CUST%'))
df = df.filter(df.score > k)
df = df.limit(100)
Selecting
◦ Using a constant
df = df.withColumn('my_new_column_name', lit(42))
df = df.withColumn('my_new_column_name',
col('attribute1') + col('attribute2') * 2)
Sorting
Provided we have a data frame with at least the '@timestamp' and 'currency' attributes, we
would like to sort the data frame by currency (ascending) and timestamp (descending, i.e.
reversed chronological).
df = df.orderBy(df['currency'].asc(), df['@timestamp'].desc())
Aggregating / Reducing
When we want to aggregate data (count the number of events with a given characteristic,
sum the transaction amounts per currency, etc.), this is called aggregating or reducing.
For instance, when we have a data frame with columns user_id, trans_type and
amount and we want the number of transactions per user in a column named
'trans_per_user', it could be implemented like:
per_user = df.groupBy('user_id').agg(
count(lit(1)).alias('trans_per_user')
)
If we want the sum and average of the amounts of transactions per user and transaction
type, the following expression could come in handy:
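Such an expression follows the same groupBy/agg pattern as the count example above, grouping by both user_id and trans_type and aggregating amount. As a plain-Python sketch of the computation it performs (sample data and result column names are hypothetical):

```python
from collections import defaultdict

rows = [
    {'user_id': 1, 'trans_type': 'wire', 'amount': 100.0},
    {'user_id': 1, 'trans_type': 'wire', 'amount': 300.0},
    {'user_id': 1, 'trans_type': 'card', 'amount': 50.0},
]

# Group by (user_id, trans_type), then aggregate the amounts,
# mirroring df.groupBy('user_id', 'trans_type').agg(...)
groups = defaultdict(list)
for row in rows:
    groups[(row['user_id'], row['trans_type'])].append(row['amount'])

per_user_and_trans_type = {
    key: {'amount_sum': sum(vals), 'amount_avg': sum(vals) / len(vals)}
    for key, vals in groups.items()
}
print(per_user_and_trans_type[(1, 'wire')])  # → {'amount_sum': 400.0, 'amount_avg': 200.0}
```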
Note: if the aggregated and source data have to be joined again into a single data frame,
please refer to the Joining section below.
Joining two data frames means creating a new one with data from both, following given
association rules (similarly to what joins are in relational databases).
Picking up on the joining point from the Aggregating / Reducing section above, to join the
aggregated data with the source data, one of the following can be used:
• if the joining columns have the same name in both data frames and an Equi join is
performed:
df = df.join(per_user, ('user_id'))
• if the joining columns have different names, or if the join condition should be a non-
strict-equality comparison, the join condition has to be explicit:
agg = per_user_and_trans_type
df = df.join(agg,
(df.user_id == agg.user_id) & (df.trans_type == agg.trans_type),
'left_outer')
In the last code snippet, the agg variable was only introduced to improve readability and
decrease the overall length of the statement.
Regarding the last parameter of the above join call: its allowed values are the following:
• inner (default): a row in the source data frame will only be carried onto the destination
data frame if it has at least one corresponding row (according to the join condition) in
the data frame being joined
• outer: rows in the source data frame are always carried onto the destination data
frame; if there is no corresponding row in the other data frame, then the corresponding
fields will remain empty
• left_outer: rows in the first source data frame (on the left of the expression) are
always carried onto the destination data frame, even when they have no corresponding
row in the second source data frame (in that case, the fields coming from the second
data frame remain empty); rows from the second source data frame, however, must
have at least one corresponding row in the first source data frame to be represented in
the destination data frame
• right_outer: same as left_outer, with the roles of the first and second data frames
inverted
• leftsemi: columns from the second source data frame are never output to the
destination data frame; the second data frame is only used to choose the rows from the
first source data frame which have at least one corresponding row (according to the
join condition) in it.
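The semantics of these join modes can be sketched in plain Python over two small row sets (the data and the helper function are hypothetical, for illustration only):

```python
left = [{'user_id': 1, 'name': 'a'}, {'user_id': 2, 'name': 'b'}]
right = [{'user_id': 1, 'total': 10}, {'user_id': 3, 'total': 30}]

def join(left, right, key, how='inner'):
    """Minimal equi-join on a single key (assumes unique keys on the right),
    mimicking a subset of the Spark join modes."""
    right_by_key = {r[key]: r for r in right}
    out = []
    for l in left:
        match = right_by_key.get(l[key])
        if how == 'leftsemi':
            if match is not None:
                out.append(dict(l))           # right columns never output
        elif match is not None:
            out.append({**l, **match})
        elif how == 'left_outer':
            out.append({**l, 'total': None})  # no match: right fields stay empty
    return out

print(len(join(left, right, 'user_id')))                # inner → 1
print(len(join(left, right, 'user_id', 'left_outer')))  # → 2
print(join(left, right, 'user_id', 'leftsemi'))         # → [{'user_id': 1, 'name': 'a'}]
```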
Due to the lazy characteristics of actual data frame evaluation, it can happen that the
same calculation has to be done several times, which in turn can cause serious
performance issues.
...
high = df.filter(df.score >= 0.8)
low = df.filter(df.score < 0.4)
...
When evaluating these two resulting data frames (high and low) later, if nothing specific is
done, then the df dataframe itself will have to be evaluated twice. That’s because its
intermediate state is not persisted between the two calls.
One solution is to enable the so-called caching of the data frame which is used in more than
one evaluation tree:
...
df.cache()
high = df.filter(df.score >= 0.8)
low = df.filter(df.score < 0.4)
...
Doing so will ensure that the data frame’s content is persisted, so that its evaluation is
only done once. Unfortunately, it also comes at a price: in the presented example, the data
frame content is evaluated (and cached) prior to any filtering, which can make it rather
big. As usual, it is a trade-off between performance and memory consumption…
Processing always starts with the Business Data Model in ElasticSearch. This data is
fetched by Spark, which does the heavy lifting of data processing (Ad. 1).
The process is performed in one of two ways: either in batch mode, with plain Spark SQL,
or in real-time mode, using Spark Structured Streaming (Ad. 2). It always results in
potential hits being written back to the relevant index in ElasticSearch (Ad. 3), and in the
raw results being stored in .parquet files. The latter only serve as a pivot format for the
report generation run later down the road (Ad. 4).
12.1. Introduction
Although we use NG|Storage to store big data, we still use an RDBMS in the application to
store metadata (e.g. control definitions, user information, …). In addition, an RDBMS
supports transactions and the ACID properties (atomicity, consistency, isolation,
durability).
NG|Screener uses MariaDB as its RDBMS, with a single database instance (ngscreener).
It listens on the default port 3306, and its data is located on the /storage partition, which
is normally a RAID5 disk.
• FEEDING_ : these tables contain data for the feeding module (for more information
about feeding, please refer to chapter Feeding)
• UI_ : these tables contain ngBrowser data - filters, notifications, etc.
• REALTIMEANALYSIS_ : these tables contain the defined policies and blackouts (for
more information about alerting, please refer to NGBrowserGuide chapter Realtime
Analysis)
• DCA_ : these tables contain the definitions of data capture alerting objects
• CONTROL_ : these tables store control configuration (Python code, related target,
scheduling information, etc.)
• SECURITY_ : these tables hold information about NG|Screener user roles and the
authentication method (Local, LDAP)
• databaseConfig.url
This is the jdbc url of the MariaDB server to use.
Default: "jdbc:mariadb://localhost:3306/ngscreener"
• databaseConfig.username
This is the MariaDB username to use.
Default: prelude
mysql
mysql -u prelude -p
After this command, a prompt will appear to enter the password of the prelude user. To
exit the MariaDB shell, enter the command:
exit
show databases;
use database_name;
show tables;
describe tablename;
desc tablename;
• ngscreener database
• System files
◦ /etc/hosts
◦ /etc/cron.d /etc/cron.daily /etc/cron.deny /etc/cron.hourly /etc/cron.monthly
/etc/crontab /etc/cron.weekly
◦ /etc/ng-screener (this will include all config files of all NG applications)
◦ /etc/syslog-ng-rules
◦ /usr/local/prelude-runtime/etc/prelude-lml/ruleset
◦ /usr/local/ng-screener/ngprocessing/ngmesos/etc
◦ /usr/local/ng-screener/ngprocessing/ngspark/conf
• /log-collector (in case the --data option is used)
• Objects managed by ngadmin commands
◦ UI Forensic Filters (using ngadmin forensic_extractFilters)
13.1.1. Usage
• option
◦ No options: backs up the database, the configuration files and the UI objects
managed by ngadmin commands. In this mode, /log-collector is not backed up.
◦ --data: in addition to the above, backs up /log-collector as well. Note that
the final backup file can be big if there are many logs under /log-collector.
• backupFile: the path can be absolute or relative. The file name only accepts the
extension .tar.gz; if the file has no extension, .tar.gz is appended to the file name.
Example
The restore script is used to restore the information stored by the backup script
• ngscreener database
• System files
◦ /etc/hosts
◦ /etc/cron.d /etc/cron.daily /etc/cron.deny /etc/cron.hourly /etc/cron.monthly
/etc/crontab /etc/cron.weekly
◦ /etc/ng-screener (this will include all config files of all NG applications)
◦ /etc/syslog-ng-rules
◦ /usr/local/prelude-runtime/etc/prelude-lml/ruleset
◦ /usr/local/ng-screener/ngprocessing/ngmesos/etc
◦ /usr/local/ng-screener/ngprocessing/ngspark/conf
• /log-collector (in case the --data option is used)
• Objects managed by ngadmin commands
◦ UI Forensic Filters (using ngadmin forensic_extractFilters)
13.2.1. Usage
• option
◦ No options: restores the database, the configuration files and the UI objects
managed by ngadmin commands from the backup file. In this mode, /log-collector
is not restored.
◦ --data: in addition to the above, restores /log-collector as well.
/log-collector is skipped if it does not exist in the backup file.
Example
14.1. Introduction
To make the terminology used in NG|Storage clearer, the table below shows a comparison
of the various elements with their RDBMS counterparts:
An ElasticSearch cluster can contain multiple Indices (databases), which in turn contain
multiple Types (tables). These types hold multiple Documents (rows), and each document
has Properties (columns).
14.2. Architecture
The architectural diagram in External Architecture shows how NG|Storage is used in the
context of NG|Screener.
NG|Storage is used as the main data store for NG|Discover - forensic sessions are run (and
can only be run) on the data loaded into NG|Storage.
Its secondary role is to serve as the backend of NG|Discover’s search feature.
In ElasticSearch, all data in every field is indexed by default. That is, every field has a
dedicated inverted index for fast retrieval. All those inverted indexes can be used in a single
query, regardless of the types or indices queried.
Fields can be configured to be indexed or not. A field set as not indexed is only stored,
and such a field cannot be searched or used in filters.
Each field has its own type, which determines both the data type stored and the way the
field is queried. For a string field, the type can be text or keyword. A text field is
tokenized into words and can be queried by those words. A keyword field is stored as a
whole and can only be queried by its whole value.
14.3.1. Indexes
<prefix>-<host_name>-<service_name>-<date>
By default, the daemon uses its own mapping template file to set the correct number of
shards and replicas. The default setting is 2 shards and 0 replicas for each index.
You can overwrite these settings by creating a new mapping file in /etc/ng-
screener/daemon/indextemplates. The mapping template should have a name
corresponding to the index pattern to match (ng*.json, ngt-*.json, …).
All mapping template files are applied when NG|Daemon starts. New indices matching the
index_patterns will have the settings applied. If an index matches multiple mapping
template files, the file with the higher order number takes precedence.
The effective index mapping template applied to each index can be found in Ng|Storage
Admin (check Tools) at https://server_ip/ui/storageadmin/index.html#!/cluster. Click on
the arrow in the top right corner of the index and select show mappings in the popup
menu. Kopf can also be used to update index settings by clicking on edit setting in the
popup menu. Those settings are applied temporarily, and will be reset when ng-storage is
restarted.
Since we cannot keep all data indefinitely (data kept in NG|Storage takes 8 times the
space it would take compressed in /log-collector), data in NG|Storage is maintained
using a sliding window approach.
The default window size for NG|Storage is 365 days. During data loading, the most recent
log files are loaded first, followed by older ones.
14.5. Tools
It offers an easy way of performing common tasks on an elasticsearch cluster. Not every
14.6. Limitations
NG|Storage uses significantly more disk space to store the same data than compressed
files do. The rule of thumb is that data held in NG|Storage occupies 8 times the amount of
disk space as the same data held in /log-collector.
Some examples on how to estimate disk space consumption by NG|Storage are provided
below:
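As a back-of-the-envelope sketch of the 8× rule of thumb described above (the input figure is hypothetical):

```python
# Compressed data currently held in /log-collector, in GB (hypothetical figure)
log_collector_gb = 50

# Rule of thumb: NG|Storage needs roughly 8 times the compressed size
ngstorage_gb = log_collector_gb * 8
print(ngstorage_gb)  # → 400
```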
To verify if the API is available, enter the following URL in the web browser:
http://localhost:9200.
Expected response:
Cluster health
Example response:
{
"cluster_name" : "elasticsearch",
"status" : "green",
"timed_out" : false,
"number_of_nodes" : 1,
"number_of_data_nodes" : 1,
"active_primary_shards" : 534,
"active_shards" : 534,
"relocating_shards" : 0,
"initializing_shards" : 0,
"unassigned_shards" : 12,
"delayed_unassigned_shards" : 0,
"number_of_pending_tasks" : 0,
"number_of_in_flight_fetch" : 0,
"task_max_waiting_in_queue_millis" : 0,
"active_shards_percent_as_number" : 97.8021978021978
}
The most important field here is "status". The cluster health status can be green, yellow
or red. On the shard level, a red status indicates that the specific shard is not allocated in
the cluster. Yellow means that the primary shard is allocated but replicas are not, and
green means that all shards are allocated. The index level status is determined by the
lowest shard status. The cluster status is determined by the lowest index status.
Example response:
{
"test" : {
"mappings" : {
"document" : {
"dynamic_templates" : [
{
"strings" : {
"match_mapping_type" : "string",
"mapping" : {
"index" : false,
"type" : "keyword"
}
}
},
{
"geopoints" : {
"match" : "*_geo",
"mapping" : {
"type" : "geo_point"
}
}
}
]
}
}
}
}
The output provides information like the number of replicas, number of shards, refresh
interval, etc.
Example response:
List indexes
When there are no indexes in the cluster the response will show an empty list:
health status index uuid pri rep docs.count docs.deleted store.size pri.store.size
You can retrieve documents by some fields. For example, you can get documents where
Inserting documents
Deleting documents
Counting documents
Example response:
In case you have an installation on multiple nodes, here is the procedure to configure
multiple instances of NG|Storage to run as a cluster.
Verify that the communication on port 9300 is allowed between all nodes. You might need to
open the firewall on port 9300 on each node using the following command:
Before starting the daemon, you need to set the replication factor of the indexes. For that
edit your index templates located in /etc/ng-screener/daemon/indextemplates
and change the "number_of_replicas": "0" to the number of replicas you want in
your cluster.
NG|Storage can be fine-tuned in multiple ways, but the most common use case is tuning
the number of indexes/shards in a cluster. Most of the time, performance issues come
from the "shards gazillion" problem: index partitioning needs to be thought through
according to each client’s needs.
Data in NG|Storage is organized into indices. Each index is made up of one or more shards.
Each shard is an instance of a Lucene index, which you can think of as a self-contained
search engine that indexes and handles queries for a subset of the data in an NG|Storage
cluster.
As the number of segments grows, these are periodically consolidated into larger
segments. This process is referred to as merging. As all segments are immutable, this
means that the disk space used will typically fluctuate during indexing, as new, merged
segments need to be created before the ones they replace can be deleted. Merging can be
quite resource intensive, especially with respect to disk I/O.
The shard is the unit at which NG|Storage distributes data around the cluster. The speed at
which NG|Storage can move shards around when rebalancing data, e.g. following a failure,
will depend on the size and number of shards as well as network and disk performance.
Each shard has data that needs to be kept in memory and uses heap space. This includes
data structures holding information at the shard level, but also at the segment level, in
order to define where data resides on disk. The size of these data structures is not fixed
and will vary depending on the use case.
One important characteristic of the segment related overhead is however that it is not
strictly proportional to the size of the segment. This means that larger segments have less
overhead per data volume compared to smaller segments. The difference can be
substantial.
In order to be able to store as much data as possible per node, it becomes important to
manage heap usage and reduce the amount of overhead as much as possible. The more
heap space a node has, the more data and shards it can handle.
Indices and shards are therefore not free from a cluster perspective, as there is some level
of resource overhead for each index and shard.
As the overhead per shard depends on the segment count and size, forcing
smaller segments to merge into larger ones through a forcemerge
operation can reduce overhead and improve query performance. This
should ideally be done once no more data is written to the index. Be aware
that this is an expensive operation that should ideally be performed during
off-peak hours.
The number of shards you can hold on a node will be proportional to the
amount of heap you have available, but there is no fixed limit enforced by
NG|Storage. A good rule of thumb is to keep the number of shards per
node below 20 to 25 per GB of configured heap. A node with a 30GB heap
should therefore have a maximum of 600-750 shards, and the further below
this limit you can keep it, the better. This will generally help the cluster
stay in good health.
In summary: to size your NG|Storage cluster, you can simply determine the minimum heap
size needed for the whole cluster with the following function:
The minimum heap to set for a node is 8GB and the maximum is 30GB.
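A minimal sketch of such a sizing computation, based on the 20-25 shards per GB rule of thumb quoted above (the function name and the shard count used are hypothetical):

```python
import math

MIN_NODE_HEAP_GB = 8   # minimum heap to set for a node
MAX_NODE_HEAP_GB = 30  # maximum heap to set for a node
SHARDS_PER_GB = 20     # conservative end of the 20-25 rule of thumb

def cluster_heap_plan(total_shards):
    """Return (minimum total heap in GB, minimum node count) for the cluster."""
    total_heap_gb = max(MIN_NODE_HEAP_GB,
                        math.ceil(total_shards / SHARDS_PER_GB))
    nodes = math.ceil(total_heap_gb / MAX_NODE_HEAP_GB)
    return total_heap_gb, nodes

print(cluster_heap_plan(1200))  # 1200 shards → (60, 2)
```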
If we were to have one big index for documents, we would soon run out of space. Logging
events just keep on coming, without pause or interruption. We could delete the old events
with a scroll query and bulk delete, but this approach is very inefficient. When you delete a
document, it is only marked as deleted (see Deletes and Updates). It won’t be physically
deleted until the segment containing it is merged away.
Instead, use an index per time frame. You could start out with an index per year (logs_2014)
or per month (logs_2014-10). Perhaps, when your website gets really busy, you need to
switch to an index per day (logs_2014-10-24). Purging old data is easy: just delete old
indices.
This approach has the advantage of allowing you to scale as and when you need to. You
don’t have to make any difficult decisions up front. Every day is a new opportunity to change
your indexing time frames to suit the current demand. Apply the same logic to how big you
make each index. Perhaps all you need is one primary shard per week initially. Later,
maybe you need five primary shards per day. It doesn’t matter—you can adjust to new
circumstances at any time.
When you index a document, it is stored on a single primary shard. How does Elasticsearch
know which shard a document belongs to? When we create a new document, how does it
know whether it should store that document on shard 1 or shard 2?
The process can’t be random, since we may need to retrieve the document in the future. In
fact, it is determined by a simple formula:
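The formula, as documented for Elasticsearch, is shard = hash(routing) % number_of_primary_shards. A plain-Python sketch (zlib.crc32 stands in for the murmur3 hash Elasticsearch actually uses):

```python
import zlib

def shard_for(routing, number_of_primary_shards):
    """shard = hash(routing) % number_of_primary_shards
    (zlib.crc32 stands in for Elasticsearch's murmur3 hash)."""
    return zlib.crc32(routing.encode()) % number_of_primary_shards

# The same routing value always lands on the same shard...
assert shard_for('doc-42', 5) == shard_for('doc-42', 5)
# ...and the result is always in range 0 .. number_of_primary_shards - 1
assert 0 <= shard_for('doc-42', 5) < 5
```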
The routing value is an arbitrary string, which defaults to the document’s _id but can also
be set to a custom value. This routing string is passed through a hashing function to
generate a number, which is divided by the number of primary shards in the index to return
the remainder. The remainder will always be in the range 0 to number_of_primary_shards
- 1, and gives us the number of the shard where a particular document lives.
This explains why the number of primary shards can be set only when an index is created
and never changed: if the number of primary shards ever changed in the future, all previous
routing values would be invalid and documents would never be found.
All document APIs (get, index, delete, bulk, update, and mget) accept a routing parameter
that can be used to customize the document-to-shard mapping. A custom routing value
could be used to ensure that all related documents—for instance, all the documents
belonging to the same user—are stored on the same shard.
PUT /ng-test/document/1?routing=shardkey
{
"name": "abc",
"title": "lorem ipsum",
...
}
The number of shards in an index is configured at index creation and is immutable. Shards
are configured in index templates with the number_of_shards parameter - see Indexes
chapter.
Changing the number of shards in an existing index requires complete index rebuilding.
To display shards statistics directly from NG|Storage, make a REST API call to the following
URL providing your index name. For example:
http://localhost:9200/_cat/shards/ngt-*?v
Spark is a general-purpose data processing engine that is suitable for use in a wide range
of circumstances. Application developers and data scientists incorporate Spark into their
applications to rapidly query, analyze, and transform data at scale. Tasks most frequently
associated with Spark include interactive queries across large data sets, processing of
streaming data from sensors or financial systems, and machine learning tasks.
Spark is designed to run large-scale data processing applications on clusters of machines,
distributing the workload to achieve much faster run times. Although Spark is generally
very performant, Spark jobs can still fail, get stuck, or take long hours to finish. Here is a
collection of best practices and optimization tips to achieve better performance and
cleaner Spark code.
Most of our Spark jobs read data from Elasticsearch, so it is important to make sure we do
it efficiently. If filtering operations are used properly on a Spark DataFrame, Spark can
translate those filters into an Elasticsearch query, which speeds things up by reading and
processing only the necessary data. An important hidden feature of using Elasticsearch as
a Spark source is that the Spark-ES connector understands the operations performed
within the DataFrame/SQL and, by default, translates them into the appropriate QueryDSL.
In other words, the connector pushes the operations down to the source, where the data is
efficiently filtered so that only the required data is streamed back to Spark. This
significantly increases query performance and minimizes CPU, memory and I/O on both
the Spark and Elasticsearch clusters, as only the needed data is returned (as opposed to
returning the data in bulk only to be processed and discarded by Spark).
The following pySpark code generates the Elasticsearch query shown below it:
(
    sparkSession.read.format("es")
    .options(**self.es_base_read_conf)
    .load(resource="ngt-default_finnovaserver-swisscomfinnovacorebankingtransaction-201904")
    .select('business_reference', 'transaction_receiver_amount')
    .where(col('transaction_receiver_amount') > 128)
)
{
  "size": 10000,
  "query": {
    "bool": {
      "must": [{ "match_all": { "boost": 1.0 } }],
      "filter": [
        { "exists": { "field": "transaction_receiver_amount", "boost": 1.0 } },
        {
          "range": {
            "transaction_receiver_amount": {
              "from": 128.0, ①
              "to": null,
              "include_lower": false,
              "include_upper": true,
              "boost": 1.0
            }
          }
        }
      ],
      "adjust_pure_negative": true,
      "boost": 1.0
    }
  },
  "_source": {
    "includes": ["business_reference", "transaction_receiver_amount"], ②
    "excludes": []
  },
  "sort": [{ "_doc": { "order": "asc" } }]
}
To verify that Spark generates the proper Elasticsearch query, you can temporarily turn on
logging of all queries on a given index:
PUT /ngt-default_finnovaserver-swisscomfinnovacorebankingebankingtransaction-201904/_settings
{
  "index.search.slowlog.threshold.query.warn": "0s",
  "index.search.slowlog.threshold.fetch.warn": "0s",
  "index.indexing.slowlog.threshold.index.warn": "0s"
}
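These settings can also be prepared programmatically. The sketch below builds the request body shown above, plus a body for switching the logging off again once you are done (in stock Elasticsearch a threshold of -1 disables slow logging; verify this against the NG|Storage version you run):

```python
import json

# Log every query on the index (threshold 0s). Remember to reset
# afterwards: logging all queries is expensive on a busy index.
enable = {
    "index.search.slowlog.threshold.query.warn": "0s",
    "index.search.slowlog.threshold.fetch.warn": "0s",
    "index.indexing.slowlog.threshold.index.warn": "0s",
}

# Assumed reset behaviour: "-1" disables the slow log threshold.
disable = {k: "-1" for k in enable}

body = json.dumps(enable)
print(body)
```

Either dictionary can then be sent as the body of the PUT _settings call shown above.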
From now on you will see all queries run on the
ngt-default_finnovaserver-swisscomfinnovacorebankingebankingtransaction-201904 index in
/var/log/ng-screener/ngstorage/NGELK_index_search_slowlog.log.
Very often Spark is able to push down filters and selections, even after joins.
Check the execution plan of your pySpark code to see how Spark will actually apply the
filters and in which order they will be executed.
customers = spark.createDataFrame(
[ (1, 1, 'John', 30),
(2, 1, 'Andy', 35),
(3, 2, 'Roger', 40),
(4, None, 'Sarah', 45),
], ['id', 'type_id', 'name', 'age']
)
types = spark.createDataFrame(
[ (1, 'Normal', 10),
(2, 'Premium', 20),
], ['type_id', 'type_name', 'fee']
)
df = (
customers
.join(types, 'type_id', 'left_outer')
.filter(col('age') >= 40)
)
df.explain()
== Physical Plan == ①
*(5) Project [type_id#462L, id#461L, name#463, age#464L, type_name#483, fee#484L]
+- SortMergeJoin [type_id#462L], [type_id#482L], LeftOuter
:- *(2) Sort [type_id#462L ASC NULLS FIRST], false, 0
: +- Exchange hashpartitioning(type_id#462L, 8)
: +- *(1) Filter (isnotnull(age#464L) && (age#464L >= 40)) ②
: +- Scan ExistingRDD[id#461L,type_id#462L,name#463,age#464L]
+- *(4) Sort [type_id#482L ASC NULLS FIRST], false, 0
+- Exchange hashpartitioning(type_id#482L, 8)
+- *(3) Filter isnotnull(type_id#482L)
+- Scan ExistingRDD[type_id#482L,type_name#483,fee#484L]
① Read from the bottom up. The plan is also visible in the SQL tab of the Spark web
frontend while a job is running.
② Spark pushed the filter down before the join
But it’s not always possible to push down filters, for example when the DataFrame is
cached. Take a look at the example (the code below is a reconstruction consistent with the
two plans that follow):
df = types.select('type_id', 'fee')
print 'No cache:'
df.filter(col('fee') >= 20).explain()
df.cache()
print '\nCached:'
df.filter(col('fee') >= 20).explain()
No cache:
== Physical Plan ==
*(1) Project [type_id#482L, fee#484L]
+- *(1) Filter (isnotnull(fee#484L) && (fee#484L >= 20))
+- Scan ExistingRDD[type_id#482L,type_name#483,fee#484L]
Cached:
== Physical Plan ==
*(1) Filter (isnotnull(fee#484L) && (fee#484L >= 20)) ①
+- *(1) InMemoryTableScan [type_id#482L, fee#484L], [isnotnull(fee#484L), (fee#484L >= 20)]
      +- InMemoryRelation [type_id#482L, fee#484L], StorageLevel(disk, memory, deserialized, 1 replicas)
            +- *(1) Project [type_id#482L, fee#484L]
               +- Scan ExistingRDD[type_id#482L,type_name#483,fee#484L]
① The filter is applied after caching the data, which can entail a huge performance penalty.
To conclude, be careful and pay special attention to where you apply your filters and
column selections.
A critical component of scalability is parallelism: splitting a task into multiple smaller
ones that execute at the same time, on different nodes in the cluster. Shards play a critical
role when reading information from Elasticsearch. Since Elasticsearch acts as the source,
the connector creates one Spark partition per Elasticsearch / NG|Storage shard. Roughly
speaking, more input splits mean more tasks that can read different parts of the source at
the same time, and more shards mean more buckets from which the content of an index
can be read (in parallel).
To sum up, the number of shards determines the number of Spark tasks that can read and
process the data: Spark creates one task to read one partition (shard).
Broadcast join
When joining two tables, if one of them is small enough to fit into memory, it is advisable to
broadcast it to avoid shuffles. And if tasks across multiple stages require the same
data, it is better to broadcast the value than to send it to the executors with each task.
To figure out whether your DataFrame is a good candidate for broadcasting, check the
amount of data that is shuffled during a normal join. Use the Spark UI to do so.
To tell Spark that it can use a broadcast join, use the broadcast hint in your pySpark code:
from pyspark.sql.functions import broadcast

joined_data = (
    large_df
    .join(
        broadcast(small_df), ①
        'business_contract_name')
)
15.2.3. Tools
Using the Spark UI is a good way to track job execution and detect performance issues.
See the technical guide to read more about the Spark history server, which is available on
the NG|Screener platform.
15.2.4. Spark performance advice in a nutshell
1. make it work
◦ select only the necessary columns
◦ filter out data as soon as possible
◦ consider using precomputed aggregations where possible
◦ enrich your data while loading it into the system (use scripted fields, translators,
etc.)
◦ plan ahead how much data you expect to have in the NG|Screener platform
▪ set a proper number of shards for the relevant indices
2. make it right
◦ have a look at the ES queries generated by Spark
◦ review your code again; maybe there is still a column or row to drop
◦ have a quick look at the Spark UI
▪ check that the number of created tasks isn’t suspicious
▪ check that all the tasks are busy
◦ remove unnecessary lines from your code, such as df.show() or df.explain()
3. make it fast (only if there is a performance problem)
◦ use broadcast where possible
◦ try to find a proper value for spark.sql.shuffle.partitions
◦ check the execution plan (df.explain())
Apache Mesos is a cluster manager that provides efficient resource isolation and sharing
across distributed applications or frameworks. It sits between the application layer and the
operating system and makes it easier to deploy and manage applications in large-scale
clustered environments. It can run many applications on a dynamically shared pool of
nodes.
15.4. Settings
For a POC, we do not need to run 3 controls at a time, but only 1 with most of the resources.
• Mesos
◦ Set the total available memory to half of the machine’s RAM in
/usr/local/ng-screener/ngprocessing/ngmesos/etc/mesos-slave/resources/mem.
For example, for a machine with 32 GB of RAM, execute the following command:
echo 16384 > /usr/local/ng-screener/ngprocessing/ngmesos/etc/mesos-slave/resources/mem
◦ Set the total number of CPUs to use to half of the machine’s CPUs plus 1 in
/usr/local/ng-screener/ngprocessing/ngmesos/etc/mesos-slave/resources/cpus.
For example, for a machine with 8 CPUs, execute the following command:
echo 5 > /usr/local/ng-screener/ngprocessing/ngmesos/etc/mesos-slave/resources/cpus
• Spark
◦ Edit the following settings in
/usr/local/ng-screener/ngprocessing/ngspark/conf/spark-default.conf
▪ spark.executor.memory
Set this value to the Mesos memory minus 15%, minus a further 512m. If Mesos has 16384
available, you should set this value to 13400m.
15.6.1. allocateResource.sh
allocateResource.sh is a bash script that calculates the CPUs and memory for Spark
executors.
When the calculation is done, it writes the calculated values to the relevant Spark
configuration file.
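The arithmetic described in the settings section above can be sketched as follows (a simplified Python rendition of the sizing rules; the authoritative logic lives in allocateResource.sh itself):

```python
def mesos_resources(total_ram_mb, total_cpus):
    # Half the machine's RAM for Mesos, and half the CPUs plus one,
    # as recommended in the settings section.
    return total_ram_mb // 2, total_cpus // 2 + 1

def spark_executor_memory_mb(mesos_mem_mb):
    # Mesos memory minus 15%, minus a further 512 MB.
    return int(mesos_mem_mb * 0.85) - 512

mem, cpus = mesos_resources(32768, 8)
print(mem, cpus)                        # 16384 5
print(spark_executor_memory_mb(16384))  # 13414 (the guide rounds down to 13400m)
```

The guide's example of 13400m for a 16384 MB Mesos allocation is this same calculation, rounded down.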
15.6.2. enableSparkAllocationMode.sh
• dynamicMode: [true|false]
◦ true: enable dynamic allocation mode with mesos
◦ false: enable static allocation mode with mesos
• masterNode: [true|false]
◦ true: this is the mesos master node
◦ false: this is the mesos slave node
• mesosMasterIpAddress: external IP address of mesos master (e.g. 192.168.56.11)
• mesosSlaveIpAddress: external IP address of mesos slave (e.g. 192.168.56.12)
• With dynamicMode=false, only the first parameter needs to be passed
16.1. Introduction
The NG|Screener platform consists of multiple services. Most of these services are Java
applications, managed as systemd services by the operating system.
16.2. NG|Screener
16.2.1. ng-screener.service
Logs:
Configuration:
The service needs at least 512MB to run but is usually configured to use 2GB of memory.
Exposed ports:
Dependencies:
• NG|Storage
• NG|Processing
• NG|Messaging
• MariaDB
16.2.2. ng-screener-ui.service
Configuration:
The service needs at least 400MB to run but is usually configured to use 1GB of memory.
Exposed ports:
Dependencies:
• NG|Storage
• NG|Discover
• MariaDB
16.3. NG|Messaging
16.3.1. ng-messaging.service
16.3.2. ng-zookeeper.service
The Zookeeper service uses 512MB of memory and starts the ZooKeeper server on port 2181.
16.4. NG|Discover
16.4.1. ng-discover.service
Add the following option to limit the Node.js heap size:
NODE_OPTIONS="--max-old-space-size=1024"
# Allows to specify a path to mount Kibana at if you are running behind a proxy. This only
# affects the URLs generated by Kibana, your proxy is expected to remove the basePath value
# before forwarding requests to Kibana. This setting cannot end in a slash.
server.basePath: "/ui/ngdiscover"
server.xsrf.disableProtection: true
16.5. NG|Storage
16.5.1. ng-storage.service
NG|Storage service.
The amount of memory it uses depends on the available memory of the host machine. A
script called generate-ngstorage-systemd-env is used to calculate that amount for each
host.
16.6. NG|Processing
16.6.1. ng-history-server.service
Uses /usr/local/ng-screener/ngprocessing/ngspark/sbin/start-history-server.sh for
service startup.
Exposes port:
spark.eventLog.enabled=true
spark.eventLog.dir=file:/usr/local/ng-screener/ngprocessing/ngspark/history
spark.history.fs.logDirectory=file:/usr/local/ng-screener/ngprocessing/ngspark/history
spark.history.fs.update.interval=5s
spark.history.fs.cleaner.enabled=true
spark.history.fs.cleaner.maxAge=2d
spark.serializer=org.apache.spark.serializer.KryoSerializer
16.6.2. ng-mesos-master.service
Uses /usr/local/ng-screener/ngprocessing/ngmesos/sbin/mesos-init-wrapper.sh master
for service startup.
The script runs Mesos in master mode, loads environment files, sets up logging and loads
configuration parameters as appropriate.
16.6.3. ng-mesos-shuffle.service
Uses /usr/local/ng-screener/ngprocessing/ngspark/sbin/start-mesos-shuffle-service.sh
for service startup.
The external Shuffle Service used is the Mesos Shuffle Service. It provides shuffle data
cleanup functionality on top of the Shuffle Service, since Mesos doesn’t yet support
notifying another framework of its termination.
16.6.4. ng-mesos-slave.service
Uses /usr/local/ng-screener/ngprocessing/ngmesos/sbin/mesos-init-wrapper.sh slave
for service startup.
The script runs Mesos in slave mode, loads environment files, sets up logging and loads
configuration parameters as appropriate.
Every Mesos slave runs Spark and opens port 4040 on the host machine during the
execution of an application. A SparkContext web UI accessible at this port displays
useful information about the application. This includes:
16.6.5. ng-thrift-server.service
The Thrift server runs permanently as a Mesos framework. It is configured to use only one
core. Uses /usr/local/ng-screener/ngprocessing/ngspark/sbin/start-thriftserver.sh for
service startup.
Since version 7.2 of the platform, a new so-called pseudo-service exists:
ng-platform.service.
It is deployed from the NG|Storage RPM and gathers together all of the following services,
provided they are installed (storage and processing nodes can each be master or slave in a
cluster, so the list applies to both master and slave installations):
• NG|Auth
◦ ng-screener-auth.service
• NG|Discover
◦ ng-discover.service
• NG|MapServer
◦ ng-mapserver.service
• NG|Messaging
◦ ng-kafka-manager.service
◦ ng-messaging.service
◦ ng-zookeeper.service
• NG|Processing
◦ ng-history-server.service
◦ ng-mesos-master.service
◦ ng-mesos-slave.service
◦ ng-mesos-shuffle.service
◦ ng-thrift-server.service
• NG|Scoring
◦ ng-scoring-api.service
◦ ng-scoring-api-ui.service
• NG|Screener
◦ httpd.ngc.service
The following commands can therefore be used to manage all these services together:
To access the dashboard list in NG|Discover, use the Dashboard menu in the left sidebar.
The view shows a list of all dashboards available in NG|Screener.
For detailed information on how to create and edit dashboards see Kibana User Guide:
https://www.elastic.co/guide/en/kibana/current/dashboard.html.
The main configuration of the dashboards and forensic views (this does not apply to control
dashboards) accessible from the left menu in NG|Screener is located at:
/etc/ng-screener/common/forensicMenu.json.
When creating such a file for a new tenant, one has to make sure it is
readable by the ng-screener user.
{
  "title": "sidebar.menu.forensic.violations",
  "forensicView": "violations",
  "iconClass": "forensic-icon forensic-violations-icon",
  "dashboardMapping": ["dashboard_ngv_[username]", "dashboard_ngv_[rolename]", "dashboard_ngv"],
  "displayMenu": false,
  "maxPeriod": "1y",
  "order": 0
}
The Home Dashboard is displayed as the default page of the NG|Screener UI after user login.
Its definition is stored in NG|Discover under the name dashboard_ngv and can be customized
per user or role.
Controls may use dashboard templates to show execution output. A dashboard definition
may be provided as JSON in the control editor, or a predefined dashboard from NG|Discover
may be used (see the NG|Screener User Guide).
Starting with version 6.1, you need to limit the maximum analysis period per dashboard.
This is configurable per dashboard because each has a different complexity. For
example, a simple dashboard can provide analysis over 1 year, whereas a very complicated
one should be limited to only 1 month, or even 1 day, to be displayable in a reasonable time.
In this file you will find an entry for every forensic view, and for each view you need to set a
correct value (depending on the client system’s performance) for the maxPeriod parameter.
"maxPeriod": "3M"
The default value (installed with the RPM) is 3 months. But if the client has a very large
number of events, you need to tune this value so that the forensic view responds in a
reasonable time (<10s). The possible units are:
• Y for years
• M for months
• D for days
• H for hours
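A small sketch of how such maxPeriod values could be interpreted (the parser below is illustrative, not the actual NG|Screener implementation; month and year lengths are approximations):

```python
import re

# Rough number of days per maxPeriod unit (Y/M/D/H as listed above);
# month and year lengths are approximations.
UNIT_DAYS = {"Y": 365.0, "M": 30.0, "D": 1.0, "H": 1.0 / 24}

def max_period_days(value):
    # Accept both "3M" and "1y" style values (the guide uses both cases).
    m = re.fullmatch(r"(\d+)([YMDHymdh])", value.strip())
    if not m:
        raise ValueError(f"bad maxPeriod: {value!r}")
    count, unit = int(m.group(1)), m.group(2).upper()
    return count * UNIT_DAYS[unit]

print(max_period_days("3M"))  # 90.0
print(max_period_days("1y"))  # 365.0
```

This makes it easy to sanity-check, for instance, that a dashboard's maxPeriod does not exceed the retention of the underlying indices.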
Based on https://www.elastic.co/guide/en/elasticsearch/guide/current/_preventing_combinatorial_explosions.html
If a visualization has multiple levels of aggregations, you need to change the search
strategy from depth-first to breadth-first. To do so, add the following JSON:
{
"collect_mode" : "breadth_first"
}
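As a sketch of where this snippet belongs: the collect_mode flag goes inside the terms block of the outer aggregation of a multi-level aggregation. The field names below are illustrative only, borrowed from the business attribute naming convention used elsewhere in this guide:

```python
# A two-level terms aggregation with breadth-first collection on the
# outer level. Field names (source_user, source_domain) are examples.
agg_body = {
    "size": 0,
    "aggs": {
        "by_user": {
            "terms": {
                "field": "source_user",
                "collect_mode": "breadth_first",  # instead of the default depth_first
            },
            "aggs": {
                "by_domain": {
                    "terms": {"field": "source_domain"}
                }
            },
        }
    },
}

print(agg_body["aggs"]["by_user"]["terms"]["collect_mode"])  # breadth_first
```

Breadth-first collection prunes low-ranking outer buckets before descending into the inner aggregation, which is what prevents the combinatorial explosion.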
18.1. Introduction
18.2. Structure
The Business Data Model contains four types of records: Layers, Channels, Transactions and
Violations. All of them share a common set of fields, which allows for the creation and
navigation of forensic views regardless of the actual type of the data. Beyond that, each
data type has its own specific set of fields. It is important to note that this is not an
exhaustive set - it just defines the minimum set of attributes that the service has to
provide. The diagram below shows the exact set of fields and their common subset (Figure
Business Data Model structure):
18.3. Mapping
To check if the services needed by NG|Screener are running, run the following:
• Check NG|Screener:
This section provides information about the location of all NG|Screener configuration files.
The individual configuration files are grouped into the following categories:
19.3.1. General
19.3.2. Licensing
19.3.5. Feeding
19.3.6. Control
19.3.7. Updater
• /etc/ng-screener/daemon/modules/realtimeAnalysis.conf: restart
NG|Screener to apply changes
19.3.9. UI
tail -f /var/log/ng-screener/daemon-all.log
Since it’s a Java application, errors usually include a stack trace of where the error
happened. If an issue occurred, please send this file to support@netguardians.ch for
assistance.
• Disk space: Check the amount of available disk space with df -h.
• Resource usage: Check the load of the system with top. The load average indicator
should be below 1.00 in normal circumstances.
top - 15:12:07 up 11 days, 23:22, 3 users, load average: 0.00, 0.00, 0.00
Tasks: 121 total, 2 running, 119 sleeping, 0 stopped, 0 zombie
Cpu(s): 0.0%us, 0.3%sy, 0.0%ni, 99.7%id, 0.0%wa, 0.0%hi, 0.0%si, 0.0%st
Mem: 2058764k total, 1965376k used, 93388k free, 445544k buffers
Swap: 8193140k total, 80384k used, 8112756k free, 653448k cached
In some time zones MariaDB is not able to use the system time zone defined in the system.
The symptoms include failing connection attempts when using JDBC (CLI access usually
works fine however), for instance in the event tracking handling scripts, throwing an
exception/message which looks like the following:
One solution is to explicitly set the server’s time zone on the client’s JDBC connection URL.
Apart from being conceptually awkward (the client side having to know where the server is
located geographically), this solution only works if all connections to the database are made
sure to assume the same server time zone (including those not made through any JDBC
layer, directly using the command line interface). The following parameter can be added to
the connection URL: &serverTimezone=UTC.
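As a sketch, appending the parameter to a connection URL (host, port, database and user below are placeholders, not the actual NG|Screener settings):

```python
def with_server_timezone(jdbc_url, tz="UTC"):
    # Append serverTimezone, using '?' or '&' depending on whether the
    # URL already carries query parameters.
    sep = "&" if "?" in jdbc_url else "?"
    return f"{jdbc_url}{sep}serverTimezone={tz}"

# Hypothetical connection URL for illustration only.
url = with_server_timezone("jdbc:mariadb://localhost:3306/ngscreener?user=ng")
print(url)  # jdbc:mariadb://localhost:3306/ngscreener?user=ng&serverTimezone=UTC
```

As the text notes, this client-side fix must be applied to every JDBC connection, which is why fixing the server's time zone is preferable.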
A better solution is to set a proper time zone on the server itself without expecting any
configuration from the clients. Since the system time zone in our case does not seem to be
recognized properly, it has to be explicitly set. Three steps are necessary:
The script shown above can also be used to upload time zone
information one by one, if only one is missing, for instance, or if it is not
acceptable to upload all system-known time zones into the database; as
a second, optional parameter, the script also accepts the time zone
name to be loaded from a directory (given as first parameter).
2. Select the right time zone and set it in the server’s configuration
[mysqld]
default_time_zone=Europe/Zurich
Once this has been done, connections to the database using the JDBC API should occur
normally, with the right time zone conversions between client and server if necessary.
In usual circumstances, when the configured partition size is big enough and there is
enough data, all partitions are expected to be balanced. That does not mean, however, that
partitions can never become hugely imbalanced. In such a case, two configuration
parameters in /etc/ng-screener/common/ngStorage.conf can be used to fine-tune this
behavior:
If an aggregation with multiple terms does not seem to take all data into account, have a
look at the daemon logs for messages like:
Normally, in this folder you should have a file called netguardians.conf.rpmsave. You
simply need to restore it and restart httpd.ngc (systemctl restart httpd.ngc).
Like a car, ngStorage was designed to allow its users to get up and running quickly, without
having to understand all of its inner workings. However, it’s only a matter of time before
you run into engine trouble here or there. This chapter will walk through five common
ngStorage challenges, and how to deal with them.
19.9.1. Problem #1: My cluster status is red or yellow. What should I do?
Cluster status is reported as red if one or more primary shards (and their replicas) are
missing, and yellow if one or more replica shards are missing. Normally, this happens when
a node drops off the cluster for whatever reason (hardware failure, long garbage collection
time, etc.). Once the node recovers, its shards will remain in an initializing state before
they transition back to active status.
The number of initializing shards typically peaks when a node rejoins the cluster, and then
drops back down as the shards transition into an active state, as shown in the graph below.
However, if you notice that your cluster status is lingering in red or yellow state for an
extended period of time, verify that the cluster is recognizing the correct number of
ngStorage nodes.
If the number of active nodes is lower than expected, it means that at least one of your
nodes lost its connection and hasn’t been able to rejoin the cluster. To find out which
node(s) left the cluster, check the logs (located by default in
/var/log/ng-screener/ngstorage/NGELK.log) for a line similar to the following:
Reasons for node failure can vary, ranging from hardware or hypervisor failures, to out-of-
memory errors. Check any of the monitoring tools outlined here for unusual changes in
performance metrics that may have occurred around the same time the node failed, such
as a sudden spike in the current rate of search or indexing requests. Once you have an idea
of what may have happened, if it is a temporary failure, you can try to get the disconnected
node(s) to recover and rejoin the cluster. If it is a permanent failure, and you are not able to
recover the node, you can add new nodes and let ngStorage take care of recovering from
any available replica shards; replica shards can be promoted to primary shards and
redistributed on the new nodes you just added.
However, if you lost both the primary and replica copy of a shard, you can try to recover as
much of the missing data as possible by using ngStorage’s snapshot and restore module. If
you’re not already familiar with this module, it can be used to store snapshots of indices
over time in a remote repository for backup purposes.
19.9.2. Problem #2: Help! Data nodes are running out of disk space
If all of your data nodes are running low on disk space, you will need to add more data
nodes to your cluster. You will also need to make sure that your indices have enough
primary shards to be able to balance their data across all those nodes.
However, if only certain nodes are running out of disk space, this is usually a sign that you
initialized an index with too few shards. If an index is composed of a few very large shards,
it’s hard for ngStorage to distribute these shards across nodes in a balanced manner.
ngStorage takes available disk space into account when allocating shards to nodes. By
default, it will not assign shards to nodes that have over 85 percent of their disk in use,
and it will switch the ngStorage node to read-only mode.
There are two remedies for low disk space. One is to remove outdated data and store it off
the cluster. This may not be a viable option for all users, but, if you’re storing time-based
data, you can store a snapshot of older indices’ data off-cluster for backup, and update the
index settings to turn off replication for those indices.
The second approach is the only option for you if you need to continue storing all of your
data on the cluster: scaling vertically or horizontally. If you choose to scale vertically, that
means upgrading your hardware. However, to avoid having to upgrade again down the line,
you should take advantage of the fact that ngStorage was designed to scale horizontally. To
better accommodate future growth, you may be better off reindexing the data and
specifying more primary shards in the newly created index (making sure that you have
enough nodes to evenly distribute the shards).
Another way to scale horizontally is to roll over the index by creating a new index, and using
an alias to join the two indices together under one namespace. Though there is technically
no limit to how much data you can store on a single shard, ngStorage recommends a soft
upper limit of 50 GB per shard, which you can use as a general guideline that signals when
it’s time to start a new index or to split the index to more shards.
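That soft limit gives a quick rule of thumb for sizing a new index (a sketch, not an official formula):

```python
import math

SOFT_LIMIT_GB = 50  # recommended soft upper limit per shard (see above)

def recommended_shards(index_size_gb, limit_gb=SOFT_LIMIT_GB):
    # At least one shard, and enough of them to keep each under the limit.
    return max(1, math.ceil(index_size_gb / limit_gb))

print(recommended_shards(120))  # 3
print(recommended_shards(40))   # 1
```

Combine this with the expected growth of the index: it is far cheaper to pick the right shard count up front than to reindex later.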
ngStorage comes pre-configured with many settings that try to ensure that you retain
enough resources for searching and indexing data. However, if your usage of ngStorage is
heavily skewed towards writes, you may find that it makes sense to tweak certain settings
to boost indexing performance, even if it means losing some search performance or data
replication. Below, we will explore a number of methods to optimize your use case for
indexing, rather than searching, data.
Shard allocation
As a high-level strategy, if you are creating an index that you plan to update
frequently, make sure you designate enough primary shards so that you can spread
the indexing load evenly across all of your nodes. The general recommendation is to
allocate one primary shard per node in your cluster, and possibly two or more
primary shards per node, but only if you have a lot of CPU and disk bandwidth on
those nodes. However, keep in mind that shard overallocation adds overhead and may
negatively impact search, since search requests need to hit every shard in the index.
On the other hand, if you assign fewer primary shards than the number of nodes, you
may create hotspots, as the nodes that contain those shards will need to handle more
indexing requests than nodes that don’t contain any of the index’s shards.
The first thing that new users do when they learn about shard overallocation is to say to
themselves:
I don’t know how big this is going to be, and I can’t change the index size later on, so to
be on the safe side, I’ll just give this index 1,000 shards…
One thousand shards—really? And you don’t think that, perhaps, between now and the time
you need to buy one thousand nodes, you may need to rethink your data model once or
twice and have to reindex?
• A shard is a Lucene index under the covers, which uses file handles, memory, and CPU
cycles.
• Every search request needs to hit a copy of every shard in the index. That’s fine if every
shard is sitting on a different node, but not if many shards have to compete for the same
resources.
• Term statistics, used to calculate relevance, are per shard. Having a small amount of
data in many shards leads to poor relevance.
If you have a 4-node cluster with 16 GB of memory on each node, your max
number of shards will be 1600.
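The note above implies a budget of roughly 25 shards per GB of node memory (4 nodes × 16 GB × 25 = 1600). A minimal sketch of that rule of thumb:

```python
# Implied by the example above: 4 nodes * 16 GB * 25 shards/GB = 1600.
SHARDS_PER_GB = 25

def max_shards(nodes, mem_gb_per_node, per_gb=SHARDS_PER_GB):
    # Upper bound on the total number of shards the cluster should carry.
    return nodes * mem_gb_per_node * per_gb

print(max_shards(4, 16))  # 1600
```

Staying under this bound keeps file handles, memory and CPU overhead per shard manageable.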
You launched a control a while ago and, based on daemon-all.log, you do not see any
progress logs. The root cause may be that Mesos is unable to assign resources to Spark.
If no other control runs and both values are equal to 0, then you have a Spark
misconfiguration. Verify in spark-default.conf or in the global.env file how much
memory is set for each Spark executor. You probably set a value higher than the amount of
CPUs/memory available (idle state).
Most of the time in POC mode, we have only 1 node but a big amount of data.
20.1. Introduction
This chapter provides information about NG|Screener and open-source licenses, and the
location of the sources of open-source software used by NetGuardians.
20.2. NG|Screener
The configuration allows specifying the required password complexity. The parameters
below allow for fine-grained tuning:
Setting lcredit, ucredit, dcredit or ocredit to a negative value makes that parameter not
count towards the total complexity.
Example:
21.1.1. Important
In order to enforce the password policy for the root user (which is strongly suggested), it is
necessary to add the "enforce_for_root" parameter.
This appendix provides an example of static data configuration files in version 5.0 and the
corresponding configuration files in versions 6.0 and 7.0. It thus provides a clue on how to
migrate static data from version 5.0 to version 6.0 and above.
<storerPath value="/data/staticdata/"/>
<translators name="userid2name">
    <sources>
        <source service="temenosT24Protocol"/>
    </sources>
    <translator name="translator1" connector-ref="connector_jdbc_oracle">
        <cache size="1000000">
            <load-query>select user_id, username from account</load-query>
        </cache>
        <initials>
            <initial id="userid" name="Initiator_User_Name"/>
        </initials>
        <target name="Initiator_User_UserId" action="Replace"/>
    </translator>
</translators>
The above configuration from version 5.0 and lower, when migrated to version 6 and above
needs to be split into two files. One is the cache configuration file to construct caches and it
is located in /etc/ng-screener/common/referencedata/. The other is the translator
configuration file to translate events, which is located:
<cacheconfig>
    <cachegroup name="account_group">
        <query>
            select user_id, username, branch_id from account
        </query>
        <cache name="account">
            user_id -> username, branch_id
        </cache>
    </cachegroup>
    <cachegroup name="branch_group">
        <query>
            select branch_id, branch_name from branch
        </query>
        <cache name="branch">
            branch_id -> branch_name
        </cache>
    </cachegroup>
</cacheconfig>
And below is the corresponding translator configuration file for version 6.x:
<translatorconfig>
    <sources>
        <source>temenosT24Protocol@*</source>
    </sources>
    <translator>
        <key>Initiator_User_Name=user_id</key>
        <value>Initiator_User_UserId=username</value>
        <value>Initiator_Process_Pid=branch_id</value>
        <value action="append">Initiator_User_Domain=branch_id.branch_name</value>
    </translator>
</translatorconfig>
And finally here comes the corresponding translator configuration file for version 7.x and
above (using business attributes and their naming convention, all lowercase):
<translatorconfig>
    <translator>
        <key>source_user=user_id</key>
        <value>source_user_id=username</value>
        <value>source_pid=branch_id</value>
        <value action="append">source_domain=branch_id.branch_name</value>
    </translator>
</translatorconfig>
• Static data had only one configuration file named config.xml, while the new version
has multiple cache and translator configuration files.
• Each connector tag in static data is migrated to a datasource tag in separate cache
configuration files in reference data.
• connectionRetryDelay tag in static data is provided in minutes, whereas in
reference data it is provided in seconds.
• storerPath tag is moved to cacheLocation property in referenceData.conf.
• For translators with the same source, we can aggregate them into one translator
configuration file.
• Load queries on the same table in different translators can be merged into one query in
cache configuration file
• The notion of cache size in static data is not valid anymore in the new reference data; we
currently use the attribute inMemorySize to keep part of the cache in memory, and the rest
is kept on disk. Leaving this attribute at its default value should be sufficient in most cases.
• The notion of orderId of translators in static data is not valid anymore. Events will
be translated in the order defined in the translator configuration file.
Static data from version 6.0 is compatible with version 7.0. The only difference is that static
data should be loaded now from /etc/ng-screener/common/referenceData instead of
/etc/ngscreener/common/referenceData.
/etc/ng-screener/daemon/modules/executor.conf
This new file was added to deal with the so-called executor module (able to launch Spark
jobs). Its content is as follows:
/etc/ng-screener/common/ng-screener.conf
The encoding of the logs in log-collector is not configurable any more (it is always UTF-8).
Therefore, the following variables have now disappeared:
• SyslogStorageFileReadEncoding
• cacheLocation
/etc/ng-screener/daemon/modules/{forensic,feeding}.conf
This file was renamed and underwent some modifications. Most notably, all NRT and
forensic-related configuration variables were removed, namely:
• forensicSessionTimeoutInMillis
• controlSessionTimeoutInMillis
• threadPoolSizeNormalization
• jdbcDriver
• serverAddress
• serverPort
• dbUsername
• dbPassword
• all nrt*
# A unique string that identifies the consumer group this consumer belongs to
# Default: ngDaemon
#kafkaGroupId = ngDaemon
# The frequency in milliseconds that the consumer offsets are auto-committed to Kafka
# if kafkaEnableAutoCommit is set to true
# Default: 1000
#kafkaAutoCommitIntervalMs = 1000
# The timeout used to detect consumer failures when using Kafka's group management
# facility
# Default: 15000
#kafkaSessionTimeoutMs = 15000
# The maximum delay between invocations of poll() when using consumer group management
# Default: 30000
#kafkaMaxPollIntervalMs = 30000
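The commented-out lines above follow the usual key = value convention, where a value commented out keeps its default. A minimal sketch of how such a file could be read (a hypothetical parser, not the actual daemon code):

```python
def parse_conf(text, defaults):
    """Parse 'key = value' lines; commented or absent keys keep their defaults."""
    settings = dict(defaults)
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith('#'):
            continue  # comments and blank lines are ignored
        key, _, value = line.partition('=')
        settings[key.strip()] = value.strip()
    return settings

defaults = {'kafkaGroupId': 'ngDaemon', 'kafkaSessionTimeoutMs': '15000'}
conf = parse_conf('kafkaGroupId = myGroup\n#kafkaSessionTimeoutMs = 20000\n', defaults)
print(conf['kafkaGroupId'])           # myGroup
print(conf['kafkaSessionTimeoutMs'])  # 15000
```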
/etc/ng-screener/common/controlCommon.conf
Here, as control templates and forensic templates completely changed forms between the
two releases, the following variables were removed:
• controlTemplatesDirectoryPath
• forensicTemplatesDirectoryPath
The target description templates were moved from
/usr/local/ng-screener/targetTemplates
to
/etc/ng-screener/daemon/modules/targetDescriptionTemplates
/etc/ng-screener/daemon/modules/control.conf
In this file, the following variables were removed:
• nrtConcurrentOnlineControls
• nrtConcurrentScheduledControls
• joinScriptsEnabled
• joinScriptsDirectoryPath
• joinScriptExecutionTimeout
Some new configuration variables were added, related to the new way controls are now run
(using Python scripts, Spark and Mesos…):
# The root path to store the results when executing spark script
# Those results will be used to fill jasper report
# Default: /data/control
#controlResultPath = /data/control
# The connection used to connect to the Thrift server to read spark result
# Thrift server is located at the same server as NgProcessing with
# default port 10000
# Default: jdbc:hive2://localhost:10000
#hiveConnection = jdbc:hive2://localhost:10000
/etc/ng-screener/common/security.conf
In those files, a new attribute is now present, called indexPattern, which may take one of
the following values:
/etc/httpd/conf.d/netguardians.conf
SSLCipherSuite EECDH+AESGCM:EDH+AESGCM:AES256+EECDH:AES256+EDH
To allow direct access to the Spark and Mesos consoles from the NG|Screener UI, some
new statements were added:
ProxyHTMLEnable On
ProxyHTMLExtended On
SetOutputFilter INFLATE;proxy-html;DEFLATE;
ProxyHTMLURLMap /static/ /sparkhistory/static/
# Proxy to mesos UI
RewriteRule "^/master/(.+)" "/mesos/master/$1" [R]
RewriteRule "^/metrics/(.+)" "/mesos/metrics/$1" [R]
/etc/httpd/conf.d/httpd.ngc.conf
For security reasons, the following statement was added, because the TRACE HTTP verb is
not used at all:
/etc/syslog-ng-rules/syslog-ng.conf
Due to violations now being written to the log collector and the new NG|Messaging product,
the following statements were added.
#Kafka destination
destination d_kafka_normal_r {
program("/usr/local/ng-screener/tools/kafkacat -P -b 127.0.0.1 -t ng-syslogEvents -z snappy" template(dt_default_r) );
};
destination d_kafka_normal_s {
program("/usr/local/ng-screener/tools/kafkacat -P -b 127.0.0.1 -t ng-syslogEvents -z snappy" template(dt_default_s) );
};
# Polling logs
log {
source(s_pipe_polling);
destination(d_file_normal_s);
destination(d_kafka_normal_s);
};
The way ngDiscover stores its objects has changed from version 6 and version 7. To support
porting objects from the old version to the new one, we developed a tool called
importDiscoverObjects.py and put it under
/usr/local/ng-screener/tools/migration-script/.
• In v7
◦ Upload the discover_objects.json file generated above to this server (assume
the file is put at /tmp/discover_objects.json)
◦ Run the following commands to import those objects to ngDiscover
cd /usr/local/ng-screener/tools/migration-script
python importDiscoverObjects.py /tmp/discover_objects.json
The following procedure only applies if the previous installation does not have multi-
tenancy. If the previous installation is multi-tenant, just skip it and apply the standard
migration.
• Uninstall ngScreener
Described here are the steps to migrate from a version without NG|Auth to a version
including it. Please read them through before starting the first step.
By default, NG|Auth is installed with only one tenant (named DEFAULT), having only one
defined role (the famous NG_Admin) and one user belonging to that role (named admin,
initial password netguardians).
If the previous installation was multi-tenant (or if the sole existing tenant’s name had been
changed from the default DEFAULT), create the missing tenant(s) in NG|Auth (tenants
correspond to so-called realms there):
To do that, one first has to grab the create_realm.zip file containing the necessary
scripts (provided with the RPM files) and run the following commands:
[root@NG-SCREENER ~] cd /tmp
[root@NG-SCREENER /tmp] unzip /path/to/create_realm.zip
[root@NG-SCREENER /tmp] create_realm/createKeycloakRealm.py \
--realm NEWREALM \
--kc-super-user superadmin \
--kc-pwd netguardians \
--auth-url https://ngscreener.bankdomain.com/auth
All added tenants now have a default admin user (initial password: netguardians).
Additionally, in case the DEFAULT tenant is not used at all on the installation, one may
want to remove it from NG|Auth entirely (we shall come to that later on).
[root@NG-SCREENER ~] cd /tmp
[root@NG-SCREENER /tmp] unzip /path/to/auth_local_migration.zip
[root@NG-SCREENER /tmp] python auth_local_migration/authMigrations.py
This will take all local (= non-tenant-related) users and roles from the previous MariaDB
database and push them (once for each tenant) into NG|Auth. Tenant-related roles are only
pushed to the relevant realm. Please see the next chapter to migrate their LDAP mappings.
It is possible that some errors are raised during the script's run. If a user or a role
already exists in the target realm, such an error will be raised, and it can safely be ignored
(for example, the admin user, which indeed already exists):
At the end of the procedure, do not forget to remove the zip file and its exploded version
from the filesystem:
This only applies in case such a configuration existed in the past. Please refer to Section
D.2 for details.
The user migration for SSO login should be transparent for Case Manager if the
authentication providers are properly configured.
212 | Appendix D: Migration from version 7.1.x to 7.2.x
NG|Screener Administration Guide
When connecting using SSO in Case Manager, the user lookup is done using the username
defined in Keycloak. If there is no match between the login on CM and the username defined
in Keycloak, a new user is created on the fly using the following attributes provided by
Keycloak: username, first name, last name and email address.
Make sure those attributes are defined for all new users, otherwise an error will occur
when you connect with the user for the first time using SSO on Case Manager.
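The on-the-fly creation described above can be pictured as follows (a hypothetical helper, not the actual Case Manager code), including the failure when a required Keycloak attribute is missing:

```python
REQUIRED_ATTRIBUTES = ('username', 'first_name', 'last_name', 'email')

def create_cm_user(keycloak_attrs):
    """Create a Case Manager user from the attributes provided by Keycloak.
    Raises if any required attribute is undefined (the error mentioned above)."""
    missing = [a for a in REQUIRED_ATTRIBUTES if not keycloak_attrs.get(a)]
    if missing:
        raise ValueError('missing Keycloak attributes: ' + ', '.join(missing))
    return dict(keycloak_attrs)

user = create_cm_user({'username': 'jdoe', 'first_name': 'John',
                       'last_name': 'Doe', 'email': 'jdoe@mybank.com'})
print(user['username'])  # jdoe
```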
Please make sure to have migrated local users prior to executing that
step.
As already mentioned above, the NG|Screener applications determine which tenant a user
is connected to from the hostname used to access the application.
In the case where the SSL certificate covers all existing tenants (for instance in a bank
where tenants correspond to business units, all associated with the same top-level domain
name mybank.com - country1.mybank.com, country2.mybank.com… - and the
certificate covers *.mybank.com), the configuration is somewhat simpler: only two virtual
host sections are necessary in the configuration file.
<VirtualHost *:80>
ServerName country1.mybank.com
ServerAlias country2.mybank.com country3.mybank.com
RewriteEngine On
</VirtualHost>
<VirtualHost *:443>
ServerName country1.mybank.com
ServerAlias country2.mybank.com country3.mybank.com
SSLEngine On
SSLCertificateFile ...path-to-the-ssl-certificate-file...
SSLCertificateKeyFile ...path-to-the-certificate-key-file...
#############################################################
# HEADERS
# Change the name of the tenant and duplicate for each tenant
RequestHeader set X-NG-TENANTID "COUNTRY1" "expr=%{HTTP_HOST} == 'country1.mybank.com'"
RequestHeader set X-NG-TENANTID "COUNTRY2" "expr=%{HTTP_HOST} == 'country2.mybank.com'"
RequestHeader set X-NG-TENANTID "COUNTRY3" "expr=%{HTTP_HOST} == 'country3.mybank.com'"
ProxyPreserveHost On
...
# Proxy to ngAuthServer
ProxyPass /auth/ http://127.0.0.1:9090/auth/
ProxyPassReverse /auth/ http://127.0.0.1:9090/auth/
...
</VirtualHost>
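The RequestHeader rules above boil down to a host-to-tenant lookup. In Python terms (an illustrative sketch only; the real mapping lives in the Apache configuration):

```python
# Host header to tenant id, mirroring the RequestHeader statements above.
TENANT_BY_HOST = {
    'country1.mybank.com': 'COUNTRY1',
    'country2.mybank.com': 'COUNTRY2',
    'country3.mybank.com': 'COUNTRY3',
}

def tenant_for_host(host):
    """Return the X-NG-TENANTID value for a given Host header, or None."""
    return TENANT_BY_HOST.get(host)

print(tenant_for_host('country2.mybank.com'))  # COUNTRY2
```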
If each tenant is actually covered by a specific SSL certificate, then a solution is to duplicate
the netguardians.conf file, once for each tenant. Each copy of the file can then be
adapted specifically for each tenant.
Only configuration files whose name ends with .conf will be taken into
account by the httpd.ngc service.
<VirtualHost *:80>
ServerName ngscreener.mybank.com
RewriteEngine On
</VirtualHost>
<VirtualHost *:443>
ServerName ngscreener.mybank.com
SSLEngine On
SSLCertificateFile ...path-to-the-ssl-certificate-file...
SSLCertificateKeyFile ...path-to-the-certificate-key-file...
#############################################################
# HEADERS
# Change the name of the tenant
RequestHeader set X-NG-TENANTID "MYBANK"
ProxyPreserveHost On
...
# Proxy to ngAuthServer
ProxyPass /auth/ http://127.0.0.1:9090/auth/
ProxyPassReverse /auth/ http://127.0.0.1:9090/auth/
...
</VirtualHost>
We could imagine the above example duplicated for another tenant with, in the second copy:
The following command should be used to import each referenced certificate into the Java
keystore.
In order to configure LDAP users and/or roles, one must access the specific NG|Auth UI.
This can be done in two ways, depending on the current state of the installation.
• if the new version of the NG|Screener packages was not installed yet, the only way to
access the NG|Auth administration console is through the use of an SSH tunnel.
From a local shell (Linux machine or Windows machine with Cygwin installed) or using a
specific PuTTY session configuration (Windows machine), a local port has to be mapped
to the appliance's port 9090 (bound only on the loopback interface, i.e. on localhost).
As an example, the following command binds local port 9999 to the appliance's port
9090, so that using the http://localhost:9999/auth/admin URL in a local
browser will connect to the appliance's NG|Auth administration console:
• if the new version of the NG|Screener packages was installed already, and the
httpd.ngc service restarted, then the administration console may be reached directly
on the appliance's public network address at the following URL:
https://appliance.client.com/auth/admin
Right after the login page, choose the tenant (denoted realm in NG|Auth) for which the
LDAP configuration should occur.
Once the tenant/realm has been chosen (the steps will have to be taken for each concerned
tenant), click on the User Federation section on the left pane, and choose a new LDAP
provider.
provider.
Fill in the presented properties, making sure to include at least the following set (and read
the tooltips that pop up when hovering the mouse pointer):
The password entered here should be the plain-text version, not the
potentially encrypted form present in the
/etc/ng-screener/common/security.conf configuration file.
Encryption/hashing will now occur at NG|Auth level.
• Periodic Full Sync and Periodic Changed Users Sync: consider activating
one or both of these settings to enable synchronization between the LDAP provider and
NG|Auth (when users are created, removed…)
When all fields have been filled properly, click the Save button to make the setting
persistent. From this moment, a new Mappers tab appears for the LDAP configuration.
The role names to which the LDAP groups will be mapped are taken from
an attribute on the LDAP group.
This constraint did not exist in previous versions of NG|Screener where role
names could be set arbitrarily, independently of the actual LDAP groups'
attribute values.
• Name: unique name for the mapper, no specific meaning except for documentation (and
maintainability), mandatory attribute
• Mapper Type: should be role-ldap-mapper
• LDAP Roles DN: DN of the hierarchy where the LDAP groups may be found (does not
have to cover all groups, as several mappers of the same type may be added to a LDAP
provider, each of them potentially covering only part of the whole spectrum)
• Role Name LDAP Attribute: name of the attribute found in the LDAP group whose
value is used to build the role name
• Role Object Classes: LDAP class(es) to qualify an LDAP entry as a group in the
given hierarchy of objects
Once connected to the NG|Auth admin console (see previous chapter), one can destroy
realms that are no longer used, by clicking on the small bin icon next to the realm
name (highlighted in red on the following screenshot).
This brings the huge advantage of allowing Single Sign-On functionality on the appliance, but
it also has an impact on the various scripts that want to connect to those applications as
well.
Especially for Python scripts, a new class has been defined which intends to abstract those
SSO considerations away when authenticating to the applications (to access their respective
REST API endpoints, for instance).
E.1. Authentication
• line 1: import of the AuthToken class (it may be necessary to either put the
authtoken.py file in a system location or explicitly add its location to the system
path, using a construct similar to
import sys
sys.path.insert(0, '/usr/local/ng-screener/tools/auth')
• lines 4 to 6: user (existing in the tenant defined at line 6) name and password are required
Examples:
The following example calls for the list of available channels from the NG|Screener
application (through the UI application) and displays a pretty-printed JSON representation
of the returned list on the standard output:
The following example calls for the list of existing issues from NG|CaseManager and
dumps them to the standard output:
1 import json
2
3 ...
4
5 url = 'https://ngscreener.mybank.com/cm/issues.json'
6 headers = {
7 'X-Redmine-API-Key': '862af85646b3a929d94b7601a72c33eba52e4a5d'
8 }
9
10 with AuthToken(**param) as token:
11 response = token.get_call(url, headers)
12 if response.status_code == 200:
13 print json.dumps(response.json(), indent=2, separators=(',', ': '))
14 else:
15 print 'Error code returned: %d' % response.status_code
E.4. Troubleshooting
Normally, each time a request is sent through the token instance, the OAuth2
authentication token is refreshed if it happens to have expired since the last request was
made. This refresh is only attempted once (per request), when the request's return code is
401. If another reason was responsible for the 401 return code, the second attempt will fail
again, token refreshing will not be attempted again, and the 401 response will be
transmitted to the caller.
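The refresh-once behaviour can be sketched as follows (FakeSession is a stand-in for the real AuthToken instance, whose interface is assumed here for illustration):

```python
class FakeSession:
    """Stand-in for the AuthToken wrapper; returns pre-canned status codes."""
    def __init__(self, codes):
        self.codes = list(codes)
        self.refreshes = 0

    def get(self, url):
        return self.codes.pop(0)

    def refresh_token(self):
        self.refreshes += 1

def get_with_refresh(session, url):
    """Send a request; on a 401, refresh the token and retry exactly once."""
    status = session.get(url)
    if status == 401:
        session.refresh_token()
        status = session.get(url)
    return status

# Token expired: the first call returns 401, the retry succeeds.
session = FakeSession([401, 200])
print(get_with_refresh(session, '/cm/issues.json'))  # 200
```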
To get more information about what is actually going on, it may be worth activating the so-
called DEBUG mode, which dumps a lot of information to the standard output.
To activate this DEBUG mode, just configure Python's logging mechanism accordingly:
1 import sys
2 import logging
3
4 logging.basicConfig(stream=sys.stdout,
5 format='%(asctime)s [%(levelname)s] %(message)s',
6 level=logging.DEBUG)
To allow the new client's access tokens to grant access to other applications, one needs
so-called client scopes, preferably one client scope per targeted application (so that they
can be re-used and mixed for several clients if necessary later).
By convention, the existing client scopes created specifically for this purpose on our
solution are named targetApplication-audience. Per default only the ngDaemon-
To create a new one, please follow these steps (example here with an ngBrowser-
audience client scope):
1. Fill the scope name, check that the protocol is openid-connect and save
2. Switch to the Mappers tab and add a new mapper of type Audience (the name is
mandatory and should be explicit, obviously, but does not have to match the client
scope’s), for which the Included Client Audience field should be filled with the
client representing the target application (ngBrowser in our example).
2. Give it a new name (this name will be its so-called client_id) and save
4. Go to the Client Scopes tab and select the required client scopes from the
Available Client Scopes section before clicking on the Add selected button
Done! This client (through its client id) can now be used as seen in Section E.1.
The first reason for this framework comes from the Event Handling solution.
The Event Handling solution runs after control execution, and for a good reason:
the goal of Event Handling is to analyze multiple hits generated by some profiling controls
to validate whether a hit is really a hit (e.g. ControlA and ControlB and ControlC raise a hit,
so this is really a hit. Result: create a hit in CM).
Due to this fact, we cannot use the standard way to create a hit (the NGScreener target
solution). To solve this problem, a framework has been put in place to create a hit with
Python code (the same language as the Event Handling solution).
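The validation idea from the example above (a hit only counts when several controls agree) can be reduced to a short check. This is an illustrative sketch, not the actual Event Handling code; the control names are the ones from the example:

```python
def is_real_hit(raised_by, required=('ControlA', 'ControlB', 'ControlC')):
    """A hit is confirmed only if every required control raised it."""
    return all(control in raised_by for control in required)

print(is_real_hit({'ControlA', 'ControlB', 'ControlC'}))  # True
print(is_real_hit({'ControlA'}))                          # False
```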
F.2. Prerequisites
The Python exporter framework uses the official channel/target solution for configuration
purposes. To use it, you must create a dedicated target in NGScreener.
To explain this point, we will start by analyzing some code and explaining it.
6.  import sys
7.  if __name__ == "__main__":
8.      target_manager = ng_target.TargetManagement()
9.      channel_json = target_manager.get_channel("Case Manager")
10.     c = channel.Channel()
11.     c.parse_channel(channel_json, target_manager.get_target(channel_json, 'Pr02UnusualLocation'))
12.     parser = ng_parser.Parser()
13.     parser.template_filename = "/usr/local/ng-screener/targetTemplates/" + c.target.custom_description.template
14.     parser.parse_arguments(sys.argv[1:])
Let us start with the first 5 lines. These lines are mandatory; all these imports provide the
necessary code to run the hit exportation.
Line 6 is a contextual import (it depends on the use case). The explanation for it will come
with line 14.
Now we enter the real subject. We will start with the channel and target part:
• At line 8, we initialize an object to manage a target element. Basically, this class
provides two important things:
◦ get_channel: with it, we can recover a dedicated channel based on its name.
◦ get_target: with it (and only when we have a channel), we can recover a dedicated
target based on its name.
• At line 9, we use the get_channel method (described previously) to recover all the
information regarding the channel (Case Manager in this case), and we also get all the
targets related to the channel.
• At line 10, we initialize an object to manage the channel and target elements received from
the TargetManagement object.
• At line 12, we initialize a parser object. This object has two main goals:
◦ The first is to parse the HTML template to generate the description render.
◦ The second is the replacement of the special tokens by the control/source values
(e.g. in the target definition, we define that custom field 3 receives the value of
column 6 of the control).
• At line 13, we provide the HTML template source file.
• At line 14, this is a specific line of code. In our example, we don't use the Event Handler
solution but the NGScreener Command channel. This line parses the command
arguments to parse the file (basically, in shell script we have the special variables $1 .. $n,
but not in Python; to parse them, we go through this method).
• At line 15, we initialize an object to map the data from the source file into a dedicated
object.
• At line 16, we do the mapping explained in the previous line.
• At line 17, we have all the required elements to create a custom description. Therefore,
we initialize an object to create it at this stage.
Now that we have seen a complete example of how to use this framework, we will see
which parts could change in this example.
The first thing could be line 14. This part depends on the information source (in our
example, a CSV that comes from the control result). Depending on your source, this might
have to change.
The second thing is lines 19-20: you could send to different targets depending
on a certain value in the result (in this case, don't forget to instantiate multiple
channel and target elements).
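Routing rows to different targets based on a result value could look like this (the threshold, the field and the target names are purely illustrative):

```python
def route_target(row, targets):
    """Pick a target name based on a value in the result row."""
    return targets['priority'] if row['amount'] > 10000 else targets['default']

targets = {'priority': 'Pr02UnusualLocationHigh',
           'default': 'Pr02UnusualLocation'}
print(route_target({'amount': 25000}, targets))  # Pr02UnusualLocationHigh
print(route_target({'amount': 100}, targets))    # Pr02UnusualLocation
```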
F.4. Conclusion
With this framework, you have more granularity to manage hit creation in
NGCaseManager based on result values.
• Normalization
• Control Execution
G.1. Normalization
• Syslog doesn't care about timezones; it takes the date/time and writes them into
ngMessaging and log-collector.
If the syslog process is run in Singapore with timezone GMT+8, the log sent to
ngMessaging and log-collector is:
We can see that syslog only takes the date/time presented in the message header
without considering its timezone. The event dates passed to ngMessaging and
log-collector are local to the server's timezone.
Example: the following log line in log-collector means that the event occurred at
26/01/2019 00:00:00 server time.
{
...
"@timestamp": "2019-01-26T00:00:00+08:00",
"host": "Host",
"service": "Service"
...
}
When importing logs from a remote server, try to convert their timestamps
into the server's time before sending them to syslog.
A translator scripted field may extract the hour from an event and save it as part of
the normalized event. From the users' point of view, this hour is not the one they
expect. Example: a user in Geneva does a transaction at 10am; it is stored on the
server in Singapore as 5pm, and the hour part of the event is 17 instead of
10. This leads to rewriting controls for tenants in different
timezones. It also affects aggregations based on hour-range buckets.
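The Geneva/Singapore shift in the example above can be verified with Python's zoneinfo module (a sketch for illustration; the product itself does not necessarily use this module):

```python
from datetime import datetime
from zoneinfo import ZoneInfo

# A transaction at 10am in Geneva (GMT+1 in January)...
geneva = datetime(2019, 1, 26, 10, 0, tzinfo=ZoneInfo('Europe/Zurich'))
# ...is 5pm on a server running in Singapore (GMT+8).
singapore = geneva.astimezone(ZoneInfo('Asia/Singapore'))
print(singapore.hour)  # 17
```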
Example: the user is in Geneva (GMT+1); he schedules a control to run for the first time at
2019-01-26 00:00:00. The date string on the wire is 2019-01-25T23:00:00.000Z.
• UI backend: it converts the date string in UTC format into its local date time (no
timezone information).
Example: for the example above, if the server is installed in Singapore (GMT+8), the date
is represented as 2019-01-26 07:00:00 server time.
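These two conversions (user local time to UTC on the wire, then UTC to the server's local time) can be reproduced with Python's zoneinfo module (illustrative only):

```python
from datetime import datetime, timezone
from zoneinfo import ZoneInfo

# The user in Geneva (GMT+1 in January) schedules the control for local midnight.
local = datetime(2019, 1, 26, 0, 0, tzinfo=ZoneInfo('Europe/Zurich'))

# On the wire the date travels as a UTC string.
wire = local.astimezone(timezone.utc)
print(wire.strftime('%Y-%m-%dT%H:%M:%S.000Z'))  # 2019-01-25T23:00:00.000Z

# The backend in Singapore (GMT+8) converts it back to its own local time.
server = wire.astimezone(ZoneInfo('Asia/Singapore'))
print(server.strftime('%Y-%m-%d %H:%M:%S'))  # 2019-01-26 07:00:00
```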
• if the control execution period is last N TimeUnit, that information is sent to the UI
backend; the control execution period is calculated into an absolute period by the backend,
using its local date time. Then it stores this local date time in the database.
• if the control execution period is an absolute date range: similar to the case in Section
G.3.1, the date in the user's timezone is converted into a string in UTC format, then
converted back to local date time in the server's timezone and stored in the database as
local date time without timezone.
• Control Execution: similar to the UI backend, dates presented in this module are in local
date time. When sending date information to the NgProcessing module, it converts the date
object into a string with its timezone information.
• NgProcessing: it processes the string passed from the Control Execution module, converts
it into a Python date object with timezone information, and processes it based on this date
object. All requests to NgStorage are timezone-aware.
• NgStorage: timestamps are stored as date objects with timezone information.
The control filter in the control Template Configuration and Spark custom code
are unaffected by those timezone conversions.
If it exports the whole PDF file to NgCaseManager, we have the problem described in
Section G.3.4.
If it exports each line of the report to generate a case in NgCaseManager, the date is
presented as a string without timezone information. Then a date 2019-01-26 07:00:00
on the Singapore server is displayed as 2019-01-26 07:00:00 in Geneva.
Jupyter Notebook can be set up with the help of virtualenv. This helps keep your system
clean, since you don't install system-wide libraries that you are only going to need in a
Jupyter Notebook environment.
To set up Jupyter Notebook, execute the following commands on your VM (logged in as the
ng-screener user).
#install requirements
pip install pyspark==2.4.0 jupyter
Then you can just start the jupyter notebook server by typing the command:
jupyter notebook
To access the Jupyter Notebook site locally, you need to forward port 8888 via an SSH tunnel:
To activate an already existing environment, just run the source bin/activate command in
the folder. Then you can start jupyter notebook.