
MÁSTER EN BIG DATA ANALYTICS – 2017

UNIVERSIDAD POLITÉCNICA DE VALENCIA (UPV)

Big Data Architectures


for Investment Banking

Ignacio Sales
15.07.2017
AGENDA

Session 1: Introduction to Investment Banking

Session 2: Big Data Architectures in Investment Banking

Session 3: Machine Learning in Investment Banking


AGENDA

1. Introductions and Objectives


2. Investment Banking at 10,000 feet
3. The Trade Life Cycle
4. Big Data in Investment Banking
5. Hands-On Exercise
1.- INTRODUCTION AND OBJECTIVES

Introduction – Ignacio Sales Saborit

Senior Software Architect with over 20 years of experience designing solutions on the Java
Enterprise and .NET platforms. For the last five years, specializing in Big Data solutions for the
Financial Services industry.
Current:
Big Data Senior Architect at GFT IT Consulting (Valencia)

 GFT Data Practice co-owner


 Trainer on Big Data Technologies: Hadoop Development, Administration
Previous:
Senior Consultant at IBM Global Services (Madrid)
Certifications:
 Cloudera Certified Developer for Apache Hadoop (CCD-410)
 Cloudera Certified Administrator for Apache Hadoop (CCAH)
 Cloudera Certified Specialist in Apache HBase CCSHB (CDH4)
ignacio.sales@gft.com
https://es.linkedin.com/in/ignacio-sales

GFT Group 03.09.2015 4



Course Objectives

 Session 1: To provide an overview of the Investment Banking Industry


 Main actors
 Asset classes
 Trade lifecycle
 High level functions – with examples of Big Data implementations
 Specific considerations for the use of Big Data in Investment Banking

 Session 2: To describe some representative Big Data architectural patterns


 From batch to real-time
 Modern data-centric architectures
 Cloud-based architectures

 Session 3: To present an overview of Machine Learning Techniques in Investment Banking


 Machine Learning models
 Use case: Hedge Fund & Trading Engines



Agenda
1. Introduction to GFT
 GFT Group Overview
 GFT Data Practice
2. Investment Banking at 10,000 Feet
 Definition of Investment Banking
 Structure of an Investment Bank
 Asset Classes
3. The Trade LifeCycle
 Pre-Transaction – Client Onboarding
 Demo: Machine Learning for Client Onboarding
 Transacting – Regulatory Reporting
 Financial Accounting – Accounting Control
 Risk calculation – Risk Reporting
4. Big Data in Investment Banking
 The three V’s in Investment Banking
 Investment Banking-specific challenges
 Big Data Adoption: Do’s and Don’ts
5. Hands-On Exercise:
 Introduction to Databricks
 Overview of ETL in Banking exercise


GFT Group overview

GFT Group is a business change and technology consultancy trusted by the world’s leading
financial services institutions to solve their most critical challenges.

Specifically defining answers to the current constant of regulatory change – whilst innovating to
meet the demands of the digital revolution.

GFT Group brings together advisory, creative and technology capabilities with innovation culture
and specialist knowledge of the finance sector, to transform the client’s businesses.

Utilising the CODE_n innovation platform, GFT is able to provide international startups, technology
pioneers and established companies access to a global network, which enables them to tap into the
disruptive trends in financial services markets and harness them for their out of the box thinking.


GFT At a Glance

STAFF
Approximately 4,000 employees
Headcount by year: 2009: 1,096; 2010: 1,300; 2011: 1,337; 2012: 1,386; 2013: 2,111; 2014: 3,248

GLOBAL DELIVERY MODEL

STRONG INTERNATIONAL PRESENCE
Locations in 12 countries: Germany, Brazil, Canada, Costa Rica, Italy, Mexico, Poland, Spain,
Switzerland, UK, USA

INNOVATION ECOSYSTEM
Global community of more than 1,000 digital pioneers with startups, established
corporations, media and politicians


GFT Offering: Advisory, Creative, Technology

Capital Markets
 Front-office Innovation
 Regulatory Compliance
 Risk Management
 Operational Efficiency

Retail Banking
 Digital Transformation
 Core Banking
 Operational Efficiency
 Regulatory Compliance

Insurance
 Claims Management
 Digital Transformation
 Operational Efficiency

Private Wealth
 Core Banking

CHALLENGES AND SOLUTIONS

Digitalisation
• Challenge: Fintechs increasingly occupying parts of the value chain
• Solution: GFT Digital Banking lab develops innovative solutions for the bank of the future

Regulation
• Challenge: Growing compliance and regulation requirements
• Solution: As sector specialist, GFT offers onshore consultancy by experienced industry experts
and qualified nearshore implementation services

Delivery Efficiency
• Challenge: Pressure to reduce costs for maintaining core banking systems due to falling margins
• Solution: GFT delivers IT services at attractive prices from nearshore hubs in Spain, Poland,
Brazil and Costa Rica for the European and US market


Trusted by the world's leading financial institutions

Capital
Markets

Retail
Banking

Insurance

Private
Wealth


GFT Data Practice

Focused on both functional and technical aspects, covering the end-to-end data lifecycle: data
sourcing, transformation, quality assurance, analytics, persistence, visualisation, BI reporting and
governance.

The main effort is dedicated to research & development of key technologies in the following
areas: Big Data, Data Analytics, Social Mining, Stream Processing and Blockchain.

The practice provides the following services both internally and externally: asset creation,
training, project and pre-sales support, and communication.

The practice is complemented by two Knowledge Communities (KC): Data Management KC and Business
Intelligence KC.


Data Practice

CERTIFICATIONS
25 people with certifications on Hadoop Development & Administration, Elasticsearch, SAP-HANA,
MongoDB, Spark and HBase

POCS
10 PoCs developed on Anti-Money Laundering/KYC, Hadoop-based ETL, HPC, Blockchain and
Real-time Anomaly Detection

TRAINING
Over 300 people trained on Hadoop Developer & Administration, Scala, Machine Learning
Techniques, Storm, Spark…

PROJECTS
Supported over 11 projects in 6 clients: Volcker Reporting, Financial Information Management,
DaaS, Balance Generation, Data Quality…



CHAPTER 2

Investment Banking at
10,000 feet
2.- INVESTMENT BANKING AT 10,000 FEET

What is Investment Banking?

 From Investopedia (http://www.investopedia.com/terms/i/investment-banking.asp):
“Investment banking is a specific division of banking related to the creation
of capital for other companies, governments and other entities. Investment
banks underwrite new debt and equity securities for all types of corporations, aid
in the sale of securities, and help to facilitate mergers and
acquisitions, reorganizations and broker trades for both institutions and private
investors. Investment banks also provide guidance to issuers regarding the issue
and placement of stock.”


Banks at a Glance

 Numerous types of banks exist, each of which provides a specific set of services to its
customers:
 Retail banks
• deal directly with individuals and small to medium-sized enterprises, offering
payment services, savings products, mortgages, credit cards and insurance
 Savings banks
• offer savings products to a wide range of the public. Savings banks are now
included within the retail bank sector

 Private banks
• advise high net worth individuals and manage their assets
 Commercial banks
• deal mostly with deposits, loans and financing for large corporations. Commercial
banks often also sell more complex banking products to their clients (via
the investment bank)
 Investment banks
• act as an intermediary between an issuer of securities and the investing public:
they trade financial instruments and foreign exchange, make markets (bring buyer
to seller), underwrite stock and bond issues, and advise corporations on capital
markets activities (see the functions of an investment bank in the next slides)


Investment Banking Main Functions

 Investment Banking can be broken into two areas:

 Banking business: raising capital and executing M&A transactions for corporate
clients; raising capital for government clients
 Arranges financing for corporations and governments
 Debt (bonds)
 Equity (stock)
 Convertibles (Convertibles are securities, usually bonds or preferred shares, that can be converted
into common stock)
 Advises on mergers and acquisitions (M&A) transactions

 Trading business: providing investing, intermediating and risk-management services
to institutional investor clients, performing research, and also participating in
non-client-related investing activities
 Client Trading
 Sells and trades securities and other financial assets as intermediary on behalf of investing clients
 Operates in two business units: (1) Equity and (2) Fixed Income, Currency & Commodities (FICC)
 Research is provided to investing clients
 Proprietary Trading and Principal Investing
 Investment activity by the bank that affects the bank's own accounts
 Focuses on investments in equity (public and private), bonds, convertibles and derivatives in a
manner similar to the investment activities of hedge funds and private equity funds


Investment Bank Organizational Structure

 A classic way to organise the trade-related activities of an Investment Bank is by whether
an area interacts with clients or performs supporting tasks in the background:
 Front Office: Any business unit which directly interacts with the customers and
counterparties such as
 Relationship Management
 Sales
 Traders

 Middle Office: These units perform supporting and controlling activities for the Front
Office and may occasionally interact with the counterparties or other involved parties
(Stock Exchanges, Brokers, Custodians, etc.) during the trade processing such as
 Trade Processing, Reconciliation and Clearing
 Profit & Loss (P&L) calculation and verification
 Market and Credit Risk calculation and monitoring
 Limit and Position Controlling
 Financial Control
 Collateral Management

 Back Office: the back office units are usually responsible for performing all Trade
Settlement, Accounting and Reporting activities:
 Trade Settlement (payments and deliveries, incl. physical delivery of e.g. gold bullion)
 Record Maintenance (account booking and adjustments in General Ledger [GL])
 P&L booking
 Taxation
 Regulatory and MI reporting


Asset Classes

 Financial instruments can be classified in different ways. The criterion most often used is
Asset Class.
 Asset Class: financial instruments or securities that share the same characteristics and
behaviour in the market place and that are usually regulated in the same way:
 Equities: stocks
 Fixed Income: bonds
 Commodities: precious metals, crude oil
 Foreign Exchange (FX)
 Cash Equivalents: Money Market loans and deposits
 Real Estate
 Derivatives: Futures, Options, etc

 Other criteria describe how or where these instruments are traded or certain
characteristics of the instruments.
 Market Place describes whether the instruments are traded in a regulated market environment e.g.
Stock or Commodities Exchange or Over-the-Counter (OTC)
 Term / Maturity is especially used for fixed income products to describe the length of the loan or
deposit:
 Short Term – Up to one year
 Medium Term – between one year and five years
 Long Term – five years or longer
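The maturity buckets above can be written down as a small classification rule. This is a minimal Python sketch: the function name `classify_term` is hypothetical, and the handling of the exact boundaries (exactly one year counts as Short Term, exactly five years as Long Term) is an assumption read from the wording of the slide rather than a market convention.

```python
from datetime import date


def _add_years(d: date, years: int) -> date:
    """Return the same calendar day `years` later (29 Feb falls back to 28 Feb)."""
    try:
        return d.replace(year=d.year + years)
    except ValueError:
        return d.replace(year=d.year + years, day=28)


def classify_term(start: date, maturity: date) -> str:
    """Bucket a loan or deposit by the time between its start and maturity dates."""
    if maturity <= start:
        raise ValueError("maturity must be after the start date")
    if maturity <= _add_years(start, 1):
        return "Short Term"   # up to one year
    if maturity < _add_years(start, 5):
        return "Medium Term"  # between one year and five years
    return "Long Term"        # five years or longer


# A deposit running from 15 Jul 2017 to 15 Jul 2020 lasts three years.
print(classify_term(date(2017, 7, 15), date(2020, 7, 15)))
```

Comparing calendar anniversaries instead of counting days avoids leap-year drift at the one-year and five-year boundaries.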


High Level Data Flows


System Map – In Reality



CHAPTER 3

The Trade Life Cycle


3.- THE TRADE LIFECYCLE

The Trade LifeCycle

[Diagram: the trade life cycle]
Phases: Pre-Transaction, Transacting, Post-Transaction, End of Day / End of Month, Financial
Accounting, Accounting & Control, Reporting, Management & Analysis
Timeline: Pre-Trade, T0 Intra-Day, T0 COB – T+1 SOD, T+1 – T+4
Functions shown: Client On-Boarding, Reference Data Management, New Product Approval, Trade
Capture, Trade Lifecycle Event Management, Transactional Data Matching, Settlements,
Transactional Regulatory Reporting, Asset Services, Market Data Management, Valuation &
Analytics, Trading Risk & P&L, Collateral Management, Client Valuations, Valuation Control,
Business Control, Product Control, Treasury, Tax Processing, Funding & FX Control, Regulatory
Capital, Accounting Control, Accounting, Regulatory & Reporting Policy, Financial Reporting,
Risk Reporting, Financial Management & Analysis, Risk Management & Analysis


Client onBoarding – KYC / AML

 Private Wealth Management departments are under increased regulatory pressure to
implement Enhanced Due Diligence (EDD) Anti-Money Laundering programs

 A component of these EDD programs is the gathering of “relevant adverse information”,
defined as:
“Information obtained from any source, including the Internet, free and subscription
databases and the media, which is directly or indirectly indicative of involvement in
money laundering, terrorist financing or predicate offences.”

 The objective of this service is to support decision making in EDD workflows with
advanced Big Data mining algorithms


Data Types – Reference Data

Counterparty
All financial transactions have two participating parties. The counterparty is the second
party which participates in a financial transaction.
Every buyer of an asset must be paired up with a seller that is willing to sell, and vice versa.
Counterparty reference data will include:
• counterparty name
• counterparty type (inter / intra company)

Organisation
Hierarchies exist which enable the mapping of financial data to the organisation, e.g. to a
cost centre, to a particular office location, etc. Organisations may have multiple
organisational hierarchies by which they want to manage information.
• Business hierarchy (Book, Desk, Business Unit, Region…)
• Financial hierarchy (GL Account, Cost Centre…)

Product
A product is a high-level grouping of financial instruments / securities: FX Spot, Equity
Derivatives, Corporate Bonds.
Product reference data will include:
• product codes
• product name, e.g. Bond 3 Year
• product category, e.g. Bonds

Instrument
An instrument is a unique security which can, for example, be traded on an exchange. AAPL is
the unique ticker for Apple Computer stock equity that can be traded on the NASDAQ exchange.
Instrument reference data will include:
• instrument number
• instrument type

All four reference data types are linked to Trade Data.
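As an illustration of how trade data depends on these reference data types, here is a minimal Python sketch. The class and field names follow the attributes listed on the slide; the `product_code` link on `Instrument`, the `Trade` fields, the `enrich` helper and all sample values are assumptions made for the example, not the data model of any real system.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Counterparty:
    name: str
    counterparty_type: str  # "inter" or "intra" company


@dataclass(frozen=True)
class Product:
    code: str
    name: str      # e.g. "Bond 3 Year"
    category: str  # e.g. "Bonds"


@dataclass(frozen=True)
class Instrument:
    number: str
    instrument_type: str
    product_code: str  # assumed link from an instrument to its product grouping


@dataclass(frozen=True)
class Trade:
    trade_id: str
    counterparty_name: str
    instrument_number: str
    book: str
    quantity: float


def enrich(trade, counterparties, instruments, products, business_hierarchy):
    """Resolve the reference data keys of a trade; an unknown key raises KeyError."""
    counterparty = counterparties[trade.counterparty_name]
    instrument = instruments[trade.instrument_number]
    product = products[instrument.product_code]
    desk, business_unit, region = business_hierarchy[trade.book]
    return {
        "trade_id": trade.trade_id,
        "counterparty_type": counterparty.counterparty_type,
        "instrument_type": instrument.instrument_type,
        "product_category": product.category,
        "book": trade.book,
        "desk": desk,
        "business_unit": business_unit,
        "region": region,
        "quantity": trade.quantity,
    }


# Hypothetical sample data.
counterparties = {"ACME Fund": Counterparty("ACME Fund", "inter")}
products = {"EQ": Product("EQ", "Cash Equity", "Equities")}
instruments = {"AAPL": Instrument("AAPL", "Common Stock", "EQ")}
business_hierarchy = {"BOOK1": ("US Equities Desk", "Equities", "Americas")}

trade = Trade("T1", "ACME Fund", "AAPL", "BOOK1", 100.0)
print(enrich(trade, counterparties, instruments, products, business_hierarchy))
```

Failing loudly on a missing key is a deliberate choice in this sketch: a trade that cannot be mapped to its reference data cannot be reported against the business or financial hierarchy.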


Regulatory Environment

Regulatory Policies

MIFID
The Markets in Financial Instruments Directive (MIFID) is a European Union law that provides
harmonised regulation for investment services across the 30 member states of the European
Economic Area.
 Authorisation, regulation and passporting: once a financial institution is authorised in one EU
state, it can operate in the others
 Client categorisation: firms must categorise clients as "eligible counterparties", professional
clients or retail clients

SOX
The Sarbanes-Oxley Act of 2002 (SOX) was introduced under US federal law to prevent
recurrences of major accounting scandals such as the one caused by Enron.
 Includes rules around public company accounting, auditor independence, corporate responsibility
and analysts’ conflicts of interest.
 Companies must report annually on the operational effectiveness of the internal controls relating
to financial reporting. The company’s auditors must also attest to and report on the board’s
assertions.

Dodd-Frank
Dodd-Frank was introduced in 2010 and brought in financial regulation in the United States in
response to the financial crisis of 2007-2010.
 Consolidation of regulatory agencies and evaluation of systemic risk
 Increased transparency of derivatives trading
 Volcker Rule – prohibits proprietary trading by depository banks


Finance Regulatory Agenda


[Timeline chart, 1985 to 2020. Annotations: increasing frequency, increasing concurrency,
extended periods of uncertainty]

Regulation                   Period
Basel Accord                 1988 – 1992
IAS 30                       Aug 90 – Jan 91
FAS 133                      Jun 98 – Jan 01
Sarbanes Oxley               Jul 02 – Dec 04
IAS 32 / IAS 39              Dec 03 – Jan 05
MiFID I                      Apr 04 – Nov 07
Basel II                     Jun 04 – Apr 08
IFRS 7                       Aug 05 – Jan 07
FAS 157                      Sep 06 – Nov 07
IFRS 8                       Nov 06 – Jan 09
Dodd Frank                   Jun 09 – Jul 10
IFRS 9                       Nov 09 – Jan 18
IFRS 10 / 11 / 12 / 13       May 11 – Jan 13
Basel III                    Jun 11 – Jan 19
BCBS 239                     Jan 13 – Jan 16
Volcker                      Dec 13 – Jul 15
2013 Banking Reform Act      Dec 13 – Jan 19
Structural Measures          Jan 14 – Jan 19
MiFID II                     May 14 – Jan 18


Volcker Rule Reporting

Business Challenge
• The Volcker Rule is a provision of the U.S. Dodd-Frank Wall Street Reform Act
• A large UK-based investment bank needed to produce regulatory reports on its proprietary trading
• The technological platform had to allow both high volume and rapid processing

Outcome / Benefits
• Able to calculate several metrics, such as Inventory Aging, CFTR and ITR of positions, in order
to determine the extent of proprietary trading
• Management of huge volumes of data: fine-granularity trade data for key businesses
• Rapid calculation of key figures for Volcker reporting
• Scalable to meet future needs of the bank

GFT Performance
• GFT implemented a new system using the big data platform Hadoop
• MapReduce was used for data importing, transformation and calculation
• Sqoop was used to implement the interface to relational databases
• QlikView was implemented as the main reporting tool

Volumes
• Receives data from over 50 trade and position sources
• Processes 750 million trade events/day
• Total 174 TB of historic data
• Cluster: 22 nodes (4x4 cores/each + 98GB RAM)


Volcker Rule Reporting – High Level Architecture

[Architecture diagram]
Source Systems (x50) deliver CSV files to Normalizers, which produce Normalized Data.
Five Metric Calculations run on the Normalized Data, each producing its own Metric Data:
Inventory Aging, CFTR, ITR, Cov. Funds and RENT-D.
Further components shown: Exporter, Server, QlikView Dashboard, Orchestration, Scheduling.
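The normalise-then-calculate pattern of this architecture can be sketched in a few lines of Python. The project itself used MapReduce on Hadoop; the code below only illustrates the idea on in-memory CSV text. The source names, column mappings and flag values are hypothetical, and the ratio is a simplified count-based reading of a customer-facing trade ratio (trades with customers divided by trades with non-customers per desk), not the regulatory definition.

```python
import csv
import io
from collections import defaultdict

# Hypothetical mappings: each source system exports CSV with its own column names.
COLUMN_MAPS = {
    "system_a": {"TradeId": "trade_id", "Desk": "desk", "CptyIsCustomer": "is_customer"},
    "system_b": {"id": "trade_id", "trading_desk": "desk", "customer_flag": "is_customer"},
}


def normalize(source, raw_csv):
    """Map one source system's CSV export onto the common (normalized) schema."""
    mapping = COLUMN_MAPS[source]
    rows = []
    for raw in csv.DictReader(io.StringIO(raw_csv)):
        row = {target: raw[column].strip() for column, target in mapping.items()}
        row["is_customer"] = row["is_customer"].upper() in {"Y", "TRUE", "1"}
        row["source"] = source
        rows.append(row)
    return rows


def customer_facing_trade_ratio(trades):
    """Per desk: customer trade count / non-customer trade count (None if undefined)."""
    counts = defaultdict(lambda: [0, 0])  # desk -> [customer, non-customer]
    for trade in trades:
        counts[trade["desk"]][0 if trade["is_customer"] else 1] += 1
    return {
        desk: (customer / other if other else None)
        for desk, (customer, other) in counts.items()
    }


feed_a = "TradeId,Desk,CptyIsCustomer\n1,Rates,Y\n2,Rates,N\n3,Rates,Y\n"
feed_b = "id,trading_desk,customer_flag\n9,FX,true\n"

normalized = normalize("system_a", feed_a) + normalize("system_b", feed_b)
print(customer_facing_trade_ratio(normalized))
```

Keeping the per-source mapping in configuration rather than in code is one way to cope with around 50 sources: adding a feed means adding a mapping, while the metric calculations only ever see the normalized schema.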


Financial Accounting

The financial accounting area of an investment bank has many responsibilities, the primary
one being to report the financial position of the bank.
The financial statements that it produces are published to external audiences at pre-agreed
intervals, e.g. annually and bi-annually.
Many functions need to be carried out within the financial accounting area. These include:

 Tax Processing
• The bank must calculate and pay appropriate taxes and comply with the local tax laws.
 Funding & FX Control
• The bank needs to maintain a healthy balance of assets and liabilities at all times, and be aware of areas of
weakness.
 Regulatory Capital
• The bank must ensure that they maintain the minimum levels of capital required by the regulatory bodies.
 Accounting Control
• Ensuring that accounts are prepared according to correct accounting practices and the numbers produced
can be supported.

Balance Maintenance

Business Challenge
• Accounting SubLedger is a critical finance system which produces credit / debit postings for
the accounting process of a large investment bank
• To future-proof the system, the data loading, conversion, aggregation and balance generation
needed to be updated, reducing processing times and improving efficiency

Outcome / Benefits
• The new architecture puts in place a fully scalable platform which will allow continued growth
of data volumes
• Improved efficiency by migrating existing logic from PL/SQL to Hadoop
• Having managed the original subledger platform over the last 10 years, GFT understood well how
to successfully apply the new technologies to the evolving needs of the bank

GFT Performance
• GFT designed and implemented a new architecture using the big data platform Hadoop
• Configurable aggregation logic was implemented in MapReduce jobs
• Workflow coordination was done with Oozie, and Sqoop was used to extract data from existing
Oracle databases
• Hive was used to query intermediate data (for testing purposes)

Volumes
• 6 different business lines
• 700 million balances/day
• 65 million postings/day
• Cluster: 30 nodes (2x10 cores/each + 128GB RAM)
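The core of balance generation is an aggregation of debit / credit postings onto opening balances. The sketch below shows that idea in plain Python rather than in the MapReduce jobs used by the project. The field names (`account`, `currency`, `side`, `amount`), the sign convention (debits increase a balance, credits decrease it) and the `is_balanced` control are assumptions chosen for illustration.

```python
from collections import defaultdict
from decimal import Decimal


def generate_balances(opening, postings, keys=("account", "currency")):
    """Apply postings to opening balances, aggregated by a configurable key."""
    balances = defaultdict(Decimal, opening)  # unseen keys start at Decimal("0")
    for posting in postings:
        if posting["side"] not in ("D", "C"):
            raise ValueError(f"unknown posting side: {posting['side']!r}")
        key = tuple(posting[name] for name in keys)
        amount = Decimal(posting["amount"])
        balances[key] += amount if posting["side"] == "D" else -amount
    return dict(balances)


def is_balanced(postings):
    """Control check: in a double-entry batch, total debits equal total credits."""
    net = sum(
        Decimal(p["amount"]) if p["side"] == "D" else -Decimal(p["amount"])
        for p in postings
    )
    return net == 0


opening = {("1000", "EUR"): Decimal("100.00")}
postings = [
    {"account": "1000", "currency": "EUR", "side": "D", "amount": "50.25"},
    {"account": "2000", "currency": "EUR", "side": "C", "amount": "50.25"},
]

print(generate_balances(opening, postings))
print(is_balanced(postings))
```

`Decimal` is used instead of `float` so that monetary amounts add up exactly, and the `keys` parameter stands in for the configurable aggregation logic mentioned on the slide.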


Balance Maintenance – High Level Architecture

[Architecture diagram]
Components shown: JMS, Import Postings, Data Preparation, Calculate Balances, Control Balances,
XML Generation, Data Export, File Delivery, SFTP
Data shown: Posting, Summarized
Storage: HDFS, Unix
HouseKeeping: Backup, Archiving & Purging
Scheduling


Reporting

Reporting is a very important part of investment banking, as these reports are used by senior
management as the basis for all decisions that they take.

 The reports, both Financial and Risk, are built using inputs from the
earlier stages of the functional model.
 Report definitions should be standard and agreed across the
business, locations and legal entities where possible.
 The data used in the Risk reports should be consistent and aligned to
the Financial reports, and vice versa. This means that close
communication between the two areas is extremely important, and
only one data source should be used (if possible)


Risk Reporting

Risk exposure is a given within the world of investment banking, in particular market, credit and liquidity risk.
Risk is calculated and reported by portfolio and aggregated to division and bank wide. Risks reported include:

• Market Risk: the risk of loss of earnings or capital resulting from a change in the value of
financial instruments.
• Credit Risk: the risk that a counterparty will not fulfil its obligations.
• Liquidity Risk: the risk that a sale is unable to be made.
• Operational Risk: the risk that a loss will be incurred due to inadequate or failed processes,
people or systems.
• Legal Risk: the risk that a counterparty is not legally able to enter into a contract, or that
legislation might change during the life of the trade.

Reports are prepared to detect both trends and areas of particularly high risk. Reports are directed at various levels:
• Traders: market and credit risk of individual trades, and aggregated to trader portfolio; limit reporting
• Desk managers: portfolio risk and desk aggregated risk; limit reporting
• Senior managers: market and credit risk at division level, legal risk, operational risk
• Top managers: overall bank risk (e.g. global VaR), compliance risk reporting, legal risk, and operational risk
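Aggregating risk from portfolio to desk, division and bank level can be sketched as follows. The example assumes a historical-simulation style of VaR, where each portfolio carries a vector of scenario P&Ls; the slide does not say which VaR method the banks use. Because VaR is in general not additive, the scenario vectors are summed at each level of the hierarchy before the quantile is taken. The function names, the hierarchy layout and the simple quantile rule are illustrative assumptions.

```python
def historical_var(pnl, confidence=0.99):
    """Loss at the given confidence level from a list of scenario P&Ls (simple quantile)."""
    losses = sorted(-value for value in pnl)  # ascending losses
    index = min(len(losses) - 1, int(confidence * len(losses)))
    return max(losses[index], 0)


def aggregate_var(portfolio_pnl, hierarchy, confidence=0.99):
    """Sum scenario P&L vectors up the hierarchy, then compute VaR at every node."""
    vectors = {}
    for portfolio, pnl in portfolio_pnl.items():
        desk, division = hierarchy[portfolio]
        nodes = (
            ("portfolio", portfolio),
            ("desk", desk),
            ("division", division),
            ("bank", "ALL"),
        )
        for node in nodes:
            current = vectors.get(node)
            if current is None:
                vectors[node] = list(pnl)
            else:
                vectors[node] = [a + b for a, b in zip(current, pnl)]
    return {node: historical_var(vector, confidence) for node, vector in vectors.items()}


# Hypothetical P&L per scenario for two portfolios on the same desk.
portfolio_pnl = {"P1": [-10, 5, 2, -3], "P2": [8, -4, 1, -2]}
hierarchy = {"P1": ("Rates", "FICC"), "P2": ("Rates", "FICC")}

for node, var in aggregate_var(portfolio_pnl, hierarchy, confidence=0.5).items():
    print(node, var)
```

In this toy data the desk-level figure is lower than the sum of the two portfolio figures, because the portfolios partly offset each other scenario by scenario. This is the reason for aggregating P&L vectors rather than adding up VaR numbers.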


Risk Reporting – IFRS 9

Business Challenge
• The IFRS 9 Impairment project is the bank’s response to a regulatory requirement for Jan’18.
The bank has to provide the authorities with mandatory risk exposure reports that anticipate
the expected impairment loss given default
• Impairment calculation requires huge amounts of input data and billions of daily calculations.
The proposed new architecture, based on Big Data technologies, must provide scalability and
reliability

Outcome / Benefits
• Meeting regulatory requirements
• A scalable and high-performance solution for calculations and transformations using Spark
• GFT is now rolling this solution out to different divisions of the bank

GFT Performance
• Spark DataFrames were used for the ETL, calculations and transformations
• Storage was on HDFS using Avro format and structured with Hive tables, allowing for ad-hoc
interrogation with Hive or Impala engines
• Impala was the entry point for QlikView to create the reports
• Programme Management

Volumes
• Still under development, volume is not significant
• Smoke test:
• Input data: 60M default predictions
• Processed in about 20min
• Shared cluster: 15 nodes (48 cores/each and 500GB RAM)
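To give a feel for the kind of calculation involved, the sketch below uses the common textbook form of expected credit loss, ECL = PD × LGD × EAD (probability of default, loss given default, exposure at default). This is an assumption for illustration only: the slide does not describe the bank's actual impairment model, and the sketch ignores IFRS 9 staging, lifetime horizons and discounting. The project ran its calculations on Spark DataFrames; plain Python is used here to keep the example self-contained, and all names and figures are hypothetical.

```python
def expected_credit_loss(pd_, lgd, ead):
    """Textbook expected credit loss for one exposure: PD x LGD x EAD."""
    if not 0 <= pd_ <= 1:
        raise ValueError("probability of default must be between 0 and 1")
    if not 0 <= lgd <= 1:
        raise ValueError("loss given default must be between 0 and 1")
    if ead < 0:
        raise ValueError("exposure at default must not be negative")
    return pd_ * lgd * ead


def portfolio_ecl(exposures):
    """Sum the expected credit loss over an iterable of exposure records."""
    return sum(expected_credit_loss(e["pd"], e["lgd"], e["ead"]) for e in exposures)


# Hypothetical default predictions for two loans.
exposures = [
    {"loan_id": "L1", "pd": 0.5, "lgd": 0.5, "ead": 1000.0},
    {"loan_id": "L2", "pd": 0.25, "lgd": 0.5, "ead": 2000.0},
]

print(portfolio_ecl(exposures))
```

Each record is independent of the others, which is why this type of calculation distributes well across a cluster.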


Risk Reporting – IFRS 9 – High Level Architecture

[Architecture diagram]
Credit Risk Data feeds the Core Platform of the Internal Big Data Platform, which covers
Calculations & Transformations and Storage (Spark DataFrames, HDFS, Hive, Impala) and
delivers QlikView reports.



CHAPTER 4

Big Data in Investment Banking
4.- BIG DATA IN INVESTMENT BANKING

The 3 V’s in Investment Banking

 Volume:
 Certainly not at Google or Facebook scale
 Quite a few systems on the limits of relational technologies
 Oracle Exadata / Teradata, etc
 100’s of TB are not uncommon

 Variety:
 Many types of structured data – rapidly changing
 Some use cases do require analysing unstructured data
 Trader / Communications surveillance

 Velocity:
 Algorithmic Trading (Big Data before the name was coined!)
 Many processes are batch-driven
 Architectures evolving from batch to real / near-real time processing
 Holy grail – Straight through processing – Settle at T+0


Investment Banking Challenges

 Every Record Counts


 Data Quality / Reconciliations are critical

 Globalization of Production chain - Provide the right tools for each function
 Development (Western Europe)
 Testing (Eastern Europe)
 Production Support (Asia)
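"Every record counts" means that a load into a Big Data platform is usually followed by a reconciliation against the source. The following minimal Python sketch compares two record sets by key and reports missing, unexpected and mismatched records. The function names, the `trade_id` key and the `amount` field are hypothetical, and a production reconciliation would also cover control totals and tolerances.

```python
def _index(records, key):
    """Index records by key, refusing duplicates so that no record is silently lost."""
    indexed = {}
    for record in records:
        if record[key] in indexed:
            raise ValueError(f"duplicate {key}: {record[key]}")
        indexed[record[key]] = record
    return indexed


def reconcile(source, target, key="trade_id", fields=("amount",)):
    """Compare source and target record sets and list every break by key."""
    src = _index(source, key)
    tgt = _index(target, key)
    return {
        "missing": sorted(src.keys() - tgt.keys()),      # in source, not loaded
        "unexpected": sorted(tgt.keys() - src.keys()),   # loaded, not in source
        "mismatched": sorted(
            k for k in src.keys() & tgt.keys()
            if any(src[k][f] != tgt[k][f] for f in fields)
        ),
    }


source = [
    {"trade_id": "T1", "amount": "100.00"},
    {"trade_id": "T2", "amount": "250.00"},
    {"trade_id": "T3", "amount": "75.50"},
]
target = [
    {"trade_id": "T1", "amount": "100.00"},
    {"trade_id": "T2", "amount": "205.00"},
    {"trade_id": "T4", "amount": "10.00"},
]

print(reconcile(source, target))
```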


Investment Banking Challenges

 Learn to move at Open Source Speed


 HW provisioning lead times
 Be agile with platform version updates
 Keep up to date with technical trends

 Very complex organizations


 Hard to break silo mentality
 Continued focus / investment
 Focus on delivery – not strategy
 Lots of money – not always a good thing


Big Data Adoption: Do’s and Don’ts

Do’s
 Start “small”
 Process and/or store data sets on the multi-Terabyte range
 No need for huge Clusters - 10 to 20 nodes is perfectly acceptable

 Consider virtualized infrastructure for demonstration / PoCs


 Requires fine tuning or migration to dedicated hardware for production QoS

 Select an Enterprise Distribution (Cloudera, HortonWorks, Oracle, IBM…)

 Beware the “Small Print of BigData”: Get help from the experts

Don’ts
 Start a project based on purely technical considerations
 Business Value must be the main driver

 Try to create a data analytics capability on a new Big Data platform


 Big Data can enhance existing modelling capabilities, but will not make their initial creation easier

 Fail to consider security and data governance aspects

 Take the cost reduction promise at face value


 Commodity Hardware is not always applicable
 Enterprise support licensing is not cheap
 Consider your procurement policies / TCO

Big Data Adoption Gotchas

Infrastructure

 Hadoop clusters require specific maintenance


 Network bandwidth and tuning
 OS-Level upgrades & certifications
 Security
 Upgrade agility
 Provisioning agility

 Storage IS an issue – Data tends to multiply

Architecture

 Integration with surrounding systems


 Data Ingestion (easy)
 Data Extraction / Analysis (not so easy)

 Is your system at Big Data scale throughout?

 CPU count based licensing models don’t play well with Hadoop

Design

 Hadoop is NOT transactional

 The NoSQL landscape is by itself huge

 HBase is NOT a general-purpose RDBMS

 Design for supportability

 Schema on read is hard to do (right)
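One reason schema on read is hard to do right is that raw data is stored as it arrives, so every reader has to cope with schema drift and bad records. The sketch below illustrates this in Python on JSON lines: the schema is applied only when the data is read, an old field name is mapped to the current one, and records that do not fit are collected as rejects instead of being dropped silently. The schema, the `ccy` alias and the sample records are hypothetical.

```python
import json

# Hypothetical schema, applied at read time: field name -> cast function.
SCHEMA = {"trade_id": str, "notional": float, "currency": str}
# Hypothetical schema drift: older records used "ccy" instead of "currency".
ALIASES = {"ccy": "currency"}


def read_with_schema(raw_lines):
    """Parse raw JSON lines against SCHEMA; return (rows, rejects)."""
    rows, rejects = [], []
    for line_no, line in enumerate(raw_lines, start=1):
        try:
            raw = json.loads(line)
            if not isinstance(raw, dict):
                raise ValueError("record is not a JSON object")
            record = {ALIASES.get(name, name): value for name, value in raw.items()}
            # Project onto the schema: extra fields are ignored, missing ones reject.
            rows.append({name: cast(record[name]) for name, cast in SCHEMA.items()})
        except (ValueError, KeyError, TypeError) as exc:
            rejects.append((line_no, str(exc)))
    return rows, rejects


raw_lines = [
    '{"trade_id": "T1", "notional": "1000000", "ccy": "EUR"}',
    '{"trade_id": "T2", "notional": 5e5, "currency": "USD", "extra": 1}',
    '{"trade_id": "T3"}',
    'not json',
]

rows, rejects = read_with_schema(raw_lines)
print(rows)
print([line_no for line_no, _ in rejects])
```

Keeping the rejects next to the accepted rows lets the load be reconciled afterwards: accepted plus rejected records must equal the number of raw lines.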

Implementation

 Choose the right tool for the job:


 Java MapReduce (Pig / Hive)
 Spark (Java / Scala / Python)
 Flink / Emerging frameworks
 NoSQL

 A Big Data – specific SDLC is not required


 But following best practices is a must

 Skills shortage is real



CHAPTER 5

Hands On Exercise
ETL In Investment Banking
5.- HANDS-ON EXERCISE

ETL In Investment Banking

 Accessing Databricks Cloud Environment


 https://community.cloud.databricks.com

 Accessing the Exercise Definition and Data


 https://github.com/ignacio-sales/inv_bank_etl

 Exercise Walkthrough

 Q&A



Thank you
APPENDIX

Financial Services Industry – In Books and Movies

The Financial Industry in Books and Movies

 Books:
 Too Big to Fail - https://www.amazon.com/Too-Big-Fail-Washington-System/dp/0143118242/ref=pd_sim_14_9?ie=UTF8&dpID=61Sy1mRL4lL&dpSrc=sims&preST=_AC_UL160_SR104%2C160_&psc=1&refRID=6B96A138QK9KYNGD25GM
 The Big Short – Michael Lewis - https://www.amazon.com/Big-Short-Inside-Doomsday-Machine/dp/0393338827
 Liar’s Poker – Michael Lewis - https://www.amazon.com/Liars-Poker-Norton-Paperback-Michael/dp/039333869X/ref=pd_bxgy_14_img_2?ie=UTF8&psc=1&refRID=FFVAKW546G3WBRPBVDT4
 Flash Boys – Michael Lewis - https://www.amazon.com/Flash-Boys-Wall-Street-Revolt/dp/0393351599/ref=pd_bxgy_14_img_3?ie=UTF8&psc=1&refRID=BFWXWFWAGFQKEKSDNM4Z
 Barbarians at the Gate – Bryan Burrough - https://www.amazon.com/Barbarians-Gate-Fall-RJR-Nabisco/dp/0061655554/ref=pd_sim_14_8?ie=UTF8&dpID=517uhxQLpdL&dpSrc=sims&preST=_AC_UL160_SR107%2C160_&psc=1&refRID=D1WFZD628ZNM2YG7EAJJ


 Movies:
 Inside Job (2010) http://www.imdb.com/title/tt1645089/
 The Big Short (2015) http://www.imdb.com/title/tt1596363/
 Margin Call (2011) http://www.imdb.com/title/tt1615147/
 Too Big to Fail (2011) http://www.imdb.com/title/tt1742683/
 Rogue Trader (1999) http://www.imdb.com/title/tt0131566/
 Barbarians at the Gate (1993) http://www.imdb.com/title/tt0106356/
