Financial-Services-EDH Cloudera
Financial Services
Capture Value from Big Data with an Enterprise Data Hub
Big Data Is Only Getting Bigger
Particularly Relevant in the Financial Services Space
[Figure: timeline from 1980 to today; labels: Consumer Credit Explosion, Sophisticated Machines]
© Cloudera, Inc. All rights reserved. 2
Where Is the Financial Services Data?
Mapping and Consolidation Are the Tip of the Iceberg for Big Data
• Bank Transactions • Card Transactions • Trade Data • Claims / Policy Data • Trade Data
• Customer Data • Customer Data • Customer Data • Customer Data • Communications /
• ATM Activity • Online Activity • Web Logs • Demographic / Census Documentation
• Online Activity • Demographic / Census • Research / Publications Data • Market Data
• Mobile Activity Data • Market Data • Weather Data • Research / Publications
• Demographic / Census • Marketing / CRM • Communications / • Vehicle Telemetry • Surveys
Data • Integration with Retailers Documentation • Video / Surveillance
• Marketing / CRM / Loyalty • Sensors
• Social / Sentiment • Social / Sentiment • Internet of Things
Sources: Dash, Eric. “Feasting on Paperwork,” The New York Times. September 8, 2011.
Accenture. Coming to Terms with Dodd-Frank. January 2013.
Average customer acquisition costs retail banks MORE THAN $350 and requires customers to carry balances NEARING $10,000 just to break even.
Compliance and Privacy
More data, more users, and more tools create complexity. Need to balance business agility with security and governance.
Limited Data
Not efficient to keep existing data, let alone handle new data sources. Time consuming to transform data for analysis in existing systems.
[Diagram labels: Databases, Data Systems, Existing Data, New Data Sources]
Batch, Interactive, and Real-Time. Leading performance and usability in one platform.
• End-to-end analytic workflows
• Access more data
• Work with data in new ways
• Enable new users
Process
• Ingest: Sqoop, Flume
• Transform: MapReduce, Hive, Pig, Spark
Discover
• Analytic Database: Impala
• Search: Solr
Model
• Machine Learning: SAS, R, Spark, Mahout
Serve
• NoSQL Database: HBase
• Streaming: Spark Streaming
Security and Administration: YARN, Cloudera Manager, Cloudera Navigator
Unlimited Storage: HDFS, HBase
• Optimize your architecture (IT)
• Discover the value in your data (analysts and data scientists)
• Empower users directly (everyone)
[Diagram labels: Unlimited Storage, Infrastructure]
• Perimeter security
• Role-based access control
• The only complete policy-based management of sensitive data
• Data lineage and discoverability
PROTECTION: Encryption for data at rest or in motion with full key management (Cloudera Navigator: Encrypt & Key Trustee)
AUDIT: Capture a complete and immutable record of all activity (Cloudera Navigator, SIEM Tools)
Maintain a Vulnerability Management Program
• Protect all systems against malware and regularly update anti-virus software or programs ✔
• Develop and maintain secure systems and applications ✔
Implement Strong Access Control Measures
• Restrict access to cardholder data by business need to know ✔
• Identify and authenticate access to system components ✔
Regularly Monitor and Test Networks
• Track and monitor all access to network resources and cardholder data ✔
• Regularly test security systems and processes ✔
Maintain an Information Security Policy
• Maintain a policy that addresses information security for all personnel ✔
Investment Banking: 4 of the top 5
Credit &
Payments
Insurance
Services
& SROs
Recommendation Engine
Solution: Centralize data from silos: transactions, clickstreams, service logs, social, etc. Find data using Search and build models with Pig, Mahout, Spark, and analytics tools.
Partners
Risk Modeling
Use Case: Report on risk on an on-demand basis and stress-test against scenarios that would require payout of potentially catastrophic losses with liquid capital.
Calculate Anything
Solution: HBase provides a real-time tick store also accommodating massive historical data. Spark and Impala converge ETL, analytics, and reporting for intra-day modeling.
Partners
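The stress-testing idea in the Risk Modeling use case can be sketched in a few lines. This is a minimal plain-Python illustration, not the HBase, Spark, or Impala pipeline the slide describes; the `Position` type, the asset classes, the shock percentages, and the capital figure are all hypothetical values chosen for the example.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Position:
    """A hypothetical book of exposure in one asset class."""
    name: str
    exposure: float  # dollars currently at risk (made-up figure)
    asset_class: str


def scenario_loss(positions, shocks):
    """Total loss if each asset class falls by the fraction given in `shocks`."""
    return sum(p.exposure * shocks.get(p.asset_class, 0.0) for p in positions)


def stress_test(positions, scenarios, liquid_capital):
    """Map each scenario name to (loss, True if liquid capital covers the loss)."""
    report = {}
    for name, shocks in scenarios.items():
        loss = scenario_loss(positions, shocks)
        report[name] = (loss, loss <= liquid_capital)
    return report


# Hypothetical portfolio and scenarios, for illustration only.
positions = [
    Position("equity book", 40_000_000, "equity"),
    Position("mortgage book", 60_000_000, "credit"),
]
scenarios = {
    "mild downturn": {"equity": 0.10, "credit": 0.02},
    "severe crisis": {"equity": 0.40, "credit": 0.15},
}
report = stress_test(positions, scenarios, liquid_capital=20_000_000)
for name, (loss, covered) in report.items():
    print(f"{name}: loss=${loss:,.0f} covered={covered}")
```

Running scenarios on demand then amounts to calling `stress_test` with a new `scenarios` dictionary; at production scale the per-position sum would be the part pushed down to the cluster.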
Fraud Prevention
Use Case: Text mining and machine learning model rare events and combine data from suspicious transactions with extracts from other sources to confirm and target.
Machine Learning
Solution: Impala offsets the latency and constraints of EDWs to expand the data available for SIEM by decades, also driving down the cost of ad hoc real-time modeling.
Partners
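To make "modeling rare events" concrete, here is a small sketch that flags unusual transaction amounts. It swaps in a simple statistical rule (the modified z-score based on the median absolute deviation) in place of the text mining and machine learning the slide mentions, and the amounts and the 3.5 threshold are assumptions made for the example.

```python
from statistics import median


def flag_rare_amounts(amounts, threshold=3.5):
    """Return the indices of amounts whose modified z-score exceeds `threshold`.

    The modified z-score is 0.6745 * |x - median| / MAD, where MAD is the
    median absolute deviation. It is robust to the outliers it is looking for.
    """
    med = median(amounts)
    mad = median([abs(a - med) for a in amounts])
    if mad == 0:
        return []  # no spread in the data, so nothing stands out
    return [
        i for i, a in enumerate(amounts)
        if 0.6745 * abs(a - med) / mad > threshold
    ]


# Hypothetical card transactions in dollars; one is far outside the usual range.
amounts = [42.0, 38.5, 51.0, 47.25, 40.0, 4900.0, 44.0]
suspicious = flag_rare_amounts(amounts)
print(suspicious)
```

The flagged indices are the candidates that would then be joined with extracts from other sources for confirmation, as the use case describes.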
Policy Personalization
Use Case: Differentiate coverage options by customizing plans based on information collected about customers’ lifestyle, health patterns, habits, and preferences.
Stream Processing
Solution: Spark Streaming is used to calculate pricing occasions in real time based on live, unstructured data-in-motion from sensors, mobile devices, onboard computers, etc.
Partners
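The real-time pricing idea can be illustrated with a sliding window over incoming readings. The sketch below is a plain-Python generator standing in for a streaming job, not Spark Streaming code; the speed readings, window size, speed limit, base rate, and surcharge are hypothetical parameters.

```python
from collections import deque


def rolling_rate(events, base_rate=100.0, window=3, speed_limit=70.0, surcharge=0.5):
    """Yield an updated rate after each speed reading.

    The rate rises with the share of readings in the most recent `window`
    that exceed `speed_limit`, up to base_rate * (1 + surcharge).
    """
    recent = deque(maxlen=window)  # old readings drop out automatically
    for speed in events:
        recent.append(speed)
        risky_share = sum(s > speed_limit for s in recent) / len(recent)
        yield base_rate * (1 + surcharge * risky_share)


# Hypothetical onboard-computer speed readings arriving one at a time.
rates = list(rolling_rate([55, 80, 90, 60, 50]))
for rate in rates:
    print(round(rate, 2))
```

Because the generator consumes one event at a time and keeps only the window in memory, the same logic maps naturally onto a windowed operation in a stream processor.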
Active Archive
Solution: Impala provides in-cluster reporting and investigation capabilities to keep data centrally accessible for multiple uses. Hadoop will scale to oncoming CAT reporting.
Partners
What can we learn from using much larger and more varied
data sets for advanced security and threat analytics?
© 2015 Cloudera, Inc. All rights reserved.
Can we accommodate the massive increases in volume, variety,
and velocity of market data while keeping down costs?
How do we use standard profile information for each
house to determine risk and set individualized rates?
Can we eliminate sampling error by
including all our log data in analyses?
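A toy example shows why sampling can mislead when the events of interest are rare. The log lines, the positions of the errors, and the every-tenth-line sampling scheme below are all made up for illustration.

```python
def error_rate(lines):
    """Fraction of log lines that contain the marker 'ERROR'."""
    return sum("ERROR" in line for line in lines) / len(lines)


# Hypothetical log of 1,000 lines with three rare errors.
logs = ["INFO ok"] * 1000
for i in (3, 417, 998):
    logs[i] = "ERROR timeout"

sample = logs[::10]  # systematic sample: lines 0, 10, 20, ...

full_rate = error_rate(logs)      # computed over every line
sample_rate = error_rate(sample)  # computed over the 10% sample
print(full_rate, sample_rate)
```

None of the three errors falls on a sampled line, so the sample reports no errors at all while the full scan finds them; analyzing the complete log removes that blind spot.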
© Cloudera,
© 2015 Cloudera, Inc.AllAllrights
Inc. rightsreserved.
reserved. 26
26
Thank you