
LogRhythm as a Data Lake

Bill Larsen & Steve Kaufman

August 2019

Speakers

Steve Kaufman, Technical Product Manager
Bill Larsen, Enterprise Sales Engineer

©LogRhythm 2019. All rights reserved. 2



Agenda

What is a “Data Lake”?

What to Consider When Creating a Data Lake

Why Have LogRhythm as Part of Your Data Lake

Example of LogRhythm Data Lakes and Integrations

Questions


What is a Data Lake?

From the great source of Wikipedia…


• A data lake is a system or repository of data stored in its natural/raw format, usually object blobs or files. A data lake is usually a single store of all enterprise data including raw copies of source system data and transformed data.
• A data landfill is a deteriorated and unmanaged data lake that is either inaccessible to its intended users or is providing little value.


Build a Resource with the End in Mind


INFORMATION DATA FLOW: DATA LAKE vs. DATA LANDFILL

ITERATIVE IMPROVEMENTS

DATA LAKE
GATHER
• Data Collection Process (multiple factors; a separate topic)
KNOW
• Define Data Elements
• Know Data Relationship
CLEAN
• Tidy Data – Relationships Validated
• Data Errors Identified and Corrected
ANALYZE
• Exploratory Analysis
• Statistical Analysis
PRESENT
• Organize Data Logically
• Aggregate / Report
DECISION
• ACTIONABLE INFORMATION

DATA LANDFILL
GATHER
• Data Collection Process (multiple factors; a separate topic)
KNOW
• Data Not Understood
• No Data Dictionary or Code Book
CLEAN
• Trust With No Validation
• Data Errors Overlooked
ANALYZE
• Analytics Invalidated
• Results Confusion
PRESENT
• Assumptions Questionable
• Reports Misleading
DECISION
• POOR DECISIONS
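
The KNOW and CLEAN stages can be illustrated with a small sketch: records are checked against a data dictionary, and anything that fails is set aside for correction instead of being trusted without validation. The dictionary contents, field names, and function names below are assumptions made for illustration; they are not a LogRhythm schema or API.

```python
# Hypothetical data dictionary: field name -> validation rule.
# Field names and rules are illustrative assumptions only.
DATA_DICTIONARY = {
    "timestamp": lambda v: isinstance(v, str) and len(v) > 0,
    "source_ip": lambda v: isinstance(v, str) and v.count(".") == 3,
    "bytes_out": lambda v: isinstance(v, int) and v >= 0,
}


def validate(record):
    """Return a list of problems found in one record (empty if clean)."""
    problems = []
    for field, is_valid in DATA_DICTIONARY.items():
        if field not in record:
            problems.append(f"missing field: {field}")
        elif not is_valid(record[field]):
            problems.append(f"invalid value for {field}: {record[field]!r}")
    # A field that is not in the dictionary is data that is "not understood".
    for field in record:
        if field not in DATA_DICTIONARY:
            problems.append(f"undocumented field: {field}")
    return problems


def split_clean_and_rejected(records):
    """Separate validated records from records that need correction."""
    clean, rejected = [], []
    for record in records:
        problems = validate(record)
        if problems:
            rejected.append((record, problems))
        else:
            clean.append(record)
    return clean, rejected
```

Keeping the rejected records together with their list of problems is one way to make errors "identified and corrected" instead of "overlooked".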


Data Lakes are NOT a Replacement for a SIEM

SIEM
• Real-time Correlation and Alerting
• Fast Search and Investigation
• Compliance Reporting and Retention

Data Lake
• Pool for all security data
• Search can be slower
• Leverage for long-term threat hunting
• Retrospective pivot from SIEM


A Data Lake can Enhance the SIEM

[Diagram. Labels recovered from the slide:
• Data sources: Server, Firewall/IPS, Network, Identity, Workstation
• Feeds: Structured Data Feeds, Unstructured Data Feed, Correlated Event Feeds
• LogRhythm SIEM: Real-Time Search, Investigations, Real-Time Alarm and Investigations, Real-Time Actionable Events
• SOC Analyst
• Data Lake: Retrospective to Realtime hunt; Hunting looking for TTP and IOC
• Threat Hunter]
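
As a rough illustration of hunting for IOCs in historical data, the sketch below searches stored records for indicators that only became known later. The record layout (`time`, `dest_ip`) and the function name are hypothetical, chosen for this example; a real hunt would run against whatever store and schema the data lake uses.

```python
from datetime import datetime, timezone


def retrospective_hunt(records, iocs, learned_at):
    """Return stored records that match an IOC but predate its discovery.

    records:    iterable of dicts with hypothetical keys "time"
                (timezone-aware datetime) and "dest_ip".
    iocs:       set of indicator values (IP addresses in this sketch).
    learned_at: moment the indicators became known.
    """
    return [
        r for r in records
        if r["dest_ip"] in iocs and r["time"] < learned_at
    ]


# Example: an indicator learned on 1 August matches traffic from July.
history = [
    {"time": datetime(2019, 7, 3, tzinfo=timezone.utc), "dest_ip": "203.0.113.9"},
    {"time": datetime(2019, 7, 4, tzinfo=timezone.utc), "dest_ip": "198.51.100.7"},
]
hits = retrospective_hunt(
    history, {"203.0.113.9"}, datetime(2019, 8, 1, tzinfo=timezone.utc)
)
```

Activity after `learned_at` is deliberately excluded here, on the assumption that real-time SIEM rules cover an indicator once it is known.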
LogRhythm Loves Data
Collection/MDI

LogRhythm MDI: An Example


Log Data Access

Customer Need
• Open access to logs across many months to years with LR enterprise data lake
• Access to data lake via open source visualization tools and APIs
• Easy and effective enterprise data lake integration

LogRhythm Meets the Need

Data Indexer: LogRhythm Data Lake
• Enable use of third-party visualization tools like Kibana with the LogRhythm platform
• Continue adding more data – Event, Alarm, Case, Smartflow, Threat Intelligence

Data Processor: Log Distribution Services
• Integration with third-party data lakes (Hadoop, Elastic) via common protocols including JSON, Apache Kafka, providing a high-speed alternative

APIs
• Continue to build out platform APIs including Search, Alarms, Case, Threat Intelligence
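
To make the JSON-over-Kafka idea concrete, here is a minimal sketch of turning a log record into a JSON payload and handing it to a publisher. It does not use or describe the Log Distribution Services interface; `to_message`, `forward`, the topic name, and the record fields are all assumptions for illustration, and `publish` is a stand-in for whatever producer call a real Kafka client provides.

```python
import json


def to_message(record):
    """Serialize one log record to compact UTF-8 JSON bytes."""
    return json.dumps(record, sort_keys=True, separators=(",", ":")).encode("utf-8")


def forward(records, publish, topic="security-logs"):
    """Send each record via `publish(topic, payload)`; return how many were sent.

    `publish` is any callable taking a topic name and a bytes payload.
    The default topic name is a made-up example.
    """
    count = 0
    for record in records:
        publish(topic, to_message(record))
        count += 1
    return count


# Usage with an in-memory stand-in for a real producer.
sent = []
forward([{"b": 1, "a": "x"}], lambda topic, payload: sent.append((topic, payload)))
```

Passing the publisher in as a callable keeps the serialization logic testable without a running broker.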



LogRhythm as a Data Lake

LogRhythm + Kibana Integration (Multi-group Example)

[Diagram. Two groups shown side by side:
• Security Logs: Agents, Data Processor, DX, DX, DX
• Ops Logs: Agents, Data Processor, DX, DX, DX
• Web/Client Console]


Data Architecture: Tiered Hot/Warm Data Indexing


Challenge
• Ability to store and search large amounts of data

Available in LR 7.4.3 and later
• Tiered Storage

[Diagram. Labels: Agents, Data Processor, Web/Client Console, API Gateway; DX nodes marked Hot, Hot, Hot, Warm; Storage]
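
A common way to think about hot/warm tiering is by data age: recent indices stay on hot nodes and older ones move to warm nodes. The sketch below shows only that general idea. The 30-day threshold and the function name are assumptions for illustration; the slide does not state how LogRhythm decides placement.

```python
from datetime import date

# Assumed number of days an index stays on hot nodes.
# Illustrative only; not a LogRhythm default.
HOT_DAYS = 30


def tier_for(index_day, today, hot_days=HOT_DAYS):
    """Return "hot" for indices newer than `hot_days`, otherwise "warm"."""
    age_in_days = (today - index_day).days
    return "hot" if age_in_days < hot_days else "warm"


# Example: two daily indices evaluated on 15 August 2019.
recent = tier_for(date(2019, 8, 1), date(2019, 8, 15))
older = tier_for(date(2019, 6, 1), date(2019, 8, 15))
```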



LogRhythm integrated with
your Data Lake

LogRhythm + LDS Integration

[Diagram. Labels: Security Logs, Agents, Data Processor, DX, DX, DX, Web/Client Console, Log Distribution Service]


LogRhythm + Elastic Integration


LogRhythm ElasticSearch Nodes

[Diagram. Labels recovered from the slide:
• Data Sources
• MetaData and LogMessage
• Apache NiFi, Apache Kafka, Logstash
• Security Event Data
• STIX/TAXII IOC data
• LogRhythm Elasticsearch, Kibana
• HDFS: Query HDFS for data outside ES TTL
• Spark, MapReduce, Falcon, YARN, Oozie, Zeppelin]
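
The note about querying HDFS for data outside the ES TTL suggests a simple routing rule: a search is served from Elasticsearch while its time range falls inside the retention window, and from HDFS for anything older. The sketch below is one possible reading of that label, not the presenters' implementation; the function name, the store labels, and the 90-day TTL in the example are assumptions.

```python
from datetime import datetime, timedelta, timezone


def stores_for_query(start, end, now, es_ttl):
    """Pick which stores a time-range query has to touch.

    start, end: bounds of the query (timezone-aware datetimes).
    now:        current time.
    es_ttl:     timedelta that data is retained in Elasticsearch.
    """
    cutoff = now - es_ttl  # oldest timestamp still held in Elasticsearch
    stores = []
    if end >= cutoff:
        stores.append("elasticsearch")
    if start < cutoff:
        stores.append("hdfs")
    return stores


# Example with an assumed 90-day TTL.
utc = timezone.utc
now = datetime(2019, 8, 15, tzinfo=utc)
ttl = timedelta(days=90)
recent_only = stores_for_query(
    datetime(2019, 8, 1, tzinfo=utc), datetime(2019, 8, 10, tzinfo=utc), now, ttl
)
spanning = stores_for_query(
    datetime(2019, 1, 1, tzinfo=utc), datetime(2019, 8, 1, tzinfo=utc), now, ttl
)
```

A query that straddles the cutoff touches both stores, so the caller would need to merge the two result sets.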


Additional Resources

Related Sessions
• FILL THIS OUT

Key Resources
• LogRhythm Community – FILL THIS OUT [Link]

Want help implementing what you just saw? Ask your Customer Success Manager about our Co-Pilot Services.



More Resources on the LogRhythm Community

Dedicated RhythmWorld Section with all presentation content

The LogRhythm Community also includes discussions with other users and LogRhythm experts, documentation, software downloads, shareable resources (SmartResponse plugins, dashboards, playbooks, DPA rules), and more!



Questions?
The next Deep Dives will start at: 1:45pm & 2:30pm
