Download as pdf or txt
Download as pdf or txt
You are on page 1of 27

DIGITAL FORENSIC RESEARCH CONFERENCE

The Database Forensic File Format and DF-Toolkit


By
James Wagner, Alexander Rasin, Karen Heart, Rebecca Jacob, and Jonathan Grier
From the proceedings of
The Digital Forensic Research Conference
DFRWS 2019 USA
Portland, OR (July 15th - 19th)

DFRWS is dedicated to the sharing of knowledge and ideas about digital forensics research. Ever since it organized the first open workshop devoted to digital forensics in
2001, DFRWS continues to bring academics and practitioners together in an informal environment. As a non-profit, volunteer organization, DFRWS sponsors technical
working groups, annual conferences and challenges to help drive the direction of research and development.

https://dfrws.org
The Database Forensic File
Format and DF-Toolkit
James Wagner, Alexander Rasin, Karen Heart,
Rebecca Jacob, Jonathan Grier
DFRWS 2019
Portland, OR
DePaul Data Systems Lab

James Wagner Alexander Rasin Tanu Malik


(DB Systems Faculty) (DB Systems Faculty)

Karen Heart (Systems Faculty Jacob Furst Jonathan Grier


& Litigation Attorney) (Security Faculty) (Grier Forensics)
What is database forensics?
• Database management system (DBMS): software that
stores and manages a collection of logically related data
• Oracle and Microsoft SQL Server - corporate data
• MySQL and PostgreSQL - webstore back-end
• SQLite - personal applications (e.g., browser history and SMS)

• Digital Forensics:
• solve crimes committed with computers (e.g., phishing and
bank fraud)
• solve crimes where evidence may reside on a computer (e.g.,
money laundering and child exploitation)
• trace security breaches
Page Carving and DBCarver
• Pages
• DBMS I/O
• 4KB or 8KB
• Tables, indexes, etc.

• DBCarver
• Systematic carving
process for database
storage at page level.
• DFRWS ’15, ’16 & CIDR ’17

ApacheDerby, Firebird, IBM DB2, Microsoft SQL Server,


MySQL, Oracle, PostgreSQL, SQLite
Why Page Carving? Metadata!
• Our philosophy:
“Reconstruct the system, not only the data.”

• Metadata and data allow for more complete


timelines.
Advanced
Applications
Disk &
Images Testing
DBCarver
RAM
Snapshots
User
(e.g., FBI/RCFL)
Metadata: Simple Example
• “Return all deleted
records and their
offsets.”

• Offset – within disk


image? DBMS file?

• Delete flag – multiple


“delete” concepts. MySQL IBM DB2
• DFRWS ‘16
PostgreSQL SQL Server
Oracle
SQLite
Metadata: Advanced Examples

Examples Relevant Metadata


• DBA bypasses (or tampers • Deletion flags
with) logs. -DFRWS ’17 • Pointer deconstruction
• Object identifiers
• Sys Admin modifies DBMS
file bytes. - EDBT ’18 • Page Identifiers
• Caching patterns
• Storage optimization
• SSDBM ‘17, DAPD ’19
• Wait! There’s more!
• Checksums
• Data Sanitization • Free space pointers
Motivation How do you
compare carvers?
What should
output look like?
Advanced
DBCarver Applications
&
Disk
Testing

?
Images
Carver
RAM
Tool B
Snapshots

Carver Users
Tool B
It’s a Database!
Just recreate the tables…
Employee ID Name Department Salary Office#
1 Karen CSC 90K 101
2 Alex Chemistry 88K 102
3 Tanu Math 92K 104
4 Jacob History 75K 107

• Example 1: Return all deleted records & their offsets.

• Example 2: Find all records containing a string.


Example 1: Return all deleted
records and their offsets.
Offset Delete Employee Name Department Salary Office#
Flag ID
1K No 1 Karen CSC 90K 101
2K No 2 Alex Chemistry 88K 102
3K Yes 3 Tanu Math 92K 104
4K No 4 Jacob History 75K 107
«Not “DELETE” FROM VACUUM

• Metadata columns are not part of original instance


• Must be added to every table - What if I want more metadata?
• Users must be able to distinguish “real” columns.
• The data and metadata do not fit the relational model
Example 2: Find all records
containing a string.
Offset Delete Employee Name Department Salary Office#
Flag ID
1K No 1 Karen CSC 90K 101
2K No 2 Alex Chemistry 88K 102
3K Yes 3 Tanu Math 92K 104
4K No 4 Jacob History 75K 107
SELECT * FROM Employee
WHERE Name LIKE ‘%MyString%’ • Can’t filter by all columns in
SQL.
OR Department LIKE ‘%MyString%’ …

SELECT * FROM Customer


• How should output be saved?
WHERE Name LIKE ‘%MyString%’
OR Address LIKE ‘%MyString%’ …

Our contributions
Database Forensic File Database Forensic Toolkit
Format (DB3F): (DF-Toolkit):
• Abstract DBMS storage • Visibility – traverse data
engine specifics. with a tree
• Simple to generate and • Display DBMS Objects
ingest. • Object Filtering
• Open and extensible • Keyword Searches
• Scalable
DB3F DF-
Toolkit
Our Contributions How do you
compare carvers? – DB3F
What should
output look like? – DB3F
Advanced
DBCarver Applications
&
Disk DB3F Testing
Images
Carver
RAM Tool B
Snapshots
DF-Toolkit
Carver Users
Tool B
DB3F
• Usable for all database carving tools

• 1 DBMS represented by 1 DB3F file


• A disk image has multiple DB3F files if multiple DBMSes
are present (e.g., don’t mix PostgreSQL and SQLite data)

• A DB3F file contains a series of JSON objects:


• 1st line – DB3F file header JSON
• Every other line represents a single page
DB3F: JSON Database Pages
• Page header metadata
• Records w/ metadata
• Fields can be added
without affecting
previous versions.
• Currently not an
exhaustive list of
metadata fields

71, Supplier#000000071, 31CSQET,


ARGENTINA5, ARGENTINA,
AMERICA, 11-710-812-5403

70, Supplier#000000070,
jd4jZv0cc5KdnA0q9o0, FRANCE 0,

DB3F: Popular Questions
• DB3F supports all datatypes
• Describe in schema field

• Some datatypes don’t fit in a single page


• E.g., BLOBs, large text fields
• DBMSes store refs to these files

• Reliability
• DB3F files use about 2x storage than the DBMS files.
• Reading is dependent on JSON parsing.
DF-Toolkit
• A UI to view and filter DB3F files

• Tree structure
• Traverse and view metadata and data

• Filtering and querying DB3F files


DF-Toolkit: Tree Nodes
• Our philosophy: “A DBMS as a separate system”
RamSnapshot01
Root – major storage structure + SQLite
• A single DBMS in each evidence
- PostgreSQL
(no disk partitions)

Object + Customer Table


e.g., tables, indexes, MVs - Employee Table
+ Page1
Pages
- Page2
[Alex, Professor]
Records [Karen, Dean]
DF-Toolkit: Another tree level?
• Record values currently stored as a list.
• A Value table would store a row for every value and
for all tables.
• Ex. A table with 10 columns and 1M rows -> 10M rows
• The SQL JOIN quickly becomes expensive

Records

Values
DF-Toolkit: UI
DF-Toolkit: Filtering
Pre-filled SQL User-added Conditions
• The JOIN needed for a • Example conditions:
single evidence.
• Does not change. • Filter on column
datatypes for a table.

• Filter on a REGEX
Future Work
Conta
• User study ct Us!
• Aggregate collaborator criteria

• System Catalog Information


• E.g., Replace ObjectID# with table “Customer”
• We assume incomplete systems
• DBMS-specific

• Integration of Non-DBMS Data


• E.g., those references to large data
Contact and Info
• Email
• Jay: jwagne32@depaul.edu
• Alex: arasin@depaul.edu
• Jonathan: jdgrier@grierforensics.com

• DB3F Examples and DF-Toolkit:


http://dbgroup.cdm.depaul.edu/DF-Toolkit.html

• Poster – tomorrow at lunch


“Database Forensics: Where the Wild Things Are”
DB3F: JSON Header
• High-level metadata
• Organizations can easily add/remove fields for their
SOP requirements.
DF-Toolkit: Tree Nodes
• Our philosophy: “A DBMS as a separate system”
+ Root – major storage structure
• A single DBMS in each evidence
(no disk partitions)
• Ex: RAM snapshot w/
PostgreSQL and SQLite
+ Object
– e.g., tables, indexes, MVs

+ Pages

+ Records

You might also like