Professional Documents
Culture Documents
03 Storage1
03 Storage1
Systems (15-445/645)
Lecture #03
Database
Storage
Part 1
FALL 2023 Prof. Andy Pavlo Prof. Jignesh Patel
2
ADMINISTRIVIA
LAST CLASS
COURSE OUTLINE
D I S K- B A S E D A RC H I T E C T U R E
S TO R AG E H I E R A RC H Y
Faster
CPU Registers Smaller
Expensive
CPU Caches
Volatile
Random Access
Byte-Addressable DRAM
Non-Volatile SSD
Sequential Access
Block-Addressable
HDD
Slower
Larger
Network Storage Cheaper
15-445/645 (Fall 2023)
7
S TO R AG E H I E R A RC H Y
Faster
CPU Registers Smaller
CPU Expensive
CPU Caches
Memory DRAM
SSD
Disk HDD
Slower
Larger
Network Storage Cheaper
15-445/645 (Fall 2023)
8
S TO R AG E H I E R A RC H Y
Faster
CPU Registers Smaller
CPU Expensive
CPU Caches
Memory DRAM
Persistent Memory
SSD
Fast Network Storage
Disk HDD
Slower
Larger
Network Storage Cheaper
15-445/645 (Fall 2023)
9
S TO R AG E H I E R A RC H Y
Faster
CPU Registers Smaller
CPU Expensive
CPU Caches
Memory DRAM
Persistent Memory
SSD
Fast Network Storage
Disk HDD
Slower
Larger
Network Storage Cheaper
15-445/645 (Fall 2023)
10
S TO R AG E H I E R A RC H Y
Faster
CPU Registers Smaller
CPU Expensive
CPU Caches
Memory DRAM
Persistent Memory
SSD
Fast Network Storage
Disk HDD
Slower
Larger
Network Storage Cheaper
15-445/645 (Fall 2023)
11
S TO R AG E H I E R A RC H Y
Faster
CPU Registers Smaller
CPU Expensive
CPU Caches
Memory DRAM
Persistent Memory
SSD
Fast Network Storage
Disk HDD
Slower
Larger
Network Storage Cheaper
15-445/645 (Fall 2023)
12
AC C E S S T I M E S
Latency Numbers Every Programmer Should Know
1 ns L1 Cache Ref
4 ns L2 Cache Ref
100 ns DRAM
16,000 ns SSD
2,000,000 ns HDD
~50,000,000 ns Network Storage
1,000,000,000 ns Tape Archives
Source: Colin Scott
15-445/645 (Fall 2023)
13
AC C E S S T I M E S
Latency Numbers Every Programmer Should Know
1 ns L1 Cache Ref 1 sec
4 ns L2 Cache Ref 4 sec
100 ns DRAM 100 sec
16,000 ns SSD 4.4 hours
2,000,000 ns HDD 3.3 weeks
~50,000,000 ns Network Storage 1.5 years
1,000,000,000 ns Tape Archives 31.7 years
Source: Colin Scott
15-445/645 (Fall 2023)
14
S E Q U E N T I A L V S . R A N D O M AC C E S S
S Y S T E M D E S I G N G OA L S
D I S K- O R I E N T E D D B M S
Get Page #2
Execution
Buffer Pool Engine
Directory
Memory
Database File
1 2 3 4 5 … Pages
Disk
15-445/645 (Fall 2023)
17
D I S K- O R I E N T E D D B M S
Get Page #2
Execution
Engine
Pointer to Page #2
Buffer Pool
Directory Header
2
Memory
Database File
1 2 3 4 5 … Pages
Disk
15-445/645 (Fall 2023)
18
D I S K- O R I E N T E D D B M S
Get Page #2
Execution
Engine
Pointer to Page #2
Buffer Pool
Directory Header
Interpret Page #2 layout
2 Update Page #2
Memory
Database File
1 2 3 4 5 … Pages
Disk
15-445/645 (Fall 2023)
19
D I S K- O R I E N T E D D B M S
Lectures #12-13
Get Page #2
Lecture #6 Execution
Engine
Pointer to Page #2
Buffer Pool
Directory Header
Interpret Page #2 layout
2 Update Page #2
Memory
Lecture #6
Database File
1 2 3 4 5 … Pages
Disk
15-445/645 (Fall 2023)
11
W H Y N OT U S E T H E O S ?
On-Disk File
15-445/645 (Fall 2023)
11
W H Y N OT U S E T H E O S ?
On-Disk File
15-445/645 (Fall 2023)
11
W H Y N OT U S E T H E O S ?
On-Disk File
15-445/645 (Fall 2023)
11
W H Y N OT U S E T H E O S ?
On-Disk File
15-445/645 (Fall 2023)
24
W H Y N OT U S E T H E O S ?
M E M O R Y M A P P E D I /O P RO B L E M S
W H Y N OT U S E T H E O S ?
W H Y N OT U S E T H E O S ?
W H Y N OT U S E T H E O S ?
W H Y N OT U S E T H E O S ?
DATA B A S E S TO R AG E
TO DAY ' S AG E N DA
File Storage
Page Layout
Tuple Layout
F I L E S TO R AG E
S TO R AG E M A N AG E R
DATA B A S E PAG E S
DATA B A S E PAG E S
Default DB Page Sizes
There are three different notions of
"pages" in a DBMS: 4KB
→ Hardware Page (usually 4KB)
→ OS Page (usually 4KB, x64 2MB/1GB)
→ Database Page (512B-32KB)
16KB
15-445/645 (Fall 2023)
36
PAG E S TO R AG E A RC H I T E C T U R E
HEAP FILE
Get Page #2
…
15-445/645 (Fall 2023)
38
HEAP FILE
Get Page #2
HEAP FILE
H E A P F I L E : PAG E D I R E C TO R Y Page0
…
about available space:
Page100
→ The number of free slots per page.
→ List of free / empty pages.
Data
TO DAY ' S AG E N DA
File Storage
Page Layout
Tuple Layout
PAG E H E A D E R
PAG E L AYO U T
PAG E L AYO U T
T U P L E - O R I E N T E D S TO R AG E
T U P L E - O R I E N T E D S TO R AG E
T U P L E - O R I E N T E D S TO R AG E
T U P L E - O R I E N T E D S TO R AG E
T U P L E - O R I E N T E D S TO R AG E
S LOT T E D PAG E S
Slot Array
The most common layout scheme is
called slotted pages.
1 2 3 4 5 6 7
Header
The slot array maps "slots" to the
tuples' starting position offsets.
S LOT T E D PAG E S
Slot Array
The most common layout scheme is
called slotted pages.
1 2 3 4 5 6 7
Header
The slot array maps "slots" to the
tuples' starting position offsets.
S LOT T E D PAG E S
Slot Array
The most common layout scheme is
called slotted pages.
1 2 3 4 5 6 7
Header
The slot array maps "slots" to the
tuples' starting position offsets.
S LOT T E D PAG E S
Slot Array
The most common layout scheme is
called slotted pages.
1 2 3 4 5 6 7
Header
The slot array maps "slots" to the
tuples' starting position offsets.
S LOT T E D PAG E S
Slot Array
The most common layout scheme is
called slotted pages.
1 2 3 4 5 6 7
Header
The slot array maps "slots" to the
tuples' starting position offsets.
RECORD IDS
TO DAY ' S AG E N DA
File Storage
Page Layout
Tuple Layout
T U P L E L AYO U T
TUPLE HEADER
T U P L E DATA
D E N O R M A L I Z E D T U P L E DATA
D E N O R M A L I Z E D T U P L E DATA
D E N O R M A L I Z E D T U P L E DATA
D E N O R M A L I Z E D T U P L E DATA
CONCLUSION
NEXT CLASS
Log-Structured Storage
Index-Organized Storage
Value Representation
Catalogs