Log Structured FS & Unix
Arvind Krishnamurthy
Spring 2001

Log Structured File Systems: Motivation
- Motivation:
  - Small writes are expensive on disks
  - Treat disk like a tape; do large I/Os
    - Collect into memory
    - Write large chunks
    - Append-only log
- Log is the only representation on disk
- Two problems:
  - Data constantly moves around
  - Disk fills up, leaving behind holes
    - Eventually fragments

- Inode-map gets written to the disk
- … holes made up by new data
- Disk does not have to be 50% utilized, only segments have to be 50% utilized!
- Would be great to have a bi-modal distribution
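
The write path outlined in the motivation (collect writes in memory, write large chunks, append-only log) can be sketched in Python as follows. This is only a toy model under stated assumptions: the class name `AppendOnlyLog`, the `segment_blocks` parameter, and the `where` lookup table are hypothetical names introduced for illustration, not taken from the slides.

```python
class AppendOnlyLog:
    """Toy model: buffer small writes in memory, flush them as one large segment."""

    def __init__(self, segment_blocks=4):
        self.segment_blocks = segment_blocks  # hypothetical segment size, in blocks
        self.buffer = []   # pending (block_id, data) writes held in memory
        self.log = []      # the "disk": a list of segments, only ever appended to
        self.where = {}    # block_id -> (segment index, slot) of the newest copy

    def write(self, block_id, data):
        # Small writes are only collected in memory here.
        self.buffer.append((block_id, data))
        if len(self.buffer) >= self.segment_blocks:
            self.flush()

    def flush(self):
        if not self.buffer:
            return
        seg_index = len(self.log)
        self.log.append(list(self.buffer))  # one large sequential write
        for slot, (block_id, _) in enumerate(self.buffer):
            # Any older copy of this block is now a hole in an earlier segment.
            self.where[block_id] = (seg_index, slot)
        self.buffer.clear()

    def read(self, block_id):
        # The newest copy may still be in the in-memory buffer.
        for bid, data in reversed(self.buffer):
            if bid == block_id:
                return data
        seg, slot = self.where[block_id]
        return self.log[seg][slot][1]
```

Overwriting a block never updates it in place; the new copy goes to the end of the log, which is why data "constantly moves around" and old segments accumulate holes.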
Write Cost Graph
[Graph: write cost versus disk utilization]

Simulation Results
- 4 KB writes
- Randomly write blocks
- Greedy cleaner
  - Pick the segment with the most free space
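
The greedy cleaner policy can be sketched in a few lines of Python. This assumes each segment is summarized by a single utilization value between 0 and 1 (fraction of live data); the function name `pick_greedy` and the sample numbers are made up for illustration.

```python
def pick_greedy(utilizations):
    """Return the index of the segment with the lowest utilization,
    i.e. the one that frees the most space when cleaned."""
    return min(range(len(utilizations)), key=lambda i: utilizations[i])


# Hypothetical utilizations for four segments.
segments = [0.90, 0.35, 0.60, 0.15]
victim = pick_greedy(segments)  # index of the emptiest segment
```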
Locality
- 90-10 pattern:
  - 90% of accesses go to 10% of the blocks
- Things got worse!
- Maybe we should segregate data:
  - Clean a whole bunch of segments at a time
  - Segregate old data from new data
- Things didn't improve!
- Let us look at the distribution of segment utilization

Observations
- Greedy strategy based on utilization alone doesn't work
- Need to consider age of the blocks as well
  - Some segments might tie up just a small number of blocks, but they are tied up for a long time
  - Need to consider block-seconds rather than just blocks
- Cost-benefit analysis (u = segment utilization, the fraction of the segment still live):
  - Benefit = free space generated * age of data = (1 - u) * age
  - Cost = (1 + u): read the whole segment, write back the live fraction u
  - Pick segment with greatest benefit/cost ratio
- Voila: we get a bimodal distribution!
- Excellent research:
  - High risk, simulations, lessons learnt, better algo → implementation
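
The cost-benefit selection rule can be written out directly from the formulas on the slide. This is a minimal sketch: the function names and the two sample segments are hypothetical, and age is in arbitrary units.

```python
def benefit_cost(u, age):
    """Benefit/cost ratio from the slide: (1 - u) * age / (1 + u)."""
    return (1.0 - u) * age / (1.0 + u)


def pick_cost_benefit(segments):
    """segments is a list of (u, age) pairs; return the index with the best ratio."""
    return max(range(len(segments)), key=lambda i: benefit_cost(*segments[i]))


# Hypothetical example: a nearly empty but young segment versus a
# fuller segment whose data has been stable for a long time.
segments = [(0.15, 1.0), (0.75, 100.0)]
choice = pick_cost_benefit(segments)
```

With these made-up numbers a greedy cleaner would pick the first segment (lowest u), while the cost-benefit rule prefers the second, because its free space has been tied up much longer.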
Unix File System
- File system paper! 50-75% of code in modern OS goes to file system, device drivers (support structures)
- Hierarchical file system: seems natural
  - In comparison, TOPS-10 had a single directory per user
- Directories are like files
- Beauty in Unix is uniformity
  - Byte-oriented (no records; previously 80 byte records were …)
  - Devices and files are the same
    - Operations, naming, permissions
- Set-user Id (avoid special kernel calls)
- …………

Process Management
- Fork & Exec:
  - Fork copies a process
  - Exec overlays a new process
- Advantages:
  - Fork, change some small pieces of the process
  - Small piece of code
  - Child has different set of I/O pipes; used for redirection
- Very simple kernel, pull everything into user-level
  - Simplify kernel
  - Different people can do it differently
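
The fork-then-exec pattern with output redirection can be sketched using Python's `os` module, which wraps the underlying Unix calls. This is an illustration only and assumes a Unix-like system; the helper name `run_redirected` and its parameters are hypothetical.

```python
import os


def run_redirected(argv, out_path):
    """Fork, point the child's stdout at out_path, then exec argv."""
    pid = os.fork()                    # fork copies the calling process
    if pid == 0:                       # child: change a small piece, then exec
        try:
            fd = os.open(out_path, os.O_WRONLY | os.O_CREAT | os.O_TRUNC, 0o644)
            os.dup2(fd, 1)             # child's stdout now goes to the file
            os.close(fd)
            os.execvp(argv[0], argv)   # exec overlays the new program
        finally:
            os._exit(127)              # reached only if open or exec failed
    _, status = os.waitpid(pid, 0)     # parent waits for the child
    return os.WEXITSTATUS(status) if os.WIFEXITED(status) else -1
```

The redirection happens between fork and exec, in the child only, so the program being run needs no knowledge of it and the parent's own descriptors are untouched.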
More on RAIDs
- Benefits
  - Load gets automatically balanced among disks
  - Can transfer large file at aggregate bandwidth of all disks
- Problem --- what if one disk fails?
- Goal --- availability --- never lose access to data
  - System should continue to work even if some components are not working.
- Solution: dedicate one disk to hold bitwise parity for other disks in stripe. Thus can lose any disk and data would still be available.

Adding parity bits to RAID
- With k+1 disks
  - Disk1 has blocks 1, k+1, 2k+1, …
  - Disk2 has blocks 2, k+2, 2k+2, …
  - …
  - Parity disk has blocks parity(1..k), parity(k+1..2k), …
- If we lose any disk, can recover data from other disks plus parity
  - Disk1 holds 1001
  - Disk2 holds 0101
  - Disk3 holds 1000
  - Parity disk: 0100
  - What if we lose disk2? Its contents are parity of remainder!
- Updating a disk block needs to update both data and parity --- need to use write ahead logging to support crash recovery
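
The striping layout and the parity example above can be checked with a short Python sketch. The helper names `locate` and `parity` are hypothetical; the bit patterns are the ones from the slide, and parity is taken to be bitwise XOR.

```python
from functools import reduce


def locate(block, k):
    """Map a 1-based block number to (data disk number, stripe index)
    for k data disks: Disk1 gets blocks 1, k+1, 2k+1, ..."""
    return ((block - 1) % k + 1, (block - 1) // k)


def parity(blocks):
    """Bitwise parity (XOR) of a collection of integer-encoded blocks."""
    return reduce(lambda a, b: a ^ b, blocks)


# Contents from the slide, one block per data disk.
data = {1: 0b1001, 2: 0b0101, 3: 0b1000}
p = parity(data.values())                       # parity disk contents

# Lose disk 2: XOR the surviving data disks with the parity disk.
survivors = [v for disk, v in data.items() if disk != 2]
recovered = parity(survivors + [p])
```

Here `p` comes out as 0100 and `recovered` as 0101, matching the slide: the lost disk's contents are the parity of everything that remains.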