15 Gfs
Michael Siegenthaler
Cornell Computer Science
CS 6464
10th March 2009
Motivating Application: Search
• $$$$$
• Doesn’t scale
Google Platform Characteristics
GFS: Architecture (2)
Master Server
Client Write
Client Write (2)
Client Write (3)
Client Record Append
GFS: Consistency Model (2)
                    Write                     Record Append
serial success      defined                   defined interspersed with inconsistent
concurrent success  consistent but undefined  defined interspersed with inconsistent
failure             inconsistent              inconsistent
Applications and Record Append Semantics
• Applications should use self-describing records and checksums when using Record Append
– Reader can identify padding / record fragments
• If an application cannot tolerate duplicated records, it should include a unique ID in each record
– Reader can use unique IDs to filter duplicates
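
The reader-side logic described above can be sketched as follows. This is only an illustration under stated assumptions, not GFS's or any Google library's actual record format: the `MAGIC` marker, the header layout, and the function names `encode_record` and `read_records` are all hypothetical. Each record carries its own length, a 16-byte unique ID, and a CRC32 checksum, so a reader can skip padding and fragments and drop duplicates.

```python
import struct
import zlib

# Hypothetical self-describing record layout (an assumption for illustration):
# 4-byte marker, 4-byte payload length, 16-byte unique ID, 4-byte CRC32.
MAGIC = b"RCRD"
HEADER = struct.Struct(">4sI16sI")


def encode_record(payload: bytes, record_id: bytes) -> bytes:
    """Wrap a payload so that a reader can validate it on its own."""
    checksum = zlib.crc32(record_id + payload)
    return HEADER.pack(MAGIC, len(payload), record_id, checksum) + payload


def read_records(region: bytes):
    """Yield each valid payload once, skipping padding, fragments, duplicates."""
    seen = set()
    pos = 0
    while pos + HEADER.size <= len(region):
        magic, length, record_id, checksum = HEADER.unpack_from(region, pos)
        end = pos + HEADER.size + length
        if magic == MAGIC and end <= len(region):
            payload = region[pos + HEADER.size:end]
            if zlib.crc32(record_id + payload) == checksum:
                if record_id not in seen:  # filter duplicates by unique ID
                    seen.add(record_id)
                    yield payload
                pos = end
                continue
        # Not a valid record here (padding or a fragment): resynchronize.
        pos += 1
```

A writer would call `encode_record` before each Record Append and retry on failure; retries may leave duplicates or partial records in the file, which `read_records` tolerates because it trusts only records whose checksum verifies and whose ID it has not yet seen.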
Logging at Master
Chunk Leases and Version Numbers
File Deletion
Limitations
• Security?
– Trusted environment, trusted users
– But that doesn’t stop users from interfering with each other…
• Does not mask all forms of data corruption
– Requires application-level checksum
(for a single file)
Recovery Time