
CST 402 - DISTRIBUTED COMPUTING

Module – V
Consensus and Distributed file system
Module – V
Lesson Plan
▪ L1: Consensus and agreement algorithms – Assumptions, the Byzantine
agreement and other problems
▪ L2: Agreement in (message-passing) synchronous systems with failures –
Consensus algorithm for crash failures
▪ L3: Agreement in (message-passing) synchronous systems with failures –
Consensus algorithm for crash failures
▪ L4: Distributed File System – File Service Architecture
▪ L5: Case Studies: Sun Network File System
▪ L6: Andrew File System
▪ L7: Google File System

Distributed File Systems
A file system is a subsystem of the operating system that performs
file management activities such as:
 Organization
 Storing
 Retrieval
 Naming
 Sharing
 Protection
A distributed file system (DFS) is a method of storing and accessing
files based on a client/server architecture.

 In a distributed file system, one or more central servers store files
that can be accessed, with proper authorization rights, by any
number of remote clients in the network.
 Files contain both data and attributes.
 The data consist of a sequence of data items, accessible by
operations to read and write any portion of the sequence.
 The attributes are held as a single record containing information
such as the length of the file, timestamps, file type, owner’s identity
and access control lists.
 The term metadata is often used to refer to all of the extra
information stored by a file system that is needed for the
management of files.
 It includes file attributes, directories and all the other persistent
information used by the file system.
Distributed file system requirements
 Transparency
 Heterogeneity
 Concurrent file updates
 Fault tolerance
 File replication
 Consistency
 Security
 Efficiency
1. Transparency:
It is the concealment from the user and the application programmer of
the separation of components in a distributed system.
● Access transparency: Client programs should be unaware of the
distribution of files. A single set of operations is provided for access
to local and remote files.
● Location transparency: enables files to be accessed without
knowledge of their location.
● Mobility transparency: allows the movement of files without affecting
the operation of users.
● Performance transparency: Client programs should continue to perform
satisfactorily while the load on the service varies within a specified
range.
● Scaling transparency: The service can be expanded by incremental
growth to deal with a wide range of loads and network sizes.
2. Concurrent file updates:
Changes to a file by one client should not interfere with the
operation of other clients simultaneously accessing or changing the
same file.
3. File replication:
In a file service that supports replication, a file may be represented
by several copies of its contents at different locations.
Replication enhances fault tolerance by enabling clients to locate another
server that holds a copy of the file when one server has failed.
4. Heterogeneity:
The service interfaces should be defined so that client and server
software can be implemented for different operating systems and
computers.
5. Fault tolerance: The service should continue to operate in the face of failures.
6. Security:
There is a need to authenticate client requests so that access control
at the server is based on correct user identities, and
to protect the contents of request and reply messages with digital
signatures and (optionally) encryption of secret data.
7. Efficiency:
The service should provide a good level of performance.
8. Consistency: Any change made to one file must also be made to
its replicated copies.
File Service Architecture
 An architecture that offers a clear separation of the main
concerns in providing access to files is obtained by
structuring the file service as three components:
 A flat file service
 A directory service
 A client module
Figure 5. File service architecture
(Client computer: application programs and the client module; server computer:
directory service and flat file service.)


 Responsibilities of the various modules can be
defined as follows:
 Flat file service:
 Concerned with the implementation of
operations on the contents of files.
 Unique File Identifiers (UFIDs) are used to refer
to files in all requests for flat file service
operations.
 UFIDs are long sequences of bits chosen so that
each file has a UFID that is unique among all of
the files in a distributed system.
 Directory service:
 Provides a mapping between text names for files
and their UFIDs.
 Clients may obtain the UFID of a file by quoting its
text name to the directory service.
 The directory service supports the functions needed to
generate directories and to add new files to directories.
 Client module:
It runs on each client computer and provides an integrated service (flat file
and directory) as a single API to application programs.
Flat file service operations
 Read(FileId, i, n) -> Data: if 1 ≤ i ≤ Length(File), reads a sequence of up to n items
from the file starting at item i and returns it in Data.
 Write(FileId, i, Data): if 1 ≤ i ≤ Length(File)+1, writes a sequence of Data to the
file, starting at item i, extending the file if necessary.
 Create() -> FileId: Creates a new file of length 0 and delivers a UFID for it.
 Delete(FileId): Removes the file from the file store.
 GetAttributes(FileId) -> Attr: Returns the file attributes for the file.
 SetAttributes(FileId, Attr): Sets the file attributes.
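A minimal Python sketch of how the flat file service operations above might look, assuming an in-memory store and random hex strings as UFIDs (these representation choices are illustrative assumptions, not part of the model itself):

```python
import secrets

class FlatFileService:
    """In-memory sketch of the flat file service interface above."""

    def __init__(self):
        self.files = {}   # UFID -> list of data items
        self.attrs = {}   # UFID -> attribute record

    def create(self):
        ufid = secrets.token_hex(16)          # long bit string used as a UFID
        self.files[ufid] = []                 # new file of length 0
        self.attrs[ufid] = {"length": 0}
        return ufid

    def read(self, ufid, i, n):
        data = self.files[ufid]
        if not (1 <= i <= len(data)):
            raise IndexError("bad position")
        return data[i - 1:i - 1 + n]          # up to n items starting at item i

    def write(self, ufid, i, items):
        data = self.files[ufid]
        if not (1 <= i <= len(data) + 1):
            raise IndexError("bad position")
        data[i - 1:i - 1 + len(items)] = items   # overwrite, extending if necessary
        self.attrs[ufid]["length"] = len(data)

    def delete(self, ufid):
        del self.files[ufid], self.attrs[ufid]

    def get_attributes(self, ufid):
        return self.attrs[ufid]

    def set_attributes(self, ufid, attr):
        self.attrs[ufid] = attr
```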
Directory service operations
 Lookup(Dir, Name) -> FileId: Locates the text name in the directory and
returns the relevant UFID. If Name is not in the directory, throws an exception.
 AddName(Dir, Name, File): If Name is not in the directory, adds (Name, File)
to the directory and updates the file’s attribute record.
If Name is already in the directory, throws an exception.
 UnName(Dir, Name): If Name is in the directory, the entry containing Name
is removed from the directory. If Name is not in the directory, throws an exception.
 GetNames(Dir, Pattern) -> NameSeq: Returns all the text names in the directory
that match the regular expression Pattern.
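In the same illustrative style, a sketch of the directory service, followed by how a client module could combine the two services to read a file by name (the dictionary-based directory and the function names are assumptions for illustration):

```python
import re

class DirectoryService:
    """Sketch of the directory service; a directory is modelled as a plain
    dictionary mapping text names to UFIDs."""

    def lookup(self, directory, name):
        if name not in directory:
            raise KeyError(f"{name} not in directory")
        return directory[name]

    def add_name(self, directory, name, ufid):
        if name in directory:
            raise KeyError(f"{name} already in directory")
        directory[name] = ufid

    def un_name(self, directory, name):
        if name not in directory:
            raise KeyError(f"{name} not in directory")
        del directory[name]

    def get_names(self, directory, pattern):
        return [n for n in directory if re.fullmatch(pattern, n)]

# How a client module might combine the two services to read a file by name
# (FlatFileService is the sketch from the previous slide):
def client_read(dir_service, file_service, directory, name, i, n):
    ufid = dir_service.lookup(directory, name)   # text name -> UFID
    return file_service.read(ufid, i, n)         # UFID -> data items
```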
File Service Architecture
 Access control
 In distributed implementations, access rights checks
have to be performed at the server because the
server RPC interface is an otherwise unprotected
point of access to files.
 An access check is made whenever a file name is
converted to a UFID
 A user identity is submitted with every client request,
and access checks are performed by the server for
every file operation.
File Service Architecture

 File Group
 A file group is a collection of files that can be
located on any server or moved between servers
while maintaining the same names.
– A similar construct is used in a UNIX file
system.
– It helps with distributing the load of file serving
between several servers.
– File groups have identifiers which are unique
throughout the system (and hence for an open
system, they must be globally unique).
DFS: Case Studies
 NFS (Network File System)
 Developed by Sun Microsystems (in 1985)
 The most popular, open, and widely used DFS.
 Client and server communicate using RPC (both TCP and UDP
are supported).
 OS independent, but originally developed for UNIX.
 AFS (Andrew File System)
 Developed by Carnegie Mellon University as part of the Andrew
distributed computing environment (in 1986)
 A research project to create a campus-wide file system.

NFS architecture
 The NFS module resides in the kernel on each computer.
 Requests referring to files in a remote file system are translated
by the client module to NFS protocol operations and then
passed to the NFS server module at the computer holding the
relevant file system.
 The NFS client and server modules communicate using remote
procedure calls.
 The RPC interface to the NFS server is open: any process can
send requests to an NFS server;
if the requests are valid and they include valid user credentials,
they will be acted upon.

Virtual File System:
 Provides access transparency.
 Users can issue file operations for local and remote files without
distinction.
The file identifiers used in NFS are called file handles.
● The file system identifier field is a unique number that is allocated to each
file system when it is created.
● The i-node number is needed to locate the file in the file system; it is also
used to store its attributes. I-node numbers are reused after a file is
removed.
● The i-node generation number is incremented each time the i-node
number is reused after a file is removed.
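As a rough illustration of the three fields a file handle carries (the class name, types and example values are assumptions, not the on-the-wire format):

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class NFSFileHandle:
    """Illustrative model of the fields in an NFS file handle."""
    filesystem_id: int        # unique number allocated to the file system at creation
    inode_number: int         # locates the file within that file system
    inode_generation: int     # incremented each time the i-node number is reused

# The handle is opaque to clients: they simply pass it back on later requests.
handle = NFSFileHandle(filesystem_id=7, inode_number=541, inode_generation=3)
```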
NFS client:
 The NFS client module cooperates with the virtual file system in each
client machine.
 It operates in a similar manner to the conventional UNIX file system,
transferring blocks of files to and from the server and caching the
blocks in local memory whenever possible.
 If the file is local, a reference to the index of the local file is kept;
if the file is remote, the reference contains the file handle of the remote file.
NFS server operations
• read(fh, offset, count) -> attr, data
• write(fh, offset, count, data) -> attr
• create(dirfh, name, attr) -> newfh, attr
• remove(dirfh, name) -> status
• getattr(fh) -> attr
• setattr(fh, attr) -> attr
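A toy sketch (an illustrative assumption, not the real NFS protocol code) of why operations like these can be served statelessly: every request carries the file handle and an absolute offset, so the server needs no per-client open-file state between calls:

```python
class NFSServer:
    """Toy stateless NFS-style server: each call names the file handle and offset."""

    def __init__(self):
        self.store = {}   # file handle -> bytearray of file contents
        self.attrs = {}   # file handle -> attribute dict

    def read(self, fh, offset, count):
        data = bytes(self.store[fh][offset:offset + count])
        return self.attrs[fh], data

    def write(self, fh, offset, data):
        buf = self.store[fh]
        if len(buf) < offset + len(data):
            buf.extend(b"\0" * (offset + len(data) - len(buf)))  # extend if necessary
        buf[offset:offset + len(data)] = data
        self.attrs[fh]["size"] = len(buf)
        return self.attrs[fh]

    def getattr(self, fh):
        return self.attrs[fh]

# Setting up a file under a handle and reading it back:
srv = NFSServer()
fh = (7, 541, 3)                          # handles identify files across calls
srv.store[fh] = bytearray(b"hello world")
srv.attrs[fh] = {"size": 11}
print(srv.read(fh, offset=0, count=5))    # -> ({'size': 11}, b'hello')
```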
 NFS access control and authentication
– The NFS server is a stateless server, so the user's
identity and access rights must be checked by the
server on each request.
 Kerberos has been integrated with NFS to provide a
stronger and more comprehensive security solution.
The Andrew File System (AFS)
 Like NFS, AFS provides transparent
access to remote shared files for UNIX
programs running on workstations.
 AFS is implemented as two software
components that exist as UNIX processes
called Vice and Venus.
 Vice → the server software (the file service)
 Venus → the client software running on each workstation
Architecture: The Andrew File System (AFS)
Figure 11. Distribution of processes in the Andrew File System
(Venus runs alongside user programs above the UNIX kernel on each workstation;
Vice runs on the server machines, reached over the network.)
 The files available to user processes running on
workstations are either local or shared.
 Local files are handled as normal UNIX files.
 They are stored on the workstation’s disk and
are available only to local user processes.
 Shared files are stored on servers, and copies
of them are cached on the local disks of
workstations.

Operations
 Fetch(fid) -> attr, data: Returns the attributes (status) and, optionally, the contents
of the specified file.
 Store(fid, attr, data): Updates the attributes and (optionally) the contents of the
specified file.
 Create() -> fid: Creates a new file and returns its identifier.
 Remove(fid): Deletes the specified file.
 SetLock(fid, mode): Sets a lock on the specified file or directory.
 ReleaseLock(fid): Unlocks the specified file or directory.
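Building on the Fetch and Store operations above, a toy sketch of the whole-file caching behaviour described earlier (shared files cached on the workstation's local disk). All names, the cache layout and the write-back-on-close policy are assumptions for illustration, not AFS source:

```python
import os

class Venus:
    """Toy sketch: fetch a whole file into a local cache on open,
    serve it locally, and store it back to the server on close."""

    def __init__(self, vice, cache_dir="/tmp/afs_cache"):
        self.vice = vice                      # hypothetical server-side Fetch/Store interface
        self.cache_dir = cache_dir
        os.makedirs(cache_dir, exist_ok=True)

    def open(self, fid):
        attr, data = self.vice.fetch(fid)     # Fetch(fid) -> attr, data
        path = os.path.join(self.cache_dir, str(fid))
        with open(path, "wb") as f:
            f.write(data)                     # keep a whole-file copy on local disk
        return path                           # local reads/writes use the cached copy

    def close(self, fid, path):
        with open(path, "rb") as f:
            data = f.read()
        self.vice.store(fid, None, data)      # Store(fid, attr, data) back to the server
```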
Google File System
▪ Google Inc. developed the Google File System (GFS), a scalable distributed file
system (DFS), to meet the company’s growing data processing needs.
▪ GFS offers fault tolerance, dependability, scalability, availability, and performance
to large networks and connected nodes.
▪ GFS is designed to tolerate frequent hardware failures while exploiting the cost
advantages of commercially available (commodity) servers.
▪ It manages two types of data, namely file metadata and file data.
Google File System
▪ Stored data is split into large (64 MB) chunks that are replicated at least three
times across the network (a chunk-index sketch follows below).
▪ The large chunk size reduces network overhead.
▪ GFS is designed to meet Google’s huge cluster requirements without hindering
applications.
▪ Files are stored in hierarchical directories identified by path names.
▪ The master is in charge of managing metadata, including the namespace, access
control information, and the mapping of files to chunks.
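As a small illustration of the arithmetic implied by fixed-size 64 MB chunks (the function name is an assumption, not GFS API), a client can derive which chunk a byte offset falls into before contacting the master:

```python
CHUNK_SIZE = 64 * 1024 * 1024   # 64 MB chunks, as described above

def chunk_location(byte_offset: int) -> tuple[int, int]:
    """Translate a byte offset within a file into (chunk index, offset within chunk).
    The client sends the file name and chunk index to the master, which replies
    with the chunk handle and the chunk servers holding its replicas."""
    return byte_offset // CHUNK_SIZE, byte_offset % CHUNK_SIZE

# Example: byte 200,000,000 of a file falls in chunk 2 (the third chunk).
print(chunk_location(200_000_000))   # -> (2, 65782272)
```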
Google File System
▪ The master communicates with each chunk server through timed heartbeat
messages and keeps track of its status (a minimal sketch follows below).
▪ The largest GFS clusters consist of more than 1,000 nodes with 300 TB of disk
storage capacity.
▪ This storage is available for constant access by hundreds of clients.
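A minimal sketch of how a master could track chunk server liveness from timed heartbeat messages. The interval, timeout and class names are assumptions for illustration, not GFS internals:

```python
import time

HEARTBEAT_INTERVAL = 10              # seconds between heartbeats (assumed value)
DEAD_AFTER = 3 * HEARTBEAT_INTERVAL  # declare a server dead after missed heartbeats

class MasterHeartbeatTracker:
    """Track the last heartbeat time reported by each chunk server."""

    def __init__(self):
        self.last_seen = {}                      # chunk server id -> last heartbeat time

    def on_heartbeat(self, server_id, status):
        self.last_seen[server_id] = time.time()  # record the status update
        # a real master would also merge the reported chunk list / disk usage here

    def dead_servers(self):
        now = time.time()
        return [s for s, t in self.last_seen.items() if now - t > DEAD_AFTER]
```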


Google File System
▪ Components of GFS
▪ GFS clients: computer programs or applications that request files. Requests
may be made to access and modify already existing files or to add new files
to the system.
▪ GFS master server:
 It serves as the cluster’s coordinator.
 It preserves a record of the cluster’s actions in an operation log.
 Additionally, it keeps track of the data that describes chunks, i.e. the metadata.


Google File System
GFS chunk servers:
They are the GFS’s workhorses.
They store the 64 MB file chunks.
Chunks do not pass through the master server;
instead, the chunk servers deliver the requested chunks directly to the client
(see the read sketch below).
GFS makes numerous copies of each chunk and stores them on different chunk
servers in order to ensure reliability; the default is three copies.
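A simplified sketch of a GFS-style read, assuming hypothetical master and chunk server objects (the method names are illustrative, not the real GFS API): the client asks the master only for metadata and then fetches the data directly from a chunk server:

```python
CHUNK_SIZE = 64 * 1024 * 1024

def gfs_read(master, file_name, byte_offset, length):
    """Read `length` bytes starting at `byte_offset` of a GFS-style file."""
    chunk_index = byte_offset // CHUNK_SIZE            # which 64 MB chunk holds the byte
    chunk_offset = byte_offset % CHUNK_SIZE            # position inside that chunk
    handle, replicas = master.lookup_chunk(file_name, chunk_index)   # metadata only
    chunk_server = replicas[0]                         # e.g. pick the nearest replica
    return chunk_server.read_chunk(handle, chunk_offset, length)     # data bypasses the master
```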
Google File System
Features of GFS
 Fault tolerance
 Reduced client and master interaction
 High availability
 Critical data replication
 Automatic and efficient data recovery
 High aggregate throughput


Consensus and Agreement Algorithms
 Agreement among the processes in a distributed system is a fundamental
requirement for a wide range of applications.
 Many forms of coordination require the processes to exchange information to
negotiate with one another and eventually reach a common understanding or
agreement, before taking application-specific actions.
Assumptions
1. Failure models:
 Among the n processes in the system, at most f processes can be faulty.
 A faulty process can behave in any manner allowed by the failure model assumed.
 The various failure models include fail-stop, send omission, receive omission, and
Byzantine failures.
Consensus and Agreement Algorithms
 2. Synchronous/asynchronous communication:
 If a failure-prone process chooses to send a message to process Pi but fails, then Pi
cannot detect the non-arrival of the message in an asynchronous system, because
this scenario is indistinguishable from the scenario in which the message takes a
very long time in transit.
 In a synchronous system, however, the scenario in which a message has not been
sent can be recognized by the intended recipient at the end of the round.
 3. Network connectivity:
 The system has full logical connectivity, i.e., each process can communicate with
any other by direct message passing.
Consensus and Agreement Algorithms
 4. Sender identification:
 A process that receives a message always knows the identity of the sender process.
 5. Channel reliability:
 The channels are reliable, and only the processes may fail.
 6. Authenticated vs. non-authenticated messages:
 With unauthenticated messages, when a faulty process relays a message to other
processes,
 (i) it can forge the message and claim that it was received from another process,
 (ii) it can also tamper with the contents of a received message before relaying it.
 When a process receives a message, it has no way to verify its authenticity.
 An unauthenticated message is also called an oral message or an unsigned message.

Consensus and Agreement Algorithms
 Using authentication via techniques such as digital signatures, it is easier to solve
the agreement problem because, if some process forges a message or tampers with
the contents of a received message before relaying it, the recipient can detect the
forgery or tampering.

 7. Agreement variable:
 The agreement variable may be Boolean or multivalued, and need not be an
integer.
The Byzantine Agreement Problem
 In the Byzantine agreement problem, a single value, which is to be agreed on, is
initialized by an arbitrary processor and all non-faulty processors have to agree on
that value.
 In the consensus problem, every process has its own initial value and all non-faulty
processes must agree on a single common value.
 The Byzantine agreement problem requires a designated process, called the source
process, with an initial value, to reach agreement with the other processes about
its initial value, subject to the following conditions:
 Agreement: All non-faulty processes must agree on the same value.
 Validity: If the source process is non-faulty, then the value agreed upon by all the
non-faulty processes must be the same as the initial value of the source.
 Termination: Each non-faulty process must eventually decide on a value.
 There are two other popular flavors of the Byzantine agreement problem:
 1. the consensus problem,
 2. the interactive consistency problem.
The Byzantine Agreement Problem
 The consensus problem
 The consensus problem differs from the Byzantine agreement problem in that each
process has an initial value and all the correct processes must agree on a single
value.
 Agreement: All non-faulty processes must agree on the same (single) value.
 Validity: If all the non-faulty processes have the same initial value, then the value
agreed upon by all the non-faulty processes must be that same value.
 Termination: Each non-faulty process must eventually decide on a value.


The Byzantine Agreement Problem
 The interactive consistency problem
 Each process has an initial value, and all the correct processes must agree upon a
set of values, with one value for each process.
 Agreement: All non-faulty processes must agree on the same array of values
A = [v1, ..., vn].
 Validity: If process i is non-faulty and its initial value is vi, then all non-faulty
processes agree on vi as the ith element of the array A. If process j is faulty, then the
non-faulty processes can agree on any value for A[j].
 Termination: Each non-faulty process must eventually decide on the array A.

Agreement in (message-passing)
synchronous systems with failures
 Consensus algorithm for crash failures
 A consensus algorithm for n processes, where up to f processes (f < n) may fail
in the fail-stop model.
 Here, the consensus variable x is integer-valued.
 Each process has an initial value xi.
 If up to f failures are to be tolerated, then the algorithm has f + 1 rounds.
 In each round, a process i sends the value of its variable xi to all other processes.
 Of all the values received within the round and its own value xi at the start of the
round, the process takes the minimum and updates xi.
 After f + 1 rounds, the local value xi is guaranteed to be the consensus value
(a round-by-round sketch follows below).
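A round-by-round simulation sketch of the algorithm above. The algorithm itself (broadcast, then take the minimum, for f + 1 rounds) is as described; the simulation harness and its simplified crash model, where a crashing process sends to nobody in its crash round, are assumptions for illustration:

```python
def min_consensus(initial_values, f, crashed_in_round=None):
    """Simulate the crash-failure consensus algorithm for f + 1 synchronous rounds."""
    crashed_in_round = crashed_in_round or {}          # process id -> round it crashes in
    x = dict(initial_values)                           # process id -> current value x_i
    alive = set(x)

    for r in range(1, f + 2):                          # f + 1 synchronous rounds
        # worst-case crash model: a process crashing this round sends to nobody
        alive -= {p for p, cr in crashed_in_round.items() if cr == r}
        received = {p: [x[q] for q in alive] for p in alive}   # broadcast among live processes
        for p in alive:
            x[p] = min([x[p]] + received[p])           # keep the minimum value seen so far
    return {p: x[p] for p in alive}                    # all surviving processes agree

# Example: 4 processes, tolerating f = 1 crash; process 2 crashes in round 1,
# so its value 3 never reaches anyone and the survivors agree on 5.
print(min_consensus({1: 7, 2: 3, 3: 9, 4: 5}, f=1, crashed_in_round={2: 1}))
# -> {1: 5, 3: 5, 4: 5}
```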