Download as docx, pdf, or txt
Download as docx, pdf, or txt
You are on page 1of 2

 Fault-tolerance or graceful degradation is the property that

enables a system (often computer-based) to continue operating


properly in the event of the failure of (or one or more faults
within) some of its components.
 As an example in real life environment, the Transmission
Control Protocol (TCP) is designed to allow reliable two-way
communication in a packet-switched network, even in the
presence of communications links which are imperfect or
overloaded.

The basic characteristics of fault tolerance require:

1. No single point of failure


2. No single point of repair
3. Fault isolation to the failing component
4. Fault containment to prevent propagation of the failure
5. Availability of reversion modes

Two main reasons for the occurrence of a fault :

1. Node failure -Hardware or software failure.


2. Malicious Error-Caused by unauthorized Access.

Fault Tolerance is needed in order to provide 3 main feature to


distributed systems :

1. Reliability-Focuses on a continuous service with out any


interruptions.
2. Availability - Concerned with read readiness of the system.
3. Security-Prevents any unauthorized access.

Examples-Patient Monitoring systems, flight control systems, Banking


Services etc.
Fault Tolerance Techniques
1) Replication :

 Creating multiple copies or replica of data items and storing them


at different sites
 Main idea is to increase the availability so that if a node fails at
one site, so data can be accessed from a different site.
 Has its limitation too such as data consistency and degree of
replica.

2 ) Check Pointing

 Saving the state of a system when they are in a consistent state


and storing it on a stable storage.
 Each such instance when a system is in the stable state is called
a check point.
 In case of a failure, system is restored to its previous consistent
state.
 Saves useful computation.

You might also like