
UNIT X – Database Recovery Techniques

- 3hrs

1
Recovery Concepts
➢ A database may fail because of inconsistency, network failure, errors, or accidental damage, but the data stored in it must be available when required.
➢ Database recovery therefore means restoring data that has been deleted, hacked, or accidentally damaged to its previous consistent state.
Recovery System in DBMS from Transaction Failure
• In a database recovery management system, there are two main recovery techniques that help a DBMS recover from failure while maintaining the atomicity of a transaction:
1. Log Based Recovery
2. Shadow Paging

2
Log Based Recovery
➢ A log is a sequence of records that contains the history of all updates made to the database. The log is the most commonly used structure for recording database modifications. It is sometimes also called the system log.
➢ An update log record has the following fields:
1. Transaction identifier: identifies the transaction that performed the write.
2. Data item identifier: identifies the data item that was written.
3. The old value of the data item (before the write operation).
4. The new value of the data item (after the write operation).
➢ The basic kinds of log record are as follows:
<T, Start>. Transaction T has started.
<T, X, V1, V2>. Transaction T has performed a write on data item X. V1 is the value X had before the write, and V2 is the value X has after the write.
<T, Commit>. Transaction T has committed.
<T, Abort>. Transaction T has aborted.
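The four record kinds can be modelled as small data types. The following is only a minimal sketch; the class names, field names, and the sample log are hypothetical and chosen for illustration.

```python
from dataclasses import dataclass
from typing import Any, List, Union


@dataclass(frozen=True)
class Start:
    txn: str              # <T, Start>


@dataclass(frozen=True)
class Update:
    txn: str              # <T, X, V1, V2>
    item: str             # X: the data item written
    old_value: Any        # V1: value of X before the write
    new_value: Any        # V2: value of X after the write


@dataclass(frozen=True)
class Commit:
    txn: str              # <T, Commit>


@dataclass(frozen=True)
class Abort:
    txn: str              # <T, Abort>


LogRecord = Union[Start, Update, Commit, Abort]


def committed_transactions(log: List[LogRecord]) -> set:
    """Return the identifiers of all transactions with a commit record."""
    return {rec.txn for rec in log if isinstance(rec, Commit)}


# Hypothetical sample log: T1 commits, T2 is still running.
log = [
    Start("T1"),
    Update("T1", "A", 1000, 950),
    Update("T1", "B", 1000, 1050),
    Commit("T1"),
    Start("T2"),
    Update("T2", "C", 700, 600),
]
print(committed_transactions(log))
```

Keeping both the old and the new value in each update record is what later allows a recovery routine to either undo or redo the write.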
3
Log Based Recovery

➢ The following points should be remembered when performing log-based recovery:
• Whenever a transaction performs a write, the log record for that write must be created before the database is modified.
• Once a log record exists, the modification can be output to the database when required. The log record also makes it possible to undo a modification that has already been written to the database.
➢ Log-based recovery works in two modes:
• Immediate Mode
• Deferred Mode

4
Log Based Recovery in Immediate Update
➢ In the immediate update mode of log-based recovery, database modifications are performed while the transaction is still in the active state.
➢ This means that as soon as the transaction executes a WRITE operation, the change is also saved in the database.
➢ In immediate update mode, there is no need to wait for the commit statement to execute before updating the database.
Explanation
• Consider the transaction T1 as shown in the above table. The log of this transaction is written in the second column. When the values of data items A and B are changed from 1000 to 950 and 1050 respectively, the values of A and B are updated in the database at the same time.

5
Log Based Recovery in Immediate Update
• In immediate mode, the log file needs both the old value and the new value of each data item.
• If the system crashes or fails, the following cases are possible.

Case 1: The system crashes after the transaction has executed the commit statement.
• In this case, when the transaction executed the commit statement, the corresponding commit entry was also written to the log file immediately.
• To recover the database, the recovery manager checks the log file and finds both <T, Start> and <T, Commit>. This shows that transaction T completed successfully before the system failed, so REDO(T) is performed and the updated values of data items A and B are set in the database.
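A minimal sketch of recovery under immediate update is given below. The redo pass corresponds to Case 1 above. The undo pass for transactions without a commit record follows the undo/redo description of immediate update given later in these notes. The tuple layout of the log records, the function name, and the sample values are assumptions made for illustration.

```python
def recover_immediate(log, db):
    """Undo/redo recovery sketch for immediate update.

    log: list of tuples, one of
         ("start", T), ("update", T, X, old, new), ("commit", T)
    db:  dict mapping data item -> value as found on disk after the crash
    """
    committed = {rec[1] for rec in log if rec[0] == "commit"}

    # UNDO pass: scan the log backwards and restore the old value of every
    # item written by a transaction that has no commit record.
    for rec in reversed(log):
        if rec[0] == "update" and rec[1] not in committed:
            _, _txn, item, old, _new = rec
            db[item] = old

    # REDO pass: scan the log forwards and reapply the new value of every
    # item written by a committed transaction.
    for rec in log:
        if rec[0] == "update" and rec[1] in committed:
            _, _txn, item, _old, new = rec
            db[item] = new
    return db


# Hypothetical crash state: T1 committed but B never reached the disk;
# T2 did not commit but its write to C did reach the disk.
crash_log = [
    ("start", "T1"),
    ("update", "T1", "A", 1000, 950),
    ("update", "T1", "B", 1000, 1050),
    ("commit", "T1"),
    ("start", "T2"),
    ("update", "T2", "C", 700, 600),
]
disk_after_crash = {"A": 950, "B": 1000, "C": 600}
recovered = recover_immediate(crash_log, dict(disk_after_crash))
print(recovered)
```

Both passes are needed here because, in immediate mode, the disk may hold changes of uncommitted transactions and may be missing changes of committed ones.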

6
Log Based Recovery in immediate update

7
Log Based Recovery in Deferred Mode
➢ In the deferred mode of log-based recovery, all modifications to the database are recorded in the log, but the WRITE operations on the database are deferred until the transaction is partially committed.
➢ In other words, in deferred mode the database is modified only after the commit operation of the transaction has been performed.
➢ For database recovery in deferred mode, there are two possible cases.
Case 1: The system fails or crashes after the transaction has performed the commit operation. Since the commit succeeded, the log file contains a commit entry for the transaction.

After the failure, the recovery manager checks the log file and finds both <T, Start> and <T, Commit>. This means the transaction completed successfully before the crash, so REDO(T) is performed and the updated values of data items A and B are set in the database.

8
Log Based Recovery in Deferred Mode
Case 2: The transaction fails before executing the commit. There is no commit statement in the transaction, as shown in the table given below, so there is no commit entry in the log file.
• In this case, when the system fails or crashes, the recovery manager checks the log file and finds the <T, Start> entry but no <T, Commit> entry. The transaction did not complete successfully before the failure, so to ensure atomicity the old values of data items A and B must remain in the database.
Note – In deferred mode there is no need to perform UNDO(T), because updated values of data items are not written to the database immediately after the WRITE operation.
• In deferred mode, updated values are written only after the transaction commits, so in this case the database still holds the old values of the data items.
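The two deferred-mode cases can be sketched as a redo-only recovery routine. This is an illustrative sketch: the tuple layout, the function name, and the assumption that a deferred-mode log record only needs the new value are not taken from the slides.

```python
def recover_deferred(log, db):
    """Redo-only recovery sketch for deferred update (no UNDO needed).

    log: list of tuples, one of
         ("start", T), ("write", T, X, new_value), ("commit", T)
    db:  dict mapping data item -> value on disk after the crash
    """
    committed = {rec[1] for rec in log if rec[0] == "commit"}
    for rec in log:
        # Case 1: the transaction committed, so redo its writes.
        # Case 2: no commit record, so its writes are simply ignored;
        #         the database still holds the old values.
        if rec[0] == "write" and rec[1] in committed:
            _, _txn, item, new = rec
            db[item] = new
    return db


writes = [("start", "T"), ("write", "T", "A", 950), ("write", "T", "B", 1050)]

# Case 1: commit record present, REDO(T) sets the updated values.
case1 = recover_deferred(writes + [("commit", "T")], {"A": 1000, "B": 1000})

# Case 2: no commit record, the old values are left untouched.
case2 = recover_deferred(writes, {"A": 1000, "B": 1000})
print(case1, case2)
```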

9
Shadow Paging Recovery Method
➢ Shadow paging is a commonly used method for database recovery in a DBMS. It requires fewer disk accesses than log-based methods.
➢ Here the database is partitioned into a number of fixed-length blocks known as pages, and two page tables are maintained during the life cycle of a transaction.

10
Shadow Paging Recovery Method
• Each page-table entry contains a pointer to a certain block on the disk. The key idea is to maintain two page tables during the transaction: 1) the current page table and 2) the shadow page table.
• When the transaction starts, both page tables are identical. During the transaction, all changes are made through the current page table, while the shadow page table remains as it was before. The shadow page table therefore still describes the database state from before the transaction and can be used to restore it if the transaction fails.
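The idea of two page tables can be sketched as follows. This is a simplified in-memory illustration, not a full implementation: the class and method names are hypothetical, and the copy-on-write behaviour (each write goes to a fresh disk block) is an assumption based on the usual textbook description of shadow paging.

```python
class ShadowPagedDB:
    def __init__(self, pages):
        # disk: block number -> block contents
        self.disk = dict(enumerate(pages))
        self.next_block = len(pages)
        # Shadow page table: page number -> block number (never modified
        # while a transaction is running).
        self.shadow = list(range(len(pages)))
        self.current = None

    def begin(self):
        # At transaction start both page tables are identical.
        self.current = list(self.shadow)

    def read(self, page):
        table = self.current if self.current is not None else self.shadow
        return self.disk[table[page]]

    def write(self, page, value):
        # Write the new contents to a fresh block and update only the
        # current page table; the old block stays reachable via the shadow.
        block = self.next_block
        self.next_block += 1
        self.disk[block] = value
        self.current[page] = block

    def commit(self):
        # The current page table becomes the new shadow page table.
        self.shadow = self.current
        self.current = None

    def abort(self):
        # Discard the current page table; the shadow table is untouched.
        self.current = None


db = ShadowPagedDB(["a", "b"])
db.begin()
db.write(0, "A2")
seen_in_txn = db.read(0)      # the transaction sees its own change
db.abort()
after_abort = db.read(0)      # old value is back without any undo work
db.begin()
db.write(1, "B2")
db.commit()
after_commit = db.read(1)
```

In this sketch, recovery after an abort needs no undo or redo pass: dropping the current page table is enough.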

11
Checkpoints Recovery Methods in DBMS
➢ Checkpointing is another recovery technique used in database recovery management. In this technique, a checkpoint operation is performed periodically that copies log information from volatile storage onto stable storage. Each checkpoint consists of the following information and operations:
➢ A start-of-checkpoint record, with the date and time of the checkpoint, is written to the log on a stable storage device.
➢ All log data in the main-memory buffers is copied to the log on stable storage.
➢ The database is updated: modified buffers in volatile storage are written to the physical database.
➢ An end-of-checkpoint record is written, and the address of the checkpoint record is saved in a file accessible to the recovery routine on start-up after a system crash.
➢ The frequency of checkpointing is a design consideration of the recovery system. The options are:
– Fixed time interval
– Transaction-consistent checkpoint
– Action-consistent checkpoint
– Transaction-oriented checkpoint

12
Difference between Deferred update and immediate update

• Deferred update – This technique does not physically update the database on disk until a
transaction has reached its commit point. Before reaching commit, all transaction updates
are recorded in the local transaction workspace. If a transaction fails before reaching its
commit point, it will not have changed the database in any way so UNDO is not needed. It
may be necessary to REDO the effect of the operations that are recorded in the local
transaction workspace, because their effect may not yet have been written in the database.
Hence, deferred update is also known as the no-undo/redo algorithm.
• Immediate update – In the immediate update, the database may be updated by some
operations of a transaction before the transaction reaches its commit point. However, these
operations are recorded in a log on disk before they are applied to the database, making
recovery still possible. If a transaction fails before reaching its commit point, the effects of its
operations must be undone, i.e. the transaction must be rolled back, so both undo and redo are
required. This technique is known as the undo/redo algorithm.

13
Database Backup and Recovery from Catastrophic Failures

➢ Some of the backup techniques are as follows:

• Full database backup – The full database, including the data and the meta information needed to restore the whole database (including full-text catalogs), is backed up at predefined intervals.
• Differential backup – Stores only the data changes that have occurred since the last full database backup. When the same data has changed many times since the last full database backup, a differential backup stores only the most recent version of the changed data. To restore from it, the full database backup must be restored first.
• Transaction log backup – Backs up all events that have occurred in the database, such as a record of every single statement executed. It is a backup of the transaction log entries and contains all transactions that have been applied to the database. With it, the database can be recovered to a specific point in time. It is even possible to back up the transaction log when the data files are destroyed, so that not a single committed transaction is lost.
14
Read-Only and Read-Write Transactions
➢ If the database operations in a transaction only retrieve data and do not update the database, the transaction is called a read-only transaction; otherwise it is called a read-write transaction.
Granularity
➢ The size of a data item is called its granularity. A data item can be a field of some record in the database, or a larger unit such as a record or even a whole disk block.
Basic database access operations
read_item(X): Reads a database item named X into a program variable. To
simplify our notation, we assume that the program variable is also named X.
write_item(X): Writes the value of program variable X into the database item
named X

15
Executing read_item(X) and write_item(X)

Executing a read_item(X) command includes the following steps:

1. Find the address of the disk block that contains item X.
2. Copy that disk block into a buffer in main memory (if it is not already in a main-memory buffer).
3. Copy item X from the buffer to the program variable named X.
Executing a write_item(X) command includes the following steps:
1. Find the address of the disk block that contains item X.
2. Copy that disk block into a buffer in main memory (if it is not already in a main-memory buffer).
3. Copy item X from the program variable named X into its correct location in the buffer.
4. Store the updated block from the buffer back to disk (either immediately or at some later point in time).
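The steps above can be sketched in a few lines. This is a toy model under stated assumptions: the class name, the dictionaries standing in for the disk and the block directory, and the choice to write the block back immediately in step 4 are all hypothetical.

```python
class SimpleDB:
    def __init__(self, disk, item_location):
        self.disk = disk                    # block number -> {item: value}
        self.item_location = item_location  # item name -> block number
        self.buffers = {}                   # block number -> in-memory copy
        self.disk_reads = 0
        self.disk_writes = 0

    def _fetch(self, block_no):
        # Copy the disk block into a main-memory buffer if not already there.
        if block_no not in self.buffers:
            self.buffers[block_no] = dict(self.disk[block_no])
            self.disk_reads += 1
        return self.buffers[block_no]

    def read_item(self, name):
        block_no = self.item_location[name]   # step 1: find the block address
        buf = self._fetch(block_no)           # step 2: block -> buffer
        return buf[name]                      # step 3: buffer -> variable

    def write_item(self, name, value):
        block_no = self.item_location[name]   # step 1: find the block address
        buf = self._fetch(block_no)           # step 2: block -> buffer
        buf[name] = value                     # step 3: variable -> buffer
        self.disk[block_no] = dict(buf)       # step 4: buffer -> disk
        self.disk_writes += 1


disk = {0: {"X": 80, "Y": 20}}
db = SimpleDB(disk, {"X": 0, "Y": 0})

X = db.read_item("X")     # read_item(X)
X = X - 5
db.write_item("X", X)     # write_item(X)
Y = db.read_item("Y")     # served from the buffer, no second disk read
```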

16
Executing read_item(X) and write_item(X)

➢ A transaction includes read_item and write_item operations to access and update the database.
➢ The figure below shows examples of two very simple transactions.

17
Database Buffer

➢ A database buffer is a temporary storage area in memory used to hold a copy of a database block.
 The DBMS will maintain in the database cache a number of data buffers in
main memory.
 Each buffer typically holds the contents of one database disk block, which
contains some of the database items being processed.
 When these buffers are all occupied, and additional database disk blocks
must be copied into memory, some buffer replacement policy is used to
choose which of the current buffers is to be replaced.
 If the chosen buffer has been modified, it must be written back to disk before
it is reused.
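The behaviour described above can be sketched as a small buffer pool that writes a modified (dirty) buffer back to disk before reusing it. The class name, the use of LRU as the replacement policy, and the sample blocks are assumptions made for illustration.

```python
from collections import OrderedDict


class BufferPool:
    def __init__(self, disk, capacity):
        self.disk = disk                  # block number -> contents
        self.capacity = capacity          # number of buffers in the cache
        # block number -> [contents, dirty flag], least recently used first
        self.frames = OrderedDict()
        self.writes_to_disk = 0

    def _load(self, block_no):
        if block_no in self.frames:
            self.frames.move_to_end(block_no)      # mark as recently used
            return self.frames[block_no]
        if len(self.frames) >= self.capacity:
            # All buffers are occupied: pick a victim (LRU in this sketch).
            victim, (contents, dirty) = self.frames.popitem(last=False)
            if dirty:
                # A modified buffer must be written back before reuse.
                self.disk[victim] = contents
                self.writes_to_disk += 1
        frame = [self.disk[block_no], False]
        self.frames[block_no] = frame
        return frame

    def read(self, block_no):
        return self._load(block_no)[0]

    def write(self, block_no, contents):
        frame = self._load(block_no)
        frame[0] = contents
        frame[1] = True                            # buffer is now dirty


disk = {1: "a", 2: "b", 3: "c"}
pool = BufferPool(disk, capacity=2)
pool.write(1, "A")             # modifies the buffer only
pool.read(2)
on_disk_before = disk[1]       # disk still holds the old contents
pool.read(3)                   # evicts block 1, which is dirty
on_disk_after = disk[1]
```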

18
States of Transactions / Transaction Model
 A transaction in a database can be in one of the following states −

1) Active − In this state, the transaction is being executed. This is the initial state of
every transaction.
2) Partially Committed − When a transaction executes its final operation, it is said to
be in a partially committed state.

19
States of Transactions
3) Failed − A transaction is said to be in a failed state if any of the checks made by
the database recovery system fails. A failed transaction can no longer proceed
further.
4) Aborted − If any of the checks fails and the transaction has reached a failed state,
then the recovery manager rolls back all its write operations on the database to
bring the database back to its original state where it was prior to the execution of
the transaction. Transactions in this state are called aborted. The database
recovery module can select one of the two operations after a transaction aborts −
- Re-start the transaction
- Kill the transaction
5) Committed − If a transaction executes all its operations successfully, it is said to
be committed. All its effects are now permanently established on the database
system.

20
ACID Properties
 A transaction is a very small unit of a program and it may contain several low-level
tasks.
 A transaction in a database system must maintain Atomicity, Consistency,
Isolation, and Durability − commonly known as ACID properties − in order to
ensure accuracy, completeness, and data integrity.

1) Atomicity
• This property states that a transaction must be treated as an atomic unit, that is,
either all of its operations are executed or none.
• There must be no state in a database where a transaction is left partially
completed.
• States should be defined either before the execution of the transaction or after the
execution/abortion/failure of the transaction.

21
ACID Properties
1) Atomicity
Consider the following transaction T consisting of T1 and T2: Transfer of 100 from
account X to account Y.

If the transaction fails after completion of T1 but before completion of T2 (say,
after write(X) but before write(Y)), then the amount has been deducted from X but not
added to Y.
This results in an inconsistent database state.
Therefore, the transaction must be executed in its entirety in order to ensure the correctness
of the database state.
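The all-or-nothing behaviour can be sketched with a transfer function that restores the old balances if a failure occurs between the two writes. The function name, the `fail_after_debit` switch, and the starting balances of 500 and 200 are assumptions made for illustration.

```python
def transfer(accounts, src, dst, amount, fail_after_debit=False):
    """Move `amount` from src to dst atomically; return True on commit."""
    snapshot = dict(accounts)              # old values, kept for rollback
    try:
        accounts[src] -= amount            # T1: write(X)
        if fail_after_debit:
            raise RuntimeError("simulated failure between T1 and T2")
        accounts[dst] += amount            # T2: write(Y)
    except Exception:
        # Roll back: restore every account to its value before T started.
        accounts.clear()
        accounts.update(snapshot)
        return False
    return True


accounts = {"X": 500, "Y": 200}

# Failure after write(X): nothing of T remains visible.
failed = transfer(accounts, "X", "Y", 100, fail_after_debit=True)
after_failure = dict(accounts)

# Successful run: both writes take effect.
succeeded = transfer(accounts, "X", "Y", 100)
after_success = dict(accounts)
```

In both outcomes the sum of the two balances is unchanged, which is the check used for the consistency property.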

22
ACID Properties
2) Consistency
•The database must remain in a consistent state after any transaction.
•No transaction should have any adverse effect on the data residing in the database.
•If the database was in a consistent state before the execution of a transaction, it must
remain consistent after the execution of the transaction as well.
•Referring to the example above, the total amount before and after the transaction
must be the same.
Total before T occurs = 500 + 200 = 700.
Total after T occurs = 400 + 300 = 700.

Therefore, the database is consistent. Inconsistency occurs if T1 completes but T2 fails, because T is then left incomplete.

23
ACID Properties
3) Isolation
•In a database system where more than one transaction is executed
simultaneously and in parallel, the property of isolation states that each transaction
is carried out and executed as if it were the only transaction in the system.
•No transaction will affect the existence of any other transaction.

4) Durability
•The database should be durable enough to hold all its latest updates even if the
system fails or restarts.
•If a transaction updates a chunk of data in a database and commits, then the
database will hold the modified data.
•If a transaction commits but the system fails before the data could be written on to
the disk, then that data will be updated once the system springs back into action.

24
Isolation Example
Let X= 500, Y = 500.
Consider two transactions T and T''.

Suppose T has been executed up to Read(Y) and then T'' starts. As a result, the operations are interleaved, so T'' reads the correct value of X but an incorrect value of Y, and the sum computed by T'' is inconsistent.

This results in database inconsistency, due to the loss of some units of value. Hence, transactions must take place in isolation, and changes should be visible only after they have been made to the main memory.

25
Buffer Replacement Policies
The critical choice that the buffer manager must make is what block to throw
out of the buffer pool when a buffer is needed for a newly requested block.
The buffer-replacement strategies commonly used may be familiar to you
from other applications of scheduling policies, such as in operating systems.
A frame is chosen for replacement by a replacement policy:
• Least-recently-used (LRU)
• Most-recently-used (MRU)
• First-In-First-Out (FIFO)
• Clock / circular order
The policy can have a big impact on the number of I/Os, depending on the access pattern.
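To illustrate how the access pattern affects the number of I/Os, the sketch below counts buffer misses for LRU and MRU on a repeated sequential scan. The function name, the pool size of 3 frames, and the scan over 4 blocks are assumptions chosen for illustration.

```python
from collections import OrderedDict


def count_misses(accesses, capacity, policy):
    """Count block fetches from disk for policy "LRU" or "MRU"."""
    frames = OrderedDict()       # ordered from least to most recently used
    misses = 0
    for block in accesses:
        if block in frames:
            frames.move_to_end(block)        # hit: now most recently used
            continue
        misses += 1                          # miss: block must be read
        if len(frames) >= capacity:
            # LRU evicts the first entry, MRU evicts the last entry.
            frames.popitem(last=(policy == "MRU"))
        frames[block] = True
    return misses


# Repeated sequential scan of 4 blocks with only 3 buffer frames.
scan = [1, 2, 3, 4] * 3
lru_misses = count_misses(scan, 3, "LRU")
mru_misses = count_misses(scan, 3, "MRU")
print(lru_misses, mru_misses)   # 12 6
```

On this looping scan LRU always evicts the block that is needed next, so every access is a miss, while MRU keeps most of the loop in memory. On other access patterns the ranking can be the opposite.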

Note: Please go through Operating system notes.

29
Serializability
➢ When multiple transactions run concurrently, there is a possibility that the database is left in an inconsistent state.
➢ Serializability is a concept that helps us check which schedules of concurrent transactions are equivalent to a serial schedule.
➢ A serializable schedule always leaves the database in a consistent state.

➢ Schedule − A chronological execution sequence of transactions is called a schedule. A schedule can contain many transactions, each comprising a number of instructions/tasks.
➢ Serial Schedule − A schedule in which one transaction is executed first and, when it completes its cycle, the next transaction is executed. Transactions are ordered one after the other, i.e. executed in a serial manner, which gives this type of schedule its name.
39
Types of Serializability
➢ There are two types of serializability:
1. Conflict Serializability
2. View Serializability
1) Conflict Serializability
➢ Conflict serializability is one type of serializability. It is used to check whether a non-serial schedule is equivalent to a serial schedule with respect to its conflicting operations.
➢ A schedule is called conflict serializable if it can be converted into a serial schedule by swapping its non-conflicting operations.
Conflicting operations
Two operations are said to be in conflict if they satisfy all of the following three conditions:
1. The operations belong to different transactions.
2. The operations work on the same data item.
3. At least one of the operations is a write operation.
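The three conditions translate directly into a small predicate. This is a minimal sketch; the `Op` type and its field names are hypothetical.

```python
from typing import NamedTuple


class Op(NamedTuple):
    txn: str     # transaction that issues the operation, e.g. "T1"
    kind: str    # "R" for read, "W" for write
    item: str    # data item the operation works on, e.g. "A"


def in_conflict(a: Op, b: Op) -> bool:
    return (
        a.txn != b.txn                  # 1. different transactions
        and a.item == b.item            # 2. same data item
        and "W" in (a.kind, b.kind)     # 3. at least one write
    )


print(in_conflict(Op("T1", "W", "A"), Op("T2", "R", "A")))  # True
print(in_conflict(Op("T1", "R", "A"), Op("T2", "R", "A")))  # False
```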

40
Example of Conflict Serializability
Let's consider this schedule:

To convert this schedule into a serial schedule, we would have to swap the R(A) operation of transaction
T2 with the W(A) operation of transaction T1.

However, we cannot swap these two operations because they are conflicting operations, so
the given schedule is not conflict serializable.
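Instead of trying swaps by hand, conflict serializability is commonly tested with a precedence graph: draw an edge Ti → Tj for every pair of conflicting operations in which Ti's operation comes first, and check the graph for a cycle. This is a different technique from the swapping argument above, although it gives the same answer. The sketch below uses hypothetical function names and two hypothetical schedules, not the schedules from the slides.

```python
def precedence_edges(schedule):
    """schedule: list of (txn, kind, item) in execution order."""
    edges = set()
    for i, (ti, ki, xi) in enumerate(schedule):
        for tj, kj, xj in schedule[i + 1:]:
            # Conflict: different transactions, same item, at least one write.
            if ti != tj and xi == xj and "W" in (ki, kj):
                edges.add((ti, tj))
    return edges


def has_cycle(edges):
    graph = {}
    for u, v in edges:
        graph.setdefault(u, set()).add(v)
        graph.setdefault(v, set())
    state = {}                      # node -> "visiting" or "done"

    def visit(node):
        state[node] = "visiting"
        for nxt in graph[node]:
            if state.get(nxt) == "visiting":
                return True         # back edge: cycle found
            if nxt not in state and visit(nxt):
                return True
        state[node] = "done"
        return False

    return any(visit(n) for n in graph if n not in state)


def is_conflict_serializable(schedule):
    return not has_cycle(precedence_edges(schedule))


# Hypothetical schedule 1: T1 -> T2 and T2 -> T1, so there is a cycle.
s1 = [("T1", "R", "A"), ("T2", "R", "A"), ("T1", "W", "A"), ("T2", "W", "A")]
# Hypothetical schedule 2: every conflict orders T1 before T2.
s2 = [("T1", "R", "A"), ("T1", "W", "A"), ("T2", "R", "A"),
      ("T1", "R", "B"), ("T2", "W", "A"), ("T1", "W", "B")]
print(is_conflict_serializable(s1), is_conflict_serializable(s2))
```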

41
Example of Conflict Serializability
Let's take another example:

42
View Serializability
➢ View serializability is used to find out whether a given schedule is view serializable, i.e. view equivalent to some serial schedule.
➢ Two schedules S1 and S2 are said to be view equivalent if the conditions below are satisfied:

1) Initial Read
If a transaction T1 reads data item A from the initial database in S1, then T1 should also read A from the initial database in S2.

43
View Serializability
2) Updated Read
If Ti reads a value of A that was written by Tj in S1, then in S2 Ti should also read the value of A written by Tj.

3) Final Write
If a transaction T1 performs the final write on A in S1, then T1 should also perform the final write on A in S2.

44
Concurrency Control
➢ Concurrency control is the procedure in a DBMS for managing simultaneous operations without them conflicting with one another.
➢ Concurrent access is quite easy if all users are only reading data, since there is no way they can interfere with one another.
➢ Any practical database, however, has a mix of READ and WRITE operations, and hence concurrency is a challenge.

➢ Concurrency control is used to address such conflicts, which mostly occur in a multi-user system.
➢ It helps to make sure that database transactions are performed concurrently without violating the data integrity of the respective databases.
➢ Therefore, concurrency control is an essential element for the proper functioning of a system in which two or more database transactions that require access to the same data are executed simultaneously.
45
Why use Concurrency Control?
Reasons for using concurrency control methods in a DBMS:

•To apply isolation through mutual exclusion between conflicting transactions
•To resolve read-write and write-write conflicts
•To preserve database consistency by continually enforcing restrictions on execution
•The system needs to control the interaction among concurrent transactions. This control is
achieved using concurrency-control schemes.
•Concurrency control helps to ensure serializability

46
Concurrency Control Protocols
Different concurrency control protocols offer different trade-offs between the amount of
concurrency they allow and the amount of overhead they impose.

•Lock-Based Protocols
•Two-Phase Locking Protocol
•Timestamp-Based Protocols
•Validation-Based Protocols

47
Lock-Based Protocols
➢ A lock is a data variable associated with a data item. The lock signifies which operations can be performed on the data item.
➢ Locks help synchronize access to database items by concurrent transactions.
➢ All lock requests are made to the concurrency-control manager. A transaction proceeds only once its lock request is granted.

➢ Binary locks: A binary lock on a data item can be in one of two states, locked or unlocked.

➢ Shared/exclusive: This type of locking mechanism separates locks based on their use. If a lock is acquired on a data item to perform a write operation, it is called an exclusive lock.

48
Lock-Based Protocols
1. Shared Lock (S):
A shared lock is also called a read-only lock. With a shared lock, the data item
can be shared between transactions.
This is because a transaction holding a shared lock never has permission to update the data item.

For example, consider a case where two transactions are reading the account
balance of a person.
The database lets both of them read by placing a shared lock.
However, if another transaction wants to update that account's balance, the shared lock
prevents it until the reading is over.

49
Lock-Based Protocols
2. Exclusive Lock (X):
With an exclusive lock, a data item can be read as well as written.
The lock is exclusive and cannot be held by more than one transaction on the same data item at the same time. An X-lock is
requested using the lock-x instruction.
A transaction may unlock the data item after finishing the write operation.

For example, suppose a transaction needs to update the account balance of a person.
The update is allowed by placing an X lock on the balance.
Then, when a second transaction wants to read or write the balance, the exclusive lock
prevents the operation.
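The compatibility rules for shared and exclusive locks can be sketched as a very small lock manager. This is only an illustration: the class and method names are hypothetical, and a refused request simply returns False instead of putting the transaction into a wait queue.

```python
class LockManager:
    def __init__(self):
        # data item -> (mode, set of transactions holding the lock)
        self.locks = {}

    def acquire(self, txn, item, mode):
        """Request an "S" or "X" lock; return True if it is granted."""
        if item not in self.locks:
            self.locks[item] = (mode, {txn})
            return True
        held_mode, holders = self.locks[item]
        if holders == {txn}:
            # The only holder may re-request the lock or upgrade S to X.
            new_mode = "X" if "X" in (mode, held_mode) else "S"
            self.locks[item] = (new_mode, holders)
            return True
        if mode == "S" and held_mode == "S":
            holders.add(txn)        # shared locks are compatible
            return True
        return False                # any combination involving X conflicts

    def release(self, txn, item):
        _mode, holders = self.locks.get(item, (None, set()))
        holders.discard(txn)
        if not holders:
            self.locks.pop(item, None)


lm = LockManager()
r1 = lm.acquire("T1", "balance", "S")   # first reader
r2 = lm.acquire("T2", "balance", "S")   # second reader shares the lock
w3 = lm.acquire("T3", "balance", "X")   # writer must wait for the readers
lm.release("T1", "balance")
lm.release("T2", "balance")
w3_retry = lm.acquire("T3", "balance", "X")
r1_again = lm.acquire("T1", "balance", "S")   # blocked by the X lock
```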

50
Deadlock Handling
➢ Deadlock refers to a situation where two or more processes are each waiting for another to release a resource, or where more than two processes are waiting for resources in a circular chain.

➢ In a DBMS, a deadlock is a condition where two or more transactions are waiting indefinitely for one another to give up locks.

➢ Deadlock is one of the most feared complications in a DBMS, because no task ever finishes and every task involved stays in a waiting state forever.

51
Deadlock Example in DBMS
➢ For example: transaction T1 holds a lock on some rows in the Student table and needs to update some rows in the Grade table.
➢ Simultaneously, transaction T2 holds locks on some rows in the Grade table and needs to update the rows in the Student table held by transaction T1.

➢ Now the main problem arises: transaction T1 is waiting for T2 to release its lock and, similarly, transaction T2 is waiting for T1 to release its lock.
➢ All activity comes to a halt and remains at a standstill until the DBMS detects the deadlock and aborts one of the transactions.

52
Deadlock Avoidance
➢ When a database is stuck in a deadlock state, it is better to have avoided the deadlock than to abort or restart the database, which wastes time and resources.

➢ A deadlock avoidance mechanism is used to detect a deadlock situation in advance. A method like the "wait-for graph" is used for detecting the deadlock situation, but this method is suitable only for smaller databases. For larger databases, a deadlock prevention method can be used.

Deadlock Prevention
• A deadlock prevention method is suitable for a large database. If resources are allocated in such a way that a deadlock never occurs, then deadlock is prevented.
• The database management system analyzes the operations of a transaction to determine whether they can create a deadlock situation. If they can, the DBMS never allows that transaction to be executed.
53
Deadlock Detection
➢ In a database, when a transaction waits indefinitely to obtain a lock, the DBMS should detect whether the transaction is involved in a deadlock.
➢ The lock manager maintains a wait-for graph to detect deadlock cycles in the database.
Wait-for Graph
• This is a suitable method for deadlock detection. In this method, a graph is created based on the transactions and their locks. If the graph has a cycle (closed loop), then there is a deadlock.
• The system maintains the wait-for graph for every transaction that is waiting for some data held by others, and keeps checking the graph for cycles.
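Cycle detection on a wait-for graph can be sketched with a depth-first search. The function name and the dictionary representation (transaction → set of transactions it is waiting for) are assumptions made for illustration.

```python
def find_deadlock(wait_for):
    """Return a list of transactions forming a cycle, or None.

    wait_for: dict mapping each transaction to the set of transactions
              whose locks it is waiting for.
    """
    visited = set()

    def dfs(node, path):
        if node in path:
            return path[path.index(node):]     # closed loop found
        if node in visited:
            return None                        # already explored, no cycle
        visited.add(node)
        for nxt in wait_for.get(node, ()):
            cycle = dfs(nxt, path + [node])
            if cycle:
                return cycle
        return None

    for txn in wait_for:
        cycle = dfs(txn, [])
        if cycle:
            return cycle
    return None


# T1 waits for T2 and T2 waits for T1: deadlock.
deadlocked = find_deadlock({"T1": {"T2"}, "T2": {"T1"}})
# A chain T1 -> T2 -> T3 without a closed loop: no deadlock.
chain = find_deadlock({"T1": {"T2"}, "T2": {"T3"}, "T3": set()})
print(deadlocked, chain)
```

Once a cycle is reported, the DBMS can abort one of the transactions in it to break the deadlock.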

54
Timestamp-based Protocols
➢ The timestamp-based algorithm uses timestamps to serialize the execution of concurrent transactions. The protocol ensures that all conflicting read and write operations are executed in timestamp order.
➢ The protocol uses the system time or a logical counter as the timestamp.

➢ The older transaction is always given priority in this method. It uses the system time to determine the timestamp of the transaction. This is one of the most commonly used concurrency protocols.
➢ Lock-based protocols manage the order between conflicting transactions at the time they execute, whereas timestamp-based protocols manage conflicts as soon as an operation is created.
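The following is a minimal sketch of the basic timestamp-ordering checks as they are usually presented in textbooks. It is not taken from the slides: the class name, the per-item read and write timestamps, and the convention that a rejected operation returns False (meaning the transaction would be rolled back) are all assumptions.

```python
class TimestampOrdering:
    def __init__(self):
        self.read_ts = {}    # item -> largest timestamp that has read it
        self.write_ts = {}   # item -> largest timestamp that has written it

    def read(self, ts, item):
        # Reject the read if a younger transaction already wrote the item.
        if ts < self.write_ts.get(item, 0):
            return False
        self.read_ts[item] = max(self.read_ts.get(item, 0), ts)
        return True

    def write(self, ts, item):
        # Reject the write if a younger transaction already read or
        # wrote the item.
        if ts < self.read_ts.get(item, 0) or ts < self.write_ts.get(item, 0):
            return False
        self.write_ts[item] = ts
        return True


to = TimestampOrdering()
ok_read = to.read(2, "A")         # transaction with timestamp 2 reads A
late_write = to.write(1, "A")     # timestamp 1 arrives too late: rejected
ok_write = to.write(2, "A")       # timestamp 2 may write A
late_read = to.read(1, "A")       # timestamp 1 would read a newer value
newer_write = to.write(3, "A")    # a younger transaction may overwrite
```

Because every conflicting pair of operations is forced into timestamp order, the resulting schedule is equivalent to the serial schedule that runs the transactions in order of their timestamps.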

55
