Chapter 2


Chapter 2 - Transaction Concepts and concurrency control

2.1 Describe a transaction, properties of transaction, state of the transaction.


2.2 Executing transactions concurrently, associated problems in concurrent execution.
2.3 Schedules, types of schedules, concept of Serializability, Precedence graph for
Serializability.
2.4 Ensuring Serializability by locks, different lock modes, 2PL and its variations.
2.5 Basic timestamp method for concurrency, Thomas Write Rule.
2.6 Locks with multiple granularity, dynamic database concurrency (Phantom Problem).
2.7 Timestamps versus locking.
2.8 Deadlock and deadlock handling - Deadlock Prevention (wait-die, wound-wait), Deadlock
Detection and Recovery (Wait-for graph)

2.1 Transaction
• A transaction is a unit of program execution that accesses and possibly updates various
data items.
• A transaction must see a consistent database.
• During transaction execution the database may be inconsistent.
• When the transaction is committed, the database must be consistent.
• Two main issues to deal with:
- Failures of various kinds, such as hardware failures and system crashes
- Concurrent execution of multiple transactions
Example : To transfer some money from one bank account to another
We want to transfer Rs 500 from A’s account to B’s account
Read(A); //Read the balance of A
A := A-500; // Withdraw Rs.500 from A account
Write(A); //Update the balance of A account by storing the current amount
Read(B); //Read the balance of B
B := B+500; // Add Rs. 500 into B
Write(B); //Store updated balance of Account B
Commit;
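The transfer above can be sketched with Python's built-in sqlite3 module. This is only an illustration: the table name `account`, the helper names `transfer` and `balances`, and the rule that a balance may not go negative are assumptions made for the example, not part of the notes.

```python
import sqlite3

def transfer(conn, src, dst, amount):
    """Move `amount` from src to dst as one transaction (all or nothing)."""
    try:
        cur = conn.cursor()
        cur.execute("UPDATE account SET balance = balance - ? WHERE name = ?",
                    (amount, src))
        # Hypothetical business rule: a balance may not become negative.
        (bal,) = cur.execute("SELECT balance FROM account WHERE name = ?",
                             (src,)).fetchone()
        if bal < 0:
            raise ValueError("insufficient funds")
        cur.execute("UPDATE account SET balance = balance + ? WHERE name = ?",
                    (amount, dst))
        conn.commit()      # both updates become permanent together
        return True
    except ValueError:
        conn.rollback()    # undo the partial update
        return False

def balances(conn):
    return dict(conn.execute("SELECT name, balance FROM account"))

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE account (name TEXT PRIMARY KEY, balance INTEGER)")
conn.executemany("INSERT INTO account VALUES (?, ?)", [("A", 1000), ("B", 2000)])
conn.commit()

ok = transfer(conn, "A", "B", 500)       # commits
failed = transfer(conn, "A", "B", 5000)  # rolled back, balances unchanged
```

After the second call the debit from A has been undone, so the database shows either the whole transfer or none of it.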
Properties of Transaction: ACID
To preserve integrity of data, the database system must ensure ACID Properties:
Atomicity : A transaction must be treated as an atomic unit, that is, either all of its operations are executed or none are. A transaction must never be left partially completed.
Consistency : If the database was in a consistent state before the execution of a transaction, it must be in a consistent state after the transaction completes. No transaction should have an adverse effect on the data residing in the database. (E.g. if money is transferred from A to B, the amount debited from A must also be credited to B; a failure between the two updates would otherwise leave the database in an inconsistent state.)
Isolation : When more than one transaction is executed simultaneously, each transaction must be carried out as if it were the only transaction in the system. No transaction should affect the execution of any other transaction.
Durability : Once a transaction commits, its updates must survive system failures and restarts. If a transaction commits but the system fails before the modified data has been written to disk, the updates are reapplied when the system recovers.

In Short:
Atomicity: Either the whole transfer from A's account to B's account happens, or none of it does.
Consistency: The database is not left in an inconsistent state, even in case of a failure.
Isolation: When several transactions run together, each executes as if it were running alone.
Durability: The system retains the modified data after the transaction commits.
State of the Transaction:

Active - The initial state, the transaction stays in this state while it is executing.
Partially committed – The state after the transaction has executed its last statement.
Failed – If the system determines that normal execution of the transaction can no longer proceed, the transaction is termed failed. A transaction can enter the failed state from either the active state or the partially committed state.
Committed – When the transaction completes its execution successfully, it enters the committed state from the partially committed state.
Aborted – To ensure atomicity, the changes made by a failed transaction are undone, i.e. the transaction is rolled back. After the rollback, the transaction enters the aborted state.
When the transaction is in the aborted state, the system has two options:
1. If the transaction was aborted because of some hardware or software error, it can be restarted. The restarted transaction is considered to be a new transaction.
2. If the transaction was aborted because of an internal logical error, bad input, or because the desired data were not found in the database, the system can kill the transaction.
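The state transitions described above can be written down as a small table. The following is a minimal sketch; the names `TRANSITIONS` and `is_valid_history` are made up for the illustration.

```python
# Allowed transitions between the five transaction states described above.
TRANSITIONS = {
    "active": {"partially committed", "failed"},
    "partially committed": {"committed", "failed"},
    "failed": {"aborted"},
    "committed": set(),   # terminal state
    "aborted": set(),     # terminal state (a restart counts as a new transaction)
}

def is_valid_history(states):
    """Return True if `states` starts in 'active' and uses only allowed transitions."""
    if not states or states[0] != "active":
        return False
    return all(b in TRANSITIONS[a] for a, b in zip(states, states[1:]))
```

For example, a transaction cannot jump from active straight to committed; it must pass through the partially committed state first.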

2.2 Executing transactions concurrently, associated problems in concurrent execution


• In a multi-user system, multiple users can access and use the same database at the same time. This is known as concurrent execution.
• The DBMS controls the execution of two or more transactions in parallel. However, it allows only one operation of any transaction to occur at any given time within the system.
• The DBMS interleaves the actions of different transactions to improve performance, in terms of increased throughput or improved response times for short transactions.
• When a short transaction runs alongside a long one, serial execution may leave the short transaction stuck behind the long one, whereas interleaved execution allows the short transaction to complete quickly.
• This concurrent execution may lead to problems or conflicts.
• When two or more transactions execute concurrently in a non-serial manner, some of their operations may conflict.
• Two operations are said to be conflicting if they satisfy all of the following conditions:
1. The operations belong to different transactions.
2. At least one of the operations is a write operation.
3. The operations access the same object or item.
Example:
The following set of operations is conflicting:

T1    | T2    | T3
R(X)  |       |
      | W(X)  |
      |       | W(X)

The following sets of operations are not conflicting:

T1    | T2    | T3
R(X)  |       |
      | R(X)  |
      |       | R(X)

T1    | T2    | T3
R(X)  |       |
      | W(Y)  |
      |       | R(X)
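The three conditions for conflicting operations translate directly into a predicate. This is a small sketch; the `Op` tuple and the function names are invented for the example, and the assignment of one operation per transaction in the sample data is an assumption about the tables above.

```python
from collections import namedtuple

# An operation: which transaction issued it, 'R' or 'W', and which data item.
Op = namedtuple("Op", "txn kind item")

def conflicts(a, b):
    """True if a and b satisfy all three conditions for a conflict."""
    return (a.txn != b.txn                 # different transactions
            and a.item == b.item           # same data item
            and "W" in (a.kind, b.kind))   # at least one write

def conflicting_pairs(ops):
    """All conflicting pairs in a list of operations, in schedule order."""
    return [(a, b) for i, a in enumerate(ops) for b in ops[i + 1:] if conflicts(a, b)]

read_write_write = [Op("T1", "R", "X"), Op("T2", "W", "X"), Op("T3", "W", "X")]
only_reads = [Op("T1", "R", "X"), Op("T2", "R", "X"), Op("T3", "R", "X")]
different_items = [Op("T1", "R", "X"), Op("T2", "W", "Y"), Op("T3", "R", "X")]
```

Only the first list contains conflicts: every pair in it touches X and involves at least one write.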

2.3 Schedule:
• Schedule represents the chronological order in which instructions are executed in the
system.
• A schedule can have many transactions in it, each transaction comprising of a number of
instructions.
• A schedule can be defined as, a sequence of operations by a set of concurrent transactions
that preserves the order of the operations in each of the individual transactions.
(Figure: a schedule is made up of transactions, and each transaction is made up of instructions.)
Consider two transactions T1 and T2 executing concurrently:

T1        | T2
Read(X)   | Read(Y)
X=X+5     | Y=Y+100
Write(X)  | Read(X)
Commit;   | X=X+Y
          | Write(X)
          | Commit;

Then one possible schedule S of {T1, T2} is:

T1        | T2
Read(X)   |
X=X+5     |
          | Read(Y)
          | Y=Y+100
Write(X)  |
Commit;   |
          | Read(X)
          | X=X+Y
          | Write(X)
          | Commit;

Complete Schedule: A schedule that contains either a commit or an abort action for each
transaction.
Types of Schedule:
Serial Schedule:
• Serial schedule consists of a sequence of instructions from various transactions where the
instructions belonging to one transaction appear together in that schedule.
• In other words, If the operations of different transactions are not interleaved ie
transactions are executed one-by-one from start to finish, the schedule is called as a Serial
schedule.
• E.g. – Consider a banking system with several accounts and a set of transactions that access and update those accounts. Let T1 and T2 be two transactions. Assume the initial balances of A and B are 1000 and 2000 respectively.

T1: Transfers Rs.50 from Account A to B
T1: Read(A);
A=A-50;
Write(A);
Read(B);
B=B+50;
Write(B);
T2: Transfers 10% of the balance from Account A to B
T2: Read(A);
temp=A*(0.1);
A=A-temp;
Write(A);
Read(B);
B=B+temp;
Write(B);
Two serial schedules of T1 and T2 are possible.

i) Schedule 1
S<T1,T2> is the serial schedule in which transaction T1 executes first, followed by T2.

T1          | T2
Read(A);    |
A=A-50;     |
Write(A);   |
Read(B);    |
B=B+50;     |
Write(B);   |
            | Read(A);
            | temp=A*(0.1);
            | A=A-temp;
            | Write(A);
            | Read(B);
            | B=B+temp;
            | Write(B);

After execution of T1 A is 950 and B is 2050


After execution of T2 A is 855 and B is 2145
A+B after execution of S<T1,T2> 3000

ii) Schedule 2
S<T2,T1> is the serial schedule in which transaction T2 executes first, followed by T1.

T1          | T2
            | Read(A);
            | temp=A*(0.1);
            | A=A-temp;
            | Write(A);
            | Read(B);
            | B=B+temp;
            | Write(B);
Read(A);    |
A=A-50;     |
Write(A);   |
Read(B);    |
B=B+50;     |
Write(B);   |

After execution of T2 A is 900 and B is 2100


After execution of T1 A is 850 and B is 2150
A+B after execution of S<T2,T1> 3000
After Schedule 1 the balances are A = Rs 855 and B = Rs 2145; after Schedule 2 they are A = Rs 850 and B = Rs 2150. The individual balances differ, but in both cases A+B remains 3000, so consistency is preserved.
Thus, for a set of n transactions there exist n! different valid serial schedules.
Concurrent Schedule(Non-Serial Schedule):
• When the instructions of several transactions are interleaved, the resulting schedule is called a concurrent (non-serial) schedule.
• Several execution sequences are possible, since the instructions of the transactions may be interleaved with each other in many ways.
• It is not possible to predict exactly how many instructions of a transaction will be executed before the CPU switches to another transaction.
• The number of possible concurrent schedules for a set of n transactions is much larger than n!.
i) Consistent concurrent schedule for T1 and T2:
The following concurrent schedule preserves the consistency of the database: A+B remains constant.
Schedule 3:

T1          | T2
Read(A);    |
A=A-50;     |
Write(A);   |
            | Read(A);
            | temp=A*0.1;
            | A=A-temp;
            | Write(A);
Read(B);    |
B=B+50;     |
Write(B);   |
            | Read(B);
            | B=B+temp;
            | Write(B);

ii) Inconsistent concurrent schedule for T1 and T2:


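The following concurrent schedule does not preserve the sum A+B. T1 overwrites the value of A written by T2, and T2 overwrites the value of B written by T1, so the schedule ends with A = 950 and B = 2100, i.e. A+B = 3050 instead of 3000.
Schedule 4:

T1          | T2
Read(A);    |
A=A-50;     |
            | Read(A);
            | temp=A*0.1;
            | A=A-temp;
            | Write(A);
            | Read(B);
Write(A);   |
Read(B);    |
B=B+50;     |
Write(B);   |
            | B=B+temp;
            | Write(B);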
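The effect of Schedules 3 and 4 can be checked by replaying them step by step. The sketch below is a toy interpreter written for this purpose, not a real DBMS component: each transaction gets private local variables, `read` copies a value from the database into them and `write` copies it back. The helper names are invented, and integer division is used for the 10% so that the arithmetic stays exact.

```python
def run(schedule, db):
    """Execute (txn, action, arg) steps; each transaction has private local variables."""
    db = dict(db)
    local = {}
    for txn, action, arg in schedule:
        loc = local.setdefault(txn, {})
        if action == "read":
            loc[arg] = db[arg]
        elif action == "write":
            db[arg] = loc[arg]
        else:              # "calc": arg is a function that updates the local variables
            arg(loc)
    return db

def sub50(l): l["A"] -= 50
def add50(l): l["B"] += 50
def take10(l):
    l["temp"] = l["A"] // 10   # 10% of A, integer amounts assumed
    l["A"] -= l["temp"]
def addtemp(l): l["B"] += l["temp"]

T1, T2 = "T1", "T2"
schedule3 = [
    (T1, "read", "A"), (T1, "calc", sub50), (T1, "write", "A"),
    (T2, "read", "A"), (T2, "calc", take10), (T2, "write", "A"),
    (T1, "read", "B"), (T1, "calc", add50), (T1, "write", "B"),
    (T2, "read", "B"), (T2, "calc", addtemp), (T2, "write", "B"),
]
schedule4 = [
    (T1, "read", "A"), (T1, "calc", sub50),
    (T2, "read", "A"), (T2, "calc", take10), (T2, "write", "A"), (T2, "read", "B"),
    (T1, "write", "A"), (T1, "read", "B"), (T1, "calc", add50), (T1, "write", "B"),
    (T2, "calc", addtemp), (T2, "write", "B"),
]
initial = {"A": 1000, "B": 2000}
result3 = run(schedule3, initial)   # same balances as the serial schedule T1, T2
result4 = run(schedule4, initial)   # A+B is no longer 3000
```

Schedule 3 ends with the same balances as Schedule 1, while Schedule 4 loses one update on each account.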

Serializable Schedule:
• A serializable schedule always leaves the database in a consistent state.
• A concurrent schedule results in a consistent state if its effect is equivalent to that of some serial schedule of the same transactions. Such a schedule is called a serializable schedule.
• Types of serializability:
1. Conflict serializable schedule
2. View serializable schedule

1. Conflict serializable schedule:


Consider two transactions T1 and T2 and a schedule S for them.
Let Ii and Ij be two consecutive instructions in S, belonging to transactions Ti and Tj respectively.
If Ii and Ij refer to different data items, then Ii and Ij can be executed in either order.
But if Ii and Ij refer to the same data item, then the order of the two instructions may matter.
Instructions Ii and Ij conflict if and only if there exists some item A accessed by both Ii and Ij, and at least one of these instructions writes A.
1. Ii = Read(A), Ij = Read(A). Ii and Ij do not conflict.
2. Ii = Read(A), Ij = Write(A). They conflict. (If Ii comes first, Ti reads the original value of A; if Ij comes first, Ti reads the value written by Tj.)
3. Ii = Write(A), Ij = Read(A). They conflict. (For the same reason as in case 2, the order determines which value of A is read. Note that Tj then reads a value that may still be undone if Ti rolls back.)
4. Ii = Write(A), Ij = Write(A). They conflict. (The order does not affect Ti or Tj directly, but it determines which value remains in the database and is seen by the next Read(A).)
Schedule 5:

T1        | T2
Read(A)   |
Write(A)  |
          | Read(A)
          | Write(A)
Read(B)   |
Write(B)  |
          | Read(B)
          | Write(B)

Here Write(A) of T1 conflicts with Read(A) of T2, and Write(B) of T1 conflicts with Read(B) of T2.
But Write(A) of T2 does not conflict with Read(B) of T1, because they access different data items.
If Ii and Ij are two consecutive instructions of schedule S and they do not conflict, then we can swap the order of Ii and Ij to produce a new schedule S’. S and S’ are the same except for the order of Ii and Ij, whose order does not matter. Swapping Write(A) of T2 with Read(B) of T1 in Schedule 5 gives Schedule 6.
Schedule 6:

T1        | T2
Read(A)   |
Write(A)  |
          | Read(A)
Read(B)   |
          | Write(A)
Write(B)  |
          | Read(B)
          | Write(B)

Let us swap more instructions of S’:


i) Read(B) of T1 and Read(A) of T2
ii) Write (B) of T1 and Write(A) of T2
iii) Write(B) of T1 and Read(A) of T2
Schedule 7:

T1        | T2
Read(A)   |
Write(A)  |
Read(B)   |
Write(B)  |
          | Read(A)
          | Write(A)
          | Read(B)
          | Write(B)

This is the serial schedule <T1, T2>. Thus the concurrent schedule S has been transformed into a serial schedule S’ by a series of swaps of non-conflicting instructions, so schedules S and S’ are conflict equivalent.
A schedule S is conflict serializable if it is conflict equivalent to a serial schedule.
The following schedule is not conflict serializable, since it is not conflict equivalent to either serial schedule <T1, T2> or <T2, T1>: no sequence of swaps of non-conflicting instructions produces either of them.
Schedule 8:

T1        | T2
Read(A)   |
          | Write(A)
Write(A)  |

1. This is a non-serial schedule.
2. Are there any conflicting instructions? – Yes.
3. Is it conflict serializable? – No. Every pair of adjacent instructions conflicts, so no swap is possible.
Schedule 9:

T1        | T5
Read(A)   |
A=A-50    |
Write(A)  |
          | Read(B)
          | B=B-10
          | Write(B)
Read(B)   |
B=B+50    |
Write(B)  |
          | Read(A)
          | A=A+10
          | Write(A)

The result of the above schedule is the same as that of the serial schedule <T1, T5>, but the schedule is not conflict serializable: Write(B) of T5 conflicts with Read(B) of T1, so we cannot move all instructions of T1 before those of T5 by swapping consecutive non-conflicting instructions.

2. View Serializability:
Let S and S´ be two schedules with the same set of transactions. S and S´ are view equivalent if
the following three conditions are met, for each data item A,
1. If in schedule S, transaction Ti reads the initial value of A, then in schedule S’ also
transaction Ti must read the initial value of A.
2. If transaction Ti executes Read(A) in schedule S, and if that value was produced
by a Write(A) operation executed by transaction Tj , then the Read(A) operation
of transaction Ti must, in schedule S´ , also read the value of A that was produced
by the same Write(A) operation of transaction Tj .
3. The transaction (if any) that performs the final Write(A) operation in schedule S
must also perform the final Write(A) operation in schedule S’.
As can be seen, view equivalence is also based purely on reads and writes.
Schedule 1 is not view equivalent to Schedule 2, since in Schedule 1 the value of account A read by transaction T2 was produced by T1, whereas this is not the case in Schedule 2. Schedule 1 is view equivalent to Schedule 3, because the values of accounts A and B read by transaction T2 were produced by T1 in both schedules.
Schedule 10:

T3        | T4        | T6
Read(A)   |           |
          | Write(A)  |
Write(A)  |           |
          |           | Write(A)

This schedule is view equivalent to the serial schedule <T3, T4, T6>.
Transactions T4 and T6 perform Write(A) operations without having performed a Read(A) operation. Writes of this form are called blind writes.
Every conflict serializable schedule is view serializable, but there are some view serializable schedules that are not conflict serializable.
Every view serializable schedule that is not conflict serializable contains blind writes. Schedule 10 is view serializable but not conflict serializable.
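The three view-equivalence conditions can be checked mechanically for small schedules. The sketch below is an illustration under a simplifying assumption: each transaction reads a given item at most once, so a schedule is summarized by who reads from whom (`None` stands for the initial value) and who performs the final write. The function names are invented, and the interleaving used for Schedule 10 is the one in which T4 writes between the read and the write of T3.

```python
def view_profile(schedule):
    """Return (reads_from, final_writer) for a list of (txn, 'R'|'W', item) steps."""
    last_writer = {}   # item -> transaction that wrote it most recently
    reads_from = []    # (reader, item, writer), writer None = initial value
    for txn, kind, item in schedule:
        if kind == "R":
            reads_from.append((txn, item, last_writer.get(item)))
        else:
            last_writer[item] = txn
    # After the loop, last_writer holds the final writer of every item.
    return sorted(reads_from, key=str), last_writer

def view_equivalent(s1, s2):
    """Conditions 1 and 2 are covered by reads_from, condition 3 by the final writers."""
    return view_profile(s1) == view_profile(s2)

schedule10 = [("T3", "R", "A"), ("T4", "W", "A"), ("T3", "W", "A"), ("T6", "W", "A")]
serial_t3_t4_t6 = [("T3", "R", "A"), ("T3", "W", "A"), ("T4", "W", "A"), ("T6", "W", "A")]
serial_t4_t3_t6 = [("T4", "W", "A"), ("T3", "R", "A"), ("T3", "W", "A"), ("T6", "W", "A")]
```

In both Schedule 10 and the serial schedule T3, T4, T6, transaction T3 reads the initial value of A and T6 performs the final write, so the two are view equivalent.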
Testing for serializability:
• A serializable schedule gives the same result as some serial schedule.
• A serial schedule always gives a correct result, so a serializable schedule is always correct as well.
Following are the methods for determining conflict and view serializability.
1. Testing of conflict serializability:
⮚ There is an algorithm to establish the conflict serializability of a given schedule for a set of transactions.
⮚ This algorithm uses a directed graph called a precedence graph, constructed from the given schedule.
Precedence Graph:
• A precedence graph is also called a conflict graph or serializability graph. It is used in database concurrency control.
• It is a directed graph G=(V,E), where V is a set of vertices and E is a set of edges.
• Set of vertices indicates set of transactions in schedule.
• There is an edge from Ti -> Tj if one of the following three condition holds.
⮚ Ti executes Write(A) before Tj executes Read(A)
⮚ Ti executes Read(A) before Tj executes Write(A)
⮚ Ti executes Write(A) before Tj executes Write(A)
A precedence graph is said to be acyclic if it contains no cycles; otherwise it is cyclic.
To check conflict serializability, construct the graph G for schedule S. If G has a cycle, schedule S is not conflict serializable; if G is acyclic, S is conflict serializable.
Algorithm:
Step1: Construct a precedence graph G for the given schedule S.
Step2: If G has a cycle, schedule S is not conflict serializable. Otherwise S is conflict serializable.
Topological sorting: A topological sort of a directed graph is a linear ordering of its vertices such that for every directed edge uv from vertex u to vertex v, u comes before v in the ordering.
If the graph is acyclic, then find a serial schedule using topological sorting:
i) Initialize the serial schedule as empty.
ii) Find a transaction Ti such that there are no arcs entering Ti. Ti is the next transaction in the serial schedule.
iii) Remove Ti and all edges leaving Ti. If the remaining set is non-empty, return to (ii); otherwise the serial schedule is complete.
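The precedence-graph test and the topological-sorting steps can be sketched as follows. Schedules are given as lists of (transaction, 'R' or 'W', item) in execution order; the function names are chosen for the example. The two sample schedules are the read/write steps of Schedule 3 and Schedule 4.

```python
def precedence_graph(schedule):
    """Return the set of edges (Ti, Tj) of the precedence graph."""
    edges = set()
    for i, (ti, ki, xi) in enumerate(schedule):
        for tj, kj, xj in schedule[i + 1:]:
            # Edge Ti -> Tj: same item, different transactions, at least one write.
            if ti != tj and xi == xj and "W" in (ki, kj):
                edges.add((ti, tj))
    return edges

def serial_order(schedule):
    """Return a conflict-equivalent serial order, or None if the graph has a cycle."""
    nodes = sorted({t for t, _, _ in schedule})
    edges = precedence_graph(schedule)
    order = []
    while nodes:
        # Step (ii): pick a transaction with no incoming arcs.
        free = [n for n in nodes if not any(v == n for _, v in edges)]
        if not free:
            return None   # every remaining node has an incoming arc: cycle
        n = free[0]
        order.append(n)
        # Step (iii): remove the node and all edges leaving it.
        nodes.remove(n)
        edges = {(u, v) for u, v in edges if u != n}
    return order

schedule3 = [("T1", "R", "A"), ("T1", "W", "A"), ("T2", "R", "A"), ("T2", "W", "A"),
             ("T1", "R", "B"), ("T1", "W", "B"), ("T2", "R", "B"), ("T2", "W", "B")]
schedule4 = [("T1", "R", "A"), ("T2", "R", "A"), ("T2", "W", "A"), ("T2", "R", "B"),
             ("T1", "W", "A"), ("T1", "R", "B"), ("T1", "W", "B"), ("T2", "W", "B")]
```

Schedule 3 yields the single edge T1 -> T2 and the serial order T1, T2; Schedule 4 yields edges in both directions, so no serial order exists.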
Schedule 11:

T11 T12 T13

Read(A)
Read(B)
A:=f1(A)
B:=f2(B) Read(C)
Write(B)
C:=f3(C)
Write(C)
Write(A)
Read(A)
A:=f4(A)
Read(C)
Write(A)
C:=f3(C)
Write(C)
B:=f6(B)
Write(B)

The precedence graph of this schedule contains a cycle. Hence the schedule is not conflict serializable.

Schedule 12:

T14 T15 T16

Read(A)
A:=f1(A)
Read(C)
Write(A)
A:=f2(C)
Read(B)
Write(C)
Read(A)
Read(C)
B:=f3(B)
Write(B)
C:=f4(C)
Read(B)
Write(C)
A:=f5(A)
Write(A)
B:=f6(B)
Write(B)

The precedence graph is acyclic. The conflict equivalent serial schedule for the given schedule can be obtained using the topological sorting procedure:
• T14 has no arcs entering it. Hence T14 is the first transaction in the serial schedule. Remove T14 and all edges leaving T14.
• T15 is the next transaction, since it now has no incoming edges. Remove T15 and all edges leaving T15.
• T16 is the last transaction. Hence the serial schedule that is conflict equivalent to the given schedule is <T14, T15, T16>.

Hence the schedule is conflict serializable.

Precedence graph for Schedule 3:

T1          | T2
Read(A);    |
A=A-50;     |
Write(A);   |
            | Read(A);
            | temp=A*0.1;
            | A=A-temp;
            | Write(A);
Read(B);    |
B=B+50;     |
Write(B);   |
            | Read(B);
            | B=B+temp;
            | Write(B);

The precedence graph has the single edge T1 -> T2, so it is acyclic, and the schedule is conflict equivalent to the serial schedule <T1, T2>. Hence the schedule is conflict serializable.

Precedence graph for Schedule 4:

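T1          | T2
Read(A);    |
A=A-50;     |
            | Read(A);
            | temp=A*0.1;
            | A=A-temp;
            | Write(A);
            | Read(B);
Write(A);   |
Read(B);    |
B=B+50;     |
Write(B);   |
            | B=B+temp;
            | Write(B);

The precedence graph contains the edges T1 -> T2 (Read(A) of T1 precedes Write(A) of T2) and T2 -> T1 (Write(A) of T2 precedes Write(A) of T1), so it is cyclic. Hence the schedule is not conflict serializable.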

(Figure: an example precedence graph and the serial schedule obtained from it by topological sorting.)


2. Testing of View Serializability:
• Testing for view serializability is complicated. It has been shown that the problem of testing for view serializability is NP-complete.
• Thus an efficient algorithm to test for view serializability is very unlikely to exist. Concurrency control schemes can still use sufficient conditions for view serializability.
• If the sufficient conditions are satisfied, the schedule is view serializable. But there may be view serializable schedules that do not satisfy the sufficient conditions.
Non-Serializable schedule –
Non-serializable schedules are classified by recoverability:
1. Recoverable schedule
2. Non recoverable schedule
1. Recoverable schedule-
• Once a transaction T has committed, it should never be necessary to roll back T. Schedules that meet this condition are called recoverable schedules; those that do not are called non recoverable.
• A recoverable schedule is one where, for each pair of transactions Ti and Tj such that Tj reads a data item previously written by Ti, the commit operation of Ti appears before the commit operation of Tj. Otherwise the schedule is non recoverable.
• Consider the example:

T8        | T9
Read(A)   |
Write(A)  |
          | Read(A)
          | Commit
Read(B)   |

Transaction T9 reads the data written by T8 and commits while T8 is still active, so the commit of T8 can only occur after the commit of T9. Hence this is a non recoverable schedule.
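The recoverability condition can be checked by recording which transactions each reader depends on. This is a minimal sketch that ignores aborts; schedules are lists of (transaction, action, item) with action 'R', 'W' or 'C' for commit, and the function name is invented for the example.

```python
def is_recoverable(schedule):
    """True if every transaction commits only after all transactions it read from."""
    last_writer = {}    # item -> transaction that wrote it most recently
    depends_on = {}     # reader -> set of transactions whose writes it has read
    committed = set()
    for txn, action, item in schedule:
        if action == "W":
            last_writer[item] = txn
        elif action == "R":
            writer = last_writer.get(item)
            if writer is not None and writer != txn:
                depends_on.setdefault(txn, set()).add(writer)
        elif action == "C":
            if any(w not in committed for w in depends_on.get(txn, ())):
                return False   # txn commits before a transaction it read from
            committed.add(txn)
    return True

# T9 reads A written by T8 and commits first: non recoverable.
t9_commits_first = [("T8", "R", "A"), ("T8", "W", "A"), ("T9", "R", "A"),
                    ("T9", "C", None), ("T8", "R", "B"), ("T8", "C", None)]
# Same operations, but T8 commits before T9: recoverable.
t8_commits_first = [("T8", "R", "A"), ("T8", "W", "A"), ("T9", "R", "A"),
                    ("T8", "R", "B"), ("T8", "C", None), ("T9", "C", None)]
```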
Types of recoverable schedule:
1. Cascadeless schedule
2. Strict schedule
3. Cascading rollback schedule

1. Cascadeless schedule-
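Even if a schedule is recoverable, recovering correctly from the failure of a transaction Ti may require rolling back several transactions. A cascadeless schedule avoids this: for each pair of transactions Ti and Tj such that Tj reads a data item previously written by Ti, the commit operation of Ti appears before the read operation of Tj.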
Example:
Cascadeless schedule

T10 T11 T12

Read(A)
Write(A)
Read(B)
Write(B)
Read(C)

Cascading rollback schedule

T10       | T11       | T12
Read(A)   |           |
Read(B)   |           |
Write(A)  |           |
          | Read(A)   |
          | Write(A)  |
          |           | Read(A)

Transaction T10 writes a value of A that is read by transaction T11. Transaction T11 writes a value of A that is read by T12. Suppose that at this point transaction T10 fails. T10 must be rolled back. Since T11 is dependent on T10, T11 must be rolled back. Since T12 is dependent on T11, T12 must be rolled back.
This phenomenon, in which a single transaction failure results in a series of transaction rollbacks, is called cascading rollback.
2. Strict schedule:
• If no transaction is allowed to read or write a data item until the last transaction that has written it has committed or aborted, the schedule is called a strict schedule.
• It allows only committed data to be read or overwritten.
3. Cascading Rollback or cascading abort schedule-
• If the failure of one transaction causes several other dependent transactions to roll back or abort, the schedule is called a cascading rollback or cascading abort schedule.
• It leads to wastage of CPU time.
• It occurs because of the dirty read problem, i.e. reading data written by a transaction that has not yet committed.
Non recoverable schedule:
• Consider a situation in which Tj reads a value written by Ti and commits, and then Ti fails before it commits. Since Tj has read the value written by Ti, we must abort Tj to ensure atomicity. But this is not possible, as Tj has already committed. Thus it is impossible to recover correctly from the failure of Ti. This is a non recoverable schedule.
• Non recoverable schedules are not allowed.
• A DBMS requires that all schedules be recoverable.

Lock based Protocol:


• If all schedules in a concurrent environment are restricted to serializable schedules, the result will be consistent with some serial execution of the transactions and will be considered correct.
• Serializability can be ensured if access to data items is done in a mutually exclusive manner, i.e. while one transaction is accessing a data item, no other transaction can modify that data item.
• The most common method to implement mutual exclusion is to use locks.
• Consider the database to be made up of data items. A lock is a variable associated with each data item. Manipulating the value of a lock is called locking.
• Following are the two modes of locks:
1. Shared: If a transaction Ti has obtained a shared-mode lock (lock-S) on item A, then Ti can read but cannot write A. A is then called a read-locked item.
2. Exclusive: If a transaction Ti has obtained an exclusive-mode lock (lock-X) on item A, then Ti can both read and write A. A is then called a write-locked item.
Lock requests are made to the concurrency-control manager by the programmer. A transaction can proceed only after its request is granted.
Lock-compatibility matrix:

     | S     | X
S    | true  | false
X    | false | false

• A transaction may be granted a lock on an item, if the requested lock is compatible with
locks already held on the item by other transactions.
• Any number of transactions can hold shared locks on an item, but if any transaction holds an exclusive lock on the item, no other transaction may hold any lock on the item.
• If a lock cannot be granted, the requesting transaction is made to wait till all incompatible
locks held by other transactions have been released. The lock is then granted.
• Example of a transaction performing locking:
T2: lock-S(A);
read (A);
unlock(A);
lock-S(B);
read (B);
unlock(B);
display(A+B)
Locking as above is not sufficient to guarantee serializability — if A and B get updated
in-between the read of A and B, the displayed sum would be wrong.
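The lock-compatibility rules can be sketched as a tiny lock table. This is an illustrative toy, not a real concurrency-control manager: `LockTable`, `request` and `release` are hypothetical names, and a refused request simply returns False instead of putting the transaction to sleep.

```python
# COMPATIBLE[(held, requested)] is True when the two modes can coexist.
COMPATIBLE = {
    ("S", "S"): True,
    ("S", "X"): False,
    ("X", "S"): False,
    ("X", "X"): False,
}

class LockTable:
    def __init__(self):
        self.held = {}   # item -> {transaction: mode}

    def request(self, txn, item, mode):
        """Grant the lock and return True, or return False if the caller must wait."""
        holders = self.held.setdefault(item, {})
        for other, other_mode in holders.items():
            if other != txn and not COMPATIBLE[(other_mode, mode)]:
                return False
        holders[txn] = mode
        return True

    def release(self, txn, item):
        self.held.get(item, {}).pop(txn, None)

locks = LockTable()
first_shared = locks.request("T1", "A", "S")     # granted
second_shared = locks.request("T2", "A", "S")    # granted: S is compatible with S
exclusive_early = locks.request("T3", "A", "X")  # refused: shared locks are held
locks.release("T1", "A")
locks.release("T2", "A")
exclusive_late = locks.request("T3", "A", "X")   # granted once A is free
shared_blocked = locks.request("T1", "A", "S")   # refused: X lock is held
```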
Two Phase locking (2PL) protocol:

• A locking protocol is a set of rules followed by all transactions while requesting and
releasing locks. Locking protocols restrict the set of possible schedules.
• The 2PL protocol requires that each transaction issue its lock and unlock requests in two phases:
1. Growing Phase - The transaction may obtain locks but may not release any lock.
2. Shrinking Phase - The transaction may release locks but may not obtain any new lock.
Initially the transaction is in the growing phase, in which it acquires locks as needed. Once the transaction releases a lock, it enters the shrinking phase and can issue no more lock requests.
The protocol ensures serializability. It can be proved that the transactions can be serialized in
the order of their lock points (i.e., the point where a transaction acquired its final lock).
Example of two phase transactions:
T3: Lock-X (B)
Read(B)
B=B-50
Write(B)
Lock-X (A)
Read(A)
A=A+50
Write(A)
Unlock(B)
Unlock(A)
T4: Lock-S (A)
Read(A)
Lock-S(B)
Read(B)
Display(A+B)
Unlock(A)
Unlock(B)
The unlock instructions do not need to appear at the end of the transaction.
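Whether a transaction is two phase can be checked from its sequence of lock and unlock requests alone. The following sketch assumes operations are written as strings in the style used above; `is_two_phase` is a name made up for the example.

```python
def is_two_phase(ops):
    """True if no lock request follows the first unlock request."""
    shrinking = False
    for op in ops:
        name = op.lower()
        if name.startswith("unlock"):
            shrinking = True          # the shrinking phase has begun
        elif name.startswith("lock") and shrinking:
            return False              # a lock after an unlock violates 2PL
    return True

t3 = ["Lock-X(B)", "Read(B)", "B=B-50", "Write(B)",
      "Lock-X(A)", "Read(A)", "A=A+50", "Write(A)", "Unlock(B)", "Unlock(A)"]
t4 = ["Lock-S(A)", "Read(A)", "Lock-S(B)", "Read(B)",
      "Display(A+B)", "Unlock(A)", "Unlock(B)"]
# The earlier display(A+B) example unlocks A before locking B.
early_unlock = ["lock-S(A)", "read(A)", "unlock(A)",
                "lock-S(B)", "read(B)", "unlock(B)", "display(A+B)"]
```

T3 and T4 pass the check, while the earlier transaction that unlocks A before locking B does not.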
2PL does not ensure freedom from deadlock. T3 and T4 are two phase, but they are deadlocked in the following schedule.

T3            | T4
Lock-X(B)     |
Read(B)       |
B=B-50        |
Write(B)      |
              | Lock-S(A)
              | Read(A)
              | Lock-S(B)
Lock-X(A)     |
Read(A)       |
A=A+50        |
Write(A)      |
Unlock(B)     |
Unlock(A)     |
              | Read(B)
              | Display(A+B)
              | Unlock(A)
              | Unlock(B)

T3 holds an X lock on B and T4 requests an S lock on B, while T4 holds an S lock on A and T3 requests an X lock on A. Each transaction waits for the other, so neither can proceed beyond its pending lock request.
Cascading rollback may also occur under 2PL.
Variations of 2PL:
1. Strict two phase locking protocol(Strict 2PL):
• Cascading rollbacks can be avoided by a modification of 2PL called Strict 2PL.
• It requires that all exclusive locks held by a transaction be held until the transaction commits.
• This requirement ensures that any data written by an uncommitted transaction are locked in exclusive mode until the transaction commits, preventing any other transaction from reading the data.
• Only shared locks may therefore be released before the end of the transaction; exclusive locks are released at commit.
2. Rigorous two phase locking protocol (Rigorous 2PL):
• It requires that all locks be held until the transaction commits.
• The transactions can be serialized in the order in which they commit.
• Most database systems implement strict or rigorous 2PL.
3. Conservative 2PL:
• It is also called static 2PL.
• A transaction must lock all the items it accesses before it begins execution, by pre-declaring its read set and write set.
• If any of the pre-declared items cannot be locked, the transaction does not lock any item. It waits until all items become available.
• Conservative 2PL is a deadlock-free protocol.
• It is not a flexible or dynamic protocol.
Summary of Two phase Locking Protocol
2 Modes of lock – Shared and Exclusive
Growing and Shrinking phase
Variations of 2PL
Timestamp-Based Protocols:
• Every transaction Ti in the system is assigned a fixed, unique value called its timestamp, denoted by TS(Ti). The timestamp is assigned by the database system before Ti starts execution.
• The value of the system clock (or of a logical counter) at the time the transaction enters the system can be used as the timestamp.
• If an old transaction Ti has timestamp TS(Ti) and a new transaction Tj is assigned timestamp TS(Tj), then TS(Ti) < TS(Tj).
• The protocol manages concurrent execution such that the timestamps determine the serializability order.
• To ensure this behavior, the protocol maintains two timestamp values for each data item A.
W-timestamp(A) – The largest timestamp of any transaction that executed write(A) successfully. E.g. if transaction T1 performs write(A), then the timestamp of T1 is assigned to W-timestamp(A).
R-timestamp(A) – The largest timestamp of any transaction that executed read(A) successfully.
e.g.
If T1 enters the system at 1:30, then TS(T1) is 1:30. When T1 performs Write(A), the timestamp of T1 is given to W-timestamp(A), so it becomes 1:30.
As another example, suppose T1 enters the system at 1:50pm, so TS(T1) is 1:50pm.
A new transaction T2 enters the system at 2:10pm, so TS(T2) is 2:10pm.
TS(T1)<TS(T2)
W-Timestamp(A)? – If T1 performs a write operation on A, then W-Timestamp(A) = TS(T1) = 1:50pm
R-Timestamp(A)? – If T1 performs a read operation on A, then R-Timestamp(A) = TS(T1) = 1:50pm
Now consider three transactions T1, T2 and T3:
TS(T1) = 1:10pm
TS(T2) = 1:32pm
TS(T3) = 1:45pm
TS(T1)<TS(T2)<TS(T3)
T1 performs Write(A), so W-Timestamp(A) = TS(T1) = 1:10pm
T2 then performs Write(A), so W-Timestamp(A) = TS(T2) = 1:32pm (the old value 1:10pm is overwritten by 1:32pm)
T3 then performs Read(A), so R-Timestamp(A) = TS(T3) = 1:45pm
W-Timestamp(A) | R-Timestamp(A)
1:32pm         | 1:45pm
• The timestamp ordering protocol ensures that any conflicting read and write operations
are executed in timestamp order.
• Suppose a transaction Ti issues a Read(A)
⮚ If TS(Ti) < W-timestamp(A), then Ti needs to read a value of A that was already
overwritten. Hence, the read operation is rejected, and Ti is rolled back.
⮚ If TS(Ti) ≥ W-timestamp(A), then the read operation is executed, and
R-timestamp(A) is set to max(R-timestamp(A), TS(Ti)).
• Suppose that transaction Ti issues Write(A)
⮚ If TS(Ti) < R-timestamp(A), then the value of A that Ti is producing was needed previously, and the system assumed that that value would never be produced. Hence, the write operation is rejected, and Ti is rolled back.
⮚ If TS(Ti) < W-timestamp(A), then Ti is attempting to write an obsolete value of A. Hence, this write operation is rejected, and Ti is rolled back.
⮚ Otherwise, the write operation is executed, and W-timestamp(A) is set to TS(Ti).
The timestamp protocol ensures freedom from deadlock, as no transaction ever waits.
But the schedule may not be cascade-free, and may not even be recoverable.
Timestamp-Based Protocol – Example:
Let us consider transactions T1, T2 and T3 with
TS(T1) = 1:10pm, TS(T2) = 1:32pm, TS(T3) = 1:45pm
Option 1:
W-timestamp(A) is 1:45pm, i.e. T3 has performed Write(A).
Now T1 wants to perform Read(A). Is this allowed under the timestamp-based protocol?
-> Check TS(Ti) < W-timestamp(A)?
TS(T1) < W-timestamp(A), i.e. 1:10pm < 1:45pm: Yes – so the read is rejected and T1 is rolled back.
Option 2:
W-timestamp(A) is 1:32pm, i.e. T2 has performed Write(A).
Now T3 wants to perform Read(A). Is this allowed under the timestamp-based protocol?
-> Check TS(Ti) < W-timestamp(A)?
TS(T3) is 1:45pm, which is not less than 1:32pm, so TS(T3) ≥ W-timestamp(A).
So T3 can perform Read(A), and R-timestamp(A) = 1:45pm
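The read and write rules of the timestamp-ordering protocol fit in a few lines. The sketch below is illustrative only: the class and function names are invented, timestamps are written as minutes past noon (so 1:10pm is 70, 1:32pm is 92 and 1:45pm is 105), and a rejected operation is signalled by raising an exception.

```python
class Rollback(Exception):
    """Raised when the protocol rejects an operation; the transaction must restart."""

class Item:
    def __init__(self):
        self.r_ts = 0   # R-timestamp: largest timestamp of any successful read
        self.w_ts = 0   # W-timestamp: largest timestamp of any successful write

def read(item, ts):
    if ts < item.w_ts:
        raise Rollback("value was already overwritten by a younger transaction")
    item.r_ts = max(item.r_ts, ts)

def write(item, ts):
    if ts < item.r_ts or ts < item.w_ts:
        raise Rollback("write arrives too late")
    item.w_ts = ts

TS_T1, TS_T2, TS_T3 = 70, 92, 105   # minutes past noon

# Option 1: T3 has written A, then T1 tries to read it.
a1 = Item()
write(a1, TS_T3)
try:
    read(a1, TS_T1)
    option1_rejected = False
except Rollback:
    option1_rejected = True

# Option 2: T2 has written A, then T3 reads it.
a2 = Item()
write(a2, TS_T2)
read(a2, TS_T3)
```

In Option 1 the read is rejected; in Option 2 it succeeds and the R-timestamp of A becomes the timestamp of T3.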
Thomas Write rule:
• It is a modified version of the timestamp-ordering protocol in which obsolete write operations may be ignored under certain circumstances.
• It states that if a more recent transaction has already written the value of an object, then a less recent transaction does not need to perform its own write, since that value would in any case be overwritten by the more recent one.
• When Ti attempts to write data item A, if TS(Ti) < W-timestamp(A), then Ti is attempting to write an obsolete value of A. Rather than rolling back Ti as the timestamp ordering protocol would have done, this write operation can be ignored.
• Otherwise this protocol is the same as the timestamp ordering protocol.
• Thomas' Write Rule allows greater potential concurrency. Outdated writes are ignored.
• It allows some view-serializable schedules that are not conflict-serializable.

T16 T17

Read(A)
Write(A)
Write(A)

• Apply the timestamp-ordering protocol to the given schedule. Since T16 starts before T17,
TS(T16) < TS(T17), so the Read(A) of T16 and the Write(A) of T17 both succeed.
• When T16 attempts its Write(A) operation, we observe that TS(T16) < W-timestamp(A),
since W-timestamp(A) = TS(T17). According to the timestamp-ordering protocol, this Write(A)
must be rejected and T16 must be rolled back.
• This rollback is unnecessary. T17 has already written A, and the value that T16 is attempting
to write will never be read: any transaction older than T17 that tries to read A is rolled back,
and any younger transaction reads the value written by T17. So the Write(A) of T16 can be ignored.
This modification of the timestamp-ordering protocol is called Thomas' Write Rule.
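The difference between the basic protocol and Thomas' Write Rule on the T16/T17 schedule can be sketched as follows. This is a minimal illustration, not a full scheduler: the timestamps 16 and 17 are assumed values chosen only so that TS(T16) < TS(T17), and `DataItem`, `read`, `write` and `run` are hypothetical names.

```python
from dataclasses import dataclass

@dataclass
class DataItem:
    r_ts: int = 0   # R-timestamp of the item
    w_ts: int = 0   # W-timestamp of the item

def read(ts, item):
    """Timestamp-ordering read rule."""
    if ts < item.w_ts:
        return "rollback"
    item.r_ts = max(item.r_ts, ts)
    return "ok"

def write(ts, item, thomas):
    """Timestamp-ordering write rule, optionally with Thomas' Write Rule."""
    if ts < item.r_ts:
        return "rollback"      # a younger transaction has already read the old value
    if ts < item.w_ts:
        # Obsolete write: the basic protocol rolls back, Thomas' rule skips the write.
        return "ignored" if thomas else "rollback"
    item.w_ts = ts
    return "ok"

def run(thomas):
    """Replay Read(A) by T16, Write(A) by T17, Write(A) by T16."""
    a = DataItem()
    return [read(16, a), write(17, a, thomas), write(16, a, thomas)]

basic_result = run(thomas=False)    # ['ok', 'ok', 'rollback']
thomas_result = run(thomas=True)    # ['ok', 'ok', 'ignored']
```

Only the last step differs: the basic protocol rolls T16 back, while Thomas' Write Rule lets T16 continue and drops its obsolete write.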

Locks with Multiple Granularity:


• Allow data items to be of various sizes and define a hierarchy of data granularities,
where the small granularities are nested within larger ones.
• It can be represented graphically as a tree.
• When a transaction locks a node in the tree explicitly, it implicitly locks all the node's
descendants in the same mode.
Example –
• If a transaction Ti needs to access the entire database, and a locking protocol is
used, then Ti must lock each item in the database. Clearly, executing these locks is time
consuming. It would be better if Ti could issue a single lock request to lock the entire database.
• If transaction Tj needs to access only a few data items, it should not be required to lock
the entire database, since otherwise concurrency is lost.
We therefore need a mechanism that allows the system to define multiple levels of granularity.

• The highest level represents the entire database (DB).
• Below it are nodes of type area (A); the database consists of exactly these areas.
• Each area in turn has nodes of type file (F) as its children.
• Each area contains exactly those files that are its child nodes.
• No file is in more than one area. Finally, each file has nodes of type record (r).
• As before, the file consists of exactly those records that are its child nodes, and no record
can be present in more than one file.
• Granularity can range from large to small data items (entire DB, a file, a page, a record or a
field).
⮚ Coarse granularity: refers to a large data item, e.g. an entire relation or the entire DB.
⮚ Fine granularity: refers to a small data item, e.g. a tuple or an attribute.

Modes of locks:
In addition to S and X lock modes, there are three additional lock modes with multiple
granularity:
1. Intention-shared (IS): indicates explicit locking at a lower level of the tree, but only with
shared locks.
2. Intention-exclusive (IX): indicates explicit locking at a lower level with exclusive or shared
locks.
3. Shared and intention-exclusive (SIX): the subtree rooted at that node is locked explicitly in
shared mode, and explicit locking is being done at a lower level with exclusive-mode locks.
Intention locks allow a higher-level node to be locked in S or X mode without having to check all
descendant nodes.

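The compatibility matrix for all lock modes is (Yes = the two modes can be held on the same
node by different transactions):

        IS    IX    S     SIX   X
IS      Yes   Yes   Yes   Yes   No
IX      Yes   Yes   No    No    No
S       Yes   No    Yes   No    No
SIX     Yes   No    No    No    No
X       No    No    No    No    No

The multiple-granularity locking protocol ensures serializability.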


Each transaction Ti can lock a node A by following these rules:
1. The lock compatibility matrix must be observed.
2. The root of the tree must be locked first, and may be locked in any mode.
3. A node A can be locked by Ti in S or IS mode only if the parent of A is currently locked
by Ti in either IX or IS mode.
4. A node A can be locked by Ti in X, SIX, or IX mode only if the parent of A is currently
locked by Ti in either IX or SIX mode.
5. Ti can lock a node only if it has not previously unlocked any node (that is, Ti is
two-phase).
6. Ti can unlock a node A only if none of the children of A are currently locked by Ti.
Locks are acquired in root-to-leaf order and released in leaf-to-root order.
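Rules 1 to 4 can be expressed as a small check. The sketch below is an illustration under stated assumptions: it uses the standard compatibility matrix for the five modes, it looks at a single request on a single node, and it does not model rules 5 and 6 (two-phase behaviour and unlock order). The names `COMPAT`, `PARENT_MODES` and `can_lock` are hypothetical.

```python
# COMPAT[held][requested] is True when both modes may be held on one node
# by two different transactions.
COMPAT = {
    "IS":  {"IS": True,  "IX": True,  "S": True,  "SIX": True,  "X": False},
    "IX":  {"IS": True,  "IX": True,  "S": False, "SIX": False, "X": False},
    "S":   {"IS": True,  "IX": False, "S": True,  "SIX": False, "X": False},
    "SIX": {"IS": True,  "IX": False, "S": False, "SIX": False, "X": False},
    "X":   {"IS": False, "IX": False, "S": False, "SIX": False, "X": False},
}

# Rules 3 and 4: modes the requesting transaction must hold on the parent node.
PARENT_MODES = {
    "S": {"IS", "IX"}, "IS": {"IS", "IX"},
    "X": {"IX", "SIX"}, "SIX": {"IX", "SIX"}, "IX": {"IX", "SIX"},
}

def can_lock(requested, held_by_others, own_parent_mode, is_root=False):
    """Check rules 1-4 for one lock request on one node."""
    if any(not COMPAT[held][requested] for held in held_by_others):
        return False                                    # rule 1: incompatible with another holder
    if is_root:
        return True                                     # rule 2: the root may take any mode
    return own_parent_mode in PARENT_MODES[requested]   # rules 3 and 4

read_ok = can_lock("S", ["IS"], own_parent_mode="IS")          # True
write_bad_parent = can_lock("X", [], own_parent_mode="IS")     # False: parent needs IX or SIX
write_conflict = can_lock("X", ["IS"], own_parent_mode="IX")   # False: X conflicts with IS
root_ok = can_lock("IX", ["IS", "IX"], None, is_root=True)     # True
```

Keeping the matrix as a nested dictionary makes the lookup read the same way as the table: the row is the mode already held, the column is the mode being requested.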

Example:
1. Suppose that transaction T18 reads record ra2 in file Fa. Then, T18 needs to lock the
database, area A1, and Fa in IS mode (and in that order), and finally to lock ra2 in S
mode.
2. Suppose that transaction T19 modifies record ra9 in file Fa. Then, T19 needs to lock the
database, area A1, and file Fa in IX mode, and finally to lock ra9 in X mode.
3. Suppose that transaction T20 reads all the records in file Fa. Then, T20 needs to lock the
database and area A1 (in that order) in IS mode, and finally to lock Fa in S mode.
4. Suppose that transaction T21 reads the entire database. It can do so after locking the
database in S mode.
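The lock sequences in the four examples follow one pattern: intention locks on every ancestor from the root down, then the real lock on the target node. A minimal sketch, assuming a hypothetical `PARENT` map that contains only the nodes named in the examples and a request mode of either S or X:

```python
# Hypothetical fragment of the hierarchy: DB -> area A1 -> file Fa -> records.
PARENT = {"A1": "DB", "Fa": "A1", "ra2": "Fa", "ra9": "Fa"}

def lock_plan(node, mode):
    """Return the (node, mode) requests, root first, needed to lock `node` in S or X mode."""
    intention = "IS" if mode == "S" else "IX"
    ancestors = []
    current = node
    while current in PARENT:            # walk up until the root is reached
        current = PARENT[current]
        ancestors.append(current)
    plan = [(ancestor, intention) for ancestor in reversed(ancestors)]
    plan.append((node, mode))
    return plan

t18 = lock_plan("ra2", "S")   # [('DB', 'IS'), ('A1', 'IS'), ('Fa', 'IS'), ('ra2', 'S')]
t19 = lock_plan("ra9", "X")   # [('DB', 'IX'), ('A1', 'IX'), ('Fa', 'IX'), ('ra9', 'X')]
t20 = lock_plan("Fa", "S")    # [('DB', 'IS'), ('A1', 'IS'), ('Fa', 'S')]
t21 = lock_plan("DB", "S")    # [('DB', 'S')]
```

Unlocking would simply walk each plan in reverse, which gives the leaf-to-root order.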

Dynamic Database Concurrency (Phantom Problem)


If concurrency control is performed at tuple granularity, a conflict between a query and a
concurrent insert may go undetected. This is called the Phantom Problem.
e.g.
Suppose dno = 10 is the Physics department, and T8 needs to access all the records (tuples) of Physics:
Select count(*) from emp where dno=10;
T9 is a transaction that executes concurrently:
Insert into emp values(11,'Akash',10);
Let S be a schedule containing transactions T8 and T9.
If T8 uses the tuple newly added by T9 while computing count(*), then T8 reads a value written
by T9, so in an equivalent serial schedule T9 must come before T8. If T8 does not use the new
tuple, T8 must come before T9. In both cases the two transactions conflict even though they
access no tuple in common, so tuple-level locking does not detect the conflict.
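The sketch below illustrates why tuple-level locks miss this conflict. It is a toy model, not a real lock manager: the `emp` dictionary, the tuple ids and the first three rows are invented for illustration, and only the inserted row (11, 'Akash', 10) comes from the example.

```python
# Hypothetical emp table: tuple id -> (eno, name, dno)
emp = {1: (7, "Ravi", 10), 2: (8, "Meena", 10), 3: (9, "John", 20)}

# T8 takes shared locks on every existing tuple with dno = 10 and counts them.
t8_locks = {tid for tid, row in emp.items() if row[2] == 10}
count_before = len(t8_locks)

# T9 inserts a new tuple and takes an exclusive lock on that tuple only.
emp[4] = (11, "Akash", 10)
t9_locks = {4}

# Tuple-level locking sees no conflict: the two lock sets share no tuple.
conflict_detected = bool(t8_locks & t9_locks)

# Yet repeating T8's query now gives a different answer: the new row is a phantom.
count_after = sum(1 for row in emp.values() if row[2] == 10)
```

The lock sets are disjoint, so `conflict_detected` is False, while the count changes from 2 to 3. Detecting the conflict requires locking something coarser than individual tuples, such as the relation or an index entry covering dno = 10.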

Timestamping Versus Locking:

No. | Timestamping | Locking
1 | Used to decide whether a transaction should wait or roll back. | Used for concurrency control.
2 | It is used for deadlock prevention. | It is used to improve performance.
3 | A timestamp is a unique identifier used to identify the transaction. | A lock is a variable associated with a data item in a database.
4 | Timestamp methods assign a timestamp to each transaction and enforce serializability by ensuring that the transaction timestamps match the schedule for the transactions. | Locking methods prevent unserializable schedules by keeping more than one transaction from accessing the same data elements.
5 | Timestamp methods may cause more transaction aborts than a locking protocol. | Locking methods do not have to abort transactions (other than to resolve deadlocks), because they prevent potentially conflicting transactions from interacting with other transactions.
6 | Space is needed for the read and write times of every database element, whether or not it is currently being accessed. | Space in the lock table is proportional to the number of database elements locked.
Deadlock
Consider a partial schedule in which T3 holds an exclusive lock on B and T4 holds a lock on A.

Neither T3 nor T4 can make progress: executing lock-S(B) causes T4 to wait for T3 to release
its lock on B, while executing lock-X(A) causes T3 to wait for T4 to release its lock on A.
Such a situation is called a deadlock.
To handle a deadlock, one of T3 or T4 must be rolled back and its locks released.
• Two-phase locking does not ensure freedom from deadlocks.
• In addition to deadlocks, there is a possibility of starvation.
• Starvation occurs if the concurrency control manager is badly designed.
• For example:
⮚ A transaction may be waiting for an X-lock on an item, while a sequence of other
transactions request and are granted an S-lock on the same item.
⮚ The same transaction is repeatedly rolled back due to deadlocks.
• Concurrency control manager can be designed to prevent starvation.
• The potential for deadlock exists in most locking protocols.
• When a deadlock occurs there is a possibility of cascading roll-backs.
• Cascading roll-back is possible under two-phase locking. To avoid this, follow a
modified protocol called strict two-phase locking -- a transaction must hold all its
exclusive locks till it commits/aborts.
• Rigorous two-phase locking is even stricter. Here, all locks are held till commit/abort. In
this protocol transactions can be serialized in the order in which they commit.
• System is deadlocked if there is a set of transactions such that every transaction in the set
is waiting for another transaction in the set.
• Deadlock prevention protocols ensure that the system will never enter into a deadlock
state. Some prevention strategies :
⮚ Require that each transaction locks all its data items before it begins execution
(pre declaration).
⮚ Impose partial ordering of all data items and require that a transaction can lock
data items only in the order specified by the partial order.
Deadlock Prevention:
Following schemes use transaction timestamps for the sake of deadlock prevention alone.
• Wait-die scheme (non-preemptive)
⮚ An older transaction may wait for a younger one to release a data item (older means
smaller timestamp). Younger transactions never wait for older ones; they are rolled
back instead.
⮚ A transaction may die several times before acquiring the needed data item. For example,
when T1 requests an item held by T2: if T1 is older than T2, T1 is allowed to wait;
otherwise, if T1 is younger than T2, T1 is aborted and restarted later.
• Wound-wait scheme (preemptive)
⮚ An older transaction wounds (forces the rollback of) a younger transaction instead of
waiting for it. Younger transactions may wait for older ones.
⮚ There may be fewer rollbacks than in the wait-die scheme.
Situation | Wait-die | Wound-wait
Older transaction depends on younger | Older waits | Older wounds younger (younger is rolled back)
Younger transaction depends on older | Younger is rolled back (dies) | Younger waits
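Both schemes reduce to one timestamp comparison between the transaction requesting a data item and the transaction holding it. A minimal sketch, assuming integer timestamps where a smaller value means an older transaction; the function names and return strings are hypothetical:

```python
def wait_die(requester_ts, holder_ts):
    """Non-preemptive: an older requester waits, a younger requester dies."""
    return "wait" if requester_ts < holder_ts else "rollback requester"

def wound_wait(requester_ts, holder_ts):
    """Preemptive: an older requester wounds the holder, a younger requester waits."""
    return "rollback holder" if requester_ts < holder_ts else "wait"

OLD, YOUNG = 5, 10   # illustrative timestamps

# Older transaction requests an item held by a younger one.
older_requests = (wait_die(OLD, YOUNG), wound_wait(OLD, YOUNG))
# Younger transaction requests an item held by an older one.
younger_requests = (wait_die(YOUNG, OLD), wound_wait(YOUNG, OLD))
```

In both schemes the transaction that gets rolled back is always the younger one, which is why neither scheme can produce a cycle of waiting transactions.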
Deadlock Detection:
Deadlocks can be described as a wait-for graph, which consists of a pair G = (V,E),
⮚ V is a set of vertices (all the transactions in the system)
⮚ E is a set of edges; each element is an ordered pair Ti → Tj.
• If Ti → Tj is in E, then there is a directed edge from Ti to Tj, implying that Ti is waiting
for Tj to release a data item.
• When Ti requests a data item currently being held by Tj, then the edge Ti → Tj is
inserted in the wait-for graph. This edge is removed only when Tj is no longer holding a
data item needed by Ti.
• The system is in a deadlock state if and only if the wait-for graph has a cycle. The system
must invoke a deadlock-detection algorithm periodically to look for cycles.
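One common way to look for a cycle is a depth-first search over the wait-for graph. The sketch below is one possible detection routine, not a prescribed algorithm; it assumes the graph is stored as a dictionary mapping each transaction to the transactions it waits for, and the two sample graphs are invented for illustration.

```python
def has_cycle(wait_for):
    """Return True if the wait-for graph contains a cycle (depth-first search)."""
    WHITE, GREY, BLACK = 0, 1, 2     # unvisited, on the current path, finished
    colour = {}

    def visit(t):
        colour[t] = GREY
        for u in wait_for.get(t, ()):
            state = colour.get(u, WHITE)
            if state == GREY:        # edge back to the current path: cycle found
                return True
            if state == WHITE and visit(u):
                return True
        colour[t] = BLACK
        return False

    return any(colour.get(t, WHITE) == WHITE and visit(t) for t in list(wait_for))

# Ti -> Tj means Ti is waiting for Tj to release a data item.
chain = {"T1": ["T2"], "T2": ["T3"], "T3": []}        # no cycle
ring = {"T1": ["T2"], "T2": ["T3"], "T3": ["T1"]}     # T1 -> T2 -> T3 -> T1

chain_deadlocked = has_cycle(chain)   # False
ring_deadlocked = has_cycle(ring)     # True
```

How often such a check runs is a tuning decision: frequent checks find deadlocks sooner but cost more.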
Examples:
1. The following table lists an interleaved sequence of lock requests made by transactions T1, T2,
T3 and T4. Check whether there is a deadlock.

Time   Transaction   Code
t1     T1            Lock(A,X)
t2     T2            Lock(B,X)
t3     T3            Lock(A,S)
t4     T4            Lock(B,S)
t5     T1            Lock(B,S)
t6     T2            Lock(D,S)
t7     T3            Lock(C,S)
t8     T4            Lock(C,X)

Solution: As per the given times t1 to t8, the schedule can be written as follows:

T1      T2      T3      T4

X(A)
        X(B)
                S(A)
                        S(B)
S(B)
        S(D)
                S(C)
                        X(C)

The wait-for edges are T3 → T1 (on A), T4 → T2 and T1 → T2 (on B), and T4 → T3 (on C).
Since there is no cycle in the wait-for graph, the schedule is deadlock free.


2. The following table lists an interleaved sequence of lock requests made by transactions T1, T2,
T3 and T4. Check whether there is a deadlock.

Time   Transaction   Code
t1     T1            Lock(A,X)
t2     T2            Lock(B,X)
t3     T3            Lock(A,S)
t4     T4            Lock(B,S)
t5     T1            Lock(B,S)
t6     T3            Lock(D,X)
t7     T2            Lock(D,S)
t8     T4            Lock(C,X)

Solution: As per the given times t1 to t8, the schedule can be written as follows:

T1      T2      T3      T4

X(A)
        X(B)
                S(A)
                        S(B)
S(B)
                X(D)
        S(D)
                        X(C)
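The wait-for edges are T3 → T1 (on A), T4 → T2 and T1 → T2 (on B), and T2 → T3 (on D).
Since the wait-for graph contains the cycle T1 → T2 → T3 → T1, there is a deadlock. The
transactions involved in the deadlock are T1, T2 and T3.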
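Both examples can be checked mechanically by turning the request tables into wait-for edges. The sketch below is an illustration under simplifying assumptions: a request is granted only if it is compatible with every lock already granted on that item (S is compatible only with S), a blocked request adds an edge to each conflicting holder, and a transaction's later requests are still processed even if an earlier one is blocked, as the tables list them. The function names are hypothetical.

```python
def wait_for_edges(requests):
    """Turn (transaction, item, mode) requests into a set of wait-for edges."""
    holders = {}     # item -> list of (transaction, mode) locks granted so far
    edges = set()
    for txn, item, mode in requests:
        blockers = [h for h, m in holders.get(item, [])
                    if h != txn and (mode == "X" or m == "X")]
        if blockers:
            edges.update((txn, h) for h in blockers)    # txn waits for each conflicting holder
        else:
            holders.setdefault(item, []).append((txn, mode))
    return edges

def deadlocked(edges):
    """Return the transactions that lie on a cycle, i.e. that can reach themselves."""
    succ = {}
    for a, b in edges:
        succ.setdefault(a, set()).add(b)

    def reachable(start):
        seen, stack = set(), list(succ.get(start, ()))
        while stack:
            t = stack.pop()
            if t not in seen:
                seen.add(t)
                stack.extend(succ.get(t, ()))
        return seen

    return {t for t in succ if t in reachable(t)}

example1 = [("T1", "A", "X"), ("T2", "B", "X"), ("T3", "A", "S"), ("T4", "B", "S"),
            ("T1", "B", "S"), ("T2", "D", "S"), ("T3", "C", "S"), ("T4", "C", "X")]
example2 = [("T1", "A", "X"), ("T2", "B", "X"), ("T3", "A", "S"), ("T4", "B", "S"),
            ("T1", "B", "S"), ("T3", "D", "X"), ("T2", "D", "S"), ("T4", "C", "X")]

result1 = deadlocked(wait_for_edges(example1))   # set(): no deadlock
result2 = deadlocked(wait_for_edges(example2))   # {'T1', 'T2', 'T3'}
```

In the second example T4 waits for T2 but is not itself part of the cycle, so it is not reported as deadlocked, in line with the hand solution.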
