6-Review of DBMS Techniques - Serializability-10-01-2024

You might also like

Download as ppt, pdf, or txt
Download as ppt, pdf, or txt
You are on page 1of 41

Module:1

Database Design Techniques


Topics:
Review of DBMS Techniques:
Introduction to Transaction Processing
Transaction concepts
ACID properties of Transactions
Transaction states
Serial and Serializable Schedules
Schedules based on recoverability
Schedules based on Serializability
Conflict Serializabilty
Transaction and System concepts
• A transaction is a unit of program execution that accesses and
possibly updates various data items.
• A transaction must see a consistent database.
• During transaction execution the database may be inconsistent.
• When the transaction is committed, the database must be
consistent.
• Two main issues to deal with:
– Failures of various kinds, such as hardware failures and
system crashes
– Concurrent execution of multiple transactions
Desirable properties of Transactions: ACID Properties
• Atomicity. Either all operations of the transaction are properly
reflected in the database or none are.
• Consistency. Execution of a transaction in isolation preserves the
consistency of the database.
• Isolation. Although multiple transactions may execute
concurrently, each transaction must be unaware of other
concurrently executing transactions. Intermediate transaction
results must be hidden from other concurrently executed
transactions.
– That is, for every pair of transactions Ti and Tj, it appears to Ti
that either Tj finished execution before Ti started, or Tj started
execution after Ti finished.
• Durability. After a transaction completes successfully, the
changes it has made to the database persist, even if there are
system failures.
Example of Fund Transfer
• Transaction to transfer $50 from account A to account B:
1. read(A)
2. A := A – 50
3. write(A)
4. read(B)
5. B := B + 50
6. write(B)
• Atomicity requirement — if the transaction fails after step 3 and
before step 6, the system should ensure that its updates are not
reflected in the database, else an inconsistency will result.
• Consistency requirement – the sum of A and B is unchanged by
the execution of the transaction.
Example of Fund Transfer (Cont.)

• Isolation requirement — if between steps 3 and 6, another


transaction is allowed to access the partially updated database,
it will see an inconsistent database
(the sum A + B will be less than it should be).
Can be ensured trivially by running transactions serially, that is
one after the other. However, executing multiple transactions
concurrently has significant benefits, as we will see.
• Durability requirement — once the user has been notified that
the transaction has completed (i.e., the transfer of the $50 has
taken place), the updates to the database by the transaction
must persist despite failures.
Transaction States
• Active, the initial state; the transaction stays in this state
while it is executing
• Partially committed, after the final statement has been
executed.
• Failed, after the discovery that normal execution can no
longer proceed.
• Aborted, after the transaction has been rolled back and the
database restored to its state prior to the start of the
transaction. Two options after it has been aborted:
– restart the transaction – only if no internal logical error
– kill the transaction
• Committed, after successful completion.
Transaction State (Cont.)
Characterizing schedules based on
serializability: Concurrent Executions
• Multiple transactions are allowed to run concurrently in the
system. Advantages are:
– increased processor and disk utilization, leading to better
transaction throughput: one transaction can be using the CPU
while another is reading from or writing to the disk
– reduced average response time for transactions: short
transactions need not wait behind long ones.
• Concurrency control schemes – mechanisms to achieve isolation,
i.e., to control the interaction among the concurrent transactions
in order to prevent them from destroying the consistency of the
database
Schedules
• Schedules – sequences that indicate the chronological
order in which instructions of concurrent transactions are
executed
– a schedule for a set of transactions must consist of all
instructions of those transactions
– must preserve the order in which the instructions
appear in each individual transaction.
Example Schedules (Schedule 1)
• Let T1 transfer $50 from A to B, and T2 transfer 10% of the balance from A to B.
The following is a serial schedule in which T1 is followed by T2.

Initial Value A = 1000


A = 1000 A = 950
B = 2000 B = 2000
B = 2050
Temp = 95
A = 950 – 95
A = 855
B = 2050 + 95
B = 2145
Example Schedule (Schedule 3)

• Let T1 and T2 be the


transactions defined
previously. This schedule is not
a serial schedule, but it is
equivalent to Schedule 1.

In both Schedule 1 and 3, the sum A + B is preserved.


Schedule 4
The following concurrent
schedule does not preserve
the value of the the sum A + B. A = 1000
A = 950 (Not written no
A = 1000
Initial Value
Temp = 100
A = 1000
A = 1000 -100
B = 2000
A = 900
B = 2000
A = 950
B = 2000
B = 2000 + 50
B = 2050
B = 2000 + 100
B = 2100
Serializability
• Basic Assumption – Each transaction preserves database
consistency.
• Thus serial execution of a set of transactions preserves
database consistency.
• A (possibly concurrent) schedule is serializable if it is equivalent
to a serial schedule. Different forms of schedule equivalence
give rise to the notions of:
1. conflict serializability
2. view serializability
• We ignore operations other than read and write instructions,
and we assume that transactions may perform arbitrary
computations on data in local buffers in between reads and
writes. Our simplified schedules consist of only read and write
instructions.
Conflict Serializability
• Instructions li and lj of transactions Ti and Tj respectively.
• If I and J refer to different data items, then we can swap I and
J without affecting the results of any instruction in the
schedule.
• However, if I and J refer to the same data item Q, then the
order of the two steps may matter.
• Since we are dealing with only read and write instructions,
there are four cases that we need to consider:
1. I = read(Q), J = read(Q). The order of I and J does not
matter, since the same value of Q is read by Ti and Tj ,
regardless of the order.
Conflict Serializability
2. I = read(Q), J = write(Q). If I comes before J , then Ti does not read
the value of Q that is written by Tj in instruction J. If J comes before
I, then Ti reads the value of Q that is written by Tj. Thus, the
order of I and J matters.
3. I = write(Q), J = read(Q). The order of I and J matters for
reasons similar to those of the previous case.
4. I = write(Q), J = write(Q). Since both instructions are write
operations, the order of these instructions does not affect either Ti
or Tj .
However, the value obtained by the next read(Q) instruction of S is
affected, since the result of only the latter of the two write
instructions is preserved in the database.
If there is no other write(Q) instruction after I and J in S, then the
order of I and J directly affects the final value of Q in the database
state that results from schedule S.
Conflict Serializability (Cont.)
• We say that I and J conflict if they are operations by different
transactions on the same data item, and at least one of these
instructions is a write operation.
• Schedule 3

• The write(A) instruction of T1 conflicts with the read(A)


instruction of T2. However, the write(A) instruction of T2 does
not conflict with the read(B) instruction of T1, because the two
instructions access different data items.
Schedule 5
Since the write(A) instruction of T2 in schedule 3 does not conflict
with the read(B) instruction of T1, we can swap these instructions
to generate an equivalent schedule, schedule 5,
Schedule 6
• Swap the read(B) instruction of T1 with the read(A) instruction
of T2.
T1 T2
Read (A)
Write (A)
Read (B)
Read (A)
Write (A)
Write (B)
Read (B)
Write (B)
Schedule 7
• Swap the write(B) instruction of T1 with the write(A)
instruction of T2.

T1 T2
Read (A)
Write (A)
Read (B)
Read (A)
Write (B)
Write (A)
Read (B)
Write (B)
Schedule 8
• Swap the write(B) instruction of T1 with the read(A) instruction of
T2.
• A serial schedule that is equivalent to schedule 3.
• we have shown that schedule 3 is equivalent to a serial schedule.
• So, schedule 3 is conflict equivalent Schedule
T1 T2
Read (A)
Write (A)
Read (B)
Write (B)
Read (A)
Write (A)
Read (B)
Write (B)
Conflict Serializability (Cont.)
• If a schedule S can be transformed into a schedule S´ by a series
of swaps of non-conflicting instructions, we say that S and S´ are
conflict equivalent.
• The concept of conflict equivalence leads to the concept of
conflict serializability. We say that a schedule S is conflict
serializable if it is conflict equivalent to a serial schedule
Example for “not conflict serializable” (Sche: 9)
• Example of a schedule that is not conflict serializable:

• We are unable to swap instructions in the above schedule to


obtain either the serial schedule < T3, T4 >, or the serial
schedule < T4, T3 >.
View Serializability
• Consider the two schedules S & S’, where the same set of transactions
participates in the both schedules.
• Let S and S´ be two schedules with the same set of transactions. S and
S´ are view equivalent if the following three conditions are met:
1. For each data item Q, if transaction Ti reads the initial value of Q in
schedule S, then transaction Ti must, in schedule S´, also read the
initial value of Q.
2. For each data item Q if transaction Ti executes read(Q) in schedule
S, and that value was produced by transaction Tj (if any), then
transaction Ti must in schedule S´ also read the value of Q that was
produced by transaction Tj .
3. For each data item Q, the transaction (if any) that performs the
final write(Q) operation in schedule S must perform the final
write(Q) operation in schedule S´.
• As can be seen, view equivalence is also based purely on reads and
writes alone.
View Serializability (Cont.)
• A schedule S is view serializable it is view equivalent to a
serial schedule. Concept of view equivalence leads to the
concept of view serializability.
• Every conflict serializable schedule is also view serializable.
• But, Every view serializable schedule that is not conflict
serializable because of Blind Write
Testing for Serializability
• Consider some schedule of a set of transactions T1, T2, ..., Tn
• Precedence graph — a direct graph where the vertices are
the transactions (names).
• G = (V, E)
• V is a set of vertices and E is a set of edges
• We draw an arc from Ti to Tj if the two transaction conflict,
and Ti accessed the data item on which the conflict arose
earlier.
• We may label the arc by thex item that was accessed.

y
• The set of vertices consists of all the transactions participating
in the schedule.
• The set of edges consists of all edges Ti →Tj for which one of
three conditions holds:
1. Ti executes write(Q) before Tj executes read(Q).
2. Ti executes read(Q) before Tj executes write(Q).
3. Ti executes write(Q) before Tj executes write(Q).
Precedence graph - Example 1
Precedence graph - Example 2

• It contains the edge


T1 →T2, because T1
executes read(A)
before T2 executes
write(A).
• It also contains the
edge T2→T1, because
T2 executes read(B)
before T1 executes
write(B).
1. Ti executes write(Q)
before Tj executes read(Q).
2. Ti executes read(Q) before
Tj executes write(Q).
3. Ti executes write(Q)
before Tj executes write(Q).

From T2 to T1 ---All 3
conditions are violated. So,
we can not draw T2 T1
Test for Conflict Serializability
• If the precedence graph for schedule S has a cycle, then the
schedule is not a conflict serializable.
• If the graph contains no cycles, then the schedule S is conflict
serializable.
• If precedence graph is acyclic, the serializability order can be
obtained by a topological sorting of the graph. This is a linear
order consistent with the partial order of the graph.
Illustration of topological sorting
• Thus, to test for conflict serializability, we need to construct
the precedence graph and to invoke a cycle-detection
algorithm.
• Cycle-detection algorithms exist which take order n2 time,
where n is the number of vertices in the graph. (Better
algorithms take order n + e where e is the number of edges.)
Schedule 10 – Special case
A = 1000
A = 1000 – 50
A = 950
B = 2000
B = 2000 - 10
B = 1990
B = 1990
B = 1990 + 50
B = 2040
A = 950
A = 950 + 10
A = 960
Schedule 10 – Serial Schedule
T1 T2 A = 1000
Read (A) A = 1000 – 50
A = A – 50 A = 950
Write (A)
B = 2000
Read (B)
B = B +50
B = 2000 + 50
Write (B)
B = 2050
Read (B)
B = 2020
B = B - 10 B = 2050 – 10
Write (B) B = 2040
Read (A) A = 950
A = A + 10 A = 950 + 10
Write (A) A = 960
1. Ti executes write(Q) before Tj
executes read(Q).
2. Ti executes read(Q) before Tj
executes write(Q).
3. Ti executes write(Q) before Tj
executes write(Q).

• All 3 conditions are satisfied


for both T1 to T5 and T5 to T1.
So, we can draw arc T1 T5
and T5 T1
• Cycle there, So, No Conflict
serializability
Test for View Serializability
• The precedence graph test for conflict serializability must
be modified to apply to a test for view serializability.
• View serializability is not used in practice due to its high
degree of computational complexity
• It is complicated to test
• No efficient algorithms to test
Transaction Isolation and Atomicity
• If a transaction Ti fails, for whatever reason, we need to undo
the effect of this transaction to ensure the atomicity property
of the transaction.
• In a system that allows concurrent execution, the atomicity
property requires that any transaction Tj that is dependent on
Ti (that is, Tj has read data written by Ti) is also aborted.
• To achieve this, we need to place restrictions on the type of
schedules permitted in the system.
Characterizing schedules based on
recoverability
• Nonrecoverable schedules:

Schedule 11: Partial


Schedule– Non
recoverable schedule

 T7 is a transaction that performs only one instruction: read(A). We call this a


partial schedule because we have not included a commit or abort operation
for T6.
 Notice that T7 commits immediately after executing the read(A) instruction.
Thus, T7 commits while T6 is still in the active state.
 Now suppose that T6 fails before it commits. T7 has read the value of data
item A written by T6. Therefore, we say that T7 is dependent on T6.
 Because of this, we must abort T7 to ensure atomicity. However, T7 has
already committed and cannot be aborted.
 Thus, we have a situation where it is impossible to recover correctly from the
failure of T6.
Cont…
• Recoverable schedule
 is one where, for each pair of transactions Ti and Tj such that
Tj reads a data item previously written by Ti , the commit
operation of Ti appears before the commit operation of Tj .
 For the example of partial schedule to be recoverable, T7
would have to delay committing until after T6 commits.
Cont…
• Cascading rollback – Even if a schedule is recoverable, to
recover correctly from the failure of a transaction Ti , we may
have to roll back several transactions. Such situations occur if
transactions have read data written by Ti .
• As an illustration, consider the partial schedule 12. Transaction
T8 writes a value of A that is read by transaction T9.
Transaction T9 writes a value of A that is read by transaction
T10.
• Suppose that, at this point, T8 fails. T8 must be rolled back.
Since T9 is dependent on T8, T9 must be rolled back. Since T10
is dependent on T9, T10 must be rolled back. This
phenomenon, in which a single transaction failure leads to a
series of transaction rollbacks, is called cascading rollback.

Schedule 12
cont.
• Cascadeless schedules — Cascading rollback is undesirable,
since it leads to the undoing of a significant amount of work.
• It is desirable to restrict the schedules to those where cascading
rollbacks cannot occur. Such schedules are called cascadeless
schedules.
• Formally, a cascadeless schedule is one where, for each pair of
transactions Ti and Tj such that Tj reads a data item previously
written by Ti , the commit operation of Ti appears before the
read operation of Tj .
• It is easy to verify that every cascadeless schedule is also
recoverable.

You might also like