Transactions & Concurrency Control: CS4262 Distributed Systems

You might also like

Download as pdf or txt
Download as pdf or txt
You are on page 1of 45

Transactions & Concurrency

Control
CS4262 Distributed Systems

Dilum Bandara
Dilum.Bandara@uom.lk

Some slides adapted from U Uthaiyashankar (WSO2) and Rajkumar


Buyya’s
Outline
 Transactions
 Properties
 Classification
 Transaction implementation
 Concurrency control

2
Transactions
 Transactions protect a shared resource (e.g.,
shared data) against simultaneous access by
several concurrent processes
 Transactions can do much more…
 They allow a process to access & modify multiple data
items as a single atomic operation
 All-or-nothing property

3
Transactions in DBMS
 A unit of work performed within a DBMS against
a database
 Treated in a coherent & reliable way independent of
other transactions
 2 main purposes
 To provide reliable units of work that allow correct
recovery from failures & keep a database consistent
even in cases of (complete or partial) system failure
 To provide isolation between programs accessing a
database concurrently

4
Goal of Transactions
 To ensure objects managed by a server remain
in a consistent state when accessed by multiple
transactions in the presence of server failure
 Transactions deal with process crash failures &
communication omission failures
 Transactions don’t deal with Byzantine failure

5
Transaction Model
 Recoverable objects
 Objects that can be recovered after their server
crashes
 Failure atomicity
 Effects of the transaction are atomic even when
server crashes
 Transactions require coordination among
 A client program
 Some recoverable objects
 A coordinator

6
Transaction Model (Cont.)
 Each transaction is created & managed by a
coordinator which implements the coordinator
interface
 Coordinator gives each transaction an identifier (TID)
 Clients invoke transaction coordinator interface
methods to perform a transaction
 A transaction may be aborted by client or server

7
Transaction Coordinator Interface
 openTransaction() → trans
 Starts a new transaction & delivers a unique TID trans
 trans will be used in other operations in the transaction
 closeTransaction(trans) → (commit, abort)
 Ends a transaction
 commit – indicates that transaction has committed
 abort – Indicates that it has aborted
 abortTransaction(trans)
 Aborts the transaction

8
Properties of a Transaction
 Proposed by Harder & Reuter in 1983
 ACID Property
 Atomic
 Consistent
 Isolated
 Durable

9
Atomic Property
 Process all of a transaction or none of it
 To outside world, transaction happens indivisibly
 While a transaction happens, other processes
can’t see any intermediate states
 Example
 A file is 100 bytes long when a transaction starts to
append to it
 If other processes read the file during transaction,
they should see & access only the 100 bytes
 If transaction commits successfully, file grows
instantaneously to its new size
10
Consistent Property
 Data on all systems reflects the same state
 Transaction doesn’t violate system invariants
 Example
 In a banking system, a key invariant is the law of
conservation of money
 After any internal transfer, amount of money in bank
must be same as it was before transfer
 During internal transfer operations, this invariant may
be violated
 But this violation isn’t visible outside the transaction

11
Isolated Property
 Transactions don’t interact/interfere with one
another
 Concurrent transactions don’t interfere with each
other
 i.e., transactions are serializable
 Example
 If 2+ transactions are running at the same time, to
each of them & to other processes, final result
appears as though all transactions ran sequentially (in
some system-dependent order)

12
Durable Property
 Effects of a completed transaction are persistent
 Once a transaction commits, changes are
permanent
 No failure after the commit can undo results or
cause them to be lost

13
Ensuring Transaction Properties
 To support “failure atomicity” & “durability”,
objects must be recoverable
 To support “isolation” & “consistency” server
needs to synchronize operations
 Perform transactions serially (sequentially)
 Generally, an unacceptable solution
 Allow transactions to execute concurrently, if they
would have the same effect as a serial execution
 Known as serializable or serially equivalent transactions

14
Classification of Transactions
 Flat transactions
 Nested transactions
 Distributed transactions

15
Flat Transactions
 A series of operations
 Don’t allow partial results to be committed or
aborted
 e.g., booking multiple air lines from start to
destination

16
Nested Transactions
 Constructed from a number of sub-transactions
 Top-level transaction may fork off children that
run in parallel to one another
 Incase of failure results of some sub-transaction
that committed must be undone
 e.g., booking multiple air lines from start to
destination

17
Distributed Transactions
 Logically, a flat, indivisible transaction that
operates on distributed data
 In contrast, a nested transaction is logically
decomposed into a hierarchy of sub-transactions

18
Source: Tanenbaum & Van Steen, Distributed Systems: Principles and Paradigms, 2e, (c) 2007
Achieving Transaction Properties in
Distributed Transactions
 To achieve atomicity & durability
 Distributed commit protocols
 To achieve consistency & isolation
 Concurrency control algorithms (distributed locking
protocols)
 We need separate distributed algorithms to
handle “locking of data” & “committing entire
transaction”

19
Transaction Implementation
 Consider transactions on a file system
 If each process executing a transaction just updates
the file it uses in place, then transactions will not be
atomic

 2 common implementation techniques


 Private workspace
 Write-ahead log

20
Private Workspace
 Provide private (local) copies
 All updates on private copies
 Update the original (global) copy while
committing
 Issues
 Cost of copying everything
 Possible solution
 Keep a pointer to relevant file block
 When writing copy file block, instead of entire file
 Update original when closing file
 Examples – Andrew file systems
21
Example – Private Workspace
Implementation

22
Source: Andrew S. Tanenbaum , Distributed Operating Systems
Write-Ahead Log
 Files are modified in-place
 Before any changes, a record is written in a log
indicating
 Which transaction is making the change
 Which file & block is being changed
 What old & new values are
 Change is made to file only after log is written
successfully
 Example – log structured file system
 If transaction succeeds & is committed, a commit
record is written in log
 If transaction is aborted, log is used to revert file
back to its original state – rollback 23
Writeahead Log Implementation

24
Concurrency Control
 Goal
 Allow simultaneous execution of transactions
 While enabling each transaction to be isolated & data
items being manipulated are left in a consistent state
 Consistency
 Achieved by giving transactions access to data items in
a specific order
 Final result would be the same as if all transactions
were run sequentially

25
Achieving Concurrency Control
 3 different managers
1. Data manager
2. Scheduler
3. Transaction manager

26
Achieving Concurrency Control
(Cont.)
 Data manager
 Performs actual read/write on data
 Not concerned about transaction
 Scheduler
 Responsible for properly controlling concurrency
 Determines which transaction is allowed to pass a read/write
operation to data manager & at which time
 Schedules individual read/write s.t. transaction isolation &
consistency properties are met
 Transaction manager
 Guarantees transaction atomicity
 Processes transaction primitives by transforming them into
scheduling requests
27
Distributed Concurrency Control
 Each site has a scheduler & data manager
 Each transaction is handled by a transaction manager
 Transaction manager communicates with schedulers
at different sites
 Scheduler may also communicate with remote data
managers

28
Concurrency Control
 Schedules
 Concurrency control issues
 Serial equivalence
 Issues in aborting transactions

29
Schedules – Example

 Schedule – system dependent order of actions


 Final result looks as though all transactions ran sequentially
30
Concurrent Transaction Issues
 Lost update problem
 Transactions should not read a “stale” value & use it
in computing a new value
 Inconsistent retrievals problem
 Update transactions shouldn’t interfere with retrievals
 Illustrated in the context of banking examples…

31
“Lost Update” Problem
 3 bank accounts (A, B, C) with balances,
 A = $100, B = $200, C = $300
 Transaction T
 Transfers from account A to B, for 10% increase in the B’s
balance
 B = $200 + ($200*10%) = $220
 A = $100 – ($200*10%) = $80
 Transaction U
 Transfers from C to B, for 10% increase in B’s balance
 B = $220 + ($220*10%) = $242
 C = $300 – ($220*10%) = $278
 What might happen if transactions T & U are executed
concurrently?
32
“Lost Update” Problem (Cont.)

33
“Inconsistent Retrievals” Problem
 2 bank accounts (A, B) with balances
 A = $200
 B = $200
 Transaction V
 Transfers $100 of account A to B
 A = $200 - $100 = $100
 B = $200 + $100 = $300
 Transaction W
 Computes sum of all account balances (branch total)
 Total = $100 + $300 +…… = $400 + ……

 What might happen if transactions V & W are executed


concurrently? 34
“Inconsistent Retrievals” Problem
(Cont.)

35
Serial Equivalence
 Serial equivalence prevents occurrence of lost
updates & inconsistent retrievals
 Thus, serial equivalence is used as a criterion in
concurrency control algorithms
 2 transactions are serially equivalent if
 All pairs of conflicting operations of 2 transactions are
executed in same order at all of the objects that both
transactions access

36
Serial Equivalence (Cont.)
 Consider 2 objects i & j, & 2 transactions T & U
 Serially equivalent orderings require one of the
following 2 conditions
 T accesses i before U, and T accesses j before U
 U accesses i before T, and U accesses j before T

37
“Lost Update” Problem

38
Serially Equivalent Interleaving

39
“Inconsistent Retrievals” Problem

40
Serially Equivalent Interleaving

41
Issues in Aborting Transactions
 Servers must record effects of all committed
transactions & none of aborted transactions
 2 well-known problems
 Dirty reads
 Premature writes
 Both of these can occur in the presence of
serially equivalent executions of transactions

42
Example – Dirty Read

43
Example – Premature Writes

 Due to interaction between write operations on


the same object belonging to different
transactions
 If U aborts & T commits, balance should be $105
44
Concurrency Control Algorithms
 2-Phase Locking (2PL)
 Get all locks, then release them
 No more lock acquisition when a lock is release
 Timestamp ordering
 Lamport’s time stamp
 Optimistic concurrency
 Go & do the changes, if there is a problem let’s
handle them later

 How to do these concurrently on a distributed


system? 45

You might also like