CSI3131_Ch6_Synchronization_Part1
Chapter 6
Synchronization Tools-Part I
• 6.1 Background
• 6.2 The critical-section problem
• 6.3 Peterson's solution
• 6.4 Hardware support for synchronization
• 6.5 Mutex locks
• 6.6 Semaphores
…
6.1 Chapter Objectives
6.1 Problems with concurrency and parallelism (race conditions)
6.1 Example 1
• Two threads execute the same procedure and share the same database
• They can be interrupted anywhere
• The result of the concurrent execution of P1 and P2 depends on the order of their interleaving (the particular order in which the accesses take place)
• Often good, sometimes bad (we cannot trust it!)
6.1 Example 1 - Overview of a possible bad execution (race)
[Figure: P1 (Mr. Smith requests a flight reservation) and P2 (Mr. Green requests a flight reservation) interleave because of an interruption or delay between them]
THE NEED:
A thread should be able to say to the others: “LET ME FINISH my critical section (CS)”!
• A CS is a part of a task (accessing shared data or resources) whose execution must not interleave with other tasks' CSs – indivisibility of the CS (atomic execution)
• Once a task enters its CS, it must be allowed to complete it without other tasks being allowed to manipulate the same data
6.2 The critical-section problem (CSP)
• The critical-section problem is to design a protocol that threads can use to synchronize their activity so as to cooperatively share data:
• the result of their actions must not depend on the interleaving order of their execution
• The execution of CSs must be mutually exclusive: at any time, only one thread can execute a CS for a given piece of data (even when there are multiple CPUs)
• This can be achieved by placing special instructions at the beginning and end of the CS
• Once a task enters the CS, it must complete it as an atomic block (no other task is allowed to touch the same data)
• The CS must be locked to make it indivisible (atomic)
Note: A program may have many CSs, but for simplicity we sometimes assume that there is only one CS per program.
6.1 Let’s repeat Example 2 with CS implementation!
• Two threads operate in parallel on a shared variable a = 0 (b is private to each thread)

Case 1 (CS implemented):
  Th1: b = a; b++; a = b   (a = 1)
  — interruption —
  Th2: b = a; b++; a = b   (a = 2)

Case 2 (without CS):
  Th1: b = a
  — interruption —
  Th2: b = a; b++; a = b   (a = 1)
  Th1 resumes: b++; a = b  (a = 1)
• Case 1: CS implemented – result: a = 2 (trusted result).
• Case 2 (problem): CS not implemented – result: a = 1 (untrusted result).
• But there may be interleavings where the result happens to be correct! (still untrusted!!)
6.1 Repeat Example 1 with CS implementation
P1 and P2 each execute:

while (true) {
    entry CS
    critical section   // ATOMIC
    exit CS
}

Inside the CS (atomic):
  Database says: seat A is available
  seat A is assigned to X and marked occupied

Mr. Smith (P1) and Mr. Green (P2) each request a flight reservation, but the entry/exit CS protocol forces the two reservations to execute one after the other.
• Progress
• Ensures that tasks cooperatively determine which task will next enter its CS
• A CS is only given to a task that is waiting to enter it
• Whenever a CS becomes available and there are threads/tasks waiting for it, one of them must be able to enter it (no deadlock)
• Bounded Waiting
• Limits the amount of time a task waits before it can enter its critical section (CS)
• A thread waiting to enter a CS will eventually be able to enter (alternation and no starvation)
• No thread can be excluded forever from the CS because other threads monopolize it
• Bounded waiting implies progress, and progress implies mutual exclusion and no deadlock.
Note the difference between deadlock (covered in Ch 8) and starvation
6.2 Some CSP Solutions
• Software based solutions
• algorithms that do not use special instructions
• Peterson’s algorithm
• Hardware solutions
• rely on the existence of certain special (processor) instructions
• test_and_set(), cmpxchg()
• Higher-level software tools (HLST) solution
• Provides certain system calls to the programmer
The first two kinds of solutions have problems (even Peterson's solution fails on modern hardware, as we will see); the third works
6.3 S/W-based solutions – Collaboration problem
Not practical, but interesting to understand the problem
6.3 So let's start with S/W algorithm 1 – Threads give each other a turn
• We have tasks Ti and a shared variable turn indicating whose turn it is now
• turn is initialized to a valid task number (assume turn = 0)
• The two threads move forward, giving each other a turn
• So: no deadlock, no starvation
• But what if we have 1000 threads and only a handful are active?
6.3 S/W-based solutions, algorithm 1 – Discussion
• Assume we have 1000 threads and only a handful are active:
• Before a thread can enter the critical section, it must wait until all the others have had their chance!
• This clearly contradicts the Progress requirement
6.3 S/W-based solutions, algorithm 2 - Introduction
6.3 S/W-based solutions, algorithm 2 - Introduction
• One Boolean variable per thread: flag[0] and flag[1]
• Ti signals that it wants to execute its CS by setting flag[i] = true
• But Ti does not enter if the other thread has also set its flag to enter!
• Mutual exclusion seems fine
• But progress does not look satisfied – why?
• Consider the sequence:
• T0: flag[0] = true // “After you, sir”!
• T1: flag[1] = true // “After you, sir”!
• Each thread will wait indefinitely to execute its CS: deadlock

Thread Ti:
while (true) {
    flag[i] = true;
    while (flag[j]); // BW
    CS
    flag[i] = false;
    noCS
}
6.3 S/W-based solution, algorithm 2 – Example with 2 threads
After you sir …! After you sir …!
• T0: flag[0]=true;
• T1: flag[1]=true;
• Deadlock!
6.3 S/W-based solutions, algorithm 3 – Peterson solution
• It provides a good algorithmic description of solving the CSP
• It illustrates some of the complexities involved in designing
software that addresses the requirements of:
• mutual exclusion, progress, and bounded waiting
• Thread Ti can enter if the other thread does not want to enter, or if it is Ti's turn:
• flag[j] == false or turn == i
6.3 S/W-based solutions, algorithm 3 – Peterson solution
• If both tasks try to enter at the same time, turn will be set by both tasks at roughly the same time:
• But turn is shared, so only the later of the two assignments holds (the other occurs but is immediately overwritten).
• The eventual value of turn determines which of the two tasks is allowed to enter its CS first.
• We now prove that this solution is correct. We need to show that:
• Mutual exclusion is preserved.
• The progress requirement is satisfied.
• The bounded-waiting requirement is met.
6.3 S/W-based solutions, algorithm 3 – Analysis (2)
• Is mutual exclusion preserved?
• Yes: when both want in, a task enters its CS only if it is its turn
• turn can hold only one value, which allows only one task to enter the CS
• Is the bounded-waiting requirement met?
• Yes: an executing task sets its flag to false before reaching its remainder section (non-CS); this releases the other task from its busy wait and allows it to execute its CS
6.3 S/W-based solutions, algorithm 3 – Analysis – special case!
• Assume that T0 is the only one that needs the CS, or that T1 is slow to act: T0 can enter immediately (flag[1] == false since the last time T1 exited)

while (1) {
    flag[0] = true;  // takes the initiative
    turn = 1;        // give the other a chance!
    CS
    ...

• This property is desirable, but it can cause starvation for T1 if it is slow (race condition)
6.3 S/W-based solutions, algorithm 3 – Analysis – failing task!
• If a solution satisfies the requirements of mutual exclusion and progress, it is robust against a thread failing in its non-CS
• A thread that fails in its non-CS is like a thread that never asks to enter...
6.3 S/W algorithm 3 – Analysis – May not work on modern computer!
• To improve performance, processors and/or compilers may reorder read and
write operations that have no dependencies.
• For a multithreaded application with shared data, the reordering of
instructions may render inconsistent or unexpected results.
• Example of data shared between two threads:
boolean flag = false;
int x = 0;
• Thread 1 performs:
while (!flag);
print x;
• Thread 2 performs:
x = 100;
flag = true;
• So we expect Thread 1 to output 100, right?
• No guarantee! Since there are no data dependencies between flag and x, a processor may reorder Thread 2's instructions so that flag is assigned true before x is assigned 100
• In that case Thread 1 could output 0 for x
6.3 S/W algorithm 3 – Analysis – May not work on modern computer!
• How does this affect Peterson's solution?
• It is possible that both threads may be active in their critical sections at the
same time!
• The solutions that we will see from now on are all based on the
existence of specialized instructions, which make the work
easier.
6.4 Hardware support for synchronization
• Can we simply disable interrupts when entering the CS and enable them when exiting?
• We cannot disable interrupts on all CPUs of a multi-core processor at the same time
• So this is not good in general!

Thread Pi:
while (true) {
    disable interrupts
    CS
    enable interrupts
    nonCS or remainder section (RS)
}
6.4 H/W sync support- Memory barriers (or memory fences)
6.4 H/W sync support - Memory barriers (or memory fences)
• Let's return to the reordering example we just saw!
• T1: if we add a memory barrier (MB) operation to Thread 1
while (!flag)
    memory_barrier();
print x;
• we guarantee that the value of flag is loaded before the value of x
• T2: similarly, if we place an MB between the assignments performed by Thread 2
x = 100;
memory_barrier();
flag = true;
• we ensure that the assignment to x occurs before the assignment to flag
• With respect to Peterson's solution, we could place a memory barrier between the first two assignment statements in the entry section to avoid the reordering of operations
Note: memory barriers are very low-level operations, typically used only by kernel developers writing specialized code that ensures mutual exclusion.
6.4 H/W sync support – Instructions
• Modern computer systems provide special hardware
instructions that:
• Test and modify the content of a word atomically
• Swap the contents of two words atomically – as one uninterruptible unit
• We can use these special instructions to solve the critical-section problem in a relatively simple manner
• The main abstractions describing them are the test_and_set() and compare_and_swap() instructions
6.4 H/W sync support – Instructions - test_and_set()
• It is executed atomically
• If two test_and_set() instructions are executed simultaneously (each on a different core), they will be executed sequentially in some arbitrary order
• Mutual exclusion can be implemented by declaring a boolean variable lock, initialized to false – the structure of task Ti is:

// Instruction definition
boolean test_and_set(boolean *target) {
    boolean rv = *target;
    *target = true;
    return rv;
}

// Structure of task Ti
while (true) {
    while (test_and_set(&lock)); // BW
    /* critical section */
    lock = false;
    /* remainder section */
}

• Con: still uses busy waiting
• compare_and_swap() (CAS) sets *value to new_value only if *value equals expected, and always returns the original value of *value:

int compare_and_swap(int *value, int expected, int new_value) {
    int temp = *value;
    if (*value == expected)
        *value = new_value;
    return temp;
}

• To enforce atomic execution, the lock prefix is used to lock the bus while the destination operand is being updated
• The instruction interchanges the contents of the source operand and destination operand atomically
6.4 H/W sync support – Instructions - CAS with waiting (1)
• An algorithm using CAS that satisfies all the critical-section requirements
• The common data structures are:
boolean waiting[n]; // initialized to false
int lock;           // initialized to 0
• Task Ti can enter its CS only if either waiting[i] == false or key == 0
• key can become 0 only when compare_and_swap() is executed; all other tasks must wait
• waiting[i] can become false only when another task leaves its critical section
• Only one waiting[i] is set to false at a time, maintaining the mutual-exclusion requirement
6.4 H/W sync support – Instructions - CAS with waiting (2)
The structure of task Ti:

while (true) {
    waiting[i] = true;
    key = 1;
    while (waiting[i] && key == 1)
        key = compare_and_swap(&lock, 0, 1);
    waiting[i] = false;

    /* critical section */

    j = (i + 1) % n;
    while ((j != i) && !waiting[j])
        j = (j + 1) % n;

    if (j == i)
        lock = 0;
    else
        waiting[j] = false;

    /* remainder section */
}

• Mutual exclusion maintained:
• A task exiting the CS either sets lock to 0 or sets waiting[j] to false – both allow a waiting task to proceed into its critical section
• Bounded-waiting requirement fulfilled:
• When a task leaves its critical section, it scans waiting[] in cyclic order (i+1, i+2, …, n−1, 0, …, i−1)
• It designates the first task in this ordering that is in the entry section (waiting[j] == true) as the next one to enter the CS
• Any task waiting to enter its critical section will thus do so within n − 1 turns (no starvation)
6.4 H/W sync support – Instructions - Atomic variables (AV)
• Atomic variables (AVs) are one tool for solving the CSP (critical-section problem)
• They provide atomic operations on basic data types such as int and bool
• Special atomic data types and functions for accessing and manipulating atomic variables exist only on supporting OSes, e.g. atomic_int
• Often implemented using compare_and_swap() operations (see textbook)
• Limitation: their use is often limited to single updates of shared data, such as counters and sequence generators
• They do not entirely solve race conditions in all circumstances
• Example: in producer/consumer tasks, when the buffer is empty and two consumers are waiting for count > 0, if a producer increments count, both consumers could proceed to consume, even though count was only set to 1
HLST - Instructions (System calls)
• The solutions seen so far are “ugly”: difficult to use, and things can easily go bad!
• The H/W-based solutions to the CSP (6.4) are complicated and generally inaccessible to application programmers
• We need an easier and more robust way to avoid common errors like deadlock, starvation, races, etc.
• Higher-level software (HLS) synchronization tools are needed to solve the CSP
• The methods we are going to see use powerful instructions implemented as system calls to the OS
6.5 HLST - Mutex locks (also called a spinlock)
• Mutex is short for mutual exclusion
6.5 HLST - Mutex locks (also called a spinlock)
• The definition of acquire():

acquire() {
    while (!available); // BW
    available = false;
}

• If the lock is not available, the calling task busy-waits (BW) until the lock is released
• Disadvantage: it requires busy waiting
• Other tasks trying to enter their CS must loop continuously in the call to acquire()
6.6 HLST – Semaphores
• A semaphore is similar to a mutex lock (spinlock) but can also provide more sophisticated ways for tasks to synchronize their activities
• A semaphore S is an integer variable that is accessed only through two standard atomic operations:
• wait(), with the definition:

wait(S) {
    while (S <= 0); // BW
    S--;
}   // atomic

• signal(), with the definition:

signal(S) {
    S++;
}   // atomic

• The test of (S <= 0) and the possible modification (S--) must not be interrupted
6.6 HLST – Semaphores – The idea!
6.6 HLST – Semaphores usage – binary S
6.6 HLST – Spinlock : binary semaphores – example 1
• Initialize S = 1;
6.6 HLST – Semaphores usage – Counting S
• Counting semaphores control access to a finite number of instances.
• S can range over an unrestricted domain
• Example of deadlock with two semaphores S and Q:
T0: wait(S); wait(Q); …
T1: wait(Q); wait(S); …
• Each task ends up waiting for a signal that only the other (itself blocked) task can issue
6.6 HLST – Semaphore - observation
• When S >= 0:
• The number of threads that can run wait(S) without busy-waiting = S
• S threads can enter the CS
• More powerful than the other sync mechanisms we have already seen
• In solutions where S > 1, a second semaphore is necessary to control thread entry into the CS one at a time (mutual exclusion)
• When S becomes positive again, the first thread to enter the CS is the first one to test S – a random choice -> possible starvation
• This will no longer be true in the next solution
6.6 HLST – Semaphore implementation with no BW (1)
6.6 HLST – Semaphore implementation with no BW (2)
6.6 HLST – Semaphore implementation with no BW (3)
6.6 HLST – Semaphore implementation with no BW (4)
• To implement semaphores under this new definition, we define a semaphore as a new type as follows (somewhat like a class in Java):
typedef struct {
int value;
struct process *list;
} semaphore;
• A signal() operation removes one process from the list of waiting processes
and awakens that process.
• Now, what is the new definition of wait()and signal()?
• next slide
6.6 HLST – Semaphore implementation with no BW (5)
• Definition of the wait() semaphore operation:

wait(semaphore *S) {
    S->value--;                       // decrement value
    if (S->value < 0) {
        add this process to S->list;
        sleep();
    }
}

• S is a pointer to a semaphore-typed variable; S->value accesses its value member
• Notice: sleep() replaces the previous busy wait (while loop). But here it is placed after value--, so value can become negative! Why? (next slide)

• Definition of the signal() semaphore operation:

signal(semaphore *S) {
    S->value++;
    if (S->value <= 0) {
        remove a process P from S->list;
        wakeup(P);
    }
}

• sleep() suspends the process that invokes it
• wakeup(P) resumes the execution of a suspended process P
• These two operations are provided by the OS as basic system calls
6.6 HLST – Semaphore implementation with no BW (6) -Discussion
• Under this new definition, when a semaphore's value is negative, its magnitude is the number of processes waiting on that semaphore
• The operations wait() and signal() must be executed atomically (only one task at a time)
• On a single core, this can be achieved by inhibiting interrupts while a task performs these operations
• On multicore (SMP) systems, interrupts would have to be disabled on every processing core – this is not an easy task, and performance can seriously diminish
– Alternative techniques, such as compare_and_swap() or spinlocks (which have BW implemented in them!!), can ensure that wait() and signal() are performed atomically
• It is important to admit that we have not completely eliminated BW; we have only limited it to the critical sections of the wait() and signal() operations, and these sections are short
• Thus these CSs are almost never occupied, and busy waiting occurs rarely, and then only for a short time... whereas BW inside a task's own CS could be long (minutes or hours)