CSI3131_Ch6_Synchronization_Part1


ELEMENTS OF OPERATING SYSTEMS – CSI 3131

Chapter 6
Synchronization Tools-Part I

Reference: Operating System Concepts, Tenth Edition (zyBook) - Silberschatz, Galvin and Gagne


Plan

• 6.1 Background
• 6.2 The critical-section problem
• 6.3 Peterson's solution
• 6.4 Hardware support for synchronization
• 6.5 Mutex locks
• 6.6 Semaphores

2
6.1 Chapter Objectives

• A major portion of this chapter is devoted to process synchronization and coordination among cooperating processes
• Describe the critical-section problem (CSP) and illustrate a race condition
• Present the classic software-based solution to the critical-section problem
• Describe hardware support for solutions to the critical-section problem: memory barriers, compare-and-swap operations, and atomic variables
• High-level software tools (HLST) solutions: demonstrate how mutex locks, semaphores, monitors, and condition variables can be used to solve the critical-section problem

3
6.1 Problems with concurrency and parallelism (race conditions)

• Concurrent threads often share user data (files or common memory) and resources (cooperative tasks)
• A race condition occurs when several tasks access and manipulate the same data concurrently
• A race condition exists when access to shared data is not controlled, possibly resulting in corrupt data values (the outcome of the execution depends on the particular order in which the accesses take place)
• non-determinism

Note: shared data is not saved when switching threads

4
6.1 Example 1

• Two threads execute this same procedure and share the same database
• They can be interrupted anywhere
• The result of concurrent execution of P1 and P2 depends on the order of their interleaving (the particular order in which the accesses take place)
• Often good, sometimes bad (we cannot trust it!)

1. Mr. X asks for a flight reservation
2. Database says: seat A is available
3. Seat A is assigned to X and marked occupied

5
6.1 Example 1 - Overview of a possible bad execution (race)
P1: Mr. Smith requests a flight reservation
P1: Database says seat 30A is available
    (interruption or delay: switch to P2)
P2: Mr. Green requests a flight reservation
P2: Database says seat 30A is available
P2: Seat 30A is assigned to Mr. Green and marked occupied
P1: Seat 30A is assigned to Mr. Smith and marked occupied

Race problem!! Seat 30A assigned to two persons (dual assignment)! 6
6.1 Example 2
• Two threads operate in parallel on a shared variable a = 0 (b is private to each thread); each runs: b = a; b++; a = b

Case 1 (no interruption):
    Th1: b = a; b++; a = b    (a = 1)
    Th2: b = a; b++; a = b    (a = 2)

Case 2 (interruption):
    Th1: b = a
    Th2: b = a; b++; a = b    (a = 1)
    Th1: b++; a = b           (a = 1)

• Case 1 (normal): the two tasks execute one after the other; result: a = 2.
• Case 2 (problem): Th1 works on its own stale b, so the final result is a = 1.
• There may be cases where the result happens to be correct! But still, something needs to be done!
7
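The two schedules above can be replayed deterministically in C by hand-scheduling the statements; this is an illustrative sketch (the function names are ours, not from the slides):

```c
/* Each thread runs: b = a; b++; a = b;  where a is shared, b is private. */

/* Case 1: Th1 runs to completion before Th2 starts. */
int case1(void) {
    int a = 0, b1, b2;
    b1 = a; b1++; a = b1;   /* Th1: a becomes 1 */
    b2 = a; b2++; a = b2;   /* Th2: a becomes 2 */
    return a;
}

/* Case 2: Th1 is interrupted after reading a; Th2 runs in between. */
int case2(void) {
    int a = 0, b1, b2;
    b1 = a;                 /* Th1 reads a == 0, then is preempted */
    b2 = a; b2++; a = b2;   /* Th2: a becomes 1 */
    b1++; a = b1;           /* Th1 resumes with its stale b1: a stays 1 */
    return a;
}
```

Running both cases shows the lost update: case1() yields 2, case2() yields 1.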
6.1 Example 3
Both threads P1 and P2 run the same procedure:

static int a;

void echo(){
    scanf("%d",&a);
    printf("%d\n",a);
}

• If variable a is shared, the first a value can be erased by the second a value before it is printed!
• If it is private, then the display order may be reversed!
• Also see the related interactive examples in section 6.1 of the zyBook!
8
6.2 Thread asynchrony - Critical Section

• When multiple threads execute in parallel, we cannot make assumptions about the execution speed of the threads, nor their interleaving (the particular order in which accesses take place)
• May be different each time the program is run
• non-determinism

THE NEED:
A thread should be able to say to the others: "LET ME FINISH my critical section (CS)"!
• A CS is the part of a task (touching shared data or resources) whose execution must not interleave with other tasks' CSs - indivisibility of the CS (atomic execution)
• Once a task enters its CS, it must be allowed to complete it without other tasks being allowed to manipulate the same data
9
6.2 The critical-section problem (CSP)
• The critical-section problem is to design a protocol that threads can use to synchronize their activity so as to cooperatively share data.
• The result of their actions must not depend on the interleaving order of their execution
• The execution of CSs must be mutually exclusive: at any time, only one thread can execute a CS for a given piece of data (even when there are multiple CPUs)
• This can be achieved by placing special instructions at the beginning and end of the CS
• Once a task enters its CS, it must complete it as an atomic block (no other tasks are allowed to touch the same data).
• The CS must be locked to become indivisible (atomic)

Note: A program may have many CSs but for simplicity, sometimes
we may assume that there is only one CS in a program. 10
6.1 Let’s repeat Example 2 with CS implementation!
• Two threads operate in parallel on a shared variable a = 0 (b is private to each thread), but now b = a; b++; a = b is protected as a CS

Case 1 (CS implemented):
    Th1 (CS): b = a; b++; a = b    (a = 1)
    Th2 (CS): b = a; b++; a = b    (a = 2)

Case 2 (without CS): as before, an interruption can leave a = 1

• Case 1: CS implemented; result: a = 2 (trusted result).
• Case 2 (problem): CS not implemented; a = 1 (untrusted result).
• There may be cases where the result happens to be correct! (still untrusted!!)

11
6.1 Repeat Example 1 with CS implementation
P1: Mr. Smith requests a flight reservation
P2: Mr. Green requests a flight reservation
P2 (CS): Database says seat 30A is available; seat 30A is assigned to Mr. Green and marked occupied
P1 (CS): Database says seat XX (other than 30A) is available, or maybe all seats are occupied...

"Bad" interleaving is not possible
Trusted result now!
12
6.2 CS Problem - Program structure
• Each thread must request permission to enter its critical section.
• The section of code implementing this request is the entry section
• The critical section is followed by an exit section.
• The remaining code is the remainder section.

while (true) {
    entry section

    critical section (ATOMIC)

    exit section

    remainder section (non-CS)
}
13
6.2 Application (a program that repeats forever)
while (true) {
    Mr. X asks for a flight reservation

    entry section

    critical section (ATOMIC):
        Database says: seat A is available
        seat A is assigned to X and marked occupied

    exit section

    remainder section (non-CS)
}
}
14
6.2 Criteria required for valid solutions to CSP
• Mutual exclusion
• At any time, at most one thread can be in a CS

• Progress
• Ensures programs will cooperatively determine what task will next enter its CS
• A CS will only be given to a task that is waiting to enter it
• Whenever a CS becomes available, if there are threads/tasks waiting for it, one of them
must be able to enter it (no deadlock)
• Bounded Waiting
• Limits the amount of time a program will wait before it can enter its Critical Section (CS)

• A thread waiting to enter a CS will finally be able to enter (alternation and no starvation)
• No thread can be forever excluded from the CS because of other threads monopolizing it.

• Bounded waiting implies progress, and progress implies mutual exclusion and no deadlock.
Note the difference between deadlock (covered in Ch 8) and starvation
15
6.2 Some CSP Solutions
• Software based solutions
• algorithms that do not use special instructions
• Peterson’s algorithm

• Hardware solutions
• rely on the existence of certain special (processor) instructions
• test_and_set(), cmpxchg()
• Higher-level software tools (HLST) solution
• Provides certain system calls to the programmer

• Mutex locks, semaphores, monitors

• All solutions are based on atomic access to central memory: a memory address can only be written by one instruction at a time, and therefore by one thread at a time.
• In general, all solutions are based on the existence of atomic instructions, which function as basic critical sections 16
6.3 Software-based solutions – Cooperation problem

For instance, let's assume a group of students is collaborating on an assignment:
• Solution 1: They pass the homework copy to each other in a fixed order: Marc, Ahmed, Marie, Marc, Ahmed, Marie…
• Solution 2: Similar to 1, but only give a turn to those who want to work
• Solution 3: Combine ideas from the previous solutions

The first two solutions have problems; the third works (Peterson's solution)

17
6.3 S/W-based solutions – Collaboration problem
Not practical, but interesting to understand the problem

• Algorithms 1 and 2 are not valid
  • Like students' collaboration solutions 1 and 2!
  • But they show the difficulty of the problem
• Algorithm 3 is valid (Peterson's algorithm)
  • But there is no guarantee that it will work correctly on modern computer architectures, as they may reorder instructions (the way memory load and store operations are handled).
• Notation
  • For simplicity, let's consider only 2 threads: T0 and T1
  • When we discuss task Ti, Tj will always be the other task (i != j)
  • while(X){A}  // repeat A as long as X is true
  • while(X);    // wait while X is true

18
6.3 So let`s start with S/W algorithm 1 - Threads give each other a turn
• Have tasks Ti and a shared variable turn indicating whose turn it is now.
• turn is initialized to a valid task number (assume turn = 0).
• Ti may enter the critical section if and only if (iff) turn == i
• After exiting the critical section, turn is assigned the next task number, which gains access to its CS next
• Ti busy-waits while Tj is in its CS.

Thread Ti:
while(true){
    while(turn != i);   // busy wait
    CS
    turn = j;
    no CS
}

Turns rotate T0→T1→…→Tn→T0→T1→… even if some of them are not interested at all!
19
6.3 S/W-based solutions, algorithm 1 – Example with 2 threads
Thread T0:                     Thread T1:
while(true){                   while(true){
    while(turn!=0); //BW           while(turn!=1); //BW
    CS                             CS
    turn = 1;                      turn = 0;
    no CS                          no CS
}                              }

• The two threads are moving forward, giving each other a turn
• So, no deadlock, no starvation
• but what if we have 1000 threads and only a handful are active?

20
6.3 S/W-based solutions, algorithm 1 – Discussion

• We can say that the mutual exclusion and bounded waiting requirements are satisfied
  • Only one thread can enter the CS at any moment (mutual exclusion)
  • Tasks alternate very well and there is no starvation (bounded waiting)
• But assuming we have 1000 threads and only a handful are active:
  • Each time, before a thread can enter the critical section, it must wait until all the others have had their chance!
  • This is in clear contradiction with the progress requirement
21
6.3 S/W-based solutions, algorithm 2 - Introduction

• Algorithm 2 takes into account the criticism of algorithm 1
  • Gives the CS only to threads that want it (active threads)
• However, we cannot allow a task to systematically give itself the CS
  • This would cause starvation for the others
  • Each process that wants to enter must give the others a chance (an invitation) before entering! So it will never enter unless the others got their chance first!

22
6.3 S/W-based solutions, algorithm 2 - Introduction
• A Boolean variable per thread: flag[0] and flag[1]
• Ti signals that it wants to execute its CS by setting flag[i] = true
• But Ti does not enter if the other thread has also set its flag to enter!
• Mutual exclusion seems fine
• But progress does not look satisfied; why?
• Consider the sequence:
  • T0: flag[0] = true  // "After you, sir"!
  • T1: flag[1] = true  // "After you, sir"!
• Each thread will wait indefinitely to execute its CS: deadlock

Thread Ti:
while(true){
    flag[i] = true;
    while(flag[j]); //BW
    CS
    flag[i] = false;
    no CS
}
23
6.3 S/W-based solution, algorithm 2 – Example with 2 threads
After you, sir…! After you, sir…!

Thread T0:                     Thread T1:
While(true){                   While(true){
    flag[0]=true;                  flag[1]=true;
    while(flag[1]);//BW            while(flag[0]);//BW
    CS                             CS
    flag[0]=false;                 flag[1]=false;
    no CS                          no CS
}                              }

• T0: flag[0]=true;
• T1: flag[1]=true;
• Deadlock! (both busy-wait forever)
24
6.3 S/W-based solutions, algorithm 3 – Peterson solution
• It provides a good algorithmic description of solving the CSP
• It illustrates some of the complexities involved in designing
software that addresses the requirements of:
• mutual exclusion, progress, and bounded waiting

• No guarantees that it will work correctly on modern computer


architectures!
• It is restricted to two processes/tasks, T0 and T1 (generically Ti and Tj)
• It combines the two ideas from algorithms 1 and 2:
  • flag[i] = intention to enter
  • turn = whose turn it is
int turn;
boolean flag[2];
25
6.3 S/W-based solutions, algorithm 3 – Peterson solution
• Initialization: flag[0] = flag[1] = false; turn = i or j
• A thread indicates that it wants to run its CS by setting flag[i] = true
• It sets flag[i] = false on exit

Thread Ti:
while (true) {
    flag[i] = true;   // I want to enter
    turn = j;         // I give priority to Tj
    while (flag[j] && turn == j);  // BW

    /* Ti critical section */

    flag[i] = false;
    /* remainder section */
}

• Thread Ti should wait if the other wants to enter and it is the other's turn: flag[j] == true and turn == j
• Thread Ti can enter if the other does not want to enter or it is Ti's turn: flag[j] == false or turn == i
26
6.3 S/W-based solutions, algorithm 3 – Peterson solution

• Ti uses the boolean flag[ i ] to indicate its desire to enter its CS


• But also uses turn to give Tj the priority to enter its CS
• Ti can enter the CS only when flag[j] == false or turn == i

Thread T0:
while (true) {
    flag[0] = true;  // T0 wants to enter
    turn = 1;        // T0 gives priority to T1
    while (flag[1] && turn == 1);  // BW

    /* T0 critical section */

    flag[0] = false; // give a go to T1
    /* remainder section */
}

Thread T1:
while (true) {
    flag[1] = true;  // T1 wants to enter
    turn = 0;        // T1 gives priority to T0
    while (flag[0] && turn == 0);  // BW

    /* T1 critical section */

    flag[1] = false; // give a go to T0
    /* remainder section */
}
27
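The algorithm above can be sketched as runnable C. This is not the slide's plain-variable version: C11 sequentially consistent atomics are used for flag and turn, standing in for the memory barriers that later slides show are required on modern hardware (function names and the shared counter are ours, for illustration):

```c
#include <pthread.h>
#include <stdatomic.h>
#include <stdbool.h>

static atomic_bool flag[2];
static atomic_int  turn;
static long        counter;          /* shared data the lock protects */

static void enter_cs(int i) {
    int j = 1 - i;
    atomic_store(&flag[i], true);    /* I want to enter */
    atomic_store(&turn, j);          /* I give priority to Tj */
    while (atomic_load(&flag[j]) && atomic_load(&turn) == j)
        ;                            /* busy wait */
}

static void exit_cs(int i) {
    atomic_store(&flag[i], false);
}

static void *worker(void *arg) {
    int i = (int)(long)arg;
    for (int k = 0; k < 100000; k++) {
        enter_cs(i);
        counter++;                   /* critical section */
        exit_cs(i);
    }
    return NULL;
}

/* Runs both threads; returns the final counter value. */
long run_peterson(void) {
    pthread_t t0, t1;
    counter = 0;
    pthread_create(&t0, NULL, worker, (void *)0L);
    pthread_create(&t1, NULL, worker, (void *)1L);
    pthread_join(t0, NULL);
    pthread_join(t1, NULL);
    return counter;
}
```

With the atomics in place, no increments are lost: run_peterson() returns 200000.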
6.3 S/W-based solutions, algorithm 3 – Analysis (1)
• To enter the critical section, a task Ti first sets flag[i] to be true and then
sets turn to the value j
• thereby asserting that if Tj wishes to enter its CS, it can do so.

• If both tasks try to enter at the same time, turn will be set by both tasks at roughly the same time:
  • But turn is shared by both, so only the latest of these assignments will hold (the other will occur but will be overwritten immediately).
• The eventual value of turn determines which of the two tasks is allowed to
enter its CS first.
• We now prove that this solution is correct. We need to show that:
• Mutual exclusion is preserved.
• The progress requirement is satisfied.
• The bounded-waiting requirement is met.

28
6.3 S/W-based solutions, algorithm 3 – Analysis (2)
• Is mutual exclusion preserved?
  • Yes, a task enters its CS only if it is its turn when both want in.
• Is the progress requirement satisfied?
  • Yes, a task will enter its CS if it is its turn or the other task does not want to enter its CS.
  • What if both want to enter?
    • Simultaneous entry is not possible, as turn can hold only one value, which allows only one task into the CS.
• Is the bounded-waiting requirement met?
  • Yes, an executing task will set its flag to false before reaching its remainder section (non-CS); this unlocks the other task from its busy wait and allows it to execute its CS.

29
6.3 S/W-based solutions, algorithm 3 – Analysis – special case!
• Assume that T0 is the only one that needs the CS, or that T1 is slow to act: T0 can enter immediately (flag[1] == false since the last time T1 exited)

While(true){
    flag[0] = true;  // takes the initiative
    turn = 1;        // gives a chance to the other!

    while(flag[1] && turn==1);  // flag[1]==false, so T0 will enter the CS

    CS

    flag[0] = false; // gives a chance to the other!
}                    // but it must be faster than me to get its chance!!

• This property is desirable, but it can cause starvation for T1 if it is slow (race condition)
30
6.3 S/W-based solutions, algorithm 3 – Analysis – failing task!
• If a solution satisfies the requirements of mutual exclusion and progress, it provides robustness against a thread failing in its non-CS
  • A thread that fails in its non-CS is like a thread that never asks to enter...
• On the other hand, difficult situations arise if a thread fails in its CS
  • A thread Ti that fails in its CS does not send a signal to the other threads: for them, Ti is still in its CS (so they are blocked!)...
  • Solution: timeout
    • A thread that has held its CS beyond a certain time is interrupted by the OS
• The Peterson algorithm can be generalized to more than 2 tasks
  • However, in this case there are more elegant algorithms, such as the bakery algorithm, based on the idea of 'taking a number at the counter'...
  • No time to cover it...

31
6.3 S/W algorithm 3 – Analysis – May not work on modern computer!
• To improve performance, processors and/or compilers may reorder read and
write operations that have no dependencies.
• For a multithreaded application with shared data, the reordering of
instructions may render inconsistent or unexpected results.
• Example of data shared between two threads:
    boolean flag = false;
    int x = 0;
• where Thread 1 performs the statements:
    while (!flag);
    print x;
• and Thread 2 performs:
    x = 100;
    flag = true;
• So, we expect Thread 1 to output 100, right?
• NO guarantee! Since there are no data dependencies between the variables flag and x, it is possible that a processor may reorder the instructions of Thread 2 so that flag is assigned true before the assignment x = 100
• It is then possible that Thread 1 would output 0 for the variable x
32
6.3 S/W algorithm 3 – Analysis – May not work on modern computer!
• How does this affect Peterson's solution?
  • It is possible that both threads may be active in their critical sections at the same time!
• The only way to preserve mutual exclusion is by using proper synchronization tools (next sections)
33
6.3 S/W-based solutions – General Discussion and Criticism
• In order for threads with shared variables to succeed, it is necessary that all involved threads use the same coordination algorithm.
  • A common protocol
• The S/W solution is difficult to program! And to understand!
• Threads that desire entry into their CS are busy waiting (a polling situation), which consumes CPU time.
  • For long critical sections, it is preferable to completely block threads that must wait, then wake them up when they can enter the CS.
  • Reminder: interrupts and semaphores (to come…)
• The solutions that we will see from now on are all based on the existence of specialized instructions, which make the work easier.

34
6.4 Hardware support for synchronization
• Can we just disable interrupts when entering the CS and enable them when exiting?
• We cannot disable interrupts on all CPUs of a multi-core processor at the same time
• So this is not good in general!

Thread Pi:
while(true){
disable interrupt
CS
enable interrupt
nonCS or remainder section (RS)
}

35
6.4 H/W sync support- Memory barriers (or memory fences)

• Problem: In section 6.3 (Peterson's algorithm, slides 18, 32 and 33) we saw that a multicore system may reorder instructions, leading to unreliable data states!
• Solution: memory barriers (machine instructions).
  • They force propagation of any memory changes to all threads running on different cores
  • They ensure that all load and store (read and write) operations are completed before additional load and store operations are performed

36
6.4 H/W sync support - Memory barriers (or memory fences)
• Let's return to our most recent example on slide 32!
• T1: If we add a memory barrier (MB) operation to Thread 1:
    while (!flag);
    memory_barrier();
    print x;
  we guarantee that the value of flag is loaded before the value of x.
• T2: Similarly, if we place an MB between the assignments performed by Thread 2:
    x = 100;
    memory_barrier();
    flag = true;
  we ensure that the assignment to x occurs before the assignment to flag.

• With respect to Peterson's solution, we could place a memory barrier between the first
two assignment statements in the entry section to avoid the reordering of operations

Note: memory barriers are considered very low-level operations and are typically only
used by kernel developers when writing specialized code that ensures mutual exclusion.

37
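The slide's flag/x example can be sketched in C using atomic_thread_fence() in the role of the memory_barrier() operation (the function names and test harness are ours; the fences plus the atomic flag establish the required ordering):

```c
#include <pthread.h>
#include <stdatomic.h>
#include <stdbool.h>

static int         x;        /* plain shared data */
static atomic_bool flag;

static void *thread2(void *arg) {              /* Thread 2 from the slide */
    (void)arg;
    x = 100;
    atomic_thread_fence(memory_order_seq_cst); /* memory_barrier(); */
    atomic_store_explicit(&flag, true, memory_order_relaxed);
    return NULL;
}

static int thread1_read(void) {                /* Thread 1 from the slide */
    while (!atomic_load_explicit(&flag, memory_order_relaxed))
        ;                                      /* while (!flag); */
    atomic_thread_fence(memory_order_seq_cst); /* memory_barrier(); */
    return x;                                  /* now guaranteed to see 100 */
}

int run_fence_demo(void) {
    pthread_t t;
    x = 0;
    atomic_store(&flag, false);
    pthread_create(&t, NULL, thread2, NULL);
    int v = thread1_read();
    pthread_join(t, NULL);
    return v;
}
```

Because each fence orders the surrounding accesses, the reader can never observe flag == true while x is still 0.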
6.4 H/W sync support – Instructions
• Modern computer systems provide special hardware
instructions that:
  • Test and modify the content of a word, or
  • Swap the contents of two words, atomically - as one uninterruptible unit
• We can use these special instructions to solve the critical-section problem in a relatively simple manner
• The main abstractions describing them are the:
  • test_and_set() and compare_and_swap() instructions

38
6.4 H/W sync support – Instructions - test_and_set()
• It is executed atomically
• If two test_and_set() instructions are executed simultaneously
(each on a different core), they will be executed sequentially in
some arbitrary order.
• Mutual exclusion can be implemented by declaring a boolean variable lock, initialized to false; the structure of process Ti is:

// Instruction definition
boolean test_and_set(boolean *target){
    boolean rv = *target;
    *target = true;
    return rv;
}

// Structure of process Ti
while (true){
    while (test_and_set(&lock)); // BW

    /* critical section */

    lock = false;

    /* remainder section */
}

Cons:
• Still uses busy waiting
• When Ti leaves its CS, the selection of which Tj enters next is arbitrary: no bounded (timed) waiting -> starvation is possible 39
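A working lock in this style can be sketched with C11's atomic_exchange(), which the hardware executes as one uninterruptible unit (e.g. XCHG on x86); the helper names and the counter demo are ours:

```c
#include <pthread.h>
#include <stdatomic.h>
#include <stdbool.h>

static atomic_bool lock_flag;        /* false = lock available */
static long        counter;

static bool test_and_set(atomic_bool *target) {
    return atomic_exchange(target, true);  /* set true, return old value */
}

static void *worker(void *arg) {
    (void)arg;
    for (int k = 0; k < 100000; k++) {
        while (test_and_set(&lock_flag))
            ;                              /* BW until old value was false */
        counter++;                         /* critical section */
        atomic_store(&lock_flag, false);   /* lock = false; */
    }
    return NULL;
}

long run_tas_demo(void) {
    pthread_t t0, t1;
    counter = 0;
    atomic_store(&lock_flag, false);
    pthread_create(&t0, NULL, worker, NULL);
    pthread_create(&t1, NULL, worker, NULL);
    pthread_join(t0, NULL);
    pthread_join(t1, NULL);
    return counter;
}
```

Mutual exclusion holds, so both threads' increments survive: run_tas_demo() returns 200000.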
6.4 H/W sync support – Instructions - compare_and_swap() (CAS) (1)
• Very similar to test_and_set(): atomic, provides mutual exclusion…
• The mechanism is based on swapping the contents of two words
• Operates on three operands: value (the lock), expected, and new_value
• The definition of the atomic CAS instruction:

int compare_and_swap(int *value, int expected, int new_value) {
    int temp = *value;

    if (*value == expected)
        *value = new_value;

    return temp; // always returns the original value of the lock (*value)
}

Cons:
• Still uses busy waiting (see next slide)
• It does not satisfy the bounded-waiting requirement 40


6.4 H/W sync support – Instructions - compare_and_swap() (CAS) (2)
• Mutual exclusion can be implemented by declaring a variable lock, initialized to 0; the structure of process Ti is:
• The first process that invokes CAS will set lock to 1.
• It will then enter its critical section, because the original value of lock was equal to the expected value of 0.
• Subsequent calls to CAS will not succeed, because lock is no longer equal to the expected value of 0.
• When a process exits its critical section, it sets lock back to 0, which allows another process to enter its critical section.

while (true) {
    while (compare_and_swap(&lock, 0, 1) != 0); // BW
           /* &lock = value, 0 = expected, 1 = new_value */

    /* critical section */

    lock = 0;

    /* remainder section */
}
41
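A runnable sketch of this lock can be built on C11's atomic_compare_exchange_strong(). On success, expected keeps the original value; on failure it is overwritten with the current value, so returning it matches the slide's "always returns the original value" contract (the demo names are ours):

```c
#include <pthread.h>
#include <stdatomic.h>

static atomic_int lock_word;         /* 0 = free, 1 = held */
static long       counter;

static int compare_and_swap(atomic_int *value, int expected, int new_value) {
    atomic_compare_exchange_strong(value, &expected, new_value);
    return expected;                 /* the original value of *value */
}

static void *worker(void *arg) {
    (void)arg;
    for (int k = 0; k < 100000; k++) {
        while (compare_and_swap(&lock_word, 0, 1) != 0)
            ;                        /* BW: lock was not 0 */
        counter++;                   /* critical section */
        atomic_store(&lock_word, 0); /* lock = 0; */
    }
    return NULL;
}

long run_cas_demo(void) {
    pthread_t t0, t1;
    counter = 0;
    atomic_store(&lock_word, 0);
    pthread_create(&t0, NULL, worker, NULL);
    pthread_create(&t1, NULL, worker, NULL);
    pthread_join(t0, NULL);
    pthread_join(t1, NULL);
    return counter;
}
```

As with test_and_set(), no increments are lost: run_cas_demo() returns 200000.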
6.4 H/W sync support – Instructions – CAS (cmpxchg on Intel x86)
• On Intel x86 architectures, the assembly language instruction cmpxchg is used to implement the compare_and_swap() instruction
• The general form of this instruction appears as:

    lock cmpxchg <destination operand>, <source operand>

• To enforce atomic execution, the lock prefix is used to lock the bus while the destination operand is being updated
• It interchanges the contents of the source operand and the destination operand atomically

42
6.4 H/W sync support – Instructions - CAS with waiting (1)
• An algorithm using CAS that satisfies all the critical-section requirements.
• The common data structures are:

    boolean waiting[n]; // initialized to false
    int lock;           // initialized to 0

• Ti can enter its critical section only if either:
  • waiting[i] == false
  • or key == 0
• The value of key can become 0 only if compare_and_swap() is executed; all other tasks must wait.
• The variable waiting[i] can become false only if another process leaves its critical section
• Only one waiting[i] is set to false, maintaining the mutual-exclusion requirement
43
6.4 H/W sync support – Instructions - CAS with waiting (2)
while (true) {
    waiting[i] = true;
    key = 1;
    while (waiting[i] && key == 1)
        key = compare_and_swap(&lock, 0, 1);
    waiting[i] = false;

    /* critical section */

    j = (i + 1) % n;
    while ((j != i) && !waiting[j])
        j = (j + 1) % n;

    if (j == i)
        lock = 0;
    else
        waiting[j] = false;

    /* remainder section */
}

• Mutual exclusion maintained:
  • A task exiting the CS either sets lock to 0 or sets waiting[j] to false; both allow a waiting task to proceed into its critical section
• Bounded-waiting requirement fulfilled:
  • When a task leaves its critical section, it scans the waiting array in cyclic order (i+1, i+2, …, n−1, 0, …, i−1).
  • It designates the first task in this ordering that is in its entry section (waiting[j] == true) as the next one to enter the CS.
  • Any task waiting to enter its critical section will thus do so within n−1 turns (no starvation). 44
6.4 H/W sync support – Instructions - Atomic variables (AV)
• An AV is a tool for solving the CSP (critical-section problem).
• Provides atomic operations on basic data types such as int and bool.
• For instance, incrementing/decrementing an int counter can cause a race! (see 6.1)
• AVs ensure mutual exclusion on a shared variable (race avoidance)
• Special atomic data types and functions for accessing and manipulating atomic variables exist only on supporting OSes, e.g. atomic_int *v
• Often implemented using compare_and_swap() operations (see textbook)
• Limitation: their use is often limited to single updates of shared data such as counters and sequence generators.
• They do not entirely solve race conditions in all circumstances
  • Example: in producer/consumer tasks, when the buffer is empty and two consumers are waiting for count > 0: if a producer increments count, both consumers could proceed to consume, even though the value of count was only set to 1.
45
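In C11 these are the stdatomic types; atomic_fetch_add() makes each increment one indivisible read-modify-write, so the counter race from 6.1 disappears (the demo function is ours):

```c
#include <pthread.h>
#include <stdatomic.h>

static atomic_int count;

static void *incrementer(void *arg) {
    (void)arg;
    for (int k = 0; k < 100000; k++)
        atomic_fetch_add(&count, 1);   /* atomic count++ */
    return NULL;
}

int run_atomic_demo(void) {
    pthread_t t0, t1;
    atomic_store(&count, 0);
    pthread_create(&t0, NULL, incrementer, NULL);
    pthread_create(&t1, NULL, incrementer, NULL);
    pthread_join(t0, NULL);
    pthread_join(t1, NULL);
    return atomic_load(&count);
}
```

With a plain int the result could be anything up to 200000; with the atomic variable it is exactly 200000.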
HLST - Instructions (System calls)
• The solutions seen so far are "ugly": difficult to use, and things can easily go wrong!
• The H/W-based solutions to the CSP (6.4) are complicated and generally inaccessible to application programmers
• We need an easier and more robust way to avoid common errors like deadlock, starvation, races, etc.
• Higher-level software (HLS) synchronization tools are
needed to solve CSP
• The methods that we are going to see use powerful
instructions, which are implemented by System calls to
the OS

46
6.5 HLST - Mutex locks (also called a spinlock )
• Mutex is short for mutual exclusion

• It is the simplest of these higher-level software synchronization tools

• Mutex lock protects CSs and thus prevent race conditions


• A task must acquire the lock before entering a CS; it releases the lock when it exits the CS.
  • The acquire() function acquires the lock
  • The release() function releases the lock

while (true){
    acquire lock

    critical section

    release lock

    remainder section
}

• On multicore systems, spinlocks can be the preferable choice for locking if a lock is to be held for a short duration
  • One thread can "spin" (BW) on one core while another thread performs its CS on another core.

47
6.5 HLST - Mutex locks (also called a spinlock )
• The definition of acquire():

    acquire() {
        while (!available); // BW
        available = false;
    }

• The definition of release():

    release() {
        available = true;
    }

• The boolean variable available is true only if the lock is available; otherwise it is false.
• A task calling acquire() can enter its CS only if the lock is available.
• If not available, then the calling task is blocked (BW) until the lock is released.
• Disadvantage:
  • It requires busy waiting
  • Other tasks that want to enter their CS must loop continuously in the call to acquire(). 48
6.6 HLST – Semaphores
• It is similar to a mutex lock (spinlock) but can also provide more sophisticated ways for tasks to synchronize their activities
• A semaphore S is an integer variable accessed only through two standard atomic operations:
• wait(), with the definition:

    wait(S) {
        while (S <= 0); // BW
        S--;
    } // atomic

• signal(), with the definition:

    signal(S) {
        S++;
    } // atomic

• The wait() and signal() operations on the int S must be executed atomically:
  • No other task can simultaneously modify the same semaphore value S.
  • The test (S <= 0) and its possible modification (S--) must not be interrupted.
49
6.6 HLST – Semaphores – The idea!

• Assume a room with a limited capacity


• A guard at the door counts the people who enter, and only allows entry if the room is not full
• The guard initializes a counter to the maximum number of people
• The guard subtracts 1 from the counter when he lets someone in
• The guard adds 1 to the counter when someone leaves the room
• The guard doesn't let anyone in when counter = 0 (room full)
• If we want only 1 person at a time in the room, we simply initialize the counter to 1.

50
6.6 HLST – Semaphores usage – binary S

• The value of a binary semaphore S can range only between 0 and 1.
• Thus, it behaves similarly to a mutex lock.
• In fact, a binary semaphore is a counting semaphore whose value S is restricted to 0 and 1.
• Binary semaphores can be used for mutual exclusion instead of mutex locks on systems that do not provide mutex locks.

51
6.6 HLST – Spinlock : binary semaphores – example 1

• Initialize S = 1;

Thread T1: Thread T2:


while(true){ while(true){
wait(S); wait(S);
CritSect CritSect
signal(S); signal(S);
nonCritSect nonCritSect
} }

• The semaphore can easily be generalized to allow n threads at once (initialize S = n;).

52
6.6 HLST – Semaphores usage – Counting S
• Counting semaphores control access to a finite number of instances.
• S can range over an unrestricted domain

• S is initialized to the number of resources available (positive int).


• Each task that wishes to use a resource performs a wait() operation on S
(thereby decrementing the count (S--)).
• When a process releases a resource, it performs a signal() operation
(incrementing the count (S++)).
• When the count for S goes to 0, all resources are being used.

• After that, processes that wish to use a resource will block


(while (S <= 0);// BW) until the count becomes greater than 0.
wait(S) {
    while (S <= 0); // BW
    S--;
} // atomic

signal(S) {
    S++;
} // atomic
53
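On systems with POSIX unnamed semaphores (e.g. Linux), this resource-counting behavior can be sketched directly; sem_trywait() acts as a non-blocking wait(S), succeeding while instances remain and failing once S reaches 0 (the function name is ours):

```c
#include <semaphore.h>

/* Acquire instances until the semaphore would block; return how many. */
int acquire_all(unsigned instances) {
    sem_t s;
    sem_init(&s, 0, instances);    /* S = number of available instances */
    int acquired = 0;
    while (sem_trywait(&s) == 0)   /* wait(S) without blocking */
        acquired++;
    sem_destroy(&s);
    return acquired;               /* further wait(S) calls would block */
}
```

A semaphore initialized to 3 hands out exactly 3 instances before refusing: acquire_all(3) returns 3.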
6.6 HLST – Semaphores usage – Example 2 - synchronization problem
• Consider two concurrently running threads:
• T1 with a statement 𝑆1 and T2 with a statement 𝑆2.
• Suppose we require that 𝑆2 be executed only after 𝑆1 has completed.
• We can implement this scheme readily by letting T1 and T2 share a common semaphore synch, initialized to 0.
• In T1, we insert the statements:
    S1;
    signal(synch);
• In T2, we insert the statements:
    wait(synch);
    S2;
• While T1 runs S1, T2 busy-waits in the while loop.
• After S1 has been executed, T1 invokes signal(synch) -> synch increments to 1 (synch++), releasing T2.
• Now S2 can execute.

wait(synch) {
    while (synch <= 0); // BW
    synch--;
}

signal(synch) {
    synch++;
} 54
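The S1-before-S2 scheme can be sketched with a POSIX semaphore initialized to 0; T2 cannot record its step until T1 posts, even though T2 is started first (the recording array and function names are ours, for illustration):

```c
#include <pthread.h>
#include <semaphore.h>

static sem_t synch;
static int   order[2];
static int   pos;

static void *t1_run(void *arg) {
    (void)arg;
    order[pos++] = 1;        /* S1 */
    sem_post(&synch);        /* signal(synch) */
    return NULL;
}

static void *t2_run(void *arg) {
    (void)arg;
    sem_wait(&synch);        /* wait(synch): blocks until T1 signals */
    order[pos++] = 2;        /* S2 */
    return NULL;
}

int run_sync_demo(void) {
    pthread_t a, b;
    pos = 0;
    sem_init(&synch, 0, 0);  /* synch = 0: S2 must wait for S1 */
    pthread_create(&b, NULL, t2_run, NULL);  /* start T2 first on purpose */
    pthread_create(&a, NULL, t1_run, NULL);
    pthread_join(a, NULL);
    pthread_join(b, NULL);
    sem_destroy(&synch);
    return order[0] * 10 + order[1];   /* 12 means S1 ran before S2 */
}
```

The semaphore forces the ordering regardless of thread start order: run_sync_demo() returns 12.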
6.6 HLST – Possible starvation and deadlock with semaphore spinlocks
• A thread may never be able to execute because it never tests the semaphore at the right time (starvation)
• Suppose S and Q are initialized to 1:
  • Both will be 0 at the second wait() (stuck in BW)
  • Neither thread can move forward (deadlock).

    T0         T1
    wait(S)    wait(Q)
    wait(Q)    wait(S)
55
6.6 HLST – Semaphore - observation

• When S >= 0:
  • The number of threads that can run wait(S) without BW = S
  • S threads can enter the CS
  • More powerful than the other sync mechanisms we have already seen
  • In solutions where S > 1, a 2nd semaphore is needed to control thread entry into the CS one at a time (mutual exclusion)
• When S becomes greater than one, the first thread to enter the CS is the first to test S - a random choice -> possible starvation
• This will no longer be true in the next solution

56
6.6 HLST – Semaphore implementation with no BW (1)

Implementation without busy waiting (actually, reduced BW!)

• The mutex and the semaphore presented in the previous slides both suffer from busy waiting!
• To overcome the BW problem, we can modify the definition of the wait() and signal() operations

57
6.6 HLST – Semaphore implementation with no BW (2)

• To overcome the BW problem, the definition of wait() is modified as follows:
  • Instead of busy waiting in wait(), the process can suspend itself.
  • Suspending places the process into a waiting queue associated with the semaphore
  • The state of the process is then switched to the waiting state.
  • Control is then transferred to the CPU scheduler, which selects another process to execute.
• Now, how and when do we restart the suspended process?
  • next slide

58
6.6 HLST – Semaphore implementation with no BW (3)

• A process that is suspended, waiting on a semaphore S, should be


restarted when some other process executes a signal() operation.
• The process is restarted by a wakeup() operation, which changes the
process from the waiting state to the ready state.
• The process is then placed in the ready queue. (The CPU may or may
not be switched immediately from the running process to the newly
ready process, depending on the CPU-scheduling algorithm).
• Now, how to implement semaphores under this new definition?
• next slide

59
6.6 HLST – Semaphore implementation with no BW (4)
• To implement semaphores under this new definition, we define a semaphore as a new type as follows (somewhat like a class in Java):

typedef struct {
    int value;
    struct process *list;
} semaphore;

• A semaphore-typed variable has 2 members: an int value and a list of processes
• When a process must wait on a semaphore, it is added to the list of processes.
• A signal() operation removes one process from the list of waiting processes and awakens that process.
• Now, what are the new definitions of wait() and signal()?
  • next slide
60
6.6 HLST – Semaphore implementation with no BW (5)
• Definition of the wait() semaphore operation:

wait(semaphore *S){
    S->value--;                      // decrement value
    if (S->value < 0) {
        add this process to S->list;
        sleep();
    }
}

• S is a pointer to a semaphore-typed variable; S->value accesses its value member
• Notice: sleep() replaces the previous BW (while loop). But here it is placed after value--, so value can become negative! Why? (next slide)

• Definition of the signal() semaphore operation:

signal(semaphore *S) {
    S->value++;
    if (S->value <= 0) {
        remove a process P from S->list;
        wakeup(P);
    }
}

• sleep() suspends the process that invokes it.
• wakeup(P) resumes the execution of a suspended process P.
• These two operations are provided by the OS as basic system calls.
61
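In user space, this structure can be sketched with a pthread mutex and condition variable playing the roles of the atomic section and of S->list + sleep()/wakeup(). Note this is a variant, not the slide's exact definition: it keeps value nonnegative and uses a while loop, which guards against spurious wakeups, whereas the slide's version lets value go negative to record the queue length (names with trailing underscores are ours, to avoid clashing with the C library):

```c
#include <pthread.h>

typedef struct {
    int value;
    pthread_mutex_t m;    /* makes wait()/signal() atomic */
    pthread_cond_t  cv;   /* plays the role of S->list + sleep()/wakeup() */
} semaphore;

void sem_init_(semaphore *S, int value) {
    S->value = value;
    pthread_mutex_init(&S->m, NULL);
    pthread_cond_init(&S->cv, NULL);
}

void wait_(semaphore *S) {
    pthread_mutex_lock(&S->m);
    while (S->value <= 0)                 /* would block: sleep, no BW */
        pthread_cond_wait(&S->cv, &S->m); /* releases m while sleeping */
    S->value--;
    pthread_mutex_unlock(&S->m);
}

void signal_(semaphore *S) {
    pthread_mutex_lock(&S->m);
    S->value++;
    pthread_cond_signal(&S->cv);          /* wakeup(P) for one waiter */
    pthread_mutex_unlock(&S->m);
}
```

pthread_cond_wait() atomically releases the mutex and puts the caller to sleep, which is exactly the suspend-into-a-queue behavior the slides describe.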
6.6 HLST – Semaphore implementation with no BW (6) -Discussion
• Under this new definition, if a semaphore value is negative, its magnitude is the number of processes waiting on that semaphore.
• The operations wait() and signal() must be executed atomically (by only one task at a time)
  • On a single core, this can be achieved by inhibiting interrupts while a task is performing these operations.
  • On a multicore (SMP) system, interrupts must be disabled on every processing core; this is not an easy task and performance can seriously diminish
  • Alternative techniques, such as compare_and_swap() or spinlocks (which have BW implemented in them!!), can ensure that wait() and signal() are performed atomically
• It is important to admit that we have not completely eliminated BW: we have limited BW to the critical sections of the wait() and signal() operations, and these sections are short.
  • Thus, this CS is almost never occupied, busy waiting occurs rarely, and then only for a short time… while BW in a task's own CS may be long (minutes or hours) 62
