Lec07 Synchronization

CS194-24
Advanced Operating Systems Structures and Implementation
Lecture 7

How to work in a group / Synchronization Review

February 20th, 2013
Prof. John Kubiatowicz
http://inst.eecs.berkeley.edu/~cs194-24

Goals for Today
• Tips for Programming in a Design Team
• Synchronization
• Scheduling

Interactive is important! Ask Questions!

Note: Some slides and/or pictures in the following are adapted from slides ©2013
2/20/13 Kubiatowicz CS194-24 ©UCB Fall 2013 Lec 7.2

Recall: Example of fork( )

#include <stdio.h>
#include <unistd.h>

int main(int argc, char **argv)
{
  char *name = argv[0];
  int child_pid = fork();  /* 0 in the child, the child's PID in the parent */
  if (child_pid == 0) {
    printf("Child of %s sees PID of %d\n", name, child_pid);
    return 0;
  } else {
    printf("I am the parent %s. My child is %d\n", name, child_pid);
    return 0;
  }
}
_______________________________
% ./forktest
Child of forktest sees PID of 0
I am the parent forktest. My child is 486

Recall: Thread Pools
• Problem with previous version: Unbounded Threads
  – When web-site becomes too popular – throughput sinks
• Instead, allocate a bounded “pool” of worker threads, representing the maximum level of multiprogramming

[Figure: Master Thread, queue, Thread Pool]

master() {
  allocThreads(worker,queue);
  while(TRUE) {
    con=AcceptCon();
    Enqueue(queue,con);
    wakeUp(queue);
  }
}

worker(queue) {
  while(TRUE) {
    con=Dequeue(queue);
    if (con==null)
      sleepOn(queue);
    else
      ServiceWebPage(con);
  }
}
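
The master/worker pseudocode on the Thread Pools slide can be sketched with POSIX threads. This is only an illustration under stated assumptions: connections are plain integers, "servicing" one just adds it to a running total, and the names POOL_SIZE, QUEUE_CAP, enqueue, and master_demo are hypothetical. The sleepOn/wakeUp pair is modeled with a mutex and a condition variable, so the emptiness check and the sleep happen under a single lock.

```c
#include <assert.h>
#include <pthread.h>

#define POOL_SIZE 4    /* bounded pool: the maximum level of multiprogramming */
#define QUEUE_CAP 64   /* hypothetical fixed queue capacity */

static int queue[QUEUE_CAP];
static int head, tail, count, shutting_down;
static long serviced_sum;   /* stand-in for the effect of ServiceWebPage() */
static pthread_mutex_t qlock = PTHREAD_MUTEX_INITIALIZER;
static pthread_cond_t nonempty = PTHREAD_COND_INITIALIZER;

/* Enqueue(queue, con) followed by wakeUp(queue). */
static void enqueue(int con) {
    pthread_mutex_lock(&qlock);
    assert(count < QUEUE_CAP);
    queue[tail] = con;
    tail = (tail + 1) % QUEUE_CAP;
    count++;
    pthread_cond_signal(&nonempty);
    pthread_mutex_unlock(&qlock);
}

static void *worker(void *unused) {
    (void)unused;
    for (;;) {
        pthread_mutex_lock(&qlock);
        /* sleepOn(queue): check and sleep under one lock, so a wakeup
           cannot slip in between them. */
        while (count == 0 && !shutting_down)
            pthread_cond_wait(&nonempty, &qlock);
        if (count == 0) {            /* queue drained and shutdown requested */
            pthread_mutex_unlock(&qlock);
            return NULL;
        }
        int con = queue[head];       /* con = Dequeue(queue) */
        head = (head + 1) % QUEUE_CAP;
        count--;
        serviced_sum += con;         /* "service" it (under the lock for brevity) */
        pthread_mutex_unlock(&qlock);
    }
}

/* master(): start the pool, hand it nconns fake connections, shut down. */
long master_demo(int nconns) {
    pthread_t pool[POOL_SIZE];
    assert(nconns >= 0 && nconns <= QUEUE_CAP);
    head = tail = count = shutting_down = 0;
    serviced_sum = 0;
    for (int i = 0; i < POOL_SIZE; i++) {
        int rc = pthread_create(&pool[i], NULL, worker, NULL);
        assert(rc == 0);
        (void)rc;
    }
    for (int con = 1; con <= nconns; con++)
        enqueue(con);
    pthread_mutex_lock(&qlock);
    shutting_down = 1;
    pthread_cond_broadcast(&nonempty);
    pthread_mutex_unlock(&qlock);
    for (int i = 0; i < POOL_SIZE; i++)
        pthread_join(pool[i], NULL);
    return serviced_sum;
}
```

Calling master_demo(10) from a main function services connections 1 through 10 with at most four worker threads, however many connections arrive.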

Recall: Common Notions of Thread Creation
• cobegin/coend
    cobegin
      job1(a1);
      job2(a2);
    coend
  • Statements in block may run in parallel
  • cobegins may be nested
  • Scoped, so you cannot have a missing coend
• fork/join
    tid1 = fork(job1, a1);
    job2(a2);
    join tid1;
  • Forked procedure runs in parallel
  • Wait at join point if it’s not finished
• future
    v = future(job1(a1));
    … = …v…;
  • Future possibly evaluated in parallel
  • Attempt to use return value will wait
• forall
    forall(I from 1 to N)
      C[I] = A[I] + B[I]
    end
  • Separate thread launched for each iteration
  • Implicit join at end
• Threads expressed in the code may not turn into independent computations
  – Only create threads if processors idle
  – Example: Thread-stealing runtimes such as Cilk

Tips for Programming in a Project Team
• Big projects require more than one person (or long, long, long time)
  – Big OS: thousands of person-years!
• It’s very hard to make software project teams work correctly
  – Doesn’t seem to be as true of big construction projects
    » Empire State Building finished in one year: staging iron production thousands of miles away
    » Or the Hoover Dam: built towns to hold workers
  – Is it OK to miss deadlines?
    » We make it free (slip days)
    » Reality: they’re very expensive as time-to-market is one of the most important things!
[Cartoon caption: “You just have to get your synchronization right!”]
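
The fork/join notion from the thread-creation slide maps fairly directly onto POSIX threads: pthread_create plays the role of fork and pthread_join the role of join. The sketch below assumes a hypothetical job1 that sums 1..n, and stands in for job2 with a trivial local computation; neither comes from the lecture.

```c
#include <assert.h>
#include <pthread.h>

/* Hypothetical argument/result record for job1. */
struct job_arg {
    int n;
    long result;
};

/* job1: sums the integers 1..n into the result field. */
static void *job1(void *p) {
    struct job_arg *a = p;
    a->result = 0;
    for (int i = 1; i <= a->n; i++)
        a->result += i;
    return NULL;
}

/* fork/join: start job1 in parallel, do other work, wait at the join. */
long fork_join_demo(int n) {
    struct job_arg a1 = { .n = n, .result = 0 };
    pthread_t tid1;

    int rc = pthread_create(&tid1, NULL, job1, &a1);  /* tid1 = fork(job1, a1) */
    assert(rc == 0);

    long local = 2L * n;                              /* job2(a2) runs meanwhile */

    rc = pthread_join(tid1, NULL);                    /* join tid1 */
    assert(rc == 0);
    (void)rc;

    return a1.result + local;   /* safe: job1 has finished by this point */
}
```

Reading a1.result before the join would be a data race; the join is what makes the forked procedure's result visible to the parent.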

Big Projects
• What is a big project?
  – Time/work estimation is hard
  – Programmers are eternal optimists (it will only take two days)!
    » This is why we bug you about starting the project early
    » Had a grad student who used to say he just needed “10 minutes” to fix something. Two hours later…
• Can a project be efficiently partitioned?
  – Partitionable task decreases in time as you add people
  – But, if you require communication:
    » Time reaches a minimum bound
    » With complex interactions, time increases!
  – Mythical person-month problem:
    » You estimate how long a project will take
    » Starts to fall behind, so you add more people
    » Project takes even more time!

Techniques for Partitioning Tasks
• Functional
  – Person A implements threads, Person B implements semaphores, Person C implements locks…
  – Problem: Lots of communication across APIs
    » If B changes the API, A may need to make changes
    » Story: Large airline company spent $200 million on a new scheduling and booking system. Two teams “working together.” After two years, went to merge software. Failed! Interfaces had changed (documented, but no one noticed). Result: would cost another $200 million to fix.
• Task
  – Person A designs, Person B writes code, Person C tests
  – May be difficult to find right balance, but can focus on each person’s strengths (Theory vs systems hacker)
  – Since Debugging is hard, Microsoft has two testers for each programmer
• Most Berkeley project teams are functional, but people have had success with task-based divisions
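
The partitioning argument on the Big Projects slide (time falls as people are added, bottoms out, then rises once communication dominates) can be made concrete with a toy cost model. The model is an assumption, not something the lecture specifies: the work divides evenly among n people, and every one of the n(n-1)/2 pairs of people costs a fixed amount of coordination time. The function names are hypothetical.

```c
#include <assert.h>

/* Toy model: work/n of parallel effort, plus comm_cost for each of the
   n*(n-1)/2 pairs of people who must stay in sync. */
double completion_time(double work, double comm_cost, int people) {
    assert(people >= 1);
    double pairs = people * (people - 1) / 2.0;
    return work / people + comm_cost * pairs;
}

/* Smallest team size in 1..max_people that minimizes completion_time. */
int best_team_size(double work, double comm_cost, int max_people) {
    int best = 1;
    for (int n = 2; n <= max_people; n++) {
        if (completion_time(work, comm_cost, n) <
            completion_time(work, comm_cost, best))
            best = n;
    }
    return best;
}
```

With 100 units of work and 1 unit of cost per pair, the model gives 100, 51, 36.3, 31, 30, 31.7 for one through six people: the minimum is at five, and a ten-person team needs 55 units, worse than two people. With zero communication cost the time simply keeps falling.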

Communication
• More people mean more communication
  – Changes have to be propagated to more people
  – Think about person writing code for most fundamental component of system: everyone depends on them!
• Miscommunication is common
  – “Index starts at 0? I thought you said 1!”
• Who makes decisions?
  – Individual decisions are fast but trouble
  – Group decisions take time
  – Centralized decisions require a big picture view (someone who can be the “system architect”)
• Often designating someone as the system architect can be a good thing
  – Better not be clueless
  – Better have good people skills
  – Better let other people do work

Coordination
• More people ⇒ no one can make all meetings!
  – They miss decisions and associated discussion
  – Example from earlier class: one person missed meetings and did something group had rejected
  – Why do we limit groups to 5 people?
    » You would never be able to schedule meetings otherwise
  – Why do we require 4 people minimum?
    » You need to experience groups to get ready for real world
• People have different work styles
  – Some people work in the morning, some at night
  – How do you decide when to meet or work together?
• What about project slippage?
  – It will happen, guaranteed!
  – Ex: phase 4, everyone busy but not talking. One person way behind. No one knew until very end – too late!
• Hard to add people to existing group
  – Members have already figured out how to work together

How to Make it Work?
• People are human. Get over it.
  – People will make mistakes, miss meetings, miss deadlines, etc. You need to live with it and adapt
  – It is better to anticipate problems than clean up afterwards.
• Document, document, document
  – Why Document?
    » Expose decisions and communicate to others
    » Easier to spot mistakes early
    » Easier to estimate progress
  – What to document?
    » Everything (but don’t overwhelm people or no one will read)
  – Standardize!
    » One programming format: variable naming conventions, tab indents, etc.
    » Comments (Requires, effects, modifies) – javadoc?

Suggested Documents for You to Maintain
• Project objectives: goals, constraints, and priorities
• Specifications: the manual plus performance specs
  – This should be the first document generated and the last one finished
  – Consider your Cucumber specifications as one possibility
• Meeting notes
  – Document all decisions
  – You can often cut & paste for the design documents
• Schedule: What is your anticipated timing?
  – This document is critical!
• Organizational Chart
  – Who is responsible for what task?

Use Software Tools
• Source revision control software
  – Git – check in frequently. Tag working code
  – Easy to go back and see history/undo mistakes
  – Work on independent branches for each feature
    » Merge working features into your master branch
    » Consider using “rebase” as well
  – Communicates changes to everyone
• Redmine
  – Make use of Bug reporting and tracking features!
  – Use Wiki for communication between teams
  – Also, consider setting up a forum to leave information for one another about the current state of the design
• Use automated testing tools
  – Rebuild from sources frequently
  – Run Cucumber tests frequently
    » Use tagging features to run subsets of tests!

Test Continuously
• Integration tests all the time, not at 11pm on due date!
  – Utilize Cucumber features to test frequently!
  – Write dummy stubs with simple functionality
    » Lets people test continuously, but more work
  – Schedule periodic integration tests
    » Get everyone in the same room, check out code, build, and test.
    » Don’t wait until it is too late!
    » This is exactly what the autograder does!
• Testing types:
  – Integration tests: Use of Cucumber with BDD
  – Unit tests: check each module in isolation (CPP Unit)
  – Daemons: subject code to exceptional cases
  – Random testing: Subject code to random timing changes
• Test early, test later, test again
  – Tendency is to test once and forget; what if something changes in some other part of the code?

Administrivia
• How did the first design document go?
• How did design reviews go?
• Recall: Midterm I: Three weeks from today!
  – Wednesday 3/13
  – Intention is a 1.5 hour exam over 3 hours
  – No class on day of exam!
• Midterm Timing:
  – Listed as 6:00-9:00PM
  – Could also be: 4:00-7:00pm
  – Preferences? (I need to get a room for the exam)
• Topics: everything up to the previous Monday
  – OS Structure, BDD, Process support, Synchronization, Scheduling, Memory Management, I/O

Review: Synchronization problem with Threads
• One thread per transaction, each running:
    Deposit(acctId, amount) {
      acct = GetAccount(acctId); /* May use disk I/O */
      acct->balance += amount;
      StoreAccount(acct); /* Involves disk I/O */
    }
• Unfortunately, shared state can get corrupted:
    Thread 1                      Thread 2
    load r1, acct->balance
                                  load r1, acct->balance
                                  add r1, amount2
                                  store r1, acct->balance
    add r1, amount1
    store r1, acct->balance
• Atomic Operation: an operation that always runs to completion or not at all
  – It is indivisible: it cannot be stopped in the middle and state cannot be modified by someone else in the middle
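
One way to make the Deposit update behave atomically is to wrap the load/add/store sequence in a lock. The sketch below is a minimal illustration with a POSIX mutex; it assumes a hypothetical in-memory account (no disk I/O, unlike the slide's GetAccount/StoreAccount), and the names deposit, worker, and run_deposits are made up for the example.

```c
#include <assert.h>
#include <pthread.h>

#define MAX_THREADS 16

/* Hypothetical in-memory account with its own lock. */
struct account {
    long balance;
    pthread_mutex_t lock;
};

static void deposit(struct account *acct, long amount) {
    pthread_mutex_lock(&acct->lock);
    acct->balance += amount;   /* load, add, store: now one critical section */
    pthread_mutex_unlock(&acct->lock);
}

struct worker_arg {
    struct account *acct;
    int count;
};

/* Each worker deposits 1 unit, count times. */
static void *worker(void *p) {
    struct worker_arg *w = p;
    for (int i = 0; i < w->count; i++)
        deposit(w->acct, 1);
    return NULL;
}

/* Runs nthreads concurrent depositors and returns the final balance. */
long run_deposits(int nthreads, int per_thread) {
    struct account acct = { .balance = 0 };
    struct worker_arg arg = { &acct, per_thread };
    pthread_t tids[MAX_THREADS];

    assert(nthreads > 0 && nthreads <= MAX_THREADS);
    pthread_mutex_init(&acct.lock, NULL);
    for (int i = 0; i < nthreads; i++) {
        int rc = pthread_create(&tids[i], NULL, worker, &arg);
        assert(rc == 0);
        (void)rc;
    }
    for (int i = 0; i < nthreads; i++)
        pthread_join(tids[i], NULL);
    pthread_mutex_destroy(&acct.lock);
    return acct.balance;
}
```

With the lock in place, no deposit is lost to the interleaving shown on the slide. Removing the lock/unlock pair reintroduces the race, and the final balance may then come up short.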

Review: Ways of entering the kernel/changing the flow of control
• The Timer Interrupt:
  – Callbacks scheduled to be called when timer expires
  – Cause of scheduler events – change which process or thread is running
• System Calls
  – Controlled function call into kernel from user space
  – User-level code stops, kernel-level code starts
  – What about asynchronous system calls?
• Normal Interrupts
  – Entered via hardware signal
  – Typically Asynchronous to the instruction stream
  – Often structured in some sort of hierarchy (some interrupts higher priority than others)
• Exceptions:
  – Instruction execution fails for some reason
  – Typically Synchronous to the instruction stream

Interrupt Controller
[Figure: interrupt controller. Labels: Timer, Network, Software Interrupt, Interrupt Mask, Priority Encoder, IntID, Interrupt, Int Disable, CPU, Control, NMI]
• Interrupts invoked with interrupt lines from devices
• Interrupt controller chooses interrupt request to honor
  – Mask enables/disables interrupts
  – Priority encoder picks highest enabled interrupt
  – Software Interrupt Set/Cleared by Software
  – Interrupt identity specified with ID line
• CPU can disable all interrupts with internal flag
• Non-maskable interrupt line (NMI) can’t be disabled

Example: Network Interrupt
[Figure: an External Interrupt arrives during the instruction stream
    add  $r1,$r2,$r3
    subi $r4,$r1,#4
    slli $r4,$r4,#2
    (Pipeline Flush)
    lw   $r2,0($r4)
    lw   $r3,4($r4)
    add  $r2,$r2,$r3
    sw   8($r4),$r2
 “Interrupt Handler” steps: Raise priority; Reenable All Ints; Save registers; Dispatch to Handler; Transfer Network Packet from hardware to Kernel Buffers; Restore registers; Clear current Int; Disable All Ints; Restore priority; RTI]
• Disable/Enable All Ints ⇒ Internal CPU disable bit
  – RTI reenables interrupts, returns to user mode
• Raise/lower priority: change interrupt mask
• Software interrupts can be provided entirely in software at priority switching boundaries

Typical Linux Interfaces
• Disabling and Enabling Interrupts on the 2.6 Linux Kernel:
    local_irq_disable();
    /* interrupts are disabled ... */
    local_irq_enable();
  – These operations are often single assembly instructions
    » They only work for the local processor!
    » If competing with another processor, must use other form of synchronization
  – Dangerous if called when interrupts already disabled
    » Then, when your code reenables, you will change semantics
• Saving and restoring interrupt state first:
    unsigned long flags;
    local_irq_save(flags); // Save state
    /* Do whatever, including disable/enable */
    local_irq_restore(flags); // Restore
• State of the system
    in_interrupt(); // In handler or bottom half
    in_irq(); // Specifically in handler

Linux Interrupt control (Con’t)
• No more global cli()!
  – Used to be that cli()/sti() could be used to disable and enable interrupts on all processors
  – First deprecated (2.5), then removed (2.6)
    » Could serialize device drivers across all processors!
    » Just a bad idea
  – Better option?
    » Fine-grained spin-locks between processors (more later)
    » Local interrupt control for local processor
• Disabling specific interrupt (nestable)
    disable_irq(irq); // Wait for current handlers
    disable_irq_nosync(irq); // Don’t wait for current handler
    enable_irq(irq); // Reenable line
    synchronize_irq(irq); // Wait for current handler
  – Not great for buses with multiple interrupts per line, such as PCI! More when we get into device drivers.

Implementation of Locks by Disabling Interrupts?
• Key idea: maintain a lock variable and impose mutual exclusion only during operations on that variable

int value = FREE;

Acquire() {
  disable interrupts;
  if (value == BUSY) {
    put thread on wait queue;
    Go to sleep();
    // Enable interrupts?
  } else {
    value = BUSY;
  }
  enable interrupts;
}

Release() {
  disable interrupts;
  if (anyone on wait queue) {
    take thread off wait queue
    Place on ready queue;
  } else {
    value = FREE;
  }
  enable interrupts;
}
How to implement locks?

How to Re-enable After Sleep()?
• Recall from Nachos: ints disabled when you call sleep:
  – Responsibility of the next thread to re-enable ints
  – When the sleeping thread wakes up, returns to acquire and re-enables interrupts

    Thread A              Thread B
    .
    .
    disable ints
    sleep
                          sleep return
                          enable ints
                          .
                          .
                          disable int
                          sleep
    sleep return
    enable ints
    .
• What about other Operating systems?

Atomic Read-Modify-Write instructions
• Problem with previous solution?
  – Can’t let users disable interrupts! (Why?)
  – Doesn’t work well on multiprocessor
    » Disabling interrupts on all processors requires messages and would be very time consuming
• Alternative: atomic instruction sequences
  – These instructions read a value from memory and write a new value atomically
  – Hardware is responsible for implementing this correctly
    » on both uniprocessors (not too hard)
    » and multiprocessors (requires help from cache coherence protocol)
  – Unlike disabling interrupts, can be used on both uniprocessors and multiprocessors

Examples of Read-Modify-Write
• test&set (&address) { /* most architectures */
    result = M[address];
    M[address] = 1;
    return result;
  }
• swap (&address, register) { /* x86 */
    temp = M[address];
    M[address] = register;
    register = temp;
  }
• compare&swap (&address, reg1, reg2) { /* 68000 */
    if (reg1 == M[address]) {
      M[address] = reg2;
      return success;
    } else {
      return failure;
    }
  }
• load-linked&store conditional(&address) { /* R4000, alpha */
    loop:
      ll r1, M[address];
      movi r2, 1; /* Can do arbitrary comp */
      sc r2, M[address];
      beqz r2, loop;
  }

Implementing Locks with test&set: Spin Lock
• Another flawed, but simple solution:
    int value = 0; // Free
    Acquire() {
      while (test&set(value)); // while busy
    }
    Release() {
      value = 0;
    }
• Simple explanation:
  – If lock is free, test&set reads 0 and sets value=1, so lock is now busy. It returns 0 so while exits.
  – If lock is busy, test&set reads 1 and sets value=1 (no change). It returns 1, so while loop continues
  – When we set value = 0, someone else can get lock
• Better: test&test&set
• Busy-Waiting: thread consumes cycles while waiting
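
In portable C, the hardware primitives above are reachable through C11 <stdatomic.h>: atomic_exchange behaves like test&set (or swap) by writing a new value and returning the old one, and atomic_compare_exchange_strong corresponds to compare&swap. The following is a minimal sketch of the "test&test&set" spin lock the slide mentions, under the assumption that a simple shared counter stands in for the protected state; the names acquire, release, and spin_demo are hypothetical.

```c
#include <assert.h>
#include <pthread.h>
#include <stdatomic.h>
#include <stdbool.h>

#define MAX_THREADS 8

static atomic_bool value = false;   /* false = free, true = busy */
static long counter;                /* shared state guarded by the lock */

static void acquire(void) {
    for (;;) {
        /* "test": spin on a plain atomic load while the lock looks busy */
        while (atomic_load_explicit(&value, memory_order_relaxed)) { }
        /* "test&set": write true and get the previous value back */
        if (!atomic_exchange_explicit(&value, true, memory_order_acquire))
            return;                 /* previous value was false: lock is ours */
    }
}

static void release(void) {
    atomic_store_explicit(&value, false, memory_order_release);
}

static void *worker(void *arg) {
    int iterations = *(int *)arg;
    for (int i = 0; i < iterations; i++) {
        acquire();
        counter++;                  /* critical section */
        release();
    }
    return NULL;
}

/* Runs nthreads workers that each increment the counter under the lock. */
long spin_demo(int nthreads, int iterations) {
    pthread_t tids[MAX_THREADS];
    assert(nthreads > 0 && nthreads <= MAX_THREADS);
    counter = 0;
    for (int i = 0; i < nthreads; i++) {
        int rc = pthread_create(&tids[i], NULL, worker, &iterations);
        assert(rc == 0);
        (void)rc;
    }
    for (int i = 0; i < nthreads; i++)
        pthread_join(tids[i], NULL);
    return counter;
}
```

The inner read-only loop is the point of test&test&set: waiters spin on a load rather than repeatedly issuing the atomic write. The lock still busy-waits, so it only suits very short critical sections.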

Better Locks using test&set
• Can we build test&set locks without busy-waiting?
  – Can’t entirely, but can minimize!
  – Idea: only busy-wait to atomically check lock value

int guard = 0;
int value = FREE;

Acquire() {
  // Short busy-wait time
  while (test&set(guard));
  if (value == BUSY) {
    put thread on wait queue;
    go to sleep() & guard = 0;
  } else {
    value = BUSY;
    guard = 0;
  }
}

Release() {
  // Short busy-wait time
  while (test&set(guard));
  if anyone on wait queue {
    take thread off wait queue
    Place on ready queue;
  } else {
    value = FREE;
  }
  guard = 0;
}
• Note: sleep has to be sure to reset the guard variable
  – Why can’t we do it just before or just after the sleep?

Using Compare&Swap for queues
• compare&swap (&address, reg1, reg2) { /* 68000 */
    if (reg1 == M[address]) {
      M[address] = reg2;
      return success;
    } else {
      return failure;
    }
  }

Here is an atomic add to linked-list function:
addToQueue(&object) {
  do { // repeat until no conflict
    ld r1, M[root] // Get ptr to current head
    st r1, M[object] // Save link in new object
  } until (compare&swap(&root,r1,object));
}

[Figure: root → next → next, with the New Object’s next pointer linked in at the head of the list]
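
The addToQueue loop can be written with C11 atomics, where atomic_compare_exchange_weak plays the role of compare&swap. This is a sketch under assumptions: the node layout, the integer payload, and the helper names list_length and push_demo are invented for illustration, and the demo exercises the push from a single thread only.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stddef.h>

struct node {
    int payload;
    struct node *next;
};

static _Atomic(struct node *) root = NULL;   /* head of the list */

/* Atomically link object in at the head, retrying on conflict. */
void add_to_queue(struct node *object) {
    struct node *head = atomic_load(&root);   /* ld r1, M[root] */
    do {
        object->next = head;                  /* st r1, M[object] */
        /* If root changed since it was read, the exchange fails, stores
           the current root into head, and the loop retries with it. */
    } while (!atomic_compare_exchange_weak(&root, &head, object));
}

/* Counts the nodes currently reachable from root. */
size_t list_length(void) {
    size_t n = 0;
    for (struct node *p = atomic_load(&root); p != NULL; p = p->next)
        n++;
    return n;
}

/* Single-threaded walk-through: push n nodes (n <= 8), return the length. */
size_t push_demo(size_t n) {
    static struct node storage[8];
    assert(n <= 8);
    atomic_store(&root, NULL);
    for (size_t i = 0; i < n; i++) {
        storage[i].payload = (int)i;
        add_to_queue(&storage[i]);
    }
    return list_length();
}
```

Only insertion at the head is shown. Removing nodes with compare&swap is considerably harder to get right, because a node may be freed and reused between the read of root and the exchange.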

Higher-level Primitives than Locks
• What is the right abstraction for synchronizing threads that share memory?
  – Want as high a level primitive as possible
• Good primitives and practices important!
  – Since execution is not entirely sequential, really hard to find bugs, since they happen rarely
  – UNIX is pretty stable now, but up until about mid-80s (10 years after started), systems running UNIX would crash every week or so – concurrency bugs
• Synchronization is a way of coordinating multiple concurrent activities that are using shared state
  – This lecture and the next present a couple of ways of structuring the sharing

Recall: Semaphores
• Semaphores are a kind of generalized lock
  – First defined by Dijkstra in late 60s
  – Main synchronization primitive used in original UNIX
• Definition: a Semaphore has a non-negative integer value and supports the following two operations:
  – P(): an atomic operation that waits for semaphore to become positive, then decrements it by 1
    » Think of this as the wait() operation
  – V(): an atomic operation that increments the semaphore by 1, waking up a waiting P, if any
    » Think of this as the signal() operation
  – Note that P() stands for “proberen” (to test) and V() stands for “verhogen” (to increment) in Dutch

Semaphores Like Integers Except
• Semaphores are like integers, except
  – No negative values
  – Only operations allowed are P and V – can’t read or write value, except to set it initially
  – Operations must be atomic
    » Two P’s together can’t decrement value below zero
    » Similarly, thread going to sleep in P won’t miss wakeup from V – even if they both happen at same time
• Semaphore from railway analogy
  – Here is a semaphore initialized to 2 for resource control:
[Figure: railway semaphore shown with Value=2, Value=1, Value=0]

Two Uses of Semaphores
• Mutual Exclusion (initial value = 1)
  – Also called “Binary Semaphore”.
  – Can be used for mutual exclusion:
      semaphore.P();
      // Critical section goes here
      semaphore.V();
• Scheduling Constraints (initial value = 0)
  – Locks are fine for mutual exclusion, but what if you want a thread to wait for something?
  – Example: suppose you had to implement ThreadJoin which must wait for thread to terminate:
      Initial value of semaphore = 0
      ThreadJoin {
        semaphore.P();
      }
      ThreadFinish {
        semaphore.V();
      }
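
The ThreadJoin/ThreadFinish scheduling constraint can be tried out with POSIX semaphores, where sem_wait corresponds to P() and sem_post to V(). The sketch below assumes a Linux-style environment in which unnamed semaphores (sem_init) are available; the child's "work" (doubling an integer) and the names child and join_demo are hypothetical.

```c
#include <assert.h>
#include <pthread.h>
#include <semaphore.h>

static sem_t done;     /* scheduling constraint: initial value 0 */
static int result;     /* written by the child before it signals */

static void *child(void *arg) {
    result = *(int *)arg * 2;   /* the work the parent is waiting for */
    sem_post(&done);            /* ThreadFinish: V() */
    return NULL;
}

/* Starts a child, waits for it with P(), and returns what it produced. */
int join_demo(int input) {
    pthread_t tid;

    int rc = sem_init(&done, 0, 0);   /* initial value of semaphore = 0 */
    assert(rc == 0);
    rc = pthread_create(&tid, NULL, child, &input);
    assert(rc == 0);
    (void)rc;

    sem_wait(&done);            /* ThreadJoin: P() blocks until the child's V() */
    int seen = result;          /* safe to read: the V() happened after the write */

    pthread_join(tid, NULL);    /* reclaim the thread's resources */
    sem_destroy(&done);
    return seen;
}
```

Because the semaphore starts at 0, the parent's P() cannot complete until the child has executed V(), whichever thread runs first. Initializing the same semaphore to 1 instead would turn the P()/V() pair into the mutual-exclusion pattern from the slide.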
Summary (Synchronization)
• Important concept: Atomic Operations
– An operation that runs to completion or not at all
– These are the primitives on which to construct various
synchronization primitives
• Talked about hardware atomicity primitives:
– Disabling of Interrupts, test&set, swap, comp&swap,
load-linked/store conditional
• Showed several constructions of Locks
– Must be very careful not to waste/tie up machine
resources
» Shouldn’t disable interrupts for long
» Shouldn’t spin wait for long
– Key idea: Separate lock variable, use hardware
mechanisms to protect modifications of that variable
• Started talking about higher level constructs that are
harder to “screw up”