
Survey of Race Condition Analysis Techniques

Team Extremely Awesome


Nels Beckman

Project Presentation
17-654: Analysis of Software Artifacts

Analysis of Software Artifacts - 1


Spring 2006
A Goal-Based Literature Search
• This semester we explored many
fundamental styles of software analysis.
• How might each one be applied to the
same goal?
• (Finding race conditions)
• Purpose:
• Analyze strengths of different analysis
styles normalized to one defect type.
• See how you might decide amongst
different techniques on a real project.

What is a Race Condition?
• One Definition:
• “A race occurs when two threads can
access (read or write) a data variable
simultaneously and at least one of the two
accesses is a write.” (Henzinger 04)
• Note:
• Locks not specifically mentioned.

Why Race Conditions?
• Race conditions are insidious bugs:
• Can corrupt memory.
• Often not detected until later in execution.
• Appearance is non-deterministic.
• Difficult to reason about the interaction of
multiple threads.
• My intuition?
• It should be relatively easy to ensure that I
am at least locking properly.

But First: Locking Discipline
• Mutual Exclusion Locking Discipline (MLD)
• A programming discipline that will ensure an
absence of race conditions.
• Requires a lock be held on every access to
a shared variable.
• Not the only way to achieve freedom
from races!
• See example, next slide.
• Some tools check MLD, not race safety.
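
A minimal sketch of the locking discipline described above, in Python. The Counter class and the run helper are hypothetical names introduced only for illustration. Every access to the shared value happens while holding the same lock, so the final count does not depend on how the threads interleave.

```python
import threading


class Counter:
    """A shared counter whose value is only touched while holding its lock."""

    def __init__(self):
        self._lock = threading.Lock()
        self._value = 0

    def increment(self):
        # The lock is held for the whole read-modify-write of the shared value.
        with self._lock:
            self._value += 1

    def value(self):
        # Reads take the lock too: the discipline covers every access.
        with self._lock:
            return self._value


def run(num_threads=4, increments=10_000):
    """Start num_threads workers that each increment one shared counter."""
    counter = Counter()

    def worker():
        for _ in range(increments):
            counter.increment()

    threads = [threading.Thread(target=worker) for _ in range(num_threads)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return counter.value()
```

Calling run(4, 1000) should return 4000 on every run, because no increment can be lost while the lock is held.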

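Example: (Yu '05)

Thread t        Thread u        Thread v
t:Fork(u)
t:Lock(a)       u:Lock(a)
t:Write(x)      u:Write(x)
t:Unlock(a)     u:Unlock(a)
t:Join(u)
t:Write(x)
t:Fork(v)
t:Lock(a)                       v:Lock(a)
t:Write(x)                      v:Write(x)
t:Unlock(a)                     v:Unlock(a)
t:Join(v)

• The unlocked t:Write(x) between Join(u) and Fork(v) violates the
locking discipline, but it cannot race: t is the only thread running.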

Four Broad Analysis Types
• Type-Based Race Prevention
• Languages that cannot express “racy”
programs.
• Dynamic Race Detectors
• Using instrumented code to detect races.
• Model-Checkers
• Searching for reachable race states.
• Flow-Based Race Detectors
• Of the style seen in this course.

Dimensions of Comparison
• Ease of Use
• Annotations
• What burden does annotating the code impose?
• Expression
• Does the tool restrict my ability to say what I want?
• Scalability
• Could this tool legitimately claim to work on a large code
base?
• Soundness
• What level of assurance is provided?
• Precision
• Can I have confidence in the results?

Type-Based Race Prevention
• Goal:
• To prevent race conditions using the
language itself.
• Method:
• Encode locking discipline into language.
• Relate shared state to the locks that
protect it.
• Use typing annotations.
• Recall ownership types; this will seem
familiar.

Example: Race-Free Cyclone
• To give a better feel, let's look at
Cyclone.
• Other type-based systems are very similar.

Example: Race-Free Cyclone
• Things we want to express:
• “This lock protects this variable.”

int*l p1 = new 42;
int*loc p2 = new 43;

• The first line declares a variable of type “an integer
protected by the lock named l.”
• loc is a special lock name. It means this
variable is never shared.

Example: Race-Free Cyclone
• Things we want to express:
• “This is a new lock.”

let lk<l> = newlock();

• lk is the variable name.
• l is the lock type name.

Example: Race-Free Cyclone
• Things we want to express:
• “This function should only be called when in
possession of this lock.”

void inc<l:LU>(int*l p;{l}) {
  // blah blah
}

• <l:LU>: this can be ignored for now...
• int*l p: when passed an int whose protection lock
is l...
• ;{l}: the caller must already possess lock l...

Example: Race-Free Cyclone
void inc<l:LU>(int*l p;{l}) {
  *p = *p + 1;
}
void inc2<l:LU>(lock_t<l> plk, int*l p;{}) {
  sync(plk) { inc(p); }
}
void f(;{}) {
  let lk<l> = newlock();
  int*l p1 = new 42;
  int*loc p2 = new 43;
  spawn(g);
  inc2(lk, p1);
  inc2(nonlock, p2);
}

• It would be a type error to call inc without
possessing the lock for the first argument.
• Imagine if the effects clause of inc were empty...

void inc<l:LU>(int*l p;{}) {
  *p = *p + 1;
}

• A dereference would also signal a compiler error,
since it is unprotected.

Type-Based Race Prevention
• Positives:
• Soundness
• Programs are race-free by construction.
• Familiarity
• Languages are usually based on well-known languages.
• Locking discipline is a very common paradigm.
• Relatively Expressive
• These type systems have been integrated with
polymorphism, object migration.
• Classes can be parameterized by different locks
• Types Can Often be Inferred
• Intra-procedural (thanks to effects clauses)

Type-Based Race Prevention
• Negatives:
• Restrictive:
• Not all race-free programs are legal.
• e.g. Object initialization, other forms of
synchronization (fork/join, etc.).
• Annotation Burden:
• Lots of annotations to write, even for non-shared data.
• Especially to make more complicated features,
like polymorphism, work.
• Another Language

Type-Based Race Prevention
• Open Research Questions:
• Reduce Restrictions as Much as Possible
• Initialization phase
• Subclassing without run-time checks in OO
• Encoding of thread starts and stops
• Remove annotations for non-threaded code

Type-Based Race Prevention
• Open Research Questions:
• Personally, sceptical that inference can
improve a whole lot.
• Programmer intent still must be specified
somehow in locking discipline.
• But escape analysis could infer thread-locals.

Dynamic Race Detectors
• Find race conditions by:
• Instrumenting the source code.
• Running lockset and happens-before
analyses.
• Lockset has no false negatives.
• Happens-before has no false positives.
• In the slides that follow, we will play the
part of the instrumented code.
• We see all (inside the program)!

Lockset Analysis
• Imagine we’re watching the program
execute…

...
marbury = 5;
madison = 5;
makeStuffHappen();
...

Lockset Analysis
• Whenever a lock is acquired, add that to
the set of “held locks.”

...
roe = 5;
wade = 5;
synchronize(my_object) {
...

Held Locks: my_object (0x34EFF0)

Lockset Analysis
• Likewise, remove locks when they are
released.

...
brown = 43;
board = “yes”;
} // end synch
...

Held Locks: (none)

Lockset Analysis
• The first time a variable is accessed, set its
“candidate set” to be the set of held locks.

...
rob_frost = false;
...

Held Locks: lock1 (0xFFFF01), lock2 (0xFFFF08)
Candidate Set for rob_frost: (0xFFFF01), (0xFFFF08)

Lockset Analysis
• The next time that variable is accessed, take
the intersection of the candidate set and the
set of currently held locks…
• If the intersection is empty, flag a
potential race condition!

...
if(!rob_frost) {
...

Candidate Set for rob_frost: (0xFFFF01), (0xFFFF08)
Held Locks: lock1 (0xABFF44)
Candidate Set ∩ Held Locks: empty, so a potential race is flagged.
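
The lockset steps above can be sketched in a few lines of Python. This is an illustrative sketch, not the implementation of any tool surveyed here: the LocksetMonitor class, the thread names t1 and t2, and reporting each variable at most once are all assumptions. The second replay feeds in the fork/join example (Yu '05) from earlier in the deck.

```python
class LocksetMonitor:
    """Tracks held locks per thread and a candidate lock set per variable."""

    def __init__(self):
        self.held = {}          # thread name -> set of locks currently held
        self.candidates = {}    # variable name -> candidate set of locks
        self.warnings = []      # (thread, variable) pairs flagged as potential races
        self._reported = set()  # variables already flagged (assumption: report once)

    def acquire(self, thread, lock):
        self.held.setdefault(thread, set()).add(lock)

    def release(self, thread, lock):
        self.held.setdefault(thread, set()).discard(lock)

    def access(self, thread, var):
        held = self.held.get(thread, set())
        if var not in self.candidates:
            # First access: the candidate set starts as the set of held locks.
            self.candidates[var] = set(held)
            return
        # Later accesses: intersect the candidate set with the held locks.
        self.candidates[var] &= held
        if not self.candidates[var] and var not in self._reported:
            self._reported.add(var)
            self.warnings.append((thread, var))


def replay_slides():
    """The rob_frost walkthrough; thread names t1 and t2 are hypothetical."""
    m = LocksetMonitor()
    m.acquire("t1", 0xFFFF01)
    m.acquire("t1", 0xFFFF08)
    m.access("t1", "rob_frost")   # candidate set = {0xFFFF01, 0xFFFF08}
    m.release("t1", 0xFFFF08)
    m.release("t1", 0xFFFF01)
    m.acquire("t2", 0xABFF44)
    m.access("t2", "rob_frost")   # intersection is empty, so this is flagged
    return m.warnings


def replay_fork_join_example():
    """The fork/join trace: the unlocked write after Join(u) gets flagged."""
    m = LocksetMonitor()
    for thread in ("t", "u"):
        m.acquire(thread, "a")
        m.access(thread, "x")
        m.release(thread, "a")
    m.access("t", "x")            # after Join(u); no lock is held
    for thread in ("t", "v"):
        m.acquire(thread, "a")
        m.access(thread, "x")
        m.release(thread, "a")
    return m.warnings
```

In this sketch replay_slides() flags rob_frost, and replay_fork_join_example() flags x even though that trace has no real race. Lockset checks the locking discipline, not race freedom.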

Happens-Before Analysis
• More complicated.
• Intuition:
• Certain operations define an ordering
between operations of threads.
• Establish thread counters to create a partial
ordering.
• When a variable access occurs that can’t
establish itself as being ‘after’ the previous
one, we have detected an actual race.

Happens-Before on our Example

Thread t        Thread u
t:Fork(u)
t:Lock(a)       u:Lock(a)
t:Write(x)      u:Write(x)
t:Unlock(a)     u:Unlock(a)
t:Join(u)
t:Write(x)
t:Fork(v)

• The figure labels each thread's timeline with its clock value:
t goes from 1 to 2, and u is at 1.
• Each variable stores the thread clock value for the
most recent access of each thread. Here, x: u-1, t-2.
• Also, threads learn about and store the clock values
of other threads through synchronization activities.
After the join, t: self-2, u-1.
• If u were to go off, incrementing its count (say, to 32) and
accessing variables, t would find out after the join.
Then t: self-2, u-32 and x: u-32, t-2.
• When an access does occur, it is a requirement that,
for each previous thread access of x:
t’s knowledge of that thread’s time ≥ x’s knowledge of that thread’s time
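
The clock bookkeeping above can be sketched with per-thread vector clocks. This is a simplified illustration under stated assumptions, not the algorithm of any particular detector: the HappensBeforeMonitor class is hypothetical, every access is treated as a write, and lock release/acquire pairs are treated as ordering edges alongside fork and join (real tools differ on which operations create edges).

```python
from collections import defaultdict


class HappensBeforeMonitor:
    """Vector-clock sketch: flags an access not ordered after a prior access."""

    def __init__(self):
        self.clocks = {}                      # thread -> {thread: known time}
        self.lock_clocks = {}                 # lock -> clock saved at last release
        self.last_access = defaultdict(dict)  # var -> {thread: time of last access}
        self.races = []                       # (var, earlier thread, later thread)

    def _clock(self, thread):
        return self.clocks.setdefault(thread, {thread: 1})

    @staticmethod
    def _merge(into, other):
        for thread, time in other.items():
            into[thread] = max(into.get(thread, 0), time)

    def fork(self, parent, child):
        self._merge(self._clock(child), self._clock(parent))  # child learns parent's past
        self._clock(parent)[parent] += 1

    def join(self, parent, child):
        self._merge(self._clock(parent), self._clock(child))  # parent learns child's past

    def acquire(self, thread, lock):
        self._merge(self._clock(thread), self.lock_clocks.get(lock, {}))

    def release(self, thread, lock):
        clock = self._clock(thread)
        self.lock_clocks[lock] = dict(clock)
        clock[thread] += 1

    def access(self, thread, var):
        clock = self._clock(thread)
        for other, time in self.last_access[var].items():
            # Requirement: this thread's knowledge of `other` >= x's record of it.
            if other != thread and clock.get(other, 0) < time:
                self.races.append((var, other, thread))
        self.last_access[var][thread] = clock[thread]


def locked_write(monitor, thread):
    monitor.acquire(thread, "a")
    monitor.access(thread, "x")
    monitor.release(thread, "a")


def replay_fork_join_example():
    hb = HappensBeforeMonitor()
    hb.fork("t", "u")
    locked_write(hb, "t")
    locked_write(hb, "u")
    hb.join("t", "u")
    hb.access("t", "x")   # unlocked, but ordered after u by the join
    hb.fork("t", "v")
    locked_write(hb, "t")
    locked_write(hb, "v")
    hb.join("t", "v")
    return hb.races


def replay_unordered_writes():
    hb = HappensBeforeMonitor()
    hb.fork("t", "u")
    hb.access("u", "x")
    hb.access("t", "x")   # no join and no lock: nothing orders this after u
    return hb.races
```

Under these assumptions the fork/join trace produces no reports, while the unordered pair of writes is reported as a race on x between u and t.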

So, combining the two…
• Modern dynamic race detectors use both
techniques.
• Lockset analysis will detect any violation of
locking discipline.
• This means we will get plenty of false positives
when strict locking discipline is not followed.
• Simple: requires less memory and fewer cycles.

So, combining the two…
• Modern dynamic race detectors use both
techniques.
• Happens-Before will report actual race
conditions that were detected.
• Extremely path sensitive.
• No false positives!
• False negatives can be a problem.
• High memory and CPU overhead.
• As we have seen, happens-before does not
merely enforce locking discipline.
• Works when threads are ‘ordered.’

So, combining the two…
• Performance-wise:
• Use lockset, then switch to happens-before
for variables where a race is detected.
• Of course this is dynamic! No guarantee of
reoccurrence!
• Similarly, modify detection granularity at
runtime.

Future Research
• Use static tools to limit search space
• We can soundly approximate every location
where a race might occur.
• Performance improvements
• Could be used for in-field monitoring.
• Improve chances of HB hitting?

Model-Checking for Race Conditions
• The Art of Model Checking
• Develop a model of your software system
that can be completely explored to find
reachable error states

Model-Checking for Race Conditions
• Normally, scope of model determines
whether or not model checking is
feasible.
• Detailed model – Model checking takes
longer.
• Simple model – Must be detailed enough to
capture principles of interest.

Model-Checking for Race Conditions
• Model-checking concurrent programs is
quite a challenge
• Take a large state space
• Add all possible thread interleavings
• Result – Very large state space
• Details of specific models would be too
much to go into
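
To give a rough feel for how quickly interleavings grow, the number of ways to interleave k threads that run n1, ..., nk atomic steps is the multinomial coefficient (n1 + ... + nk)! / (n1! · ... · nk!). The sketch below computes it. The function name is made up for illustration, and it counts schedules only, not the data states each schedule can reach.

```python
from math import factorial


def interleavings(steps_per_thread):
    """Number of distinct schedules that preserve each thread's own step order."""
    total = factorial(sum(steps_per_thread))
    for steps in steps_per_thread:
        # Each division is exact because the running quotient stays an integer.
        total //= factorial(steps)
    return total
```

Two threads of three steps each already have 20 schedules. Three threads of ten steps each have 5,550,996,791,340.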

Model-Checking for Race Conditions
• Strategies:
• Persistent Sets
• Eliminate pointless thread interleavings
• Sometimes known as partial order reduction
• Contexts
• Represent every other thread with one abstract
state machine.
• Like CEGAR, only refine as much as needed.
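
A small sketch of why many interleavings are "pointless". It is not the persistent-set algorithm itself, only an illustration of the equivalence that partial order reduction exploits. Two schedules are treated as equivalent when they order every pair of conflicting operations the same way. The operation encoding (thread, step, kind, variable) and all function names are assumptions made for this example.

```python
from itertools import combinations


def interleave(a, b):
    """Yield every interleaving of lists a and b that keeps each list's order."""
    if not a or not b:
        yield list(a) + list(b)
        return
    for rest in interleave(a[1:], b):
        yield [a[0]] + rest
    for rest in interleave(a, b[1:]):
        yield [b[0]] + rest


def dependent(op1, op2):
    """Ops conflict if different threads touch the same variable and one writes."""
    thread1, _, kind1, var1 = op1
    thread2, _, kind2, var2 = op2
    return thread1 != thread2 and var1 == var2 and "write" in (kind1, kind2)


def signature(trace):
    """The ordered pairs of conflicting ops; equal signatures mean equivalent traces."""
    return frozenset((p, q) for p, q in combinations(trace, 2) if dependent(p, q))


def count_traces(thread1, thread2):
    """Return (number of interleavings, number of equivalence classes)."""
    traces = list(interleave(thread1, thread2))
    return len(traces), len({signature(trace) for trace in traces})


# Each thread writes a private variable, then the shared variable x.
T1 = [("T1", 0, "write", "a"), ("T1", 1, "write", "x")]
T2 = [("T2", 0, "write", "b"), ("T2", 1, "write", "x")]
```

Here count_traces(T1, T2) finds 6 interleavings but only 2 equivalence classes, one for each order of the two writes to x. A reduction technique only needs to explore one representative per class.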

Model-Checking for Race Conditions
• Ease of use?
• Annotations
• None
• Expression
• Some tools use model-checking to implement
lockset which does not allow much expression.
• Others allow us to find actual race conditions!
• Scalability
• A Question Mark: Is the state space small
enough?
• Previous tools using partial order reduction
have been used on large software, but not for races.

Model-Checking for Race Conditions
• Soundness?
• Yes, model-checking in this manner is
sound, as long as it terminates.
• Precision?
• Depends on how your model is used.
• In one model lockset analysis is used. Tends to
be imprecise.
• Another model directly searches for “racy”
states, which makes it very precise, but it
doesn't yet work in the presence of aliasing.

Good Ol' Flow-Based Analysis
• Has been approached in a few ways
• Engineering Approach
• Sacrifice Soundness
• Increase Precision as Much as Possible
• Rank Results
• Use Heuristics and Good Judgement
• Think of PREfix or Coverity
• Rely on Alias Analysis
• Rely on Programmer Annotations

Good Ol' Flow-Based Analysis
• Engineering Approach:
• Start with interprocedural lockset analysis
• Make simple improvements:
• “use statistical analysis to compute the
probability that s ... similar to known locks.”
• “realize that the first, last or only shared data in
a critical section are special.”
• “if the number of distinct entry locksets in a
function exceeds a fixed limit we skip the
function”
• (Engler ’03)

Many Benefits
• Ease of Use?
• Annotations
• None, or a constant number that give immediate
precision improvements.
• Expression
• Non-lock based idioms are 'hard-coded' by
heuristics.
• Scalability
• More than any other.
• Linux, FreeBSD, Commercial OS
• 1.8MLOC in 2-14 minutes

Many Benefits
• Soundness?
• Not sound in a few specific ways.
• Ability to detect some false negatives.
• Precision?
• Fewer false positives than traditional
lockset tools.
• ~6 when run on Linux 2.5.
• 10s, 100s, 1000s in other static tools on
smaller applications.

Other Flow-Based Tools
• Some Rely on Alias Analysis
• Limited by Current State-of-the-Art
• Still Many False Positives
• May not Scale
• Some Rely on Programmer Annotations
to distinguish all the hard cases
• May impose programmer burden

So, Let’s Do a Final Comparison…

Annotations
• Type-Based Systems
• Annotations are a major limiting factor.
They can be inferred, but they must be
understood by the programmer.
• Dynamic Tools
• Unnecessary
• Model-Checking
• Unnecessary
• Flow-Based Analysis
• Necessary in some form or another

Expression
• Type-Based Systems
• Limited to strict locking discipline.
• Dynamic Tools
• Thanks to combination of lockset and happens-
before, relative freedom.
• Model-Checking
• Can allow great expression (Depends on
technology).
• Flow-Based Analysis
• Expression can be traded for soundness or
annotations.

Scalability
• Type-Based Systems
• Scalability Limited by Annotations
• Dynamic Tools
• Getting better, but performance is still a major
issue (1-3x memory usage, 1.5x CPU usage)
• Model-Checking
• Not extremely scalable. Depends highly on
number of processes.
• Flow-Based Analysis
• Has shown the best scalability.

Soundness
• Type-Based Systems
• Sound
• Dynamic Tools
• Fundamentally unsound; but lockset will
catch most possible races in execution.
• Model-Checking
• Also sound. May not terminate.
• Flow-Based Analysis
• Different techniques trade soundness for
precision.

Precision
• Type-Based Systems
• Low precision. Strict MLD.
• Dynamic Tools
• Better precision.
• Model-Checking
• Can be very high. Not complete
(undecidability of reachability).
• Flow-Based Analysis
• High precision using an engineering
approach.

Questions

