Download as pdf or txt
Download as pdf or txt
You are on page 1of 6

Symbolic Execution

• Builds predicates that characterize


– Conditions for executing paths
Symbolic Execution and Proof of – Effects of the execution on program state
Properties • Bridges program behavior to logic
• Finds important applications in
– program analysis
– test data generation
– formal verification (proofs) of program correctness

(c) 2007 Mauro Pezzè & Michal Young Ch 7, slide 1 (c) 2007 Mauro Pezzè & Michal Young Ch 7, slide 3

Formal proof of properties Symbolic state


Values are expressions over symbols
• Relevant application domains: Executing statements computes new expressions
– Rigorous proofs of properties of critical subsystems
• Example: safety kernel of a medical device Execution with concrete values Execution with symbolic values
– Formal verification of critical properties particularly before before
low 12 low L
resistant to dynamic testing high 15 high H
• Example: security properties mid - mid -
– Formal verification of algorithm descriptions and logical mid = (high+low)/2 mid = (high+low)/2
designs
• less complex than implementations after after
low 12 Low L
high 15 high H
mid 13 mid (L+H)/2

(c) 2007 Mauro Pezzè & Michal Young Ch 7, slide 4 (c) 2007 Mauro Pezzè & Michal Young Ch 7, slide 5
Dealing with branching statements Executing while (high >= low) {
a sample program Add an expression that records the condition for the execution of
the branch (PATH CONDITION)
char *binarySearch( char *key, char *dictKeys[ ],
char *dictValues[ ], int dictSize) {
before
int low = 0; low = 0
int high = dictSize - 1; and high = (H-1)/2 -1
int mid;
int comparison;
and mid = (H-1)/2

Branching stmt while (high >= low) { while (high >= low)!!
mid = (high + low) / 2;
comparison = strcmp( dictKeys[mid], key );
if (comparison < 0) { after
low = mid + 1; low = 0
} else if ( comparison > 0 ) { and high = (H-1)/2 -1
high = mid - 1;
} else {
and mid = (H-1)/2
return dictValues[mid]; and (H-1)/2 - 1 >= 0 if the TRUE branch was taken
}
}
return 0;
... and not((H-1)/2 - 1 >= 0) if the FALSE branch was taken

(c) 2007 Mauro Pezzè & Michal Young Ch 7, slide 6 (c) 2007 Mauro Pezzè & Michal Young Ch 7, slide 7

Summary information Example of summary information


(Referring to Binary search: Line 17, mid = (high+low)/2 )
• Symbolic representation of paths may become • If we are reasoning about the correctness of the binary search algorithm,
extremely complex the complete condition:
low = L
• We can simplify the representation by replacing and high = H
a complex condition P with a weaker condition and mid = M
and M = (L+H)/2
W such that
• Contains more information than needed and can be replaced with the
P => W weaker condition:
low = L
• W describes the path with less precision and high = H
• W is a summary of P and mid = M
and L <= M <= H
• The weaker condition contains less information, but still enough to reason
about correctness.

(c) 2007 Mauro Pezzè & Michal Young Ch 7, slide 8 (c) 2007 Mauro Pezzè & Michal Young Ch 7, slide 9
Weaker preconditions Loops and assertions
• The weaker predicate L <= mid <= H is chosen based on • The number of execution paths through a program with
what must be true for the program to execute correctly loops is potentially infinite
• It cannot be derived automatically from source code • To reason about program behavior in a loop, we can
• it depends on our understanding of the code and our place within the loop an invariant:
rationale for believing it to be correct – assertion that states a predicate that is expected to be true
each time execution reaches that point.
• A predicate stating what should be true at a given point
can be expressed in the form of an assertion • Each time program execution reaches the invariant
assertion, we can weaken the description of program
• Weakening the predicate has a cost for testing: state:
– satisfying the predicate is no longer sufficient to find data that
– If predicate P represents the program state
forces program execution along that path.
• test data that satisfies a weaker predicate W is necessary to – and the assertion is W
execute the path, but it may not be sufficient – we must first ascertain P => W
• showing that W cannot be satisfied shows path infeasibility – and then we can substitute W for P

(c) 2007 Mauro Pezzè & Michal Young Ch 7, slide 10 (c) 2007 Mauro Pezzè & Michal Young Ch 7, slide 11

Pre- and post-conditions Verifying program correctness


• Suppose: • If for each program segment we can verify that
– every loop contains an assertion – Starting from the precondition
– there is an assertion at the beginning of the program – Executing the program segment
– a final assertion at the end – The postcondition holds at the end of the segment
• Then:
• Then
– every possible execution path would be a sequence
of segments from one assertion to the next. – We verify the correctness of an infinite number of
program paths
• Terminology:
– Precondition: The assertion at the beginning of a
segment,
– Postcondition: The assertion at the end of the
segment
(c) 2007 Mauro Pezzè & Michal Young Ch 7, slide 12 (c) 2007 Mauro Pezzè & Michal Young Ch 7, slide 13
Example Executing the loop once…
Precondition
char *binarySearch( char *key, char *dictKeys[ ], Initial values: low = L Forall{i,j} 0 <= i < j < size
char *dictValues[ ], int dictSize) { Precondition: is sorted: and high = H dictKeys[i] <= dictKeys[j]

int low = 0;
Forall{i,j} 0 <= i < j < size :
int high = dictSize - 1; dictKeys[i] <= dictKeys[j] Instantiated invariant: Forall{i,j} 0 <= i < j < size :
int mid; dictKeys[i] <= dictKeys[j]
int comparison; and Forall{k} 0 <= k < size :
while (high >= low) { Invariant: in range dictKeys[k] = key => L <= k <= H
mid = (high + low) / 2;
Forall{i} 0 <= i < size : After executing: mid = (high + low)/2
comparison = strcmp( dictKeys[mid], key ); Invariant
if (comparison < 0) { dictKeys[i] = key =>
Forall{i} 0 <= i < size :
low = mid + 1; low <= i <= high low = L dictKeys[i] = key =>
} else if ( comparison > 0 ) { and high = H low <= i <= high
high = mid - 1;
} else {
and mid = M
return dictValues[mid]; and Forall{i,j} 0 <= i < j < size :
} dictKeys[i] <= dictKeys[j]
} and Forall{k} 0 <= k < size :
return 0;
dictKeys[k] = key => L <= k <= H
and H >= M >= L ….

(c) 2007 Mauro Pezzè & Michal Young Ch 7, slide 14 (c) 2007 Mauro Pezzè & Michal Young Ch 7, slide 15

…executing the loop once From the loop to the end


After executing the loop low = M+1
and high = H If the invariant is satisfied, but the condition is false:
and mid = M
and Forall{i,j} 0 <= i < j < size :
dictKeys[i] <= dictKeys[j] low = L
and Forall{k} 0 <= k < size : and high = H
dictKeys[k] = key => L <= k <= H and Forall{i,j} 0 <= i < j < size :
and H >= M >= L dictKeys[i] <= dictKeys[j]
and dictkeys[M]<key and Forall{k} 0 <= k < size :
dictKeys[k] = key => L <= k <= H
The new instance of the invariant: and L > H
Forall{i,j} 0 <= i < j < size :
dictKeys[i] <= dictKeys[j]
and Forall{k} 0 <= k < size : If the the condition satisfies the post-condition, the program
dictKeys[k] = key => M+1 <= k <= H is correct wrt the pre- and post-condition:
If the invariant is satisfied, the loop is correct wrt the preconditions and the invariant

(c) 2007 Mauro Pezzè & Michal Young Ch 7, slide 16 (c) 2007 Mauro Pezzè & Michal Young Ch 7, slide 17
Reasoning about Hoare triples:
Compositional reasoning
inference
• Follow the hierarchical structure of a program
– at a small scale (within a single procedure) premise
[I and C] S [I]
– at larger scales (across multiple procedures…)
[I] while(C){S} [I and notC]
conclusion
• Hoare triple: [pre] block [post]
Inference rule says:
if we can verify the premise (top),
• if the program is in a state satisfying the then we can infer the conclusion (bottom)
precondition pre at entry to the block, then
after execution of the block it will be in a state
satisfying the postcondition post

(c) 2007 Mauro Pezzè & Michal Young Ch 7, slide 18 (c) 2007 Mauro Pezzè & Michal Young Ch 7, slide 19

Some other rules: if statement Reasoning style


• Summarize the effect of a block of program code (a
whole procedure) by a contract == precondition +
postcondition
• Then use the contract wherever the procedure is called

[P and C] thenpart [Q] [P and notC] elsepart [Q]


example
[P] if (C){thenpart} else {elsepart} [Q]
summarizing binarySearch:
(forall i,j, 0 <= i < j < size : keys[i] <= keys[j])
s = binarySearch(k, keys, vals, size)
(s=v and exists i , 0 <= i , size : keys[i] = k and vals[i] = v)
or
(s=v and not exists i , 0 <= i , size : keys[i] = k)

(c) 2007 Mauro Pezzè & Michal Young Ch 7, slide 20 (c) 2007 Mauro Pezzè & Michal Young Ch 7, slide 21
Reasoning about data structures and
Structural invariants
classes
• Data structure module = collection of • Structural characteristics that must be
procedures (methods) whose specifications are maintained as specified as structural invariants
strongly interrelated (~loop invariants)
• Contracts: specified by relating procedures to • Reasoning about data structures
an abstract model of their (encapsulated) inner
– if the structural invariant holds before execution
state
– and each method execution preserve the invariant
– …then the invariant holds for all executions
example:
Dictionary can be abstracted as {<key, value>}
independent of the implementation as a list, Example: Each method in a search tree class
tree, hash table, etc. maintains the ordering of keys in the tree
(c) 2007 Mauro Pezzè & Michal Young Ch 7, slide 22 (c) 2007 Mauro Pezzè & Michal Young Ch 7, slide 23

Abstraction function Summary


• maps concrete objects to abstract model states • Symbolic execution = bridge from an operational view
of program execution to logical and mathematical
statements.
Dictionary example • Basic symbolic execution technique: execute using
symbols
abstraction function
• Symbolic execution for loops, procedure calls, and data
[<k,v> in !(dict) ] structures: proceed hierarchically
– compose facts about small parts into facts about larger parts
o = dict.get(k)
• Fundamental technique for
[o=v] – Generating test data
– Verifying systems
– Performing or checking program transformations
• Tools are essential to scale up

(c) 2007 Mauro Pezzè & Michal Young Ch 7, slide 24 (c) 2007 Mauro Pezzè & Michal Young Ch 7, slide 25

You might also like