
Formal verification of programs with arrays
Vu An Hoa
Assoc. Prof. Chin Wei Ngan
Agenda
• Introduction
• Review of first order logic
• Satisfiability modulo theory problem
• Implementation & illustrations
• Evaluation
• Concluding remarks
• Questions and Answers
Introduction
• Hip/Sleek: formal program verification system
– Designed to reason about recursive data structures
(linked list, binary tree, etc.)
– Developed at NUS by Chin et al.
• Hip
– Verification front end: parses and verifies programs written in our
version of C/C#/Java
• Sleek
– Separation logic entailment checker
– Also supports classical logic (or pure logic)
Formal verification
• “… act of proving or disproving the correctness of
intended algorithms underlying a system with
respect to a certain formal specification or property,
using formal methods of mathematics.”
• Independent of the model of computation
– Does not consider language issues such as the range of
type int is limited, a + b ≠ b + a, …
• Immune from algorithmic bugs
– Unlike testing methods like path coverage, boundary
value analysis, …
– Desirable!
Formal verification
• Example
– Prove that the following function
int sumfirst(int n) {
    if (n <= 0) return 0;
    else return n + sumfirst(n-1);
}
always returns the sum 1 + 2 + … + n for input n ≥ 0.
– Caveat: in C, type int allows for values up to 2^31 – 1. What if n > 10^9?
– Induction on the input
• Base case n = 0 is trivial.
• Assume that n ≥ 0 and sumfirst(n) returns 1 + 2 + … + n. Then
n + 1 ≥ 1 > 0 and hence sumfirst(n+1) returns (n+1) + sumfirst(n),
which equals 1 + 2 + … + n + (n+1) by the induction hypothesis.
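The induction above reasons about mathematical integers, whereas the remark about the range of int concerns the machine. The gap can be made concrete with a small check. This is a minimal sketch, not part of the original slides: the helper names (sumfirst_ll, closed_form, agrees_up_to, largest_safe_n) are hypothetical, and it assumes that long long is at least 64 bits wide. Agreement on finitely many inputs is testing, not a proof.

```c
#include <assert.h>
#include <stdint.h>

/* Same recursion as sumfirst, but on long long so that the small inputs
   used here cannot overflow (an assumption of this sketch). */
long long sumfirst_ll(long long n) {
    if (n <= 0) return 0;
    return n + sumfirst_ll(n - 1);
}

/* Closed form of 1 + 2 + ... + n, valid for n >= 0. */
long long closed_form(long long n) {
    assert(n >= 0);
    return n * (n + 1) / 2;
}

/* Returns 1 if the recursion and the closed form agree for n = 0..limit. */
int agrees_up_to(long long limit) {
    for (long long n = 0; n <= limit; n++) {
        if (sumfirst_ll(n) != closed_form(n)) return 0;
    }
    return 1;
}

/* Largest n whose sum 1 + ... + n still fits in a 32-bit signed int. */
long long largest_safe_n(void) {
    long long n = 0;
    while (closed_form(n + 1) <= INT32_MAX) n++;
    return n;   /* 65535 */
}
```

With a 32-bit int, the sum itself already exceeds the range at n = 65536, long before n approaches the limit of the type.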
The project
• Hip/Sleek is powerful in verifying properties of
recursive data structures
– There was no support for arrays.
– Why is an array not a recursive data structure?
• Problem: extend Hip/Sleek to support arrays.
– Important
• Arrays are used intensively in many programs (why?)
• High rate of exposure to algorithmic bugs
– Challenging
Array vs. Recursive data structures
• Array vs. Linked list
– Array: arranged contiguously in the memory; the
memory location of any array element can be
computed directly => efficient!
– Linked list: need to traverse the node pointers to
reach nodes farther down the list.
• Random access vs. recursive access
• Array = a map from integer to value
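The view of an array as a map from integer to value can be sketched with two operations, select and store, where store returns a new map instead of updating in place. This is only an illustration under stated assumptions: the type IntMap, the bound MAP_CAPACITY, the function names, and the convention that unset indices read as 0 are all hypothetical and do not come from Hip/Sleek.

```c
#include <assert.h>

#define MAP_CAPACITY 16   /* hypothetical bound, chosen only for this sketch */

/* An array value modelled as a finite map from integer index to value.
   Indices that were never stored read as 0 (an assumption of this sketch). */
typedef struct {
    int keys[MAP_CAPACITY];
    int vals[MAP_CAPACITY];
    int size;
} IntMap;

/* select(a, i): the value held at index i. */
int map_select(const IntMap *a, int i) {
    for (int k = 0; k < a->size; k++) {
        if (a->keys[k] == i) return a->vals[k];
    }
    return 0;
}

/* store(a, i, v): a NEW map equal to a except that index i holds v.
   The struct is passed by value, so the caller's map is left unchanged. */
IntMap map_store(IntMap a, int i, int v) {
    for (int k = 0; k < a.size; k++) {
        if (a.keys[k] == i) { a.vals[k] = v; return a; }
    }
    assert(a.size < MAP_CAPACITY);
    a.keys[a.size] = i;
    a.vals[a.size] = v;
    a.size++;
    return a;
}
```

Starting from `IntMap empty = {0};`, the value `map_store(empty, 3, 7)` reads 7 at index 3 and 0 elsewhere, while `empty` itself is unchanged. Reading back what was just stored, and reading other indices unaffected, is the read-over-write behaviour that the array theory of SMT solvers axiomatizes.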
First order logic: Syntax
• Alphabet
– Logical symbols: ¬, ∧, ∨, →, ↔, ∀, ∃, =, (, ), variables (v1, v2, …, x,
y, …)
– Parameters: symbols for function (f1, f2, …), relation (R1, R2, …)
and constant (c1, c2, …)
• Terms
– Any variable is a term. If t1,…,tk are terms and fi is a k-ary
function then fi t1 t2 … tk is a term.
• Formulas
– Atomic: t1 = t2 or Ri t1 … tk where the tj's are terms.
– If α and β are formulas then (¬α), (α ∧ β), (α ∨ β), (α → β), (α
↔ β), ∀vi α, ∃vi α are formulas.
First order logic: Syntax
• Example
– Language of arithmetic L = {S, +, -, ×} ∪ {<} ∪ {0, 1}
• Formal deduction system
– Symbol manipulation mechanism to derive a formula
α (conclusion) from a set of formulas Γ (hypotheses).
– Hilbert’s system = Axioms + 1 rule (Modus Ponens)
– Natural deduction = 2 x 7 + 2 logical rules
• (universal | existential | …) (introduction | elimination)
– Notation: Γ⊢α for α is derivable from Γ
– “Formal methods of mathematics” = formal deduction
First order logic: Semantics
• Structure 𝒮 for a language L
– Universe of discourse: a set, denoted by |𝒮|
– Interpretation: Functions, relations or constants
on |𝒮| correspond to parameters of L.
• Variable instantiation in a structure 𝒮
– A map ε from the variable symbols to |𝒮|
• Validity
– In a fixed structure 𝒮 and a variable instantiation ε,
a formula is either valid (true) or invalid (false).
First order logic: Semantics
• Satisfaction
– Denote Γ ⊨ α [𝒮,ε] if, in 𝒮 and ε, α is true whenever every member of Γ
is true.
– Denote Γ ⊨ α if Γ ⊨ α [𝒮,ε] holds for any 𝒮 and ε.
– Γ is satisfiable if there are 𝒮 and ε that make every α in Γ
true. Otherwise, Γ is unsatisfiable.
• Example
– Let L = {A} ∪ {} ∪ {} and α = ∃x ∀y (A x y = y)
– In 𝒮: |𝒮| = ℕ, A: (x,y) ↦ x + y and any ε, α is true.
– In 𝒮: |𝒮| = ℝ, A: (x,y) ↦ x^y and any ε, α is false.
– x = x is true in any 𝒮 and ε
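Whether α = ∃x ∀y (A x y = y) is true depends on how the structure interprets A. The structures on the slide have infinite universes and cannot be enumerated, so the sketch below evaluates α by brute force in finite structures instead. The universe {0, …, n-1} and the two interpretations add_mod and first_arg are assumptions chosen for illustration; they are not from the original slides.

```c
#include <assert.h>

/* Interpretation of the binary function symbol A on the universe {0..n-1}. */
typedef int (*BinaryOp)(int, int);

/* Returns 1 if  exists x. forall y. A(x, y) = y  holds in the finite
   structure with universe {0, ..., n-1} and interpretation op; else 0. */
int alpha_holds(int n, BinaryOp op) {
    assert(n >= 1);   /* a first-order structure has a non-empty universe */
    for (int x = 0; x < n; x++) {
        int works_for_all_y = 1;
        for (int y = 0; y < n; y++) {
            if (op(x, y) != y) { works_for_all_y = 0; break; }
        }
        if (works_for_all_y) return 1;   /* x is a witness */
    }
    return 0;
}

#define UNIVERSE_SIZE 5

/* Addition modulo UNIVERSE_SIZE: x = 0 is a witness, so alpha is true. */
int add_mod(int x, int y) { return (x + y) % UNIVERSE_SIZE; }

/* Projection on the first argument: no single x equals every y,
   so alpha is false when the universe has at least two elements. */
int first_arg(int x, int y) { (void)y; return x; }
```

Here `alpha_holds(UNIVERSE_SIZE, add_mod)` returns 1 and `alpha_holds(UNIVERSE_SIZE, first_arg)` returns 0: the same formula is true in one structure and false in another.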
Satisfiability Modulo Theory
• Satisfiability
– Given a collection of formulas Γ, decide whether there are a
structure 𝒮 and an instantiation ε such that every member of Γ is true.
• Modulo Theory
– Standard models/theories: integer, real numbers,
arrays, bit vectors, …
• Undecidable in general
– Decidable in many interesting cases: Presburger
arithmetic, elementary theory of real numbers, Boolean SAT, …
• SMT solvers: Z3, CVC3, Yices, …
Satisfiability Modulo Theory
• SMTLIB
– standard language for SMT solvers
– library of benchmarks
• Output of SMT solvers:
– sat: there is an instantiation that makes every
formula in the collection true
– unsat: the collection is unsatisfiable
– unknown: the solver cannot decide
• Example
Satisfiability Modulo Theory
• We want to use SMT to construct proofs.
• From definitions:
– Γ ⊨ α ⇔ Γ ∪ {¬α} is unsatisfiable.
– Γ ⊢ α ⇔ Γ ∪ {¬α} is inconsistent (i.e. Γ ∪ {¬α}
derives two contradictory formulas)
• Soundness and completeness theorem:
– Γ ⊢ α ⇔ Γ ⊨ α
• So Γ ⊢ α ⇔ Γ ∪ {¬α} is unsatisfiable.
– In fact, an SMT solver decides unsatisfiability based on
inconsistency.
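The reduction of entailment to unsatisfiability can be illustrated in the much simpler propositional setting, where satisfiability is decidable by enumerating truth assignments: Γ entails α exactly when no assignment makes all of Γ true and α false, that is, when Γ together with ¬α is unsatisfiable. This is a minimal sketch with hypothetical names (Formula, negation_satisfiable, entails, f_p0, …). A real SMT solver works very differently, and nothing here reflects Z3's internals.

```c
#include <assert.h>

/* A propositional formula over variables p0..p(k-1), represented as a
   function from an assignment (bit i of v = truth value of pi) to 0 or 1. */
typedef int (*Formula)(unsigned v);

/* Returns 1 if some assignment satisfies every formula in gamma and
   falsifies alpha, i.e. if gamma together with (not alpha) is satisfiable. */
int negation_satisfiable(const Formula *gamma, int count,
                         Formula alpha, int num_vars) {
    assert(num_vars >= 0 && num_vars <= 30);   /* keeps the shift in range */
    for (unsigned v = 0; v < (1u << num_vars); v++) {
        int all_true = 1;
        for (int i = 0; i < count; i++) {
            if (!gamma[i](v)) { all_true = 0; break; }
        }
        if (all_true && !alpha(v)) return 1;   /* counter-model found */
    }
    return 0;
}

/* gamma entails alpha  <=>  gamma plus (not alpha) is unsatisfiable. */
int entails(const Formula *gamma, int count, Formula alpha, int num_vars) {
    return !negation_satisfiable(gamma, count, alpha, num_vars);
}

/* Example formulas over p0 and p1. */
int f_p0(unsigned v)            { return (int)(v & 1u); }
int f_p1(unsigned v)            { return (int)((v >> 1) & 1u); }
int f_p0_implies_p1(unsigned v) { return !f_p0(v) || f_p1(v); }
```

With `Formula gamma[] = { f_p0, f_p0_implies_p1 };`, the call `entails(gamma, 2, f_p1, 2)` returns 1 (modus ponens), while `entails(gamma, 1, f_p1, 2)` returns 0, because p0 alone does not entail p1.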
Implementation
• Hip/Sleek
– Symbolic execution (context transformation)
– Important assumptions:
• every loop terminates
• every recursive function is well-founded (eventually reduces to
the base case)
• Array
– Currently supports pure logic only
– Additional assumption:
• no pointer aliasing, i.e. different array variables refer to different
arrays
– Allows for user-defined relations
Implementation
• Arrays
– Viewed as values, just like integers
– Utilize SMT array theory
• Relations
– Defined in SMT using Boolean-valued functions
– Axiomatize using the user’s definition
• Implication
– For Γ ⊢ α, ask whether Γ ∪ {¬α} is satisfiable
– Both outputs sat and unknown are deemed satisfiable => the
implication is treated as invalid => no proof is found => Γ ⊬ α is reported
– Reliable (as long as Z3 is reliable)
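The conservative treatment of solver answers can be written as a small decision function: only unsat counts as a proof, and unknown is handled exactly like sat. This is a sketch under assumptions. The enum Verdict and the function verdict_from_solver_output are hypothetical names, and the function takes the solver's one-word answer as a string; it does not show how Hip/Sleek actually communicates with Z3.

```c
#include <assert.h>
#include <string.h>

/* Outcome reported for an implication query (hypothetical names). */
typedef enum { IMPLICATION_VALID, IMPLICATION_NOT_PROVED } Verdict;

/* Maps the solver's answer for the negated query ("sat", "unsat" or
   "unknown") to a verdict. Anything other than "unsat" is rejected. */
Verdict verdict_from_solver_output(const char *answer) {
    assert(answer != NULL);
    if (strcmp(answer, "unsat") == 0) return IMPLICATION_VALID;
    return IMPLICATION_NOT_PROVED;
}
```

Treating unknown as a failure can reject valid implications, but it never accepts an invalid one, provided that the solver's unsat answers are correct.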
Illustration
• Sum of the elements of an array
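The original illustration is not reproduced in this text. As a stand-in, the sketch below shows the kind of property involved: a recursively defined relation specifies the sum, and an iterative loop is checked against it through a loop invariant. The names sum_rel and sum_array are hypothetical, the code is plain C rather than Hip's specification syntax, and the assertions only check the invariant at run time on given inputs instead of proving it.

```c
#include <assert.h>

/* Recursive specification: sum_rel(a, i, j) = a[i] + ... + a[j],
   and 0 when the range is empty (i > j). */
long sum_rel(const int *a, int i, int j) {
    if (i > j) return 0;
    return a[j] + sum_rel(a, i, j - 1);
}

/* Iterative sum of a[0..n-1]. The loop invariant s == sum_rel(a, 0, k-1)
   is asserted on entry to every iteration. */
long sum_array(const int *a, int n) {
    long s = 0;
    for (int k = 0; k < n; k++) {
        assert(s == sum_rel(a, 0, k - 1));   /* loop invariant */
        s += a[k];
    }
    assert(s == sum_rel(a, 0, n - 1));       /* postcondition */
    return s;
}
```

For `int a[] = {3, 1, 4, 1, 5};`, the call `sum_array(a, 5)` returns 14. A verifier would have to establish the invariant for all arrays and all n, which is where the recursive relation and the SMT array theory come in.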
Illustration
• Selection sort
Illustration
• Data structures
Evaluation
• Limitations
– Incapable of performing induction
– Intermediate steps discovery
– Memory context (same pointer)
– Can only tackle easy problems
– Works by model checking (no formal proof
generation)
Evaluation
• Frama-C + Jessie + Why
– Do not allow recursive relations
– Check for memory safety
– Prove termination using user-supplied loop invariants
– Incorporate many solvers
• Hip/Sleek
– Allow recursive relations
– Utilize only Z3 for arrays
The glory unveils
• Def: The difficulty level of α with respect
to Γ = least n ∈ ℕ such that α ∈ Γ_n.
• When the difficulty level < 3, the result is said
to be trivial. When Γ ⊬ α, the level of
α is infinity.
• Observation: Provable examples are all of
trivial level.
Concluding remarks
• In this project, we achieved
– An (incomplete) solution to verification of
programs with arrays
– A collection of comprehensive examples that
illustrate the capability of our system
– Analysis of the verification power
Concluding remarks
• Many interesting features can be added
– Separation logic
– Proof generation
– Induction
– Invariant detection heuristics
– Etc.
Thank you for your attention

QUESTIONS AND ANSWERS
