Let's first consider a simple imperative programming language that can encode assignments, arithmetic operations, conditional branches, skips, sequences and non-recursive function calls. Clearly, any program written in this language has a finite number of possible execution paths, as well as a finite number of steps in each execution path. This fact allows us to transform a program to a logical formula, such that each model of the formula corresponds to a valid execution path of the program and vice versa.
More precisely, suppose F(x_1,...,x_n) is a program that takes n inputs and gives one output. Then F can be represented as a formula (called a path formula) in form of \phi_F(x_1,...,x_n,y) = \bigvee\nolimits_{i \in I}\left( p_i \wedge y = e_i \right),where I is a finite index set, p_i is a predicate and e_i is an expression. A path formula \phi is called valid if the following conditions are satisfied: i) \bigvee_{i \in I}\ p_i is a tautology, which means all possible paths are covered by the formula, ii) \neg\bigvee_{i \neq j}(p_i \wedge p_j) is a tautology, which means no two paths will be taken simultaneously, and iii) an assignment to \{x_1,..., x_n, y\} satisfies \phi_F if and only if f(x_1,..., x_n)=y, which means solutions of the formula precisely characterizes the input-output relation of program F.
One can employ techniques of symbolic execution to convert programs of various imperative languages into path formulae, see e.g., [1,2]. Such conversions usually allow one to reduce the problems of verifying program properties to those of checking satisfiability for first-order constraints. For example, to verify that function F(x_1,x_2) is commutative, i.e., \forall x_1,x_2.\ F(x_1,x_2) = F(x_2,x_1), it suffices to check that constraint \phi_F(x_1,x_2,y) \wedge \phi_F(x_2,x_1,y') \wedge y \neq y' is unsatisfiable.
Modern SMT solvers effectively decide formulas over combinations of theories such as uninterpreted functions, boolean algebra, integer arithmetic, bit-vectors, arrays, sets and maps [3]. Program verification thus can benefit from advances in SMT solvers by modelling programs as first-order constraints augmented with theories, see e.g., [4-7].
Handling recursive functions
Uninterpreted functions are often leveraged to facilitate automatic reasoning about recursive functions. The idea is to abstract each recursive call to an uninterpreted function call. If a constraint is unsatisfiable assuming uninterpreted functions, then it is also unsatisfiable assuming the correct interpretation. On the other hand, if a constraint is satisfiable assuming uninterpreted functions, then no reliable conclu-sion could be drawn for the correct interpretation, as the assumptions can be wrong. In [4], the authors proposes a technique to refine assumptions iteratively when the constraint is satisfiable. This technique is the cornerstone of the Leon verification system [5].As an illustration of the technique, consider function
function size (list) { if (list == null) { return 0 } else { return 1 + size(list.next) } }with path formula \phi_{size} \equiv (b_1 \wedge y = 0) \vee (\neg b_1 \wedge y = 1 + \mathrm{size(list.next)}), and define \phi = (\mathrm{size(list)} = y) \wedge (b_1 \Leftrightarrow \mathrm{list} = \mathrm{null}) \wedge \phi_{size}.Note that a boolean variable b_1, called a control literal, is introduced to represent the truth value of the branching condition in \mathrm{size}. The purpose of introducing control literals to a path formula is to guard invocations of uninterpreted functions. It turns out that we can switch between under- and over-approximation of a path formula by changing assignments of its control literals [4], as explained below.
Suppose we want to verify property \psi, say \psi \equiv \mathrm{size(list)} \ge 0. Define \psi_1 \equiv \neg\psi \wedge \phi. Given a correct interpretation of \mathrm{size}, \psi_1 is satisfiable iff the \mathrm{size} function satisfies property \psi. In particular, if \psi_1 is unsat, then so is \psi for any interpretation of \mathrm{size}. However, if a counterexample is reported assuming uninterpreted functions, then this may be because \mathrm{size(list.next)} is assigned an incorrect value. To exclude such case, we check the satisfiability of \psi_1 \wedge b_1. If the later formula is sat, then so is \psi_1 and we are done. Otherwise, we have to explore the branch restricted by b_1. For this purpose, we unfold the definition of \mathrm{size(list.next)} one more time in \phi, and then rewrite formula \psi_1 with appropriate substitutions, e.g., replacing \mathrm{list} with \mathrm{list.next}, using fresh variables for b_1 and y, etc. The new formula is denoted as \psi_2, and we proceed to check the satisfiabilty of \psi_2 and \psi_2 \wedge b_2, where b_2 is the control literal introduced by the unfolding.
We repeat these steps and produce a sequence of alternating approximations. Thus the verification procedure for property \psi would look like
n := 1
while true do
if \psi_n is unsat then output UNSAT and break
if \psi_n\wedge b_n is sat then output SAT and break
n := n + 1
done
Observe that the procedure iteratively computes a succession of under- and over-approximations of the recursive calls in the provided constraint. Also, it finds a counterexample for the constraint whenever the constraint is satisfiable. Intuitively, this happens because a counterexample corresponds to an execution that violates the property, and the procedure enumerates all possible executions in increasing lengths. On the other hand, since our demo language becomes Turing complete after it is equipped with recursions, we cannot expect the procedure to always terminate when the provided constraint is unsatisfiable. For an important class of recursive functions, however, the approach outlined here acts as a decision procedure and terminates in all cases [8]. For instance, the \mathrm{size} function listed above falls into this class.(To be completed)
References
1. Liu, Chang, et al. "Automating distributed partial aggregation." SoCC, 2014.2. Srivastava, et al. "Path-based inductive synthesis for program inversion." PLDI, 2010.
3. De Moura, Leonardo, et al. "Satisfiability modulo theories: introduction and applications." Comm. of ACM, 2011.
4. P. Suter, A. S. Koksal, et al. "Satisfiability modulo recursive programs." SAS, 2011.
5. Blanc, Regis, et al. "An overview of the Leon verification system." SCALA, 2013.
6. Bjørner, et al. "Program Verification as Satisfiability Modulo Theories." SMT@ IJCAR, 2012.
7. Bjørner, et al. "Higher-order Program Verification as Satisfiability Modulo Theories with Algebraic Data-types." Higher-Order Program Analysis, 2013.
8. Suter, P., et al. "Decision procedures for algebraic data types with abstractions." POPL, 2010.