Ericpony's Blog: Formal

Showing posts with label Formal. Show all posts

Wednesday, November 15, 2017

Liveness to safety reduction with predicate abstraction

An automata-based approach for LTL model checking

Linear-time Temporal Logic (LTL) can be model checked for finite-state systems using symbolic automata-based techniques [6]. Given a transition system $S$ and an LTL formula $φ$, the model checking problem (denoted as $S ⊨ φ$) asks whether $φ$ is true on all infinite execution paths of $S$. A classical automata-based approach [6] toward this problem consists of building a Buchi automaton $S_{¬φ}$ with a fairness condition $f_{¬φ}$ such that $S ⊨ φ$ iff $S × S_{¬φ} ⊨ FG\ ¬f_{¬φ}$. Therefore, when we model check an LTL formula, we may assume without loss of generality that the automata-based transformation has been applied and consider the problem in form of $S ⊨ FG\ ¬f_{¬φ}$. In the following, we shall drop the subscript $¬φ$ and simply regard $FG\ ¬f$ as the target property to verify.

Liveness-to-safety reduction for finite-state systems

Liveness-to-safety reduction [4, 5] is a technique to translate an LTL model checking problem to an invariant verification problem. The invariant encodes the the absence of counterexamples to $FG\ ¬f$, namely, the system does not have a computation that visits some state $s ⊨ f$ infinitely often. The reduction algorithm augments the states of the system with flags seen, triggered, and loop, as well as maintaining a fresh copy of the state variables. It then simulates the system execution and in the meantime guesses the entry point of a cycle by non-deterministically flagging a state as "seen" and saving the valuation of the state variables. An execution will be "triggered" if it encounters a state $s ⊨ f$ after having flagged a state as "seen". Finally, a loop is detected if a triggered execution enters a seen state. The safety property thus specifies that no loop is detected on any execution. This reduction is sound and complete for finite-state and parameterised systems, since a liveness property is violated in such systems iff a lasso-shaped counterexample is present.

Liveness-to-safety reduction for infinite-state systems

The above liveness-to-safety reduction is unsound for infinite-state systems where an execution can violate a liveness property without visiting the same state twice. To overcome this problem, the authors of [1] propose to incorporate the notion of predicate abstraction in the reduction. The abstraction is supposed to be coarse enough to map a real counterexample (i.e. a path from an initial state that visits the fairness condition infinitely often) to a lasso-shaped abstract path, but at the same time precise enough not to admit spurious abstract cycles. The soundness of this approach is based on the fact that existential abstraction preserves any temporal logic without existential quantification over paths, e.g., $\forall$CTL$^*$ and LTL [7, 8].

Overview of the reduction procedure

Below we introduce the liveness-to-safety reduction proposed in [1] for infinite-state systems. Give a system $S$ and an LTL property $FG\ ¬f$, the procedure generates a sequence of safety verification problems $$S_0 ⊨_{inv} φ_0 ,\quad S_1 ⊨_{inv} φ_1,\quad \dots.$$ We shall denote the encoding of the $i$-th safety property as $$S_i, φ_i := {\rm encode}(S, f, P_i, W_i),$$ where $P_i$ is a set of state predicate, and $W_i$ is a set well-founded relations. Furthermore, the encoding ensures that $S ⊨ FG\ ¬f$ holds if $S_i ⊨_{inv} φ_i$ for some $i$; here $φ_i$ essentially states that cycles consisting of states satisfying $f$ is unreachable. The iteration terminates if some generated safety property is satisfied. Otherwise, suppose we have found $S_i \not⊨_{inv} φ_i$ for some $i$. We analyse a finite counterexample path $π$ of $S_i$ to determine whether it corresponds to an infinite real counterexample for $FG\ ¬f$ in $S$. If so, then we conclude that the property does not hold. Otherwise, if we can conclude that $π$ does not correspond to any real counterexample in $S$, we try to extract a set of new predicates $P'$ and/or well-founded relations $W'$ to produce a refined encoding: $$S_{i+1}, φ_{i+1} := {\rm encode}(S, f, P_i ∪ P', W_i ∪ W'),$$ where $P', W' := {\rm refine}(S_i, π, P_i, W_i)$. It is possible that the algorithm can neither confirm nor refute the existence of real counterexamples, in which case it aborts and returns “unknown”. The algorithm might also diverge and/or exhaust resources in various intermediate steps (e.g. in checking $S_i ⊨_{inv} φ_i$ or during refinement).
In the following, we shall describe the encoding and refinement procedures in more details. We begin with a simplified version that only uses predicates from $P$, i.e., $W = ∅$. We then describe how to extend the encoding to exploit also well-founded relations.

Counterexamples and refinement

Consider a safety checking problem $S_i ⊨_{inv} φ_i$. Each state of $S_i$ is a state of $S$ augmented with flags seen, triggered, and loop as in the case of finite-state system. However, instead of comparing the variation of state variables, the augmented system $S_i$ compares the bit-vector induced by the predicates $P_i$ to determine the equivalence between two states. A counterexample $π$ for a safety property $S_i ⊨_{inv} φ_i$ is a finite sequence of tuples where each tuple consists of an abstract state and a concrete state. The projection of the sequence to abstract states forms a lasso-shaped abstract path, and the projection to concrete states recovers the execution from which this counterexample is obtained. When the invariant is violated, we need to make sure that the obtained counterexample does not result from an insufficient precision of the abstraction induced by $P_i$. To do this, the algorithm first searches for a concrete lasso witnessing a real violation of the LTL property, using standard bounded model checking. If this fails, it tries to prove that the abstract lasso is infeasible by checking an increasing number of finite unrollings of the loop. Upon infeasibility, new predicates can be extracted from sequence interpolants with standard predicate refinement strategies for invariant properties [3].

Enhancement with well-founded relations

In the refinement operation described above, new predicates are extracted at the point when the unrolling of a spurious abstract counterexample becomes infeasible. However, it is possible that all finite unrollings of the abstract loop are feasible, but there is no concrete path executing the loop for an infinite number of times. If this happens, the refinement step might not terminate even though the abstract path cannot be concretised. The following example illustrate one such situations.

Example. Suppose that we want to verify an LTL property $FG\ ¬(x≤y)$ against a transition system $$S=(ℕ \times ℕ, \{(0,y) \mid y≥0\}, \{(x,y) \to (x+1, y)\}).$$ Let $P_0 = \{(x≤y), (y=0)\}$ be the initial set of predicates. Then $S_0 ⊨_{inv} \varphi_0$ admits an abstract counterexample path through the self loop at $\langle(x≤y), ¬(y=0)\rangle$. Unrolling this abstract path, however, will not terminate: any path formula obtained by unrolling the abstract loop for $i$ times is feasible by starting from the concrete initial state $(0,i).$

To address the spurious abstract paths whose finite prefixes are all feasible, we extend the encoding to incorporate well-founded relations over the concrete states. The idea is to block these spurious abstract paths by finding suitable termination arguments in form of (disjunctive) well-founded transition invariants. More precisely, given a finite set $W$ of well-founded relations, we encode $S, φ$ such that a lasso-shaped abstract path leads to a counterexample only if the loop contains at least two concrete states not covered by any relation in $W$. The resulting abstraction is sound as it preserves all concrete infinite paths. Indeed, if there is an infinite path such that any two states on it are covered by some relation in $W$, then by Ramsey's Theorem, there must be a relation in $W$ admitting an infinite descending chain, a contradiction. Below we show how spurious counterexamples can be removed by a suitably chosen well-founded relation.

Example. Consider the spurious abstract counterexample in the previous example. Projecting its loop to concrete states leads to two states $(a,b)$ and $(a+1,b)$ for some integers $a ≤ b$. Including relation $R := \{((x,y), (x',y)) \mid x < x' ≤ y \}$ in $W$ will remove the path from the abstraction and prove the desired LTL property.

k-liveness as a well-founded relation. There are various heuristics to come up with new abstraction predicates and well-founded relations for refinement (see [1] for the details). That said, it is still possible that at some point all heuristics get stuck, failing to produce new predicates or relations to remove a spurious counterexample. In this situation, we may exploit the notion of k-liveness in hope to make further progress. Intuitively, we can maintain an auxiliary integer variable $k$ in the transition system to count the occurrences of $f$. With this counter, the refinement may always discover a new well-founded relation $R := \{ (s, s') ∣ k < k' ≤ n \}$ where $n$ is the number of occurrences of $f$ in the current counterexample. By adding this relation, we allow the procedure to progress by blocking all counterexamples where the occurrences of $f$ is less than or equal to $n$. This heuristic is sound since $f$ occurs infinitely often in a real counterexample.

References

1. Infinite-state Liveness-to-Safety via Implicit Abstraction and Well-founded Relations
2. Counterexample-guided Abstraction Refinement for Symbolic Model Checking
3. Abstractions from Proofs
4. Liveness Checking as Safety Checking
5. Liveness Checking as Safety Checking for Infinite State Spaces
6. An Automata-Theoretic Approach to Linear Temporal Logic
7. Model Checking and Abstraction
8. Property preserving abstractions for the verification of concurrent systems

Friday, October 20, 2017

Verifying liveness property for array systems

Numerous results in the literature of regular model checking have focused on parameterised concurrent systems, which consists of a parameterised number of identical concurrent processes. However, only a handful of these results have addressed parameterised systems allowing concurrent infinite-state processes, such as processes operating on variables over an unbounded data domain. Such kind of systems are also known as parameterised array systems, for a parameterised number of processes can easily be encoded via a parameterised array containing the local states of the processes. Existing results for verifying such systems have mostly dealt with safety properties. For liveness properties, it is sometimes possible to apply an ad-hoc finitary abstraction so that the system under consideration can be handled in the classical context of regular model checking. The following example demonstrates how finitary abstraction may be applied to verify an array system.

Chang-Robert's algorithm is a leader election protocol for a clockwise directed ring of processes where each process has a unique ID. Initially, all processes are active. The algorithm starts from one initiator sending a message to the next process in the ring, tagged with its ID. When an active process with ID $p$ receives a message tagged with $q$, there are three cases: i) $q < p$ : then $p$ drops the message. ii) $q > p$ : then $p$ becomes passive, and passes on the message. iii) $p = q$ : then $p$ becomes the leader. A passive process simply forwards every message it receives. If a process receives a message while it is holding one, the message with a larger ID will take over the another message.
The idea behind Chang-Robert's algorithm is that only the message with the highest ID will complete the round trip and finish the election by the third case. The parameterised version of the algorithm has two sources of infinity: the number of processes and the number of IDs. However, observe that we do not need the exact ID to run the election correctly---it suffices to know which ID is the highest. Hence, we can apply an abstraction that maps the highest ID to 1 and the other IDs to 0. It is clear that a process is elected as the leader in the concrete system iff it is so in the abstracted system. Verification of the liveness property "a leader will eventually be elected" then reduces to a fair termination problem amenable to off-the-shelf regular model checking tools.
Unfortunately, not every parameterised array system can be successfully simplified using a finitary abstraction. The following example provides an evidence for this fact.

Dijkstra's self-stabilising mutual exclusion algorithm assumes a clockwise directed ring of processes $p_0,\dots,p_{N-1}$ with $N\ge 2$. Each process $p_i$ holds an integer variable $x_i$ that can be read by the process $p_{i-1~{\rm mod}~N}$ in the ring. The values of each $x_i$ ranges in $\{0, \dots, K-1\}$, where $K\ge N$. A process is privileged if one of the following two conditions holds:

For $i>0$, $p_i$ is privileged if $x_i \neq x_{i-1}$.
$p_0$ is privileged if $x_0 = x_{N-1}$.

A privileged processes $p_i$ is allowed to update its local variable $x_i$, causing the loss of its privilege:

For $i>0$, $p_i$ can copy the value of $x_{i-1}$ to $x_i$ if $x_{i-1} \neq x_i$.
$p_0$ can set $x_0 \leftarrow (x_0+1)~{\rm mod}~K$ if $x_0 = x_{N-1}$.

Dijkstra's algorithm is self-stabilising in that the systems eventually converges to states where exactly one process is privileged. As before, the parameterised version of Dijkstra's algorithm has two sources of infinity: concurrency and numeric data type. Nevertheless, the liveness property of Dijkstra's algorithm cannot be as easily verified with a finitary abstraction as in the previous example. To see this, consider an abstraction function $\alpha$ that maps the concrete values of the variables $x_0,\dots,x_{N-1}$ to a finite abstract domain. Then for sufficiently large $K$, there is $k\in\{0,\dots,K-1\}$ such that $\alpha(k)=\alpha(k+1)$. Under this abstraction, however, any concrete state where $x_0,\dots,x_{N-1}\in\{k,k+1\}$ will result in a spurious fair path that does not stabilise. Clearly, such kind of spurious counterexamples would appear in any finitary abstraction of the data domain. What if we keep the data intact but instead apply finitary abstraction to the underlying topology? Unfortunately, it turns out that two incomparable processes alone suffice to produce a spurious counterexample.

Instead of verifying the desired property as a whole, we may break it into lemmas and prove them separately. More precisely, we can decompose the liveness property of Dijkstra's algorithm into three lemmas:

Lemma 1: The value of $x_0$ is updated infinitely often.

Lemma 2: Eventually $x_0 \neq x_i$ for all $i \neq 0$.

Lemma 3: Eventually $p_0$ is the only privileged process.

Lemma 1 implies Lemma 2 due to the assumption that $n \le m$ and the pigeonhole principle. Lemma 2 implies Lemma 3 by a simple propagation argument. With appropriate abstraction, these two implication relations can be modelled as liveness properties of simple token-passing rings and are amenable to standard liveness checkers (e.g. [1]) for regular model checking. Below we shall concentrate on how to automate the the proof of Lemma 1.

We first establish an invariant "it always holds that at least one process is privileged". This invariant can be seen by abstracting the value of each $x_i$ to EQ if $x_i = x_0$, and to NE if $x_i \neq x_0$. The invariant then follows from the fact that an abstract state is either in form of [EQ...EQ], where $p_0$ is privileged, or in form of [...EQ NE...], where $p_i$ is privileged for some $i\ge 1$.

With the above invariant, we know that the system never gets stuck. Now we show that $x_0$ is updated infinitely often. Given an infinite run of the system, consider a *maximal* segment $\pi$ of the run such that the status of $x_0$ is not changed on $\pi$ (i.e. $x_0$ is always privileged or unprivileged on $\pi$). We argue that the length of $\pi$ is finite. Consider abstract states w ∈ (P+N)* such that w[i] = P if $p_i$ is privileged, and w[i] = N if $p_i$ is not privileged. Then the transitions on $\pi$ are abstracted to

x·P·N·y → x·N·P·y
x·P·P·y → x·N·P·y
x·P·P·y → x·N·N·y
x·P → x·N

where x ∈ (P+N)(P+N)* and y ∈ (P+N)*. Now it is clear that $\pi$ is finite, since each abstract transition either moves a P rightward, or eliminates at least one P from the state. Therefore the status of $x_0$ will change infinitely often on any infinite run, which implies that $x_0$ is updated infinitely often.

As a result, we can reduce Lemma 1 to a safety property and a termination property of simple token ring protocols, and verify them separately with safety verifiers (e.g. [2]) and termination verifiers (e.g. [3]).

References

1. SLRP: Solver for Liveness of Randomised Parameterised systems

2. Learning to Prove Safety over Parameterised Concurrent Systems

3. Fair Termination for Parameterized Probabilistic Concurrent Systems

Tuesday, June 20, 2017

A note on bisimulation and coinduction

A machine $(X,Y,S,\delta,\lambda)$ is a 5-tuple composed by an input alphabet $X$, an output alphabet $Y$, a state space $S$, a transition function $\delta:S\times X \rightarrow S$, and an output function $\lambda: S\rightarrow Y.$ We assume that $\delta$ is total without loss of generality. Also, its domain be extended to $X^*$ by defining $\delta^*(s,\epsilon)=s$ and $\delta^*(s,wx)=\delta(\delta^*(s,w),x).$ A bisimulation $R$ between two machines $M=(X,Y,S,\delta,\lambda)$ and $M'=(X,Y,S',\delta',\lambda')$ is a relation on $S\times S'$ defined recursively as $$R=\{(s,s'): \lambda(s)=\lambda'(s') \wedge \forall x\in X. (\delta(s,x),\delta'(s',x))\in R\}.$$ The union of all bisimulations is the greatest bisimulation and is often denoted by $\sim$. When $\sim$ is defined for one machine, it forms an equivalence relation and is called bisimilarity. Bisimilarity can be characterised inductively as follows. We define a sequence of equivalence relations $R_0 \supseteq R_1 \supseteq \cdots$ by letting $R_0 = S\times S$ and for $n\ge 1$, $(s, s') \in R_n$ iff for all $x\in X$, $\delta(s, x)$ implies $\exists x'\in X.\ (\delta(s, x), \delta(s', x')) \in R_{n-1}.$ It is easy to verify that the $R_n$ converges to the bisimilarity $\sim$, namely, $\bigcap_{n\ge 0} R_n=\sim$. (To see this, note that $\supseteq$ holds as $R_n \supseteq \sim$ for all $n\ge 0.$ Also, $\bigcap_{n\ge 0} R_n$ is a bisimulation and thus $\subseteq$ holds.) This inductive definition gives a simple fixpoint procedure for computing bisimilarity, see below.

Bisimulation and language equivalence

We call a machine a (deterministic) automaton if its output alphabet, denoted by $\mathcal{B}$, consists of only "rejected" and "accepted", denoted by $\bot$ and $\top$, respectively. It is interesting to note that a bisimulation $R$ between two automata $M$ and $M'$ over the same input alphabet $X$ can be itself described by an automaton $(X,\mathcal{B},R,\delta_R,\lambda_R)$, such that
1. $\delta_R((s,s'), x)=(\delta(s,x),\delta'(s',x))$ iff $(\delta(s,x),\delta'(s',x))\in R,$ and
2. $\lambda_R((s,s'))=\lambda(s)=\lambda'(s').$
Given an automaton $M=(X,Y,S,\delta,\lambda)$, the state language of a state $s$ of $M$ is defined as $L(s)=\{w\in X^{*}:\lambda\circ\delta^*(s,w)=\top\}.$ We call $L(s)$ the language of $M$ when $s$ is the initial state of $M$. The following implication can be shown by induction on the length of words:$$s\sim s' \implies L(s)=L(s').$$
Bisimulation equivalence is also called behavioural or observational equivalence by regarding the observable behaviours of a state as its state language. In the world of finite-state systems, two other notions of equivalence are often used in practice to prove language equivalence. The first notion is graph isomorphism. As the minimal automaton recognising a ($\omega$-)regular language is unique up to graph isomorphism, checking language equivalence amounts to checking graph isomorphism between their minimised automata. The second notion is trace equivalence, which relates two states iff they yield the same set of traces. Trace equivalence and bisimulation equivalence coincide on deterministic machines but the latter is finer in general (e.g. consider $a(b+c)$ v.s $ab+ac$). Here is a comparison among the three relations on finite machines:
Complexity: Bisimulation equivalence is $O(m\log n)$; Graph isomorphism is in NP; Trace equivalence is PSPACE-Complete.
Fineness: Graph isomorphism (w.o. minimisation) $\subseteq$ Bisimulation equivalence $\subseteq$ Trace equivalence $\subseteq$ Language equivalence

The proof principle of coinduction

Coinduction is a mathematical tool to define bisimulation. To demonstrate its use, we present here a technique that proves language equivalence using coinduction [1]. A $w$-derivative of a language $L\subseteq X^{*}$ is defined as $L_{w}=\{u\in X^{*}:wu\in L\}.$ The set of $\mathcal{L}$ languages on $X$ can be defined as an automaton $M_{\mathcal{L}}=(X^{*},\mathcal{L},\mathcal{B},\delta,\lambda)$ such that $\delta(L,x)=L_{x}$ and $\lambda(L)=\top$ iff $\epsilon\in L$. It turns out that $L(l)=l$ for all $l\in\mathcal{L}$, namely the state language of each state in $M_{\mathcal{L}}$ is the state itself. Hence we have$$L\sim L' \iff L=L'.$$This fact provides a uniform way to prove the equivalence of two languages. The more familiar method of proof by induction requires that one start from the equality of the base case (which is the pair of the minimal words in the two languages) and proceed to establish the equivalence of the pairs of longer words. Induction however doesn't work when the languages contain words of infinite length. In contrast, proof by coinduction starts from $\{(L,L')\}$ and continues to include more and more pairs via the transitions until a least fixed point is reached. This fixed point, whenever exists, is a bisimulation relation contained in the bisimilarity $\sim$ and thus implies that $L = L'.$
The principle of coinduction is not always effective—the process of constructing a bisimulation may not terminate even when the target languages are equivalent. This fact can be expected since otherwise the Halting problem would be solvable. On the other hand, the coinduction process always terminates when the two languages under consideration are regular: it either constructs a bisimilarity when they are equivalent, or finds a counterexample when they are not. In the case when the two languages are the same regular language, the process actually constructs the smallest finite automaton recognising them (more details later). This fact is a corollary of Kleene's Theorem, stating that a language is regular iff it is accepted by a finite automaton; see Section 8 of [1] for details. In fact, the most efficient equivalence checking algorithm for regular languages up to date is based on coinduction [4].

Bisimulation and fixed points

Given a machine $M = (X,Y,S,\delta,\lambda)$, one can define a monotone function $$F(A) := \{(s,s')\in S\times S : \forall x\in X\cup\{\epsilon\}.(\delta(s,x),\delta(s',x))\in A\}$$on lattice $(2^{S\times S},\subseteq).$ In the terminologies of fixed-point theory, bisimulations are the post-fixed points of $F$ and bisimilarity is the greatest post-fixed point $\nu X.~F(X).$ (Note: these fixed points collapse when $\delta$ defines deterministic transitions.) In this regard, proof by coinduction can be seen as showing that a property is closed under backward computation and contained in a coinductively defined set (ie. the greatest fixed point). In contrast, induction involves showing a property is closed under forward computation and containing an inductively defined set (ie. the least pre-fixed point). Intuitively, backward means that the set is obtained by keeping refine larger sets until the first (ie. the greatest) fixpoint is reached; and forward oppositely means that the set is obtained by keeping extend smaller sets until the first (ie. the smallest) fixpoint is reached.
The following procedure gives a naive fixpoint computation for the bisimilarity over a finite and non-deterministic machine. The bisimilarity is represented as a functional relation $\rho \subset S\times\mathbb N$ that maps states to labels of the equivalence classes induced by the bisimilarity.

Procedure ComputeBisimilarity
Begin
Set $\rho := \{(s, 1) : s \in S\}$, $\rho' := \emptyset$
While $\rho \neq \rho'$ do
Set $\gamma := \{(s, \{ (x, \emptyset) : \delta(s,x)\in S \}) : s \in S \}$
For all $s, x, s'$ such that $\delta(s,x) = s'$
Set $\gamma(s)(x) := \gamma(s)(x) \cup \{ \rho(s') \}$
Set $id := 0$, $A' := \mathbb N$, $\rho' := \emptyset$
Let $L$ be a listing of $\gamma$ sorted by the second component
For each $(s,A)$ in $L$ do
If $A\neq A'$ then
Set $A' := A$, $id := id + 1$
Set $\rho' := \rho' \cup \{(s,id)\}$
Output $\rho$ as the equivalence classes induced by the bisimilarity
End

The procedure starts from a trivial binary relation where all states are equivalent. In each iteration, the algorithm computes a mapping $\gamma$ from each state to the labelling of its successors induced by the current bisimulation. A state will then be re-labelled according to the labelling of its successors changes. The refinement will continue until the labelling is stable, meaning that the greatest fixpoint (i.e. the unique largest bisimulation) is reached. See this paper for a survey of algorithms for computing bisimulations.
The fact that bisimulation relation can be constructed inductively implies that coinductive and inductive reasoning should yield the same results on inductively defined structures such as lists and trees, and make a difference only on structures that contain cycles or infinite paths. Further, while bisimulation coincides with bisimilarity on deterministic machines (such as $M_{\mathcal L}$), coinduction in general does not pinpoint bisimilarity for non-deterministic machines.

Bisimulation and minimisation

A homomorphism between two machines $M$ and $M'$ over the same input alphabets is any function $f:S\rightarrow S'$ satisfying $f(\delta(s,x))=\delta'(f(s),x)$, namely $$ \array{ S\times X &\overset{f\times id_X}{\longrightarrow}& S'\times X \\ \delta \downarrow & & \downarrow \delta' \\ S &\overset{f}{\longrightarrow}& S' }$$A homomorphism is called unique if given any homomorphism $g$ there exists an isomorphism $h$ such that $f=h\circ g$. The notion of unique homomorphisms is closely related to that of bisimulations. First, a mapping $f:S\rightarrow S'$ is a homomorphism iff $\{(s,f(s)):s\in S\}$ is a bisimulation between machines $M$ and $M'$. Furthermore, given a homomorphism $f:S\rightarrow S'$,
i) If $R$ is a bisimulation on $M$, then $f(R)=\{(f(u),f(v)):(u,v)\in R\}$ is a bisimulation on $M'$. [p.34, 5]
ii) If $R'$ is a bisimulation on $M'$, $f^{-1}(R')=\{(u,v):(f(u),f(v))\in R'\}$ is a bisimulation on $M$. [p.34, 5]
iii) $f$ is unique iff for any homomorphism $g:S\rightarrow S'$, the set $\{(f(s),g(s)):s\in S\}$ is a bisimulation on $M'$.
Unique homomorphism is a powerful utility to study languages and automata. For example, fix an automaton $M$ and consider a mapping $\phi: s \mapsto L(s)$ from $M$ to $M_{\mathcal L}.$ It is straightforward to check that $\phi$ is a unique homomorphism. One interesting fact about $\phi$ is that it identifies precisely the bisimilar states in $M$. That is, $$u\sim v \iff \phi(u)=\phi(v),$$which follows from the fact that $\{(u,v)\in S\times S: \phi(u)=\phi(v)\}$ is a bisimulation on $M$ and that $\{(\phi(u),\phi(v))\in \mathcal{L}\times \mathcal{L}:u\sim v\}$ is a bisimulation on $M_{\mathcal L}.$ Another amazing fact is that $\phi$ effectively minimises automata. To see this, let $\langle s \rangle_N$ (resp. $\langle S \rangle_N$) denote the sub-automaton generated by state $s$ (resp. sets of states $S$), i.e., the automaton induced by the states reachable from $s$ (resp. $S$), in automaton $N$. Also, we lift $\phi$ to a graph homomorphism such that given an automaton $N$, $\phi(N)$ is the sub-automaton of $M_{\mathcal L}$ generated by $\{\phi(s):s\in State(N)\}$. It turns out that$$\langle L(s) \rangle_{M_{\mathcal L}} = \langle \phi(s) \rangle_{M_{\mathcal L}} = \phi(\langle s \rangle_M).$$Since given $L$ we are free to choose any $s$ and $M$ satisfying $L(s)=L$, it follows that
i) If $L$ is regular, then $\langle L \rangle_{M_{\mathcal L}}$ is the canonical minimum automaton accepting $L$, and
ii) $\phi(\langle s \rangle_M)$ is the canonical minimisation of automaton $\langle s \rangle_M$.
In particular, $L$ is accepted by a finite automaton iff $\{L_w: w\in X^*\}$ is finite. This result is equivalent to the well-known characterisation of regular languages by Nerode and Myhill:
Theorem. (Nerode-Myhill) $L$ is accepted by a finite automaton if $R_L$ has a finite index. (Namely there are a finite number of classes in the equivalence relation $R_L$ on $L$ defined by $(u,v)\in R_L$ iff $\forall w\in L. (uw\in L \iff vw\in L).$)

References and further reading

1. Automata and Coinduction (An Exercise in Coalgebra)
2. Introduction to Bisimulation and Coinduction (book) (slides)
3. An introduction to (co)algebra and (co)induction
4. Checking NFA Equivalence with Bisimulations up to Congruence
5. Universal Coalgebra: A Theory of Systems
6. Bisimulation and Language Equivalence

Saturday, May 6, 2017

A note on recognisability and definability

Definitions. An $n$-ary relation is associated with a set of $n$-tuples; an $n$-ary function $f$ is associated with a set of $(n+1)$-tuples $\{(a_1,...,a_n,f(a_1,...,a_n)\}$, which is called the graph of function $f.$ A sequence $s_1,s_2,...$ over $A$ is associated with a function $s:\mathbb N^n\rightarrow A$ such that $s(i)=s_i.$ Given an integer $n$, we use $(n)_p$ to denote its p-ary expansion; given a p-expansion $w\in\{0,...,p-1\}^*$, we use $[w]_p$ to denote its value $\sum_{i=0}^{|w|-1}p^{|w|-1-i}w[i].$ By convention, we set $(0)_p=\epsilon$ and $[\epsilon]_p=0.$

Recognisable Sets

Definitions. A set is called regular or recognisable if it can be recognised by a finite DFA. For $L\subseteq\Sigma^*$, $\sim_L$ is called a right congruence iff $u\sim_L v \iff uw\sim_L vw$ for all $w\in\Sigma^*.$ Note that a right congruence is an equivalence relation.

Myhill-Nerode Theorem. A set $L$ is regular iff $\sim_L$ has finite index over $\Sigma^*.$

Definitions. (p-recognisability) A set of integers is called p-recognisable iff the p-ary expansion of the integers is recognised by a finite automaton with alphabet $\{0,...,p-1\}.$ The automaton would read a word from left to right and ignores the leading zeros, so that if it accepts $w$ then it accepts all words in $0^*w.$
Fact. A sequence $s_1,s_2,...$ over $A$ is p-recognisable if $s^{-1}(a)$ is p-recognisable for each $a\in A.$ When $A$ is finite, it is easier to define a sequence in this way.

Definitions of recognisability generalise to n-ary relations naturally by replacing the alphabet $\{0,...,p-1\}$ to $\{0,...,p-1\}^n.$ All properties and facts listed above still hold in the general case.

Definitions. (p-automata) A p-automaton recognising a sequence $s_1,s_2,...$ over a finite domain is a complete deterministic Moore machine that reads $(n)_p$ and outputs $s_n$ for all $n\ge 1.$ In such case the sequence is called p-recognisable. Note that p-recognisability of sequences generalises p-recognisability of sets, since a set is p-recognisable iff there exists a p-automaton that outputs 1 on $(n)_p$ if $n$ belongs to the set and outputs 0 otherwise. Alternatively, we can describe a set of integers using a characteristic sequence. The characteristic sequence of a set $S$ refers to a sequence over $\{0,1\}$ where the nth bit is 1 iff $n\in S.$ A set is p-recognisable iff its characteristic sequence is p-recognisable.
A p-automaton can recognise an n-ary relation by recognising an n-dimensional sequence $s:\mathbb N^n\rightarrow\{0,1\}.$ The input language $((a_1)_p,...,(a_n)_p)$ is cleverly stipulated as a subset of $(\{0,...,p-1\}^n)^*$ instead of $(\{0,...,p-1\}^*)^n.$ Namely, components of a relation convoluted over alphabet $\{0,...,p-1\}^n$. In this way, theorems such as Myhill-Nerode still hold for $n\ge2.$

Definable Sets

Definitions. A set $L$ of $n$-tuples is called (first-order) definable by a formula $\phi$ in a first-order structure $\mathcal S$ iff $\phi$ has exactly $n$ free variables and $L=\{(a_1,...,a_n)\in D^n :$ $S\models \phi(a_1,...,a_n)\}$, where $D$ is the domain of $\mathcal S.$ A set is definable in a structure $\mathcal S$ if it is definable by a formula in $\mathcal S.$ When the structure is clear in the context we simply called a set definable. A sequence $s_1,s_2,...$ over $A$ is definable if the graph of the function $s:\mathbb N\rightarrow A$ associated with the sequence is definable.
Fact. A sequence $s_1,s_2,...$ over $A$ is definable iff $s^{-1}(a)$ is definable for each $a\in A.$ When $A$ is finite, it is easier to define a sequence in this way.

Definitions. (p-definability) Given $x\in\mathbb N$, define $V_p(x)\equiv$ the largest power of $p$ that divides $x.$ Let $V_p(0)\equiv 1.$ A subset of $\mathbb N$ is called $p$-definable if it is definable in $[\mathbb N,+,V_p].$ Given $x\in\mathbb N$, define $V_p(x)\equiv$ the largest power of $p$ that divides $x$ and $V_p(0)\equiv 1.$ Define a predicate $P_p$ that $P_p(x)$ holds iff $x$ is a power of $p.$ Define a binary relation $B_p$ such that $B_p(x,y)$ holds iff $y$ is a power of $p$ occurring in the p-ary expansion of $x.$ (Thus $B_p(x,y)$ iff $y<x$, $P_p(y)$, and there exits a $0\le z <y$ such that $y|(x-z).$)

Great efforts were made in the history to characterise 2-recognisable sets in an FO structure. Three candidate structures were considered for this purpose in order: $[\mathbb N, +, P_2]$, $[\mathbb N, +, B_2]$, and $[\mathbb N, +, V_2].$ We note two facts here about these structures:
Fact. $[\mathbb N, +, B_p]$ and $[\mathbb N, +, V_p]$ are (FO-)interpretable in each other.
Fact. $[\mathbb N, +, P_p]$ is strictly weaker than $[\mathbb N, +, V_p].$
To see the first, note that for $x>0$, $V_p(x)=y$ iff $B_2(x,y)\wedge \forall 0\le z<y.\ \neg B_p(x,z)$, and that $B_p(x,y)$ holds iff $\exists 0\le z <y.\ V_p(x-z, y).$ The second fact follows from an interesting decidability result. Let $p^x$ denote the exponential function. It is known that $[\mathbb N,+,V_p]$ and $[\mathbb N,+,p^x]$ are decidable, but $[\mathbb N,+,p^x,V_p]$ is undecidable. Clearly $P_p$ can be defined by $V_p$ and $p^x.$ If $[\mathbb N, +, P_p]$ is equal to $[\mathbb N, +, V_p]$, then $V_p$ will be definable by $p_x$, which implies $[\mathbb N,+,V_p]$ and $[\mathbb N,+,p^x,V_p]$ are mutually interpretable. This is impossible since the first is decidable and the second is not. The ultimate characterisation of 2-recognisable sets is given by the following theorem.
Theorem I. For any prime number p, a set is p-definable iff it is p-recognisable.
Proof. ($\Rightarrow$) It suffices to show that given any formula $\phi$ from a structure $\mathcal S$ equivalent to $[\mathbb N,+,V_p]$, we can construct a p-automaton $A_\phi$ recognising the set definable by $\phi$, i.e., $\mathcal S \models \phi(a_1,...,a_n)$ implies that $A_\phi$ accepts $([a_1]_p,...,[a_n]_p).$ To simply our proof, we choose $\mathcal S=[\mathbb N,R_1,R_2]$, where $R_1(x,y,z)$ holds iff $x+y=z$ and $R_2(x,y)$ holds iff $V_p(x)=y.$ Now we prove the claim by structural induction. First note that the three atom formulas $x=y$, $R_1(x,y,z)$, and $R_2(x,y)$ in $\mathcal S$ are p-recognisable (Exercise!). To complete the induction, it suffices to show that we can construct p-automata $A_{\neg\phi}$, $A_{\phi\vee\psi}$ and $A_{\exists x\phi}$ from p-automata $A_\phi$ and $A_\psi.$ Negation construction is immediate since p-automata is closed under complementation. For union, we modify the transition functions of the two p-automata so that they operate over the same set of input variables. For instance, suppose the free variables in $\phi$ are $x_1,...,x_m,y_1,...,y_n$ and the free variables in $\psi$ are $y_1,...,y_n,z_1,...,z_r.$ Then after the modification, the transition functions of the p-automata have the same domain $(x_1,...,x_m,$ $y_1,...,y_n,$ $z_1,...,z_r).$ The union of the two modified automata gives $A_{\phi\vee\psi}.$ For projection, to construct $\exists x.\phi(x,x_1,...,x_n)$ we just relabel the transitions of the p-automaton by replacing $(a,a_1,...,a_n)$ with $(a_1,...,a_n).$ Note that we may need to add new loops after the relabelling to recognise leading zeros caused by the projection. See [3] for details of this proof.

Corollary. The theory of $[\mathbb N,+,V_p]$ is decidable.
Proof. Given a sentence in $[\mathbb N,+,V_p]$, we can rewrite it to the form of either $\exists x.\phi(x)$ or $\neg\exists x.\phi(x).$ By Theorem I, $\phi(x)$ is p-recognisable. Hence we can construct an finite automaton $A$ that accepts a word $w$ iff $[\mathbb N,+,V_p]\models\phi([w]_p).$ Hence the truth of the sentence can be checked by checking emptiness of a recognisable language, which is decidable.

Corollary. The set of squares $\{n^2:n\in\mathbb N\}$ is not p-recognisable.
Proof. Suppose to the contrary that the set of squares is p-recognisable. By Theorem I, there exists a formula $\phi$ in $[\mathbb N,+,V_p]$ such that $\phi(x)$ holds iff $x$ is a square. Define $$\psi(c,d):=\phi(d)\wedge\forall d'.(d'>c\wedge\phi(d')\implies d'\ge d),$$ which asserts that $d$ is the smallest square larger than $c$. Observe that the set $\{(x,y)\in\mathbb N^2: y=x^2\}$ is p-definable by $$\lambda(x,y):=\phi(y)\wedge\exists a.\ \psi(y,a)\wedge a=y+2x+1.$$ Also, the set $\{(x,y,z)\in\mathbb N^3: z=xy\}$ is p-definable by formula
$\exists a,b,c.\ \lambda(a,x)\wedge\lambda(b,y)\wedge\phi(c)\wedge c=a+b+2z \wedge\exists d.\ \psi(c,d)\wedge d=c+2x+2y+1$
The validity of this formula exploits the fact that $c=(x+y)^2$ if $c$ is a square and the smallest square greater than $c$ is $c+2x+2y+1$ for some $x,y\in\mathbb N$. It then follows that multiplication is p-definable, which is impossible since $[\mathbb N,+,\cdot]$ is not decidable but $[\mathbb N,+,V_p]$ is.
(To be continued)

References and further reading

1. Recognizable sets of integers, Michel Rigo.
2. Languages, Automata, and Logic, Wolfgang Thomas.
3. Logic and p-recognizable sets of integers, V. Bruyère et al.

Wednesday, April 26, 2017

Deadlock-freeness, safey, and liveness

Deadlock-freeness of a concurrent systems refers to the property that at any time point, at least one process in the system is allowed to take an action. In other words, all reachable states of a deadlock-free system have a successor. At the first glance, deadlock-freeness is a safety property and can be handled as such with safety verifiers by marking all states with no successors as unsafe. However, there is a subtle fact about deadlock-freeness that make it different from safety.
Recall that the safety of a transition system $T$ can be proved by verifying a system with richer behaviours than $T$. Many tools exploit this fact to compute compatible over-approximation for safety verification. Hower, a sound over-approximation for safety properties may not be sound for deadlock-free properties, since deadlock-freeness is not preserved by disabling transitions. Hence, deadlock-free properties are in general not suitable to be verified with conventional safety verifiers as that may lead to unexpected results.
That said, if we are able to determine the set of deadlock-free states a priori, it is still possible to model deadlock-freeness as a safety property (eg., see [1]). For instance, we can mark all states with no successors as unsafe, and check if the system remains forever in safe states. The marking are static and will not be changed by adding or removing transitions. (In fact, using marking one may even model a liveness property as a safety property. But this doesn't simplify the problem since coming up with correct marking would require to check the liveness property first.)

References

1. Alpern, B. "Defining liveness", 1985.
2. A. Biere, C. Artho, and V. Schuppan. "Liveness checking as safety checking", 2002.
3. V. Schuppan and A. Biere. "Efficient reduction of finite state model checking to reachability
analysis", 2004.
4. V. Schuppan and A. Biere. "Liveness checking as safety checking for infinite state spaces", 2006.

Thursday, April 6, 2017

Existence of sets that are regular but not computable

A regular set is called computable if one can construct a finite-state automaton to recognise it. Given a finite alphabet $\Sigma$, a rational transition system is a tuple $(I, T)$ comprised of a regular set $I\subseteq\Sigma^*$ as the initial states and a rational relation $T\subseteq\Sigma^*\times\Sigma^*$ as the transition relation. We will use $T^k$ to denote the rational relation formed by composing $T$ for $k$ times, and $T^*$ to denote the reflexive and transitive closure of $T$. In this way, $T^*(I)$ will represent the reachability set of system $(I,T)$, i.e., the set of words that can be reached from the words in $I$ through a finite number of transitions.
It is clear that $T^k(I)$ is regular and computable for any $0\le k<\infty.$ It however may be less obvious that the reachability set $T^*(I)$ is not necessarily regular. To see this, consider a system $\Sigma\equiv\{a,b\}$, $I\equiv\{(ab)^n:n\ge 1\}$ and $T\equiv\{(xbay, xaby): x,y\in \Sigma^*\}.$ Then $T^*(I) \cap a^*b^* =\{a^nb^n:n\ge1\}$ is a well-know non-regular set, which implies that $T^*(I)$ is non-regular as well. On the other hand, there are rational transition systems of which the reachability sets are guaranteed to be regular, such as pushdown systems. In fact, the reachability relation of a pushdown system is regular and poly-time computable [12]. Surprisingly, $T^*(I)$ it is not necessarily computable even when it is regular. Namely, given a regular language $A$ and a rational transition system $(I, T)$, it is undecidable to check where $T^*(I)=A$ even though the former is known to be a regular set. This post will try to present a self-contained argument about the existence of such systems.
Subword ordering. Our argument would centre around a reflexive and transitive relation called "a subword of": given $x,y\in\Sigma^*$, $x\preceq y$ holds iff $x$ can be obtained by removing some segments from $y.$
Lemma. (Higman) A set is finite if all elements in it are mutually incomparable w.r.t. $\preceq.$
Proof. For $x,y\in\Sigma^*$, let $x\sim y$ iff there are distinct alphabets $a_1,...,a_n$ in $\Sigma$ such that both $x$ and $y$ match $a_1^+a_2^+...a_n^+.$ It is clear that $\sim$ is an equivalence relation splitting $\Sigma^*$ into $\binom{N}{1}1! + \binom{N}{2}2! + ... + \binom{N}{N} N!$ classes where $N=|\Sigma|$, and two words are comparable iff they are in the same class. ☐
Given a set $X\subseteq \Sigma^*$, define the upward closure of $X$ as $uc(X)\equiv\{w:\exists w'\in X.\ w'\preceq w\}.$ We say $X$ is upward closed iff $X=uc(X).$ Higman's lemma tells us that every upward closed set have a finite representation.
Lemma 1. Every upward closed set is regular.
Proof. Note that for any $X\subseteq\Sigma^*$, elements in $min(X)\equiv\{w\in X: \forall w'\in X.\ w'\not\preceq w\}$ are mutually incomparable. It follows from Higman's lemma that $min(X)$ is finite. Also, $X$ is upward closed iff $X=uc(X)=\bigcup_{w\in min(X)}wc(w)$, which implies $X$ is regular since $min(X)$ is finite and $uc(w)$ is regular for every $w\in\Sigma^*.$ ☐
Another consequence of Higman's lemma is that every increasing (in the sense of set inclusion) sequence of upward closed sets is finite.
Lemma 2. If $X_0\subseteq X_1\subseteq X_2\subseteq...$ and $X_i=uc(X_i)$ for all $i\ge0$, then there exists a finite $n$ such that $X_k=X_n$ for all $k\ge n.$
Proof. Let $X\equiv \bigcup_{i\ge 0} X_i.$ Since $min(X)$ is finite by Higman's lemma, there exists $n$ such that $X_k\cap min(X)=\emptyset$ for all $k\ge n.$ But this means $X_k\subseteq X_n$ for all $k\ge n.$ ☐
Lemma 3. For every infinite sequence of words $w_0,w_1,...\in \Sigma^*$, there exists $i<j$ such that $w_i\preceq w_j.$
Lemma 4. For any sets $X_0,X_1,...\subseteq\Sigma^*$, $uc(\bigcup_{i\ge 0}X_i)=\bigcup_{i\ge 0}uc(X_i).$
Lossy channel systems. We briefly introduce the concept of lossy channel systems. A channel system with $n$ queues are like a pushdown system with $n$ stacks except that it uses FIFO queues instead of FILO stacks. Channel systems are more powerful than pushdown systems, because they only need one queue to simulate Turing machines while pushdown systems need two stacks. A lossy channel system is a channel system where contents in the queues can be spontaneously lost. Technically, the capability of losing data can be introduced to a channel system by adding a non-deterministic self loop to each control state that removes an arbitrary alphabet from the queues.
Note that channel systems (lossy or not) can be expressed as rational transition systems where a configuration with n queues is encoded as a word in form of [control state]#[contents of the 1st queue]#...#[contents of the nth queue]#. We can define a subword ordering $\preceq_L$ over the configurations of a channel system as follows:

$c$#$w_1$#$...$#$w_n$# $\preceq_L$ $c'$#$w_1'$#$...$#$w_n'$#$\quad$ iff $\quad c=c'$ and $w_i\preceq w_i'$ for $i=1,...,n.$

Since the number of control states is finite, Lemma 1~3 also holds for $\preceq_L.$
Lossy channel systems enjoy many good properties. Among them, two properties are of particular interest for our establishment:
Fact 1. If $u\rightarrow^* w$ then $u\rightarrow^* w'$ for all $w'\preceq w.$ It follows that the complement of the reachability set is upward closed w.r.t. $\preceq_L.$
Fact 2. For any set of configurations $C$, the set $pre^*(C)\equiv\{w:\exists w'\in C.\ w\rightarrow^* w'\}$ of its predecessors is upward closed w.r.t. $\preceq_L.$
Theorem 1. The reachability set of a lossy channel system is regular.
Proof. This theorem is an immediate consequence of Fact 1, Lemma 1, and the fact that regular sets are closed under complementation. ☐
Given a system $(I, T)$ and a regular set $B\subseteq\Sigma^*$, the reachability problem asks whether $T^*(I)\cap B \neq\emptyset$, i.e., words in $B$ are reachable from words in $I$ through a finite number of transitions.
Theorem 2. The reachability problem of a lossy channel system is decidable.
Proof. Historically, this result was first proved in [6] using backward analysis as follows. Define the backward transition function $pre(C)\equiv\{w:\exists w'\in C.\ w\rightarrow w'\}$, which would preserve regularity when the forward transition relation is regular. Given a regular set $B$ of the bad configurations, let $X_0\equiv B$ and $X_{n+1}\equiv X_n\cup pre(X_n)$ for $n\ge 0$. Define $X\equiv\bigcup_{i\ge 0}X_i$. Note that $X$ is the set of configurations backward reachable from $B$, and thus is upward closed by Fact 2. By Lemma 4, we can construct another sequence $X_0',X_1',...$ converging to $X$ by defining $X_n'\equiv uc(X_n).$ Each $X_{n+1}'$ can be computed from $X_n'$ for $n\ge0.$ Hence, by Lemma 2, the fixed point $X$ can be computed within a finite number of steps. Since $T^*(I)\cap B\neq\emptyset$ if and only if $X\cap I\neq\emptyset$, the reachability problem is decidable. ☐
Remark. The above proof actually gives an algorithm to construct the reachability set. However, the algorithm relies on Higman’s Lemma for termination and thus the time complexity is not primitive recursive [8,9].

Now we proceed to show that the reachability set of a lossy channel system, albeit regular, is not computable. Given a system $(I,T)$ and a word $w$, the recurrent state problem asks whether there exists an infinite trace from $I$ that visits $w$ infinitely often.
Theorem 3. [6] The recurrent state problem is undecidable for lossy channel systems.

To solve the recurrent state problem it suffices to check $w \in T^*(I)$ and $w \in T^*(\{w\})$. Hence, if these two questions are decidable then so is the recurrent state problem, leading to a contradiction with Theorem 3. Therefore, we can conclude that there exists no procedure that computes a finite-state automaton (or any symbolic representation allowing effective membership query) for $T^*(I)$.
Corollary. There exists a regular transition system $(I,T)$ such that $T^*(I)$ is regular but not computable.
Remark. Does the above corollary holds when $T$ is length-preserving? I didn't find any result about this at the time I was writing this post.

Discussion. The proof of Theorem 2 relies on properties of the subword ordering (or a well-quasi-ordering in general) to assert that backward search always terminates. In fact, reachability can be shown both r.e. and co-r.e. whenever a regular proof exists: just launch two procedures at the same time, one enumerating all possible counter-examples and one enumerating all possible proofs. It is clear that exactly one of the two procedures will terminate. (Here a counterexample is an execution path from $I$ to $B$, and a proof is a regular set $P$ satisfying the following three proof rules: i) $I\subseteq P$, ii) $T(P)\subseteq P$, and iii) $P\cap B=\emptyset.$) In particular, reachability is decidable for a transition system whenever the set of reachable configurations is regular.
More generally, the above argument works whenever a "regular" symbolic proof exists and the one-step image $T(X)$ or pre-image $T^{-1}(X)$ can be computed effectively from $X$. The "regularity" here refers to the existence of any finitary representations that are computable and closed under complement and intersection. This argument has been applied to prove decidability results for transition systems such as Petri nets and lossy counter systems [10]. See [11] for a general discussion on this kind of proof techniques.

References and further reading

1. Transitive Closures of Regular Relations for Verifying Infinite-state Systems, 2000.
2. Unreliable Channels are Easier to Verify than Perfect Channels, 1996.
3. On-the-fly Analysis of Systems with Unbounded, Lossy FIFO Channels, 2005.
4. Effective Lossy Queue Languages, 2001.
5. Undecidable Problems in Unreliable Computations, 2003.
6. Undecidability of Verifying Programs with Unreliable Channels, 1996.
7. Verifying Systems with Infinite but Regular State Spaces, 1998.
8. Post Embedding Problem is not Primitive Recursive, 2007.
9. The Ordinal Recursive Complexity of Lossy Channel Systems, 2008.
10. Lossy Counter Machines Decidability Cheat Sheet, 2010.
11. A unified approach for studying the properties of transitions systems, TCS, 1982.
12. On the regular structure of prefix rewriting. TCS, 1992.

Monday, February 20, 2017

DFA as proof for (recurrent) unreachability

Is this post we discussed how to prove unreachability problem $f^*(I)\cap B=\emptyset$ in an infinite-state transition system. In particular, we used the L* learning algorithm to learn a regular inductive invariant. In order to answer membership queries, we put a restriction that the transition relation $f$ is length-preserving, such that it is decidable to check whether a given configuration is reachable. Is putting this restriction the only way to make learning a proof possible? In this post, we shall discuss an elegant learning framework that can be applied to prove not only unreachability (aka. safety verification) but also recurrent unreachability (aka. liveness verification). We begin with briefly recapitulating the definitions.

Learning to prove unreachability for non-length-preserving systems

Given a finite set $\Sigma$ and a binary relation $f:\Sigma^* \to \mathcal P(\Sigma^*)$, define $F:\mathcal P(\Sigma^*) \to \mathcal P(\Sigma^*)$ by $F(S):= \bigcup_{v\in S}f(v).$ A binary relation is called regular if it can be accepted by an FA over $(\Sigma\cup\{\epsilon\})\times(\Sigma\cup\{\epsilon\}).$ Given two regular sets $I$ and $B$ and a regular relation $f$, the unreachability problem $(I,F,B)$ asks whether $B$ is unreachable from $I$ through $F$, namely, $F^*(I)\cap B=\emptyset.$ A proof for unreachability is a set $P\subseteq \Sigma^*$ satisfying $I\subseteq P$, $f(P)\subseteq P$, and $P\cap B=\emptyset.$ Such a set exists iff $B$ is unreachable from $I$.
Suppose we want to find a proof using the L* learning algorithm. In order to make membership queries decidable, Vardhan [1] proposed to learn, instead of the set of configurations, the set of configuration-witness pairs, where a witness refers to the number of steps needed to reach a configuration from the initial configurations. More precisely, in Vardhan's framework, the proof we aim to learn is a regular subset $W$ of $\Omega:=(\Sigma\cup\{\bot\})\times\{0,1\}$ ($\bot$ is the padding symbol). Each word in $W$ encodes a pair $(c,n)$ such that $c$ is a configuration and $c\in F^n(I)$. We shall use $[c,n]$ to denote the encoding of $(c,n).$ Now, given $I,f,B$, we shall consider an unreachability problem for $I',f',B'$ such that
$\begin{eqnarray*}
I' & := & \{[c,0]: c\in I\};\\
f'(S) &:=& \{[f(c),n+1]: [c,n]\in S\};\\
B' &:= & \{[c,n]:c\in B, n\ge 0\}.
\end{eqnarray*}$
Note that if the sets $I,f,B$ are regular then so are the sets $I',f',B'$.
It is straightforward to learn a proof for unreachability problem $(I',F',B')$ using L*: membership query is decidable thanks to the presence of witness, and equivalence query can be answered as we did before, aiming to find an inductive invariant as an over-approximation. After L* learns a proof for the new problem, we obtain a proof for the original problem by projection. However, note that the transformation from configurations to configuration-witness pairs doesn't preserve regularity. Hence, the transformed problem $(I',F',B')$ doesn't necessarily have a regular proof even when the original problem $(I,F,B)$ does (in which case the problem is decidable). In particular, our previous technique guarantees to terminate when $F^*(I)$ is regular, while the technique presented here doesn't. It would be interesting to characterise the classes for which this technique is effective, namely, the classes where regularity of $F^*(I)$ implies that of $F'^*(I')$.

Learning to prove recurrent unreachability for length-preserving systems

The notion of "learning configuration-witness pairs instead of configurations" can also be applied to learn proofs for recurrent unreachability. Given $I,f,B$ as before, we say $B$ is recurrently reachable from $I$ iff there exists $w\in I$ such that the trace $f*(w)$ visited $B$ infinitely often. To prove recurrent unreachability, a witness needs to make it decidable to check whether a configuration is visited i.o. from $I$. This doesn't look possible as a single witness would then need to encode an infinite amount of information. (Recall that a witness for reachability only needs to encode a finite path.) For length-preserving systems, however, a counterexample to recurrent reachability is a lasso-shaped path and hence has a finite encoding. In fact, we can apply the well-known liveness-to-safety reduction for regular model checking to translate a recurrent reachability problem to a unreachability problem. More precisely, given a length-preserving transition system $(I,T,F)$ over alphabet $\Sigma$, we can specify a unreachability problem $(I',T',B')$ over alphabet $\Sigma\times\Sigma$ where
$I' := F\times F$,
$T' := \{((u,v),(u,w)) : u\in\Sigma^* \wedge |u|=|v| \wedge (v,w) \in T\}$, and
$B' := (T^*(I) \cap F)\times(T^*(I) \cap F)$.
Observe that $B'$ is reachable from $I'$ in $(I',T',B')$ iff $F$ can be visited infinitely often from $I$ in $(I,T,F)$. Even though $B'$ is in general not computable, the unreachability problem is still amenable to the automata learning method introduced in our last post. To see this, suppose that the learner has made an equivalence query for a candidate proof $P$. If either $T'(P)\subseteq P$ or $I'\subseteq P$ fails, a counterexample for the query can be computed as before. To check whether $P ∩ B = ∅$, we first check whether there is some $y \in P \cap (F\times F)$. If no such $y$ exists, then $P$ is a proof for the unreachability problem. Otherwise, we proceed to check whether $y \in T'^*(I')$, which is decidable since $T'$ is length-preserving. If $y \in T'^*(I')$, then $y$ is a witness for the fact that $B'$ is reachable from $I'$. Otherwise, we can use $y$ as a counterexample for the equivalence query.

Tuesday, January 31, 2017

DFA as proof for unreachability

Preliminaries. Given a finite set $\Sigma$ and a binary relation $f:\Sigma^* \to \mathcal P(\Sigma^*)$, we would use $F$# to denote the set function $\mathcal P(\Sigma^*) \to \mathcal P(\Sigma^*)$ defined by $F(S):= \bigcup_{v\in S}f(v).$ Also,$$F^0(S):= S;\quad F^{n+1}(S):= F(F^n(S))\cup F^n(S);\quad F^*(S):= \bigcup\nolimits_{n\ge 0} F^n(S).$$Intuitively, $F^*(S)$ is the set reachable from $S$ by applying $f$ to $S$ finitely many times. A binary relation is called regular if it can be recognised by a finite automaton over alphabet $(\Sigma\cup\{\epsilon\})\times(\Sigma\cup\{\epsilon\}).$
Unreachability analysis. Suppose $I$ and $B$ are two regular sets and $f$ is a regular relation. The unreachability problem $(I,F,B)$ asks whether $B$ is unreachable from $I$ through $F$, namely, $F^*(I)\cap B=\emptyset$. In software verification, many safety properties of a software can be modelled as (un)reachability problems [1] for an infinite state system. Particularly, the semantics of a program is modelled by a binary relation over configurations $\Sigma^*.$ If we let $I$ and $B$ denote the sets of the initial and the bad configurations, respectively, then $F^*(I)\cap B=\emptyset$ means that the execution of the program would never go wrong.
Tractability. Given $(I,F,B)$, $F^*(I)$ is not necessarily regular even though $F^n(S)$ is regular for all $n\ge0.$ Also, most interesting properties about $F^*(S)$ are undecidable. This fact is not surprising as the transition relation of a Turing Machine is regular*. To make the verification problem more tractable, we assume that $f$ is length-preserving, i.e., $w\in f(u)$ only if $|w|=|u|$, which weakens the expressive power of the transition system $(I,f)$ from a Turing machine to a linear-bounded automaton (LBA), which is only allowed to work with the amount of space occupied by the input string. Membership of $F^*(I)$ becomes decidable with the length-preserving restriction, but unreachability remains undecidable, as $LBA\cap DFA=LBA$ and checking emptiness of LBA is undecidable. This follows from the fact one can encode a given TM $T$ as an LBA $A$ such that $L(A)$ consists of precisely the halting runs of $T$.
Remark. While a regular transition system is as powerful as a TM (which can compute functions like $f(w)=w\#w$), not all transition systems are regular. For example, transition systems induced by higher-order pushdown automata allow transitions like $w \to w\#w$ and are thus not regular.
Regular proof synthesis. While the problem is undecidable, we can given a positive answer to it as long as we can find a proof . Such a proof can be a set $S$, called an inductive invariant, that satisfies $B\cap S=\emptyset$ and $F^*(I)\subseteq S.$ Recall that in this post we introduced L*, an algorithm that can learn a DFA by querying the target language. Below, we try to construct a teacher for L* to learn a regular proof for unreachability problem $(I,F,B).$ Upon a membership queries for word $w$, the teacher checks if $I$ is reachable from $w$ through $f^{-1}$. This check is decidable thanks to the assumption that $f$ is length-preserving, which guarantees that a word is only reachable from a finite number of words. Upon an equivalence query for DFA $A$, the teacher performs the following 4 steps in turn:

Step 1. If $\exists w\in I\setminus L(A)$ then return $w$. // check that $I\subseteq L(A)$)
Step 2. If $\exists w\in L(A)\cap B$ then halt and report "reachable" if $Mem(u)=yes$; return $w$ otherwise. // check that $L(A)$ doesn't overlap $B$
Step 3. If there exist $u\in L(A)$ and $w\in f(u)\setminus L(A)$, then return $w$ if $Mem(u)=yes$ and return $u$ otherwise. // check that $L(A)$ is inductive
Step 4. Halt and report "unreachable".

If a candidate DFA $A$ passes Step 1-3, then $L(A)$ is clearly an inductive invariant. Observe that each counterexample falls in the symmetric difference of $L(A)$ and $F^*(I).$ When $F^*(I)$ is regular, it follows from the complexity results of L* that an inductive invariant can be learnt with at most $n$ equivalence queries, where $n$ is the number of states in the minimal DFA of $F^*(I)$, and at most $m\!\cdot\!(n+1)$ membership queries, where $m$ is an upper-bound on the length of the counter-examples returned by the teacher. Here we have several notes in order:
On the length-preserving restriction. We have assumed that the transition relation preserve lengths. In this way, a bad configuration is reachable from some initial configuration if and only if it can be reached within a bounded memory. This restriction is not essential for practical unreachability analysis as long as we allow "padding" in the initial set. That is, for every word in the initial configurations, say $w$, we would pad it with an arbitrary number of placeholders, say $\bot$, so that $w\in I$ implies $w\bot^*\subseteq I$. In this way, the size of available memory for computing $w$ is finite but effectively unbounded. On the other hand, though length-preserving relations are powerful enough to model many useful transition systems, they cannot model systems that require $\epsilon$ in their alphabet, such as lossy channel machines.
Minimality of proof. While L* always learns a minimal DFA by the Myhill-Nerode theorem, here it does not necessarily learn the minimal inductive invariant in terms of set inclusion. Instead, depending on the counterexamples returned by the teacher, L* would try to learn one of the inductive invariants and, when it found one, it learns the minimal DFA for that particular invariant.
Completeness of the method. When $F^*(I)$ is regular, L* is guaranteed to find a proof. When $F^*(I)$ is not regular, however, our method may not find a regular proof even when one exists. On the other hand, checking unreachability is in fact decidable when a regular proof exists, since we can launch two procedures at the same time, one looking for a counterexample and one looking for a proof. From this aspect, L* learning is weaker in completeness than brute force enumeration. (Note that finding a DFA for $F^*(I)$ may still be undecidable even when $F^*(I)$ is regular.)
A non-terminating example. Consider an unreachability problem $(I,F,B)$ where $I:=\{(ab)^n:n\ge 1\}$, $f:= \{(xbay, xaby): x,y\in \{a,b\}^*\}$, and $B$ is some trivial set such as $\emptyset.$ Note that $F^*(I)$ is not regular, since $F^*(I)\cap a^*b^*=\{a^nb^n:n\ge1\}$ is not regular. Now consider a Teacher implementing Step 1-4 for equivalence queries that would return a shortest counterexample for the candidate proof. Below we list the queries and the automata $A$ learnt by L* in each iteration.
Iteration 1: $A=\emptyset$; containment check of $I$ fails with counterexample $ab$; add $ab.$
Iteration 3: $A=a(ba)^*b$; inductiveness check fails at $abab \to aabb$; add $aabb.$

digraph finite_state_machine {
rankdir=LR;
size="8,5"
node [shape = doublecircle]; 3 ;
node [shape = circle];
1 -> 2 [ label = "a" ];
2 -> 3 [ label = "b" ];
3 -> 2 [ label = "a" ];
"init" [shape = point];
"init" -> 1;
}

Iteration 4: $A=a(ba|ab)^*b$; inductiveness check fails at $aababb \to aaabbb$; add $aaabbb.$

digraph finite_state_machine {
rankdir=LR;
size="8,5"
node [shape = doublecircle]; 4 ;
node [shape = circle];
2 -> 3 [ label = "b" ];
1 -> 3 [ label = "a" ];
3 -> 2 [ label = "a" ];
3 -> 4 [ label = "b" ];
4 -> 3 [ label = "a" ];
"init" [shape = point];
"init" -> 1;
}

Iteration 5: $A=a(ba|a(ab)^*b)^*b$; inductiveness check fails at $aaababbb \to aaaabbbb$; add $aaaabbbb.$

digraph finite_state_machine {
rankdir=LR;
size="8,5"
node [shape = doublecircle]; 5 ;
node [shape = circle];
2 -> 3 [ label = "b" ];
3 -> 2 [ label = "a" ];
3 -> 4 [ label = "b" ];
1 -> 4 [ label = "a" ];
4 -> 3 [ label = "a" ];
4 -> 5 [ label = "b" ];
5 -> 4 [ label = "a" ];
"init" [shape = point];
"init" -> 1;
}

One can observe that L* got trapped from the 3rd iteration - it keeps splitting state 2, creating a lineage that leads to a divergent sequence of counterexamples. On the other hand, L* has indeed tried its best after the 3rd iteration: each time L* receives a counterexample of length $2n$, it succeeds to find an inductive proof for the reachable configurations of length $\le 2n$, namely $A_n:=\{w\in \{a,b\}^*:|w|\le n$ and $0\le$#$a-$#$b\le n$ in all prefixes of $w \}.$ Note that $|A_{n+1}|=|A_{n}|+1.$ Due to the fact that L* progresses incrementally in terms of automata size, and that $a^{n+1}b^{n+1}$ is the shortest counter-example for $A_n$, L* will never reach an automaton covering all possible $A_n$.

Learning with witnesses. We have discussed how to use the L* learning algorithm to prove $F^*(I)\cap B=\emptyset$. In order to answer membership queries, we assumed that the transition relation is length-preserving. However, it is possible to use L* without this assumption. For example, Vardhan et al. [3] proposed to learn a set of configuration-witness pairs as a proof, where a witness refers to the number of steps needed to reach a configuration. More precisely, the proof they proposed to learn is a regular subset of $((\Sigma\cup\{\bot\})\times\{0,1\})^*$, where $\bot$ serves as a padding symbol. Each word in this set encodes a pair $(c,n)$, meaning that $c$ is a configuration contained in $F^n(I)$. Let $[c,n]$ denote the encoding of $(c,n).$ Given $I$, $f$, and $B$, we define $I'$, $f'$, and $B'$ as
$\begin{eqnarray*}
I' & := & \{[c,0]: c\in I\};\\
f'(S) &:=& \{[f(c),n+1]: [c,n]\in S\};\\
B' &:= & \{[c,n]:c\in B, n\ge 0\}.
\end{eqnarray*}$
Note that if $I$, $f$, and $B$ are regular then so are $I'$, $f'$, and $B'.$
It is straightforward to learn a proof for unreachability problem $(I',F',B')$ using L*: membership query is decidable thanks to the presence of witness, and equivalence query can be answered as we did before, aiming to find an inductive invariant as an over-approximation. After L* learns a proof for the new problem, we obtain a proof for the original problem by projection. However, note that the transformation from configurations to configuration-witness pairs doesn't preserve regularity. Hence, the transformed problem $(I',F',B')$ doesn't necessarily have a regular proof even when the original problem $(I,F,B)$ does. In particular, unlike the previous method, the learning process here doesn't guarantee to terminate when $F^*(I)$ is regular. It would be interesting to characterise the classes for which this technique is effective, namely, the classes where regularity of $F^*(I)$ implies that of $F'^*(I')$.

References

1. A unified approach for studying the properties of transitions systems, TCS 1982.
2. Learning to prove safety over parameterised concurrent systems, FMCAD 2017.
3. Learning to verify safety properties, ICFME 2004.

Wednesday, December 7, 2016

A note on induction and coinduction

An illustrative example. Consider a context-free grammar (CFG) with start symbol $S$, alphabet $\Sigma=\{0,1,2\}$, and the following production rules:\begin{equation}S\rightarrow 0\ |\ 1\ |\ 0S\ |\ 1S\label{eg1-cfg1}\end{equation}A CFG is conventionally interpreted inductively, which means a string belongs to a grammar's language if and only if it can be transformed from the start symbol through a finite number of rule applications. Hence, the language of the above grammar is $\{0,1\}^+$, for this set contains exactly the strings in $\Sigma$ that can be derived from $S$. On the other hand, a grammar can also be interpreted coinductively, meaning that a string belongs to the a grammar's language if and only if its existence is consistent with the grammar rules. In this sense, a string is in the language of \eqref{eg1-cfg1} if and only if it is of one of the forms 0, 1, $0x$ or $1x$, and, therefore, the language of the grammar is $\{0,1\}^+\cup\{0,1\}^\omega$, the set of nonempty strings comprised of 0's and 1's. Note that any string containing 2 is not in the language, since if $x2y=\{0,1\}\cdot z$ is in the language, then so is a string $x'2y$ with $|x'| = |x| - 1$ and this will eventually lead to a contradiction. Now consider another CFG as follows:\begin{equation}S\rightarrow 0\ |\ 1\ |\ S\ |\ SS\label{eg1-cfg2}\end{equation}It is clear that grammars \eqref{eg1-cfg1} and \eqref{eg1-cfg2} produce the same language when they are interpreted inductively. However, if a coinductive interpretation is taken, then the language of \eqref{eg1-cfg2} will become $\Sigma^\omega$ since any string satisfies rule $S \rightarrow S$. In short, an inductive interpretation of a set of rules looks for the minimal satisfying set (a string is in the language iff it can be derived via the rules), while a coinductive interpre-tation looks for a maximal one (a string is in the language iff it doesn't violate the rules).
The concepts of induction and coinduction can be unified in order theory. To see this, fix a countable domain $D$ and consider a predicate $Q: D \rightarrow \{true,false\}$. We shall use $Q$ to capture the minimal set of elements satisfying a set of recurrent equations. For example, the language induced by grammar \eqref{eg1-cfg1} can be captured by $Q$ with $D=\Sigma^\omega$ and recurrent equations $Q(0)=true$, $Q(1)=true$, $Q(0x)=Q(x)$, and $Q(1x)=Q(x)$, such that a string $x$ is in the language iff $Q(x)=true$. Now we rewrite these equations by replacing the $Q$'s on the left-hand side of each equation with a functional $F$ over $D\rightarrow \{true,false\}$, obtaining a set of equations defining $F$: \begin{equation}F(Q)(0)=true;\quad F(Q)(1)=true;\quad F(Q)(0x)=Q(x);\quad F(Q)(1x)=Q(x).\label{eg1-fp1}\end{equation}Consider the (standard) complete lattice $(D\rightarrow \{true,false\},\ \sqsubseteq)$ of predicates where $(Q_1\sqcap Q_2)(d) = Q_1(d)\wedge Q_2(d)$, $(Q_1\sqcup Q_2)(d) = Q_1(d)\vee Q_2(d)$, $\bot$ maps all $d\in D$ to $false$, $\top$ maps all $d\in D$ to $true$, and $Q_1 \sqsubseteq Q_2$ means $Q_1(d)\Rightarrow Q_2(d)$ for all $d\in D$. Observe that $F$ is a monotone function on this lattice: $Q_1 \sqsubseteq Q_2 \Rightarrow F(Q_1) \sqsubseteq F(Q_2)$. Hence, by Tarski's Fixed Point Theorem, $F$ has a least fixed point (denoted by $\lfp(F)$) and a greatest fixed point (denoted by $\gfp(F)$). The relationship between these fixed points is$$\bot\sqsubseteq F^k(\bot)\sqsubseteq \bigsqcup\nolimits_n F^n(\bot) \sqsubseteq \lfp(F) \sqsubseteq \gfp(F) \sqsubseteq \bigsqcap\nolimits_n F^n(\top) \sqsubseteq F^k(\top) \sqsubseteq \top.$$Take the $F$ defined in \eqref{eg1-fp1} as an example. $F^n(\bot)$ captures the set of strings that can be derived from grammar \eqref{eg1-cfg1} with a parse tree of height less than $n+1$. Furthermore, since each equation only has a finite number of $Q$'s on the right-hand side, $F$ is in fact a Scott continuous function and thus $\lfp(F)=\bigsqcup_n F^n(\bot)$, i.e., $\lfp(F)(x)=true$ iff $x \in\{0,1\}^+$, the set of strings that can be derived from $S$ in a finite number of steps. To find $\gfp(F)$, define $A_n=\{$nonempty binary strings of length less than $n+1\}$ and $B_n=\{$infinite strings in $\Sigma^*$ containing a binary prefix of length less than $n+1\}$. Observe that $F^n(\top)$ captures the set $A_n \cup B_n$, $F^n(\top) \sqcap F^{n+1}(\top)$ captures the set $A_{n+1} \cup B_{n+1}$. Thus $\bigsqcap\nolimits_n F^n(\top)$ will capture the set $\lim_{n\rightarrow\infty}(A_n\cup B_n)$ if the limit exists. However, since $A_n \nearrow \{0,1\}^+$ and $B_n \searrow\{0,1\}^\omega$, we have $$\limsup_{n\rightarrow\infty}(A_n\cup B_n) = \bigcap_{n\ge 1}\bigcup_{k\ge n} (A_k \cup B_k) = \bigcap_{n\ge 1}(\{0,1\}^+\cup B_n) = \{0,1\}^+\cup\{0,1\}^\omega,$$and$$\liminf_{n\rightarrow\infty}(A_n\cup B_n) = \bigcup_{n\ge 1}\bigcap_{k\ge n} (A_k \cup B_k) = \bigcup_{n\ge 1}(A_n \cup \{0,1\}^\omega) = \{0,1\}^+\cup\{0,1\}^\omega.$$It follows that $\lim_{n\rightarrow\infty}(A_n\cup B_n)=\{0,1\}^+\cup\{0,1\}^\omega$, which is also the set captured by $\gfp(F)$ as $F$ is Scott continuous.
Proof by induction. In mathematical induction, we usually set $D=\mathbb N\cup\{0\}$ and try to prove that $Q(0)$ holds and $Q(n)\Rightarrow Q(n+1)$ for all $n\ge 0$. Generally, proving a property $Q$ by induction amounts to proving that $Q$ holds in the base cases and $Q(d)\Rightarrow F(Q)(d)$ for all $d\in D$. The predicate $Q$ satisfying this proof rule turns out to be $\bigsqcup_n F^n(\bot)$, which, when $F$ is Scott continuous, agrees with the least fixed point of $F$. In this sense, proving by induction usually corresponds to taking the least fixed point. In program verification, inductions and least fixed points are extensively used to reason about semantics, because the semantics of a program should precisely describe what will happen during the evaluation of the program.
Proof by coinduction. Suppose $Q$ is the greatest fixed point of $F$. By definition, any predicate $Q'$ satisfying $Q' \sqsubseteq F(Q')$ also satisfies $Q' \sqsubseteq Q$, since $Q$ is an upperbound of any ascending chain in the lattice. Hence, to show $Q(d)$ holds it suffices to find some predicate $Q'$ such that $Q'(d)$ holds and $Q'(d')\Rightarrow F(Q')(d')$ for all $d' \in D$. The rationale behind doing this is to regard the idea of being acceptable (i.e., $Q(d)$ holds) as being consistent with the induction rules (i.e., $\exists Q': Q'(d) \wedge Q'(d')\Rightarrow F(Q')(d')$ for all $d'$). In program verification, coinductions and greatest fixed points are extensively used to check consistence with specifications, because the specification of a program should not forbid a behaviour unless it explicitly specifies so.
Collapse of fixed points. The least fixed point and the greatest fixed point in a complete lattice can coincide when Banach's Fixed Point Theorem can be applied. Namely, a "contractive" operator in a metric space has a unique fixed point. A functional $F$ will be contractive if there exists a well-founded relation $\prec$ on $D$ such that the clause of $F(Q)(d)$ only call $Q$ on arguments smaller than $d$ with respect to $\prec$. For example, the $F$ defined in \eqref{eg1-fp1} is contractive when $D$ is set to $\Sigma^*$; in such case, both $\lfp(F)$ and $\gfp(F)$ capture $\{0,1\}^+$. By contrast, the functional that expresses grammar \eqref{eg1-cfg2}, i.e., the functional $F$ that satisfies\begin{equation}F(Q)(0)=true;\ F(Q)(1)=true;\ F(Q)(x)=Q(x);\ F(Q)(xy)=Q(x)\wedge Q(y),\label{eg1-fp2}\end{equation}is not contractive even when $D=\Sigma^*$. Indeed, one can see that $\lfp(F)$ captures $\{0,1\}^+$ but $\gfp(F)$ captures $\{0,1\}^*$.

Inductive and coinductive datatypes

An introduction to (co)algebra and (co)induction.
https://pages.lip6.fr/Pierre-Evariste.Dagand/stuffs/notes/20150126_coinduction.txt
http://wiki.c2.com/?CoinductiveDataType

Friday, March 18, 2016

A note on semi-linear and regular sets

Regular sets

A set is called regular if it can be recognised by a finite DFA. For $L\subseteq\Sigma^*$, $\sim_L$ is called a right congruence iff $u\sim_L v \iff uw\sim_L vw$ for all $w\in\Sigma^*.$ Note that a right congruence is an equivalence relation. The classes of $\sim_L$ induce a minimal complete DFA recognising $L.$ Intuitively, words belonging to the same class in $\sim_L$ lead to the same state in the DFA. The classes containing the empty string and the strings in $L$ correspond to the initial state and the final states, respectively.
Myhill-Nerode Theorem. A set $L$ is regular iff $\sim_L$ has finite index over $\Sigma^*.$
Rational sets. A set is called rational iff it can be generated from a finite number of finite subsets of $\Sigma^*$ through a finite number of rational operations, namely union, concatenation and the Kleene star.
Kleene's theorem. A set of words is regular iff it is rational. (Note: The only-if part doesn't holds for non-length-preserving relations.)

Semi-linear sets

A subset of $\mathbb N^k$ is linear if it can be expressed as$$u_0 + \langle u_1, \dots, u_m \rangle = \{u_0 + a_1u_1 + \dots + a_mu_m \mid a_1, \dots, a_m \in \mathbb{N}\}$$such that $u_0,...,u_m$ are vectors of dimension $k$. A set is called semi-linear if it is a union of finitely many linear sets.
Fact. Semi-linear sets are closed under complement, intersection, projection, etc., in an effective way.
Fact. A set is semi-linear iff it is Presburger-definable.
Fact. If a set is upward-closed or downward-closed, then it is semi-linear.

Parikh image. Let $\Sigma=\{a_1,...,a_k\}$. Given a word $w$, we use $|w|(a)$ to denote the number of occurrences of symbol $a$ in $w$. A function $\phi:\Sigma^*\rightarrow\mathbb N^k$ is called the Parikh image if the $i$-th component of $\phi(w)$ is $|w|(a_i).$ [note]

Fact. Parikh image is a morphism from monoid $(\Sigma^*,\cdot,\epsilon)$ to monoid $(\mathbb N^k,+,0).$
Fact. A set $S\subseteq \mathbb N^k$ is semi-linear iff there is a regular set $R\subseteq\Sigma^*$ such that $\phi(R)=S$, where $\phi$ is the Parikh image. [1]

From RE to its Parikh image. The key observation to computing the Parikh image of a regular set is to describe the depths of Kleene stars by constraints over the dependency between frequencies. For example, $(a^*b)^*$ can be reduced to $a^*b^*$ plus a constraint "$|w|(b)=0$ implies $|w|(a)=0$", which can further resolve to two linear conditions "$|w|(b)=0\wedge |w|(a)=0$" and "$|w|(b)\ge 1 \wedge |w|(a)\ge 0$" corresponding to a union of two linear sets. See [2] and the references therein for more details abort constructions.

Parikh's Theorem. [1] If $L$ is a context-free language then there exists a regular language $R$ such that $\phi(L)=R.$ (This implies CFL and RL are indistinguishable when the alphabet is commutative.)

Real numbers

A relation in real closed fields is definable in the first-order logic if and only if it is a Boolean combination of polynomial equations [?].

Connection with logic

Connection with MSO. [Buchi] A set $L\subseteq\Sigma^*$ is regular iff it can be defined in $MSO(\Sigma, <_{\mathbb N}, \{P_a\}_{a\in\Sigma})$. [slide]
Connection with LTL. [?] A set is LTL-defiable iff it is $FO(\Sigma^\omega, <_{\mathbb N}, \{P_a\}_{a\in\Sigma})$-definable.
Connection with FO. A regular expression constructed without the Kleene star is called star-free. For example, $(ab)^*$ is star-free but $(aa)^*$ is not. McNaughton showed that a regular set is star-free iff it can be defined in $FO(\Sigma, <_{\mathbb N}, \{P_a\}_{a\in\Sigma})$, cf. [3]. Hence, checking star-freeness of a regular expression is equivalent to checking whether an MSO-formula is FO-definable. The former problems is shown to be PSPACE-complete, see here and here. This complexity result is generalised to checking star-freeness of rational relations, see here.

References

1. On Context-Free Languages, J. ACM, 1966.
2. Semi-linear Parikh Images of Regular Expressions via Reduction, MFCS10.
3. First-order definable languages, 2008.
4. Jewels of Formal Language Theory, 1982.

Wednesday, February 17, 2016

A note on checking properties of regular languages

Deterministic Finite Automata

Equivalence. If a DFA with $n$ states does not accept every string, it rejects some string of length at most $n-1$. It follows that the languages of two DFA $A$ and $B$ coincide iff they accept the same words up to length $|A|\cdot|B| - 1$. (Hint: consider automaton $A\times B$.) This observation leads to an exponential-time algorithm for language equivalence checking. In practice, equivalence between DFA is often checked by first minimising the DFA (see below). Isomorphism between DFA can be checked in linear time when an appropriate representation is used, cf. Hopcroft's method.

Minimisation. The minimal DFA of a regular language is unique by the Myhill-Nerode Theorem. We can hence check language equivalence of DFA by checking isomorphism between DFA. DFA minimisation takes $O(|\Sigma|\cdot n\log n)$ time using Hopcroft's method, which is the most commonly used approach in practice. Aho, Hopcroft and Ullman proposed an algorithm that uses union-and-find to create and locate the equivalence classes for states in a DFA (cf. Sec. 3 of [2]). The running time is $O(kn\!\cdot\!A(kn,n))$ with $k=|\Sigma|$.

Universality can be checked by checking that all reachable states are accepting.
Complement is obtained by simply flipping the accepting and the non-accepting states.
Aperiodicity (ie. existence of non-trivial cycles) of a DFA is PSPACE-complete, cf. here.
Intersection vacuity of an unbounded number of DFA is PSPACE-complete, cf. here and here.
Reversal. The trivial way to get the DFA of the reverse language is to determinise the NFA obtained by flipping the initial and the accepting states and reversing the directions of the transitions. This approach leads to exponential blowup. It is unclear as to the lower bound of this problem.

Non-deterministic Finite Automata

Equivalence. NFA equivalence problem is PSPACE-complete --- in fact, it is the first problem shown to be in this complexity class, see [5]. Practically, checking equivalence of NFA is often done by first converting NFA to DFA using subset construction, which however can lead to exponential growth in automata size. There are other determinisation methods in the literature, c.f. wiki and [4].

Universality. Using determinisation, if an NFA of size $n$ does not accept every string then it rejects some string of length at most $|\Sigma|^n-1$. Together with the classic result that reachability can be checked is in NL, we easily see that university checking for NFA is in PSPACE. One can show that the problem is PSPACE-complete and it is so even when every state of the NFA is accepting.

Complement. Complementing a DFA can be easily done by flipping the accepting and the non-accepting states. This method doesn't work for NFA: the language of the "flipped" NFA will be the superset of its complement in general (e.g., consider the NFA $\{q_0 \xrightarrow{a} q_1, q_0 \xrightarrow{a} q_2\}$ where $q_1$ is accepting). In [6], the authors give very precise bounds for complexities for complement on NFA. Particularly, there are NFA with exponentially large complement: the language of all words over $\{1,…,n\}$ which don't contain all symbols is accepted by an NFA of size $n$, but its complement needs $2^n$ states. Hence, complement of NFA is EXPSPACE-complete; no general method for complementing NFA can have better complexity than doing subset construction in theory.

Minimisation. Minimisation to/from NFA is PSPACE-hard. Given a DFA or an NFA, it is PSPACE-hard to find the size of the minimal equivalent NFA. Some hardness of approximation results have shown that it is even hard to approximate the size of the minimal equivalent NFA. Since a regular expression can be converted to an NFA of comparable size, these hardness results show that we can't expect any efficient algorithm to convert an regular expression to a minimal NFA. See these lecture notes by Artem Kaznatcheev for more details.

Regular expressions

Equivalence. Like NFA, checking equivalence of RE is PSPACE-complete. This can be expected since RE can be converted to and from NFA of roughly the same size, the complexity of checking properties of RE should be comparable to that of checking NFA. However, if intersection is allowed then the difficulty rises to EXPTIME. Further, if negation is allowed then the problem becomes non-elementary [5].

Star-freeness of RE is PSPACE-complete, see here. This problem corresponds to checking the non-existence of non-trivial cycles in DFA, which is also PSPACE-complete.
Emptiness for regular expressions with intersection is PSPACE-complete.
(Often, checking the a property for DFA is far more efficient than checking the same property for RE. So it is surprising to note that the above two problems have the same complexity for both DFA and RE.)

Star-free Regular expressions

Checking equivalence of star-free RE is NP-complete, cf. Section 2 of [3].

Resources

Online converters: RE-to-NFA, NFA-to-RE, NFA-to-DFA
RE equivalence matching in the wild:
1.Backtracking (used in Perl, Python, etc): Discussion, Demo, Code
2.Virtual-machine method (used in Ruby): Explanation

References

1. Equivalence of regular languages.
2. Testing the Equivalence of Regular Languages
3. On the equivalence, containment, and covering problems for the regular and context-free languages
4. Five determinisation algorithms
5. Word problems requiring exponential time
6. State Complexity of Concatenation and Complementation of Regular Languages
7. Complexity of Some Problems from the Theory of Automata

Saturday, December 5, 2015

Logical relations as a proof technique — Part I

This is the part I of my lecture note based on the lecture series "Logical Relations" given by Amal Ahmed at Oregon Programming Languages Summer School, 2015. See here for Part II.

Overview. These lectures will introduce the proof method of logical relations. Logical relations have been used to prove a wide range of properties, including normalization, type safety, equivalence of programs, non-interference (in security typed languages), and compiler correctness. We will start by looking at how to prove normalization and type safety for the simply-typed lambda calculus (STLC). We will then add recursive types to STLC and develop a step-indexed logical relation, using step-indexing to ensure that the logical relation remains well-founded. We will cover polymorphic and existential types and develop a logical relation to reason about the associated notions of parametricity and representation independence. The lectures will also discuss how to establish that a logical relation corresponds to contextual equivalence.

Free theorems. In parametric polymorphism, types cannot be checked, inspected or instantiated by programs. Hence, programs with polymorphic types don't change behaviors according to their type parameters. It follows that all functions of type $\forall a.\, a \rightarrow a$ are identity functions, since nothing but the argument can be returned by the function. Similarly, all functions of type $\forall a.\, a \rightarrow Int$ are constant functions. In contrast, there doesn't exist a function of type $\forall a.\, Int \rightarrow a$, since the return value can be obtained from nowhere. Results of this kind are called free theorems, and many of them can be proved using logical relations.

Logical predicates. Unary logical relations are also called logical predicates. They are used to assert properties of one single program, such as termination. In contrast, binary logical relations are used to assert relations between two programs, such as equivalence. Note that we only put closed programs in a logical relation. By and large, to prove that a well-typed and closed term $e$ has some special property, we shall set up a predicate $P_\tau$ such that $P_\tau(e)$ holds if and only if all of the following conditions hold: i) $\vdash e: \tau$, ii) $e$ has the property of interests, and iii) the property of interests is preserved by elimination forms (or introduction forms) of type $\tau$.

Strong normalization. In history, logical relations first appeared as a technique to prove strong normalization, which means that a well-typed term always reduces to the same value regardless of the strategies of evaluation. Since typing rules in STLC are deterministic, proving strong normalization for STLC amounts to showing that every well-typed program terminates, that is, $\vdash e:\tau \implies e \Downarrow$. Observe that this statement cannot be proved by applying structural induction on typing derivations directly. The problem of induction occurs at the application rule: we cannot conclude $e_1\ e_2\Downarrow$ from $e_1 \Downarrow$ and $e_2 \Downarrow$. Nevertheless, this problem can be resolved by using a stronger induction hypothesis, which leads to the following definition of logical predicates.
We shall set up a predicate $SN_\tau(\cdot)$ such that $SN_\tau(e)$ holds only if term $e$ has type $\tau$ and is strongly normalizing. We consider two types for STLC: boolean and function types. For boolean, it suffices to define that $SN_{bool}(e)$ holds iff $\vdash e:bool \wedge e \Downarrow$, since boolean doesn't has elimination forms. For functions types, we define that $SN_{\tau\rightarrow\tau'}(e)$ holds iff

$\vdash e':\tau\rightarrow\tau'\quad\wedge\quad e' \Downarrow\quad \wedge\quad (\forall e.\, SN_{\tau}(e) \implies SN_{\tau'}(e'\ e))$.

(1)

We shall see that this definition provides the strengthened hypothesis we need to deal with the application rule. The proof of $\vdash e:\tau \implies e \Downarrow$ now breaks into two parts: A) $\vdash e:\tau \implies SN_{\tau}(e)$, and B) $SN_{\tau}(e) \implies e \Downarrow$. Part B follows immediately from the definition of $SN_\tau(e)$. It remains to prove part A, and we rely on structural induction to do this. But again, we cannot use the statement $\vdash e:\tau \implies SN_{\tau}(e)$ we are going to prove as an induction hypothesis. It only asserts closed terms, but to show that it holds for terms of function type, say, for $(\lambda x:\tau.\, e'): \tau \rightarrow \tau'$, we need the hypothesis to hold for open term $e'$ so as to perform induction.
Instead, we shall use the following stronger claim as an induction hypothesis. Given a mapping $\gamma$ from variables to values, we write $\gamma \models \Gamma$ when $dom(\gamma)=dom(\Gamma)$ and $\forall x\in dom(\Gamma).\, SN_{\Gamma(x)}(\gamma(x))$ holds. That is, $\gamma$ maps variables in $dom(\Gamma)$ to values that strongly normalize with types compatible with $\Gamma$. We claim that for any term $e$ and substitution $\gamma$,

$\Gamma \vdash e:\tau \wedge \gamma \models \Gamma \implies SN_{\tau}(\gamma(e))$.

(2)

It is clear that part A follows from this claim by setting $\Gamma$ to be empty. Now we shall prove (2) by inducting on the derivation rules case by case.
Case $\quad\rhd\ \ \vdash true: bool\quad$ and $\quad\rhd\ \ \vdash false: bool$.
　Note that $\gamma(x)=x$ for $x\in \{\,true,\,false\,\}$, which is of course strongly normalizing.
Case $\quad\Gamma(x)=\tau\ \ \rhd\ \ \Gamma \vdash x:\tau$.
　By premises $\Gamma(x)=\tau$ and $\gamma\models\Gamma$, $\gamma(x)$ has type $\tau$. Thus $SN_{\tau}(\gamma(x))$ holds by $\gamma\models\Gamma$.
Case $\quad\vdash e': \tau\rightarrow\tau'$, $\vdash e: \tau\ \ \rhd\ \ \vdash e'\ e: \tau'$.
　Note that $\gamma(e'\ e)=\gamma(e')\ \gamma(e)$. By premise $\gamma\models\Gamma$, both $SN_{\tau\rightarrow\tau'}(\gamma(e'))$ and $SN_{\tau}(\gamma(e))$ holds by induction. Thus $SN_{\tau'}(\gamma(e')\ \gamma(e))$, which equals to $SN_{\tau'}(\gamma(e'\ e))$, holds by (1).
Case $\quad\Gamma, x:\tau \vdash e':\tau'\ \ \rhd\ \ \Gamma\vdash\lambda x:\tau.\, e': \tau\rightarrow\tau'$.
　 This is the case we got stuck with before introducing logical predicates, so it is not surprising that the proof for this case is far more complicated than others. Before we proceed, we need the following two lemmas:
Lemma 1. (substitution lemma) If $\Gamma \vdash e:\tau$ and $\gamma \models \Gamma$, then $\Gamma \vdash \gamma(e):\tau.$
Lemma 2. (SN is preserved by forward/backward reductions) Suppose that $\vdash e:\tau$ and $e \hookrightarrow e'$. Then $SN_{\tau}(e)$ holds if and only if $SN_{\tau}(e')$ holds.
To prove (2), i.e., $\Gamma \vdash \lambda x:\tau.\, e' \wedge \gamma\models\Gamma$ implies $SN_{\tau\rightarrow\tau'}(\gamma(\lambda x:\tau.\, e'))$, it suffices to prove the following three statements:
1) $\gamma(\lambda x:\tau.\, e')$ has type $\tau\rightarrow\tau'$. This follows by premise $\gamma\models \Gamma$ and Lemma 1;
2) $\gamma(\lambda x:\tau.\, e')\Downarrow$. This is obvious, since $\gamma(\lambda x:\tau.\, e')=\lambda x:\tau.\,\gamma(e')$ is a lambda value;
3) $\forall e.\,SN_\tau(e)\implies SN_{\tau'}(\gamma(\lambda x:\tau.\, e')\ e)$. By induction, we know that $\Gamma, x:\tau \vdash e':\tau'$ and $\gamma'\models \Gamma, x:\tau$ implies $SN_{\tau'}(\gamma'(e'))$. Suppose that $SN_\tau(e)$ holds for some $e$. Then $e\Downarrow v$ for some value $v$, and $SN_{\tau}(v)$ holds by Lemma 2. Given premises $\Gamma \vdash \lambda x:\tau.\, e'$ and $\gamma\models\Gamma$, define $\gamma'=\gamma[x \mapsto v]$. Note that $\gamma'\models\Gamma,x:\tau$. Thus $SN_{\tau'}(\gamma'(e'))$ holds by induction. However, note that
$\gamma'(e')=\gamma(e')[x \mapsto v]=(\lambda x:\tau.\,\gamma(e'))\ v=\gamma(\lambda x:\tau.\,e')\ v$.
It then follows from $SN_{\tau'}(\gamma'(e'))$ and Lemma 2 that $SN_{\tau'}(\gamma(\lambda x:\tau.\,e')\ e)$ holds. This concludes our proof of strong normalization for STLC.

(End of lecture note Part I. See here for Part II.)