Ericpony's Blog

Tuesday, June 20, 2017

A note on bisimulation and coinduction

A machine $(X,Y,S,\delta,\lambda)$ is a 5-tuple composed by an input alphabet $X$, an output alphabet $Y$, a state space $S$, a transition function $\delta:S\times X \rightarrow S$, and an output function $\lambda: S\rightarrow Y.$ We assume that $\delta$ is total without loss of generality. Also, its domain be extended to $X^*$ by defining $\delta^*(s,\epsilon)=s$ and $\delta^*(s,wx)=\delta(\delta^*(s,w),x).$ A bisimulation $R$ between two machines $M=(X,Y,S,\delta,\lambda)$ and $M'=(X,Y,S',\delta',\lambda')$ is a relation on $S\times S'$ defined recursively as $$R=\{(s,s'): \lambda(s)=\lambda'(s') \wedge \forall x\in X. (\delta(s,x),\delta'(s',x))\in R\}.$$ The union of all bisimulations is the greatest bisimulation and is often denoted by $\sim$. When $\sim$ is defined for one machine, it forms an equivalence relation and is called bisimilarity. Bisimilarity can be characterised inductively as follows. We define a sequence of equivalence relations $R_0 \supseteq R_1 \supseteq \cdots$ by letting $R_0 = S\times S$ and for $n\ge 1$, $(s, s') \in R_n$ iff for all $x\in X$, $\delta(s, x)$ implies $\exists x'\in X.\ (\delta(s, x), \delta(s', x')) \in R_{n-1}.$ It is easy to verify that the $R_n$ converges to the bisimilarity $\sim$, namely, $\bigcap_{n\ge 0} R_n=\sim$. (To see this, note that $\supseteq$ holds as $R_n \supseteq \sim$ for all $n\ge 0.$ Also, $\bigcap_{n\ge 0} R_n$ is a bisimulation and thus $\subseteq$ holds.) This inductive definition gives a simple fixpoint procedure for computing bisimilarity, see below.

Bisimulation and language equivalence

We call a machine a (deterministic) automaton if its output alphabet, denoted by $\mathcal{B}$, consists of only "rejected" and "accepted", denoted by $\bot$ and $\top$, respectively. It is interesting to note that a bisimulation $R$ between two automata $M$ and $M'$ over the same input alphabet $X$ can be itself described by an automaton $(X,\mathcal{B},R,\delta_R,\lambda_R)$, such that
1. $\delta_R((s,s'), x)=(\delta(s,x),\delta'(s',x))$ iff $(\delta(s,x),\delta'(s',x))\in R,$ and
2. $\lambda_R((s,s'))=\lambda(s)=\lambda'(s').$
Given an automaton $M=(X,Y,S,\delta,\lambda)$, the state language of a state $s$ of $M$ is defined as $L(s)=\{w\in X^{*}:\lambda\circ\delta^*(s,w)=\top\}.$ We call $L(s)$ the language of $M$ when $s$ is the initial state of $M$. The following implication can be shown by induction on the length of words:$$s\sim s' \implies L(s)=L(s').$$
Bisimulation equivalence is also called behavioural or observational equivalence by regarding the observable behaviours of a state as its state language. In the world of finite-state systems, two other notions of equivalence are often used in practice to prove language equivalence. The first notion is graph isomorphism. As the minimal automaton recognising a ($\omega$-)regular language is unique up to graph isomorphism, checking language equivalence amounts to checking graph isomorphism between their minimised automata. The second notion is trace equivalence, which relates two states iff they yield the same set of traces. Trace equivalence and bisimulation equivalence coincide on deterministic machines but the latter is finer in general (e.g. consider $a(b+c)$ v.s $ab+ac$). Here is a comparison among the three relations on finite machines:
Complexity: Bisimulation equivalence is $O(m\log n)$; Graph isomorphism is in NP; Trace equivalence is PSPACE-Complete.
Fineness: Graph isomorphism (w.o. minimisation) $\subseteq$ Bisimulation equivalence $\subseteq$ Trace equivalence $\subseteq$ Language equivalence

The proof principle of coinduction

Coinduction is a mathematical tool to define bisimulation. To demonstrate its use, we present here a technique that proves language equivalence using coinduction [1]. A $w$-derivative of a language $L\subseteq X^{*}$ is defined as $L_{w}=\{u\in X^{*}:wu\in L\}.$ The set of $\mathcal{L}$ languages on $X$ can be defined as an automaton $M_{\mathcal{L}}=(X^{*},\mathcal{L},\mathcal{B},\delta,\lambda)$ such that $\delta(L,x)=L_{x}$ and $\lambda(L)=\top$ iff $\epsilon\in L$. It turns out that $L(l)=l$ for all $l\in\mathcal{L}$, namely the state language of each state in $M_{\mathcal{L}}$ is the state itself. Hence we have$$L\sim L' \iff L=L'.$$This fact provides a uniform way to prove the equivalence of two languages. The more familiar method of proof by induction requires that one start from the equality of the base case (which is the pair of the minimal words in the two languages) and proceed to establish the equivalence of the pairs of longer words. Induction however doesn't work when the languages contain words of infinite length. In contrast, proof by coinduction starts from $\{(L,L')\}$ and continues to include more and more pairs via the transitions until a least fixed point is reached. This fixed point, whenever exists, is a bisimulation relation contained in the bisimilarity $\sim$ and thus implies that $L = L'.$
The principle of coinduction is not always effective—the process of constructing a bisimulation may not terminate even when the target languages are equivalent. This fact can be expected since otherwise the Halting problem would be solvable. On the other hand, the coinduction process always terminates when the two languages under consideration are regular: it either constructs a bisimilarity when they are equivalent, or finds a counterexample when they are not. In the case when the two languages are the same regular language, the process actually constructs the smallest finite automaton recognising them (more details later). This fact is a corollary of Kleene's Theorem, stating that a language is regular iff it is accepted by a finite automaton; see Section 8 of [1] for details. In fact, the most efficient equivalence checking algorithm for regular languages up to date is based on coinduction [4].

Bisimulation and fixed points

Given a machine $M = (X,Y,S,\delta,\lambda)$, one can define a monotone function $$F(A) := \{(s,s')\in S\times S : \forall x\in X\cup\{\epsilon\}.(\delta(s,x),\delta(s',x))\in A\}$$on lattice $(2^{S\times S},\subseteq).$ In the terminologies of fixed-point theory, bisimulations are the post-fixed points of $F$ and bisimilarity is the greatest post-fixed point $\nu X.~F(X).$ (Note: these fixed points collapse when $\delta$ defines deterministic transitions.) In this regard, proof by coinduction can be seen as showing that a property is closed under backward computation and contained in a coinductively defined set (ie. the greatest fixed point). In contrast, induction involves showing a property is closed under forward computation and containing an inductively defined set (ie. the least pre-fixed point). Intuitively, backward means that the set is obtained by keeping refine larger sets until the first (ie. the greatest) fixpoint is reached; and forward oppositely means that the set is obtained by keeping extend smaller sets until the first (ie. the smallest) fixpoint is reached.
The following procedure gives a naive fixpoint computation for the bisimilarity over a finite and non-deterministic machine. The bisimilarity is represented as a functional relation $\rho \subset S\times\mathbb N$ that maps states to labels of the equivalence classes induced by the bisimilarity.

Procedure ComputeBisimilarity
Begin
Set $\rho := \{(s, 1) : s \in S\}$, $\rho' := \emptyset$
While $\rho \neq \rho'$ do
Set $\gamma := \{(s, \{ (x, \emptyset) : \delta(s,x)\in S \}) : s \in S \}$
For all $s, x, s'$ such that $\delta(s,x) = s'$
Set $\gamma(s)(x) := \gamma(s)(x) \cup \{ \rho(s') \}$
Set $id := 0$, $A' := \mathbb N$, $\rho' := \emptyset$
Let $L$ be a listing of $\gamma$ sorted by the second component
For each $(s,A)$ in $L$ do
If $A\neq A'$ then
Set $A' := A$, $id := id + 1$
Set $\rho' := \rho' \cup \{(s,id)\}$
Output $\rho$ as the equivalence classes induced by the bisimilarity
End

The procedure starts from a trivial binary relation where all states are equivalent. In each iteration, the algorithm computes a mapping $\gamma$ from each state to the labelling of its successors induced by the current bisimulation. A state will then be re-labelled according to the labelling of its successors changes. The refinement will continue until the labelling is stable, meaning that the greatest fixpoint (i.e. the unique largest bisimulation) is reached. See this paper for a survey of algorithms for computing bisimulations.
The fact that bisimulation relation can be constructed inductively implies that coinductive and inductive reasoning should yield the same results on inductively defined structures such as lists and trees, and make a difference only on structures that contain cycles or infinite paths. Further, while bisimulation coincides with bisimilarity on deterministic machines (such as $M_{\mathcal L}$), coinduction in general does not pinpoint bisimilarity for non-deterministic machines.

Bisimulation and minimisation

A homomorphism between two machines $M$ and $M'$ over the same input alphabets is any function $f:S\rightarrow S'$ satisfying $f(\delta(s,x))=\delta'(f(s),x)$, namely $$ \array{ S\times X &\overset{f\times id_X}{\longrightarrow}& S'\times X \\ \delta \downarrow & & \downarrow \delta' \\ S &\overset{f}{\longrightarrow}& S' }$$A homomorphism is called unique if given any homomorphism $g$ there exists an isomorphism $h$ such that $f=h\circ g$. The notion of unique homomorphisms is closely related to that of bisimulations. First, a mapping $f:S\rightarrow S'$ is a homomorphism iff $\{(s,f(s)):s\in S\}$ is a bisimulation between machines $M$ and $M'$. Furthermore, given a homomorphism $f:S\rightarrow S'$,
i) If $R$ is a bisimulation on $M$, then $f(R)=\{(f(u),f(v)):(u,v)\in R\}$ is a bisimulation on $M'$. [p.34, 5]
ii) If $R'$ is a bisimulation on $M'$, $f^{-1}(R')=\{(u,v):(f(u),f(v))\in R'\}$ is a bisimulation on $M$. [p.34, 5]
iii) $f$ is unique iff for any homomorphism $g:S\rightarrow S'$, the set $\{(f(s),g(s)):s\in S\}$ is a bisimulation on $M'$.
Unique homomorphism is a powerful utility to study languages and automata. For example, fix an automaton $M$ and consider a mapping $\phi: s \mapsto L(s)$ from $M$ to $M_{\mathcal L}.$ It is straightforward to check that $\phi$ is a unique homomorphism. One interesting fact about $\phi$ is that it identifies precisely the bisimilar states in $M$. That is, $$u\sim v \iff \phi(u)=\phi(v),$$which follows from the fact that $\{(u,v)\in S\times S: \phi(u)=\phi(v)\}$ is a bisimulation on $M$ and that $\{(\phi(u),\phi(v))\in \mathcal{L}\times \mathcal{L}:u\sim v\}$ is a bisimulation on $M_{\mathcal L}.$ Another amazing fact is that $\phi$ effectively minimises automata. To see this, let $\langle s \rangle_N$ (resp. $\langle S \rangle_N$) denote the sub-automaton generated by state $s$ (resp. sets of states $S$), i.e., the automaton induced by the states reachable from $s$ (resp. $S$), in automaton $N$. Also, we lift $\phi$ to a graph homomorphism such that given an automaton $N$, $\phi(N)$ is the sub-automaton of $M_{\mathcal L}$ generated by $\{\phi(s):s\in State(N)\}$. It turns out that$$\langle L(s) \rangle_{M_{\mathcal L}} = \langle \phi(s) \rangle_{M_{\mathcal L}} = \phi(\langle s \rangle_M).$$Since given $L$ we are free to choose any $s$ and $M$ satisfying $L(s)=L$, it follows that
i) If $L$ is regular, then $\langle L \rangle_{M_{\mathcal L}}$ is the canonical minimum automaton accepting $L$, and
ii) $\phi(\langle s \rangle_M)$ is the canonical minimisation of automaton $\langle s \rangle_M$.
In particular, $L$ is accepted by a finite automaton iff $\{L_w: w\in X^*\}$ is finite. This result is equivalent to the well-known characterisation of regular languages by Nerode and Myhill:
Theorem. (Nerode-Myhill) $L$ is accepted by a finite automaton if $R_L$ has a finite index. (Namely there are a finite number of classes in the equivalence relation $R_L$ on $L$ defined by $(u,v)\in R_L$ iff $\forall w\in L. (uw\in L \iff vw\in L).$)

References and further reading

1. Automata and Coinduction (An Exercise in Coalgebra)
2. Introduction to Bisimulation and Coinduction (book) (slides)
3. An introduction to (co)algebra and (co)induction
4. Checking NFA Equivalence with Bisimulations up to Congruence
5. Universal Coalgebra: A Theory of Systems
6. Bisimulation and Language Equivalence

Sunday, June 18, 2017

A note on useful decidable temporal logics

Subclasses of FO(LTL)

In the following, we assume that non-constant function symbols and equality are not allowed. The addition of them can lead to the loss of decidability for any of the subclasses introduced here.

Monadic FO(LTL), proposed by Hodkinson et al. [1] in 2000, is the first non-trivial decidable subclass of FO(LTL). This class contains all sub-formulas beginning with a temporal operator (Since or Until) have at most one free variable. Thus, we can talk about objects in the intended domain using the full power of first-order logic; however, temporal operators may be used to describe the development in time of only one object (two are enough to simulate the behaviour of Turing machines). The satisfiability problem for formulas in this class can be reduced to fragments of classical two-sorted first-order logics in which one sort is intended for temporal reasoning. The decidability results also hold for the temporal logics with finite domains.

Remark. FO(LTL) provides only ‘implicit’ access to time: quantification over points in time in the sense of first-order logic is not permitted, and the only means of expressing temporal properties is by the temporal operators Since and Until. It turns out that monadic FO(LTL) is weaker than the two-sorted first-order language, where one sort of which refers to points in time and the other to the first-order domain. For example, the formula $∃t_1 ∃t_2 (t_1 < t_2 ∧ ∀x(P(t_1, x) ↔ P(t_2, x)))$ is not expressible in FO(LTL) over any interesting class of linear time structures [2, 3].

References

1. Decidable fragments of first-order temporal logics.
2. Temporal Logic. Part I & Part II. Clarendon Press, Oxford.
3. Temporal connectives versus explicit timestamps in temporal query languages.

Friday, May 19, 2017

A note on useful decidable logics & theories - Part II

Decidable fragments of second-order logic

(Weak) MSO with one successor. The monadic second-order theory of structure $(\mathbb N, <)$ (denoted by S1S) is decidable. Buchi showed in 1962 that given any formula $\phi(x_1,...,x_n)$ over ${\rm MSO}(\mathbb N, <)$, one can effectively compute an $\omega$-automaton $A(x_1,...,x_n)$ defining the same relation modulo convolution. There is a similar connection between weak $MSO(\mathbb N, <)$ (denoted by WS1S) and finite-word automata. See [10] for a historical survey.

MSO over binary trees. The structure of infinite binary trees is $T_2 := (\{1,2\}^*,S_1,S_2)$ with $S_i := \{(v,vi) : v \in \{1,2\}^*\}$. Rabin's Tree Theorem [4] states that the MSO-theory of the infinite binary tree (denoted by S2S) is decidable, or in other terminology: that model-checking the binary tree with respect to MSO-properties is decidable. In [4], Rabin also proved the decidability of a bunch of theories using interpretation in $T_2$, such as MSO over n-ary trees (see below). Note that Rabin's theorem is subsumed by a more general result stating that MSO over the prefix-rewriting systems is decidable [?].

MSO over n-ary trees. The decidability results of MSO-theory over binary trees $T_2$ can be extended to $n$-ary branching trees $T_n$ with vertices in $[1..n]^*$ by encoding a vertex $a_1 a_2 \dots a_k$ of $T_n$ by a vertex $1^{a_1} 2 1^{a_2} 2 \dots 2 1^{a_k} 2$ of $T_2$. The translation of MSO-formulas is straightforward.

MSO over pushdown automata. Let $G := (V,E)$ denote the configuration graph of a pushdown automaton, where $V\subseteq Q\times\Sigma^*$. Let $n := \max(|Q|,|\Sigma|).$ One can MSO-interpret $G$ in $T_n$ by representing the configuration $(q, s_1,...,s_k)$ as the vertex $s_1...s_k\ q$ of $T_n$. Each step between the configurations in $G$ is translated in MSO to a local move from one vertex to another in $T_n$. Decidability of MSO over binary trees hence subsumes the fundamental result of Muller and Schupp [6] that checking MSO properties is decidable for the configuration graph of a pushdown automaton.

Decidable modal logic

Probabilistic modal logic (PML). Let $\Sigma$ be a finite set of actions. The syntax of the logic is defined as follows:$$\phi ::= \top \mid \bot \mid \phi\vee\phi \mid \neg\phi \mid (\phi)_a\ {\rm op}~p,$$where $a \in \Sigma$, ${\rm op}\in \{=,<\}$, and $0\le p\le 1$. The semantics of PML is defined over a discrete non-deterministic probabilistic transition system such that for $s\in S$, e.g.,$$s \models (\phi)_a \le p \quad {\rm iff} \quad \Pr(s' \models \phi \mid s\xrightarrow{a} s') \le p.$$Larsen and Skou [7] show that, when the transition probabilities are rational and bounded above zero, given any PML formula $\phi$ and fixed $0<\epsilon<1$, one can compute an experiment $t_{\epsilon,\phi}$ and an observation $o_{\epsilon,\phi}$ such that $s \models \phi$ iff $\Pr(o_{\epsilon,\phi}$ is observed by running $t_{\epsilon,\phi}$ on $s) \ge 1-\epsilon$, and $s \not\models \phi$ iff $\Pr(o_{\epsilon,\phi}$ is observed by running $t_{\epsilon,\phi}$ on $s) \le \epsilon$. Such experiments can be computed, executed and observed in polynomial time. Larsen and Skou show that PML characterises probabilistic bisimulation in the following sense: two states are probabilistic bisimilar iff they satisfy exactly the same set of PML formulas.

Misc

Reduction of decidability. Suppose that model checking $L$-theory of structure $S$ is known to be decidable. The model checking problem for a structure $S'$ with respect to $L'$-theory is decidable if we can give an $L$-description of $S'$ within structure $S$. That is, we characterise structure $S'$ using $L$-formulas over $S$. In other words, we provide a satisfiability-preserving translation of any $L'$-formulas $\phi$ into $L$-formula $\psi$ such that $S'\models \phi$ iff $S\models \psi$. Hence $L'$-theory over structure $S'$ is also decidable.

References

1. Structural properties of complete problems for exponential time. 1997.
2. Complexity results for classes of quantification formulas. 2001.
3. Verification of randomised security protocols.
4. Decidability of second-order theories and automata on infinite trees. 1969.
5. Complexity results for classes of quantificational formulas. 1980.
6. The theory of ends, pushdown automata, and second-order logic. 1985.
7. Bisimulation through Probabilistic Testing. 1991.
8. The Classical Decision Problem. 2001.
9. A Survival Guide to Presburger Arithmetic. 2018.
10. Automata: From Logics to Algorithms. 2007.
11. Complete instantiation for quantified formulas in satisfiabiliby modulo theories. 2009.

Saturday, May 6, 2017

A note on recognisability and definability

Definitions. An $n$-ary relation is associated with a set of $n$-tuples; an $n$-ary function $f$ is associated with a set of $(n+1)$-tuples $\{(a_1,...,a_n,f(a_1,...,a_n)\}$, which is called the graph of function $f.$ A sequence $s_1,s_2,...$ over $A$ is associated with a function $s:\mathbb N^n\rightarrow A$ such that $s(i)=s_i.$ Given an integer $n$, we use $(n)_p$ to denote its p-ary expansion; given a p-expansion $w\in\{0,...,p-1\}^*$, we use $[w]_p$ to denote its value $\sum_{i=0}^{|w|-1}p^{|w|-1-i}w[i].$ By convention, we set $(0)_p=\epsilon$ and $[\epsilon]_p=0.$

Recognisable Sets

Definitions. A set is called regular or recognisable if it can be recognised by a finite DFA. For $L\subseteq\Sigma^*$, $\sim_L$ is called a right congruence iff $u\sim_L v \iff uw\sim_L vw$ for all $w\in\Sigma^*.$ Note that a right congruence is an equivalence relation.

Myhill-Nerode Theorem. A set $L$ is regular iff $\sim_L$ has finite index over $\Sigma^*.$

Definitions. (p-recognisability) A set of integers is called p-recognisable iff the p-ary expansion of the integers is recognised by a finite automaton with alphabet $\{0,...,p-1\}.$ The automaton would read a word from left to right and ignores the leading zeros, so that if it accepts $w$ then it accepts all words in $0^*w.$
Fact. A sequence $s_1,s_2,...$ over $A$ is p-recognisable if $s^{-1}(a)$ is p-recognisable for each $a\in A.$ When $A$ is finite, it is easier to define a sequence in this way.

Definitions of recognisability generalise to n-ary relations naturally by replacing the alphabet $\{0,...,p-1\}$ to $\{0,...,p-1\}^n.$ All properties and facts listed above still hold in the general case.

Definitions. (p-automata) A p-automaton recognising a sequence $s_1,s_2,...$ over a finite domain is a complete deterministic Moore machine that reads $(n)_p$ and outputs $s_n$ for all $n\ge 1.$ In such case the sequence is called p-recognisable. Note that p-recognisability of sequences generalises p-recognisability of sets, since a set is p-recognisable iff there exists a p-automaton that outputs 1 on $(n)_p$ if $n$ belongs to the set and outputs 0 otherwise. Alternatively, we can describe a set of integers using a characteristic sequence. The characteristic sequence of a set $S$ refers to a sequence over $\{0,1\}$ where the nth bit is 1 iff $n\in S.$ A set is p-recognisable iff its characteristic sequence is p-recognisable.
A p-automaton can recognise an n-ary relation by recognising an n-dimensional sequence $s:\mathbb N^n\rightarrow\{0,1\}.$ The input language $((a_1)_p,...,(a_n)_p)$ is cleverly stipulated as a subset of $(\{0,...,p-1\}^n)^*$ instead of $(\{0,...,p-1\}^*)^n.$ Namely, components of a relation convoluted over alphabet $\{0,...,p-1\}^n$. In this way, theorems such as Myhill-Nerode still hold for $n\ge2.$

Definable Sets

Definitions. A set $L$ of $n$-tuples is called (first-order) definable by a formula $\phi$ in a first-order structure $\mathcal S$ iff $\phi$ has exactly $n$ free variables and $L=\{(a_1,...,a_n)\in D^n :$ $S\models \phi(a_1,...,a_n)\}$, where $D$ is the domain of $\mathcal S.$ A set is definable in a structure $\mathcal S$ if it is definable by a formula in $\mathcal S.$ When the structure is clear in the context we simply called a set definable. A sequence $s_1,s_2,...$ over $A$ is definable if the graph of the function $s:\mathbb N\rightarrow A$ associated with the sequence is definable.
Fact. A sequence $s_1,s_2,...$ over $A$ is definable iff $s^{-1}(a)$ is definable for each $a\in A.$ When $A$ is finite, it is easier to define a sequence in this way.

Definitions. (p-definability) Given $x\in\mathbb N$, define $V_p(x)\equiv$ the largest power of $p$ that divides $x.$ Let $V_p(0)\equiv 1.$ A subset of $\mathbb N$ is called $p$-definable if it is definable in $[\mathbb N,+,V_p].$ Given $x\in\mathbb N$, define $V_p(x)\equiv$ the largest power of $p$ that divides $x$ and $V_p(0)\equiv 1.$ Define a predicate $P_p$ that $P_p(x)$ holds iff $x$ is a power of $p.$ Define a binary relation $B_p$ such that $B_p(x,y)$ holds iff $y$ is a power of $p$ occurring in the p-ary expansion of $x.$ (Thus $B_p(x,y)$ iff $y<x$, $P_p(y)$, and there exits a $0\le z <y$ such that $y|(x-z).$)

Great efforts were made in the history to characterise 2-recognisable sets in an FO structure. Three candidate structures were considered for this purpose in order: $[\mathbb N, +, P_2]$, $[\mathbb N, +, B_2]$, and $[\mathbb N, +, V_2].$ We note two facts here about these structures:
Fact. $[\mathbb N, +, B_p]$ and $[\mathbb N, +, V_p]$ are (FO-)interpretable in each other.
Fact. $[\mathbb N, +, P_p]$ is strictly weaker than $[\mathbb N, +, V_p].$
To see the first, note that for $x>0$, $V_p(x)=y$ iff $B_2(x,y)\wedge \forall 0\le z<y.\ \neg B_p(x,z)$, and that $B_p(x,y)$ holds iff $\exists 0\le z <y.\ V_p(x-z, y).$ The second fact follows from an interesting decidability result. Let $p^x$ denote the exponential function. It is known that $[\mathbb N,+,V_p]$ and $[\mathbb N,+,p^x]$ are decidable, but $[\mathbb N,+,p^x,V_p]$ is undecidable. Clearly $P_p$ can be defined by $V_p$ and $p^x.$ If $[\mathbb N, +, P_p]$ is equal to $[\mathbb N, +, V_p]$, then $V_p$ will be definable by $p_x$, which implies $[\mathbb N,+,V_p]$ and $[\mathbb N,+,p^x,V_p]$ are mutually interpretable. This is impossible since the first is decidable and the second is not. The ultimate characterisation of 2-recognisable sets is given by the following theorem.
Theorem I. For any prime number p, a set is p-definable iff it is p-recognisable.
Proof. ($\Rightarrow$) It suffices to show that given any formula $\phi$ from a structure $\mathcal S$ equivalent to $[\mathbb N,+,V_p]$, we can construct a p-automaton $A_\phi$ recognising the set definable by $\phi$, i.e., $\mathcal S \models \phi(a_1,...,a_n)$ implies that $A_\phi$ accepts $([a_1]_p,...,[a_n]_p).$ To simply our proof, we choose $\mathcal S=[\mathbb N,R_1,R_2]$, where $R_1(x,y,z)$ holds iff $x+y=z$ and $R_2(x,y)$ holds iff $V_p(x)=y.$ Now we prove the claim by structural induction. First note that the three atom formulas $x=y$, $R_1(x,y,z)$, and $R_2(x,y)$ in $\mathcal S$ are p-recognisable (Exercise!). To complete the induction, it suffices to show that we can construct p-automata $A_{\neg\phi}$, $A_{\phi\vee\psi}$ and $A_{\exists x\phi}$ from p-automata $A_\phi$ and $A_\psi.$ Negation construction is immediate since p-automata is closed under complementation. For union, we modify the transition functions of the two p-automata so that they operate over the same set of input variables. For instance, suppose the free variables in $\phi$ are $x_1,...,x_m,y_1,...,y_n$ and the free variables in $\psi$ are $y_1,...,y_n,z_1,...,z_r.$ Then after the modification, the transition functions of the p-automata have the same domain $(x_1,...,x_m,$ $y_1,...,y_n,$ $z_1,...,z_r).$ The union of the two modified automata gives $A_{\phi\vee\psi}.$ For projection, to construct $\exists x.\phi(x,x_1,...,x_n)$ we just relabel the transitions of the p-automaton by replacing $(a,a_1,...,a_n)$ with $(a_1,...,a_n).$ Note that we may need to add new loops after the relabelling to recognise leading zeros caused by the projection. See [3] for details of this proof.

Corollary. The theory of $[\mathbb N,+,V_p]$ is decidable.
Proof. Given a sentence in $[\mathbb N,+,V_p]$, we can rewrite it to the form of either $\exists x.\phi(x)$ or $\neg\exists x.\phi(x).$ By Theorem I, $\phi(x)$ is p-recognisable. Hence we can construct an finite automaton $A$ that accepts a word $w$ iff $[\mathbb N,+,V_p]\models\phi([w]_p).$ Hence the truth of the sentence can be checked by checking emptiness of a recognisable language, which is decidable.

Corollary. The set of squares $\{n^2:n\in\mathbb N\}$ is not p-recognisable.
Proof. Suppose to the contrary that the set of squares is p-recognisable. By Theorem I, there exists a formula $\phi$ in $[\mathbb N,+,V_p]$ such that $\phi(x)$ holds iff $x$ is a square. Define $$\psi(c,d):=\phi(d)\wedge\forall d'.(d'>c\wedge\phi(d')\implies d'\ge d),$$ which asserts that $d$ is the smallest square larger than $c$. Observe that the set $\{(x,y)\in\mathbb N^2: y=x^2\}$ is p-definable by $$\lambda(x,y):=\phi(y)\wedge\exists a.\ \psi(y,a)\wedge a=y+2x+1.$$ Also, the set $\{(x,y,z)\in\mathbb N^3: z=xy\}$ is p-definable by formula
$\exists a,b,c.\ \lambda(a,x)\wedge\lambda(b,y)\wedge\phi(c)\wedge c=a+b+2z \wedge\exists d.\ \psi(c,d)\wedge d=c+2x+2y+1$
The validity of this formula exploits the fact that $c=(x+y)^2$ if $c$ is a square and the smallest square greater than $c$ is $c+2x+2y+1$ for some $x,y\in\mathbb N$. It then follows that multiplication is p-definable, which is impossible since $[\mathbb N,+,\cdot]$ is not decidable but $[\mathbb N,+,V_p]$ is.
(To be continued)

References and further reading

1. Recognizable sets of integers, Michel Rigo.
2. Languages, Automata, and Logic, Wolfgang Thomas.
3. Logic and p-recognizable sets of integers, V. Bruyère et al.

Wednesday, April 26, 2017

Deadlock-freeness, safey, and liveness

Deadlock-freeness of a concurrent systems refers to the property that at any time point, at least one process in the system is allowed to take an action. In other words, all reachable states of a deadlock-free system have a successor. At the first glance, deadlock-freeness is a safety property and can be handled as such with safety verifiers by marking all states with no successors as unsafe. However, there is a subtle fact about deadlock-freeness that make it different from safety.
Recall that the safety of a transition system $T$ can be proved by verifying a system with richer behaviours than $T$. Many tools exploit this fact to compute compatible over-approximation for safety verification. Hower, a sound over-approximation for safety properties may not be sound for deadlock-free properties, since deadlock-freeness is not preserved by disabling transitions. Hence, deadlock-free properties are in general not suitable to be verified with conventional safety verifiers as that may lead to unexpected results.
That said, if we are able to determine the set of deadlock-free states a priori, it is still possible to model deadlock-freeness as a safety property (eg., see [1]). For instance, we can mark all states with no successors as unsafe, and check if the system remains forever in safe states. The marking are static and will not be changed by adding or removing transitions. (In fact, using marking one may even model a liveness property as a safety property. But this doesn't simplify the problem since coming up with correct marking would require to check the liveness property first.)

References

1. Alpern, B. "Defining liveness", 1985.
2. A. Biere, C. Artho, and V. Schuppan. "Liveness checking as safety checking", 2002.
3. V. Schuppan and A. Biere. "Efficient reduction of finite state model checking to reachability
analysis", 2004.
4. V. Schuppan and A. Biere. "Liveness checking as safety checking for infinite state spaces", 2006.

Thursday, April 6, 2017

Existence of sets that are regular but not computable

A regular set is called computable if one can construct a finite-state automaton to recognise it. Given a finite alphabet $\Sigma$, a rational transition system is a tuple $(I, T)$ comprised of a regular set $I\subseteq\Sigma^*$ as the initial states and a rational relation $T\subseteq\Sigma^*\times\Sigma^*$ as the transition relation. We will use $T^k$ to denote the rational relation formed by composing $T$ for $k$ times, and $T^*$ to denote the reflexive and transitive closure of $T$. In this way, $T^*(I)$ will represent the reachability set of system $(I,T)$, i.e., the set of words that can be reached from the words in $I$ through a finite number of transitions.
It is clear that $T^k(I)$ is regular and computable for any $0\le k<\infty.$ It however may be less obvious that the reachability set $T^*(I)$ is not necessarily regular. To see this, consider a system $\Sigma\equiv\{a,b\}$, $I\equiv\{(ab)^n:n\ge 1\}$ and $T\equiv\{(xbay, xaby): x,y\in \Sigma^*\}.$ Then $T^*(I) \cap a^*b^* =\{a^nb^n:n\ge1\}$ is a well-know non-regular set, which implies that $T^*(I)$ is non-regular as well. On the other hand, there are rational transition systems of which the reachability sets are guaranteed to be regular, such as pushdown systems. In fact, the reachability relation of a pushdown system is regular and poly-time computable [12]. Surprisingly, $T^*(I)$ it is not necessarily computable even when it is regular. Namely, given a regular language $A$ and a rational transition system $(I, T)$, it is undecidable to check where $T^*(I)=A$ even though the former is known to be a regular set. This post will try to present a self-contained argument about the existence of such systems.
Subword ordering. Our argument would centre around a reflexive and transitive relation called "a subword of": given $x,y\in\Sigma^*$, $x\preceq y$ holds iff $x$ can be obtained by removing some segments from $y.$
Lemma. (Higman) A set is finite if all elements in it are mutually incomparable w.r.t. $\preceq.$
Proof. For $x,y\in\Sigma^*$, let $x\sim y$ iff there are distinct alphabets $a_1,...,a_n$ in $\Sigma$ such that both $x$ and $y$ match $a_1^+a_2^+...a_n^+.$ It is clear that $\sim$ is an equivalence relation splitting $\Sigma^*$ into $\binom{N}{1}1! + \binom{N}{2}2! + ... + \binom{N}{N} N!$ classes where $N=|\Sigma|$, and two words are comparable iff they are in the same class. ☐
Given a set $X\subseteq \Sigma^*$, define the upward closure of $X$ as $uc(X)\equiv\{w:\exists w'\in X.\ w'\preceq w\}.$ We say $X$ is upward closed iff $X=uc(X).$ Higman's lemma tells us that every upward closed set have a finite representation.
Lemma 1. Every upward closed set is regular.
Proof. Note that for any $X\subseteq\Sigma^*$, elements in $min(X)\equiv\{w\in X: \forall w'\in X.\ w'\not\preceq w\}$ are mutually incomparable. It follows from Higman's lemma that $min(X)$ is finite. Also, $X$ is upward closed iff $X=uc(X)=\bigcup_{w\in min(X)}wc(w)$, which implies $X$ is regular since $min(X)$ is finite and $uc(w)$ is regular for every $w\in\Sigma^*.$ ☐
Another consequence of Higman's lemma is that every increasing (in the sense of set inclusion) sequence of upward closed sets is finite.
Lemma 2. If $X_0\subseteq X_1\subseteq X_2\subseteq...$ and $X_i=uc(X_i)$ for all $i\ge0$, then there exists a finite $n$ such that $X_k=X_n$ for all $k\ge n.$
Proof. Let $X\equiv \bigcup_{i\ge 0} X_i.$ Since $min(X)$ is finite by Higman's lemma, there exists $n$ such that $X_k\cap min(X)=\emptyset$ for all $k\ge n.$ But this means $X_k\subseteq X_n$ for all $k\ge n.$ ☐
Lemma 3. For every infinite sequence of words $w_0,w_1,...\in \Sigma^*$, there exists $i<j$ such that $w_i\preceq w_j.$
Lemma 4. For any sets $X_0,X_1,...\subseteq\Sigma^*$, $uc(\bigcup_{i\ge 0}X_i)=\bigcup_{i\ge 0}uc(X_i).$
Lossy channel systems. We briefly introduce the concept of lossy channel systems. A channel system with $n$ queues are like a pushdown system with $n$ stacks except that it uses FIFO queues instead of FILO stacks. Channel systems are more powerful than pushdown systems, because they only need one queue to simulate Turing machines while pushdown systems need two stacks. A lossy channel system is a channel system where contents in the queues can be spontaneously lost. Technically, the capability of losing data can be introduced to a channel system by adding a non-deterministic self loop to each control state that removes an arbitrary alphabet from the queues.
Note that channel systems (lossy or not) can be expressed as rational transition systems where a configuration with n queues is encoded as a word in form of [control state]#[contents of the 1st queue]#...#[contents of the nth queue]#. We can define a subword ordering $\preceq_L$ over the configurations of a channel system as follows:

$c$#$w_1$#$...$#$w_n$# $\preceq_L$ $c'$#$w_1'$#$...$#$w_n'$#$\quad$ iff $\quad c=c'$ and $w_i\preceq w_i'$ for $i=1,...,n.$

Since the number of control states is finite, Lemma 1~3 also holds for $\preceq_L.$
Lossy channel systems enjoy many good properties. Among them, two properties are of particular interest for our establishment:
Fact 1. If $u\rightarrow^* w$ then $u\rightarrow^* w'$ for all $w'\preceq w.$ It follows that the complement of the reachability set is upward closed w.r.t. $\preceq_L.$
Fact 2. For any set of configurations $C$, the set $pre^*(C)\equiv\{w:\exists w'\in C.\ w\rightarrow^* w'\}$ of its predecessors is upward closed w.r.t. $\preceq_L.$
Theorem 1. The reachability set of a lossy channel system is regular.
Proof. This theorem is an immediate consequence of Fact 1, Lemma 1, and the fact that regular sets are closed under complementation. ☐
Given a system $(I, T)$ and a regular set $B\subseteq\Sigma^*$, the reachability problem asks whether $T^*(I)\cap B \neq\emptyset$, i.e., words in $B$ are reachable from words in $I$ through a finite number of transitions.
Theorem 2. The reachability problem of a lossy channel system is decidable.
Proof. Historically, this result was first proved in [6] using backward analysis as follows. Define the backward transition function $pre(C)\equiv\{w:\exists w'\in C.\ w\rightarrow w'\}$, which would preserve regularity when the forward transition relation is regular. Given a regular set $B$ of the bad configurations, let $X_0\equiv B$ and $X_{n+1}\equiv X_n\cup pre(X_n)$ for $n\ge 0$. Define $X\equiv\bigcup_{i\ge 0}X_i$. Note that $X$ is the set of configurations backward reachable from $B$, and thus is upward closed by Fact 2. By Lemma 4, we can construct another sequence $X_0',X_1',...$ converging to $X$ by defining $X_n'\equiv uc(X_n).$ Each $X_{n+1}'$ can be computed from $X_n'$ for $n\ge0.$ Hence, by Lemma 2, the fixed point $X$ can be computed within a finite number of steps. Since $T^*(I)\cap B\neq\emptyset$ if and only if $X\cap I\neq\emptyset$, the reachability problem is decidable. ☐
Remark. The above proof actually gives an algorithm to construct the reachability set. However, the algorithm relies on Higman’s Lemma for termination and thus the time complexity is not primitive recursive [8,9].

Now we proceed to show that the reachability set of a lossy channel system, albeit regular, is not computable. Given a system $(I,T)$ and a word $w$, the recurrent state problem asks whether there exists an infinite trace from $I$ that visits $w$ infinitely often.
Theorem 3. [6] The recurrent state problem is undecidable for lossy channel systems.

To solve the recurrent state problem it suffices to check $w \in T^*(I)$ and $w \in T^*(\{w\})$. Hence, if these two questions are decidable then so is the recurrent state problem, leading to a contradiction with Theorem 3. Therefore, we can conclude that there exists no procedure that computes a finite-state automaton (or any symbolic representation allowing effective membership query) for $T^*(I)$.
Corollary. There exists a regular transition system $(I,T)$ such that $T^*(I)$ is regular but not computable.
Remark. Does the above corollary holds when $T$ is length-preserving? I didn't find any result about this at the time I was writing this post.

Discussion. The proof of Theorem 2 relies on properties of the subword ordering (or a well-quasi-ordering in general) to assert that backward search always terminates. In fact, reachability can be shown both r.e. and co-r.e. whenever a regular proof exists: just launch two procedures at the same time, one enumerating all possible counter-examples and one enumerating all possible proofs. It is clear that exactly one of the two procedures will terminate. (Here a counterexample is an execution path from $I$ to $B$, and a proof is a regular set $P$ satisfying the following three proof rules: i) $I\subseteq P$, ii) $T(P)\subseteq P$, and iii) $P\cap B=\emptyset.$) In particular, reachability is decidable for a transition system whenever the set of reachable configurations is regular.
More generally, the above argument works whenever a "regular" symbolic proof exists and the one-step image $T(X)$ or pre-image $T^{-1}(X)$ can be computed effectively from $X$. The "regularity" here refers to the existence of any finitary representations that are computable and closed under complement and intersection. This argument has been applied to prove decidability results for transition systems such as Petri nets and lossy counter systems [10]. See [11] for a general discussion on this kind of proof techniques.

References and further reading

1. Transitive Closures of Regular Relations for Verifying Infinite-state Systems, 2000.
2. Unreliable Channels are Easier to Verify than Perfect Channels, 1996.
3. On-the-fly Analysis of Systems with Unbounded, Lossy FIFO Channels, 2005.
4. Effective Lossy Queue Languages, 2001.
5. Undecidable Problems in Unreliable Computations, 2003.
6. Undecidability of Verifying Programs with Unreliable Channels, 1996.
7. Verifying Systems with Infinite but Regular State Spaces, 1998.
8. Post Embedding Problem is not Primitive Recursive, 2007.
9. The Ordinal Recursive Complexity of Lossy Channel Systems, 2008.
10. Lossy Counter Machines Decidability Cheat Sheet, 2010.
11. A unified approach for studying the properties of transitions systems, TCS, 1982.
12. On the regular structure of prefix rewriting. TCS, 1992.