Ericpony's Blog: A note on semi-linear and regular sets

Friday, March 18, 2016

A note on semi-linear and regular sets

Regular sets

A set is called regular if it can be recognised by a finite DFA. For

$L\subseteq\Sigma^*$ ,

$\sim_L$ is called a right congruence iff

$u\sim_L v \iff uw\sim_L vw$ for all

$w\in\Sigma^*.$ Note that a right congruence is an equivalence relation. The classes of

$\sim_L$ induce a minimal complete DFA recognising

$L.$ Intuitively, words belonging to the same class in

$\sim_L$ lead to the same state in the DFA. The classes containing the empty string and the strings in

$L$ correspond to the initial state and the final states, respectively.
Myhill-Nerode Theorem. A set

$L$ is regular iff

$\sim_L$ has finite index over

$\Sigma^*.$
Rational sets. A set is called rational iff it can be generated from a finite number of finite subsets of

$\Sigma^*$ through a finite number of rational operations, namely union, concatenation and the Kleene star.
Kleene's theorem. A set of words is regular iff it is rational. (Note: The only-if part doesn't holds for non-length-preserving relations.)

Semi-linear sets

A subset of

$\mathbb N^k$ is linear if it can be expressed as

$u_0 + \langle u_1, \dots, u_m \rangle = \{u_0 + a_1u_1 + \dots + a_mu_m \mid a_1, \dots, a_m \in \mathbb{N}\}$ such that

$u_0,...,u_m$ are vectors of dimension

$k$ . A set is called semi-linear if it is a union of finitely many linear sets.
Fact. Semi-linear sets are closed under complement, intersection, projection, etc., in an effective way.
Fact. A set is semi-linear iff it is Presburger-definable.
Fact. If a set is upward-closed or downward-closed, then it is semi-linear.

Parikh image. Let

$\Sigma=\{a_1,...,a_k\}$ . Given a word

$w$ , we use

$|w|(a)$ to denote the number of occurrences of symbol

$a$ in

$w$ . A function

$\phi:\Sigma^*\rightarrow\mathbb N^k$ is called the Parikh image if the

$i$ -th component of

$\phi(w)$ is

$|w|(a_i).$ [note]

Fact. Parikh image is a morphism from monoid

$(\Sigma^*,\cdot,\epsilon)$ to monoid

$(\mathbb N^k,+,0).$
Fact. A set

$S\subseteq \mathbb N^k$ is semi-linear iff there is a regular set

$R\subseteq\Sigma^*$ such that

$\phi(R)=S$ , where

$\phi$ is the Parikh image. [1]

From RE to its Parikh image. The key observation to computing the Parikh image of a regular set is to describe the depths of Kleene stars by constraints over the dependency between frequencies. For example,

$(a^*b)^*$ can be reduced to

$a^*b^*$ plus a constraint "

$|w|(b)=0$ implies

$|w|(a)=0$ ", which can further resolve to two linear conditions "

$|w|(b)=0\wedge |w|(a)=0$ " and "

$|w|(b)\ge 1 \wedge |w|(a)\ge 0$ " corresponding to a union of two linear sets. See [2] and the references therein for more details abort constructions.

Parikh's Theorem. [1] If

$L$ is a context-free language then there exists a regular language

$R$ such that

$\phi(L)=R.$ (This implies CFL and RL are indistinguishable when the alphabet is commutative.)

Real numbers

A relation in real closed fields is definable in the first-order logic if and only if it is a Boolean combination of polynomial equations [?].

Connection with logic

Connection with MSO. [Buchi] A set

$L\subseteq\Sigma^*$ is regular iff it can be defined in

$MSO(\Sigma, <_{\mathbb N}, \{P_a\}_{a\in\Sigma})$ . [slide]
Connection with LTL. [?] A set is LTL-defiable iff it is

$FO(\Sigma^\omega, <_{\mathbb N}, \{P_a\}_{a\in\Sigma})$ -definable.
Connection with FO. A regular expression constructed without the Kleene star is called star-free. For example,

$(ab)^*$ is star-free but

$(aa)^*$ is not. McNaughton showed that a regular set is star-free iff it can be defined in

$FO(\Sigma, <_{\mathbb N}, \{P_a\}_{a\in\Sigma})$ , cf. [3]. Hence, checking star-freeness of a regular expression is equivalent to checking whether an MSO-formula is FO-definable. The former problems is shown to be PSPACE-complete, see here and here. This complexity result is generalised to checking star-freeness of rational relations, see here.

References

1. On Context-Free Languages, J. ACM, 1966.
2. Semi-linear Parikh Images of Regular Expressions via Reduction, MFCS10.
3. First-order definable languages, 2008.
4. Jewels of Formal Language Theory, 1982.

Ericpony's Blog