This is ericpony's blog

Sunday, December 20, 2015

Contextual equivalence (in progress)

\( \def\[#1]{[\![\,#1\,]\!]} \def\V[#1]{\mathcal{V}\,[\![#1]\!]} \def\E[#1]{\mathcal{E}\,[\![#1]\!]} \def\G[#1]{\mathcal{G}\,[\![#1]\!]} \)
A type environment $\Delta$ is defined by $\Delta ::= \cdot\ |\ \Delta, \alpha$, where $\alpha$ denotes a type variable. A judgement of the form $\Delta\vdash \tau$ states that type $\tau$ is well-formed in $\Delta$, i.e., all free type variables in $\tau$ appear in $\Delta$. An example of a judgement of this kind is $\beta \vdash \forall \alpha.\, \alpha\rightarrow \beta$. We lift the judgement to define $\Delta \vdash \Gamma$, meaning $\forall x\in dom(\Gamma).\,\Delta\vdash\Gamma(x)$. That is, all the types in the range of $\Gamma$ are well-formed in $\Delta$. When we write $\Delta; \Gamma \vdash e:\tau$, we mean $\Gamma \vdash e:\tau$ together with the requirement that $\tau$ is well-formed in $\Delta$. A type parameter $\alpha$ is introduced into a term by type abstraction $\Lambda \alpha.\, e$, and eliminated by type application $e[\tau]$ (for $e$ that evaluates to a term of the form $\Lambda \alpha.\, e'$). The corresponding typing rules are$$\frac{\Delta, \alpha; \Gamma \vdash e: \tau}{\Delta; \Gamma \vdash \Lambda \alpha.\,e: \forall\alpha.\tau}\quad\mbox{and}\quad\frac{\Delta; \Gamma \vdash e: \forall\alpha.\tau\quad\Delta \vdash \tau'}{\Delta; \Gamma \vdash e[\tau']: \tau[\tau'/\alpha]}.$$In this lecture, we shall consider an extension of STLC with type abstraction and type application. This extension is in fact a terminating fragment of System F.
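As a concrete aside, the same two constructs can be mimicked in GHC Haskell (a minimal sketch, not the calculus of this lecture): an explicit forall plays the role of type abstraction, and visible type application via the TypeApplications extension plays the role of $e[\tau]$. The names polyId and main are purely illustrative.

    {-# LANGUAGE TypeApplications, ExplicitForAll #-}

    -- polyId corresponds to the type abstraction \Lambda a. \x:a. x,
    -- with type forall a. a -> a.
    polyId :: forall a. a -> a
    polyId x = x

    -- Visible type application: polyId @Bool corresponds to polyId[bool],
    -- instantiating the type parameter before the value argument is supplied.
    main :: IO ()
    main = print (polyId @Bool True)   -- prints True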
A context can be understood as a complete program with a single hole that accepts an expression of a certain type. Context typing is in general expressed as $C: (\Gamma\vdash \tau) \leadsto (\Gamma'\vdash \tau')$: given such a context $C$, embedding any expression $e$ with $\Gamma\vdash e:\tau$ into the hole yields $\Gamma'\vdash C[e]:\tau'$.
Two expressions $e_1$ and $e_2$ are contextually equivalent in $\Gamma$ (written as $\Gamma \vdash e_1 \approx_{ctx} e_2: \tau$) iff $\Gamma \vdash e_1:\tau$, $\Gamma \vdash e_2:\tau$, and for all $C:(\Gamma \vdash \tau) \leadsto (\cdot \vdash bool)$ there exists a value $v$ such that $C[e_1] \Downarrow v \wedge C[e_2] \Downarrow v$.
Note that here we only consider contextual equivalence for terminating programs; we shall extend the definition to general programs a bit later in this lecture. Also, we assume that the context yields a closed Boolean program after $e$ is embedded. This assumption is made without loss of generality: for any two programs of a type inhabited by at least two values, there exists a context that maps them to different Booleans if and only if the two programs are not contextually equivalent. (Any two programs of a type with only one inhabitant are contextually equivalent by definition.)
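As an informal illustration, a context can be modelled in Haskell as a function whose argument plays the role of the hole and whose result is the Boolean observation the context makes. This is only a sketch; the terms e1, e2 and the context c below are hypothetical examples, not taken from the lecture.

    -- A rough model of a context C : (. |- Int -> Int) ~> (. |- bool).
    type Ctx = (Int -> Int) -> Bool

    c :: Ctx
    c hole = hole 0 == 0          -- plug a term into the hole, observe a Boolean

    e1, e2 :: Int -> Int
    e1 x = x                      -- the identity
    e2 x = x + 1                  -- differs from e1 on every input

    main :: IO ()
    main = print (c e1, c e2)     -- (True,False): c distinguishes e1 from e2

Contextual equivalence demands agreement under every such context, so a single distinguishing context like c is enough to show that e1 and e2 are not contextually equivalent.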
The main difficulty of reasoning about contextual equivalence lies in the fact that equivalence must be shown for all contexts $C$. Proving equivalence by induction on the structure of $C$ is problematic in the cases $C = \lambda x:\tau.\, C'$ and $C = \Lambda \alpha.\,C'$. Hence, instead of proving contextual equivalence directly, one usually resorts to proof methods such as logical relations or bisimulation techniques. We shall prove the following two interesting theorems in this lecture to demonstrate the technique of logical relations:
Theorem 1. If $\vdash e: \forall \alpha.\, \alpha\rightarrow\alpha$ and $\vdash \tau$ and $\vdash v:\tau$, then $e[\tau]\ v \hookrightarrow^* v$.
Theorem 2. If $\vdash e: \forall \alpha.\, \alpha\rightarrow bool$, $\vdash \tau_1 $, $\vdash \tau_2$, $\vdash v_1:\tau_1$, and $\vdash v_2:\tau_2$, then $e[\tau_1]\ v_1$ and $e[\tau_2]\ v_2$ evaluate to the same Boolean value.
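As a quick sanity check (not a proof), instantiate Theorem 1 with the syntactic identity function $id = \Lambda \alpha.\,\lambda x:\alpha.\, x$: for any $\vdash \tau$ and $\vdash v:\tau$ we indeed have$$id[\tau]\ v \hookrightarrow (\lambda x:\tau.\, x)\ v \hookrightarrow v.$$The force of the theorem is that every closed term of type $\forall \alpha.\, \alpha\rightarrow\alpha$ behaves this way, not just this particular one.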
The idea is to define logical relations such that logically related terms are contextually equivalent; contextual equivalence can then be established by exhibiting the relation. Let's first define value interpretations for some closed types:
$\V[bool]=\{\ (true,true),\ (false,false)\ \}$
The slogan of LRs for function types is "You give me related inputs; I give you related outputs." Do not confuse this with Theorem 2, where the two inputs are not required to be related.
$\V[\tau\rightarrow\tau']=\{\ (\lambda x:\tau.\, e_1,\lambda x:\tau.\, e_2)\ |\ \forall (v_1, v_2)\in \V[\tau].\, (e_1[v_1/x], e_2[v_2/x]) \in \E[\tau']\ \}$
The interpretation for polymorphic types is new.
$\V[\forall \alpha.\, \tau]=\{\ (\Lambda \alpha.\,e_1, \Lambda \alpha.\,e_2)\ |\ \forall \tau_1, \tau_2.\, (e_1[\tau_1/\alpha], e_2[\tau_2/\alpha]) \in \E[\tau](\rho[\alpha\mapsto(\tau_1, \tau_2)])\ \}$
Note that $e_1[\tau_1/\alpha]$ and $e_2[\tau_2/\alpha]$ have different types, namely $\tau[\tau_1/\alpha]$ and $\tau[\tau_2/\alpha]$. Hence we relate them with a LR at the parametric type $\tau$, and use $\rho$ (called a relational substitution) to remember the types $\tau_1,\ \tau_2$ picked for the type parameter $\alpha$. In general, a relational substitution has the form $\{\alpha_1\mapsto(\tau_{11},\tau_{12}), \alpha_2\mapsto(\tau_{21},\tau_{22}), ... \}$. We shall write $\V[\tau](\rho)$ to express the fact that $\V[\tau]$ is parametrized by the substitution $\rho$, assuming that $dom(\rho)\supseteq FTV(\tau)$.
$\V[\alpha]=\{\ (\tau_1, \tau_2)\ |\ \rho(\alpha)=(\tau_1, \tau_2)\ \}$
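To see how these definitions are used, consider the closed type $bool\rightarrow bool$ (reading $\E[\tau']$ as the evident lifting of $\V[\tau']$ to expressions that evaluate to related values, since $\E$ has not been defined in this note yet). The pair $(\lambda x:bool.\, x,\ \lambda x:bool.\, true)$ is not in $\V[bool\rightarrow bool]$: the related inputs $(false, false)\in\V[bool]$ are mapped to the unrelated outputs $(false, true)$. By contrast, $(\lambda x:bool.\, x,\ \lambda x:bool.\, x)$ is in $\V[bool\rightarrow bool]$, since related Boolean inputs are returned unchanged.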
(to be continued)

Thursday, December 10, 2015

Logical relations as a proof technique — Part II

\( \def\[#1]{[\![\,#1\,]\!]} \def\V[#1]{\mathcal{V}\,[\![#1]\!]} \def\E[#1]{\mathcal{E}\,[\![#1]\!]} \def\G[#1]{\mathcal{G}\,[\![#1]\!]} \)
This is Part II of my lecture notes based on the lecture series "Logical Relations" given by Amal Ahmed at Oregon Programming Languages Summer School, 2015. See here for Part I of these notes.

Type safety. The slogan for type safety is "if a term is well-typed, it won't get stuck." Formally, this means that if $\vdash e:\tau$ and $e\hookrightarrow^* e'$, then either $e'$ is a value, or there exists $e''$ such that $e'\hookrightarrow e''$. Note that type safety doesn't require termination, and thus is weaker than the strong normalization we proved earlier. On the other hand, type safety can be proved for Turing-complete languages such as STLC with recursive types (whereas strong normalization cannot).
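To make "stuck" concrete, here is a minimal sketch in Haskell of a tiny untyped call-by-value language (illustrative only, not the lecture's formal system). The closed term App (Lit True) (Lit False) is neither a value nor reducible, i.e., it is stuck; this is exactly the kind of term a type system is meant to rule out.

    -- A tiny untyped language with Booleans, lambdas and application,
    -- used only to illustrate stuck terms (all names are illustrative).
    data Expr
      = Lit Bool
      | Var String
      | Lam String Expr
      | App Expr Expr
      deriving Show

    isValue :: Expr -> Bool
    isValue (Lit _)   = True
    isValue (Lam _ _) = True
    isValue _         = False

    -- Naive substitution; adequate here because we only substitute closed values.
    subst :: String -> Expr -> Expr -> Expr
    subst x v (Var y) | y == x    = v
                      | otherwise = Var y
    subst x v (Lam y b) | y == x    = Lam y b
                        | otherwise = Lam y (subst x v b)
    subst x v (App f a) = App (subst x v f) (subst x v a)
    subst _ _ (Lit b)   = Lit b

    -- One small-step (call-by-value) reduction; Nothing means no rule applies.
    step :: Expr -> Maybe Expr
    step (App (Lam x b) a)
      | isValue a = Just (subst x a b)
      | otherwise = App (Lam x b) <$> step a
    step (App f a) = (`App` a) <$> step f
    step _         = Nothing

    -- App (Lit True) (Lit False) applies a Boolean as if it were a function:
    -- it is not a value and no reduction rule fires, so it is stuck.
    main :: IO ()
    main = print (isValue stuck, step stuck)   -- (False,Nothing)
      where stuck = App (Lit True) (Lit False)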
A well-known approach to proving type safety is via the progress and preservation lemmas (note that these are proof methods for type safety, not its definition):
Progress lemma. If $\vdash e:\tau$, then either $e$ is a value, or $\exists e'.\, e\hookrightarrow e'.$
Preservation lemma. If $\vdash e:\tau$ and $e\hookrightarrow e'$, then $\vdash e':\tau$.
Once these lemmas are established, type safety can be proved easily by induction. In this lecture, however, we shall use logical relations instead of the above lemmas to prove type safety.

We need a bunch of definitions before we proceed. Define that $val(e)$ holds iff $e$ is a value; $red(e)$ holds iff $\exists e'.\, e\hookrightarrow e'$; $safe(e)$ holds iff $\forall e'.\, e \hookrightarrow^* e' \implies red(e') \vee val(e')$. Now the definition of type safety can be stated as $\vdash e:\tau \implies safe(e)$. Recall that to prove strong normalization, we introduced a logical predicate $SN_\tau$ and used it to break the statement $\vdash e:\tau \implies e\Downarrow$ into $\vdash e:\tau \implies SN_\tau(e)$ and $SN_\tau(e) \implies e\Downarrow$. Here we shall adopt a similar strategy. To begin with, we define the interpretations of type-safe values and of type-safe closed expressions as
$\V[bool] = \{\ false,\, true\ \}$
$\V[\tau\rightarrow\tau']=\{\ \lambda x:\tau.\, e\ |\ \forall v \in \V[\tau].\, e[v/x] \in \E[\tau']\ \}$
$\E[\tau]=\{\ e\ |\ \forall e'.\, e \hookrightarrow^* e' \implies red(e') \vee e' \in \V[\tau]\ \}$
Now, to prove $\vdash e:\tau \implies safe(e)$, it suffices to prove that $\vdash e:\tau \implies e \in \E[\tau]$ and $e \in \E[\tau] \implies safe(e)$. As before, the main difficulty lies in proving the first part, since the second follows immediately from the definition of $\E[\tau]$. Also, while we are proving type safety for closed terms, we need to use a hypothesis about open terms so as to apply induction to lambda expressions. For this purpose, we define a value interpretation for environments as follows:
$\G[\emptyset] = \{\emptyset\}$
$\G[\,\Gamma, x:\tau\,] = \{\ \gamma[x\mapsto v]\ |\ \gamma \in \G[\Gamma] \wedge v \in \V[\tau]\ \}$
Hence, a substitution $\gamma$ is in $\G[\Gamma]$ iff it maps variables to values that are type-safe and consistent with $\Gamma$. An expression is called semantically type-safe in $\Gamma$, written as $\Gamma \models e:\tau$, if and only if $\gamma(e) \in \E[\tau]$ for all substitutions $\gamma \in \G[\Gamma]$. That is, $\Gamma \models e:\tau$ iff $e$ becomes a type-safe closed expression whenever its free variables are replaced by type-safe values consistent with $\Gamma$.
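For instance, for $\Gamma = x:bool$ we have $\G[\Gamma] = \{\ \{x\mapsto true\},\ \{x\mapsto false\}\ \}$, and $\Gamma\models x:bool$ holds because both substitutions map $x$ to a value in $\V[bool]$, and every value in $\V[\tau]$ is also in $\E[\tau]$ (a value multi-steps only to itself, which is in $\V[\tau]$).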
As mentioned above, to prove type safety it suffices to show A) $\vdash e:\tau \implies \models e:\tau$, and B) $\models e:\tau \implies safe(e)$. Part A says that the syntactic notion of type safety (i.e., well-typedness) implies the semantic notion of type safety, namely that a term will not get stuck during execution. It is in fact a special case of the so-called "Fundamental Property" of the logical relation: $\Gamma \vdash e:\tau \implies \Gamma \models e:\tau$. Part B can be shown directly as follows. Recall that $safe(e)$ holds iff $\forall e'.\, e \hookrightarrow^* e' \implies red(e') \vee val(e')$. Suppose that $\models e:\tau$ and $e \hookrightarrow^* e'$. Since $\emptyset(e)=e$, we have $e\in \E[\tau]$, and hence $red(e') \vee e' \in \V[\tau]$; that is, $e'$ is reducible or $e'$ is a value. This concludes the proof.
You can see here and in the example of strong normalization that the property of interest follows almost immediately from the definition of the logical relations we use. This is often the case if you construct suitable logical relations to assist your proof.
Now we are going to prove the Fundamental Property, i.e., $\Gamma \vdash e:\tau \implies \Gamma \models e:\tau$. As we did in the proof of strong normalization, we perform induction on the type derivations case by case.
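For instance, the two simplest cases go as follows. For $\Gamma\vdash true:bool$, every $\gamma\in\G[\Gamma]$ gives $\gamma(true)=true\in\V[bool]\subseteq\E[bool]$, so $\Gamma\models true:bool$ (and similarly for $false$). For the variable rule $\Gamma(x)=\tau\ \rhd\ \Gamma\vdash x:\tau$, every $\gamma\in\G[\Gamma]$ maps $x$ to some value in $\V[\tau]\subseteq\E[\tau]$, so $\Gamma\models x:\tau$. The interesting cases are, as in the strong normalization proof, abstraction and application.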


(to be continued)

Saturday, December 5, 2015

Logical relations as a proof technique — Part I

This is Part I of my lecture notes based on the lecture series "Logical Relations" given by Amal Ahmed at Oregon Programming Languages Summer School, 2015. See here for Part II.

Overview. These lectures will introduce the proof method of logical relations. Logical relations have been used to prove a wide range of properties, including normalization, type safety, equivalence of programs, non-interference (in security typed languages), and compiler correctness. We will start by looking at how to prove normalization and type safety for the simply-typed lambda calculus (STLC). We will then add recursive types to STLC and develop a step-indexed logical relation, using step-indexing to ensure that the logical relation remains well-founded. We will cover polymorphic and existential types and develop a logical relation to reason about the associated notions of parametricity and representation independence. The lectures will also discuss how to establish that a logical relation corresponds to contextual equivalence.

Free theorems. Under parametric polymorphism, a program cannot inspect or branch on the types at which it is instantiated. Hence, programs with polymorphic types do not change their behavior according to their type parameters. It follows that every function of type $\forall a.\, a \rightarrow a$ is an identity function, since nothing but the argument can be returned by the function. Similarly, every function of type $\forall a.\, a \rightarrow Int$ is a constant function. In contrast, there is no function of type $\forall a.\, Int \rightarrow a$, since a value of an arbitrary type cannot be produced from nowhere. Results of this kind are called free theorems, and many of them can be proved using logical relations.
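These claims can be previewed in Haskell, where they hold up to non-termination and seq (which weaken parametricity there). The definitions below are only illustrative; the point is that any total definition at these types is forced to behave this way.

    {-# LANGUAGE ExplicitForAll #-}

    -- Any total function of this type can only return its argument,
    -- so it must behave like the identity.
    f :: forall a. a -> a
    f x = x

    -- Any total function of this type cannot inspect its argument,
    -- so it must ignore it: it is a constant function.
    g :: forall a. a -> Int
    g _ = 42

    -- There is no total h :: forall a. Int -> a: a value of an arbitrary
    -- type cannot be produced from an Int (only undefined or an infinite
    -- loop would "inhabit" that type).

    main :: IO ()
    main = print (f True, f "hi", g (), g [1, 2, 3])   -- (True,"hi",42,42)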

Logical predicates. Unary logical relations are also called logical predicates. They are used to assert properties of a single program, such as termination. In contrast, binary logical relations are used to assert relations between two programs, such as equivalence. Note that we only put closed programs in a logical relation. By and large, to prove that a well-typed and closed term $e$ has some special property, we shall set up a predicate $P_\tau$ such that $P_\tau(e)$ holds if and only if all of the following conditions hold: i) $\vdash e: \tau$, ii) $e$ has the property of interest, and iii) the property of interest is preserved by elimination forms (or introduction forms) of type $\tau$.

Strong normalization. Historically, logical relations first appeared as a technique to prove strong normalization, the property that every reduction sequence of a well-typed term terminates, regardless of the evaluation strategy. Since evaluation in STLC as considered here is deterministic, proving strong normalization for STLC amounts to showing that every well-typed program terminates, that is, $\vdash e:\tau \implies e \Downarrow$. Observe that this statement cannot be proved by applying structural induction on typing derivations directly. The induction breaks down at the application rule: we cannot conclude $e_1\ e_2\Downarrow$ from $e_1 \Downarrow$ and $e_2 \Downarrow$ alone. Nevertheless, this problem can be resolved by using a stronger induction hypothesis, which leads to the following definition of logical predicates.
We shall set up a predicate $SN_\tau(\cdot)$ such that $SN_\tau(e)$ holds only if term $e$ has type $\tau$ and is strongly normalizing. We consider two kinds of types for STLC: the Boolean type and function types. For the Boolean type, it suffices to define that $SN_{bool}(e)$ holds iff $\vdash e:bool \wedge e \Downarrow$, since $bool$ has no elimination forms here. For function types, we define that $SN_{\tau\rightarrow\tau'}(e)$ holds iff
$\vdash e:\tau\rightarrow\tau'\quad\wedge\quad e \Downarrow\quad \wedge\quad (\forall e'.\, SN_{\tau}(e') \implies SN_{\tau'}(e\ e'))$.
(1)
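For example, $SN_{bool\rightarrow bool}(\lambda x:bool.\, x)$ holds: the term is a closed value of type $bool\rightarrow bool$ (so it trivially terminates), and for any $e'$ with $SN_{bool}(e')$ we have $e'\Downarrow v$ for some Boolean value $v$, hence $(\lambda x:bool.\, x)\ e' \hookrightarrow^* v$ and therefore $SN_{bool}((\lambda x:bool.\, x)\ e')$ holds.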
We shall see that this definition provides the strengthened hypothesis we need to deal with the application rule. The proof of $\vdash e:\tau \implies e \Downarrow$ now breaks into two parts: A) $\vdash e:\tau \implies SN_{\tau}(e)$, and B) $SN_{\tau}(e) \implies e \Downarrow$. Part B follows immediately from the definition of $SN_\tau(e)$. It remains to prove part A, and we rely on structural induction to do this. But again, we cannot use the statement $\vdash e:\tau \implies SN_{\tau}(e)$ we are going to prove as an induction hypothesis: it only speaks about closed terms, whereas to show that it holds for terms of function type, say, for $(\lambda x:\tau.\, e'): \tau \rightarrow \tau'$, we need a hypothesis about the open term $e'$ in order to perform induction.
Instead, we shall use the following stronger claim as an induction hypothesis. Given a mapping $\gamma$ from variables to values, we write $\gamma \models \Gamma$ when $dom(\gamma)=dom(\Gamma)$ and $\forall x\in dom(\Gamma).\, SN_{\Gamma(x)}(\gamma(x))$ holds. That is, $\gamma$ maps each variable in $dom(\Gamma)$ to a strongly normalizing value whose type is compatible with $\Gamma$. We claim that for any term $e$ and substitution $\gamma$,
$\Gamma \vdash e:\tau \wedge \gamma \models \Gamma  \implies SN_{\tau}(\gamma(e))$.
(2)
It is clear that part A follows from this claim by setting $\Gamma$ to be empty. Now we shall prove (2) by inducting on the derivation rules case by case.
Case $\quad\rhd\ \ \vdash true: bool\quad$ and $\quad\rhd\ \ \vdash false: bool$.
 Note that $\gamma(true)=true$ and $\gamma(false)=false$, and these Boolean values are of course strongly normalizing.
Case $\quad\Gamma(x)=\tau\ \ \rhd\ \ \Gamma \vdash x:\tau$.
 By premises $\Gamma(x)=\tau$ and $\gamma\models\Gamma$, $\gamma(x)$ has type $\tau$. Thus $SN_{\tau}(\gamma(x))$ holds by $\gamma\models\Gamma$.
Case $\quad\Gamma\vdash e': \tau\rightarrow\tau'$, $\Gamma\vdash e: \tau\ \ \rhd\ \ \Gamma\vdash e'\ e: \tau'$.
 Note that $\gamma(e'\ e)=\gamma(e')\ \gamma(e)$. By the premise $\gamma\models\Gamma$ and the induction hypothesis, both $SN_{\tau\rightarrow\tau'}(\gamma(e'))$ and $SN_{\tau}(\gamma(e))$ hold. Thus $SN_{\tau'}(\gamma(e')\ \gamma(e))$, which equals $SN_{\tau'}(\gamma(e'\ e))$, holds by (1).
Case $\quad\Gamma, x:\tau \vdash e':\tau'\ \ \rhd\ \ \Gamma\vdash\lambda x:\tau.\, e': \tau\rightarrow\tau'$.
  This is the case we got stuck with before introducing logical predicates, so it is not surprising that the proof for this case is far more complicated than others. Before we proceed, we need the following two lemmas:
Lemma 1. (substitution lemma) If $\Gamma \vdash e:\tau$ and $\gamma \models \Gamma$, then $\vdash \gamma(e):\tau.$
Lemma 2. (SN is preserved by forward/backward reductions) Suppose that $\vdash e:\tau$ and $e \hookrightarrow e'$. Then $SN_{\tau}(e)$ holds if and only if $SN_{\tau}(e')$ holds.
To prove (2), i.e., that $\Gamma \vdash \lambda x:\tau.\, e' : \tau\rightarrow\tau'$ and $\gamma\models\Gamma$ imply $SN_{\tau\rightarrow\tau'}(\gamma(\lambda x:\tau.\, e'))$, it suffices to prove the following three statements:
1) $\gamma(\lambda x:\tau.\, e')$ has type $\tau\rightarrow\tau'$. This follows by premise $\gamma\models \Gamma$ and Lemma 1;
2) $\gamma(\lambda x:\tau.\, e')\Downarrow$. This is obvious, since $\gamma(\lambda x:\tau.\, e')=\lambda x:\tau.\,\gamma(e')$ is a lambda value;
 3) $\forall e.\,SN_\tau(e)\implies SN_{\tau'}(\gamma(\lambda x:\tau.\, e')\ e)$. By the induction hypothesis, $\Gamma, x:\tau \vdash e':\tau'$ and $\gamma'\models \Gamma, x:\tau$ together imply $SN_{\tau'}(\gamma'(e'))$. Suppose that $SN_\tau(e)$ holds for some $e$. Then $e\Downarrow v$ for some value $v$, and $SN_{\tau}(v)$ holds by (repeated application of) Lemma 2. Given the premises $\Gamma \vdash \lambda x:\tau.\, e' : \tau\rightarrow\tau'$ and $\gamma\models\Gamma$, define $\gamma'=\gamma[x \mapsto v]$. Note that $\gamma'\models\Gamma,x:\tau$, so $SN_{\tau'}(\gamma'(e'))$ holds by the induction hypothesis. Now observe that
$\gamma(\lambda x:\tau.\,e')\ e\ \hookrightarrow^*\ (\lambda x:\tau.\,\gamma(e'))\ v\ \hookrightarrow\ \gamma(e')[v/x]\ =\ \gamma'(e')$.
It then follows from $SN_{\tau'}(\gamma'(e'))$, by applying Lemma 2 backwards along this reduction sequence, that $SN_{\tau'}(\gamma(\lambda x:\tau.\,e')\ e)$ holds. This concludes our proof of strong normalization for STLC.

(End of lecture note Part I. See here for Part II.)