This is ericpony's blog

Monday, June 16, 2014

Functors and Monads in Scala ─ Part I

A type constructor is a type that you can apply type arguments to construct a new type: given a type A and a type constructor M, you get a new type M[A]. A functor is a type constructor, say M, together with a function
    lift: (A -> B) -> M[A] -> M[B]
that lifts a function over the original types into a function over the new types. Moreover, we require that the lifting function should preserve composition and identities. That is, if function h is formed by composing functions g and f, then lift(h) should be equivalent to the composition of lift(g) and lift(f). Formally, this means
    h(a) = g(f(a))  implies  lift h a = lift g (lift f a).
Also, if h is an identity function on variables of type A, then lift(h) should be an identity function on variables of type M[A]:
    h(a) = a  implies  lift h a = a.
A monad is a functor associated with two special operations, unitizing and binding:
    unit: A -> M[A]
    bind: (A -> M[B]) -> M[A] -> M[B]
A unitizing operation takes an object of the original type and maps it to an object of the new type. A binding operation is useful in composing two monadic functions, i.e., functions with signature $*\rightarrow M[*]$. Using binding, the composition of two monadic functions f: A -> M[B] and g: B -> M[C] can be defined as
    a => bind g f(a),
which has type A -> M[C]. Composition of monadic functions is supposed to be associative and has unit as a left and right identity. For this purpose, definitions of bind and unit must satisfy three "monad laws". That is, the following three equations
1.    bind (a => bind g f(a)) la = bind g (bind (a => f(a)) la)
2.    bind f (unit a) = f(a)
3.    bind unit la = la
should hold for any values a with type A, la with type M[A], functions f with type A -> M[B] and g with type B -> M[C].

A monad can be thought of as representing computations just like a type represents values. These computations are operations waiting to be carried out. Presumably, the programmer has some interest in not carrying them out directly. For example, the computations to be done may not be fully known at compile time. One can find similar ideas of representing computations as objects in the Command design pattern of GoF. Regarding monads as computations, the purpose of unit is to coerce a value into a computation; the purpose of bind is to construct a computation that evaluates another computation and yields a value. Informally, unit gets us into a monad and bind gets us around the monad. To get out of the monad, one usually uses an operation with type M[A] -> A, though the precise type depends on the purpose of the monad at hand.

In Scala, a monad is conventionally treated like a collection associated with list operations: the unit operation corresponds to a constructor of singleton lists; lift and bind are named as map and flatMap, respectively, in the sense that lift maps a list to another list and bind maps a list of lists to another list of lists and then flattens the result to a one-layered list. Map and flatMap operations kind of subsume each other given the flatten operation, as
    flatMap g la = flatten (map g la)
    map (a => f(a)) la = flatMap (a => unit f(a)) la
Hence, the minimal set of operations a monad must provide can be {unit, flatMap} or {unit, map, flatten}. Scala doesn't define a base trait for Monad and leaves the decision of interface to the programmer. On the other hand, if a monad implements methods map, flapMap and filter, then it can be composed with other monads via for-comprehension to streamline monadic computations.

Examples

Optional values. Option (aka. Maybe) monads are used extensively in many functional programming languages: see here for a more general treatment of option monads, and here for a detailed explanation of their usages in Scala. Below is a minimalistic implementation of Option type, and its companion subtypes Some and None:
sealed trait Option[T] {
    def map[U](f: T => U):Option[U]
    def flatMap[U](f: T => Option[U]): Option[U]
}
case class Some[T](val t:T) extends Option[T] { 
    override def map[U](f: T => U): Option[U] = new Some[U](f(t))
    override def flatMap[U](f: T => Option[U]): Option[U] = f(t)
}
case class None[T] extends Option[T] {
    override def map[U](f: T => U): Option[U] = new None[U]
    override def flatMap[U](f: T => Option[U]): Option[U] = new None[U]
}
The canonical advantage of the option monad is that it allows composition of partial functions, i.e., functions that return a null pointer when they fail to give correct results. Instead, a partial function returns None (representing an "empty" Option) and this result may be composed with other functions. In other words, it allows transformation of partial functions into total ones. For example, suppose g1, g2 and g3 are functions that return either a null pointer or an addable value, and we want to print g3(g2(x) + g1(x)) given that the result is meaningful. Without Option, one has to use nested if-then null checks or a try-catch block to handle possible null inputs/outputs, for example,
try{ 
  val y = g3(g2(x) + g1(x))
  if(y != null) println(y)
}catch {
  case e : NullPointerException => ; 
}
If we make g1, g2 and g3 return an option monad via a wrapper, we can achieve the same purpose using for-comprehension, so that nothing would be printed if any of g1, g2 or g3 returns null:
def monadify[T](f: T => T) =
  (x: T) => { 
    val y = if(x!=null) f(x) else x
    if(y != null) Some(y) else None
  }
for {
  y <- monadify(g1)(x)
  z <- monadify(g2)(x)
  r <- monadify(g3)(y + z)
} yield println(r)
Whether the functional ("idiomatic") way of handling null pointers is preferable to the traditional one is still under debate, especially among sophisticated users of imperative programming languages. Since these two approaches differ very little in performance and Scala supports both of them, the choice is really only a matter of taste than truth IMHO.

An important thing to notice about this example is that the monad is used here as a control structure. We have got a sequence of operations chained together in a monad via for-comprehension. When we evaluate the monadic sequence, the implementation of the binding operator that combines monads actually does control flow management. This trick of using the chaining operator to insert control flow into monadic sequences is an incredibly useful technique and is used in Scala by a variety of monads beside Option, such as Try, Either, Future and Promise.

State transformers. A stateful computation can be modeled as a function that takes inputs with an initial state and produces an output paired with the new state. Due to the presence of states, a function may produce different results even though it is given the the same inputs. Let Env be a type representing the possible states of the environment. An Effect is a mapping from Env to Env that describes changes of states, i.e., the side-effects caused by a computation, and the monad type M adjoins an extra side-effect to any type:
    Effect = Env -> Env
    M[A] = (A, Effect)
The unit function pairs up a value with the identity on side-effects; the map function propagates a state leaving it unchanged:
    unit a = (a, e => e)
    (a, e) map f = (f(a), e)
If you want to adjoin two side-effects e and e', you can use bind to compose them and get a single effect e' ○ e, where ○ denotes function composition:
    (a, e) bind (a => (f(a), e')) = (f(a), e' ○ e)    
In Scala, the state monad described above may be written as
case class M[+A](a: A, e: Env => Env) {
    private def bind[B](f: A => M[B]): M[B] = {
      val fa = f(a); M(fa.a, env => fa.e(e(env)))
    }
    override def toString = showM(this)
    def map[B](f: A => B): M[B] = bind(x => unitM(f(x)))
    def flatMap[B](f: A => M[B]): M[B] = bind(f)
}
def unitM[A](a: A) = M(a, env => env)
The following example shows a state monad that increments a counter each time an arithmetic operation is invoked. (Note that one can actually declare the counter as a field member, avoiding the use of monads. This may explain the viewpoint that state monads are arguably less important in impure FPs like Scala than in pure FPs like Haskell.) The add and div operations demonstrate how to handle a monad wrapped in another monad using nested for-comprehensions.
type Env = Int
// Ad hoc operations for this particular monad
def showM[A](m: M[A]) = "Value: " + m.a + "; Counter: " + m.e(0)
val tickS = M(0, e => e + 1)

// Following code demonstrates the use of monads
type Value = Double
def add(a: M[Option[Value]], b: M[Option[Value]]): M[Option[Value]] = 
  for {
    _  <- tickS
    aa <- a 
    bb <- b
    c = for {
      a <- aa if aa.nonEmpty
      b <- bb if bb.nonEmpty
    } yield a + b     // yield None if aa is None or bb is None
  } yield c

def div(a: M[Option[Value]], b: M[Option[Value]]): M[Option[Value]] = 
  for {
    _  <- tickS
    aa <- a
    bb <- b
    c = for {
      a <- aa if aa.nonEmpty
      b <- bb if bb.nonEmpty
      b <- Some(b) if b != 0
    } yield a / b     // yield None if b==0 or aa is None or bb is None
  } yield c

val const0 = unitM(Some(0.0))
val const1 = unitM(Some(1.0))
val const2 = unitM(Some(2.0))
val const3 = unitM(Some(3.0))
val expr = div(add(const3, add(const1, const2)), const2)  
println(expr)                            // Value: Some(3.0); Counter: 2
println(add(div(expr, const0), const1))  // Value: None; Counter: 5
The syntax sugar of for-comprehensions abstracts the details and help the user manipulate monads without getting involved in the underlying machinery of binding and lifting. The fact that the abstraction works as expected is a consequence of the monad laws, and it is up to the designer of the monad to assure that the implementation does keep the laws.

We just stop here as this post is getting too long. More examples, techniques and discussions will be introduced in the Part II of this tutorial.

References and resources

1. Monads are Elephants, Part 1 ~ Part 4, is a gentle and pragmatic tutorial that explains exactly what a Scala programmer needs to know to exploit monads. Scala Monads: Declutter Your Code With Monadic Design is one of the videos on YouTube that teaches you how to write modular and extensible programs with monadically structured code.
2. Many excellent papers and slides about monads could be found in the homepage of Philip Wadler, one of the most enthusiastic advocates of monadic design in the FP community. Among others, The Essence of Functional Programming and Monads for Functional Programming are two must-reads for anyone who is struggling to turn monadic reasoning into his intuitions.
3. For a general and rigid mathematical treatment of monads, see e.g., Arrows, Structures, and Functors: The Categorical Imperative, where Section 10.2 describes monads as generalized monoids in a way that basically makes everything that we can write down or model using abstract syntax a monad. Mathematically mature computer science readers will find everything they need to know about the subject in the celebrated book Categories for the Working Mathematician by Mac Lane, the co-founder of category theory.
4. There are also books available online that introduce programmers with moderate math background into the subject, including Category Theory for Computing Science by M. Barr and C. Wells, and Computational Category Theory by D.E. Rydeheard and R.M. Burstall. Also check this thread and references therein about the relations between category theory, type theory, and functional programming.

No comments:

Post a Comment

Related Posts Plugin for WordPress, Blogger...