Fun with parallel monad comprehensions (The Monad.Reader)
This article is a re-publication of an article that I wrote some time ago for The Monad.Reader, an online magazine about functional programming and Haskell. You can also read the article in the original PDF format as part of Issue 18 (together with two other interesting articles). The samples from the article can be found on Github.
Monad comprehensions have an interesting history. They were the first syntactic extension for programming with monads. They were implemented in Haskell, but later replaced with plain list comprehensions and monadic do notation. Now, monad comprehensions are back in Haskell, more powerful than ever before!
Redesigned monad comprehensions generalize the syntax for working with lists. Quite interestingly, they also generalize syntax for zipping, grouping and ordering of lists. This article shows how to use some of the new expressive power when working with well-known monads. You'll learn what "parallel composition" means for parsers, a poor man's concurrency monad and an evaluation order monad.
1 Introduction
This article is inspired by my earlier work on joinads [1], an extension that
adds pattern matching on abstract values to the computation expression syntax in F#.
Computation expressions are quite similar to the do notation in Haskell.
After implementing the F# version of
joinads, I wanted to see how the ideas would look in Haskell. I was quite surprised to find out that
a recent extension for GHC adds some of the expressive power of joinads to Haskell.
To add some background: the F# computation expression syntax can be used to work with
monads, but also with monoids and a few other abstract notions of computation.
It also adds several constructs that generalize imperative features of F#, including while and for loops as well as exception handling. The joinads extension adds support for pattern-matching on "monadic values". For example, you can define a parallel programming monad and use joinads to wait until two parallel computations both complete, or wait until the first of the two completes returning a value matching a particular pattern.
How are F# joinads related to Haskell? A recent GHC patch implemented by Nils Schweinsberg [2, 3] brings back support for monad comprehensions to Haskell. The change is now a part of the main branch and will be available in GHC starting with the 7.2 release. The patch doesn't just re-implement original monad comprehensions, but also generalizes recent additions to list comprehensions, allowing parallel monad comprehensions and monadic versions of operations like ordering and grouping [4].
The operation that generalizes parallel comprehensions is closely related to a merge operation that I designed for F# joinads. In the rest of this article, I demonstrate some of the interesting programs that can be written using this operation and the elegant syntax provided by the re-designed monad comprehensions.
1.1 Quick review of list comprehensions
List comprehensions are a very powerful mechanism for working with lists in Haskell. I expect that you're already familiar with them, but let me start with a few examples. I will use the examples later to demonstrate how the generalized monad comprehension syntax works in a few interesting cases.
If we have a list animals containing "cat" and "dog" and a list sounds containing animal sounds "meow" and "woof", we can write the following snippets:
> [ a ++ " " ++ s | a <- animals, s <- sounds ]
["cat meow","cat woof","dog meow","dog woof"]
> [ a ++ " " ++ s | a <- animals, s <- sounds, a !! 1 == s !! 1 ]
["dog woof"]
> [ a ++ " " ++ s | a <- animals | s <- sounds ]
["cat meow","dog woof"]
The first example uses just the basic list comprehension syntax. It uses two generators to implement a Cartesian product of the two collections. The second example adds a guard to specify that we want only pairs of strings whose second character is the same. The guard serves as an additional filter for the results.
The last example uses parallel list comprehensions. The syntax is available after enabling the ParallelListComp language extension. It allows us to take elements from multiple lists, so that the nth element of the first list is matched with the nth element of the second list. The same functionality can be easily expressed using the zip function.
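For instance, the parallel comprehension above can be rewritten with an explicit zip over the same animals and sounds lists:
> [ a ++ " " ++ s | (a, s) <- zip animals sounds ]
["cat meow","dog woof"]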
1.2 Generalizing to monad comprehensions
The three examples we've seen in the previous section are straightforward when working with lists.
After installing the latest development snapshot of GHC and turning on the MonadComprehensions language extension, we can use the same syntax for working with further notions of computation. If we instantiate the appropriate type classes, we can even use guards, parallel comprehensions and operations like ordering or grouping.
Here are some of the type classes and functions that are used by the desugaring:
class Monad m where
(>>=) :: m a -> (a -> m b) -> m b
return :: a -> m a
class (Monad m) => MonadPlus m where
mzero :: m a
mplus :: m a -> m a -> m a
class (Monad m) => MonadZip m where
mzip :: m a -> m b -> m (a, b)
guard :: MonadPlus m => Bool -> m ()
guard b = if b then return () else mzero
Aside from Monad, the desugaring also uses the MonadPlus and MonadZip type classes. The former is used only for the guard function, which is also defined above. The latter is a new class which has been added as a generalization of parallel list comprehensions. The name of the function makes it clear that the type class is a generalization of the zip function. The patch also defines a MonadGroup type class that generalizes grouping operations inside list comprehensions, but I will not discuss that feature in this article.
You can find the general desugaring rules in the patch description [2]. In this article, we'll just go through the examples from the previous section and examine what the translation looks like. The following declaration shows how to implement the Monad, MonadPlus, and MonadZip type classes for lists:
instance Monad [] where
source >>= f = concat $ map f source
return a = [a]
instance MonadPlus [] where
mzero = []
mplus = (++)
instance MonadZip [] where
mzip = zip
The >>= operation, called bind, applies the provided function to each element of the input list and then concatenates the generated lists. The return function creates a singleton list containing the specified value. The mzero value from the MonadPlus type class is an empty list, which means that guard returns [()] when the argument is True and the empty list otherwise. Finally, the mzip function for lists is just zip.
Now we have everything we need to look at the desugaring of monad comprehensions. The first example from the previous section used multiple generators and can be translated purely in terms of Monad:
animals >>= (\a -> sounds >>= (\s ->
return $ a ++ " " ++ s))
Every generator is translated to a binding using >>=. The operations are nested, and the innermost operation always returns the result of the output function. The next snippet shows what happens when we add a predicate to filter the results:
animals >>= (\a -> sounds >>= (\s ->
guard (a !! 1 == s !! 1) >>= (\_ ->
return $ a ++ " " ++ s) ))
A predicate is translated into a call to the guard function in the innermost part of the desugared expression. When the function returns the mzero value (an empty list), the result of the binding will also be mzero, so the element for which the predicate doesn't hold will be filtered out. Finally, let's look at the translation of the last example:
(animals `mzip` sounds) >>= (\(a, s) ->
return $ a ++ " " ++ s)
When we use parallel comprehensions, the inputs of the generators are combined using the mzip function. The result is passed to the bind operation, which applies the output function to values of the combined computation. If we also specified filtering, the guard function would be added to the innermost expression, as in the previous example.
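To make this concrete, here is a sketch of how a parallel comprehension combined with the guard from the second example would desugar (my reading of the desugaring rules described above, not compiler output):

(animals `mzip` sounds) >>= (\(a, s) ->
  guard (a !! 1 == s !! 1) >>= (\_ ->
    return $ a ++ " " ++ s))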
As you can see, the translation of monad comprehensions is quite simple, but it adds expressivity to the syntax for working with monads. In particular, the do notation doesn't provide an equivalent syntactic extension for writing parallel comprehensions. (Constructs like generalized ordering, using functions of type m a -> m a, and generalized grouping, using functions of type m a -> m (m a), add even more expressivity, but that's a topic for another article.)
In the next three sections, I show how we could implement the mzip operation for several interesting monads, representing parsers, resumable computations, and parallel computations. At the end of the article, I also briefly consider the laws that should hold for the mzip operation.
2 Composing parsers in parallel
What does a parallel composition of two parsers mean? Probably the best thing we can do is to run both parsers on the input string and return a tuple with the two results. That sounds quite simple, but what is this construct good for? Let's first implement it and then look at some examples.
2.1 Introducing parsers
A parser is a function that takes an input string and returns a list of possible results. It may be empty (if the parser fails) or contain several items (if there are multiple ways to parse the input). The implementation I use in this article mostly follows the one by Hutton and Meijer [5].
newtype Parser a
= Parser (String -> [(a, Int, String)])
The result of parsing is a tuple containing a value of type a produced by the parser, the number of characters consumed by the parser (the Int value), and the remaining unparsed part of the string. The consumed-character count is not usually included in the definition, but we'll need it in the implementation of mzip.
Now that we have a definition of parsers, we can create our first primitive parser and a function that runs a parser on an input string and returns the results:
item :: Parser Char
item = Parser (\input -> case input of
"" -> []
c:cs -> [(c, 1, cs)])
run :: Parser a -> String -> [a]
run (Parser p) input =
[ result | (result, _, tail) <- p input, tail == [] ]
The item parser returns the first character of the input string. When it succeeds, it consumes a single character, so it returns 1 as the second element of the tuple. The run function applies the underlying function of the parser to a specified input. As specified by the condition tail == [], the function returns the results of those parsers which parsed the entire input.
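A quick GHCi session illustrates this (hypothetical output, assuming the definitions above); note that the second call fails because item leaves "b" unconsumed:
> run item "a"
"a"
> run item "ab"
[]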
The next step is to make the parser monadic.
2.2 Implementing the parser monad
Parsers are well known examples of monads and of monoids. This means that we can implement both the Monad and the MonadPlus type classes for our Parser type. The implementation looks as follows:
instance Monad Parser where
return a = Parser (\input -> [(a, 0, input)])
(Parser p1) >>= f = Parser (\input ->
[ (result, n1 + n2, tail)
| (a, n1, input') <- p1 input
, let (Parser p2) = f a
, (result, n2, tail) <- p2 input' ])
instance MonadPlus Parser where
mzero = Parser (\_ -> [])
mplus (Parser p1) (Parser p2) = Parser (\input ->
p1 input ++ p2 input)
The return operation returns a single result containing the specified value that doesn't consume any input. The >>= operation can be implemented using ordinary list comprehensions. It runs the parsers in sequence, returns the result of the second parser and consumes the sum of the characters consumed by the first and the second parser. The mzero operation creates a parser that always fails, and mplus represents a nondeterministic choice between two parsers.
The two type class instances allow us to use some of the monad comprehension syntax. We can now use the item primitive to write a few simple parsers:
sat :: (Char -> Bool) -> Parser Char
sat pred = [ ch | ch <- item, pred ch ]
char, notChar :: Char -> Parser Char
char ch = sat (ch ==)
notChar ch = sat (ch /=)
some p = [ a:as | a <- p, as <- many p ]
many p = some p `mplus` return []
The sat function creates a parser that parses a character matching the specified predicate. The generator syntax ch <- item corresponds to monadic binding and is desugared into an application of the >>= operation. Because the Parser type is an instance of MonadPlus, we can use the predicate pred ch as a guard. The desugared version of the function is:
sat pred = item >>= (\ch ->
guard (pred ch) >>= (\_ -> return ch))
The some and many combinators are mutually recursive. The first creates a parser that parses one or more occurrences of p. We encode it using a monad comprehension with two bindings. The parser parses p followed by many p. Another way to write the some parser would be to use combinators for working with applicative functors. This would allow us to write just (:) <$> p <*> many p. However, using combinators becomes more difficult when we need to specify a guard as in the sat parser. Monad comprehensions provide a uniform and succinct alternative.
The order of monadic bindings usually matters. The monad comprehension syntax makes this fact perhaps slightly less obvious than the do notation. To demonstrate this, let's look at a parser that parses the body of an expression enclosed in brackets:
brackets :: Char -> Char -> Parser a -> Parser a
brackets op cl body =
[ inner
| _ <- char op
, inner <- brackets op cl body `mplus` body
, _ <- char cl ]
skipBrackets = brackets '(' ')' (many item)
The brackets combinator takes characters representing opening and closing brackets and a parser for parsing the body inside the brackets. It uses a monad comprehension with three binding expressions that parse an opening brace, the body or more brackets, and then the closing brace.
If you run the parser using run skipBrackets "((42))" you get a list containing "42", but also "(42)". This is because the many item parser can also consume brackets. To correct that, we need to write a parser that accepts any character except the opening and closing brace. As we will see shortly, this can be elegantly solved using parallel comprehensions.
2.3 Parallel composition of parsers
To support parallel monad comprehensions, we need to implement MonadZip. As a reminder, the type class defines an operation mzip with the following type:
mzip :: m a -> m b -> m (a, b)
By looking just at the type signature, you can see that the operation can be implemented in terms of >>= and return like this:
mzip ma mb = ma >>= \a -> mb >>= \b -> return (a, b)
This is a reasonable definition for some monads, such as the Reader monad, but not for all of them. For example, mzip for lists should be zip, but the definition above would behave as a Cartesian product! A more interesting definition for parsers, which cannot be expressed using other monad primitives, is parallel composition:
instance MonadZip Parser where
mzip (Parser p1) (Parser p2) = Parser (\input ->
[ ((a, b), n1, tail1)
| (a, n1, tail1) <- p1 input
, (b, n2, tail2) <- p2 input
, n1 == n2 ])
The parser created by mzip independently parses the input string using both of the parsers. It uses list comprehensions to find all combinations of results such that the number of characters consumed by the two parsers was the same. For each matching combination, the parser returns a tuple with the two parsing results. Requiring that the two parsers consume the same number of characters is not an arbitrary decision. It means that the remaining unconsumed strings tail1 and tail2 are the same, and so we can return either of them. Using a counter is more efficient than comparing strings and it also enables working with infinite strings.
Let's get back to the example with parsing brackets. The following snippet uses parallel monad comprehensions to create a version that consumes all brackets:
skipAllBrackets = brackets '(' ')' body
where body = many [ c | c <- notChar '(' | _ <- notChar ')' ]
The parser body takes zero or more of any characters that are not opening or closing brackets. The parallel comprehension runs two notChar parsers on the same input. They both read a single character and they succeed if the character is not '(' and ')' respectively. The resulting parser succeeds only if both of them succeed. Both parsers return the same character, so we return the first one as the result and ignore the second.
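With this definition, the parser no longer accepts bodies containing brackets (a hypothetical GHCi session, assuming the definitions above):
> run skipAllBrackets "((42))"
["42"]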
Another example where this syntax is useful is validation of inputs. For example, a valid Cambridge phone number consists of 10 symbols, contains only digits, and starts with 1223. The new syntax allows us to directly encode these three rules:
cambridgePhone =
[ n | n <- many (sat isDigit)
| _ <- replicateM 10 item
| _ <- startsWith (string "1223") ]
The encoding is quite straightforward. We need some additional combinators, such as replicateM, which repeats a parser a specified number of times, and startsWith, which runs a parser and then consumes any number of characters.
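The replicateM combinator comes from Control.Monad; string and startsWith aren't defined in this article, so here is one possible sketch of them:

-- Parse a specific string by parsing each of its characters in turn
string :: String -> Parser String
string ""     = return ""
string (c:cs) = [ c:cs' | _ <- char c, cs' <- string cs ]

-- Run a parser and then consume any number of remaining characters
startsWith :: Parser a -> Parser a
startsWith p = [ r | r <- p, _ <- many item ]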
We could construct a single parser that recognizes valid Cambridge phone numbers without using mzip. The point of this example is that we can quite nicely combine several independent rules, which makes the validation code easy to understand and extend.
2.4 Parallel composition of context-free parsers
Monadic parser combinators are very expressive. In fact, they are often too expressive, which makes it difficult to implement the combinators efficiently. This was a motivation for the development of non-monadic parsers, such as the one by Swierstra [6, 7], which are less expressive but more efficient. Applicative functors, developed by McBride and Paterson [8], are a weaker abstraction that can be used for writing parsers. The next snippet shows the Haskell type class Applicative that represents applicative functors.
class (Functor f) => Applicative f where
pure :: a -> f a
(<*>) :: f (a -> b) -> f a -> f b
If you're familiar with applicative functors, you may know that there is an alternative definition of Applicative that uses an operation with the same type signature as mzip. We could use mzip to define an Applicative instance, but this would give us a very different parser definition! In some sense, the following example combines two different applicative functors, but I'll write more about that in the next section.
The usual applicative parsers allow us to write parsers where the choice of the next parser doesn't depend on the value parsed so far. In terms of formal language theory, they can express only context-free languages. This is still sufficient for many practical purposes. For example, our earlier brackets parser can be written using the applicative combinators:
brackets op cl body =
pure (\_ inner _ -> inner)
<*> char op
<*> brackets op cl body `mplus` body
<*> char cl
The example first creates a parser that always succeeds and returns a function using the pure combinator. Then it applies this function (contained in a parser) to three arguments (produced by the three parsers). The details are not important, but the example shows that comprehensions with independent generators can be expressed just using the Applicative interface.
The interesting question is, what operation does mzip represent for context-free grammars? The language we obtain if parsers for two other languages both succeed is the intersection of the two languages. An intersection of two context-free languages is not necessarily context-free, which can be demonstrated using the following example:
A = { a^m b^m c^n | m, n ≥ 0 }
B = { a^n b^m c^m | m, n ≥ 0 }
A ∩ B = { a^m b^m c^m | m ≥ 0 }
The language A accepts words that start with some number of 'a' symbols followed by the same number of 'b' symbols and then an arbitrary number of 'c' symbols. The language B is similar, but it starts with a group of any length followed by two groups of the same length. Both are context-free. In fact, our parser brackets can be used to parse the two character groups with the same length.
The intersection A ∩ B is not context-free [9], but we can easily encode it using parallel composition of parsers. We don't need the full power of monads to parse the first group and calculate its length. It could be implemented just in terms of Applicative and mzip, but we use the nicer monad comprehension syntax:
[ True | _ <- many $ char 'a', _ <- brackets 'b' 'c' unit
| _ <- brackets 'a' 'b' unit, _ <- many $ char 'c' ]
where unit = return ()
The example uses both parallel and sequential binding, but the sequential composition doesn't involve dependencies on previously parsed results. It uses brackets to parse two groups of the same length, followed (or preceded) by many to consume the remaining group of arbitrary length.
2.5 Applicative and parallel parsers
Before moving to the next example, let's take a look at how mzip relates to applicative functors. As already mentioned, an alternative definition of applicative functors ([8], section 7) uses an operation with exactly the same type signature as mzip. You can find the alternative definition in the following snippet:
class Functor f => Monoidal f where
unit :: f ()
(⋆) :: f a -> f b -> f (a, b)
pure :: Monoidal f => a -> f a
pure a = fmap (const a) unit
(<*>) :: Monoidal f => f (a -> b) -> f a -> f b
f <*> a = fmap (uncurry ($)) (f ⋆ a)
Applicative functors are more general than monads, which means that every monad is also an applicative functor. When introducing the mzip operation, we attempted to define it in terms of return and >>=. That code was actually a definition of ⋆. However, as already explained, we cannot use that definition (or the ⋆ operation) for mzip, because that definition isn't always intuitively right. Referring to intuition is always tricky, so I'll express my expectations more formally in terms of laws at the end of the article.
However, if we always just used ⋆ as the mzip operation, we wouldn't get any additional expressive power, so there would be no point in using parallel monad comprehensions. The expression [ e | a <- e1 | b <- e2 ] would mean exactly the same thing as [ e | a <- e1, b <- e2 ].
For example, the definition of ⋆ for the usual List monad gives us the Cartesian product of lists. This isn't very useful, because we can get that behavior using multiple generators. Instead, parallel list comprehensions use zipping of lists, which comes from a different applicative functor, namely ZipList.
The example with parsers is similar. The implementation of ⋆ would give us sequential composition of parsers. This wouldn't be very useful, so I defined mzip as the intersection of parsers. This is an interesting operation that adds expressive power to the language. The mzip operation also defines an instance of applicative functor for parsers, but a different one.
For the fans of category theory, the ⋆ operation is a natural transformation of a monoidal functor that can be defined by the monad we're working with. The mzip operation can also be viewed as a natural transformation of some monoidal functor, but it may be a different one. One of the additional laws that we can require about mzip is commutativity (more about that later), which means that mzip should be defined by a symmetric monoidal functor.
3 Parallelizing cooperative computations
As the name parallel monad comprehensions suggests, we can use the syntax for running computations in parallel. Unlike comprehensions with multiple generators, parallel comprehensions cannot have any dependencies between the composed bindings. This means that the computations can be evaluated independently.
In this section, I demonstrate the idea using a poor man's concurrency monad inspired by Claessen [10]. The monad can be used to implement a lightweight cooperative concurrency. When running two computations in parallel, we can allow interleaving of atomic actions from the two threads.
3.1 Modelling resumable computations
The example I demonstrate here models computations using resumptions. This concept is slightly simpler than the original poor man's concurrency monad (which is based on continuations). A resumption is a computation that has either finished and produced some value or has not finished, in which case it can run one atomic step and produce a new resumption:
data Monad m => Resumption m r
= Step (m (Resumption m r))
| Done r
The type is parametrized over a monad and a return type. When evaluating a resumption, we repeatedly run the computation step-by-step. While evaluating, we perform the effects allowed by the monad m until we eventually get a result of type r. If you're interested in more details about the Resumption type, you can find similar definitions in Harrison's cheap threads [11] and Papaspyrou's resumption transformer [12].
Now that we have a type, we can write a function to create a Resumption that runs a single atomic action and then completes, and a function that runs a Resumption:
run :: Monad m => Resumption m r -> m r
run (Done r) = return r
run (Step m) = m >>= run
action :: Monad m => m r -> Resumption m r
action a = Step [ Done r | r <- a ]
The run function takes a resumption that may perform effects specified by the monad m and runs the resumption inside the monad until it reaches Done. The function runs the whole computation sequentially (and it cannot be implemented differently). The cooperative concurrency can be added later by creating a combinator that interleaves the steps of two Resumption computations.
The action function is quite simple. It returns a Step that runs the specified action inside the monad m and then wraps the result using the Done constructor. I implemented the function using monad comprehensions to demonstrate the notation again, but it could be equally well written using combinators or the do notation.
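For comparison, here is one possible combinator-based alternative (a sketch using liftM from Control.Monad; the name action' is mine):

import Control.Monad (liftM)

action' :: Monad m => m r -> Resumption m r
action' a = Step (liftM Done a)   -- run the action, then wrap its result in Done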
3.2 Implementing the resumption monad
You can see the implementation of the Monad instance for Resumption m below. The return operation creates a new resumption in the "done" state which contains the specified value. The >>= operation constructs a resumption that gets the result of the first resumption and then calls the function f. When the left parameter is Done, we apply the function to the result and wrap the application inside return (because the function is pure) and Step. When the left parameter is Step, we create a resumption that runs the step and then uses >>= recursively to continue running steps from the left argument until it finishes.
instance Monad m => Monad (Resumption m) where
return a = Done a
(Done r) >>= f = Step $ return (f r)
(Step s) >>= f = Step $ do
next <- s
return $ next >>= f
instance MonadTrans Resumption where
lift = action
The listing also defines an instance of MonadTrans to make Resumption a monad transformer. The lift function takes a computation in the monad m and turns it into a computation in the Resumption monad. This is exactly what our function for wrapping atomic actions does.
Equipped with the two type class instances and the run function, we can write some interesting computations. The following function creates a computation that runs for the specified number of steps, prints some string in each step and then returns a specified value:
printLoop :: String -> Int -> a -> Resumption IO a
printLoop str count result = do
lift $ putStrLn str
if count == 1 then return result
else printLoop str (count - 1) result
cats = run $ printLoop "meow" 3 "cat"
The function is written using the do notation. It first prints the specified string, using lift to turn the IO () action into a single-step Resumption IO (). When the counter reaches one, it returns the result; otherwise it continues looping.
The snippet defines a simple computation cats that prints "meow" three times and then returns a string "cat". The computations created by printLoop are not fully opaque. If we have two computations like cats, we can treat them as sequences of steps and interleave them. This is what the mzip operation does.
3.3 Parallel composition of resumptions
If we have two resumptions, we can compose them to run in sequence either using the do notation or using a monad comprehension with two generators. To run them in parallel, we need to implement interleaving of the steps as shown in the next snippet:
instance Monad m => MonadZip (Resumption m) where
mzip (Done a) (Done b) = Done (a, b)
mzip sa sb = Step [ mzip a b | a <- step sa, b <- step sb ]
where step (Done r) = return $ Done r
step (Step sa) = sa
The result of mzip is a resumption that consists of multiple steps. In each step, it performs one step of both of the resumptions given as arguments. When it reaches a state where both of the resumptions complete and produce results, it returns a tuple containing the results using Done. A step is performed using an effectful step function. To keep the implementation simple, we keep applying step to both of the resumptions and then recursively combine the results. Applying step to a resumption that has already completed isn't a mistake. This operation doesn't do anything and just returns the original resumption (without performing any effects).
Once we define mzip, we can start using the parallel comprehension syntax for working with resumptions. The next snippet demonstrates two ways of composing resumptions. In both examples, we compose a computation that prints "meow" two times and then returns "cat" with a computation that prints "woof" three times and then returns "dog":
animalsSeq =
[ c ++ " and " ++ d
| c <- printLoop "meow" 2 "cat"
, d <- printLoop "woof" 3 "dog" ]
animalsPar =
[ c ++ " and " ++ d
| c <- printLoop "meow" 2 "cat"
| d <- printLoop "woof" 3 "dog" ]
The only difference between the two examples is that the first one composes the operations using multiple generators (separated by comma) and the second one uses parallel comprehensions (separated by bar).
When you run the first example, the program prints meow, meow, woof, woof, woof and then returns a string "cat and dog". The second program interleaves the steps of the two computations and prints meow, woof, meow, woof, woof and then returns the same string.
4 Composing computations in parallel
In the previous section, we used the parallel comprehension syntax to create computations that model parallelism using resumptions. Resumptions can be viewed as lightweight cooperative threads. They are a useful abstraction, but the simple implementation in the previous section does not give us any speed-up on a multi-core CPU. This section follows a similar approach, but we look at how to implement actual parallelism based on evaluation strategies.
Marlow et al. [13] introduced an Eval monad for explicitly specifying evaluation order. I will start by briefly introducing the monad, so don't worry if you're not familiar with it already. I'll then demonstrate how to define a MonadZip instance for this monad. This way, we can use the parallel comprehension syntax for actually running computations in parallel.
4.1 Introducing the evaluation-order monad
The evaluation-order monad is represented by a type Eval a. When writing code inside the monad, we can use several functions of type a -> Eval a that are called strategies. These functions take a value of type a, which may be unevaluated, and wrap it inside the monad. A strategy can specify how to evaluate the (unevaluated) value. The two most common strategies are rpar and rseq (both are functions of type a -> Eval a). The rpar strategy starts evaluating the value in the background and rseq evaluates the value eagerly before returning.
A typical pattern is to use the do notation to spawn one computation in parallel and then run another computation sequentially. This way we can easily parallelize two function calls:
fib38 = runEval $ do
a <- rpar $ fib 36
b <- rseq $ fib 37
return $ a + b
The example shows how to calculate the 38th Fibonacci number. It starts calculating fib 36 in parallel with the rest of the computation and then calculates fib 37 sequentially. The do block creates a value of type Eval Integer. We then pass this value to runEval, which returns the wrapped value. Because we used the rpar and rseq combinators when constructing the computation, the returned value will be already evaluated.
However, it is worth noting that the return operation of the monad doesn't specify any evaluation order. The function runEval . return is just an identity function that doesn't force evaluation of the argument. The evaluation order is specified by additional combinators such as rpar.
The Eval monad is implemented in the parallel package [14] and very well explained in the paper by Marlow et al. [13]. You can find the definition of Eval and its monad instance in the snippet below. We don't need to know how rpar and rseq work, so we omit them from the listing.
data Eval a = Done a
runEval :: Eval a -> a
runEval (Done x) = x
instance Monad Eval where
return x = Done x
Done x >>= k = k x
The Eval a type is simple. It just wraps a value of type a. The runEval function unwraps the value and the Monad instance implements composition of computations in the usual way. The power of the monad comes from the evaluation annotations we can add. We transform values of type a into values of type Eval a; while doing so, we can specify the evaluation strategy. The strategy is usually given using combinators, but if we add an instance of the MonadZip class, we can also specify the evaluation order using the monad comprehension syntax.
4.2 Specifying parallel evaluation order
The mzip operator for the evaluation order monad encapsulates a common pattern that runs two computations in parallel. We've seen an example in the previous section - the first computation is started using rpar and the second one is evaluated eagerly in parallel using rseq. You can find the implementation in the next snippet.
instance MonadZip Eval where
mzip ea eb = do
a <- rpar $ runEval ea
b <- rseq $ runEval eb
return (a, b)
A tricky aspect of Eval is that it may represent computations with explicitly specified evaluation order (created, for example, using rpar). We can also create computations without specifying evaluation order using return. The fact that return doesn't evaluate the values makes it possible to implement the mzip function, because the required type signature is Eval a -> Eval b -> Eval (a, b).

The two arguments already have to be values of type Eval, but we want to specify the evaluation order using mzip after creating them. If all Eval values were already evaluated, then the mzip operation couldn't have any effect. However, if we create values using return, we can then apply mzip and specify how they should be evaluated later. This means that mzip only works for Eval computations created using return.
Once we understand this, implementing mzip is quite simple. It extracts the underlying (unevaluated) values using runEval, specifies the evaluation order using rpar and rseq, and then returns a result of type Eval (a, b), which now carries the evaluation order specification.
Let's look at how we can write a sequential and a parallel version of a snippet that calculates the 38th Fibonacci number:
fibTask n = return $ fib n
fib38seq = runEval [ a + b | a <- fibTask 36
, b <- fibTask 37 ]
fib38par = runEval [ a + b | a <- fibTask 36
| b <- fibTask 37 ]
The snippet first declares a helper function fibTask that creates a delayed value using the sequential fib function and wraps it inside the Eval monad without specifying an evaluation strategy. Then we can use the function as a source for generators in the monad comprehension syntax. The first example runs the entire computation sequentially - aside from some wrapping and unwrapping, there are no evaluation order specifications. The second example runs the two sub-computations in parallel. The evaluation order annotations are added by the mzip function from the desugared parallel comprehension syntax.
To run the program using multiple threads, you need to compile it using GHC with the -threaded option. Then you can run the resulting application with the command line arguments +RTS -N2 -RTS, which specify that the runtime should use two threads. I measured the performance on a dual-core Intel Core 2 Duo CPU (2.26GHz). The first version ran in approximately 13 seconds, while the second version completed in 9 seconds.
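Putting this together, a minimal driver (a hypothetical main, using the fib38par value defined above) could look like this:

-- Compile with: ghc -threaded Fib.hs
-- Run with:     ./Fib +RTS -N2 -RTS
main :: IO ()
main = print fib38par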
4.3 Writing parallel algorithms
The roughly 1.4x speedup is less than the maximal 2x speedup, because the example parallelizes two calculations that do not take equally long. To generate better potential for parallelism, we can implement a recursive pfib function that splits the computation into two parallel branches recursively until it reaches some threshold:
pfib :: Integer -> Eval Integer
pfib n | n <= 35 = return $ fib n
pfib n = [ a + b | a <- pfib $ n - 1
| b <- pfib $ n - 2 ]
I hope you'll agree that the declaration looks quite neat. A nice consequence of using parallel comprehensions is that we can see which parts of the computation will run in parallel without any syntactic noise. We just replace a comma with a bar to get a parallel version! The compiler also prevents us from trying to parallelize code that cannot run in parallel because of data dependencies. For example, let's look at the Ackermann function:
ack :: Integer -> Integer -> Eval Integer
ack 0 n = return $ n + 1
ack m 0 = ack (m - 1) 1
ack m n = [ a | na <- ack m (n - 1)
, a <- ack (m - 1) na ]
The Ackermann function is a well-known function from computability theory. It is interesting because it grows very fast (as a result, it cannot be expressed using primitive recursion). For example, the value of ack 4 2 is 2^65536 - 3.
We're probably not going to be able to finish the calculation, no matter how many cores our CPU has. However, we can still try to parallelize the function by replacing the two sequential generators with a parallel comprehension:
ack m n = [ a | na <- ack m (n - 1)
| a <- ack (m - 1) na ]
If you try compiling this snippet, you get an error message saying Not in scope: na. We can easily see what went wrong if we look at the desugared version:
((ack m (n - 1)) `mzip`
(ack (m - 1) na)) >>= (\(na, a) -> return a)
The problem is that the second argument to mzip attempts to access the value na, which is defined later. The value is the result of the first expression, so we can access it only after both of the two parallelized operations complete.
In other words, there is a data dependency between the computations that we're trying to parallelize. If we were not using parallel monad comprehensions, we could mistakenly think that we can parallelize the function and write the following:
ack m n = runEval $ do
na <- rpar $ ack m (n - 1)
a <- rseq $ ack (m - 1) na
return a
This would compile, but it wouldn't run in parallel! The value na needs to be evaluated before the second call, so the second call to ack will block until the first one completes. This demonstrates a nice aspect of writing parallel computations using the comprehension syntax. Not only is the syntax elegant, but the desugaring also performs a simple sanity check on our code.
5 Parallel comprehension laws
I have intentionally postponed the discussion about laws to the end of the article. So far, we have looked at three different implementations of mzip. I discussed some of the expectations informally to aid intuition. The type of mzip partially specifies how the operation should behave, but not all well-typed implementations are intuitively right.
In my understanding, the laws about mzip are still subject to discussion, although some were already proposed in the discussion about the GHC patch [2]. I hope to contribute to the discussion in this section. We first look at the laws that can be motivated by the category theory behind the operation, and then discuss additional laws inspired by the work on F# joinads.
5.1 Basic laws of parallel bindings
As already briefly mentioned, the mzip operation can be viewed as a natural transformation defined by some monoidal functor [15]. In practice, this means that the mzip operation should obey two laws. The first one is usually called naturality and it specifies the behavior of mzip with respect to the map function of a monad (the function can be implemented in terms of bind and return and corresponds to liftM from the Haskell base library). The second law is associativity, and we can express it using a helper function assoc ((a,b),c) = (a,(b,c)):
map (f × g) (mzip a b) ≡ mzip (map f a) (map g b) (naturality)
mzip a (mzip b c) ≡ map assoc (mzip (mzip a b) c) (associativity)
The naturality law specifies that we can change the order of applying mzip and map. The equation has already been identified as a law in the discussion about the patch [2]. The law is also required by applicative functors [8] - this is not surprising, as applicative functors are also monoidal.
The associativity law is also very desirable. When we write a comprehension such as [e | a <- m1 | b <- m2 | c <- m3], the desugaring first needs to zip two of the three inputs and then zip the third with the result, because mzip is a binary operation. The order of zipping feels like an implementation detail, but if we don't require associativity, it may affect the meaning of our code.
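For the list instance, both laws are easy to test. The following QuickCheck properties are a sketch of how one might check them (recall that mzip for lists is just zip, so the properties use zip directly; the concrete functions are arbitrary choices of mine):

import Test.QuickCheck

-- naturality: map (f × g) (mzip a b) = mzip (map f a) (map g b)
prop_naturality :: [Int] -> [Int] -> Bool
prop_naturality xs ys =
  map (\(a, b) -> (a + 1, b * 2)) (zip xs ys)
    == zip (map (+ 1) xs) (map (* 2) ys)

-- associativity: mzip a (mzip b c) = map assoc (mzip (mzip a b) c)
prop_assoc :: [Int] -> [Int] -> [Int] -> Bool
prop_assoc xs ys zs =
  zip xs (zip ys zs)
    == map (\((a, b), c) -> (a, (b, c))) (zip (zip xs ys) zs)

-- try: quickCheck prop_naturality >> quickCheck prop_assoc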
5.2 Parallel binding as monoidal functor
A monoidal functor in category theory defines a natural transformation (corresponding to our mzip), but also a special value called unit. A Haskell representation of a monoidal functor is the Monoidal type class. For a monoidal functor f, the type of unit is f (). Every applicative functor in Haskell has a unit. For example, the unit value for ZipList is an infinite list of () values. We do not necessarily need to add unit to the definition of MonadZip, because it is not needed by the desugaring. However, it is interesting to explore how a special munit value would behave.
The laws for monoidal functors specify that if we combine munit with any other value using mzip, we can recover the original value using map snd. In the language of monad comprehensions, the law says that the expression [ e | a <- m ] should be equivalent to [ e | a <- m | () <- munit ].
It is important to realize that the computation created using monadic return isn't the same thing as the unit of the monoidal functor associated with mzip. For lists, return () creates a singleton list. The return operation is the unit of a monoidal functor defined by the monad, but the unit associated with mzip belongs to a different monoidal functor!
5.3 Symmetry of parallel binding
Another sensible requirement for mzip (which exists in F# joinads) is that reordering the arguments only changes the order of elements in the resulting tuple. In theory, this means that the monoidal functor defining mzip is symmetric. We can specify the law using a helper function swap (a,b) = (b,a):

mzip a b ≡ map swap (mzip b a) (symmetry)
The symmetry law specifies that [(x, y) | x <- a | y <- b] should mean the same thing as [(x, y) | y <- b | x <- a]. This may look like a very strong requirement, but it fits quite well with the usual intuition about the zip operation and parallel monad comprehensions. The symmetry law holds for lists, parsers and the evaluation order monad. For the poor man's concurrency monad, it holds if we treat effects that occur within a single step of the evaluation as unordered (which may be a reasonable interpretation). For some monads, such as the State monad, it is not possible to define a symmetric mzip operation.
The three laws that we've seen so far are motivated by the category theory laws for the (symmetric) monoidal functors that define our mzip operation. However, we also need to relate the functors in some way to the monads with which we are combining them.
5.4 Relations with additional operations
The discussion about the patch [2] suggests one more law that relates the mzip operation with the map operation of the monad, called information preservation:
map fst (mzip a b) ≡ a ≡ map snd (mzip b a) (information preservation)
The law specifies that combining a computation with some other computation using mzip and then recovering the original form of the value using map doesn't lose information. However, this law is a bit tricky. For example, it doesn't hold for lists if a and b are lists of different lengths, since the zip function restricts the length of the result to the length of the shorter list. Similarly, it doesn't hold for parsers (from the first section) that consume a different number of characters.
However, the law expresses an important requirement: when we combine certain computations, it should be possible to recover the original components. A law that is similar to information preservation holds for applicative functors, but with a slight twist:
map fst (mzip a munit) ≡ a ≡ map snd (mzip munit a) (applicative information preservation)
Instead of zipping two arbitrary monadic values, the law for applicative functors zips an arbitrary value with unit. In the case of lists, unit is an infinite list, so the law holds. Intuitively, this holds because unit has a maximal structure (in this case, the structure is the length of the list).
Ideally, we'd like to say that combining two values with the same structure creates a new value which also has the same structure. There is no way to refer to the "structure" of a monadic value directly, but we can create values with a given structure using map. This weaker law holds for both lists and parsers. Additionally, we can describe the case when one of the two computations is mzero and when both of them are created using return:
map fst (mzip a (map f a)) ≡ a ≡ map snd (mzip (map g a) a) (weak information preservation)
map fst (mzip a mzero) ≡ mzero ≡ map snd (mzip mzero a) (zero)
mzip (return a) (return b) ≡ return (a, b) (unit merge)
The weak information preservation law is quite similar to the original information preservation law. The only difference is that instead of zipping with an arbitrary monadic value b, we're zipping a with a value constructed using map (for any total functions f and g). This means that the actual value(s) that the monadic value contains (or produces) can be different, but the structure will be the same, because map is required to preserve the structure.
The zero law specifies that mzero is the zero element with respect to mzip. It almost seems that it holds of necessity, because mzero with type m a doesn't contain any value of type a, so there is no way we could construct a value of type (a, b). This second law complements the first, but it is not difficult to see that it contradicts the original information preservation.
Finally, the unit merge law describes how mzip works with respect to the monadic return operation. This is interesting, because we're relating the mzip of one applicative functor with the unit of another applicative functor (defined by the monad). The law isn't completely arbitrary; an equation with a similar structure is required for causal commutative arrows [16]. In terms of monad comprehension syntax, the law says that [ e | a <- return e1 | b <- return e2 ] is equivalent to [ e | (a, b) <- return (e1, e2) ].
I believe that the three laws I proposed in this section partly answer the question of how to relate the two structures combined by parallel monad comprehensions - the MonadPlus type class with the symmetric monoidal functor that defines mzip.
Conclusions
After some time, monad comprehensions are back in Haskell! The recent GHC patch makes them even more useful by generalizing additional features of list comprehensions, including parallel binding and support for operations like ordering and grouping. In this article, I focused on the first generalization, although the remaining two are equally interesting.
We looked at three examples: parallel composition of parsers (which applies parsers to the same input), parallel composition of resumptions (which interleaves the steps of computations) and parallel composition of an evaluation order monad (which runs two computations in parallel). Some of the examples are inspired by my previous work on joinads, which adds a similar language extension to F#. The F# version includes an operation very similar to mzip from the newly included MonadZip type class. I also proposed several laws - some of them inspired by category theory and some by my work on F# joinads - hoping that this article may contribute to the discussion about the laws required by MonadZip.
Acknowledgements
Thanks to Alan Mycroft for inspiring discussion about some of the monads demonstrated in this article and to Dominic Orchard for many useful comments on a draft of the article, as well as discussion about category theory and the mzip laws. I'm also grateful to Brent Yorgey for proofreading the article and suggesting numerous improvements.
Source code & Links
- The article is available as part of The Monad.Reader Issue 18 (PDF)
- The source code for the samples can be found on Github
References
- [1] Joinads: A retargetable control-flow construct for reactive, parallel and concurrent programming (PDF) - Tomas Petricek and Don Syme
- [2] Bring back monad comprehensions - Haskell Trac
- [3] Fun with monad comprehensions - Nils Schweinsberg
- [4] Comprehensive comprehensions (PDF) - Simon Peyton Jones and Philip Wadler
- [5] Monadic Parsing in Haskell (PDF) - Graham Hutton and Erik Meijer
- [6] Deterministic, error-correcting combinator parsers - S. Doaitse Swierstra and Luc Duponcheel
- [7] Combinator parsing: A short tutorial (PDF) - S. Doaitse Swierstra
- [8] Applicative programming with effects - Conor McBride and Ross Paterson
- [9] Pumping lemma for context-free languages - Wikipedia
- [10] A poor man's concurrency monad - Koen Claessen
- [11] Cheap (but functional) threads (PDF) - William L. Harrison
- [12] A resumption monad transformer and its applications in the semantics of concurrency - Nikolaos S. Papaspyrou
- [13] Seq no more: better strategies for parallel Haskell (PDF) - Simon Marlow, Patrick Maier, Hans-Wolfgang Loidl, Mustafa K. Aswad, and Phil Trinder
- [14] The parallel package - Haskell Hackage
- [15] Monoidal functor - Wikipedia
- [16] Causal commutative arrows and their optimization (PDF) - Hai Liu, Eric Cheng, and Paul Hudak
Published: Tuesday, 19 July 2011, 11:28 PM
Author: Tomas Petricek
Typos: Send me a pull request!
Tags: haskell, research, parallel