Processing trees with F# zipper computation
One of the less frequently advertised new features in F# 3.0 is the query syntax.
It is an extension that makes it possible to add custom operations in an F#
computation expression. The standard query { .. }
computation uses this to define
operations such as sorting (sortBy
and sortByDescending
) or operations for taking
and skipping elements (take
, takeWhile
, ...). For example, you can write:
1: 2: 3: |
|
In this article I'll use the same notation for processing trees using the zipper pattern. I'll show how to define a computation that allows you to traverse a tree and perform transformations on (parts) of the tree. For example, we'll be able to say "Go to the left sub-tree, multiply all values by 2. Then go back and to the right sub-tree and divide all values by 2" as follows:
1: 2: 3: 4: 5: 6: 7: |
|
This example behaves quite differently to the usual query
computation. It mostly
relies on custom operations like left
, right
and up
that allow us to navigate
through a tree (descend along the left or right sub-tree, go back to the parent node).
The only operation that does something is the map
operation which transforms the
current sub-tree.
This was just a brief introduction to what is possible, so let's take a detailed look at how this works...
Trees and zippers
First of all, we need to define a data type that represents the tree. This is going to be a standard binary tree with values in leafs (but you could do the similar thing for other kinds of trees or structures):
1: 2: 3: 4: |
|
Next, we need to look at the concept of a zipper. Intuitively, a zipper is something that allows you to navigate through the tree. When you go the left sub-tree, we want to get a value that contains this sub-tree together with the path that describes how we got there (and contains remaining branches that are needed to reconstruct the original tree when we go back up).
The paper that introduced Zippers is very readable and introduces the idea in more details. As a fun fact, you can also read about how zippers can be derived automatically using differentiation. In our case, the zipper data type for the tree is defined as follows:
1: 2: 3: 4: 5: 6: 7: 8: 9: |
|
The TreeZipper<'T>
type pairs a tree (the current sub-tree that we are focused at)
with a path which describes "how we got there". If we are at the root, then the path
is Top
. If we descend to the left-sub tree, then the path is represented using
Left(path, tree)
where path
is the previous path and tree
is the tree on the
right (that we just skipped over).
Working with zippers
Let's look at three simple functions that operate on the tree zipper and then we'll navigate through a simple tree using them:
1: 2: 3: 4: 5: 6: 7: 8: 9: 10: 11: 12: 13: 14: |
|
The left
and right
functions are symmetric. If the TreeZipper<'T>
represents a
position where the current tree is Leaf
then they fail, because we cannot descend to
a sub-tree of a leaf. If the tree is a Node
then they pick the left (or right) sub-tree
and make it a current tree by using it as the first argument of TZ
. The path is
constructed by appending the other tree (r
or l
, respectively) to the previous
path using Left
or Right
.
On the other hand, the current
operation only works when the zipper points at
a leaf node. In that case, it simply returns the value from the leaf.
The following example defines a simple tree and then uses zipper operations to
get to one of the leaves in the tree:
1: 2: 3: 4: 5: 6: 7: 8: 9: 10: 11: |
|
To make this example complete, we also need to add operation up
that goes to the
parent. We will also need an operation that (recursively) goes back to the top of the tree:
1: 2: 3: 4: 5: 6: 7: 8: 9: 10: |
|
The up
operation fails when we are already in the root and the path is just Top
.
Otherwise, it gets the current tree and takes the other branch from the path value.
These two trees are combined into a new Node
which is then returned as the current
tree.
Adding computational operations
In order to define an F# computation expression, we need to define the meaning of
for
and yield
. Most often, these operations are implemented by monadic bind
and unit. We will not follow this pattern exactly. If we did that, we wouldn't be
able to implement the motivating example as nicely.
The operations that we need to define are unit
, which
simply creates a tree containing just a leaf, and bindSub
which transforms
all leaves in the current sub-tree using a specified function:
1: 2: 3: 4: 5: 6: 7: 8: 9: 10: 11: |
|
The type of bindSub
is ('T -> TreeZipper<'T>) -> TreeZipper<'T> -> TreeZipper<'T>
,
which is almost like monadic bind with the exception that the transformation function
needs to transform elements of type 'T
into trees containing leaves of the same type.
This requiremenet follows from the fact that we run the transformation only on the
current tree (bindT current
) while the path
is left unchanged (and since path contains
other trees of the same type, the type needs to stay the same). You could implement
bind
that applies f
to trees in the path, but that would not be as interesting
(because all transformations like map
would happen on the whole tree - not just
the current sub-tree).
Defining the Computation builder
Providing monadic operations
Now we have everything we need to define an F# computation builder for working with
tree zippers. We start with a simple definition that allows just for
and yield
and then add all the (interesting) custom operators. Our yield
is defined as unit
and for
corresponds to bindSub
:
1: 2: 3: 4: 5: 6: 7: 8: |
|
Equipped with the above definition, we can write computations that transform the
entire tree. For example, to multiply all values of the tree by 2, we can write
the following computation (you can pass the result to printfn "%O"
to get a nicer
output)
1: 2: |
|
As discussed earlier, the bindSub
operation (and thus our for
) only allows the
result to have the same type as the input. Writing, for example, x.ToString()
, would
be invalid. In this case, it does not seem very logical, but that's because we have not
added operations for navigating through the tree - the for
and yield
can be
only used to transform the entire tree.
Adding custom operations
Now we're getting to the most interesting bit of the article. How do we add custom
operations to the computation expression? To do that, we need to annotate the method
with CustomOperation
attribute. The attribute defines the name of the operation
and can specify additional parameters.
The fact that makes custom operations interesting is that they can have types such
as TreeZipper<'T> -> TreeZipper<'T>
which is exactly the type of our navigational
operations. The syntax allows other operations - such as zipping, grouping and
joining - but we will not need these in this article.
To add navigational operations, transformations and getter for the current element, we can write the following code:
1: 2: 3: 4: 5: 6: 7: 8: 9: 10: 11: 12: 13: 14: 15: 16: 17: 18: |
|
The MaintainsVariableSpace
parameter allows us to specify that an operation transforms
values in the abstract data type without touching them. If you have an operation
M<'T> -> M<'T>
then it is generally true
, but if you have an operation such as
M<'T> -> M<int>
then the variable space would not be preserved (because variables
are stored in a tuple used in place of 'T
, so turning that to int
loses the
values in the tuple). In our case, navigation and transformation maintains the
variable space, but current
operation does not.
The map
operation uses another interesting attribute. If we annotate a function
argument with ProjectionParameter
then we can write map (x * 2)
and the argument
expression x * 2
is implicitly turned into a function that takes all the variables
and returns the result. Something like fun vars -> vars.x * 2
.
Sample tree computations
Let's finish the article with two examples that process the sample
tree defined
earlier. The first one corresponds to our earlier computation (written using
pipelines) that picks a specific leaf:
1: 2: 3: 4: 5: |
|
The computation expression syntax always has to start with for .. in .. do
, so we
write that even if we do not actually access the value assigned to x
. This is
simply translated to For
followed by Yield
(which does not change the tree).
The for
can then be followed by custom operations. In this example, we move
right, right, left and then get the current value from the tree. Note that adding
left
after current
gives a type error, because current
extracts a value
(which is not a tree zipper that can be transformed).
We can now also write the sample from the introduction of the article:
1: 2: 3: 4: 5: 6: 7: |
|
Here, we write for
followed by left
to navigate to a left sub-tree. Then
we perform a transformation using map
. Because this is implemented using the
bindSub
operation, the transformation only affects the current sub-tree, but
not the parents (the root node and its right sub-tree).
This example is more interesting, because it actually uses the variable x
in
the transformations (written using map
). The computation expression notation
actually gives us some benefits over a simple pipeline, because we do not need
to create an explicit lambda each time we write map
. Thanks to ProjectionParameter
attribute, this is done automatically.
Summary
The main goal of this article is to demonstrate how you can define simple computation expressions with custom operations which is a new (and pretty powerful) feature in F# 3.0. In the example discussed in this article, I used the tree zipper.
The custom operations make it possible to navigate through the tree using
a conventient syntax and perform transformations on current sub-tree just by
writing commands like left
and right
. When writing the transformations using
map (x / 2)
we also do not need to write explicit lambda functions.
This article is really just an introduction - there are many topics that I did not talk about, because the custom constructs can also implement operations that resemble grouping or joining. But I'll leave that for another article!
Calls Linq.QueryBuilder.Take
Calls Linq.QueryBuilder.SortByDescending
| Node of Tree<'T> * Tree<'T>
| Leaf of 'T
override ToString : unit -> string
| Node(l, r) -> sprintf "(%O, %O)" l r
| Leaf v -> sprintf "%O" v
| Top
| Left of Path<'T> * Tree<'T>
| Right of Path<'T> * Tree<'T>
override ToString : unit -> string
| Top -> "T"
| Left(p, t) -> sprintf "L(%O, %O)" p t
| Right(p, t) -> sprintf "R(%O, %O)" p t
| TZ of Tree<'T> * Path<'T>
override ToString : unit -> string
Navigates to the left sub-tree
Navigates to the right sub-tree
Gets the value at the current position
val unit : v:'a -> TreeZipper<'a>
Build tree zipper with singleton tree
--------------------
type unit = Unit
Transform leaves in the current sub-tree of 'treeZip'
into other trees using the provided function 'f'
type TreeZipperBuilder =
new : unit -> TreeZipperBuilder
member Current : tz:TreeZipper<'a> -> 'a
member Current : tz:TreeZipper<'a> -> 'a
member For : tz:TreeZipper<'T> * f:('T -> TreeZipper<'T>) -> TreeZipper<'T>
member Left : tz:TreeZipper<'a> -> TreeZipper<'a>
member Left : tz:TreeZipper<'a> -> TreeZipper<'a>
member Right : tz:TreeZipper<'a> -> TreeZipper<'a>
member Right : tz:TreeZipper<'a> -> TreeZipper<'a>
member Select : tz:TreeZipper<'a> * f:('a -> 'a) -> TreeZipper<'a>
member Select : tz:TreeZipper<'a> * f:('a -> 'a) -> TreeZipper<'a>
...
--------------------
new : unit -> TreeZipperBuilder
Global instance of the computation builder
type CustomOperationAttribute =
inherit Attribute
new : name:string -> CustomOperationAttribute
member AllowIntoPattern : bool
member IsLikeGroupJoin : bool
member IsLikeJoin : bool
member IsLikeZip : bool
member JoinConditionWord : string
member MaintainsVariableSpace : bool
member MaintainsVariableSpaceUsingBind : bool
member Name : string
...
--------------------
new : name:string -> CustomOperationAttribute
type ProjectionParameterAttribute =
inherit Attribute
new : unit -> ProjectionParameterAttribute
--------------------
new : unit -> ProjectionParameterAttribute
Calls TreeZipperBuilder.Right
Calls TreeZipperBuilder.Left
Calls TreeZipperBuilder.Current
Extracts the current value and returns it
Calls TreeZipperBuilder.Select
Transform the current sub-tree using 'f'
Calls TreeZipperBuilder.Up
Calls TreeZipperBuilder.Top