Processing trees with F# zipper computation

One of the less frequently advertised new features in F# 3.0 is the query syntax. It is an extension that makes it possible to add custom operations in an F# computation expression. The standard `query { .. }` computation uses this to define operations such as sorting (`sortBy` and `sortByDescending`) or operations for taking and skipping elements (`take`, `takeWhile`, ...). For example, you can write:

```1: query { for x in 1 .. 10 do
2:         take 3
3:         sortByDescending x }```

In this article I'll use the same notation for processing trees using the zipper pattern. I'll show how to define a computation that allows you to traverse a tree and perform transformations on (parts) of the tree. For example, we'll be able to say "Go to the left sub-tree, multiply all values by 2. Then go back and to the right sub-tree and divide all values by 2" as follows:

```1: tree { for x in sample do
2:        left
3:        map (x * 2)
4:        up
5:        right
6:        map (x / 2)
7:        top }```

This example behaves quite differently to the usual `query` computation. It mostly relies on custom operations like `left`, `right` and `up` that allow us to navigate through a tree (descend along the left or right sub-tree, go back to the parent node). The only operation that does something is the `map` operation which transforms the current sub-tree.

This was just a brief introduction to what is possible, so let's take a detailed look at how this works...

Trees and zippers

First of all, we need to define a data type that represents the tree. This is going to be a standard binary tree with values in leafs (but you could do the similar thing for other kinds of trees or structures):

```1: type Tree<'T> =
2:   | Node of Tree<'T> * Tree<'T>
3:   | Leaf of 'T
4:   override x.ToString() = (...)```

Next, we need to look at the concept of a zipper. Intuitively, a zipper is something that allows you to navigate through the tree. When you go the left sub-tree, we want to get a value that contains this sub-tree together with the path that describes how we got there (and contains remaining branches that are needed to reconstruct the original tree when we go back up).

The paper that introduced Zippers [2] is very readable and introduces the idea in more details. As a fun fact, you can also read about how zippers can be derived automatically using differentiation [1]. In our case, the zipper data type for the tree is defined as follows:

``` 1:
2: type Path<'T> =
3:   | Top
4:   | Left of Path<'T> * Tree<'T>
5:   | Right of Path<'T> * Tree<'T>
6:   override x.ToString() = (...)
7:
8: type TreeZipper<'T> =
9:   | TZ of Tree<'T> * Path<'T>
10:   override x.ToString() = (...)```

The `TreeZipper<'T>` type pairs a tree (the current sub-tree that we are focused at) with a path which describes "how we got there". If we are at the root, then the path is `Top`. If we descend to the left-sub tree, then the path is represented using `Left(path, tree)` where `path` is the previous path and `tree` is the tree on the right (that we just skipped over).

Working with zippers

Let's look at three simple functions that operate on the tree zipper and then we'll navigate through a simple tree using them:

``` 1: let left = function
2:   | TZ(Leaf _, _) -> failwith "cannot go left"
3:   | TZ(Node(l, r), p) -> TZ(l, Left(p, r))
4:
5: /// Navigates to the right sub-tree
6: let right = function
7:   | TZ(Leaf _, _) -> failwith "cannot go right"
8:   | TZ(Node(l, r), p) -> TZ(r, Right(p, l))
9:
10: /// Gets the value at the current position
11: let current = function
12:   | TZ(Leaf x, _) -> x
13:   | _ -> failwith "cannot get current"```

The `left` and `right` functions are symmetric. If the `TreeZipper<'T>` represents a position where the current tree is `Leaf` then they fail, because we cannot descend to a sub-tree of a leaf. If the tree is a `Node` then they pick the left (or right) sub-tree and make it a current tree by using it as the first argument of `TZ`. The path is constructed by appending the other tree (`r` or `l`, respectively) to the previous path using `Left` or `Right`.

On the other hand, the `current` operation only works when the zipper points at a leaf node. In that case, it simply returns the value from the leaf. The following example defines a simple tree and then uses zipper operations to get to one of the leaves in the tree:

``` 1: let branches =
2:   Node( Node(Leaf 1, Leaf 3),
3:         Node(Leaf 7, Node(Leaf 12, Leaf 20)) )
4:
5: // Wrap it as a zipper & print
6: let sample = TZ(branches, Top)
7: printfn "%O" sample
8:
9: // Get one of the tree leaves
10: sample |> right |> right |> left |> current```

To make this example complete, we also need to add operation `up` that goes to the parent. We will also need an operation that (recursively) goes back to the top of the tree:

```1: let up = function
2:   | TZ(l, Left(p, r))
3:   | TZ(r, Right(p, l)) -> TZ(Node(l, r), p)
4:   | TZ(_, Top) -> failwith "cannot go up"
5:
6: // Navigate to the root of the tree
7: let rec top = function
8:   | TZ(_, Top) as t -> t
9:   | tz -> top (up tz)```

The `up` operation fails when we are already in the root and the path is just `Top`. Otherwise, it gets the current tree and takes the other branch from the path value. These two trees are combined into a new `Node` which is then returned as the current tree.

Adding computational operations

In order to define an F# computation expression, we need to define the meaning of `for` and `yield`. Most often, these operations are implemented by monadic bind and unit. We will not follow this pattern exactly. If we did that, we wouldn't be able to implement the motivating example as nicely.

The operations that we need to define are `unit`, which simply creates a tree containing just a leaf, and `bindSub` which transforms all leaves in the current sub-tree using a specified function:

``` 1: let unit v = TZ(Leaf v, Top)
2:
3: /// Transform leaves in the current sub-tree of 'treeZip'
4: /// into other trees using the provided function 'f'
5: let bindSub f treeZip =
6:   let rec bindT = function
7:     | Leaf x -> let (TZ(t, _)) = top (f x) in t
8:     | Node(l, r) -> Node(bindT l, bindT r)
9:   let (TZ(current, path)) = treeZip
10:   TZ(bindT current, path)```

The type of `bindSub` is `('T -> TreeZipper<'T>) -> TreeZipper<'T> -> TreeZipper<'T>`, which is almost like monadic bind with the exception that the transformation function needs to transform elements of type `'T` into trees containing leaves of the same type.

This requiremenet follows from the fact that we run the transformation only on the current tree (`bindT current`) while the `path` is left unchanged (and since path contains other trees of the same type, the type needs to stay the same). You could implement `bind` that applies `f` to trees in the path, but that would not be as interesting (because all transformations like `map` would happen on the whole tree - not just the current sub-tree).

Defining the Computation builder

Providing monadic operations

Now we have everything we need to define an F# computation builder for working with tree zippers. We start with a simple definition that allows just `for` and `yield` and then add all the (interesting) custom operators. Our `yield` is defined as `unit` and `for` corresponds to `bindSub`:

```1: type TreeZipperBuilder() =
2:   /// Enables the 'for x in xs do ..' syntax
3:   member x.For(tz:TreeZipper<'T>, f) : TreeZipper<'T> = bindSub f tz
4:   /// Enables the 'yield x' syntax
5:   member x.Yield(v) = unit v
6:
7: /// Global instance of the computation builder
8: let tree = TreeZipperBuilder()```

Equipped with the above definition, we can write computations that transform the entire tree. For example, to multiply all values of the tree by 2, we can write the following computation (you can pass the result to `printfn "%O"` to get a nicer output)

```1: tree { for x in sample do
2:        yield x * 2 }```

As discussed earlier, the `bindSub` operation (and thus our `for`) only allows the result to have the same type as the input. Writing, for example, `x.ToString()`, would be invalid. In this case, it does not seem very logical, but that's because we have not added operations for navigating through the tree - the `for` and `yield` can be only used to transform the entire tree.

Adding custom operations

Now we're getting to the most interesting bit of the article. How do we add custom operations to the computation expression? To do that, we need to annotate the method with `CustomOperation` attribute. The attribute defines the name of the operation and can specify additional parameters.

The fact that makes custom operations interesting is that they can have types such as `TreeZipper<'T> -> TreeZipper<'T>` which is exactly the type of our navigational operations. The syntax allows other operations - such as zipping, grouping and joinging - but we will not need these in this article.

To add navigational operations, transformations and getter for the current element, we can write the following code:

``` 1: type TreeZipperBuilder with
2:   // Operations for navigation through the tree
3:   [<CustomOperation("left", MaintainsVariableSpace=true)>]
4:   member x.Left(tz) = left tz
5:   [<CustomOperation("right", MaintainsVariableSpace=true)>]
6:   member x.Right(tz) = right tz
7:   [<CustomOperation("up", MaintainsVariableSpace=true)>]
8:   member x.Up(tz) = up tz
9:   [<CustomOperation("top", MaintainsVariableSpace=true)>]
10:   member x.Top(tz) = top tz
11:
12:   /// Extracts the current value and returns it
13:   [<CustomOperation("current", MaintainsVariableSpace=false)>]
14:   member x.Current(tz) = current tz
15:
16:   /// Transform the current sub-tree using 'f'
17:   [<CustomOperation("map", MaintainsVariableSpace=true)>]
18:   member x.Select(tz, [<ProjectionParameter>] f) = bindSub (f >> unit) tz```

The `MaintainsVariableSpace` parameter allows us to specify that an operation transforms values in the abstract data type without touching them. If you have an operation `M<'T> -> M<'T>` then it is generally `true`, but if you have an operation such as `M<'T> -> M<int>` then the variable space would not be preserved (because variables are stored in a tuple used in place of `'T`, so turning that to `int` loses the values in the tuple). In our case, navigation and transformation maintains the variable space, but `current` operation does not.

The `map` operation uses another interesting attribute. If we annotate a function argument with `ProjectionParameter` then we can write `map (x * 2)` and the argument expression `x * 2` is implicitly turned into a function that takes all the variables and returns the result. Something like `fun vars -> vars.x * 2`.

Sample tree computations

Let's finish the article with two examples that process the `sample` tree defined earlier. The first one corresponds to our earlier computation (written using pipelines) that picks a specific leaf:

```1: tree { for x in sample do
2:        right
3:        right
4:        left
5:        current }```

The computation expression syntax always has to start with `for .. in .. do`, so we write that even if we do not actually access the value assigned to `x`. This is simply translated to `For` followed by `Yield` (which does not change the tree).

The `for` can then be followed by custom operations. In this example, we move right, right, left and then get the current value from the tree. Note that adding `left` after `current` gives a type error, because `current` extracts a value (which is not a tree zipper that can be transformed).

We can now also write the sample from the introduction of the article:

```1: tree { for x in sample do
2:        left
3:        map (x * 2)
4:        up
5:        right
6:        map (x / 2)
7:        top }```

Here, we write `for` followed by `left` to navigate to a left sub-tree. Then we perform a transformation using `map`. Because this is implemented using the `bindSub` operation, the transformation only affects the current sub-tree, but not the parents (the root node and its right sub-tree).

This example is more interesting, because it actually uses the variable `x` in the transformations (written using `map`). The computation expression notation actually gives us some benefits over a simple pipeline, because we do not need to create an explicit lambda each time we write `map`. Thanks to `ProjectionParameter` attribute, this is done automatically.

Summary

The main goal of this article is to demonstrate how you can define simple computation expressions with custom operations which is a new (and pretty powerful) feature in F# 3.0. In the example discussed in this article, I used the tree zipper.

The custom operations make it possible to navigate through the tree using a conventient syntax and perform transformations on current sub-tree just by writing commands like `left` and `right`. When writing the transformations using `map (x / 2)` we also do not need to write explicit lambda functions.

This article is really just an introduction - there are many topics that I did not talk about, because the custom constructs can also implement operations that resemble grouping or joining. But I'll leave that for another article!

References

val query : Linq.QueryBuilder

Full name: Microsoft.FSharp.Core.ExtraTopLevelOperators.query
val x : int
custom operation: take (int)

Calls Linq.QueryBuilder.Take
custom operation: sortByDescending ('Key)

Calls Linq.QueryBuilder.SortByDescending
type Tree<'T> =
| Node of Tree<'T> * Tree<'T>
| Leaf of 'T
override ToString : unit -> string

Full name: Tree-zipper-query.Tree<_>
union case Tree.Node: Tree<'T> * Tree<'T> -> Tree<'T>
union case Tree.Leaf: 'T -> Tree<'T>
val x : Tree<'T>
override Tree.ToString : unit -> string

Full name: Tree-zipper-query.Tree`1.ToString
match x with
| Node(l, r) -> sprintf "(%O, %O)" l r
| Leaf v -> sprintf "%O" v
type Path<'T> =
| Top
| Left of Path<'T> * Tree<'T>
| Right of Path<'T> * Tree<'T>
override ToString : unit -> string

Full name: Tree-zipper-query.Path<_>
union case Path.Top: Path<'T>
union case Path.Left: Path<'T> * Tree<'T> -> Path<'T>
union case Path.Right: Path<'T> * Tree<'T> -> Path<'T>
val x : Path<'T>
override Path.ToString : unit -> string

Full name: Tree-zipper-query.Path`1.ToString
match x with
| Top -> "T"
| Left(p, t) -> sprintf "L(%O, %O)" p t
| Right(p, t) -> sprintf "R(%O, %O)" p t
type TreeZipper<'T> =
| TZ of Tree<'T> * Path<'T>
override ToString : unit -> string

Full name: Tree-zipper-query.TreeZipper<_>
union case TreeZipper.TZ: Tree<'T> * Path<'T> -> TreeZipper<'T>
val x : TreeZipper<'T>
override TreeZipper.ToString : unit -> string

Full name: Tree-zipper-query.TreeZipper`1.ToString
let (TZ(t, p)) = x in sprintf "%O [%O]" t p
val left : _arg1:TreeZipper<'a> -> TreeZipper<'a>

Full name: Tree-zipper-query.left

Navigates to the left sub-tree
val failwith : message:string -> 'T

Full name: Microsoft.FSharp.Core.Operators.failwith
val l : Tree<'a>
val r : Tree<'a>
val p : Path<'a>
val right : _arg1:TreeZipper<'a> -> TreeZipper<'a>

Full name: Tree-zipper-query.right

Navigates to the right sub-tree
val current : _arg1:TreeZipper<'a> -> 'a

Full name: Tree-zipper-query.current

Gets the value at the current position
val x : 'a
val branches : Tree<int>

Full name: Tree-zipper-query.branches
val sample : TreeZipper<int>

Full name: Tree-zipper-query.sample
val printfn : format:Printf.TextWriterFormat<'T> -> 'T

Full name: Microsoft.FSharp.Core.ExtraTopLevelOperators.printfn
val up : _arg1:TreeZipper<'a> -> TreeZipper<'a>

Full name: Tree-zipper-query.up
val top : _arg1:TreeZipper<'a> -> TreeZipper<'a>

Full name: Tree-zipper-query.top
val t : TreeZipper<'a>
val tz : TreeZipper<'a>
Multiple items
val unit : v:'a -> TreeZipper<'a>

Full name: Tree-zipper-query.unit

Build tree zipper with singleton tree

--------------------
type unit = Unit

Full name: Microsoft.FSharp.Core.unit
val v : 'a
val bindSub : f:('a -> TreeZipper<'a>) -> treeZip:TreeZipper<'a> -> TreeZipper<'a>

Full name: Tree-zipper-query.bindSub

Transform leaves in the current sub-tree of 'treeZip'
into other trees using the provided function 'f'
val f : ('a -> TreeZipper<'a>)
val treeZip : TreeZipper<'a>
val bindT : (Tree<'a> -> Tree<'a>)
val t : Tree<'a>
val current : Tree<'a>
val path : Path<'a>
Multiple items
type TreeZipperBuilder =
new : unit -> TreeZipperBuilder
member Current : tz:TreeZipper<'a> -> 'a
member For : tz:TreeZipper<'T> * f:('T -> TreeZipper<'T>) -> TreeZipper<'T>
member Left : tz:TreeZipper<'a> -> TreeZipper<'a>
member Right : tz:TreeZipper<'a> -> TreeZipper<'a>
member Select : tz:TreeZipper<'a> * f:('a -> 'a) -> TreeZipper<'a>
member Top : tz:TreeZipper<'a> -> TreeZipper<'a>
member Up : tz:TreeZipper<'a> -> TreeZipper<'a>
member Yield : v:'a -> TreeZipper<'a>

Full name: Tree-zipper-query.TreeZipperBuilder

--------------------
new : unit -> TreeZipperBuilder
val x : TreeZipperBuilder
member TreeZipperBuilder.For : tz:TreeZipper<'T> * f:('T -> TreeZipper<'T>) -> TreeZipper<'T>

Full name: Tree-zipper-query.TreeZipperBuilder.For

Enables the 'for x in xs do ..' syntax
val tz : TreeZipper<'T>
val f : ('T -> TreeZipper<'T>)
member TreeZipperBuilder.Yield : v:'a -> TreeZipper<'a>

Full name: Tree-zipper-query.TreeZipperBuilder.Yield

Enables the 'yield x' syntax
val tree : TreeZipperBuilder

Full name: Tree-zipper-query.tree

Global instance of the computation builder
Multiple items
type CustomOperationAttribute =
inherit Attribute
new : name:string -> CustomOperationAttribute
member AllowIntoPattern : bool
member IsLikeGroupJoin : bool
member IsLikeJoin : bool
member IsLikeZip : bool
member JoinConditionWord : string
member MaintainsVariableSpace : bool
member MaintainsVariableSpaceUsingBind : bool
member Name : string
...

Full name: Microsoft.FSharp.Core.CustomOperationAttribute

--------------------
new : name:string -> CustomOperationAttribute
member TreeZipperBuilder.Left : tz:TreeZipper<'a> -> TreeZipper<'a>

Full name: Tree-zipper-query.TreeZipperBuilder.Left
member TreeZipperBuilder.Right : tz:TreeZipper<'a> -> TreeZipper<'a>

Full name: Tree-zipper-query.TreeZipperBuilder.Right
member TreeZipperBuilder.Up : tz:TreeZipper<'a> -> TreeZipper<'a>

Full name: Tree-zipper-query.TreeZipperBuilder.Up
member TreeZipperBuilder.Top : tz:TreeZipper<'a> -> TreeZipper<'a>

Full name: Tree-zipper-query.TreeZipperBuilder.Top
member TreeZipperBuilder.Current : tz:TreeZipper<'a> -> 'a

Full name: Tree-zipper-query.TreeZipperBuilder.Current

Extracts the current value and returns it
member TreeZipperBuilder.Select : tz:TreeZipper<'a> * f:('a -> 'a) -> TreeZipper<'a>

Full name: Tree-zipper-query.TreeZipperBuilder.Select

Transform the current sub-tree using 'f'
Multiple items
type ProjectionParameterAttribute =
inherit Attribute
new : unit -> ProjectionParameterAttribute

Full name: Microsoft.FSharp.Core.ProjectionParameterAttribute

--------------------
new : unit -> ProjectionParameterAttribute
val f : ('a -> 'a)
custom operation: right

Calls TreeZipperBuilder.Right
custom operation: left

Calls TreeZipperBuilder.Left
custom operation: current

Calls TreeZipperBuilder.Current

Extracts the current value and returns it
custom operation: map ('a)

Calls TreeZipperBuilder.Select

Transform the current sub-tree using 'f'
custom operation: up

Calls TreeZipperBuilder.Up
custom operation: top

Calls TreeZipperBuilder.Top
val tree : TreeZipperBuilder

Full name: Tree-zipper-query.tree

Global instance of the computation builder
val x : int
val sample : TreeZipper<int>

Full name: Tree-zipper-query.sample
custom operation: left

Calls TreeZipperBuilder.Left
custom operation: map ('a)

Calls TreeZipperBuilder.Select

Transform the current sub-tree using 'f'
custom operation: up

Calls TreeZipperBuilder.Up
custom operation: right

Calls TreeZipperBuilder.Right
custom operation: top

Calls TreeZipperBuilder.Top

Published: December 19, 2012 14:22
Tags: F# language | Research | Haskell

Share:

• Walnut Dining Tables by Walnut Dining (1/8/2013 11:16:56 AM)

Great post, very informative. I think a lot of people will find this very useful.Keep post in coming future as well!!!

• quit claim deed by quit claim deed (1/8/2013 12:32:38 PM)

Great post, very informative. I think a lot of people will find this very useful.Keep post in coming future as well!!!

• Bucks County PA real estate by Bucks County PA real estate (1/31/2013 10:58:51 AM)

Great post, very informative. I think a lot of people will find this very useful.Keep post in coming future as well!!!

• RE: Processing trees with F# zipper computation by Cesar Mendoza (1/31/2013 9:57:56 PM)

There is an entertaining article that talks about Zippers in the Special Poetry and Fiction Edition of the Monad Reader. The article is named "Theseus and the Zipper". http://themonadreader.files.wordpress.com/2011/03/specialissue.pdf

• RE: Processing trees with F# zipper computation by consultation voyance gratuite (5/14/2013 1:32:30 PM)

It’s a really great site you have here. Thank you for the effort to be so good for us (even though we don`t deserve it) and keep it up.

Send comments to this article

Hyperlinks are detected automatically and you can use **bold** and __italic__ to format your comment.

 Title: Name: Url: Remember me