F# Overview (IV.) - Language Oriented Programming

Defining precisely what the term language oriented programming means in context of the F# language would be difficult, so I will instead explain a few examples that will demonstrate how I understand it. In general, the goal of language oriented programming is to develop a language that would be suitable for some (more specific) class of tasks and use this language for solving these tasks. Of course, developing a real programming language is extremely complex problem, so there are several ways for making it easier. As the most elementary example, you can look at XML files (with certain schema) as language that are processed by your program and solve some specific problem (for example configuring the application). As a side note, I should mention that I'm not particularly happy with the term ‘language’ in this context, because the term can be used for describing a wide range of techniques from very trivial constructs to a complex object-oriented class libraries, but I have not seen any better term for the class of techniques that I’m going to talk about.

What I will focus on in this article is using languages inside F# - this means that the custom language will be always a subset of the F# language, but we will look at ways for giving it a different meaning than the standard F# code would have. In some articles you can also see the term domain specific language, which is quite close to what we're discussing here. The domain specific language is a language suitable for solving some class of problems and in general it can be either external, meaning that it is a separate language (e.g. a XML file) or an internal, meaning that it is written using a subset of the host language. Using this terminology, we will focus on writing internal DSLs in F#.

Since this term is not as widely used as functional or object oriented programming which we discussed in earlier articles in this series (Functional Programming [^], Object Oriented and Imperative Programming [^]), let me very quickly introduce why I believe that this is an important topic. I think the main reason why language oriented development is appealing paradigm is that it allows very smooth cooperation of people working on the project - there are people who develop the language and those who use it. The language developers need to have advanced knowledge of the technology (F#) and also of the problem that their language is trying to solve (e.g. some mathematical processing), but they don't need to know about all the problems that are being solved using the language. On the other side, the users of the language need only basic F# knowledge and they can fully focus on solving the real problems.

Discriminated Union as Declarative Language

Probably the simplest example of domain-specific language that can be embedded in the F# code is a discriminated union, which can be used for writing declarative specifications of behavior or for example for representing and processing mathematical expressions:

> type Expr = 
  | Binary of string * Expr * Expr
  | Var    of string 
  | Const  of int;;

> let e = Binary("+", Const(2), Binary("*", Var("x"), Const(4)));;
val e : Expr

In this example we created a discriminated union and used it for building a value representing a mathematical expression. This is of course very primitive ‘language’, but when you implement functions for working with these values (for example differentiation or evaluation) you’ll get a simple language for processing mathematical expressions inside F#. Another problems that could be solved using this technique include for example configuration of some graphical user interface or definition of template for some simple data manipulation.

Active Patterns

A language feature that is closely related to discriminated unions is called active patterns. Active patterns can be used for providing different views on some data type, which allows us to hide the internal representation of the type and publish only these views. Active patterns are similar to discriminated unions, because they can provide several views on a single value (in the previous example we had a value that we could view either as Binary, Var or Const) and similarly as constructors of discriminated union, active patterns can be used in pattern matching constructs.

A typical example, where a type can be viewed using different views is a complex number, which can be either viewed in a Cartesian representation (real and imaginary part) or in a polar form (absolute value and phase). Once the module provides these two views for a complex number type, the internal representation of the type can be hidden, because all users of the type will work with the number using active patterns, which also makes it easy to change the implementation of the type as needed.

It is recommended to use active patterns in public library API instead of exposing the names of discriminated union constructors, because this makes it possible to change the internal representation without breaking the existing code. The second possible use of active patterns is extending the ‘vocabulary’ of a language built using discriminated union. In the following example we will implement an active pattern Commutative that allows us to decompose a value of type Expr into a call to commutative binary operator:

> let (|Commutative|_|) x =
    match x with
    | Binary(s, e1, e2) when (s = "+") || (s = "*") -> Some(s, e1, e2)
    | _ -> None;;
val ( |Commutative|_| ) : Expr -> (string * Expr * Expr) option

As you can see, the declaration of active pattern looks like a function declaration, but uses a strangely looking function name. In this case we use the (|PatternName|_|) syntax, which declares a pattern that can return a successful match or can fail. The pattern has a single argument (of type Expr) and returns an option type, which can be either Some(...) when the value matches the pattern or None. As we will show later, the patterns that can fail can be used in a match construct, where you can test a value against several different patterns.

As demonstrated in this example, active patterns can be used in a similar sense in which you can use discriminated unions to define a language for constructing the values. The key difference is that discriminated unions can be used for building the value (meaning that they will be used by all users of the language) and active patterns are used for decomposing the values and so they will be used in a code that interprets the language (written usually by the language designer) or by some pre-processing or optimizing code (written by advanced users of the language).

In the next example we will look at one advanced example of using the numerical language that we define earlier. We will implement a function that tests whether two expressions are equal using the commutativity rule, meaning that for example 10*(a+5) will be considered as equal to (5+a)*10:

> let rec equal e1 e2 = 
    match e1, e2 with
    | Commutative(o1, l1, r1), Commutative(o2, l2, r2) ->
       (o1 = o2) && (equal l1 r2) && (equal r1 l2)
    | _ -> e1 = e2;;
val equal : Expr -> Expr -> bool

> let e1 = Binary("*", Binary("+", Const(10), Var("x")), Const(4));;
  let e2 = Binary("*", Const(4), Binary("+", Var("x"), Const(10)));;
  equal e1 e2;;
val it : bool = true

As you can see, implementing the equal function that uses the commutativity rule is much easier using the Commutative active pattern than it would be explicitly by testing if the value is a use of specific binary operator. Also, when we’ll introduce a new commutative operator, we’ll only need to modify the active pattern and the equal function will work correctly.

Sequence comprehensions

Before digging deeper into advanced language-oriented features of F#, I'll need to do a small digression and talk about sequence comprehensions. This is a language construct that allows us to generate sequences, lists and arrays of data in F# and as we will see later it can be generalized to allow solving several related problems. Anyway, let's first look at an example that filters an F# list:

> let people = [ ("Joe", 55); ("John", 32); ("Jane", 24); ("Jimmy", 42) ];;
val people : (string * int) list

> [ for (name, age) in people
    when age < 30
    -> name ];;
val it : string list = ["Jane"]

In this example we first declared a list with some data and then used a sequence expression, wrapped between square brackets [ and ], to select only some elements from the list. The use of square brackets indicate that the result should be an F# list (you can also use [| .. |] to get an array or seq { .. } to get a sequence as I'll show later). The code inside the comprehension can contain most of the ordinary F# expressions, but in this example I used one extension, the when .. -> construct, which can be used for typical filtering and projection operations. The same code can be written like this:

> [ for (name, age) in people do
    if (age < 30) then
      yield name ];;
val it : string list = ["Jane"]

In this example, we used an ordinary for .. do loop (in the previous example the do keyword was missing and we used if .. then condition instead of when. Finally, returning a value from a sequence comprehension can be done using the yield construct. The point of this example is to demonstrate that the code inside the comprehension is not limited to some specific set of expressions and can, in fact, contain very complex F# code. I will demonstrate the flexibility of sequence comprehensions in one more example - the code will generate all possible words (of specified length) that can be generated using the given alphabet:

> let rec generateWords letters start len =
    seq { for l in letters do
            let word = (start ^ l)
            if len = 1 then
              yield  word
            if len > 1 then
              yield! generateWords letters word (len-1) }
val generateWords : #seq<string> -> string -> int -> seq<string>
> generateWords ["a"; "b"; "c"] "" 4;;
val it : seq<string> = seq ["aaaa"; "aaab"; "aaac"; "aaba"; ...]

This example introduces two interesting constructs. First of all, we're using seq { .. } expression to build the sequence, which is a lazy data structure, meaning that the code will be evaluated on demand. When you ask for the next element, it will continue evaluating until it reaches yield construct, which returns a word and then it will block again (until you ask for the next element). The second interesting fact is that the code is recursive - the generateWord function calls itself using yield! construct, which first computes the elements from the given sequence and then continues with evaluation of the remaining elements in the current comprehension.

F# Computation Expression

The next F# feature that we will look at can be viewed as a generalization of the sequence comprehensions. In general, it allows you to declare blocks similar to the seq { .. } block that execute the F# code in a slightly different way. In the case of seq this difference is that the code can return multiple values using yield.

In the next example we will implement a similar block called maybe that performs some computation and returns Some(res) when the computation succeeds, but it can also stop its execution when some operation fails and return None immediately, without executing the rest of the code inside the block. Let's first implement a simple function that can either return some value or can fail and return None:

let readNum () =
  let s = Console.ReadLine()
  let succ,v = Int32.TryParse(s)
  if (succ) then Some(v) else None

Now, we can write a code that reads two numbers from the console and adds them together, producing a value Some(a+b). However, when a call to readNum fails, we want to return None immediately without executing the second call to readNum. This is exactly what the maybe block will do (I'll show the implementation of the block shortly):

let n = 
  maybe { do   printf "Enter a: "
          let! a = readNum()
          do   printf "Enter b: "
          let! b = readNum()
          return a + b }  
printf "Result is: %A" n

The code inside the block first calls printf and then uses a let! construct to call the readNum function. This operation is called monadic bind and the implementation of maybe block specifies the behavior of this operation. Similarly, it can also specify behavior of the do and return operation, but in this example the let! is the most interesting, because it tests whether the computed value is None and stops the execution in such case (otherwise it starts executing the rest of the block).

Before looking at the implementation of the maybe block, let's look at the type of the functions that we'll need to implement. Every block (usually called computation expression in F#) is implemented by a monadic builder which has the following members that define elementary operators:

// Signature of the builder for monad M
type MaybeBuilder with
  member Bind   : M<'a> * ('a -> M<'b>) -> M<'b>
  member Return : 'a -> M<'a>
  member Delay  : (unit -> M<'a>) -> M<'a>

We'll shortly discuss how the F# compiler uses these members to execute the computation expression, but let me first add a few short comments for those who are familiar with Haskell monads. The Bind and Return members specify the standard monadic operators (known from Haskell), meaning that Bind is used when we use the let! operator in the code and Return is called when the computation expression contains return and finally, the Delay member allows building monads that are executed lazily.

The computation expression block is just a syntactic extension that makes it possible to write a code that uses the monadic operations, but is similar to an ordinary F# code. This means that the code inside the computation expression is simply translated to calls to the basic monadic operation, which we looked at earlier. The following example should put some light on the problem, because it shows how the F# compiler translates the code written using the maybe block:

maybe.Delay(fun () ->
  printf "Enter a"
  maybe.Bind(readNum(), fun a ->
    printf "Enter b"
    maybe.Bind(readNum(), fun b ->
      maybe.Return(a + b))

As we can see, the original code is split into single expressions and these are evaluated separately as arguments of the monadic operations. It is also important to note that the expression may not be evaluated, because this depends on the behavior of the monadic operation.

For example, let's analyze the third line, where a first call to the Bind operation occurs. The first argument will be evaluated asking for a user input and will produce either None or Some(n). The second argument is a function that takes one argument (a) and executes the rest of the computation expression. As you can see, the let binding in the original code was translated to a call to the Bind operation which can perform some additional processing and change the semantics and then assign a value to the variable by calling the given function. Also note that the first argument of the Bind operation is a monadic type (in the signature presented above it was M<'a>, while the argument of the function given as a second argument is ordinary type (unwrapped 'a). This means that the monadic type can hold some additional information - in our maybe monad, the additional information is a possibility of the failure of the operation.

Let's look at the implementation of the maybe monad now. The Bind operation will test if the first argument is Some(n) and then it will call the function given as a second argument with n as an argument. If the value of the first argument is None the Bind operation just returns None. The second key operation is Result which simply wraps an ordinary value into a monadic type - in our example it will take a value a (of type 'a) and turn it into a value Some(a) (of type M<'a>):

type M<'a> = option<'a>

let bind f d = 
  match d with
   | None -> None
   | Some(v) -> f v
let result v = Some(v)
let delay  f = f()
type MaybeBuilder() =
  member x.Bind(v, f) = bind v f
  member x.Return(v)  = result v
  member x.Delay(f)   = delay f 
  member x.Let(v, f)  = bind (result v) f
let maybe = MaybeBuilder()

In this example we looked at computation expressions and implemented a simple monadic builder for representing a computations that can fail. We implemented support only for basic language constructs (like let and let!), but in general the computation expression can allow using constructs like if, try .. when and other. For more information, please refer to [1]. Computation expressions are very powerful when you want to modify the behavior of the F# code, without changing the semantics of elementary expressions, for example by adding a possibility to fail (as we did in this example), or by executing the code asynchronously (as asynchronous workflows [2], which are part of the F# library do).

F# Meta-Programming and Reflection

The last approach to language oriented programming that I’ll present in this overview is using meta-programming capabilities of the F# language and .NET runtime. In general the term ‘meta-programming’ means writing a program that treats code as data and manipulates with it in some way. In F# this technique can be used for translating a code written in F# to other languages or formats that can be executed in some other execution environment or it can be used for analysis of the F# code and for calculating some additional properties of this code.

The meta-programming capabilities of F# and .NET runtime can be viewed as a two separate and orthogonal parts. The .NET runtime provides a way for discovering all the types and top-level method definitions in a running program: this API is called reflection. F# quotations provide a second part of the full meta-programming support - they can be used for extracting an abstract syntax trees of members discovered using the .NET reflection mechanism (note that the F# quotations are a feature of the F# compiler and as such can’t be produced by C# or VB compilers).

.NET and F# Reflection

The F# library also extends the .NET System.Reflection to give additional information about F# data types – for example we can use the F# reflection library to examine possible values of the Expr type (discriminated union) declared earlier:

> open Microsoft.FSharp.Reflection;;
> let exprTy = typeof<Expr>
  if FSharpType.IsUnion(exprTy) then
    let opts = FSharpType.GetUnionCases(exprTy)
    opts |> (fun m -> m.Name)
    [| |];;
val it : string[] = [| "Binary"; "Var"; "Const" |]

An important part of the .NET reflection mechanism is the use of custom attributes, which can be used to annotate any program construct accessible via reflection with additional metadata. The following example demonstrates the syntax for attributes in F# by declaring Documentation attribute (simply by inheriting from the System.Attribute base class) and also demonstrates how a static method in a class can be annotated with the attribute:

type DocumentationAttribute(doc:string) =
  inherit System.Attribute()
  member x.Documentation = doc

type Demo =
  [<Documentation("Adds one to a given number")>]
  static member AddOne x = x + 1

Using the .NET System.Reflection library it is possible to examine members of the Demo type including reading of the associated attributes (which are stored in the compiled DLL and are available at run-time):

> let ty = typeof<Demo>
  let mi = ty.GetMethod("AddOne")
  let at = mi.GetCustomAttributes
                (typeof<DocumentationAttribute>, false)
  (at.[0] :?> DocumentationAttribute).Doc;;
val it : string = "Adds one to a given number"

F# Quotations

F# quotations form the second part of the meta-programming mechanism, by allowing the capture of type-checked F# expressions as structured terms. There are two ways for capturing quotations – the first way is to use quotation literals and explicitly mark a piece of code as a quotation and the second way is to use ReflectedDefinition attribute, which instructs the compiler to store quotation data for a specified top-level member. The following example demonstrates a few simple quoted F# expressions – the quoted expressions are ordinary type-checked F# expressions wrapped between <@@ and @@>:

> open Microsoft.FSharp.Quotations;;
> <@@ 1 + 1 @@>
val it : Expr<int>

> <@@ (fun x -> x + 1) @@>
val it : Expr<int -> int>

Quotation processing is usually done on the raw representation of the quotations, which is represented by the non-generic Expr type (however the type information about the quoted expression is still available dynamically via the Type property). The following example implements a trivial evaluator for quotations. GenericTopDefnApp is an active pattern that matches with the use of a function given as a first argument (in this example a plus operator), the Int32 pattern recognizes a constant of type int):

> open Microsoft.FSharp.Quotations.Patterns;;
> open Microsoft.FSharp.Quotations.DerivedPatterns;;
> let plusOp = <@@ (+) @@>
  let rec eval x =
    match x with
    | SpecificCall plusOp (_, [l; r]) ->
        (eval l) + (eval r)
    | Int32(n) -> 
    | _ ->        
        failwith "unknonw construct"  
val eval : Expr -> int

> let tst = <@@ (1+2) + (3+4) @@>
  eval tst
val it : int = 10

Quotation Templates and Splicing

When generating quotations programmatically, it is often useful to build a quotation by combining several elementary quotations into a one, more complex quotation. This can be done by creating a quotation template, which is a quotation that contains one or more holes. Holes are written using the underscore symbol and define a place, where another quotation can be filled in the template. In the following example, we will look at a template that contains two holes and can be used for generating a quotation that represents addition of two values:

> let addTempl a b = <@@ %a + %b @@>;;
val addTempl : Expr<int> -> Expr<int> -> Expr<int>

> eval(addTempl <@@ 2 @@> <@@ 40 @@>);;
val it : int = 42

In this example, we first create a quotation template addTempl. This template takes two expressions as parameters and constructs an expression that represnts addition. The two expression given as an argument ar 'spliced' into the created expression using the % operator. Note that the holes are typed, meaning that the values that can be filled in the template have to be quotations representing an expression of type int.

The splicing operator is also useful mechanism for providing input data for programs that evaluate quotations. As you can see in the following example, we can use it for embedding a value that represents a database table (the |> is a pipelining operator, which applies the argument on the left hand side to the function on the right hand side). This example is based on the FLINQ project, which allows writing database queries in F# and executing them as SQL queries on a database engine:

> <@@ (%db).Customers 
      |> filter (fun x -> x.City = "London")
      |> map (fun x -> x.Name) @@>
val it : Expr<seq<string>>

In the raw representation, the spliced value can be recognized using the LiftedValue pattern, which returns a value of type obj, which can contain any F# value.

Quoting Top-Level Definitions

The second option for quoting F# code is by explicitly marking top-level definitions with an attribute that instructs the F# compiler to capture the quotation of the entire definition body. This option is sometimes called non-intrusive meta-programming, because it allows processing of the member body (e.g. translating it to some other language and executing it heterogeneously), but doesn’t require any deep understanding of meta-programming from the user of the library. The following code gives a simple example:

let addOne x = 
  x + 1

The quotation of a top-level definition (which can be either a function or a class member) annotated using the ReflectedDefinition attribute is then made available through the F# quotation library at runtime using the reflection mechanism described earlier, but the member is still available as a compiled code and can be executed.

When a quotation represents a use of a top-level definition it is possible to check if this top-level definition was annotated using the ReflectedDefinition attribute and so the quotation of the definition is accessible. This can be done using the ResolveTopDefinition function as demonstrated in the following example:

let expandedQuotation = 
  match <@@ addOne @@> with 
  | Lambda(_, Call(_, mi, _)) -> 
      match mi with
      | MethodWithReflectedDefinition(quot) -> quot
      | _ -> faliwith "Quotation not available!"
  | _ -> failwith "Not a method use!"      

Using Active Patterns with Quotations

As already mentioned, the programmatic access to F# quotation trees uses F# active patterns, which allow the internal representation of quotation trees to be hidden while still allowing the use of pattern matching as a convenient way to decompose and analyze F# quotation terms. Active-patterns can be also used when implementing a quotation processor, because they can be used to group similar cases together. In the following example we declare an active pattern that recognizes two binary operations:

let plusOp  = <@@ (+) @@>
let minusOp = <@@ (-) @@>
let (|BinaryOp|_|) x =
  match x with
  | SpecificCall plusOp (_, [l; r]) -> Some("+", l, r)
  | SpecificCall minusOp (_, [l; r]) -> Some("-", l, r)
  | _ -> None

let rec eval x =
  match x with
  | BinaryOp (op, l, r) ->
      if (op = "+") then (eval l) + (eval r)
        else (eval l) - (eval r)
  (* ... *)

In this example we declared BinaryOp active pattern, which can be used for matching a quotation that represents either addition or subtraction. In a code that processes quotations, grouping of related cases together by using active patterns is very useful, because you can define active patterns for all quotation types that your translator or analyzer can process, factor out all the code that recognizes all the supported cases and keep the translator or analyzer itself very simple.

Article Series Links


Published: Saturday, 3 November 2007, 12:00 AM
Author: Tomas Petricek
Typos: Send me a pull request!
Tags: functional, asynchronous, meta-programming, f#