TP

# Writing custom F# LINQ query builder

One of the attendees of my virtual F# in Finance course, Stuart recently asked me a pretty advanced question about writing custom queries with F#, because he was interested in writing a nicer querying library for Amazon DynamoDB (his project is here).

I realized that I don't really know about any good resource about doing this, so I wrote a little sample that shows how you can do this. The idea is that we want to be able to write something like this:

 1: 2: 3: 4: 5:  query { for p in DynamoDB.People do where (p.Age > 10) where (p.Name = "Tomas") select (p.Age, p.Name) } 

The DynamoDB could even be a type generated by a type provider (with all the tables available in Dynamo DB). Now, the above example uses the built-in query builder, which is extensible, but, as far as I know, you have to use LINQ expression trees to support it. In this article, I'm going to use an alternative approach with custom builder (so you would write dynamo { ... } instead of query { ... }).

I wanted to write a minimal example showing how to do this, so this blog post is going to be mostly code (unlike my other chatty articles!), but it should give you (and Stuart :-)) some idea how to do this. I was quite intrigued by the idea of having a nice query language for DynamoDB, so I'm hoping that this blog post can help move the project forward!

## Defining types and computation builder

First of all, we need to define types that model the database and the query builder that lets us write something like query { ... }. In this blog post, I'll call the builder simpleq. These could contain some more code, but they do not need to contain any implmentation - we will never actually execute any of these methods! Instead, we'll capture F# quotation of the query and then translate that to a simple data structure that describes the query.

So, we define Query<'T> which represents a query that returns values of type 'T (but it never actually does anything). We define a sample type Person and a type DB representing the database:

  1: 2: 3: 4: 5: 6: 7: 8: 9: 10: 11: 12: 13:  open Microsoft.FSharp.Quotations /// Represents a query (but it is never executed!) type Query<'T> = NA /// Sample type that would be ideally imported from database type Person = { Name : string Age : int } /// Models a database with one 'People' table type DB = static member People : Query = NA 

Next, we'll define a computation builder that defines what operations we're allowed to perform in the query. This is an object with some specially named members (and also marked with attributes) that the compiler understands. When we then write something like the query in the introduction, the compiler translates the query into calls to the computation builder operations.

If we were not using quotation to translate the query, then the computation builder would actually contain implementation of the transformations. But in our case, the implementation is going to be empty.

The builder defines For and Yield, which are required for any query. Then we also define Quote operation which instructs the compiler to capture a quotation (so that we do not have to write those explicitly using <@@ .. @@>, which is ugly!) The remaining three operations (discussed below) define custom operations that can appear in the query:

  1: 2: 3: 4: 5: 6: 7: 8: 9: 10: 11: 12: 13: 14: 15: 16: 17: 18: 19: 20: 21: 22: 23: 24: 25: 26: 27: 28: 29:  /// Defines a query builder which lets us write queries containing /// 'for', 'where', 'selectAttr' and 'selectCount' operations. type SimpleQueryBuilder() = /// 'For' and 'Yield' enables the 'for x in xs do ..' syntax member x.For(tz:Query<'T>, f:'T -> Query<'R>) : Query<'R> = NA member x.Yield(v:'T) : Query<'T> = NA /// Instructs the compiler to capture quotation of the query member x.Quote(e:Expr<_>) = e /// Represents filtering of the source using specified condition [] member x.Where ( source:Query<'T>, [] f:'T -> bool ) : Query<'T> = NA /// Represents projection where we select specified properties [] member x.SelectAttrs ( source:Query<'T>, [] f:'T -> 'R) : Query<'R> = NA /// Represents projection where we get the count of values [] member x.SelectCount(source:Query<'T>) : int = failwith "Never executed" /// Global instance of the computation builder let simpleq = SimpleQueryBuilder() 

Aside from Yield and For, which are required and boilerplate operations, there are three custom operations marked with the CustomOperation attribute. You can add any number of those you need. There are a few interesting things:

• the where operation transforms Query<'T> into Query<'T> and so it is marked with MaintainsVariableSpace=true (which basically means that it does not change the type of values that query produces - it only filters them).

• The selectAttrs operation takes a projection as an argument - to write this nicely in the query as selectAttrs (p.Name, p.Age), we need to add the ProjectionParameter attribute.

• The selectCount operation is simpler and has no additional parameters (but you can add any parameter you need, e.g. the standard take operation takes an integer and you'd write take 10).

The snippet ends with a definition of a value simpleq - this is an instance of the query builder that is used when we write simpleq { .. }, so let's now look at a sample query we can write:

 1: 2: 3: 4: 5: 6:  let q = simpleq { for p in DB.People do where (p.Age > 10) where (p.Name = "Tomas") selectCount } 

Because we added the Quote operation, the type of q is Expr<int> which means that it is an expression. Now, in a real example, you'd also add Run operation of type Expr<'T> -> 'T that would evaluate the query. Here, we'll just look at the translation (and I won't discuss how to do the evaluation).

## Defining the query model

Before looking at the query translation, let's have a look at the information we want to extract from the sample query. We want to build a value of the following Query type which describes what kind of operations are performed:

  1: 2: 3: 4: 5: 6: 7: 8: 9: 10: 11: 12: 13: 14: 15: 16: 17: 18: 19: 20:  /// Defines a single condition of the form /// p. type QueryCondition = { Property : string Operator : string Constant : obj } /// Specifies what kind of projection happens at the /// end of query (count or list of projected attributes) type QueryProjection = | SelectAttributes of string list | SelectCount /// A query consits of source (table) name, a list of /// filters specified using where and an optional /// projection at the end. type Query = { Source : string Where : QueryCondition list Select : QueryProjection option } 

The domain model pretty much speaks for itself! We can write one or more where clauses. In this example, we'll recognize only very limited forms of conditions - you have to write property on the left, operator and a constant. The select clause at the end is optional (you don't have to write it in the query) so the Select property in Query is marked with option too.

## Translating queries using quotations

Now we're getting to the most interesting bit of this blog post - how can we take the quotation that represents the query (the q value) and turn it into a Query value which models our query. If you're going through this in F# interactive, you can type q in the FSI and see what the query looks like. It takes some time to understand how to read this, but you might see something like:

 1: 2: 3: 4: 5: 6: 7:  Call (Some (Value (FSI_0080+QueryBuilder)), SelectCount, [Call (Some (Value (FSI_0080+QueryBuilder)), Where, [Call (Some (Value (FSI_0080+QueryBuilder)), For, [PropertyGet (None, People, []), (...) ]), Lambda (p, Call (None, op_GreaterThan, [PropertyGet (Some (p), Age, []), Value (10)]))])]) 

With a bit of effort, you can see the operations that the query calls (in this example, I left just where (p.Age > 10) in the query. The operations in the output appear in a reverse order - so our query ends with a call to SelectCount. Before that, it calls where (with a lambda that you can see on the last three lines as an argument). Finally, the input is People property on line 4.

### Translating the query

The key part of the whole sample is the following translateQuery function. It takes a quotation and checks for calls to one of the supported operations. The function returns a Query value representing the query and it builds the result gradually as it (recursively) walks over the query.

The For operation is where everything starts, because this specifies the source. The projection and filtering operations first call translateQuery recursively (on the nested part of the expression) and then add more information to the Query record:

  1: 2: 3: 4: 5: 6: 7: 8: 9: 10: 11: 12: 13: 14: 15: 16: 17: 18: 19: 20: 21: 22: 23: 24: 25: 26: 27: 28: 29: 30: 31: 32: 33: 34: 35: 36: 37: 38:  open Microsoft.FSharp.Quotations.Patterns open Microsoft.FSharp.Quotations.DerivedPatterns let rec translateQuery e = match e with // simpleq.SelectAttrs(, fun p -> p.Name, p.Age) | SpecificCall <@@ simpleq.SelectAttrs @@> (builder, [tTyp; rTyp], [source; proj]) -> let q = translateQuery source let s = translateProjection proj { q with Select = Some(SelectAttributes s) } // simpleq.SelectCount() | SpecificCall <@@ simpleq.SelectCount @@> (builder, [tTyp], [source]) -> let q = translateQuery source { q with Select = Some SelectCount } // simpleq.Where(, fun p -> p.Age > 10) | SpecificCall <@@ simpleq.Where @@> (builder, [tTyp], [source; cond]) -> let q = translateQuery source let w = translateWhere cond { q with Where = w :: q.Where } // simpleq.For(DB.People, <...>) | SpecificCall <@@ simpleq.For @@> (builder, [tTyp; rTyp], [source; body]) -> let source = // Extract the table name from 'DB.People' match source with | PropertyGet(None, prop, []) when prop.DeclaringType = typeof -> prop.Name | _ -> failwith "Only sources of the form 'DB.' are supported!" { Source = source; Where = []; Select = None } // This should never happen | e -> failwithf "Unsupported query operation: %A" e 

If you are new to quotations and active patterns, then there is quite a lot of going on here. Here is a summary of the important things:

• The translateQuery function is recursive. All of the operations except for For contain nested query (of the same format) as one of the arguments. In the comments, this is indicated by the <source> parameter. So, we always recursively process <source> and then extract information from the other parameter.

• The pattern SpecificCall <@@ ... @@> is a way of checking that the quotation represents piece of code that calls the specified operation. When it matches, it produces a tuple with (builder, [..], args) where the last element are the arguments of the method called (the other are instance and type arguments, but we ignore those).

Now, I omitted the definitions of translateWhere and translateProjection functions, so let's look at those next. These will be a bit simpler, but they'll use additional quotation patterns.

### Translating where clauses

The translateWhere function is parsing a function of the form fun p -> p.Prop <op> <Value>. In quotations, we can use Lambda pattern to look for function and Call for a call to an operator. We check that this is call to an operator by making sure that the name of the method called starts with op_:

  1: 2: 3: 4: 5: 6: 7: 8: 9: 10: 11: 12: 13: 14: 15: 16: 17: 18: 19:  let translateWhere = function | Lambda(var1, Call (None, op, [left; right])) -> match left, right with | PropertyGet(Some (Var var2), prop, []), Value(value, _) when var1.Name = var2.Name && op.Name.StartsWith("op_") -> // We got 'where' that we understand. Build QueryCondition! { Property = prop.Name Operator = op.Name.Substring(3) Constant = value } | e -> // 'where' with not supported format // (this can happen so report more useful error) failwithf "%s\nGot: %A" ( "Only expressions of the form " + "'p.Prop ' are supported!") e // This should not happen - the parameter is always lambda! | _ -> failwith "Expected lambda expression" 

THe function looks for a fairly complex pattern that represents the valid syntax that can appear after the where clause. Note that the compiler automatically turns where <cond> into a call simplq.Where(..., fun p -> <cond>), so it adds a lambda function that we have to decompose.

### Translating select clauses

Translation of the select clauses is quite similar. Again, we look for a lambda function and then transform the body into a list of properties that should be returned by the projection.

This time, we look for two kinds of body expressions. When you write select p.Name, this is represented as a function containing property access as the body. We turn this into a singleton list. When you write select (p.Name, p.Age), we get a tuple and we turn this into a list of names corresponding to the elements returned by the tuple:

  1: 2: 3: 4: 5: 6: 7: 8: 9: 10: 11: 12: 13: 14: 15: 16: 17: 18: 19: 20: 21: 22: 23:  /// Translate property access (may be in a tuple or not) let translatePropGet varName = function | PropertyGet(Some(Var v), prop, []) // The body is simple projection when v.Name = varName -> prop.Name | e -> // Too complex expression in projection failwithf "%s\nGot: %A" ( "Only expressions of the form " + "'p.Prop' are supported!" ) e /// Translate projection - this handles both of the forms /// (with or without tuple) and calls translatePropGet let translateProjection e = match e with | Lambda(var1, NewTuple args) -> // Translate all tuple items List.map (translatePropGet var1.Name) args | Lambda(var1, arg) -> // There is just one body expression [translatePropGet var1.Name arg] | _ -> failwith "Expected lambda expression" 

So, now that we have all the definitions, you can see how this works. Try calling translateQuery function on the sample q query that we wrote earlier:

 1:  q |> translateQuery 

If you do this, you should see a nice Query value that represents the information you need about the query. For a query with just one where clause, I got the following:

 1: 2: 3: 4: 5: 6:  { Source = "People" Where = [ { Property = "Age" Operator = "GreaterThan" Constant = 10 }] Select = Some SelectCount} 

## Summary

In this blog post, I wrote a minimal example that shows how you can implement a custom query computations for querying data sources like Amazon DynamoDB. This is not the easiest thing you can do with F#, but if you compare this with writing a translator for LINQ expression trees, then it actually does not look so bad - active patterns and pattern matching are super valuable language features here!

One thing I have not done in this article is to add support for evaluating the queries. To do this, you'd need to add Run operation to the query builder, which takes Expr<'T>, extracts the Query value from the expression (we have this bit) and then produces a value of type 'T by actually sending database request (this is what you'd need to add). Aside from that, I think this minimal example shows all the important tricks!

val query : Linq.QueryBuilder

Full name: Microsoft.FSharp.Core.ExtraTopLevelOperators.query
val p : People
type DynamoDB =
static member People : seq<People>

Full name: Query-translation.DynamoDB
property DynamoDB.People: seq<People>
custom operation: where (bool)

Calls Linq.QueryBuilder.Where
People.Age: int
People.Name: string
custom operation: select ('Result)

Calls Linq.QueryBuilder.Select
namespace Microsoft
namespace Microsoft.FSharp
namespace Microsoft.FSharp.Quotations
type Query<'T> = | NA

Full name: Query-translation.Query<_>

Represents a query (but it is never executed!)
union case Query.NA: Query<'T>
type Person =
{Name: string;
Age: int;}

Full name: Query-translation.Person

Sample type that would be ideally imported from database
Person.Name: string
Multiple items
val string : value:'T -> string

Full name: Microsoft.FSharp.Core.Operators.string

--------------------
type string = System.String

Full name: Microsoft.FSharp.Core.string
Person.Age: int
Multiple items
val int : value:'T -> int (requires member op_Explicit)

Full name: Microsoft.FSharp.Core.Operators.int

--------------------
type int = int32

Full name: Microsoft.FSharp.Core.int

--------------------
type int<'Measure> = int

Full name: Microsoft.FSharp.Core.int<_>
type DB =
static member People : Query<Person>

Full name: Query-translation.DB

Models a database with one 'People' table
Multiple items
static member DB.People : Query<Person>

Full name: Query-translation.DB.People

--------------------
type People =
{Name: string;
Age: int;}

Full name: Query-translation.People
Multiple items
type SimpleQueryBuilder =
new : unit -> SimpleQueryBuilder
member For : tz:Query<'T> * f:('T -> Query<'R>) -> Query<'R>
member Quote : e:Expr<'a> -> Expr<'a>
member SelectAttrs : source:Query<'T> * f:('T -> 'R) -> Query<'R>
member SelectCount : source:Query<'T> -> int
member Where : source:Query<'T> * f:('T -> bool) -> Query<'T>
member Yield : v:'T -> Query<'T>

Full name: Query-translation.SimpleQueryBuilder

Defines a query builder which lets us write queries containing
'for', 'where', 'selectAttr' and 'selectCount' operations.

--------------------
new : unit -> SimpleQueryBuilder
val x : SimpleQueryBuilder
member SimpleQueryBuilder.For : tz:Query<'T> * f:('T -> Query<'R>) -> Query<'R>

Full name: Query-translation.SimpleQueryBuilder.For

'For' and 'Yield' enables the 'for x in xs do ..' syntax
val tz : Query<'T>
val f : ('T -> Query<'R>)
member SimpleQueryBuilder.Yield : v:'T -> Query<'T>

Full name: Query-translation.SimpleQueryBuilder.Yield
val v : 'T
Multiple items
member SimpleQueryBuilder.Quote : e:Expr<'a> -> Expr<'a>

Full name: Query-translation.SimpleQueryBuilder.Quote

Instructs the compiler to capture quotation of the query

--------------------
active recognizer Quote: Expr -> Expr option

Full name: Microsoft.FSharp.Quotations.Patterns.( |Quote|_| )
val e : Expr<'a>
Multiple items
type Expr =
override Equals : obj:obj -> bool
member GetFreeVars : unit -> seq<Var>
member Substitute : substitution:(Var -> Expr option) -> Expr
member ToString : full:bool -> string
member CustomAttributes : Expr list
member Type : Type
static member AddressOf : target:Expr -> Expr
static member AddressSet : target:Expr * value:Expr -> Expr
static member Application : functionExpr:Expr * argument:Expr -> Expr
static member Applications : functionExpr:Expr * arguments:Expr list list -> Expr
...

Full name: Microsoft.FSharp.Quotations.Expr

--------------------
type Expr<'T> =
inherit Expr
member Raw : Expr

Full name: Microsoft.FSharp.Quotations.Expr<_>
Multiple items
type CustomOperationAttribute =
inherit Attribute
new : name:string -> CustomOperationAttribute
member AllowIntoPattern : bool
member IsLikeGroupJoin : bool
member IsLikeJoin : bool
member IsLikeZip : bool
member JoinConditionWord : string
member MaintainsVariableSpace : bool
member MaintainsVariableSpaceUsingBind : bool
member Name : string
...

Full name: Microsoft.FSharp.Core.CustomOperationAttribute

--------------------
new : name:string -> CustomOperationAttribute
member SimpleQueryBuilder.Where : source:Query<'T> * f:('T -> bool) -> Query<'T>

Full name: Query-translation.SimpleQueryBuilder.Where

Represents filtering of the source using specified condition
val source : Query<'T>
Multiple items
type ProjectionParameterAttribute =
inherit Attribute
new : unit -> ProjectionParameterAttribute

Full name: Microsoft.FSharp.Core.ProjectionParameterAttribute

--------------------
new : unit -> ProjectionParameterAttribute
val f : ('T -> bool)
type bool = System.Boolean

Full name: Microsoft.FSharp.Core.bool
member SimpleQueryBuilder.SelectAttrs : source:Query<'T> * f:('T -> 'R) -> Query<'R>

Full name: Query-translation.SimpleQueryBuilder.SelectAttrs

Represents projection where we select specified properties
val f : ('T -> 'R)
member SimpleQueryBuilder.SelectCount : source:Query<'T> -> int

Full name: Query-translation.SimpleQueryBuilder.SelectCount

Represents projection where we get the count of values
val failwith : message:string -> 'T

Full name: Microsoft.FSharp.Core.Operators.failwith
val simpleq : SimpleQueryBuilder

Full name: Query-translation.simpleq

Global instance of the computation builder
val q : Expr<int>

Full name: Query-translation.q
val p : Person
property DB.People: Query<Person>
custom operation: where (bool)

Calls SimpleQueryBuilder.Where

Represents filtering of the source using specified condition
custom operation: selectCount

Calls SimpleQueryBuilder.SelectCount

Represents projection where we get the count of values
type QueryCondition =
{Property: string;
Operator: string;
Constant: obj;}

Full name: Query-translation.QueryCondition

Defines a single condition of the form
p.<Property> <op> <Constant>
QueryCondition.Property: string
QueryCondition.Operator: string
QueryCondition.Constant: obj
type obj = System.Object

Full name: Microsoft.FSharp.Core.obj
type QueryProjection =
| SelectAttributes of string list
| SelectCount

Full name: Query-translation.QueryProjection

Specifies what kind of projection happens at the
end of query (count or list of projected attributes)
union case QueryProjection.SelectAttributes: string list -> QueryProjection
type 'T list = List<'T>

Full name: Microsoft.FSharp.Collections.list<_>
union case QueryProjection.SelectCount: QueryProjection
Multiple items
type Query =
{Source: string;
Where: QueryCondition list;
Select: QueryProjection option;}

Full name: Query-translation.Query

A query consits of source (table) name, a list of
filters specified using where and an optional
projection at the end.

--------------------
type Query<'T> = | NA

Full name: Query-translation.Query<_>

Represents a query (but it is never executed!)
Query.Source: string
Query.Where: QueryCondition list
Query.Select: QueryProjection option
type 'T option = Option<'T>

Full name: Microsoft.FSharp.Core.option<_>
union case Option.Some: Value: 'T -> Option<'T>
union case Option.None: Option<'T>
val translateWhere : _arg1:Expr -> QueryCondition

Full name: Query-translation.translateWhere
active recognizer Lambda: Expr -> (Var * Expr) option

Full name: Microsoft.FSharp.Quotations.Patterns.( |Lambda|_| )
val var1 : Var
active recognizer Call: Expr -> (Expr option * System.Reflection.MethodInfo * Expr list) option

Full name: Microsoft.FSharp.Quotations.Patterns.( |Call|_| )
val op : System.Reflection.MethodInfo
val left : Expr
val right : Expr
active recognizer PropertyGet: Expr -> (Expr option * System.Reflection.PropertyInfo * Expr list) option

Full name: Microsoft.FSharp.Quotations.Patterns.( |PropertyGet|_| )
Multiple items
active recognizer Var: Expr -> Var option

Full name: Microsoft.FSharp.Quotations.Patterns.( |Var|_| )

--------------------
type Var =
interface IComparable
new : name:string * typ:Type * ?isMutable:bool -> Var
member IsMutable : bool
member Name : string
member Type : Type
static member Global : name:string * typ:Type -> Var

Full name: Microsoft.FSharp.Quotations.Var

--------------------
new : name:string * typ:System.Type * ?isMutable:bool -> Var
val var2 : Var
val prop : System.Reflection.PropertyInfo
active recognizer Value: Expr -> (obj * System.Type) option

Full name: Microsoft.FSharp.Quotations.Patterns.( |Value|_| )
val value : obj
property Var.Name: string
property System.Reflection.MemberInfo.Name: string
System.String.StartsWith(value: string) : bool
System.String.StartsWith(value: string, comparisonType: System.StringComparison) : bool
System.String.StartsWith(value: string, ignoreCase: bool, culture: System.Globalization.CultureInfo) : bool
System.String.Substring(startIndex: int) : string
System.String.Substring(startIndex: int, length: int) : string
val e : Expr * Expr
val failwithf : format:Printf.StringFormat<'T,'Result> -> 'T

Full name: Microsoft.FSharp.Core.ExtraTopLevelOperators.failwithf
val translatePropGet : varName:string -> _arg1:Expr -> string

Full name: Query-translation.translatePropGet

Translate property access (may be in a tuple or not)
val varName : string
val v : Var
val e : Expr
val translateProjection : e:Expr -> string list

Full name: Query-translation.translateProjection

Translate projection - this handles both of the forms
(with or without tuple) and calls translatePropGet
active recognizer NewTuple: Expr -> Expr list option

Full name: Microsoft.FSharp.Quotations.Patterns.( |NewTuple|_| )
val args : Expr list
Multiple items
module List

from Microsoft.FSharp.Collections

--------------------
type List<'T> =
| ( [] )
| ( :: ) of Head: 'T * Tail: 'T list
interface IEnumerable
interface IEnumerable<'T>
member GetSlice : startIndex:int option * endIndex:int option -> 'T list
member IsEmpty : bool
member Item : index:int -> 'T with get
member Length : int
member Tail : 'T list
static member Cons : head:'T * tail:'T list -> 'T list
static member Empty : 'T list

Full name: Microsoft.FSharp.Collections.List<_>
val map : mapping:('T -> 'U) -> list:'T list -> 'U list

Full name: Microsoft.FSharp.Collections.List.map
val arg : Expr
module Patterns

from Microsoft.FSharp.Quotations
module DerivedPatterns

from Microsoft.FSharp.Quotations
val translateQuery : e:Expr -> Query

Full name: Query-translation.translateQuery
active recognizer SpecificCall: Expr -> Expr -> (Expr option * System.Type list * Expr list) option

Full name: Microsoft.FSharp.Quotations.DerivedPatterns.( |SpecificCall|_| )
member SimpleQueryBuilder.SelectAttrs : source:Query<'T> * f:('T -> 'R) -> Query<'R>

Represents projection where we select specified properties
val builder : Expr option
val tTyp : System.Type
val rTyp : System.Type
val source : Expr
val proj : Expr
val q : Query
val s : string list
member SimpleQueryBuilder.SelectCount : source:Query<'T> -> int

Represents projection where we get the count of values
member SimpleQueryBuilder.Where : source:Query<'T> * f:('T -> bool) -> Query<'T>

Represents filtering of the source using specified condition
val cond : Expr
val w : QueryCondition
member SimpleQueryBuilder.For : tz:Query<'T> * f:('T -> Query<'R>) -> Query<'R>

'For' and 'Yield' enables the 'for x in xs do ..' syntax
val body : Expr
val source : string
property System.Reflection.MemberInfo.DeclaringType: System.Type
val typeof<'T> : System.Type

Full name: Microsoft.FSharp.Core.Operators.typeof

Published: Tuesday, 7 April 2015, 1:41 PM
Author: Tomas Petricek
Typos: Send me a pull request!
Tags: f#, functional programming, linq