Writing custom F# LINQ query builder
One of the attendees of my virtual F# in Finance course, Stuart recently asked me a pretty advanced question about writing custom queries with F#, because he was interested in writing a nicer querying library for Amazon DynamoDB (his project is here).
This article is also available in Japanese, thanks to Yukitoshi Suzuki!
独自のF# LINQクエリビルダーを作成する
I realized that I don't really know about any good resource about doing this, so I wrote a little sample that shows how you can do this. The idea is that we want to be able to write something like this:
1: 2: 3: 4: 5: |
|
The DynamoDB
could even be a type generated by a type provider (with all the tables available in Dynamo DB).
Now, the above example uses the built-in query
builder, which is extensible, but, as far as I know, you have to
use LINQ expression trees to support it. In this article, I'm going to use an alternative approach with custom
builder (so you would write dynamo { ... }
instead of query { ... }
).
I wanted to write a minimal example showing how to do this, so this blog post is going to be mostly code (unlike my other chatty articles!), but it should give you (and Stuart :-)) some idea how to do this. I was quite intrigued by the idea of having a nice query language for DynamoDB, so I'm hoping that this blog post can help move the project forward!
Defining types and computation builder
First of all, we need to define types that model the database and the query builder that lets us
write something like query { ... }
. In this blog post, I'll call the builder simpleq
. These
could contain some more code, but they do not need to contain any implmentation - we will never
actually execute any of these methods! Instead, we'll capture F# quotation of the query and then
translate that to a simple data structure that describes the query.
So, we define Query<'T>
which represents a query that returns values of type 'T
(but it never
actually does anything). We define a sample type Person
and a type DB
representing the database:
1: 2: 3: 4: 5: 6: 7: 8: 9: 10: 11: 12: 13: |
|
Next, we'll define a computation builder that defines what operations we're allowed to perform in the query. This is an object with some specially named members (and also marked with attributes) that the compiler understands. When we then write something like the query in the introduction, the compiler translates the query into calls to the computation builder operations.
If we were not using quotation to translate the query, then the computation builder would actually contain implementation of the transformations. But in our case, the implementation is going to be empty.
The builder defines For
and Yield
, which are required for any query. Then we also
define Quote
operation which instructs the compiler to capture a quotation (so that
we do not have to write those explicitly using <@@ .. @@>
, which is ugly!)
The remaining three operations (discussed below) define custom operations that can appear
in the query:
1: 2: 3: 4: 5: 6: 7: 8: 9: 10: 11: 12: 13: 14: 15: 16: 17: 18: 19: 20: 21: 22: 23: 24: 25: 26: 27: 28: 29: |
|
Aside from Yield
and For
, which are required and boilerplate operations,
there are three custom operations marked with the CustomOperation
attribute.
You can add any number of those you need. There are a few interesting things:
-
the
where
operation transformsQuery<'T>
intoQuery<'T>
and so it is marked withMaintainsVariableSpace=true
(which basically means that it does not change the type of values that query produces - it only filters them). -
The
selectAttrs
operation takes a projection as an argument - to write this nicely in the query asselectAttrs (p.Name, p.Age)
, we need to add theProjectionParameter
attribute. -
The
selectCount
operation is simpler and has no additional parameters (but you can add any parameter you need, e.g. the standardtake
operation takes an integer and you'd writetake 10
).
The snippet ends with a definition of a value simpleq
- this is an instance
of the query builder that is used when we write simpleq { .. }
, so let's now
look at a sample query we can write:
1: 2: 3: 4: 5: 6: |
|
Because we added the Quote
operation, the type of q
is Expr<int>
which means
that it is an expression. Now, in a real example, you'd also add Run
operation of type
Expr<'T> -> 'T
that would evaluate the query. Here, we'll just look at the translation
(and I won't discuss how to do the evaluation).
Defining the query model
Before looking at the query translation, let's have a look at the information we want
to extract from the sample query. We want to build a value of the following Query
type
which describes what kind of operations are performed:
1: 2: 3: 4: 5: 6: 7: 8: 9: 10: 11: 12: 13: 14: 15: 16: 17: 18: 19: 20: |
|
The domain model pretty much speaks for itself! We can write one or more where
clauses.
In this example, we'll recognize only very limited forms of conditions - you have to write
property on the left, operator and a constant. The select
clause at the end is optional
(you don't have to write it in the query) so the Select
property in Query
is marked with
option
too.
Translating queries using quotations
Now we're getting to the most interesting bit of this blog post - how can we take the
quotation that represents the query (the q
value) and turn it into a Query
value
which models our query. If you're going through this in F# interactive, you can type
q
in the FSI and see what the query looks like. It takes some time to understand how
to read this, but you might see something like:
1: 2: 3: 4: 5: 6: 7: |
|
With a bit of effort, you can see the operations that the query calls (in this example,
I left just where (p.Age > 10)
in the query. The operations in the output appear in
a reverse order - so our query ends with a call to SelectCount
. Before that, it calls
where
(with a lambda that you can see on the last three lines as an argument). Finally,
the input is People
property on line 4.
Translating the query
The key part of the whole sample is the following translateQuery
function. It takes
a quotation and checks for calls to one of the supported operations. The function returns
a Query
value representing the query and it builds the result gradually as it (recursively)
walks over the query.
The For
operation is where everything starts, because this specifies the source. The
projection and filtering operations first call translateQuery
recursively (on the
nested part of the expression) and then add more information to the Query
record:
1: 2: 3: 4: 5: 6: 7: 8: 9: 10: 11: 12: 13: 14: 15: 16: 17: 18: 19: 20: 21: 22: 23: 24: 25: 26: 27: 28: 29: 30: 31: 32: 33: 34: 35: 36: 37: 38: |
|
If you are new to quotations and active patterns, then there is quite a lot of going on here. Here is a summary of the important things:
-
The
translateQuery
function is recursive. All of the operations except forFor
contain nested query (of the same format) as one of the arguments. In the comments, this is indicated by the<source>
parameter. So, we always recursively process<source>
and then extract information from the other parameter. -
The pattern
SpecificCall <@@ ... @@>
is a way of checking that the quotation represents piece of code that calls the specified operation. When it matches, it produces a tuple with(builder, [..], args)
where the last element are the arguments of the method called (the other are instance and type arguments, but we ignore those).
Now, I omitted the definitions of translateWhere
and translateProjection
functions,
so let's look at those next. These will be a bit simpler, but they'll use additional
quotation patterns.
Translating where clauses
The translateWhere
function is parsing a function of the form fun p -> p.Prop <op> <Value>
.
In quotations, we can use Lambda
pattern to look for function and Call
for a call to an
operator. We check that this is call to an operator by making sure that the name of the method
called starts with op_
:
1: 2: 3: 4: 5: 6: 7: 8: 9: 10: 11: 12: 13: 14: 15: 16: 17: 18: 19: |
|
THe function looks for a fairly complex pattern that represents the valid syntax that can appear
after the where
clause. Note that the compiler automatically turns where <cond>
into a call
simplq.Where(..., fun p -> <cond>)
, so it adds a lambda function that we have to decompose.
Translating select clauses
Translation of the select clauses is quite similar. Again, we look for a lambda function and then transform the body into a list of properties that should be returned by the projection.
This time, we look for two kinds of body expressions. When you write select p.Name
, this
is represented as a function containing property access as the body. We turn this into a
singleton list. When you write select (p.Name, p.Age)
, we get a tuple and we turn this
into a list of names corresponding to the elements returned by the tuple:
1: 2: 3: 4: 5: 6: 7: 8: 9: 10: 11: 12: 13: 14: 15: 16: 17: 18: 19: 20: 21: 22: 23: |
|
So, now that we have all the definitions, you can see how this works. Try
calling translateQuery
function on the sample q
query that we wrote earlier:
1:
|
|
If you do this, you should see a nice Query
value that represents the information you
need about the query. For a query with just one where
clause, I got the following:
1: 2: 3: 4: 5: 6: |
|
Summary
In this blog post, I wrote a minimal example that shows how you can implement a custom query computations for querying data sources like Amazon DynamoDB. This is not the easiest thing you can do with F#, but if you compare this with writing a translator for LINQ expression trees, then it actually does not look so bad - active patterns and pattern matching are super valuable language features here!
One thing I have not done in this article is to add support for evaluating the queries.
To do this, you'd need to add Run
operation to the query builder, which takes Expr<'T>
,
extracts the Query
value from the expression (we have this bit) and then produces a
value of type 'T
by actually sending database request (this is what you'd need to add).
Aside from that, I think this minimal example shows all the important tricks!
Full name: Microsoft.FSharp.Core.ExtraTopLevelOperators.query
static member People : seq<People>
Full name: Query-translation.DynamoDB
Calls Linq.QueryBuilder.Where
Calls Linq.QueryBuilder.Select
Full name: Query-translation.Query<_>
Represents a query (but it is never executed!)
{Name: string;
Age: int;}
Full name: Query-translation.Person
Sample type that would be ideally imported from database
val string : value:'T -> string
Full name: Microsoft.FSharp.Core.Operators.string
--------------------
type string = System.String
Full name: Microsoft.FSharp.Core.string
val int : value:'T -> int (requires member op_Explicit)
Full name: Microsoft.FSharp.Core.Operators.int
--------------------
type int = int32
Full name: Microsoft.FSharp.Core.int
--------------------
type int<'Measure> = int
Full name: Microsoft.FSharp.Core.int<_>
static member People : Query<Person>
Full name: Query-translation.DB
Models a database with one 'People' table
static member DB.People : Query<Person>
Full name: Query-translation.DB.People
--------------------
type People =
{Name: string;
Age: int;}
Full name: Query-translation.People
type SimpleQueryBuilder =
new : unit -> SimpleQueryBuilder
member For : tz:Query<'T> * f:('T -> Query<'R>) -> Query<'R>
member Quote : e:Expr<'a> -> Expr<'a>
member SelectAttrs : source:Query<'T> * f:('T -> 'R) -> Query<'R>
member SelectCount : source:Query<'T> -> int
member Where : source:Query<'T> * f:('T -> bool) -> Query<'T>
member Yield : v:'T -> Query<'T>
Full name: Query-translation.SimpleQueryBuilder
Defines a query builder which lets us write queries containing
'for', 'where', 'selectAttr' and 'selectCount' operations.
--------------------
new : unit -> SimpleQueryBuilder
Full name: Query-translation.SimpleQueryBuilder.For
'For' and 'Yield' enables the 'for x in xs do ..' syntax
Full name: Query-translation.SimpleQueryBuilder.Yield
member SimpleQueryBuilder.Quote : e:Expr<'a> -> Expr<'a>
Full name: Query-translation.SimpleQueryBuilder.Quote
Instructs the compiler to capture quotation of the query
--------------------
active recognizer Quote: Expr -> Expr option
Full name: Microsoft.FSharp.Quotations.Patterns.( |Quote|_| )
type Expr =
override Equals : obj:obj -> bool
member GetFreeVars : unit -> seq<Var>
member Substitute : substitution:(Var -> Expr option) -> Expr
member ToString : full:bool -> string
member CustomAttributes : Expr list
member Type : Type
static member AddressOf : target:Expr -> Expr
static member AddressSet : target:Expr * value:Expr -> Expr
static member Application : functionExpr:Expr * argument:Expr -> Expr
static member Applications : functionExpr:Expr * arguments:Expr list list -> Expr
...
Full name: Microsoft.FSharp.Quotations.Expr
--------------------
type Expr<'T> =
inherit Expr
member Raw : Expr
Full name: Microsoft.FSharp.Quotations.Expr<_>
type CustomOperationAttribute =
inherit Attribute
new : name:string -> CustomOperationAttribute
member AllowIntoPattern : bool
member IsLikeGroupJoin : bool
member IsLikeJoin : bool
member IsLikeZip : bool
member JoinConditionWord : string
member MaintainsVariableSpace : bool
member MaintainsVariableSpaceUsingBind : bool
member Name : string
...
Full name: Microsoft.FSharp.Core.CustomOperationAttribute
--------------------
new : name:string -> CustomOperationAttribute
Full name: Query-translation.SimpleQueryBuilder.Where
Represents filtering of the source using specified condition
type ProjectionParameterAttribute =
inherit Attribute
new : unit -> ProjectionParameterAttribute
Full name: Microsoft.FSharp.Core.ProjectionParameterAttribute
--------------------
new : unit -> ProjectionParameterAttribute
Full name: Microsoft.FSharp.Core.bool
Full name: Query-translation.SimpleQueryBuilder.SelectAttrs
Represents projection where we select specified properties
Full name: Query-translation.SimpleQueryBuilder.SelectCount
Represents projection where we get the count of values
Full name: Microsoft.FSharp.Core.Operators.failwith
Full name: Query-translation.simpleq
Global instance of the computation builder
Full name: Query-translation.q
Calls SimpleQueryBuilder.Where
Represents filtering of the source using specified condition
Calls SimpleQueryBuilder.SelectCount
Represents projection where we get the count of values
{Property: string;
Operator: string;
Constant: obj;}
Full name: Query-translation.QueryCondition
Defines a single condition of the form
p.<Property> <op> <Constant>
Full name: Microsoft.FSharp.Core.obj
| SelectAttributes of string list
| SelectCount
Full name: Query-translation.QueryProjection
Specifies what kind of projection happens at the
end of query (count or list of projected attributes)
Full name: Microsoft.FSharp.Collections.list<_>
type Query =
{Source: string;
Where: QueryCondition list;
Select: QueryProjection option;}
Full name: Query-translation.Query
A query consits of source (table) name, a list of
filters specified using `where` and an optional
projection at the end.
--------------------
type Query<'T> = | NA
Full name: Query-translation.Query<_>
Represents a query (but it is never executed!)
Full name: Microsoft.FSharp.Core.option<_>
Full name: Query-translation.translateWhere
Full name: Microsoft.FSharp.Quotations.Patterns.( |Lambda|_| )
Full name: Microsoft.FSharp.Quotations.Patterns.( |Call|_| )
Full name: Microsoft.FSharp.Quotations.Patterns.( |PropertyGet|_| )
active recognizer Var: Expr -> Var option
Full name: Microsoft.FSharp.Quotations.Patterns.( |Var|_| )
--------------------
type Var =
interface IComparable
new : name:string * typ:Type * ?isMutable:bool -> Var
member IsMutable : bool
member Name : string
member Type : Type
static member Global : name:string * typ:Type -> Var
Full name: Microsoft.FSharp.Quotations.Var
--------------------
new : name:string * typ:System.Type * ?isMutable:bool -> Var
Full name: Microsoft.FSharp.Quotations.Patterns.( |Value|_| )
System.String.StartsWith(value: string, comparisonType: System.StringComparison) : bool
System.String.StartsWith(value: string, ignoreCase: bool, culture: System.Globalization.CultureInfo) : bool
System.String.Substring(startIndex: int, length: int) : string
Full name: Microsoft.FSharp.Core.ExtraTopLevelOperators.failwithf
Full name: Query-translation.translatePropGet
Translate property access (may be in a tuple or not)
Full name: Query-translation.translateProjection
Translate projection - this handles both of the forms
(with or without tuple) and calls `translatePropGet`
Full name: Microsoft.FSharp.Quotations.Patterns.( |NewTuple|_| )
module List
from Microsoft.FSharp.Collections
--------------------
type List<'T> =
| ( [] )
| ( :: ) of Head: 'T * Tail: 'T list
interface IEnumerable
interface IEnumerable<'T>
member GetSlice : startIndex:int option * endIndex:int option -> 'T list
member Head : 'T
member IsEmpty : bool
member Item : index:int -> 'T with get
member Length : int
member Tail : 'T list
static member Cons : head:'T * tail:'T list -> 'T list
static member Empty : 'T list
Full name: Microsoft.FSharp.Collections.List<_>
Full name: Microsoft.FSharp.Collections.List.map
from Microsoft.FSharp.Quotations
from Microsoft.FSharp.Quotations
Full name: Query-translation.translateQuery
Full name: Microsoft.FSharp.Quotations.DerivedPatterns.( |SpecificCall|_| )
Represents projection where we select specified properties
Represents projection where we get the count of values
Represents filtering of the source using specified condition
'For' and 'Yield' enables the 'for x in xs do ..' syntax
Full name: Microsoft.FSharp.Core.Operators.typeof
Published: Tuesday, 7 April 2015, 2:41 PM
Author: Tomas Petricek
Typos: Send me a pull request!
Tags: f#, functional programming, linq