TP

# Building LINQ Queries at Runtime in C#

Since the first beta versions of LINQ we could hear comments that it is perfect for queries known at compile-time, however it is not possible to use it for building queries dynamically at runtime. In this article I show that this can be actually done very well for most of the common cases. The solution offered by Microsoft (mentioned in [1]) is to build query from a string, however this has many limitations and it in fact goes completely against what LINQ tries to achieve, which is writing queries in a type-safe way with full compile-time checking. In this article I will first show a few support functions to make the life a bit easier and then we will use them for building two sample applications that allows user to build a query dynamically. The solution is largely motivated by my previous use of F#, where working with “expressions” is possible at more advanced level, however I’ll write about F# later and now let’s get back to C# 3.0.

## Linq Utilities #1: Expandable Queries

First of all, we will need an extension that I wrote some time ago, which allows us to “call functions in LINQ queries” as demonstrated in the following example:

// Lambda expression that calculates the price
Expression<Func<Nwind.Product, decimal?>> calcPrice =
(p) => p.UnitPrice * 1.19m;

// Query that selects products
var q =
from p in db.Products.ToExpandable()
where calcPrice.Expand(p) > 30.0m
select new {
p.ProductName,
OriginalPrice = p.UnitPrice,
ShopPrice = calcPrice.Expand(p) };


## Linq Utilities #2: Type Inference for Anonymous Functions

The second little utility will allow us to use C# implicitly typed variables (using the var keyword) to declare anonymous functions and lambda expressions (i.e. data representation of the anonymous function). As you probably know you, the decision whether anonymous function will be compiled as a Func delegate (that can be invoked) or as an Expression data structure (that can be for example translated to SQL) depends on the variable type:

Func<int,int,int> func = (int a, int b) => a + b;
Expression<Func<int,int,int>> expr = (int a, int b) => a + b;


Obviously the following code looks very confusing:

var func = (int a, int b) => a + b;

The problem is that the compiler doesn’t know if func should be of type Expression or Func and indeed, if you try it you will get a compiler error message saying “Cannot assign lambda expression to an implicitly-typed local variable”. Unfortunately this forces you to write the type, which can be really complicated sometimes and also this way of declaring anonymous functions makes it impossible to write an anonymous function returning C# 3.0 anonymous type:

Func<int,int,???> func = (int a, int b) => new { Sum = a + b, Mul = a * b };

When you want to return anonymous type you can no longer write name of the type (because it is anonymous!), so writing expression like this looks like an impossible task. Luckily we can implement really simple workaround for this problem:

public static class Linq {
// Returns the given anonymous method as a lambda expression
public static Expression<Func<T, R>>
Expr<T, R>(Expression<Func<T, R>> f) {
return f;
}

// Returns the given anonymous function as a Func delegate
public static Func<T, R>
Func<T, R>(Func<T, R> f) {
return f;
}
}

You can see that we declared two simple static methods that just return the argument, however the type of the argument (which is a Func delegate in case of the Func method and Expression data type in case of the Expr method) tells the compiler if we want to build an expression or a delegate. Using this class we can rewrite the previous examples as following:

// Let’s start with the simple examples:
var func = Linq.Func((int a, int b) => a + b);
var expr = Linq.Expr((int a, int b) => a + b);

// .. and using anonymous types is possible as well!
var func = Linq.Func((int a, int b) =>
new { Sum = a + b, Mul = a * b });
var expr = Linq.Expr ((int a, int b) =>
new { Sum = a + b, Mul = a * b });


Note that when using Linq.Func or Linq.Expr you have to explicitly specify the type of the arguments, because the compiler can’t infer this from the type of the variable, however the code is usually shorter anyway and in addition you can use anonymous types!

## Dynamic Queries #1: Selecting Customers

All right, let’s finally look how we can build a LINQ query dynamically at runtime! In the first example we will implement one feature that may be quite common in a lot of applications. We will build a query for selecting customers from the Northwind database and we will let user to choose what property of the Customer type he wants to enter. For simplicity we will give a choice between CompanyName, Country and a ContactName. Once the user selects what property he wants to query, we will ask for the value and run a query that will return all customers whose property selected by the user contains the entered value (as a substring). To implement this we first create a data context and then we write a function that will build a LINQ query once it gets a value entered by the user and a lambda expression representing a function that reads the selected property from of the customer. It may sound a bit confusing, but I’ll explain everything soon:

NorthwindDataContext db = new NorthwindDataContext();

var queryBuilder = Linq.Func
((Expression<Func<Customer, string>> selector, string val) =>
from c in db.Customers.ToExpandable()
where selector.Expand(c).IndexOf(val) != -1
select c);


Before digging deeper, let’s look how the queryBuilder function can be used:

var q = queryBuilder(c => c.CompanyName, "ab");
foreach (var cust in q)
Console.WriteLine(cust);


So what exactly happens here? We built a function (or maybe parameterized query would be better name) called queryBuilder that has two arguments. The second argument is a value that the user entered and the first argument is a lambda expression (named selector) that specifies what property of the Customer type we want to look for – when we give it an anonymous method that returns company name of a customer as an argument it will build a query that returns all customers with the specified company name.

When you run the code, it executes the queryBuilder function with the lambda expression reading company name and a value “ab”. The queryBuilder then returns a query that calls the lambda expression using the Expand method which I talked about earlier (in Linq Utilities #1). It is also important to note that the query uses the ToExpandable method, because this has to be present anywhere where we want to use Expand to call some lambda expression in a LINQ query.

Let’s now finish the sample and write a code that actually allows user to choose how he wants to query the Customers data table:

var dict = new Dictionary<string, Expression<Func<Customer, string>>>
{ { "CompanyName", c => c.CompanyName },
{ "Country",     c => c.Country },
{ "ContactName", c => c.ContactName } };

// Get the input from the user
Console.Write("Filter - field name (CompanyName, ContactName, Country):\n> ");

Console.Write("Filter - value:\n> ");

// Build the query using query builder and run it
var q = queryBuilder(dict[field], value);
foreach (var cust in q)
Console.WriteLine(cust);


The application (just a simple console program) creates a dictionary that contains string as a key and a lambda expression for reading specific properties of a customer as a value. When user enters a string representing the property, it looks it up in the dictionary and gets a lambda expression that can be given as an argument to our parameterized query declared earlier. And that’s all! You could very easily use the same principle in a Windows Forms applications and let user select the property for example from a drop down list, which is probably the most common scenario for building LINQ queries at runtime.

## Dynamic Queries #2: Combining Functions Using “OR” and “AND”

In this example, we will write slightly more complex application that builds LINQ query dynamically. It will allow user to choose if he wants to combine several criteria using AND or OR logical operator and then enter the values to find for every property (again, we use Customer table from a Northwind database). This means that the user we will be able to enter queries like (CompanyName = "Google") OR (Country = "Czech Republic") or (CompanyName = "Microsoft") AND (Country = "USA"). The key aspect in this example is that query is combined from several basic conditions without knowing the number of these conditions in advance, so this principle could be used when you want to allow the user to add any number of conditions he wants.

We will again need a dictionary for mapping string (entered by the user) to a lambda expression returning the property of the Customer object. To make the code a bit more readable then in the previous example we will define an abbreviation for this rather complex type, which can be done thanks to the C# using construct (unfortunately there are a few limitations, so we have to use fully qualified namespace names):

using CustomerPropSelector = System.Linq.Expressions
.Expression<System.Func<EeekSoft.LinqDemos.Customer, string>>;


Now we can use the CustomerPropSelector type as a value in a dictionary (it is actually the same dictionary as the one we used in the previous example):

NorthwindDataContext db = new NorthwindDataContext();
var dict = new Dictionary<string, CustomerPropSelector>
{ { "CompanyName", c => c.CompanyName },
{ "Country",     c => c.Country },
{ "ContactName", c => c.ContactName } };


The next type abbreviation that we will use in the code will represent a condition telling whether specified customer matches the specified criteria or not. The condition is a lambda expression taking a Customer object as an argument and returning a boolean value:

using CustomerCondition = System.Linq.Expressions
.Expression<System.Func<EeekSoft.LinqDemos.Customer, bool>>;


We can now define two primitive conditions that we will need later - to define a condition we use the C# anonymous function syntax, so you can easily write a condition that for example tests whether a customer is from UK or something like that. The two primitive conditions that we need are however really trivial - one of them returns true for every customer and the other false:

CustomerCondition trueCond  = (c) => true;
CustomerCondition falseCond = (c) => false;


Now, we will need a way for combining two conditions, so we can write a combinator that can be used to produce a condition from two of them. We will use two such combinators later. One will be OR which takes two conditions and returns true when any of them returns true and the other will be AND which will return true when both conditions return true, as you would expect. The interesting is the type of these combinators - it is a function taking two conditions and returning a condition as a result, so we could write them as Func<CustomerCondition, CustomerCondition, CustomerCondition>. Unfortunately, the C# using statement doesn't allow using nested abbreviations, so we have to replace CustomerCondition in the previous type and we'll get somewhat ugly type:

using CustomerConditionCombinator =
System.Func<System.Linq.Expressions.Expression<System.Func<EeekSoft.LinqDemos.Customer, bool>>,
System.Linq.Expressions.Expression<System.Func<EeekSoft.LinqDemos.Customer, bool>>,
System.Linq.Expressions.Expression<System.Func<EeekSoft.LinqDemos.Customer, bool>>>;


Despite the fact that C# forces us to write this in a really ugly way, it isn't that difficult to understand what it actually does - it just takes two conditions (which are actually lambda expressions returning a boolean value when given Customer as an argument) and returns a condition that can call these two conditions and produces boolean as a result, very likely by performing some logical operation on the values returned by these two conditions. We can now write two basic combinators for logical OR and AND. The combinator receives two conditions (f and g) as arguments, and returns a new lambda expression as the result. This lambda expression takes a Customer value as an argument, calls the two conditions using the Expand method and performs the logical operation on the result:

CustomerConditionCombinator combineOr =
(f, g) => (c) => f.Expand(c) || g.Expand(c);

CustomerConditionCombinator combineAnd =
(f, g) => (c) => f.Expand(c) && g.Expand(c);


Now we have everything we need to finally start building the query! Let me first demonstrate how we can build a query using the combineOr combinator from two simple conditions, one testing whether the customer is from UK and the other testing if it's from Seattle:

// Define two simple conditions first
CustomerCondition isUk = (c) => c.Country == "UK";
CustomerCondition isSeattle = (c) => c.City == "Seattle"

// Combine the conditions using OR
CustomerCondition expr = combineOr(isUk, isSeattle);

// Build a query (using 'Expand' and 'ToExpandable')
var q = from c in db.Customers.ToExpandable()
where expr.Expand(c) select c;


If you run this example, it will indeed return all customers that are either from UK or live in Seattle. Note that we still have to use the Expand to call the condition we built in the LINQ query and it is also important not to forget the ToExpandable call, otherwise the “LINQ to SQL” will give an error message that it doesn't understand the query.

Let's now move to the final example - here we will actually let user construct a query he wants. Let's assume we have a value generateOr saying if we want to build a query by combinating several conditions using OR or using AND. To build a query we will start with one of our primitive conditions and add a condition using the combinator for every property of the customer. This means that when combining conditions using OR we will start with a query representing "false", than we will append first condition and get something like "(false || cond1)", after adding second condition we will get "(false || cond1) || cond2" and so on. You can see that when building a sequence of ORs we need to start with falseCond, so that the condition returns true only when any of the conditions succeeds and when building sequence of ANDs, we have to start with trueCond, so that the condition succeeds when all conditions return true.

The code to build a condition in the way described in the previous paragraph can be written as a foreach loop (working with the KeyValuePair from the dictionary declared above) that modifies the variable holding the current expression (expr) in every iteration:

// Select the combinator and initial condition
CustomerConditionCombinator combinator =
generateOr ? combineOr : combineAnd;
CustomerCondition expr =
generateOr ? falseCond : trueCond;

// Loop over all properties in the dictionary
foreach (var item in dict)
{
// Read the value from the user
Console.Write("Enter value for '{0}':\n> ", item.Key);
CustomerPropSelector propSelector = item.Value;

// Build a condition testing whether entered value is a substring
CustomerCondition currentCond = (c) =>
propSelector.Expand(c).IndexOf(enteredVal) != -1;

// Finally, combine expressions using the combinator
expr = combinator(expr, currentCond);
}


Since C# 3.0 and LINQ in general use some of the concepts from functional programming, we can actually used one nice functional trick to write the code in a slightly shorter and less error-prone way. We can use a query operator called Fold (it was part of the LINQ in earlier version, but it seems to be missing now, so I added it to my LINQ Extensions [3] project). What does this operation do? It takes some initial value and runs a specified function to combine the value with every element in the sequence step by step. In our case the initial value is a condition (either trueCond or falseCond) and the function to combine values is essentially the body of the foreach loop. This means that function will ask user for the value and return the value newly produced by the combinator:

CustomerConditionCombinator combinator =
generateOr ? combineOr : combineAnd;

// Build expression using the 'Fold' aggregate operator
var expr = dict.Fold((e, item) => {
// Read the value from the user
Console.Write("Enter value for '{0}':\n> ", item.Key);
CustomerPropSelector propSelector = item.Value;

// Build a condition testing whether entered value is a substring
CustomerCondition currentCond = (c) =>
propSelector.Expand(c).IndexOf(enteredVal) != -1;

// Finally, return the combined expression
return combinator(e, currentCond);
}, generateOr ? falseCond : trueCond);


Finally, we can use a simple LINQ query to run the generated condition (expr variable) - as mentioned in other places of this article, we can call it using Expand and we shouldn't forgot ToExpandable call which processes our combined expression before calling the “LINQ to SQL” translator:

var q = from c in db.Customers.ToExpandable()
where expr.Expand(c) select c;


## Summary

I hope this article convinced you that there is quite reasonable way for building queries dynamically at runtime in LINQ, which is much safer and less error-prone than building a string. Even though it may look complicated at the beginning after defining a few primitive constructs the code for building queries looks reasonable and is also extensible, so you're not limited to combining conditions sequentially using only two operators as I demonstrated in the example, in fact it could be easily modified to build for example a tree representing more complex logical combinations of conditions. To run this code you'll need a few (not very difficult to write, but important) extensions that are part of my LINQ Extensions project available at CodePlex. Finally, I'd like to give a credit to the authors of the F# language where this kind of expression processing is natively supported, because without my experiences with F# I probably wouldn't think of building queries in a way I present it in this article.