Tomas Petricek

Searching for new ways of thinking in programming & working with data

I believe that the most interesting work is not the one solving hard problems, but the one changing how we think about the world. I follow this belief in my work on data science tools, functional programming and F# teaching, in my programming languages research and I try to understand it through philosophy of science.

The Gamma

I'm working on making data-driven storytelling easier, more open and reproducible at the Alan Turing Institute.

Consulting

I'm author of definitive F# books and open-source libraries. I offer my F# training and consulting services as part of fsharpWorks.

Academic

I published papers about theory of context-aware programming languages, type providers, but also philosophy of science.

Tomas Petricek
  • Tomas Petricek
  • Home
  • F# Trainings
  • Talks and books
  • The Gamma
  • Academic

LINQ extensions - Simplified keyword search

Recently, I came across interesting question at LINQ Forums (Dynamic conditions: How to achieve multiple "OR" conditions with LINQ? [1]). The question is whether LINQ (and especially LINQ to SQL) provides any simple way to return only records that contain one or more of specified keywords in the name. The question looks simple, but it is simple only if you know the number of keywords that you want to look for. In this case you can write following LINQ query:

  // Products that contain "kwd1" or "kwd2" in the name 
var q = from p in db.Products where p.ProductName.Contains("kwd1") || p.ProductName.Contains("kwd2") select p;

The problem with previous code is that you can't use it if the list of keywords is dynamically entered by user (and so its length may vary). Of course, if you want to run query on in-memory data, you can get very nice results by writing extension method called ContainsAny that performs test for keyword array, but if you want to be able to translate query to SQL, the situation is a bit complicated.

Building expression tree

The first solution is to build the whole expression tree that is used for filtering (in the where clause) using Expression object. The code that builds expression tree is much longer, because it must generate expression for calling method Contains for every keyword in the array. The following example shows how to build test for only two keywords for simplicity:

// build as many conditions as you need
Expression t1 = Expression.CallVirtual(
   typeof(string).GetMethod("Contains"),
   Expression.Property(p, typeof(Product).GetProperty("ProductName")),
   new Expression[] { Expression.Constant("Chef") });
Expression t2 = Expression.CallVirtual(
   typeof(string).GetMethod("Contains"), 
   Expression.Property(p, typeof(Product).GetProperty("ProductName")),
   new Expression[] { Expression.Constant("Sir") });

// merge conditions using Expression.Or

var test = Expression.Lambda<Func<Product, bool>> (Expression.Or(t1, t2), new ParameterExpression[] { p }); var filter = Queryable.Where(db.Products, test); // Execute the query
var q = from prod in filter select prod;

You might also think of using Queryable.Union method that returns union of results from several queries. Generating query using this method is possible and the code is less complicated, but the resulting SQL command generated by the LINQ to SQL is very poor and it would be obviously slower. That's definitely not an issue in LINQ, because the query created using Union is very far from natural query that you would normally use.

Better solution

The previous solution can be used for generating any expression tree, so there will be always certain situations, when this would be the only possible way, however I think that these two methods (ContainsAny and ContainsAll) will be used very often. I already wrote an extension to LINQ that allows you to 'call' expression tree in query [3], so I used this project and I added support for the following two methods:

using EeekSoft.Query;

// Select products, that conain one of values from array in their name
var q1 = from p in db.Products.ToExpandable()
where p.ProductName.ContainsAny("Sir", "Chef")
select new { p.ProductName }; // Select products, that conain all of values from array in their name
var q1 = from p in db.Products.ToExpandable()
where p.ProductName.ContainsAll("Gu", "Ca")
select new { p.ProductName };

To be able to write this code, you have to download the library EeekSoft.Query first. [See Downloads]

How does it work?

I already described the method that I used here for extending the LINQ in my previous article [3], so you can refer to it for more details. The important point is the call to the ToExpandable extension method. This method returns IQueryable wrapper around the original data source (data table in this case). When query is executed, this wrapper is called and it modifies the expression tree before it is passed to LINQ to SQL converter, so the methods like ContainsAny are replaced with some other expression that the LINQ to SQL can understand.

References and downloads

  • Download code samples (29kB)
  • [1] Dynamic conditions: How to achieve multiple "OR" conditions with LINQ? [^] - LINQ Project General - MSDN Forums (by marcioesteves)
  • [2] The LINQ Project [^] - Microsoft.com
  • [3] Calling functions in LINQ queries [^] - Tomasp.net

Published: Friday, 28 July 2006, 3:57 AM
Author: Tomas Petricek
Typos: Send me a pull request!
Tags: c#

Contact & about

This site is hosted on GitHub and is generated using F# Formatting and DotLiquid. For more info, see the website source on GitHub.

Please submit issues & corrections on GitHub. Use pull requests for minor corrections only.

  • Twitter: @tomaspetricek
  • GitHub: @tpetricek
  • Email me: tomas@tomasp.net

Blog archives

October 2020 (1),  July 2020 (1),  April 2020 (2),  December 2019 (1),  February 2019 (1),  November 2018 (1),  October 2018 (1),  May 2018 (1),  September 2017 (1),  June 2017 (1),  April 2017 (1),  March 2017 (2),  January 2017 (1),  October 2016 (1),  September 2016 (2),  August 2016 (1),  July 2016 (1),  May 2016 (2),  April 2016 (1),  December 2015 (2),  November 2015 (1),  September 2015 (3),  July 2015 (1),  June 2015 (1),  May 2015 (2),  April 2015 (3),  March 2015 (2),  February 2015 (1),  January 2015 (2),  December 2014 (1),  May 2014 (3),  April 2014 (2),  March 2014 (1),  January 2014 (2),  December 2013 (1),  November 2013 (1),  October 2013 (1),  September 2013 (1),  August 2013 (2),  May 2013 (1),  April 2013 (1),  March 2013 (1),  February 2013 (1),  January 2013 (1),  December 2012 (2),  October 2012 (1),  August 2012 (3),  June 2012 (2),  April 2012 (1),  March 2012 (4),  February 2012 (5),  January 2012 (2),  November 2011 (5),  August 2011 (3),  July 2011 (2),  June 2011 (2),  May 2011 (2),  March 2011 (4),  December 2010 (1),  November 2010 (6),  October 2010 (6),  September 2010 (4),  July 2010 (3),  June 2010 (2),  May 2010 (1),  February 2010 (2),  January 2010 (3),  December 2009 (3),  July 2009 (1),  June 2009 (3),  May 2009 (2),  April 2009 (1),  March 2009 (2),  February 2009 (1),  December 2008 (1),  November 2008 (5),  October 2008 (1),  September 2008 (1),  June 2008 (1),  March 2008 (3),  February 2008 (1),  December 2007 (2),  November 2007 (6),  October 2007 (1),  September 2007 (1),  August 2007 (1),  July 2007 (2),  April 2007 (2),  March 2007 (2),  February 2007 (3),  January 2007 (2),  November 2006 (1),  October 2006 (3),  August 2006 (2),  July 2006 (1),  June 2006 (3),  May 2006 (2),  April 2006 (2),  December 2005 (1),  July 2005 (4),  June 2005 (5),  May 2005 (1),  April 2005 (3),  March 2005 (3),  January 2005 (1),  December 2004 (3),  November 2004 (2), 

License

Unless explicitly mentioned, all articles on this site are licensed under Creative Commons Attribution Share Alike. All source code samples are licensed under the MIT License.

CC License logo