Using custom grouping operators in LINQ

You can use LINQ to write queries that perform grouping of data using group by or ordering of data using orderby clause. LINQ provides the default (and the most common) implementation of both of the operations, but sometimes you may need a slightly different behavior when grouping or ordering data (this article is motivated by a question on StackOverflow which needs to do exactly that for grouping).

Let's look at a simple example, which shows when we may need a different behavior when grouping data. For example, we may have the following list of stock trades containing a name of a stock and the price of the trade (stored for example as a list of TradeInfo classes with properties Name and Price):

{ { Name = "MSFT", Price = 80.00 },
  { Name = "MSFT", Price = 70.00 },
  { Name = "GOOG", Price = 100.00 },
  { Name = "GOOG", Price = 200.00 },
  { Name = "GOOG", Price = 300.00 },
  { Name = "MSFT", Price = 30.00 },
  { Name = "MSFT", Price = 20.00 } }

Now, we may want to group adjacent trades into a single summary record which will contain the name of the stock (which is same for all trades in each group), the number of trades in the group and an average price in the group. The desired results are:

{ { Name = "MSFT", Count = 2, AvgPrice = 75.00 },
  { Name = "GOOG", Count = 3, AvgPrice = 200.00 },
  { Name = "MSFT", Count = 2, AvgPrice = 25.00 } }

The operation that we want to do is very similar to group by in LINQ, but it doesn't do quite the same thing! If we used group by, we would get only two groups as the result. However, as I wrote earlier, we want to group only adjacent trades. You could write your own extension method to do this, but then you need to leave the elegant LINQ query syntax. In this article, I'll show you how to get the desired results using a simple LINQ query with a group by clause...

Read the complete article (English)
Sunday, February 07, 2010

Source code for Real World Functional Programming available!

As you can see, there has been quite a bit of silence on this blog for a while. There are two reasons for that - the first is that I'm still working on the book Real World Functional Programming, so all my writing activities are fully dedicated to the book. The second reason is that I'm having a great time doing an internship in the Programming Principles and Tools group at Microsoft Research in Cambridge with the F# team and namely the F# language designer Don Syme. The photo on the left side is the entrance gate to the Trinity College of the Cambridge University taken during the few days when there was a snow. I recently started using Live Gallery, so you can find more photos from Cambridge in my online gallery. Anyway, I just wanted to post a quick update with some information (and downloads!) related to the book...

Read the complete article (English)
Thursday, February 12, 2009

Functional Programming in .NET using C# and F# (Manning Greenpaper)

Functional programming languages have been around for a while and were always astonishing for their ability to express the ideas in a succinct, declarative way allowing the developer to focus on the essence of problem rather than on technical details of the solution. Recently, functional paradigm is gaining new prominence as an efficient way to handle development of multi-processor, parallel and asynchronous applications.

Functional ideas are arising in C# as well as in other main-stream languages and functional languages are becoming an alternative for real-world projects. Also, Microsoft recently introduced a new language called F#, which has a strong background in traditional functional languages, but as a .NET language also benefits from the rich .NET and Visual Studio ecosystem.

Book cover
Available via MEAP | 500 pages
Softbound print: March 2009 (est.)

This article is partially an excerpt from my book Real-world Functional Programming in .NET [1]. Thanks to my editors at Manning I have the permission to publish it on my blog. We’ll look at several aspects of functional programming and how the same concepts, which are essential for the functional paradigm, look in the F# and in C# 3.0 with LINQ. We will shortly look at the basic programming language features like lambda functions and type inference that are now available in both F# and C#. Functional programming isn’t only about language features, but also about using different programming style, so we’ll look at some high level concepts as well. These include using immutable data structures for developing code that can be executed in parallel and writing code in a more declarative style.

Thanks to the combination of C# 3.0 and F#, this article shows the ideas in a way that should be familiar to you in C#, but also shows a further step that you can take with a primarilly functional language F#. If you're a .NET developer and you want to understand what functional programming is and how it can help you to become better and more productive then continue reading. If you'll find this article interesting, then don't forget to check out the book, which explains everything in larger detail and discusses many other interesting ideas.

Read the complete article (English)
Thursday, December 11, 2008

Reactive Programming (IV.) - Developing a game in Reactive LINQ

In this part of the article series about Reactive LINQwe're going to implement a slightly more complicated application using the library that I introduced in the previous three articles. We're going to use basic event stream queries from the second article as well as advanced operators introduced in the third part. This time, I'll also show the F# version of all the examples, so we're going to build on the ideas from the first part.

I originally wanted to write the demo only in Visual Basic, because I think that it is really amazig to show an idea that came from functional programming in a language that no one (maybe until recently) connects with functional programming. Then I realized that I really want to show the F# version too, because F# was an inspiration for the whole Reactive LINQ idea and it is interesting alone as well. But finally, I thought that don't showing the C# version may look offensive to many readers (especially since I'm still C# MVP...). So, I ended up writing the game in all three languages, but the code is surprisingly similar in all of them!

Read the complete article (English)
Monday, November 24, 2008

Reactive Programming (III.) - Useful Reactive LINQ Operators

In the previous article, I introduced Reactive LINQ. I explained the different point of view that we can use when working with .NET events. The idea is that .NET events can be viewed as streams of values. The value is information about the event (such as position of a mouse click or a mouse movement). These streams can be processed using LINQ queries - we can for example filter all values that are not interesting for us using where LINQ clause. For example if we want to handle clicks only in some specified area.

In the previous article, I talked about basic LINQ query operators such as select and where and some useful methods that Reactive LINQ provides (for example for merging event streams). Today, we'll take a look at two more advanced kinds of operations that we can use for working with event streams. In particular, we'll talk about aggregation operators (that you certainly know from LINQ) and about switching. Switching is a concept from functional reactive programming and it allows us to change dynamically how the application behaves. However, I'll explain this in a more detail soon.

In this article, I'm going to use mostly C# (and some Visual Basic). The functionality that I'm describing in this part isn't part of the standard F# Event module that I discussed in the first part. I implemented most of them in F# too, but I'm not going to write the samples in both of the versions in this part. If you've seen the first two articles, you'll be definitely able to use the F# versions as well, because they follow exactly the same ideas as the C#/LINQ versions. I'm going to talk about a larger demo application in the last section and I'll show an F# version as well, so you'll see some F# examples in the next part. This part serves more as a reference of the available operators, so you may read only some parts of it, then jump to the last one (to see an exciting example!) and then come back here.

Read the complete article (English)
Friday, November 21, 2008

Reactive programming (II.) - Introducing Reactive LINQ

In this article I'm going to introduce my project Reactive LINQ. This is largely inspired by the ideas that come from functional reactive programming in the Haskell language and from functionality that's available for working in events in F#. I introduced these ideas in my previous article in this mini-series, so if you're interested in learning more about these interesting technologies, you should definitely read the previous article about First class events in F#.

In this article, we're going to look how to apply the same ideas to the C# language and how to use LINQ queries for processing events. This article is just another my article that shows how to implement some idea from functional programmin in C#. I believe this will once again show that the new features in the C# 3.0 aren't just about querying databases, but are more widely useful even in situations that are not directly related to data-processing.

Read the complete article (English)
Wednesday, November 19, 2008

Reactive programming (I.) - First class events in F#

I believe that the LINQ project and changes in C# 3.0 and VB 9 are interesting because they allow rewriting of many ideas from functional programming. An ability to express queries easily is one of these ideas, but it is definitely not the only one. There are many other interesting ideas. The C# 3.0 language isn't primary a functional language, so it isn't easy to discover the idea if you use only C#, but it is possible to implement it if you know the idea already.

I already wrote a few interesting C# examples that were inspired by some functional idea. I'm a big fan of the F# language, so it is not a surprise that I started with an F# version of the problem and then looked at the way to do the same thing in C#. In particular, this is how my article about building dynamic queries in C# came to the existence - the F# version used FLINQ and Quotations and then I demonstrated how to do the same in C# using expression trees. Another example is my article about asynchronous programming in C# using iterators, which shows how to implement something like F# asynchronous workflows using iterators in C# 2.0.

Functional Reactive Programming

Today, I'm going to look at another very interesting idea from functional programming. It is called Functional Reactive Programming and it comes from the Haskell community. You can find a list of related Haskell projects here. However, similar things (though they are not purely functional and simplified) are available in the F# language as well. Don Syme introduced them in his blog post called F# First Class Events: Simplicity and Compositionality in Imperative Reactive Programming. In this article, I'm going to briefly introduce the implementation available in F# and I'll extend it a little bit to allow some more interesting things. In the second article from this series, I'll show how to implement the same thing in C# 3.0 (and in VB 9 too!)

Read the complete article (English)
Sunday, November 16, 2008

Calculating with infinite sequences on MSDN

About a year ago, I wrote an article about using lazy computations in C# 3.0. It was published by the C# Community PM Charlie Calvert at the C# Developer Center. The article was a first of two articles where I wanted to demonstrate that C# 3.0 can be used for implementing useful constructs known from functional languages. I realized that I never posted the link to the second article to my blog, so you can find the annotation and link below.

However, I remembered about these two articles because I was just working on chapters 11 and 12 of the Real-world Functional Programming in .NET book that I’m writing. Lazy values, which were the topic of my first article, are discussed in the second part of chapter 11 and IEnumerable and F# sequences are the topic for the first part of chapter 12. Because I already wrote two articles on this topic, I had to think really hard to find better (and still simple enough) examples where these concepts are useful in practice. I also finally have enough space to show how these two concepts relate and talk about some interesting background – for example in Haskell, lazy sequences are in fact just ordinary lists that are lazy thanks to the Haskell nature.

A year ago, I definitely wouldn’t believe that today, I’ll be writing about the same topics, but this time as part of a book that has partly the same goal as these two articles – to show that functional programming ideas are really useful in the real-world and can enrich your programming toolbox (no matter whether you’re using C# or F# language). Anyway, here is the link to the second article – as usual when I look at something that I worked on a long time ago, I think I should rewrite it to make it better :-), but it still gives you an idea what is the book that I’m working on about...

Read the complete article (English)
Thursday, November 13, 2008

Functional Programming in .NET book - An update

Recently, I announced on my blog that I’m working on a book for Manning called Real world Functional Programming in .NET. The goal of the book is to explain the most interesting and useful ideas of functional programming to a real world C# developer. I'm writing this book, because I believe that functional programming is becoming increasingly important. Here is a couple of reasons why you should have this book on your bookshelf:

  • Ideas behind C# 3.0 and LINQ - these main-stream technologies are inspired by functional programming and the new C# 3.0 features give us definitely much more than just a new way to query databases. The book explains the ideas behind these features and shows how to use them more efficiently.
  • Learning the F# language - F# is becoming a first-class citizen in the Visual Studio family of languages, which alone would be a good reason for learning it! Even if you're not going to use it for your next large .NET project, you'll find it useful for quick prototyping of ideas and testing how .NET libraries work thanks to the great interactive tools.
  • Real world examples - the book includes a large set of real-world examples that show how to develop real applications in a functional way - both in F# and C#. Among other things, the examples show how to utilize multi-core CPUs, how to better obtain and process data and how to implement animations and GUI applications in a functional way.

The book is available via the MEAP (Manning Early Access Program) and if you want to get a better idea what is the book about, you can read the first chapter for free. Anyway, it is more than a month since I posted the announcement, so I decided to write a brief update....

Read the complete article (English)
Monday, October 20, 2008

Asynchronous Programming in C# using Iterators

In this article we will look how to write programs that perform asynchronous operations without the typical inversion of control. To briefly introduce what I mean by 'asynchronous' and 'inversion of control' - asynchronous refers to programs that perform some long running operations that don't necessary block a calling thread, for example accessing the network, calling web services or performing any other I/O operation in general. The inversion of control refers to the code structure that you have to use when writing a code that explicitly passes a C# delegate as a callback to the asynchronous method (typically called BeginSomething in .NET). The asynchronous method calls the delegate when the operation completes, which reverses the way you write the code - instead of encoding the control flow using typical language constructs (e.g. while loop) you have to use global variables and write your own control mechanism.

The funny thing about this article is that it could have been written at least 3 years ago when a beta version of Visual Studio 2005 and C# 2.0 became first available, but it is using iterators in a slightly bizarre way, so it is not easy to realize that this is possible. Actually, I will use some C# 3.0 methods in the article as well, but only extension methods and mainly just to keep the code nicer. As with my earlier article about building LINQ queries at runtime, I realized that it can be done in C# when I was playing with the F# solution (called F# Asynchronous Workflows), where this approach is very natural, so I will shortly mention the F# implementation as well.

Read the complete article (English)
Thursday, November 15, 2007

Lazy Computation in C# on MSDN

I think that one of the interesting things about C# 3.0 is that it gives you the ability to use many techniques known from functional languages (like Haskell or F#). Most of the articles about C# 3.0 and LINQ focus on the queries and LINQ to SQL, but I believe that using these functional techniques deserve some attention as well. This is why I'm very happy that my article about one of these techniques - representing lazy computations - is now available at the C# Developer Center. I would like to thank to Charlie Calvert [^], who is the Community Program Manager for C# and who edited and published my article there. Here is the annotation:

Most of the programming languages used in practice (including for example C#, VB.NET, C++, Python or Java) employ so called eager evaluation, which means that the program evaluates all expression and statements in the order in which they are written, so all the preceding statements and expressions are evaluated before executing the next piece of code. This, for example, means that all arguments to a method call are evaluated before calling the method. Sometimes it may be useful to delay an execution of some code until the result is actually needed, either because the result may not be needed at all (but we can’t tell that before executing some computation) or because we don’t want to block the program for a long time by executing all computations in advance and instead we want to execute the computations later, when we will actually need the result.

In this article we will look how these lazy computations can be written in C# (using some of the new language features from version 3.0). We will first implement a Lazy class to represent this kind of computation, then look at a few simple examples to demonstrate how the class can be used, and finally we will examine one slightly more complicated, but practically useful application.

You can read the complete article here: Lazy Computation in C# [^]

Permanent link & comments (English)
Saturday, October 06, 2007

Building LINQ Queries at Runtime in F#

In an article about building LINQ queries at runtime in C# 3.0, I described how you can build a LINQ query dynamically, for example by combining a set of conditions using the 'or' operator in the where clause. I mentioned that the way I implemented it is largely influenced by the F# language, which provides very natural way for manipulations with code like this. In this article I will first shortly introduce FLINQ sample, which is an F# library implementing LINQ support and than I will implement the same examples I presented in the earlier article in F#.

Read the complete article (English)
Saturday, August 18, 2007

Building LINQ Queries at Runtime in C#

Since the first beta versions of LINQ we could hear comments that it is perfect for queries known at compile-time, however it is not possible to use it for building queries dynamically at runtime. In this article I show that this can be actually done very well for most of the common cases. The solution offered by Microsoft (mentioned in [1]) is to build query from a string, however this has many limitations and it in fact goes completely against what LINQ tries to achieve, which is writing queries in a type-safe way with full compile-time checking. In this article I will first show a few support functions to make the life a bit easier and then we will use them for building two sample applications that allows user to build a query dynamically. The solution is largely motivated by my previous use of F#, where working with “expressions” is possible at more advanced level, however I’ll write about F# later and now let’s get back to C# 3.0...

Read the complete article (English)
Monday, July 30, 2007

CLinq - LINQ support for the C++/CLI language

I started working on this project, because I attended C++ class at our university and I had to do some application in C++. Because I hate doing useless projects I wanted to work on something interesting and so I started thinking whether it would be possible to enable LINQ support in C++/CLI...

C++/CLI is a very flexible language and the following example proves that enabling LINQ support in C++/CLI isn't impossible. The following database query returns name of contact and company for all customers living in London:

// create connection to database
NorthwindData db(".. connection string ..");

// declare database query
Expr<Customers^> cvar = Var<Customers^>("c");
CQuery<String^>^ q = db.QCustomers
  ->Where(clq::fun(cvar, cvar.City == "London"))
  ->Select(clq::fun(cvar, 
      cvar.ContactName + Expr<String^>(", ") + cvar.CompanyName));

// execute query and output results
for each(String^ s in q->Query)
  Console::WriteLine(s);

If you are interested in more information about CLinq project you can...

Read the complete article (English)
Friday, March 02, 2007

Can't return anonymous type from method? Really?

One of the new features introduced in C# 3.0 which will be available in Visual Studio "Orcas" (currently in CTP version) is anonymous type. Anonymous type is something very similar to tuple type from Cω [1] (which is based on tuple types known from many functional programming languages including F#). Anonymous types are extremely useful in LINQ queries, because it allows you to construct type with several properties without declaring the type (with all the properties). Example of query with anonymous type looks like this:

var q = from c in db.Customers
          where c.Country = "Czech Republic"
          select new { FullName=c.Name+" "+c.Surname, Address=c.Address };

Ok, it's probabbly not the best example, but it demonstrates the point - you want to return some information from query and you don't need to declare type that contains FullName and Address properties before, because you need it only for this single query (and you want to return only these two fields, so you don't transfer additional data that you don't need from database).

Now let's get to the second point...

Read the complete article (English)
Tuesday, January 23, 2007

Concepts behind the C# 3.0 language

One of the lectures that I attended last year was Programming Methodology and Philosophy of Programming Languages. The lecture was mostly about history of programming languages and how several features evolved, disappeared and than after many years appeared again in another programming language.

As I final work I decided to write an article that describes ideas that influenced the design of the C# 3.0 language. Some of these features are known from functional languages (for example from LISP or Haskell), some other were developed at Microsoft Research and appeared in the F# language or Cω. I also wanted to show in what ways are these features limited in the C# 3.0. I think that thanks to these limitation, the C# 3.0 is still a simple (or at least not difficult) to understand which is very important for mainstream language, but I find it interesting to know what is possible in other (less limited) languages.

Read the complete article (English)
Sunday, October 15, 2006

LINQ extensions - Simplified keyword search

Recently, I came across interesting question at LINQ Forums (Dynamic conditions: How to achieve multiple "OR" conditions with LINQ? [1]). The question is whether LINQ (and especially LINQ to SQL) provides any simple way to return only records that contain one or more of specified keywords in the name. The question looks simple, but it is simple only if you know the number of keywords that you want to look for. In this case you can write following LINQ query:

// Products that contain "kwd1" or "kwd2" in the name 
var q = from p in db.Products where p.ProductName.Contains("kwd1") || p.ProductName.Contains("kwd2") select p;

The problem with previous code is that you can't use it if the list of keywords is dynamically entered by user (and so its length may vary). Of course, if you want to run query on in-memory data, you can get very nice results by writing extension method called ContainsAny that performs test for keyword array, but if you want to be able to translate query to SQL, the situation is a bit complicated.

Read the complete article (English)
Friday, July 28, 2006

Calling functions in LINQ queries

The LINQ Project [^] is an extension to .NET Framework and most important .NET languages (C# and VB.Net) that extends these languages with query operators and some language features that make it possible to integrate queries in the languages. Thanks to LINQ you can write queries that read data from database (or any other data source). For example, imagine that you want to write set of queries for eshop and you need to perform a price calculation in more queries. The problem with LINQ queries is that you can't simply call a function written in C# that calculates price. The following example is NOT WORKING for this reason:

// function used in filter
static decimal CalcPrice(Nwind.Product p) { return p.UnitPrice * 1.19m; }
// query that uses MyFunc
var q = from p in db.Products where CalcPrice(p) > 30m select p

I think that this is a big limitation, because when you want to keep some more complex logic in the data access layer you should be able to reuse parts of queries that are similar across more queries. The good thing is that with latest release, LINQ became extensible so it is possible to write a extensions that allow this scenario...

Read the complete article (English)
Saturday, June 10, 2006