TP

Comparing date range handling in C# and F#

I was recently working on some code for handling date ranges in Deedle. Although Deedle is written in F#, I also wrote some internal integration code in C#. After doing that, I realized that the code I wrote is actually reusable and should be a part of Deedle itself and so I went through the process of rewriting a simple function from (fairly functional) C# to F#. This is a small (and by no means representative!) example, but I think it nicely shows some of the reasons why I like F#, so I thought I'd share it.

The problem

One thing that we are adding to Deedle is a "BigDeedle" implementation of internal data structures. The idea is that you can load very big frames and series without actually loading all data into memory.

When you perform slicing on a large series and then merge some of the parts of the series (say, years 2010, 2012 and 2014), you end up with a series that combines a couple of chunks. If you then restrict the series (say, from June 2012 to June 2014), you need to restrict the ranges of the chunks:

Demonstration

As the diagram shows, this is just a matter of iterating over the chunks, keeping those in the range, dropping those outside of the range and restrictingthe boundaries of the other chunks. So, let's start with the C# version I wrote.

Restricting ranges in C#

To keep the sample self-contained, I'll also include a simple definition of a Date type that I was using in my experiments. This is not the key part, but it is worth showing. A date is essentially an integer - a number of days since the beginning of the universe (or some other important milestone):

 1: 
 2: 
 3: 
 4: 
 5: 
 6: 
 7: 
 8: 
 9: 
10: 
11: 
12: 
13: 
14: 
15: 
16: 
/// <summary>
/// A date as a number of days since the beginning
/// </summary>
class Date {
  public int Offset { get; set; }
  public static bool operator <=(Date d1, Date d2) { return d1.Offset <= d2.Offset; }
  public static bool operator >=(Date d1, Date d2) { return d1.Offset >= d2.Offset; }
  public static bool operator <(Date d1, Date d2) { return d1.Offset < d2.Offset; }
  public static bool operator >(Date d1, Date d2) { return d1.Offset > d2.Offset; }
}
/// <summary>
/// An array of ranges represented as date pairs
/// </summary>
class Ranges {
  public Tuple<Date, Date>[] Ranges { get; set; }
}

The Date type defines a couple of custom operators so that we can compare dates (I only defined those that I needed). The Ranges type is the simplest possible wrapper over an array of Date pairs. The type is internal, so I was just using tuples to save some typing.

Next, let's have a look at the RestrictRanges function. This takes Ranges together with lower and upper bound of the restriction. It then iterates over the ranges using SelectMany and returns the range unmodified (if it is within the restriction), skips it (if it is outside) or adjusts its lower and upper bounds:

 1: 
 2: 
 3: 
 4: 
 5: 
 6: 
 7: 
 8: 
 9: 
10: 
11: 
12: 
13: 
14: 
15: 
16: 
17: 
18: 
19: 
/// <summary>
/// Restrict the specified collection of ranges 
/// according to the provided restriction range
/// </summary>
static Ranges RestrictRanges(Ranges ranges, Date loRestr, Date hiRestr)
{
  var newRanges = ranges.Ranges.SelectMany(range =>
  {
    if (range.Item1 >= loRestr && range.Item2 <= hiRestr)
      return new[] { range };
    else if (range.Item2 < loRestr || range.Item1 > hiRestr)
      return new Tuple<Date, Date>[0];
    else
      return new[] { Tuple.Create
          ( range.Item1 > loRestr ? range.Item1 : loRestr, 
            range.Item2 < hiRestr ? range.Item2 : hiRestr ) };
  }).ToArray();
  return new Ranges { Ranges = newRanges };
}

This is fairly simple and readable piece of code. We might be able to make it a bit nicer if we used iterators, but that would require a separate method (because we are returning Ranges and not IEnumerable<T> here). We could also use a named type rather than tuple (to replace Item1 and Item2 with Lower and Upper), which would make it a bit more readable, but it wouldn't change the structure. Before discussing the code further, let's look at the F# version.

Restricting ranges in F#

As with the C# version, we need to start with type definitions. This is not really the important part, but I wanted to have a self-contained sample for the blog, so I'm including those too:

1: 
2: 
3: 
4: 
/// A date as a number of days since the beginning
type Date = { Offset : int }
/// An array of ranges represented as date pairs
type Ranges = { Ranges : (Date*Date)[] }

If we're happy to use a simple F# record, then this is all we need. For records, the compiler automatically provides structural equality and structural comparison. This means that we can write { Offset = 123 } <= { Offset = 125 } straight away and the result is true. We can also omit the <summary> tag in the comment (F# adds it automatically).

This is not really the main thing though. In practice, I would use records for simple types that do not have complex internal logic - and so I might start with the above Date record, but later turn it into something that is closer to the C# type with explicit definitions. Nevertheless, it is nice that we can write simple record in the first step and F# gets all the defaults right (makes it immutable, adds equality and comparison).

The more interesting thing is the restrictRanges function. We follow exactly the same logic as before (even using Array.collect which is an equivalent of SelectMany):

1: 
2: 
3: 
4: 
5: 
6: 
7: 
8: 
9: 
/// Restrict the specified collection of ranges 
/// according to the provided restriction range
let restrictRanges (loRestr:Date, hiRestr:Date) ranges =
  let newRanges = 
    ranges.Ranges |> Array.collect (fun (lo, hi) ->
        if lo >= loRestr && hi <= hiRestr then [| lo, hi |]
        elif hi < loRestr || lo > hiRestr then [| |]
        else [| max lo loRestr, min hi hiRestr |] )
  { Ranges = newRanges }

Alternatively, we could use sequence expressions (which I prefer in this case) and rewrite replace the Array.collect function with the [| .. |] block, which gives us:

1: 
2: 
3: 
4: 
5: 
6: 
7: 
8: 
9: 
/// Restrict the specified collection of ranges 
/// according to the provided restriction range
let restrictRangesArrExpr (loRestr:Date, hiRestr:Date) ranges =
  let newRanges = 
    [| for lo, hi in ranges.Ranges do
        if lo >= loRestr && hi <= hiRestr then yield lo, hi
        elif hi < loRestr || lo > hiRestr then ()
        else yield max lo loRestr, min hi hiRestr |]
  { Ranges = newRanges }

Comparing the two versions

It is certainly a matter of taste, but I was quite surprised by how different the F# version looks - both versions implement the same logic and both use functional programming style, but there are a few little details that (in my opinion) make the F# version nicer.

For me, the interesting thing about this comparison is that it is not looking at any big ideas. It is comparing two functions written in the same style, using pretty much the same code. But even then, the using F# gives us a couple of little benefits that make the code (I think) nicer.

There is no fundamental reason why C# could not do any of these in a future version. In fact, I think that pattern matching on tuples gets mentioned quite often. But this were just 4 "little things" that I found in one 7-line function...

It might also be the case that I'm more used to writing and reading F# - this is, of course, true - but if we look at what we deleted, I think it was mostly noise: things like new [] { .. }, Tuple<Date, Date>, range.Item1 > loRestr ? .. : .., ToArray(), .Item1 and .Item2 are all about the implementation details and not about the function logic.

Date.Offset: int
Multiple items
val int : value:'T -> int (requires member op_Explicit)

Full name: Microsoft.FSharp.Core.Operators.int

--------------------
type int = int32

Full name: Microsoft.FSharp.Core.int

--------------------
type int<'Measure> = int

Full name: Microsoft.FSharp.Core.int<_>
type Ranges =
  {Ranges: (Date * Date) [];}

Full name: Restricting-ranges.Ranges


 An array of ranges represented as date pairs
Multiple items
Ranges.Ranges: (Date * Date) []

--------------------
type Ranges =
  {Ranges: (Date * Date) [];}

Full name: Restricting-ranges.Ranges


 An array of ranges represented as date pairs
type Date =
  {Offset: int;}

Full name: Restricting-ranges.Date


 A date as a number of days since the beginning
val restrictRanges : loRestr:Date * hiRestr:Date -> ranges:Ranges -> Ranges

Full name: Restricting-ranges.restrictRanges


 Restrict the specified collection of ranges
 according to the provided restriction range
val loRestr : Date
val hiRestr : Date
val ranges : Ranges
val newRanges : (Date * Date) []
Ranges.Ranges: (Date * Date) []
module Array

from Microsoft.FSharp.Collections
val collect : mapping:('T -> 'U []) -> array:'T [] -> 'U []

Full name: Microsoft.FSharp.Collections.Array.collect
val lo : Date
val hi : Date
val max : e1:'T -> e2:'T -> 'T (requires comparison)

Full name: Microsoft.FSharp.Core.Operators.max
val min : e1:'T -> e2:'T -> 'T (requires comparison)

Full name: Microsoft.FSharp.Core.Operators.min
val restrictRangesArrExpr : loRestr:Date * hiRestr:Date -> ranges:Ranges -> Ranges

Full name: Restricting-ranges.restrictRangesArrExpr


 Restrict the specified collection of ranges
 according to the provided restriction range

Published: Wednesday, 22 April 2015, 5:55 PM
Author: Tomas Petricek
Typos: Send me a pull request!
Tags: f#, c#, deedle, linq, functional programming