How many tuple types are there in C#?
In a recent StackOverflow question the poster asked about the difference between the tupled and curried forms of a function in F#. In F#, you can use pattern matching to easily define a function that takes a tuple as an argument. For example, the poster's function was a simple calculation that multiplies the number of units sold n by the price p:
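A minimal sketch of such a tupled function. The name salesTuple and the parameter names price and count follow the surrounding text; the exact body is an assumption:

```fsharp
// Tupled form: a single argument of type float * int,
// decomposed into price and count by pattern matching.
// (Sketch only; the body is assumed from the description.)
let salesTuple (price: float, count: int) : float =
  price * float count
```

It is called with one tuple value, for example salesTuple (1.20, 10).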
The function takes a single argument of type Tuple<float, int> (or, using the nicer F# notation, float * int) and immediately decomposes it into two variables, price and count. The other alternative is to write a function in the curried form:
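A minimal sketch of the curried form together with partial application. The name salesCurried comes from the text; applePrice and tenApples are hypothetical names used only for illustration:

```fsharp
// Curried form: float -> int -> float (body assumed from the description).
let salesCurried (price: float) (count: int) : float =
  price * float count

// Partial application: fixing the price gives a function of type int -> float.
let applePrice = salesCurried 1.20

// Supplying the remaining argument computes the total for ten apples.
let tenApples = applePrice 10
```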
Here, we get a function of type float -> int -> float. Usually, you can read this just as a function that takes float and int and returns float. However, you can also use partial function application and call the function with just a single argument - if the price of an apple is $1.20, we can write salesCurried 1.20 to get a new function that takes just an int and gives us the price of the specified number of apples. The poster's question was:
So when I want to implement a function that would have taken n > 1 arguments, should I for example always use a curried function in F# (...)? Or should I take the simple route and use regular function with an n-tuple and curry later on if necessary?
You can see my answer on StackOverflow. The point of this short introduction was that the question inspired me to think about how the world looks from the C# perspective...
To curry or not to curry?
I will not repeat the whole answer in the blog post. The key idea is that you should use a tuple when the tuple has some logical meaning. For example, if you have a function that takes a range or 2D coordinates, it makes sense to use float * float. This makes sense because you can then nicely compose multiple functions that work with ranges. For example, let's say we have two functions, normalizeRange and expandRange:
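One way these two functions might look. Only the names normalizeRange and expandRange come from the text; the bodies below are assumptions (normalizing is taken to mean putting the bounds in ascending order, and expanding to mean widening both ends):

```fsharp
// Both functions take and return a range represented as float * float.

// Assumed behaviour: swap the bounds if they are given in the wrong order.
let normalizeRange (lo: float, hi: float) =
  if hi < lo then (hi, lo) else (lo, hi)

// Assumed behaviour: widen the range by `amount` on both sides.
// The amount comes first so that it can be partially applied.
let expandRange (amount: float) (lo: float, hi: float) =
  (lo - amount, hi + amount)
```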
Now we can easily write code that takes some range, normalizes it and expands it by 10:
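A sketch of that composition using the pipelining operator. The two helper definitions are assumed bodies, repeated here only so the snippet runs on its own; widened is a hypothetical name:

```fsharp
// Assumed helper definitions (only the names come from the text).
let normalizeRange (lo: float, hi: float) =
  if hi < lo then (hi, lo) else (lo, hi)
let expandRange (amount: float) (lo: float, hi: float) =
  (lo - amount, hi + amount)

// Because both functions take and return float * float,
// they chain without any packing or unpacking of arguments.
let widened =
  (20.0, 5.0)
  |> normalizeRange
  |> expandRange 10.0
// widened = (-5.0, 30.0)
```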
So, if your tuple has some logical meaning, taking a tuple as an argument leads to more composable code that is also easier to understand. On the other hand, if there is no logical connection, it is better to use the curried form - this makes it possible to use partial function application.
How about tuples in C#?
In C#, we can work with tuples using the Tuple<T1, T2, ...> family of types. This is certainly possible, but it is not particularly convenient, because you need to write the long type name repeatedly (you can use var inside a method, but not in the method declaration).
However, there is another place where tuples appear in C# - it is perfectly reasonable to treat all .NET methods as functions that take a single tuple as the input and return some other type as the result. This is how .NET methods look when you call them from F#:
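For illustration, here is a call to Math.Round from F#. The choice of method and arguments is an assumption; the point is only that the argument list is written like a tuple:

```fsharp
open System

// The argument list (2.5, MidpointRounding.AwayFromZero) looks like a tuple value.
let rounded = Math.Round(2.5, MidpointRounding.AwayFromZero)  // 3.0

// With a single argument the default midpoint mode (ToEven) applies.
let bankers = Math.Round(2.5)                                  // 2.0
```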
We do not usually think about this as a tuple - it is just a method call - but what if C# had (in some future version) syntactic support for tuples and let you write (42, "Hello world") to create a tuple value of type Tuple<int, string>?
How many tuple types are there in .NET?
This inspired me to do a quick analysis of the standard .NET libraries to have a look at the tuples that standard .NET methods take. How many of them follow the good practice and take a tuple that actually means something? And how many of them should instead use the curried form, because the tuple has no logical meaning?
Checking the logical meaning will be difficult, but we can see how many of the tuples are used by more than one or two methods. If they are used in multiple places, it likely means that they represent some common pattern or some common single-purpose data structure.
This is a pretty easy analysis to do using F# Interactive. Let's first look at all the types in the current AppDomain (this uses assemblies that are loaded by default in F# - so nothing fancy). We also only look at "mscorlib" and "System" assemblies:
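A sketch of how such a sequence could be built. The name types comes from the post; the filter on assembly names, the helper safeGetTypes and typeCount are assumptions. On newer runtimes the core library is not called mscorlib, so the count will differ from the one quoted here:

```fsharp
open System
open System.Reflection

// Hypothetical helper: some assemblies may fail to load every type,
// so keep whatever did load instead of failing.
let safeGetTypes (asm: Assembly) : Type[] =
  try asm.GetTypes()
  with :? ReflectionTypeLoadException as e ->
    e.Types |> Array.filter (fun t -> not (isNull t))

// All types from loaded assemblies whose name is "mscorlib"
// or starts with "System" (the exact filter is an assumption).
let types =
  seq { for asm in AppDomain.CurrentDomain.GetAssemblies() do
          let name = asm.GetName().Name
          if name = "mscorlib" || name.StartsWith("System") then
            yield! safeGetTypes asm }

let typeCount = Seq.length types
```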
The code is a simple sequence expression that iterates over all assemblies and yields all types. On my machine, this gives us some 17000 types. Now, let's get a list with all tuples - we'll iterate over all methods in each type and generate a list with the names of parameter types. We skip all methods with fewer than two parameters:
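A sketch of this step, run on a small stand-in list of types so that it is self-contained. sampleTypes and the particular BindingFlags combination are assumptions; the post runs the same idea over the full types sequence. Note that ParameterType.Name yields CLR names such as String and Int32:

```fsharp
open System
open System.Reflection

// Stand-in input: a few well-known types instead of all loaded types.
let sampleTypes = [ typeof<String>; typeof<Math>; typeof<Array> ]

// Assumed flags: public and non-public, instance and static,
// declared on the type itself.
let flags =
  BindingFlags.Public ||| BindingFlags.NonPublic |||
  BindingFlags.Instance ||| BindingFlags.Static |||
  BindingFlags.DeclaredOnly

// One entry per method with two or more parameters:
// the list of its parameter type names.
let tuples =
  [ for t in sampleTypes do
      for m in t.GetMethods(flags) do
        let pars = m.GetParameters()
        if pars.Length >= 2 then
          yield [ for p in pars -> p.ParameterType.Name ] ]
```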
So, on my machine there are 16463 methods in .NET that take some tuple as an argument. Now, the question is, how many of them are used repeatedly? We can easily group the tuples by the list of strings (F# implements structural comparison, so this is easy to do), calculate the counts for each group and sort the results:
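A sketch of the grouping step on a small made-up input. sampleTuples is hypothetical data standing in for the real tuples list; the pipeline itself is an assumption based on the description:

```fsharp
// Hypothetical stand-in for `tuples`: one list of parameter type names per method.
let sampleTuples =
  [ ["String"; "String"]
    ["Int32"; "Int32"]
    ["String"; "String"]
    ["Byte[]"; "Int32"; "Int32"]
    ["String"; "String"] ]

// Lists of strings compare structurally, so `id` works as the grouping key.
let counts =
  sampleTuples
  |> Seq.groupBy id
  |> Array.ofSeq
  |> Array.map (fun (pars, group) -> pars, Seq.length group)
  |> Array.sortBy snd   // ascending by count
  |> Array.rev          // most frequent first
```

The first element of counts is the most frequently used tuple, here the pair of strings with a count of 3.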
Most common tuples in .NET
If we run Seq.length counts, we get 5805 as the result. This means that there are almost 6 thousand distinct tuples (among roughly 16 thousand methods). That does not suggest that most of them have some logical meaning. But some of the top ones certainly do - here are the top 8 (ignoring generics) with their counts:
- string * string (714) - looks like many methods take two strings - not sure if there is any logical meaning, but there probably are a few common uses
- byte[] * int * int (341) - this one looks like an array with offset and length - clearly this is a nice tuple with logical meaning
- int * int (327) - similar to two strings
- object * object (180) - hmm, maybe .NET likes untyped API :-)
- int * object (165) - I was a bit puzzled by this one, so I checked the methods that use this type. Good old untyped collections from the .NET 1.0 days!
- char[] * int * int (159) - similarly to the second item, another nice logical tuple!
- string * string * string (156) - wow, so many methods take 3 strings
- ITypeDescriptorContext * Type (152) - huh??
How many are actually useful?
It looks like there are quite a few tuple types that actually mean something useful. But what is the distribution? Let's use the FSharp.Charting library to draw a quick column chart plotting the counts for every single one of the roughly 5800 tuple types:
If you create a chart using just Chart.Column, then you will not see very much - the counts drop very quickly from the high numbers that we've seen for the first 10 types. But if we make the Y scale logarithmic (a good way to create misleading charts!) then we can actually see something:
The chart shows that a vast majority of tuples are used less than 10 times and only about 2000 (of some 5800) are used more than once. The analysis based on just the number of occurrences is definitely not precise, but let's say that tuples which are used more than 10 times are useful and those that are used more than 3 times are possibly useful. We can then easily draw a chart showing the proportions:
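A sketch of the classification behind such a chart, leaving out the charting call so that it runs without FSharp.Charting. sampleCounts is made-up data (only 714 and 341 appear in the list above), and classify and proportions are hypothetical names; the thresholds follow the text:

```fsharp
// Hypothetical stand-in for the occurrence count of each distinct tuple.
let sampleCounts = [ 714; 341; 12; 7; 4; 3; 1; 1 ]

// Thresholds from the text: more than 10 uses is "useful",
// more than 3 uses is "possibly useful".
let classify n =
  if n > 10 then "useful"
  elif n > 3 then "possibly useful"
  else "other"

// Number of distinct tuples in each category.
let proportions =
  sampleCounts
  |> Seq.countBy classify
  |> List.ofSeq
```

The resulting label and count pairs have the shape that a doughnut or pie chart would presumably take as input.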
This snippet gives us the following nice chart (I tweaked the look a bit - a nice feature of F# charting is that you can use Ctrl+G to open a property grid and change the fonts rather than doing everything from code):
Surely, this is ridiculous!
Yes, I can hear that. I'm comparing the incomparable here - it does not make sense to look at .NET libraries as if they were F# libraries and then claim that they are poorly designed. The new version of my blog does not even have comments, but you can still argue with me on Twitter.
But before doing that - I'm not trying to criticise the design of .NET libraries in any way. If your only option is to define a method that takes parameters "as a tuple" then that's the way to go. I'm certainly not suggesting that .NET should use curried form using Func<T1, ...> delegates or that people should use Tuple<T1, ...> instead of ordinary methods.
This article is merely a thought experiment with some interesting analysis of .NET types. We can see that there are a few "natural tuples" in .NET library design (like byte[] * int * int) but the parameters of a majority of methods do not logically form a tuple.
So, is it better to use languages that make a clear distinction between (curried) functions and functions taking a tuple? I think so - it makes it easier to write composable code (by writing functions that take and return simple "ad-hoc" types as tuples) and it gives you an easy way of grouping related types. There is no class representing an array range in .NET because adding an entire class for this would be overkill. A simple type like a tuple (supported by the language) makes this perfectly possible. On the other hand, you need to think more carefully about library design to make sure that you use tuples correctly.
Published: Tuesday, 17 September 2013, 3:11 PM
Author: Tomas Petricek
Typos: Send me a pull request!
Tags: c#, f#, functional programming