TP

F# Overview (II.) - Functional Programming

As already mentioned in the Introduction for this article series, F# is a typed functional language, by which I mean that types of all values are determined during the compile-time. However, thanks to the use of a type inference, the types are explicitly specified in the code very rarely as we will see in the following examples. The type inference means that the compiler deduces the type from the code, so for example when calling a function that takes int as an argument and returns string as a result, the compiler can infer the type of the variable where the result is asigned (it has to be string) as well as the type of the variable that is given as an argument (it has to be int). Basic data types (aside from a standard set of primitive numeric and textual types that are present in any .NET language) available in F# are tuple, discriminated union, record, array, list, function and object. In the following quick overview, we will use the F# interactive, which is a tool that compiles and executes the entered text on the fly.

The F# interactive can be either used from Visual Studio or by running the fsi.exe from the F# installation directory. In the whole article series we will also use the F# lightweight syntax, which makes the code white-space sensitive, but simplifies many of the syntactical rules. To enable the lightweight syntax enter the following command in the FSI:

> #light;;

The double semicolon is used as the end of the FSI input and sends the entered text to the compiler. This allows us to enter multiple lines of code in a single command. With a few exceptions (mostly when showing a declaration of a complex type) all the code samples in this article are written as commands for FSI including the double semicolon and the result printed by the FSI. Longer code samples can be entered to FSI as well - just add the double semicolon to terminate the input.

F# Data Types Overview

Tuples

The first example demonstrates constructing and deconstructing a tuple type. Tuple is a simple type that groups together two or more values of possibly different types. It is important to note that the length of the tuple is known at compile time, so we can for example create a tuple with two elements of types int and string:

> let tuple = (42, "Hello world!");;
val tuple : int * string

> let (num, str) = tuple;;
val num : int
val str : string

As you can see, the compiler deduced a type of the expression that is present on the right side of the equals sign and the F# interactive printed the type, so we can review it. In this example the type of a first element in a tuple is int and the type of the second element is string. The asterisk denotes that the type is a tuple. Similarly, you can define a tuple with more than three elements, but the type changes with the number of elements in a tuple, which means that tuples can't be used for storing an unknown number of values. This can be done using lists or arrays, which will be discussed later.

The syntax used for deconstructing the value into variables num and str is in general called pattern matching and it is used very often in the F# language – the aim of pattern matching is to allow matching a value against a pattern that specifies different view of the data type – in case of tuple, one view is a single value (of type tuple) and the second view is a pair of two values (of different types). Pattern matching can be used with all standard F# types, most notably with tuples, discriminated unions and record types. In addition, F# also supports generalized pattern matching constructs called active patterns, which are discussed later in this overview.

Tuple types are very handy for returning multiple values from functions, because this removes the need to declare a new class or use references when writing a function that performs some simple operation resulting in more returned values (especially in places where C# uses ref and out parameters). In general, I would recommend using tuples when the function is either simple (like division with remainder), local (meaning that it will not be accessed from a different module or file) or it is likely to be used with pattern matching. For returning more complicated structures it is better to use record types which will be discussed shortly.

Discriminated Union

In the next sample we demonstrate working with the discriminated union type. This type is used for representing a data type that store one of several possible options (where the options are well known when writing the code). One common example of data type that can be represented using discriminated unions is an abstract syntax tree (i.e. an expression in some programming language):

> // Declaration of the 'Expr' type
  type Expr = 
  | Binary   of string * Expr * Expr
  | Variable of string 
  | Constant of int;;
(...)

> // Create a value 'v' representing 'x + 10'
  let v = Binary("+", Variable "x", Constant 10);;
val v : Expr

To work with the values of a discriminated union type, we can again use pattern matching. In this case we use the match language construct, which can be used for testing a value against several possible patterns – in case of the Expr type, the possible options are determined by all identifiers used when declaring the type (these are called constructors), namely Binary, Variable and Constant. The following example declares a function eval, which evaluates the given expression (assuming that getVariableValue is a function that returns a value of variable):

> let rec eval x =
    match x with
    | Binary(op, l, r) ->
        let (lv, rv) = (eval l, eval r)
        if   (op = "+") then lv + rv 
        elif (op = "-") then lv - rv 
        else failwith "Unknonw operator!"
    | Variable(var) -> 
        getVariableValue var
    | Constant(n) ->
        n;;
val eval : Expr -> int

When declaring a function we can use the let keyword that is used for binding a value to a name. I don’t use a term variable known from other programming languages for a reason that will be explained shortly. When writing a recursive function, we have to explicitly state this using the rec keyword as in the previous example.

Discriminated unions form a perfect complement to the typical object-oriented inheritance structure. In an OO hierarchy the base class declares all methods that are overridden in derived classes, meaning that it is easy to add new type of value (by adding a new inherited class), but adding a new operation requires adding method to all the classes. On the other side, a discriminated union defines all types of values in advance, which means that adding a new function to work with the type is easy, but adding a new type of value (new constructor to the discriminated union) requires modification of all existing functions. This suggests that discriminated unions are usually a better way for implementing a Visitor design pattern in F#.

Records

The next data type that we will look at is a record type. It can be viewed as a tuple with named members (in case of record these are called labels), which can be accessed using a dot-notation and as mentioned earlier it is good to use this type when it would be difficult to understand what the members in a tuple represent. One more difference between a record type and a tuple is that records have to be declared in advance using a type construct:

> // Declaration of a record type
  type Product = 
    { Name:string
      Price:int };;
(...)

> // Constructing a value of the 'Product' type
  let p = { Name="Test"; Price=42; };;
val p : Product

> p.Name;;
val it : string = "Test"

> // Creating a copy with different 'Name'
  let p2 = { p with Name="Test2" };;
val p2 : Product

The last command uses an interesting construct - the with keyword. The record types are by default immutable, meaning that the value of the member can’t be modified. Since the members are immutable you will often need to create a copy of the record value with one (or more) modified members. Doing this explicitly by listing all the members would be impractical, because it would make adding a new members very difficult, so F# supports the with keyword to do this.

F# records are in many ways similar to classes and they can be, indeed, viewed as simplified classes. Record types are by default immutable, which also means that F# use a structural comparison when comparing two values of a record type (instead of the default reference comparison used when working with classes) and if you need this behavior (e.g. for storing records as a keys in a dictionary) it is very practical to use them. Also, using a record instead of a class is a good idea in a functional code where you can use the with construct. Exposing a record type in a public interface of the module requires additional care and it is often useful to make the labels available as members, which makes it easier to modify implementation of the type later. This topic will be further discussed in the third part of this article series.

Lists

The types used for storing collections of values are list and array. F# list is a typical linked-list type known from many functional languages – it can be either an empty list (written as []) or a cell containing a value and a reference to the tail, which is itself a list (written as value::tail). It is also possible to write a list using a simplified syntax, which means that you can write [1; 2; 3] instead of 1::2::3::[] (which is exactly the same list written just using the two basic list constructors). Array is a .NET compatible mutable array type, which is stored in a continuous memory location and is therefore very efficient – being a mutable type, array is often used in imperative programming style, which will be discussed later. The following example shows declaration of a list value and an implementation of a recursive function that adds together all elements in the list:

> let nums = [1; 2; 3; 4; 5];;
val nums : list<int>

> let rec sum list = 
    match list with
    | h::tail -> (sum tail) + h
    | [] -> 0
val sum : list<int> -> int

Similarly as earlier we declared a recursive function using let rec and inside the body we used pattern matching to test whether the list is an empty list or a list cell. Note that list is a generic type, which means that it can store values of any F# type. The type in our example is list<int>, which means that the declared instance of list contains integers. Functions working with generic types can be restricted to some specific type - for example the sum function above requires a list of integers that can be added (this is inferred by the type inference, because the default type used with the + operator is int). Alternatively, the function can be generic as well, which means that it works with any lists - for example a function that returns the last element in the list doesn’t depend on the type and so it can be generic. The signature of a generic function to return the last element would be last : list<'a> -> 'a.

An important feature when writing recursive functions in F# is the support for tail-calls. This means that when the last operation performed by the function is a call to a function (including a recursive call to itself), the runtime drops the current stack frame, because it isn’t needed anymore - the value returned by the called function is a result of the caller. This minimizes a chance for getting a stack overflow exception. The sum function from the previous example can be written using an auxiliary function that uses a tail recursion as following:

> // 'acc' is usually called an 'accumulator' variable
  let rec sumAux acc list = 
    match list with
    | h::tail -> sumAux (acc + h) tail
    | [] -> acc
val sum : int -> list<int> -> int

> let sum list = sumAux 0 list
val sum : list<int> -> int

Functions

Finally, the type that gives name to the whole functional programming is a function. In F#, similarly to other functional languages, functions are first-class values, meaning that they can be used in a same way as any other types. They can be given as an argument to other functions or returned from a function as a result (a function that takes function as an argument or returns function as a result is called high-order function) and the function type can be used as a type argument to generic types - you can for example create a list of functions. The important aspect of working with functions in functional languages is the ability to create closures – creating a function that captures some values available in the current stack frame. The following example demonstrates a function that creates and returns a function for adding specified number to an initial integer:

> let createAdder n =
    (fun arg -> n + arg);;
val createAdder : int -> int -> int

> let add10 = createAdder 10;;
val add10 : int -> int

> add10 32;;
val it : int = 42

In the body of the createAdder function we use a fun keyword to create a new unnamed function (a function constructed in this way is called a lambda function). The type of createAdder (int -> int -> int) denotes that when the function is called with int as an argument, it produces a value of type function (which takes an integer as a parameter and produces an integer as a result). In fact, the previous example could be simplified, because any function taking more arguments is treated as a function that produces a function value when it is given the first argument, which means that the following code snippet has the same behavior. Also note that the types of the function createAdder declared earlier and the type of the function add are the same):

> let add a b = a + b;;
val add : int -> int -> int

> let add10 = add 10;;
val add10 : int -> int

When declaring the function value add10 in this example, we used a function that expects two arguments with just one argument. The result is a function with a fixed value of the first argument which now expects only one (the second) argument. This aspect of working with functions is known as currying.

Many functions in the F# library are implemented as high-order functions and functions as an arguments are often used when writing a generic code, that is a code that can work with generic types (like list<'a>, which we discussed earlier). For example standard set of functions for manipulating with list values is demonstrated in the following example:

> let odds = List.filter (fun n -> n%2 <> 0) [1; 2; 3; 4; 5];;
val odds : list<int> = [1; 3; 5]

> let squares = List.map (fun n -> n * n) odds;;
val squares : list<int> = [1; 9; 25]

It is interesting to note that the functions that we used for manipulating with lists are generic (otherwise they wouldn’t be very useful!). The signature of the filter function is ('a -> bool) -> list<'a> -> list<'a>, which means that the function takes list of some type as a second argument and a function that returns a true or false for any value of that type, finally the result type is same as the type of the second argument. In our example we instantiate the generic function with a type argument int, because we’re filtering a list of integers. The signatures of generic functions often tell a lot about the function behavior. When we look at the signature of the map function (('a -> 'b) -> list<'a> -> list<'b>) we can deduce that map calls the function given as a first argument on all the items in the list (given as a second argument) and returns a list containing the results.

In the last example we will look at the pipelining operator (|>) and we will also look at one example that demonstrates how currying makes writing the code easier - we will use the add function declared earlier:

> let nums = [1; 2; 3; 4; 5];;
val nums : list<int>

> let odds_plus_ten = 
       nums 
    |> List.filter (fun n-> n%2 <> 0)
    |> List.map (add 10)
val odds_plus_ten : list<int> = [11; 13; 15];;

Sequences of filter and map function calls are very common and writing it as a single expression would be quite difficult and not very readable. Luckily, the sequencing operator allows us to write the code as a single expression in a more readable order - as you can see in the previous example, the value on the left side of the |> operator is given as a last argument to the function call on the right side, which allows us to write the expression as sequence of ordinary calls, where the state (current list) is passed automatically to all functions. The line with List.map also demonstrates a very common use of currying. We want to add 10 to all numbers in the list, so we call the add function with a single argument, which produces a result of the type we needed - a function that takes an integer as an argument and returns an integer (produced by adding 10) as the result.

Function Composition

One of the most interesting aspects of working with functions in functional programming languages is the possibility to use function composition operator. This means that you can very simply build a function that takes an argument, invokes a first function with this argument and passes the result to a second function. For example, you can compose a function fst, which takes a tuple (containing two elements) and returns the first element in the tuple with a function uppercase, which takes a string and returns it in an uppercase:

> (fst >> String.uppercase) ("Hello world", 123);;
val it : string = "HELLO WORLD"

> let data = [ ("Jim", 1); ("John", 2); ("Jane", 3) ];;
val data : (string * int) list

> data |> List.map (fst >> String.uppercase);;
val it : string list = ["JIM"; "JOHN"; "JANE"]

In the first command, we just compose the functions and call the returned function with a tuple as an argument, however the real advantage of this trick becomes more obvious in the third command, where we use the function composition operator (>>) to build a function that is given as an argument to a map function that we used earlier. The function composition allows us to build a function without explicitly using a lambda function (written using the fun keyword) and when this features is used reasonably it makes the code more compact and keeps it very readable.

Expressions and Variable Scoping

The F# language doesn’t have a different notion of a statement and an expression, which means that every language construct is an expression with a known return type. If the construct performs only a side effect (for example printing to a screen or modifying a global mutable variable or a state of .NET object) and doesn’t return any value then the type of the construct is unit, which is a type with only one possible value (written as “()”). The semicolon symbol (;) is used for sequencing multiple expressions, but the first expression in the sequence should have a unit as a result type. The following example demonstrates how the if construct can be used as an expression in F# (though in the optional F# lightweight syntax, which makes whitespace significant and which we used in the rest of this overview, the semicolon symbol can be omitted):

> let n = 1
  let res = 
    if n = 1 then
      printfn "..n is one..";
      "one"
    else
      "something else";;
..n is one..
val res : string = "one"

When this code executes it calls the true branch of the if expression, which first calls a side-effecting function, which prints a string and then returns a string ("one") as the result. The result is then assigned to the res value.

Unlike some languages that allow one variable name to appear only once in the entire function body (e.g. C#) or even treat all variables declared inside the body of a function as a variable with scope of the whole function (e.g. Visual Basic or JavaScript), the scope of F# values is determined by the let binding and it is allowed to hide a value by declaring a value with the same name. The following (slightly esoteric) example demonstrates this:

> let n = 21
  let f = 
    if n > 10 then
      let n = n * 2
      (fun () -> printfn "%d" n)
    else
      let n = n / 2
      (fun () -> printfn "%d" n)
  f ();;
42
val it : unit 

In this example, the value n declared inside a branch of the if expression is captured by a function created using the fun keyword, which is returned from the if expression and bound to the value named f. When the f is invoked it indeed uses the value from the scope where it was created, which is 42. In languages, where the variable named n would refer to a value stored globally, it would be rather problematic to write a code like this. Of course, writing a code similar to what I demonstrated in this example isn't a good idea, because it makes the code very difficult to read. There are however situations where hiding a value that is no longer needed in the code is practical.

Article Series Links

Published: Saturday, 3 November 2007, 12:00 AM
Author: Tomas Petricek
Typos: Send me a pull request!
Tags: functional, f#