# Keep your multi-core CPU busy with F#

The growth of computer CPU speed is slowly being replaced by the growth of number of CPUs (or cores) in the computer at least for the close future. This causes a revolution in the way software is written, because traditional and most widely used way of writing concurrent applications using threads is difficult and brings several serious issues. Some predictions say that within a few years, almost every computer will have about 16 cores, so there is a huge need for programming paradigms or idioms that help developers write concurrent software easily (see also The Free Lunch Is Over [^] written by Herb Sutter).

Functional programming languages (especially pure functional languages) are interesting from this point of view, because the program doesn't have side-effects which makes it very easy to parallelize it (programs in pure functional languages can't have any side-effects by design, in other functional languages like F# the side-effects can be eliminated by following functional programming style).

## Finding primes example

Let's say I want to find all prime numbers between 1 million and 1.1 million (this is example of an operation that can be nicely divided between more processors). First we will need function to test whether `n` is a prime:

```// Tests whether n is prime - expects n > 1
let is_prime n =
// calculate how large divisors should we test..
let max = int_of_float (Math.Sqrt( float_of_int n ))
// try to divide n by 2..max (stops when divisor is found)
not ({ 2 .. max } |> Seq.filter ( fun d -> n%d = 0) |> Seq.nonempty)
```

To find all primes in the specified range we could use the following code:

```let primes = [1000000 .. 1100000] |> List.filter is_prime
```

This code subsequently executes the `is_prime` function for all numbers that we want to test, but with multiple CPU cores it would be nice if the function divided the numbers into several parts and executed every part on different thread, so application would take benefit from multiple cores available in the system. The `is_prime` function doesn't have any side-effects so executing it several times in parallel can't change the result of operation (if the order of primes in the returned list doesn't change).

I wrote a function that does exactly what I described in the previous paragraph. To execute the operation in parallel, you can use the following code (the only difference is that I replaced `List` module with my `ParallelList` module):

```let primes = [1000000 .. 1100000] |> ParallelList.filter is_prime
```

On my notebook (with Intel Core Duo processor) the first code (using `List.filter`) takes about 2,3sec and using `ParallelList.filter` the operation takes only 1.3sec. The program isn't 2 times faster, because there is some overhead for creating and synchronizing threads, but in this case the speed increase is significant.

Aside from the `filter` function, the `ParallelList` module contains also the `map` function (which does the same thing as the `List.map`). There is also a function `set_thread_count` that you can use to configure how many threads should be used when executing parallel operations (the default value is `2`).

## Performance and future work

The performance of these functions is the key issue. Currently the `ParallelList` functions work can't be used for small number of repetitions of simple function, because the overhead is larger than the profit from parallel execution. If the operation takes less than 0.01ms and the number of repetitions is less than 1000 the `List` functions are usually better, but for operations taking longer time the results of `ParallelList` are better even with smaller number of repetitions. For operation taking about 0.1ms the `ParallelList` gives better results for more than 200 repetitions and for operation taking more than 1ms the number of repetitions is not very important (`ParallelList` is better even for 10 repetitions). These are inaccurate results that I got on my notebook, so if you're thinking of using the `ParallelList`, be sure to do some tests in your scenario! You can see some tests that I did in the demo project in `stats.fs` source file.

As I said earlier, `ParallelList` supports only `filter` and `map` functions, so implementing more functions would be useful. It would be also useful to provide some alternatives for functions that can't be executed in parallel (like `fold_left`) that could be used in some situations. I'd also like to implement functions for working with other collection types like `Seq` (`IEnumerable`) and `array` in the future.

I'm interested in your ideas and suggestions, so if you find something that could be improved in the code, or if you have any other idea, let me know!

Published: March 24, 2007 23:13
Tags: Functional Programming in .NET | Parallel | F# language

Share:

• RE: Keep your multi-core CPU busy with F# by Tom Kirby-Green (3/25/2007 8:04:35 AM)

Very interesting stuff, timely and useful. I'm looking forward to seeing where you take this. It's clearly PLINQ-like - but available now rather than 2-3 years out! Good luck with the project! :-)

• RE: Keep your multi-core CPU busy with F# by Jules (3/25/2007 1:08:43 PM)

Fold can be parallelized for associative functions. A ParallelList.fold would be cool :)

• RE: Keep your multi-core CPU busy with F# by Jonathan Allen (3/26/2007 1:27:41 AM)

Correct me if I'm wrong, but isn't it better to check only previously found primes than to try every possible divisor?

• RE: Keep your multi-core CPU busy with F# by Tomas (3/26/2007 11:35:01 PM)

Tom: Yes.. it is very similar to what PLINQ will probably do (I guess that PLINQ was inspied by functional programmin too :-))

• RE: Keep your multi-core CPU busy with F# by Tomas (3/26/2007 11:38:39 PM)

Jules: Thanks for the suggestion, I will add Fold for associative functions. It could be also useful to implement the following fold alternative:

// first 'fold' parts of the list (using first function)
// than join the results from every part
val fold_join : ('b -> 'a -> 'b) -> ('b -> 'b -> 'b) -> 'b -> 'a list

• RE: Keep your multi-core CPU busy with F# by Tomas (3/26/2007 11:50:45 PM)

Jonathan: Yes.. testing only primes as a divisors would be better, but I'd have to find all primes from 1 to X (which might take some time when testing numbers around 1M). For finding primes in the interval starting from 1 the algorithm called 'Sieve of Eratosthenes' would be better.

• RE: Keep your multi-core CPU busy with F# by (2/5/2009 9:49:08 PM)

Great post. F# is indeed very interesting.
Actually, I am researching about multicore programming with C#. I am not working with F#, but I should buy your book when it is available.
Recently, I bought a book about C# threaded programming "C# 2008 and 2005 Threaded Programming", by Gaston Hillar, Packt Publishing from B&N http://search.barnesandnoble.com/C-2008-and-2005-Threaded-Programming/Gaston-C-Hillar/e/9781847197108/?itm=1
The book is very easy to understand and it helpme to exploit both my multicore notebook (dual core) and desktop PC (quad core). Highly recommended to C# developers who aren't gurus.
I will go on researching about multicore programming. Besides, I am very interested in functional programming and specially in F#. Therefore, I will buy your book when available.