Keep your multi-core CPU busy with F#
The growth of computer CPU speed is slowly being replaced by the growth of number of CPUs (or cores) in the computer at least for the close future. This causes a revolution in the way software is written, because traditional and most widely used way of writing concurrent applications using threads is difficult and brings several serious issues. Some predictions say that within a few years, almost every computer will have about 16 cores, so there is a huge need for programming paradigms or idioms that help developers write concurrent software easily (see also The Free Lunch Is Over [^] written by Herb Sutter).
Functional programming languages (especially pure functional languages) are interesting from this point of view, because the program doesn't have side-effects which makes it very easy to parallelize it (programs in pure functional languages can't have any side-effects by design, in other functional languages like F# the side-effects can be eliminated by following functional programming style).
Finding primes example
Let's say I want to find all prime numbers between 1 million and 1.1 million (this is example of
an operation that can be nicely divided between more processors). First we will need function to test
whether n
is a prime:
// Tests whether n is prime - expects n > 1
let is_prime n =
// calculate how large divisors should we test..
let max = int_of_float (Math.Sqrt( float_of_int n ))
// try to divide n by 2..max (stops when divisor is found)
not ({ 2 .. max } |> Seq.filter ( fun d -> n%d = 0) |> Seq.nonempty)
To find all primes in the specified range we could use the following code:
let primes = [1000000 .. 1100000] |> List.filter is_prime
This code subsequently executes the is_prime
function for all numbers that we want to test,
but with multiple CPU cores it would be nice if the function divided the numbers into
several parts and executed every part on different thread, so application would take benefit from multiple
cores available in the system. The is_prime
function doesn't have any side-effects so
executing it several times in parallel can't change the result of operation (if the order of primes in the
returned list doesn't change).
I wrote a function that does exactly what I described in the previous paragraph. To execute the operation
in parallel, you can use the following code (the only difference is that I replaced List
module with
my ParallelList
module):
let primes = [1000000 .. 1100000] |> ParallelList.filter is_prime
On my notebook (with Intel Core Duo processor) the first code (using List.filter
) takes about
2,3sec and using ParallelList.filter
the operation takes only 1.3sec.
The program isn't 2 times faster, because there is some overhead for creating and synchronizing threads, but in
this case the speed increase is significant.
Aside from the filter
function, the ParallelList
module contains also the
map
function (which does the same thing as the List.map
). There is also
a function set_thread_count
that you can use to configure how many threads should be used when
executing parallel operations (the default value is 2
).
Performance and future work
The performance of these functions is the key issue. Currently the ParallelList
functions work
can't be used for small number of repetitions of simple function, because the overhead is larger than the profit
from parallel execution. If the operation takes less than 0.01ms and the number of repetitions is less than 1000
the List
functions are usually better, but for operations taking longer time the results of ParallelList
are better even with smaller number of repetitions. For operation taking about 0.1ms the ParallelList
gives better results for more than 200 repetitions and for operation taking more than 1ms the number of repetitions is
not very important (ParallelList
is better even for 10 repetitions). These are inaccurate results that I got
on my notebook, so if you're thinking of using the ParallelList
, be sure to do some tests in your scenario!
You can see some tests that I did in the demo project in stats.fs
source file.
As I said earlier, ParallelList
supports only filter
and map
functions, so implementing
more functions would be useful. It would be also useful to provide some alternatives for functions that can't
be executed in parallel (like fold_left
) that could be used in some situations. I'd also like to implement
functions for working with other collection types like Seq
(IEnumerable
) and array
in the future.
I'm interested in your ideas and suggestions, so if you find something that could be improved in the code, or if you have any other idea, let me know!
Downloads
Published: Saturday, 24 March 2007, 11:13 PM
Author: Tomas Petricek
Typos: Send me a pull request!
Tags: functional, parallel, f#