Tomas Petricek

Searching for new ways of thinking in programming & working with data

I believe that the most interesting work is not the one solving hard problems, but the one changing how we think about the world. I follow this belief in my work on data science tools, functional programming and F# teaching, in my programming languages research and I try to understand it through philosophy of science.

The Gamma

I'm working on making data-driven storytelling easier, more open and reproducible at the Alan Turing Institute.

Consulting

I'm author of definitive F# books and open-source libraries. I offer my F# training and consulting services as part of fsharpWorks.

Academic

I published papers about theory of context-aware programming languages, type providers, but also philosophy of science.

Tomas Petricek
  • Tomas Petricek
  • Home
  • F# Trainings
  • Talks and books
  • The Gamma
  • Academic

PLDI 2010 Trip Report

In June, I attended my first "big" academic conference Programming Languages Design and Implementation (PLDI 2010) in Toronto. I attended the conference, because I presented a paper that I wrote with Don Syme as a result of my internship at Microsoft Research, but I'll write more about that shortly (and thanks to MSR for partly supporting my attendance at the conference!)

As far as I understand it, the focus of the conference is more on implementation. Many people believe that the current programming languages are good enough and we need to make sure that they run well (e.g. compilers optimize code to run better on multi-core) and that we need better tools for working with them (e.g. automatic verification of the code we write), so these areas were the main focus of the conference. However, there were some very interesting talks on the design of programming languages, which is an area that I personally find more interesting...

Garbage Collection in Reactive Scenario

I presented a paper named Collecting Hollywood’s Garbage: Avoiding Space-Leaks in Composite Events at the International Symposium on Memory Management (ISMM 2010), which is a workshop co-located with the main conference. The symposium focuses on various theoretical problems of garbage collection, real-time garbage collection as well as efficient (and concurrent) implementations of GC algorithms.

My work was somewhat more general and I tried to point out that the usual garbage collection techniques do not work well in some scenarios - in particular in reactive programming. This is an observation that many WPF developers already know. This is a reason why we have so-called Weak Event pattern. However, the motivation for my talk was different (arising from the use of F# event combinators such as Event.map) and perhaps more fundamental. Here is a brief summary of my talk:

If we mix the declarative and imperative programming style, which is usual in languages like F# it becomes easy to introduce patterns where the usual garbage collector for objects cannot automatically dispose all components that we intuitively consider garbage. We present a duality between the definitions of garbage for objects and events. In particular, an object is garbage if it cannot be reached from any root objects, but an event is garbage if it doesn't lead to any leaf event that causes some imperative action (in .NET terminology, has an event handler attached). We combine the two dual notions into a single definition that can be used for collecting garbage in a reactive programming model in a mixed functional/imperative language. We present a garbage collection algorithm for this mixed theoretical model and, more importantly, we use the principle to implement a correct reactive library for F#.

After the talk, Dave Detlefs (who works in the CLR team at Microsoft) asked a very interesting question - How do we identify what is an "event" and what is an "object" in the runtime? In my theoretical model, these two are explicitly separated, but in a real-world (e.g. .NET) they would both be some kind of objects. One interesting point about "event" is that it can only trigger other events, but never receives any value back from them. This means that "event" should only call methods that return void. However, this is an interesting problem that deserves further research...

More information

For more information about my work, you can read the full paper and look at the slides from the ISMM workshop:

  • Full text of the paper (PDF) (with Don Syme) - with all appendices
  • Slides from the talk (PPTX, PDF format)

Interesting talks from PLDI

The main conference contained a large number of interesting talks. Here is a brief summary of the talks that I found particularly interesting - mostly because they are in some ways related to my interests (such as F#, functional programming and web programming):

Safe Programmable Speculative Parallelism

(P. Prabhu, G. Ramalingam, K. Vaswani)

If we have a for loop where every iteration is independent of all other iterations (e.g. do not rely on some state calculated by previous iterations), we can easily parallelize the loop, for example, by creating a new task for every iteration. Unfortunately, this isn't always possible as many loops keep some state. Speculative parallelism provides a way for solving this - we specify a prediction function that attempts to guess the next state (without consuming much resources), so that we can run the next iteration in parallel. If the guess was correct, we calculated multiple values in parallel. If no, we need to recalculate the values that depend on the current result. The paper presents a static verification algorithm that checks whether the "recalculation" needs to rollback some side-effects (which would be impossible). The authors also presented a C# library, which allows us to write something like this:

Speculation.Iterate(0, numTasks,
  (i, state) => {
    // Body of the stateful loop 
    // - calculate new state based on current 'state' and index 'i'
  }, (i) =>
    // Prediction function
    // - calculate new state without knowing the current 'state'
  });

This raises some interesting problems for F# as well. It turns out that encoding this style of computations using computation expressions isn't easy, because we need to write two functions for every step (while computation expressions or monads use only a single continuation). If you know about similar types of computations where we need two (or more) functions, I would be quite interested in hearing about them!

Ur: Statically-Typed Metaprogramming with Type-Level Record Computation

(A. Chlipala)

This talk caught my interest because it discussed a language for web programming that appeared to be quite related to what I did some time ago in my F# Web Tools (and indeed, also related to the work by the Links team in Edinburgh). The key topic of the presented paper was a bit different though. The language has a type-system where a type can depend on some computation (dependent types) and focuses on programming with record types and names (which is a very natural way to map SQL databases to a functional language). Some interesting uses of this technique include:

  • Versioning - we can write a (type-safe) reusable function that translates an old version of (any) table to a new version given a few properties (such as name of the primary key and names of new columns).
  • Grids - we can write a reusable function that generates grid for viewing and editing data and takes the type of the table as parameter.

This is something that is possible in F#/.NET only using dynamic reflection that analyzes values and types at runtime. Dependent types make it possible to write this kind of generic code more directly (and in a type-safe way).

Complete Functional Synthesis

(V. Kuncak, M. Mayer, R. Piskac, P. Suter)

The authors present an extension of the Scala language that allows us to write a specification of value (typicaly a number) that we want to get as the result and automatically constructs (synthetizes) code to calculate the number. The specification is simply a Boolean expression that specifies what properties should hold about the value. For example, say we have a value totsec specifying a total number of seconds and we want to convert it to the hh:mm::ss format. Using the new choose construct, we can write:

val (hours, minutes, seconds) = 
  choose((h: Int, m: Int, s: Int) =>
    h * 3600 + m * 60 + s == totsec &&
    0 <= m && m <= 60 &&
    0 <= s && s <= 60 &&
    0 <= h)

This is a bit like generalized pattern matching that makes the language even more declarative. The compiler is smart enough to know whether it can always find a single unique solution to the specification - if that's the case it emits a warning (because we want to avoid introducing non-determinism to our code). This is certainly an interesting extension and I can easily imagine using it in F# (where it could be probably quite nicely implemented using quotations...)

Traceable Data Types for Self-Adjusting Computation

(Umut A. Acar et al.)

The paper presents an easy and efficient way of writing self-adjusting computations in a ML-like language. To example what that means, here is a quote from the abstract: "Self-adjusting computation provides an evaluation model where computations can respond automatically to modifications to their data by using a mechanism for propagating modifications through the computation." This means that we create some functional data-structure (e.g. queue) or perform some computation (e.g. heapsort) that uses some input data. When the input data change, we would typically need to remove some elements to recover the data-structure (sort the list again or rebuild the queue).

When using self-adjusting computations, the compiler automatically keeps track of how the data structure was created and performs the necessary updates. It would re-run the part of the computation (used to construct the data-structure) that is different than in the original run. This seems like an interesting technique that could be used for writing some efficient functional data-structures...

Programming Languages for Web (APLWACA)

Another event that was held as part of the PLDI conference was the APLWACA workshop (the long acronym includes keywords like programming languages, cloud and also web). The workshop was organized for the first time and was very interesting (it is unfortunate that something like this didn't exist a few years ago, because it would be a perfect venue for my draft paper about F# Web Tools, which was never published anywhere). Anyway, it is great to know that there is a place for meeting people interested in this kind of topics from an academic perspective! I'm surely planning to submit something next year (if it will be held again - and I hope it will)

Some of the topics discussed at the workshop this year included security of JavaScript (which is a tricky topic if you want to forbid third-party code to use some features of JavaScript such as XmlHttpRequest, but is very important for sites like Facebook). Another topic that caught my attention was a discussion of the semantics of the HOP language, which is a language for programming interactive client/server web applications.

Published: Monday, 5 July 2010, 11:23 PM
Author: Tomas Petricek
Typos: Send me a pull request!
Tags: universe, academic, meta-programming

Contact & about

This site is hosted on GitHub and is generated using F# Formatting and DotLiquid. For more info, see the website source on GitHub.

Please submit issues & corrections on GitHub. Use pull requests for minor corrections only.

  • Twitter: @tomaspetricek
  • GitHub: @tpetricek
  • Email me: tomas@tomasp.net

Blog archives

October 2020 (1),  July 2020 (1),  April 2020 (2),  December 2019 (1),  February 2019 (1),  November 2018 (1),  October 2018 (1),  May 2018 (1),  September 2017 (1),  June 2017 (1),  April 2017 (1),  March 2017 (2),  January 2017 (1),  October 2016 (1),  September 2016 (2),  August 2016 (1),  July 2016 (1),  May 2016 (2),  April 2016 (1),  December 2015 (2),  November 2015 (1),  September 2015 (3),  July 2015 (1),  June 2015 (1),  May 2015 (2),  April 2015 (3),  March 2015 (2),  February 2015 (1),  January 2015 (2),  December 2014 (1),  May 2014 (3),  April 2014 (2),  March 2014 (1),  January 2014 (2),  December 2013 (1),  November 2013 (1),  October 2013 (1),  September 2013 (1),  August 2013 (2),  May 2013 (1),  April 2013 (1),  March 2013 (1),  February 2013 (1),  January 2013 (1),  December 2012 (2),  October 2012 (1),  August 2012 (3),  June 2012 (2),  April 2012 (1),  March 2012 (4),  February 2012 (5),  January 2012 (2),  November 2011 (5),  August 2011 (3),  July 2011 (2),  June 2011 (2),  May 2011 (2),  March 2011 (4),  December 2010 (1),  November 2010 (6),  October 2010 (6),  September 2010 (4),  July 2010 (3),  June 2010 (2),  May 2010 (1),  February 2010 (2),  January 2010 (3),  December 2009 (3),  July 2009 (1),  June 2009 (3),  May 2009 (2),  April 2009 (1),  March 2009 (2),  February 2009 (1),  December 2008 (1),  November 2008 (5),  October 2008 (1),  September 2008 (1),  June 2008 (1),  March 2008 (3),  February 2008 (1),  December 2007 (2),  November 2007 (6),  October 2007 (1),  September 2007 (1),  August 2007 (1),  July 2007 (2),  April 2007 (2),  March 2007 (2),  February 2007 (3),  January 2007 (2),  November 2006 (1),  October 2006 (3),  August 2006 (2),  July 2006 (1),  June 2006 (3),  May 2006 (2),  April 2006 (2),  December 2005 (1),  July 2005 (4),  June 2005 (5),  May 2005 (1),  April 2005 (3),  March 2005 (3),  January 2005 (1),  December 2004 (3),  November 2004 (2), 

License

Unless explicitly mentioned, all articles on this site are licensed under Creative Commons Attribution Share Alike. All source code samples are licensed under the MIT License.

CC License logo