TP

List of publications

Journal and conference publications

The following papers have been published in recognized top-tier programming language conferences and in journals and present my most important results. Further workshop publications and abstracts, many of which are highly-cited, are listed below.

Ascending the Ladder to Self-Sustainability: Achieving Open Evolution in an Interactive Graphical System

Joel Jakubovic, Tomas Petricek. In Proceedings of Onward! Essays 2022

Self-sustaining systems expose their implementation and can be modified and improved from within. However, the few examples that exist are tightly linked to textual language-based accounts of compiler bootstrapping. We explore how the bootstrapping process can work in more structured, visual programming environments.

AI Assistants: A Framework for Semi-Automated Data Wrangling

Tomas Petricek, Gerrit J J van den Burg, Alfredo Nazábal, Taha Ceritli, Ernesto Jiménez-Ruiz, Christopher K I Williams. IEEE Transactions on Knowledge and Data Engineering, 2022

Despite the rise of artificial intelligence, data wrangling remains a tedious and manual task. We introduce AI assistants, semi-automatic interactive tools to streamline data wrangling. An AI assistant guides the analyst through a specific data wrangling task and recommends suitable data transformation, respecting constraints obtained through interaction with the expert.

The Gamma: Programmatic Data Exploration for Non-programmers

Tomas Petricek. In Proceedings of VL/HCC 2022.

Data exploration tools based on code can access any data source, result in reproducible scripts and encourage users to verify, reuse and modify existing code, but can we make them accessible to non-experts? The Gamma is a novel text-based data exploration tool that answers the question in the affirmative.

Linked Visualisations via Galois Dependencies

Roly Perera, Minh Nguyen, Tomas Petricek and Meng Wang. In Proceedings of POPL 2022

We present new language-based dynamic analysis techniques for linking visualisations to data in a fine-grained way, allowing a user to interactively explore how data attributes map to visual or other output elements by selecting substructures of interest.

Covid-19, Charitable Giving and Collectivism: a data-harvesting approach

Peter Taylor-Gooby, Tomas Petricek and Jack Cunliffe. Journal of Social Policy, 2021

This paper charts responses to urgent appeals by welfare charities through crowd funding websites in order to examine the impact of the Covid-19 pandemic on public generosity and social cohesion in the UK.

Programming as Architecture, Design, and Urban Planning

Tomas Petricek. In Proceedings of Onward! Essays 2021

In this essay, I draw a parallel between the world of software and the world of architecture, design and urban planning. I hope to convince the reader that this is a well-justified parallel and I point to a number of discussions in architecture, design and urban planning from which the software world could learn.

Composable data visualizations

Tomas Petricek. Journal of Functional Programming, 2021

Compost is a data visualization library that lets you compose rich interactive data visualizations from a small number of basic primitives. The library is based on the functional programming idea of composable domain-specific languages. This functional programming pearl is a tutorial that introduces Compost through a wide range of examples.

Foundations of a live data exploration environment

Tomas Petricek. The Art, Science, and Engineering of Programming, 2020

The way data analysts write code is different from the way software engineers do so. They use few abstractions, work interactively and rely heavily on external libraries. We capture this way of working and build a programming environment that makes data exploration easier by providing instant live feedback.

What we talk about when we talk about monads

Tomas Petricek. The Art, Science, and Engineering of Programming, 2018

Computer science provides an in-depth understanding of technical aspects of programming concepts, but if we want to understand how programming concepts evolve, how programmers think about them and how they are used in practice, we need to consider a broader perspective. This paper explains how computer scientists and programmers talk about monads and why they do so.

Data exploration through dot-driven development

Tomas Petricek. In proceedings of ECOOP 2017.

Data literacy is becoming increasingly important. While spreadsheets make simple data analytics accessible to a large number of people, creating transparent scripts requires expert programming skills. In this paper, we describe the design of a data exploration language that makes the task more accessible by embedding advanced programming concepts into a simple core language.

Miscomputation in software: Learning to live with errors

Tomas Petricek. The Art, Science, and Engineering of Programming, 2017

Computer programs do not always work as expected. In fact, ominous warnings about the desperate state of the software industry continue to be released with almost ritualistic regularity. In this paper, we look at the 60 years history of programming and at the different practical methods that software community developed to live with programming errors.

Types from data: Making structured data first-class citizens in F#

Tomas Petricek, Gustavo Guerra and Don Syme. In Proceedings of PLDI 2016

The paper presents F# Data, a library of type providers that integrate external data in XML, CSV and JSON formats into the type system of the F# language. F# Data infers the shape of structured documents and uses it to guarantee a relative safety property.

Effect systems revisited - control-flow algebra and semantics

Alan Mycroft, Dominic Orchard, Tomas Petricek.
Semantics, Logics, and Calculi, 2016

This paper joins two recent developments related to effect systems - the development of richer effect systems based on complex algebraic structures and the development of semantic models based on monads. We show how graded joinads link the two developments.

Against a Universal Definition of 'Type'

Tomas Petricek. In Proceedings of Onward! Essays 2015

What is the definition of type? Having a clear and precise answer to this question would avoid many misunderstandings but it would hurt science, "hamper the growth of knowledge" and "deflect the course of investigation into narrow channels of things already understood".

Coeffects: A calculus of context-dependent computation

Tomas Petricek, Dominic Orchard and Alan Mycroft. In Proceedings of ICFP 2014

We present a general calculus for tracking contextual properties of computations, including per-variable properties (usage patterns, caching requirements) and whole-context properties (platform version, rebindable resources). This resolves questions that were left unanswered in the ICALP 2013 paper below.

Coeffects: Unified static analysis of context-dependence

Tomas Petricek, Dominic Orchard and Alan Mycroft. In Proceedings of ICALP 2013.

Effects capture how computations affect the environment, coeffects capture the requirements that computations place on the environment. We present a unified coeffect system that can be used for checking liveness and properties of data-flow or distributed programs,

Recent workshop papers

The following papers cover a wide range of topics from data journalism, theory and practice of programming to philosophy and history of compter science. They appeared in workshop and conference proceedings, but many have been highly influential.

Interaction vs. Abstraction: Managed Copy and Paste

Jonathan Edwards, Tomas Petricek. Proceedings of PAINT 2022

Inspired by version control systems like git, we propose managed copy & paste, in which the programming environment records copy & paste operations, along with structural edit operations, so that it can track the differences between copies and reconcile them on command.

Complementary science of interactive programming systems

Tomas Petricek, Joel Jakubovic. Presented at HaPoC 2021

Is it worth looking at the history of programming in order to discover lost ideas that could be utilized by present-day computer scientists? This paper argues that this is the case and presents an early experiment that recovers a number of interesting programming ideas from, against all odds, Commodore 64 BASIC.

Typed Image-based Programming with Structure Editing

Jonathan Edwards and Tomas Petricek. Human Aspects of Types and Reasoning Assistants, 2021

Many beloved programming system like Smalltalk and LISP are image-based. This encourages exploration and allows flexibility, but it makes collaboration hard. We tackle the problem of schema change in image-based programming systems using static types.

Report on HOPL IV - ACM SIGPLAN History of Programming Languages Conference

Mark Priestley, and Tomas Petricek and David Hemmendinger, IEEE Ann. Hist. Comput. vol 43, no 3

Report on HOPL IV, the latest in the ACM's occasional series of conferences on the history of programming languages that took place on June 20–22, 2021, in association with PLDI 2021.

Evaluating programming systems design

Jonathan Edwards, Stephen Kell, Tomas Petricek, Luke Church. Proceedings of PPIG 2019

Research on programming systems design needs to consider a wide range of aspects in their full complexity. In this paper, we ask whether new media such as multimedia essays can serve as publication formats, more suitable for evaluating programming systems design.

Stories of storytelling about UK’s EU funding

Mariana Marasoiu et al. Proceedings of European Data and Computational Journalism Conference, 2018

We explore the difficulties of open data analysis using the data available on EU funding to the UK. We report on some of the fundamental difficulties we observed whilst analysing this data and we suggest ways of addressing such difficulties.

You guessed it! Reflecting on preconceptions and exploring data without statistics

Pablo León-Villagrá et al. Proceedings of European Data and Computational Journalism Conference, 2018

We live in times in which information is abundant, but trust in expert analysis is low. How can we make complex issues accessible for readers and overcome their preconceptions? We propose a novel way of presenting readers with data and raising awareness for individual bias and preconceptions.

Wrattler: Reproducible, live and polyglot notebooks

Tomas Petricek, James Geddes, Charles Sutton. In Proceedings of TaPP 2018

Notebook systems such as Jupyter are a popular environment for data science, but they use an architecture that leads to a limited model of interaction and makes versioning and reproducibility difficult. Wrattler revisits the architecture and allows richer forms of interactivity, efficient evaluation and guaranteed reproducibility.

Critique of 'An anatomy of interaction: co-occurrences and entanglements'

Tomas Petricek. Proceedings of Salon des Refusés, 2018

The idea of interaction is not just another programming abstraction, but different way of structuring our thinking about programming. This includes thinking about how users can interact with the software more generally, but also what are effective metaphorical ways of thinking about software.

The Gamma: Programming tools for open data-driven storytelling

Tomas Petricek. Proceedings of European Data and Computational Journalism Conference, 2017

The rise of open data initiatives means that there is an increasing amount of raw data available. At the same time, the general public increasingly distrusts statistics and post-truth has been chosen as the word of 2016. The Gamma project aims to help reverse this development.

Work in progress

Start here if you want to know what I'm currently working on. The documents below are unpublished drafts and any feedback is more than welcome!

Cultures of programming: Understanding the history of programming through controversies and technical artifacts

Tomas Petricek. Unpublished draft.

This paper documents the socio-technological context that shapes programming languages. To structure our discussion, we introduce the idea of a culture of programming which embodies a particular perspective on programming. We identify four major cultures: hacker culture, engineering culture, managerial culture and mathematical culture.

Older workshop papers

The following papers cover a wide range of topics from data journalism, theory and practice of programming to philosophy and history of compter science. They appeared in workshop and conference proceedings, but many have been highly influential.

Programming language theory: Thinking the unthinkable

Tomas Petricek. Presented at PPIG 2016.

Our thinking is shaped by basic assumptions that we rarely question. What are some of the hidden assumptions that we never question and that determine how programming languages are designed? And what might the world of programming look like if we based our thinking on different basic principles?

In the Age of Web: Typed Functional-First Programming Revisited

Tomas Petricek, Don Syme. In Post-proceedings of ML 2014

Most programming languages were designed before the age of web. This matters because the web changes many assumptions that typed functional language designers take for granted. In this paper, we present how F# language and libraries face the challenges posed by the web.

Embedding effect systems in Haskell

Dominic Orchard, Tomas Petricek. Haskell 2014

This paper leverages recent advances in Haskell's type system to embed fine-grained monadic computations in Haskell, providing user-programmable effect systems. We explore a number of practical examples that make Haskell even better and safer for effectful programming.

What can Programming Language Research Learn from the Philosophy of Science?

Tomas Petricek. In Proceedings of AISB 2014

This essay looks how theories from philosophy of science apply to programming language research. I argue that it is important to clearly state research programme, avoid early over-emphasis on precision and I consider how to produce truly reusable experiments.

The F# Computation Expression Zoo

Tomas Petricek and Don Syme. In Proceedings of PADL 2014.

F# computation expressions F# provide a uniform syntax for computations such as monads, monad transformers and applicative functors. The syntax is adaptable and close to built-in language features of Python and C#. This article provides the details shows that the right syntax can often be determined by considering laws.

Themes in Information-Rich Functional Programming for Internet-Scale Data Sources

Don Syme, Keith Battocchi, Kenji Takeda, Donna Malayeri and Tomas Petricek. DDFP 2013

In this position paper we describe the key themes in information-rich functional programming that we have observed during the work on F# 3.0 type providers.

Evaluation strategies for monadic computations

Tomas Petricek. In Proceedings of MSFP 2012.

When translating pure functional code to a monadic version, we need to use different translation depending on the desired evaluation strategy. In this paper, we present an unified translation that is parameterized by an operation malias. We also show how to give call-by-need translation for certain monads.

Extending monads with pattern matching

Tomas Petricek, Alan Mycroft and Don Syme. In Haskell Symposium 2011.

The paper introduces a docase notation for Haskell that can be used for any monad that provides additional operations representing parallel composition, choice and aliasing. We require the operations to form a near-semiring, which gurantees that the notation resembles pattern matching.

Joinads: a retargetable control-flow construct for reactive, parallel and concurrent programming

Tomas Petricek and Don Syme. In Proceedings of PADL 2011.

Reactive, parallel and concurrent programming models are often difficult to encode in general-purpose programming languages. We present a lightweight language extension based on pattern matching that can be used for encoding a wide range of these models.

The F# Asynchronous Programming Model

Don Syme, Tomas Petricek and Dmitry Lomov. In PADL 2011.

We describe the asynchronous programming model in F#, and its applications. It combines a core language with a non-blocking modality to author lightweight asynchronous tasks. This allows smooth transitions between synchronous and asynchronous code and eliminates the inversion of control.

Collecting Hollywood's Garbage: Avoiding Space-Leaks in Composite Events

Tomas Petricek and Don Syme. In Proceedings of ISMM 2010.

The article discusses memory leaks that can occur in a reactive programming model based on events. It presents a formal garbage collection algorithm that could be used in this scenario and a correct reactive library based on this idea, implemented in F#.

Encoding monadic computations in C# using iterators

Tomas Petricek. In Proceedings of ITAT 2009

The paper shows how to encode certain monadic computations (such as a continuation monad for asynchronus programming) using the iterator language feature in C# 2.0.

Theses and older projects

Here you'll find my PhD thesis on from University of Cambridge as well as Bachelor and Master thesis from Charles University in Prague. My PhD developed a theory of coeffects; my master thesis introduced joinads and my Bachelor thesis described one of the first functional-language to JavaScript translators.

Context-aware programming languages (PhD thesis)

Tomas Petricek. University of Cambridge, 2017

The key point made by this thesis is the realization that an execution environment or a context is fundamental for writing modern applications and that programming languages should provide abstractions for programming with context and verifying how it is accessed. The thesis summarizes all my work on coeffects.

Reactive Programming with Events (Master thesis)

Master thesis. Charles University in Prague, 2010

The thesis uses early version of joinad language extension for F# and garbage collection for events. Based on these concepts, it builds a simple reactive library for F# that allows writing reactive computations in both control-flow and data-flow styles.

Client-side Scripting using Meta-programming (Bachelor thesis)

Bachelor thesis. Charles University in Prague, 2007

The thesis presents a web development framework for F# that automatically splits a single F# program with monadic modality annotations into client-side JavaScript and server-side ASP.NET application.

Notes, articles and reports

I wrote a couple of articles, notes and reports that were never formally published, but might still be of interest. This includes my work on compiling F# to JavaScript and notes on teaching functional programming.

Teaching Functional Programming to Professional .NET Developers

Tomas Petricek. In Pre-proceedings of TFPIE 2012.

Functional programming is often taught at universities, but with the recent rise of functional programming in the industry, it is becoming important to teach functional concepts to experienced developers. This requires a very different approach.

Effect and coeffect type systems (First-year PhD report)

Tomas Petricek. First year report, Computer Laboratory.

The research goal discussed in the report is to use types in an ML-style language to track additional properties of computations including various kinds of effects (communication, memory access) and coeffects (how a computation depends on a context). The document briefly summarizes the work done during the first-year (including the work on joinads and coeffects) and it proposes future research projects.

Fun with Parallel Monad Comprehensions

Tomas Petricek. The Monad.Reader Issue 18.

The article presents several monads that can benefit from the parallel monad comprehension notation that is supported in the re-designed monad comprehension syntax in Haskell. The examples are inspired by the work on joinads and include parsers, parallel and concurrent computations.

F# Web Tools: Rich client/server web applications in F#

Tomas Petricek. Unpublished draft, 2007

This paper presents one of the first "Ajax" programming frameworks that let you develop integrated client/server applications in an integrated way using a single language. The framework lets you use F# on both sides and provides a smooth integration between client and server-side code.