Everybody can use Excel, but creating a web-based data-driven story requires professional developers, if not a team. I'm working on making data-driven storytelling easier, more open and reproducible.
The Gamma is a research project to build tools that easily integrate with modern data sources (open government data, public online sources) and let users easily create visualizations that are directly linked to the data. This makes the visualizations more transparent and reproducible, and also easy to adapt to explore other aspects of the data.
- Visualizing Olympic medalists is a demo that shows what such open data-driven articles could look like. It lets you explore the history of Olympic medals.
- Computation + Journalism 2015 paper about an earlier prototype describes ideas and motivations of the project in more detail. Watch a 15-minute demo or a 45-minute talk from StrangeLoop.
- The Gamma is on GitHub and everything is available under the MIT license. You can learn about the latest news on Twitter at @thegamma_net.
I'm a frequent conference speaker, a founding member of the F# Software Foundation, an author of C# and F# books and the author of many definitive F# libraries. I have been a Microsoft MVP since 2004 and have used F# since early Microsoft Research versions.
Have you seen the F# testimonials and are you wondering how your company can also benefit from the safety, correctness, efficiency and faster time-to-market provided by F#?
- fsharpWorks trainings — At fsharpWorks, we love sharing our knowledge with your team and we offer a wide range of workshops. We created an online course about F# in Finance and Type Providers and we regularly run an in-person course Fast Track to F# in London. We offer all of these and more as on-site trainings too — just drop us an email!
- F# books and articles — I wrote Real World Functional Programming, which explains functional concepts using C# and F#, edited a collection of F# case studies: F# Deep Dives and also wrote a free O'Reilly report: Analyzing and Visualizing Data with F#.
Coeffects and research
I recently submitted my PhD thesis at the University of Cambridge and I closely collaborate with the F# team at Microsoft Research Cambridge.
My recent publications cover a range of topics from theory of context-aware programming, F# and type providers to language extensions for concurrent, reactive and asynchronous programming.
- Coeffects playground is an interactive essay that lets you explore my PhD research in an accessible and fun way. You can read more in our ICFP 2014 paper.
- Academic web page has links to other published papers, work-in-progress drafts, research talks and also information about student projects and courses that I supervised.
Philosophy of science
During my (computer science) PhD, I became interested in how programming language research is done and how it should be done. We tend to think that science has infallible methods for discovering the truth, but is that the case? Or is science more 'sloppy' and 'irrational' than its methodological image, as Paul Feyerabend says?
- History and philosophy of types is my most recent work in this area. It uses types as an example of a concept that appears simple, but is (and needs to be) more complex. Watch my LambdaDays talk or read the full-length Onward! essay.
- Philosophy posts on my blog: start with philosophy and history books every computer scientist should read and come to some of the events organized by the HaPoC Commission.
Thursday, 16 July 2020, 10:20 PM
For a long time, I've been thinking about how to design a data visualization library that would make it easier to compose charts from simple components. On the one hand, there are charting libraries like Google Charts, which offer a long list of pre-defined charts. On the other hand, there are libraries like D3.js, which let you construct any data visualization, but in a very low-level way. There is also Vega, based on the idea of the grammar of graphics, which is somewhere in between, but requires you to specify charts in a fairly complex language including a huge number of transformations that you need to write in JSON.
My final motivation for working on this was the You Draw It article series by the New York Times, which uses interactive charts where the reader first has to make their own guess before seeing the actual data. I wanted to recreate this, but for bar charts, when working on visualizing government spending using The Gamma.
The code for this was somewhat hidden inside The Gamma, but last month, I finally extracted all the functionality into a new stand-alone library Compost.js with simple and clean source code on GitHub and an accompanying paper draft that describes it (PDF).
In this article, I will show how to use Compost.js to implement a "You Draw" bar chart inspired by the NYT article. When loaded, all bars show the average value. You have to drag the bars to positions that you believe represent the actual values. Once you do this, you can click "Show me how I did" and the chart will animate to show the actual data, revealing how good your guess was. Before looking at the code, you can have a look at the resulting interactive chart, which shows the top 5 areas from the 2015 UK budget (in % of GDP).
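The interaction described above can be modelled independently of any charting library. The following is only a minimal sketch of that state logic in Python. It does not use the Compost.js API, and the function names (`initial_guesses`, `drag`, `reveal`) and the sample numbers are hypothetical, chosen purely for illustration.

```python
def initial_guesses(actual):
    """When the chart loads, every bar shows the average of the actual values."""
    average = sum(actual.values()) / len(actual)
    return {label: average for label in actual}


def drag(guesses, label, value):
    """Return a copy of the guesses with one bar moved to the reader's estimate."""
    if label not in guesses:
        raise KeyError(label)
    updated = dict(guesses)
    # A bar cannot be dragged below zero.
    updated[label] = max(0.0, value)
    return updated


def reveal(guesses, actual):
    """'Show me how I did': the per-bar error, computed as guess minus actual."""
    return {label: guesses[label] - actual[label] for label in actual}


# Made-up placeholder values, NOT real budget data.
actual = {"Area A": 2.0, "Area B": 4.0, "Area C": 6.0}

guesses = initial_guesses(actual)        # every bar starts at the average
guesses = drag(guesses, "Area A", 3.0)   # the reader drags one bar
errors = reveal(guesses, actual)         # positive means the guess was too high
```

Keeping the guesses as plain data and returning a new dictionary from `drag` means a rendering layer only has to redraw the bars from the current state. This is one possible design for the sketch, not necessarily how the article's implementation is structured.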
Here you'll find what I'm working on — my blog posts tend to be either updates about projects I'm working on, trainings and talks I'm doing, or longer posts that are early versions of my ideas — some of them become papers, some of them have been cited in other papers, some will be soon forgotten.
Tuesday, 21 April 2020, 2:42 PM
What would a small formal language for data scripting look like? The lambda calculus captures the essence of functional programming. In this article, I present a small formal calculus that captures the essence of data scripting as done, for example, by journalists exploring data using Python and pandas.
Tuesday, 7 April 2020, 11:13 PM
Like software construction, architecture and urban planning often deal with complex systems that evolve over a long period of time. Some of the successful systems are also quite messy despite theories telling us that such systems should not work. These are just a few of the reasons why software engineers can learn interesting things from reading about architecture and urban planning.
Monday, 2 December 2019, 4:48 PM
Rather than answering the question in the title of the essay, I will look at a much more interesting question: what are the kinds of arguments that are employed to support a particular choice? Following this perspective lets us learn what educators consider important in computer science and allows us to make our debates about education more informed.
Friday, 8 February 2019, 11:22 AM
Is there any fundamental knowledge about software engineering that will remain relevant in the next 100 years? In this blog post, I discuss why teaching software engineering in a university environment is difficult. I also suggest how we can design a more useful software engineering course that will not go out of date with the next shift in technologies and methodologies. The key idea is that we need to focus on the motivation behind software engineering and the reasoning that leads to the adoption of particular software engineering methods in the face of particular problems that the software industry is attempting to address.
Monday, 12 November 2018, 12:58 PM
I published papers about programming languages including type providers, theory of coeffects, concurrent and reactive programming, but also philosophy and history of programming. My academic page has a complete list, including teaching and other activities.
Tomas Petricek. Unpublished draft.
This paper presents The Gamma, a data exploration environment for non-experts, based on a single interaction principle. The Gamma allows transfer of knowledge from one data source to another and learning from previously created data analyses. Our approach allows journalists and the public to benefit from the rise of open data, by making data exploration easier, more transparent and more reproducible.
Tomas Petricek. The Art, Science, and Engineering of Programming, 2020
The way data analysts write code is different from the way software engineers do so. They use few abstractions, work interactively and rely heavily on external libraries. We capture this way of working and build a programming environment that makes data exploration easier by providing instant live feedback.
Jonathan Edwards, Stephen Kell, Tomas Petricek, Luke Church. Proceedings of PPIG 2019
Research on programming systems design needs to consider a wide range of aspects in their full complexity. In this paper, we ask whether new media such as multimedia essays can serve as publication formats, more suitable for evaluating programming systems design.