Everybody can use Excel, but creating a web-based data-driven story requires professional developers, if not a team. I'm working on making data-driven storytelling easier, more open and reproducible.
The Gamma is a research project to build tools that easily integrate with modern data sources (open government data, public online sources) and let users easily create visualizations that are directly linked to the data. This makes the visualizations more transparent and reproducible, but also easy to adapt to explore other aspects of the data.
- Visualizing Olympic medalists is a demo that shows what such open data-driven articles could look like. It lets you explore the history of Olympic medals.
- Computation + Journalism 2015 paper about an earlier prototype describes the ideas and motivations of the project in more detail. Watch a 15-minute demo or a 45-minute talk from StrangeLoop.
- The Gamma is on GitHub and everything is available under the MIT license. You can learn about the latest news on Twitter at @thegamma_net.
I'm a frequent conference speaker, a founding member of the F# Software Foundation, author of C# and F# books and author of many definitive F# libraries. I have been a Microsoft MVP since 2004 and have used F# since the early Microsoft Research versions.
Have you seen the F# testimonials and are you wondering how your company can also benefit from the safety, correctness, efficiency and faster time-to-market provided by F#?
- fsharpWorks trainings — At fsharpWorks, we love sharing our knowledge with your team and we offer a wide range of workshops. We created an online course about F# in Finance and Type Providers and we regularly run an in-person course Fast Track to F# in London. We offer all of these and more as on-site trainings too — just drop us an email!
- F# books and articles — I wrote Real World Functional Programming, which explains functional concepts using C# and F#, edited a collection of F# case studies: F# Deep Dives and also wrote a free O'Reilly report: Analyzing and Visualizing Data with F#.
Coeffects and research
I recently submitted my PhD thesis at the University of Cambridge and I closely collaborate with the F# team at Microsoft Research Cambridge.
My recent publications cover a range of topics, from the theory of context-aware programming, F# and type providers to language extensions for concurrent, reactive and asynchronous programming.
- Coeffects playground is an interactive essay that lets you explore my PhD research in an accessible and fun way. You can read more in our ICFP 2014 paper.
- Academic web page has links to other published papers, work-in-progress drafts, research talks and also information about student projects and courses that I supervised.
Philosophy of science
During my (computer science) PhD, I became interested in how programming language research is done and how it should be done. We tend to think that science has infallible methods for discovering the truth, but is that the case? Or is science more 'sloppy' and 'irrational' than its methodological image, as Paul Feyerabend says?
- History and philosophy of types is my most recent work in this area. It uses types as an example of a concept that appears simple, but is (and needs to be) more complex. Watch my LambdaDays talk or read the full-length Onward! essay.
- Philosophy posts on my blog — start with philosophy and history books every computer scientist should read and come to some of the events organized by the HaPoC Commission.
Wednesday, 7 October 2020, 1:43 AM
In most discussions about how to make programming better, someone eventually says something along the lines of "we'll just have to wait until deep learning solves the problem!" I think this is a naively optimistic idea, but it raises one interesting question: In what sense are programs created using deep learning a different kind of program than those written by hand?
This question recently arose in discussions that we have been having as part of the PROGRAMme project, which explores historical and philosophical perspectives on the question "What is a (computer) program?" and so this article owes much debt to others involved in the project, especially Maël Pégny, Liesbeth De Mol and Nick Wiggershaus.
Many people will intuitively think that, if you train a deep neural network to solve some problem, you get a different kind of program than if you manually write some logic to solve the problem. But what exactly is the difference? In both cases, the program is a sequence of instructions that are deterministically executed by a machine, one after another, to produce the result.
When reading the excellent book Inventing Temperature by Hasok Chang recently, I came across the idea of operationalism, which I believe provides a useful perspective for thinking about the issue of deep learning and programming. The operationalist point of view was introduced by the physicist Percy Williams Bridgman. To quote: "we mean by any concept nothing more than a set of operations; the concept is synonymous with the corresponding set of operations." What does this tell us about deep learning and programming?
Here you'll find what I'm working on — my blog posts tend to be either updates about projects I'm working on, trainings and talks I'm doing, or longer posts that are early versions of my ideas — some of them become papers, some of them have been cited in other papers, some will be soon forgotten.
Thursday, 16 July 2020, 10:20 PM
Compost makes it easy to create custom interactive data visualizations! In this blog post, I explain how to use Compost to create an interactive "You Draw" bar chart inspired by the awesome interactive line charts from the New York Times. The example shows the power of Compost, including the use of domain-specific values, compositionality and the use of the Model View Update architecture for interactivity.
Tuesday, 21 April 2020, 2:42 PM
What would a small formal language for data scripting look like? The lambda calculus captures the essence of functional programming. In this article, I present a small formal calculus that captures the essence of data scripting as done, for example, by journalists exploring data using Python and pandas.
Tuesday, 7 April 2020, 11:13 PM
Like software construction, architecture and urban planning often deal with complex systems that evolve over a long period of time. Some of the successful systems are also quite messy despite theories telling us that such systems should not work. These are just a few of the reasons why software engineers can learn interesting things from reading about architecture and urban planning.
Monday, 2 December 2019, 4:48 PM
Rather than answering the question in the title of the essay, I will look at a much more interesting question: what are the kinds of arguments that are employed to support a particular choice? Following this perspective lets us learn what educators consider important in computer science and allows us to make our debates about education more informed.
Friday, 8 February 2019, 11:22 AM
Is there any fundamental knowledge about software engineering that will remain relevant in the next 100 years? In this blog post, I discuss why teaching software engineering in a university environment is difficult. I also suggest how we can design a more useful software engineering course that will not go out of date with the next shift in technologies and methodologies. The key idea is that we need to focus on the motivation behind software engineering and the reasoning that leads to the adoption of particular software engineering methods in the face of particular problems that the software industry is attempting to address.
I published papers about programming languages including type providers, theory of coeffects, concurrent and reactive programming, but also philosophy and history of programming. My academic page has a complete list, including teaching and other activities.
Tomas Petricek. Unpublished draft.
This paper presents The Gamma, a data exploration environment for non-experts, based on a single interaction principle. The Gamma allows transfer of knowledge from one data source to another and learning from previously created data analyses. Our approach allows journalists and the public to benefit from the rise of open data, by making data exploration easier, more transparent and more reproducible.
Tomas Petricek. The Art, Science, and Engineering of Programming, 2020
The way data analysts write code is different from the way software engineers do so. They use few abstractions, work interactively and rely heavily on external libraries. We capture this way of working and build a programming environment that makes data exploration easier by providing instant live feedback.
Jonathan Edwards, Stephen Kell, Tomas Petricek, Luke Church. Proceedings of PPIG 2019
Research on programming systems design needs to consider a wide range of aspects in their full complexity. In this paper, we ask whether new media such as multimedia essays can serve as publication formats, more suitable for evaluating programming systems design.