TP

# Is deep learning a new kind of programming? Operationalistic look at programming

In most discussions about how to make programming better, someone eventually says something along the lines of "we'll just have to wait until deep learning solves the problem!" I think this is a naively optimistic idea, but it raises one interesting question: In what sense are programs created using deep learning a different kind of programs than those written by hand?

This question recently arose in discussions that we have been having as part of the PROGRAMme project, which explores historical and philosophical perspectives on the question "What is a (computer) program?" and so this article owes much debt to others involved in the project, especially Maël Pégny, Liesbeth De Mol and Nick Wiggershaus.

Many people will intuitively think that, if you train a deep neural network to solve some a problem, you get a different kind of program than if you manually write some logic to solve the problem. But what exactly is the difference? In both cases, the program is a sequence of instructions that are deterministically executed by a machine, one after another, to produce the result.

When reading the excellent book Inventing Temperature by Hasok Chang recently, I came across the idea of operationalism, which I believe provides a useful perspective for thinking about the issue of deep learning and programming. The operationalist point of view was introduced by a physicist Percy Williams Bridgman. To quote: we mean by any concept nothing more than a set of operations; the concept is synonymous with the corresponding set of operations. What does this tell us about deep learning and programming?

Published: Wednesday, 7 October 2020, 1:43 AM
Tags: programming languages, philosophy, research

# Data exploration calculus: Capturing the essence of exploratory data scripting

Most real-world programming languages are too complex to be studied using formal methods. For this reason, academics often work with simple theoretical languages instead. The λ-calculus is a simple formal language that is often used for talking about functional languages, the π-calculus is a model of concurrent programming and there is an entire book, A Theory of Objects modelling various object-oriented systems.

Animation from Financial Times article "Why the world's recycling system stopped working".

Those calculi try to capture the most interesting aspect of the programming language. This is function application in functional programming, sending of messages in concurrent programming and object construction with inheritance in object-oriented programming.

Recently, I have been working on programming tools for data exploration. In particular, I'm interested in the kind of programming that journalists need to do when they work with data. A good example is the coding done for the Why the world's recycling system stopped working article by Financial Times, which is available on GitHub.

Although data journalists and other data scientists use regular programming languages like Python, the kind of code they write is very different from the kind of code you need to write when building a library or a web application in Python.

In a paper Foundations of a live data exploration environment that was published in February 2020 in the open access Programming Journal, I wanted to talk about some interesting work that I've been doing on live previews in The Gamma. For this, I needed a small model of my programming language.

In the end the most interesting aspect of the paper is the definition of the data exploration calculus, a small programming language that captures the kind of code that data scientists write to explore data. This looks quite different from, say, a λ-calculus and π-calculus. It should be interesting not only if you're planning to do theoretical programming language research about data scripting, but also because it captures some of the atypical properties of the programs that data scientists write...

Published: Tuesday, 21 April 2020, 2:42 PM
Tags: academic, research, programming languages, thegamma, data science

# On architecture, urban planning and software construction

Despite having the term science in its name, it is not always clear what kind of discipline computer science actually is. Research on programming is sometimes like science, sometimes like mathematics, sometimes like engineering, sometimes like design and sometimes like art. It also has a long tradition of importing ideas from a wide range of other disciplines.

In this article, I will look at ideas from architecture and urban planning. Architecture has already been an inspiration for design patterns, although some would say that we did quite poor job and imported a trivialized (and not very useful) version of the idea. However, there are many other interesting ideas in architecture and urban planning worth exploring.

To explain why learning from architecture and urban planning is a good idea, I will first discuss similarities between problems solved by architects or urban planners and programmers. I will then look at a number of concrete ideas that we can learn, mostly taking inspiration from four books that I've read recently. There are two general areas:

• First, writing about architecture and urban planning often uses interesting methodologies that research on programming could adopt to gain new insights into systems, programming and its problems.

• Second, there are a number of more concrete ideas in architecture and urban planning that might directly apply to software. For example, can programmers learn how to deal with complexity of software by looking at how urban planners deal with the complexity of cities? Or, can we learn about software maintenance by looking at how buildings evolve in time?

The nature of problems that programmers face are often more similar to the problems that architects and urban planners have to deal with than, say, the problems that scientists, engineers or mathematicians need to solve. We might not want to go all the way and completely rebuild how we do programming to mirror architecture and urban planning, but treating the ideas from those disciplines as equal to those from science or engineering will make programming richer and more productive discipline.

Published: Tuesday, 7 April 2020, 11:13 PM
Tags: academic, programming languages, philosophy, design

# What to teach as the first programming language and why

The number of Google search results for the phrase "choosing the first programming language" at the time of writing is 15,800. This illustrates just how debated the issue of choosing the first programming language is. In this blog post, I will not actually try to answer the question posed in the title of the post. I will not discuss what language we should teach as the first one. Instead, I will look at a more interesting question.

I will investigate the arguments that are used in favour of or against particular programming languages in computer science curriculum. I am more interested in the kind of argumentation that is employed to support a particular choice than in the specific languages involved. This approach is valuable for two reasons. First, by looking at the argumentation used, we can learn what educators consider important about computer science. Second, understanding the motivations behind different arguments allows us to make our own debates about the choice of a programming language more informed.

The scope of this blog post is limited to the choice of the first programming language taught in an undergraduate computer science programmes at universities. This means that I will not discuss other important contexts such as choices at a primary or a secondary education level, choices for independent learners and choices in other university degrees that might involve programming.

Note that this blog post is adapted from an essay that I wrote as part of a Postgrduate Certificate for Higher Education programme at University of Kent, so it assumes less knowledge about programming than a typical reader of my blog has. This makes it accessible to a broader audience thinking about education though!

Published: Monday, 2 December 2019, 4:48 PM
Tags: functional, research, academic, programming languages, philosophy, writing

# Programming as interaction: A new perspective for programming language research

In May, I joined the School of Computing at the University of Kent as a Lecturer (equivalent of Assistant Professor in some other countries). When applying for the job, I spent a lot of time thinking about how to best explain the kind of research that I would like to do. This blog post is a brief summary of my ideas. I'm interested in way too many things, including philosophy and design and data journalism, but this post will be mainly about programming language research. After all, I'm a member of the Programming Languages and Systems group!

Unlike some of my other posts about programming languages, I won't try to convince you that we should be studying programming languages completely differently this time. Instead, I want to describe one simple trick that will make current programming language research much more interesting!

A lot of programming language papers today talk about programs and program properties. In statically typed programming languages, we can check that a program $$e$$ has certain type $$\tau$$, which means that, when the program is run, it will only produce values of the type. This is very nice, but it misses a fundamental thing about programming. How was this program $$e$$ actually constructed?

When programming, you spend most of your time working with programs that are unfinished. This means that they do not do what they are supposed to be (eventually) doing and, very often, they are not well-typed or even syntactically invalid. However, that does not mean that we can afford to ignore them. In many cases, programmers can even run those programs (using REPL or using a notebook environment). In other words, programming language research should not study programs, but should instead study programming!

I'm also writing this because I'll soon be looking for collaborators and PhD students, so if the ideas in this blog post sound interesting to you or if you've been working on something related, please let me know! You can get in touch at @tomaspetricek or email tomas@tomasp.net.

We'll have funding for PhD students from September 2019 and I'm also working on getting money for a post-doc position. All of these are open ended, so if the blog post made you curious (and you wouldn't mind living in Canterbury or London), definitely reach out!

Published: Monday, 8 October 2018, 12:22 PM
Tags: academic, research, programming languages, data science

# Would aliens understand lambda calculus?

Unless you are a sci-fi author or some secret government agency, the question whether aliens would understand lambda calculus is probably not your main practical concern. However, the question is intriguing because it nicely vividly formulates a fundamental question about our formal mathematical knowledge. Are mathematical theories and results about them invented, i.e. constructed by humans, or discovered, i.e. are they eternal truths that exist regardless of whether there are humans to know them?

The question makes for a fantastic late night pub debate, but how can we go about answering it using a more serious methodology? Is there a paper one can read to better understand the problem? Occasionally, a talk or an online comment by a computer scientist comments on this question, but way too often, people miss the fact that the nature of mathematical entities is one of the fundamental questions of philosophy of mathematics. Alas, all those discussions are carefully hidden in the humanities department!

Published: Tuesday, 22 May 2018, 10:27 AM
Tags: academic, research, programming languages, philosophy

# The design side of programming language design

The word "design" is often used when talking about programming languages. In fact, it even made it into the name of one of the most prestigious academic programming conferences, Programming Language Design and Implementation (PLDI). Yet, it is almost impossible to come across a paper about programming languages that uses design methods to study its subject. We intuitively feel that "design" is an important aspect of programming languages, but we never found a way to talk about it and instead treat programming languages as mathematical puzzles or as engineering problems.

This is a shame. Applying design thinking, in the sense used in applied arts, can let us talk about, explore and answer important questions about programming languages that are ignored when we limit ourselves to mathematical or engineering methods. I think the programming language community is, perhaps unconsciously, aware of this - one of the reviews of my recent PLDI paper said "this is a nice, novel design paper, and the community often wants more design papers in our conferences". The problem is that we we do not know how to write and evaluate work that follows design methodology.

To better understand how design works, I recently read The Philosophy of Design by Glenn Parsons. The book perhaps did not answer many of my questions about design, but it did give me a number of ideas about what design is, what questions it can explore and how those could be relevant for the study of programming languages...

Published: Tuesday, 12 September 2017, 5:42 PM
Tags: academic, research, programming languages, philosophy, design

# Papers we Scrutinize: How to critically read papers

As someone who enjoys being at the intersection of the academic world and the world of industry, I'm very happy to see any attempts at bridging this harmful gap. For this reason, it is great to see that more people are interested in reading academic papers and that initiatives like Papers We Love are there to help.

There is one caveat with academic papers though. It is very easy to see academic papers as containing eternal and unquestionable truths, rather than as something that the reader should actively interact with. I recently remarked about this saying that "reading papers" is too passive. I also mentioned one way of doing more than just "reading", which is to write "critical reviews" – something that we recently tried to do at the Salon des Refusés workshop. In this post, I would like to expand my remark.

First of all, it is very easy to miss the context in which papers are written. The life of an academic paper is not complete after it is published. Instead, it continues living its own life – people refer to it in various contexts, give different meanings to entities that appear in the paper and may "love" different parts of the paper than the author. This also means that there are different ways of reading papers. You can try to reconstruct the original historical context, read it according to the current main-stream interpretation or see it as an inspiration for your own ideas.

I suspect that many people, both in academia and outside, read papers without worrying about how they are reading them. You can certainly "do science" or "read papers" without reflecting on the process. That said, I think the philosophical reflection is important if we do not want to get stuck in local maxima.

Published: Wednesday, 12 April 2017, 2:05 PM

# The mythology of programming language ideas

If you read a about the history of science, you will no doubt be astonished by some of the amazing theories that people used to believe. I recently finished reading The Invention of Science by David Wootton, which documents many of them (and is well worth reading, not just because of this!) For example, did you know that if you put garlic on a magnet, the magnet will stop working? Fortunately, you can recover the magnet by smearing goats blood on it. Giambattista della Porta tested this and concluded that it was false, but Alexander Ross argued that our garlic is perhaps not so vigorous as those of ancient Greeks.

You can just laugh at these stories, but they can serve as interesting lessons for any scientist. The lesson, however, is not the obvious one. Academics will sometimes read those stories and use them to argue against something they do not consider scientific - arguing that it is like believing that garlic break magnets.

This is not how the analogy works. What is amazing about the old stories is that the conclusions that now seem funny often had very solid reasoning behind them. If you believed in the basic assumption of the time, then you could reach the same conclusions by following fairly sound reasoning principles. In other words, the amazing theories were scientific and entirely reasonable. The lesson is that what seems a completely reasonable idea now, may turn out to be wrong and quite hilarious in retrospect.

In this article, I will look at a couple of amazing theories that people believed in the past and I will explain why they were reasonable given the way of thinking of the time. Along the way, I will explore some of the ways of thinking that we use today about programming and computer science and why they might appear silly in the future.

Published: Tuesday, 7 March 2017, 3:31 PM

# Towards open and transparent data-driven storytelling: Notes from my Alan Turing Institute talk

As mentioned in an earlier blog post, I've been spending some time at the Alan Turing Institute recently working on The Gamma project. The goal is to make data visualizations on the web more transparent. When you see a visualization online, you should be able to see where the data comes from, how it has been transformed and check that it is not misleading, but you should also be able to modify it and visualize other aspects of the data that interest you.

I gave a talk about my work as part of a talk series at The Alan Turing Institute, which has been recorded and is now available on YouTube. If you prefer to watch talks, this is a good 45 minute overview of what I've been working on with great video quality (the video switches from camera view to screen capture for demos!)

If you prefer text or do not have 45 minutes to watch the talk (right now), I wrote a short summary that highlights the most important ideas from the talk. You can also check out the talk slides, although I'll include the most important ones here.

Published: Thursday, 2 March 2017, 11:53 AM
Tags: thegamma, data journalism, programming languages

# Thinking the unthinkable: What we cannot think in programming

This blog post is an edited and more accessible version of an article Thinking the unthinkable that I recently presented at the PPIG 2016 conference. The original article (PDF) has proper references and more details; the minimalistic talk slides give a quick summary of the article.

If you find this interesting, you might also be interested in a workshop we are planning. To keep in touch leave a comment on GitHub, ping me at @tomaspetricek or email tomas@tomasp.net!

Our thinking is shaped by basic assumptions that we rarely question. These assumptions exist at different scales. Foucault's episteme describes basic assumptions of an epoch (such as Renaissance); Kuhn's research paradigms determine how scientists of a given discipline approach problems and Lakatos' research programmes provide undisputable assumptions followed by a group of scientists.

In this article, I try to discover some of the hidden assumptions in the area of programming language research. What are assumptions that we never question and that determine how programming languages are designed? And what might the world look like if we based our design method on different basic principles?

Published: Tuesday, 11 October 2016, 5:30 PM
Tags: philosophy, programming languages

# Can programming be liberated from function abstraction?

When you start working in the programming language theory business, you'll soon find out that lambda abstraction and functions break many nice ideas or, at least, make your life very hard. For example, type inference is easy if you only have var x = ..., but it gets hard once you want to infer type of x in something like fun x -> ... because we do not know what is assigned to x. Distributed programming is another example - sending around data is easy, but once you start sending around function values, things become hard.

Every programming language researcher soon learns this trick. When someone tells you about a nice idea, you reply "Interesting... but how does this interact with lambda abstraction?" and the other person replies "Whoa, hmm, let me think more about this." Then they go back and either give up, because it does not work, or produce something that works, in theory, well with lambda abstraction, but is otherwise quite unusable.

When working on The Gamma project and the little scripting language it runs, I recently went through a similar thinking process. Instead of letting lambda abstraction spoil the party again, I think we need to think about different ways of code reuse.

Published: Tuesday, 27 September 2016, 4:53 PM
Tags: thegamma, research, programming languages

# The Gamma — Visualizing Olympic Medalists

Olympic Games are perfect opportunity to do a fun data visualization project - just like New Year, you can easily predict when they will happen and you can get something interesting ready in advance. I used this year's Games in Rio as a motivation to resume working on The Gamma project. If you did not read my previous article, the idea is to build tooling for open, reproducible and interactive data-driven storytelling. When you see a visualization, not only you should be able to see how it has been created (what data it uses & how), but you should also be able to modify it, without much programming experience, and look at other interesting aspects of the data.

The first version of The Gamma project tries to do some of this, using historical and current data on Olympic medals as a sample dataset. You can play with the project at The Gamma web site:

Without further ado, here are the most important links if you want to explore how it all works on your own. However, continue reading and I'll describe the most important parts!

The project is still in early stages and the code needs more documentation (and ReadMe files, I know!) However, if you would be interested in using it for something or you have some interesting data to visualize, do not hesitate to ping me at @tomaspetricek. Also, thanks to the DNI Innovation Fund for funding the project and to the Alan Turing Institute for providing a place to work on this over the coming months!

Published: Tuesday, 6 September 2016, 4:37 PM
Tags: thegamma, data journalism, programming languages

# The Gamma and Digital News Innovation Fund

Last year, I wrote a bit about my interest in building programming tools for data journalism. Data journalism might sound like a niche field, but that is not the case if we talk about data-driven storytelling more generally,

In programming, your outcome is typically some application that does stuff. In data science, your outcome is very often a report or a story that aims to influence people's behavior or company decisions. No matter whether you are a journalist writing an article about government spending or an analyst producing internal sales reports, you are telling stories with data.

Being able to tell stories with data (but also verify and assess other people's stories that can be backed by data) is becoming a vital skill in the modern world, which is partly why I find this topic extremely important and interesting. But to do this currently, you need to be a skilled programmer, great designer and good storyteller, all at the same time!

I have not written about this topic much over the last year (mainly because I was busy with Coeffects, fsharpConf, FsLab and fsharpWorks), but this will be changing - I'm very happy to announce that my data-journalism related project The Gamma has been awarded funding from the DNI Innovation Fund and I'll be working on it over the next year at the Alan Turing Institute in London.

Published: Wednesday, 31 August 2016, 1:38 PM
Tags: thegamma, data journalism, programming languages

Combining philosophy and computer science might appear a bit odd. The disciplines have very little overlap. Both philosophers and computer scientists get taught formal logic at some point in their undergraduate courses, but that's probably as close as they get.

But the fact that the disciplines do not overlap much might very well be the reason why putting them together is interesting. In an article about Design and Science, Joichi Ito (from MIT Media Lab), describes the term antidisciplinary and nicely summarizes why looking at such unusual combinations is worthwhile:

Interdisciplinary work is when people from different disciplines work together. But antidisciplinary is something very different; it's about working in spaces that simply do not fit into any existing academic discipline.

[When focusing on disciplines, it] takes more and more effort and resources to make a unique contribution. While the space between and beyond the disciplines can be academically risky, it (...) requires fewer resources to try promising, unorthodox approaches; and provides the potential to have tremendous impact (...).

As you can see from some of my earlier blog posts, I think the space between philosophy and computer science is an interesting area. In this article, I'll explain why. Unlike some of the previous posts (about miscomputation, types and philosophy of science), this post is quite broad and does not go into much detail.

At the danger of sounding like a collection of random rants, I look at a number of questions that arise when you look at computer science from the philosophical perspective, but I won't attempt to answer them. You can see this article as a research proposal too - and I hope to write more about some of the questions in the future. I wish antidisciplinary work was more common and I believe looking into such questions could have the tremendous impact that Joichi Ito mentioned.

Published: Thursday, 26 May 2016, 1:33 PM
Tags: philosophy, programming languages

# Coeffects playground: Interactive essay based on my PhD thesis

In my PhD thesis, I worked on integrating contextual information into a type system of functional programming languages. For example, say your mobile application accesses something from the environment such as GPS sensor or your Facebook friends. With coeffects, this could be a part of the type. Rather than having type string -> Person, the type of a function would also include resources and would be string -{ gps, fb }-> Person. I wrote longer introduction to coeffects on this blog before.

As one might expect, the PhD thesis is more theoretical and it looks at other kinds of contextual information (e.g. past values in stream-based data-flow programming) and it identifies abstract coeffect algebra that captures the essence of contextual information that can be nicely tracked in a functional language.

I always thought that the most interesting thing about the thesis is that it gives people a nice way to think about context in a unified way. Sadly, the very theoretical presentation in the thesis makes this quite hard for those who are not doing programming language theory.

To make it a bit easier to explore the ideas behind coeffects, I wrote a coeffect playground that runs in a web browser and lets you learn about coeffects, play with two example context-aware languages, run a couple of demos and learn more about how the theory works. Go check it out now or continue below to learn more about some interesting internals!

Published: Tuesday, 12 April 2016, 3:33 PM
Tags: coeffects, research, functional programming, programming languages

# The Gamma: Simple code behind interactive articles

There are huge amounts of data around us that we could use to better understand the world. Every company collects large amounts of data about their sales or customers. Governments and international organizations increasingly release interesting data sets to the public through various open government data initiatives (data.gov or data.gov.uk). But raw data does not tell you much.

An interesting recent development is data journalism. Data journalists tell stories using data. A data driven article is based on an interesting observation from the data, it includes (interactive) visualizations that illustrate the point and it often allows the reader to get the raw data.

Adding a chart produced in, say, Excel to an article is easy, but building good interactive visualization is much harder. Ideally, the data driven article should not be just text with static pictures, but a program that links the original data source to the visualization. This lets readers explore how the data is used, update the content when new data is available and change parameters of the visualization if they need to understand different aspect of the topic.

This is in short what I'm trying to build with The Gamma project. If you're interested in building better reports or data driven articles, continue reading!

I did a talk about The Gamma project at the fantastic Future Programming workshop at the StrangeLoop conference last week (thanks for inviting me!) and there is a recording of my 40 minute talk on YouTube, so if you prefer to watch videos, check it out!

Are you a data journalist or data analyst? We're looking for early partners! I joined the EF programme to work on this and if the project sounds like something you'd like to see happen, please get in touch or share your contact details on The Gamma page!

Published: Monday, 28 September 2015, 5:07 PM
Tags: thegamma, type providers, data journalism, programming languages

# Miscomputation: Learning to live with errors

If trials of three or four simple cases have been made, and are found to agree with the results given by the engine, it is scarcely possible that there can be any error (...).

Charles Babbage, On the mathematical
powers of the calculating engine (1837)

Anybody who has something to do with modern computers will agree that the above statement made by Charles Babbage about the analytical engine is understatement, to say the least.

Computer programs do not always work as expected. There is a complex taxonomy of errors or miscomputations. The taxonomy of possible errors is itself interesting. Syntax errors like missing semicolons are quite obvious and are easy to catch. Logical errors are harder to find, but at least we know that something went wrong. For example, our algorithm does not correctly sort some lists. There are also issues that may or may not be actual errors. For example an algorithm in online store might suggest slightly suspicious products. Finally, we also have concurrency errors that happen very rarely in some very specific scenario.

If Babbage was right, we would just try three or four simple cases and eradicate all errors from our programs, but eliminating errors is not so easy. In retrospect, it is quite interesting to see how long it took early computer engineers to realise that coding (i.e. translating mathematical algorithm to program code) errors are a problem:

Errors in coding were only gradually recognized to be a signiﬁcant problem: a typical early comment was that of Miller [circa 1949], who wrote that such errors, along with hardware faults, could be "expected, in time, to become infrequent".

Mark Priestley, Science of Operations (2011)

We mostly got rid of hardware faults, but coding errors are still here. Programmers spent over 50 years finding different practical strategies for dealing with them. In this blog post, I want to look at four of the strategies. Quite curiously, there is a very wide range.

Published: Monday, 27 July 2015, 2:15 PM
Tags: philosophy, research, programming languages