Software designers, not engineers: An interview from alternative universe
While the physicists investigate the nature of the mysterious portal that has recently appeared in North London, several human beings recently came through the portal, which appears to be a gate into an alternative universe. As we understood from the last two people coming through the portal, it seems to be a linked with a universe that is in many ways like ours, reached about the same level of social and technological development, but differs in numerous curious details. The paths through which people in this alternative universe reached similar results as our world are often subtly different.
The most recent visitor from the alternative universe is Ms Zaha Atkinson, who would most likely be titled software engineer in our world, although the title she uses in her home world is software designer. She is a well-known software designer and has been also titled using the strange-sounding title softwarenova, a label that we will soon say more about. As with other technological and societal developments, the alternative universe seems to have arrived at very similar results as our worlds. Software is eating the (alternative) world, but it is built in very different ways. The interview with Ms Zaha Atkinson, presented below, reveals how very different the world of software is when we think of programmers as software designers rather than as software engineers.
This article is a work of fiction. Any resemblance to actual events or persons, living or dead, may or may not be entirely coincidental. It has been largely inspired by the book Designerly Ways of Knowing by Nigel Cross. Ms Zaha Atkinson also may or may not be entirely fictional.
Is deep learning a new kind of programming? Operationalistic look at programming
In most discussions about how to make programming better, someone eventually says something along the lines of "we'll just have to wait until deep learning solves the problem!" I think this is a naively optimistic idea, but it raises one interesting question: In what sense are programs created using deep learning a different kind of programs than those written by hand?
This question recently arose in discussions that we have been having as part of the PROGRAMme project, which explores historical and philosophical perspectives on the question "What is a (computer) program?" and so this article owes much debt to others involved in the project, especially Maël Pégny, Liesbeth De Mol and Nick Wiggershaus.
Many people will intuitively think that, if you train a deep neural network to solve some a problem, you get a different kind of program than if you manually write some logic to solve the problem. But what exactly is the difference? In both cases, the program is a sequence of instructions that are deterministically executed by a machine, one after another, to produce the result.
When reading the excellent book Inventing Temperature by Hasok Chang recently, I came across the idea of operationalism, which I believe provides a useful perspective for thinking about the issue of deep learning and programming. The operationalist point of view was introduced by a physicist Percy Williams Bridgman. To quote: we mean by any concept nothing more than a set of operations; the concept is synonymous with the corresponding set of operations. What does this tell us about deep learning and programming?
Creating interactive You Draw bar chart with Compost
For a long time, I've been thinking about how to design a data visualization library that would make it easier to compose charts from simple components. On the one hand, there are charting libraries like Google Charts, which offer a long list of pre-defined charts. On the other hand, there are libraries like D3.js, which let you construct any data visualization, but in a very low-level way. There is also Vega, based the idea of grammar of graphics, which is somewhere in between, but requires you to specify charts in a fairly complex language including a huge number of transformations that you need to write in JSON.
My final motivation for working on this was the You Draw It article series by New York Times, which uses interactive charts where the reader first has to make their own guess before seeing the actual data. I wanted to recreate this, but for bar charts, when working on visualizing government spending using The Gamma.
The code for this was somewhat hidden inside The Gamma, but last month, I finally extracted all the functionality into a new stand-alone library Compost.js with simple and clean source code on GitHub and an accompanying paper draft that describes it (PDF).
In this article, I will show how to use Compost.js to implement a "You Draw" bar chart inspired by the NYT article. When loaded, all bars show the average value. You have to drag the bars to positions that you believe represent the actual values. Once you do this, you can click "Show me how I did" and the chart will animate to show the actual data, revealing how good your guess was. Before looking at the code, you can have a look at the resulting interactive chart, showing the top 5 areas from the 2015 UK budget (in % of GDP):
Data exploration calculus: Capturing the essence of exploratory data scripting
Most real-world programming languages are too complex to be studied using formal methods. For this reason, academics often work with simple theoretical languages instead. The λ-calculus is a simple formal language that is often used for talking about functional languages, the π-calculus is a model of concurrent programming and there is an entire book, A Theory of Objects modelling various object-oriented systems.
Animation from Financial Times article "Why the world's recycling system stopped working".
Those calculi try to capture the most interesting aspect of the programming language. This is function application in functional programming, sending of messages in concurrent programming and object construction with inheritance in object-oriented programming.
Recently, I have been working on programming tools for data exploration. In particular, I'm interested in the kind of programming that journalists need to do when they work with data. A good example is the coding done for the Why the world's recycling system stopped working article by Financial Times, which is available on GitHub.
Although data journalists and other data scientists use regular programming languages like Python, the kind of code they write is very different from the kind of code you need to write when building a library or a web application in Python.
In a paper Foundations of a live data exploration environment that was published in February 2020 in the open access Programming Journal, I wanted to talk about some interesting work that I've been doing on live previews in The Gamma. For this, I needed a small model of my programming language.
In the end the most interesting aspect of the paper is the definition of the data exploration calculus, a small programming language that captures the kind of code that data scientists write to explore data. This looks quite different from, say, a λ-calculus and π-calculus. It should be interesting not only if you're planning to do theoretical programming language research about data scripting, but also because it captures some of the atypical properties of the programs that data scientists write...
On architecture, urban planning and software construction
Despite having the term science in its name, it is not always clear what kind of discipline computer science actually is. Research on programming is sometimes like science, sometimes like mathematics, sometimes like engineering, sometimes like design and sometimes like art. It also has a long tradition of importing ideas from a wide range of other disciplines.
In this article, I will look at ideas from architecture and urban planning. Architecture has already been an inspiration for design patterns, although some would say that we did quite poor job and imported a trivialized (and not very useful) version of the idea. However, there are many other interesting ideas in architecture and urban planning worth exploring.
To explain why learning from architecture and urban planning is a good idea, I will first discuss similarities between problems solved by architects or urban planners and programmers. I will then look at a number of concrete ideas that we can learn, mostly taking inspiration from four books that I've read recently. There are two general areas:
First, writing about architecture and urban planning often uses interesting methodologies that research on programming could adopt to gain new insights into systems, programming and its problems.
Second, there are a number of more concrete ideas in architecture and urban planning that might directly apply to software. For example, can programmers learn how to deal with complexity of software by looking at how urban planners deal with the complexity of cities? Or, can we learn about software maintenance by looking at how buildings evolve in time?
The nature of problems that programmers face are often more similar to the problems that architects and urban planners have to deal with than, say, the problems that scientists, engineers or mathematicians need to solve. We might not want to go all the way and completely rebuild how we do programming to mirror architecture and urban planning, but treating the ideas from those disciplines as equal to those from science or engineering will make programming richer and more productive discipline.
What to teach as the first programming language and why
The number of Google search results for the phrase "choosing the first programming language" at the time of writing is 15,800. This illustrates just how debated the issue of choosing the first programming language is. In this blog post, I will not actually try to answer the question posed in the title of the post. I will not discuss what language we should teach as the first one. Instead, I will look at a more interesting question.
I will investigate the arguments that are used in favour of or against particular programming languages in computer science curriculum. I am more interested in the kind of argumentation that is employed to support a particular choice than in the specific languages involved. This approach is valuable for two reasons. First, by looking at the argumentation used, we can learn what educators consider important about computer science. Second, understanding the motivations behind different arguments allows us to make our own debates about the choice of a programming language more informed.
The scope of this blog post is limited to the choice of the first programming language taught in an undergraduate computer science programmes at universities. This means that I will not discuss other important contexts such as choices at a primary or a secondary education level, choices for independent learners and choices in other university degrees that might involve programming.
Note that this blog post is adapted from an essay that I wrote as part of a Postgrduate Certificate for Higher Education programme at University of Kent, so it assumes less knowledge about programming than a typical reader of my blog has. This makes it accessible to a broader audience thinking about education though!
What should a Software Engineering course look like?
When I joined the School of Computing at the University of Kent, I was asked what subjects I wanted to teach. One of the topics I chose was Software Engineering. I spent quite a lot of time reading about the history of software engineering when working on my paper on programming errors and I go to a fair number of professional programming conferences, so I thought I can come up with a good way of teaching it! Yet, I was not quite sure how to go about it or even what software engineering actually means.
In this blog post, I share my thought process on deciding what to cover in my Software Engineering module and also a rough list of topics. The introduction explaining why I chose these and how I structure them is perhaps more important than the list itself, but it is fairly long, so if you just want to see a list you can skip ahead to Section 2 (but please read the introduction if you want to comment on the list!) I also add a brief reflection on why I think this is a good approach, referencing a couple of ideas from philosophy of science in Section 3.
Write your own Excel in 100 lines of F#
I've been teaching F# for over seven years now, both in the public F# FastTrack course that we run at SkillsMatter in London and in various custom trainings for private companies. Every time I teach the F# FastTrack course, I modify the material in one way or another. I wrote about some of this interesting history last year in an fsharpWorks article. The course now has a stable half-day introduction to the language and a stable focus on the ideas behind functional-first programming, but there are always new examples and applications that illustrate this style of programming.
For the upcoming December course in London, I added a number of demos and hands-on tasks built using Fable, partly because running F# in a browser is an easy way to illustrate many concepts and partly because Fable has some amazing functional-first libraries.
If you are interested in learning F# and attending our course, the next F# FastTrack takes place on 6-7 December in London at SkillsMatter. We also offer custom on-site trainings. Get in touch at @tomaspetricek or email firstname.lastname@example.org for a 10% discount for the course.
One of the new samples I want to show, which I also live coded at NDC 2018, is building a simple web-based Excel-like spreadsheet application. The spreadsheet demonstrates all the great F# features such as domain modeling with types, the power of compositionality and also how functional-first approach can be amazingly powerful for building user interfaces.
Programming as interaction: A new perspective for programming language research
In May, I joined the School of Computing at the University of Kent as a Lecturer (equivalent of Assistant Professor in some other countries). When applying for the job, I spent a lot of time thinking about how to best explain the kind of research that I would like to do. This blog post is a brief summary of my ideas. I'm interested in way too many things, including philosophy and design and data journalism, but this post will be mainly about programming language research. After all, I'm a member of the Programming Languages and Systems group!
Unlike some of my other posts about programming languages, I won't try to convince you that we should be studying programming languages completely differently this time. Instead, I want to describe one simple trick that will make current programming language research much more interesting!
A lot of programming language papers today talk about programs and program properties. In statically typed programming languages, we can check that a program \(e\) has certain type \(\tau\), which means that, when the program is run, it will only produce values of the type. This is very nice, but it misses a fundamental thing about programming. How was this program \(e\) actually constructed?
When programming, you spend most of your time working with programs that are unfinished. This means that they do not do what they are supposed to be (eventually) doing and, very often, they are not well-typed or even syntactically invalid. However, that does not mean that we can afford to ignore them. In many cases, programmers can even run those programs (using REPL or using a notebook environment). In other words, programming language research should not study programs, but should instead study programming!
I'm also writing this because I'll soon be looking for collaborators and PhD students, so if the ideas in this blog post sound interesting to you or if you've been working on something related, please let me know! You can get in touch at @tomaspetricek or email email@example.com.
We'll have funding for PhD students from September 2019 and I'm also working on getting money for a post-doc position. All of these are open ended, so if the blog post made you curious (and you wouldn't mind living in Canterbury or London), definitely reach out!
Would aliens understand lambda calculus?
Unless you are a sci-fi author or some secret government agency, the question whether aliens would understand lambda calculus is probably not your main practical concern. However, the question is intriguing because it nicely vividly formulates a fundamental question about our formal mathematical knowledge. Are mathematical theories and results about them invented, i.e. constructed by humans, or discovered, i.e. are they eternal truths that exist regardless of whether there are humans to know them?
The question makes for a fantastic late night pub debate, but how can we go about answering it using a more serious methodology? Is there a paper one can read to better understand the problem? Occasionally, a talk or an online comment by a computer scientist comments on this question, but way too often, people miss the fact that the nature of mathematical entities is one of the fundamental questions of philosophy of mathematics. Alas, all those discussions are carefully hidden in the humanities department!
I believe that knowing a bit about philosophy of mathematics is important if we want to have a meaningful debate about philosophical questions of mathematics (sic!) and so I did a talk on this very subject at CodeMesh 2017. This article is slightly refined and hopefully more polished version of the talk for those who, like me, prefer reading over watching. Keep in mind that the question about the nature of mathematical entities is one of the fundamental questions of an entire academic discipline. As such, this article cannot possibly cover all the relevant discussions. Compared to some other writings in this space, this article is, at least, based on a couple of philosophical books that, I believe, have useful things to say on the subject!
The design side of programming language design
The word "design" is often used when talking about programming languages. In fact, it even made it into the name of one of the most prestigious academic programming conferences, Programming Language Design and Implementation (PLDI). Yet, it is almost impossible to come across a paper about programming languages that uses design methods to study its subject. We intuitively feel that "design" is an important aspect of programming languages, but we never found a way to talk about it and instead treat programming languages as mathematical puzzles or as engineering problems.
This is a shame. Applying design thinking, in the sense used in applied arts, can let us talk about, explore and answer important questions about programming languages that are ignored when we limit ourselves to mathematical or engineering methods. I think the programming language community is, perhaps unconsciously, aware of this - one of the reviews of my recent PLDI paper said "this is a nice, novel design paper, and the community often wants more design papers in our conferences". The problem is that we we do not know how to write and evaluate work that follows design methodology.
To better understand how design works, I recently read The Philosophy of Design by Glenn Parsons. The book perhaps did not answer many of my questions about design, but it did give me a number of ideas about what design is, what questions it can explore and how those could be relevant for the study of programming languages...
Getting started with The Gamma just got easier
Over the last year, I have been working on The Gamma project, which aims to make data-driven visualizations more trustworthy and to enable large number of people to build visualizations backed by data. The Gamma makes it possible to create visualizations that are built on trustworthy primary data sources such as the World Bank and you can provide your own data source by writing a REST service.
A great piece of feedback that I got when talking about The Gamma is that this is a nice ultimate goal, but it makes it hard for people to start with The Gamma. If you do not want to use the World Bank data and you're not a developer to write your own REST service, how do you get started?
To make starting with The Gamma easier, the gallery now has a new four-step getting started page where you can upload your data as a CSV file or paste it from Excel spreadsheet and create nice visualizations that let your reader explore other aspects of the data.
Head over to The Gamma Gallery to check it out or continue reading to learn more about creating your first The Gamma visualization...
Papers we Scrutinize: How to critically read papers
As someone who enjoys being at the intersection of the academic world and the world of industry, I'm very happy to see any attempts at bridging this harmful gap. For this reason, it is great to see that more people are interested in reading academic papers and that initiatives like Papers We Love are there to help.
There is one caveat with academic papers though. It is very easy to see academic papers as containing eternal and unquestionable truths, rather than as something that the reader should actively interact with. I recently remarked about this saying that "reading papers" is too passive. I also mentioned one way of doing more than just "reading", which is to write "critical reviews" – something that we recently tried to do at the Salon des Refusés workshop. In this post, I would like to expand my remark.
First of all, it is very easy to miss the context in which papers are written. The life of an academic paper is not complete after it is published. Instead, it continues living its own life – people refer to it in various contexts, give different meanings to entities that appear in the paper and may "love" different parts of the paper than the author. This also means that there are different ways of reading papers. You can try to reconstruct the original historical context, read it according to the current main-stream interpretation or see it as an inspiration for your own ideas.
I suspect that many people, both in academia and outside, read papers without worrying about how they are reading them. You can certainly "do science" or "read papers" without reflecting on the process. That said, I think the philosophical reflection is important if we do not want to get stuck in local maxima.
The mythology of programming language ideas
If you read a about the history of science, you will no doubt be astonished by some of the amazing theories that people used to believe. I recently finished reading The Invention of Science by David Wootton, which documents many of them (and is well worth reading, not just because of this!) For example, did you know that if you put garlic on a magnet, the magnet will stop working? Fortunately, you can recover the magnet by smearing goats blood on it. Giambattista della Porta tested this and concluded that it was false, but Alexander Ross argued that our garlic is perhaps not so vigorous as those of ancient Greeks.
You can just laugh at these stories, but they can serve as interesting lessons for any scientist. The lesson, however, is not the obvious one. Academics will sometimes read those stories and use them to argue against something they do not consider scientific - arguing that it is like believing that garlic break magnets.
This is not how the analogy works. What is amazing about the old stories is that the conclusions that now seem funny often had very solid reasoning behind them. If you believed in the basic assumption of the time, then you could reach the same conclusions by following fairly sound reasoning principles. In other words, the amazing theories were scientific and entirely reasonable. The lesson is that what seems a completely reasonable idea now, may turn out to be wrong and quite hilarious in retrospect.
In this article, I will look at a couple of amazing theories that people believed in the past and I will explain why they were reasonable given the way of thinking of the time. Along the way, I will explore some of the ways of thinking that we use today about programming and computer science and why they might appear silly in the future.
Towards open and transparent data-driven storytelling: Notes from my Alan Turing Institute talk
As mentioned in an earlier blog post, I've been spending some time at the Alan Turing Institute recently working on The Gamma project. The goal is to make data visualizations on the web more transparent. When you see a visualization online, you should be able to see where the data comes from, how it has been transformed and check that it is not misleading, but you should also be able to modify it and visualize other aspects of the data that interest you.
I gave a talk about my work as part of a talk series at The Alan Turing Institute, which has been recorded and is now available on YouTube. If you prefer to watch talks, this is a good 45 minute overview of what I've been working on with great video quality (the video switches from camera view to screen capture for demos!)
If you prefer text or do not have 45 minutes to watch the talk (right now), I wrote a short summary that highlights the most important ideas from the talk. You can also check out the talk slides, although I'll include the most important ones here.
The Gamma dataviz package now available!
There were a lot of rumors recently about the death of facts and even the death of statistics. I believe the core of the problem is that working with facts is quite tedious and the results are often not particularly exciting. Social media made it extremely easy to share your own opinions in an engaging way, but what we are missing is a similarly easy and engaging way to share facts backed by data.
This is, in essence, the motivation for The Gamma project that I've been working on recently. After several experiments, including the visualization of Olympic medalists, I'm now happy to share the first reusable component based on the work that you can try and use in your data visualization projects. If you want to get started:
- Check out thegamma-script package on npm
- Minimal example of thegamma-script in action
- How to use thegamma-script in your projects
The package implements a simple scripting language that anyone can use for writing simple data aggregation and data exploration scripts. The tooling for the scripting language makes it super easy to create and modify existing data analyses. Editor auto-complete offers all available operations and a spreadsheet-inspired editor lets you create scripts without writing code - yet, you still get a transparent and reproducible script as the result.
Thinking the unthinkable: What we cannot think in programming
This blog post is an edited and more accessible version of an article Thinking the unthinkable that I recently presented at the PPIG 2016 conference. The original article (PDF) has proper references and more details; the minimalistic talk slides give a quick summary of the article.
If you find this interesting, you might also be interested in a workshop we are planning. To keep in touch leave a comment on GitHub, ping me at @tomaspetricek or email firstname.lastname@example.org!
Our thinking is shaped by basic assumptions that we rarely question. These assumptions exist at different scales. Foucault's episteme describes basic assumptions of an epoch (such as Renaissance); Kuhn's research paradigms determine how scientists of a given discipline approach problems and Lakatos' research programmes provide undisputable assumptions followed by a group of scientists.
In this article, I try to discover some of the hidden assumptions in the area of programming language research. What are assumptions that we never question and that determine how programming languages are designed? And what might the world look like if we based our design method on different basic principles?
Can programming be liberated from function abstraction?
When you start working in the programming language theory business, you'll soon find out that
lambda abstraction and functions break many nice ideas or, at least, make your life very hard.
For example, type inference is easy if you only have
var x = ..., but it gets hard once you
want to infer type of
x in something like
fun x -> ... because we do not know what is
x. Distributed programming is another example - sending around data is easy, but
once you start sending around function values, things become hard.
Every programming language researcher soon learns this trick. When someone tells you about a nice idea, you reply "Interesting... but how does this interact with lambda abstraction?" and the other person replies "Whoa, hmm, let me think more about this." Then they go back and either give up, because it does not work, or produce something that works, in theory, well with lambda abstraction, but is otherwise quite unusable.
When working on The Gamma project and the little scripting language it runs, I recently went through a similar thinking process. Instead of letting lambda abstraction spoil the party again, I think we need to think about different ways of code reuse.
The Gamma — Visualizing Olympic Medalists
Olympic Games are perfect opportunity to do a fun data visualization project - just like New Year, you can easily predict when they will happen and you can get something interesting ready in advance. I used this year's Games in Rio as a motivation to resume working on The Gamma project. If you did not read my previous article, the idea is to build tooling for open, reproducible and interactive data-driven storytelling. When you see a visualization, not only you should be able to see how it has been created (what data it uses & how), but you should also be able to modify it, without much programming experience, and look at other interesting aspects of the data.
The first version of The Gamma project tries to do some of this, using historical and current data on Olympic medals as a sample dataset. You can play with the project at The Gamma web site:
Without further ado, here are the most important links if you want to explore how it all works on your own. However, continue reading and I'll describe the most important parts!
The project is still in early stages and the code needs more documentation (and ReadMe files, I know!) However, if you would be interested in using it for something or you have some interesting data to visualize, do not hesitate to ping me at @tomaspetricek. Also, thanks to the DNI Innovation Fund for funding the project and to the Alan Turing Institute for providing a place to work on this over the coming months!
The Gamma and Digital News Innovation Fund
Last year, I wrote a bit about my interest in building programming tools for data journalism. Data journalism might sound like a niche field, but that is not the case if we talk about data-driven storytelling more generally,
In programming, your outcome is typically some application that does stuff. In data science, your outcome is very often a report or a story that aims to influence people's behavior or company decisions. No matter whether you are a journalist writing an article about government spending or an analyst producing internal sales reports, you are telling stories with data.
Being able to tell stories with data (but also verify and assess other people's stories that can be backed by data) is becoming a vital skill in the modern world, which is partly why I find this topic extremely important and interesting. But to do this currently, you need to be a skilled programmer, great designer and good storyteller, all at the same time!
I have not written about this topic much over the last year (mainly because I was busy with Coeffects, fsharpConf, FsLab and fsharpWorks), but this will be changing - I'm very happy to announce that my data-journalism related project The Gamma has been awarded funding from the DNI Innovation Fund and I'll be working on it over the next year at the Alan Turing Institute in London.