Data exploration calculus: Capturing the essence of exploratory data scripting

Most real-world programming languages are too complex to be studied using formal methods. For this reason, academics often work with simple theoretical languages instead. The λ-calculus is a simple formal language that is often used for talking about functional languages, the π-calculus is a model of concurrent programming, and there is an entire book, A Theory of Objects, modelling various object-oriented systems.

Animation from Financial Times article "Why the world's recycling system stopped working".

Each of those calculi tries to capture the most interesting aspect of the kind of programming it models: function application in functional programming, sending of messages in concurrent programming, and object construction with inheritance in object-oriented programming.

Recently, I have been working on programming tools for data exploration. In particular, I'm interested in the kind of programming that journalists need to do when they work with data. A good example is the coding done for the "Why the world's recycling system stopped working" article by the Financial Times, which is available on GitHub.

Although data journalists and other data scientists use regular programming languages like Python, the kind of code they write is very different from the kind of code you need to write when building a library or a web application in Python.

In the paper "Foundations of a live data exploration environment", which was published in February 2020 in the open access Programming Journal, I wanted to talk about some interesting work that I've been doing on live previews in The Gamma. For this, I needed a small model of my programming language.

In the end, the most interesting aspect of the paper is the definition of the data exploration calculus, a small programming language that captures the kind of code that data scientists write to explore data. It looks quite different from, say, the λ-calculus or the π-calculus. It should be interesting not only if you're planning to do theoretical programming language research about data scripting, but also because it captures some of the atypical properties of the programs that data scientists write...

Published: Tuesday, 21 April 2020, 2:42 PM
Tags: academic, research, programming languages, thegamma, data science

On architecture, urban planning and software construction

Despite having the term science in its name, it is not always clear what kind of discipline computer science actually is. Research on programming is sometimes like science, sometimes like mathematics, sometimes like engineering, sometimes like design and sometimes like art. It also has a long tradition of importing ideas from a wide range of other disciplines.

In this article, I will look at ideas from architecture and urban planning. Architecture has already been an inspiration for design patterns, although some would say that we did quite a poor job and imported a trivialized (and not very useful) version of the idea. However, there are many other interesting ideas in architecture and urban planning worth exploring.

To explain why learning from architecture and urban planning is a good idea, I will first discuss similarities between the problems solved by architects and urban planners and those solved by programmers. I will then look at a number of concrete ideas that we can learn, mostly taking inspiration from four books that I've read recently. There are two general areas:

The problems that programmers face are often more similar in nature to the problems that architects and urban planners have to deal with than, say, to the problems that scientists, engineers or mathematicians need to solve. We might not want to go all the way and completely rebuild how we do programming to mirror architecture and urban planning, but treating the ideas from those disciplines as equal to those from science or engineering will make programming a richer and more productive discipline.

Published: Tuesday, 7 April 2020, 11:13 PM
Tags: academic, programming languages, philosophy, design
