# Better F# data science with FsLab and Ionide

At NDC Oslo 2016, I gave a talk about some of the recent F# projects that are making data science with F# even nicer than it used to be. The talk covered a wider range of topics, but one of the nice new things I showed was the improved F# Interactive in the Ionide plugin for Atom and the integration with FsLab libraries that it provides.

In particular, with the latest version of Ionide for Atom and the latest version of the FsLab package, you can run code in F# Interactive and you'll see the resulting time series, data frames, matrices, vectors and charts as nicely pretty-printed HTML objects, right in the editor.

In this post, I'll write about how the new Ionide and FsLab integration works, how you can use it with your own libraries and also about some of the future plans. You can also learn more by getting the FsLab package or watching the NDC talk.

Published: Wednesday, 6 July 2016, 4:03 PM
Tags: f#, fslab, data science

# Upcoming F# events - learn Suave, FsLab & more!

Some people in the F# community have a reputation for traveling too much. I do not know how that is possible, but as it happens, I will be visiting a couple of places in June and doing a number of talks, workshops and courses. So, if you are thinking about getting into F#, web development with F# using the amazing Suave library, playing with the new trendy F# to JavaScript compiler called Fable, or learning about the recent features in FsLab and Ionide, then continue reading!

The map includes all my travels, but not all of the pins are for F# events. I'm visiting Prague just to see my family (even though there is an awesome new F# meetup there) and my stop in Paris is for attending the Symposium for the History and Philosophy of Programming (although we might still do something with the local F# group too).

Published: Tuesday, 31 May 2016, 1:51 AM
Tags: c#, f#, functional programming, talks

Combining philosophy and computer science might appear a bit odd. The disciplines have very little overlap. Both philosophers and computer scientists get taught formal logic at some point in their undergraduate courses, but that's probably as close as they get.

But the fact that the disciplines do not overlap much might very well be the reason why putting them together is interesting. In an article about Design and Science, Joichi Ito (from MIT Media Lab) describes the term antidisciplinary and nicely summarizes why looking at such unusual combinations is worthwhile:

Interdisciplinary work is when people from different disciplines work together. But antidisciplinary is something very different; it's about working in spaces that simply do not fit into any existing academic discipline.

[When focusing on disciplines, it] takes more and more effort and resources to make a unique contribution. While the space between and beyond the disciplines can be academically risky, it (...) requires fewer resources to try promising, unorthodox approaches; and provides the potential to have tremendous impact (...).

As you can see from some of my earlier blog posts, I think the space between philosophy and computer science is an interesting area. In this article, I'll explain why. Unlike some of the previous posts (about miscomputation, types and philosophy of science), this post is quite broad and does not go into much detail.

At the risk of sounding like a collection of random rants, I look at a number of questions that arise when you look at computer science from the philosophical perspective, but I won't attempt to answer them. You can see this article as a research proposal too - and I hope to write more about some of the questions in the future. I wish antidisciplinary work were more common and I believe looking into such questions could have the tremendous impact that Joichi Ito mentioned.

Published: Thursday, 26 May 2016, 1:33 PM
Tags: philosophy, programming languages

# Coeffects playground: Interactive essay based on my PhD thesis

In my PhD thesis, I worked on integrating contextual information into the type system of functional programming languages. For example, say your mobile application accesses something from the environment such as the GPS sensor or your Facebook friends. With coeffects, this could be a part of the type. Rather than having the type string -> Person, the type of a function would also include resources and would be string -{ gps, fb }-> Person. I wrote a longer introduction to coeffects on this blog before.
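To make the idea of a resource-annotated function a bit more concrete, here is a minimal Python sketch. It is not taken from the thesis or the playground: the `Contextual` class, the `compose` helper and the example functions are all hypothetical names, and the resources are tracked at run time rather than in a static type system as coeffects would do. It only illustrates the intuition that a composed function needs the union of the resources of its parts.

```python
from dataclasses import dataclass
from typing import Callable, FrozenSet


@dataclass(frozen=True)
class Contextual:
    """A function paired with the set of resources it needs (hypothetical)."""
    run: Callable
    requires: FrozenSet[str]


def compose(f: Contextual, g: Contextual) -> Contextual:
    """Apply f, then g; the composition needs everything either part needs."""
    return Contextual(lambda x: g.run(f.run(x)), f.requires | g.requires)


# Stand-ins for functions that would read the GPS sensor and Facebook friends.
locate = Contextual(lambda name: (name, "somewhere"), frozenset({"gps"}))
to_person = Contextual(
    lambda pair: {"name": pair[0], "location": pair[1]}, frozenset({"fb"})
)

# Loosely corresponds to the annotation string -{ gps, fb }-> Person.
find_person = compose(locate, to_person)
print(sorted(find_person.requires))
```

In this sketch the annotation is just a value you can inspect; the point of a coeffect type system is that the same information would be checked before the program runs.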

As one might expect, the PhD thesis is more theoretical. It looks at other kinds of contextual information (e.g. past values in stream-based data-flow programming) and it identifies an abstract coeffect algebra that captures the essence of contextual information that can be nicely tracked in a functional language.

I always thought that the most interesting thing about the thesis is that it gives people a nice way to think about context in a unified way. Sadly, the very theoretical presentation in the thesis makes this quite hard for those who are not doing programming language theory.

To make it a bit easier to explore the ideas behind coeffects, I wrote a coeffect playground that runs in a web browser and lets you learn about coeffects, play with two example context-aware languages, run a couple of demos and learn more about how the theory works. Go check it out now or continue below to learn more about some interesting internals!

Published: Tuesday, 12 April 2016, 3:33 PM
Tags: coeffects, research, functional programming, programming languages

# Happy New Year 2016 around the World

Just like last year and the year before, I wanted to participate in the #FsAdvent event, where someone writes a blog post about something they did with F# during December. Thanks to Sergey Tihon for the organization of the English version and the Japanese F# community for coming up with the idea a few years ago!

As my blog post ended up on 31 December, I wanted to do something that would fit the theme of ending 2015 and starting the new year 2016, so I decided to write a little interactive web site that tracks the "Happy New Year" tweets live across the globe. This is partly inspired by Happy New Year Tweets from Twitter in 2014, but rather than analyzing data in retrospect, you can watch 2016 come live!

Published: Wednesday, 30 December 2015, 6:09 PM
Tags: f#, data journalism, data science, visualization

# Philosophy of science books every computer scientist should read

When I tell my fellow computer scientists or software developers that I'm interested in philosophy of science, they first look a bit confused, then we have a really interesting discussion about it and then they ask me for some interesting books they could read about it. Given that Christmas is just around the corner and some of the readers might still be looking for a good present to get, I thought that now is the perfect time to turn my answer into a blog post!

So, what is philosophy of science about? In summary, it is about trying to better understand science. I'll keep using the word science here, but I think engineering would work equally well. As someone who recently spent a couple of years doing a PhD on programming language theory, I find this extremely important for computer science (and programming). How can we make better programming languages if we do not know what better means? And what do we mean when we talk about very basic concepts like types or programming errors?

Reading about philosophy of science inspired me to write a couple of essays on some of the topics above including What can programming language research learn from the philosophy of science? and two essays that discuss the nature of types in programming languages and also the nature of errors and miscomputations. This blog post lists some of the interesting books that I've read and that influenced my thinking (not just) when writing the aforementioned essays.

Published: Thursday, 10 December 2015, 12:42 PM
Tags: philosophy, research, talks

# F# + ML |> MVP Summit Talks

I was fortunate enough to make it to the Microsoft MVP summit this year. I didn't learn anything secret (and even if I did, I wouldn't tell you!) but one thing I did learn is that there is a lot of interest in data science and machine learning both inside Microsoft and in the MVP community. What was less expected and more exciting was that there was also a lot of interest in F#, which is a perfect fit for both of these topics!

When I visited Microsoft back in May to talk about Scalable Machine Learning and Data Science with F# at an internal event, I ended up chatting with the organizer about F# and we agreed that it would be nice to do more on F#, which is how we ended up organizing the F# + ML |> MVP Summit 2015 mini-conference on the Friday after the summit.

Published: Wednesday, 18 November 2015, 2:03 AM
Tags: f#, fslab, talks, machine learning, data science

# The Gamma: Simple code behind interactive articles

There are huge amounts of data around us that we could use to better understand the world. Every company collects large amounts of data about their sales or customers. Governments and international organizations increasingly release interesting data sets to the public through various open government data initiatives (data.gov or data.gov.uk). But raw data does not tell you much.

An interesting recent development is data journalism. Data journalists tell stories using data. A data driven article is based on an interesting observation from the data, it includes (interactive) visualizations that illustrate the point and it often allows the reader to get the raw data.

Adding a chart produced in, say, Excel to an article is easy, but building a good interactive visualization is much harder. Ideally, the data driven article should not be just text with static pictures, but a program that links the original data source to the visualization. This lets readers explore how the data is used, update the content when new data is available and change parameters of the visualization if they need to understand a different aspect of the topic.

This is in short what I'm trying to build with The Gamma project. If you're interested in building better reports or data driven articles, continue reading!

I did a talk about The Gamma project at the fantastic Future Programming workshop at the StrangeLoop conference last week (thanks for inviting me!) and there is a recording of my 40 minute talk on YouTube, so if you prefer to watch videos, check it out!

Are you a data journalist or data analyst? We're looking for early partners! I joined the EF programme to work on this and if the project sounds like something you'd like to see happen, please get in touch or share your contact details on The Gamma page!

Published: Monday, 28 September 2015, 5:07 PM
Tags: type providers, data journalism, programming languages

# Creating web sites with Suave: How to contribute to F# Snippets

The core of many web sites and web APIs is very simple. Given an HTTP request, produce an HTTP response. In F#, we can represent this as a function with type Request -> Response. To make our server scalable, we should make the function asynchronous to avoid unnecessary blocking of threads. In F#, this can be captured as Request -> Async<Response>. Sounds pretty simple, right? So why are there so many evil frameworks that make simple web programming difficult?
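The shape of such a function can be sketched in a few lines. The following is a Python approximation of the Request -> Async<Response> idea, using a coroutine in place of an F# async workflow. The `Request` and `Response` types and the `handle` function are hypothetical and deliberately tiny; this is not how Suave defines them.

```python
import asyncio
from dataclasses import dataclass


@dataclass
class Request:
    path: str


@dataclass
class Response:
    status: int
    body: str


async def handle(request: Request) -> Response:
    """An asynchronous handler: takes a request, eventually yields a response."""
    await asyncio.sleep(0)  # stands in for non-blocking I/O such as a database call
    if request.path == "/hello":
        return Response(200, "Hello world")
    return Response(404, "Not found")


# Run a single request through the handler, outside of any real web server.
response = asyncio.run(handle(Request("/hello")))
print(response.status, response.body)
```

Because the handler is just a function, it can be tested without starting a server, which is a large part of the appeal of this style.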

Fortunately, there is a nice F# library called Suave.io that is based exactly on the above idea:

Suave is a simple web development F# library providing a lightweight web server and a set of combinators to manipulate route flow and task composition.

I recently decided to start a new version of the F# Snippets web site and I wanted to keep the implementation functional, simple, cross-platform and easy to contribute to. I wrote a first prototype of the implementation using Suave and already received a few contributions via pull requests! In this blog post, I'll share a few interesting aspects of the implementation and I'll give you some good pointers where you can learn more about Suave. There is no excuse for not contributing to F# Snippets v2 after reading this blog post!

Published: Tuesday, 15 September 2015, 11:26 PM
Tags: f#, web

# In the age of the web: Typed functional-first programming revisited

Most programming languages were designed before the age of the web. This matters because the web changes many assumptions that typed functional language designers take for granted. For example, programs do not run in a closed world, but must instead interact with (changing and likely unreliable) services and data sources, communication is often asynchronous or event-driven, and programs need to interoperate with untyped environments like JavaScript libraries.

How can statically-typed programming languages adapt to the modern world? In this article, I look at one possible answer that is inspired by the F# language and various F# libraries. In F#, we use type providers for integration with external information sources and for integration with untyped programming environments. We use lightweight meta-programming for targeting JavaScript and computation expressions for writing asynchronous code.

This blog post is a shorter version of an ML workshop paper that I co-authored earlier this year and you should see this more as a position statement. I'm not sure if F# and the solutions shown here are the best ones, but I think they highlight very important questions in programming language design that I very much see as unsolved.

The article has two sections. First, I'll go through a simple case study showing how F# can be used to build a client-side web widget. Then, I'll discuss some of the implications for programming language design based on the example.

Published: Wednesday, 9 September 2015, 5:14 PM
Tags: f#, type providers, web, functional programming, research

# Miscomputation: Learning to live with errors

If trials of three or four simple cases have been made, and are found to agree with the results given by the engine, it is scarcely possible that there can be any error (...).

Charles Babbage, On the mathematical powers of the calculating engine (1837)

Anybody who has something to do with modern computers will agree that the above statement made by Charles Babbage about the analytical engine is an understatement, to say the least.

Computer programs do not always work as expected. There is a complex taxonomy of errors or miscomputations, and the taxonomy is itself interesting. Syntax errors like missing semicolons are quite obvious and are easy to catch. Logical errors are harder to find, but at least we know that something went wrong. For example, our algorithm does not correctly sort some lists. There are also issues that may or may not be actual errors. For example, an algorithm in an online store might suggest slightly suspicious products. Finally, we also have concurrency errors that happen very rarely in some very specific scenario.

If Babbage were right, we would just try three or four simple cases and eradicate all errors from our programs, but eliminating errors is not so easy. In retrospect, it is quite interesting to see how long it took early computer engineers to realise that coding errors (i.e. errors in translating a mathematical algorithm to program code) are a problem:

Errors in coding were only gradually recognized to be a significant problem: a typical early comment was that of Miller [circa 1949], who wrote that such errors, along with hardware faults, could be "expected, in time, to become infrequent".

Mark Priestley, Science of Operations (2011)

We mostly got rid of hardware faults, but coding errors are still here. Programmers spent over 50 years finding different practical strategies for dealing with them. In this blog post, I want to look at four of the strategies. Quite curiously, there is a very wide range.

Published: Monday, 27 July 2015, 2:15 PM
Tags: philosophy, research, programming languages

# Visualizing interesting world facts with FsLab

In case you missed my recent official FsLab announcement, FsLab is a data-science package for .NET built around F# that makes it easy to get data using type providers, analyze them interactively (with great R integration) and visualize the results. You can find more on fslab.org, which also has links to some videos and a download page with templates and other instructions.

Last time, I mentioned that we are working on integrating FsLab with the XPlot charting library. XPlot is a wonderful F# library built by Taha Hachana that wraps two powerful HTML5 visualization libraries - Google Charts and plot.ly.

I thought I'd see what interesting visualizations I can build with XPlot, so I opened the World Bank type provider to get some data about the world and the Euro area, to make the blog post relevant to what is happening in the world today.

Published: Tuesday, 30 June 2015, 4:07 PM
Tags: f#, fslab, data science

# Against the definition of types

Science is much more 'sloppy' and 'irrational' than its methodological image.

Paul Feyerabend, Against Method (1975)

Programming languages are a fascinating area because they combine computer science (and logic) with many other disciplines including sociology, human computer interaction and things that cannot be scientifically quantified like intuition, taste and (for better or worse) politics.

When we talk about programming languages, we often treat it mainly as a scientific discussion seeking some objective truth. This is not surprising - science is surrounded by an aura of perfection and so it is easy to think that focusing on the core scientific essence (and leaving out everything else) is the right way of looking at programming languages.

However, this leaves out many things that make programming languages interesting. I believe that one way to fill the gap is to look at philosophy of science, which can help us understand how programming language research is done and how it should be done. I wrote about the general idea in a blog post (and essay) last year. Today, I want to talk about one specific topic: What is the meaning of types?

This blog post is a shorter (less philosophical and more to the point) version of an essay that I submitted to Onward! Essays 2015. If you want to get a quick peek at the ideas in the essay, then continue reading here! If you want to read the full essay (or save it for later), you can get the full version from here.

Published: Thursday, 14 May 2015, 3:46 PM
Tags: philosophy, research

# Announcing FsLab: Data science package for Mono and .NET

After over a year of working on FsLab and talking about it at conferences, it is finally time for an official announcement. So, today, I'm excited to announce FsLab - a cross-platform package for doing data science with .NET and Mono.

It is probably not necessary to explain why data science is an important area. We live surrounded by information, but extracting useful knowledge from the vast amounts of data is not an easy task. You have to access data in different formats (JSON-based REST services, XML, CSV files or even HTML tables), you need to deal with missing values, combine and align data from multiple sources and then build visualizations (or reports) to tell the right story.

The goal of FsLab is to make this process easier. FsLab combines the power of F# type providers, the efficiency and robustness of Mono and .NET and the high quality engineering of the open-source ecosystem around F# and C#.

Published: Tuesday, 5 May 2015, 3:55 PM
Tags: f#, fslab, data science

# Comparing date range handling in C# and F#

I was recently working on some code for handling date ranges in Deedle. Although Deedle is written in F#, I also wrote some internal integration code in C#. After doing that, I realized that the code I wrote is actually reusable and should be a part of Deedle itself and so I went through the process of rewriting a simple function from (fairly functional) C# to F#. This is a small (and by no means representative!) example, but I think it nicely shows some of the reasons why I like F#, so I thought I'd share it.

One thing that we are adding to Deedle is a "BigDeedle" implementation of internal data structures. The idea is that you can load very big frames and series without actually loading all data into memory.

When you perform slicing on a large series and then merge some of the parts of the series (say, years 2010, 2012 and 2014), you end up with a series that combines a couple of chunks. If you then restrict the series (say, from June 2012 to June 2014), you need to restrict the ranges of the chunks:

As the diagram shows, this is just a matter of iterating over the chunks, keeping those in the range, dropping those outside of the range and restricting the boundaries of the other chunks. So, let's start with the C# version I wrote.
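Before looking at the C# and F# versions, the restriction step itself can be sketched language-neutrally. The following Python sketch is not the Deedle code; the `restrict_ranges` name and the representation of a chunk as an inclusive (start, end) pair of dates are assumptions made purely for illustration.

```python
from datetime import date
from typing import List, Tuple

Range = Tuple[date, date]  # inclusive (start, end), an assumed representation


def restrict_ranges(chunks: List[Range], lo: date, hi: date) -> List[Range]:
    """Keep the parts of each chunk that fall inside the range lo..hi."""
    restricted = []
    for start, end in chunks:
        if end < lo or start > hi:
            continue  # chunk lies entirely outside of the range: drop it
        # Chunks fully inside are unchanged; overlapping ones get clipped.
        restricted.append((max(start, lo), min(end, hi)))
    return restricted


# Years 2010, 2012 and 2014, restricted to June 2012 through June 2014.
chunks = [
    (date(2010, 1, 1), date(2010, 12, 31)),
    (date(2012, 1, 1), date(2012, 12, 31)),
    (date(2014, 1, 1), date(2014, 12, 31)),
]
result = restrict_ranges(chunks, date(2012, 6, 1), date(2014, 6, 30))
for start, end in result:
    print(start, end)
```

With this input, the 2010 chunk is dropped and the 2012 and 2014 chunks are clipped at the lower and upper boundary respectively.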

Published: Wednesday, 22 April 2015, 4:55 PM
Tags: f#, c#, deedle, linq, functional programming