Tomas Petricek's Publications

Searching for new ways of thinking in programming & working with data

I'm a visiting researcher at the Alan Turing Institute working on tools for data-driven storytelling. I also work closely with the F# team at Microsoft Research and I recently submitted my PhD thesis at the University of Cambridge. If you only read three of my papers, consider the following ones!

Type Providers

F# Data infers types from sample XML and JSON documents and safely embeds them into F#. Our PLDI 2016 paper is an ACM SIGPLAN Research Highlight.

Coeffects

Coeffects are a theory of context-aware programming languages developed in my PhD thesis. Check out our ICFP 2014 paper or my interactive essay.

Philosophy

How do different communities approach errors? My paper on the history and philosophy of errors, published in ‹Programming› 2017, was selected as a reviewers' choice.

Types from data: Making structured data first-class citizens in F#

Tomas Petricek, Gustavo Guerra, Don Syme

In proceedings of PLDI 2016

Most modern applications interact with external services and access data in structured formats such as XML, JSON and CSV. Static type systems do not understand such formats, often making data access more cumbersome. Should we give up and leave the messy world of external data to dynamic typing and runtime checks? Of course not!

We present F# Data, a library that integrates external structured data into F#. As most real-world data does not come with an explicit schema, we develop a shape inference algorithm that infers a shape from representative sample documents. We then integrate the inferred shape into the F# type system using type providers. We formalize the process and prove a relative type soundness theorem.

Our library significantly reduces the amount of data access code and it provides additional safety guarantees when contrasted with the widely used weakly typed techniques.
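To make the shape inference step from the abstract more concrete, here is a minimal Python sketch. It is an illustration under simplifying assumptions, not the algorithm formalized in the paper and not the F# Data implementation: the shape representation (strings and tagged tuples), the function names `infer`, `join` and `infer_from_samples`, and the sample documents are all hypothetical. The idea it tries to capture is that each sample document yields a shape, and shapes from several samples are combined into one shape that describes all of them.

```python
import json

# Hypothetical shape representation (an assumption for this sketch):
#   "null", "bool", "int", "float", "string", "any", "bottom"
#   ("nullable", shape), ("list", shape), ("record", {field_name: shape})

def tag(shape):
    """Return the kind of a shape: the string itself or the tuple's first item."""
    return shape if isinstance(shape, str) else shape[0]

def nullable(shape):
    """Mark a shape as optional; null, any and optional shapes are unchanged."""
    if tag(shape) in ("null", "any", "nullable"):
        return shape
    return ("nullable", shape)

def unwrap(shape):
    """Strip one optional wrapper, if present."""
    return shape[1] if tag(shape) == "nullable" else shape

def join(a, b):
    """Combine two shapes into a single shape that describes both."""
    if a == b:
        return a
    if a == "bottom":          # "bottom" means "no information yet"
        return b
    if b == "bottom":
        return a
    if a == "null":            # null next to a value makes the value optional
        return nullable(b)
    if b == "null":
        return nullable(a)
    if "nullable" in (tag(a), tag(b)):
        return nullable(join(unwrap(a), unwrap(b)))
    if {tag(a), tag(b)} == {"int", "float"}:
        return "float"         # every int can be read as a float
    if tag(a) == tag(b) == "list":
        return ("list", join(a[1], b[1]))
    if tag(a) == tag(b) == "record":
        # A field missing from one record is treated as null, so it ends up optional.
        names = sorted(a[1].keys() | b[1].keys())
        return ("record", {n: join(a[1].get(n, "null"), b[1].get(n, "null"))
                           for n in names})
    return "any"               # incompatible shapes fall back to "any"

def infer(value):
    """Infer the shape of one parsed JSON value."""
    if value is None:
        return "null"
    if isinstance(value, bool):   # checked before int: bool is a subclass of int
        return "bool"
    if isinstance(value, int):
        return "int"
    if isinstance(value, float):
        return "float"
    if isinstance(value, str):
        return "string"
    if isinstance(value, list):
        element = "bottom"
        for item in value:
            element = join(element, infer(item))
        return ("list", element)
    if isinstance(value, dict):
        return ("record", {k: infer(v) for k, v in value.items()})
    raise TypeError(f"not a JSON value: {value!r}")

def infer_from_samples(samples):
    """Infer one shape from several sample JSON documents (given as strings)."""
    shape = "bottom"
    for text in samples:
        shape = join(shape, infer(json.loads(text)))
    return shape

if __name__ == "__main__":
    samples = ['{"name": "Jan", "age": 25}',
               '{"name": "Tomas"}',
               '{"name": "Alexander", "age": 3.5}']
    print(infer_from_samples(samples))
```

In this sketch, the `age` field is missing from one sample and is an integer in one and a float in another, so the combined shape treats it as an optional float, while `name` stays a plain string. In F# Data, a shape of this kind is then exposed to the programmer as an F# type through a type provider; that step is not modelled here.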

Paper and more information

  • Download the paper (PDF)
  • Supplementary screencast shows the library in action
  • For more info, see F# Data homepage
  • Poster about F# Data (PDF) from Student Research Competition

Watch the talk

As far as I'm aware, the PLDI 2016 talk was not recorded. In it, I gave a brief practical demonstration of the F# Data library, similar to the one I have given in a number of industry talks. One good talk to watch was recorded by Microsoft's Channel 9 team.

BibTeX

If you want to cite the paper, you can use the following BibTeX information, or get full details from the paper page on ACM.

@inproceedings{fsharp-data-pldi2016,
  author    = {Petricek, Tomas and Guerra, Gustavo and Syme, Don},
  title     = {Types from data: Making structured
               data first-class citizens in F\#},
  booktitle = {Proceedings of Conference on Programming
               Language Design and Implementation},
  series    = {PLDI 2016},
  location  = {Santa Barbara, California, USA},
  year      = {2016}
}

If you have any comments, suggestions or related ideas, I'll be happy to hear from you! Send me an email at tomas@tomasp.net or get in touch via Twitter at @tomaspetricek.

Published: Sunday, 1 May 2016, 12:00 AM
Author: Tomas Petricek
Typos: Send me a pull request!

License

Unless explicitly mentioned, all articles on this site are licensed under Creative Commons Attribution Share Alike. All source code samples are licensed under the MIT License.
