Over the past three months, Bocoup has been working closely with the Guardian Interactive team on the Miso Project, a set of open source libraries designed to expedite and simplify the creation of data-driven interactive content. We are excited to announce the release of the first of these libraries, Dataset. You can see the code on GitHub.

Traditionally, data-driven interactive applications follow a similar workflow: remote data sources are fetched, the data is parsed to match the client-side representation of the model, and it is then perhaps transformed and queried to obtain the information the rendering layer actually requires. Each of these steps can be accomplished through custom code or a collection of existing libraries and frameworks. When writing Dataset, we wanted to simplify this part of the workflow by creating a single library that manages the entire process.
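
To make that workflow concrete, here is a minimal sketch of the three steps in plain TypeScript. It does not use Dataset, and every name in it (rawCsv, parseCsv, maxOf) is hypothetical and chosen for illustration only. The sample data is made up, an inline string stands in for the remote fetch, and the parser is deliberately naive: it assumes no quoted fields or embedded commas.

```typescript
// Hypothetical sketch of fetch -> parse -> query. Not Dataset's API.
type Cell = string | number;
type Row = Record<string, Cell>;

// Step 1: "fetch". A real application would request a remote source;
// an inline string with made-up values stands in for the response.
const rawCsv = "day,commits\nMon,4\nTue,9\nWed,2";

// Step 2: parse the raw text into client-side row objects.
function parseCsv(text: string): Row[] {
  const lines = text.trim().split("\n");
  const names = (lines[0] ?? "").split(",");
  return lines.slice(1).map((line) => {
    const cells = line.split(",");
    const row: Row = {};
    names.forEach((name, i) => {
      const value = cells[i] ?? "";
      const asNumber = Number(value);
      // Keep numeric-looking cells as numbers, everything else as strings.
      row[name] = value !== "" && !Number.isNaN(asNumber) ? asNumber : value;
    });
    return row;
  });
}

// Step 3: query the parsed rows for what the rendering layer needs.
function maxOf(rows: Row[], column: string): number {
  return Math.max(...rows.map((row) => Number(row[column])));
}

const rows = parseCsv(rawCsv);
const busiest = maxOf(rows, "commits");
```

Even in this toy form, each step is separate code that has to agree on the shape of the data, which is the glue a single library can take over.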

Dataset comes with a growing list of examples that showcase not only its ease of use but also how easily it integrates with existing libraries. One example is a quick bar chart of Dataset’s own GitHub repo commit history, drawn with jQuery Sparklines.

Some of Dataset’s facilities echo common patterns that we already see in MVC frameworks: easy tie-in to API endpoints and common client-side models that make the data accessible. The Dataset structure is itself an abstraction similar to Backbone collections or Ember arrays. While many frameworks provide these facilities for handling sets of data, our focus is on providing a more efficient implementation with more extensive APIs for manipulating the incoming data. By making data a first-class citizen in an MVC application, Dataset functions as a management layer that can be used as a step before, during or after an MVC framework like Backbone.js. It is our goal to grow the library in a way that facilitates interoperability with those frameworks, so we look forward to hearing how you might use it in your workflow.

Available Features

Dataset has a variety of features that try to cover the common set of functionality required by client-side data-driven applications:

  • A series of importers that fetch data from local and remote sources, such as Google Spreadsheets.
  • A variety of parsers that transform incoming data from formats such as CSV into our standard, fast-to-traverse format.
  • A series of computational functions are available to easily obtain metrics about one or more columns in the data, such as min/max and groupBy.
  • A simple and powerful filtering API that lets you create sub-selections of the data that match a particular set of conditions.
  • An event system that allows subscription to specific data changes such as new rows being added or existing rows being updated.
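
As a rough illustration of what the last three features mean in practice, the sketch below implements a grouped sum, a filter and an "add" event on a tiny in-memory table. This is plain TypeScript written for this post's sake, not Dataset's API: TinyTable, onAdd, where and sumBy are all hypothetical names, and the rows are made-up sample data.

```typescript
// Hypothetical sketch of filtering, grouping and change events.
// Names are illustrative only and do not reflect Dataset's API.
type TableRow = Record<string, string | number>;
type AddListener = (row: TableRow) => void;

class TinyTable {
  private rows: TableRow[] = [];
  private listeners: AddListener[] = [];

  // Event system: subscribe to new rows being added.
  onAdd(listener: AddListener): void {
    this.listeners.push(listener);
  }

  // Store the row, then notify every subscriber.
  add(row: TableRow): void {
    this.rows.push(row);
    this.listeners.forEach((listener) => listener(row));
  }

  // Filtering: a sub-selection of rows matching a condition.
  where(predicate: (row: TableRow) => boolean): TableRow[] {
    return this.rows.filter(predicate);
  }

  // Computation: group rows by one column and sum another.
  sumBy(groupColumn: string, valueColumn: string): Map<string, number> {
    const totals = new Map<string, number>();
    for (const row of this.rows) {
      const key = String(row[groupColumn]);
      totals.set(key, (totals.get(key) ?? 0) + Number(row[valueColumn]));
    }
    return totals;
  }
}

// Usage with made-up sample rows.
const commits = new TinyTable();
const added: string[] = [];
commits.onAdd((row) => {
  added.push(String(row["author"]));
});
commits.add({ author: "ana", changes: 12 });
commits.add({ author: "ben", changes: 3 });
commits.add({ author: "ana", changes: 5 });

const large = commits.where((row) => Number(row["changes"]) > 4);
const totals = commits.sumBy("author", "changes");
```

The subscriber sees each row as it arrives, while where and sumBy read the stored rows on demand; a fuller library would also need update and remove events.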

While Dataset was written to facilitate browser-based data management, thanks to Tim Branyen’s efforts it is now also available as a Node.js module.

Why We Built It

At Bocoup, we are committed to moving the Open Web forward through developing new Open Web technologies for industries in transition to the web. We want to ensure that as this happens, the best software tools for doing so are open.

Inspired by the needs of journalism, we set out to work with the Guardian, a leader in Open Journalism, to develop Open Web data journalism tools. As we began building Dataset, we realised that its facilities are valuable for other software paradigms that we work on at Bocoup, and so we are excited to release a tool that focuses on client-side data management.

We as web developers have a lot to learn about narrative and storytelling in our data-focused web applications. In light of this, it is especially compelling for us to focus on an initiative like the Miso Project, which bridges this gap.