What I got up to on Thursday

16 November 2008

Last Thursday and Friday, I was very lucky to be invited to the Guardian’s first internal hack day. Whilst it was primarily an internal event, they also invited along a few of their friends to see what we could do with some of their information.

It was a really stimulating two days – exciting to see just what the Guardian is doing with their data and their journalism, and the ways they’re trying to make it more open. A particular highlight was seeing Simon Rogers explain the process of researching infographics and data-sourced news articles, and offering his talent for hunting down data to anyone who needed it; he provided a lot of hackers with useful sets of information that were only ever going to be found through a series of tactical phone calls. For those of us not requesting data to order, the Guardian’s new full-text RSS feeds came in very, very handy, let me tell you.

It was also great to meet some of their technical staff. Obviously, the Guardian developer programme is in safe hands with Matt McAllister, and I’ve known Simon for a while, but it was great to meet lots more of their developers, client-side team and QAs; they were, to a person, lovely and talented, and it’s clear that the Guardian has a deep culture of quality.

I originally wanted to build something along the lines of CelebDAQ but for journalists. The idea would be that you invested in journalists and made returns based on the column inches they filed; the goal was to highlight a lot of the high-volume content on the Guardian website that goes unnoticed, whilst making the more prolific and “celebrity” writers like Charlie Brooker expensive commodities.

Unfortunately, it soon became clear that the volume of scraping and data-parsing I would have to undertake would take far longer than I had planned, and I wasn’t planning on staying up all night.

So I scaled down my thinking, and instead of undertaking “real programming” I started thinking about “neat hacks”, and the result was this:

In a nutshell, it parses the Guardian’s publicly available politics RSS feed, counts the number of names of Labour MPs and of Conservative MPs (not to mention the words “Labour”, “Tory”, and “Conservative”), and then works out the “swing” of the page. That data is then sent over serial to an Arduino, which outputs the result on a little bargraph.
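
For anyone curious how little code that involves, here is a rough sketch of the counting and "swing" steps. It is not the code from the day: the original was written in Ruby, and this is a Python approximation. The term lists, the sample feed, the swing formula (Conservative mentions minus Labour mentions, divided by the total), the ten-segment bargraph and the one-number-per-line serial message are all assumptions made for illustration.

```python
import re
import xml.etree.ElementTree as ET

# Hypothetical, heavily truncated term lists; the hack used full lists of MPs.
LABOUR_TERMS = ["Labour", "Gordon Brown", "Alistair Darling"]
TORY_TERMS = ["Tory", "Conservative", "David Cameron", "George Osborne"]

# Made-up sample feed standing in for the real politics RSS feed.
SAMPLE_RSS = """<rss version="2.0"><channel>
<item><title>Gordon Brown defends Labour budget</title>
<description>David Cameron attacked the plans.</description></item>
<item><title>Tory lead narrows</title>
<description>Labour gains ground.</description></item>
</channel></rss>"""


def feed_text(rss_xml):
    """Join the title and description of every <item> in an RSS 2.0 feed."""
    root = ET.fromstring(rss_xml)
    parts = []
    for item in root.iter("item"):
        parts.append(item.findtext("title", default=""))
        parts.append(item.findtext("description", default=""))
    return " ".join(parts)


def count_terms(text, terms):
    """Count whole-word, case-sensitive occurrences of each term."""
    total = 0
    for term in terms:
        total += len(re.findall(r"\b" + re.escape(term) + r"\b", text))
    return total


def swing(text):
    """Return -1.0 (all Labour) to +1.0 (all Conservative); 0.0 if no mentions."""
    labour = count_terms(text, LABOUR_TERMS)
    tory = count_terms(text, TORY_TERMS)
    total = labour + tory
    if total == 0:
        return 0.0
    return (tory - labour) / total


def bar_level(swing_value, segments=10):
    """Map a swing in [-1, 1] onto 0..segments lit bargraph segments."""
    return round((swing_value + 1) / 2 * segments)


def serial_message(level):
    """Encode the level as an ASCII line, ready to write to a serial port."""
    return f"{level}\n".encode("ascii")


if __name__ == "__main__":
    value = swing(feed_text(SAMPLE_RSS))
    print(value, bar_level(value), serial_message(bar_level(value)))
```

Actually writing those bytes to the Arduino needs a serial library (pyserial is the usual choice in Python), which is left out here so that the sketch runs with the standard library alone.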

It wasn’t the hardest of challenges, but I did get to write some Wiring and learn how to send serial data from Ruby, and I had a lot of fun poking electronic circuits. I was fortunate enough to win a subscription to Make for my troubles, as were the other team of plucky hardware hackers in the room – a lovely surprise to end the two days on.

37 hacks were submitted overall – impressive given the short period of time and how busy everybody was – and they ranged from the entertaining to the remarkably useful, from the thought-provoking to the empowering. Jemima Kiss has written up a few of the stand-out hacks in her Guardian blog post on the event. It was great to see what such a talented – and multi-skilled – room could produce in under 24 hours, and I hope that the internal team at the Guardian enjoyed it as much as I did.

Many thanks to everyone who organised the event, and I look forward to seeing what the Guardian do with their data – and their great hacking – on a larger scale.