Charles Arthur recently wrote that if [he] had one piece of advice to a journalist starting out now, it would be: learn to code.
I understand the point he’s making, but I think there’s a further degree of subtlety to the argument. After all, learning to code is hard. Learning to glue together bits of scripts, and later bash your way into scripting langauges really is useful, but even that isn’t easy. It requires you to learn to translate intent into code, to know what’s possible, to know what’s easy and what’s hard, and to know what to do when third-party things you’re glueing together don’t work.
In short: it’s really easy to make a mess, and a mess that was difficult and stressful at that.
So my advice would be somewhat different, and apply to both those journalists who find code easy, and those who find it impossible:
Learn to think like a programmer.
What’s really important is to not understand how to do magical things with code, but to learn what magical things are possible, what the necessary inputs for that magic are, and who to ask to do it.
Identify the repetitive tasks that computers are good at. Yes, they’re good at find-and-replace, but tools like regular expressions are even handier, and I’m amazed how few people understand that find-and-replace is the beginning, not the end, of text processing. (And yes, I’m aware that regex are a quick way to give yourself two problems.)
Computers are really good at processing regular data, and they are really, really good at repetitive tasks. Every time I watched someone in an office doing a repetitive, regular task I despaired, because that’s exactly the kind of thing we have computers for.
You shouldn’t try to build the program that magically automates everything. But you should learn to smell the tasks where computers could help; learn to sniff out the angles on a story that a computer would be a useful tool for.
So that means when you find a table, or a regular data source, you don’t just take a print-out; ask for an Excel file, to convert to CSV, or maybe even a database dump. Even if you can’t do something with it, somebody else can. So the important thing to remember is what a progammer might want to receive.
When you’re gathering data, regularity is important. If you’re using Excel, keep it really simple, and one-column-per-thing, so that later a programmer can do something with the CSV. If you’re gathering textual information, put it in a plain text file, rather than Word; it’ll save you time in the long run.
Also: there are lots of useful tools that are halfway between being a programmer and not, and these are the most interesting spaces for the journalist right now. Simon linked to a bunch of these at the Guardian Hack Day, and it amazed me how many great tools there are for the non-programmer to do programmer-like tasks.
Excel, for starters, is a great environment (if a little limited and esoteric) for starting to explore datasets in a relatively visual way – structured data formats aren’t as immediate to more visual thinkers. Obvious examples include the frankly remarkable DabbleDB and, even though it’s never as useful as I hope it might be, Yahoo Pipes.
These let you exercise programmer-like thinking without needing to be a programmer. And then, when you’ve discovered what it is you want to do, even with the vaguest of prototypes, handing all your information and ideas over to a coder is much easier.
Why? Because you’ve already been thinking like a programmer. You’re handing them thoughts and data in the format they like.
So how do you learn this?
Partly, you have to try a bit of code yourself, but I’d make sure you’re always on the right side of the “understanding what I’m doing” vs “doing neat stuff” seesaw; understanding should be your goal.
Partly, it’s getting handy with a shell. One of the best places to explore what you can do with data is the command line; as well as the true scripting languages, there are tools like grep
, sed
and awk
which can be remarkably powerful. Not entirely user-friendly, I’ll give you, but easier than breaking out a full program.
And partly, it’s relaxing a little and stepping away from the Office suite. Putting your data in formats like CSV, XML, JSON, and plain text doesn’t just make the data more useful to coders; it’ll be more useful for you, when you want to move it around.
I remain convinced there’s an interesting book on “doing smart stuff with computers that isn’t quite programming but isn’t far off”, because let’s face it, most people deal with data all the time now, and have the ideal tool for working with it on their desks. Now they just need to work with it a little.
So whilst this isn’t quite the “learning to code” that Charles speaks of, it’s not far off. And indeed, I think he hits the nail on the head much better in his conclusion:
…nowadays, computers are a sort of primary source too. You’ve got to learn to interrogate them effectively – and quote them meaningfully – too.
That feels about right. You don’t need to be a coder, but you need to be able to interrogate computers meaningfully. Do that how you will.
(As for me? Well, I wanted to be a journalist, but fate didn’t turn that way (although I’ve worked in the media and had a small amount of writing published). I did, however, seem to take to the coding malarkey a little better. I still maintain I’m not really a programmer, and certainly not in the sense that my real-programmer friends are, but evidence sometimes disproves that).