Charles Arthur recently wrote that if [he] had one piece of advice to a journalist starting out now, it would be: learn to code.

I understand the point he’s making, but I think there’s a further degree of subtlety to the argument. After all, learning to code is hard. Learning to glue together bits of scripts, and later bash your way into scripting languages really is useful, but even that isn’t easy. It requires you to learn to translate intent into code, to know what’s possible, to know what’s easy and what’s hard, and to know what to do when third-party things you’re glueing together don’t work.

In short: it’s really easy to make a mess, and a mess that was difficult and stressful at that.

So my advice would be somewhat different, and apply to both those journalists who find code easy, and those who find it impossible:

Learn to think like a programmer.

What’s really important is not to understand how to do magical things with code, but to learn what magical things are possible, what the necessary inputs for that magic are, and who to ask to do it.

Identify the repetitive tasks that computers are good at. Yes, they’re good at find-and-replace, but tools like regular expressions are even handier, and I’m amazed how few people understand that find-and-replace is the beginning, not the end, of text processing. (And yes, I’m aware that regex are a quick way to give yourself two problems.)
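As a rough illustration of what lies beyond plain find-and-replace, here is a minimal sketch in Python (chosen only for the example). It rewrites every day/month/year date in a piece of text as year-month-day in one pass. The sample text and the name `to_iso` are invented for illustration.

```python
import re

# Invented sample text; the dates are made up for illustration.
text = "Meeting held 22/01/2009; follow-up on 03/02/2009."

# Capture day, month and year as three groups so they can be reordered.
DATE = re.compile(r"\b(\d{2})/(\d{2})/(\d{4})\b")

def to_iso(s):
    """Rewrite every dd/mm/yyyy date in `s` as yyyy-mm-dd."""
    return DATE.sub(r"\3-\2-\1", s)

print(to_iso(text))
```

A plain find-and-replace would need one search per date; the pattern describes the shape of a date and handles all of them at once.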

Computers are really good at processing regular data, and they are really, really good at repetitive tasks. Every time I watched someone in an office doing a repetitive, regular task I despaired, because that’s exactly the kind of thing we have computers for.

You shouldn’t try to build the program that magically automates everything. But you should learn to smell the tasks where computers could help; learn to sniff out the angles on a story that a computer would be a useful tool for.

So that means when you find a table, or a regular data source, you don’t just take a print-out; ask for an Excel file, to convert to CSV, or maybe even a database dump. Even if you can’t do something with it, somebody else can. So the important thing to remember is what a programmer might want to receive.

When you’re gathering data, regularity is important. If you’re using Excel, keep it really simple, and one-column-per-thing, so that later a programmer can do something with the CSV. If you’re gathering textual information, put it in a plain text file, rather than Word; it’ll save you time in the long run.
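To show why one-column-per-thing pays off, here is a small sketch, assuming a spreadsheet exported to CSV with a header row. The councils, figures and the helper `total_by` are all invented for illustration; the point is only how little code a regular file needs.

```python
import csv
import io
from collections import defaultdict

# Invented sample data: one row per record, one column per thing.
SAMPLE = """council,year,spend
Northtown,2008,1200
Northtown,2009,1500
Southville,2008,900
"""

def total_by(rows, key, value):
    """Sum the `value` column for each distinct entry in the `key` column."""
    totals = defaultdict(int)
    for row in rows:
        totals[row[key]] += int(row[value])
    return dict(totals)

# DictReader uses the header row to name each column.
rows = list(csv.DictReader(io.StringIO(SAMPLE)))
print(total_by(rows, "council", "spend"))
```

Merged cells, totals rows halfway down the sheet, or two values crammed into one column would each need special handling before a loop like this could work.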

Also: there are lots of useful tools that are halfway between being a programmer and not, and these are the most interesting spaces for the journalist right now. Simon linked to a bunch of these at the Guardian Hack Day, and it amazed me how many great tools there are for the non-programmer to do programmer-like tasks.

Excel, for starters, is a great environment (if a little limited and esoteric) for starting to explore datasets in a relatively visual way – structured data formats aren’t as immediate to more visual thinkers. Other obvious examples include the frankly remarkable DabbleDB and, even though it’s never as useful as I hope it might be, Yahoo Pipes.

These let you exercise programmer-like thinking without needing to be a programmer. And then, when you’ve discovered what it is you want to do, even with the vaguest of prototypes, handing all your information and ideas over to a coder is much easier.

Why? Because you’ve already been thinking like a programmer. You’re handing them thoughts and data in the format they like.

So how do you learn this?

Partly, you have to try a bit of code yourself, but I’d make sure you’re always on the right side of the “understanding what I’m doing” vs “doing neat stuff” seesaw; understanding should be your goal.

Partly, it’s getting handy with a shell. One of the best places to explore what you can do with data is the command line; as well as the true scripting languages, there are tools like grep, sed and awk which can be remarkably powerful. Not entirely user-friendly, I’ll give you, but easier than breaking out a full program.
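For anyone who has never seen those tools, the idea behind them is simple: keep the lines that match a pattern, then pull out the part you care about. The sketch below mimics that in Python rather than using grep or awk themselves; the sample lines and the function name are invented for illustration.

```python
import re

# Invented log-style lines for illustration.
LINES = [
    "2009-01-22 council approves budget",
    "2009-01-23 school closes early",
    "2009-01-24 council delays vote",
]

def matching_first_fields(lines, pattern):
    """Keep lines matching `pattern` (roughly what grep does), then
    return the first whitespace-separated field of each
    (roughly what awk does when asked for the first column)."""
    return [line.split()[0] for line in lines if re.search(pattern, line)]

print(matching_first_fields(LINES, r"council"))
```

On the command line the same job is a one-liner, which is why the shell is worth the initial discomfort.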

And partly, it’s relaxing a little and stepping away from the Office suite. Putting your data in formats like CSV, XML, JSON, and plain text doesn’t just make the data more useful to coders; it’ll be more useful for you, when you want to move it around.
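Moving data between these formats is usually a few lines of work once it is out of the Office suite. Here is a minimal sketch, assuming a CSV with a header row, that converts it to JSON using only Python’s standard library. The sample rows and the name `csv_to_json` are invented for illustration.

```python
import csv
import io
import json

# Invented sample data with a header row.
SAMPLE = "name,role\nAda,reporter\nBen,editor\n"

def csv_to_json(csv_text):
    """Turn CSV text with a header row into a JSON list of objects."""
    rows = list(csv.DictReader(io.StringIO(csv_text)))
    return json.dumps(rows, indent=2)

print(csv_to_json(SAMPLE))
```

Every value comes out as a string, because CSV carries no type information; deciding which columns are numbers or dates is a separate step.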

I remain convinced there’s an interesting book on “doing smart stuff with computers that isn’t quite programming but isn’t far off”, because let’s face it, most people deal with data all the time now, and have the ideal tool for working with it on their desks. Now they just need to work with it a little.

So whilst this isn’t quite the “learning to code” that Charles speaks of, it’s not far off. And indeed, I think he hits the nail on the head much better in his conclusion:

…nowadays, computers are a sort of primary source too. You’ve got to learn to interrogate them effectively – and quote them meaningfully – too.

That feels about right. You don’t need to be a coder, but you need to be able to interrogate computers meaningfully. Do that how you will.

(As for me? Well, I wanted to be a journalist, but fate didn’t turn that way (although I’ve worked in the media and had a small amount of writing published). I did, however, seem to take to the coding malarkey a little better. I still maintain I’m not really a programmer, and certainly not in the sense that my real-programmer friends are, but evidence sometimes disproves that.)

7 comments on this entry.

  • Emmet Connolly | 22 Jan 2009

    Where I work we call this being lazy, and it’s considered a good thing. A good programmer will never be bothered to do something manually that could be scripted in roughly the same time. If you’re grafting too hard, it means there might be a smarter way of doing things. Surely the same goes for journalists. And just like a programmer, a good journalist should also be able to see things within his field that would not be doable without computers.

    Also, the approach you’re describing plays to professional journalism’s strengths in its rivalry with “citizen” journalism. Investigative work must be where it’s at for real journalists from now on, but traditionally it’s really time-consuming work, especially when the subjects to be interrogated are massive impenetrable things. I listened to an episode of On The Media a while ago about the financial meltdown, and the failure of journalists to pick up on anything being wrong before it was too late. Apparently one of the problems was simply that nobody had the skills to crunch the numbers, even though all the necessary raw data was publicly available.

    I think waxy.org has done some creative work that illustrates this path well: analyzing data, shelling work out to Mechanical Turk, etc.

  • Jonathan Hayward | 22 Jan 2009

    The burden of proof, it appears, is on the non-programmer to meet the programmer.

    What if the core insight of usability is that programmers have forgotten how to think about the rest of the world? In other words, what if programmers who have not re-learned how to think like a non-programmer have trouble seeing how to view an application apart from what follows from its mechanical innards?

    Usability is high ROI because good things happen when IT shifts the burden to technical people understanding how non-technical people work.

    I can’t deny that journalists might benefit from learning to think like a programmer, as from learning another language, or learning to think about culture like an anthropologist, or any other of a number of skills by which a journalist’s vista could be expanded. But it seems strange to me to wax poetic about how much better things are for everyone involved when we shift the burden of responsibility on to journalists to think like a programmer.

  • Ethan | 22 Jan 2009

    I agree with you that we could all benefit from learning to think like a programmer. But just saying that isn’t very helpful. What I need from you coders out there is to take the next step and teach me. Explain what you know, in accessible English. I’d like to get handy with a shell. What is a shell? How do I get handy with it? What is the command line? What are grep, sed and awk? “Not user-friendly” is a tremendous understatement. It falls to you guys to make them user-friendly through humane explanation.

  • Alistair | 23 Jan 2009

    A sibling suggestion would be ‘Learn to explore inquisitively’. One of the reasons only 20% of an application’s functionality is used by the majority of users is that their major motivation when they start using an application is ‘How do I do [x] in [y]?’, as opposed to ‘What [x]s can [y] do for me?’ Taking a few minutes in the beginning to explore the menus and toolbars of an application – even if you understand none of it – is priceless in the longer term, as it gives you an understanding of how the application might help you solve future problems.

  • Tony Hirst | 23 Jan 2009

    I’ve been thinking for some time about trying to run a mashup w/s for journalists using things like Yahoo pipes, Google and Zoho spreadsheets (which double up as webpage screenscrapers for lists and tables) and Google maps.

    So if you can think of some example case studies of the sort of thing it might be good for a journalist to “programme”, let me know, and I’ll try and post some demos.

  • Val | 24 Jan 2009

    @Jonathan: huh? Seems to me like the burden is on those whose livelihood is at risk. To be clear, that would be journalists, even the best of whom are in jeopardy from all angles, rather than programmers, who continue to be in high demand. (I’ve been on both sides, working in newsrooms for ~15 years, and as a coder of some sort for ~10.)

    You won’t find a more ardent defender of usability and human factors in program design than myself, but that’s not relevant to the issue. We’re not going to see another revolution in programming paradigms in time to save modern journalism — in fact, that revolution has already played out in the last decade or so with the rise of dynamic languages like PHP and Python. They’re no walk in the park, but they’re an order of magnitude easier than the languages they replace, C, C++, Java, C#, etc. Getting another order of magnitude of usability from the basic computing platform is going to take a lot of time.

    In the meantime, Tom’s point is well-considered. Regardless of language or environment, computing has been and will be for some time to come inherently deterministic. So, all of Tom’s advice is spot-on: try to arrange things in such a way that takes advantage of that determinism, and avoid things that obstruct it.

    That’s not to say that fuzzy areas aren’t important — rather, those are areas best left to wetware (the stuff between your ears). The mundane, repetitive, rote stuff — the stuff that makes your brain ache with boredom — there, computers excel, and you should try to shovel as much as you can in that direction. In order to do so, you need to understand a bit about how they see the world.

    To make good use of a car, you don’t have to be able to rebuild the engine — but being conversant with how it works can help you make better decisions about how to operate it, when not to, when to look for a different approach, and even let you make minor changes quickly and cheaply without having to get outside, expensive help.

    And, yes, theoretically it should be the burden of the manufacturer to design the car so it can automatically drive you home, open the garage door and pour you a drink. But lots of other people will be driving while you wait for that to happen.

    Same with computers.

    If you go and report on a subject area that has a lot of data, and you have a sense of how to organize the data so they can be analyzed, summarized, visualized programmatically, then you can get your story out faster, or with greater depth, or find some non-obvious aspect buried deep in the data. If you can’t program it, and you make it hard for your programmer to make use of the data because you didn’t heed Tom’s advice, then your story will be late, or lack the depth your competition brings.

    Technical people can indeed understand how non-technical people work. That’s not of much use if you give them a pretty-looking but horribly-structured spreadsheet that makes sense to you but can’t be programmatically interrogated. They’re going to have to work extra hard to make up for the things you couldn’t be bothered with — and generally on the short end of the deadline.

    Give them good stuff to begin with and you’ll both succeed. Or not — chances are there’s another journo behind you who can grok what needs to be done and would be more than happy to take your place.

  • davee | 9 Jul 2009

    I am a ‘career’ programmer. I’ve been a professional programmer for 34 years now and in all that time, I’ve only ever run into a couple of programmers that ‘think like a programmer’.

    Most programmers seldom spend more than a few years before moving into management, their original goal. Unfortunately, the problem with the vast majority of these ‘stepping stone’ programmers is that they don’t have their heart in it and subsequently never really learn enough about programming to ever become a ‘good’ programmer. This is evident from the incredible number of bugs in their code. One could write a book about the bad practices they employ.

    Another problem is that every time we change jobs, we usually have to start from scratch. It takes time to learn new applications, languages, operating systems, frameworks, libraries. It’s not like we can often take anything other than our programming skills themselves from job to job. This is fairly unique in the business world. Most other professions do not incur such an extreme burden on jobs. This is, I think, mainly because most of the other professions have been around for a lot longer and have standardized the way they do things to a large degree. Imagine someone coming on board and having no real expectations for a couple of months. On the Job Training at its best.

    The problem that I see is that programming needs to be elevated to the same status as reading, writing, and arithmetic. It needs to be an essential skill set for anyone who uses a computer.

    The reasoning behind this is the whole history of the computer world has been focused on programmers and not on the people that actually know the tasks being requested of the person programming the computer. Historically, a user goes to a programmer and asks for a program. What he gets is largely based on the communication skills of both the user and the programmer, in their ability to make sense out of two entirely unrelated languages, the language of the application, and the language of computer programming.

    This was essential in the early decades of the IT world in order to build a reasonable infrastructure to move forward.

    Today, we need regular knowledge workers that can build applications at least at the primitive prototype level that can be created with minimum skills and passed off to ‘career’ programmers to do the major back-end and UI methods needed to make the application robust.

    It becomes a numbers game; it’s easier to teach everyone the fundamentals of programming than it is to teach programmers every application. Wouldn’t it be nice to have a programmer come on staff that already knows how to do most things?

    Thinking like a programmer is easy for some of us, impossible for others. I tend to express it more like, ‘think like a computer’. A computer is an extremely literal device and being able to think in literal terms makes programming easy. Computers see only in black or white; a good programmer does the same. The only other real skill ‘real’ programmers have is that we’re pretty darn good at learning things on our own. We have to be to survive in the kind of pressure cooker that is the ‘real computer programmer’.

    In the end, we need two types of programmers, system programmers who have a deep understanding of the platform, and user-programmers, people who know what they want and can build a shell. The user and the programmer would be a team of sorts.

    Such a programmer could spend all of their time building and integrating libraries that the rest could use, making it more a program by numbers exercise.