Why Is Code Hard to Understand?

Anyone who has tried to look at somebody else’s computer code — especially in the likely event that it hasn’t been well-commented — knows how hard it is to figure out what’s going on. (With sometimes dramatic consequences.) There are probably numerous reasons why, having to do with the difference between heuristic human reasoning and the starkly literal nature of computer instructions. Here’s a short paper that highlights one reason in particular: people tend to misunderstand code when it seems like it should be doing one thing, while it’s actually doing something else. (Via Simon DeDeo.)

What Makes Code Hard to Understand?
Michael Hansen, Robert L. Goldstone, Andrew Lumsdaine

What factors impact the comprehensibility of code? Previous research suggests that expectation-congruent programs should take less time to understand and be less prone to errors. We present an experiment in which participants with programming experience predict the exact output of ten small Python programs. We use subtle differences between program versions to demonstrate that seemingly insignificant notational changes can have profound effects on correctness and response times. Our results show that experience increases performance in most cases, but may hurt performance significantly when underlying assumptions about related code statements are violated.

As someone who is jumping back into programming myself after a lengthy hiatus, I find this stuff very interesting. I wonder how far away we are from natural-language programming, where we can just tell the computer what we want in English and it will reliably do it. (Guess: pretty far, but not that far.)

  1. Here is an experiment for you. Write a computer program and time how long it takes you, then take that computer program and write precise English instructions that would do the same thing. Then give those instructions to another moderately skilled programmer and time how long it takes them to create a computer program for you. If your instructions weren’t precise enough then rewrite them and give them to yet another programmer. I bet that converting precise English instructions into code takes less than 5% as long as writing a program from scratch.

  2. Thought-provoking. Letting my mind wander on this subject while I did a quick bit of cooking, I came to the conclusion that the human understanding of anything, whether a human invention (which is what you can say history is, e.g., as well as computer programs) or the natural world (whether or not you take that as another form of human invention), is potentially problematic. How do we decide what Eliot’s Waste Land means? Or the Second Amendment to the U.S. Constitution? What does quantum mechanics mean?

    But that’s not very germane to the post. Understanding someone else’s code or making one’s own code intelligible to others can be challenges in the workplace, but they’re much less important if one is programming for oneself. (Carroll didn’t tell us which will be the case for him.) If you’re doing it for yourself, you may have a problem if you look at something you wrote in haste earlier, without adequate commenting, and can’t now figure it out. Otherwise, I wouldn’t think this would matter.

  3. @John Branch:

    “If you’re doing it for yourself, you may have a problem if you look at something you wrote in haste earlier, without adequate commenting, and can’t now figure it out.”

    So… write intelligible code even when it’s for yourself. The person banging their head against the wall trying to figure it out might just be you six months later.

  4. Siri and Dragon are pretty decent, and mass produced. So you’re right; pretty far, but not that far.

  5. I don’t think we need natural language programming; human spoken languages are so ambiguous that I have no idea what a compiler might spit out.

    That said, it would be fun to see what could be done with some static analysis for when you “break” semantic expectations in code. Python is notorious for this, where data state leaks all over the place (this makes it very easy to code), and static analysis is very difficult (as it’s an interpreted language). You can do some very nasty things to “fool” a person.
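
One plausible reading of “data state leaks all over the place” is Python's mutable default argument, where a value quietly survives between calls. The sketch below is only an illustration of that reading, not an example taken from the paper, and the function names `append_item` and `append_item_safe` are made up for the purpose.

```python
def append_item(item, bucket=[]):
    """Append item to bucket; the default list is created once, at definition time."""
    bucket.append(item)
    return bucket


def append_item_safe(item, bucket=None):
    """Same idea, but a fresh list is created on every call that omits bucket."""
    if bucket is None:
        bucket = []
    bucket.append(item)
    return bucket


print(append_item("a"))       # ['a']
print(append_item("b"))       # ['a', 'b']  <- state left over from the first call
print(append_item_safe("a"))  # ['a']
print(append_item_safe("b"))  # ['b']
```

A reader who expects each call to start from an empty list will predict the wrong output for the second call, which is the kind of broken expectation a static check could flag.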

    Reading other people’s code in some kinds of languages tends to be better, I find, when the languages are smaller and have fewer syntactic features. I’ve got a pet language myself (Erlang), and I find it fascinating that they actually based the language on studies about human productivity and human ability to understand others’ code. That said, it’s a strange Prolog-based language (not a von Neumann-style syntax), and so some coders get tripped up looking at it (they complain that it looks ugly).

  6. Have you heard of Inform 7? It’s a tool for programming Interactive Fiction and has code designed to read like natural language. It’s still not the same as talking to a computer and having it understand English like a human being, but it’s pretty cool anyway. It at least demonstrates the concept of natural language programming.

    I’ve actually read some programmers, if memory serves, who thought it was more difficult to program in than a normal programming language, because they expected to be able to do things that the language didn’t support. I never personally had this problem, though; to me it seems way more powerful and easier to use than other languages, at least for the purpose of writing text games, which is what it’s for. I found it before ever using higher-level languages like C++, though, and basically taught myself to program using it (which obviously makes me biased in its favor) before later moving on to general-purpose languages. Using it after being used to something like C++ might be harder; I don’t really know.

    If anyone thinks this is cool enough to really geek out about and dig into, there’s a white paper here by Graham Nelson, one of Inform 7’s developers, that explains the theoretical aspects (and difficulties) of designing a programming language to read like natural language.

  7. MAD worked pretty well (in the 1960s) at being almost English. It was my first programming language and I learned how to program in just a couple of hours of instruction.

  8. On ‘natural-language programming’:

    “When someone says ‘I want a programming language in which I need only say what I wish done,’ give him a lollipop.”

    by Alan Perlis, first recipient of the Turing Award. Since I don’t like quotes without explanation: I guess such natural languages would be an obstacle because English does not necessarily support the paradigms people use when programming. For instance, LISP is quite bizarre and disconnected from English, but it is really an experience to program in it. Could it be made strongly typed or would it run amok with my variables like C when I incorrectly declare my functions?

    But speaking of reading code: I have a hard time reading my own six-month-old code, even when I did write comments.

  9. As a programmer, I find it difficult to imagine how I could code in a natural language unless we changed programming techniques entirely.

  10. There are basically four reasons why code is hard to understand.

    1. The job that the program does is inherently hard to understand. It could be inherently complex, or it could be straightforward in principle, but inherently hard to express in any formal notation.

    2. One or more wrong tools were used for this task. It could be the wrong programming language, the wrong framework, some wrong libraries or something else. Sometimes, this is not the programmer’s fault because no good tool exists, or because existing tools/libraries are also hard to understand or fragile in other ways.

    3. The programmer doesn’t understand the tools that they are using to the level required for the task. Sorry, but Python is not easier to use than other programming languages if the job is hard. In fact, features of Python which make rapid development easy usually get in the way of anything moderately complex.

    4. The programmer doesn’t know how to write maintainable code. There are common reasons for this. One is that they were never taught how. (Programming is an artisanal craft, and like all such crafts, requires a certain amount of apprenticeship.) The other (which is quite common amongst non-software engineers, scientists and economists) is that they stubbornly refuse to admit that they’re programming.

    I can’t tell you how many times I’ve had to deal with people who don’t want to learn source code control with objections like “oh, but it’s just 100 lines of Matlab”.

    Scientists who can’t write small programs are going to go the way of biologists who don’t know any maths. It worked for a while, but the era is long gone where you could get away with it.

  11. 5. The tool is very low level. Assembler being an obvious example, and yes, sometimes it is the right tool.

    6. The code was machine generated or heavily processed.

    7. It was deliberately written in that way.

  12. Spoken languages were not designed for instruction, they are for conversation. Even when giving instructions to another person we resort to writing, diagrams, demonstration, and practice to communicate our wishes. Some of this is due to the limitations of our memory, but not all. I think any system that used spoken language for programming would have to implement these other forms of communication as well.

  13. I think it’s worth pointing out that aiming for understandable code is commendable, but I would argue it is not as important as correctness, and these two aspects should not be confused (although the former probably has some influence on the latter). Some of the comments here are aimed more at which language designs or features might encourage correctness than at which are more readable. However, the choice of language is less significant than the practice/methodology followed during program development if your aim is correctness. The cited paper mentions the potentially misleading effect of choice of variable and function names but doesn’t mention the other real-world problem of comments in code becoming outdated and misleading the reader. Careful choice of names and commenting style are both important. Nevertheless, you can still code an incorrect function in any language if you misunderstand the specification or the specification inevitably changes. The road to achieving correctness is to build the known aspects of the specification into the code itself by using test-driven development (TDD), writing unit tests that are quick and painless to run, and inserting assertions into the code. A great place to learn more is http://software-carpentry.org which is all about educating researchers in writing readable and reusable code, and making their research reproducible.
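
As a rough sketch of what “building the specification into the code” can look like, the Python fragment below pairs an assertion inside a function with two small tests. The function `mean` and both test names are hypothetical, chosen only for illustration; the tests are written as plain functions of the kind a runner such as pytest would collect.

```python
def mean(values):
    """Arithmetic mean of a non-empty sequence of numbers."""
    # The assertion records a known part of the specification in the code itself.
    assert len(values) > 0, "mean() needs at least one value"
    return sum(values) / len(values)


def test_mean_of_typical_input():
    assert mean([1, 2, 3]) == 2


def test_mean_rejects_empty_input():
    try:
        mean([])
    except AssertionError:
        return  # the assertion fired, as the specification requires
    raise RuntimeError("mean([]) should have failed its assertion")
```

If the specification later changes (say, an empty input should return zero), the second test fails immediately and points at the exact assumption that no longer holds.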

  14. As a software engineer, I don’t think developing Natural Language Programming is too difficult a problem. What I think would be difficult is the massive effort it would take to keep a natural programming language relevant and up to date with constantly changing hardware and software, in all their varieties. For example, though there are standard APIs there are still significant differences when programming software to run on Windows versus Linux. Or on x86 versus ARM. And complexity and variations increase by orders of magnitude when we go to higher levels of abstraction.

    Nonetheless, I think Java is a major success story that shows it is possible to bridge the gap between a vast amount of different hardware and software without having to code for each separately.

  15. AI:

    “5. The tool is very low level. Assembler being an obvious example, and yes, sometimes it is the right tool.”

    That’s true, but I don’t think it’s relevant for the purpose of this discussion, and to the extent that it’s true, it is partly subsumed by the observation that “the job is hard”.

    There are essentially only three reasons why you ever need to drop to assembler. In descending order of likelihood:

    1. There are some constraints which prevent you from using a higher-level language (e.g. you’re writing firmware on a very small machine).

    2. You need to access some machine feature where the programming language doesn’t give you any sane way to get to it (e.g. custom hardware, or some advanced instruction).

    3. You have profiled your code, found a bottleneck, and there’s no way to fix the bottleneck in the higher-level language.

    Whichever way you cut it, you’re doing something inherently difficult.

    “6. The code was machine generated or heavily processed.”

    Then you’re looking at the wrong code. Unless you’re debugging the code generator/processor, of course, in which case (once again) you’re doing something inherently difficult.

    I used to be a compiler writer, so I’m not without sympathy if you find yourself in that situation.

    “7. It was deliberately written in that way.”

    That’s such a rare scenario that it’s probably not worth considering. It’s amusing to try to work out how an IOCCC entry works, but that doesn’t usually happen in paid work.

    In most cases, obfuscation can adequately be explained by stupidity, or at least ignorance (which is not the same thing, and much more forgivable).

  16. “In most cases, obfuscation can adequately be explained by stupidity, or at least ignorance (which is not the same thing, and much more forgivable).”

    I think another common reason (perhaps the most common) is deadlines. When you are behind and running into the final crunch to get the product out the door to your customer, then making your code easily understandable and maintainable is a low priority.

  17. “There are essentially only three reasons why you ever need to drop to assembler.”

    Don’t forget reverse engineering!

  18. (As a CS grad,) I think natural language programming is not coming any time soon, but over the last few years domain-specific languages seem to have become a lot more popular.

    Basically, if you’re doing math, you want something that represents mathematical functions easily (like in Octave, perhaps?). If you’re doing web programming, you want to think in terms of requests and responses (web frameworks, especially newer ones). If you’re doing calendar manipulation, you want an easy way to define dates (the Unix date command can parse stuff like ‘next week’). Once you start looking they’re everywhere, sometimes radically so, sometimes just making it a little easier. I assume the language BlueJay mentions is an example of that, a language for text games. Siri would be another one (but more limited because of its programming interface; we are not that good at parsing language for meaning).

    None of this is anything new of course. In fact, programming well is really about finding ways to explain your problem and solution to the computer in a way that you can still follow. It’s just that the tools have gotten to a point where everyone is at least thinking about creating their own mini-syntax nowadays, and it will only get easier as we all get more comfortable with this kind of metaprogramming. [wild speculation]Probably, in a decade or two, most coders will never have used a “real” general programming language, which will be an elite skill reserved for CS graduates, engineers and malware writers.

  19. Sorry, I must have misheard you… I thought you said natural language programming was coming. Ever. Talk about the wrong tool for the job!

    Barring catastrophe, a general purpose AI capable of following natural language instructions ought to be somewhere in our future… I wouldn’t class that as programming any more than I do when I command my biological minions, though.

  20. Quite amusing that in the description of the “funcall” example the authors say “where f (x) = x + 5” but the actual code in the Appendix is given as
    “def f(x):
         return x + 4”

    I wasted 5 minutes reading the description and banging my head wondering why the answer wasn’t 120.
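
To make the arithmetic behind that confusion concrete, here is a small sketch. The thread does not reproduce the rest of the program, so the call pattern in `product_of_calls` is an assumption, chosen only because it yields the 120 the commenter expected from the described definition; the function names are likewise made up.

```python
def f_as_described(x):
    return x + 5  # definition given in the paper's prose description


def f_as_listed(x):
    return x + 4  # definition given in the appendix listing


def product_of_calls(f):
    # Hypothetical call pattern (an assumption): multiply f at 1, 0 and -1.
    return f(1) * f(0) * f(-1)


print(product_of_calls(f_as_described))  # 6 * 5 * 4 = 120
print(product_of_calls(f_as_listed))     # 5 * 4 * 3 = 60
```

Under that assumption, a one-character difference between description and listing halves the predicted output.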

  21. Glad to see you’re interested in my paper! For anyone who wants to learn more, I’ve put up a blog post (which includes a link to the data).

    It feels like natural language programming is still very far away. I think part of the problem is the “punch card” model of compilation we still often use, where you feed a program into the compiler and it either works or spits out a bunch of errors. A better approach might be more back-and-forth: have the compiler note ambiguities in the code and suggest ways to help it resolve them.

    There’s a neat paper by John Pane and others called “Studying the Language and Structure in Non-Programmers’ Solutions to Programming Problems” in which they have kids provide natural language solutions to problems (like controlling Pac-Man in a game). The results are quite interesting: for example, people rarely specify stopping conditions, such as what happens when Pac-Man hits a wall.

  22. In my first programming job out of college (BCPL (C’s predecessor) on a PDP-10), I left work on a Friday evening with a coworker grumbling at some code _he himself had written_. I came back Monday morning and he was still there. “What’s the problem?” I asked. “This line doesn’t do the right thing.” Me: “Put parentheses around the innermost logical operators.” Him: “That’s not necessary, the operator precedence takes care of that”. Me: “Shut up and put in the parentheses.” He did. It worked.
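
The anecdote does not show the offending line, so the following is only an illustration of the same trap, written in Python rather than BCPL and using made-up names. In Python, `and` binds more tightly than `or`, so the two functions below disagree for some inputs.

```python
def can_edit_implicit(is_admin, is_owner, is_unlocked):
    # Parsed as: is_admin or (is_owner and is_unlocked)
    return is_admin or is_owner and is_unlocked


def can_edit_explicit(is_admin, is_owner, is_unlocked):
    # Parentheses force the grouping the author may have had in mind.
    return (is_admin or is_owner) and is_unlocked


print(can_edit_implicit(True, False, False))  # True
print(can_edit_explicit(True, False, False))  # False
```

Both versions are legal; the parentheses simply make the intended grouping visible to the next reader, who may be the author on Monday morning.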

    The local lesson, of course, is that operator precedence is a really stupid idea and LISP has a far more sensible syntax than any language currently used*. But real life is far worse than that: getting software right is essentially impossible. Turing proved that programming is the most complex possible mathematical structure (i.e. complexity of possible program behavior increases with program size faster than any computable function), pretty much even before there were computers. Very kewl bloke, Turing. His “Turing Test” is worlds more subtle than pretty much anyone realizes. It’s not about pretending to be human, it’s about a male human pretending to be a female human and thus requires that computers be able to empathize with our sexuality. We forget that Turing was gay, and that he was sensitive to the sexual identity aspect of our humanity.

    *: Note that this strongly implies that “natural language programming” is a really horrendous idea. I translate Japanese to English technical stuff nowadays to keep body and soul together. Japanese (as well as many other languages) doesn’t require that the singular/plural distinction be made (it can be made if needed), and it usually isn’t. In reading Japanese, there’s never (well, almost never) any problem understanding what the author was intending, but try to translate it, and if you don’t really really really understand what’s going on, you’re dead. There are also problems such as the Japanese equivalent of “Set A to B”, which can mean either take what’s in B and copy it into A or take A and smash it into B. (The object of “set” is either the thing picked up and moved or the thing smashed, both being perfectly reasonable uses of the objective case.) If you know which of A and B is the thing that the author thinks needs to be set, you don’t even notice the ambiguity. If you aren’t following the discussion, you can’t translate it. (Japanese is particularly difficult in this respect, but there’s plenty of that sort of ambiguity going on in English as well.)