When Software Ages Badly


In some respects, software ages gracefully: it generally starts out working poorly but gets better over time as bugs are fixed. Unlike hardware, there’s no physical wearing out of parts. This post is about a few ways in which software doesn’t get better with age.

In The Mythical Man-Month, Brooks observes that any sufficiently complex software system will tend to have an “irreducible number of errors”: in a crufty old code base, any bug fix is likely to break something else. It is possible to fight this phenomenon through aggressive modularization and refactoring, but these are expensive.

The second way for old software to become less reliable is when the meaning of its code changes. Obviously this can happen when the software is ported to a new architecture where, for example, pointers or integers have a different size. Also, a library or operating system update—even an obvious fix for a simple bug—can break an application that relied on the previous semantics. A compiler upgrade can break working code in any number of insidious ways. For example, an improved optimizer can make code run faster, exposing previously latent race conditions. A C/C++ compiler’s exploitation of undefined behavior can be ratcheted up a notch, breaking code that previously worked. This has been continuously happening for at least a decade and it’s not going to stop in the foreseeable future.
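To make the undefined-behavior point concrete, here is a minimal, hypothetical sketch in C (the function name and values are mine, not from any particular system): a signed-overflow check written on the assumption that overflow wraps around. Because signed overflow is undefined in C, a newer or more aggressive optimizer is permitted to conclude that the comparison can never be true and delete it, so code that appeared to work under the old compiler quietly stops working under the new one.

    #include <limits.h>
    #include <stdio.h>

    /* Hypothetical illustration: the programmer assumes signed integers
       wrap around on overflow, but in C signed overflow is undefined
       behavior. */
    static int next_would_overflow(int x) {
        /* An optimizer may reason that, in a program free of undefined
           behavior, x + 1 < x is always false, and fold this function
           to "return 0;". */
        return x + 1 < x;
    }

    int main(void) {
        /* Unoptimized builds often print 1 here; optimized builds may
           print 0. */
        printf("%d\n", next_would_overflow(INT_MAX));
        return 0;
    }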

The inputs that are provided to a software system can change in harmful ways. The best example of this that I can think of is when a new class of attack becomes well understood, such as stack buffer overflows, heap overflows, and integer overflows. When a nasty new fuzzing tool appears, a previously solid-seeming piece of code can be shown to be not correct at all. The input domain can also change shape when a piece of software is reused in a new environment; Ariane 5 flight 501 is an excellent example.
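As a hedged illustration of the integer-overflow class (the function below is hypothetical, not taken from any code discussed here): a size computation that silently wraps can turn an apparently careful allocation into a heap overflow once attackers learn to supply enormous element counts, for instance via a file header, on a platform where the multiplication can exceed SIZE_MAX.

    #include <stdint.h>
    #include <stdlib.h>

    /* Hypothetical example of the integer-overflow attack class. If count
       is attacker-controlled and count * sizeof(uint32_t) wraps around
       (e.g. with a 32-bit size_t), malloc returns a buffer far smaller
       than count elements. */
    uint32_t *copy_records(const uint32_t *src, size_t count) {
        /* A defensive version would first reject
           count > SIZE_MAX / sizeof(uint32_t). */
        uint32_t *dst = malloc(count * sizeof(uint32_t));
        if (dst == NULL)
            return NULL;
        /* The copy still runs count times, writing far past the end of
           the undersized buffer. */
        for (size_t i = 0; i < count; i++)
            dst[i] = src[i];
        return dst;
    }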

Finally, in the long run, good old software rot can happen. Programming idioms, programming languages, software architectures, and other aspects of code are always evolving, and eventually progress gets the better of almost any piece of software.

We’ve been writing software for hardly more than half a century, and I suspect that the vast majority of lines of code in existence were written since 2000. What will the world look like when we’ve had software for as long as we’ve had the printing press? What about when we’ve had computer languages for as long as we’ve had spoken languages? How are we going to perform refactoring on a code base of a hundred billion lines? Vernor Vinge envisions professional programmer-archaeologists whose job it is to deal with these systems; I’m not sure we’re that far off from needing these people now.


9 responses to “When Software Ages Badly”

  1. “Finally, in the long run, good old software rot can happen.” — By “software rot”, do you mean that the language, text encoding, and/or storage medium have become extinct? Analogously to the way that, say, a Latin manuscript written in blackletter can be said to have suffered from “book rot”? 😉

    Personally, I’ve always understood the term “bit-rot” to refer ironically to the way that a program can (apparently) stop working for *unknown* reasons; a sort of “God did it” shorthand. So up to this point, your post was actually explaining the possible *causes* of the phenomenon I’d call “bit rot”.

    I’d say “language extinction” is a particularly extreme case of “the meaning of the code changes”.

    I think another huge cause of bit-rot in my own code is the phenomenon of operant conditioning you just talked about the other day. Many times I’ve written something that worked GREAT, then left it alone for a few years, and then come back to find the code horribly “rotted”: crashing all the time, nonsensical control flow, comments that are clearly wrong…

    Clearly all those flaws didn’t *really* come from code beetles; they were always there, and only became apparent after the code’s author-and-sole-tester took a lengthy hiatus and unlearned all his compensating behaviors.

  2. It seems you’re talking mostly about source code. Many of these points don’t apply to binary code (such as changing the compiler), though it has its own share of problems (new OS/hardware incompatibilities etc.).

  3. Hi igorsk, yes, mostly source code. Binary code is interesting since (as you say) it runs on a much more stable layer of the system. On the other hand, it’s not very fun to fix bugs in binary code. I find it interesting and depressing that embedded systems projects often keep the same compiler version for the entire life of the project, or at the very least go to great lengths to ensure that the old compilers are kept available.

    Arthur, there certainly are a lot of different kinds of rot that can set in, and I probably wasn’t too clear about which ones I was talking about. We need a name for the final kind of rot that you mention!

  4. Similar to library or operating system updates, software can break when it is moved to new, supposedly backwards-compatible hardware. For example, I saw many in-house Windows programs that used to work fine on single-core machines break on multi-core machines.

    Software can also age badly because of changing expectations, e.g. changing ideas of what is an acceptable level of failure or inconvenience. The software may work just as well (or, equivalently, just as badly) as it used to, but exposure to new and improved (though possibly unrelated) systems changes our idea of what is acceptable. For example, we might once have deemed it acceptable if the word processor required a restart every hour or so. After several years of continuous use, the software may be behaving in exactly the same way, but we no longer deem that behaviour acceptable.

  5. If a codebase is so unforgiving that it cannot be maintained, something is very wrong. In my opinion, the largest culprit is lack of documentation, the second is lack of tests, and the third is badly architected code. Not sure about the ordering there, but a lot can be done to ‘fix’ ‘bad’ code without writing any new production/application code.

    While I have never worked with a huge code base, I would estimate that we have about 100k LOC worth of functionality spread out over 1.5M to 2M actual LOC, due to bad architecture, code ‘sharing’ via copy-paste-tweak-slightly, and generally hideous development practices. From this experience, I conclude that when companies hire skilled developers and train them well, the risk of ending up with a colossal ball of mud goes down greatly.

    I leave this blog post with absolute certainty that my work is full of code-termites. (AKA: Why I’m quitting work to go back to school full time again!)

  6. It’s always fun to revisit code that I wrote long enough ago that I have no detailed memory of how the code works. Depending on context, this can be anything from “I wrote it 6 months ago and have already forgotten all the details” to “I wrote it 10 years ago”.

    I found that doing this a few times gave me the inspiration to up the documentation in my own research codes a notch.

  7. Yeah, lack of documentation and tests is huge. My own research software does not look good on these metrics. On the other hand, one of the reasons I’m in academia is that it gives me the freedom to throw things away when I get tired of them.