Use of Goto in Systems Code


The goto wars of the 1960s and 1970s are long over, and goto is dead—except that it isn’t. It has a number of legitimate uses in well-structured code including an idiom seen in systems code where a failure partway through a sequence of stateful operations necessitates unwinding the operations that have already completed. For example:

int init_device (void) {
  if (allocate_memory() != SUCCESS) goto out1;
  if (setup_interrupts() != SUCCESS) goto out2;
  if (setup_registers() != SUCCESS) goto out3;
  ... more logic ...
  return SUCCESS;
 out3:
  teardown_interrupts();
 out2: 
  free_memory();
 out1:
  return ERROR;
}

Is goto necessary to make this code work? Certainly not. We can write this instead:

int init_device (void) {
  if (allocate_memory() == SUCCESS) {
    if (setup_interrupts() == SUCCESS) {
      if (setup_registers() == SUCCESS) {
        ... more logic ...
        return SUCCESS;
      }
      teardown_interrupts();
    } 
    free_memory();
  }
  return ERROR;
}

And in fact a decent compiler will turn both of these into the same object code, or close enough. Even so, many people, including me, prefer the goto version, perhaps because it doesn’t result in as much unsightly indentation of the central part of the function.

Tonight’s mainline Linux kernel contains about 100,000 instances of the keyword “goto”. Here’s a nice clean goto chain of depth 10.

Here are the goto targets that appear more than 200 times:

out (23228 times)
error (4240 times)
err (4184 times)
fail (3250 times)
done (3179 times)
exit (1825 times)
bail (1539 times)
out_unlock (1219 times)
err_out (1165 times)
out_free (1053 times)
nla_put_failure (929 times)
failed (849 times)
out_err (841 times)
unlock (831 times)
cleanup (713 times)
drop (535 times)
retry (533 times)
again (486 times)
end (469 times)
bad (454 times)
errout (376 times)
err1 (362 times)
found (362 times)
error_ret (331 times)
error_out (276 times)
err2 (271 times)
fail1 (264 times)
err_free (262 times)
next (260 times)
out1 (242 times)
leave (240 times)
abort (228 times)
restart (224 times)
badframe (221 times)
out2 (218 times)
error0 (208 times)
fail2 (208 times)

“goto out;” is indeed a classic. This kind of code has been on my mind lately since I’m trying to teach my operating systems class about how in kernel code, you can’t just bail out and expect someone else to clean up the mess.


29 responses to “Use of Goto in Systems Code”

  1. I wonder how much of this code could be replaced by simple C++-style RAII, where the entity-appropriate cleanup code runs upon it going out of scope. Even further, though, how many bugs would be obviated, avoided, or fixed by making that change?

    That question might be answerable by fairly shallow static analysis, by looking at what kinds of things are most frequently the subject of cleanup code, and then at functions that allocate such things but don’t clean them up along all exit paths.

  2. Hi Phil, interesting question. This idiom, while being very easy to understand, is not as easy to validate in a large function since it requires matching up operations that are not close together.

    I suspect the GCC developers will be performing the refactoring that you suggest since GCC now compiles as C++.

  3. It might be more prudent to consider the minimal set of control flow structures you would need to be able to justify excluding goto from a language. I’ve found that I’ve used gotos in practice for, in rough order of how often they come up:
    * RAII-type constructs (with blocks in python, try/finally in many other languages)
    * Multilevel (aka labeled) break/continues
    * A construct which would be early return if the code in question were a separate method [you can argue that this needs to be split into its own method, but there are times when that really isn’t desirable]
    * Needing different exit points for a while loop
    * Convoluted control flow for if statements (a superset that includes ternary expressions and short-circuit operators)

    From my perspective, the only truly “bad” use of goto is a use that makes control flow irreducible. This includes Duff’s Device, by the way.
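
    The multilevel break item in the list above is probably the easiest to picture. A minimal sketch in C, which has no labeled break: the function name find_cell and its parameters are invented here for illustration, and the label name found is borrowed from the kernel label counts in the post.

```c
#include <stddef.h>

/* Hypothetical helper: look for `needle` in a rows x cols grid stored
 * row-major. Returns 1 and fills *row and *col if found, 0 otherwise. */
int find_cell(const int *grid, size_t rows, size_t cols, int needle,
              size_t *row, size_t *col)
{
    size_t r, c;

    for (r = 0; r < rows; r++) {
        for (c = 0; c < cols; c++) {
            if (grid[r * cols + c] == needle)
                goto found;  /* leaves both loops at once */
        }
    }
    return 0;

found:
    *row = r;
    *col = c;
    return 1;
}
```

    A flag tested in both loop conditions would work too; the goto version simply avoids the extra state.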

  4. I don’t think gotos should ever be used. If you find that you are ever using one, there is always a way to refactor the code to eliminate it. For instance, keep in mind the inversion of IF statements. Weed out the exceptions to get to the important code.

    In your IF statements, throw the exception and return. Then you will have no goto lines and the code will be easier to read.

  5. @webreac: except that that’s hardly fair, since Dijkstra never gave his writing the title “Go To Considered Harmful”, opting instead for the slightly more modest “A Case Against the Goto Statement”.

    He did argue that goto should be abolished from higher-level programming languages (graciously allowing machine code to be exempted), but I wouldn’t consider that a “grand sweeping statement”, given that he presented arguments. Nor was he ever in an argument with Knuth (that I know of, at least) — the structured programming storm against goto was underway well before Knuth entered the fray.

    Knuth’s article is the more valuable one because he’s treating the matter from an engineer’s point of view, discussing where goto is and isn’t useful/effective and what alternative language constructs might be valuable (next to plain while and if).

    Dijkstra wasn’t half the dogmatic hardass that common lore makes him out to be — although he was certainly fond of speaking bluntly and with great conviction, and he had no shortage of overzealous followers.

    @Tim: you should definitely read Knuth’s article. Of course you can always eliminate goto — just like you can always eliminate “for” and “while” by introducing gotos. That, in itself, is not an argument for anything.

    Exceptions are neat, but they don’t exist in every language. Besides, in languages where exceptions do exist, people have been quick to point out they’re just “gotos in disguise” — not even the alternatives are safe from dogma!

  6. Written by academia and never worked in the actual industry. Tracking down closing curly braces is a pain. I will take early exiting from a method in an error situation with a goto any day over hunting down nesting with broken white space (tabs not as spaces, etc).

  7. Finally someone that understands the real value of goto in safe programs written in C.
    @Tim It’s absolutely not easier to read if you have many things to free(). You probably never wrote non-trivial code in C.

  8. You know, in the first case I actually do prefer the code with indented ifs (sometimes called “arrow code”), because here we do not get a handle to the acquired resources to put in local variables, so the fact we do or do not have a resource at a given point in the function is not very self-documenting (it may be far from obvious if there is logic for various reasons between the different if (whatever() != SUCCESS) goto outn;), at least with the indented code it is visible.

    However, I do use gotos for resource acquisitions which return handles (like open, malloc, etc.), but only to a single label just after the successful return, at which point the local variables are tested to know if each resource was allocated, and it is freed if it was; for handle types without a sentinel value (like NULL for pointers) it does require adding a Boolean local var, but give it a name derived from that of the local var and the mental overhead is near-zero:

    int func(params)
    {
      void* pointer = NULL;
      int file; int file_opened = FALSE;
      // … preparation logic …
      if ( (pointer = malloc(size)) == NULL ) goto error;
      // … intermediate logic …
      if ( !could_open(location, &file) ) goto error;
      file_opened = TRUE;
      // … more logic …
      return SUCCESS;
     error:
      if (file_opened)
        close(file);
      if (pointer != NULL)
        free(pointer);
      return ERROR;
    }

    I think of the main code being on a catwalk above the floor with no railings, if everything goes right it gets to the other side, and if something goes wrong it falls on a conveyor belt which, whichever place we fell from, will get us back to a known state. I used to use “do { if () break; } while (0);” instead of goto, but it turned out to be harder to understand for others.
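
    For comparison, the do { ... } while (0) variant mentioned above might look like the following sketch. It reuses the function names from the post's init_device example; the stub bodies, the fail_step switch, and the state flags are assumptions added only so the example compiles and can be exercised.

```c
enum { SUCCESS = 0, ERROR = -1 };

/* Hypothetical stand-ins for the driver steps named in the post. */
static int fail_step;       /* which step should fail; 0 means none */
static int memory_held;     /* 1 while memory is allocated */
static int interrupts_set;  /* 1 while interrupts are configured */

static int allocate_memory(void)
{
    if (fail_step == 1) return ERROR;
    memory_held = 1;
    return SUCCESS;
}

static int setup_interrupts(void)
{
    if (fail_step == 2) return ERROR;
    interrupts_set = 1;
    return SUCCESS;
}

static int setup_registers(void)
{
    return fail_step == 3 ? ERROR : SUCCESS;
}

static void teardown_interrupts(void) { interrupts_set = 0; }
static void free_memory(void)         { memory_held = 0; }

int init_device(void)
{
    int memory_ok = 0, interrupts_ok = 0;
    int status = ERROR;

    do {
        if (allocate_memory() != SUCCESS) break;
        memory_ok = 1;
        if (setup_interrupts() != SUCCESS) break;
        interrupts_ok = 1;
        if (setup_registers() != SUCCESS) break;
        /* ... more logic ... */
        status = SUCCESS;
    } while (0);

    /* Single cleanup path: the flags record what was acquired. */
    if (status != SUCCESS) {
        if (interrupts_ok) teardown_interrupts();
        if (memory_ok) free_memory();
    }
    return status;
}
```

    A break placed inside a real loop within the do block would leave only that inner loop, which may be part of why readers find this form harder to follow than a goto.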

  9. I hardly think the Linux kernel is a bastion of quality code that ought to be used to teach anyone anything other than “if you want to write a kernel – don’t do it like this”.

  10. Hi Joshua, great list. I’m curious about the irreducible criterion, though. I always think of this as an implementation detail that makes compilers hard, not really as something that programmers would ever care about. But I haven’t thought about it that much.

    Perhaps compilers should emit a warning when they detect irreducible flow graphs :).

  11. webreac, thanks for the link, I haven’t read that one.

    Tim, I totally disagree. Code should be written in a way that communicates the developer’s intent. Goto is a tool that we sometimes use to accomplish this.

    Jeroen, did you know Dijkstra? I never met him but “speaking bluntly and with great conviction” echoes what I’ve heard from other sources (others have put it a bit more strongly, sometimes).

    Pierre, interesting. I hadn’t thought about the local variable issue, nor have I written code in the style you mention where we test for allocation explicitly. I like the catwalk analogy.

  12. Code like this could be written much more cleanly and succinctly using something like D’s scope guards, C#’s using/Python’s with/… or C++’s RAII.
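
    Standard C has none of these, but GCC and Clang provide a nonstandard cleanup variable attribute that gets close to a scope guard. A minimal sketch, assuming one of those compilers; process, free_buffer, and the cleanup_calls counter are names made up for this example.

```c
#include <stdlib.h>

enum { SUCCESS = 0, ERROR = -1 };

static int cleanup_calls;  /* demonstration only: counts handler runs */

/* Cleanup handler: receives a pointer to the variable leaving scope. */
static void free_buffer(char **p)
{
    free(*p);  /* free(NULL) is a no-op */
    cleanup_calls++;
}

int process(size_t n, int fail_midway)
{
    if (n == 0) return ERROR;

    /* GCC/Clang extension, not ISO C. */
    __attribute__((cleanup(free_buffer))) char *buf = malloc(n);

    if (buf == NULL) return ERROR;  /* handler still runs, frees nothing */
    if (fail_midway) return ERROR;  /* buf is freed automatically */
    buf[0] = '\0';
    /* ... more logic ... */
    return SUCCESS;                 /* freed here as well */
}
```

    The handler runs on every exit from the variable's scope, so this form suits resources that live only for the duration of the function, not ones that must survive a successful init.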

  13. regehr: to be honest, my experience is in embedded (well, feature phones running real-time OSes, that still counts, right?), but not in kernel/driver code, where typically implicit/global state gets manipulated, rather I worked on multimedia libraries in userland, so the techniques and style may be different. That technique does scale relatively well though since it has also worked well for me when doing high-level coding in Cocoa Touch (the iPhone development framework).

  14. @regehr: I attended a lecture by Dijkstra once (special guest appearance), but I didn’t know him personally. All my information is second-hand, but even second-hand the man’s style shines through. (One of my professors was an ardent follower, and if imitation is the sincerest form of flattery, by all accounts he’s a very sincere man…)

    Dijkstra *was* quite often wrong, by any reasonable practical standard, but always gloriously so, and never without something you could take away from it. It saddens me to think that in some circles he’ll only be remembered as “the guy who told us goto was bad” (and perhaps semaphores and the eponymous algorithm).

  15. Bob,

    I find your comment interesting considering the Linux kernel is probably the most reasonable C codebase I’ve ever encountered. I’m sure there are dark, nasty corners I just haven’t encountered yet, but nothing I’ve seen from it so far stands out as particularly bad.

    What specifically don’t you like about the way the Linux kernel is coded?

  16. There is, of course, a solution that doesn’t require the use of ‘goto’, that doesn’t require the use of any more curly brackets than the ‘goto’ version, that doesn’t require any extra conditional expressions to be evaluated in the case that the function succeeds, and that allows the function to exit at the bottom in both the success and failure conditions (all of which are desirable). It looks like this:

    int init_device (void) {
      int result = allocate_memory();

      if (result == SUCCESS)
        result = setup_interrupts();

      if (result == SUCCESS)
        result = setup_registers();

      if (result == SUCCESS)
        … more logic …
      else
        deinit_device ();

      return result;
    }

    The deinit_device () function—which you have anyway—calls teardown_interrupts() and free_memory(). As with all well-written functions, these two functions should be capable of operating without error regardless of whether allocate_memory() and setup_interrupts() were called or completed properly. In a great many initialization functions this is straightforward, because they rely on pointers or other return values that can be used to track the success/failure state of the initialization function; in the relative few that don’t, good practice dictates inclusion of a state variable for this purpose.

    In addition to the benefits listed above, isolating all of the deinitialization code in a single function and set of subfunctions allows you to prevent spreading the same cleanup code in multiple places in the code. For instance, if allocate_memory() allocates a dozen blocks of memory, is it going to include calls to free the first 11 of those blocks itself in the case of failure, and then duplicate calls to free those 11 blocks again in free_memory()?
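
    One way to read the state-variable suggestion above is sketched below. The dev struct, its field names, and the fail_step switch are assumptions for illustration; only the function names come from the thread.

```c
enum { SUCCESS = 0, ERROR = -1 };

/* Hypothetical driver state: records which steps have completed. */
static struct {
    int memory_allocated;
    int interrupts_set;
} dev;

static int fail_step;  /* test hook: which step should fail; 0 = none */

static int allocate_memory(void)
{
    if (fail_step == 1) return ERROR;
    dev.memory_allocated = 1;
    return SUCCESS;
}

static int setup_interrupts(void)
{
    if (fail_step == 2) return ERROR;
    dev.interrupts_set = 1;
    return SUCCESS;
}

static int setup_registers(void)
{
    return fail_step == 3 ? ERROR : SUCCESS;
}

/* Each teardown checks the state first, so it is safe to call even
 * if the matching setup never ran or failed. */
static void teardown_interrupts(void)
{
    if (!dev.interrupts_set) return;
    dev.interrupts_set = 0;
}

static void free_memory(void)
{
    if (!dev.memory_allocated) return;
    dev.memory_allocated = 0;
}

void deinit_device(void)
{
    teardown_interrupts();
    free_memory();
}

int init_device(void)
{
    int result = allocate_memory();

    if (result == SUCCESS)
        result = setup_interrupts();
    if (result == SUCCESS)
        result = setup_registers();
    if (result != SUCCESS)
        deinit_device();

    return result;
}
```

    The cost is the extra state check in every teardown function, which is exactly the property the follow-up comments argue about.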

  17. Hi Nate, unfortunately I would guess that many deallocation functions are not in fact written to work properly when the corresponding allocation failed (or was never called). I agree it would be nice if they were.

  18. Actually, you don’t have to rely on the deallocation functions in the OS working properly when the corresponding allocation failed (or was never called). The deallocation functions I’m referring to are the ones in the driver code that includes init_device().

    For instance, teardown_interrupts() and free_memory() appear to be driver-level functions that would make calls to the OS to step through interrupt deinitialization and free memory, respectively, so it’s trivial to make sure that these driver-level functions are capable of operating without error even if allocate_memory() and setup_interrupts() were never called or completed properly.

  19. Nate, sloccount says there are about 6 million lines of code in Linux’s drivers subdirectory. How is it trivial to go through that code making sure that deallocation functions in this code base work properly when the allocation failed or wasn’t called?

  20. It strikes me that the example code given might be better served with a switch/case that cascades through. In fact, I might consider that a more reasonable and obvious solution, as it declares the failure logic up front and clearly delimited, and you can be confident that, outside the switch, the rest of the code is purely logic to do with initialising the device rather than handling error cases.
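
    The comment does not spell the idea out, so the following is only one possible reading: a helper that unwinds through a fall-through switch, keyed on how many steps completed. The unwind helper, the stage counter, and the stub bodies are all assumptions for illustration.

```c
enum { SUCCESS = 0, ERROR = -1 };

/* Hypothetical stand-ins for the driver steps named in the post. */
static int fail_step;       /* which step should fail; 0 means none */
static int memory_held;     /* 1 while memory is allocated */
static int interrupts_set;  /* 1 while interrupts are configured */

static int allocate_memory(void)
{
    if (fail_step == 1) return ERROR;
    memory_held = 1;
    return SUCCESS;
}

static int setup_interrupts(void)
{
    if (fail_step == 2) return ERROR;
    interrupts_set = 1;
    return SUCCESS;
}

static int setup_registers(void)
{
    return fail_step == 3 ? ERROR : SUCCESS;
}

static void teardown_interrupts(void) { interrupts_set = 0; }
static void free_memory(void)         { memory_held = 0; }

/* Failure logic, declared up front: undo the first `stage` steps. */
static void unwind(int stage)
{
    switch (stage) {
    case 2:
        teardown_interrupts();
        /* fall through */
    case 1:
        free_memory();
        /* fall through */
    default:
        break;
    }
}

int init_device(void)
{
    int stage = 0;  /* number of steps that have completed */

    if (allocate_memory() == SUCCESS)
        stage = 1;
    if (stage == 1 && setup_interrupts() == SUCCESS)
        stage = 2;
    if (stage == 2 && setup_registers() == SUCCESS)
        return SUCCESS;

    unwind(stage);
    return ERROR;
}
```

    The case labels play the same role as the out1, out2, out3 labels in the post, so the matching problem between setup and teardown order remains; it just moves into the switch.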

  21. I haven’t suggested reworking the Linux codebase to remove all uses of ‘goto’, so that’s a strawman argument. I haven’t even argued for changing what is a de facto coding standard in the Linux kernel.

    All I’ve said is that the use of ‘goto’ is not the only way to avoid descending into the seventh circle of nesting hell when trying to unwind initialization. There is a third way, which I’ve shown, which avoids the disadvantages of the two more common approaches that you identified in your original blog post.

    When writing or rewriting a driver or other code module, it is indeed trivial to ensure that deinitialization functions are robust enough to be called at any point in initialization without error. And that’s exactly how the Linux community tends to make these sorts of changes: by deciding ‘here’s a better approach’ and then incrementally implementing it when new code is written or old code is rewritten.

  22. Can’t all legitimate uses of goto (and they exist) essentially be compressed into something like “you need a state machine”? Exceptions are handy, but they aren’t a state machine. Doesn’t Knuth’s essay, which I haven’t read in years, say something like that? Of course, the problem with goto is that state machines are 2d and goto is essentially 1d, like code in general. But 2d approaches to code I find unmanageable and unwieldy, and a lot of other people do so as well, though LabView seems to be widely used, so somebody doesn’t.

  23. @#1: There’s an interesting paper by Gylfason and Hjálmtýsson titled “Exceptional Kernel: Using C++ Exceptions in the Linux Kernel”. The authors argue that C++ exceptions can be made sufficiently cheap that they can be used in most kernel code.

  24. The idiomatic use at the lead of the article can be improved in a subtle way: do not use anonymous labels like “out2”. Use labels that correspond to the failed operation, e.g.

    if (alloc() != 0) goto out_alloc;

    This permits an easier addition and removal of code blocks. Maintainability!

    The same goes for the indentation, actually, because changing the indentation level inflates diffs needlessly. That is why the goto-based exception handling is superior, as long as rigid discipline is maintained, of course.

    For some reason beginners fail to grasp the issues of program lifecycle, and think that they only need to debug a piece of code once.
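
    Applied to the example from the post, the naming suggestion above might look like this sketch. The label names follow the out_alloc pattern from the comment; the stub bodies and the fail_step switch are assumptions added so the code compiles.

```c
enum { SUCCESS = 0, ERROR = -1 };

/* Hypothetical stand-ins for the driver steps named in the post. */
static int fail_step;       /* which step should fail; 0 means none */
static int memory_held;     /* 1 while memory is allocated */
static int interrupts_set;  /* 1 while interrupts are configured */

static int allocate_memory(void)
{
    if (fail_step == 1) return ERROR;
    memory_held = 1;
    return SUCCESS;
}

static int setup_interrupts(void)
{
    if (fail_step == 2) return ERROR;
    interrupts_set = 1;
    return SUCCESS;
}

static int setup_registers(void)
{
    return fail_step == 3 ? ERROR : SUCCESS;
}

static void teardown_interrupts(void) { interrupts_set = 0; }
static void free_memory(void)         { memory_held = 0; }

int init_device(void)
{
    /* Each label is named after the operation whose failure jumps to it. */
    if (allocate_memory() != SUCCESS) goto out_alloc;
    if (setup_interrupts() != SUCCESS) goto out_interrupts;
    if (setup_registers() != SUCCESS) goto out_registers;
    /* ... more logic ... */
    return SUCCESS;

out_registers:
    teardown_interrupts();
out_interrupts:
    free_memory();
out_alloc:
    return ERROR;
}
```

    Adding a step between two existing ones then means adding one label and one cleanup call, with no renumbering of out1 through out3.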