
GCC pre-4.8 Breaks Broken SPEC 2006 Benchmarks

UPDATE: Ok, I did something stupid. I wrote most of this post while using a pre-4.8 snapshot that I had sitting around and then I built the actual 4.8.0 release to verify the behavior but I screwed up a path or something. The 4.8.0 release does not have the behavior described in this post. It was patched a few days before the release, see comments 9 and 10 below. I apologize for the sloppiness. Of course, SPEC is still broken and there are still plenty of undefined programs not in SPEC that GCC will break for you…

Near the top of the GCC 4.8.0 release notes we find this:

GCC now uses a more aggressive analysis to derive an upper bound for the number of iterations of loops using constraints imposed by language standards. This may cause non-conforming programs to no longer work as expected, such as SPEC CPU 2006 464.h264ref and 416.gamess. A new option, -fno-aggressive-loop-optimizations, was added to disable this aggressive analysis.

The thing that is happening here—a compiler breaking a previously working program by exploiting undefined behavior—is a topic I’ve written plenty about, but breaking the current version of SPEC seems like a big enough deal to make it worth looking into. If you want the goriest possible details, check out some discussion at the GCC bugzilla and the response from SPEC. Here I’ll just present the essentials.

The problem with h264ref is a function that contains code like this:

int d[16];

int SATD (void)
{
  int satd = 0, dd, k;
  for (dd=d[k=0]; k<16; dd=d[++k]) {
    satd += (dd < 0 ? -dd : dd);
  }
  return satd;
}

Using GCC 4.8.0 on Linux on x86-64, we get this (I’ve cleaned it up a bit):

$ gcc -S -o - -O2 undef_gcc48.c
SATD:
.L2:
	jmp	.L2

Of course this is an infinite loop. Since SATD() unconditionally executes undefined behavior (it’s a type 3 function), any translation (or none at all) is perfectly acceptable behavior for a correct C compiler. The undefined behavior is accessing d[16] just before exiting the loop. In C99 it is legal to create a pointer to an element one position past the end of the array, but that pointer must not be dereferenced. Similarly, the array cell one element past the end of the array must not be accessed.

One thing we might ask is: What is the compiler thinking here? Why not just give a nice warning or error message if the program is going to unconditionally execute undefined behavior? The basic answer is that the compiler isn’t somehow maliciously saying to itself: “Aha– undefined behavior. I’ll screw up the object code.” Rather, the compiler is simply assuming that undefined behavior never occurs. How does that come into play in SATD()?

In detail, here is what’s going on. A C compiler, upon seeing d[++k], is permitted to assume that the incremented value of k is within the array bounds, since otherwise undefined behavior occurs. For the code here, GCC can infer that k is in the range 0..15. A bit later, when GCC sees k<16, it says to itself: “Aha– that expression is always true, so we have an infinite loop.” The situation here, where the compiler uses the assumption of well-definedness to infer a useful dataflow fact, is precisely analogous to the null-pointer check optimization which got some Linux kernels into trouble.

My view is that exploiting undefined behavior can sometimes be OK if a good debugging tool is available to find the undefined behavior. Since the bug here is a memory safety error, we’ll turn to Valgrind and the address sanitizer (an awesome new feature in GCC 4.8.0, ported over from LLVM). Here goes:

$ gcc undef_gcc48.c
$ valgrind -q ./a.out
$ gcc -fsanitize=address undef_gcc48.c
$ ./a.out 
$ clang -fsanitize=address undef_gcc48.c
$ ./a.out 
$

No dice, bummer. Valgrind cannot change the layout of objects so I would not have expected it to notice this, but it seems like the address sanitizer should be able to catch the problem. On the other hand, where dynamic analysis fails us, static analysis succeeds:

$ gcc -O3 undef_gcc48.c -Wall 
undef_gcc48.c: In function ‘SATD’:
undef_gcc48.c:6:29: warning: array subscript is above array bounds [-Warray-bounds]
   for (dd=d[k=0]; k<16; dd=d[++k]) {
                             ^
undef_gcc48.c: In function ‘main’:
undef_gcc48.c:6:29: warning: array subscript is above array bounds [-Warray-bounds]
   for (dd=d[k=0]; k<16; dd=d[++k]) {
                             ^

For whatever reason, this warning shows up at -O3 but not -O2. We should, however, not pat ourselves on the back quite yet—this kind of static analysis is generally somewhat brittle and we cannot expect to always get such a nice warning.

It would be reasonable to ask the larger question of whether a compiler should be breaking code like this, but it’s late to be opening that can of worms, so I’ll leave this discussion for another day (or for the comments). I have not looked into 416.gamess but I suspect it’s more of the same.

{ 29 } Comments

  1. Jesse Ruderman | March 23, 2013 at 1:18 am | Permalink

    Interesting that Address Sanitizer (in both gcc and clang) misses the bug. I guess it’s not turning off the “assume accesses are in-bounds” optimization? I think it should.

  2. Jesse Ruderman | March 23, 2013 at 1:26 am | Permalink

    It would be nice if compilers would say “Hey, I compiled your program into an infinite loop. But looking back at your original code, syntactically there was a loop condition, so maybe this isn’t quite what you wanted?”

  3. tobi | March 23, 2013 at 3:56 am | Permalink

    Quoting from the bug: “It certainly reads memory past the end of an array, and so assigns an undetermined value to the variable dd.”

    What more is there to say? I can’t believe that even professionals don’t know what undefined behavior means. At the very least they must assume that an access violation is possible when reading off the end of the array. They just mentally tune this out.

  4. Barry Kelly | March 23, 2013 at 4:46 am | Permalink

    Tobi, the value of d[16] is never used – the loop exits after it is read; some compilers would eliminate the read. It is a bug in the code, but is it the right logic for a compiler to assume that seemingly buggy code is never executed and replace it with a nop?

    The compiler should be more helpful IMHO. Either warn (without the need to amplify warning settings) when it discovers this (so warning only at -O3 would be OK) or compile the code with the bad behaviour as is – the programmer may know that this array is stored inside a larger object so it’s never a problem in practice.

    The compiler shouldn’t be a smart-ass adversary though.

  5. rp1 | March 23, 2013 at 5:52 am | Permalink

    “Aha– undefined behavior. I’ll screw up the object code.”

    That’s exactly what it’s doing. No thanks.

  6. Slothrop | March 23, 2013 at 8:02 am | Permalink

    I cannot reproduce the issue with the final version that was released yesterday:
    gcc version 4.8.0 (GCC)

  7. Joseph Mansfield | March 23, 2013 at 8:22 am | Permalink

    “It would be reasonable to ask the larger question of whether a compiler should be breaking code like this”

    The compiler isn’t breaking any code – the code is already broken. It invokes undefined behaviour and you get what you asked for.

  8. No | March 23, 2013 at 8:59 am | Permalink

    Autism is a hell of a drug. Maybe if you don’t try to be such a smartass with your code to the point where I can’t figure out what it’s supposed to do, you won’t run into those kinds of problems.

    Succinct code is important, but this is just taken to a whole new level of pretentious idiocy.

  9. Alexander Monakov | March 23, 2013 at 9:45 am | Permalink

    Please note that the situation improved significantly on March 14, when the bugfix for GCC PR 53265 [1] landed on trunk (the 4.8 release includes it). Now GCC is able to emit a warning (“iteration 15 invokes undefined behavior”), and it actually avoids “breaking” the code after emitting the warning.

    [1] http://gcc.gnu.org/bugzilla/show_bug.cgi?id=53265

  10. Slothrop | March 23, 2013 at 9:50 am | Permalink

    This was fixed by the following commit:
    http://gcc.gnu.org/git/?p=gcc.git;a=commitdiff;h=228bf2b89cffbf4bd36f5b3febeb2e83e956e096

  11. regehr | March 23, 2013 at 10:20 am | Permalink

    Alexander and Slothrop, thanks! I have added an update to the post.

    Slothrop, are you still having trouble reproducing this behavior? If so, please send me the output of “gcc -v; gcc -S -O2 undef_gcc48.c -o -”

  12. Jakub Jelinek | March 23, 2013 at 10:31 am | Permalink

    The PR53265 commit actually made it into the GCC 4.8.0 release, so there is no difference between 4.8.0 and 4.8.1 (to be released in 2-3 months) in this regard.
    If a loop has just a single exit and a known constant number of iterations, as in the SPEC loop, we just don’t lower the upper bound estimate because of undefined behavior constraints, and in many cases we actually even warn (not in this case: there was a pass ordering issue, so for 4.8 we had to decide between warning more often but with false positives, or warning less often without false positives).
    Were you above trying some older GCC version (as in, before March 14th or so)? For 4.9.0 we’ll hopefully resolve the issues with false positives and make the warning trigger even for the above loop.

  13. Alexander Monakov | March 23, 2013 at 10:34 am | Permalink

    John, why does your update talk about 4.8.1? March 14 was more than a week ago, the new warning+breakage avoidance code *is* present in 4.8.0.

  14. regehr | March 23, 2013 at 10:40 am | Permalink

    Hi Alexander and Jakub, I was previously using a pre-4.8.0 compile of GCC, but I did build the actual 4.8.0 release yesterday in order to verify this behavior before writing the blog post. If I screwed that up, then I apologize and will add a big note about this at the top of the post. Double checking now…

  15. regehr | March 23, 2013 at 10:59 am | Permalink

    Hi Alexander and Jakub, of course you’re right and I’ve added a big note at the top of this post and changed its title. Sorry!

  16. Warp | March 23, 2013 at 3:35 pm | Permalink

    I must agree with the sentiment “The compiler isn’t breaking any code – the code is already broken.”

    It’s not the compiler’s job to try to guess what buggy code “is really trying to do” or what the programmer “really intended”, and somehow fix it. If the code has a bug (causing UB) then it has a bug. It’s not the compiler’s fault, it’s the programmer’s fault, and in this case the compiler is doing nothing that the language standard wouldn’t allow it to do.

    (Sure, a warning would be extremely nice and helpful, but the compiler isn’t required to issue one, especially because issuing such warnings in all possible cases is, AFAIK, provably impossible.)

  17. anon | March 23, 2013 at 8:03 pm | Permalink

    That’s stupid. Accessing the array at index 16 might be undefined behaviour, but it should still be very clear to the compiler that the loop is not endless: k is incremented, and there is no memory access within the loop that might change that fact, not even with aliasing assumed – just read accesses, plus writes to a local integer.

  18. Richard Smith | March 24, 2013 at 6:05 pm | Permalink

    clang -fsanitize=address happens to not catch this, but -fsanitize=undefined catches it:

    $ clang -fsanitize=undefined undef_gcc48.c
    $ ./a.out
    undef_gcc48.c:6:28: runtime error: index 16 out of bounds for type ‘int [16]‘
    $

  19. regehr | March 25, 2013 at 12:05 am | Permalink

    Thanks Richard! I know the address sanitizer has some fine-grained flags but I’m surprised that -fsanitize=address doesn’t turn on full bounds checking.

  20. Jakub Jelinek | March 25, 2013 at 1:41 am | Permalink

    BTW, if you want a testcase where -fno-aggressive-loop-optimizations changes generated code even in GCC 4.8.0, one testcase is e.g.
    int a[4];

    __attribute__((noinline, noclone)) int
    foo (int x)
    {
      int i, r = 0, n = x & 31;
      for (i = 0; i < n; i++)
        r += a[i];
      return r;
    }

    int
    main ()
    {
      int x = 255;
      __asm volatile ("" : "+r" (x));
      return foo (x);
    }

    With -O3 (and without -fno-aggressive-loop-optimizations), GCC determines the upper bound of the loop to be 4 rather than 32 (or more, when not considering VRP) and so decides to completely unroll the loop (with comparisons/branch out before every read from a), so it actually won't do any uninitialized read beyond end of a, while with -O3 -fno-aggressive-loop-optimizations it is not allowed to take it into account and thus vectorizes the loop and will happily read beyond end of a array.

    As for the original testcase, gcc -fsanitize=address -fno-common detects it, but not when the d array is common. See http://gcc.gnu.org/PR55739 on why common symbols aren’t instrumented by AddressSanitizer right now.

  21. Jakub Jelinek | March 25, 2013 at 2:01 am | Permalink

    Oh, BTW, in 464.h264ref d is an automatic array, not a global one, so AddressSanitizer complains about it just fine even without -fno-common (as it would also if e.g. the array was initialized to something non-zero).

  22. regehr | March 25, 2013 at 6:59 am | Permalink

    Thanks Jakub!

  23. bcs | March 25, 2013 at 12:16 pm | Permalink

    Does UB allow altering the compile-time behavior of a program? I.e., if a function can be proven to be type 3, issue a warning (if enabled) and generate no output symbol for it (regardless of what warnings are enabled) in the object file.

  24. Uwe Meyer-Gruhl | March 25, 2013 at 5:06 pm | Permalink

    I suspect that exactly these kinds of optimizations trigger another problem that apparently went unnoticed:

    When I cross-compile linux 3.8.4 with gcc 4.8.0, my ARM machine boots the kernel and stalls completely after the message “booting the kernel”. The same kernel compiled with gcc 4.7.2 works just fine.

    I even tried to use “-fno-aggressive-loop-optimizations”, but to no avail. I do not have the time or means to investigate any further, but this sucks.

  25. Danilo Piazzalunga | March 26, 2013 at 12:58 am | Permalink

    bcs: I think so.

    Undefined behavior literally allows the compiler to do anything it wants. See also http://en.wikipedia.org/wiki/Undefined_behavior, especially the section “Compiler easter eggs”.

  26. regehr | March 26, 2013 at 6:40 am | Permalink

    Hi bcs, this could be done; I remember a version of LLVM (perhaps not any released version, though) that did not even generate a return instruction for some type 3 functions. But really, if the compiler can actually notice this kind of thing happening, a clear error message would be appreciated.

    Hi Uwe, I agree, that sucks!

  27. Mans | March 26, 2013 at 6:50 am | Permalink

    Uwe, try enabling earlyprintk in your kernel config and command line. This might give you a better idea of where things are blowing up. What ARM hardware are you using? (Sorry for drifting off-topic.)

  28. Uwe Meyer-Gruhl | March 26, 2013 at 10:32 am | Permalink

    My ARM machine is an Iomega iConnect (http://www.congenio.de/infos/iconnect.html).

    BTT: For the example given, I would expect the compiler to issue a warning for a non-terminating loop and not silently generate the code for it. Like, when you say:

    char c;

    if (c > 127) {

    }

    you also get a warning that the condition will always yield false.

  29. Uwe Meyer-Gruhl | April 2, 2013 at 4:17 pm | Permalink

    @Mans: thanks for the tip, but it turns out that the kernel dies early during tty_unregister_device(), namely in klist_next(). It tries to deallocate space for the unused serial devices.

    It looks like the “fun case analysis” to me, where a pointer check is skipped because of optimizations, see here:

    http://blog.regehr.org/archives/213

    However, the linux 3.8.5 code cannot even be compiled with “-O0” for a check, it breaks in mm/memory.c, so I am at an end there. I guess that this will not be the only place where gcc 4.8.0 “will break the code for you”. ;)

    Linux kernel maintainers will have a hard time getting that to work…