C Compilers Disprove Fermat’s Last Theorem


[Update: I wrote another post on this topic that may explain the underlying issues more clearly.]

Obviously I’m not serious: compilers are bad at solving high-level math problems, and there is also good reason to believe this theorem cannot be disproved. But I’m getting ahead of myself. Recently — for reasons that do not matter here — I wanted to write C code for an infinite loop, but one where the compiler would not be able to tell that the loop was infinite.  If I had merely written

  while (1) { }

or

  for (;;) { }

most optimizing compilers would notice that the loop can never exit and generate code accordingly. For example, given this C code:

  void foo (void)
  {
    for (;;) { }
    open_pod_bay_doors();
  }

Most compilers will emit something like this:

  foo:
    L2: jmp L2

In this case the compiler emits neither the call to open_pod_bay_doors() nor the final return instruction because both are provably not executed.

Perhaps interestingly, LLVM/Clang recognizes that this slightly obfuscated infinite loop never exits:

  unsigned int i = 0;
  do {
    i+=2;
  } while (0==(i&1));

Faced with a loop optimizer that has some brains, I decided to stop messing around and wrote a loop that should thwart any compiler’s termination analysis:

  const int MAX = 1000;
  int a=1,b=1,c=1;
  while ((c*c*c) != ((a*a*a)+(b*b*b))) {
    a++;
    if (a>MAX) {
      a=1;
      b++;
    }
    if (b>MAX) {
      b=1;
      c++;
    }
    if (c>MAX) {
      c=1;
    }
  }

This loop only terminates if it finds a counterexample to a special case of Fermat’s Last Theorem.  Fermat’s Last Theorem, of course, states that no solution exists for the equation a^n + b^n = c^n for positive integers a, b, and c and for integer n>2. Here n=3 and a, b, c are in the range [1..1000]. On a platform with 32-bit integers, 1000 is a reasonable maximum because 2*1000^3 is not much less than 2^31.

It turns out that when optimizations are enabled, several compilers (LLVM/Clang 2.7, Open64-x86 4.2.3, Sun CC 5.10, and Intel CC 11.1.072) go ahead and permit this loop to terminate.  Specifically, when the loop is enclosed in a function, the compilers emit x86 assembly which looks something like this:

  fermat:
    ret

The implication, of course, is that the compiler has disproved Fermat’s Last Theorem.  Faced with this incredible mathematical discovery, I held my breath and added a line of code at the end of the function to print the counterexample: the values of a, b, and c.  Unfortunately, with their bluffs called in this fashion, all of the compilers emitted code that actually performed the requested computation, which of course does not terminate.  I got the feeling that these tools — like Fermat himself — had not enough room in the margin to explain their reasoning.

What is really going on here is that compiler optimizations and termination analysis have long been at odds.  In other words, if compilers were obligated to preserve the termination properties of the code they translate, they would be unable to perform (or would have more difficulty performing) some of the optimizations that they use to create efficient code.  A choice is being made by compiler developers — probably consciously, though it’s hard to be sure — to prefer speed over correctness. The news, however, is not all bad: Microsoft’s C compiler, the Wind River Diab C compiler, and several versions of GCC all did the right thing, changing the termination properties of none of the examples I tried.

Update from Sat 5/1: It turns out the LLVM folks have been working on this problem lately, and their latest SVN version no longer contains this bug.  Very nice!

Update from Sat 5/1: Someone on Reddit noticed (and one of my students confirmed) that the Microsoft compilers do have termination bugs.  The compilers in both Visual Studio 2008 and 2010 generate code for the Fermat function, but then calls to this function are dropped because it is believed to be free of side effects (this was exactly what LLVM did before they fixed the problem).

Update from Friday 4/30

I’ll try to clarify a few of the questions that have come up on Reddit and in the comments here.  Also I fixed a mistake in the statement of Fermat’s Last Theorem that someone on Reddit pointed out.  Thanks!

Q: Does this actually matter at all?

A: Yes, but in very specialized situations that usually only come up when developing embedded software.  One example is described here: the poster wants the program being simulated to hang when main() exits, but LLVM deletes the loop that was intended to hang up the processor.  The workaround was to compile the code with optimizations turned off.  Another example happens when an embedded system has updated its firmware and wants to do nothing until the watchdog timer reboots the processor into the new version.  It’s no coincidence that gcc and the Wind River C compiler — both of which are heavily used in the embedded world — get termination right.
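
A minimal sketch of the second scenario (my own illustration, not code from the bug report; real firmware would of course configure the watchdog through vendor-specific registers) might look like this:

  /* After staging a firmware update, stop servicing the watchdog timer and
     simply spin; the hardware watchdog eventually expires and reboots the
     processor into the new image.  This is exactly the kind of
     side-effect-free loop that the compilers discussed above may
     (incorrectly) delete. */
  void wait_for_watchdog_reset (void)
  {
    for (;;) { }
  }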

Q: Since infinite loops are bad style, isn’t it OK for the compiler to terminate them?  Shouldn’t people be putting the CPU to sleep, blocking the running thread, or whatever?

A: First, not all programs have an operating system or even a threading system to call out to. Embedded software commonly runs on the bare metal.  Second, the meaning of a program is defined by the language standard and style has nothing to do with it.  See my earlier post The Compiler Doesn’t Care About Your Intent.

Q: Does the C standard permit/forbid the compiler to terminate infinite loops?

A: The compiler is given considerable freedom in how it implements the C program, but its output must have the same externally visible behavior that the program would have when interpreted by the “C abstract machine” that is described in the standard.  Many knowledgeable people (including me) read this as saying that the termination behavior of a program must not be changed.  Obviously some compiler writers disagree, or else don’t believe that it matters.  The fact that reasonable people disagree on the interpretation would seem to indicate that the C standard is flawed.  In contrast, the Java language definition is quite clear that infinite loops may not be terminated by the JVM.

Q: Are you saying the compiler should do termination analysis?  That’s impossible by trivial reduction to the halting problem.

A: Termination analysis does not need to be part of the compiler at all. However, I (and others) would claim that the compiler should perform a termination analysis of any useless loop before deleting it.  Although the general problem is not computable, many specific instances can be easily solved.
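
For instance, here is a minimal sketch of my own (not code from any particular compiler) contrasting a loop whose termination is trivially provable with one whose termination is genuinely hard to establish:

  /* A side-effect-free counted loop with a constant step and a constant
     bound: a compiler can easily prove that it terminates, so deleting it
     cannot change the program's termination behavior. */
  void provably_terminating (void)
  {
    int i;
    for (i = 0; i < 1000; i++) { }
  }

  /* A side-effect-free loop whose exit condition depends on a nontrivial
     property of the values it computes (a Collatz-style iteration, using
     unsigned arithmetic so wraparound is well defined): absent a
     termination proof, a careful compiler should leave it in place. */
  void not_obviously_terminating (unsigned x)
  {
    while (x > 1) {
      x = (x % 2 == 0) ? x / 2 : 3 * x + 1;
    }
  }

The claim is not that compilers must handle the second case, only that loop deletion should be restricted to cases like the first, where termination can actually be established.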

Q: Does the Fermat code in this post execute any signed integer overflows or other undefined behaviors?

A: I don’t believe so.

Update from Saturday 5/1

Q: Didn’t you know Fermat’s Last Theorem was proved in 1995?

A: I did know that.  Since I got my math degree in 1995, it would have been very difficult for me to miss this event :). I was making a weak joke and also referring to the fact that proofs, especially complicated ones, can contain errors. In fact, as someone noted in the comments, Wiles’ initial proof was wrong. Also note that the n=3 special case was proved much earlier, in 1770.

Q: What’s the best workaround if you really want an infinite loop in C?

A: As several people have pointed out, looping on a volatile-qualified variable is probably the best choice. But keep in mind that compilers don’t always respect volatile….
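
For concreteness, here is a minimal sketch of that workaround (my own illustrative code, with made-up names):

  /* The volatile qualifier forces a load of keep_spinning on every
     iteration; accesses to volatile objects are part of the program's
     observable behavior, so a conforming compiler should not delete
     this loop. */
  static volatile int keep_spinning = 1;

  void hang_forever (void)
  {
    while (keep_spinning) {
      /* spin */
    }
  }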

One more update from Saturday 5/1

Here’s a fun complete program that is more compelling than the code above because it explicitly uses a return value from the “theorem disproved” branch of the code:

int fermat (void)
{
  const int MAX = 1000;
  int a=1,b=1,c=1;
  while (1) {
    if (((a*a*a) == ((b*b*b)+(c*c*c)))) return 1;
    a++;
    if (a>MAX) {
      a=1;
      b++;
    }
    if (b>MAX) {
      b=1;
      c++;
    }      
    if (c>MAX) {
      c=1;
    }
  }
  return 0;
}

#include <stdio.h>

int main (void)
{
  if (fermat()) {
    printf ("Fermat's Last Theorem has been disproved.\n");
  } else {
    printf ("Fermat's Last Theorem has not been disproved.\n");
  }
  return 0;
}

Here’s what the Intel and Sun compilers have to say:

regehr@john-home:~$ icc fermat2.c -o fermat2
regehr@john-home:~$ ./fermat2
Fermat's Last Theorem has been disproved.
regehr@john-home:~$ suncc -O fermat2.c -o fermat2
"fermat2.c", line 20: warning: statement not reached
regehr@john-home:~$ ./fermat2
Fermat's Last Theorem has been disproved.

Open64-x86 and LLVM/Clang 2.7 have the same behavior.  Although plenty of folks in the peanut gallery disagree, it seems perfectly clear to me that this is a serious error in these compilers.  I mean, why return 1?  Is that worse or better than returning 0?  Neither result makes any sense.


49 responses to “C Compilers Disprove Fermat’s Last Theorem”

  1. (You probably already have realized this.) The reason the compiler produced code that terminated was based upon the fact that the values being altered (i.e., a, b and c) within the “while” loop are never referenced again after the loop. But, once you added code which printed the resulting values it was forced to actually generate the code for the loop.

  2. I suspect your earlier “do…while” loop would also compile to a program which indefinitely iterated if you were to utilize the value “i” (in a non-trivial way) after the loop.

  3. Hi Justin- Yeah, it’s the lack of side effects that lets the compiler get rid of these loops. The problem is whether non-termination is considered a side effect!

    Regarding the do..while loop, you’re right, the Intel compiler for example no longer eliminates the loop.

  4. A compiler could prove that this loop does not terminate.

    Trivially, you can run it for a while (10^9 iterations) and you’ll end up back at your initial conditions, guaranteeing it will never terminate.

    I’m sure one could conceive of a more general class of termination conditions and a more efficient way to detect those that would also include this loop.

    The problem is that computer arithmetic is finite, so any “for-all” type statement in math will be problematic to use to thwart a compiler.

  5. It’s important to remember that C ints are not mathematical ints. In particular, C ints have an upper limit and overflow is undefined behavior.

    There are two cases to consider:

    1. There’s a solution. The loop terminates and it can be omitted as the results are unused.

    2. There’s no solution. The int a will eventually overflow, and the behavior of the code is undefined. Thus the compiler is allowed to do anything, including omitting the loop. At least gcc 4.1 and later will perform such optimizations.

  6. *shrug* I’m with the LLVM/Clang developers on this one. Being able to delete useless loops like this one without proving termination is useful.

    Also, I’d echo the previous comments that signed integer overflow in C is undefined. I wonder what happens if you make your integers unsigned. I’d guess it produces the same code.

  7. Hi Reid and others,

    This bug in LLVM’s optimizer has already caused problems for a group using it to compile embedded software. This is in LLVM’s bugzilla somewhere but I don’t have the link handy. They had to compile the module containing the infinite loop w/o optimizations to make their system work properly!

    John

  8. I assume for each compiler you are using the default settings and optimization levels? Did you try different compiler flags and find that these are the results across the board?

  9. Hi A,

    This wasn’t using the default optimization level, which I think is “no optimization” for all of these compilers. I used -O2, I believe. No, I didn’t do any kind of systematic survey.

    John

  10. You missed the really interesting way to approach this. You can use the C preprocessor to write a program that will compile at all only if the theorem is false. How long a program, IDK.

  11. I think if you were to declare the variables a, b, and c as ‘volatile int’ instead of just ‘int’, the compiler would be forced to read the value of the variable each time it is accessed. In that case, the generated code will probably contain the loop instructions because the variable manipulations shouldn’t be optimized out.

  12. Hi Fish- Please try the program below. The emitted code for fermat() does the specified computation, but main() never calls it! The optimizer concludes that fermat() is free of side effects and drops the call.

    void fermat (void)
    {
      const int MAX = 1000;
      int a=1,b=1,c=1;
      while ((a*a*a) != ((b*b*b)+(c*c*c))) {
        a++;
        if (a>MAX) {
          a=1;
          b++;
        }
        if (b>MAX) {
          b=1;
          c++;
        }
        if (c>MAX) {
          c=1;
        }
      }
    }

    int main (void)
    {
      fermat();
      return 0;
    }

  13. Fish– If you turn on LTO, the code below is the entire output.

    ; ModuleID = '/tmp/webcompile/_4199_0.bc'
    target datalayout = "e-p:64:64:64-i1:8:8-i8:8:8-i16:16:16-i32:32:32-i64:64:64-f32:32:32-f64:64:64-v64:64:64-v128:128:128-a0:0:64-s0:64:64-f80:128:128-n8:16:32:64"
    target triple = "x86_64-linux-gnu"

    define i32 @main() nounwind readnone {
    entry:
      ret i32 0
    }

  14. ” while ((c*c*c) != ((a*a*a)+(b*b*b))) {”
    This will overflow, and the compiler can tell that it will. Signed overflow is undefined. Thus, a compiler would be perfectly correct to replace your program with an instruction that causes monkeybats to fly out of your nose.

  15. Ok, I can reproduce that. So it’s not quite right to say it generates no code for the loop – it does, as you can verify by compiling a function containing the loop.

    What clang apparently does do is elide the function call. I would guess the reason you have to turn on LTO to see the loop disappear is the dead code stripping.

  16. Try making your variables volatile, or declaring them in a greater scope. Your compiler determines, properly, that they aren’t needed down the line and that the loop has no side effects.

  17. This reminds me of the super computer viruses that Arthur C. Clarke described in his book, 3001. The protagonists use a similar technique to force an almost-sentient alien computer to go into a never-ending routine, thereby saving humanity from premature judgement.

    Well done!

  18. Actually, it was a recently introduced bug in the inliner, for which I’ve just committed a fix to LLVM HEAD. I had previously tracked down all these issues in LLVM’s dead code elimination passes a couple of years ago, but it looks like a new one got introduced while I wasn’t looking.

  19. The interesting discussion aside, an infinite loop invisible to the compiler is easy to do with gcc:

    int i = 1;
    while (i)
      __asm__ volatile ("" : "+r"(i));

  20. I don’t see why this behaviour would be a problem. If you want a non-terminating loop that can be used then just do something with side-effects inside of it so that the compiler doesn’t optimise it away. I’d much rather a compiler remove a huge loop that has no effect than waste clock cycles doing nothing… if anything it would be nicer if this generated warnings so I could find the offending code and fix/remove it.

  21. Ha, this is normal compiler optimisation. It drops the code because the compiler detects that all results are never used.
    This is allowed for C/C++ compilers and every embedded developer (should) know that he has to force the compiler to not drop it if he writes infinite or delay loops.
    A common solution with gcc is to insert “asm volatile("nop")” into a delay loop to force the compiler to not drop the loop.
    Other ways are using volatile variables and other compiler hints.

    So there is no termination analysis done at all.

  22. Um, might be being a bit pedantic but surely you must have heard :

    “””The final proof of the conjecture for all n came in the late 20th century. In 1984, Gerhard Frey suggested the approach of proving the conjecture through the modularity conjecture for elliptic curves. Building on work of Ken Ribet, Andrew Wiles succeeded in proving enough of the modularity conjecture to prove Fermat’s Last Theorem, with the assistance of Richard Taylor. Wiles’s achievement was reported widely in the popular press, and has been popularized in books and television programs.
    “””
    (http://en.wikipedia.org/wiki/Fermat's_Last_Theorem)

  23. I can see your point that non-termination of loops might be a desired side-effect at some point and shouldn’t be optimised out. But similarly you might add in dummy instructions to delay a program briefly, should those be optimised out as well?

    Let’s say you have a bit of network code that you want to take over X microseconds to respond, but you don’t want to use heavyweight timers.

    In the real world there is no such thing as code without side-effects (all instructions use electricity and warm up the CPU); it is just often we don’t want the side effects that code comes with.

  24. “Are you saying the compiler should do termination analysis? That’s impossible by trivial reduction to the halting problem.”
    So is every compiler optimization, BTW. 🙂

    Very interesting observation, John!
    You may also have seen this: http://pho.ucsd.edu/rjhala/koenig.pdf

  25. Hi Will- A C compiler is definitely permitted to delete a terminating loop (or other sequence of dummy instructions) that is free of side effects. Note that the definition of side effects here is from the C standard and does not include timing or power.

  26. I can understand the argument for removing the loops because they have no side effects, but I’m curious why all the compilers assume fermat() returns 1 instead of 0.

  27. Hi Roy- I have no idea why these compilers return “1”. It makes no sense, the behavior is absurd and should be fixed.

  28. It’s very easy to see why they return 1. “return 0;” is discarded as unreachable code, and after that “return 1;” is the only way left to exit the function. Since optimizing compilers assume the function will terminate, it _must_ do so through “return 1;”.

  29. Academics are usually disconnected from reality, and this article is a fine example of this. Computers exist to manipulate real world objects and not to prove theorems. It’s much more important for a compiler to do compile-time memoization than to try to prove that a piece of code will never terminate.

    The real problem here is not the elimination of the redundant loop, but the lack of any meaningful infinite loop instruction in the C programming language, which is supposed to be used in bare metal systems. C programmers have to simulate infinite loops and timed loops using sequences of instructions that compilers may remove.

    The solution would be to augment C with infinite loop instructions. For example:

    loop; //infinite loop
    loop for 1000; //timed loop for 1000 milliseconds

    In the end, it’s a matter of language specifications, not of correctness. The correct thing to do in a compiler is to replace any piece of code that produces a constant expression with the result of that expression. Performance is incredibly important, even in this day, age, economic and environmental situation.

  30. Veky– Right, but what I mean is, it’s hard to understand why the compiler designers would permit their tool to execute an infeasible branch in the code, particularly when this has observable side effects. Generally compilers do not do this.

  31. Achilleas– You have misunderstood. The compiler has improperly changed the observable behavior of a strictly conforming C program. Both of these terms are defined in the C standard.

  32. The compiler generating such code for this program that terminates is clearly wrong. (Knowing that your “int”s have 31 value bits.)

    The fermat() function has two exit points. The “return 0” statement cannot be reached, indeed. The compiler infers that the “return 1” statement must be the one that leaves the function, and since it determines fermat() not to have any side effects, it simply eliminates the loop. This *directly* violates the semantics of the “if” selection statement, C99 6.8.4.1 p2: “[…] the first substatement is executed if the expression compares unequal to 0 […]”

    The code is completely well-defined, thus the compiler has no excuse for this overzealous optimization. The standard doesn’t say anything about the selection statement *having* to evaluate to nonzero, and the compiler errs when it still assumes so.

  33. @Veky, I still fail to see why “return 0;” is considered unreachable. For that to be true, the loop has to either
    a) return 1 from within the loop, which is a circular argument for why fermat() returns 1, or
    b) run infinitely, which contradicts what we see.

  34. Ah, I see why “return 0;” is unreachable now, because those two cases are the only possible cases.

  35. This is indeed an infinite loop, and the compilers see that. In fact the code does nothing but enumerate the variables.
    Try writing to /dev/null and it will do the trick.
    In case of a run without an operating system, you can definitely try to access some libs for checking the time and hang until T+x.
    In any case: have you ever heard of the sleep function?

  36. I know you mathematicians are into mathematical proofs, which are beyond my expertise or interest, but of course this is a dimensional problem, and you cannot express something with fewer variables than dimensions. That is my proof. Let you mathematicians waste more centuries on this.

  37. I tried these with the 64-bit Microsoft compiler in VS2008 and the programs above all seem to loop appropriately. Which test did you have it failing?

  38. This is a fascinating conversation.
    However, if I put in a math operation, I want it to be executed. (Perhaps this is a program for seeing how much power the math chip consumes when fully loaded?) I shouldn’t have to know how/when/why a compiler might remove my perfectly valid code. If it wants to optimize the MATH, great, but it shouldn’t optimize away my intent. For example, I’m debugging a while loop (thus my research into this topic and coming across this VERY HELPFUL discussion) that increments an integer by reference. Even though the while-condition involved the integer (by ref), the compiler still optimized away the while (but not a single iteration of its contents), so it behaves like an ‘if’, not a ‘while’.
    This is simply one of those issues that makes software “hard”, and thus makes S/W engineering expensive. Why does it have to be so damn hard?

    Perhaps an optimization level of ‘4’ could be used to remove ridiculous-infinite-loops-with-no-consequences, then those that care about it can use that. (But then why are they using such silly constructs to require optimizing them?)

    Cheers