If you spend much time testing compilers, you’ll run into some strange phenomena even in apparently simple areas like computer arithmetic. For example, John Cook wrote a post today explaining why IEEE floats have two different values of zero. Integer arithmetic is generally a lot simpler than floating point math, but it still contains a few surprises. Today we’ll ask the question: What is the value of the C99 expression (INT_MIN % -1)? Or to put it in English, what is the remainder when the smallest integer value is divided by negative one?

Mathematically speaking, it should be obvious that the result is 0 because any integer divided by -1 or 1 leaves no remainder. Looking at section 6.5.5 p5 of the C99 standard (I’m using N1124) we find:

The result of the / operator is the quotient from the division of the first operand by the second; the result of the % operator is the remainder. In both operations, if the value of the second operand is zero, the behavior is undefined.

There are no surprises here. Perhaps the only interesting thing is that (in keeping with C’s design) dividing by zero results in undefined behavior instead of causing a math exception to fire.

Next let’s look at the behavior of some C implementations. Given this source code:

int my_mod (int x, int y) { return x % y; }

the current version of GCC for x86 gives this assembly code:

my_mod:
    movl 4(%esp), %eax
    movl %eax, %edx
    sarl $31, %edx
    idivl 8(%esp)
    movl %edx, %eax
    ret

The other x86 compilers that I tried (Intel, Sun, more versions of GCC, LLVM, Microsoft VC) generate more or less the same code. At first glance it looks good: the x86 idiv instruction leaves the quotient in EAX and the remainder in EDX. The sar instruction is a right-shift that implements a sign-extend.

But there’s a small problem: when we compute INT_MIN % -1, the quotient is -INT_MIN which is not representable in a standard 2’s complement integer (recall that INT_MIN is commonly defined as -INT_MAX-1). In this case the idiv instruction does not compute the result but rather raises a divide-error exception (#DE), the same one it raises for division by zero. Thus, despite the fact that the remainder is representable in an integer, and despite the fact that the C standard says what to do here, we’ll end up with a processor exception that probably kills the program.
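To make the representability problem concrete, here is a minimal sketch that computes the quotient in a wider type and checks whether it fits back into an int. The helper name quotient_fits_in_int is hypothetical, and the sketch assumes that long long is wider than int, which holds on common platforms but is not guaranteed by the standard.

```c
#include <assert.h>
#include <limits.h>

/* Hypothetical helper: returns 1 if the mathematical quotient x / y
   fits in an int, 0 otherwise.
   Assumption: long long is strictly wider than int, so the widened
   division below cannot itself overflow. */
static int quotient_fits_in_int(int x, int y)
{
    assert(y != 0);                       /* division by zero is undefined */
    long long q = (long long)x / (long long)y;
    return q >= INT_MIN && q <= INT_MAX;
}
```

Under that assumption, quotient_fits_in_int(INT_MIN, -1) reports 0, because the widened quotient is INT_MAX + 1, while every other pair with a nonzero divisor reports 1.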

If computing (INT_MIN % -1) reliably resulted in a processor exception, it wouldn’t be that bad. However, the behavior is not reliable. For example, when GCC can show that the arguments to the expression are constant, it folds the expression into the expected zero result. In effect, this is a new kind of undefined behavior (LLVM explicitly gives undefined semantics to this case).

The real problem here is that the C standard has specified a semantics for the % operator that doesn’t have a fast implementation on common instruction set architectures. C people really hate it when their language does anything slowly. So this is basically a flaw in the standard that should be fixed — I’ve heard that C1X will do exactly that.
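The cost in question is roughly one extra comparison and branch per remainder operation. As a rough sketch of what an implementation (or a cautious programmer) would have to do to get the "expected" zero without letting the hardware division trap, consider the wrapper below. The name checked_mod is hypothetical, and this is not what any of the compilers mentioned above actually emit.

```c
#include <assert.h>
#include <limits.h>

/* Hypothetical wrapper: remainder that yields 0 for INT_MIN % -1
   instead of executing a division whose quotient overflows. */
static int checked_mod(int x, int y)
{
    assert(y != 0);   /* division by zero remains undefined in C */
    if (y == -1)
        return 0;     /* every int is an exact multiple of -1 */
    return x % y;     /* quotient is representable for all other y */
}
```

Special-casing y == -1 rather than the single pair (INT_MIN, -1) keeps the test to one comparison and gives the same answer, since x % -1 is 0 for every x.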

I should point out that not everyone agrees that the C standard requires INT_MIN % -1 to evaluate to 0. Section 6.5.5 p6 says:

When integers are divided, the result of the / operator is the algebraic quotient with any fractional part discarded. If the quotient a/b is representable, the expression (a/b)*b + a%b shall equal a.

Some people read this as saying “if / has an undefined result, then % does also.” The more common reading is “if the quotient is representable, this equality holds, but the definition of % is unaffected.”
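The identity from 6.5.5 p6 can be checked directly for any pair whose quotient is representable. The sketch below uses a hypothetical helper, identity_holds, and deliberately refuses the one pair at issue, since evaluating a/b there is exactly the case under dispute.

```c
#include <assert.h>
#include <limits.h>

/* Hypothetical helper: returns 1 when (a/b)*b + a%b == a.
   Preconditions: b != 0 and the quotient a/b is representable,
   which excludes only a == INT_MIN with b == -1. */
static int identity_holds(int a, int b)
{
    assert(b != 0);
    assert(!(a == INT_MIN && b == -1));
    return (a / b) * b + a % b == a;
}
```

Because C99 truncates the quotient toward zero, the remainder takes the sign of the dividend: -7 / 3 is -2 and -7 % 3 is -1, and (-2)*3 + (-1) gives back -7.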

C1x Committee Draft now says:

6.5.5p6: “When integers are divided, the result of the / operator is the algebraic quotient with any fractional part discarded. If the quotient a/b is representable, the expression (a/b)*b + a%b shall equal a; otherwise, the behavior of both a/b and a%b is undefined.”

So INT_MIN % -1 is now clearly undefined behaviour.

Hi Olivier- I’d heard about this, but hadn’t looked up the language. This is definitely an improvement!