Find the Integer Bug

Not all of the functions below are correct. The first person to leave a comment containing a minimal fix to a legitimate bug will get a small prize. I’m posting this not because the bug is particularly subtle or interesting but rather because I wrote this code for a piece about integer overflow and thought — wrongly, as usual — that I could get this stuff right the first time.

By “legitimate bug” I mean a bug that would cause the function to execute undefined behavior or return an incorrect result using GCC or Clang on Linux on x86 or x64 — I’m not interested in unexpected implementation-defined integer truncation behaviors, patches for failures under K&R C on a VAX-11, or style problems. For convenience, here are GCC’s implementation-defined behaviors for integers. I don’t know that Clang has a section in the manual about this but in general we expect its integers to behave like GCC’s.

#include <limits.h>
#include <stdint.h>

/*
 * specification:
 *
 * perform 32-bit two's complement addition on arguments a and b
 * store the result into *rp
 * return 1 if overflow occurred, 0 otherwise
 */

int32_t checked_add_1(int32_t a, int32_t b, int32_t *rp) {
  int64_t lr = (int64_t)a + (int64_t)b;
  *rp = lr;
  return lr > INT32_MAX || lr < INT32_MIN;
}

int32_t checked_add_2(int32_t a, int32_t b, int32_t *rp) {
  int32_t r = (uint32_t)a + (uint32_t)b;
  *rp = r;
  return 
    (a < 0 && b < 0 && r > 0) ||
    (a > 0 && b > 0 && r < 0);
}

int32_t checked_add_3(int32_t a, int32_t b, int32_t *rp) {
  if (b > 0 && a > INT32_MAX - b) {
    *rp =  (a + INT32_MIN) + (b + INT32_MIN);
    return 1;
  }
  if (b < 0 && a < INT32_MIN - b) {
    *rp =  (a - INT32_MIN) + (b - INT32_MIN);
    return 1;
  }
  *rp = a + b;
  return 0;
}

UPDATE: The prize has been awarded, see comment 5. The prize is an Amazon gift certificate for $27.49 — the cost of the kindle edition of Hacker’s Delight.

April 20, 2014

regehr

Computer Science, Software Correctness

19 responses to “Find the Integer Bug”

Chris Lattner says:

April 20, 2014 at 9:53 pm

These lines can overflow in checked_add_3:
*rp = (a + INT32_MIN) + (b + INT32_MIN);
*rp = (a – INT32_MIN) + (b – INT32_MIN);

The addition and subtraction on a are not safe. As far as a fix goes, use checked_add_1 or 2. 🙂
regehr says:

April 20, 2014 at 10:04 pm

Hi Chris, can you give inputs that cause an overflow in checked_add_3?
Ryan Fox says:

April 20, 2014 at 10:19 pm

checked_add_2 seems suspicious. The intermediate value of ((uint32_t)a + (uint32_t)b) will never be negative, however, it might go outside of the range of the int32_t it’s being assigned to if a or b are sufficiently large or negative, which is undefined behaviour, right?

I’m not sure of how you’d fix it though, short of just scrapping that function.
regehr says:

April 20, 2014 at 10:23 pm

Hi Ryan, I don’t want to give things away but the cast from unsigned to signed is not a problem due to the guarantees provided in the linked GCC documentation: “For conversion to a type of width N, the value is reduced modulo 2^N to be within range of the type; no signal is raised.”
Alex says:

April 20, 2014 at 10:39 pm

checked_add_2 seems to return an incorrect result if both a and b are INT_MIN, as in that case the unsigned addition overflows into 0, but the overflow check fails, as it only checks for r > 0 or r < 0.
regehr says:

April 20, 2014 at 10:42 pm

Alex — you win unless Chris is correct that checked_add_3 is wrong (I thought it was correct). We will wait for him to supply inputs that trigger an overflow.
Andrew Cobb says:

April 20, 2014 at 10:50 pm

I see a problem with checked_add_2 when you add INT32_MIN and INT32_MIN

a fix would be to use

(a < 0 && b = 0) ||
(a > 0 && b > 0 && r <= 0);

but I'm not quite sure yet if changing both is necessary or sufficient.
Ryan Fox says:

April 20, 2014 at 10:52 pm

That’s a little confusing… Does int32_t count as width 32? ((INT32_MAX+1) % 2^32) isn’t going to do anything. Maybe I’m not sure what “reduced module 2^N” means exactly…

((INT32_MAX + INT32_MAX) % whatever) on uint32_t values is going to result in a positive number. (a > 0 && b > 0 && r < 0) will always be false.
Andrew Cobb says:

April 20, 2014 at 10:54 pm

Looks like I’m late to the party and the HTML sanitizer took out some of the code.

let’s try again anyway:

(a < 0 && b < 0 && r >= 0) ||
(a > 0 && b > 0 && r <= 0);
regehr says:

April 20, 2014 at 10:59 pm

Andrew, yes, that is more or less the fix that I used.
Chris Lattner says:

April 20, 2014 at 11:13 pm

Looks like I confused myself (something easily done :-), please disregard my comment above.
regehr says:

April 20, 2014 at 11:20 pm

Hi Chris, that function was in there as a red herring, it would be sort of crazy to use it in practice, I wrote it just to see what it took to avoid both width extension and unsigned operations.
regehr says:

April 20, 2014 at 11:21 pm

I should add that I only spotted the bug in checked_add_2() when a test case failed, ugh.
regehr says:

April 21, 2014 at 12:12 am

Ryan, perhaps the GCC text would be better phrased as “conversions from a larger to smaller signed int are performed by lopping off the unneeded high-order bits” and then “conversions from unsigned to signed of the same size will change the interpretation but not the bits.”
Zeev Tarantov says:

April 21, 2014 at 1:36 am

Are you aware of this? http://safeint.codeplex.com/
Do you recommend it?
Zeev Tarantov says:

April 21, 2014 at 1:37 am

Oh, you’re mentioned. So, do you recommend using it?
regehr says:

April 21, 2014 at 7:21 am

Hi Zeev, I do recommend SafeInt, although I believe that you have to be willing to live with C++ exceptions if you use it.
Roberto Bagnara says:

April 21, 2014 at 9:49 am

With the help of the ECLAIR system I found the same counterexample to correctness found by Andrew and that that is the only problematic case. Moreover, the system quickly proves that it is not necessary to change the second comparison of `r’ with 0: a strict less than is OK. Finally, the code does not have any undefined behavior: only the implementation-defined behavior John said he is not interested in:

/tmp/regehr.c:14.9: B.TGEN:
(negative) signed conversion overflow (C89 i.d.b.)
when entering in function with: a = -1; b = -2147483648; (rp = any);
/tmp/regehr.c:14.9: B.TGEN:
(positive) signed conversion overflow (C89 i.d.b.)
when entering in function with: a = 2147483647; b = 2147483647; (rp = any);
/tmp/regehr.c:19.15: B.TGEN:
(positive) signed conversion overflow (C89 i.d.b.)
when entering in function with: a = 0; b = -1; (rp = any);
regehr says:

April 21, 2014 at 9:59 am

Roberto, that is great!

I don’t know much about ECLAIR, but it looks very proprietary. Are there some good resources about how it works? Is there any provision to give it away to researchers?