How Integers Should Work (In Systems Programming Languages)

My last post outlined some of the possibilities for integer semantics in programming languages, and asked which option was correct. This post contains my answers. Just to be clear: I want practical solutions, but I’m not very interested by historical issues or backwards compatibility with any existing language, and particularly not with C and C++.

We’ll start with:

Premise 1: Operations on default integer types return the mathematically correct result or else trap.

This is the same premise that as-if infinitely ranged begins with, but we’ll perhaps end up with some different consequences. The things I like about this premise are:

  • It appears to be consistent with the principle of least astonishment.
  • Since it is somewhat harsh — ruling out many semantics such as 2’s complement wraparound, undefined overflow, saturation, etc. — it gives a manageably constrained design space.
  • I just don’t like two’s complement wraparound. It’s never what I want.

Of course there are specialized domains like DSP that want saturation and crypto codes that want wraparound; there’s nothing wrong with having special datatypes to support those semantics. However, we want the type system to make it difficult for those specialized integers to flow into allocation functions, array indices, etc. That sort of thing is the root of a lot of serious bugs in C/C++ applications.

Despite the constraints of premise 1, we have considerable design freedom in:

  • When to trap vs. giving a correct result?
  • Do traps occur early (as soon as an overflow happens) or late (when an overflowed value is used inappropriately)?

For high-level languages, and scripting languages in particular, traps should only occur due to resource exhaustion or when an operation has no sensible result, such as divide by zero. In other words, these language should provide arbitrary precision integers by default.

The tricky case, then, is systems languages — those used for implementing operating systems, virtual machines, embedded systems, etc. It is in these systems where overflows are potentially the most harmful, and also they cannot be finessed by punting to a bignum library. This leads to:

Premise 2: The default integer types in systems languages are fixed size.

In other words, overflow traps will occur. These will raise exceptions that terminate the program if uncaught. The rationale for fixed-size integers is that allocating more storage on demand has unacceptable consequences for time and memory predictability. Additionally, the possibility that allocation may happen at nearly arbitrary program points is extraordinarily inconvenient when implementing the lowest levels of a system — interrupt handlers, locks, schedulers, and similar.

Side note: The Go language provides arbitrary precision math at compile time. This is a good thing.

Overflows in intermediate results are not fun to deal with. For example, check out the first few comments on a PHP bug report I filed a while ago. The code under discussion is:

result = result * 10 + cursor - '0';

It occurs in an atoi() kind of function. This function — like many atoi functions (if you’ve written one in C/C++, please go run it under IOC) — executes an integer overflow while parsing 2147483647. The overflow is a bit subtle; the programmer seems to have wanted to write this code:

result * 10 + (cursor - '0')

However, the code actually evaluates as:

(result * 10 + cursor) - '0'

and this overflows.

How can we design a language where programmers don’t need to worry about transient overflows? It’s easy:

Premise 3: Expression evaluation always returns the mathematically correct result. Overflow can only occur when the result of an expression is stored into a fixed-size integer type.

This is more or less what Mattias Engdegí¥rd said in a comment to the previous post. Clearly, eliminating transient overflows is desirable, but how costly is it? I believe that in most cases, mathematical correctness for intermediate results is free. In a minority of cases math will need to be performed at a wider bitwidth, for example promoting 32-bit operands to 64 bits before performing operations. Are there pathological cases that require arbitrary precision? This depends on what operators are available inside the expression evaluation context. If the language supports a builtin factorial operator and we evaluate y = x!; there’s no problem — once the factorial result is larger than can fit into y, we can stop computing and trap. On the other hand, evaluating b = !!(x! % 1000000); is a problem — there’s no obvious shortcut in the general case (unless the optimizer is especially smart). We can sidestep this kind of problem by making factorial into a library function that returns a fixed-width integer.

A consequence of premise 3 that I like is that it reduces the number of program points where math exceptions may be thrown. This makes it easier to debug complex expressions and also gives the optimizer more freedom to rearrange code.

The next question is, should overflows trap early or late? Early trapping — where an exception is thrown as soon as an overflow occurs — is simple and predictable; it may be the right answer for a new language. But let’s briefly explore the design of a late-trapping overflow system for integer math. Late trapping makes integers somewhat analogous to pointers in C/C++/Java where there’s an explicit null value.

We require a new signed integer representation that has a NaN (not-a-number) value. NaN is contagious in the sense that any math operation which inputs a NaN produces NaN as its result. Integer NaNs can originate explicitly (e.g., x = iNaN;) and also implicitly due to uninitialized data or when an operation overflows or otherwise incurs a value loss (e.g., a narrowing type cast).

Integer NaN values propagate silently unless they reach certain program operations (array indexing and pointer arithmetic, mainly) or library functions (especially resource allocation). At that point an exception is thrown.

The new integer representation is identical to two’s complement except that the bit pattern formerly used to represent INT_MIN now represents NaN. This choice has the side effect of eliminating the inconvenient asymmetry where the two’s complement INT_MIN is not equal to -INT_MAX. Processor architectures will have to add a family of integer math instructions that properly propagate NaN values and also a collection of addressing modes that trap when an address register contains a NaN. These should be simple ALU hacks.

A few examples:

int x; // x gets iNaN 
char c = 1000; // c gets iNaN 
cout << iNaN; // prints "iNaN" 
p = &x + iNaN; // throws exception 
x = a[iNaN]; // throws exception

Good aspects of this design are that an explicit “uninitialized” value for integers would be useful (see here for example), execution would be efficient given hardware support, programs would tolerate certain kinds of benign overflows, and the stupid -INT_MAX thing has always bothered me. It’s also possible that a more relaxed overflow system leads to some nice programming idioms. For example, iterating over the full range is awkward in C. But now that we have a sentinel value we can write:

for (x = INT_MIN; x != iNaN; x++) { use (x); }

The main disadvantage of late trapping is that it would sometimes mask errors and make debugging more difficult in the same way that null pointers do. Type system support for “valid integers”, analogous to non-null pointers, would mitigate these problems. On the other hand, perhaps we’d have been better off with early trapping.

Finally, we could design an integer system using the iNaN integer representation and also early trapping. In this system, iNaN would exclusively be used to represent uninitialized or garbage values. Any operation on iNaN other than simple copying would result in a trap.


40 responses to “How Integers Should Work (In Systems Programming Languages)”

  1. > Premise 3: Expression evaluation always returns the mathematically correct result. Overflow can only occur when the result of an expression is stored into a fixed-size integer type.

    I don’t like this. I’d expect { int x=a+b; int y=x-c; } to be equivalent to { int y = (a+b)-c; }. If it’s not equivalent, you’d better have a good explanation why. For example, you can say that (a+b) has a type of big integer. This can work, but then you have an implicit cast from big integer to int, which is not necessarily what you always want.

    The problem of the compiler optimizing the big integers out is a big one as well. The factorial example demonstrates the problem perfectly. However, you don’t have to have a primitive factorial function to have this problem. You can just write a large expression like ((a*b*c*d*e*f*g) % z) and have the same problem.

  2. Hi Arseniy, I don’t find expressions like ((a*b*c*d*e*f*g) % z) to be a problem at all — on reasonable architectures this can be evaluated without even spilling out of the register file. The inequivalence of x=a+b; y+x-c; and y=a+b-c; is because in the first case, you have inserted an explicit truncation to x’s type.

    Anyway, I expect opinions to vary — this is why we have so many programming languages :).

  3. So this is basically a brain-dead reimplementation of floating point? The treatment of NaN seems especially short-sighted.

  4. I’m not sure I like allowing invalid intermediate results. It means that there are many expressions whose correctness is dependent on a compiler, and more specifically a compilation target. It’s not hard to come up with an example of an expression that is valid, but only calculable on specific architectures.

    If none of the examples that Arseniy came up with satisfy you, think of something absurd like (x <> 19999. As it stands, no computer architecture I know of could calculate the proper result. But with a clever compiler you can derive an equivalent program that will execute correctly. If you have the clever compiler and allow invalid intermediate results, your program will execute correctly. If you have the same constraints and a slightly dumber compiler, then you will trap.

    There are probably examples that are flat out impossible to execute without trapping on particular architectures, but fine on others. As a result, architecture and compiler details will start bubbling up and affecting program correctness. The ways in which this will happen will be far more perfidious (if less common) than the current architecture details program writers often must consider such as word length.

  5. Nice post… but isn’t the correct fix that we all progressively move to higher level languages where these things have been fixed already?

    Isn’t it what is happening? How many people should really be coding in C in 2011?

    BTW the primary use of wrap-around, at least in my work, is that mod 2^L is fast (as opposed to generic modulos which are deadly slow). There is good reason to believe that you can have both the correct behavior (as you describe) coupled with fast modulos on powers of two…. after all, you are not really proposing that we give up on the binary representation in hardware, right?

  6. Interesting exploration. Did you intentionally make iNaN compare unlike floating point NaN, though?

    for (x = INT_MIN; x != iNaN; x++)

    In IEEE floating point, NaN is not equal to NaN. If you wanted the same behavior, you would write the loop like:

    for (x = INT_MIN; x == x; x++)

    or more likely, use an isINaN macro.

  7. Daniel,

    These semantics in low(er) level languages would make the implementations of those high(er) level languages faster and less prone to bugs. Not only is this behaviour hard to get right, but every language implemented written on top of C has to undo a few of C’s warts – they can’t just translate addition to addition, they have to translate it to addition with a few carefully written checks.

    As for twos complement, John wrote above “The new integer representation is identical to two’s complement except…” — that is, wrap-around doesn’t work, but modulo operations are unaffected.

  8. Are there any performance advantages to not trapping early? Do things become slower because each arithmetic operation would need a overflow check afterwards? Or once one allows for ALU hacks, do both early trapping and late trapping have the same performance implications.

  9. Alex, I hadn’t thought of the “shift left by a lot, then shift right by a lot, then test” example.

    How about this, then: Intermediate results are done at twice the bitwidth of any variable seen in the expression. This would stop a lot of overflows from happening, but hopefully be free of pathologies caused by factorials and massive shifts.

    #$%@ing WordPress…

  10. Daniel, the low-level vs. high-level question is tricky. Of course I agree that a lot of current C/C++ codes should have been written in something else. But on the other hand, in embedded systems there are plenty of people who still think C is too high level, and they have a point — they’re coding for small, special-purpose microcontrollers. And I don’t think anyone has a great alternative to C/C++ for operating systems, hypervisors, virtual machines, etc.

    Summary: low-level languages are still important, but looking forward, not for the vast majority of programmers.

  11. jmah, I didn’t put that much thought into it, but I was assuming that the programming language would recognize a comparison against iNaN and compile it into something appropriate. Of course this doesn’t really matter for the proposed system where integer NaN has a unique representation.

  12. pdles, interesting questions. The AIR paper claims a 6% slowdown on CPU-bound code for their early trapping implementation in GCC. This is quite a bit better than IOC, our early trapping implementation in LLVM. However, they cheat a little by inserting checks after all GCC optimization passes have run — those passes can eliminate overflows that were present at the source level.

    I’m not a hardware person but I believe that hardware support drives the overhead of early and late trapping to near zero. On the other hand, early trapping HW support makes a lot of previously non-trapping instructions into trapping instructions and so it’s a more invasive change to an ISA.

  13. Daniel, I think fast modulos are great and we want to keep them. We just need the modulo operators to be explicit in programs as opposed to hidden (and perhaps accidental).

  14. I would suggest that it seems more consistent to link the late trapping of integers and pointers, by making the result of adding or subtracting an iNAN from a pointer always give NULL, so that:

    p = &x + iNaN; /* p == NULL */
    y = *p; /* Traps */

    This would then mean that you could emit the same code for pointer + integer as you do for integer + integer (by defining the iNAN representation to be a NULL in pointer context). This means that you don’t need a separate trap for use of NANs – all the work is carried by the trap for use of NULL, which is already done.

  15. It would be valuable to also have a “constrained” integer type.

    A type that demands you statically prove overflows cannot occur, ever. And then, of course, checks that proof for correctness.

    That type would be a real “integer of least astonishment”.

    In java two things combine to make java’s “int” astonishing :
    1) try {} catch(Exception) {}
    2) have an overflow somewhere unexpected (or nullpointer, or …)

    Sure we could bring systems programming to the level of java. That would bring us closer to the state of the art in programming languages, in the sense that it would bring systems programming to the 1990’s.

    But it’s getting close to 2012 now …

    (another thing systems programming sorely needs is closures)

  16. “This choice has the side effect of eliminating the inconvenient asymmetry where the two’s complement INT_MIN is not equal to -INT_MAX.”

    Nevermind inconvenient, it’s just nonsensical, since there’s no way to represent -INT_MIN:
    abs(INT_MIN) < 0 !
    What the heck kind of system allows abs() to return a negative?

  17. Proposition 3 seems to me unnecessarily complex, if you really wants to have the correct result with expressions with “big” numbers just use “big ints”: they should be available in the language, but they should just not be the default in “system language”.

  18. A counterpoint: use unsigned types in C. They can overflow, but the net result is almost always what you want (e.g. a = b + c – d will work unless b + c – d > UINT_MAX mathematically) and the behavior is actually defined.

    If you’re willing to write platform- and compiler-specific code, you can even write things like size_t offset = (char *) a – (char *) b and have a + offset == b: unsigned overflow allows you to have “negative” offset. (Maybe.)

  19. I thought about it again and the more I think about it the more it seems that Proposition 3 is contradictory with Propostion 2: you pick fixed sized integers as a default due to real-time/memory allocation constraints, but wouldn’t Proposition 3 reintroduce the need to use “big ints” to evaluate intermediary results?

  20. I absolutely loved the nod to DSP and Crypto. A lot of software people don’t recognize *why* operations behave the way they do! From someone who does embedded software, here’s my take on integer operations.

    For an embedded system, I hate using “int” and “short.” How many bits does an Int represent? On one platform it might be 8-bits. On another, 16, or 32. stdint.h in my libc allows me to use uint16_t. This is an unsigned integer that is 16 bits in width.

    Should the integer overflow, there’s a bit in a status register that sets. If I want, I can check that to make sure my calculations returned ok. I believe that most architectures do this…

    What would be nice is an additional C keyword that would allow me to define the variable’s behavior in overflow, saturation or wrap around. I suppose this would necessitate the ALU to change, though I wonder how much it would get used.

    I’m torn with Expression Evaluation. I tend to throw parens on everything so I get the order I want.

  21. caf, I agree.

    oelewapperke, definitely. Making systems code more amenable to formal methods is one of the main motivations for all this.

    reno, my idea was that realistic code would not require arbitrary precision at runtime. I think that would be the case >99% of the time. But if you still don’t like it, how about if intermediate results were guaranteed to behave “as if” they were computed at 128 or 256 bits of precision?

  22. This would prevent SIMD optimisation of sums, for the same reason that saturating arithmetic does: You have to compute your sum in a loop, so each accumulation is in its own statement, triggering the overflow trap. For SIMD optimisation, we would like to accumulate into N subtotals in a vector, then sum the contents of the vector. But the compiler can no longer so that, because it has to assume you want all the possible intermediate traps to be visible.

    So you would slow down by a factor of 4 or 8 in that case (depending on how wide your machine is, and the size of integer).

  23. I consider the overflow behavior to be a property of the operation, not of the datatype. It’s conceivable that one program could want to perform wrapping addition in one place and trapping addition in another; making the specification of the operation independent of the specification of the type. In that vein, I’d prefer to have all the variants of the operations to be available in one language and be denoted with different symbols; the programmer can be specific about exactly what semantics he/she wants at each point in the program, and if they change their mind, they needn’t insert type coercions or rewrite the type signatures of the whole program, but simply change the operation at the usage site.

    I disagree with John about having larger precision for intermediate results. This is exactly the trouble that Java’s original floating point semantics invited, since on Intel the compilers made use of faster instructions which had more internal precision, giving slightly different results–the “strictfp” keyword was born, which dictated that within a particular method, all floating point operations must obey the “stricter” semantics. I don’t think you want to repeat that fiasco, since what you are proposing has the unfortunate consequence that simply rewriting “a + b + c” to “x = a + b; x + c” could give different results. This also limits compiler optimizations since “=” becomes a sequencing operator, preventing the reordering of code and introducing a runtime check. In a modern SSA-based compiler the two expressions would give rise to identical intermediate representations and therefore identical machine code.

  24. I consider integer overflow programming bugs to be, in part, derived from calling the type “integer” or “int”. This implies a whole world of semantics to the programmer’s mind, starting with the idea that it represents an element from an infinite set. Perhaps if these types were instead named mod256, mod65536 or mod4294967296, programmers wouldn’t assume that they are just integers that can have arbitrary arithmetic operations performed upon them.

    But that will never happen, and your scheme is better than what we currently have, so I will endorse your scheme instead. I particularly like part one, trap if you can’t perform the arithmetic. Whoever thought that silently ignoring the carry bit was the right course was a .. person that I don’t bear any particular fondness for…

  25. Hi Phil, I too hate the way that the HW carry bit — computed for free on every add or subtract — is not available in C/C++. Charmingly, the LLVM IR does expose these bits.

    Alex, I’m not sure that very many vectorization opportunities would be squashed by my integer plan. But in any case, I put correctness first and performance second.

  26. Ben, I agree about putting more semantics into the operations and less into the types. Volatile in C/C++ is a pretty great example of a failure to apply this design principle.

    I don’t think the x86 80-bit FP example is useful here. The extra precision was never the problem. The problem was that only one architecture had it.

    I don’t share your distaste for making assignment a synonym for potential overflow traps. My premise is that we need a way to help programmers avoid nasty reasoning about transient overflows, and I haven’t heard a lot of better solutions. I’m comfortable making the compiler people work a bit harder.

  27. > My premise is that we need a way to help programmers avoid nasty reasoning about transient overflows

    Is-it really an issue? ‘big int’ are still here for such programmers .

    I’m not against preventing transient overflows but:
    – if we allow ‘big ints’ in the transient computations, this reintroduce the real time/allocation issues.
    – if we use 128bit, this gives to the programmer a ‘weird’ story: “transient overflow will not happen most of the time, but they can still happen.”

    So I think that the story:
    1) the compiler won’t reorganize the operations unless it can prove that there is no overflow.
    2) any operation which overflow will trap
    is much more simple to understand, even if it gives more work to the programmer.

  28. The Problem with Integers definitely deserves closer attention. There is a lot to learn from Ada’s integer handling (Flight 501 nonwithstanding); from what I remember, it allows the implementation to return the right value instead of trapping in the cases when an intermediate value overflows but the final value can be represented. This sounds helpful, but its value is dubious – without a guarantee the programmer cannot rely on anything.

    There is also Habit, with an interesting take on things. Their divide-by-zero handling (statically guaranteed not to happen by the type system) is nice, although it is a comparatively minor issue. They do have type-level numbers, which may be more significant.

  29. Hi reno, I don’t think trapping when a result gets beyond 128 bits is any weirder than a different arbitrary cutoff. It is only the craziest of expressions that would trap in this case, I think.

    I don’t like your plan because it makes examples like the PHP code crash when it didn’t really need to. I could easily find more examples like this. It’s reasonable (in a systems language) to place a burden on the programmer, but the language should take the hard jobs whenever possible.

  30. Mattias, thanks for the Habit reference. I don’t share Mark’s “all solutions look pretty much like Haskell” point of view, but it’s certainly a good experiment to see how far that kind of thing can be pushed

  31. An off-topic comment regarding CSmith – a follow-up on the conversation about struct copying, and my request to not generate code relying on it: we mainly don’t support it so that people don’t end up using it 🙂

    Basically it’s an accelerator target, so we’re obsessed with speed in the tools; and frequently people who write code for it are fresh out of school. They can easily pass structs by value instead of by pointer for no good reason, and it’s just a pointless slowdown; in fact, it’s the only place in C that I can think of where you can implicitly spend lots of cycles without noticing.

    Target limitations aren’t a problem, except for the very limited instruction memory size, however that can be lifted in a simulation environment.

  32. John,

    For the record I’d like to state that I’m all for having sane semantics.

    My interpretation of what Ben said about assignment being a sequencing operator and the consequences thereof appears different from yours though: having explicit sequencing in the semantics is not just about telling the compiler people to work harder, it will by necessity limit the kind of optimizations a compiler can perform.

    How about weakening the semantic guarantees into a set of possible trapping behaviors, and the result of a trapping program is any element from this set?

  33. It’s more than “=” just becoming a sequencing operator and thereby inhibiting code motion; linguistically it changes the semantics of local variables, making them into finite storage locations instead of simply bindings to evaluated expressions. It breaks the substitutability of local variables and many innocent looking program transformations would either become illegal or subtly alter program semantics. Given how cavalier the C crowd is with optimization changing program semantics, I don’t think this is a road we want to go down.

  34. Ben if you had a reasonable counterproposal then fine, but you’re just arguing for the status quo, which sucks. I’ve looked at hundreds of integer overflow bugs in C/C++; even good programmers have a hard time avoiding them. A better semantics is needed. Trapping on every overflow is an option but produces lots of “fussy overflows” and I’m saying how to avoid those.

  35. No disagreement that status quo, at least for C, sucks. However, integer overflows aren’t nearly the problem in Java/C# as they are in C. Partly because integers aren’t used to index raw memory datastructures (only arrays), they have well-defined, platform-independent semantics, and there are built-in 64-bit types for larger situations.

    But that still leaves systems programming without a good solution. I’ve had a couple shots at this, first with a DSL I designed for describing instruction sets circa 2005. That language had signed and unsigned integers of arbitrary width (up to 64), denoted -int.K and +int.K, respectively. Addition and subtraction operations introduced an extra carry bit. Multiplication resulted in a result that was at least as wide as the sum of the input operands. Bit accesses had a nice syntax and made it easier to check the high-order bit for overflow. There were no traps. Overall, this little language had some nice features but always found the extra bits annoying. I took another shot at this in Virgil I, where the language had similar kinds of bit accesses and abitrarily-sized “raw” data types. Raw data types could be 1-64 bits long and had only bitwise operators. Integers, on the other hand, were always 32 bits, signed, and wrapped with 2’s complement semantics. Integers could be implicitly converted to raws, but not vice versa.

    I am working on this problem again for Virgil III. I haven’t solved it yet–too busy with other language problems. But I think the solution lies in the direction of giving programmers three kinds of operators (wrap, saturate, and trap), and not in the direction of extra intermediate precision. Regardless, all operations have to be platform independent and never allocate from the heap. It should also be possible to write programs that use, e.g. explicit 8-bit integer arithmetic and therefore compile to very efficient code on microcontrollers.

  36. Ben, I agree, explicit extra bits sounds annoying, but maybe the syntax can be adjusted so they don’t really get in the way unless you feel like dealing with them? I like the checked operations in LLVM where you say (result, overflow) = checked_add x, y.

    Re. wrap, saturate, and trap — which one would map to the syntactically obvious operators like + and -? Hopefully trap. I don’t have a feel for how these options play out in large codes.

    I’m not sure that 8-bit math matters that much for a new language. Looking forward, there don’t seem to be a lot of compelling reasons to use 8-bit micros in most new designs. Even MSP430 has 16-bit registers. I’d almost certainly populate a new design with Cortex M0-M3. The M0 is like 12,000 gates.

  37. Maybe it’s nostalgia, maybe it’s just because I’d like to solve the general case. If the compiler supports 32- and 16-bit arithmetic, it might as well support 8-bit as well. AVR is still very popular and it’d be a shame to leave it behind.

  38. I realize that I am coming to this conversation a bit late, but I want to point out a few key facts that are missing from the discussing:

    (1) One word that has been missing in the discussion is associativity. Most languages have default associativity rules; without them, different “implementations” (compiler, processor, etc.) produce different results.

    x = a + b + c

    defaults to

    x = (a + b) + c

    by associativity rules. When you have fixed precision (overflow), iNaN, etc., (a + b) + c != a + (b + c)

    As mentioned above, this is why many forms of parallelization (including SIMD) cannot guarantee correctness in the general case.

    Not to belabor the obvious, but associativity rules are necessary. If one compiler defaulted to (a + b) + c and the other defaulted to a + (b + c), we’d have even bigger problems.

    The best that you can do is a speculative form of parallelization. Compute in parallel, but also detect if a sequential (associative) computation would overflow; when this occurs, roll-back and re-execute sequentially.

    Kapre and DeHon described this scheme quite nicely for floating-point addition in the following paper; it’s a hardware paper, so be warned.

    (2) Premise 3 is impossible in the general case, although it’s a nice ideal to aim for. In a real system, you are stuck with a fixed amount of storage (cache, RAM, disk) and fixed-precision arithmetic operators.

    Even if I defined a soft integer library that could grow dynamically and perform operations using a fixed-bitwidth operator, I could still accumulate so much that I would overflow the available storage in my system — an unlikely corner case, yes, but you still need to prove that it would never happen.

    At best, a system programmer today knows the operation and the data type (e.g., add, char) and can derive overflow conditions on the granularity of individual operators. For example, I can determine the conditions at which a + b or a + b + c can overflow.

    Now, the idea of up-casting a 32-bit integer to a 64-bit integer has promise, but it provides an illusion of safety. Now, I know that a + b and a + b + c are both safe, because I won’t saturate until I add more than 2^32 integers at once — and no one is going to do that in a single line of code, I hope!

    But, you still need to certify that every loop that performs any type of accumulation operation adds 2^32 or fewer integers.

    Now, the percentage of code that works correctly definitely increases under the this scheme, but the percentage of code that doesn’t is that much harder to understand. The systems programmer now needs to understand the internal mechanics of the hardware implementation of the ALU operator, which few programmers really want to know about anyway. Thus, even though you are eliminating more errors, the percentage of programmers who actually understand what is going on and why the error occurs goes down even further.

    Even so, this approach does not help with multiplication. If I multiply two n-bit numbers, the result is a 2n-bit number, so half of my output space (so to speak) is an overflow. Detection is easy — just use an OR operation — but there are many more values for a and b where a * b overflows, compared to a + b.

    I like the idea of an integer NaN; perhaps introducing integer +/- infinity could be a good idea as well.

    The best we can do is come up with rules that are EASY to understand — i.e., minimize the need for systems programmers to become arithmeticians and then propagate that information to them immediately. And, always TRAP, or otherwise stop execution as quickly as possible when an error occurs.

    Although this is a software blog, it’s worth noting that the world of FPGAs and custom hardware is even more dangerous, as there are many tempting reasons to design efficient arithmetic operators that do not conform to the standards — and some of the best of these techniques increase the precision internally and then return the result at the correct precision (e.g., Altera). The problem is that you’ll get a different result if you run the same code on your desktop machine.

  39. > But, you still need to certify that every loop that performs any type of accumulation operation adds 2^32 or fewer integers.

    I think that the proposal 3 is only about the inside of the evaluation of an expression (results would be stored in fixed sized integers) so loops don’t matter here.

    > I like the idea of an integer NaN; perhaps introducing integer +/- infinity could be a good idea as well.

    Integer NaN seems a good idea until you realize that it induce “Not a Boolean”: the result of “x < y" isn't really only false, true then, one must add a "Not a Boolean" which means that every "if then … else … " becomes "if then … else … elseUndefined …", not a pleasant thought to have to write all these extra cases!
    The only nice thing is that it makes the flow control much more easy to follow than with operation which can TRAP everywhere..