This post is a quick explanation of what my integer quiz was intended to do.
The background is that our Integer Overflow Checker paper is being presented at ICSE this week. One of the things that bothered us when we wrote that paper was that we couldn’t find any concise explanation of C’s integer rules that we could reference. The C99 standard is less than readable. So I started writing up an explanation of C’s promotion rules and that sort of material. But it was boring, so I made a quiz instead.
At first I wanted to make the quiz about pure C, but very quickly it became clear that for practical purposes, no such thing exists. In other words, truly portable C code is so far from C as it actually exists that there’s no point pretending. So I added in (but didn’t word very clearly) the set of implementation-defined behaviors that most of us are used to: those that pertain to common compilers such as Intel CC, GCC, Clang, and many others on x86 and x86-64. I’m not totally sure why so many people missed this part of the instructions, but it seems that some people followed links into the quiz, bypassing this critical bit of information, and others (shockingly) just started answering questions without reading the instructions.
The quiz has two points:
- C programmers need to understand the integer promotion rules and related parts of the C standard. This isn’t because anyone should write the kind of code shown in the quiz, but rather so that we can avoid writing that kind of code and also so we can understand how other people’s code goes wrong. The IOC paper shows that without tool support, some significant fraction of C programmers do not understand and/or are not capable of following the rules for integer operations. (I’m one of these programmers.)
- C programmers must understand and respect the sharp distinction between implementation-defined behavior and undefined behavior. Implementation-defined behavior can be relied on (for a single compiler, or a collection of compatible compilers) but undefined behavior should never be executed. This material is confusing and people commonly get it wrong. I’ve written about the nastiness of undefined behavior before, and Chris Lattner’s article on the topic is a must-read.
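To make these two points a little more concrete, here is a minimal sketch in C. It is not taken from the quiz: the function names (`promoted_compare`, `mixed_compare`, `narrow_to_schar`, `checked_add`) are made up for illustration, and the comment about the narrowing conversion assumes the quiz's setting of GCC or Clang on x86 or x86-64.

```c
#include <limits.h>

/* Integer promotion: both operands are promoted to int before the
 * comparison, so this compares 255 with -1 and yields 0. */
int promoted_compare(void) {
    unsigned char uc = 255;
    signed char sc = -1;
    return uc == sc;
}

/* Usual arithmetic conversions: the int operand is converted to
 * unsigned, so -1 becomes UINT_MAX and the comparison yields 0. */
int mixed_compare(void) {
    int a = -1;
    unsigned b = 1;
    return a < b;
}

/* Implementation-defined: converting an out-of-range value to a signed
 * type. Assumption: GCC and Clang reduce the value modulo 256 here, so
 * 200 becomes -56. Other compilers may document something else. */
signed char narrow_to_schar(int x) {
    return (signed char)x;
}

/* Undefined: signed overflow. Rather than executing INT_MAX + 1, test
 * for overflow first. Returns 1 and stores a + b in *out when the sum
 * is representable, otherwise returns 0 and leaves *out untouched. */
int checked_add(int a, int b, int *out) {
    if ((b > 0 && a > INT_MAX - b) || (b < 0 && a < INT_MIN - b)) {
        return 0;
    }
    *out = a + b;
    return 1;
}
```

The first three functions are well-defined for a given compiler and can be relied on in that setting. The last one exists precisely because the overflowing addition must never be executed, so the check has to happen before the arithmetic rather than after it.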
I agree with the sentiment expressed in several comments that this quiz serves as something of an indictment of the C language. Integers should not be so difficult. But since we have no realistic story for getting rid of the mountains of C/C++ code that run so many real-world systems, we might as well figure out what to do about it. It’s 2012 and I cannot even think of a language that would be a better choice for implementing a JVM, an OS kernel, or a hypervisor. This is depressing.
There are definitely things the quiz could have done better but overall I’m happy with it. The wide exposure aided by HN and Reddit (more than 50,000 hits as of June 5) hopefully means that some number of programmers who did not previously appreciate these issues now do.
The last issue I wanted to address is especially tricky: the rules for left-shifts of signed numbers in C99. As the quiz tried to make clear, a great deal of code that left-shifts a signed value will encounter undefined behavior: C99 leaves the result undefined when the left operand is negative, and also when it is non-negative but the shifted result is not representable in the result type. Now, it so happens that compilers don’t take advantage of this (or at least, I don’t know of any that do). Does that mean that it’s OK to write code that does these undefined operations? Probably not.
Signed overflows were reliably compiled to 2’s complement semantics until one day (some years ago) when they no longer were. This broke a lot of code and pissed people off but the compiler writers just pointed to the C standard and effectively said: “Not our problem.” To this day a lot of C programmers (perhaps even a majority of them, if we consider the large number of C programmers who don’t know the language very well) are unaware of the fact that signed overflow is not guaranteed to wrap around in 2’s complement fashion. Will the compiler developers start to exploit the undefinedness of many left shifts? My guess is that if doing so can increase the performance of at least one important program, they will.
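For code that has to shift today, one conservative option is to stay inside what C99 defines. The sketch below is only an illustration based on my reading of C99 6.5.7: `checked_shl` and `shl_unsigned` are hypothetical helpers, and the width computation assumes that `int` and `unsigned` have no padding bits, which holds on the x86 and x86-64 targets discussed here.

```c
#include <limits.h>

/* Left-shift a signed value only when C99 defines the result: the
 * shift count must be less than the width of int, the value must be
 * non-negative, and value * 2^n must fit in an int.
 * Returns 1 and stores the result in *out, or returns 0 and leaves
 * *out untouched. */
int checked_shl(int x, unsigned n, int *out) {
    const unsigned width = sizeof(int) * CHAR_BIT; /* assumes no padding bits */
    if (x < 0 || n >= width) {
        return 0;
    }
    if (x > (INT_MAX >> n)) {
        return 0; /* x * 2^n would not be representable */
    }
    *out = x << n;
    return 1;
}

/* Alternative: do the shift on an unsigned value, where bits shifted
 * off the top are simply discarded. An oversized shift count is still
 * undefined, so it is mapped to 0 here by choice. */
unsigned shl_unsigned(unsigned x, unsigned n) {
    return n < sizeof(unsigned) * CHAR_BIT ? x << n : 0u;
}
```

With a 32-bit `int`, `checked_shl(1, 31, &r)` returns 0 because 2^31 is not representable, which is exactly the kind of shift that is easy to write without noticing. Shifting in `unsigned` and converting back afterwards trades the undefined behavior for an implementation-defined conversion, which is the safer side of the distinction drawn above.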