Integer Undefined Behavior Detection using Clang 3.3

Undefined behaviors in C/C++ are harmful to developers:

There are many kinds of undefined behavior
They can be hard to understand
Their effects change depending on which compiler version and which compiler options you use, and they get worse every time an optimizer gets smarter
Plenty of them aren’t reliably detected by any tool that I know of

Until these languages die, which isn’t going to happen anytime soon, our best defense against undefined behaviors is to write better checking tools. Recently, Clang has started to accumulate a nice collection of such tools, many of which can be enabled using the compiler flag -fsanitize=undefined. The Clang manual has more details.

Our modest contribution has been a collection of checks for integer undefined behaviors like signed overflow and shift-past-bitwidth. These checks have been part of LLVM for a while, and now they are finally part of the 3.3 release, which comes in a variety of convenient pre-compiled packages.
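To make these two behaviors concrete, here is a minimal sketch of the conditions under which signed addition and signed left shift are undefined in C99 and later. The function names add_overflows and shift_left_is_defined are hypothetical helpers written for illustration; they are not part of Clang or of the sanitizer runtime. The sketch also assumes that int has no padding bits.

```c
#include <limits.h>
#include <stdbool.h>

/* Hypothetical helper: true if a + b would overflow int,
   which is undefined behavior for signed integers in C. */
bool add_overflows(int a, int b) {
    if (b > 0) return a > INT_MAX - b;  /* sum would exceed INT_MAX */
    if (b < 0) return a < INT_MIN - b;  /* sum would fall below INT_MIN */
    return false;                       /* adding zero never overflows */
}

/* Hypothetical helper: true if x << n is well defined for a signed int
   under C99/C11 rules. Assumes int has no padding bits. */
bool shift_left_is_defined(int x, int n) {
    const int width = (int)(sizeof(int) * CHAR_BIT);
    if (n < 0 || n >= width) return false;  /* shift-past-bitwidth */
    if (x < 0) return false;                /* negative left operand */
    return x <= (INT_MAX >> n);             /* result must be representable */
}
```

Roughly speaking, a line of code that performs one of these operations while the corresponding condition is violated is the kind of thing the integer checks report at run time.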

To find integer undefined behaviors in a code base that you care about, there are three steps:

Install Clang/LLVM 3.3 from a binary package or from source and make sure that clang and clang++ are in your PATH. If compiling from source, you will need to build compiler-rt. Full instructions are here.
Build your code base using clang or clang++ and a flag such as -fsanitize=undefined.
Test the compiled code as thoroughly as possible; if any sanitizer output appears then you have probably found one or more bugs. The lines you care about will contain the string “runtime error”.

Let’s go through a quick example. I did this on a Linux machine but it should be more or less the same on other platforms. Grab the latest stable version of Perl, untar it, and run its configure script:

wget http://www.cpan.org/src/5.0/perl-5.18.0.tar.gz
tar xvf perl-5.18.0.tar.gz
cd perl-5.18.0/
./Configure

When the configure script asks for a C compiler, respond with clang -fsanitize=undefined. Then build Perl, run its test suite, and look for problems:

make -j4
make test > make.out 2>&1
grep 'runtime error' make.out | sort | uniq

At this point you should see several hundred lines of undefined behavior errors. Here’s the full output.

Since modern C compilers actually exploit undefined integer behaviors in order to generate code that you didn’t expect or want, this whole exercise is probably worth doing for any code whose correctness you care about.
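As a small illustration of what "exploit" can mean here, consider the sketch below. The function names are hypothetical and chosen only for this example. Because x + 1 is undefined when x equals INT_MAX, an optimizer is allowed to assume the addition never wraps, and it may compile the first function so that it always returns true. Whether a particular compiler does so depends on its version and options.

```c
#include <limits.h>
#include <stdbool.h>

/* Looks like an overflow check, but x + 1 is undefined when x == INT_MAX,
   so the compiler may assume the sum never wraps and fold this to "true". */
bool naive_can_increment(int x) {
    return x + 1 > x;
}

/* Well-defined alternative: compare against the limit directly,
   so no overflowing arithmetic is ever performed. */
bool safe_can_increment(int x) {
    return x < INT_MAX;
}
```

Calling naive_can_increment(INT_MAX) in a build made with -fsanitize=undefined is the sort of call that should produce a "runtime error" line, while safe_can_increment is defined for every int.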

June 19, 2013

regehr

Compilers, Software Correctness

8 responses to “Integer Undefined Behavior Detection using Clang 3.3”

Daniel Lemire says:

June 19, 2013 at 5:03 pm

Nice research output. Does that count as a paper? 😉
Michael Norrish says:

June 19, 2013 at 5:24 pm

Indeed, it looks as if the test output shows that it is detecting float overflows as well. Presumably these are the results of casts, rather than arithmetic. (My very shaky memory tells me that arithmetic overflows on floats just produce NaNs…)
regehr says:

June 19, 2013 at 5:28 pm

Hi Michael, I believe that your guess is correct. It’s not totally clear to me that FP checks should be turned on by the catchall undefined behavior detection flag but it’s not something I think about much either.
Octoploid says:

June 20, 2013 at 12:50 am

When you build from source, please note that ubsan is only built when using cmake; make will not work.

-fsanitize=undefined is a great tool. Some projects react promptly to bug reports e.g.:
http://thread.gmane.org/gmane.comp.fonts.freetype.devel/8794/focus=8817
Some others don’t:
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=57324
Steffen Mueller says:

June 25, 2013 at 12:36 am

Interesting tool!

Regarding applying it to Perl: I’m certain it found a few genuine bugs (regexp compilation using I32’s as offsets?), but others are pretty much what one would expect. For example the error at “pp.c:2527:7” is in the opcode that implements a special case of integer arithmetic: A (Perl) user needs to explicitly enable “use integer” mode to get the machine’s raw integer arithmetic. In this mode, perl doesn’t prevent the user from doing the wrong thing. The test which provokes the undefined behaviour is specifically testing the corner cases. One could argue it’s therefore a bug in the test script.

In any case, this seems a very useful tool. Thank you!
regehr says:

June 25, 2013 at 1:15 pm

Hi Steffen, thanks for the comments. I’d be very interested to hear how many of these end up being useful to the Perl developers.
DM says:

June 26, 2013 at 10:44 am

@Michael Norrish: You are right, on most systems, the default is that an arithmetic overflow produces ±∞. However, it is possible to set the system to trap on overflow. Also, in most programs, ±∞ are undesirable and checking for them is thus a worthy endeavour (as done in the Astrée system, for instance).

Overflow on conversion to integer, if I remember correctly, is an undefined behavior.
DM says:

June 26, 2013 at 10:44 am

@John: Thanks for all your work, since it’s now integrated into clang… We can therefore use the PAGAI static analyzer to check for the reachability of the trap condition!