Type-based alias analysis, where pointers to different types are assumed to point to distinct objects, gives compilers a simple and effective way to disambiguate memory references in order to generate better code. Unfortunately, C and C++ make it easy for programmers to violate the assumptions upon which type-based alias analysis is built. “Strict aliasing” refers to a collection of rules in the C and C++ standards that restrict the ways in which you are allowed to modify and look at memory objects, in order to make type-based alias analysis work in these weakly-typed languages. The problem is that the strict aliasing rules contain tricky and confusing corner cases and also that they rule out many idioms that have historically worked, such as using a pointer type cast to view a float as an unsigned, in order to inspect its bits. Such tricks are undefined behavior. See the first part of this post for more of an introduction to these issues.
The purpose of this piece is to call your attention to a new paper, Detecting Strict Aliasing Violations in the Wild, by Pascal Cuoq and his colleagues. C and C++ programmers should read it. Sections 1 and 2 introduce strict aliasing, they’re quick and easy.
Section 3 shows what compilers think about strict aliasing problems by looking at how a number of C functions get translated to x86-64 assembly. This material requires perseverance but it is worth taking the time to understand the examples in detail, because compilers apply the same thinking to real programs that they apply to tiny litmus tests.
Section 4 is about a new tool, built as part of Trust-in-Soft’s static analyzer, that can diagnose violations of the strict aliasing rules in C code. As the paper says, “it works best when applied on definite inputs,” meaning that the tool should be used as an extended checker for tis-interpreter. Pascal tells me that a release containing the strict aliasing checker is planned, but the time frame is not definite. In any case, readers interested in strict aliasing, but not specifically in tools for dealing with it, can skip this section.
Section 5 applies the strict aliasing checker to open source software. This is good reading because it describes problems that are very common in the wild today. Finding a bug in zlib was a nice touch: zlib is small and has already been looked at closely. Some programs mitigate these bugs by asking the compiler to avoid enforcing the strict aliasing rules: LLVM, GCC, and Intel CC all take a -fno-strict-aliasing flag and MSVC doesn’t implement type-based alias analysis at all. Many other programs contain time bombs: latent UB bugs that don’t happen to be exploited now, that might be exploited later when the compiler becomes a bit brighter.