Checking Up on Dataflow Analyses

An important tool for reasoning about programs without running them is dataflow analysis, which can be used to compute facts such as:

  • an integer-valued variable is non-negative at a particular program point
  • a conditional branch always goes one way
  • an indirect write cannot modify a particular array

Dataflow facts drive optimizations such as constant propagation and dead code elimination. Outside of compilers, dataflow analyses are important in tools that prove safety properties, such as absence of undefined behaviors in C or C++ programs. Dataflow analysis is backed by a rich academic literature and tool writers have decades of experience implementing it. This set of lecture notes is my favorite introduction to dataflow analysis.

Despite all of the nice theorems, dataflow analyses are hard to get right, in part because people are just not very good at reasoning about properties that hold over sets of program executions. As a simple exercise in this kind of thinking, try this: you have a variable A whose value is known to be in the range [0..1000] and a variable B that is known to be in the range [10000..10050]. The program computes the bitwise xor of A and B and assigns the result into C. Provide tight bounds on the values that C can take. If you find this to be easy, that’s great — but consider that you then have to make this work for all intervals and after that there are about 9,999 similarly fiddly things left to implement and any mistakes you make are likely to result in miscompilations.

So now let’s finally get to the point of this post. The point is that since dataflow is hard, if you implement a dataflow analysis, you should also implement a dynamic checker that looks for cases where the analysis has come to the wrong conclusion. To keep going with the running example, the program being analyzed contains this line of code:

// A in [0..1000] and B in [10000..10050]
C = A ^ B;

If our interval analysis says that C has to be in the range [9225..12255], we would rewrite the code to add an assertion:

// A in [0..1000] and B in [10000..10050]
C = A ^ B;
assert(C >= 9225 && C <= 12255);

Now we can run the instrumented program and, if we’re very lucky, it will execute this code with values such as A = 830 and B = 10041, making C = 9223, triggering an assertion violation telling us that our analysis is unsound and giving us something to debug. The other way to debug this sort of problem is to backtrack from a miscompilation — an experience enjoyed by basically nobody.

Every dataflow analysis that I’ve implemented or supervised the implementation of has gotten a dynamic checker. This is a great testing technique that has numerous advantages. It finds bugs in practice. It can be automated, either using a program generator like Csmith or else using a regular test suite. The annotated programs serve as a useful way to look at the precision of an analysis: if not very many assertions show up, or if the bounds are very loose, then we probably need to do some tuning for precision. Finally, if we don’t find any bugs, then the dataflow analysis is probably (please excuse me, formal methods people) correct enough for practical purposes.

Of course, not everything that can be computed by a dataflow analysis can be expressed in code. For example, in C and C++ there’s no lightweight way to assert that a pointer refers to valid storage or that a variable is uninitialized. E-ACSL is a neat piece of related work in this general area.

Just recently I’ve been looking at the efficiency of integer overflow checking using LLVM and since then I wrote some code that uses integer range dataflow facts to remove overflow checks that can be proved not to fire. Here’s the under-review patch. The ranges are computed by an LLVM pass called LazyValueInfo (LVI), which had better be correct if we’re going to rely on it, so I wrote a little LLVM pass that does the necessary checking. In between the optimization and code generation phases, it asks LVI for the range of each integer-typed value in every function and inserts an assertion that traps if the value ever falls outside that range.

(If you build this pass, you’ll need a slightly hacked LLVM; see the README.) Although it might seem tempting to run the optimizers again on the instrumented code in order to clean it up a bit, this would be a very bad idea: the very dataflow analyses that we’re trying to check up on would be used to drive optimizations, which could easily end up incorrectly deleting our checks.

So far, some testing using Csmith hasn’t turned up any bugs in LVI, which is great. Less great is the fact that LVI drops precision all over the place: a lot of tuning is needed before it can serve as the basis for a really strong redundant overflow check remover.

The technique I’m describing is not as widely known or as widely used as it should be, considering how easy and effective it is. Finally, the answer to the exercise from the beginning of this post: the tight bounds are C = [9216..10239].

UPDATE: The latest version of my LLVM dataflow checker also validates the results of computeKnownBits(), which tries to figure out which bits of a value must be either zero or one.

Here’s a bit of further reading about how these techniques (and others) were applied to the Frama-C static analyzer.

Efficient Integer Overflow Checking in LLVM

(Here’s some optional background reading material.)

We want fast integer overflow checking. Why? First, if the undefined behavior sanitizers go faster then testing goes faster. Second, when overhead drops below a certain point people will become willing to use UBSan to harden production code against integer overflows. This is already being done in parts of Android. It isn’t the kind of thing to do lightly: some of LLVM’s sanitizers, such as ASan, will increase an application’s attack surface. Even UBSan in trapping mode — which does not increase attack surface that I know of — could easily enable remote DoS attacks. But this is beside the point.

Let’s start with this function:

int foo(int x, int y) {
  return x + y; 
}

When compiled with trapping integer overflow checks (as opposed to checks that provide diagnostics and optionally continue executing), Clang 3.8 at -O2 gives:

foo:
        addl    %esi, %edi
        jo      .LBB0_1
        movl    %edi, %eax
        retq
.LBB0_1:
        ud2

This code is efficient; here Chandler Carruth explains why. On the other hand, signed integer overflow checking slows down SPEC CINT 2006 by 11.8% overall, with slowdown ranging from negligible (GCC, Perl, OMNeT++) to about 20% (Sjeng, H264Ref) to about 40% (HMMER).

Why do fast overflow checks add overhead? The increase in object code size due to overflow checking is less than 1% so there’s not going to be much trouble in the icache. Looking at HMMER, for example, we see that it spends >95% of its execution time in a function called P7Viterbi(). This function can be partially vectorized, but the version with integer overflow checks doesn’t get vectorized at all. In other words, most of the slowdown comes from integer overflow checks interfering with loop optimizations. In contrast, GCC and Perl probably don’t get much benefit from advanced loop optimizations in the first place, hence the lack of slowdown there.

Here I’ll take a second to mention that I had to hack SPEC slightly so that signed integer overflows wouldn’t derail my experiments. The changes appear to be performance-neutral. Only GCC, Perl, and H264Ref have signed overflows. Here’s the patch for SPEC CPU 2006 V1.2. All performance numbers in this post were taken on an i7-5820K (6-core Haswell-E at 3.3 GHz), running Ubuntu 14.04 in 64-bit mode, with frequency scaling disabled.

Now for the fun part: making code faster by removing overflow checks that provably don’t fire. At -O0 the SPEC INT 2006 benchmarks have 67,678 integer overflow checks, whereas at -O3 there are 38,527. So that’s nice: LLVM 3.8 can already get rid of 43% of naively inserted checks.

Let’s look at some details. First, the good news: all of the overflow checks in these functions are optimized away by LLVM:

int foo2(int x, int y) {
  return (short)x + (short)y;
}

int foo3(int x, int y) {
  return (long)x + (long)y;
}

int foo4(int x, int y) {
  return (x >> 1) + (y >> 1);
}

int32_t foo5(int32_t x, int32_t y) {
  const int32_t mask = ~(3U << 30);
  return (x & mask) + (y & mask);
}

int32_t foo6(int32_t x, int32_t y) {
  const int32_t mask = 3U << 30;
  return (x | mask) + (y | mask);
}

int32_t foo7(int32_t x, int32_t y) {
  const int32_t mask = 1U << 31;
  return (x | mask) + (y & ~mask);
}

See for yourself here. The common theme across these functions is that each overflow check can be seen to be unnecessary by looking at the high-order bits of its inputs. The code for these optimizations is here.

Now on the other hand, LLVM is unable to see that these functions don’t require overflow checks:

int foo8(int x) {
  return x < INT_MAX ? x + 1 : INT_MAX;
}

int foo9(int x, int y) {
  if (x < 10 && x > -10 && y < 10 && y > -10)
    return x + y;
  return 0;
}

Here’s another one where the check doesn’t get eliminated:

void foo10(char *a) {
  for (int i = 0; i < 10; i++)
    a[i] = 0;
}

There’s good news: Sanjoy Das has a couple of patches that, together, solve this problem. Their overall effect on SPEC is to reduce the overhead of signed integer overflow checking to 8.7%.

Finally, this function should get one overflow check rather than three:

unsigned foo11(unsigned *a, int i) {
  return a[i + 3] + a[i + 5] + a[i + 2];
}

This example (or something like it) is from Nadav Rotem on Twitter; his redundant overflow check removal pass for Swift eliminates this sort of thing at the SIL level. I’ve done a bit of work on bringing these ideas to LLVM, and will hopefully have more to write about that later on.

In summary, signed integer overflow checks in LLVM are fast but they get in the way of the optimizers, which haven’t yet been systematically taught to see through them or eliminate them. There’s plenty of low-hanging fruit, and hopefully we can get the overhead down to the point where people can turn on overflow checking in production codes without thinking too hard about the tradeoffs.

Addenda:

I haven’t forgotten about Souper! We’ve taught it to eliminate integer overflow checks. I’ll write about that later too.

See Dan Luu’s article on this topic.

SQLite with a Fine-Toothed Comb

One of the things I’ve been doing at Trust-in-Soft is looking for defects in open-source software. The program I’ve spent the most time with is SQLite: an extremely widely deployed lightweight database. At ~113 KSLOC of pointer-intensive code, SQLite is too big for easy static verification. On the other hand, it has an incredibly impressive test suite that has already been used to drive dynamic tools such as Valgrind, ASan, and UBSan. I’ve been using these tests with tis-interpreter.

Here I try to depict the various categories of undefined behaviors (UBs); this post is mainly about the blue-shaded area:

This figure leaves out many UBs (there are a few hundred in all) and is not to scale.

For honesty’s sake I should add that using tis-interpreter isn’t painless (it wasn’t designed as a from-scratch C interpreter, but rather is an adaptation of a sound formal methods tool). It is slower even than Valgrind. It has trouble with separately compiled libraries and with code that interacts with the system, such as mmap() calls. As I tried to show in the figure above, when run on code that is clean with respect to ASan and UBSan, it tends to find fiddly problems that a lot of people aren’t going to want to hear about. In any case, one of the reasons that I’m pushing SQLite through tis-interpreter is to help us find and fix the pain points.

Next let’s look at some bugs. I’ve been reporting issues as I go, and a number of things that I’ve reported have already been fixed in SQLite. I’ll also discuss how these bugs relate to the idea of a Friendly C dialect.

Values of Dangling Pointers

SQLite likes to use — but not dereference — pointers to heap blocks that have been freed. It did this at quite a few locations. For example, at a time when zHdr is dangling:

if( (zHdr>=zEndHdr && (zHdr>zEndHdr 
  || offset64!=pC->payloadSize))
 || (offset64 > pC->payloadSize)
){
  rc = SQLITE_CORRUPT_BKPT;
  goto abort_due_to_error;
}

These uses are undefined behavior, but are they compiled harmfully by current C compilers? I have no evidence that they are, but in other situations the compiler will take advantage of the fact that a pointer is dangling; see this post and also Winner #2 here. You play with dangling pointers at your own risk. Valgrind and ASan make no attempt to catch these uses, as far as I know.

Using the value of a dangling pointer is a nettlesome UB, causing inconvenience while — as far as I can tell — giving the optimizer almost no added power in realistic situations. Eliminating it is a no-brainer for Friendly C.

Uses of Uninitialized Storage

I found several reads of uninitialized storage. This is somewhere between unspecified and undefined behavior.

One idiom was something like this:

int dummy;
some sort of loop {
  ...
  // we don't care about function()'s return value
  // (but its other callers might)
  dummy += function();
  ...
}
// dummy is not used again

Here the intent is to avoid a compiler warning about an ignored return value. Of course a better alternative is to initialize dummy; the compiler can still optimize away the unwanted bookkeeping if function() is inlined or otherwise specialized.

At least one uninitialized read that we found was potentially harmful, though we couldn’t make it behave unpredictably. Also, it had not been found by Valgrind. Both of these facts — the predictability and the lack of an alarm — might be explained by the compiler reusing a stack slot that had previously contained a different local variable. Of course we must not count on such coincidences working out well.

A Friendly C could ignore reads of uninitialized storage based on the idea that tool support for detecting this class of error is good enough. This is the solution I would advocate. A heavier-handed alternative would be compiler-enforced zeroing of heap blocks and automatic variables.

Out-of-Bounds Pointers

In C you are not allowed to compute — much less use or dereference — a pointer that isn’t inside an object or one element past its end.

SQLite’s vdbe struct has a member called aMem that uses 1-based array indexing. To avoid wasting an element, this array is initialized like this:

p->aMem = allocSpace(...);
p->aMem--;

I’ve elided a bunch of code; the full version is in sqlite3VdbeMakeReady() in this file. The real situation is more complicated since allocSpace() isn’t just a wrapper for malloc(): the UB only happens when aMem points to the beginning of a block returned by malloc(), so that the decrement creates a pointer before the start of the object. This could be fixed by avoiding the 1-based addressing, by allocating a zero element that would never be used, or by reordering the struct fields.

Other out-of-bounds pointers in SQLite were computed when pointers into character arrays went past the end. This particular class of UB is commonly seen even in hardened C code. It is probably only a problem when either the pointer goes far out of bounds or else when the object in question is allocated near the end of the address space. In these cases, the OOB pointer can wrap, causing a bounds check to fail to trigger, potentially causing a security problem. The full situation involves an undesirable UB-based optimization and is pretty interesting. A good solution, as the LWN article suggests, is to move the bounds checks into the integer domain. The Friendly C solution would be to legitimize creation of, and comparison between, OOB pointers. Alternately, we could beef up the sanitizers to complain about these things.

Illegal Arguments

It is UB to call memset(), memcpy(), and other library functions with invalid or null pointer arguments. GCC actively exploits this UB to optimize code. SQLite had a place where it called memset() with an invalid pointer and another where it called memcpy() with a null pointer. In both cases the length argument was zero, so the calls were otherwise harmless. A Friendly C dialect would only require each pointer to refer to as much storage as the length argument implies is necessary, including none at all.

Comparisons of Pointers to Unrelated Objects

When the relational operators >, >=, <, <= are used to compare two pointers, the program has undefined behavior unless both pointers refer to the same object, or to the location one past its end. SQLite, like many other programs, wants to do these kinds of comparisons. The solution is to reduce undefined behavior to implementation defined behavior by casting to uintptr_t and comparing the resulting integers. For example, SQLite now has this method for checking if a pointer (that might be to another object) lies within a memory range:

# define SQLITE_WITHIN(P,S,E) \
    ((uintptr_t)(P)>=(uintptr_t)(S) && \
     (uintptr_t)(P)<(uintptr_t)(E))

Uses of this macro can be found in btree.c.

With respect to pointer comparisons, tis-interpreter’s (and Frama-C’s) intentions are stronger than just detecting undefined behavior: we want to ensure that execution is deterministic. Comparisons between unrelated objects destroy determinism because the allocator makes no guarantees about their relative locations. On the other hand, if the pair of comparisons in a SQLITE_WITHIN call is treated atomically, and if S and E point into the same object, there is no determinism violation. We added a “within” builtin to Frama-C that can be used without violating determinism and also a bare pointer comparison that — if used — breaks the guarantee that Frama-C’s results hold for all possible allocation orders. Sound analysis of programs that depend on the relative locations of different allocated objects is a research problem and something like a model checker would be required.

In Friendly C, comparisons of pointers to unrelated objects would act as if they had already been cast to uintptr_t. I don’t think this ties the hands of the optimizer at all.

Summary

SQLite is a carefully engineered and thoroughly tested piece of software. Even so, it contains undefined behaviors because, until recently, no good checker for these behaviors existed. If anything is going to save us from UB hell, it’s tools combined with developers who care to listen to them. Richard Hipp is a programming hero who always responds quickly and has been receptive to the fiddly kinds of problems I’m talking about here.

What to Do Now

Although the situation is much, much better than it was five years ago, C and C++ developers will not be on solid ground until every undefined behavior falls into one of these two categories:

  • Erroneous UBs for which reliable and ubiquitous (but perhaps optional and inefficient) detectors exist. These can work either at compile time or run time.
  • Benign behaviors for which compiler developers have provided a documented semantics.

The Strict Aliasing Situation is Pretty Bad

I’ll start with a quick review of the strict aliasing rules in C and C++ and then present some less well-known material.

Strict Aliasing

Compiler optimizations are often shot down by the potential for pointers to be aliases. For example, although we might naively expect a compiler to optimize this function to return zero, that cannot happen because x and y might refer to the same location:

int foo(int *x, int *y) {
  *x = 0;
  *y = 1;
  return *x;
}

Generated code typically looks like this:

foo:    movl    $0, (%rdi)
        movl    $1, (%rsi)
        movl    (%rdi), %eax
        ret

Failure to optimize is particularly frustrating when we know for sure that x and y are not aliases. The “restrict” keyword in C was introduced to help solve this sort of problem but we’re not going to talk about that today. Rather, we’re going to talk about an orthogonal solution, the aliasing rules in C and C++ that permit the compiler to assume that an object will not be aliased by a pointer to a different type. Often this is called “strict aliasing” although that term does not appear in the standards. Consider, for example, this variant of the program above:

int foo2(int *x, long *y) {
  *x = 0;
  *y = 1;
  return *x;
}

Since a pointer-to-int and a pointer-to-long may be assumed to not alias each other, the function can be compiled to return zero:

foo2:   movl    $0, (%rdi)
        xorl    %eax, %eax
        movq    $1, (%rsi)
        ret

As we see here, the aliasing rules give C and C++ compilers leverage that they can use to generate better code. On the other hand, since C is a low-level language and C++ can be used as a low-level language, both permit casting between pointer types, which can end up creating aliases that violate the compiler’s assumptions. For example, we might naively write code like this to access the representation of a floating point value:

unsigned long bogus_conversion(double d) {
  unsigned long *lp = (unsigned long *)&d;
  return *lp;
}

This function is undefined under the aliasing rules and while it happens to be compiled into the same code that would be emitted without the strict aliasing rules, it is easy to write incorrect code that looks like it is getting broken by the optimizer:

#include <stdio.h>

long foo(int *x, long *y) {
  *x = 0;
  *y = 1;
  return *x;
}

int main(void) {
  long l;
  printf("%ld\n", foo((int *)&l, &l));
}

$ gcc-5 strict.c ; ./a.out
1
$ gcc-5 -O2 strict.c ; ./a.out
0
$ clang strict.c ; ./a.out
1
$ clang -O2 strict.c ; ./a.out
0

An exception to the strict aliasing rules is made for pointers to character types, so it is always OK to inspect an object’s representation via an array of chars. This is necessary to make memcpy-like functions work properly.

So far, this is very well known. Now let’s look at a few consequences of strict aliasing that are perhaps not as widely known.

Physical Subtyping is Broken

An old paper that I like uses the term “physical subtyping” to refer to the struct-based implementation of inheritance in C. Searching for “object oriented C” returns quite a few links. Additionally, many large C systems (the Linux kernel for example) implement OO-like idioms. Any time this kind of code casts between pointer types and dereferences the resulting pointers, it violates the aliasing rules. Many aliasing rule violations can be found in this book about object oriented C. Some build systems, such as Linux’s, invoke GCC with its -fno-strict-aliasing flag to avoid problems.

Update based on some comments from Josh Haberman and Sam Tobin-Hochstadt: It looks like the specific case where the struct representing the derived type includes its parent as its first member should not trigger UB. The language in this part of the standard is very hard to parse out.

This program from the Cerberus project illustrates the problem with changing the type of a pointer to struct:

#include <stdio.h>

typedef struct { int i1; } s1;
typedef struct { int i2; } s2;

void f(s1 *s1p, s2 *s2p) {
  s1p->i1 = 2;
  s2p->i2 = 3;
  printf("%i\n", s1p->i1);
}

int main() {
  s1 s = {.i1 = 1};
  f(&s, (s2 *)&s);
}

$ gcc-5 effective.c ; ./a.out
3
$ gcc-5 -O2 effective.c ; ./a.out
2
$ clang-3.8 effective.c ; ./a.out
3
$ clang-3.8 -O2 effective.c ; ./a.out
3

Chunking Optimizations Are Broken

Code that processes bytes one at a time tends to be slow. While an optimizing compiler can sometimes make a naive character-processing loop much faster, in practice we often need to help the compiler out by explicitly processing word-sized chunks of bytes at a time. Since the data reinterpretation is generally done by casting to a non-character-typed pointer, the resulting accesses are undefined. Search the web for “fast memcpy” or “fast memset”: many of the hits will return erroneous code. Example 1, example 2, example 3. Although I have no evidence that it is being miscompiled, OpenSSL’s AES implementation uses chunking and is undefined.

One way to get chunking optimizations without UB is to use GCC’s may_alias attribute, as seen here in Musl. This isn’t supported even by Clang, as far as I know.

Offset Overlap is Bad

Here is a devilish little program by Richard Biener and Robbert Krebbers that I found via the Cerberus report:

#include <stdio.h>
#include <stdlib.h>

struct X {
  int i;
  int j;
};

int foo(struct X *p, struct X *q) {
  q->j = 1;
  p->i = 0;
  return q->j;
}

int main() {
  unsigned char *p = malloc(3 * sizeof(int));
  printf("%i\n", foo((struct X *)(p + sizeof(int)),
                     (struct X *)p));
}

It is ill-formed according to LLVM and GCC:

$ clang-3.8 krebbers.c ; ./a.out
0
$ clang-3.8 -O2 krebbers.c ; ./a.out
1
$ gcc-5 krebbers.c ; ./a.out
0
$ gcc-5 -O2 krebbers.c ; ./a.out
1

int8_t and uint8_t Are Not Necessarily Character Types

This bug (and some linked discussions) indicate that compiler developers don’t necessarily consider int8_t and uint8_t to be character types for aliasing purposes. Wholesale replacement of character types with standard integer types — as advocated here, for example — would almost certainly lead to interesting strict aliasing violations when the resulting code was run through a compiler that doesn’t think int8_t and uint8_t are character types. Happily, no compiler has done this yet (that I know of).

Summary

A lot of C code is broken under strict aliasing. Separate compilation is probably what protects us from broader compiler exploitation of the brokenness, but it is a very poor kind of protection. Static and dynamic checking tools are needed. If I were writing correctness-oriented C that relied on these casts I wouldn’t even consider building it without -fno-strict-aliasing.

Pascal Cuoq provided feedback on a draft of this piece.

A Variable Argument Hazard

Variable argument functions in C and C++ can be tricky to use correctly, and they typically only get compiler-based type checking for special cases such as printf(). Recently I ran across an entertaining variable argument bug using tis-interpreter.

Although I will use code from SQLite to illustrate this bug, this bug has not been found in SQLite itself. Rather, it was in some code used to test that database.

SQLite’s API has a configuration function that takes variable arguments:

int sqlite3_config(int op, ...);

Several of its configuration commands, such as this one, expect pointer-typed arguments:

case SQLITE_CONFIG_LOG: {
  typedef void(*LOGFUNC_t)(void*,int,const char*);
  sqlite3GlobalConfig.xLog = va_arg(ap, LOGFUNC_t);
  sqlite3GlobalConfig.pLogArg = va_arg(ap, void*);
  break;
}

We’re only allowed to execute this code when the first two variable arguments are respectively compatible with LOGFUNC_t and void *, otherwise the behavior is undefined.

The test code contains calls that look like this:

sqlite3_config(SQLITE_CONFIG_LOG, 0, pLog); 

At first glance there’s no problem — normally it is fine to pass a 0 to a function that expects a pointer argument: the 0 is turned into an appropriately-typed null pointer. On the other hand, when a function takes variable arguments (or when it lacks a prototype) the compiler cannot do this kind of caller-side type conversion; instead a different set of rules, the “default argument promotions,” is applied, resulting in a zero-valued int being passed to sqlite3_config() in the first variable argument position. Another fun aspect of the default argument promotions is that they force all float arguments to be promoted to double. These rules seem to be the same in C++, though they would fire less often due to C++’s stricter rules about function prototypes.

This problem was not noticed because the code happens to be compiled benignly on both x86 and x64 platforms. On x64 Linux, pointers and ints are not the same size, but the calling convention says that the first six integer and pointer arguments are passed in 64-bit registers — so the problem gets covered up by an implicit promotion to 64 bits. The ABI does not guarantee that the higher bits are zeroed in this case, but in this example they happen to be cleared by GCC and LLVM.

On x86 Linux, all arguments are passed in memory but since pointers and integers are the same size, the problem is again covered up. An ABI that represents a null pointer differently than a zero-valued long integer would expose this bug right away, as would an x64 program that passes at least six variable arguments. In any case, the fix is easy:

sqlite3_config(SQLITE_CONFIG_LOG, (LOGFUNC_t)0, pLog); 

In summary, be careful with the default argument promotions that are done when calling variable argument functions.

Thanks to Pascal Cuoq who provided comments on this piece.

Also see this article by Rich Felker and this one by Jens Gustedt.

Python Exercises for Kids

For the last year or so I’ve been giving Python exercises to my 11-year-old. I thought I’d share some of them. If any of you have been doing similar things, I’d love to hear what worked for you. I think it is helpful that I’m not much of a Python programmer; this forces him to read the documentation. The other day he said “Wow– Stack Overflow is great!”

Fibonacci

Print the Fibonacci series, useful for teaching basics of looping.

Number Guessing Game

The user thinks of a number between 1 and 100. The computer tries to guess it based on feedback about whether the previous guess was too high or low. This one was great for learning about the kinds of off-by-one errors that one customarily runs into while implementing a binary search.

Binary Printer

Print the non-negative binary integers in increasing order. This one forces some representation choices.

Palindrome Recognizer

Recognizing “A man, a plan, a canal — Panama!” as a palindrome requires some string manipulation.

Door Code Recognizer

Our Paris apartment building will let you in if you enter the correct five-digit sequence regardless of how many incorrect digits you previously entered. I thought replicating this behavior in Python would be slightly tricky for the kid but he saw that a FIFO could easily be created from a list.

Monte Carlo Pi Estimator

If you generate random points in the square between (-1,-1) and (1,1), the fraction of points that lie inside the unit circle will approach pi/4. The kid got a kick out of seeing this happen. Of course I had to explain the derivation of pi/4 and the formula for a circle since he hasn’t seen this stuff in school yet. Also of course we could have simply worked in quadrant one but that seemed to make the explanations more complicated.

I should perhaps add that I explained this exercise to my kid very differently than I’m explaining it here. We started by drawing a box containing a circle and then pretended to throw darts into the box. The concepts are not hard: all we need to understand is random sampling and the area of a circle. Math gets made to seem a lot harder than it is, in many cases, and also we tend to conflate the difficulty with the order in which the concepts are presented in the school system.

Turtle Graphics Interpreter

The current project is implementing a turtle that takes four commands:

  • forward n
  • right-turn n
  • left-turn n
  • color c

So far we’ve been restricting angles to 0, 90, 180, 270 (and no radians yet…) but I don’t think sine and cosine will be hard concepts to explain in this context. Also, so far the turtle commands are just a series of function calls; the parser will come later. I wonder what would be the simplest loop construct for this turtle language? Perhaps “repeat n” followed by some sort of block construct.

Planning for Disaster

Alan Perlis once said:

I think that it’s extraordinarily important that we in computer science keep fun in computing. When it started out, it was an awful lot of fun. Of course, the paying customers got shafted every now and then, and after a while we began to take their complaints seriously. We began to feel as if we really were responsible for the successful, error-free perfect use of these machines. I don’t think we are.

This is a nice sentiment, perhaps even a defensible one if we interpret it as narrowly talking about academic computer science. On the other hand, probably within 20 years, there’s going to be a major disaster accompanied by the loss of many lives that is caused more or less directly by software defects. I don’t know what exactly will happen, but something related to transportation or SCADA seems likely. At that point we can expect things to hit the fan. I’m not optimistic that, as a group, computer scientists and computing professionals can prevent this disaster from happening: the economic forces driving automation and system integration are too strong. But of course we should try. We also need to think about what we’re going to do if, all of a sudden, a lot of people expect us to start producing computer systems that actually work, and perhaps hold us accountable when we fail to do so.

Obviously I don’t have the answers but here are a few thoughts.

  • We know that it is possible to create safety-critical software that (largely) works. Generally this happens when the organizations creating the software are motivated not only by market forces, but also by significant regulatory pressure. Markets (and the humans that make them up) are not very good at analyzing low-probability risks. A huge amount of critical software is created by organizations that are subject to very little regulatory pressure.
  • It is difficult to tell people that something is a bad idea when they very much want it to be a good idea. We should get used to doing this, following Parnas’s courageous example.
  • It is difficult to tell people that something is going to be slow and expensive to create, when they very much want it to be quick and cheap. We need to get used to saying that as well.
  • We can and should take responsibility for our work. I was encouraged by the field’s generally very positive response to The Moral Character of Cryptographic Work. Computer scientists and computing professionals almost always think that their particular technology makes the world better or is at worst neutral — but that is clearly not always the case. Some of this could be taught.
  • We need to be educating CS students in methods for creating software that works: testing, specification, code review, debugging, and formal methods. You’d think this is obvious but students routinely get a CS degree without having done any sort of serious software testing.

Finally, let’s keep in mind that causes are tricky. A major human disaster usually has a complex network of causes, perhaps simply because any major disaster with a single cause would indicate that the system had been designed very poorly. Atomic Accidents makes it clear that most nuclear accidents have been the result of an anti-serendipitous collection of poor system design, unhappy evolution and maintenance, and failure-enhancing responses by humans.

Acknowledgments: Pascal Cuoq and Kelly Heller gave some great feedback on drafts of this piece.

Do Fiddly Buffer Overruns Matter?

by Pascal Cuoq and John Regehr

Using tis-interpreter we found a “fiddly” buffer overrun in OpenSSL: it only goes a few bytes out of bounds and the bytes it writes have just been read from the same locations. Fiddly undefined behaviors are common and it can be hard to convince developers to fix them, especially in a case such as this where the behavior is intentional. More accurately, the behavior was intentional when the code was written in the 1990s—a simpler age of more interesting computers. Nevertheless, we reported this bug and it was recently fixed in the master branch. The new version of the file is far nicer. The bug is still present in the release branches.

The bug is in the implementation of RC4: a comparatively old cipher that has had a useful life, but is reaching the end of it. Indeed, OpenSSL’s rc4_enc.c was — before the recent fix — some seriously 1990s C code with plentiful use of the preprocessor, design choices based on characteristics of Alphas and UltraSPARCs, and this dodgy OOB-based performance hack. A new paper “Analysing and Exploiting the Mantin Biases in RC4” might be the final nail in the coffin of RC4.

The encryption code in rc4_enc.c causes up to RC4_CHUNK - 1 bytes past the end of the output buffer to be overwritten with values that have been read from the same (out of bounds) locations. There are several variants delineated with #ifdefs. On an ordinary x86-64, RC4_CHUNK is 8 and execution will go through line 217:

for (; len & (0 - sizeof(RC4_CHUNK)); len -= sizeof(RC4_CHUNK)) {
    ichunk = *(RC4_CHUNK *) indata;
    otp = ...
    *(RC4_CHUNK *) outdata = otp ^ ichunk;
    indata += sizeof(RC4_CHUNK);
    outdata += sizeof(RC4_CHUNK);
}

This loop consumes all the complete 8-byte words found in the input buffer, and writes as many words into the output buffer. So far so good. Things get interesting after the loop:

if (len) {
    RC4_CHUNK mask = (RC4_CHUNK)-1, ochunk;

    ichunk = *(RC4_CHUNK *) indata;  // !
    ochunk = *(RC4_CHUNK *) outdata; // !
    ...
    mask >>= (sizeof(RC4_CHUNK) - len) << 3;

    ... something about ultrix ...

    ochunk &= ~mask;
    ochunk |= (otp ^ ichunk) & mask;
    *(RC4_CHUNK *) outdata = ochunk; // !!
}

If there remain between one and seven unprocessed bytes, these are taken care of by reading an entire 8-byte word from indata, reading an entire word from outdata, computing a new word made from encrypted input and original out-of-bound values, and writing that word to outdata.
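
The masking arithmetic is compact enough to model directly. Here is a sketch of the tail computation in Python for the 8-byte RC4_CHUNK case (the function name and the little-endian framing are my own, for illustration):

```python
# Model of rc4_enc.c's tail handling for RC4_CHUNK == 8 (little-endian).
# `length` is the number of valid trailing bytes (1..7); the mask covers
# exactly those bytes, so the out-of-bounds high bytes of the output
# word are written back with the values that were just read from them.
CHUNK = 8
M64 = (1 << 64) - 1  # all-ones 64-bit word, i.e. (RC4_CHUNK)-1

def tail_write(out_word, in_word, otp, length):
    mask = M64 >> ((CHUNK - length) << 3)
    ochunk = out_word & ~mask & M64    # preserve the OOB bytes as read
    ochunk |= (otp ^ in_word) & mask   # encrypt only the valid bytes
    return ochunk
```

For instance, with three valid bytes, the low three bytes are replaced by encrypted data while the high five bytes round-trip unchanged.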

What could go wrong? At first glance it seems like this overrun could trigger an unwanted page fault, but that isn’t the case on architectures where an aligned word never crosses a page boundary. However, in a concurrent environment, the illegal writing of illegally read data can be directly observed. We wrote a small program that shows this. The key is a struct that places important data directly after a buffer that is the destination of RC4-encrypted data:

struct stuff {
    unsigned char data[data_len];
    int important;
};

Then, one thread repeatedly puts RC4-encrypted data into the buffer:

void *thread2(void *arg) {
    RC4_KEY key;
    const char *key_data = "Hello there.";
    RC4_set_key(&key, strlen(key_data), (const unsigned char *)key_data);
    while (!stop)
        RC4(&key, data_len, src->data, dst->data);
    return 0;
}

While the other thread waits for the important field to be corrupted:

void *thread1(void *arg) {
    int x = 0;
    while (!stop) {
        dst->important = ++x;
        check(&dst->important, x);
    }
    return 0;
}

The check() function is compiled separately to help ensure that the value being checked comes from RAM rather than being cached in a register:

void check(int *addr, int val) { assert(*addr == val); }

Our complete code is on github. If you run it, you should see the assertion being violated almost immediately. We’ve tested against OpenSSL 1.0.2f on Linux and OS X. On Linux you must make sure to get the C implementation of RC4 instead of the assembly one:

./Configure no-asm linux-x86_64

Although any of tis-interpreter, Valgrind, or ASan can find the OpenSSL RC4 bug when the encrypted data bumps up against the end of an allocated block of memory, none of them finds the bug here, since the “important” field absorbs the OOB accesses. There is still undefined behavior, but it’s subtle: accessing buffer memory via a pointer-to-long violates C’s strict aliasing rules.

So, do fiddly buffer overruns matter? Well, this particular bug is unlikely to have an observable effect on programs written in good faith, though one of us considered submitting, as an Underhanded Crypto Challenge entry, a backdoored chat program with end-to-end encryption that, each time the race condition triggers, sends a copy of the private message being processed to Eve, encrypted with Eve’s key. Alas, the race window is very short. A fiddly OOB can also be a problem if the compiler notices it. In an example from Raymond Chen’s blog, GCC uses the OOB array reference as an excuse to compile this function to “return true”:

int table[4];
bool exists_in_table(int v) {
    // note: the loop reads table[4], one element past the end, when i == 4
    for (int i = 0; i <= 4; i++) {
        if (table[i] == v) return true;
    }
    return false;
}

That could only happen here if we used link-time optimization to give the compiler access to both the application and the OpenSSL code at once.

The Problem with Friendly C

I’ll assume you’re familiar with the Proposal for Friendly C and perhaps also Dan Bernstein’s recent call for a Boring C compiler. Both proposals are reactions to creeping exploitation of undefined behaviors as C/C++ compilers get better optimizers. In contrast, we want old code to just keep working, with latent bugs remaining latent.

After publishing the Friendly C Proposal, I spent some time discussing its design with people, and eventually I came to the depressing conclusion that there’s no way to get a group of C experts — even if they are knowledgeable, intelligent, and otherwise reasonable — to agree on the Friendly C dialect. There are just too many variations, each with its own set of performance tradeoffs, for consensus to be possible. To get a taste of this, notice that in the comments for the Friendly C post, several people disagree with what I would consider an extremely non-controversial design choice for Friendly C: memcpy() should have memmove() semantics. Another example is what should be done when a 32-bit integer is shifted by 32 places (this is undefined behavior in C and C++). Stephen Canon pointed out on Twitter that there are many programs typically compiled for ARM that would fail if this produced something besides 0, and there are also many programs typically compiled for x86 that would fail when this evaluates to something other than the original value. So what is the friendly thing to do here? I’m not even sure; perhaps the baseline should be “do what gcc 3 would do on that target platform.” The situation gets worse when we start talking about a friendly semantics for races and OOB array accesses.
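
The two hardware behaviors Stephen Canon alludes to can be modeled directly (the function names here are made up, and this is a sketch of the instruction semantics, not of any compiler): 32-bit x86 shift instructions use only the low five bits of the shift count, so a shift by 32 leaves the value unchanged, while ARM produces zero for shift counts of 32 and up.

```python
M32 = 0xFFFFFFFF  # 32-bit mask

def shl_x86(x, n):
    # x86 SHL masks the count to its low 5 bits for 32-bit operands,
    # so shifting by 32 is a shift by 0
    return (x << (n & 31)) & M32

def shl_arm(x, n):
    # ARM's LSL yields 0 once the count reaches the register width
    return (x << n) & M32 if n < 32 else 0
```

The two agree for counts 0 through 31 and diverge at exactly the case that C leaves undefined.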

I don’t even want to entertain the idea of a big family of Friendly C dialects, each making some niche audience happy; that is not really an improvement over our current situation.

Luckily there’s an easy way forward, which is to skip the step where we try to get consensus. Rather, an influential group such as the Android team could create a friendly C dialect and use it to build the C code (or at least the security-sensitive C code) in their project. My guess is that if they did a good job choosing the dialect, others would start to use it, and at some point it becomes important enough that the broader compiler community can start to help figure out how to better optimize Friendly C without breaking its guarantees, and maybe eventually the thing even gets standardized. There’s precedent for organizations providing friendly semantics; Microsoft, for example, provides stronger-than-specified semantics for volatile variables by default on platforms other than ARM.

Since we published the Friendly C proposal, people have been asking me how it’s going. This post is a long-winded way of saying that I lost faith in my ability to push the work forward. However, I still think it’s a great idea and that there are people besides me who can make it happen.

Reducers are Fuzzers

A test case isn’t just a test case: it lives in the (generally extremely large) space of inputs for the software system you are testing. If we have a test case that triggers a bug, one way to look at it is as a single point surrounded by a region of the input space whose points all trigger that same bug.

The set of test cases triggering a bug is a useful notion since we can search it. For example, a test case reducer is a program that searches for the smallest test case triggering a bug. It requires a way to transform a test case into a smaller one, for example by deleting part of it. The new variant of the test case may or may not trigger the bug; if it does, it becomes the new basis for further reduction, and the process repeats until no transformation makes progress.
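
The loop just described can be sketched in a few lines of Python. This is a greedy chunk-deleting reducer in the spirit of delta debugging, not C-Reduce’s actual algorithm (the names are mine):

```python
def reduce_case(test_case, interesting):
    # Greedily delete chunks of decreasing size, keeping any smaller
    # variant that still satisfies the interestingness test.
    assert interesting(test_case)
    chunk = max(len(test_case) // 2, 1)
    while chunk >= 1:
        i = 0
        while i < len(test_case):
            variant = test_case[:i] + test_case[i + chunk:]
            if variant != test_case and interesting(variant):
                test_case = variant  # keep the smaller variant
            else:
                i += chunk           # this chunk is needed; move on
        chunk //= 2
    return test_case
```

Each call to interesting() is one execution of the system under test, and those executions are exactly where unrelated, incidental bugs get triggered.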

I’ve spent a lot of time watching reducers run, and one thing I’ve noticed is that the reduction process often triggers bugs unrelated to the bug that is the subject of the reduction.

Sometimes this is undesirable, such as when a lax interestingness test permits the reduction of one bug to get hijacked by a different bug. This happens all the time when reducing segfaults, which are hard to tell apart. But on the other hand, if we’re looking for bugs then this phenomenon is a useful one.

It seems a bit counterintuitive that test case reduction would lead to the discovery of new bugs, since we might expect that the space of inputs to a well-tested software system is mostly non-bug-triggering, with a few isolated pockets of bug-triggering inputs scattered here and there. I’m afraid that view might not be realistic. Rather, all of the inputs we usually see occupy a tiny portion of the space, and that portion is surrounded by huge overlapping clouds of bug-triggering inputs. Fuzzers can push the boundaries of the space of inputs that we can test, but not by as much as people generally think. Proofs remain the only way to actually show that a piece of software does the right thing across any significant chunk of its input space. But I digress. The important fact is that reducers are decently effective mutation-based fuzzers.

In the rest of this post I’d like to push that idea a bit farther by doing a reduction that doesn’t correspond to a bug and seeing if we can find some bugs along the way. We’ll start with this exciting C++ program

#include <iostream>

int main() {
  std::cout << "Hello World!++" << std::endl;
}

and we’ll reduce it under the criterion that it remains a viable Hello World implementation. First we’ll preprocess and then use delta’s topformflat to smoosh everything together nicely:

g++ -std=c++11 -E -P -O3 hello-orig.cpp | topformflat > hello.cpp

You might be saying to yourself something like “it sure is stupid that John is passing an optimization flag to the preprocessor,” but trust me that it actually does change the emitted code. I didn’t check if it makes a difference here but I’ve learned to avoid trouble by just passing the same flags to the preprocessor as to the compiler.

Anyhow, the result is 550 KB of goodness.

Here’s the code that checks if a variant is interesting — that is, if it acts like a Hello World implementation:

g++ -O3 -std=c++11 -w hello.cpp >/dev/null 2>&1 &&
ulimit -t 1 &&
./a.out | grep Hello

The ulimit is necessary because infinite loops sometimes get into the program that is being reduced.

To find compiler crashes we’ll need a somewhat more elaborate test:

if
  g++ -O3 -std=c++11 -w hello.cpp >compiler.out 2>&1
then
  ulimit -t 1 &&
  ./a.out | grep Hello
else
  if
    grep 'internal compiler error' compiler.out
  then
    exit 101
  else
    exit 1
  fi
fi

When the compiler fails we look at its output and, if it contains evidence of a compiler bug, exit with code 101, which will tell C-Reduce that it should save a copy of the input files that made this happen.

The compiler we’ll use is g++ r231221, the development head from December 3, 2015. Let’s get things going:

creduce --nokill --also-interesting 101 --no-default-passes \
  --add-pass pass_clex rm-tok-pattern-4 10 ../test.sh hello.cpp

The --also-interesting 101 option indicates that the interestingness test will use process exit code 101 to tell C-Reduce to make a snapshot of the directory containing the files being reduced, so we can look at it later. --no-default-passes clears C-Reduce’s pass schedule, and --add-pass pass_clex rm-tok-pattern-4 10 adds a single pass that uses a small sliding window to remove tokens from the test case. The issue here is that not all of C-Reduce’s passes are equally effective at finding bugs. Some passes, such as the one that removes dead variables and the one that removes dead functions, will probably never trigger a compiler bug. Other passes, such as the one that removes chunks of lines from a test case, eliminate text from the test case so rapidly that effective fuzzing doesn’t happen. There are various ways to deal with this problem, such as probabilistically rejecting improvements or rejecting improvements that are too large, but for this post I’ve chosen the simple expedient of running just one pass that (1) makes progress very slowly and (2) seems to be a good fuzzer.

The dynamics of this sort of run are interesting: as the test case walks around the space of programs, you can actually see it brush up against a compiler bug and then wander off into the weeds again.

The results are decent: after about 24 hours, C-Reduce caused many segfaults in g++ and triggered six different internal compiler errors: 1, 2, 3, 4, 5, 6. One of these was already reported, another looks probably like a duplicate, and four appear to be new.

I did a similar run against Clang++ r254595, again the development head from December 3. This produced segfaults and also triggered 25 different assertions:

llvm/include/llvm/Support/Casting.h:230: typename cast_retty::ret_type llvm::cast(Y &) [X = clang::FunctionProtoType, Y = clang::QualType]: Assertion `isa(Val) && "cast() argument of incompatible type!"' failed.
llvm/include/llvm/Support/Casting.h:95: static bool llvm::isa_impl_cl::doit(const From *) [To = clang::FunctionTemplateDecl, From = const clang::Decl *]: Assertion `Val && "isa<> used on a null pointer"' failed.
llvm/tools/clang/lib/AST/Decl.cpp:2134: clang::APValue *clang::VarDecl::evaluateValue(SmallVectorImpl &) const: Assertion `!Init->isValueDependent()' failed.
llvm/tools/clang/lib/AST/Decl.cpp:2181: bool clang::VarDecl::checkInitIsICE() const: Assertion `!Init->isValueDependent()' failed.
llvm/tools/clang/lib/AST/ExprCXX.cpp:451: static clang::DependentScopeDeclRefExpr *clang::DependentScopeDeclRefExpr::Create(const clang::ASTContext &, clang::NestedNameSpecifierLoc, clang::SourceLocation, const clang::DeclarationNameInfo &, const clang::TemplateArgumentListInfo *): Assertion `QualifierLoc && "should be created for dependent qualifiers"' failed.
llvm/tools/clang/lib/AST/../../include/clang/AST/TypeNodes.def:98: clang::TypeInfo clang::ASTContext::getTypeInfoImpl(const clang::Type *) const: Assertion `!T->isDependentType() && "should not see dependent types here"' failed.
llvm/tools/clang/lib/CodeGen/CodeGenModule.cpp:623: llvm::StringRef clang::CodeGen::CodeGenModule::getMangledName(clang::GlobalDecl): Assertion `II && "Attempt to mangle unnamed decl."' failed.
llvm/tools/clang/lib/CodeGen/../../include/clang/AST/Expr.h:134: void clang::Expr::setType(clang::QualType): Assertion `(t.isNull() || !t->isReferenceType()) && "Expressions can't have reference type"' failed.
llvm/tools/clang/lib/Lex/PPCaching.cpp:101: void clang::Preprocessor::AnnotatePreviousCachedTokens(const clang::Token &): Assertion `CachedTokens[CachedLexPos-1].getLastLoc() == Tok.getAnnotationEndLoc() && "The annotation should be until the most recent cached token"' failed.
llvm/tools/clang/lib/Parse/../../include/clang/Parse/Parser.h:2256: void clang::Parser::DeclaratorScopeObj::EnterDeclaratorScope(): Assertion `!EnteredScope && "Already entered the scope!"' failed.
llvm/tools/clang/lib/Sema/../../include/clang/AST/DeclTemplate.h:1707: void clang::ClassTemplateSpecializationDecl::setInstantiationOf(clang::ClassTemplatePartialSpecializationDecl *, const clang::TemplateArgumentList *): Assertion `!SpecializedTemplate.is() && "Already set to a class template partial specialization!"' failed.
llvm/tools/clang/lib/Sema/../../include/clang/Sema/Lookup.h:460: clang::NamedDecl *clang::LookupResult::getFoundDecl() const: Assertion `getResultKind() == Found && "getFoundDecl called on non-unique result"' failed.
llvm/tools/clang/lib/Sema/SemaDecl.cpp:10455: clang::Decl *clang::Sema::ActOnParamDeclarator(clang::Scope *, clang::Declarator &): Assertion `S->isFunctionPrototypeScope()' failed.
llvm/tools/clang/lib/Sema/SemaDeclCXX.cpp:11373: ExprResult clang::Sema::BuildCXXDefaultInitExpr(clang::SourceLocation, clang::FieldDecl *): Assertion `Lookup.size() == 1' failed.
llvm/tools/clang/lib/Sema/SemaExpr.cpp:2274: ExprResult clang::Sema::ActOnIdExpression(clang::Scope *, clang::CXXScopeSpec &, clang::SourceLocation, clang::UnqualifiedId &, bool, bool, std::unique_ptr, bool, clang::Token *): Assertion `R.getAsSingle() && "There should only be one declaration found."' failed.
llvm/tools/clang/lib/Sema/SemaExprCXX.cpp:2272: clang::FunctionDecl *clang::Sema::FindUsualDeallocationFunction(clang::SourceLocation, bool, clang::DeclarationName): Assertion `Matches.size() == 1 && "unexpectedly have multiple usual deallocation functions"' failed.
llvm/tools/clang/lib/Sema/SemaExprCXX.cpp:6663: ExprResult clang::Sema::CorrectDelayedTyposInExpr(clang::Expr *, clang::VarDecl *, llvm::function_ref): Assertion `TyposInContext < ~0U && "Recursive call of CorrectDelayedTyposInExpr"' failed.
llvm/tools/clang/lib/Sema/SemaExprMember.cpp:91: IMAKind ClassifyImplicitMemberAccess(clang::Sema &, const clang::LookupResult &): Assertion `!R.empty() && (*R.begin())->isCXXClassMember()' failed.
llvm/tools/clang/lib/Sema/SemaLookup.cpp:1904: bool clang::Sema::LookupQualifiedName(clang::LookupResult &, clang::DeclContext *, bool): Assertion `(!isa(LookupCtx) || LookupCtx->isDependentContext() || cast(LookupCtx)->isCompleteDefinition() || cast(LookupCtx)->isBeingDefined()) && "Declaration context must already be complete!"' failed.
llvm/tools/clang/lib/Sema/SemaLookup.cpp:2729: Sema::SpecialMemberOverloadResult *clang::Sema::LookupSpecialMember(clang::CXXRecordDecl *, clang::Sema::CXXSpecialMember, bool, bool, bool, bool, bool): Assertion `CanDeclareSpecialMemberFunction(RD) && "doing special member lookup into record that isn't fully complete"' failed.
llvm/tools/clang/lib/Sema/SemaOverload.cpp:11671: ExprResult clang::Sema::CreateOverloadedBinOp(clang::SourceLocation, unsigned int, const clang::UnresolvedSetImpl &, clang::Expr *, clang::Expr *): Assertion `Result.isInvalid() && "C++ binary operator overloading is missing candidates!"' failed.
llvm/tools/clang/lib/Sema/SemaTemplate.cpp:2906: ExprResult clang::Sema::BuildTemplateIdExpr(const clang::CXXScopeSpec &, clang::SourceLocation, clang::LookupResult &, bool, const clang::TemplateArgumentListInfo *): Assertion `!R.empty() && "empty lookup results when building templateid"' failed.
llvm/tools/clang/lib/Sema/SemaTemplateDeduction.cpp:609: (anonymous namespace)::PackDeductionScope::PackDeductionScope(clang::Sema &, clang::TemplateParameterList *, SmallVectorImpl &, clang::sema::TemplateDeductionInfo &, clang::TemplateArgument): Assertion `!Packs.empty() && "Pack expansion without unexpanded packs?"' failed.
llvm/tools/clang/lib/Sema/SemaTemplateInstantiate.cpp:2781: llvm::PointerUnion *clang::LocalInstantiationScope::findInstantiationOf(const clang::Decl *): Assertion `isa(D) && "declaration not instantiated in this scope"' failed.
llvm/tools/clang/lib/Sema/SemaTemplateVariadic.cpp:290: bool clang::Sema::DiagnoseUnexpandedParameterPack(clang::Expr *, clang::Sema::UnexpandedParameterPackContext): Assertion `!Unexpanded.empty() && "Unable to find unexpanded parameter packs"' failed.

I have to admit that I felt a bit overwhelmed by 25 potential bug reports, and I haven’t reported any of these yet. My guess is that a number of them are already in the bugzilla since people have been fuzzing Clang lately. Anyway, I’ll try to get around to reducing and reporting these. Really, this all needs to be automated so that when subsequent reductions find still more bugs, these just get added to the queue of reductions to run.

If you were interested in reproducing these results, or in trying something similar, you would want to use C-Reduce’s master branch. I ran everything on an Ubuntu 14.04 box. While preparing this post I found that different C-Reduce command line options produced widely varying numbers and kinds of crashes.

Regarding previous work, I believe — but couldn’t find written down — that the CERT BFF watches out for new crashes when running its reducer. In a couple of papers written by people like Alex Groce and me, we discussed the fact that reducers often slip from triggering one bug to another.

The new thing in this post is to show that triggering new bugs while reducing isn’t just some side effect. Rather, we can go looking for trouble, and we can do it without being given a bug to reduce in the first place. A key enabler for easy bug-finding with C-Reduce was finding a simple communication mechanism by which the interestingness test can give C-Reduce a bit of out-of-band information that a variant should be saved for subsequent inspection. I’m not trying to claim that reducers are awesome fuzzers, but on the other hand, it might be a little bit awesome that mutating Hello World resulted in triggering 25 different assertion violations in a mature and high-quality compiler. I bet we’d have done even better by starting with a nice big fat Boost application.