The Last Mile for a Verification Problem

A few weeks ago my student Lu Zhao defended his PhD thesis and we were lucky to get his external committee member, Magnus Myreen, to attend in person since he was in the USA anyway for a meeting. While he was out here, Magnus gave a talk on some not-yet-published work that is really cool.

The background for Magnus’s work is L4.verified, a tour de force of formal verification where a microkernel was proved to be correct. Now the first thing you should always ask when anyone says that anything was “proved to be correct” is: What was proved about what, and how? The question is important because historically speaking (and currently, but especially in the past) these words have been used rather loosely. In the case of L4.verified, “proved to be correct” means roughly that:

The implementation of a microkernel in C, whose meaning was determined using a formal semantics for C that was developed by the L4.verified researchers, was proved using Isabelle/HOL to implement a formal specification of the L4.verified kernel. Residual unproved assumptions remain with respect to some inline assembly language and also some unformalized aspects of ARM hardware such as its virtual memory subsystem.

Where could this go wrong? It could be the case that the high-level specification does not formalize what was intended. It could be the case that incorrect assumptions were made about the hardware and about the inline assembly that manipulates it. It could be the case that the C formalization fails to match up with the compiler that is used. This mismatch could be due to an error in the formalization or an error in the compiler. Of course, it is well-known that compilers contain bugs. Finally, the Isabelle/HOL prover could itself be incorrect, though most people would consider this highly unlikely.

Magnus’s work (done in conjunction with Thomas Sewell of NICTA, and currently in progress) is focused on eliminating the dependency on a correct C formalization and C compiler. Of course, L4.verified still has to run on some target machine. Their idea is to replace the C abstract machine (augmented with ARM-specific stuff) with the formal model of the ARM instruction set that was developed at Cambridge at least ten years ago. This model has been used in a variety of projects and is believed to be of high quality.

The most obvious way to eliminate L4.verified’s trusted C compiler is to use CompCert, and in fact Magnus and Thomas tried to do that. This line of work was stopped by incompatibilities between CompCert’s view of the meaning of C and L4.verified’s, and also by practical differences between Isabelle/HOL and Coq — the latter prover being the one used by CompCert. It wasn’t clear to me how much effort would have been required to overcome these problems.

The second approach that Thomas and Magnus tried is decompiling the L4.verified kernel’s ARM object code into math. In other words, they treat each machine code function as a mathematical function. Decompilation was the topic of Magnus’s PhD work and he achieved some very impressive results using it. There are two obvious problems with the decompilation approach: loops and side effects. Loops are simply decompiled into tail recursion, and side-effecting functions can be handled by treating the machine’s memory (or part of it) as an explicit input to and output from each function. Plenty of details are found in Magnus’s papers and in his PhD thesis. The next step is to connect the decompiled functions with their equivalent functions as described by the C semantics, and this is accomplished by some rewriting steps and finally using an SMT solver to do some of the dirty work. Magnus described this last part, connecting the C code with the decompiled code, as working for 65% of the functions in L4.verified. Of course, we might expect that it works for the easiest 65% of functions, so it could be the case that a significant amount of work remains, and my guess is that the last dozen or so functions will turn into serious headaches.

I was very curious whether they had so far run into any compiler bugs and it seems that they have not. On the other hand, this is expected since compiler bugs will force the equivalence check to fail, meaning that all miscompiled functions are still hiding in the 35% of functions whose equivalence has not yet been verified. My guess is that it’s substantially likely that they will not run into any serious compiler bugs while doing this work, for two reasons. First, the L4.verified kernel has been tested fairly heavily already, and has been subjected to worst-case execution time analysis at the object code level. It seems likely that serious miscompilations would have already been discovered. Second, L4.verified is coded in a somewhat careful subset of C that eschews, for example, bitfields in favor of explicit shifts and masks, side-stepping a common area where compiler bugs are found. It seems likely that they might run into a few fairly harmless miscompilations such as illegally reordered memory operations where the reordering doesn’t matter that much.

Anyway, it was great having Magnus out in Utah and hearing about this work, which represents a nice improvement on an already extremely impressive piece of work.

Does Portable Byte-Order Code Optimize?

When reading data whose byte-ordering may differ from the host computer’s, Rob Pike advocates writing portable code as opposed to using #ifdefs to optimize the case where the encoded data’s byte ordering matches the host. His arguments seem reasonable and it’s definitely a win if the compiler can transparently recognize the useless data extraction operations and optimize them away. Rob’s code is basically this:

int read_little_endian_int (unsigned char *data)
{
  int i = 
    (data[0]<<0) | (data[1]<<8) | 
    (data[2]<<16) | (data[3]<<24);
  return i;
}

On a little-endian host, this code can be simplified to just a 32-bit load. But do compilers actually perform this optimization? It turns out not. Clang (built today from trunk) gives this:

read_little_endian_int:
 movzbl (%rdi), %ecx
 movzbl 1(%rdi), %eax
 shll $8, %eax
 orl %ecx, %eax
 movzbl 2(%rdi), %ecx
 shll $16, %ecx
 orl %eax, %ecx
 movzbl 3(%rdi), %eax
 shll $24, %eax
 orl %ecx, %eax
 ret

GCC (built a few days ago) and Intel’s compiler (12.0.5) produce very similar code.

Does the sub-optimality of this code matter? Probably not if we’re reading from a slow device, but it does matter if we’re reading from RAM. My Core i7 920 (running in 64-bit mode) can copy a large block of data using 32-bit operations at about 7 GB/s, but reaches only about 2 GB/s when running the data through the code emitted by Clang. The results for GCC and Intel CC are similar.

But now the plot thickens slightly. If we wish to check that the function above is implemented and compiled correctly, we might write a program with this main function:

int main (void)
{
  int i = INT_MIN;
  while (1) {
    int i2 = read_little_endian_int ((unsigned char *)&i);
    if (i != i2) {
      printf ("oops %d != %d\n", i, i2);
    }
    if (i == INT_MAX) break;
    i++;
  }
  return 0;
}

GCC and Intel CC compile this in the obvious way, but Clang (at -O3) turns this main function into a nop:

 main:
 xorl %eax, %eax
 ret

Now why would Clang understand that read_little_endian_int is a nop when it is used in this context, but fail to emit the obvious code for the function itself? I’m not sure what to make of this. Maybe someone familiar with LLVM’s optimizers can help out here. The full code is here if anyone wants to play with it.

57 Small Programs that Crash Compilers

It’s not clear how many people enjoy looking at programs that make compilers crash — but this post is for them (and me). Our paper on producing reduced test cases for compiler bugs contained a large table of results for crash bugs. Below are all of C-Reduce’s reduced programs for those bugs.

Can we conclude anything just by looking at these? It’s hard to say… many of these C fragments are not obviously hard to compile — to see the problem we would need to know the details of the translation to a particular compiler’s intermediate representation.

In general, we don’t know of any way to make these programs much smaller. In other words, C-Reduce already implements most of the tricks we can think of. It will always be the case that an experienced compiler developer who understand a particular bug will be able to produce considerably better reduced test cases than C-Reduce can. Our goal, rather, is to create tests that a naive user cannot improve very much. So if you see interesting opportunities to improve these test cases, we’d love to hear about it. The current version of C-Reduce fails to implement optimizations such as constant folding, constant propagation, and loop peeling. We haven’t seen much need for these, though.

These are the verbatim tool output; there are definitely some formatting warts.

C1 : Crashes Clang 2.6 at -O0:

#pragma pack(1)
struct S1 {
  int f0;
  char f2
};
struct {
  struct S1 f0
    }
a[] = { 0 }
      ;

C2 : Crashes Clang 2.6 at -O2:

struct S0 {
  int f0:1;
  int f4
}
a;
void
fn1 () {
  struct S0 b[][1][1] = { 0 };
  b[0][0][0] = a;
}

C3 : Crashes Clang 2.6 at -O2:

unsigned short a;
int b;
char c;
short
fn1 () {
  return 1 / a;
}
void
fn2 () {
  b = fn1 ();
  char d = b;
  c = d % 3;
}

C4 : Crashes Clang 2.6 at -O3:

int a, b, c, d, e;
#pragma pack(1)
struct S0 {
  int f0:14;
  int f1:13;
  int f2:28;
  int f3:23;
  int f4:12
};
void fn1 (struct S0);
void
fn2 () {
  int f;
lbl_2311:
  ;
  struct S0 g = { 0, 0, 1 };
  fn1 (g);
  b && e;
  for (; c;) {
    if (d)
      goto lbl_2311;
    f = a && 1 ? 0 : 1;
    g.f4 = f;
  }
}

C5 : Crashes Clang 2.6 at -O2:

 int crc32_context, g_2 = 0, g_5;
int g_8;
int *g_39, *g_371;
int g_81;
int func_1_l_15 ;
static short safe_add_func_int16_t_s_s ( short si1, int si2 ) {
    return si1 > 67 ? si1 : si1 + si2;
  }
   
   
   
   
    
   
    static int func_1 (  ) {
    int l_462 = 0;
    g_2 = 0;
    for ( ;
  g_2 < 12;
  g_2 = safe_add_func_int16_t_s_s ( g_2, 5 ) )     {
       g_5 = 1;
       for ( ;
 g_5;
 ++g_5 ) 	{
 	  g_8 = 1;
 	  for ( ;
 g_8 >= 0;
 g_8 = g_8 - 1 ) 	    {
 	      func_1_l_15 = 1;
  	      for ( ;
 func_1_l_15;
 		    func_1_l_15 =   func_1_l_15  - 1  ) 		if ( g_8 ) 		  break;
 	    }
 	  g_371 = &l_462;
 	  int *l_128 = &g_81;
 	  *l_128 = *g_39;
 	}
 *g_371 =    0 != 0   ;
     }
    return 0;
  }
   int main (  ) {
    func_1 (  );
    crc32_context = g_2;
    crc32_context += g_5;
  }
  

C6 : Crashes Clang 2.6 at -O0:

#pragma pack(1)
struct S2 {
  int f1;
  short f4
};
struct S3 {
  struct S2 f1;
  int f3:14
};
struct {
  struct S3 f3
    }
a = { 0, 0, 0 };

C7 : Crashes Clang 2.6 at -O1:

int *a;
static int **b;
int c, d, e;
void
fn1 () {
  d = &b == c;
  for (;;) {
    int **f = &a;
    if (e) {
    } else
      b = f;
    if (**b)
      continue;
    **f;
  }
}

C8 : Crashes Clang 2.6 at -O1:

#pragma pack(1)
struct S0 {
  int f3;
  char f4
};
struct {
  struct S0 f6;
  int f8
}
a = { 0, 0, 0 };

C9 : Crashes Clang 2.6 at -O2:

struct S0 {
  int f0;
  int f1;
  short f3;
  int f7;
  int f8
}
b;
int a, c, d, e, f;
void
fn1 (struct S0 p1) {
  d++;
  c = p1.f8;
  e = 0;
  a = p1.f7;
}
void
fn2 () {
  e = 0;
  for (; e; e++) {
    if (d)
      for (;;) {
      }
    --f;
  }
  fn1 (b);
}

C10 : Crashes Clang 2.6 at -O1:

union U2 {
  int f0;
  unsigned short f2
}
b;
static int a = 1;
void
fn1 (int p1, unsigned short p2) {
}
int fn2 (union U2);
union U2 fn3 ();
static unsigned long long
fn5 () {
  fn1 (b.f2, b.f0);
  return 0;
}
static char
fn4 () {
  fn5 ();
  return 0;
}
int
main () {
  a || fn2 (fn3 (fn4 () ) );
}

C11 : Crashes Clang 2.7 at -O1:

int *a;
static int **b;
int c, d, e;
void
fn1 () {
  d = &b == c;
  for (;;) {
    int **f = &a;
    if (e) {
    } else
      b = f;
    if (**b)
      continue;
    **f;
  }
}

C12 : Crashes Clang 2.7 at -O0:

char a;
unsigned char b;
int c;
void
fn1 () {
  (b ^= c) != a;
}

C13 : Crashes Clang 2.7 at -O2:

int a, b;
void fn1 ();
void
fn2 (short p1) {
  short c;
  c = (65532 | 3) + p1;
  fn1 (c && 1);
  b = (0 == p1) * a;
}

C14 : Crashes GCC 3.2.0 at -O1:

void
fn1 () {
  struct S0 *a;
  struct S0 *b, *c = &a;
  struct S0 **d = &c;
  if (&b == &a) {
  }
}

C15 : Crashes GCC 3.2.0 at -O3:

volatile int a, b, c, i;
char d;
void
fn1 () {
  int e;
  {
    for (;; c++) {
      int f[50] = { };
      if (b) {
        {
          0;
          {
            {
              int g = a, h = d;
              e = h ? g : g / 0;
            }
          }
          a = e;
        }
      }
    }
  }
}
void
main () {
  i = 0 / 0;
  a;
}

C16 : Crashes GCC 3.2.0 at -O3:

int a, c;
volatile int b;
void
fn1 () {
  b;
  for (;;)
    break;
  int d = b, e = a;
  c = a ? d : d % 0;
}
void
fn2 () {
  if (0 % 0)
    b;
}

C17 : Crashes GCC 3.2.0 at -O2:

union U1 {
  int f0;
  char f1
};
void
fn1 (union U1 p1) {
  p1.f1 = 0;
  for (; p1.f1;) {
  }
}

C18 : Crashes GCC 3.2.0 at -O1:

int a, b;
void
fn1 () {
  b = 4294967290UL <= a | b;
}

C19 : Crashes GCC 3.2.0 at -O3:

int a, b, c;
int
fn1 (int p1, int p2) {
  return p1 - p2;
}
void
fn2 () {
  int d;
  int **e;
  int ***f = &e;
  d = a && b ? a : a % 0;
  if (fn1 (f == 0, 2) )
    c = ***f;
}

C20 : Crashes GCC 3.3.0 at -O3:

int a, b, d;
struct S0 {
  int f3
};
int *volatile c;
void fn1 (struct S0);
void
fn2 () {
  int e;
  struct S0 **f;
  struct S0 ***g = &f;
  (a && b && b ? 0 : b) > (&c && 0);
  e = 0 == g;
  d = e >> 1;
  for (;;)
    fn1 (***g);
}

C21 : Crashes GCC 3.4.0 at -O3:

int a, b;
struct U0 {
  char f0;
  int f2
};
void
fn1 () {
  struct U0 c;
  for (; c.f0 != 1; c.f0 = c.f0 + a)
    b -= 1;
}

C22 : Crashes GCC 3.4.0 at -O3:

int a, b, d, e;
struct S0 {
  int f3
};
int *c;
void fn1 (struct S0);
void
fn2 () {
  struct S0 **f;
  struct S0 ***g = &f;
  (a && b && b ? 0 : b) > (&c == d);
  e = 1 < (0 == g);
  for (;;)
    fn1 (***g);
}

C23 : Crashes GCC 4.0.0 at -O2:

int ***a;
int b;
int *c;
void
main () {
  if (&c == a)
    b = 0 == *a;
}

C24 : Crashes GCC 4.0.0 at -O2:

int a[][0];
int *const b = &a[0][1];
int
fn1 () {
  return *b;
}

C25 : Crashes GCC 4.0.0 at -O0:

int a, b;
unsigned char c;
void
fn1 () {
  (0 >= a & (0 || b) ) > c;
}

C26 : Crashes GCC 4.0.0 at -O1:

struct {
  int f9:1
}
a;
const int b[] = { 0 };
void fn1 ();
void
main () {
  for (;;) {
    a.f9 = b[0];
    fn1 ();
  }
}

C27 : Crashes GCC 4.0.0 at -O0:

int a, c;
unsigned char b;
void
fn1 () {
  b > (c > 0 & 0 < a);
}

C28 : Crashes GCC 4.0.0 at -O2:

int **a[][0];
static int ***const b = &a[0][1];
void fn1 ();
int
fn2 () {
  return ***b;
  fn1 ();
}
void
fn1 () {
  **b;
}

C29 : Crashes GCC 4.1.0 at -O1:

volatile int ***a;
int b;
int **c;
void
fn1 () {
  if (&c == a)
    b = 0 == *a;
}

C30 : Crashes GCC 4.1.0 at -O1:

struct {
  int f0;
  int f2
}
a;
int b;
void
fn1 () {
  a.f2 = 0;
  int *c[] = { 0, 0, 0, 0, &a.f0, 0, 0, 0, &a.f0 };
  b = *c[4];
}

C31 : Crashes GCC 4.1.0 at -O2:

int a, b;
unsigned c;
void
fn1 () {
  for (; c <= 0;)
    if (b < c)
      a = 1 && c;
}

C32 : Crashes GCC 4.1.0 at -O1:

unsigned a;
int b;
void
main () {
  unsigned c = 4294967295;
  int d = c;
  b = a <= d || a;
}

C33 : Crashes GCC 4.1.0 at -O1:

const volatile long a;
void
main () {
  printf ("%d\n", (int) a);
}

C34 : Crashes GCC 4.1.0 at -O3:

int a, b;
union U1 {
  int f0;
  int f1
};
void
fn1 () {
  union U1 c = { 1 };
  int d = 1;
  if ( (c.f1 & a ? c.f1 : 1 - a) ^ d) {
  } else
    b = 0;
}

C35 : Crashes GCC 4.2.0 at -O1:

volatile int ***a;
int b;
int **c;
void
fn1 () {
  if (&c == a)
    b = 0 == *a;
}

C36 : Crashes GCC 4.2.0 at -O1:

struct S2 {
  volatile int f5:1;
  int f6
};
static struct S2 a;
void
main () {
  printf ("%d\n", a.f5);
}

C37 : Crashes GCC 4.3.0 at -O1:

long long *a;
int b;
void
fn1 () {
  long long **c = &a;
  int d = 7;
lbl_2890: {
    long long **e = &a;
    b = (e == c) < d;
    d = 0;
    goto lbl_2890;
  }
}

C38 : Crashes GCC 4.3.0 at -O2:

struct S2 {
  volatile int f5:1;
  int f6
};
static struct S2 a;
void
main () {
  printf ("%d\n", a.f5);
}

C39 : Crashes GCC 4.3.0 at -O3:

int a;
short b;
void
fn1 () {
  int c[0];
  for (;;) {
    a = c[0];
    b = 0;
    for (; b < 7; b += 1)
      c[b] = 0;
  }
}

C40 : Crashes GCC 4.3.0 at -O1:

volatile int **a;
int *b;
void
fn1 () {
  if (a == &b)
    **a;
}

C41 : Crashes GCC 4.3.0 at -O3:

int a, b, c, d, e, f;
void
fn1 () {
  char g;
lbl_120:
  if (b || e >= 0 & d >= 0 || a)
    return;
  g = f < 0 ? 1 : f;
  d = g == 0 || (char) f == 0 && g == 1 ? 0 : 0 % 0;
  if (c)
    goto lbl_120;
}

C42 : Crashes Intel CC 12.0.5 at -O1:

struct U0 {
  int f0
}
a;
struct U0
fn1 () {
  return a;
}
void
main () {
  0 > a.f0;
  fn1 ();
}

C43 : Crashes Open64 4.2.4 at -O3:

int a;
int *b;
unsigned c;
void
fn1 () {
  for (; a; a--)
    if (*b) {
      c = 0;
      for (; c >= 5; c++) {
      }
    }
}

C44 : Crashes Open64 4.2.4 at -O3:

short a;
void
fn1 () {
  long b;
  b = 44067713550;
  a |= b;
}

C45 : Crashes Open64 4.2.4 at -O3:

volatile int a;
void
fn1 () {
  int b = 1;
  a || b--;
}

C46 : Crashes Open64 4.2.4 at -O2:

int a, b;
void fn1 ();
void fn2 ();
void
fn3 () {
  fn2 ();
  fn1 ();
}
void
fn2 () {
  if (1) {
  } else
    for (;; b++) {
      int c = 0;
      int *d = &a;
      int **e = &d;
      *e = &c;
      *d = 0;
      *d |= 0;
    }
}

C47 : Crashes Open64 4.2.4 at -O3:

struct S0 {
  int f1:1
};
int a, b;
void
fn1 () {
  for (; b;) {
    struct S0 c = { };
    if (1) {
      c = c;
      a = c.f1;
    }
  }
}

C48 : Crashes Open64 4.2.4 at -O3:

int a, b;
int
fn1 () {
  int *c = &b;
  a = 0;
  for (; a >= -26; --a) {
    unsigned d = 18446744073709551615;
    int *e = &b;
    *e &= d;
  }
  return *c;
}

C49 : Crashes Open64 4.2.4 at -O3:

static int a, c, d;
int b;
int *e;
void
fn1 () {
  for (; a; a += 1) {
    b = 0;
    for (; b > -16; --b)
      for (; c;) {
        int *f = &d;
        *f = 0;
      } *e = 0;
  }
}

C50 : Crashes Sun CC 5.11 at -xO4:

unsigned char a, d;
struct {
  int f2
}
b;
int c, e;
void
fn1 (p1) {
}
void
fn2 () {
  c = 0;
  for (; c <= 0;)
    e = b.f2;
  fn1 (0);
  b = b;
  d = -a;
}

C51 : Crashes Sun CC 5.11 at -fast:

int a, c;
int b[1];
void
fn1 () {
  short d;
  for (; a; a -= 1) {
    d = b = b;
    b[0] = 0;
  }
}

C52 : Crashes Sun CC 5.11 at -xO4:

int a, b, d;
short c;
int
fn1 (p1) {
  return a ? 0 : p1;
}
void
fn2 () {
  int e = 0;
  for (;;) {
    c = 0;
    d = fn1 (e ^ ~~c);
    d && b;
  }
}

C53 : Crashes Sun CC 5.11 at -fast:

long a;
int b, d;
int *c;
void
fn1 () {
  int *e;
  for (;; b--)
    for (; d;) {
      *c = 0;
      *c &= (&e != 1) / a;
    }
}

C54 : Crashes Sun CC 5.11 at -xO0:

#pragma pack(1)
struct {
  int f3:1;
  int f4:16
}
a = { 1, 0 };

C55 : Crashes Sun CC 5.11 at -xO3:

int a, c;
static int b = 1;
void fn1 ();
void
fn2 () {
  for (; a; a--) {
    c = 0;
    for (; c != 1;) {
      if (b)
        break;
      fn1 ();
    }
  }
}

C56 : Crashes Sun CC 5.11 at -xO4:

#pragma pack(1)
struct S0 {
  int f1;
  int f3:1
}
a;
void
fn1 (struct S0 p1) {
  p1.f3 = 0;
}
void
fn2 () {
  fn1 (a);
}

C57 : Crashes Sun CC 5.11 at -fast:

int a, c, d, e, f, g, h, i, j, k;
volatile int b;
int
fn1 () {
  for (; d; d = a) {
    int *l = &c;
    c = -3;
    for (; c > -23; --c)
      if (k) {
        if (*l)
          continue;
        return b;
      }
    for (; i; ++i) {
      j = 0;
      g = h;
      for (; f <= 1; f += 1) {
      }
    }
  }
  return e;
}

New Paper on Test-Case Reduction

For about the last four years my group has struggled with the problem of taking a large program that makes a compiler do something bad (crash, run forever, generate incorrect code) and turning it into a minimum-sized program that elicits the same bad behavior. Delta is a great tool but it does not entirely solve the problem. Anyway, the short version of the story is that we spent a ton of time on this problem and only made good headway during the last year or so.

Our various approaches and their results are written up in a paper that I’ll present at PLDI 2012 in Beijing in June. I’m pretty happy with it. The audience is narrow but one of the new tools, C-Reduce (we’ll announce a release here shortly), really nails the problem and produces output far smaller than any tool anyone else has produced.

It had been a long time (eight years, perhaps) since I took both the technical lead and writing lead on a paper and let me tell you, it was awful. I’m not saying I did all the work or anything, but I did a lot of it and it about killed me. Anyway, I have a fresh appreciation for implementation effort that students put into a systems paper. C-Reduce is a simple tool that has basically one internal interface, and somehow it took at least a dozen iterations before I arrived at a version of this interface that I’m happy with.

How Many C Programs Are There?

If I choose a size S, can you tell me how many valid C programs exist that are no larger than that size? I’m actually interested in the answer — it’ll help me make a point in a paper I’m writing. Shockingly, the Internet (or at least, the part of it that I looked at based on a few searches) does not provide a good answer.

Let’s start with a few premises:

  1. Since it would be exceedingly difficult to construct the exact answer, we’re looking for a respectably tight lower bound.
  2. S is measured in bytes.
  3. Since it seems obvious that there’s an exponential number of C programs, our answer will be of the form NS.

But how to compute N? My approach (and I’m not claiming it’s the best one) is to consider only programs of this form:

int main() {
int id1;
int id2;
....
}

The C99 standard guarantees that the first 63 characters of each identifier are significant, so (to drop into Perl regex mode for a second) each identifier is characterized as: [a-zA-Z_][a-zA-Z0-9_]{62}. The number of different identifiers of this form is 53*6362, but we spend 69 bytes in order to generate that many programs, in addition to 15 bytes of overhead to declare main(). The 69th root of the large number is between 43 and 44, so I claim that 43(N-15) is a not-incredibly-bad lower bound on the number of C programs of size S. Eventually we’ll hit various other implementation limits and be forced to declare additional functions, but I suspect this can be done without invalidating the lower bound.

Of course, the programs I’ve enumerated are not very interesting. Perhaps it would have been better to ask:

  • How many non-alpha-equivalent C programs are there?

or even:

  • How many functionally inequivalent C programs are there?

But these sound hard and don’t serve my narrow purpose of making a very blunt point.

Anyway, I’d be interested to hear people’s thoughts on this, and in particular on the 43S (claimed) lower bound.

C Puzzle: Double Trouble

I ran into an interesting C program that both of my C oracles (KCC and Frama-C) consider to be well-defined, but that are inconsistently compiled by the current development versions of GCC and Clang on x86-64.

int printf (const char *, ...);

void fn2 (int p1)
{
  printf ("%d\n", p1);
}

union U0 {
  short f0;
  int f1;
};

union U0 b;
int *c = &b.f1;

int main ()
{
  short *d = &b.f0;
  *d = 0;
  *c = 1;
  fn2 (b.f0);
  return 0;
}

The results:

[regehr@gamow 1]$ current-gcc -O1 small.c ; ./a.out
1
[regehr@gamow 1]$ current-gcc -O2 small.c ; ./a.out
0
[regehr@gamow 1]$ clang -O1 small.c ; ./a.out
1
[regehr@gamow 1]$ clang -O2 small.c ; ./a.out
0

GCC is revision 183992 and Clang is 150180.

The question is: Is this program well-defined (in which case I get to report two compiler bugs) or is it undefined (in which case I get to report two bugs in C analysis tools)?

UPDATE:

Just to confuse things a bit more, I’ll add this program. GCC prints 1 at all optimization levels but Clang changes its answer to 0 at higher levels.

int printf (const char *, ...);

union U0 {
  short f0;
  int f1;
} b;

union U0 *c = &b;

int main ()
{
  b.f0 = 0;
  c->f1 = 1;
  printf ("%d\n", b.f0);
  return 0;
}

I decided to report this as an LLVM bug just to make sure that the folks there have thought this kind of example over. I suspect it’ll get closed as a WONTFIX, which is OK — real compiler design involves a lot of hard tradeoffs.

C99 Language Lawyer Needed

The program below came up during some tests. The question is, is it well-defined by the C99 language? It appears to clearly be undefined behavior by C11 and C++11.

struct {
  int f0:4;
  int f1:4;
} s;

int main (void) {
  (s.f0 = 0) + (s.f1 = 0);
  return 0;
}