How Many C Programs Are There?

If I choose a size S, can you tell me how many valid C programs exist that are no larger than that size? I’m actually interested in the answer — it’ll help me make a point in a paper I’m writing. Shockingly, the Internet (or at least, the part of it that I looked at based on a few searches) does not provide a good answer.

Let’s start with a few premises:

  1. Since it would be exceedingly difficult to construct the exact answer, we’re looking for a respectably tight lower bound.
  2. S is measured in bytes.
  3. Since it seems obvious that there’s an exponential number of C programs, our answer will be of the form NS.

But how to compute N? My approach (and I’m not claiming it’s the best one) is to consider only programs of this form:

int main() {
int id1;
int id2;
....
}

The C99 standard guarantees that the first 63 characters of each identifier are significant, so (to drop into Perl regex mode for a second) each identifier is characterized as: [a-zA-Z_][a-zA-Z0-9_]{62}. The number of different identifiers of this form is 53*6362, but we spend 69 bytes in order to generate that many programs, in addition to 15 bytes of overhead to declare main(). The 69th root of the large number is between 43 and 44, so I claim that 43(N-15) is a not-incredibly-bad lower bound on the number of C programs of size S. Eventually we’ll hit various other implementation limits and be forced to declare additional functions, but I suspect this can be done without invalidating the lower bound.

Of course, the programs I’ve enumerated are not very interesting. Perhaps it would have been better to ask:

  • How many non-alpha-equivalent C programs are there?

or even:

  • How many functionally inequivalent C programs are there?

But these sound hard and don’t serve my narrow purpose of making a very blunt point.

Anyway, I’d be interested to hear people’s thoughts on this, and in particular on the 43S (claimed) lower bound.

C Puzzle: Double Trouble

I ran into an interesting C program that both of my C oracles (KCC and Frama-C) consider to be well-defined, but that are inconsistently compiled by the current development versions of GCC and Clang on x86-64.

int printf (const char *, ...);

void fn2 (int p1)
{
  printf ("%d\n", p1);
}

union U0 {
  short f0;
  int f1;
};

union U0 b;
int *c = &b.f1;

int main ()
{
  short *d = &b.f0;
  *d = 0;
  *c = 1;
  fn2 (b.f0);
  return 0;
}

The results:

[regehr@gamow 1]$ current-gcc -O1 small.c ; ./a.out
1
[regehr@gamow 1]$ current-gcc -O2 small.c ; ./a.out
0
[regehr@gamow 1]$ clang -O1 small.c ; ./a.out
1
[regehr@gamow 1]$ clang -O2 small.c ; ./a.out
0

GCC is revision 183992 and Clang is 150180.

The question is: Is this program well-defined (in which case I get to report two compiler bugs) or is it undefined (in which case I get to report two bugs in C analysis tools)?

UPDATE:

Just to confuse things a bit more, I’ll add this program. GCC prints 1 at all optimization levels but Clang changes its answer to 0 at higher levels.

int printf (const char *, ...);

union U0 {
  short f0;
  int f1;
} b;

union U0 *c = &b;

int main ()
{
  b.f0 = 0;
  c->f1 = 1;
  printf ("%d\n", b.f0);
  return 0;
}

I decided to report this as an LLVM bug just to make sure that the folks there have thought this kind of example over. I suspect it’ll get closed as a WONTFIX, which is OK — real compiler design involves a lot of hard tradeoffs.

C99 Language Lawyer Needed

The program below came up during some tests. The question is, is it well-defined by the C99 language? It appears to clearly be undefined behavior by C11 and C++11.

struct {
  int f0:4;
  int f1:4;
} s;

int main (void) {
  (s.f0 = 0) + (s.f1 = 0);
  return 0;
}

Randomly Testing a Static Analyzer

Static analyzers are intended to find bugs in code, and to show that certain kinds of bugs don’t exist. However, static analyzers are themselves large, complicated programs, leading to a “who watches the watchmen” problem. Pascal Cuoq, one of the people behind the excellent Frama-C analyzer, took it upon himself to run the fuzz-fix cycle for Frama-C and Csmith all the way to its fixpoint — where no bugs can be found.

Fuzz testing relies on knowing some invariants for the system under test; often, this is just the trivial “program shouldn’t crash.” Luckily, for compiler testing we can do much better and Pascal did some hacking to turn Frama-C into a (relatively) efficient interpreter for C programs, making it possible to compare Frama-C’s interpretation of a program against regular C compilers.

The aspect of this exercise that I found most interesting was that Frama-C found some nasty bugs in Csmith at the same time that Csmith was finding problems in Frama-C. You might say that Csmith fuzzed itself, but without Frama-C’s deep inspection of the generated programs’ behavior, we couldn’t see the bugs. Recall that Csmith’s key guarantee is that its output is well-defined by the C standard. Frama-C found five bugs where we failed to provide that guarantee.

The bugs found in Frama-C are listed here and more details can be found in a short paper that we (but mainly Pascal) wrote. We hypothesize that random testing would reveal a similar set of issues in other static analysis tools for C. If these tools are being used as part of safety arguments for critical systems, somebody should do this fuzzing.