Split Vote


In my group’s recent compiler testing paper we wrote:

We have never seen an “interesting” split vote where randomized differential testing of a collection of C compilers fails to produce a clear consensus answer

Randomized differential testing is just a fancy way of describing this process:

  1. Randomly generate a test input
  2. Run it through several different implementations of the same specification
  3. Check if all implementations produced equivalent output

Today we saw our first split vote using a program generated by Csmith. The reduced test case is:

#include <stdio.h>
struct S0 {
  unsigned f1 : 1;
};

struct S0 s;

int main (void) {
  int x = -3;
  int y = x >= (0, s.f1);
  printf ("%d\n", y);
  return 0;
}

GCC, KCC, and CompCert all print “0\n”. MSVC 2010, Intel CC 12.0.2, and today’s development snapshot of Clang (all for x86-64) print “1\n”. All compilers have optimizations turned off. There are two possibilities:

  1. The test case is ill-formed or ambiguous.
  2. Three of the six tools are wrong.

I’m pretty sure that (using the C99 standard) the test case is fine and the correct value for y is 0. The reasoning is that s.f1, an unsigned value, is promoted to signed int before the comparison is performed, making the comparison operator signed, resulting in false, or zero. The type and value of the left operand to the comma operator should be irrelevant.

There are a few interesting things going on here:

  • Two of the three (apparently) correct results were produced by relatively lightly-tested research compilers.
  • Most compiler bugs are in the optimizers. Therefore, problems like this that show up with optimizations disabled are relatively rare.
  • C is not the simple “portable assembly language” that people like to claim it is. Nobody gets all of its corner cases right, even for something relatively simple like integers.
  • Just yesterday, Xuejun — the main Csmith hacker — added support for the comma operator. Most compilers’ implementations of it are probably not very well tested.
  • Intuitively, n-version programming should work. Knight and Leveson famously showed that it may not.

Related previous posts from this blog are here and here.

Update: I should have added that I’m interested to see if there are any other compilers that get this wrong. If you have access to a compiler not on my list above and wouldn’t mind running the test and posting the result in a comment, I’d appreciate it.

Update from July 13: The behavior of this program is a bit more subtle than my explanation above indicates. John McCall’s comment has the best explanation I’ve seen so far.

Update from July 14: In C, a global variable DOES NOT need an explicit initializer. It is automatically initialized to zero.


59 responses to “Split Vote”

  1. ARM’s C compiler versions 4.0 and 4.1 both print 1 (with identical generated code), apparently due to performing an unsigned comparison.

  2. Thanks Mans!

    bcs, I’d be happy to add the Digital Mars compiler if I can do so on Linux. (The MSVC result in this post is from one of my more MS-friendly students.)

  3. I’m amazed at how many compilers are returning 1. What about the situation is causing all of them to exhibit this particular error? It seems to have something to do with the comma, but why would the comma operator mess things up in a compiler?

  4. The Oracle/SunOS compiler prints 1 on both sparc and i386 processors:

    Sun C 5.10 SunOS_sparc Patch 141861-02 2009/09/21
    Sun C 5.10 SunOS_i386 Patch 142363-02 2009/09/21

  5. I think the logic for ‘1’ here is that the comma operator does not perform integral promotions on its operand, and then the RHS of the comparison is no longer a bitfield. The standard seems quite unclear about that, though.

  6. #include

    int main (void) {
    unsigned f1 = 0;
    int x = -3;
    int y = x >= f1;
    printf (“%d\n”, y);
    return 0;
    }
    Taking it out of the struct doesn’t change it either. PGI isn’t doing the integer promotion in the expected direction.

  7. Phil, actually, by making it a normal unsigned type instead of an unsigned bitfield, you change what is supposed to happen. In the program you gave in #9, no promotion should occur. Instead, the usual arithmetic conversions occur, causing the expression to effectively become (unsigned int)-3 >= (unsigned int)0, which does evaluate to 1.

    Isn’t C wonderful?

    You could try making f1 an unsigned short; this would be converted to (int)-3 >= (int)0.

  8. John, this is an interesting reading but I can’t see where it would come from. Can you give some references into the standard to back it up?

  9. The comma operator very clearly indicates that it has the type and value of its right operand. All the places that perform integer promotions are spelled out. So the interpretation turns on how you interpret 6.3.1.1p2:

    The following may be used in an expression wherever an int or unsigned int may be used:
    — An object or expression with an integer type whose integer conversion rank is less than or equal to the rank of int and unsigned int.
    — A bit-field of type _Bool, int, signed int, or unsigned int.
    If an int can represent all values of the original type, the value is converted to an int; otherwise, it is converted to an unsigned int. These are called the integer promotions. All other types are unchanged by the integer promotions.

    Specifically, which kinds of expressions count as “an object” or “a bit-field”? These terms apply to declarations and storage, not expressions.

    Clearly expressions which directly name an object/bitfield — ‘x’, ‘x.myInt’, ‘x.myBitField’ — should count as objects/bitfields. Presumably parentheses shouldn’t matter. Most other expressions don’t just propagate a value out. Conditional expressions neatly side-step the question by making their operands undergo the usual arithmetic conversions directly. ‘*&x’ might raise interesting problems, except it’s moot because that would qualify as an expression of x’s type and therefore undergo promotion based on it. So the ambiguity is basically on exactly this case, and I don’t know of anything clarifying it.

    If you think the definition is “an expression whose value comes from a bit-field l-value”, then it should undergo promotion. If you think the definition is “an expression that is a bitfield l-value”, then it should not.

  10. For C++, core issue 742 is very closely related to John’s test case. Issue 324 is also related.

    An important question here is whether the bit-field size is part of the “type” of an expression or field. C++ says it isn’t, while C is unclear. See WG14 N1260 for details.

    The kinds of expressions involved here are those that assume the type of an operand. In C, that includes comma, simple and compound assignment, and increment/decrement. In C++, it also includes the conditional operator.

    I think many compilers that print 1 are not identifying “0, s.i1” as an expression that denotes a bit-field. On the other hand, I suspect that VC++ simply ignores the bit-field size for purposes of integer promotion. e.g. I see that VC++ 2010 prints 1 for John’s test case when I replace “(0, s.f1)” with s.f1.

    As far as I know, it would be possible for a C compiler to get the intuitive behavior by “digging” into an expression to look for the bit-field. A C++ compiler would have to do something more with the conditional operator. e.g.

    int i;
    unsigned int ui;
    struct S {
    unsigned ui2 : 2;
    unsigned ui3 : 3;
    unsigned ui32 : 32;
    } s;

    Consider these expressions:
    — i ? s.ui2 : s.ui3
    — i ? s.ui2 : s.ui32
    — i ? s.ui2 : ui
    In C++, all of the expressions are lvalues of type unsigned int, and according to 324’s resolution, they are also bit-fields. In C, the expressions are non-lvalues, and the first one has type int.

  11. John,

    The comma operator expression’s type is the type of the rhs (6.5.17p2) but it’s still an expression, and as such, is not direct access to the bit-field. However, the type is still a bit-field and still of a lower rank than int (since it can only store one bit and bit-fields are allowed to reduce it’s type to accommodate – 6.7.2.1p10), so I guess in either case, it should undergo promotion.

  12. Another related case is

    struct S0 { unsigned long long f1: 40; };

    struct S0 s;

    int main(void) {
    unsigned long long x;
    s.f1 = 0x8000000000LL;
    x = s.f1 + s.f1;
    printf(“%llx\n”, x);
    }

    The issue here (assuming LP64 or LLP64) is whether in the expression s.f1 + sf.1,
    both operands (having the same type and not undergoing integral promotion) undergo
    any conversion at all. If they don’t, the result type is the same type, i.e. unsigned:40,
    so the result here would be zero. gcc 4.4 in fact does give a zero result here.
    Some other compilers give 1000000000.

    I believe that reflects a fundamental choice of whether to carry the bit width as part of
    an expression type or whether just to special-case the promotion of bitfield expressions.
    It’s compilers that do the latter, that may then have to special-case a comma expression
    whose RHS is a bitfield.

  13. Thanks everyone for the great information! We’d have added the comma operator long ago if we knew what fun it would be.

  14. Just out of curiosity John, have you approached any of the big commercial compiler companies about getting free licenses for their products? I don’t work in the compiler business but if a university research time offered to rigorously test my software, for free, I’d say yes. You can always ignore bug reports after all.

  15. msalib, that’s a great question. I started writing a response and it got pretty long. Instead of pasting it here, I’ll write it up as its own post, hopefully in the near future.

  16. Unless those “big name, big $$” embedded compilers use something other than EDG, you’re going to see a lot of the same 1 answers.

  17. cc: Sun C 5.10 SunOS_sparc 2009/06/03

    Flags: “-m32 -O0” and “-m64 -O0” (same result)

    With comma operator: result=1
    Without comma operator: result=0

  18. IAR ANSI C/C++ Compiler V5.40.2.51604/W32 for ARM
    Copyright (C) 1999-2009 IAR Systems AB.

    Returns 1 with and without comma, with and without optimisation.

  19. C:\>bcc32.exe -o0 compilerTest.c
    Borland C++ 5.5.1 for Win32 Copyright (c) 1993, 2000 Borland
    compilerTest.c:
    Warning W8012 compilerTest.c 10: Comparing signed and unsigned values in function main
    Turbo Incremental Link 5.00 Copyright (c) 1997, 2000 Borland

    C:\>compilerTest.exe
    1

  20. Ken Thompson’s Plan 9 386 compiler gives the result as 1.

    term% cat test.c
    #include
    #include

    struct S0 {
    unsigned f1 : 1;
    };

    struct S0 s;

    void main(int, char **) {
    int x = -3;
    int y = x >= (0, s.f1);
    print(“%d\n”, y);

    exits(nil);
    }
    term% 8c -FTVwN test.c && 8l test.8
    warning: test.c:12 result of operation not used
    term% 8.out
    1
    term%

  21. Yet another C compiler producing “1”:

    pcc 1.0.0 (Release 20110221), which supposedly support C99 (and is in the base systems of both Open- and Net- BSD, also prints “1”.

  22. 0 from HP C V7.1-015 on OpenVMS Alpha V8.3, irrespective of the optimizer, and irrespective of the 0, operator.

  23. Using: Open Watcom C32 Optimizing Compiler Version 1.9

    The output is simply 1

  24. I would have said this program invokes undefined behavior (or some similar classification of badness), since variable “s” is used uninitialized. Am I missing something?

  25. This is an ambiguity in the standard that has been cleared up in the upcoming latest revision of the ISO C Standard (commonly known as C1X until it gets ratified) in section 6.3.1.1 paragraph 2. In essence the bitfield type (from the result of the comma expression) is converted to an int, and hence the comparison is a signed int comparison. In essence, the C1X standard says that the value of y is 0.

  26. msalib,
    Every compiler has bugs. Presumably the bugs most worth finding and fixing are those that a compiler vendor’s customers are most likely to run into and cause actual problems rather than some of the really obscure cases that csmith can reveal.
    Naturally, academics want to publish their results, but what is the incentive for compiler company X to give John their compiler and then have him publish a list of bugs found up on the internet for all X’s customers and competitors to see, when X could just download csmith themselves and run it privately to find and fix bugs.
    In fact I believe that has already happened.

  27. Ummm… since s is uninitialized in the example code, isn’t *any* result here valid? The comparison should be undefined unless I’m missing something.

    That still leaves the curiosity question of why some compilers choose to make an undefined comparison come out to true, but that could be as simple as saving a few instructions, even without optimization, especially since the more industrial compilers are the ones showing this behavior.

    Of course, the same thing may happen if you initialize the struct, but as written it seems ok.

  28. Rajan, is there a C draft available that shows how the ambiguity was cleaned up? I see N1548, dated 2010-12-02, which has the DR315 fix in it, but I don’t think that’s sufficient.

  29. Ryan, for your first question, the latest C Standard is up for vote by ISO member countries right now. You should be able to access it through your ISO liason group which if you are in Canada would be have me as its head 🙂 or if you are in USA then you have to go through INCITS (I am a member of PL22.11 as well so if you have trouble getting it there let me know).

    For your second statement, can you let me know why you don’t think the part listed in the response to the defect report which says “If an int can represent all values of the original type, the value is converted to an int; otherwise, it is converted to an unsigned int.” is sufficient? I may have missed something.

  30. Just as a clarification, the text quoted above is not the actual text in the standard, just one of the responses in the defect report. It also has the following statement in a later response: “If an int can represent all values of the original type (as restricted by the width, for a bit-field), the type is converted to an int;
    ” (this is the actual text in the new standard).

  31. Reader, you are correct. I’ve heard from engineers at several compiler companies, who are using Csmith internally. From my point of view this is a great outcome — reporting bugs is hard work! These compilers will get better, which is the main thing.

  32. The correct result is 0, because the int type is sufficient to express all values of the bitfield, hence its value is converted to (signed) int.

    The comma operator is simply exposing unrelated bugs in a smaller number of compilers. The result type of the comma operator is the type of the last operand, which is bitfield.

    The bug is that some compilers are promoting the bitfield to unsigned int, either before or after the comma operator is applied, and hence all the comparison operands are converted to unsigned.

    Even if lvalue->rvalue conversion is done before the comma expression, it shall convert to signed int because signed int is sufficient to express all values. It would be the same with an unsigned char rather than a bitfield.

    I don’t see any ambiguity. Ever since C89 (ANSI C), integer promotion has said that int is preferred to unsigned if int can represent all values of all operands.