Split Vote

In my group’s recent compiler testing paper we wrote:

We have never seen an “interesting” split vote where randomized differential testing of a collection of C compilers fails to produce a clear consensus answer

Randomized differential testing is just a fancy way of describing this process:

Randomly generate a test input
Run it through several different implementations of the same specification
Check if all implementations produced equivalent output

Today we saw our first split vote using a program generated by Csmith. The reduced test case is:

#include <stdio.h>

struct S0 {
  unsigned f1 : 1;
};

struct S0 s;

int main (void) {
  int x = -3;
  int y = x >= (0, s.f1);
  printf ("%d\n", y);
  return 0;
}

GCC, KCC, and CompCert all print “0\n”. MSVC 2010, Intel CC 12.0.2, and today’s development snapshot of Clang (all for x86-64) print “1\n”. All compilers have optimizations turned off. There are two possibilities:

The test case is ill-formed or ambiguous.
Three of the six tools are wrong.

I’m pretty sure that (using the C99 standard) the test case is fine and the correct value for y is 0. The reasoning is that s.f1, an unsigned value, is promoted to signed int before the comparison is performed, making the comparison operator signed, resulting in false, or zero. The type and value of the left operand to the comma operator should be irrelevant.

There are a few interesting things going on here:

Two of the three (apparently) correct results were produced by relatively lightly-tested research compilers.
Most compiler bugs are in the optimizers. Therefore, problems like this that show up with optimizations disabled are relatively rare.
C is not the simple “portable assembly language” that people like to claim it is. Nobody gets all of its corner cases right, even for something relatively simple like integers.
Just yesterday, Xuejun — the main Csmith hacker — added support for the comma operator. Most compilers’ implementations of it are probably not very well tested.
Intuitively, n-version programming should work. Knight and Leveson famously showed that it may not.

Related previous posts from this blog are here and here.

Update: I should have added that I’m interested to see if there are any other compilers that get this wrong. If you have access to a compiler not on my list above and wouldn’t mind running the test and posting the result in a comment, I’d appreciate it.

Update from July 13: The behavior of this program is a bit more subtle than my explanation above indicates. John McCall’s comment has the best explanation I’ve seen so far.

Update from July 14: In C, a global variable DOES NOT need an explicit initializer. It is automatically initialized to zero.

July 12, 2011

regehr

Compilers, Computer Science, Software Correctness

59 responses to “Split Vote”

bcs says:

July 12, 2011 at 4:49 pm

I’d try DMC if I still had a Windows box up anywhere. Let’s see what this gets:
http://www.digitalmars.com/pnews/read.php?server=news.digitalmars.com&group=digitalmars.D&artnum=140285

(BTW: would you be interested in adding it to the voter pool?)
Mans says:

July 12, 2011 at 6:01 pm

ARM’s C compiler versions 4.0 and 4.1 both print 1 (with identical generated code), apparently due to performing an unsigned comparison.
regehr says:

July 12, 2011 at 6:19 pm

Thanks Mans!

bcs, I’d be happy to add the Digital Mars compiler if I can do so on Linux. (The MSVC result in this post is from one of my more MS-friendly students.)
Phil says:

July 12, 2011 at 6:24 pm

I get a 0 from

IBM XL C/C++ for Linux, V11.1 (5724-X14)
Version: 11.01.0000.0002

on Power7.
Phil says:

July 12, 2011 at 6:42 pm

Portland Group gives 1
pgcc 10.3-0 64-bit target on x86-64 Linux -tp k8-64e
Chucky Ellison says:

July 12, 2011 at 7:54 pm

I’m amazed at how many compilers are returning 1. What about the situation is causing all of them to exhibit this particular error? It seems to have something to do with the comma, but why would the comma operator mess things up in a compiler?
Chris Lattner says:

July 12, 2011 at 8:08 pm

Eli fixed clang in r135026:
http://lists.cs.uiuc.edu/pipermail/cfe-commits/Week-of-Mon-20110711/043853.html

The power of open source! 🙂

-Chris
Andrew Johnson says:

July 12, 2011 at 8:22 pm

The Oracle/SunOS compiler prints 1 on both sparc and i386 processors:

Sun C 5.10 SunOS_sparc Patch 141861-02 2009/09/21
Sun C 5.10 SunOS_i386 Patch 142363-02 2009/09/21
Phil says:

July 12, 2011 at 8:35 pm

Actually, removing the comma on PGI didn’t change its behavior. Still returned 1.
John McCall says:

July 12, 2011 at 8:38 pm

I think the logic for ‘1’ here is that the comma operator does not perform integral promotions on its operand, and then the RHS of the comparison is no longer a bitfield. The standard seems quite unclear about that, though.
Phil says:

July 12, 2011 at 8:40 pm

#include <stdio.h>

int main (void) {
  unsigned f1 = 0;
  int x = -3;
  int y = x >= f1;
  printf ("%d\n", y);
  return 0;
}
Taking it out of the struct doesn’t change it either. PGI isn’t doing the integer promotion in the expected direction.
Chucky Ellison says:

July 12, 2011 at 9:44 pm

Phil, actually, by making it a normal unsigned type instead of an unsigned bitfield, you change what is supposed to happen. In the program you gave in #9, no promotion should occur. Instead, the usual arithmetic conversions occur, causing the expression to effectively become (unsigned int)-3 >= (unsigned int)0, which does evaluate to 1.

Isn’t C wonderful?

You could try making f1 an unsigned short; this would be converted to (int)-3 >= (int)0.
regehr says:

July 12, 2011 at 10:05 pm

John, this is an interesting reading but I can’t see where it would come from. Can you give some references into the standard to back it up?
regehr says:

July 12, 2011 at 10:07 pm

Chris, you guys are super fast! Nice.
John McCall says:

July 12, 2011 at 10:50 pm

The comma operator very clearly indicates that it has the type and value of its right operand. All the places that perform integer promotions are spelled out. So the interpretation turns on how you interpret 6.3.1.1p2:

The following may be used in an expression wherever an int or unsigned int may be used:
- An object or expression with an integer type whose integer conversion rank is less than or equal to the rank of int and unsigned int.
- A bit-field of type _Bool, int, signed int, or unsigned int.
If an int can represent all values of the original type, the value is converted to an int; otherwise, it is converted to an unsigned int. These are called the integer promotions. All other types are unchanged by the integer promotions.

Specifically, which kinds of expressions count as “an object” or “a bit-field”? These terms apply to declarations and storage, not expressions.

Clearly expressions which directly name an object/bitfield (‘x’, ‘x.myInt’, ‘x.myBitField’) should count as objects/bitfields. Presumably parentheses shouldn’t matter. Most other expressions don’t just propagate a value out. Conditional expressions neatly side-step the question by making their operands undergo the usual arithmetic conversions directly. ‘*&x’ might raise interesting problems, except it’s moot because that would qualify as an expression of x’s type and therefore undergo promotion based on it. So the ambiguity is basically on exactly this case, and I don’t know of anything clarifying it.

If you think the definition is “an expression whose value comes from a bit-field l-value”, then it should undergo promotion. If you think the definition is “an expression that is a bitfield l-value”, then it should not.
regehr says:

July 12, 2011 at 10:55 pm

Thanks John!
bcs says:

July 12, 2011 at 11:09 pm

DMC (under Wine) gives 1 both with and without optimization (and no extra flags).

FWIW: http://digitalmars.com/features.html
Ryan Prichard says:

July 13, 2011 at 1:52 am

For C++, core issue 742 is very closely related to John’s test case. Issue 324 is also related.

An important question here is whether the bit-field size is part of the “type” of an expression or field. C++ says it isn’t, while C is unclear. See WG14 N1260 for details.

The kinds of expressions involved here are those that assume the type of an operand. In C, that includes comma, simple and compound assignment, and increment/decrement. In C++, it also includes the conditional operator.

I think many compilers that print 1 are not identifying “0, s.f1” as an expression that denotes a bit-field. On the other hand, I suspect that VC++ simply ignores the bit-field size for purposes of integer promotion. For example, I see that VC++ 2010 prints 1 for John’s test case when I replace “(0, s.f1)” with s.f1.

As far as I know, it would be possible for a C compiler to get the intuitive behavior by “digging” into an expression to look for the bit-field. A C++ compiler would have to do something more with the conditional operator. e.g.

int i;
unsigned int ui;
struct S {
unsigned ui2 : 2;
unsigned ui3 : 3;
unsigned ui32 : 32;
} s;

Consider these expressions:
— i ? s.ui2 : s.ui3
— i ? s.ui2 : s.ui32
— i ? s.ui2 : ui
In C++, all of the expressions are lvalues of type unsigned int, and according to 324’s resolution, they are also bit-fields. In C, the expressions are non-lvalues, and the first one has type int.
Andreas Zwinkau says:

July 13, 2011 at 3:31 am

cparser outputs 1

Since common sense says (unsigned:1) fits into (int), we consider this a bug.
Mans says:

July 13, 2011 at 4:12 am

PathScale EKOPath gives 0 with and without optimisation.
Renato Golin says:

July 13, 2011 at 4:18 am

John,

The comma operator expression’s type is the type of the rhs (6.5.17p2) but it’s still an expression, and as such, is not direct access to the bit-field. However, the type is still a bit-field and still of a lower rank than int (since it can only store one bit and bit-fields are allowed to reduce its type to accommodate, per 6.7.2.1p10), so I guess in either case, it should undergo promotion.
wert says:

July 13, 2011 at 6:22 am

1 from aCC: HP C/aC++ B3910B A.06.20
Al Grant says:

July 13, 2011 at 7:36 am

Another related case is

#include <stdio.h>

struct S0 { unsigned long long f1: 40; };

struct S0 s;

int main(void) {
  unsigned long long x;
  s.f1 = 0x8000000000LL;
  x = s.f1 + s.f1;
  printf("%llx\n", x);
  return 0;
}

The issue here (assuming LP64 or LLP64) is whether in the expression s.f1 + s.f1,
both operands (having the same type and not undergoing integral promotion) undergo
any conversion at all. If they don’t, the result type is the same type, i.e. unsigned:40,
so the result here would be zero. gcc 4.4 in fact does give a zero result here.
Some other compilers give 10000000000.

I believe that reflects a fundamental choice of whether to carry the bit width as part of
an expression type or whether just to special-case the promotion of bitfield expressions.
It’s compilers that do the latter, that may then have to special-case a comma expression
whose RHS is a bitfield.
regehr says:

July 13, 2011 at 10:06 am

Thanks everyone for the great information! We’d have added the comma operator long ago if we knew what fun it would be.
regehr says:

July 13, 2011 at 10:22 am

I’d be particularly interested to see results from more “big name, big $$” embedded compilers.
Geoff Richards says:

July 13, 2011 at 10:53 am

For what it’s worth, tcc version 0.9.25 prints 1, but I’m not sure if that’s meant to implement a complete ANSI C or just a subset.
msalib says:

July 13, 2011 at 5:14 pm

Just out of curiosity John, have you approached any of the big commercial compiler companies about getting free licenses for their products? I don’t work in the compiler business but if a university research team offered to rigorously test my software, for free, I’d say yes. You can always ignore bug reports after all.
regehr says:

July 13, 2011 at 6:52 pm

msalib, that’s a great question. I started writing a response and it got pretty long. Instead of pasting it here, I’ll write it up as its own post, hopefully in the near future.
alexr says:

July 13, 2011 at 8:06 pm

Unless those “big name, big $$” embedded compilers use something other than EDG, you’re going to see a lot of the same 1 answers.
zzz says:

July 13, 2011 at 10:24 pm

cc: Sun C 5.10 SunOS_sparc 2009/06/03

Flags: “-m32 -O0” and “-m64 -O0” (same result)

With comma operator: result=1
Without comma operator: result=0
Tim Bates says:

July 13, 2011 at 11:43 pm

IAR ANSI C/C++ Compiler V5.40.2.51604/W32 for ARM
Copyright (C) 1999-2009 IAR Systems AB.

Returns 1 with and without comma, with and without optimisation.
dog says:

July 14, 2011 at 1:13 am

C:\>bcc32.exe -o0 compilerTest.c
Borland C++ 5.5.1 for Win32 Copyright (c) 1993, 2000 Borland
compilerTest.c:
Warning W8012 compilerTest.c 10: Comparing signed and unsigned values in function main
Turbo Incremental Link 5.00 Copyright (c) 1997, 2000 Borland

C:\>compilerTest.exe
1
P9IDF says:

July 14, 2011 at 2:58 am

Ken Thompson’s Plan 9 386 compiler gives the result as 1.

term% cat test.c
#include <u.h>
#include <libc.h>

struct S0 {
unsigned f1 : 1;
};

struct S0 s;

void main(int, char **) {
int x = -3;
int y = x >= (0, s.f1);
print("%d\n", y);

exits(nil);
}
term% 8c -FTVwN test.c && 8l test.8
warning: test.c:12 result of operation not used
term% 8.out
1
term%
Richard Bonichon says:

July 14, 2011 at 4:45 am

Yet another C compiler producing “1”:

pcc 1.0.0 (Release 20110221), which supposedly supports C99 (and is in the base systems of both OpenBSD and NetBSD), also prints “1”.
Hoff says:

July 14, 2011 at 5:49 am

0 from HP C V7.1-015 on OpenVMS Alpha V8.3, irrespective of the optimizer, and irrespective of the 0, operator.
J Armstrong says:

July 14, 2011 at 6:30 am

Using: Open Watcom C32 Optimizing Compiler Version 1.9

The output is simply 1
Adam Chlipala says:

July 14, 2011 at 6:47 am

I would have said this program invokes undefined behavior (or some similar classification of badness), since variable “s” is used uninitialized. Am I missing something?
Mans says:

July 14, 2011 at 7:25 am

Adam: all objects with static storage duration are initialised to zero. There is no problem here.
Rajan Bhakta says:

July 14, 2011 at 7:28 am

This is an ambiguity in the standard that has been cleared up in the upcoming latest revision of the ISO C Standard (commonly known as C1X until it gets ratified) in section 6.3.1.1 paragraph 2. In essence the bitfield type (from the result of the comma expression) is converted to an int, and hence the comparison is a signed int comparison. In essence, the C1X standard says that the value of y is 0.
Mike says:

July 14, 2011 at 7:55 am

IBM’s XLC/C++ gives the correct result (0) on z/OS and AIX
Reader says:

July 14, 2011 at 8:56 am

msalib,
Every compiler has bugs. Presumably the bugs most worth finding and fixing are those that a compiler vendor’s customers are most likely to run into and cause actual problems rather than some of the really obscure cases that csmith can reveal.
Naturally, academics want to publish their results, but what is the incentive for compiler company X to give John their compiler and then have him publish a list of bugs found up on the internet for all X’s customers and competitors to see, when X could just download Csmith themselves and run it privately to find and fix bugs?
In fact I believe that has already happened.
hacksoncode says:

July 14, 2011 at 9:04 am

Ummm… since s is uninitialized in the example code, isn’t *any* result here valid? The comparison should be undefined unless I’m missing something.

That still leaves the curiosity question of why some compilers choose to make an undefined comparison come out to true, but that could be as simple as saving a few instructions, even without optimization, especially since the more industrial compilers are the ones showing this behavior.

Of course, the same thing may happen if you initialize the struct, but as written it seems ok.
Ryan Prichard says:

July 14, 2011 at 11:38 am

Rajan, is there a C draft available that shows how the ambiguity was cleaned up? I see N1548, dated 2010-12-02, which has the DR315 fix in it, but I don’t think that’s sufficient.
Rajan Bhakta says:

July 14, 2011 at 1:48 pm

Ryan, for your first question, the latest C Standard is up for vote by ISO member countries right now. You should be able to access it through your ISO liaison group, which if you are in Canada would have me as its head 🙂, or if you are in the USA then you have to go through INCITS (I am a member of PL22.11 as well, so if you have trouble getting it there let me know).

For your second statement, can you let me know why you don’t think the part listed in the response to the defect report which says “If an int can represent all values of the original type, the value is converted to an int; otherwise, it is converted to an unsigned int.” is sufficient? I may have missed something.
Rajan Bhakta says:

July 14, 2011 at 1:54 pm

Just as a clarification, the text quoted above is not the actual text in the standard, just one of the responses in the defect report. It also has the following statement in a later response: “If an int can represent all values of the original type (as restricted by the width, for a bit-field), the type is converted to an int;” (this is the actual text in the new standard).
regehr says:

July 14, 2011 at 3:10 pm

Reader, you are correct. I’ve heard from engineers at several compiler companies, who are using Csmith internally. From my point of view this is a great outcome — reporting bugs is hard work! These compilers will get better, which is the main thing.
Mans says:

July 14, 2011 at 4:04 pm

I find shocking the number of commenters who believe the value is uninitialised.
leek says:

July 14, 2011 at 7:11 pm

The correct result is 0, because the int type is sufficient to express all values of the bitfield, hence its value is converted to (signed) int.

The comma operator is simply exposing unrelated bugs in a smaller number of compilers. The result type of the comma operator is the type of the last operand, which is bitfield.

The bug is that some compilers are promoting the bitfield to unsigned int, either before or after the comma operator is applied, and hence all the comparison operands are converted to unsigned.

Even if lvalue->rvalue conversion is done before the comma expression, it shall convert to signed int because signed int is sufficient to express all values. It would be the same with an unsigned char rather than a bitfield.

I don’t see any ambiguity. Ever since C89 (ANSI C), integer promotion has said that int is preferred to unsigned if int can represent all values of all operands.
boysetsfrog says:

July 14, 2011 at 10:10 pm

Mans, you mean the whole 2 of them? …Shocking
regehr says:

July 14, 2011 at 10:19 pm

The uninitialized thing came up quite a few times on reddit and hacker news today too.

GCC 3.4.4 prints 1.